My brother and I teamed up last weekend to make a short film in 72 hours for a competition, using only AI-generated content.
I really enjoy the process of making short films: last year I had a lot of fun making a goofy puppet short (it’s mostly in-jokes, sorry). This was a perfect excuse to do another one.
The brief
The competition was a joint effort by Pika.ai and ElevenLabs:
- Theme: ‘Post-Reality’
- Constraints:
  - 1-3 minutes
  - All video must be generated using pika.ai (text-to-video and image-to-video) and all dialogue using ElevenLabs (text-to-voice)
  - Additional music + sound effects are allowed
In theory, we had from Friday to Sunday night to work, but due to unavoidable logistical issues, we actually had only a few hours to put it all together. I do best with tight deadlines, so this was right up my alley.
The process
To start, I poked around Pika to see what it was capable of. It seemed similar to Runway, a tool I’ve used before. You enter a text prompt and get a 3-second video clip:
While you can specify styles and negative prompts, the default outputs skew heavily towards a Pixar/Disney look. There’s also no guarantee the style will be consistent across multiple videos.
Luckily, there’s an image-to-video mode that lets you upload an image along with your text prompt to generate the video. That sounded more suitable for this project.
But first, I needed a story.
The narrative
I was keen to work with my brother on this one, as he’s a good writer and video editor. Given the short timeframe, I was happy to roll with his initial idea. He wrote a 2-minute dialogue between two characters and gave me a few suggestions for what images might go with it. This was enough to get me started!
Pulling out the image suggestions from the dialogue, I used an LLM to transform those into image prompts:
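Roughly, that transformation looked like the sketch below, using the OpenAI chat API (the system prompt and example suggestion are illustrative, not the exact wording I used):

from openai import OpenAI

client = OpenAI(api_key="")  # API KEY HERE

def to_image_prompt(suggestion):
    # Ask the model to rewrite a loose scene suggestion as a
    # single, concrete text-to-image prompt
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Rewrite the scene suggestion as one "
                "concrete text-to-image prompt: subject, composition, lighting."},
            {"role": "user", "content": suggestion},
        ],
    )
    return response.choices[0].message.content.strip()

# Illustrative example, not a line from our actual script
print(to_image_prompt("Two figures talking in a rain-soaked alley at night"))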
I had to decide on a consistent style for the images. I played around with a few:


My partner P had the idea for ‘film noir comic’ (on the left), which I loved for its stark, distinctive contrast.
I used a small Python script to append these style guidelines (“film noir comic style, black and white, high contrast”) to each prompt, then requested the images from the OpenAI DALL-E API (which, at the time of writing, cost $0.04 per image).
from openai import OpenAI
from PIL import Image
from io import BytesIO
import os
import requests

api_key = ""  # API KEY HERE
client = OpenAI(api_key=api_key)

def generate_images(prompts):
    save_path = "./"
    os.makedirs(save_path, exist_ok=True)
    for index, prompt in enumerate(prompts):
        # Append the style to each prompt for consistency
        full_prompt = f"{prompt}, film noir comic style, black and white, high contrast"
        try:
            # Generate the image with the openai Python client
            response = client.images.generate(
                model="dall-e-3",
                prompt=full_prompt,
                size="1024x1024",
                quality="standard",
                n=1,
            )
            # Extract the URL of the generated image
            image_url = response.data[0].url
            print(f"Image generated for prompt: {prompt} at URL: {image_url}")
            # Download and save the image, named by index so long prompts
            # can't produce invalid filenames
            image_data = requests.get(image_url)
            image = Image.open(BytesIO(image_data.content))
            image_save_path = os.path.join(save_path, f"image_{index}.png")
            image.save(image_save_path)
            print(f"Image saved for prompt: '{full_prompt}' at {image_save_path}")
        except Exception as e:
            print(f"Failed to generate or save image for prompt: '{full_prompt}'. Error: {e}")

# List of prompts to generate images for
prompts = [
    # Prompts go here
]

generate_images(prompts)
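To run it, install the dependencies with pip install openai pillow requests, drop your prompts into the list, and each image lands next to the script as image_0.png, image_1.png, and so on.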
The images
That gave me a bunch of images to play with. I love the high-contrast, comic-book style:

From there, I uploaded each image into Pika along with its text prompt to generate a 3-second clip. Mostly Pika added subtle motion through zooming and panning; occasionally it animated specific elements, like billowing smoke or people walking. With more time, I could have fine-tuned the prompts to get more interesting motion.
The images above are square, but Pika let me expand them to a 5:2 cinematic aspect ratio:


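Pika outpaints new content into the extra canvas, which you can’t reproduce locally with a one-liner, but if you just want to preview how a square frame sits on a 5:2 canvas, a quick Pillow sketch does the job (the filename assumes the script above):

from PIL import Image

def pad_to_cinematic(path, out_path, aspect=(5, 2)):
    # Center the square frame on a wider black canvas (pillarboxing);
    # Pika instead fills this space with generated content
    img = Image.open(path)
    w, h = img.size
    canvas_w = h * aspect[0] // aspect[1]  # e.g. 1024 -> 2560 for 5:2
    canvas = Image.new("RGB", (canvas_w, h), "black")
    canvas.paste(img, ((canvas_w - w) // 2, 0))
    canvas.save(out_path)

pad_to_cinematic("image_0.png", "image_0_wide.png")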
The voices
This part was surprisingly easy using ElevenLabs. I simply uploaded the dialogue text for each character, chose a voice, and hit generate.


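We did all of this through the web UI, but the same thing can be scripted against the ElevenLabs text-to-speech REST endpoint. A minimal sketch (the API key, voice ID, and dialogue line are placeholders):

import requests

API_KEY = ""   # ElevenLabs API key here
VOICE_ID = ""  # ID of the voice chosen in the ElevenLabs UI

def speak(text, out_path):
    # POST the dialogue line to the text-to-speech endpoint and
    # save the returned audio (MP3 by default)
    response = requests.post(
        f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
        headers={"xi-api-key": API_KEY},
        json={"text": text},
    )
    response.raise_for_status()
    with open(out_path, "wb") as f:
        f.write(response.content)

speak("First line of dialogue goes here.", "line_01.mp3")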
I also generated a few sound effects in Pika, but the quality is still a bit lacking, so I pulled the majority of the sounds royalty-free from Pixabay.
The editing
We now had all the pieces:
- dialogue voiced with realistic-sounding voices (well, at least the female voice, as you’ll hear below)
- ~20 video clips of 3-7 seconds in length
- sound effects
My brother used his editing skills to cut it all together in record time. After a round of feedback and another burst of edits for the final cut, we finished the revised version 20 minutes before the deadline.
Going from idea to finished short in such a short timeframe was a great experience. This is one of those cases where I see AI tools removing barriers to creativity, and one of the reasons I’m optimistic about the future.
Here it is!