By Ezra Sandzer-Bell

AI Filmmaking in 2025: Image-to-Video, Soundtracks, and SFX

Generative AI models are making filmmaking accessible to solo creators at a reasonable price and with significantly less technical gatekeeping.


We haven't yet reached a point where single text prompts generate a finished movie. Creators still need to chain together AI tools, so this article will show you the best options available and how to make use of them effectively.


This is not an affiliate article. Every service we recommend here is a genuine endorsement based on firsthand experience and rigorous testing.


A complete guide to AI filmmaking in 2025


AI filmmaking is made up of several interlocking stages. To keep things simple, we will bucket them into scriptwriting, mood boarding, image generation, video generation, soundtrack creation, sound design, and post production.


You may have more experience with some of these tools than others, so we've created a table of contents below. Feel free to jump forward to the section that's most relevant for you.


  1. What is screenwriting?

    1. Using Notion to ideate and organize your screenplay

    2. AI scriptwriting assistants (Notion, ChatGPT, and Claude)

  2. Mood boards: Gathering and generating imagery for your film

    1. How to use Midjourney's character and style reference flags

  3. Best image-to-video generators for AI filmmaking

    1. Why Kling AI is the best image-to-video generator

    2. Comparing Kling to Pika AI and Luma video generators

    3. Storing and organizing your AI generated video clips

  4. Generating AI movie soundtracks and character voices

    1. Organizing your film audio with AudioCipher

    2. How to create AI film scores with Suno and Udio

    3. Style transfer: Uploading your own music to Suno & Udio

    4. Generating realistic AI character voices with ElevenLabs

  5. Post-Production: assembling your AI film in video editing software

    1. Applying sound design and SFX to every scene


Screenwriting: From initial ideas to a finished screenplay



Stories are the cornerstone of every great film. A single great idea will propel you through the full movie making process. So your first task is to start jotting down big-picture ideas and find one that really speaks to you.


Don't be afraid to return to classic books, movies, and video games in search of a meaningful concept. Nostalgia can be a powerful source of inspiration and there are endless opportunities to adapt the story to make it your own.


Choosing a script development workspace to organize your ideas


Ideas can arrive at any moment, so it's important to record those ideas while they're fresh. Jot things down on paper or record a voice memo as needed.


However, as your ideas begin to evolve into a complete story, it's best to have a tool for organizing them. That's where workspace software comes in handy.



Notion is my app of choice for storyboarding and mapping out scenes. It has all the features of a Google Doc, but with a nested-page structure and tree view. Ordinary word-processor documents are compartmentalized in a way that impairs visibility.


The screenshot below features an example of how I organized scenes, characters, and other important documents in Notion. Attaching an emoji to each page makes it easier for my eyes to quickly find the page I'm looking for. Then I treat each of those pages like an ordinary document for managing text and images.


Using Notion for managing AI filmmaking workflows

AI scriptwriting assistants: Notion, ChatGPT and Claude


Dozens of companies sell AI scriptwriting tools, but most of them are charging extra for user interfaces wrapped around the same underlying models. You can save money by going straight to the base model. OpenAI's ChatGPT and Anthropic's Claude are the most common choices.


Notion includes a built-in AI writing assistant, so if you've chosen their workspace, it can be convenient to stay in their system instead of toggling over to a separate AI tool. If you already have Notion's premium plan, you'll save money this way too.


Mood boards: Gathering images for scenes & characters

Creating mood boards for your film with Pinterest

Mood boarding refers to a technique where multiple images are bundled together around a character, location, or scene. You don't need a finished script to get started with this process. Visual inspiration will help you anchor mental concepts in something tangible.


Pinterest boards are a common place to start, because the site already has a large repository of art, movies, and game stills to browse. Open a free account and start searching with keywords or media titles. Pin the images you like to save them in categories of your choosing.


Assembling the mood board in Canva

Canva offers a free mood board feature where you can lay out images on a flat background. As you build them out over time, you can screenshot the final board and drop it into Notion.


Style boards: Collect images that capture the look and feel of the film you want to create. They don't need to be tied to specific scenes or the plot of your story. It's more about exploring visual aesthetics.


Characters and locations: Explore existing character art to zero in on the personality of people and places that you've outlined in Notion.


The AI image generation tool Midjourney released a mood board tool in December 2024 that makes it easy to arrange generated images on a blank canvas and attach notes to them. Watch the short overview below to learn more:



Using Midjourney to generate AI images from Mood boards


The images you collect in the mood board have a practical purpose. You can plug them directly into Midjourney and use special prompts to achieve the look you're going for, without plagiarizing the original material. These are called character and style reference flags, or cref and sref for short.



To make use of style and character references, you'll need to download a free copy of Discord and sign up for the basic Midjourney subscription. Follow their onboarding instructions to start chatting with the Midjourney bot. That's where you'll do all of your prompting and image generation.


  1. Upload two images (style and character): Start by dragging your style and character reference image files into the text box, one at a time. Submit each one to upload it to Midjourney's servers.

  2. Constructing the text prompts: Type "/imagine" and press enter to begin prompting the model for images. Describe what you want to see using simple, non-abstract sentences.

  3. Character and style reference flags: Before submitting the text prompt, we're going to add special text commands called flags. They use a dash-dash format like "--cref" for character reference and "--sref" for style reference.

    1. Enter your image prompt, then use "--cref" and press the space bar to create an extra space after the flag.

    2. Scroll up, click on the character reference image you uploaded to Midjourney during step 1, and drag it onto the location immediately after the space. The image URL will go wherever you drag it. If you drop it into the middle of your prompt, it will break the format. See the screenshot below and follow the green arrow, avoiding the red arrow.

    Creating keyframes for your AI film in Midjourney

    You can repeat this process with the --sref flag, dragging in your style reference image.

  4. Aspect ratio flags: You can use the "--ar" flag to request a specific aspect ratio for the AI generated image. Try "--ar 2:1" to get an image that will fit the dimensions of your film.

  5. The order of your flags doesn't matter: --ar, --cref, and --sref can go in any sequence. What matters is the text that comes immediately after each flag, so don't insert the aspect ratio between a reference flag and its image URL.


The final prompt will look something like this:

/imagine prompt:Man running and jumping over a gap between two cliffs, apocalyptic background, mass lava pit below him --ar 2:1 --cref https://cdn.discordapp.com/attachments/filename.png?ex=321&is=654&hm=987 --sref https://cdn.discordapp.com/attachments/994766080998379580/1319883512169631844/filename2.png?ex=123&is=456&hm=789&

This will generate four images. Below them you'll find buttons labeled U1-4 and V1-4. The "U" buttons upscale an image to make it larger, while the "V" buttons create four new variations on it.
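If you end up iterating on many prompts, a small helper keeps the flag format consistent. The sketch below is my own convenience function, not part of Midjourney, and the URLs are placeholders:

```python
def build_imagine_prompt(description, cref_url=None, sref_url=None, ar=None):
    """Assemble a Midjourney /imagine prompt with optional reference flags.

    Flag order doesn't matter, but each URL must directly follow its flag.
    """
    parts = [description]
    if ar:
        parts.append(f"--ar {ar}")
    if cref_url:
        parts.append(f"--cref {cref_url}")
    if sref_url:
        parts.append(f"--sref {sref_url}")
    return "/imagine prompt:" + " ".join(parts)

prompt = build_imagine_prompt(
    "Man running and jumping over a gap between two cliffs",
    cref_url="https://cdn.discordapp.com/attachments/character.png",
    sref_url="https://cdn.discordapp.com/attachments/style.png",
    ar="2:1",
)
print(prompt)
```

Paste the resulting string into the Discord text box after dragging your reference images into place.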


Upscaling your AI image keyframes

Once you've upscaled the version you like, you can control + click on that image to access the "copy image" option. Toggle back to your mood board in Midjourney's web app to arrange it on a canvas.


You can also use Notion to organize your favorite images alongside the text outlines of your scenes and characters. Here is a simple example of that:

Organizing your film scenes in Notion

Best image-to-video generators for AI filmmaking


Image-to-video tools are the most exciting part of this process, because they bring your AI images to life and turn them into clips for your film.


We've tested all of the commercial models extensively, and Kling is currently both the best and most affordable option, at only $2.99 for hundreds of credits. Other models like Runway will run you $15 for a batch of credits while supporting fewer generations and lower-quality image-to-video results.


As we mentioned at the beginning of this article, we don't have any kind of affiliate deal with Kling or any other tool that we're recommending.


Image-to-video with Kling AI (Lowest cost, highest quality)



I created the video above using Midjourney images and Kling AI. It demonstrates Kling's ability to animate even the most obscure art styles that were likely not in the original training data.


Creating AI film scenes with image-to-video

As you can see in the screenshot above, we've uploaded a keyframe that we wanted to animate. The prompt and camera control panel on the left includes an image prompt, with an optional "end image" keyframe that you can animate toward.


Below that image upload section is a text prompt, where you can describe what you want to see. To target specific visual elements and apply motion, it's best to use the motion brush module below that text prompt area.


Controlling film character motion in Kling AI

When you open the Motion Brush tool, you'll see the image and a control panel on the right. Click Area 1 and draw a shape around the visual element you want to target. Then click on Track 1 to draw an arrow describing the motion path you're trying to achieve. You can repeat this for up to six areas and motion paths total.


Here's what that scene looks like once rendered. Notice how the frogs on the left and right stood straight up to stare at the visual that we targeted.

Results of Kling's motion brush

Motion Brush will automatically disable the camera movement module. Camera controls are a better choice if you want figures to remain in place while the camera pans or zooms to a new position.


Kling settings for professional model

I recommend sticking with professional mode rather than standard, and always starting with the 5-second clip option. If you like what Kling AI creates, then you can double down and bump it up to 10 seconds. Sometimes the output will be wrong for an obvious reason that you overlooked, and starting with smaller generations will make your credits stretch further.


Expect ~7-minute rendering times for 5-second clips and ~12 minutes for 10-second clips. If you queue up multiple clips at once, they will be tackled one at a time, so the more you have lined up, the longer you'll wait.


Your video history is stored on the right side of Kling's interface; you can browse through it and re-download the files at any time.

Comparing Kling to Pika AI and Luma video generators


Pika and Luma are two popular alternatives to Kling that offer image-to-video. They do an okay job of retaining the image style and achieving the motion requested in text prompts, but they still have too many shortcomings to be viable at this time.


AI image to video in Pika

The prompt field takes an image and text prompt, as you can see above. When we asked for a jumping frog, there was some visual distortion and blurriness:


Pika's AI video output

By comparison, Luma offers start and end keyframes, text prompting, and control over output dimensions. If you don't choose any dimensions, it will default to the aspect ratio of your input image.

AI image to video in Luma

The results from Luma are worse than Pika's. The frogs take on new shapes and textures, and the model failed to achieve the upward jumping motion that we asked for. It's probably fair to say the output is not usable.


AI image to video output in Luma

Storing and organizing your finished AI video clips


As you generate AI videos for your film, it's important to keep them organized. Notion is great for idea management, but it's not intended as a file storage system. If you have limited hard drive space and intend to generate a high volume of content, Google Drive's paid storage tier offers 100GB for about $2/month.
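Before uploading to the cloud, it also helps to keep local downloads tidy. Here's a minimal sketch, assuming you rename downloaded clips with a scene prefix like `scene03_take2.mp4` (the naming convention is my own; Kling doesn't enforce one):

```python
import shutil
from pathlib import Path

def organize_clips(download_dir, library_dir):
    """Move clips named like 'scene03_take2.mp4' into per-scene folders."""
    for clip in sorted(Path(download_dir).glob("*.mp4")):
        scene = clip.stem.split("_")[0]          # e.g. "scene03"
        dest = Path(library_dir) / scene
        dest.mkdir(parents=True, exist_ok=True)  # create folder on first clip
        shutil.move(str(clip), str(dest / clip.name))
```

Run it against your downloads folder after each Kling session, then sync the library folder to Google Drive in one pass.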


Saving your AI film clips

You can link to those files at the top of your Notion pages, using URLs that you grab from Google Drive. Simply open the file and click the share button in the upper right corner. Then click "copy URL" and paste that link into your Notion page.


Linking to AI video clips stored in Google Drive

Generating AI movie soundtracks and voices


As George Lucas famously said, sound is half of the picture. Film scores, character voices, foley and cinematic sound design all play an important role in the process.


We'll share the best tools for each of these tasks in a moment. First, a brief public service announcement about organizing your audio files. Your AI film will require significantly more audio files than video files, so this is an important factor.

AudioCipher: Keeping your film audio organized


Notion may be a great place to store text outlines and image files, and Google Drive works for storing large video files, but neither is a good platform for managing large volumes of audio. Drive's audio playback is notoriously clunky, lacks metadata support, and can't play back MIDI files.


AudioCipher's MIDI Vault is the perfect way to keep your audio and MIDI files organized by scene and mood. Store all of your film music, character dialogue, and sound effects in note cards with metadata. You can use metatags to label cards according to the film you're working on and then name each card after a scene. Filter through your film scenes using the card list as shown below.


Organizing your film audio files in AudioCipher

Musicians who write their own melodies and background music can stash and play back their MIDI files on cards as well. This makes it easier to adjust the sound design or mix later on, so you don't have to go through the hassle of digging through a mountain of DAW project files.


Add detailed notes about the scene, DAW plugins and virtual instruments you used along with any other relevant information. Drag and drop audio or MIDI files into cards and then drag them back out to your DAW or video editor later on.


Visit the AudioCipher homepage to learn more about the Vault.


How to create AI film scores with Suno and Udio


Suno and Udio are the easiest starting point for generating music cues for your film. Both services understand movie genres and accept simple text prompts. If I had to choose, I would say that Udio is the superior option for sophisticated film scores.

We've previously covered AI film scoring with Meta's MusicGen, but in 2025 you're better off sticking with one of these two options.


AI text-to-music film soundtracks in Suno

View the annotated screenshot of Suno above for a general overview of how to get started. You'll click the create tab, switch over to "instrumental", enter a simple description of the style of music you want, and hit generate. We recommend using the latest V4 model for the best output.


Suno understands instrument names, but Udio generally does not. If you're designing a horror soundtrack and know that you want analog synthesizers for one scene, but violins and a string ensemble for another one, Suno will be the better option.


The downside to Suno is that it sounds more like pop music than a film score. Its output is almost always quantized, which detracts from the fluid quality of a great movie soundtrack.


When prompted with words like "cinematic" and "film score", Suno may generate dramatic percussion with risers and impact design.


If you have a specific energy arc for the scene, this could interfere with your pacing. On the other hand, it can also lead to happy accidents and flexible creators can edit their AI video clips around the background music during post.


Generating film scores in Udio

Udio's interface is almost identical to Suno's in its layout, as shown in the screenshot above. You'll click create, describe your song with comma-separated keywords, select instrumental mode, and then hit the create button. However, the quality of Udio's music is substantially better for traditional film music. It has real compositional depth and maintains consistency while still allowing for sudden, appropriate changes to tempo and key.


Generations are only 30 seconds by default, but this works to your advantage. Credits are conserved and you can easily extend Udio's best ideas to get more music in that same style. In fact, you can extend forward or backward in time, choosing between intro/outro or middle sections. Suno does not offer this feature.


Extending your movie's music cues in Udio

Style transfer: Uploading your own music to Suno & Udio


Musicians who want to retain creative agency can upload their own music to Suno and Udio. This makes it possible to write leitmotifs and use the AI to improve on the arrangements. Each service has its own strengths and weaknesses here.


To extend in Suno, click the "upload audio" button at the top of the left panel. You'll see a popover where you can upload the file; when it's finished, you'll find the uploaded track at the bottom of your song list. Click the extend button as shown below:


Uploading and extending AI music cues in Suno

When you hit extend, the song is loaded into your prompt panel. Switch to instrumental mode and describe the arrangement you want.


Suno is much better at style transfer than Udio. So if you upload a simple melody and want to hear it across different styles and instruments, you'll have a lot of fun here. Note that Suno will create a new file, rather than extending directly from the original music that you uploaded.


Uploading AI music cues in Udio

To upload a reference track in Udio, click the upload icon highlighted in the screenshot above. Unlike Suno, the power of Udio is in its ability to retain an existing arrangement and continue in that style. Their model will be better for composers who have a finished score and want help expanding it forward or backward in time, as we mentioned in the previous section.


As you download your music cues, you can store them alongside the MIDI source files using the AudioCipher Vault. Instead of dumping them in a random local folder, add them to a card and use meta tags to keep your project organized.


How to store your AI film soundtrack cues in AudioCipher

Generating character dialogue with AI voice models


There are two very different ways to approach AI voice generation. The quick and easy method is with text-to-speech (TTS), but you'll have less control over the pace and expression. Voice transfer is more effective but sometimes results in unwanted artifacts.


ElevenLabs is currently one of the best solutions for AI voices in general. They were picked up by Disney's incubator in 2024 and have continued to add new models each year. I recommend trying out their free text-to-speech tool. Paste in some dialogue from your film and listen to it with over a dozen male and female voices.


Generating AI character voices for your film with Eleven Labs

Punctuation makes a big difference with ElevenLabs TTS models. If you want to add dramatic pauses or emphasis, be sure to use commas and exclamation points. You can even use unorthodox combinations like "?!" to capture the sense of bewilderment that you may be going for in certain scenes.


Characters can pronounce the same sentence many different ways, even with the same punctuation. However, after you generate with one voice, you'll need to select a different character and then return again to render a fresh take.


Once you've signed up for a paid account, you'll be able to hit the download button to save the voiceover and store it in your MIDI Vault card for easy access later, during post-production.
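For creators who want to batch-render lots of dialogue, ElevenLabs also exposes its voices through a REST API. The sketch below only assembles the request rather than sending it; the v1 endpoint path, `xi-api-key` header, and model ID reflect ElevenLabs' public API at the time of writing, and the voice ID and key are placeholders:

```python
import json

API_BASE = "https://api.elevenlabs.io/v1"

def build_tts_request(text, voice_id, api_key):
    """Assemble the URL, headers, and JSON body for a text-to-speech call."""
    url = f"{API_BASE}/text-to-speech/{voice_id}"
    headers = {"xi-api-key": api_key, "Content-Type": "application/json"}
    body = json.dumps({
        "text": text,                          # punctuation shapes the delivery
        "model_id": "eleven_multilingual_v2",  # assumed current model name
    })
    return url, headers, body

# POST this with requests/urllib and write the binary MP3 response to disk.
url, headers, body = build_tts_request(
    "Wait... you were there the whole time?!",
    voice_id="YOUR_VOICE_ID",
    api_key="YOUR_API_KEY",
)
```

The same punctuation tricks described above apply here, since the text is passed through to the model verbatim.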


Post-production: Assembling your film in a video editor


Now that you've created your video scenes and background music, it's time to assemble everything in a video editor.


Professional nonlinear editors like Adobe Premiere, DaVinci Resolve, and Final Cut Pro can have a bit of a learning curve. If you're brand new to video editing, consider starting with a basic tool like iMovie. Advanced effects and text overlays can be achieved through easy consumer apps like CapCut and Veed.


Putting all of the pieces together and creating your AI film will be so much easier if you've stayed organized during this process. Your script and scenes are neatly outlined in Notion, video clips on Google Drive, and AI soundtrack and music cues in AudioCipher. All you need is a tool where you can assemble them.


Generating sound design and SFX for your film


Great films rely on carefully placed SFX and sound design. There are countless resources available for adding foley to your project. You don't need to use genAI to create those sounds and in fact, it might be better not to.


Audio Design Desk is an excellent resource for finding sounds and syncing them to video. Their audio timeline has direct access to all of the major libraries like Epidemic, Artlist, and Soundstripe.


Check out the trailer below for details on how it works:



Audio Design Desk is not a video editor, so you'll need to have your video clips prepared ahead of time. You can sync up your nonlinear editor with ADD using their media bridge.


If dropping a few hundred dollars on ADD isn't in the cards, go straight to the source and pick sounds from one of the popular libraries. Drop them right into the video editing timeline and assemble everything there instead. That will work fine too.


This concludes our guide. If you'd like to see what other AI filmmakers were creating in 2024, check out this roundup. To go deeper and learn more, check out the free tutorials at YouTube's AI Filmmaking Academy.


We hope this guide has taught you a lot and answered any questions you had about which tools will serve you best on this journey. Good luck and have fun!
