AI music extension refers to recording or uploading musical content and using generative AI models to expand on that idea.
Musicians can now upload ideas and overcome creative block quickly, sampling the music directly or referencing it as the starting point for their own songwriting.
There are currently only a few products on the market with an AI music extender: Suno, Udio, and SoundGen. We'll cover each of them here and share tips on how to get the most value from them.
Best AI Music Extenders
AI-powered music-to-song generation refers to models that take in an initial audio file and transform it into a complete arrangement. Each AI music platform has its own model architecture and training data, which means they each produce different kinds of musical output.
Udio (AI song extender)
Udio began as a text-to-song generator, without an audio upload feature. They introduced the AI song extender in mid-2024, available only on the paid plan.
Subscribers can upload their own music clips and use text prompts to describe what they want to hear. We experimented with this feature extensively and discovered a few interesting details about the model's performance:
Udio's best AI music extension features
Udio can extend your audio forward or backward in time, relative to the whole clip. That means you can generate intros that lead smoothly into your own music.
Udio combines the input track with the generative output, so it truly feels like a song extension. Other apps, like Suno, do not stitch the input and output.
Udio applies stem separation to multitrack recordings, and then isolates solo tracks like a lead melody, writing a new arrangement around that stem.
Udio will write melodic variations and reharmonize an initial concept, adding new chords as well as new instruments. Text prompts influence the output.
Each generation tends to be different from the last. If the music output sounds too similar to your input, use the Strength slider to diverge further from the original.
To try the feature out for yourself, log into your account and at the top of the page, find the upload icon at the right corner of the text prompt field.
How do you extend audio files in Udio?
Lyrics: Select custom to input your own lyrics, instrumental to omit vocals, and auto-generated to let Udio's system come up with words for you. Try to come up with your own, since AI-generated lyrics tend to be a bit mediocre.
Prompt strength: Strength refers to how much influence the input has over the output. Maximum strength values will force your input into the new generation and can create less natural sounding music, while very low strength settings may sound more natural but drift far away from the original input. For this reason, it's best to start in the middle and adjust incrementally.
Seed: The default value of -1 randomizes the seed, so that each generation is sufficiently different from the previous one. If you type in a positive value and reuse it while keeping the other settings the same, you'll get more closely related outputs.
Clip start: This feature is used to approximate whether you want the clip to be based on a song's intro, middle, or outro. It doesn't indicate where the AI model actually extends from. Those controls are located in the section above, labeled extension placement.
Context length: This refers to how much of the song Udio should take into consideration when composing the extended section. The more context you include, the more closely the extension will adhere to your original and the more of your melodic or instrumental concepts it will incorporate.
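The seed and strength settings described above can be illustrated with a toy sketch. This is not Udio's code or API: `resolve_seed`, `toy_generate`, and `next_strength` are hypothetical names, the 0 to 1 strength range is an assumption, and the "generator" just picks pseudo-random note numbers. It only shows the general ideas: a seed of -1 is swapped for a fresh random value, a fixed seed repeats the same pseudo-random choices, and strength is nudged in small steps starting from the middle.

```python
import random

MAX_SEED = 2**31 - 1  # assumed upper bound, chosen only for illustration


def resolve_seed(seed: int) -> int:
    """Return a concrete seed; -1 means 'pick one at random'."""
    if seed == -1:
        return random.randint(0, MAX_SEED)
    if seed < 0:
        raise ValueError("seed must be -1 or a non-negative integer")
    return seed


def toy_generate(seed: int, length: int = 8) -> list[int]:
    """Stand-in for a music model: returns pseudo-random MIDI note numbers."""
    rng = random.Random(resolve_seed(seed))
    return [rng.randint(60, 72) for _ in range(length)]


def next_strength(current: float, feedback: str, step: float = 0.1) -> float:
    """Lower strength if the output hugs the input, raise it if it drifts away."""
    if feedback == "too_similar":
        current -= step
    elif feedback == "too_different":
        current += step
    # Clamp to the assumed 0..1 slider range.
    return round(min(1.0, max(0.0, current)), 2)


if __name__ == "__main__":
    # A fixed seed repeats the same choices; -1 draws a new seed each time.
    print(toy_generate(42) == toy_generate(42))
    print(toy_generate(-1))
    # Start in the middle, then adjust one small step at a time.
    print(next_strength(0.5, "too_similar"))
```

In this toy version a fixed seed gives identical results. A real model may still vary somewhat between runs, which fits the "more closely related" behavior described for the Seed setting.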
Udio's terms of service allow them to train on your music
Be aware that if you're uploading original music to Udio's system, their terms of service specifically state that "input content" will be used to improve and modify their machine learning models. The grant of rights includes the option for their company and affiliates to reproduce, store and modify your files.
Suno AI Music (AI song extender)
Suno is currently tied with Udio as the top AI song extender. Users can extend songs created in the app or upload their own music as the reference file. Text prompts are used to communicate the targeted style of music output.
One key difference is that Suno does not include the original audio file in the extension. The reference file provides context about key, BPM, and melodic shape, but Suno uses that information to do its own thing.
Udio includes and merges the uploaded audio with the extension, resulting in a more seamless experience. Suno users need to line them up back to back in a DAW in order to hear what they sound like together.
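For a quick listen outside a DAW, the two files can also be joined end to end in code. The sketch below uses Python's standard `wave` module and assumes both files are uncompressed PCM WAVs with the same channel count, sample width, and sample rate. The function name and file names are placeholders, not anything Suno provides.

```python
import wave


def join_wavs(first_path: str, second_path: str, out_path: str) -> int:
    """Write first_path followed by second_path to out_path; return total frames."""
    with wave.open(first_path, "rb") as a, wave.open(second_path, "rb") as b:
        fmt_a = (a.getnchannels(), a.getsampwidth(), a.getframerate())
        fmt_b = (b.getnchannels(), b.getsampwidth(), b.getframerate())
        # Refuse to join files whose formats differ instead of producing noise.
        if fmt_a != fmt_b:
            raise ValueError(f"format mismatch: {fmt_a} vs {fmt_b}")
        frames_a = a.readframes(a.getnframes())
        frames_b = b.readframes(b.getnframes())

    with wave.open(out_path, "wb") as out:
        out.setnchannels(fmt_a[0])
        out.setsampwidth(fmt_a[1])
        out.setframerate(fmt_a[2])
        out.writeframes(frames_a + frames_b)

    # One frame holds one sample per channel.
    bytes_per_frame = fmt_a[0] * fmt_a[1]
    return (len(frames_a) + len(frames_b)) // bytes_per_frame
```

A call might look like `join_wavs("my_upload.wav", "suno_extension.wav", "combined.wav")`. If either file is an MP3 or uses a different sample rate, it would need converting first, which this sketch does not attempt.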
In my ongoing experiments, I've found that Udio tends to extend audio in the same style as the original. This means that if you upload a solo instrument, it's likely to expand with more notes from that same instrument.
Suno on the other hand does an excellent job turning melodies into complete arrangements, referencing text prompts to apply the appropriate style.
Overall, Suno and Udio both have their pros and cons. I suggest trying them both out and experimenting with different text and audio inputs, to get more first hand experience with the tools.
SoundGen (AI music extender)
SoundGen is considered an AI music extender because it generates instrumental music without singing vocals. The product has a strong grasp of melodies, chord progressions, percussion and arrangement. Like Suno and Udio, it uses text prompts to guide the music extensions.
SoundGen is unique because it provides granular editing tools like trimming and audio file management. Its standalone app lets you drag and drop files right into your DAW.
The interface is built with musicians in mind, but anyone can use it. Think of it like a bandmate or collaborator more than an instant song generator.
Best AI MIDI Extenders
AI MIDI extenders are a niche subset of AI music extension. There is currently only one high-quality model, trained on a high volume of symbolic music and music theory analysis, with an easy-to-use commercial interface.
Hooktheory's AI MIDI extender
Launched in June 2024, Hooktheory's Aria feature is powered by the most advanced AI MIDI generation model on the market. The lead engineer is a Carnegie Mellon University professor and research scientist at Google DeepMind.
How Aria's AI MIDI extender works:
Open up Hooktheory's Hookpad application
Start a new project or open an existing one
Select one or more measures after an existing MIDI section
Click Aria and choose from three options: Create melody, chords or both.
Hooktheory fine-tuned their model on 50,000+ song transcriptions from their own user base. The chord and melody data across all genres has been combined into a single predictive MIDI engine.
Visit Hooktheory's webpage about Aria to learn more.
Google's MusicLM, MusicFX, and Lyria models
Google made a big splash at the I/O conference in May 2024, opening with a live performance of their latest MusicFX web app. Despite a lot of hype surrounding the event, DeepMind is still withholding their audio-to-audio feature. There's a chance we'll see it drop some time later this year.
The I/O event came roughly sixteen months after the original MusicLM paper was published in January 2023. That document had promised a melodic conditioning feature that would include humming and whistling as inputs. Have a listen to those examples here.
In November 2023, Google's DeepMind team published a follow-up report that proposed the following: "Imagine singing a melody to create a horn line, transforming chords from a MIDI keyboard into a realistic vocal choir, or adding an instrumental accompaniment to a vocal track."
This system was tied to a new model called Lyria and a watermark for AI-generated audio called SynthID, which Google can use to trace songs back to their system. We've shared a screenshot of that interface below, but just to be clear, the Lyria app is still not available.