Earlier this year, Meta released MusicGen, a musical AI model that could use text prompts to create 12-second samples. It turns out this was just the start of the tech giant’s ambitions in audio AI though.
Yesterday Meta announced AudioCraft, pitched as “generative AI for audio made simple and available to all”. It’s a framework that creates “high-quality, realistic audio and music from text-based user inputs” using three models.
MusicGen is one of them, alongside AudioGen (which makes sounds rather than music) and EnCodec, a neural audio codec whose improved decoder, Meta said, “allows for higher quality music generation with fewer artifacts”.
All of this is being open sourced so that people can play with the models “for research purposes and to further people’s understanding of the technology”. Meta’s intention is that developers will be able to build their own sound and music generators (or compression algorithms) on top of AudioCraft.
Meta’s description of potential applications for these models is interesting. “Imagine a professional musician being able to explore new compositions without having to play a single note on an instrument,” it suggested. That’s a familiar usage for AI music models: the idea that they can be compositional or jamming tools for musicians.
However, Meta’s blog post continued: “Or an indie game developer populating virtual worlds with realistic sound effects and ambient noise on a shoestring budget. Or a small business owner adding a soundtrack to their latest Instagram post with ease.”
Neither is a new idea, but it’s a useful reminder that AI-generated music is going to be competing with human music in these areas: composers who might be hired or tracks that might be licensed for those games, and the library of commercial music that’s available for those social-media posts.
This isn’t new, and ‘compete’ does not mean ‘replace’ en masse. There will still be game syncs and original soundtracks, and plenty of businesses will want to use human music (from famous tracks to emerging artists – the latter being one of the focuses for TikTok’s Commercial Music Library for example) too.
AudioCraft and MusicGen, along with Google’s MusicLM, seem certain to spur a new wave of experiments and products that make it easier than ever to produce higher-quality AI-generated music. Some of that will yield exciting assistive tools for human musicians, and some of it may give them competition worries.
As ever, it boils down to the fact that the technology itself is just technology. What matters are the decisions humans make about how to train it (on that front, Meta stressed again in its AudioCraft announcement that MusicGen was “trained with Meta-owned and specifically licensed music”) and what to do with it once it’s developed.
Recent history shows enough terrible decisions made by humans to make that not exactly reassuring. But those decisions are what should govern the music industry’s approach to AI now.
Driven by some of the biggest tech giants on earth, AI music technology is advancing rapidly. Encouraging humans (yes, alright, sometimes with a regulatory stick) to do good things with it is the key challenge.