“What’s the end-game for this? There isn’t this place in the world where teenagers come together to make music for each other. That place does not exist, and that’s nuts! That thing needs to exist, and it will exist. And getting the AI working is the price of admission to build that thing…”
Stephen Phillips, CEO of Australian startup Popgun, thinks that the early business models in this sector – AI-music as a replacement for production music, for example – are just a sliver of the ultimate potential for this technology.
What’s more, his thoughts on how AI music might disrupt the current music industry are less about people choosing to listen to AI-made music instead of human-made music than about people (non-musicians) using AI tools to make music for one another. Specifically: kids.
“Where’s the ‘pop stars on training wheels’ place where they make music for each other, release it and watch each other pretend to be pop stars, but then go on to become legitimate pop stars? Who’s going to create that space where the next Billie Eilish comes from?” he says.
“The current pop industry is very few musicians controlled by three or four companies, played by a billion people. The new thing has to start with kids making music for each other and ignoring all of that world. It’s still too hard, though, and AI has to be the answer for that.”
This is not (yet) Popgun’s business. Music Ally first wrote about the Australian startup in June 2017 when it was teaching an AI called ‘Alice’ how to play piano with humans; then again in November 2018 when it showed how its AI had learned to compose and play drums, bass and piano together; and then again in July 2019 when it showed off how it had taught an AI to sing.
In terms of products, Popgun has released Splash Pro, a plug-in for the Ableton Live software that lets musicians quickly create original, AI-generated compositions, including custom drums and vocal ideas, when they’re working on their own projects. Another product, Gloss, uses AI to master music, podcasts and video audio. The company’s website trails a third product, Splash, as ‘coming soon’. You might surmise that it takes the technology used for Splash Pro and makes it available for non-pros in some way.
Phillips isn’t talking about Splash just yet, but it’s clear that the company has been experimenting with AI-music products designed for a mainstream (and young) audience.
It’s not just Popgun talking about the potential for an app that does for music-making what Instagram did for photography and TikTok did (particularly for kids) for videography. And that idea sideswipes some of the familiar, sceptical responses to the notion of AI-generated music – for example, that it lacks the backstories and/or societal roots of human artists. The meaning of, say, Adele is about far more than her music.
And this is true. But if something emerges that is not about ‘AIs making music for humans’ but about ‘teenagers using AI tools to make music for one another’, then there will be no shortage of meaning, and cultural connections.
“It’s us who give the meaning to the music: it’s just functional music if there’s no human associated with this. Billie Eilish is not a star because her stuff sounds so much better than angry teenage stuff we’ve heard before. She’s compelling as an artist,” says Phillips.
“But think about all these twentysomething hipster-songwriters in LA googling what 13 year-old girls say to each other, so they can write songs for them. 13 year-old girls know what they say to each other!”
“Once they have the tools to make music that sounds great, they’ll make music for each other, and it’ll sound incredibly genuine to them. Instead of Taylor Swift pretending to be a 15 year-old, there’ll be an actual 15 year-old, talking about the things 15 year-olds relate to most.”
An app like this that truly makes waves feels a while off (then again, who saw TikTok coming a couple of years ago?). But Phillips’ views are a reminder that what we’ve heard of AI-music so far, and seen in terms of business models, represents the very early days in the commercial history of this area – as opposed to the academic/research history, which stretches back decades.
“Like all new things, people overestimate its impact in the short term, and underestimate its impact in the long term,” suggests Phillips. “When it really started to get serious in 2016 to 2017-ish, people were like ‘It’s going to put everyone out of work!’ And now? It’s going to put some functional-music people out of work is my guess. But that’s not the end-game.”
Popgun’s AI, as its YouTube demos testify, has evolved rapidly and impressively since the company emerged in 2017. Phillips has a sharp sense, however, of what the current technology is capable of – and where it still falls short of what humans can do. It turns out that one of the latter is that traditional A&R skill of spotting a great piece of music.
“We haven’t tackled the core problem of quality or taste. I can go and jump on our piano model and spit out a million piano tracks, eight bars long in any different key, and within that million there’ll be one or two that are amazing. What I don’t have is a computational way to work out which ones they are,” he says.
“We’ve spent two years working on how we model these instruments so they sound like humans are playing them, with 10 guys as good as anyone’s got. These instruments know how to play the piano! But they don’t know if what they have done is good. Right and good are different things.”
“They know what they’re playing is not out of tune – they’re trained on stuff that’s in tune – but they don’t have any clue if what they are playing is original or interesting. That’s why we need humans to curate it.”
Phillips relates the tale of a recent conference he attended, where a well-known tech CEO made the confident prediction that AIs would be having chart hits within the next decade, because they’d be able to generate an endless supply of music, partnered with an algorithm that ranks them and releases the best.
“I was like: it’s the second bit that’s the problem!” laughs Phillips. “It’s just not possible right now: to say that’s a solved problem is just not credible to anyone who works in this space. ‘Humans can’t predict whether this song will be a hit, but my AI can’? Even people who don’t work in music AI, but work in AI, would dismiss that as not possible.”
Phillips is just as interesting on another important question for anyone working in AI music and the industries it affects: is this about making music that sounds familiar from the genres we already know, or about creating new sounds and new genres?
He’s forthright on this, based on Popgun’s research into composition for its technology. “AI shouldn’t just compose original music. It should sound completely different. It should be pleasurable, but also ‘Holy hell! Like nothing I’ve ever heard before!’ We’re only just starting to work on that.”
Which brings us back to that idea of a consumer product for non-musicians (teenagers especially, but not only them) to make music for one another. If it sounds slightly… alien, or jarring compared to the music we have already – if it puzzles or even angers people outside this community? Well, that would surely only add to its authenticity among the new music-makers.
“The implications for the pop-music industry are interesting. I don’t think it does anything but help music. Labels will exist to harvest the best and brightest of those kids and turn them into pop stars. Whatever happens with AI music, there’ll still be 20 super-engaging, charismatic people better than everybody else at the top of the charts,” says Phillips.
An interesting facet of our conversation focuses on Popgun’s work teaching an AI to sing, with vocals having been one of the biggest challenges for AI-music startups to crack so far. Phillips admits that in 2018, he thought the task of AI-vocals would be much simpler than it has proved to be. “A year later, we realised this is really hard!” he laughs.
“Singing is so much harder than speaking: you’ve got to sing at different pitches, there are all these controls. And no decent data-sets anywhere: there are thousands of years of people speaking, but very little of people singing dry without reverb. We had to record all of our own singers.”
Phillips estimates that Popgun is “still one, maybe two generations away” from an AI that can sing in a manner that people would enjoy listening to. At that point, the company may have 10 or 20 AI singers that users of its products can control by typing in lyrics to be sung.
“It’s going to be their voice: but what they imagine it would be like, if they could sing. And at that point, they’ll really start lip-synching and making videos,” says Phillips, before warning that this technology will not serve to bury genuine musical talent.
“Even if that exists, some kid who’s super-talented will come along and make a better version than the other kids. That’s what pop stars are!” he says.