Smart speakers like Amazon’s Echo and Google Home are selling in their tens of millions, and it’s becoming clear that voice assistants like Alexa and Google Assistant (and Apple’s Siri once its HomePod device ships) will be increasingly common ways for people to interact with music.
But what does this really mean? What should the music industry be doing to adapt to it? And how soon will this be truly mainstream? A session today at Music Biz and Music Ally’s NY:LON Connect conference in New York explored the issues.
The speakers were Ryan Redington, director of Amazon Music; Pete Downton, deputy CEO of 7digital; and Ryan Taylor, director of partnerships at Sonos.
Downton spoke first, focusing on the behaviour of music listeners. He presented research showing that the “passive massive” listen to music mainly on the radio, and also a bit on YouTube. Meanwhile, “music enthusiasts” use subscription services, downloads, CDs and vinyl.
“There are two constituents in the music consumption world,” he stressed, saying of the enthusiasts: “They go to the effort of subscribing to a streaming service today, but maybe 20 years ago they’d have gone to the effort of finding a store and buying an album on the day of release.”
Downton thinks the music industry needs to think more about that passive massive in the streaming era. “Consumption of music is still dominated by free-to-air services. Principally radio. The UK is the second largest market for Spotify in terms of revenue… but for all of the years of effort, we’ve not really shifted behaviour a great deal when it comes to listening to radio. This is where the passive massive still lives,” he said.
All the streaming services combined account for 15% of music listening in the UK, according to research from AudienceNet cited by Downton, with YouTube accounting for 9%, and radio 52%. “Spotify is still the same size as Radio 2 when it comes to people’s music listening,” he noted.
Could smart speakers change this? “12 months ago, if you were to look at a Sonos owner in the UK, maybe 70% of the listening time was spent listening to the radio, and 30% listening to streaming. If you look at the Amazon Echo, it’s the mirror opposite. Why? Because these experiences are extremely convenient… This is the most profound change in consuming music that we’ve seen.”
Downton suggested that the last big change in user interfaces came with the iPhone in 2007, and that platform – smartphones and apps in general – is what helped services like Spotify to grow. His view is that voice-controlled devices are the next big change.
“The voice platform changes everything. I think the convenience will drive adoption in the next few years, but then we’ll start to see those deeper, richer experiences that we all know the music industry needs to sustain and grow the music industry,” said Downton.
Already, Edison/NPR research suggests that 70% of smart speaker owners listen to more music in the home after buying one; that 39% say they spend less time listening to traditional radio; and that 31% make e-commerce transactions through their device.
Downton finished by talking about cars, with the automotive industry scrambling to sign partnerships with Amazon and Google to use their Alexa and Google Assistant technology, while startups like SoundHound, with its Houndify voice assistant, are also cutting deals.
“Until now, digital music has been about smartphones,” said Downton. But now it’s going to be about how smart speakers flow music through people’s homes – and their cars.

Amazon’s Ryan Redington was next to speak. He started working on music when Amazon had its MP3 store, and his time there has taken in its cloud player, its AutoRip feature, its move into streaming in 2014 with Prime Music, and then Music Unlimited in 2016.
Then came the introduction of Amazon’s Echo and its Alexa voice assistant. “It forced us to completely reimagine our music business… there’s not even a search bar, it’s literally just people speaking. You’re forced to reimagine how to manage a music service and interact with customers,” said Redington.
“Music is consistently one of the top use cases: it shows the power of bringing music back into the home,” he continued, before outlining some of Amazon’s findings on what drives engagement around music on Echo devices.
The first is removing friction: the ability to say ‘Alexa, play music’ rather than spending too long trying to think of what to play. “It takes two seconds, if that, and you immediately have music playing. That immediately opens up the opportunity to listen to more music… It makes it so easy for the consumer to actually engage in the music they want to listen to.”
The second is speaking naturally: “this ability to talk about music just like talking to a friend, and we do the work on the back-end to work out what you want to listen to,” as he put it, with ‘Alexa, play upbeat pop music from the 90s’ being the famous example. “I don’t think I’ve ever seen that search typed in to our music service in the past, but people talk like that to Alexa,” he said.
This brings metadata challenges: for example, the way albums released on CD had street dates of the 1980s and 1990s, even if they were originally released in the 1960s or 1970s – purely because the digital metadata had come from the CD release. Not good for ‘play me some 90s music’ type queries. “We realised what we were serving to customers was completely the wrong era. That’s unacceptable!” he said.
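To make the era problem concrete, here is a minimal Python sketch. It is not Amazon's implementation: the `Album` fields, the helper names and the sample catalogue are all hypothetical, chosen only to show why a decade query should prefer the original release year over an edition's street date when both are known.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Album:
    # Hypothetical record; field names are illustrative only.
    title: str
    edition_year: int                     # street date of the CD or digital edition
    original_year: Optional[int] = None   # first release year, if known


def era_year(album: Album) -> int:
    """Prefer the original release year; fall back to the edition year."""
    if album.original_year is not None:
        return album.original_year
    return album.edition_year


def in_decade(album: Album, decade_start: int) -> bool:
    """True if the album's era falls in the ten years from decade_start."""
    return decade_start <= era_year(album) < decade_start + 10


# Made-up catalogue entries for illustration.
catalogue = [
    Album("Sixties classic (CD reissue)", edition_year=1994, original_year=1967),
    Album("Nineties debut", edition_year=1995, original_year=1995),
    Album("Unknown provenance", edition_year=1991),
]

# A '90s query keyed on edition_year alone would wrongly include the reissue.
naive_nineties = [a.title for a in catalogue if 1990 <= a.edition_year < 2000]
nineties = [a.title for a in catalogue if in_decade(a, 1990)]
```

Keyed on the edition date, the 1967 album lands in the 90s results; keyed on the original year, it drops out. The third entry shows the remaining weakness: where no original year is recorded, the fallback can still be wrong.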
Amazon has also been trying to teach Alexa the difference between fast songs and slow songs, as well as interpreting ‘Alexa, play the song that goes…’ lyric-based enquiries. And then ‘Alexa, play the new song by Ed Sheeran’ – a query that relies on Alexa understanding what the “new” Ed Sheeran song is at a point in his album cycle when he may be working his fourth single.
“What song is the label working on radio? That’s probably the song that the customer wants,” said Redington. “In a voice world, the bar is extremely higher. We don’t have the luxury of returning a bunch of search results, returning 10 results above the fold, and as long as one of them is right, we’re good. In a voice environment without a screen, we just have to start playing music.”
Amazon is looking carefully at how Echo owners are interacting with music, and developing new metadata accordingly where necessary.
“We’ve seen a lot of requests based on moods: ‘play me happy music’ or ‘play me sad music’,” said Redington. Amazon could make playlists for happy and sad songs, but it’s trying instead to add a layer of personalisation: tagging songs as happy and sad (and other moods) so that Alexa can then use its knowledge of the listener’s preferences to play them the songs that match their tastes, as well as their mood.
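The difference between a fixed mood playlist and mood tags combined with personalisation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of how Alexa works: the function name, the mood tags and the per-listener taste scores are all invented for the example.

```python
from typing import Dict, List, Set


def pick_for_mood(
    mood: str,
    mood_tags: Dict[str, Set[str]],
    taste_scores: Dict[str, float],
    limit: int = 2,
) -> List[str]:
    """Filter songs by mood tag, then rank them by one listener's taste score.

    Hypothetical helper: songs with no taste score are treated as 0.0.
    """
    candidates = [song for song, tags in mood_tags.items() if mood in tags]
    ranked = sorted(candidates, key=lambda s: taste_scores.get(s, 0.0), reverse=True)
    return ranked[:limit]


# Made-up mood tags shared by every listener.
mood_tags = {
    "Song A": {"happy", "upbeat"},
    "Song B": {"sad"},
    "Song C": {"happy"},
    "Song D": {"happy", "chill"},
}

# Made-up taste scores for two different listeners.
listener_one = {"Song A": 0.2, "Song C": 0.9, "Song D": 0.5}
listener_two = {"Song A": 0.8, "Song C": 0.1, "Song D": 0.4}

happy_for_one = pick_for_mood("happy", mood_tags, listener_one)
happy_for_two = pick_for_mood("happy", mood_tags, listener_two)
```

The same ‘play me happy music’ request returns a different ordering for each listener, which a single editorial playlist could not do.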
“The way that we like to think about the music business is in three waves. The first was really driven by the phone… the next wave is coming from the home, and to pick a case in point, we now see on Alexa more music consumption on Amazon Music is happening on Alexa as a platform than mobile and iOS combined,” he said. “And the last part comes in the car.”
Last to speak was Sonos’ Ryan Taylor. He talked stats too: Sonos customers say they listen to 78.6% more music than before their purchase – this is all Sonos systems, not just its latest smart speaker. 31% only use free services, 27% only use paid services, and 43% use a combination of the two.
Sonos launched the Sonos One, its first smart speaker, a few months ago. A couple of findings: “More people in the household are listening, but we’re also seeing more engagement: 15% more music listening for voice households than non-voice households,” he said. This may partly be because these are early adopters, he admitted.