Big data isn’t a new music-industry trend. Labels, managers and artists have been awash in data since the early days of the digital music era.
It’s true that the emergence of streaming has intensified all this: millions of lines of data on streams and social signals, which musicians and their teams have to make sense of. A pair of sessions at AIM’s Music Connected conference today in London explored what it all means.
Chris Carey of Media Insight Consulting kicked off with a session exploring how the music industry can use data well, rather than being overwhelmed by it.
“Data doesn’t always have all the answers,” said Carey, noting that Leicester City are currently one win away from winning the Premier League, despite 5,000-to-1 odds at the start of the season.
“One of my big concerns for people learning how to use data is that it can discourage you from doing bold things,” he said. “Data will give you a lot of information, it will give you insight, but it will very rarely give you the absolute answer… People don’t always know what’s good for them. Part of the role of our industry is to lead them: to lead consumers to a better place.”
He added that “big data often sits in silos” – incredibly detailed data from Twitter, Facebook, Spotify and other digital services, but it only covers each of those individual services rather than showing you how they hang together. “Big data tells you what happened very well… but you often don’t understand the attitude that underpinned the behaviour,” said Carey.
Carey outlined some of his company’s latest research: a survey of 2,008 UK adults, conducted online earlier this month. Among the findings: 16-24 year-olds are using streaming playlists to discover new music, while 35-44s are gigging as much as younger groups.
What do people spend money on? Paid streaming is popular among 25-34s, but CDs still hold up “remarkably well” – although Carey noted that his survey’s question about what people had spent money on in the last “three months” might include their Christmas spending – CD gifts included.
How much music are British people listening to? Carey pointed out that in a streaming world we are thinking about share of time as much as share of wallet-spend. 31% of the UK listen to between one and two hours of music a day, 20% listen for 2-4 hours, 8% listen for 4-6 hours, and 4% listen for 6-8 hours. 2% listen for more than eight hours.
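The brackets Carey quoted add up to 65%. Assuming they are mutually exclusive, the remaining 35% of respondents fall outside them, presumably listening for less than an hour a day or not at all (the session report does not say which). A minimal Python sketch of that arithmetic, using only the figures quoted above:

```python
# Daily listening brackets quoted in the session (percent of UK adults surveyed).
listening_share = {
    "1-2 hours": 31,
    "2-4 hours": 20,
    "4-6 hours": 8,
    "6-8 hours": 4,
    "8+ hours": 2,
}

# Share of respondents covered by the quoted brackets.
accounted_for = sum(listening_share.values())

# Share left over, assuming the brackets do not overlap.
remainder = 100 - accounted_for

# Share listening for four hours or more a day.
four_plus = sum(listening_share[b] for b in ("4-6 hours", "6-8 hours", "8+ hours"))

print(accounted_for, remainder, four_plus)
```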
Carey noted that the younger people are, the more they listen. 11% of 16-24 year-olds listen for 4-6 hours, and 6% for 6-8 hours. “If young people now are disproportionately streaming than anyone else, they carry more weight in deciding how much gets paid.”
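Carey's point about weight can be illustrated with a toy calculation. The sketch below assumes a pro-rata model, in which a royalty pool is divided by each artist's share of total streams; the function name, the pool size and the stream counts are all hypothetical and are not taken from the session.

```python
def pro_rata_payouts(pool, streams_by_artist):
    """Split a royalty pool by each artist's share of total streams (assumed model)."""
    total_streams = sum(streams_by_artist.values())
    return {
        artist: pool * streams / total_streams
        for artist, streams in streams_by_artist.items()
    }

# Hypothetical example: two subscribers each contribute 10 to the pool.
# The heavy listener plays Artist A 300 times; the light listener plays
# Artist B 100 times.
pool = 20
streams = {"Artist A": 300, "Artist B": 100}

payouts = pro_rata_payouts(pool, streams)
print(payouts)
```

Under these assumptions the heavy listener's favourite receives three quarters of the pool, although both subscribers paid the same amount. That is the sense in which people who listen more carry more weight.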
37% of the British population are listening to music on YouTube, compared to 12% on iTunes; 10% Spotify’s free tier and 4% Spotify’s premium tier; 1% SoundCloud and 1% Apple Music. “30% of the population use none of these,” said Carey.

Carey also noted that devices skew by age: younger people are more likely to listen on phones and laptops, whereas older people are still listening at home on a sound system.
He moved on to social networks. Instagram and Snapchat skew massively young, which isn’t surprising, but they are also the channels through which those young people most like to be updated about news from artists. That’s why they’re becoming increasingly important to artists and music marketers.
.@MusicEcon101 is comparing Spotify and YouTube at #MusicConnected pic.twitter.com/JGPrH5XHZ8
— Music Ally (@MusicAlly) April 27, 2016

Carey’s speech was followed by a panel discussion moderated by Paul Sanders of The State 51 Conspiracy, and including Spotify’s Will Page, MusicBrainz’s Robert Kaye and data scientist Tijl de Bie.
Kaye kicked off in his role as head of ‘the Wikipedia of music’ MusicBrainz, which was founded in 2004. “Our mission is to curate and make available public data-sets,” he said. “We’re really fiercely independent. We work with some really big companies, but we value our independence. We are open-source, open-data, open-finances.”
Its partners include Google, which uses MusicBrainz’s metadata in its search engine, YouTube and Google Play store. The BBC also uses its data, as does Spotify, which uses it to clean up the data it receives from labels “because the majors send crappy data, and Spotify has a number of data problems”.
MusicBrainz has data on more than 1m artists, 15.7m recordings and 104k labels and publishers, as well as events, places, barcodes, acoustic fingerprints and CD identifiers. “Spotify, Apple Music, YouTube and everyone else is a data silo,” said Kaye of the organisation’s mission. “As a data nerd I am kind of pissed off by that.”
He’s hoping that “a thousand open-source data recommendations” will spring from MusicBrainz’s work, and even if 990 of them aren’t good, that will leave 10 that can have an impact on the digital music world.
Page talked about making something meaningful from the scale of music data that is battering musicians and music companies. “These huge datasets people talk about, there’s ways of folding them into something much more meaningful for an indie label or artist,” he said.
Over to De Bie to talk about “the power of data science” for the music community. He showed some charts based on The Echo Nest’s data on how loudness, harmonic simplicity and danceability corresponded to songs that were hits in decades from the 1960s to the 2010s.
He also talked about predictive machine learning, and technology trying to predict which songs will be hits and which won’t. As he admitted, it’s still a science capable of being confounded by the listening public – the musical equivalent of Leicester City’s stunning season.
De Bie also suggested that the music industry can make more of “exploratory data mining”: areas like how different artists cluster together in genres, or analysing the demographics and locations of Twitter users who tweet about certain artists alongside the features of the music.
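As a rough illustration of what grouping artists by musical features can look like, here is a minimal Python sketch. It uses a nearest-neighbour lookup as a much simpler stand-in for a real clustering method, and it is not De Bie's approach. The artist names and feature values are invented for illustration and are not Echo Nest data.

```python
import math

# Hypothetical feature vectors (danceability, loudness), each scaled to 0-1.
# These values are made up for the example.
features = {
    "artist_a": (0.90, 0.80),
    "artist_b": (0.85, 0.75),
    "artist_c": (0.20, 0.30),
    "artist_d": (0.25, 0.35),
}

def nearest_neighbour(name, table):
    """Return the other artist whose feature vector is closest (Euclidean distance)."""
    others = (other for other in table if other != name)
    return min(others, key=lambda other: math.dist(table[name], table[other]))

# Map every artist to its closest peer in feature space.
neighbours = {name: nearest_neighbour(name, features) for name in features}
print(neighbours)
```

With these made-up values the artists pair off into two groups, one with high and one with low danceability and loudness. A real analysis would use many more features and a proper clustering algorithm.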
The data science of @MassiveAttackUK and trip-hop #MusicConnected pic.twitter.com/I2T2ovk45E
— Music Ally (@MusicAlly) April 27, 2016
He called for more access to data on music. “It makes sense to make it available more often to researchers who don’t want to make any money off you, but want to have fun and play around,” said De Bie. He’d like more music services and companies to make their data accessible through APIs and linked data.
Page took up the baton, talking about the “make or buy decision” that indie labels are faced with; then how to make more productive use of data in the indie sector; and finally what Spotify is doing with its Discover Weekly playlist in terms of data.
“There is a lot of fragmentation, a lot of cost structure issues facing this industry,” he admitted. “There is a lot of duplication in this business. Removing it could reduce costs and perhaps boost revenues.”
Page also talked about Spotify’s Fan Insights analytics tool as a way for the streaming service to get its data out into the open – including a feature ranking the most popular playlists in terms of listens and ‘new listens’ for any artist.
Page encouraged labels to try to “remove the time lag” in responding to this kind of data in their businesses. “You need quality over quantity in that sense,” he said, in terms of indie labels interpreting the data that they’re receiving from digital services.
Page then turned to Discover Weekly, Spotify’s weekly personalised playlist for its users based on their habits. “We’re trying to think about how you apply psychology – not spreadsheets and pivot tables – to data,” he said. What makes Discover Weekly popular? “It’s regular, it’s random, and it’s reinforcing: the more you do it, the more you do it,” said Page. “It’s like the lottery.”