Super Hi-Fi wants to revamp music-streaming with AI… inspired by radio


Of all the startup demos I’ve been given over the last decade working for Music Ally, Super Hi-Fi’s is the first that involves listening to Seal’s 1990s hit ‘Crazy’ being mixed into classic jazz cut ‘Monk’s Dream’ by Thelonious Monk. Oh, and it’s an AI doing the mixing, not a human.

Super Hi-Fi isn’t an ‘AI DJ’ startup, though. Its technology is more akin to an AI radio producer: capable of mixing tracks into one another, but also dropping in radio-style branding stings; announcements of artist names and track titles; and cueing up interview snippets.

The US startup isn’t trying to put radio producers out of work. Instead, its technology is designed for music-streaming services, to make them sound a bit more like, well, traditional radio.

“It could be personal, heartfelt stories from artists, but it could be sports, news, a sound effect, or it could just be that the songs blend together properly,” CEO Zack Zalon tells Music Ally, as his co-founder Brendon Cassidy prepares the demo.

“What comes out of your speakers should sound like something that’s been hand-tuned by somebody that really cares. But you can’t [currently] do that at scale, in real-time, with millions of people listening to millions of streams.”

That’s the other twist: the audio that Super Hi-Fi’s AI can drop into people’s streams won’t just match the songs being played, but also whatever data the streaming service has on that listener: news and sports stories, plus ads, that are relevant to their interests, location and demographic.

This is the pitch, anyway: Super Hi-Fi’s clients so far, iHeartRadio and Peloton, have started off by using its entry-level feature of making songs mix together more seamlessly than your average streaming-service crossfade. But the startup already has a powerful ally in its corner evangelising its more-ambitious features: Universal Music Group.

UMG and Super Hi-Fi announced a ‘strategic partnership’ in June that would see the companies “work together to introduce Super Hi-Fi’s powerful AI tech to UMG’s partners across the globe and to co-develop new ways to enhance and promote UMG artists and music”.

Universal actually plays an earlier role in Super Hi-Fi’s founding story. Zalon and Cassidy first met nearly 20 years ago, when they were working for Farmclub.com, an ‘online record label’ scouting for unsigned acts, founded by UMG’s Jimmy Iovine and Doug Morris, with its own (short-lived) TV show spin-off.

Since then, the pair have launched internet-radio service Radio Free Virgin for mogul Sir Richard Branson; and through their We See Dragons digital agency, built CBS Radio’s digital platform; rebuilt AOL Radio and Yahoo LaunchCast – “when those were platforms that mattered!” notes Zalon; and built the Muve Music service for US telco Cricket Wireless, which was a big success with 3.5 million subscribers – at least until larger telco AT&T bought Cricket in 2014 and Muve Music shut down.

We See Dragons has also worked with sports teams; made apps for National Geographic; and even built “the world’s biggest consumer diabetes management app” for Johnson & Johnson. Along the way, Zalon says they learned how those kinds of companies brand themselves, and looked back at the evolving music-streaming market with a critical eye as a result.

“Digital music services today don’t really stand for anything. You can’t talk about the values of any one, or point to much that’s different between them,” suggests Zalon.

“Without disparaging them, they all have the same music catalogue, they’re all available on the same platforms, and the one thing that’s been different is the visual interface: the UI. But as more and more consumers are listening to music on smart speakers, where there’s no visual UI at all, those services are starting to lose out on the one thing that’s allowed them to be different.”

This is the core of Super Hi-Fi’s pitch to streaming services. First, that they need to have stronger audio brands. Second, that the on-air ‘between-songs’ content of traditional radio stations is the place they should look for inspiration. And third, that AI (and specifically Super Hi-Fi’s AI) will be the key to doing this in a ‘mass-personalised’ way, for hundreds of millions of listeners in real-time.

“That’s what makes the difference: the space between the songs and how people are using that is what ultimately defines what companies stand for and sound like, and what makes them different,” says Zalon.

“Our vision is to use some of the techniques of radio, powering the spaces between the songs with great content and great experiences, but in a way that’s incredibly personalised. And to make it sound great,” is Zalon’s summary of his pitch. “And we can help services to differentiate themselves.”

It’s a good pitch, and the demo matches its quality. Cassidy starts by showing the AI mixing Ramones’ ‘I Wanna Be Sedated’ into Blondie’s ‘Call Me’, and then Mark Ronson’s ‘Uptown Funk’ into Walk The Moon’s ‘Shut Up And Dance’.

Cassidy explains that Super Hi-Fi started by teaching its AI to understand music, and then to mix it together in a way that goes beyond simple cross-fading or beat-matching. It’s also capable of creating a large number of different mixes between two specific songs, rather than simply the same one each time.

“Then we started teaching the AI about more things: about content, things that can be interspersed with and overlaid on top of songs,” says Cassidy, playing a new version of the Ramones-to-Blondie mix with a radio station-like sting added in.

Super Hi-Fi’s technology can also choose different voices (in terms of speed and intensity) to match the tracks – a feature demonstrated by the AI picking a “chainsmoking classic-rock guy” sting to go between Coldplay’s ‘Magic’ and Rush’s ‘Tom Sawyer’.

The next part of the demo sees the transition between Nicki Minaj’s ‘Anaconda’ and Kanye West’s ‘Stronger’ bridged by the announcement of Kanye’s name, a clip of him talking in an interview, and then a spoof DSP ident (“Super Hi-Fi! The most revolutionary music service!”) as his track kicks in.

“All of this production would have taken a music person in a radio station 45 minutes to an hour to put together. The AI did it in micro-seconds,” says Cassidy.

Showing a demo of what Super Hi-Fi’s AI could do in real-time is one thing; proving that it can work at the kind of scale involved with services like Spotify, Apple Music and Amazon Music is quite another. Zalon admits that pitching this kind of technology has had its challenges.

“We went out to every digital music service and proselytised this vision: ‘You have to create something that sounds good, and you have to take radio-like experiences and integrate them into the stream, to create these personal experiences that are identifiable and differentiated’,” he says. “Everyone was excited, but for a while nobody would integrate it. Nobody would buy what we were selling.”

There was a moment two years ago when Zalon and Cassidy considered giving up on the idea and focusing purely on their digital agency, but the pair chose to redouble their efforts, hopeful that the streaming services would eventually come round to their way of thinking.

How Super Hi-Fi’s website explains how its technology works

The iHeartRadio and Peloton partnerships, even though they’re only using a part of Super Hi-Fi’s promised capabilities, are a start: Zalon says that even just mixing tracks together has improved a ‘time spent listening’ metric for iHeartRadio by 10%.

The Universal Music partnership is another step forward, although how much clout UMG really has to convince its DSP licensees to test this technology remains to be seen. Even so, Zalon says that more deals will follow in the near future.

“There are others that I’d love to be able to talk about that we’re going to be announcing later this year, that we think can be even more powerful,” he says, while also enthusing about what the tie-up with UMG says about how major labels see the music/tech world in 2019.

“If every single service sounds exactly the same, with the same experience, the same catalogue, quality, at some point it’s just going to be a race to the bottom with price,” he suggests. “That’s one of the reasons they [UMG] are very strong supporters. We can help their licensees to get more value and to drive more interest from their consumers.”

“Universal is certainly foremost among them, but it seems like all the labels are super-pumped about innovation. They realise their future and their fortunes are really tied up in how fast they can get streaming services to grow, so they’re very focused on having people do innovative things. Unlike 20 years ago, they’re not approaching the market with fear any more.”

If Super Hi-Fi’s technology can do at scale what its creators promise it can, it may be coming at a good time, given the clear priority that music-streaming services are placing on attracting listeners (and advertisers) away from broadcast radio.

There’s a comment I keep coming back to, in this regard, from Spotify’s chief financial officer Barry McCarthy, when he was speaking at a Goldman Sachs conference in September 2018:

“The 20-year trend here is linear dies, everything on-demand wins. Instead of free / paid, it’s paid plus free, and free eats broadcast radio,” said McCarthy. “I don’t know what happens to news and I don’t know what happens to sports, but for sure, the combination of your phone plus a voice-activated interface enables the car, and the car is the principal user experience for broadcast radio. Broadcast radio, SiriusXM are extremely threatened by the growth and evolution of streaming services.”

Right now, what’s happening to news and sports for Spotify is mainly about podcasts, and also its early experiments to blend clips from those spoken-word shows with music: for example its Your Daily Drive personalised playlist, which launched in the US this year.

Super Hi-Fi (or technology like it) could play a role here. The company already has a partnership with Associated Press to help that news organisation turn its reports into the kind of audio snippets that can be slotted in to streams by the startup’s AI.

Zalon says that Super Hi-Fi is willing to play this kind of middleman role for other media companies, to ensure there’s a good pool of audio to pull from. That said, the big streaming services also have the access to talent and studio facilities needed to create the artist-interview content, and in Spotify’s case, the podcast producers capable of creating daily news and sports content too.

Super Hi-Fi is self-funded so far, although Zalon prefers not to use the ‘bootstrapped’ label: “It’s not like we haven’t spent that much. We just did it with our own money, we didn’t take money from a VC…” He adds that the company is now fielding interest from external investors, which may come in handy in the future as and when it needs to “hyper-scale” the business.

Still, a lack of investors so far means no ticking clock yet on an ‘exit’ for Super Hi-Fi. And if its tech does deliver, there will surely be opportunities.

In a way, it reminds me of The Echo Nest back in the day, with its music recommendation and personalisation technology, which was used by a number of streaming services right up to the point where Spotify bought it in 2014, and its rivals had to find alternative tech.

You could imagine a similar outcome for Super Hi-Fi, although tradition dictates that when a journalist asks a startup about this kind of thing, there’s a set answer that founders are obliged to give. Zalon knows this too.

“It sounds very clichéd to say ‘we’re not building this to sell it, we’re building it to build it’, but our internal brand message really is that if we want to power the ecosystem of enhanced music experiences, we have to be out there in a lot of different systems,” he says, with a smile.

“We have a product roadmap that extends way beyond what we’re able to show today, and we believe our valuation is not going to be dictated today, but by what we’re going to be doing three to four years down the line. Right now our goal is singularly to grow this business into a profitable centre of innovation for digital music services to take advantage of. We honestly have zero interest in selling.”

I remember a similar conversation with The Echo Nest’s CEO Jim Lucchese a couple of years before Spotify stepped in – “Exit opportunities? If we can be the data engine for understanding music, it can be an incredibly valuable and incredibly profitable business,” he said in 2012. He wasn’t lying: it’s just that as his company delivered on its promise, it ultimately attracted an offer too good to refuse.

But Zalon is also right to emphasise that Super Hi-Fi is heads-down focusing on building its technology, and proving it works with live, large-scale clients – which starts by convincing them that it’s an appealing and even necessary addition.

He returns to the radio comparisons, citing two prominent Los Angeles radio stations that play “90% the same music”, yet one of them makes more than double the annual revenues of the other.

“Same music, same delivery technology. The difference is the content they have between the songs, and the way they produce that content. You have to make it sound right, and it has to be presented the right way,” he says.

“Spotify, Pandora, Deezer and the others realise they have to create this large tapestry of listening experiences that’s about more than just music. They have to elevate the production and presentation experiences of it. I know it’s self-serving for me to say that! But we also know this to be a fact!”

“We had some really advanced scientists on our team, and we put them on the task of figuring out how to develop technology powerful enough to make real-time content and production decisions that sound like a human is producing it. We ended up with the world’s most powerful artificial intelligence able to do that that’s ever existed.”

Written by: Stuart Dredge