The impact of creative AI on musicians may be a hot topic in 2023, but French composer and artist Benoit Carré has been exploring it for much longer.
In 2015 he joined the Sony CSL laboratory as a musician in residence, working on its famous ‘Daddy’s Car’ project to create a song in the style of the Beatles.
Since then, he has released a series of albums and projects using creative AI technologies, latterly under his Skygge pseudonym. We wrote about his use of tools developed at Spotify for an ‘American Folk Songs’ EP in 2019, and then his ‘Melancholia’ album in 2022.
And in 2023? Skygge’s latest release is a new version of his ‘Ocean Noir’ track sung by Grimes’s vocal clone. It’s one of the early projects to use her recently launched Elf.Tech service, which licenses Grimes’s voice to other musicians in return for a share of the royalties on any tracks they release using it.
Skygge’s new version came out on 23 June, complete with artwork created using text-to-image AI Midjourney. Carré told Music Ally more about the thinking behind the project, as well as his views on the various AI creativity debates that are currently raging.
How did the new version of Ocean Noir come about, and what was it like to create?
“I discovered voice cloning about a year ago on Uberduck.com. There was already a wide choice of voice clones: HAL 9000 (from the film 2001: A Space Odyssey), Barack Obama, David Bowie, Lady Gaga, and many more. The technology has come a long way since then. Grimes has put her very realistic Elf.Tech voice clone online.
In 2022, I released a song called Ocean Noir, which was directly inspired by the unlikely combination of a Schubert fugue and a cuíca. The sound I got at the time was alien, strange, and melancholic. Ocean Noir was the right song on which to replace my vocal timbre with a vocal clone.
Elf.Tech was extremely easy to use: a simple drag-and-drop process. It’s important to remember that to use voice cloning technology, you have to sing yourself first! Once you’ve sung the song, you upload your voice (without music) to the cloning technology.
The output you get is exactly the same song, with the same melody and rhythm, but the timbre has been changed. Hearing myself sing with Grimes’s timbre was really cool. It’s creative because I can sing closer to the way she sings and hybridize myself even more.
It’s like playing with your image, changing your identity. As voice is a key element of my identity, I find it very new to be able to mix it with another one; it opens up a new field of creation.”
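The workflow described above – sing the part yourself, upload the dry vocal, get the same performance back with a new timbre – is the general shape of voice-conversion pipelines. Here is a minimal Python sketch of that shape; Elf.Tech’s internals aren’t public, so `convert_timbre` is a hypothetical passthrough standing in for the hosted model:

```python
import librosa
import soundfile as sf

def convert_timbre(vocal, f0, target_voice):
    """Hypothetical stand-in for the cloning service's model: the same
    performance (melody, rhythm) goes in, a different timbre comes out.
    Here it is a passthrough so the sketch runs end to end."""
    return vocal

# Load the dry (a cappella) vocal -- the service wants voice only, no music.
vocal, sr = librosa.load("my_dry_vocal.wav", sr=44100, mono=True)

# Extract the pitch contour; conversion keeps melody and rhythm intact
# and replaces only the timbre.
f0, voiced_flag, voiced_prob = librosa.pyin(
    vocal,
    fmin=librosa.note_to_hz("C2"),
    fmax=librosa.note_to_hz("C6"),
    sr=sr,
)

converted = convert_timbre(vocal, f0, target_voice="target-singer")
sf.write("converted_vocal.wav", converted, sr)
```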
What are your thoughts on this idea of officially-licensed vocal clones controlled by (and rewarding) the original artist?
“I love the idea of artists taking the lead and inventing new businesses. The technology is there now and we need to take advantage of it. The labels will follow if the biggest artists take the plunge. They are the real leaders today.
Technologically, it’s dizzying! At the moment I’m creating a hybrid voice model between a female voice and my own. These days I can train my own model overnight in a Google Colab notebook; just imagine all the mixes you can do.
It’s great, but it’s dangerous because it means I can make a model with Taylor Swift’s voice, hybridize it with other singers and get a vocal clone that’s impossible to trace.”
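Training a model overnight in a notebook is, in outline, a standard fine-tuning loop. The toy PyTorch sketch below shows only the shape of that loop, with random tensors standing in for features extracted from the two voices being hybridised; real voice-cloning models are far larger and train for far longer:

```python
import torch
from torch import nn

# Toy stand-in for overnight fine-tuning in a notebook: a tiny autoencoder
# learns to reconstruct mel-spectrogram frames. Only the loop shape matters.
N_MELS = 80
model = nn.Sequential(nn.Linear(N_MELS, 256), nn.ReLU(), nn.Linear(256, N_MELS))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Placeholder data: random frames standing in for features extracted from
# recordings of the two voices being blended.
frames = torch.randn(1024, N_MELS)

for epoch in range(10):  # an overnight run would be thousands of steps
    for batch in frames.split(64):
        opt.zero_grad()
        loss = loss_fn(model(batch), batch)
        loss.backward()
        opt.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```

The hybridising itself typically lives in the training data or in learned voice embeddings that can be interpolated between two singers; the loop looks much the same either way.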
You’ve been working with AI technology in your music longer than most musicians. What do you think about the debates around this in 2023?
“I started at Sony CSL with Flow Machines, which I helped develop with researchers. I don’t think there’s been anything as exciting since. You could play with styles by feeding it at least two scores, and Flow Machines would generate fragments of music that were sometimes very inspiring.
I think researchers should always take a musical approach to inventing new tools. Today’s text2music technologies give impressive results, but they’re take-it-or-leave-it: they give the impression that there is only one solution, whereas the point of AI is to offer several solutions so that musicians can choose the one that inspires them.
Today’s AI for musicians consists of plug-ins that perform specific tasks. Maybe that’s where we’ll end up: powerful plug-ins for sound processing and possibly for recommendation, suggesting a bassline or drum kit that matches the song you’re working on.
And soon, I think, vocal cloning will invade productions and perhaps live performances with ‘augmented’ singers, since we can already clone a vocal timbre in real time!”
If you were in charge of the music industry, what change would you make in the next year to shape the evolution of creative AI and AI music?
“I dream of tools that allow musicians to really explore their creativity, to go further, and to be surprised by the AI’s suggestions in ways that deepen their uniqueness, as I have been since I started working on this.
Instead of asking text2music tools for ‘Slow-tempo, piano-led melancholic song. Caribbean drums. Vocals are cool and expressive’, we should be able to ask for ‘a melody in C minor inspired by Chopin’s étude number ten, with simple chords and a Caribbean rhythm, tempo 120, 4 bars, percussion, all tracks separate, etc.’ and play with the parameters until we get a result that inspires us – maybe only keep the chords! – and leads us to an original creation.
The power of AI makes it possible to offer highly personalised interaction and mix styles ad infinitum. It’s the complete opposite of the standardisation of music that’s encouraged by the marketplace of sounds/audio files that all sound the same!”
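That wish-list maps naturally onto a parameterised request rather than a single free-text prompt. No current text2music API exposes exactly this interface; the Python sketch below is purely illustrative of the kind of control being asked for:

```python
from dataclasses import dataclass

# Hypothetical parameterised generation request: explicit musical
# parameters instead of one free-text prompt. Illustrative only.
@dataclass
class GenerationRequest:
    key: str = "C minor"
    style_reference: str = "Chopin, Étude No. 10"
    harmony: str = "simple chords"
    rhythm: str = "Caribbean percussion"
    tempo_bpm: int = 120
    bars: int = 4
    separate_stems: bool = True  # melody, chords, percussion as distinct tracks
    n_variations: int = 8        # several proposals, so the musician chooses

request = GenerationRequest()
# A musician could then tweak one parameter at a time (tempo, key, rhythm)
# and keep only the stems that inspire them -- maybe just the chords.
print(request)
```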
How do you feel, as an artist, using tools like text-to-image creators, given the debate around whether those have been trained on visual artists’ works without permission? Is there ever a worry about using something that has not treated creators fairly?
“I don’t know exactly how these models have been trained, but like everyone else I have heard the artists’ protests and I understand them. What’s happening with images is frightening because these technologies are devilishly effective!
If they were as effective for music, all musicians would be very destabilised. Perhaps all the challenges of music, which is more complex to model than images, will one day be solved, and then it will be our turn to be scared.
As an artist, it’s true that I have to ask myself that question (do I use text2image for my album cover?). I used Midjourney to create the cover for the single Ocean Noir, and I made a video for the track with Stable Diffusion.
For me, it’s all part of the SKYGGE project, which explores the collaboration between man and machine. If it were a different project, I would probably have a bad conscience about using these technologies, but you also have to realise that it’s the whole system that’s going wrong.
Streams don’t earn enough; you have to produce more and more because streaming platforms work like a social network that needs to be fed regularly, and at the same time you have to reduce production costs – it’s sooo challenging!”