We’ve spoken at length about the wonders and dangers of visual deepfakes produced by AI, so it should come as no surprise that OpenAI’s new initiative—AI that generates audio mimicry of famous musicians—has us divided.
OpenAI is a California-based technology company whose claims to fame include Image GPT and Microscope. Recently, they added “Jukebox” to that list, and while the implications are certainly cool, they’re also terrifying.
The basic premise of Jukebox is simple: Using existing audio samples from a musician, the AI creates a new track that is an “approximation” of that musician’s style and sound. Some iterations are reportedly off-kilter enough to make for unsettling listening, while others sound as if they came straight from the target musician’s mouth.
The catch is that Jukebox is eerily good at creating “new” content in the style of deceased artists, such as Michael Jackson and Elvis Presley, leading to a deeper question of whether doing so is in poor taste, a violation of copyright, or even outright unethical.
Sampling old tracks from deceased artists in new music isn’t new. Contemporary artists such as Drake are known for looping in pieces of audio to complement a hook or a verse, and the music market is saturated with remakes, remasters, and covers of old music. However, one can argue that, as cool as Jukebox’s application is, using AI to impersonate dead people is sketchy territory.
One might point to the famous Tupac hologram that was popular a few years ago as a counterargument. I would personally argue that even that was potentially done in poor taste, but, again, it was a visual aid to existing music, not an AI approximation of what some programmer in California imagined the next Tupac single would have sounded like had he not been murdered.
This brings us to the larger problem that exists with all deepfakes: Accountability. In the past, using your eyes and ears to judge the authenticity of someone’s speech and actions was enough; at times, it was difficult, but it was still enough to arrive at a consensus, personal or otherwise.
Deepfakes make it impossible to trust your eyes and ears the same way. People have already reported egregious instances of visual deepfakes being used to insert their likeness into things like pornography and racist rhetoric; how will increasingly convincing generated audio fit into that picture? On the other side of the argument, how many people will start to claim that offensive or problematic video or audio of them is “faked”? Only time will tell.
This is yet another cautionary tale that one can respect, even revere, the capabilities of technology without pushing its limits to unnecessary lengths. As Jukebox becomes more prominent, we will almost certainly be stuck with the consequences of its success.