Earlier today, OpenAI temporarily halted the use of one of its ChatGPT voices, known as Sky, which bore a striking resemblance to the voice of ‘Black Widow’ star Scarlett Johansson. Now, the Hollywood A-lister has accused OpenAI of using a voice strikingly similar to hers in its new GPT-4o chatbot, despite her earlier refusal to collaborate with the company.
On Monday, Johansson released a statement outlining her experience with OpenAI. According to her, OpenAI CEO Sam Altman approached her in September 2023, proposing that she lend her voice to the then-in-development GPT-4o. Johansson declined the offer for undisclosed personal reasons. Fast forward nine months, and when OpenAI unveiled GPT-4o and its suite of voice assistants last week, Johansson was struck by a disturbing realization: the voice assistant dubbed “Sky” sounded remarkably similar to her own. The resemblance, she says, was not lost on her friends, family, or media outlets. Adding fuel to the fire was Altman’s cryptic one-word tweet, “her,” an apparent reference to the film “Her,” in which Johansson voiced an AI that develops a deep relationship with a human.
“Last September, I received an offer from Sam Altman, who wanted to hire me to voice the current ChatGPT 4.0 system. He told me that he felt that by my voicing the system, I could bridge the gap between tech companies and creatives and help consumers to feel comfortable with the seismic shift concerning humans and AI. He said he felt that my voice would be comforting to people. After much consideration and for personal reasons, I declined the offer,” Johansson noted in her statement. She went on to add that Altman had once again contacted her agent a few days before the release of the “ChatGPT 4.0 demo” asking her to reconsider her stance on the matter, and that she is looking forward to “resolution in the form of transparency and the passage of appropriate legislation to help ensure that individual rights are protected.”
The ability to create near-perfect replicas of human voices with AI raises a number of concerns. For one, AI-generated voice replicas could be used to create highly believable deepfakes, in which real people appear to say things they never said. Such deepfakes could damage the reputations of individuals, sow discord within communities, or manipulate public opinion on important issues, especially in the political arena. For another, voice cloning could enable widespread identity theft and fraud, since voice recognition is increasingly used for security purposes, from unlocking smartphones to authorizing financial transactions, and a flawless voice replica could bypass such measures. It may also fuel the growth of social engineering scams.
In light of the backlash, OpenAI announced that it would pause the use of the “Sky” voice. The company said that the voice was recorded by a professional actor and was not intended to imitate Johansson. OpenAI stated that the actor was cast before the company reached out to Johansson, and that it does not disclose the names of its voice actors in order to protect their privacy. Still, Johansson’s legal team has already sent letters to Altman and OpenAI, requesting detailed information on how the “Sky” voice was developed.
For those who are unaware, the controversial “Sky” voice is one of the voices available in ChatGPT’s Voice Mode, which OpenAI showcased with its new GPT-4o model. As the name suggests, Voice Mode lets users interact with ChatGPT by speaking to it, while also acting as a text-to-speech tool that reads answers to user prompts aloud. It was originally available only to paid subscribers, but in November 2023, OpenAI announced that the feature would become free for all users of the mobile app.