Actress Scarlett Johansson released a statement this week expressing her outrage and concern over OpenAI's use of a voice that is “eerily similar” to her own as the default voice for ChatGPT.
The voice in question, called “Sky,” has been available to users since September 2023, but the similarity to Johansson's voice became more apparent last week when OpenAI demoed an updated model called GPT-4o. Johansson claims that OpenAI CEO Sam Altman had previously asked her if she would like to provide her voice for ChatGPT, but that she declined the invitation.
The warm, playful tone of Sky's voice is strikingly similar to that of Samantha, the digital companion voiced by Johansson in the film Her (2013).
Though Altman has since claimed that Sky's voice was not intended to resemble Johansson's, he seemed to allude to the connection by tweeting the single word “her” on May 13, 2024, the day GPT-4o launched.
OpenAI later explained the process for creating Sky's voice in a blog post, saying that the voice was “created by another professional actress using her own natural speaking voice.” But as the audio samples needed to generate synthetic voices become smaller and smaller, it's becoming easier than ever to replicate a person's voice without their consent.
As a scholar of acoustics, I am interested in the ways in which AI technologies are raising new questions and concerns about voice and identity, and my research situates recent developments, anxieties, and aspirations around AI within a longer history of voice and technology.
Stolen Voice
This is not the first time a performer has challenged the unauthorized imitation of their voice.
In 1988, Bette Midler filed a lawsuit against the Ford Motor Company for using a voice similar to hers in a series of advertisements. The U.S. Ninth Circuit Court of Appeals ultimately ruled in her favor, with Circuit Judge John T. Noonan stating in his decision that “the impersonation of her voice constitutes the appropriation of her identity.”
Tom Waits filed a similar lawsuit against Frito-Lay after hearing a voice that sounded like his own raspy one in a Doritos radio commercial. As musicologist Mark C. Samples has noted, the case “elevat[ed] the timbre of a person's voice to the level of his visual representation” as a matter of legal precedent.
Lawmakers are only just beginning to address the challenges and risks that come with the widespread adoption of AI.
For example, the Federal Communications Commission recently ruled to ban robocalls that use AI-generated voices. In the absence of more specific policies and legal frameworks, these voice imitation examples continue to serve as important precedents.
Chatbots and Gender
Whether or not OpenAI intended Sky's voice to evoke the film Her, its design also places ChatGPT within a longstanding tradition of assigning female voices and personas to computers.
The first chatbot was built in 1966 by MIT professor Joseph Weizenbaum. Weizenbaum designed the program, called ELIZA, to converse with users in the manner of a psychotherapist. ELIZA influenced and inspired today's digital assistants, many of which default to a female voice. When Siri was first released in 2011, it referred to ELIZA as if she were a friend.
Many science and technology scholars, including Thao Phan and Heather Woods, have criticized the way tech companies appeal to gender stereotypes when designing voice assistants.
Communications scholars Jessa Lingel and Kate Crawford suggest that voice assistants evoke the historically feminized role of the secretary, as they perform both administrative and emotional labor. By invoking this subservient metaphor, they argue, tech companies are attempting to distract users from the surveillance and data extraction that voice assistants carry out.
OpenAI said it sought a “trustworthy and familiar voice” when selecting the voice for ChatGPT. It is telling that the company chose a female voice to put users at ease with the rapid advances in AI technology. Despite the remarkable progress in voice assistants' conversational abilities, Sky's voice shows that the technology industry has yet to move beyond these regressive tropes.
Protecting Our Voices
Johansson's statement concludes by calling for “transparency and appropriate legislation” to protect voice likeness and identity. Indeed, it will be interesting to see what legal and policy ramifications emerge from this high-profile case of unauthorized voice simulation.
But it's not just celebrities who should be concerned about how their voices are used by AI systems: our voices are already being recorded and used to train AI by platforms like Zoom and Otter.ai, and even virtual assistants like Alexa.
An unauthorized AI imitation of Johansson's voice may seem like the stuff of a dystopian future, but it is best understood in the context of ongoing debates about voice, gender, and privacy: a sign of what's already here, not what's to come.
Courtesy of The Conversation
This article is republished from The Conversation under a Creative Commons license. Read the original article.
Citation: ChatGPT's use of a voice that sounds like Scarlett Johansson's reflects the troubling history of gender stereotypes in tech (May 26, 2024) Retrieved May 26, 2024, from https://techxplore.com/news/2024-05-chatgpt-soundalike-scarlett-johansson-history.html
This document is subject to copyright. It may not be reproduced without written permission, except for fair dealing for the purposes of personal study or research. The content is provided for informational purposes only.