First, OpenAI offered tools that let people create digital images simply by describing what they wanted to see. Then it built similar technology that generates full-motion video resembling Hollywood movies.

Now, the company has unveiled technology that can recreate someone's voice.

The popular AI startup announced Friday that a small group of companies is testing a new OpenAI system called Voice Engine, which can recreate a person's voice from a 15-second recording. If you upload a recording of yourself and a paragraph of text, the system can read the text aloud in a synthetic voice that resembles your own.

The text does not have to be in your native language. For example, if you speak English, your voice can be reproduced in Spanish, French, Chinese, and many other languages.

OpenAI is not sharing the technology more widely because it is still trying to understand the potential risks. Like image and video generators, voice generators could help spread disinformation across social media. They could also allow criminals to impersonate people online or over the phone.

The company said it was particularly concerned that this type of technology could be used to defeat voice authentication systems that control access to online banking accounts and other personal applications.

“This is a sensitive issue and it’s important to get it right,” OpenAI product manager Jeff Harris said in an interview.

The company is exploring ways to watermark synthesized voices and add controls to prevent people from using the technology with the voices of politicians and other celebrities.

Last month, OpenAI took a similar approach when it announced its video generator Sora. Although the technology was demonstrated, it was not released to the public.

OpenAI is one of many companies that have developed a new breed of AI technology that can quickly and easily generate synthetic voices. They include tech giants like Google as well as startups like New York-based ElevenLabs. (The New York Times has sued OpenAI and its partner Microsoft for alleged copyright infringement involving artificial intelligence systems that generate text.)

Companies can use these technologies to generate audiobooks, give voice to online chatbots, and even build automated radio station DJs. Since last year, OpenAI has used its technology to power a conversational version of ChatGPT. It has also long offered businesses a set of voices that can be used for similar applications. All of those voices were built from clips provided by voice actors.

But the company does not yet offer a public tool that, like Voice Engine, lets individuals and businesses recreate a voice from a short clip. The ability to recreate any voice in this way is what makes the technology dangerous, Harris said. He added that it could be especially dangerous in an election year.

In January, New Hampshire residents received robocalls discouraging them from voting in the state's primary, delivered in a voice that was most likely artificially generated to sound like President Biden. The Federal Communications Commission later banned such calls.

Harris said OpenAI has no immediate plans to make money from the technology. He said the tool could be particularly useful for people who have lost their voice due to illness or accident.

He demonstrated how the technology could be used to recreate a woman's voice after it was damaged by a brain tumor. She was able to speak again, he said, after providing a simple recording of a presentation she once gave as a high school student.


