OpenAI says it can clone a voice from just 15 seconds of audio

OpenAI just announced a new tool called Voice Engine. It’s a voice cloning technology that can mimic any speaker by analyzing a 15-second audio sample. The company says it generates “natural-sounding speech” with “emotive and realistic voices.”

The technology is based on the company’s existing text-to-speech API and has been in the works since 2022. OpenAI already uses a version of the toolset to power the preset voices available in the current text-to-speech API and the Read Aloud feature. The company’s official blog has a bunch of samples, and they sound pretty close to the real thing. I invite you to listen to them and imagine the possibilities, both good and bad.

OpenAI says it sees the technology being useful for reading assistance, language translation, and helping people with sudden or degenerative speech conditions. The company cited a case in which the technology helped a speech-impaired patient by creating a Voice Engine clone from audio recorded for a school project.

Despite the potential benefits, it’s easy to see how bad actors could abuse this technology to commit serious fraud. With that in mind, Voice Engine isn’t quite ready for prime time, as there are serious privacy concerns that need to be addressed before a full rollout.

OpenAI acknowledges that the technology has “serious risks, especially in an election year.” The company says it is incorporating feedback from “US and international partners from government, media, entertainment, education, civil society and other fields” to ensure the product launches with minimal risk. All preview testers have agreed to OpenAI’s usage policies, which prohibit impersonating another person without consent or legal right.

Additionally, anyone using the technology must disclose to their audience that the voices are generated by artificial intelligence. OpenAI has implemented safety measures such as watermarking, to trace the origin of any generated audio, and “proactive monitoring” of how the system is being used. When the product officially launches, it will have a “no-go voice list” that detects and blocks AI-generated speakers that sound too similar to prominent figures.

As for when that rollout will happen, OpenAI is tight-lipped. According to TechCrunch, Voice Engine could cost $15 per million characters, which equates to about 162,500 words. That’s about the length of Stephen King’s The Shining. It certainly sounds like a budget-friendly way to make an audiobook. Marketing materials also refer to an “HD” version that costs twice as much, but the company hasn’t detailed how that will work.

OpenAI has been making big moves this week. It also just announced another partnership with its best friend Microsoft to build an AI-based supercomputer called Stargate. The project will reportedly cost $100 billion.

