OpenAI’s Voice Engine AI model can clone your voice: Here’s how

By Ayushi Jain | Updated on 01-Apr-2024

HIGHLIGHTS

OpenAI has unveiled the Voice Engine AI model.

The Voice Engine model has the ability to clone voices with accuracy.

It uses text input and just a 15-second audio sample to generate natural-sounding speech.

Imagine a world where your voice isn’t just yours anymore, where a simple text input and a short audio sample are all it takes for AI to mimic your voice and craft speech that sounds remarkably like you. Well, this could be the future, as OpenAI has unveiled the Voice Engine model.

The Voice Engine model can clone voices with accuracy, which could change the way we interact with technology.

In this article, we delve into the details of OpenAI’s Voice Engine, exploring how it works.

OpenAI’s Voice Engine model uses text input and just a 15-second audio sample to generate natural-sounding speech that closely resembles the original speaker.

According to OpenAI, this model can generate “emotive and realistic voices.”

Voice Engine was initially developed by OpenAI in late 2022, and has since been used to power the preset voices featured in the text-to-speech API, as well as ChatGPT Voice and Read Aloud.

In an interview with TechCrunch, Jeff Harris, a member of the product staff at OpenAI, revealed that the Voice Engine model was trained on a mix of licensed and publicly available data.

It’s important to note that OpenAI is not releasing the Voice Engine model widely right now.

OpenAI stated that its partners have agreed to follow its usage policies. These policies prohibit impersonating others without consent or legal rights, require explicit and informed consent from the original speaker, bar partners from building ways for individual users to create their own voices, and mandate disclosing to listeners that the voices are generated by AI.

OpenAI has also implemented safety measures for Voice Engine, such as watermarking to track the origin of any generated audio and actively monitoring its usage.

Ayushi Jain

Tech news writer by day, BGMI player by night. Combining my passion for tech and gaming to bring you the latest in both worlds.