Microsoft’s AI can clone your voice after analyzing a 3-second audio clip
Microsoft has developed an artificial intelligence system that can clone a person’s voice after analyzing just three seconds of audio of them speaking, but some fear it will give fraudsters a tool to steal your voice.
Called VALL-E, the system could let a phone scammer capture just three seconds of your voice and replicate it, including your emotional range and acoustic environment.
This would allow bad actors to bypass systems that use your voice as a password.
VALL-E is not available to the public, and Microsoft has not said when, or if, it will be.
While AI is scary for some users, others see the technology as a way for people who have lost their voices due to throat disease, ALS or injury to regain their speech.
However, some Twitter users have raised an important question: do you own the sound of your voice?
The Microsoft VALL-E team has addressed the ethical issue with a statement: “The experiments in this paper were conducted under the assumption that the user of the model is the target speaker and has been approved by the speaker.
“However, when the model is generalized to unseen speakers, the relevant components must be accompanied by speech editing models, including the protocol to ensure that the speaker agrees to execute the modification and the system to detect the edited speech.”
VALL-E was trained on 60,000 hours of English speech, and Microsoft claims it can replicate American, British and various European accents.
VALL-E can only convert written text to speech, but this is enough for someone to use the technology to steal your voice and ‘put words in your mouth’.
Microsoft hasn’t released it to the public yet, but the company has high hopes for its AI, which could change the way we listen to audiobooks and talk to smart assistants.
The creators of VALL-E said that the artificial intelligence tool is designed for high-quality text-to-speech applications.
This includes editing speech in an existing recording of a person, such as an audiobook.
VALL-E analyzes what the person in the audio clip sounds like, breaks that information down into different components, then uses its training data to find something similar and combines the two.
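As a rough illustration of those steps, and not Microsoft’s actual implementation, the toy Python sketch below reduces a short clip to three numbers, finds the closest match in a small made-up ‘training set’ and blends the two. Every name (`TRAINING_VOICES`, `analyze_clip`, `nearest_voice`, `blend`) and every value here is hypothetical, and a real system would work with far richer features than this.

```python
import math

# Hypothetical "training data": each known voice is reduced to three
# made-up components. These values are invented for illustration only.
TRAINING_VOICES = {
    "speaker_a": [0.9, 0.2, 0.4],
    "speaker_b": [0.3, 0.8, 0.5],
    "speaker_c": [0.5, 0.5, 0.9],
}


def analyze_clip(samples):
    """Toy analysis step: summarize a clip as [mean, RMS energy, peak]."""
    n = len(samples)
    mean = sum(samples) / n
    energy = math.sqrt(sum(s * s for s in samples) / n)
    peak = max(abs(s) for s in samples)
    return [mean, energy, peak]


def distance(a, b):
    """Euclidean distance between two component lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_voice(components):
    """Find the training voice whose components are closest to the clip's."""
    return min(
        TRAINING_VOICES,
        key=lambda name: distance(components, TRAINING_VOICES[name]),
    )


def blend(components, reference, weight=0.5):
    """Combine the clip's components with the matched training voice."""
    return [weight * c + (1 - weight) * r for c, r in zip(components, reference)]


# A stand-in for the three-second clip: just a handful of sample values.
clip = [0.1, 0.4, -0.2, 0.6, 0.3]
components = analyze_clip(clip)
match = nearest_voice(components)
profile = blend(components, TRAINING_VOICES[match])
print(match, profile)
```

The sketch only mirrors the analyze, match and combine steps described above. It does not generate any audio.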
The AI is making waves on Twitter, where it has received mixed opinions.
One user said that VALL-E has no use except for scam and phishing purposes, while another is hoping it will be a game changer for people who have lost their speech.
Another Twitter user said this would have been great for the late Stephen Hawking, who lost his voice and used a computer-generated one.
Several people pointed out that VALL-E is bad news for voice actors.
“Now they’re going after the voice actors, who’s next,” a user named “Gabriel” tweeted.