Google Tacotron 2 Text-to-Speech AI Sounds Similar to a Human Voice

Google is among the leaders when it comes to experimenting in the field of artificial intelligence (AI). Today, the tech giant took another step forward in that field.

Tacotron 2, the latest version of Google's AI-powered speech synthesis system, sounds very close to a human voice. Google has also uploaded speech samples generated by Tacotron 2 so that listeners can judge the technology for themselves.

Tacotron 2 Uses Two Deep Neural Networks for Output

Tacotron 2 is Google's second generation of text-to-speech technology. It produces its output with two deep neural networks. The first network translates the text into a spectrogram, a visual representation of how audio frequencies change over time. The second network, WaveNet, reads the spectrogram and generates the corresponding audio.
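To make the idea of a spectrogram concrete, here is a minimal sketch in plain Python that turns a short test tone into one. It is an illustration only, not Google's code: the function name `spectrogram`, the frame size, the hop length, and the 1000 Hz test tone are all assumptions chosen for the example, and it computes a plain magnitude spectrogram rather than the mel-scale variant that Tacotron 2 predicts.

```python
import cmath
import math

def spectrogram(samples, frame_size=64, hop=32):
    """Hypothetical helper: split audio into overlapping frames and
    return, for each frame, the magnitude of every frequency bin."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Apply a Hann window to reduce leakage at the frame edges.
        windowed = [
            s * (0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]
        # Naive discrete Fourier transform, non-negative frequencies only.
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames

# Assumed test signal: a 1000 Hz sine wave sampled at 8000 Hz.
sample_rate = 8000
tone_hz = 1000
samples = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(256)]

spec = spectrogram(samples)

# The strongest bin in the first frame should sit at the tone's frequency.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
print(len(spec), "frames,", len(spec[0]), "bins per frame, peak at", peak_hz, "Hz")
```

Each row of `spec` is one slice of time and each column is one frequency band, which is the same grid a spectrogram image shows. In Tacotron 2 the first network predicts such a grid from text, and WaveNet performs the reverse step of turning it into a waveform.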


Google Tacotron 2 Responds to Punctuation Too

Tacotron 2 also responds to punctuation in the text and can learn to stress particular words when they are written in capital letters.


Links to AI Samples

Check out the audio samples of Tacotron 2 on GitHub. There are two audio samples for each piece of text, and Google has not made it clear which one was generated by Tacotron 2 and which is human speech.

After listening to the samples and identifying the Tacotron 2 clips by viewing the page's source code, we can say that Google has achieved impressive results here. The voice is very similar to a human voice and sounds better than other text-to-speech technologies, which tend to sound too mechanical.

