Text-to-Speech

Voice is how people communicate naturally, and it is often a better fit than SMS, email, chat, or other written or visual channels.

People talk to, and listen to, other people. With text-to-speech you can synthesize human speech and make interaction with an automated system feel more natural. More natural interaction in scalable, cost-effective automated systems improves the customer experience and encourages adoption.

Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech computer or speech synthesizer, and can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal language text into speech; other systems render symbolic linguistic representations like phonetic transcriptions into speech.

Synthesized speech can be created by concatenating pieces of recorded speech that are stored in a database. Systems differ in the size of the stored speech units; a system that stores phones or diphones provides the largest output range, but may lack clarity. For specific usage domains, the storage of entire words or sentences allows for high-quality output. Alternatively, a synthesizer can incorporate a model of the vocal tract and other human voice characteristics to create a completely "synthetic" voice output.
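The concatenative approach described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the unit database, the diphone labels, and the helper names (tone, to_diphones, concatenate) are hypothetical, and short sine tones stand in for recorded speech units.

```python
import math

SAMPLE_RATE = 16_000  # samples per second (assumed for this sketch)


def tone(freq_hz, duration_ms=50):
    """Stand-in for a recorded speech unit: a short sine tone."""
    n = SAMPLE_RATE * duration_ms // 1000
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


# Hypothetical unit database: diphone label -> stored samples.
UNIT_DB = {
    "h-e": tone(220),
    "e-l": tone(330),
    "l-o": tone(440),
}


def to_diphones(phones):
    """Pair each phone with its successor: ['h', 'e', 'l'] -> ['h-e', 'e-l']."""
    return [f"{a}-{b}" for a, b in zip(phones, phones[1:])]


def concatenate(phones, fade=40):
    """Join stored units, cross-fading `fade` samples at each boundary."""
    out = []
    for label in to_diphones(phones):
        unit = UNIT_DB[label]  # a KeyError means the unit was never recorded
        if out:
            # Blend the tail of the output with the head of the next unit.
            for i in range(fade):
                w = i / fade
                out[-fade + i] = (1 - w) * out[-fade + i] + w * unit[i]
            out.extend(unit[fade:])
        else:
            out.extend(unit)
    return out


samples = concatenate(["h", "e", "l", "o"])
```

Small units such as diphones can be recombined into any utterance the database covers, which is why they give the widest output range; the joins between units are where clarity tends to suffer, and the cross-fade here is one simple way to soften them.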

Text-to-speech (TTS) refers to the ability of computers to read text aloud. A TTS engine converts written text to a phonemic representation, then converts that representation to waveforms that can be output as sound. TTS engines for different languages, dialects, and specialized vocabularies are available from third-party publishers.
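The two stages of a TTS engine (text to phonemes, then phonemes to waveform) might be sketched as follows. This is a minimal illustration, not a real engine: the two-word lexicon, the pitch table, and the function names are assumptions made up for the example, and each phoneme is rendered as a plain tone rather than actual speech.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 16_000  # samples per second (assumed for this sketch)

# Hypothetical pronunciation lexicon (word -> phonemes). Real engines use
# large dictionaries plus letter-to-sound rules for unknown words.
LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}

# Placeholder "voice": one pitch per phoneme instead of real acoustics.
PHONEME_PITCH_HZ = {
    "HH": 200, "AH": 240, "L": 280, "OW": 320,
    "W": 220, "ER": 260, "D": 300,
}


def text_to_phonemes(text):
    """Stage 1: written text -> phonemic representation."""
    phonemes = []
    for word in text.lower().split():
        phonemes.extend(LEXICON[word.strip(".,!?")])
    return phonemes


def phonemes_to_pcm(phonemes, duration_ms=80):
    """Stage 2: phonemes -> 16-bit mono PCM samples."""
    n = SAMPLE_RATE * duration_ms // 1000
    frames = bytearray()
    for p in phonemes:
        freq = PHONEME_PITCH_HZ[p]
        for i in range(n):
            value = int(32767 * 0.5 * math.sin(2 * math.pi * freq * i / SAMPLE_RATE))
            frames += struct.pack("<h", value)
    return bytes(frames)


def synthesize(text):
    """Run both stages and wrap the result in a WAV container."""
    pcm = phonemes_to_pcm(text_to_phonemes(text))
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16-bit samples
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm)
    return buf.getvalue()


wav_bytes = synthesize("Hello, world!")
```

Writing wav_bytes to a file with a .wav extension produces a playable sequence of tones, one per phoneme. Supporting another language, dialect, or specialized vocabulary in this sketch would mean swapping in a different lexicon and voice table, which mirrors how engines are packaged per language.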
