Text-to-Speech

TTS: Convert text to lifelike speech.

People like talking with people and listening to people. With Text-to-Speech you can synthesise human speech and make interaction with an automated system more natural. Natural interactions deliver positive customer experiences that drive adoption of services.

The Text-to-Speech service provides near real-time text-to-speech (TTS) conversion of long or short text. The resulting lifelike voice stream (in MP3 format) can then be delivered via a number of different channels:

Returned during RESTful Voice API call (synchronous)
HTTPS retrieval using RESTful Voice API call (asynchronous)
Download via melroselabs.com (subscription only)
Attachment in an email

Future delivery channels for the service include:

HTTPS callback (POST)
URL link in an email, SMS or HTTPS callback
Playback during a SIP call
Voice messaging - delivery via voice call

Don't see what you need - then ask for it!

If you don't see something you need, make a feature request.

Text-to-Speech is a cloud service that quickly converts text into high-quality lifelike speech. You can use Text-to-Speech to develop applications that increase engagement and accessibility, and for voice-overs on videos. Text-to-Speech supports multiple languages and includes a variety of lifelike voices, so you can build speech-enabled applications that work worldwide and use the ideal voice for your customers, whatever their language.

Standard and Neural TTS

The standard speech engine produces high-quality speech for all supported voices. For en-GB and en-US voices, and for some es-US and pt-BR voices, the neural speech engine can be used to produce more natural-sounding speech. Where a voice is supported by the neural speech engine, the service uses the neural speech engine by default.

See the available TTS languages and voices for samples, and to find out which voices use the neural speech engine. All voices support SSML.

TTS Conversion Speed

Text-to-speech conversion is fast, regardless of whether you are using the service via the TTS Console, email, REST API or Zapier. For example, conversion of 250 words takes less than 1.5 seconds.

Voices and Languages

Available languages and text-to-speech voices

The following languages and voices are available using the Text-to-Speech service:

Language	Voice	Email (example)
Arabic (arb)	Zeina (f)	zeina.arb.voice@api.melroselabs.com
Chinese, Mandarin (cmn-CN)	Zhiyu (f)	zhiyu.cmn-cn.voice@api.melroselabs.com
Danish (da-dk)	Mads (m) Naja (f)	mads.da-dk.voice@api.melroselabs.com
Dutch (nl-nl)	Lotte (f) Ruben (m)	lotte.nl-nl.voice@api.melroselabs.com
English, Australian (en-AU)	Russell (m) Nicole (f)	russell.en-au.voice@api.melroselabs.com
English (en-GB)²	Emma (f) Amy (f) Brian (m)	emma.en-gb.voice@api.melroselabs.com
English, Indian (en-IN)	Aditi (f) Raveena (f)	aditi.en-in.voice@api.melroselabs.com
English (en-US)²	Salli (f)¹ Ivy (f) Joanna (f) Kendra (f) Kimberly (f) Joey (m) Justin (m) Matthew (m)	salli.en-us.voice@api.melroselabs.com
English, Welsh (en-GB-WLS)	Geraint (m)	geraint.en-gb-wls.voice@api.melroselabs.com
French (fr-FR)	Céline/Celine (f) Léa/Lea (f) Mathieu (m)	celine.fr-fr.voice@api.melroselabs.com
French, Canadian (fr-CA)	Chantal (f)	chantal.fr-ca.voice@api.melroselabs.com
German (de-DE)	Vicki (f) Marlene (f) Hans (m)	vicki.de-de.voice@api.melroselabs.com
Hindi (hi-IN)	Aditi (f)	aditi.hi-in.voice@api.melroselabs.com
Icelandic (is-IS)	Karl (m) Dóra/Dora (f)	karl.is-is.voice@api.melroselabs.com
Italian (it-IT)	Bianca (f) Carla (f) Giorgio (m)	bianca.it-it.voice@api.melroselabs.com
Japanese (ja-JP)	Mizuki (f) Takumi (m)	mizuki.ja-jp.voice@api.melroselabs.com
Korean (ko-KR)	Seoyeon (f)	seoyeon.ko-kr.voice@api.melroselabs.com
Norwegian (nb-NO)	Liv (f)	liv.nb-no.voice@api.melroselabs.com
Polish (pl-PL)	Jan (m) Ewa (f) Maja (f) Jacek (m)	jan.pl-pl.voice@api.melroselabs.com
Portuguese (pt-BR)	Camila² (f) Vitória/Vitoria (f) Ricardo (m)	vitoria.pt-br.voice@api.melroselabs.com
Portuguese (pt-PT)	Cristiano (m) Inês/Ines (f)	cristiano.pt-pt.voice@api.melroselabs.com
Romanian (ro-RO)	Carmen (f)	carmen.ro-ro.voice@api.melroselabs.com
Russian (ru-RU)	Tatyana (f) Maxim (m)	tatyana.ru-ru.voice@api.melroselabs.com
Spanish (es-ES)	Enrique (m) Lucia (f) Conchita (f)	enrique.es-es.voice@api.melroselabs.com
Spanish (es-MX)	Mia (f)	mia.es-mx.voice@api.melroselabs.com
Spanish (es-US)	Lupe² (f) Penélope/Penelope (f) Miguel (m)	penelope.es-us.voice@api.melroselabs.com
Swedish (sv-SE)	Astrid (f)	astrid.sv-se.voice@api.melroselabs.com
Turkish (tr-TR)	Filiz (f)	filiz.tr-tr.voice@api.melroselabs.com
Welsh (cy-GB)	Gwyneth (f)	gwyneth.cy-gb.voice@api.melroselabs.com

¹ Default voice. Used for voice@api.melroselabs.com or when no voice is specified during a Voice API REST call.
² Will use neural engine unless standard engine is requested.

Using the Service

The Melrose Labs Text-to-Speech service is available using the TTS Console, SMTP API (email), Text-to-Speech REST API and Zapier (Melrose Labs Speech).

TTS Console

The TTS Console enables you to select the language and voice, enter up to 2000 characters of text and perform a text-to-speech conversion. Play/pause controls are available and audio can be downloaded as an MP3 file. The TTS Console is only available when you are signed in; otherwise, the limited TTS demo is available. If you do not have a subscription to the service, you can perform up to 5 conversion requests.

Email - SMTP API

The Melrose Labs Text-to-Speech service is available using our SMTP API (email).

Using Email to Access the Text-to-Speech Service

Email can be used to perform a text-to-speech (TTS) conversion quickly and easily, as an alternative to using REST HTTPS calls to the Voice API. Send an email to the Text-to-Speech service at voice@api.melroselabs.com and put the text you want converted in the subject field. After a few seconds, you will receive an email back from the service with an MP3 file containing the converted speech.

Various voices and corresponding languages are available using email addresses specific to each voice. The format of each email address is [voice].[language].voice@api.melroselabs.com (for example, emma.en-gb.voice@api.melroselabs.com), and the options for voice and language are shown in the list of available languages and voices.
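The address format described above can be sketched in Python. This is a minimal illustration, not part of the service: the helper names `voice_address` and `build_tts_email` and the sender address are hypothetical, and it assumes that the voice and language parts are written in lowercase ASCII, as in the example addresses in the table. The message is only constructed here; sending it through your own mail server is left out.

```python
from email.message import EmailMessage

SERVICE_DOMAIN = "api.melroselabs.com"


def voice_address(voice=None, language=None):
    """Return the service address for a voice, or the default address.

    Assumes lowercase ASCII voice and language codes, as in the table.
    """
    if voice is None or language is None:
        return f"voice@{SERVICE_DOMAIN}"
    return f"{voice.lower()}.{language.lower()}.voice@{SERVICE_DOMAIN}"


def build_tts_email(sender, text, voice=None, language=None):
    """Build (but do not send) an email whose subject carries the text."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = voice_address(voice, language)
    msg["Subject"] = text  # the service converts the subject field
    return msg


msg = build_tts_email("me@example.com", "Welcome Allan.", "Emma", "en-GB")
print(msg["To"])
```

The resulting message could then be handed to `smtplib.SMTP.send_message` using whatever outgoing mail server you normally use.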

An API key is currently not required when using email.

REST API

The Melrose Labs Text-to-Speech service is available using our REST Voice API.

See the REST Voice API documentation. Sign up or log in to get an API key.

Convert text to speech using the Voice Gateway Text-to-Speech service with RESTful Voice API

Example using cURL, Node.js, Python and PHP

Synchronous
Asynchronous

Submit conversion request and retrieve resulting speech.

Request (cURL):

curl https://api.melroselabs.com/voice/tts/ \
	--header 'x-api-key: [API_KEY]' --header 'Content-Type: application/json' \
	--data-raw '{"voiceText": "Welcome Allan. The event for today will begin at 9.30am in room H32.", "voice": "Emma"}'

Response:

MP3 file

Request (Node.js):

var fs = require('fs');
var request = require('request');
var options = {
  'method': 'POST',
  'url': 'https://api.melroselabs.com/voice/tts/',
  'encoding': null, // return the body as a Buffer so the MP3 data is not corrupted
  'headers': {
    'x-api-key': '[API_KEY]',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({"voiceText": "Welcome Allan. The event for today will begin at 9.30am in room H32.", "voice": "Emma"})
};
request(options, function (error, response) {
  if (error) throw new Error(error);
  // response is of type audio/mp3; save it to a file (file name is an example)
  fs.writeFileSync('speech.mp3', response.body);
});

Response:

MP3 file

Request (Python):

import requests
import json

url = "https://api.melroselabs.com/voice/tts/"
payload = {"voiceText": "Welcome Allan. The event for today will begin at 9.30am in room H32.", "voice": "Emma"}
headers = {
  'x-api-key': '[API_KEY]',
  'Content-Type': 'application/json'
}

response = requests.request("POST", url, headers=headers, data=json.dumps(payload))

# response is of type audio/mp3; save it to a file (file name is an example)
with open("speech.mp3", "wb") as f:
    f.write(response.content)

Response:

MP3 file

Request (PHP):

<?php
// JSON-encode the request body
$data = json_encode(array(
  "voiceText" => "Welcome Allan. The event for today will begin at 9.30am in room H32.",
  "voice" => "Emma"
));

$curl = curl_init();

curl_setopt_array($curl, array(
  CURLOPT_URL => "https://api.melroselabs.com/voice/tts/",
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_CUSTOMREQUEST => "POST",
  CURLOPT_POSTFIELDS => $data,
  CURLOPT_HTTPHEADER => array(
    "x-api-key: [API_KEY]",
    "Content-Type: application/json"
  )
));

$response = curl_exec($curl);

curl_close($curl);

echo $response; // response is of type audio/mp3
?>

Response:

MP3 file

The asynchronous method is a two-step process: submit the conversion request, then retrieve the result.

1. Submit conversion request.

Request (cURL):

curl https://api.melroselabs.com/voice/texttospeech/ \
	--header 'x-api-key: [API_KEY]' --header 'Content-Type: application/json' \
	--data-raw '{"voiceText": "Welcome Allan. The event for today will begin at 9.30am in room H32.", "voice": "Emma"}'

Response:

{"transactionID": "1ccead78-6550-4aac-a6b4-a4942b908659"}

Request (Node.js):

var request = require('request');
var options = {
  'method': 'POST',
  'url': 'https://api.melroselabs.com/voice/texttospeech/',
  'headers': {
    'x-api-key': '[API_KEY]',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({"voiceText": "Welcome Allan. The event for today will begin at 9.30am in room H32.", "voice": "Emma"})
};
request(options, function (error, response) { 
  if (error) throw new Error(error);
  console.log(response.body); // response is of type application/json
});

Response:

{"transactionID": "1ccead78-6550-4aac-a6b4-a4942b908659"}

Request (Python):

import requests
import json

url = "https://api.melroselabs.com/voice/texttospeech/"
payload = {"voiceText": "Welcome Allan. The event for today will begin at 9.30am in room H32.", "voice": "Emma"}
headers = {
  'x-api-key': '[API_KEY]',
  'Content-Type': 'application/json'
}

response = requests.request("POST", url, headers=headers, data=json.dumps(payload))

# response is of type application/json
print(response.text)

Response:

{"transactionID": "1ccead78-6550-4aac-a6b4-a4942b908659"}

Request (PHP):

<?php
// JSON-encode the request body
$data = json_encode(array(
  "voiceText" => "Welcome Allan. The event for today will begin at 9.30am in room H32.",
  "voice" => "Emma"
));

$curl = curl_init();

curl_setopt_array($curl, array(
  CURLOPT_URL => "https://api.melroselabs.com/voice/texttospeech/",
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_CUSTOMREQUEST => "POST",
  CURLOPT_POSTFIELDS => $data,
  CURLOPT_HTTPHEADER => array(
    "x-api-key: [API_KEY]",
    "Content-Type: application/json"
  )
));

$response = curl_exec($curl);

curl_close($curl);

echo $response; // response is of type application/json
?>

Response:

{"transactionID": "1ccead78-6550-4aac-a6b4-a4942b908659"}

2. Retrieve resulting speech.

Request (cURL):

curl --location --request GET https://api.melroselabs.com/voice/texttospeech/1ccead78-6550-4aac-a6b4-a4942b908659 \
	--header 'x-api-key: [API_KEY]'

Response:

MP3 file

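Request (Node.js):

var fs = require('fs');
var request = require('request');
var options = {
  'method': 'GET',
  'url': 'https://api.melroselabs.com/voice/texttospeech/1ccead78-6550-4aac-a6b4-a4942b908659',
  'encoding': null, // return the body as a Buffer so the MP3 data is not corrupted
  'headers': {
    'x-api-key': '[API_KEY]'
  }
};
request(options, function (error, response) {
  if (error) throw new Error(error);
  // response is of type audio/mp3; save it to a file (file name is an example)
  fs.writeFileSync('speech.mp3', response.body);
});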

Response:

MP3 file

Request (Python):

import requests

url = "https://api.melroselabs.com/voice/texttospeech/1ccead78-6550-4aac-a6b4-a4942b908659"

headers = {
  'x-api-key': '[API_KEY]'
}

response = requests.request("GET", url, headers=headers)

# response is of type audio/mp3; save it to a file (file name is an example)
with open("speech.mp3", "wb") as f:
    f.write(response.content)

Response:

MP3 file

Request (PHP):

<?php
$curl = curl_init();

curl_setopt_array($curl, array(
  CURLOPT_URL => "https://api.melroselabs.com/voice/texttospeech/1ccead78-6550-4aac-a6b4-a4942b908659",
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_CUSTOMREQUEST => "GET",
  CURLOPT_HTTPHEADER => array(
    "x-api-key: [API_KEY]"
  )
));

$response = curl_exec($curl);

curl_close($curl);

echo $response; // response is of type audio/mp3
?>

Response:

MP3 file
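The two asynchronous steps can be chained in a single helper. The sketch below uses only the Python standard library and the endpoint, header and field names shown in the examples above; the function names and the `opener` parameter are hypothetical and exist only so the flow can be exercised without a network connection. It assumes the speech is ready when the second request is made; if that is not the case in practice, a retry with a short delay would be needed.

```python
import json
import urllib.request

BASE_URL = "https://api.melroselabs.com/voice/texttospeech/"


def build_submit_request(api_key, text, voice):
    """Step 1: POST the text and voice as a JSON body."""
    body = json.dumps({"voiceText": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL,
        data=body,
        method="POST",
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
    )


def build_retrieve_request(api_key, transaction_id):
    """Step 2: GET the MP3 for a previously returned transaction ID."""
    return urllib.request.Request(
        BASE_URL + transaction_id,
        method="GET",
        headers={"x-api-key": api_key},
    )


def convert_async(api_key, text, voice, opener=urllib.request.urlopen):
    """Submit a conversion, then fetch the result.

    Returns (transaction_id, mp3_bytes). Assumes the result is available
    on the first retrieval attempt.
    """
    with opener(build_submit_request(api_key, text, voice)) as resp:
        transaction_id = json.loads(resp.read())["transactionID"]
    with opener(build_retrieve_request(api_key, transaction_id)) as resp:
        return transaction_id, resp.read()
```

A call such as `convert_async("[API_KEY]", "Welcome Allan.", "Emma")` would return the transaction ID and the MP3 bytes, which can then be written to a file opened in binary mode.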

Get your API Key now and start using the Text-to-Speech service REST API

Sign up or log in to get an API key.

Zapier

Find us on Zapier. Use our invite link to access Melrose Labs Speech.

Zapier: Melrose Labs Speech

SSML

Speech Synthesis Markup Language (SSML) is supported by the Text-to-Speech service. When using SSML, the text you wish to be converted should start with ssml: and be contained in <speak></speak> tags.

For example: ssml:<speak>Mary had a little lamb <break time="3s"/>Whose fleece was white as snow.</speak>
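Building the request text programmatically might look like the following minimal sketch. The helper name `ssml_text` is hypothetical; it joins plain-text fragments with a `break` tag, escapes XML special characters in the fragments, and adds the `ssml:` prefix and `<speak>` wrapper described above.

```python
from xml.sax.saxutils import escape


def ssml_text(fragments, pause="3s"):
    """Join text fragments with a pause and wrap them for the service.

    Characters such as & and < in the fragments are escaped so that the
    result remains well-formed markup.
    """
    break_tag = f'<break time="{pause}"/>'
    body = break_tag.join(escape(fragment) for fragment in fragments)
    return f"ssml:<speak>{body}</speak>"


print(ssml_text(["Mary had a little lamb ", "Whose fleece was white as snow."]))
```

The returned string can be used as the `voiceText` value in a REST request or as the subject of an email to the service.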

Supported SSML tags

Tag	Description	Example
`audio`	The audio tag lets you provide the URL for an MP3 file that can be played while rendering a response.
`break`	Represents a pause in the speech.	`<break time="3s"/>`
`emphasis`	Emphasize the tagged words or phrases. Emphasis changes rate and volume of the speech. More emphasis is spoken louder and slower. Less emphasis is quieter and faster.
`lang`	Use lang to specify the language model and rules to speak the tagged content as if it were written in the language specified by the `xml:lang` attribute.
`p`	Represents a paragraph. This tag provides extra-strong breaks before and after the tag.
`phoneme`	Provides a phonemic/phonetic pronunciation for the contained text.
`prosody`	Modifies the volume, pitch, and rate of the tagged speech.
`s`	Represents a sentence. This tag provides strong breaks before and after the tag.
`say-as`	Describes how the text should be interpreted. `interpret-as` attribute values: `characters` or `spell-out`: Spell out each letter. `cardinal` or `number`: Interpret the value as a cardinal number. `ordinal`: Interpret the value as an ordinal number. `digits`: Spell out digits separately. `fraction`: Interpret a value as a fraction. `unit`: Interpret a value as a measurement. `date`: Interpret a value as a date. `time`: Interpret a value as a time. `telephone`: Interpret a value as a 7-digit or 10-digit telephone number. `address`: Interpret a value as part of a street address. `interjection`: Interpret the value as an interjection. `expletive`: "Bleep" out the content inside the tag. `format` attribute values (only for use when `interpret-as` is set to `date`) `mdy` `dmy` `ymd` `md` `dm` `ym` `my` `d` `m` `y`	`<say-as interpret-as="digits">54321</say-as>`
`sub`	Pronounce the specified word or phrase as a different word or phrase. Specify the pronunciation to substitute with the `alias` attribute.
`voice`	Speak the text with the specified voice.

Pricing

The Text-to-Speech service is available for limited free use (5 requests/day) and for subscription use (up to 5,000 requests/day).

Subscription - Basic usage plan	GBP 10 per month for 500,000 characters per month.
Additional characters	GBP 1.50 per 100,000 characters
Subscription accounts are limited to 5000 requests per day. Text for conversion to speech can be up to 3000 billed characters. SSML tags are not billed. Total input text including SSML can be up to 6000 characters.
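As a rough illustration of the pricing above, the sketch below estimates a monthly charge from a character count. The function name is hypothetical, and it assumes that additional characters are billed in whole blocks of 100,000, rounded up; the pricing text does not say whether additional usage is rounded or charged pro rata, so treat the result as an estimate only.

```python
from decimal import Decimal

BASE_FEE_GBP = Decimal("10.00")      # Basic usage plan, per month
INCLUDED_CHARS = 500_000             # characters included in the plan
EXTRA_BLOCK_CHARS = 100_000          # size of an additional block
EXTRA_BLOCK_FEE_GBP = Decimal("1.50")


def estimate_monthly_cost(billed_chars):
    """Estimate the monthly cost in GBP for a number of billed characters.

    Assumption: extra characters are charged per started block of 100,000.
    """
    extra = max(0, billed_chars - INCLUDED_CHARS)
    blocks = -(-extra // EXTRA_BLOCK_CHARS)  # integer ceiling division
    return BASE_FEE_GBP + blocks * EXTRA_BLOCK_FEE_GBP


print(estimate_monthly_cost(750_000))
```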

Subscribe to the service:

1. Sign up or log in.
2. Click on the "Subscribe to service" button at the top of this page.

Data Retention and Data Privacy

We operate on the following principles:

Customer data is encrypted when at rest and when in motion.
Data is only kept for as long as necessary to provide service to the customer, ensure that we can support the customer, and be able to fulfil our legal and regulatory obligations.
MP3 files resulting from a text-to-speech asynchronous API conversion (email, asynchronous REST API or Zapier) are stored for 90 days and then automatically deleted. Files are stored encrypted using AES-256. When using synchronous API calls, no MP3 files are stored.

Limits

The Text-to-Speech service has the following limits:

Limit	Value	Description
Daily Requests (Basic Subscription)	5,000 requests	Maximum number of requests that can be made using an API key on a subscription account.
Daily Requests (Free)	5 requests	Maximum number of requests that can be made using an API key on a free account.
Text Length (billable)	3,000 characters	Maximum length of billable text to be converted to speech. SSML tags are not billable.
Text Length (total)	6,000 characters	Maximum overall length of text to be converted to speech, including SSML tags.
MP3 File Lifetime	90 days	Duration after which file resulting from asynchronous call is automatically deleted.
API Key Lifetime	Subscription: indefinite Free: 45 days	Duration of validity of API Key, after which key expires.

Duration limits are from the moment the conversion request is submitted to the service.
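The two text-length limits can be checked on the client before a request is submitted. The sketch below is an approximation under stated assumptions: the function name is hypothetical, SSML tags are detected with a simple regular expression, and the `ssml:` prefix is assumed not to count towards either limit. The service's own counting rules may differ.

```python
import re

MAX_BILLED_CHARS = 3000   # billable text, SSML tags excluded
MAX_TOTAL_CHARS = 6000    # overall text, SSML tags included
SSML_PREFIX = "ssml:"
TAG_PATTERN = re.compile(r"<[^>]+>")


def check_text_limits(text):
    """Return approximate billed and total lengths and whether both fit."""
    if text.startswith(SSML_PREFIX):
        body = text[len(SSML_PREFIX):]
        billed = len(TAG_PATTERN.sub("", body))  # drop tags before counting
    else:
        body = text
        billed = len(body)
    total = len(body)
    return {
        "billed": billed,
        "total": total,
        "ok": billed <= MAX_BILLED_CHARS and total <= MAX_TOTAL_CHARS,
    }


print(check_text_limits('ssml:<speak>Hi<break time="3s"/></speak>'))
```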

Need to convert speech-to-text? See our Speech-to-Text service.

Service snapshot

Lifelike voices
Asynchronous and synchronous RESTful API
Large selection of voices
Multi-language support
Fast TTS conversion
