Voice Gateway

Speech-to-Text

Transcribe speech into text.


People like talking. Let them do the talking and your application do the listening. With Speech-to-Text you can take human speech and convert it into text that your application can act upon.

The Speech-to-Text service performs speech-to-text (STT) conversion on a long or short audio stream and returns the speech in that stream as text. It can be used in interactive systems (e.g. voice-controlled systems) or for offline transcription of speech.

  • Upload an audio file (MP3)
  • Storage of text from STT converted speech
  • Retrieve text of speech once STT conversion is complete
  • Realtime transcription of speech (coming soon)

Using the Service

The Melrose Labs Speech-to-Text service is available using the Speech-to-Text REST API and Zapier (Melrose Labs Speech).

Note that the Speech-to-Text service requires MP3 files to be sampled at 22050 Hz. Speech recognition is not always accurate, so the returned text may contain errors.
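The sample-rate requirement can be checked locally before uploading. The following is a minimal sketch, not part of the Melrose Labs API: it reads the first MPEG audio frame header of a file and reports the sample rate. The function name `mp3_sample_rate` and the constant `REQUIRED_RATE_HZ` are hypothetical, and the parser only skips a leading ID3v2 tag and applies basic header checks, so unusual files may need a full audio library.

```python
from typing import Optional

# Sample rates in Hz, keyed by the 2-bit MPEG version field of the frame
# header and indexed by the 2-bit sample-rate field.
_SAMPLE_RATES = {
    0b11: (44100, 48000, 32000),  # MPEG-1
    0b10: (22050, 24000, 16000),  # MPEG-2
    0b00: (11025, 12000, 8000),   # MPEG-2.5
}

REQUIRED_RATE_HZ = 22050  # hypothetical constant; value taken from the note above


def mp3_sample_rate(data: bytes) -> Optional[int]:
    """Return the sample rate of the first plausible MPEG audio frame, or None."""
    pos = 0
    # Skip a leading ID3v2 tag: 10-byte header whose size is a 28-bit
    # "synchsafe" integer (7 useful bits per byte), plus an optional footer.
    if len(data) >= 10 and data[:3] == b"ID3":
        size = (
            (data[6] & 0x7F) << 21
            | (data[7] & 0x7F) << 14
            | (data[8] & 0x7F) << 7
            | (data[9] & 0x7F)
        )
        pos = 10 + size + (10 if data[5] & 0x10 else 0)
    # Scan for the 11-bit frame sync (0xFF followed by three set bits).
    while pos + 3 < len(data):
        b1, b2, b3 = data[pos], data[pos + 1], data[pos + 2]
        if b1 == 0xFF and (b2 & 0xE0) == 0xE0:
            version = (b2 >> 3) & 0b11
            layer = (b2 >> 1) & 0b11
            bitrate_index = (b3 >> 4) & 0b1111
            rate_index = (b3 >> 2) & 0b11
            # Reject reserved values so that stray 0xFF bytes are not
            # mistaken for a frame header.
            if (
                version in _SAMPLE_RATES
                and layer != 0
                and bitrate_index != 0b1111
                and rate_index != 0b11
            ):
                return _SAMPLE_RATES[version][rate_index]
        pos += 1
    return None
```

A caller might read the file with `open(path, "rb").read()` and compare the result with `REQUIRED_RATE_HZ` before submitting; a file at another rate would need to be resampled with an audio tool first.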

REST API

The Melrose Labs Speech-to-Text service is available using our REST Voice API.

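Convert speech to text using the Voice Gateway Speech-to-Text service with the RESTful Voice API. Conversion takes two calls: submit the MP3 file to receive a transaction ID, then use that ID to retrieve the text.
Examples are given for cURL, Node.js, Python and PHP.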

Submit conversion request.

Request (cURL), where MP3-FILE is the path to the audio file:

curl https://api.melroselabs.com/voice/speechtotext/ \
	--header 'x-api-key: [API_KEY]' --header 'Content-Type: audio/mp3' \
	--data-binary '@MP3-FILE'

Response:

{"transactionID": "1ccead78-6550-4aac-a6b4-a4942b908659"}

Request (Node.js):

var fs = require('fs');
var request = require('request');
var options = {
  'method': 'POST',
  'url': 'https://api.melroselabs.com/voice/speechtotext/',
  'headers': {
    'x-api-key': '[API_KEY]',
    'Content-Type': 'audio/mp3'
  },
  // Send the raw bytes of the MP3 file as the request body
  body: fs.readFileSync('MP3-FILE')
};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body); // response is of type application/json
});

Response:

{"transactionID": "1ccead78-6550-4aac-a6b4-a4942b908659"}

Request (Python):

import requests

url = "https://api.melroselabs.com/voice/speechtotext/"
headers = {
  'x-api-key': '[API_KEY]',
  'Content-Type': 'audio/mp3'
}

# Read the MP3 file as raw bytes; the request body is binary audio, not JSON
with open('MP3-FILE', 'rb') as mp3_file:
    payload = mp3_file.read()

response = requests.request("POST", url, headers=headers, data=payload)

# response is of type application/json
print(response.text)

Response:

{"transactionID": "1ccead78-6550-4aac-a6b4-a4942b908659"}

Request (PHP):

<?php
// Read the MP3 file as raw bytes to send as the request body
$data = file_get_contents("MP3-FILE");

$curl = curl_init();

curl_setopt_array($curl, array(
  CURLOPT_URL => "https://api.melroselabs.com/voice/speechtotext/",
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_CUSTOMREQUEST => "POST",
  CURLOPT_POSTFIELDS => $data,
  CURLOPT_HTTPHEADER => array(
    "x-api-key: [API_KEY]",
    "Content-Type: audio/mp3"
  )
));

$response = curl_exec($curl);

curl_close($curl);

echo $response; // response is of type application/json
?>

Response:

{"transactionID": "1ccead78-6550-4aac-a6b4-a4942b908659"}

Retrieve text of speech.

Request (cURL):

curl --location --request GET https://api.melroselabs.com/voice/speechtotext/1ccead78-6550-4aac-a6b4-a4942b908659 \
	--header 'x-api-key: [API_KEY]' 

Response:

{"text": "Alice was beginning to get very tired of sitting by her sister on the bank and of having nothing to do."}

Request (Node.js):

var request = require('request');
var options = {
  'method': 'GET',
  'url': 'https://api.melroselabs.com/voice/speechtotext/1ccead78-6550-4aac-a6b4-a4942b908659',
  'headers': {
    'x-api-key': '[API_KEY]'
  }
};
request(options, function (error, response) { 
  if (error) throw new Error(error);
  console.log(response.body); // response is of type application/json
});

Response:

{"text": "Alice was beginning to get very tired of sitting by her sister on the bank and of having nothing to do."}

Request (Python):

import requests

url = "https://api.melroselabs.com/voice/speechtotext/1ccead78-6550-4aac-a6b4-a4942b908659"

headers = {
  'x-api-key': '[API_KEY]'
}

response = requests.request("GET", url, headers=headers)

# response is of type application/json
print(response.text)

Response:

{"text": "Alice was beginning to get very tired of sitting by her sister on the bank and of having nothing to do."}

Request (PHP):

<?php
$curl = curl_init();

curl_setopt_array($curl, array(
  CURLOPT_URL => "https://api.melroselabs.com/voice/speechtotext/1ccead78-6550-4aac-a6b4-a4942b908659",
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_CUSTOMREQUEST => "GET",
  CURLOPT_HTTPHEADER => array(
    "x-api-key: [API_KEY]"
  )
));

$response = curl_exec($curl);

curl_close($curl);

echo $response; // response is of type application/json
?>

Response:

{"text": "Alice was beginning to get very tired of sitting by her sister on the bank and of having nothing to do."}
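The two calls above can be combined into a submit-then-poll flow. The following is a rough sketch using only the Python standard library. The endpoint, headers and JSON fields are those shown in the examples above; everything else is an assumption. In particular, the helper names (`submit_mp3`, `fetch_text`, `wait_for_text`), the retry count and the delay are hypothetical, and the way the service signals "conversion not finished" is not documented here, so the sketch simply treats an HTTP error status or a response without a "text" field as "not ready yet".

```python
import json
import time
import urllib.error
import urllib.request

API_URL = "https://api.melroselabs.com/voice/speechtotext/"


def submit_mp3(mp3_bytes, api_key, opener=urllib.request.urlopen):
    """POST raw MP3 bytes and return the transactionID from the JSON reply."""
    request = urllib.request.Request(
        API_URL,
        data=mp3_bytes,
        method="POST",
        headers={"x-api-key": api_key, "Content-Type": "audio/mp3"},
    )
    with opener(request) as response:
        return json.loads(response.read())["transactionID"]


def fetch_text(transaction_id, api_key, opener=urllib.request.urlopen):
    """GET the transcript once; return None if it is not available."""
    request = urllib.request.Request(
        API_URL + transaction_id, headers={"x-api-key": api_key}
    )
    try:
        with opener(request) as response:
            body = json.loads(response.read())
    except urllib.error.HTTPError:
        # Assumption: an error status means the conversion is still pending.
        # Real code should inspect the status to tell this apart from,
        # for example, an invalid API key.
        return None
    # Assumption: a reply without "text" also means "not ready yet".
    return body.get("text")


def wait_for_text(transaction_id, api_key, attempts=10, delay_seconds=3.0,
                  opener=urllib.request.urlopen, sleep=time.sleep):
    """Poll fetch_text until a transcript arrives or attempts run out."""
    for attempt in range(attempts):
        text = fetch_text(transaction_id, api_key, opener)
        if text is not None:
            return text
        if attempt < attempts - 1:
            sleep(delay_seconds)  # wait before the next poll
    raise TimeoutError("no transcript after %d attempts" % attempts)
```

Typical use would be to call `submit_mp3` with the bytes of an MP3 file and the API key, then pass the returned ID to `wait_for_text`. The `opener` and `sleep` parameters exist only so the flow can be exercised without network access.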

Get your API Key now and start using the Speech-to-Text service REST API


Pricing

The Speech-to-Text service is provided free-of-charge.

Data Retention and Data Privacy

We operate on the following principles:

  • Customer data is encrypted when at rest and when in motion.
  • Data is only kept for as long as necessary to provide service to the customer, ensure that we can support the customer, and be able to fulfil our legal and regulatory obligations.
  • Text files resulting from a speech-to-text API conversion, and input MP3 files, are stored for 90 days and then automatically deleted. Files are stored encrypted using AES-256.

The Speech-to-Text service is one of the many building blocks we are releasing over the coming months as part of the Voice Gateway, and making available through the Voice API.

Need to convert text-to-speech? See our Text-to-Speech service.

Service snapshot

  • Convert file (MP3) containing speech to text
  • Synchronous and asynchronous RESTful API
  • Text storage
  • Fast automatic speech recognition
