Introduction

The Contexta360 Speech To Text Analyzer API is a highly accurate and customizable speech-to-text service for Dutch and English. The API provides the following services:

  • Speech-to-text transcription that transcribes audio files to text.
  • Speech-to-text analytics that analyzes the transcriptions and provides insights into agent performance or conversation content.

Note: This API is only available in production and cannot be tested in our sandbox environment.

A typical workflow consists of three steps:

  1. Upload an audio file with a POST /transcripts/create request. You receive a transcriptId.
  2. Check the status with a GET /transcripts/status request.
  3. Download the transcription with a GET /transcripts/show request.

Note: If you provide a callback URL with the POST /transcripts/create request, the API sends you a notification when the transcription process has finished.


Conceptual model


Definitions

Decoder

A decoder in this documentation refers to a speech analyzer that is optimized to recognize certain predetermined conditions in analyzed recordings, such as language or dialect.

Callback (webhook)

A callback, or webhook, is an HTTP POST endpoint that you implement and that another system calls to notify you when an event occurs on that system. For this to work, you must register the address of your webhook with the other system.
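
As an illustration of the definition above, a callback endpoint could look roughly like the following. This is a minimal sketch using only the Python standard library, not a reference implementation. It assumes the notification body is JSON, which this documentation does not specify. The names parse_notification, CallbackHandler, make_server and the port 8080 are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Notifications accepted so far; a real service would queue or persist them.
received_notifications = []


def parse_notification(raw_body: bytes):
    """Return (http_status, payload) for a raw notification body."""
    try:
        # Assumption: the notification body is JSON (an empty body counts as {}).
        payload = json.loads(raw_body or b"{}")
    except ValueError:  # Covers invalid JSON and undecodable bytes.
        return 400, None
    return 200, payload


class CallbackHandler(BaseHTTPRequestHandler):
    """Accepts HTTP POST notifications on any path."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, payload = parse_notification(self.rfile.read(length))
        if payload is not None:
            received_notifications.append(payload)
        # Acknowledge quickly; do any heavy processing elsewhere.
        self.send_response(status)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # Silence the default per-request logging.


def make_server(host: str = "127.0.0.1", port: int = 8080) -> HTTPServer:
    """Create the callback server; call serve_forever() on it to start."""
    return HTTPServer((host, port), CallbackHandler)
```

Calling make_server().serve_forever() starts the listener. The address you pass as the callback URL must be reachable from the system that sends the notification.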

API workflow


Features and constraints

Features

  • Recognition of spoken text in Dutch and English (more to come).
  • Callback support to let the sender know that the decoding has finished.
  • Supports transcription history.
  • Analyses the transcription and adds metadata about the conversation.
  • NEW! Audiolink is available for audio files larger than 10MB.

Constraints

  • This API cannot be tested in our sandbox environment. You will need to apply for production to use the API.
  • Supported file formats are: .wav, .mp3 or .opus.
  • Maximum audio file size is 10MB. If you have audio files that exceed this size limitation, use the audiolink parameter in the POST /transcripts/create request.
  • For security reasons, transcripts are only available for 24 hours after creation.
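
The file format and size constraints above can be checked on the client before a request is sent. The following is a minimal sketch: the function name choose_upload_mode is hypothetical, and reading "10MB" as 10 × 1024 × 1024 bytes is an assumption, since the exact byte limit is not stated here.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".opus"}
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # Assumption: "10MB" means 10 MiB.


def choose_upload_mode(filename: str, size_bytes: int) -> str:
    """Return "upload" for a direct upload or "audiolink" for a large file."""
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported file format: {extension or '(none)'}")
    # Files over the limit must be passed with the audiolink parameter instead.
    return "upload" if size_bytes <= MAX_UPLOAD_BYTES else "audiolink"
```

For example, choose_upload_mode("call.wav", 2_000_000) returns "upload", while a 25 MB .mp3 file yields "audiolink".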


How to...

Get client information

Retrieve the information that the system holds about you as a client.

SwaggerHub:

  1. Select GET /clients/me.
  2. Click Try it out.
  3. Click Execute.
  4. Check the response code and message.

Postman:

  1. Select (GET) /clients/me.
  2. Click Send.
  3. Check the response code and message.
^^Response example^^
{
  "data": {
    "id": "string",
    "email": "string",
    "name": "string",
    "persistData": true
  }
}

List decoders

Get a list of the available decoders from Contexta and choose the one that best fits your use case.

SwaggerHub:

  1. Select GET /decoders.
  2. Click Try it out.
  3. Click Execute.
  4. Check the response code and message.

Postman:

  1. Select (GET) /decoders.
  2. Click Send.
  3. Check the response code and message.
^^Response example^^
{
  "data": {
    "decoders": [
      {
        "tag": "PUBLIC",
        "default": false,
        "_id": "5af597e9f36d280745034405",
        "description": "Standard Contexta models for English transcription purposes",
        "lang": "en",
        "title": "Contexta Standard English"
      },
      {
        "tag": "PUBLIC",
        "default": false,
        "_id": "5b16b7c7fb6fc02bcb8ec28e",
        "description": "Standard Contexta models for generic transcription purposes",
        "lang": "nl",
        "title": "Contexta Standard Dutch"
      },
      {
        "tag": "PUBLIC",
        "default": true,
        "_id": "5af57b13f36d280745033547",
        "description": "Enhanced Contexta models for phone-related transcription purposes",
        "lang": "nl",
        "title": "Contexta Enhanced Dutch"
      }
    ]
  }
}
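
Selecting a decoder from this response can be automated. The sketch below relies only on the fields shown in the response example (lang, default, _id). The function name pick_decoder and the rule "prefer the decoder flagged as default for the language" are assumptions for illustration, not documented API behavior.

```python
def pick_decoder(decoders_response: dict, lang: str) -> str:
    """Return the _id of a decoder for `lang`, preferring the default one."""
    decoders = decoders_response["data"]["decoders"]
    candidates = [d for d in decoders if d.get("lang") == lang]
    if not candidates:
        raise LookupError(f"No decoder available for language {lang!r}")
    # Stable sort: decoders flagged "default" move to the front,
    # the remaining ones keep the order of the response.
    candidates.sort(key=lambda d: not d.get("default", False))
    return candidates[0]["_id"]


# Trimmed copy of the response example above.
example_response = {
    "data": {
        "decoders": [
            {"default": False, "_id": "5af597e9f36d280745034405", "lang": "en"},
            {"default": False, "_id": "5b16b7c7fb6fc02bcb8ec28e", "lang": "nl"},
            {"default": True, "_id": "5af57b13f36d280745033547", "lang": "nl"},
        ]
    }
}

dutch_decoder_id = pick_decoder(example_response, "nl")
```

With the example data, the Dutch lookup selects the Enhanced Dutch decoder because it is the one flagged as default.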

Create a transcript from an audio recording

Provide an audio file with the conversation to be transcribed. The file must meet the following requirements:

  • File format is .wav, .mp3 or .opus (uncompressed files preferred)
  • Minimum bit rate: 32k
  • Sampling rate: 8 kHz
  • Encoding: pcm_s16le
  • Preferred bit rate of 256k and sampling rate of 16 kHz
  • Maximum audio file size: 10MB.
  • NEW An audiolink parameter is available for audio files greater than 10MB.

Provide the appropriate decoder _id (retrieved with GET /decoders), depending on the language spoken in the audio file. You can optionally provide a callback URL, to which a notification is sent when the transcription process has finished. You can also provide an optional JSON metadata object, which is passed unchanged to the callback URL (if given) when the transcript is done.

SwaggerHub:

  1. Select POST /transcripts/create.
  2. Click Try it out.
  3. Edit the parameters: set audio to the audio file to transcribe. The other parameters are optional, for example callbackurl to provide a callback URL, or decoder to force a specific decoder _id.
  4. Click Execute.
  5. Check the response code and message.

Postman:

  1. Select (POST) /create.
  2. Click the Params section of the request and provide a value for the audio key. The other keys are optional, for example callbackurl to provide a callback URL, or decoder to force a specific decoder _id.
  3. Click Send.
  4. Check the response code and message.
^^Response example^^
{
  "data": {
    "transcriptId": "5c9b99fd4d1c2752***"
  }
}
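
Outside SwaggerHub and Postman, the same request can be assembled in code. The sketch below only builds the request and does not send it. It makes several assumptions: the base URL is a placeholder, the parameters audio, decoder and callbackurl are sent as multipart/form-data fields, and authentication (not covered in this section) still has to be added. The function name build_create_request is hypothetical.

```python
import urllib.request
import uuid

BASE_URL = "https://api.example.com"  # Placeholder; replace with the real base URL.


def build_create_request(audio_bytes, filename, decoder_id=None, callback_url=None):
    """Build (but do not send) a multipart POST /transcripts/create request."""
    boundary = uuid.uuid4().hex
    parts = []

    def add_field(name, value):
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode()
        )

    # Optional fields are only included when a value is given.
    if decoder_id:
        add_field("decoder", decoder_id)
    if callback_url:
        add_field("callbackurl", callback_url)

    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    parts.append(file_header + audio_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())  # Closing boundary.

    return urllib.request.Request(
        BASE_URL + "/transcripts/create",
        data=b"".join(parts),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

After adding your authentication headers, the request can be sent with urllib.request.urlopen. The transcriptId is then found under data.transcriptId in the JSON response, as in the response example.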

Check the transcription process status

You can check the status of the transcription process for a previously uploaded audio file by providing its transcriptId. The response contains one of the following statuses: pending, inprogress, ready.

SwaggerHub:

  1. Select GET /transcripts/status/{transcriptId}.
  2. Click Try it out.
  3. Edit the parameters: set transcriptId to the transcript ID returned by the /create request.
  4. Click Execute.
  5. Check the response code and message.

Postman:

  1. After a /create request, the resulting transcriptId is automatically stored in the environment variable contexta_transcriptId.
  2. This request assumes that contexta_transcriptId is set. Check that it is.
  3. Select (GET) /status/{transcriptId}.
  4. Click Send.
  5. Check the response code and message.
^^Response example^^
{
  "data": {
    "transcriptId": "5c9b99fd4d1c2752***",
    "status": "DONE"
  }
}
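
If you do not use a callback URL, the status check can be repeated until the transcript is finished. The following is a minimal polling sketch under stated assumptions. The fetch_status parameter is a hypothetical callable that performs GET /transcripts/status/{transcriptId} with your own HTTP client and returns the parsed JSON body. Because the text above lists pending, inprogress and ready while the response example shows "DONE", the sketch treats both "ready" and "done" as finished and ignores letter case. The interval and timeout values are arbitrary.

```python
import time

# Assumption: either value marks a finished transcript (compared in lower case).
FINISHED_STATUSES = {"ready", "done"}


def wait_until_finished(fetch_status, transcript_id,
                        interval_s=5.0, timeout_s=600.0, sleep=time.sleep):
    """Poll until the transcript is finished and return the final status."""
    deadline = time.monotonic() + timeout_s
    while True:
        body = fetch_status(transcript_id)
        status = body["data"]["status"]
        if status.lower() in FINISHED_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Transcript {transcript_id} still {status!r} after {timeout_s}s"
            )
        sleep(interval_s)  # Wait before asking again.
```

Passing the HTTP call in as a function keeps the loop independent of any particular client library and makes it easy to test with canned responses.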

Get the transcribed audio as JSON

When the transcription is ready, you can retrieve the transcript of a previously uploaded audio file as JSON. For security reasons, transcripts are only available for 24 hours after creation.

Note: This request is not supported for SwaggerHub.

Postman:

  1. After a /create request, the resulting transcriptId is automatically stored in the environment variable contexta_transcriptId.
  2. This request assumes that contexta_transcriptId is set. Check that it is.
  3. Select (GET) /analyzer/transcripts/{transcriptId}.
  4. Click Send.
  5. Check the response code and message.
^^Response example^^
{
  "name":"3b23f5ca295d09752fdef***",
  "overall_conf":"0.86",
  "path":"3b23f5ca295d09752fdef***.wav",
  "word_count":"22",
  "SpeakerList":
    [
      {"spkid":"spk1"}
    ],
  "SegmentList":
    [
      {"spkid":"spk1","question":"False","words":
        [
          {"text":"okay","conf":"0.87","dur":"0.42","stime":"0.70"},
          {"text":"we're","conf":"0.97","dur":"0.24","stime":"1.12"},
          {"text":"trying","conf":"1.00","dur":"0.33","stime":"1.36"},
          {"text":"this","conf":"0.99","dur":"0.18","stime":"1.69"},
          {"text":"for","conf":"0.99","dur":"0.18","stime":"1.90"},
          {"text":"a","conf":"0.99","dur":"0.21","stime":"2.08"},
          {"text":"second","conf":"1.00","dur":"0.39","stime":"2.32"},
          {"text":"time","conf":"1.00","dur":"0.54","stime":"2.74"},
          {"text":"to","conf":"1.00","dur":"0.27","stime":"3.31"},
          {"text":"test","conf":"1.00","dur":"0.54","stime":"3.61"},
          ...
        ]
      }
    ]
}
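
The word-level structure of this response lends itself to simple post-processing. The sketch below rebuilds the text of a segment and lists words with a low confidence score. It relies only on the fields shown in the example (SegmentList, words, text, conf) and converts the numeric values from strings, as they appear in the example. The function names and the threshold of 0.9 are hypothetical.

```python
def segment_text(segment: dict) -> str:
    """Join the words of one segment into a single string."""
    return " ".join(word["text"] for word in segment["words"])


def low_confidence_words(transcript: dict, threshold: float = 0.9) -> list:
    """Return (text, confidence) pairs whose confidence is below the threshold."""
    flagged = []
    for segment in transcript["SegmentList"]:
        for word in segment["words"]:
            conf = float(word["conf"])  # Numbers arrive as strings in the example.
            if conf < threshold:
                flagged.append((word["text"], conf))
    return flagged


# First three words of the response example above.
example_transcript = {
    "SegmentList": [
        {"spkid": "spk1", "question": "False", "words": [
            {"text": "okay", "conf": "0.87", "dur": "0.42", "stime": "0.70"},
            {"text": "we're", "conf": "0.97", "dur": "0.24", "stime": "1.12"},
            {"text": "trying", "conf": "1.00", "dur": "0.33", "stime": "1.36"},
        ]}
    ]
}

first_segment = segment_text(example_transcript["SegmentList"][0])
uncertain = low_confidence_words(example_transcript)
```

Note that the question flag is also a string ("False") in the example, so compare it as text rather than as a boolean.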

For the Speech To Text Analyzer API, the response to this request contains considerably more data than the response of the regular Speech To Text API. You will receive:

  1. Metadata about the full conversation
  2. Metadata for every speaker
  3. Metadata for every segment

Full conversation:

  • Most important keywords
  • Total duration of conversation
  • Total duration of silence
  • Silence ratio
  • Word count
  • Talk speed
  • Number of speakers
  • Sentiment score
  • Number of positive sentiments
  • Number of negative sentiments

Speaker:

  • Most important keywords of this speaker
  • Total duration
  • Number of words spoken by this speaker
  • Talk speed
  • Talk ratio
  • Sentiment score
  • Number of positive sentiments
  • Number of negative sentiments

Segment:

  • Is this segment a question
  • Sentiment of this segment


Check transcript history

With this request you can check the history of transcription requests and their status. For security reasons, this feature cannot be tested and is only available in production.

SwaggerHub:

  1. Select GET /transcripts.
  2. Click Try it out.
  3. Click Execute.
  4. Check the response code and message.

Postman:

  1. Select (GET) /transcripts.
  2. Click Send.
  3. Check the response code and message.
^^Response example^^
[
  {
    "_id": "string",
    "duration": 0,
    "numSpeakers": 0,
    "decoder": {
      "_id": "string",
      "title": "string"
    },
    "callbackurl": "string",
    "data": "string",
    "originalName": "string",
    "status": "PENDING",
    "created_at": "string"
  }
]
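
A history response of this shape can be summarized with a few lines of code. The sketch below uses only the _id and status fields from the response example. The function names are hypothetical, and upper-casing the status is a precaution, since the letter case of status values differs between examples in this documentation.

```python
from collections import Counter


def count_by_status(history: list) -> Counter:
    """Count transcription requests per status, ignoring letter case."""
    return Counter(item["status"].upper() for item in history)


def ids_with_status(history: list, status: str) -> list:
    """Return the _id of every request whose status matches `status`."""
    wanted = status.upper()
    return [item["_id"] for item in history if item["status"].upper() == wanted]
```

For example, ids_with_status(history, "PENDING") returns the IDs of all requests that are still waiting to be processed.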


Return codes

Code   Description
200   Success.
201   Created.
202   Accepted.
302   Found. Link in location header.
400   Bad request.
401   Unauthorized.
403   Forbidden.
404   Not found.
405   Method not allowed.
412   Precondition failed.
429   Too many requests.
500   Internal server error.
502   Bad gateway.
503   Service unavailable.
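
In client code, the table above can be turned into a lookup for logging and error handling. The following is a minimal sketch. The descriptions are copied from the table. Treating 429, 502 and 503 as worth retrying is an assumption based on the usual meaning of these HTTP status codes, not documented behavior of this API. The function names are hypothetical.

```python
RETURN_CODES = {
    200: "Success.",
    201: "Created.",
    202: "Accepted.",
    302: "Found. Link in location header.",
    400: "Bad request.",
    401: "Unauthorized.",
    403: "Forbidden.",
    404: "Not found.",
    405: "Method not allowed.",
    412: "Precondition failed.",
    429: "Too many requests.",
    500: "Internal server error.",
    502: "Bad gateway.",
    503: "Service unavailable.",
}

# Assumption: these codes usually indicate a temporary condition.
RETRYABLE_CODES = {429, 502, 503}


def describe(code: int) -> str:
    """Return the documented description, or a fallback for unlisted codes."""
    return RETURN_CODES.get(code, "Unknown return code.")


def should_retry(code: int) -> bool:
    """Tell whether repeating the request later may succeed."""
    return code in RETRYABLE_CODES
```

When retrying, wait between attempts and limit the number of retries so that a persistent failure does not turn into a request loop.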