Transcribe - Convert audio file to text and optionally summarise

Last Updated 2 months ago

Transcribe API

Overview

The Transcribe API converts audio files to text using speech recognition technology. It also offers an optional summarisation feature.

Version

1.001

Endpoint

/v2/transcribe

Authentication

This API requires authentication with an API key. Pass the key in the X_API_KEY request header.

Request

The API accepts a multipart/form-data POST request with the following parameters:

Parameter	Type	Required	Description
audio	file	Yes	The audio file to transcribe.
summarise	boolean	No	Whether to generate a summary of the transcribed text. Default is false.
summarisewith	string	No	The prompt used to generate the summary.
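
As an illustration of the parameters above, here is a minimal Python sketch that builds (but does not send) the multipart/form-data POST using only the standard library. The function name, the base URL you pass in, the "true"/"false" encoding of the boolean, and the application/octet-stream part type are assumptions for the example; only the /v2/transcribe path, the X_API_KEY header, and the three parameter names come from this page.

```python
import uuid
import urllib.request


def build_transcribe_request(base_url, api_key, audio_bytes, filename,
                             summarise=False, summarise_with=None):
    """Build (but do not send) a multipart/form-data POST for /v2/transcribe."""
    boundary = uuid.uuid4().hex

    # Plain text fields. Sending the boolean as "true"/"false" is an assumption.
    fields = {"summarise": "true" if summarise else "false"}
    if summarise_with is not None:
        fields["summarisewith"] = summarise_with

    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )

    # The file part carries the raw audio bytes under the "audio" parameter.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode("utf-8")
        + audio_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v2/transcribe",
        data=b"".join(parts),
        method="POST",
        headers={
            "X_API_KEY": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

The returned object can be passed to urllib.request.urlopen and the response body decoded as JSON. The base URL is not given on this page, so it is left as an argument.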

Response

The API returns a JSON object with the following structure:

Field	Type	Description
status	string	The status of the request. Possible values: "success" or "error".
filename	string	The name of the uploaded audio file.
transcription	string	The transcribed text from the audio file. Only present if status is "success".
language	string	The language of the transcription (currently defaults to 'en'). Only present if status is "success".
duration	float	The processing time in seconds. Only present if status is "success".
summary	string	A summary of the transcribed content. Only present if the summarise parameter was set to true and the summarisation was successful.
message	string	Error message explaining what went wrong. Only present if status is "error".
error_code	integer	The PHP upload error code. Only present if status is "error" and there was a file upload issue.
error_details	string	A detailed explanation of the upload error. Only present if status is "error" and there was a file upload issue.

Example Response (Success)


{
  "status": "success",
  "filename": "meeting_recording.mp3",
  "transcription": "Welcome to our annual conference on artificial intelligence and machine learning. Today, we'll be discussing the latest advancements in natural language processing.",
  "language": "en",
  "duration": 12.5,
  "summary": "This is a conference introduction about AI and machine learning, focusing on recent developments in NLP."
}

Example Response (Error)


{
  "status": "error",
  "message": "File upload failed",
  "error_code": 1,
  "error_details": "The uploaded file exceeds the upload_max_filesize directive in php.ini"
}
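
A client might handle the two response shapes above along these lines. This is a sketch only: the TranscribeError class and the parse_transcribe_response function are hypothetical names, and the field names are taken from the response table.

```python
import json


class TranscribeError(Exception):
    """Raised when the response reports status "error"."""

    def __init__(self, message, error_code=None, error_details=None):
        super().__init__(message)
        self.error_code = error_code        # only set for file upload issues
        self.error_details = error_details  # only set for file upload issues


def parse_transcribe_response(raw_json):
    """Return the success fields as a dict, or raise TranscribeError."""
    payload = json.loads(raw_json)
    if payload.get("status") != "success":
        raise TranscribeError(
            payload.get("message", "unknown error"),
            payload.get("error_code"),
            payload.get("error_details"),
        )
    return {
        "transcription": payload["transcription"],
        "language": payload.get("language"),
        "duration": payload.get("duration"),
        # summary is absent unless summarisation was requested and succeeded
        "summary": payload.get("summary"),
    }
```

Using .get for the optional fields means a successful response without a summary yields None rather than raising a KeyError.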

Error Handling

If the request method is not POST, the API will return an error.
If the API key is invalid or the user is not found, the response will contain the validation status.
If the file upload fails, the API will return an error with details about what went wrong.
If the transcription service encounters an error, the API will return an error with the specific message.

Notes

Transcription is billed per second of transcription time. For example, if a long conversation takes 1 minute to transcribe, the charge is 60 times the per-second rate.
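
The billing arithmetic can be sketched as follows. The function name and the rate used in the example are hypothetical; this page does not state the actual per-second rate or whether partial seconds are rounded.

```python
from decimal import Decimal


def transcription_cost(transcription_seconds, rate_per_second):
    """Cost = seconds of transcription time x per-second rate."""
    # Decimal avoids binary floating-point rounding in currency arithmetic.
    return Decimal(str(transcription_seconds)) * Decimal(str(rate_per_second))


# A transcription that takes 1 minute, at a made-up rate of 0.002 per second.
cost = transcription_cost(60, "0.002")
```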

When summarisation is requested, the API makes an additional call to the LLM API to generate a summary of the transcribed text, and this additional request will consume API credit.

summarisewith should be an LLM prompt that expects the transcription to be embedded. Keep it terse and direct for best results.

Security

Ensure that the API key is kept secure and not exposed in client-side code.
All responses are sent with the "Content-Type: application/json" header.
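
One common way to keep the API key out of source code is to read it from the server-side environment. The sketch below assumes a hypothetical environment variable named TRANSCRIBE_API_KEY; the name is not defined by this API.

```python
import os


def load_api_key(env_var="TRANSCRIBE_API_KEY"):
    """Read the API key from the environment instead of hard-coding it."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable")
    return key
```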