POST /transcriptions/async-generate-audio

Async Generate Audio Transcription
curl --request POST \
  --url https://mavi-backend.memories.ai/serve/api/v2/transcriptions/async-generate-audio \
  --header 'Authorization: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "asset_id": "re_657929111888723968",
  "model": "whisper-1",
  "speaker": true
}
'
{
  "code": "0000",
  "msg": "success",
  "data": {
    "task_id": "ec2449885ba84c4f943a80ff0633158e"
  },
  "failed": false,
  "success": true
}
This endpoint submits an audio transcription task that runs asynchronously. The immediate response contains a task_id; the finished transcription is delivered to your configured webhook URL via the callback described below.

Code Example

import requests

BASE_URL = "https://mavi-backend.memories.ai/serve/api/v2/transcriptions"
API_KEY = "<your-api-key>"  # replace with your actual API key
HEADERS = {
    "Authorization": API_KEY
}

def async_generate_audio(asset_id: str, model: str, speaker: bool) -> dict:
    """Submit an asynchronous audio transcription task and return the task info."""
    url = f"{BASE_URL}/async-generate-audio"
    data = {"asset_id": asset_id, "model": model, "speaker": speaker}
    resp = requests.post(url, json=data, headers=HEADERS)
    resp.raise_for_status()
    return resp.json()

# Usage example
result = async_generate_audio("re_657929111888723968", "whisper-1", True)
print(result)  # data.task_id identifies the transcription task

Response

Returns the transcription task information.
{
  "code": "0000",
  "msg": "success",
  "data": {
    "task_id": "ec2449885ba84c4f943a80ff0633158e"
  },
  "failed": false,
  "success": true
}

Response Parameters

| Parameter    | Type    | Description                                       |
|--------------|---------|---------------------------------------------------|
| code         | string  | Response code indicating the result status        |
| msg          | string  | Response message describing the operation result  |
| data         | object  | Response data object containing task information  |
| data.task_id | string  | Unique identifier of the transcription task       |
| success      | boolean | Indicates whether the operation was successful    |
| failed       | boolean | Indicates whether the operation failed            |

Callback Response Parameters

When the audio transcription is complete, a callback will be sent to your configured webhook URL.
| Parameter                               | Type           | Description                                                            |
|-----------------------------------------|----------------|------------------------------------------------------------------------|
| code                                    | string         | Response code ("0000" indicates success)                               |
| message                                 | string         | Status message (e.g., "SUCCESS")                                       |
| data                                    | object         | Response data object containing the transcription result and metadata |
| data.data                               | object         | Inner data object containing transcription segments and usage information |
| data.data.data                          | array          | Array of transcription segments with timestamps                        |
| data.data.data[].start_time             | number         | Start time of the segment in seconds                                   |
| data.data.data[].end_time               | number         | End time of the segment in seconds                                     |
| data.data.data[].text                   | string         | Transcription text for this segment                                    |
| data.data.data[].speaker                | string or null | Speaker identifier if speaker=true was requested, otherwise null       |
| data.data.usage_metadata                | object         | Usage statistics for the API call                                      |
| data.data.usage_metadata.duration       | number         | Audio duration in seconds                                              |
| data.data.usage_metadata.model          | string         | The model used for transcription (e.g., "whisper-1")                   |
| data.data.usage_metadata.output_tokens  | integer        | Number of output tokens (0 for audio transcription)                    |
| data.data.usage_metadata.prompt_tokens  | integer        | Number of prompt tokens (0 for audio transcription)                    |
| data.msg                                | string         | Detailed message about the operation result                            |
| data.success                            | boolean        | Indicates whether the transcription was successful                     |
| task_id                                 | string         | The task ID associated with this transcription request                 |
Speaker Identification: The speaker field in each transcription segment will only contain a speaker identifier (e.g., "SPEAKER_00") when the request parameter speaker=true is set. Otherwise, it will be null.
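When speaker=true is set, the callback segments can be grouped into per-speaker lines. The following is a minimal sketch against the segment shape documented above; the format_transcript helper name and the sample segment values are our own, not part of the API:

```python
def format_transcript(segments: list[dict]) -> str:
    """Render callback segments as 'SPEAKER: text' lines.

    Consecutive segments from the same speaker are merged; a null
    speaker (speaker=false requests) falls back to a generic label.
    """
    lines: list[str] = []
    last_speaker = object()  # sentinel distinct from any real value
    for seg in segments:
        speaker = seg.get("speaker") or "UNKNOWN"
        text = seg["text"].strip()
        if speaker == last_speaker and lines:
            lines[-1] += " " + text  # same speaker: extend the current line
        else:
            lines.append(f"{speaker}: {text}")
            last_speaker = speaker
    return "\n".join(lines)

# Hypothetical segments in the callback's documented shape:
segments = [
    {"start_time": 0.0, "end_time": 2.0, "text": " Oh", "speaker": "SPEAKER_00"},
    {"start_time": 2.0, "end_time": 4.5, "text": " hello there", "speaker": "SPEAKER_00"},
    {"start_time": 4.5, "end_time": 6.0, "text": " Hi", "speaker": "SPEAKER_01"},
]
print(format_transcript(segments))
```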

Understanding the Callback Response

The callback response has a nested structure, with the transcription segments and usage information inside data.data.

Response Structure:
callback_response
├── code: "0000"
├── message: "SUCCESS"
├── data
│   ├── data
│   │   ├── data: [array of transcription segments]
│   │   │   └── [
│   │   │       {
│   │   │         start_time: 0.0,
│   │   │         end_time: 2.0,
│   │   │         text: " Oh",
│   │   │         speaker: null  // or "SPEAKER_00" if speaker=true
│   │   │       },
│   │   │       ...
│   │   │     ]
│   │   └── usage_metadata
│   │       ├── duration: 2
│   │       ├── model: "whisper-1"
│   │       ├── output_tokens: 0
│   │       └── prompt_tokens: 0
│   ├── msg: "ASR transcription completed successfully"
│   └── success: true
└── task_id: "016c7052f8224d5c971e35b7d08972fc"
How to access the data:
  • Transcription segments: callback_response.data.data.data
  • First segment text: callback_response.data.data.data[0].text
  • First segment speaker: callback_response.data.data.data[0].speaker (will be null if speaker=false)
  • Time range: callback_response.data.data.data[0].start_time to callback_response.data.data.data[0].end_time
  • Usage statistics: callback_response.data.data.usage_metadata
  • Audio duration: callback_response.data.data.usage_metadata.duration
  • Model used: callback_response.data.data.usage_metadata.model
  • Success status: callback_response.data.success
  • Task ID: callback_response.task_id
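The access paths above can be exercised directly in Python. This sketch uses the sample values from the structure diagram; in practice callback_response would be the parsed JSON body your webhook receives:

```python
# Sample payload mirroring the documented callback structure.
callback_response = {
    "code": "0000",
    "message": "SUCCESS",
    "data": {
        "data": {
            "data": [
                {"start_time": 0.0, "end_time": 2.0, "text": " Oh", "speaker": None},
            ],
            "usage_metadata": {
                "duration": 2,
                "model": "whisper-1",
                "output_tokens": 0,
                "prompt_tokens": 0,
            },
        },
        "msg": "ASR transcription completed successfully",
        "success": True,
    },
    "task_id": "016c7052f8224d5c971e35b7d08972fc",
}

segments = callback_response["data"]["data"]["data"]          # transcription segments
usage = callback_response["data"]["data"]["usage_metadata"]   # usage statistics

print(segments[0]["text"])                                    # first segment text
print(segments[0]["speaker"])                                 # null unless speaker=true
print(segments[0]["start_time"], segments[0]["end_time"])     # time range
print(usage["duration"], usage["model"])                      # duration and model
print(callback_response["data"]["success"])                   # success status
print(callback_response["task_id"])                           # task ID
```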

Authorizations

Authorization
string
header
required

Body

application/json
asset_id
string
required

The audio asset ID to transcribe

Example:

"re_657929111888723968"

model
string
required

The transcription model to use

Example:

"whisper-1"

speaker
boolean
required

Whether to include speaker identification

Example:

true

Response

200 - application/json

Transcription task information

code
string

Response code indicating the result status

Example:

"0000"

msg
string

Response message describing the operation result

Example:

"success"

data
object

Response data object containing task information

success
boolean

Indicates whether the operation was successful

Example:

true

failed
boolean

Indicates whether the operation failed

Example:

false
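Putting the response fields above together, a submit-and-check sketch might look like the following. The sample dict mirrors the example response on this page; in practice it would come from the POST request:

```python
# Sample response in the documented shape (normally the parsed JSON reply).
response = {
    "code": "0000",
    "msg": "success",
    "data": {"task_id": "ec2449885ba84c4f943a80ff0633158e"},
    "failed": False,
    "success": True,
}

# success/failed and code should agree; the task_id keys the later callback.
if response["success"] and not response["failed"] and response["code"] == "0000":
    task_id = response["data"]["task_id"]
    print(f"Transcription task submitted: {task_id}")
else:
    print(f"Submission failed: {response['msg']}")
```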