POST /transcriptions/sync-generate-audio
Sync Generate Audio Transcription
curl --request POST \
  --url https://mavi-backend.memories.ai/serve/api/v2/transcriptions/sync-generate-audio \
  --header 'Authorization: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "asset_id": "re_657929111888723968",
  "model": "whisper-1",
  "speaker": true
}
'
{
  "code": "0000",
  "msg": "success",
  "data": {
    "model": "whisper-1",
    "items": [
      {
        "text": "Hello, how are you today?",
        "start_time": 0.0,
        "end_time": 2.98
      },
      {
        "text": "I'm doing well, thank you.",
        "start_time": 2.98,
        "end_time": 6.78
      },
      {
        "text": "What are you planning to do?",
        "start_time": 6.78,
        "end_time": 10.56
      },
      {
        "text": "I'm going to the park.",
        "start_time": 10.56,
        "end_time": 13.1
      }
    ]
  },
  "failed": false,
  "success": true
}
This endpoint generates an audio transcription synchronously and returns the result directly in the response.

Code Example

import requests

BASE_URL = "https://mavi-backend.memories.ai/serve/api/v2/transcriptions"
API_KEY = "<api-key>"  # replace with your API key
HEADERS = {
    "Authorization": API_KEY
}

def sync_generate_audio(asset_id: str, model: str | None = None, speaker: bool | None = None):
    """Synchronously transcribe an asset's audio and return the parsed JSON response."""
    url = f"{BASE_URL}/sync-generate-audio"
    data = {"asset_id": asset_id}
    # Only include the optional fields when they are explicitly provided;
    # checking `is not None` keeps speaker=False from being silently dropped.
    if model is not None:
        data["model"] = model
    if speaker is not None:
        data["speaker"] = speaker
    # Generous timeout for the synchronous call; adjust to your workloads.
    resp = requests.post(url, json=data, headers=HEADERS, timeout=120)
    resp.raise_for_status()
    return resp.json()

# Usage example
result = sync_generate_audio("re_657929111888723968", "whisper-1", True)
print(result)
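
Before using data, it is worth checking the wrapper fields (code, msg, success, failed) described in the Response section below. The following is a minimal sketch building on the helper above; treating any non-"0000" code or success=false as an error, and raising RuntimeError, are choices of this sketch rather than documented behaviour:

def get_transcription(asset_id: str) -> dict:
    """Call the sync endpoint and return the `data` object,
    raising if the API reports a failure."""
    result = sync_generate_audio(asset_id, model="whisper-1", speaker=True)
    # code "0000" with success=true is the documented success shape;
    # treating anything else as an error is an assumption of this sketch.
    if result.get("code") != "0000" or not result.get("success"):
        raise RuntimeError(f"Transcription failed: {result.get('code')}: {result.get('msg')}")
    return result["data"]

data = get_transcription("re_657929111888723968")
print(f"model: {data['model']}, segments: {len(data['items'])}")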

Response

Returns the transcription result directly: the response contains an array of transcription segments, each with timing information.

Response Structure:
  • Returns a standard response format with code, msg, data, success, and failed fields
  • The data object contains:
    • model: The transcription model used (e.g., "whisper-1")
    • items: Array of transcription segments, each containing:
      • text: The transcribed text for the segment
      • start_time: Start time of the segment in seconds
      • end_time: End time of the segment in seconds
      • speaker: Speaker identifier (only present when the speaker parameter is set to true)

Response Format Variations:
  • When speaker=false: Items do not include the speaker field
  • When speaker=true: Items include the speaker field with speaker identification (see the sketch below)
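
As an illustration of consuming these segments, here is a minimal sketch that prints a readable transcript with timestamps; the timestamp formatting and the "Speaker" fallback label used when the speaker field is absent are choices of this sketch, not part of the API:

def print_transcript(items: list[dict]) -> None:
    """Print each segment as [start - end] speaker: text."""
    for item in items:
        start = item["start_time"]
        end = item["end_time"]
        # speaker is only present when the request was made with speaker=true
        who = item.get("speaker", "Speaker")
        print(f"[{start:6.2f} - {end:6.2f}] {who}: {item['text']}")

print_transcript(data["items"])  # `data` comes from the earlier sketch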

Response Parameters

Parameter                  Type            Description
code                       string          Response code indicating the result status
msg                        string          Response message describing the operation result
data                       object          Response data object containing transcription results
data.model                 string          The transcription model used (e.g., "whisper-1")
data.items                 array[object]   Array of transcription segments with timing information
data.items[].text          string          Transcribed text for the segment
data.items[].start_time    number          Start time of the segment in seconds
data.items[].end_time      number          End time of the segment in seconds
data.items[].speaker       string          Speaker identifier (only present when speaker=true)
success                    boolean         Indicates whether the operation was successful
failed                     boolean         Indicates whether the operation failed
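
For client code that benefits from static typing, the table above maps naturally onto TypedDicts. The following is a sketch of one possible client-side model, not a type definition shipped by the API:

from typing import TypedDict, NotRequired  # NotRequired requires Python 3.11+

class TranscriptionItem(TypedDict):
    text: str
    start_time: float
    end_time: float
    speaker: NotRequired[str]  # only present when the request sets speaker=true

class TranscriptionData(TypedDict):
    model: str
    items: list[TranscriptionItem]

class TranscriptionResponse(TypedDict):
    code: str
    msg: str
    data: TranscriptionData
    success: bool
    failed: bool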

Authorizations

Authorization
string
header
required

Body

application/json
asset_id
string
required

The asset ID to transcribe

Example:

"re_657929111888723968"

model
string

The transcription model to use

Example:

"whisper-1"

speaker
boolean

Whether to include speaker identification

Example:

true
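
Because model and speaker are optional, the smallest valid request body carries only asset_id. A sketch using the helper defined in the code example above (which default model the service applies in that case is not specified here):

# Minimal call: only the required asset_id; model and speaker are omitted,
# so the service falls back to its own defaults (not documented in this section).
minimal_result = sync_generate_audio("re_657929111888723968")
print(minimal_result["data"]["model"])  # model actually used by the service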

Response

200 - application/json

Transcription result

code
string

Response code indicating the result status

Example:

"0000"

msg
string

Response message describing the operation result

Example:

"success"

data
object

Response data object containing transcription results

success
boolean

Indicates whether the operation was successful

Example:

true

failed
boolean

Indicates whether the operation failed

Example:

false