POST /transcriptions/sync-generate-audio
Sync Generate Audio Transcription
curl --request POST \
  --url https://mavi-backend.memories.ai/serve/api/v2/transcriptions/sync-generate-audio \
  --header 'Authorization: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "asset_id": "re_657929111888723968",
  "model": "whisper-1",
  "speaker": true
}
'
{
  "code": "0000",
  "msg": "success",
  "data": {
    "model": "whisper-1",
    "items": [
      {
        "text": "Hello, how are you today?",
        "start_time": 0.0,
        "end_time": 2.98
      },
      {
        "text": "I'm doing well, thank you.",
        "start_time": 2.98,
        "end_time": 6.78
      },
      {
        "text": "What are you planning to do?",
        "start_time": 6.78,
        "end_time": 10.56
      },
      {
        "text": "I'm going to the park.",
        "start_time": 10.56,
        "end_time": 13.1
      }
    ]
  },
  "failed": false,
  "success": true
}
This endpoint generates an audio transcription synchronously and returns the result directly in the response.

Code Example

import requests

BASE_URL = "https://mavi-backend.memories.ai/serve/api/v2/transcriptions"
API_KEY = "<api-key>"  # replace with your API key
HEADERS = {
    "Authorization": API_KEY
}

def sync_generate_audio(asset_id: str, model: str | None = None, speaker: bool | None = None):
    """Synchronously transcribe an asset's audio and return the parsed JSON response."""
    url = f"{BASE_URL}/sync-generate-audio"
    data = {"asset_id": asset_id}
    # Only include the optional fields when they are explicitly provided;
    # checking `is not None` keeps speaker=False from being silently dropped.
    if model is not None:
        data["model"] = model
    if speaker is not None:
        data["speaker"] = speaker
    # Generous timeout for the synchronous call; adjust to your workloads.
    resp = requests.post(url, json=data, headers=HEADERS, timeout=120)
    resp.raise_for_status()
    return resp.json()

# Usage example
result = sync_generate_audio("re_657929111888723968", "whisper-1", True)
print(result)
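
Before using data, it is worth checking the wrapper fields (code, msg, success, failed) described in the Response section below. The following is a minimal sketch building on the helper above; treating any non-"0000" code or success=false as an error, and raising RuntimeError, are choices of this sketch rather than documented behaviour:

def get_transcription(asset_id: str) -> dict:
    """Call the sync endpoint and return the `data` object,
    raising if the API reports a failure."""
    result = sync_generate_audio(asset_id, model="whisper-1", speaker=True)
    # code "0000" with success=true is the documented success shape;
    # treating anything else as an error is an assumption of this sketch.
    if result.get("code") != "0000" or not result.get("success"):
        raise RuntimeError(f"Transcription failed: {result.get('code')}: {result.get('msg')}")
    return result["data"]

data = get_transcription("re_657929111888723968")
print(f"model: {data['model']}, segments: {len(data['items'])}")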

Response

Returns the transcription result directly: the response contains an array of transcription segments, each with timing information.

Response Structure:
  • Returns a standard response format with code, msg, data, success, and failed fields
  • The data object contains:
    • model: The transcription model used (e.g., "whisper-1")
    • items: Array of transcription segments, each containing:
      • text: The transcribed text for the segment
      • start_time: Start time of the segment in seconds
      • end_time: End time of the segment in seconds
      • speaker: Speaker identifier (only present when the speaker parameter is set to true)

Response Format Variations:
  • When speaker=false: Items do not include the speaker field
  • When speaker=true: Items include the speaker field with speaker identification (see the sketch below)
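
As an illustration of consuming these segments, here is a minimal sketch that prints a readable transcript with timestamps; the timestamp formatting and the "Speaker" fallback label used when the speaker field is absent are choices of this sketch, not part of the API:

def print_transcript(items: list[dict]) -> None:
    """Print each segment as [start - end] speaker: text."""
    for item in items:
        start = item["start_time"]
        end = item["end_time"]
        # speaker is only present when the request was made with speaker=true
        who = item.get("speaker", "Speaker")
        print(f"[{start:6.2f} - {end:6.2f}] {who}: {item['text']}")

print_transcript(data["items"])  # `data` comes from the earlier sketch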

Response Parameters

Parameter                  Type            Description
code                       string          Response code indicating the result status
msg                        string          Response message describing the operation result
data                       object          Response data object containing transcription results
data.model                 string          The transcription model used (e.g., "whisper-1")
data.items                 array[object]   Array of transcription segments with timing information
data.items[].text          string          Transcribed text for the segment
data.items[].start_time    number          Start time of the segment in seconds
data.items[].end_time      number          End time of the segment in seconds
data.items[].speaker       string          Speaker identifier (only present when speaker=true)
success                    boolean         Indicates whether the operation was successful
failed                     boolean         Indicates whether the operation failed
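
For client code that benefits from static typing, the table above maps naturally onto TypedDicts. The following is a sketch of one possible client-side model, not a type definition shipped by the API:

from typing import TypedDict, NotRequired  # NotRequired requires Python 3.11+

class TranscriptionItem(TypedDict):
    text: str
    start_time: float
    end_time: float
    speaker: NotRequired[str]  # only present when the request sets speaker=true

class TranscriptionData(TypedDict):
    model: str
    items: list[TranscriptionItem]

class TranscriptionResponse(TypedDict):
    code: str
    msg: str
    data: TranscriptionData
    success: bool
    failed: bool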

Authorizations

Authorization
string
header
required

Body

application/json
asset_id
string
required

The asset ID to transcribe

Example:

"re_657929111888723968"

model
string

The transcription model to use

Example:

"whisper-1"

speaker
boolean

Whether to include speaker identification

Example:

true
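
Because model and speaker are optional, the smallest valid request body carries only asset_id. A sketch using the helper defined in the code example above (which default model the service applies in that case is not specified here):

# Minimal call: only the required asset_id; model and speaker are omitted,
# so the service falls back to its own defaults (not documented in this section).
minimal_result = sync_generate_audio("re_657929111888723968")
print(minimal_result["data"]["model"])  # model actually used by the service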

Response

200 - application/json

Transcription result

code
string

Response code indicating the result status

Example:

"0000"

msg
string

Response message describing the operation result

Example:

"success"

data
object

Response data object containing transcription results

success
boolean

Indicates whether the operation was successful

Example:

true

failed
boolean

Indicates whether the operation failed

Example:

false