Documentation Index
Fetch the complete documentation index at: https://api-tools.memories.ai/llms.txt
Use this file to discover all available pages before exploring further.
Product: Visual Intelligence — Audio File Transcription
Use case: Transcribe an uploaded audio/video file to text — async batch or sync, multiple providers (Whisper, ElevenLabs, AssemblyAI) with optional speaker labels. For live streams, see Live Audio Transcription.
Host:
https://mavi-backend.memories.ai/serve/api/v2
Auth: Authorization: sk-mavi-... (no Bearer prefix)SPEAKER_00, SPEAKER_01, etc. — anonymous labels based on voice characteristics, not identity.
Need named speakers? Use Multimodal Speaker Recognition, which combines voice + face recognition to identify speakers by name.
Pricing: $0.001/second of audio or video
Endpoints
| Method | Endpoint | Returns |
|---|---|---|
POST | /transcriptions/sync-generate-speaker | Result directly |
POST | /transcriptions/async-generate-speaker | task_id + webhook callback |
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| asset_id | string | Yes | The uploaded audio or video asset ID |
Code Examples
Sync Response
Sync Response Parameters
| Parameter | Type | Description |
|---|---|---|
| data.model | string | Model used (e.g., pyannote) |
| data.items | array | Speaker segments |
| data.items[].start | number | Segment start time in seconds |
| data.items[].end | number | Segment end time in seconds |
| data.items[].speaker | string | Anonymous speaker label (e.g., SPEAKER_00) |
Async Response
Callback Response Parameters
| Parameter | Type | Description |
|---|---|---|
| data.data.data | array | Speaker segments |
| data.data.data[].start | number | Segment start time in seconds |
| data.data.data[].end | number | Segment end time in seconds |
| data.data.data[].speaker | string | Anonymous speaker label |
| data.data.usage_metadata.duration | number | Total audio duration in seconds |
| data.data.usage_metadata.model | string | Model used |
| task_id | string | Task ID matching the initial response |
