Get Video Caption
curl --request GET \
  --url 'https://api.memories.ai/serve/api/v1/get_video_caption?video_no=<video_no>' \
  --header 'Authorization: <api-key>'
{
  "code": "0000",
  "msg": "success",
  "data": {},
  "success": true,
  "failed": false
}

Product: Visual Search
Use case: Upload videos and images, auto-index them, then search by natural language, image, or transcript phrase
Host: https://api.memories.ai/serve/api/v1
Auth: Authorization: sk-mavi-... (no Bearer prefix)

Retrieve the visual caption of a video — a chronological list of scene descriptions produced by the indexing pipeline. Each segment carries a time range and a natural-language description of what is happening on screen during that window. For the spoken-words transcription, use Get Audio Transcription.

Prerequisites

Endpoint

GET /serve/api/v1/get_video_caption

Request Example

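import requests

url = "https://api.memories.ai/serve/api/v1/get_video_caption"
# The API key is sent as-is, with no "Bearer" prefix.
headers = {"Authorization": "sk-mavi-..."}
params = {
    "video_no": "VI702915390254350336",  # required: ID returned by the upload API
    "unique_id": "pov_laundry_NkdWNOrQByo",  # optional: defaults to "default"
}
response = requests.get(url, headers=headers, params=params, timeout=30)
response.raise_for_status()  # surface HTTP-level errors before parsing the body
print(response.json())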

Query Parameters

video_no
string
required
The video identifier returned by the upload API.
unique_id
string
default:"default"
Namespace scoping the lookup to the account folder the video was uploaded under. Defaults to default.
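
As a rough sketch of how the two query parameters combine, the helpers below assemble the query string and fall back to the documented default namespace when unique_id is omitted. The names build_caption_params and build_caption_url are hypothetical and not part of the API; only the endpoint URL and parameter names come from this page.

```python
from urllib.parse import urlencode

# Endpoint taken from this page.
CAPTION_URL = "https://api.memories.ai/serve/api/v1/get_video_caption"


def build_caption_params(video_no, unique_id="default"):
    """Hypothetical helper: validate and assemble the query parameters."""
    if not video_no:
        raise ValueError("video_no is required")
    return {"video_no": video_no, "unique_id": unique_id}


def build_caption_url(video_no, unique_id="default"):
    """Hypothetical helper: return the request URL with an encoded query string."""
    query = urlencode(build_caption_params(video_no, unique_id))
    return f"{CAPTION_URL}?{query}"


print(build_caption_url("VI702915390254350336"))
```

The dictionary returned by build_caption_params can also be passed directly as params= to requests.get, which performs the same encoding.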

Response Example

{
  "code": "0000",
  "msg": "success",
  "data": {
    "videoNo": "VI702915390254350336",
    "transcriptions": [
      {
        "index": 0,
        "content": "A person's hand is seen holding a white object, possibly a phone, near a wooden door. The camera pans slightly to reveal a wall with coats hanging on a rack.",
        "startTime": "0",
        "endTime": "4"
      },
      {
        "index": 1,
        "content": "The camera moves past a wooden door, revealing a hallway with another door at the end. A laundry basket is visible on the right.",
        "startTime": "4",
        "endTime": "7"
      }
    ],
    "createTime": "1777047288621",
    "video_bucket": "mavi-resource",
    "video_blob": "VI702915390254350336.mp4"
  },
  "success": true,
  "failed": false
}
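
One common use of this response is looking up what is on screen at a given moment. The sketch below is a minimal illustration: caption_at is a hypothetical helper, and treating each segment as a half-open window (start inclusive, end exclusive) is an assumption made here so that adjacent segments do not overlap, not documented API behavior.

```python
def caption_at(transcriptions, second):
    """Return the segment whose [startTime, endTime) window contains `second`.

    Assumes half-open windows; returns None when no segment matches.
    """
    for segment in transcriptions:
        start = int(segment["startTime"])  # times arrive as numeric strings
        end = int(segment["endTime"])
        if start <= second < end:
            return segment
    return None


# Segments mirroring the response example above (content abbreviated).
sample = [
    {"index": 0, "content": "Hand holding a white object near a wooden door.",
     "startTime": "0", "endTime": "4"},
    {"index": 1, "content": "Camera moves past a wooden door into a hallway.",
     "startTime": "4", "endTime": "7"},
]

match = caption_at(sample, 5)
print(match["index"] if match else "no caption for that timestamp")
```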

Response Fields

code
string
Business status code. 0000 indicates success.
msg
string
Human-readable status message.
data.videoNo
string
Echo of the requested video identifier.
data.transcriptions
array
Ordered list of scene description segments covering the full video.
data.transcriptions[].index
integer
Zero-based index of the segment within the video.
data.transcriptions[].content
string
Natural-language description of what is visible on screen during this segment.
data.transcriptions[].startTime
string
Segment start time in seconds, returned as a string.
data.transcriptions[].endTime
string
Segment end time in seconds, returned as a string.
data.createTime
string
Upload-time timestamp of the underlying video, in milliseconds since epoch, returned as a string.
data.video_bucket
string
GCS bucket of the underlying video file. Omitted when the storage location cannot be resolved.
data.video_blob
string
GCS blob (object) path of the underlying video. Use it with video_bucket at GET /serve/api/v2/download?bucket=&blob= to fetch the file directly.
success
boolean
true when code == "0000".
failed
boolean
Inverse of success.
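
To illustrate how video_bucket and video_blob feed the download route mentioned above, here is a small sketch. It assumes the v2 download route lives on the same host as this endpoint (api.memories.ai), which this page does not state explicitly; build_download_url is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumption: the v2 download route is served from the same host as this endpoint.
DOWNLOAD_URL = "https://api.memories.ai/serve/api/v2/download"


def build_download_url(data):
    """Hypothetical helper: build the download URL from the `data` object.

    Returns None when the storage location is missing, since video_bucket
    is omitted when it cannot be resolved.
    """
    bucket = data.get("video_bucket")
    blob = data.get("video_blob")
    if not bucket or not blob:
        return None
    return f"{DOWNLOAD_URL}?{urlencode({'bucket': bucket, 'blob': blob})}"


data = {
    "videoNo": "VI702915390254350336",
    "video_bucket": "mavi-resource",
    "video_blob": "VI702915390254350336.mp4",
}
print(build_download_url(data))
```

The request to that URL would presumably need the same Authorization header as this endpoint; check the download endpoint's own page before relying on that.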

Notes & Limits

  • Availability: data.transcriptions is populated by the indexing pipeline. If the video has not yet reached status: PARSE, the call may return data: null or an empty transcriptions array — poll Get Metadata until parsing completes before depending on the result.
  • Numeric strings: startTime, endTime, and createTime are strings — cast with int(...) before arithmetic.
  • Rate limiting: Subject to the standard Visual Search rate limits. See Rate limits.
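
The first two notes can be handled together when post-processing a response. The following is a minimal sketch, not an official client: normalize_caption is a hypothetical helper that tolerates data: null or an empty transcriptions array and converts the numeric strings to int and datetime values.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def normalize_caption(payload):
    """Return (segments, created_at) from a decoded JSON response.

    segments is an empty list when captions are not available yet;
    created_at is None when createTime is absent.
    """
    data = payload.get("data") or {}  # covers both a missing key and null
    segments = [
        {
            "index": item["index"],
            "content": item["content"],
            "start": int(item["startTime"]),  # seconds, sent as strings
            "end": int(item["endTime"]),
        }
        for item in data.get("transcriptions") or []
    ]
    create_time = data.get("createTime")  # milliseconds since epoch, as a string
    created_at = (
        EPOCH + timedelta(milliseconds=int(create_time)) if create_time else None
    )
    return segments, created_at


segments, created_at = normalize_caption({"code": "0000", "data": None})
print(segments, created_at)
```

An empty segments list is the signal to keep polling Get Metadata, as described in the Availability note, before retrying this call.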

Authorizations

Authorization
string
header
required

Response

200 - application/json

Successful response

code
string
Example:

"0000"

msg
string
Example:

"success"

data
object
success
boolean
Example:

true

failed
boolean
Example:

false