POST /serve/api/v1/search
Search from Private Library
curl --request POST \
  --url https://api.memories.ai/serve/api/v1/search \
  --header 'Authorization: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "search_param": "boat in the ocean",
  "search_type": "BY_CLIP",
  "unique_id": "default",
  "top_k": 100,
  "filtering_level": "medium",
  "video_nos": [
    "VI635764894954369024"
  ],
  "tag": "test1",
  "camera_tag": "Canon EOS 5D",
  "datetime_taken": "2025-10-20 11:00:00",
  "latitude": 88.88,
  "longitude": 88.88
}
'
{
  "code": "0000",
  "msg": "success",
  "data": [
    {
      "videoNo": "VI576925607808602112",
      "videoName": "1920447021987282945",
      "startTime": "13",
      "endTime": "18",
      "audio_ts": "...matched transcript text...",
      "score": 0.5221236659362116,
      "video_bucket": "mavi-resource",
      "video_blob": "VI576925607808602112.mp4",
      "keyframe_bucket": "mavi-keyframe",
      "keyframe_blob": "<uuid>/keyframe-000013.jpg"
    }
  ],
  "success": true,
  "failed": false
}

Documentation Index

Fetch the complete documentation index at: https://api-tools.memories.ai/llms.txt

Use this file to discover all available pages before exploring further.

Product: Visual Search
Use case: Upload videos and images, auto-index them, then search by natural language, image, or transcript phrase
Host: https://api.memories.ai/serve/api/v1
Auth: Authorization: sk-mavi-... (no Bearer prefix)
Use a natural-language query to retrieve the most semantically similar clip segments from your Private Video Library. This is the canonical “find me a moment” endpoint. For related operations: Search by Image (library-wide) · Search by Image (within a video) · Search by Transcript.

Prerequisites

Request Example

import requests

headers = {"Authorization": "sk-mavi-..."}

response = requests.post(
    "https://api.memories.ai/serve/api/v1/search",
    headers=headers,
    json={
        "search_param": "person walking on a beach at sunset",
        "search_type": "BY_CLIP",
        "unique_id": "default",
        "top_k": 10,
        "filtering_level": "medium"
    }
)
print(response.json())

Parameters

search_param
string
required
Natural-language search query. Must be non-empty.
search_type
string
default:"BY_CLIP"
What to return:
  • BY_CLIP — video clip segments (default). BY_VIDEO is an accepted alias.
  • BY_AUDIO — audio moments (semantic match against transcripts).
  • BY_CAPTION — caption-level vector search against the video_transcript table; the response shape is different (see BY_CAPTION response below).
Server-side enum is [BY_CLIP, BY_VIDEO, BY_AUDIO, BY_IMAGE, BY_CAPTION]. BY_IMAGE is exposed via Search by Image (within a video).
unique_id
string
default:"default"
Namespace scoping the search to a folder in your account.
top_k
integer
default:100
Maximum results to return.
  • For BY_CLIP / BY_AUDIO: range 1 – 1000.
  • For BY_CAPTION: range 1 – 200 (server-side default is 10 when top_k is null; otherwise the value you send is used).
filtering_level
string
Minimum similarity score:
  • low — score ≥ 0.15
  • medium — score ≥ 0.225
  • high — score ≥ 0.4
Omit to return all results regardless of score.
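The thresholds above can also be applied on the client, for example to fetch once without filtering_level and then tighten the cut locally. This is a minimal sketch, assuming the server uses exactly the "score ≥ threshold" rule listed above. The helper name filter_by_level and the sample items are hypothetical.

```python
# Thresholds copied from the filtering_level description above.
FILTERING_THRESHOLDS = {"low": 0.15, "medium": 0.225, "high": 0.4}


def filter_by_level(items, level=None):
    """Keep items whose score meets the threshold for `level`.

    `level=None` mirrors omitting filtering_level: everything is kept.
    """
    if level is None:
        return list(items)
    threshold = FILTERING_THRESHOLDS[level]  # KeyError on an unknown level
    return [item for item in items if item["score"] >= threshold]


# Hypothetical result items; only the fields used here are included.
results = [
    {"videoNo": "A", "score": 0.52},
    {"videoNo": "B", "score": 0.30},
    {"videoNo": "C", "score": 0.10},
]

for level in (None, "low", "medium", "high"):
    kept = [item["videoNo"] for item in filter_by_level(results, level)]
    print(level, kept)
```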
video_nos
array
Restrict search to these specific videos. Accepts up to 100 video IDs.
tag
string
Filter to content carrying this tag.
camera_tag
string
Filter by camera model (must match camera_model set at upload time).
datetime_taken
string
Filter to content captured at or after this time. Format: yyyy-MM-dd HH:mm:ss.
latitude
number
GPS latitude filter. Must be paired with longitude.
longitude
number
GPS longitude filter. Must be paired with latitude.
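
The constraints listed above can be checked before a request is sent. The following is a sketch of a hypothetical helper, build_search_payload, that enforces the documented rules and leaves out optional fields that were not set. It assumes BY_VIDEO shares the BY_CLIP range for top_k because it is described as an alias. It leaves out BY_IMAGE because that mode is exposed through a separate endpoint.

```python
from datetime import datetime

# Upper bound for top_k per search_type, from the parameter descriptions.
# BY_VIDEO is assumed to share the BY_CLIP limit (it is documented as an alias).
TOP_K_MAX = {"BY_CLIP": 1000, "BY_VIDEO": 1000, "BY_AUDIO": 1000, "BY_CAPTION": 200}


def build_search_payload(search_param, search_type="BY_CLIP", unique_id="default",
                         top_k=None, filtering_level=None, video_nos=None,
                         tag=None, camera_tag=None, datetime_taken=None,
                         latitude=None, longitude=None):
    """Validate arguments against the documented limits and return a JSON body."""
    if not search_param or not search_param.strip():
        raise ValueError("search_param must be non-empty")
    if search_type not in TOP_K_MAX:
        raise ValueError(f"unsupported search_type for this sketch: {search_type}")
    if top_k is not None and not 1 <= top_k <= TOP_K_MAX[search_type]:
        raise ValueError(f"top_k must be in 1..{TOP_K_MAX[search_type]} for {search_type}")
    if filtering_level not in (None, "low", "medium", "high"):
        raise ValueError("filtering_level must be low, medium, or high")
    if video_nos is not None and len(video_nos) > 100:
        raise ValueError("video_nos accepts up to 100 video IDs")
    if (latitude is None) != (longitude is None):
        raise ValueError("latitude and longitude must be supplied together")
    if datetime_taken is not None:
        # Python equivalent of yyyy-MM-dd HH:mm:ss; raises ValueError on mismatch.
        datetime.strptime(datetime_taken, "%Y-%m-%d %H:%M:%S")

    payload = {"search_param": search_param, "search_type": search_type,
               "unique_id": unique_id}
    optional = {"top_k": top_k, "filtering_level": filtering_level,
                "video_nos": video_nos, "tag": tag, "camera_tag": camera_tag,
                "datetime_taken": datetime_taken,
                "latitude": latitude, "longitude": longitude}
    # Send only the optional fields the caller actually set.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload


payload = build_search_payload("boat in the ocean", top_k=10, filtering_level="medium")
print(payload)
```

The returned dictionary can be passed as the json argument of requests.post, as in the request example above.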

Response

{
    "code": "0000",
    "msg": "success",
    "data": [
        {
            "videoNo": "VI576925607808602112",
            "videoName": "1920447021987282945",
            "startTime": "13",
            "endTime": "18",
            "audio_ts": "the sun was setting over the water",
            "score": 0.5221236659362116,
            "video_bucket": "mavi-resource",
            "video_blob": "VI576925607808602112.mp4",
            "keyframe_bucket": "mavi-keyframe",
            "keyframe_blob": "<uuid>/keyframe-000013.jpg"
        }
    ],
    "success": true,
    "failed": false
}
data[].videoNo
string
Unique identifier of the matched video.
data[].videoName
string
Internal stored name of the video.
data[].startTime
string
Start time of the matched segment, in seconds.
data[].endTime
string
End time of the matched segment, in seconds.
data[].audio_ts
string
Transcript text. Populated for BY_AUDIO searches when transcription is available.
data[].score
number
Relevance score. Higher is more relevant.
data[].video_bucket
string
GCS bucket of the original video file. Omitted when the storage location cannot be resolved.
data[].video_blob
string
GCS blob (object) path of the original video. Pass it together with video_bucket to GET /serve/api/v2/download?bucket=&blob= to fetch the file directly.
data[].keyframe_bucket
string
GCS bucket of the matched keyframe image (BY_CLIP only).
data[].keyframe_blob
string
GCS blob (object) path of the matched keyframe image.
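
Because startTime and endTime arrive as strings and the storage fields may be omitted, a small normalisation step is often useful. This sketch uses a hypothetical helper, clip_summary, to convert the times to numbers and build a download URL from video_bucket and video_blob. It assumes the download path is served from the same origin as the search endpoint, which this page does not state.

```python
from urllib.parse import urlencode

# Assumption: the v2 download path lives on the same origin as the search API.
API_ORIGIN = "https://api.memories.ai"


def clip_summary(item):
    """Convert one clip result into numeric times plus an optional download URL."""
    start = float(item["startTime"])  # documented as a string of seconds
    end = float(item["endTime"])
    summary = {
        "video_no": item["videoNo"],
        "start_s": start,
        "end_s": end,
        "duration_s": end - start,
        "score": item["score"],
        "download_url": None,
    }
    # video_bucket is omitted when the storage location cannot be resolved.
    if item.get("video_bucket") and item.get("video_blob"):
        query = urlencode({"bucket": item["video_bucket"], "blob": item["video_blob"]})
        summary["download_url"] = f"{API_ORIGIN}/serve/api/v2/download?{query}"
    return summary


# Item taken from the example response above.
example_item = {
    "videoNo": "VI576925607808602112",
    "startTime": "13",
    "endTime": "18",
    "score": 0.5221236659362116,
    "video_bucket": "mavi-resource",
    "video_blob": "VI576925607808602112.mp4",
}
print(clip_summary(example_item))
```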

BY_CAPTION Response

When search_type=BY_CAPTION, the endpoint performs a vector similarity search over the video_transcript table and returns a different item shape:
{
    "code": "0000",
    "msg": "success",
    "data": [
        {
            "video_no": "VI576925607808602112",
            "text": "the sun was setting over the water",
            "vector": [0.012, -0.034, 0.005, "..."],
            "user_id": "<md5 of internal user key>",
            "start_time": 12.5,
            "end_time": 18.7,
            "score": 0.82
        }
    ],
    "success": true,
    "failed": false
}
data[].video_no
string
Unique identifier of the matched video.
data[].text
string
The caption segment text that matched.
data[].vector
array
Stored embedding vector of the matched caption row. Dimensionality is set by the embedding model and may be hundreds of floats — response size can grow accordingly.
data[].user_id
string
MD5 hash of the internal user namespace key (consistent across calls for the same unique_id).
data[].start_time
number
Start time of the matched caption, in seconds.
data[].end_time
number
End time of the matched caption, in seconds.
data[].score
number
Similarity score (1 - distance). Higher is more similar.
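
Since each BY_CAPTION item carries its full embedding, callers who only need the text and timing may want to drop the vector field before logging or caching results. This is a minimal sketch. The helper name compact_caption_items is hypothetical and the sample item is abbreviated from the response above.

```python
def compact_caption_items(items):
    """Return copies of BY_CAPTION items without the bulky `vector` field."""
    return [
        {key: value for key, value in item.items() if key != "vector"}
        for item in items
    ]


# Abbreviated sample item; a real vector may hold hundreds of floats.
caption_items = [
    {
        "video_no": "VI576925607808602112",
        "text": "the sun was setting over the water",
        "vector": [0.012, -0.034, 0.005],
        "start_time": 12.5,
        "end_time": 18.7,
        "score": 0.82,
    }
]

for item in compact_caption_items(caption_items):
    print(item["video_no"], item["start_time"], item["end_time"], item["text"])
```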

Notes & Limits

  • Rate limiting: Exceeding the per-account rate limit returns an error. See Rate limits.
  • Billing: Each successful call deducts credits from your account balance.

Authorizations

Authorization
string
header
required

Body

application/json
search_param
string
required

Natural-language search query. Must be non-empty.

Example:

"boat in the ocean"

search_type
enum<string>
default:BY_CLIP

Search modality. BY_VIDEO is treated as BY_CLIP internally. BY_CAPTION performs vector search over the video_transcript table and returns a different item shape (see response).

Available options:
BY_VIDEO,
BY_CLIP,
BY_AUDIO,
BY_IMAGE,
BY_CAPTION
Example:

"BY_CLIP"

unique_id
string
default:default

Scope/folder identifier for the authenticated account.

top_k
integer
default:100

Maximum number of results to return. Range 1-1000 for BY_CLIP/BY_AUDIO/BY_IMAGE. For BY_CAPTION the range is 1-200 (server-side default is 10 when null).

Required range: 1 <= x <= 1000
filtering_level
enum<string>

Similarity-score filter. low=0.15, medium=0.225, high=0.4.

Available options:
low,
medium,
high
Example:

"medium"

video_nos
string[]

Optional list of video numbers to restrict the search to. Max 100.

Maximum array length: 100
Example:
["VI635764894954369024"]
tag
string

Optional tag filter.

Example:

"test1"

camera_tag
string

Optional camera/device model filter. Matches the camera_model supplied at upload time.

Example:

"Canon EOS 5D"

datetime_taken
string

Optional capture-time filter in format yyyy-MM-dd HH:mm:ss.

Example:

"2025-10-20 11:00:00"

latitude
number<double>

Optional latitude filter. Must be supplied together with longitude.

Example:

88.88

longitude
number<double>

Optional longitude filter. Must be supplied together with latitude.

Example:

88.88

Response

200 - application/json

Successful response

Response shape depends on search_type:
  • BY_CLIP / BY_VIDEO / BY_AUDIO: data is an array of video-search items (carries video_bucket/video_blob and, for BY_CLIP, keyframe_bucket/keyframe_blob).
  • BY_IMAGE: data is a paginated image-search object (items carry bucket/blob).
  • BY_CAPTION: data is an array of caption-search items carrying the embedding vector, text, user_id, and time range.

code
string
Example:

"0000"

msg
string
Example:

"success"

data
object
success
boolean
Example:

true

failed
boolean
Example:

false
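
The envelope fields above can be checked before reading data. This is a sketch only: this page documents the success case (code "0000", success true), so the sketch assumes failures reuse the same envelope with a different code. SearchError and unwrap are hypothetical names.

```python
class SearchError(Exception):
    """Raised when the response envelope does not report success."""


def unwrap(body):
    """Return `data` from a parsed response body, or raise SearchError."""
    if body.get("success") is True and body.get("code") == "0000":
        return body.get("data")
    # Assumption: error responses carry `code` and `msg` like successful ones.
    raise SearchError(f"search failed: code={body.get('code')!r} msg={body.get('msg')!r}")


ok_body = {"code": "0000", "msg": "success", "data": [], "success": True, "failed": False}
print(unwrap(ok_body))
```

With requests, this would typically be called as unwrap(response.json()).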