Image Caption

Product: Visual Intelligence - Human ReID & Caption
Use case: Identity-aware vision analysis. Generate captions for videos and images, with optional reference photos to name specific people in the output (Human Re-identification).
Host: https://security.memories.ai
Auth: Dedicated API key required. Contact support@memories.ai.

Analyze images and generate natural-language captions or descriptions. Supports single and batch inputs via URL, local file upload, or base64 encoding.

Base URL: https://security.memories.ai

Access to this API requires a dedicated API key separate from the standard Memories.ai key. Contact support@memories.ai to request access.

Endpoints

Method	Endpoint	Input method
`POST`	`/v1/understand/uploadImg`	URL (single or batch)
`POST`	`/v1/understand/uploadImgFile`	Local file (single or batch, multipart)
`POST`	`/v1/understand/uploadImgFileBase64`	Base64-encoded (single or batch)

Supported formats: image/png, image/jpeg

Request Examples

import requests

url = "https://security.memories.ai/v1/understand/uploadImg"
headers = {"Authorization": "sk-mavi-..."}

payload = {
    "url": "https://example.com/photo.jpg",
    "user_prompt": "What is happening in this image?",
    "system_prompt": "You are an image understanding system.",
    "thinking": False
}

response = requests.post(url, headers=headers, json=payload)
print(response.json())
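
For batch requests, the `url` field can carry a list of image URLs instead of a single string (see Parameters). The following is a minimal sketch of building such a request body. The helper name `build_url_payload` is hypothetical and not part of the API, and the example image URLs are placeholders.

```python
def build_url_payload(urls, user_prompt, system_prompt=None, thinking=False):
    """Build a JSON body for /v1/understand/uploadImg (hypothetical helper)."""
    if isinstance(urls, str):
        # Single image: send the URL as a plain string.
        url_field = urls
    else:
        # Batch: send the URLs as an array.
        url_field = list(urls)
        if not url_field:
            raise ValueError("at least one image URL is required")
    payload = {
        "url": url_field,
        "user_prompt": user_prompt,
        "thinking": thinking,
    }
    # system_prompt is optional, so include it only when provided.
    if system_prompt is not None:
        payload["system_prompt"] = system_prompt
    return payload


payload = build_url_payload(
    ["https://example.com/photo1.jpg", "https://example.com/photo2.jpg"],
    "What is happening in these images?",
)
# The body is sent the same way as in the single-image example:
# requests.post(url, headers=headers, json=payload)
```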

Parameters

url

string | array

Single image URL (string) or list of image URLs (array). Used for /uploadImg only.

user_prompt

string

required

Instruction for the analysis — e.g. "What is happening in this image?".

system_prompt

string

Role or context for the AI — e.g. "You are an image understanding system.".

thinking

boolean

default:"false"

Enable reasoning mode for more detailed analysis.

reasoning_effort

string

Only applies when thinking is true. Level 1–10; higher values use more tokens. Default -1 (model decides).

boolean

default:"false"

true for Q&A / chat style; false for caption / information retrieval style.

image_base64

string

Base64-encoded single image. Used with /uploadImgFileBase64. Requires img_type.

images

array

Batch base64 input. Each item: { "image_base64": "...", "img_type": "image/png" }.

img_type

string

MIME type of the base64 image — "image/png" or "image/jpeg". Required when using image_base64.
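
The base64 parameters above can be assembled as follows. This is a sketch under assumptions, not a confirmed client: the helper names `encode_image` and `build_base64_payload` are hypothetical, and it assumes the API expects plain base64 text without a `data:` URI prefix and that a single image places `image_base64` and `img_type` at the top level of the body.

```python
import base64

# Formats listed under "Supported formats".
SUPPORTED_TYPES = {"image/png", "image/jpeg"}


def encode_image(image_bytes, img_type):
    """Return one {image_base64, img_type} item (hypothetical helper)."""
    if img_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported img_type: {img_type}")
    return {
        # Assumption: plain base64 text, no "data:image/...;base64," prefix.
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
        "img_type": img_type,
    }


def build_base64_payload(items, user_prompt, thinking=False):
    """Build a body for /v1/understand/uploadImgFileBase64.

    items is a list of (image_bytes, img_type) pairs.
    """
    encoded = [encode_image(data, img_type) for data, img_type in items]
    if not encoded:
        raise ValueError("at least one image is required")
    payload = {"user_prompt": user_prompt, "thinking": thinking}
    if len(encoded) == 1:
        # Single image: image_base64 and img_type as top-level fields.
        payload.update(encoded[0])
    else:
        # Batch: one object per image in the images array.
        payload["images"] = encoded
    return payload
```

To encode a file from disk, read it in binary mode first, for example `open("photo.jpg", "rb").read()` (the file name is a placeholder), and pass the bytes together with the matching MIME type.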

Response

{
  "code": 0,
  "msg": "success",
  "data": {
    "text": "A person is sitting at a desk reviewing documents in a well-lit office.",
    "token": {
      "input": 273,
      "output": 79,
      "total": 352
    }
  }
}

Response Fields

code

integer

0 = success, -1 = failure.

data.text

string

Generated caption or descriptive text.

data.token.input

integer

Number of input tokens consumed.

data.token.output

integer

Number of output tokens generated.

data.token.total

integer

Total token count.

model_time

integer

Model processing time in milliseconds.

upload_time

integer

Upload time in milliseconds.
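
A caller will usually check `code` before reading `data.text`. The sketch below shows one way to do that on an already-decoded response body. The helper name `extract_caption` is hypothetical, and raising `RuntimeError` on a non-zero `code` is an illustrative choice rather than documented behavior.

```python
def extract_caption(body):
    """Return (caption, total_tokens) from a decoded response body.

    Hypothetical helper: raises RuntimeError when code is not 0.
    """
    if body.get("code") != 0:
        raise RuntimeError(f"caption request failed: {body.get('msg')}")
    data = body.get("data") or {}
    token = data.get("token") or {}
    # token.total may be absent, in which case None is returned for it.
    return data["text"], token.get("total")


# Sample body copied from the Response section.
sample = {
    "code": 0,
    "msg": "success",
    "data": {
        "text": "A person is sitting at a desk reviewing documents in a well-lit office.",
        "token": {"input": 273, "output": 79, "total": 352},
    },
}

caption, total_tokens = extract_caption(sample)
```

With a live request, `body` would be the result of `response.json()` from the request example.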
