Product: Visual Agents (composing Visual Search + Visual Intelligence endpoints)
Repo: Memories-ai-labs/examples (MIT-licensed, one Jupyter notebook per agent)
Auth: Authorization: sk-mavi-... (no Bearer prefix)

This page maps every Visual Agents pattern to a runnable Jupyter notebook in the Memories-ai-labs/examples repo. Each notebook is self-contained — no shared library imports, every API call inlined with comments explaining the wire shape and gotchas.

Why notebooks?

A customer evaluating Memories.ai wants to see the actual request body, the actual response, and the reasoning step in context — not eight layers of abstraction. Each notebook interleaves markdown (what & why) with code (the API call) and the live response, so it reads top-to-bottom as a walkthrough you can re-run.

Quickstart

```bash
git clone https://github.com/Memories-ai-labs/examples.git
cd examples
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

export MEMORIES_API_KEY=sk-mavi-...

jupyter lab notebooks/
```

Open 00_search_api_overview.ipynb and click Run All. Real API hits come back in seconds.
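
The auth scheme noted at the top of this page (the raw key in the Authorization header, with no Bearer prefix) is easy to get wrong out of habit. Below is a minimal sketch of the header wiring, assuming the key comes from the MEMORIES_API_KEY variable exported above. The helper name build_headers is hypothetical and not taken from the notebooks.

```python
import os


def build_headers(api_key: str) -> dict:
    """Return request headers carrying the raw key (no "Bearer " prefix)."""
    key = api_key.strip()
    if key.lower().startswith("bearer "):
        # Tolerate a habitual prefix by stripping it before sending.
        key = key[len("bearer "):].strip()
    if not key:
        raise ValueError("API key is empty; export MEMORIES_API_KEY first")
    return {"Authorization": key}


def headers_from_env() -> dict:
    """Read MEMORIES_API_KEY from the environment and build the headers."""
    return build_headers(os.environ.get("MEMORIES_API_KEY", ""))
```

Passing the result as the headers argument of a requests call keeps the key out of the notebook source.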

The notebooks

Open them in this order if you’re new to the platform — the search-API overview is the foundation every agent notebook builds on.

00. Search API Overview

Every /search variant — semantic BY_CLIP, BY_AUDIO, exact-phrase transcripts, search by tag, by camera, time-windowed, plus composable filters. Read this first.

01. SOP Compliance — QSR

Verify drive-thru staff handoffs, greetings, drink inclusion. Search candidate moments → VLM verify each with a strict JSON schema.

02. Service Quality

Service-event timeline from a floor cam → table touches, inter-course time, bounce count. No VLM needed — just multiple semantic searches.

03. Security & Threat

Six scenarios (shoplifting, scanner bypass, masked entry, slip-and-fall, restricted-area breach, altercation) with severity-tagged incident log.

04. SOP Compliance — Auto

Same SOP pattern as 01, but for an automotive service bay. Per-arrival audit: greeting within 60s? Air filter checked?

05. Video Searching Agent

Discover videos across YouTube, TikTok, Instagram, X via the managed /queries/stream SSE endpoint. Includes an inline SSE parser.

06. Video Editing (VEA)

Async highlight-reel pipeline via /video/clip + /video/edit — final asset arrives at your configured webhook.

07. Personal Memory (LUCI)

Date-windowed natural-language questions over personal recordings. Combines datetime_taken filter + transcript lookup + VLM identification.

08. Visual RAG

Two-channel retrieve (BY_CLIP + BY_AUDIO) → merge overlapping time-ranges → VLM-verify each candidate with citations.

09. Creator Intelligence

Per-video VLM scoring (production, audio, delivery, hook, brand safety) → aggregated creator scorecard with recommendation.
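
The "merge overlapping time-ranges" step in notebook 08 is the usual interval-merge pattern. Here is a minimal sketch, assuming each hit from the two channels has already been reduced to a (start, end) pair in seconds. The tuple shape, the merge_ranges name, and the gap parameter are illustrative assumptions, not the notebook's actual code.

```python
from typing import List, Tuple

Range = Tuple[float, float]  # (start_seconds, end_seconds), an assumed shape


def merge_ranges(ranges: List[Range], gap: float = 0.0) -> List[Range]:
    """Merge ranges that overlap or lie within `gap` seconds of each other."""
    merged: List[Range] = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + gap:
            # Overlaps (or nearly touches) the previous range: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged
```

Merging before the VLM step means each stretch of footage is verified once, even when both BY_CLIP and BY_AUDIO surface it.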

How each notebook is structured

Every notebook follows the same shape:

| Section | Content |
| --- | --- |
| Title + use case | Markdown: what the agent does, when to use it |
| Setup | One code cell with import requests, host config, API key wiring |
| Helper functions | Code cells: one helper per endpoint with a docstring explaining the wire shape and gotchas (e.g. Gemini’s choices[].text envelope, code=0001 transient retry, code-fence stripping) |
| Step 1: Retrieve | Markdown → code cell with the search call → real output |
| Step 2: Reason | Markdown → code cell with the VLM call → real JSON output |
| Step 3: Aggregate | Markdown → domain-specific aggregation code |
| Where to go next | Markdown: how to customize, what to swap out for production |

Helper functions are inlined rather than imported so each notebook stands alone. The wire-shape comments are the point — they’re the patterns you copy into your own code.
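
Two of the gotchas named above, code-fence stripping and the code=0001 transient retry, can be sketched without touching the network. The sketch rests on several assumptions: the model output sits at choices[0]["text"], the transient code arrives as a top-level "code" field with the string value "0001", and the helper names are hypothetical. Check the notebooks for the real wire shape.

```python
import json
import re
import time
from typing import Callable

FENCE = "`" * 3  # three backticks, the markdown fence marker
_FENCED = re.compile(
    r"^" + FENCE + r"[A-Za-z0-9_-]*\s*\n(.*?)\n?" + FENCE + r"\s*$", re.DOTALL
)


def strip_code_fence(text: str) -> str:
    """Return the body of a fenced block, or the text unchanged if unfenced."""
    text = text.strip()
    match = _FENCED.match(text)
    return match.group(1).strip() if match else text


def parse_vlm_json(response: dict) -> dict:
    """Pull the model's JSON answer out of the response envelope."""
    # ASSUMPTION: the first choice holds the raw model output under "text".
    raw = response["choices"][0]["text"]
    return json.loads(strip_code_fence(raw))


def call_with_retry(call: Callable[[], dict], retries: int = 3, delay: float = 1.0) -> dict:
    """Re-issue `call()` while the body reports the transient code "0001"."""
    body = call()
    for _ in range(retries):
        # ASSUMPTION: the transient marker is a top-level "code" field.
        if str(body.get("code")) != "0001":
            break
        time.sleep(delay)
        body = call()
    return body
```

Keeping the retry wrapper separate from the parser lets the same wrapper guard any endpoint helper, not only the VLM call.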

The shared ReAct skeleton

Notebooks 01, 03, 04, 07, and 08 walk the same four-step loop:

| Step | Endpoint(s) | What it does |
| --- | --- | --- |
| 1. Index | POST /upload, GET /get_metadata (poll until status=PARSE) | Get the footage into your private library and wait for indexing |
| 2. Retrieve | POST /search (semantic, BY_CLIP / BY_AUDIO), GET /search_audio_transcripts (exact phrase) | Find candidate moments by natural language |
| 3. Reason | POST /vu/chat/completions (Gemini / Qwen / Nova VLM) | Verify each candidate, extract structured facts |
| 4. Loop | Per-notebook control flow over steps 2 & 3 | Iterate until the answer is grounded |

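
Steps 2 to 4 of the loop above can be sketched as plain control flow with the two API calls injected as functions. This is a hedged outline, not any notebook's actual code. The names run_agent, search, and verify are hypothetical, as is the "confirmed" key in the verdict. In a real run, search would wrap POST /search and verify would wrap POST /vu/chat/completions.

```python
from typing import Callable, Dict, List

Hit = Dict[str, object]      # one search result; field names are assumptions
Verdict = Dict[str, object]  # structured VLM answer; "confirmed" is hypothetical


def run_agent(
    queries: List[str],
    search: Callable[[str], List[Hit]],
    verify: Callable[[Hit], Verdict],
    max_candidates: int = 5,
) -> List[Dict[str, object]]:
    """Retrieve candidates per query, verify each, keep only grounded findings."""
    findings: List[Dict[str, object]] = []
    for query in queries:                          # Step 4: loop over questions
        for hit in search(query)[:max_candidates]:  # Step 2: retrieve
            verdict = verify(hit)                   # Step 3: reason
            if verdict.get("confirmed"):
                findings.append({"query": query, "hit": hit, "verdict": verdict})
    return findings
```

Injecting the two calls keeps the loop testable with stub functions before any footage is indexed.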
The three odd-shaped notebooks:
  • 05 — Video Searching wraps the managed /queries/stream SSE endpoint. The agentic loop runs server-side; the notebook parses typed events (started, progress, tool_call, tool_result, error, complete).
  • 06 — Video Editing (VEA) is fire-and-forget against async /video/edit. The final asset_id arrives at your webhook URL (configure at api-platform.memories.ai/webhooks).
  • 09 — Creator Intelligence doesn’t use /search at all — it takes a list of video URLs and runs N VLM calls in series.
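
For notebook 05, the typed events can be read with a small server-sent-events parser. The sketch below follows the standard SSE framing (event: and data: fields, with a blank line ending each event) and assumes each data payload is JSON, which this page does not confirm. The parse_sse name is hypothetical and this is not the notebook's inline parser.

```python
import json
from typing import Dict, Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, Dict]]:
    """Yield (event_type, payload) pairs from already-decoded SSE lines."""
    event = "message"  # SSE default when no event: field is sent
    data: List[str] = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip(" "))
    if data:
        # Lenient: emit a final event even if the stream ended without a blank line.
        yield event, json.loads("\n".join(data))
```

With requests, the line source could be response.iter_lines(decode_unicode=True) on a request made with stream=True. A caller would then branch on the event type (started, progress, tool_call, tool_result, error, complete).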

Hosting videos for the VLM

The Memories.ai VLM endpoint needs a publicly fetchable file_uri. The Visual Search /download endpoint streams the raw bytes back to you — it does not return a hosted URL. To wire up the full loop (index → search → VLM), you must bridge that gap yourself. Each notebook uses a MEDIA_URL_MAP dict (or the MEMORIES_MEDIA_URL_TEMPLATE env var). For demo runs, the notebooks default to mapping the seed video to the public test asset (test_1min.mp4) so the VLM step works out of the box. For production, replace those entries with your own CDN URLs.
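
The bridging step described above amounts to a lookup from an indexed video's identifier to a public URL. Here is a minimal sketch, assuming the map is keyed by a video identifier and the template env var uses a {video_no} placeholder. Both are assumptions, since this page does not give the key or template format, and resolve_media_url is a hypothetical name.

```python
import os
from typing import Dict, Optional

# Placeholder map; in production, fill this with your own CDN URLs.
MEDIA_URL_MAP: Dict[str, str] = {}


def resolve_media_url(
    video_no: str,
    url_map: Optional[Dict[str, str]] = None,
    template: Optional[str] = None,
) -> str:
    """Return a publicly fetchable URL for a video, via the map or the template."""
    url_map = MEDIA_URL_MAP if url_map is None else url_map
    if video_no in url_map:
        return url_map[video_no]
    # ASSUMPTION: the template contains a "{video_no}" placeholder.
    template = template or os.environ.get("MEMORIES_MEDIA_URL_TEMPLATE")
    if template:
        return template.format(video_no=video_no)
    raise KeyError(f"No public URL configured for video {video_no!r}")
```

Raising on a missing mapping, rather than sending the VLM a URL it cannot fetch, surfaces the configuration gap before the reasoning step runs.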

Verification

Two notebooks were executed end-to-end via jupyter nbconvert --execute against api.memories.ai:
  • 00_search_api_overview.ipynb — BY_CLIP returned 5 real hits with scores, BY_AUDIO returned 3 transcript hits with spoken-words snippets, /search_audio_transcripts returned 3 LIKE matches, and the tag / camera / datetime filters all returned 0 cleanly (expected — this test account has no matching content).
  • 09_creator_intelligence.ipynb — real Gemini call returned {production_quality: 85, audio_quality: 10, delivery: 0, hook_strength: 70, brand_safety: 100} for the silent public test video.
The other notebooks reuse API call patterns previously live-verified in the same repo’s history — same Gemini envelope, same SSE event types, same /video/edit webhook contract.