Get Started
Create Your API Key
Generate one sk-mavi-... key that works across Visual Intelligence, Visual Search, and Visual Agents. Under 2 minutes.
Upload Your First Video
The Visual Search indexing pipeline at a glance — pick the right upload method and learn how to wait for the parse to finish.
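Waiting for a parse to finish usually comes down to a polling loop with a timeout. The sketch below is a minimal, generic version and not the platform's documented client: `get_status` is a hypothetical callable you would implement against whichever status endpoint the upload guide describes, and the status strings `"done"` and `"failed"` are placeholders, not real API values.

```python
import time
from typing import Callable


def wait_for_parse(
    get_status: Callable[[], str],
    done: frozenset = frozenset({"done"}),      # placeholder status names,
    failed: frozenset = frozenset({"failed"}),  # not taken from the API docs
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it reports a terminal state or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in done:
            return status
        if status in failed:
            raise RuntimeError(f"parse failed with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout_s} seconds")
        # Pause between checks so the status endpoint is not hammered.
        sleep(interval_s)
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays. If the platform offers webhooks for completion events, those would normally be preferable to polling.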
Products
Visual Intelligence
Stateless inference APIs. Direct REST calls for transcription, captioning, model inference, embeddings, live-stream moderation, and Human ReID. No persistence — you bring the data, we return the result.
Visual Search
Indexed video + image library. Upload videos and images once, the platform auto-indexes them, and you query by natural language, image similarity, or transcript phrase. State lives on the server.
Visual Agents
Pre-built agents + workflow APIs. Open-source video-searching and video-editing agents, plus managed services for queries, clip/edit/split, and screenplay extraction. Fork the agents, or call the managed endpoints directly.
Which One Do I Use?
| If you want to… | Use |
|---|---|
| Run AI analysis on a one-off video file you have | Visual Intelligence — Video Model APIs or Video Task APIs |
| Transcribe an audio or video file to text | Visual Intelligence — Audio File Transcription |
| Pull a transcript / caption from a YouTube / TikTok / Instagram / X link | Visual Intelligence — Social Media Scraping |
| Moderate a live RTMP stream in real time | Visual Intelligence — Live Video Content Moderation |
| Transcribe a live audio broadcast in real time | Visual Intelligence — Live Audio Transcription |
| Identify specific named people in a video | Visual Intelligence — Human ReID & Caption |
| Build a searchable video library you can query later | Visual Search — upload once, auto-indexed, query later |
| Find moments across all your uploaded videos by natural language | Visual Search — Search by Text |
| Build a video discovery / editing bot | Visual Agents — Video Searching Agent or Video Editing Agent |
| Extract storyboard / screenplay data from short drama episodes | Visual Agents — Screenplay Extraction |
| Drive the platform from the terminal | Memories CLI — see Tools below |
What’s Inside Each Product
Visual Intelligence
Stateless REST APIs on https://mavi-backend.memories.ai/serve/api/v2 (plus two specialty hosts; see Base URLs).
| Group | What it does |
|---|---|
| Asset Management | Upload / download / delete video and image assets used by other VI APIs |
| Social Media Scraping | Metadata, transcripts, captions, comments from YouTube, Instagram, TikTok, Twitter/X |
| Audio File Transcription | Whisper, ElevenLabs, AssemblyAI providers + speaker diarization / recognition |
| Live Audio Transcription | Real-time STT on live audio — server-pull (callback) or WebSocket (client-push) |
| Video Model APIs | Direct VLM calls with your own prompt — Gemini, Nova, Qwen |
| Video Task APIs | Pre-packaged tasks on top of VLMs — Video Frame Description, Video Summary |
| Live Video Content Moderation | NSFW / violence / logo detection on RTMP/RTSP streams |
| Live Video Understanding | Custom AI prompt continuously applied to a live RTMP stream |
| Image Model APIs | Direct ILM calls — Gemini, GPT, Nova, Qwen |
| Embeddings | Image / video / text embeddings for semantic search and retrieval |
| Human ReID & Caption | Identity-aware vision — caption a video with named people. Requires a dedicated key |
Visual Search
Indexed video + image library on https://api.memories.ai/serve/api/v1. Upload once, query forever.
| Group | What it does |
|---|---|
| Index Upload | Upload videos (from file, URL, or social-media creator handle) and images for indexing |
| Search | Semantic and keyword search across your private library and the public video library — by text, image, or transcript phrase |
| Library Management | List / get metadata / download / delete videos in your library |
Visual Agents
Reference implementations and managed workflow APIs on https://mavi-backend.memories.ai/serve/api/v2.
| Agent / Service | Open-source repo | Managed API |
|---|---|---|
| Video Searching Agent | video-searching-agent | POST /queries/stream |
| Video Editing Agent (VEA) | vea-open-source | POST /video/edit, /video/clip, /video/split |
| Screenplay Extraction | — (managed only) | POST /screenplay/tasks (async) |
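Because the three products live on two different base URLs, it can help to keep the bases in one place and derive full endpoint URLs from them. The sketch below does that with the Python standard library and builds, but does not send, a POST request. The base URLs and the `/queries/stream` path come from this page; the `Authorization` header format and the `{"query": ...}` payload are assumptions for illustration, so check the individual endpoint pages for the real request shape.

```python
import json
import urllib.request

# Base URLs as listed on this page.
BASE_URLS = {
    "visual-intelligence": "https://mavi-backend.memories.ai/serve/api/v2",
    "visual-search": "https://api.memories.ai/serve/api/v1",
    "visual-agents": "https://mavi-backend.memories.ai/serve/api/v2",
}


def endpoint(product: str, path: str) -> str:
    """Join a product's base URL and an endpoint path with exactly one slash."""
    return BASE_URLS[product].rstrip("/") + "/" + path.lstrip("/")


def build_post(product: str, path: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Prepare a JSON POST request without sending it."""
    return urllib.request.Request(
        endpoint(product, path),
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # ASSUMPTION: the key is sent in an Authorization header.
            "Authorization": api_key,
        },
    )


# Hypothetical payload; the real body schema is defined on the endpoint page.
req = build_post("visual-agents", "/queries/stream", {"query": "goal celebrations"}, "sk-mavi-...")
print(req.get_method(), req.full_url)
```

Sending the request would be a call to `urllib.request.urlopen(req)`; it is left out here so the example runs without network access or a real key.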
Tools
Memories CLI
Command-line tool for the entire ecosystem: uploads, searches, agent calls, and asset management from your terminal or shell scripts. Same sk-mavi-... key as the rest of the platform.
Memories CLI on GitHub
Install instructions, command reference, and source.
Memories.ai Console
Manage API keys, view usage and credits, configure webhooks.
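Since scripts and the CLI share the same key, a common pattern is to keep it out of source code and read it from the environment. The sketch below shows one way to do that. The variable name `MEMORIES_API_KEY` is hypothetical, and the prefix check assumes every key starts with `sk-mavi-` as shown on this page, which may not hold for dedicated keys such as the one Human ReID requires.

```python
import os
from typing import Mapping, Optional

KEY_PREFIX = "sk-mavi-"  # prefix shown on this page


def load_api_key(env: Optional[Mapping[str, str]] = None,
                 var: str = "MEMORIES_API_KEY") -> str:
    """Read the API key from the environment and sanity-check its prefix.

    `var` is a hypothetical variable name, not one defined by the platform.
    """
    env = os.environ if env is None else env
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError(f"{var} is not set")
    if not key.startswith(KEY_PREFIX):
        # The message deliberately omits the key itself.
        raise ValueError(f"{var} does not start with {KEY_PREFIX!r}")
    return key


def redact(key: str) -> str:
    """Return a log-safe form of the key that keeps only the public prefix."""
    return key[: len(KEY_PREFIX)] + "..."
```

For example, `redact(load_api_key())` gives a value that is safe to print in logs or bug reports.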
Billing
All three products share a unified billing system. Pricing varies per endpoint; see the pricing note at the top of each endpoint page, or the Memories.ai Console for your current usage and credits.
Compliance
Memories.ai is HIPAA, SOC 2 Type 2, and GDPR compliant. For more information, see the Trust Center.
