Rate Limit

Rate Limit Overview

To ensure platform stability and fair usage, the Memories.ai API enforces rate limits that depend on the type of endpoint you call.

Rate limits are applied per account. All API keys under the same account share the same rate limit quota.

Rate Limit Tiers

Standard APIs

1 QPS (queries per second)

Applies to most API endpoints, including:

Video / Audio / Image Upload
Transcription
Embeddings Generation
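Because the 1 QPS quota is shared by every API key on the account, it can help to route all standard-API calls in a process through one client-side throttle. The following is a minimal sketch, not part of any Memories.ai SDK: `MinIntervalThrottle` is a hypothetical helper, and its `clock` and `sleep` parameters are assumptions added only so the class can be tested without real waiting.

```python
import threading
import time


class MinIntervalThrottle:
    """Spaces calls at least `interval` seconds apart (hypothetical helper)."""

    def __init__(self, interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_allowed = 0.0  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed; return the seconds slept."""
        with self._lock:
            now = self._clock()
            delay = max(0.0, self._next_allowed - now)
            # Reserve the next slot before sleeping so concurrent callers queue up.
            self._next_allowed = max(now, self._next_allowed) + self.interval
        if delay > 0:
            self._sleep(delay)
        return delay
```

Calling `throttle.wait()` immediately before each request keeps one process at or below 1 QPS. It cannot see traffic from other processes or machines using the same account, so handling 429 responses is still necessary.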

Scraping & Task APIs

Varies by endpoint and channel

Applies to scraping and long-running task endpoints, including:

YouTube / TikTok / Instagram / Twitter Scraping
Async task-based processing

Understanding Models

Model Provider Rate Limit

Rate limits follow the underlying model provider’s own limits:

Video Understanding Models (VLM)
Image Understanding Models (ILM)

Stream Processing

Concurrent Stream Limit

Limited by the maximum number of concurrent streams per account (video + audio combined).

Detailed Rate Limits by Endpoint

Video Processing — 1 QPS

Endpoint	Rate Limit
Upload File	1 QPS
Get Upload Signed URL	1 QPS
Upload File Using Signed URL	1 QPS
Edit Video	1 QPS
Scene Detection	1 QPS
Split Video	1 QPS
Extract Video Frames	1 QPS
Get Asset Metadata	1 QPS
Download Asset	1 QPS
Delete Asset	1 QPS

Transcription — 1 QPS

Endpoint	Rate Limit
Sync Generate Audio Transcription	1 QPS
Sync Generate Speaker	1 QPS
Async Generate Video Description	1 QPS
Async Generate Audio Transcription	1 QPS
Async Generate Speaker	1 QPS
Async Generate Summary	1 QPS
Speaker Recognition	1 QPS

Embeddings — 1 QPS

Endpoint	Rate Limit
Generate Video Embedding	1 QPS
Generate Image Embedding	1 QPS
Generate Text Embedding	1 QPS

Stream Processing — Concurrent Stream Limit

Access Required: Stream processing features are not enabled by default. Please contact sales to enable stream processing for your account.

Stream processing endpoints are limited by the maximum number of concurrent streams per account (video + audio combined), rather than QPS.

Endpoint	Rate Limit
Start Video Stream Moderation	Max N concurrent streams
Stop Video Stream Moderation	No Limit
Start Audio Stream Transcription	Max N concurrent streams
Stop Audio Stream Transcription	No Limit

When the server capacity is reached, the API returns status code 16 (Capacity Reached). Please retry later or contact sales for a higher concurrent stream limit.
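Since the limit counts streams rather than requests, a client can track how many streams it has open and refuse to start another once its quota is used. The sketch below is one way to do that with a semaphore. `StreamSlots` is a hypothetical helper, and `max_streams` stands in for the account's actual limit (the "N" in the table), which this page does not state.

```python
import threading
from contextlib import contextmanager


class StreamSlots:
    """Caps locally started streams at `max_streams` (video + audio combined)."""

    def __init__(self, max_streams):
        # max_streams is an assumption: use the limit agreed for your account.
        self._slots = threading.BoundedSemaphore(max_streams)

    @contextmanager
    def slot(self, timeout=None):
        """Hold one stream slot for the duration of the `with` block."""
        if not self._slots.acquire(timeout=timeout):
            raise TimeoutError("no free stream slot")
        try:
            yield
        finally:
            # Always free the slot, even if the stream fails midway.
            self._slots.release()
```

The start and stop calls for a stream would go inside `with slots.slot():`, so a slot is freed only after the stream has been stopped. This guards only against over-subscription from one process; the server can still report that capacity has been reached.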

Social Media Scraping

Rate limits for scraping endpoints vary by endpoint type and the channel parameter used.

Metadata & Transcript Endpoints

These endpoints accept a channel parameter (rapid / memories.ai / apify). Rate limits are enforced per channel. QPM and QPH denote queries per minute and queries per hour.

Endpoint	Channel	Rate Limit
YouTube Video Metadata	`rapid`	12 QPH
YouTube Video Metadata	`memories.ai`	10 QPS
YouTube Video Metadata	`apify`	10 QPS
TikTok Video Metadata	`rapid` / `memories.ai`	600 QPM
TikTok Video Metadata	`apify`	10 QPS
Instagram Video Metadata	`rapid` / `memories.ai`	25 QPH
Instagram Video Metadata	`apify`	10 QPS
Twitter Video Metadata	`rapid` / `memories.ai`	20 QPH
Twitter Video Metadata	`apify`	10 QPS
YouTube Video Transcript	`rapid` / `memories.ai`	150 QPM
YouTube Video Transcript	`apify`	10 QPS
TikTok Video Transcript	`rapid` / `memories.ai`	600 QPM
TikTok Video Transcript	`apify`	10 QPS
Instagram Video Transcript	`rapid` / `memories.ai`	150 QPM
Instagram Video Transcript	`apify`	10 QPS
Twitter Video Transcript	`rapid` / `memories.ai`	20 QPH
Twitter Video Transcript	`apify`	10 QPS
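The limits in this table mix per-second, per-minute, and per-hour units, so it can be useful to convert each one into a minimum spacing between requests. The helper below is a sketch of that arithmetic; `min_interval_seconds` is a hypothetical name. Spacing requests evenly is a conservative assumption, since this page does not say whether limits use fixed or sliding windows.

```python
import re

# Seconds per unit for the suffixes used in the rate limit tables.
_UNIT_SECONDS = {"QPS": 1, "QPM": 60, "QPH": 3600}


def min_interval_seconds(limit):
    """Convert a limit such as '12 QPH' into the minimum seconds between requests."""
    match = re.fullmatch(r"\s*(\d+)\s*(QPS|QPM|QPH)\s*", limit)
    if match is None:
        raise ValueError(f"unrecognized rate limit: {limit!r}")
    count, unit = int(match.group(1)), match.group(2)
    return _UNIT_SECONDS[unit] / count


print(min_interval_seconds("12 QPH"))   # 300.0
print(min_interval_seconds("600 QPM"))  # 0.1
print(min_interval_seconds("10 QPS"))   # 0.1
```

For example, YouTube Video Metadata on the `rapid` channel (12 QPH) works out to one request every 300 seconds, while the same endpoint on `apify` (10 QPS) allows one every 0.1 seconds.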

Detail & Comment Endpoints

These endpoints do not use a channel parameter.

Endpoint	Rate Limit
TikTok Video Detail	600 QPM
TikTok Video Comment	600 QPM
TikTok Video Comment Reply	600 QPM
YouTube Video Detail	10 QPS
YouTube Video Comment	10 QPS
YouTube Video Comment Reply	10 QPS

Video Understanding Models — Model Provider Rate Limit

Memories.ai does not impose its own QPS limit on these endpoints. The effective rate limit is determined by the underlying model provider (e.g., Google Gemini, Amazon Nova, Alibaba Qwen). If you exceed the provider’s throughput limit, the API will return an error. Usage is also subject to your account’s token quota and billing limits.

If you are choosing between providers, see Video Model Selection or Image Model Selection before optimizing for rate limits alone.

Endpoint	Rate Limit
Gemini Video	Subject to Gemini rate limit
Nova Video	Subject to Nova rate limit
Qwen Video	Subject to Qwen rate limit

Image Understanding Models — Model Provider Rate Limit

Endpoint	Rate Limit
Gemini Image	Subject to Gemini rate limit
GPT Image	Subject to GPT rate limit
Nova Image	Subject to Nova rate limit
Qwen Image	Subject to Qwen rate limit

What Happens When You Exceed the Limit?

If you exceed the rate limit, the API will return a 429 Too Many Requests response:

{
  "code": 429,
  "msg": "Rate limit exceeded",
  "data": null
}

Recommended retry strategy: Implement exponential backoff starting with a 1-second delay, doubling each retry, up to a maximum of 32 seconds.

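import time
import requests

def request_with_retry(url, headers, data, max_retries=5):
    """POST with exponential backoff on HTTP 429 (1 s, 2 s, 4 s, ... capped at 32 s)."""
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=data)
        # Anything other than 429 (success or a different error) goes back to the caller.
        if response.status_code != 429:
            return response
        # Skip the sleep after the final attempt; there is nothing left to wait for.
        if attempt < max_retries - 1:
            time.sleep(min(2 ** attempt, 32))
    # Still rate limited after max_retries attempts: return the last 429 response.
    return response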

Repeated rate limit violations may result in temporary suspension of your API key. Please ensure your application respects the rate limits.

Need Higher Rate Limits?

If your use case requires higher throughput, we offer customized rate limit plans for enterprise customers.

Contact Sales

Get in touch with our sales team to discuss a custom rate limit plan tailored to your needs.

Enterprise plans can include increased QPS/QPM limits, dedicated infrastructure, and priority support.
