Image Model Selection

Use this page when your request is image-only. Memories.ai currently documents Gemini, GPT, Nova, and Qwen image understanding endpoints.

This guide is intentionally practical. It focuses on the request and output patterns you will actually use in Memories.ai, not external benchmark scoreboards.

Quick Picks

Best default

Start with Gemini Image if you want one strong default for image reasoning and schema-based extraction.

Simplest JSON path

Start with GPT Image if you want the most familiar documented `response_format` path for JSON output.

Lowest-cost batch

Start with Qwen Image if cost is the main constraint and you need image analysis at scale.

Tool-first extraction

Start with Nova Image if your workflow is built around tool-style extraction via `toolConfig`.

Choose By Workflow

| If your goal is… | Start here | Why |
| --- | --- | --- |
| General-purpose image reasoning | Gemini Image | Strong default when you need reasoning plus schema-style structured output |
| Strict JSON extraction with a familiar API pattern | GPT Image | The page documents `response_format` directly and uses an OpenAI-style image request/response shape |
| Lowest-cost image analysis | Qwen Image | Cheapest published starting price among the documented image providers |
| Tool-driven extraction pipeline | Nova Image | The docs expose `toolConfig` for structured extraction through tool specs |
| Explicit thinking toggle plus schema output | Gemini Image or Qwen Image | Both document structured output plus explicit thinking-related fields |

Provider Differences That Matter

| Provider | Documented input shape | Structured output path | Thinking control | When it usually fits best |
| --- | --- | --- | --- | --- |
| Gemini Image | `input_file` with image MIME type | `extra_body.metadata.responseSchema` | `thinking_config.thinking_budget` | Default image reasoning with schema-based extraction |
| GPT Image | `image_url` blocks in OpenAI-style messages | `response_format` | Not documented on this page | Image-only JSON extraction with the simplest request/response mental model |
| Nova Image | `image_url` plus text; tool config lives in `extra_body.metadata.toolConfig` | Tool-based extraction via `toolConfig` | Not documented on this page | Tool-oriented extraction flows and low-cost operational usage |
| Qwen Image | `image` plus `text` blocks in the same content array | `extra_body.metadata.response_format.json_schema` | `enable_thinking` + `thinking_budget` | Lowest-cost image analysis with explicit thinking controls |
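
Because the structured-output spec lives at a different path for each provider, a small helper can keep that difference out of calling code. The sketch below is illustrative only: the dotted paths come from the table above, while the `model` key, the helper name `attach_structured_output`, and the example schema are assumptions, not documented Memories.ai request fields. The exact shape of the spec itself (for example, what a `toolConfig` must contain) is provider-specific and is not covered here.

```python
from copy import deepcopy

# Path of the structured-output spec per provider, taken from the table above.
STRUCTURED_OUTPUT_PATH = {
    "gemini": ("extra_body", "metadata", "responseSchema"),
    "gpt": ("response_format",),
    "nova": ("extra_body", "metadata", "toolConfig"),
    "qwen": ("extra_body", "metadata", "response_format", "json_schema"),
}


def attach_structured_output(request: dict, provider: str, spec: dict) -> dict:
    """Return a copy of `request` with `spec` placed at the provider's path."""
    path = STRUCTURED_OUTPUT_PATH[provider]
    result = deepcopy(request)  # leave the caller's dict untouched
    node = result
    for key in path[:-1]:
        # Create intermediate dicts such as extra_body and metadata as needed.
        node = node.setdefault(key, {})
    node[path[-1]] = spec
    return result


# Hypothetical base requests and schema, for illustration only.
schema = {"type": "object", "properties": {"title": {"type": "string"}}}
base = {"model": "qwen:qwen3-vl-flash"}
qwen_request = attach_structured_output(base, "qwen", schema)
gpt_request = attach_structured_output({"model": "gpt:gpt-5-mini"}, "gpt", schema)
```

Keeping the paths in one mapping means that switching providers during evaluation changes a single argument rather than the request-building code.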

Suggested Starting Models

| Scenario | Recommended model |
| --- | --- |
| Default image reasoning / extraction | `gemini:gemini-3-flash-preview` |
| Familiar JSON-first image extraction | `gpt:gpt-5-mini` |
| Cheapest first pass on large image volume | `qwen:qwen3-vl-flash` |
| Tool-oriented extraction workflow | `nova:us.amazon.nova-lite-v1:0` |
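
If you keep these starting points in code, a plain lookup is usually enough. The sketch below assumes the identifiers follow a `provider:model` convention, which is inferred from the table rather than stated on this page; the scenario keys and the `split_model_id` helper are hypothetical names. Note that the Nova identifier contains a second colon, so the split should happen only at the first one.

```python
# Model identifiers copied from the table above; scenario keys are hypothetical.
SUGGESTED_MODELS = {
    "default": "gemini:gemini-3-flash-preview",
    "json_first": "gpt:gpt-5-mini",
    "cheapest_batch": "qwen:qwen3-vl-flash",
    "tool_extraction": "nova:us.amazon.nova-lite-v1:0",
}


def split_model_id(model_id: str) -> tuple[str, str]:
    """Split an identifier at the first colon into (provider, model name)."""
    provider, sep, name = model_id.partition(":")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider:model', got {model_id!r}")
    return provider, name


provider, name = split_model_id(SUGGESTED_MODELS["tool_extraction"])
# provider is "nova"; name keeps its own colon: "us.amazon.nova-lite-v1:0"
```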

Qwen source pricing is documented per 1K tokens on its provider pages. This guide converts the lowest published tier into 1M token terms for faster cross-provider comparison.
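
The conversion itself is a multiplication by 1,000. A minimal sketch, using a made-up price rather than any published Qwen rate, and `Decimal` to avoid floating-point rounding on small currency values:

```python
from decimal import Decimal


def per_1k_to_per_1m(price_per_1k: Decimal) -> Decimal:
    """Convert a price quoted per 1K tokens into a price per 1M tokens."""
    return price_per_1k * Decimal(1000)


# Hypothetical price for illustration only; not a real provider rate.
example_per_1m = per_1k_to_per_1m(Decimal("0.0002"))
```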

How To Evaluate On Your Own Data

Before standardizing on an image model, compare at least these cases:

- clean product or document images vs noisy real-world photos
- captioning vs field extraction
- free-form output vs strict JSON output
- small interactive workloads vs bulk batch processing

The right choice often depends more on your output contract than on headline model branding.
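
One way to make the output contract measurable is to score raw model outputs for strict JSON validity and for per-field correctness. The sketch below is an assumption about how such a check might look, not a Memories.ai tool: the function names, the `total` field, and the sample outputs are all hypothetical, and it expects you to have already collected each model's raw text responses.

```python
import json


def json_validity_rate(outputs: list[str]) -> float:
    """Fraction of outputs that parse as a JSON object."""
    if not outputs:
        return 0.0
    valid = 0
    for text in outputs:
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            continue  # free-form or truncated output counts as invalid
        if isinstance(parsed, dict):
            valid += 1
    return valid / len(outputs)


def field_accuracy(outputs: list[str], expected: list[str], field: str) -> float:
    """Fraction of outputs whose `field` equals the expected value."""
    if not outputs:
        return 0.0
    correct = 0
    for text, want in zip(outputs, expected):
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict) and parsed.get(field) == want:
            correct += 1
    return correct / len(outputs)


# Hypothetical raw outputs from one model on three receipt images.
sample_outputs = ['{"total": "12.50"}', "Total is 12.50", '{"total": "9.99"}']
sample_expected = ["12.50", "12.50", "10.00"]

validity = json_validity_rate(sample_outputs)
accuracy = field_accuracy(sample_outputs, sample_expected, "total")
```

Running the same two functions over each candidate model's outputs, split by the cases listed above, gives a like-for-like comparison on your own images.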
