Generate chat completions with GPT models on image inputs (Image Understanding, ILM).

POST https://mavi-backend.memories.ai/serve/api/v2/iu/chat/completions

Image Understanding (ILM) endpoints use the `/iu` path prefix; Video Understanding (VLM) endpoints use `/vu` instead. Model names take the `gpt:` prefix in the `model` parameter (e.g., `gpt:gpt-4o`).
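A minimal request sketch in Python. The `Authorization` header name and raw-key scheme are assumptions; substitute whatever credential format your account uses:

```python
import json
import urllib.request

# Endpoint from this page; the /iu prefix selects Image Understanding (ILM).
URL = "https://mavi-backend.memories.ai/serve/api/v2/iu/chat/completions"

def build_request(api_key: str, image_url: str, question: str):
    """Assemble headers and JSON body for an image chat completion."""
    headers = {
        "Authorization": api_key,   # assumption: raw key; adjust to your auth scheme
        "Content-Type": "application/json",
    }
    body = {
        "model": "gpt:gpt-4o",      # note the required gpt: prefix
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
        "max_tokens": 1000,
    }
    return headers, body

headers, body = build_request("YOUR_API_KEY", "https://example.com/cat.jpg",
                              "What is in this image?")
# To send:
# req = urllib.request.Request(URL, data=json.dumps(body).encode(),
#                              headers=headers, method="POST")
# resp = json.load(urllib.request.urlopen(req))
```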
**GPT-5 series**

| Model | Input Price | Output Price |
|---|---|---|
| gpt-5.2 | $1.75/1M tokens | $14/1M tokens |
| gpt-5.2-2025-12-11 | $1.75/1M tokens | $14/1M tokens |
| gpt-5.2-chat-latest | $1.75/1M tokens | $14/1M tokens |
| gpt-5.1 | $1.25/1M tokens | $10/1M tokens |
| gpt-5.1-2025-11-13 | $1.25/1M tokens | $10/1M tokens |
| gpt-5.1-chat-latest | $1.25/1M tokens | $10/1M tokens |
| gpt-5 | $1.25/1M tokens | $10/1M tokens |
| gpt-5-2025-08-07 | $1.25/1M tokens | $10/1M tokens |
| gpt-5-chat-latest | $1.25/1M tokens | $10/1M tokens |
| gpt-5-mini | $0.25/1M tokens | $2/1M tokens |
| gpt-5-mini-2025-08-07 | $0.25/1M tokens | $2/1M tokens |
| gpt-5-nano | $0.05/1M tokens | $0.4/1M tokens |
| gpt-5-nano-2025-08-07 | $0.05/1M tokens | $0.4/1M tokens |
**o-series**

| Model | Input Price | Output Price |
|---|---|---|
| o1 | $15/1M tokens | $60/1M tokens |
| o1-2024-12-17 | $15/1M tokens | $60/1M tokens |
| o3 | $2/1M tokens | $8/1M tokens |
| o3-2025-04-16 | $2/1M tokens | $8/1M tokens |
| o4-mini | $1.1/1M tokens | $4.4/1M tokens |
| o4-mini-2025-04-16 | $1.1/1M tokens | $4.4/1M tokens |
**GPT-4.1 series**

| Model | Input Price | Output Price |
|---|---|---|
| gpt-4.1 | $2/1M tokens | $8/1M tokens |
| gpt-4.1-2025-04-14 | $2/1M tokens | $8/1M tokens |
| gpt-4.1-mini | $0.4/1M tokens | $1.6/1M tokens |
| gpt-4.1-mini-2025-04-14 | $0.4/1M tokens | $1.6/1M tokens |
| gpt-4.1-nano | $0.1/1M tokens | $0.4/1M tokens |
| gpt-4.1-nano-2025-04-14 | $0.1/1M tokens | $0.4/1M tokens |
**GPT-4o series**

| Model | Input Price | Output Price |
|---|---|---|
| gpt-4o | $2.5/1M tokens | $10/1M tokens |
| gpt-4o-2024-08-06 | $2.5/1M tokens | $10/1M tokens |
| gpt-4o-2024-05-13 | $5/1M tokens | $15/1M tokens |
| chatgpt-4o-latest | $5/1M tokens | $15/1M tokens |
| gpt-4o-mini | $0.15/1M tokens | $0.60/1M tokens |
| gpt-4o-mini-2024-07-18 | $0.15/1M tokens | $0.60/1M tokens |
**GPT-4 Turbo**

| Model | Input Price | Output Price |
|---|---|---|
| gpt-4-turbo | $10/1M tokens | $30/1M tokens |
| gpt-4-turbo-2024-04-09 | $10/1M tokens | $30/1M tokens |
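Pricing is per million tokens, so per-request cost is simple arithmetic; a sketch using the gpt-4o rates above (the token counts are illustrative):

```python
# USD per 1M tokens, as (input, output), taken from the gpt-4o row above.
PRICES = {"gpt-4o": (2.50, 10.00)}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Rough cost estimate from the pricing table."""
    input_rate, output_rate = PRICES[model]
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000

# e.g. an image-heavy 900-token prompt and a 120-token answer:
cost = estimate_cost("gpt-4o", 900, 120)  # about $0.00345
```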
Remember the `gpt:` prefix when specifying the model: `"model": "gpt:gpt-4o"`.

**Request parameters**

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | Yes | - | The model to use (e.g., gpt:gpt-4o) |
| messages | array | Yes | - | Array of message objects. Each message contains:<br>- `role`: one of `system`, `user`, `assistant`<br>- `content`: a string, or an array of items where each item has:<br>&nbsp;&nbsp;- `type`: `text` or `image_url`<br>&nbsp;&nbsp;- `text`: text content (when `type` is `text`)<br>&nbsp;&nbsp;- `image_url`: image object (when `type` is `image_url`) with a `url` field holding an image URL or a base64-encoded image |
| response_format | object | No | - | Force JSON output format. Contains type field with value json_object |
| temperature | number | No | 0 | Controls randomness: 0.0-2.0, 0 = deterministic |
| max_tokens | integer | No | 1000 | Maximum number of tokens to generate |
| top_p | number | No | 1.0 | Nucleus sampling: 0.0-1.0, consider tokens with top_p probability mass |
| frequency_penalty | number | No | 0.0 | Reduces repetition of frequent tokens: -2.0 to 2.0 |
| presence_penalty | number | No | 0.0 | Increases likelihood of new topics: -2.0 to 2.0 |
| n | integer | No | 1 | Number of completions to generate |
| stream | boolean | No | false | Whether to stream the response |
| stop | string \| array \| null | No | null | Stop sequences: a string, an array of strings, or null |
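The `url` field of `image_url` accepts a base64-encoded image as well as an HTTP URL. A small helper, assuming the `data:` URL convention common to OpenAI-compatible APIs:

```python
import base64

def to_data_url(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Wrap raw image bytes as a base64 data URL for image_url.url."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"

# Usage with a local file:
# with open("photo.png", "rb") as f:
#     url = to_data_url(f.read(), mime="image/png")
```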
**Response fields**

| Parameter | Type | Description |
|---|---|---|
| id | string | Unique identifier for the chat completion |
| object | string | Object type, always "chat.completion" |
| created | integer | Unix timestamp of when the completion was created |
| model | string | The model used for the completion |
| choices | array | Array of completion choices |
| choices[].index | integer | Index of the choice in the choices array |
| choices[].message | object | Message object containing the assistant's response |
| choices[].message.role | string | Role of the message, always "assistant" |
| choices[].message.content | string | Content of the message |
| choices[].message.refusal | string \| null | Refusal message if the request was refused |
| choices[].message.annotations | array | Annotations for the message |
| choices[].logprobs | object \| null | Log probability information |
| choices[].finish_reason | string | Reason why the completion finished |
| usage | object | Token usage information |
| usage.prompt_tokens | integer | Number of tokens in the prompt |
| usage.completion_tokens | integer | Number of tokens in the completion |
| usage.total_tokens | integer | Total number of tokens used |
| usage.prompt_tokens_details | object | Detailed prompt token information |
| usage.prompt_tokens_details.cached_tokens | integer | Number of cached tokens |
| usage.prompt_tokens_details.audio_tokens | integer | Number of audio tokens |
| usage.completion_tokens_details | object | Detailed completion token information |
| usage.completion_tokens_details.reasoning_tokens | integer | Number of reasoning tokens |
| usage.completion_tokens_details.audio_tokens | integer | Number of audio tokens |
| usage.completion_tokens_details.accepted_prediction_tokens | integer | Number of accepted prediction tokens |
| usage.completion_tokens_details.rejected_prediction_tokens | integer | Number of rejected prediction tokens |
| service_tier | string | The service tier used for the request |
| system_fingerprint | string | System fingerprint for the model version |
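To illustrate the response shape, a small parser over a hand-built sample (field names come from the table above; the content and token counts are made up):

```python
def summarize(resp: dict) -> dict:
    """Extract the assistant text and token accounting from a completion response."""
    choice = resp["choices"][0]
    usage = resp["usage"]
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": usage["total_tokens"],
    }

# Hypothetical response, shaped per the field table above.
sample = {
    "id": "chatcmpl-example",
    "object": "chat.completion",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "A cat on a sofa.", "refusal": None},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 900, "completion_tokens": 12, "total_tokens": 912},
}
print(summarize(sample)["text"])  # A cat on a sofa.
```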
Example response values: `id = "chatcmpl-CsRhYgDaLSNjl80v5uYBufEDbJqAM"`, `object = "chat.completion"`, `created = 1767092232`, `model = "gpt-4o-2024-08-06"`, `service_tier = "default"`, `system_fingerprint = "fp_deacdd5f6f"`.