Chat Completions GPT

POST /ilm/chat/completions
curl --request POST \
  --url https://mavi-backend.memories.ai/serve/api/v2/ilm/chat/completions \
  --header 'Authorization: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "gpt:gpt-4o",
  "messages": [
    {
      "role": "system",
      "content": "<string>"
    }
  ],
  "response_format": {
    "type": "json_object"
  },
  "temperature": 0,
  "max_tokens": 1000,
  "top_p": 1,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "n": 1,
  "stream": false,
  "stop": "<string>"
}
'
This endpoint generates chat completions with image inputs using the GPT ILM model.

Request Body

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | Yes | - | The model to use (e.g., gpt:gpt-4o) |
| messages | array | Yes | - | Array of message objects (structure below) |
| response_format | object | No | - | Force JSON output format. Contains a type field with value json_object |
| temperature | number | No | 0 | Controls randomness: 0.0-2.0, 0 = deterministic |
| max_tokens | integer | No | 1000 | Maximum number of tokens to generate |
| top_p | number | No | 1.0 | Nucleus sampling: 0.0-1.0, consider tokens with top_p probability mass |
| frequency_penalty | number | No | 0.0 | Reduces repetition of frequent tokens: -2.0 to 2.0 |
| presence_penalty | number | No | 0.0 | Increases likelihood of new topics: -2.0 to 2.0 |
| n | integer | No | 1 | Number of completions to generate |
| stream | boolean | No | false | Whether to stream the response |
| stop | string \| array \| null | No | null | Stop sequences. Can be a string, an array of strings, or null |

Each message object contains:
- role: Role type; one of system, user, assistant
- content: Message content; a string or an array. Array items contain:
  - type: Content type, text or image_url
  - text: Text content (when type is text)
  - image_url: Image object (when type is image_url), with:
    - url: Image URL or base64-encoded image
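The message schema above allows image_url.url to carry either a remote URL or a base64-encoded image. Below is a minimal sketch of building a multimodal user message from a local file; the data-URL form used here is an assumption based on the common OpenAI-style convention, so confirm the exact accepted encoding against the API (a plain HTTPS URL, as in the curl example, always works):

```python
import base64


def image_message(text: str, image_path: str, mime: str = "image/png") -> dict:
    """Build a user message pairing a text prompt with a base64-encoded image.

    Assumes the endpoint accepts OpenAI-style data URLs in image_url.url;
    a plain HTTPS URL can be passed instead, as in the curl example.
    """
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{b64}"}},
        ],
    }
```

The returned dict goes into the messages array alongside the system message.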

Code Example

from openai import OpenAI

client = OpenAI(
    api_key="sk-XXX",
    base_url="https://mavi-backend.memories.ai/serve/api/v2/ilm"
)

def call_my_ilm():
    resp = client.chat.completions.create(
        model="gpt:gpt-4o",
        messages=[
            {"role": "system", "content": "You are a multimodal assistant. Only output JSON, do not output any extra text."},
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": """
Please analyze the image and strictly output the following JSON structure (all fields must be present, use null or empty array for missing values):

{
  "title": string,
  "summary": string,
  "objects": [
    {"name": string, "count": integer, "confidence": number}
  ],
  "scene": string,
  "warnings": [string]
}

Note:
- Only output JSON
- No Markdown
- No explanation of the process
"""
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "https://storage.googleapis.com/memories-test-data/gun5.png"  # base64 or url
                        }
                    }
                ]
            }
        ],
        response_format={"type": "json_object"},  # Force JSON output format
        temperature=0,  # Controls randomness: 0.0-2.0, 0 = deterministic
        max_tokens=1000,  # Maximum number of tokens to generate
        top_p=1.0,  # Nucleus sampling: 0.0-1.0, consider tokens with top_p probability mass
        frequency_penalty=0.0,  # -2.0 to 2.0, reduces repetition of frequent tokens
        presence_penalty=0.0,  # -2.0 to 2.0, increases likelihood of new topics
        n=1,  # Number of completions to generate
        stream=False,  # Whether to stream the response
        stop=None  # Stop sequences (list of strings)
    )
    return resp

# Usage example
result = call_my_ilm()
print(result)
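Because response_format={"type": "json_object"} makes the model return a JSON string in choices[0].message.content, the result still needs parsing and light validation before use. A minimal sketch (parse_analysis is a hypothetical helper; with a live response you would pass resp.choices[0].message.content):

```python
import json


def parse_analysis(content: str) -> dict:
    """Parse the assistant's JSON string and check the fields the prompt requires."""
    data = json.loads(content)
    required = {"title", "summary", "objects", "scene", "warnings"}
    missing = required - data.keys()
    if missing:
        raise ValueError(f"response missing fields: {sorted(missing)}")
    return data


# Example using the sample content from the response shown in this doc:
sample = ('{"title": "Image Title", "summary": "Image summary...", '
          '"objects": [{"name": "object1", "count": 1, "confidence": 0.95}], '
          '"scene": "indoor", "warnings": []}')
analysis = parse_analysis(sample)
```

Validating the field set catches the occasional reply where the model drops a key despite the prompt's "all fields must be present" instruction.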

Response

Returns the chat completion response in JSON format.
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4o-2024-08-06",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "{\"title\": \"Image Title\", \"summary\": \"Image summary...\", \"objects\": [{\"name\": \"object1\", \"count\": 1, \"confidence\": 0.95}], \"scene\": \"indoor\", \"warnings\": []}",
      "refusal": null,
      "annotations": []
    },
    "logprobs": null,
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 447,
    "completion_tokens": 112,
    "total_tokens": 559,
    "prompt_tokens_details": {
      "cached_tokens": 0,
      "audio_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  },
  "service_tier": "default",
  "system_fingerprint": "fp_deacdd5f6f"
}

Response Parameters

| Parameter | Type | Description |
|---|---|---|
| id | string | Unique identifier for the chat completion |
| object | string | Object type, always "chat.completion" |
| created | integer | Unix timestamp of when the completion was created |
| model | string | The model used for the completion |
| choices | array | Array of completion choices |
| choices[].index | integer | Index of the choice in the choices array |
| choices[].message | object | Message object containing the assistant's response |
| choices[].message.role | string | Role of the message, always "assistant" |
| choices[].message.content | string | Content of the message |
| choices[].message.refusal | string \| null | Refusal message if the request was refused |
| choices[].message.annotations | array | Annotations for the message |
| choices[].logprobs | object \| null | Log probability information |
| choices[].finish_reason | string | Reason why the completion finished |
| usage | object | Token usage information |
| usage.prompt_tokens | integer | Number of tokens in the prompt |
| usage.completion_tokens | integer | Number of tokens in the completion |
| usage.total_tokens | integer | Total number of tokens used |
| usage.prompt_tokens_details | object | Detailed prompt token information |
| usage.prompt_tokens_details.cached_tokens | integer | Number of cached tokens |
| usage.prompt_tokens_details.audio_tokens | integer | Number of audio tokens |
| usage.completion_tokens_details | object | Detailed completion token information |
| usage.completion_tokens_details.reasoning_tokens | integer | Number of reasoning tokens |
| usage.completion_tokens_details.audio_tokens | integer | Number of audio tokens |
| usage.completion_tokens_details.accepted_prediction_tokens | integer | Number of accepted prediction tokens |
| usage.completion_tokens_details.rejected_prediction_tokens | integer | Number of rejected prediction tokens |
| service_tier | string | The service tier used for the request |
| system_fingerprint | string | System fingerprint for the model version |
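The fields above combine naturally into a small guard when reading a response: check refusal and finish_reason before trusting content. A sketch over a plain response dict (extract_reply is a hypothetical helper; with the SDK's typed object you would use attribute access instead):

```python
def extract_reply(response: dict) -> str:
    """Return the assistant's content, raising on refusal or truncation."""
    choice = response["choices"][0]
    message = choice["message"]
    if message.get("refusal"):
        raise RuntimeError(f"model refused: {message['refusal']}")
    if choice["finish_reason"] != "stop":
        # e.g. "length" means max_tokens cut the completion short
        raise RuntimeError(f"incomplete completion: {choice['finish_reason']}")
    return message["content"]


# Example with a trimmed version of the response shown above:
sample_response = {
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": '{"scene": "indoor"}', "refusal": None},
        "finish_reason": "stop",
    }],
}
reply = extract_reply(sample_response)
```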

Authorizations

- Authorization (string, header, required): API key sent in the Authorization header, as shown in the curl example.

Body

application/json

- model (string, required): The model to use (e.g., gpt:gpt-4o). Example: "gpt:gpt-4o"
- messages (object[], required): Array of message objects
- response_format (object): Force JSON output format
- temperature (number, default 0): Controls randomness; 0 = deterministic. Required range: 0 <= x <= 2
- max_tokens (integer, default 1000): Maximum number of tokens to generate
- top_p (number, default 1): Nucleus sampling. Required range: 0 <= x <= 1
- frequency_penalty (number, default 0): Reduces repetition of frequent tokens. Required range: -2 <= x <= 2
- presence_penalty (number, default 0): Increases likelihood of new topics. Required range: -2 <= x <= 2
- n (integer, default 1): Number of completions to generate
- stream (boolean, default false): Whether to stream the response
- stop (string | array | null): Stop sequences

Response

200 - application/json: Chat completion response in JSON format

- id (string, required). Example: "chatcmpl-CsRhYgDaLSNjl80v5uYBufEDbJqAM"
- object (string, required). Example: "chat.completion"
- created (integer, required). Example: 1767092232
- model (string, required): The model used for the completion. Example: "gpt-4o-2024-08-06"
- choices (object[], required)
- usage (object, required)
- service_tier (string): The service tier used for the request. Example: "default"
- system_fingerprint (string): System fingerprint for the model version. Example: "fp_deacdd5f6f"