AI Deployments

Large Language Model (LLM)

What’s the role of the Language Model?

The language model generates answers to user queries and can call tools when needed. It’s the core “brain” of your assistant.

Which LLMs can I choose?

OpenAI

  • GPT‑5 family:
    • gpt-5, gpt-5-mini, gpt-5-nano
    • Chat variants: gpt-5-chat, gpt-5-chat-latest
  • GPT‑4.1 family:
    • gpt-4.1, gpt-4.1-mini, gpt-4.1-nano
  • GPT‑4.5:
    • gpt-4.5-preview
  • GPT‑4o family:
    • gpt-4o, gpt-4o-audio-preview, gpt-4o-realtime-preview
    • gpt-4o-mini, gpt-4o-mini-audio-preview, gpt-4o-mini-realtime-preview
  • O‑series (reasoning):
    • o1, o1-pro, o1-mini
    • o3, o3-pro, o3-deep-research, o3-mini
    • o4-mini, o4-mini-deep-research
  • Specialized:
    • Code: codex-mini-latest
    • Search/Computer Use: gpt-4o-mini-search-preview, gpt-4o-search-preview, computer-use-preview
    • Image: gpt-image-1

Microsoft Azure OpenAI

  • The same OpenAI model families, served through your Azure deployment. For this provider, the “Name” you set is your Azure deployment name, which may differ from the underlying model identifier.

Anthropic (Claude)

  • Claude 3.7 / 3.5:
    • claude-3-7-sonnet-latest, claude-3-7-sonnet-20250219
    • claude-3-5-sonnet-latest, claude-3-5-sonnet-20241022, claude-3-5-sonnet-20240620
    • claude-3-5-haiku-latest, claude-3-5-haiku-20241022
  • Claude 3:
    • claude-3-opus-20240229
    • claude-3-sonnet-20240229
    • claude-3-haiku-20240307

Mistral

  • General:
    • mistral-large-latest, mistral-large-2411, mistral-large-2402
    • mistral-small-latest, mistral-small-2501, mistral-small-2402
    • mistral-saba-latest, mistral-saba-2502
  • Code:
    • codestral-latest, codestral-2501, codestral-2405
  • Lightweight:
    • ministral-8b-latest, ministral-8b-2410
    • ministral-3b-latest, ministral-3b-2410
  • Vision:
    • pixtral-large-latest, pixtral-large-2411

DeepSeek

  • deepseek-reasoner
  • deepseek-chat

LiteLLM / LLM Proxy

  • Route requests through your own gateway, using provider-prefixed model names such as openai/gpt-4o. This is useful for centralizing traffic and credentials.
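
As a rough illustration of the provider-prefixed naming above, the sketch below splits a name such as openai/gpt-4o into its provider and model parts. The helper split_model_name and the RoutedModel type are hypothetical names used only for this example. How your gateway actually interprets model names depends on its own configuration.

```python
from typing import NamedTuple, Optional


class RoutedModel(NamedTuple):
    """Hypothetical container for a parsed model name."""
    provider: Optional[str]  # None when the name carries no provider prefix
    model: str


def split_model_name(name: str) -> RoutedModel:
    """Split 'provider/model' at the first slash (illustrative helper only)."""
    provider, sep, model = name.partition("/")
    if not sep:
        # No slash: treat the whole string as a bare model name.
        return RoutedModel(None, name)
    if not provider or not model:
        raise ValueError(f"malformed model name: {name!r}")
    return RoutedModel(provider, model)


routed = split_model_name("openai/gpt-4o")
bare = split_model_name("gpt-4o")
```

Here routed.provider is "openai" and routed.model is "gpt-4o", while bare.provider is None because the name has no prefix.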

Embeddings (for indexing/vectorization)

  • OpenAI:
    • text-embedding-3-small
    • text-embedding-3-large
    • text-embedding-ada-002
  • Mistral:
    • mistral-embed

LLM options you can configure

  • Provider

    • The vendor to use.
    • Examples: OpenAI, Azure OpenAI, Anthropic, Mistral, DeepSeek, LiteLLM/LLM Proxy.
  • Name

    • The specific model identifier you select.
    • Examples: gpt-5-chat, gpt-4o, o3, o4-mini, claude-3-7-sonnet-latest, mistral-large-latest, deepseek-reasoner.
  • Temperature

    • Controls sampling randomness. Higher values produce more diverse, less predictable answers; lower values produce more focused, repeatable ones.
    • Example: 0.2.
  • Max Tokens to Generate

    • Maximum number of tokens the model will generate in its response.
    • Example: 256 (setting this too low can truncate answers).
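
The four options above can be pictured as one small settings object. The sketch below is a minimal illustration, not the product's actual configuration format. The class LLMConfig, the lowercase provider keys, and the 0 to 2 temperature bound are all assumptions made for this example, and accepted temperature ranges differ between providers.

```python
from dataclasses import dataclass

# Assumption: lowercase keys standing in for the providers listed above.
KNOWN_PROVIDERS = {
    "openai", "azure-openai", "anthropic", "mistral", "deepseek", "litellm",
}


@dataclass(frozen=True)
class LLMConfig:
    """Hypothetical settings object mirroring the four options."""
    provider: str
    name: str
    temperature: float = 0.2  # example value from the text
    max_tokens: int = 256     # example value from the text

    def __post_init__(self) -> None:
        if self.provider not in KNOWN_PROVIDERS:
            raise ValueError(f"unknown provider: {self.provider!r}")
        if not self.name.strip():
            raise ValueError("name must not be empty")
        # Assumed bound; check your provider's documented range.
        if not 0.0 <= self.temperature <= 2.0:
            raise ValueError("temperature must be between 0 and 2")
        if self.max_tokens < 1:
            raise ValueError("max_tokens must be at least 1")


config = LLMConfig(provider="anthropic", name="claude-3-7-sonnet-latest")
```

Validating at construction time means that a mistyped provider or a negative token limit is rejected before any request is sent.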

Note: Availability and pricing of models may vary by account/region and can change over time.