AI Providers

The ai.providers section configures connections to AI model providers. Atmos supports API providers (Anthropic, OpenAI, Gemini, Grok, Ollama, Bedrock, Azure OpenAI) and CLI providers (Claude Code, OpenAI Codex, Gemini CLI) that reuse your existing subscription.

This feature is experimental and may change in future releases.

Configuration

atmos.yaml

```yaml
ai:
  default_provider: "anthropic"
  providers:
    anthropic:
      model: "claude-sonnet-4-6"
      api_key: !env "ANTHROPIC_API_KEY"
      max_tokens: 4096
    openai:
      model: "gpt-4o"
      api_key: !env "OPENAI_API_KEY"
      max_tokens: 4096
```

Provider Settings

Each provider in the providers map supports the following settings:

model
Model identifier to use for this provider. Defaults vary by provider (see below).
api_key
API key for the provider. Use the !env YAML function to read from an environment variable (e.g., api_key: !env "ANTHROPIC_API_KEY"). Not required for Ollama (local) or Bedrock (uses AWS credentials).
max_tokens
Maximum tokens per response (default: 4096). OpenAI reasoning models (o3, o4-mini) use max_completion_tokens internally; Atmos handles this automatically.
base_url
Custom API endpoint. Required for Grok (https://api.x.ai/v1) and Azure OpenAI (https://<resource>.openai.azure.com). For Bedrock, set to the AWS region (e.g., us-east-1). Ollama defaults to http://localhost:11434/v1.
api_version
API version string. Required for Azure OpenAI (e.g., 2025-04-01-preview).
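
For example, Azure OpenAI combines base_url and api_version. This is a sketch: the provider key (azure-openai), the environment variable name, and the resource name are placeholders, not values confirmed by this page.

```yaml
ai:
  providers:
    azure-openai:                     # provider key is an assumption
      model: "gpt-4o"
      api_key: !env "AZURE_OPENAI_API_KEY"   # assumed env var name
      base_url: "https://my-resource.openai.azure.com"  # placeholder resource
      api_version: "2025-04-01-preview"
```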

Supported Providers

Anthropic (Claude) — default model claude-sonnet-4-6
API Key authentication. Advanced reasoning, default choice.
OpenAI (GPT) — default model gpt-4o
API Key authentication. Widely available, strong general capabilities.
Google (Gemini) — default model gemini-2.5-flash
API Key authentication. Fast responses, larger context window.
xAI (Grok) — default model grok-4-latest
API Key authentication. OpenAI-compatible, real-time knowledge.
Ollama (Local) — default model llama4
No authentication (local). Privacy, offline use, zero API costs.
AWS Bedrock — default model anthropic.claude-sonnet-4-6
AWS Credentials authentication. Enterprise security, AWS integration.
Azure OpenAI — default model gpt-4o
API Key / Azure AD authentication. Enterprise compliance, Azure integration.

CLI Providers

CLI providers invoke a locally installed AI tool as a subprocess. No API key needed — the CLI tool uses your existing subscription.

Claude Code — claude-code
Claude Pro/Max subscription. Full MCP support. Best for interactive development.
OpenAI Codex — codex-cli
ChatGPT Plus/Pro subscription. Full MCP support. Open source (Apache 2.0).
Gemini CLI — gemini-cli
Google account (free tier: 1K req/day). MCP blocked for personal accounts.

CLI Provider Settings

In addition to model, CLI providers support:

binary
Path to the CLI binary. Optional — defaults to finding it on PATH.
max_turns
Maximum agentic turns per invocation (Claude Code only).
max_budget_usd
Budget cap per invocation in USD (Claude Code only).
full_auto
Enable automatic approval for file writes (Codex CLI only). When MCP servers are configured, --dangerously-bypass-approvals-and-sandbox is used automatically.
allowed_tools
List of tools Claude Code is allowed to use (Claude Code only).

CLI Provider Examples

atmos.yaml

```yaml
ai:
  default_provider: "claude-code"
  providers:
    claude-code:
      max_turns: 10
      # max_budget_usd: 1.00
      # allowed_tools: ["Read", "Glob", "Grep"]

    codex-cli:
      model: "gpt-5.4-mini"
      full_auto: true

    gemini-cli:
      model: "gemini-2.5-flash"
```

MCP Pass-Through

When mcp.servers is configured, CLI providers automatically pass MCP servers to the CLI tool:

  • Claude Code: Temp .mcp.json via --mcp-config
  • Codex CLI: Written to ~/.codex/config.toml (backup/restore after exit)
  • Gemini CLI: Written to .gemini/settings.json in cwd (backup/restore, MCP blocked for personal accounts)

Auth-requiring servers are wrapped with atmos auth exec -i <identity>. Toolchain PATH and ATMOS_* env vars are injected automatically.
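
As a hypothetical sketch, an mcp.servers entry that gets passed through to the CLI tool might look like the following. The server name, binary, and field names (command) are assumptions based on common MCP server configuration, not confirmed by this page.

```yaml
mcp:
  servers:
    github:                         # placeholder server name
      command: "github-mcp-server"  # placeholder binary; field name assumed
```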

Provider Comparison

Setup
Cloud: Easy (API key). Ollama: Medium (install + download model). Enterprise: Complex (cloud config).
Cost
Cloud: Per-token pricing. Ollama: Free. Enterprise: Per-token + cloud costs.
Data Privacy
Cloud: Sent to provider. Ollama: 100% local. Enterprise: Stays in your cloud.
Offline
Cloud: No. Ollama: Yes. Enterprise: No.
Rate Limits
Cloud: Yes. Ollama: No. Enterprise: Yes (configurable).
Compliance
Cloud: Provider-dependent. Ollama: Your control. Enterprise: SOC2, HIPAA, ISO.
Private Network
Cloud: No. Ollama: Yes. Enterprise: Yes (VPC/VNet).

Performance and Cost

Anthropic (Claude) — 200K token context
$3-$15 per 1M tokens (input/output).
OpenAI (GPT-4o) — 128K token context
$2.50-$10 per 1M tokens (input/output).
Google (Gemini) — up to 2M token context
$0.075-$0.30 per 1M tokens (input/output).
xAI (Grok) — 2M token context
$3-$15 per 1M tokens (input/output).
Ollama — 128K token context
$0 (hardware only).
AWS Bedrock — 200K token context
$3-$15 + AWS costs per 1M tokens.
Azure OpenAI — 128K token context
$2.50-$10 + Azure costs per 1M tokens.

Cost Optimization

For budget-conscious usage, try Gemini (cheapest cloud) or Ollama (free). For balanced cost and quality, Claude Sonnet or GPT-4o work well. Enterprise providers can leverage committed-use discounts.

Privacy and Security

Anthropic
Data location: US (cloud). Compliance: SOC 2, GDPR.
OpenAI
Data location: US (cloud). Compliance: SOC 2, GDPR.
Gemini
Data location: US (cloud). Compliance: SOC 2, GDPR.
Grok
Data location: US (cloud). Compliance: Standard.
Ollama
Data location: Your machine. Compliance: Your control.
Bedrock
Data location: Your AWS region. Compliance: SOC 2, HIPAA, ISO.
Azure OpenAI
Data location: Your Azure region. Compliance: SOC 2, HIPAA, ISO.

For sensitive infrastructure data, consider Ollama (never leaves your machine) or enterprise providers (stays in your cloud account).
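
A minimal local-only setup with Ollama needs no API key. The base_url below is shown explicitly for clarity, though per the settings table it matches the default:

```yaml
ai:
  default_provider: "ollama"
  providers:
    ollama:
      model: "llama4"
      base_url: "http://localhost:11434/v1"  # default endpoint; no API key required
```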

Provider Examples

atmos.yaml

```yaml
ai:
  providers:
    anthropic:
      model: "claude-sonnet-4-6"
      api_key: !env "ANTHROPIC_API_KEY"
      max_tokens: 4096

    openai:
      model: "gpt-4o"
      api_key: !env "OPENAI_API_KEY"
      max_tokens: 4096

    gemini:
      model: "gemini-2.5-flash"
      api_key: !env "GEMINI_API_KEY"
      max_tokens: 8192

    grok:
      model: "grok-4-latest"
      api_key: !env "XAI_API_KEY"
      base_url: "https://api.x.ai/v1"
      max_tokens: 4096
```

Token Caching

Token caching (prompt caching) reduces costs by reusing frequently-sent content like system prompts.

Anthropic — 90% discount
Manual markers. Configuration required (see below).
OpenAI — 50% discount
Automatic. No configuration required.
Gemini — Free (included)
Automatic. No configuration required.
Grok — 75% discount
Automatic. No configuration required.
Bedrock — Up to 90% discount
Automatic. No configuration required.
Azure OpenAI — 50-100% discount
Automatic. No configuration required.
Ollama — N/A
Local processing, caching not applicable.

Most providers implement caching automatically. Only Anthropic requires explicit configuration:

atmos.yaml

```yaml
ai:
  providers:
    anthropic:
      model: "claude-sonnet-4-6"
      api_key: !env "ANTHROPIC_API_KEY"
      cache:
        enabled: true                    # Enable prompt caching (default)
        cache_system_prompt: true        # Cache skill system prompt
        cache_project_instructions: true # Cache ATMOS.md content
```

cache.enabled
Enable or disable prompt caching (default: true). Cached tokens cost 90% less for Anthropic.
cache.cache_system_prompt
Cache the skill system prompt across messages (default: true).
cache.cache_project_instructions
Cache the ATMOS.md project instructions content (default: true).

Maximize Cache Savings

For Anthropic, keep your ATMOS.md content stable during conversations and use named sessions for extended work on the same topic. For all other providers, caching works transparently with no action required.

Multi-Provider Configuration

Configure multiple providers and switch between them. Set default_provider for CLI commands, and press Ctrl+P during a chat session to switch mid-conversation.
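
For instance, a setup that combines an API provider with a CLI provider might look like this, with default_provider selecting which one CLI commands use:

```yaml
ai:
  default_provider: "anthropic"
  providers:
    anthropic:
      model: "claude-sonnet-4-6"
      api_key: !env "ANTHROPIC_API_KEY"
    claude-code:
      max_turns: 10
```

With both configured, Ctrl+P during a chat session switches between them mid-conversation.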

Provider Isolation

Each provider maintains its own completely isolated conversation history. When you switch providers, the new provider only sees messages from its own thread. When you switch back, it picks up right where you left off. UI notifications like "Switched to..." are never sent to any provider.

Security Best Practices

Cloud Providers (Anthropic, OpenAI, Google, xAI)
Use environment variables for API keys (never commit to git), rotate keys regularly, and review each provider's data retention policies.
Local Provider (Ollama)
All processing is local. Keep Ollama updated and use disk encryption for sensitive data.
Enterprise Providers (Bedrock, Azure OpenAI)
Use IAM roles or managed identities instead of API keys, enable audit logging (CloudTrail or Azure Monitor), configure private endpoints, and use customer-managed encryption keys where available.
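
For Bedrock, which authenticates with AWS credentials rather than an API key, the configuration reduces to a model and a region. This is a sketch: the provider key (bedrock) is an assumption, and the model ID is the default from the table above.

```yaml
ai:
  providers:
    bedrock:                               # provider key is an assumption
      model: "anthropic.claude-sonnet-4-6"
      base_url: "us-east-1"                # AWS region; auth via AWS credentials (e.g., an IAM role)
```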