AI Troubleshooting

Solutions for common Atmos AI issues, organized by symptom.

Experimental

Quick Checks

Before diving into specific errors, verify these basics:

```yaml
ai:
  enabled: true
  default_provider: "anthropic"
```

```shell
# Verify your API key is set
[ -n "$ANTHROPIC_API_KEY" ] && echo "Set" || echo "NOT set"

# Test connectivity
atmos ai ask "test"
```

Common Errors

"AI features are not enabled"

Add ai.enabled: true to your atmos.yaml.

"API key not found"

Export the key for your provider:

```shell
export ANTHROPIC_API_KEY="sk-ant-..."  # From console.anthropic.com
```

"Failed to create AI client"

Usually means an invalid or revoked API key, or insufficient credits. Test the key directly:

```shell
# Test the Anthropic API key
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model": "claude-sonnet-4-6", "max_tokens": 1024, "messages": [{"role": "user", "content": "Hello"}]}'
```

"Unsupported AI provider"

Valid provider names: anthropic, openai, gemini, grok, ollama, bedrock, azureopenai.

Rate Limiting (429 Errors)

Long conversations send full history with each request. Reduce it:

```yaml
ai:
  max_history_messages: 20
```
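If occasional 429s persist even with shorter history, a generic retry wrapper with exponential backoff can smooth them over. This is a sketch, not an Atmos feature; the `retry` function name and its limits are illustrative:

```shell
# Illustrative retry wrapper: re-runs a command, doubling the wait after
# each failure. Not an Atmos built-in; attempt limits are arbitrary.
retry() {
  local attempt=1 max_attempts=5 delay=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "retry: giving up after $attempt attempts" >&2
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Usage: retry atmos ai ask "test"
```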

Provider Issues

Ollama

"Connection refused" — Start the server and verify your config:

```shell
ollama serve
curl http://localhost:11434/api/version
```

```yaml
providers:
  ollama:
    base_url: "http://localhost:11434/v1"
```

"Model not found" — Pull the model and use the exact name from ollama list:

```shell
ollama list
ollama pull llama4
```

Slow or out of memory — Use a smaller model (e.g. llama3.1:8b needs ~8GB RAM) or reduce max_tokens.

AWS Bedrock

"Authentication failed" — Verify AWS credentials:

```shell
aws sts get-caller-identity
```

"Access denied" — Your IAM role needs bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream permissions. Also enable model access in AWS Console under Bedrock > Model access.
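As a sketch, a minimal IAM policy granting just the two actions above (scoping Resource to specific model ARNs is preferable; the wildcard here is for illustration only):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": "*"
    }
  ]
}
```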

"Model not found" — Use the full Bedrock model ID and check your region:

```yaml
providers:
  bedrock:
    model: "anthropic.claude-sonnet-4-6"
    base_url: "us-east-1"  # AWS region
```

Azure OpenAI

"Authentication failed" — Get the key from Azure Portal > OpenAI resource > Keys and Endpoint.

"Resource not found" — Copy the exact endpoint URL from Azure Portal:

```yaml
providers:
  azureopenai:
    base_url: "https://your-resource-name.openai.azure.com"
```

"Deployment not found" — Use your deployment name (not the model name):

```yaml
providers:
  azureopenai:
    model: "gpt-4o"  # Your deployment name
    api_version: "2025-04-01-preview"
```

List deployments:

```shell
az cognitiveservices account deployment list --name your-resource --resource-group your-rg
```

Other Providers

  • Anthropic — "Authentication error": the key must start with sk-ant-.
  • OpenAI — "Insufficient quota": add billing info at platform.openai.com.
  • Grok — "Connection failed": set base_url: "https://api.x.ai/v1".

See AI Providers for full provider documentation.

Sessions

Sessions Not Persisting

Enable sessions and use named sessions:

```yaml
ai:
  sessions:
    enabled: true
    path: ".atmos/sessions"
```

```shell
atmos ai chat --session my-project
```

Database Errors

If you get "Failed to initialize session storage", back up and recreate:

```shell
cp .atmos/sessions/sessions.db .atmos/sessions/sessions.db.backup
rm .atmos/sessions/sessions.db
atmos ai chat  # Creates a new database
```

Conversation Memory Not Working

  • Use --session name — anonymous sessions don't persist
  • Add sessions.enabled: true to atmos.yaml
  • Check directory permissions: chmod 755 .atmos/sessions
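A quick sanity check for the points above (assumes the default .atmos/sessions path shown in the config example):

```shell
# Ensure the session directory exists and is writable.
mkdir -p .atmos/sessions
chmod 755 .atmos/sessions
[ -w .atmos/sessions ] && echo "sessions dir writable" || echo "sessions dir NOT writable"
```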

All seven providers support conversation memory.

Auto-Compact Not Working

Verify the required settings are all present:

```yaml
ai:
  max_history_messages: 50  # Required — threshold is calculated from this
  sessions:
    enabled: true
    auto_compact:
      enabled: true
      trigger_threshold: 0.75  # Triggers at 38 messages (50 x 0.75)
```

If use_ai_summary: true and no summaries appear, check that the AI provider is configured and the API key is valid. Test with use_ai_summary: false first.

Triggers too often? Increase trigger_threshold or max_history_messages.

Too expensive? Set use_ai_summary: false to use simple concatenation instead of AI summaries.

See Auto-Compact for details.

MCP Server

"MCP server is not enabled"

The MCP server is disabled by default and must be explicitly enabled:

```yaml
mcp:
  enabled: true
ai:
  enabled: true
  tools:
    enabled: true
```

Not Starting

Test the MCP server directly:

```shell
atmos mcp start  # Should output: Starting Atmos MCP server on stdio...
```

For Claude Desktop, use the full path to atmos:

```json
{
  "mcpServers": {
    "atmos": {
      "command": "/usr/local/bin/atmos",
      "args": ["mcp", "start"]
    }
  }
}
```

Tools Not Available

Restart the MCP client and check for errors with debug logging:

```shell
ATMOS_LOGS_LEVEL=Debug atmos mcp start
```

See MCP Server for the complete setup guide.

LSP Integration

Server Not Found

Install and verify:

```shell
npm install -g yaml-language-server && yaml-language-server --version
brew install terraform-ls && terraform-ls version
```

If installed but not found, use the full path:

```yaml
lsp:
  servers:
    yaml-ls:
      command: "/usr/local/bin/yaml-language-server"
```

Not Validating Files

Verify LSP is enabled and the server config matches your file types:

```yaml
lsp:
  enabled: true
  servers:
    yaml-ls:
      filetypes: ["yaml", "yml"]
```

See LSP Client for detailed configuration.

Skills

Skill Not Appearing

Press Ctrl+A in chat to open the skill selector. Skills need all three required fields:

```yaml
ai:
  skills:
    my-skill:
      display_name: "My Skill"  # Required
      description: "..."        # Required
      system_prompt: "..."      # Required
```

Ctrl+A only works in the main chat view — press Esc first if you're in another panel.

Skill Gives Generic Responses

The system_prompt needs more detail. Include a role definition, focus areas, and tool usage instructions:

```yaml
system_prompt: |
  You are a specialized Atmos stacks analyst.
  FOCUS: Stack configuration, dependency analysis.
  APPROACH: Use atmos_describe_component first, then recommend.
  Always use tools to gather data before answering.
```

Tool Access Denied

The tool isn't in the skill's allowed_tools. Add it, or switch to the General skill (Ctrl+A) which has access to all tools.

Tool names use the atmos_ prefix: atmos_describe_component, atmos_list_stacks, atmos_validate_stacks, etc. See AI Tools for the full list.

Claude Code Subagents

Subagent Not Invoked

Check the file exists and has proper frontmatter:

```shell
ls -la .claude/agents/atmos-expert.md
```

```markdown
---
name: atmos-expert
description: Expert in Atmos infrastructure...
tools:
  - mcp__atmos__describe_stacks
model: inherit
---
```

Invoke with @atmos-expert your question.

Can't Access MCP Tools

Tool names need the mcp__atmos__ prefix. The MCP server must be running and the server name must match:

```yaml
tools:
  - mcp__atmos__describe_stacks  # Correct
  - describe_stacks              # Wrong
```

Restart Claude Desktop after changing subagent or MCP configuration.

See Claude Code Integration for the complete guide.

Project Instructions

AI Not Using Instructions

Verify instructions are enabled and the file exists:

```yaml
ai:
  instructions:
    enabled: true
    file: "ATMOS.md"
```

```shell
ls -la ATMOS.md
```

If the file doesn't exist, instructions are silently skipped. Create it manually.
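For example, a minimal starter file (the contents below are illustrative, not a required format):

```shell
# Create a minimal ATMOS.md so project instructions are picked up.
cat > ATMOS.md <<'EOF'
# Project Instructions
- Describe your stack layout and naming conventions here.
EOF
```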

Large File Performance

Keep ATMOS.md under 10KB. Remove outdated content to reduce tokens per request.
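One way to check, with the 10KB threshold taken from the guidance above:

```shell
# Report ATMOS.md size and flag it when it exceeds ~10KB (10240 bytes).
size=$([ -f ATMOS.md ] && wc -c < ATMOS.md || echo 0)
if [ "$size" -gt 10240 ]; then
  echo "ATMOS.md is ${size} bytes - consider trimming"
else
  echo "ATMOS.md size OK (${size} bytes)"
fi
```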

Debugging

When nothing else works:

1. Enable debug logging:

```shell
atmos ai ask "test" --logs-level=Debug
```

2. Inspect configuration:

```shell
atmos describe config | grep -A 10 "ai"
```

3. Test network connectivity:

```shell
curl -sI https://api.anthropic.com | head -1
```

4. Try a minimal config to rule out config problems:

```yaml
ai:
  enabled: true
  default_provider: "anthropic"
```

Performance

Slow responses — Try a faster model (claude-haiku-4-5-20251001, gpt-5-mini, gemini-2.5-flash), go local with Ollama, reduce max_tokens, or limit history with max_history_messages: 20.

Timeout errors — Break complex questions into smaller parts.

Getting Help

  1. Review AI Configuration
  2. Search GitHub Issues
  3. Open a new issue with: Atmos version, provider/model, error message, and reproduction steps