OpenClaw Cost Optimization 2026
API costs scale with token volume. This guide shows which config changes have the most impact — model selection, token caps, and switching to local inference — without changing agent behavior.
1-Minute Execution Version
Copy the config below, replace YOUR_KEY, and restart the gateway.
```json
{
  "llm": {
    "provider": "deepseek",
    "apiKey": "YOUR_KEY",
    "model": "deepseek-chat",
    "baseURL": "https://api.deepseek.com/v1",
    "maxTokens": 4096,
    "temperature": 0.7
  }
}
```

DeepSeek V3 input: $0.27/M tokens, output: $1.10/M tokens (as of Feb 2026 — verify at DeepSeek Pricing).
Why Costs Spike Unexpectedly
OpenClaw agents are multi-turn by design. Each tool call or sub-agent spawns new completions that accumulate fast. Three patterns account for most unexpected cost spikes:
No maxTokens ceiling
Without a token cap, a single runaway task can generate 128K+ tokens (about $1 of GPT-4.1 output alone at $8/M). A cap of 4096 limits worst-case output cost.
Expensive model for every task
Using GPT-4.1 or Sonnet for simple file lookups and summaries wastes 10–50× compared to smaller models. Routing by task type is the highest-leverage change.
Long conversation history replayed
OpenClaw replays the full message history per turn by default. For long sessions, this means paying for the same context on every turn.
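The cost of replayed history grows roughly quadratically with turn count. A back-of-envelope sketch makes this concrete (the function names are illustrative, not OpenClaw APIs, and the per-turn token count is an assumed average):

```python
def replay_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens when the full history is re-sent each turn.

    Turn k re-sends all k-1 previous turns plus the new one, so the
    total is (1 + 2 + ... + turns) * tokens_per_turn: quadratic growth.
    """
    return sum(k * tokens_per_turn for k in range(1, turns + 1))

def cost_usd(tokens: int, price_per_m: float) -> float:
    """Convert a token count to dollars at a per-million-token rate."""
    return tokens / 1_000_000 * price_per_m

# A 40-turn session averaging ~1,500 tokens/turn replays 1.23M input tokens:
total = replay_input_tokens(40, 1500)   # 1_230_000
# That is ~$2.46 of input on GPT-4.1 ($2/M) vs. ~$0.33 on DeepSeek V3 ($0.27/M).
```

The same session costs roughly 7x more on GPT-4.1 purely from replayed context, before any output tokens are counted.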
Step 1: Pick the Right Model
The model swap is the single highest-impact change. Use our Model Pricing page for current rates. Rough 2026 tiers:
| Model | Best For | Input / Output |
|---|---|---|
| GPT-4.1 | Complex reasoning, long doc analysis | $2 / $8 /M |
| Claude Sonnet 4.6 | Code gen, long context | $3 / $15 /M |
| GPT-4.1 mini | Summaries, Q&A, routing | $0.40 / $1.60 /M |
| DeepSeek V3 | Code gen, most agent tasks | $0.27 / $1.10 /M |
| Ollama (local) | Privacy-sensitive, offline | $0 (compute only) |
Prices approximate. Verify at each provider's pricing page before committing.
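Routing by task type (the highest-leverage change noted earlier) can be as simple as a lookup table. A minimal sketch: the task-type labels and the ROUTES mapping are assumptions for illustration, not OpenClaw config fields; the model choices follow the table above:

```python
# Hypothetical task-type router; model assignments follow the tiers above.
ROUTES = {
    "summarize": "gpt-4.1-mini",   # cheap tier for summaries and Q&A
    "qa":        "gpt-4.1-mini",
    "codegen":   "deepseek-chat",  # most agent tasks
    "planning":  "deepseek-chat",
    "long_doc":  "gpt-4.1",        # reserve the expensive tier
}

def pick_model(task_type: str, default: str = "deepseek-chat") -> str:
    """Route cheap tasks to cheap models; fall back to a mid-tier default."""
    return ROUTES.get(task_type, default)
```

Even a static table like this avoids paying GPT-4.1 rates for file lookups; a real router could also key on prompt length or tool involved.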
Step 2: Add a maxTokens Cap
This is the fastest safety net. Without it, a single agent loop can exhaust a daily budget in minutes.
```json
{
  "llm": {
    "model": "deepseek-chat",
    "maxTokens": 4096
  }
}
```

Start at 4096 and increase only if you find agents truncating legitimately. Most single-turn tasks need under 2000 output tokens.
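The cap's effect on worst-case cost is simple arithmetic. A quick sketch using the $8/M GPT-4.1 output rate from the table above (the helper name is illustrative):

```python
def worst_case_output_cost(max_tokens: int, price_per_m_out: float) -> float:
    """Upper bound on a single completion's output cost in USD."""
    return max_tokens / 1_000_000 * price_per_m_out

# Uncapped runaway (128K tokens) vs. a 4096 cap, at GPT-4.1's $8/M output:
uncapped = worst_case_output_cost(128_000, 8.0)  # ~$1.02 per completion
capped = worst_case_output_cost(4_096, 8.0)      # ~$0.033 per completion
```

The cap bounds each completion at about 3 cents instead of a dollar; an agent loop that fires dozens of completions multiplies that difference.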
Step 3: Switch to DeepSeek V3
DeepSeek V3 handles most OpenClaw workloads — code generation, task planning, JSON structuring — at significantly lower cost than GPT-4-tier models. It is not a reasoning model (no chain-of-thought), so it's not suited for tasks that explicitly need reasoning_effort.
```json
{
  "llm": {
    "provider": "deepseek",
    "apiKey": "sk-...",
    "model": "deepseek-chat",
    "baseURL": "https://api.deepseek.com/v1",
    "maxTokens": 4096,
    "temperature": 0.7
  }
}
```

Get an API key at platform.deepseek.com. For a full setup walkthrough, see our DeepSeek Setup Guide.
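DeepSeek's API is OpenAI-compatible, so the config fields above map directly onto a /chat/completions request body. A minimal sketch of that mapping (the chat_request helper is hypothetical, not part of OpenClaw):

```python
def chat_request(config: dict, messages: list) -> dict:
    """Build an OpenAI-compatible /chat/completions body from an llm config.

    Note the field-name translation: the config's camelCase maxTokens
    becomes the API's snake_case max_tokens.
    """
    llm = config["llm"]
    return {
        "model": llm["model"],
        "messages": messages,
        "max_tokens": llm.get("maxTokens", 4096),
        "temperature": llm.get("temperature", 0.7),
    }

config = {"llm": {"provider": "deepseek", "model": "deepseek-chat",
                  "maxTokens": 4096, "temperature": 0.7}}
body = chat_request(config, [{"role": "user", "content": "ping"}])
# POST body to {baseURL}/chat/completions with Authorization: Bearer sk-...
```

Because the shape is identical to OpenAI's, any OpenAI-compatible client works by pointing its base URL at api.deepseek.com/v1.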
Step 4: Run Locally with Ollama (Zero API Cost)
For privacy-sensitive tasks or when you want to eliminate API costs entirely, Ollama runs models on your machine. Performance depends on your hardware.
```shell
# 1. Pull a model locally
ollama pull qwen2.5:14b
```

Then update the openclaw config to point at the local server:

```json
{
  "llm": {
    "provider": "ollama",
    "model": "qwen2.5:14b",
    "baseURL": "http://localhost:11434/v1",
    "maxTokens": 4096
  }
}
```

Ollama Limitations
Ollama timeout errors (30s default) are common with larger models on modest hardware. See our Ollama Timeout troubleshooting guide if you hit this.
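Until you raise the timeout itself, a retry wrapper with exponential backoff is a common stopgap for slow local models. A generic sketch, not OpenClaw code:

```python
import time

def call_with_retries(fn, retries: int = 3, base_delay: float = 1.0):
    """Retry a flaky call (e.g. a slow local Ollama model) with
    exponential backoff: wait base_delay, then 2x, then 4x, ..."""
    for attempt in range(retries):
        try:
            return fn()
        except TimeoutError:
            if attempt == retries - 1:
                raise  # out of retries; surface the error
            time.sleep(base_delay * 2 ** attempt)
```

Backoff helps most when the first request cold-loads the model into memory and later requests hit a warm cache; it cannot fix a model that is simply too large for the hardware.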
Generate a Config Automatically
Tell the Config Wizard your provider and budget. It generates a complete openclaw.json with correct fields, token limits, and baseURL.
Step 5: Track Usage Before Optimizing Further
Without usage data, optimization is guesswork. Each provider exposes a built-in usage dashboard in its billing or platform console.
Set a monthly spend alert in your provider's billing settings. Most let you trigger an email at a fixed dollar threshold.
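Dashboards report daily token counts; projecting those into a monthly bill is one multiplication, which makes a sensible alert threshold easy to pick. A sketch, with the example volumes purely illustrative:

```python
def projected_monthly_cost(daily_in: int, daily_out: int,
                           price_in: float, price_out: float,
                           days: int = 30) -> float:
    """Project a monthly USD bill from average daily token counts
    and per-million-token input/output prices."""
    daily_usd = (daily_in * price_in + daily_out * price_out) / 1_000_000
    return daily_usd * days

# e.g. 2M input / 0.5M output tokens per day on DeepSeek V3 ($0.27 / $1.10 per M):
monthly = projected_monthly_cost(2_000_000, 500_000, 0.27, 1.10)  # ~$32.70
```

Setting the provider's spend alert a little above this projection catches runaway loops without firing on normal usage.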
Quick Reference
| Change | Impact |
|---|---|
| Swap to DeepSeek V3 or GPT-4.1 mini | Highest impact — 60–90% cost reduction for most tasks |
| Set maxTokens: 4096 | Prevents runaway single-task cost |
| Use Ollama for local-only tasks | Zero API cost; limited by hardware |
| Set provider spend alerts | Catches unexpected cost spikes early |