token-optimizer
Reduce OpenClaw token usage and API costs through smart model routing, heartbeat optimization, budget tracking, and multi-provider fallbacks. Use when token costs are high, API rate limits are being hit, or hosting multiple agents at scale. Includes ready-to-use scripts for task classification, usage monitoring, and optimized heartbeat scheduling. All operations are local file analysis only - no network requests, no code execution. See SECURITY.md for details.
Install via CLI (Recommended)
```bash
clawhub install openclaw/skills/skills/qsmtco/token-optimizer-qsmtco
```

Token Optimizer
Comprehensive toolkit for reducing token usage and API costs in OpenClaw deployments. Combines smart model routing, optimized heartbeat intervals, usage tracking, and multi-provider strategies.
Quick Start
Immediate actions (no config changes needed):

- Generate an optimized AGENTS.md (biggest win!):

  ```bash
  python3 scripts/context_optimizer.py generate-agents
  # Creates AGENTS.md.optimized — review and replace your current AGENTS.md
  ```

- Check what context you ACTUALLY need:

  ```bash
  python3 scripts/context_optimizer.py recommend "hi, how are you?"
  # Shows: only 2 files needed (not 50+!)
  ```

- Install the optimized heartbeat:

  ```bash
  cp assets/HEARTBEAT.template.md ~/.openclaw/workspace/HEARTBEAT.md
  ```

- Enforce cheap models for casual chat:

  ```bash
  python3 scripts/model_router.py "thanks!"
  # Shows: use Quick, not Deep!
  ```

- Check the current token budget:

  ```bash
  python3 scripts/token_tracker.py check
  ```
Expected savings: 50-80% reduction in token costs for typical workloads (context optimization is the biggest factor!).
Core Capabilities
1. Context Optimization (NEW!)
Biggest token saver — Only load files you actually need, not everything upfront.
Problem: Default OpenClaw loads ALL context files every session:
- SOUL.md, AGENTS.md, USER.md, TOOLS.md, MEMORY.md
- docs/**/*.md (hundreds of files)
- memory/2026-*.md (daily logs)
- Total: often 50K+ tokens before the user even speaks!
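To make that overhead concrete, here is a rough sketch that estimates how many tokens a workspace's context files would cost if loaded wholesale. It assumes roughly 4 characters per token (a common approximation, not an exact count), and the glob patterns are illustrative defaults mirroring the list above:

```python
from pathlib import Path

# Glob patterns mirroring the default context bundle described above (illustrative).
DEFAULT_PATTERNS = ("*.md", "docs/**/*.md", "memory/*.md")

def estimate_context_tokens(workspace: str, patterns=DEFAULT_PATTERNS) -> int:
    """Rough token estimate for context files, assuming ~4 chars per token."""
    root = Path(workspace)
    seen = set()
    total_chars = 0
    for pattern in patterns:
        for path in root.glob(pattern):
            if path.is_file() and path not in seen:
                seen.add(path)
                total_chars += len(path.read_text(errors="ignore"))
    return total_chars // 4
```

Running this against a real workspace is a quick way to verify the "50K+ tokens" claim for your own deployment before optimizing anything.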
Solution: Lazy loading based on prompt complexity.
Usage:

```bash
python3 scripts/context_optimizer.py recommend "<user prompt>"
```

Examples:

```text
# Simple greeting → minimal context (2 files only!)
context_optimizer.py recommend "hi"
→ Load: SOUL.md, IDENTITY.md
→ Skip: everything else
→ Savings: ~80% of context

# Standard work → selective loading
context_optimizer.py recommend "write a function"
→ Load: SOUL.md, IDENTITY.md, memory/TODAY.md
→ Skip: docs, old memory, knowledge base
→ Savings: ~50% of context

# Complex task → full context
context_optimizer.py recommend "analyze our entire architecture"
→ Load: SOUL.md, IDENTITY.md, MEMORY.md, memory/TODAY+YESTERDAY.md
→ Conditionally load: relevant docs only
→ Savings: ~30% of context
```
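The three-tier routing above can be approximated with a simple keyword heuristic. The sketch below is not the skill's actual classifier; the marker lists and word-count threshold are invented for illustration:

```python
def classify_prompt(prompt: str) -> str:
    """Map a prompt to 'simple', 'standard', or 'complex' (illustrative heuristic)."""
    text = prompt.lower().strip()
    # Invented marker lists; a real classifier would be tuned on actual traffic.
    complex_markers = ("analyze", "architecture", "refactor", "audit", "entire")
    greetings = ("hi", "hello", "hey", "thanks", "thank you", "good morning")
    if any(marker in text for marker in complex_markers):
        return "complex"    # full context
    if any(text.startswith(g) for g in greetings) and len(text.split()) <= 5:
        return "simple"     # minimal context (~80% savings)
    return "standard"       # selective loading (~50% savings)
```

Even a crude heuristic like this catches the common case: most casual turns ("hi", "thanks!") never need the full context bundle.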
Output format:

```json
{
  "complexity": "simple",
  "context_level": "minimal",
  "recommended_files": ["SOUL.md", "IDENTITY.md"],
  "file_count": 2,
  "savings_percent": 80,
  "skip_patterns": ["docs/**/*.md", "memory/20*.md"]
}
```
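One way to consume that JSON is to filter a candidate file list against it. This is an illustrative sketch; interpreting `skip_patterns` with `fnmatch`-style matching is an assumption, not necessarily how the script itself applies them:

```python
from fnmatch import fnmatch

def filter_context_files(all_files: list, recommendation: dict) -> list:
    """Keep recommended files; drop anything matching a skip pattern."""
    keep = set(recommendation["recommended_files"])
    skips = recommendation["skip_patterns"]
    return [
        f for f in all_files
        if f in keep or not any(fnmatch(f, pattern) for pattern in skips)
    ]
```

Recommended files always survive the filter, so a skip pattern can be broad without accidentally dropping a file the recommendation explicitly asked for.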
Integration pattern: before loading context for a new session, ask the optimizer for a recommendation. This snippet assumes `scripts/context_optimizer.py` exposes a `recommend_context_bundle` function and is on the Python path:

```python
from context_optimizer import recommend_context_bundle

user_prompt = "thanks for your help"
recommendation = recommend_context_bundle(user_prompt)

if recommendation["context_level"] == "minimal":
    # Load only SOUL.md + IDENTITY.md and skip everything else
    # (roughly 80% token savings for casual prompts).
    context_files = recommendation["recommended_files"]
```
Paste this into your clawhub.json to enable this plugin:
```json
{
  "plugins": {
    "official-qsmtco-token-optimizer-qsmtco": {
      "enabled": true,
      "auto_update": true
    }
  }
}
```