llmrouter
Intelligent LLM proxy that routes requests to appropriate models based on complexity. Save money by using cheaper models for simple tasks. Tested with Anthropic, OpenAI, Gemini, Kimi/Moonshot, and Ollama.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/alexrudloff/llmrouter
What This Skill Does
The llmrouter skill acts as an intelligent, cost-optimizing proxy for your OpenClaw agent. By analyzing the complexity of incoming requests, it automatically routes them to the most appropriate LLM model based on your configuration. This allows you to delegate simple, routine tasks to lightweight, inexpensive models (like GPT-4o-mini or Claude Haiku) while reserving high-end reasoning models (like Claude Opus or OpenAI o3) for complex, high-stakes tasks. By effectively managing your model consumption, this skill significantly reduces API expenditures and latency.
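The routing idea described above can be sketched roughly as follows. This is a minimal illustration, not llmrouter's actual code: the tier names, the `ROUTES` mapping, and the `classify` and `pick_model` helpers are all hypothetical, and the keyword heuristic merely stands in for the real classifier model.

```python
# Hypothetical illustration only; not llmrouter's actual code or config schema.

# Assumed mapping from a complexity tier to a model label.
# Real deployments would use the exact model names their providers expect.
ROUTES = {
    "simple": "lightweight-model",
    "medium": "mid-tier-model",
    "complex": "reasoning-model",
}

# Assumed keywords that hint at a demanding task.
COMPLEX_HINTS = ("refactor", "prove", "debug", "architecture")


def classify(prompt: str) -> str:
    """Toy stand-in for the classifier: keyword match, then word count."""
    text = prompt.lower()
    if any(hint in text for hint in COMPLEX_HINTS):
        return "complex"
    if len(text.split()) > 40:
        return "medium"
    return "simple"


def pick_model(prompt: str, routes: dict[str, str] = ROUTES) -> str:
    """Return the model label configured for the prompt's complexity tier."""
    return routes[classify(prompt)]


if __name__ == "__main__":
    for p in (
        "Write a quick greeting email for a new project lead.",
        "Refactor this 200-line Python module to improve readability.",
    ):
        print(pick_model(p), "<-", p)
```

In the real skill the classification step is handled by an LLM (local via Ollama or a cloud provider), so only the overall shape, classify first and then look up the configured model, carries over from this sketch.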
Installation
To install the skill, run the following command in your terminal: clawhub install openclaw/skills/skills/alexrudloff/llmrouter. Once installed, you will need to clone the source repository, configure your environment variables (API keys), and define your model preferences within config.yaml. You must also ensure that either a local Ollama instance or your preferred cloud provider (Anthropic, OpenAI, etc.) is configured to handle the classification logic. Start the server using python server.py and verify your setup by checking the health endpoint at http://localhost:4001/health.
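The health check at the end of the setup can also be scripted. The sketch below assumes only that the endpoint answers with HTTP 200 when the server is up; the response body format is not documented here, so it is ignored. The `check_health` helper name and the timeout value are illustrative choices.

```python
import urllib.error
import urllib.request


def check_health(url: str = "http://localhost:4001/health", timeout: float = 3.0) -> bool:
    """Return True if the endpoint answers with HTTP 200, False otherwise.

    Assumption: a 200 status means the router is healthy. The body is not inspected.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx HTTP error.
        return False


if __name__ == "__main__":
    print("router healthy" if check_health() else "router not reachable")
```

Run it after starting the server with python server.py. A "router not reachable" result usually means the server is not running or is listening on a different port.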
Use Cases
- Automated Cost Control: Automatically scale down token usage for trivial tasks like summarizing short messages or basic formatting.
- Optimized Performance: Route latency-sensitive requests to fast, flash-class models, while complex coding tasks go to reasoning-capable models.
- Unified API Gateway: Manage all your disparate LLM provider API keys through a single, intelligent interface, streamlining your OpenClaw deployment.
Example Prompts
- "Summarize the following meeting transcript in three bullet points: [transcript text]"
- "Refactor this 200-line Python module to improve readability and apply PEP 8 standards: [code snippet]"
- "Write a quick greeting email for a new project lead."
Tips & Limitations
- Classifier Accuracy: If you notice that complex tasks are being routed to smaller, faster models, consider upgrading your classifier model (e.g., from Qwen 2.5 3b to a larger variant) to improve intent classification.
- Network Latency: Running a local classifier via Ollama adds minimal overhead compared to network requests. However, ensure sufficient system memory is available if running both the server and local LLMs simultaneously.
- Provider Support: While highly flexible, ensure that your configuration matches the exact model name formats expected by your chosen providers. Reasoning models are automatically detected, but verify your config.yaml syntax if you encounter unexpected behavior.
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-alexrudloff-llmrouter": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags(AI)
Flags: network-access, external-api