adaptive-routing
Routes LLM requests to a local model first (Ollama, LM Studio, llamafile), validates the response quality, and escalates to cloud only when the local result fails. Tracks local vs escalated vs cloud outcomes in a persistent dashboard. Use when: (1) user asks to run a task with a local model first, (2) user wants to reduce cloud API costs or keep requests private, (3) user wants post-outcome quality validation before committing to a local result, (4) user asks to see token savings or the routing dashboard, (5) any request where local-vs-cloud routing should be decided automatically with a quality gate. Supports Ollama, LM Studio, and llamafile as local providers.
Why use this skill?
Reduce cloud AI costs and enhance privacy with adaptive-routing. Automatically route tasks between local LLMs and cloud APIs with built-in quality validation.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/joelnishanth/adaptive-routingWhat This Skill Does
The adaptive-routing skill provides an intelligent, automated bridge between local LLM hardware (like Ollama, LM Studio, or llamafile) and high-performance cloud providers. Rather than defaulting to expensive cloud inference for every task, this skill employs a multi-step logic gate: it first checks for local availability, assesses prompt complexity, and performs a quality validation check post-generation. If the local response meets your predefined threshold, it is used immediately, saving you API costs and ensuring maximum privacy. If it fails validation or is too complex for the local model, the system automatically escalates to a cloud model.
Installation
To install this skill, use the clawhub CLI within your terminal. Ensure you have the required Python dependencies installed, as the skill relies on local script execution to monitor model health and track usage metrics. Run the following command:
clawhub install openclaw/skills/skills/joelnishanth/adaptive-routing
Use Cases
- Privacy-First Workflows: Keep sensitive data on your local machine by forcing local execution for specific internal documents.
- Cost Optimization: Dramatically reduce cloud API bills by offloading simple summarizing, formatting, or parsing tasks to local llamafiles.
- Automated Quality Assurance: Implement a 'quality gate' where the agent verifies the output of a small local model before presenting it to the user.
- Dashboarding: Track the cumulative savings in terms of token usage and financial cost between local and cloud transitions over time.
Example Prompts
- "Run this task using a local model first to keep it private, and only escalate to the cloud if the quality is too low."
- "Show me the dashboard for my adaptive routing savings to see how much I've saved on cloud costs this month."
- "Summarize this transcript locally; if the result isn't accurate enough, use GPT-4o for a high-quality summary instead."
Tips & Limitations
- Performance: The responsiveness of this skill depends on your local hardware. If you are running quantized models, ensure your system has sufficient RAM/VRAM to handle the requested token load.
- Complexity Scoring: The internal routing logic uses a complexity threshold. If you find the agent is escalating too frequently, you can adjust the complexity score parameters in the configuration files.
- Dependencies: Always verify your local provider (Ollama, LM Studio) is active via
check_local.pybefore attempting deep routing, as the system will default to cloud if no local endpoint is detected.
Metadata
Not sure this is the right skill?
Describe what you want to build — we'll match you to the best skill from 16,000+ options.
Find the right skillPaste this into your clawhub.json to enable this plugin.
{
"plugins": {
"official-joelnishanth-adaptive-routing": {
"enabled": true,
"auto_update": true
}
}
}Tags(AI)
Flags: file-read, file-write, external-api, code-execution