ClawKit Logo
ClawKitReliability Toolkit
Back to Registry
Official Verified

local-first-llm

Routes LLM requests to a local model (Ollama, LM Studio, llamafile) before falling back to cloud APIs. Tracks token savings and cost avoidance in a persistent dashboard. Use when: (1) user asks to run a task with a local model first, (2) user wants to reduce cloud API costs or keep requests private, (3) user asks to see their token savings or LLM routing dashboard, (4) any request where local-vs-cloud routing should be decided automatically. Supports Ollama, LM Studio, and llamafile as local providers.

skill-install — Terminal

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/joelnishanth/local-first-llm
Or

Local-First LLM

Route requests to a local LLM first; fall back to cloud only when necessary. Track every decision to show real token and cost savings.

Quick Start

1. Check if a local LLM is running

python3 skills/local-first-llm/scripts/check_local.py

Returns JSON: { "any_available": true, "best": { "provider": "ollama", "models": [...] } }

2. Route a request

python3 skills/local-first-llm/scripts/route_request.py \
  --prompt "Summarize this meeting transcript" \
  --tokens 800 \
  --local-available \
  --local-provider ollama

Returns: { "decision": "local", "reason": "...", "complexity_score": -1 }

3. Log the outcome

After executing the request, record it:

python3 skills/local-first-llm/scripts/track_savings.py log \
  --tokens 800 \
  --model gpt-4o \
  --routed-to local

4. Show the dashboard

python3 skills/local-first-llm/scripts/dashboard.py

Full Routing Workflow

┌─────────────────────────────────────────────────────┐
│  1. check_local.py  →  is a local provider running? │
│                                                      │
│  2. route_request.py  →  local or cloud?             │
│     - sensitivity check  (private data → local)      │
│     - complexity score   (high score → cloud)        │
│     - availability gate  (no local → cloud)          │
│                                                      │
│  3. Execute with the chosen provider                 │
│                                                      │
│  4. track_savings.py log  →  record the outcome      │
│                                                      │
│  5. dashboard.py  →  show cumulative savings         │
└─────────────────────────────────────────────────────┘

Routing Rules (Summary)

ConditionRoute
No local provider available☁️ Cloud
Prompt contains sensitive data (password, secret, api key, ssn, etc.)🏠 Local
Complexity score ≥ 3☁️ Cloud
Complexity score < 3🏠 Local

For full scoring details, see references/routing-logic.md.


Executing with a Local Provider

Once route_request.py returns "decision": "local", send the request:

Ollama

curl http://localhost:11434/api/generate \
  -d '{"model": "llama3.2", "prompt": "YOUR_PROMPT", "stream": false}'

LM Studio / llamafile (OpenAI-compatible)

curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "local-model", "messages": [{"role": "user", "content": "YOUR_PROMPT"}]}'

Dashboard

Metadata

Stars1947
Views1
Updated2026-03-04
View Author Profile
AI Skill Finder

Not sure this is the right skill?

Describe what you want to build — we'll match you to the best skill from 16,000+ options.

Find the right skill
Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-joelnishanth-local-first-llm": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Safety NoteClawKit audits metadata but not runtime behavior. Use with caution.