Official Verified developer tools Safety 4/5

Token Guard

Skill by edmonddantesj

Why use this skill?

Optimize LLM API usage with Token Guard. Prevent 429 rate limit errors, manage token quotas, and implement smart fallbacks for your OpenClaw AI agents.


Install via CLI (Recommended)

clawhub install openclaw/skills/skills/edmonddantesj/token-guard

What This Skill Does

Token Guard is a middleware agent that sits between your OpenClaw agent and your LLM providers and acts as a traffic controller for API calls. It estimates the token weight of every request before it is sent, so a request that would exceed a limit can be held back or rerouted instead of triggering a 429 rate limit error. It manages sliding-window quotas, tracks per-model usage, and falls back automatically to another model when one reaches capacity. Beyond throttling, it includes duplicate detection and a response caching layer, so repeated or accidental requests do not waste API credits. The goal is to get the most out of a fixed quota, giving users on tiered or free-tier plans consistent uptime without manual monitoring.
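The sliding-window quota idea can be sketched in a few lines of Python. This is a minimal illustration, not Token Guard's actual implementation: the class name `SlidingWindowQuota`, its methods, and the injectable `clock` parameter are all hypothetical names chosen for this example.

```python
import time
from collections import deque


class SlidingWindowQuota:
    """Tracks tokens spent in the last `window_seconds` for one model.

    Hypothetical sketch; names and behavior are assumptions for illustration.
    """

    def __init__(self, limit_tokens, window_seconds=60.0, clock=time.monotonic):
        self.limit_tokens = limit_tokens
        self.window_seconds = window_seconds
        self._clock = clock          # injectable so the window can be tested
        self._events = deque()       # (timestamp, tokens) pairs, oldest first
        self._used = 0

    def _evict_expired(self):
        # Drop every recorded request that has aged out of the window.
        cutoff = self._clock() - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            _, tokens = self._events.popleft()
            self._used -= tokens

    def used(self):
        """Return the tokens currently counted against the window."""
        self._evict_expired()
        return self._used

    def try_reserve(self, tokens):
        """Record the request and return True if it fits, else return False."""
        self._evict_expired()
        if self._used + tokens > self.limit_tokens:
            return False
        self._events.append((self._clock(), tokens))
        self._used += tokens
        return True
```

A caller would invoke `try_reserve(estimated_tokens)` before each API call and send the request only when it returns `True`. Because the check happens locally, a rejected request never reaches the provider and so cannot produce a 429.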

Installation

To integrate Token Guard into your current OpenClaw setup, run the following command in your terminal: clawhub install openclaw/skills/skills/edmonddantesj/token-guard

Once installed, you can register the skill in your agent.yaml or include it directly as middleware in your orchestration logic to ensure every outgoing API call is automatically validated against your defined quota thresholds.

Use Cases

Token Guard is ideal for high-volume automated workflows. If you are scraping large datasets or processing batches of PDFs using Gemini or Claude, Token Guard prevents the 'death spiral' in which a rate-limited request is retried immediately and trips the limit again. It also suits budget-conscious developers who need to switch from expensive models like GPT-4o to more cost-effective alternatives like Claude Haiku or DeepSeek when a quota limit is approached. It is particularly effective in agentic loops, where runaway recursion might otherwise trigger an accidental account suspension.
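Switching models as a quota limit approaches can be pictured as a walk down a preference list. The sketch below is an assumption about how such a selection could work, not the skill's real logic: `pick_model`, its parameters, and the 0.9 default ratio are hypothetical.

```python
def pick_model(preferred, usage, limits, request_tokens, block_ratio=0.9):
    """Return the first model that can absorb the request, or None.

    preferred      -- model names in order of preference
    usage          -- tokens already used per model in the current window
    limits         -- per-model token limit for that window
    request_tokens -- estimated size of the pending request
    block_ratio    -- fraction of the limit treated as the cut-off (assumed)
    """
    for model in preferred:
        projected = usage.get(model, 0) + request_tokens
        if projected <= limits[model] * block_ratio:
            return model
    # Every candidate is too close to its limit; the caller should wait.
    return None
```

Returning `None` rather than the least-loaded model is a deliberate choice in this sketch: it forces the caller to pause instead of retrying immediately against a model that is already near its limit.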

Example Prompts

  1. "Check if my current prompt for the 500-page research document will exceed the Gemini Flash rate limit and suggest a fallback model if it does."
  2. "Enable proactive throttling for my Claude-Sonnet integration and set a hard limit to block requests when usage exceeds 90% of my per-minute quota."
  3. "Summarize the last 5 logs from Token Guard to show me which processes are consuming the most tokens."

Tips & Limitations

To get the most out of Token Guard, configure your model-specific quotas correctly in the settings file. While the internal token estimator is highly accurate (especially for CJK languages), its counts are estimates and should not replace provider-side counting in billing audits. Note that this tool does not provide its own API keys; it relies on your existing credentials to perform the calls it validates. Avoid setting your 'block' threshold too low, as this may cause unnecessary service interruptions during periods of high agent activity.
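To show why a local estimate can differ from the provider's count, here is a rough character-based estimator. It is not Token Guard's estimator: the function names are hypothetical, and the ratios (about four non-CJK characters per token, about one token per CJK character) are common rules of thumb assumed for this sketch, not figures from the skill or any provider.

```python
import math


def is_cjk(ch):
    """Return True for characters in a few common CJK Unicode blocks."""
    code = ord(ch)
    return (
        0x4E00 <= code <= 0x9FFF       # CJK Unified Ideographs
        or 0x3040 <= code <= 0x30FF    # Hiragana and Katakana
        or 0xAC00 <= code <= 0xD7A3    # Hangul syllables
    )


def estimate_tokens(text, chars_per_token=4.0, tokens_per_cjk_char=1.0):
    """Estimate a token count from character classes (assumed ratios)."""
    cjk = sum(1 for ch in text if is_cjk(ch))
    other = len(text) - cjk
    # Round up so the estimate errs toward over-counting, which is the
    # safer direction when the result gates a rate limit.
    return math.ceil(other / chars_per_token + cjk * tokens_per_cjk_char)
```

An estimate like this is suitable for deciding whether to send a request, but real tokenizers split text differently per model, which is why provider-side counts remain the reference for billing.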

Metadata

Stars: 2387
Views: 2
Updated: 2026-03-09
Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-edmonddantesj-token-guard": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags (AI)

#rate-limit #token-management #cost-optimization #llm-guard
Safety Score: 4/5

Flags: external-api