OpenClaw Model Fallback Fails on 429 Rate Limit Errors
TL;DR: Quick Fix
The model fallback chain in OpenClaw doesn't trigger on 429 rate limit errors because the error handling for rate limiting isn't correctly integrated into the fallback logic. Ensure your OpenClaw version is up-to-date and check provider configurations.
Fix now, then reduce repeat incidents
If this issue keeps coming back, validate your setup in Doctor first, then harden your config.
Error Signal
Agent failed before reply: All models failed (1): google/gemini-3-flash-preview: Provider google is in cooldown (all profiles unavailable) (rate_limit).

What's Happening
Your OpenClaw agent hits a 429 rate limit error from its primary LLM, such as google/gemini-3-flash-preview. Instead of automatically trying the next model in your fallbacks list, it stops. You see an error like "All models failed (1)" even though multiple fallbacks are configured; only the primary model was ever attempted.
The Fix
This issue is complex and depends heavily on your OpenClaw version. The good news is that recent versions have improved rate limit handling.
1. Update OpenClaw: First, make sure you're on the latest stable version of OpenClaw. Many rate limit and fallback issues have been addressed in recent releases.

```shell
npm install -g openclaw@latest
```

or, if using Docker:

```shell
docker pull openclaw/openclaw:latest
```

2. Review Provider Configuration: Double-check your API keys and ensure your providers aren't being overused in ways that trigger aggressive rate limiting. Sometimes, aggressive retries from your end can actually cause the provider to rate-limit you more severely. Look at your open-apis configuration, specifically any retry or cooldown settings. You might need to make these less aggressive if you're hitting rate limits frequently.

3. Test Fallbacks Individually: Temporarily set your primary model to one of your fallback models and deliberately trigger a rate limit on that model (if possible in a test environment) to see whether the next fallback is attempted. This helps isolate whether the problem is with the primary model's error code or with the fallback chain itself.
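The behavior you are testing for in step 3 can be sketched in isolation. This is a minimal, hypothetical simulation, not OpenClaw's internals: the model list, the `RateLimitError` class, and the `call_model` function are stand-ins. The point is that a 429 from the primary should advance the chain to the next model rather than abort the whole request.

```python
# Hypothetical simulation of the desired fallback behavior (not OpenClaw
# source): a rate-limited primary should advance the chain, not end it.

class RateLimitError(Exception):
    """Stand-in for a provider SDK's HTTP 429 error."""

def call_model(model: str, prompt: str) -> str:
    # Pretend the primary is rate-limited and the first fallback works.
    if model == "google/gemini-3-flash-preview":
        raise RateLimitError("429 Too Many Requests")
    return f"[{model}] reply to: {prompt}"

def run_with_fallbacks(models: list[str], prompt: str) -> str:
    errors = []
    for model in models:
        try:
            return call_model(model, prompt)
        except RateLimitError as exc:
            # The key behavior: rate_limit errors must NOT stop the chain.
            errors.append(f"{model}: {exc}")
    raise RuntimeError(f"All models failed ({len(errors)}): " + "; ".join(errors))

reply = run_with_fallbacks(
    ["google/gemini-3-flash-preview", "anthropic/claude-sonnet"],  # stand-in names
    "hello",
)
print(reply)
```

If your real setup behaves like the buggy variant (raising "All models failed (1)" with fallbacks configured), the chain is stopping on the first rate_limit instead of iterating, which is exactly what updating OpenClaw is meant to fix.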
Why This Occurs
The core of the problem lies in how OpenClaw's error handling for HTTP 429 responses interacts with the model fallback mechanism. Previously, the rate_limit error might not have been correctly flagged as a reason to immediately proceed to the next model in the chain. The system might have treated it more like a temporary unavailability rather than a hard stop that should trigger a fallback. Updates to the underlying provider SDKs and OpenClaw's error parsing logic have aimed to fix this.
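One plausible shape of that bug can be sketched as an error-classification policy. This is an illustration under assumptions, not OpenClaw's actual code: if the classifier files a 429 under "cool down and retry the same provider later" instead of "fall back now", the chain never advances.

```python
# Sketch of how a misclassified 429 can defeat a fallback chain.
# Category names and policies here are illustrative, not OpenClaw's code.

def classify(status: int) -> str:
    if status == 429:
        return "rate_limit"
    if 500 <= status < 600:
        return "server_error"
    return "other"

# Buggy policy: rate_limit only triggers a provider cooldown, never a fallback.
BUGGY_FALLBACK_TRIGGERS = {"server_error"}

# Fixed policy: rate_limit also advances to the next model in the chain.
FIXED_FALLBACK_TRIGGERS = {"server_error", "rate_limit"}

def should_fall_back(status: int, triggers: set[str]) -> bool:
    return classify(status) in triggers

print(should_fall_back(429, BUGGY_FALLBACK_TRIGGERS))  # False: chain stalls
print(should_fall_back(429, FIXED_FALLBACK_TRIGGERS))  # True: next model is tried
```

Under the buggy policy, the provider goes into cooldown, every profile becomes unavailable, and the run fails with the "all profiles unavailable (rate_limit)" message from the Error Signal above.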
As one user noted, "The primary model's l…" (the comment is cut off, but the implication is that the error handling was specific to the primary model failing, rather than treating the failure as a trigger for fallback).
Prevention
- Monitor Provider Usage: Keep an eye on your API usage dashboards for each LLM provider. Proactive monitoring helps you anticipate rate limits before they impact your agents.
- Configure Fallbacks Wisely: Use a diverse set of fallback models. Don't rely on just one or two, and consider models from different providers with potentially different rate limiting policies.
- Implement Caching: If your agents perform repetitive tasks, implement caching to reduce the number of LLM calls, thus lowering the chance of hitting rate limits.
- Staggered Agent Runs: If you have many agents, stagger their start times or task execution to spread the load across LLM providers.
- Use the Latest OpenClaw: Stay updated. The OpenClaw team actively fixes these kinds of issues. Check the changelog regularly.
Last Updated: March 2026