cascadeflow
Set up CascadeFlow as an OpenClaw custom provider with fast, copy-paste steps. Use when users want quick install, preset selection (OpenAI-only, Anthropic-only, mixed), OpenClaw model alias setup, and safe production defaults for cascading with streaming and agent loops.
Why use this skill?
Optimize OpenClaw with CascadeFlow. Implement intelligent LLM cascading to lower API costs and reduce latency using preset routing strategies for OpenAI and Anthropic.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/saschabuehrle/cascadeflow

What This Skill Does
CascadeFlow is a high-performance optimization layer for OpenClaw that acts as a custom provider to reduce AI operational costs and latency. By implementing a sophisticated cascading architecture, it allows developers to route requests dynamically between different LLM providers (OpenAI, Anthropic, etc.) based on defined strategy presets. The tool exposes an OpenAI-compatible /v1/chat/completions endpoint, ensuring a seamless integration experience for OpenClaw agents without requiring complex infrastructure changes. It is specifically designed to maintain full support for streaming responses and multi-step agent loops, ensuring that cost-saving measures do not hinder agent performance or user experience.
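Because the endpoint mirrors the OpenAI chat-completions schema, any OpenAI-compatible client can talk to it. A minimal sketch of a request body and how you would send it — the model name `cascadeflow`, the port `8084`, and the bearer token are taken from the setup described below, and the token value is a placeholder:

```shell
# Build an OpenAI-compatible chat-completions request body.
cat > request.json <<'EOF'
{
  "model": "cascadeflow",
  "stream": false,
  "messages": [
    {"role": "user", "content": "Summarize this repo in one sentence."}
  ]
}
EOF

# Sanity-check that the body is valid JSON.
python3 -m json.tool request.json

# With the server from the Installation section running, send it like so
# (the bearer token is whatever you passed as the auth token at launch):
# curl -sS http://127.0.0.1:8084/v1/chat/completions \
#   -H "Authorization: Bearer $CASCADEFLOW_TOKEN" \
#   -H "Content-Type: application/json" \
#   -d @request.json
```

Set `"stream": true` to exercise the streaming path, which CascadeFlow passes through unchanged.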
Installation
To integrate CascadeFlow, ensure you have Python 3.10+ installed. Begin by cloning the repository from GitHub: git clone https://github.com/lemony-ai/cascadeflow.git. Navigate into the folder, create a virtual environment with python3 -m venv .venv and activate it. Install dependencies using pip install -e ".[openclaw,providers]". Once installed, select a preset configuration from examples/configs/—such as mixed-anthropic-openai.yaml—and define your ANTHROPIC_API_KEY and OPENAI_API_KEY in a .env file. Launch the server using the provided command targeting your chosen config and specifying auth tokens for security. Finally, map the baseUrl to http://127.0.0.1:8084/v1 in OpenClaw and set the model to cascadeflow.
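Condensed, the steps above look like this. The exact server launch command and its auth-token flag vary by version, so take them from the repo README rather than this sketch; the key values here are placeholders:

```shell
# Clone and enter the repository (Python 3.10+ required).
git clone https://github.com/lemony-ai/cascadeflow.git
cd cascadeflow

# Create and activate an isolated environment, then install with extras.
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[openclaw,providers]"

# Provide keys for the providers your chosen preset references.
cat > .env <<'EOF'
OPENAI_API_KEY=...
ANTHROPIC_API_KEY=...
EOF

# Launch with a preset such as examples/configs/mixed-anthropic-openai.yaml,
# using the server command and auth-token flag documented in the repo README.
```

Once the server is up, point OpenClaw's `baseUrl` at `http://127.0.0.1:8084/v1` and set the model to `cascadeflow`.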
Use Cases
CascadeFlow is ideal for high-scale agent deployments where throughput costs become prohibitive. It is perfect for developers building complex agentic workflows that require a mix of model capabilities—using a cheaper 'fast' model for routine tasks and escalating to more powerful models only when necessary. This cascading behavior is transparent to the agent loop, making it highly effective for RAG (Retrieval-Augmented Generation) pipelines, complex code generation tasks, and interactive chat sessions where latency spikes need to be strictly controlled.
Example Prompts
- "/model cflow; Analyze this codebase and suggest a refactor pattern that minimizes unnecessary re-runs."
- "/model cflow; Research the latest developments in modular AI architecture and summarize them in a concise report."
- "/model cflow; Help me iterate on this prompt chain to ensure the final output is formatted as valid JSON."
Tips & Limitations
Always run CascadeFlow behind a TLS reverse proxy if you intend to expose the service beyond 127.0.0.1. While the cascading logic significantly lowers costs, it relies heavily on the accuracy of your chosen preset. Start with the openai-only.yaml or anthropic-only.yaml to baseline your performance before experimenting with the mixed configurations. Note that CascadeFlow adds a small overhead; ensure your network latency between the proxy and the LLM providers is accounted for in your latency budgets.
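If you do expose the service beyond localhost, a typical nginx front looks like the fragment below — the hostname and certificate paths are placeholders:

```nginx
server {
    listen 443 ssl;
    server_name cascadeflow.example.com;                  # placeholder hostname

    ssl_certificate     /etc/ssl/certs/cascadeflow.pem;   # placeholder paths
    ssl_certificate_key /etc/ssl/private/cascadeflow.key;

    location / {
        proxy_pass http://127.0.0.1:8084;
        proxy_http_version 1.1;     # keep chunked/streamed responses working
        proxy_buffering off;        # don't buffer token streams
        proxy_read_timeout 300s;    # agent loops can hold connections open
    }
}
```

Disabling proxy buffering matters here: streamed completions arrive as incremental chunks, and a buffering proxy would turn them back into one delayed response.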
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-saschabuehrle-cascadeflow": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags: AI
Flags: network-access, external-api