What This Skill Does

The failover-gateway skill provides an active-passive high-availability setup for OpenClaw. A standby VPS monitors your primary instance. If the primary OpenClaw node becomes unreachable, the health monitor triggers an automated promotion sequence and the standby node takes over communication responsibilities. This design limits service downtime by keeping one instance of your agent reachable during a primary outage. It does not by itself protect the state of tasks that were in flight when the primary failed (see Tips & Limitations).

Installation

Provision a secondary, lightweight VPS and install the OpenClaw environment.
Configure Tailscale or a similar VPN to allow encrypted communication between your primary and standby nodes.
Run clawhub install openclaw/skills/skills/ember-claw/failover-gateway-pub on your standby machine.
Initialize your workspace repository using Git to ensure both nodes stay synced.
Modify your standby configuration so that it enables only its designated secondary channels, which avoids conflicts with the channels the primary owns.
Deploy the included systemd services to initiate the health monitor, which polls the primary node every 30 seconds.
Test by manually stopping the primary service to observe the standby promotion process.
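
The monitoring and promotion behavior described in the steps above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the 30-second interval comes from the text, while the HealthMonitor class, the probe and promote callables, and the threshold of three consecutive failures are assumptions introduced for the example.

```python
import time
from dataclasses import dataclass
from typing import Callable

POLL_INTERVAL_SECONDS = 30  # polling interval stated in the installation steps
FAILURE_THRESHOLD = 3       # assumption: the skill does not document a threshold


@dataclass
class HealthMonitor:
    """Hypothetical standby-side monitor: polls the primary, promotes on failure."""

    probe: Callable[[], bool]    # returns True when the primary answers
    promote: Callable[[], None]  # starts the standby's promotion sequence
    failure_threshold: int = FAILURE_THRESHOLD
    consecutive_failures: int = 0
    promoted: bool = False

    def tick(self) -> bool:
        """Run one poll and return True once the standby has been promoted."""
        if self.promoted:
            return True
        try:
            healthy = self.probe()
        except Exception:
            # A probe that raises (timeout, refused connection) counts as a failure.
            healthy = False
        if healthy:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.failure_threshold:
                self.promote()
                self.promoted = True
        return self.promoted

    def run(self, interval: float = POLL_INTERVAL_SECONDS, sleep=time.sleep) -> None:
        """Poll until promotion happens, sleeping between polls."""
        while not self.tick():
            sleep(interval)
```

Requiring several consecutive failures before promoting is a common way to avoid failing over on a single dropped packet. With the assumed values above, promotion would happen roughly 60 to 90 seconds after the primary stops responding.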

Use Cases

Mission-Critical Operations: Ideal for users managing automated trading or long-running tasks that cannot afford extended downtime.
Geographical Redundancy: Suited to deploying nodes in different regions to mitigate localized data center outages.
Disaster Recovery: Provides a clean, minimal-resource recovery point that avoids the complexity of load balancing by using a channel-splitting strategy.

Example Prompts

"OpenClaw, verify the current health status of my primary node and report if the failover-gateway is actively monitoring."
"Update my failover-gateway configuration to prioritize Discord notifications as the secondary channel during a primary outage."
"Show me the last timestamp when the standby node successfully polled the primary heartbeat."

Tips & Limitations

Channel Splitting: This is the most critical component. When the primary and standby own different channels, both nodes never answer on the same channel, which avoids split-brain conflicts without complex database synchronization.
Resource Allocation: You can save costs by running a smaller VPS for the standby, as it only needs enough power to handle essential recovery tasks.
Limitations: This skill does not synchronize memory state between nodes. If a task is mid-execution during a failover, the standby cannot resume it from the point of failure. The task has to be rerun, which is safe only if your workflow is idempotent.
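
The channel-splitting and idempotency tips above can be illustrated with two small helpers. This is a sketch under stated assumptions, not part of the skill: find_channel_conflicts and CompletionLedger are hypothetical names, the channel names other than Discord are placeholders, and the ledger file is assumed to live somewhere both nodes can read, such as the synced workspace.

```python
import json
from pathlib import Path
from typing import Callable, Iterable, List


def find_channel_conflicts(primary: Iterable[str], standby: Iterable[str]) -> List[str]:
    """Return channels enabled on both nodes; an empty list means a clean split."""
    return sorted(set(primary) & set(standby))


class CompletionLedger:
    """Hypothetical record of finished task IDs so a rerun can skip completed work."""

    def __init__(self, path: str) -> None:
        self.path = Path(path)

    def _load(self) -> set:
        if not self.path.exists():
            return set()
        return set(json.loads(self.path.read_text()))

    def run_once(self, task_id: str, action: Callable[[], None]) -> bool:
        """Run action unless task_id is already recorded; return True if it ran."""
        done = self._load()
        if task_id in done:
            return False
        action()
        # The ID is recorded only after the action succeeds. A crash between the
        # two steps means the action may run again, so it must be safe to repeat.
        done.add(task_id)
        self.path.write_text(json.dumps(sorted(done)))
        return True
```

A conflict check such as find_channel_conflicts(["telegram", "slack"], ["discord", "slack"]) would flag "slack" as owned by both nodes. The ledger only helps if its file reaches the standby before the primary fails, so it narrows the window for duplicate work rather than closing it.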
