Official Verified system Safety 2/5

Incident Response

Skill by chunhualiao

Why use this skill?

Automate system recovery with the Incident Response skill for OpenClaw. Follow a structured 7-phase SRE loop to diagnose failures, verify root causes, and prevent regressions.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/chunhualiao/incident-response

What This Skill Does

The Incident Response skill gives the OpenClaw agent a standardized framework for troubleshooting and resolving system failures. It mandates a strict 7-phase loop (Triage, Evidence, 5 Whys, Restore, Prevent, Monitor, and Document) so that no issue is closed before its root cause is understood. This discourages 'band-aid' fixes and means that system regressions, configuration losses, and gateway crashes are handled systematically. The agent works from configuration backups, git audit trails, and session logs, acting as a site reliability engineer (SRE) within your environment.

Installation

To install this skill, use the ClawHub command-line interface: clawhub install openclaw/skills/skills/chunhualiao/incident-response

Use Cases

Use this skill when you detect unexpected behavior such as missing configuration settings, lost agent bindings, or complete system unresponsiveness. It is particularly suited to post-mortem analysis after a gateway crash and to investigating unauthorized changes that disrupted production workflows. For both silent failures and explicit errors, the skill forces the agent to verify the current state before attempting any remedial action.

Example Prompts

"Investigate: my gateway crashed twice this morning, and half my agent bindings have disappeared. Please look into this."
"Something changed in the routing logic yesterday and now my production agents are not responding. Can you perform a root cause analysis?"
"The config settings for the secure-gateway agent disappeared after the last update. Fix this and ensure it doesn't happen again."

Tips & Limitations

Strict Sequencing: Never skip phases. The skill is designed to force the agent through a logical deduction flow. If you try to force 'Restore' before 'Evidence', the agent will resist and guide you back to the proper phase.
Evidence Priority: Always allow the agent to run the provided diagnostic commands (rg, git log, and the Python script), as they are the primary source of truth.
Manual Intervention: The skill assumes the agent already has SSH access to your remote hosts. Make sure the agent is authenticated to the environment before triggering an investigation.
Non-Destructive First: The Triage phase (Phase 0) is designed to be read-only to avoid compounding errors during a live system outage.
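
The sequencing rule in these tips can be pictured as a small gate that refuses any phase other than the next one in order. The sketch below is illustrative only and is not the skill's actual implementation: the names Phase, IncidentLoop, and PhaseOrderError are hypothetical, and the only facts taken from this page are the seven phase names, their order, and that Triage (Phase 0) is read-only.

```python
from enum import IntEnum
from typing import List, Optional


class Phase(IntEnum):
    """The seven phases named on this page, numbered from 0."""
    TRIAGE = 0
    EVIDENCE = 1
    FIVE_WHYS = 2
    RESTORE = 3
    PREVENT = 4
    MONITOR = 5
    DOCUMENT = 6


# Only Triage is documented as read-only; treating every later phase as
# write-capable is an assumption made for this sketch.
READ_ONLY_PHASES = {Phase.TRIAGE}


class PhaseOrderError(RuntimeError):
    """Raised when a phase is requested out of order (hypothetical name)."""


class IncidentLoop:
    """Tracks completed phases and rejects any attempt to skip ahead."""

    def __init__(self) -> None:
        self.completed: List[Phase] = []

    @property
    def next_phase(self) -> Optional[Phase]:
        """Return the phase that must run next, or None once all are done."""
        if len(self.completed) == len(Phase):
            return None
        return Phase(len(self.completed))

    def run(self, phase: Phase) -> None:
        """Mark `phase` as done, but only if it is the next one in order."""
        expected = self.next_phase
        if phase != expected:
            wanted = expected.name if expected is not None else "none (loop finished)"
            raise PhaseOrderError(
                f"cannot run {phase.name}; next required phase is {wanted}"
            )
        self.completed.append(phase)

    @staticmethod
    def allows_writes(phase: Phase) -> bool:
        """Report whether a phase may modify the system in this sketch."""
        return phase not in READ_ONLY_PHASES
```

With this gate, calling run(Phase.RESTORE) on a fresh IncidentLoop raises PhaseOrderError, while running the phases in order from TRIAGE to DOCUMENT succeeds.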

Metadata

Author: @chunhualiao

Stars: 3562

Updated: 2026-03-29

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-chunhualiao-incident-response": {
      "enabled": true,
      "auto_update": true
    }
  }
}
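
If clawhub.json already lists other plugins, pasting the snippet by hand risks overwriting them. The sketch below shows one way to merge the entry instead. It assumes that clawhub.json is plain JSON with a top-level "plugins" object, as the snippet above implies; the helper names enable_plugin and update_config_file are hypothetical and not part of ClawHub.

```python
import json
from pathlib import Path

# Plugin key copied from the configuration snippet on this page.
PLUGIN_KEY = "official-chunhualiao-incident-response"


def enable_plugin(config: dict) -> dict:
    """Return a copy of `config` with the plugin entry added.

    Existing entries under "plugins" are preserved; the input is not mutated.
    """
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))
    plugins[PLUGIN_KEY] = {"enabled": True, "auto_update": True}
    merged["plugins"] = plugins
    return merged


def update_config_file(path: Path) -> None:
    """Read `path` (or start from an empty config), add the entry, write back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(enable_plugin(config), indent=2) + "\n")
```

A call such as update_config_file(Path("clawhub.json")) would apply the change to a file in the current directory; the real location of the file depends on your setup.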

Tags(AI)

#incident-response #troubleshooting #devops #sre #system-maintenance

Safety Score: 2/5

Flags: file-write, file-read, code-execution, network-access

Related Skills

claude-usage

Check Claude Max plan usage limits by launching Claude Code and running /usage. Use when the user asks about Claude plan usage, remaining quota, rate limits, or sends /claude_usage.

chunhualiao 3562

save-to-obsidian

Saves markdown content to remote Obsidian vault via SSH

chunhualiao 3562

task-runner

Persistent task queue system. Users add tasks at any time via natural language; tasks are stored in a single persistent queue file and executed asynchronously via subagents. A heartbeat/cron dispatcher wakes periodically to check pending tasks, spawn workers, and report completions. The system never "finishes" — it always remains ready for the next task.

chunhualiao 3562

openclaw-docker-setup

Install and configure a fully operational Dockerized OpenClaw instance on macOS from scratch. Includes browser pairing, Discord channel setup, and optional Gmail/Google Drive integration. Use when user asks to "install openclaw docker", "set up dockerized openclaw", "openclaw in docker", or "isolated openclaw instance".

chunhualiao 3562

skill-releaser

Release skills to ClawHub through the full publication pipeline: auto-scaffolding, OPSEC scan, dual review (agent + user), force-push release, security scan verification. Use when releasing a skill, preparing a skill for release, reviewing a skill for publication, or checking release readiness.

chunhualiao 3562