Official Verified system Safety 2/5

Incident Response

Skill by chunhualiao

Why use this skill?

Automate system recovery with the Incident Response skill for OpenClaw. Follow a structured 7-phase SRE loop to diagnose failures, verify root causes, and prevent regressions.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/chunhualiao/incident-response

What This Skill Does

The Incident Response skill gives the OpenClaw agent a standardized framework for troubleshooting and resolving system failures. It mandates a strict 7-phase loop (Triage, Evidence, 5 Whys, Restore, Prevent, Monitor, and Document) so that no issue is closed before its root cause is understood. This discourages 'band-aid' fixes and ensures that regressions, configuration losses, and gateway crashes are handled systematically. Working from configuration backups, git audit trails, and session logs, the agent acts as a site reliability engineer (SRE) within your environment.
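
The strict ordering of the loop can be illustrated with a small sketch. This is not the skill's own implementation: the `Phase` enum, the `IncidentLoop` class, and the numbering from 0 are hypothetical names and assumptions chosen only to show how an ordered seven-phase loop might refuse to skip ahead.

```python
from __future__ import annotations

from enum import IntEnum


class Phase(IntEnum):
    """The seven phases named above; numbering them from 0 is an assumption."""

    TRIAGE = 0
    EVIDENCE = 1
    FIVE_WHYS = 2
    RESTORE = 3
    PREVENT = 4
    MONITOR = 5
    DOCUMENT = 6


class IncidentLoop:
    """Hypothetical tracker that only lets phases complete in order."""

    def __init__(self) -> None:
        self.completed: list[Phase] = []

    def next_phase(self) -> Phase | None:
        """Return the phase that must run next, or None once all are done."""
        if len(self.completed) == len(Phase):
            return None
        return Phase(len(self.completed))

    def complete(self, phase: Phase) -> None:
        """Mark a phase as done; reject anything but the expected phase."""
        expected = self.next_phase()
        if expected is None:
            raise ValueError("all phases are already complete")
        if phase != expected:
            raise ValueError(
                f"cannot run {phase.name} yet; {expected.name} must come first"
            )
        self.completed.append(phase)
```

Calling `IncidentLoop().complete(Phase.RESTORE)` on a fresh loop raises `ValueError`, because Triage has not been completed yet.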

Installation

To install this skill, use the ClawHub command-line interface: clawhub install openclaw/skills/skills/chunhualiao/incident-response

Use Cases

Use this skill when you detect unexpected behavior such as missing configuration settings, lost agent bindings, or complete system unresponsiveness. It is particularly effective for post-mortem analysis after a gateway crash, or when investigating unauthorized changes that disrupted production workflows. Whether the failure is silent or produces explicit errors, the skill forces the agent to verify the current state before attempting any remedial action.

Example Prompts

  1. "Investigate: my gateway crashed twice this morning, and half my agent bindings have disappeared. Please look into this."
  2. "Something changed in the routing logic yesterday and now my production agents are not responding. Can you perform a root cause analysis?"
  3. "The config settings for the secure-gateway agent disappeared after the last update. Fix this and ensure it doesn't happen again."

Tips & Limitations

  • Strict Sequencing: Never skip phases. The skill forces the agent through the phases in order. If you push for 'Restore' before 'Evidence', the agent will resist and guide you back to the proper phase.
  • Evidence Priority: Always allow the agent to run the provided diagnostic commands (rg, git log, and a Python script), as they are the primary source of truth.
  • Manual Intervention: The skill assumes the agent has SSH access to your remote hosts. Make sure the agent is authenticated to the environment before starting an investigation.
  • Non-Destructive First: The Triage phase (Phase 0) is designed to be read-only to avoid compounding errors during a live system outage.
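
As a rough illustration of read-only evidence gathering with `git log` and `rg`, the sketch below builds the commands without running them. The helper names, the search pattern, and the `config` directory are hypothetical; the commands the skill actually ships may differ.

```python
from __future__ import annotations

import shlex


def build_evidence_commands(
    pattern: str, config_dir: str, max_commits: int = 20
) -> list[list[str]]:
    """Return read-only diagnostic commands as argv lists (nothing is executed)."""
    return [
        # Recent commit history limited to the configuration directory.
        ["git", "log", "--oneline", "-n", str(max_commits), "--", config_dir],
        # Search the directory for the pattern and show line numbers.
        ["rg", "--line-number", "-e", pattern, config_dir],
    ]


def render(commands: list[list[str]]) -> list[str]:
    """Render argv lists as shell-quoted strings for review before running."""
    return [shlex.join(cmd) for cmd in commands]


if __name__ == "__main__":
    for line in render(build_evidence_commands("agent bindings", "config")):
        print(line)
```

Keeping the commands as argv lists means they can be reviewed first and later passed to `subprocess.run` without shell interpolation, which fits the read-only intent of the Triage phase.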

Metadata

Stars: 3562
Views: 6
Updated: 2026-03-29

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-chunhualiao-incident-response": {
      "enabled": true,
      "auto_update": true
    }
  }
}
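
If clawhub.json already contains other entries, pasting the snippet by hand risks overwriting them. The following is a minimal sketch of merging the entry programmatically; it assumes the file is a plain JSON object shaped like the snippet above, and the `enable_plugin` helper is hypothetical rather than part of ClawHub.

```python
import json
from pathlib import Path

PLUGIN_ID = "official-chunhualiao-incident-response"


def enable_plugin(config_path: Path, plugin_id: str = PLUGIN_ID) -> dict:
    """Add the plugin entry to the config file, preserving existing keys."""
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    # Create the "plugins" section only if it is missing.
    plugins = config.setdefault("plugins", {})
    plugins[plugin_id] = {"enabled": True, "auto_update": True}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, `enable_plugin(Path("clawhub.json"))` leaves any other plugin entries in place. An empty or malformed file raises an error from `json.loads` instead of being silently replaced.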

Tags (AI)

#incident-response #troubleshooting #devops #sre #system-maintenance
Safety Score: 2/5

Flags: file-write, file-read, code-execution, network-access