Official Verified system Safety 5/5

Delx Ops Guardian

Skill by davidmosiah

Why use this skill?

Automate your incident response and operational recovery for OpenClaw agents with the Delx Ops Guardian skill. Ensure stability with safe, human-approved service management.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/davidmosiah/delx-ops-guardian

What This Skill Does

The Delx Ops Guardian is a runbook-focused skill for OpenClaw agents that handles incident response and service recovery in production environments under the principle of least privilege. It is not a general-purpose management tool: its scope is limited to service instability, cron job health, and memory-related bottlenecks. Every remedial action, such as restarting a gateway or disabling a looping cron job, is drawn from a predefined set of safe commands, and the skill must classify an incident from evidence before it intervenes. Significant actions, such as repeated service restarts or schedule changes, require explicit human approval, which makes the skill suited to teams that prioritize uptime and operational stability.
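The three checks described above (allowlisted command, evidence before action, human approval for significant actions) can be pictured as a small gate function. This is only an illustrative sketch: the action names, the `gate` function, and the `Decision` type are hypothetical and are not taken from the skill's actual implementation, which this listing does not show.

```python
from dataclasses import dataclass

# Hypothetical allowlist; the skill's real command set is not shown in this listing.
ALLOWED_ACTIONS = {"restart_gateway", "disable_cron_job", "collect_logs"}
# Hypothetical subset of actions that change system state and need sign-off.
NEEDS_APPROVAL = {"restart_gateway", "disable_cron_job"}


@dataclass
class Decision:
    allowed: bool
    reason: str


def gate(action: str, evidence: list[str], approved: bool = False) -> Decision:
    """Decide whether a remedial action may run."""
    # 1. Anything outside the predefined set is rejected outright.
    if action not in ALLOWED_ACTIONS:
        return Decision(False, "action is outside the allowlist")
    # 2. No intervention without evidence attached to the incident.
    if not evidence:
        return Decision(False, "no evidence attached to the incident")
    # 3. State-changing actions wait for an explicit human approval.
    if action in NEEDS_APPROVAL and not approved:
        return Decision(False, "waiting for human approval")
    return Decision(True, "ok")
```

In this sketch a read-only action such as `collect_logs` passes once evidence exists, while `restart_gateway` stays blocked until `approved=True` is supplied by an operator.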

Installation

To integrate this skill into your OpenClaw environment, execute the following command in your terminal:

clawhub install openclaw/skills/skills/davidmosiah/delx-ops-guardian

Ensure that your agent instance has the necessary permissions to read logs and monitor local service states as defined in the skill scope.

Use Cases

  • Service Recovery: Automatically detecting when the openclaw-gateway or core services become unresponsive or enter a 'degraded' state.
  • Cron Monitoring: Identifying cron jobs that are failing repeatedly and disabling the problematic job to prevent system resource exhaustion.
  • Memory Management: Analyzing memory pressure logs and taking preventative measures to stabilize production agents during peak load.
  • Incident Reporting: Standardizing the post-incident documentation process by providing consistent reports with root-cause identification and evidence-backed resolution paths.
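As a rough illustration of the first two use cases, "failing repeatedly" and "flapping" can each be reduced to a simple count over recent history. The function names and the threshold of three below are assumptions chosen for the example; the listing does not state how the skill itself makes these calls.

```python
def consecutive_failures(exit_codes: list[int]) -> int:
    """Count failed runs at the end of a job's history (most recent last)."""
    count = 0
    for code in reversed(exit_codes):
        if code == 0:
            break  # the streak ends at the latest successful run
        count += 1
    return count


def should_propose_disable(exit_codes: list[int], threshold: int = 3) -> bool:
    """Propose disabling a cron job once its failure streak reaches the threshold."""
    return consecutive_failures(exit_codes) >= threshold


def count_transitions(states: list[str]) -> int:
    """Count state changes in a service's status history; many changes suggest flapping."""
    return sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)
```

The helper only proposes a disable. Under the skill's human-approval rule, the action itself would still wait for confirmation.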

Example Prompts

  1. "The openclaw-gateway service is flapping and causing downstream timeouts. Please investigate, stabilize, and report on the recovery process."
  2. "A cron job is loop-failing in the production environment. Identify the job, disable it, and verify that the system returns to a healthy status."
  3. "Memory guard triggers are firing on our production node. Check the logs, confirm the root cause, and apply a temporary operational fix if safe to do so."

Tips & Limitations

  • Strict Scope: This skill is a 'runbook-only' tool. It does not allow for package management, firewall configuration, or credential modification.
  • Human Approval: Always be prepared to provide manual confirmation for actions that impact system state, such as service restarts or schedule adjustments.
  • Evidence First: The skill relies on local logs. Ensure your agent has appropriate permissions to read from journalctl and internal workspaces. If logs are rotated or deleted, the skill may lack the context to diagnose incidents effectively.
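To check ahead of time that an agent can actually read the logs it needs, a short script along these lines may help. It uses standard `journalctl` options (`-u`, `--since`, `--no-pager`, `-o json`); the unit name `openclaw-gateway` is taken from the use cases above and may differ on your host, and the helper names are made up for this sketch.

```python
import json
import subprocess


def journalctl_command(unit: str, since: str = "-1h") -> list[str]:
    """Build a journalctl invocation for one unit over a recent time window."""
    return ["journalctl", "-u", unit, "--since", since, "--no-pager", "-o", "json"]


def parse_entries(raw: str) -> list[dict]:
    """Parse `journalctl -o json` output, which is one JSON object per line."""
    return [json.loads(line) for line in raw.splitlines() if line.strip()]


def read_unit_logs(unit: str, since: str = "-1h") -> list[dict]:
    """Run journalctl and return parsed entries; raises if the command fails."""
    result = subprocess.run(
        journalctl_command(unit, since),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_entries(result.stdout)
```

Calling `read_unit_logs("openclaw-gateway")` on the target host and getting an empty list back is a hint that logs were rotated or that the agent lacks read permission, which is the situation the last tip warns about.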

Metadata

Stars: 2387
Views: 1
Updated: 2026-03-09

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-davidmosiah-delx-ops-guardian": {
      "enabled": true,
      "auto_update": true
    }
  }
}
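If clawhub.json already lists other plugins, pasting the snippet over the file would drop them. A merge along the following lines keeps existing entries intact. It assumes only the structure visible in the snippet above (a top-level `plugins` object keyed by plugin name); the helper names are illustrative, and the real file may contain fields this sketch does not know about.

```python
import json
from pathlib import Path

PLUGIN_KEY = "official-davidmosiah-delx-ops-guardian"


def enable_plugin(config: dict) -> dict:
    """Return a copy of the config with the plugin entry added or updated."""
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))
    entry = dict(plugins.get(PLUGIN_KEY, {}))
    # Same two settings as the snippet above; other keys on the entry are kept.
    entry.update({"enabled": True, "auto_update": True})
    plugins[PLUGIN_KEY] = entry
    merged["plugins"] = plugins
    return merged


def update_file(path: Path) -> None:
    """Read clawhub.json (or start from an empty config), merge, and write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(enable_plugin(config), indent=2) + "\n")
```

For example, `update_file(Path("clawhub.json"))` run from the directory that holds the file would add the entry without touching other plugins.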

Tags (AI)

#ops #incident-response #sre #recovery
Safety Score: 5/5

Flags: file-read, code-execution