devops-ops-bot
Server health monitoring with alerts and auto-recovery. Checks CPU, memory, disk, and uptime against configurable thresholds. Sends Slack/Discord alerts and can auto-restart services when a check reaches a critical state.
Why use this skill?
Monitor server CPU, memory, and disk usage with the devops-ops-bot. Get real-time alerts and trigger automated service restarts for your infrastructure.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/gruted/devops-ops-bot
What This Skill Does
The devops-ops-bot is a lightweight command-line interface (CLI) tool for server health monitoring. As an OpenClaw AI agent skill, it lets the agent check the vital signs of your infrastructure: CPU load, memory utilization, disk usage, and system uptime. Readings are compared against configurable thresholds and reported as 'ok', 'warn', or 'crit'.
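The three-state classification described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's implementation: the `Threshold` class, the `classify` and `overall` helpers, and the numeric limits are all assumptions made for the example, and the bot's actual defaults may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Hypothetical warn/crit limits for one metric, in percent."""
    warn: float
    crit: float


def classify(value: float, threshold: Threshold) -> str:
    """Map a metric reading to 'ok', 'warn', or 'crit'."""
    if value >= threshold.crit:
        return "crit"
    if value >= threshold.warn:
        return "warn"
    return "ok"


def overall(states: list[str]) -> str:
    """Return the worst individual state; an empty list counts as 'ok'."""
    order = {"ok": 0, "warn": 1, "crit": 2}
    return max(states, key=order.__getitem__, default="ok")


# Illustrative limits only; these are not the tool's documented defaults.
THRESHOLDS = {
    "cpu": Threshold(warn=80.0, crit=95.0),
    "memory": Threshold(warn=85.0, crit=95.0),
    "disk": Threshold(warn=80.0, crit=90.0),
}

readings = {"cpu": 42.0, "memory": 88.5, "disk": 91.0}
states = {name: classify(value, THRESHOLDS[name]) for name, value in readings.items()}
print(states, overall(list(states.values())))
```

Checking the critical limit first keeps the logic correct even when a reading exceeds both limits, and reporting the worst state gives a single status for the whole host.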
Beyond monitoring, the bot supports automated incident response. It can send alerts to Slack or Discord through webhooks, so your team is notified when a threshold is breached. It also supports auto-recovery workflows: when a critical state is detected, the agent can trigger a custom command, such as a service restart via systemctl. Its JSON output can be fed into existing log aggregation pipelines.
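As a rough picture of what an alert-then-recover step involves, the following Python sketch posts a message to a webhook and runs a restart command only on 'crit'. The function names and the `run` parameter are hypothetical and exist only for this example; the bot's own handling of webhooks and of --restart-cmd may differ. The payload keys reflect the usual webhook conventions: Slack incoming webhooks read `text`, Discord webhooks read `content`.

```python
import json
import shlex
import subprocess
import urllib.request


def build_payload(kind: str, message: str) -> bytes:
    """Build the JSON body: Slack reads 'text', Discord reads 'content'."""
    key = {"slack": "text", "discord": "content"}[kind]
    return json.dumps({key: message}).encode("utf-8")


def send_alert(webhook_url: str, kind: str, message: str) -> int:
    """POST the alert and return the HTTP status; urlopen raises on HTTP errors."""
    request = urllib.request.Request(
        webhook_url,
        data=build_payload(kind, message),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def recover(state: str, restart_cmd: str, run=subprocess.run) -> bool:
    """Run the restart command only when the state is 'crit'.

    Returns True if the command ran and exited with status 0.
    """
    if state != "crit":
        return False
    # shlex.split avoids passing the command through a shell.
    result = run(shlex.split(restart_cmd), check=False)
    return result.returncode == 0
```

A caller might use `recover(state, "systemctl restart nginx")` after each check. Splitting the command into an argument list rather than invoking a shell narrows what a misconfigured command string can do.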
Installation
To integrate this skill into your environment via the OpenClaw ecosystem, execute the following command in your terminal:
clawhub install openclaw/skills/skills/gruted/devops-ops-bot
Alternatively, you can install it globally via npm using npm install -g @gruted/devops-ops-bot or by using the provided one-liner installation script. Docker images are also available under ghcr.io/gruted/devops-ops-bot:latest for ephemeral or containerized monitoring tasks.
Use Cases
- Automated Service Recovery: Automatically restart a misbehaving Nginx or database service when CPU or memory consumption passes a critical threshold, reducing manual intervention.
- Performance Trending: Use the JSON output feature to feed system stats into a centralized dashboard, helping you identify slow performance degradations over time.
- Alert Fatigue Management: Configure custom warnings to receive low-priority notifications for memory spikes, while reserving critical alerts for total system failure or service outages.
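For the performance-trending use case, a downstream script could look roughly like the sketch below. It assumes one JSON object per log line with a `memory_percent` field; that field name and the sample values are hypothetical, so check the tool's real JSON output before relying on them. The helpers `rolling_means` and `is_degrading` are likewise illustrative.

```python
import json
from statistics import mean


def rolling_means(values: list[float], window: int) -> list[float]:
    """Mean of each consecutive window-sized slice of values."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [mean(values[i:i + window]) for i in range(len(values) - window + 1)]


def is_degrading(values: list[float], window: int, min_rise: float) -> bool:
    """True when the last rolling mean exceeds the first by at least min_rise."""
    means = rolling_means(values, window)
    return len(means) >= 2 and means[-1] - means[0] >= min_rise


# Hypothetical log lines; the real JSON schema may use different field names.
log_lines = [
    '{"memory_percent": 61.0}',
    '{"memory_percent": 63.0}',
    '{"memory_percent": 66.0}',
    '{"memory_percent": 70.0}',
    '{"memory_percent": 75.0}',
]
memory = [json.loads(line)["memory_percent"] for line in log_lines]
print(rolling_means(memory, window=3), is_degrading(memory, window=3, min_rise=5.0))
```

Comparing smoothed averages rather than raw readings makes a slow upward drift visible without reacting to a single noisy sample.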
Example Prompts
- "OpenClaw, run a health check on my local server and report the status. If any metric is critical, restart the nginx service immediately."
- "Monitor the current CPU and disk usage thresholds and send an alert to my Slack webhook if memory usage exceeds 85%."
- "Set up a cron job to perform a health check every 5 minutes and output the data in JSON format so I can track the performance logs."
Tips & Limitations
- Security: Since this skill can execute shell commands (e.g., via --restart-cmd), run the OpenClaw agent as a user that has the permissions the restart command needs but is restricted enough to avoid unintended system-wide impact.
- Alerting: Always verify your webhook URLs. With an incorrect URL, notifications fail silently.
- Threshold Tuning: Start with the default thresholds before narrowing them down; setting thresholds too aggressively may lead to flapping services where a process is restarted unnecessarily during minor, transient spikes.
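One common way to reduce the flapping described above is to require several consecutive critical readings before restarting, followed by a cooldown. The sketch below shows that idea in Python. It is an assumption about how a wrapper around the bot could behave, not a feature the tool is documented to have; `RestartGate` and its parameters are hypothetical names.

```python
class RestartGate:
    """Allow a restart only after N consecutive 'crit' readings, then cool down."""

    def __init__(self, required_crit: int = 3, cooldown_checks: int = 5) -> None:
        self.required_crit = required_crit
        self.cooldown_checks = cooldown_checks
        self._crit_streak = 0
        self._cooldown_left = 0

    def should_restart(self, state: str) -> bool:
        """Feed one health-check result; return True when a restart is due."""
        if self._cooldown_left > 0:
            # Ignore readings while the service settles after a restart.
            self._cooldown_left -= 1
            self._crit_streak = 0
            return False
        self._crit_streak = self._crit_streak + 1 if state == "crit" else 0
        if self._crit_streak >= self.required_crit:
            self._crit_streak = 0
            self._cooldown_left = self.cooldown_checks
            return True
        return False


gate = RestartGate(required_crit=3, cooldown_checks=2)
history = ["crit", "ok", "crit", "crit", "crit", "crit", "crit", "crit", "crit", "crit"]
print([gate.should_restart(state) for state in history])
```

A single transient spike resets the streak and never triggers a restart, while the cooldown keeps a persistently unhealthy service from being restarted on every check.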
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-gruted-devops-ops-bot": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Flags: network-access, external-api, code-execution