Official Verified developer tools Safety 4/5

aws-ecs-monitor

AWS ECS production health monitoring with CloudWatch log analysis — monitors ECS service health, ALB targets, SSL certificates, and provides deep CloudWatch log analysis for error categorization, restart detection, and production alerts.

skill-install — Terminal

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/briancolinger/aws-ecs-monitor

Download Source Code (.zip)

What This Skill Does

The aws-ecs-monitor is a robust diagnostic agent skill designed to bridge the gap between AWS ECS infrastructure and actionable production intelligence. It serves as a comprehensive monitoring suite that performs real-time health checks on your containerized services, including ALB target health and SSL certificate validation. Beyond simple connectivity, this skill integrates directly with AWS CloudWatch to perform automated log analysis. It identifies critical production issues such as OOM (Out-of-Memory) errors, application panics, timeouts, and 5xx status code surges. By providing both a high-level health status and a deep-dive diagnostic interface, it enables users to identify not just that a service is failing, but exactly why.

Installation

You can install the skill directly into your OpenClaw environment using the following command: clawhub install openclaw/skills/skills/briancolinger/aws-ecs-monitor

Once installed, ensure your local environment is configured with the aws CLI and appropriate IAM permissions. The skill requires ecs:ListServices, ecs:DescribeServices, elasticloadbalancing:DescribeTargetGroups, elasticloadbalancing:DescribeTargetHealth, logs:FilterLogEvents, and logs:DescribeLogGroups to function effectively across your infrastructure.

Use Cases

Production Outage Response: Automatically trigger the auto-diagnose command when an alert is received to instantly see if the cause is a code crash or an environment-specific error.
Deployment Health Validation: After a new deployment, run the health monitor to ensure all tasks are passing target group health checks and that no recent restarts are occurring in logs.
Trend Analysis: Analyze log summaries over the last 120 minutes to identify intermittent 5xx errors that might be affecting user experience during peak traffic.

Example Prompts

"Check the current production health for all services in the 'my-api-cluster' and provide a summary of any active alerts."
"I noticed 503 errors on the checkout service. Please run a deep-dive analysis on the logs for the last 30 minutes to identify the root cause."
"Is the SSL certificate for my-production-domain.com valid, and are there any container restart events in the last hour?"

Tips & Limitations

Permissions: Ensure your IAM user or role has the minimum read-only permissions requested to avoid failures during the Describe phases.
Performance: Deep analysis on large log groups can be time-consuming; use the --minutes flag to limit the window for faster results.
Configuration: Always set your ECS_CLUSTER and ECS_REGION correctly in your environment variables to ensure the shell scripts point to the correct AWS resource set.

Read Full Documentation on GitHub

Metadata

Author@briancolinger

Stars4190

Updated2026-04-18

View Author Profile

AI Skill Finder

Not sure this is the right skill?

Describe what you want to build — we'll match you to the best skill from 16,000+ options.

Find the right skill

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-briancolinger-aws-ecs-monitor": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags(AI)

#aws#ecs#monitoring#cloudwatch#devops

Safety Score: 4/5

Flags: network-access, file-write, file-read, external-api, code-execution

Related Skills

pr-reviewer

Automated GitHub PR code review with diff analysis, lint integration, and structured reports. Use when reviewing pull requests, checking for security issues, error handling gaps, test coverage, or code style problems. Supports Go, Python, and JavaScript/TypeScript. Requires `gh` CLI authenticated with repo access.

briancolinger 4190

email-triage

IMAP email scanning and triage with AI classification via a local Ollama LLM. Scans unread emails, categorizes them as urgent, needs-response, informational, or spam, and surfaces important messages for agent consumption. Works standalone with heuristic fallback — Ollama optional but recommended.

briancolinger 4190

dreaming

Creative exploration during quiet hours. Turns idle heartbeat time into freeform thinking — hypotheticals, future scenarios, reflections, unexpected connections. Use when you want your agent to do something meaningful during low-activity periods instead of just returning HEARTBEAT_OK. Outputs written to files for human review later (like remembering dreams in the morning).

briancolinger 4190