aws-ecs-monitor
AWS ECS production health monitoring with CloudWatch log analysis — monitors ECS service health, ALB targets, SSL certificates, and provides deep CloudWatch log analysis for error categorization, restart detection, and production alerts.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/briancolinger/aws-ecs-monitor
What This Skill Does
aws-ecs-monitor is a diagnostic agent skill that turns AWS ECS infrastructure state into actionable production diagnostics. It runs real-time health checks on your containerized services, including ALB target health and SSL certificate validation. Beyond connectivity checks, it integrates with AWS CloudWatch to analyze logs automatically, identifying critical production issues such as OOM (out-of-memory) errors, application panics, timeouts, and surges in 5xx status codes. By combining a high-level health status with a deep-dive diagnostic interface, it shows not only that a service is failing but also why.
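The error categorization described above can be sketched as a small rule table applied to each log line. This is a minimal illustration, not the skill's actual implementation: the regular expressions, the category names, and the `status=NNN` log format are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical patterns: the skill's real matching rules are not documented here.
# Order matters, because the first matching category wins.
PATTERNS = {
    "oom": re.compile(r"out of memory|OutOfMemory|\bOOM\b", re.IGNORECASE),
    "panic": re.compile(r"\bpanic\b|fatal error", re.IGNORECASE),
    "timeout": re.compile(r"timed? ?out|deadline exceeded", re.IGNORECASE),
    # Assumes access logs carry a "status=NNN" field.
    "http_5xx": re.compile(r"\bstatus[=:]\s?5\d{2}\b", re.IGNORECASE),
}


def categorize(line: str) -> str:
    """Return the first category whose pattern matches, or "other"."""
    for name, pattern in PATTERNS.items():
        if pattern.search(line):
            return name
    return "other"


def summarize(lines) -> Counter:
    """Count how many log lines fall into each category."""
    return Counter(categorize(line) for line in lines)


if __name__ == "__main__":
    sample = [
        "fatal: container killed: out of memory",
        "panic: runtime error: nil pointer dereference",
        "upstream request timed out after 30s",
        "GET /checkout status=503 12ms",
        "GET /health status=200 1ms",
    ]
    for category, count in summarize(sample).items():
        print(category, count)
```

Putting the patterns in an ordered table keeps the rules easy to audit and lets a line that matches several patterns be counted once, under the most severe category listed first.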
Installation
You can install the skill directly into your OpenClaw environment using the following command:
clawhub install openclaw/skills/skills/briancolinger/aws-ecs-monitor
Once installed, ensure your local environment is configured with the aws CLI and appropriate IAM permissions. The skill requires ecs:ListServices, ecs:DescribeServices, elasticloadbalancing:DescribeTargetGroups, elasticloadbalancing:DescribeTargetHealth, logs:FilterLogEvents, and logs:DescribeLogGroups to function effectively across your infrastructure.
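For reference, the permissions listed above can be collected into a single read-only IAM policy document. The sketch below only builds and prints the JSON. The statement ID, the function name, and the wildcard `Resource` are assumptions for illustration, so narrow the resource scope to match your own account where the actions allow it.

```python
import json

# The six actions named in the installation notes above.
READ_ONLY_ACTIONS = [
    "ecs:ListServices",
    "ecs:DescribeServices",
    "elasticloadbalancing:DescribeTargetGroups",
    "elasticloadbalancing:DescribeTargetHealth",
    "logs:FilterLogEvents",
    "logs:DescribeLogGroups",
]


def build_policy(actions, resource="*"):
    """Build an IAM policy document granting the given actions.

    The Sid and the default wildcard resource are illustrative assumptions.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "EcsMonitorReadOnly",
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_policy(READ_ONLY_ACTIONS), indent=2))
```

The printed JSON can be attached to the IAM user or role that runs the skill.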
Use Cases
- Production Outage Response: Automatically trigger the auto-diagnose command when an alert is received to instantly see if the cause is a code crash or an environment-specific error.
- Deployment Health Validation: After a new deployment, run the health monitor to ensure all tasks are passing target group health checks and that no recent restarts are occurring in logs.
- Trend Analysis: Analyze log summaries over the last 120 minutes to identify intermittent 5xx errors that might be affecting user experience during peak traffic.
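The trend-analysis use case amounts to counting 5xx responses per time bucket across the window, so that intermittent bursts stand out. The following is a minimal sketch under stated assumptions: the `(timestamp, status)` event shape, the function name, and the 10-minute bucket size are hypothetical and do not describe the skill's internal format.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def bucket_5xx(events, now, window_minutes=120, bucket_minutes=10):
    """Count 5xx events per bucket over the last `window_minutes`.

    `events` is an iterable of (timestamp, status) pairs, an assumed shape.
    Returns a list of counts, oldest bucket first.
    """
    start = now - timedelta(minutes=window_minutes)
    counts = Counter()
    for ts, status in events:
        # Keep only 5xx responses inside the half-open window [start, now).
        if start <= ts < now and 500 <= status <= 599:
            index = int((ts - start).total_seconds() // (bucket_minutes * 60))
            counts[index] += 1
    return [counts.get(i, 0) for i in range(window_minutes // bucket_minutes)]


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    events = [
        (now - timedelta(minutes=115), 503),
        (now - timedelta(minutes=113), 500),
        (now - timedelta(minutes=5), 502),
        (now - timedelta(minutes=60), 200),
    ]
    print(bucket_5xx(events, now))
```

A mostly flat series with isolated spikes suggests intermittent failures, while a sustained rise points toward a deployment or capacity problem.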
Example Prompts
- "Check the current production health for all services in the 'my-api-cluster' and provide a summary of any active alerts."
- "I noticed 503 errors on the checkout service. Please run a deep-dive analysis on the logs for the last 30 minutes to identify the root cause."
- "Is the SSL certificate for my-production-domain.com valid, and are there any container restart events in the last hour?"
Tips & Limitations
- Permissions: Ensure your IAM user or role has the minimum read-only permissions requested to avoid failures during the Describe phases.
- Performance: Deep analysis on large log groups can be time-consuming; use the --minutes flag to limit the window for faster results.
- Configuration: Always set your ECS_CLUSTER and ECS_REGION correctly in your environment variables to ensure the shell scripts point to the correct AWS resource set.
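The configuration and performance tips above can be illustrated with two small helpers: one that fails fast when ECS_CLUSTER or ECS_REGION is unset, and one that converts a minutes value into the millisecond epoch range that the CloudWatch Logs FilterLogEvents API expects for its startTime and endTime parameters. The function names and return shapes are hypothetical, and how the skill's own scripts handle the --minutes value is not shown in this listing.

```python
import os
import time


def load_config(env=None):
    """Read ECS_CLUSTER and ECS_REGION, raising if either is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in ("ECS_CLUSTER", "ECS_REGION") if not env.get(name)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    return {"cluster": env["ECS_CLUSTER"], "region": env["ECS_REGION"]}


def log_window_ms(minutes, now_s=None):
    """Return (start_ms, end_ms) covering the last `minutes` minutes.

    CloudWatch Logs expresses time ranges in milliseconds since the Unix epoch.
    """
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    end_s = time.time() if now_s is None else now_s
    end_ms = int(end_s * 1000)
    return end_ms - minutes * 60_000, end_ms


if __name__ == "__main__":
    # Example values only; substitute your own cluster and region.
    config = load_config({"ECS_CLUSTER": "my-api-cluster", "ECS_REGION": "us-east-1"})
    start_ms, end_ms = log_window_ms(30)
    print(config, start_ms, end_ms)
```

A shorter window means fewer log events to scan, which is why limiting the minutes value speeds up deep analysis on large log groups.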
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-briancolinger-aws-ecs-monitor": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: network-access, file-write, file-read, external-api, code-execution
Related Skills
pr-reviewer
Automated GitHub PR code review with diff analysis, lint integration, and structured reports. Use when reviewing pull requests, checking for security issues, error handling gaps, test coverage, or code style problems. Supports Go, Python, and JavaScript/TypeScript. Requires `gh` CLI authenticated with repo access.
email-triage
IMAP email scanning and triage with AI classification via a local Ollama LLM. Scans unread emails, categorizes them as urgent, needs-response, informational, or spam, and surfaces important messages for agent consumption. Works standalone with heuristic fallback — Ollama optional but recommended.
dreaming
Creative exploration during quiet hours. Turns idle heartbeat time into freeform thinking — hypotheticals, future scenarios, reflections, unexpected connections. Use when you want your agent to do something meaningful during low-activity periods instead of just returning HEARTBEAT_OK. Outputs written to files for human review later (like remembering dreams in the morning).