slo-implementation
Define and implement Service Level Indicators (SLIs) and Service Level Objectives (SLOs) with error budgets and alerting. Use when establishing reliability targets, implementing SRE practices, or measuring service performance.
Why use this skill?
Learn to define reliable service metrics, calculate error budgets, and automate SRE practices with the slo-implementation skill for OpenClaw agents.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/anton-abyzov/sw-slo-implementation
What This Skill Does
The slo-implementation skill works as an SRE (Site Reliability Engineering) co-pilot within the OpenClaw ecosystem. It gives engineering teams a standard framework for defining, tracking, and enforcing service reliability, turning abstract reliability goals into concrete technical implementation. With it, you define Service Level Indicators (SLIs), set Service Level Objectives (SLOs), and derive error budgets from them (the error budget is 1 minus the SLO target), so that release velocity stays balanced against service stability. The skill also helps you write Prometheus-compatible queries for availability, latency, and durability.
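The relationship between an SLI, an SLO, and an error budget can be sketched in a few lines of Python. This is a minimal illustration, not the skill's own implementation; the names `SloStatus` and `evaluate_slo` are hypothetical, and the event counts are assumed to come from your monitoring system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloStatus:
    sli: float              # measured fraction of good events
    error_budget: float     # allowed fraction of bad events (1 - SLO target)
    budget_consumed: float  # share of the error budget already spent


def evaluate_slo(good_events: int, total_events: int, slo_target: float) -> SloStatus:
    """Compute a ratio SLI and compare it with an SLO target such as 0.999."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")

    sli = good_events / total_events
    error_budget = 1.0 - slo_target
    # Observed bad fraction divided by the allowed bad fraction.
    budget_consumed = (1.0 - sli) / error_budget
    return SloStatus(sli=sli, error_budget=error_budget, budget_consumed=budget_consumed)
```

For example, 999,500 good requests out of 1,000,000 against a 99.9% target gives an SLI of 99.95% and about half of the error budget spent.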
Installation
To integrate this skill into your environment, use the OpenClaw command-line interface or the integrated package manager. Run the following command:
clawhub install openclaw/skills/skills/anton-abyzov/sw-slo-implementation
Ensure that you have appropriate permissions to apply configuration files if you intend to push these definitions to your monitoring stack directly.
Use Cases
- Service Reliability Audits: Use this skill to evaluate your existing infrastructure and determine whether your current monitoring coverage is sufficient to meet business SLAs.
- Error Budget Management: Automatically calculate how much error budget remains within a rolling 28-day window before a feature freeze is triggered.
- Alerting Strategy: Move from reactive alerts on static thresholds to proactive, budget-based alerting, with thresholds that reflect the user experience.
- Cross-Team Communication: Standardize terminology across product and engineering teams using the provided SLI/SLO/SLA hierarchy for improved alignment.
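Budget-based alerting is usually expressed as a burn rate: the observed error ratio divided by the error budget. The sketch below assumes a two-window check (a long and a short lookback must both exceed the threshold before paging); the function names and the default threshold are illustrative assumptions, not part of the skill.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """How many times faster than 'exactly on budget' errors are occurring."""
    return error_ratio / (1.0 - slo_target)


def should_page(long_window_error_ratio: float,
                short_window_error_ratio: float,
                slo_target: float,
                threshold: float = 14.4) -> bool:
    """Page only if both windows burn faster than the threshold.

    The short window confirms the problem is still happening, which
    avoids paging for an incident that has already recovered.
    """
    return (burn_rate(long_window_error_ratio, slo_target) >= threshold
            and burn_rate(short_window_error_ratio, slo_target) >= threshold)


def budget_fraction_spent(rate: float, elapsed_hours: float, window_days: int = 28) -> float:
    """Fraction of the window's error budget spent at a constant burn rate."""
    return rate * elapsed_hours / (window_days * 24)
```

A burn rate of 1 spends the budget exactly at the end of the window. The default of 14.4 is the paging threshold commonly cited from the Google SRE Workbook, where it corresponds to 2% of a 30-day budget spent in one hour; tune it for your own window and tolerance.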
Example Prompts
- "Analyze my current API traffic and help me construct an availability SLI for a 99.9% SLO using this Prometheus expression."
- "Generate a YAML error budget policy that triggers a Slack alert when we have consumed 80% of our monthly error budget for the checkout service."
- "Calculate the monthly downtime allowed for a 99.99% availability target and explain how this impacts our deployment strategy for the next quarter."
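The downtime arithmetic behind a prompt like the last one is simple enough to check by hand. A minimal sketch, assuming an availability target and treating the whole error budget as full downtime; `allowed_downtime` is a hypothetical helper name.

```python
from datetime import timedelta


def allowed_downtime(slo_target: float, window_days: int = 28) -> timedelta:
    """Total downtime an availability SLO tolerates over the window."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    # The error budget (1 - target) applied to the length of the window.
    return timedelta(days=window_days) * (1.0 - slo_target)
```

A 99.99% target over 28 days allows about 242 seconds (roughly 4 minutes) of downtime, and a 99.9% target over 30 days allows about 43.2 minutes.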
Tips & Limitations
- Start Simple: Don't try to track too many SLOs initially. Start with high-impact services (e.g., login, checkout) before expanding to background workers.
- Context Matters: Remember that an SLI is a measurement, not a goal. Your SLO should reflect the user experience, not just the technical feasibility.
- Query Complexity: The skill provides query templates, but always validate your PromQL expressions against your actual metrics data so that labels and metric names match your specific instrumentation.
- Rolling Windows: A 28-day rolling window is a common default for most web services, but high-velocity environments may need shorter windows for tighter feedback loops.
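Because metric and label names differ between services, it can help to keep them as parameters when building an availability query. The sketch below builds a ratio-style PromQL expression as a string; the metric `http_requests_total`, the `job` and `code` labels, and the function name are assumptions for illustration and must be replaced with whatever your instrumentation actually exposes.

```python
def availability_query(metric: str,
                       selector: str,
                       good_matcher: str,
                       window: str = "5m") -> str:
    """Build a PromQL ratio: rate of good requests over rate of all requests.

    metric       -- counter name, e.g. "http_requests_total" (assumed)
    selector     -- label matchers shared by both sides, e.g. 'job="api"'
    good_matcher -- extra matcher that keeps only good requests
    window       -- range-vector duration passed to rate()
    """
    good = f"sum(rate({metric}{{{selector},{good_matcher}}}[{window}]))"
    total = f"sum(rate({metric}{{{selector}}}[{window}]))"
    return f"{good} / {total}"


if __name__ == "__main__":
    # Treat every non-5xx response as good (an assumption about your SLI).
    print(availability_query("http_requests_total", 'job="api"', 'code!~"5.."'))
```

Run the generated expression in the Prometheus expression browser before using it in a rule, to confirm that the metric and labels exist in your data.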
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-anton-abyzov-sw-slo-implementation": {
      "enabled": true,
      "auto_update": true
    }
  }
}