kafka-devops
Kafka DevOps and SRE specialist. Expert in infrastructure deployment, CI/CD, monitoring, incident response, capacity planning, and operational best practices for Apache Kafka.
Why use this skill?
Optimize and manage your Apache Kafka infrastructure with this SRE agent. Expert support for configuration, monitoring, deployment, and performance tuning.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/anton-abyzov/sw-kafka-devops
What This Skill Does
The kafka-devops skill acts as a specialized Site Reliability Engineer (SRE) for Apache Kafka clusters. It assists with infrastructure as code, cluster configuration, performance tuning, and incident management. Whether you are deploying on Kubernetes with Strimzi, managing traditional virtual machines, or configuring managed Kafka services, the agent provides actionable guidance. It is particularly suited to diagnosing consumer lag, optimizing partitioning strategies, automating CI/CD pipelines for topic management, and establishing monitoring and alerting rules.
Installation
To integrate this skill into your OpenClaw environment, execute the following command in your terminal or terminal-integrated agent console:
clawhub install openclaw/skills/skills/anton-abyzov/sw-kafka-devops
Ensure that your OpenClaw agent has the necessary permissions to access your cloud environment or Kubernetes configuration files if you intend for it to perform direct infrastructure modifications.
Use Cases
- Cluster Sizing & Capacity Planning: Use the agent to calculate partition counts and storage needs based on throughput requirements and retention policies.
- CI/CD Integration: Automatically validate and apply topic configurations, ACLs, and schema registry updates via GitOps workflows.
- Incident Response: Analyze diagnostic logs to identify bottlenecks in ISR (In-Sync Replicas), troubleshoot under-replicated partitions, or debug consumer group rebalancing issues.
- Performance Tuning: Get recommendations on JVM heap settings, disk I/O optimization for log segments, and network throughput adjustments.
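The Cluster Sizing & Capacity Planning use case above can be illustrated with a common rule of thumb: provision enough partitions for both the producing and the consuming side to reach the target throughput, and size disk from throughput, retention, and replication factor. The sketch below is not the skill's own method. The function names are hypothetical, and the per-partition throughput figures are assumptions that would have to be measured on your own hardware and workload.

```python
import math


def estimate_partitions(target_mb_s: float,
                        producer_mb_s_per_partition: float,
                        consumer_mb_s_per_partition: float) -> int:
    """Rule-of-thumb partition count for a target throughput.

    The per-partition figures are assumed inputs; measure them with your
    own producers and consumers rather than trusting defaults.
    """
    values = (target_mb_s, producer_mb_s_per_partition, consumer_mb_s_per_partition)
    if min(values) <= 0:
        raise ValueError("all throughput values must be positive")
    needed_for_producers = math.ceil(target_mb_s / producer_mb_s_per_partition)
    needed_for_consumers = math.ceil(target_mb_s / consumer_mb_s_per_partition)
    # The slower side determines how many partitions are required.
    return max(needed_for_producers, needed_for_consumers)


def estimate_disk_gb(target_mb_s: float,
                     retention_hours: float,
                     replication_factor: int) -> float:
    """Cluster-wide disk (GB, 1 GB = 1024 MB) to hold the retention window.

    Ignores compression, index files, and headroom, so treat it as a floor.
    """
    seconds = retention_hours * 3600
    return target_mb_s * seconds * replication_factor / 1024


if __name__ == "__main__":
    # 500 MB/s ingestion, assumed 10 MB/s per partition when producing
    # and 20 MB/s per partition when consuming.
    print(estimate_partitions(500, 10, 20))
    print(estimate_disk_gb(500, retention_hours=24, replication_factor=3))
```

Retention and replication factor drive the storage estimate rather than the partition count, which is why the two calculations are kept separate.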
Example Prompts
- "Analyze my current Kafka cluster configuration and recommend partition counts for a high-throughput topic expecting 500MB/s ingestion with a replication factor of 3."
- "Draft a Terraform module for a Strimzi Kafka custom resource that implements tiered storage and enables JMX exporter metrics."
- "I am seeing frequent rebalances in my consumer group. Write a script to analyze consumer lag and provide a summary of the worst-performing partitions."
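As a rough idea of what the consumer lag prompt above might produce, the following sketch ranks partitions by lag, where lag is the log end offset minus the group's committed offset. It is a minimal illustration under stated assumptions: the function name is hypothetical, the offsets are passed in as plain dictionaries (how you fetch them from your cluster is left out), and a partition with no committed offset is treated as committed at offset 0, which overstates lag if older segments have already been deleted.

```python
from typing import Dict, List, Tuple


def worst_lagging_partitions(end_offsets: Dict[int, int],
                             committed_offsets: Dict[int, int],
                             top_n: int = 3) -> List[Tuple[int, int]]:
    """Return (partition, lag) pairs, largest lag first, limited to top_n."""
    lags = []
    for partition, end_offset in end_offsets.items():
        # Assumption: a missing commit is treated as offset 0.
        committed = committed_offsets.get(partition, 0)
        # Clamp at zero in case the two snapshots were taken at different times.
        lags.append((partition, max(end_offset - committed, 0)))
    lags.sort(key=lambda item: item[1], reverse=True)
    return lags[:top_n]


if __name__ == "__main__":
    end = {0: 1000, 1: 5000, 2: 250}
    committed = {0: 990, 1: 1200}
    for partition, lag in worst_lagging_partitions(end, committed, top_n=2):
        print(f"partition {partition}: lag {lag}")
```

Lag alone does not explain frequent rebalances, but a few partitions with much higher lag than the rest often point to slow or stuck consumers worth inspecting first.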
Tips & Limitations
- Context is Key: Always provide relevant configuration snippets (like server.properties) when asking for specific tuning advice.
- Safety First: When using the agent for configuration changes, always request a dry-run or a Terraform plan output before applying changes to production clusters.
- Limitations: The agent provides advisory information and code generation. It does not replace dedicated monitoring tools like Prometheus or Confluent Control Center. Validate all suggested configuration changes in a non-production staging environment first.
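The dry-run advice above can be sketched as a small planning step: compare the desired topic definitions (for example, from a Git repository) with the current state and print the intended changes without applying anything. This is an illustrative sketch only. The function name, the dictionary layout, and the CREATE/ALTER/REJECT labels are hypothetical and not part of the skill or of any Kafka tool. It relies on one Kafka fact: a topic's partition count can be increased but not decreased.

```python
from typing import Dict, List


def plan_topic_changes(current: Dict[str, dict], desired: Dict[str, dict]) -> List[str]:
    """Describe the changes needed to move from current to desired topics.

    Each topic maps to {"partitions": int, "config": {key: value}}.
    Nothing is applied; the result is a human-readable plan.
    """
    changes = []
    for topic, want in sorted(desired.items()):
        have = current.get(topic)
        if have is None:
            changes.append(f"CREATE {topic} partitions={want['partitions']}")
            continue
        if want["partitions"] < have["partitions"]:
            # Kafka does not support reducing a topic's partition count.
            changes.append(
                f"REJECT {topic}: partitions cannot shrink "
                f"({have['partitions']} -> {want['partitions']})"
            )
        elif want["partitions"] > have["partitions"]:
            changes.append(
                f"ALTER {topic} partitions {have['partitions']} -> {want['partitions']}"
            )
        for key, value in sorted(want.get("config", {}).items()):
            old = have.get("config", {}).get(key)
            if old != value:
                changes.append(f"ALTER {topic} config {key}: {old} -> {value}")
    return changes


if __name__ == "__main__":
    current = {"orders": {"partitions": 6, "config": {"retention.ms": "604800000"}}}
    desired = {
        "orders": {"partitions": 12, "config": {"retention.ms": "86400000"}},
        "payments": {"partitions": 3, "config": {}},
    }
    for line in plan_topic_changes(current, desired):
        print(line)
```

Reviewing a plan like this in a pull request, before any change reaches a production cluster, serves the same purpose as a Terraform plan output.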
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-anton-abyzov-sw-kafka-devops": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags(AI)
Flags: code-execution, external-api
Related Skills
network-engineer
Cloud network architect for VPC design, service mesh, zero-trust networking, load balancers, and CDN optimization. Use for network troubleshooting or connectivity issues.
jira-multi-project-mapper
Expert in mapping SpecWeave specs to multiple JIRA projects with intelligent project detection and cross-project coordination. Use when syncing to multiple JIRA projects (project-per-team, component-based), or managing bidirectional sync across team boundaries.
helm-chart-scaffolding
Design, organize, and manage Helm charts for templating and packaging Kubernetes applications with reusable configurations. Use when creating Helm charts, packaging Kubernetes applications, or implementing templated deployments.
performance-optimization
React Native performance with Hermes V1, FlashList, expo-image v2, concurrent rendering. Use for slow app, memory leaks, or FPS issues.
release-strategy-advisor
Release strategy advisor - detects brownfield patterns (tags, CI/CD, changelogs), recommends versioning strategy based on architecture. Creates release-strategy.md.