Official Verified developer tools Safety 4/5

github-actions-step-flake-audit

Detect flaky GitHub Actions job steps by finding mixed success/failure conclusions across runs.

What This Skill Does

The github-actions-step-flake-audit skill identifies non-deterministic steps in GitHub Actions workflows. It reads JSON exports of historical workflow runs and aggregates step outcomes by repository, workflow, job, and step. For each unique step it calculates a failure rate and applies configurable thresholds to flag flaky behavior: steps that both succeed and fail across multiple runs. Output is available as human-readable text or structured JSON, which supports both manual investigation and programmatic CI gating.
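The aggregation described above can be sketched roughly as follows. This is a minimal illustration and not the skill's actual implementation: the function name audit_steps and the min_occurrences parameter are hypothetical, and the field names (workflowName, jobs, steps, name, conclusion) are assumed to match what gh run view --json emits. The repository value is treated as an opaque key because its exact shape is not stated here.

```python
from collections import defaultdict

def audit_steps(runs, min_occurrences=3):
    """Aggregate step conclusions and report steps with mixed outcomes."""
    counts = defaultdict(lambda: {"success": 0, "failure": 0})
    for run in runs:
        # Assumption: repository is used only as a grouping key, so str() is enough.
        repo = str(run.get("repository", "unknown"))
        workflow = run.get("workflowName", "unknown")
        for job in run.get("jobs", []):
            for step in job.get("steps", []):
                conclusion = step.get("conclusion")
                # Skipped or cancelled steps say nothing about flakiness.
                if conclusion in ("success", "failure"):
                    key = (repo, workflow, job.get("name"), step.get("name"))
                    counts[key][conclusion] += 1

    flaky = []
    for key, c in counts.items():
        total = c["success"] + c["failure"]
        # Flaky here means mixed outcomes: at least one success and one failure.
        if total >= min_occurrences and c["success"] and c["failure"]:
            flaky.append({
                "step": key,
                "runs": total,
                "failure_rate": c["failure"] / total,
            })
    # Most unstable steps first.
    return sorted(flaky, key=lambda r: r["failure_rate"], reverse=True)
```

A step that always fails is broken rather than flaky, which is why the sketch requires both outcomes before flagging it.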

Installation

To install this skill, use the OpenClaw CLI within your terminal:

clawhub install openclaw/skills/skills/daniellummis/github-actions-step-flake-audit

Ensure your project contains the necessary GitHub Actions run artifacts, which can be generated using the command:

gh run view <run-id> --json databaseId,workflowName,headBranch,headSha,url,repository,jobs > artifacts/github-actions/run-<run-id>.json
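Once several runs have been exported, the files need to be read back in before any analysis. The following is a small sketch of one way to do that, assuming the artifacts/github-actions/run-<run-id>.json naming shown above; load_runs is a hypothetical helper, not part of the skill.

```python
import json
from pathlib import Path

def load_runs(artifact_dir="artifacts/github-actions"):
    """Load every run-*.json export in a directory, skipping unreadable files."""
    runs = []
    for path in sorted(Path(artifact_dir).glob("run-*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            # An empty or truncated export should not abort the whole audit.
            continue
        if isinstance(data, dict):
            runs.append(data)
    return runs
```

Skipping bad files silently is a design choice for this sketch; a stricter version might log or count them so that a failed export does not go unnoticed.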

Use Cases

  1. CI Maintenance: Automatically fail the pipeline when critical flakes (steps exceeding the 40% failure rate) are detected, so that flaky noise does not desensitize the engineering team.
  2. Regression Hunting: Compare flakiness before and after dependency updates by running the audit across different time-windowed artifacts.
  3. Post-Mortem Analysis: Audit why a specific suite is failing intermittently by isolating the most unstable steps across hundreds of historical workflow runs.
  4. Performance Optimization: Identify which slow-running steps are also unreliable, prioritizing them for refactoring or migration to more stable runners.
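The CI gating use case can be illustrated with a short sketch. It assumes per-step statistics shaped as dictionaries with step, runs, and failure_rate keys (a hypothetical shape chosen for this example, not the skill's documented JSON schema) and uses the 40% critical threshold mentioned above as a default.

```python
import sys

def gate(step_stats, critical_rate=0.40):
    """Return exit code 1 if any step's failure rate exceeds critical_rate, else 0."""
    critical = [s for s in step_stats if s["failure_rate"] > critical_rate]
    for s in critical:
        print(
            f"CRITICAL {s['step']}: {s['failure_rate']:.0%} failures "
            f"over {s['runs']} runs",
            file=sys.stderr,
        )
    return 1 if critical else 0
```

In a pipeline, the return value would typically be passed to sys.exit() so that a non-zero code fails the job.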

Example Prompts

  1. 'Run the step flake audit on all artifacts in my folder and show me the top 10 most unstable steps in JSON format.'
  2. 'Check if any GitHub Actions steps have a failure rate higher than 20% in the last 50 runs and print a summary.'
  3. 'Audit my CI artifacts and enable the fail gate; if any step is critical, stop the process so I can investigate.'

Tips & Limitations

  • Data Quality: The accuracy of this skill is entirely dependent on the quality of your gh run view exports. Ensure you are exporting full JSON data rather than truncated views.
  • Threshold Tuning: Use MIN_OCCURRENCES to avoid flagging steps that have only run once or twice; a minimum of 5-10 runs is recommended so that the computed failure rate is not dominated by a single outcome.
  • Scope: While excellent at detecting intermittent failures, it does not diagnose the root cause (e.g., resource contention, network timeouts, or race conditions). It is a detection tool, not a debugger.
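The data-quality tip can be checked before running an audit. The sketch below is one possible pre-flight check, assuming each export should contain a jobs list whose entries carry steps with name and conclusion fields; export_problems is a hypothetical helper and the messages are illustrative.

```python
def export_problems(run):
    """Return a list of reasons a run export may be unusable for step auditing."""
    problems = []
    jobs = run.get("jobs")
    if not isinstance(jobs, list) or not jobs:
        problems.append("no jobs in export")
        return problems
    for job in jobs:
        steps = job.get("steps")
        if not steps:
            problems.append(f"job {job.get('name')!r} has no steps")
            continue
        for step in steps:
            if "name" not in step or "conclusion" not in step:
                problems.append(
                    f"job {job.get('name')!r} has a step missing name or conclusion"
                )
                break  # one report per job is enough
    return problems
```

An empty list means the export looks complete enough to audit; it does not guarantee that the recorded conclusions are themselves correct.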

Metadata

Stars: 3376
Views: 0
Updated: 2026-03-24

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-daniellummis-github-actions-step-flake-audit": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags (AI)

#ci-cd #github-actions #devops #debugging #automation
Safety Score: 4/5

Flags: file-read, code-execution