Official Verified developer tools Safety 4/5

github-actions-recovery-latency-audit

Measure GitHub Actions failure recovery latency and unresolved incident age by workflow group.

What This Skill Does

The github-actions-recovery-latency-audit skill measures how long CI failures persist, rather than only counting passes and failures. It identifies "incident" windows: an incident opens when a run fails and closes at the next recorded successful run. Results are aggregated by repository, workflow, branch, and event, which helps teams tell transient network problems apart from systemic regressions. Working from GitHub Actions run JSON exports, it calculates recovery latency (time to recovery) and tracks the age of incidents that are still unresolved. Each group is assigned a severity level (ok, warn, critical) based on user-defined thresholds. For teams that enforce CI stability gates, the skill can be configured to exit with an error code when critical thresholds are breached, which blocks merges of unstable configurations and keeps long-standing blockers from going unaddressed.
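
The incident-window idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the skill's actual implementation: the `repository` key, the function names, and the choice to ignore conclusions other than `failure` and `success` are all hypothetical, while `workflowName`, `headBranch`, `event`, `createdAt`, and `conclusion` follow the field names used in `gh` JSON exports.

```python
from collections import defaultdict
from datetime import datetime


def parse_ts(value):
    # GitHub timestamps end in "Z"; swap in an explicit offset so that
    # datetime.fromisoformat accepts them on older Python versions too.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def audit_incidents(runs, now):
    """Return recovery latencies and open-incident age (in hours) per group."""
    groups = defaultdict(list)
    for run in runs:
        # "repository" is an assumed key; the other three are gh field names.
        key = (run["repository"], run["workflowName"], run["headBranch"], run["event"])
        groups[key].append(run)

    report = {}
    for key, items in groups.items():
        items.sort(key=lambda r: parse_ts(r["createdAt"]))
        opened_at = None  # start of the current incident, if one is open
        recoveries = []
        for run in items:
            ts = parse_ts(run["createdAt"])
            if run["conclusion"] == "failure" and opened_at is None:
                opened_at = ts  # the first failure opens the incident
            elif run["conclusion"] == "success" and opened_at is not None:
                recoveries.append((ts - opened_at).total_seconds() / 3600)
                opened_at = None  # the next success closes it
        open_age = None
        if opened_at is not None:
            open_age = (now - opened_at).total_seconds() / 3600
        report[key] = {"recovery_hours": recoveries, "open_incident_hours": open_age}
    return report


def _run(created_at, conclusion):
    # Hypothetical fixture data for one workflow group.
    return {"repository": "acme/api", "workflowName": "ci", "headBranch": "main",
            "event": "push", "createdAt": created_at, "conclusion": conclusion}


sample_runs = [
    _run("2026-01-01T00:00:00Z", "failure"),
    _run("2026-01-01T03:00:00Z", "failure"),
    _run("2026-01-01T06:00:00Z", "success"),
    _run("2026-01-02T00:00:00Z", "failure"),
]
report = audit_incidents(sample_runs, parse_ts("2026-01-02T12:00:00Z"))
```

In this fixture the first incident runs from the first failure to the first success, and the final failure leaves an incident open whose age is measured against the supplied `now` value.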

Installation

To integrate this skill into your environment, use the OpenClaw command-line interface. Run the following command in your terminal:

clawhub install openclaw/skills/skills/daniellummis/github-actions-recovery-latency-audit

Ensure that you have collected your run artifacts as JSON files prior to execution, as the skill operates on local filesystem exports. Refer to the skill's source repository at openclaw/skills for additional configuration details or to review the source scripts.

Use Cases

DevOps Stability Monitoring: Identify which core services or repositories have high recovery latency, signaling a need for better test isolation or flaky-test remediation.
CI/CD Gatekeeping: Use the FAIL_ON_CRITICAL flag in your pipeline to stop deployments when any incident stays open longer than the critical threshold, ensuring a "green-first" deployment policy.
Incident Management Post-mortems: Analyze historical recovery patterns to provide quantitative data for engineering reviews regarding pipeline reliability.
Alert Fatigue Reduction: Use the TOP_N setting to focus developer attention on the most problematic workflows rather than overwhelming the team with every minor transient failure.
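
The gatekeeping and TOP_N use cases above can be illustrated with a small Python sketch. It is hypothetical: the function names, the threshold defaults (12 and 36 hours, borrowed from the example prompts), and the inclusive comparison at each threshold are assumptions, not the skill's documented behavior.

```python
def classify(hours, warn_hours, critical_hours):
    """Map an open-incident age in hours to ok / warn / critical."""
    if hours is None or hours < warn_hours:
        return "ok"
    if hours < critical_hours:
        return "warn"
    return "critical"


def gate(open_ages, warn_hours=12.0, critical_hours=36.0, top_n=3, fail_on_critical=True):
    """Rank groups by open-incident age and derive a process exit code."""
    rows = [
        (name, age, classify(age, warn_hours, critical_hours))
        for name, age in open_ages.items()
    ]
    # Oldest incidents first; groups with no open incident sort last.
    rows.sort(key=lambda row: row[1] if row[1] is not None else -1.0, reverse=True)
    has_critical = any(severity == "critical" for _, _, severity in rows)
    exit_code = 1 if fail_on_critical and has_critical else 0
    return rows[:top_n], exit_code


top, exit_code = gate({"ci": 40.0, "lint": 13.5, "docs": None, "deploy": 2.0}, top_n=2)
```

A pipeline step could pass `exit_code` to `sys.exit` so that a critical incident fails the job, while `top` limits the report to the worst offenders.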

Example Prompts

"Analyze my GitHub Actions runs in artifacts/ and tell me which workflows have a recovery latency higher than 18 hours."
"Run the recovery audit on my local fixtures using the critical thresholds of 36 hours for open incidents, and provide the output in JSON format."
"Check if there are any critical CI failures that have been unresolved for more than 12 hours across the 'main' branch of all repositories."

Tips & Limitations

Data Quality: The audit is only as accurate as your JSON run exports. Make sure the gh run view --json export includes the fields the audit needs, such as conclusion, createdAt, and workflowName.
Deterministic Testing: Use the NOW_ISO input when debugging or testing your reporting logic to ensure consistent results across repeated executions.
Scalability: Scope your RUN_GLOB narrowly so the audit does not process large amounts of stale history, which may slow execution.
Network Scope: This skill does not automatically pull data from GitHub; you must provide the local files, which is a design choice to maintain security and avoid rate-limiting issues.
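
As a rough pre-flight check for the tips above, the following Python sketch loads local exports matched by a glob, reports records that lack the fields named under Data Quality, and resolves a fixed clock value. The helper names are hypothetical, and the assumption that each file holds either one run object or a list of runs may not match the skill's real input handling.

```python
import glob
import json
from datetime import datetime, timezone

# Field names taken from the Data Quality tip.
REQUIRED_FIELDS = ("conclusion", "createdAt", "workflowName")


def load_runs(run_glob):
    """Read every JSON file matched by the glob into a flat list of runs."""
    runs = []
    for path in sorted(glob.glob(run_glob, recursive=True)):
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
        # Assumption: a file contains one run object or a list of them.
        runs.extend(data if isinstance(data, list) else [data])
    return runs


def missing_fields(run):
    """List required keys that are absent from a run record."""
    # Only key presence is checked, so an empty "conclusion" on an
    # in-progress run is not reported as missing.
    return [field for field in REQUIRED_FIELDS if field not in run]


def resolve_now(now_iso=None):
    """Use a fixed timestamp when given, so repeated runs agree."""
    if now_iso:
        return datetime.fromisoformat(now_iso.replace("Z", "+00:00"))
    return datetime.now(timezone.utc)
```

Calling `load_runs("artifacts/**/*.json")` and filtering on `missing_fields` before an audit surfaces incomplete exports early; passing a fixed string to `resolve_now` mirrors the purpose of the NOW_ISO input.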

Metadata

Author: @daniellummis

Stars: 3376

Updated: 2026-03-24

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-daniellummis-github-actions-recovery-latency-audit": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags (AI)

#ci-cd #github-actions #devops #automation #metrics

Safety Score: 4/5

Flags: file-read, code-execution

Related Skills

github-actions-cache-hardening-audit

Audit GitHub Actions workflow cache usage for poisoning, keying, and secret-path risks.

daniellummis 3376

render-env-guard

Preflight-check Render service environment variables before deploys; catches missing keys and placeholder/template values that commonly break production rollouts.

daniellummis 3376

github-actions-trigger-health-audit

Audit GitHub Actions run health by trigger event and workflow so flaky or noisy automation sources are easy to prioritize.

daniellummis 3376

github-actions-cancel-waste-audit

Audit cancelled and timed-out GitHub Actions runs from JSON exports to surface wasted CI minutes and noisy workflows.

daniellummis 3376

github-actions-run-gap-audit

Detect GitHub Actions workflow groups that stopped running on their normal cadence using median run intervals and current inactivity gap.

daniellummis 3376