ci-flake-triage
Detect flaky tests from JUnit XML retries and emit a triage report with top unstable cases.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/daniellummis/ci-flake-triage
CI Flake Triage
Use this skill to turn noisy JUnit retry artifacts into a focused flaky-test report.
What this skill does
- Reads one or more JUnit XML files (for example: first run + rerun artifacts)
- Aggregates status per test case (passed, failed, skipped, error)
- Flags flaky candidates when a test has both fail-like and pass outcomes
- Separates persistent failures from flaky failures
- Prints top flaky tests to prioritize stabilization work
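The classification rule in the list above can be sketched as a small Bash function. This is a minimal illustration, not the skill's actual code: the function name `classify_outcomes` and the `stable` label are hypothetical, and it assumes "fail-like" means `failed` or `error` while `skipped` is treated as neutral.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the skill): classify one test case
# from the statuses observed for it across runs and reruns.
classify_outcomes() {
  local has_pass=0 has_fail=0 status
  for status in "$@"; do
    case "$status" in
      passed)       has_pass=1 ;;
      failed|error) has_fail=1 ;;  # assumption: "fail-like" = failed or error
      skipped)      ;;             # assumption: skipped does not count either way
    esac
  done

  if (( has_fail && has_pass )); then
    echo flaky        # both fail-like and pass outcomes were seen
  elif (( has_fail )); then
    echo persistent   # only fail-like outcomes were seen
  else
    echo stable       # hypothetical label for everything else
  fi
}

classify_outcomes failed passed    # prints: flaky
classify_outcomes error error      # prints: persistent
classify_outcomes passed skipped   # prints: stable
```

Keeping the rule in one function makes it easy to change what counts as fail-like without touching the XML parsing or reporting steps.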
Inputs
Optional:
- JUNIT_GLOB (default: test-results/**/*.xml)
- TRIAGE_TOP (default: 20)
- OUTPUT_FORMAT (text or json, default: text)
- FAIL_ON_PERSISTENT (0 or 1, default: 0): exit non-zero when persistent failures exist
- FAIL_ON_FLAKE (0 or 1, default: 0): exit non-zero when flaky candidates exist
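For illustration, the documented defaults could be applied with Bash parameter expansion as in the sketch below. The function name `resolve_inputs`, the validation step, and the return code `2` for invalid values are assumptions made for this example; the skill's script may handle bad input differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: apply the documented defaults, then validate.
resolve_inputs() {
  # ${VAR:-default} keeps a caller-provided value and falls back otherwise.
  JUNIT_GLOB="${JUNIT_GLOB:-test-results/**/*.xml}"
  TRIAGE_TOP="${TRIAGE_TOP:-20}"
  OUTPUT_FORMAT="${OUTPUT_FORMAT:-text}"
  FAIL_ON_PERSISTENT="${FAIL_ON_PERSISTENT:-0}"
  FAIL_ON_FLAKE="${FAIL_ON_FLAKE:-0}"

  # Assumption: reject unknown output formats with return code 2.
  case "$OUTPUT_FORMAT" in
    text|json) ;;
    *) echo "OUTPUT_FORMAT must be text or json" >&2; return 2 ;;
  esac

  # Assumption: each gate must be exactly 0 or 1.
  local flag
  for flag in "$FAIL_ON_PERSISTENT" "$FAIL_ON_FLAKE"; do
    case "$flag" in
      0|1) ;;
      *) echo "fail gates must be 0 or 1" >&2; return 2 ;;
    esac
  done
}

resolve_inputs && echo "format=$OUTPUT_FORMAT top=$TRIAGE_TOP"
```

Because every input is optional, running with no variables set yields the default text report over test-results/**/*.xml.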
Run
Text report:
JUNIT_GLOB='artifacts/junit/**/*.xml' \
TRIAGE_TOP=15 \
bash skills/ci-flake-triage/scripts/triage-flakes.sh
JSON output for CI ingestion:
JUNIT_GLOB='artifacts/junit/**/*.xml' \
OUTPUT_FORMAT=json \
FAIL_ON_PERSISTENT=1 \
bash skills/ci-flake-triage/scripts/triage-flakes.sh
Run with bundled fixtures:
JUNIT_GLOB='skills/ci-flake-triage/fixtures/*.xml' \
bash skills/ci-flake-triage/scripts/triage-flakes.sh
Output contract
- Exit 0 when no fail gates are enabled (default)
- Exit 1 if FAIL_ON_PERSISTENT=1 and persistent failures are found
- Exit 1 if FAIL_ON_FLAKE=1 and flaky candidates are found
- In text mode, prints summary + top flaky + persistent failures
- In json mode, prints machine-readable summary and testcase details
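The exit-code rules above amount to two independent gates. The sketch below shows one way to express them; `gate_status` and its two count arguments are hypothetical names used only for this example, not the script's real internals.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the exit-code contract.
# Arguments: number of persistent failures, number of flaky candidates.
gate_status() {
  local persistent_count="$1" flaky_count="$2"

  # Gate 1: fail only if enabled AND persistent failures exist.
  if [[ "${FAIL_ON_PERSISTENT:-0}" == 1 ]] && (( persistent_count > 0 )); then
    return 1
  fi

  # Gate 2: fail only if enabled AND flaky candidates exist.
  if [[ "${FAIL_ON_FLAKE:-0}" == 1 ]] && (( flaky_count > 0 )); then
    return 1
  fi

  # No gate enabled, or no enabled gate tripped.
  return 0
}

# With both gates off (the default), findings never fail the job.
FAIL_ON_PERSISTENT=0 FAIL_ON_FLAKE=0
gate_status 3 2 && echo "ok: report only"
```

In a CI job, enabling only FAIL_ON_PERSISTENT=1 lets flaky tests be reported without blocking the pipeline, while genuinely broken tests still fail it.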
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-daniellummis-ci-flake-triage": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Related Skills
github-actions-recovery-latency-audit
Measure GitHub Actions failure recovery latency and unresolved incident age by workflow group.
github-actions-cache-hardening-audit
Audit GitHub Actions workflow cache usage for poisoning, keying, and secret-path risks.
render-env-guard
Preflight-check Render service environment variables before deploys; catches missing keys and placeholder/template values that commonly break production rollouts.
github-actions-trigger-health-audit
Audit GitHub Actions run health by trigger event and workflow so flaky or noisy automation sources are easy to prioritize.
github-actions-run-gap-audit
Detect GitHub Actions workflow groups that stopped running on their normal cadence using median run intervals and current inactivity gap.