skills-eval
Evaluate and improve Claude skill quality through auditing
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/athola/nm-abstract-skills-evalNight Market Skill — ported from claude-night-market/abstract. For the full experience with agents, hooks, and commands, install the Claude Code plugin.
Skills Evaluation and Improvement
Table of Contents
- Overview
- Quick Start
- Evaluation Workflow
- Evaluation and Optimization
- Resources
Overview
This framework audits Claude skills against quality standards to improve performance and reduce token consumption. Automated tools analyze skill structure, measure context usage, and identify specific technical improvements. Run verification commands after each audit to confirm fixes work correctly.
The skills-auditor provides structural analysis, while the improvement-suggester ranks fixes by impact. Compliance is verified through the compliance-checker. Runtime efficiency is monitored by tool-performance-analyzer and token-usage-tracker.
Quick Start
Basic Audit
Run a full audit of all skills or target a specific file to identify structural issues.
# Audit all skills
make audit-all
# Audit specific skill
make audit-skill TARGET=path/to/skill/SKILL.md
Analysis and Optimization
Use skill_analyzer.py for complexity checks and token_estimator.py to verify the context budget.
make analyze-skill TARGET=path/to/skill/SKILL.md
make estimate-tokens TARGET=path/to/skill/SKILL.md
Improvements
Generate a prioritized plan and verify standards compliance using improvement_suggester.py and compliance_checker.py.
make improve-skill TARGET=path/to/skill/SKILL.md
make check-compliance TARGET=path/to/skill/SKILL.md
Evaluation Workflow
Start with make audit-all to inventory skills and identify high-priority targets. For each skill requiring attention, run analysis with analyze-skill to map complexity. Generate an improvement plan, apply fixes, and run check-compliance to verify the skill meets project standards. Finalize by checking the token budget for efficiency.
Evaluation and Optimization
Quality assessments use the skills-auditor and improvement-suggester to generate detailed reports. Performance analysis focuses on token efficiency through the token-usage-tracker and tool performance via tool-performance-analyzer. For standards compliance, the compliance-checker automates common fixes for structural issues.
Scoring and Prioritization
We evaluate skills across five dimensions: structure compliance, content quality, token efficiency, activation reliability, and tool integration. Scores above 90 represent production-ready skills, while scores below 50 indicate critical issues requiring immediate attention.
Metadata
Not sure this is the right skill?
Describe what you want to build — we'll match you to the best skill from 16,000+ options.
Find the right skillPaste this into your clawhub.json to enable this plugin.
{
"plugins": {
"official-athola-nm-abstract-skills-eval": {
"enabled": true,
"auto_update": true
}
}
}Related Skills
extract
Analyze a codebase and build a knowledge base of business logic, architecture, data flow, and engineering patterns. The foundation for gauntlet challenges and agent integration
discourse
>- Scan community discussion channels (HN, Lobsters, Reddit, tech blogs) for experience reports and opinions on a topic
synthesize
>- Merge, deduplicate, rank, and format research findings from multiple channels into a coherent report. Use after research agents return their results
workflow-monitor
Detect workflow failures and inefficient patterns, then create GitHub issues for improvement via /fix-workflow
architecture-paradigm-hexagonal
Hexagonal (Ports and Adapters) architecture isolating domain logic from infrastructure