Official Verified

Awesome Autoresearch

Skill by adisinghstudent

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/adisinghstudent/awesome-autoresearch
Or add the skill file manually:
---
name: awesome-autoresearch
description: Curated index of autonomous improvement loops, research agents, and autoresearch-style systems inspired by Karpathy's autoresearch.
triggers:
  - set up an autoresearch loop
  - build a self-improving agent
  - implement autonomous research workflow
  - create an experiment optimization loop
  - add autoresearch skill to my project
  - build a keep-or-revert improvement loop
  - set up a research agent pipeline
  - automate ml experimentation with agents
---

# 🔬 Awesome Autoresearch

> Skill by [ara.so](https://ara.so) — Daily 2026 Skills collection.

A curated index of autonomous improvement loops, research agents, and autoresearch-style systems. The core pattern: an LLM agent proposes a change, runs an experiment, measures a metric, and keeps the change if the metric improved or reverts it otherwise, looping until a budget is exhausted or a threshold is met.

---

## What Is Autoresearch?

Autoresearch (originated by [karpathy/autoresearch](https://github.com/karpathy/autoresearch)) is an **autonomous experiment loop** where:

1. An LLM agent reads a codebase and a goal metric
2. It proposes a targeted change (hypothesis)
3. The change is applied and the metric is measured
4. If the metric improves → keep; otherwise → revert
5. Repeat within a fixed compute/time budget

The pattern generalizes to any measurable objective: model loss, Sharpe ratio, test pass rate, API latency, prompt quality, etc.
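
Some of these objectives are maximized (Sharpe ratio, test pass rate) and others are minimized (loss, latency), so the keep-or-revert comparison depends on direction. The following is a minimal sketch, assuming a hypothetical `Objective` helper that is not part of karpathy/autoresearch or any project listed here, of how one comparison could cover both cases and express an optional stopping threshold:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Objective:
    """Hypothetical helper: names a metric and says which direction is better."""
    name: str
    maximize: bool = True
    target: Optional[float] = None   # optional early-stop threshold

    def is_improvement(self, new: float, best: float) -> bool:
        # Strict comparison: a tie counts as "no improvement" and is reverted
        return new > best if self.maximize else new < best

    def reached_target(self, value: float) -> bool:
        # Without a target, only the iteration or time budget ends the loop
        if self.target is None:
            return False
        return value >= self.target if self.maximize else value <= self.target


loss = Objective("val_loss", maximize=False, target=0.10)
pass_rate = Objective("test_pass_rate", maximize=True)

print(loss.is_improvement(new=0.42, best=0.50))       # True: lower loss is kept
print(pass_rate.is_improvement(new=0.90, best=0.90))  # False: a tie is reverted
print(loss.reached_target(0.09))                      # True: threshold met
```

Treating ties as failures is a design choice in this sketch: it keeps the history free of changes that add complexity without a measurable gain.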

---

## Core Loop Pattern

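```python
# Canonical keep-or-revert autoresearch loop
import json
import subprocess

METRIC_CMD = ["python", "eval.py"]          # prints JSON {"score": float} to stdout
BUDGET = 20                                  # number of iterations
GOAL = "maximize score"

def measure() -> float:
    # check=True raises if eval.py exits non-zero, so a crash is never read as a score
    result = subprocess.run(METRIC_CMD, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)["score"]

def run_loop(agent_propose_fn):
    # Assumes a git repo with a clean working tree, so a revert discards only the agent's edits
    best_score = measure()
    print(f"Baseline: {best_score:.4f}")

    for step in range(BUDGET):
        # Agent proposes a diff/edit and applies it to the working tree
        agent_propose_fn(goal=GOAL, step=step, best=best_score)

        try:
            score = measure()
        except (subprocess.CalledProcessError, ValueError, KeyError):
            # An edit that breaks the eval counts as a failed experiment
            score = float("-inf")

        if score > best_score:
            best_score = score
            print(f"[{step}] ✅ Improved → {score:.4f}")
            # Commit the change, including new files (git add -A && git commit)
            subprocess.run(["git", "add", "-A"], check=True)
            subprocess.run(["git", "commit", "-m", f"step {step}: {score:.4f}"], check=True)
        else:
            print(f"[{step}] ❌ Reverted  ({score:.4f} <= {best_score:.4f})")
            # Revert to last good state: restore tracked files, delete new untracked ones
            subprocess.run(["git", "checkout", "--", "."], check=True)
            subprocess.run(["git", "clean", "-fd"], check=True)

    print(f"Final best: {best_score:.4f}")
```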
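
The loop only assumes that the metric command prints a JSON object with a `score` key on stdout. The following is a minimal sketch of what such an `eval.py` could look like, using test pass rate as the metric. The `CHECKS` list and the two check functions are hypothetical placeholders for a project's real evaluation, not something the original skill defines:

```python
# eval.py (hypothetical): prints {"score": <float>} on stdout for the loop to parse
import json


def check_addition() -> bool:
    return 1 + 1 == 2


def check_sorting() -> bool:
    return sorted([3, 1, 2]) == [1, 2, 3]


# Placeholder checks; a real project would run its own tests or benchmark here
CHECKS = [check_addition, check_sorting]


def compute_score(checks) -> float:
    """Fraction of checks that pass; a check that raises counts as a failure."""
    passed = 0
    for check in checks:
        try:
            passed += bool(check())
        except Exception:
            pass
    return passed / len(checks) if checks else 0.0


if __name__ == "__main__":
    # Print only the JSON object: anything else on stdout would break json.loads
    print(json.dumps({"score": compute_score(CHECKS)}))
```

Any logging from the evaluation is best sent to stderr, so that stdout stays a single parseable JSON object.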

## Installation Patterns by Platform

### Claude Code Skill (SKILL.md / CLAUDE.md)

Create `CLAUDE.md` or `.claude/skills/autoresearch.md` in your repo:

## Autoresearch Loop

Metadata

Stars: 3809
Views: 0
Updated: 2026-04-05
Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-adisinghstudent-awesome-autoresearch": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Safety Note: ClawKit audits metadata but not runtime behavior. Use with caution.