Official Verified

autoresearch

Autonomous AI research skill for running automated neural network experiments. Based on Andrej Karpathy's autoresearch project, it enables AI agents to autonomously modify training code, run experiments, evaluate results, and iteratively improve models. Use when: (1) setting up autonomous research experiments, (2) running automated neural network training, (3) conducting AI-driven research optimization, (4) experimenting with model architectures and hyperparameters, (5) implementing autonomous research loops, or (6) the user mentions "autonomous research", "AI experiments", "automated training", "neural network optimization", or "autoresearch".


Install via CLI (Recommended)

clawhub install openclaw/skills/skills/baiyunrei2025/autoresearch-karpathy

Autoresearch Skill

This skill enables autonomous AI research experiments based on Andrej Karpathy's autoresearch project. It allows AI agents to autonomously modify neural network training code, run experiments, evaluate results, and iteratively improve models.

Core Concept

The idea: give an AI agent a small but real LLM training setup and let it experiment autonomously. The agent modifies the code, trains for 5 minutes, checks if the result improved, keeps or discards, and repeats. You can leave it running overnight and wake up to a log of experiments and (hopefully) a better model.
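
The fixed training budget is what makes experiments comparable: every run gets the same wall-clock time, so a change only wins if it learns more in that time. The sketch below shows one way such a budget could be enforced. It is an illustration only; `run_with_budget`, `step`, and the injectable `clock` are hypothetical names, and how train.py actually enforces its 5-minute limit is not shown in this document.

```python
import time
from typing import Callable


def run_with_budget(
    step: Callable[[int], None],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Call step(i) until the wall-clock budget is used up; return steps completed.

    The budget is checked before each step, so the last step may overrun slightly.
    """
    start = clock()
    steps = 0
    while clock() - start < budget_seconds:
        step(steps)
        steps += 1
    return steps


if __name__ == "__main__":
    # A 5-minute budget would be budget_seconds=300; use a tiny one for a demo.
    done = run_with_budget(lambda i: time.sleep(0.01), budget_seconds=0.05)
    print(f"completed {done} steps")
```

Passing the clock as a parameter keeps the function testable without waiting for real time to elapse.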

Key Files

The project has three core files:

  1. prepare.py — Fixed constants, one-time data prep (downloads training data, trains a BPE tokenizer), and runtime utilities (dataloader, evaluation). Not modified.
  2. train.py — The single file the agent edits. Contains the full GPT model, optimizer (Muon + AdamW), and training loop. Everything is fair game: architecture, hyperparameters, optimizer, batch size, etc. This file is edited and iterated on by the agent.
  3. program.md — Baseline instructions for the agent. This file is edited and iterated on by the human.

Requirements

  • Single NVIDIA GPU (tested on H100)
  • Python 3.10+
  • uv package manager

Quick Start Workflow

Phase 1: Initial Setup

  1. Clone the repository (if not already done):

    git clone https://github.com/karpathy/autoresearch.git
    cd autoresearch
    
  2. Install dependencies:

    uv sync
    
  3. Prepare data (one-time setup):

    uv run prepare.py
    

Phase 2: Experiment Setup

  1. Agree on a run tag (e.g., a date-based tag such as mar20)
  2. Create a new branch:
    git checkout -b autoresearch/<tag>
    
  3. Initialize results file:
    echo -e "commit\tval_bpb\tmemory_gb\tstatus\tdescription" > results.tsv
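
For agents that record results from Python rather than the shell, the same file could be managed as sketched below. The column names come from the header above and the status values from the Key Metrics list; the helper names `init_results` and `append_result` and the number formatting are assumptions made for illustration.

```python
import csv
from pathlib import Path

# Column order taken from the header written in step 3.
COLUMNS = ["commit", "val_bpb", "memory_gb", "status", "description"]
VALID_STATUSES = {"keep", "discard", "crash"}


def init_results(path: Path) -> None:
    """Write the header row, replacing any existing file (like the `>` redirect)."""
    with path.open("w", newline="") as f:
        csv.writer(f, delimiter="\t", lineterminator="\n").writerow(COLUMNS)


def append_result(path: Path, commit: str, val_bpb: float, memory_gb: float,
                  status: str, description: str) -> None:
    """Append one experiment row to the TSV file."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    row = [commit, f"{val_bpb:.6f}", f"{memory_gb:.1f}", status, description]
    with path.open("a", newline="") as f:
        csv.writer(f, delimiter="\t", lineterminator="\n").writerow(row)


if __name__ == "__main__":
    results = Path("results.tsv")
    init_results(results)
    # Placeholder values for illustration, not real measurements.
    append_result(results, "abc1234", 1.0, 40.0, "keep", "baseline")
    print(results.read_text())
```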
    

Phase 3: Autonomous Experimentation Loop

The agent follows this loop indefinitely:

LOOP FOREVER:
  1. Look at current git state
  2. Modify train.py with experimental idea
  3. git commit
  4. Run experiment: uv run train.py > run.log 2>&1
  5. Extract results: grep "^val_bpb:\|^peak_vram_mb:" run.log
  6. If crash → analyze logs and fix or mark as crash
  7. Record results in results.tsv
  8. If improved → keep commit
  9. If not improved → git reset
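
Steps 5 through 9 can be sketched in Python as follows. This is a minimal illustration, not the project's own code: it assumes each metric appears in run.log as a line of the form `name: number` (the grep pattern above only confirms the `val_bpb:` and `peak_vram_mb:` prefixes), and it treats "improved" as strictly lower val_bpb than the best kept run. The function names are hypothetical.

```python
import re
from typing import Dict, Optional

# Matches the same line prefixes as the grep in step 5; the numeric
# format after the colon is an assumption.
METRIC_RE = re.compile(
    r"^(val_bpb|peak_vram_mb):[ \t]*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)[ \t]*$",
    re.MULTILINE,
)


def parse_run_log(text: str) -> Dict[str, float]:
    """Return the metrics found in the log text, keyed by metric name."""
    return {name: float(value) for name, value in METRIC_RE.findall(text)}


def decide_status(metrics: Dict[str, float], best_val_bpb: Optional[float]) -> str:
    """Return crash if val_bpb is missing, keep if it beats the best, else discard."""
    if "val_bpb" not in metrics:
        return "crash"
    if best_val_bpb is None or metrics["val_bpb"] < best_val_bpb:
        return "keep"
    return "discard"


if __name__ == "__main__":
    # Made-up log content for illustration only.
    sample = "step 10 loss 3.2\nval_bpb: 0.9979\npeak_vram_mb: 45060.2\n"
    metrics = parse_run_log(sample)
    print(metrics, decide_status(metrics, best_val_bpb=1.0))
```

A run whose log has no val_bpb line is treated as a crash here, which matches step 6 only loosely: the agent is still expected to read the log and decide whether the failure is fixable.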

Key Metrics

  • val_bpb (validation bits per byte) — Lower is better, vocab-size-independent
  • Training time — Fixed 5-minute budget per experiment
  • Peak VRAM — Peak GPU memory usage; logged as peak_vram_mb in run.log and recorded as memory_gb in results.tsv
  • Status — keep, discard, or crash
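
Bits per byte normalizes the model's total cross-entropy by the number of raw bytes in the evaluated text rather than by the number of tokens, which is why it stays comparable when the tokenizer or vocabulary size changes. The sketch below shows the standard conversion from a summed loss in nats, plus the unit conversion between the logged peak_vram_mb and the recorded memory_gb. The function names are hypothetical, the evaluation code in prepare.py may compute the metric differently, and dividing by 1024 (rather than 1000) is an assumption.

```python
import math


def bits_per_byte(total_nats: float, total_bytes: int) -> float:
    """Convert a summed cross-entropy (in nats) over a text into bits per byte."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    # Dividing by ln(2) converts nats to bits; dividing by bytes normalizes.
    return total_nats / (math.log(2) * total_bytes)


def mb_to_gb(peak_vram_mb: float) -> float:
    """Convert megabytes to gigabytes, assuming 1 GB = 1024 MB."""
    return peak_vram_mb / 1024


if __name__ == "__main__":
    # A model that spends exactly one bit on every byte scores 1.0.
    print(bits_per_byte(total_nats=math.log(2) * 8, total_bytes=8))
    print(mb_to_gb(2048))
```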

Constraints

What the agent CAN do:

  • Modify train.py (architecture, optimizer, hyperparameters, training loop, etc.)
  • Experiment with different model configurations
  • Run training experiments autonomously

Metadata

Stars: 4473
Views: 0
Updated: 2026-05-01

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-baiyunrei2025-autoresearch-karpathy": {
      "enabled": true,
      "auto_update": true
    }
  }
}
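
If clawhub.json already contains other plugins, pasting the snippet by hand risks overwriting them. The sketch below merges the entry into an existing file instead. It assumes only the JSON shape shown above; the helper name `enable_plugin` is hypothetical, and nothing here is part of any ClawKit or clawhub tooling.

```python
import json
from pathlib import Path

# Plugin key copied from the configuration snippet above.
PLUGIN_ID = "official-baiyunrei2025-autoresearch-karpathy"


def enable_plugin(config_path: Path) -> dict:
    """Add or update the plugin entry, keeping any other settings in the file."""
    if config_path.exists():
        config = json.loads(config_path.read_text())
    else:
        config = {}
    plugins = config.setdefault("plugins", {})
    plugins[PLUGIN_ID] = {"enabled": True, "auto_update": True}
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


if __name__ == "__main__":
    print(enable_plugin(Path("clawhub.json")))
```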
Safety Note: ClawKit audits metadata but not runtime behavior. Use with caution.