autoresearch

Autonomous AI research skill for running automated neural network experiments. Based on Andrej Karpathy's autoresearch project, it lets AI agents autonomously modify training code, run experiments, evaluate results, and iteratively improve models. Use when: (1) setting up autonomous research experiments, (2) running automated neural network training, (3) conducting AI-driven research optimization, (4) experimenting with model architectures and hyperparameters, (5) implementing autonomous research loops, or (6) the user mentions "autonomous research", "AI experiments", "automated training", "neural network optimization", or "autoresearch".

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/baiyunrei2025/autoresearch-karpathy

Autoresearch Skill

This skill enables autonomous AI research experiments based on Andrej Karpathy's autoresearch project. It allows AI agents to autonomously modify neural network training code, run experiments, evaluate results, and iteratively improve models.

Core Concept

The idea: give an AI agent a small but real LLM training setup and let it experiment autonomously. The agent modifies the code, trains for 5 minutes, checks if the result improved, keeps or discards, and repeats. You can leave it running overnight and wake up to a log of experiments and (hopefully) a better model.

Key Files

The project has three core files:

prepare.py: Fixed constants, one-time data prep (downloads training data, trains a BPE tokenizer), and runtime utilities (dataloader, evaluation). Not modified.
train.py: The single file the agent edits and iterates on. Contains the full GPT model, optimizer (Muon + AdamW), and training loop. Everything is fair game: architecture, hyperparameters, optimizer, batch size, etc.
program.md: Baseline instructions for the agent. This file is edited and iterated on by the human.

Requirements

Single NVIDIA GPU (tested on H100)
Python 3.10+
uv package manager
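
A quick pre-flight check for the requirements above can be sketched in Python. This is only an illustration, not part of the autoresearch project: the function name `check_requirements` is hypothetical, and finding `nvidia-smi` on the PATH is used as a rough proxy for "an NVIDIA GPU and driver are present".

```python
import shutil
import sys


def check_requirements(min_python=(3, 10)):
    """Return a dict mapping each requirement to True or False."""
    return {
        # Python 3.10+ is required by the skill description.
        "python": sys.version_info >= min_python,
        # The uv package manager must be on PATH for `uv sync` / `uv run`.
        "uv": shutil.which("uv") is not None,
        # Assumption: nvidia-smi on PATH is only a proxy for a usable NVIDIA GPU.
        "nvidia_gpu": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

It does not verify that the GPU has enough memory for the default configuration.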

Quick Start Workflow

Phase 1: Initial Setup

Clone the repository (if not already done):
```
git clone https://github.com/karpathy/autoresearch.git
cd autoresearch
```

Install dependencies:
```
uv sync
```
Prepare data (one-time setup):
```
uv run prepare.py
```

Phase 2: Experiment Setup

Agree on a run tag (e.g., based on date like mar20)
Create a new branch:
```
git checkout -b autoresearch/<tag>
```
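
If the run tag is derived from the date, the naming can be sketched as follows. The helper names are hypothetical, and the exact tag format (lowercase month abbreviation followed by the day, as in `mar20`) is inferred from the single example above rather than specified anywhere.

```python
from datetime import date

# Fixed English abbreviations so the tag does not depend on the system locale.
MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec")


def default_run_tag(today=None):
    """Build a tag such as 'mar20' from a date (defaults to today)."""
    today = today or date.today()
    return f"{MONTHS[today.month - 1]}{today.day}"


def branch_name(tag):
    """Return the branch name used for a run, following autoresearch/<tag>."""
    return f"autoresearch/{tag}"
```

The resulting string can be passed to `git checkout -b`.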

Initialize results file:

```
echo -e "commit\tval_bpb\tmemory_gb\tstatus\tdescription" > results.tsv
```
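
Later rows have to follow the same five-column layout. A minimal sketch of appending one row from Python, assuming the column order in the header above; the function name `append_result` and the number formatting (six decimals for val_bpb, one for memory_gb) are illustrative choices, not something the project prescribes here.

```python
import csv
from pathlib import Path

COLUMNS = ["commit", "val_bpb", "memory_gb", "status", "description"]


def append_result(path, commit, val_bpb, memory_gb, status, description):
    """Append one experiment row to a tab-separated results file."""
    path = Path(path)
    # Write the header only if the file does not exist yet or is empty.
    is_new = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        if is_new:
            writer.writerow(COLUMNS)
        writer.writerow(
            [commit, f"{val_bpb:.6f}", f"{memory_gb:.1f}", status, description]
        )
```

Using the `csv` module rather than string joins means a description containing a tab or a quote is quoted instead of silently shifting columns.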

Phase 3: Autonomous Experimentation Loop

The agent follows this loop indefinitely:

```
LOOP FOREVER:
  1. Look at current git state
  2. Modify train.py with experimental idea
  3. git commit
  4. Run experiment: uv run train.py > run.log 2>&1
  5. Extract results: grep "^val_bpb:\|^peak_vram_mb:" run.log
  6. If crash → analyze logs and fix or mark as crash
  7. Record results in results.tsv
  8. If improved → keep commit
  9. If not improved → git reset back to the last kept commit
```
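
Steps 5 to 9 of the loop can be sketched in Python as a log parser plus a keep/discard decision. This is an illustration under stated assumptions: the log is assumed to contain lines of the form `val_bpb: <number>` and `peak_vram_mb: <number>` (which is what the grep pattern implies), the function names are hypothetical, and the MB to GB conversion by 1024 is an assumption.

```python
import re

# Matches the same line prefixes as the grep in step 5.
METRIC_RE = re.compile(
    r"^(val_bpb|peak_vram_mb):[ \t]*([-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?)[ \t]*$",
    re.MULTILINE,
)


def parse_run_log(text):
    """Return the metrics found in the log, e.g. {'val_bpb': ..., 'peak_vram_mb': ...}."""
    return {name: float(value) for name, value in METRIC_RE.findall(text)}


def vram_gb(metrics):
    """Convert peak_vram_mb to GB (assumption: 1 GB = 1024 MB)."""
    return metrics["peak_vram_mb"] / 1024


def decide_status(metrics, best_val_bpb):
    """Return 'crash' if no val_bpb was logged, else 'keep' or 'discard'."""
    if "val_bpb" not in metrics:
        return "crash"
    # Lower val_bpb is better, so only a strict improvement is kept.
    return "keep" if metrics["val_bpb"] < best_val_bpb else "discard"
```

A non-numeric value such as `nan` does not match the pattern, so a diverged run is reported as a crash rather than compared against the best score.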

Key Metrics

val_bpb (validation bits per byte): Lower is better; independent of vocabulary size
Training time: Fixed 5-minute budget per experiment
Peak VRAM: Peak GPU memory, logged as peak_vram_mb and recorded in results.tsv as memory_gb
Status: keep, discard, or crash
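
For reference, bits per byte is usually defined as the total cross-entropy loss converted from nats to bits, divided by the number of bytes of raw text evaluated. The sketch below shows that standard definition; how prepare.py actually computes val_bpb is not shown here, so the function names and signatures are assumptions.

```python
import math


def bits_per_byte(total_nats: float, total_bytes: int) -> float:
    """Convert a summed cross-entropy loss (in nats) into bits per byte."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    # Dividing by ln(2) converts nats to bits.
    return total_nats / (math.log(2) * total_bytes)


def bits_per_byte_from_mean_loss(
    mean_nats_per_token: float, n_tokens: int, n_bytes: int
) -> float:
    """Same metric, starting from the mean per-token loss a training loop reports."""
    return bits_per_byte(mean_nats_per_token * n_tokens, n_bytes)
```

Because the denominator is bytes of text rather than tokens, two tokenizers that split the same text into different numbers of tokens are still compared on the same scale, which is why the metric does not depend on vocabulary size.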

Constraints

What the agent CAN do:

Modify train.py (architecture, optimizer, hyperparameters, training loop, etc.)
Experiment with different model configurations
Run training experiments autonomously

Metadata

Author: @baiyunrei2025

Stars: 4473

Updated: 2026-05-01

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-baiyunrei2025-autoresearch-karpathy": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Safety Note: ClawKit audits metadata but not runtime behavior. Use with caution.