Official Verified

autoresearch-loop

Apply Karpathy's autoresearch methodology to iteratively improve anything measurable — Claude skills, n8n workflows, system prompts, business processes, or any artifact with a clear quality metric. Inspired by github.com/karpathy/autoresearch (56k stars). The loop: propose a change → test it → measure against the target metric → keep if better, discard if not → repeat until a stopping condition is met. Trigger this skill when the user explicitly requests an iterative improvement loop, e.g.: "improve this skill automatically", "iterate on this workflow", "run autoresearch on", "run experiments on this", "optimize this automatically", "set up an improvement loop", or "run the autoresearch method".


Install via CLI (Recommended)

clawhub install openclaw/skills/skills/autosolutionsai-didac/as-autoresearch-loop


Autoresearch Loop Skill

Karpathy's autoresearch methodology applied to improving Claude skills, n8n workflows, system prompts, and business processes.

Core idea: Define what "better" means. Lock everything except the artifact being improved. Propose a change → test → measure → keep or discard → repeat until a stopping condition is met.
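The core idea can be sketched as a small greedy loop. This is a minimal illustration, not the skill's actual implementation: the names `autoresearch_loop`, `propose`, `evaluate`, `max_iterations`, and `target_score` are hypothetical, and the toy artifact and metric exist only to make the sketch runnable.

```python
import random
from typing import Callable, List, Tuple


def autoresearch_loop(
    artifact: str,
    propose: Callable[[str], str],     # hypothetical: returns a modified copy of the artifact
    evaluate: Callable[[str], float],  # hypothetical: fixed eval, higher is better
    max_iterations: int = 20,          # stopping condition 1: experiment budget
    target_score: float = 1.0,         # stopping condition 2: metric is good enough
) -> Tuple[str, float, List[float]]:
    """Keep a candidate only if it scores strictly higher than the current best."""
    best, best_score = artifact, evaluate(artifact)
    history = [best_score]
    for _ in range(max_iterations):
        if best_score >= target_score:
            break
        candidate = propose(best)          # propose a change
        score = evaluate(candidate)        # test and measure
        if score > best_score:             # keep if better
            best, best_score = candidate, score
        history.append(best_score)         # otherwise the candidate is discarded
    return best, best_score, history


# Toy setup (assumption): the "artifact" is a string and the metric is the
# fraction of required words it contains.
REQUIRED = ["metric", "eval", "budget"]


def toy_evaluate(text: str) -> float:
    return sum(word in text for word in REQUIRED) / len(REQUIRED)


rng = random.Random(0)  # seeded so runs are repeatable


def toy_propose(text: str) -> str:
    return text + " " + rng.choice(REQUIRED + ["filler"])


best, score, history = autoresearch_loop(
    "artifact", toy_propose, toy_evaluate, max_iterations=50
)
```

Because a candidate is kept only when it beats the current best, `history` never decreases. In a real campaign, `evaluate` would run the fixed eval set against the candidate artifact.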

When NOT to use this loop:

You can't define a single measurable metric (e.g. "improve my writing style" — too subjective)
The artifact is too large to evaluate cheaply in a fixed budget
There's no fixed eval set (or you can't create one) — without a stable yardstick, you're just guessing
You need to improve two interdependent artifacts simultaneously — do them sequentially instead
The artifact is a one-time document (a single client proposal, a one-off report) — the loop is for artifacts that will be reused and improved over time. A one-time deliverable has no future eval value; just write it well directly

If you can't answer "what number tells me if this experiment worked?", stop and define that first.

The methodology is format-agnostic: The loop works for any artifact type — code, prompts, documents, design systems, API configurations, process specs — as long as you can define an artifact, a metric, and a repeatable eval. For novel artifact types not covered by the examples below: walk through the setup phase (artifact → metric → eval → budget) and creatively define each. A Figma component library's metric could be a checklist pass rate (accessibility, consistency, coverage); its eval could be test scenarios ("render a data table", "create a form with validation states") scored against that checklist. Start with a small eval (5–10 test cases) to validate the metric produces meaningful signal before committing to a full campaign.
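A checklist pass rate like the one described above could be computed roughly as follows. This is a sketch under stated assumptions: `checklist_pass_rate`, the two checks, and the sample outputs are all hypothetical, and a real checklist would test accessibility, consistency, or coverage rather than string contents.

```python
from typing import Callable, Dict, List

Check = Callable[[str], bool]


def checklist_pass_rate(outputs: List[str], checks: Dict[str, Check]) -> float:
    """Fraction of (output, check) pairs that pass; one number per experiment."""
    total = len(outputs) * len(checks)
    if total == 0:
        raise ValueError("need at least one output and one check")
    passed = sum(check(out) for out in outputs for check in checks.values())
    return passed / total


# Hypothetical checklist: each check is a yes/no question about one output.
checks: Dict[str, Check] = {
    "non_empty": lambda out: bool(out.strip()),
    "mentions_label": lambda out: "label" in out.lower(),
}

# Hypothetical eval outputs, one per test scenario.
outputs = [
    "<table><caption>Label</caption></table>",
    "",
    "<form>label: name</form>",
]

rate = checklist_pass_rate(outputs, checks)
```

Here four of the six (output, check) pairs pass, so `rate` is 4/6. If every candidate scores the same on a small eval like this, the metric is not producing signal and should be revised before a full campaign.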

Setup Phase

Before the loop starts, establish these five things with the user:

1. The Artifact (What You're Improving)

The single file, document, workflow, or process being iteratively modified. Think of this as train.py in Karpathy's repo — the one thing the agent edits.

Examples:

A SKILL.md file
An n8n workflow JSON
A system prompt
An SOP document
A business process description

Fixed files: Identify what must NOT change — the evaluation criteria, input test cases, external integrations. These are your prepare.py.
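One way to enforce that fixed files stay fixed is to hash them before the loop starts and re-check the hashes after each experiment. The source does not prescribe this mechanism; `snapshot` and `changed_files` are hypothetical helper names used only for illustration.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable, List


def _digest(path: Path) -> str:
    """SHA-256 of the file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def snapshot(paths: Iterable[Path]) -> Dict[str, str]:
    """Record a digest for every fixed file before the loop starts."""
    return {str(p): _digest(p) for p in paths}


def changed_files(baseline: Dict[str, str]) -> List[str]:
    """Return the fixed files whose contents no longer match the baseline."""
    return [
        name for name, digest in baseline.items()
        if _digest(Path(name)) != digest
    ]
```

Calling `changed_files(baseline)` after each experiment and discarding the experiment when the list is non-empty keeps the yardstick stable while only the artifact changes.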


Metadata

Author: @autosolutionsai-didac

Stars: 4473

Updated: 2026-05-01


Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-autosolutionsai-didac-as-autoresearch-loop": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Safety Note: ClawKit audits metadata but not runtime behavior. Use with caution.

Related Skills

agent-memory-setup

Set up the full OpenClaw agent memory system with 3-tier memory (HOT/WARM/COLD), daily logs, semantic search (QMD), and lossless context management (Lossless Claw). Use when onboarding a new agent, setting up memory for a fresh OpenClaw instance, or when asked to install the memory system on a new agent. Triggers on "set up memory", "install memory system", "onboard new agent memory", "memory setup", "agent onboarding", "configure agent memory", "add memory to my agent", "how do I set up memory", "initialize memory", "memory system for OpenClaw".

autosolutionsai-didac 4473

agent-memory-setup-v2

Create a 3-tier memory directory structure (HOT/WARM/COLD) for OpenClaw agents and configure the built-in memory-core plugin to use Google Gemini Embeddings 2 (gemini-embedding-2-preview) for semantic memory search. Creates memory/ directories and stub files only — no code execution or external API calls from the setup script. After setup, the agent's memory_search tool uses Gemini's cloud embedding API to index memory files. Requires a free Google Gemini API key. Use when setting up a new agent's memory system or asked about semantic memory search. Triggers on "set up memory", "memory setup", "agent memory", "gemini memory", "semantic search memory", "onboard new agent".

autosolutionsai-didac 4473

gamma

Create presentations, documents, social posts, and web pages via the Gamma.app API. Use when asked to create a presentation, pitch deck, slide deck, document, social media carousel, or webpage using Gamma. Also use when asked to generate slides, export to PDF/PPTX, or create content from a Gamma template. Triggers on "create a presentation", "make a deck", "gamma", "slides", "pitch deck", "create a document in gamma".

autosolutionsai-didac 4473

deep-research

Conduct deep multi-phase research using parallel subagents and iterative search. Use for deep research requests, comprehensive analysis, competitive intelligence, market research, or thorough investigation of complex topics.

autosolutionsai-didac 4473