Official Verified developer tools Safety 4/5

markdown.new-crawl

Use `https://markdown.new/crawl/{target_url}` endpoints to recursively crawl a site section and return the pages as Markdown. Trigger this skill when the user asks for multi-page extraction, a whole-docs crawl, link-depth crawling, or job-based crawl polling from a URL. Prefer local terminal access (`curl`) with `/crawl`, `/crawl/status/{jobId}`, and `/crawl/{url}` before other browsing methods.
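The endpoint templates named above can be sketched as two small URL builders. This is a minimal illustration, not the skill's own code: the helper names `crawl_url` and `status_url` are hypothetical, and it assumes the target URL is appended to the path exactly as the `/crawl/{target_url}` template suggests (the service may expect it to be encoded).

```python
BASE_URL = "https://markdown.new"  # host taken from the skill description


def crawl_url(target_url: str) -> str:
    """Build the crawl endpoint for a target URL.

    Assumption: the target URL is appended as-is, mirroring the
    `/crawl/{target_url}` template; no encoding is applied here.
    """
    return f"{BASE_URL}/crawl/{target_url}"


def status_url(job_id: str) -> str:
    """Build the polling endpoint for an existing crawl job."""
    return f"{BASE_URL}/crawl/status/{job_id}"


if __name__ == "__main__":
    # These strings can be passed straight to curl from a terminal.
    print(crawl_url("https://docs.example.com/guides"))
    print(status_url("JOB_ID"))  # JOB_ID is a placeholder, not a real job
```

Keeping URL construction in one place makes it easy to log the exact request before it is sent.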

Why use this skill?

Learn how to use the markdown.new-crawl skill to recursively extract website content, convert pages to Markdown, and automate documentation ingestion via OpenClaw.

What This Skill Does

The markdown.new-crawl skill provides a local-first interface for recursively crawling website sections and converting the pages into clean Markdown. It is designed for developers, researchers, and data analysts who need to ingest large documentation sets, multi-page articles, or knowledge bases into an AI-ready format. Using the markdown.new API, it starts asynchronous crawl jobs and polls them for results, with controls for link depth, pattern-based URL filtering, and structured output. It favors terminal-based `curl` calls so that each request is transparent and easy to reproduce.
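Because crawl jobs are asynchronous, a client has to poll the status endpoint until the job finishes. The loop below is a rough sketch of that pattern under stated assumptions: `poll_until_done` is a hypothetical helper, the status fetch is injected as a callable so no network access is needed to try it, and the `"status"` field with its `"running"` and `"completed"` values is invented for the demo because the source does not document the response shape.

```python
import time
from typing import Callable, Dict


def poll_until_done(
    fetch_status: Callable[[], Dict],
    is_done: Callable[[Dict], bool],
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call fetch_status() until is_done(payload) is true or attempts run out."""
    for attempt in range(max_attempts):
        payload = fetch_status()
        if is_done(payload):
            return payload
        if attempt < max_attempts - 1:
            sleep(interval_s)  # wait before the next poll
    raise TimeoutError(f"crawl job not finished after {max_attempts} polls")


if __name__ == "__main__":
    # Simulated responses stand in for GET /crawl/status/{jobId}.
    # The "status" field and its values are hypothetical.
    responses = iter(
        [{"status": "running"}, {"status": "running"}, {"status": "completed"}]
    )
    result = poll_until_done(
        fetch_status=lambda: next(responses),
        is_done=lambda payload: payload.get("status") == "completed",
        sleep=lambda _seconds: None,  # skip real waiting in the demo
    )
    print(result)
```

Passing the completion check in as a predicate keeps the loop independent of whatever fields the real status response turns out to contain.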

Installation

To integrate this tool into your OpenClaw environment, execute the following command in your terminal:

clawhub install openclaw/skills/skills/ctxinf/markdown-new-crawl

Ensure your local environment allows outgoing network connections to markdown.new, since every crawl request goes through that API. Once installed, the skill is discovered automatically by the OpenClaw agent for relevant requests.

Use Cases

This skill is ideal for several high-impact tasks:

  1. Documentation Ingestion: Crawling entire technical docs (e.g., product manuals or API references) to build a local context base for RAG (Retrieval-Augmented Generation).
  2. Competitive Research: Extracting structured text from competitor landing pages or blogs for market trend analysis.
  3. Archival & Backup: Converting legacy or ephemeral web content into persistent Markdown files.
  4. Structured Scraping: Using the `format=json` option to extract page metadata and content into structured datasets for downstream programmatic processing.

Example Prompts

  1. "Crawl the entire documentation for the new API at https://docs.service.com and compile it into a single markdown file for my project context."
  2. "I need to analyze all pages in the /guides section of this site. Please crawl it with a depth of 3 and ignore external links."
  3. "Start a crawl job for https://blog.example.com, limit it to 20 pages, and provide the results in JSON format."
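As an illustration of how the third prompt might translate into a request, the sketch below attaches crawl options to the endpoint URL. It is an assumption that the options travel as query parameters named `depth`, `limit`, and `format`; the source only mentions a depth setting, a page limit, and `format=json`, without stating how they are passed. The helper name `crawl_request_url` is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://markdown.new"  # host taken from the skill description


def crawl_request_url(target_url, depth=None, limit=None, fmt=None):
    """Build a crawl URL with optional settings.

    Assumption: options are sent as query parameters called
    depth, limit, and format. The real API may differ.
    """
    params = {}
    if depth is not None:
        params["depth"] = depth
    if limit is not None:
        params["limit"] = limit
    if fmt is not None:
        params["format"] = fmt
    url = f"{BASE_URL}/crawl/{target_url}"
    return f"{url}?{urlencode(params)}" if params else url


if __name__ == "__main__":
    # Prompt 3: crawl the blog, cap it at 20 pages, ask for JSON.
    print(crawl_request_url("https://blog.example.com", limit=20, fmt="json"))
```

If the target URL carries its own query string, the combined URL becomes ambiguous, so a real client would need to check how the service expects such targets to be encoded.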

Tips & Limitations

  • Rate Management: Each crawl consumes 50 units of your 500-unit daily limit, which allows at most 10 crawls per day. Keep each request lean by setting a low page limit where possible.
  • Depth Matters: The default depth is 5. If your target is a large site, reduce the depth to avoid unnecessary page requests.
  • Fallback: If a crawl fails due to host blocking or timeouts, the skill is designed to fall back to alternative browser-style extraction methods automatically.
  • Retention: Crawl jobs are stored for 14 days. After this period, job IDs will expire and return errors if polled.
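The budget and retention figures above lend themselves to a quick local check before starting or polling a job. This is a minimal sketch using only the numbers stated in the tips; the function names are hypothetical, and it assumes the 14-day retention window is counted from the time the job was created.

```python
from datetime import datetime, timedelta, timezone

DAILY_UNITS = 500               # daily limit stated in the tips
UNITS_PER_CRAWL = 50            # cost per crawl stated in the tips
RETENTION = timedelta(days=14)  # job retention stated in the tips


def crawls_remaining(crawls_used_today: int) -> int:
    """Return how many more crawls fit in today's unit budget."""
    used = crawls_used_today * UNITS_PER_CRAWL
    return max(0, (DAILY_UNITS - used) // UNITS_PER_CRAWL)


def job_expired(created_at: datetime, now: datetime) -> bool:
    """Return True once a job is older than the retention window.

    Assumption: the window starts at job creation time.
    """
    return now - created_at > RETENTION


if __name__ == "__main__":
    print(crawls_remaining(3))
    created = datetime(2026, 3, 1, tzinfo=timezone.utc)
    print(job_expired(created, datetime(2026, 3, 16, tzinfo=timezone.utc)))
```

Checking expiry locally avoids spending a request on a job ID that can only return an error.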

Metadata

Author: @ctxinf
Stars: 3409
Views: 1
Updated: 2026-03-25

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-ctxinf-markdown-new-crawl": {
      "enabled": true,
      "auto_update": true
    }
  }
}
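When `clawhub.json` already lists other plugins, pasting the snippet by hand risks overwriting them. The sketch below merges the entry into an existing configuration instead. It assumes only the structure shown in the snippet (a top-level `plugins` object keyed by plugin ID); the helper name `enable_plugin` and the `some-other-plugin` entry are hypothetical.

```python
import json

PLUGIN_ID = "official-ctxinf-markdown-new-crawl"  # key from the snippet above


def enable_plugin(config: dict) -> dict:
    """Return a copy of config with the plugin entry added or updated."""
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))  # keep existing plugin entries
    plugins[PLUGIN_ID] = {"enabled": True, "auto_update": True}
    merged["plugins"] = plugins
    return merged


if __name__ == "__main__":
    # Hypothetical existing configuration with one unrelated plugin.
    existing = {"plugins": {"some-other-plugin": {"enabled": True}}}
    print(json.dumps(enable_plugin(existing), indent=2))
```

The function returns a new dictionary rather than editing in place, so the original configuration can still be compared or restored.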

Tags (AI)

#scraping #markdown #automation #crawling #web-dev
Safety Score: 4/5

Flags: network-access, external-api