markdown.new-crawl
Use the `https://markdown.new/crawl/{target_url}` endpoint to recursively crawl a site section and return its pages as Markdown. Trigger this skill when the user asks for multi-page extraction, a whole-docs crawl, link-depth crawling, or job-based crawl polling from a URL. Prefer local terminal access (`curl`) with `/crawl`, `/crawl/status/{jobId}`, and `/crawl/{url}` before other browsing methods.
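The endpoint patterns named above can be sketched as two small URL builders. This is a minimal illustration, not the skill's own code: the helper names `crawl_url` and `status_url` are hypothetical, and appending the target URL without percent-encoding is an assumption drawn only from the `/crawl/{target_url}` pattern.

```python
BASE = "https://markdown.new"


def crawl_url(target_url: str) -> str:
    """Build the URL that starts a crawl, following /crawl/{target_url}."""
    # Assumption: the target URL is appended as-is (no percent-encoding).
    return f"{BASE}/crawl/{target_url}"


def status_url(job_id: str) -> str:
    """Build the polling URL, following /crawl/status/{jobId}."""
    return f"{BASE}/crawl/status/{job_id}"


# The resulting strings are what a local `curl` call would be pointed at.
print(crawl_url("https://docs.service.com"))
print(status_url("JOB_ID"))  # JOB_ID is a placeholder, not a real job
```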
Why use this skill?
Learn how to use the markdown.new-crawl skill to recursively extract website content, convert pages to Markdown, and automate documentation ingestion via OpenClaw.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/ctxinf/markdown-new-crawl
What This Skill Does
The markdown.new-crawl skill provides a powerful, local-first interface for recursively scraping website sections and converting them into clean Markdown. It is designed for developers, researchers, and data analysts who need to ingest large documentation sets, multi-page articles, or knowledge bases into an AI-ready format. By leveraging the markdown.new API, it orchestrates asynchronous crawl jobs, allowing for efficient link-depth management, pattern-based URL filtering, and structured data output. It acts as a programmatic bridge between live web content and local context, favoring terminal-based curl execution to ensure transparency and reliability.
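Because crawl jobs are asynchronous, a caller typically has to poll the status endpoint until the job finishes. The loop below is a rough sketch of that pattern under stated assumptions: `fetch_status` is a hypothetical callable that you supply (for example, a wrapper around a `curl` call to `/crawl/status/{jobId}`), and the state names `completed` and `failed` are placeholders, since this page does not document the status response.

```python
import time
from typing import Callable, FrozenSet


def poll_until_done(
    fetch_status: Callable[[], str],
    # Hypothetical terminal states; the real API's vocabulary may differ.
    done_states: FrozenSet[str] = frozenset({"completed", "failed"}),
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status repeatedly until it returns a terminal state."""
    for attempt in range(max_attempts):
        state = fetch_status()
        if state in done_states:
            return state
        # Wait between checks, but not after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"job not finished after {max_attempts} checks")
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real delays; in normal use the default `time.sleep` applies.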
Installation
To integrate this tool into your OpenClaw environment, execute the following command in your terminal:
clawhub install openclaw/skills/skills/ctxinf/markdown-new-crawl
Ensure your local environment allows outgoing network connections to markdown.new, since the skill depends on that API. Once installed, the skill will be automatically discovered by the OpenClaw agent for relevant requests.
Use Cases
This skill is ideal for several high-impact tasks:
- Documentation Ingestion: Crawling entire technical docs (e.g., product manuals or API references) to build a local context base for RAG (Retrieval-Augmented Generation).
- Competitive Research: Extracting structured text from competitor landing pages or blogs for market trend analysis.
- Archival & Backup: Converting legacy or ephemeral web content into persistent Markdown files.
- Structured Scraping: Using the `format=json` flag to extract page metadata and content into structured datasets for downstream programmatic processing.
Example Prompts
- "Crawl the entire documentation for the new API at https://docs.service.com and compile it into a single markdown file for my project context."
- "I need to analyze all pages in the /guides section of this site. Please crawl it with a depth of 3 and ignore external links."
- "Start a crawl job for https://blog.example.com, limit it to 20 pages, and provide the results in JSON format."
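To make the prompts above concrete, here is one way they might translate into request URLs. This is a sketch under assumptions: the option names `depth`, `limit`, and `format` come from this page, but passing them as query-string parameters is a guess, `crawl_request_url` is a hypothetical helper, and `https://example.com/guides` is a stand-in target. The option for ignoring external links is not named here, so it is left out.

```python
from urllib.parse import urlencode


def crawl_request_url(target_url: str, **options) -> str:
    """Build a crawl URL with optional settings appended as a query string."""
    base = f"https://markdown.new/crawl/{target_url}"
    # Assumption: options travel as query parameters; unset ones are dropped.
    params = {key: value for key, value in options.items() if value is not None}
    return f"{base}?{urlencode(params)}" if params else base


# "Crawl it with a depth of 3" (stand-in URL for the /guides section)
print(crawl_request_url("https://example.com/guides", depth=3))

# "Limit it to 20 pages, and provide the results in JSON format"
print(crawl_request_url("https://blog.example.com", limit=20, format="json"))
```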
Tips & Limitations
- Rate Management: Note that each crawl consumes 50 units of your 500-unit daily limit. Optimize your requests by setting a low `limit` where possible.
- Depth Matters: The default depth is 5. If your target is a large site, reduce the `depth` to avoid unnecessary page requests.
- Fallback: If a crawl fails due to host blocking or timeouts, the skill is designed to fall back to alternative browser-style extraction methods automatically.
- Retention: Crawl jobs are stored for 14 days. After this period, job IDs will expire and return errors if polled.
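The quota and retention figures above reduce to simple arithmetic: 500 units at 50 units per crawl allows 10 crawls per day. The sketch below works that out, along with a job's expiry date. The function names are hypothetical, and counting the 14 days from job creation is an assumption, since this page does not say when the retention window starts.

```python
from datetime import datetime, timedelta, timezone

UNITS_PER_CRAWL = 50    # from the Rate Management tip
DAILY_UNIT_LIMIT = 500  # from the Rate Management tip
RETENTION = timedelta(days=14)  # from the Retention tip


def crawls_remaining(units_used: int) -> int:
    """Whole crawls still affordable today, never below zero."""
    return max(0, (DAILY_UNIT_LIMIT - units_used) // UNITS_PER_CRAWL)


def job_expires_at(created_at: datetime) -> datetime:
    """Expiry time, assuming retention is counted from job creation."""
    return created_at + RETENTION


print(crawls_remaining(0))    # 10 crawls on a fresh day
print(crawls_remaining(120))  # 7 crawls left after 120 units
print(job_expires_at(datetime(2025, 1, 1, tzinfo=timezone.utc)))
```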
Metadata
Paste this into your clawhub.json to enable this plugin.
{
"plugins": {
"official-ctxinf-markdown-new-crawl": {
"enabled": true,
"auto_update": true
}
}
}
Tags(AI)
Flags: network-access, external-api