crawl
Crawl any website and save pages as local markdown files. Use when you need to download documentation, knowledge bases, or web content for offline access or analysis. No code required - just provide a URL.
Why use this skill?
Efficiently crawl any website and save content as structured Markdown files. Ideal for documentation archiving and data analysis with zero code requirements.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/barneyjm/crawl
What This Skill Does
The Crawl skill is a robust web scraping and content extraction tool designed to help OpenClaw users convert live websites into structured, offline-ready Markdown files. By leveraging the Tavily Search API, this skill navigates complex URL structures, follows navigational links, and extracts readable content based on specific focus areas or broad site crawls. It serves as a bridge between the vast, chaotic web and the structured environment of your local workspace.
Installation
To install the Crawl skill, run the following command in your terminal: clawhub install openclaw/skills/skills/barneyjm/crawl. After installation, you must configure your Tavily API key to authorize the skill. Add your credentials to your configuration file at ~/.claude/settings.json under the env object as TAVILY_API_KEY. Once configured, the skill is ready to be triggered via the ./scripts/crawl.sh interface or directly through your agent's command interface.
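The configuration step above can be sketched in Python, assuming that ~/.claude/settings.json is a plain JSON object with a top-level env object, as the paragraph describes. The helper name set_env_key is hypothetical and is not part of the skill or of clawhub; editing the file by hand works equally well.

```python
import json
from pathlib import Path


def set_env_key(settings_path, name, value):
    """Add or update one entry under the top-level "env" object of a JSON settings file.

    Hypothetical helper for illustration; it is not shipped with the Crawl skill.
    """
    settings_path = Path(settings_path).expanduser()

    # Start from the existing settings when the file exists and is non-empty,
    # so unrelated keys are preserved.
    settings = {}
    if settings_path.exists():
        text = settings_path.read_text()
        if text.strip():
            settings = json.loads(text)

    # Create the "env" object if it is missing, then set the key.
    settings.setdefault("env", {})[name] = value

    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n")
    return settings
```

Calling set_env_key("~/.claude/settings.json", "TAVILY_API_KEY", "your-key-here") would then leave every other setting in place and only add or update the one entry. The key value shown is a placeholder.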
Use Cases
- Documentation Archiving: Download entire documentation suites for offline reference or to provide a static knowledge base for local LLM fine-tuning or RAG implementations.
- Market Analysis: Extract content from industry portals, blogs, or competitors' websites to perform automated sentiment or trend analysis.
- Knowledge Management: Aggregate scattered web resources into a unified, clean Markdown repository that your local tools can easily search or index.
- Developer Workflows: Automatically fetch API references and code samples from online documentation to speed up integration processes.
Example Prompts
- "Crawl the documentation at https://docs.openclaw.com with a max depth of 2, and save the content into the ./docs folder so I can reference it while offline."
- "Can you perform a focused crawl of https://api-guide.example.com? I specifically need to extract the section on authentication and error handling into a single report."
- "Go to https://tech-updates.com and gather all recent blog posts from the last month that mention AI agent architectures, and save them as separate markdown files."
Tips & Limitations
When using the Crawl skill, start with a lower max_depth to avoid excessive data usage and unnecessary API costs. Use select_paths and exclude_paths (which support regex) to focus the crawler on relevant documentation pages and avoid noise like footers, social links, or administrative login pages. Note that the skill relies on the Tavily API, so ensure your network permits these requests. Always be mindful of website robots.txt policies when performing deep crawls. For better results, provide clear semantic instructions to allow the agent to filter out irrelevant chunks, which is particularly effective when working with large, complex domains.
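To make the effect of select_paths and exclude_paths concrete, here is a minimal Python sketch of regex-based path filtering. It only illustrates the idea and is not how the skill or the Tavily API implements it. The function name keep_url, the example URLs, and the rule that an exclude match wins over a select match are all assumptions.

```python
import re
from urllib.parse import urlparse


def keep_url(url, select_paths=(), exclude_paths=()):
    """Decide whether a URL passes regex path filters (illustrative only).

    Assumption: a match on any exclude pattern rejects the URL, even if a
    select pattern also matches.
    """
    # Match against the URL path only; treat a missing path as the site root.
    path = urlparse(url).path or "/"

    if any(re.search(pattern, path) for pattern in exclude_paths):
        return False
    if select_paths:
        return any(re.search(pattern, path) for pattern in select_paths)
    # With no select patterns, everything not excluded is kept.
    return True


urls = [
    "https://docs.example.com/docs/intro",
    "https://docs.example.com/docs/admin/login",
    "https://docs.example.com/blog/news",
]
kept = [u for u in urls if keep_url(u, [r"^/docs/"], [r"/admin/"])]
# kept == ["https://docs.example.com/docs/intro"]
```

Testing patterns locally like this before a crawl is one way to check that a regex selects the intended pages without spending API calls.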
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-barneyjm-crawl": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: network-access, file-write, external-api
Related Skills
query
Search for places using natural language with Camino AI's location intelligence API. Returns relevant results with coordinates, distances, and metadata. Use when you need to find real-world locations like restaurants, shops, landmarks, or any point of interest.
search
Search the web using Tavily's LLM-optimized search API. Returns relevant results with content snippets, scores, and metadata. Use when you need to find web content on any topic without writing code.
journey
Plan multi-waypoint journeys with route optimization, feasibility analysis, and time budget constraints. Use when you need to plan trips with multiple stops or check if an itinerary is achievable.
travel-planner
Plan complete day trips, walking tours, and multi-stop itineraries with time budgets using Camino AI's journey planning and route optimization.
real-estate
Evaluate any address for home buyers and renters. Get nearby schools, transit, grocery stores, parks, restaurants, and walkability using Camino AI's location intelligence.