crawl-for-ai
Web scraping using a local Crawl4AI instance. Use it to fetch full page content with JavaScript rendering. Handles complex pages better than Tavily, with no external API quota.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/angusthefuzz/crawl-for-ai
What This Skill Does
The crawl-for-ai skill enables the OpenClaw AI agent to perform high-fidelity web scraping by leveraging a local Crawl4AI instance. Unlike standard search APIs that might return truncated text or miss dynamically loaded content, this skill executes JavaScript to fully render pages before extraction. It offers two distinct endpoints: a proxy mode for clean, OpenWebUI-ready markdown, and a direct mode that provides comprehensive data including HTML structure, media assets, and hyperlink references. This tool is essential for users requiring deep research, data harvesting, or the analysis of complex, client-side rendered web applications.
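As a rough illustration of the direct mode, the sketch below builds (but does not send) a request to a local instance. It assumes the direct endpoint is a `POST /crawl` route that accepts a JSON body with a `urls` list, and that a key, if one is required, travels as a Bearer token. Neither detail is confirmed by this page, and the helper name `build_crawl_request` is hypothetical, so check both against your own instance before relying on them.

```python
import json
import urllib.request


def build_crawl_request(base_url, target_url, api_key=None):
    """Build a POST request for one page without sending it.

    Assumption: the direct endpoint lives at `<base_url>/crawl` and
    accepts a JSON body of the form {"urls": [...]}.
    """
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: the key is passed as a Bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    body = json.dumps({"urls": [target_url]}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/crawl",
        data=body,
        headers=headers,
        method="POST",
    )


request = build_crawl_request("http://localhost:11235", "https://example.com/docs")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes it easy to inspect the URL, headers, and body first.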
Installation
To integrate this skill into your environment, use the OpenClaw command-line interface. Ensure you have a running instance of Crawl4AI. Execute the following command in your terminal:
clawhub install openclaw/skills/skills/angusthefuzz/crawl-for-ai
After installation, you must configure your environment variables to point the agent to your local instance. Set CRAWL4AI_URL to your service address (e.g., http://localhost:11235). If your instance is secured, include the CRAWL4AI_KEY environment variable. Ensure that your local network configuration allows the agent process to communicate with the designated ports (11234 or 11235).
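The configuration step can be sanity-checked with a short script. This is a minimal sketch, not part of the skill itself: it reads the two variables named above, falls back to the example address when `CRAWL4AI_URL` is unset (an assumption), and tests whether a TCP connection to the configured host and port succeeds. The function names `load_settings` and `is_reachable` are hypothetical.

```python
import os
import socket
from urllib.parse import urlparse


def load_settings(env=None):
    """Return (base_url, api_key) from an environment mapping."""
    env = os.environ if env is None else env
    # Assumption: fall back to the example address when the variable is unset.
    base_url = env.get("CRAWL4AI_URL", "http://localhost:11235").rstrip("/")
    # Treat an empty key the same as a missing one (unsecured instance).
    api_key = env.get("CRAWL4AI_KEY") or None
    return base_url, api_key


def is_reachable(service_url, timeout=2.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(service_url)
    host = parsed.hostname or "localhost"
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    base_url, api_key = load_settings()
    print("Crawl4AI URL:", base_url)
    print("API key set:", api_key is not None)
    print("Reachable:", is_reachable(base_url))
```

A successful connection only shows that something is listening on that port; it does not confirm that the listener is Crawl4AI or that the key is valid.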
Use Cases
This skill is best suited for scenarios where accuracy and depth are paramount. Use it for:
- Technical Research: Extracting documentation from modern web frameworks that rely heavily on React or Vue rendering.
- Market Analysis: Scraping dynamic product pages, tables, and media assets for data processing.
- Content Curation: Converting complex web articles into clean, readable markdown for long-term storage or analysis.
- Competitor Monitoring: Gathering comprehensive metadata from competitor sites that Tavily or traditional scrapers might fail to process due to bot protections or JS-dependency.
Example Prompts
- "Use crawl-for-ai to scrape the full content of this documentation page at https://example.com/docs and summarize the primary installation steps in markdown format."
- "Go to the product catalog at https://shop.example.com/items, use the crawl-for-ai tool, and extract all item names and prices into a table for me."
- "Fetch the full page data for https://tech-blog.example.com/deep-dive using crawl-for-ai, including all internal links found on the page."
Tips & Limitations
Because this tool runs locally, it is bound by your machine's hardware capabilities rather than external API quotas. However, intensive crawling can impact local CPU and memory usage. If a page is particularly massive, opt for the proxy endpoint to minimize data bloat. Be aware that some sites implement advanced anti-scraping measures; ensure your local Crawl4AI instance is configured with appropriate user-agent headers if you encounter blocks. Since this involves network requests, ensure your firewall permits outgoing traffic from the local instance to the target URLs.
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-angusthefuzz-crawl-for-ai": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Flags: network-access
Related Skills
stirling-pdf
PDF manipulation via Stirling-PDF API. Merge, split, convert, OCR, compress, sign, redact, and more. Self-hosted.
pmc-harvest
Fetch articles from PubMed Central using NCBI APIs. Search journals, retrieve full text via OAI-PMH, batch harvest for RAG pipelines. No API key required.
cozi
Interact with Cozi Family Organizer (shopping lists, todo lists, item management). Unofficial API client for family organization.
mealie
Interact with Mealie recipe manager (recipes, shopping lists, meal plans). Self-hosted recipe and meal planning API client.
Tnbc Research Swarm
Skill by angusthefuzz