Official | Verified | developer tools | Safety 3/5

firecrawl

Web scraping and content extraction using the Firecrawl API. Use it when users need to crawl websites, extract structured data, convert web pages to Markdown, scrape multiple URLs, or build knowledge bases from web content. Supports single-page extraction, site-wide crawling, batch processing, and structured data extraction with CSS selectors.


Install via CLI (Recommended)

clawhub install openclaw/skills/skills/antonia-sz/web-scraper-firecrawl

What This Skill Does

The firecrawl skill is an interface for web scraping and content extraction whose output is optimized for LLM consumption. It turns raw, messy web content into clean, structured data. Using the Firecrawl API, the skill handles JavaScript rendering, site-wide navigation, and granular content extraction. Users can fetch single pages, crawl sites recursively, map URL structures, and extract structured data with CSS selectors. Output is typically clean Markdown, which suits RAG (Retrieval-Augmented Generation) pipelines, content migration, and automated research tasks. This removes two common pain points of manual scraping: dynamically loaded content and HTML cleanup.

Installation

To install the skill, run the OpenClaw installer command: clawhub install openclaw/skills/skills/antonia-sz/web-scraper-firecrawl. After installation, set the FIRECRAWL_API_KEY environment variable to a valid Firecrawl API key. Also make sure the requests library is installed in your Python environment; the skill uses it as the transport layer for API calls.
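
The setup above can be sanity-checked with a short script. This is a minimal sketch, not the skill's own code: the endpoint URL and the request body shape are assumptions based on Firecrawl's hosted v1 scrape API, and the helper names (load_api_key, build_scrape_request, scrape) are hypothetical.

```python
import os

# Assumption: Firecrawl's hosted v1 scrape endpoint; the skill may target another version.
FIRECRAWL_SCRAPE_URL = "https://api.firecrawl.dev/v1/scrape"


def load_api_key(env=os.environ):
    """Return the Firecrawl API key, or raise a clear error if it is unset."""
    key = env.get("FIRECRAWL_API_KEY", "").strip()
    if not key:
        raise RuntimeError("FIRECRAWL_API_KEY is not set")
    return key


def build_scrape_request(url, api_key):
    """Build the headers and JSON body for a single-page Markdown scrape."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    # Assumed body shape: the target URL plus the desired output formats.
    payload = {"url": url, "formats": ["markdown"]}
    return headers, payload


def scrape(url):
    """Send the request. requests is imported lazily so the helpers above work without it."""
    import requests

    headers, payload = build_scrape_request(url, load_api_key())
    response = requests.post(
        FIRECRAWL_SCRAPE_URL, headers=headers, json=payload, timeout=60
    )
    response.raise_for_status()
    return response.json()
```

Calling load_api_key() before any crawl fails fast with a readable message instead of an authentication error midway through a batch.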

Use Cases

  1. Knowledge Base Generation: Automatically convert technical documentation sites into unified Markdown files to serve as context for private AI agents.
  2. Competitive Intelligence: Batch-scrape competitor product pages and use structured extraction to pull pricing data into JSON format for comparison.
  3. Content Migration: Export entire websites for archival purposes or to migrate legacy CMS content into modern documentation systems.
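
For the knowledge base and migration use cases, the last step is usually writing each page's Markdown to disk. The sketch below assumes each crawled page is available as a dict with "url" and "markdown" keys; that shape, and the names slug_for and save_pages, are illustrative assumptions rather than the skill's documented output.

```python
import re
from pathlib import Path
from urllib.parse import urlparse


def slug_for(url):
    """Turn a URL into a filesystem-safe file stem."""
    parsed = urlparse(url)
    raw = f"{parsed.netloc}{parsed.path}".strip("/")
    slug = re.sub(r"[^A-Za-z0-9]+", "-", raw).strip("-").lower()
    return slug or "index"


def save_pages(pages, out_dir):
    """Write each page's Markdown into out_dir and return the written paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for page in pages:
        path = out / f"{slug_for(page['url'])}.md"
        # Keep the source URL in a comment so every file stays traceable.
        body = f"<!-- source: {page['url']} -->\n\n{page['markdown']}\n"
        path.write_text(body, encoding="utf-8")
        written.append(path)
    return written
```

Deriving file names from the URL keeps repeated crawls idempotent: a re-crawled page overwrites its earlier file instead of creating a duplicate.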

Example Prompts

  1. "Firecrawl this documentation site at https://docs.example.com and save all pages as markdown files in my project folder for RAG training."
  2. "Use firecrawl to map all URLs on https://blog.example.com and then extract the article titles and publish dates using CSS selectors."
  3. "Scrape these 5 URLs provided in urls.txt and give me the output in clean markdown format for my research report."
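
A prompt like the third one depends on a clean URL list. As a rough sketch, the input file could be pre-validated as follows; the one-URL-per-line format with optional # comments is an assumption about urls.txt, and the function names are hypothetical.

```python
from urllib.parse import urlparse


def parse_url_lines(lines):
    """Return unique http(s) URLs in first-seen order, skipping blanks and # comments."""
    seen = set()
    urls = []
    for line in lines:
        candidate = line.strip()
        if not candidate or candidate.startswith("#"):
            continue
        parsed = urlparse(candidate)
        # Drop anything that is not an absolute http(s) URL.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            continue
        if candidate not in seen:
            seen.add(candidate)
            urls.append(candidate)
    return urls


def load_urls(path="urls.txt"):
    """Read and clean the URL list from a text file."""
    with open(path, encoding="utf-8") as handle:
        return parse_url_lines(handle)
```

Deduplicating before a batch run avoids paying for the same page twice.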

Tips & Limitations

  • Rate Limiting: Always be mindful of the target website's robots.txt and your Firecrawl plan's rate limits. Use the --limit flag for large crawls to avoid excessive consumption.
  • Dynamic Content: For sites that rely heavily on client-side rendering (SPAs), use the --wait-for flag so that JavaScript finishes executing before extraction.
  • Data Privacy: Ensure you have authorization to crawl specific sites. The skill performs network requests and data collection; respect copyright and terms of service for the scraped content.
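
One way to act on the robots.txt advice is to filter a URL list before submitting it. The sketch below uses Python's standard urllib.robotparser and takes the robots.txt content as already-fetched lines; the allowed_urls helper and its limit parameter are hypothetical and unrelated to the skill's own --limit flag.

```python
from urllib.robotparser import RobotFileParser


def allowed_urls(urls, robots_lines, user_agent="*", limit=None):
    """Keep only URLs that robots.txt permits, optionally capped at limit entries."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    permitted = [url for url in urls if parser.can_fetch(user_agent, url)]
    return permitted if limit is None else permitted[:limit]


robots = [
    "User-agent: *",
    "Disallow: /private/",
]
candidates = [
    "https://example.com/docs",
    "https://example.com/private/report",
    "https://example.com/blog",
]
print(allowed_urls(candidates, robots, limit=2))
```

To check a live site, RobotFileParser.set_url() followed by read() fetches and parses the file directly.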

Metadata

Stars: 4473
Views: 0
Updated: 2026-05-01
Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-antonia-sz-web-scraper-firecrawl": {
      "enabled": true,
      "auto_update": true
    }
  }
}
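
If clawhub.json already contains other plugins, pasting by hand risks overwriting them. A small merge script is one alternative; this is a sketch that assumes the file is plain JSON with a top-level "plugins" object, as in the snippet above, and enable_plugin is a hypothetical helper.

```python
import json
from pathlib import Path

PLUGIN_ID = "official-antonia-sz-web-scraper-firecrawl"


def enable_plugin(config_path="clawhub.json"):
    """Add or update the plugin entry while leaving other settings untouched."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    plugins = config.setdefault("plugins", {})
    plugins[PLUGIN_ID] = {"enabled": True, "auto_update": True}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```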

Tags (AI)

#web-scraping #markdown #data-extraction #firecrawl #automation
Safety Score: 3/5

Flags: network-access, file-write, file-read, external-api