Official Verified developer tools Safety 4/5

smart-web-scraper

Extract structured data from any web page. Supports CSS selectors, auto-detection of tables and lists, JSON/CSV output formats. Use when asked to scrape a website, extract data from a page, pull product info, gather contact details, or collect listings from a URL.

Why use this skill?

Effortlessly scrape and extract structured data from any web page with the Smart Web Scraper. Supports CSS selectors, auto-detection of tables/lists, and multiple output formats (JSON, CSV).

What This Skill Does

The Smart Web Scraper extracts structured data from a web page. CSS selectors target specific elements, and HTML tables and lists are detected and parsed automatically. Output is available as JSON, CSV, plain text, or Markdown, and can be saved directly to a file. Multi-page crawling follows pagination to extract data across several pages. Typical uses include collecting product information, contact details, and listings.
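To make the table-detection idea concrete, here is a minimal sketch of how table rows can be pulled out of HTML and emitted as JSON or CSV. It is not the skill's actual implementation: it uses only Python's standard library so it runs without extra dependencies, and `TableParser`, `table_to_records`, `records_to_csv`, and the sample HTML are all hypothetical names and data chosen for illustration. Nested tables are not handled.

```python
import csv
import io
import json
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the cell text of every <table> on a page, row by row."""

    def __init__(self):
        super().__init__()
        self.tables = []   # each table is a list of rows
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def table_to_records(rows):
    """Treat the first row as headers and return one dict per data row."""
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def records_to_csv(records):
    """Serialize records to CSV text, using the first record's keys as columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical sample page standing in for a fetched pricing table.
SAMPLE_HTML = """
<table>
  <tr><th>Plan</th><th>Price</th></tr>
  <tr><td>Basic</td><td>$10</td></tr>
  <tr><td>Pro</td><td>$25</td></tr>
</table>
"""

parser = TableParser()
parser.feed(SAMPLE_HTML)
records = table_to_records(parser.tables[0])

print(json.dumps(records, indent=2))
print(records_to_csv(records))
```

Converting rows to a list of dictionaries first keeps the JSON and CSV writers independent of the parsing step, so either output format can be produced from the same records.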

Installation

To install the Smart Web Scraper, use the following command:

clawhub install openclaw/skills/skills/mariusfit/smart-web-scraper

The command downloads the skill and sets up the components it needs to run.

Use Cases

  • E-commerce Data Collection: Scrape product names, prices, descriptions, and availability from online stores.
  • Lead Generation: Extract contact information (emails, phone numbers) from business directories or company websites.
  • Market Research: Gather data from competitor websites, such as feature lists, pricing tiers, or customer reviews.
  • Content Aggregation: Collect articles, blog posts, or news listings from various sources.
  • Data Analysis: Pull structured data from reports or public datasets presented in tables on web pages.
  • Real Estate Listings: Extract property details, prices, and agent information from real estate portals.

Example Prompts

  1. "Scrape all product details from https://shop.example.com/gadgets using the CSS selector .product-item and save the output as JSON to gadgets.json."
  2. "Extract all the pricing tables from https://example.com/services and format the output as CSV."
  3. "Crawl the news website starting from https://news.example.com/page/1 for the next 10 pages, extracting article titles using the selector h2.article-title, and output as JSON."
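The third prompt combines pagination with a selector. The sketch below shows roughly what such a request involves, under several assumptions: "the next 10 pages" is read as a run of consecutive page numbers starting from the given URL, the selector support is limited to the simple `tag.class` form, and the network call is replaced by a hypothetical `fake_fetch` so the example runs offline. `TagClassExtractor`, `page_urls`, and `crawl_titles` are illustrative names, not part of the skill.

```python
import json
import re
from html.parser import HTMLParser


class TagClassExtractor(HTMLParser):
    """Collect text inside elements matching a simple `tag.class` selector."""

    def __init__(self, selector):
        super().__init__()
        self.tag, _, self.css_class = selector.partition(".")
        self.matches = []
        self._buffer = None  # text fragments of the element being captured

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.css_class in classes:
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self.tag and self._buffer is not None:
            self.matches.append("".join(self._buffer).strip())
            self._buffer = None


def page_urls(start_url, count):
    """Yield `count` URLs by incrementing the trailing page number."""
    match = re.search(r"(\d+)/?$", start_url)
    if match is None:
        raise ValueError("start_url must end in a page number")
    first = int(match.group(1))
    prefix = start_url[:match.start(1)]
    for number in range(first, first + count):
        yield f"{prefix}{number}"


def crawl_titles(start_url, pages, selector, fetch):
    """Fetch each page with `fetch(url) -> str` and gather matching text."""
    titles = []
    for url in page_urls(start_url, pages):
        extractor = TagClassExtractor(selector)
        extractor.feed(fetch(url))
        titles.extend(extractor.matches)
    return titles


def fake_fetch(url):
    """Stand-in for a real HTTP request so the sketch runs offline."""
    page = url.rsplit("/", 1)[-1]
    return f'<h2 class="article-title"><a href="#">Story {page}</a></h2><h2>Ad</h2>'


titles = crawl_titles("https://news.example.com/page/1", 3, "h2.article-title", fake_fetch)
print(json.dumps(titles))
```

Passing the fetch function in as a parameter keeps the crawl logic testable without network access; a real run would supply a function that performs the HTTP request.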

Tips & Limitations

  • Specificity is Key: When using CSS selectors, be as specific as possible to ensure you extract only the desired data and avoid noise.
  • Check Website Structure: Web page structures can change. If the scraper stops working, the website's HTML might have been updated, requiring a review of your selectors.
  • Respect robots.txt: The tool does not enforce robots.txt, so check the site's robots.txt file and terms of service yourself to avoid overloading servers or violating usage policies.
  • Dynamic Content: This scraper primarily works with server-rendered HTML. Content loaded dynamically via JavaScript after the initial page load might not be fully captured without additional tools or configurations.
  • Rate Limiting: Be cautious when scraping large amounts of data or crawling many pages from a single website. Implement delays or use the tool responsibly to avoid being blocked.
  • Error Handling: Network issues or unexpected HTML structures can lead to errors. Consider adding error handling in your workflow if using this skill in an automated pipeline.
  • Dependencies: The skill needs the beautifulsoup4 and lxml libraries. The quick start guide supplies them with uv run --with beautifulsoup4 --with lxml.
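Since the tips on robots.txt, rate limiting, and error handling are left to the user, the following is one possible way to wrap them around a fetch loop in an automated pipeline. It is a sketch under stated assumptions rather than anything the skill provides: `USER_AGENT`, `build_robots`, `http_get`, and `polite_fetch` are hypothetical names, the delay and retry counts are arbitrary, and the demo uses a fake fetch function and a no-op sleep so it runs offline. The robots.txt check relies on the standard library's `urllib.robotparser`.

```python
import time
import urllib.error
import urllib.request
from urllib.robotparser import RobotFileParser

USER_AGENT = "example-scraper"  # hypothetical identifier for this sketch


def build_robots(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def http_get(url, timeout=10):
    """Plain HTTP GET; used only when no other fetch function is supplied."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def polite_fetch(urls, robots, fetch=http_get, delay=1.0, retries=3, sleep=time.sleep):
    """Fetch allowed URLs one at a time, pausing between requests and retrying failures."""
    results = {}
    for url in urls:
        if not robots.can_fetch(USER_AGENT, url):
            results[url] = None  # disallowed by robots.txt, so skipped
            continue
        for attempt in range(retries):
            try:
                results[url] = fetch(url)
                break
            except (urllib.error.URLError, TimeoutError):
                if attempt == retries - 1:
                    results[url] = None  # gave up after the last retry
                else:
                    sleep(delay * 2 ** attempt)  # exponential backoff
        sleep(delay)  # fixed pause between successive URLs
    return results


# Offline demo: the first call fails, the retry succeeds.
robots = build_robots("User-agent: *\nDisallow: /private/")
calls = []


def flaky_fetch(url):
    calls.append(url)
    if len(calls) == 1:
        raise urllib.error.URLError("temporary failure")
    return "<html>ok</html>"


pages = polite_fetch(
    ["https://shop.example.com/gadgets", "https://shop.example.com/private/admin"],
    robots,
    fetch=flaky_fetch,
    sleep=lambda seconds: None,
)
print(pages)
```

Failed or disallowed URLs are recorded as `None` instead of raising, so a long crawl can finish and the caller can decide afterwards which pages to revisit.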

Metadata

Author: @mariusfit
Stars: 1401
Views: 13
Updated: 2026-02-24

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-mariusfit-smart-web-scraper": {
      "enabled": true,
      "auto_update": true
    }
  }
}
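If clawhub.json already lists other plugins, pasting the snippet over the file would drop them. A small merge step avoids that; the sketch below assumes the file is ordinary JSON with a top-level "plugins" object as shown above. `enable_plugin` and the `existing` configuration are hypothetical and used only for illustration.

```python
import json


def enable_plugin(config, plugin_id, auto_update=True):
    """Return a copy of `config` with `plugin_id` enabled, keeping other plugins."""
    plugins = dict(config.get("plugins", {}))
    plugins[plugin_id] = {"enabled": True, "auto_update": auto_update}
    return {**config, "plugins": plugins}


# Hypothetical prior configuration with one unrelated plugin.
existing = {"plugins": {"some-other-plugin": {"enabled": False}}}
updated = enable_plugin(existing, "official-mariusfit-smart-web-scraper")
print(json.dumps(updated, indent=2))
```

The function returns a new dictionary and leaves the input untouched, so the original configuration can be kept as a backup before the result is written to disk.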

Tags (AI)

#web scraping #data extraction #automation #seo
Safety Score: 4/5

Flags: network-access, file-write, data-collection