Official | Verified | Developer Tools | Safety 4/5

Scrape

Legal web scraping with robots.txt compliance, rate limiting, and GDPR/CCPA-aware data handling.

Why use this skill?

Safely extract web data with the OpenClaw Scrape skill. It features built-in robots.txt compliance, rate limiting, and PII stripping to help keep your scraping legal.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/ivangdavila/scrape

What This Skill Does

The Scrape skill is a web data extraction agent for the OpenClaw ecosystem, built around legal and ethical constraints. It discovers a site's published policies automatically and acts as a protective layer: it enforces robots.txt rules, rate-limits requests, and helps keep data collection within regulations such as GDPR and CCPA. By prioritizing public data over protected resources, it helps developers and analysts build datasets while avoiding the legal pitfalls associated with aggressive scraping.
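
The robots.txt and rate-limiting behavior described above can be illustrated with a minimal Python sketch. This is not the skill's actual implementation, which this page does not show. It uses the standard-library urllib.robotparser module; the robots.txt body, the user-agent string, the MinIntervalLimiter class, and the may_fetch helper are all hypothetical names chosen for illustration.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body; a real run would download it from the target site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

USER_AGENT = "example-bot"  # hypothetical user-agent string


def load_policy(robots_text):
    """Parse robots.txt text into a RobotFileParser without touching the network."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


class MinIntervalLimiter:
    """Enforce a minimum delay between consecutive requests (hypothetical helper)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Sleep until min_interval has passed since the last call; return seconds slept."""
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last = self._clock()
        return delay


policy = load_policy(ROBOTS_TXT)
# Use the site's Crawl-delay when it declares one; otherwise assume 1 second.
limiter = MinIntervalLimiter(policy.crawl_delay(USER_AGENT) or 1.0)


def may_fetch(url):
    """Return True if robots.txt permits USER_AGENT to request this URL."""
    return policy.can_fetch(USER_AGENT, url)
```

In this sketch a caller would check may_fetch(url) first and call limiter.wait() before each permitted request. The clock and sleep parameters are injectable so the throttle can be exercised without real delays.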

Installation

To integrate the Scrape skill into your OpenClaw environment, run the following command in your terminal:

clawhub install openclaw/skills/skills/ivangdavila/scrape

Use Cases

Market Research: Extracting public-facing product pricing or listing data from e-commerce sites to perform competitive analysis.
Content Aggregation: Summarizing public news articles or blog posts while maintaining clear audit trails of source origins.
Lead Qualification: Harvesting public corporate directory information to populate CRM systems, provided strict PII filtering is applied.
Academic Research: Gathering public datasets from non-authenticated domains for statistical analysis and training models.

Example Prompts

"Scrape the public pricing table from example-store.com/products and save the data to a JSON format, ensuring you adhere to their robots.txt file first."
"Research the latest announcements on tech-blog.org. Use the Scrape skill to pull the headlines, but ensure you strip any author email addresses to remain GDPR compliant."
"Check if there is public contact information available for the company at industry-news.net. Please respect their rate limits and provide a log of the headers used for the request."

Tips & Limitations

The Scrape skill works best when you give it clear instructions about the scope of the target site. Always prioritize site-provided APIs: if a site offers an official API, use it instead of scraping. The skill does not grant permission to bypass login walls; any attempt to access authenticated data is strictly prohibited and likely violates the site's terms of service. Monitor your logs to verify that the PII-stripping features are working as expected, and treat the tool as an assistant that requires your final approval before it executes high-volume data operations.
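
To make the advice about verifying PII stripping in logs concrete, here is a small Python sketch of one way to redact email addresses and log how many were removed. It is an illustration only: the skill's real PII handling is not documented on this page, the regular expression covers common email shapes rather than every valid address, and the logger name and redact_emails function are hypothetical.

```python
import logging
import re

logger = logging.getLogger("scrape.pii")  # hypothetical logger name

# Simplified pattern for common email addresses; not a full RFC 5322 matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text, placeholder="[email removed]"):
    """Replace email addresses with a placeholder.

    Returns a (redacted_text, count) tuple so callers can audit the count.
    """
    redacted, count = EMAIL_RE.subn(placeholder, text)
    # Log only the count, never the removed addresses themselves.
    logger.info("redacted %d email address(es)", count)
    return redacted, count
```

Returning the count alongside the text lets a reviewer compare log entries against expectations, for example noticing when a page known to list author emails reports zero redactions.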

Metadata

Author: @ivangdavila

Stars: 2102

Updated: 2026-03-06

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-ivangdavila-scrape": {
      "enabled": true,
      "auto_update": true
    }
  }
}
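
As a quick sanity check after editing the file, the following Python sketch parses a configuration like the one above and reports whether the plugin entry is enabled. It assumes only the keys shown in the snippet; the rest of the clawhub.json schema is not documented here, and is_plugin_enabled is a hypothetical helper, not part of any clawhub tooling.

```python
import json

# Same structure as the snippet above, embedded so the example is self-contained.
CONFIG_TEXT = """
{
  "plugins": {
    "official-ivangdavila-scrape": {
      "enabled": true,
      "auto_update": true
    }
  }
}
"""


def is_plugin_enabled(config_text, plugin_id):
    """Return True only if plugin_id exists under "plugins" with "enabled": true."""
    config = json.loads(config_text)
    plugin = config.get("plugins", {}).get(plugin_id, {})
    return plugin.get("enabled", False) is True
```

In practice the text would be read from clawhub.json on disk; json.loads also raises an error on malformed JSON, which catches stray commas introduced while pasting.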

Tags (AI)

#scraping #compliance #automation #data-extraction

Safety Score: 4/5

Flags: network-access, data-collection
