Official Verified developer tools Safety 4/5

Skrape

Ethical web data extraction with robots exclusion protocol adherence, throttled scraping requests, and privacy-compliant handling ("Scrape responsibly!").

What This Skill Does

Skrape is a web data extraction agent for OpenClaw built around ethical scraping practices. It balances information retrieval against strict adherence to the robots exclusion protocol (robots.txt), site Terms of Service, and evolving legal precedents such as hiQ v. LinkedIn and Van Buren v. United States. Skrape acts as a responsible intermediary: it throttles requests with a mandatory 2-3 second delay and identifies itself with a proper User-Agent header. It prioritizes official APIs where they exist, so developers can avoid invasive scraping when an official data channel is available.
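The robots.txt check described above can be illustrated with Python's standard library. This is a minimal sketch, not Skrape's actual implementation: the `is_allowed` helper, the `USER_AGENT` string, and the sample rules are all hypothetical, and the robots.txt text is passed in directly so the example runs without network access.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical identifier; Skrape's real User-Agent string is not documented here.
USER_AGENT = "ExampleResearchBot/1.0 (+https://example.com/bot-info)"


def is_allowed(robots_txt: str, url: str, user_agent: str = USER_AGENT) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no HTTP request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Illustrative rules only: everything under /private/ is off limits.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(SAMPLE_ROBOTS, "https://example.com/products/"))
print(is_allowed(SAMPLE_ROBOTS, "https://example.com/private/data"))
```

A real agent would fetch `/robots.txt` from the target host first and send the same User-Agent string on every subsequent request, so that site operators can identify the traffic.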

Installation

To integrate Skrape into your OpenClaw environment, execute the following command in your terminal:

clawhub install openclaw/skills/skills/10oss/skrape

Ensure your local environment allows for outbound network requests, as the agent requires external connectivity to perform verification checks and data retrieval.

Use Cases

Skrape is best suited for:

  • Market Research: Gathering public-facing product pricing or listing information to identify trends.
  • Competitive Analysis: Auditing public data sets to inform business strategy without violating copyright.
  • Content Aggregation: Creating curated lists of public news or factual data while providing proper source attribution.
  • Legal Compliance Auditing: Automated checking of robots.txt and Terms of Service files across large domain sets.

Example Prompts

  1. "Skrape, check the robots.txt for example-ecommerce.com and if allowed, extract the current list of product names and prices for the electronics category."
  2. "Please research if there is a public API available for status.github.com. If not, safely scrape the latest service status updates, ensuring you include proper attribution."
  3. "Conduct a data discovery scan for public factual information regarding standard industry pricing for cloud storage, respecting all site access boundaries and throttling requests accordingly."

Tips & Limitations

To maintain high safety standards, Skrape enforces a mandatory 2-3 second delay between requests. Avoid requesting private, authenticated, or PII-heavy pages: the skill is configured to issue a warning or halt when it detects a restricted zone. Always prefer official APIs; Skrape is not intended to bypass authentication walls or circumvent technical access controls. Remember that "publicly accessible" does not always mean "legally reusable," especially for creative designs and proprietary compilations. Skrape encourages prompt deletion of unnecessary PII, but users must implement their own data retention policies to remain GDPR and CCPA compliant.
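The 2-3 second delay between requests could be enforced with a small throttle object. The sketch below is an assumption about how such a delay might be implemented, not Skrape's own code: the `Throttle` class and its parameters are hypothetical, and the clock and sleep functions are injectable so the behavior can be checked without actually waiting.

```python
import random
import time
from typing import Callable, Optional


class Throttle:
    """Enforce a randomized gap (default 2-3 seconds) between consecutive requests."""

    def __init__(
        self,
        min_delay: float = 2.0,
        max_delay: float = 3.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_delay = min_delay
        self.max_delay = max_delay
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous request, if any

    def wait(self) -> float:
        """Block until the required gap has passed; return the seconds slept."""
        slept = 0.0
        if self._last is not None:
            target = random.uniform(self.min_delay, self.max_delay)
            elapsed = self._clock() - self._last
            slept = max(0.0, target - elapsed)
            if slept > 0:
                self._sleep(slept)
        self._last = self._clock()
        return slept
```

Calling `throttle.wait()` immediately before each request keeps the gap at 2 seconds or more; the first call returns at once because there is no earlier request to space out from.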

Metadata

Author: @10oss
Stars: 4473
Views: 0
Updated: 2026-05-01

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-10oss-skrape": {
      "enabled": true,
      "auto_update": true
    }
  }
}
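If clawhub.json already holds other settings, the entry can be merged in rather than pasted by hand. The following is a hedged sketch that assumes clawhub.json is plain JSON with a top-level "plugins" object, as the snippet above suggests; the `add_plugin` helper is hypothetical and is not part of any ClawHub tooling.

```python
import json

# Plugin entry copied from the configuration snippet above.
SKRAPE_ENTRY = {"official-10oss-skrape": {"enabled": True, "auto_update": True}}


def add_plugin(config_text: str) -> str:
    """Return config_text with the Skrape entry merged into its "plugins" object."""
    # Treat an empty file as an empty configuration.
    config = json.loads(config_text) if config_text.strip() else {}
    # Create "plugins" if missing, and leave any other plugin entries untouched.
    config.setdefault("plugins", {}).update(SKRAPE_ENTRY)
    return json.dumps(config, indent=2)


existing = '{"plugins": {"some-other-plugin": {"enabled": false}}}'
print(add_plugin(existing))
```

The function works on text rather than a file path, so reading the existing file and writing the result back are left to the caller.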

Tags (AI)

#scraping #automation #compliance #data #web-agent
Safety Score: 4/5

Flags: network-access, data-collection