Official Verified data analysis Safety 4/5

scraper

Structured extraction and cleanup for public, user-authorized web pages. Use when the user wants to collect, clean, summarize, or transform content from accessible pages into reusable text or data. Do not use to bypass logins, paywalls, captchas, robots restrictions, or access controls. Local-only output.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/agistack/scraper

What This Skill Does

The Scraper skill is a local-first utility that turns unstructured web content into clean, readable data. Using standard Python libraries, it fetches public webpage content, strips irrelevant HTML tags and boilerplate code, and saves the result to a local workspace directory. It sits between raw web pages and your agent's analytical capabilities, so the text the agent processes is compact (fewer tokens) and easy to read. It is designed to respect robots.txt rules and existing access controls.
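
The fetch, strip, and save steps described above can be sketched with the Python standard library alone. This is a minimal illustration, not the skill's actual implementation: the names `html_to_text`, `fetch_html`, and `save_text`, the tag lists, and the workspace layout are all assumptions made for the example.

```python
from html.parser import HTMLParser
from pathlib import Path
import urllib.request

# Assumed for this sketch: contents of these tags count as boilerplate.
SKIP_TAGS = {"script", "style", "noscript", "nav", "footer"}
# Assumed for this sketch: these tags start a new line of text.
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}


class TextExtractor(HTMLParser):
    """Collects visible text and ignores anything nested inside SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.chunks.append("\n")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.chunks.append(data)


def html_to_text(html: str) -> str:
    """Strip tags and boilerplate, returning one cleaned line per text block."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.chunks).splitlines())
    return "\n".join(line for line in lines if line)


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Fetch a public page; raises on HTTP errors instead of working around them."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def save_text(text: str, workspace: str, name: str) -> Path:
    """Write cleaned text to <workspace>/<name>.txt (hypothetical layout)."""
    folder = Path(workspace)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{name}.txt"
    path.write_text(text, encoding="utf-8")
    return path
```

A call such as `save_text(html_to_text(fetch_html(url)), "workspace", "article")` would chain the three steps. Keeping the cleaner separate from the fetcher means the stripping logic can be checked offline against saved HTML.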

Installation

To integrate this skill into your environment, use the OpenClaw CLI tool. Run the following command in your terminal:

clawhub install openclaw/skills/skills/agistack/scraper

Ensure that your system has Python 3 installed and accessible via the python3 command, as the skill relies on local script execution to handle data retrieval and parsing.

Use Cases

  • Knowledge Management: Convert long-form articles or blog posts into concise summaries stored in your local library.
  • Market Research: Extract technical specifications from product pages to populate local data files for comparison.
  • Content Curation: Aggregate text from multiple sources into a unified local workspace for offline analysis.
  • Technical Documentation: Fetch open-source library documentation to help the agent provide more context-aware programming assistance.

Example Prompts

  1. "Scrape the content from https://example.com/tech-article and save it as a text file for me to review later."
  2. "Could you fetch the documentation from https://openclaw.org/docs and extract the main usage instructions?"
  3. "List my previous scraping jobs and tell me which one contains the data from the recent blog update."

Tips & Limitations

  • Public Access Only: The skill will fail on pages requiring logins or premium subscriptions. Do not attempt to bypass paywalls.
  • Rate Limiting: Be mindful of your request frequency. Excessive scraping may trigger rate limits from host servers.
  • Structure: The cleaner works best on standard, static HTML documents. JavaScript-heavy sites may return incomplete content, because the skill relies on basic Python fetching rather than rendering the page in a browser.
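
The access and rate-limiting tips above can be illustrated with a short sketch that checks a URL against robots.txt rules and spaces out requests. It uses only `urllib.robotparser` and `time` from the standard library. The `is_allowed` helper, the `Throttle` class, and the two-second interval are hypothetical names and values chosen for the example, not part of the skill.

```python
import time
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class Throttle:
    """Enforces a minimum interval between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock   # injectable so the logic can be tested without waiting
        self.sleep = sleep
        self.last = None     # time of the previous request, if any

    def wait(self) -> float:
        """Sleep just long enough to honor min_interval; return the delay used."""
        now = self.clock()
        delay = 0.0
        if self.last is not None:
            delay = max(0.0, self.min_interval - (now - self.last))
        if delay > 0:
            self.sleep(delay)
        self.last = now + delay
        return delay


# Example rules, written inline for illustration only.
RULES = "User-agent: *\nDisallow: /private/\n"

if __name__ == "__main__":
    throttle = Throttle(min_interval=2.0)  # assumed interval, tune per site
    for page in ("https://example.com/docs", "https://example.com/private/x"):
        if is_allowed(RULES, page):
            throttle.wait()
            print("would fetch", page)
        else:
            print("skipping (disallowed)", page)
```

Checking permission before throttling avoids spending a delay on a URL that will be skipped anyway.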

Metadata

Author: @agistack
Stars: 3809
Views: 0
Updated: 2026-04-05

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-agistack-scraper": {
      "enabled": true,
      "auto_update": true
    }
  }
}
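
If clawhub.json already lists other plugins, pasting the snippet over the file would drop them. A small merge script is one way around that. The sketch below assumes clawhub.json is plain JSON with a top-level "plugins" object, as the snippet suggests; the file's location and the helper names `merge_plugin` and `enable_plugin` are assumptions for illustration.

```python
import json
from pathlib import Path


def merge_plugin(config: dict, plugin_id: str, settings: dict) -> dict:
    """Return a copy of config with plugin_id updated and other entries kept."""
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))
    # New settings override existing keys for this plugin only.
    plugins[plugin_id] = {**plugins.get(plugin_id, {}), **settings}
    merged["plugins"] = plugins
    return merged


def enable_plugin(path: str, plugin_id: str, settings: dict) -> dict:
    """Read the config file (if present), merge the plugin entry, write it back."""
    config_path = Path(path)
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    updated = merge_plugin(config, plugin_id, settings)
    config_path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")
    return updated
```

For the entry shown above, the call would be `enable_plugin("clawhub.json", "official-agistack-scraper", {"enabled": True, "auto_update": True})`, with the path adjusted to wherever your configuration file lives.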

Tags (AI)

#web-scraping #data-extraction #content-cleanup #offline-data
Safety Score: 4/5

Flags: network-access, file-write, file-read, code-execution
