web-scraper
Intelligent web scraper that fetches any URL and returns clean Markdown content. Triggers on requests such as "帮我抓取网页" (help me scrape a web page), "获取这个网页内容" (get this page's content), "fetch this URL", "scrape this page", "读取网页" (read a web page), "get web content", "爬取" (crawl), "抓取" (scrape), or when a user provides a URL whose content they want to read or extract.
Why use this skill?
Intelligent web scraper for OpenClaw that fetches any URL and converts it into clean, readable Markdown content with integrated per-fetch billing.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/codehourra/web-scraper-pro
What This Skill Does
Web Scraper Pro is a tool for the OpenClaw AI ecosystem that turns a static URL into data an LLM can act on. It fetches the raw HTML of the target page and runs it through a multi-layer pipeline that extracts the meaningful content and converts it into clean Markdown. By stripping navigation bars, advertisements, and script-heavy overhead, the skill hands the model only the text, headings, and structural elements needed for analysis.
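The extraction step described above can be illustrated with a minimal sketch. This is not the skill's actual pipeline, which is not documented here; it is a single-pass approximation using only Python's standard `html.parser`. The names `MarkdownExtractor`, `html_to_markdown`, and the `SKIP_TAGS` set are assumptions made for illustration, and fetching the page is left out.

```python
from html.parser import HTMLParser

# Assumption: elements treated as noise and dropped with all their contents.
SKIP_TAGS = {"script", "style", "nav", "footer", "aside"}
HEADINGS = {"h1": "#", "h2": "##", "h3": "###"}
BLOCK_TAGS = set(HEADINGS) | {"p", "li"}


class MarkdownExtractor(HTMLParser):
    """Collects headings, paragraphs, and list items as Markdown blocks."""

    def __init__(self):
        super().__init__()
        self.lines = []       # finished Markdown blocks
        self.buffer = []      # text fragments of the block currently open
        self.current = None   # tag name of the open block, if any
        self.skip_depth = 0   # greater than 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in BLOCK_TAGS:
            self.current, self.buffer = tag, []

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0 and tag == self.current:
            # Collapse runs of whitespace inside the block.
            text = " ".join("".join(self.buffer).split())
            if text:
                prefix = HEADINGS.get(tag, "-" if tag == "li" else "")
                self.lines.append(f"{prefix} {text}".strip())
            self.current = None

    def handle_data(self, data):
        if self.skip_depth == 0 and self.current is not None:
            self.buffer.append(data)


def html_to_markdown(html: str) -> str:
    """Convert an HTML string to a simplified Markdown string."""
    parser = MarkdownExtractor()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.lines)


sample = (
    "<html><body><nav><p>Home</p></nav><h1>Title</h1>"
    "<script>var x = 1;</script><p>Hello <b>world</b>.</p>"
    "<ul><li>One</li></ul></body></html>"
)
print(html_to_markdown(sample))
```

Inline markup such as bold text and links is flattened to plain text in this sketch; a fuller converter would map those to Markdown as well.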
This skill is built for developers and power users who need reliable data extraction. Integration with SkillPay is mandatory: each successful fetch is charged exactly 0.001 USDT. The charge funds the infrastructure that keeps the skill running, and an automated checkout flow starts if a user's balance runs low.
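A rough sketch of the charge-on-success behavior might look like the following. It does not call the SkillPay API, whose interface is not described here. The names `billed_fetch` and `FetchResult`, and the `fetch` and `payment_url` parameters, are hypothetical stand-ins, and the balance is passed in as a plain value.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable, Optional

PRICE_PER_FETCH = Decimal("0.001")  # USDT, the per-fetch price stated above


@dataclass
class FetchResult:
    ok: bool
    content: Optional[str] = None
    payment_url: Optional[str] = None
    charged: Decimal = Decimal("0")


def billed_fetch(
    url: str,
    balance: Decimal,
    fetch: Callable[[str], Optional[str]],
    payment_url: str,
) -> FetchResult:
    """Refuse when the balance cannot cover one fetch; charge only on success."""
    if balance < PRICE_PER_FETCH:
        # Not enough credit: return a top-up link instead of fetching.
        return FetchResult(ok=False, payment_url=payment_url)
    content = fetch(url)
    if content is None:
        # Failed fetch: nothing is charged in this sketch.
        return FetchResult(ok=False)
    return FetchResult(ok=True, content=content, charged=PRICE_PER_FETCH)
```

Using `Decimal` rather than `float` keeps small amounts like 0.001 exact when they are summed over many fetches.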
Installation
To install this skill, run the clawhub command in your terminal: clawhub install openclaw/skills/skills/codehourra/web-scraper-pro. Make sure your environment variables, specifically SKILLPAY_USER_ID, are set so that the billing logic can authenticate your requests against the SkillPay API. If they are not, the skill's requests will fail.
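A small pre-flight check can surface a missing variable before any request is made. The sketch below only verifies that SKILLPAY_USER_ID is present and non-empty. The helper name `require_skillpay_user_id` is hypothetical, and the check says nothing about whether the ID is valid on the SkillPay side.

```python
import os
from typing import Mapping, Optional


def require_skillpay_user_id(env: Optional[Mapping[str, str]] = None) -> str:
    """Return SKILLPAY_USER_ID, or raise if it is missing or blank."""
    env = os.environ if env is None else env
    user_id = env.get("SKILLPAY_USER_ID", "").strip()
    if not user_id:
        raise RuntimeError(
            "SKILLPAY_USER_ID is not set; export it before running the skill."
        )
    return user_id
```

Passing a mapping instead of relying on `os.environ` makes the check easy to exercise in tests.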
Use Cases
- Research & Summarization: Quickly grab content from long-form articles or technical blog posts to summarize key takeaways.
- Data Aggregation: Collect information from multiple industry portals for market analysis without manual copying.
- Content Migration: Transform existing web content into structured Markdown for documentation or knowledge base updates.
- Trend Tracking: Regularly monitor specific news pages for updates or changes in institutional policies.
Example Prompts
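- "帮我抓取网页 https://example-blog.com/tech-trend 并总结核心观点。" (Help me scrape the web page https://example-blog.com/tech-trend and summarize the key points.)
- "scrape this page: https://documentation.service.com/api-guide and convert the content into a summary for my project."
- "读取网页 https://news-outlet.org/latest-report 并提取其中的主要统计数据。" (Read the web page https://news-outlet.org/latest-report and extract its main statistics.)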
Tips & Limitations
- Billing First: Always ensure your balance is topped up. The skill will trigger an automated check; if your balance is zero, it will refuse the request and provide a payment link.
- Rate Limiting: Respect the terms of service of the target websites. Excessive scraping from a single domain may result in IP-level blocks.
- Dynamic Content: Although the scraper uses a multi-layer fallback, some heavily obfuscated single-page applications (SPAs) may not extract cleanly. For the best results, target stable, content-rich pages.
- Cost Efficiency: Each call costs 0.001 USDT, so plan bulk operations deliberately to avoid draining your credits unintentionally.
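The cost and rate-limiting tips above can be combined into a quick planning step before a bulk run. The sketch below removes duplicate URLs so the same page is not paid for twice, estimates the total at 0.001 USDT per fetch, and counts requests per domain so heavy concentration on one site is visible up front. The function name `plan_batch` and the `budget` parameter are assumptions for illustration, not part of the skill.

```python
from collections import Counter
from decimal import Decimal
from urllib.parse import urlsplit

PRICE_PER_FETCH = Decimal("0.001")  # USDT per fetch, as stated above


def plan_batch(urls, budget: Decimal) -> dict:
    """Deduplicate URLs, estimate total cost, and count requests per domain."""
    unique = list(dict.fromkeys(urls))  # drops repeats, keeps first-seen order
    cost = PRICE_PER_FETCH * len(unique)
    per_domain = Counter(urlsplit(u).netloc for u in unique)
    return {
        "urls": unique,
        "cost": cost,
        "within_budget": cost <= budget,
        "per_domain": dict(per_domain),
    }


plan = plan_batch(
    [
        "https://a.example/1",
        "https://a.example/2",
        "https://b.example/x",
        "https://a.example/1",  # duplicate, fetched (and billed) only once
    ],
    budget=Decimal("0.002"),
)
print(plan["cost"], plan["within_budget"], plan["per_domain"])
```

The estimate assumes every fetch succeeds, so it is an upper bound on what a run would be charged.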
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-codehourra-web-scraper-pro": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags(AI)
Flags: network-access, external-api