scrapling
Advanced web scraping with Scrapling — MCP-native guidance for extraction, crawling, and anti-bot handling. Use via mcporter (MCP) for execution; this skill provides strategy, recipes, and best practices.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/cccccqqqqq/scrapling-yooOr
Scrapling Web Scraping — MCP-Native Guidance
Guidance Layer + MCP Integration
Use this skill for strategy and patterns. For execution, call Scrapling's MCP server via mcporter.
Quick Start (MCP)
1. Install Scrapling with MCP support
pip install "scrapling[mcp]"
# Or for full features (quotes keep shells such as zsh from expanding the brackets):
pip install "scrapling[mcp,playwright]"
python -m playwright install chromium
2. Add to OpenClaw MCP config
{
  "mcpServers": {
    "scrapling": {
      "command": "python",
      "args": ["-m", "scrapling.mcp"]
    }
  }
}
3. Call via mcporter
mcporter call scrapling fetch_page --url "https://example.com"
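When the call needs to come from a script rather than a terminal, the command line can be assembled programmatically. The following is a minimal sketch, assuming `mcporter` is on `PATH` and accepts `--key value` flags in the same shape as the command above. `build_mcporter_cmd` and `call_tool` are hypothetical helper names, not part of Scrapling or mcporter.

```python
import subprocess


def build_mcporter_cmd(server, tool, **options):
    """Build the argv list for `mcporter call <server> <tool> --key value ...`."""
    cmd = ["mcporter", "call", server, tool]
    for key, value in options.items():
        # Python keyword names use underscores; the CLI flags use hyphens.
        flag = "--" + key.replace("_", "-")
        # Booleans are rendered as lowercase true/false (assumed CLI convention).
        if isinstance(value, bool):
            value = "true" if value else "false"
        cmd += [flag, str(value)]
    return cmd


def call_tool(server, tool, **options):
    """Run the command and return its stdout (requires mcporter to be installed)."""
    result = subprocess.run(
        build_mcporter_cmd(server, tool, **options),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, `call_tool("scrapling", "fetch_page", url="https://example.com")` would run the same command as step 3. Passing an argv list instead of a shell string avoids quoting problems with URLs and selectors.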
Execution vs Guidance
| Task | Tool | Example |
|---|---|---|
| Fetch a page | mcporter | mcporter call scrapling fetch_page --url URL |
| Extract with CSS | mcporter | mcporter call scrapling css_select --selector ".title::text" |
| Which fetcher to use? | This skill | See "Fetcher Selection Guide" below |
| Anti-bot strategy? | This skill | See "Anti-Bot Escalation Ladder" |
| Complex crawl patterns? | This skill | See "Spider Recipes" |
Fetcher Selection Guide
┌─────────────────┐     ┌──────────────────┐     ┌──────────────────┐
│ Fetcher         │────▶│ DynamicFetcher   │────▶│ StealthyFetcher  │
│ (HTTP)          │     │ (Browser/JS)     │     │ (Anti-bot)       │
└─────────────────┘     └──────────────────┘     └──────────────────┘
  Fastest                 JS-rendered              Cloudflare,
  Static pages            SPAs, React/Vue          Turnstile, etc.
Decision Tree
- Static HTML? → Fetcher (10-100x faster)
- Need JS execution? → DynamicFetcher
- Getting blocked? → StealthyFetcher
- Complex session? → Use Session variants
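The decision tree can be written down as a small function. This is only a sketch of the selection logic: `choose_fetcher` is a hypothetical helper, and the `DynamicSession` and `StealthySession` names are assumed by analogy with `FetcherSession`, so check them against your installed Scrapling version.

```python
def choose_fetcher(needs_js=False, blocked=False, needs_session=False):
    """Return the fetcher class name suggested by the decision tree."""
    if blocked:
        name = "StealthyFetcher"   # anti-bot protection takes priority
    elif needs_js:
        name = "DynamicFetcher"    # browser rendering for JS-heavy pages
    else:
        name = "Fetcher"           # plain HTTP, the fastest option
    if needs_session:
        # Session variant names are an assumption (see the note above).
        name = {
            "Fetcher": "FetcherSession",
            "DynamicFetcher": "DynamicSession",
            "StealthyFetcher": "StealthySession",
        }[name]
    return name


print(choose_fetcher(needs_js=True))
```

Checking `blocked` first reflects the order of escalation: a blocked page needs the stealth fetcher whether or not it also needs JavaScript.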
MCP Fetch Modes
- fetch_page: HTTP fetcher
- fetch_dynamic: Browser-based with Playwright
- fetch_stealthy: Anti-bot bypass mode
Anti-Bot Escalation Ladder
Level 1: Polite HTTP
# MCP call: fetch_page with options
{
  "url": "https://example.com",
  "headers": {"User-Agent": "..."},
  "delay": 2.0
}
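How the `delay` option is applied on the server side is not specified here. If you issue several calls from your own script, a client-side throttle gives the same politeness guarantee. The sketch below is one way to do it with the standard library; `Throttle` is a hypothetical helper, and the clock and sleep functions are injectable only to make it easy to test.

```python
import time


class Throttle:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, delay, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Sleep just long enough that calls are at least `delay` seconds apart."""
        now = self._clock()
        if self._last is not None:
            remaining = self.delay - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Call `throttle.wait()` immediately before each request. The first call returns at once, and later calls block only for the part of the delay that has not already elapsed.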
Level 2: Session Persistence
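from scrapling.fetchers import FetcherSession
# Use sessions for cookie/state across requests
with FetcherSession(impersonate="chrome") as session:  # TLS fingerprint spoofing
    page = session.get("https://example.com")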
Level 3: Stealth Mode
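# MCP: fetch_stealthy
from scrapling.fetchers import StealthyFetcher

page = StealthyFetcher.fetch(
    "https://example.com",
    headless=True,
    solve_cloudflare=True,  # Auto-solve Turnstile
    network_idle=True       # Wait until network activity settles
)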
Level 4: Proxy Rotation
See references/proxy-rotation.md
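The ladder amounts to "try the cheapest level first, move up only when blocked". The sketch below shows that control flow with stand-in fetch functions so it runs without network access. The set of status codes treated as "blocked" is an assumption, and `escalate`, `fake_http`, and `fake_stealthy` are hypothetical names. In practice each callable would wrap the corresponding MCP tool or fetcher.

```python
BLOCKED_STATUSES = {403, 429, 503}  # assumed signals of anti-bot blocking


def escalate(url, levels):
    """Try each (name, fetch) level in order; stop at the first unblocked response.

    `fetch` is any callable that takes a URL and returns (status_code, body).
    """
    attempts = []
    for name, fetch in levels:
        status, body = fetch(url)
        attempts.append((name, status))
        if status not in BLOCKED_STATUSES:
            return name, body, attempts
    raise RuntimeError(f"all levels blocked for {url}: {attempts}")


# Stand-in fetchers so the sketch runs offline.
def fake_http(url):
    return 403, ""


def fake_stealthy(url):
    return 200, "<html>ok</html>"


level, body, attempts = escalate(
    "https://example.com",
    [("fetch_page", fake_http), ("fetch_stealthy", fake_stealthy)],
)
print(level, attempts)
```

Recording every attempt makes it easy to see afterwards which level a site actually required, so later runs can start there.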
Adaptive Scraping (Anti-Fragile)
Scrapling can survive website redesigns using adaptive selectors:
# First run — save fingerprints
products = page.css('.product', auto_save=True)
# Later runs — auto-relocate if DOM changed
products = page.css('.product', adaptive=True)
MCP usage:
mcporter call scrapling css_select \
  --selector ".product" \
  --adaptive true \
  --auto-save true
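The idea behind adaptive selectors can be illustrated without the library. The toy sketch below is not Scrapling's actual algorithm: it stores a text fingerprint of an element on the first run and, when the selector stops matching, picks the most similar element from the new page using `difflib`. The element dictionaries, `fingerprint`, and `relocate` are all hypothetical.

```python
from difflib import SequenceMatcher


def fingerprint(element):
    """Flatten an element dict (tag, attrs, text) into a comparable string."""
    attrs = " ".join(f"{k}={v}" for k, v in sorted(element["attrs"].items()))
    return f'{element["tag"]} {attrs} {element["text"]}'


def relocate(saved, candidates, threshold=0.5):
    """Return the candidate most similar to the saved fingerprint, or None."""
    best, best_score = None, threshold
    for candidate in candidates:
        score = SequenceMatcher(None, saved, fingerprint(candidate)).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best


# First run: the element matched by '.product' is fingerprinted and stored.
saved = fingerprint({"tag": "div", "attrs": {"class": "product"}, "text": "Blue Mug $12"})

# Later run: the class was renamed, so the original selector no longer matches.
new_dom = [
    {"tag": "nav", "attrs": {"class": "menu"}, "text": "Home About"},
    {"tag": "div", "attrs": {"class": "item-card"}, "text": "Blue Mug $12"},
]
match = relocate(saved, new_dom)
print(match["attrs"]["class"])
```

The threshold keeps the sketch from returning an unrelated element when nothing on the page resembles the saved one. A real implementation would compare richer features such as position in the tree and sibling structure.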
Add to Configuration
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-cccccqqqqq-scrapling-yoo": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Safety Note
ClawKit audits metadata but not runtime behavior. Use with caution.