scrapling
Advanced web scraping with anti-bot bypass, JavaScript support, and adaptive selectors. Use when scraping websites with Cloudflare protection, dynamic content, or frequent UI changes.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/cryptos3c/openclaw-scrapling
Scrapling Web Scraping Skill
Use Scrapling to scrape modern websites, including those with anti-bot protection or JavaScript-rendered content, and to keep selectors working across layout changes through adaptive element tracking.
When to Use This Skill
- User asks to scrape a website or extract data from a URL
- Need to bypass Cloudflare, bot detection, or anti-scraping measures
- Need to handle JavaScript-rendered/dynamic content (React, Vue, etc.)
- Website requires login or session management
- Website structure changes frequently (adaptive selectors)
- Need to scrape multiple pages with rate limiting
Commands
All commands use the scrape.py script in this skill's directory.
Basic HTTP Scraping (Fast)
python scrape.py \
--url "https://example.com" \
--selector ".product" \
--output products.json
Use when: Static HTML, no JavaScript, no bot protection
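When the script is driven from another Python program, the invocation above can be assembled as an argument list instead of a shell string. The following is a minimal sketch, not part of the skill itself: `build_scrape_command` is a hypothetical helper, and it assumes `scrape.py` accepts exactly the flags shown in this document.

```python
import sys
from typing import List, Optional


def build_scrape_command(
    url: str,
    selector: str,
    output: str,
    mode: Optional[str] = None,
    script: str = "scrape.py",
) -> List[str]:
    """Assemble an argv list for scrape.py (hypothetical helper).

    mode is None for plain HTTP scraping, or "stealth" / "dynamic" to add
    the corresponding flag documented in this skill.
    """
    if mode not in (None, "stealth", "dynamic"):
        raise ValueError(f"unsupported mode: {mode!r}")
    cmd = [sys.executable, script, "--url", url]
    if mode is not None:
        cmd.append(f"--{mode}")
    cmd += ["--selector", selector, "--output", output]
    return cmd


# Same arguments as the basic HTTP example above.
cmd = build_scrape_command("https://example.com", ".product", "products.json")
# To actually run it: subprocess.run(cmd, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems when a URL or selector contains characters such as `&` or `#`.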
Stealth Mode (Bypass Anti-Bot)
python scrape.py \
--url "https://nopecha.com/demo/cloudflare" \
--stealth \
--selector "#content" \
--output data.json
Use when: Cloudflare protection, bot detection, fingerprinting
Features:
- Bypasses Cloudflare Turnstile automatically
- Browser fingerprint spoofing
- Headless browser mode
Dynamic/JavaScript Content
python scrape.py \
--url "https://spa-website.com" \
--dynamic \
--selector ".loaded-content" \
--wait-for ".loaded-content" \
--output data.json
Use when: React/Vue/Angular apps, lazy-loaded content, AJAX
Features:
- Full Playwright browser automation
- Wait for elements to load
- Network idle detection
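Waiting for an element comes down to polling a condition until it holds or a timeout expires. The sketch below illustrates that idea in plain Python, without Playwright; `wait_until` and its parameters are hypothetical names, and the actual script presumably delegates to the browser's own wait mechanism.

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool],
    timeout: float = 5.0,
    interval: float = 0.05,
) -> bool:
    """Poll condition() until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Simulate content that only becomes available on the third check.
checks = {"count": 0}


def content_loaded() -> bool:
    checks["count"] += 1
    return checks["count"] >= 3


loaded = wait_until(content_loaded, timeout=1.0, interval=0.01)
never = wait_until(lambda: False, timeout=0.05, interval=0.01)
```

Here `loaded` is `True` because the condition succeeds before the deadline, while `never` is `False` because the timeout expires first.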
Adaptive Selectors (Survives Website Changes)
# First time - save the selector pattern
python scrape.py \
--url "https://example.com" \
--selector ".product-card" \
--adaptive-save \
--output products.json
# Later, if website structure changes
python scrape.py \
--url "https://example.com" \
--adaptive \
--output products.json
Use when: Website frequently redesigns, need robust scraping
How it works:
- First run: Saves element patterns/structure
- Later runs: Uses similarity algorithms to relocate moved elements
- Auto-updates selector cache
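The relocation step can be pictured as scoring every element on the redesigned page against a saved fingerprint and keeping the closest match. The sketch below is an illustration only and is not Scrapling's actual algorithm: it uses `difflib.SequenceMatcher` from the standard library as a stand-in similarity measure, and the element dictionaries, `fingerprint`, and `relocate` are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional

Element = Dict[str, str]  # e.g. {"tag": ..., "class": ..., "text": ...}


def fingerprint(el: Element) -> str:
    """Flatten an element's properties into one comparable string."""
    return "|".join(f"{key}={el[key]}" for key in sorted(el))


def relocate(
    saved: Element,
    candidates: List[Element],
    threshold: float = 0.5,
) -> Optional[Element]:
    """Return the candidate most similar to `saved`, or None if no
    candidate scores above `threshold`."""
    target = fingerprint(saved)
    best, best_score = None, threshold
    for el in candidates:
        score = SequenceMatcher(None, target, fingerprint(el)).ratio()
        if score > best_score:
            best, best_score = el, score
    return best


# Fingerprint stored on the first run.
saved = {"tag": "div", "class": "product-card", "text": "Blue Widget $19.99"}

# After a redesign the tag and class changed, but the text did not.
redesigned_page = [
    {"tag": "nav", "class": "menu", "text": "Home About Contact"},
    {"tag": "article", "class": "item-card product", "text": "Blue Widget $19.99"},
    {"tag": "footer", "class": "site-footer", "text": "Copyright"},
]
match = relocate(saved, redesigned_page)
```

The threshold keeps the function from returning an unrelated element when nothing on the page resembles the saved one.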
Session Management (Login Required)
# Login and save session
python scrape.py \
--url "https://example.com/dashboard" \
--stealth \
--login \
--username "user@example.com" \
--password "password123" \
--session-name "my-session" \
--selector ".protected-data" \
--output data.json
# Reuse saved session (no login needed)
python scrape.py \
--url "https://example.com/another-page" \
--stealth \
--session-name "my-session" \
--selector ".more-data" \
--output more_data.json
Use when: Content requires authentication, multi-step scraping
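Reusing a session by name implies that something is written to disk after login and read back on later runs. How `scrape.py` stores sessions is not documented here, so the following is only a sketch under the assumption that cookies are kept in one JSON file per session name; `save_session` and `load_session` are hypothetical helpers.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, Optional


def save_session(name: str, cookies: Dict[str, str], directory: Path) -> Path:
    """Write the cookies of a named session to <directory>/<name>.json."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{name}.json"
    path.write_text(json.dumps(cookies), encoding="utf-8")
    return path


def load_session(name: str, directory: Path) -> Optional[Dict[str, str]]:
    """Return the saved cookies, or None if the session was never saved."""
    path = directory / f"{name}.json"
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))


# Round trip in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp)
    save_session("my-session", {"sessionid": "abc123"}, store)
    restored = load_session("my-session", store)
    missing = load_session("other-session", store)
```

A saved session file is equivalent to a login credential, so any real store should live outside version control and be readable only by the current user.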
Extract Specific Data Types
Text only:
python scrape.py \
--url "https://example.com" \
--selector ".content" \
--extract text \
--output content.txt
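Text-only extraction means keeping the visible text of the selected element and discarding tags, scripts, and styles. The sketch below shows one way to do that with the standard library's `html.parser`; it is an assumption about what `--extract text` produces, not the script's actual implementation, and `TextExtractor` and `extract_text` are hypothetical names.

```python
from html.parser import HTMLParser
from typing import List


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self._chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def extract_text(html: str) -> str:
    """Return the visible text of an HTML fragment as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    '<div class="content"><h1>Title</h1>'
    "<script>var x = 1;</script>"
    "<p>Hello <b>world</b></p></div>"
)
result = extract_text(sample)  # "Title Hello world"
```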
Paste this into your clawhub.json to enable this plugin:
{
  "plugins": {
    "official-cryptos3c-openclaw-scrapling": {
      "enabled": true,
      "auto_update": true
    }
  }
}
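If clawhub.json already contains other settings, the entry can be merged in rather than pasted over the file. The sketch below is a hypothetical helper, not a clawhub feature: `enable_plugin` is an invented name, and the location of clawhub.json is left to the caller because this document does not state it.

```python
import json
import tempfile
from pathlib import Path

PLUGIN_ID = "official-cryptos3c-openclaw-scrapling"


def enable_plugin(config_path: Path, plugin_id: str = PLUGIN_ID) -> dict:
    """Add or update the plugin entry while preserving other settings."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    plugins = config.setdefault("plugins", {})
    plugins[plugin_id] = {"enabled": True, "auto_update": True}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demonstrate on a temporary file that already lists another plugin
# ("some-other-plugin" is a made-up name for the demo).
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "clawhub.json"
    path.write_text('{"plugins": {"some-other-plugin": {"enabled": false}}}', encoding="utf-8")
    enable_plugin(path)
    merged = json.loads(path.read_text(encoding="utf-8"))
```

After the call, `merged` contains both the existing entry and the new Scrapling entry.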