crawler
Web crawling and scraping reference — robots.txt protocol, Scrapy framework, anti-bot detection, headless browsers, and legal considerations
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/bytesagain3/crawler
Crawler
No API keys or credentials required; outputs reference documentation only.
Commands
| Command | Description |
|---|---|
| intro | Crawling vs scraping, robots.txt, sitemap |
| standards | HTTP caching, structured data, meta tags |
| troubleshooting | Anti-bot detection, JS rendering, encoding |
| performance | Concurrency, dedup, incremental, distributed |
| security | Legal landscape, ethical guidelines, proxies |
| migration | BeautifulSoup to Scrapy, requests to Playwright |
| cheatsheet | Scrapy commands, CSS/XPath, curl, user-agents |
| faq | Legality, JS pages, blocking, storage |
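
The intro topic lists robots.txt among its subjects. As a rough illustration of that protocol (not taken from the skill's own output), here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt body, the `example-crawler` user-agent, and the `example.com` URLs are all hypothetical, and the rules are parsed from a string so the example needs no network access.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, parsed offline (no network access).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Build a parser from robots.txt text instead of fetching a URL."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


AGENT = "example-crawler"  # hypothetical user-agent name

parser = build_parser(ROBOTS_TXT)

# A path outside /private/ is allowed; a path under it is not.
allowed_public = parser.can_fetch(AGENT, "https://example.com/index.html")
allowed_private = parser.can_fetch(AGENT, "https://example.com/private/data.html")

# Crawl-delay from the wildcard group applies to any agent without its own group.
delay = parser.crawl_delay(AGENT)

print(allowed_public, allowed_private, delay)
```

A crawler would typically run a check like `can_fetch` before each request and sleep for the reported delay between requests to the same host.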
Output Format
All commands output plain-text reference documentation via heredoc. No external API calls, no credentials needed, no network access.
Powered by BytesAgain | bytesagain.com | [email protected]
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-bytesagain3-crawler": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Related Skills
scrapebadger
Web scraping platform — Twitter/X data, Vinted marketplace, and general web scraping API
kuaipu-skill
Automates Kuaipu system login, CAPTCHA recognition, automated operations, and approval workflow queries.
youtube-apify-transcript
Fetch YouTube transcripts via APIFY API. Works from cloud IPs (Hetzner, AWS, etc.) by bypassing YouTube's bot detection. Free tier includes $5/month credits (~714 videos). No credit card required.