Official Verified

crawler

Web crawling and scraping reference — robots.txt protocol, Scrapy framework, anti-bot detection, headless browsers, and legal considerations


Install via CLI (Recommended)

clawhub install openclaw/skills/skills/bytesagain3/crawler

Crawler

No API keys or credentials are required; the skill outputs reference documentation only.

Commands

Command          Description
intro            Crawling vs scraping, robots.txt, sitemap
standards        HTTP caching, structured data, meta tags
troubleshooting  Anti-bot detection, JS rendering, encoding
performance      Concurrency, dedup, incremental, distributed
security         Legal landscape, ethical guidelines, proxies
migration        BeautifulSoup to Scrapy, requests to Playwright
cheatsheet       Scrapy commands, CSS/XPath, curl, user-agents
faq              Legality, JS pages, blocking, storage
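
As an illustration of the robots.txt topic listed under intro, here is a minimal sketch using Python's standard urllib.robotparser module. It is not taken from the skill's own output; the robots.txt content, the "example-bot" user agent, and the example.com URLs are all hypothetical, and the rules are inlined so the sketch needs no network access.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so no request is made.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the parsed rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_parser(ROBOTS_TXT)
    # "example-bot" is an assumed user-agent name for illustration only.
    for url in ("https://example.com/public/page",
                "https://example.com/private/data"):
        print(url, is_allowed(rules, "example-bot", url))
    print("crawl delay:", rules.crawl_delay("example-bot"))
```

A real crawler would typically call set_url() and read() on the parser to load the live file, and would sleep for the reported crawl delay between requests.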

Output Format

All commands output plain-text reference documentation via heredoc. No external API calls, no credentials needed, no network access.


Powered by BytesAgain | bytesagain.com

Metadata

Stars: 3875
Views: 1
Updated: 2026-04-07
Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-bytesagain3-crawler": {
      "enabled": true,
      "auto_update": true
    }
  }
}
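
If clawhub.json already contains other plugins, pasting by hand risks clobbering them. The following is a rough sketch of merging the entry programmatically. It assumes the file uses only the top-level "plugins" layout shown above and lives at a path you supply; the helper name enable_plugin is hypothetical and not part of any ClawKit tooling.

```python
import json
from pathlib import Path

# Plugin key and settings copied from the configuration snippet above.
PLUGIN_ID = "official-bytesagain3-crawler"
PLUGIN_ENTRY = {"enabled": True, "auto_update": True}


def enable_plugin(config_path: Path) -> dict:
    """Add the crawler entry to a clawhub.json file, keeping existing keys.

    Hypothetical helper: creates the file if it does not exist yet.
    """
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    # Keep any plugins already configured; only set this one entry.
    plugins = config.setdefault("plugins", {})
    plugins[PLUGIN_ID] = dict(PLUGIN_ENTRY)
    config_path.write_text(json.dumps(config, indent=2) + "\n",
                           encoding="utf-8")
    return config


if __name__ == "__main__":
    # Assumes clawhub.json sits in the current working directory.
    print(json.dumps(enable_plugin(Path("clawhub.json")), indent=2))
```

Running the script twice leaves the file unchanged after the first run, since the entry is set by key rather than appended.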

Tags

#web-scraping #scrapy #crawler #robots-txt #selenium

Safety Note: ClawKit audits metadata but not runtime behavior. Use with caution.