Official | Verified | browser automation | Safety 3/5

playwright-scraper-skill

Playwright-based web scraping skill for OpenClaw, with measures for getting past anti-bot protection. Tested successfully on complex sites such as Discuss.com.hk.

Why use this skill?

Scrape the web from OpenClaw with Playwright. The skill includes stealth modes for Cloudflare-protected sites and simple scripts for dynamic content.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/waisimon/playwright-scraper-skill

What This Skill Does

The playwright-scraper-skill is a Playwright-powered OpenClaw skill for extracting data from pages that standard fetch tools cannot handle. It takes a tiered approach to scraping: a lightweight, speed-focused script for standard dynamic pages, and a stealth-enabled module for websites protected by anti-bot services such as Cloudflare. It targets three common hurdles: JavaScript-rendered content, headless browser detection, and user-agent fingerprinting.

Installation

To get started, ensure you have the OpenClaw environment active. Navigate to the skill directory and install the necessary dependencies:

  1. cd playwright-scraper-skill
  2. npm install
  3. npx playwright install chromium

Ensure your system meets the requirements for running headless browsers, as this skill relies on the Chromium engine to simulate a real user environment.
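One way to confirm the installation is to launch Chromium once and read a page title. The sketch below is not part of the skill: the file name smoke-check.js and the function smokeCheck are hypothetical, and it assumes that npm install placed the playwright package in node_modules.

```javascript
// smoke-check.js (hypothetical file name, not shipped with the skill)
// Assumes `npm install` has placed the `playwright` package in node_modules.

async function smokeCheck(url = 'https://example.com') {
  // Fail fast on a malformed URL before starting a browser.
  new URL(url);

  // Loaded lazily so a missing dependency surfaces as a clear error here.
  const { chromium } = require('playwright');

  const browser = await chromium.launch({ headless: true });
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'domcontentloaded' });
    return { version: browser.version(), title: await page.title() };
  } finally {
    // Always release the browser process, even if navigation fails.
    await browser.close();
  }
}

module.exports = { smokeCheck };

// Usage, from the skill directory:
//   node -e "require('./smoke-check.js').smokeCheck().then(console.log)"
```

If the call rejects with a message about a missing executable, rerunning npx playwright install chromium is the usual first step.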

Use Cases

This skill is built for users requiring high-fidelity data extraction.

  • For simple dynamic content: Use playwright-simple.js when elements are rendered via React, Vue, or other client-side frameworks.
  • For restricted environments: Deploy playwright-stealth.js when you encounter 403 Forbidden errors or anti-scraping challenges. The stealth implementation mimics human behavior through randomized delays and realistic headers.
  • For niche platforms: Use specialized handlers like deep-scraper for YouTube or reddit-scraper for social platforms to ensure compliant and structured output.
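The first two cases above can be illustrated with a short Playwright sketch. This is not the content of playwright-simple.js or playwright-stealth.js, whose internals are not shown here; the helper names (randomDelayMs, buildContextOptions, fetchRendered), the user-agent string, and the delay bounds are all assumptions chosen for illustration.

```javascript
// Hypothetical helpers illustrating "randomized delays and realistic headers".
// These are NOT the skill's own scripts; all names and values are assumptions.

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Returns an integer delay in [minMs, maxMs] so requests are not evenly spaced.
function randomDelayMs(minMs = 800, maxMs = 2500) {
  return Math.floor(minMs + Math.random() * (maxMs - minMs + 1));
}

// Browser-context options resembling an ordinary desktop Chrome session.
// The user-agent is an example value; keep it in line with the installed Chromium.
function buildContextOptions() {
  return {
    userAgent:
      'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ' +
      '(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    locale: 'en-US',
    viewport: { width: 1366, height: 768 },
    extraHTTPHeaders: { 'Accept-Language': 'en-US,en;q=0.9' },
  };
}

// Renders `url` in headless Chromium and returns the resulting HTML.
// Pass `selector` to wait for content rendered client-side (React, Vue, ...).
async function fetchRendered(url, selector) {
  const { chromium } = require('playwright'); // assumed to be installed
  const browser = await chromium.launch({ headless: true });
  try {
    const context = await browser.newContext(buildContextOptions());
    const page = await context.newPage();
    await sleep(randomDelayMs()); // pause before navigating
    await page.goto(url, { waitUntil: 'domcontentloaded' });
    if (selector) {
      await page.waitForSelector(selector, { timeout: 15000 });
    }
    return await page.content();
  } finally {
    await browser.close();
  }
}
```

Headers and delays alone will not defeat every anti-bot system; they only remove the most obvious signs of automation, which is why the stealth script is offered as a separate tier.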

Example Prompts

  1. "OpenClaw, use the stealth scraper to fetch the latest hot threads from https://m.discuss.com.hk/ and summarize the top three topics."
  2. "Can you please scrape the pricing table from the dynamic dashboard at https://example.com/pricing using the simple playwright script?"
  3. "The previous fetch attempt returned an empty page for this site; please try again using the playwright-stealth module to bypass the security wall."

Tips & Limitations

  • Efficiency First: Always attempt a basic web_fetch before resorting to Playwright; it is significantly faster and consumes fewer system resources.
  • Rate Limiting: Even with stealth enabled, excessive scraping can trigger IP-based rate limiting. Consider adding manual delays or proxy rotations if you are scraping at scale.
  • Maintenance: Web interfaces change frequently. If a scraper fails, check for DOM structure changes in the target website before debugging the automation scripts themselves. Always keep your browser binaries updated via npx playwright install.
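The first two tips can be combined into a small control flow: try a plain HTTP fetch, fall back to a browser only when the response looks blocked or empty, and space out requests. The sketch below is a hedged illustration, not the skill's implementation. The names looksBlocked, fetchWithFallback, and scrapeSequentially, the status codes checked, and the 500-character threshold are assumptions; the default fetch argument relies on the global fetch available in Node.js 18 and later.

```javascript
// Hypothetical sketch of "cheap fetch first, browser only when needed".
// All names and thresholds here are illustrative assumptions.

// Heuristic: treat 403/429/503 or a near-empty body as a sign that a plain
// HTTP fetch was blocked or that the page needs JavaScript to render.
function looksBlocked(status, body, minLength = 500) {
  if ([403, 429, 503].includes(status)) return true;
  return body.trim().length < minLength;
}

// `renderWithBrowser` is any async function (url) => html, for example a
// Playwright-based renderer. It is injected so this file has no hard dependency.
async function fetchWithFallback(url, renderWithBrowser, fetchImpl = fetch) {
  try {
    const res = await fetchImpl(url);
    const body = await res.text();
    if (!looksBlocked(res.status, body)) {
      return { source: 'fetch', html: body };
    }
  } catch (err) {
    // Network-level failure: fall through to the browser path.
  }
  return { source: 'browser', html: await renderWithBrowser(url) };
}

// Visits URLs one at a time with a fixed pause between them, which keeps the
// request rate low enough to reduce the chance of IP-based rate limiting.
async function scrapeSequentially(urls, scrapeOne, delayMs = 2000) {
  const results = [];
  for (const [i, url] of urls.entries()) {
    if (i > 0) await new Promise((resolve) => setTimeout(resolve, delayMs));
    results.push(await scrapeOne(url));
  }
  return results;
}
```

Injecting the renderer keeps the expensive browser path optional, so the same function works whether the fallback is the simple script, the stealth script, or a stub in a test.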

Metadata

Author: @waisimon
Stars: 919
Views: 16
Updated: 2026-02-12

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-waisimon-playwright-scraper-skill": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags(AI)

#scraping #playwright #automation #stealth #web-data
Safety Score: 3/5

Flags: network-access, file-write, file-read, code-execution