ClawKit Reliability Toolkit

OpenClaw Browser Automation: Full Setup Guide (2026)

The browser-use-api skill gives your OpenClaw agent eyes and hands on the web. Instead of writing Playwright scripts, you describe what you want in plain language and the agent handles navigation, clicks, form fills, and data extraction autonomously.

What Makes browser-use-api Different

Traditional browser automation requires you to write selectors, handle dynamic content, and maintain fragile scripts. browser-use-api pairs a Chromium instance with your LLM so the agent reads the page visually, decides what to click, and adapts when layouts change — no selectors needed.
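Conceptually, this is an observe/decide/act loop: screenshot the page, ask the LLM for the next action, perform it, repeat. The Python sketch below shows the shape of such a loop; the function names (observe, decide, act) and the action dictionaries are illustrative placeholders, not the skill's actual API.

```python
def run_mission(mission, decide, act, observe, max_steps=15):
    """Loop until the LLM declares the mission done or steps run out.

    observe() returns the current page state (e.g. a screenshot),
    decide() asks the LLM for the next action given that state,
    act() performs the chosen action (click, type, navigate, ...).
    """
    history = []
    for step in range(max_steps):
        page_state = observe()
        action = decide(mission, page_state, history)
        if action["type"] == "done":
            return {"status": "success", "steps": step}
        act(action)
        history.append(action)
    return {"status": "max_steps_reached", "steps": max_steps}
```

The maxSteps budget in mission prompts maps onto this loop: it caps how many observe/decide/act iterations the agent may spend before giving up.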

Core Capabilities

Form Filling

Navigate to any form, fill fields with dynamic data, and submit, all from a single mission prompt. For CAPTCHA-protected forms, see Troubleshooting below.

Data Extraction

Scrape structured data from tables, listings, and dashboards. The agent understands context, not just raw HTML.

Screenshot Monitoring

Capture visual evidence of page state, compare changes over time, and alert on anomalies.

Installation

The browser-use-api skill is available in the OpenClaw Skill Registry. Add it to your config via the Config Wizard or manually:

// clawhub.json — add to mcpServers
{
  "mcpServers": {
    "browser-use-api": {
      "command": "npx",
      "args": ["-y", "@openclaw/browser-use-api"],
      "env": {
        "BROWSER_HEADLESS": "true",
        "BROWSER_VIEWPORT_WIDTH": "1280",
        "BROWSER_VIEWPORT_HEIGHT": "900",
        "BROWSER_TIMEOUT_MS": "30000"
      }
    }
  }
}
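If you prefer to script the change rather than use the Config Wizard, a merge like the following works. It assumes clawhub.json is plain JSON with the mcpServers layout shown above; the helper name is illustrative.

```python
# Register the browser-use-api entry in an mcpServers config dict.
# The entry mirrors the clawhub.json example above.
BROWSER_USE_ENTRY = {
    "command": "npx",
    "args": ["-y", "@openclaw/browser-use-api"],
    "env": {
        "BROWSER_HEADLESS": "true",
        "BROWSER_VIEWPORT_WIDTH": "1280",
        "BROWSER_VIEWPORT_HEIGHT": "900",
        "BROWSER_TIMEOUT_MS": "30000",
    },
}

def add_browser_skill(config: dict) -> dict:
    """Return a copy of the config with browser-use-api registered."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["browser-use-api"] = BROWSER_USE_ENTRY
    merged["mcpServers"] = servers
    return merged
```

Merging into a copy (rather than mutating in place) keeps any existing server entries untouched.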

First-Time Setup: Install Chromium

Run npx playwright install chromium once to download the browser binary (a one-time download of roughly 130 MB). After that, all browser automation runs locally.

Workflow 1: Automated Form Filling

Send a mission that describes the form goal. The agent navigates to the page, identifies fields, and fills them with your data.

// Mission prompt — form filling
{
  "mission": "Go to https://example.com/contact-form. Fill in:
    - Name: John Smith
    - Email: [email protected]
    - Subject: Partnership Inquiry
    - Message: [read from ./message.txt]
    Submit the form and screenshot the confirmation page.",
  "maxSteps": 15
}
1. Navigate: the agent opens the URL in a headless browser.
2. Analyze: the LLM reads the page screenshot and identifies form fields and labels.
3. Fill: the agent types data into each field, handling dropdowns and checkboxes.
4. Submit: the agent clicks the submit button and waits for confirmation.
5. Confirm: the agent screenshots the success state as evidence.

Workflow 2: Data Scraping

Extract structured data from any website — no API needed. The agent reads pages like a human and returns clean JSON.

// Mission prompt — product data scraping
{
  "mission": "Go to https://shop.example.com/products.
    For each product on the first 3 pages, extract:
    - product name
    - price
    - rating
    - availability (in stock / out of stock)
    Save results to ./products.json as an array of objects.",
  "maxSteps": 40
}
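Agent extractions can occasionally be partial, so it is worth validating products.json before feeding it downstream. A minimal Python sketch; the field names mirror the mission above, while the derived in_stock flag and the helper name are added here for illustration.

```python
REQUIRED_KEYS = {"name", "price", "rating", "availability"}

def validate_products(products):
    """Drop records missing required fields; normalize availability to a bool."""
    clean = []
    for item in products:
        if not REQUIRED_KEYS.issubset(item):
            continue  # skip partial extractions rather than failing hard
        record = dict(item)
        record["in_stock"] = item["availability"].strip().lower() == "in stock"
        clean.append(record)
    return clean
```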

Why This Works Without Selectors

The LLM interprets the visual layout of each page — it sees "Price: $29.99" the same way you do, regardless of the underlying HTML structure. This makes your scraper resilient to site redesigns.

Workflow 3: Screenshot Monitoring

Run the agent on a schedule to capture page state and alert you to changes — price drops, status updates, content changes.

// Mission prompt — price monitoring
{
  "mission": "Check the price of the MacBook Pro M4 at apple.com/shop.
    If the price is below $1,500, send an alert via Slack webhook:
    https://hooks.slack.com/...
    Include a screenshot of the product page in the message.
    Save the current price to ./price-log.json with a timestamp.",
  "maxSteps": 10
}
# Schedule with cron (every hour)
0 * * * * openclaw run --config clawhub.json --mission ./missions/price-monitor.json
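The price-log half of this mission can also be handled deterministically outside the agent. Below is a hypothetical Python helper (not part of the skill) that appends a timestamped entry to the log file and returns whether the alert threshold was crossed; the caller would then fire the Slack webhook.

```python
import json
import time

def record_price(log_path, price, threshold=1500.0, now=None):
    """Append a timestamped price entry; return True if an alert should fire."""
    now = now if now is not None else time.time()
    try:
        with open(log_path) as f:
            log = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        log = []  # first run, or a corrupted log: start fresh
    log.append({"timestamp": now, "price": price})
    with open(log_path, "w") as f:
        json.dump(log, f, indent=2)
    return price < threshold  # True -> send the Slack alert
```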

Configuration Reference

| Option | Default | Description |
|---|---|---|
| BROWSER_HEADLESS | true | Run without a visible browser window. Set to false for debugging. |
| BROWSER_VIEWPORT_WIDTH | 1280 | Browser window width in pixels. |
| BROWSER_VIEWPORT_HEIGHT | 900 | Browser window height in pixels. |
| BROWSER_TIMEOUT_MS | 30000 | Max wait time for page loads (ms). |
| BROWSER_SLOW_MO | 0 | Delay added between actions (ms). Useful for debugging. |
| BROWSER_SCREENSHOT_DIR | ./screenshots | Directory to save screenshots. |
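These options all arrive as environment-variable strings, so a launcher has to coerce the numeric ones. A sketch of how that resolution might look; browser_options and its return keys are illustrative, not the skill's internals.

```python
import os

# Defaults mirror the configuration reference table above.
DEFAULTS = {
    "BROWSER_HEADLESS": "true",
    "BROWSER_VIEWPORT_WIDTH": "1280",
    "BROWSER_VIEWPORT_HEIGHT": "900",
    "BROWSER_TIMEOUT_MS": "30000",
    "BROWSER_SLOW_MO": "0",
    "BROWSER_SCREENSHOT_DIR": "./screenshots",
}

def browser_options(env=os.environ):
    """Resolve browser options from the environment, falling back to defaults."""
    raw = {key: env.get(key, default) for key, default in DEFAULTS.items()}
    return {
        "headless": raw["BROWSER_HEADLESS"].lower() == "true",
        "viewport": (int(raw["BROWSER_VIEWPORT_WIDTH"]),
                     int(raw["BROWSER_VIEWPORT_HEIGHT"])),
        "timeout_ms": int(raw["BROWSER_TIMEOUT_MS"]),
        "slow_mo_ms": int(raw["BROWSER_SLOW_MO"]),
        "screenshot_dir": raw["BROWSER_SCREENSHOT_DIR"],
    }
```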

Troubleshooting

Browser Timeout

Increase BROWSER_TIMEOUT_MS to 60000 for slow sites. Also check the troubleshooting guide for browser-control-timeout.


CAPTCHA Failures

Some sites require human verification. Run in headed mode (BROWSER_HEADLESS=false) so you can solve the CAPTCHA yourself, and increase maxSteps so the agent has room to resume afterward.

Missing Chromium

Run: npx playwright install chromium. This downloads the browser binary needed for automation.

Agent Loop Errors

If the agent seems stuck, reduce maxSteps and add more specific instructions to your mission prompt.

Ready to Automate Your Browser?

Start with the Config Wizard to add browser-use-api to your setup, or browse the Skill Registry to see all available browser tools.

Need Help?

Try our automated tools to solve common issues instantly.