agent-browser
Headless browser automation CLI optimized for AI agents with accessibility tree snapshots and ref-based element selection
Why use this skill?
Automate web tasks with the agent-browser skill. Fast, accessibility-driven navigation and deterministic element selection for AI agents. Install via ClawHub.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/matrixy/agent-browser-clawdbot
What This Skill Does
The agent-browser skill is a headless browser automation tool built for AI agents. Unlike standard Puppeteer or Playwright wrappers, agent-browser takes an accessibility-first approach: it takes a snapshot of the page's accessibility tree and assigns each element a unique, deterministic 'ref' (e.g., @e1, @e2) that the agent uses to interact with it. Scripts built on refs are less fragile than scripts that depend on CSS selectors or XPath expressions, which often break when a site is updated. The skill provides a full suite of navigation, interaction, state checking, and network control capabilities, making it well suited to complex workflows that require authentication, state persistence, and SPA navigation.
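The snapshot-then-act pattern described above can be sketched as a small command builder. This is a minimal illustration, not the skill's documented interface: the subcommand names open, click, and fill are assumptions, and the helper snapshot_then_act is hypothetical. Only the agent-browser binary name, the snapshot step, and the @eN ref syntax come from the description. The function builds the command lines without running them.

```python
import shlex

CLI = "agent-browser"  # binary name taken from the description above


def snapshot_then_act(url, ref, text=None):
    """Build (but do not run) a snapshot-then-act command sequence.

    Assumption: the subcommands "open", "click" and "fill" exist with
    these argument orders. Check `agent-browser --help` before relying
    on them.
    """
    if not ref.startswith("@e"):
        raise ValueError(f"expected a snapshot ref like @e1, got {ref!r}")
    if text is not None:
        action = [CLI, "fill", ref, text]   # type text into the element
    else:
        action = [CLI, "click", ref]        # click the element
    return [
        [CLI, "open", url],   # load the page (assumed subcommand)
        [CLI, "snapshot"],    # emit the accessibility tree with refs
        action,               # act on the element by ref, not by selector
    ]


if __name__ == "__main__":
    # Print the commands as copy-pasteable shell lines.
    for cmd in snapshot_then_act("https://example.com/login", "@e2", "alice"):
        print(shlex.join(cmd))
```

Because refs are assigned by a snapshot, a real workflow would take a fresh snapshot after any navigation or major page change before reusing a ref.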
Installation
To install this skill, run the following command in your terminal:
clawhub install openclaw/skills/skills/matrixy/agent-browser-clawdbot
Make sure the OpenClaw environment is initialized before installing. Once the install completes, run agent-browser --help to verify it.
Use Cases
- Automated Web Testing: Run regression suites on complex single-page applications (SPAs) without tests breaking when UI elements shift position or markup.
- Data Scraping & Extraction: Extract large volumes of data from websites that require user interaction, such as clicking 'load more' buttons or handling infinite scrolls.
- Multi-session Workflows: Use session isolation to manage different user roles (e.g., 'admin' and 'user') simultaneously in separate browser instances.
- Task Automation: Automate repetitive manual web tasks like filling out forms, handling authentication cookies, or navigating through multi-step procurement or signup processes.
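As a rough illustration of the multi-session use case above, the sketch below builds one command per user role. The --session flag and the open subcommand are assumptions about the CLI, and the helpers for_session and open_as_roles are hypothetical names introduced only for this example. Nothing is executed.

```python
CLI = "agent-browser"


def for_session(session, *args):
    """Prefix a command with a session selector.

    Assumption: the CLI isolates browser instances via a
    `--session <name>` flag; confirm with `agent-browser --help`.
    """
    return [CLI, "--session", session, *args]


def open_as_roles(url, roles=("admin", "user")):
    """Return one 'open' command per role, each in its own session."""
    return {role: for_session(role, "open", url) for role in roles}


if __name__ == "__main__":
    for role, cmd in open_as_roles("https://example.com/dashboard").items():
        print(role, cmd)
```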
Example Prompts
- "Open https://github.com/login, fill in the credentials using the accessibility tree to find the input fields, and save the session state to auth.json once logged in."
- "Navigate to the dashboard and wait until the network is idle, then capture a snapshot of the main content and get the text from all list items with the class .data-row."
- "Block all requests matching '**/ads/*' and then proceed to scrape the prices from the product page by clicking the next button until we reach the end of the list."
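To make the '**/ads/*' pattern in the last prompt concrete, here is a minimal sketch of how such a URL glob could be evaluated. It assumes Playwright-style semantics ('**' crosses path separators, '*' does not), which is not confirmed for this CLI. The helpers glob_to_regex and is_blocked are hypothetical.

```python
import re


def glob_to_regex(pattern):
    """Translate a URL glob into a compiled regex.

    Assumed semantics (not confirmed for agent-browser):
    "**" matches any characters including "/", while "*" matches
    any characters except "/". Everything else is literal.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("^" + "".join(parts) + "$")


def is_blocked(url, patterns):
    """Return True if the URL matches any of the blocking globs."""
    return any(glob_to_regex(p).match(url) is not None for p in patterns)


if __name__ == "__main__":
    for url in ("https://example.com/ads/banner.js",
                "https://example.com/products/1"):
        print(url, is_blocked(url, ["**/ads/*"]))
```

Under these assumed semantics the pattern only covers files directly inside an ads/ directory; a broader pattern such as '**/ads/**' would be needed to cover nested paths.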
Tips & Limitations
- Accessibility Matters: Because this skill relies on the accessibility tree, ensure the target websites have proper ARIA labels for the best experience. If an element is missing from the snapshot, it might be hidden from accessibility tools.
- Performance: For large pages, use the -c (compact) flag during snapshots to keep the context window manageable for your LLM.
- Persistence: Always use state save and state load to avoid redundant login flows, which saves time and avoids triggering anti-bot systems.
- Headless Mode: This skill runs headless by default. While this is efficient, some sites with sophisticated bot detection may require additional headers or user-agent spoofing, which can be configured via network routing tools.
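The persistence and compact-snapshot tips above can be combined into one decision: reuse saved state if it exists, otherwise log in once and save it. The sketch below only builds the command lists. The state save, state load, and -c pieces come from the tips; the open subcommand, the argument order, and the helper session_commands are assumptions made for illustration.

```python
import os

CLI = "agent-browser"


def session_commands(url, state_file="auth.json", login_steps=(),
                     exists=os.path.exists):
    """Return the commands for one run, reusing saved state when present.

    `login_steps` is a sequence of extra commands (each a list of
    arguments) that perform the login; it is only used when no saved
    state is found. `exists` is injectable so the logic can be checked
    without touching the filesystem.
    """
    cmds = []
    if exists(state_file):
        # Saved cookies/storage found: skip the login flow entirely.
        cmds.append([CLI, "state", "load", state_file])
        cmds.append([CLI, "open", url])  # "open" is an assumed subcommand
    else:
        cmds.append([CLI, "open", url])
        cmds.extend(list(step) for step in login_steps)
        cmds.append([CLI, "state", "save", state_file])
    # Compact snapshot keeps the output small for the LLM context window.
    cmds.append([CLI, "snapshot", "-c"])
    return cmds


if __name__ == "__main__":
    for cmd in session_commands("https://example.com/dashboard"):
        print(" ".join(cmd))
```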
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-matrixy-agent-browser-clawdbot": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: network-access, file-write, file-read, data-collection