Official Verified browser automation Safety 3/5

captcha-auto

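Intelligent automatic captcha recognition skill with a hybrid mode (local Tesseract OCR + Alibaba Cloud Qwen 3 VL Plus). Supports two-stage input field lookup and security and privacy warnings. Used for recognizing, filling in, and submitting captchas in web automation.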

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/annoyingc/captcha-auto

What This Skill Does

The Captcha Auto skill is a hybrid-engine automation tool for OpenClaw that solves web-based captchas during automated browser tasks. It combines local Tesseract OCR, which is fast and incurs no API cost, with the Alibaba Qwen3-VL Plus vision model, which handles more difficult images. The skill covers the entire captcha workflow: locating the input field on the page, detecting the captcha image, solving it with the appropriate engine, entering the result, and triggering submission. To keep costs low across varying complexity levels, it tries local processing first and falls back to the remote vision model only when necessary.
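The local-first, fallback-second behavior described above can be sketched as follows. This is a minimal illustration, not the skill's actual code: the `OcrResult` type, the `Engine` callable signature, the `solve_captcha` function, and the 0.8 confidence threshold are all assumptions made for the example. The two engines are passed in as plain callables so the routing logic can be read and tested without Tesseract or an API key.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class OcrResult:
    text: str
    confidence: float  # assumed scale: 0.0 (no trust) to 1.0 (full trust)


# Hypothetical engine signature: image bytes in, OcrResult out.
Engine = Callable[[bytes], OcrResult]


def solve_captcha(
    image: bytes,
    local_engine: Engine,
    remote_engine: Engine,
    min_confidence: float = 0.8,  # assumed threshold, not from the skill
) -> Tuple[str, str]:
    """Return (text, engine_used), trying the local engine first."""
    local = local_engine(image)
    cleaned = local.text.strip()
    # Accept the local answer only if it is non-empty and confident enough.
    if cleaned and local.confidence >= min_confidence:
        return cleaned, "local"
    # Otherwise pay for the remote vision model.
    remote = remote_engine(image)
    return remote.text.strip(), "remote"
```

With this shape, the remote engine is never called when the local result is good enough, which is where the cost saving would come from.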

Installation

To install, change to your workspace directory, ~/.openclaw/workspace, and run the following command in your terminal: clawhub install openclaw/skills/skills/annoyingc/captcha-auto. Running it from ~/.openclaw/workspace is critical; installing from your home directory or elsewhere will leave the skill unable to link correctly with your OpenClaw agent. Once installed, confirm the directory structure by checking the contents of ~/.openclaw/workspace/skills/captcha-auto/. Finally, run npm install within the workspace so that all Node dependencies are resolved.

Use Cases

This skill is ideal for developers and QA engineers who require automated interaction with secure web forms. Primary use cases include:

  1. Automated testing of user registration or login flows that implement basic captcha protections.
  2. Data collection tasks that require navigating past static or semi-complex captcha barriers on public-facing websites.
  3. Reducing manual friction in repetitive browser-based workflows, such as portal authentication for internal research or administrative statistics sites where manual human intervention would otherwise halt a script.

Example Prompts

  1. "OpenClaw, please navigate to the registration portal at https://tjy.stats.gov.cn and use the captcha-auto skill to bypass the numeric verification field before submitting the form."
  2. "I need to automate the login process for our internal ASP.NET dashboard. Can you trigger the captcha-auto sequence once the page loads to identify the text in the captcha input box?"
  3. "Run a test on the target URL provided and use the hybrid captcha-auto model to solve the visual challenge; output the final submission status to my logs."

Tips & Limitations

  • Privacy First: Never use this skill on pages containing sensitive personal info, banking data, or password inputs as screenshots are processed via external APIs.
  • Configuration: Ensure VISION_API_KEY is set correctly in your environment variables. Without it, the skill cannot fall back to the vision model for complex captchas.
  • Accuracy: The listing reports a 100% success rate on the platforms it was tested against, but results vary with captcha type; extremely noisy or distorted captchas may still require multiple attempts.
  • Prerequisites: A fully functional installation of Google Chrome or Chromium is mandatory for the browser automation components to interact with the DOM properly.
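Two of the tips above lend themselves to a small pre-flight sketch: checking that VISION_API_KEY is present before relying on the fallback, and bounding the number of attempts on hard captchas. The helper names `vision_key_available` and `submit_with_retries`, and the default of three attempts, are assumptions for illustration and are not part of the skill.

```python
import os
from typing import Callable, Mapping, Optional


def vision_key_available(env: Optional[Mapping[str, str]] = None) -> bool:
    """True when VISION_API_KEY is set to a non-blank value."""
    env = os.environ if env is None else env
    return bool(env.get("VISION_API_KEY", "").strip())


def submit_with_retries(attempt: Callable[[], bool], max_attempts: int = 3) -> int:
    """Call attempt() until it succeeds.

    Returns the number of tries used, or 0 if every attempt failed.
    """
    for tries in range(1, max_attempts + 1):
        if attempt():
            return tries
    return 0


if __name__ == "__main__":
    if not vision_key_available():
        print("VISION_API_KEY is not set; only local OCR would be usable.")
```

Here `attempt` stands in for one full solve-and-submit cycle; capping the attempts avoids hammering a site when a captcha is simply beyond both engines.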

Metadata

Author: @annoyingc
Stars: 4473
Views: 0
Updated: 2026-05-01

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-annoyingc-captcha-auto": {
      "enabled": true,
      "auto_update": true
    }
  }
}
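If clawhub.json already lists other plugins, pasting the snippet over the file would drop them. A merge along the following lines keeps existing entries intact. This is a sketch that assumes the file is plain JSON with a top-level "plugins" object, as the snippet above suggests; `enable_plugin` is a hypothetical helper, not a ClawHub API.

```python
import json

PLUGIN_ID = "official-annoyingc-captcha-auto"


def enable_plugin(config: dict) -> dict:
    """Return a copy of config with the captcha-auto plugin entry added."""
    merged = dict(config)
    # Copy the plugins mapping so the caller's dict is not mutated.
    plugins = dict(merged.get("plugins", {}))
    plugins[PLUGIN_ID] = {"enabled": True, "auto_update": True}
    merged["plugins"] = plugins
    return merged


if __name__ == "__main__":
    existing = json.loads('{"plugins": {"some-other-plugin": {"enabled": true}}}')
    print(json.dumps(enable_plugin(existing), indent=2))
```

The entry written is exactly the one shown in the snippet; other keys in the file pass through unchanged.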

Tags (AI)

#browser-automation #captcha-solver #ocr #web-scraping #computer-vision

Safety Score: 3/5

Flags: external-api, file-read, file-write, network-access