ClawKit Logo
ClawKitReliability Toolkit
Back to Registry
Official Verified data analysis Safety 4/5

paddleocr-text-recognition

Use this skill whenever the user wants text extracted from images, photos, scans, screenshots, or scanned PDFs. Returns exact machine-readable strings with line-level text and optional bbox coordinates. Strong accuracy for CJK, small print, and handwritten text. Trigger terms: OCR, 文字识别, 图片转文字, 截图识字, 提取图中文字, 扫描识字, 识字, 纯文字, plain text extraction, 坐标, 检测框, bbox, bounding box, image to text, screenshot, photo scan, recognize text.

skill-install — Terminal

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/bobholamovic/paddleocr-text-recognition
Or

What This Skill Does

The paddleocr-text-recognition skill serves as a high-performance interface for optical character recognition (OCR) within the OpenClaw environment. It leverages the robust PaddleOCR engine to extract printed or handwritten text, along with coordinate-based spatial information, from images and PDF documents. This skill is designed to act as an automated vision bridge, enabling the agent to interpret complex visual inputs such as invoices, screenshots, or scanned documents without requiring native image interpretation logic. The tool manages the execution flow via a dedicated Python wrapper, ensuring that data is processed, parsed, and presented back to the agent in a structured JSON format.

Installation

To integrate this skill into your OpenClaw agent, execute the following command in your terminal: clawhub install openclaw/skills/skills/bobholamovic/paddleocr-text-recognition Once the repository is synced, navigate to the skill directory (skills/paddleocr-text-recognition) and prepare your environment by running the dependency installer: pip install -r scripts/requirements.txt

Use Cases

This skill is indispensable for workflows requiring data extraction from non-textual files. Ideal scenarios include:

  • Converting image-based receipts or invoices into structured machine-readable text for expense reporting.
  • Digitizing historical scans or images of documents that lack underlying text layers.
  • Extracting specific fields, tables, or text blocks from structured forms located in images or PDF files.
  • Automating the ingestion of data from URLs pointing to visual content, such as web-based product specifications or diagrams.

Example Prompts

  1. "Please extract all the text from this invoice image and save it as a JSON object so I can process the total amount."
  2. "I've uploaded a screenshot of a technical error log. Use PaddleOCR to read the text and highlight any lines containing 'Critical Failure'."
  3. "Go to this URL: https://example.com/receipt.pdf, run OCR on the document, and list all line items found in the table."

Tips & Limitations

  • Direct API Execution: Always utilize the scripts/ocr_caller.py script provided. Do not attempt to process images using alternate methods.
  • Error Handling: If the API returns an error, display it clearly to the user. Do not attempt to use built-in vision capabilities as a fallback, as this violates strict operational protocols.
  • File Management: By default, results are saved in the system temp directory. Utilize the --stdout flag if you prefer immediate parsing without persistence, or define a specific path with --output for better file management.
  • Input Precision: Ensure the file path or URL provided is accessible and correctly formatted to avoid API timeouts or connection failures.

Metadata

Stars4190
Views0
Updated2026-04-18
View Author Profile
AI Skill Finder

Not sure this is the right skill?

Describe what you want to build — we'll match you to the best skill from 16,000+ options.

Find the right skill
Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-bobholamovic-paddleocr-text-recognition": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags(AI)

#ocr#image-processing#text-recognition#data-extraction#paddleocr
Safety Score: 4/5

Flags: file-read, file-write, code-execution