Official Verified utilities Safety 5/5

openocr-skills

Extract text from images, documents, and scanned PDFs using OpenOCR, a lightweight and efficient OCR system whose document parsing model requires only 0.1B parameters and can run recognition on personal PCs. Supports text detection, recognition, universal VLM recognition, and document parsing with layout analysis.

Why use this skill?

Efficiently extract text, formulas, and tables from images and PDFs using OpenOCR. Lightweight, accurate, and optimized for local document parsing and layout analysis.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/topdu/openocr-skill

What This Skill Does

The OpenOCR skill provides a lightweight optical character recognition engine for OpenClaw. Built on a 0.1B-parameter model, it lets AI agents read, interpret, and structure data from scanned documents, photographs of text, mathematical formulas, and tables. Unlike heavy, cloud-dependent OCR systems, OpenOCR is designed to run efficiently on local hardware while maintaining accuracy. It supports workflows ranging from simple text detection to full document layout analysis, converting unstructured image data into plain text or structured Markdown. By combining layout analysis with universal recognition, the skill turns physical documents into machine-readable data.

Installation

To integrate OpenOCR into your agent's capability set, use the OpenClaw command-line interface. Run the following command in your terminal:

clawhub install openclaw/skills/skills/topdu/openocr-skill

Ensure that the required system dependencies and Python environment are set up as outlined in the OpenClaw core documentation. Once installed, the skill is registered with your agent automatically.
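
If you prefer to script the installation, the step above could be wrapped roughly as follows. This is a minimal sketch, not part of the skill: `build_install_command` and `install_skill` are hypothetical helper names, and the only thing taken from this page is the `clawhub install` command itself.

```python
import shutil
import subprocess

# Skill path copied from the install command above.
SKILL_PATH = "openclaw/skills/skills/topdu/openocr-skill"


def build_install_command(skill: str = SKILL_PATH) -> list[str]:
    """Return the install command as an argument list (no shell quoting needed)."""
    return ["clawhub", "install", skill]


def install_skill() -> int:
    """Run the install command and return its exit code.

    Hypothetical helper: it only checks that a `clawhub` executable is on
    PATH before running the documented command.
    """
    if shutil.which("clawhub") is None:
        raise RuntimeError("clawhub not found on PATH")
    return subprocess.run(build_install_command(), check=False).returncode
```

Calling `install_skill()` returns the CLI's exit code, so a setup script can stop early when the installation fails.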

Use Cases

Digitization & Archiving: Convert legacy paper records, invoices, and physical receipts into searchable digital databases or spreadsheets.
Academic & Research Assistance: Extract complex mathematical equations and technical formulas from screenshots or research papers to perform further symbolic computation.
Document Structuring: Perform layout analysis on multi-column documents or forms, preserving the logical order of text and data tables.
Data Extraction: Automate the entry of information from identity cards, labels, or product packaging into CRM or ERP systems.

Example Prompts

"Extract all text from this receipt image and format it as a markdown table with date, merchant, and total amount."
"Analyze this PDF page and perform layout analysis to extract the mathematical formulas and text content separately."
"Scan this photo of a handwritten note and convert it into a clean, digital text document."

Tips & Limitations

Performance: The model is optimized for personal PCs, but high-resolution scans or very long documents may benefit from GPU acceleration. Set the use_gpu flag to 'auto' for best results.
Pre-processing: For blurry or low-light images, consider applying basic image enhancement or cropping to the text region before calling the 'det' (detection) task to improve recognition accuracy.
Complexity: The 'doc' task is powerful for layout analysis but requires more memory than the 'rec' (text recognition only) task. Choose the specific task you need rather than defaulting to the more complex one when you only need simple text reading.
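
The task-selection advice above can be sketched as a small helper. This is an illustration only: `choose_task` and its parameters are hypothetical names and not part of the skill's API. Only the task labels 'det', 'rec', and 'doc' come from the tips above, and the selection rule is an assumption based on them.

```python
def choose_task(need_layout: bool, need_text: bool = True) -> str:
    """Pick the lightest task label that covers the request.

    Hypothetical helper: the labels 'det', 'rec', and 'doc' are taken
    from the tips above; the decision rule itself is an assumption.
    """
    if need_layout:
        # Layout analysis (tables, formulas, reading order) calls for
        # the heavier 'doc' task.
        return "doc"
    if need_text:
        # Simple text reading: the lighter 'rec' task is enough.
        return "rec"
    # Only locating text regions, without reading them.
    return "det"
```

For example, a request to read a short label would map to `choose_task(need_layout=False)`, while a multi-column form with tables would map to `choose_task(need_layout=True)`.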

Metadata

Author: @topdu

Stars: 946

Updated: 2026-02-13

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-topdu-openocr-skill": {
      "enabled": true,
      "auto_update": true
    }
  }
}
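
If your clawhub.json already contains other plugins, pasting the snippet by hand risks overwriting them. The merge could be scripted along these lines. This is a minimal sketch under assumptions: `enable_plugin` and `update_config_file` are hypothetical helpers, the file location is whatever path you pass in, and only the plugin key and its two settings come from the snippet above.

```python
import json
from pathlib import Path

# Plugin key copied from the configuration snippet above.
PLUGIN_ID = "official-topdu-openocr-skill"


def enable_plugin(config: dict) -> dict:
    """Return a copy of config with the OpenOCR plugin entry enabled.

    Existing plugins and unrelated top-level keys are preserved; the
    input dictionary is not modified.
    """
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))
    entry = dict(plugins.get(PLUGIN_ID, {}))
    entry.update({"enabled": True, "auto_update": True})
    plugins[PLUGIN_ID] = entry
    merged["plugins"] = plugins
    return merged


def update_config_file(path: Path) -> None:
    """Merge the plugin entry into the JSON file at path (created if missing)."""
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    path.write_text(json.dumps(enable_plugin(config), indent=2) + "\n", encoding="utf-8")
```

A call such as `update_config_file(Path("clawhub.json"))` would then add the entry while leaving any other plugin settings in the file untouched.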

Related Skills

comparison-table-gen

Auto-generates comparison tables for concepts, drugs, or study results in Markdown format.

aipoch-ai 4473

AB-Agents-Vision-MiniMax

👁️ Image analysis via MiniMax VL API. Describe images, extract text from screenshots, analyze photos. Requires MiniMax Token Plan API key (free tier available).

alexburrstudio 4473

AB-Agents-Vision

👁️ Image analysis using MiniMax VL API. Describe images, extract text from screenshots, analyze photos. Works with local files and URLs. Simple shell wrapper.

alexburrstudio 4473

DocPilot

Intelligent document-processing expert supporting document parsing, information extraction, and document classification.

ankylala 4473
