openocr-skills

Extract text from images, documents, and scanned PDFs using OpenOCR. Supports text detection, text recognition, universal VLM recognition, and document parsing with layout analysis.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/topdu/opencr-skill

OpenOCR Skill

Overview

This skill enables text extraction, document parsing, and universal recognition using OpenOCR, an accurate and efficient general OCR system. It provides a unified interface for text detection, text recognition, end-to-end OCR, VLM-based universal recognition (text, formulas, and tables), and document parsing with layout analysis. It supports Chinese, English, and other languages.

How to Use

Provide the image, scanned document, or PDF
Optionally specify the task type (det/rec/ocr/unirec/doc)
I'll extract text, formulas, tables, or full document structure

Example prompts:

"Extract all text from this image"
"Detect text regions in this photo"
"Recognize the formula in this screenshot"
"Parse this PDF document with layout analysis"
"Convert this scanned page to Markdown"

Domain Knowledge

OpenOCR Fundamentals

from openocr import OpenOCR

# Initialize with a specific task
engine = OpenOCR(task='ocr')

# Run OCR on an image (callable interface)
results, time_dicts = engine(image_path='image.jpg')

# Results contain detected boxes with recognized text
for result in results:
    for line in result:
        box = line[0]       # Bounding box coordinates
        text = line[1][0]   # Recognized text
        conf = line[1][1]   # Confidence score
        print(f"{text} ({conf:.2f})")
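The loop above unpacks each line as a bounding box plus a (text, confidence) pair. As a minimal sketch that assumes exactly that layout, the helper below keeps only lines above a confidence threshold. `collect_text`, `min_conf`, and the sample data are hypothetical names and values introduced for illustration; they are not part of OpenOCR.

```python
from typing import List


def collect_text(results: list, min_conf: float = 0.5) -> List[str]:
    """Return recognized strings whose confidence is at least min_conf.

    Assumes each line looks like [box, (text, confidence)], as in the
    loop above.
    """
    kept = []
    for result in results:
        for box, (text, conf) in result:
            if conf >= min_conf:
                kept.append(text)
    return kept


# Hand-written sample in the assumed layout (not real OpenOCR output).
sample_results = [[
    [[[0, 0], [100, 0], [100, 20], [0, 20]], ("Hello", 0.98)],
    [[[0, 30], [100, 30], [100, 50], [0, 50]], ("w0rld", 0.31)],
]]

print(collect_text(sample_results))  # only the high-confidence line is kept
```

With real output, `results` from `engine(image_path=...)` would be passed in place of `sample_results`, provided it follows the same structure.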

Supported Tasks

# Available task types
tasks = {
    'det':    'Text Detection - detect text regions with bounding boxes',
    'rec':    'Text Recognition - recognize text from cropped images',
    'ocr':    'End-to-End OCR - detection + recognition pipeline',
    'unirec': 'Universal Recognition - VLM-based text/formula/table recognition (0.1B params)',
    'doc':    'Document Parsing - layout analysis + universal recognition (0.1B params)',
}

# Task selection via parameter
det_engine = OpenOCR(task='det')
rec_engine = OpenOCR(task='rec')
ocr_engine = OpenOCR(task='ocr')
unirec_engine = OpenOCR(task='unirec')
doc_engine = OpenOCR(task='doc')
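Because the task is passed as a plain string, a typo would only surface when the engine is constructed. The sketch below checks a task name against the five values listed above before any engine is created. `normalize_task` is a hypothetical helper, and falling back to 'ocr' for empty input is an assumption of this example, not documented OpenOCR behavior.

```python
VALID_TASKS = ("det", "rec", "ocr", "unirec", "doc")


def normalize_task(task: str = "", default: str = "ocr") -> str:
    """Lower-case and validate a task name; fall back to default if empty."""
    cleaned = task.strip().lower() if task else ""
    if not cleaned:
        return default  # assumption: treat a missing task as end-to-end OCR
    if cleaned not in VALID_TASKS:
        raise ValueError(
            f"Unknown task {task!r}; expected one of {', '.join(VALID_TASKS)}"
        )
    return cleaned


print(normalize_task(" DOC "))
print(normalize_task(""))
```

The validated value could then be handed to `OpenOCR(task=...)` as in the snippet above.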

Configuration Options

from openocr import OpenOCR

# === Text Detection ===
detector = OpenOCR(
    task='det',
    backend='onnx',                          # 'onnx' (default) or 'torch'
    onnx_det_model_path=None,                # Custom detection model (auto-downloads if None)
    use_gpu='auto',                          # 'auto', 'true', or 'false'
)

# === Text Recognition ===
recognizer = OpenOCR(
    task='rec',
    mode='mobile',                           # 'mobile' (fast) or 'server' (accurate)
    backend='onnx',                          # 'onnx' (default) or 'torch'
    onnx_rec_model_path=None,                # Custom recognition model
    use_gpu='auto',
)
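The options above take a small set of string values. The following sketch collects them into a keyword dictionary and rejects values outside the sets named in the comments. `build_options` is a hypothetical helper; it only assembles arguments and does not import or call OpenOCR. It includes `mode` only when supplied, since the source shows that option for recognition only.

```python
from typing import Dict, Optional

BACKENDS = {"onnx", "torch"}
GPU_CHOICES = {"auto", "true", "false"}
MODES = {"mobile", "server"}


def build_options(
    task: str,
    backend: str = "onnx",
    use_gpu: str = "auto",
    mode: Optional[str] = None,
) -> Dict[str, str]:
    """Validate option values and return them as keyword arguments."""
    if backend not in BACKENDS:
        raise ValueError(f"backend must be one of {sorted(BACKENDS)}")
    if use_gpu not in GPU_CHOICES:
        raise ValueError(f"use_gpu must be one of {sorted(GPU_CHOICES)}")
    options = {"task": task, "backend": backend, "use_gpu": use_gpu}
    if mode is not None:
        if mode not in MODES:
            raise ValueError(f"mode must be one of {sorted(MODES)}")
        options["mode"] = mode
    return options


rec_options = build_options("rec", mode="server")
print(rec_options)
# The result could then be unpacked as OpenOCR(**rec_options).
```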

Metadata

Author: @topdu

Stars: 946

Updated: 2026-02-13

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-topdu-opencr-skill": {
      "enabled": true,
      "auto_update": true
    }
  }
}
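If clawhub.json already lists other plugins, pasting the fragment wholesale would overwrite them. As a rough sketch, the entry can be merged into an existing configuration instead. `enable_plugin` and the `some-other-plugin` entry are hypothetical and exist only for this example; the code works on an in-memory dictionary and assumes nothing about the clawhub tool beyond the JSON shape shown above.

```python
import json


def enable_plugin(config: dict, plugin_id: str, auto_update: bool = True) -> dict:
    """Return a copy of config with plugin_id enabled, keeping other plugins."""
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))
    plugins[plugin_id] = {"enabled": True, "auto_update": auto_update}
    merged["plugins"] = plugins
    return merged


# Hypothetical existing configuration with one unrelated plugin.
existing = {"plugins": {"some-other-plugin": {"enabled": False}}}
updated = enable_plugin(existing, "official-topdu-opencr-skill")
print(json.dumps(updated, indent=2))
```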

Related Skills

comparison-table-gen

Auto-generates comparison tables for concepts, drugs, or study results in Markdown format.

aipoch-ai 4473

AB-Agents-Vision-MiniMax

👁️ Image analysis via MiniMax VL API. Describe images, extract text from screenshots, analyze photos. Requires MiniMax Token Plan API key (free tier available).

alexburrstudio 4473

AB-Agents-Vision

👁️ Image analysis using MiniMax VL API. Describe images, extract text from screenshots, analyze photos. Works with local files and URLs. Simple shell wrapper.

alexburrstudio 4473

DocPilot

Intelligent document processing expert that supports document parsing, information extraction, and document classification.

ankylala 4473
