ClawKit Reliability Toolkit

openocr-skills

Extract text from images, documents, and scanned PDFs using OpenOCR. Supports text detection, text recognition, universal VLM-based recognition, and document parsing with layout analysis.


Install via CLI (Recommended)

clawhub install openclaw/skills/skills/topdu/openocr-skill

OpenOCR Skill

Overview

This skill enables intelligent text extraction, document parsing, and universal recognition using OpenOCR, an accurate and efficient general-purpose OCR system. It provides a unified interface for text detection, text recognition, end-to-end OCR, VLM-based universal recognition (text, formulas, and tables), and document parsing with layout analysis. It supports Chinese, English, and more languages.

How to Use

  1. Provide the image, scanned document, or PDF
  2. Optionally specify the task type (det/rec/ocr/unirec/doc)
  3. I'll extract text, formulas, tables, or full document structure

Example prompts:

  • "Extract all text from this image"
  • "Detect text regions in this photo"
  • "Recognize the formula in this screenshot"
  • "Parse this PDF document with layout analysis"
  • "Convert this scanned page to Markdown"

Domain Knowledge

OpenOCR Fundamentals

from openocr import OpenOCR

# Initialize with a specific task
engine = OpenOCR(task='ocr')

# Run OCR on an image (callable interface)
results, time_dicts = engine(image_path='image.jpg')

# Results contain detected boxes with recognized text
for result in results:
    for line in result:
        box = line[0]       # Bounding box coordinates
        text = line[1][0]   # Recognized text
        conf = line[1][1]   # Confidence score
        print(f"{text} ({conf:.2f})")

Supported Tasks

# Available task types
tasks = {
    'det':    'Text Detection - detect text regions with bounding boxes',
    'rec':    'Text Recognition - recognize text from cropped images',
    'ocr':    'End-to-End OCR - detection + recognition pipeline',
    'unirec': 'Universal Recognition - VLM-based text/formula/table recognition (0.1B params)',
    'doc':    'Document Parsing - layout analysis + universal recognition (0.1B params)',
}

# Task selection via parameter
det_engine = OpenOCR(task='det')
rec_engine = OpenOCR(task='rec')
ocr_engine = OpenOCR(task='ocr')
unirec_engine = OpenOCR(task='unirec')
doc_engine = OpenOCR(task='doc')

Configuration Options

from openocr import OpenOCR

# === Text Detection ===
detector = OpenOCR(
    task='det',
    backend='onnx',                          # 'onnx' (default) or 'torch'
    onnx_det_model_path=None,                # Custom detection model (auto-downloads if None)
    use_gpu='auto',                          # 'auto', 'true', or 'false'
)

# === Text Recognition ===
recognizer = OpenOCR(
    task='rec',
    mode='mobile',                           # 'mobile' (fast) or 'server' (accurate)
    backend='onnx',                          # 'onnx' (default) or 'torch'
    onnx_rec_model_path=None,                # Custom recognition model
    use_gpu='auto',
)
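Because detection and recognition engines can be created separately, a two-stage pipeline can be sketched around them. The engines below are plain stub callables (and the `crop`/call signatures are simplified stand-ins) so the sketch runs without OpenOCR; real engines would be `OpenOCR(task='det')` and `OpenOCR(task='rec')`:

```python
# Sketch of a detect-then-recognize pipeline with injected engines.
# detect(image) -> list of cropped regions; recognize(crop) -> (text, conf).
def run_pipeline(detect, recognize, image, min_conf=0.5):
    """Run detection, recognize each crop, and drop low-confidence lines."""
    results = []
    for crop in detect(image):
        text, conf = recognize(crop)
        if conf >= min_conf:
            results.append((text, conf))
    return results

# Stub engines standing in for the real models:
fake_detect = lambda image: ['crop-a', 'crop-b']
fake_recognize = lambda crop: ('hello', 0.9) if crop == 'crop-a' else ('??', 0.2)

print(run_pipeline(fake_detect, fake_recognize, 'page.png'))  # [('hello', 0.9)]
```

Passing the engines in as arguments keeps the filtering logic testable without model downloads; with the real library, `task='ocr'` already chains these two stages internally.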

Metadata

Author: @topdu
Stars: 946
Views: 0
Updated: 2026-02-13
Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-topdu-opencr-skill": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags

#ocr #text-detection #text-recognition #document-parsing #vlm #unirec #layout-analysis #formula #table
Safety Note: ClawKit audits metadata but not runtime behavior. Use with caution.