Official Verified productivity Safety 4/5

General OCR Struct

Skill by 9penny


What This Skill Does

The General OCR Struct skill provides an offline-first optical character recognition (OCR) engine for OpenClaw. Powered by RapidOCR, it turns the pixels in screenshots, scans, and other images into usable text. It keeps recognition separate from interpretation, so the raw text is captured before any downstream processing (such as summarization or data entry) takes place. It is particularly suited to mixed-language content, such as combined Chinese and English text, and to complex layouts such as tabular transaction data and receipts.

Installation

To integrate this skill into your OpenClaw environment, execute the following command in your terminal:

clawhub install openclaw/skills/skills/9penny/general-ocr-struct

Make sure a compatible Python environment is installed. The first time you execute the command, the skill may run a one-time dependency installation to fetch the libraries it needs. Once initialized, the skill runs entirely offline, which keeps your data on your machine and avoids the latency of external API calls.

Use Cases

This skill is designed for scenarios where accuracy and data integrity are paramount:

Financial Reconciliation: Extract transaction details from bank statements or e-wallet screenshots, organizing them into machine-readable formats.
Content Digitization: Convert images of scanned documents, meeting minutes, or handwritten notes into editable digital text.
Workflow Automation: Capture data from recurring reports or chat screenshots for automatic insertion into Excel or database schemas.
Verification: Act as a middleware layer where the user reviews the raw OCR text before an AI agent generates a summary or makes a business decision based on the content.
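
The reconciliation and automation cases above depend on turning loose OCR detections back into table rows. The following is a minimal sketch of one way to do that, not the skill's own implementation. It assumes each detection is a (box, text, score) triple in which box holds four [x, y] corner points; the skill's actual output format is not documented here. The function name group_into_rows and the y_tolerance parameter are hypothetical.

```python
import json


def group_into_rows(detections, y_tolerance=10):
    """Group OCR detections into rows by vertical position.

    detections: list of (box, text, score) triples, where box is four
    [x, y] corner points. This input shape is an assumption.
    """
    items = []
    for box, text, _score in detections:
        xs = [point[0] for point in box]
        ys = [point[1] for point in box]
        # Use the left edge for column order and the vertical centre for rows.
        items.append({"x": min(xs), "y": sum(ys) / len(ys), "text": text})

    items.sort(key=lambda item: item["y"])

    rows = []
    for item in items:
        # Join the current row if this item sits close to the row's first item.
        if rows and abs(item["y"] - rows[-1]["y"]) <= y_tolerance:
            rows[-1]["cells"].append(item)
        else:
            rows.append({"y": item["y"], "cells": [item]})

    # Order the cells of each row from left to right.
    return [
        [cell["text"] for cell in sorted(row["cells"], key=lambda c: c["x"])]
        for row in rows
    ]


# Made-up sample data for illustration only.
sample = [
    ([[200, 10], [260, 10], [260, 30], [200, 30]], "-35.00", 0.98),
    ([[10, 12], [120, 12], [120, 32], [10, 32]], "Coffee", 0.97),
    ([[10, 52], [120, 52], [120, 72], [10, 72]], "Lunch", 0.95),
    ([[200, 50], [260, 50], [260, 70], [200, 70]], "-82.50", 0.96),
]
print(json.dumps(group_into_rows(sample), ensure_ascii=False))
```

The tolerance is expressed in pixels, so it would need tuning to the font size and resolution of the screenshots being processed.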

Example Prompts

"Run raw OCR on the screenshot in my downloads folder and show me the text output so I can check for errors."
"Extract the transaction rows from this bank screenshot and give me the output in JSON format so I can import it into my spreadsheet."
"Please process this image of a receipt, group the items by price and description, and mark any blurry parts as 待确认."

Tips & Limitations

For best results, use high-resolution input images. If the OCR engine struggles to identify characters, the text output may be fragmented or incorrect. Do not infer data when the text is illegible; the skill is configured to mark such text as '待确认' (to be confirmed) instead of guessing. Following the 'Recognize -> Review -> Structure' workflow helps keep AI hallucinations from propagating into your downstream data pipelines.
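
The review step can be illustrated with a small sketch that flags uncertain text instead of guessing at it. It assumes each recognized fragment comes with a confidence score between 0 and 1; the function name mark_uncertain and the 0.8 threshold are hypothetical choices for illustration, not settings taken from the skill.

```python
PLACEHOLDER = "待确认"  # "to be confirmed", the marker named in the text above


def mark_uncertain(fragments, min_score=0.8):
    """Replace low-confidence or empty fragments with the placeholder.

    fragments: list of (text, score) pairs; the 0..1 score range and the
    default threshold are assumptions.
    """
    reviewed = []
    for text, score in fragments:
        if score < min_score or not text.strip():
            reviewed.append(PLACEHOLDER)
        else:
            reviewed.append(text)
    return reviewed


# Made-up sample data for illustration only.
print(mark_uncertain([("Total 58.00", 0.97), ("T0ta1", 0.42)]))
```

Keeping the placeholder in the output, and not dropping the fragment, leaves a visible gap for the reviewer to fill before anything is structured.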


Metadata

Author: @9penny

Stars: 4473

Updated: 2026-05-01



Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-9penny-general-ocr-struct": {
      "enabled": true,
      "auto_update": true
    }
  }
}
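
If clawhub.json already lists other plugins, the entry has to be merged in, not pasted over the file. A minimal sketch of that merge follows, assuming the file is plain JSON with a top-level "plugins" object as shown above. The helper names enable_plugin and update_config_file are hypothetical and not part of any ClawHub tooling.

```python
import json
from pathlib import Path

PLUGIN_KEY = "official-9penny-general-ocr-struct"


def enable_plugin(config):
    """Return a copy of config with the OCR plugin entry added or replaced."""
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))
    plugins[PLUGIN_KEY] = {"enabled": True, "auto_update": True}
    merged["plugins"] = plugins
    return merged


def update_config_file(path="clawhub.json"):
    """Read the config if it exists, merge the entry, and write it back."""
    config_path = Path(path)
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    merged = enable_plugin(config)
    config_path.write_text(json.dumps(merged, indent=2) + "\n", encoding="utf-8")
```

Existing plugin entries are preserved; only the key for this skill is added or replaced.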

Tags (AI)

#ocr #automation #productivity #data-extraction #image-processing

Safety Score: 4/5

Flags: file-read, code-execution

Related Skills

repo-analysis

Read, explain, and evaluate a software repository or GitHub project in an engineering-oriented way. Use when the user asks to read a repo, understand a codebase, analyze architecture, evaluate whether a project is worth following or adopting, prepare onboarding notes, or summarize stack, module boundaries, risks, and entry points. Supports three output modes: 速读版 (quick read), 架构版 (architecture), and 接手评审版 (handover review). Also supports a lightweight GitHub health layer for public repositories when the user asks whether a project is worth following, adopting, or referencing. Triggers include requests like 读一下这个项目 ("read through this project"), 看看这个 GitHub 仓库 ("take a look at this GitHub repository"), 分析一下 repo ("analyze this repo"), 这个项目怎么样 ("how is this project?"), 帮我快速理解代码结构 ("help me quickly understand the code structure"), 给我一个架构分析 ("give me an architecture analysis"), or 给我一个接手评审 ("give me a handover review").

9penny 4473