smart-ocr
Extract text from images and scanned documents using PaddleOCR - supports 100+ languages
Why use this skill?
Smart OCR extracts text from images, screenshots, and scanned documents. It supports 100+ languages with high-accuracy recognition.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/lijie420461340/smart-ocr
What This Skill Does
The Smart OCR (Optical Character Recognition) skill lets the OpenClaw agent read and digitize text from visual sources. Built on the PaddleOCR engine, it processes images, screenshots, scanned PDF documents, and handwritten notes, and it supports over 100 languages. The result is searchable, editable text in place of static image files: for every detected line, the skill returns the recognized text together with its spatial coordinates and a confidence score. This applies equally to complex character sets, business cards, and dense technical documentation.
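To make the per-line output described above concrete, here is a minimal Python sketch of how such results could be represented and flattened into plain text. The `OcrLine` type, its field names, and the sample values are hypothetical illustrations. They are not the skill's actual output schema, which the listing does not specify.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class OcrLine:
    """One detected line (hypothetical shape, assumed for illustration)."""
    box: List[Point]    # four corner points, assumed to start at the top-left
    text: str           # recognized text for this line
    confidence: float   # recognition confidence, assumed to lie in [0, 1]


def to_plain_text(lines: List[OcrLine]) -> str:
    """Join lines top-to-bottom, then left-to-right, by each box's first corner."""
    ordered = sorted(lines, key=lambda ln: (ln.box[0][1], ln.box[0][0]))
    return "\n".join(ln.text for ln in ordered)


# Made-up sample data, not real OCR output.
sample = [
    OcrLine([(40, 120), (300, 120), (300, 150), (40, 150)], "Total: 42.00", 0.97),
    OcrLine([(40, 30), (220, 30), (220, 60), (40, 60)], "INVOICE", 0.99),
]

print(to_plain_text(sample))
```

Sorting on the exact top-left corner is a simplification: lines that sit on the same visual row but differ by a pixel or two vertically may come out in the wrong order, so a real pipeline would likely group rows with a tolerance first.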
Installation
To integrate the Smart OCR skill into your OpenClaw environment, run the following command in your terminal:
clawhub install openclaw/skills/skills/lijie420461340/smart-ocr
Once installed, the agent will have the necessary dependencies to handle image processing requests directly.
Use Cases
- Digitizing Paperwork: Quickly convert scanned receipts, invoices, and contracts into plain text for bookkeeping or database entry.
- Content Extraction: Pull text from infographics, product labels, or screenshots that cannot be selected by traditional copy-paste methods.
- Global Language Support: Easily handle documents written in languages like Chinese, Japanese, Korean, Arabic, and many others without needing separate specialized software.
- Accessibility: Convert printed educational materials or physical signs into digital formats for reading assistants.
- Data Analysis: Extract tabular data or serial numbers from equipment photographs to streamline technical inventory management.
Example Prompts
- "Extract all the text from this invoice image and organize it into a structured format."
- "Can you perform an OCR scan on this PDF and tell me the total amount written on the document?"
- "Read the Japanese text in this product manual image and provide a summary of the installation steps."
Tips & Limitations
For best results, ensure images are high-resolution and well-lit. While PaddleOCR is highly accurate, handwritten text with low contrast may produce lower confidence scores. If you are processing multilingual documents, use the 'multilingual' language configuration for optimal auto-detection. Keep in mind that extremely large high-resolution images may consume significant memory; consider resizing if you encounter performance bottlenecks.
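The two tips about low confidence scores and oversized images can be sketched in a few lines of Python. Both helpers are hypothetical and independent of the skill itself. The 0.8 confidence threshold and the 2000-pixel limit are arbitrary assumptions chosen for illustration, not values recommended by PaddleOCR or by this skill.

```python
from typing import List, Tuple


def split_by_confidence(
    lines: List[Tuple[str, float]], threshold: float = 0.8
) -> Tuple[List[Tuple[str, float]], List[Tuple[str, float]]]:
    """Split (text, confidence) pairs into accepted lines and lines to review."""
    accepted = [(t, c) for t, c in lines if c >= threshold]
    review = [(t, c) for t, c in lines if c < threshold]
    return accepted, review


def fit_within(width: int, height: int, max_side: int = 2000) -> Tuple[int, int]:
    """Return dimensions scaled down so the longer side is at most max_side.

    The aspect ratio is preserved; images already within the limit are unchanged.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


# Made-up values for illustration only.
accepted, review = split_by_confidence([("Invoice 1001", 0.95), ("smudged note", 0.41)])
print(accepted, review)
print(fit_within(8000, 6000))
```

The computed dimensions can then be passed to whatever image library is on hand before the image goes to OCR. Downscaling too aggressively can make small print unreadable, so the limit is worth tuning per document type.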
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-lijie420461340-smart-ocr": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags
Flags: file-read
Related Skills
DocPilot
Intelligent document processing expert; supports document parsing, information extraction, and document classification.
AB-Agents-Vision-MiniMax
👁️ Image analysis via MiniMax VL API. Describe images, extract text from screenshots, analyze photos. Requires MiniMax Token Plan API key (free tier available).
AB-Agents-Vision
👁️ Image analysis using MiniMax VL API. Describe images, extract text from screenshots, analyze photos. Works with local files and URLs. Simple shell wrapper.
xianyu-data-grabber
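Xianyu data-scraping skill. Uses Playwright and OCR to get past anti-scraping measures and collect Xianyu listing data (title, price, number of interested buyers, and so on), then automatically uploads screenshots and data to a Gitee repository. Supports batch keyword search, competitor analysis, and market research.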