smart-ocr
Extract text from images and scanned documents using PaddleOCR - supports 100+ languages
Why use this skill?
Use the Smart OCR skill to extract text from images and documents. It is built on PaddleOCR and supports 100+ languages.
What This Skill Does
The smart-ocr skill wraps the PaddleOCR engine so that your OpenClaw agent can read and digitize text from images, screenshots, and scanned documents. For each detection it reports the recognized text, its spatial coordinates, and a confidence level, and it supports over 100 languages. It handles both clean printed text and handwritten notes, turning visual data into machine-readable strings.
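The three outputs named above (text, coordinates, confidence) can be pictured as one small record per detected line. The following is a minimal sketch, assuming detections arrive in the classic PaddleOCR layout of [box, (text, confidence)], where box holds four [x, y] corner points. The skill's actual output format is not documented on this page, and Detection and parse_detections are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Detection:
    text: str
    confidence: float
    box: List[Tuple[float, float]]  # four corner points, (x, y) each

    @property
    def top(self) -> float:
        # Smallest y coordinate of the box (image origin is top-left).
        return min(y for _, y in self.box)

    @property
    def left(self) -> float:
        # Smallest x coordinate of the box.
        return min(x for x, _ in self.box)


def parse_detections(raw: Sequence) -> List[Detection]:
    """Convert assumed [box, (text, confidence)] entries into Detection records."""
    detections = []
    for box, (text, confidence) in raw:
        points = [(float(x), float(y)) for x, y in box]
        detections.append(Detection(text, float(confidence), points))
    # Sort roughly top-to-bottom, then left-to-right.
    return sorted(detections, key=lambda d: (d.top, d.left))


# Hand-written sample data standing in for real OCR output.
sample = [
    [[[10, 50], [200, 50], [200, 70], [10, 70]], ("Total: 42.00", 0.97)],
    [[[10, 10], [120, 10], [120, 30], [10, 30]], ("ACME Corp", 0.99)],
]
lines = parse_detections(sample)
print([d.text for d in lines])
```

Sorting by the top edge and then the left edge is a simple approximation of reading order; multi-column layouts would need something more careful.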
Installation
To integrate this skill into your environment, run the following command in your terminal:
clawhub install openclaw/skills/skills/duykhangdangzn1/smar
Ensure that you have the necessary system dependencies for image processing installed. The skill is designed to run efficiently on both CPU and GPU architectures, providing flexibility for resource-constrained or high-performance setups.
Use Cases
- Digitizing Paperwork: Automatically ingest physical invoices, receipts, or contracts into structured data formats.
- Data Extraction: Pull specific information from screenshots or interface captures for database entry or reporting.
- Language Translation: Detect and translate multilingual text within documents by pairing OCR results with translation modules.
- Accessibility: Transform text contained in static images into selectable, searchable, or accessible formats for assistive technology.
- Business Intelligence: Quickly process identification documents, business cards, or shipping labels during automated workflows.
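As an illustration of the invoice and data-extraction use cases, the sketch below pulls a total amount out of text lines that have already been recognized. It assumes the OCR output has been reduced to a plain list of strings; find_total and TOTAL_PATTERN are hypothetical helpers, not part of the skill, and real invoices will need more robust rules.

```python
import re
from typing import List, Optional

# Matches a label such as "Total" or "Amount due" followed by a number.
# The leading \b keeps "Subtotal" from matching.
TOTAL_PATTERN = re.compile(
    r"\b(?:total|amount due)\b\D*?(\d[\d,]*(?:\.\d{1,2})?)",
    re.IGNORECASE,
)


def find_total(lines: List[str]) -> Optional[float]:
    """Return the last total-like amount found in the lines, or None."""
    found = None
    for line in lines:
        match = TOTAL_PATTERN.search(line)
        if match:
            # Drop thousands separators before converting to a number.
            found = float(match.group(1).replace(",", ""))
    return found


ocr_lines = ["ACME Corp", "Subtotal: $38.00", "Tax: $4.00", "Total: $1,042.50"]
print(find_total(ocr_lines))
```

Keeping the last match rather than the first reflects the common layout in which the grand total appears at the bottom of the document.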
Example Prompts
- "Look at this screenshot of the invoice and extract the total amount and vendor name."
- "Read the handwritten notes from this image and compile them into a plain text summary."
- "OCR the attached document and translate the detected Chinese text into English."
Tips & Limitations
To achieve the best results, ensure images have high contrast and good lighting. While PaddleOCR is highly effective, extremely degraded or low-resolution images may yield lower confidence scores. If your text is in a specific language, explicitly declare the language code (e.g., 'ch' for Chinese, 'japan' for Japanese) in your configuration or prompt to increase recognition accuracy. For large-scale batch processing, enable GPU acceleration in the configuration settings to significantly reduce wait times. Note that handwriting recognition is supported but may require higher quality inputs than standard printed text to maintain precision.
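Because degraded inputs tend to produce lower confidence scores, one option is to route low-confidence lines to manual review instead of trusting them. The sketch below assumes each result has been reduced to a (text, confidence) pair with confidence between 0 and 1; split_by_confidence and the 0.8 threshold are illustrative assumptions, not settings defined by the skill.

```python
from typing import List, Tuple

Line = Tuple[str, float]  # (recognized text, confidence between 0 and 1)


def split_by_confidence(
    lines: List[Line], threshold: float = 0.8
) -> Tuple[List[Line], List[Line]]:
    """Partition lines into (accepted, needs_review) using the threshold."""
    accepted = [line for line in lines if line[1] >= threshold]
    needs_review = [line for line in lines if line[1] < threshold]
    return accepted, needs_review


# Hand-written sample data standing in for real OCR output.
results = [("Invoice #1001", 0.98), ("T0tal: 4Z.00", 0.55), ("Thank you", 0.91)]
accepted, needs_review = split_by_confidence(results)
print("accepted:", [text for text, _ in accepted])
print("review:", [text for text, _ in needs_review])
```

A suitable threshold depends on the input quality and on how costly a misread is, so it is worth tuning against a sample of your own documents.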
Metadata
Paste this into your clawhub.json to enable this plugin:
{
  "plugins": {
    "official-duykhangdangzn1-smar": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags
Flags: file-read
Related Skills
DocPilot
Intelligent document-processing expert; supports document parsing, information extraction, and document classification.
AB-Agents-Vision-MiniMax
👁️ Image analysis via MiniMax VL API. Describe images, extract text from screenshots, analyze photos. Requires MiniMax Token Plan API key (free tier available).
AB-Agents-Vision
👁️ Image analysis using MiniMax VL API. Describe images, extract text from screenshots, analyze photos. Works with local files and URLs. Simple shell wrapper.
xianyu-data-grabber
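Xianyu data-scraping skill. Uses Playwright + OCR to get past anti-scraping measures and collect Xianyu listing data (title, price, number of interested buyers, etc.), then automatically uploads screenshots and data to a Gitee repository. Supports batch keyword search, competitor analysis, and market research.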