smart-ocr
Extract text from images and scanned documents using PaddleOCR - supports 100+ languages
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/duykhangdangzn1/smar
Smart OCR Skill
Overview
This skill extracts text from images and scanned documents using PaddleOCR, an open-source OCR engine that supports 100+ languages. It reads text from photos, screenshots, scanned PDFs, and handwritten documents with high accuracy.
How to Use
- Provide the image or scanned document
- Optionally specify language(s) to detect
- I'll extract text with position and confidence data
Example prompts:
- "Extract all text from this screenshot"
- "OCR this scanned PDF document"
- "Read the text from this business card photo"
- "Extract Chinese and English text from this image"
Domain Knowledge
PaddleOCR Fundamentals
from paddleocr import PaddleOCR

# Initialize the OCR engine (these arguments follow the PaddleOCR 2.x API)
ocr = PaddleOCR(use_angle_cls=True, lang='en')

# Run OCR on an image
result = ocr.ocr('image.png', cls=True)

# Result structure: one list per image, each holding [box, (text, confidence)] entries.
# result[0] can be None when no text is detected, hence the "or []".
for line in result[0] or []:
    box = line[0]      # [[x1,y1], [x2,y2], [x3,y3], [x4,y4]]
    text = line[1][0]  # Extracted text
    conf = line[1][1]  # Confidence score
    print(f"{text} ({conf:.2f})")
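Because each entry pairs a four-point box with a (text, confidence) tuple, it is often convenient to flatten one page of output into plain records and drop low-confidence lines. The sketch below assumes the result structure shown above; `to_records` and the 0.5 threshold are hypothetical choices made for illustration, not part of PaddleOCR.

```python
from typing import Optional


def to_records(page: Optional[list], min_conf: float = 0.5) -> list:
    """Flatten one page of OCR output into dicts, skipping low-confidence lines."""
    records = []
    for box, (text, conf) in page or []:  # page may be None when nothing is detected
        if conf < min_conf:
            continue
        xs = [pt[0] for pt in box]
        ys = [pt[1] for pt in box]
        records.append({
            "text": text,
            "confidence": float(conf),
            # Axis-aligned rectangle enclosing the four corner points
            "bbox": (min(xs), min(ys), max(xs), max(ys)),
        })
    return records


# Hand-written sample in the same shape as ocr.ocr(...)[0]; not real OCR output
sample_page = [
    [[[10, 10], [110, 10], [110, 40], [10, 40]], ("Hello", 0.98)],
    [[[10, 50], [90, 52], [90, 80], [10, 78]], ("w0rld", 0.31)],
]
records = to_records(sample_page)
print(records)
```

With a real engine, the call would look like `to_records(ocr.ocr('image.png', cls=True)[0])`.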
Supported Languages
# Common language codes
languages = {
    'en': 'English',
    'ch': 'Chinese (Simplified)',
    'chinese_cht': 'Chinese (Traditional)',
    'japan': 'Japanese',
    'korean': 'Korean',
    'french': 'French',
    'german': 'German',
    'es': 'Spanish',
    'ru': 'Russian',
    'arabic': 'Arabic',
    'hi': 'Hindi',
    'vi': 'Vietnamese',
    'th': 'Thai',
    # ... 100+ languages supported
}
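# Use a specific language
ocr = PaddleOCR(lang='ch')     # Chinese (the 'ch' model also reads English)
ocr = PaddleOCR(lang='japan')  # Japanese
# Note: 'multilingual' does not appear among PaddleOCR's documented language codes,
# and the engine does not auto-detect the language; pass the code that matches the document.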
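Each `PaddleOCR` instance loads the recognition model for a single `lang` code, so when the language of an image is unknown, one rough heuristic is to run a few candidate models and keep the one whose lines score the highest mean confidence. The sketch below shows only the selection step, applied to (text, confidence) pairs that have already been collected. `best_language` is a hypothetical helper, and the heuristic itself is an assumption, not a PaddleOCR feature.

```python
from statistics import mean


def best_language(lines_by_lang):
    """Return the language code whose recognized lines have the highest mean confidence.

    lines_by_lang maps a language code to a list of (text, confidence) pairs.
    A language with no recognized lines scores 0.0.
    """
    def score(lang):
        lines = lines_by_lang[lang]
        return mean(conf for _, conf in lines) if lines else 0.0

    return max(lines_by_lang, key=score)


# Illustrative numbers only, not real OCR output
candidates = {
    "en": [("He11o", 0.42), ("wor1d", 0.55)],
    "ch": [("你好", 0.97), ("世界", 0.93)],
    "japan": [],
}
print(best_language(candidates))  # prints "ch"
```

In practice the pairs would come from separate `ocr.ocr(...)` calls made with different `lang` values. Loading several models costs extra time and memory, so the candidate list is best kept short.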
Configuration Options
from paddleocr import PaddleOCR

ocr = PaddleOCR(
    # Detection settings
    det_model_dir=None,        # Custom detection model
    det_limit_side_len=960,    # Max side length for detection
    det_db_thresh=0.3,         # Binarization threshold
    det_db_box_thresh=0.5,     # Box score threshold
    # Recognition settings
    rec_model_dir=None,        # Custom recognition model
    rec_char_dict_path=None,   # Custom character dictionary
    # Angle classification
    use_angle_cls=True,        # Enable angle classification
    cls_model_dir=None,        # Custom classification model
    # Language
    lang='en',                 # Language code
    # Performance
    use_gpu=True,              # Use GPU if available
    gpu_mem=500,               # GPU memory limit (MB)
    enable_mkldnn=True,        # CPU optimization
    # Output
    show_log=False,            # Suppress logs
)
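To make `det_limit_side_len` more concrete: before detection, an image whose longer side exceeds the limit is scaled down, and both sides are then snapped to multiples of 32. The sketch below approximates that rule, assuming the limit applies to the longer side (the "max" limit type). `detection_input_size` is a hypothetical helper written for illustration, not a PaddleOCR function, and the library's exact rounding may differ.

```python
def detection_input_size(height, width, limit_side_len=960):
    """Approximate the (height, width) an image is resized to before text detection."""
    ratio = 1.0
    longest = max(height, width)
    if longest > limit_side_len:
        # Shrink so the longer side matches the limit; smaller images keep their scale
        ratio = limit_side_len / longest

    def snap(value):
        # Round to the nearest multiple of 32, with a floor of 32
        return max(int(round(value / 32) * 32), 32)

    return snap(int(height * ratio)), snap(int(width * ratio))


print(detection_input_size(1080, 1920))  # prints (544, 960)
print(detection_input_size(100, 200))    # prints (96, 192)
```

Raising the limit preserves more detail for small text, at the cost of slower detection and higher memory use.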
Processing Different Sources
Image Files
# Single image
result = ocr.ocr('image.png')
# Multiple images
images = ['img1.png', 'img2.png', 'img3.png']
for img in images:...
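The loop body above is truncated in the source, so what follows is only one possible way to process a batch, not the skill's actual code. The sketch runs an OCR callable over several paths, collects the recognized text per file, and records failures instead of stopping. `ocr_batch` and `fake_ocr` are hypothetical names; in real use the callable could be something like `lambda p: ocr.ocr(p, cls=True)`.

```python
def ocr_batch(paths, run_ocr):
    """Run run_ocr on each path; return ({path: [text, ...]}, {path: error message})."""
    texts, errors = {}, {}
    for path in paths:
        try:
            result = run_ocr(path)
        except Exception as exc:  # keep going if one file fails
            errors[path] = str(exc)
            continue
        page = result[0] if result else None  # first (only) page of a single image
        texts[path] = [line[1][0] for line in page or []]
    return texts, errors


def fake_ocr(path):
    """Stand-in for ocr.ocr(path) that returns data in the same shape."""
    if path == "missing.png":
        raise FileNotFoundError(path)
    box = [[0, 0], [50, 0], [50, 20], [0, 20]]
    return [[[box, (f"text from {path}", 0.9)]]]


texts, errors = ocr_batch(["img1.png", "missing.png"], fake_ocr)
print(texts)
print(errors)
```

Passing the OCR call in as a parameter keeps the batching logic independent of the engine, so it can be exercised without loading any models.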
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-duykhangdangzn1-smar": {
      "enabled": true,
      "auto_update": true
    }
  }
}