Official | Verified | file management | Safety 5/5

PDF OCR Extraction

Extract text from scanned PDFs using optical character recognition

Why use this skill?

Convert scanned PDFs and images into searchable, editable text using OpenClaw's OCR skill. Improve document management and data extraction efficiency.

What This Skill Does

The PDF OCR Extraction skill converts static, image-based documents into machine-readable text for OpenClaw users. For scanned invoices, physical contracts, archived books, or images of receipts, it uses Optical Character Recognition (OCR) to identify characters and layout structures, so you can search, copy, and analyze content that was previously locked inside an image. Supported output formats include plain text, structured Markdown tables, and fully searchable PDF files that preserve the original visual appearance of the document.

Installation

To add this capability to your OpenClaw agent, use the following command in your terminal or command-line interface:

clawhub install openclaw/skills/skills/lijie420461340/pdf-ocr

Once installed, the skill integrates directly with your agent's document processing workflow. Make sure the agent has permission to read your source PDFs, whether they live on your local file system or in a cloud storage integration.

Use Cases

Digitizing Paperwork: Convert physical scanned documents into editable digital archives.
Data Entry Automation: Extract data from tables in PDF reports and convert it into structured JSON or Markdown for spreadsheet imports.
Searchable Archiving: Turn batches of legacy PDFs into text-searchable files, making documents easier to find and manage.
Historical Analysis: Process scanned books or archives where text formatting is complex and requires layout preservation.
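
As a rough illustration of the data entry use case, the sketch below turns rows of extracted table cells into a Markdown table and a JSON string. The row structure (a list of dictionaries keyed by column name), the sample values, and the function names are assumptions made for illustration; they are not the skill's documented output format.

```python
import json


def rows_to_markdown(rows):
    """Render a list of dicts (one per table row) as a Markdown table."""
    if not rows:
        return ""
    headers = list(rows[0].keys())
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        # Missing cells become empty strings; pipes are escaped so they
        # do not break the table layout.
        cells = [str(row.get(h, "")).replace("|", "\\|") for h in headers]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


def rows_to_json(rows):
    """Serialize the same rows as pretty-printed JSON."""
    return json.dumps(rows, indent=2, ensure_ascii=False)


# Hypothetical rows, as if read from a scanned invoice table.
sample_rows = [
    {"vendor": "Acme Corp", "date": "2024-03-01", "total": "120.50"},
    {"vendor": "Globex", "date": "2024-03-04", "total": "89.00"},
]

print(rows_to_markdown(sample_rows))
print(rows_to_json(sample_rows))
```

Keeping the rows as plain dictionaries means the same data can feed either output without a second extraction pass.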

Example Prompts

"Please OCR the invoice scanned on my desktop and extract the date, total amount, and vendor name into a table."
"Take the document titled 'research_paper_scan.pdf' and generate a searchable PDF version while maintaining the original image layout."
"Extract all text from pages 5 through 12 of the provided document and highlight any words with low confidence levels."

Tips & Limitations

For best results, scan input files at 300 DPI or higher. Documents with poor contrast, extreme skew, or heavy shadows may produce lower confidence scores. Typed text achieves high accuracy (95%+), but handwritten content, especially cursive, is significantly less reliable. If errors persist, pre-process your images to improve brightness and alignment before running the skill. Review the 'Uncertain Text' report generated by the skill whenever the overall confidence score is below 85%.
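
The two numeric guidelines above (300 DPI and the 85% confidence threshold) can be checked with a few lines of arithmetic. The sketch below is a minimal illustration under stated assumptions: the (word, confidence) pairs are a hypothetical structure, the overall score is taken to be an unweighted mean of per-word confidences, and the same 85% cutoff is reused per word. The skill may represent and compute these differently.

```python
MIN_DPI = 300               # minimum scan resolution from the tips above
REVIEW_THRESHOLD = 0.85     # overall confidence below this warrants review


def effective_dpi(pixel_width, page_width_inches):
    """Horizontal scan resolution: pixel width divided by physical width."""
    return pixel_width / page_width_inches


def overall_confidence(words):
    """Unweighted mean confidence over (word, confidence) pairs (assumed)."""
    if not words:
        return 0.0
    return sum(conf for _, conf in words) / len(words)


def low_confidence_words(words, threshold=REVIEW_THRESHOLD):
    """Words whose individual confidence falls below the threshold."""
    return [word for word, conf in words if conf < threshold]


# A US Letter page is 8.5 inches wide; 2550 pixels across gives 300 DPI.
dpi = effective_dpi(2550, 8.5)
print("scan is sharp enough:", dpi >= MIN_DPI)

# Hypothetical recognition results for three words.
words = [("Invoice", 0.99), ("T0tal", 0.62), ("2024-03-01", 0.91)]
if overall_confidence(words) < REVIEW_THRESHOLD:
    print("review these words:", low_confidence_words(words))
```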

Metadata

Author: @lijie420461340

Stars: 1656

Updated: 2026-02-28

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-lijie420461340-pdf-ocr": {
      "enabled": true,
      "auto_update": true
    }
  }
}
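
If your clawhub.json already contains other settings, pasting the snippet over it would discard them. The sketch below shows one way to merge the entry instead. The plugin key and its two fields come from the snippet above; the function names, the file path argument, and the assumption that the file is a single JSON object with a top-level "plugins" key are illustrative only.

```python
import json
from pathlib import Path

PLUGIN_ID = "official-lijie420461340-pdf-ocr"


def with_plugin_enabled(config, plugin_id=PLUGIN_ID):
    """Return a copy of config with the plugin entry added or updated."""
    merged = dict(config)
    # Copy the plugins mapping so the caller's dict is not mutated.
    plugins = dict(merged.get("plugins", {}))
    plugins[plugin_id] = {"enabled": True, "auto_update": True}
    merged["plugins"] = plugins
    return merged


def update_config_file(path):
    """Read the config file if it exists, enable the plugin, write it back."""
    path = Path(path)
    existing = {}
    if path.exists():
        existing = json.loads(path.read_text(encoding="utf-8"))
    updated = with_plugin_enabled(existing)
    path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")
    return updated
```

Calling update_config_file("clawhub.json") from the directory that holds the file would apply the change; the location of clawhub.json on your system is not stated here, so adjust the path as needed.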

Related Skills

career-compass

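Career Compass (职场罗盘) by Barry: a one-stop job-search assistant skill. It combines four modules: résumé parsing and optimization, employment-focused company research, local job search, and mock interviews. Given your personal information or résumé, it generates résumé improvement suggestions, a company research report, and a job-listing sheet, and it can also run mock interviews.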

barry0-0 4473

wechat-article-export

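Multi-format export tool for WeChat Official Accounts. Exports official-account articles as a long screenshot (PNG), PDF, or Markdown, in any one or several of these formats. Trigger phrases: 「導出微信文章」 (export WeChat article), 「公眾號截圖」 (official-account screenshot), 「文章轉PDF」 (article to PDF), 「文章轉Markdown」 (article to Markdown), 「微信導出」 (WeChat export).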

benzking 4473

mailbox-bot

Real mailing address for your AI agent. Receive, scan, and forward postal mail — or send letters and documents. CMRA postal mail infrastructure your agent manages via API.

arbengine 4473

DocPilot

Intelligent document-processing expert that supports document parsing, information extraction, and document classification.

ankylala 4473

AB-Agents-Vision-MiniMax

👁️ Image analysis via MiniMax VL API. Describe images, extract text from screenshots, analyze photos. Requires MiniMax Token Plan API key (free tier available).

alexburrstudio 4473