Official Verified developer tools Safety 4/5

paddleocr-doc-parsing

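Use this skill to extract structured Markdown/JSON from PDFs and document images: tables with cell-level precision, formulas as LaTeX, figures, seals, charts, headers/footers, multi-column layout, and correct reading order. Trigger terms: 文档解析 (document parsing), 版面分析 (layout analysis), 版面还原 (layout restoration), 表格提取 (table extraction), 公式识别 (formula recognition), 多栏排版 (multi-column layout), 扫描件结构化 (structuring scanned documents), 发票 (invoice), 财报 (financial report), 复杂 PDF (complex PDF), PDF转Markdown (PDF to Markdown), 图表 (charts), 阅读顺序 (reading order); reading order, formula, LaTeX, layout parsing, structure extraction, PP-StructureV3, PaddleOCR-VL.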


What This Skill Does

The paddleocr-doc-parsing skill brings PaddleOCR’s document parsing into your OpenClaw agent workflow. It converts images (JPG, PNG, BMP, TIFF) and PDF files into structured, machine-readable output. Using models such as PP-StructureV3, it performs layout analysis to distinguish text blocks, tables, and mathematical formulas. Because the output is structured Markdown, your agents can parse document content while preserving the original headers and document hierarchy. This makes the skill well suited to agents that read invoices, research papers, or archived physical documents.

Installation

To install this skill, run the following command in your terminal within your OpenClaw environment: clawhub install openclaw/skills/skills/bobholamovic/paddleocr-doc-parsing

After installation, configure the environment variables that let the agent communicate with the PaddleOCR API. Set PADDLEOCR_API_URL to your API endpoint and PADDLEOCR_ACCESS_TOKEN to the access token obtained from the official PaddleOCR website.
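
Before running the agent, it can help to confirm that both variables are actually set. The following is a minimal sketch of such a check; the helper name `missing_paddleocr_settings` is hypothetical and not part of the skill, and only the two variable names come from the instructions above.

```python
import os

# Variable names come from the skill's installation notes.
REQUIRED_VARS = ("PADDLEOCR_API_URL", "PADDLEOCR_ACCESS_TOKEN")


def missing_paddleocr_settings(env=None):
    """Return the required variable names that are unset or blank.

    Hypothetical helper for illustration; it does not contact the API.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_paddleocr_settings()
    if missing:
        print("Missing settings: " + ", ".join(missing))
    else:
        print("PaddleOCR settings look complete.")
```

The check only verifies that the values are present; it does not validate the endpoint or the token.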

Use Cases

Automated Data Extraction: Automatically pull line items and totals from invoices or receipts in PDF format for accounting purposes.
Content Digitization: Convert scanned research documents or archives into clean Markdown files for use in RAG (Retrieval-Augmented Generation) databases.
Table Processing: Analyze complex tables in documents and convert them into structured JSON data or Markdown tables for downstream analysis.
Accessibility: Transform image-based documentation into text-based formats suitable for screen readers or automated summary generation.
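
As an illustration of the table-processing use case, the sketch below turns a Markdown table into JSON-ready rows. It assumes the table appears in the output as a pipe-style Markdown table; the skill may emit tables in another form (for example HTML), in which case a different parser would be needed. The function name `parse_markdown_table` and the sample data are hypothetical.

```python
import json


def parse_markdown_table(markdown):
    """Parse the first pipe-style table in `markdown` into a list of dicts.

    Illustrative only: escaped pipes inside cells are not handled.
    """
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if line.startswith("|") and line.endswith("|"):
            rows.append([cell.strip() for cell in line[1:-1].split("|")])
        elif rows:
            break  # the first table has ended
    if len(rows) < 2:
        return []
    header = rows[0]
    # Skip the |---|---| separator row when it is present.
    has_separator = all(c and set(c) <= set("-:") for c in rows[1])
    body = rows[2:] if has_separator else rows[1:]
    return [dict(zip(header, row)) for row in body]


if __name__ == "__main__":
    sample = """
Invoice items

| Item | Qty | Total |
|------|-----|-------|
| Paper | 2 | 9.00 |
| Ink | 1 | 24.50 |

Thank you.
"""
    print(json.dumps(parse_markdown_table(sample), indent=2))
```

All cell values are kept as strings; converting quantities or amounts to numbers is left to the downstream step that knows the column types.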

Example Prompts

"Please parse the PDF document located at ./invoices/march_invoice.pdf and summarize the total amount found in the markdown output."
"Use the paddleocr-doc-parsing skill to extract the table content from this image and convert it into a CSV file format."
"Analyze the research paper at https://example.com/study.pdf and extract all headers and bulleted lists using the OCR skill."

Tips & Limitations

To achieve the best results, ensure your input documents are high-resolution; blurry or low-light images may degrade the accuracy of the character recognition. The skill supports over 110 languages, making it highly versatile, but complex handwritten text may require verification. Always monitor your API quota usage as defined in the official PaddleOCR documentation to avoid service interruptions. When processing large batches of files, consider writing the output to a specific directory using the -o flag to manage storage effectively. If you encounter errors, verify that your access token has sufficient permissions for the specific OCR model version you are requesting.
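
For batch runs, one way to keep results organized is to decide the output locations up front. The sketch below only plans file names and does not invoke the skill or its -o flag; the helper `plan_output_paths` and the example paths are hypothetical.

```python
from pathlib import Path


def plan_output_paths(input_files, output_dir):
    """Map each input file to a unique .md path inside output_dir.

    Hypothetical helper: inputs that share a stem get a numeric suffix
    so that one result does not overwrite another.
    """
    out_root = Path(output_dir)
    planned = {}
    used = set()
    for src in map(Path, input_files):
        candidate = out_root / f"{src.stem}.md"
        counter = 1
        while candidate in used:
            candidate = out_root / f"{src.stem}_{counter}.md"
            counter += 1
        used.add(candidate)
        planned[src] = candidate
    return planned


if __name__ == "__main__":
    files = ["inbox/scan.pdf", "photos/scan.png", "inbox/report.pdf"]
    for src, dst in plan_output_paths(files, "parsed").items():
        print(f"{src} -> {dst}")
```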


Metadata

Author: @bobholamovic

Stars: 4190

Updated: 2026-04-18


Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-bobholamovic-paddleocr-doc-parsing": {
      "enabled": true,
      "auto_update": true
    }
  }
}
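
If clawhub.json already lists other plugins, pasting the snippet over the file would drop them. A minimal sketch of merging the entry instead is shown below, assuming the file follows the "plugins" layout shown above; the helper `enable_plugin` and the example plugin name are hypothetical.

```python
import json

# Plugin key copied from the configuration snippet above.
PLUGIN_NAME = "official-bobholamovic-paddleocr-doc-parsing"


def enable_plugin(config, name=PLUGIN_NAME):
    """Return a copy of `config` with the plugin entry added or updated.

    Assumes the top-level "plugins" mapping shown in the snippet above.
    """
    updated = dict(config)
    plugins = dict(updated.get("plugins", {}))
    entry = dict(plugins.get(name, {}))
    entry.update({"enabled": True, "auto_update": True})
    plugins[name] = entry
    updated["plugins"] = plugins
    return updated


if __name__ == "__main__":
    # Hypothetical existing configuration with an unrelated plugin.
    existing = {"plugins": {"some-other-plugin": {"enabled": True}}}
    print(json.dumps(enable_plugin(existing), indent=2))
```

The function returns a new dictionary rather than editing the input, so the original configuration can be kept as a backup before the file is written.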

Tags (AI)

#ocr #document-parsing #pdf-processing #digitization #text-extraction

Safety Score: 4/5

Flags: file-read, file-write, external-api

Related Skills

paddleocr-text-recognition

Use this skill whenever the user wants text extracted from images, photos, scans, screenshots, or scanned PDFs. Returns exact machine-readable strings with line-level text and optional bbox coordinates. Strong accuracy for CJK, small print, and handwritten text. Trigger terms: OCR, 文字识别 (text recognition), 图片转文字 (image to text), 截图识字 (recognize text in screenshots), 提取图中文字 (extract text from images), 扫描识字 (recognize text in scans), 识字 (recognize characters), 纯文字 (plain text), plain text extraction, 坐标 (coordinates), 检测框 (detection box), bbox, bounding box, image to text, screenshot, photo scan, recognize text.

Author: @bobholamovic · Stars: 4190