Official | Verified | File management | Safety: 4/5

pdf-cn

PDF Document Processing. Read, extract, merge, and split PDFs. Supports text extraction, table recognition, and annotations. Trigger words: PDF, pdf.

Why use this skill?

Efficiently extract, merge, split, and parse PDF documents with the pdf-cn skill. Perfect for automating document workflows and data extraction tasks.

What This Skill Does

The pdf-cn skill is a toolkit for manipulating PDF documents directly within the OpenClaw environment. It provides programmatic interfaces for extracting raw text, parsing tables into structured data, merging multiple PDF files, splitting documents into individual pages, and rotating pages. It is built on the pypdf, pdfplumber, and reportlab libraries, so document workflows that would otherwise need manual work or proprietary software can be automated. For research papers, financial reports, or standardized forms, pdf-cn converts static PDF layouts into machine-readable formats such as Excel or plain text.

Installation

To integrate this skill into your OpenClaw agent, use the official installation command provided by the repository:

clawhub install openclaw/skills/skills/guohongbin-git/pdf-cn

Make sure the Python dependencies are installed in your environment: pypdf and pdfplumber, plus pandas if you intend to export extracted tables to Excel.
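A quick way to confirm the dependencies are present is to ask Python whether each module can be found. This is a minimal sketch using only the standard library; the module list mirrors the packages named above, and the helper name `missing_modules` is hypothetical rather than part of the skill.

```python
import importlib.util

# Import names taken from the paragraph above; adjust the list to your workflow.
REQUIRED_MODULES = ["pypdf", "pdfplumber", "pandas"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All PDF dependencies are importable.")
```

Anything reported as missing can then be installed with your usual package manager before running the skill.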

Use Cases

Data Digitization: Extract tabular data from reports or invoices into structured CSV/Excel files for analysis. Scanned, image-based documents need an OCR step first.
Document Management: Programmatically combine individual PDF documents into a single master report or split large multi-page manuals into searchable, single-page files.
Information Extraction: Automate the retrieval of metadata or specific text sections from high-volume document archives.
Reporting Automation: Create new PDF reports from scratch using custom layouts and text data retrieved during the agent execution flow.
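The data digitization use case could be sketched roughly as follows. This is not the skill's own implementation, which the listing does not show. It assumes pdfplumber's `extract_tables()` output (a list of tables, each a list of rows, with `None` for empty cells) and writes CSV with the standard library; the function names and the output file-naming scheme are hypothetical.

```python
import csv


def clean_table(table):
    """Normalize one table (a list of rows) as returned by pdfplumber.

    Empty cells arrive as None; they become empty strings, and
    surrounding whitespace in text cells is stripped.
    """
    return [
        ["" if cell is None else str(cell).strip() for cell in row]
        for row in table
    ]


def write_table_csv(table, csv_path):
    """Write one cleaned table to a CSV file."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(clean_table(table))


def extract_tables_to_csv(pdf_path, out_prefix):
    """Save every detected table as <out_prefix>_p<page>_t<n>.csv."""
    import pdfplumber  # imported lazily: only this function needs it

    written = []
    with pdfplumber.open(pdf_path) as pdf:
        for page_number, page in enumerate(pdf.pages, start=1):
            for table_number, table in enumerate(page.extract_tables(), start=1):
                csv_path = f"{out_prefix}_p{page_number}_t{table_number}.csv"
                write_table_csv(table, csv_path)
                written.append(csv_path)
    return written
```

Writing one CSV per table keeps the sketch dependency-light; exporting to a single Excel workbook would typically go through pandas instead.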

Example Prompts

"PDF: Extract all tables from the attached document and save them into a file named summary.xlsx."
"pdf: Split the file 'Project_Specs.pdf' into individual pages and rename them based on the page number."
"PDF: Merge the three files 'part1.pdf', 'part2.pdf', and 'part3.pdf' into one complete document called 'Final_Project.pdf'."
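For readers curious what the merge and split prompts above might translate to, here is a minimal sketch using pypdf, one of the libraries the skill is said to build on. It is an illustration under assumptions, not the skill's actual code: the function names and the `_page_NNN` naming scheme are hypothetical, and `PdfWriter.append` assumes a reasonably recent pypdf release (3.x or later).

```python
from pathlib import Path


def page_filename(stem, page_number, total_pages):
    """Build a zero-padded name, e.g. Project_Specs_page_007.pdf for a 120-page file."""
    width = len(str(total_pages))
    return f"{stem}_page_{page_number:0{width}d}.pdf"


def merge_pdfs(input_paths, output_path):
    """Append the inputs, in order, into a single PDF."""
    from pypdf import PdfWriter  # imported lazily so the naming helper works without pypdf

    writer = PdfWriter()
    for path in input_paths:
        writer.append(str(path))
    with open(output_path, "wb") as handle:
        writer.write(handle)


def split_pdf(input_path, output_dir):
    """Write each page of input_path to its own file and return the new paths."""
    from pypdf import PdfReader, PdfWriter

    source = Path(input_path)
    target = Path(output_dir)
    target.mkdir(parents=True, exist_ok=True)

    reader = PdfReader(str(source))
    total = len(reader.pages)
    written = []
    for number, page in enumerate(reader.pages, start=1):
        writer = PdfWriter()
        writer.add_page(page)
        out_path = target / page_filename(source.stem, number, total)
        with open(out_path, "wb") as handle:
            writer.write(handle)
        written.append(out_path)
    return written
```

For example, `merge_pdfs(["part1.pdf", "part2.pdf", "part3.pdf"], "Final_Project.pdf")` corresponds to the third prompt. Zero-padding the page number keeps the split files in page order when sorted by name.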

Tips & Limitations

OCR Support: This skill works best with text-based PDFs. If you are processing image-based scanned documents, you may need an additional OCR engine (like Tesseract) for accurate text extraction.
Complex Layouts: Tables with merged cells or complex grid lines may be extracted incorrectly, so validate the output manually after extraction.
Memory Usage: When merging or splitting extremely large documents, ensure your system has sufficient RAM to handle the document buffer.
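One way to act on the OCR tip is to check, before extraction, which pages have little or no selectable text. The heuristic below is a sketch under assumptions: the 20-character threshold is arbitrary and not a setting of the skill, the function names are hypothetical, and pypdf's `extract_text()` is used only as one possible text source.

```python
def pages_needing_ocr(page_texts, min_chars=20):
    """Return 1-based page numbers whose extracted text is nearly empty.

    page_texts holds one entry per page; None counts as no text.
    min_chars is an arbitrary illustrative threshold.
    """
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if len((text or "").strip()) < min_chars
    ]


def scan_pdf_for_ocr(pdf_path, min_chars=20):
    """Extract text page by page with pypdf and flag likely image-only pages."""
    from pypdf import PdfReader  # imported lazily: only this function needs it

    reader = PdfReader(pdf_path)
    texts = [page.extract_text() for page in reader.pages]
    return pages_needing_ocr(texts, min_chars)
```

Pages flagged this way are candidates for an OCR pass (for instance with Tesseract) before text or table extraction is attempted.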

Metadata

Author: @guohongbin-git

Stars: 2387

Updated: 2026-03-09

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-guohongbin-git-pdf-cn": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags (AI)

#pdf #automation #document-processing #data-extraction #productivity

Safety Score: 4/5

Flags: file-write, file-read, code-execution

Related Skills

sspai-hot-cn

SSPAI Hot Articles Monitor. Get SSPAI trending digital reviews, app recommendations, and productivity tools. Trigger words: 少数派, sspai, 数码评测, 效率工具.

guohongbin-git 2387

binance-pro-cn

Binance Pro. Complete Binance integration. Spot/futures trading, leverage, staking. Trigger words: 币安, Binance, 交易, trading.

guohongbin-git 2387

v2ex-hot-cn

V2EX Hot Topics Monitor. Get V2EX trending posts, tech discussions, and digital life topics. Trigger words: V2EX, v2, 程序员社区.

guohongbin-git 2387

xueqiu-hot-cn

Xueqiu Hot Discussions Monitor. Get Xueqiu trending stock discussions, investment insights, and top posts. Trigger words: 雪球, 股票, 投资, xueqiu.

guohongbin-git 2387

tianyancha-cn

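Company information lookup: data queries from Tianyancha (天眼查), Qichacha (企查查), and Aiqicha (爱企查), described as a Chinese counterpart to the Bloomberg Terminal.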

guohongbin-git 2387