Official | Verified | data analysis | Safety 4/5

pdf-extraction

Extract text, tables, and metadata from PDFs using pdfplumber

Why use this skill?

Efficiently extract text, tables, and metadata from complex PDF files using the pdf-extraction skill. Perfect for automating document workflows.

What This Skill Does

The pdf-extraction skill for OpenClaw is aimed at developers and data analysts who need to work with PDF documents programmatically. It is built on the pdfplumber library and goes beyond plain text extraction: it exposes the internal structure of a PDF, including character-level positions, font metadata, drawn lines, and detected tables. For text-based documents such as structured financial statements or multi-column academic papers, the OpenClaw agent can parse the content and convert it into machine-readable formats such as CSV or structured JSON. Image-only scans must be run through OCR first (see Tips & Limitations). The skill is most useful for document-heavy workflows that depend on precise, position-aware extraction.
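
As a rough illustration of the kind of access described above, the sketch below calls pdfplumber directly. How the skill itself wraps these calls is not documented on this page, so the function names `fonts_used` and `summarize_pdf` are hypothetical. The pdfplumber members used (`pdf.metadata`, `page.chars`, `page.extract_text()`, `page.extract_tables()`) belong to the library's public API.

```python
def fonts_used(chars):
    """Return the sorted, de-duplicated (fontname, size) pairs found in a
    list of pdfplumber character dicts (the shape of `page.chars`)."""
    return sorted({(c["fontname"], round(c["size"], 1)) for c in chars})


def summarize_pdf(path):
    """Collect document metadata and per-page text, tables, and fonts.

    Hypothetical helper: assumes pdfplumber is installed and that `path`
    points to a text-based PDF.
    """
    import pdfplumber  # imported lazily so fonts_used() has no dependency

    with pdfplumber.open(path) as pdf:
        pages = []
        for page in pdf.pages:
            pages.append({
                "page_number": page.page_number,
                "text": page.extract_text() or "",
                "tables": page.extract_tables(),
                "fonts": fonts_used(page.chars),
            })
        return {"metadata": pdf.metadata, "pages": pages}
```

Calling `summarize_pdf("report.pdf")` (the file name is a placeholder) would return a dictionary that can be serialized to JSON or inspected page by page.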

Installation

To add this skill to your OpenClaw environment, run the following command in your terminal:

clawhub install openclaw/skills/skills/lijie420461340/pdf-extraction

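pdfplumber is a pure-Python library built on pdfminer.six, so text and table extraction should not require system packages such as libpoppler. Optional features such as visual debugging may have extra requirements depending on the pdfplumber version installed; check the pdfplumber documentation if you need them.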

Use Cases

Typical applications include:

Financial Data Processing: Automating the extraction of complex tables from bank statements or quarterly reports.
Academic Research: Parsing large datasets or bibliographies from research papers for citation management.
Legal Tech: Extracting specific clauses or metadata from legal contracts that maintain strict document formatting.
Document Archiving: Converting legacy static PDF archives into clean, searchable, and processable database entries.

Example Prompts

"Extract all tables from this quarterly financial report and save them into a structured CSV file for me."
"Please scan pages 5 through 10 of this document and extract the text, ensuring you preserve the original layout and indentation."
"Identify the invoice total and the recipient company name from this PDF and provide them as a JSON object."
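
To give a sense of what the first prompt involves, here is a minimal sketch of turning extracted tables into CSV. It assumes the row format returned by pdfplumber's `page.extract_tables()` (a list of tables, each a list of rows, each row a list of cell strings or `None`). The helpers `table_to_csv` and `save_tables_as_csv` and the output file naming are hypothetical, not the skill's actual implementation.

```python
import csv
import io


def table_to_csv(rows):
    """Serialize one extracted table to CSV text, writing None cells as ''."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        writer.writerow(["" if cell is None else cell for cell in row])
    return buffer.getvalue()


def save_tables_as_csv(pdf_path, out_prefix):
    """Write every table in the PDF to '<out_prefix>_p<page>_t<n>.csv'.

    Assumes pdfplumber is installed; returns the list of files written.
    """
    import pdfplumber  # lazy import keeps table_to_csv() dependency-free

    written = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            for n, rows in enumerate(page.extract_tables(), start=1):
                name = f"{out_prefix}_p{page.page_number}_t{n}.csv"
                with open(name, "w", newline="", encoding="utf-8") as fh:
                    fh.write(table_to_csv(rows))
                written.append(name)
    return written
```

Writing one file per table keeps differently shaped tables from being forced into a single set of columns; merging them is a separate decision that depends on the report.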

Tips & Limitations

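Tip: Pass 'layout=True' to extract_text() to approximate the page's visual layout (spacing and indentation) in the output. For multi-column PDFs this alone may not produce a logical reading order; cropping the page to each column and extracting the columns one at a time is usually more reliable.
Tip: If a table is not detected correctly, inspect the page's 'rects' and 'lines' to check whether the table borders are explicitly drawn in the PDF. If they are not, a text-based table detection strategy may work better than the default line-based one.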
Limitation: This skill is primarily for text-based PDFs. It does not perform Optical Character Recognition (OCR) on image-only PDFs. For those files, consider a pre-processing step using an OCR engine before feeding the results into this skill.
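
The two tips above can be combined into a small diagnostic. The sketch below is one assumed way to apply them, not the skill's own logic: `choose_table_settings` and `extract_with_diagnostics` are hypothetical names, and the rule "fall back to the text strategy when no lines or rects are drawn" is a deliberately simple heuristic. The `table_settings` keys and the `layout=True` argument are pdfplumber options.

```python
def choose_table_settings(line_count, rect_count):
    """Pick a pdfplumber table strategy from the number of drawn objects.

    Heuristic (an assumption): if the page has any drawn lines or rects,
    use them as table borders; otherwise infer borders from text alignment.
    """
    strategy = "lines" if (line_count + rect_count) > 0 else "text"
    return {"vertical_strategy": strategy, "horizontal_strategy": strategy}


def extract_with_diagnostics(path, page_index=0):
    """Return layout-preserving text and tables for one page.

    Assumes pdfplumber is installed and the page contains real text
    (image-only pages would need OCR beforehand).
    """
    import pdfplumber  # lazy import keeps choose_table_settings() standalone

    with pdfplumber.open(path) as pdf:
        page = pdf.pages[page_index]
        settings = choose_table_settings(len(page.lines), len(page.rects))
        return {
            "settings": settings,
            "layout_text": page.extract_text(layout=True),
            "tables": page.extract_tables(table_settings=settings),
        }
```

Returning the chosen settings alongside the results makes it easier to see why a table was or was not detected on a given page.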

Metadata

Author: @lijie420461340

Stars: 1656

Updated: 2026-02-28

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-lijie420461340-pdf-extraction": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Related Skills

career-compass

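Career Compass (职场罗盘) by Barry: a one-stop job-search assistant skill. It combines four modules: resume parsing and optimization, employment-oriented company research, local job search, and mock interviews. Given personal information or a resume, it automatically generates resume improvement suggestions, a company research report, and a job listings sheet, and can also run mock interviews.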

barry0-0 4473

wechat-article-export

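Multi-purpose export tool for WeChat Official Accounts. Exports official account articles as a long screenshot (PNG), PDF, or Markdown, in any one or several of these formats. Trigger phrases: 「導出微信文章」 (export WeChat article), 「公眾號截圖」 (official account screenshot), 「文章轉PDF」 (article to PDF), 「文章轉Markdown」 (article to Markdown), 「微信導出」 (WeChat export).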

benzking 4473

DocPilot

Intelligent document processing expert that supports document parsing, information extraction, and document classification.

ankylala 4473

collab-to-skill

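Distills the workflows, decisions, and methods refined jointly by a human and an agent into a reusable skill. Useful for extracting high-quality collaboration processes from chats or project work and consolidating them into distributable skill packages.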

beachanger 4473

accounting-assistant

Accounting automation with EÜR (German income-surplus statement) preparation, DATEV export, PDF receipt analysis, and tax preparation. Ideal for freelancers and SMEs.

akkualle 4473