Official | Verified | file management | Safety 4/5

pdf

Comprehensive PDF manipulation toolkit for extracting text and tables, creating new PDFs, merging and splitting documents, and handling forms. Use it when Claude needs to fill in a PDF form or to programmatically process, generate, or analyze PDF documents at scale.

Why use this skill?

Master PDF manipulation with the OpenClaw PDF skill. Effortlessly extract data, merge, split, and generate documents using Python automation.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/seanphan/pdf-2

What This Skill Does

The PDF skill is a versatile, comprehensive toolkit designed for the OpenClaw AI agent to interact with, transform, and extract data from Portable Document Format (PDF) files. It leverages industry-standard Python libraries including pypdf, pdfplumber, and reportlab to offer a suite of high-level operations. Users can rely on this skill to automate complex document workflows, such as parsing large datasets contained in tables, merging disparate report files into a unified document, splitting massive volumes into manageable chunks, rotating pages for readability, and extracting metadata for indexing. Furthermore, the skill enables the programmatic generation of new PDF documents using reportlab, allowing for custom report formatting and dynamic document creation from scratch.
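
As an illustration of the merge and metadata operations described above, here is a minimal sketch using pypdf directly. It assumes a recent pypdf release (3.x or later, where PdfWriter.append is available); the function name merge_pdfs and the file names in the usage note are hypothetical, and this is not necessarily how the skill implements merging internally.

```python
def merge_pdfs(input_paths, output_path, title=None):
    """Append every input PDF, in order, into one output file.

    Hypothetical helper for illustration; returns the merged page count.
    """
    # Imported inside the function so this module still loads where pypdf is absent.
    from pypdf import PdfWriter

    writer = PdfWriter()
    for path in input_paths:
        writer.append(str(path))  # copies all pages of this input
    if title:
        writer.add_metadata({"/Title": title})  # sets the document title
    with open(output_path, "wb") as handle:
        writer.write(handle)
    return len(writer.pages)
```

A call such as merge_pdfs(["brief_a.pdf", "brief_b.pdf"], "combined.pdf", title="Project Briefs") would write a single file containing the pages of both inputs in the order given.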

Installation

To install this skill, run the following command in your OpenClaw terminal or environment:

clawhub install openclaw/skills/skills/seanphan/pdf-2

Make sure the required Python dependencies are already installed in your environment: pypdf, pdfplumber, pandas, and reportlab.
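
One way to confirm the dependencies listed above before running the skill is a short standard-library check. This is a sketch only: the helper name missing_dependencies is hypothetical, and it assumes each package's import name matches the name given above.

```python
from importlib.util import find_spec

# Package names as listed in the installation notes.
REQUIRED = ("pypdf", "pdfplumber", "pandas", "reportlab")


def missing_dependencies(names=REQUIRED):
    """Return the names that cannot be imported in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing packages: " + ", ".join(missing))
        print("Try: pip install " + " ".join(missing))
    else:
        print("All PDF skill dependencies are importable.")
```

The check only verifies that the packages can be located; it does not validate their versions.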

Use Cases

  • Automated Data Extraction: Pull tabular financial data out of reports and convert it into clean Excel sheets using pdfplumber and pandas.
  • Document Assembly: Batch process and merge hundreds of invoice PDFs into a single consolidated file for archiving.
  • Form Processing: Programmatically identify and populate interactive PDF forms with dynamic data.
  • Content Sanitization: Rotate or reorder pages, or split sensitive pages out of larger documents, to prepare them for specific distribution requirements.
  • Report Generation: Automatically create professional-looking PDF summaries directly from raw analytical output.
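
The data-extraction use case above could look roughly like the following sketch, which uses pdfplumber's extract_tables and pandas. Several things here are assumptions rather than documented behavior of the skill: the helper names split_header and tables_to_excel are hypothetical, the first row of each table is treated as its header, and writing .xlsx files through pandas requires an Excel engine such as openpyxl, which is not in the dependency list above.

```python
def split_header(table):
    """Treat the first row as the header (an assumption) and blank out None cells."""
    cleaned = [[(cell or "").strip() for cell in row] for row in table]
    return cleaned[0], cleaned[1:]


def tables_to_excel(pdf_path, xlsx_path):
    """Write every table found in the PDF to its own sheet; return the table count."""
    # Imported lazily so split_header stays usable without these packages.
    import pandas as pd
    import pdfplumber

    frames = {}
    with pdfplumber.open(pdf_path) as pdf:
        for page_number, page in enumerate(pdf.pages, start=1):
            for table_number, table in enumerate(page.extract_tables(), start=1):
                if not table:
                    continue
                header, body = split_header(table)
                sheet = f"p{page_number}_t{table_number}"
                frames[sheet] = pd.DataFrame(body, columns=header)

    if not frames:
        return 0  # nothing to write; avoids creating an empty workbook

    with pd.ExcelWriter(xlsx_path) as writer:
        for sheet, frame in frames.items():
            frame.to_excel(writer, sheet_name=sheet, index=False)
    return len(frames)
```

Each table lands on a sheet named after its page and position (for example p3_t1), which keeps the source location traceable during review.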

Example Prompts

  • "Please scan the 50-page financial report named 'Q3_Earnings.pdf' and extract all tables into a single Excel file for me to review."
  • "I have three separate project briefs in PDF format. Can you merge them into one cohesive document and title the metadata correctly?"
  • "Rotate the first page of 'manual.pdf' 90 degrees clockwise and split every page into individual files saved in a new folder."
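
For reference, the rotate-and-split request in the last prompt corresponds roughly to the pypdf sketch below. The function name rotate_first_and_split and the output naming scheme (page_001.pdf, page_002.pdf, ...) are hypothetical choices for illustration; the code the agent actually generates may differ.

```python
from pathlib import Path


def rotate_first_and_split(src, out_dir, degrees=90):
    """Rotate page 1 clockwise, then save every page as its own PDF.

    Returns the list of files written. `degrees` must be a multiple of 90.
    """
    # Imported lazily so this module still loads where pypdf is absent.
    from pypdf import PdfReader, PdfWriter

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    reader = PdfReader(str(src))
    written = []
    for index, page in enumerate(reader.pages):
        if index == 0:
            page.rotate(degrees)  # positive angles rotate clockwise
        writer = PdfWriter()
        writer.add_page(page)
        target = out / f"page_{index + 1:03d}.pdf"
        with open(target, "wb") as handle:
            writer.write(handle)
        written.append(target)
    return written
```

Calling rotate_first_and_split("manual.pdf", "manual_pages") would create the folder if needed and leave the original file untouched.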

Tips & Limitations

When extracting from PDFs, keep in mind that the accuracy of text and table extraction depends on the quality of the source file: scanned images without an embedded text layer may yield little or no usable text, and OCR-derived text layers are generally less accurate than digital-born PDFs. For complex form-filling, ensure the form fields are standard Adobe-compatible objects. If you encounter errors, verify that the file path is correct and that the file is not password-protected. Always check the reference.md file in the source repository for edge cases involving encrypted files or non-standard page encodings.
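
A quick pre-flight check along the lines of these tips might look like the sketch below. It uses pypdf's is_encrypted property and extract_text method; the helper name inspect_pdf and the choice to sample only the first few pages are assumptions made for illustration, not part of the skill.

```python
def inspect_pdf(path, sample_pages=3):
    """Report whether a PDF is encrypted and whether sampled pages contain text."""
    # Imported lazily so this module still loads where pypdf is absent.
    from pypdf import PdfReader

    reader = PdfReader(str(path))
    if reader.is_encrypted:
        # Page contents cannot be read until the file is decrypted.
        return {"encrypted": True, "pages": None, "has_text_layer": None}

    pages = reader.pages
    count = len(pages)
    # Sample only the first few pages to keep the check fast on large files.
    samples = [pages[i].extract_text() or "" for i in range(min(sample_pages, count))]
    return {
        "encrypted": False,
        "pages": count,
        "has_text_layer": any(text.strip() for text in samples),
    }
```

If has_text_layer comes back False, the file is likely image-only on the sampled pages, and text or table extraction will probably need an OCR step first.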

Metadata

Author: @seanphan
Stars: 1054
Views: 1
Updated: 2026-02-16

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-seanphan-pdf-2": {
      "enabled": true,
      "auto_update": true
    }
  }
}
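
If clawhub.json already contains other plugins, pasting the snippet by hand risks overwriting them. The sketch below merges the entry programmatically with the standard json module. It assumes the file is plain JSON with a top-level "plugins" object, as in the snippet above; the helper names with_plugin_enabled and enable_plugin_in_file are hypothetical, and the file's location depends on your setup.

```python
import copy
import json
from pathlib import Path


def with_plugin_enabled(config, plugin_id, auto_update=True):
    """Return a copy of `config` with the plugin entry added or replaced."""
    updated = copy.deepcopy(config)
    plugins = updated.setdefault("plugins", {})
    plugins[plugin_id] = {"enabled": True, "auto_update": auto_update}
    return updated


def enable_plugin_in_file(config_path, plugin_id):
    """Load the config file (if present), enable the plugin, and write it back."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    updated = with_plugin_enabled(config, plugin_id)
    path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")
    return updated
```

For example, enable_plugin_in_file("clawhub.json", "official-seanphan-pdf-2") leaves any existing plugin entries in place and adds or updates only this one.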

Tags (AI)

#pdf #automation #data-extraction #python #document-management

Safety Score: 4/5

Flags: file-read, file-write, code-execution

Related Skills

artifacts-builder

Suite of tools for creating elaborate, multi-component claude.ai HTML artifacts using modern frontend web technologies (React, Tailwind CSS, shadcn/ui). Use for complex artifacts requiring state management, routing, or shadcn/ui components - not for simple single-file HTML/JSX artifacts.

seanphan 1054

template-skill

Replace with description of the skill and when Claude should use it.

seanphan 1054

canvas-design

Create beautiful visual art in .png and .pdf documents using design philosophy. You should use this skill when the user asks to create a poster, piece of art, design, or other static piece. Create original visual designs, never copying existing artists' work to avoid copyright violations.

seanphan 1054

xlsx

Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization. When Claude needs to work with spreadsheets (.xlsx, .xlsm, .csv, .tsv, etc) for: (1) Creating new spreadsheets with formulas and formatting, (2) Reading or analyzing data, (3) Modifying existing spreadsheets while preserving formulas, (4) Data analysis and visualization in spreadsheets, or (5) Recalculating formulas.

seanphan 1054

algorithmic-art

Creating algorithmic art using p5.js with seeded randomness and interactive parameter exploration. Use this when users request creating art using code, generative art, algorithmic art, flow fields, or particle systems. Create original algorithmic art rather than copying existing artists' work to avoid copyright violations.

seanphan 1054