pdf-tools
View, extract, edit, and manipulate PDF files. Supports text extraction, text editing (overlay and replacement), merging, splitting, rotating pages, and getting PDF metadata. Use when working with PDF documents for reading content, adding/editing text, reorganizing pages, combining files, or extracting information.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/cmpdchtr/pdf-tools
What This Skill Does
The pdf-tools skill is a set of Python utilities for working with Portable Document Format (PDF) files. It lets OpenClaw agents extract raw text, inspect metadata, merge files, split documents into individual pages, rotate pages, and overlay text at chosen positions. It builds on the pdfplumber and PyPDF2 libraries, so agents can read and modify PDFs programmatically in document-management, report-generation, and data-processing workflows.
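As a rough illustration of the text-extraction capability, the sketch below calls pdfplumber directly. It is not the skill's own script, whose internals are not shown here. The names `extract_text` and `join_page_texts` are hypothetical, and the sketch assumes pdfplumber is installed.

```python
from typing import Iterable, Optional


def join_page_texts(texts: Iterable[Optional[str]]) -> str:
    """Join per-page text, treating pages with no extractable text as empty."""
    return "\n\n".join(text or "" for text in texts)


def extract_text(pdf_path: str, max_pages: Optional[int] = None) -> str:
    """Return the text of the first `max_pages` pages (all pages if None)."""
    # Imported lazily so join_page_texts stays usable without pdfplumber.
    import pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        pages = pdf.pages if max_pages is None else pdf.pages[:max_pages]
        # extract_text() may yield nothing for image-only pages.
        return join_page_texts(page.extract_text() for page in pages)
```

For example, `extract_text("report.pdf", max_pages=3)` would return the text of the first three pages as one string, with a blank line between pages.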
Installation
The skill requires Python 3 and two libraries, pdfplumber and PyPDF2. Install them from your terminal:
pip3 install pdfplumber PyPDF2
After installing the requirements, install the skill into the OpenClaw environment using the agent's package management interface:
clawhub install openclaw/skills/skills/cmpdchtr/pdf-tools
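Before running any PDF operation, it can help to confirm that both Python dependencies are importable. The following is a minimal sketch using only the standard library. `missing_dependencies` is a hypothetical helper and is not part of the skill.

```python
import importlib.util
from typing import Iterable, List


def missing_dependencies(
    modules: Iterable[str] = ("pdfplumber", "PyPDF2"),
) -> List[str]:
    """Return the names of modules that cannot be found on the import path."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All PDF dependencies are available.")
```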
Use Cases
- Document Digitization and Summarization: Automatically extract text from long reports to feed into LLMs for summarization or analysis.
- File Organization: Break down large binders or multi-chapter documents into individual page files for archiving or easier navigation.
- Compliance and Reporting: Batch process documents by applying headers, footers, or watermarks via text overlays, or merge multiple individual receipts/invoices into a single document for submission.
- PDF Inspection: Verify file integrity and structure by reviewing metadata and page counts before processing.
Example Prompts
- "Please extract the text from the first three pages of report.pdf and save it to a new file named summary.txt."
- "I need you to take invoice.pdf, rotate pages 2 and 4 by 90 degrees, and merge the result with supplemental_info.pdf into a final document called output.pdf."
- "Add the word 'DRAFT' as an overlay at coordinates (100, 100) on the first page of contract.pdf."
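The rotate-and-merge prompt above could be carried out roughly as follows. This is a sketch under stated assumptions, not the skill's actual implementation: it assumes PyPDF2 3.x (`PdfReader`, `PdfWriter`, `page.rotate`), and `rotation_plan` and `rotate_and_merge` are hypothetical names.

```python
from typing import Iterable, List


def rotation_plan(page_count: int, pages_to_rotate: Iterable[int], degrees: int) -> List[int]:
    """Return the clockwise rotation for each page, given 1-indexed page numbers."""
    if degrees % 90 != 0:
        raise ValueError("PDF page rotation must be a multiple of 90 degrees")
    selected = set(pages_to_rotate)
    out_of_range = sorted(p for p in selected if not 1 <= p <= page_count)
    if out_of_range:
        raise ValueError(f"pages out of range 1..{page_count}: {out_of_range}")
    return [degrees if number in selected else 0 for number in range(1, page_count + 1)]


def rotate_and_merge(source: str, pages_to_rotate: Iterable[int], degrees: int,
                     extra: str, output: str) -> None:
    """Rotate selected pages of `source`, append `extra`, and write `output`."""
    # Imported lazily; assumes PyPDF2 3.x is installed.
    from PyPDF2 import PdfReader, PdfWriter

    writer = PdfWriter()
    reader = PdfReader(source)
    plan = rotation_plan(len(reader.pages), pages_to_rotate, degrees)
    for page, angle in zip(reader.pages, plan):
        if angle:
            page.rotate(angle)  # clockwise, in multiples of 90
        writer.add_page(page)
    for page in PdfReader(extra).pages:
        writer.add_page(page)
    with open(output, "wb") as handle:
        writer.write(handle)
```

With these helpers, the prompt maps to `rotate_and_merge("invoice.pdf", [2, 4], 90, "supplemental_info.pdf", "output.pdf")`. Keeping the page-selection logic in `rotation_plan` means bad page numbers are rejected before any file is opened.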
Tips & Limitations
- Text-Based vs. Scanned: This skill is designed for text-based PDFs. Scanned, image-only PDFs have no text layer, so text extraction will return little or nothing unless the file is run through OCR first.
- Editing Complexity: Text overlays are the more reliable editing mode. Replacing text inside existing page content is fragile, so always work on a copy of your source file to avoid accidental data corruption.
- Indexing: Page numbers are 1-indexed, so the first page of a document is page 1, not page 0.
- File Safety: The tools perform validation checks to ensure files exist, but verify your paths before running batch operations to prevent script failures.
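The indexing and file-safety tips can be checked up front in a batch script. The sketch below uses only the standard library. `to_zero_based` and `missing_files` are hypothetical helpers and do not reflect the skill's internal validation.

```python
from pathlib import Path
from typing import Iterable, List


def to_zero_based(page_number: int, page_count: int) -> int:
    """Convert a 1-indexed page number to the 0-based index Python lists use."""
    if not 1 <= page_number <= page_count:
        raise ValueError(
            f"page {page_number} is outside the valid range 1..{page_count}"
        )
    return page_number - 1


def missing_files(paths: Iterable[str]) -> List[str]:
    """Return the paths that do not point to an existing regular file."""
    return [str(path) for path in paths if not Path(path).is_file()]
```

Running `missing_files` over every input before a batch starts lets the whole job fail early with a full list of bad paths, rather than stopping partway through.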
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-cmpdchtr-pdf-tools": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: file-write, file-read, code-execution