pdfagent
Self-hosted PDF operations and conversions with metered usage output.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/cap-txt/pdfagent
What This Skill Does
pdfagent is a self-hosted OpenClaw skill that handles PDF lifecycle operations directly on your local machine. It acts as an abstraction layer over established command-line tools such as qpdf, ghostscript, and ocrmypdf, so the AI agent can run document manipulation tasks through a single interface. Each operation returns usage metrics as structured JSON, which keeps every transformation (merging documents, splitting pages, applying OCR) trackable and transparent. Because it runs locally via uv, your documents never leave your filesystem, which makes it a reasonable choice for confidential paperwork.
Installation
Install the skill through ClawHub by running the following command in your terminal:
clawhub install openclaw/skills/skills/cap-txt/pdfagent
Ensure uv is installed and available on your PATH. For the full range of features, also install the following system dependencies with your OS package manager: qpdf, ghostscript, poppler, libreoffice, chromium, and ocrmypdf. The doctor command detects these binaries automatically and reports which ones are available.
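Before running the doctor command, a quick pre-flight check of the PATH can show which dependencies are likely missing. The following is a minimal sketch, not part of pdfagent: the executable names in CANDIDATES are assumptions (for example, Ghostscript usually installs as gs and LibreOffice as soffice), and they may differ on your OS or package manager.

```python
import shutil

# Assumed executable names for each dependency; these are illustrative and
# vary by operating system and package source.
CANDIDATES = {
    "uv": ["uv"],
    "qpdf": ["qpdf"],
    "ghostscript": ["gs", "gswin64c"],
    "poppler": ["pdftoppm", "pdftotext"],
    "libreoffice": ["soffice", "libreoffice"],
    "chromium": ["chromium", "chromium-browser", "google-chrome"],
    "ocrmypdf": ["ocrmypdf"],
}


def find_dependencies(candidates=CANDIDATES, which=shutil.which):
    """Return {dependency: resolved path or None} using the first name found."""
    found = {}
    for name, executables in candidates.items():
        # Try each assumed executable name in order; keep the first hit.
        found[name] = next((p for p in map(which, executables) if p), None)
    return found


if __name__ == "__main__":
    for name, path in find_dependencies().items():
        print(f"{name:12} {path or 'MISSING'}")
```

This only checks that an executable exists on the PATH. The doctor command remains the authoritative check for whether pdfagent can actually use it.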
Use Cases
- Document Archiving: Converting scanned images and physical paperwork into searchable PDFs using OCR.
- Automated Reporting: Merging multiple data snapshots into a single, compressed PDF report for stakeholders.
- Batch Processing: Splitting large multi-hundred-page invoices into individual files based on specific page ranges.
- Optimizing Storage: Using preset compression modes (e.g., 'ebook' or 'screen') to reduce the size of PDFs for faster email attachments or web hosting.
- Multi-step Workflows: Using the agent mode to chain complex operations like merging, rotating, and encrypting in one pass.
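The preset names in the storage use case match Ghostscript's own -dPDFSETTINGS values for its pdfwrite device. As a rough illustration of what such a preset means at the Ghostscript level, the sketch below builds a standalone gs command. It is an assumption that pdfagent delegates compression this way; the function name and defaults are hypothetical, and only the Ghostscript flags themselves are standard.

```python
# Ghostscript's documented -dPDFSETTINGS presets for the pdfwrite device.
GS_PRESETS = ("screen", "ebook", "printer", "prepress", "default")


def gs_compress_command(src, dst, preset="ebook", gs="gs"):
    """Build (but do not run) a Ghostscript command that recompresses a PDF.

    Hypothetical helper for illustration; pdfagent's internal invocation
    may differ.
    """
    if preset not in GS_PRESETS:
        raise ValueError(f"unknown preset {preset!r}; expected one of {GS_PRESETS}")
    return [
        gs,
        "-sDEVICE=pdfwrite",           # write a new PDF
        f"-dPDFSETTINGS=/{preset}",    # quality/size trade-off preset
        "-dNOPAUSE",                   # do not pause between pages
        "-dBATCH",                     # exit when processing finishes
        "-dQUIET",                     # suppress routine output
        f"-sOutputFile={dst}",
        src,
    ]


if __name__ == "__main__":
    print(" ".join(gs_compress_command("report.pdf", "report_small.pdf", "screen")))
```

In general, screen produces the smallest files at the lowest image resolution, while ebook keeps somewhat higher quality at a moderate size.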
Example Prompts
- "pdfagent, please take the scan in my Downloads folder, apply OCR in English, and save the resulting searchable PDF as 'archive_2023.pdf'."
- "I have two PDFs: 'invoice_1.pdf' and 'invoice_2.pdf'. Merge them together, compress the output for email, and tell me how much time the processing took."
- "Split the document 'manual.pdf' into individual files for pages 1 through 5, and make sure the output is placed in the 'sections' folder."
Tips & Limitations
Append the --json flag to every command when you integrate pdfagent into a programmatic pipeline, so that the caller receives structured output. For encrypted files, pass the --password flag; without it, the agent fails with an access error. The tool relies on system-level binaries, so if one of them (for example, libreoffice) is missing, pdfagent may fall back to less accurate methods. Run uv run {baseDir}/scripts/pdfagent_cli.py doctor periodically to confirm that your environment is fully configured and that all system dependencies are healthy.
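A pipeline integration along these lines might wrap the CLI in a small helper. The sketch below rests on several assumptions that the text above does not confirm: that --password takes its value as the next argument, that --json is accepted by every subcommand (including doctor), that stdout is then a single JSON document, and that failures produce a non-zero exit code. The helper names build_command and run_pdfagent are hypothetical.

```python
import json
import subprocess


def build_command(base_dir, subcommand, *args, password=None, json_output=True):
    """Assemble the argument list for a pdfagent CLI call.

    base_dir stands in for the {baseDir} placeholder (the skill's install
    directory). Flag placement and the --password value form are assumptions.
    """
    cmd = ["uv", "run", f"{base_dir}/scripts/pdfagent_cli.py", subcommand, *args]
    if password is not None:
        cmd += ["--password", password]
    if json_output:
        cmd.append("--json")  # appended last, as the tip above suggests
    return cmd


def run_pdfagent(base_dir, subcommand, *args, **options):
    """Run the CLI and parse stdout as JSON (assumed output format)."""
    cmd = build_command(base_dir, subcommand, *args, **options)
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Assumption: a non-zero exit code signals failure, with details on stderr.
        raise RuntimeError(result.stderr.strip() or f"exit code {result.returncode}")
    return json.loads(result.stdout)


if __name__ == "__main__":
    # Prints the command that would be executed, without running it.
    print(build_command("/path/to/pdfagent", "doctor"))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.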
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-cap-txt-pdfagent": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: file-write, file-read, code-execution