doc-parser
Parse complex documents with IBM's docling - handles tables, figures, and multi-column layouts
Why use this skill?
Use the OpenClaw doc-parser skill to accurately convert complex PDFs, Word docs, and images into structured Markdown and JSON using IBM's docling.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/lijie420461340/doc-parser
What This Skill Does
The doc-parser skill is a document understanding tool built on IBM's docling library. It converts unstructured inputs, such as multi-column PDFs, Microsoft Word documents, and image-based files, into structured, machine-readable data. Unlike simple text extractors, it analyzes page layout so that document structure is preserved. It identifies and extracts tables, images, captions, and hierarchical headings, and outputs them as Markdown, plain text, or JSON. This makes it useful for turning static reports, academic papers, or business invoices into data that can be processed or analyzed programmatically.
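For readers who want a rough idea of what such a conversion looks like at the library level, here is a minimal sketch that calls docling directly from Python. It assumes docling is installed (for example with `pip install docling`). The wrapper name `convert_document` and its `fmt` parameter are hypothetical and are not part of the skill, whose internals are not shown on this page.

```python
import json

# Output formats this sketch supports; the skill itself may offer more.
SUPPORTED_FORMATS = ("markdown", "json")


def convert_document(source: str, fmt: str = "markdown") -> str:
    """Convert a document (file path or URL) with docling and return it as a string.

    Hypothetical helper for illustration; not the skill's actual code.
    """
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")

    # Imported lazily so the format check above works without docling installed.
    from docling.document_converter import DocumentConverter

    result = DocumentConverter().convert(source)
    if fmt == "markdown":
        return result.document.export_to_markdown()
    # export_to_dict() returns a plain dict that the json module can serialize.
    return json.dumps(result.document.export_to_dict(), indent=2)
```

Calling `convert_document("report.pdf")` would return Markdown for a local file; `report.pdf` is a placeholder path.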
Installation
You can install the doc-parser skill directly via the OpenClaw CLI using the following command:
clawhub install openclaw/skills/skills/lijie420461340/doc-parser
Use Cases
- Data Digitization: Convert archived physical documents scanned as images into structured text formats.
- Report Automation: Automatically extract key findings and tables from complex financial or analytical PDFs into Excel-compatible formats.
- Academic Research: Quickly pull figures, captions, and citation lists from academic papers to build personal knowledge bases.
- Content Conversion: Migrate content from proprietary formats like DOCX or PPTX into platform-agnostic Markdown or JSON for integration with downstream AI models.
Example Prompts
- "Parse the attached financial report and output all tables into a CSV-friendly JSON format for me."
- "Extract all figures and their corresponding captions from this 50-page research paper and save them as a summary list."
- "Convert this complex, multi-column technical manual into clean Markdown, ensuring the headers and sub-sections are correctly nested."
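The first prompt above asks for tables in a "CSV-friendly JSON format". The page does not specify that format, so the sketch below assumes one possible shape, a dict with `headers` and `rows` keys, and shows how such a table could be written out as CSV with Python's standard library. The function name `table_to_csv` and the sample data are made up for illustration.

```python
import csv
import io


def table_to_csv(table: dict) -> str:
    """Render one table of the assumed shape
    {"headers": [...], "rows": [[...], ...]} as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(table["headers"])
    writer.writerows(table["rows"])
    return buffer.getvalue()


# Made-up sample data, not output from the skill.
sample = {
    "headers": ["Quarter", "Revenue"],
    "rows": [["Q1", "1,200"], ["Q2", "1,350"]],
}
print(table_to_csv(sample))
```

The csv module quotes cells that contain commas, so a value such as `1,200` stays in a single column when the file is opened in a spreadsheet.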
Tips & Limitations
- Format Variety: The skill works best with PDF, DOCX, and image inputs. For scanned or image-based documents, use a clear, legible source to get the best OCR results.
- Computational Intensity: Layout analysis is more expensive than plain text scraping, so very large files or documents with complex rendering take longer to process.
- Accuracy: Complex layouts benefit from the integrated pipeline options. If results are poor, first check that the input file is not corrupted.
- Data Privacy: Be mindful of sensitive data when processing documents; always verify that the content complies with your organization's internal data handling policies.
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-lijie420461340-doc-parser": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags
Flags: file-read
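If a clawhub.json already lists other plugins, pasting the fragment shown under Metadata over the file would drop them. A minimal sketch of merging the entry instead is given here. The function names and the default file path are assumptions; only the plugin key and its two settings come from this page.

```python
import json
from pathlib import Path

PLUGIN_ID = "official-lijie420461340-doc-parser"


def enable_plugin(config: dict) -> dict:
    """Return a copy of config with the doc-parser entry added under "plugins"."""
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))
    plugins[PLUGIN_ID] = {"enabled": True, "auto_update": True}
    merged["plugins"] = plugins
    return merged


def update_config_file(path: str = "clawhub.json") -> None:
    """Read, merge, and rewrite the config file; the default path is an assumption."""
    config_path = Path(path)
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config_path.write_text(json.dumps(enable_plugin(config), indent=2) + "\n")
```

`enable_plugin` leaves its input untouched and keeps every existing plugin entry, so it can be checked on a plain dict before any file is rewritten.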
Related Skills
align
Data and text alignment reference — sequence alignment, text formatting, memory alignment, and CSS/layout alignment. Use when aligning sequences, formatting columnar output, or understanding byte alignment.
footer
Footer design reference — layout patterns, sticky footers, SEO, accessibility, legal requirements. Use when designing web page footers or implementing responsive footer components.
layout
Generate CSS layouts. Use when building grid or flexbox layouts, creating responsive breakpoints, or scaffolding HTML pages.
layout-analyzer
Analyze document structure and layout using surya - detect text blocks, tables, and reading order
pdf-extraction
Extract text, tables, and metadata from PDFs using pdfplumber