Unidoc Parser
Skill by aaiccee
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/aaiccee/unidoc-parser
What This Skill Does
The Unidoc Parser skill is a document conversion agent that turns unstructured documents into machine-readable data. It uses the cloud-based UniDoc API to convert PDFs, Microsoft Word documents (.doc, .docx), and image formats such as PNG and JPG into either Markdown or structured JSON. It supports synchronous processing for quick tasks and asynchronous processing with status polling for larger, resource-intensive files, so converted documents are ready for downstream AI consumption, indexing, or archival without manual intervention.
Installation
To integrate this capability into your OpenClaw environment, execute the following command in your terminal:
clawhub install openclaw/skills/skills/aaiccee/unidoc-parser
Ensure that you have an active network connection, as the skill requires communication with the UniDoc API endpoint (http://unidoc.uat.hivoice.cn) to perform its processing tasks.
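Before installing, it can help to confirm that the endpoint is reachable from your machine. The following is a minimal sketch, not part of the skill itself: the helper names `host_and_port` and `can_connect` are hypothetical, and the check only verifies that a TCP connection can be opened, not that the UniDoc API is healthy.

```python
import socket
from urllib.parse import urlsplit

# Endpoint quoted in the skill description.
UNIDOC_URL = "http://unidoc.uat.hivoice.cn"


def host_and_port(url):
    """Return (hostname, port) for a URL, falling back to the scheme's default port."""
    parts = urlsplit(url)
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname, parts.port or default_port


def can_connect(url, timeout=5.0):
    """Return True if a TCP connection to the URL's host and port can be opened."""
    host, port = host_and_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    print("reachable" if can_connect(UNIDOC_URL) else "not reachable")
```

If the check reports that the host is not reachable, a firewall, proxy, or DNS restriction is a likely cause.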
Use Cases
This skill suits workflows that involve high-volume document ingestion. Use cases include:
- Data Extraction: Converting scanned invoices or legal contracts into JSON format for direct ingestion into databases or CRM systems.
- Content Migration: Turning legacy Word documents into Markdown files to facilitate easier migration to static site generators or documentation platforms.
- Batch Processing: Automating the conversion of entire directories of research papers or reports into a readable, searchable format using asynchronous mode for background execution.
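For the batch-processing case, a first step is usually to gather the files that the skill can handle. The sketch below is an illustration under stated assumptions: `collect_documents` is a hypothetical helper, and the suffix list contains only the formats named in the description above, so the real set of supported formats may be wider.

```python
from pathlib import Path

# Formats named in the skill description; the actual supported set may differ.
SUPPORTED_SUFFIXES = {".pdf", ".doc", ".docx", ".png", ".jpg"}


def collect_documents(root):
    """Recursively list files under `root` whose suffix is in SUPPORTED_SUFFIXES."""
    root = Path(root)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in SUPPORTED_SUFFIXES
    )


if __name__ == "__main__":
    for document in collect_documents("."):
        print(document)
```

Each returned path could then be handed to the skill one at a time, for example through a prompt like those in the Example Prompts section.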
Example Prompts
- "Unidoc, please parse the report at ./documents/annual_review.pdf into Markdown format and save the result to ./outputs/annual_review."
- "Convert the document at /data/receipts/march_invoices.docx into JSON so I can use the data in my spreadsheet generator. Use asynchronous mode since it is a large file."
- "Run the Unidoc Parser on /files/scanned_image.jpg to extract the text content into an md file."
Tips & Limitations
Prefer asynchronous mode (--mode async) for large batches or files that exceed the standard size limits, as this avoids connection timeouts. The tool polls job status automatically every second, so you do not need to intervene during processing. The skill writes files to disk, so ensure your target directories have appropriate write permissions. If you run into connectivity problems, verify that your local environment can reach the UniDoc API URL; if errors persist, check the references/unidoc-notes.md file for API-specific troubleshooting steps.
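The one-second status polling described above follows a common pattern that can be sketched generically. This is not the skill's actual implementation: `poll_until_done`, the `get_status` callback, and the status names "done" and "failed" are all hypothetical stand-ins, since the UniDoc API's real job states are not documented here.

```python
import time


def poll_until_done(get_status, interval=1.0, timeout=300.0, sleep=time.sleep):
    """Call get_status() every `interval` seconds until it returns a terminal state.

    Raises TimeoutError once roughly `timeout` seconds of waiting have elapsed.
    """
    terminal_states = {"done", "failed"}  # hypothetical status names
    waited = 0.0
    while True:
        status = get_status()
        if status in terminal_states:
            return status
        if waited >= timeout:
            raise TimeoutError(f"job still '{status}' after {waited:.0f}s")
        sleep(interval)
        waited += interval


if __name__ == "__main__":
    # Simulated job that finishes on the third status check.
    states = iter(["pending", "running", "done"])
    print(poll_until_done(lambda: next(states), interval=0.1))
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real delays, and the timeout guards against a job that never reaches a terminal state.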
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-aaiccee-unidoc-parser": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: network-access, file-write, file-read, external-api
Related Skills
Asr File Transfer
Skill by aaiccee
med-chronic-disease-review
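Outpatient chronic disease review (diabetes/hypertension). Takes a JSON array of OCR results as input and outputs the review conclusion and reasons (raw JSON plus a natural-language conclusion).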
u2-tts
Text-to-speech conversion using UniSound's TTS WebSocket API for generating high-quality Chinese Mandarin audio from text. Supports multiple voices, adjustable parameters, and real-time streaming synthesis.
med-initial-record-gen
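Generates an outpatient initial-visit medical record from Chinese doctor-patient dialogue text, outputting the record body as structured, sectioned text.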
u2-audio-file-transcriber
Transcribe audio files via the UniCloud ASR (Unisound speech recognition, recorded audio → text) API from UniSound. Supports multiple formats, optimized for finance, customer service, and other domains.