alicloud-ai-text-document-mind
Use Document Mind (DocMind) via the Node.js SDK to submit document parsing jobs and poll for results. Designed for Claude Code/Codex document understanding workflows.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/cinience/alicloud-ai-text-document-mind
What This Skill Does
The alicloud-ai-text-document-mind skill gives your OpenClaw agent document parsing capabilities through Alibaba Cloud's DocMind service. It automates the extraction of document structure, text content, and layout information from file types including PDFs and images. Through an asynchronous job interface, the agent can submit large documents, poll their processing status, and retrieve structured results. This is particularly useful for workflows involving RAG (Retrieval-Augmented Generation), automated document auditing, and data extraction from unstructured corporate documents.
Installation
To integrate this skill into your OpenClaw environment, execute the following command in your terminal:
clawhub install openclaw/skills/skills/cinience/alicloud-ai-text-document-mind
Ensure you have configured your Alibaba Cloud credentials by setting the ALICLOUD_ACCESS_KEY_ID, ALICLOUD_ACCESS_KEY_SECRET, and ALICLOUD_REGION_ID environment variables. The skill relies on the @alicloud/docmind-api20220711 SDK, which will be resolved during the installation process.
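A quick pre-flight check can catch missing credentials before any job is submitted. The snippet below is a minimal sketch, not part of the skill itself: the helper names missingCredentials and assertCredentials are hypothetical, and only the three environment variable names come from the text above.

```javascript
// Environment variables named in the installation notes above.
const REQUIRED_VARS = [
  "ALICLOUD_ACCESS_KEY_ID",
  "ALICLOUD_ACCESS_KEY_SECRET",
  "ALICLOUD_REGION_ID",
];

// Hypothetical helper: returns the required variables that are unset or blank.
function missingCredentials(env = process.env) {
  return REQUIRED_VARS.filter((name) => !env[name] || env[name].trim() === "");
}

// Hypothetical helper: throws a readable error if anything is missing.
function assertCredentials(env = process.env) {
  const missing = missingCredentials(env);
  if (missing.length > 0) {
    throw new Error(`Missing Alibaba Cloud settings: ${missing.join(", ")}`);
  }
}
```

Calling assertCredentials() at the start of an agent task fails fast with the names of the unset variables instead of surfacing an authentication error later in the polling loop.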
Use Cases
- Automated Data Extraction: Automatically parse multi-page PDFs or scanned reports and convert them into machine-readable JSON formats.
- Content Indexing: Feed parsed document layouts into vector databases to improve the accuracy of RAG systems.
- Compliance Auditing: Extract specific tables or entities from financial and legal documents for automated verification.
- Document Summarization: Extract raw text from complex documents with non-standard layouts, ensuring the agent has high-quality context before generating summaries.
Example Prompts
- "Parse the document at https://example.com/financial-report.pdf using DocMind and extract all tables into a JSON file."
- "Can you use DocMind to process my local file './q3_results.pdf' and provide a summary of the key performance indicators?"
- "Submit the document at this URL to DocMind and keep checking the status until it's finished, then show me the extracted text."
Tips & Limitations
- Asynchronous Workflow: DocMind jobs are asynchronous. Always design your agent tasks to handle the polling loop (typically every 10-30 seconds).
- Accessibility: For URL-based submissions, ensure the file is hosted on a publicly reachable server. For private files, use the local file upload method via SubmitDocStructureJobAdvanceRequest.
- Timeout Management: The service has a maximum processing window of 120 minutes. Set your polling logic to account for this ceiling.
- Error Handling: Always verify the return status of the polling requests; common issues include invalid file formats, network timeouts, or expired credentials.
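The polling, timeout, and error-handling tips above can be combined into one loop. The following is a sketch under stated assumptions rather than the skill's actual implementation: checkJob is a hypothetical caller-supplied function that wraps the SDK's status request and maps its response to { done, ok, data, error }. The response fields themselves are not described here, so that mapping is left to the caller. The 15-second default interval is simply a value inside the 10-30 second range mentioned above.

```javascript
// Convert a polling interval and an overall ceiling into a maximum attempt count.
function maxAttempts(intervalSeconds, ceilingMinutes = 120) {
  return Math.ceil((ceilingMinutes * 60) / intervalSeconds);
}

// Poll until the job reports completion or the processing ceiling is reached.
// checkJob (hypothetical) resolves to { done, ok, data, error }.
async function pollUntilDone(checkJob, options = {}) {
  const {
    intervalSeconds = 15,
    ceilingMinutes = 120,
    // Injectable so the wait can be replaced in tests.
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;

  const attempts = maxAttempts(intervalSeconds, ceilingMinutes);
  for (let i = 0; i < attempts; i += 1) {
    const result = await checkJob();
    if (result.done) {
      // Verify the final status instead of assuming success.
      if (!result.ok) {
        throw new Error(`DocMind job failed: ${result.error ?? "unknown error"}`);
      }
      return result.data;
    }
    await sleep(intervalSeconds * 1000);
  }
  throw new Error(`DocMind job did not finish within ${ceilingMinutes} minutes`);
}
```

Deriving the attempt count from the interval keeps the loop bounded by the 120-minute window: a 15-second interval gives 480 attempts, and a 30-second interval gives 240.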
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-cinience-alicloud-ai-text-document-mind": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: network-access, file-read, external-api
Related Skills
volcengine-compute-ecs
Manage Volcengine ECS instances and related resources. Use when users need instance inventory, lifecycle operations, troubleshooting, or automation templates for ECS.
alicloud-ai-search-opensearch
Use OpenSearch vector search edition via the Python SDK (ha3engine) to push documents and run HA/SQL searches. Ideal for RAG and vector retrieval pipelines in Claude Code/Codex.
alicloud-storage-oss-ossutil
Alibaba Cloud OSS CLI (ossutil 2.0) skill. Install, configure, and operate OSS from the command line based on the official ossutil overview.
alicloud-platform-openapi-product-api-discovery
Discover and reconcile Alibaba Cloud product catalogs from Ticket System, Support & Service, and BSS OpenAPI; fetch OpenAPI product/version/API metadata; and summarize API coverage to plan new skills. Use when you need a complete product list, product-to-API mapping, or coverage/gap reports for skill generation.
alicloud-ai-image-qwen-image
Generate images with Model Studio DashScope SDK using Qwen Image generation models (qwen-image, qwen-image-plus, qwen-image-max and snapshots). Use when implementing or documenting image.generate requests/responses, mapping prompt/negative_prompt/size/seed/reference_image, or integrating image generation into the video-agent pipeline.