Official Verified developer tools Safety 4/5

alicloud-ai-text-document-mind

Use Document Mind (DocMind) via Node.js SDK to submit document parsing jobs and poll results. Designed for Claude Code/Codex document understanding workflows.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/cinience/alicloud-ai-text-document-mind

What This Skill Does

The alicloud-ai-text-document-mind skill gives your OpenClaw agent access to Alibaba Cloud's DocMind document-parsing service. It extracts document structure, text content, and layout information from file types including PDFs and images. Because DocMind processes jobs asynchronously, the skill lets the agent submit large documents, poll for their processing status, and retrieve structured results once a job completes. This is particularly useful for workflows involving RAG (Retrieval-Augmented Generation), automated document auditing, and data extraction from unstructured corporate documents.

Installation

To integrate this skill into your OpenClaw environment, execute the following command in your terminal:

clawhub install openclaw/skills/skills/cinience/alicloud-ai-text-document-mind

Ensure you have configured your Alibaba Cloud credentials by setting the ALICLOUD_ACCESS_KEY_ID, ALICLOUD_ACCESS_KEY_SECRET, and ALICLOUD_REGION_ID environment variables. The skill relies on the @alicloud/docmind-api20220711 SDK, which will be resolved during the installation process.
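A quick pre-flight check can confirm that the three environment variables are present before the agent submits a job. The sketch below is illustrative only: the helper names missingCredentials and assertCredentials are hypothetical and are not part of the skill or the SDK.

```javascript
// Environment variable names taken from the installation notes above.
const REQUIRED_VARS = [
  "ALICLOUD_ACCESS_KEY_ID",
  "ALICLOUD_ACCESS_KEY_SECRET",
  "ALICLOUD_REGION_ID",
];

// Hypothetical helper: returns the names of required variables
// that are unset or contain only whitespace.
function missingCredentials(env = process.env) {
  return REQUIRED_VARS.filter(
    (name) => typeof env[name] !== "string" || env[name].trim() === ""
  );
}

// Hypothetical helper: throws a descriptive error if anything is missing.
function assertCredentials(env = process.env) {
  const missing = missingCredentials(env);
  if (missing.length > 0) {
    throw new Error(`Missing Alibaba Cloud settings: ${missing.join(", ")}`);
  }
}
```

Calling assertCredentials() once at startup would surface a configuration problem immediately, rather than as a failed request later in the workflow.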

Use Cases

  • Automated Data Extraction: Automatically parse multi-page PDFs or scanned reports and convert them into machine-readable JSON formats.
  • Content Indexing: Feed parsed document layouts into vector databases to improve the accuracy of RAG systems.
  • Compliance Auditing: Extract specific tables or entities from financial and legal documents for automated verification.
  • Document Summarization: Extract raw text from complex documents with non-standard layouts, ensuring the agent has high-quality context before generating summaries.

Example Prompts

  1. "Parse the document at https://example.com/financial-report.pdf using DocMind and extract all tables into a JSON file."
  2. "Can you use DocMind to process my local file './q3_results.pdf' and provide a summary of the key performance indicators?"
  3. "Submit the document at this URL to DocMind and keep checking the status until it's finished, then show me the extracted text."

Tips & Limitations

  • Asynchronous Workflow: DocMind jobs are asynchronous. Always design your agent tasks to handle the polling loop (typically every 10-30 seconds).
  • Accessibility: For URL-based submissions, ensure the file is hosted on a publicly reachable server. For private files, use the local file upload method via SubmitDocStructureJobAdvanceRequest.
  • Timeout Management: The service has a maximum processing window of 120 minutes. Set your polling logic to account for this ceiling.
  • Error Handling: Always verify the return status of the polling requests; common issues include invalid file formats, network timeouts, or expired credentials.
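The polling, timeout, and error-handling tips above can be combined into one loop. This is a minimal sketch under stated assumptions, not the skill's actual implementation: checkStatus is a hypothetical caller-supplied function standing in for whatever DocMind SDK call fetches the job result, and the { done, failed, data, message } shape it returns is invented for illustration.

```javascript
const POLL_INTERVAL_MS = 15000;          // inside the 10-30 second range suggested above
const MAX_WAIT_MS = 120 * 60 * 1000;     // the 120-minute processing ceiling

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// checkStatus (hypothetical): async function resolving to
// { done: boolean, failed?: boolean, data?: any, message?: string }.
async function pollUntilDone(checkStatus, options = {}) {
  const intervalMs = options.intervalMs !== undefined ? options.intervalMs : POLL_INTERVAL_MS;
  const maxWaitMs = options.maxWaitMs !== undefined ? options.maxWaitMs : MAX_WAIT_MS;
  const deadline = Date.now() + maxWaitMs;

  while (true) {
    const status = await checkStatus();

    // Surface service-side failures (bad file format, expired credentials, ...).
    if (status.failed) {
      throw new Error(`DocMind job failed: ${status.message || "unknown error"}`);
    }
    if (status.done) {
      return status.data;
    }
    // Stop if the next wait would run past the overall ceiling.
    if (Date.now() + intervalMs > deadline) {
      throw new Error("Timed out waiting for DocMind job");
    }
    await sleep(intervalMs);
  }
}
```

Passing the status check in as a function keeps the loop independent of any particular SDK version and lets it be exercised with a stub before real credentials are involved.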

Metadata

Author: @cinience
Stars: 3562
Views: 1
Updated: 2026-03-29

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-cinience-alicloud-ai-text-document-mind": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags (AI)

#docmind #alicloud #document-parsing #ocr #automation
Safety Score: 4/5

Flags: network-access, file-read, external-api