Official | Verified | file management | Safety 4/5

word-reader

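Reads Word documents (.docx and .doc formats) and extracts their text content. Supports document parsing, table extraction, image handling, and more. Use it when the user needs to analyze the content of a Word document, extract text, or batch-process documents.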

Why use this skill?

Extract text, tables, and metadata from Word documents with the Word Reader skill. It suits automation, data analysis, and report processing in an AI agent workflow.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/xtfnhcyjpgf/word-reader

What This Skill Does

The Word Reader skill lets OpenClaw read and interpret Word documents in both .docx and .doc formats. It extracts full-text content, structured tables, metadata, and information about embedded images, and returns the result as JSON, Markdown, or plain text. It works equally on a single technical report, a series of project requirements, or a library of legacy documents, so document data can feed directly into AI-driven workflows.

Installation

To add this skill to your environment, run the following command from the command line:

clawhub install openclaw/skills/skills/xtfnhcyjpgf/word-reader

Older .doc files require antiword. On Ubuntu/Debian, install it with sudo apt-get install antiword; on macOS, use brew install antiword. For .docx files, make sure the python-docx library is installed in your Python environment: pip3 install python-docx.
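The skill's own code is not shown on this page, so the following is only a minimal sketch of how the two formats might be handled. It assumes a .docx file is read directly as a ZIP archive using the Python standard library (python-docx offers a more robust route), and that a .doc file is passed to the antiword command, which prints the document text to standard output. The function names are hypothetical and are not part of the skill's API.

```python
import subprocess
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Namespace used by WordprocessingML elements inside word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def read_docx_text(path):
    """Return every paragraph in a .docx file (table cells included) as a list of strings."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across one or more <w:t> elements.
        paragraphs.append("".join(node.text or "" for node in para.iter(W_NS + "t")))
    return paragraphs


def read_doc_text(path):
    """Return the lines of a legacy .doc file by shelling out to antiword."""
    result = subprocess.run(
        ["antiword", str(path)], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()


def read_word_text(path):
    """Dispatch on the file extension (hypothetical helper, not the skill's API)."""
    suffix = Path(path).suffix.lower()
    if suffix == ".docx":
        return read_docx_text(path)
    if suffix == ".doc":
        return read_doc_text(path)
    raise ValueError(f"unsupported file type: {suffix}")
```

Calling read_word_text("project-specs.docx") would return one string per paragraph; the .doc branch only works when antiword is installed and on the PATH.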

Use Cases

This skill is suited to automating document analysis tasks. Use it when you need to:

  • Automate the extraction of data from complex table-heavy reports for database input.
  • Perform bulk metadata harvesting to index document repositories by author or creation date.
  • Convert legacy documentation into Markdown for ingestion into knowledge bases or LLM training pipelines.
  • Streamline content audit processes by identifying image placeholders and text structure within project files.
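As an illustration of the metadata use case above, the sketch below reads the author, creation date, and title from a .docx file. It assumes the standard Office Open XML layout, in which these core properties live in docProps/core.xml inside the archive; the function name and the returned dictionary keys are hypothetical, and the skill's actual output format may differ.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of an Office Open XML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def read_docx_metadata(path):
    """Return author, creation timestamp, and title; missing fields are None."""
    with zipfile.ZipFile(path) as archive:
        if "docProps/core.xml" not in archive.namelist():
            # The core-properties part is optional, so tolerate its absence.
            return {"author": None, "created": None, "title": None}
        root = ET.fromstring(archive.read("docProps/core.xml"))
    return {
        "author": root.findtext("dc:creator", namespaces=NS),
        "created": root.findtext("dcterms:created", namespaces=NS),
        "title": root.findtext("dc:title", namespaces=NS),
    }
```

To index a repository, one could call this function for each path returned by pathlib.Path(folder).rglob("*.docx") and collect the dictionaries into a report.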

Example Prompts

  1. "Please read the project-specs.docx file in the current directory and convert the content into a structured Markdown format for my documentation."
  2. "Extract all the table data from the 'Budget_2024.docx' file and provide the result as JSON so I can load it into a spreadsheet."
  3. "Scan the folder '/docs/archives' for all .docx files, extract the metadata, and compile a report summarizing the authors and creation dates."
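For a sense of what the table-to-JSON request in the second prompt involves, here is a minimal sketch using only the Python standard library. It assumes the tables sit at the top level of the document body and represents each table as a list of rows, each row being a list of cell strings. That JSON shape is an assumption made for illustration and is not necessarily the shape the skill produces.

```python
import json
import zipfile
import xml.etree.ElementTree as ET

W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def cell_text(cell):
    """Join the paragraphs of one table cell with newlines."""
    lines = []
    for para in cell.iter(W_NS + "p"):
        lines.append("".join(t.text or "" for t in para.iter(W_NS + "t")))
    return "\n".join(lines)


def read_docx_tables(path):
    """Return top-level tables as nested lists: tables -> rows -> cell strings."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    body = root.find(W_NS + "body")
    if body is None:
        return []
    tables = []
    for tbl in body.findall(W_NS + "tbl"):  # direct children only, so nested tables are not listed separately
        rows = []
        for row in tbl.findall(W_NS + "tr"):
            rows.append([cell_text(cell) for cell in row.findall(W_NS + "tc")])
        tables.append(rows)
    return tables


def tables_to_json(path):
    """Serialize the extracted tables as a JSON string."""
    return json.dumps(read_docx_tables(path), ensure_ascii=False, indent=2)
```

A nested list like this loads directly into most spreadsheet or dataframe tools; merged cells and nested tables would need extra handling that this sketch leaves out.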

Tips & Limitations

Large files can take a while, because complex parsing is resource-intensive. The skill is optimized for text and structured data: it can report image metadata, but it does not extract the raw image files. If you run into encoding problems with older .doc files, try specifying the encoding parameter. Password-protected files cannot be parsed, because the tool needs direct read access to the content, so remove the protection first. Choose the output format to match how the result will be used: JSON for program integration, Markdown for human readability.

Metadata

Stars: 879
Views: 0
Updated: 2026-02-11

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-xtfnhcyjpgf-word-reader": {
      "enabled": true,
      "auto_update": true
    }
  }
}
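
If clawhub.json already contains other entries, pasting the fragment by hand risks overwriting them. The sketch below shows one way to merge the entry programmatically. It assumes clawhub.json is plain JSON with a top-level "plugins" object, as in the fragment above; the function name and the file location are hypothetical.

```python
import json
from pathlib import Path

# Plugin key copied from the configuration fragment above.
PLUGIN_ID = "official-xtfnhcyjpgf-word-reader"


def enable_plugin(config_path):
    """Add the word-reader entry to clawhub.json, keeping any existing plugins."""
    path = Path(config_path)
    # Start from the existing file if there is one, otherwise from an empty config.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    plugins = config.setdefault("plugins", {})
    plugins[PLUGIN_ID] = {"enabled": True, "auto_update": True}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling enable_plugin("clawhub.json") in the directory that holds the file leaves other plugin entries untouched and only adds or updates this one.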

Tags (AI)

#document-parsing #office-automation #data-extraction #word #productivity
Safety Score: 4/5

Flags: file-read, code-execution