
azure-doc-ocr

Extract text and structured data from documents using Azure Document Intelligence (formerly Form Recognizer). Runs OCR on PDFs, images, and scanned documents, including handwritten and CJK text, and extracts tables, forms, invoices, receipts, ID documents, business cards, and tax forms. Uses the REST API v4.0 (api-version 2024-11-30) with a prebuilt model for each document type.

Triggers: OCR, text extraction, Azure Document Intelligence, PDF OCR, image OCR, scanned documents, handwriting recognition, CJK text extraction, table extraction, invoice processing, receipt scanning, ID document recognition, document parsing, form extraction, Azure Form Recognizer.


Install via CLI (Recommended)

clawhub install openclaw/skills/skills/li-hongmin/azure-doc-ocr


Azure Document Intelligence OCR

Extract text and structured data from documents using the Azure Document Intelligence REST API.

Quick Start

1. Environment Setup

Set your Azure Document Intelligence credentials:

export AZURE_DOC_INTEL_ENDPOINT="https://your-resource.cognitiveservices.azure.com"
export AZURE_DOC_INTEL_KEY="your-api-key"
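A script that depends on these two variables can fail early with a clear message when one is missing. The following is a minimal sketch of such a check; `load_config` is a hypothetical helper, not necessarily how the bundled scripts read their configuration.

```python
import os


def load_config(env=None):
    """Return (endpoint, key) from the environment, or raise if either is missing.

    Hypothetical helper for illustration; the variable names come from the
    setup step above.
    """
    env = os.environ if env is None else env
    names = ("AZURE_DOC_INTEL_ENDPOINT", "AZURE_DOC_INTEL_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Strip a trailing slash so the endpoint can be joined with an API path.
    return env["AZURE_DOC_INTEL_ENDPOINT"].rstrip("/"), env["AZURE_DOC_INTEL_KEY"]
```

Passing a plain dict as `env` makes the check easy to exercise without touching the real environment.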

2. Single File OCR

# Basic text extraction from PDF
python scripts/ocr_extract.py document.pdf

# Extract with layout (tables, structure)
python scripts/ocr_extract.py document.pdf --model prebuilt-layout --format markdown

# Process invoice
python scripts/ocr_extract.py invoice.pdf --model prebuilt-invoice --format json

# OCR from URL
python scripts/ocr_extract.py --url "https://example.com/document.pdf"

# Save output to file
python scripts/ocr_extract.py document.pdf --output result.txt

# Extract specific pages
python scripts/ocr_extract.py document.pdf --pages 1-3,5
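For readers curious about what a command like these does against the service, here is a rough sketch of the underlying v4.0 REST flow for a document at a URL: a POST to the model's `:analyze` route starts the job, the response carries an `Operation-Location` header, and that URL is polled until the status is `succeeded`. The function names are hypothetical and this is not the code in scripts/ocr_extract.py; the route, header names, and status values reflect the public REST API as best understood here and should be checked against the official reference.

```python
import json
import time
import urllib.parse
import urllib.request

API_VERSION = "2024-11-30"


def build_analyze_request(endpoint, key, model, url_source, pages=None):
    """Build the POST request that starts analysis of a document at a URL."""
    query = {"api-version": API_VERSION}
    if pages:
        query["pages"] = pages  # page selection string such as "1-3,5"
    url = (f"{endpoint.rstrip('/')}/documentintelligence/documentModels/"
           f"{model}:analyze?{urllib.parse.urlencode(query)}")
    body = json.dumps({"urlSource": url_source}).encode("utf-8")
    headers = {"Ocp-Apim-Subscription-Key": key, "Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def start_analysis(request):
    """Send the request and return the URL to poll for the result."""
    with urllib.request.urlopen(request) as resp:
        return resp.headers["Operation-Location"]


def fetch_json(url, key):
    """GET a JSON payload using the subscription key header."""
    req = urllib.request.Request(url, headers={"Ocp-Apim-Subscription-Key": key})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_result(operation_url, key, fetch=fetch_json,
                    interval=1.0, timeout=120.0, sleep=time.sleep):
    """Poll until the operation succeeds or fails, then return analyzeResult."""
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch(operation_url, key)
        status = payload.get("status")
        if status == "succeeded":
            return payload["analyzeResult"]
        if status == "failed":
            raise RuntimeError(f"Analysis failed: {payload.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError("Analysis did not finish in time")
        sleep(interval)
```

The `fetch` and `sleep` parameters are injectable so the polling loop can be tested without network access or real delays.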

3. Batch Processing

# Process all documents in a folder
python scripts/batch_ocr.py ./documents/

# Custom output directory and format
python scripts/batch_ocr.py ./documents/ --output-dir ./extracted/ --format markdown

# Use layout model with 8 workers
python scripts/batch_ocr.py ./documents/ --model prebuilt-layout --workers 8

# Filter specific extensions
python scripts/batch_ocr.py ./documents/ --ext .pdf,.png
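The batch options above (extension filter, worker count) map naturally onto a thread pool. The sketch below shows one way this could be structured; `find_documents` and `run_batch` are hypothetical names, the non-recursive folder scan is an assumption, and `process` stands in for whatever per-file OCR function is used.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path


def find_documents(folder, extensions=(".pdf", ".png")):
    """List files directly inside folder whose extension matches (case-insensitive)."""
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in wanted
    )


def run_batch(paths, process, workers=4):
    """Run process(path) for each path in parallel.

    Returns (results, errors): two dicts keyed by path, so one failing
    document does not abort the rest of the batch.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(process, path): path for path in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:  # collect the failure and keep going
                errors[path] = exc
    return results, errors
```

Threads suit this workload because each task mostly waits on network I/O; a higher worker count may still run into the service's rate limits.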

Model Selection Guide

Document Type	Recommended Model	Use Case
General text	`prebuilt-read`	Pure text extraction, any document
Structured docs	`prebuilt-layout`	Tables, forms, paragraphs, figures
Invoices	`prebuilt-invoice`	Vendor info, line items, totals
Receipts	`prebuilt-receipt`	Merchant, items, totals, dates
IDs/Passports	`prebuilt-idDocument`	Identity documents
Business cards	`prebuilt-businessCard`	Contact information
W-2 forms	`prebuilt-tax.us.w2`	US tax documents
Insurance cards	`prebuilt-healthInsuranceCard.us`	Health insurance info

See references/models.md for detailed model documentation.
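When the document type is known ahead of time, the table above can be expressed as a simple lookup. This is an illustrative sketch: the dictionary keys and the `choose_model` helper are hypothetical, and falling back to `prebuilt-read` for unknown types is an assumption rather than documented behavior of the scripts.

```python
# Model IDs taken from the table above; the keys are made-up labels.
MODEL_BY_DOCUMENT_TYPE = {
    "general": "prebuilt-read",
    "structured": "prebuilt-layout",
    "invoice": "prebuilt-invoice",
    "receipt": "prebuilt-receipt",
    "id": "prebuilt-idDocument",
    "business_card": "prebuilt-businessCard",
    "w2": "prebuilt-tax.us.w2",
    "health_insurance_card": "prebuilt-healthInsuranceCard.us",
}


def choose_model(document_type):
    """Return the model ID for a document type, defaulting to plain text extraction."""
    return MODEL_BY_DOCUMENT_TYPE.get(document_type, "prebuilt-read")
```

The returned ID is what would be passed to the `--model` option.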

Supported Input Formats

PDF: .pdf (including scanned PDFs)
Images: .png, .jpg, .jpeg, .tiff, .bmp
URLs: Direct links to documents

Output Formats

text: Plain text concatenation of all extracted content
markdown: Structured output with headers and tables (best with layout model)
json: Raw API response with full extraction details
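To illustrate how the markdown format can be derived from the JSON response, the sketch below renders one entry of the result's `tables` array as a markdown table. It assumes the layout model's table shape (`rowCount`, `columnCount`, and `cells` with `rowIndex`, `columnIndex`, `content`), treats the first row as the header, and ignores merged cells; `table_to_markdown` is a hypothetical helper, not the script's actual renderer.

```python
def table_to_markdown(table):
    """Render one table from an analyze result as a markdown table."""
    rows = [[""] * table["columnCount"] for _ in range(table["rowCount"])]
    for cell in table["cells"]:
        # Keep each cell on one line and escape pipes so the table stays valid.
        text = cell.get("content", "").replace("\n", " ").replace("|", "\\|")
        rows[cell["rowIndex"]][cell["columnIndex"]] = text
    if not rows:
        return ""
    header, *body = rows
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)


sample = {
    "rowCount": 2,
    "columnCount": 2,
    "cells": [
        {"rowIndex": 0, "columnIndex": 0, "content": "Item"},
        {"rowIndex": 0, "columnIndex": 1, "content": "Qty"},
        {"rowIndex": 1, "columnIndex": 0, "content": "Pen"},
        {"rowIndex": 1, "columnIndex": 1, "content": "2"},
    ],
}
print(table_to_markdown(sample))
# | Item | Qty |
# | --- | --- |
# | Pen | 2 |
```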

Features

Handwriting Recognition: Extracts handwritten text alongside printed text
CJK Support: Full support for Chinese, Japanese, Korean characters
Table Extraction: Preserves table structure (use layout model)
Multi-page Processing: Handles documents with multiple pages
Concurrent Processing: Batch script supports parallel processing
URL Input: Process documents directly from URLs
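As an example of working with the handwriting feature, the JSON result marks handwritten regions in a `styles` array whose entries carry `isHandwritten` and a list of `spans` (offset and length into the top-level `content` string). The sketch below pulls those snippets out; `handwritten_text` is a hypothetical helper, and it assumes the analysis was requested with `stringIndexType=unicodeCodePoint` so that offsets line up with Python string indexes.

```python
def handwritten_text(analyze_result):
    """Return the pieces of content that the service flagged as handwritten."""
    content = analyze_result.get("content", "")
    pieces = []
    for style in analyze_result.get("styles", []):
        if not style.get("isHandwritten"):
            continue
        for span in style.get("spans", []):
            start = span["offset"]
            pieces.append(content[start:start + span["length"]])
    return pieces
```

With a different string index type, offsets may drift on text that contains combining characters or characters outside the Basic Multilingual Plane.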

Environment Variables

AZURE_DOC_INTEL_ENDPOINT: Endpoint URL of your Azure Document Intelligence resource
AZURE_DOC_INTEL_KEY: API key for that resource

Metadata

Author: @li-hongmin

Stars: 1656

Updated: 2026-02-28


Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-li-hongmin-azure-doc-ocr": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Safety Note: ClawKit audits metadata but not runtime behavior. Use with caution.
