Official Verified

layout-analyzer

Analyze document structure and layout using surya - detect text blocks, tables, and reading order

skill-install — Terminal

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/lijie420461340/layout-analyzer

Download Source Code (.zip)

Layout Analyzer Skill

Overview

This skill enables document layout analysis using surya - an advanced document understanding system. Detect text blocks, tables, figures, headings, and determine reading order in complex documents.

How to Use

Provide the document image or PDF
Specify what layout elements to detect
I'll analyze the structure and return detected regions

Example prompts:

"Analyze the layout of this document page"
"Detect all tables and text blocks in this image"
"Determine the reading order for this PDF page"
"Find headings and paragraphs in this document"

Domain Knowledge

surya Fundamentals

from surya.detection import DetectionPredictor
from surya.layout import LayoutPredictor
from surya.reading_order import ReadingOrderPredictor
from PIL import Image

# Load image
image = Image.open("document.png")

# Detect layout elements
layout_predictor = LayoutPredictor()
layout_result = layout_predictor([image])

Layout Element Types

Element	Description
Text	Regular paragraph text
Title	Document/section titles
Section-header	Section headings
List-item	Bulleted/numbered items
Table	Tabular data
Figure	Images/diagrams
Caption	Figure/table captions
Footnote	Footnotes
Formula	Mathematical equations
Page-header	Headers
Page-footer	Footers

Text Detection

from surya.detection import DetectionPredictor
from PIL import Image

# Initialize detector
detector = DetectionPredictor()

# Load image
image = Image.open("document.png")

# Detect text regions
results = detector([image])

# Access results
for page_result in results:
    for bbox in page_result.bboxes:
        print(f"Text region: {bbox.bbox}")
        print(f"Confidence: {bbox.confidence}")

Layout Analysis

from surya.layout import LayoutPredictor
from PIL import Image

# Initialize layout predictor
layout_predictor = LayoutPredictor()

# Analyze layout
image = Image.open("document.png")
layout_results = layout_predictor([image])

# Process results
for page_result in layout_results:
    for element in page_result.bboxes:
        print(f"Type: {element.label}")
        print(f"Bbox: {element.bbox}")
        print(f"Confidence: {element.confidence}")

Reading Order Detection

from surya.reading_order import ReadingOrderPredictor
from surya.layout import LayoutPredictor
from PIL import Image

# Get layout first
layout_predictor = LayoutPredictor()
image = Image.open("document.png")
layout_results = layout_predictor([image])

# Determine reading order
reading_order_predictor = ReadingOrderPredictor()
order_results = reading_order_predictor([image], layout_results)

# Access ordered elements
for page_result in order_results:
    for i, element in enumerate(page_result.ordered_bboxes):
        print(f"{i+1}. {element.label}: {element.bbox}")

OCR with Layout

Read Full Documentation on GitHub

Metadata

Author@lijie420461340

Stars1656

Updated2026-02-28

View Author Profile

AI Skill Finder

Not sure this is the right skill?

Describe what you want to build — we'll match you to the best skill from 16,000+ options.

Find the right skill

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-lijie420461340-layout-analyzer": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Related Skills

align

Data and text alignment reference — sequence alignment, text formatting, memory alignment, and CSS/layout alignment. Use when aligning sequences, formatting columnar output, or understanding byte alignment.

bytesagain3 4097

footer

Footer design reference — layout patterns, sticky footers, SEO, accessibility, legal requirements. Use when designing web page footers or implementing responsive footer components.

bytesagain 3500

layout

Generate CSS layouts. Use when building grid or flexbox layouts, creating responsive breakpoints, or scaffolding HTML pages.