layout-analyzer
Analyze document structure and layout using surya - detect text blocks, tables, and reading order
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/lijie420461340/layout-analyzer
Layout Analyzer Skill
Overview
This skill performs document layout analysis with surya, an open-source document OCR and layout analysis toolkit. It detects text blocks, tables, figures, and headings, and determines reading order in complex documents.
How to Use
- Provide the document image or PDF
- Specify what layout elements to detect
- I'll analyze the structure and return detected regions
Example prompts:
- "Analyze the layout of this document page"
- "Detect all tables and text blocks in this image"
- "Determine the reading order for this PDF page"
- "Find headings and paragraphs in this document"
Domain Knowledge
surya Fundamentals
from surya.detection import DetectionPredictor
from surya.layout import LayoutPredictor
from surya.reading_order import ReadingOrderPredictor
from PIL import Image
# Load image
image = Image.open("document.png")
# Detect layout elements
layout_predictor = LayoutPredictor()
layout_result = layout_predictor([image])
Layout Element Types
| Element | Description |
|---|---|
| Text | Regular paragraph text |
| Title | Document/section titles |
| Section-header | Section headings |
| List-item | Bulleted/numbered items |
| Table | Tabular data |
| Figure | Images/diagrams |
| Caption | Figure/table captions |
| Footnote | Footnotes |
| Formula | Mathematical equations |
| Page-header | Headers |
| Page-footer | Footers |
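The labels in the table can drive simple post-processing. The following is a minimal sketch, not part of surya itself: the `Element` class is a hypothetical stand-in for a detected element, and the `(x1, y1, x2, y2)` pixel layout of `bbox` is an assumption. It drops running headers and footers so only body content remains.

```python
from dataclasses import dataclass

# Labels copied from the table above; these two mark running page furniture.
FURNITURE_LABELS = {"Page-header", "Page-footer"}


@dataclass
class Element:
    """Hypothetical stand-in for one detected layout element."""
    label: str
    bbox: tuple  # assumed to be (x1, y1, x2, y2) in pixels


def strip_furniture(elements):
    """Keep body content, drop running headers and footers."""
    return [e for e in elements if e.label not in FURNITURE_LABELS]


sample = [
    Element("Page-header", (50, 10, 550, 30)),
    Element("Title", (50, 60, 550, 100)),
    Element("Text", (50, 120, 550, 400)),
    Element("Page-footer", (50, 760, 550, 780)),
]
body = strip_furniture(sample)
print([e.label for e in body])
```

With real results, the same filter could be applied to whatever objects the layout predictor returns, as long as they expose a `label` attribute.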
Text Detection
from surya.detection import DetectionPredictor
from PIL import Image
# Initialize detector
detector = DetectionPredictor()
# Load image
image = Image.open("document.png")
# Detect text regions
results = detector([image])
# Access results
for page_result in results:
    for bbox in page_result.bboxes:
        print(f"Text region: {bbox.bbox}")
        print(f"Confidence: {bbox.confidence}")
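Detected text regions are often cropped out for further processing, and tight boxes can clip ascenders or descenders. As a hedged sketch, the hypothetical helper `pad_bbox` below adds a margin around a box and clamps it to the page. It assumes boxes are `(x1, y1, x2, y2)` in pixels, which the text above does not state explicitly.

```python
def pad_bbox(bbox, pad, page_width, page_height):
    """Grow a box by `pad` pixels on every side, clamped to the page bounds.

    Assumes bbox is (x1, y1, x2, y2) with the origin at the top-left corner.
    """
    x1, y1, x2, y2 = bbox
    return (
        max(0, x1 - pad),
        max(0, y1 - pad),
        min(page_width, x2 + pad),
        min(page_height, y2 + pad),
    )


# A text line that nearly touches both side edges of a 600x800 page.
padded = pad_bbox((5, 100, 595, 124), pad=8, page_width=600, page_height=800)
print(padded)  # (0, 92, 600, 132)
```

The resulting tuple has the shape that `PIL.Image.Image.crop` expects for its box argument.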
Layout Analysis
from surya.layout import LayoutPredictor
from PIL import Image
# Initialize layout predictor
layout_predictor = LayoutPredictor()
# Analyze layout
image = Image.open("document.png")
layout_results = layout_predictor([image])
# Process results
for page_result in layout_results:
    for element in page_result.bboxes:
        print(f"Type: {element.label}")
        print(f"Bbox: {element.bbox}")
        print(f"Confidence: {element.confidence}")
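Once labels, boxes, and confidences are available, a common next step is to bucket them by element type. The sketch below works on plain `(label, bbox, confidence)` tuples rather than surya objects so that it runs on its own. The function name `group_by_label` and the 0.5 threshold are illustrative assumptions, not part of the library.

```python
from collections import defaultdict


def group_by_label(elements, min_confidence=0.0):
    """Map each label to the list of boxes detected with enough confidence."""
    groups = defaultdict(list)
    for label, bbox, confidence in elements:
        if confidence >= min_confidence:
            groups[label].append(bbox)
    return dict(groups)


# Hypothetical detections for one page: (label, bbox, confidence).
elements = [
    ("Title", [50, 40, 550, 80], 0.99),
    ("Text", [50, 100, 550, 300], 0.95),
    ("Table", [50, 320, 550, 600], 0.91),
    ("Text", [50, 620, 550, 700], 0.42),
]
grouped = group_by_label(elements, min_confidence=0.5)
print(grouped)
```

To feed real results into it, one could build the tuples from `element.label`, `element.bbox`, and `element.confidence` in the loop shown above.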
Reading Order Detection
from surya.reading_order import ReadingOrderPredictor
from surya.layout import LayoutPredictor
from PIL import Image
# Get layout first
layout_predictor = LayoutPredictor()
image = Image.open("document.png")
layout_results = layout_predictor([image])
# Determine reading order
reading_order_predictor = ReadingOrderPredictor()
order_results = reading_order_predictor([image], layout_results)
# Access ordered elements
for page_result in order_results:
    for i, element in enumerate(page_result.ordered_bboxes):
        print(f"{i+1}. {element.label}: {element.bbox}")
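For comparison, here is a naive geometric baseline for reading order. It is not how surya's model works. It only sorts boxes top to bottom, then left to right within a row, and it will mis-order multi-column pages. The `row_tolerance` parameter and the `(x1, y1, x2, y2)` box layout are assumptions made for this sketch.

```python
def naive_reading_order(bboxes, row_tolerance=10):
    """Sort boxes top-to-bottom, then left-to-right within the same row.

    Boxes whose top edges fall in the same `row_tolerance`-pixel band are
    treated as one row. Suitable for single-column pages only.
    """
    return sorted(bboxes, key=lambda b: (b[1] // row_tolerance, b[0]))


boxes = [
    (300, 102, 550, 130),  # right half of the second row
    (50, 100, 280, 130),   # left half of the second row
    (50, 40, 550, 80),     # title at the top
    (50, 150, 550, 300),   # paragraph below
]
ordered = naive_reading_order(boxes)
for i, box in enumerate(ordered, start=1):
    print(f"{i}. {box}")
```

A baseline like this can serve as a sanity check against a learned predictor's output on simple pages.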
OCR with Layout
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-lijie420461340-layout-analyzer": {
      "enabled": true,
      "auto_update": true
    }
  }
}
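If clawhub.json already lists other plugins, the entry has to be merged in rather than pasted over the file. The following is a minimal sketch of that merge on an in-memory dictionary. The helper `enable_plugin` and the `some-other-plugin` entry are hypothetical, and reading or writing the actual file is left out.

```python
import json

PLUGIN_ID = "official-lijie420461340-layout-analyzer"


def enable_plugin(config, plugin_id=PLUGIN_ID):
    """Return a copy of `config` with the plugin entry added.

    Existing plugin entries are preserved; the input is not modified.
    """
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))
    plugins[plugin_id] = {"enabled": True, "auto_update": True}
    merged["plugins"] = plugins
    return merged


# Hypothetical existing configuration with one other plugin enabled.
existing = {"plugins": {"some-other-plugin": {"enabled": True}}}
updated = enable_plugin(existing)
print(json.dumps(updated, indent=2))
```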