Official Verified

siphonclaw

Document intelligence pipeline with visual search, OCR, and field capture


Install via CLI (Recommended)

clawhub install openclaw/skills/skills/curtisgc1/siphonclaw

SiphonClaw

Domain-agnostic document intelligence pipeline. Ingest PDFs, images, and spreadsheets into a searchable knowledge base with dual-track retrieval (text + visual), OCR, confidence scoring, and field capture.

Built for field service engineers, researchers, mechanics, and anyone who needs fast answers from large document collections.

What SiphonClaw Does

  • Ingest documents (PDF, Excel, images, screenshots) into a local vector database with text and visual embeddings
  • Search using triple hybrid retrieval: BM25 keyword matching + semantic text vectors + visual page embeddings, fused with RRF and reranked with a cross-encoder
  • Identify equipment, parts, or components from photos using vision models, then search the local knowledge base
  • Capture field fixes and repair notes as first-class knowledge base entries for future retrieval
  • Score every response with composite confidence (retrieval + faithfulness + relevance + coverage) and footnote-style source citations
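The fusion step in the search bullet above can be sketched with Reciprocal Rank Fusion (RRF). This is a minimal illustration, not SiphonClaw's actual code: the `rrf_fuse` helper, the page IDs, and the constant `k=60` (a commonly used RRF default) are all assumptions, and the cross-encoder reranking stage is left out.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in.
    k=60 is a conventional default, assumed here for illustration only.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical per-track rankings (best match first).
bm25_hits = ["p42", "p15", "p7"]
text_vector_hits = ["p15", "p42", "p3"]
visual_hits = ["p42", "p3", "p15"]

fused = rrf_fuse([bm25_hits, text_vector_hits, visual_hits])
print(fused)
```

RRF only needs ranks, not raw scores, which is why it is a convenient way to combine BM25, text-vector, and visual-embedding results whose scores are not directly comparable.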

MCP Tools

SiphonClaw exposes five tools via MCP for integration with agents and other MCP-compatible clients.


siphonclaw_search

Search the knowledge base using triple hybrid retrieval (text + visual + keyword).

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| query | string | yes | Natural language search query or exact part number / error code |
| top_k | integer | no | Number of results to return (default: 5, max: 20) |
| filters | object | no | Metadata filters (e.g., {"source_type": "service_manual", "model": "ModelA"}) |
| mode | string | no | Search mode: "hybrid" (default), "text", "visual", "keyword" |
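As a rough sketch, the arguments for a siphonclaw_search call could be assembled and checked on the client side before being handed to an MCP client. The `build_search_args` helper is hypothetical, and the lower bound of 1 for top_k is an assumption; only the default, the maximum, and the mode names come from the parameters above.

```python
VALID_MODES = {"hybrid", "text", "visual", "keyword"}


def build_search_args(query, top_k=5, filters=None, mode="hybrid"):
    """Return an arguments dict for siphonclaw_search (hypothetical helper)."""
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query must be a non-empty string")
    # Documented maximum is 20; the minimum of 1 is assumed here.
    if not 1 <= top_k <= 20:
        raise ValueError("top_k must be between 1 and 20")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    args = {"query": query, "top_k": top_k, "mode": mode}
    if filters:
        # Copy so later changes by the caller do not alter the request.
        args["filters"] = dict(filters)
    return args


args = build_search_args(
    "transformer replacement",
    top_k=3,
    filters={"source_type": "service_manual", "model": "ModelA"},
)
print(args)
```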

Returns:

{
  "results": [
    {
      "content": "Extracted text from the matching chunk or page",
      "source": "ServiceManual_ModelA.pdf",
      "page": 42,
      "section": "4.3 Transformer Replacement",
      "score": 0.92,
      "match_type": "hybrid"
    }
  ],
  "confidence": 0.87,
  "confidence_tier": "Confident - verify part number",
  "keywords_used": ["low voltage supply", "assembly mount", "ModelA"],
  "citations": ["[1] ServiceManual_ModelA, page 42", "[2] Parts Catalog PC-1102, page 15"]
}
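A client might turn a response of this shape into a short, human-readable summary. The sketch below assumes the response arrives as a JSON string with exactly the fields shown above; the `summarize` function is a hypothetical helper, not part of SiphonClaw.

```python
import json

# Sample payload mirroring the documented return shape (trimmed to one result).
SAMPLE_RESPONSE = """
{
  "results": [
    {
      "content": "Extracted text from the matching chunk or page",
      "source": "ServiceManual_ModelA.pdf",
      "page": 42,
      "section": "4.3 Transformer Replacement",
      "score": 0.92,
      "match_type": "hybrid"
    }
  ],
  "confidence": 0.87,
  "confidence_tier": "Confident - verify part number",
  "citations": ["[1] ServiceManual_ModelA, page 42"]
}
"""


def summarize(response):
    """Render the confidence tier, ranked hits, and citations as plain text."""
    lines = [f"{response['confidence_tier']} (confidence {response['confidence']:.2f})"]
    for i, hit in enumerate(response["results"], start=1):
        lines.append(
            f"{i}. {hit['source']} p.{hit['page']} "
            f"[{hit['match_type']}, score {hit['score']:.2f}]"
        )
    # Citations are optional in this sketch; fall back to an empty list.
    lines.extend(response.get("citations", []))
    return "\n".join(lines)


summary = summarize(json.loads(SAMPLE_RESPONSE))
print(summary)
```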

siphonclaw_ingest

Add a document or photo to the knowledge base. Supports PDF, Excel, images (JPG/PNG), and screenshots.

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| file_path | string | yes | Absolute path to the file to ingest |
| source_type | string | no | Document type hint: "manual", "parts_catalog", "field_note", "photo", "other" (default: auto-detect) |
| metadata | object | no | Additional metadata to attach (e.g., {"model": "ModelA", "domain": "industrial"}) |
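The parameters above could be validated before calling siphonclaw_ingest, along the lines of the sketch below. The `build_ingest_args` helper and the example path are hypothetical, and leaving source_type out to trigger auto-detection is an assumption based on the documented default.

```python
from pathlib import PurePosixPath, PureWindowsPath

SOURCE_TYPES = {"manual", "parts_catalog", "field_note", "photo", "other"}


def build_ingest_args(file_path, source_type=None, metadata=None):
    """Return an arguments dict for siphonclaw_ingest (hypothetical helper)."""
    # Accept either POSIX or Windows absolute paths, whatever the host OS is.
    is_absolute = (
        PurePosixPath(file_path).is_absolute()
        or PureWindowsPath(file_path).is_absolute()
    )
    if not is_absolute:
        raise ValueError("file_path must be an absolute path")
    args = {"file_path": file_path}
    if source_type is not None:
        if source_type not in SOURCE_TYPES:
            raise ValueError(f"source_type must be one of {sorted(SOURCE_TYPES)}")
        args["source_type"] = source_type
    # Assumption: omitting source_type lets the tool auto-detect the type.
    if metadata:
        args["metadata"] = dict(metadata)
    return args


ingest_args = build_ingest_args(
    "/data/manuals/ServiceManual_ModelA.pdf",  # hypothetical path
    source_type="manual",
    metadata={"model": "ModelA", "domain": "industrial"},
)
print(ingest_args)
```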

Returns:

Metadata

Author: @curtisgc1
Stars: 3409
Views: 0
Updated: 2026-03-25

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-curtisgc1-siphonclaw": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Safety Note: ClawKit audits metadata but not runtime behavior. Use with caution.