
heteromind

Unified heterogeneous knowledge QA system. Automatically routes natural language queries to SQL databases, Knowledge Graphs, or table files using 4-layer detection (rule-based, LLM semantic, schema matching, entity verification). Supports multi-LLM providers and bilingual queries. Trigger on data queries, "how many", "show", aggregations, filters, joins, or structured information requests.


Install via CLI (Recommended)

clawhub install openclaw/skills/skills/bahuia/heteromind

HeteroMind

Unified heterogeneous knowledge QA system with automatic source detection and multi-stage reasoning.

Core Concept

Natural language queries are automatically routed to the appropriate knowledge source (SQL, Knowledge Graph, or Table files) without requiring users to specify the data source. A 4-layer detection architecture ensures accurate source identification, followed by multi-stage query generation with self-revision and voting.

User Query → Source Detection (4 layers) → Query Generation → Self-Revision → Voting → Execution → Answer

When to Use

| Trigger | Action |
| --- | --- |
| "How many employees in X?" | NL2SQL engine |
| "Who is the founder of X?" | NL2SPARQL engine (KG) |
| "Which quarter had highest sales?" | TableQA engine |
| "Show average salary by department" | Auto-detect SQL |
| Queries with aggregations, filters, joins | Route to SQL |
| Entity relationship queries | Route to KG |
| Questions about CSV/Excel files | Route to TableQA |
| Multi-hop queries across sources | Decompose + fuse |
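The trigger-to-engine mapping above can be approximated with a small keyword and regex pre-filter. The sketch below is illustrative only: the keyword lists, patterns, and the names `rule_scores` and `route` are assumptions, not the skill's actual rules or API.

```python
import re
from typing import Dict, Optional

# Hypothetical keyword lists; the skill's real lists (20+ per source) are not shown here.
KEYWORDS = {
    "sql": ["how many", "average", "count", "total", "by department"],
    "kg": ["who is", "founder", "related to", "born in"],
    "table": ["csv", "excel", "spreadsheet", "quarter"],
}

# Hypothetical regex patterns, one per source type.
PATTERNS = {
    "sql": [re.compile(r"\b(sum|avg|average|count|max|min)\b", re.I)],
    "kg": [re.compile(r"\bwho (is|was|founded)\b", re.I)],
    "table": [re.compile(r"\.(csv|xlsx?)\b", re.I)],
}


def rule_scores(query: str) -> Dict[str, float]:
    """Count keyword and pattern hits per source, normalized to sum to 1."""
    lowered = query.lower()
    hits = {}
    for source, keywords in KEYWORDS.items():
        count = sum(kw in lowered for kw in keywords)
        count += sum(bool(p.search(query)) for p in PATTERNS[source])
        hits[source] = float(count)
    total = sum(hits.values())
    return {s: (h / total if total else 0.0) for s, h in hits.items()}


def route(query: str) -> Optional[str]:
    """Return the best-scoring source, or None when no rule fires."""
    scores = rule_scores(query)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

A query that matches no rule returns `None`, which is where a semantic (LLM) layer would presumably take over.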

Architecture

4-Layer Source Detection

Layer 1 (15%): Rule-Based
  - 20+ keywords per source type
  - 7 regex patterns (aggregation, comparison, relation)
  - Fast pre-filtering

Layer 2 (35%): LLM Semantic
  - Intent classification
  - Entity/predicate detection
  - Multi-hop identification

Layer 3a (25%): SQL Schema Match
  - Inverted index on tables/columns
  - Automatic JOIN inference
  - Confidence scoring

Layer 3b (25%): KG Entity Link
  - Entity mention extraction
  - SPARQL endpoint lookup
  - Predicate pattern matching

Layer 3c (25%, +30% boost when verified): Entity Verification
  - Cross-source entity existence check
  - 30% score boost for verified entities

Layer 4: Multi-Source Fusion
  - Weighted aggregation
  - Execution plan generation
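As a rough illustration of the weighted aggregation in Layer 4, the sketch below combines per-layer scores using the percentages listed above. How the Layer 3 checks combine is not specified here, so this sketch assumes each source gets one 25% source-specific check and that the 30% boost is multiplicative. The names `fuse` and `execution_plan` are hypothetical.

```python
from typing import Dict, List, Set

# Weights from the layer list; "source_check" stands for whichever Layer 3
# check applies to a source (an assumption of this sketch).
WEIGHTS = {"rule": 0.15, "llm": 0.35, "source_check": 0.25}
VERIFIED_BOOST = 1.30  # "30% score boost", assumed to be multiplicative


def fuse(layer_scores: Dict[str, Dict[str, float]], verified: Set[str]) -> Dict[str, float]:
    """Combine per-layer scores (each in [0, 1]) into one score per source."""
    fused = {}
    for source, scores in layer_scores.items():
        total = sum(w * scores.get(layer, 0.0) for layer, w in WEIGHTS.items())
        if source in verified:
            total *= VERIFIED_BOOST  # boost sources whose entities were verified
        fused[source] = total
    return fused


def execution_plan(fused: Dict[str, float]) -> List[str]:
    """Order sources from most to least likely."""
    return sorted(fused, key=fused.get, reverse=True)
```

For example, a source scoring 1.0, 0.8, and 0.6 on the three layers gets 0.15 + 0.28 + 0.15 = 0.58, or 0.754 after the boost.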

Query Generation Pipeline

1. Schema/Entity Linking   → Identify relevant tables/columns/entities
2. Parallel Generation     → Generate 3 candidates concurrently
3. Multi-Round Revision    → 2 rounds of self-review
4. Validation              → Syntax and semantic checks
5. Voting                  → Select best candidate
6. Execution               → Run query
7. Result Verification     → Validate reasonableness

Engines

NL2SQL Engine

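import asyncio

from src.engines.nl2sql.multi_stage_engine import MultiStageNL2SQLEngine


async def main(schema):
    # `schema` describes the target database; its format is not shown here.
    engine = MultiStageNL2SQLEngine({
        "name": "sql_engine",
        "schema": schema,
        "llm_config": {
            "model": "deepseek-chat",
            "api_key": "sk-...",  # placeholder; supply your own key
        },
        "generation_config": {
            "num_candidates": 3,          # SQL candidates generated per query
            "max_revisions": 2,           # self-revision rounds
            "parallel_generation": True,  # generate candidates concurrently
        },
    })

    # execute() is awaited, so it must run inside a coroutine.
    return await engine.execute("How many employees in Engineering?", {})


# `schema` must be defined by the caller before this line runs.
result = asyncio.run(main(schema))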

Features:

  • Schema linking (rule-based + LLM)
  • Parallel SQL candidate generation
  • Multi-round self-revision
  • Voting mechanism
  • Result verification
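The feature list does not say how voting or result verification work. One common approach, shown here purely as an assumption, is execution-based voting: run every candidate, discard those that fail, and keep the query whose result set the most candidates agree on. The sketch uses the standard `sqlite3` module with a made-up `employees` table; `vote_by_execution` is a hypothetical helper.

```python
import sqlite3
from collections import defaultdict
from typing import List, Optional, Tuple


def vote_by_execution(conn: sqlite3.Connection, candidates: List[str]) -> Optional[Tuple[str, list]]:
    """Group candidates by their result rows and return the majority group."""
    groups = defaultdict(list)
    for sql in candidates:
        try:
            rows = tuple(conn.execute(sql).fetchall())
        except sqlite3.Error:
            continue  # a candidate that fails to execute gets no vote
        groups[rows].append(sql)
    if not groups:
        return None
    best_rows = max(groups, key=lambda rows: len(groups[rows]))
    return groups[best_rows][0], list(best_rows)


# Demo data (invented for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Ada", "Engineering"), (2, "Bo", "Engineering"), (3, "Cy", "Sales")],
)

candidates = [
    "SELECT COUNT(*) FROM employees WHERE dept = 'Engineering'",
    "SELECT COUNT(id) FROM employees WHERE dept = 'Engineering'",
    "SELECT COUNT(*) FROM employees",
    "SELECT COUNT(*) FROM staff",  # fails: no such table
]
winner = vote_by_execution(conn, candidates)
```

Here two candidates agree on the same result, one disagrees, and one fails, so the first candidate wins. Voting on results rather than query text lets differently written but equivalent queries support each other.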

NL2SPARQL Engine

Metadata

Author: @bahuia
Stars: 4473
Views: 1
Updated: 2026-05-01
Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-bahuia-heteromind": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Safety Note: ClawKit audits metadata but not runtime behavior. Use with caution.