heteromind
Unified heterogeneous knowledge QA system. Automatically routes natural language queries to SQL databases, Knowledge Graphs, or table files using 4-layer detection (rule-based, LLM semantic, schema matching, entity verification). Supports multi-LLM providers and bilingual queries. Trigger on data queries, "how many", "show", aggregations, filters, joins, or structured information requests.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/bahuia/heteromind
HeteroMind
Unified heterogeneous knowledge QA system with automatic source detection and multi-stage reasoning.
Core Concept
Natural language queries are automatically routed to the appropriate knowledge source (SQL database, Knowledge Graph, or table file), so users never have to specify the data source. A 4-layer detection architecture scores the candidate sources, and the selected engine then generates the query in multiple stages with self-revision and voting.
User Query → Source Detection (4 layers) → Query Generation → Self-Revision → Voting → Execution → Answer
When to Use
| Trigger | Action |
|---|---|
| "How many employees in X?" | NL2SQL engine |
| "Who is the founder of X?" | NL2SPARQL engine (KG) |
| "Which quarter had highest sales?" | TableQA engine |
| "Show average salary by department" | Auto-detect SQL |
| Queries with aggregations, filters, joins | Route to SQL |
| Entity relationship queries | Route to KG |
| Questions about CSV/Excel files | Route to TableQA |
| Multi-hop queries across sources | Decompose + fuse |
Architecture
4-Layer Source Detection
Layer 1 (15%): Rule-Based
- 20+ keywords per source type
- 7 regex patterns (aggregation, comparison, relation)
- Fast pre-filtering
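A minimal sketch of what a rule-based pre-filter like Layer 1 could look like. The keyword sets, regex patterns, scoring rule, and the name `rule_scores` are all hypothetical and much smaller than the "20+ keywords" and "7 regex patterns" mentioned above; they are not taken from the HeteroMind source.

```python
import re
from collections import Counter

# Hypothetical keyword lists (the real skill is said to use 20+ per source type).
KEYWORDS = {
    "sql": {"how many", "average", "total", "count", "sum", "group by"},
    "kg": {"founder", "who is", "born in", "capital of", "related to"},
    "table": {"csv", "excel", "spreadsheet", "quarter", "column"},
}

# Hypothetical regex patterns for aggregation, relation, and table-style cues.
PATTERNS = {
    "sql": [re.compile(r"\b(average|sum|count|max|min)\b.*\bby\b", re.I)],
    "kg": [re.compile(r"\bwho (is|was) the \w+ of\b", re.I)],
    "table": [re.compile(r"\bwhich (quarter|month|row)\b", re.I)],
}


def rule_scores(query: str) -> dict[str, float]:
    """Return a normalized score per source from keyword and regex hits."""
    q = query.lower()
    hits = Counter()
    for source, words in KEYWORDS.items():
        hits[source] += sum(1 for w in words if w in q)
    for source, patterns in PATTERNS.items():
        # Assumption: a regex hit counts twice as much as a keyword hit.
        hits[source] += sum(2 for p in patterns if p.search(query))
    total = sum(hits.values())
    if total == 0:
        return {source: 0.0 for source in KEYWORDS}
    return {source: hits[source] / total for source in KEYWORDS}
```

Calling `rule_scores("Show average salary by department")` puts all of the score on `"sql"`, which matches the "Auto-detect SQL" row in the trigger table. A query with no hits scores 0.0 everywhere and would be left to the later layers.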
Layer 2 (35%): LLM Semantic
- Intent classification
- Entity/predicate detection
- Multi-hop identification
Layer 3a (25%): SQL Schema Match
- Inverted index on tables/columns
- Automatic JOIN inference
- Confidence scoring
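The inverted index behind Layer 3a might be sketched as follows. This is an illustration under assumptions: the schema is taken to be a plain `{table: [columns]}` dict, the helper names `build_index` and `match_schema` are hypothetical, the confidence is simply the share of query tokens that hit the index, and JOIN inference is left out.

```python
import re
from collections import defaultdict


def build_index(schema: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each lowercase token of a table or column name to the tables it occurs in."""
    index: dict[str, set[str]] = defaultdict(set)
    for table, columns in schema.items():
        for name in [table, *columns]:
            for token in re.split(r"[^a-z0-9]+", name.lower()):
                if token:
                    index[token].add(table)
    return index


def match_schema(query: str, index: dict[str, set[str]]) -> tuple[dict[str, int], float]:
    """Count query-token hits per table and return a crude confidence in [0, 1]."""
    tokens = [t for t in re.split(r"[^a-z0-9]+", query.lower()) if t]
    hits: dict[str, int] = defaultdict(int)
    matched = 0
    for token in tokens:
        # Naive plural handling: also try the token with and without a trailing "s".
        variants = {token, token + "s"}
        if token.endswith("s"):
            variants.add(token[:-1])
        tables = set().union(*(index.get(v, set()) for v in variants))
        if tables:
            matched += 1
        for table in tables:
            hits[table] += 1
    confidence = matched / len(tokens) if tokens else 0.0
    return dict(hits), confidence
```

With a schema of `employees(id, name, department_id, salary)` and `departments(id, name)`, the query "Show average salary by department" hits `employees` twice and `departments` once, with two of five tokens matched.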
Layer 3b (25%): KG Entity Link
- Entity mention extraction
- SPARQL endpoint lookup
- Predicate pattern matching
Layer 3c (25%, +30% boost when verified): Entity Verification
- Cross-source entity existence check
- 30% score boost for verified entities
Layer 4: Multi-Source Fusion
- Weighted aggregation
- Execution plan generation
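One possible reading of the weighted aggregation is sketched below. The 15%, 35%, and 25% weights and the 30% boost come from the layer list above, but how they combine is not stated there. This sketch assumes each source gets one Layer 3 score (schema match for SQL, entity linking for KG) and that the boost is multiplicative. `LayerScores` and `fuse` are hypothetical names, and execution plan generation is not covered.

```python
from dataclasses import dataclass

# Weights quoted in the layer list; the way they are combined is an assumption.
W_RULE, W_LLM, W_LAYER3 = 0.15, 0.35, 0.25
VERIFIED_BOOST = 0.30


@dataclass
class LayerScores:
    rule: float             # Layer 1 score in [0, 1]
    llm: float              # Layer 2 score in [0, 1]
    layer3: float           # source-specific Layer 3 score in [0, 1]
    verified: bool = False  # True if entity verification succeeded


def fuse(scores: dict[str, LayerScores]) -> list[tuple[str, float]]:
    """Rank sources by weighted score, boosting verified sources by 30%."""
    ranked = []
    for source, s in scores.items():
        total = W_RULE * s.rule + W_LLM * s.llm + W_LAYER3 * s.layer3
        if s.verified:
            total *= 1 + VERIFIED_BOOST
        ranked.append((source, round(total, 4)))
    return sorted(ranked, key=lambda item: item[1], reverse=True)
```

The top-ranked source would be the routing target. For a multi-hop query, several sources above some threshold could each receive a sub-query, though that threshold is not specified here.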
Query Generation Pipeline
1. Schema/Entity Linking → Identify relevant tables/columns/entities
2. Parallel Generation → Generate 3 candidates concurrently
3. Multi-Round Revision → 2 rounds of self-review
4. Validation → Syntax and semantic checks
5. Voting → Select best candidate
6. Execution → Run query
7. Result Verification → Validate reasonableness
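Steps 2, 3, and 5 above can be sketched with `asyncio`. The LLM calls are replaced by stand-in coroutines that return canned SQL, and voting is a simple majority over normalized query text. All function names here are hypothetical, and the real engine may vote differently.

```python
import asyncio
from collections import Counter


async def generate_candidate(question: str, seed: int) -> str:
    """Stand-in for an LLM call; returns a canned SQL string per seed."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    variants = [
        "SELECT COUNT(*) FROM employees WHERE department = 'Engineering'",
        "select count(*) from employees where department = 'Engineering';",
        "SELECT COUNT(id) FROM employees",
    ]
    return variants[seed % len(variants)]


async def revise(query: str) -> str:
    """Stand-in for one self-review round; only tidies whitespace and a trailing semicolon."""
    await asyncio.sleep(0)
    return " ".join(query.split()).rstrip(";")


def normalize(query: str) -> str:
    """Case-insensitive voting key (crude: it also lowercases string literals)."""
    return query.lower()


async def run_pipeline(question: str, num_candidates: int = 3, max_revisions: int = 2) -> str:
    # Step 2: generate candidates concurrently.
    candidates = await asyncio.gather(
        *(generate_candidate(question, i) for i in range(num_candidates))
    )
    # Step 3: apply each revision round to every candidate.
    for _ in range(max_revisions):
        candidates = await asyncio.gather(*(revise(c) for c in candidates))
    # Step 5: majority vote over normalized text; ties go to the earliest candidate.
    votes = Counter(normalize(c) for c in candidates)
    winner_key, _ = votes.most_common(1)[0]
    return next(c for c in candidates if normalize(c) == winner_key)
```

Running `asyncio.run(run_pipeline("How many employees in Engineering?"))` selects the filtered `COUNT(*)` query, because two of the three stand-in candidates agree once revised and normalized.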
Engines
NL2SQL Engine
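from src.engines.nl2sql.multi_stage_engine import MultiStageNL2SQLEngine

# `schema` must be defined beforehand (the database schema description).
engine = MultiStageNL2SQLEngine({
    "name": "sql_engine",
    "schema": schema,
    "llm_config": {
        "model": "deepseek-chat",
        "api_key": "sk-...",
    },
    "generation_config": {
        "num_candidates": 3,          # candidates generated per query
        "max_revisions": 2,           # self-revision rounds
        "parallel_generation": True,  # generate candidates concurrently
    },
})

# `await` requires an async context (an `async def` function or an asyncio REPL).
result = await engine.execute("How many employees in Engineering?", {})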
Features:
- Schema linking (rule-based + LLM)
- Parallel SQL candidate generation
- Multi-round self-revision
- Voting mechanism
- Result verification
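The voting mechanism is not described in detail here. One common approach, shown below as an assumption rather than as the engine's actual method, is to execute every candidate and pick a query from the largest group that returns the same result. The sketch uses the standard-library `sqlite3` module with an in-memory table, and `vote_by_execution` is a hypothetical helper.

```python
import sqlite3
from collections import defaultdict
from typing import Optional


def vote_by_execution(conn: sqlite3.Connection, candidates: list[str]) -> Optional[str]:
    """Run each candidate, group by result set, and return a query from the largest group."""
    groups: dict[tuple, list[str]] = defaultdict(list)
    for sql in candidates:
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error:
            continue  # candidates that fail to execute get no vote
        # Sort rows so that row order does not split otherwise equal results.
        groups[tuple(sorted(rows, key=repr))].append(sql)
    if not groups:
        return None
    # max() returns the first largest group, so ties go to the earliest candidate.
    return max(groups.values(), key=len)[0]


# Toy data for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, department TEXT);
    INSERT INTO employees (name, department) VALUES
        ('Ada', 'Engineering'), ('Lin', 'Engineering'), ('Sam', 'Sales');
""")
candidates = [
    "SELECT COUNT(*) FROM employees WHERE department = 'Engineering'",
    "SELECT COUNT(id) FROM employees WHERE department = 'Engineering'",
    "SELECT COUNT(*) FROM employees",
    "SELECT COUNT(*) FROM staff",  # fails: no such table
]
best = vote_by_execution(conn, candidates)
```

Here the first two candidates agree on the result, so the first one wins, and the failing query is ignored. Generated SQL should only ever be run this way against a read-only or disposable connection.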
NL2SPARQL Engine
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-bahuia-heteromind": {
      "enabled": true,
      "auto_update": true
    }
  }
}