ClawKit Reliability Toolkit
Official Verified

data-type-classifier

Classify construction data by type (structured, unstructured, semi-structured). Analyze data sources and recommend appropriate storage and processing methods.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/datadrivenconstruction/data-type-classifier

Data Type Classifier

Overview

Based on DDC methodology (Chapter 2.1), this skill classifies construction data by type, analyzes data sources, and recommends appropriate storage, processing, and integration methods.

Book Reference: "Типы данных в строительстве" / "Data Types in Construction"

Quick Start

from dataclasses import dataclass, field
from enum import Enum
from typing import List, Dict, Optional, Any, Tuple
from datetime import datetime
import json
import re
import mimetypes

class DataStructure(Enum):
    """Data structure classification"""
    STRUCTURED = "structured"           # Tables, databases, spreadsheets
    SEMI_STRUCTURED = "semi_structured" # JSON, XML, IFC
    UNSTRUCTURED = "unstructured"       # Documents, images, videos
    GEOMETRIC = "geometric"             # CAD, BIM geometry
    TEMPORAL = "temporal"               # Time-series, schedules
    SPATIAL = "spatial"                 # GIS, coordinates

class DataFormat(Enum):
    """Common construction data formats"""
    # Structured
    CSV = "csv"
    EXCEL = "excel"
    SQL = "sql"
    PARQUET = "parquet"

    # Semi-structured
    JSON = "json"
    XML = "xml"
    IFC = "ifc"
    BCF = "bcf"

    # Unstructured
    PDF = "pdf"
    DOCX = "docx"
    IMAGE = "image"
    VIDEO = "video"

    # Geometric
    DWG = "dwg"
    DXF = "dxf"
    RVT = "rvt"
    NWD = "nwd"
    OBJ = "obj"
    STL = "stl"

    # Schedule
    MPP = "mpp"
    P6 = "p6"
    XER = "xer"

class StorageRecommendation(Enum):
    """Storage system recommendations"""
    RELATIONAL_DB = "relational_database"
    DOCUMENT_DB = "document_database"
    OBJECT_STORAGE = "object_storage"
    GRAPH_DB = "graph_database"
    TIME_SERIES_DB = "time_series_database"
    VECTOR_DB = "vector_database"
    FILE_SYSTEM = "file_system"
    DATA_LAKE = "data_lake"

@dataclass
class DataCharacteristics:
    """Characteristics of a data source"""
    has_schema: bool
    has_relationships: bool
    is_queryable: bool
    is_binary: bool
    has_geometry: bool
    has_temporal: bool
    has_text_content: bool
    avg_record_size: Optional[int] = None  # bytes
    estimated_volume: Optional[str] = None  # small/medium/large/huge
    update_frequency: Optional[str] = None

@dataclass
class DataClassification:
    """Classification result for a data source"""
    source_name: str
    source_type: str
    detected_format: DataFormat
    structure: DataStructure
    characteristics: DataCharacteristics
    storage_recommendation: StorageRecommendation
    processing_tools: List[str]
    integration_options: List[str]
    quality_considerations: List[str]
    confidence: float

@dataclass
class ClassificationReport:
    """Complete classification report"""
    total_sources: int
    classifications: List[DataClassification]
    summary_by_structure: Dict[str, int]
    summary_by_format: Dict[str,...
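
The enums and dataclasses above define the vocabulary for a classification, but the classification logic itself is not shown in this excerpt. As a minimal sketch of how a file name might be mapped to a structure type and a storage recommendation, the example below uses a plain extension lookup. The `EXTENSION_RULES` table and the `classify_by_extension` function are hypothetical names introduced for illustration, and the storage choices in the table are assumptions, not the skill's documented behavior. The enums are trimmed copies of the ones above so that the block runs on its own.

```python
from enum import Enum
from pathlib import Path
from typing import Dict, Optional, Tuple


class DataStructure(Enum):
    """Trimmed copy of the DataStructure enum from the listing above."""
    STRUCTURED = "structured"
    SEMI_STRUCTURED = "semi_structured"
    UNSTRUCTURED = "unstructured"
    GEOMETRIC = "geometric"


class StorageRecommendation(Enum):
    """Trimmed copy of the StorageRecommendation enum from the listing above."""
    RELATIONAL_DB = "relational_database"
    DOCUMENT_DB = "document_database"
    OBJECT_STORAGE = "object_storage"


# Hypothetical lookup table. The structure column follows the comments in the
# listing above (e.g. IFC is semi-structured); the storage column is an
# assumption made for this sketch only.
EXTENSION_RULES: Dict[str, Tuple[DataStructure, StorageRecommendation]] = {
    ".csv": (DataStructure.STRUCTURED, StorageRecommendation.RELATIONAL_DB),
    ".json": (DataStructure.SEMI_STRUCTURED, StorageRecommendation.DOCUMENT_DB),
    ".xml": (DataStructure.SEMI_STRUCTURED, StorageRecommendation.DOCUMENT_DB),
    ".ifc": (DataStructure.SEMI_STRUCTURED, StorageRecommendation.DOCUMENT_DB),
    ".pdf": (DataStructure.UNSTRUCTURED, StorageRecommendation.OBJECT_STORAGE),
    ".dwg": (DataStructure.GEOMETRIC, StorageRecommendation.OBJECT_STORAGE),
    ".rvt": (DataStructure.GEOMETRIC, StorageRecommendation.OBJECT_STORAGE),
}


def classify_by_extension(
    filename: str,
) -> Optional[Tuple[DataStructure, StorageRecommendation]]:
    """Return (structure, storage) for a file name, or None if the extension is unknown."""
    suffix = Path(filename).suffix.lower()  # case-insensitive match on the extension
    return EXTENSION_RULES.get(suffix)


if __name__ == "__main__":
    for name in ("Model.IFC", "budget.csv", "site_photo.heic"):
        print(name, "->", classify_by_extension(name))
```

An extension lookup is only a first pass: it cannot tell a scanned PDF from a text PDF, for instance, which is presumably why the result type above carries a `confidence` field and a set of `DataCharacteristics`.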

Metadata

Stars: 3376
Views: 1
Updated: 2026-03-24

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-datadrivenconstruction-data-type-classifier": {
      "enabled": true,
      "auto_update": true
    }
  }
}
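
If clawhub.json already lists other plugins, pasting the fragment by hand risks overwriting them. The sketch below merges the entry into an existing file instead. It assumes clawhub.json is plain JSON with a top-level `plugins` object, as the fragment above suggests; `enable_plugin` is a hypothetical helper written for this example, not part of any ClawHub tooling, and the file location is whatever path you pass in.

```python
import json
from pathlib import Path

# Plugin key copied from the configuration fragment above.
PLUGIN_KEY = "official-datadrivenconstruction-data-type-classifier"


def enable_plugin(config_path: Path, plugin_key: str = PLUGIN_KEY) -> dict:
    """Add or update one plugin entry in a clawhub.json file, keeping other entries."""
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}  # start from an empty config if the file does not exist yet

    # Create the "plugins" object if missing, then set this plugin's entry.
    plugins = config.setdefault("plugins", {})
    plugins[plugin_key] = {"enabled": True, "auto_update": True}

    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    updated = enable_plugin(Path("clawhub.json"))
    print(json.dumps(updated, indent=2))
```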
Safety Note: ClawKit audits metadata but not runtime behavior. Use with caution.