data-type-classifier
Classify construction data by type (structured, semi-structured, unstructured). Analyze data sources and recommend appropriate storage and processing methods.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/datadrivenconstruction/data-type-classifier
Data Type Classifier
Overview
Based on DDC methodology (Chapter 2.1), this skill classifies construction data by type, analyzes data sources, and recommends appropriate storage, processing, and integration methods.
Book Reference: "Типы данных в строительстве" / "Data Types in Construction"
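As a rough illustration of the first step named in the overview, classification by type, the sketch below maps a file extension to one of the three structure categories. The `Structure` enum, the `EXTENSION_TO_STRUCTURE` table, and `classify_by_extension` are hypothetical names introduced for this example and are not part of the skill's published code. Treating unknown extensions as unstructured is also an assumption.

```python
from enum import Enum
from pathlib import Path


class Structure(Enum):
    """Reduced three-way classification, for illustration only."""
    STRUCTURED = "structured"
    SEMI_STRUCTURED = "semi_structured"
    UNSTRUCTURED = "unstructured"


# Hypothetical lookup table; a real classifier would likely inspect
# file content as well as the extension.
EXTENSION_TO_STRUCTURE = {
    ".csv": Structure.STRUCTURED,
    ".xlsx": Structure.STRUCTURED,
    ".parquet": Structure.STRUCTURED,
    ".json": Structure.SEMI_STRUCTURED,
    ".xml": Structure.SEMI_STRUCTURED,
    ".ifc": Structure.SEMI_STRUCTURED,
    ".pdf": Structure.UNSTRUCTURED,
    ".docx": Structure.UNSTRUCTURED,
    ".jpg": Structure.UNSTRUCTURED,
}


def classify_by_extension(filename: str) -> Structure:
    """Return the structure category for a file name.

    Assumption: unknown extensions fall back to UNSTRUCTURED.
    """
    suffix = Path(filename).suffix.lower()
    return EXTENSION_TO_STRUCTURE.get(suffix, Structure.UNSTRUCTURED)


if __name__ == "__main__":
    for name in ("estimate.csv", "Model.IFC", "site_photo.jpg"):
        print(name, "->", classify_by_extension(name).value)
```

An extension lookup is only a first pass. A file named `.json` can still hold free text, so a production classifier would probably combine this with content inspection.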
Quick Start
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Dict, Optional, Any, Tuple
from datetime import datetime
import json
import re
import mimetypes


class DataStructure(Enum):
    """Data structure classification"""
    STRUCTURED = "structured"            # Tables, databases, spreadsheets
    SEMI_STRUCTURED = "semi_structured"  # JSON, XML, IFC
    UNSTRUCTURED = "unstructured"        # Documents, images, videos
    GEOMETRIC = "geometric"              # CAD, BIM geometry
    TEMPORAL = "temporal"                # Time-series, schedules
    SPATIAL = "spatial"                  # GIS, coordinates


class DataFormat(Enum):
    """Common construction data formats"""
    # Structured
    CSV = "csv"
    EXCEL = "excel"
    SQL = "sql"
    PARQUET = "parquet"
    # Semi-structured
    JSON = "json"
    XML = "xml"
    IFC = "ifc"
    BCF = "bcf"
    # Unstructured
    PDF = "pdf"
    DOCX = "docx"
    IMAGE = "image"
    VIDEO = "video"
    # Geometric
    DWG = "dwg"
    DXF = "dxf"
    RVT = "rvt"
    NWD = "nwd"
    OBJ = "obj"
    STL = "stl"
    # Schedule
    MPP = "mpp"
    P6 = "p6"
    XER = "xer"


class StorageRecommendation(Enum):
    """Storage system recommendations"""
    RELATIONAL_DB = "relational_database"
    DOCUMENT_DB = "document_database"
    OBJECT_STORAGE = "object_storage"
    GRAPH_DB = "graph_database"
    TIME_SERIES_DB = "time_series_database"
    VECTOR_DB = "vector_database"
    FILE_SYSTEM = "file_system"
    DATA_LAKE = "data_lake"


@dataclass
class DataCharacteristics:
    """Characteristics of a data source"""
    has_schema: bool
    has_relationships: bool
    is_queryable: bool
    is_binary: bool
    has_geometry: bool
    has_temporal: bool
    has_text_content: bool
    avg_record_size: Optional[int] = None   # bytes
    estimated_volume: Optional[str] = None  # small/medium/large/huge
    update_frequency: Optional[str] = None


@dataclass
class DataClassification:
    """Classification result for a data source"""
    source_name: str
    source_type: str
    detected_format: DataFormat
    structure: DataStructure
    characteristics: DataCharacteristics
    storage_recommendation: StorageRecommendation
    processing_tools: List[str]
    integration_options: List[str]
    quality_considerations: List[str]
    confidence: float


@dataclass
class ClassificationReport:
    """Complete classification report"""
    total_sources: int
    classifications: List[DataClassification]
    summary_by_structure: Dict[str, int]
    summary_by_format: Dict[str,...
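The listing above is cut off before any classification logic appears. As a hedged sketch of how a storage recommendation and the `summary_by_structure` counts might be derived, the block below redefines reduced copies of the enums so that it runs on its own. The `DEFAULT_STORAGE` mapping and the functions `recommend_storage` and `summarize_by_structure` are assumptions made for illustration. They are not taken from the skill or from the DDC book.

```python
from collections import Counter
from enum import Enum
from typing import Dict, List


class DataStructure(Enum):
    STRUCTURED = "structured"
    SEMI_STRUCTURED = "semi_structured"
    UNSTRUCTURED = "unstructured"
    GEOMETRIC = "geometric"
    TEMPORAL = "temporal"
    SPATIAL = "spatial"


class StorageRecommendation(Enum):
    # Reduced copy of the enum from the listing (subset only)
    RELATIONAL_DB = "relational_database"
    DOCUMENT_DB = "document_database"
    OBJECT_STORAGE = "object_storage"
    TIME_SERIES_DB = "time_series_database"
    FILE_SYSTEM = "file_system"


# Assumed default pairing of structure and storage; adjust per project.
DEFAULT_STORAGE: Dict[DataStructure, StorageRecommendation] = {
    DataStructure.STRUCTURED: StorageRecommendation.RELATIONAL_DB,
    DataStructure.SEMI_STRUCTURED: StorageRecommendation.DOCUMENT_DB,
    DataStructure.UNSTRUCTURED: StorageRecommendation.OBJECT_STORAGE,
    DataStructure.GEOMETRIC: StorageRecommendation.FILE_SYSTEM,
    DataStructure.TEMPORAL: StorageRecommendation.TIME_SERIES_DB,
    DataStructure.SPATIAL: StorageRecommendation.RELATIONAL_DB,
}


def recommend_storage(structure: DataStructure) -> StorageRecommendation:
    """Look up the assumed default storage for a structure type."""
    return DEFAULT_STORAGE[structure]


def summarize_by_structure(structures: List[DataStructure]) -> Dict[str, int]:
    """Count sources per structure type, keyed by the enum's string value."""
    return dict(Counter(s.value for s in structures))


if __name__ == "__main__":
    sources = [
        DataStructure.STRUCTURED,
        DataStructure.STRUCTURED,
        DataStructure.UNSTRUCTURED,
    ]
    print(recommend_storage(DataStructure.TEMPORAL).value)
    print(summarize_by_structure(sources))
```

A plain dictionary keeps the mapping easy to audit and override. Factors such as data volume or update frequency, which `DataCharacteristics` records, would presumably refine the choice in a fuller implementation.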
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-datadrivenconstruction-data-type-classifier": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Related Skills
data-lineage-tracker
Track data origin, transformations, and flow through construction systems. Essential for audit trails, compliance, and debugging data issues.
cwicr-cost-calculator
Calculate construction costs using DDC CWICR resource-based methodology. Break down costs into labor, materials, equipment with transparent pricing.
data-anomaly-detector
Detect anomalies and outliers in construction data: unusual costs, schedule variances, productivity spikes. Statistical and ML-based detection methods.
historical-cost-analyzer
Analyze historical construction costs for benchmarking, trend analysis, and estimating calibration. Compare projects, track escalation, identify patterns.
df-merger
Merge pandas DataFrames from multiple construction sources. Handle different schemas, keys, and data quality issues.