
data-quality-check

Assess construction data quality using completeness, accuracy, consistency, timeliness, and validity metrics. Automated validation with regex patterns, thresholds, and reporting.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/datadrivenconstruction/data-quality-check

Data Quality Check for Construction

Overview

Based on the DDC methodology (Chapter 2.6), this skill assesses the quality of construction project data. Poor data quality leads to poor decisions, so validate early and validate often.

Book Reference: "Data Quality Requirements" (the original Russian title translates as "Data Quality Requirements and Quality Assurance")

"Data quality is defined by five key metrics: completeness, accuracy, consistency, timeliness, and validity." (DDC Book, Chapter 2.6)

Quick Start

import pandas as pd

# Load construction data
df = pd.read_excel("bim_export.xlsx")

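# Quick quality check
quality_score = {
    # Share of non-null cells across the whole table, in percent
    'completeness': (1 - df.isnull().sum().sum() / df.size) * 100,
    # True only if every row has a distinct, non-null ElementId
    'unique_ids': df['ElementId'].nunique() == len(df),
    # Missing volumes compare as False, so they also fail this check
    'valid_volumes': (df['Volume_m3'] >= 0).all()
}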

print(f"Completeness: {quality_score['completeness']:.1f}%")
print(f"Unique IDs: {quality_score['unique_ids']}")
print(f"Valid volumes: {quality_score['valid_volumes']}")
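
To try the quick check without an Excel export, a small in-memory frame can stand in for bim_export.xlsx. This is a minimal sketch: the helper name `quick_quality_check` and the sample rows are made up for illustration, and the empty-table guard is an addition that the Quick Start itself does not have.

```python
import pandas as pd

def quick_quality_check(df: pd.DataFrame) -> dict:
    """Same three checks as the Quick Start, wrapped in a hypothetical helper."""
    total_cells = df.size
    missing_cells = int(df.isnull().sum().sum())
    return {
        # Guard against an empty table, where df.size is 0
        "completeness": (1 - missing_cells / total_cells) * 100 if total_cells else 0.0,
        "unique_ids": bool(df["ElementId"].nunique() == len(df)),
        "valid_volumes": bool((df["Volume_m3"] >= 0).all()),
    }

# Made-up sample rows standing in for a BIM export
sample = pd.DataFrame({
    "ElementId": ["W-001", "W-002", "W-002", "S-001"],
    "Volume_m3": [12.5, -3.0, 8.0, None],
    "Material": ["Concrete", "Concrete", None, "Steel"],
})

score = quick_quality_check(sample)
print(f"Completeness: {score['completeness']:.1f}%")
print(f"Unique IDs: {score['unique_ids']}")
print(f"Valid volumes: {score['valid_volumes']}")
```

The sample deliberately contains a duplicate ID, a negative volume, and two missing cells, so all three checks have something to flag.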

Data Quality Dimensions

The 5 Quality Metrics

import pandas as pd
import numpy as np
import re
from datetime import datetime, timedelta

class DataQualityChecker:
    """Comprehensive data quality assessment for construction data"""

    def __init__(self, df):
        self.df = df.copy()
        self.results = {}
        self.issues = []

    def check_completeness(self, required_columns=None):
        """Check for missing values (completeness)"""
        if required_columns is None:
            required_columns = self.df.columns.tolist()

        threshold = 95  # minimum acceptable overall completeness, in percent
        total = len(self.df)

        completeness = {}
        for col in required_columns:
            if col in self.df.columns:
                non_null = self.df[col].notna().sum()
                # An empty table scores 0 instead of dividing by zero
                completeness[col] = (non_null / total) * 100 if total else 0.0
            else:
                completeness[col] = 0.0
                self.issues.append(f"Missing required column: {col}")

        # Unweighted mean of the per-column percentages
        overall = float(np.mean(list(completeness.values()))) if completeness else 0.0

        self.results['completeness'] = {
            'by_column': completeness,
            'overall': overall,
            'threshold': threshold,
            'passed': overall >= threshold
        }

        return self.results['completeness']

    def check_accuracy(self, rules=None):
        """Check data accuracy against rules (accuracy)"""
        if rules is None:
            # Default construction data rules
            rules = {
                'Volume_m3': {'min': 0, 'max': 10000},
                'Area_m2': {'min': 0, 'max': 100000},
                'Weight_kg': {'min': 0, 'max': 1000000},
                'Cost': {'min': 0, 'max': 100000000}
            }

        accuracy = {}
        for col, bounds in rules.items():
            if col in self.df.columns:
                valid = self.df[col].between(
                    bounds.get('min', -np.inf),
                    bounds.get('max', n...
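
The listing above is cut off before the remaining dimensions appear, so the original implementation of consistency, timeliness, and validity is not known. The functions below are a standalone sketch of how those three checks might look, not the skill's own code. The function names, the `LastModified` column, and the ID pattern are all hypothetical, and treating duplicate IDs as a consistency problem is only one possible reading of that metric.

```python
import re
import pandas as pd

def check_validity(df, column, pattern):
    """Percent of non-null values in `column` that fully match the regex `pattern`."""
    values = df[column].dropna().astype(str)
    if values.empty:
        return 0.0
    regex = re.compile(pattern)
    matches = values.map(lambda v: regex.fullmatch(v) is not None)
    return float(matches.mean() * 100)

def check_timeliness(df, column, max_age_days, now=None):
    """Percent of rows whose timestamp in `column` is at most `max_age_days` old."""
    if len(df) == 0:
        return 0.0
    now = pd.Timestamp.now() if now is None else pd.Timestamp(now)
    # Unparseable or missing dates become NaT and count as not fresh
    timestamps = pd.to_datetime(df[column], errors="coerce")
    fresh = (now - timestamps) <= pd.Timedelta(days=max_age_days)
    return float(fresh.mean() * 100)

def check_consistency(df, id_column):
    """Percent of rows whose ID is not shared with any other row."""
    if len(df) == 0:
        return 0.0
    duplicated = df[id_column].duplicated(keep=False)
    return float((~duplicated).mean() * 100)

# Made-up sample rows; the ID format "<letter>-<3 digits>" is an assumption
sample = pd.DataFrame({
    "ElementId": ["W-001", "W-002", "W-002", "bad id"],
    "LastModified": ["2024-01-10", "2024-01-01", None, "2023-06-01"],
})

validity = check_validity(sample, "ElementId", r"[A-Z]-\d{3}")
timeliness = check_timeliness(sample, "LastModified", max_age_days=30, now="2024-01-15")
consistency = check_consistency(sample, "ElementId")

print(f"Validity: {validity:.1f}%")
print(f"Timeliness: {timeliness:.1f}%")
print(f"Consistency: {consistency:.1f}%")
```

Each function returns a percentage, so the results can be compared against a threshold the same way `check_completeness` compares against 95. Passing `now` explicitly keeps the timeliness result reproducible.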

Metadata

Stars: 3376
Views: 0
Updated: 2026-03-24

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-datadrivenconstruction-data-quality-check": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Safety Note: ClawKit audits metadata but not runtime behavior. Use with caution.