Official Verified

bigdata

Split large files, run parallel processing, and stream batch analysis. Use when sampling datasets, aggregating logs, or transforming bulk data.


Install via CLI (Recommended)

clawhub install openclaw/skills/skills/bytesagain3/bigdata

BigData

A data processing toolkit for ingesting, transforming, querying, filtering, aggregating, and managing data workflows from the command line, with entries stored locally in timestamped log files.

Commands

Command                     Description
bigdata ingest <input>      Ingest raw data into the system. Without args, shows recent ingest entries
bigdata transform <input>   Record a data transformation step. Without args, shows recent transforms
bigdata query <input>       Log and track data queries. Without args, shows recent queries
bigdata filter <input>      Apply and record data filters. Without args, shows recent filters
bigdata aggregate <input>   Record aggregation operations. Without args, shows recent aggregations
bigdata visualize <input>   Log visualization tasks. Without args, shows recent visualizations
bigdata export <input>      Log export operations. Without args, shows recent exports
bigdata sample <input>      Record data sampling operations. Without args, shows recent samples
bigdata schema <input>      Track schema definitions and changes. Without args, shows recent schemas
bigdata validate <input>    Log data validation checks. Without args, shows recent validations
bigdata pipeline <input>    Record pipeline configurations. Without args, shows recent pipelines
bigdata profile <input>     Log data profiling operations. Without args, shows recent profiles
bigdata stats               Show summary statistics across all entry types
bigdata search <term>       Search across all log entries for a keyword
bigdata recent              Show the 20 most recent activity entries from the history log
bigdata status              Health check: version, data dir, total entries, disk usage, last activity
bigdata help                Show all available commands
bigdata version             Print version (v2.0.0)

Each data command (ingest, transform, query, etc.) works the same way:

  • With arguments: saves the entry with a timestamp to its dedicated .log file and records it in the activity history
  • Without arguments: displays the 20 most recent entries from that command's log
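
The two modes described above can be sketched in Python. This is only an illustration of the documented behavior, not the tool's actual implementation: the function name run_data_command, its parameters, and the layout of the history.log line are assumptions made for the example.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional

RECENT_LIMIT = 20  # the documentation states that the 20 most recent entries are shown


def run_data_command(data_dir: Path, command: str, args: list,
                     now: Optional[datetime] = None) -> list:
    """Record an entry when args are given; otherwise return recent entries."""
    data_dir.mkdir(parents=True, exist_ok=True)
    log_path = data_dir / f"{command}.log"

    if args:
        stamp = (now or datetime.now()).strftime("%Y-%m-%d %H:%M")
        value = " ".join(args)
        # Documented entry format: YYYY-MM-DD HH:MM|<value>
        with log_path.open("a", encoding="utf-8") as log:
            log.write(f"{stamp}|{value}\n")
        # ASSUMPTION: the history line layout is not documented; this one is illustrative.
        with (data_dir / "history.log").open("a", encoding="utf-8") as history:
            history.write(f"{stamp}|{command}|{value}\n")
        return [f"{stamp}|{value}"]

    # No arguments: return the tail of this command's log (empty if no log exists yet).
    if not log_path.exists():
        return []
    lines = log_path.read_text(encoding="utf-8").splitlines()
    return lines[-RECENT_LIMIT:]
```

Calling run_data_command(data_dir, "ingest", ["sales.csv"]) appends one line to ingest.log and one to history.log, while run_data_command(data_dir, "ingest", []) returns at most the last 20 lines of ingest.log.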

Data Storage

All data is stored locally in plain-text log files:

~/.local/share/bigdata/
├── ingest.log          # Ingested data entries
├── transform.log       # Transformation records
├── query.log           # Query log
├── filter.log          # Filter operations
├── aggregate.log       # Aggregation records
├── visualize.log       # Visualization tasks
├── export.log          # Export operations
├── sample.log          # Sampling records
├── schema.log          # Schema definitions
├── validate.log        # Validation checks
├── pipeline.log        # Pipeline configurations
├── profile.log         # Profiling results
└── history.log         # Unified activity log with timestamps

Each entry is stored as YYYY-MM-DD HH:MM|<value> for easy parsing and export.
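
As a minimal sketch of how such entries could be parsed, the Python snippet below splits a line on its first "|" and reads the timestamp. The Entry type and parse_entry function are hypothetical names for this example, and it assumes the timestamp is always in the YYYY-MM-DD HH:MM form shown above.

```python
from datetime import datetime
from typing import NamedTuple


class Entry(NamedTuple):
    timestamp: datetime
    value: str


def parse_entry(line: str) -> Entry:
    """Parse one 'YYYY-MM-DD HH:MM|<value>' log line."""
    # Split on the first '|' only, so a value that itself contains '|' stays intact.
    stamp, sep, value = line.rstrip("\n").partition("|")
    if not sep:
        raise ValueError(f"missing '|' separator: {line!r}")
    return Entry(datetime.strptime(stamp, "%Y-%m-%d %H:%M"), value)
```

Splitting on the first separator only is a deliberate choice here: the source does not say whether values are escaped, so the sketch treats everything after the first "|" as the value.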

Requirements

Metadata

Stars: 4097
Views: 0
Updated: 2026-04-14

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-bytesagain3-bigdata": {
      "enabled": true,
      "auto_update": true
    }
  }
}
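
If clawhub.json already holds other settings, the fragment can be merged in rather than pasted over them. The Python sketch below shows one way to do that; it assumes clawhub.json is plain JSON with a top-level "plugins" object as in the snippet above, and the enable_plugin function and the file path passed to it are hypothetical.

```python
import json
from pathlib import Path

PLUGIN_KEY = "official-bytesagain3-bigdata"  # key taken from the snippet above


def enable_plugin(config_path: Path) -> dict:
    """Merge the plugin entry into clawhub.json, keeping other settings."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    # Create the "plugins" object if it is missing, then add or overwrite this plugin.
    plugins = config.setdefault("plugins", {})
    plugins[PLUGIN_KEY] = {"enabled": True, "auto_update": True}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Existing plugin entries and unrelated top-level keys are left untouched; only the bigdata entry is added or replaced.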

Tags

#bigdata #tool #utility

Safety Note: ClawKit audits metadata but not runtime behavior. Use with caution.