Datasets
Browse and load ready-to-use AI/ML datasets with fast manipulation. Use when searching datasets, loading training data, transforming formats.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/ckchzh/datasets
What This Skill Does
The Datasets skill is a local-first data processing toolkit for AI/ML workflows within OpenClaw. It provides a standardized command-line interface to ingest, transform, query, filter, and validate datasets directly from your terminal. It maintains a centralized, pipe-delimited log at ~/.local/share/datasets/, so every transformation, aggregation, or schema change is recorded with a timestamp for reproducibility. The skill is lightweight: it depends only on standard Unix utilities such as grep, tail, and sed, which makes it a good fit for local data management without external dependencies or cloud API overhead.
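To make the idea of a timestamped, pipe-delimited log concrete, here is a minimal sketch built only on standard Unix utilities. The field layout (timestamp|command|value), the file name history.log, and the log_entry helper are all hypothetical; the skill's actual on-disk format is not documented here and may differ. The sketch writes to a scratch directory rather than ~/.local/share/datasets/.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: the field layout (timestamp|command|value) and the
# file name history.log are assumptions, not the skill's documented format.
set -eu

# Use a scratch directory so the sketch never touches ~/.local/share/datasets/.
LOG_DIR="$(mktemp -d)"
LOG_FILE="$LOG_DIR/history.log"

# Append one entry: UTC timestamp, command name, free-form value.
log_entry() {
  printf '%s|%s|%s\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" "$1" "$2" >> "$LOG_FILE"
}

log_entry ingest "climate-data.csv"
log_entry transform "normalization on climate-data.csv"

# Search past entries with grep, then show the most recent one with tail.
grep 'normalization' "$LOG_FILE"
tail -n 1 "$LOG_FILE"
```

Because each entry is a single line with a fixed delimiter, tools such as grep, cut, and tail can filter the history without any parser, which is presumably why a design like this needs no external dependencies.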
Installation
To integrate this skill into your environment, run the following command within your OpenClaw interface:
clawhub install openclaw/skills/skills/ckchzh/datasets
Use Cases
This skill is perfect for data scientists and developers who need to keep local logs of their data pipeline steps. Use it to keep track of dataset versions, log the output of data validation checks, monitor the progress of data transformations, or quickly search through past analytical queries. Because it stores data in local files, it acts as an audit trail for your machine learning research, ensuring you can review the history of your data processing decisions at any time.
Example Prompts
- "Datasets, record a new ingest for the climate-data.csv file and let me know the current status of my storage."
- "Search my logs for all transformations involving the 'normalization' step."
- "Aggregate the current dataset entries to provide a summary of all my activity over the past week."
Tips & Limitations
- Tip: Use the datasets stats command regularly to monitor your disk usage, as extensive logging can grow over time.
- Tip: You can pipe the output of datasets export json into other CLI tools like jq for advanced parsing.
- Limitation: As a file-based system, it is designed for metadata tracking and small-to-medium dataset pointers. It does not perform heavy data processing in memory; rather, it records the operations performed on that data. Ensure your input values are concise strings to keep the log files clean and readable.
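The two tips above can be sketched as follows. The dir_usage_kb helper is hypothetical and uses plain du as a stand-in for checking log growth; it is not what datasets stats does internally. The jq filter '.' only pretty-prints, so it makes no assumption about the shape of the exported JSON, and the pipeline is skipped unless both tools are installed.

```shell
# Directory named in the skill description; the helper below is hypothetical.
DATA_DIR="${HOME}/.local/share/datasets"

# Print the directory's size in kilobytes, or 0 if it does not exist yet.
dir_usage_kb() {
  if [ -d "$1" ]; then
    du -sk "$1" | cut -f1
  else
    echo 0
  fi
}

echo "Log directory usage: $(dir_usage_kb "$DATA_DIR") KB"

# Pretty-print the export only when both tools are actually installed.
if command -v datasets >/dev/null 2>&1 && command -v jq >/dev/null 2>&1; then
  datasets export json | jq '.'
fi
```

Once you know the structure of the exported JSON, the '.' filter can be replaced with a more specific jq expression.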
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-ckchzh-datasets": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags(AI)
Flags: file-write, file-read