feature-engineer
Comprehensive feature engineering for ML pipelines: data quality assessment, feature creation, selection, transformation, and encoding. Activates for "feature engineering", "create features", "feature selection", "data preprocessing", "handle missing values", "encode categorical", "scale features", "feature importance". Ensures features are production-ready with automated validation, documentation, and integration with SpecWeave increments.
Why use this skill?
Automate your ML pipeline with the OpenClaw feature-engineer skill. Perform data quality audits, create advanced features, and integrate directly with SpecWeave.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/anton-abyzov/sw-feature-engineer
What This Skill Does
The feature-engineer skill is an OpenClaw agent that automates the transformation of raw, messy data into model-ready inputs. It integrates with the SpecWeave ecosystem so that every transformation is versioned and documented via increment IDs.
The skill operates in two primary phases: Data Quality Assessment and Feature Creation. During the assessment phase, the agent audits your dataset: it identifies missing values, detects statistical outliers, flags data type inconsistencies, and calculates correlation matrices. Once data quality is verified, the feature creation phase generates temporal components (day, hour, holiday flags), behavioral aggregations (group-by statistics), interaction terms, ratio features, and discretized bins.
Offloading this labor-intensive process to the agent lets data scientists focus on model architecture and evaluation rather than spending days on manual preprocessing scripts.
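As a rough illustration of what the assessment phase covers, the sketch below computes the four signals named above with plain pandas. It is not the skill's actual implementation: the helper name `assess_quality`, the 1.5 x IQR outlier rule, and the sample columns are all assumptions chosen for the example.

```python
import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype


def assess_quality(df: pd.DataFrame, iqr_factor: float = 1.5) -> dict:
    """Hypothetical helper (not part of the skill): summarize data quality signals."""
    numeric = df.select_dtypes(include="number")

    # Fraction of missing values per column.
    missing_fraction = df.isna().mean()

    # Count numeric values outside [Q1 - k*IQR, Q3 + k*IQR] per column.
    q1 = numeric.quantile(0.25)
    q3 = numeric.quantile(0.75)
    iqr = q3 - q1
    outside = (numeric < (q1 - iqr_factor * iqr)) | (numeric > (q3 + iqr_factor * iqr))
    outlier_count = outside.sum()

    # Text columns whose non-null values all parse as numbers may be mistyped.
    mistyped_columns = []
    for col in df.columns:
        series = df[col]
        if not (is_object_dtype(series) or is_string_dtype(series)):
            continue
        non_null = series.dropna()
        if len(non_null) > 0 and pd.to_numeric(non_null, errors="coerce").notna().all():
            mistyped_columns.append(col)

    return {
        "missing_fraction": missing_fraction,
        "outlier_count": outlier_count,
        "mistyped_columns": mistyped_columns,
        "correlation": numeric.corr(),  # pairwise Pearson correlation
    }


# Illustrative data: one extreme age, one missing income, numeric codes stored as text.
sample = pd.DataFrame({
    "age": [25, 30, 35, 40, 200],
    "income": [50.0, None, 70.0, 80.0, 90.0],
    "zip": ["10001", "94105", "60601", "73301", "02139"],
})
report = assess_quality(sample)
print(report["missing_fraction"])
print(report["outlier_count"])
print(report["mistyped_columns"])
```

Note that a column flagged as mistyped is only a candidate for conversion: identifiers such as postal codes parse as numbers but usually should stay as text.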
Installation
To install this skill, run the following command in your terminal or OpenClaw interface:
clawhub install openclaw/skills/skills/anton-abyzov/sw-feature-engineer
Ensure you have the SpecWeave library installed in your current Python environment to enable automated documentation and increment tracking.
Use Cases
This skill is ideal for:
- Preparing customer datasets for churn prediction models by aggregating interaction history.
- Cleaning raw transactional logs that contain significant missing values and outliers.
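- Discretizing continuous variables like age or salary into bins, which can help linear models capture non-linear effects (tree-based models already split on thresholds and usually gain less from binning).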
- Automating the feature engineering phase for time-series forecasting, specifically by generating day-of-week and month indicators.
- Standardizing features across multiple experiments to ensure consistency within the SpecWeave increment workflow.
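Two of the use cases above, aggregating interaction history and discretizing a continuous variable, can be sketched in a few lines of pandas. The table names, column names, and bin edges below are hypothetical and stand in for whatever your dataset contains; the skill's own output may differ.

```python
import pandas as pd

# Hypothetical interaction log: one row per customer event.
events = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2, 3],
    "amount": [20.0, 35.0, 5.0, 100.0, 60.0, 15.0],
})
# Hypothetical customer table: one row per customer.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "age": [23, 41, 67],
})

# Behavioral aggregation: per-customer statistics over the event history.
agg = (
    events.groupby("customer_id")["amount"]
    .agg(event_count="count", total_amount="sum", mean_amount="mean")
    .reset_index()
)

# Join the aggregates back onto the customer table.
features = customers.merge(agg, on="customer_id", how="left")

# Discretization: bucket age into right-closed bins (0, 30], (30, 50], (50, 120].
features["age_band"] = pd.cut(
    features["age"],
    bins=[0, 30, 50, 120],
    labels=["up_to_30", "31_to_50", "over_50"],
)
print(features)
```

The left join keeps customers with no events in the table; their aggregate columns come back as missing values and need an explicit fill strategy.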
Example Prompts
- "Perform a data quality assessment on my current dataframe and suggest a strategy for handling missing values in the email and phone columns."
- "Create interaction features from the 'age' and 'income' columns, and generate ratios for 'revenue' over 'cost' using increment 0042."
- "Please generate temporal features from my purchase_date column, including holiday indicators and the day of the week."
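For orientation, the features requested in the second and third prompts correspond roughly to the pandas operations below. This is a minimal sketch under stated assumptions: the sample rows are invented, the holiday list is a hand-maintained placeholder rather than a real calendar, and the code the skill actually generates may look different.

```python
import numpy as np
import pandas as pd

# Invented sample rows using the column names from the prompts above.
df = pd.DataFrame({
    "age": [30, 45],
    "income": [50000.0, 80000.0],
    "revenue": [1200.0, 300.0],
    "cost": [400.0, 0.0],
    "purchase_date": pd.to_datetime(["2024-12-25 14:30", "2024-03-04 09:00"]),
})

# Interaction feature: product of two columns.
df["age_x_income"] = df["age"] * df["income"]

# Ratio feature: treat a zero denominator as missing instead of producing inf.
df["revenue_to_cost"] = df["revenue"] / df["cost"].replace(0, np.nan)

# Temporal features from the timestamp (dayofweek: Monday=0 ... Sunday=6).
df["day_of_week"] = df["purchase_date"].dt.dayofweek
df["hour"] = df["purchase_date"].dt.hour
df["month"] = df["purchase_date"].dt.month

# Holiday indicator: placeholder list, to be replaced with your own calendar.
holidays = pd.to_datetime(["2024-01-01", "2024-12-25"])
df["is_holiday"] = df["purchase_date"].dt.normalize().isin(holidays)

print(df.drop(columns=["purchase_date"]))
```

Normalizing the timestamp to midnight before the `isin` check matters: without it, a purchase at 14:30 on a holiday would not match the date-only holiday entries.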
Tips & Limitations
- Always run the Data Quality Assessment before initiating Feature Creation; ignoring outliers can lead to skewed aggregate statistics.
- The skill is currently optimized for tabular data (pandas DataFrames). For large-scale distributed datasets (e.g., PySpark), additional configuration may be required.
- When creating interaction features, be mindful of multicollinearity, especially if you plan to use linear models later.
- Keep your increment ID consistent throughout the experiment to ensure your documentation remains audit-ready.
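The first tip is easy to demonstrate. The sketch below uses invented order data in which a single extreme value dominates one group's mean, then compares the raw mean with a mean computed after clipping to 1.5 x IQR fences and with the median. The clipping rule is one common choice assumed for illustration, not necessarily what the skill applies.

```python
import pandas as pd

# Invented data: segment "a" contains one extreme order amount.
orders = pd.DataFrame({
    "segment": ["a", "a", "a", "a", "b", "b", "b", "b"],
    "amount": [10.0, 12.0, 11.0, 1000.0, 20.0, 22.0, 21.0, 19.0],
})

# Aggregate computed without any outlier handling.
raw_mean = orders.groupby("segment")["amount"].mean()

# Clip amounts to the 1.5 x IQR fences computed over the whole column.
q1, q3 = orders["amount"].quantile([0.25, 0.75])
iqr = q3 - q1
orders["amount_clipped"] = orders["amount"].clip(
    lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr
)
clipped_mean = orders.groupby("segment")["amount_clipped"].mean()

# The median is a robust alternative that needs no clipping step.
median = orders.groupby("segment")["amount"].median()

print(pd.DataFrame({"raw_mean": raw_mean, "clipped_mean": clipped_mean, "median": median}))
```

Segment "b" has no extreme values, so its mean is the same before and after clipping, while the mean for segment "a" changes by more than an order of magnitude.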
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-anton-abyzov-sw-feature-engineer": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: file-read, file-write, code-execution
Related Skills
network-engineer
Cloud network architect for VPC design, service mesh, zero-trust networking, load balancers, and CDN optimization. Use for network troubleshooting or connectivity issues.
jira-multi-project-mapper
Expert in mapping SpecWeave specs to multiple JIRA projects with intelligent project detection and cross-project coordination. Use when syncing to multiple JIRA projects (project-per-team, component-based), or managing bidirectional sync across team boundaries.
helm-chart-scaffolding
Design, organize, and manage Helm charts for templating and packaging Kubernetes applications with reusable configurations. Use when creating Helm charts, packaging Kubernetes applications, or implementing templated deployments.
performance-optimization
React Native performance with Hermes V1, FlashList, expo-image v2, concurrent rendering. Use for slow app, memory leaks, or FPS issues.
release-strategy-advisor
Release strategy advisor - detects brownfield patterns (tags, CI/CD, changelogs), recommends versioning strategy based on architecture. Creates release-strategy.md.