Pandas
Analyze, transform, and clean DataFrames with efficient patterns for filtering, grouping, merging, and pivoting.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/ivangdavila/pandas
Setup
On first use, create ~/pandas/ and read setup.md for initialization. User preferences are stored in ~/pandas/memory.md — users can view or edit this file anytime.
When to Use
User needs to work with tabular data in Python. Agent handles DataFrame operations, data cleaning, aggregations, merges, pivots, and exports.
Architecture
Memory lives in ~/pandas/. See memory-template.md for structure.
~/pandas/
├── memory.md # User preferences and common patterns
└── snippets/ # Saved code patterns (optional)
Quick Reference
| Topic | File |
|---|---|
| Setup process | setup.md |
| Memory template | memory-template.md |
Core Rules
1. Use Vectorized Operations
- NEVER iterate with for loops over DataFrame rows
- Use .apply() only when vectorized alternatives don't exist
- Prefer df['col'].str.method() over apply(lambda x: x.method())
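As a minimal sketch of the rules above, the snippet below computes a derived column and an uppercased string column with vectorized operations, then builds the same total row by row for comparison. The column names and sample values are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration
df = pd.DataFrame({
    "name": ["alice", "bob", "carol"],
    "price": [10.0, 20.0, 30.0],
    "qty": [1, 2, 3],
})

# Vectorized arithmetic: operates on whole columns at once
df["total"] = df["price"] * df["qty"]

# Vectorized string method instead of apply(lambda x: x.upper())
df["name_upper"] = df["name"].str.upper()

# Row-wise equivalent, shown only for comparison; avoid this on large frames
slow_total = df.apply(lambda row: row["price"] * row["qty"], axis=1)
```

Both approaches give the same totals here; the vectorized form avoids a Python-level call per row, which typically matters as the frame grows.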
2. Chain Methods for Readability
# Good: method chaining
result = (df
    .query('age > 30')
    .groupby('city')
    .agg({'salary': 'mean'})
    .reset_index())
# Bad: intermediate variables everywhere
filtered = df[df['age'] > 30]
grouped = filtered.groupby('city')
result = grouped.agg({'salary': 'mean'}).reset_index()
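To make the comparison concrete, here is a small runnable sketch that runs both styles on the same data and lets you confirm they agree. The sample rows are hypothetical; only the column names come from the snippet above.

```python
import pandas as pd

# Hypothetical sample data using the column names from the snippet above
df = pd.DataFrame({
    "age": [25, 35, 45, 52],
    "city": ["NYC", "NYC", "LA", "LA"],
    "salary": [50_000, 90_000, 80_000, 100_000],
})

# Chained version: each step reads top to bottom
chained = (
    df
    .query("age > 30")
    .groupby("city")
    .agg({"salary": "mean"})
    .reset_index()
)

# Step-by-step version with intermediate variables
filtered = df[df["age"] > 30]
stepwise = filtered.groupby("city").agg({"salary": "mean"}).reset_index()

# True when both styles produce the same DataFrame
same_result = chained.equals(stepwise)
```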
3. Handle Missing Data Explicitly
- Always check df.isna().sum() before analysis
- Choose strategy: dropna(), fillna(), or interpolation
- Document WHY missing values exist before removing them
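The three strategies listed above might look like the following sketch. The data, the "unknown" placeholder, and the choice of which column gets which strategy are assumptions made for illustration, not recommendations for any particular dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical data with gaps in both a numeric and a text column
df = pd.DataFrame({
    "temp": [20.0, np.nan, 24.0, np.nan, 28.0],
    "city": ["NYC", None, "LA", "LA", "NYC"],
})

# Step 1: count missing values per column before doing anything else
missing_counts = df.isna().sum()

# Strategy A: drop rows where a required field is missing
dropped = df.dropna(subset=["city"])

# Strategy B: fill with an explicit placeholder value
filled = df.assign(city=df["city"].fillna("unknown"))

# Strategy C: interpolate numeric gaps (linear by default)
interpolated = df.assign(temp=df["temp"].interpolate())
```

Using assign() keeps the original frame untouched, so the strategies can be compared side by side before committing to one.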
4. Use Categorical for Repeated Strings
# Memory savings for columns with few unique values
df['status'] = df['status'].astype('category')
df['country'] = df['country'].astype('category')
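One way to check whether the conversion pays off is to measure memory before and after. The sketch below does that on a synthetic column with three repeated values; actual savings depend on the pandas version, the string dtype in use, and how many unique values the column holds.

```python
import pandas as pd

# Synthetic column: 30,000 rows but only three distinct values
statuses = ["active", "inactive", "pending"] * 10_000
df = pd.DataFrame({"status": statuses})

# deep=True counts the actual string storage, not just pointers
before = df["status"].memory_usage(deep=True)

df["status"] = df["status"].astype("category")
after = df["status"].memory_usage(deep=True)

print(f"Before: {before} bytes, after: {after} bytes")
```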
5. Merge with Validation
# Always specify how and validate
result = pd.merge(
    df1, df2,
    on='id',
    how='left',
    validate='m:1'  # Many-to-one: catch unexpected duplicates
)
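The sketch below shows what validate does in practice: the merge succeeds when the right-hand keys are unique and raises pandas.errors.MergeError when they are not. The orders and users frames are hypothetical stand-ins for df1 and df2.

```python
import pandas as pd

# Hypothetical frames standing in for df1 and df2
orders = pd.DataFrame({"id": [1, 1, 2], "amount": [10, 15, 7]})
users = pd.DataFrame({"id": [1, 2], "name": ["Ann", "Ben"]})

# Right-hand keys are unique, so the many-to-one check passes
result = pd.merge(orders, users, on="id", how="left", validate="m:1")

# A duplicated key on the right side violates many-to-one
users_dup = pd.DataFrame({"id": [1, 1, 2], "name": ["Ann", "Ann2", "Ben"]})
try:
    pd.merge(orders, users_dup, on="id", how="left", validate="m:1")
    raised = False
except pd.errors.MergeError:
    # Without validate, this merge would silently return extra rows
    raised = True
```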
6. Prefer query() for Complex Filters
# Readable
df.query('age > 30 and city == "NYC" and salary < 100000')
# Hard to read
df[(df['age'] > 30) & (df['city'] == 'NYC') & (df['salary'] < 100000)]
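As a quick check that the two forms select the same rows, the following sketch runs both on a small hypothetical frame and compares the results.

```python
import pandas as pd

# Hypothetical sample data using the column names from the filter above
df = pd.DataFrame({
    "age": [25, 35, 45, 50],
    "city": ["NYC", "NYC", "LA", "NYC"],
    "salary": [50_000, 90_000, 80_000, 120_000],
})

# String expression evaluated against the DataFrame's columns
via_query = df.query('age > 30 and city == "NYC" and salary < 100000')

# Equivalent boolean mask; each condition needs its own parentheses
via_mask = df[(df["age"] > 30) & (df["city"] == "NYC") & (df["salary"] < 100000)]

# True when both filters return identical rows
same_rows = via_query.equals(via_mask)
```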
7. Set Index When Appropriate
# Faster lookups, cleaner merges
df = df.set_index('user_id')
user_data = df.loc[12345] # Hash-based lookup, close to O(1) when user_id is unique
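A minimal sketch of index-based lookup follows, assuming user_id values are unique. The sample rows are invented; verify_integrity=True is used so that set_index raises if the assumption of uniqueness does not hold.

```python
import pandas as pd

# Hypothetical user table
df = pd.DataFrame({
    "user_id": [12345, 67890, 24680],
    "name": ["Ann", "Ben", "Cy"],
})

# verify_integrity=True raises ValueError if user_id contains duplicates
indexed = df.set_index("user_id", verify_integrity=True)

# Label-based lookups on the index
user_data = indexed.loc[12345]       # one row, returned as a Series
name = indexed.loc[12345, "name"]    # a single scalar value

# reset_index() turns the index back into an ordinary column
restored = indexed.reset_index()
```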
Common Traps
- SettingWithCopyWarning → Use .loc[] for assignment: df.loc[mask, 'col'] = value
- Slow loops → Replace iterrows() with vectorized ops or apply()
- Memory explosion → Use dtype in read_csv(): pd.read_csv(f, dtype={'id': 'int32'})
- Silent data loss → Check shape before/after merge: print(f"Before: {len(df1)}, After: {len(result)}")
- Index confusion → Use reset_index() after groupby() to get clean DataFrame
- Chained indexing → df['a']['b'] = value may silently assign to a temporary copy; use a single indexer: df.loc['b', 'a'] = value
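Several of the traps above can be illustrated in one short sketch: declaring dtypes at read time, assigning through .loc, and comparing row counts around a merge. The CSV text, the lookup table, and all column names are hypothetical.

```python
import io
import pandas as pd

# Hypothetical CSV content; in practice this would be a file path
csv_text = "id,score\n1,0.5\n2,0.7\n3,0.9\n"

# Declare dtypes up front instead of accepting the int64 default
df = pd.read_csv(io.StringIO(csv_text), dtype={"id": "int32"})

# Assign through a single .loc call rather than chained indexing
mask = df["score"] > 0.6
df.loc[mask, "score"] = 1.0

# Hypothetical lookup table that is missing id 3
lookup = pd.DataFrame({"id": [1, 2], "label": ["a", "b"]}).astype({"id": "int32"})

# An inner merge drops unmatched rows without any warning
result = df.merge(lookup, on="id", how="inner")
rows_lost = len(df) - len(result)
print(f"Before: {len(df)}, After: {len(result)}, Lost: {rows_lost}")
```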
Security & Privacy
Data storage:
- User preferences stored in ~/pandas/memory.md
- All DataFrame operations run locally
- No data is sent externally
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-ivangdavila-pandas": {
      "enabled": true,
      "auto_update": true
    }
  }
}