Commit 326785d

improve data-analysis skill structure + description
- expand description with trigger terms for agent selection
- add structured workflow (load, profile, transform, visualize, report)
- add error recovery table for common data issues
- replace fragmented code snippets with end-to-end EDA example
1 parent 348e5ea commit 326785d

1 file changed

Lines changed: 50 additions & 21 deletions

File tree

  • examples/agents/skills/data-analysis
@@ -1,29 +1,58 @@
 ---
 name: data-analysis
-description: Analyze datasets, generate charts, and create summary reports. Use when the user needs to work with CSV, Excel, or other tabular data formats for analysis or visualization.
+description: >
+Analyze datasets, generate charts, and create summary reports from CSV, Excel,
+JSON, Parquet, or other tabular data. Capabilities: statistical profiling,
+outlier detection, pivot tables, groupby aggregation, time-series analysis,
+correlation matrices, and publication-ready visualizations.
+Trigger terms: analyze data, plot chart, summarize CSV, data profiling,
+statistics, histogram, scatter plot, dashboard, EDA, exploratory analysis.
 ---
 
 # Data Analysis
 
 ## When to use this skill
 Use this skill when the user needs to:
-- Analyze CSV or Excel files
-- Generate charts and visualizations
-- Calculate statistics and summaries
-- Clean and transform data
-
-## How to analyze data
-1. Use pandas for data analysis:
-```python
-import pandas as pd
-df = pd.read_csv('data.csv')
-summary = df.describe()
-```
-
-## How to create visualizations
-1. Use matplotlib or seaborn for charts:
-```python
-import matplotlib.pyplot as plt
-df.plot(kind='bar')
-plt.savefig('chart.png')
-```
+- Analyze CSV, Excel, JSON, or Parquet files
+- Generate charts and visualizations (bar, line, scatter, heatmap)
+- Calculate statistics, correlations, or distributions
+- Clean, transform, pivot, or aggregate data
+- Perform exploratory data analysis (EDA)
+
+## Workflow
+
+1. **Load & validate** -- Read the file, confirm shape and dtypes, report missing values.
+2. **Profile** -- Run `df.describe()`, check nulls, detect outliers (IQR or z-score).
+3. **Transform** -- Filter, group, pivot, or resample as needed.
+4. **Visualize** -- Generate charts; save to file with `plt.savefig()`.
+5. **Report** -- Summarize key findings in plain language.
+
+## Error Recovery
+
+| Problem | Action |
+|---------|--------|
+| File not found / wrong path | List directory contents, ask user to confirm filename |
+| Encoding error on read | Retry with `encoding='latin-1'` then `'cp1252'` |
+| Mixed dtypes in column | Use `pd.to_numeric(col, errors='coerce')` and report coerced rows |
+| Empty dataframe after filter | Warn user, show original value counts for filter column |
+| Chart rendering fails | Fall back to text-based summary table |
+
+## Example: end-to-end EDA
+
+```python
+import pandas as pd, matplotlib.pyplot as plt, seaborn as sns
+
+df = pd.read_csv('sales.csv', parse_dates=['date'])
+assert not df.empty, "Dataset is empty"
+
+# Profile
+print(df.describe())
+print(f"Missing values:\n{df.isnull().sum()}")
+
+# Visualize
+fig, axes = plt.subplots(1, 2, figsize=(12, 5))
+df.groupby('region')['revenue'].sum().plot.bar(ax=axes[0], title='Revenue by Region')
+sns.heatmap(df.select_dtypes('number').corr(), annot=True, ax=axes[1])
+plt.tight_layout()
+plt.savefig('eda_report.png', dpi=150)
+```
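
The end-to-end example in this commit does not exercise two steps the new text describes: the encoding retry from the Error Recovery table and the IQR outlier check from the Profile step. The sketch below is one possible way to illustrate both using only the standard library; it is not part of the commit. The helper names `pick_encoding` and `iqr_outliers`, the candidate order, and the `k=1.5` multiplier are assumptions made for illustration. One deliberate difference from the table: `latin-1` maps every byte value, so a retry with it never raises and a later `'cp1252'` attempt would never run. The sketch therefore tries `cp1252` before `latin-1`.

```python
import statistics


def pick_encoding(path, candidates=("utf-8", "cp1252", "latin-1")):
    """Return the first candidate encoding that decodes the whole file.

    Hypothetical helper: reads the file fully into memory, so it is only
    suitable for files of modest size.
    """
    with open(path, "rb") as fh:
        raw = fh.read()
    for enc in candidates:
        try:
            raw.decode(enc)
        except UnicodeDecodeError:
            continue  # this encoding cannot represent the bytes; try the next
        return enc
    raise ValueError(f"none of {candidates} could decode {path!r}")


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles.

    Hypothetical helper: quartiles use linear interpolation between data
    points (method="inclusive").
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]
```

The name returned by `pick_encoding` could then be passed to pandas as `pd.read_csv(path, encoding=...)`, and `iqr_outliers` could be applied to a single numeric column after dropping nulls.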
