Engineering | Advanced | Data Analysis | Pandas | Visualization

Data Analyst

Data analysis workflows: pandas/SQL patterns, statistical analysis, visualization (matplotlib, plotly), dashboard design, A/B test analysis, and business metrics.

A practical data analysis skill that turns raw data into defensible insights. It handles the full workflow: data cleaning, exploratory analysis, statistical testing, visualization, and stakeholder-ready summaries. Opinionated on common pitfalls — Simpson's paradox, p-hacking, misleading visualizations, and metric misalignment — so analysis holds up to scrutiny.

Added Mar 20, 2026

$ npx skills add johnefemer/skillfish --skill data-analyst

What This Skill Can Do

Concrete capabilities you get when you install this skill.

Clean and reshape datasets using pandas, SQL window functions, and dbt transforms

Run statistical tests (t-test, chi-square, Mann-Whitney) and interpret results correctly

Design and analyze A/B experiments with proper power calculations and guardrail metrics

Build interactive dashboards with Plotly, Streamlit, or Observable

Define business metrics that align with company objectives and resist gaming

Identify and flag misleading analysis patterns: confounding, survivorship bias, and p-hacking
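The confounding pattern named in the last capability can be illustrated with a minimal pandas sketch. The counts, the page names (old_page, new_page), and the column names are hypothetical, chosen only so that the aggregate view and the per-device view disagree (Simpson's paradox). A device mix this uneven would normally come from a non-randomized comparison.

```python
import pandas as pd

# Hypothetical counts: the two pages were shown to very different device mixes.
df = pd.DataFrame(
    {
        "page": ["old_page", "old_page", "new_page", "new_page"],
        "device": ["mobile", "desktop", "mobile", "desktop"],
        "users": [2000, 8000, 8000, 2000],
        "conversions": [50, 400, 240, 110],
    }
)

# Aggregate view: pool all devices before computing the rate.
overall = df.groupby("page")[["users", "conversions"]].sum()
overall["rate"] = overall["conversions"] / overall["users"]

# Segmented view: compute the rate within each device first.
by_device = (
    df.assign(rate=df["conversions"] / df["users"])
      .pivot(index="device", columns="page", values="rate")
)

print(overall["rate"])  # pooled conversion rate per page
print(by_device)        # conversion rate per device and page
```

In this made-up data new_page converts better on both mobile and desktop yet looks worse when pooled, because most of its traffic came from the lower-converting mobile segment. Comparing within segments, or randomizing assignment, removes the distortion.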

Real Conversations

See how the Data Analyst skill responds to actual requests. These are representative examples of what your AI agent will produce.

You

We want to run an A/B test on our checkout button. Current conversion rate is 3.2%. We want to detect a 0.5 percentage point lift. How many users do we need?

Agent

Here's the power calculation:

import numpy as np
from scipy import stats

def sample_size_two_proportion(
    baseline: float,
    mde: float,  # minimum detectable effect
    alpha: float = 0.05,
    power: float = 0.80,
) -> int:
    p1 = baseline
    p2 = baseline + mde
    p_bar = (p1 + p2) / 2

    z_alpha = stats.norm.ppf(1 - alpha / 2)  # two-tailed
    z_beta = stats.norm.ppf(power)

    n = (
        (z_alpha * np.sqrt(2 * p_bar * (1 - p_bar))
         + z_beta * np.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
        / (p2 - p1) ** 2
    )
    return int(np.ceil(n))

n = sample_size_two_proportion(baseline=0.032, mde=0.005)
print(f"Per variant: {n:,} users")  # → ~20,900 per variant

Result: ~20,900 users per variant (~41,800 total).

Critical decisions to make now:

  • Runtime: At your traffic volume, how long to hit ~42k users? If >4 weeks, consider raising MDE to 1pp.
  • Primary metric: conversion rate. Guardrail metrics: revenue per user, cart abandonment — log these too.
  • Segment pre-registration: decide mobile vs desktop splits BEFORE launch, not after peeking.
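The runtime question in the first bullet can be sketched in a few lines. The helper names, the daily traffic figure, and the per-variant sample size below are hypothetical placeholders, not values from the calculation above. Rounding up to whole weeks is an added assumption, a common way to keep weekday and weekend traffic balanced.

```python
import math

def days_to_reach_sample(
    n_per_variant: int,
    daily_eligible_users: float,
    n_variants: int = 2,
    traffic_share: float = 1.0,  # fraction of eligible users enrolled in the test
) -> int:
    """Days needed to enroll the full sample across all variants."""
    daily_enrolled = daily_eligible_users * traffic_share
    if daily_enrolled <= 0:
        raise ValueError("daily enrolled traffic must be positive")
    total_needed = n_per_variant * n_variants
    return math.ceil(total_needed / daily_enrolled)

def round_up_to_full_weeks(days: int) -> int:
    """Extend the runtime to a multiple of 7 days."""
    return math.ceil(days / 7) * 7

# Hypothetical inputs: 10,000 users per variant, 1,200 eligible users per day.
days = days_to_reach_sample(n_per_variant=10_000, daily_eligible_users=1_200)
planned_days = round_up_to_full_weeks(days)
print(f"Minimum runtime: {days} days; planned runtime: {planned_days} days")
```

If the planned runtime exceeds the time you can afford, the levers are a larger MDE, a larger traffic share, or fewer variants.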

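Do NOT peek at results early and stop on the first significant reading: repeated looks at an unadjusted 5% threshold can push the false positive rate to 20% or more. If you need to monitor continuously, use a group-sequential design with adjusted boundaries (e.g., O'Brien-Fleming). Note that scipy.stats has no built-in O'Brien-Fleming routine, so the boundaries need a dedicated implementation.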
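The inflation from peeking can be checked with a small simulation. This is a minimal sketch using only the standard library, under a simplifying assumption: each look adds one equally sized batch, and under the null each batch's standardized effect is an independent N(0, 1) draw. The function name and defaults are illustrative.

```python
import random
from statistics import NormalDist

def false_positive_rates(n_experiments=2000, n_looks=10, alpha=0.05, seed=0):
    """Simulate A/A tests (no true effect) and compare two stopping rules.

    The z-statistic after k batches is sum(batch effects) / sqrt(k).
    Returns (rate when stopping at any significant look,
             rate when testing once at the final look).
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    peeking_hits = 0
    fixed_hits = 0
    for _ in range(n_experiments):
        running_sum = 0.0
        crossed_any_look = False
        z = 0.0
        for k in range(1, n_looks + 1):
            running_sum += rng.gauss(0.0, 1.0)
            z = running_sum / k ** 0.5
            if abs(z) > z_crit:
                crossed_any_look = True  # a peeker would have stopped here
        peeking_hits += crossed_any_look
        fixed_hits += abs(z) > z_crit  # only the final look counts
    return peeking_hits / n_experiments, fixed_hits / n_experiments

peeking_rate, fixed_rate = false_positive_rates()
print(f"Stop at first significant look: {peeking_rate:.1%}")
print(f"Single look at the end:         {fixed_rate:.1%}")
```

With no true effect, the single final look should stay near the nominal 5%, while stopping at the first significant look should land well above it. Raising n_looks widens the gap.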

Get Started

How to install and use this skill in your preferred environment.

Skills are designed for AI coding agents (Claude Code, Cursor, Windsurf) and IDE-based workflows where the agent can read files, run scripts, and act on your codebase.

Models & Context

Which AI models and context windows work best with this skill.

Recommended Models

Works well with any frontier model. Complex statistical reasoning benefits from GPT-4o or Claude Sonnet; pandas/SQL generation is reliable on most models.

Context Window

SKILL.md is ~8KB. Fits in 32K context. Load dataset schema and sample rows in context for best results.
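One way to prepare the schema and sample rows is a small pandas helper that renders them as compact text. This is a sketch: the function name describe_for_context and the output layout are assumptions, not part of the skill.

```python
import pandas as pd

def describe_for_context(df: pd.DataFrame, n_rows: int = 5) -> str:
    """Render column types, null counts, and the first rows as plain text."""
    lines = [f"rows={len(df)}, columns={df.shape[1]}", "schema:"]
    for name, dtype in df.dtypes.items():
        nulls = int(df[name].isna().sum())
        lines.append(f"  {name}: {dtype} (nulls={nulls})")
    lines.append("sample:")
    # CSV keeps the sample compact and needs no optional dependencies.
    lines.append(df.head(n_rows).to_csv(index=False).strip())
    return "\n".join(lines)

# Hypothetical dataset for illustration.
example = pd.DataFrame(
    {
        "user_id": [1, 2, 3],
        "variant": ["control", "treatment", "control"],
        "converted": [True, False, None],
    }
)
summary = describe_for_context(example, n_rows=2)
print(summary)
```

Paste the resulting text into the conversation alongside your question. Keeping n_rows small limits context usage, and it is worth masking sensitive columns first.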

Pro tips for best results

1

Be specific

Include numbers (baseline rates, sample sizes, traffic volume) so the skill can size the analysis.

2

Share constraints

Compliance needs, team size, and existing stack all improve the output.

3

Iterate

Start with a high-level analysis plan, then ask follow-ups for statistical tests, visualizations, or stakeholder summaries.

4

Combine skills

Pair with companion skills below for end-to-end coverage.

Works Great With

These skills complement Data Analyst for end-to-end coverage. Install them together for better results.

$ skillfish add johnefemer/skillfish --all # install all skills at once

Ready to try Data Analyst?

Install the skill and start getting expert-level guidance in your workflow — any agent, any IDE.

$ npx skills add johnefemer/skillfish --skill data-analyst