Engineering · TDD · Testing · Quality

TDD Guide

Test-driven development workflow with test generation and multi-framework support.

$ skillfish add ~skills/tdd-guide

Real Conversations

See how the TDD Guide skill responds to actual requests. These are representative examples of what your AI agent will produce.

You

Starting a new Node.js shopping cart service. Team is skeptical of TDD. Can you show me the red-green-refactor cycle on a real example?

Agent

TDD Red-Green-Refactor — Shopping Cart:

RED — Write a failing test first:

// cart.test.ts
import { Cart } from './cart';

describe('Cart', () => {
  it('adds an item and returns correct total', () => {
    const cart = new Cart();
    cart.addItem({ id: 'sku-1', name: 'Widget', price: 999, quantity: 2 });
    expect(cart.total()).toBe(1998);
  });
});
// Run: npx jest (FAILS: Cannot find module './cart')

GREEN — Write the minimum code to pass:

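// cart.ts
// Shape inferred from the test above; price is assumed to be in integer cents.
export interface CartItem {
  id: string;
  name: string;
  price: number;
  quantity: number;
}

export class Cart {
  private items: CartItem[] = [];

  addItem(item: CartItem) {
    this.items.push(item);
  }

  total(): number {
    return this.items.reduce((sum, i) => sum + i.price * i.quantity, 0);
  }
}
// Run: npx jest (PASSES)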

REFACTOR — Improve without breaking:

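// Merge quantities for duplicate SKUs instead of storing separate lines.
// total() returns the same value either way, so the existing test guards the change.
addItem(item: CartItem) {
  const existing = this.items.find(i => i.id === item.id);
  if (existing) {
    existing.quantity += item.quantity;
  } else {
    this.items.push({ ...item }); // copy so later merges don't mutate the caller's object
  }
}
// Run: npx jest (still PASSES)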

Next test (repeat cycle):

it('applies 10% discount coupon', () => {
  const cart = new Cart();
  cart.addItem({ id: 'sku-1', name: 'Widget', price: 1000, quantity: 1 });
  cart.applyCoupon({ code: 'SAVE10', type: 'percent', value: 10 });
  expect(cart.total()).toBe(900);
});
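
One way the GREEN step for this coupon test might look is sketched below. Only the test itself comes from the example; the `Coupon` shape, the "latest coupon wins" behaviour, and the rounding rule are assumptions made for illustration, and prices are assumed to be integer cents.

```typescript
// Assumed shapes: prices are integer cents; only 'percent' coupons are modelled.
interface CartItem {
  id: string;
  name: string;
  price: number;
  quantity: number;
}

interface Coupon {
  code: string;
  type: 'percent';
  value: number; // percentage points, e.g. 10 means 10% off
}

class Cart {
  private items: CartItem[] = [];
  private coupon: Coupon | undefined;

  addItem(item: CartItem): void {
    const existing = this.items.find((i) => i.id === item.id);
    if (existing) {
      existing.quantity += item.quantity;
    } else {
      this.items.push({ ...item });
    }
  }

  // Hypothetical behaviour: a later coupon replaces an earlier one.
  applyCoupon(coupon: Coupon): void {
    this.coupon = coupon;
  }

  total(): number {
    const subtotal = this.items.reduce((sum, i) => sum + i.price * i.quantity, 0);
    if (!this.coupon) return subtotal;
    // Round to whole cents; the rounding rule is an assumption, not part of the test.
    return Math.round((subtotal * (100 - this.coupon.value)) / 100);
  }
}
```

With this in place, both the original total test and the coupon test would pass. A natural next failing test would pin down a case the sketch only guesses at, such as a discount that does not divide into whole cents.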

The pitch to skeptics: TDD doesn't slow you down — it shifts debugging time from after deployment to during writing. The test suite you build becomes the spec.

Get Started

How to install and use this skill in your preferred environment.

Skills are designed for AI coding agents (Claude Code, Cursor, Windsurf) and IDE-based workflows where the agent can read files, run scripts, and act on your codebase. Web-based AI can use the knowledge and frameworks, but won't have tool access.

Models & Context

Which AI models and context windows work best with this skill.

Recommended Models

Best: Claude Opus 4, Claude Sonnet 4, GPT-4.1, Gemini 2.5 Pro, Grok 3, Kimi K2
Good: Claude Haiku 4.5, GPT-4.1 mini, Gemini 2.5 Flash, Grok 3 mini

Larger models produce more detailed, production-ready outputs.

Context Window

This skill's SKILL.md is typically 3–10 KB — fits in any modern context window.

8K: Skill only
32K+: Skill + conversation
100K+: Skill + references + codebase

All current frontier models (Claude, GPT, Gemini) support 100K+ context. Use the full window for complex multi-service work.

Pro tips for best results

1. Be specific: Include numbers (users, budget, RPS) so the skill can size the architecture.
2. Share constraints: Compliance needs, team size, and existing stack all improve the output.
3. Iterate: Start with a high-level design, then ask follow-ups for IaC, cost analysis, or security review.
4. Combine skills: Pair with companion skills below for end-to-end coverage.

Good to Know

Advanced guide and reference material for TDD Guide. Background, edge cases, and patterns worth understanding.

When TDD Helps vs Hurts

TDD is a design tool, not a religious obligation. Apply it where the feedback loop pays off.

Scenario | TDD appropriate? | Notes
--- | --- | ---
New feature with clear spec | Yes | Red-green-refactor shines; tests become the living spec
Exploratory / spike work | No | Write a spike, extract understanding, throw it away, then TDD the real implementation
Legacy code without tests | Characterization tests first | Write tests describing current behavior (even if buggy) before modifying anything
Performance-critical hot path | Profile first | TDD the correctness; benchmark separately. Don't optimize before measuring.
UI components (visual) | Maybe | TDD business logic and state; skip TDD for visual assertions and use snapshot or visual regression testing instead
Third-party integration wrappers | Yes | Mock the external call; TDD the wrapper's behavior and error handling
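
For the legacy-code row, a characterization test can be sketched roughly as follows. `legacyShippingCost`, its pricing rules, and the helper names are all hypothetical. The point is only that the recorded outputs, quirks included, become the baseline a rewrite has to reproduce.

```typescript
// Hypothetical legacy function. Note the quirk: negative weights cost nothing.
function legacyShippingCost(weightKg: number): number {
  if (weightKg < 0) return 0;
  if (weightKg <= 1) return 500;
  return 500 + Math.ceil(weightKg - 1) * 200;
}

// Record what the code does today for a set of probe inputs.
function characterize<I, O>(fn: (input: I) => O, inputs: I[]): Array<[I, O]> {
  return inputs.map((input): [I, O] => [input, fn(input)]);
}

// Return the inputs for which a candidate no longer matches the recorded baseline.
function mismatches<I, O>(candidate: (input: I) => O, baseline: Array<[I, O]>): I[] {
  return baseline
    .filter(([input, expected]) => candidate(input) !== expected)
    .map(([input]) => input);
}

const baseline = characterize(legacyShippingCost, [-1, 0, 1, 1.5, 10]);

// A rewrite that keeps the quirk on purpose; fixing it would be a separate, tested change.
function refactoredShippingCost(weightKg: number): number {
  if (weightKg < 0) return 0;
  const extraKg = Math.max(0, Math.ceil(weightKg - 1));
  return 500 + extraKg * 200;
}

// Empty when the rewrite preserves the recorded behaviour.
const changedInputs = mismatches(refactoredShippingCost, baseline);
```

In a real suite the baseline would usually be stored (for example as a snapshot file) rather than recomputed, so the comparison still works after the legacy function is deleted.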

The Test Pyramid in Practice

A commonly cited target ratio is roughly 70% unit / 20% integration / 10% E2E. Weighting the suite toward unit tests keeps it fast and keeps failures localized.

How teams end up with an inverted pyramid:

  • E2E tests are written first because they "feel like real testing"
  • Integration tests catch the bugs unit tests miss, so more get added
  • Unit tests require design discipline (DI, pure functions) that teams skip under deadline pressure
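
As a small illustration of the design discipline the last bullet refers to, the sketch below passes the clock in as a parameter instead of calling `new Date()` inside the function. `isCouponExpired` and the `ExpiringCoupon` shape are hypothetical names chosen for this example.

```typescript
interface ExpiringCoupon {
  code: string;
  expiresAt: Date;
}

// A clock is any function that returns the current time.
type Clock = () => Date;

// The result depends only on the coupon and the injected clock,
// so a unit test can control "now" without touching the system time.
function isCouponExpired(coupon: ExpiringCoupon, now: Clock = () => new Date()): boolean {
  return now().getTime() >= coupon.expiresAt.getTime();
}

// In a unit test, a fixed clock makes the outcome deterministic.
const coupon: ExpiringCoupon = { code: 'SAVE10', expiresAt: new Date('2024-06-01T00:00:00Z') };
const beforeExpiry: Clock = () => new Date('2024-05-31T23:59:59Z');
const afterExpiry: Clock = () => new Date('2024-06-01T00:00:01Z');

const stillValid = !isCouponExpired(coupon, beforeExpiry);
const expired = isCouponExpired(coupon, afterExpiry);
```

Production code can call `isCouponExpired(coupon)` and get the real clock by default; only tests need to pass one in.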

How to detect an inverted pyramid from CI metrics:

If: test suite runtime > 10 minutes
And: disabling E2E tests cuts runtime by >60%
Then: your pyramid is inverted

The other signal: flaky tests. E2E and integration tests are the primary source of flakiness (network, timing, shared state). If you have >2% flake rate on your suite, you probably have too many high-level tests.

Recovery: Don't delete E2E tests — add unit tests until the ratio normalizes. Deleting tests leaves undetected regressions.
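
The two signals above can be expressed as a small check, sketched here under stated assumptions. The `SuiteMetrics` field names are invented for illustration, not taken from any CI product, and the E2E share of total runtime is used as a stand-in for "how much runtime disappears when E2E tests are disabled", which only holds if tests run serially.

```typescript
// Field names are assumptions about what a CI system might export.
interface SuiteMetrics {
  totalRuntimeSeconds: number;
  e2eRuntimeSeconds: number; // runtime attributable to E2E tests
  totalRuns: number; // test executions in the sampling window
  flakyRuns: number; // executions that failed, then passed on retry
}

interface PyramidDiagnosis {
  inverted: boolean;
  tooFlaky: boolean;
}

function diagnosePyramid(m: SuiteMetrics): PyramidDiagnosis {
  const slow = m.totalRuntimeSeconds > 10 * 60; // suite runtime > 10 minutes
  const e2eShare = m.totalRuntimeSeconds > 0 ? m.e2eRuntimeSeconds / m.totalRuntimeSeconds : 0;
  const flakeRate = m.totalRuns > 0 ? m.flakyRuns / m.totalRuns : 0;
  return {
    inverted: slow && e2eShare > 0.6, // E2E accounts for > 60% of runtime
    tooFlaky: flakeRate > 0.02, // > 2% flake rate
  };
}

// A 15-minute suite where E2E takes 10.5 minutes and 3.5% of runs are flaky.
const example = diagnosePyramid({
  totalRuntimeSeconds: 900,
  e2eRuntimeSeconds: 630,
  totalRuns: 1000,
  flakyRuns: 35,
});
```

For the sample metrics, both `inverted` and `tooFlaky` come out `true`. The thresholds are the rules of thumb from the text, not fixed constants, and are worth tuning per project.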


Coverage Is Not Quality

100% line coverage is achievable with tests that assert almost nothing:

it('runs without throwing', () => {
  expect(() => processPayment(validOrder)).not.toThrow();
});
// Executes every line on the happy path, but only fails if the call throws.
// A wrong amount or a skipped charge would go undetected.

What coverage misses:

  • Missing test cases (the code path you didn't think to write)
  • Wrong assertions (testing the wrong output)
  • Incorrect edge case handling that still returns a value

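Mutation score is the better signal. A mutation testing tool (Stryker for JS/TS, mutmut for Python, PIT for Java) makes small code changes, such as flipping > to >= or changing a return value, and reruns your tests against each changed version (a "mutant"). If no test fails, the mutant survives, which points to a missing or weak assertion. The mutation score is the percentage of mutants your tests kill.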

Target: mutation score >70% on critical business logic. Don't apply it to all code — it's expensive to run and the ROI is highest on payment, auth, and calculation logic.
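
To make the idea concrete, the sketch below does by hand what a mutation tool automates: it runs a weak test and a boundary test against an original rule and a `>=` to `>` mutant. The free-shipping rule and all names are hypothetical, and this is an illustration of the concept rather than how Stryker or any other tool works internally.

```typescript
type FreeShippingRule = (totalCents: number) => boolean;

// Original rule (hypothetical): free shipping from 5000 cents upward.
const original: FreeShippingRule = (totalCents) => totalCents >= 5000;
// The kind of mutant a tool might generate: >= flipped to >.
const mutant: FreeShippingRule = (totalCents) => totalCents > 5000;

// A "test" here is a function that returns true when it passes for a given implementation.
type RuleTest = (rule: FreeShippingRule) => boolean;

// Executes the code (so it counts toward coverage) but checks nothing about the result.
const weakTest: RuleTest = (rule) => {
  rule(6000);
  return true;
};

// Pins the boundary, which is exactly where the mutant differs from the original.
const boundaryTest: RuleTest = (rule) => rule(5000) === true && rule(4999) === false;

// A mutant is killed when a test passes on the original but fails on the mutant.
function killsMutant(test: RuleTest, orig: FreeShippingRule, mut: FreeShippingRule): boolean {
  return test(orig) && !test(mut);
}

const weakKills = killsMutant(weakTest, original, mutant); // false: the mutant survives
const boundaryKills = killsMutant(boundaryTest, original, mutant); // true: the mutant is killed
```

Both tests give the rule full line coverage, yet only the boundary test would notice the off-by-one change.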


AAA vs BDD Test Style

Both are structuring conventions. The difference is readability for different audiences.

AAA (Arrange/Act/Assert) — clear for developers reading code:

it('applies percentage discount to cart total', () => {
  // Arrange
  const cart = new Cart();
  cart.addItem({ id: 'sku-1', name: 'Widget', price: 1000, quantity: 1 });
  const coupon = { code: 'SAVE10', type: 'percent', value: 10 };
  // Act
  cart.applyCoupon(coupon);
  // Assert
  expect(cart.total()).toBe(900);
});

BDD (Given/When/Then) — reads like a product spec:

// Jest has no built-in given/when/then; nested describe blocks give the same reading.
describe('given a cart containing one item at $10.00', () => {
  describe('when a 10% discount coupon is applied', () => {
    it('then the total should be $9.00', () => {
      const cart = new Cart();
      cart.addItem({ id: 'sku-1', name: 'Widget', price: 1000, quantity: 1 });
      cart.applyCoupon({ code: 'SAVE10', type: 'percent', value: 10 });
      expect(cart.total()).toBe(900);
    });
  });
});

When to use each: AAA for team-internal tests where all readers are engineers. BDD when tests double as acceptance criteria reviewed by product or QA — the Given/When/Then maps directly to user story format. Avoid mixing styles in the same file.


"Obvious Implementation" vs "Fake It Till You Make It"

Two of the implementation strategies Kent Beck describes in Test-Driven Development: By Example:

Obvious implementation: When you know exactly how to write the code correctly, just write it. Don't introduce artificial fakery for its own sake. Skip the stub if the real implementation is three lines.

Fake it till you make it: When the correct implementation isn't clear, start with a hardcoded return value that makes the test pass. Let subsequent tests force you to generalize.

// Test 1: passes with hardcoded return
total() { return 1998; }

// Test 2: new items — hardcoded breaks
// Forces you to write: return this.items.reduce(...)

The value of "fake it" isn't the fake code — it's the discipline of writing tests that force you toward the real implementation one constraint at a time. Use it when you're unsure of the design, not as a default.
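
The progression can be sketched in runnable form as below. The function names and the second test case are made up for illustration; the first case reuses the 999 x 2 = 1998 total from the cart example.

```typescript
interface LineItem {
  price: number; // integer cents (assumed)
  quantity: number;
}

// Step 1: the fake. Enough to pass the first test and nothing more.
const totalFaked = (_items: LineItem[]): number => 1998;

// Step 2: the generalization that a second, different test forces.
const totalReal = (items: LineItem[]): number =>
  items.reduce((sum, i) => sum + i.price * i.quantity, 0);

const firstCase: LineItem[] = [{ price: 999, quantity: 2 }]; // expected total 1998
const secondCase: LineItem[] = [{ price: 500, quantity: 3 }]; // expected total 1500

const fakePassesFirst = totalFaked(firstCase) === 1998; // true
const fakePassesSecond = totalFaked(secondCase) === 1500; // false: the fake is exposed
const realPassesBoth = totalReal(firstCase) === 1998 && totalReal(secondCase) === 1500; // true
```

The second case is what does the work here: without it, the hardcoded version would stay green indefinitely.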
