
Tech Debt Tracker

Codebase debt scanner, prioritizer, and trend dashboard.

$skillfish add ~skills/tech-debt-tracker

Real Conversations

See how the Tech Debt Tracker skill responds to actual requests. These are representative examples of what your AI agent will produce.

You

My CTO wants me to present 'the state of our tech debt' to the board next week. How do I actually measure and quantify it?

Agent

Boards respond to numbers, not vague statements about code quality. The key is translating technical debt into metrics the business already cares about: velocity, risk, and engineering capacity. Here's a framework that gives you defensible data without drowning in analysis.

The Five Debt Categories

Not all debt is the same. Measure each category separately or you'll conflate problems that have different solutions.

| Category | What it is | Primary metric |
|---|---|---|
| Code quality | Complexity, duplication, dead code | Cyclomatic complexity, duplication % |
| Test coverage | Missing or weak tests | Line/branch coverage %, mutation score |
| Architecture | Wrong abstractions, tight coupling | Dependency cycles, module coupling score |
| Infrastructure | Outdated runtimes, manual deploys, flaky CI | Deployment frequency, MTTR, CI pass rate |
| Documentation | Missing ADRs, outdated READMEs, no onboarding guide | Hours to first PR for new engineer |

Measurement Approach by Category

Code Quality

# JavaScript/TypeScript — complexity via ESLint
npx eslint --rule 'complexity: ["warn", 10]' src/ --format json > complexity.json

# Python
radon cc src/ -a -s  # average cyclomatic complexity per module

# All languages — duplication
npx jscpd src/ --min-lines 10 --reporters json

Target: average cyclomatic complexity < 10 per function. Duplication < 5% of codebase.

Test Coverage

npm test -- --coverage  # Jest
pytest --cov=src --cov-report=json  # Python

Track line coverage AND branch coverage. A file can be 90% line-covered with 40% branch coverage — that's where bugs hide.
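If you use pytest-cov, branch tracking is opt-in. A sketch of a CI gate that reports both dimensions and fails the run below a threshold (the 80% figure is illustrative, and `src` is a placeholder for your package):

```shell
# --cov-branch adds branch coverage to the line-coverage report;
# --cov-fail-under fails the run when total coverage drops below 80%
pytest --cov=src --cov-branch --cov-report=term-missing --cov-fail-under=80
```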

Architecture

Count dependency cycles using madge (JS) or import-linter (Python). A codebase with zero cycles between modules is clean. Any cycles indicate architectural debt — modules that can't be changed independently.
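For a quick sanity check without extra tooling, you can feed a module edge list (e.g. exported from madge or import-linter) to coreutils `tsort`, which exits non-zero on any cycle. A sketch, with hypothetical module names in `deps.txt`:

```shell
# deps.txt: one "importer imported" pair per line, exported from your
# dependency tool (module names below are illustrative)
printf '%s\n' 'billing users' 'users auth' 'auth billing' > deps.txt

# tsort topologically sorts the graph and fails if it contains a loop
if tsort deps.txt > /dev/null 2>&1; then
  echo "no dependency cycles"
else
  echo "dependency cycle detected"
fi
```

Here the three edges form a loop (billing → users → auth → billing), so the check reports a cycle.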

Infrastructure Debt

Pull from your CI/CD tooling:

  • Deployment frequency: how many deploys per week?
  • MTTR: average time from incident detection to resolution (from your incident tracker)
  • CI pass rate: % of pipeline runs that pass on first attempt
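If your CI doesn't surface the pass rate directly, a one-off approximation is to export recent run conclusions (one per line) and let awk do the arithmetic. A sketch, with a hypothetical `ci_runs.log` and illustrative values:

```shell
# ci_runs.log: one conclusion per line ("success" or "failure"),
# exported from your CI provider's API
printf '%s\n' success success failure success > ci_runs.log

# first-attempt pass rate = successful runs / total runs
awk '/^success$/ { pass++ } { n++ } END { printf "CI pass rate: %.0f%%\n", 100 * pass / n }' ci_runs.log
```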

Documentation

Track time-to-first-PR for the last 3 new engineers. If it's over 3 days, documentation is a blocker.
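To turn that into a single number for the slide, record each recent hire's days-to-first-merged-PR and average them. A sketch with illustrative values:

```shell
# days_to_first_pr.txt: one integer per recent hire
# (days from start date to first merged PR); values are illustrative
printf '%s\n' 2 5 4 > days_to_first_pr.txt

awk '{ sum += $1; n++ } END { printf "avg days to first PR: %.1f\n", sum / n }' days_to_first_pr.txt
```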

Debt Score Framework

Create a single number for board slides. Score each category 1–5 (5 = worst), then calculate a weighted total:

| Category | Weight | Your Score | Weighted |
|---|---|---|---|
| Code quality | 20% | 3 | 0.60 |
| Test coverage | 25% | 4 | 1.00 |
| Architecture | 20% | 2 | 0.40 |
| Infrastructure | 25% | 3 | 0.75 |
| Documentation | 10% | 4 | 0.40 |
| **Total** | 100% | | **3.15 / 5.0** |

Re-run this quarterly. The trend matters more than the absolute number — a score moving from 3.5 to 2.8 over two quarters is the story you want to tell.
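The weighted total is just a dot product, so it's easy to script and re-run each quarter. A sketch, where `scores.csv` and its values are illustrative (weights must sum to 1.0):

```shell
# scores.csv: category,weight,score — fill in your own quarterly scores
cat > scores.csv <<'EOF'
code_quality,0.20,3
test_coverage,0.25,4
architecture,0.20,2
infrastructure,0.25,3
documentation,0.10,4
EOF

# weighted debt score = sum of (weight * score) across categories
awk -F, '{ total += $2 * $3 } END { printf "debt score: %.2f / 5.0\n", total }' scores.csv
```

Commit the CSV next to the repo so the quarterly trend is diffable.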

Executive Summary Format (One Slide)

State of Engineering Health — Q1 2025

Overall Debt Score: 3.1 / 5.0  (lower = better)
Trend: ↓ 0.4 from Q4 2024  ✓ improving

Where we are now:
• Test coverage: 61% (target: 80%)
• Deployment frequency: 2x/week (target: daily)
• Avg complexity per function: 14 (target: <10)
• Time to first PR (new hires): 4 days (target: 2 days)

Business impact:
• Est. 18% of sprint capacity spent on debt-related rework
• 3 of last 5 production incidents traced to untested code paths
• Onboarding a new engineer currently takes ~2 weeks to full productivity

Requested investment:
• 20% of engineering capacity per sprint dedicated to debt reduction
• Est. ROI: 15% velocity increase within 2 quarters

The board doesn't need to understand cyclomatic complexity. They need to understand: "We're spending 18% of our engineering budget reworking things that should have been done right the first time." That's the framing that gets buy-in.

Keep the slide to these four elements: where we are, how we're trending, what it's costing the business, and what we're asking for.

Get Started

How to install and use this skill in your preferred environment.

Skills are designed for AI coding agents (Claude Code, Cursor, Windsurf) and IDE-based workflows where the agent can read files, run scripts, and act on your codebase. Web-based AI can use the knowledge and frameworks, but won't have tool access.

Models & Context

Which AI models and context windows work best with this skill.

Recommended Models

Best: Claude Opus 4, Claude Sonnet 4, GPT-4.1, Gemini 2.5 Pro, Grok 3, Kimi K2
Good: Claude Haiku 4.5, GPT-4.1 mini, Gemini 2.5 Flash, Grok 3 mini

Larger models produce more detailed, production-ready outputs.

Context Window

This skill's SKILL.md is typically 3–10 KB — fits in any modern context window.

  • 8K: skill only
  • 32K+: skill + conversation
  • 100K+: skill + references + codebase

All current frontier models (Claude, GPT, Gemini) support 100K+ context. Use the full window for complex multi-service work.

Pro tips for best results

1. Be specific: include numbers (codebase size, team size, sprint capacity) so the skill can scope its recommendations.

2. Share constraints: compliance needs, team size, and existing stack all improve the output.

3. Iterate: start with a high-level assessment, then ask follow-ups for prioritization, remediation plans, or CI integration.

4. Combine skills: pair with companion skills for end-to-end coverage.

Ready to try Tech Debt Tracker?

Install the skill and start getting expert-level guidance in your workflow — any agent, any IDE.

$skillfish add ~skills/tech-debt-tracker