
Tech Debt Tracker

Codebase debt scanner, prioritizer, and trend dashboard.

$skillfish add ~skills/tech-debt-tracker

Real Conversations

See how the Tech Debt Tracker skill responds to actual requests. These are representative examples of what your AI agent will produce.

You

My CTO wants me to present 'the state of our tech debt' to the board next week. How do I actually measure and quantify it?

Agent

Boards respond to numbers, not vague statements about code quality. The key is translating technical debt into metrics the business already cares about: velocity, risk, and engineering capacity. Here's a framework that gives you defensible data without drowning in analysis.

The Five Debt Categories

Not all debt is the same. Measure each category separately or you'll conflate problems that have different solutions.

| Category | What it is | Primary metric |
|---|---|---|
| Code quality | Complexity, duplication, dead code | Cyclomatic complexity, duplication % |
| Test coverage | Missing or weak tests | Line/branch coverage %, mutation score |
| Architecture | Wrong abstractions, tight coupling | Dependency cycles, module coupling score |
| Infrastructure | Outdated runtimes, manual deploys, flaky CI | Deployment frequency, MTTR, CI pass rate |
| Documentation | Missing ADRs, outdated READMEs, no onboarding guide | Hours to first PR for new engineer |

Measurement Approach by Category

Code Quality

# JavaScript/TypeScript — complexity via ESLint
npx eslint --rule 'complexity: ["warn", 10]' src/ --format json > complexity.json

# Python
radon cc src/ -a -s  # average cyclomatic complexity per module

# All languages — duplication
npx jscpd src/ --min-lines 10 --reporters json

Target: average cyclomatic complexity < 10 per function. Duplication < 5% of codebase.

Test Coverage

npm test -- --coverage  # Jest
pytest --cov=src --cov-report=json  # Python

Track line coverage AND branch coverage. A file can be 90% line-covered with 40% branch coverage — that's where bugs hide.
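If you use pytest-cov, branch tracking is opt-in. A sketch of a CI gate that reports both dimensions and fails the run below a threshold (the 80% figure is illustrative, and `src` is a placeholder for your package):

```shell
# --cov-branch adds branch coverage to the line-coverage report;
# --cov-fail-under fails the run when total coverage drops below 80%
pytest --cov=src --cov-branch --cov-report=term-missing --cov-fail-under=80
```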

Architecture

Count dependency cycles using madge (JS) or import-linter (Python). A codebase with zero cycles between modules is clean. Any cycles indicate architectural debt — modules that can't be changed independently.
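For a quick sanity check without extra tooling, you can feed a module edge list (e.g. exported from madge or import-linter) to coreutils `tsort`, which exits non-zero on any cycle. A sketch, with hypothetical module names in `deps.txt`:

```shell
# deps.txt: one "importer imported" pair per line, exported from your
# dependency tool (module names below are illustrative)
printf '%s\n' 'billing users' 'users auth' 'auth billing' > deps.txt

# tsort topologically sorts the graph and fails if it contains a loop
if tsort deps.txt > /dev/null 2>&1; then
  echo "no dependency cycles"
else
  echo "dependency cycle detected"
fi
```

Here the three edges form a loop (billing → users → auth → billing), so the check reports a cycle.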

Infrastructure Debt

Pull from your CI/CD tooling:

  • Deployment frequency: how many deploys per week?
  • MTTR: average time from incident detection to resolution (from your incident tracker)
  • CI pass rate: % of pipeline runs that pass on first attempt
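If your CI doesn't surface the pass rate directly, a one-off approximation is to export recent run conclusions (one per line) and let awk do the arithmetic. A sketch, with a hypothetical `ci_runs.log` and illustrative values:

```shell
# ci_runs.log: one conclusion per line ("success" or "failure"),
# exported from your CI provider's API
printf '%s\n' success success failure success > ci_runs.log

# first-attempt pass rate = successful runs / total runs
awk '/^success$/ { pass++ } { n++ } END { printf "CI pass rate: %.0f%%\n", 100 * pass / n }' ci_runs.log
```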

Documentation

Track time-to-first-PR for the last 3 new engineers. If it's over 3 days, documentation is a blocker.
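To turn that into a single number for the slide, record each recent hire's days-to-first-merged-PR and average them. A sketch with illustrative values:

```shell
# days_to_first_pr.txt: one integer per recent hire
# (days from start date to first merged PR); values are illustrative
printf '%s\n' 2 5 4 > days_to_first_pr.txt

awk '{ sum += $1; n++ } END { printf "avg days to first PR: %.1f\n", sum / n }' days_to_first_pr.txt
```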

Debt Score Framework

Create a single number for board slides. Score each category 1–5 (5 = worst), then calculate a weighted total:

| Category | Weight | Your Score | Weighted |
|---|---|---|---|
| Code quality | 20% | 3 | 0.60 |
| Test coverage | 25% | 4 | 1.00 |
| Architecture | 20% | 2 | 0.40 |
| Infrastructure | 25% | 3 | 0.75 |
| Documentation | 10% | 4 | 0.40 |
| **Total** | 100% | | **3.15 / 5.0** |

Re-run this quarterly. The trend matters more than the absolute number — a score moving from 3.5 to 2.8 over two quarters is the story you want to tell.
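The weighted total is just a dot product, so it's easy to script and re-run each quarter. A sketch, where `scores.csv` and its values are illustrative (weights must sum to 1.0):

```shell
# scores.csv: category,weight,score — fill in your own quarterly scores
cat > scores.csv <<'EOF'
code_quality,0.20,3
test_coverage,0.25,4
architecture,0.20,2
infrastructure,0.25,3
documentation,0.10,4
EOF

# weighted debt score = sum of (weight * score) across categories
awk -F, '{ total += $2 * $3 } END { printf "debt score: %.2f / 5.0\n", total }' scores.csv
```

Commit the CSV next to the repo so the quarterly trend is diffable.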

Executive Summary Format (One Slide)

State of Engineering Health — Q1 2025

Overall Debt Score: 3.1 / 5.0  (lower = better)
Trend: ↓ 0.4 from Q4 2024  ✓ improving

Where we are now:
• Test coverage: 61% (target: 80%)
• Deployment frequency: 2x/week (target: daily)
• Avg complexity per function: 14 (target: <10)
• Time to first PR (new hires): 4 days (target: 2 days)

Business impact:
• Est. 18% of sprint capacity spent on debt-related rework
• 3 of last 5 production incidents traced to untested code paths
• Onboarding a new engineer currently takes ~2 weeks to full productivity

Requested investment:
• 20% of engineering capacity per sprint dedicated to debt reduction
• Est. ROI: 15% velocity increase within 2 quarters

The board doesn't need to understand cyclomatic complexity. They need to understand: "We're spending 18% of our engineering budget reworking things that should have been done right the first time." That's the framing that gets buy-in.

Keep the slide to these four elements: where we are, how we're trending, what it's costing the business, and what we're asking for.

Get Started

How to install and use this skill in your preferred environment.

Skills are designed for AI coding agents (Claude Code, Cursor, Windsurf) and IDE-based workflows where the agent can read files, run scripts, and act on your codebase. Web-based AI can use the knowledge and frameworks, but won't have tool access.

Models & Context

Which AI models and context windows work best with this skill.

Recommended Models

Best: Claude Opus 4, Claude Sonnet 4, GPT-4.1, Gemini 2.5 Pro, Grok 3, Kimi K2
Good: Claude Haiku 4.5, GPT-4.1 mini, Gemini 2.5 Flash, Grok 3 mini

Larger models produce more detailed, production-ready outputs.

Context Window

This skill's SKILL.md is typically 3–10 KB — fits in any modern context window.

  • 8K: skill only
  • 32K+: skill + conversation
  • 100K+: skill + references + codebase

All current frontier models (Claude, GPT, Gemini) support 100K+ context. Use the full window for complex multi-service work.

Pro tips for best results

1. Be specific: include numbers (codebase size, team size, sprint capacity) so the skill can scope its recommendations.

2. Share constraints: compliance needs, team size, and existing stack all improve the output.

3. Iterate: start with a high-level assessment, then ask follow-ups for prioritization, remediation plans, or CI integration.

4. Combine skills: pair with companion skills for end-to-end coverage.

Ready to try Tech Debt Tracker?

Install the skill and start getting expert-level guidance in your workflow — any agent, any IDE.

$skillfish add ~skills/tech-debt-tracker