A public API at 3,000 developer customers is a different threat model than an internal service — you need rate limiting that's fair, hard to game, and observable. Here's how to design this properly.
## Rate Limiting Strategy
Three common algorithms; pick based on your burst tolerance:

| Algorithm | Burst Handling | Implementation | Best For |
| --- | --- | --- | --- |
| Token Bucket | Allows bursts up to bucket capacity | Redis + Lua script | Most APIs; flexible and developer-friendly |
| Leaky Bucket | Smooths bursts to constant output rate | Queue-based | Protecting fragile downstream services |
| Sliding Window | Exact counts, no burst allowance | Redis sorted set | Strict compliance (billing, per-unit quotas) |
Recommendation: Token bucket for your developer API. Developers expect occasional burst tolerance (batch processing, retries), and token bucket gives you that without sacrificing protection.
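Before reaching for Redis, it helps to see that the core algorithm fits in a few lines. A minimal in-memory sketch (single process only; the class name and parameters are illustrative, and a multi-node API needs shared state like the Redis version that follows):

```typescript
// Minimal in-memory token bucket. Tokens refill continuously at
// refillRate per second, capped at capacity (the allowed burst size).
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,   // max burst size
    private refillRate: number, // tokens per second
    now: number = Date.now() / 1000
  ) {
    this.tokens = capacity; // start full so new clients can burst
    this.lastRefill = now;
  }

  // Returns true and consumes tokens if enough are available.
  tryRemove(count = 1, now: number = Date.now() / 1000): boolean {
    const elapsed = Math.max(0, now - this.lastRefill);
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate);
    this.lastRefill = now;
    if (this.tokens >= count) {
      this.tokens -= count;
      return true;
    }
    return false;
  }
}
```

Injecting the clock as a parameter keeps the refill math testable without sleeping in tests.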
## Redis-Backed Rate Limiter (Node.js)
Two layers: per-customer and global.
```typescript
// lib/rateLimiter.ts
import { Redis } from 'ioredis';

const redis = new Redis(process.env.REDIS_URL!);

interface RateLimitResult {
  allowed: boolean;
  remaining: number;
  resetAt: number; // unix timestamp (seconds)
}

// Atomic token bucket via Lua; prevents race conditions between the
// read, refill, and decrement steps.
const TOKEN_BUCKET_SCRIPT = `
local key = KEYS[1]
local capacity = tonumber(ARGV[1])
local refill_rate = tonumber(ARGV[2]) -- tokens per second
local now = tonumber(ARGV[3])
local requested = tonumber(ARGV[4])

local bucket = redis.call('HMGET', key, 'tokens', 'last_refill')
local tokens = tonumber(bucket[1]) or capacity
local last_refill = tonumber(bucket[2]) or now

-- Refill tokens based on elapsed time, capped at capacity
local elapsed = math.max(0, now - last_refill)
local new_tokens = math.min(capacity, tokens + (elapsed * refill_rate))

if new_tokens >= requested then
  redis.call('HMSET', key, 'tokens', new_tokens - requested, 'last_refill', now)
  redis.call('EXPIRE', key, 86400)
  return {1, math.floor(new_tokens - requested)}
else
  redis.call('HMSET', key, 'tokens', new_tokens, 'last_refill', now)
  redis.call('EXPIRE', key, 86400)
  return {0, math.floor(new_tokens)}
end
`;

export async function checkRateLimit(customerId: string): Promise<RateLimitResult> {
  const now = Date.now() / 1000;

  // Per-customer: 1,000 req/min (16.67 tokens/s), burst up to 200
  const customerKey = `rl:customer:${customerId}`;
  const [custAllowed, custRemaining] = (await redis.eval(
    TOKEN_BUCKET_SCRIPT, 1, customerKey, 200, 16.67, now, 1
  )) as [number, number];

  // Global: ~50,000 req/min across all customers (833 tokens/s), burst up to 5,000.
  // Note: this check consumes a global token even when the per-customer
  // check above was denied; acceptable slack for a protective cap.
  const globalKey = 'rl:global';
  const [globalAllowed] = (await redis.eval(
    TOKEN_BUCKET_SCRIPT, 1, globalKey, 5000, 833, now, 1
  )) as [number, number];

  const allowed = custAllowed === 1 && globalAllowed === 1;
  const resetAt = Math.ceil(now) + 60; // approximate: worst-case full refill
  return { allowed, remaining: custRemaining, resetAt };
}
```
```typescript
// middleware/rateLimitMiddleware.ts
import { Request, Response, NextFunction } from 'express';
import { checkRateLimit } from '../lib/rateLimiter';

export async function rateLimitMiddleware(req: Request, res: Response, next: NextFunction) {
  const customerId =
    (req.headers['x-customer-id'] as string) || (req as any).apiKey?.customerId;
  if (!customerId) return res.status(401).json({ error: 'Unauthorized' });

  const result = await checkRateLimit(customerId);

  // Always set headers; developers depend on these
  res.set({
    'X-RateLimit-Limit': '1000',
    'X-RateLimit-Remaining': result.remaining.toString(),
    'X-RateLimit-Reset': result.resetAt.toString(),
  });

  if (!result.allowed) {
    // Retry-After only on rejections; res.set rejects undefined values
    res.set('Retry-After', '60');
    return res.status(429).json({
      error: 'rate_limit_exceeded',
      message: 'Rate limit exceeded. See X-RateLimit-Reset for retry time.',
      docs: 'https://docs.yourapi.com/rate-limits',
    });
  }
  next();
}
```
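On the client side, those headers are enough to implement polite backoff. A sketch of the wait-time calculation (the helper name is illustrative; header names match the middleware above):

```typescript
// Compute how long a client should wait before retrying, from the
// rate-limit response headers. Retry-After (seconds) is only present
// on 429s; otherwise fall back to X-RateLimit-Reset (unix seconds).
function msUntilRetry(
  headers: Record<string, string | undefined>,
  nowMs: number = Date.now()
): number {
  const retryAfter = headers['retry-after'];
  if (retryAfter !== undefined) return parseInt(retryAfter, 10) * 1000;
  const reset = parseInt(headers['x-ratelimit-reset'] ?? '0', 10);
  return Math.max(0, reset * 1000 - nowMs);
}
```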
## API Versioning
| Approach | Example | Trade-offs |
| --- | --- | --- |
| URL path | `/v1/orders` | Most visible, easy to route; pollutes URLs |
| Header | `Accept: application/vnd.api+json;version=2` | Clean URLs; harder to test in browser |
| Query param | `/orders?version=2` | Easy to test; messy for caching |
Recommendation: URL path versioning for a public developer API. It's explicit, curl-friendly, and easy to document. Reserve header versioning for when you need URL stability (e.g., webhooks).
Version strategy: support N and N-1. Deprecate with Sunset and Deprecation response headers 6 months before removal.
```typescript
// Sunset header for deprecated versions (HTTP dates must use the
// correct weekday: 2026-07-01 is a Wednesday, 2027-01-01 a Friday)
if (req.apiVersion === 'v1') {
  res.set('Sunset', 'Fri, 01 Jan 2027 00:00:00 GMT');
  res.set('Deprecation', 'Wed, 01 Jul 2026 00:00:00 GMT');
  res.set('Link', '<https://docs.yourapi.com/v2/migration>; rel="successor-version"');
}
```
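With path versioning, routing keys off the first path segment. A sketch of the extraction (the helper and regex are illustrative; in Express you would usually just mount per-version routers, e.g. `app.use('/v1', v1Router)`):

```typescript
// Extract the version segment from a path-versioned URL,
// e.g. "/v1/orders" yields "v1"; unversioned paths yield null.
function apiVersionFromPath(path: string): string | null {
  const match = path.match(/^\/(v\d+)(\/|$)/);
  return match ? match[1] : null;
}
```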
## Circuit Breaker for Downstream Failures
Use `opossum`, a battle-tested Node.js circuit breaker library.
```typescript
import CircuitBreaker from 'opossum';
import { callPaymentService } from './paymentClient'; // your downstream client
import { metrics, alerting } from './observability';  // your metrics/alerting clients

const paymentServiceBreaker = new CircuitBreaker(callPaymentService, {
  timeout: 3000,                // fail requests taking >3s
  errorThresholdPercentage: 50, // open circuit if 50% of requests fail
  resetTimeout: 30000,          // try a half-open probe after 30s
  volumeThreshold: 10,          // need 10 requests before tripping
});

// Served instead of an error while the circuit is open
paymentServiceBreaker.fallback(() => ({
  status: 'degraded',
  message: 'Payment processing is temporarily unavailable. Your order has been queued.',
  retryAfter: 30,
}));

// Expose circuit state transitions to metrics and alerting
paymentServiceBreaker.on('open', () => {
  metrics.increment('circuit_breaker.payment_service.open');
  alerting.warn('Payment service circuit breaker opened');
});

export async function processPayment(data: unknown) {
  return paymentServiceBreaker.fire(data);
}
```
Health endpoint convention — expose circuit states so your operations team can see degraded dependencies without digging through logs:
```
GET /health

{
  "status": "degraded",
  "services": {
    "payment_service": { "status": "open", "since": "2026-03-09T11:23:00Z" },
    "notification_service": { "status": "closed" }
  }
}
```
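Deriving the top-level `status` from per-service circuit states can be a pure function. A sketch, assuming you collect each breaker's state into a map first (names here are illustrative, not an opossum API):

```typescript
type BreakerState = 'open' | 'half-open' | 'closed';

// Overall health is degraded if any dependency's circuit is not closed.
function overallStatus(services: Record<string, BreakerState>): 'ok' | 'degraded' {
  return Object.values(services).some((state) => state !== 'closed')
    ? 'degraded'
    : 'ok';
}
```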
## Recommended Stack
- **Rate limiting:** `express-rate-limit` for simple cases; Redis + Lua for multi-tier per-customer limits
- **Circuit breaker:** `opossum` (Node.js), `resilience4j` (JVM), Polly (.NET)
- **API versioning routing:** Express Router with a version prefix, or an API gateway (Kong, AWS API Gateway) to handle routing declaratively
- **Observability:** emit `rate_limit_exceeded`, `circuit_breaker_open`, and `api_version_used` metrics from day one; you'll need them to tune thresholds