Senior DevOps
CI/CD, infrastructure automation, containerization, and cloud platforms.
$skillfish add ~skills/senior-devops
Real Conversations
See how the Senior DevOps skill responds to actual requests. These are representative examples of what your AI agent will produce.
Our k8s rolling deploys cause ~30 seconds of 503 errors. We have 3 replicas behind an ALB, Node.js app, and our readiness probe is just checking port 3000.
Root Cause: Pods receive traffic before they're ready, and terminate before draining in-flight requests.
Fix 1 — Proper readiness probe (checks app logic, not just port):
readinessProbe:
  httpGet:
    path: /health/ready # must return 200 only when app is fully initialized
    port: 3000
  initialDelaySeconds: 10
  periodSeconds: 5
  failureThreshold: 3
livenessProbe:
  httpGet:
    path: /health/live
    port: 3000
  initialDelaySeconds: 30 # give app time to start before killing it
Node.js health endpoint:
app.get('/health/ready', (req, res) => {
  // Check DB connection, cache, etc.
  if (dbPool.totalCount > 0) res.sendStatus(200);
  else res.sendStatus(503);
});
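The handler above treats `dbPool.totalCount > 0` as proof of readiness. If `dbPool` is a node-postgres pool (an assumption; the example does not name the library), that counter only says clients exist, not that a query would succeed. Below is a minimal sketch of a stricter check, assuming each dependency can be exercised with a short async call. The names `checkReadiness` and `withTimeout` are hypothetical helpers, not part of the skill's output.

```javascript
// Hypothetical helper: reject if `promise` does not settle within `ms`.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical helper: run every named check and report 200 only if all
// of them resolve within the timeout; otherwise report 503 with reasons.
async function checkReadiness(checks, timeoutMs = 1000) {
  const failures = {};
  await Promise.all(
    Object.entries(checks).map(async ([name, check]) => {
      try {
        // Promise.resolve().then(...) also captures synchronous throws.
        await withTimeout(Promise.resolve().then(check), timeoutMs);
      } catch (err) {
        failures[name] = err.message;
      }
    })
  );
  const ready = Object.keys(failures).length === 0;
  return { status: ready ? 200 : 503, failures };
}

// Possible wiring into the Express route (assumes dbPool.query exists):
// app.get('/health/ready', async (req, res) => {
//   const { status, failures } = await checkReadiness({
//     db: () => dbPool.query('SELECT 1'),
//   });
//   res.status(status).json(failures);
// });
```

Keeping the timeout well below the probe's `periodSeconds` means a hung dependency shows up as a failed probe rather than a stalled one.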
Fix 2 — Graceful shutdown on SIGTERM:
process.on('SIGTERM', async () => {
  server.close(async () => { // stop accepting new connections
    await dbPool.end(); // drain DB connections
    process.exit(0);
  });
  setTimeout(() => process.exit(1), 25000); // force exit after 25s
});
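The handler above leaves one gap: after SIGTERM, the pod can still be sent new requests for a few seconds while its removal from the load balancer propagates. A common mitigation is to fail the readiness endpoint first, wait briefly, and only then close the server. The sketch below shows that ordering using Node's built-in `http` module. `createGracefulShutdown`, `createServer`, `drainDelayMs`, and `onClosed` are hypothetical names, and the 5-second delay is an assumed value, not a measured one.

```javascript
const http = require('node:http');

// Hypothetical server: readiness reports 503 once shutdown has begun.
function createServer(state) {
  return http.createServer((req, res) => {
    if (req.url === '/health/ready') {
      res.statusCode = state.shuttingDown ? 503 : 200;
      return res.end();
    }
    res.end('ok');
  });
}

// Hypothetical factory: returns the shared state and a shutdown function.
function createGracefulShutdown(server, {
  drainDelayMs = 5000,        // assumed time for the LB to stop routing here
  forceExitMs = 25000,        // same force-exit budget as the example above
  onClosed = async () => {},  // e.g. () => dbPool.end()
  exit = process.exit,        // injectable so the flow can be tested
} = {}) {
  const state = { shuttingDown: false };

  function shutdown() {
    if (state.shuttingDown) return; // ignore repeated signals
    state.shuttingDown = true;      // readiness now fails

    // Safety net: give up if draining takes too long.
    const force = setTimeout(() => exit(1), forceExitMs);
    force.unref();

    setTimeout(() => {
      // Stop accepting connections; callback runs once in-flight ones finish.
      server.close(async () => {
        try {
          await onClosed();
          clearTimeout(force);
          exit(0);
        } catch {
          exit(1);
        }
      });
      // Drop idle keep-alive sockets where the runtime supports it.
      if (typeof server.closeIdleConnections === 'function') {
        server.closeIdleConnections();
      }
    }, drainDelayMs);
  }

  return { state, shutdown };
}

// Possible wiring:
// const { state, shutdown } = createGracefulShutdown(server, { onClosed: () => dbPool.end() });
// process.on('SIGTERM', shutdown);
```

The delay plus the drain time has to fit inside `terminationGracePeriodSeconds`, otherwise the kubelet kills the process before it finishes.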
Fix 3 — Deployment spec:
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0 # never remove a pod before a new one is ready
  template:
    spec:
      terminationGracePeriodSeconds: 30
After these changes, 503s during deploys should drop to zero: traffic reaches a pod only after its readiness probe passes, and the app drains in-flight requests before shutting down.
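The timing values in the three fixes depend on each other, so it can help to check them together. The sketch below does the arithmetic with the numbers from the example. `timingBudget` is a hypothetical helper, and the "unready after" figure is a rough estimate that ignores probe timeouts.

```javascript
// Hypothetical helper: sanity-check probe and shutdown timings together.
function timingBudget({
  periodSeconds,
  failureThreshold,
  terminationGracePeriodSeconds,
  forceExitMs,
}) {
  // Roughly how long a failing pod keeps receiving traffic before the
  // readiness probe marks it unready.
  const unreadyAfterSeconds = periodSeconds * failureThreshold;
  // Time left between the app's own force exit and the kubelet's SIGKILL.
  const headroomSeconds = terminationGracePeriodSeconds - forceExitMs / 1000;
  return { unreadyAfterSeconds, headroomSeconds, ok: headroomSeconds > 0 };
}

const budget = timingBudget({
  periodSeconds: 5,
  failureThreshold: 3,
  terminationGracePeriodSeconds: 30,
  forceExitMs: 25000,
});
console.log(budget);
// → { unreadyAfterSeconds: 15, headroomSeconds: 5, ok: true }
```

A non-positive headroom would mean the process is killed before its own force-exit timer fires, so in-flight requests could still be cut off.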
Get Started
How to install and use this skill in your preferred environment.
Skills are designed for AI coding agents (Claude Code, Cursor, Windsurf) and IDE-based workflows where the agent can read files, run scripts, and act on your codebase. Web-based AI can use the knowledge and frameworks, but won't have tool access.
Models & Context
Which AI models and context windows work best with this skill.
Recommended Models
Larger models produce more detailed, production-ready outputs.
Context Window
This skill's SKILL.md is typically 3–10 KB — fits in any modern context window.
All current frontier models (Claude, GPT, Gemini) support 100K+ context. Use the full window for complex multi-service work.
Pro tips for best results
Be specific
Include numbers — users, budget, RPS — so the skill can size the architecture.
Share constraints
Compliance needs, team size, and existing stack all improve the output.
Iterate
Start with a high-level design, then ask follow-ups for IaC, cost analysis, or security review.
Combine skills
Pair with companion skills below for end-to-end coverage.