Engineering · Security · OWASP · Pentesting

Senior Security

Threat modeling, vulnerability analysis, secure architecture, and pentesting.

$skillfish add ~skills/senior-security

Real Conversations

See how the Senior Security skill responds to actual requests. These are representative examples of what your AI agent will produce.

You

Building auth for a B2B SaaS: email/password + SSO via SAML. Storing hashed passwords in Postgres. What are the top threats and mitigations?

Agent

STRIDE Threat Model — Auth System:

T1 — Credential stuffing (Spoofing)

  • Mitigation: rate limit /login to 5 attempts/min per IP and 10/hour per email
  • Implement: express-rate-limit with Redis store, or Cloudflare rate limiting rules
  • Detection: alert on >100 failed logins/min — likely automated attack

T2 — Password database compromise (Information Disclosure)

# Ensure you're using bcrypt/argon2 with appropriate cost factor
import bcrypt
# bcrypt with cost 12 takes ~250ms — slow enough to resist offline cracking
hashed = bcrypt.hashpw(password.encode(), bcrypt.gensalt(rounds=12))
  • NEVER store plain text or MD5/SHA1 hashes
  • Pepper: add server-side secret to password before hashing (separate from salt)

T3 — SAML XML injection / signature bypass (Tampering)

  • Use a hardened SAML library — python3-saml, node-saml (not homegrown parsing)
  • Validate: InResponseTo attribute to prevent replay attacks
  • Require signed assertions AND a signed response (the outer message) — validating only one of the two leaves room for substitution attacks
  • Common bypass: XML comment injection in NameID — ensure your library strips or rejects comments

T4 — Session fixation (Elevation of Privilege)

  • Regenerate session ID immediately after successful login: req.session.regenerate()
  • Set session cookie: httpOnly: true, secure: true, sameSite: 'strict'
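
Session regeneration can be sketched outside any framework (Express's req.session.regenerate() does this for you); the dict-backed store and helper names here are illustrative.

```python
import secrets

SESSIONS: dict[str, dict] = {}  # in-memory stand-in for a real session store

def regenerate_session(old_sid: str) -> str:
    """Swap the session ID after login while keeping the session data, so a
    session ID planted by an attacker pre-login grants them nothing."""
    data = SESSIONS.pop(old_sid, {})
    new_sid = secrets.token_urlsafe(32)  # cryptographically random ID
    SESSIONS[new_sid] = data
    return new_sid

def session_cookie(sid: str) -> str:
    # The flags from the mitigation above; frameworks normally set these
    return f"session={sid}; HttpOnly; Secure; SameSite=Strict; Path=/"
```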

T5 — Enumeration via response timing (Information Disclosure)

// Constant-time response even when the user doesn't exist
// DUMMY_HASH: a precomputed bcrypt hash of a random string, kept so the
// compare below costs the same whether or not the account exists
const hash = user?.passwordHash ?? DUMMY_HASH;
await bcrypt.compare(password, hash);  // always runs, masks user existence

Priority order: T1 (immediate, high volume risk) > T3 (SAML bypass can be catastrophic) > T2 > T4 > T5

Get Started

How to install and use this skill in your preferred environment.

Skills are designed for AI coding agents (Claude Code, Cursor, Windsurf) and IDE-based workflows where the agent can read files, run scripts, and act on your codebase. Web-based AI can use the knowledge and frameworks, but won't have tool access.

Models & Context

Which AI models and context windows work best with this skill.

Recommended Models

Best: Claude Opus 4, Claude Sonnet 4, GPT-4.1, Gemini 2.5 Pro, Grok 3, Kimi K2
Good: Claude Haiku 4.5, GPT-4.1 mini, Gemini 2.5 Flash, Grok 3 mini

Larger models produce more detailed, production-ready outputs.

Context Window

This skill's SKILL.md is typically 3–10 KB — fits in any modern context window.

  • 8K: skill only
  • 32K+: skill + conversation
  • 100K+: skill + references + codebase

All current frontier models (Claude, GPT, Gemini) support 100K+ context. Use the full window for complex multi-service work.

Pro tips for best results

1. Be specific: include numbers (users, budget, RPS) so the skill can size the architecture.

2. Share constraints: compliance needs, team size, and existing stack all improve the output.

3. Iterate: start with a high-level design, then ask follow-ups for IaC, cost analysis, or security review.

4. Combine skills: pair with companion skills below for end-to-end coverage.

Good to Know

Advanced guide and reference material for Senior Security. Background, edge cases, and patterns worth understanding.

OWASP Top 10 2021 Quick Reference

  • A01 Broken Access Control: authorization checks missing or bypassable. Fix: enforce least-privilege server-side on every request; never rely on client-side role claims or hidden UI elements.
  • A02 Cryptographic Failures: sensitive data exposed due to weak or absent encryption. Fix: use TLS 1.2+ in transit; encrypt at rest with AES-256; never use MD5/SHA1 for passwords; use bcrypt/argon2 with an appropriate cost factor.
  • A03 Injection: untrusted data sent to interpreters (SQL, OS, LDAP, NoSQL). Fix: parameterized queries and prepared statements; input validation as defense-in-depth; never concatenate user input into query strings.
  • A04 Insecure Design: missing or inadequate security controls in the design phase. Fix: threat model during design; define abuse cases alongside use cases; apply secure design patterns (fail-safe defaults, complete mediation).
  • A05 Security Misconfiguration: default credentials, unnecessary features enabled, verbose errors. Fix: harden all environments to the same baseline; disable default accounts; suppress stack traces in production; automate config drift detection.
  • A06 Vulnerable and Outdated Components: libraries, frameworks, or dependencies with known CVEs. Fix: maintain an SBOM; automate dependency scanning (Dependabot, Snyk); define a patch SLA by severity (Critical: 24h, High: 7d).
  • A07 Identification and Authentication Failures: broken authentication, session fixation, weak credentials. Fix: regenerate session IDs post-login; implement MFA; enforce rate limiting on authentication endpoints; use secure session cookie flags.
  • A08 Software and Data Integrity Failures: CI/CD tampering, unsigned updates, deserialization of untrusted data. Fix: sign artifacts; verify integrity before deployment; never deserialize untrusted data without schema validation.
  • A09 Security Logging and Monitoring Failures: insufficient logging to detect or investigate attacks. Fix: log auth events, access control failures, and input validation errors; alert on anomalies; ensure logs are tamper-resistant and centralized.
  • A10 Server-Side Request Forgery (SSRF): the server makes requests to attacker-controlled destinations. Fix: validate and allowlist target URLs server-side; block internal IP ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, 169.254.0.0/16); disable HTTP redirects.
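
The SSRF fix pattern in A10 can be sketched with stdlib parsing alone. ALLOWED_HOSTS and the function name are illustrative, and real code must also pin DNS resolution (resolve once, connect to that IP) to defeat DNS rebinding.

```python
import ipaddress
from urllib.parse import urlparse

ALLOWED_HOSTS = {"api.partner.example"}  # illustrative allowlist

def is_safe_url(url: str) -> bool:
    """Reject URLs that are off the allowlist or are literal private,
    loopback, or link-local addresses."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        ip = ipaddress.ip_address(parsed.hostname)
        # Literal IPs: permit only public addresses
        return not (ip.is_private or ip.is_loopback or ip.is_link_local)
    except ValueError:
        # Hostnames: must be explicitly allowlisted
        return parsed.hostname in ALLOWED_HOSTS
```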

CVSS v3.1 Scoring Guide

CVSS v3.1 base score is computed from 8 metrics across two groups:

Exploitability metrics (how the attack is delivered):

  • Attack Vector (AV): Network / Adjacent / Local / Physical. Network scores highest.
  • Attack Complexity (AC): Low / High. Low scores higher.
  • Privileges Required (PR): None / Low / High. None scores highest.
  • User Interaction (UI): None / Required. None scores higher.

Impact metrics (what the attacker achieves):

  • Scope (S): Unchanged / Changed. Changed scores higher; applies when the compromise propagates beyond the vulnerable component.
  • Confidentiality (C): None / Low / High. High scores highest.
  • Integrity (I): None / Low / High. High scores highest.
  • Availability (A): None / Low / High. High scores highest.

Severity labels:

  • 9.0–10.0: Critical (patch SLA: immediate / 24 hours)
  • 7.0–8.9: High (7 days)
  • 4.0–6.9: Medium (30 days)
  • 0.1–3.9: Low (90 days or next release)
  • 0.0: None
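
The severity bands above reduce to a small lookup, sketched here (function name is ours):

```python
def cvss_severity(score: float) -> str:
    """Map a CVSS v3.1 base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS base scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"
```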

CVSS base score does not account for exploit availability or environmental context. A Critical CVSS score on a system with no external exposure may be lower priority than a High score on a public-facing API — use the Temporal and Environmental metric groups to adjust.

Threat Modeling Approaches

STRIDE

  • What it models: threat categories against system components.
  • Best for: new system design; architecture review.
  • Approach: systematic; each component × 6 threat types (Spoofing, Tampering, Repudiation, Information Disclosure, DoS, Elevation of Privilege).
  • Output format: threat list with mitigations per component; DFD-based.
  • Limitation: does not inherently prioritize by likelihood or business impact.

PASTA

  • What it models: attack scenarios aligned to business risk.
  • Best for: existing systems; risk prioritization for the remediation backlog.
  • Approach: 7-stage process: objectives → technical scope → decomposition → threat analysis → vulnerability analysis → attack modeling → risk/impact analysis.
  • Output format: risk-ranked attack tree with business impact mapping.
  • Limitation: resource-intensive; overkill for small systems.

DREAD

  • What it models: severity scoring of identified threats.
  • Best for: ranking and communicating threat severity to non-technical stakeholders.
  • Approach: scoring model: Damage potential, Reproducibility, Exploitability, Affected users, Discoverability.
  • Output format: numeric score per threat; ranked mitigation list.
  • Limitation: scoring is subjective; different analysts produce different results.

Practical combination: Use STRIDE on the data flow diagram during design to enumerate threats exhaustively, then apply DREAD scores to prioritize the output for the engineering backlog.
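
That combination can be sketched as a DREAD score attached to each STRIDE finding; the example scores and threat labels below are illustrative, and the per-metric rubric is team-defined.

```python
from dataclasses import dataclass, fields

@dataclass
class DreadScore:
    """DREAD metrics, each rated 1 (low) to 10 (high). Calibrate the rubric
    with shared examples to reduce scorer subjectivity."""
    damage: int
    reproducibility: int
    exploitability: int
    affected_users: int
    discoverability: int

    def total(self) -> float:
        # Common convention: average of the five metrics
        values = [getattr(self, f.name) for f in fields(self)]
        return sum(values) / len(values)

# Rank STRIDE-enumerated threats for the backlog (scores illustrative)
threats = {
    "T1 credential stuffing": DreadScore(6, 9, 8, 7, 9),
    "T3 SAML signature bypass": DreadScore(10, 4, 5, 10, 3),
}
ranked = sorted(threats, key=lambda t: threats[t].total(), reverse=True)
```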

Pentesting vs. Code Review Trade-offs

Code Review (SAST + manual)

  • Finds: logic flaws, hardcoded secrets, insecure patterns, missing input validation, vulnerable dependency usage.
  • Misses: runtime behavior, environment-specific misconfiguration, third-party API behavior.
  • Best timing: during development; before merging to main; at PR review.
  • Output: line-level findings with remediation guidance.

Penetration Testing

  • Finds: authentication bypasses, chained vulnerabilities, business logic abuse, client-server trust issues.
  • Misses: code paths not reachable from the API surface, internal library vulnerabilities not exposed externally.
  • Best timing: before major releases; after significant architectural changes; annually for production systems.
  • Output: exploit chains demonstrating real-world impact.

Recommended sequence: Code review first, pentest second. Code review eliminates the low-hanging fruit (SQL injection, hardcoded credentials, dependency CVEs) so the pentest budget is spent on logic and chain vulnerabilities that automated tools and code review cannot find. Running a pentest on a codebase with unpatched OWASP Top 10 issues is inefficient — the report will be dominated by findings that a SAST tool would have caught.

When to do both: Before any major release shipping customer data, before SOC 2 or ISO 27001 certification audits, and after any significant re-architecture. Budget for a pentest at least annually for systems handling financial or health data.

Supply Chain Attack Vectors

  • Dependency confusion. Mechanism: attacker publishes a public package with the same name as an internal private package at a higher version number, and the package manager resolves the public one. Mitigation: configure package managers to prefer the private registry; use scoped package names (@org/package); pin versions with integrity hashes.
  • Typosquatting. Mechanism: malicious package published with a name one character off from a popular library (reqests vs requests). Mitigation: audit new dependencies before adding; run npm audit or the equivalent; enable registry alerts for new packages matching internal name patterns.
  • Compromised maintainer. Mechanism: a legitimate maintainer account is taken over and malicious code is inserted into a trusted package update. Mitigation: pin dependency versions and use lockfiles; review changelogs before upgrading; monitor for unexpected network activity in CI builds.
  • Malicious CI/CD. Mechanism: attacker gains write access to the CI pipeline (via a leaked token, compromised plugin, or pull request injection) and modifies build artifacts. Mitigation: treat CI secrets as high-value credentials; use ephemeral runners; require signed commits for pipeline config changes; verify artifact integrity between pipeline stages.
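
Integrity-hash pinning, the common thread in the mitigations above, can be sketched as an npm/SRI-style sha512 string; the function names are ours.

```python
import base64
import hashlib
import hmac

def sri_hash(content: bytes) -> str:
    """Compute an npm/SRI-style integrity string (sha512-<base64 digest>).
    Lockfiles record this so a tampered tarball fails installation."""
    digest = hashlib.sha512(content).digest()
    return "sha512-" + base64.b64encode(digest).decode()

def verify_artifact(content: bytes, expected_integrity: str) -> bool:
    # Recompute and compare in constant time against the pinned value
    return hmac.compare_digest(sri_hash(content), expected_integrity)
```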

Ready to try Senior Security?

Install the skill and start getting expert-level guidance in your workflow — any agent, any IDE.

$skillfish add ~skills/senior-security