Code Review and Security Audit Prompts
Claude is a highly capable code reviewer, but the quality of the review scales directly with the quality of the prompt. A prompt like "review this code" produces generic feedback. A prompt that specifies what to look for, how to structure the output, and what context the code runs in produces prioritized findings you can act on immediately.
Prompting Claude for Thorough Code Reviews
The first decision is scope: what kind of review do you need? Code reviews generally fall into three categories, and each benefits from a different prompt structure.
Correctness review — Does the code do what it is supposed to do? Are there bugs, edge cases, or incorrect logic?
Quality review — Is the code readable, maintainable, and well-structured? Does it follow conventions?
Security review — Are there vulnerabilities that could be exploited? Does it handle untrusted input safely?
Combining all three in one prompt produces an unfocused response. Better to specify the type of review upfront, or to run separate prompts for each concern.
For any code review, the prompt should include:
- The code itself (wrapped in a tag that identifies the language)
- What the code is supposed to do (context Claude cannot infer from the code alone)
- The review type and specific concerns to prioritize
- The desired output format: severity levels, locations, concrete suggestions
Requiring structured output — not just prose — makes reviews more actionable. A response organized by severity level with file locations and specific recommendations is far more useful than paragraphs of commentary.
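One way to make that structure concrete is to define the finding format as a type before writing the prompt, then mirror the type's fields in the output instructions. A minimal sketch (the field names here are illustrative, not a required schema):

```typescript
// Illustrative shape for a structured review finding, plus a triage helper.
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";

interface Finding {
  severity: Severity;
  vulnerability: string;
  location: string;       // file and function/line
  description: string;    // what is wrong and how it could be exploited
  recommendation: string; // the concrete fix
}

const SEVERITY_RANK: Record<Severity, number> = {
  CRITICAL: 0,
  HIGH: 1,
  MEDIUM: 2,
  LOW: 3,
};

// Sort findings so the most urgent work surfaces first.
function prioritize(findings: Finding[]): Finding[] {
  return [...findings].sort(
    (a, b) => SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity]
  );
}
```

If you later want to parse Claude's findings into tickets, having agreed on the fields upfront makes that a mechanical step rather than a prose-mining exercise.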
Security Audit Prompt Patterns
Security reviews benefit most from a structured prompt because the attack surface is broad and findings need to be prioritized by severity. The OWASP Top 10 provides a well-recognized framework for organizing findings.
When prompting for a security audit, be explicit about the threat model: what kind of attacker, what access do they have, and what data or systems are at risk. Claude's security analysis improves significantly when it knows whether this is an internal tool, a public API, or a consumer-facing application handling sensitive data.
Key vulnerability classes to explicitly request coverage for:
- Injection (SQL, NoSQL, command injection, LDAP) — untrusted data passed to interpreters
- Broken authentication — session management, JWT handling, password storage
- XSS (reflected, stored, DOM-based) — unsanitized user content rendered in HTML
- IDOR (Insecure Direct Object Reference) — accessing resources by predictable IDs without authorization checks
- CSRF — state-changing requests that do not verify origin
- Sensitive data exposure — secrets in logs, responses, or error messages
- Broken access control — missing authorization checks, privilege escalation paths
- Security misconfiguration — permissive CORS, verbose error messages, open redirects
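Of these, IDOR is the easiest to see in code, so it makes a good calibration example to include in a prompt. A minimal in-memory sketch (the `orgId` field and the handler names are hypothetical):

```typescript
interface User { id: string; orgId: string }
interface Doc { id: string; orgId: string; body: string }

// Vulnerable pattern: fetch by ID straight from the request, no ownership
// check. An attacker who guesses or increments document IDs can read
// other organizations' data.
function getDocumentInsecure(docs: Map<string, Doc>, docId: string): Doc | undefined {
  return docs.get(docId);
}

// Fixed pattern: verify the resource belongs to the caller's organization
// before returning it.
function getDocumentSecure(
  docs: Map<string, Doc>,
  user: User,
  docId: string
): Doc | undefined {
  const doc = docs.get(docId);
  return doc && doc.orgId === user.orgId ? doc : undefined;
}
```

Showing Claude a before/after pair like this in the prompt also anchors what counts as a "fix with example code" in your findings format.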
Here is a complete security audit prompt template:
<context>
Application: Multi-tenant SaaS API (REST, Node.js/Express)
Authentication: JWT tokens issued on login, validated via middleware
Data sensitivity: Users can only access their own organization's data
Threat model: Public internet API — assume hostile, unauthenticated attackers
plus authenticated users attempting to access other orgs' data
</context>
<code language="typescript">
// [paste the route handlers and middleware to audit]
</code>
<instructions>
Perform a security audit focused on the OWASP Top 10. Format each finding as:
[SEVERITY: CRITICAL/HIGH/MEDIUM/LOW]
Vulnerability: [name/type]
Location: [file name and function/line if identifiable]
Description: [what the vulnerability is and how it could be exploited]
Recommendation: [specific fix with example code where applicable]
Prioritize findings in order of severity.
After individual findings, include a summary section:
- Total findings by severity level
- The single most urgent fix to implement first
- Any systemic patterns that suggest a larger architectural concern
Be specific about exploitation scenarios — explain how an attacker
would actually use each vulnerability, not just that it exists.
</instructions>
The threat model section is critical. Without it, Claude does not know whether to flag a missing CSRF token as critical (public API) or low (internal admin tool behind a VPN). The same code has different risk profiles depending on where it runs and who can access it.
Performance Analysis Prompts
Performance reviews require context about the scale and usage pattern — what is acceptable for 100 users may be unacceptable for 100,000. Always provide:
- The scale/load context (requests per second, data volume, expected growth)
- The runtime environment (Node.js, edge, browser, serverless)
- Whether the code has already been profiled and what metrics you saw
<context>
This function runs on every API request to /api/feed — approximately
2,000 requests/minute at peak. It currently has a p99 latency of 800ms
which we need to reduce to under 200ms. The database is PostgreSQL
via Prisma ORM. We are on a serverless platform with cold starts.
</context>
<code language="typescript">
[the function to analyze]
</code>
<instructions>
Analyze for performance issues. For each issue:
- Identify what is slow and why (N+1 queries, synchronous I/O, etc.)
- Estimate the relative impact (high/medium/low)
- Provide a specific, implementable fix
Focus on database query efficiency first, then compute-heavy operations,
then unnecessary I/O. Do not suggest infrastructure changes — only
code-level optimizations.
</instructions>
The instruction not to suggest infrastructure changes is important. Without it, Claude often recommends adding caching layers or upgrading to a larger instance — useful advice in general, but not what you want when you need code-level fixes.
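The N+1 pattern mentioned in the template is worth seeing concretely. In the Prisma context above, a real fix would typically replace per-row lookups with a single `findMany` using an `in` filter (or an `include` on the original query); the in-memory sketch below simulates the same shape with a fake data source that counts queries:

```typescript
interface Post { id: string; authorId: string }
interface Author { id: string; name: string }

// Simulated data source that counts how many "queries" it receives.
class FakeDb {
  queries = 0;
  constructor(private authors: Author[]) {}
  findAuthor(id: string): Author | undefined {
    this.queries++;
    return this.authors.find((a) => a.id === id);
  }
  findAuthorsByIds(ids: string[]): Author[] {
    this.queries++;
    return this.authors.filter((a) => ids.includes(a.id));
  }
}

// N+1: one lookup per post — N extra round trips.
function attachAuthorsNPlusOne(db: FakeDb, posts: Post[]) {
  return posts.map((p) => ({ ...p, author: db.findAuthor(p.authorId) }));
}

// Batched: one lookup for all posts, joined in memory.
function attachAuthorsBatched(db: FakeDb, posts: Post[]) {
  const ids = [...new Set(posts.map((p) => p.authorId))];
  const byId = new Map(db.findAuthorsByIds(ids).map((a) => [a.id, a]));
  return posts.map((p) => ({ ...p, author: byId.get(p.authorId) }));
}
```

At 2,000 requests/minute, the difference between one query and one-per-row is exactly the kind of finding you want the review to surface with a concrete rewrite attached.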
Accessibility Audit Prompts
Accessibility reviews are often overlooked but follow the same structured pattern. The key difference is that accessibility has its own standards framework: WCAG (Web Content Accessibility Guidelines).
When prompting for accessibility reviews, specify the compliance level (A, AA, or AAA) and the component type, because different UI patterns have different accessibility requirements.
<context>
Component: Modal dialog with a form (React, TypeScript)
Framework: Next.js with Tailwind CSS
Target compliance: WCAG 2.1 AA
Users: Public-facing web application, must support screen readers
and keyboard-only navigation
</context>
<code language="tsx">
[the modal component code]
</code>
<instructions>
Perform an accessibility audit against WCAG 2.1 AA criteria.
For each finding:
- WCAG criterion violated (e.g., "1.3.1 Info and Relationships")
- What is wrong and why it fails the criterion
- Impact on users (which disability is affected and how)
- Fix: specific code change with corrected markup
Pay special attention to:
- Focus management (focus trap within modal, return focus on close)
- ARIA attributes (role, aria-modal, aria-labelledby)
- Keyboard interaction (Escape to close, Tab cycling)
- Color contrast ratios for all text elements
- Form label associations
After findings, rate the component's overall accessibility:
[Passes AA / Fails AA with minor fixes needed / Fails AA with major rework needed]
</instructions>
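One of the focus areas above, color contrast, can be checked mechanically rather than by judgment. A sketch of the WCAG 2.1 contrast-ratio formula (AA requires 4.5:1 for normal text, 3:1 for large text):

```typescript
// Relative luminance per WCAG 2.1, for an sRGB color given as 0-255 channels.
function luminance(r: number, g: number, b: number): number {
  const chan = (c: number) => {
    const s = c / 255;
    return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * chan(r) + 0.7152 * chan(g) + 0.0722 * chan(b);
}

// Contrast ratio between two colors: (L_lighter + 0.05) / (L_darker + 0.05).
function contrastRatio(
  fg: [number, number, number],
  bg: [number, number, number]
): number {
  const [l1, l2] = [luminance(...fg), luminance(...bg)].sort((a, b) => b - a);
  return (l1 + 0.05) / (l2 + 0.05);
}
```

Including the target ratio in the prompt ("flag any text below 4.5:1") gives Claude a pass/fail criterion instead of asking it to eyeball Tailwind color classes.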
Multi-File Review Patterns
Reviewing a single file in isolation misses architectural issues. When you need Claude to review how multiple files interact, structure the prompt to make relationships explicit.
Pattern 1: Provide files with role annotations
<instructions>
Review these files for consistency, correct data flow, and error handling.
Focus on how data passes between the API route, the service layer, and
the database query. Flag any place where an error in one layer would
not be properly caught or reported to the caller.
</instructions>
<file name="src/app/api/users/route.ts" role="API route handler">
[code]
</file>
<file name="src/services/userService.ts" role="business logic layer">
[code]
</file>
<file name="src/lib/db/userQueries.ts" role="database access layer">
[code]
</file>
Pattern 2: Ask Claude to trace a specific flow
<instructions>
Trace the complete flow of a user signup request through these files.
For each step, identify:
1. What data is received and what validation happens
2. What can go wrong and whether the error is handled
3. What data is passed to the next layer and whether it is correctly typed
Flag any point where the chain could silently fail, return incorrect data,
or expose sensitive information in an error response.
</instructions>
<files>
[paste the relevant files with filenames as comments]
</files>
Tracing a specific flow forces Claude to think about the code as a connected system rather than isolated modules. This catches issues like: the API route validates email format but the service layer does not check for duplicates before inserting; or the database layer throws a specific error type that the API route catches as a generic 500.
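That last failure mode — a specific database error surfacing as a generic 500 — has a standard code-level fix you can ask Claude to verify: map known error types to responses at the API boundary. A sketch (the error classes are hypothetical):

```typescript
class DuplicateEmailError extends Error {}
class ValidationError extends Error {}

// Map layer-specific errors to HTTP responses at the API boundary,
// instead of letting everything collapse into a generic 500.
function toHttpResponse(err: unknown): { status: number; body: { error: string } } {
  if (err instanceof DuplicateEmailError) {
    return { status: 409, body: { error: "Email already registered" } };
  }
  if (err instanceof ValidationError) {
    return { status: 400, body: { error: err.message } };
  }
  // Unknown errors: generic message only — the raw error may leak internals.
  return { status: 500, body: { error: "Internal server error" } };
}
```

A flow-tracing prompt can then ask a precise question: "for each error the service layer can throw, which branch of the boundary handler catches it, and does any branch leak internal detail?"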
Using XML Tags to Define Review Scope and Criteria
XML tags are particularly powerful for code reviews because they let you separate the code being reviewed from the review criteria and context. This prevents Claude from confusing review instructions with code to review.
A robust review prompt has at minimum three tagged sections:
<context>
[What this code does, where it runs, who uses it, threat model]
</context>
<code>
[The actual code to review — can be multiple files]
</code>
<criteria>
[What to look for, how to format findings, what severity levels mean]
</criteria>
You can add additional tags for specific review dimensions:
<must_not_change>
[Code or patterns that are intentional and should not be flagged]
</must_not_change>
<known_issues>
[Issues you already know about — tell Claude not to re-report these]
</known_issues>
<focus_areas>
[Specific functions, patterns, or lines you are most concerned about]
</focus_areas>
The <known_issues> tag is especially useful for iterative reviews. After fixing the first round of findings, you can re-run the review with previous findings listed as known issues, so Claude focuses on new problems rather than re-reporting resolved ones.
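If findings come back in a parseable format, the same deduplication can also happen in code: drop anything matching a previous round before triage. A minimal sketch (the field names and matching key are illustrative):

```typescript
interface ReportedFinding { vulnerability: string; location: string }

// Treat a finding as "known" when the same vulnerability was already
// reported at the same location in an earlier review pass.
function newFindingsOnly(
  current: ReportedFinding[],
  known: ReportedFinding[]
): ReportedFinding[] {
  const seen = new Set(known.map((f) => `${f.vulnerability}@${f.location}`));
  return current.filter((f) => !seen.has(`${f.vulnerability}@${f.location}`));
}
```

In practice you would still pass the known issues in the prompt as well — filtering afterward catches re-reports, but telling Claude upfront frees its attention for new problems.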
Structured Review Output
Requiring structured output format transforms a code review from prose commentary into a reviewable artifact. Severity levels, file locations, and concrete recommendations make it easy to create tickets, prioritize work, and track resolution.
A review prompt should always specify:
- Severity taxonomy — define what CRITICAL/HIGH/MEDIUM/LOW mean in your context
- Required fields — what every finding must include (location, description, recommendation)
- Grouping — organize by severity, by file, or by vulnerability type
- Summary section — a prioritized action list separate from individual findings
Here is a practical severity taxonomy you can include directly in your prompts:
Severity definitions for this review:
- CRITICAL: Exploitable vulnerability or data loss risk. Must fix before deploy.
- HIGH: Significant bug or security weakness. Fix in current sprint.
- MEDIUM: Code quality issue that increases maintenance cost or risk over time.
- LOW: Style, convention, or minor improvement suggestion.
Providing these definitions prevents Claude from using its own judgment about what "critical" means, which may not align with your team's priorities.
Practical Example: Complete Security Review Prompt
Here is a full security review prompt for a login endpoint, showing all the patterns combined:
<context>
Application: Express.js API route for user authentication
Environment: Node.js, public internet, stores bcrypt-hashed passwords
Concern: Preparing for a third-party security audit — want to find issues first
</context>
<code language="javascript">
app.post('/api/login', async (req, res) => {
const { email, password } = req.body
const user = await db.query(
`SELECT * FROM users WHERE email = '${email}'`
)
if (!user || !bcrypt.compareSync(password, user.password_hash)) {
res.status(401).json({ error: 'Invalid credentials', email: email })
return
}
const token = jwt.sign(
{ userId: user.id, email: user.email, role: user.role },
'secret123'
)
res.json({ token, user })
})
</code>
<known_issues>
None — this is the first review pass.
</known_issues>
<instructions>
Perform a security review of this login route.
Severity definitions:
- CRITICAL: Exploitable vulnerability, immediate risk
- HIGH: Security weakness that should be fixed before production
- MEDIUM: Defensive improvement that reduces attack surface
- LOW: Best practice suggestion
For each vulnerability found:
[SEVERITY: CRITICAL/HIGH/MEDIUM/LOW]
Vulnerability: [name]
Description: [what is wrong and how an attacker would exploit it]
Fix: [corrected code or specific guidance]
After findings, list the fixes in priority order.
End with one sentence on the biggest systemic risk in this code.
</instructions>
This prompt would surface issues like SQL injection via string interpolation, a hardcoded JWT secret, leaking the email in error responses, returning the full user object (including password hash) in the success response, and using synchronous bcrypt comparison. Each finding would be formatted consistently with a severity level and a concrete fix.
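One of those fixes — not returning the full user object — is simple enough to show. A sketch of an allowlist serializer (the field names mirror the example route above):

```typescript
interface DbUser {
  id: string;
  email: string;
  role: string;
  password_hash: string;
}

// Allowlist the fields that are safe to return, rather than deleting
// sensitive ones — new sensitive columns then stay private by default.
function toPublicUser(user: DbUser): { id: string; email: string; role: string } {
  return { id: user.id, email: user.email, role: user.role };
}
```

The allowlist direction matters: a `delete user.password_hash` approach silently starts leaking the next sensitive column someone adds to the table.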
Key Takeaways
- Scope the review upfront: correctness, quality, or security — mixing all three produces unfocused feedback
- Security audits benefit from a named threat model: who is attacking, what access do they have, what data is at risk
- Name the vulnerability classes you want covered (OWASP Top 10, injection, XSS, IDOR) rather than leaving the scope open-ended
- Always require structured output with severity levels and concrete fixes — prose commentary is harder to act on than a prioritized finding list
- Accessibility reviews follow the same pattern but use WCAG criteria instead of OWASP
- Multi-file reviews need explicit role annotations and flow-tracing instructions to catch cross-layer issues
- Use XML tags to separate code, context, criteria, and known issues — this prevents Claude from confusing what to review with how to review
- Performance reviews need scale context: current load, observed latency, target latency — Claude cannot assess performance risk without knowing what scale matters