Code Review and Security Audit Prompts
Claude is a highly capable code reviewer, but the quality of the review scales directly with the quality of the prompt. A prompt like "review this code" produces generic feedback. A prompt that specifies what to look for, how to structure the output, and what context the code runs in produces prioritized findings you can act on immediately.
Prompting Claude for Thorough Code Reviews
The first decision is scope: what kind of review do you need? Code reviews generally fall into three categories, and each benefits from a different prompt structure.
Correctness review — Does the code do what it is supposed to do? Are there bugs, edge cases, or incorrect logic?
Quality review — Is the code readable, maintainable, and well-structured? Does it follow conventions?
Security review — Are there vulnerabilities that could be exploited? Does it handle untrusted input safely?
Combining all three in one prompt produces an unfocused response. Better to specify the type of review upfront, or to run separate prompts for each concern.
For any code review, the prompt should include:
- The code itself (wrapped in a tag that identifies the language)
- What the code is supposed to do (context Claude cannot infer from the code alone)
- The review type and specific concerns to prioritize
- The desired output format: severity levels, locations, concrete suggestions
Requiring structured output — not just prose — makes reviews more actionable. A response organized by severity level with file locations and specific recommendations is far more useful than paragraphs of commentary.
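One way to make that structure concrete is to define the finding format as a type before writing the prompt, then mirror the type's fields in the output instructions. A minimal sketch (the field names here are illustrative, not a required schema):

```typescript
// Illustrative shape for a structured review finding, plus a triage helper.
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";

interface Finding {
  severity: Severity;
  vulnerability: string;
  location: string;       // file and function/line
  description: string;    // what is wrong and how it could be exploited
  recommendation: string; // the concrete fix
}

const SEVERITY_RANK: Record<Severity, number> = {
  CRITICAL: 0,
  HIGH: 1,
  MEDIUM: 2,
  LOW: 3,
};

// Sort findings so the most urgent work surfaces first.
function prioritize(findings: Finding[]): Finding[] {
  return [...findings].sort(
    (a, b) => SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity]
  );
}
```

If you later want to parse Claude's findings into tickets, having agreed on the fields upfront makes that a mechanical step rather than a prose-mining exercise.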
Security Audit Prompt Patterns
Security reviews benefit most from a structured prompt because the attack surface is broad and findings need to be prioritized by severity. The OWASP Top 10 provides a well-recognized framework for organizing findings.
When prompting for a security audit, be explicit about the threat model: what kind of attacker, what access do they have, and what data or systems are at risk. Claude's security analysis improves significantly when it knows whether this is an internal tool, a public API, or a consumer-facing application handling sensitive data.
Key vulnerability classes to explicitly request coverage for:
- Injection (SQL, NoSQL, command injection, LDAP) — untrusted data passed to interpreters
- Broken authentication — session management, JWT handling, password storage
- XSS (reflected, stored, DOM-based) — unsanitized user content rendered in HTML
- IDOR (Insecure Direct Object Reference) — accessing resources by predictable IDs without authorization checks
- CSRF — state-changing requests that do not verify origin
- Sensitive data exposure — secrets in logs, responses, or error messages
- Broken access control — missing authorization checks, privilege escalation paths
- Security misconfiguration — permissive CORS, verbose error messages, open redirects
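Of these, IDOR is the easiest to see in code, so it makes a good calibration example to include in a prompt. A minimal in-memory sketch (the `orgId` field and the handler names are hypothetical):

```typescript
interface User { id: string; orgId: string }
interface Doc { id: string; orgId: string; body: string }

// Vulnerable pattern: fetch by ID straight from the request, no ownership
// check. An attacker who guesses or increments document IDs can read
// other organizations' data.
function getDocumentInsecure(docs: Map<string, Doc>, docId: string): Doc | undefined {
  return docs.get(docId);
}

// Fixed pattern: verify the resource belongs to the caller's organization
// before returning it.
function getDocumentSecure(
  docs: Map<string, Doc>,
  user: User,
  docId: string
): Doc | undefined {
  const doc = docs.get(docId);
  return doc && doc.orgId === user.orgId ? doc : undefined;
}
```

Showing Claude a before/after pair like this in the prompt also anchors what counts as a "fix with example code" in your findings format.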
Here is a complete security audit prompt template:
<context>
Application: Multi-tenant SaaS API (REST, Node.js/Express)
Authentication: JWT tokens issued on login, validated via middleware
Data sensitivity: Users can only access their own organization's data
Threat model: Public internet API — assume hostile, unauthenticated attackers
plus authenticated users attempting to access other orgs' data
</context>
<code language="typescript">
// [paste the route handlers and middleware to audit]
</code>
<instructions>
Perform a security audit focused on the OWASP Top 10. Format each finding as:
[SEVERITY: CRITICAL/HIGH/MEDIUM/LOW]
Vulnerability: [name/type]
Location: [file name and function/line if identifiable]
Description: [what the vulnerability is and how it could be exploited]
Recommendation: [specific fix with example code where applicable]
Prioritize findings in order of severity.
After individual findings, include a summary section:
- Total findings by severity level
- The single most urgent fix to implement first
- Any systemic patterns that suggest a larger architectural concern
Be specific about exploitation scenarios — explain how an attacker
would actually use each vulnerability, not just that it exists.
</instructions>
The threat model section is critical. Without it, Claude does not know whether to flag a missing CSRF token as critical (public API) or low (internal admin tool behind a VPN). The same code has different risk profiles depending on where it runs and who can access it.
Performance Analysis Prompts
Performance reviews require context about the scale and usage pattern — what is acceptable for 100 users may be unacceptable for 100,000. Always provide:
- The scale/load context (requests per second, data volume, expected growth)
- The runtime environment (Node.js, edge, browser, serverless)
- Whether the code has already been profiled and what metrics you saw
<context>
This function runs on every API request to /api/feed — approximately
2,000 requests/minute at peak. It currently has a p99 latency of 800ms
which we need to reduce to under 200ms. The database is PostgreSQL
via Prisma ORM. We are on a serverless platform with cold starts.
</context>
<code language="typescript">
[the function to analyze]
</code>
<instructions>
Analyze for performance issues. For each issue:
- Identify what is slow and why (N+1 queries, synchronous I/O, etc.)
- Estimate the relative impact (high/medium/low)
- Provide a specific, implementable fix
Focus on database query efficiency first, then compute-heavy operations,
then unnecessary I/O. Do not suggest infrastructure changes — only
code-level optimizations.
</instructions>
The instruction not to suggest infrastructure changes is important. Without it, Claude often recommends adding caching layers or upgrading to a larger instance — useful advice in general, but not what you want when you need code-level fixes.
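The N+1 pattern mentioned in the template is worth seeing concretely. In the Prisma context above, a real fix would typically replace per-row lookups with a single `findMany` using an `in` filter (or an `include` on the original query); the in-memory sketch below simulates the same shape with a fake data source that counts queries:

```typescript
interface Post { id: string; authorId: string }
interface Author { id: string; name: string }

// Simulated data source that counts how many "queries" it receives.
class FakeDb {
  queries = 0;
  constructor(private authors: Author[]) {}
  findAuthor(id: string): Author | undefined {
    this.queries++;
    return this.authors.find((a) => a.id === id);
  }
  findAuthorsByIds(ids: string[]): Author[] {
    this.queries++;
    return this.authors.filter((a) => ids.includes(a.id));
  }
}

// N+1: one lookup per post — N extra round trips.
function attachAuthorsNPlusOne(db: FakeDb, posts: Post[]) {
  return posts.map((p) => ({ ...p, author: db.findAuthor(p.authorId) }));
}

// Batched: one lookup for all posts, joined in memory.
function attachAuthorsBatched(db: FakeDb, posts: Post[]) {
  const ids = [...new Set(posts.map((p) => p.authorId))];
  const byId = new Map(db.findAuthorsByIds(ids).map((a) => [a.id, a]));
  return posts.map((p) => ({ ...p, author: byId.get(p.authorId) }));
}
```

At 2,000 requests/minute, the difference between one query and one-per-row is exactly the kind of finding you want the review to surface with a concrete rewrite attached.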
Accessibility Audit Prompts
Accessibility reviews are often overlooked but follow the same structured pattern. The key difference is that accessibility has its own standards framework: WCAG (Web Content Accessibility Guidelines).
When prompting for accessibility reviews, specify the compliance level (A, AA, or AAA) and the component type, because different UI patterns have different accessibility requirements.
<context>
Component: Modal dialog with a form (React, TypeScript)
Framework: Next.js with Tailwind CSS
Target compliance: WCAG 2.1 AA
Users: Public-facing web application, must support screen readers
and keyboard-only navigation
</context>
<code language="tsx">
[the modal component code]
</code>
<instructions>
Perform an accessibility audit against WCAG 2.1 AA criteria.
For each finding:
- WCAG criterion violated (e.g., "1.3.1 Info and Relationships")
- What is wrong and why it fails the criterion
- Impact on users (which disability is affected and how)
- Fix: specific code change with corrected markup
Pay special attention to:
- Focus management (focus trap within modal, return focus on close)
- ARIA attributes (role, aria-modal, aria-labelledby)
- Keyboard interaction (Escape to close, Tab cycling)
- Color contrast ratios for all text elements
- Form label associations
After findings, rate the component's overall accessibility:
[Passes AA / Fails AA with minor fixes needed / Fails AA with major rework needed]
</instructions>
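One of the focus areas above, color contrast, can be checked mechanically rather than by judgment. A sketch of the WCAG 2.1 contrast-ratio formula (AA requires 4.5:1 for normal text, 3:1 for large text):

```typescript
// Relative luminance per WCAG 2.1, for an sRGB color given as 0-255 channels.
function luminance(r: number, g: number, b: number): number {
  const chan = (c: number) => {
    const s = c / 255;
    return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * chan(r) + 0.7152 * chan(g) + 0.0722 * chan(b);
}

// Contrast ratio between two colors: (L_lighter + 0.05) / (L_darker + 0.05).
function contrastRatio(
  fg: [number, number, number],
  bg: [number, number, number]
): number {
  const [l1, l2] = [luminance(...fg), luminance(...bg)].sort((a, b) => b - a);
  return (l1 + 0.05) / (l2 + 0.05);
}
```

Including the target ratio in the prompt ("flag any text below 4.5:1") gives Claude a pass/fail criterion instead of asking it to eyeball Tailwind color classes.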
Multi-File Review Patterns
Reviewing a single file in isolation misses architectural issues. When you need Claude to review how multiple files interact, structure the prompt to make relationships explicit.
Pattern 1: Provide files with role annotations
<instructions>
Review these files for consistency, correct data flow, and error handling.
Focus on how data passes between the API route, the service layer, and
the database query. Flag any place where an error in one layer would
not be properly caught or reported to the caller.
</instructions>
<file name="src/app/api/users/route.ts" role="API route handler">
[code]
</file>
<file name="src/services/userService.ts" role="business logic layer">
[code]
</file>
<file name="src/lib/db/userQueries.ts" role="database access layer">
[code]
</file>
Pattern 2: Ask Claude to trace a specific flow
<instructions>
Trace the complete flow of a user signup request through these files.
For each step, identify:
1. What data is received and what validation happens
2. What can go wrong and whether the error is handled
3. What data is passed to the next layer and whether it is correctly typed
Flag any point where the chain could silently fail, return incorrect data,
or expose sensitive information in an error response.
</instructions>
<files>
[paste the relevant files with filenames as comments]
</files>
Tracing a specific flow forces Claude to think about the code as a connected system rather than isolated modules. This catches issues like: the API route validates email format but the service layer does not check for duplicates before inserting; or the database layer throws a specific error type that the API route catches as a generic 500.
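That last failure mode — a specific database error surfacing as a generic 500 — has a standard code-level fix you can ask Claude to verify: map known error types to responses at the API boundary. A sketch (the error classes are hypothetical):

```typescript
class DuplicateEmailError extends Error {}
class ValidationError extends Error {}

// Map layer-specific errors to HTTP responses at the API boundary,
// instead of letting everything collapse into a generic 500.
function toHttpResponse(err: unknown): { status: number; body: { error: string } } {
  if (err instanceof DuplicateEmailError) {
    return { status: 409, body: { error: "Email already registered" } };
  }
  if (err instanceof ValidationError) {
    return { status: 400, body: { error: err.message } };
  }
  // Unknown errors: generic message only — the raw error may leak internals.
  return { status: 500, body: { error: "Internal server error" } };
}
```

A flow-tracing prompt can then ask a precise question: "for each error the service layer can throw, which branch of the boundary handler catches it, and does any branch leak internal detail?"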
Using XML Tags to Define Review Scope and Criteria
XML tags are particularly powerful for code reviews because they let you separate the code being reviewed from the review criteria and context. This prevents Claude from confusing review instructions with code to review.
A robust review prompt has at minimum three tagged sections:
<context>
[What this code does, where it runs, who uses it, threat model]
</context>
<code>
[The actual code to review — can be multiple files]
</code>
<criteria>
[What to look for, how to format findings, what severity levels mean]
</criteria>
You can add additional tags for specific review dimensions:
<must_not_change>
[Code or patterns that are intentional and should not be flagged]
</must_not_change>
<known_issues>
[Issues you already know about — tell Claude not to re-report these]
</known_issues>
<focus_areas>
[Specific functions, patterns, or lines you are most concerned about]
</focus_areas>
The <known_issues> tag is especially useful for iterative reviews. After fixing the first round of findings, you can re-run the review with previous findings listed as known issues, so Claude focuses on new problems rather than re-reporting resolved ones.
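If findings come back in a parseable format, the same deduplication can also happen in code: drop anything matching a previous round before triage. A minimal sketch (the field names and matching key are illustrative):

```typescript
interface ReportedFinding { vulnerability: string; location: string }

// Treat a finding as "known" when the same vulnerability was already
// reported at the same location in an earlier review pass.
function newFindingsOnly(
  current: ReportedFinding[],
  known: ReportedFinding[]
): ReportedFinding[] {
  const seen = new Set(known.map((f) => `${f.vulnerability}@${f.location}`));
  return current.filter((f) => !seen.has(`${f.vulnerability}@${f.location}`));
}
```

In practice you would still pass the known issues in the prompt as well — filtering afterward catches re-reports, but telling Claude upfront frees its attention for new problems.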
Structured Review Output
Requiring structured output format transforms a code review from prose commentary into a reviewable artifact. Severity levels, file locations, and concrete recommendations make it easy to create tickets, prioritize work, and track resolution.
A review prompt should always specify:
- Severity taxonomy — define what CRITICAL/HIGH/MEDIUM/LOW mean in your context
- Required fields — what every finding must include (location, description, recommendation)
- Grouping — organize by severity, by file, or by vulnerability type
- Summary section — a prioritized action list separate from individual findings
Here is a practical severity taxonomy you can include directly in your prompts:
Severity definitions for this review:
- CRITICAL: Exploitable vulnerability or data loss risk. Must fix before deploy.
- HIGH: Significant bug or security weakness. Fix in current sprint.
- MEDIUM: Code quality issue that increases maintenance cost or risk over time.
- LOW: Style, convention, or minor improvement suggestion.
Providing these definitions prevents Claude from using its own judgment about what "critical" means, which may not align with your team's priorities.
Practical Example: Complete Security Review Prompt
Here is a full security review prompt for a login endpoint, showing all the patterns combined:
<context>
Application: Express.js API route for user authentication
Environment: Node.js, public internet, stores bcrypt-hashed passwords
Concern: Preparing for a third-party security audit — want to find issues first
</context>
<code language="javascript">
app.post('/api/login', async (req, res) => {
const { email, password } = req.body
const user = await db.query(
`SELECT * FROM users WHERE email = '${email}'`
)
if (!user || !bcrypt.compareSync(password, user.password_hash)) {
res.status(401).json({ error: 'Invalid credentials', email: email })
return
}
const token = jwt.sign(
{ userId: user.id, email: user.email, role: user.role },
'secret123'
)
res.json({ token, user })
})
</code>
<known_issues>
None — this is the first review pass.
</known_issues>
<instructions>
Perform a security review of this login route.
Severity definitions:
- CRITICAL: Exploitable vulnerability, immediate risk
- HIGH: Security weakness that should be fixed before production
- MEDIUM: Defensive improvement that reduces attack surface
- LOW: Best practice suggestion
For each vulnerability found:
[SEVERITY: CRITICAL/HIGH/MEDIUM/LOW]
Vulnerability: [name]
Description: [what is wrong and how an attacker would exploit it]
Fix: [corrected code or specific guidance]
After findings, list the fixes in priority order.
End with one sentence on the biggest systemic risk in this code.
</instructions>
This prompt would surface issues like SQL injection via string interpolation, a hardcoded JWT secret, leaking the email in error responses, returning the full user object (including password hash) in the success response, and using synchronous bcrypt comparison. Each finding would be formatted consistently with a severity level and a concrete fix.
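One of those fixes — not returning the full user object — is simple enough to show. A sketch of an allowlist serializer (the field names mirror the example route above):

```typescript
interface DbUser {
  id: string;
  email: string;
  role: string;
  password_hash: string;
}

// Allowlist the fields that are safe to return, rather than deleting
// sensitive ones — new sensitive columns then stay private by default.
function toPublicUser(user: DbUser): { id: string; email: string; role: string } {
  return { id: user.id, email: user.email, role: user.role };
}
```

The allowlist direction matters: a `delete user.password_hash` approach silently starts leaking the next sensitive column someone adds to the table.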
Key Takeaways
- Scope the review upfront: correctness, quality, or security — mixing all three produces unfocused feedback
- Security audits benefit from a named threat model: who is attacking, what access do they have, what data is at risk
- Name the vulnerability classes you want covered (OWASP Top 10, injection, XSS, IDOR) rather than leaving the scope open-ended
- Always require structured output with severity levels and concrete fixes — prose commentary is harder to act on than a prioritized finding list
- Accessibility reviews follow the same pattern but use WCAG criteria instead of OWASP
- Multi-file reviews need explicit role annotations and flow-tracing instructions to catch cross-layer issues
- Use XML tags to separate code, context, criteria, and known issues — this prevents Claude from confusing what to review with how to review
- Performance reviews need scale context: current load, observed latency, target latency — Claude cannot assess performance risk without knowing what scale matters