Prompting with Extended Thinking

Knowing when to enable extended thinking is half the battle. The other half is knowing how to write prompts that work with it rather than against it. The counterintuitive finding from working with extended thinking is that the prompting style that works best is often the opposite of what most prompt engineers expect.

The Central Insight: High-Level Goals, Not Step-by-Step Instructions

When extended thinking is enabled, Claude uses its internal scratchpad to plan and reason through the problem. If you've already provided a detailed step-by-step breakdown of how to approach the problem, you've taken that responsibility away from Claude — and often constrained it to a reasoning path that isn't optimal.

Without extended thinking, explicit step-by-step instructions are often helpful. They substitute for Claude's limited working memory in a single forward pass.

With extended thinking, those same instructions can hurt performance. They force Claude to follow a predetermined path rather than discovering the best approach during its internal reasoning phase.

The shift in mental model: instead of writing a procedure for Claude to execute, write a problem for Claude to solve.

What Changes in Your Prompt Style

Without Extended Thinking	With Extended Thinking
Break the task into numbered steps	State the goal and constraints clearly
Tell Claude exactly what to do first	Let Claude decide its own approach
Specify the reasoning path	Specify the quality criteria for the output
"First do X, then do Y, then do Z"	"Solve X. Requirements: Y. Output format: Z."

This doesn't mean being vague. It means shifting from prescribing the method to describing the problem and the desired outcome.

The "Think Deeply About X" Pattern

One effective pattern is a direct invitation to reason thoroughly, without specifying how. This simple framing signals to Claude that deep analysis is expected and appropriate.

Compare these two approaches for a complex code architecture task:

Constrained (less effective with thinking):

First, list all the components we'll need.
Then, for each component, describe its responsibilities.
Next, define the interfaces between components.
Finally, identify any potential bottlenecks.

Goal-oriented (more effective with thinking):

Think carefully about the best architecture for this system.
Requirements: [list requirements]
Constraints: [list constraints]
Produce a complete design with justification for each major decision.

The second version lets Claude use its thinking budget to explore the design space before committing. The first version locks it into a structure before that exploration happens.

Thinking Budget Optimization

Start smaller than you think you need. Claude doesn't always use its full budget, and a well-scoped problem often requires less thinking than expected.

Practical starting points by task type:

Constraint optimization (staffing, scheduling): 8,000–12,000 tokens
Code architecture design: 6,000–10,000 tokens
Complex debugging: 5,000–8,000 tokens
Legal or financial analysis: 8,000–16,000 tokens
Multi-step math proofs: 4,000–8,000 tokens

If the quality of the output is insufficient, increase the budget in steps of 2,000–4,000 tokens. If quality is already good, try reducing — you may be able to get the same result at lower cost.

Using Thinking for Code Generation and Debugging

Extended thinking is particularly valuable for:

Architecture from scratch — designing a system before any code exists, where the thinking phase explores different structural approaches and their tradeoffs.

Root cause analysis — when a bug has multiple plausible causes and systematic elimination is required. The thinking phase works through candidates before settling on an explanation.

Refactoring decisions — evaluating whether to restructure code and what the safest migration path looks like.

For code tasks, your prompt should describe what the code needs to do and the constraints it operates under, not the implementation steps. Let the thinking process explore the implementation.

Known Limitations

No response prefilling with extended thinking. The prefilling technique (providing the start of Claude's response) is incompatible with extended thinking. When thinking is enabled, Claude must start its response from scratch — the internal reasoning process and a pre-seeded response start can't coexist.

Thinking tokens are not streamed by default. In standard API integrations, the internal thinking content is not returned in the response. You receive only the final output. Some integrations support receiving thinking content, but this is not the default behavior.

Extended thinking requires a minimum budget. You cannot set budget_tokens below 1,024. For very fast tasks, this minimum may make extended thinking impractical from a latency standpoint.

Prompt Templates for Extended Thinking

These templates show the high-level, goal-oriented structure that works best.

# Architecture Design Template You are designing [type of system] for [context]. <requirements> Functional requirements: - [requirement 1] - [requirement 2] Non-functional requirements: - Scale: [expected load] - Latency: [acceptable response times] - Reliability: [uptime requirements] </requirements> <constraints> - [technology constraint, e.g., must use existing PostgreSQL database] - [team constraint, e.g., team has no Kubernetes experience] - [budget constraint] </constraints> Think carefully about the architecture before responding. Explore at least two distinct approaches before recommending one. Produce: 1. Recommended architecture with a diagram in text/ASCII 2. Justification for each major design decision 3. Tradeoffs you considered and rejected, with reasons 4. Top 3 implementation risks and mitigations

# Complex Debugging Template <bug_description> [What behavior is observed vs. what is expected] </bug_description> <context> System: [brief system description] When it happens: [conditions that trigger the bug] Frequency: [always / intermittent / rare] Recent changes: [anything that changed before the bug appeared] </context> <relevant_code> [paste the relevant code here] </relevant_code> Reason through the possible causes systematically before identifying the most likely root cause. Consider both the obvious candidates and less obvious ones. Provide: 1. Root cause analysis with confidence level 2. Proposed fix with code 3. How to verify the fix resolves the issue 4. Whether any related code should be reviewed as a precaution

Try It: A Complex Reasoning Task

The prompt below is structured using the goal-oriented style suited for extended thinking. Run it, then experiment by rewriting it as step-by-step instructions to see how the output changes.

Loading Prompt Playground...

Loading Exercise...

Key Takeaways

With extended thinking enabled, high-level goal descriptions outperform step-by-step instructions
The thinking budget is used for internal reasoning — let Claude determine its own approach rather than prescribing it
Use the pattern: describe the problem, state the constraints, specify what a good output looks like
Prefilling is incompatible with extended thinking — don't combine the two techniques
Start with a conservative thinking budget and increase in small increments based on output quality
Thinking tokens are not returned in the response by default — you receive only the finished output

Prompting with Extended Thinking

The Central Insight: High-Level Goals, Not Step-by-Step Instructions

Without extended thinking, explicit step-by-step instructions are often helpful. They substitute for Claude's limited working memory in a single forward pass.

The shift in mental model: instead of writing a procedure for Claude to execute, write a problem for Claude to solve.

What Changes in Your Prompt Style

Without Extended Thinking	With Extended Thinking
Break the task into numbered steps	State the goal and constraints clearly
Tell Claude exactly what to do first	Let Claude decide its own approach
Specify the reasoning path	Specify the quality criteria for the output
"First do X, then do Y, then do Z"	"Solve X. Requirements: Y. Output format: Z."

This doesn't mean being vague. It means shifting from prescribing the method to describing the problem and the desired outcome.

The "Think Deeply About X" Pattern

One effective pattern is a direct invitation to reason thoroughly, without specifying how. This simple framing signals to Claude that deep analysis is expected and appropriate.

Compare these two approaches for a complex code architecture task:

Constrained (less effective with thinking):

First, list all the components we'll need.
Then, for each component, describe its responsibilities.
Next, define the interfaces between components.
Finally, identify any potential bottlenecks.

Goal-oriented (more effective with thinking):

Think carefully about the best architecture for this system.
Requirements: [list requirements]
Constraints: [list constraints]
Produce a complete design with justification for each major decision.

The second version lets Claude use its thinking budget to explore the design space before committing. The first version locks it into a structure before that exploration happens.

Thinking Budget Optimization

Start smaller than you think you need. Claude doesn't always use its full budget, and a well-scoped problem often requires less thinking than expected.

Practical starting points by task type:

Constraint optimization (staffing, scheduling): 8,000–12,000 tokens
Code architecture design: 6,000–10,000 tokens
Complex debugging: 5,000–8,000 tokens
Legal or financial analysis: 8,000–16,000 tokens
Multi-step math proofs: 4,000–8,000 tokens

If the quality of the output is insufficient, increase the budget in steps of 2,000–4,000 tokens. If quality is already good, try reducing — you may be able to get the same result at lower cost.

Using Thinking for Code Generation and Debugging

Extended thinking is particularly valuable for:

Architecture from scratch — designing a system before any code exists, where the thinking phase explores different structural approaches and their tradeoffs.

Root cause analysis — when a bug has multiple plausible causes and systematic elimination is required. The thinking phase works through candidates before settling on an explanation.

Refactoring decisions — evaluating whether to restructure code and what the safest migration path looks like.

For code tasks, your prompt should describe what the code needs to do and the constraints it operates under, not the implementation steps. Let the thinking process explore the implementation.

Known Limitations

Extended thinking requires a minimum budget. You cannot set budget_tokens below 1,024. For very fast tasks, this minimum may make extended thinking impractical from a latency standpoint.

Prompt Templates for Extended Thinking

These templates show the high-level, goal-oriented structure that works best.

Try It: A Complex Reasoning Task

The prompt below is structured using the goal-oriented style suited for extended thinking. Run it, then experiment by rewriting it as step-by-step instructions to see how the output changes.

Loading Prompt Playground...

Loading Exercise...

Key Takeaways

With extended thinking enabled, high-level goal descriptions outperform step-by-step instructions
The thinking budget is used for internal reasoning — let Claude determine its own approach rather than prescribing it
Use the pattern: describe the problem, state the constraints, specify what a good output looks like
Prefilling is incompatible with extended thinking — don't combine the two techniques
Start with a conservative thinking budget and increase in small increments based on output quality
Thinking tokens are not returned in the response by default — you receive only the finished output

Prompting with Extended Thinking

The Central Insight: High-Level Goals, Not Step-by-Step Instructions

What Changes in Your Prompt Style

The "Think Deeply About X" Pattern

Thinking Budget Optimization

Using Thinking for Code Generation and Debugging

Known Limitations

Prompt Templates for Extended Thinking

Try It: A Complex Reasoning Task

Key Takeaways

Questions & Answers

Prompting with Extended Thinking

The Central Insight: High-Level Goals, Not Step-by-Step Instructions

What Changes in Your Prompt Style

The "Think Deeply About X" Pattern

Thinking Budget Optimization

Using Thinking for Code Generation and Debugging

Known Limitations

Prompt Templates for Extended Thinking

Try It: A Complex Reasoning Task

Key Takeaways

Questions & Answers