Retry and Fallback Patterns
When a step fails, you have choices: retry the same step, try a different approach, or accept partial results. This lesson covers strategies for each.
Retry Strategies
Simple Retry
Try the same step again, hoping for a different result:
async function retryStep(stepFn, input, maxRetries = 3) {
let lastError;
for (let attempt = 1; attempt <= maxRetries; attempt++) {
try {
const result = await stepFn(input);
return result;
} catch (error) {
lastError = error;
console.log(`Attempt ${attempt} failed: ${error.message}`);
}
}
throw new Error(`Failed after ${maxRetries} attempts: ${lastError.message}`);
}
Exponential Backoff
For rate limits and temporary failures, wait longer between retries:
async function retryWithBackoff(stepFn, input, options = {}) {
const { maxRetries = 5, baseDelay = 1000, maxDelay = 30000 } = options;
for (let attempt = 0; attempt < maxRetries; attempt++) {
try {
return await stepFn(input);
} catch (error) {
if (attempt === maxRetries - 1) throw error;
// Calculate delay: 1s, 2s, 4s, 8s, 16s (capped at maxDelay)
const delay = Math.min(baseDelay * Math.pow(2, attempt), maxDelay);
// Add jitter to prevent thundering herd
const jitter = delay * 0.2 * Math.random();
console.log(`Retry in ${delay + jitter}ms...`);
await sleep(delay + jitter);
}
}
}
Retry with Modification
Change the prompt or parameters on retry:
Adaptive Retry
Adjust strategy based on the type of failure:
async function adaptiveRetry(stepFn, input, maxRetries = 3) {
let modifiedInput = input;
let lastError;
for (let attempt = 1; attempt <= maxRetries; attempt++) {
try {
return await stepFn(modifiedInput);
} catch (error) {
lastError = error;
// Adapt based on error type
if (error.type === 'FORMAT_ERROR') {
modifiedInput = addStricterFormatting(modifiedInput);
} else if (error.type === 'INCOMPLETE_OUTPUT') {
modifiedInput = simplifyRequest(modifiedInput);
} else if (error.type === 'TOKEN_LIMIT') {
modifiedInput = reduceInputSize(modifiedInput);
}
}
}
throw lastError;
}
Fallback Strategies
Alternative Prompt
Use a different prompt that's more likely to succeed:
async function withFallbackPrompt(input) {
try {
// Try the sophisticated prompt first
return await runPrompt(detailedPrompt, input);
} catch (error) {
// Fall back to simpler prompt
console.log('Falling back to simple prompt');
return await runPrompt(simplePrompt, input);
}
}
Alternative Model
Fall back to a different model:
async function withModelFallback(prompt, input) {
const models = ['gpt-4', 'gpt-3.5-turbo', 'claude-3-sonnet'];
for (const model of models) {
try {
return await runPrompt(prompt, input, { model });
} catch (error) {
console.log(`${model} failed, trying next...`);
}
}
throw new Error('All models failed');
}
Cached Results
Use previously computed results when live computation fails:
async function withCacheFallback(stepFn, input) {
try {
const result = await stepFn(input);
await cache.set(getCacheKey(input), result);
return result;
} catch (error) {
const cached = await cache.get(getCacheKey(input));
if (cached) {
return { ...cached, fromCache: true };
}
throw error;
}
}
Degraded Output
Return a simpler, partial result:
Combining Retry and Fallback
Tiered Approach
async function robustStep(input) {
// Tier 1: Try primary approach with retries
try {
return await retryWithBackoff(primaryStep, input, { maxRetries: 3 });
} catch (error) {
console.log('Primary approach failed');
}
// Tier 2: Try fallback approach with retries
try {
return await retryWithBackoff(fallbackStep, input, { maxRetries: 2 });
} catch (error) {
console.log('Fallback approach failed');
}
// Tier 3: Return degraded result
return degradedResult(input);
}
Decision Tree
┌─ Success ─────────────────────────────► Result
│
Try Primary Step ───┤
│ ┌─ Retryable? ─┐
└─ Fail ─┤ ├─ Yes ── Retry (max 3x)
│ │
└─ No ─────────┘
│
┌───────────────┴───────────────┐
│ │
Try Fallback Step Use Cached Result
│ │
┌───────┴───────┐ ┌───────┴───────┐
│ │ │ │
Success Fail Available Not Available
│ │ │ │
▼ ▼ ▼ ▼
Result Degraded Mode Cached Hard Failure
Implementing Circuit Breakers
Prevent repeated failures from overwhelming the system:
class CircuitBreaker {
constructor(options = {}) {
this.failures = 0;
this.threshold = options.threshold || 5;
this.resetTimeout = options.resetTimeout || 60000;
this.state = 'CLOSED'; // CLOSED, OPEN, HALF_OPEN
}
async execute(fn) {
if (this.state === 'OPEN') {
throw new Error('Circuit breaker is OPEN');
}
try {
const result = await fn();
this.onSuccess();
return result;
} catch (error) {
this.onFailure();
throw error;
}
}
onSuccess() {
this.failures = 0;
this.state = 'CLOSED';
}
onFailure() {
this.failures++;
if (this.failures >= this.threshold) {
this.state = 'OPEN';
setTimeout(() => {
this.state = 'HALF_OPEN';
}, this.resetTimeout);
}
}
}
Best Practices
Know When to Stop
Not all failures should trigger retries:
| Error Type | Retry? | Reason |
|---|---|---|
| Rate limit | Yes (with backoff) | Temporary |
| Timeout | Yes (1-2 times) | Might be transient |
| Invalid API key | No | Won't fix itself |
| Content policy | No | Fundamental issue |
| Format error | Yes (with modification) | Might work with changes |
Log Everything
async function trackedRetry(stepFn, input, options) {
const attempts = [];
for (let i = 0; i < options.maxRetries; i++) {
const attempt = { number: i + 1, timestamp: Date.now() };
try {
const result = await stepFn(input);
attempt.status = 'success';
attempts.push(attempt);
return { result, attempts };
} catch (error) {
attempt.status = 'failure';
attempt.error = error.message;
attempts.push(attempt);
}
}
return { result: null, attempts, finalStatus: 'exhausted' };
}
Exercise: Design a Recovery Strategy
Design a complete retry/fallback strategy for this scenario:
Key Takeaways
- Simple retry works for transient failures
- Exponential backoff prevents overwhelming APIs
- Retry with modification can fix format issues
- Fallback to simpler prompts/models increases reliability
- Cache results for fast fallback
- Degraded output is better than complete failure
- Circuit breakers prevent cascading failures
- Always log retry attempts for debugging
- Know which errors are worth retrying
Next, we'll explore graceful degradation strategies in more depth.

