Retry and Fallback Patterns

When a step fails, you have choices: retry the same step, try a different approach, or accept partial results. This lesson covers strategies for each.

Retry Strategies

Simple Retry

Try the same step again, hoping for a different result:

async function retryStep(stepFn, input, maxRetries = 3) {
  let lastError;

  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const result = await stepFn(input);
      return result;
    } catch (error) {
      lastError = error;
      console.log(`Attempt ${attempt} failed: ${error.message}`);
    }
  }

  throw new Error(`Failed after ${maxRetries} attempts: ${lastError.message}`);
}

Exponential Backoff

For rate limits and temporary failures, wait longer between retries:

async function retryWithBackoff(stepFn, input, options = {}) {
  const { maxRetries = 5, baseDelay = 1000, maxDelay = 30000 } = options;

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await stepFn(input);
    } catch (error) {
      if (attempt === maxRetries - 1) throw error;

      // Calculate delay: 1s, 2s, 4s, 8s, 16s (capped at maxDelay)
      const delay = Math.min(baseDelay * Math.pow(2, attempt), maxDelay);

      // Add jitter to prevent thundering herd
      const jitter = delay * 0.2 * Math.random();

      console.log(`Retry in ${delay + jitter}ms...`);
      await sleep(delay + jitter);
    }
  }
}

Retry with Modification

Change the prompt or parameters on retry:

Loading Prompt Playground...

Adaptive Retry

Adjust strategy based on the type of failure:

async function adaptiveRetry(stepFn, input, maxRetries = 3) {
  let modifiedInput = input;
  let lastError;

  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await stepFn(modifiedInput);
    } catch (error) {
      lastError = error;

      // Adapt based on error type
      if (error.type === 'FORMAT_ERROR') {
        modifiedInput = addStricterFormatting(modifiedInput);
      } else if (error.type === 'INCOMPLETE_OUTPUT') {
        modifiedInput = simplifyRequest(modifiedInput);
      } else if (error.type === 'TOKEN_LIMIT') {
        modifiedInput = reduceInputSize(modifiedInput);
      }
    }
  }

  throw lastError;
}

Fallback Strategies

Alternative Prompt

Use a different prompt that's more likely to succeed:

async function withFallbackPrompt(input) {
  try {
    // Try the sophisticated prompt first
    return await runPrompt(detailedPrompt, input);
  } catch (error) {
    // Fall back to simpler prompt
    console.log('Falling back to simple prompt');
    return await runPrompt(simplePrompt, input);
  }
}

Alternative Model

Fall back to a different model:

async function withModelFallback(prompt, input) {
  const models = ['gpt-4', 'gpt-3.5-turbo', 'claude-3-sonnet'];

  for (const model of models) {
    try {
      return await runPrompt(prompt, input, { model });
    } catch (error) {
      console.log(`${model} failed, trying next...`);
    }
  }

  throw new Error('All models failed');
}

Cached Results

Use previously computed results when live computation fails:

async function withCacheFallback(stepFn, input) {
  try {
    const result = await stepFn(input);
    await cache.set(getCacheKey(input), result);
    return result;
  } catch (error) {
    const cached = await cache.get(getCacheKey(input));
    if (cached) {
      return { ...cached, fromCache: true };
    }
    throw error;
  }
}

Degraded Output

Return a simpler, partial result:

Loading Prompt Playground...

Combining Retry and Fallback

Tiered Approach

async function robustStep(input) {
  // Tier 1: Try primary approach with retries
  try {
    return await retryWithBackoff(primaryStep, input, { maxRetries: 3 });
  } catch (error) {
    console.log('Primary approach failed');
  }

  // Tier 2: Try fallback approach with retries
  try {
    return await retryWithBackoff(fallbackStep, input, { maxRetries: 2 });
  } catch (error) {
    console.log('Fallback approach failed');
  }

  // Tier 3: Return degraded result
  return degradedResult(input);
}

Decision Tree

                    ┌─ Success ─────────────────────────────► Result
                    │
Try Primary Step ───┤
                    │        ┌─ Retryable? ─┐
                    └─ Fail ─┤              ├─ Yes ── Retry (max 3x)
                             │              │
                             └─ No ─────────┘
                                    │
                    ┌───────────────┴───────────────┐
                    │                               │
            Try Fallback Step               Use Cached Result
                    │                               │
            ┌───────┴───────┐               ┌───────┴───────┐
            │               │               │               │
         Success          Fail           Available      Not Available
            │               │               │               │
            ▼               ▼               ▼               ▼
         Result      Degraded Mode       Cached        Hard Failure

Implementing Circuit Breakers

Prevent repeated failures from overwhelming the system:

class CircuitBreaker {
  constructor(options = {}) {
    this.failures = 0;
    this.threshold = options.threshold || 5;
    this.resetTimeout = options.resetTimeout || 60000;
    this.state = 'CLOSED'; // CLOSED, OPEN, HALF_OPEN
  }

  async execute(fn) {
    if (this.state === 'OPEN') {
      throw new Error('Circuit breaker is OPEN');
    }

    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }

  onSuccess() {
    this.failures = 0;
    this.state = 'CLOSED';
  }

  onFailure() {
    this.failures++;
    if (this.failures >= this.threshold) {
      this.state = 'OPEN';
      setTimeout(() => {
        this.state = 'HALF_OPEN';
      }, this.resetTimeout);
    }
  }
}

Best Practices

Know When to Stop

Not all failures should trigger retries:

Error Type	Retry?	Reason
Rate limit	Yes (with backoff)	Temporary
Timeout	Yes (1-2 times)	Might be transient
Invalid API key	No	Won't fix itself
Content policy	No	Fundamental issue
Format error	Yes (with modification)	Might work with changes

Log Everything

async function trackedRetry(stepFn, input, options) {
  const attempts = [];

  for (let i = 0; i < options.maxRetries; i++) {
    const attempt = { number: i + 1, timestamp: Date.now() };
    try {
      const result = await stepFn(input);
      attempt.status = 'success';
      attempts.push(attempt);
      return { result, attempts };
    } catch (error) {
      attempt.status = 'failure';
      attempt.error = error.message;
      attempts.push(attempt);
    }
  }

  return { result: null, attempts, finalStatus: 'exhausted' };
}

Exercise: Design a Recovery Strategy

Design a complete retry/fallback strategy for this scenario:

Loading Prompt Playground...

Key Takeaways

Simple retry works for transient failures
Exponential backoff prevents overwhelming APIs
Retry with modification can fix format issues
Fallback to simpler prompts/models increases reliability
Cache results for fast fallback
Degraded output is better than complete failure
Circuit breakers prevent cascading failures
Always log retry attempts for debugging
Know which errors are worth retrying

Next, we'll explore graceful degradation strategies in more depth.

Retry and Fallback Patterns

When a step fails, you have choices: retry the same step, try a different approach, or accept partial results. This lesson covers strategies for each.

Retry Strategies

Simple Retry

Try the same step again, hoping for a different result:

async function retryStep(stepFn, input, maxRetries = 3) {
  let lastError;

  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const result = await stepFn(input);
      return result;
    } catch (error) {
      lastError = error;
      console.log(`Attempt ${attempt} failed: ${error.message}`);
    }
  }

  throw new Error(`Failed after ${maxRetries} attempts: ${lastError.message}`);
}

Exponential Backoff

For rate limits and temporary failures, wait longer between retries:

async function retryWithBackoff(stepFn, input, options = {}) {
  const { maxRetries = 5, baseDelay = 1000, maxDelay = 30000 } = options;

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await stepFn(input);
    } catch (error) {
      if (attempt === maxRetries - 1) throw error;

      // Calculate delay: 1s, 2s, 4s, 8s, 16s (capped at maxDelay)
      const delay = Math.min(baseDelay * Math.pow(2, attempt), maxDelay);

      // Add jitter to prevent thundering herd
      const jitter = delay * 0.2 * Math.random();

      console.log(`Retry in ${delay + jitter}ms...`);
      await sleep(delay + jitter);
    }
  }
}

Retry with Modification

Change the prompt or parameters on retry:

Loading Prompt Playground...

Adaptive Retry

Adjust strategy based on the type of failure:

async function adaptiveRetry(stepFn, input, maxRetries = 3) {
  let modifiedInput = input;
  let lastError;

  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await stepFn(modifiedInput);
    } catch (error) {
      lastError = error;

      // Adapt based on error type
      if (error.type === 'FORMAT_ERROR') {
        modifiedInput = addStricterFormatting(modifiedInput);
      } else if (error.type === 'INCOMPLETE_OUTPUT') {
        modifiedInput = simplifyRequest(modifiedInput);
      } else if (error.type === 'TOKEN_LIMIT') {
        modifiedInput = reduceInputSize(modifiedInput);
      }
    }
  }

  throw lastError;
}

Fallback Strategies

Alternative Prompt

Use a different prompt that's more likely to succeed:

async function withFallbackPrompt(input) {
  try {
    // Try the sophisticated prompt first
    return await runPrompt(detailedPrompt, input);
  } catch (error) {
    // Fall back to simpler prompt
    console.log('Falling back to simple prompt');
    return await runPrompt(simplePrompt, input);
  }
}

Alternative Model

Fall back to a different model:

async function withModelFallback(prompt, input) {
  const models = ['gpt-4', 'gpt-3.5-turbo', 'claude-3-sonnet'];

  for (const model of models) {
    try {
      return await runPrompt(prompt, input, { model });
    } catch (error) {
      console.log(`${model} failed, trying next...`);
    }
  }

  throw new Error('All models failed');
}

Cached Results

Use previously computed results when live computation fails:

async function withCacheFallback(stepFn, input) {
  try {
    const result = await stepFn(input);
    await cache.set(getCacheKey(input), result);
    return result;
  } catch (error) {
    const cached = await cache.get(getCacheKey(input));
    if (cached) {
      return { ...cached, fromCache: true };
    }
    throw error;
  }
}

Degraded Output

Return a simpler, partial result:

Loading Prompt Playground...

Combining Retry and Fallback

Tiered Approach

async function robustStep(input) {
  // Tier 1: Try primary approach with retries
  try {
    return await retryWithBackoff(primaryStep, input, { maxRetries: 3 });
  } catch (error) {
    console.log('Primary approach failed');
  }

  // Tier 2: Try fallback approach with retries
  try {
    return await retryWithBackoff(fallbackStep, input, { maxRetries: 2 });
  } catch (error) {
    console.log('Fallback approach failed');
  }

  // Tier 3: Return degraded result
  return degradedResult(input);
}

Decision Tree

                    ┌─ Success ─────────────────────────────► Result
                    │
Try Primary Step ───┤
                    │        ┌─ Retryable? ─┐
                    └─ Fail ─┤              ├─ Yes ── Retry (max 3x)
                             │              │
                             └─ No ─────────┘
                                    │
                    ┌───────────────┴───────────────┐
                    │                               │
            Try Fallback Step               Use Cached Result
                    │                               │
            ┌───────┴───────┐               ┌───────┴───────┐
            │               │               │               │
         Success          Fail           Available      Not Available
            │               │               │               │
            ▼               ▼               ▼               ▼
         Result      Degraded Mode       Cached        Hard Failure

Implementing Circuit Breakers

Prevent repeated failures from overwhelming the system:

class CircuitBreaker {
  constructor(options = {}) {
    this.failures = 0;
    this.threshold = options.threshold || 5;
    this.resetTimeout = options.resetTimeout || 60000;
    this.state = 'CLOSED'; // CLOSED, OPEN, HALF_OPEN
  }

  async execute(fn) {
    if (this.state === 'OPEN') {
      throw new Error('Circuit breaker is OPEN');
    }

    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }

  onSuccess() {
    this.failures = 0;
    this.state = 'CLOSED';
  }

  onFailure() {
    this.failures++;
    if (this.failures >= this.threshold) {
      this.state = 'OPEN';
      setTimeout(() => {
        this.state = 'HALF_OPEN';
      }, this.resetTimeout);
    }
  }
}

Best Practices

Know When to Stop

Not all failures should trigger retries:

Error Type	Retry?	Reason
Rate limit	Yes (with backoff)	Temporary
Timeout	Yes (1-2 times)	Might be transient
Invalid API key	No	Won't fix itself
Content policy	No	Fundamental issue
Format error	Yes (with modification)	Might work with changes

Log Everything

async function trackedRetry(stepFn, input, options) {
  const attempts = [];

  for (let i = 0; i < options.maxRetries; i++) {
    const attempt = { number: i + 1, timestamp: Date.now() };
    try {
      const result = await stepFn(input);
      attempt.status = 'success';
      attempts.push(attempt);
      return { result, attempts };
    } catch (error) {
      attempt.status = 'failure';
      attempt.error = error.message;
      attempts.push(attempt);
    }
  }

  return { result: null, attempts, finalStatus: 'exhausted' };
}

Exercise: Design a Recovery Strategy

Design a complete retry/fallback strategy for this scenario:

Loading Prompt Playground...

Key Takeaways

Simple retry works for transient failures
Exponential backoff prevents overwhelming APIs
Retry with modification can fix format issues
Fallback to simpler prompts/models increases reliability
Cache results for fast fallback
Degraded output is better than complete failure
Circuit breakers prevent cascading failures
Always log retry attempts for debugging
Know which errors are worth retrying

Next, we'll explore graceful degradation strategies in more depth.

Retry and Fallback Patterns

Retry Strategies

Simple Retry

Exponential Backoff

Retry with Modification

Adaptive Retry

Fallback Strategies

Alternative Prompt

Alternative Model

Cached Results

Degraded Output

Combining Retry and Fallback

Tiered Approach

Decision Tree

Implementing Circuit Breakers

Best Practices

Know When to Stop

Log Everything

Exercise: Design a Recovery Strategy

Key Takeaways

Discussion

Retry and Fallback Patterns

Retry Strategies

Simple Retry

Exponential Backoff

Retry with Modification

Adaptive Retry

Fallback Strategies

Alternative Prompt

Alternative Model

Cached Results

Degraded Output

Combining Retry and Fallback

Tiered Approach

Decision Tree

Implementing Circuit Breakers

Best Practices

Know When to Stop

Log Everything

Exercise: Design a Recovery Strategy

Key Takeaways

Discussion