Types of Chain Failures

Understanding the different ways prompt chains can fail is essential for building robust systems. Each failure type requires a different handling strategy.

Failure Categories

Prompt chain failures fall into three main categories:

┌─────────────────────────────────────────────────────────────┐
│                    CHAIN FAILURES                           │
├─────────────────┬─────────────────┬─────────────────────────┤
│   API/System    │    Content      │    Logic/Flow           │
│   Failures      │    Failures     │    Failures             │
├─────────────────┼─────────────────┼─────────────────────────┤
│ • Rate limits   │ • Wrong format  │ • Invalid transitions   │
│ • Timeouts      │ • Missing data  │ • Infinite loops        │
│ • Network errors│ • Hallucinations│ • Dead ends             │
│ • Token limits  │ • Off-topic     │ • State corruption      │
└─────────────────┴─────────────────┴─────────────────────────┘

API and System Failures

Rate Limiting

When you hit API rate limits:

// Example error
{
  "error": {
    "type": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Please retry after 60 seconds."
  }
}

Symptoms: 429 status codes, specific error messages
Impact: Step cannot execute
Recovery: Exponential backoff, request queuing
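The exponential backoff recovery listed above can be sketched roughly as follows. This is a minimal illustration, not a specific SDK's API: `withBackoff` is a hypothetical helper, and it assumes the thrown error carries the HTTP status in `error.status`. The `sleep` parameter is injectable only so the delays can be observed in tests.

```javascript
// Hypothetical helper: retries `fn` with exponential backoff on 429 errors.
// Assumption: the thrown error exposes the HTTP status as `error.status`.
async function withBackoff(fn, { maxRetries = 5, baseDelayMs = 1000, sleep = defaultSleep } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      const retryable = error.status === 429;
      // Give up on non-rate-limit errors or once the retry budget is spent.
      if (!retryable || attempt >= maxRetries) throw error;
      // Delay doubles on each attempt: base, 2x base, 4x base, ...
      const delayMs = baseDelayMs * 2 ** attempt;
      await sleep(delayMs);
    }
  }
}

function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}
```

If the API reports how long to wait, as in the example message above, that value is usually a better delay than the computed one. Adding random jitter to the delay is a common refinement when many clients retry at once.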

Timeouts

When a step takes too long:

// Timeout scenario
try {
  const result = await runStep(input, { timeout: 30000 });
} catch (error) {
  if (error.code === 'TIMEOUT') {
    // Step took longer than 30 seconds
  }
}

Symptoms: No response within expected time
Impact: Chain stalls, resources held
Recovery: Retry with longer timeout, use cached result, skip step
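One way to enforce a timeout and fall back to a cached result is sketched below. `withTimeout` and `runWithFallback` are hypothetical helpers, and the `'TIMEOUT'` error code simply mirrors the snippet above rather than any particular library.

```javascript
// Hypothetical helper: rejects with code "TIMEOUT" if `promise` takes longer than `ms`.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => {
      const error = new Error(`Step exceeded ${ms} ms`);
      error.code = "TIMEOUT";
      reject(error);
    }, ms);
  });
  // Clear the timer either way so it does not keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Falls back to a cached value (if one is supplied) when the step times out.
async function runWithFallback(step, ms, cachedResult) {
  try {
    return await withTimeout(step(), ms);
  } catch (error) {
    if (error.code === "TIMEOUT" && cachedResult !== undefined) return cachedResult;
    throw error;
  }
}
```

Note that `Promise.race` only stops waiting; it does not cancel the underlying request, so the slow call may still complete and consume resources in the background.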

Token Limit Exceeded

This failure occurs when the input or the requested output exceeds the model's token limits.

Symptoms: API error about token limits
Impact: Step fails completely
Recovery: Chunk input, summarize first, use larger context model
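Chunking the input, the first recovery option above, might look like the sketch below. The four-characters-per-token estimate is only a rough heuristic assumed for illustration; a real chain would use the tokenizer that matches its model. Both function names are hypothetical.

```javascript
// Rough token estimate. Assumption: about 4 characters per token (heuristic only).
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Groups paragraphs into chunks whose estimated size stays within maxTokens.
// A single paragraph larger than maxTokens is kept whole as its own chunk.
function chunkByParagraph(text, maxTokens) {
  const chunks = [];
  let current = "";
  for (const paragraph of text.split(/\n\s*\n/)) {
    const candidate = current ? `${current}\n\n${paragraph}` : paragraph;
    if (current && estimateTokens(candidate) > maxTokens) {
      chunks.push(current);
      current = paragraph;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Splitting on paragraph boundaries keeps related sentences together, which tends to matter when each chunk is summarized or analyzed on its own.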

Content Failures

Wrong Output Format

When the AI doesn't follow format instructions:

Expected:
{ "sentiment": "positive", "score": 0.8 }

Actual:
"The sentiment is positive with a high score."

Symptoms: JSON parse errors, missing fields
Impact: Next step receives invalid input
Recovery: Retry with stricter instructions, parse flexibly, use format validation
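"Parse flexibly" can mean several things. One simple interpretation, sketched here, is to try strict JSON first and then look for an object embedded in surrounding prose. `parseJsonLoosely` is a hypothetical helper and only handles a single JSON object.

```javascript
// Tries strict JSON first, then falls back to the span between the first "{"
// and the last "}". Returns null when no object can be recovered, so the
// caller can retry the step with stricter instructions.
function parseJsonLoosely(text) {
  try {
    const parsed = JSON.parse(text);
    if (parsed !== null && typeof parsed === "object") return parsed;
  } catch {
    // Not strict JSON; fall through to the looser extraction below.
  }
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  try {
    return JSON.parse(text.slice(start, end + 1));
  } catch {
    return null;
  }
}
```

A plain-prose reply such as the "Actual" output above yields `null`, which is the signal to retry rather than pass bad input to the next step.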

Missing or Incomplete Data

This failure occurs when extraction or analysis returns only part of what the input contains.

Symptoms: Fewer results than expected, empty fields
Impact: Downstream steps lack data
Recovery: Second-pass extraction, human review queue
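A second-pass extraction with a review queue as the last resort could be sketched like this. Everything here is illustrative: `extractFields` stands in for whatever step re-runs the extraction for specific fields, and `reviewQueue` is assumed to be a plain array.

```javascript
// Returns the names of expected fields that are absent or empty in `record`.
function findMissingFields(record, expectedFields) {
  return expectedFields.filter((field) => {
    const value = record[field];
    return value === undefined || value === null || value === "" ||
      (Array.isArray(value) && value.length === 0);
  });
}

// Hypothetical routing: re-extract only the gaps, then queue for human
// review if gaps remain after the second pass.
async function fillGaps(record, expectedFields, extractFields, reviewQueue) {
  const missing = findMissingFields(record, expectedFields);
  if (missing.length === 0) return record;
  const secondPass = await extractFields(missing);
  const merged = { ...record, ...secondPass };
  const stillMissing = findMissingFields(merged, expectedFields);
  if (stillMissing.length > 0) reviewQueue.push({ record: merged, missing: stillMissing });
  return merged;
}
```

Asking the second pass for only the missing fields keeps the follow-up prompt focused and avoids overwriting values that were already extracted.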

Hallucinations

When the AI generates false information:

Examples:

Citing non-existent sources
Adding details not in the input
Making up statistics

Symptoms: Information doesn't match source, implausible claims
Impact: Incorrect final output, trust issues
Recovery: Fact-checking step, confidence thresholds, human review
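A full fact-checking step usually needs another model call or a human, but a cheap first filter for made-up statistics is to flag numbers in the output that never appear in the source. The sketch below is a crude string-level heuristic under that assumption; both function names are hypothetical.

```javascript
// Extracts numeric tokens such as "42", "3.5", or "12%" from text.
function extractNumbers(text) {
  return text.match(/\d+(?:\.\d+)?%?/g) ?? [];
}

// Returns the numbers that appear in the output but nowhere in the source.
function findUnsupportedNumbers(source, output) {
  const sourceNumbers = new Set(extractNumbers(source));
  return extractNumbers(output).filter((n) => !sourceNumbers.has(n));
}
```

Because it compares strings, this check reports false positives when the same value is written differently (for example "1,000" versus "1000"), so a hit is better treated as a reason for review than as proof of a hallucination.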

Off-Topic or Irrelevant Output

When the AI doesn't address the actual task:

Symptoms: Output doesn't match request, tangential content
Impact: Wasted processing, chain produces wrong result
Recovery: Relevance validation, retry with clearer instructions
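Relevance validation is often done with embeddings or a second model call. As a much simpler stand-in, the sketch below measures how many of the request's keywords show up in the output. The "words of four or more letters" rule is an arbitrary assumption, and any threshold applied to the score would need tuning.

```javascript
// Lowercases text and returns its distinct words of 4+ letters
// (a crude proxy for keywords; ASCII letters only).
function keywords(text) {
  return new Set(text.toLowerCase().match(/[a-z]{4,}/g) ?? []);
}

// Fraction of the request's keywords that also occur in the output (0 to 1).
function keywordOverlap(request, output) {
  const wanted = keywords(request);
  if (wanted.size === 0) return 1; // nothing to compare against
  const found = keywords(output);
  let hits = 0;
  for (const word of wanted) {
    if (found.has(word)) hits++;
  }
  return hits / wanted.size;
}
```

A score near zero is a reasonable trigger for retrying with clearer instructions; a high score does not prove the output is on topic, only that it shares vocabulary with the request.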

Logic and Flow Failures

Invalid State Transitions

When the chain reaches an impossible state:

// Example: order-processing chain
const state = { status: "delivered" };

// Invalid: a delivered order can't be cancelled
const nextStep = "cancel_order";  // Should fail

// Valid transitions from "delivered"
const validTransitions = ["return_initiated", "feedback_requested"];

Symptoms: Unexpected state combinations, assertion failures
Impact: Chain in undefined state
Recovery: State validation, rollback to known state
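State validation can be enforced with an explicit transition table, as in the sketch below. The table is hypothetical: only the transitions out of "delivered" come from the example above, and the "shipped" entry is added purely for illustration.

```javascript
// Hypothetical transition table for the order example.
// Only the "delivered" row comes from the example above.
const TRANSITIONS = {
  shipped: ["delivered"],
  delivered: ["return_initiated", "feedback_requested"],
};

// Returns a new state object, or throws without modifying the current one,
// so the caller still holds the last known-good state.
function transition(state, nextStatus) {
  const allowed = TRANSITIONS[state.status] ?? [];
  if (!allowed.includes(nextStatus)) {
    throw new Error(`Invalid transition: ${state.status} -> ${nextStatus}`);
  }
  return { ...state, status: nextStatus };
}
```

Because `transition` never mutates its input, "rollback to known state" amounts to keeping the previous object when the call throws.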

Infinite Loops

When a chain never terminates:

Step: Improve → Evaluate → (not good enough) → Improve → Evaluate → ...

Symptoms: Ever-increasing step count, no termination
Impact: Resource exhaustion, cost explosion
Recovery: Maximum iteration limits, early stopping criteria
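The improve/evaluate loop above can be bounded with both recovery measures at once: an iteration cap and an early stop when the score no longer improves. This is a sketch under assumptions: `improve` and `evaluate` are hypothetical step functions supplied by the caller, and the default limits are placeholders.

```javascript
// Runs improve/evaluate until the score reaches the target, stops improving,
// or the iteration budget runs out.
async function refine(draft, improve, evaluate, { maxIterations = 5, targetScore = 0.8 } = {}) {
  let best = { draft, score: await evaluate(draft) };
  for (let i = 0; i < maxIterations && best.score < targetScore; i++) {
    const candidate = await improve(best.draft);
    const score = await evaluate(candidate);
    if (score <= best.score) break; // early stop: no improvement
    best = { draft: candidate, score };
  }
  return { ...best, reachedTarget: best.score >= targetScore };
}
```

Returning `reachedTarget` lets the caller decide what to do with a best-effort draft instead of silently treating it as a success.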

Dead Ends

When a chain reaches a state with no valid next action:

// No handler for this case
if (analysis.category === "unknown") {
  // What now? Chain has no path forward
}

Symptoms: Chain halts with no error, no output
Impact: Incomplete processing
Recovery: Default handlers, explicit "unknown" paths
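A default handler closes the gap shown above. In this sketch the category names and routes are invented for illustration; the point is only that every input, including "unknown", ends in some explicit outcome.

```javascript
// Hypothetical handlers keyed by category; names and routes are illustrative.
const handlers = new Map([
  ["billing", (analysis) => ({ route: "billing_team", analysis })],
  ["technical", (analysis) => ({ route: "support_team", analysis })],
]);

// Explicit fallback so an unrecognized category still produces an outcome.
function fallbackHandler(analysis) {
  return { route: "human_review", analysis };
}

function dispatch(analysis) {
  const handler = handlers.get(analysis.category) ?? fallbackHandler;
  return handler(analysis);
}
```

Using a `Map` rather than a plain object avoids accidentally matching inherited keys such as "constructor" when the category comes from model output.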

Failure Detection Strategies

Structural Validation

Check output structure matches expectations:

function validateOutput(output, schema) {
  // Guard first: the `in` operator throws on null and on primitives such as strings
  if (output === null || typeof output !== "object") {
    return { valid: false, errors: ["Output is not an object"] };
  }

  const errors = [];

  for (const [field, requirements] of Object.entries(schema)) {
    if (requirements.required && !(field in output)) {
      errors.push(`Missing required field: ${field}`);
    }
    if (field in output && typeof output[field] !== requirements.type) {
      errors.push(`Wrong type for ${field}: expected ${requirements.type}`);
    }
  }

  return { valid: errors.length === 0, errors };
}

Semantic Validation

Check that the output makes sense for the task, not just that it has the right shape.
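Semantic checks are often delegated to a second model call, but some can be written as plain rules. The sketch below applies rule-based plausibility checks to the sentiment example used earlier. The allowed labels and the [-1, 1] score range are assumptions made for illustration.

```javascript
// Illustrative rules for the sentiment example. The allowed labels and the
// [-1, 1] score range are assumptions, not something the chain defines.
const rules = [
  {
    message: "sentiment must be positive, neutral, or negative",
    check: (o) => ["positive", "neutral", "negative"].includes(o.sentiment),
  },
  {
    message: "score must be a finite number between -1 and 1",
    check: (o) => Number.isFinite(o.score) && o.score >= -1 && o.score <= 1,
  },
];

// Runs every rule and collects the messages of those that fail.
function validateSemantics(output) {
  const errors = rules.filter((rule) => !rule.check(output)).map((rule) => rule.message);
  return { valid: errors.length === 0, errors };
}
```

Collecting all failures, rather than stopping at the first, gives a retry prompt more to work with.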

Consistency Checks

Verify outputs are internally consistent:

// Check for contradictions
if (output.sentiment === "positive" && output.score < 0) {
  throw new Error("Inconsistent: positive sentiment with negative score");
}

// Check for impossible combinations
if (output.status === "completed" && !output.completion_date) {
  throw new Error("Inconsistent: completed without completion date");
}

Exercise: Classify Failures

For each scenario, identify the failure type and suggest a recovery strategy:


Key Takeaways

Failures fall into three main categories: API/System, Content, and Logic/Flow
Each failure type has characteristic symptoms and recovery strategies
API failures often need retry with backoff
Content failures need validation and reformatting
Logic failures need bounds checking and default paths
Detection requires both structural and semantic validation
Plan for failures during chain design, not after deployment

Next, we'll explore specific strategies for validating outputs between steps.


