External Memory Systems

When chains need more context than fits in a single prompt, external memory systems provide a solution. This lesson covers patterns for using external storage to extend chain capabilities.

Why External Memory?

Internal context (the prompt) has hard limits. External memory enables:

Long-term history: Store all previous interactions, limited by storage rather than by the context window
Shared state: Multiple chains access the same data
Persistence: State survives beyond a single session
Selective retrieval: Load only relevant context

External Memory Architecture

┌─────────────────────────────────────────────────────────────┐
│                       CHAIN STEP                            │
├─────────────────────────────────────────────────────────────┤
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐  │
│  │   Current    │    │   Retrieved  │    │    Model     │  │
│  │   Input      │ +  │   Context    │ →  │   Prompt     │  │
│  └──────────────┘    └──────────────┘    └──────────────┘  │
│                             ▲                               │
│                             │ Retrieve                      │
│                      ┌──────┴──────┐                       │
│                      │  External   │                       │
│                      │  Memory     │                       │
│                      └─────────────┘                       │
│                             ▲                               │
│                             │ Store                        │
│                      Previous Step                         │
│                      Outputs                               │
└─────────────────────────────────────────────────────────────┘
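
The flow in the diagram can be sketched as a single function. This is a minimal illustration under stated assumptions, not a definitive implementation: `createMemory`, `runChainStep`, and the `callModel` stub are hypothetical names standing in for a real store and a real model client.

```javascript
// Hypothetical in-memory stand-in for an external store.
function createMemory() {
  const entries = [];
  return {
    store(stepId, output) {
      entries.push({ stepId, output });
    },
    // Retrieve the outputs of the last `n` steps, oldest first.
    retrieve(n) {
      return entries.slice(-n);
    }
  };
}

// Stub model call: a real chain would send the prompt to a model API here.
function callModel(prompt) {
  return `response to ${prompt.length} chars of prompt`;
}

function runChainStep(memory, stepId, currentInput) {
  // 1. Retrieve: pull earlier step outputs from external memory.
  const retrieved = memory.retrieve(3);

  // 2. Combine: current input + retrieved context become the model prompt.
  const contextText = retrieved
    .map(e => `[${e.stepId}] ${e.output}`)
    .join('\n');
  const prompt = `Context:\n${contextText}\n\nInput:\n${currentInput}`;

  // 3. Call the model, then store the output for later steps.
  const output = callModel(prompt);
  memory.store(stepId, output);
  return { prompt, output };
}
```

Each call to `runChainStep` sees what earlier calls stored, so the prompt for step 2 contains the output of step 1 without that output ever being passed in directly.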

Memory Store Types

Key-Value Store

Simple storage by unique keys:

class KeyValueMemory {
  constructor() {
    this.store = new Map();
  }

  async save(key, value) {
    this.store.set(key, {
      value,
      timestamp: Date.now()
    });
  }

  async get(key) {
    return this.store.get(key)?.value;
  }

  async getMultiple(keys) {
    // Await every lookup so callers receive values, not pending promises
    return Promise.all(keys.map(k => this.get(k)));
  }
}

// Usage
const memory = new KeyValueMemory();
await memory.save('step1_entities', extractedEntities);
await memory.save('step2_analysis', analysisResult);

// Later step retrieves specific data
const entities = await memory.get('step1_entities');

Vector Store (Semantic Search)

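Retrieve by meaning rather than by exact key. Each entry is stored with an embedding vector, and a query returns the entries whose embeddings are most similar to the query's embedding.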

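A minimal sketch of semantic lookup, assuming embeddings are plain arrays of numbers and similarity is cosine similarity. `InMemoryVectorStore` is a hypothetical class for illustration, and the 3-dimensional vectors are hand-written toy values; a real system would get embeddings from an embedding model and would usually use a dedicated vector database.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

class InMemoryVectorStore {
  constructor() {
    this.items = [];
  }

  // `embedding` is assumed to come from an embedding model of your choice.
  add(id, embedding, content) {
    this.items.push({ id, embedding, content });
  }

  // Return up to `limit` items scoring at least `minScore`, best first.
  search(queryEmbedding, { limit = 10, minScore = 0 } = {}) {
    return this.items
      .map(item => ({
        id: item.id,
        content: item.content,
        score: cosineSimilarity(queryEmbedding, item.embedding)
      }))
      .filter(r => r.score >= minScore)
      .sort((a, b) => b.score - a.score)
      .slice(0, limit);
  }
}

// Toy embeddings, for illustration only.
const store = new InMemoryVectorStore();
store.add('a', [1, 0, 0], 'User prefers metric units');
store.add('b', [0, 1, 0], 'Deployment runs on Fridays');
store.add('c', [0.9, 0.1, 0], 'User is based in Berlin');

const hits = store.search([1, 0, 0], { limit: 2, minScore: 0.7 });
```

The `search` options mirror the `limit` and `minScore` parameters used in the relevance-based retrieval function later in this lesson, although this sketch is synchronous.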

Structured Database

For complex queries and relationships:

class StructuredMemory {
  async saveStepResult(chainId, stepId, result) {
    await db.stepResults.insert({
      chainId,
      stepId,
      result: JSON.stringify(result),
      timestamp: new Date(),
      tokenCount: estimateTokens(result)
    });
  }

  async getStepResults(chainId, filters = {}) {
    let query = db.stepResults.where({ chainId });

    if (filters.stepIds) {
      query = query.where('stepId').in(filters.stepIds);
    }

    if (filters.maxTokens) {
      // Get most recent until token limit
      const results = [];
      let totalTokens = 0;

      for await (const row of query.orderBy('timestamp', 'desc')) {
        if (totalTokens + row.tokenCount > filters.maxTokens) break;
        results.push(row);
        totalTokens += row.tokenCount;
      }

      return results.reverse();
    }

    return query.all();
  }
}
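
`estimateTokens` is used above but not defined in this lesson. One possible stand-in is sketched below, assuming roughly 4 characters per token, which is only a coarse approximation for English text. For exact counts, use the tokenizer that matches your model.

```javascript
// Rough token estimate: serializes non-string values, then assumes
// about 4 characters per token. This ratio is an approximation only.
function estimateTokens(value, charsPerToken = 4) {
  const text = typeof value === 'string' ? value : JSON.stringify(value);
  if (!text) return 0;
  return Math.ceil(text.length / charsPerToken);
}
```

Because the estimate is coarse, it is safer to leave some headroom in any budget computed from it.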

Retrieval Patterns

Recency-Based Retrieval

Get the most recent data for a chain:

async function getRecentContext(memory, chainId, maxTokens) {
  const items = await memory.query({
    chainId,
    orderBy: 'timestamp DESC',
    limit: 20
  });

  // Take items until token budget exhausted
  const context = [];
  let tokens = 0;

  for (const item of items) {
    if (tokens + item.tokens > maxTokens) break;
    context.push(item);
    tokens += item.tokens;
  }

  return context;
}
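
The same take-until-the-budget-is-spent loop appears in several retrieval functions in this lesson. It could be factored into one helper, sketched here; `takeWithinBudget` and its `getTokens` parameter are hypothetical names, not part of any library.

```javascript
// Take items in order until adding the next one would exceed the budget.
// `getTokens` reads the token count from an item (field names vary by store).
function takeWithinBudget(items, maxTokens, getTokens = item => item.tokens) {
  const taken = [];
  let used = 0;
  for (const item of items) {
    const cost = getTokens(item);
    if (used + cost > maxTokens) break;
    taken.push(item);
    used += cost;
  }
  return { taken, used };
}

const newestFirst = [
  { id: 3, tokens: 40 },
  { id: 2, tokens: 50 },
  { id: 1, tokens: 30 }
];
const { taken, used } = takeWithinBudget(newestFirst, 100);
```

Here the two newest items fit (90 tokens) and the third would overshoot, so the loop stops. Note that the loop stops at the first item that does not fit, even if a later, smaller item would have.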

Relevance-Based Retrieval

Get semantically similar content:

async function getRelevantContext(vectorStore, query, maxTokens) {
  // Embed the query
  const queryEmbedding = await embed(query);

  // Search for similar content
  const results = await vectorStore.search(queryEmbedding, {
    limit: 10,
    minScore: 0.7
  });

  // Filter by token budget
  const context = [];
  let tokens = 0;

  for (const result of results) {
    if (tokens + result.tokens > maxTokens) break;
    context.push(result);
    tokens += result.tokens;
  }

  return context;
}

Hybrid Retrieval

Combine multiple strategies:

async function hybridRetrieval(memory, vectorStore, query, maxTokens) {
  // Always include recent step results (30% of the token budget)
  const recent = await getRecentContext(memory, query.chainId, Math.floor(maxTokens * 0.3));

  // Search for relevant historical context (50% of the token budget)
  const relevant = await getRelevantContext(vectorStore, query.text, Math.floor(maxTokens * 0.5));

  // Include key facts
  const keyFacts = await memory.get(`${query.chainId}:keyFacts`);

  return {
    recent,
    relevant,
    keyFacts,
    totalTokens: estimateTokens([recent, relevant, keyFacts])
  };
}
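
When recency and relevance are combined, the same item can come back from both searches. A minimal sketch of removing such duplicates, assuming every item carries a unique `id` field (an assumption; the lesson does not specify the item shape):

```javascript
// Merge recent and relevant items, dropping duplicates by `id`.
// Recent items come first, so they win when an item appears in both lists.
function mergeContext(recent, relevant) {
  const seen = new Set();
  const merged = [];
  for (const item of [...recent, ...relevant]) {
    if (seen.has(item.id)) continue;
    seen.add(item.id);
    merged.push(item);
  }
  return merged;
}

const merged = mergeContext(
  [{ id: 'r1', content: 'latest step output' }, { id: 'x', content: 'shared item' }],
  [{ id: 'x', content: 'shared item' }, { id: 'v1', content: 'older related note' }]
);
```

Deduplicating before counting tokens avoids spending budget twice on the same content.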

Memory Management

Writing to Memory

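Not every step output deserves a place in memory. One possible write policy is sketched below under simple assumptions: skip empty content, skip anything over a size cap (which would be better summarized first), and skip exact duplicates. `shouldWrite`, `writeToMemory`, and the `maxChars` option are hypothetical names chosen for illustration.

```javascript
// Hypothetical write policy: decide whether a step output is worth storing.
function shouldWrite(entry, existingContents, { maxChars = 2000 } = {}) {
  const content = entry.content.trim();
  if (content.length === 0) return false;           // nothing to store
  if (content.length > maxChars) return false;      // too large: summarize first
  if (existingContents.has(content)) return false;  // exact duplicate
  return true;
}

// Append the entry to an array-backed memory if the policy allows it.
// Returns true when the entry was written.
function writeToMemory(memoryEntries, entry, options) {
  const existing = new Set(memoryEntries.map(e => e.content));
  if (!shouldWrite(entry, existing, options)) return false;
  memoryEntries.push({
    ...entry,
    content: entry.content.trim(),
    timestamp: Date.now()
  });
  return true;
}
```

A real policy would likely be task-specific, for example storing extracted facts and decisions while discarding intermediate reasoning.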

Memory Cleanup

async function cleanupMemory(memory, chainId, retentionPolicy) {
  // Remove old entries
  await memory.deleteWhere({
    chainId,
    timestamp: { $lt: Date.now() - retentionPolicy.maxAge }
  });

  // Keep only N most recent per category
  const categories = await memory.getCategories(chainId);
  for (const category of categories) {
    await memory.keepNewest(chainId, category, retentionPolicy.maxPerCategory);
  }

  // Summarize old detailed entries
  const oldEntries = await memory.query({
    chainId,
    timestamp: { $lt: Date.now() - retentionPolicy.summarizeAfter },
    summarized: false
  });

  for (const entry of oldEntries) {
    const summary = await summarize(entry.content);
    await memory.update(entry.id, {
      content: summary,
      summarized: true
    });
  }
}
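
The shape of `retentionPolicy` can be read off the function above: `maxAge`, `maxPerCategory`, and `summarizeAfter`. The values below are example assumptions only, and `validateRetentionPolicy` is a hypothetical helper.

```javascript
const HOUR = 60 * 60 * 1000;
const DAY = 24 * HOUR;

// Example values only; durations are in milliseconds because
// cleanupMemory compares them against Date.now().
const retentionPolicy = {
  maxAge: 30 * DAY,          // delete entries older than this
  maxPerCategory: 100,       // keep at most this many per category
  summarizeAfter: 7 * DAY    // summarize entries older than this
};

// Entries older than maxAge are deleted before they could be summarized,
// so summarizeAfter only has an effect when it is shorter than maxAge.
function validateRetentionPolicy(policy) {
  const problems = [];
  if (!(policy.maxAge > 0)) problems.push('maxAge must be positive');
  if (!(policy.maxPerCategory >= 1)) problems.push('maxPerCategory must be at least 1');
  if (!(policy.summarizeAfter < policy.maxAge)) {
    problems.push('summarizeAfter must be shorter than maxAge');
  }
  return problems;
}
```

Checking the policy once at startup is cheaper than discovering later that summarization never ran.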

Memory-Augmented Prompts

Injecting Retrieved Context

function buildMemoryAugmentedPrompt(basePrompt, retrievedContext) {
  return `
${basePrompt}

## Retrieved Context (from memory)

${retrievedContext.map((item, i) =>
  `### Memory Item ${i + 1} (${item.source})\n${item.content}`
).join('\n\n')}

## Current Task
Using the retrieved context above, complete the following:
`;
}

Referencing Memory

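One way to let a model reference memory is to give each retrieved item a short label and ask the model to cite labels in its answer. The sketch below assumes a `[M1]`-style label format, which is an illustrative convention rather than a standard; all function names are hypothetical.

```javascript
// Give each retrieved item a short label the model can cite, e.g. [M1].
function labelMemoryItems(items) {
  return items.map((item, i) => ({ ...item, label: `M${i + 1}` }));
}

// Render labeled items as one line each for inclusion in a prompt.
function formatLabeledContext(labeled) {
  return labeled.map(item => `[${item.label}] ${item.content}`).join('\n');
}

// Find which items a model response cited, ignoring unknown labels.
function extractCitedItems(responseText, labeled) {
  const cited = new Set();
  for (const match of responseText.matchAll(/\[(M\d+)\]/g)) {
    cited.add(match[1]);
  }
  return labeled.filter(item => cited.has(item.label));
}
```

Tracking which items were actually cited gives a simple signal for what is worth keeping in memory and what retrieval returned unnecessarily.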

Exercise: Design a Memory System

Design an external memory system for this scenario:


Key Takeaways

External memory extends chains beyond context window limits
Key-value stores work for simple state persistence
Vector stores enable semantic search for relevant context
Structured databases support complex queries
Combine recency and relevance for best retrieval
Carefully decide what to write to memory
Clean up old entries to manage storage
Inject retrieved context clearly into prompts

In the next module, we'll explore integrating external tools into your chains.
