Cost Optimization for Prompt Chains
Production chains need to balance quality with cost. This lesson covers strategies for optimizing token usage and API costs.
Understanding Chain Costs
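Chain Cost = Σ over steps (Input Tokens × Input Price + Output Tokens × Output Price)
Each step in a chain adds to the total cost. Because later steps often re-send earlier outputs as part of their input, the total can grow faster than the number of steps.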
Cost Drivers in Chains
Token Accumulation
// Cost grows with each step
const chainCosts = {
  step1: { input: 500, output: 200 },  // 700 tokens
  step2: { input: 800, output: 300 },  // 1,100 tokens (includes step1 output)
  step3: { input: 1200, output: 400 }, // 1,600 tokens (includes step1+2)
  total: 3400                          // Total tokens used
};
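To turn token counts like these into a dollar figure, price input and output tokens separately and sum over the steps. The sketch below is only illustrative: the prices are hypothetical placeholders, not any provider's real rates, and `stepCost` and `chainCost` are names introduced here.

```javascript
// Hypothetical prices in USD per 1,000 tokens; substitute your provider's rates.
const PRICE_PER_1K = { input: 0.003, output: 0.015 };

const steps = [
  { name: 'step1', input: 500, output: 200 },
  { name: 'step2', input: 800, output: 300 },
  { name: 'step3', input: 1200, output: 400 },
];

// Cost of one step: input and output tokens are priced separately.
function stepCost(step, price = PRICE_PER_1K) {
  return (step.input / 1000) * price.input + (step.output / 1000) * price.output;
}

// Cost of the whole chain is the sum over its steps.
function chainCost(chainSteps, price = PRICE_PER_1K) {
  return chainSteps.reduce((sum, step) => sum + stepCost(step, price), 0);
}

console.log(chainCost(steps).toFixed(4)); // prints "0.0210"
```

With these placeholder prices, the 900 output tokens cost more than the 2,500 input tokens, which is why trimming output length can matter as much as trimming context.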
The Context Accumulation Problem
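When every step receives the full history of earlier outputs, input size grows with each step. The sketch below compares that naive pattern with passing only the previous step's output. The token counts are made up for illustration, and both function names are hypothetical.

```javascript
// Illustrative numbers: each step has its own prompt and produces some output.
const steps = [
  { prompt: 500, output: 200 },
  { prompt: 500, output: 300 },
  { prompt: 500, output: 400 },
  { prompt: 500, output: 300 },
];

// Naive chaining: every step re-sends all earlier outputs as input.
function inputTokensFullHistory(chain) {
  let history = 0;
  let totalInput = 0;
  for (const step of chain) {
    totalInput += step.prompt + history;
    history += step.output;
  }
  return totalInput;
}

// Selective chaining: each step only receives the previous step's output.
function inputTokensPreviousOnly(chain) {
  let previous = 0;
  let totalInput = 0;
  for (const step of chain) {
    totalInput += step.prompt + previous;
    previous = step.output;
  }
  return totalInput;
}

console.log(inputTokensFullHistory(steps));  // 3600
console.log(inputTokensPreviousOnly(steps)); // 2900
```

The gap widens as the chain gets longer, because the full-history total grows roughly with the square of the number of steps.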
Optimization Strategies
1. Selective Context Passing
Only pass what each step needs:
async function optimizedChain(document) {
  // Step 1: Full document needed
  const summary = await summarize(document);

  // Step 2: Only needs document, not summary
  const entities = await extractEntities(document);

  // Step 3: Only needs summary, not full document
  const sentiment = await analyzeSentiment(summary);

  // Step 4: Needs summary + entities + sentiment, NOT full document
  const report = await generateReport({
    summary,   // 1000 tokens
    entities,  // 500 tokens
    sentiment  // 300 tokens
    // Total: 1800 tokens instead of 6800
  });

  return report;
}
2. Model Selection by Task
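One way to apply this is a routing table that maps each task type to the cheapest model tier that handles it well. Everything named below is a placeholder for illustration: the tier names, the task types, and the prices are assumptions, not real model identifiers or rates.

```javascript
// Placeholder tiers and prices (USD per 1,000 tokens); replace with your provider's models.
const MODEL_TIERS = {
  small: { model: 'small-model', inputPrice: 0.0005, outputPrice: 0.0015 },
  large: { model: 'large-model', inputPrice: 0.01, outputPrice: 0.03 },
};

// Hypothetical mapping from task type to tier: simpler tasks go to the cheaper tier.
const TASK_TIER = new Map([
  ['classification', 'small'],
  ['extraction', 'small'],
  ['summarization', 'small'],
  ['reasoning', 'large'],
  ['generation', 'large'],
]);

// Pick a model for a task. Unknown task types fall back to the large tier,
// trading cost for quality when we cannot judge the difficulty.
function selectModel(taskType) {
  const tier = TASK_TIER.get(taskType) ?? 'large';
  return MODEL_TIERS[tier];
}

console.log(selectModel('classification').model); // small-model
console.log(selectModel('reasoning').model);      // large-model
```

Which tier is good enough for a given task is an empirical question, so it is worth checking output quality on sample inputs before moving a step to a cheaper model.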
3. Caching Strategies
import { createHash } from 'node:crypto';

// Assumed helper (the lesson does not define it): hash the JSON form of the input.
// JSON.stringify is sensitive to key order, so build input objects consistently.
function hashInput(input) {
  return createHash('sha256').update(JSON.stringify(input)).digest('hex');
}

class ChainCache {
  constructor(ttlSeconds = 3600) {
    this.cache = new Map();
    this.ttl = ttlSeconds * 1000; // stored in milliseconds
  }

  getCacheKey(step, input) {
    // Create deterministic cache key
    return `${step}:${hashInput(input)}`;
  }

  async getOrCompute(step, input, computeFn) {
    const key = this.getCacheKey(step, input);
    const cached = this.cache.get(key);

    // Serve from cache only while the entry is younger than the TTL
    if (cached && Date.now() - cached.timestamp < this.ttl) {
      console.log(`Cache hit for ${step}`);
      return cached.value;
    }

    const result = await computeFn(input);
    this.cache.set(key, {
      value: result,
      timestamp: Date.now()
    });
    return result;
  }
}

// Usage in chain
const cache = new ChainCache(3600);

async function cachedChain(input) {
  const step1Result = await cache.getOrCompute(
    'classification',
    input,
    () => classifyContent(input)
  );

  // Reuse the analysis when the same input was seen before with the same classification
  const step2Result = await cache.getOrCompute(
    'analysis',
    { type: step1Result.type, input },
    () => analyzeContent(input, step1Result)
  );

  return step2Result;
}
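A cache key is only deterministic if the serialization behind it is. Plain `JSON.stringify` produces different strings for `{ a: 1, b: 2 }` and `{ b: 2, a: 1 }`, which would cause avoidable cache misses. The sketch below shows one possible order-insensitive serializer for JSON-like values; `stableStringify` is a hypothetical helper, not part of the lesson's code.

```javascript
// Hypothetical helper: serialize a JSON-like value with object keys sorted,
// so that objects with the same content always yield the same string.
function stableStringify(value) {
  if (Array.isArray(value)) {
    // Array order is meaningful, so it is preserved.
    return `[${value.map(stableStringify).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const entries = Object.keys(value)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(value[key])}`);
    return `{${entries.join(',')}}`;
  }
  // Primitives use standard JSON encoding; undefined is treated as null.
  return JSON.stringify(value) ?? 'null';
}

console.log(stableStringify({ b: 2, a: 1 })); // {"a":1,"b":2}
console.log(stableStringify({ a: 1, b: 2 })); // {"a":1,"b":2}
```

The resulting string can be hashed or used directly as the input part of a cache key.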
4. Prompt Compression
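Compression means saying the same thing in fewer tokens. The sketch below strips a few filler phrases and collapses whitespace, then compares rough token estimates. It rests on two assumptions: the rewrite list is a small hypothetical example, and the 4-characters-per-token figure is only a rule of thumb for English text, so real counts depend on the model's tokenizer.

```javascript
// Rough heuristic only: about 4 characters per token for English text.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Hypothetical rewrites: filler phrases that add tokens without changing the instruction.
const REWRITES = [
  [/\bI would like you to\s+/gi, ''],
  [/\bplease\s+/gi, ''],
  [/\bin order to\b/gi, 'to'],
];

function compressPrompt(prompt) {
  let result = prompt;
  for (const [pattern, replacement] of REWRITES) {
    result = result.replace(pattern, replacement);
  }
  // Collapse runs of whitespace. Note this also flattens line breaks,
  // so skip it for prompts whose layout carries meaning.
  return result.replace(/\s+/g, ' ').trim();
}

const verbose =
  'I would like you to please   summarize the following text in order to   help me understand it.';
const compact = compressPrompt(verbose);

console.log(compact); // summarize the following text to help me understand it.
console.log(estimateTokens(verbose), '->', estimateTokens(compact));
```

Mechanical rewrites like these are the safe end of compression. Anything that changes wording more aggressively should be checked against output quality, since clarity matters more than the tokens saved.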
Cost Monitoring
Tracking Token Usage
class CostTracker {
  // pricing: { [model]: { input, output } }, with prices per 1,000 tokens
  constructor(pricing) {
    this.pricing = pricing;
    this.usage = {
      totalInputTokens: 0,
      totalOutputTokens: 0,
      byStep: {},
      byModel: {}
    };
  }

  recordUsage(step, model, inputTokens, outputTokens) {
    // Track totals
    this.usage.totalInputTokens += inputTokens;
    this.usage.totalOutputTokens += outputTokens;

    // Track by step
    if (!this.usage.byStep[step]) {
      this.usage.byStep[step] = { input: 0, output: 0, calls: 0 };
    }
    this.usage.byStep[step].input += inputTokens;
    this.usage.byStep[step].output += outputTokens;
    this.usage.byStep[step].calls++;

    // Track by model
    if (!this.usage.byModel[model]) {
      this.usage.byModel[model] = { input: 0, output: 0 };
    }
    this.usage.byModel[model].input += inputTokens;
    this.usage.byModel[model].output += outputTokens;
  }

  getCost() {
    let totalCost = 0;
    for (const [model, usage] of Object.entries(this.usage.byModel)) {
      const pricing = this.pricing[model];
      // Fail loudly rather than silently under-reporting cost
      if (!pricing) {
        throw new Error(`No pricing configured for model: ${model}`);
      }
      totalCost += (usage.input / 1000) * pricing.input;
      totalCost += (usage.output / 1000) * pricing.output;
    }
    return {
      totalCost,
      breakdown: this.usage
    };
  }
}
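Once usage is recorded per step, the breakdown can show where optimization pays off most. The sketch below ranks steps by total tokens, assuming a `byStep` object shaped like the one `CostTracker` builds. `rankStepsByTokens` is a hypothetical helper and the sample numbers are invented.

```javascript
// Rank steps by total token usage, highest first.
// Expects entries shaped like { input, output, calls }.
function rankStepsByTokens(byStep) {
  return Object.entries(byStep)
    .map(([step, usage]) => {
      const totalTokens = usage.input + usage.output;
      return {
        step,
        totalTokens,
        // Average per call separates "expensive step" from "frequently called step".
        avgTokensPerCall: usage.calls > 0 ? totalTokens / usage.calls : 0,
      };
    })
    .sort((a, b) => b.totalTokens - a.totalTokens);
}

// Invented sample data for illustration.
const byStep = {
  classify: { input: 1200, output: 100, calls: 4 },
  analyze: { input: 6000, output: 2000, calls: 4 },
  report: { input: 1800, output: 900, calls: 1 },
};

console.log(rankStepsByTokens(byStep).map((s) => s.step));
// [ 'analyze', 'report', 'classify' ]
```

Token totals are only a proxy for cost when steps use different models, so combine this ranking with the per-model pricing before deciding what to optimize.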
Setting Budgets and Alerts
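class BudgetManager {
  constructor(dailyBudget, alertThreshold = 0.8) {
    this.dailyBudget = dailyBudget;
    this.alertThreshold = alertThreshold; // fraction of the budget, e.g. 0.8 = 80%
    this.dailySpend = 0;
    this.lastReset = new Date().toDateString();
  }

  // Call before each request; throws if the request would exceed the budget
  checkBudget(estimatedCost) {
    this.resetIfNewDay();

    if (this.dailySpend + estimatedCost > this.dailyBudget) {
      throw new Error('Daily budget exceeded');
    }

    if (this.dailySpend / this.dailyBudget > this.alertThreshold) {
      const percent = Math.round(this.alertThreshold * 100);
      this.sendAlert(`More than ${percent}% of the daily budget consumed`);
    }
  }

  // Call after each request with the actual cost
  recordSpend(cost) {
    this.dailySpend += cost;
  }

  resetIfNewDay() {
    const today = new Date().toDateString();
    if (today !== this.lastReset) {
      this.dailySpend = 0;
      this.lastReset = today;
    }
  }

  // Placeholder alert channel (assumed; the lesson does not define one).
  // Replace with email, chat, or paging as needed.
  sendAlert(message) {
    console.warn(`[budget alert] ${message}`);
  }
}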
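A budget check is only useful if it runs before the call and the actual spend is recorded afterwards. The sketch below wires those two steps around a single chain step. It assumes a `budget` object exposing `checkBudget` and `recordSpend` as above; `estimateCallCost`, `runWithBudget`, and the `callModel` callback (resolving to `{ text, cost }`) are hypothetical names.

```javascript
// Estimate the worst-case cost of a call before making it.
// Prices are per 1,000 tokens; maxOutputTokens is the output cap you request.
function estimateCallCost(inputTokens, maxOutputTokens, pricing) {
  return (inputTokens / 1000) * pricing.input + (maxOutputTokens / 1000) * pricing.output;
}

// Run one chain step behind a budget check.
// `budget` is any object with checkBudget(cost) and recordSpend(cost).
// `callModel` is a hypothetical async function resolving to { text, cost }.
async function runWithBudget(budget, estimatedCost, callModel) {
  budget.checkBudget(estimatedCost); // throws if the call would exceed the budget
  const result = await callModel();
  budget.recordSpend(result.cost);   // record the actual cost, not the estimate
  return result;
}
```

Estimating with the output cap rather than the expected output length makes the check conservative: a call that passes should not push spend past the budget, at the price of occasionally rejecting calls that would have fit.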
Exercise: Optimize a Chain
Key Takeaways
- Token costs accumulate across chain steps
- Pass only necessary context between steps
- Use cheaper models for simpler tasks
- Implement caching for repeated operations
- Compress prompts without losing clarity
- Monitor costs and set budget alerts
- Balance cost optimization with quality requirements
Next, we'll explore latency and performance optimization.

