Module 13: Cost Comparison

Understanding the True Cost of Vector Databases

Introduction

Choosing a vector database isn't just about features; cost matters too. This module provides a framework for comparing costs across different options.

By the end of this module, you'll understand:

The components that drive costs
How to estimate costs for your workload
Hidden costs to watch for
Strategies to reduce costs

13.1 Cost Components

Direct Costs

1. Storage

Vector data (dimensions × count × 4 bytes for float32 vectors)
Index overhead (typically 1.5-3x vector size)
Metadata storage
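
The storage components above can be turned into a rough sizing function. This is a minimal sketch assuming float32 vectors. The `overheadMultiplier` default (total footprint as a multiple of raw vector size) and the `metadataBytesPerVector` parameter are illustrative assumptions, not figures from any particular database, so measure your own workload before relying on them.

```typescript
// Rough storage sizing: raw vectors, index overhead, and metadata.
// All defaults are assumptions for illustration.
interface StorageEstimate {
  rawGB: number        // float32 vector data only
  withIndexGB: number  // vectors plus index overhead
  metadataGB: number
  totalGB: number
}

function estimateStorageGB(
  vectorCount: number,
  dimensions: number,
  overheadMultiplier: number = 2.5,   // within the 1.5-3x range quoted above
  metadataBytesPerVector: number = 0  // hypothetical average metadata payload
): StorageEstimate {
  const BYTES_PER_FLOAT32 = 4
  const rawGB = (vectorCount * dimensions * BYTES_PER_FLOAT32) / 1e9
  const withIndexGB = rawGB * overheadMultiplier
  const metadataGB = (vectorCount * metadataBytesPerVector) / 1e9
  return { rawGB, withIndexGB, metadataGB, totalGB: withIndexGB + metadataGB }
}

// 1M vectors at 1536 dimensions with 200 bytes of metadata each
const sizing = estimateStorageGB(1_000_000, 1536, 2.5, 200)
console.log(sizing.rawGB.toFixed(2), sizing.totalGB.toFixed(2))
// ≈ 6.14 GB raw, ≈ 15.56 GB total
```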

2. Compute

Query processing
Index building
Embedding generation (often overlooked)

3. Network

Data transfer between services
Cross-region traffic

Indirect Costs

4. Operations

Team time for maintenance
Monitoring and alerting
Incident response

5. Development

Integration effort
Migration costs
Learning curve

13.2 Pricing Models

Pinecone

Serverless (2024 pricing):

Storage: $0.33/GB/month
Queries: $8 per 1M queries
Writes: $2 per 1M writes

Pod-Based:

s1: ~$70/month per pod
p1: ~$90/month per pod
p2: ~$120/month per pod

function estimatePineconeCost(
  vectorCount: number,
  dimensions: number,
  monthlyQueries: number,
  monthlyWrites: number
): number {
  // Serverless estimate
  const storageBytesPerVector = dimensions * 4 * 2.5  // With overhead
  const storageGB = (vectorCount * storageBytesPerVector) / 1e9
  const storageCost = storageGB * 0.33

  const queryCost = (monthlyQueries / 1_000_000) * 8
  const writeCost = (monthlyWrites / 1_000_000) * 2

  return storageCost + queryCost + writeCost
}

// Example: 1M vectors, 1536 dims, 10M queries, 100K writes
console.log(estimatePineconeCost(1_000_000, 1536, 10_000_000, 100_000))
// ~$85/month (about $5 storage + $80 queries + $0.20 writes)

Qdrant Cloud

Starter:

Free tier with limits
~$25/month for basic cluster

Production:

Priced by cluster size and resources
~$100-$500/month for small production

Enterprise:

Custom pricing

Self-Hosted (AWS/GCP)

function estimateSelfHostedCost(
  memoryGB: number,
  storageGB: number,
  region: string = 'us-east-1'
): number {
  // EC2 instance estimate (memory-optimized); prices are approximate
  // us-east-1 figures and are not adjusted for `region`
  let instanceCost = 0
  if (memoryGB <= 16) instanceCost = 70    // r5.large
  else if (memoryGB <= 32) instanceCost = 140   // r5.xlarge
  else if (memoryGB <= 64) instanceCost = 280   // r5.2xlarge
  else if (memoryGB <= 128) instanceCost = 560  // r5.4xlarge
  else instanceCost = 1120  // r5.8xlarge

  // EBS storage
  const storageCost = storageGB * 0.08  // gp3, ~$0.08/GB-month in us-east-1

  // Assume 2 instances for HA
  return (instanceCost * 2) + storageCost
}

// 1M vectors needing ~20GB memory, so provision the 32GB tier
console.log(estimateSelfHostedCost(32, 100))
// ~$290/month (2x r5.xlarge + storage)

pgvector on Managed Postgres

Supabase:

Free: 500MB
Pro: $25/month + usage
Team: $599/month

Neon:

Free: 0.5GB
Pro: Pay per usage (~$0.09/GB storage)
Scale: Custom

AWS RDS:

db.r5.large: ~$120/month
db.r5.xlarge: ~$240/month
Plus storage costs

13.3 Total Cost of Ownership

Beyond Monthly Bills

Total Cost = Direct Costs + Operational Costs + Opportunity Costs

Direct Costs:
- Cloud bills
- API costs (embeddings)

Operational Costs:
- Engineering time for maintenance
- Monitoring tools
- Backup solutions

Opportunity Costs:
- Time not spent on features
- Migration effort if you switch

Embedding Costs (Often Forgotten)

function estimateEmbeddingCost(
  documentsToEmbed: number,
  avgTokensPerDoc: number,
  monthlyQueries: number,
  avgTokensPerQuery: number
): { initial: number; monthly: number; firstYear: number } {
  // OpenAI text-embedding-3-small pricing (as of 2024)
  const costPerMTokens = 0.02  // $0.02 per 1M tokens

  // Initial embedding
  const embedTokens = documentsToEmbed * avgTokensPerDoc
  const embedCost = (embedTokens / 1_000_000) * costPerMTokens

  // Query embeddings (monthly recurring)
  const queryTokens = monthlyQueries * avgTokensPerQuery
  const queryCost = (queryTokens / 1_000_000) * costPerMTokens

  return {
    initial: embedCost,
    monthly: queryCost,
    firstYear: embedCost + (queryCost * 12)
  }
}

const costs = estimateEmbeddingCost(
  1_000_000,  // 1M documents
  500,        // 500 tokens avg
  10_000_000, // 10M queries/month
  50          // 50 tokens per query
)
console.log('Initial:', costs.initial)    // ~$10
console.log('Monthly:', costs.monthly)    // ~$10
console.log('First Year:', costs.firstYear) // ~$130

13.4 Cost Comparison by Scale

Small Scale (< 100K vectors)

Solution	Monthly Cost	Notes
Chroma (local)	$0	Self-hosted, limited scale
Supabase Free	$0	500MB limit
Pinecone Serverless	~$10-20	Pay per use
Neon Free	$0	Limited compute

Recommendation: Start with free tiers, move to paid as you grow.

Medium Scale (100K - 1M vectors)

Solution	Monthly Cost	Notes
Pinecone Serverless	$50-150	Depends on queries
Supabase Pro	$25 + usage	Good if using Supabase already
Qdrant Cloud	$50-100	Good performance
Self-hosted (small)	$100-200	Operational overhead

Recommendation: Managed services often win on total cost at this scale once operational time is counted.

Large Scale (1M - 10M vectors)

Solution	Monthly Cost	Notes
Pinecone	$200-1000	Depends on pods/usage
Qdrant Cloud	$200-500	Production tier
Self-hosted	$300-800	Requires expertise
pgvector (RDS)	$300-600	If already on Postgres

Recommendation: Compare based on your query patterns and team expertise.

Very Large Scale (> 10M vectors)

Solution	Monthly Cost	Notes
Pinecone Enterprise	$1000+	Custom pricing
Qdrant Enterprise	Custom	On-premise option
Self-hosted cluster	$1000-5000+	Significant ops work

Recommendation: Talk to vendors for custom pricing; self-hosted may be cheapest but requires expertise.

13.5 Cost Optimization Strategies

1. Reduce Dimensions

// 1536 dimensions → 512 dimensions
// Storage: ~67% reduction
// Query speed: Faster
// Quality: Slight decrease (test to verify)

const embedding = await openai.embeddings.create({
  model: 'text-embedding-3-small',
  input: text,
  dimensions: 512  // Reduced from 1536
})
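
When vectors are already stored at full size, or the embedding API offers no `dimensions` option, the same reduction can be approximated client-side by truncating and re-normalizing. This only makes sense for models trained so that the leading dimensions carry most of the information (OpenAI documents this for the text-embedding-3 family). For other models, treat it as an experiment and measure retrieval quality. `truncateEmbedding` is a hypothetical helper name.

```typescript
// Keep the first `targetDims` values and rescale to unit length so that
// cosine and dot-product similarity remain comparable across vectors.
function truncateEmbedding(embedding: number[], targetDims: number): number[] {
  if (!Number.isInteger(targetDims) || targetDims < 1 || targetDims > embedding.length) {
    throw new RangeError(`targetDims must be an integer between 1 and ${embedding.length}`)
  }
  const truncated = embedding.slice(0, targetDims)
  const norm = Math.sqrt(truncated.reduce((sum, x) => sum + x * x, 0))
  if (norm === 0) return truncated  // all-zero prefix: nothing to normalize
  return truncated.map(x => x / norm)
}

// Toy 3-dimensional vector shortened to 2 dimensions
const shortened = truncateEmbedding([3, 4, 12], 2)
console.log(shortened)  // approximately [0.6, 0.8]
```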

2. Use Quantization

Some databases support compressed vectors:

-- pgvector (0.7.0+): halfvec type uses 2 bytes per dimension instead of 4
CREATE TABLE documents (
  id SERIAL PRIMARY KEY,
  embedding halfvec(1536)  -- 50% storage savings
);
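
To make the trade-off concrete, here is a sketch of a more aggressive scheme: symmetric int8 scalar quantization, which stores 1 byte per dimension. This is a generic illustration, not how `halfvec` works (that type stores 16-bit floats), and the helper names are hypothetical. Databases that offer quantization implement it internally, often with more careful calibration.

```typescript
interface QuantizedVector {
  values: Int8Array  // 1 byte per dimension instead of 4
  scale: number      // multiply by this to approximately recover the original
}

// Map each value into [-127, 127] using the vector's largest magnitude.
function quantizeInt8(vector: number[]): QuantizedVector {
  const maxAbs = vector.reduce((m, x) => Math.max(m, Math.abs(x)), 0)
  const scale = maxAbs === 0 ? 1 : maxAbs / 127
  const values = Int8Array.from(vector, x => Math.round(x / scale))
  return { values, scale }
}

// Approximate reconstruction; the per-value error is at most scale / 2.
function dequantizeInt8(q: QuantizedVector): number[] {
  return Array.from(q.values, v => v * q.scale)
}

const quantized = quantizeInt8([1, -0.25, 0])
const restored = dequantizeInt8(quantized)
console.log(quantized.values, restored)
```

The reconstruction error is what costs you recall, which is why quantized indexes are usually paired with a rescoring step or validated against a full-precision baseline.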

3. Prune Old Data

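// Delete vectors older than a threshold.
// `index` stands in for your vector DB client: the delete-by-filter call and
// the filter syntax vary by database, and some tiers don't support it at all.
async function pruneOldVectors(daysToKeep: number) {
  const cutoff = new Date()
  cutoff.setDate(cutoff.getDate() - daysToKeep)

  await index.delete({
    filter: {
      // Assumes createdAt is stored as a numeric epoch-millisecond timestamp;
      // range operators such as $lt often accept numbers only, not ISO strings
      createdAt: { $lt: cutoff.getTime() }
    }
  })
}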

4. Cache Frequent Queries

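import { LRUCache } from 'lru-cache'

// Cache up to 1000 recent queries for one hour (ttl is in milliseconds)
// Savings track your hit rate: every cache hit is a query you don't pay for
const cache = new LRUCache({ max: 1000, ttl: 3600000 })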

async function search(query: string) {
  const cached = cache.get(query)
  if (cached) return cached  // Free!

  const result = await vectorSearch(query)  // Costs money
  cache.set(query, result)
  return result
}
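
How much a cache saves depends on the hit rate, so it is worth estimating before building one. The sketch below assumes simple per-query pricing. The function names are hypothetical, and the hit rate is something you would measure from real traffic rather than guess.

```typescript
// Query bill after caching: only cache misses reach the vector database.
function queryCostWithCache(
  monthlyQueries: number,
  cacheHitRate: number,           // fraction served from cache, 0 to 1
  pricePerMillionQueries: number  // assumed per-query pricing, USD per 1M
): number {
  if (cacheHitRate < 0 || cacheHitRate > 1) {
    throw new RangeError('cacheHitRate must be between 0 and 1')
  }
  const billableQueries = monthlyQueries * (1 - cacheHitRate)
  return (billableQueries / 1_000_000) * pricePerMillionQueries
}

// Normalizing cache keys lets trivially different queries share an entry.
function normalizeCacheKey(query: string): string {
  return query.trim().toLowerCase().replace(/\s+/g, ' ')
}

// 10M queries/month at $8 per 1M, with and without a 40% hit rate
console.log(queryCostWithCache(10_000_000, 0, 8))    // 80
console.log(queryCostWithCache(10_000_000, 0.4, 8))  // approximately 48
console.log(normalizeCacheKey('  Vector   DB pricing '))  // "vector db pricing"
```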

5. Use Appropriate Tier

// Don't over-provision
// Start small, scale as needed

// Pinecone: Start with serverless
// Then move to pods if cost-effective

// Self-hosted: Start with smaller instances
// Scale up based on actual usage
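
Whether moving from serverless to fixed-price capacity is "cost-effective" comes down to a break-even query volume. The sketch below uses this module's approximate figures as example inputs. It assumes simple per-query pricing and that the fixed-price option can actually serve the workload, both of which you would need to verify. `breakEvenMonthlyQueries` is a hypothetical helper.

```typescript
// Monthly query volume above which a fixed-price option becomes cheaper
// than pay-per-query pricing.
function breakEvenMonthlyQueries(
  fixedMonthlyCost: number,       // e.g. one pod or instance, USD
  serverlessBaseCost: number,     // serverless storage + writes, USD
  pricePerMillionQueries: number  // USD per 1M queries
): number {
  const headroom = fixedMonthlyCost - serverlessBaseCost
  if (headroom <= 0) return 0  // fixed pricing is cheaper at any volume
  return (headroom / pricePerMillionQueries) * 1_000_000
}

// One ~$90/month pod vs serverless with ~$5 of storage and writes
const breakEven = breakEvenMonthlyQueries(90, 5, 8)
console.log(breakEven)  // 10625000
```

Below roughly 10.6M queries a month, these example inputs favor serverless; above it, the fixed-price pod is cheaper if a single pod meets your latency and capacity needs.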

13.6 Making the Decision

Cost Decision Matrix

interface CostProfile {
  vectorCount: number
  monthlyQueries: number
  monthlyWrites: number
  latencyRequired: 'low' | 'medium' | 'high'
  teamExpertise: 'low' | 'medium' | 'high'
  existingInfra: 'postgres' | 'cloud' | 'none'
}

function recommendSolution(profile: CostProfile): string {
  const { vectorCount, teamExpertise, existingInfra } = profile

  if (vectorCount < 100_000) {
    if (existingInfra === 'postgres') return 'pgvector'
    return 'Chroma or Pinecone Serverless'
  }

  if (vectorCount < 1_000_000) {
    if (teamExpertise === 'low') return 'Pinecone'
    if (existingInfra === 'postgres') return 'pgvector'
    return 'Qdrant Cloud'
  }

  if (vectorCount < 10_000_000) {
    if (teamExpertise === 'high') return 'Self-hosted Qdrant'
    return 'Pinecone or Qdrant Cloud'
  }

  return 'Enterprise tier or custom solution'
}

Questions to Ask Vendors

What's included in the quoted price?
How are overages billed?
Is there a minimum commitment?
What are the data transfer costs?
What support is included?
Can I export my data?

Key Takeaways

Include all costs: Embeddings, ops time, not just database bills
Start small: Free tiers and serverless for early stages
Optimize before scaling: Caching and dimension reduction are cheap
Match to expertise: Self-hosted is only cheaper if you have the skills
Re-evaluate regularly: Pricing changes, your needs change

Exercise: Cost Analysis

For your current or planned project:

Estimate your data scale (vectors, dimensions)
Estimate query volume (QPS, monthly total)
Calculate costs for 3 different solutions
Include embedding API costs
Factor in team time for operations
Compare total cost of ownership

Create a spreadsheet comparing options over 12 months.
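
If you'd rather generate the spreadsheet than fill it in by hand, a small script can produce CSV to paste in. This is a sketch: the `CostOption` shape, the 30-day month, and all example costs are hypothetical placeholders for your own estimates.

```typescript
interface CostOption {
  name: string
  setupCost: number    // one-time cost, USD
  monthlyCost: number  // recurring cost including ops time, USD
}

// Convert an average queries-per-second figure to a monthly total.
function qpsToMonthlyQueries(avgQps: number, daysPerMonth: number = 30): number {
  return avgQps * 60 * 60 * 24 * daysPerMonth
}

// One row per month, one column per option, values are cumulative spend.
function cumulativeCostCsv(options: CostOption[], months: number = 12): string {
  const header = ['Month', ...options.map(o => o.name)].join(',')
  const rows: string[] = []
  for (let m = 1; m <= months; m++) {
    const cells = options.map(o => (o.setupCost + o.monthlyCost * m).toFixed(2))
    rows.push([String(m), ...cells].join(','))
  }
  return [header, ...rows].join('\n')
}

console.log(qpsToMonthlyQueries(4))  // 10368000
console.log(cumulativeCostCsv([
  { name: 'Managed', setupCost: 0, monthlyCost: 100 },
  { name: 'Self-hosted', setupCost: 500, monthlyCost: 50 }
]))
```

In this made-up example the self-hosted option starts out more expensive and overtakes the managed one after month 10, which is the kind of crossover a 12-month view is meant to reveal.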

Next up: Module 14 - Choosing the Right Database