Lesson 6.2: pgvector vs Dedicated Vector Databases
Comparison Matrix
| Feature | pgvector | Pinecone | Weaviate | Qdrant |
|---|---|---|---|---|
| SQL Queries | ✅ Native | ❌ Limited | ❌ Limited | ⚠️ Basic |
| ACID Transactions | ✅ Yes | ❌ No | ❌ No | ❌ No |
| Joins | ✅ Full SQL | ❌ No | ❌ No | ❌ No |
| Metadata Filters | ✅ Full WHERE | ⚠️ Basic | ⚠️ Basic | ⚠️ Good |
| Scaling | ⚠️ Vertical | ✅ Horizontal | ✅ Horizontal | ✅ Horizontal |
| Cost | 💰 Self-hosted | 💰💰💰 Per query | 💰💰 Managed | 💰 Self-hosted |
| Managed Option | ⚠️ Few | ✅ Yes | ✅ Yes | ⚠️ Cloud |
When to Use pgvector
✅ Good for:
- Under 10M vectors
- Need complex SQL filters (permissions, categories, dates)
- Require transactions (consistency with other data)
- Already using PostgreSQL
- Cost-sensitive
- Need data colocated (vectors + metadata in same DB)
Example: RAG with permissions
-- This is hard in dedicated vector DBs
SELECT
d.content,
d.title,
e.embedding <=> query_embedding AS similarity
FROM documents d
JOIN embeddings e ON d.id = e.document_id
JOIN document_permissions p ON d.id = p.document_id
WHERE
p.user_id = 123 -- Permission check
AND d.department = 'engineering' -- Metadata filter
AND d.created_at > NOW() - INTERVAL '6 months' -- Recency
AND e.embedding <=> query_embedding < 0.7 -- Similarity threshold
ORDER BY similarity
LIMIT 10;
When to Use Dedicated Vector DB
✅ Good for:
- Over 100M vectors
- Pure vector search (minimal metadata filtering)
- Need horizontal sharding
- Managed service preferred
- Specialized features (hybrid search, re-ranking)
Example: Image similarity at scale
# Pinecone: 1 billion image embeddings
import pinecone
index = pinecone.Index('image-search')
# Query
results = index.query(
vector=query_embedding,
top_k=10,
filter={"category": "products"} # Basic filtering
)
Key Takeaways
- pgvector excels at SQL + vector search for under 10M vectors
- Dedicated vector DBs scale horizontally for 100M+ vectors
- pgvector strengths: Full SQL, ACID transactions, complex filters, joins
- Dedicated DB strengths: Horizontal scaling, managed options, specialized features
- Use pgvector when you need tight integration with SQL and complex metadata filtering
- Use dedicated DBs when you need massive scale and pure vector search
- Most RAG systems benefit from pgvector's SQL capabilities for permissions and filters

