Lesson 3.6: Vector Indexes (pgvector)

Understanding Vector Search Indexes

Traditional B-tree won't work for vector similarity:

-- This is meaningless (vectors aren't ordered)
CREATE INDEX embeddings_vector_btree ON embeddings(vector);  -- Wrong!

-- Need specialized indexes for similarity search

IVFFlat Index (Inverted File Index)

How it works:

Cluster vectors into N lists (like K-means)
At query time, search only nearest clusters
Tradeoff: Speed for accuracy

-- Create IVFFlat index
CREATE INDEX embeddings_vector_idx ON embeddings
USING ivfflat (vector vector_cosine_ops)
WITH (lists = 100);

-- Parameters:
-- lists = number of clusters (start with sqrt(row_count))
-- vector_cosine_ops = use cosine distance

Setting probes:

-- How many lists to search (default: 1)
SET ivfflat.probes = 10;

-- More probes = better recall, slower queries
-- 1 probe: 60-70% recall, very fast
-- 10 probes: 90-95% recall, slower
-- 100 probes: 99% recall, defeats purpose of index

Query performance:

-- Vector similarity search
SELECT id, content, vector <=> query_vector AS distance
FROM embeddings
ORDER BY vector <=> query_vector
LIMIT 10;

-- Performance:
-- No index: 5-10 seconds (scan all rows)
-- IVFFlat (probes=1): 10-50ms
-- IVFFlat (probes=10): 50-200ms

HNSW Index (Hierarchical Navigable Small World)

Better accuracy than IVFFlat, available in pgvector 0.5+:

-- Create HNSW index (better than IVFFlat)
CREATE INDEX embeddings_hnsw_idx ON embeddings
USING hnsw (vector vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

-- Parameters:
-- m = connections per layer (higher = better recall, larger index)
-- ef_construction = build-time accuracy (higher = better index, slower build)

Query-time parameters:

-- Search accuracy
SET hnsw.ef_search = 40;

-- Higher ef_search = better recall, slower queries

Hybrid Vector + SQL Queries

-- Combine vector search with SQL filters
SELECT
    d.id,
    d.content,
    d.title,
    e.vector <=> query_vector AS similarity
FROM documents d
JOIN embeddings e ON d.id = e.document_id
WHERE
    d.user_id = 123  -- Permission filter (uses B-tree index)
    AND d.category = 'support'  -- Metadata filter (uses B-tree index)
    AND e.vector <=> query_vector < 0.8  -- Vector filter (uses HNSW/IVFFlat)
ORDER BY similarity
LIMIT 10;

-- Performance optimization:
-- 1. Filter documents first (SQL indexes)
-- 2. Then vector search on subset
-- Much faster than vector search on all documents

Vector Index Maintenance

-- Rebuild vector index after bulk inserts
REINDEX INDEX embeddings_vector_idx;

-- Monitor index usage
SELECT
    schemaname,
    tablename,
    indexname,
    idx_scan,  -- Number of index scans
    idx_tup_read  -- Tuples read
FROM pg_stat_user_indexes
WHERE indexname LIKE '%vector%';

Key Takeaways

Vector similarity requires specialized indexes (not B-tree)
IVFFlat uses clustering for approximate nearest neighbor search
HNSW provides better accuracy than IVFFlat with similar performance
Tradeoff between recall (accuracy) and query speed via parameters
Hybrid queries combine B-tree indexes (metadata) with vector indexes
Filter with SQL first, then do vector search on smaller subset
Rebuild indexes after bulk inserts for optimal performance

IVFFlat Index (Inverted File Index)

How it works:

Cluster vectors into N lists (like K-means)

At query time, search only nearest clusters

Tradeoff: Speed for accuracy

-- Create IVFFlat index CREATE INDEX embeddings_vector_idx ON embeddings USING ivfflat (vector vector_cosine_ops) WITH (lists = 100); -- Parameters: -- lists = number of clusters (start with sqrt(row_count)) -- vector_cosine_ops = use cosine distance

Setting probes:

-- How many lists to search (default: 1) SET ivfflat.probes = 10; -- More probes = better recall, slower queries -- 1 probe: 60-70% recall, very fast -- 10 probes: 90-95% recall, slower -- 100 probes: 99% recall, defeats purpose of index

Query performance:

-- Vector similarity search SELECT id, content, vector <=> query_vector AS distance FROM embeddings ORDER BY vector <=> query_vector LIMIT 10; -- Performance: -- No index: 5-10 seconds (scan all rows) -- IVFFlat (probes=1): 10-50ms -- IVFFlat (probes=10): 50-200ms

HNSW Index (Hierarchical Navigable Small World)

Better accuracy than IVFFlat, available in pgvector 0.5+:

-- Create HNSW index (better than IVFFlat) CREATE INDEX embeddings_hnsw_idx ON embeddings USING hnsw (vector vector_cosine_ops) WITH (m = 16, ef_construction = 64); -- Parameters: -- m = connections per layer (higher = better recall, larger index) -- ef_construction = build-time accuracy (higher = better index, slower build)

Query-time parameters:

-- Search accuracy SET hnsw.ef_search = 40; -- Higher ef_search = better recall, slower queries

Hybrid Vector + SQL Queries

-- Combine vector search with SQL filters SELECT d.id, d.content, d.title, e.vector <=> query_vector AS similarity FROM documents d JOIN embeddings e ON d.id = e.document_id WHERE d.user_id = 123 -- Permission filter (uses B-tree index) AND d.category = 'support' -- Metadata filter (uses B-tree index) AND e.vector <=> query_vector < 0.8 -- Vector filter (uses HNSW/IVFFlat) ORDER BY similarity LIMIT 10; -- Performance optimization: -- 1. Filter documents first (SQL indexes) -- 2. Then vector search on subset -- Much faster than vector search on all documents

-- Rebuild vector index after bulk inserts REINDEX INDEX embeddings_vector_idx; -- Monitor index usage SELECT schemaname, tablename, indexname, idx_scan, -- Number of index scans idx_tup_read -- Tuples read FROM pg_stat_user_indexes WHERE indexname LIKE '%vector%';

Key Takeaways

Vector similarity requires specialized indexes (not B-tree)

IVFFlat uses clustering for approximate nearest neighbor search

HNSW provides better accuracy than IVFFlat with similar performance

Tradeoff between recall (accuracy) and query speed via parameters

Hybrid queries combine B-tree indexes (metadata) with vector indexes

Filter with SQL first, then do vector search on smaller subset

Rebuild indexes after bulk inserts for optimal performance