Module 6: Setting Up pgvector (PostgreSQL)
Vector Search with Your Familiar Database
Introduction
pgvector brings vector similarity search to PostgreSQL. If you're already using Postgres, this is often the simplest way to add vector capabilities without introducing a new database.
By the end of this module, you'll have:
- pgvector installed and configured
- Working vector tables and indexes
- Understanding of pgvector's SQL interface
6.1 Why pgvector?
Advantages
- Use existing PostgreSQL: No new database to manage
- ACID transactions: Vectors and metadata in sync
- Familiar SQL: Query vectors with SQL you know
- Join capabilities: Combine vector search with relational data
- Mature ecosystem: PostgreSQL tooling, backups, monitoring
Limitations
- 2,000-dimension index limit: Columns can store more, but HNSW and IVFFlat indexes cap out at 2,000 dimensions, so the largest embedding models need dimensionality reduction
- Performance ceiling: Not optimized purely for vectors
- Scaling complexity: PostgreSQL scaling is harder than managed services
- Memory management: Large indexes need careful tuning
When to Choose pgvector
- Already using PostgreSQL
- Need ACID transactions
- Moderate scale (< 10M vectors)
- Want to avoid additional infrastructure
- Need complex joins with relational data
6.2 Installation
Option 1: Managed PostgreSQL (Supabase, Neon)
Most managed Postgres providers include pgvector:
Supabase:
-- Extension ships with Supabase; enable it in your database
CREATE EXTENSION IF NOT EXISTS vector;
Neon:
-- Available by default
CREATE EXTENSION IF NOT EXISTS vector;
Option 2: Docker (Local Development)
# Using the official pgvector image
docker run --name pgvector \
-e POSTGRES_PASSWORD=postgres \
-p 5432:5432 \
-d pgvector/pgvector:pg16
Option 3: Install on Existing PostgreSQL
# Ubuntu/Debian
sudo apt install postgresql-16-pgvector
# macOS with Homebrew
brew install pgvector
# From source
cd /tmp
git clone --branch v0.6.0 https://github.com/pgvector/pgvector.git
cd pgvector
make
sudo make install
Then enable in your database:
CREATE EXTENSION vector;
6.3 Creating Vector Tables
Basic Table Structure
-- Create a table with a vector column
CREATE TABLE documents (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
content TEXT NOT NULL,
embedding vector(1536), -- OpenAI embedding dimension
category TEXT,
created_at TIMESTAMP DEFAULT NOW()
);
Alternative: Separate Embeddings Table
For flexibility, you might separate embeddings:
-- Documents table
CREATE TABLE documents (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
content TEXT NOT NULL,
category TEXT,
created_at TIMESTAMP DEFAULT NOW()
);
-- Embeddings table
CREATE TABLE document_embeddings (
document_id INTEGER REFERENCES documents(id) ON DELETE CASCADE,
embedding vector(1536),
model TEXT NOT NULL, -- Track which model was used
created_at TIMESTAMP DEFAULT NOW(),
PRIMARY KEY (document_id, model)
);
This allows re-embedding with different models without touching the source data.
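With this layout, refreshing a document's embedding for a given model is a single upsert keyed on `(document_id, model)`. A sketch of building that statement from TypeScript (the helper name is mine and it assumes the `document_embeddings` table above):

```typescript
// Build a parameterized upsert for the document_embeddings table,
// keyed on (document_id, model) so re-embedding replaces the old row.
function buildEmbeddingUpsert(
  documentId: number,
  model: string,
  embedding: number[]
): { sql: string; params: unknown[] } {
  const sql = `
    INSERT INTO document_embeddings (document_id, embedding, model)
    VALUES ($1, $2::vector, $3)
    ON CONFLICT (document_id, model)
    DO UPDATE SET embedding = EXCLUDED.embedding, created_at = NOW()`
  // pgvector accepts a bracketed number list as the vector literal
  const params = [documentId, `[${embedding.join(',')}]`, model]
  return { sql, params }
}
```

Passing the result to `pool.query(sql, params)` keeps the statement parameterized rather than string-interpolated.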
6.4 Creating Indexes
HNSW Index (Recommended)
HNSW provides fast approximate nearest neighbor search:
-- Create HNSW index for cosine distance
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops);
-- With custom parameters
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
Parameters:
- m: Max connections per node (default 16; higher = more accurate, more memory)
- ef_construction: Build quality (default 64; higher = better index, slower build)
IVFFlat Index
Good for larger datasets where you want to control memory:
-- Create IVFFlat index
CREATE INDEX ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100); -- Number of clusters
Parameter:
- lists: Number of clusters. A good starting point is rows / 1000 for up to 1M rows and sqrt(rows) beyond that
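pgvector's docs suggest starting with rows / 1000 up to about 1M rows and sqrt(rows) beyond that; a tiny helper (the name is hypothetical) encoding that rule of thumb:

```typescript
// Starting point for IVFFlat's `lists` parameter, following pgvector's
// guidance: rows / 1000 up to ~1M rows, sqrt(rows) beyond that.
function suggestedLists(rowCount: number): number {
  if (rowCount <= 1_000_000) {
    return Math.max(1, Math.round(rowCount / 1000))
  }
  return Math.round(Math.sqrt(rowCount))
}
```

Treat the result as a starting value to benchmark, not a final setting.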
Distance Operators
| Operator | Ops Class | Description |
|---|---|---|
| <-> | vector_l2_ops | Euclidean distance |
| <=> | vector_cosine_ops | Cosine distance |
| <#> | vector_ip_ops | Negative inner product |
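These operators compute standard distance functions. A quick local sketch in TypeScript of what each returns (the function names are mine, for illustration only):

```typescript
// Local equivalents of pgvector's distance operators.
function l2(a: number[], b: number[]): number { // <->
  return Math.sqrt(a.reduce((s, x, i) => s + (x - b[i]) ** 2, 0))
}
function innerProduct(a: number[], b: number[]): number {
  return a.reduce((s, x, i) => s + x * b[i], 0)
}
function negInnerProduct(a: number[], b: number[]): number { // <#>
  return -innerProduct(a, b)
}
function cosineDistance(a: number[], b: number[]): number { // <=>
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0))
  return 1 - innerProduct(a, b) / (norm(a) * norm(b))
}
```

Note that `<=>` returns cosine *distance* (0 = identical direction, 1 = orthogonal), which is why the queries below compute `1 - (embedding <=> ...)` to get a similarity score.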
6.5 Basic Operations with SQL
Inserting Vectors
-- Insert a single document
INSERT INTO documents (title, content, embedding, category)
VALUES (
'Introduction to Vectors',
'Vector databases store numerical representations...',
'[0.1, -0.2, 0.3, ...]'::vector, -- 1536 values
'tutorial'
);
Querying for Similar Vectors
-- Find 10 most similar documents
SELECT id, title,
1 - (embedding <=> '[0.1, -0.2, 0.3, ...]'::vector) AS similarity
FROM documents
ORDER BY embedding <=> '[0.1, -0.2, 0.3, ...]'::vector
LIMIT 10;
Filtering with Metadata
-- Similar documents in a specific category
SELECT id, title,
1 - (embedding <=> $1) AS similarity
FROM documents
WHERE category = 'tutorial'
ORDER BY embedding <=> $1
LIMIT 10;
Using Index Parameters
For HNSW, control search quality:
-- Increase search accuracy (slower)
SET hnsw.ef_search = 100;
SELECT * FROM documents
ORDER BY embedding <=> query_vector
LIMIT 10;
-- Reset to default
RESET hnsw.ef_search;
For IVFFlat, control cluster probing:
-- Search more clusters (slower, more accurate)
SET ivfflat.probes = 10;
SELECT * FROM documents
ORDER BY embedding <=> query_vector
LIMIT 10;
6.6 Using pgvector with JavaScript/TypeScript
With Raw SQL (pg library)
import { Pool } from 'pg'
import OpenAI from 'openai'
import dotenv from 'dotenv'
dotenv.config()
const pool = new Pool({
connectionString: process.env.DATABASE_URL
})
const openai = new OpenAI()
async function createTable() {
await pool.query(`
CREATE TABLE IF NOT EXISTS documents (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
content TEXT NOT NULL,
embedding vector(1536),
category TEXT
)
`)
await pool.query(`
CREATE INDEX IF NOT EXISTS documents_embedding_idx
ON documents USING hnsw (embedding vector_cosine_ops)
`)
}
async function insertDocument(
title: string,
content: string,
category: string
) {
// Generate embedding
const response = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: `${title}\n\n${content}`
})
const embedding = response.data[0].embedding
// Insert
const result = await pool.query(
`INSERT INTO documents (title, content, embedding, category)
VALUES ($1, $2, $3, $4)
RETURNING id`,
[title, content, JSON.stringify(embedding), category]
)
return result.rows[0].id
}
async function searchDocuments(
query: string,
options: { limit?: number; category?: string } = {}
) {
const { limit = 10, category } = options
// Generate query embedding
const response = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: query
})
const embedding = response.data[0].embedding
// Build query
let sql = `
SELECT id, title, content, category,
1 - (embedding <=> $1::vector) AS similarity
FROM documents
`
const params: any[] = [JSON.stringify(embedding)]
if (category) {
sql += ` WHERE category = $2`
params.push(category)
}
sql += ` ORDER BY embedding <=> $1::vector LIMIT $${params.length + 1}`
params.push(limit)
const result = await pool.query(sql, params)
return result.rows
}
// Usage
async function main() {
await createTable()
await insertDocument(
'Vector Database Guide',
'Learn how to use vector databases for AI applications...',
'tutorial'
)
const results = await searchDocuments('How do I store embeddings?')
console.log(results)
}
main().catch(console.error)
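The trickiest part of searchDocuments is keeping the `$` placeholder numbers in sync with the params array as optional filters are added. Extracting that logic into a pure function makes it unit-testable without a database (this helper is a sketch of mine, not part of the example above):

```typescript
// Build the search SQL and parameter list, numbering placeholders
// to match the params array ($1 = embedding literal, then filters).
function buildSearchQuery(
  embeddingLiteral: string,
  options: { limit?: number; category?: string } = {}
): { sql: string; params: unknown[] } {
  const { limit = 10, category } = options
  const params: unknown[] = [embeddingLiteral]
  let sql = `SELECT id, title, content, category,
       1 - (embedding <=> $1::vector) AS similarity
  FROM documents`
  if (category) {
    params.push(category)
    sql += `\n  WHERE category = $${params.length}`
  }
  params.push(limit)
  sql += `\n  ORDER BY embedding <=> $1::vector LIMIT $${params.length}`
  return { sql, params }
}
```

With more filters (price ranges, date cutoffs) the same pattern extends naturally: push the value first, then reference `$${params.length}`.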
With Drizzle ORM
import { sql } from 'drizzle-orm'
import { drizzle } from 'drizzle-orm/node-postgres'
import { Pool } from 'pg'
// Note: You'll need a custom column type for vectors
// drizzle-orm doesn't have native vector support yet
const pool = new Pool({ connectionString: process.env.DATABASE_URL })
const db = drizzle(pool)
// Use raw SQL for vector operations
async function search(queryEmbedding: number[], limit = 10) {
const result = await db.execute(sql`
SELECT id, title, content,
1 - (embedding <=> ${JSON.stringify(queryEmbedding)}::vector) as similarity
FROM documents
ORDER BY embedding <=> ${JSON.stringify(queryEmbedding)}::vector
LIMIT ${limit}
`)
return result.rows
}
With Prisma
// schema.prisma
// Prisma has no native vector type; declare the column with
// Unsupported() and use raw queries for vector operations
model Document {
id Int @id @default(autoincrement())
title String
content String
category String?
createdAt DateTime @default(now())
embedding Unsupported("vector(1536)")?
}
import { PrismaClient } from '@prisma/client'
const prisma = new PrismaClient()
async function search(queryEmbedding: number[], limit = 10) {
const embeddingStr = `[${queryEmbedding.join(',')}]`
const results = await prisma.$queryRaw`
SELECT id, title, content, category,
1 - (embedding <=> ${embeddingStr}::vector) as similarity
FROM "Document"
ORDER BY embedding <=> ${embeddingStr}::vector
LIMIT ${limit}
`
return results
}
6.7 Complete Example: RAG with pgvector
import { Pool } from 'pg'
import OpenAI from 'openai'
const pool = new Pool({ connectionString: process.env.DATABASE_URL })
const openai = new OpenAI()
class RAGSystem {
async initialize() {
await pool.query(`
CREATE TABLE IF NOT EXISTS knowledge_base (
id SERIAL PRIMARY KEY,
source TEXT NOT NULL,
chunk_index INTEGER NOT NULL,
content TEXT NOT NULL,
embedding vector(1536),
UNIQUE(source, chunk_index)
);
CREATE INDEX IF NOT EXISTS kb_embedding_idx
ON knowledge_base USING hnsw (embedding vector_cosine_ops);
`)
}
async addDocument(source: string, content: string) {
// Chunk content
const chunks = this.chunkText(content, 500)
// Generate embeddings
const response = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: chunks
})
// Insert chunks
for (let i = 0; i < chunks.length; i++) {
await pool.query(
`INSERT INTO knowledge_base (source, chunk_index, content, embedding)
VALUES ($1, $2, $3, $4)
ON CONFLICT (source, chunk_index) DO UPDATE
SET content = $3, embedding = $4`,
[source, i, chunks[i], JSON.stringify(response.data[i].embedding)]
)
}
}
async query(question: string): Promise<string> {
// Get relevant context
const response = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: question
})
const contextResults = await pool.query(`
SELECT content, source,
1 - (embedding <=> $1::vector) as similarity
FROM knowledge_base
ORDER BY embedding <=> $1::vector
LIMIT 5
`, [JSON.stringify(response.data[0].embedding)])
// Build context
const context = contextResults.rows
.map(r => `[${r.source}]: ${r.content}`)
.join('\n\n')
// Generate answer
const completion = await openai.chat.completions.create({
model: 'gpt-4-turbo',
messages: [
{
role: 'system',
content: `Answer questions based on the following context. If the answer isn't in the context, say so.\n\nContext:\n${context}`
},
{ role: 'user', content: question }
]
})
return completion.choices[0].message.content ?? 'No answer generated'
}
private chunkText(text: string, size: number): string[] {
const words = text.split(/\s+/)
const chunks: string[] = []
for (let i = 0; i < words.length; i += size) {
chunks.push(words.slice(i, i + size).join(' '))
}
return chunks
}
}
// Usage
async function main() {
const rag = new RAGSystem()
await rag.initialize()
// Add documents
await rag.addDocument(
'pgvector-guide.md',
'pgvector is a PostgreSQL extension that adds support for vector similarity search...'
)
// Query
const answer = await rag.query('How do I install pgvector?')
console.log(answer)
}
main().catch(console.error)
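The chunkText method above splits on fixed word counts with no overlap, so a sentence straddling a boundary is cut across two chunks. A common refinement is to overlap consecutive chunks so boundary text is retrievable from either side; a sketch (the parameter defaults are my choice, not a standard):

```typescript
// Word-based chunking with overlap: each chunk begins `overlap`
// words before the previous chunk ended, so text near a boundary
// appears in two chunks.
function chunkTextWithOverlap(text: string, size = 500, overlap = 50): string[] {
  const words = text.split(/\s+/).filter(Boolean)
  const chunks: string[] = []
  const step = Math.max(1, size - overlap)
  for (let i = 0; i < words.length; i += step) {
    chunks.push(words.slice(i, i + size).join(' '))
    if (i + size >= words.length) break // last chunk reached the end
  }
  return chunks
}
```

The trade-off is storage and embedding cost: with `overlap = 50` and `size = 500`, roughly 10% of each chunk is duplicated.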
Key Takeaways
- pgvector adds vector capabilities to PostgreSQL
- Use HNSW for fast approximate search, IVFFlat for memory efficiency
- SQL interface means you can combine vector search with relational queries
- Works with any PostgreSQL client library
- Great for moderate scale, existing Postgres users
Exercise: Build a Product Search
- Create a table for products with embeddings
- Insert sample products with their descriptions embedded
- Implement search by similarity
- Add filtering by price range and category using SQL
Next up: Module 7 - Setting Up Chroma (Local/Free)

