Indexes and Query Performance

What Are Indexes?

An index is a data structure that improves the speed of data retrieval operations on a table. Think of it like a book's index—instead of reading every page to find a topic, you look it up in the index and go directly to the right page.

How Indexes Work

Without an index, PostgreSQL performs a sequential scan—reading every row in the table:

Query: SELECT * FROM users WHERE email = 'alice@example.com'

Sequential Scan:
┌──────────────────────────┐
│ Row 1: bob@ex.com        │  ✗ Not a match
│ Row 2: charlie@ex.com    │  ✗ Not a match
│ Row 3: alice@example.com │  ✓ Found it!
│ Row 4: dave@ex.com       │  ✗ Keep checking...
│ ...10,000 more rows...   │  ✗ Must check all
└──────────────────────────┘

With an index, PostgreSQL uses a B-tree (or other structure) to find rows directly:

B-tree Index on email:
                    ┌─────────────┐
                    │   david@    │
                    └──────┬──────┘
           ┌───────────────┴───────────────┐
    ┌──────┴──────┐                 ┌──────┴──────┐
    │   alice@    │                 │   mike@     │
    └──────┬──────┘                 └─────────────┘
           │
    Points directly to row 3

Creating Indexes

Basic Index

-- Index on a single column
CREATE INDEX idx_users_email ON users(email);

-- Index name convention: idx_tablename_columns
CREATE INDEX idx_posts_user_id ON posts(user_id);

Unique Index

Enforces uniqueness (like a unique constraint):

CREATE UNIQUE INDEX idx_users_email_unique ON users(email);

-- This is equivalent to
ALTER TABLE users ADD CONSTRAINT users_email_unique UNIQUE (email);

Composite (Multi-column) Index

For queries that filter on multiple columns:

-- Order matters! Most selective column first
CREATE INDEX idx_posts_user_status ON posts(user_id, status);

-- This index helps with:
WHERE user_id = '...'                      -- ✓
WHERE user_id = '...' AND status = '...'   -- ✓
WHERE status = '...'                       -- ✗ (won't use index)

Partial Index

Index only a subset of rows:

-- Only index published posts
CREATE INDEX idx_posts_published ON posts(created_at)
WHERE status = 'published';

-- Only index active users
CREATE INDEX idx_users_active ON users(email)
WHERE active = true;

Benefits:

Smaller index size
Faster index updates
Better cache utilization

Index Types

PostgreSQL offers several index types for different use cases:

B-tree (Default)

Best for: Equality and range queries

-- Default type
CREATE INDEX idx_users_created ON users(created_at);

-- Supports
WHERE created_at = '2024-01-01'    -- Equality
WHERE created_at > '2024-01-01'    -- Range
WHERE created_at BETWEEN ... AND   -- Range
ORDER BY created_at                -- Sorting

Hash

Best for: Equality comparisons only

CREATE INDEX idx_users_email_hash ON users USING hash(email);

-- Only supports
WHERE email = 'alice@example.com'  -- ✓
WHERE email LIKE 'alice%'          -- ✗ Not supported

GIN (Generalized Inverted Index)

Best for: Arrays, JSONB, full-text search

-- For JSONB
CREATE INDEX idx_users_metadata ON users USING gin(metadata);

-- Supports JSONB operators
WHERE metadata @> '{"role": "admin"}'
WHERE metadata ? 'premium'

-- For arrays
CREATE INDEX idx_posts_tags ON posts USING gin(tags);

-- Supports array operators
WHERE tags @> ARRAY['tech']
WHERE 'news' = ANY(tags)

-- For full-text search
CREATE INDEX idx_posts_search ON posts USING gin(to_tsvector('english', title || ' ' || content));

GiST (Generalized Search Tree)

Best for: Geometric data, ranges, full-text search

-- For geographic queries (PostGIS)
CREATE INDEX idx_stores_location ON stores USING gist(location);

-- For range types
CREATE INDEX idx_events_during ON events USING gist(during);

When to Create Indexes

Do Create Indexes For:

Foreign keys (PostgreSQL doesn't auto-index these!)

CREATE INDEX idx_posts_user_id ON posts(user_id);
CREATE INDEX idx_comments_post_id ON comments(post_id);

Columns in WHERE clauses

-- If you frequently query by status
CREATE INDEX idx_posts_status ON posts(status);

Columns used in ORDER BY

-- For queries sorting by date
CREATE INDEX idx_posts_created_desc ON posts(created_at DESC);

Columns in JOIN conditions

-- If joining on these columns
CREATE INDEX idx_team_members_team_id ON team_members(team_id);

Avoid Over-Indexing

Indexes have costs:

Write overhead: Every INSERT/UPDATE/DELETE must update indexes
Storage: Indexes take disk space
Maintenance: More indexes = more maintenance

-- Don't index everything!
-- Only index columns you actually query on

-- Probably unnecessary:
CREATE INDEX idx_posts_content ON posts(content);  -- Rarely filtered
CREATE INDEX idx_users_created ON users(created_at);  -- If rarely queried

Analyzing Query Performance

EXPLAIN

Shows the query plan without executing:

EXPLAIN SELECT * FROM posts WHERE user_id = 'some-uuid';

Output:

Seq Scan on posts  (cost=0.00..25.00 rows=1 width=100)
  Filter: (user_id = 'some-uuid'::uuid)

EXPLAIN ANALYZE

Actually executes and shows real timing:

EXPLAIN ANALYZE SELECT * FROM posts WHERE user_id = 'some-uuid';

Output:

Index Scan using idx_posts_user_id on posts (cost=0.15..8.17 rows=1 width=100) (actual time=0.025..0.026 rows=1 loops=1)
  Index Cond: (user_id = 'some-uuid'::uuid)
Planning Time: 0.085 ms
Execution Time: 0.046 ms

What to Look For

Plan Type	Meaning	Action
Seq Scan	Full table scan	Consider adding index
Index Scan	Using index	Good
Index Only Scan	All data from index	Great
Bitmap Index Scan	Multiple rows from index	Good

Reading EXPLAIN Output

EXPLAIN ANALYZE
SELECT p.*, u.email
FROM posts p
JOIN users u ON p.user_id = u.id
WHERE p.status = 'published'
ORDER BY p.created_at DESC
LIMIT 10;

Limit  (actual time=0.050..0.060 rows=10 loops=1)
  ->  Nested Loop  (actual time=0.048..0.056 rows=10 loops=1)
        ->  Index Scan using idx_posts_published on posts p
              (actual time=0.030..0.035 rows=10 loops=1)
              Filter: (status = 'published')
        ->  Index Scan using users_pkey on users u
              (actual time=0.001..0.001 rows=1 loops=10)
              Index Cond: (id = p.user_id)
Execution Time: 0.085 ms

Common Indexing Patterns

Covering Index

Include all columns needed by a query:

-- Query only needs id and title
SELECT id, title FROM posts WHERE user_id = '...' ORDER BY created_at DESC;

-- Covering index (includes all needed columns)
CREATE INDEX idx_posts_user_covering
ON posts(user_id, created_at DESC)
INCLUDE (id, title);

Benefit: Index-only scan, no table access needed.

Expression Index

Index on a computed value:

-- Index on lowercase email for case-insensitive search
CREATE INDEX idx_users_email_lower ON users(lower(email));

-- Now this uses the index
SELECT * FROM users WHERE lower(email) = 'alice@example.com';

-- Index on JSONB field
CREATE INDEX idx_users_role ON users((metadata->>'role'));

-- Uses the index
SELECT * FROM users WHERE metadata->>'role' = 'admin';

Concurrent Index Creation

For large tables in production:

-- Won't lock the table (but takes longer)
CREATE INDEX CONCURRENTLY idx_posts_title ON posts(title);

Indexes and Supabase

Automatic Indexes

Supabase/PostgreSQL automatically creates indexes for:

Primary keys
Unique constraints

Foreign Keys Need Manual Indexes

-- Supabase doesn't auto-index foreign keys!
CREATE TABLE posts (
  id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id uuid REFERENCES auth.users(id)  -- No auto-index
);

-- Add it yourself
CREATE INDEX idx_posts_user_id ON posts(user_id);

Index for PostgREST Queries

PostgREST (Supabase's REST API) benefits from indexes:

// This query benefits from idx_posts_user_id
const { data } = await supabase
  .from('posts')
  .select('*')
  .eq('user_id', userId)
  .order('created_at', { ascending: false })

Performance Tips

1. Index Foreign Keys

-- Always!
CREATE INDEX idx_posts_user_id ON posts(user_id);

2. Match Index to Query Pattern

-- Query pattern
WHERE user_id = ? AND status = ? ORDER BY created_at DESC

-- Matching index
CREATE INDEX idx_posts_user_status_date
ON posts(user_id, status, created_at DESC);

3. Use Partial Indexes for Filtered Queries

-- If you only ever query published posts
CREATE INDEX idx_posts_published
ON posts(user_id, created_at DESC)
WHERE status = 'published';

4. Consider Index-Only Scans

-- Include columns you SELECT
CREATE INDEX idx_posts_user
ON posts(user_id)
INCLUDE (title, created_at);

5. Monitor and Remove Unused Indexes

-- Find unused indexes
SELECT
  schemaname,
  relname as table,
  indexrelname as index,
  idx_scan as times_used
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexrelname NOT LIKE '%_pkey';

Key Takeaways

Indexes trade write speed for read speed: Don't over-index
Always index foreign keys: PostgreSQL doesn't auto-index them
Use EXPLAIN ANALYZE: Understand actual query performance
Partial indexes save space: Index only what you query
Column order matters: In composite indexes, put selective columns first
Match indexes to queries: Design indexes for your access patterns

Next Steps

With indexes understood, we'll explore functions and triggers—the building blocks for custom database logic.

A good index is like a good assistant: it anticipates what you need and retrieves it instantly. A bad index is like a distracted assistant: always busy but never helpful.

Indexes and Query Performance

What Are Indexes?

How Indexes Work

Without an index, PostgreSQL performs a sequential scan—reading every row in the table:

Query: SELECT * FROM users WHERE email = 'alice@example.com'

Sequential Scan:
┌──────────────────────────┐
│ Row 1: bob@ex.com        │  ✗ Not a match
│ Row 2: charlie@ex.com    │  ✗ Not a match
│ Row 3: alice@example.com │  ✓ Found it!
│ Row 4: dave@ex.com       │  ✗ Keep checking...
│ ...10,000 more rows...   │  ✗ Must check all
└──────────────────────────┘

With an index, PostgreSQL uses a B-tree (or other structure) to find rows directly:

B-tree Index on email:
                    ┌─────────────┐
                    │   david@    │
                    └──────┬──────┘
           ┌───────────────┴───────────────┐
    ┌──────┴──────┐                 ┌──────┴──────┐
    │   alice@    │                 │   mike@     │
    └──────┬──────┘                 └─────────────┘
           │
    Points directly to row 3

Creating Indexes

Basic Index

-- Index on a single column
CREATE INDEX idx_users_email ON users(email);

-- Index name convention: idx_tablename_columns
CREATE INDEX idx_posts_user_id ON posts(user_id);

Unique Index

Enforces uniqueness (like a unique constraint):

CREATE UNIQUE INDEX idx_users_email_unique ON users(email);

-- This is equivalent to
ALTER TABLE users ADD CONSTRAINT users_email_unique UNIQUE (email);

Composite (Multi-column) Index

For queries that filter on multiple columns:

-- Order matters! Most selective column first
CREATE INDEX idx_posts_user_status ON posts(user_id, status);

-- This index helps with:
WHERE user_id = '...'                      -- ✓
WHERE user_id = '...' AND status = '...'   -- ✓
WHERE status = '...'                       -- ✗ (won't use index)

Partial Index

Index only a subset of rows:

-- Only index published posts
CREATE INDEX idx_posts_published ON posts(created_at)
WHERE status = 'published';

-- Only index active users
CREATE INDEX idx_users_active ON users(email)
WHERE active = true;

Benefits:

Smaller index size
Faster index updates
Better cache utilization

Index Types

PostgreSQL offers several index types for different use cases:

B-tree (Default)

Best for: Equality and range queries

-- Default type
CREATE INDEX idx_users_created ON users(created_at);

-- Supports
WHERE created_at = '2024-01-01'    -- Equality
WHERE created_at > '2024-01-01'    -- Range
WHERE created_at BETWEEN ... AND   -- Range
ORDER BY created_at                -- Sorting

Hash

Best for: Equality comparisons only

CREATE INDEX idx_users_email_hash ON users USING hash(email);

-- Only supports
WHERE email = 'alice@example.com'  -- ✓
WHERE email LIKE 'alice%'          -- ✗ Not supported

GIN (Generalized Inverted Index)

Best for: Arrays, JSONB, full-text search

-- For JSONB
CREATE INDEX idx_users_metadata ON users USING gin(metadata);

-- Supports JSONB operators
WHERE metadata @> '{"role": "admin"}'
WHERE metadata ? 'premium'

-- For arrays
CREATE INDEX idx_posts_tags ON posts USING gin(tags);

-- Supports array operators
WHERE tags @> ARRAY['tech']
WHERE 'news' = ANY(tags)

-- For full-text search
CREATE INDEX idx_posts_search ON posts USING gin(to_tsvector('english', title || ' ' || content));

GiST (Generalized Search Tree)

Best for: Geometric data, ranges, full-text search

-- For geographic queries (PostGIS)
CREATE INDEX idx_stores_location ON stores USING gist(location);

-- For range types
CREATE INDEX idx_events_during ON events USING gist(during);

When to Create Indexes

Do Create Indexes For:

Foreign keys (PostgreSQL doesn't auto-index these!)

CREATE INDEX idx_posts_user_id ON posts(user_id);
CREATE INDEX idx_comments_post_id ON comments(post_id);

Columns in WHERE clauses

-- If you frequently query by status
CREATE INDEX idx_posts_status ON posts(status);

Columns used in ORDER BY

-- For queries sorting by date
CREATE INDEX idx_posts_created_desc ON posts(created_at DESC);

Columns in JOIN conditions

-- If joining on these columns
CREATE INDEX idx_team_members_team_id ON team_members(team_id);

Avoid Over-Indexing

Indexes have costs:

Write overhead: Every INSERT/UPDATE/DELETE must update indexes
Storage: Indexes take disk space
Maintenance: More indexes = more maintenance

-- Don't index everything!
-- Only index columns you actually query on

-- Probably unnecessary:
CREATE INDEX idx_posts_content ON posts(content);  -- Rarely filtered
CREATE INDEX idx_users_created ON users(created_at);  -- If rarely queried

Analyzing Query Performance

EXPLAIN

Shows the query plan without executing:

EXPLAIN SELECT * FROM posts WHERE user_id = 'some-uuid';

Output:

Seq Scan on posts  (cost=0.00..25.00 rows=1 width=100)
  Filter: (user_id = 'some-uuid'::uuid)

EXPLAIN ANALYZE

Actually executes and shows real timing:

EXPLAIN ANALYZE SELECT * FROM posts WHERE user_id = 'some-uuid';

Output:

Index Scan using idx_posts_user_id on posts (cost=0.15..8.17 rows=1 width=100) (actual time=0.025..0.026 rows=1 loops=1)
  Index Cond: (user_id = 'some-uuid'::uuid)
Planning Time: 0.085 ms
Execution Time: 0.046 ms

What to Look For

Plan Type	Meaning	Action
Seq Scan	Full table scan	Consider adding index
Index Scan	Using index	Good
Index Only Scan	All data from index	Great
Bitmap Index Scan	Multiple rows from index	Good

Reading EXPLAIN Output

EXPLAIN ANALYZE
SELECT p.*, u.email
FROM posts p
JOIN users u ON p.user_id = u.id
WHERE p.status = 'published'
ORDER BY p.created_at DESC
LIMIT 10;

Limit  (actual time=0.050..0.060 rows=10 loops=1)
  ->  Nested Loop  (actual time=0.048..0.056 rows=10 loops=1)
        ->  Index Scan using idx_posts_published on posts p
              (actual time=0.030..0.035 rows=10 loops=1)
              Filter: (status = 'published')
        ->  Index Scan using users_pkey on users u
              (actual time=0.001..0.001 rows=1 loops=10)
              Index Cond: (id = p.user_id)
Execution Time: 0.085 ms

Common Indexing Patterns

Covering Index

Include all columns needed by a query:

-- Query only needs id and title
SELECT id, title FROM posts WHERE user_id = '...' ORDER BY created_at DESC;

-- Covering index (includes all needed columns)
CREATE INDEX idx_posts_user_covering
ON posts(user_id, created_at DESC)
INCLUDE (id, title);

Benefit: Index-only scan, no table access needed.

Expression Index

Index on a computed value:

-- Index on lowercase email for case-insensitive search
CREATE INDEX idx_users_email_lower ON users(lower(email));

-- Now this uses the index
SELECT * FROM users WHERE lower(email) = 'alice@example.com';

-- Index on JSONB field
CREATE INDEX idx_users_role ON users((metadata->>'role'));

-- Uses the index
SELECT * FROM users WHERE metadata->>'role' = 'admin';

Concurrent Index Creation

For large tables in production:

-- Won't lock the table (but takes longer)
CREATE INDEX CONCURRENTLY idx_posts_title ON posts(title);

Indexes and Supabase

Automatic Indexes

Supabase/PostgreSQL automatically creates indexes for:

Primary keys
Unique constraints

Foreign Keys Need Manual Indexes

-- Supabase doesn't auto-index foreign keys!
CREATE TABLE posts (
  id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id uuid REFERENCES auth.users(id)  -- No auto-index
);

-- Add it yourself
CREATE INDEX idx_posts_user_id ON posts(user_id);

Index for PostgREST Queries

PostgREST (Supabase's REST API) benefits from indexes:

// This query benefits from idx_posts_user_id
const { data } = await supabase
  .from('posts')
  .select('*')
  .eq('user_id', userId)
  .order('created_at', { ascending: false })

Performance Tips

1. Index Foreign Keys

-- Always!
CREATE INDEX idx_posts_user_id ON posts(user_id);

2. Match Index to Query Pattern

-- Query pattern
WHERE user_id = ? AND status = ? ORDER BY created_at DESC

-- Matching index
CREATE INDEX idx_posts_user_status_date
ON posts(user_id, status, created_at DESC);

3. Use Partial Indexes for Filtered Queries

-- If you only ever query published posts
CREATE INDEX idx_posts_published
ON posts(user_id, created_at DESC)
WHERE status = 'published';

4. Consider Index-Only Scans

-- Include columns you SELECT
CREATE INDEX idx_posts_user
ON posts(user_id)
INCLUDE (title, created_at);

5. Monitor and Remove Unused Indexes

-- Find unused indexes
SELECT
  schemaname,
  relname as table,
  indexrelname as index,
  idx_scan as times_used
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexrelname NOT LIKE '%_pkey';

Key Takeaways

Indexes trade write speed for read speed: Don't over-index
Always index foreign keys: PostgreSQL doesn't auto-index them
Use EXPLAIN ANALYZE: Understand actual query performance
Partial indexes save space: Index only what you query
Column order matters: In composite indexes, put selective columns first
Match indexes to queries: Design indexes for your access patterns

Next Steps

With indexes understood, we'll explore functions and triggers—the building blocks for custom database logic.

A good index is like a good assistant: it anticipates what you need and retrieves it instantly. A bad index is like a distracted assistant: always busy but never helpful.