Indexes - Performance Optimization
Introduction
Imagine searching for a word in a 1000-page book without an index—you'd read every page! Databases face the same problem. Indexes are like a book's index: they speed up data retrieval by creating optimized lookup structures.
Without indexes, queries scan entire tables (slow). With indexes, databases jump straight to relevant rows (fast).
What is an Index?
An index is a data structure (usually a B-tree) that stores a sorted copy of specific columns, pointing to the actual table rows.
Trade-off:
- ✅ Faster SELECT queries
- ❌ Slower INSERT/UPDATE/DELETE (index must be updated)
- ❌ Extra storage space
Creating Indexes
Basic Syntax
CREATE INDEX index_name ON table_name(column_name);
Simple Index Example
-- Without index: Full table scan
SELECT * FROM customers WHERE email = 'alice@example.com';
-- Scans all 1 million rows!
-- Create index
CREATE INDEX idx_customers_email ON customers(email);
-- Now fast!
SELECT * FROM customers WHERE email = 'alice@example.com';
-- Uses index: instant lookup
Index Types
B-Tree Index (Default)
CREATE INDEX idx_products_price ON products(price);
Use for:
- Equality searches (
=) - Range queries (
<,>,BETWEEN) - Sorting (
ORDER BY) - Pattern matching starting with text (
LIKE 'abc%')
Most common index type.
Unique Index
CREATE UNIQUE INDEX idx_users_email ON users(email);
Enforces uniqueness + speeds up lookups.
Equivalent to:
ALTER TABLE users ADD CONSTRAINT uq_users_email UNIQUE (email);
Partial Index (PostgreSQL)
Index only rows matching a condition:
-- Index only active products
CREATE INDEX idx_active_products ON products(name)
WHERE active = TRUE;
Benefits:
- Smaller index
- Faster updates
- Perfect for filtering
Expression Index
Index on computed values:
-- Index lowercase email for case-insensitive search
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
-- Now fast:
SELECT * FROM users WHERE LOWER(email) = 'alice@example.com';
Multi-Column (Composite) Index
CREATE INDEX idx_orders_customer_date ON orders(customer_id, order_date);
Use for:
- Queries filtering on multiple columns
- Queries with specific column order
Index works for:
WHERE customer_id = 1WHERE customer_id = 1 AND order_date = '2024-01-15'
Index doesn't help:
WHERE order_date = '2024-01-15'(order_date not first!)
Rule: Left-to-right prefix matching!
Covering Index
Include extra columns for index-only scans:
CREATE INDEX idx_orders_customer_total ON orders(customer_id)
INCLUDE (total, order_date);
Query can be satisfied entirely from index (no table lookup needed).
When to Create Indexes
Always Index
- ✅ Primary keys (automatic)
- ✅ Foreign keys (manual!)
- ✅ Columns in WHERE clauses
- ✅ Columns in JOIN conditions
- ✅ Columns in ORDER BY
- ✅ Columns in GROUP BY
Example Schema with Indexes
CREATE TABLE customers (
id SERIAL PRIMARY KEY, -- Automatic index
email VARCHAR(255) UNIQUE NOT NULL, -- Automatic unique index
name VARCHAR(100),
city VARCHAR(50),
created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);
-- Manual indexes
CREATE INDEX idx_customers_city ON customers(city);
CREATE INDEX idx_customers_created ON customers(created_at);
CREATE TABLE orders (
id SERIAL PRIMARY KEY, -- Automatic index
customer_id INTEGER NOT NULL REFERENCES customers(id),
total DECIMAL(10,2),
status VARCHAR(20),
order_date DATE
);
-- Manual indexes
CREATE INDEX idx_orders_customer ON orders(customer_id); -- FK index!
CREATE INDEX idx_orders_status ON orders(status);
CREATE INDEX idx_orders_date ON orders(order_date);
CREATE INDEX idx_orders_customer_date ON orders(customer_id, order_date); -- Composite
When NOT to Create Indexes
- ❌ Small tables (under 1000 rows) - Full scan is fast enough
- ❌ Columns with low cardinality (few unique values, e.g., boolean)
- ❌ Columns rarely queried
- ❌ Tables with many writes - Index overhead slows INSERTs
Checking Index Usage
EXPLAIN Query Plan
EXPLAIN SELECT * FROM customers WHERE email = 'alice@example.com';
Without index:
Seq Scan on customers (cost=0.00..1000.00 rows=1)
Filter: (email = 'alice@example.com')
With index:
Index Scan using idx_customers_email on customers (cost=0.00..8.27 rows=1)
Index Cond: (email = 'alice@example.com')
EXPLAIN ANALYZE (with execution)
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 1;
Shows actual execution time!
Listing Indexes
PostgreSQL
-- All indexes on a table
SELECT * FROM pg_indexes WHERE tablename = 'customers';
-- Index sizes
SELECT
indexname,
pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY pg_relation_size(indexrelid) DESC;
Dropping Indexes
DROP INDEX idx_customers_email;
-- Safe drop
DROP INDEX IF EXISTS idx_customers_email;
Reindexing
Rebuild corrupted or bloated indexes:
-- Reindex a specific index
REINDEX INDEX idx_customers_email;
-- Reindex all indexes on a table
REINDEX TABLE customers;
-- Reindex entire database
REINDEX DATABASE mydb;
Index Maintenance
Autovacuum (PostgreSQL)
Automatically cleans up dead rows and updates statistics.
-- Check autovacuum status
SELECT * FROM pg_stat_user_tables WHERE relname = 'customers';
Manual VACUUM and ANALYZE
-- Reclaim space and update statistics
VACUUM ANALYZE customers;
Common Index Patterns
Pattern 1: Foreign Key Indexes
-- Always index foreign keys!
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_id INTEGER NOT NULL REFERENCES customers(id)
);
CREATE INDEX idx_orders_customer ON orders(customer_id);
Pattern 2: Composite Index for Queries
-- Query: Find active orders for customer
SELECT * FROM orders
WHERE customer_id = 1 AND status = 'active';
-- Ideal index:
CREATE INDEX idx_orders_customer_status ON orders(customer_id, status);
Pattern 3: Partial Index for Filtering
-- Only index recent orders
CREATE INDEX idx_recent_orders ON orders(customer_id, order_date)
WHERE order_date >= '2024-01-01';
Pattern 4: Text Search Index
-- Full-text search index
CREATE INDEX idx_products_search ON products USING gin(to_tsvector('english', description));
SELECT * FROM products
WHERE to_tsvector('english', description) @@ to_tsquery('laptop');
Performance Tips
1. Measure Before Optimizing
-- Use EXPLAIN ANALYZE
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 1;
2. Don't Over-Index
More indexes = slower writes. Balance is key!
3. Use Composite Indexes Wisely
-- ✅ Good: Matches query pattern
CREATE INDEX idx ON orders(customer_id, order_date);
-- Query: WHERE customer_id = 1 AND order_date = '2024-01-15'
-- ❌ Bad: Wrong column order
CREATE INDEX idx ON orders(order_date, customer_id);
-- Doesn't help: WHERE customer_id = 1
4. Monitor Index Usage
SELECT
schemaname,
tablename,
indexname,
idx_scan AS index_scans,
idx_tup_read AS tuples_read,
idx_tup_fetch AS tuples_fetched
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY idx_scan ASC;
Indexes with 0 scans = unused (consider dropping!).
Practice Exercises
Exercise 1
Create indexes for this query: SELECT * FROM products WHERE category = 'Electronics' AND price > 100 ORDER BY price;
Solution
-- Option 1: Composite index
CREATE INDEX idx_products_category_price ON products(category, price);
-- Option 2: Separate indexes (less efficient)
CREATE INDEX idx_products_category ON products(category);
CREATE INDEX idx_products_price ON products(price);
Exercise 2
Create a partial index for active users created in 2024.
Solution
CREATE INDEX idx_users_2024_active ON users(created_at)
WHERE active = TRUE AND EXTRACT(YEAR FROM created_at) = 2024;
Key Takeaways
- ✅ Indexes speed up SELECT but slow down INSERT/UPDATE/DELETE
- ✅ B-tree is the default and most common index type
- ✅ Always index foreign keys manually
- ✅ Composite indexes for multi-column queries
- ✅ Partial indexes for filtered queries
- ✅ Use EXPLAIN to verify index usage
- ✅ Monitor and drop unused indexes
- ✅ Don't over-index - balance read vs write performance
Next Steps
You've mastered indexes! In the next lesson, we'll explore Normalization—database design principles for reducing redundancy and ensuring data integrity.
Let's design better schemas!

