Performance Optimization - Making Queries Fast
Introduction
Your queries work, but are they fast? This lesson shows you how to add indexes, analyze performance, and optimize slow queries.
Step 1: Add Indexes
Save this as 02-indexes.sql:
-- Primary keys (already indexed automatically)
-- users.id, workspaces.id, projects.id, tasks.id, tags.id, comments.id, activity_log.id
-- Foreign key indexes (critical!)
CREATE INDEX idx_workspaces_owner ON workspaces(owner_id);
CREATE INDEX idx_workspace_members_workspace ON workspace_members(workspace_id);
CREATE INDEX idx_workspace_members_user ON workspace_members(user_id);
CREATE INDEX idx_projects_workspace ON projects(workspace_id);
CREATE INDEX idx_projects_created_by ON projects(created_by);
CREATE INDEX idx_tasks_project ON tasks(project_id);
CREATE INDEX idx_tasks_created_by ON tasks(created_by);
CREATE INDEX idx_task_assignments_task ON task_assignments(task_id);
CREATE INDEX idx_task_assignments_user ON task_assignments(user_id);
CREATE INDEX idx_tags_workspace ON tags(workspace_id);
CREATE INDEX idx_task_tags_task ON task_tags(task_id);
CREATE INDEX idx_task_tags_tag ON task_tags(tag_id);
CREATE INDEX idx_comments_task ON comments(task_id);
CREATE INDEX idx_comments_user ON comments(user_id);
CREATE INDEX idx_activity_log_workspace ON activity_log(workspace_id);
CREATE INDEX idx_activity_log_user ON activity_log(user_id);
-- Composite indexes for common queries
CREATE INDEX idx_tasks_project_status ON tasks(project_id, status);
CREATE INDEX idx_tasks_status_due_date ON tasks(status, due_date) WHERE deleted_at IS NULL;
-- Partial indexes
CREATE INDEX idx_tasks_active ON tasks(project_id, status)
WHERE deleted_at IS NULL AND status IN ('todo', 'in_progress');
CREATE INDEX idx_tasks_overdue ON tasks(project_id, due_date)
WHERE status NOT IN ('done', 'cancelled') AND due_date < CURRENT_DATE;
-- Timestamp indexes for sorting
CREATE INDEX idx_tasks_created ON tasks(created_at DESC);
CREATE INDEX idx_comments_created ON comments(created_at DESC);
CREATE INDEX idx_activity_log_created ON activity_log(created_at DESC);
-- Soft delete support
CREATE INDEX idx_users_active ON users(email) WHERE deleted_at IS NULL;
CREATE INDEX idx_workspaces_active ON workspaces(owner_id) WHERE deleted_at IS NULL;
CREATE INDEX idx_projects_active ON projects(workspace_id) WHERE deleted_at IS NULL;
Run: psql -d task_management -f 02-indexes.sql
Step 2: Analyze Query Performance
Before Optimization
EXPLAIN ANALYZE
SELECT t.title, u.first_name || ' ' || u.last_name AS assigned_to
FROM tasks t
LEFT JOIN task_assignments ta ON t.id = ta.task_id
LEFT JOIN users u ON ta.user_id = u.id
WHERE t.project_id = 1;
Without indexes:
Seq Scan on tasks (cost=0.00..1000.00 rows=100)
Filter: (project_id = 1)
Nested Loop Left Join ...
With indexes:
Index Scan using idx_tasks_project on tasks (cost=0.00..8.27 rows=1)
Index Cond: (project_id = 1)
Nested Loop Left Join ...
Much faster!
Analyzing Slow Queries
-- Find slow queries (PostgreSQL)
SELECT
query,
calls,
total_exec_time,
mean_exec_time,
max_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
Step 3: Optimize Common Query Patterns
Query 1: My Tasks (Optimized)
-- Before: Slow without indexes
EXPLAIN ANALYZE
SELECT t.id, t.title, t.status, t.priority
FROM tasks t
INNER JOIN task_assignments ta ON t.id = ta.task_id
WHERE ta.user_id = 2
AND t.status != 'done'
AND t.due_date BETWEEN CURRENT_DATE AND CURRENT_DATE + INTERVAL '7 days';
-- After: Add composite index
CREATE INDEX idx_task_assignments_user_task ON task_assignments(user_id, task_id);
-- Now uses index scan instead of seq scan!
Query 2: Workspace Overview (Optimized)
-- Optimize with materialized view
CREATE MATERIALIZED VIEW workspace_dashboard AS
SELECT
w.id,
w.name,
COUNT(DISTINCT wm.user_id) AS member_count,
COUNT(DISTINCT p.id) AS project_count,
COUNT(DISTINCT t.id) AS total_tasks,
COUNT(DISTINCT CASE WHEN t.status = 'done' THEN t.id END) AS completed_tasks
FROM workspaces w
LEFT JOIN workspace_members wm ON w.id = wm.workspace_id
LEFT JOIN projects p ON w.id = p.workspace_id AND p.deleted_at IS NULL
LEFT JOIN tasks t ON p.id = t.project_id AND t.deleted_at IS NULL
WHERE w.deleted_at IS NULL
GROUP BY w.id, w.name;
-- Refresh periodically
REFRESH MATERIALIZED VIEW workspace_dashboard;
-- Fast reads!
SELECT * FROM workspace_dashboard;
Step 4: Index Usage Statistics
-- Check which indexes are being used
SELECT
schemaname,
tablename,
indexname,
idx_scan AS scans,
idx_tup_read AS tuples_read,
idx_tup_fetch AS tuples_fetched,
pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY idx_scan DESC;
Look for:
- Indexes with 0 scans = unused (consider dropping)
- Indexes with high scans = valuable (keep!)
Step 5: Table Statistics
-- Vacuum and analyze for accurate statistics
VACUUM ANALYZE tasks;
VACUUM ANALYZE users;
VACUUM ANALYZE projects;
-- Check table sizes
SELECT
tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
Performance Tips
1. Index Foreign Keys (Critical!)
-- ✅ Good: Indexed
CREATE INDEX idx_tasks_project ON tasks(project_id);
-- ❌ Bad: No index
-- Foreign key without index = slow JOINs
2. Use Partial Indexes for Filtered Queries
-- If you often query active tasks
CREATE INDEX idx_tasks_active ON tasks(status)
WHERE deleted_at IS NULL AND status IN ('todo', 'in_progress');
3. Avoid SELECT *
-- ❌ Bad: Returns unnecessary columns
SELECT * FROM tasks WHERE id = 1;
-- ✅ Good: Only what you need
SELECT id, title, status FROM tasks WHERE id = 1;
4. Use LIMIT on Large Result Sets
-- Pagination
SELECT * FROM tasks
ORDER BY created_at DESC
LIMIT 25 OFFSET 0;
5. Denormalize for Read-Heavy Queries
-- Add computed column to avoid JOIN
ALTER TABLE tasks ADD COLUMN assignee_name VARCHAR(100);
-- Update with trigger or application code
UPDATE tasks t
SET assignee_name = u.first_name || ' ' || u.last_name
FROM task_assignments ta
JOIN users u ON ta.user_id = u.id
WHERE t.id = ta.task_id;
-- Now no JOIN needed!
SELECT title, assignee_name FROM tasks;
Monitoring Performance
Enable Query Logging (PostgreSQL)
-- In postgresql.conf
log_min_duration_statement = 1000 -- Log queries > 1 second
-- Or per session
SET log_min_duration_statement = 100;
Check Long-Running Queries
SELECT
pid,
now() - pg_stat_activity.query_start AS duration,
query,
state
FROM pg_stat_activity
WHERE state = 'active'
AND now() - pg_stat_activity.query_start > interval '1 second'
ORDER BY duration DESC;
Key Takeaways
- ✅ Index all foreign keys (manually!)
- ✅ Composite indexes for multi-column queries
- ✅ Partial indexes for filtered queries
- ✅ Use EXPLAIN ANALYZE to verify performance
- ✅ VACUUM ANALYZE for accurate statistics
- ✅ Monitor index usage and drop unused indexes
- ✅ Materialized views for complex aggregations
- ✅ Avoid SELECT * - select only needed columns
Next Steps
In Lesson 8.7, you'll review the entire capstone project and explore what to learn next!
Quiz
Question 1 of 520% Complete
0 of 5 questions answered

