Lesson 5.4: Scaling Reads with Replicas
Read Replica Architecture
Primary (writes)
↓ Streaming replication
Replica 1 (feature store reads)
Replica 2 (analytics queries)
Replica 3 (dashboard queries)
Setup:
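-- On primary: allow replication
-- In postgresql.conf:
-- wal_level = replica
-- max_wal_senders = 10
-- In pg_hba.conf: add a "replication" entry so user replicator can connect from each replica
-- Create the slot that the replica names in primary_slot_name:
SELECT pg_create_physical_replication_slot('replica1');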
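-- On replica: set up replication
-- PostgreSQL 12+: in postgresql.conf, plus an empty standby.signal file in the data directory
-- (PostgreSQL 11 and earlier: in recovery.conf, with standby_mode = 'on')
-- primary_conninfo = 'host=primary port=5432 user=replicator'
-- primary_slot_name = 'replica1'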
Application Pattern
# primary_db and replica_db are assumed to be connection wrappers whose
# execute(sql, params) takes the query parameters as a single tuple.

# Write to primary
def create_user(email, name):
    primary_db.execute("INSERT INTO users (email, name) VALUES (%s, %s)", (email, name))

# Read from replica (eventual consistency: may briefly miss recent writes)
def get_user(user_id):
    return replica_db.execute("SELECT * FROM users WHERE id = %s", (user_id,))

# Read from primary (strong consistency)
def get_user_for_payment(user_id):
    return primary_db.execute("SELECT * FROM users WHERE id = %s", (user_id,))
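A common refinement of this pattern is "read your own writes": after a user writes, that user's reads go to the primary for a short window so they never see stale data from a lagging replica. The sketch below is one possible way to express that rule. `ReadYourWritesPolicy`, `FakeClock`, and the 2-second window are hypothetical names and values chosen for illustration, not part of the lesson's code.

```python
import time

# Hypothetical helper: remembers when each user last wrote so that
# their reads can be pinned to the primary for a short window.
class ReadYourWritesPolicy:
    def __init__(self, pin_seconds=2.0, clock=time.monotonic):
        self.pin_seconds = pin_seconds  # assumed upper bound on replica lag
        self.clock = clock              # injectable so the policy can be tested
        self._last_write = {}           # user_id -> time of most recent write

    def record_write(self, user_id):
        self._last_write[user_id] = self.clock()

    def target_for_read(self, user_id):
        """Return 'primary' if the user wrote recently, else 'replica'."""
        last = self._last_write.get(user_id)
        if last is not None and self.clock() - last < self.pin_seconds:
            return "primary"
        return "replica"


class FakeClock:
    """Manually advanced clock, used only for the demonstration below."""
    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


clock = FakeClock()
policy = ReadYourWritesPolicy(pin_seconds=2.0, clock=clock)

before_write = policy.target_for_read(42)  # no write recorded yet
policy.record_write(42)
right_after = policy.target_for_read(42)   # inside the pin window
clock.now = 5.0
later = policy.target_for_read(42)         # window has expired

print(before_write, right_after, later)    # replica primary replica
```

The window should be longer than the replication lag you normally observe. If lag exceeds it, a pinned user can still see stale data once the window expires.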
Replication Lag Monitoring
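-- On primary: check replication lag per replica, in MB of WAL
SELECT
    client_addr,
    state,
    pg_wal_lsn_diff(pg_current_wal_lsn(), sent_lsn) / 1024 / 1024 AS sent_lag_mb,
    pg_wal_lsn_diff(pg_current_wal_lsn(), write_lsn) / 1024 / 1024 AS write_lag_mb,
    -- replay_lsn marks what queries on the replica can actually see
    pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) / 1024 / 1024 AS replay_lag_mb,
    replay_lag  -- time-based lag as an interval (PostgreSQL 10+)
FROM pg_stat_replication;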
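Handling lag:
- Normal: Under 1MB of lag, which a healthy replica usually applies in under a second. Byte lag and time lag are different measurements, so use the replay_lag column of pg_stat_replication when you need a figure in seconds.
- High: Over 100MB of lag (may indicate network issues, a replica that cannot replay WAL fast enough, or long-running replica queries that block replay)
- Solution: Read from primary for critical queries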
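To make the thresholds above concrete, here is a minimal sketch that computes byte lag from two LSN strings and turns it into a routing decision. It relies on the documented LSN text format (two hexadecimal numbers separated by a slash, the high and low 32 bits of the WAL position). The function names, the "elevated" middle band, and the rule that non-critical reads also move to the primary under high lag are assumptions made for illustration.

```python
MB = 1024 * 1024


def lsn_to_int(lsn):
    """Convert an LSN string such as '16/B374D848' to a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(primary_lsn, replica_lsn):
    """Bytes of WAL the replica is behind the primary."""
    return lsn_to_int(primary_lsn) - lsn_to_int(replica_lsn)


def classify_lag(lag, normal_limit=1 * MB, high_limit=100 * MB):
    # Limits mirror the lesson's guidance; "elevated" is an assumed
    # name for the band between the two limits.
    if lag < normal_limit:
        return "normal"
    if lag > high_limit:
        return "high"
    return "elevated"


def read_target(lag, critical):
    """Critical reads always use the primary; others avoid a badly lagging replica."""
    if critical or classify_lag(lag) == "high":
        return "primary"
    return "replica"


lag = lag_bytes("0/3000000", "0/2000000")
print(lag // MB, classify_lag(lag), read_target(lag, critical=False))
# 16 elevated replica
```

In practice the two LSNs would come from pg_current_wal_lsn() on the primary and the replay_lsn column of pg_stat_replication.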
Load Balancing Reads
import random

class DatabaseRouter:
    """Send writes to the primary and spread reads across the replicas."""

    def __init__(self):
        self.primary = get_db_connection('primary')
        self.replicas = [
            get_db_connection('replica1'),
            get_db_connection('replica2'),
            get_db_connection('replica3'),
        ]

    def read(self, query):
        # Pick a random replica; results may lag slightly behind the primary
        replica = random.choice(self.replicas)
        return replica.execute(query)

    def write(self, query):
        return self.primary.execute(query)
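Random selection treats every replica as equally fresh. One possible extension, sketched below, skips replicas whose measured lag exceeds a limit and falls back to the primary when none qualify. `LagAwareRouter`, `FakeConnection`, `update_lag`, and the 100MB default are hypothetical, and the lag values would need to be fed in by a separate monitoring job.

```python
import random


class FakeConnection:
    """Stand-in for a real connection; it only reports which node ran the query."""

    def __init__(self, name):
        self.name = name

    def execute(self, query):
        return (self.name, query)


class LagAwareRouter:
    def __init__(self, primary, replicas, max_lag_bytes=100 * 1024 * 1024, rng=None):
        self.primary = primary
        self.replicas = list(replicas)
        self.max_lag_bytes = max_lag_bytes
        self.rng = rng or random.Random()
        # Last known lag per replica, assumed to be refreshed by monitoring
        self.lag = {r.name: 0 for r in self.replicas}

    def update_lag(self, name, lag_bytes):
        self.lag[name] = lag_bytes

    def read(self, query):
        healthy = [r for r in self.replicas if self.lag[r.name] <= self.max_lag_bytes]
        if not healthy:
            # Every replica is too far behind, so serve the read from the primary
            return self.primary.execute(query)
        return self.rng.choice(healthy).execute(query)

    def write(self, query):
        return self.primary.execute(query)


router = LagAwareRouter(
    FakeConnection("primary"),
    [FakeConnection("replica1"), FakeConnection("replica2"), FakeConnection("replica3")],
)
router.update_lag("replica1", 500 * 1024 * 1024)
router.update_lag("replica2", 500 * 1024 * 1024)
only_healthy = {router.read("SELECT 1")[0] for _ in range(20)}

router.update_lag("replica3", 500 * 1024 * 1024)
fallback = router.read("SELECT 1")[0]

print(only_healthy, fallback)  # {'replica3'} primary
```

Falling back to the primary protects freshness at the cost of extra primary load, so a production router would likely alert when it happens.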
Key Takeaways
- Read replicas scale read throughput horizontally
- Streaming replication keeps replicas close to the primary; it is asynchronous by default, so replicas can briefly trail recent writes
- Accept eventual consistency for analytics and dashboards
- Use primary for critical reads requiring strong consistency
- Monitor replication lag to detect issues
- Load balance reads across multiple replicas
- Each replica can serve different use cases (features, analytics, dashboards)

