Lesson 1.6: Case Study - Shopify's E-Commerce at Scale
Shopify's Requirements
- 1M+ merchants
- Billions of products
- Peak: 10k+ orders/second
- Global consistency
Their Architecture
Primary Database: PostgreSQL with sharding by merchant_id
Why Postgres?
- ACID transactions: Money and inventory require correctness
- JSONB support: Product attributes vary by category
- Full-text search: Product search built-in
- Mature replication: Read replicas for analytics
- pgvector: Product recommendations
Schema Design for Scale
-- Merchants table (shard key)
CREATE TABLE merchants (
id BIGSERIAL PRIMARY KEY,
shop_domain TEXT UNIQUE NOT NULL,
-- All other tables foreign key to this
);
-- Products (sharded by merchant_id)
CREATE TABLE products (
id BIGSERIAL PRIMARY KEY,
merchant_id BIGINT NOT NULL REFERENCES merchants(id),
title TEXT NOT NULL,
description TEXT,
attributes JSONB, -- Flexible schema
embedding VECTOR(1536), -- Product recommendations
search_vector TSVECTOR -- Full-text search
);
-- Optimized indexes
CREATE INDEX products_merchant_idx ON products(merchant_id);
CREATE INDEX products_search_idx ON products USING GIN(search_vector);
CREATE INDEX products_embedding_idx ON products
USING ivfflat (embedding vector_cosine_ops);
Handling Black Friday Traffic
Challenge: 10k orders/second during peak sales
Solution: Write Ahead Log (WAL) Optimization
-- Batch commits reduce WAL overhead
BEGIN;
INSERT INTO orders (...) VALUES (...); -- 100 orders
COMMIT; -- One WAL write for 100 orders
Connection Pooling:
10k requests/sec → PgBouncer → 100 DB connections
Read Replicas:
Primary (writes) → Replica 1-10 (product reads)
→ Replica 11-20 (analytics)
Result: Handle 10x traffic spikes without downtime.
Lessons for AI E-Commerce
- JSONB for flexibility: Product attributes without schema migrations
- Full-text search: Built-in, no Elasticsearch needed for basic search
- pgvector for recommendations: "Similar products" without separate DB
- Sharding by tenant: Each merchant's data isolated
- Connection pooling: Essential for high concurrency
Key Takeaways
- Shopify handles 1M+ merchants and 10k+ orders/second with PostgreSQL
- JSONB allows flexible product schemas without constant migrations
- Built-in features (full-text search, pgvector) reduce infrastructure complexity
- Proper connection pooling and read replicas enable massive scale
- Multi-tenant sharding (by merchant_id) provides isolation and scalability

