Understanding Databases - How Data is Stored on Disk
Introduction
Before diving into SQL queries, it's crucial to understand what's happening behind the scenes. When you write a SQL query, how does the database find your data among millions or billions of rows? Why are some queries fast and others slow? The answers lie in understanding how databases store data physically on disk.
The Big Picture: From Tables to Disk
When you think of a database table, you might picture a spreadsheet:
| id | name | email | age |
|----|-------------|---------------------|-----|
| 1 | Alice Smith | alice@example.com | 28 |
| 2 | Bob Jones | bob@example.com | 34 |
| 3 | Carol White | carol@example.com | 22 |
But on disk, this data isn't stored as a neat table. It's stored as binary data in pages or blocks—the fundamental units of database storage.
Pages: The Building Blocks of Storage
What is a Page?
A page (also called a block) is a fixed-size chunk of data, typically 8 KB in PostgreSQL and 16 KB in MySQL. Think of pages as the "boxes" that hold your data.
Why Pages?
Operating systems and hard drives work most efficiently when reading and writing data in chunks, not individual bytes. Pages allow databases to:
- Read efficiently: Load an entire page at once instead of individual rows
- Write efficiently: Flush pages to disk in batches
- Cache smartly: Keep frequently-used pages in memory (RAM)
Anatomy of a Page
Each page contains:
- Header: Metadata about the page (page number, free space, etc.)
- Row data: The actual data from your table rows
- Item pointers: References to where each row starts within the page
- Free space: Room for new data or updates
┌──────────────────────────────────────┐
│ Page Header │
├──────────────────────────────────────┤
│ │
│ Row Data │
│ (Alice, Bob, Carol...) │
│ │
├──────────────────────────────────────┤
│ Item Pointers │
└──────────────────────────────────────┘
How Tables Are Stored
Heap Files
Most database tables are stored as heap files—collections of unordered pages. When you insert a new row, the database typically:
- Finds a page with enough free space
- Adds the row to that page
- Updates the page's metadata
This means rows aren't necessarily stored in the order you inserted them!
Example: Inserting Data
-- First insert
INSERT INTO users (name, email, age)
VALUES ('Alice', 'alice@example.com', 28);
-- Stored in Page 1
-- Second insert
INSERT INTO users (name, email, age)
VALUES ('Bob', 'bob@example.com', 34);
-- Might be stored in Page 1 (if there's room) or Page 2
The database doesn't care about order—it cares about efficiently using available space.
How Databases Find Your Data
The Sequential Scan
When you run a simple query without optimization:
SELECT * FROM users WHERE age > 30;
The database performs a sequential scan (also called a table scan):
- Read Page 1 → Check each row → Keep matches
- Read Page 2 → Check each row → Keep matches
- Read Page 3 → Check each row → Keep matches
- Continue until all pages are read...
For a table with 1 million rows spread across 10,000 pages, this means reading all 10,000 pages. That's slow!
Indexes to the Rescue
To speed things up, databases use indexes—special data structures that act like a book's index, pointing directly to where data lives.
We'll cover indexes in detail in Module 7, but for now, know that they help the database jump directly to relevant pages instead of scanning everything.
The Role of Memory (RAM)
The Buffer Pool
Databases don't read from disk every time you query. They maintain a buffer pool (also called buffer cache)—a region of RAM that stores frequently-accessed pages.
How It Works:
- First query: Database reads pages from disk → stores in buffer pool → returns results
- Second query (same data): Database finds pages in buffer pool → returns results instantly (no disk access!)
This is why the same query often runs faster the second time!
Cache Hit vs. Cache Miss
- Cache hit: Data found in buffer pool (fast! ~microseconds)
- Cache miss: Data must be read from disk (slow! ~milliseconds)
Good database design aims to maximize cache hits.
Storage Hierarchy
Understanding the speed differences is crucial:
┌─────────────────────────────────────────────┐
│ CPU Registers (fastest) │
│ ~1 nanosecond │
├─────────────────────────────────────────────┤
│ CPU Cache (L1, L2, L3) │
│ ~10 nanoseconds │
├─────────────────────────────────────────────┤
│ RAM (Main Memory / Buffer Pool) │
│ ~100 nanoseconds │
├─────────────────────────────────────────────┤
│ SSD (Solid State Drive) │
│ ~100 microseconds (1000x slower than RAM) │
├─────────────────────────────────────────────┤
│ HDD (Hard Disk Drive) │
│ ~10 milliseconds (100x slower than SSD) │
└─────────────────────────────────────────────┘
Key takeaway: RAM is ~1,000x faster than SSD and ~100,000x faster than HDD. This is why caching matters!
Write-Ahead Logging (WAL)
The Durability Problem
When you execute an UPDATE or INSERT:
UPDATE users SET email = 'new@example.com' WHERE id = 1;
What happens if the database crashes mid-update? How do we ensure data isn't lost?
The Solution: WAL
Before modifying data pages, databases write changes to a Write-Ahead Log (WAL):
- Write change to WAL on disk (fast sequential write)
- Modify page in buffer pool (in RAM)
- Eventually flush the modified page to disk
If the database crashes:
- Changes in WAL but not on disk → Recovered by replaying WAL
- Changes in WAL and on disk → Already persisted
This guarantees durability—one of the ACID properties we'll discuss later.
Storage Engines: Under the Hood
Different databases use different storage engines to manage how data is stored and retrieved:
PostgreSQL: Heap Storage
- Rows stored in unordered pages
- MVCC (Multi-Version Concurrency Control) for concurrent access
- Great for general-purpose applications
MySQL: InnoDB
- Clustered index structure (data stored with primary key)
- ACID-compliant with crash recovery
- Popular for web applications
SQLite: B-Tree Storage
- Entire database in a single file
- Perfect for embedded systems and mobile apps
- Lightweight and fast for smaller datasets
Why This Matters for Writing SQL
Understanding storage helps you:
1. Write Faster Queries
-- Slow: Sequential scan through all pages
SELECT * FROM orders WHERE customer_id = 12345;
-- Fast: Use an index to jump to relevant pages
CREATE INDEX idx_customer ON orders(customer_id);
SELECT * FROM orders WHERE customer_id = 12345;
2. Design Better Schemas
- Smaller rows = More rows per page = Fewer disk reads
- Appropriate data types save space and improve performance
3. Understand Performance
- Why does the first query take 2 seconds but the second takes 10ms? (Cold vs. warm cache)
- Why do large JOINs get slow? (Many pages to read and combine)
Practical Implications
Example: Table Size Estimation
-- Create a users table
CREATE TABLE users (
id INTEGER, -- 4 bytes
name VARCHAR(100), -- ~100 bytes
email VARCHAR(100), -- ~100 bytes
age INTEGER -- 4 bytes
);
-- Total per row: ~208 bytes + overhead
Pages: PostgreSQL page = 8 KB = 8,192 bytes
Rows per page: 8,192 / 208 ≈ 39 rows per page
1 million rows: 1,000,000 / 39 ≈ 25,641 pages ≈ 200 MB on disk
Understanding this helps you estimate:
- How much disk space you need
- How much RAM to allocate for caching
- Query performance expectations
Key Takeaways
- Data is stored in pages (8 KB chunks), not individual rows
- Sequential scans read all pages; indexes help jump to specific pages
- Buffer pool caches pages in RAM for faster repeated access
- WAL ensures durability even if the database crashes
- Understanding storage helps you write faster, more efficient SQL
Next Steps
Now that you understand how databases store data physically, we'll explore the different types of databases in the next lesson: SQL vs NoSQL vs Vector Databases. You'll learn when to use each type and why SQL databases dominate the industry.

