Primary Keys and Constraints - Data Integrity
Introduction
Tables without rules lead to messy data: duplicate emails, negative prices, orders without customers. Constraints are the rules that keep your data clean and consistent.
This lesson covers:
- Primary keys (unique row identifiers)
- Unique constraints
- Check constraints
- Default values
- Data integrity best practices
Primary Keys
A primary key uniquely identifies each row in a table.
Properties of Primary Keys
- UNIQUE: No two rows can have the same primary key value
- NOT NULL: Primary key cannot be NULL
- Immutable: Should never change once set
Single-Column Primary Key
CREATE TABLE customers (
id SERIAL PRIMARY KEY, -- Auto-incrementing ID
name VARCHAR(100) NOT NULL
);
SERIAL automatically generates: 1, 2, 3, 4, ...
Composite Primary Key
CREATE TABLE order_items (
order_id INTEGER,
product_id INTEGER,
quantity INTEGER,
PRIMARY KEY (order_id, product_id) -- Combination must be unique
);
Each (order_id, product_id) pair is unique.
Natural vs Surrogate Keys
Natural key: Real-world identifier
CREATE TABLE countries (
code CHAR(2) PRIMARY KEY, -- 'US', 'CA', 'UK' (natural)
name VARCHAR(100) NOT NULL
);
Surrogate key: Artificial identifier
CREATE TABLE customers (
id SERIAL PRIMARY KEY, -- Artificial (surrogate)
email VARCHAR(255) UNIQUE NOT NULL
);
Recommendation: Use surrogate keys (SERIAL/BIGSERIAL/UUID) for most tables!
SERIAL and BIGSERIAL
SERIAL (auto-incrementing INTEGER)
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(100)
);
INSERT INTO products (name) VALUES ('Laptop'), ('Mouse');
SELECT * FROM products;
Result:
id | name
----+--------
1 | Laptop
2 | Mouse
IDs generated automatically!
BIGSERIAL (for large tables)
CREATE TABLE logs (
id BIGSERIAL PRIMARY KEY, -- Supports billions of rows
message TEXT,
created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);
When to use:
- SERIAL: Up to 2 billion rows
- BIGSERIAL: More than 2 billion rows
UUID Primary Keys
CREATE EXTENSION IF NOT EXISTS "uuid-ossp"; -- Enable UUID functions
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
username VARCHAR(50) UNIQUE NOT NULL
);
INSERT INTO users (username) VALUES ('alice');
SELECT * FROM users;
Result:
id | username
--------------------------------------+----------
a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11 | alice
Pros:
- ✅ Globally unique (no collisions across databases)
- ✅ Hard to guess (better security)
- ✅ Can be generated client-side
Cons:
- ❌ Larger storage (16 bytes vs 4 for INTEGER)
- ❌ Slower index performance
- ❌ Not human-friendly
UNIQUE Constraint
Ensures column values are unique (but allows NULL).
Single-Column UNIQUE
CREATE TABLE users (
id SERIAL PRIMARY KEY,
email VARCHAR(255) UNIQUE NOT NULL,
username VARCHAR(50) UNIQUE NOT NULL
);
-- ❌ Error: Duplicate email
INSERT INTO users (email, username) VALUES
('alice@example.com', 'alice'),
('alice@example.com', 'alice2'); -- Error!
Composite UNIQUE Constraint
CREATE TABLE enrollments (
id SERIAL PRIMARY KEY,
student_id INTEGER NOT NULL,
course_id INTEGER NOT NULL,
UNIQUE (student_id, course_id) -- Each student can enroll in a course once
);
-- ✅ OK: Different student
INSERT INTO enrollments (student_id, course_id) VALUES (1, 101);
INSERT INTO enrollments (student_id, course_id) VALUES (2, 101);
-- ❌ Error: Same (student_id, course_id)
INSERT INTO enrollments (student_id, course_id) VALUES (1, 101);
UNIQUE with NULL
CREATE TABLE employees (
id SERIAL PRIMARY KEY,
email VARCHAR(255) UNIQUE,
manager_id INTEGER UNIQUE -- Multiple NULLs allowed!
);
-- ✅ OK: NULL is not considered duplicate
INSERT INTO employees (email, manager_id) VALUES
('alice@example.com', NULL),
('bob@example.com', NULL),
('carol@example.com', 1);
Note: In SQL, NULL != NULL, so multiple NULLs don't violate UNIQUE!
NOT NULL Constraint
Prevents NULL values in a column.
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL, -- Name is required
description TEXT, -- Description is optional
price DECIMAL(10,2) NOT NULL -- Price is required
);
-- ❌ Error: name cannot be NULL
INSERT INTO products (description, price) VALUES ('A product', 99.99);
-- ✅ OK
INSERT INTO products (name, price) VALUES ('Laptop', 999.99);
Adding NOT NULL to Existing Column
-- First, ensure no NULLs exist
UPDATE products SET description = 'No description' WHERE description IS NULL;
-- Then add constraint
ALTER TABLE products ALTER COLUMN description SET NOT NULL;
CHECK Constraints
Enforces custom rules on column values.
Simple CHECK
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
price DECIMAL(10,2) CHECK (price >= 0), -- Price must be non-negative
stock INTEGER CHECK (stock >= 0),
rating DECIMAL(3,2) CHECK (rating BETWEEN 0 AND 5)
);
-- ❌ Error: price < 0
INSERT INTO products (name, price, stock) VALUES ('Laptop', -999.99, 10);
-- ✅ OK
INSERT INTO products (name, price, stock, rating) VALUES ('Laptop', 999.99, 10, 4.5);
CHECK with Multiple Conditions
CREATE TABLE reservations (
id SERIAL PRIMARY KEY,
check_in DATE NOT NULL,
check_out DATE NOT NULL,
CHECK (check_out > check_in) -- Check-out must be after check-in
);
CHECK with IN
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
status VARCHAR(20) DEFAULT 'pending',
CHECK (status IN ('pending', 'processing', 'shipped', 'delivered', 'cancelled'))
);
-- ❌ Error: Invalid status
INSERT INTO orders (status) VALUES ('invalid_status');
-- ✅ OK
INSERT INTO orders (status) VALUES ('shipped');
Named CHECK Constraints
CREATE TABLE products (
id SERIAL PRIMARY KEY,
price DECIMAL(10,2),
CONSTRAINT chk_price_positive CHECK (price >= 0),
CONSTRAINT chk_price_reasonable CHECK (price <= 1000000)
);
-- Easier to identify in error messages!
DEFAULT Values
Automatically populate columns when not provided.
Simple DEFAULT
CREATE TABLE users (
id SERIAL PRIMARY KEY,
username VARCHAR(50) NOT NULL,
active BOOLEAN DEFAULT TRUE,
role VARCHAR(20) DEFAULT 'user',
created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);
INSERT INTO users (username) VALUES ('alice');
SELECT * FROM users;
Result:
id | username | active | role | created_at
----+----------+--------+------+---------------------------
1 | alice | true | user | 2024-12-20 14:30:00+00
All defaults filled automatically!
DEFAULT with Functions
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
Overriding DEFAULT
INSERT INTO users (username, active) VALUES ('bob', FALSE);
-- Uses FALSE instead of default TRUE
Combining Constraints
CREATE TABLE users (
id SERIAL PRIMARY KEY,
email VARCHAR(255) UNIQUE NOT NULL,
username VARCHAR(50) UNIQUE NOT NULL CHECK (LENGTH(username) >= 3),
age INTEGER CHECK (age >= 18 AND age <= 120),
status VARCHAR(20) DEFAULT 'active' CHECK (status IN ('active', 'suspended', 'deleted')),
created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP NOT NULL
);
Modifying Constraints
Add Constraint
ALTER TABLE products
ADD CONSTRAINT chk_stock_positive CHECK (stock >= 0);
Drop Constraint
ALTER TABLE products
DROP CONSTRAINT chk_stock_positive;
Add NOT NULL
ALTER TABLE products
ALTER COLUMN description SET NOT NULL;
Remove NOT NULL
ALTER TABLE products
ALTER COLUMN description DROP NOT NULL;
Constraint Validation
Immediate vs Deferred (Advanced)
-- Immediate: Checked after every statement (default)
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
total DECIMAL(10,2) CHECK (total >= 0)
);
-- Deferred: Checked at transaction commit (PostgreSQL)
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
total DECIMAL(10,2),
CONSTRAINT chk_total_positive CHECK (total >= 0) DEFERRABLE INITIALLY DEFERRED
);
Best Practices
1. Always Use Primary Keys
-- ✅ Good
CREATE TABLE logs (
id BIGSERIAL PRIMARY KEY,
message TEXT
);
-- ❌ Bad: No primary key
CREATE TABLE logs (
message TEXT
);
2. Add UNIQUE on Business Keys
CREATE TABLE users (
id SERIAL PRIMARY KEY, -- Surrogate key
email VARCHAR(255) UNIQUE NOT NULL, -- Business key (should be unique)
username VARCHAR(50) UNIQUE NOT NULL
);
3. Use CHECK for Valid Values
CREATE TABLE products (
price DECIMAL(10,2) CHECK (price >= 0),
rating DECIMAL(3,2) CHECK (rating BETWEEN 0 AND 5),
status VARCHAR(20) CHECK (status IN ('active', 'inactive', 'discontinued'))
);
4. NOT NULL for Required Fields
CREATE TABLE customers (
id SERIAL PRIMARY KEY,
email VARCHAR(255) NOT NULL, -- Email is required
name VARCHAR(100) NOT NULL, -- Name is required
phone VARCHAR(20) -- Phone is optional
);
5. Name Your Constraints
CREATE TABLE products (
id SERIAL,
name VARCHAR(100),
price DECIMAL(10,2),
CONSTRAINT pk_products PRIMARY KEY (id),
CONSTRAINT uq_product_name UNIQUE (name),
CONSTRAINT chk_price_positive CHECK (price >= 0)
);
Common Patterns
Pattern 1: Audit Fields
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_id INTEGER NOT NULL,
total DECIMAL(10,2) NOT NULL,
created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP NOT NULL,
updated_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP NOT NULL,
created_by INTEGER,
updated_by INTEGER
);
Pattern 2: Soft Delete
CREATE TABLE users (
id SERIAL PRIMARY KEY,
email VARCHAR(255) UNIQUE NOT NULL,
deleted_at TIMESTAMPTZ, -- NULL = active, set = deleted
CHECK (deleted_at IS NULL OR deleted_at >= created_at)
);
Pattern 3: Status Enum
CREATE TYPE order_status AS ENUM ('pending', 'processing', 'shipped', 'delivered', 'cancelled');
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
status order_status DEFAULT 'pending' NOT NULL
);
Practice Exercises
Exercise 1
Create a products table with id, name (required, unique), price (required, >= 0), stock (default 0, >= 0).
Solution
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(200) UNIQUE NOT NULL,
price DECIMAL(10,2) NOT NULL CHECK (price >= 0),
stock INTEGER DEFAULT 0 CHECK (stock >= 0)
);
Exercise 2
Create a users table with email (unique), age (18-120), status (active/inactive/suspended), created_at.
Solution
CREATE TABLE users (
id SERIAL PRIMARY KEY,
email VARCHAR(255) UNIQUE NOT NULL,
age INTEGER CHECK (age BETWEEN 18 AND 120),
status VARCHAR(20) DEFAULT 'active' CHECK (status IN ('active', 'inactive', 'suspended')),
created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP NOT NULL
);
Key Takeaways
- ✅ PRIMARY KEY uniquely identifies each row (UNIQUE + NOT NULL)
- ✅ SERIAL/BIGSERIAL for auto-incrementing IDs
- ✅ UUID for globally unique, distributed-friendly keys
- ✅ UNIQUE prevents duplicate values
- ✅ NOT NULL enforces required fields
- ✅ CHECK validates custom business rules
- ✅ DEFAULT provides automatic values
- ✅ Name constraints for better error messages
- ✅ Combine constraints for robust data validation
Next Steps
You've mastered primary keys and constraints! In the next lesson, we'll explore Foreign Keys—the relationships between tables that maintain referential integrity.
Let's connect our data!

