Self Joins and Multiple Joins
Introduction
So far, we've joined different tables together. But sometimes you need to join a table to itself (self join) or combine many tables at once (multiple joins). These advanced techniques unlock powerful queries for hierarchical data, complex relationships, and comprehensive reporting.
Self Joins
What is a Self Join?
A self join joins a table to itself, treating it as two separate tables using aliases.
Why Self Join?
Common use cases:
- Hierarchical data: Employees and their managers
- Comparisons: Find pairs of records
- Relationships within same table: Friends, siblings, etc.
Example: Employee-Manager Hierarchy
CREATE TABLE employees (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
manager_id INTEGER REFERENCES employees(id)
);
INSERT INTO employees (id, name, manager_id) VALUES
(1, 'Alice (CEO)', NULL), -- No manager
(2, 'Bob (VP)', 1), -- Reports to Alice
(3, 'Carol (VP)', 1), -- Reports to Alice
(4, 'Dave (Manager)', 2), -- Reports to Bob
(5, 'Eve (Manager)', 2); -- Reports to Bob
Find each employee with their manager:
SELECT
e.name AS employee,
m.name AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;
Result:
employee | manager
-----------------+-------------
Alice (CEO) | NULL ← No manager
Bob (VP) | Alice (CEO)
Carol (VP) | Alice (CEO)
Dave (Manager) | Bob (VP)
Eve (Manager) | Bob (VP)
Explanation:
employees e: "Employee" version of the tableemployees m: "Manager" version of the same tablee.manager_id = m.id: Match employee's manager_id to manager's id
Self Join Pattern: Find Pairs
Find employees hired on the same date:
SELECT
e1.name AS employee1,
e2.name AS employee2,
e1.hire_date
FROM employees e1
INNER JOIN employees e2 ON e1.hire_date = e2.hire_date
WHERE e1.id < e2.id; -- Avoid duplicate pairs and self-matches
Result:
employee1 | employee2 | hire_date
-----------+------------+-----------
Bob | Carol | 2023-01-15
Dave | Eve | 2023-03-10
WHERE e1.id < e2.id ensures:
- No self-matches (Bob with Bob)
- No duplicates (Bob-Carol and Carol-Bob)
Self Join Pattern: Hierarchical Levels
Find all employees and their manager's manager (skip-level):
SELECT
e.name AS employee,
m.name AS manager,
mm.name AS managers_manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id
LEFT JOIN employees mm ON m.manager_id = mm.id;
Result:
employee | manager | managers_manager
-----------------+--------------+------------------
Alice (CEO) | NULL | NULL
Bob (VP) | Alice (CEO) | NULL
Carol (VP) | Alice (CEO) | NULL
Dave (Manager) | Bob (VP) | Alice (CEO)
Eve (Manager) | Bob (VP) | Alice (CEO)
Multiple Joins
Joining Many Tables
Combine data from 3+ tables in one query.
Example Setup: E-commerce Database
CREATE TABLE customers (
id SERIAL PRIMARY KEY,
name VARCHAR(100)
);
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_id INTEGER REFERENCES customers(id),
order_date DATE
);
CREATE TABLE order_items (
id SERIAL PRIMARY KEY,
order_id INTEGER REFERENCES orders(id),
product_id INTEGER REFERENCES products(id),
quantity INTEGER,
price DECIMAL(10,2)
);
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
category VARCHAR(50)
);
CREATE TABLE categories (
name VARCHAR(50) PRIMARY KEY,
description TEXT
);
Join All Tables
SELECT
c.name AS customer,
o.id AS order_id,
o.order_date,
p.name AS product,
cat.description AS category,
oi.quantity,
oi.quantity * oi.price AS item_total
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
INNER JOIN order_items oi ON o.id = oi.order_id
INNER JOIN products p ON oi.product_id = p.id
INNER JOIN categories cat ON p.category = cat.name;
Result:
customer | order_id | order_date | product | category | quantity | item_total
----------+----------+------------+-----------+-------------------+----------+------------
Alice | 1 | 2024-01-15 | Laptop | Electronics | 1 | 999.99
Alice | 1 | 2024-01-15 | Mouse | Electronics | 2 | 51.98
Bob | 2 | 2024-01-18 | Desk | Furniture | 1 | 299.99
Join Order Matters for Readability
-- ✅ Logical flow: Customer → Order → Items → Product
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
INNER JOIN order_items oi ON o.id = oi.order_id
INNER JOIN products p ON oi.product_id = p.id;
-- ❌ Confusing flow
FROM order_items oi
INNER JOIN products p ON oi.product_id = p.id
INNER JOIN orders o ON oi.order_id = o.id
INNER JOIN customers c ON o.customer_id = c.id;
Both work, but the first is easier to understand.
Mixing JOIN Types
-- All customers (even without orders), with their order details
SELECT
c.name,
o.id AS order_id,
p.name AS product
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id -- Keep all customers
LEFT JOIN order_items oi ON o.id = oi.order_id
INNER JOIN products p ON oi.product_id = p.id; -- Only valid products
Careful: Mixing LEFT and INNER can filter out rows unintentionally!
Complex Query Patterns
Pattern 1: Sales Report by Customer and Category
SELECT
c.name AS customer,
cat.name AS category,
SUM(oi.quantity * oi.price) AS total_spent
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
INNER JOIN order_items oi ON o.id = oi.order_id
INNER JOIN products p ON oi.product_id = p.id
INNER JOIN categories cat ON p.category = cat.name
GROUP BY c.id, c.name, cat.name
ORDER BY customer, total_spent DESC;
Pattern 2: Find Customers Who Bought Product X and Product Y
-- Customers who bought both 'Laptop' and 'Mouse'
SELECT c.name
FROM customers c
WHERE EXISTS (
SELECT 1
FROM orders o
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id
WHERE o.customer_id = c.id AND p.name = 'Laptop'
)
AND EXISTS (
SELECT 1
FROM orders o
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id
WHERE o.customer_id = c.id AND p.name = 'Mouse'
);
(We'll cover EXISTS in Module 6)
Pattern 3: Organizational Chart (Recursive Hierarchy)
-- WITH RECURSIVE (PostgreSQL)
WITH RECURSIVE org_chart AS (
-- Base case: Top-level employees
SELECT id, name, manager_id, 1 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive case: Employees under managers
SELECT e.id, e.name, e.manager_id, oc.level + 1
FROM employees e
INNER JOIN org_chart oc ON e.manager_id = oc.id
)
SELECT
REPEAT(' ', level - 1) || name AS org_structure,
level
FROM org_chart
ORDER BY level, name;
Result:
org_structure | level
----------------------+-------
Alice (CEO) | 1
Bob (VP) | 2
Carol (VP) | 2
Dave (Manager) | 3
Eve (Manager) | 3
Performance with Multiple Joins
Index All Foreign Keys
CREATE INDEX idx_orders_customer_id ON orders(customer_id);
CREATE INDEX idx_order_items_order_id ON order_items(order_id);
CREATE INDEX idx_order_items_product_id ON order_items(product_id);
CREATE INDEX idx_products_category ON products(category);
Without indexes, multi-table JOINs can be very slow!
Join Order Optimization
PostgreSQL's query planner usually optimizes join order automatically, but you can help:
-- ✅ Filter early
SELECT c.name, p.name
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
WHERE c.active = TRUE -- Filter before joining more tables
INNER JOIN order_items oi ON o.id = oi.order_id
INNER JOIN products p ON oi.product_id = p.id;
EXPLAIN ANALYZE
See how the database executes your query:
EXPLAIN ANALYZE
SELECT c.name, p.name
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
INNER JOIN order_items oi ON o.id = oi.order_id
INNER JOIN products p ON oi.product_id = p.id;
(We'll cover query optimization in Module 7)
Common Pitfalls
Pitfall 1: Cartesian Product
-- ❌ Missing JOIN condition
SELECT c.name, p.name
FROM customers c, products p;
-- Returns EVERY customer with EVERY product!
-- 100 customers × 1000 products = 100,000 rows!
Always use explicit JOIN with ON condition.
Pitfall 2: Wrong JOIN Type
-- Want all customers, even those without orders
-- ❌ Wrong: Uses INNER JOIN
SELECT c.name, COUNT(o.id) AS order_count
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name;
-- Excludes customers with 0 orders!
-- ✅ Correct: Uses LEFT JOIN
SELECT c.name, COUNT(o.id) AS order_count
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name;
Pitfall 3: Ambiguous Columns
-- ❌ Error: Both tables have "id"
SELECT id FROM customers
JOIN orders ON customers.id = orders.customer_id;
-- ✅ Correct: Qualify column
SELECT customers.id FROM customers
JOIN orders ON customers.id = orders.customer_id;
Pitfall 4: Self Join Without WHERE Filter
-- ❌ Creates duplicate pairs
SELECT e1.name, e2.name
FROM employees e1
JOIN employees e2 ON e1.hire_date = e2.hire_date;
-- Results include: (Alice, Alice), (Alice, Bob), (Bob, Alice), (Bob, Bob)
-- ✅ Avoid duplicates
SELECT e1.name, e2.name
FROM employees e1
JOIN employees e2 ON e1.hire_date = e2.hire_date
WHERE e1.id < e2.id;
-- Results: (Alice, Bob) only
Practice Exercises
Exercise 1
Find each employee with their manager's name (self join).
Solution
SELECT
e.name AS employee,
m.name AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;
Exercise 2
Find all customers who placed orders in 2024, with product names and quantities.
Solution
SELECT
c.name AS customer,
p.name AS product,
oi.quantity
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
INNER JOIN order_items oi ON o.id = oi.order_id
INNER JOIN products p ON oi.product_id = p.id
WHERE EXTRACT(YEAR FROM o.order_date) = 2024;
Exercise 3
Find pairs of products bought together (in the same order).
Solution
SELECT DISTINCT
p1.name AS product1,
p2.name AS product2
FROM order_items oi1
INNER JOIN order_items oi2 ON oi1.order_id = oi2.order_id
INNER JOIN products p1 ON oi1.product_id = p1.id
INNER JOIN products p2 ON oi2.product_id = p2.id
WHERE oi1.product_id < oi2.product_id; -- Avoid duplicates
Key Takeaways
- ✅ Self join: Join table to itself using aliases
- ✅ Use for: Hierarchies, comparisons, relationships within same table
- ✅ Multiple joins: Combine 3+ tables in one query
- ✅ Join order: Follow logical relationships for readability
- ✅ Mix JOIN types: LEFT/INNER combinations for specific needs
- ✅ Index foreign keys: Critical for multi-table query performance
- ✅ Avoid cartesian products: Always use proper ON conditions
Next Steps
You've mastered all JOIN types! In the next lesson, we'll discuss When to Use Which Join—practical guidelines for choosing the right JOIN for your query.
Almost done with Module 4!

