Subqueries - Queries Within Queries

Introduction

You've written many SQL queries, but what if you need to use the result of one query inside another query?

Examples:

"Find customers who spent more than the average"
"Get products that have never been ordered"
"Show orders from the customer with the highest total spending"

Subqueries (also called nested queries or inner queries) are queries embedded within another query. They enable complex logic and dynamic filtering that would be difficult or impossible with simple queries.

What is a Subquery?

Basic Structure

SELECT column1
FROM table1
WHERE column2 = (SELECT column3 FROM table2 WHERE condition);
                 └────────── Subquery ──────────┘

The subquery executes first, then its result is used by the outer query.

Simple Example

-- Find products more expensive than the average
SELECT name, price
FROM products
WHERE price > (SELECT AVG(price) FROM products);

Execution:

Inner query: SELECT AVG(price) FROM products → Returns 45.99
Outer query: SELECT name, price FROM products WHERE price > 45.99

Result:

       name        | price
-------------------+--------
 Laptop Pro 15     | 999.99
 Wireless Mouse    |  79.99
 USB-C Hub         |  54.99

Types of Subqueries

1. Scalar Subqueries (Single Value)

Returns one row, one column (a single value).

-- Customers who spent more than the average customer
SELECT
    name,
    (SELECT SUM(total) FROM orders WHERE customer_id = c.id) AS total_spent
FROM customers c
WHERE (SELECT SUM(total) FROM orders WHERE customer_id = c.id) >
      (SELECT AVG(total_spent)
       FROM (SELECT SUM(total) AS total_spent
             FROM orders
             GROUP BY customer_id) AS customer_totals);

2. Column Subqueries (Multiple Rows, One Column)

Returns multiple rows, one column (a list of values).

-- Orders from customers in California
SELECT order_id, total
FROM orders
WHERE customer_id IN (
    SELECT id
    FROM customers
    WHERE state = 'CA'
);

3. Row Subqueries (One or More Rows, Multiple Columns)

Returns one or more complete rows.

-- Products with same price and category as product ID 5
SELECT name, price, category
FROM products
WHERE (price, category) = (
    SELECT price, category
    FROM products
    WHERE id = 5
);

4. Table Subqueries (Multiple Rows and Columns)

Returns a full result set (acts like a table).

-- Average order value per customer
SELECT
    customer_id,
    AVG(total) AS avg_order_value
FROM orders
GROUP BY customer_id
HAVING AVG(total) > (
    SELECT AVG(total)
    FROM orders
);

WHERE Clause Subqueries

Using = with Scalar Subqueries

-- Find the most expensive product
SELECT name, price
FROM products
WHERE price = (SELECT MAX(price) FROM products);

Using IN with Multiple Values

-- Customers who placed orders in the last 30 days
SELECT name, email
FROM customers
WHERE id IN (
    SELECT DISTINCT customer_id
    FROM orders
    WHERE order_date >= CURRENT_DATE - INTERVAL '30 days'
);

Using NOT IN

-- Customers who have never placed an order
SELECT name, email
FROM customers
WHERE id NOT IN (
    SELECT DISTINCT customer_id
    FROM orders
    WHERE customer_id IS NOT NULL
);

Warning: NOT IN with NULLs can cause unexpected behavior:

-- ❌ Returns no rows if subquery contains NULL
SELECT name FROM customers
WHERE id NOT IN (SELECT customer_id FROM orders);
-- If any customer_id is NULL, entire result is empty!

-- ✅ Better: Use NOT EXISTS or filter NULLs
SELECT name FROM customers
WHERE id NOT IN (
    SELECT customer_id FROM orders WHERE customer_id IS NOT NULL
);

Using Comparison Operators with Scalar Subqueries

-- Products cheaper than the average price
SELECT name, price
FROM products
WHERE price < (SELECT AVG(price) FROM products);

-- Orders larger than customer's average order
SELECT o.id, o.total
FROM orders o
WHERE o.total > (
    SELECT AVG(total)
    FROM orders
    WHERE customer_id = o.customer_id
);

FROM Clause Subqueries (Derived Tables)

Basic Derived Table

-- Average order value per customer, then filter
SELECT *
FROM (
    SELECT
        customer_id,
        AVG(total) AS avg_order_value
    FROM orders
    GROUP BY customer_id
) AS customer_avg
WHERE avg_order_value > 100;

The subquery creates a temporary "table" called customer_avg.

Practical Example: Top Customers

-- Top 10 customers by total spending
SELECT
    c.name,
    customer_totals.total_spent
FROM customers c
INNER JOIN (
    SELECT
        customer_id,
        SUM(total) AS total_spent
    FROM orders
    GROUP BY customer_id
    ORDER BY total_spent DESC
    LIMIT 10
) AS customer_totals ON c.id = customer_totals.customer_id
ORDER BY customer_totals.total_spent DESC;

SELECT Clause Subqueries (Correlated)

Adding Calculated Columns

-- Show each customer with their order count
SELECT
    name,
    email,
    (SELECT COUNT(*) FROM orders WHERE customer_id = c.id) AS order_count
FROM customers c;

Result:

      name       |        email         | order_count
-----------------+----------------------+-------------
 Alice Smith     | alice@example.com    |      12
 Bob Jones       | bob@example.com      |       5
 Carol White     | carol@example.com    |       0

Multiple Subqueries in SELECT

SELECT
    c.name,
    (SELECT COUNT(*) FROM orders WHERE customer_id = c.id) AS order_count,
    (SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer_id = c.id) AS total_spent,
    (SELECT MAX(order_date) FROM orders WHERE customer_id = c.id) AS last_order_date
FROM customers c;

Correlated Subqueries

What is a Correlated Subquery?

A correlated subquery references columns from the outer query. It executes once for each row of the outer query.

-- Products with above-average price in their category
SELECT name, category, price
FROM products p
WHERE price > (
    SELECT AVG(price)
    FROM products
    WHERE category = p.category  -- ← References outer query!
);

Execution:

For each product, the subquery runs with that product's category
Compares product's price to its category's average

Correlated vs Non-Correlated

-- Non-correlated: Runs once
SELECT name FROM products
WHERE price > (SELECT AVG(price) FROM products);

-- Correlated: Runs once per row
SELECT name FROM products p
WHERE price > (
    SELECT AVG(price)
    FROM products
    WHERE category = p.category
);

Performance Note

Correlated subqueries can be slow on large tables (runs N times for N rows). Consider using JOINs instead when possible.

EXISTS and NOT EXISTS

EXISTS - Check if Subquery Returns Any Rows

-- Customers who placed at least one order
SELECT name, email
FROM customers c
WHERE EXISTS (
    SELECT 1
    FROM orders
    WHERE customer_id = c.id
);

EXISTS returns TRUE if the subquery returns any rows (even one).

Why SELECT 1? The actual values don't matter—only whether rows exist. SELECT 1 is conventional and efficient.

NOT EXISTS - Check if Subquery Returns No Rows

-- Customers who have never placed an order
SELECT name, email
FROM customers c
WHERE NOT EXISTS (
    SELECT 1
    FROM orders
    WHERE customer_id = c.id
);

EXISTS vs IN

-- These are equivalent:

-- Using IN
SELECT name FROM customers
WHERE id IN (SELECT customer_id FROM orders);

-- Using EXISTS
SELECT name FROM customers c
WHERE EXISTS (
    SELECT 1 FROM orders WHERE customer_id = c.id
);

Differences:

EXISTS stops as soon as it finds one matching row (faster)
IN evaluates the entire subquery first
EXISTS handles NULLs better than IN

Recommendation: Use EXISTS for correlated subqueries, IN for simple lists.

ANY and ALL Operators

ANY (or SOME)

-- Products more expensive than ANY product in 'Books' category
SELECT name, price
FROM products
WHERE price > ANY (
    SELECT price
    FROM products
    WHERE category = 'Books'
);
-- Equivalent to: price > (minimum book price)

ALL

-- Products more expensive than ALL products in 'Books' category
SELECT name, price
FROM products
WHERE price > ALL (
    SELECT price
    FROM products
    WHERE category = 'Books'
);
-- Equivalent to: price > (maximum book price)

Practical Example

-- Customers who spent more than any customer from California
SELECT c.name, SUM(o.total) AS total_spent
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name
HAVING SUM(o.total) > ANY (
    SELECT SUM(total)
    FROM orders o2
    INNER JOIN customers c2 ON o2.customer_id = c2.id
    WHERE c2.state = 'CA'
    GROUP BY c2.id
);

Common Subquery Patterns

Pattern 1: Find Maximum with Details

-- Most expensive product (with all details)
SELECT *
FROM products
WHERE price = (SELECT MAX(price) FROM products);

Pattern 2: Above Average

-- Customers who spent more than average
SELECT
    c.name,
    SUM(o.total) AS total_spent
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name
HAVING SUM(o.total) > (
    SELECT AVG(total_per_customer)
    FROM (
        SELECT SUM(total) AS total_per_customer
        FROM orders
        GROUP BY customer_id
    ) AS customer_totals
);

Pattern 3: Top N per Group

-- Top 3 most expensive products per category
SELECT *
FROM products p1
WHERE (
    SELECT COUNT(*)
    FROM products p2
    WHERE p2.category = p1.category
      AND p2.price > p1.price
) < 3
ORDER BY category, price DESC;

Pattern 4: Latest Record per Group

-- Each customer's most recent order
SELECT c.name, o.order_date, o.total
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
WHERE o.order_date = (
    SELECT MAX(order_date)
    FROM orders
    WHERE customer_id = c.id
);

Subqueries vs JOINs

Many Subqueries Can Be Rewritten as JOINs

-- Using subquery
SELECT name
FROM customers
WHERE id IN (
    SELECT customer_id
    FROM orders
    WHERE total > 100
);

-- Using JOIN (often faster)
SELECT DISTINCT c.name
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
WHERE o.total > 100;

When to Use Subqueries

✅ When you need a single aggregated value
✅ For EXISTS/NOT EXISTS checks
✅ When the logic is clearer with nesting
✅ For operations on derived tables

When to Use JOINs

✅ For better performance on large datasets
✅ When combining data from multiple tables
✅ When you need columns from both tables
✅ For most production queries

Performance Considerations

Subqueries Can Be Slow

-- ❌ Slow: Correlated subquery runs N times
SELECT
    c.name,
    (SELECT COUNT(*) FROM orders WHERE customer_id = c.id)
FROM customers c;

-- ✅ Faster: Single JOIN + GROUP BY
SELECT
    c.name,
    COUNT(o.id)
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name;

Use EXPLAIN to Analyze

EXPLAIN ANALYZE
SELECT name
FROM products
WHERE price > (SELECT AVG(price) FROM products);

Common Mistakes

Mistake 1: Subquery Returns Multiple Rows with =

-- ❌ Error if subquery returns multiple rows
SELECT name FROM products
WHERE price = (SELECT price FROM products WHERE category = 'Books');

-- ✅ Use IN for multiple values
SELECT name FROM products
WHERE price IN (SELECT price FROM products WHERE category = 'Books');

Mistake 2: NOT IN with NULLs

-- ❌ Returns nothing if subquery contains NULL
SELECT name FROM customers
WHERE id NOT IN (SELECT customer_id FROM orders);

-- ✅ Filter NULLs
SELECT name FROM customers
WHERE id NOT IN (SELECT customer_id FROM orders WHERE customer_id IS NOT NULL);

-- ✅ Or use NOT EXISTS
SELECT name FROM customers c
WHERE NOT EXISTS (SELECT 1 FROM orders WHERE customer_id = c.id);

Mistake 3: Forgetting Alias for Derived Tables

-- ❌ Error: Subquery must have an alias
SELECT * FROM (
    SELECT customer_id, SUM(total)
    FROM orders
    GROUP BY customer_id
);

-- ✅ Add alias
SELECT * FROM (
    SELECT customer_id, SUM(total) AS total_spent
    FROM orders
    GROUP BY customer_id
) AS customer_totals;

Practice Exercises

Exercise 1

Find all products more expensive than the average product in their category.

Solution

SELECT name, category, price
FROM products p
WHERE price > (
    SELECT AVG(price)
    FROM products
    WHERE category = p.category
);

Exercise 2

Find customers who have never placed an order.

Solution

SELECT name, email
FROM customers
WHERE id NOT IN (
    SELECT DISTINCT customer_id
    FROM orders
    WHERE customer_id IS NOT NULL
);

-- OR using NOT EXISTS
SELECT name, email
FROM customers c
WHERE NOT EXISTS (
    SELECT 1
    FROM orders
    WHERE customer_id = c.id
);

Exercise 3

Find orders with a total greater than the average order total.

Solution

SELECT id, customer_id, total
FROM orders
WHERE total > (SELECT AVG(total) FROM orders)
ORDER BY total DESC;

Exercise 4

For each customer, show their name and total spent (using a subquery in SELECT).

Solution

SELECT
    name,
    (SELECT COALESCE(SUM(total), 0)
     FROM orders
     WHERE customer_id = c.id) AS total_spent
FROM customers c
ORDER BY total_spent DESC;

Key Takeaways

✅ Subqueries are queries nested inside other queries
✅ Scalar subqueries return a single value (use with comparison operators)
✅ Column subqueries return multiple rows (use with IN, ANY, ALL)
✅ Correlated subqueries reference the outer query (run once per row)
✅ EXISTS checks if subquery returns any rows (efficient)
✅ NOT IN with NULLs can cause problems (use NOT EXISTS or filter NULLs)
✅ Derived tables (FROM clause subqueries) act like temporary tables
✅ JOINs are often faster than subqueries
✅ Use EXPLAIN to analyze query performance

Next Steps

You've mastered subqueries! In the next lesson, we'll learn CTEs (Common Table Expressions)—a cleaner, more readable way to write complex queries with multiple subqueries.

Get ready to write more maintainable SQL!

Subqueries - Queries Within Queries

Introduction

You've written many SQL queries, but what if you need to use the result of one query inside another query?

Examples:

"Find customers who spent more than the average"
"Get products that have never been ordered"
"Show orders from the customer with the highest total spending"

What is a Subquery?

Basic Structure

SELECT column1
FROM table1
WHERE column2 = (SELECT column3 FROM table2 WHERE condition);
                 └────────── Subquery ──────────┘

The subquery executes first, then its result is used by the outer query.

Simple Example

-- Find products more expensive than the average
SELECT name, price
FROM products
WHERE price > (SELECT AVG(price) FROM products);

Execution:

Inner query: SELECT AVG(price) FROM products → Returns 45.99
Outer query: SELECT name, price FROM products WHERE price > 45.99

Result:

       name        | price
-------------------+--------
 Laptop Pro 15     | 999.99
 Wireless Mouse    |  79.99
 USB-C Hub         |  54.99

Types of Subqueries

1. Scalar Subqueries (Single Value)

Returns one row, one column (a single value).

-- Customers who spent more than the average customer
SELECT
    name,
    (SELECT SUM(total) FROM orders WHERE customer_id = c.id) AS total_spent
FROM customers c
WHERE (SELECT SUM(total) FROM orders WHERE customer_id = c.id) >
      (SELECT AVG(total_spent)
       FROM (SELECT SUM(total) AS total_spent
             FROM orders
             GROUP BY customer_id) AS customer_totals);

2. Column Subqueries (Multiple Rows, One Column)

Returns multiple rows, one column (a list of values).

-- Orders from customers in California
SELECT order_id, total
FROM orders
WHERE customer_id IN (
    SELECT id
    FROM customers
    WHERE state = 'CA'
);

3. Row Subqueries (One or More Rows, Multiple Columns)

Returns one or more complete rows.

-- Products with same price and category as product ID 5
SELECT name, price, category
FROM products
WHERE (price, category) = (
    SELECT price, category
    FROM products
    WHERE id = 5
);

4. Table Subqueries (Multiple Rows and Columns)

Returns a full result set (acts like a table).

-- Average order value per customer
SELECT
    customer_id,
    AVG(total) AS avg_order_value
FROM orders
GROUP BY customer_id
HAVING AVG(total) > (
    SELECT AVG(total)
    FROM orders
);

WHERE Clause Subqueries

Using = with Scalar Subqueries

-- Find the most expensive product
SELECT name, price
FROM products
WHERE price = (SELECT MAX(price) FROM products);

Using IN with Multiple Values

-- Customers who placed orders in the last 30 days
SELECT name, email
FROM customers
WHERE id IN (
    SELECT DISTINCT customer_id
    FROM orders
    WHERE order_date >= CURRENT_DATE - INTERVAL '30 days'
);

Using NOT IN

-- Customers who have never placed an order
SELECT name, email
FROM customers
WHERE id NOT IN (
    SELECT DISTINCT customer_id
    FROM orders
    WHERE customer_id IS NOT NULL
);

Warning: NOT IN with NULLs can cause unexpected behavior:

-- ❌ Returns no rows if subquery contains NULL
SELECT name FROM customers
WHERE id NOT IN (SELECT customer_id FROM orders);
-- If any customer_id is NULL, entire result is empty!

-- ✅ Better: Use NOT EXISTS or filter NULLs
SELECT name FROM customers
WHERE id NOT IN (
    SELECT customer_id FROM orders WHERE customer_id IS NOT NULL
);

Using Comparison Operators with Scalar Subqueries

-- Products cheaper than the average price
SELECT name, price
FROM products
WHERE price < (SELECT AVG(price) FROM products);

-- Orders larger than customer's average order
SELECT o.id, o.total
FROM orders o
WHERE o.total > (
    SELECT AVG(total)
    FROM orders
    WHERE customer_id = o.customer_id
);

FROM Clause Subqueries (Derived Tables)

Basic Derived Table

-- Average order value per customer, then filter
SELECT *
FROM (
    SELECT
        customer_id,
        AVG(total) AS avg_order_value
    FROM orders
    GROUP BY customer_id
) AS customer_avg
WHERE avg_order_value > 100;

The subquery creates a temporary "table" called customer_avg.

Practical Example: Top Customers

-- Top 10 customers by total spending
SELECT
    c.name,
    customer_totals.total_spent
FROM customers c
INNER JOIN (
    SELECT
        customer_id,
        SUM(total) AS total_spent
    FROM orders
    GROUP BY customer_id
    ORDER BY total_spent DESC
    LIMIT 10
) AS customer_totals ON c.id = customer_totals.customer_id
ORDER BY customer_totals.total_spent DESC;

SELECT Clause Subqueries (Correlated)

Adding Calculated Columns

-- Show each customer with their order count
SELECT
    name,
    email,
    (SELECT COUNT(*) FROM orders WHERE customer_id = c.id) AS order_count
FROM customers c;

Result:

      name       |        email         | order_count
-----------------+----------------------+-------------
 Alice Smith     | alice@example.com    |      12
 Bob Jones       | bob@example.com      |       5
 Carol White     | carol@example.com    |       0

Multiple Subqueries in SELECT

SELECT
    c.name,
    (SELECT COUNT(*) FROM orders WHERE customer_id = c.id) AS order_count,
    (SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer_id = c.id) AS total_spent,
    (SELECT MAX(order_date) FROM orders WHERE customer_id = c.id) AS last_order_date
FROM customers c;

Correlated Subqueries

What is a Correlated Subquery?

A correlated subquery references columns from the outer query. It executes once for each row of the outer query.

-- Products with above-average price in their category
SELECT name, category, price
FROM products p
WHERE price > (
    SELECT AVG(price)
    FROM products
    WHERE category = p.category  -- ← References outer query!
);

Execution:

For each product, the subquery runs with that product's category
Compares product's price to its category's average

Correlated vs Non-Correlated

-- Non-correlated: Runs once
SELECT name FROM products
WHERE price > (SELECT AVG(price) FROM products);

-- Correlated: Runs once per row
SELECT name FROM products p
WHERE price > (
    SELECT AVG(price)
    FROM products
    WHERE category = p.category
);

Performance Note

Correlated subqueries can be slow on large tables (runs N times for N rows). Consider using JOINs instead when possible.

EXISTS and NOT EXISTS

EXISTS - Check if Subquery Returns Any Rows

-- Customers who placed at least one order
SELECT name, email
FROM customers c
WHERE EXISTS (
    SELECT 1
    FROM orders
    WHERE customer_id = c.id
);

EXISTS returns TRUE if the subquery returns any rows (even one).

Why SELECT 1? The actual values don't matter—only whether rows exist. SELECT 1 is conventional and efficient.

NOT EXISTS - Check if Subquery Returns No Rows

-- Customers who have never placed an order
SELECT name, email
FROM customers c
WHERE NOT EXISTS (
    SELECT 1
    FROM orders
    WHERE customer_id = c.id
);

EXISTS vs IN

-- These are equivalent:

-- Using IN
SELECT name FROM customers
WHERE id IN (SELECT customer_id FROM orders);

-- Using EXISTS
SELECT name FROM customers c
WHERE EXISTS (
    SELECT 1 FROM orders WHERE customer_id = c.id
);

Differences:

EXISTS stops as soon as it finds one matching row (faster)
IN evaluates the entire subquery first
EXISTS handles NULLs better than IN

Recommendation: Use EXISTS for correlated subqueries, IN for simple lists.

ANY and ALL Operators

ANY (or SOME)

-- Products more expensive than ANY product in 'Books' category
SELECT name, price
FROM products
WHERE price > ANY (
    SELECT price
    FROM products
    WHERE category = 'Books'
);
-- Equivalent to: price > (minimum book price)

ALL

-- Products more expensive than ALL products in 'Books' category
SELECT name, price
FROM products
WHERE price > ALL (
    SELECT price
    FROM products
    WHERE category = 'Books'
);
-- Equivalent to: price > (maximum book price)

Practical Example

-- Customers who spent more than any customer from California
SELECT c.name, SUM(o.total) AS total_spent
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name
HAVING SUM(o.total) > ANY (
    SELECT SUM(total)
    FROM orders o2
    INNER JOIN customers c2 ON o2.customer_id = c2.id
    WHERE c2.state = 'CA'
    GROUP BY c2.id
);

Common Subquery Patterns

Pattern 1: Find Maximum with Details

-- Most expensive product (with all details)
SELECT *
FROM products
WHERE price = (SELECT MAX(price) FROM products);

Pattern 2: Above Average

-- Customers who spent more than average
SELECT
    c.name,
    SUM(o.total) AS total_spent
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name
HAVING SUM(o.total) > (
    SELECT AVG(total_per_customer)
    FROM (
        SELECT SUM(total) AS total_per_customer
        FROM orders
        GROUP BY customer_id
    ) AS customer_totals
);

Pattern 3: Top N per Group

-- Top 3 most expensive products per category
SELECT *
FROM products p1
WHERE (
    SELECT COUNT(*)
    FROM products p2
    WHERE p2.category = p1.category
      AND p2.price > p1.price
) < 3
ORDER BY category, price DESC;

Pattern 4: Latest Record per Group

-- Each customer's most recent order
SELECT c.name, o.order_date, o.total
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
WHERE o.order_date = (
    SELECT MAX(order_date)
    FROM orders
    WHERE customer_id = c.id
);

Subqueries vs JOINs

Many Subqueries Can Be Rewritten as JOINs

-- Using subquery
SELECT name
FROM customers
WHERE id IN (
    SELECT customer_id
    FROM orders
    WHERE total > 100
);

-- Using JOIN (often faster)
SELECT DISTINCT c.name
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
WHERE o.total > 100;

When to Use Subqueries

✅ When you need a single aggregated value
✅ For EXISTS/NOT EXISTS checks
✅ When the logic is clearer with nesting
✅ For operations on derived tables

When to Use JOINs

✅ For better performance on large datasets
✅ When combining data from multiple tables
✅ When you need columns from both tables
✅ For most production queries

Performance Considerations

Subqueries Can Be Slow

-- ❌ Slow: Correlated subquery runs N times
SELECT
    c.name,
    (SELECT COUNT(*) FROM orders WHERE customer_id = c.id)
FROM customers c;

-- ✅ Faster: Single JOIN + GROUP BY
SELECT
    c.name,
    COUNT(o.id)
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name;

Use EXPLAIN to Analyze

EXPLAIN ANALYZE
SELECT name
FROM products
WHERE price > (SELECT AVG(price) FROM products);

Common Mistakes

Mistake 1: Subquery Returns Multiple Rows with =

-- ❌ Error if subquery returns multiple rows
SELECT name FROM products
WHERE price = (SELECT price FROM products WHERE category = 'Books');

-- ✅ Use IN for multiple values
SELECT name FROM products
WHERE price IN (SELECT price FROM products WHERE category = 'Books');

Mistake 2: NOT IN with NULLs

-- ❌ Returns nothing if subquery contains NULL
SELECT name FROM customers
WHERE id NOT IN (SELECT customer_id FROM orders);

-- ✅ Filter NULLs
SELECT name FROM customers
WHERE id NOT IN (SELECT customer_id FROM orders WHERE customer_id IS NOT NULL);

-- ✅ Or use NOT EXISTS
SELECT name FROM customers c
WHERE NOT EXISTS (SELECT 1 FROM orders WHERE customer_id = c.id);

Mistake 3: Forgetting Alias for Derived Tables

-- ❌ Error: Subquery must have an alias
SELECT * FROM (
    SELECT customer_id, SUM(total)
    FROM orders
    GROUP BY customer_id
);

-- ✅ Add alias
SELECT * FROM (
    SELECT customer_id, SUM(total) AS total_spent
    FROM orders
    GROUP BY customer_id
) AS customer_totals;

Practice Exercises

Exercise 1

Find all products more expensive than the average product in their category.

Solution

SELECT name, category, price
FROM products p
WHERE price > (
    SELECT AVG(price)
    FROM products
    WHERE category = p.category
);

Exercise 2

Find customers who have never placed an order.

Solution

SELECT name, email
FROM customers
WHERE id NOT IN (
    SELECT DISTINCT customer_id
    FROM orders
    WHERE customer_id IS NOT NULL
);

-- OR using NOT EXISTS
SELECT name, email
FROM customers c
WHERE NOT EXISTS (
    SELECT 1
    FROM orders
    WHERE customer_id = c.id
);

Exercise 3

Find orders with a total greater than the average order total.

Solution

SELECT id, customer_id, total
FROM orders
WHERE total > (SELECT AVG(total) FROM orders)
ORDER BY total DESC;

Exercise 4

For each customer, show their name and total spent (using a subquery in SELECT).

Solution

SELECT
    name,
    (SELECT COALESCE(SUM(total), 0)
     FROM orders
     WHERE customer_id = c.id) AS total_spent
FROM customers c
ORDER BY total_spent DESC;

Key Takeaways

✅ Subqueries are queries nested inside other queries
✅ Scalar subqueries return a single value (use with comparison operators)
✅ Column subqueries return multiple rows (use with IN, ANY, ALL)
✅ Correlated subqueries reference the outer query (run once per row)
✅ EXISTS checks if subquery returns any rows (efficient)
✅ NOT IN with NULLs can cause problems (use NOT EXISTS or filter NULLs)
✅ Derived tables (FROM clause subqueries) act like temporary tables
✅ JOINs are often faster than subqueries
✅ Use EXPLAIN to analyze query performance

Next Steps

You've mastered subqueries! In the next lesson, we'll learn CTEs (Common Table Expressions)—a cleaner, more readable way to write complex queries with multiple subqueries.

Get ready to write more maintainable SQL!

Subqueries - Queries Within Queries

Introduction

What is a Subquery?

Basic Structure

Simple Example

Types of Subqueries

1. Scalar Subqueries (Single Value)

2. Column Subqueries (Multiple Rows, One Column)

3. Row Subqueries (One or More Rows, Multiple Columns)

4. Table Subqueries (Multiple Rows and Columns)

WHERE Clause Subqueries

Using = with Scalar Subqueries

Using IN with Multiple Values

Using NOT IN

Using Comparison Operators with Scalar Subqueries

FROM Clause Subqueries (Derived Tables)

Basic Derived Table

Practical Example: Top Customers

SELECT Clause Subqueries (Correlated)

Adding Calculated Columns

Multiple Subqueries in SELECT

Correlated Subqueries

What is a Correlated Subquery?

Correlated vs Non-Correlated

Performance Note

EXISTS and NOT EXISTS

EXISTS - Check if Subquery Returns Any Rows

NOT EXISTS - Check if Subquery Returns No Rows

EXISTS vs IN

ANY and ALL Operators

ANY (or SOME)

ALL

Practical Example

Common Subquery Patterns

Pattern 1: Find Maximum with Details

Pattern 2: Above Average

Pattern 3: Top N per Group

Pattern 4: Latest Record per Group

Subqueries vs JOINs

Many Subqueries Can Be Rewritten as JOINs

When to Use Subqueries

When to Use JOINs

Performance Considerations

Subqueries Can Be Slow

Use EXPLAIN to Analyze

Common Mistakes

Mistake 1: Subquery Returns Multiple Rows with =

Mistake 2: NOT IN with NULLs

Mistake 3: Forgetting Alias for Derived Tables

Practice Exercises

Exercise 1

Exercise 2

Exercise 3

Exercise 4

Key Takeaways

Next Steps

Quiz

Subqueries - Queries Within Queries

Introduction

What is a Subquery?

Basic Structure

Simple Example

Types of Subqueries

1. Scalar Subqueries (Single Value)

2. Column Subqueries (Multiple Rows, One Column)

3. Row Subqueries (One or More Rows, Multiple Columns)

4. Table Subqueries (Multiple Rows and Columns)

WHERE Clause Subqueries

Using = with Scalar Subqueries

Using IN with Multiple Values

Using NOT IN

Using Comparison Operators with Scalar Subqueries

FROM Clause Subqueries (Derived Tables)

Basic Derived Table

Practical Example: Top Customers

SELECT Clause Subqueries (Correlated)

Adding Calculated Columns

Multiple Subqueries in SELECT

Correlated Subqueries

What is a Correlated Subquery?