Aggregate Functions - Summarizing Data

Introduction

So far, you've queried individual rows of data. But what if you want to answer questions like:

"How many customers do we have?"
"What's the average order total?"
"What's the highest price in our inventory?"

Aggregate functions compute a single result from a set of rows. They're essential for data analysis, reporting, and understanding your database at a high level.

The Five Core Aggregate Functions

PostgreSQL provides five fundamental aggregate functions:

Function	Purpose	Example
`COUNT()`	Count rows	How many orders?
`SUM()`	Add values	Total revenue
`AVG()`	Calculate average	Average order value
`MIN()`	Find minimum	Cheapest product
`MAX()`	Find maximum	Most expensive product

COUNT() - Counting Rows

COUNT(*) - Count All Rows

-- How many customers do we have?
SELECT COUNT(*) FROM customers;

Result:

 count
-------
   150

COUNT(*) counts all rows, including those with NULL values.

COUNT(column) - Count Non-NULL Values

-- How many customers have a phone number?
SELECT COUNT(phone) FROM customers;

Result:

 count
-------
   127

COUNT(column) only counts rows where the column is NOT NULL.

COUNT(DISTINCT column) - Count Unique Values

-- How many different cities do our customers live in?
SELECT COUNT(DISTINCT city) FROM customers;

Result:

 count
-------
    42

Removes duplicates before counting.

SUM() - Adding Values

Basic SUM

-- What's our total revenue from all orders?
SELECT SUM(total) AS total_revenue
FROM orders;

Result:

 total_revenue
---------------
    125847.50

SUM with WHERE

-- Total revenue from orders in 2024
SELECT SUM(total) AS revenue_2024
FROM orders
WHERE EXTRACT(YEAR FROM order_date) = 2024;

Result:

 revenue_2024
--------------
    87520.25

SUM Ignores NULLs

CREATE TABLE donations (
    id SERIAL PRIMARY KEY,
    donor VARCHAR(100),
    amount DECIMAL(10,2)
);

INSERT INTO donations VALUES
    (1, 'Alice', 100.00),
    (2, 'Bob', NULL),      -- No donation yet
    (3, 'Carol', 50.00);

SELECT SUM(amount) FROM donations;

Result:

  sum
--------
 150.00

Bob's NULL is ignored, not treated as 0.

AVG() - Calculating Average

Basic AVG

-- What's the average order total?
SELECT AVG(total) AS avg_order_value
FROM orders;

Result:

 avg_order_value
-----------------
          125.85

AVG vs SUM/COUNT

-- These are equivalent:
SELECT AVG(total) FROM orders;

SELECT SUM(total) / COUNT(total) FROM orders;

AVG Ignores NULLs

-- Average donation (NULL excluded)
SELECT AVG(amount) FROM donations;
-- Result: 75.00 (not 50.00!)
-- Calculation: (100 + 50) / 2 = 75

Rounding AVG Results

SELECT ROUND(AVG(total), 2) AS avg_order_value
FROM orders;

Result:

 avg_order_value
-----------------
          125.85

MIN() and MAX() - Finding Extremes

Numeric MIN/MAX

-- Cheapest and most expensive products
SELECT
    MIN(price) AS cheapest,
    MAX(price) AS most_expensive
FROM products;

Result:

 cheapest | most_expensive
----------+----------------
     5.99 |         999.99

Date MIN/MAX

-- First and most recent order dates
SELECT
    MIN(order_date) AS first_order,
    MAX(order_date) AS latest_order
FROM orders;

Result:

 first_order | latest_order
-------------+--------------
  2023-01-05 |   2024-12-20

Text MIN/MAX

-- First and last customer names alphabetically
SELECT
    MIN(name) AS first_alphabetically,
    MAX(name) AS last_alphabetically
FROM customers;

Result:

 first_alphabetically | last_alphabetically
----------------------+---------------------
 Aaron Smith          | Zoe Williams

Combining Multiple Aggregates

-- Order statistics
SELECT
    COUNT(*) AS total_orders,
    SUM(total) AS total_revenue,
    AVG(total) AS avg_order_value,
    MIN(total) AS smallest_order,
    MAX(total) AS largest_order
FROM orders;

Result:

 total_orders | total_revenue | avg_order_value | smallest_order | largest_order
--------------+---------------+-----------------+----------------+---------------
         1000 |    125847.50  |          125.85 |           5.99 |        999.99

Aggregate Functions with WHERE

Filter Before Aggregating

-- Statistics for orders over $100
SELECT
    COUNT(*) AS large_order_count,
    AVG(total) AS avg_large_order
FROM orders
WHERE total > 100;

Result:

 large_order_count | avg_large_order
-------------------+-----------------
               487 |          215.42

Execution order:

Filter rows with WHERE
Apply aggregate functions to filtered rows

Common Pattern: Active vs Total

-- Active vs inactive customers
SELECT
    COUNT(*) AS total_customers,
    COUNT(CASE WHEN active = TRUE THEN 1 END) AS active_customers,
    COUNT(CASE WHEN active = FALSE THEN 1 END) AS inactive_customers
FROM customers;

Result:

 total_customers | active_customers | inactive_customers
-----------------+------------------+--------------------
             150 |              127 |                 23

Aggregate Functions with JOINs

Count Related Records

-- Total order count and revenue per customer
SELECT
    c.name,
    COUNT(o.id) AS order_count,
    COALESCE(SUM(o.total), 0) AS total_spent
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name;

Result:

      name       | order_count | total_spent
-----------------+-------------+-------------
 Alice Smith     |      12     |    1547.82
 Bob Jones       |       5     |     687.25
 Carol White     |       0     |       0.00

Important: Use LEFT JOIN to include customers with 0 orders!

Aggregate from Multiple Tables

-- Product sales summary
SELECT
    p.name,
    COUNT(oi.id) AS times_ordered,
    COALESCE(SUM(oi.quantity), 0) AS units_sold,
    COALESCE(SUM(oi.quantity * oi.price), 0) AS revenue
FROM products p
LEFT JOIN order_items oi ON p.id = oi.product_id
GROUP BY p.id, p.name;

NULL Handling in Aggregates

COUNT(*) vs COUNT(column)

CREATE TABLE employees (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    department VARCHAR(50),
    manager_id INTEGER
);

INSERT INTO employees VALUES
    (1, 'Alice', 'Sales', NULL),     -- No manager (CEO)
    (2, 'Bob', 'Sales', 1),
    (3, 'Carol', 'Engineering', 1);

-- Count all employees
SELECT COUNT(*) FROM employees;
-- Result: 3

-- Count employees with managers
SELECT COUNT(manager_id) FROM employees;
-- Result: 2 (Alice's NULL excluded)

SUM/AVG/MIN/MAX Ignore NULLs

-- Average with NULLs
SELECT AVG(manager_id) FROM employees;
-- Result: 1.0 (only Bob and Carol counted)
-- NOT 0.67 (which would be if NULL counted as 0)

COALESCE for Default Values

-- Treat NULL as 0
SELECT
    name,
    COALESCE(manager_id, 0) AS manager_id
FROM employees;

Common Aggregate Patterns

Pattern 1: Data Quality Check

-- How complete is our customer data?
SELECT
    COUNT(*) AS total_customers,
    COUNT(email) AS has_email,
    COUNT(phone) AS has_phone,
    COUNT(address) AS has_address,
    ROUND(100.0 * COUNT(email) / COUNT(*), 2) AS email_pct,
    ROUND(100.0 * COUNT(phone) / COUNT(*), 2) AS phone_pct
FROM customers;

Result:

 total_customers | has_email | has_phone | has_address | email_pct | phone_pct
-----------------+-----------+-----------+-------------+-----------+-----------
             150 |       150 |       127 |         142 |    100.00 |     84.67

Pattern 2: Revenue Summary

-- Monthly revenue summary for 2024
SELECT
    DATE_TRUNC('month', order_date) AS month,
    COUNT(*) AS order_count,
    SUM(total) AS revenue,
    AVG(total) AS avg_order_value,
    MIN(total) AS smallest_order,
    MAX(total) AS largest_order
FROM orders
WHERE EXTRACT(YEAR FROM order_date) = 2024
GROUP BY DATE_TRUNC('month', order_date)
ORDER BY month;

Pattern 3: Top Performers

-- Top 5 customers by total spending
SELECT
    c.name,
    COUNT(o.id) AS order_count,
    SUM(o.total) AS total_spent
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name
ORDER BY total_spent DESC
LIMIT 5;

FILTER Clause (PostgreSQL Extension)

Conditional Aggregates

-- Orders by status
SELECT
    COUNT(*) AS total_orders,
    COUNT(*) FILTER (WHERE status = 'pending') AS pending,
    COUNT(*) FILTER (WHERE status = 'shipped') AS shipped,
    COUNT(*) FILTER (WHERE status = 'delivered') AS delivered
FROM orders;

Result:

 total_orders | pending | shipped | delivered
--------------+---------+---------+-----------
         1000 |     127 |     218 |       655

Alternative (CASE method):

SELECT
    COUNT(*) AS total_orders,
    COUNT(CASE WHEN status = 'pending' THEN 1 END) AS pending,
    COUNT(CASE WHEN status = 'shipped' THEN 1 END) AS shipped,
    COUNT(CASE WHEN status = 'delivered' THEN 1 END) AS delivered
FROM orders;

SUM with FILTER

-- Revenue by year
SELECT
    SUM(total) FILTER (WHERE EXTRACT(YEAR FROM order_date) = 2023) AS revenue_2023,
    SUM(total) FILTER (WHERE EXTRACT(YEAR FROM order_date) = 2024) AS revenue_2024
FROM orders;

Performance Considerations

Aggregates Can Be Slow on Large Tables

-- Slow on 10M+ rows without index
SELECT COUNT(*) FROM huge_table WHERE status = 'active';

Solution: Add index on filtered column:

CREATE INDEX idx_huge_table_status ON huge_table(status);

COUNT(*) vs COUNT(1)

-- These are identical in PostgreSQL
SELECT COUNT(*) FROM orders;
SELECT COUNT(1) FROM orders;

Both are optimized the same way. Use COUNT(*) for clarity.

Approximate Counts for Very Large Tables

-- Exact count (slow on huge tables)
SELECT COUNT(*) FROM orders;

-- Approximate count (fast, uses statistics)
SELECT reltuples::BIGINT AS approx_count
FROM pg_class
WHERE relname = 'orders';

Common Mistakes

Mistake 1: Forgetting GROUP BY

-- ❌ Error: Column must appear in GROUP BY or be used in aggregate
SELECT customer_id, COUNT(*)
FROM orders;

-- ✅ Correct
SELECT customer_id, COUNT(*)
FROM orders
GROUP BY customer_id;

(We'll cover GROUP BY in the next lesson)

Mistake 2: Treating NULL as 0

-- ❌ Wrong assumption
SELECT AVG(rating) FROM reviews;
-- NULL ratings are excluded, not counted as 0!

-- ✅ If you want NULL as 0
SELECT AVG(COALESCE(rating, 0)) FROM reviews;

Mistake 3: Using Aggregate in WHERE

-- ❌ Error: Cannot use aggregate in WHERE
SELECT customer_id
FROM orders
WHERE COUNT(*) > 5;

-- ✅ Use HAVING (covered in Lesson 5.3)
SELECT customer_id
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 5;

Practice Exercises

Exercise 1

Find the total number of products and the average price.

Solution

SELECT
    COUNT(*) AS total_products,
    ROUND(AVG(price), 2) AS avg_price
FROM products;

Exercise 2

Find the highest and lowest order totals for customer with ID 5.

Solution

SELECT
    MIN(total) AS lowest_order,
    MAX(total) AS highest_order
FROM orders
WHERE customer_id = 5;

Exercise 3

Calculate total revenue from all completed orders (status = 'delivered').

Solution

SELECT
    COUNT(*) AS delivered_orders,
    SUM(total) AS total_revenue
FROM orders
WHERE status = 'delivered';

Exercise 4

Find how many customers have an email address vs how many don't.

Solution

SELECT
    COUNT(*) AS total_customers,
    COUNT(email) AS has_email,
    COUNT(*) - COUNT(email) AS no_email
FROM customers;

Key Takeaways

✅ COUNT() counts rows (COUNT(*) includes NULLs, COUNT(column) excludes them)
✅ SUM() adds numeric values
✅ AVG() calculates the mean (average)
✅ MIN() and MAX() find minimum and maximum values
✅ All aggregates ignore NULL (except COUNT(*))
✅ DISTINCT removes duplicates before aggregating
✅ FILTER clause allows conditional aggregates
✅ Aggregates work with JOINs for cross-table analysis
✅ Use COALESCE to treat NULL as a specific value

Next Steps

You've learned the five core aggregate functions! In the next lesson, we'll explore GROUP BY—how to group rows and calculate aggregates for each group (e.g., "total orders per customer" or "revenue per month").

Get ready to unlock powerful data analysis capabilities!

Aggregate Functions - Summarizing Data

Introduction

So far, you've queried individual rows of data. But what if you want to answer questions like:

"How many customers do we have?"
"What's the average order total?"
"What's the highest price in our inventory?"

Aggregate functions compute a single result from a set of rows. They're essential for data analysis, reporting, and understanding your database at a high level.

The Five Core Aggregate Functions

PostgreSQL provides five fundamental aggregate functions:

Function	Purpose	Example
`COUNT()`	Count rows	How many orders?
`SUM()`	Add values	Total revenue
`AVG()`	Calculate average	Average order value
`MIN()`	Find minimum	Cheapest product
`MAX()`	Find maximum	Most expensive product

COUNT() - Counting Rows

COUNT(*) - Count All Rows

-- How many customers do we have?
SELECT COUNT(*) FROM customers;

Result:

 count
-------
   150

COUNT(*) counts all rows, including those with NULL values.

COUNT(column) - Count Non-NULL Values

-- How many customers have a phone number?
SELECT COUNT(phone) FROM customers;

Result:

 count
-------
   127

COUNT(column) only counts rows where the column is NOT NULL.

COUNT(DISTINCT column) - Count Unique Values

-- How many different cities do our customers live in?
SELECT COUNT(DISTINCT city) FROM customers;

Result:

 count
-------
    42

Removes duplicates before counting.

SUM() - Adding Values

Basic SUM

-- What's our total revenue from all orders?
SELECT SUM(total) AS total_revenue
FROM orders;

Result:

 total_revenue
---------------
    125847.50

SUM with WHERE

-- Total revenue from orders in 2024
SELECT SUM(total) AS revenue_2024
FROM orders
WHERE EXTRACT(YEAR FROM order_date) = 2024;

Result:

 revenue_2024
--------------
    87520.25

SUM Ignores NULLs

CREATE TABLE donations (
    id SERIAL PRIMARY KEY,
    donor VARCHAR(100),
    amount DECIMAL(10,2)
);

INSERT INTO donations VALUES
    (1, 'Alice', 100.00),
    (2, 'Bob', NULL),      -- No donation yet
    (3, 'Carol', 50.00);

SELECT SUM(amount) FROM donations;

Result:

  sum
--------
 150.00

Bob's NULL is ignored, not treated as 0.

AVG() - Calculating Average

Basic AVG

-- What's the average order total?
SELECT AVG(total) AS avg_order_value
FROM orders;

Result:

 avg_order_value
-----------------
          125.85

AVG vs SUM/COUNT

-- These are equivalent:
SELECT AVG(total) FROM orders;

SELECT SUM(total) / COUNT(total) FROM orders;

AVG Ignores NULLs

-- Average donation (NULL excluded)
SELECT AVG(amount) FROM donations;
-- Result: 75.00 (not 50.00!)
-- Calculation: (100 + 50) / 2 = 75

Rounding AVG Results

SELECT ROUND(AVG(total), 2) AS avg_order_value
FROM orders;

Result:

 avg_order_value
-----------------
          125.85

MIN() and MAX() - Finding Extremes

Numeric MIN/MAX

-- Cheapest and most expensive products
SELECT
    MIN(price) AS cheapest,
    MAX(price) AS most_expensive
FROM products;

Result:

 cheapest | most_expensive
----------+----------------
     5.99 |         999.99

Date MIN/MAX

-- First and most recent order dates
SELECT
    MIN(order_date) AS first_order,
    MAX(order_date) AS latest_order
FROM orders;

Result:

 first_order | latest_order
-------------+--------------
  2023-01-05 |   2024-12-20

Text MIN/MAX

-- First and last customer names alphabetically
SELECT
    MIN(name) AS first_alphabetically,
    MAX(name) AS last_alphabetically
FROM customers;

Result:

 first_alphabetically | last_alphabetically
----------------------+---------------------
 Aaron Smith          | Zoe Williams

Combining Multiple Aggregates

-- Order statistics
SELECT
    COUNT(*) AS total_orders,
    SUM(total) AS total_revenue,
    AVG(total) AS avg_order_value,
    MIN(total) AS smallest_order,
    MAX(total) AS largest_order
FROM orders;

Result:

 total_orders | total_revenue | avg_order_value | smallest_order | largest_order
--------------+---------------+-----------------+----------------+---------------
         1000 |    125847.50  |          125.85 |           5.99 |        999.99

Aggregate Functions with WHERE

Filter Before Aggregating

-- Statistics for orders over $100
SELECT
    COUNT(*) AS large_order_count,
    AVG(total) AS avg_large_order
FROM orders
WHERE total > 100;

Result:

 large_order_count | avg_large_order
-------------------+-----------------
               487 |          215.42

Execution order:

Filter rows with WHERE
Apply aggregate functions to filtered rows

Common Pattern: Active vs Total

-- Active vs inactive customers
SELECT
    COUNT(*) AS total_customers,
    COUNT(CASE WHEN active = TRUE THEN 1 END) AS active_customers,
    COUNT(CASE WHEN active = FALSE THEN 1 END) AS inactive_customers
FROM customers;

Result:

 total_customers | active_customers | inactive_customers
-----------------+------------------+--------------------
             150 |              127 |                 23

Aggregate Functions with JOINs

Count Related Records

-- Total order count and revenue per customer
SELECT
    c.name,
    COUNT(o.id) AS order_count,
    COALESCE(SUM(o.total), 0) AS total_spent
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name;

Result:

      name       | order_count | total_spent
-----------------+-------------+-------------
 Alice Smith     |      12     |    1547.82
 Bob Jones       |       5     |     687.25
 Carol White     |       0     |       0.00

Important: Use LEFT JOIN to include customers with 0 orders!

Aggregate from Multiple Tables

-- Product sales summary
SELECT
    p.name,
    COUNT(oi.id) AS times_ordered,
    COALESCE(SUM(oi.quantity), 0) AS units_sold,
    COALESCE(SUM(oi.quantity * oi.price), 0) AS revenue
FROM products p
LEFT JOIN order_items oi ON p.id = oi.product_id
GROUP BY p.id, p.name;

NULL Handling in Aggregates

COUNT(*) vs COUNT(column)

CREATE TABLE employees (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    department VARCHAR(50),
    manager_id INTEGER
);

INSERT INTO employees VALUES
    (1, 'Alice', 'Sales', NULL),     -- No manager (CEO)
    (2, 'Bob', 'Sales', 1),
    (3, 'Carol', 'Engineering', 1);

-- Count all employees
SELECT COUNT(*) FROM employees;
-- Result: 3

-- Count employees with managers
SELECT COUNT(manager_id) FROM employees;
-- Result: 2 (Alice's NULL excluded)

SUM/AVG/MIN/MAX Ignore NULLs

-- Average with NULLs
SELECT AVG(manager_id) FROM employees;
-- Result: 1.0 (only Bob and Carol counted)
-- NOT 0.67 (which would be if NULL counted as 0)

COALESCE for Default Values

-- Treat NULL as 0
SELECT
    name,
    COALESCE(manager_id, 0) AS manager_id
FROM employees;

Common Aggregate Patterns

Pattern 1: Data Quality Check

-- How complete is our customer data?
SELECT
    COUNT(*) AS total_customers,
    COUNT(email) AS has_email,
    COUNT(phone) AS has_phone,
    COUNT(address) AS has_address,
    ROUND(100.0 * COUNT(email) / COUNT(*), 2) AS email_pct,
    ROUND(100.0 * COUNT(phone) / COUNT(*), 2) AS phone_pct
FROM customers;

Result:

 total_customers | has_email | has_phone | has_address | email_pct | phone_pct
-----------------+-----------+-----------+-------------+-----------+-----------
             150 |       150 |       127 |         142 |    100.00 |     84.67

Pattern 2: Revenue Summary

-- Monthly revenue summary for 2024
SELECT
    DATE_TRUNC('month', order_date) AS month,
    COUNT(*) AS order_count,
    SUM(total) AS revenue,
    AVG(total) AS avg_order_value,
    MIN(total) AS smallest_order,
    MAX(total) AS largest_order
FROM orders
WHERE EXTRACT(YEAR FROM order_date) = 2024
GROUP BY DATE_TRUNC('month', order_date)
ORDER BY month;

Pattern 3: Top Performers

-- Top 5 customers by total spending
SELECT
    c.name,
    COUNT(o.id) AS order_count,
    SUM(o.total) AS total_spent
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name
ORDER BY total_spent DESC
LIMIT 5;

FILTER Clause (PostgreSQL Extension)

Conditional Aggregates

-- Orders by status
SELECT
    COUNT(*) AS total_orders,
    COUNT(*) FILTER (WHERE status = 'pending') AS pending,
    COUNT(*) FILTER (WHERE status = 'shipped') AS shipped,
    COUNT(*) FILTER (WHERE status = 'delivered') AS delivered
FROM orders;

Result:

 total_orders | pending | shipped | delivered
--------------+---------+---------+-----------
         1000 |     127 |     218 |       655

Alternative (CASE method):

SELECT
    COUNT(*) AS total_orders,
    COUNT(CASE WHEN status = 'pending' THEN 1 END) AS pending,
    COUNT(CASE WHEN status = 'shipped' THEN 1 END) AS shipped,
    COUNT(CASE WHEN status = 'delivered' THEN 1 END) AS delivered
FROM orders;

SUM with FILTER

-- Revenue by year
SELECT
    SUM(total) FILTER (WHERE EXTRACT(YEAR FROM order_date) = 2023) AS revenue_2023,
    SUM(total) FILTER (WHERE EXTRACT(YEAR FROM order_date) = 2024) AS revenue_2024
FROM orders;

Performance Considerations

Aggregates Can Be Slow on Large Tables

-- Slow on 10M+ rows without index
SELECT COUNT(*) FROM huge_table WHERE status = 'active';

Solution: Add index on filtered column:

CREATE INDEX idx_huge_table_status ON huge_table(status);

COUNT(*) vs COUNT(1)

-- These are identical in PostgreSQL
SELECT COUNT(*) FROM orders;
SELECT COUNT(1) FROM orders;

Both are optimized the same way. Use COUNT(*) for clarity.

Approximate Counts for Very Large Tables

-- Exact count (slow on huge tables)
SELECT COUNT(*) FROM orders;

-- Approximate count (fast, uses statistics)
SELECT reltuples::BIGINT AS approx_count
FROM pg_class
WHERE relname = 'orders';

Common Mistakes

Mistake 1: Forgetting GROUP BY

-- ❌ Error: Column must appear in GROUP BY or be used in aggregate
SELECT customer_id, COUNT(*)
FROM orders;

-- ✅ Correct
SELECT customer_id, COUNT(*)
FROM orders
GROUP BY customer_id;

(We'll cover GROUP BY in the next lesson)

Mistake 2: Treating NULL as 0

-- ❌ Wrong assumption
SELECT AVG(rating) FROM reviews;
-- NULL ratings are excluded, not counted as 0!

-- ✅ If you want NULL as 0
SELECT AVG(COALESCE(rating, 0)) FROM reviews;

Mistake 3: Using Aggregate in WHERE

-- ❌ Error: Cannot use aggregate in WHERE
SELECT customer_id
FROM orders
WHERE COUNT(*) > 5;

-- ✅ Use HAVING (covered in Lesson 5.3)
SELECT customer_id
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 5;

Practice Exercises

Exercise 1

Find the total number of products and the average price.

Solution

SELECT
    COUNT(*) AS total_products,
    ROUND(AVG(price), 2) AS avg_price
FROM products;

Exercise 2

Find the highest and lowest order totals for customer with ID 5.

Solution

SELECT
    MIN(total) AS lowest_order,
    MAX(total) AS highest_order
FROM orders
WHERE customer_id = 5;

Exercise 3

Calculate total revenue from all completed orders (status = 'delivered').

Solution

SELECT
    COUNT(*) AS delivered_orders,
    SUM(total) AS total_revenue
FROM orders
WHERE status = 'delivered';

Exercise 4

Find how many customers have an email address vs how many don't.

Solution

SELECT
    COUNT(*) AS total_customers,
    COUNT(email) AS has_email,
    COUNT(*) - COUNT(email) AS no_email
FROM customers;

Key Takeaways

✅ COUNT() counts rows (COUNT(*) includes NULLs, COUNT(column) excludes them)
✅ SUM() adds numeric values
✅ AVG() calculates the mean (average)
✅ MIN() and MAX() find minimum and maximum values
✅ All aggregates ignore NULL (except COUNT(*))
✅ DISTINCT removes duplicates before aggregating
✅ FILTER clause allows conditional aggregates
✅ Aggregates work with JOINs for cross-table analysis
✅ Use COALESCE to treat NULL as a specific value

Next Steps

Get ready to unlock powerful data analysis capabilities!

Aggregate Functions - Summarizing Data

Introduction

The Five Core Aggregate Functions

COUNT() - Counting Rows

COUNT(*) - Count All Rows

COUNT(column) - Count Non-NULL Values

COUNT(DISTINCT column) - Count Unique Values

SUM() - Adding Values

Basic SUM

SUM with WHERE

SUM Ignores NULLs

AVG() - Calculating Average

Basic AVG

AVG vs SUM/COUNT

AVG Ignores NULLs

Rounding AVG Results

MIN() and MAX() - Finding Extremes

Numeric MIN/MAX

Date MIN/MAX

Text MIN/MAX

Combining Multiple Aggregates

Aggregate Functions with WHERE

Filter Before Aggregating

Common Pattern: Active vs Total

Aggregate Functions with JOINs

Count Related Records

Aggregate from Multiple Tables

NULL Handling in Aggregates

COUNT(*) vs COUNT(column)

SUM/AVG/MIN/MAX Ignore NULLs

COALESCE for Default Values

Common Aggregate Patterns

Pattern 1: Data Quality Check

Pattern 2: Revenue Summary

Pattern 3: Top Performers

FILTER Clause (PostgreSQL Extension)

Conditional Aggregates

SUM with FILTER

Performance Considerations

Aggregates Can Be Slow on Large Tables

COUNT(*) vs COUNT(1)

Approximate Counts for Very Large Tables

Common Mistakes

Mistake 1: Forgetting GROUP BY

Mistake 2: Treating NULL as 0

Mistake 3: Using Aggregate in WHERE

Practice Exercises

Exercise 1

Exercise 2

Exercise 3

Exercise 4

Key Takeaways

Next Steps

Quiz

Aggregate Functions - Summarizing Data

Introduction

The Five Core Aggregate Functions

COUNT() - Counting Rows

COUNT(*) - Count All Rows

COUNT(column) - Count Non-NULL Values

COUNT(DISTINCT column) - Count Unique Values

SUM() - Adding Values

Basic SUM

SUM with WHERE

SUM Ignores NULLs

AVG() - Calculating Average

Basic AVG

AVG vs SUM/COUNT

AVG Ignores NULLs

Rounding AVG Results

MIN() and MAX() - Finding Extremes

Numeric MIN/MAX

Date MIN/MAX

Text MIN/MAX

Combining Multiple Aggregates

Aggregate Functions with WHERE

Filter Before Aggregating

Common Pattern: Active vs Total

Aggregate Functions with JOINs

Count Related Records