Functions of Several Variables
Every machine learning model is a function with multiple inputs. A neural network might have millions of parameters — weights and biases — that together determine its predictions. To understand how training works, you need to work with functions that take many inputs instead of just one.
From One Input to Many
In the previous module, we worked with functions like f(x) = x², which take a single number and return a single number. But real ML models look more like this:
prediction = w₁x₁ + w₂x₂ + w₃x₃ + b
This function has seven inputs: three weights (w₁, w₂, w₃), three features (x₁, x₂, x₃), and one bias (b). During training, the features are fixed (they come from your data), while the weights and bias are adjusted to reduce loss.
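To make this concrete, here is a minimal sketch of evaluating the model above, with made-up weight, feature, and bias values chosen purely for illustration:

```python
# Evaluating prediction = w1*x1 + w2*x2 + w3*x3 + b with illustrative values.
weights = [0.5, -1.0, 2.0]   # w1, w2, w3 (adjusted during training)
features = [1.0, 2.0, 3.0]   # x1, x2, x3 (fixed, from the data)
bias = 0.5                   # b

prediction = sum(w * x for w, x in zip(weights, features)) + bias
print(prediction)  # 0.5*1.0 + (-1.0)*2.0 + 2.0*3.0 + 0.5 = 5.0
```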
Notation for Multivariable Functions
A function of several variables is written as:
f(x₁, x₂, ..., xₙ)
or more compactly using vector notation:
f(x) where x = [x₁, x₂, ..., xₙ]
Example: Two-Variable Loss Function
Consider a simple model with two parameters, w₁ and w₂:
L(w₁, w₂) = (w₁ · x₁ + w₂ · x₂ - target)²
This loss function takes two weights as input and returns a single number: how wrong the model is. The goal of training is to find the values of w₁ and w₂ that minimize L.
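A quick numerical sketch of this loss, using assumed values x₁ = 1, x₂ = 2, and target = 3 (these particular numbers are illustrative, not from the text):

```python
def loss(w1, w2, x1=1.0, x2=2.0, target=3.0):
    # Squared error of the linear prediction w1*x1 + w2*x2.
    return (w1 * x1 + w2 * x2 - target) ** 2

print(loss(1.0, 1.0))  # (1 + 2 - 3)^2 = 0.0, a perfect fit
print(loss(2.0, 1.0))  # (2 + 2 - 3)^2 = 1.0
```

Trying a few (w₁, w₂) pairs like this is exactly sampling points on the loss surface described below.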
Visualizing Functions of Two Variables
A function of one variable produces a curve in 2D. A function of two variables produces a surface in 3D.
```
      L (loss)
      ^
      |    .   .   .   .
      |   .   mountain   .
      |  .      peak      .
      | .                  .
      |  .       ___        .
      |   .     /valley\     .
      |    .   /    *   \   .     * = minimum
      +──────────────────────> w₁
     /
   w₂
```
Each point (w₁, w₂) on the horizontal plane corresponds to a specific pair of parameter values, and the height above that point is the loss. Training means finding the lowest point on this surface.
The Loss Surface
In real ML, the loss function depends on all model parameters simultaneously. For a neural network with 1 million parameters, the loss surface lives in 1,000,001-dimensional space (1 million parameter dimensions plus 1 loss dimension). You cannot visualize it, but the math works identically to the two-variable case.
Key properties of loss surfaces:
| Property | Meaning | Implication |
|---|---|---|
| Global minimum | The absolute lowest point on the surface | The best possible parameter values |
| Local minimum | A low point surrounded by higher values | Good but possibly not the best |
| Saddle point | Low in some directions, high in others | Looks like a minimum in some views but not all |
| Plateau | A flat region where loss barely changes | Gradients near zero, slow training |
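The saddle point row is the least intuitive, so here is a sketch using the textbook saddle f(w₁, w₂) = w₁² − w₂² (this specific function is an assumption for illustration; it is not from the text). The origin looks like a minimum along the w₁ axis but like a maximum along the w₂ axis:

```python
def f(w1, w2):
    # Classic saddle: curves upward along w1, downward along w2.
    return w1 ** 2 - w2 ** 2

# Moving along w1 increases f, so (0, 0) looks like a minimum there...
print(f(0.0, 0.0) < f(0.5, 0.0))  # True
# ...but moving along w2 decreases f, so it is not a minimum overall.
print(f(0.0, 0.0) > f(0.0, 0.5))  # True
```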
Level Curves (Contour Lines)
When we cannot visualize 3D surfaces, we use contour plots — the same idea as elevation lines on a topographic map. Each line connects points with the same loss value.
```
 w₂
 ^
 |      ╭──────╮
 |   ╭──┤      ├──╮
 |  ╭┤  ╭────╮  ├╮
 |  │ ╭─┤  * ├─╮ │    * = minimum
 |  ╰┤ ╰─────╯ ├╯     lines = equal loss
 |   ╰──┤    ├──╯
 |      ╰──────╯
 +──────────────────> w₁
```
If the contour lines form concentric circles, the loss surface is "bowl-shaped" and gradient descent works well. If they form elongated ellipses, the surface is like a narrow valley, and training can be slower.
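This effect can be sketched with a toy quadratic loss L(w₁, w₂) = a·w₁² + b·w₂², where a = b gives circular contours and b ≫ a gives elongated ellipses. The function and step sizes below are illustrative assumptions, not a real training setup:

```python
def final_loss(a, b, lr, steps=20):
    # Gradient descent on L(w1, w2) = a*w1^2 + b*w2^2 from (1, 1).
    w1, w2 = 1.0, 1.0
    for _ in range(steps):
        w1 -= lr * 2 * a * w1   # slope of L along w1 is 2*a*w1
        w2 -= lr * 2 * b * w2   # slope of L along w2 is 2*b*w2
    return a * w1 ** 2 + b * w2 ** 2

round_bowl = final_loss(a=1.0, b=1.0, lr=0.4)       # circular contours
# With b = 10, any lr >= 0.1 diverges along w2, so the step must shrink,
# and the shallow w1 direction then improves slowly.
narrow_valley = final_loss(a=1.0, b=10.0, lr=0.09)  # elongated contours
print(round_bowl < narrow_valley)  # True: same step budget, more loss left
```

The circular bowl tolerates a large step and converges quickly; the narrow valley forces a small step and lags behind after the same number of updates.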
How ML Frameworks Handle Multivariable Functions
In practice, you define a model and a loss function, and the framework handles the multivariable calculus:
```python
# Conceptual example (plain Python; PyTorch models follow the same shape)
def model(x, w1, w2, b):
    return w1 * x[0] + w2 * x[1] + b

def loss(prediction, target):
    return (prediction - target) ** 2
```
The framework then computes how the loss changes with respect to each parameter independently. This is the subject of the next lesson: partial derivatives.
Why Multiple Inputs Complicate Things
With one input, there is only one direction to move: increase x or decrease x. With multiple inputs, there are many directions:
- Change w₁ only
- Change w₂ only
- Change both simultaneously
- Change in any combination of directions
The question becomes: which direction reduces the loss most efficiently? To answer this, we need to understand how the loss responds to each parameter individually — which is exactly what partial derivatives provide.
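As a numerical preview, each direction can be probed separately with a tiny step h, using the two-weight loss from earlier (with assumed values x₁ = 1, x₂ = 2, target = 3, chosen for illustration). This finite-difference probe approximates what partial derivatives compute exactly:

```python
def loss(w1, w2, x1=1.0, x2=2.0, target=3.0):
    return (w1 * x1 + w2 * x2 - target) ** 2

h = 1e-6          # small step for the finite-difference probe
w1, w2 = 2.0, 1.0  # arbitrary current parameter values

# Nudge one parameter at a time and measure how the loss responds.
dL_dw1 = (loss(w1 + h, w2) - loss(w1, w2)) / h
dL_dw2 = (loss(w1, w2 + h) - loss(w1, w2)) / h
print(round(dL_dw1, 3), round(dL_dw2, 3))  # 2.0 4.0
```

Here the loss is twice as sensitive to w₂ as to w₁, so a step that weights the w₂ direction more heavily reduces the loss faster.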
Summary
- ML models are functions of many variables (weights, biases, and inputs)
- A loss function maps all model parameters to a single number (the error)
- The loss surface is a high-dimensional landscape where the height represents the error
- Loss surfaces have global minima, local minima, saddle points, and plateaus
- Contour plots (level curves) visualize 2D slices of the loss surface
- With multiple inputs, we need to determine how the loss changes with respect to each parameter independently
- Partial derivatives, covered in the next lesson, provide exactly this capability
Next, you will learn how to compute the derivative with respect to one variable while holding all others constant.

