How Machines Learn
Now that you know what machine learning is, let's peek under the hood. How does a computer actually "learn" from data?
The answer involves three key concepts: features, labels, and optimization.
Features and Labels
In machine learning, we work with structured data:
- Features (X): The input variables used to make predictions
- Labels (y): The output we want to predict (in supervised learning)
Think of it like a test: features are the questions, labels are the answers.
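For a house-price task, for example, the data might be arranged like this (a minimal sketch in Python; the column choices and numbers are made up for illustration):

```python
import numpy as np

# Features (X): one row per house, one column per input variable.
# The columns here (size in square feet, bedrooms) are illustrative.
X = np.array([
    [1400, 3],
    [1600, 3],
    [2100, 4],
])

# Labels (y): the known answer for each row (sale price in dollars).
y = np.array([245_000, 280_000, 355_000])
```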
The Learning Process: Finding Patterns
When a machine "learns," it's really doing mathematical optimization. Here's the intuition:
1. Start with a guess - The model makes random predictions
2. Measure the error - How wrong are the predictions?
3. Adjust and improve - Change the model to reduce errors
4. Repeat - Keep adjusting until errors are minimized
This process is called training.
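Here is what that loop looks like for the simplest possible model, a single weight fit to toy data (the data, learning rate, and step count are all invented for illustration):

```python
import numpy as np

# Toy data whose true pattern is y = 2 * x.
X = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])

weight = 0.0                           # 1. start with a (bad) guess
for step in range(100):
    predictions = weight * X
    errors = predictions - y           # 2. measure the error
    # 3. adjust the weight in the direction that shrinks the error
    weight -= 0.05 * np.mean(errors * X)
                                       # 4. repeat
print(weight)                          # converges close to the true value, 2.0
```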
What the Model Actually Learns
A model learns parameters - the numbers that define the pattern.
For a simple linear model predicting house prices:
price = weight₁ × size + weight₂ × bedrooms + bias
The model learns the optimal values for weight₁, weight₂, and bias.
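In code, the model is just that formula; the weight values below are invented placeholders standing in for whatever training would find:

```python
def predict_price(size_sqft, bedrooms,
                  weight1=150.0, weight2=10_000.0, bias=20_000.0):
    """Linear model: price = weight1 * size + weight2 * bedrooms + bias."""
    return weight1 * size_sqft + weight2 * bedrooms + bias

# 1400 sq ft, 3 bedrooms -> 1400*150 + 3*10000 + 20000 = 260000
print(predict_price(1400, 3))
```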
The Cost Function: Measuring Errors
How does the model know if it's improving? It uses a cost function (also called loss function) to measure errors.
Common cost functions:
- Mean Squared Error (MSE): the average of the squared differences between predictions and actual values; squaring penalizes large errors heavily
- Mean Absolute Error (MAE): the average of the absolute differences; every error counts in proportion to its size
Lower cost = a better fit to the training data.
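Both are a few lines of NumPy; here's a minimal sketch with made-up numbers:

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean Squared Error: average of squared differences."""
    return np.mean((y_true - y_pred) ** 2)

def mae(y_true, y_pred):
    """Mean Absolute Error: average of absolute differences."""
    return np.mean(np.abs(y_true - y_pred))

y_true = np.array([245_000.0, 280_000.0, 355_000.0])
y_pred = np.array([250_000.0, 270_000.0, 360_000.0])
print(mse(y_true, y_pred))  # 50000000.0
print(mae(y_true, y_pred))  # ~6666.67
```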
Gradient Descent: The Learning Algorithm
Most ML algorithms use gradient descent to minimize the cost function. Think of it like finding the lowest point in a valley while blindfolded:
1. Feel which direction goes down (calculate the gradient)
2. Take a step in that direction
3. Repeat until you can't go any lower
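Putting it together, here is a minimal sketch of gradient descent fitting the house-price model from earlier (the toy data, learning rate, and step count are illustrative, not tuned):

```python
import numpy as np

# Toy data: [size in 1000s of sq ft, bedrooms] -> price in $1000s.
X = np.array([[1.4, 3.0], [1.6, 3.0], [2.1, 4.0], [1.0, 2.0]])
y = np.array([245.0, 280.0, 355.0, 180.0])

weights = np.zeros(2)       # weight1 (size), weight2 (bedrooms)
bias = 0.0
lr = 0.01                   # learning rate: how big a step to take

for step in range(20_000):
    predictions = X @ weights + bias
    errors = predictions - y
    # "Feel which direction goes down": the gradient of the MSE cost.
    grad_w = 2 * (X.T @ errors) / len(y)
    grad_b = 2 * errors.mean()
    # "Take a step in that direction" (downhill, hence the minus sign).
    weights -= lr * grad_w
    bias -= lr * grad_b

print(weights, bias)        # parameters that (approximately) minimize the cost
```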
Generalization: The Ultimate Goal
The goal of ML isn't to memorize training data - it's to generalize to new, unseen data.
A good model:
- Learns the underlying patterns (not just memorizes examples)
- Performs well on data it has never seen before
- Balances complexity with simplicity (complex enough to capture the pattern, simple enough not to memorize noise)
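The standard way to check this is to hold out some data during training and compare errors on the two sets; a minimal sketch (synthetic data, simple 80/20 split):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=100)
y = 3 * X + rng.normal(0, 1, size=100)     # true pattern plus noise

# Hold out 20 points the model never sees during training.
X_train, y_train = X[:80], y[:80]
X_test, y_test = X[80:], y[80:]

# Fit a one-weight linear model on the training set only
# (closed-form least squares through the origin).
weight = np.sum(X_train * y_train) / np.sum(X_train ** 2)

train_mse = np.mean((weight * X_train - y_train) ** 2)
test_mse = np.mean((weight * X_test - y_test) ** 2)
print(train_mse, test_mse)  # similar values = the model generalizes
```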
Key Takeaways
- Features are inputs, labels are outputs
- Learning = finding parameters that minimize prediction errors
- Cost functions measure how wrong the model is
- Gradient descent adjusts parameters to reduce errors
- The goal is generalization - performing well on unseen data
Next, we'll explore the real-world applications of machine learning!

