Data Visualization with Matplotlib & Seaborn
A chart compresses a thousand rows into a single picture your reader can take in within seconds. Matplotlib is the foundational Python plotting library, and seaborn is the friendlier wrapper that handles the boring parts (colors, axes, themes) so you can focus on what to show. Together they cover the large majority of the charts you are likely to need.
This lesson covers how to make a useful chart, how to make it look professional, and how AI helps you choose the right type of chart for your data.
What You'll Learn
- The four chart types every analyst should know: line, bar, scatter, histogram
- The matplotlib + seaborn workflow that produces clean charts in 5 lines
- How to add titles, labels, and legends without fighting the API
- AI prompts for picking the right chart and improving an ugly one
The Setup
Every notebook starts with these imports:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_theme(style="whitegrid") # nicer default look
The set_theme line is one of the highest-leverage lines in your toolkit. It changes seaborn's default style to something that already looks good, so you do not need to fix grid colors and font sizes by hand.
The Four Charts You Need First
1. Line chart — for time series
df = pd.read_csv("https://raw.githubusercontent.com/mwaskom/seaborn-data/master/flights.csv")
plt.figure(figsize=(10, 4))
sns.lineplot(data=df, x="year", y="passengers", hue="month")
plt.title("Monthly passenger counts by year")
plt.xlabel("Year")
plt.ylabel("Passengers (thousands)")
plt.show()
Use line charts when the x-axis is time. The hue parameter draws one line per category; here, that means one line per month.
2. Bar chart — for categories
df = pd.read_csv("https://raw.githubusercontent.com/mwaskom/seaborn-data/master/tips.csv")
plt.figure(figsize=(8, 4))
sns.barplot(data=df, x="day", y="total_bill", estimator="mean")
plt.title("Average bill by day of week")
plt.show()
Use bars when the x-axis is a category (day of week, department, country). Seaborn computes the mean for you; the dark line at the top of each bar is a confidence interval for that mean (95% by default, estimated by bootstrapping).
3. Scatter plot — for two numeric variables
plt.figure(figsize=(8, 6))
sns.scatterplot(data=df, x="total_bill", y="tip", hue="day")
plt.title("Tip vs total bill, colored by day")
plt.show()
Use scatter when you want to see the relationship between two numbers. Color by a third dimension to see if the relationship differs by group.
4. Histogram — for the distribution of one variable
plt.figure(figsize=(8, 4))
sns.histplot(data=df, x="tip", bins=20, kde=True)
plt.title("Distribution of tips")
plt.show()
Use histograms when you want to see "what does the distribution of this number look like?" — is it normal, skewed, bimodal? kde=True overlays a smooth density curve.
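If bins=20 feels abstract, the sketch below does roughly the same counting by hand with NumPy: it splits the data range into 20 equal-width intervals and counts the values in each. The data here is randomly generated for illustration, not the tips column.

```python
import numpy as np

# Made-up right-skewed data standing in for a column such as "tip".
rng = np.random.default_rng(0)
values = rng.gamma(shape=2.0, scale=1.5, size=500)

# 20 equal-width bins across the data range: 20 counts, 21 bin edges.
counts, edges = np.histogram(values, bins=20)

# A quick numeric hint of right skew: the mean sits above the median.
mean_value = float(values.mean())
median_value = float(np.median(values))
```

Too few bins hide the shape and too many make it noisy, so it is worth trying two or three values before settling on one.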
The Anatomy of a Good Chart
Every chart you publish should have these four things:
- A title that states the takeaway, not the variables. "Sales doubled in Q4" beats "Sales over time."
- Axis labels in plain English, with units. "Sales (USD)" beats "sales".
- A legend if you used hue. Seaborn adds it for you.
- Reasonable size: figsize=(10, 4) for time series, (8, 6) for scatter, (8, 4) for bars.
Without these, you have a graph. With them, you have communication.
Saving a Chart
plt.savefig("my_chart.png", dpi=150, bbox_inches="tight")
Use dpi=150 minimum for anything you will paste into a slide deck. bbox_inches="tight" strips extra whitespace.
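A small end-to-end sketch of saving, with a placeholder chart and a temporary folder chosen only for this example:

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; drop this line in a notebook
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot([1, 2, 3, 4], [10, 12, 9, 15])  # placeholder data
ax.set_title("Placeholder chart")

# Save to a temporary folder; in practice, use your own path.
out_path = os.path.join(tempfile.mkdtemp(), "my_chart.png")
fig.savefig(out_path, dpi=150, bbox_inches="tight")
plt.close(fig)  # free the figure once it is on disk
```

In a script, save before calling plt.show(). Once the figure window is closed, a later plt.savefig call can end up writing an empty image.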
A Real Example: Multi-Step Plotting
Build this in Colab:
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset("titanic")
fig, axes = plt.subplots(1, 2, figsize=(12, 4))
sns.countplot(data=df, x="class", hue="survived", ax=axes[0])
axes[0].set_title("Survival count by class")
sns.boxplot(data=df, x="class", y="age", hue="survived", ax=axes[1])
axes[1].set_title("Age distribution by class and survival")
plt.tight_layout()
plt.show()
This puts two charts side by side. The subplots(1, 2) line means "one row, two columns of plots." Each call to seaborn passes ax=axes[i] so the plot lands in the right panel.
When you read the result, you should see (a) more first-class passengers survived than died, while third-class shows the opposite, and (b) ages differ by class, with first-class passengers skewing older than third-class passengers. That is a story in two charts.
When to Use Which Chart Type
A useful prompt for AI when you are not sure:
I have data with columns: [list]. The first few rows look like: [paste]. I want to communicate [message]. What chart type should I use, and can you write the seaborn code?
A short cheat sheet to memorize:
- One number over time: line plot
- One number across categories: bar plot
- Two numbers: scatter plot
- One number's distribution: histogram (or KDE for smooth)
- One number across categories with distribution: box plot or violin plot
- Counts of categories: count plot (a bar plot of counts)
- Two numbers' relationship across categories: scatter with hue
- A correlation matrix: sns.heatmap(df.corr(numeric_only=True))
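The correlation heatmap is the one item on this list that can fail on real data: in recent pandas versions, calling corr() on a DataFrame that contains text columns raises an error. The sketch below assumes pandas 1.5 or newer, where corr accepts numeric_only=True, and uses a made-up table shaped like the tips data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up rows shaped like the tips data, including one text column.
toy = pd.DataFrame({
    "total_bill": [10.0, 20.0, 30.0, 40.0],
    "tip": [1.0, 2.5, 3.0, 5.0],
    "size": [1, 2, 2, 4],
    "day": ["Thur", "Fri", "Sat", "Sun"],
})

# Keep only numeric columns when computing correlations.
corr = toy.corr(numeric_only=True)

fig, ax = plt.subplots(figsize=(6, 5))
# Fix the color scale to [-1, 1] so colors mean the same thing on every chart.
sns.heatmap(corr, annot=True, vmin=-1, vmax=1, cmap="coolwarm", ax=ax)
```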
Ugly Chart? Ask AI to Fix It
Charts often look cluttered the first time. Use this prompt:
Here is my plotting code. The chart looks cluttered: x-axis labels are overlapping, the legend covers data points, and the colors are hard to distinguish. Please rewrite it with these fixes, keeping the same data and chart type:
[paste]
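A typical response rotates the tick labels with plt.xticks(rotation=45), moves the legend outside the plot with plt.legend(loc="upper left", bbox_to_anchor=(1, 1)), and switches to a colorblind-friendly palette such as palette="colorblind". Check the result yourself, since the first suggestion does not always fix every problem.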
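Those three fixes can be sketched together as follows. The data is invented, and two calls are swapped in as assumptions of this example rather than anything an AI is guaranteed to produce: ax.tick_params(labelrotation=45) in place of plt.xticks(rotation=45), and seaborn's sns.move_legend in place of plt.legend, because it repositions the legend seaborn already created.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Invented headcounts with long category names that tend to overlap.
toy = pd.DataFrame({
    "department": ["Customer Support", "Engineering",
                   "Human Resources", "Marketing"] * 2,
    "year": ["2023"] * 4 + ["2024"] * 4,
    "headcount": [12, 40, 5, 9, 15, 46, 6, 11],
})

fig, ax = plt.subplots(figsize=(8, 4))
sns.barplot(data=toy, x="department", y="headcount", hue="year",
            palette="colorblind", ax=ax)       # easier-to-distinguish colors

ax.tick_params(axis="x", labelrotation=45)      # stop x labels overlapping
sns.move_legend(ax, "upper left",
                bbox_to_anchor=(1, 1), title="Year")  # legend outside the plot
fig.tight_layout()                              # make room for the rotated labels
```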
A Common Beginner Mistake: Missing plt.show()
In Colab and Jupyter notebooks you usually do not need it, because charts render automatically when a cell finishes. In a plain Python script, and in some other environments, you must call plt.show() to display the figure. If your chart never appears, that is the first thing to check.
Another common bug: making 10 charts in a loop and finding they all overlay on top of each other. Add plt.figure() at the start of each iteration to start a fresh canvas.
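A minimal sketch of the loop fix, using made-up series. It uses plt.subplots(), which creates a fresh figure and axes in one call and serves the same purpose as plt.figure() here.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; drop this line in a notebook
import matplotlib.pyplot as plt

# Made-up series, one chart each.
columns = {"a": [1, 2, 3], "b": [3, 1, 2], "c": [2, 2, 5]}

figures = []
for name, values in columns.items():
    fig, ax = plt.subplots(figsize=(6, 3))  # fresh canvas every iteration
    ax.plot(values)
    ax.set_title(f"Series {name}")
    figures.append(fig)

# Each figure holds exactly one line, so nothing was drawn on top of anything else.
lines_per_figure = [len(fig.axes[0].lines) for fig in figures]

# Close figures you are done with; many open figures use a lot of memory.
for fig in figures:
    plt.close(fig)
```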
Beyond the Basics
Two more libraries to know about for later:
- Plotly — interactive charts (hover for details, zoom, click). Great for dashboards and notebooks shared with non-technical readers.
- Altair — declarative grammar of graphics, similar to ggplot2 in R. Concise once you learn its style.
For this course, matplotlib + seaborn is plenty.
Key Takeaways
- sns.set_theme(style="whitegrid") turns ugly default plots into clean ones for free
- Memorize the four chart types: line for time, bar for categories, scatter for two numerics, histogram for distribution
- Every chart needs a title (with the takeaway), axis labels, and a sensible size
- Save figures at dpi=150 minimum for anything you will share
- When stuck, paste your code and a description of what is ugly; AI knows the fix

