Reducing DataFrame Memory

By default, Pandas reads numbers into wide types: 64-bit integers and 64-bit floats. Many real columns do not need that range. Downcasting numeric columns to smaller types, together with the category dtype for repetitive text, can cut a DataFrame's memory footprint substantially. A smaller frame is often faster to work with and leaves room to load larger datasets.

What You'll Learn

How to inspect a DataFrame's memory usage
How to downcast integer and float columns safely
How to combine downcasting with the category dtype
A repeatable routine for shrinking any DataFrame

Inspecting Memory Usage

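Start by measuring. df.info(memory_usage='deep') shows per-column dtypes and the total footprint, and df.memory_usage(deep=True) returns the bytes per column. The deep option counts the actual contents of object (string) columns instead of only the pointers to them.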
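As a minimal sketch of that measurement step, the snippet below builds a small stand-in frame and inspects it. The data is made up for illustration: the id and score columns follow the names used in this lesson, and the city column is an added assumption so that the deep option has string data to count.

```python
import pandas as pd

# Hypothetical sample data; only the column names "id" and "score"
# come from the lesson text, the values are invented.
df = pd.DataFrame({
    "id": range(1, 1001),
    "score": [50.0 + (i % 50) * 0.5 for i in range(1000)],
    "city": ["Lisbon", "Oslo", "Cairo", "Lima"] * 250,
})

# Per-column dtypes plus the total footprint, counting string contents.
df.info(memory_usage="deep")

# Bytes per column as a Series; index=False leaves out the index's own bytes.
per_column = df.memory_usage(deep=True, index=False)
print(per_column)
print("total bytes:", per_column.sum())
```

Each 64-bit numeric column costs 8 bytes per row, so the two numeric columns here take 8,000 bytes each regardless of how small the values are.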

The id column is stored as int64 and score as float64, both far wider than this data needs.

Downcasting Numbers

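pd.to_numeric with downcast='integer' or downcast='float' picks the smallest type that can still hold the data. For integers the conversion is exact: an int64 column with small values can often drop to int16 or int8, and downcast='unsigned' does the same for non-negative data using unsigned types. For floats the smallest target is float32, which keeps roughly seven significant digits, so values that need more precision may be rounded slightly or left as float64.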
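A short sketch of both calls on made-up data follows. The frame and its values are assumptions chosen so that every number fits the smaller types exactly.

```python
import pandas as pd

# Hypothetical data: small integers and floats that float32 stores exactly.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "score": [91.5, 78.25, 64.0, 88.75, 70.5],
})

small = df.copy()
# Smallest signed integer type that holds every value in the column.
small["id"] = pd.to_numeric(small["id"], downcast="integer")
# Smallest float type pandas will downcast to, which is float32.
small["score"] = pd.to_numeric(small["score"], downcast="float")

print(df.dtypes)     # dtypes before downcasting
print(small.dtypes)  # dtypes after downcasting
```

Working on a copy keeps the original frame available for comparison; assigning back to the same frame works just as well once the result has been checked.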

The integer column shrinks to the smallest signed type that fits, and the float column moves from float64 to float32.

Combining With Category

The biggest wins come from downcasting numbers and converting low-cardinality text to category together. Here is the before-and-after on a small frame.
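One way to see the combined effect is to measure the same frame before and after both conversions. This is a sketch on invented data, with a hypothetical city column standing in for low-cardinality text.

```python
import pandas as pd

n = 1_000
# Hypothetical frame: sequential ids, half-step scores, four repeated cities.
df = pd.DataFrame({
    "id": range(n),
    "score": [(i % 100) / 2 for i in range(n)],
    "city": ["Lisbon", "Oslo", "Cairo", "Lima"] * (n // 4),
})
before = df.memory_usage(deep=True).sum()

# Downcast the numbers and store the repeated text as category codes.
df["id"] = pd.to_numeric(df["id"], downcast="integer")
df["score"] = pd.to_numeric(df["score"], downcast="float")
df["city"] = df["city"].astype("category")
after = df.memory_usage(deep=True).sum()

print(f"before: {before:,} bytes")
print(f"after:  {after:,} bytes")
print(f"saved:  {1 - after / before:.0%}")
```

The exact byte counts depend on the pandas version and on how strings are stored, so the script prints the numbers rather than assuming them.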

A Repeatable Shrink Routine

For any DataFrame, a simple loop downcasts every numeric column and converts low-cardinality object columns. Adjust the cardinality threshold to taste.
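The loop described above might look like the following sketch. The function name shrink and the max_unique_ratio parameter are hypothetical, not part of pandas, and the code assumes that text columns hold plain strings.

```python
import pandas as pd
from pandas.api import types as ptypes


def shrink(df: pd.DataFrame, max_unique_ratio: float = 0.5) -> pd.DataFrame:
    """Return a copy with numbers downcast and repetitive text as category.

    max_unique_ratio is an illustrative threshold: a text column is
    converted only if unique values / rows is at or below it.
    """
    out = df.copy()
    for name in out.columns:
        col = out[name]
        if ptypes.is_integer_dtype(col):
            out[name] = pd.to_numeric(col, downcast="integer")
        elif ptypes.is_float_dtype(col):
            out[name] = pd.to_numeric(col, downcast="float")
        elif ptypes.is_object_dtype(col) or isinstance(col.dtype, pd.StringDtype):
            # Category pays off only when values repeat often.
            if len(col) and col.nunique() / len(col) <= max_unique_ratio:
                out[name] = col.astype("category")
    return out


# Hypothetical input: "note" is unique per row, so it should stay as text.
raw = pd.DataFrame({
    "id": range(1000),
    "score": [(i % 100) / 2 for i in range(1000)],
    "city": ["Lisbon", "Oslo", "Cairo", "Lima"] * 250,
    "note": [f"note-{i}" for i in range(1000)],
})
small = shrink(raw)
print(small.dtypes)
print(raw.memory_usage(deep=True).sum(), "->", small.memory_usage(deep=True).sum())
```

Returning a copy rather than modifying the input keeps the routine safe to call on a frame that other code still uses. A ratio is used for the threshold so that the same setting scales with the number of rows; an absolute count of unique values would be an equally reasonable choice.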

Watch the Range

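Downcasting is only safe when the smaller type can hold every value the column will ever contain. An int8 holds only -128 to 127, and an int16 holds -32,768 to 32,767. If a later value falls outside the chosen range, a forced cast can wrap around silently instead of raising an error. Downcast based on the realistic maximum the column can hold, not just the current sample.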
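The sketch below illustrates the risk and one way to guard against it. The helper smallest_int_dtype and the age example are hypothetical; the idea is to pick the type from bounds you know about the domain rather than from the values that happen to be in the sample.

```python
import numpy as np
import pandas as pd

# The value range of each signed integer type.
for dtype in (np.int8, np.int16, np.int32):
    info = np.iinfo(dtype)
    print(f"{info.dtype}: {info.min} to {info.max}")

# A forced NumPy cast does not check the range: 300 wraps to 300 - 256 = 44.
wrapped = np.array([300], dtype=np.int64).astype(np.int8)
print(wrapped)


def smallest_int_dtype(lowest: int, highest: int) -> np.dtype:
    """Hypothetical helper: smallest signed type covering [lowest, highest]."""
    for candidate in (np.int8, np.int16, np.int32, np.int64):
        info = np.iinfo(candidate)
        if info.min <= lowest and highest <= info.max:
            return np.dtype(candidate)
    raise ValueError("range does not fit in a 64-bit signed integer")


# The current sample fits int8, but an age of 130 would not (int8 stops at 127).
ages = pd.Series([12, 47, 101])
by_sample = pd.to_numeric(ages, downcast="integer")
by_domain = ages.astype(smallest_int_dtype(0, 130))
print(by_sample.dtype, by_domain.dtype)
```

Choosing the type from the domain costs one extra byte per row here, which is usually a small price for a column that cannot silently corrupt later values.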

Exercise: Downcast an Integer Column


Exercise: Measure the Savings


Key Points

Measure first with info(memory_usage='deep') or memory_usage(deep=True)
pd.to_numeric(col, downcast='integer'|'float') picks the smallest safe numeric type
Combine numeric downcasting with the category dtype for the biggest savings
A simple per-column loop makes the routine repeatable across any DataFrame
Downcast based on the realistic value range, not just the current sample, to avoid overflow
