Resampling Time Series
Real-world time series data often arrives at one frequency but needs to be analyzed at another. Rolling daily sales up to monthly totals, or averaging hourly readings into a daily figure, are everyday tasks. The resample method is the pandas tool for changing the frequency of a time series, and it works much like a GroupBy that groups by time buckets.
What You'll Learn
- How to downsample to a coarser frequency (daily to monthly)
- How to apply different aggregations while resampling
- How to upsample to a finer frequency and fill gaps
- Common frequency codes you will use
Downsampling: Daily to Monthly
Downsampling reduces the number of rows by grouping them into larger time buckets. Call resample with a frequency code, then an aggregation. The DataFrame must have a DatetimeIndex, or you can name a datetime column with the on parameter.
resample('ME').sum() groups the rows by calendar month and sums each bucket. The result is indexed by the month-end date.
Different Aggregations
Resampling supports any aggregation: sum, mean, min, max, count, and more. You can also produce several at once with agg.
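The following sketch illustrates both styles on invented data: a single aggregation (mean) and several at once through agg. The series name sales and its values are assumptions made for the example.

```python
import pandas as pd

# Hypothetical data: 14 daily values, 1 through 14, starting on a Monday.
idx = pd.date_range("2024-01-01", periods=14, freq="D")
sales = pd.Series(range(1, 15), index=idx, name="sales")

# A single aggregation: the average of each week.
# 'W' buckets end on Sunday by default.
weekly_mean = sales.resample("W").mean()

# Several aggregations at once: agg returns one column per function.
weekly_stats = sales.resample("W").agg(["sum", "mean", "max"])

print(weekly_mean)
print(weekly_stats)
```

Passing a list to agg returns a DataFrame whose columns are named after the functions, which makes it easy to compare the statistics side by side.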
Upsampling and Filling Gaps
Upsampling moves to a finer frequency, which creates new rows with missing values. You then decide how to fill them: ffill carries the last value forward, and interpolate estimates values in between.
Use interpolate instead of ffill when a smooth trend between known points makes more sense than repeating the last value.
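To make the difference concrete, here is a small sketch that upsamples a reading taken every three days to a daily frequency. The data is invented for illustration; asfreq is used only to show the gaps before they are filled.

```python
import pandas as pd

# Hypothetical data: one reading every 3 days (Jan 1, Jan 4, Jan 7).
idx = pd.date_range("2024-01-01", periods=3, freq="3D")
readings = pd.Series([10.0, 40.0, 70.0], index=idx)

# Upsample to daily without filling: the new rows hold NaN.
gaps = readings.resample("D").asfreq()

# Carry the last known value forward into each gap.
filled = readings.resample("D").ffill()

# Estimate the missing values on a straight line between known points.
smooth = readings.resample("D").interpolate()

print(pd.DataFrame({"gaps": gaps, "ffill": filled, "interpolate": smooth}))
```

In this example ffill repeats 10 until the next reading arrives, while interpolate steps evenly from 10 up to 40 and then up to 70.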
Common Frequency Codes
These are the frequency strings you will use most often with resample and date_range:
- D: daily
- W: weekly
- ME: month end
- MS: month start
- QE: quarter end
- YE: year end
- h: hourly
- min: minutely
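Because the same strings work with date_range, a quick way to see what a code means is to generate a few dates with it. The sketch below does this for two of the codes; the start date is arbitrary.

```python
import pandas as pd

# 'MS' lands on the first day of each month.
month_starts = pd.date_range("2024-01-01", periods=3, freq="MS")

# 'W' lands on Sundays by default, so the first date is the first
# Sunday on or after the start date (2024-01-01 is a Monday).
week_ends = pd.date_range("2024-01-01", periods=2, freq="W")

print(month_starts)
print(week_ends)
```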
Exercise: Monthly Totals
Exercise: Weekly Average
Key Points
- resample changes a time series to a new frequency, like a GroupBy over time buckets
- Downsampling (e.g. daily to monthly) reduces rows; pair it with sum, mean, or agg
- Upsampling adds rows with gaps; fill them with ffill or interpolate
- The DataFrame needs a DatetimeIndex before you can resample
- Frequency codes like D, W, ME, and h control the bucket size

