Named Aggregation & Window Patterns
Once you are comfortable with basic GroupBy, the next step is producing clean, well-labeled summary tables and applying group-aware calculations that keep every original row. This lesson covers named aggregation, the difference between transform and filter, and window-function-style operations you will reach for constantly in real data analysis.
What You'll Learn
- How to build readable summary tables with named aggregation
- When to use transform versus filter on a GroupBy
- How to compute group-relative values like share of total and ranks
- How to add running totals within groups
Named Aggregation
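Plain agg(['mean', 'sum']) names the output columns after the functions themselves, and when several source columns are aggregated it produces multi-level column headers. Named aggregation lets you name each output column and point it at a specific source column and function, all in one call.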
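As a minimal sketch, assume a small sales table with a region column and a sales column. The data and the output names (total_sales, avg_sales, n_orders) are invented here for illustration and are not part of the lesson's own dataset.

```python
import pandas as pd

# Hypothetical sales data, invented for illustration only.
df = pd.DataFrame({
    "region": ["East", "East", "West", "West", "North"],
    "sales": [100, 200, 150, 250, 50],
})

# Each keyword is an output column name; each tuple is (source column, function).
summary = df.groupby("region").agg(
    total_sales=("sales", "sum"),
    avg_sales=("sales", "mean"),
    n_orders=("sales", "count"),
)

print(summary)
```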
The pattern is output_name=('source_column', 'function'). The result has clean, analysis-ready column names with no multi-level headers to flatten later.
transform: Keep Every Row
agg collapses each group to one row. transform runs a group calculation but returns a result aligned to the original rows, so you can attach it as a new column. This is how you compute group-relative values.
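A short sketch of the idea, again using an assumed five-row sales table (the data and the new column names region_total and share_of_region are hypothetical):

```python
import pandas as pd

# Hypothetical sales data, invented for illustration only.
df = pd.DataFrame({
    "region": ["East", "East", "West", "West", "North"],
    "sales": [100, 200, 150, 250, 50],
})

# transform("sum") broadcasts each region's total back onto its own rows.
df["region_total"] = df.groupby("region")["sales"].transform("sum")

# Share of group total: each row's sales divided by its region's total.
df["share_of_region"] = df["sales"] / df["region_total"]

print(df)
```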
Notice the DataFrame still has all five rows. transform is the right tool whenever you need a group statistic next to each individual record.
filter: Keep or Drop Whole Groups
filter evaluates a condition per group and keeps every row of the groups that pass. Use it to remove small or low-volume groups before further analysis.
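The following sketch assumes the same kind of sales table, with a single North row of 50, and a hypothetical threshold that keeps only regions whose total sales exceed 100:

```python
import pandas as pd

# Hypothetical sales data, invented for illustration only.
df = pd.DataFrame({
    "region": ["East", "East", "West", "West", "North"],
    "sales": [100, 200, 150, 250, 50],
})

# The function receives each group's sub-DataFrame and must return one boolean.
# Groups that return True keep all of their rows; the rest are dropped entirely.
big_regions = df.groupby("region").filter(lambda g: g["sales"].sum() > 100)

print(big_regions)
```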
The single North row is dropped because its group total of 50 does not pass the test. filter returns rows, not an aggregated table.
Window-Style Operations Within Groups
You can rank rows inside each group or build a running total per group by combining GroupBy with rank and cumsum. These behave like SQL window functions.
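One way this might look, assuming the same hypothetical sales table. The column names rank_in_region and running_total are illustrative, and the choice of method="dense" with descending order is just one reasonable ranking option:

```python
import pandas as pd

# Hypothetical sales data, invented for illustration only.
df = pd.DataFrame({
    "region": ["East", "East", "West", "West", "North"],
    "sales": [100, 200, 150, 250, 50],
})

by_region = df.groupby("region")["sales"]

# Rank within each region, with the largest sale ranked 1.
df["rank_in_region"] = by_region.rank(ascending=False, method="dense")

# Running total within each region, in the existing row order.
df["running_total"] = by_region.cumsum()

print(df)
```

Because cumsum follows the current row order, sort the DataFrame first (for example by date) if the running total should follow a particular sequence.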
rank and cumsum respect group boundaries, so each region restarts its own ranking and running total.
Exercise: Share of Group Total
Exercise: Named Summary Table
Key Points
- Named aggregation (name=('col', 'func')) produces clean, ready-to-use column names
- transform returns a value per original row, perfect for group-relative columns
- filter keeps or drops whole groups based on a group condition
- rank and cumsum over a GroupBy act like window functions within each group
- Reach for transform when you need a group statistic alongside every record

