Window aggregation in databases and data processing systems is a technique where you perform calculations across a set of rows that are related to the current row. This "set of rows" is the window. Unlike traditional aggregate functions (like SUM, AVG, COUNT) that collapse groups of rows into single values, window aggregates preserve the individual rows while still providing insights calculated across those relevant rows.
Key Concepts
SUM(), AVG(), MIN(), MAX(): Standard aggregate functions applied within the window frame.RANK(), DENSE_RANK(), ROW_NUMBER(): Ranking functions within the window.LAG(), LEAD(): Accessing values from previous or subsequent rows within the window frame.Why Use Window Aggregations?
LAG() and LEAD() to address missing values in a sequence.Example
Let's say you have a sales dataset with columns date, region, and sales_amount. You could use a window function to calculate the moving average of sales for each region over the last 7 days:
SQL
SELECT date, region, sales_amount, AVG(sales_amount) OVER (PARTITION BY region ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS moving_average FROM sales_data;