SQL Server Advent Calendar 13 – Grouping Sets

Day 13 of my virtual advent calendar, about stuff I like in SQL Server 2008..

The SQL Group by clause has been around since I can remember, but in SQL Server 2008 we know have grouping sets.  Imagine you wanted to sum sales amount from the adventure works reseller sales table by year, by Product Category and by Year AND Product Category perhaps as the first stage in writing a report.  You also need to have all the results in one query so you end up doing something like this in SQL Server 2005:

SELECT
NULL AS [year],
dpc.EnglishProductCategoryName AS Category,
SUM(frs.salesamount)
FROM FactResellerSales frs
INNER JOIN DimProduct dp ON dp.ProductKey = frs.ProductKey
INNER JOIN DimProductSubcategory dps ON dps.ProductSubcategoryKey = dp.ProductSubcategoryKey
INNER JOIN DimProductCategory dpc ON dpc.ProductCategoryKey = dps.ProductCategoryKey
GROUP BY dpc.EnglishProductCategoryName

UNION ALL

SELECT
dt.CalendarYear AS [year],
NULL AS Category,
SUM(frs.salesamount)
FROM FactResellerSales frs
INNER JOIN DimTime dt ON dt.TimeKey = frs.OrderDateKey
GROUP BY dt.CalendarYear

UNION ALL

SELECT
dt.CalendarYear AS [year],
dpc.EnglishProductCategoryName AS Category,
SUM(frs.salesamount)
FROM FactResellerSales frs
INNER JOIN DimProduct dp ON dp.ProductKey = frs.ProductKey
INNER JOIN DimProductSubcategory dps ON dps.ProductSubcategoryKey = dp.ProductSubcategoryKey
INNER JOIN DimProductCategory dpc ON dpc.ProductCategoryKey = dps.ProductCategoryKey
INNER JOIN DimTime dt ON dt.TimeKey = frs.OrderDateKey
GROUP BY dpc.EnglishProductCategoryName,
dt.CalendarYear

Grouping sets makes this a far simpler exercise which also executes only once against the fact table..

SELECT
dt.CalendarYear AS [year],
dpc.EnglishProductCategoryName AS Category,
SUM(frs.salesamount)
FROM FactResellerSales frs
INNER JOIN DimProduct dp ON dp.ProductKey = frs.ProductKey
INNER JOIN DimProductSubcategory dps ON dps.ProductSubcategoryKey = dp.ProductSubcategoryKey
INNER JOIN DimProductCategory dpc ON dpc.ProductCategoryKey = dps.ProductCategoryKey
INNER JOIN DimTime dt ON dt.TimeKey = frs.OrderDateKey
GROUP BY
GROUPING SETS
( (dpc.EnglishProductCategoryName),
(dt.CalendarYear),
(dpc.EnglishProductCategoryName,dt.CalendarYear),
()
)

Each row under grouping sets groups the above query in a different way  equivalent to the previous query except that I have added an extra set, the last row, () which gives the grand total,  (which shows up as  NULL,NULL, 804505969823) in the results below..

image

I think this plays very well with Reporting Services and for generating aggregated fact tables in large relational data warehouses, but will obviously need development work to implement.

Technorati Tags: data warehouse,grouping sets,T-SQL,SQL Server 2008