Dynamic Cross-Tabs/Pivot Tables
By Rob Volk
on 12 March 2001
| 239 Comments
| Tags: Dynamic SQL
IMHO, the best feature of MS Access is the TRANSFORM statement, used to create cross-tabs/pivot tables. It does all of the work of dynamically generating the cross-tabulation and the summary calculations. T-SQL unfortunately doesn't have this statement, so you're stuck using complicated SQL commands, expensive 3rd party products, or exotic OLAP to make pivot tables...or you can use the following procedure to dynamically create them!
I got the idea from this question, asking how to "undo" a pivot table, and then I started working on how to create them in T-SQL. There are numerous ways of doing pivot tables, and this site has several examples (and lots of other cool stuff). The standard method uses a CASE statement, with one CASE for each pivot value (the column headings created by cross-tabbing the pivot column). The greatest shortcoming is finding a way to handle an unknown or changing number of pivot values. Obviously you have to know these values beforehand, and you must add a CASE for each new, distinct value inserted into the pivot column. The code listed below will do all of the work for you:
CREATE PROCEDURE crosstab
DECLARE @sql varchar(8000), @delim varchar(1)
SET NOCOUNT ON
SET ANSI_WARNINGS OFF
EXEC ('SELECT ' + @pivot + ' AS pivot INTO ##pivot FROM ' + @table + ' WHERE 1=2')
EXEC ('INSERT INTO ##pivot SELECT DISTINCT ' + @pivot + ' FROM ' + @table + ' WHERE '
+ @pivot + ' Is Not Null')
SELECT @sql='', @sumfunc=stuff(@sumfunc, len(@sumfunc), 1, ' END)' )
SELECT @delim=CASE Sign( CharIndex('char', data_type)+CharIndex('date', data_type) )
WHEN 0 THEN '' ELSE '''' END
WHERE table_name='##pivot' AND column_name='pivot'
SELECT @sql=@sql + '''' + convert(varchar(100), pivot) + ''' = ' +
stuff(@sumfunc,charindex( '(', @sumfunc )+1, 0, ' CASE ' + @pivot + ' WHEN '
+ @delim + convert(varchar(100), pivot) + @delim + ' THEN ' ) + ', ' FROM ##pivot
DROP TABLE ##pivot
SELECT @sql=left(@sql, len(@sql)-1)
SELECT @select=stuff(@select, charindex(' FROM ', @select)+1, 0, ', ' + @sql + ' ')
SET ANSI_WARNINGS ON
Yeah, I know, you want to know how it works :) I won't go into detail on the mechanics of the code. Let's just say that if you can follow it, feel free to play with it; if not, DON'T TOUCH IT! The syntax for the procedure call is below, followed by a description of each parameter:
|EXECUTE crosstab||SELECT statement,||summary calculation,||pivot column,||table name|
- The SELECT statement can be anything, as long as it has proper syntax and includes a GROUP BY clause. You can use JOINs, but if you use table aliases you should include the alias in the summary calculation expression (2).
- The summary calculation must have an aggregate function like SUM(), AVG(), MIN(), MAX(), etc. You'd have to modify the code if you want to use DISTINCT with these functions. COUNT(*) won't work, you have to COUNT on a column.
- The pivot column must be in the table (4). You can use an expression for the pivot column (a+b, LEFT(FirstName,3), etc.) as long as it can be derived from the table listed in (4). A cross-tab heading will be created for each distinct value in the pivot colum/expression.
- This table can be any table in your database, or another database if you use the full naming syntax (database.owner.table). Tables in a linked server may also work, but I haven't tested this. It's possible that a derived table (nested SELECT) can work, but I haven't tested this either. You would need to enclose the SELECT statement in parentheses, and use a table alias outside these parentheses, like this: '(SELECT LastName FROM myTable) AS Surnames'
I'll list some cross-tab settings and the results. Here's two you can run in the pubs database:
EXECUTE crosstab 'select title from titles inner join sales on (sales.title_id=titles.title_id)
group by title', 'sum(qty)','stor_id','stores'
|But Is It User Friendly?|| || || || || ||30|
|Computer Phobic AND Non-Phobic Individuals: Behavior Variations || || || ||20 || || |
|Cooking with Computers: Surreptitious Balance Sheets || || || || || ||25|
|Emotional Security: A New Algorithm || || || ||25 || || |
|Fifty Years in Buckingham Palace Kitchens || || ||20 || || || |
|Is Anger the Enemy? ||3 ||75 ||10 ||20 || || |
|Life Without Fear || || || ||25 || || |
|Onions, Leeks, and Garlic: Cooking Secrets of the Mediterranean || || ||40 || || || |
|Prolonged Data Deprivation: Four Case Studies || || || ||15 || || |
|Secrets of Silicon Valley || ||50 || || || || |
|Silicon Valley Gastronomic Treats || || || || ||10 || |
|Straight Talk About Computers || || || || ||15 || |
|Sushi, Anyone? || || ||20 || || || |
|The Busy Executive's Database Guide ||5 || || || || ||10 |
|The Gourmet Microwave || || || ||25 || ||15 |
|You Can Combat Computer Stress! || || || || ||35 || |
EXECUTE crosstab 'select pub_name, count(qty) as orders, sum(qty) as total
from sales inner join titles on (sales.title_id=titles.title_id)
right join publishers on (publishers.pub_id=titles.pub_id)
group by pub_name', 'sum(qty)','type','titles'
|Algodata Infosystems||6||135||55|| ||80|| || || |
|Binnet & Hardley||7||150|| ||50|| ||20||80|| |
|Five Lakes Publishing||0|| || || || || || || |
|GGG&G||0|| || || || || || || |
|Lucerne Publishing||0|| || || || || || || |
|New Moon Books||8||208||35|| || ||173|| || |
|Ramona Publishers||0|| || || || || || || |
|Scootney Books||0|| || || || || || || |
Here's one that will run in Northwind:
EXECUTE crosstab 'SELECT LastName FROM Employees INNER JOIN Orders
GROUP BY LastName', 'count(lastname)', 'Year(OrderDate)', 'Orders'
Some things to look out for:
- This procedure builds a SQL statement from your parameters, inserts the CASE statements to handle the cross-tabbing, and then executes it. The size of this statement is maxed out at 8000 characters, so there's a limit on the size and complexity of the SELECT statement and the number of pivoted values you can have.
- This procedure uses a global temporary table, and will cause an error if two sessions execute it at the same time. (thanks for catching this Garth!) I will have a remedy soon and will post the fixed code in this article.
- Date values can be pivoted, but remember that the timestamp portion will be included, so you should CONVERT() them in the pivot expression. NULLs can't be pivoted, you must use IsNull() to replace them with a value. I have not tested nvarchar or nchar data types, and the code may need to be modified to work with them. Nor have I tried ntext or text types, but I'm sure they can't be pivoted anyway.
- You can use a HAVING clause, or include WITH CUBE or WITH ROLLUP in the SELECT statement and it will return the appropriate summary rows for each group.
- You can even ORDER BY cross-tabbed columns, something you can't do with the TRANSFORM statement in Access.
I'm spending some time working on enhancements to this procedure (for a follow-up article), to allow things like subqueries, multiple pivoting columns, non-aggregate values (e.g. - show the store with the highest sales each month, pivoted by book type). I would love to hear from anyone who has suggestions or solutions on how to improve this code.