jezemine
Master Smack Fu Yak Hacker
2886 Posts
Posted - 2008-02-11 : 20:54:07
Hi folks,

I have a rather complicated TVF that returns a record set conforming to this schema:

create table Speeds
(
    RoadID int not null,
    BinID smallint not null,
    Speed tinyint not null,
    primary key (RoadID asc, BinID asc)
)

BinID varies from 0 to 999. There are about 1 million distinct RoadID values, so the total number of rows returned by the TVF with no WHERE clause applied is about 1 billion.

I can't post the code of the TVF. Suffice to say it's a 5-way join, indexed such that this query starts returning rows instantly:

select RoadID, BinID, Speed
from dbo.ComplicatedTVF(@someArgs)
order by RoadID, BinID

My requirement is to create a crosstab report over these 1 billion rows that has these columns:

RoadID, SpeedAtBin0, SpeedAtBin1, ..., SpeedAtBin999

I tried it using both the PIVOT syntax available in 2005 and the old-school method using CASE:

-- using PIVOT:
select RoadID, [0], [1], ... [999]
from (select RoadID, Speed, BinID from dbo.ComplicatedTVF()) a
pivot (min(Speed) for BinID in ([0], [1], ... [999])) b

-- using CASE:
select RoadID
    ,min(CASE BinID when 0 then Speed else NULL end) AS Bin0
    ,min(CASE BinID when 1 then Speed else NULL end) AS Bin1
    ...
    ,min(CASE BinID when 999 then Speed else NULL end) AS Bin999
from dbo.ComplicatedTVF()
group by RoadID

Unfortunately, I get this error for either method:

Msg 8632, Level 17, State 2, Line 61
Internal error: An expression services limit has been reached. Please look for potentially complex expressions in your query, and try to simplify them.

If I limit the number of columns in the crosstab to 100 or so, the PIVOT method works. It falls over around 200.

I'm starting to think I might need to just loop over all the rows (gasp!) in compiled code and create the crosstab by hand...

One other thing I thought of was to dump the raw output of the TVF into a temp table and try the crosstab on that, but I tend to shy away from a temp table with 1b rows. Yuck.

I don't care if this takes an hour or so to run, btw. It's part of an offline process.

elsasoft.org
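The "loop over all the rows in compiled code" fallback mentioned above can be sketched quite compactly. Below is a minimal, hypothetical Python stand-in (the real solution in this thread was a compiled .NET app; function and column names here are illustrative, only the Speeds schema is from the post). Because the TVF already returns rows ordered by RoadID, the crosstab can be streamed: accumulate one road's bins, emit a CSV row, move on. Memory use stays constant no matter how many rows flow through.

```python
import csv
import io

NUM_BINS = 4  # 1,000 in the real report; kept small for this sketch

def write_crosstab(rows, out):
    """Stream (RoadID, BinID, Speed) tuples, already ordered by RoadID,
    into a crosstab: one CSV row per RoadID, one column per bin."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["RoadID"] + [f"SpeedAtBin{b}" for b in range(NUM_BINS)])
    current_id = None
    speeds = [""] * NUM_BINS          # empty cell where a bin is missing
    for road_id, bin_id, speed in rows:
        if road_id != current_id:
            if current_id is not None:
                writer.writerow([current_id] + speeds)
            current_id = road_id
            speeds = [""] * NUM_BINS
        speeds[bin_id] = speed
    if current_id is not None:        # flush the last road
        writer.writerow([current_id] + speeds)

# tiny stand-in for the real TVF output
sample = [(1, 0, 55), (1, 2, 60), (2, 1, 30)]
buf = io.StringIO()
write_crosstab(sample, buf)
print(buf.getvalue())
```

The same single-pass pattern works over a live data reader just as well as over a list, which is why the ORDER BY on the TVF matters: it is what makes the constant-memory streaming possible.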
khtan
In (Som, Ni, Yak)
17689 Posts
Posted - 2008-02-11 : 21:00:47

Oh, I didn't know there is a limit in PIVOT. Maybe you can split it into 10 PIVOTs with 100 columns each and join them back?

KH
[spoiler]Time is always against us[/spoiler]
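The shape of khtan's suggestion, pivot in slices and join the slices back on the key, can be illustrated outside SQL with a small Python sketch (hypothetical helper names; each slice plays the role of one ~100-column PIVOT that stays under the expression-services limit):

```python
def pivot_slice(rows, bin_lo, bin_hi):
    """Pivot only the bins in [bin_lo, bin_hi): {road_id: {bin_id: speed}}.
    Stands in for one narrow PIVOT query."""
    out = {}
    for road_id, bin_id, speed in rows:
        if bin_lo <= bin_id < bin_hi:
            out.setdefault(road_id, {})[bin_id] = speed
    return out

def join_slices(slices):
    """Stitch the partial crosstabs back together on road_id,
    like joining the PIVOT results on the key column."""
    merged = {}
    for part in slices:
        for road_id, bins in part.items():
            merged.setdefault(road_id, {}).update(bins)
    return merged

rows = [(1, 0, 55), (1, 150, 60), (2, 999, 30)]
# two slices of 500 bins each instead of one huge pivot
slices = [pivot_slice(rows, lo, lo + 500) for lo in (0, 500)]
full = join_slices(slices)
```

As the thread notes later, the equivalent T-SQL (ten PIVOTs joined on RoadID) turned out to be very slow at this scale, so the sketch shows the idea rather than a recommendation.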
jezemine
Master Smack Fu Yak Hacker
2886 Posts
Posted - 2008-02-11 : 21:12:14

I don't think the limit is in PIVOT per se; the CASE method fails with the same error. I think the query is simply too "complicated" for the optimizer. Perhaps the parse tree is too deep or something. I'll try your suggestion.

elsasoft.org
jezemine
Master Smack Fu Yak Hacker
2886 Posts
khtan
In (Som, Ni, Yak)
17689 Posts
Posted - 2008-02-11 : 21:34:02

Yes.

KH
[spoiler]Time is always against us[/spoiler]
jsmith8858
Dr. Cross Join
7423 Posts
Posted - 2008-02-11 : 22:38:33

Where are you ultimately outputting your results? Why is it necessary to generate a pivot of 1,000 columns?

- Jeff
http://weblogs.sqlteam.com/JeffS
jezemine
Master Smack Fu Yak Hacker
2886 Posts
Posted - 2008-02-12 : 00:18:17

Output would go to a CSV file. It's necessary because that's the requirement: apparently some of our customers would prefer to work with 100k rows of 1001 columns each rather than 1b rows of 3 columns each. I myself would prefer the latter, but I don't buy the product - I just build it.

elsasoft.org
jsmith8858
Dr. Cross Join
7423 Posts
Posted - 2008-02-12 : 07:23:38

Can you generate this CSV file with a custom app, say one written in .NET? It would be only a few simple lines of code to do the work there; I'd be happy to help if you can provide your specific requirements. It will be not only easier and much less code, but also much, much faster and more flexible.

However, I still have to ask: what, then, will the customer do with this file? They won't be able to make sense of it viewing it in Notepad, and they won't be able to import that many columns into a table or Excel file... what is the final goal here? This reminds me of the client who insisted that their primary reporting need was a report that would be 300,000 printed pages because they "always needed to see everything."

- Jeff
http://weblogs.sqlteam.com/JeffS
jezemine
Master Smack Fu Yak Hacker
2886 Posts
Posted - 2008-02-12 : 09:43:33

quote: Originally posted by jsmith8858
Can you generate this CSV file with a custom app, say one written in .NET? It would be only a few simple lines of code to do the work there, I'd be happy to help if you can provide your specific requirements. It will be not only easier and much less code, but also much, much faster and more flexible.

I have already done so. I first tried khtan's suggestion of joining 10 crosstabs in T-SQL, but that was very slow: I killed it after an hour and no rows had yet been returned. My compiled app generates the CSV in under an hour, which is acceptable.

quote: Originally posted by jsmith8858
However, I still have to ask: What, then, will the customer do with this file? They won't be able to make sense of it viewing it in Notepad, and they won't be able to import that many columns into a table or Excel file... what is the final goal here? This reminds me of the client who insisted that their primary reporting need was a report that would be 300,000 printed pages because they "always needed to see everything."

The PM for this project says the customer wants to load it up in Excel to "get a feel for the data". Apparently Excel 2007 can support such large files, or so they tell me. I already objected that this was pointless, but was overruled.

elsasoft.org
SwePeso
Patron Saint of Lost Yaks
30421 Posts
Posted - 2008-02-12 : 10:23:02

You could do this with dynamic SQL. I wrote an article about that here: http://www.sqlservercentral.com/articles/Advanced+Querying/pivottableformicrosoftsqlserver/2434/

It does one update per column. Reasonably fast. Please let me know if you have access to the article; otherwise email me and I can send it as a PDF to you.

E 12°55'05.25"N 56°04'39.16"
SwePeso
Patron Saint of Lost Yaks
30421 Posts
Posted - 2008-02-12 : 10:39:49

Here is the code without explanations. I trust you are experienced enough to dissect it.

CREATE PROCEDURE uspSalesOfficeDepartmentPivot
(
    @FromDate DATETIME,
    @ToDate DATETIME
)
AS

SET NOCOUNT ON

CREATE TABLE #Aggregates
    (
        RowText VARCHAR(50),
        ColumnText VARCHAR(50),
        CellData INT
    )

INSERT #Aggregates
    (
        RowText,
        ColumnText,
        CellData
    )
SELECT      Sales.Office,
            Sales.Department,
            SUM(Invoices.Amount)
FROM        Sales
INNER JOIN  Invoices ON Sales.OfficeID = Invoices.OfficeID
WHERE       Invoices.OrderDate BETWEEN @FromDate AND @ToDate
GROUP BY    Sales.Office,
            Sales.Department
HAVING      SUM(Invoices.Amount) <> 0

CREATE UNIQUE INDEX IX_Aggregates ON #Aggregates (RowText, ColumnText, CellData)

CREATE TABLE #Columns
    (
        ColumnIndex INT IDENTITY (0, 1),
        ColumnText VARCHAR(50)
    )

INSERT #Columns (ColumnText)
SELECT DISTINCT ColumnText
FROM #Aggregates WITH (NOLOCK, INDEX(IX_Aggregates))
ORDER BY ColumnText

CREATE UNIQUE INDEX IX_Columns ON #Columns (ColumnIndex, ColumnText)

CREATE TABLE #Rows
    (
        RowText VARCHAR(50)
    )

INSERT #Rows (RowText)
SELECT DISTINCT RowText
FROM #Aggregates WITH (NOLOCK, INDEX(IX_Aggregates))

CREATE UNIQUE INDEX IX_Rows ON #Rows (RowText)

DECLARE @ColumnIndex INT,
        @MaxColumnIndex INT,
        @ColumnText VARCHAR(50),
        @SQL VARCHAR(1000)

SELECT  @ColumnIndex = 0,
        @MaxColumnIndex = MAX(ColumnIndex)
FROM    #Columns

WHILE @ColumnIndex <= @MaxColumnIndex
    BEGIN
        SELECT  @ColumnText = ColumnText
        FROM    #Columns
        WHERE   ColumnIndex = @ColumnIndex

        SELECT @SQL = 'ALTER TABLE #Rows ADD ' + QUOTENAME(@ColumnText) + ' INT NULL DEFAULT 0'

        EXEC (@SQL)

        SET @SQL = 'UPDATE      r
                    SET         r.' + QUOTENAME(@ColumnText) + ' = a.CellData
                    FROM        #Rows AS r
                    INNER JOIN  #Aggregates AS a WITH (NOLOCK, INDEX(IX_Aggregates)) ON a.RowText = r.RowText
                    INNER JOIN  #Columns AS c WITH (NOLOCK, INDEX(IX_Columns)) ON c.ColumnText = a.ColumnText
                    WHERE       c.ColumnIndex = ' + CAST(@ColumnIndex AS VARCHAR(12))

        EXEC (@SQL)

        SET @ColumnIndex = @ColumnIndex + 1
    END

DROP TABLE #Columns, #Aggregates

SELECT *
FROM #Rows
ORDER BY RowText

DROP TABLE #Rows

E 12°55'05.25"N 56°04'39.16"
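The strategy SwePeso's procedure uses, add one output column, populate it for all rows, then move on to the next column, can be restated as a short Python sketch for readers who find the dynamic SQL hard to follow (names here are illustrative, not taken from the procedure):

```python
def crosstab_by_column(aggregates):
    """aggregates: list of (row_text, column_text, cell_data) tuples.
    Build the crosstab one column at a time, mirroring the procedure's
    ALTER TABLE ... ADD column followed by a per-column UPDATE."""
    columns = sorted({c for _, c, _ in aggregates})
    rows = {r: {} for r, _, _ in aggregates}
    for col in columns:                      # one pass per output column
        for r in rows:                       # the DEFAULT 0 on the new column
            rows[r][col] = 0
        for r, c, v in aggregates:           # the per-column UPDATE join
            if c == col:
                rows[r][col] = v
    return columns, rows

cols, table = crosstab_by_column([
    ("Stockholm", "Sales", 100),
    ("Stockholm", "Support", 40),
    ("Malmo", "Sales", 70),
])
```

The point of the design is that no single statement ever mentions more than one pivot column, so the 1,000-column case never hits the expression-services limit from the first post; the cost is one pass over the aggregates per column.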
jezemine
Master Smack Fu Yak Hacker
2886 Posts
Posted - 2008-02-12 : 10:40:51

I don't see how dynamic SQL would help. I am not having any trouble composing the query; the problem is that the query, once composed, cannot be processed by SQL Server.

It matters not at this point anyway, as I have a solution using compiled code.

elsasoft.org
SwePeso
Patron Saint of Lost Yaks
30421 Posts
Posted - 2008-02-12 : 10:45:19

It will create a 1000-column crosstab report reasonably fast. My example above builds one column at a time and populates that column for all rows before moving on to the next column.

E 12°55'05.25"N 56°04'39.16"