Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 Running totals

Author  Topic 

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2008-05-23 : 06:29:16
On SQL Server 2005 at least, this works efficiently where we have an indexed row number.

It does seem to be very sensitive to the join condition in the recursive part of the CTE: changing it to the equivalent "ON T.rn - 1 = RT.rn" results in a scan of T each time instead of a seek!


DROP TABLE dbo.T

-- rn must have contiguous values for this to work:
CREATE TABLE dbo.T (rn int PRIMARY KEY, f float NOT NULL)

-- 100000 random floats between 0 and 1:
INSERT INTO dbo.T
SELECT n+1 AS rn, RAND(CAST(NEWID() AS binary(4))) AS f
FROM dbo.Numbers
GO

;WITH RT AS (
SELECT rn, f AS rt
FROM dbo.T
WHERE rn = 1

UNION ALL

SELECT T.rn, RT.rt + T.f
FROM RT
INNER JOIN dbo.T AS T
ON T.rn = RT.rn + 1
)
SELECT *
INTO dbo.TRT
FROM RT
--ORDER BY rn
OPTION (MAXRECURSION 0)



SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
Table 'Worktable'. Scan count 2, logical reads 600001, physical reads 0,
read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'T'. Scan count 100000, logical reads 200002, physical reads 0,
read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 3500 ms, elapsed time = 3724 ms.

(100000 row(s) affected)

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-05-23 : 06:40:29
Hmmm...

Sounds like a DATEDIFF(YEAR, Col1, GETDATE()) > 16 problem to me.
When doing a calculation over an indexed column, you will not get a seek anymore, you will get a scan.

In your case above, table T has column rn indexed as primary key.
Then T.rn = RT.rn + 1 will work great since T.rn (the indexed column) is untouched.

Changing to T.rn - 1 = RT.rn does a calculation over the indexed column and makes the index useless.


E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2008-05-23 : 06:58:03
Yes, I think you're right. I probably don't normally notice the difference normally because most of the joins I do on adjacent row numbers are self joins, so it doesn't matter which way round they get done.

So for this data:

SELECT * FROM dbo.T AS T1 INNER JOIN dbo.T AS T2 ON T1.rn = T2.rn + 1
SELECT * FROM dbo.T AS T1 INNER JOIN dbo.T AS T2 ON T1.rn - 1 = T2.rn

would generate different execution plans, but they will take the same amount of time. Whereas with the recursive CTE, the two sides of the join are completely different.

And of course, it's not strictly true to say that they are equivalent. They would get arithmetic overflow errors in different places in the presence of minint and maxint values of rn.
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-05-23 : 07:03:14
quote:
Originally posted by Arnold Fribble

SELECT * FROM dbo.T AS T1 INNER JOIN dbo.T AS T2 ON T1.rn = T2.rn + 1
SELECT * FROM dbo.T AS T1 INNER JOIN dbo.T AS T2 ON T1.rn - 1 = T2.rn


They will generate same execution plan, because what ever angle you look at it, the indexed column is manipulated and you will get clustered index scan.



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2008-05-23 : 07:24:54
Yes, I guess I shouldn't have been surprised at having to think more than usual about which side's row number to apply the calculation to.

My expecting the optimizer to allow transformations that are perfectly ok on mathematical integers but not on int is probably a bad idea!

Go to Top of Page
   

- Advertisement -