 Article: Joining to the Next Sequential Row


AskSQLTeam
Ask SQLTeam Question

0 Posts

Posted - 2008-04-02 : 07:59:54

One of the more obscure requirements a developer may face is the need to compare a row with its immediate sibling. One such case is when a list of values needs to be processed to produce a moving average, or to smooth a sequence of statistical numbers whose order is important, for example values lying along a timeline. The solution is actually quite simple, but not immediately obvious.
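For readers skimming the thread without the article open, here is a minimal sketch of the idea under discussion. The table, column names, and sample rows below are hypothetical (the article defines its own tbStats), but the pattern is the self-join on a gap-free sequential id that the thread later refers to as Solution #1.

-- Hypothetical table and data; only the shape matters here.
CREATE TABLE #tbStats
(
    id     INT IDENTITY(1, 1) PRIMARY KEY,  -- gap-free sequential id
    Period DATETIME NOT NULL,
    Stat   INT NOT NULL
)

INSERT #tbStats (Period, Stat) VALUES ('20080301', 10)
INSERT #tbStats (Period, Stat) VALUES ('20080302', 14)
INSERT #tbStats (Period, Stat) VALUES ('20080303', 9)

-- Pair every row with its immediate successor by joining on id + 1.
SELECT  a.Period AS curPeriod,
        a.Stat   AS curStat,
        b.Period AS nextPeriod,
        b.Stat   AS nextStat
FROM    #tbStats a
LEFT JOIN #tbStats b
        ON b.id = a.id + 1
ORDER BY a.id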




Read Joining to the Next Sequential Row

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-04-02 : 12:09:07
What about this approach?
SELECT  AVG(e.Gap)
FROM (
        SELECT  d.Period,
                MAX(d.Stat) - MIN(d.Stat) AS Gap
        FROM (
                SELECT  Stat,
                        Period,
                        0 AS aFake
                FROM    tbStats

                UNION ALL

                SELECT  Stat,
                        DATEADD(DAY, 1, Period),
                        1
                FROM    tbStats
             ) AS d
        GROUP BY d.Period
        HAVING  COUNT(*) = 2
     ) AS e
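In case the trick isn't obvious: the second branch of the UNION shifts every row forward one day, so any Period that ends up with exactly two rows (the HAVING COUNT(*) = 2) holds a value together with the previous day's value, and MAX - MIN is the gap between them. A small illustration of the intermediate derived table, reusing the hypothetical one-row-per-day #tbStats sketched near the top of the thread:

-- For 20080302 the derived table holds both the original Stat (14) and the
-- 20080301 Stat (10) shifted forward one day, so Gap = 4; the single-row
-- Periods at the edges are the ones the HAVING clause throws away.
SELECT  Period,
        MAX(Stat) - MIN(Stat) AS Gap,
        COUNT(*) AS rowsForPeriod
FROM (
        SELECT Stat, Period FROM #tbStats
        UNION ALL
        SELECT Stat, DATEADD(DAY, 1, Period) FROM #tbStats
     ) AS d
GROUP BY Period
ORDER BY Period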




mharr
Starting Member

20 Posts

Posted - 2008-04-02 : 15:21:01
I believe you can still use solution #1 even if the rows are out of order, by using the ROW_NUMBER() function to create a new, perfectly sequential ID:

select
x.r_id xId,
y.r_id yId,
x.stat xStat,
y.stat yStat
from
(SELECT ROW_NUMBER() OVER(ORDER BY period ASC, id ASC) as r_id,
stat from tbStats ) x
left join
(SELECT ROW_NUMBER() OVER(ORDER BY period ASC, id ASC) as r_id,
stat from tbStats ) y
on x.r_id + 1 = y.r_id
order by
x.r_id


I would think this would be more efficient than executing a TOP (1) query in a UDF for each row. It may be more efficient to just put the results of the SELECT ROW_NUMBER() query into a temp table and run the Solution #1 approach against the temp table (sketched below), but I suspect the query optimizer will recognize that both subqueries are the same and do that anyway if it thinks it is more efficient (probably depending on the estimated size of the result set).

BTW, I specified both the period and id columns in the ORDER BY clause of the ROW_NUMBER() function just to ensure that both subqueries number the rows in the same order, even if the table has the same period value in more than one row. Otherwise, subqueries x and y could order the ties differently, and the join on (x.r_id + 1 = y.r_id) could pair rows that are not actually adjacent.
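For concreteness, a rough sketch of that temp-table variant (my paraphrase, not tested code): number the rows once, materialize them, then self-join.

SELECT  ROW_NUMBER() OVER (ORDER BY period ASC, id ASC) AS r_id,
        stat
INTO    #seq
FROM    tbStats

CREATE UNIQUE CLUSTERED INDEX IX_seq_r_id ON #seq (r_id)  -- helps the self-join

SELECT  x.r_id AS xId,
        y.r_id AS yId,
        x.stat AS xStat,
        y.stat AS yStat
FROM    #seq x
LEFT JOIN #seq y
        ON x.r_id + 1 = y.r_id
ORDER BY x.r_id

DROP TABLE #seq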

Mark

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-04-02 : 15:24:05
If you are already into SQL Server 2005 features, why not use a CTE?
It would be fun if someone could performance-test the different approaches.




mharr
Starting Member

20 Posts

Posted - 2008-04-02 : 15:38:29
quote:
Originally posted by mharr

I believe you can still use solution #1 even if the rows are out of order, by using ROW_NUMBER() function to create a new perfectly sequential ID: ....



It just occurred to me after I posted this that there is a somewhat better way to write this same solution:

select
x.r_id xId,
y.r_id yId,
x.stat xStat,
y.stat yStat
from
(SELECT ROW_NUMBER() OVER(ORDER BY period ASC, id ASC) + 1 as r_id,
stat from #tbStats ) x
left join
(SELECT ROW_NUMBER() OVER(ORDER BY period ASC, id ASC) as r_id,
stat from #tbStats ) y
on x.r_id = y.r_id
order by
x.r_id


Or, the final solution:

select
avg(abs(x.stat - y.stat)) movingAvg
from
(SELECT ROW_NUMBER() OVER(ORDER BY period ASC, id ASC) + 1 as r_id,
stat from #tbStats ) x
left join
(SELECT ROW_NUMBER() OVER(ORDER BY period ASC, id ASC) as r_id,
stat from #tbStats ) y
on x.r_id = y.r_id




Mark

atulmar
Starting Member

7 Posts

Posted - 2008-04-02 : 21:03:58
I have used ROW_NUMBER in combination with a CTE in these situations many times. It works simply and well.

The only catch is that the plan it generates is bigger. I would not suggest using it when there are millions of records in the data set, as it is going to use a lot of memory.

At the same time, for smaller data sets I don't see a better option.

Thanks
Atul

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-04-03 : 03:33:08
OK, some performance measurements.

Original 8 sample records
  Paul Alcon 2      32 reads      15 duration
  Paul Alcon 3      26 reads       1 duration
  Peso               6 reads       1 duration
  mharr             20 reads       2 duration

Using 2048 sample records
  Paul Alcon 2    6161 reads     140 duration
  Paul Alcon 3    6155 reads      42 duration
  Peso              16 reads      15 duration
  mharr             16 reads      19 duration

Using 487765 sample records
  Peso            2544 reads     810 duration
  mharr           2544 reads    1498 duration
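For anyone wanting to reproduce figures like these: the reads/duration columns look like the per-statement output of SQL Server's statistics settings (or a Profiler trace); that is an assumption on my part, but the pattern Starnamer uses further down the thread is the usual one.

SET STATISTICS IO ON
SET STATISTICS TIME ON

-- run the query being measured here

SET STATISTICS TIME OFF
SET STATISTICS IO OFF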


RevMike
Starting Member

9 Posts

Posted - 2008-04-04 : 09:07:19
quote:
Originally posted by mharr

I believe you can still use solution #1 even if the rows are out of order, by using ROW_NUMBER() function to create a new perfectly sequential ID:

select
x.r_id xId,
y.r_id yId,
x.stat xStat,
y.stat yStat
from
(SELECT ROW_NUMBER() OVER(ORDER BY period ASC, id ASC) as r_id,
stat from tbStats ) x
left join
(SELECT ROW_NUMBER() OVER(ORDER BY period ASC, id ASC) as r_id,
stat from tbStats ) y
on x.r_id + 1 = y.r_id
order by
x.r_id




This solution is absolutely screaming for a SQL WITH clause.

BTW, I am a general SQL expert. I work in data integration and regularly deal with about 30 different relational products and vendors. Frankly, I'm not sure how well SQL Server supports WITH, though it is part of the SQL-99 standard.

The solution would look like this...

with tmpStats AS
(SELECT ROW_NUMBER() OVER(ORDER BY period ASC, id ASC) as r_id,
stat from tbStats )
select
x.r_id xId,
y.r_id yId,
x.stat xStat,
y.stat yStat
from
tmpStats x
left join
tmpStats y
on x.r_id + 1 = y.r_id
order by
x.r_id


cas_o
Posting Yak Master

154 Posts

Posted - 2008-04-09 : 03:24:33
Well I think we have to concede the award goes to Peso/mharr looking at the performance stats. Mharr's solution appears to scale more linearly.

;-]... Quack Waddle

dmckinney
Starting Member

3 Posts

Posted - 2008-04-10 : 02:47:34
I think the CTE approach definitely merits a mention. I attach a link to an article I wrote regarding this.

Although it's called 'Linking to the previous row', I'm sure it could be adapted to link to the next row ;-)

http://www.sqlservercentral.com/articles/T-SQL/62159/

Regards,

David McKinney.
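The adaptation David mentions is mechanical. Assuming the ROW_NUMBER() pattern already used in this thread, linking to the next row instead of the previous one just flips the offset in the join (a sketch, not code from his article):

;WITH seq AS
(
    SELECT  ROW_NUMBER() OVER (ORDER BY period, id) AS r_id,
            stat
    FROM    tbStats
)
SELECT  cur.stat AS curStat,
        nxt.stat AS nextStat
FROM    seq cur
LEFT JOIN seq nxt
        ON nxt.r_id = cur.r_id + 1   -- the previous row would be cur.r_id - 1
ORDER BY cur.r_id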

DevelopAri
Starting Member

1 Post

Posted - 2008-04-10 : 05:54:34
I solved a similar problem (taking a time-weighted average of a time series) by using two cursors: the first cursor pointing to row n and the second to row n+1.

Do you have any reservations regarding this solution?
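For reference, a minimal sketch of the two-cursor idea (table and column names are hypothetical). It works, but as the measurements earlier in the thread suggest, a set-based ROW_NUMBER() self-join will usually be much faster than row-by-row cursor processing.

-- Two cursors over the same ordered set; curB stays one row ahead of curA.
DECLARE @aPeriod DATETIME, @aStat INT,
        @bPeriod DATETIME, @bStat INT

DECLARE curA CURSOR LOCAL FAST_FORWARD FOR
        SELECT Period, Stat FROM tbStats ORDER BY Period, id
DECLARE curB CURSOR LOCAL FAST_FORWARD FOR
        SELECT Period, Stat FROM tbStats ORDER BY Period, id

OPEN curA
OPEN curB

FETCH NEXT FROM curA INTO @aPeriod, @aStat   -- row n
FETCH NEXT FROM curB INTO @bPeriod, @bStat
FETCH NEXT FROM curB INTO @bPeriod, @bStat   -- row n + 1

WHILE @@FETCH_STATUS = 0
BEGIN
    -- (@aPeriod, @aStat) is row n; (@bPeriod, @bStat) is row n + 1.
    -- Accumulate the time-weighted average here.

    FETCH NEXT FROM curA INTO @aPeriod, @aStat
    FETCH NEXT FROM curB INTO @bPeriod, @bStat
END

CLOSE curA
DEALLOCATE curA
CLOSE curB
DEALLOCATE curB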

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-04-10 : 06:54:36
Also see
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=93911




Starnamer
Starting Member

4 Posts

Posted - 2008-04-11 : 08:11:56
After reading this I decided to try it on a real problem I have. The aim is to take a series of statuses and dates for documents and get pairs of status&date with next_status&date. This is so that further processing can look at turnaround times, etc.

However, my implementation of mharr's method turns out to be slower than a simple triangular join (along the lines of select a.x, min(b.x) from t a join t b on a.x < b.x group by a.x).

So my question is, is there a faster way to do this?

As a second question, can anyone suggest why the performance of the row_number test solution goes down the tubes (on my system) somewhere between 712 and 714 records? (713 records sometimes takes ~300 ms of CPU, other times ~76000 ms!)

P.S. The server has 4 CPUs, hence for 1000 records the elapsed time is only about 25% of the CPU time.

/*
100 records generated

Table 'Worktable'. Scan count 0, logical reads 0,
physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0,
lob read-ahead reads 0.
Table '#t'. Scan count 3, logical reads 9,
physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0,
lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 29 ms.

Table '#t'. Scan count 30, logical reads 18,
physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0,
lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0,
physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0,
lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 46 ms, elapsed time = 35 ms.
-------------------------------------------
500 records generated

Table 'Worktable'. Scan count 0, logical reads 0,
physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0,
lob read-ahead reads 0.
Table '#t'. Scan count 3, logical reads 39,
physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0,
lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 78 ms, elapsed time = 203 ms.

Table '#t'. Scan count 30, logical reads 78,
physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0,
lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0,
physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0,
lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 205 ms, elapsed time = 157 ms.
---------------------------------------------
1000 records generated

Table 'Worktable'. Scan count 0, logical reads 0,
physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0,
lob read-ahead reads 0.
Table '#t'. Scan count 3, logical reads 78,
physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0,
lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 297 ms, elapsed time = 716 ms.

Table '#t'. Scan count 28, logical reads 312,
physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0,
lob read-ahead reads 0.
Table 'Worktable'. Scan count 8, logical reads 775568,
physical reads 0, read-ahead reads 0,
lob logical reads 0, lob physical reads 0,
lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 152624 ms, elapsed time = 40658 ms.
--------------------------------------------------
*/
set statistics io off
set statistics time off
set nocount on
-----------------------------------
-- create some data
-- ----------------
-- This actually models part of a document workflow.
-- The flow has been simplified to just 3 states.
-- Documents start in state 'F' and may move to
-- either 'C' (complete) or 'P' (queried), or may be
-- amended and stay in state 'F'. Once in state 'P',
-- the query may be answered, so the document goes
-- to state 'F' or it may be amended and stay in
-- state 'P'. The probabilities of moving between
-- states are arbitrarily set for the model. In
-- order to simulate repeated queries making it
-- more likely for the document to be completed,
-- the probability is increased on each step.
-- Time intervals are randomly assigned between
-- 5 minutes and about three days. In the model, the generated
-- IDs are sequential. In the real application, they
-- may not be.
--
drop table #t
create table #t (id int, st char(1), date datetime)

declare @n int
declare @st char(1)
declare @date datetime
declare @Cp float
declare @p float

set @n = 1000 -- <<<<<<<<<<<<<<<<< How many ID chains to generate
print convert(varchar(5),@n)+' records generated'
while @n > 0
begin
set @st = 'F'
set @date = dateadd(mi,rand()*1440.0*28,convert(datetime,'2008/2/1'))
insert #t (id,st,date) values(@n,@st,@date)
set @cp = 0.3
while @st <> 'C'
begin
if @st = 'F'
begin
set @p = rand()
if @p < @cp
set @st = 'C'
else
if @p < 0.99
set @st = 'P'
end
else
begin
set @p = rand()
if @p < 0.9
set @st = 'F'
end
set @date = dateadd(mi,rand()*1440*3+5,@date)
insert #t (id,st,date) values(@n,@st,@date)
set @cp = @cp * 1.1
end
set @n = @n -1
end
select * from #t

set nocount off
set statistics io on
set statistics time on
--------------------------------------------
-- version 1 - triangular join
------------------------------
-- This is the method currently used and
-- produces the desired result
--
select
id,
sta,
min(da) as 'da',
stb,
db
from (
select
a.id,
a.sta,
a.da,
b.st as 'stb',
b.date as db
from (
select
id,
sta,
da,
min(db) as 'db'
from (
select
a.id,
a.st as 'sta',
a.date as 'da',
b.st as 'stb',
min(b.date) as 'db'
from
#t a
join
#t b
on a.id=b.id
and a.date < b.date
and a.st <> b.st
group by
a.id,
a.st,
a.date,
b.st
) x
group by
id,
sta,
da
) a
join
#t b
on a.id=b.id
and a.db=b.date
) x
group by
id,
sta,
stb,
db
order by
id,
min(da)

--------------------------------------------
-- version 2 - using self join by row_number + 1 = row_number
------------
-- this was derived from comments following the article at
-- http://www.sqlteam.com/article/joining-to-the-next-sequential-row
--
;with t1 as (
select
row_number() over(order by id,date) as 'r_id',
id,
st,
date
from
#t
)
,t4 as (
select
row_number() over(order by id,date) as 'r_id',
id,
st,
date
from (
select top 1
id,
st,
date
from
t1
order by
id,
date
union all
select
id,
st,
date
from (
select
b.id,
b.st,
b.date
from
t1 a
join
t1 b
on a.r_id+1=b.r_id
where
(a.id=b.id and a.st<>b.st)
or (a.id<>b.id)
) z
) x
)
select
a.id,
a.st as 'sta',
a.date as 'da',
b.st as 'stb',
b.date as 'db'
from
t4 a
join
t4 b
on a.r_id+1=b.r_id
and a.id=b.id
order by
id,
da

set statistics time off
set statistics io off

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-04-12 : 15:53:54
Try this one...
;WITH Yak (id, st, date, rowid, col)
AS (
    SELECT  id,
            st,
            date,
            ROW_NUMBER() OVER (ORDER BY id, date),
            0
    FROM    #t
)

SELECT  id,
        MAX(CASE WHEN col = 0 THEN st ELSE '' END) AS sta,
        MAX(CASE WHEN col = 0 THEN date ELSE '' END) AS da,
        MAX(CASE WHEN col = 1 THEN st ELSE '' END) AS stb,
        MAX(CASE WHEN col = 1 THEN date ELSE '' END) AS db
FROM (
        SELECT id, st, date, rowid, col FROM Yak
        UNION ALL
        SELECT id, st, date, rowid - 1, col + 1 FROM Yak
     ) AS d
GROUP BY id,
        rowid
HAVING COUNT(*) = 2
ORDER BY rowid


SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-04-12 : 16:04:56
Results for 1000 sample records:

derek 1 - triangular join

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#t'. Scan count 3, logical reads 78, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 141 ms, elapsed time = 1518 ms.

derek 2 - cte's

Table '#t'. Scan count 16, logical reads 208, physical reads 0, read-ahead reads 6, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 4, logical reads 702520, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 55094 ms, elapsed time = 121561 ms.

peso

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#t'. Scan count 2, logical reads 52, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 47 ms, elapsed time = 565 ms.




Starnamer
Starting Member

4 Posts

Posted - 2008-04-14 : 04:30:00
There was an error in my version 2. There should have been an additional "group by b.id, b.st, b.date" just before the close of subquery 'z'. With this correction, for 100 records I get...

Derek 1
(4108 row(s) affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#t'. Scan count 3, logical reads 51, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 63 ms, elapsed time = 2330 ms.

Derek 2
(4108 row(s) affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#t'. Scan count 6, logical reads 72, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 31 ms, elapsed time = 2499 ms.

Derek 3 (similar to 2, but uses dense_rank() over (partition by id order by date))
(4108 row(s) affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#t'. Scan count 6, logical reads 72, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 47 ms, elapsed time = 1736 ms.

peso
(4317 row(s) affected)
Table '#t'. Scan count 2, logical reads 34, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 47 ms, elapsed time = 1534 ms.

Unfortunately, this seems slower than my corrected version 2 and also produces extra rows.

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-04-14 : 04:38:19
Did you use the same code to generate the sample data?




SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-04-14 : 04:46:49
Wow. Where did those values come from?
I used this code for testing
-- Peso
;WITH Yak (id, st, date, RowID)
AS (
SELECT id,
st,
date,
ROW_NUMBER() OVER (ORDER BY id, date)
FROM Derek
)

SELECT id,
MAX(CASE WHEN col = 0 THEN st ELSE '' END) AS sta,
MAX(CASE WHEN col = 0 THEN date ELSE '' END) AS da,
MAX(CASE WHEN col = 1 THEN st ELSE '' END) AS stb,
MAX(CASE WHEN col = 1 THEN date ELSE '' END) AS db
FROM (
SELECT id, st, date, RowID, 0 AS Col FROM Yak
UNION ALL
SELECT id, st, date, RowID - 1, 1 FROM Yak
) AS d
GROUP BY id,
RowID
HAVING COUNT(*) = 2
ORDER BY RowID

-- Derek 1
select
id,
sta,
min(da) as 'da',
stb,
db
from (
select
a.id,
a.sta,
a.da,
b.st as 'stb',
b.date as db
from (
select
id,
sta,
da,
min(db) as 'db'
from (
select
a.id,
a.st as 'sta',
a.date as 'da',
b.st as 'stb',
min(b.date) as 'db'
from
Derek a
join
Derek b
on a.id=b.id
and a.date < b.date
and a.st <> b.st
group by
a.id,
a.st,
a.date,
b.st
) x
group by
id,
sta,
da
) a
join
Derek b
on a.id=b.id
and a.db=b.date
) x
group by
id,
sta,
stb,
db
order by
id,
min(da)

-- Derek 3
;with t1 as (
select
row_number() over(order by id,date) as 'r_id',
id,
st,
date
from
Derek
)
,t4 as (
select
row_number() over(order by id,date) as 'r_id',
id,
st,
date
from (
select top 1
id,
st,
date
from
t1
order by
id,
date
union all
select
id,
st,
date
from (
select
b.id,
b.st,
b.date
from
t1 a
join
t1 b
on a.r_id+1=b.r_id
where
(a.id=b.id and a.st<>b.st)
or (a.id<>b.id)
group by b.id, b.st, b.date
) z
) x
)
select
a.id,
a.st as 'sta',
a.date as 'da',
b.st as 'stb',
b.date as 'db'
from
t4 a
join
t4 b
on a.r_id+1=b.r_id
and a.id=b.id
order by
id,
da


SET @n = 1000 -- Peso
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Derek'. Scan count 2, logical reads 52, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 140 ms, elapsed time = 790 ms.

SET @n = 100 -- Peso

Table 'Derek'. Scan count 2, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 345 ms.

SET @n = 10 -- Peso

Table 'Derek'. Scan count 2, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 8 ms.




SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-04-14 : 04:53:28
SET @n = 10

Derek 1

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Derek'. Scan count 3, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 15 ms, elapsed time = 24 ms.

Derek 3

Table 'Worktable'. Scan count 1, logical reads 359, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Derek'. Scan count 6, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 42 ms.

SET @n = 100

Derek 1

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Derek'. Scan count 3, logical reads 9, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 47 ms, elapsed time = 462 ms.

Derek 3

Table 'Worktable'. Scan count 1, logical reads 7214, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Derek'. Scan count 6, logical reads 18, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 422 ms, elapsed time = 1131 ms.

SET @n = 1000

Derek 1

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Derek'. Scan count 3, logical reads 78, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 281 ms, elapsed time = 1183 ms.

Derek 3

Table 'Derek'. Scan count 18, logical reads 156, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 438 ms, elapsed time = 1321 ms.



Starnamer
Starting Member

4 Posts

Posted - 2008-04-14 : 07:37:26
quote:
Wow. Where did those values come from?


The timings are just those from my laptop for @n=1000 (a Latitude D810 with 1.5 GB of memory). I also added a primary key to the test data on columns (id, date).

I've now checked your query and the problem is that it doesn't eliminate the rows where the status doesn't change, hence the different number of records.

Data:
id	st	date
19 F 2008-02-11 14:58:00.000
19 P 2008-02-11 21:17:00.000
19 P 2008-02-14 11:20:00.000
19 F 2008-02-15 02:11:00.000
19 P 2008-02-15 06:07:00.000
19 P 2008-02-16 05:27:00.000
19 F 2008-02-18 09:22:00.000
19 C 2008-02-19 05:50:00.000

My versions:
id	sta	da	stb	db
19 F 2008-02-11 14:58:00.000 P 2008-02-11 21:17:00.000
19 P 2008-02-11 21:17:00.000 F 2008-02-15 02:11:00.000
19 F 2008-02-15 02:11:00.000 P 2008-02-15 06:07:00.000
19 P 2008-02-15 06:07:00.000 F 2008-02-18 09:22:00.000
19 F 2008-02-18 09:22:00.000 C 2008-02-19 05:50:00.000

Peso
id	sta	da	stb	db
19 F 2008-02-11 14:58:00.000 P 2008-02-11 21:17:00.000
19 P 2008-02-11 21:17:00.000 P 2008-02-14 11:20:00.000
19 P 2008-02-14 11:20:00.000 F 2008-02-15 02:11:00.000
19 F 2008-02-15 02:11:00.000 P 2008-02-15 06:07:00.000
19 P 2008-02-15 06:07:00.000 P 2008-02-16 05:27:00.000
19 P 2008-02-16 05:27:00.000 F 2008-02-18 09:22:00.000
19 F 2008-02-18 09:22:00.000 C 2008-02-19 05:50:00.000

The requirement is that it only records pairs of statuses that are different.

Currently, it looks like using row_number() is fastest.

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-04-14 : 08:52:46
What if you "cheat"?
Imagine you have a column named "Seq" and populate its value like this
DECLARE @Seq INT,
        @Yak INT,
        @st  CHAR(1)

SET @Seq = -1

UPDATE  Derek
SET     @Seq = @Seq + 1,
        @Yak = Seq = CASE
                        WHEN st = @st THEN NULL
                        ELSE @Seq
                     END,
        @Seq = CASE
                        WHEN @Yak IS NULL THEN @Seq - 1
                        ELSE @Seq
                     END,
        @st = st

Then you can easily write a query like this
SELECT  id,
        MAX(CASE WHEN Col = 0 THEN st ELSE '' END) AS sta,
        MAX(CASE WHEN Col = 0 THEN date ELSE '' END) AS da,
        MAX(CASE WHEN Col = 1 THEN st ELSE '' END) AS stb,
        MAX(CASE WHEN Col = 1 THEN date ELSE '' END) AS db
FROM (
        SELECT  id,
                st,
                date,
                Seq,
                0 AS Col
        FROM    Derek
        WHERE   Seq IS NOT NULL

        UNION ALL

        SELECT  id,
                st,
                date,
                Seq - 1,
                1
        FROM    Derek
        WHERE   Seq IS NOT NULL
     ) AS d
GROUP BY id,
        Seq
HAVING  COUNT(*) = 2
ORDER BY id,
        MAX(CASE WHEN Col = 0 THEN date ELSE '' END)
to get the result you want.
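One caveat worth flagging (my note, not Peso's): this kind of variable-assignment UPDATE depends on the rows being scanned in (id, date) order, which SQL Server does not formally guarantee. People who use the trick typically make sure the Seq column exists and that a clustered index in the intended order is in place first, along these lines:

-- Assumed setup for the "cheat" (skip whichever part already exists): the Seq
-- column must be there, and the scan order relied on in practice comes from a
-- clustered index or clustered primary key on (id, date).
ALTER TABLE Derek ADD Seq INT NULL

CREATE UNIQUE CLUSTERED INDEX IX_Derek_id_date ON Derek (id, date)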


