Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 How to eliminate this cursor

Author  Topic 

tfountain
Constraint Violating Yak Guru

491 Posts

Posted - 2008-07-31 : 16:21:53
Ok, I have a cursor based on a query that returns datetime stamps from a table (StartDate, EndDate).

DECLARE @MergedDates TABLE (StartTime DATETIME, EndTime DATETIME);
DECLARE @StartTime DATETIME;
DECLARE @EndTime DATETIME;

DECLARE curDates CURSOR LOCAL FAST_FORWARD
FOR
SELECT StartDate, EndDate
FROM <database>.<schema>.<table>
WHERE YEAR(StartDate) = 2007
ORDER BY StartDate, EndDate;

OPEN curDates;
FETCH curDates INTO @StartTime, @EndTime;

WHILE (@@FETCH_STATUS = 0)
BEGIN
IF EXISTS(SELECT TOP 1 1 FROM @MergedDates WHERE @StartTime BETWEEN StartTime AND EndTime)
BEGIN
UPDATE @MergedDates
SET EndTime = @EndTime
WHERE @StartTime BETWEEN StartTime AND EndTime
/* If the end time is less then we update it */
AND EndTime < @EndTime;
END;
ELSE
BEGIN
INSERT INTO @MergedDates (StartTime, EndTime)
VALUES (@StartTime, @EndTime);
END

FETCH curDates INTO @StartTime, @EndTime;
END;

CLOSE curDates;
DEALLOCATE curDates;


I have data in this table like so (StartDate, EndDate)
2008-01-04 11:00:00.000 2008-01-04 14:00:00.000
2008-01-04 11:30:00.000 2008-01-04 14:00:00.000
2008-01-04 11:45:00.000 2008-01-04 14:30:00.000
2008-01-04 16:00:00.000 2008-01-04 18:00:00.000
2008-01-05 16:00:00.000 2008-01-05 18:00:00.000
2008-01-06 16:00:00.000 2008-01-06 18:00:00.000

I want to query it so that any records that overlap entirely are merged into one record. With the above data I would want this returned:
2008-01-04 11:00:00.000 2008-01-04 14:30.00.000
2008-01-04 16:00:00.000 2008-01-04 18:00:00.000
2008-01-05 16:00:00.000 2008-01-05 18:00:00.000
2008-01-06 16:00:00.000 2008-01-06 18:00:00.000

Now the query above does the trick but I would like to try a solution that does not use cursors. I played a little bit with recursive CTEs but no luck so far.

Any suggestions?

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-07-31 : 16:40:38
See http://weblogs.sqlteam.com/peterl/archive/2008/05/13/Lightning-fast-collapsed-date-ranges-and-missing-date-ranges.aspx



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-07-31 : 16:43:22
Also see http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=88422



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

Vinnie881
Master Smack Fu Yak Hacker

1231 Posts

Posted - 2008-07-31 : 18:03:26
[code]
DECLARE @MergedDates TABLE (StartTime DATETIME, EndTime DATETIME);

Insert Into @MergedDates
select '2008-01-04 11:00:00.000', '2008-01-04 14:00:00.000' Union All
select '2008-01-04 11:30:00.000', '2008-01-04 14:00:00.000' Union All
select '2008-01-04 11:45:00.000', '2008-01-04 14:30:00.000' Union All
select '2008-01-04 16:00:00.000', '2008-01-04 18:00:00.000' Union All
select '2008-01-05 16:00:00.000', '2008-01-05 18:00:00.000' Union All
select '2008-01-06 16:00:00.000', '2008-01-06 18:00:00.000'

select distinct Min(b.StartTime),Max(b.EndTime)
from
(
Select Row_Number() over (order by a.StartTime)as RowID,*
from
@MergedDates a
) aa
inner Join
@MergedDates b
on aa.Starttime between b.StartTime and b.Endtime
or aa.Endtime between b.StartTime and b.Endtime
group by aa.RowID

[/code]
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-08-01 : 04:01:17
Try this test data
select '20080104 11:00', '20080104 14:00' Union All
select '20080104 13:00', '20080104 17:00' Union All
select '20080104 15:00', '20080104 18:00'



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

tfountain
Constraint Violating Yak Guru

491 Posts

Posted - 2008-08-01 : 09:40:26
quote:
Originally posted by Vinnie881


DECLARE @MergedDates TABLE (StartTime DATETIME, EndTime DATETIME);

Insert Into @MergedDates
select '2008-01-04 11:00:00.000', '2008-01-04 14:00:00.000' Union All
select '2008-01-04 11:30:00.000', '2008-01-04 14:00:00.000' Union All
select '2008-01-04 11:45:00.000', '2008-01-04 14:30:00.000' Union All
select '2008-01-04 16:00:00.000', '2008-01-04 18:00:00.000' Union All
select '2008-01-05 16:00:00.000', '2008-01-05 18:00:00.000' Union All
select '2008-01-06 16:00:00.000', '2008-01-06 18:00:00.000'

select distinct Min(b.StartTime),Max(b.EndTime)
from
(
Select Row_Number() over (order by a.StartTime)as RowID,*
from
@MergedDates a
) aa
inner Join
@MergedDates b
on aa.Starttime between b.StartTime and b.Endtime
or aa.Endtime between b.StartTime and b.Endtime
group by aa.RowID





This is it. Nice, simple and it works! Thanks! I'm going to dissect this a little to understand why it does exactly but this is perfect. I've ran this against my real data and manually compared the results to the raw data and this accounts for all merge scenarios perfectly.
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-08-01 : 10:10:27
Even when you try with my posted sample data 08/01/2008 : 04:01:17 ?
quote:
select '20080104 11:00', '20080104 14:00' Union All
select '20080104 13:00', '20080104 17:00' Union All
select '20080104 15:00', '20080104 18:00'

The result should be
select '20080104 11:00', '20080104 18:00'
right?


E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

tfountain
Constraint Violating Yak Guru

491 Posts

Posted - 2008-08-01 : 10:59:19
quote:
Originally posted by Peso

Even when you try with my posted sample data 08/01/2008 : 04:01:17 ?
quote:
select '20080104 11:00', '20080104 14:00' Union All
select '20080104 13:00', '20080104 17:00' Union All
select '20080104 15:00', '20080104 18:00'

The result should be
select '20080104 11:00', '20080104 18:00'
right?


E 12°55'05.25"
N 56°04'39.16"




Yes, you are correct. I'll tinker with this a bit.
Go to Top of Page

tfountain
Constraint Violating Yak Guru

491 Posts

Posted - 2008-08-01 : 11:20:00
I see what is going on and I think it can be solved by a recursive CTE. Take your base data:

Row #1: '2008-01-04 11:00:00.000', '2008-01-04 14:00:00.000'
Row #2: '2008-01-04 13:00:00.000', '2008-01-04 17:00:00.000'
Row #3: '2008-01-04 15:00:00.000', '2008-01-04 18:00:00.000'
The results are:

Row #A: '2008-01-04 11:00:00.000', '2008-01-04 17:00:00.000'
Row #B: '2008-01-04 11:00:00.000', '2008-01-04 18:00:00.000'
Row #C: '2008-01-04 13:00:00.000', '2008-01-04 18:00:00.000'
The results are showing rows row #1 & #2 merged, #1 & #3 merged, and #2 and #3 merged.

I'll post the results in a bit after I toy with this.
Go to Top of Page

Vinnie881
Master Smack Fu Yak Hacker

1231 Posts

Posted - 2008-08-01 : 12:29:44

I just verified there are errors in this query as well. Use a CTE, it'll be easier, but anyway here is where I am at. It works in MOST scenerios but not all.



select Min(aaaa.Start) as Start,aaaa.[End]
From
(
select aaa.Start,max(bbb.[EndTime]) as [End]
from
(
select RowID,Min(b.StartTime) as start,Max (b.EndTime) as [endTime]
from
(
Select Row_Number() over (order by a.StartTime)as RowID,*
from
@MergedDates a
) aa
Inner Join
@MergedDates b
on
aa.Starttime between b.StartTime and b.Endtime
or aa.Endtime between b.StartTime and b.Endtime

group by RowID
) aaa
Inner Join
@MergedDates bbb
on aaa.Start between bbb.StartTime and bbb.Endtime
or aaa.[EndTime] between bbb.StartTime and bbb.Endtime
group by Start

) aaaa
group by [End]



It's a mess to look at and you'll probably want to clean it up a little, but see if it relieves the issue.
Go to Top of Page

tfountain
Constraint Violating Yak Guru

491 Posts

Posted - 2008-08-01 : 13:13:54
@Vinnie881

This does appear to correct this issue with the sample data Peso posted as well as the sample data I posted. I'm curious though, what situations do you think this will not work for (your latest query)?
Go to Top of Page

Vinnie881
Master Smack Fu Yak Hacker

1231 Posts

Posted - 2008-08-01 : 13:26:17
DECLARE @MergedDates TABLE (StartTime DATETIME, EndTime DATETIME);

Insert Into @MergedDates
select '2008-01-04 11:00:00.000', '2008-01-04 14:00:00.000' Union All
select '2008-01-04 11:30:00.000', '2008-01-04 14:00:00.000' Union All
select '2008-01-04 11:45:00.000', '2008-01-04 14:30:00.000' Union All
select '2008-01-04 16:00:00.000', '2008-01-04 18:00:00.000' Union All
select '2008-01-05 16:00:00.000', '2008-01-05 18:00:00.000' Union All
select '2008-01-06 16:00:00.000', '2008-01-06 18:00:00.000' Union All
select '20080104 11:00', '20080104 14:00' Union All
select '20080104 13:00', '20080104 17:00' Union All
select '20080104 15:00', '20080104 18:00' Union All
select '20080104 18:00', '20080106 18:00' Union all
select '20080106 15:00', '20080108 18:00'

that last record above will throw it off. Peso is one of the most knowledgeable individuals on this site (if not the most) when it comes to this type of question. If he suggested a CTE, he's likely accurate in his reasoning. So far I have not found a 100% method using the methodolgy I illustrated. I am thinking of an approach with a initial cross join then start the filtering from there might work, but a CTE is a easer method.
Go to Top of Page

tfountain
Constraint Violating Yak Guru

491 Posts

Posted - 2008-08-01 : 13:33:01
Actually, I brought up the topic of using some sort of recursive CTE but I just can't figure it out yet. Peso's examples both utilized nested queries such as yours. I haven't quite figured out how to convert his examples to my approach but with the help of both of you I am getting closer.

I'll keep you posted.
Go to Top of Page

Vinnie881
Master Smack Fu Yak Hacker

1231 Posts

Posted - 2008-08-01 : 15:37:00
I tested the methodology of Creating a table with every date by second between the two dates. It works but runs extrordianry slow that a while statement is much quicker.

I am interested to know the solution you come up with.
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-08-01 : 16:45:06
Use the "clustered index update" trick as seen here
http://weblogs.sqlteam.com/peterl/archive/2008/05/13/Lightning-fast-collapsed-date-ranges-and-missing-date-ranges.aspx
quote:
For 1,000 date pairs the algorithm runs in 80 ms.
For 10,000 date pairs the algorithm runs in 360 ms.
For 100,000 date pairs the algorithm runs in 900 ms.
For 1,000,000 date pairs the algorithm runs in 2250 ms.



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

tfountain
Constraint Violating Yak Guru

491 Posts

Posted - 2008-08-01 : 16:51:30
quote:
Originally posted by Peso

Use the "clustered index update" trick as seen here
http://weblogs.sqlteam.com/peterl/archive/2008/05/13/Lightning-fast-collapsed-date-ranges-and-missing-date-ranges.aspx
quote:
For 1,000 date pairs the algorithm runs in 80 ms.
For 10,000 date pairs the algorithm runs in 360 ms.
For 100,000 date pairs the algorithm runs in 900 ms.
For 1,000,000 date pairs the algorithm runs in 2250 ms.



E 12°55'05.25"
N 56°04'39.16"




I have been looking at this since you posted it originally and I have to admit, I'm having a tough time translating it to my requirement. Any chance you would be able to shed some light on this?
Go to Top of Page
   

- Advertisement -