Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 Script Library
 Islands and Gaps in sequential data

Author  Topic 

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-08-23 : 16:43:56
I couldn't find a topic suitable for testing this, so I thought I'd start one.

Here is one way to get the islands without a tally table.
declare	@test table (symbol char(3), dt smalldatetime)

insert @test
select 'abc','01/01/1990' union all
select 'abc','01/02/1990' union all
select 'abc','01/03/1990' union all
select 'abc','01/04/1990' union all
select 'abc','01/05/1990' union all
select 'def','01/03/1990' union all
select 'def','01/04/1990' union all
select 'def','01/05/1990' union all
select 'def','01/06/1990' union all
select 'def','01/07/1990' union all
select 'ghi','01/01/1990' union all
select 'ghi','01/02/1990' union all
select 'ghi','01/06/1990' union all
select 'ghi','01/07/1990' union all
select 'ghi','01/08/1990'

select symbol,
min(dt),
max(dt2)
from (
select t1.symbol,
t1.dt,
t2.dt as dt2,
(select count(distinct t3.symbol) from @test as t3 where t3.symbol < t1.symbol and t3.dt <= t1.dt) AS r
from @test as t1
inner join @test as t2 on t2.symbol = t1.symbol
where t2.dt - 1 = t1.dt
) as d
group by symbol,
r


E 12°55'05.25"
N 56°04'39.16"

dinakar
Master Smack Fu Yak Hacker

2507 Posts

Posted - 2007-08-23 : 16:46:02
You need to replace the #test with @test in your code Peter ...

Dinakar Nethi
************************
Life is short. Enjoy it.
************************
http://weblogs.sqlteam.com/dinakar/
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-08-23 : 17:19:05
Thanks!
I didn't notice since I had both tables in my session.

I thought we could start a discussion about efficient ways to get the islands and the gaps in sequential data.
If this already has been done on SQLTeam, pleast post the link.



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

gerhau
Starting Member

3 Posts

Posted - 2007-08-26 : 11:35:14
Hi.

I've been searching for a solution for a similar problem. Hope you don't mind me hangin a question to this thread:

Let's say you have a list of periods (fromdate - todate) in a table, and want to fill in a new period if there is a gap in time between two existing periods.

Let me give an example:



CREATE TABLE #MeterReadingPeriods
(
PlantId int NOT NULL,
FromDate datetime NOT NULL,
ToDate datetime NOT NULL,
)

--Sample data
INSERT INTO #MeterReadingPeriods (Plantid,FromDate,ToDate)
SELECT 1,'20050830','20060918' UNION ALL
SELECT 1,'20060918','20061201' UNION ALL
SELECT 1,'20070201','20070515'


I would like to use a query to insert "missing" periods. Based on the sample data above, I would like to insert the following period (fromdate - todate): '20061201' - '20070201'

Suggestions?
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-08-26 : 11:43:51
See this topic
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=88422



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

gerhau
Starting Member

3 Posts

Posted - 2007-08-26 : 12:07:23
Thx for swift reply.

I had already looked at that posting, but it doesn't do just what I want in my problem (I've had to solve that kind of problem too, but have done it in a different way).

If you look at the output from the example you are linking to, you have a gap between the two rows with a=1.
You have the period '20070101' - '20070301' and the period '20070501' - '20070901'. In that scenario, I am looking for the period between those two periods.. so what I want is the period '20070301' - '20070501'. Do you understand what I mean?

Would be grateful if you could kick me further in the right direction. (Also, I have SQL SERVER 2000, so 2005-specific code is not an option at the moment unfortunately.)




Gert
Go to Top of Page

Seventhnight
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2007-08-26 : 22:04:15
Try this out:


Drop Table #MeterReadingPeriods
CREATE TABLE #MeterReadingPeriods
(
PlantId int NOT NULL,
FromDate datetime NOT NULL,
ToDate datetime NOT NULL,
)

--Sample data
INSERT INTO #MeterReadingPeriods (Plantid,FromDate,ToDate)
SELECT 1,'20050830','20060918' UNION ALL
SELECT 1,'20060918','20061201' UNION ALL
SELECT 1,'20070201','20070515'


Drop Table #sequenced
Select
A.*,
SeqId = Count(*)
Into #sequenced
From #MeterReadingPeriods A
Left Join #MeterReadingPeriods B
On A.PlantId = B.PlantId
and A.FromDate >= B.FromDate
Group By A.PlantId, A.FromDate, A.ToDate


Select
A.PlantId,
FromDate = A.ToDate,
ToDate = B.FromDate
From #sequenced A
Inner Join #sequenced B
On A.PlantId = B.PlantId
and A.SeqId = B.SeqId - 1
Where A.ToDate <> B.FromDate


SequenceIds are my favorite

Corey

snSQL on previous signature "...Oh and by the way Seventhnight, your signature is so wrong! On so many levels, morally, gramatically and there is a typo!"
Go to Top of Page

gerhau
Starting Member

3 Posts

Posted - 2007-08-27 : 02:13:36
Thx a lot :)

I was "dreaming" of this last night, and was going for something in this direction.. but did not quite have it figured out. So, this was very helpful.
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-08-27 : 04:00:41
Hi Corey!
Long time no see.

I did some research for your solution and my solution at top.
They both take almost the same amount of time, my mine approach is almost half the reads from the table.



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

Seventhnight
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2007-08-27 : 21:09:55
Yeah... it takes half the reads becuase I had to create the sequenceId from the data... I personally keep sequenceIds as a column on any table that is range based (or 'gap'-able). I usually use a trigger to keep them updated so I don't have to fool with them when inserting and such...

I've been concentrating on .NET win forms recently... not a whole lot of SQL stuff... (apparently that's what happens when a db design actually works out )

Corey

snSQL on previous signature "...Oh and by the way Seventhnight, your signature is so wrong! On so many levels, morally, gramatically and there is a typo!"
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2007-08-27 : 21:21:03
quote:
Originally posted by Peso
If this already has been done on SQLTeam, pleast post the link.



http://www.sqlteam.com/article/detecting-runs-or-streaks-in-your-data

I can't tell for sure but I think it's pretty much the same thing ...

EDIT: then again, maybe not? I don't understand what your first example is doing; it returns the results as

select symbol, min(dt) max(dt)
from @test
group by symbol

??


EDIT #2: wait, never mind, I see the difference now.

They are different, two different problems are being solved.

- Jeff
http://weblogs.sqlteam.com/JeffS
Go to Top of Page

Seventhnight
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2007-08-27 : 21:28:41
I knew there was a link around here somewhere...

Corey

snSQL on previous signature "...Oh and by the way Seventhnight, your signature is so wrong! On so many levels, morally, gramatically and there is a typo!"
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2007-08-27 : 21:41:10
quote:
Originally posted by Peso

I couldn't find a topic suitable for testing this, so I thought I'd start one.

Here is one way to get the islands without a tally table.
declare	@test table (symbol char(3), dt smalldatetime)

insert @test
select 'abc','01/01/1990' union all
select 'abc','01/02/1990' union all
select 'abc','01/03/1990' union all
select 'abc','01/04/1990' union all
select 'abc','01/05/1990' union all
select 'def','01/03/1990' union all
select 'def','01/04/1990' union all
select 'def','01/05/1990' union all
select 'def','01/06/1990' union all
select 'def','01/07/1990' union all
select 'ghi','01/01/1990' union all
select 'ghi','01/02/1990' union all
select 'ghi','01/06/1990' union all
select 'ghi','01/07/1990' union all
select 'ghi','01/08/1990'

select symbol,
min(dt),
max(dt2)
from (
select t1.symbol,
t1.dt,
t2.dt as dt2,
(select count(distinct t3.symbol) from @test as t3 where t3.symbol < t1.symbol and t3.dt <= t1.dt) AS r
from @test as t1
inner join @test as t2 on t2.symbol = t1.symbol
where t2.dt - 1 = t1.dt
) as d
group by symbol,
r


E 12�55'05.25"
N 56�04'39.16"



Final post, sorry!

I am not sure this works .... First add this and try it:

select 'abc','01/10/1990'

And then add the above PLUS this and try again:

select 'abc','01/11/1990'

both seem to return incorrect results.

- Jeff
http://weblogs.sqlteam.com/JeffS
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-08-28 : 01:16:48
Ahh. I see the flaw now.



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2007-08-28 : 08:47:08
I think this does the trick, it should be pretty easy to follow (I hope):


select
MissingBefore.symbol, MissingBefore.dt as StartDt, min(MissingAfter.dt) as EndDt
from
(
select t1.symbol, t1.dt
from @test t1
left outer join @test t2 on t1.symbol = t2.symbol and t1.dt = t2.dt+1
where t2.symbol is null
)
MissingBefore -- this is symbol/date rows with no date just before

left outer join
(
select t1.symbol, t1.dt
from @test t1
left outer join @test t2 on t1.symbol = t2.symbol and t1.dt = t2.dt-1
where t2.symbol is null
)
MissingAfter -- this is symbol/date rows with no date just after

on
MissingBefore.symbol = MissingAfter.symbol and MissingBefore.dt <= MissingAfter.dt

group by
MissingBefore.symbol, MissingBefore.dt


A CTE would make it a little more clear, I think. This should work fine for all data.

- Jeff
http://weblogs.sqlteam.com/JeffS
Go to Top of Page

Seventhnight
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2007-08-28 : 09:35:36
quote:
Originally posted by Peso

Hi Corey!
Long time no see.

I did some research for your solution and my solution at top.
They both take almost the same amount of time, my mine approach is almost half the reads from the table.



E 12°55'05.25"
N 56°04'39.16"



Oh... I definitely agree with you that my solution is a little cumbersome in the fact that I first prebuild a table to help with the execution. But, I've found that when I work with larger sets of data and the need for several layers of complex quering (many joins), prebuilding temp tables can also give you an edge... you can index them and then join them to themselves which is impossible (to index) and impractical (to join more than once) to do with derived queries.

Anywhooo... was just rereading the thread and thought I'd toss that in.

Corey

snSQL on previous signature "...Oh and by the way Seventhnight, your signature is so wrong! On so many levels, morally, gramatically and there is a typo!"
Go to Top of Page

dinakar
Master Smack Fu Yak Hacker

2507 Posts

Posted - 2007-08-28 : 14:36:28
Another similar topic: http://msdn2.microsoft.com/en-us/library/aa175780(sql.80).aspx

Dinakar Nethi
************************
Life is short. Enjoy it.
************************
http://weblogs.sqlteam.com/dinakar/
Go to Top of Page
   

- Advertisement -