Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 Overlapped date range Find when percentage > 1

Author  Topic 

jwpquinn
Starting Member

6 Posts

Posted - 2008-09-11 : 15:20:55
Given the following data:

ALog ABegin AEnd EmpID APercent
1 03/08/2008 03/31/2008 1 0.6
2 03/08/2008 04/04/2008 1 0.4
3 04/01/2008 04/04/2008 1 0.6
4 01/01/2008 04/30/2008 2 0.33
5 01/01/2008 01/31/2008 2 0.31
6 02/01/2008 02/28/2008 2 0.31
7 03/01/2008 03/31/2008 2 0.98
8 04/01/2008 04/30/2008 2 0.31

Can anyone come up with the SQL statement(s) needed to identify for each EmpID what overlapping ALog entries exceed 1.0 when added/summed up.

For EmpID 1
ALogs 1,2,3 are involved in an overlap but none exceed 1.0
ALog 1 overlaps with Alog 2 0.6 + 0.4 = 1
ALog 2 overlaps with Alog 3 0.6 + 0.4 = 1
ALog 3 overlaps with ALog 1 & ALog 2 but not both at the same time
so at any time ALog 3 only has 0.6 + 0.4 = 1
The APercent never exceeds 1.
From the above data the results I am looking for would look like:
ALog TotAPercent
1 1
2 1
3 1
4 0.64
5 0.64
6 1.31
7 0.64
8 2.24


For EmpID 2
ALog 4 overlaps with ALog 8 0.31 + 0.33 = 0.64
ALog 5 overlaps with ALog 8 0.31 + 0.33 = 0.64
ALog 6 overlaps with ALog 8 0.98 + 0.33 = 1.31
ALog 7 overlaps with ALog 8 0.31 + 0.33 = 0.64
ALog 8 overlaps with ALogs 4,5,6,7 .31+.31+.98+.31+.33 = 2.24

Can anyone see an SQL statement to get this done?
I have beat my head against the wall... with nothing to show
Here's the code to insert the data into a temp table in SQL Svr 2005
if object_id('tempdb..#OVERLAP','U') is not null
DROP TABLE #OVERLAP


CREATE TABLE #OVERLAP
(
ALog INT,
ABegin DATETIME,
AEnd DATETIME,
EmpID char(11),
APercent float,
PRIMARY KEY (ALog, ABegin, AEnd)
)

insert into #OVERLAP (ALog, ABegin, AEnd, EmpID, APercent)
select 1, '03/08/2008', '03/31/2008', 1, 0.6 union all
select 2, '03/08/2008', '04/04/2008', 1, 0.4 union all
select 3, '04/01/2008', '04/04/2008', 1, 0.6 union all
select 4, '01/01/2008', '04/30/2008', 2, 0.33 union all
select 5, '01/01/2008', '01/31/2008', 2, 0.31 union all
select 6, '02/01/2008', '02/28/2008', 2, 0.31 union all
select 7, '03/01/2008', '03/31/2008', 2, 0.98 union all
select 8, '04/01/2008', '04/30/2008', 2, 0.31

Thank you for taking the time to review this issue.



Vinnie881
Master Smack Fu Yak Hacker

1231 Posts

Posted - 2008-09-11 : 18:29:25
not sure if i'm following, Why doesn't alog 2 overlap both 1 & 3 (.6 +.4 +.6= 1.6) ? Can you clarify.


Success is 10% Intelligence, 70% Determination, and 22% Stupidity.
\_/ _/ _/\_/ _/\_/ _/ _/- 881
Go to Top of Page

bjoerns
Posting Yak Master

154 Posts

Posted - 2008-09-12 : 05:24:48
I think it is because 1 and 2 overlap between 03/08 and 03/31, while 3 start 04/01. What he means may be

<pseudocode>
for each day
sum(APercent) where day between ABegin and AEnd
</pseudocode>




(54224 row(s) affected)

The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION.
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-09-12 : 05:59:38
For EmpID 1 it is aLog 2 that overlaps with both aLog 1 and aLog 3.
Not aLog 3 as OP described.

3/8 4/4
aLog 1 |----------------------|
aLog 2 |--------------------------|
aLog 3 |--|
For EmpID 2 it is even worse described. For EmpID 2 it is aLog 4 that overlaps all

1/1 1/31 2/28 3/31 4/30
aLog 4 |-----------------------------------------------------------------------------------------------------------------------|
aLog 5 |-----------------------------|
aLog 6 |--------------------------|
aLog 7 |-----------------------------|
aLog 8 |----------------------------|
Not aLog 8 as OP described.
Also it is aLog 7 that has 0.98 as percent, not aLog 6 as OP wrote in examples.


E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-09-12 : 06:27:00
quote:
Originally posted by bjoerns

I think it is because 1 and 2 overlap between 03/08 and 03/31, while 3 start 04/01. What he means may be

<pseudocode>
for each day
sum(APercent) where day between ABegin and AEnd
</pseudocode>
In that case, how do you explain EmpID 2?
How come the sum is 2.24?


E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

jwpquinn
Starting Member

6 Posts

Posted - 2008-09-12 : 09:43:43
Peso is correct. As you can see by Peso's excellent diagram of the problem it is ALog 4 that should be = 2.24 not ALog 8!
Also it is aLog 7 that has 0.98 as percent, not aLog 6
ALog 8 = 0.64
I am sorry, my only excuse is that I have been looking at this so long my brain is becoming mush.
I am trying to identify the Alogs where the employee worked for more than 1 FTE. That is what the percentages represent.

Original Post should have been
ALog TotAPercent
1 1
2 1
3 1
4 2.24
5 0.64
6 0.64
7 1.31
8 0.64


For EmpID 2
ALog 4 overlaps with ALogs 4,5,6,7 .31+.31+.98+.31+.33 = 2.24
ALog 5 overlaps with ALog 8 0.31 + 0.33 = 0.64
ALog 6 overlaps with ALog 8 0.31 + 0.33 = 0.64
ALog 7 overlaps with ALog 8 0.98 + 0.33 = 1.31
ALog 8 overlaps with ALog 8 0.31 + 0.33 = 0.64

Here is my most recent attempt to solve this. This SQL yields the WRONG results for EmpID 1 ALog 2

select T1.EmpID, T1.ALog,
right('0' + rtrim(month(T1.ABegin)),2) + '/' + right('0' + rtrim(day(T1.ABegin)),2) + '/' + rtrim(year(T1.ABegin)) AS BDte,
right('0' + rtrim(month(T1.AEnd)),2) + '/' + right('0' + rtrim(day(T1.AEnd)),2) + '/' + rtrim(year(T1.AEnd)) AS EDte,
Sum(T2.APercent) as TotPrcnt
FROM #OVERLAP As T1 INNER JOIN #OVERLAP AS T2 ON T1.EmpID = T2.EmpID
Where (T1.ABegin Between T2.ABegin And T2.AEnd) OR (T2.ABegin Between T1.ABegin ANd T1.AEnd)
Group By T1.EmpID, T1.ALog, T1.ABegin, T1.AEnd
Order By T1.EmpID, T1.ALog

Results
EmpID ALog BDte EDte Totprcnt
1 1 03/08/2008 03/31/2008 1
1 2 03/08/2008 04/04/2008 1.6 THIS IS NOT WHAT I WANT SHOULD BE 1
1 3 04/01/2008 04/04/2008 1
2 4 01/01/2008 04/30/2008 2.24
2 5 01/01/2008 01/31/2008 0.64
2 6 02/01/2008 02/28/2008 0.64
2 7 03/01/2008 03/31/2008 1.31
2 8 04/01/2008 04/30/2008 0.64
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-09-12 : 10:02:53
Why are you using different logic for Emp 1?
For EmpID 1 
ALog 1 overlaps with ALog 2 0.6 + 0.4 = 1.0
ALog 2 overlaps with ALogs 1,3 0.4 + 0.6 + 0.6 = 1.6
ALog 3 overlaps with ALog 2 0.6 + 0.4 = 1.0



E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-09-12 : 10:11:57
You can't have both logic into same query!

Either use this logic
For EmpID 1 
ALog 1 overlaps with ALog 2 0.6 + 0.4 = 1.0
ALog 2 overlaps with ALogs 1,3 0.4 + 0.6 + 0.6 = 1.6
ALog 3 overlaps with ALog 2 0.6 + 0.4 = 1.0

For EmpID 2
ALog 4 overlaps with ALogs 5,6,7,8 0.33 + 0.31 + 0.31 + 0.98 + 0.31 = 2.24
ALog 5 overlaps with ALog 4 0.31 + 0.33 = 0.64
ALog 6 overlaps with ALog 4 0.31 + 0.33 = 0.64
ALog 7 overlaps with ALog 4 0.98 + 0.33 = 1.31
ALog 8 overlaps with ALog 4 0.31 + 0.33 = 0.64
or use this logic
For EmpID 1 
ALog 1 overlaps with ALog 2 0.6 + 0.4 = 1.0
ALog 2 overlaps with ALogs 1 0.4 + 0.6 = 1.0
ALog 2 overlaps with ALogs 3 0.4 + 0.6 = 1.0
ALog 3 overlaps with ALog 2 0.6 + 0.4 = 1.0

ALog 1 = 1.0
ALog 2 = 1.0 (max of the two values above)
ALog 3 = 1.0

For EmpID 2
ALog 4 overlaps with ALogs 5 0.33 + 0.31 = 0.64
ALog 4 overlaps with ALogs 6 0.33 + 0.31 = 0.64
ALog 4 overlaps with ALogs 7 0.33 + 0.98 = 1.31
ALog 4 overlaps with ALogs 8 0.33 + 0.31 = 0.64
ALog 5 overlaps with ALog 4 0.31 + 0.33 = 0.64
ALog 6 overlaps with ALog 4 0.31 + 0.33 = 0.64
ALog 7 overlaps with ALog 4 0.98 + 0.33 = 1.31
ALog 8 overlaps with ALog 4 0.31 + 0.33 = 0.64

ALog 4 = 1.31 (max of the four values above)
ALog 5 = 0.64
ALog 6 = 0.64
ALog 7 = 1.31
ALog 8 = 0.64
You decide...


E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

jwpquinn
Starting Member

6 Posts

Posted - 2008-09-12 : 13:51:40
Peso I would like to thank you for your clarity of thought.
Your posts led me to a solution that yields the results I am looking for.
After reviewing your posts I have come up with the following that yields the results I am looking for.
I am reposting the sample data and my solution to hopefully help anyone else looking for such a solution.
Thank you to everyone who took the time to review this post.

if object_id('tempdb..#OVERLAP','U') is not null
DROP TABLE #OVERLAP

CREATE TABLE #OVERLAP
(
ALog INT,
ABegin DATETIME,
AEnd DATETIME,
EmpID char(11),
APercent float,
PRIMARY KEY (ALog, ABegin, AEnd)
)

insert into #OVERLAP (ALog, ABegin, AEnd, EmpID, APercent)
select 1, '03/08/2008', '03/31/2008', 1, 0.6 union all
select 2, '03/08/2008', '04/04/2008', 1, 0.4 union all
select 3, '04/01/2008', '04/04/2008', 1, 0.6 union all
select 4, '01/01/2008', '04/30/2008', 2, 0.33 union all
select 5, '01/01/2008', '01/31/2008', 2, 0.31 union all
select 6, '02/01/2008', '02/28/2008', 2, 0.31 union all
select 7, '03/01/2008', '03/31/2008', 2, 0.98 union all
select 8, '04/01/2008', '04/30/2008', 2, 0.31

SELECT T1.EmpID, T1.ALog, T1.APercent,
right('0' + rtrim(month(T1.ABegin)),2) + '/' + right('0' + rtrim(day(T1.ABegin)),2) + '/' + rtrim(year(T1.ABegin)) AS BDte,
right('0' + rtrim(month(T1.AEnd)),2) + '/' + right('0' + rtrim(day(T1.AEnd)),2) + '/' + rtrim(year(T1.AEnd)) AS EDte,
MAX(T2.APercent + T1.APercent) as MaxPercentOverlapped
FROM #OVERLAP As T1 INNER JOIN #OVERLAP AS T2 ON T1.EmpID = T2.EmpID
Where ((T1.ABegin Between T2.ABegin And T2.AEnd) Or (T2.ABegin Between T1.ABegin ANd T1.AEnd)) And T1.ALog <> T2.ALog
GROUP BY T1.EmpID, T1.ALog, T1.ABegin, T1.AEnd, T1.APercent


Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-09-12 : 17:02:20
[code]SELECT s1.EmpID,
s1.aLog,
s1.aPercent,
CONVERT(CHAR(10), s1.aBegin, 101) AS bDte,
CONVERT(CHAR(10), s1.aEnd, 101) AS eDte,
MAX(s1.aPercent + s2.aPercent) AS MaxPercentOverlapped
FROM @Sample AS s1
INNER JOIN @Sample AS s2 ON s2.EmpID = s1.EmpID
WHERE s1.aLog <> s2.aLog
AND s1.aBegin < s2.aEnd
AND s1.aEnd > s2.aBegin
GROUP BY s1.EmpID,
s1.aLog,
s1.aPercent,
s1.aBegin,
s1.aEnd[/code]


E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

jwpquinn
Starting Member

6 Posts

Posted - 2008-09-15 : 10:11:40
Well, while I thought that this topic was resolved, a member of another forum pointed out a problem with the solution. If an additional activity log for EmpID 2 from 2008-04-01 to 2008-04-30 at 68% is added to the above sample data. The solution is missing instances where three or more ALogs overlap for the same EmpID.

Specifically:

insert into #OVERLAP (ALog, ABegin, AEnd, EmpID, APercent)
select 1, '03/08/2008', '03/31/2008', 1, 0.6 union all
select 2, '03/08/2008', '04/04/2008', 1, 0.4 union all
select 3, '04/01/2008', '04/04/2008', 1, 0.6 union all
select 4, '01/01/2008', '04/30/2008', 2, 0.33 union all
select 5, '01/01/2008', '01/31/2008', 2, 0.31 union all
select 6, '02/01/2008', '02/28/2008', 2, 0.31 union all
select 7, '03/01/2008', '03/31/2008', 2, 0.98 union all
select 8, '04/01/2008', '04/30/2008', 2, 0.31 union all
select 9, '04/01/2008', '04/30/2008', 2, 0.68

With the above SQL Alogs 4, 7 and 9 are shown to exceed 1.0, but 8 which should be > 1 is only reported at 0.99...


Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-09-15 : 10:26:33
Yes. It is because you didn't mention this environment in your original post.
"There can be multiple overlaps for several aLogs.".

That single line had been enough for us to understand the complexity.
DECLARE	@Sample TABLE
(
aLog INT,
aBegin DATETIME,
aEnd DATETIME,
EmpID INT,
aPercent SMALLMONEY
)

INSERT @Sample
SELECT 1, '03/08/2008', '03/31/2008', 1, 0.6 UNION ALL
SELECT 2, '03/08/2008', '04/04/2008', 1, 0.4 UNION ALL
SELECT 3, '04/01/2008', '04/04/2008', 1, 0.6 UNION ALL
SELECT 4, '01/01/2008', '04/30/2008', 2, 0.33 UNION ALL
SELECT 5, '01/01/2008', '01/31/2008', 2, 0.31 UNION ALL
SELECT 6, '02/01/2008', '02/28/2008', 2, 0.31 UNION ALL
SELECT 7, '03/01/2008', '03/31/2008', 2, 0.98 UNION ALL
SELECT 8, '04/01/2008', '04/30/2008', 2, 0.31 UNION ALL
SELECT 9, '04/01/2008', '04/30/2008', 2, 0.68

;WITH Yak(EmpID, theDate, aEnd)
AS (
SELECT EmpID,
MIN(aBegin),
MAX(aBegin)
FROM @Sample
GROUP BY EmpID

UNION ALL

SELECT EmpID,
DATEADD(DAY, 1, theDate),
aEnd
FROM Yak
WHERE theDate < aEnd
)

SELECT aLog,
MAX(thePercent) AS aPercentTotal
FROM (
SELECT s.aLog,
SUM(s.aPercent) OVER (PARTITION BY y.EmpID, y.theDate) AS thePercent
FROM Yak AS y
LEFT JOIN @Sample AS s ON s.EmpID = y.EmpID
AND s.aBegin <= y.theDate
AND s.aEnd >= y.theDate
) AS d
GROUP BY aLog
ORDER BY aLog



E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

jwpquinn
Starting Member

6 Posts

Posted - 2008-09-15 : 11:01:59
Peso

You are amazing...
I will test against the original 2 million row table...

Thank you...

Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-09-16 : 04:01:02
How did your tests go?



E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

jwpquinn
Starting Member

6 Posts

Posted - 2008-09-16 : 10:08:41
I have 77,191 overlaps to check. I am cross checking against an Access database which is where the data originally came from. I am upsizing to SQL Server 2005 because the size 2 million rows is just too much for Access to handle. In the Access database this logic was performed with VBA code and Recordsets Looping through the EmpIDs and flagging the records. I want to stay away from cursors in SQL Server 2005 because I have read that SQL is faster. I have only had time to check the first couple of thousand, but already I have found instances where the vba code in Access misidentified EmpIDs ALogs percentages > 1 and the SQL code that you wrote has it correct. I will let you know if I find a failure in the SQL.

Do you have any suggestions(books/resources) for a newbie such as myself to come up to speed on writing SQL and thinking in sets?
Also how to take advantage of SQL Server 2005? The CTE that you wrote was totally new to me and I have been reading up on it.
Is there a way to determine the appropriate value for option (maxrecursion 10000). I came to my value by trial and error.

Thanks again for your help!! I really appreciate it!
Go to Top of Page
   

- Advertisement -