Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Trouble with joins

Author  Topic 

thenearfuture
Starting Member

35 Posts

Posted - 2007-03-02 : 20:05:29
I have a somewhat long question, and if anybody has the time and patience to read through it, I'd appreciate it.

I've been trying to write queries to extract infomation about a group of cases.

Each case goes through a set of processes. The first process is analysis, the second process is project work, the third process is review. When a case enters or leaves a process, the date is noted in the log as either "ProcStartDate" or "ProcEndDate." So, from the process table below, you can see that Case # 10 entered Process 1 on 02-28-2005, and left Process 1 on 06-17-2005. Likewise, it entered Process 2 on 06-18-2005 and left it on 07-21-2005.


Process


CaseID ProcessID ProcStartDate ProcEndDate
10 1 02-28-2005 06-17-2005
10 2 06-18-2005 07-21-2005
10 3 07-22-2005 10-11-2005
20 1 03-21-2005 07-24-2005
20 2 07-25-2005 08-10-2005



During each process, the company sends letters or makes phone calls to help move the process along. Each letter is recorded in the Letters table, while each phone call is recorded in the PhoneCalls table. In the letters table, you can see that Case # 10 had a letter sent on 04-15-2005. In the PhoneCalls table, you can see that the same case had a phone call made on 07-24-2005.


Letters


CaseID LetterLogID EntryDate AmtofLetters
10 101 04-15-2005 1
10 102 07-20-2005 1
10 103 09-16-2005 1
20 201 06-14-2005 1
20 202 06-19-2005 1
20 203 07-27-2005 1
20 204 07-29-2005 1


PhoneCalls



PhoneCalls


CaseID PhoneLogID EntryDate AmtofPhoneCalls
10 4101 04-17-2005 1
10 4102 07-24-2005 1
10 4103 09-18-2005 1
10 4104 09-23-2005 1
10 4105 09-25-2005 1
20 4106 06-18-2005 1
20 4107 06-20-2005 1
20 4108 07-28-2005 1
20 4109 07-29-2005 1



I need to join these tables to show the number of letters and the number of phone calls that were made for each case during each process.

Since the two activity tables (Letters and PhoneCalls, respectively) don't have any Process information, I have to figure out what process the case was in at the time of the activity by looking at the EntryDate in the activity table and comparing it to the dates in the Process table. So, for example, Case # 10 had a letter sent on 09-16-2005, which means that it was in Process 3, since Process 3 (for Case # 10) lasted from 07-22-2005 to 10-11-2005.

For Case # 20, there was a letter on 07-27-2005 and another letter on 07-29-2005. By comparing these dates to the Process table, we can see that these two letters were sent during Process 2, which lasted from 07-25-2005 to 08-10-2005.

When I started trying to write this query, I began with just one of the activity tables - the Letters table. I made two subqueries in the "from" field using Process and Letters and left outer joined Letters on Letters.CaseID = Process.CaseID and Letters.EntryDate >= Process.ProcStartDate and Letters.EntryDate < Process.ProcEndDate. The second part of the "on" clause is an attempt at relating the activity entry dates to the process dates.

When I run the query using just the Process table and one of the activity tables, it works fine. When I try to left outer join the other activity table (so that I have Process, Letters, and PhoneCalls in there all at once), the numbers suddenly change and become incorrect, but I can't figure out why.

Does anyone have an idea as to how to join these tables so that the numbers come out right?

This is the result I'm looking for:


CaseID ProcessID AmtofLetters AmtofPhoneCalls
10 1 1 1
10 2 1 0
10 3 1 4
20 1 2 2
20 2 2 2


Thank you.

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-03-03 : 04:26:36
This solution is should do the trick!
-- prepare sample data
set dateformat mdy

declare @process table (caseid int, processid int, procstartdate datetime, procenddate datetime)

insert @process
select 10, 1, '02-28-2005', '06-17-2005' union all
select 10, 2, '06-18-2005', '07-21-2005' union all
select 10, 3, '07-22-2005', '10-11-2005' union all
select 20, 1, '03-21-2005', '07-24-2005' union all
select 20, 2, '07-25-2005', '08-10-2005'

declare @letters table (caseid int, letterlogid int, entrydate datetime, amtofletters int)

insert @letters
select 10, 101, '04-15-2005', 1 union all
select 10, 102, '07-20-2005', 1 union all
select 10, 103, '09-16-2005', 1 union all
select 20, 201, '06-14-2005', 1 union all
select 20, 202, '06-19-2005', 1 union all
select 20, 203, '07-27-2005', 1 union all
select 20, 204, '07-29-2005', 1

declare @phonecalls table (caseid int, phonelogid int, entrydate datetime, amtofphonecalls int)

insert @phonecalls
select 10, 4101, '04-17-2005', 1 union all
select 10, 4102, '07-24-2005', 1 union all
select 10, 4103, '09-18-2005', 1 union all
select 10, 4104, '09-23-2005', 1 union all
select 10, 4105, '09-25-2005', 1 union all
select 20, 4106, '06-18-2005', 1 union all
select 20, 4107, '06-20-2005', 1 union all
select 20, 4108, '07-28-2005', 1 union all
select 20, 4109, '07-29-2005', 1

-- Show the proper expected result
SELECT x.CaseID,
x.ProcessID,
ISNULL(x.AmtOfLetters, 0) AS AmtOfLetters,
ISNULL(y.AmtOfPhoneCalls, 0) AS AmtOfPhoneCalls
FROM (
SELECT p.CaseID,
p.ProcessID,
SUM(l.AmtOfLetters) AS AmtOfLetters
FROM @Process AS p
LEFT JOIN @Letters AS l ON l.CaseID = p.CaseID AND l.EntryDate BETWEEN p.ProcStartDate AND p.ProcEndDate
GROUP BY p.CaseID,
p.ProcessID
) AS x
INNER JOIN (
SELECT p.CaseID,
p.ProcessID,
SUM(c.AmtOfPhoneCalls) AS AmtOfPhoneCalls
FROM @Process AS p
LEFT JOIN @PhoneCalls AS c ON c.CaseID = p.CaseID AND c.EntryDate BETWEEN p.ProcStartDate AND p.ProcEndDate
GROUP BY p.CaseID,
p.ProcessID
) AS y ON y.CaseID = x.CaseID AND y.ProcessID = x.ProcessID
ORDER BY x.CaseID,
x.ProcessID

Peter Larsson
Helsingborg, Sweden
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-03-03 : 07:02:53
Another approach
SELECT		CaseID,
ProcessID,
SUM(AmtOfLetters) AS AmtOfLetters,
SUM(AmfOfPhoneCalls) AS AmfOfPhoneCalls
FROM (
SELECT p.CaseID,
p.ProcessID,
ISNULL(l.AmtOfLetters, 0) AS AmtOfLetters,
0 AS AmfOfPhoneCalls
FROM @Process AS p
LEFT JOIN @Letters AS l ON l.CaseID = p.CaseID AND l.EntryDate BETWEEN p.ProcStartDate AND p.ProcEndDate

UNION ALL

SELECT p.CaseID,
p.ProcessID,
0,
ISNULL(c.AmtOfPhoneCalls, 0)
FROM @Process AS p
LEFT JOIN @PhoneCalls AS c ON c.CaseID = p.CaseID AND c.EntryDate BETWEEN p.ProcStartDate AND p.ProcEndDate
) AS x
GROUP BY CaseID,
ProcessID
ORDER BY CaseID,
ProcessID

Peter Larsson
Helsingborg, Sweden
Go to Top of Page

thenearfuture
Starting Member

35 Posts

Posted - 2007-03-04 : 03:44:02
Both approaches were perfect, thank you.

Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-03-04 : 07:54:45
Try both and examine the two different execution plans to see which approach that is most efficient.


Peter Larsson
Helsingborg, Sweden
Go to Top of Page
   

- Advertisement -