Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Full outer join count problem

Author  Topic 

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2006-12-12 : 09:04:50
Hi all,

I have come across a problem which I find to be pretty odd. I have two tables with a similar structure and the same primary key with different data and sometimes the PK is present in @lan and sometime in @leas and sometimes in both. I need one unique row for each PK and a count and a sum from each table. Here is my sample data:
DECLARE @lan table (orgnr int, lan int, lannavn varchar(20))
DECLARE @leas table (orgnr int, leas int, leasnavn varchar(20))

insert into @lan
SELECT 1, 10000, 'henning' UNION ALL
SELECT 1, 11000, 'henning' UNION ALL
SELECT 2, 20000, 'reidar' UNION ALL
SELECT 3, 10000, 'bente'

insert into @leas
SELECT 2, 10000, 'reidar' UNION ALL
SELECT 2, 10000, 'reidar' UNION ALL
SELECT 3, 11000, 'bente' UNION ALL
SELECT 4, 20000, 'anita' UNION ALL
SELECT 4, 10000, 'anita'

SELECT * FROM @lan
SELECT * FROM @leas

SELECT
COALESCE(a.OrgNr, b.OrgNr) AS OrgNr,
COALESCE(a.Lannavn, b.Leasnavn) AS Navn,
COUNT(a.Lan) AS CountLan,
ISNULL(SUM(a.Lan), 0) AS SumLan,
COUNT(b.Leas) AS CountLeas,
ISNULL(SUM(b.Leas), 0) AS SumLeas
FROM @lan a
FULL OUTER JOIN @leas b
ON a.OrgNr = b.OrgNr
GROUP BY
COALESCE(a.OrgNr, b.OrgNr),
COALESCE(a.Lannavn, b.Leasnavn)
ORDER BY
COALESCE(a.OrgNr, b.OrgNr)
The problem is that if you check out "reidar" he has a count of 1 in @lan and 2 in @leas but they come out as 2 for both. After some investigation I have found that if the PK is present in both tables the count will be equal for both (the highest of the two) regardless of how the count is in each table. The sum will also get multiplied with the highest count...

Why is this and how can I work around it?

--
Lumbago
"Real programmers don't document, if it was hard to write it should be hard to understand"

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-12-12 : 09:43:03
Something like this?
-- prepare test data
DECLARE @lan table (orgnr int, lan int, lannavn varchar(20))

insert into @lan
SELECT 1, 10000, 'henning' UNION ALL
SELECT 1, 11000, 'henning' UNION ALL
SELECT 2, 20000, 'reidar' UNION ALL
SELECT 3, 10000, 'bente'

DECLARE @leas table (orgnr int, leas int, leasnavn varchar(20))

insert into @leas
SELECT 2, 10000, 'reidar' UNION ALL
SELECT 2, 10000, 'reidar' UNION ALL
SELECT 3, 11000, 'bente' UNION ALL
SELECT 4, 20000, 'anita' UNION ALL
SELECT 4, 10000, 'anita'

-- do the work
SELECT q.OrgNr,
q.Navn,
SUM(CASE WHEN q.Type = 'Lan' THEN 1 ELSE 0 END) AS [# Lan],
SUM(q.Lan) AS [@ Lan],
SUM(CASE WHEN q.Type = 'Leas' THEN 1 ELSE 0 END) AS [# Leas],
SUM(q.Leas) AS [@ Leas]
FROM (
SELECT OrgNr,
LanNavn Navn,
Lan,
0 AS Leas,
'Lan' AS Type
FROM @Lan
UNION ALL
SELECT OrgNr,
LeasNavn,
0,
Leas,
'Leas' AS Type
FROM @Leas
) q
GROUP BY q.OrgNr,
q.Navn
ORDER BY q.OrgNr,
q.Navn

Peter Larsson
Helsingborg, Sweden
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2006-12-12 : 09:52:54
Exellent! I was hoping you were still at work

But I still don't get why mine didn't work...I'm suspecting it has something to do with the group by but I'm not sure.

--
Lumbago
"Real programmers don't document, if it was hard to write it should be hard to understand"
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2006-12-12 : 10:05:44
Umh...one little thing I sort of left out in the specifications. The name can actually be different between the two tables but they still have to be regarded as the same based on the OrgNr

Table @lan should preside (is that a word??) over @leas

--
Lumbago
"Real programmers don't document, if it was hard to write it should be hard to understand"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-12-12 : 10:07:35
This is your output when deleting aggregate functions
OrgNr	Navn	#Lan	@Lan	#Leas	@Leas
1 henning 10000 10000 NULL NULL
1 henning 11000 11000 NULL NULL
2 reidar 20000 20000 10000 10000
2 reidar 20000 20000 10000 10000
3 bente 10000 10000 11000 11000
4 anita NULL NULL 20000 20000
4 anita NULL NULL 10000 10000
FULL JOIN does some weird things when dealing with NULL. It seems that since it actually does find a value for one JOIN, it resuses that value whenever a NULL is found for same PK binding.

Took me a while to figure out...


Peter Larsson
Helsingborg, Sweden
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-12-12 : 10:15:45
It can be explained like this. SQL Server Joins only on OrgNr, therefore it matches Navn and Lan several times, like a CROSS JOIN. The full record across all columns are not expected to be unique.
SQL Server does a JOIN for the two last columns too.

Your FULL JOIN
SELECT *
FROM @lan a
FULL OUTER JOIN @leas b
ON a.OrgNr = b.OrgNr
can be best explained this way
SELECT		*
FROM (SELECT OrgNr FROM @Lan UNION SELECT OrgNr FROM @Leas) q
LEFT JOIN @Lan la ON la.OrgNr = q.OrgNr
LEFT JOIN @Leas le ON le.OrgNr = q.OrgNr

Peter Larsson
Helsingborg, Sweden
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2006-12-12 : 10:44:12
Hm, I'll have to think about this...I did use a completely different method a few days back that satisfied all requirements but I thought it could be done in a better way so I started investigating the full outer join. Seems I have to revert to the way I first started out unless you have a few pointers in any other direction:
SELECT 
OrgNr.OrgNr,
COALESCE(lan.Navn, leas.navn) AS Navn,
ISNULL(lan.CountLan, 0) AS CountLan,
ISNULL(lan.SumLan, 0) AS SumLan,
ISNULL(leas.CountLeas, 0) AS CountLeas,
ISNULL(leas.SumLeas, 0) AS SumLeas
FROM (
SELECT DISTINCT orgnr
FROM (
SELECT OrgNr FROM @lan
UNION ALL
SELECT OrgNr FROM @leas
) AS a
) AS OrgNr
LEFT OUTER JOIN (
SELECT OrgNr, LanNavn AS Navn, COUNT(*) AS CountLan, SUM(lan) AS SumLan
FROM @lan
GROUP BY OrgNr, LanNavn
) AS lan
ON OrgNr.OrgNr = lan.OrgNr
LEFT OUTER JOIN (
SELECT OrgNr, LeasNavn AS Navn, COUNT(*) AS CountLeas, SUM(leas) AS SumLeas
FROM @leas
GROUP BY OrgNr, LeasNavn
) AS leas
ON OrgNr.OrgNr = leas.OrgNr
ORDER BY OrgNr.OrgNr


--
Lumbago
"Real programmers don't document, if it was hard to write it should be hard to understand"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-12-12 : 10:49:40
This last code of yours, produces the same result as my first suggestion.
Is this the result you want?


Peter Larsson
Helsingborg, Sweden
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2006-12-13 : 02:18:27
Yes, except for the fact that the names in the tables might differ but they should still be treated the same based on the OrgNr. If you change "reidar" in @leas to "Peso" or something but leave the OrgNr you'll see the difference...

--
Lumbago
"Real programmers don't document, if it was hard to write it should be hard to understand"
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2006-12-13 : 02:19:23
My last code was my first try at this and I'm still hoping there is a better way that fulfills all requirements.

--
Lumbago
"Real programmers don't document, if it was hard to write it should be hard to understand"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-12-13 : 03:08:08
How do you then treat the shift from "Reidar" to "Peso"?
Use my query with only OrgNr as GROUP BY argument and then JOIN the result to some kind of name table.


Peter Larsson
Helsingborg, Sweden
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-12-13 : 03:25:01
[code]-- prepare test data
DECLARE @lan table (orgnr int, lan int, lannavn varchar(20))

insert into @lan
SELECT 1, 10000, 'henning' UNION ALL
SELECT 1, 11000, 'henning' UNION ALL
SELECT 2, 20000, 'reidar' UNION ALL
SELECT 3, 10000, 'bente'

DECLARE @leas table (orgnr int, leas int, leasnavn varchar(20))

insert into @leas
SELECT 2, 10000, 'reidar' UNION ALL
SELECT 2, 10000, 'reidar' UNION ALL
SELECT 3, 11000, 'bente' UNION ALL
SELECT 4, 20000, 'anita' UNION ALL
SELECT 4, 10000, 'anita'

-- do the work
SELECT q.OrgNr,
MIN(q.Navn) NameLow,
MAX(q.Navn) NameHigh,
SUM(CASE WHEN q.Type = 'Lan' THEN 1 ELSE 0 END) AS [# Lan],
SUM(q.Lan) AS [@ Lan],
SUM(CASE WHEN q.Type = 'Leas' THEN 1 ELSE 0 END) AS [# Leas],
SUM(q.Leas) AS [@ Leas]
FROM (
SELECT OrgNr,
LanNavn Navn,
Lan,
0 AS Leas,
'Lan' AS Type
FROM @Lan
UNION ALL
SELECT OrgNr,
LeasNavn,
0,
Leas,
'Leas' AS Type
FROM @Leas
) q
GROUP BY q.OrgNr
ORDER BY q.OrgNr[/code]

Peter Larsson
Helsingborg, Sweden
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2006-12-13 : 05:01:49
The shift is handled here COALESCE(lan.Navn, leas.navn) hich means that any value in lan superseeds the value in leas unless it's NULL. So is this really the best way...hm...


--
Lumbago
"Real programmers don't document, if it was hard to write it should be hard to understand"
Go to Top of Page
   

- Advertisement -