Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 Summing a UNION

Author  Topic 

ramoneguru
Yak Posting Veteran

69 Posts

Posted - 2006-11-14 : 18:46:39
Ok I have a query that looks like this:
SELECT
t.TutorID,
t.PersonID,
SUM(t.HoursPresent) AS TotalHours
FROM TutorHours t
GROUP BY t.TutorID

UNION

SELECT
pih.TutorID,
pih.PersonID,
SUM(pih.HoursPresent) AS TotalHours
FROM PairInstructionalHours pih
GROUP BY pih.TutorID

Is it possible to take the sum of both occurrences of TotalHours for each row? Because right now it looks like this:

TutorID PersonID TotalHours
------- -------- ----------
12345 56789 56
12345 56789 109.7
77777 88888 234
77777 88888 456
99999 11111 23
99999 11111 345



So, I get duplicates (one is for the first sum, the other for the second sum). How can I combine the sums? Is it possible?
--Nick

khtan
In (Som, Ni, Yak)

17689 Posts

Posted - 2006-11-14 : 19:01:51
[code]
select TutorID, PersonID, TotalHours = sum(TotalHours)
from
(
SELECT t.TutorID, t.PersonID, SUM(t.HoursPresent) AS TotalHours
FROM TutorHours t
GROUP BY t.TutorID, t.PersonID

UNION

SELECT pih.TutorID, pih.PersonID, SUM(pih.HoursPresent) AS TotalHours
FROM PairInstructionalHours pih
GROUP BY pih.TutorID, pih.PersonID
) a
group by TutorID, PersonID
[/code]


KH

Go to Top of Page

ramoneguru
Yak Posting Veteran

69 Posts

Posted - 2006-11-14 : 19:55:53
Sweeeeeeeeeeeeeeeeet, works like a charm. Thanks :-)
--Nick
Go to Top of Page

PSamsig
Constraint Violating Yak Guru

384 Posts

Posted - 2006-11-15 : 02:26:44
I belive you need a UNION ALL, in case you have the exact same hours for the same TutorID/PersonID you will loose one of them to the implicit DISTINCT.

I would expect that this runs a bit faster, but I havent checked if the plans are any different
SELECT TutorID, PersonID, TotalHours = sum(HoursPresent)
from
(
SELECT t.TutorID, t.PersonID, t.HoursPresent
FROM TutorHours t

UNION ALL

SELECT pih.TutorID, pih.PersonID, pih.HoursPresent
FROM PairInstructionalHours pih
) a
group by TutorID, PersonID


-- Alice came to a fork in the road. "Which road do I take?" she asked. "Where do you want to go?" responded the Cheshire cat. "I don't know," Alice answered. "Then," said the cat, "it doesn't matter."
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-11-15 : 02:51:34
And why GROUP BY twice?
SELECT		TutorID,
PersonID,
SUM(HoursPresent) AS TotalHours
FROM (
SELECT TutorID,
PersonID,
HoursPresent
FROM TutorHours

UNION ALL

SELECT TutorID,
PersonID,
HoursPresent
FROM PairInstructionalHours
) q
GROUP BY TutorID,
PersonID
ORDER BY TutorID,
PersonID

Peter Larsson
Helsingborg, Sweden
Go to Top of Page

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2006-11-15 : 07:48:54
In this case UNION ALL is better than UNION

>>And why GROUP BY twice?

When you use group by inside subqueries, you are combining smaller set of results than combining larger results and then applying group by. But I am not sure which is effecient.

Madhivanan

Failing to plan is Planning to fail
Go to Top of Page

khtan
In (Som, Ni, Yak)

17689 Posts

Posted - 2006-11-15 : 10:26:37
quote:
Originally posted by madhivanan

In this case UNION ALL is better than UNION

>>And why GROUP BY twice?

When you use group by inside subqueries, you are combining smaller set of results than combining larger results and then applying group by. But I am not sure which is effecient.

Madhivanan

Failing to plan is Planning to fail



Yes. That was my consideration. It should be more efficient to union 2 smaller result set.


KH

Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-11-15 : 13:52:26
Actually you are grouping three times.
First two times in the derived table, and then again outside the derived table.

It would be great if OP could test both approaches and post back here the execution plans.


Peter Larsson
Helsingborg, Sweden
Go to Top of Page

PSamsig
Constraint Violating Yak Guru

384 Posts

Posted - 2006-11-15 : 14:18:52
Given the two tables:
CREATE TABLE [dbo].[TutorHours](
[TutorID] [int] NOT NULL,
[PersonID] [int] NOT NULL,
[HoursPresent] [int] NOT NULL
) ON [PRIMARY]

CREATE TABLE [dbo].[PairInstructionalHours](
[TutorID] [int] NOT NULL,
[PersonID] [int] NOT NULL,
[HoursPresent] [int] NOT NULL
) ON [PRIMARY]
filled with random numbers, like this:
INSERT INTO [Test].[dbo].[TutorHours]
([TutorID]
,[PersonID]
,[HoursPresent])
SELECT abs(cast(cast(newid() as varbinary) as int)) % 100 + 1
,abs(cast(cast(newid() as varbinary) as int)) % 100 + 1
,abs(cast(cast(newid() as varbinary) as int)) % 8 + 1
FROM sys.objects s1
CROSS JOIN sys.objects s2
CROSS JOIN sys.objects s3

INSERT INTO [Test].[dbo].[PairInstructionalHours]
([TutorID]
,[PersonID]
,[HoursPresent])
SELECT abs(cast(cast(newid() as varbinary) as int)) % 100 + 1
,abs(cast(cast(newid() as varbinary) as int)) % 100 + 1
,abs(cast(cast(newid() as varbinary) as int)) % 8 + 1
FROM sys.objects s1
CROSS JOIN sys.objects s2
CROSS JOIN sys.objects s3
then mine gets a slightly better execution plan using only one GROUP BY (on a 2005).

-- Alice came to a fork in the road. "Which road do I take?" she asked. "Where do you want to go?" responded the Cheshire cat. "I don't know," Alice answered. "Then," said the cat, "it doesn't matter."
Go to Top of Page

khtan
In (Som, Ni, Yak)

17689 Posts

Posted - 2006-11-15 : 18:48:22
"then mine gets a slightly better execution plan using only one GROUP BY (on a 2005)."
I tried it on 2000. You are right. But what's the explaination ?


KH

Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-11-16 : 00:34:50
GROUP BY does a sort internally, and sorting three times is costly. And SQL needs to allocate space for storing the intermediate results.


Peter Larsson
Helsingborg, Sweden
Go to Top of Page
   

- Advertisement -