Please explain this Having clause

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

Transact-SQL (2000)

Please explain this Having clause

Author

Topic

indy2005
Starting Member

12 Posts

Posted - 2007-07-15 : 11:31:43

Hello,

I am looking at some T-SQL in a database I am looking to consolidate with other data. I know what it does - it tries to create a ranking (forward and reverse) so that it can obtain a data on the 75th percentile. Below is a small part of a the subquery. My question is the use of aggregates in the Having clause has thrown me. I thought that "having" always occurred after the grouping - but this seems to be referring to fields not used in the grouping.

SELECT
X.TRIAL_NO,
X.[CENTRE FSI],
SUM(Y.DEG]) AS N
FROM
#TEMP_TAB_1 X,
#TEMP_TAB_1 Y
WHERE
X.[CENTRE FSI] > 0 AND Y.[CENTRE FSI] > 0
AND
X.TRIAL_NO = Y.TRIAL_NO
GROUP BY X.TRIAL_NO, X.[CENTRE FSI]
HAVING
SUM(CASE WHEN Y.[CENTRE FSI] <= X.[CENTRE FSI] THEN Y.[DEG] ELSE 0 END) >=FLOOR((SUM(Y.[DEG]) - 1) * 0.75) + 1
AND
SUM(CASE WHEN X.[CENTRE FSI] <= Y.[CENTRE FSI] THEN Y.[DEG] ELSE 0 END) >=FLOOR((SUM(Y.[DEG]) - 1) * (1 - 0.75)) + 1

How are the aggregate sums working here in the having clause?

Any help appreciated

Indy

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2007-07-15 : 11:44:01

sums in the having are used to filter the grouped data. it works on them.

have you tried it out and see what it does?

_______________________________________________
Causing trouble since 1980
blog: http://weblogs.sqlteam.com/mladenp

indy2005
Starting Member

12 Posts

Posted - 2007-07-15 : 11:48:30

Hi,

I have seen the output and know what it is doing - I just dont understand how it does it. While I appreciate Having operates after the grouping - note that this Having clause refers to Y.[CENTRE FSI] when Y.[CENTRE FSI] isnt in the group by clause.

Regards

indy2005
Starting Member

12 Posts

Posted - 2007-07-15 : 16:17:12

Hello,

Perhaps I should elaborate with an example:

TEMP_TAB_1 (1 trial for brevity)

TRIAL NO CENTRE_FSI DEG
1 1/1/2003 1
1 1/2/2003 2
1 1/3/2003 3

Here DEG represents the number of times that particular date occurs for that trial.

The query produces a cartesian:

-----------X------------------- --------------Y-------------

X.TRIAL NO X.CENTRE_FSI X.DEG Y.TRIAL NO Y.CENTRE_FSI Y.DEG
1 1/1/2003 1 1 1/1/2003 1
1 1/1/2003 1 1 1/2/2003 2
1 1/1/2003 1 1 1/3/2003 3

1 1/2/2003 2 1 1/1/2003 1
1 1/2/2003 2 1 1/2/2003 2
1 1/2/2003 2 1 1/3/2003 3

1 1/3/2003 3 1 1/1/2003 1
1 1/3/2003 3 1 1/2/2003 2
1 1/3/2003 3 1 1/3/2003 3

The query groups by X.TRIAL_NO, X.CENTRE_FSI and aggregates SUM(Y.DEG)

From a basic T-SQL standpoint - I need to understand where this logic is applied:

HAVING
SUM(CASE WHEN Y.[CENTRE FSI] <= X.[CENTRE FSI] THEN Y.[DEG] ELSE 0 END) >=FLOOR((SUM(Y.[DEG]) - 1) * 0.75) + 1

I thought Having occured after the aggregation - but this SQL seems to state that Y.[CENTRE FSI] (not in the Group By) is being compared to X.[CENTRE FSI] on a row by row basis? Is this the case - only it works only on the rows within the groups:

e.g. group 1

Red meets WHEN Y.[CENTRE FSI] <= X.[CENTRE FSI] criteria within the group.

X.TRIAL NO X.CENTRE_FSI X.DEG Y.TRIAL NO Y.CENTRE_FSI Y.DEG
1 1/1/2003 1 1 1/1/2003 1
1 1/1/2003 1 1 1/2/2003 2
1 1/1/2003 1 1 1/3/2003 3

X.TRIAL_NO X.[CENTRE FSI] SUM(y.DEG)
1 1/1/2003 1

Any help appreciated

indy2005
Starting Member

12 Posts

Posted - 2007-07-15 : 16:18:10

Well that didnt format as I expected....sorry - data not aligned with columns. I shall expect zero responses!

khtan
In (Som, Ni, Yak)

17689 Posts

Posted - 2007-07-15 : 21:17:08

format it using the [code] tag see the http://www.sqlteam.com/forums/faq.asp

KH
[spoiler]Time is always against us[/spoiler]

mattyblah
Starting Member

49 Posts

Posted - 2007-07-16 : 23:11:09

[code]
SUM(CASE WHEN Y.[CENTRE FSI] <= X.[CENTRE FSI] THEN Y.[DEG] ELSE 0 END) >=FLOOR((SUM(Y.[DEG]) - 1) * 0.75) + 1
[/code]
This is an aggregate as it uses the sum function. I wouldn't get caught up in the case statement, as it applies a conditional to the aggregate. i've used this method to combine multiple seperate aggregates on the same table into 1 query. an example:

[code]
select count(*) as hello_count from table1 where x = 'hello'
select count(*) as nothing_count from table1 where y = 'nothing'
select count(*) as hello_and_nothing_count from table1 where x = 'hello' and y = 'nothing'
[/code]

can be combined to a single query:

[code]
select
sum(case when x = 'hello' then 1 else 0 end) as hello_count,
sum(case when y = 'nothing' then 1 else 0 end) as nothing_count,
sum(case when x = 'hello' and y = 'nothing' then 1 else 0 end) as hello_and_nothing_count
from table1
[/code]

you don't see x or y in the group by clause but as long as you're not selecting the individual columns you shouldn't have to group by them.

indy2005
Starting Member

12 Posts

Posted - 2007-07-19 : 16:28:26

I think I have worked it out. I have always assumed that you can only performing having after the aggregation so all the detail is lost. In fact having groups the data, but you can still decide which group to include or exclude by performing filtering within the individual rows in the group (if that makes sense).

i.e. Having criteria can still look within the groups individual row data to determine which group to return.

I have been reading too many simple examples such as

Select CustomerID, Count(*) from Orders
Where Count(*) > 5

To sum up, I now understand Having criteria are performed after the grouping but before the aggregation!!

I hope....