Having vs Where - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

Transact-SQL (2000)

Having vs Where

Author

Topic

nic
Posting Yak Master

209 Posts

Posted - 2002-10-24 : 20:03:17

Hi, I'm getting confused on the difference between "having" and "where". I have a table (tblUser) that has a many to one relationship with two other tables (tblEmail and tblUpdate).

I want to return a list of users where the maximum date in the tblEmail table is 20 days greater than the maximum date in the tblUpdate table.

Here is my original query.
SELECT
a.us_username
,MAX(c.em_date) AS LastEmailDate
,MAX(d.upd_date) AS LastUpdateDate
,DATEDIFF(day,MAX(d.updlg_dt), MAX(c.emlg_date)) AS Difference
FROM
tblUser a LEFT OUTER JOIN tblEmailLog b ON a.us_key = b.us_key
LEFT OUTER JOIN tblUpdateLog b ON a.us_key = c.us_key
WHERE
DATEDIFF(day,b.upd_date, c.em_date) >=20
GROUP BY
a.us_username

If the maximum record for the user in the update table is not greater than the 20 day difference, it returns an earlier update record. (i.e. the LastUpdateDate column is not really the maximum record, it is the maximum record where the "where" clause is true. I expected no records to be returned, not a record than has an incorrect LastUpdateDate).

I then tried this:
SELECT
a.us_username
,MAX(c.em_date) AS LastEmailDate
,MAX(d.upd_date) AS LastUpdateDate
,DATEDIFF(day,MAX(d.updlg_dt), MAX(c.emlg_date)) AS Difference
FROM
tblUser a LEFT OUTER JOIN tblEmailLog b ON a.us_key = b.us_key
LEFT OUTER JOIN tblUpdateLog b ON a.us_key = c.us_key
GROUP BY
a.us_username
HAVING
DATEDIFF(day,MAX(b.upd_date), MAX(c.em_date)) >=20

Since I use the "having" clause I can use an aggregate function like MAX. I'm a little confused why the queries are different. I guess the HAVING clause filters the data AFTER returning it and the WHERE clause alters the data when its being retrieved???

The main reason why I am confused it the maximum em_date always returns correctly (the real maximum record), even though I don't use the MAX statement in the 1st WHERE clause. Why does the MAX(upd_date) change to fit the where clause, but the Max(em_date) always returns the correct maximum date?

When should you use HAVING instead of WHERE?

Any explanation would be helpful.
Thanks

Nic

M.E.
Aged Yak Warrior

539 Posts

Posted - 2002-10-25 : 12:39:39

Okay, I have a query that checks for duplicates.
select pk1,pk2,count(47)
from table1
group by pk1,pk2

that gives me a list of all the pk1,pk2 combinations and the amount of records that have that combo. Now lets say I only want to see the combinations with more then one record with that combo.

select pk1,pk2,count(47)
from table1
where count(47) > 1
group by pk1,pk2

this returns
Server: Msg 147, Level 15, State 1, Line 2
An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference.

so in this case you need to use having
select pk1,pk2,count(47)
from table1
group by pk1,pk2
having count(47) > 1

theres the main use for it I found.

The defination of having : The having clause allows you to restrict the rows returned without restricting the rows used in the query. Must be used with groupby.

Anytime you want to filter by an aggregate (such as count) you need having.

-----------------------
SQL isn't just a hobby, It's an addiction

nr
SQLTeam MVY

12543 Posts

Posted - 2002-10-25 : 13:18:09

Your first query will take all the records for the group where the dates are more than 20 days different then return the maximum of the two dates.

The second one will take groups where the maximum value of the two dates is greater than 20 days and return the maximum value of the two dates in those groups.

Try this
For dif = 1 gives the same
dif = 5 different results
dif = 6 no result from the having

create table #a (i int, j int)
insert #a select 1,15
insert #a select 9,13
declare @dif int
set @dif = 5

select max(i), max(j)
from #a
where j-i > @dif

select max(i), max(j)
from #a
having max(j) - Max(i) > @dif

drop table #a

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.

AjarnMark
SQL Slashing Gunting Master

3246 Posts

Posted - 2002-10-25 : 19:54:02

The short answer is that the WHERE clause filters BEFORE grouping. The HAVING clause filters AFTER grouping has occurred.

Now, as to your examples, both queries have both joined tables (tblEmailLog and tblUpdateLog) aliased as b. I don't see ANY table aliased as C or D, even though you reference those in your column lists...

jcelko
Esteemed SQL Purist

547 Posts

Posted - 2002-10-27 : 15:22:36

>> I'm getting confused on the difference between "having" and "where". <<

Here is how a SELECT works in SQL ... at least in theory. Real products will optimize things when they can.

a) Start in the FROM clause and build a working table from all of the joins, unions, intersections, and whatever other table constructors are there. The table expression> AS <correlation name> option allows you give a name to this working table which you then have to use for the rest of the containing query.

b) Go to the WHERE clause and remove rows that do not pass criteria; that is, that do not test to TRUE (reject UNKNOWN and FALSE). The WHERE clause is applied to the working in the FROM clause.

c) Go to the optional GROUP BY clause, make groups and reduce each group to a single row, replacing the original working table with the new grouped table. The rows of a grouped table must be group characteristics: (1) a grouping column (2) a statistic about the group (i.e. aggregate functions) (3) a function or (4) an expression made up of the those three items.

d) Go to the optional HAVING clause and apply it against the grouped working table; if there was no GROUP BY clause, treat the entire table as one group.

e) Go to the SELECT clause and construct the expressions in the list. This means that the scalar subqueries, function calls and expressions in the SELECT are done after all the other clauses are done. The AS operator can give a name to expressions in the SELECT list, too. These new names come into existence all at once, but after the WHERE clause, GROUP BY clause and HAVING clause has been executed; you cannot use them in the SELECT list or the WHERE clause for that reason.

If there is a SELECT DISTINCT, then redundant duplicate rows are removed. For purposes of defining a duplicate row, NULLs are treated as matching (just like in the GROUP BY).

f) Nested query expressions follow the usual scoping rules you would expect from a block structured language like C, Pascal, Algol, etc. Namely, the innermost queries can reference columns and tables in the queries in which they are contained.

--CELKO--
Joe Celko, SQL Guru

shijobaby
Starting Member

44 Posts

Posted - 2009-08-21 : 06:27:40

The ways to avoid this error is simple

just look into my post

http://sqlerrormessages.blogspot.com/2009/08/sql-server-error-message-msg-130-cannot.html

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources