Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 ANY() aggregate function

Author  Topic 

daniel.nospam
Starting Member

11 Posts

Posted - 2011-09-28 : 06:06:36
Hi.

I've seen a lot of patronising here whenever someone asks about FIRST() and LAST() aggregate functions on SQL Server. Cliche answers like "Why don't you use MIN() and MAX()?", "You cannot guarantee the order of the rows in the database.", etc...

Please cut the patronising. I have an UNIQUEIDENTIFIER column and I want to retrieve _any_ value from a group-by set. I don't care which one, just give me any. FIRST() or LAST() would work just fine in the absense of ANY().

However, none of these are available and I have to use MIN(), but hang on... you can't use MIN()/MAX() on UNIQUEIDENTIFIER columns. It throws an exception. Thank you very much gurus, now I have to do this:

CONVERT(uniqueidentifier, MAX(CONVERT(char(36), MyGuidColumn)))

This is so fugly and what about the overhead added. Wouldn't it be much nicer if I could just do:

ANY(MyGuidColumn)
or even FIRST(MyGuidColumn)

Regards,
Daniel

webfred
Master Smack Fu Yak Hacker

8781 Posts

Posted - 2011-09-28 : 06:26:24
select top (1) col from table?


No, you're never too old to Yak'n'Roll if you're too young to die.
Go to Top of Page

daniel.nospam
Starting Member

11 Posts

Posted - 2011-09-28 : 07:03:34
Dear WEBFRED,

I guess you missed the part when I said I needed any value from a GROUP-BY set.

I'll try make it clearer. My question is:

"Why doesn't MS implement FIRST(), LAST() or at least ANY() aggregate functions?"

And here is the reason (IMO) why they should:

CREATE TABLE AnyTable
(
MyGuidColumn uniqueidentifier not null,
MyCharColumn varchar(10)
);

INSERT AnyTable VALUES
({GUID'7BCA03B9-A7C6-46E7-97C9-F29CCD338CE5'}, 'Banana'),
({GUID'C453034D-B4A0-4E21-80F5-58D196DC9267'}, 'Orange'),
({GUID'407CE892-04E0-46FE-9777-48B5DAF8817D'}, 'Banana'),
({GUID'9BBAFD63-145A-4A07-B501-C8A3CF502F25'}, 'Banana'),
({GUID'5B2B4663-F908-4376-AD19-23A8039EE17E'}, 'Orange');

SELECT MyCharColumn, MIN(MyGuidColumn)
FROM AnyTable
GROUP BY MyCharColumn;

-- OOPS!
-- Msg 8117, Level 16, State 1, Line 1
-- Operand data type uniqueidentifier is invalid for min operator.
go

-- WORK-AROUND
SELECT MyCharColumn, CONVERT(uniqueidentifier, MIN(CONVERT(char(36), MyGuidColumn)))
FROM AnyTable
GROUP BY MyCharColumn;

-- DESIRED SOLUTION
/*
SELECT MyCharColumn, ANY(MyGuidColumn)
FROM AnyTable
GROUP BY MyCharColumn;
*/
Go to Top of Page

webfred
Master Smack Fu Yak Hacker

8781 Posts

Posted - 2011-09-28 : 07:40:59
Yes I missed that GROUP BY - sorry.
Yes you are right - in this case it would be nice to have ANY() instead of doing a workaround with CONVERT() or ROW_NUMBER()...



No, you're never too old to Yak'n'Roll if you're too young to die.
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2011-09-28 : 08:30:30
quote:
I've seen a lot of patronising here whenever someone asks about FIRST() and LAST() aggregate functions on SQL Server. Cliche answers like "Why don't you use MIN() and MAX()?", "You cannot guarantee the order of the rows in the database.", etc...
Cool your pits. It's neither patronizing nor cliche. Relational data structures do not have an inherent order, and believing (if not insisting) that they should makes understanding their nature more difficult.

You have a bag of marbles, each one unique, but no defined physical order. If you change the order of the marbles in the bag, or line them up, does their physical order change anything about them? No. This is very easily modeled in a relational table (color, weight, size, material)

Same thing with musical chairs. Each person has a name, height, weight, age, etc. If they move from chair to chair, their attributes do not change regardless of which chair they sit in. It's the same with relational data: it's all about WHAT it is, not WHERE it is.

There is an especially good reason for not caring about physical order. If you change the underlying physical structure, e.g. the clustered index, using FIRST() and LAST() could give you different results USING THE SAME QUERY. That's not acceptable from a data integrity perspective, regardless of your immediate need. MIN and MAX are however oblivious to physical ordering, and only care about the actual values.
quote:
Please cut the patronising. I have an UNIQUEIDENTIFIER column and I want to retrieve _any_ value from a group-by set. I don't care which one, just give me any. FIRST() or LAST() would work just fine in the absense of ANY().
ANY is a SQL function, just not an aggregate.
quote:
However, none of these are available and I have to use MIN(), but hang on... you can't use MIN()/MAX() on UNIQUEIDENTIFIER columns. It throws an exception. Thank you very much gurus, now I have to do this:

CONVERT(uniqueidentifier, MAX(CONVERT(char(36), MyGuidColumn)))
Really, is it that hard? You've written 10 times as much complaining about having to use convert than actually using it. You can also convert to binary(16), since that is the native storage of uniqueidentifier, and may process faster. You can also cast it as SQL_VARIANT and use MIN and MAX on it. In any event...
quote:
"Why doesn't MS implement FIRST(), LAST() or at least ANY() aggregate functions?"
The real question should be, why doesn't MS allow MIN and MAX on uniqueidentifier? I agree with that one, since they can be ordered, there's no reason not to allow MIN and MAX. My guess is that the original purpose was to simply be...wait for it...a unique identifier that did not carry any useful information, there was no point in aggregating it. Makes some sense but is still a lame excuse. Still, they don't currently support it, so do the CONVERT and get on with life. There are bigger things.

Lastly, SQL Server Denali/2012 will add the FIRST_VALUE() and LAST_VALUE() window functions. These are more valid than FIRST and LAST since they require a window to be defined over the data being evaluated. See ROW_NUMBER, RANK, DENSE_RANK, and NTILE for other examples. You can also simulate MIN and MAX on GUIDs using ROW_NUMBER(), as webfred suggested:
;WITH CTE(MyCharColumn,MyGUIDColumn,rn) AS (
SELECT MyCharColumn, MyGUIDColumn, ROW_NUMBER() OVER (PARTITION BY MyCharColumn ORDER BY MyGUIDColumn) rn FROM AnyTable)
SELECT MyCharColumn, MyGUIDColumn FROM CTE WHERE rn=1
Go to Top of Page

daniel.nospam
Starting Member

11 Posts

Posted - 2011-09-28 : 09:46:02
Thanks guys, both webfred and robvolk.

Sorry if I sounded like a whinger with no value to aggregate (sorry for the pun here too).

I agree with your explanation about first and last, applied to relational data. I just thought that if one doesn't care about the order, these would work just as well as ANY().

I can't help thinking that ANY() would be much cheaper than MIN()/MAX() even disregarding the data type conversion overhead. It would be as simple as getting the value from any row in the set, like the first one for instance.

I also don't understand why SQL Server allows ORDER BY on an uniqueidentifier, but not MIN() or MAX().
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2011-09-28 : 10:24:07
quote:
I also don't understand why SQL Server allows ORDER BY on an uniqueidentifier, but not MIN() or MAX().
I hear ya, I think it's pretty stupid. However given the alternatives, doing the CONVERT is probably the least overhead.
Go to Top of Page

ivanarj
Starting Member

1 Post

Posted - 2014-08-11 : 14:55:23
Internally, the optimizer has ANY():
http://sqlblog.com/blogs/paul_white/archive/2011/07/02/undocumented-query-plans-the-any-aggregate.aspx
Go to Top of Page
   

- Advertisement -