Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 GROUP BY, take Foo from row where Bar=Min(Bar)

Author  Topic 

Jenda
Starting Member

29 Posts

Posted - 2008-04-30 : 18:17:33
Last year I asked here whether there is a way to group a table and for each group obtain the value of some column from the row where some other column is minimal. I did get some suggestion, but I think I've found an even better way ... even if restricted.

If both those columns are int (or can be transformed to an int) you can do something like this:

SELECT GroupCol, min(Cast(2147483648 as BigInt)*SortCol + WantedCol) % 2147483648 as WantedCol
FROM SourceTable
GROUP BY GroupCol

This seems to be much quicker than any subselects and may be included as a derived table in the FROM clause in a more complex query. Where it seems to be quicker than

SELECT some, columns,
(SELECT TOP 1 WantedCol
FROM SourceTable
WHERE SourceTable.GroupCol = ExternalTable.GroupCol
ORDER BY SortCol
) as WantedCol
FROM ExternalTable
WHERE ...

Especially if you are not grouping a single table, but the results of a more complex query and/or if you need several columns from the SourceTable. In which case you'd have to specify and evaluate the SELECT TOP 1 for each column, while using this trick you can just find the PrimaryKey of the wanted row for each value of GroupCol and then link the SourceTable like this:

SELECT some, columns,
SourceTable.Col1, SourceTable.Col2
FROM ExternalTable
JOIN (
SELECT GroupCol, min(Cast(2147483648 as BigInt)*SortCol + Id) % 2147483648 as Id
FROM SourceTable
GROUP BY GroupCol
) as Wanted on Wanted.GroupCol = ExternalTable.GroupCol
join SourceTable on SourceTable.Id = Wanted.Id


Hope this is of any use to someone :-)
Jenda

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2008-04-30 : 18:32:51
I've very surprised that someone suggested you use the TOP 1 approach last year. The derived table approach is definitely the way to go.

Do you have the link?

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Database maintenance routines:
http://weblogs.sqlteam.com/tarad/archive/2004/07/02/1705.aspx
Go to Top of Page

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2008-04-30 : 18:53:22
quote:
Originally posted by tkizer

I've very surprised that someone suggested you use the TOP 1 approach last year. The derived table approach is definitely the way to go.

Agreed. Plus with the ROW_NUMBER() function in 2005 I think it makes it more readable and the performance is the same. Here is quick-n-simple sample of what I mean. Hopefully, this is in line with what you are doing. :)
DECLARE @Foo TABLE (ID INT, SortCol INT, GroupCol INT)

INSERT @Foo
SELECT 1, 8, 1
UNION ALL SELECT 2, 5, 1
UNION ALL SELECT 3, 3, 1
UNION ALL SELECT 4, 1, 5
UNION ALL SELECT 5, 2, 5
UNION ALL SELECT 6, 3, 5
UNION ALL SELECT 7, 1, 8
UNION ALL SELECT 8, 2, 8
UNION ALL SELECT 9, 3, 8
UNION ALL SELECT 10, 4, 8

SELECT
GroupCol,
min(Cast(2147483648 as BigInt) * SortCol + Id) % 2147483648 as Id
FROM
@Foo
GROUP BY
GroupCol

-- 2005 Way
SELECT
GroupCol,
ID
FROM
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY GroupCol ORDER BY SortCol) AS RowNum
FROM
@Foo
) AS T
WHERE
RowNum = 1
Go to Top of Page

Jenda
Starting Member

29 Posts

Posted - 2008-04-30 : 19:07:29
I think I confused the issue too much last time. With having a join in place of the SourceTable etc. Plus it was actually a different case than the one I solved with the TOP 1 and I used a subselect to find the Min(SortCol) for each GroupCol in that case. The TOP 1s are older and are on places where I needed just one column and often needed just any of the values in the group instead of the one for some minimal SortCol.

I can't use the ROW_NUMBER() at the moment because the app needs to work on MSSQL2000 as well for now, but thanks for that solution anyway!

Now I just need to find the more important places where I used the subselect or TOP 1 and can speed things up using this.

Thanks, Jenda
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2008-04-30 : 20:15:21
In SQL 2000, don't use correlated sub queries, use JOINS and derived tables:


select a.*
from YourTable a
inner join
(
select id, Min(col) as col
from YourTable
group by id
)
b on a.id = b.id and a.col = b.col


That's it.

- Jeff
http://weblogs.sqlteam.com/JeffS
Go to Top of Page
   

- Advertisement -