Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 Script Library
 Summing only the greatest 5 values in each group

Author  Topic 

jemajoign
Starting Member

7 Posts

Posted - 2008-01-10 : 14:08:46
Hello,
Here's one way to sum only the top 5 (greatest 5) values per group.
I assume there is a table called IdValues that contains two columns: id int, value int.

declare @lastId int
declare @value int
declare @count int
declare @idList varchar(5000)
declare @valuelist varchar(5000)
set @count=0
set @lastId = -1
set @value = 0
set @idList=''
set @valuelist=''

select
@count=(case when @lastId<>id then 1 else @count+1 end),
@value=(case when @lastId<>id then value when @count<=5 then @value+value else @value end),
@idList=(case when @lastId<>id then cast(id as varchar)+','+@idList else @idList end),
@valuelist=(case when @lastId<>id then cast(@value as varchar)+','+@valuelist else cast(@value as varchar)+','+right(@valuelist,len(@valuelist)-charindex(',',@valuelist)) end),
@lastId=id
from IdValues
order by id desc, value desc

select @idList,@valuelist

It's a funny approach. I'd be interested to see a better method. In MySQL it is possible to do this much better and have it produce an actual resultset (since MySQL allows you to assign variables and product a resultset in the same query).

I also noticed something interesting. If you do any operation on the order-by columns, the query doesn't work. For example, if I do:
order by id+0 desc, value desc
something funny happens and you only get one id and one value in the list variables. Maybe someone else who actually some idea of how SQL Server works can explain this.

Thanks,
Thomas

jhocutt
Constraint Violating Yak Guru

385 Posts

Posted - 2008-01-10 : 15:06:56
create table #IdValues (id int, value int)

insert into #IdValues
select 1, 20 union all
select 1, 30 union all
select 1, 40 union all
select 1, 50 union all
select 1, 60 union all
select 1, 70 union all
select 1, 80 union all
select 2, 21 union all
select 2, 32 union all
select 2, 43 union all
select 2, 54 union all
select 2, 65 union all
select 2, 76 union all
select 2, 87

select distinct id,
(select sum(value)
from #IdValues sub
where sub.id = #IdValues.id
and value in (
select top 5 value
from #IdValues sub2
where sub2.id = #IdValues.id
order by value desc
)
)
from #IdValues

drop table #IdValues



"God does not play dice" -- Albert Einstein
"Not only does God play dice, but he sometimes throws them where they cannot be seen."
-- Stephen Hawking
Go to Top of Page

jemajoign
Starting Member

7 Posts

Posted - 2008-01-10 : 15:32:42
Wow that was awesome .

But I guess your solution works well if IdValues is in fact one table as I initially proposed for simplicity's sake. If for example you had to join many tables to produce IdValues, those subqueries would make the difference of 1 second and hours (I am trying it on such a case and it seems never ending).

Nonetheless, having a comma-separated list at the end is a lot less cool than what you propose.

Thanks
Go to Top of Page

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2008-01-11 : 01:18:18
and if you use SQL Server 2005, then from jhocutt's example


select id,sum(value) as value from
(
select row_number() over(partition by id order by value desc) as number,* from #IdValues
) as t
where number<=5
group by id

Also see what you can do with ROW_NUMBER() function
http://sqlblogcasts.com/blogs/madhivanan/archive/2007/08/27/multipurpose-row-number-function.aspx

Madhivanan

Failing to plan is Planning to fail
Go to Top of Page
   

- Advertisement -