| 
                
Please start any new threads on our new 
    site at https://forums.sqlteam.com.  We've got lots of great SQL Server
    experts to answer whatever question you can come up with. 
    
        | 
                
                    | 
                            
                                | Author | Topic |  
                                    | rme8494Yak Posting Veteran
 
 
                                        98 Posts | 
                                            
                                            |  Posted - 2002-11-07 : 21:21:40 
 |  
                                            | Hi Everyone,  Here is the situation, I have two tables, a report and reportcat table.ReportCat TableCat_IDCatNamePostdateReport TableReport_IDCat_IDFileLocationPostDateI'm doing an inner join on the Cat_ID on a query to output them on a web page.  This query works great to out put ALL the reports in the database, however I would like to do something different.  I would like the user to be able select how many reports they would like to see output in each category.  So say I have 20 reports total, and 4 categories.  Each category has 5 reports in it.  How do I only return only 2 reports from each category?  If I do a TOP 8 in the query it wouldn't give me 2 from each category just 8 random records.  Make sense?  Any ideas?RyanRyan EverhartSBC Ameritech |  |  
                                    | simondeutschAged Yak Warrior
 
 
                                    547 Posts | 
                                        
                                          |  Posted - 2002-11-07 : 21:44:27 
 |  
                                          | Could definitely do it with UNION.  But there should be a better way.Otherwise:SELECT TOP n * FROM ReportCat INNER JOIN Report ON CAt_Id = cat_id where catname = 'a'UNIONSELECT TOP n * FROM ReportCat INNER JOIN Report ON CAt_Id = cat_id where catname = 'b'etc.Sarah Berger MCSD |  
                                          |  |  |  
                                    | rme8494Yak Posting Veteran
 
 
                                    98 Posts | 
                                        
                                          |  Posted - 2002-11-07 : 22:07:57 
 |  
                                          | Sarah,  Thanks for your reply.  I am looking for something in single query though.  I do think the union is feasable as the number of categories will change.  Any other ideas?ThanksRyanRyan EverhartSBC Ameritech |  
                                          |  |  |  
                                    | oitsubobYak Posting Veteran
 
 
                                    70 Posts | 
                                        
                                          |  Posted - 2002-11-07 : 22:20:08 
 |  
                                          | Hi Everyone...Ryan and I work together at SBC (a thousand miles apart) and have been bouncing these ideas off one another.  Essentially, we're both trying to accomplish the same basic goal but in slightly different applications.Just to give you an idea of what I'm trying to do, here's the text of a post I submitted to a Cold Fusion user group on Macromedia.com:Begin Post=============What I'm trying to do is build a single query that will join an "employees" table and "department" table and return the top X senior employees from each department where X is a value in the limit column of the "department" table.For the sake of this example, let's say the "employee" table has 100 records and 4 columns: First Name, Last Name, Department ID and Hire Date while the "department" table has 5 records 3 columns: Department ID, Department Name and Limit. To make it simple, lets also just assume that the value of the limit column is '2' employee's for each department. When all's said and done, the result set should contain 10 records (2 people for each of the 5 departments).Note: In actuality, we have tens of thousands of employees and several hundred departments. For that reason, it's not feasible or good practice to write a query for each individual department.In all honesty, I don't even know where to start on this. I'm hoping that someone in the community has come across a similar situation and can provide me with some much needed direction.=============End PostIf anyone has any suggestions, both Ryan and I would appreciate it.Thanks in Advance,Bob WestonSBC Pacific Bell |  
                                          |  |  |  
                                    | simondeutschAged Yak Warrior
 
 
                                    547 Posts | 
                                        
                                          |  Posted - 2002-11-07 : 22:34:47 
 |  
                                          | Check this out:http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=21334see Ilya Zaltsman's post.  The situation seems similar to yours, although you will have to change the hardcoded TOP operator, maybe by using dynamic SQL, because otherwise I'm not sure how you can select TOP n of an unknown number.  Dynamic sql will slow you down somewhat, though.Sarah Berger MCSD |  
                                          |  |  |  
                                    | hoagmStarting Member
 
 
                                    1 Post | 
                                        
                                          |  Posted - 2002-11-07 : 23:45:20 
 |  
                                          | This should work.  I'm sure it's longer than you were looking for but it will take into consideration table changes to the Report table as well as the variable number of categories.  I've commented liberally... let me know if you have questions.--Create #temp_reports then clear it, this makes sure our temp table is always mirrored after our real tableSELECT TOP 1 *  INTO #temp_reports  FROM ReportDELETE #temp_reports--variable to hold our current category IDDECLARE @Cat_ID Int--cursor variable so we can step through each cat_id and perform actionsDECLARE Category_Cursor CURSOR FOR  SELECT Cat_ID FROM ReportCat--open the cursor and prime itOPEN Category_CursorFETCH NEXT FROM Category_Cursor  INTO @Cat_ID--as long as we've not reached the end of the cat_id's, we loopWHILE @@FETCH_STATUS = 0  BEGIN--insert top n reports from cat_id into temp    INSERT #temp_reports      SELECT TOP n *        FROM Report        WHERE Cat_ID = @Cat_ID--fetch the next cat_id    FETCH NEXT FROM Category_Cursor      INTO @Cat_ID  END--close your cursor and clear it from memoryCLOSE Category_CursorDEALLOCATE Category_Cursor--select your resultsSELECT * FROM #temp_reportsEdited by - hoagm on 11/07/2002  23:49:44 |  
                                          |  |  |  
                                    | r937Posting Yak  Master
 
 
                                    112 Posts | 
                                        
                                          |  Posted - 2002-11-08 : 00:17:08 
 |  
                                          | caution: not tested select D.DepartmentName     , E.FirstName     , E.LastName     , E.HireDate    from department Dinner   join employee E     on D.ID = E.DepartmentID where D.Limit >        ( select count(*)            from employee           where DepartmentID = E.DepartmentID             and HireDate > E.HireDate )  order     by D.DepartmentName     , E.HireDate  descendingthe limit will be exceeded in departments where there are ties across the last placerudy |  
                                          |  |  |  
                                    | mohamedyousuff@yahoo.comStarting Member
 
 
                                    24 Posts | 
                                        
                                          |  Posted - 2002-11-08 : 01:11:28 
 |  
                                          | Try this queryselect * from reportcat rc inner join report r on rc.cat_id = r.cat_id where report_id in ( select top 2 report_id from report where cat_id = rc.cat_id )     The solution solves it in a single query as you expected.  As u see above, it does not involve any temporary table but it is a correlated query.  I have tested it and is working fine.  I am thinking of alternatives without correlated query (performance reasons) and without temporary table and with a single query. quote:Hi Everyone,  Here is the situation, I have two tables, a report and reportcat table.ReportCat TableCat_IDCatNamePostdateReport TableReport_IDCat_IDFileLocationPostDateI'm doing an inner join on the Cat_ID on a query to output them on a web page.  This query works great to out put ALL the reports in the database, however I would like to do something different.  I would like the user to be able select how many reports they would like to see output in each category.  So say I have 20 reports total, and 4 categories.  Each category has 5 reports in it.  How do I only return only 2 reports from each category?  If I do a TOP 8 in the query it wouldn't give me 2 from each category just 8 random records.  Make sense?  Any ideas?RyanRyan EverhartSBC Ameritech
 
 |  
                                          |  |  |  
                                    | Page47Master Smack Fu Yak Hacker
 
 
                                    2878 Posts | 
                                        
                                          |  Posted - 2002-11-08 : 07:49:50 
 |  
                                          | This question of top N of a group has come up a few times lately and I have spent some time looking at the various suggestions.  I will outline what I have found here.First some sample data to work with ... create table #temp (    someid int,    somevc varchar(10) )insert into #tempselect    n2.n,    left(        char(round((rand(convert(binary(4),newid())) * 25),0)+97) +        char(round((rand(convert(binary(4),newid())) * 25),0)+97) +        char(round((rand(convert(binary(4),newid())) * 25),0)+97) +        char(round((rand(convert(binary(4),newid())) * 25),0)+97) +        char(round((rand(convert(binary(4),newid())) * 25),0)+97) +        char(round((rand(convert(binary(4),newid())) * 25),0)+97) +        char(round((rand(convert(binary(4),newid())) * 25),0)+97) +        char(round((rand(convert(binary(4),newid())) * 25),0)+97) +        char(round((rand(convert(binary(4),newid())) * 25),0)+97) +        char(round((rand(convert(binary(4),newid())) * 25),0)+97),    floor(rand(convert(binary(4),newid())) * 10)+1)from    (select n from toolbox.dbo.numbers where n <= 500) n1    cross join toolbox.dbo.numbers n2where    n2.n < 50This is a two column table.  The first column is an int which will represent the groupings and the seconds is a varchar which is the values we want to TOP.  The sample data consists of 500 values in each group and 50 groups.  (toolbox.dbo.numbers is simply a tally table).The first solution is one I posted a ways back.  I don't remember where it came from, but I do remember it was something setbasedisthetruepath and I were working one. select    someid,    somevcfrom    #temp twhere    exists (        select 1        from            #temp        where            someid = t.someid and            somevc = t.somevc        group by            someid        having (            select                count(*)            from                #temp            where                someid = t.someid and                somevc < t.somevc ) < 3 )order by someid, somevcThis query will return the top three values for each group and duplicates will be reported.  Meaning, if there are two rows of 1,a then both will be returned.  This query presents a subtree cost of 11.5 (very poor) and runs on my laptop with the above data set in around 3.6 seconds.The next is derived from Ilya and the post mentioned above. select    someid,    somevcfrom    #temp twhere    somevc in (        select top 3            somevc        from            #temp        where            t.someid = someid        order by somevc )order by someid, somevcThis query returns the exact same rowset as the previous method.  Now,  while this query looks much simpler, it presents a subtree cost of 467 and takes around 19.6 seconds under the same conditions as above.The final method comes from Arnold Fribble from a post I can't seem to find at the moment.A post dealing with bottom 2 of a group.select    someid,    somevcfrom (    select        someid,        case            when col = 0 then somevc0            when col = 1 then somevc1            else somevc2        end somevc    from (        select            t0.someid,            min(t0.somevc) as somevc0,            min(t1.somevc) as somevc1,            min(t2.somevc) as somevc2        from (            select                someid,                min(somevc) as somevc            from                #temp            group by                someid ) as t0            left join #temp t1                on t0.someid = t1.someid and                   t0.somevc < t1.somevc            left join #temp t2                on t1.someid = t2.someid and                   t1.somevc < t2.somevc        group by            t0.someid ) as a    cross join (        select 0 as col        union all        select 1        union all        select 2) as n ) as awhere    somevc is not nullorder by someid, somevcNow Frib's piece here is a bit different than the two above in that duplicates are not shown.  So if there are two rows in the sample data of 1,a only one of them will appear in the final rowset.  While this query gives more reasonable 16.6 subtree cost, it took about 38.4 seconds to run.So from that, it would appear that my original method is the way to go.  Now, that being said, with a smaller data set such as .. select 1,'a'union select 1,'b'union select 1,'c'union select 1,'d'union select 1,'f'union select 1,'g'union select 1,'h'union select 1,'i'union select 2,'a'union select 2,'b'union select 2,'c'union select 2,'d'union select 2,'e'union select 2,'f'union select 2,'g'union select 2,'h'union select 2,'i'union select 2,'j'..all three perform resonably well.Jay White{0}Edited by - Page47 on 11/08/2002  07:57:27 |  
                                          |  |  |  
                                    | r937Posting Yak  Master
 
 
                                    112 Posts | 
                                        
                                          |  Posted - 2002-11-08 : 09:26:18 
 |  
                                          | jay, which, if any, of your methods will return the top X of each department, where the department itself specifies how many X to list?not to be obstinate, but i believe that was the original problemrudy |  
                                          |  |  |  
                                    | Page47Master Smack Fu Yak Hacker
 
 
                                    2878 Posts | 
                                        
                                          |  Posted - 2002-11-08 : 09:44:57 
 |  
                                          | quote:Rudy, I was looking at the approach rather than the exact solution.   Certainly, you can understand the value of discussing different approaches that might be adapted vs. giving cut and paste code on a forum such as this.The determining TOP X is just a variation on the theme, right?  If you can do TOP 3, you can do TOP X.  You do a good job of adapting the second approach (Ilya's) I present to this specific problem.  For my first approach (the one that I think performs best), it is just as easy to adapt by coding the having clause to be less than a column rather than '3'.  Now Frib's approach is a bit more X specific but I think you can do the same thing by limiting the cross joined table 'a' ... I'd have to think about that a bit more.Jay White{0}not to be obstinate, but i believe that was the original problem
 
 |  
                                          |  |  |  
                                    | Arnold FribbleYak-finder General
 
 
                                    1961 Posts | 
                                        
                                          |  Posted - 2002-11-08 : 11:54:19 
 |  
                                          | Sorry Jay, you've flattened that rather too much.It should look like this: SELECT someid, somevcFROM (    SELECT someid,      CASE col WHEN 0 THEN somevc0 WHEN 1 THEN somevc1 ELSE somevc2 END somevc    FROM (        SELECT T1.someid, MIN(somevc0) AS somevc0,          MIN(T1.somevc1) AS somevc1, MIN(somevc) AS somevc2        FROM (            SELECT T0.someid, MIN(somevc0) AS somevc0, MIN(somevc) somevc1            FROM (                SELECT someid, MIN(somevc) AS somevc0                FROM xtemp                GROUP BY someid              ) AS T0            LEFT JOIN xtemp AS T1 ON T0.someid = T1.someid AND T0.somevc0 < T1.somevc            GROUP BY T0.someid          ) AS T1        LEFT JOIN xtemp AS T2 ON T1.someid = T2.someid AND T1.somevc1 < T2.somevc        GROUP BY T1.someid      ) T2    CROSS JOIN (SELECT 0 AS col UNION ALL SELECT 1 UNION ALL SELECT 2) n  ) aWHERE somevc IS NOT NULLORDER BY someid, somevcBut I agree that you have to be really sure that you want 3 values to use this!Edited by - Arnold Fribble on 11/08/2002  12:08:22 |  
                                          |  |  |  
                                    | ConfusedOfLifeStarting Member
 
 
                                    35 Posts | 
                                        
                                          |  Posted - 2002-11-08 : 11:58:27 
 |  
                                          | Suppose this is our table : someid      somevc     ----------- ---------- 1           Oopse1           Oopse1           Oopse1           Oopse1           Oopse2           Wow!You said that dublicates will be reported, so, I tried your code ( the first one and the second ) on this table and for the first row ( 1,Oopse) it returns all of them ( 5 rows ), well, it should, but what if we wanted to have only ( and only! ) 3 rows for each group in the results? ( i.e. I do not get 5 rows for this oopse! ) |  
                                          |  |  |  
                                    | Page47Master Smack Fu Yak Hacker
 
 
                                    2878 Posts | 
                                        
                                          |  Posted - 2002-11-08 : 12:02:50 
 |  
                                          | Holy crap Frib, that is scary fast!!!  I'm glad you chimed in ... quote:Then use Fribbles method or modify one of the others so it meets your needs ...Jay White{0}Edited by - Page47 on 11/08/2002  12:08:26You said that dublicates will be reported, so, I tried your code ( the first one and the second ) on this table and for the first row ( 1,Oopse) it returns all of them ( 5 rows ), well, it should, but what if we wanted to have only ( and only! ) 3 rows for each group in the results? ( i.e. I do not get 5 rows for this oopse! )
 
 |  
                                          |  |  |  
                                    | Arnold FribbleYak-finder General
 
 
                                    1961 Posts | 
                                        
                                          |  Posted - 2002-11-08 : 12:11:51 
 |  
                                          | Sigh. I had to modify it again: it was selecting the wrong someid values -- doesn't matter unless the data doesn't have enough to fill all 3... |  
                                          |  |  |  
                                    | MuffinManPosting Yak  Master
 
 
                                    107 Posts | 
                                        
                                          |  Posted - 2002-11-08 : 12:25:34 
 |  
                                          | Ok, how about this? It reports ties , but might not be completely true to the TOP X criteria (see the output below for the TOP 3 from each grouping).set nocount oncreate table #sale (Company varchar(20) , Division varchar(20) , sale int)insert #sale select 'Allstar Shipping' , 'Marketing' , 9000insert #sale select 'Allstar Shipping' , 'IT' , 12000insert #sale select 'Allstar Shipping' , 'Operations' , 9000insert #sale select 'Allstar Shipping' , 'Engineering' , 9000insert #sale select 'Allstar Shipping' , 'Sales' , 7000insert #sale select 'Bombay Allied' , 'North' , 140000insert #sale select 'Bombay Allied' , 'South' , 125000insert #sale select 'Bombay Allied' , 'East' , 86000insert #sale select 'Bombay Allied' , 'West' , 168000insert #sale select 'Cat Walkers, Ltd.' , 'Persian' , 18000insert #sale select 'Cat Walkers, Ltd.' , 'Domestic' , 15000insert #sale select 'Cat Walkers, Ltd.' , 'Fluffy' , 22500declare @RecsPerGroup int , @msg varchar(100)select @RecsPerGroup = 2select @msg = '** Show TOP ' + convert(varchar(10), @RecsPerGroup) + ' records per grouping **'print @msgselect * from (	select top 100 percent * 		, Record = (				select count(*) + 1 as Cnt 				from #sale s2 				where s2.Company = s.Company and s2.sale > s.sale				)	from #sale s	order by Company,  Sale Desc , Division) awhere  record <= @RecsPerGroupprint ''select @RecsPerGroup = 3select @msg = '** Show TOP ' + convert(varchar(10), @RecsPerGroup) + ' records per grouping **'print @msgselect * from (	select top 100 percent * 		, Record = (				select count(*) + 1 as Cnt 				from #sale s2 				where s2.Company = s.Company and s2.sale > s.sale				)	from #sale s	order by Company,  Sale Desc , Division) awhere  record <= @RecsPerGroupprint ''select @RecsPerGroup = 3select @msg = '** Show TOP ' + convert(varchar(10), @RecsPerGroup) + ' records per grouping USING DISTINCT **'print @msgselect * from (	select top 100 percent * 		, Record = (				select count(Distinct Sale) + 1 as Cnt 				from #sale s2 				where s2.Company = s.Company and s2.sale > s.sale				)	from #sale s	order by Company,  Sale Desc , Division) awhere  record <= @RecsPerGroupdrop table #sale/**** Here are the results ***/** Show TOP 2 records per grouping **Company              Division             sale        Record      -------------------- -------------------- ----------- ----------- Allstar Shipping     IT                   12000       1Allstar Shipping     Engineering          9000        2Allstar Shipping     Marketing            9000        2Allstar Shipping     Operations           9000        2Bombay Allied        West                 168000      1Bombay Allied        North                140000      2Cat Walkers, Ltd.    Fluffy               22500       1Cat Walkers, Ltd.    Persian              18000       2 ** Show TOP 3 records per grouping **Company              Division             sale        Record      -------------------- -------------------- ----------- ----------- Allstar Shipping     IT                   12000       1Allstar Shipping     Engineering          9000        2   ==Allstar Shipping     Marketing            9000        2   ==> Are We getting the top 3 here?Allstar Shipping     Operations           9000        2   ==Bombay Allied        West                 168000      1Bombay Allied        North                140000      2Bombay Allied        South                125000      3Cat Walkers, Ltd.    Fluffy               22500       1Cat Walkers, Ltd.    Persian              18000       2Cat Walkers, Ltd.    Domestic             15000       3** Show TOP 3 records per grouping USING DISTINCT **Company              Division             sale        Record      -------------------- -------------------- ----------- ----------- Allstar Shipping     IT                   12000       1Allstar Shipping     Engineering          9000        2Allstar Shipping     Marketing            9000        2Allstar Shipping     Operations           9000        2Allstar Shipping     Sales                7000        3  == Now we get 3 distinct values!Bombay Allied        West                 168000      1Bombay Allied        North                140000      2Bombay Allied        South                125000      3Cat Walkers, Ltd.    Fluffy               22500       1Cat Walkers, Ltd.    Persian              18000       2Cat Walkers, Ltd.    Domestic             15000       3Edited by - muffinman on 11/08/2002  12:40:11 |  
                                          |  |  |  
                                    | Page47Master Smack Fu Yak Hacker
 
 
                                    2878 Posts | 
                                        
                                          |  Posted - 2002-11-08 : 12:40:23 
 |  
                                          | Nice work Muffin ... select    someid,    somevcfrom (    select top 100 percent        *,        (select            count(*) + 1 as Cnt         from            #temp t2         where            t2.someid = t.someid and            t2.somevc < t.somevc ) as Record    from        #temp t    order by        someid,        somevc desc ) as awhere    record <= 3order by    someid,    somevc... a nice 6.16 subtree cost and running just over 3 seconds ...But Frib's re-write is still the champ (by a huge margin) running in 300ms and only a 1.06 cost.Jay White{0} |  
                                          |  |  |  
                                    | JustinBigelowSQL Gigolo
 
 
                                    1157 Posts | 
                                        
                                          |  Posted - 2002-11-08 : 14:11:55 
 |  
                                          | quote:Which just goes to show you, always be careful when you quibble with Fribble.But Frib's re-write is still the champ (by a huge margin) running in 300ms and only a 1.06 cost.
 
  JustinHave you hugged your SQL Server today? |  
                                          |  |  |  
                                    | LavosPosting Yak  Master
 
 
                                    200 Posts | 
                                        
                                          |  Posted - 2002-11-09 : 18:57:12 
 |  
                                          | Of course, there's always another solution.  I hesitate to give it, because it really depends on How the tables are used.  Denormalize the table to include your ranking information, or create a rankings table. If rankings are calculated only periodically, and the listing generated often, then that would be the fastest solution over time.I'm finding this very interesting so far.  I was working on my own solution for the general case, but I stopped when I couldn't wrap my head around Fribble's query :)Oh yeah, I did have a procedure that dynamically wrote a lot of UNION ALL queries which was very fast, but it quickly exceeds the varchar(8000) datatype when there are a lot of categories/departments.----------------------"O Theos mou! Echo ten labrida en te mou kephale!" |  
                                          |  |  |  
                                    | r937Posting Yak  Master
 
 
                                    112 Posts | 
                                        
                                          |  Posted - 2002-11-09 : 19:40:55 
 |  
                                          | fribble: "But I agree that you have to be really sure that you want 3 values to use this!"i hate to keep harping on this, and i'd love to hear what the SBC Ameritech guys have to say about this, but each department was to list a different number of top rows -- department A will want the top 2 listed, department B will want the top 6 listed, and so onany luck adapting your 1st query yet, jay?rudy |  
                                          |  |  |  
                                    | LavosPosting Yak  Master
 
 
                                    200 Posts | 
                                        
                                          |  Posted - 2002-11-09 : 23:19:52 
 |  
                                          | Well Rudy, I think Jay said that he was more interested in attacking the general approach, opposed to coming up with a complete solution.This query would work, but I don't know how it performs, but if updates to the Employee and Department tables are rare and the query is ran often, then the denormalized approach will still outperform it overall. SELECT E.*FROM Departments D INNER JOIN (SELECT A.EmpID, A.DepID, COUNT(*) AS Rank  FROM Employees A INNER JOIN      Employees B ON (A.DepID = B.DepID AND A.EmpID <= B.EmpID)  GROUP BY A.EmpID, A.DepID ) ER ON (ER.DepID = D.DepID AND ER.Rank <= D.RankLimit) INNER JOIN Employees E ( ON ER.EmpID = E.EmpID)----------------------"O Theos mou! Echo ten labrida en te mou kephale!" |  
                                          |  |  |  
                                | Next Page |  |  |  |  |