Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Need TOP N random unique records from one table

Author  Topic 

russ668
Starting Member

3 Posts

Posted - 2010-02-03 : 12:40:28
I'm working on a contest project where I need to send the number of winners requested, start date and end date (date range) to a stored procedure and have it return unique random records. The table can contain multiple records for the same employee, in other words they can have multiple entries for the date range. Those should be included in the random process.

This is what I have so far in my stored procedure:

if @action='cmdGetRandomWinners'
BEGIN
SELECT TOP (@N) empid, entryid, contestid, teamid, submitdate, empName, teamname, answer, question
FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY empid
ORDER BY NEWID()) AS RN
FROM contestEligible) AS InnerQ
WHERE InnerQ.RN = (@N) GROUP BY empid, entryid, contestid, teamid, submitdate, empName, teamname, answer, question
END


This is working in that I get unique records returned to me. If there are only two employees in the date range and I request 3 winners, it'll only return 2 which is what I want. The problem I'm having is that this query will always return the record with the lowest empid first. So if empid 274 has one entry and empid 366 has 7 entries, empid 274 wins each time which doesn't make sense. I am at a loss on how to fix this behavior. I'm assuming it's because I'm grouping.

I really need "N" records returned for the date range without duplicating empid's. Can anyone help?

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2010-02-03 : 12:45:32
quote:
Originally posted by russ668

I'm working on a contest project where I need to send the number of winners requested, start date and end date (date range) to a stored procedure and have it return unique random records. The table can contain multiple records for the same employee, in other words they can have multiple entries for the date range. Those should be included in the random process.

This is what I have so far in my stored procedure:

if @action='cmdGetRandomWinners'
BEGIN
SELECT TOP (@N) empid, entryid, contestid, teamid, submitdate, empName, teamname, answer, question
FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY empid
ORDER BY NEWID()) AS RN
FROM contestEligible) AS InnerQ
WHERE InnerQ.RN = 1 GROUP BY empid, entryid, contestid, teamid, submitdate, empName, teamname, answer, question
END


This is working in that I get unique records returned to me. If there are only two employees in the date range and I request 3 winners, it'll only return 2 which is what I want. The problem I'm having is that this query will always return the record with the lowest empid first. So if empid 274 has one entry and empid 366 has 7 entries, empid 274 wins each time which doesn't make sense. I am at a loss on how to fix this behavior. I'm assuming it's because I'm grouping.

I really need "N" records returned for the date range without duplicating empid's. Can anyone help?



modify like above and see
Go to Top of Page

russ668
Starting Member

3 Posts

Posted - 2010-02-03 : 13:02:21
Wow..thanks! I did make that change but empid 274 is still winning every time. Is it the grouping that is giving me the unique empid's? I'm wondering if the group by clause is sorting the records by empid and it's just picking the first record instead of a true random one?
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2010-02-03 : 13:05:11
quote:
Originally posted by russ668

Wow..thanks! I did make that change and now it's giving me a new entryid but empid 274 is still winning every time. So progress was made, but desired change still not achieved of getting a true random pull. Is it the grouping that is giving me the unique empid's?


i cant understand reason for group by. wont this be enough?


if @action='cmdGetRandomWinners'
BEGIN
SELECT TOP (@N) empid, entryid, contestid, teamid, submitdate, empName, teamname, answer, question
FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY empid
ORDER BY NEWID()) AS RN
FROM contestEligible) AS InnerQ
WHERE InnerQ.RN = 1
ORDER BY NEWID()
END
Go to Top of Page

russ668
Starting Member

3 Posts

Posted - 2010-02-03 : 13:18:59
That seemed to have fixed it! Here's the new stored procedure:

if @action='cmdGetRandomWinners'
BEGIN
SELECT TOP (@N) empid, entryid, contestid, teamid, submitdate, empName, teamname, answer, question
FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY empid
ORDER BY NEWID()) AS RN
FROM contestEligible) AS InnerQ
WHERE InnerQ.RN = 1 ORDER BY NEWID()
END


I still need to test the heck out of it, but this is a huge improvement! Thanks again for your quick reply!
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2010-02-03 : 13:22:52
welcome
Go to Top of Page
   

- Advertisement -