Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 Complex RANK() problem

Author  Topic 

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2008-11-26 : 07:39:45
Hi,

I have a problem with RANK that I was hoping maybe one of you geniouses could help me solve...and the problem is this:

I have a table with Users and Points and I must make a query to sum up all the points for each user and rank them acording to who has most points and then display a list of top 3. Pretty simple. However, a UserID is provided and I need to include this user in the result regardless of his position. If he's not in top 3 then I need to get his position and return it as a 4th row in the resultset. This is also somewhat straight forward, but the problem comes when several users have the same amount of points. The following code illustrates my problem:
DECLARE @UserID int

SET @UserID = 6

DECLARE @table table (
UserID int,
Points int
)

INSERT INTO @table
SELECT 1, 200 UNION ALL
SELECT 2, 200 UNION ALL
SELECT 3, 400 UNION ALL
SELECT 4, 500 UNION ALL
SELECT 5, 300 UNION ALL
SELECT 6, 100

SELECT * FROM @table

SELECT TOP 4 Position, UserID, Points
FROM (
SELECT
Position = RANK() OVER (ORDER BY SUM(Points) DESC),
UserID,
Points = SUM(Points)
FROM @table
GROUP BY UserID) AS a
WHERE Position <= 3 OR UserID = @UserID
ORDER BY Position
If you run the code it will return the correct result with the position and points of user number 6 in the 4th row as expected. However, if you change the points for UserID 3 from 400 to 200 then the data for user number 6 will be pushed out of the resultset.

Is there any way I can get this query to always include the data for user number 6 but never more than 4 rows of data??

- Lumbago

ECS_Dale
Starting Member

5 Posts

Posted - 2008-11-26 : 07:48:33
I will admit that I haven't read your post in any great detail but I think you may be over-complicating the issue.

How about selecting the top 3 scores into a dataset and then performing a union with the user's score? If they appear in the top 3 the union will not add their score as they already appear but otherwise you will end up with 4 rows of data.
Go to Top of Page

PeterNeo
Constraint Violating Yak Guru

357 Posts

Posted - 2008-11-26 : 07:54:51
try this
SELECT TOP 4 Position, UserID, Points
FROM (
SELECT
ROW = ROW_NUMBER() OVER (ORDER BY SUM(Points) DESC),
Position = RANK() OVER (ORDER BY SUM(Points) DESC),
UserID,
Points = SUM(Points)
FROM @table
GROUP BY UserID) AS a
WHERE (ROW <= 3 OR UserID = @UserID)
ORDER BY Position


"There is only one difference between a dream and an aim. A dream requires soundless sleep to see, whereas an aim requires sleepless efforts to achieve..!!"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-11-26 : 08:04:29
Nice approach PeterNeo, but change sample data for userID 6 from 100 to 1000.

Here is a quick and dirty approach
DECLARE @UserID int, @top INT

SELECT @UserID = 6, @top = 4

DECLARE @table table (
UserID int,
Points int
)

INSERT INTO @table
SELECT 1, 200 UNION ALL
SELECT 2, 200 UNION ALL
SELECT 3, 400 UNION ALL
SELECT 4, 500 UNION ALL
SELECT 5, 300 UNION ALL
SELECT 6, 100

SELECT * FROM @table

-- Prepare staging
DECLARE @Stage TABLE
(
rowID INT IDENTITY(1, 1),
userID INT,
Points INT,
rnk INT
)

INSERT @Stage (userID, Points, rnk)
SELECT userID,
SUM(Points) AS Points,
RANK() OVER (ORDER BY SUM(Points) DESC) AS rnk
FROM @Table
GROUP BY userID
ORDER BY SUM(Points) DESC

DECLARE @rowID INT

SELECT @rowID = rowID
FROM @stage
where userID = @userID

IF @rowID > @top
SELECT TOP (@top - 1) userid, points, rnk from @stage union all
SELECT userid, points, rnk from @stage where rowID = @rowid
order by rnk
else
SELECT TOP (@top ) userid, points, rnk from @stage order by rnk



E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-11-26 : 08:10:56
Maybe a better approach than my previous suggestion
SELECT		TOP (@top)
userID,
Points,
rnk
FROM (
SELECT userID,
SUM(Points) AS Points,
RANK() OVER (ORDER BY SUM(Points) DESC) AS rnk,
ROW_NUMBER() OVER (ORDER BY SUM(Points) DESC) AS src
FROM @Table
GROUP BY userID
) AS d
WHERE src <= @top
OR userid = @userid
ORDER BY CASE
WHEN userid = @userid AND src > @top THEN @top
ELSE src
END,
NULLIF(userid, @userid)



E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

PeterNeo
Constraint Violating Yak Guru

357 Posts

Posted - 2008-11-26 : 08:18:50
it works fine

DECLARE @UserID int

SET @UserID = 6

DECLARE @table table (
UserID int,
Points int
)

INSERT INTO @table
SELECT 1, 200 UNION ALL
SELECT 2, 200 UNION ALL
SELECT 3, 200 UNION ALL
SELECT 4, 500 UNION ALL
SELECT 5, 300 UNION ALL
SELECT 6, 1000

SELECT * FROM @table

SELECT TOP 4 Position, UserID, Points
FROM (
SELECT
ROW = ROW_NUMBER() OVER (ORDER BY SUM(Points) DESC),
Position = RANK() OVER (ORDER BY SUM(Points) DESC),
UserID,
Points = SUM(Points)
FROM @table
GROUP BY UserID) AS a
WHERE (ROW <= 3 OR UserID = @UserID)
ORDER BY Position

Position UserID Points
1 6 1000
2 4 500
3 5 300




"There is only one difference between a dream and an aim. A dream requires soundless sleep to see, whereas an aim requires sleepless efforts to achieve..!!"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-11-26 : 08:20:34
Where is the fourth record?

or.. oh... never mind. op writes "never more than 4 records".
I thought he meant "4 records always"...



E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-11-26 : 08:24:36
Changing to ROW <= 4 will not return user 6 for the orginal sample data
INSERT INTO @table
SELECT 1, 200 UNION ALL
SELECT 2, 200 UNION ALL
SELECT 3, 400 UNION ALL
SELECT 4, 500 UNION ALL
SELECT 5, 300 UNION ALL
SELECT 6, 100



E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2008-11-26 : 08:33:51
Thanx alot guys, great feedback in such a short time! PeterNeos solution with Row <= 3 does comply with the requirements however the guy that ordered this report actually wants 4 rows regardless og the selected users position, so Peso's solution might be the one I'm ending up with.

I would however appreciate if you could explain this ORDER BY Peso...I don't think I under stand this ORDER BY...@Top???

- Lumbago
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-11-26 : 08:42:18
What I do, is that I change the rownumber to last place, if current rownumber for the wanted user is greater than maximum wanted rows. second row is there for if now two users have same rownumber; one real and eventually the wanted userid. If this is the case, i want the wanted userid instead of the real user who has rownumber @top.
nullif(userid, @userid) returns null if userid equals @userid. NULLs are sorted before other integers, such as userid which is not equal to @userid.



E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2008-11-26 : 10:00:02
Hm...I might be on the slow side here but I've been looking at this ORDER BY CASE statement for 30 mins now and I still don't get it. And the part I don't get is the "THEN @Top". As far as I know @Top is an integer and ORDER BY @Top doesn't work so I'm seriously confused...

- Lumbago
Go to Top of Page

harsh_athalye
Master Smack Fu Yak Hacker

5581 Posts

Posted - 2008-11-26 : 10:20:00
Seems like variables in ORDER BY are not allowed in 2005, but when included CASE expression, it is valid.

However, in 2000 both are equally rejected.

Harsh Athalye
India.
"The IMPOSSIBLE is often UNTRIED"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-11-26 : 10:31:58
Ok. Let's break down the query.
The derived table d returns this resultset
userID	Points	rnk	src
4 500 1 1
3 400 2 2
5 300 3 3
1 200 4 4
2 200 4 5
6 100 6 6
Here you can see that src for userID 6 is 6, which is greater than the first 4 records we want.
How to solve this? We will not have userID 6 on top of list right? No, we want userID at last position!
How to do that? By replacing the actual src value with a "virtual" value of @top, which is the last src we want because of the SELECT TOP (@top).

Now let's see the resultset after replacing the src to a "virtual" one.
userID	Points	"virtual src"
4 500 1
3 400 2
5 300 3
1 200 4
6 100 4
2 200 5
Now we can see we have two users at same position with number 4, which is @top, which is the last position.
Now how to get the userid 6 before userid 1? We can still ORDER BY the "virtal src" column and if two users now have same "virtual src" (only possible if userid 6 is out of range for top 4), i simply replace the userid with NULL, because NULL are sorted first before integer values when sorting ascending.

NULLIF(userID, @userID) takes care of that. For userid 6 this expression returns NULL, for userid 1, the expression returns 1.

NULL
1

and the final resultset looks like
userID	Points	rnk
4 500 1
3 400 2
5 300 3
6 100 6



E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-11-26 : 10:38:12
Previous example happens when userID 6 falls out of the normal first 4 records.
What if userID 6 already has a top 4 position?

See this sample data
INSERT INTO @table
SELECT 1, 200 UNION ALL
SELECT 2, 200 UNION ALL
SELECT 3, 400 UNION ALL
SELECT 4, 500 UNION ALL
SELECT 5, 300 UNION ALL
SELECT 6, 450
Now the derived table d returns this resultset
userID	Points	rnk	src
4 500 1 1
6 450 2 2
3 400 3 3
5 300 4 4
1 200 5 5
2 200 5 6
All looks good. What happens when we apply the ORDER BY statements?
userID	Points	"virtual src"
4 500 1
6 450 2
3 400 3
5 300 4
All is good, no duplicates found.
What happens with the expression NULLIF(userID, @userID)? Nothing really...
userID	Points	"src"	"userid"
4 500 1 4
6 450 2 NULL
3 400 3 3
5 300 4 5
Since the query's first sort order is "virtual src" nothing happens, because there are no duplicate values! Adding a second expression as with NULLIF(userID, @userID) gains nothing in this case, but is essential when userid is out of "top 4" scope...



E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2008-11-26 : 10:51:19
Thank you very much for explaining Peter, I really appreciate it. What I had a hard time understanding was that @Top actually was "converted" to a representation of src, and sorted acordingly. When I saw @Top in the ORDER BY I sort of assumed that it would sort by the 4th column in the select which didn't make sense at all.

It all does now though, thank you

- Lumbago
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-11-26 : 10:55:00
Come to the south of Sweden (or Copenhagen) and we'll have a beer!



E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2008-11-27 : 03:17:34
I might have to take you up on that offer some time Peter! Mmmm beer...

- Lumbago
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-11-27 : 03:44:14
Looking forward to that!



E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page
   

- Advertisement -