SQL Server Forums
Profile | Register | Active Topics | Members | Search | Forum FAQ
 
Register Now and get your question answered!
Username:
Password:
Save Password
Forgot your Password?

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 How to flag top and bottom records?
 New Topic  Reply to Topic
 Printer Friendly
Author Previous Topic Topic Next Topic  

Umar Strong
Starting Member

8 Posts

Posted - 08/05/2013 :  13:37:44  Show Profile  Reply with Quote
I have a need to write an SQL that will flag top and bottom N% records; the value for N% is stored in a table.

For an example, my data is:

DATA
------
20.30
10.50
2.64
10.50
17.52
13.64
21.32
6.32
4.21
14.23


and the desired output needs to be:

DATA  FLAG
----  ----
2.64   1
4.21   1
6.32
10.5
10.5
13.64
14.23
17.52
20.3   1
21.32  1



Following are the DDL & DMLs for this dataset - The two tables are joined by the column MY_KEY.

Any help will be highly appreciated.

CREATE TABLE DBO.TEMP_PCT (MY_KEY INTEGER, PCT_NR INTEGER)
INSERT INTO DBO.TEMP_PCT VALUES (1, 20)

CREATE TABLE DBO.TEMP_DATA (MY_KEY INTEGER, DATA DECIMAL(5,2))
INSERT INTO DBO.TEMP_DATA VALUES (1, 20.3)
INSERT INTO DBO.TEMP_DATA VALUES (1, 10.5)
INSERT INTO DBO.TEMP_DATA VALUES (1, 2.64)
INSERT INTO DBO.TEMP_DATA VALUES (1, 10.5)
INSERT INTO DBO.TEMP_DATA VALUES (1, 17.52)
INSERT INTO DBO.TEMP_DATA VALUES (1, 13.64)
INSERT INTO DBO.TEMP_DATA VALUES (1, 21.32)
INSERT INTO DBO.TEMP_DATA VALUES (1, 6.32)
INSERT INTO DBO.TEMP_DATA VALUES (1, 4.21)
INSERT INTO DBO.TEMP_DATA VALUES (1, 14.23)


Thanks!

Edited by - Umar Strong on 08/05/2013 13:46:15

James K
Flowing Fount of Yak Knowledge

3557 Posts

Posted - 08/05/2013 :  13:50:19  Show Profile  Reply with Quote
Like this:
SELECT 
	DATA,
	CASE WHEN 100.*NASC/N <= PCT_NR OR 100.*NDESC/N <= PCT_NR THEN 1 ELSE NULL END AS FLAG
FROM
(
SELECT *,
	ROW_NUMBER() OVER (ORDER BY DATA ASC) AS NASC,
	ROW_NUMBER() OVER (ORDER BY DATA DESC) AS NDESC,
	COUNT(*) OVER() AS N
FROM
	DBO.TEMP_DATA  
)s 
CROSS JOIN DBO.TEMP_PCT p
can you have more than one row in the DBO.TEMP_PCT table?
Go to Top of Page

Umar Strong
Starting Member

8 Posts

Posted - 08/05/2013 :  13:59:27  Show Profile  Reply with Quote
Yes, it can have. I just extended the SQL you provided and am getting the result I wanted.

Below is the extended SQL.

Thanks a lot for your help !


SELECT
  DATA,
  CASE WHEN 100.*NASC/N <= PCT_NR OR 100.*NDESC/N <= PCT_NR THEN 1 ELSE NULL END AS FLAG
FROM
(
SELECT *,
  ROW_NUMBER() OVER (ORDER BY DATA ASC) AS NASC,
  ROW_NUMBER() OVER (ORDER BY DATA DESC) AS NDESC,
  COUNT(*) OVER() AS N
FROM
  DBO.TEMP_DATA
)s
INNER JOIN DBO.TEMP_PCT p
  ON p.MY_KEY = s.MY_KEY

Edited by - Umar Strong on 08/05/2013 14:38:52
Go to Top of Page

James K
Flowing Fount of Yak Knowledge

3557 Posts

Posted - 08/05/2013 :  14:07:10  Show Profile  Reply with Quote
Great, glad you were able to fix it.

If you want to check out some nifty stuff, check out the NTILE function, which does something similar but a bit more elegantly. http://msdn.microsoft.com/en-us/library/ms175126(v=sql.105).aspx The only thing to keep in mind is that it may tile them into slightly unequal groups if the result of the division is not an integer.
Go to Top of Page

Umar Strong
Starting Member

8 Posts

Posted - 08/05/2013 :  14:43:08  Show Profile  Reply with Quote
Hello James,

This SQL will become a view in my code and will require some filter clause on the table DBO.TEMP_DATA during run time.

So I am wondering if the desired result can be achieved without the sub-query?


Thanks!
Go to Top of Page

James K
Flowing Fount of Yak Knowledge

3557 Posts

Posted - 08/05/2013 :  15:19:01  Show Profile  Reply with Quote
Nothing that I am going to say below is actionable information - these are just thoughts:

If it were a readonly table, you could have pre-computed the flags and stored, in which case, you wouldn't even need a view.

Another thought is to have a computed column or persisted view, but computed columns don't work with row_number() functions, so that is a non-starter. If I am not mistaken, in a persisted view you cannot create a unique clustered index on a column that has a ranking function. So that is out as well.

You might be able to use some clever combination of TOP 20 PERCENT ... ORDER BY DATA ASC and TOP 20 PERCENT ... ORDER BY DATA DESC, but not clear to me what. Even then, the server will have to sort the data - which means it will need to read all the data.

An index on DATA column is a possibility that might help. If you have an index, it has statistics on the index, so SQL Server knows the distribution of the data; so theoretically that should help. But I haven't experimented with it to say whether or not it will.

To start off, I would keep the query as it as it is (or use a cte instead of a subquery, whichever you feel comfortable with) and make it into a view and see how the performance is.

The bottom line is, when you want to identify the top N percent based on a value, unless SQL Server has some statistical information to determine where to cut it off, it would have to examine every row of data.
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

India
52309 Posts

Posted - 08/06/2013 :  01:25:02  Show Profile  Reply with Quote
This SQL will become a view in my code and will require some filter clause on the table DBO.TEMP_DATA during run time.

if you want a dynamic filter why not make it a procedure then? or if you want to use this resultset in further join operations with other tables make it a table valued UDF.

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs
Go to Top of Page

Umar Strong
Starting Member

8 Posts

Posted - 08/06/2013 :  12:04:56  Show Profile  Reply with Quote
James and Visakh16 thanks for throwing ideas. I need to use the data coming out from this code, to be joined with another data.

So I started creating UDF than realized that dynamic SQLs are not supported there. Stored Procedure is not an option for me;

I am now left with creating view.

Thanks again!
Go to Top of Page
  Previous Topic Topic Next Topic  
 New Topic  Reply to Topic
 Printer Friendly
Jump To:
SQL Server Forums © 2000-2009 SQLTeam Publishing, LLC Go To Top Of Page
This page was generated in 0.1 seconds. Powered By: Snitz Forums 2000