Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 Practical Applications: NTILE. What is the point?

Author  Topic 

theboyholty
Posting Yak Master

226 Posts

Posted - 2011-04-01 : 05:37:12
I've been spending a bit of time recently trying to learn new code, and it has been very interesting. I've learnt a lot about things such as ROW_NUMBER, PARTITION BY etc., and today I've come across NTILE.

I think it's a pretty cool function; I can see how it works and I've even created some examples, but I can't think of any practical application for it.

I'd be interested to hear from anyone who uses it and what they use it for. It's great to learn new code, but it's even better if it's of any use.

---------------------------------------------------------------------------------
http://www.mannyroadend.co.uk The official unofficial website of Bury Football Club

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-04-01 : 15:36:41
I guess NTILE would be useful if you are the College Board and trying to determine what SAT score would correspond to the 95th percentile. But like you, I have not used it much either.
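A minimal sketch of that idea, assuming a hypothetical SATScores table (the table and column names here are made up for illustration): NTILE(100) puts each score into one of 100 roughly equal buckets, so the lowest score in bucket 95 or above approximates the 95th-percentile cutoff.

```sql
-- Hypothetical table: SATScores(StudentID INT, Score INT)
-- NTILE(100) splits the ordered scores into 100 roughly equal buckets;
-- the smallest score in bucket 95 or higher approximates the 95th percentile.
SELECT MIN(Score) AS Percentile95Cutoff
FROM (
    SELECT Score,
           NTILE(100) OVER (ORDER BY Score) AS Pctile
    FROM SATScores
) t
WHERE t.Pctile >= 95
```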

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-04-01 : 15:49:47
It is very practical when displaying data columnwise.
CREATE TABLE #NotInUseLocations (
    RowID INT NOT NULL,
    LocationKey VARCHAR(10) NOT NULL
)

INSERT INTO #NotInUseLocations
SELECT 1, 'COVEY'
UNION SELECT 2, 'FOODBANK'
UNION SELECT 3, 'FRT UPPER'
UNION SELECT 4, 'HOLD'
UNION SELECT 5, 'IN-TRANSIT'
UNION SELECT 6, 'INSPECTION'
UNION SELECT 7, 'LOADING'
UNION SELECT 8, 'MILWH'
UNION SELECT 9, 'NETTABLE'
UNION SELECT 10, 'NON-STOCK'
UNION SELECT 11, 'NONNET'
UNION SELECT 12, 'PHANTOM'
UNION SELECT 13, 'QA HOLD'
UNION SELECT 14, 'DAM01'
UNION SELECT 15, 'LOST'
UNION SELECT 16, 'MILTON'
UNION SELECT 17, '13090'
UNION SELECT 18, 'SANDL'
UNION SELECT 19, 'REJECT'
UNION SELECT 20, 'SALVAGE'
UNION SELECT 21, 'SECND FLR'
UNION SELECT 22, 'STAGING'
UNION SELECT 23, 'pouch'
UNION SELECT 24, 'D1302'
UNION SELECT 25, 'D1306'
UNION SELECT 26, 'D1307'
UNION SELECT 27, 'D1309'
UNION SELECT 28, 'WH12'
UNION SELECT 29, 'WH13'

;WITH cteSource(RowID, LocationKey, ColumnNumber)
AS (
    SELECT RowID,
           LocationKey,
           NTILE(4) OVER (ORDER BY RowID) AS ColumnNumber
    FROM #NotInUseLocations
), ctePresentation(LocationKey, ColumnNumber, RowNumber)
AS (
    SELECT LocationKey,
           ColumnNumber,
           ROW_NUMBER() OVER (PARTITION BY ColumnNumber ORDER BY RowID) AS RowNumber
    FROM cteSource
)
SELECT p.RowNumber,
       p.[1],
       p.[2],
       p.[3],
       p.[4]
FROM ctePresentation
PIVOT (
    MAX(LocationKey)
    FOR ColumnNumber IN ([1], [2], [3], [4])
) AS p

DROP TABLE #NotInUseLocations



N 56°04'39.26"
E 12°55'05.63"

Sqlraider
Yak Posting Veteran

65 Posts

Posted - 2011-04-01 : 16:44:07
quote:
Originally posted by sunitabeck

I guess NTILE would be useful if you are the College Board and trying to determine what SAT score would correspond to the 95th percentile. But like you, I have not used it much either.



I would be very interested in seeing how to do this with student scores. This might be a better way for me.

I tried adapting Peso's code to my test data, but I don't see how it should be coded, or whether NTILE is even the right approach.

Here is what I have:

CREATE TABLE #Scores (
    SID INT NOT NULL,
    TestID VARCHAR(10) NOT NULL,
    Score DECIMAL(5,2) NOT NULL,
    MaxScore DECIMAL(5,2) NOT NULL,
    BeginScore DECIMAL(5,2) NOT NULL,
    ProgScore DECIMAL(5,2) NOT NULL,
    ProfScore DECIMAL(5,2) NOT NULL,
    AdvScore DECIMAL(5,2) NOT NULL
)

INSERT INTO #Scores
SELECT 12222, 'LA09.01.01', 2.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 13333, 'LA09.01.01', 3.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 14444, 'LA09.01.01', 4.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 15555, 'LA09.01.01', 4.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 16666, 'LA09.01.01', 3.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 17777, 'LA09.01.01', 3.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 18888, 'LA09.01.01', 3.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 19999, 'LA09.01.01', 1.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 21111, 'LA09.01.01', 4.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 22222, 'LA09.01.01', 2.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 23333, 'LA09.01.01', 2.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 24444, 'LA09.01.01', 3.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 25555, 'LA09.01.01', 1.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 26666, 'LA09.01.01', 4.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 27777, 'LA09.01.01', 3.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 28888, 'LA09.01.01', 3.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 29999, 'LA09.01.01', 4.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 31111, 'LA09.01.01', 4.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 32222, 'LA09.01.01', 3.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 33333, 'LA09.01.01', 3.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 34444, 'LA09.01.01', 4.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 44221, 'LA09.01.01', 4.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 55345, 'LA09.01.01', 3.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 66345, 'LA09.01.01', 3.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 77345, 'LA09.01.01', 4.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 88345, 'LA09.01.01', 4.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 92345, 'LA09.01.01', 3.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 93345, 'LA09.01.01', 3.00, 4.00, 1.00, 2.00, 3.00, 4.00
UNION SELECT 94445, 'LA09.01.01', 2.00, 4.00, 1.00, 2.00, 3.00, 4.00

;WITH cteSource(SID, TestID, Score)
AS (
    SELECT SID,
           TestID,
           NTILE(4) OVER (ORDER BY SID) AS Score
    FROM #Scores
), ctePresentation(TestID, Score, RowNumber)
AS (
    SELECT TestID,
           Score,
           ROW_NUMBER() OVER (PARTITION BY Score ORDER BY SID) AS RowNumber
    FROM cteSource
)
SELECT p.RowNumber,
       p.[1],
       p.[2],
       p.[3],
       p.[4]
FROM ctePresentation
PIVOT (
    MAX(TestID)
    FOR Score IN ([1], [2], [3], [4])
) AS p


DROP TABLE #Scores


I'm trying to show the scores for a TestID that fall into each of the 4 score levels. The above code did not give the expected results. Is NTILE the way to do it? Just very curious.
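One idea I have not fully tested: since Score already takes exactly the four bucket values 1 through 4, it may be possible to pivot on Score itself and skip NTILE entirely. A rough sketch against the #Scores data above (the bucket column aliases are just my guess at the desired layout; CAST to INT so the DECIMAL values match the pivot column names):

```sql
-- Pivot directly on the score value (1-4) instead of NTILE buckets.
-- Each output column lists the SIDs of students at that score level.
;WITH cteBuckets(SID, ScoreBucket, RowNumber)
AS (
    SELECT SID,
           CAST(Score AS INT) AS ScoreBucket,
           ROW_NUMBER() OVER (PARTITION BY Score ORDER BY SID) AS RowNumber
    FROM #Scores
)
SELECT p.RowNumber,
       p.[1] AS Beginning,
       p.[2] AS Progressing,
       p.[3] AS Proficient,
       p.[4] AS Advanced
FROM cteBuckets
PIVOT (
    MAX(SID)
    FOR ScoreBucket IN ([1], [2], [3], [4])
) AS p
```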

Thanks.

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2011-04-01 : 18:00:34
We use it for RFM analysis when we do customer segmentation based on purchase habits or site visits. You can find other uses of RFM analysis on my internet marketing blog [url]http://internet-marketing-corner.blogspot.com/2011/03/rfm-analysis-and-google.html[/url].

I also use it when calculating the median, which we use when trying to figure out whether it is possible to predict purchase latency in specific situations.

Mirko

My blog: http://mirko-marovic-eng.blogspot.com/

Sqlraider
Yak Posting Veteran

65 Posts

Posted - 2011-04-04 : 10:27:01
Mirko,
Would you mind sharing how you calculate the median for test scores using NTILE (using my test data above)?

Explanation of data:
SID -> student ID
TestID -> test ID this group of students took
Score -> the actual score the student received on the test
MaxScore -> perfect score on the test (highest possible)
BeginScore -> means student is a beginner in test subject (always 1)
ProgScore -> means student is progressing (always 2)
ProfScore -> means student is proficient (always 3)
AdvScore -> means student is advanced (always 4)

NOTE: ALL scores are whole numbers; no values are ever entered to the right of the decimal point (the decimal field type is just how the table was designed).

I think NTILE may be the way to go if you can help me figure out how to use it.

I want to find the students who are at or above the median score, and the students who fall into each of the 4 score buckets. Most times the median score is 3, but depending on the type of test it could be 2.

Come to think of it, I could also apply RFM to it, breaking students down by language barriers, sex, or ethnicity.

Thanks in advance.

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2011-04-05 : 07:10:06
quote:
Originally posted by Sqlraider

Mirko,
Would you mind sharing how you calculate the median for test scores using NTILE (using my test data above)?

Explanation of data:
SID -> student ID
TestID -> test ID this group of students took
Score -> the actual score the student received on the test
MaxScore -> perfect score on the test (highest possible)
BeginScore -> means student is a beginner in test subject (always 1)
ProgScore -> means student is progressing (always 2)
ProfScore -> means student is proficient (always 3)
AdvScore -> means student is advanced (always 4)

NOTE: ALL scores are whole numbers; no values are ever entered to the right of the decimal point (the decimal field type is just how the table was designed).

I think NTILE may be the way to go if you can help me figure out how to use it.

I want to find the students who are at or above the median score, and the students who fall into each of the 4 score buckets. Most times the median score is 3, but depending on the type of test it could be 2.

Come to think of it, I could also apply RFM to it, breaking students down by language barriers, sex, or ethnicity.

Thanks in advance.


I am pretty busy right now, so I will post a table structure and query I used for calculating median in one case. I hope it helps:
CREATE TABLE [dbo].[CustomerVisit](
    [CustomerId] [int] NOT NULL,
    [Visit] [int] NOT NULL,
    [VisitDate] [date] NOT NULL,
    [PurchaseAmount] [money] NOT NULL,
    [StoreId] [smallint] NOT NULL,
    CONSTRAINT [PK_CustomerVisit] PRIMARY KEY CLUSTERED
    (
        [CustomerId] ASC,
        [Visit] ASC
    )
)

select t.visit, MAX(purchaseAmount) as medianPurchaseAmount
from (
    select cv.Visit,
           cv.PurchaseAmount,
           NTILE(2) over (partition by cv.visit order by cv.purchaseAmount) as half
    from CustomerVisit cv
) t
where t.half = 1
group by t.Visit
order by t.Visit
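Applied to the #Scores table from earlier in the thread, the same pattern would look roughly like this (a sketch, not tested against your data). NTILE(2) splits each test's ordered scores into a lower and an upper half; the maximum of the lower half is the median. Note that for an even row count this gives the lower of the two middle values, not their average.

```sql
-- Same NTILE(2) median pattern, applied per TestID to the #Scores data.
SELECT t.TestID, MAX(t.Score) AS MedianScore
FROM (
    SELECT TestID,
           Score,
           NTILE(2) OVER (PARTITION BY TestID ORDER BY Score) AS Half
    FROM #Scores
) t
WHERE t.Half = 1
GROUP BY t.TestID
ORDER BY t.TestID
```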


Mirko

My blog: http://mirko-marovic-eng.blogspot.com/

Sqlraider
Yak Posting Veteran

65 Posts

Posted - 2011-04-05 : 14:13:08
Thanks Mirko,

I've translated your code to my data. At this time I'm not sure it has a practical use for my data; besides, I need to study this more, and like you I am also busy.

Thank you for taking time out of your busy schedule to share your code.

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2011-04-05 : 16:03:14
You are welcome.

Mirko

My blog: http://mirko-marovic-eng.blogspot.com/
   
