Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 95th Percentile

Author  Topic 

DavidChel
Constraint Violating Yak Guru

474 Posts

Posted - 2009-01-28 : 16:45:17
I've been tasked with determining the 95th Percentile price per part sold by my company in the past 6 months. Can anyone give me some direction of how one might calculate that number using SQL?

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2009-01-28 : 16:57:54
I think it would be helpful of you could provide an example of exactly what that means, with tables, sample data, and sample results.




CODO ERGO SUM
Go to Top of Page

DavidChel
Constraint Violating Yak Guru

474 Posts

Posted - 2009-01-28 : 17:08:59
Well, let's say you had a sales table with sales order line items.

     SO#           Item #     PartNo    NetPrice
100000 1 Widget 21.50
100010 1 Widget 18.50
100100 4 Widget 23.50
100099 6 Widget 24.00
123000 1 Widget 14.50


How would you write a very basic query so one could determine the correct 95th percentile price.
Go to Top of Page

TG
Master Smack Fu Yak Hacker

6065 Posts

Posted - 2009-01-28 : 17:24:49
come on David, you've been around these forums long enough to know what "tables, sample data, and results" mean

Be One with the Optimizer
TG
Go to Top of Page

TG
Master Smack Fu Yak Hacker

6065 Posts

Posted - 2009-01-28 : 17:26:18
without trying to guess the specifics, have a look at the ranking function NTILE() in Books Online.

Be One with the Optimizer
TG
Go to Top of Page

DavidChel
Constraint Violating Yak Guru

474 Posts

Posted - 2009-01-28 : 17:37:17
quote:
Originally posted by TG

come on David, you've been around these forums long enough to know what "tables, sample data, and results" mean

Be One with the Optimizer
TG



I'm not trying to be difficult. I've given a table, sample data, and what I need is the 95th percentile. Unfortunately, I'm using SQL 2000.

I guess I don't know enough about what I'm asking about to phrase a better question.
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2009-01-28 : 17:47:00
So, based on the sample data you provided, what is the "95th Percentile price per part sold".




CODO ERGO SUM
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2009-01-28 : 17:47:34
95th percentile is approximately two standard deviations above the mean.
So calculate the mean of your data and the standard deviation, and there you are.

________________________________________________
If it is not practically useful, then it is practically useless.
________________________________________________
Go to Top of Page

DavidChel
Constraint Violating Yak Guru

474 Posts

Posted - 2009-01-29 : 12:37:20
In an effort to be helpful here is a very abbreviated sample table:


SalesNO Price Part Number
100000 100 Widget
100001 99 Widget
100002 98 Widget
100003 97 Widget
100004 96 Widget
100005 95 Widget
100006 94 Widget
100007 93 Widget
100008 92 Widget
100009 91 Widget
100010 90 Widget
100011 89 Widget
100012 88 Widget
100013 100 Widget
100014 99 Widget
100015 98 Widget
100016 97 Widget
100017 96 Widget
100018 95 Widget
100019 94 Widget
100020 93 Widget
100021 92 Widget
100022 91 Widget
100023 90 Widget
100024 89 Widget
100025 88 Widget
100026 87 Widget
100027 86 Widget
100028 85 Widget
100029 84 Widget
100030 83 Widget
100031 82 Widget
100032 81 Widget
100033 80 Widget
100034 100 Widget
100035 99 Widget
100036 98 Widget
100037 97 Widget
100038 96 Widget
100039 95 Widget
100040 94 Widget
100041 93 Widget
100042 92 Widget
100043 91 Widget
100044 90 Widget
100045 89 Widget
100046 88 Widget
100047 87 Widget
100048 86 Widget
100049 85 Widget
100050 84 Widget
100051 100 Widget
100052 99 Widget
100053 98 Widget
100054 97 Widget
100055 96 Widget
100056 95 Widget
100057 94 Widget
100058 93 Widget
100059 92 Widget
100060 91 Widget
100061 90 Widget
100062 89 Widget
100063 88 Widget
100064 87 Widget
100065 86 Widget
100066 85 Widget
100067 84 Widget
100068 100 Widget
100069 99 Widget
100070 98 Widget
100071 97 Widget
100072 96 Widget
100073 95 Widget
100074 94 Widget
100075 93 Widget
100076 92 Widget
100077 91 Widget
100078 90 Widget
100079 89 Widget
100080 88 Widget
100081 87 Widget
100082 86 Widget
100083 85 Widget
100084 84 Widget
100085 83 Widget
100086 82 Widget
100087 81 Widget
100088 80 Widget
100089 79 Widget
100090 78 Widget
100091 77 Widget
100092 76 Widget
100093 75 Widget
100094 74 Widget
100095 73 Widget
100096 72 Widget
100097 71 Widget
100098 70 Widget
100099 69 Widget



According to Excel the 95th percentile is 99.05.

Any ideas?
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2009-01-29 : 12:47:49
http://www.sqlteam.com/article/computing-percentiles-in-sql-server
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2009-01-29 : 17:09:04
[code]select *
into #TestData
from
(select 100000 as SalesNO, 100 as Price, 'Widget' as PartNumber
union select 100001, 99, 'Widget'
union select 100002, 98, 'Widget'
union select 100003, 97, 'Widget'
union select 100004, 96, 'Widget'
union select 100005, 95, 'Widget'
union select 100006, 94, 'Widget'
union select 100007, 93, 'Widget'
union select 100008, 92, 'Widget'
union select 100009, 91, 'Widget'
union select 100010, 90, 'Widget'
union select 100011, 89, 'Widget'
union select 100012, 88, 'Widget'
union select 100013, 100, 'Widget'
union select 100014, 99, 'Widget'
union select 100015, 98, 'Widget'
union select 100016, 97, 'Widget'
union select 100017, 96, 'Widget'
union select 100018, 95, 'Widget'
union select 100019, 94, 'Widget'
union select 100020, 93, 'Widget'
union select 100021, 92, 'Widget'
union select 100022, 91, 'Widget'
union select 100023, 90, 'Widget'
union select 100024, 89, 'Widget'
union select 100025, 88, 'Widget'
union select 100026, 87, 'Widget'
union select 100027, 86, 'Widget'
union select 100028, 85, 'Widget'
union select 100029, 84, 'Widget'
union select 100030, 83, 'Widget'
union select 100031, 82, 'Widget'
union select 100032, 81, 'Widget'
union select 100033, 80, 'Widget'
union select 100034, 100, 'Widget'
union select 100035, 99, 'Widget'
union select 100036, 98, 'Widget'
union select 100037, 97, 'Widget'
union select 100038, 96, 'Widget'
union select 100039, 95, 'Widget'
union select 100040, 94, 'Widget'
union select 100041, 93, 'Widget'
union select 100042, 92, 'Widget'
union select 100043, 91, 'Widget'
union select 100044, 90, 'Widget'
union select 100045, 89, 'Widget'
union select 100046, 88, 'Widget'
union select 100047, 87, 'Widget'
union select 100048, 86, 'Widget'
union select 100049, 85, 'Widget'
union select 100050, 84, 'Widget'
union select 100051, 100, 'Widget'
union select 100052, 99, 'Widget'
union select 100053, 98, 'Widget'
union select 100054, 97, 'Widget'
union select 100055, 96, 'Widget'
union select 100056, 95, 'Widget'
union select 100057, 94, 'Widget'
union select 100058, 93, 'Widget'
union select 100059, 92, 'Widget'
union select 100060, 91, 'Widget'
union select 100061, 90, 'Widget'
union select 100062, 89, 'Widget'
union select 100063, 88, 'Widget'
union select 100064, 87, 'Widget'
union select 100065, 86, 'Widget'
union select 100066, 85, 'Widget'
union select 100067, 84, 'Widget'
union select 100068, 100, 'Widget'
union select 100069, 99, 'Widget'
union select 100070, 98, 'Widget'
union select 100071, 97, 'Widget'
union select 100072, 96, 'Widget'
union select 100073, 95, 'Widget'
union select 100074, 94, 'Widget'
union select 100075, 93, 'Widget'
union select 100076, 92, 'Widget'
union select 100077, 91, 'Widget'
union select 100078, 90, 'Widget'
union select 100079, 89, 'Widget'
union select 100080, 88, 'Widget'
union select 100081, 87, 'Widget'
union select 100082, 86, 'Widget'
union select 100083, 85, 'Widget'
union select 100084, 84, 'Widget'
union select 100085, 83, 'Widget'
union select 100086, 82, 'Widget'
union select 100087, 81, 'Widget'
union select 100088, 80, 'Widget'
union select 100089, 79, 'Widget'
union select 100090, 78, 'Widget'
union select 100091, 77, 'Widget'
union select 100092, 76, 'Widget'
union select 100093, 75, 'Widget'
union select 100094, 74, 'Widget'
union select 100095, 73, 'Widget'
union select 100096, 72, 'Widget'
union select 100097, 71, 'Widget'
union select 100098, 70, 'Widget'
union select 100099, 69, 'Widget') TestData

select PartNumber,
avg(Price) as Mean,
stdev(Price) as StandardDeviation,
avg(Price) + (stdev(Price) * 1.65) as '95thPercentile'
from #TestData
group by PartNumber

drop table #TestData[/code]

________________________________________________
If it is not practically useful, then it is practically useless.
________________________________________________
Go to Top of Page

DavidChel
Constraint Violating Yak Guru

474 Posts

Posted - 2009-01-30 : 09:06:01
So, my actual query is this:

SELECT 'XXX' AS [BU Name],
somast.fsono AS [Sales Order #],
' ' AS [Sales Order Type],
somast.forderdate AS [Sales Order Date],
aritem.fpartno AS [Part Number],
aritem.FMDESCRIPT AS [Part Description],
dbo.aritem.FPRODCL AS [Product Class],
'' AS [Group Code],
armast.fcinvoice AS [Invoice Number],
armast.finvdate AS [Invoice Date],
aritem.ftotprice AS [Invoice Amount],
aritem.fordqty AS [Quantity Ordered],
aritem.fshipqty AS [Quantity Invoiced],
0 AS [List Price Per Unit],
dbo.aritem.FTOTTXNPRICE AS [Net/Invoice Price Per Unit],
COALESCE(somefieldhere,0) AS [Discount Per Unit],
COALESCE(tmp.[COGS Material],0) AS [COGS Material],
COALESCE(tmp.[COGS Labor],0) AS [COGS Labor],
COALESCE(tmp.[COGS Overhead],0) AS [COGS Overhead],
slcdpm.FCUSRCHR1 AS [Customer Type],
somast.fcustno AS [Customer Number],
somast.fcompany AS [Customer Name],
Syaddr.fccountry AS [Region],
somast.fsoldby AS [Sales Rep]
FROM armast
INNER JOIN
aritem
ON armast.fcinvoice = aritem.fcinvoice
INNER JOIN
slcdpm
ON slcdpm.fcustno = armast.fcustno
INNER JOIN
somast
ON left (aritem.fsokey, 6) = somast.fsono
INNER JOIN
sorels
ON aritem.fsokey = sorels.fsono + sorels.finumber + sorels.frelease
INNER JOIN
SYADDR
ON sorels.fshptoaddr = syaddr.fcaddrkey
AND somast.fcustno = syaddr.fcaliaskey
INNER JOIN inmast on inmast.fpartno = aritem.fpartno and inmast.frev = aritem.FREV

LEFT JOIN (SELECT jomast.fsono,(jopact.flabact + jopact.flabinv) as [COGS Labor],
(jopact.fmatlact + jopact.fmatlinv) as [COGS Material],
(jopact.fovhdact + jopact.fovhdinv) as [COGS Overhead]
FROM sorels
Inner JOIN
jomast
ON sorels.fsono = jomast.fsono
AND sorels.finumber + sorels.frelease = jomast.fkey
inner join jopact on jomast.fjobno = jopact.fjobno

)tmp
ON tmp.fsono=somast.fsono


The List Price needs to be calculated from 95th percentile of the item below it: dbo.aritem.FTOTTXNPRICE AS [Net/Invoice Price Per Unit].

I read the article, but I'm still lost as to how to pull that off in a query such as this.

I appreciate all the help so far, but I need a little more.
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2009-01-30 : 09:32:09
just make a function as shown in link. then call it from your query passing reqd field to get percentile value.
Go to Top of Page

DavidChel
Constraint Violating Yak Guru

474 Posts

Posted - 2009-01-30 : 09:49:28
How about if I change it into a query which populates a temporary table? If I use a UDF, won't it execute at each record of this query and would therefore drag?
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2009-01-30 : 10:00:02
yup./..it will. or else try solution given by blindman.
Go to Top of Page

TG
Master Smack Fu Yak Hacker

6065 Posts

Posted - 2009-01-30 : 10:17:46
Blindman, how did you come up with the constant: 1.65? If I "reverse engineer" from OPs desired answer of 99.05 then I get instead: (approx) 1.315667078050060605

However, I also see some articles about how excel's percentile calc doesn't agree with statistical calculations.
ie: Statistical flaws in Excel
http://www.coventry.ac.uk/ec/~nhunt/pottel.pdf

Be One with the Optimizer
TG
Go to Top of Page

DavidChel
Constraint Violating Yak Guru

474 Posts

Posted - 2009-01-30 : 14:28:16
I have it working Blindman's way but I have no idea if it's accurate.
Go to Top of Page

TG
Master Smack Fu Yak Hacker

6065 Posts

Posted - 2009-01-30 : 14:43:31
quote:
Originally posted by DavidChel

I have it working Blindman's way but I have no idea if it's accurate.




If Excel is your baseline then can't you spot check a few datasets there?

Be One with the Optimizer
TG
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2009-02-02 : 11:46:10
Remember there is a difference between the number of standard deviations from the mean required to encompass 95% of the data, and the number of standard deviations above zero that represents the CUMULATIVE percentage of data.
I didn't use excel to verify or research it to come up with that constant. I just googled "cumulative percentile standard deviations", or some such thing. I don't guarantee the constant is the correct one he wants, but the technique and code should be the same.

________________________________________________
If it is not practically useful, then it is practically useless.
________________________________________________
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2009-02-02 : 14:10:35
Did you try using the NTILE function for this? That is what is is designed for.

http://technet.microsoft.com/en-us/library/ms175126(SQL.90).aspx

http://www.databasejournal.com/features/mssql/article.php/3780311/Exploring-SQL-2005s-Ranking-Functions--NTILE-and-ROWNUMBER.htm





CODO ERGO SUM
Go to Top of Page
   

- Advertisement -