HiI want to create the statistical "Median" function and to use it in the same way that I use all aggregate functions such as sum(), avg() etc..I want to simply call it with a column name (e.g: "Median (Col)" ) and it will return a median value. It will be able to calculate the median of all column values and on groups of rows using the GROUP BY clause as all aggregate functions do.At first I created the Median function on a specific table as described in the following article (just to practice the algorithm):[url] http://www.sqlmag.com/articles/index.cfm?articleid=49827[/url]I created the following view to work on:IF OBJECT_ID('dbo.VOrders') IS NOT NULL DROP VIEW dbo.VOrders;GOCREATE VIEW dbo.VOrdersASSELECT O.OrderID, O.OrderDate, O.CustomerID, O.EmployeeID, O.ShipVia, SUM(OD.Quantity) AS Qty, CAST(SUM(OD.Quantity * UnitPrice * (1 - Discount)) AS DECIMAL(12, 2)) AS Value FROM dbo.Orders AS O JOIN [Order Details] AS OD ON O.OrderID = OD.OrderIDGROUP BY O.OrderID, O.OrderDate, O.CustomerID, O.EmployeeID, O.ShipViaGO
And then I tried the 2 ways to calculate the Median as described in the article.SQL Server 2000 way:SELECT EmployeeID,( (SELECT MAX(Value) FROM (SELECT TOP 50 PERCENT Value FROM dbo.VOrders AS O1 WHERE O1.EmployeeID = E.EmployeeID ORDER BY Value) AS H1) + (SELECT MIN(Value) FROM (SELECT TOP 50 PERCENT Value FROM dbo.VOrders AS O2 WHERE O2.EmployeeID = E.EmployeeID ORDER BY Value DESC) AS H2)) / 2 AS MedianFROM dbo.Employees AS Eorder by E.EmployeeID
SQL Server 2005 Way:WITH OrdersRN AS( SELECT EmployeeID, Value, ROW_NUMBER() OVER(PARTITION BY EmployeeID ORDER BY Value) AS RowNum, COUNT(*) OVER(PARTITION BY EmployeeID) AS Cnt FROM dbo.VOrders)SELECT EmployeeID, AVG(Value) AS MedianFROM OrdersRNWHERE RowNum IN((Cnt + 1) / 2, (Cnt + 2) / 2)GROUP BY EmployeeID;
Both Algorithms work OK.After checking those algorithms I tried to convert the code in a way that the function will accept parameter (col name) so I could use it as every aggregate function. I want to be able to call it as follows:select EmployeeID, Median (Value)FROM VOrdersGROUP BY EmployeeIDThen I started to run into problems. First problem is - How the function will know which table to query ? Do I need to pass table name as an argument to the Median function ? Do I need to pass column name as well ?If so - it does not act as an aggregate function.Is it possible at all ?Thanks Asi