I have a complicated table of over 300,000 records and growing. I had previously used a query (developed with help from someone on here) that calculated the daily medians of some values by groups/people/day. Without explaining all the colunms, basically these are testresults that each person may have many a day and I am wanting to calculate the daily median for each person of those tests. It also joins a lookup table to return the unit and if a unit is not in the lookup table, "other" is returned. Here it is:SELECT DISTINCT TOP (100) PERCENT v.PatientName, ROUND(CAST ((SELECT MAX(TestResult) AS Expr1 FROM (SELECT TOP (50) PERCENT TestResult FROM dbo.onlyControlProd WHERE (CollectDateODBC = v.collectdateodbc) AND (PatientName = v.patientname) ORDER BY TestResult) AS H1 GROUP BY v.CollectDateODBC) + (SELECT MIN(TestResult) AS Expr1 FROM (SELECT TOP (50) PERCENT TestResult FROM dbo.onlyControlProd AS onlyControlProd_1 WHERE (CollectDateODBC = v.collectdateodbc) AND (PatientName = v.patientname) ORDER BY TestResult DESC) AS H2 GROUP BY v.CollectDateODBC) AS Decimal) / 2, 2) AS Median, v.CollectDateODBC, ISNULL(dbo.unitlookup.UnitName, N'*Other') AS UnitNameFROM dbo.onlyControlProd AS v LEFT OUTER JOIN dbo.unitlookup ON v.[Patient Location Code] = dbo.unitlookup.Misys
Now, as you can see this would take a few seconds to do the calculation (around 24 seconds). Now I have converted over to 2005 and trying to take advantage of the row_number() to get the same results. Here is what I came up with based on the code above:SELECT cast(avg(testresult) as decimal) AS DailyMedian, patientname, collectdateodbc, ISNULL(dbo.unitlookup.UnitName, N'*Other') AS UnitNameFROM (SELECT patientname, collectdateodbc, testresult, [patient location code], row_number() OVER (partition BY collectdateodbc, patientname ORDER BY testresult) AS rownum, count(*) OVER (partition BY collectdateodbc, patientname) AS rowcntFROM onlycontrolprod) x LEFT OUTER JOINdbo.unitlookup ON [Patient Location Code] = dbo.unitlookup.MisysWHERE rownum >= ((rowcnt + 1) / 2) AND rownum <= ((rowcnt + 2) / 2)GROUP BY patientname, collectdateodbc, ISNULL(dbo.unitlookup.UnitName, N'*Other')
Now, I get similar numbers but using the code below I do not get the same number of rows. The old method (using first block of code) gives 14, 606 rows returned with the "daily" medians. The new block of code on sql2005 gives 14,305 rows. Strangely, I can execute again, and it will be 14,310, 14,300, etc.. Seems to differ each time, and in addition to that variation, when I compare some of the numbers using each method, some are exactly the same, some are slightly different. Has anyone experimented with this method? Can you see differences in my queries that may lead to this ?