SQL Server Forums
Profile | Register | Active Topics | Members | Search | Forum FAQ
 
Register Now and get your question answered!
Username:
Password:
Save Password
Forgot your Password?

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Box and Whisker Plot
 New Topic  Reply to Topic
 Printer Friendly
Author Previous Topic Topic Next Topic  

stwp86
Starting Member

USA
42 Posts

Posted - 09/26/2012 :  16:49:09  Show Profile  Reply with Quote
Hey Everyone . . .

I really hate doing this , but I lack the necessary skills to write this from scratch and have not been able to find query help regarding this topic.

I have had great success posting here with this great community of experts, so I thought I would give it another go . . .

I will try to lay everything out that I have been able to gather to this point.

I have the following table/columns(name: workflow_data):
Table Columns:
Workflow_ID
Duration
Start_Time
End_Time
User_Name
Server
Client
Process_Start
Process_End
Process_Checkpoints
Definition
Client
Group_Type

I will be using a reporting tool to create the Box and Whisker plot (5 values are required to make this work)
Required Values:
Median
Max
Min
Lower Quartile
Upper Quartile

This information came from here: http://www.bidn.com/blogs/mikedavis/ssis/163/how-to-make-a-box-plot-chart-in-sql-reporting-services-2008-ssrs

The bases for these calculations would be the duration column from the SQL table. I would ideally like to restrict based on the definition, client, and/or group_type fields and group by work week (exclude weekends) and display as the monday of the week that the data falls in.

In the past I have handled the work week grouping like this: CONVERT(varchar(50), (DATEADD(d, @@DATEFIRST - DATEPART(dw, Start_Time) - 5, Start_Time)), 101)

And I would also exclude the weekends like this in the where clause: datepart(dw, workflow_data.Start_Time) not in (1,7)

I feel bad asking for help in this manner, but am stuck trying to get my team by while we wait for a SME to join us.

Thanks in advance

Travis

stwp86
Starting Member

USA
42 Posts

Posted - 09/27/2012 :  11:23:24  Show Profile  Reply with Quote
I was able to locate a great post on this topic and I have gotten to this point:

WITH SourceData AS (

--Table of sets over which we want to produce multiple quartiles.
SELECT sales.duration AS Measure,
-- sales.definition AS GroupID,
CONVERT(varchar(50), (DATEADD(d, @@DATEFIRST - DATEPART(dw, sales.Start_Time) - 5, sales.Start_Time)), 101) As GroupID,
sales.duration as MaxMeasure,
sales.duration as MinMeasure
FROM Workflow_Data AS sales
WHERE sales.client_mnemonic = 'XXXX_XX' and
sales.Definition in
(
'Meds Rec - Discharge',
'Meds Rec - Transfer',
'Meds Rec - Admission'
) and datepart(dw, sales.Start_Time) not in (1,7)

)
--Aggregate into a single record for each group, using MAX to select the non-null
--detail value for each column.
SELECT GroupID,
(Max(Q1NextVal) - MAX(Q1Val)) * Max(Q1Frac) + Max(Q1Val) Quartile1,
(Max(MidVal1) + Max(MidVal2)) / 2 Median,
(Max(Q3NextVal) - MAX(Q3Val)) * Max(Q3Frac) + Max(Q3Val) Quartile3

--Save into a separate table if desired.
--INTO dbo.PreCalculatedQuartiles
FROM (
--Expose the detail values for only the records at the index values
--generated by the summary subquery. All other values are left as NULL.
SELECT detail.GroupID,
CASE WHEN RowNum = Q1Idx THEN Measure ELSE NULL END Q1Val,
CASE WHEN RowNum = Q1Idx + 1 THEN Measure ELSE NULL END Q1NextVal,
CASE WHEN RowNum = Q3Idx THEN Measure ELSE NULL END Q3Val,
CASE WHEN RowNum = Q3Idx + 1 THEN Measure ELSE NULL END Q3NextVal,
Q1Frac,
Q3Frac,
CASE WHEN RowNum = MidPt1 THEN Measure ELSE NULL END MidVal1,
CASE WHEN RowNum = MidPt2 THEN Measure ELSE NULL END MidVal2
FROM
--Calculate a row number sorted by measure for each group.
(SELECT *, ROW_NUMBER() OVER (PARTITION BY GroupID ORDER BY Measure) RowNum
FROM SourceData) AS detail

INNER JOIN (
--Summarize to find index numbers and fractions we need to use to locate
--the values at the quartile points.
SELECT GroupID,
FLOOR((COUNT(*) + 3) / 4.0) Q1Idx,
((COUNT(*) + 3) / 4.0) - FLOOR((COUNT(*) + 3) / 4.0) Q1Frac,
(COUNT(*) + 1) / 2 AS MidPt1,
(COUNT(*) + 2) / 2 AS Midpt2,
FLOOR((COUNT(*) * 3 + 1) / 4.0) Q3Idx,
((COUNT(*) * 3 + 1) / 4.0) - FLOOR((COUNT(*) * 3 + 1) / 4.0) Q3Frac
FROM SourceData
GROUP BY GroupID
HAVING COUNT(*) > 1

) AS summary ON detail.GroupID = summary.GroupID

) AS combined
GROUP BY GroupID;



I am just missing the Min and Max duration, any thoughts on how I would add that in to the above query?
thanks!

Travis
Go to Top of Page

stwp86
Starting Member

USA
42 Posts

Posted - 09/27/2012 :  11:39:47  Show Profile  Reply with Quote
Figured it out:

Here is the finished product incase anyone else needs something similar:

WITH SourceData AS (

    --Table of sets over which we want to produce multiple quartiles.
    SELECT sales.duration AS Measure,
           CONVERT(varchar(50), (DATEADD(d, @@DATEFIRST - DATEPART(dw, sales.Start_Time) - 5, sales.Start_Time)), 101) As GroupID
	FROM Workflow_Data AS sales
	WHERE sales.client_mnemonic = 'XXXX_XX'and
	   sales.Definition in 
	   (
	   'Meds Rec - Discharge',
	   'Meds Rec - Transfer',
	   'Meds Rec - Admission'
	   ) and datepart(dw, sales.Start_Time) not in (1,7)

)
--Aggregate into a single record for each group, using MAX to select the non-null 
--detail value for each column.
SELECT GroupID,
    (Max(Q1NextVal) - MAX(Q1Val)) * Max(Q1Frac) + Max(Q1Val) Quartile1,
    (Max(MidVal1) + Max(MidVal2)) / 2 Median,
    (Max(Q3NextVal) - MAX(Q3Val)) * Max(Q3Frac) + Max(Q3Val) Quartile3,
        Max(measure) maxMeasure,
        Min(measure) minMeasure

--Save into a separate table if desired.
--INTO dbo.PreCalculatedQuartiles
FROM (
    --Expose the detail values for only the records at the index values 
    --generated by the summary subquery. All other values are left as NULL.
    SELECT detail.GroupID,
        CASE WHEN RowNum = Q1Idx THEN Measure ELSE NULL END Q1Val,
        CASE WHEN RowNum = Q1Idx + 1 THEN Measure ELSE NULL END Q1NextVal,
        CASE WHEN RowNum = Q3Idx THEN Measure ELSE NULL END Q3Val,
        CASE WHEN RowNum = Q3Idx + 1 THEN Measure ELSE NULL END Q3NextVal,
        Q1Frac,
        Q3Frac,
        CASE WHEN RowNum = MidPt1 THEN Measure ELSE NULL END MidVal1,
        CASE WHEN RowNum = MidPt2 THEN Measure ELSE NULL END MidVal2,
        Detail.measure
    FROM
        --Calculate a row number sorted by measure for each group.
        (SELECT *,  ROW_NUMBER() OVER (PARTITION BY GroupID ORDER BY Measure) RowNum
        FROM SourceData) AS detail

    INNER JOIN (
        --Summarize to find index numbers and fractions we need to use to locate 
        --the values at the quartile points.
        SELECT GroupID, 
            FLOOR((COUNT(*) + 3) / 4.0) Q1Idx,
            ((COUNT(*) + 3) / 4.0) - FLOOR((COUNT(*) + 3) / 4.0) Q1Frac,
            (COUNT(*) + 1) / 2 AS MidPt1,
            (COUNT(*) + 2) / 2 AS Midpt2,
            FLOOR((COUNT(*) * 3 + 1) / 4.0) Q3Idx,
            ((COUNT(*) * 3 + 1) / 4.0) - FLOOR((COUNT(*) * 3 + 1) / 4.0) Q3Frac
        FROM SourceData
        GROUP BY GroupID
        HAVING COUNT(*) > 1
    
    ) AS summary ON detail.GroupID  = summary.GroupID 

) AS combined
GROUP BY GroupID;
Go to Top of Page

bitsmed
Constraint Violating Yak Guru

406 Posts

Posted - 09/27/2012 :  12:22:06  Show Profile  Reply with Quote
Thanks you for sharing.
Go to Top of Page
  Previous Topic Topic Next Topic  
 New Topic  Reply to Topic
 Printer Friendly
Jump To:
SQL Server Forums © 2000-2009 SQLTeam Publishing, LLC Go To Top Of Page
This page was generated in 0.09 seconds. Powered By: Snitz Forums 2000