Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 how to efficiently break apart cumulative data

Author  Topic 

sfortner
Yak Posting Veteran

63 Posts

Posted - 2008-04-14 : 20:16:15
Thought I'd check with the gurus on this, as I haven't done anything quite like it. Note that this is a real-life example, based on data logs I'm building a report on, and not an exercise for fun. I'm using SQL Server 2005, but will probably need to put this on a server with SQL Server 2000, so we can't use any new stuff! There seems to be a lot of examples about how to sum running totals to create cumulative data, but nothing I've found that talks about breaking apart cumulative data. Here's a simple example of what I need to do. I have a CumulativeData table with cumulative data by date that looks like this:


CREATE TABLE [dbo].[CumulativeData](
[ID] [int] NOT NULL,
[DeviceID] [int] NOT NULL,
[LoadDate] [datetime] NOT NULL,
[Temperature] [int] NOT NULL,
[CumulativeValue] [int] NOT NULL,
CONSTRAINT [PK_CumulativeData] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY]


Inserted values would look like:

insert into CumulativeData select 1, 125, '04/09/08', 50, 6
insert into CumulativeData select 2, 125, '04/10/08', 50, 12
insert into CumulativeData select 3, 125, '04/11/08', 50, 20
insert into CumulativeData select 4, 125, '04/12/08', 50, 42
insert into CumulativeData select 5, 125, '04/09/08', 52, 7
insert into CumulativeData select 6, 125, '04/10/08', 52, 15
insert into CumulativeData select 7, 125, '04/11/08', 52, 17
insert into CumulativeData select 8, 125, '04/12/08', 52, 22
insert into CumulativeData select 9, 135, '04/09/08', 45, 2
insert into CumulativeData select 10, 135, '04/10/08', 45, 4
insert into CumulativeData select 11, 135, '04/11/08', 45, 8
insert into CumulativeData select 12, 135, '04/12/08', 45, 14


I need to specify a start and end date range, and have it return probably a table (from a table-valued UDF?) that looks like the following for start date of 4/10/08 and end date of 4/12/08. Essentially, the difference between start and end date for each temperature value:

DeviceID 45 50 52
125 0 36 15
135 12 0 0

Note this is a little tricky, and I miscalculated the values above the first time. You have to essentially build/return a table with a column containing the actual values then sum that based on the date range for a given device and temperature it seems. I think a table returned with the values like the above will work out best, using pivot tables from http://weblogs.sqlteam.com/jeffs/archive/2005/05/02/4842.aspx, http://www.sqlteam.com/article/counting-transactions-per-hour-using-a-pivot-table or http://msdn2.microsoft.com/en-us/library/aa172756.aspx. That would make displaying this in a report like crystal reports clean, or for graphing it if I have to do it manually like a div-based graph from http://meyerweb.com/eric/css/edge/bargraph/demo-ie.html. Note this needs to be fast, because I've got at least 1.4 million rows in just one table currently, based on an average of about 25 entries per device.

Other ideas are welcome. I also thought about just returning a variable with comma-delimited key/value pairs for a given DeviceID when reporting on just one device. Seems that's quite efficient.

The simple and straight-forward (and slow!) approach would maybe write a cursor-based job or stored procedure to go through each of the devices, order by LoadDate asc to get oldest first, calculate each of the actual values and store it in a new ActualValues column after CumulativeValue. Then from there, just use Jeff's new pivot table approach in the link above to sum the ActualValues and return it in a table with multiple columns. Not sure I've solved my own problem, and I know you guys can come up with a slicker way to do it, but at least I've got a solution I think for now. Maybe even forget about the SQL-based pivot table creation and just use Jeff's .NET pivoted DataTable here - http://weblogs.sqlteam.com/jeffs/archive/2005/05/11/5101.aspx. I'm using .NET 2.0 and C# to do this, but am planning on using crystal reports for it's graphing, nice report layout, and exporting capability to excel. I know crystal supports connections to a .net class with a DataSet, but don't know if supports DataTables. Can probably add a DataTable to a DataSet I believe and use that. I'll be checking back for other ideas or confirmation from someone who's done this...

TIA,

--Steve

RyanRandall
Master Smack Fu Yak Hacker

1074 Posts

Posted - 2008-04-15 : 04:42:47
Steve,

I don't get what your target is for this sample data. I can't make sense of...

DeviceID 45 50 52
125 0 36 15
135 12 0 0


Ryan Randall
Solutions are easy. Understanding the problem, now, that's the hard part.
Go to Top of Page

RyanRandall
Master Smack Fu Yak Hacker

1074 Posts

Posted - 2008-04-15 : 04:45:02
Don't worry - I get it now

Ryan Randall
Solutions are easy. Understanding the problem, now, that's the hard part.
Go to Top of Page

RyanRandall
Master Smack Fu Yak Hacker

1074 Posts

Posted - 2008-04-15 : 05:03:43
Here's the first step I'd take. This gets you the values you need.

-- Inputs
declare @FromDate datetime
declare @ToDate datetime
set @FromDate = '20080410'
set @ToDate = '20080412'

-- Calculation
select
DeviceID,
Temperature,
max(case when LoadDate = @ToDate then CumulativeValue else null end) -
max(case when LoadDate < @FromDate then CumulativeValue else null end) as Value
from CumulativeData
where LoadDate < @FromDate or LoadDate = @ToDate
group by DeviceID, Temperature

/* Results
DeviceID Temperature Value
----------- ----------- -----------
135 45 12
125 50 36
125 52 15
*/
The next step is to pivot this data. You're best off doing that in the front-end if you can, or using the method in one of the examples you've found.


Ryan Randall
Solutions are easy. Understanding the problem, now, that's the hard part.
Go to Top of Page

sfortner
Yak Posting Veteran

63 Posts

Posted - 2008-04-15 : 10:29:05
Hey Ryan, thanks for taking the time to try and solve this. Your solution is real good, but not quite right. Like I was saying, it's tricky. :) Your calculation query returns nulls for the values, so I modified it to be:

select
DeviceID,
Temperature,
max(case when LoadDate <= @ToDate then CumulativeValue else null end) -
min(case when LoadDate >= @FromDate then CumulativeValue else null end) as Value
from CumulativeData
where LoadDate <= @FromDate or LoadDate >= @ToDate
group by DeviceID, Temperature


but it's still not right. It returns:


DeviceID Temp Value
135 45 10
125 50 30
125 52 7


when it should return:


DeviceID Temp Value
135 45 12
125 50 36
125 52 15


To clarify, you have to take the actual values and then sum them for each of the dates in the date range. Also, maybe I should have explained this better, as you correctly state understanding the problem is the hard part. Cool quote, BTW. The CumulativeValues are number of occurrances logged for a certain temperature. They accumulate over time. The proper values I'm after in the report for DeviceID 135 at a temperature of 45 degrees, is 12 (2 + 4 + 6). DeviceID 125 at a temperature of 50 degrees should be 36 (6 + 8 + 22). It has to look really at the previous value at a given temperature (if there is one) and date range to properly sum them. Note that if the date range changed to 4/9/08 to 4/11/08 for DeviceID 135 it should return 8 (2 + 2 + 4). To clarify further, DeviceID 135 has the following cumulative and actual values:


DeviceID Temp CumulativeValue ActualValue
135 45 2 2
135 45 4 2
135 45 8 4
135 45 14 6


Not real easy or clear how to accomplish the solution quickly. That's why I was thinking I should do this with a cursor at the time of the weekly import. Just let the ActualValue column be null and in that cursor proc just query the ones that are not null and then update the ActualValue column for the lot of them. Come to think about it, this is setting itself up for an excellent interview question. :)

--Steve
Go to Top of Page

sfortner
Yak Posting Veteran

63 Posts

Posted - 2008-04-15 : 11:47:35
Typo in last insert statement fixed above. Date was

insert into CumulativeData select 12, 135, '04/10/08', 45, 14

now

insert into CumulativeData select 12, 135, '04/12/08', 45, 14

Sorry...


--Steve
Go to Top of Page

sfortner
Yak Posting Veteran

63 Posts

Posted - 2008-04-15 : 12:23:43
Here's the slow but straight-forward cursor based solution:



-- first add the new column to the table
ALTER TABLE dbo.CumulativeData ADD ActualValue int NULL

-- procedure to update the CumulativeData table's ActualValue column
create procedure spUpdateActualValue
AS

SET NOCOUNT ON

DECLARE @ID int,
@DeviceID int,
@Temperature int,
@CumulativeValue int,
@PreviousCumulativeValue int

-- order by oldest LoadDate first
DECLARE ffcursor CURSOR FAST_FORWARD FOR
select ID, DeviceID, Temperature, CumulativeValue
from CumulativeData where ActualValue is null
order by DeviceID, LoadDate, Temperature

OPEN ffcursor

FETCH NEXT FROM ffcursor
INTO @ID, @DeviceID, @Temperature, @CumulativeValue

WHILE @@FETCH_STATUS = 0
BEGIN
set @PreviousCumulativeValue = null

-- get the previous cumulative value for the device at the specified temperature, if it exists
-- order by newest LoadDate first
select top 1 @PreviousCumulativeValue = CumulativeValue from CumulativeData
where DeviceID = @DeviceID and Temperature = @Temperature and ActualValue is not null
order by LoadDate desc

-- if previous cumulative value is null, that temperature has no ActualValue set - just use current CumulativeValue
if (@PreviousCumulativeValue is not null)
update CumulativeData set ActualValue = @CumulativeValue - @PreviousCumulativeValue where ID = @ID
else
update CumulativeData set ActualValue = @CumulativeValue where ID = @ID

FETCH NEXT FROM ffcursor
INTO @ID, @DeviceID, @Temperature, @CumulativeValue
END

CLOSE ffcursor
DEALLOCATE ffcursor

SET NOCOUNT OFF

GO



It updates the ActualValue column and produces the following results:



ID DeviceID LoadDate Temperature CumulativeValue ActualValue
1 125 2008-04-09 00:00:00.000 50 6 6
2 125 2008-04-10 00:00:00.000 50 12 6
3 125 2008-04-11 00:00:00.000 50 20 8
4 125 2008-04-12 00:00:00.000 50 42 22
5 125 2008-04-09 00:00:00.000 52 7 7
6 125 2008-04-10 00:00:00.000 52 15 8
7 125 2008-04-11 00:00:00.000 52 17 2
8 125 2008-04-12 00:00:00.000 52 22 5
9 135 2008-04-09 00:00:00.000 45 2 2
10 135 2008-04-10 00:00:00.000 45 4 2
11 135 2008-04-11 00:00:00.000 45 8 4
12 135 2008-04-12 00:00:00.000 45 14 6



and using a modification of Ryan's suggestion:


declare @FromDate datetime
declare @ToDate datetime
set @FromDate = '20080410'
set @ToDate = '20080412'

select
DeviceID,
Temperature,
sum(case when LoadDate between @FromDate and @ToDate then ActualValue else null end) as Value
from CumulativeData
group by DeviceID, Temperature
order by DeviceID, Temperature


it returns:


DeviceID Temp Value
125 50 36
125 52 15
135 45 12


--Steve
Go to Top of Page

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2008-04-16 : 17:46:45
I must have skipped past your question durring one ofmy busy days. Here is a solution for ya:
DECLARE @CumulativeData TABLE (
[ID] [int] NOT NULL PRIMARY KEY CLUSTERED,
[DeviceID] [int] NOT NULL,
[LoadDate] [datetime] NOT NULL,
[Temperature] [int] NOT NULL,
[CumulativeValue] [int] NOT NULL,
ActualValue INT NULL)

insert into @CumulativeData select 1, 125, '04/09/08', 50, 6, NULL
insert into @CumulativeData select 2, 125, '04/10/08', 50, 12, NULL
insert into @CumulativeData select 3, 125, '04/11/08', 50, 20, NULL
insert into @CumulativeData select 4, 125, '04/12/08', 50, 42, NULL
insert into @CumulativeData select 5, 125, '04/09/08', 52, 7, NULL
insert into @CumulativeData select 6, 125, '04/10/08', 52, 15, NULL
insert into @CumulativeData select 7, 125, '04/11/08', 52, 17, NULL
insert into @CumulativeData select 8, 125, '04/12/08', 52, 22, NULL
insert into @CumulativeData select 9, 135, '04/09/08', 45, 2, NULL
insert into @CumulativeData select 10, 135, '04/10/08', 45, 4, NULL
insert into @CumulativeData select 11, 135, '04/11/08', 45, 8, NULL
insert into @CumulativeData select 12, 135, '04/12/08', 45, 14, NULL

declare @FromDate datetime
declare @ToDate datetime
set @FromDate = '20080410'
set @ToDate = '20080412'



SELECT
DeviceID,
Temperature,
SUM(ActualValue) AS Val
FROM
(
SELECT
A.DeviceID,
A.Temperature,
MIN(B.CumulativeValue) - A.CumulativeValue AS ActualValue
FROM
@CumulativeData AS A
INNER JOIN
@CumulativeData AS B
ON A.DeviceID = B.DeviceID
AND A.Temperature = B.Temperature
AND A.LoadDate < B.LoadDate
WHERE
A.LoadDate >= @FromDate
OR A.LoadDate <= @ToDate
GROUP BY
A.DeviceID,
A.Temperature,
A.CumulativeValue
) AS T
GROUP BY
DeviceID,
Temperature
Go to Top of Page

sfortner
Yak Posting Veteran

63 Posts

Posted - 2008-04-17 : 09:46:08
Very nice and elegant, Lamprey. Not sure why I didn't think to at least use a self-join but the combination with the derived table was slick. Good job.

--Steve
Go to Top of Page
   

- Advertisement -