Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 Database Design and Application Architecture
 Historical Reporting on Changing Data

Author  Topic 

m_k_s@hotmail.com
Insecure what ??

38 Posts

Posted - 2008-04-27 : 14:30:45
I've got a customer who wants reproducible/historical reporting. The problem is that the underlying data changes.

I tried to explain that this can't be done (can it?), but he doesn't
understand.

To illustrate the situation - Let's say a teacher wants to track
spelling test scores for her students.
The below are scores for students A, B, and C (for January, February, March)

A: {70,80,85}
B: {70,65, 80}
C: {100,90,100}

So, I can generate a historical report that charts the class average
and student trend - that's pretty easy.

Now, in April, we find that the school board has mandated that the
British spelling of words is ok, so now the cumulative scores (for
January, February, March, April)

A: {90,80,85,100}
B: {80,65, 80,80}
C: {100,90,100,75}

He wants a report showing the January average as (70+70+100)/3 = 80,
when really it is (90+80+100)/3 = 90.

Now imagine that there are actually thousands of data points changing like this...
Now also imagine that we add and remove students on a regular basis...

He and his office manager get frustrated when I explain that the
reports are not simple - in their mind it is. They have determined
the solution is to get a report writer and buy Crystal Reports...
I've tried to explain that the problem is that the report
specification is unclear (basically - they don't understand what they want). The situation is ok for now, I'm just trying to plan for when they figure out that buying Crystal Reports won't change their situation (except they are done several thousand dollars)...

Any tips?

rmiao
Master Smack Fu Yak Hacker

7266 Posts

Posted - 2008-04-27 : 15:17:43
No way if don't save current values to auditing table before changing them.
Go to Top of Page

m_k_s@hotmail.com
Insecure what ??

38 Posts

Posted - 2008-04-27 : 15:26:59
quote:
Originally posted by rmiao

No way if don't save current values to auditing table before changing them.



I've actually got an audit table - the problem is that the score can change a variable number of times in any timeframe... Having the audit here let's me track who/what/when/where/why, but it doesn't let me do magic reporting...

That 70 could go to 80, 75, 90, 100, then to 90 again... What the hell does a history report mean to them?
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2008-04-27 : 18:16:11
The audit tables you would need would have store complete copies of the production data, along with a timestamp, for every change that occurs (inserts, updates, and deletes). Then you could write queries that would show the status of the records at any point in time, allowing you to perform the type of reporting your customer wants. "Show me the January average as of March" would then return different data than "Show me the January average as of April".

e4 d5 xd5 Nf6
Go to Top of Page

m_k_s@hotmail.com
Insecure what ??

38 Posts

Posted - 2008-04-28 : 21:29:40
quote:
Originally posted by blindman

The audit tables you would need would have store complete copies of the production data, along with a timestamp, for every change that occurs (inserts, updates, and deletes). Then you could write queries that would show the status of the records at any point in time, allowing you to perform the type of reporting your customer wants. "Show me the January average as of March" would then return different data than "Show me the January average as of April".

e4 d5 xd5 Nf6



The above does not work...

If I want a history of averages for Jan-Dec and run the report in Dec. The plot point for a specific item could have between 1 and n (40) different values. Which one of those values should be used? If you are going to have a timestamp, you'd need to do specify a timestamp for each item - you could not use a single datetime value for the entire report.
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2008-04-28 : 22:27:32
No, you specify a single "asof" date. Then you query the archive database for the latest copy of each record prior to the asof date. From that, you can produce whatever report you want as of any specified datetime.

e4 d5 xd5 Nf6
Go to Top of Page

m_k_s@hotmail.com
Insecure what ??

38 Posts

Posted - 2008-04-28 : 23:05:27
quote:
Originally posted by blindman

No, you specify a single "asof" date. Then you query the archive database for the latest copy of each record prior to the asof date. From that, you can produce whatever report you want as of any specified datetime.

e4 d5 xd5 Nf6



LOL - are you my client?

example

Test_ID 100, score 70 on 1/1/2008 08:00
Test_ID 100, score 75 on 1/1/2008 10:00
Test_ID 100, score 65 on 1/5/2008 12:00
Test_ID 100, score 80 on 1/30/2008 12:00
Test_ID 100, score 85 on 5/5/2008 12:00

Test_ID 101, score 90 on 5/1/2008 08:00

User runs report on 1/31/2008 (asof 1/31/2008) - sees value is 80

User runs report on 6/1/2008 (asof 1/31/2008) - wants trend up to 6/1/2008. How is the value from 5/1 going to be displayed if I set the asof to 1/31/2008?

Now imagine that there are tens of thousands of these fluctuating data points. Are you gonna set an asof for each record?

I appreciate your time in replying, but you're not seeing the problem.
Go to Top of Page

LoztInSpace
Aged Yak Warrior

940 Posts

Posted - 2008-04-29 : 00:44:47
He means that asof is a parameter for your query. You then use that to filter the data using that parameter against the date of the score on your archive table. It's quite straightforward.
If you want the trend up to 6/1/2008 then your asof parameter needs to be that. Why would you set asof 1/31/2008 if you wanted trend up to 6/1/2008?
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2008-04-29 : 01:05:56
LOL - lucky for you I am not.

e4 d5 xd5 Nf6
Go to Top of Page

m_k_s@hotmail.com
Insecure what ??

38 Posts

Posted - 2008-04-29 : 08:05:05
quote:
Originally posted by LoztInSpace

He means that asof is a parameter for your query. You then use that to filter the data using that parameter against the date of the score on your archive table. It's quite straightforward.
If you want the trend up to 6/1/2008 then your asof parameter needs to be that. Why would you set asof 1/31/2008 if you wanted trend up to 6/1/2008?



You are starting to see the problem... The client wants a history trend from an 1/2008- 12/2008 range, while wanting to know what the report looked like on 1/31 because he happened to run it on that day.

It is not as simple as, the last value is the desired one. He wants the value that he generated on a prior report (which is pretty random).

To report using blindman's suggestion would require a date parameter for each data point...

Anyway, the closest thing to a solution is to make a report table that stores the values on a periodic basis and to use the tallied/stored values to generate history reports... It doesn't quite fit the bill either, but sheesh no one here even comes up with that...
Anyway -
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2008-04-29 : 10:03:04
What do you mean by this:

quote:

To report using blindman's suggestion would require a date parameter for each data point...



? I don't follow that. You don't need a "date parameter for each data point", you just need an "as of date" parameter, which you need by definition if you want to do point in time reporting.

All you need to do is get the last change BEFORE the "as of" date, and use return that row in your final report. Will that not work for you?

- Jeff
http://weblogs.sqlteam.com/JeffS
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-04-29 : 10:06:14
No. The poster is not satisfied with either response time nor quality of responses
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=101890



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

m_k_s@hotmail.com
Insecure what ??

38 Posts

Posted - 2008-04-29 : 10:20:39
quote:
Originally posted by jsmith8858

What do you mean by this:

quote:

To report using blindman's suggestion would require a date parameter for each data point...



? I don't follow that. You don't need a "date parameter for each data point", you just need an "as of date" parameter, which you need by definition if you want to do point in time reporting.

All you need to do is get the last change BEFORE the "as of" date, and use return that row in your final report. Will that not work for you?

- Jeff
http://weblogs.sqlteam.com/JeffS




To simplify the request:
* Let's say the customer wants the report run with the most recent data values for 1/2008-6/2008 in 7/2008. OK - that's easy.
* Let's say the customer wants the report run with the state of all values as of 3/2008 for 3/2008. OK - that's easy too (this is also what you are suggesting I think)


Now what the customer really wants:
* He runs a report on 1/29/2008 that shows data points up to that date
* He runs a report on 2/28/2008 that shows data points up to that date
* He runs a report on 3/20/2008 that shows data points up to that date
* Now all the data changes
* He runs a report on 4/30/2008 and wants to see the same results he saw from 1/29, 2/28 and 3/20.
* Before you say again - you just need an asof date - think about the implications of the other data points with a single asof date.

* Now imagine that there reports get run much more frequently than once a month.
* Now imagine that the data changes constantly.

Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-04-29 : 10:30:17
quote:
* He runs a report on 1/29/2008 that shows data points up to that date

Which means all records dated until this moment should be in the resultset.
quote:
* He runs a report on 2/28/2008 that shows data points up to that date

Which means all records dated until this moment should be in the resultset.
quote:
* He runs a report on 3/20/2008 that shows data points up to that date

Which means all records dated until this moment should be in the resultset.
quote:
* Now all the data changes

Of course, the data is not static.
quote:
* He runs a report on 4/30/2008 and wants to see the same results he saw from 1/29, 2/28 and 3/20.

All together or individually?
Use the use provided date as "asof" and include same records again until the provided "asof" moment.

This will work.
Unless you make historical changes? I mean, you can change data back in time and edit them?



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

m_k_s@hotmail.com
Insecure what ??

38 Posts

Posted - 2008-04-29 : 10:39:08
quote:
Originally posted by Peso

quote:
* He runs a report on 1/29/2008 that shows data points up to that date

Which means all records dated until this moment should be in the resultset.
quote:
* He runs a report on 2/28/2008 that shows data points up to that date

Which means all records dated until this moment should be in the resultset.
quote:
* He runs a report on 3/20/2008 that shows data points up to that date

Which means all records dated until this moment should be in the resultset.
quote:
* Now all the data changes

Of course, the data is not static.
quote:
* He runs a report on 4/30/2008 and wants to see the same results he saw from 1/29, 2/28 and 3/20.

All together or individually?
Use the use provided date as "asof" and include same records again until the provided "asof" moment.

This will work.
Unless you make historical changes? I mean, you can change data back in time and edit them?



E 12°55'05.25"
N 56°04'39.16"




This will work.
Unless you make historical changes? I mean, you can change data back in time and edit them?


EXACTLY Peso - now you see my problem. The customer makes historical changes!
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-04-29 : 11:02:19
Add a column named OriginalValue to the table, so that you have "CurrentValue" and "OriginalValue".
Then you only have to decide when to use CurrentValue and when to use OriginalValue.

This will make the queries consistent and deliver same records over and over again.

I hope this answer is satisfying, because you never ever before wrote that customers can make historical changes, only "in any timeframe...".
This can mean anything and I think most people here offering help interpreted that as add and edit contemporary records.



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2008-04-29 : 11:07:14
Sounds like you simply need a "modified_date" column in your table, and THAT date is the one you use for your "as of" reporting.

Otherwise, things are not making sense. If you have:


TranDate ModifiedDate Value
1/1/2008 1/1/2008 $100
2/1/2008 2/1/2008 $200
3/1/2008 3/1/2008 $300
2/1/2008 3/2/2008 $400


And you want the value "as of" 3/2/2008, what value should be returned? $300 or $400?

That is, are you using the TRANSACTION DATE for the row to determine this, or the date the the data was entered? I suspect you will answer "both"; if so, then you need to clearly describe what you want and how that should work, because it doesn't make any sense. It is not a SQL issue you are having, but a logic issue.

- Jeff
http://weblogs.sqlteam.com/JeffS
Go to Top of Page

m_k_s@hotmail.com
Insecure what ??

38 Posts

Posted - 2008-04-29 : 11:14:38
quote:
Originally posted by Peso

Add a column named OriginalValue to the table, so that you have "CurrentValue" and "OriginalValue".
Then you only have to decide when to use CurrentValue and when to use OriginalValue.

This will make the queries consistent and deliver same records over and over again.

I hope this answer is satisfying, because you never ever before wrote that customers can make historical changes, only "in any timeframe...".
This can mean anything and I think most people here offering help interpreted that as add and edit contemporary records.



E 12°55'05.25"
N 56°04'39.16"




A single data point can change a variable number of times at any point in time... There can be any number of changes to the value, even after a history report has been generated.

The original problem description states that the underlying data changes.

LOL - PESO - thanks for referring a student to me for a question. That was pretty funny.
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2008-04-29 : 11:15:36
(I edited my last post, so please double-check to make sure you didn't miss the update)

- Jeff
http://weblogs.sqlteam.com/JeffS
Go to Top of Page

m_k_s@hotmail.com
Insecure what ??

38 Posts

Posted - 2008-04-29 : 11:19:06
quote:
Originally posted by jsmith8858

Sounds like you simply need a "modified_date" column in your table, and THAT date is the one you use for your "as of" reporting.

Otherwise, things are not making sense. If you have:


TranDate ModifiedDate Value
1/1/2008 1/1/2008 $100
2/1/2008 2/1/2008 $200
3/1/2008 3/1/2008 $300
2/1/2008 3/2/2008 $400


And you want the value "as of" 3/2/2008, what value should be returned? $300 or $400?

That is, are you using the TRANSACTION DATE for the row to determine this, or the date the the data was entered? I suspect you will answer "both"; if so, then you need to clearly describe what you want and how that should work, because it doesn't make any sense. It is not a SQL issue you are having, but a logic issue.

- Jeff
http://weblogs.sqlteam.com/JeffS




Hey Jeff,

You are preaching to the choir... This is not my spec, this is my customer's request. I'm trying to explain why it is unreasonable...

btw - the value that should be returned - depends on when a report happened to run (and can change...)

Thanks
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2008-04-29 : 11:23:43
So, I hope you can see there is no SQL we can show you or a database design we can recommend to help you if you can't tell us how things are supposed to work.

It is not that the spec is unreasonable, it is just it doesn't exist. You just need to work with them to find out exactly what they want and lay out the scenarios. How can you say that their spec is "unreasonable" if it isn't defined?

- Jeff
http://weblogs.sqlteam.com/JeffS
Go to Top of Page
    Next Page

- Advertisement -