| Author |
Topic |
|
WindChaser
Posting Yak Master
225 Posts |
Posted - 2011-03-21 : 08:05:05
|
Hi folks,This is probably simple but I've never developped a query of this type and I'm clueless. Take the following data which only contains changes with respect to prior status:Address City State Zip Telephone RecordDate123 Main st. New York NY 12345 2127889999 March 5, 2011Null Null Null 14522 Null March 6, 2011Null Null Null Null 2127887777 March 7, 2011 The result set of the query for a current status should be:123 Main st. New York NY 14522 2127887777 I'm hoping there's a keyword that I don't know about which will do just that. Thanks for your help! |
|
|
DonAtWork
Master Smack Fu Yak Hacker
2167 Posts |
Posted - 2011-03-21 : 08:13:48
|
| strange that the record only has the changes, and not the value for all of the fields...http://weblogs.sqlteam.com/jeffs/archive/2008/05/13/question-needed-not-answer.aspxHow to ask: http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspxFor ultra basic questions, follow these links.http://www.sql-tutorial.net/ http://www.firstsql.com/tutor.htm http://www.w3schools.com/sql/default.asp |
 |
|
|
jimf
Master Smack Fu Yak Hacker
2875 Posts |
Posted - 2011-03-21 : 08:15:52
|
| Does the table at least have some kind of key that says that these records are related? Also, what is the data type of RecordDate?JimEveryday I learn something that somebody else already knew |
 |
|
|
WindChaser
Posting Yak Master
225 Posts |
Posted - 2011-03-21 : 08:30:57
|
I only posted a simple example of the data which would relate to a particular account. There is an ID key and an Account field. The record date is actually an Integer. The reason why the information is entered incrementally is to avoid the DB taking up a lot of space. My understanding is that a Null only takes 2 bytes in a VarChar field as opposed to repeating the unchanged information which would take up much more space.A more complete example would therefore be likeID Account Address City State Zip Telephone RecordDate1 1 123 Main st. New York NY 12345 2127889999 406052 1 Null Null Null 14522 Null 406063 1 Null Null Null Null 2127887777 40607 |
 |
|
|
WindChaser
Posting Yak Master
225 Posts |
Posted - 2011-03-23 : 13:17:11
|
| Hi again folks,So I guess there's no way to build the final result using a SQL query which rejects the NULL values. I guess I'll have to create a routine in Visual Studio to do it then. Thanks for looking! |
 |
|
|
jimf
Master Smack Fu Yak Hacker
2875 Posts |
Posted - 2011-03-23 : 13:23:06
|
| It may be impossible in SQL, as there are so many combinations to consider. This query would be really easy if you changed your table. You're not really saving any space doing it this way, especially if the old values aren't needed.JimEveryday I learn something that somebody else already knew |
 |
|
|
WindChaser
Posting Yak Master
225 Posts |
Posted - 2011-03-23 : 13:46:25
|
| Jim,OK, now we're getting into table design. Do you have any suggestions as to how best to restructure this potentially? |
 |
|
|
jimf
Master Smack Fu Yak Hacker
2875 Posts |
Posted - 2011-03-23 : 14:54:00
|
| Your design isn't bad, just populate all the fields when there is a change to any one of them. But if these updates are just corrections, i.e, the zip should have been 14522 all along, just update the table with correct info. You might want to split off phone number into its own table as many people have more than 1 phone number.JimEveryday I learn something that somebody else already knew |
 |
|
|
TimS
Posting Yak Master
198 Posts |
Posted - 2011-03-23 : 16:06:23
|
| SQL COALESCE examplehttp://www.sqlbook.com/SQL/SQL-Coalesce-28.aspxThis keyword should help solve this problem. Note: it is not enough to solve the problem by itself.On second thought, I am not sure it will work.This might work below for a single column; I have not done SQL in over 5 years.SELECT TOP 1 t.column FROM table twhere t.column is not nullorder by t.date descTim S. |
 |
|
|
WindChaser
Posting Yak Master
225 Posts |
Posted - 2011-03-23 : 19:45:35
|
| Thanks TimS, I had already looked into coalescing but I can't figure out how I might filter things properly.And thanks to jimf but I don't understand your premise. The storage size for varchar fields is the actual length of data entered + 2 bytes. So by definition, if all fields are constantly documented, it will take up a lot more space than just documenting alterations (unless SQL Server had AI and could do pattern recog to save the space, but my understanding is that's not the case). If there's a more efficient storage methodology which is best practice, I'd appreciate any input. Thanks everyone! |
 |
|
|
jimf
Master Smack Fu Yak Hacker
2875 Posts |
Posted - 2011-03-24 : 07:26:28
|
| You're really not saving that much space, though you may be saving some. However, that old data doesn't seem necessary and that's a waste of space and now you have a table that you can't really use, which is also a waste of space. Also, there is overhead associated with null value as sql still has to make room in case that column does get a value. I'm trying to find the article I read about that. If you're at the beginning of the design phase you have a chance to make it right the first time -- read up on Normalization and that's a start. JimEveryday I learn something that somebody else already knew |
 |
|
|
michael.appleton
Posting Yak Master
160 Posts |
Posted - 2011-03-24 : 07:28:43
|
I know where you're coming from as I've looked at storing data the same way you have before. The thing is storing temporal data in a relation database is a lot easier if you stop thinking about only optimising on space. Space is cheap and as you've found out the method you've picked is actually a lot harder to work with. Optimisation doesn't mean minimising everything, because that's actually impossible. By minimising space, you've actually harmed performance. Optimising is actually minimising the combination of all the different factors that constrain a database.I'd also add in another column called NextRecordDate which stores the date of the next record in that account like this: CREATE TABLE Accounts( Account int, Address varchar(1000), City varchar(255), State char(2), Zip char(5), Telephone char(10), RecordDate datetime, NextRecordDate datetime, Primary Key(Account,RecordDate), UNIQUE(Account,NextRecordDate)) With the data stored like this:Account Address City State Zip Telephone RecordDate NextRecordDate1 123 Main st. New York NY 12345 2127889999 20110305 201103061 123 Main st. New York NY 14522 2127889999 20110306 201103071 123 Main st. New York NY 14522 2127887777 20110307 99991231 Where there is no next record I store a date far in the future. This could be argued as a waste of space as the data is stored in two place in the database, but it makes things so much easier. To get a version of the data at any point in time the query is simply:DECLARE @Date datetimeSET @Date='20110306' --or whatever date you likeSELECT * FROM Accounts WHERE RecordDate<=@Date AND NextRecordDate>@Date Then you'll get just one row per account.Hope this all helps. |
 |
|
|
michael.appleton
Posting Yak Master
160 Posts |
Posted - 2011-03-24 : 07:32:31
|
| Check out my thinking a few years ago:http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=102811 |
 |
|
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2011-03-24 : 07:49:06
|
| If I had a choice I would design it slightly differently. I would keep the latest data in the main table, and the historical/audit data in a separate audit table, which has the same columns along with two or three additional columns to indicate when the change happened, who changed it etc.The advantage of doing it this way are:a) easy to query the latest data. You wouldn't need to coalesce etc. to get it.b) the audit table can be populated via a trigger - so less room for client code to incorrectly enter updates.c) you can keep track of even deletions from the table etc.[Getting off from her soap box now :--)]Having said all that, here is a way in which you can coalesce your existing table to get the data you want.[code][/code |
 |
|
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2011-03-24 : 08:04:44
|
My previous post somehow got messed up. This is the code that is missing from that post.create table #tmp ( ID int, Account int, [Address] varchar(255), City varchar(255), [State] varchar(255), Zip varchar(255), Telephone varchar(255), RecordDate datetime);insert into #tmp values (1,1,'123 Main st.','New York','NY','12345','2127889999','20110101');insert into #tmp values (2,1,Null,Null,Null,'14522',Null,'20110203');insert into #tmp values (3,1,Null,Null,Null,Null,'2127887777','20110207');insert into #tmp values (1,2,'123 Washington st.','Stamford','CT','06901','2035551212','20110401');insert into #tmp values (1,3,'123 Lincoln st.','Elm Grove','WI','53122','2625551212','20100101');insert into #tmp values (2,3,'128 Lincoln st.',null,'WI','53122','2625551213','20100101');with cte1 as( select Id,Account,[Address],City,[State],Zip,Telephone,RecordDate from #tmp where Id = 1 union all select a.Id, a.Account, coalesce(a.[Address],b.[Address]), coalesce(a.City,b.City), coalesce(a.[State],b.[State]), coalesce(a.[Zip],b.[Zip]), coalesce(a.Telephone,b.Telephone), coalesce(a.RecordDate,b.RecordDate) from #tmp a inner join cte1 b on a.id = b.id+1 and a.account= b.account),cte2 as( select *,row_number() over (partition by [Account] order by Id desc)rn from cte1)select * from cte2 where rn = 1 drop table #tmp; |
 |
|
|
WindChaser
Posting Yak Master
225 Posts |
Posted - 2011-03-24 : 08:14:04
|
| Jimf, the null takes 2 bytes of overhead, which is of course necessary and acceptable. But repeating the unchanged data from line to line not only takes up more space but also involves having to do a comparison between the lines in order to determine what the differences are. This DB design is interesting because of those reasons... if it can be managed easily of course...Michael, thanks for that advice, and I'll read up on the extra topic that you provided. Actually, I do have both a "latest status" table which provides the current address set along with the table that describes incremental change. The issue is when the user wants to see who/what changed on a prior date and that's when data compilation is required of course. This is to meet regulatory requirements which is forcing the issue onto the design. |
 |
|
|
WindChaser
Posting Yak Master
225 Posts |
Posted - 2011-03-24 : 08:19:16
|
| sunitabeck, I'll go run the query and let you know. Thanks! |
 |
|
|
michael.appleton
Posting Yak Master
160 Posts |
Posted - 2011-03-24 : 10:52:52
|
| Okay seen as everyone's talking temporal data, can anyone point me to something that talks about the best way to set up indexing and primary/foreign keys? I've got several temporal data storing tables that are related to each other and figuring out how to manage keys and indexes seems quite difficult. I hope this isn't hijacking your thread but it seems relevant! |
 |
|
|
yosiasz
Master Smack Fu Yak Hacker
1635 Posts |
Posted - 2011-03-24 : 11:12:26
|
why are you keeping incremental changes in the same table? I would suggest an Accounts_audit table that gets populated with the incremental changes. Accounts stays as is with the current values or last values.so what you would have is123 Main st. New York NY 14522 2127887777 If you don't have the passion to help people, you have no passion |
 |
|
|
WindChaser
Posting Yak Master
225 Posts |
Posted - 2011-03-24 : 12:08:39
|
| sunitabeck, I'm reporting back that your query works like an absolute charm. It's a thing of beauty! Thank!!!Michael, I'm not a moderator but I suppose that it would be best to start a new thread in order to enable easier future referencing. I myself love to search for answers and usually find what I need without posting... as long as the threads don't become too confusing.Yosiasz, as I said earlier, I do have a Status table and an "Audits" table. But when comes the time to compile the incremental data to produce a report for the audit, this is what this query is for. |
 |
|
|
Sqlraider
Yak Posting Veteran
65 Posts |
Posted - 2011-03-24 : 12:32:04
|
| FYI,another way would be to use MERGE with OUTPUT. You can put the OUTPUT into an audit table, in it you can put what Action (insert/update), the date the Action took place plus what the record looked like BEFORE & AFTER the changes where made.You're orginal table would have what the account currently looks like & you could query the OUTPUT audit table to show any & all changes made to an account. |
 |
|
|
Next Page
|