Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 Maintaining History Table

Author  Topic 

dineshrajan_it
Posting Yak Master

217 Posts

Posted - 2009-04-21 : 06:45:28
Hi ,

After lots of digging myself in retaining old values somewhere in db, i have come to conclusion to use history tables to each table. so when there is a record updated or deleted in main table, Previous records get appended to the history table. My question is

1) shall i go for a single History table for entire DB or create history table for each main table. Performance?

2) how do i go for structure of history table -> for each table concept
historyid,userid,updateddate,time,rest of main table columns.
is this correct format.

Thanks in Advance

whitefang
Enterprise-Level Plonker Who's Not Wrong

272 Posts

Posted - 2009-04-21 : 07:07:35
I'd use one history table for each database with the following columns:
Id, TargetTable, ModifierId, RowData, ModifierDate, CommandType

Where ModifierId is the user who invoked the modification of the row, and RowData is a XML data column containing the whole row compiled into an XML format. The CommandType is either Update or Delete.
Go to Top of Page

Transact Charlie
Master Smack Fu Yak Hacker

3451 Posts

Posted - 2009-04-21 : 07:17:23
If you aren't going to lookup this table often you may want to implement it as a HEAP. No keys, no indexes: that will minimise overhead on your writes.

If you need to reference the table for 2d time though then don't do that!


Charlie
===============================================================
Msg 3903, Level 16, State 1, Line 1736
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION
Go to Top of Page

dineshrajan_it
Posting Yak Master

217 Posts

Posted - 2009-04-21 : 08:08:12
hi,

do u mean to avoid indexing in history table.
Go to Top of Page

dineshrajan_it
Posting Yak Master

217 Posts

Posted - 2009-04-21 : 08:10:04
hi whitefang,

how was ur historytable performance when u used xml datatype for storing previous records as xml.
Go to Top of Page

pootle_flump

1064 Posts

Posted - 2009-04-21 : 08:12:37
quote:
Originally posted by Transact Charlie

If you aren't going to lookup this table often you may want to implement it as a HEAP. No keys, no indexes: that will minimise overhead on your writes.

Actually, the overhead writing to heaps is greater than the overhead writing to a monotonically increasing clustered index.
Go to Top of Page

pootle_flump

1064 Posts

Posted - 2009-04-21 : 08:17:54
I prefer one history table per table.
Go to Top of Page

whitefang
Enterprise-Level Plonker Who's Not Wrong

272 Posts

Posted - 2009-04-21 : 08:21:28
quote:
Originally posted by dineshrajan_it

hi whitefang,

how was ur historytable performance when u used xml datatype for storing previous records as xml.



Instead of storing as XML, we stored table column data into a concatenated string column. I prefer using XML because it is easily serialized and deserialized as well as having query abilities.

Performance is fine unless you have like over 10 million rows. That is when you should consider archiving.
Go to Top of Page

dineshrajan_it
Posting Yak Master

217 Posts

Posted - 2009-04-21 : 08:27:44
Hi whitefang,

i have heard most of them telling me to go for 1 history table per main table. you are the one quite telling something diff like going for a single history table and storing the previous records as xml format. i like ur approach. but why is it others not accepting. so, is there anything that troubles that part of yours.

thanks in advance
Go to Top of Page

whitefang
Enterprise-Level Plonker Who's Not Wrong

272 Posts

Posted - 2009-04-21 : 08:33:21
Having one history table per main table is the worst nightmare of a design and increases the maintainability and bulk of database massively. It increases the overhead, cost, maintenance, time, and bugs. The engine itself has to keep track of those tables and statistics (if you implement them). Imagine implementing the changes across multiple environments (development, testing, staging, production). It's not even practical in a enterprise environment. Any developer who recommends history table for each table is FIRED ON the spot.

With your design, also think about the changes in the application layer. You'd have to write EACH "backup" query because they use different tables. It's just not practical.
Go to Top of Page

pootle_flump

1064 Posts

Posted - 2009-04-21 : 08:40:54
Lol.
Go to Top of Page

Transact Charlie
Master Smack Fu Yak Hacker

3451 Posts

Posted - 2009-04-21 : 09:29:12
quote:
Originally posted by pootle_flump

quote:
Originally posted by Transact Charlie

If you aren't going to lookup this table often you may want to implement it as a HEAP. No keys, no indexes: that will minimise overhead on your writes.

Actually, the overhead writing to heaps is greater than the overhead writing to a monotonically increasing clustered index.


Really? How can that be right (not insulting you -- just looking for more information)? Surely if you have no clustered index you don't have the overhead of maintaining one. If there are only going to be INSERTS into the table then I thought the HEAP was the way to go.

(NB -- This was probably the *only* scenario that I thought a HEAP would be a valid choice for)

I'd like to hear some reasons for the many tables approach. Dineshrajan_it hasn't really specified but I'm assuming that the log tables aren't going to be used often (or at all) so why the (unnecessary?) overhead of maintaining multiple tables. 1 table seems much simpler to integrate into the existing design and it sounds like performance (for reads) is irrelevant.


Charlie
===============================================================
Msg 3903, Level 16, State 1, Line 1736
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION
Go to Top of Page

pootle_flump

1064 Posts

Posted - 2009-04-21 : 09:56:03
I can explain, and provide references.

When populating a heap, the DB Engine looks to place new pages as close to the beginning of the file as possible. As such, there is an overhead of searching for the page. If it just dumped the data in the first page it found then it might be quicker. For a monotonically increasing clustered index, any new page is logically consecutive to the last, so the DB Engine attempts to put it as close to last page as possible. As such, it has to look at far less pages on average before finding a suitable next page.
Annoyingly I can't find the post where Paul Randall concuirs.
http://www.sqlskills.com/blogs/kimberly/post/The-Clustered-Index-Debate-Continues.aspx

Anyhoo - a heap could be useful if the CI B-Tree is enormous (i.e. table and\ or index huge) and the table is subject to loads and loads of lookups and nothing else. Or if it is teeny (single page) and it is cheaper to scan a heap than traverse the B-Tree (root and leaf level), even when returning a single row.
Go to Top of Page

Transact Charlie
Master Smack Fu Yak Hacker

3451 Posts

Posted - 2009-04-21 : 10:50:21
Cheers. Interesting article. I was under the (obviously incorrect) impression that inserting to a heap just dumped data to the first free page it found -- kinda like writing a file to a FAT volume.

I retract my suggestion re the heap.

Dineshrajan_It -- you should probably *not* use a heap for this (or in fact for anything really).





Charlie
===============================================================
Msg 3903, Level 16, State 1, Line 1736
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION
Go to Top of Page

pootle_flump

1064 Posts

Posted - 2009-04-22 : 03:23:11
Isn't it pleasant when professionals can respectfully share information without resorting to childish insults?
Go to Top of Page

dineshrajan_it
Posting Yak Master

217 Posts

Posted - 2009-04-22 : 04:28:57
thanks for overwhelming response guys,

I have considered using Service Broker as suggested by [spirit] for this History table.
this history table will be in separate DB and will be communicating with main db when there is insert, update, delete operations going on for tables. i think its a good approach.
Go to Top of Page
   

- Advertisement -