Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 Database Design and Application Architecture
 Heap or Identity

Author  Topic 

tfountain
Constraint Violating Yak Guru

491 Posts

Posted - 2007-11-21 : 12:33:02
SQL Server 2005

We have a table that simply stores all changes to a specific record in another table and can get very large. The relationship is such there are many records in the transaction table for each record in the parent table. How many depends on how many times the record has been updated and can contain multiple entries for each column.

The transaction table contains a clustered index over a column that is defined as a UNIQUEIDENTIFIER. There are other indexes as well over business fields (basically foreign key columns). This obviously has some performance implications and becomes fragmented very quickly during heavy loads. Then as expected, the performance issues cascade to queries, etc.

Anyway, we are looking at two options - (a) removing the clustered index altogether and treat it as a heap or (b) add another column defined as identity (1,1) and make that the clustered index.

My initial research tells me the heap is not the way to go as there still may be performance issues with it. Using the second option guarantees me that all new data is always added to the end and will minimize the fragmentation. Keep in mind we do have regular maintenance jobs to rebuild / reorganize indexes and lob data.

Can anyone shed their thoughts on these two options for this situation?

cr8nk
Yak Posting Veteran

77 Posts

Posted - 2007-11-21 : 12:43:34
How are you querying this transaction table? An identity column would definately reduce your level of fragmentation. Is this unique identifier the primary key of your parent table or simply added as a surrogate key to your history table?
Go to Top of Page

tfountain
Constraint Violating Yak Guru

491 Posts

Posted - 2007-11-21 : 13:43:41
The queries against this transaction table are over the nonclustered indexes that are foreign keys to other tables (including the parent table). The clustered index is basically invisible to anything outside this table and is not used or referenced programmatically. So in summary, it's just a surrogate key to the history table.

As far as your comment regarding the identity reducing the level of fragmentation, would a heap achieve the same result? I'm not familiar with how SQL Server handles heaps. Or are there enough fundamental differences in how SQL Server handles these two concepts in terms of optimizating and querying that one approach is better than the other?
Go to Top of Page

cr8nk
Yak Posting Veteran

77 Posts

Posted - 2007-11-21 : 15:02:10
No a heap would not achieve the same result, it has nothing to enforce that data is stored contigously (sp?). It also cannot use index order scans. It is usually a best practice to have a clustered index on every table, however, you will not encounter page splits when using a heap table which you might have been experiencing when writing to a table with a uniqueidentifier as the primary key. You should experiment with dropping the uniqueid primary key field if it is not used and replacing it with a clustered index on a integer identity field.

Here is a good article to read on the subject.

http://technet.microsoft.com/en-us/library/aa964133.aspx
Go to Top of Page

tfountain
Constraint Violating Yak Guru

491 Posts

Posted - 2007-11-21 : 15:45:26
cr8nk, thanks for the help.
Go to Top of Page

pootle_flump

1064 Posts

Posted - 2007-11-22 : 11:14:15
For insert speed use the identity:
http://www.sqlskills.com/blogs/kimberly/2007/02/02/TheClusteredIndexDebateContinues.aspx
If your rows get updated loads (i.e. made larger on variable length columns) then you will need to consider fillfactor. If not, go for 100. If you have lots of deletions then you need to consider page density and keep routinely defragmenting the index (note you'll need to REBUILD\ DBCC DBREINDEX to sort out page density).

HTH
Go to Top of Page
   

- Advertisement -