Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 Bad Table Design?


gavinharriss
Starting Member

17 Posts

Posted - 2008-09-28 : 21:27:27
Hi,

I have a table I use to track ads and to allow them to be rotated on a user-by-user basis:

CREATE TABLE [dbo].[StatsAdvertHit](
[AdvertID] [uniqueidentifier] NOT NULL,
[AdvertZoneID] [int] NOT NULL,
[UserName] [nvarchar](256) NOT NULL,
[UserHostAddress] [varchar](15) NOT NULL,
[Crawler] [bit] NOT NULL,
[TimestampUtc] [datetime] NOT NULL
) ON [PRIMARY]

and it has the following index:

CREATE NONCLUSTERED INDEX [IX_StatsAdvertHit] ON [dbo].[StatsAdvertHit]
(
[AdvertZoneID] ASC,
[UserName] ASC,
[TimestampUtc] DESC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

As the site gets pretty heavy traffic, this stats table grows to many millions of rows fairly quickly.

Every now and then I archive old stats to another database and delete them from the live table. However, I find that the whole site quickly grinds to a halt while I'm deleting the archived records. I assume this is because a lot of data is being added and read while the delete is taking place.

I've tried to reduce the impact by structuring and running the delete in various ways, such as:

DECLARE @RowCount int
WHILE 1 = 1
BEGIN
    BEGIN TRANSACTION
        -- limit each batch to 1000 rows
        SET ROWCOUNT 1000
        DELETE FROM StatsAdvertHit
        WHERE TimestampUtc <= CAST('2008-09-26 01:58:54.483' AS datetime)
        SET @RowCount = @@ROWCOUNT
    COMMIT TRANSACTION
    -- give other queries a chance between batches
    WAITFOR DELAY '00:00:05'
    IF @RowCount = 0 BREAK
END
SET ROWCOUNT 0  -- restore the default for this session
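
A variant of the same loop using the TOP clause on DELETE (new in SQL Server 2005; SET ROWCOUNT for data modification statements is deprecated). This is a sketch only, not tested against the live table:

DECLARE @RowCount int
SET @RowCount = 1
WHILE @RowCount > 0
BEGIN
    -- TOP on DELETE replaces SET ROWCOUNT for limiting a DML batch
    DELETE TOP (1000) FROM StatsAdvertHit
    WHERE TimestampUtc <= CAST('2008-09-26 01:58:54.483' AS datetime)
    SET @RowCount = @@ROWCOUNT
    -- pause between batches so other queries can get a look in
    IF @RowCount > 0 WAITFOR DELAY '00:00:05'
END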

But I'm still pulling the site down. Even after I cancel the query, it takes a silly amount of time before SQL Server recovers.

Anyone have any ideas to get around this problem? Any help would be greatly appreciated. Cheers.

P.S. In case it helps: SQL Server 2005 Workgroup Edition on Windows 2003 with 2 GB RAM.

www.gavinharriss.com

LoztInSpace
Aged Yak Warrior

940 Posts

Posted - 2008-09-28 : 22:17:39
a) Creating a clustered index on your timestamp column should help. Your current index is worse than useless for this particular operation.
b) Don't delete the rows, and/or consider partitioning.
b.5) Maybe a clustered index/partition combined with an indexed (materialized) view "SELECT ... FROM StatsAdvertHit WHERE TimestampUtc <= GETDATE() - xxxx" to constrain the table for most operations.
c) Turn on row versioning (snapshot isolation).
d) Delete all in one go, not this top-1000 rubbish.
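
A sketch of (a), assuming the archiving delete always filters on TimestampUtc. Note that a clustered index does not have to be unique, so the column can be used directly:

-- physically orders the rows by timestamp, so a range delete
-- touches a contiguous run of pages instead of scanning a heap
CREATE CLUSTERED INDEX CIX_StatsAdvertHit_TimestampUtc
ON dbo.StatsAdvertHit (TimestampUtc ASC)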

gavinharriss
Starting Member

17 Posts

Posted - 2008-10-07 : 15:24:59
Hi LoztInSpace - thanks for the input.

Replies to suggestions...

a) The original index is used for day-to-day searching of the table rather than for archiving.
b) I was trying to reduce hardware needs by removing old data that's no longer used day to day but is still useful for future reporting. I figured smaller tables would be good for SQL Server performance in general.
c) I've not come across row versioning before, so I'll look into this.
d) Deleting in blocks of 1000 rows at a time definitely helps, as deleting all in one go immediately degrades performance on the server to unusable levels.
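
For (c), row versioning is enabled per database. A sketch, with a hypothetical database name; the ALTER only succeeds when no other connections are using the database:

-- readers see the last committed version of a row instead of
-- blocking behind the locks held by the archiving delete
ALTER DATABASE StatsDb SET READ_COMMITTED_SNAPSHOT ON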

Since starting this thread we've actually come to the conclusion that it's the hosting company's stack at fault (probably disk contention issues). The website is currently hosted with GoGrid.com, and we've started to experience more and more problems all over the place, not just with the database server. I think their cloud still needs some work!

Cheers again,
Gavin.

www.gavinharriss.com

sodeep
Master Smack Fu Yak Hacker

7174 Posts

Posted - 2008-10-07 : 15:45:01
1) You need a clustered index on that huge table, as LoztInSpace suggested. Without one the table is a heap, and you can't defrag a heap.
2) Also consider table partitioning on the basis of the timestamp column for archiving; then you won't have to delete the way you are doing now. I'm not sure Workgroup Edition supports it, though, so check that.
3) Regularly defrag your fragmented indexes.
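
A sketch of (2), with hypothetical monthly boundary values. Table partitioning in SQL Server 2005 requires Enterprise Edition, so it is not available on Workgroup:

CREATE PARTITION FUNCTION pfStatsMonth (datetime)
AS RANGE RIGHT FOR VALUES ('2008-09-01', '2008-10-01')

CREATE PARTITION SCHEME psStatsMonth
AS PARTITION pfStatsMonth ALL TO ([PRIMARY])

-- once the table is rebuilt on psStatsMonth, an old month can be archived
-- as a near-instant metadata operation (the target table must have an
-- identical structure on the same filegroup):
-- ALTER TABLE dbo.StatsAdvertHit SWITCH PARTITION 1 TO dbo.StatsAdvertHit_Archive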

gavinharriss
Starting Member

17 Posts

Posted - 2008-10-07 : 16:42:09
Hi sodeep, cheers for that - I was unaware of the potential problems of not having a clustered index. I'll look at introducing a uniqueidentifier as a unique key so I can add a clustered index on it. I can't use the timestamp on its own, as that could obviously contain duplicate values. And I remember reading in the past that an identity field isn't always reliable in SQL Server in this scenario?
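
One note before adding a surrogate key: a clustered index does not have to be unique (SQL Server adds a hidden uniquifier to duplicate key values), so TimestampUtc could be clustered as-is. If a separate key is still wanted, an IDENTITY column is the usual choice; a random uniqueidentifier from NEWID() fragments a clustered index badly, though NEWSEQUENTIALID() avoids that. A sketch, with a hypothetical column name:

-- IDENTITY gives an ever-increasing key, so new rows always append
ALTER TABLE dbo.StatsAdvertHit
ADD StatsAdvertHitID int IDENTITY(1,1) NOT NULL

CREATE UNIQUE CLUSTERED INDEX CIX_StatsAdvertHit_ID
ON dbo.StatsAdvertHit (StatsAdvertHitID)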

For archiving we're actually removing the records physically from the server to reduce space consumption and keep the database backups small enough to FTP easily, so I'm guessing table partitioning isn't going to help with that? It's an interesting feature I'd not explored before, though, so thanks for the food for thought.

www.gavinharriss.com
   
