Performance Creating clustered index

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

Old Forums

CLOSED - General SQL Server

Performance Creating clustered index

Author

Topic

ACALVETT
Starting Member

34 Posts

Posted - 2005-09-29 : 07:16:36

Hi Guys,

I have a SQL 2000 sp3a server on Windows 2000 sp4. Running dual proc server with hyper threading enabled, 3gb memory attached to a HP EVA 5000 SAN.

One of the tables is 67gb and contains 140,000,000 rows. Recently someone dropped the clustered indexe so i`m trying to put it back (i've dropped the non clustered indexes as no point leaving them there whilst clustered builds).

The problem i am having is the rebuild is taking forever!! It ran for 23 hours before someone rebooted the server (!). The database is currently recovering from the reboot but i need to work out what is causing the appalling performance so i can get the index rebuilt. There are no reported hardware problems.....

There are multiple file groups involved and i found i was getting an extent allocation rate of 1.5 extents a second and same for deallocation.

Any advice on how to trouble shoot this?

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2005-09-29 : 08:00:47

Someone dropped the clustered index on 140 M rows table?!? How long did it take? Anyway 23 hours is not normal even for such amount of data. Make sure you have enough space so data files are not expanding during the process. Is clustered index on identity column?

Kristen
Test

22859 Posts

Posted - 2005-09-29 : 08:30:20

Has there been a public flogging yet? I'll be annoyed if I've missed it

BCP out the data (ORDER BY the proposed Clustered Index keys), empty the table, create the Clustered Index, and re-import with BCP using the ORDER hint?

Alternatively, if TEMPDB is on a different drive set, and you can spare the space!!, using SORT_IN_TEMPDB with CREATE INDEX might speed it up.

Perhaps you could get the same effect by renaming the table, creating a new [empty] table in its place, with clustered index, and then

INSERT INTO RealTable SELECT * FROM TempTable ORDER BY Key1, Key2, ...

This could be changed to do the insert in batches of, say, 100,000 rows - which would need a WHERE TempTable.dbo.Key1 > MAX(RealTable.dbo.Key1) etc [bit more tricky if it is a multi-key index]

Is the Clustered Index UNIQUE? If so make sure the UNIQUE keyword is in place to avoid the extra 4 byte tie-break.

Kristen

ACALVETT
Starting Member

34 Posts

Posted - 2005-09-29 : 10:14:35

We did a crucifiction, you will find the guy hanging around a street in London.....

From what i`m lead to believe the drop took a few hours.

We definately have enough space, no autogrows are occuring.

The index is on a date column.

I was hoping to avoid taking the data out and other such options as the slow index create makes me think that what ever is affecting it will get me when i move the data so looking for the answer to the speed problem but don't know where to find it.

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2005-09-29 : 10:23:38

Is date dateTime or smallDateTime? What kind of date it is? If data are not inserted in clustered column(s) order you will have bad insert performance, a lot of page splits and fragmentation and I guess slower clustered index creation, but day long index creation is still too much. Clustered index on identity column is (in most cases) better choice. I can't add any new idea on top of what Kristen suggested. Good luck!

ACALVETT
Starting Member

34 Posts

Posted - 2005-09-29 : 10:29:58

Its datetime. Unfortunately its a third party product so no control over ths schema and having spoken to the developers previously you can write what they know about sql on the back of a postage stamp!

Anyway, thanks for your input! If i get to the bottom if the problem i`ll let you know.

Kristen
Test

22859 Posts

Posted - 2005-09-29 : 12:06:39

Could you just chuck a NON CLUSTERED index on their for now, to get the thing up & running (quickly, hopefully!!) and then sort out a permanent solution later?

Kristen

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2005-09-29 : 12:26:26

Kristen, you gave me idea.
ACALVETT, you can really create non-clustered index, then create new table with the same structure and only clustered index. Then insert data in batches from existing table into new one. After it's done, create non-clustered indexes and then:

sp_rename '<table name>', 'old_<table_name>'
go
sp_rename 'new_<table name>', '<table name>'
go

If everything goes well, you can drop old table. I assume you have enough space for two copies of that table and their indexes.

Kristen
Test

22859 Posts

Posted - 2005-09-29 : 14:03:03

"Kristen, you gave me idea"

I reckon I did more that that - I had already suggested it as well

Its a hassle with Foreign Keys. And any triggers on the table. And probably a few other things besides ... but it might get it going quickly.

Kristen

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2005-09-29 : 14:12:23

Oops! Sorry, yes you did it already.

paulrandal
Yak with Vast SQL Skills

899 Posts

Posted - 2005-09-29 : 16:36:15

Seems like you've covered everything here. One thing ot bear in mind is that dropping a clustered index will always be faster than building one, as dropping one just rips out the b-tree and leaves the data pages unchanged. The couple of hours it took you would have been to rebuild the non-clustered indexes to pick up the new row locators.

Paul Randal
Dev Lead, Microsoft SQL Server Storage Engine
(Legalese: This posting is provided "AS IS" with no warranties, and confers no rights.)

Kristen
Test

22859 Posts

Posted - 2005-09-30 : 00:19:09

Thanks Paul, I was hoping there was something awry with the length of time the clustered index rebuild was taking, and that you would have a magic wand!

As there had been a clustered index there before I was thinking the physical order of the data would be correct, so not too much shuffling around required to recreate the index.

But there again 140M row tables are nearly two orders of magnitude bigger than what I have in my sand pit!.

Kristen

ACALVETT
Starting Member

34 Posts

Posted - 2005-09-30 : 05:17:54

Hi Guys,

Well i thought i'd give you an update. :D

We were quite fortunate that i had also just finished testing SP4 so i applied service pack 4 last night. After applying the service pack i also turned off parallelism (It came onto my radar at the begining of the week as a problem to investigate, don't know if create index can run in parallel (never seen parallel threads for a create index) so not sure if it could be related).

So, i said a little prayer to Mr Gates and left the rebuild running expecting to have to play with moving the data around today.

When i arrived i found that the index has finished rebuilding! And it only took 3hrs 25mins!! The table size is also now down to 59gb which i expected as there was alot of fragmentation.

So, its either parallelism or "something" that was fixed in SP4. If its SP4 i'd love to know what as i hate fixing something without understand the why and how.....

Cheers

A

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2005-09-30 : 06:04:24

I worked for two years on tables that size, we had sp3 and we needed just couple of ours for clustered index rebuild. Our data were not fragmented so badly. Unless you have much weaker hardware I bet on parallelism causing the problem. Unfortunately I don't work in that company anymore so I don't have environment to test. Paul, it would be nice if you are able to test and give us feedback.

Kristen
Test

22859 Posts

Posted - 2005-09-30 : 06:37:22

Glad to hear you're going to be allowed home for the weekend!

Kristen

paulrandal
Yak with Vast SQL Skills

899 Posts

Posted - 2005-10-01 : 10:40:26

Yes, most likely parallelism. Index build/rebuild can use a parallel plan in Enterprise edition, and its up to the optimizer to decide if the workload on the box plus available statistics can support a parallel plan. How many cpus are on your system?

As an aside, in SQL Server 2005 you can specify the degree of parallelism to use on index DDL operations.

Thanks

Paul Randal
Dev Lead, Microsoft SQL Server Storage Engine
(Legalese: This posting is provided "AS IS" with no warranties, and confers no rights.)

paulrandal
Yak with Vast SQL Skills

899 Posts

Posted - 2005-10-01 : 10:44:23

Never mind - rereading the thread I see you've got a dual-proc. It's perfectly possible that it was only using a single cpu and not driving the SAN efficiently. Hmmm - but you said now you've turned off parallelism. Now that you've upgraded, we can't dig in to see what it was but I'm glad the problem's solved.

Paul Randal
Dev Lead, Microsoft SQL Server Storage Engine
(Legalese: This posting is provided "AS IS" with no warranties, and confers no rights.)

ACALVETT
Starting Member

34 Posts

Posted - 2005-10-03 : 05:25:22

Thanks all for your input. Really appreciated.

A

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources