Lengthy Update Query - Improvements Required

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

Transact-SQL (2000)

Lengthy Update Query - Improvements Required

Author

Topic

SHardy
Starting Member

35 Posts

Posted - 2007-05-11 : 07:16:12

Hi all,

As part of an "overnight" datawarehouse build, I have an update query which is taking quite some time to run, and I was hoping to get some suggestions as to how I could improve upon it.

There are two tables in the query, both of which are also created during the build & neither of which have any indexes.

Table 1 - "nav_glentry" - Is a large table reflective of the general ledger. It currently has approx 4.5 million records. "entry_no" is a unique field within this table. It has a populated "dim_cost_centre" field and an empty "dim_old_ini" which needs to be updated.

Table 2 - "nav_cost_centre_lookup" - This is a small table (approx 1,000 records) that maps "dim_cost_centre" to "dim_old_ini". These are the only two fields in the table and are both (currently) unique. "dim_cost_centre" will always be unique.

So I am updating "dim_old_ini" in Table 1 from Table 2. SQL is as follows:

update ng
set ng.dim_old_ini = nccl.dim_old_ini
from nav_glentry ng, nav_cost_centre_lookup nccl
where ng.dim_cost_centre = nccl.dim_cost_centre

This is currently taking nearly 3 hours to run.

I have tried adding indexes to both tables, but the index tuning wizrd suggests that this would have quite the opposite effect. As such, I have not yet been willing to actually try test running it.

I would welcome any suggestions as to how I may be able to cut down on the run time for this action. Would indexes be the way forward? Or will they only add unnecessary processing?

Thanks in advance,
Simon

nr
SQLTeam MVY

12543 Posts

Posted - 2007-05-11 : 07:31:43

If nav_cost_centre_lookup is that small then it will probably be in memory so indexes won't help.
What dataatype is dim_old_ini?
Make sure when you populate nav_glentry that the value isn't null, if it's fixed length then set the value to that length otherwise to the length it usually is. In that way you won't get pageb splitting when the update takes place and it can be a lot faster.

How long to read through nav_glentry - try a max on a non-indexed column?
How long does an update take for a single value - try using literals for the most common value in the table.
You don't say how big nav_glentry is - could be faster to create another table rather than update.

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.

Kristen
Test

22859 Posts

Posted - 2007-05-11 : 07:55:00

3 hours to update 4.5 million rows? That's scary! Maybe they are all NULL and being extended, and thus each record has to be moved to a space large enough for it?

If so could you DEFAULT ng.dim_old_ini to some benign value when you populate the table initially, so that the row doesn't grow when it is Updated (assuming, say, a fixed length datatype like INT)

Could any of the rows already have the correct value? If so should be able to reduce the amount of work with:


update ng
set ng.dim_old_ini = nccl.dim_old_ini
from nav_glentry AS ng
    JOIN nav_cost_centre_lookup AS nccl
         ON ng.dim_cost_centre = nccl.dim_cost_centre
[red]WHERE    ng.dim_old_ini <> nccl.dim_old_ini
      OR (ng.dim_old_ini IS NULL AND nccl.dim_old_ini IS NOT NULL)
      OR (ng.dim_old_ini IS NOT NULL AND nccl.dim_old_ini IS NULL)

you can slim that down a bit if either of the columns do not permit NULLs

If there is an index on dim_old_ini drop that before you start, and recreate it afterwards.

Kristen

harsh_athalye
Master Smack Fu Yak Hacker

5581 Posts

Posted - 2007-05-11 : 08:05:11

"Maybe they are all NULL and being extended, and thus each record has to be moved to a space large enough for it?"

Can you explain that a bit more, Kristen?

Harsh Athalye
India.
"The IMPOSSIBLE is often UNTRIED"

nr
SQLTeam MVY

12543 Posts

Posted - 2007-05-11 : 08:06:17

quote:
Originally posted by harsh_athalye

"Maybe they are all NULL and being extended, and thus each record has to be moved to a space large enough for it?"

Can you explain that a bit more, Kristen?

Harsh Athalye
India.
"The IMPOSSIBLE is often UNTRIED"

Have a look at my post.

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.

nr
SQLTeam MVY

12543 Posts

Posted - 2007-05-11 : 08:11:48

Actually for that time it is probably doing a nested loop - i.e. executing each row in turn (have a look at the query plan).

You could put an index on the large table (preferably clustered but that's probably not possible) and try updating for each value in turn from the smaller table in a loop - look at the query plan to see what it is doing.

If you have a clustered index (hopefully on a single small column like an ID) then first of all run a query to get the ID's and new values into a temp table then do the update using that joining on the ID.

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.

Kristen
Test

22859 Posts

Posted - 2007-05-11 : 08:20:20

"Have a look at my post.[i/]"

Doh! Sorry Nigel, I missed reading that first time around.

"[i]Can you explain that a bit more, Kristen?"

If you have a record with a variable length column (such as VARCHAR) which is NULL / empty / contains short data, and you put a bigger piece of data in it, then the record won't fit back where it was. SQL Server may then have to move it somewhere else, and that means adjusting the index too (ALL of them if there is no clustered index!)

Kristen

Kristen
Test

22859 Posts

Posted - 2007-05-11 : 08:22:03

@SHardy:

Make sure you have a Clustered Index on nav_glentry

Kristen

harsh_athalye
Master Smack Fu Yak Hacker

5581 Posts

Posted - 2007-05-11 : 08:39:14

quote:
Originally posted by Kristen

"Have a look at my post.[i/]"

Doh! Sorry Nigel, I missed reading that first time around.

"[i]Can you explain that a bit more, Kristen?"

If you have a record with a variable length column (such as VARCHAR) which is NULL / empty / contains short data, and you put a bigger piece of data in it, then the record won't fit back where it was. SQL Server may then have to move it somewhere else, and that means adjusting the index too (ALL of them if there is no clustered index!)

Kristen

Perfect ! Got it.

Thanks Kristen.

Harsh Athalye
India.
"The IMPOSSIBLE is often UNTRIED"

SHardy
Starting Member

35 Posts

Posted - 2007-05-11 : 08:54:17

Thanks for the comments. I have taken them onboard and reviewed what is actually going on.

Both the "dim_old_ini" & "dim_cost_centre" fields are varchar20 in both tables.
When "nav_glentry" is initially created, both fields are created as "space(20)".
The "dim_cost_centre" field in "nav_glentry" is then updated (along with some other fields).
It is then that the above update happens.

As prevously stated, there were no indexes on either table. I have manually added the following indexes for testing:
nav_glentry:
1) entry_no - unique index, non clustered
2) dim_cost_centre - non-unique index, non-clustered
nav_cost_centre_lookup
1) dim_cost_centre - unique index, non-clustered

I decided to ignore the index tuning wizard & to try running the above statement again. It only took 10 minutes. I did not first reset the field values within nav_glentry, but surely that wouldn't make any difference if they were initially space(20) anyway?

I might have to add in the index creation & see how it goes in the actual run. Will update here on Monday with results.

I will still check on here if there are any other suggestions/ comments.

Many thanks,
Simon

Kristen
Test

22859 Posts

Posted - 2007-05-11 : 09:13:16

"It only took 10 minutes"

That sounds better!

"I did not first reset the field values within nav_glentry"

Well, change them all now to 'X' and see if that takes 10 minutes too?

I reckon that if you have ANY indexes on nav_glentry you should make one of them Clustered (preferably whatever constitutes the PK or something else "Unique"). Otherwise each Update will have to update ALL the indexes for that table, even though the actual index entry won't have changed.

(In the absence of a clustered index EVERY Index stores the "disk address" of the record, whereas with a Clustered Index all the secondary indexes store the Key from the Clustered Index, and thus do not need to be updated when the record "moves")

In fact I think both the indexes that you refer to as (1) should be CLUSTERED (in the absence of a Clustered PK that is ...)

Kristen

Kristen

SHardy
Starting Member

35 Posts

Posted - 2007-05-11 : 11:33:58

Hello again!

I have been doing a bit more testing. And this time I did first reset the field to be updated.

Even though I had got it down to 10 minutes, I saw a big improvement on this by dropping the index on the dim_cost_centre field in the nav_glentry table.

However, by making the remaining unique index clustered, it was slightly slower than when non-clustered. This was on top of the time it takes to set the clustered index.

I will now leave it as follows:

nav_glentry:
1) entry_no - unique index, non clustered
nav_cost_centre_lookup:
1) dim_cost_centre - unique index, non-clustered

This job will run each day over the weekend, so I will check on Monday how it goes and report back.

Thanks

Kristen
Test

22859 Posts

Posted - 2007-05-11 : 12:50:12

"making the remaining unique index clustered, it was slightly slower than when non-clustered"

I'm surprised. Did you have the dim_cost_centre in place with this test?

My expectation would be that a clustered index would mean that you would not have to drop (and recreate) the index on dim_cost_centre - or any other secondary index for that matter, with the exception of any index on dim_old_ini [which I would definitely drop and recreate if you need one])

Kristen

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources