Disparate execution times for UPDATE statements

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

Transact-SQL (2000)

Disparate execution times for UPDATE statements

Author

Topic

MikeArnold
Starting Member

11 Posts

Posted - 2003-06-24 : 03:04:38

I have a situation where I am updating 2 columns of a very large (~15,000,000 rows) table with separate UPDATE statements. These are
1.
UPDATE dbo.[lasAG_DWFactTable_0000000011] WITH (TABLOCK) SET [H_001] = ISNULL((SELECT TOP 1 A.AttributeId
FROM (lasAG_DWAttribute A WITH(NOLOCK) LEFT JOIN (lasAG_DWAttributeRange B WITH (NOLOCK) LEFT JOIN lasAG_DWDriver C WITH (NOLOCK) ON B.DriverId = C.DriverId) ON A.AttributeId = B.AttributeId)
WHERE A.HierarchyId = 1 AND ((CAST(dbo.[lasAG_DWFactTable_0000000011].[AccountCode] AS varbinary) BETWEEN B.RangeFrom AND B.RangeTo)
OR ( CAST (dbo.[lasAG_DWFactTable_0000000011].[AccountCode] AS nvarchar) LIKE B.Mask COLLATE DATABASE_DEFAULT))
AND CAST(dbo.[lasAG_DWFactTable_0000000011].[Database] AS varbinary) = C.DriverValue), 22)

2.
UPDATE dbo.[lasAG_DWFactTable_0000000011] WITH (TABLOCK) SET [H_003] = ISNULL((SELECT TOP 1 A.AttributeId
FROM (lasAG_DWAttribute A WITH(NOLOCK) LEFT JOIN (lasAG_DWAttributeRange B WITH (NOLOCK) LEFT JOIN lasAG_DWDriver C WITH (NOLOCK) ON B.DriverId = C.DriverId) ON A.AttributeId = B.AttributeId)
WHERE A.HierarchyId = 3 AND ((CAST(dbo.[lasAG_DWFactTable_0000000011].[AccountCode] AS varbinary) BETWEEN B.RangeFrom AND B.RangeTo)
OR ( CAST (dbo.[lasAG_DWFactTable_0000000011].[AccountCode] AS nvarchar) LIKE B.Mask COLLATE DATABASE_DEFAULT))
AND CAST(dbo.[lasAG_DWFactTable_0000000011].[Database] AS varbinary) = C.DriverValue), 40)

In Case 1 the result of the 3 table join in the embedded SELECT is about 150 rows, in Case 2 about 80 rows.

However, when I run the statements, the execution times are:
Case 1: 2 - 5 minutes
Case 2: 40 - 60 minutes

The only difference (that I can see) is that after examining the rows returned (by turning the UPDATE into a SELECT statement) is that in Case 2 the ISNULL logic is being invoked much more often - ie there are relatively few rows that satisfy the correlation between the FactTable and the other tables.

Anyone got any ideas on how I can speed this up, and why is Case 2 so much slower???

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2003-06-24 : 03:24:19

Mike,

Those Access type queries do my head in.. I formatted that first one slightly differently..


UPDATE dbo.[lasAG_DWFactTable_0000000011] WITH (TABLOCK) 
	SET [H_001] = 
ISNULL(
(SELECT TOP 1 A.AttributeId 
FROM lasAG_DWAttribute A WITH(NOLOCK) 
LEFT JOIN lasAG_DWAttributeRange B WITH (NOLOCK) ON A.AttributeId = B.AttributeId
LEFT JOIN lasAG_DWDriver C WITH (NOLOCK) ON B.DriverId = C.DriverId
WHERE A.HierarchyId = 1 
AND ((CAST(dbo.[lasAG_DWFactTable_0000000011].[AccountCode] AS varbinary) BETWEEN B.RangeFrom AND B.RangeTo) 
OR ( CAST (dbo.[lasAG_DWFactTable_0000000011].[AccountCode] AS nvarchar) LIKE B.Mask COLLATE DATABASE_DEFAULT)) 
AND CAST(dbo.[lasAG_DWFactTable_0000000011].[Database] AS varbinary) = C.DriverValue), 22)

Highlight this statement along with your original 2 and Press CTRL+L in QA.. This will give you a Estimated execution plan.

Tell us if there is any difference...

DavidM

"SQL-3 is an abomination.."

MikeArnold
Starting Member

11 Posts

Posted - 2003-06-24 : 04:32:39

Thanks - I checked the execution plans as you suggested, and there is no difference. I noticed how you changed the join, but I have tried that. As there are so few rows which satisfy the join conditions it's a bit hard to see that this would make any difference.

I just tried rolling the two update statements into one, and it took 1h 1m to finish. I'm in the process of getting a server trace to see if they use resources differently.

Thanks again
Mike

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2003-06-24 : 04:46:00

I think you can write this a different way...


UPDATE dbo.[lasAG_DWFactTable_0000000011] WITH (TABLOCK) 
SET [H_001] = ISNULL(X.AttributeId,22)
FROM
(SELECT TOP 1 A.AttributeId,  B.RangeFrom, B.RangeTo, B.Mask, C.DriverValue
FROM lasAG_DWAttribute A WITH(NOLOCK) 
LEFT JOIN lasAG_DWAttributeRange B WITH (NOLOCK) ON A.AttributeId = B.AttributeId
LEFT JOIN lasAG_DWDriver C WITH (NOLOCK) ON B.DriverId = C.DriverId
WHERE A.HierarchyId = 1) AS X
WHERE 
((CAST([AccountCode] AS varbinary) BETWEEN X.RangeFrom AND X.RangeTo) 
OR ( CAST([AccountCode] AS nvarchar) LIKE X.Mask COLLATE DATABASE_DEFAULT)) 
AND CAST([Database] AS varbinary) = X.DriverValue

Is that the same result and how does it fair?

DavidM

"SQL-3 is an abomination.."

Wanderer
Master Smack Fu Yak Hacker

1168 Posts

Posted - 2003-06-24 : 05:43:25

Is there some reason I can't figure that you are using TABLOCK and not TABLOCKX ? I assume that you are running at a time when no-one wants access to that table, since you deny access for your entire execution time.

*#&#* *#&#* *#&#* *#&#*

Chaos, Disorder and Panic ... my work is done here!

Edited by - wanderer on 06/24/2003 05:44:36

MikeArnold
Starting Member

11 Posts

Posted - 2003-06-24 : 20:04:00

Hi David - I've tried the UPDATE ... FROM syntax and it is much much slower (again for reasons I don't understand) than the correlated subselect UPDATE.

MikeArnold
Starting Member

11 Posts

Posted - 2003-06-24 : 20:16:10

Hi Wanderer .. The only reason I want a TABLOCK is to save server resources - remember the table being updated is ~15,000,000 rows, and in some sites could be much much bigger again. With TABLOCK there is only one lock; without it page locking kicks in, and I've had a look at the syslocks table and there are a bloody lot of locks. There is actually a noticable improvement in response time with the TABLOCK compared to without it. As far as I can see TABLOCK behaves the same as TABLOCKX anyway - I don't see the advantage in this application.

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2003-06-24 : 20:44:27

Mike,

One more go at a rewrite...


UPDATE  Fact WITH (TABLOCK) 
SET Fact.[H_001] = ISNULL(A.AttributeId,22)
FROM dbo.[lasAG_DWFactTable_0000000011] AS Fact
INNER JOIN lasAG_DWAttributeRange B WITH(NOLOCK)  ON CAST([AccountCode] AS varbinary) BETWEEN B.RangeFrom AND B.RangeTo OR CAST([AccountCode] AS nvarchar) LIKE B.Mask COLLATE DATABASE_DEFAULT
LEFT JOIN lasAG_DWDriver C WITH(NOLOCK)  ON CAST([Database] AS varbinary) = C.DriverValue AND B.DriverId = C.DriverId
LEFT JOIN lasAG_DWAttribute A WITH(NOLOCK)  ON A.AttributeId = B.AttributeId AND A.HierarchyId = 1

EDIT: I forgot the TOP 1.. can you alter it?

DavidM

"SQL-3 is an abomination.."

Edited by - byrmol on 06/24/2003 20:47:40

MikeArnold
Starting Member

11 Posts

Posted - 2003-06-24 : 22:52:54

David ... I'm running your suggested SQL statement now, and I'm up to 10 minutes for the "fast" set of parameters instead of 2m+. I checked the execution plan, and while it generates a plan very different to the "standard" plan, it appears that at one point it comes up with a cartesion product - in this case a rowset of 382,000,000 or so rows; that's why it's so slow, I guess.

I'm very grateful for your help - any more suggestions are very welcome.

PS I ran Performance Monitor for the 2 cases, and the results appear very different. The counters I used were Physical Disk - Avg Disk Queue Length, and Procesor - % processor time.

The "fast" update displayed high cpu and high disk queue levels, while the "slow" update displayed high (but very steady) cpu usage, but very low levels of disk usage (with a very occasional burst).

Very interesting, but no closer to solving the mystery.

Cheers
Mike

MikeArnold
Starting Member

11 Posts

Posted - 2003-06-25 : 01:50:06

Hi all ... NEWS FLASH

I have conclusively proved that the execution speed is a function of the number of times that the correlated subselect returns NULL. I tweaked the AttributeRange data for HierarchyId 3 (the "slow" update) so that all the data in the FactTable matches at least one row of the 3-table join. The execution speed now matches that of the "fast" update - ie ~2 minutes.

The question is, can anyone think of a way of speeding up the "slow" updates - ie where the subselect returns lots of nulls when correlated to the FactTable???

Unfortunately the SQL is generated by a stored procedure distributed as part of a shrink-wrapped application, and is always going to be dependant on whatever the users enter (or don't enter) - so I need to generate "optimal" SQL - I havn't got the luxury of massaging the SQL by hand to match a particular set of circumstances.

Cheers
Mike

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2003-06-25 : 02:49:53

Mike,

Do you have to have LEFT JOINS? That usually produces a bucket of NULLS...

"Bucket of NULLS"
Definition: Large quantities of crap..... ;-)

DavidM

"SQL-3 is an abomination.."

MikeArnold
Starting Member

11 Posts

Posted - 2003-06-25 : 21:14:30

Hi David ... Unfortunately it's the correlation that's producing the nulls - it's what happens when a row in the FactTable can't find at least one corresponding row in the 3-table join, in which case I plug a default value in - and before you ask, I can't use a column DEFAULT to deal with this situation because the default value is different depending on the value of the HierarchyId.

I suspect that the reason why it's slow when there are lots of nulls is because each row in the FactTable (restricted over the DriverValue-[Database] join condition) has to search each row in the 3-table join result set (maybe a couple of hundred rows) before it can say for sure - "oh, no match, i'm null - better call the ISNULL function". In the other case, where the FactTable correlation always or nearly always finds a matching row in the 3-table join result set it only has to find the first one before it returns. Even so, I am surprised at the huge differential in execution time.

Cheers
Mike

FatDaddy

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources