2 logically equivalent queries act very different

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

Transact-SQL (2000)

2 logically equivalent queries act very different

Author

Topic

roger.hernandez
Starting Member

2 Posts

Posted - 2003-01-16 : 14:08:21

Given the table:

CREATE TABLE [tInstance] (
[InstanceID] [int] IDENTITY (1, 1) NOT NULL ,
[InstallationID] [int] NOT NULL ,
[LastActivity] [datetime] NOT NULL ,
CONSTRAINT [PK__tInstance__73BA3083] PRIMARY KEY CLUSTERED ([InstanceID]) ON [PRIMARY] ,
) ON [PRIMARY]

If session A has a lock on a row:
begin transaction
update tInstance set LastActivity = current_timestamp where InstanceID = 45;
...

Session B wants to read everything but that row. Using one of the two logically equivalent statements below, the behaviour is very different:

Statement 1:
select * from tInstance where InstanceID < 45 or InstanceID > 45;

Statement 2:
select * from tInstance where InstanceID <> 45;

Statement 1 returns all rows but 45. Statement 2 blocks on lock held by Session A. The estimated explain plan for Statement 1 is about 5 times faster than for Statement 2.

Any thoughts or ideas why this is? Why would you ever use Statement 2 instead of Statement 1?

robvolk
Most Valuable Yak

15732 Posts

Posted - 2003-01-16 : 14:38:25

As the saying goes, there's more than one way to skin a cat.

What exactly are the differences between the two plans? I can see where the first one may use a different lock granularity than the second. Also, did you flush the data cache between the two statements? (DBCC DROPCLEANBUFFERS) If you ran statement 2 first, then statement 1 could've benefited from having the data in cache and may NOT have actually been faster. Conversely, running statement 1 and then 2 might've forced a new plan that required actual disk I/O. There's no way to know without the details of the two plans.

Also, how do both queries perform without the UPDATE statement taking place? What else happens in that transaction, and what is the isolation level?

As far as why use #2? Simple, it is more logically compact and accurate and simpler to write, and avoids the chance for mistyping 45 twice in one statement. If they are logically equivalent why not use the simpler version? There are other factors that could be affecting the performance; it's better to rule them out first before declaring that one version actually performs better than the other.

roger.hernandez
Starting Member

2 Posts

Posted - 2003-01-16 : 14:49:30

The issue is not the performance, the issue is that Statement 1 runs while statement 2 blocks. The queries are logically equivalent but Statement 2 has worse concurrency. While someone is holding a lock on row 45, Statement 2 performs infinitely worse than Statement 1.

The optimizer is using an ORDERED FORWARD CLUSTERED INDEX SEEK for statement 1. It is using CLUSTERED INDEX SCAN for statement 2.

With regards to the statement being simpler to write I would use a sed or perl script to convert from the (A <> B) format to the (A < B or A > B) format. Some of our queries are generated by our backend so we could change that piece without any problems. That would avoid programmer error.

So to reiterate, the main issue is not the performance, although Statement 1 perform a little better than 2. My main concern is that in two logically equivalent queries, Statement 1 runs while Statement 2 blocks. I call that an optimizer bug.

AjarnMark
SQL Slashing Gunting Master

3246 Posts

Posted - 2003-01-16 : 15:10:22

Roger, not sure I'd call it an optimizer bug, but maybe. Perhaps you should send that to sqlwish@microsoft.com.

However, from other posts I've read here, my understanding (and your plans seem to support this) is that not equal comparisons have to scan all values to verify, whereas A < B or A > B it probably considers two different clauses, and quickly sees it can do an index seek instead of scan.

I have no idea how the optimizer is programmed, and maybe you can argue that it should be smart enough to change the not equal into the other one, I don't know. I suspect, however, that MS would not consider it a bug (as in the type that causes you to not have to pay for a support phone call).

------------------------------------------------------
The more you know, the more you know you don't know.

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources