Comparing Two Tables and only selecting fields that dont match

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

SQL Server Development (2000)

Comparing Two Tables and only selecting fields that dont match

Author

Topic

AskSQLTeam
Ask SQLTeam Question

0 Posts

Posted - 2002-03-07 : 09:03:54

Dharm writes "Is there a way to create a list between two tables where we are only finding the ones that dont match.

eg.
A list of ID's in Table 1.
A partial List of ID's in Table 2.

We want to get a list of ID's that are not in listed in Table 2 but exist is Table 1.

Thank you,
Your Help would be greatful.

DD"

Nazim
A custom title

1408 Posts

Posted - 2002-03-07 : 09:07:37

SELECT ID FROM TABLE1 WHERE ID NOT IN( SELECT ID FROM TABLE2)

HTH

--------------------------------------------------------------

MuffinMan
Posting Yak Master

107 Posts

Posted - 2002-03-07 : 09:35:54

I believe this is more efficient:

Select t1.id
from table1 t1
left join table2 t2
on t1.id = t2.id
where t2.id IS NULL

Nazim
A custom title

1408 Posts

Posted - 2002-03-07 : 10:02:46

And I Cant Agree More on that

--------------------------------------------------------------

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2002-03-07 : 11:48:54

quote:

I believe this is more efficient:

Why would you believe that?

robvolk
Most Valuable Yak

15732 Posts

Posted - 2002-03-07 : 11:57:51

I'd believe it because MuffinMan's solution can make use of indexes on both tables (merge join), while Nazim's original answer would not.

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2002-03-07 : 12:11:11

Hmm. That's certainly not true for SQL Server 2000, but I haven't got any earlier versions handy to test it on.

Edited by - Arnold Fribble on 03/07/2002 12:15:15

robvolk
Most Valuable Yak

15732 Posts

Posted - 2002-03-07 : 12:43:30

Are the query plans the same for both versions? If they are, that's some pretty sharp optimizing on SQL 2000's part. If the plans are different, I can't see how the IN version can get by without accessing Table1; the LEFT JOIN version could simply hit the index(es) only. At least it's far more likely to only use the index than the IN version.

Next thing would be if either table has no indexes, would the optimizer build a temporary index vs. putting rows into a work table, or some combination of both?

Now I gotta test these! You're killing me Arnold!

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2002-03-07 : 15:14:32

If both T1.id and T2.id are unique and T1's index covers the result, the NOT IN is fractionally faster. The query plan for NOT IN will probably use an merge antijoin, where the LEFT JOIN will use a merge left join and filter. In practice, it's not that much different, though it does strike me a little odd that the LEFT JOIN doesn't use the antijoin.
If the columns are indexed but not unique, it's not as clear-cut. Sometimes the query plan estimate is for the NOT IN to be faster, but it runs slower.

setbasedisthetruepath
Used SQL Salesman

992 Posts

Posted - 2002-03-07 : 15:23:18

I believe the following is the fastest way:

select id
from table1 t
where not exists (
select 1
from table2
where field = t.field )

Jay99

468 Posts

Posted - 2002-03-07 : 16:16:32

on SQL Server 7 . . .


create table table1 (id int)
create table table2 (id int)

declare @i int
set @i = 1

while @i < 10000
begin
    insert table1 values (@i)
    set @i = @i + 1
    insert table2 values (@i)
    set @i = @i + 1
end

now, with no indexes . . .


select a.id
from table1 a left join table2 b on a.id = b.id
where b.id is null

... has a subtree cost of 0.226
and


select a.id
from table1 a
where not exists (
    select 1
    from table2 b
    where a.id = b.id)

... has a subtree cost of 0.217
SO with no indexes, the not-exists is is marginally faster . . .

Now, add some unique clustered indexes . . .


create unique clustered index blah on table1(id)
create unique clustered index bling on table2(id)

... and left-join-where-right-side-isnull = 0.127 and not-exists = 0.124

Drop those indexes and but some non-clustered on there . . .


drop index table1.blah
drop index table2.bling
create nonclustered index blah on table1(id)
create nonclustered index bling on table2(id)

... and left-join-where-right-side-isnull = 0.564 and not-exists = 0.131.

Making the non-clusted unique yields 0.132 and 0.129.

In all three cases, the actually execution time (or at least percentage relative to batch) reflected the difference in subtree costs.

quote:

If both T1.id and T2.id are unique and T1's index covers the result, the NOT IN is fractionally faster.

This seems to be quite right. However, this . . .

quote:

If the columns are indexed but not unique, it's not as clear-cut. Sometimes the query plan estimate is for the NOT IN to be faster, but it runs slower.

... I was not able to replicate.

I will say, as an aside, that the left-join-where-right-side-isnull does two very nice things for me.
1. less typing, especially when there are more than 2 tables involved
2. looks cooler(inotherwords less english-like), thereby helping to secure my employement because your average-joe-cool asp programmer never bothered to learn sql well enough to understand it.

Jay

MuffinMan
Posting Yak Master

107 Posts

Posted - 2002-03-07 : 16:31:42

quote:

I will say, as an aside, that the left-join-where-right-side-isnull does two very nice things for me.
1. less typing, especially when there are more than 2 tables involved
2. looks cooler(inotherwords less english-like), thereby helping to secure my employement because your average-joe-cool asp programmer never bothered to learn sql well enough to understand it.

Jay

Yeah, what Jay said!

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources