Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 IN vs WHERE

Author  Topic 

shiloh
Starting Member

48 Posts

Posted - 2007-04-30 : 12:29:54
Hello all

I was talking to someone at work the other day about using a JOIN instead of IN. My colleague says IN is better and I say JOIN works faster. Can anyone correct this info? Is JOIN better than IN?

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2007-04-30 : 12:34:11
those are 2 logically different constructs.

for large sets use join, because in translates into multiple OR's

_______________________________________________
Causing trouble since 1980
blog: http://weblogs.sqlteam.com/mladenp
Go to Top of Page

peterlemonjello
Yak Posting Veteran

53 Posts

Posted - 2007-04-30 : 14:23:13
There's a limitation to how many items can be used in an IN. This isn't a limitation with JOIN.

Also, for kicks why aren't you debating the use of EXIST vs IN vs JOIN
Go to Top of Page

shiloh
Starting Member

48 Posts

Posted - 2007-04-30 : 14:28:23
Thanks for your replies. I know how/whhere/when to use EXISTS. The issue i have is

SELECT * FROM TableA WHere col1 in (SELECT col1 from TableB)

vs

SELECT a.* from TableA a join TableB b on a.col1 = b.col1

I did a quick test with systemtables and the IN seems to be faster than JOIN.
Go to Top of Page

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2007-04-30 : 14:31:49
The true answer is that it all depends. Depends on data, depends on indexes, depends on.. (insert item here).

It is one of those things that you will need to test and benchmark. From experience it seems that a JOIN will out perform a EXISTS or IN on larger data sets. By larger I mean tables with 100+ Million rows. But, again there are too many variables to give a definitive answer. My personal preference is to do a JOIN. Meh..

-Ryan
Go to Top of Page

sshelper
Posting Yak Master

216 Posts

Posted - 2007-04-30 : 15:12:27
Using a JOIN may give you a different result than when using an IN statement. As an example, let's say you have a Customers table and Orders table wherein the Customers table is unique while the Orders table can have multiple Orders per customer. Doing a JOIN between these tables to get those Customers with Orders will give you more results than doing an IN.

You can also refer to FAQ #2 in the following link:
http://www.sql-server-helper.com/faq/select-p01.aspx

SQL Server Helper
http://www.sql-server-helper.com
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-04-30 : 15:43:31
"SELECT * FROM TableA WHere col1 in (SELECT col1 from TableB)
vs
SELECT a.* from TableA a join TableB b on a.col1 = b.col1
"

1) If you do NOT want any data from TableB AND there are potentially multiple rows in TableB for each row in TableA then use IN (or even EXISTS)

If you want data from some columns from TableB, or TableB never has more than one row for each row in TableA, then use the JOIN.

If you want all rows from TableA (regardless of whether there is a corresponding row in TableB) and you want data from some columns in TableB, where a corresponding row exists, then use a LEFT OUTER JOIN.

Kristen
Go to Top of Page

shiloh
Starting Member

48 Posts

Posted - 2007-04-30 : 16:29:43
thankyou all for your replies. it was helpful information. purely from performance point of view, and assuming there's a 1-1 relation between tables A and B, is there any advantage of using a JOIN vs IN. One factor is the result set. Is there anything else that can affect the performance?
Go to Top of Page

LoztInSpace
Aged Yak Warrior

940 Posts

Posted - 2007-04-30 : 21:52:39
If there is 1-1 and SQL Server knows about it via unique constraints and a relationship it might not make ao difference which one you use because it will make them the same.

Try it out & look at the query plans.
Personally, as sshelper said, they are not the same so I'd look at the question you are asking. One will give the correct result and one will not. Your data might make it look right of course. To make them the same you would want

SELECT a.* from TableA a join (select distinct col1 from TableB) b on a.col1 = b.col1
Go to Top of Page
   

- Advertisement -