which one is better? - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

Transact-SQL (2000)

which one is better?

Author

Topic

sachinsamuel
Constraint Violating Yak Guru

383 Posts

Posted - 2006-09-29 : 07:43:35

Hello Experts,

Which one is better Join or Subquery? and Why?

Thanks in advance

Don't sit back because of failure. It will come back to check if you still available. -- Binu

Kristen
Test

22859 Posts

Posted - 2006-09-29 : 07:44:43

Well it depends ....

Describe the situation and problem please.

Kristen

sachinsamuel
Constraint Violating Yak Guru

383 Posts

Posted - 2006-09-29 : 08:37:20

I have 2 tables

A - having 118040500 records
B(its temporary table) - having 1500 records.

When I say

select * from A where a.id in (Select b.id from B)

it takes almost 2 minutes to return data.

How can I optimize this. Should I remove in with join?

thanks in advance

Don't sit back because of failure. It will come back to check if you still available. -- Binu

pootle_flump

1064 Posts

Posted - 2006-09-29 : 08:44:52

You would level the playing field a little if you used EXISTS for your subquery rather than IN. You might want to remove the * and replace with column names.

Anyway - for a simple existance check Vs inner join the optimiser should come up with the same plan for both queries so it is probably a matter of preference.

HTH

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2006-09-29 : 08:49:38

Set the execution plan and see

select * from A where a.id in (Select b.id from B)
select T1.* from A T1 inner join B T2 on T1.id=T2.id
select * from A T1 where exists(Select * from B where id=T1.id)

Madhivanan

Failing to plan is Planning to fail

sachinsamuel
Constraint Violating Yak Guru

383 Posts

Posted - 2006-09-29 : 08:58:32

I am getting same execution plan for all of them. Infact the logical reads are also same.

please help.

regards
Sachin

Don't sit back because of failure. It will come back to check if you still available. -- Binu

PSamsig
Constraint Violating Yak Guru

384 Posts

Posted - 2006-09-29 : 18:04:23

It sounds like you dont have an index on A.id. How many rows does the query returns? If the answer is all 118040500 then I aint really surprised it takes 2 min ...

-- Alice came to a fork in the road. "Which road do I take?" she asked. "Where do you want to go?" responded the Cheshire cat. "I don't know," Alice answered. "Then," said the cat, "it doesn't matter."

pootle_flump

1064 Posts

Posted - 2006-09-29 : 18:41:11

quote:
Originally posted by sachinsamuel

I am getting same execution plan for all of them. Infact the logical reads are also same.

Ah - so the question is how do I speed up this query? Are there any columns you don't use? For example, if you only want ID then any of

select [ID] from A where a.id in (Select b.id from B)
select T1.[ID] from A T1 inner join B T2 on T1.id=T2.id
select [ID] from A T1 where exists(Select 1 from B where id=T1.id)

will be faster (especially if ID is uniquely indexed as mentioned). Minor but EXISTS(SELECT 1.... is supposed to be marginally faster than EXISTS(SELECT *... for compilation:
http://groups.google.com/group/microsoft.public.sqlserver.programming/browse_frm/thread/b4def3fd105ba10a/c6e6847f9da612da?lnk=st&q=%22SELECT+'1'+FROM+MyTable%22%2B%22Conor%22%2B%22hugo%22&rnum=1
Note Conor's job title

sachinsamuel
Constraint Violating Yak Guru

383 Posts

Posted - 2006-10-01 : 13:33:16

Thanks a lot experts.

Unfortunatly I need all the columns there for I will have only option left is *.

But I am still not clear with what is better... An inner join or a in clause?

Please help.... this is urgent. Sorry to push.

Thanks in advance
Regards
Sachin

Don't sit back because of failure. It will come back to check if you still available. -- Binu

PSamsig
Constraint Violating Yak Guru

384 Posts

Posted - 2006-10-01 : 14:01:58

Since you your self answered I am getting same execution plan for all of them. Infact the logical reads are also same. then that clears that question. Now what is left is to determin if you ASK the right question, and under the right circumstances, but you have given zero to non information to help you there ...

-- Alice came to a fork in the road. "Which road do I take?" she asked. "Where do you want to go?" responded the Cheshire cat. "I don't know," Alice answered. "Then," said the cat, "it doesn't matter."

pootle_flump

1064 Posts

Posted - 2006-10-01 : 17:57:54

Assuming ID is indexed in both tables (ideally uniquely but that depends on the schema) then this is as fast as it can be unless you can sacrifice accuracy for performance. For example, you could use join hints or write the results of the query to a table at off peak times (especially if the contents of Table B don't change often). Maybe try using a table variable or derived table? Do you know if it is the population of the temp table or the joining to the primary table that takes up the majority of the execution time? Are you only dumping the ID into the temp table or are you also grabbing superfluous columns?
Tell you what - why not post all the SQL?

HTH

snSQL
Master Smack Fu Yak Hacker

1837 Posts

Posted - 2006-10-01 : 18:50:08

As a general rule you should use a join rather than an in (but the only way to be certain in a specific situation is to look at the execution plan, you cannot say one is always better).

But, the key question here is, do you really need all the rows from the query? You do not have a where clause so you are returning all the rows from A that match anything in B. If there are a lot of rows and especially if most of the rows in A have an id in B then you aren't going to speed it up.

Most likely though, you don't really want ALL of the rows that match EVERY row in B, so your query should be something like this

SELECT * 
FROM A
INNER JOIN B ON A.id = B.id
WHERE B.somecolumn = somevalue

And - as a few people have already asked do you have an index on at least the id columns in A and B?

Kristen
Test

22859 Posts

Posted - 2006-10-02 : 08:55:47

"As a general rule you should use a join rather than an in (but the only way to be certain in a specific situation is to look at the execution plan, you cannot say one is always better)."

You did say general but I would add one caveat:

If the JOIN can provide multiple hits you will probably want to use IN instead, to prevent having to use DISTINCT.

Assuming you have the indexes mentioned above (and you haven't said one way or the other yet) then:

select * from A where a.id in (Select b.id from B)
A - having 118,040,500 records
B(its temporary table) - having 1500 records.

it takes almost 2 minutes to return data.

how many rows are being returned? Because if its lots then it will take some time to physically ship them from the database to your application. Particularly if your SELECT * is covering a bung of TEXT columns!

How long does

select COUNT(*) from A where a.id in (Select b.id from B)

take to run by comparison? That will remove the transfer time from the equation (but might use a smarter/shorter query plan that is not realistic for the main query, of course)

Kristen

pootle_flump

1064 Posts

Posted - 2006-10-02 : 09:27:34

You could try
COUNT(AnUnindexedAndAlsoNeverNullColumn)
to hobble the optimiser a bit

Kristen
Test

22859 Posts

Posted - 2006-10-02 : 10:10:53

"COUNT(AnUnindexedAndAlsoNeverNullColumn)"

COUNT(MyCol1+MyCol2+...)

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2006-10-02 : 10:51:33

>>COUNT(MyCol1+MyCol2+...)

Wont that lead to error "Cant convert to int datatype" if columns are of different datatypes?

Madhivanan

Failing to plan is Planning to fail

pootle_flump

1064 Posts

Posted - 2006-10-02 : 11:30:58

... and null if one is null?

COUNT(COALESCE(CAST(MyCol1 AS VarChar(8000)), '') + COALESCE(CAST(MyCol2 AS VarChar(8000)), , '') + ...)
This could end up on DailyWTF....

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2006-10-02 : 11:43:28

>>This could end up on DailyWTF....

Madhivanan

Failing to plan is Planning to fail

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources