Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 IN vs whatever, query optimization

Author  Topic 

JokerOfACoder
Starting Member

18 Posts

Posted - 2007-05-10 : 21:49:48
Optimize the following queries (pretend that all columns have been indexed). Would you optimize or leave it as it is? (if optimize, give the optimized query). The table size will be 20 million

First query:
SELECT * FROM users WHERE userid IN (SELECT userid FROM groupusers WHERE groupid = 1);

Second query:
SELECT COUNT(1) FROM group_Users WHERE 1 IN (SELECT 1 FROM permissions WHERE userid = group_users.id);

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2007-05-10 : 21:52:28
For the first, if the PK of groupusers is userid/groupid (hopefully it is, if that table relates the two entities), then I would do this:

select u.*
from users u
inner join groupusers gu on u.userID = gu.userID
where gu.groupID = 1


For the second, I have no idea what you are trying to logically do there ...


- Jeff
http://weblogs.sqlteam.com/JeffS
Go to Top of Page

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2007-05-10 : 21:53:05
Don't use IN for subqueries.. EVER

Use EXISTS...


DavidM

Production is just another testing cycle
Go to Top of Page

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2007-05-10 : 21:54:06
Jeff,

Using a JOIN when no columns are needed from the joined table is pointless.

DavidM

Production is just another testing cycle
Go to Top of Page

herothecat
Starting Member

19 Posts

Posted - 2007-05-10 : 22:13:12
Anyone else think this post is a test question?

Why push the envelope when you can just open it?
Go to Top of Page

JokerOfACoder
Starting Member

18 Posts

Posted - 2007-05-10 : 22:14:18
*writes notes*
Time to open up sql management studio and test that theory.

Keep em coming..any pointers you got when writing queries, joins vs in vs whatever etc.
Go to Top of Page

JokerOfACoder
Starting Member

18 Posts

Posted - 2007-05-10 : 22:15:59
lol! I'm a .NET architect, not a DBA.

However, I would like to learn the architecture of designing DBS...I can learn it from a book...but books never tell you about using joins vs ins etc etc. Statement vs pratical etc.

Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2007-05-10 : 22:20:47
quote:
Originally posted by byrmol

Jeff,

Using a JOIN when no columns are needed from the joined table is pointless.

DavidM

Production is just another testing cycle




perhaps ... but I still personally prefer joins in general, unless performance suffers from using them, which I don't think usually is the case. not sure here, didn't check. But it still is better than IN().

- Jeff
http://weblogs.sqlteam.com/JeffS
Go to Top of Page

herothecat
Starting Member

19 Posts

Posted - 2007-05-10 : 22:21:54
Sorry Joker, The question was just TOO similar to what a teacher (or employer) would ask a student (or candidate) to do.

No offense!


Why push the envelope when you can just open it?
Go to Top of Page

Vinnie881
Master Smack Fu Yak Hacker

1231 Posts

Posted - 2007-05-11 : 01:31:12
[code]
-First query:
/*
SELECT * FROM users WHERE userid IN (SELECT userid FROM groupusers WHERE groupid = 1);
*/
--Here's the way I would do it.
Select * from users a where exists (Select * from GroupUsers b where b.userID = a.UserID;

/*
SELECT COUNT(1) FROM group_Users WHERE 1 IN (SELECT 1 FROM permissions WHERE userid = group_users.id);
*/

SELECT COUNT(*) FROM group_Users a WHERE
exists (SELECT * FROM permissions b WHERE a.userid = b.id);
[/code]
Go to Top of Page

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2007-05-11 : 04:06:31
quote:
Originally posted by byrmol

Don't use IN for subqueries.. EVER

Use EXISTS...




Because you say so?
Go to Top of Page

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2007-05-11 : 04:07:55
because it's logicaly different.

_______________________________________________
Causing trouble since 1980
blog: http://weblogs.sqlteam.com/mladenp
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-05-11 : 04:38:13
My $0.02 worth:

Assuming PK, and thus no risk of inducing duplicates, as Jeff assumed: then I would do:

select u.*
from users u
inner join groupusers gu
ON u.userID = gu.userID
AND gu.groupID = 1

in preference to EXISTS. My general rational for this is that during the maintenance phase for the code columns from [groupusers] may become required and there is a smaller risk of cock-up/edge-conditions if the code is already in the style of a JOIN.

I also have the expectation that the Optimiser will give me the same-deal on a JOIN as an EXISTS - but that may not be the case of course!

Kristen
Go to Top of Page

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2007-05-11 : 07:11:56
>>Because you say so?

Yep, because every IN expression can be expressed as EXIST but not vice versa,

DavidM

Production is just another testing cycle
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2007-05-11 : 08:43:18
quote:
Originally posted by Kristen
My general rational for this is that during the maintenance phase for the code columns from [groupusers] may become required and there is a smaller risk of cock-up/edge-conditions if the code is already in the style of a JOIN.



That's pretty much my take as well; very often we need to reference or return things from the related tables and writing things as joins makes maintenance very easy.

- Jeff
http://weblogs.sqlteam.com/JeffS
Go to Top of Page

DonAtWork
Master Smack Fu Yak Hacker

2167 Posts

Posted - 2007-05-11 : 08:57:18
by saying "SELECT u.*", i assume you mean "SELECT [column list]"



[Signature]For fast help, follow this link:
http://weblogs.sqlteam.com/brettk/archive/2005/05/25.aspx
Learn SQL
http://www.sql-tutorial.net/
http://www.firstsql.com/tutor.htm
http://www.w3schools.com/sql/default.asp
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-05-11 : 09:00:06
Yeah, everyone always means that ...
Go to Top of Page

Andraax
Aged Yak Warrior

790 Posts

Posted - 2007-05-11 : 09:35:11
Most of the time, using exists gives the same query plan as using in anyway... Such as this case:

select * from tablea where exists (select 1 from tableb where tablea.id=tableb.someotherid)
OR
select * from tablea where id in (select someotherid from tableb)
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2007-05-11 : 10:13:24
The point is, IN() can only use 1 column to "join" on. over-usage of IN() leads to really bad database designs without any composite primary keys usually -- every table, even one that relates entityA to entityB (which you would think would have a pk of entityAID/entityBID) ends up with an identity. or, worse, keys are concatenated together to allow IN() to work. I've seen many people who write lots of SQL never use joins and always use IN() to write lots of their SELECT's, which is a bad habit to get into, but for some reason beginners seem to gravitate towards that approach.

With EXISTS() or JOINS, you can of course relate things on more than 1 column.

- Jeff
http://weblogs.sqlteam.com/JeffS
Go to Top of Page

harsh_athalye
Master Smack Fu Yak Hacker

5581 Posts

Posted - 2007-05-11 : 10:30:17
"I've seen many people who write lots of SQL never use joins and always use IN() to write lots of their SELECT's, which is a bad habit to get into, but for some reason beginners seem to gravitate towards that approach"

May be because IN sounds so much logical than JOIN that people don't see the performance implications of both approaches.

Slightly off the topic, but I have seen people so resistant regarding their coding habits that they are not even ready to convert their non-ANSI JOIN styles to ANSI one.

But I am all in for JOIN. Normally this is my preferred order:

1. JOIN
2. EXISTS
3. IN

Harsh Athalye
India.
"The IMPOSSIBLE is often UNTRIED"
Go to Top of Page

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2007-05-11 : 16:40:02
In the interests of clarification..

Three reasons never to use IN in a subquery...

1) NULL processing
2) Every IN expression can be expressed as EXIST but not vice versa
3) It was my understanding that IN will process all rows, while EXISTS will "pull up stumps" when the first match is encountered.



DavidM

Production is just another testing cycle
Go to Top of Page
    Next Page

- Advertisement -