| Author |
Topic |
|
Dargon
Starting Member
26 Posts |
Posted - 2008-12-12 : 11:06:19
|
| Hi, All,I feel confused about the query performance and can not explain why I am getting such a strange result. My query is pretty simple:select Event.Qty from Eventwhere Event.Uom in (Select UoM from UoM where exp_Mass=1)Here Event table is really huge one - about 9000000 records, UoM is small lookup table. So execution time is more then 1 min. From what I knew so far, it seems that inner join is much more prefferable way to do subset of records comparing to IN ox EXISTS clauses. So I changed query like this:select Event.Qty from Eventinner join UoM on UoM.UoM=Event.UoMwhere UoM.exp_Mass=1Sounds obvious, right? But it actually runs SLOWER, then original one and I am really puzzled why. The Event table is indexed by another column(EventID), which is not included into query, and subquery in the first example returns small set of data, but the result of outer SELECT is huge as well: about 4.5 million records. So does anybody know why it happens? Thanks,Dargon |
|
|
sodeep
Master Smack Fu Yak Hacker
7174 Posts |
Posted - 2008-12-12 : 11:16:07
|
quote: Originally posted by Dargon Hi, All,I feel confused about the query performance and can not explain why I am getting such a strange result. My query is pretty simple:select Event.Qty from Eventwhere Event.Uom in (Select UoM from UoM where exp_Mass=1)Here Event table is really huge one - about 9000000 records, UoM is small lookup table. So execution time is more then 1 min. From what I knew so far, it seems that inner join is much more prefferable way to do subset of records comparing to IN ox EXISTS clauses. So I changed query like this:select Event.Qty from Eventinner join UoM on UoM.UoM=Event.UoMwhere UoM.exp_Mass=1Do you index on these?Sounds obvious, right? But it actually runs SLOWER, then original one and I am really puzzled why. The Event table is indexed by another column(EventID), which is not included into query, and subquery in the first example returns small set of data, but the result of outer SELECT is huge as well: about 4.5 million records. So does anybody know why it happens? Thanks,Dargon
|
 |
|
|
Dargon
Starting Member
26 Posts |
Posted - 2008-12-12 : 11:40:57
|
| No, and I do not think I can do it, because this table is updated from the application all the time, so adding too many indexes can make performance worse in other places. |
 |
|
|
sodeep
Master Smack Fu Yak Hacker
7174 Posts |
Posted - 2008-12-12 : 12:26:10
|
| What do you see in Execution plan? |
 |
|
|
Dargon
Starting Member
26 Posts |
Posted - 2008-12-12 : 12:58:43
|
| IN case: Event table clustered Index scan - 75%Hash match(right semi join) - 24%Parallelism(Cather streams): 0% Inner join case: Event table clustered Index scan - 70%Hash match(inner join): 23%Parallelism(Cather streams): 7%So you see that the final operation takes longer in the JOIN case. I assume that Index scanning is the same in both cases. |
 |
|
|
Dargon
Starting Member
26 Posts |
Posted - 2008-12-12 : 13:04:18
|
| Btw, it looks that sqlserver tries to cache Event table indexes, or some scanning information, because when I execute these statements multiple times, they run faster and difference is not that significant, so I can live with it for now. Just curious, why I did not get the expected performance boost with JOIN syntax |
 |
|
|
sodeep
Master Smack Fu Yak Hacker
7174 Posts |
Posted - 2008-12-12 : 13:06:58
|
quote: Originally posted by Dargon IN case: Event table clustered Index scan - 75%Hash match(right semi join) - 24%Parallelism(Cather streams): 0% Inner join case: Event table clustered Index scan - 70%Hash match(inner join): 23%Parallelism(Cather streams): 7%Can you use :Option (maxdop 1) in your querySo you see that the final operation takes longer in the JOIN case. I assume that Index scanning is the same in both cases.
|
 |
|
|
Dargon
Starting Member
26 Posts |
Posted - 2008-12-12 : 14:42:13
|
| I am not sure if I understand what it means here. Does it restrict max degree of parallelism in the query? I probably have to read about it and how it could affect performance. Do I need to reset this hint back to default after query finishes? Btw, the quick test did not show any improvements. It could be that I am doing something wrong here. |
 |
|
|
LoztInSpace
Aged Yak Warrior
940 Posts |
Posted - 2008-12-12 : 23:57:47
|
| Some points1) A join is not going to magically increase performance. If it was then all other methods would not be required. Maybe you know your data more than you have told the database. Have you got a unique, non-null constraint on UoM with UoM and exp_Mass? If not, your join is not the same as IN (or exists depending on nullability).2) What is the cardinality of the set? Are you expecting loads or a few rows back?3) What indexes do you have in place? You say you don't want to add them on the join columns, thereby guaranteeing a scan on one or both tables even if the predicate column is indexed.Maybe post the tables & index scripts. |
 |
|
|
|