| Author |
Topic |
|
Srinika
Master Smack Fu Yak Hacker
1378 Posts |
Posted - 2011-06-22 : 12:39:28
|
Hello,Please explain me out of the following 2, which is more efficient and more importantly why it is so?My guess is the 1st one is more efficient, but when I compare with the execution plan, both of those will show as 50%  SELECT T1F1, T1F2, T1F3, T2F2 FROM T1 Left JOIN (SELECT T2F1, MIN(T2F2) AS T2F2 FROM T2 WHERE T2F3 = 'BBB' GROUP BY T2F1 ) T2 ON T2.T2F1 = T1.T1F1 + 'AA'SELECT T1F1, T1F2, T1F3, (SELECT MIN(T2F2) AS T2F2 FROM T2 WHERE T2F3 = 'BBB' And T2.T2F1 = T1.T1F1 + 'AA') FROM T1 Srinika |
|
|
yosiasz
Master Smack Fu Yak Hacker
1635 Posts |
Posted - 2011-06-22 : 13:08:46
|
both seem very poorly written query. if you could provide sample data and desired result we could help you get an even more efficient query.this And T2.T2F1 = T1.T1F1 + 'AA' is very inefficientIf you don't have the passion to help people, you have no passion |
 |
|
|
Srinika
Master Smack Fu Yak Hacker
1378 Posts |
Posted - 2011-06-22 : 14:13:46
|
| Thanks yosiasz for the reply.I can give some sample data and the desired results, but I don't think that has to do anything with my question.This is rather a theory question.Question in other words, Is it efficient to have the field of a table T2, corresponding to a field in T1 directly in the fields listor to have the second table Left Joined as a derived table and get the field.That part of "And T2.T2F1 = T1.T1F1 + 'AA'" is a must for the condition.as I need to find the records corresponding to a field in one table to a field in another table concat with a pre known string.Srinika |
 |
|
|
denis_the_thief
Aged Yak Warrior
596 Posts |
Posted - 2011-06-22 : 14:30:33
|
quote: Originally posted by Srinika Thanks yosiasz for the reply.I can give some sample data and the desired results, but I don't think that has to do anything with my question.
It may have something to do with the answer. When SQL Server determines an execution plan, it uses index and table statistics. Index and Table statistics are based on the Data.If they both show as 50% that would suggest neither is more efficient than the other. Are the 2 execution plans identical? Similar? |
 |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2011-06-22 : 14:32:27
|
quote: This is rather a theory question
Performance is not a theory, it can be measured accurately and compared. Do the 2 queries generate the same plan (same operators, etc.)? The percentage of the batch is not a reliable performance indicator. You need to look at join types, scans vs. seeks, index usage and row count estimations to get a reliable measure. A simple statistics update could completely change either plan and affect performance dramatically.To address yosiaz's point, the + 'AA' condition on your join will always prevent the use of an index, so the query will always go with a table or clustered index scan. This will have a far greater effect than how you write the rest of the query. |
 |
|
|
Srinika
Master Smack Fu Yak Hacker
1378 Posts |
Posted - 2011-06-22 : 14:42:54
|
| Hi Denis,It is a theory question, so data is not going to do anything here (at least as far as I can see). Even if I were to give you data, it would be just random data as my real system has no data yet. Assume there are no indexes in the tables.Execution plans are not identicle.But the most resources are on the following :"Table Scan T2 Cost : 84%"Srinika |
 |
|
|
russell
Pyro-ma-ni-yak
5072 Posts |
Posted - 2011-06-22 : 15:01:57
|
| Assuming no indexes and no data makes this discussion useless.The execution plan on 2 empty tables is irrelavent.You need to know how your data will be used. You'll need to add indexes.Without data, you won't get a meaningful execution plan to look at.With data, you will. Also, besides looking at the exec plan, SET STATISTICS IO ON and look how many reads are performed. Again, with data in the tables. |
 |
|
|
denis_the_thief
Aged Yak Warrior
596 Posts |
Posted - 2011-06-22 : 15:02:30
|
quote: Originally posted by Srinika Hi Denis,Even if I were to give you data, it would be just random data as my real system has no data yet.
I agree it may not be practical to provide your data. |
 |
|
|
Srinika
Master Smack Fu Yak Hacker
1378 Posts |
Posted - 2011-06-24 : 16:43:25
|
quote: Originally posted by russell Assuming no indexes and no data makes this discussion useless.The execution plan on 2 empty tables is irrelavent.You need to know how your data will be used. You'll need to add indexes.Without data, you won't get a meaningful execution plan to look at.With data, you will. Also, besides looking at the exec plan, SET STATISTICS IO ON and look how many reads are performed. Again, with data in the tables.
Those tables are not yet filled with data, but I put random data for testing, and that is aloso few records. Here I'm designing the query so I need to know which way is better.Srinika |
 |
|
|
Srinika
Master Smack Fu Yak Hacker
1378 Posts |
Posted - 2011-06-24 : 16:46:05
|
Hi Robvolk,quote: ... + 'AA' condition on your join will always prevent the use of an index
I have no option than comparing the 2 fields like that as the data will be in the tables as that.Srinika |
 |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2011-06-24 : 16:47:25
|
quote: and that is aloso few records.
"A few records" won't tell you anything. You need to have at least as much data as you expect to fill the table.quote: Here I'm designing the query so I need to know which way is better.
Again, you can't know that without realistic data, and without indexes, you'll get table scans, period. (that sounds familiar, I wonder who said that?) Neither way of writing the query will change that. |
 |
|
|
Srinika
Master Smack Fu Yak Hacker
1378 Posts |
Posted - 2011-06-27 : 10:45:29
|
quote: You need to have at least as much data as you expect to fill the table.
Means I have to provide this forum with 40000 records?or write a script to fill the table with that many random data? quote: ... Neither way of writing the query will change that.
You mean, calculating the Max in a fields list may have the same speed as getting from a one time derived & joined table?Srinika |
 |
|
|
russell
Pyro-ma-ni-yak
5072 Posts |
Posted - 2011-06-27 : 10:49:58
|
| You don't have to provide us with the data. You need the data in the table so that YOU can determine the most efficient way to query it. |
 |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2011-06-27 : 10:57:58
|
Just write a script to fill it with data.quote: You mean, calculating the Max in a fields list may have the same speed as getting from a one time derived & joined table?
The speed depends on numerous factors, the least of which is what kind of plan the query optimizer generates. Right now, with little or no data, and no indexes, WHATEVER query you write will use table or clustered index scans. These are typically the least optimal operators for large tables. Since you want to get a MAX, which is a singleton value, you'll want an index on that column.Additionally, without a representative amount of data in the table, the optimizer will determine the query is trivial and will not even attempt to optimize it, and you'll again probably get table scans. Adding an index should help with getting the MAX, but your JOIN condition with the + 'AA' will still force a table scan (I'm not interested in whether you can change the data or not, the point is how the optimizer will process it) |
 |
|
|
Srinika
Master Smack Fu Yak Hacker
1378 Posts |
Posted - 2011-06-27 : 14:43:50
|
| Hi All, the tables are of a different system and I won't be able to add indexes or do any other modifications.I'm not asking you to optimize the query, instead I need to know out of the above 2 which one is better.Srinika |
 |
|
|
russell
Pyro-ma-ni-yak
5072 Posts |
Posted - 2011-06-27 : 15:12:43
|
| Syntactically, I like the 1st one better.But, as we've been trying to tell you: no one can tell you which is better without data existing, for you to test/view exec plans against. |
 |
|
|
|