Which is more efficient and why

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2008 Forums

Transact-SQL (2008)

Which is more efficient and why

Author

Topic

Srinika
Master Smack Fu Yak Hacker

1378 Posts

Posted - 2011-06-22 : 12:39:28

Hello,

Please explain me out of the following 2, which is more efficient and more importantly why it is so?

My guess is the 1st one is more efficient, but when I compare with the execution plan, both of those will show as 50%


SELECT T1F1, T1F2, T1F3, T2F2 
FROM T1 
Left JOIN (SELECT T2F1, MIN(T2F2) AS T2F2  FROM T2
			WHERE T2F3 = 'BBB'
			GROUP BY T2F1 	) T2 ON T2.T2F1 = T1.T1F1 + 'AA'



SELECT T1F1, T1F2, T1F3, 
	(SELECT MIN(T2F2) AS T2F2  FROM T2
		WHERE T2F3 = 'BBB' And T2.T2F1 = T1.T1F1 + 'AA') 
FROM T1

Srinika

yosiasz
Master Smack Fu Yak Hacker

1635 Posts

Posted - 2011-06-22 : 13:08:46

both seem very poorly written query. if you could provide sample data and desired result we could help you get an even more efficient query.

this


And T2.T2F1 = T1.T1F1 + 'AA'

is very inefficient

If you don't have the passion to help people, you have no passion

Srinika
Master Smack Fu Yak Hacker

1378 Posts

Posted - 2011-06-22 : 14:13:46

Thanks yosiasz for the reply.

I can give some sample data and the desired results, but I don't think that has to do anything with my question.
This is rather a theory question.

Question in other words,
Is it efficient to have the field of a table T2, corresponding to a field in T1 directly in the fields list
or to have the second table Left Joined as a derived table and get the field.

That part of "And T2.T2F1 = T1.T1F1 + 'AA'" is a must for the condition.
as I need to find the records corresponding to a field in one table to a field in another table concat with a pre known string.

Srinika

denis_the_thief
Aged Yak Warrior

596 Posts

Posted - 2011-06-22 : 14:30:33

quote:
Originally posted by Srinika

Thanks yosiasz for the reply.

I can give some sample data and the desired results, but I don't think that has to do anything with my question.

It may have something to do with the answer. When SQL Server determines an execution plan, it uses index and table statistics. Index and Table statistics are based on the Data.

If they both show as 50% that would suggest neither is more efficient than the other. Are the 2 execution plans identical? Similar?

robvolk
Most Valuable Yak

15732 Posts

Posted - 2011-06-22 : 14:32:27

quote:
This is rather a theory question

Performance is not a theory, it can be measured accurately and compared. Do the 2 queries generate the same plan (same operators, etc.)? The percentage of the batch is not a reliable performance indicator. You need to look at join types, scans vs. seeks, index usage and row count estimations to get a reliable measure. A simple statistics update could completely change either plan and affect performance dramatically.

To address yosiaz's point, the + 'AA' condition on your join will always prevent the use of an index, so the query will always go with a table or clustered index scan. This will have a far greater effect than how you write the rest of the query.

Srinika
Master Smack Fu Yak Hacker

1378 Posts

Posted - 2011-06-22 : 14:42:54

Hi Denis,

It is a theory question, so data is not going to do anything here (at least as far as I can see). Even if I were to give you data, it would be just random data as my real system has no data yet. Assume there are no indexes in the tables.

Execution plans are not identicle.
But the most resources are on the following :
"Table Scan T2 Cost : 84%"

Srinika

russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2011-06-22 : 15:01:57

Assuming no indexes and no data makes this discussion useless.

The execution plan on 2 empty tables is irrelavent.

You need to know how your data will be used. You'll need to add indexes.

Without data, you won't get a meaningful execution plan to look at.

With data, you will. Also, besides looking at the exec plan, SET STATISTICS IO ON and look how many reads are performed. Again, with data in the tables.

denis_the_thief
Aged Yak Warrior

596 Posts

Posted - 2011-06-22 : 15:02:30

quote:
Originally posted by Srinika

Hi Denis,

Even if I were to give you data, it would be just random data as my real system has no data yet.

I agree it may not be practical to provide your data.

Srinika
Master Smack Fu Yak Hacker

1378 Posts

Posted - 2011-06-24 : 16:43:25

quote:
Originally posted by russell

Assuming no indexes and no data makes this discussion useless.

The execution plan on 2 empty tables is irrelavent.

You need to know how your data will be used. You'll need to add indexes.

Without data, you won't get a meaningful execution plan to look at.

With data, you will. Also, besides looking at the exec plan, SET STATISTICS IO ON and look how many reads are performed. Again, with data in the tables.

Those tables are not yet filled with data, but I put random data for testing, and that is aloso few records.
Here I'm designing the query so I need to know which way is better.

Srinika

Srinika
Master Smack Fu Yak Hacker

1378 Posts

Posted - 2011-06-24 : 16:46:05

Hi Robvolk,

quote:

... + 'AA' condition on your join will always prevent the use of an index

I have no option than comparing the 2 fields like that as the data will be in the tables as that.

Srinika

robvolk
Most Valuable Yak

15732 Posts

Posted - 2011-06-24 : 16:47:25

quote:
and that is aloso few records.

"A few records" won't tell you anything. You need to have at least as much data as you expect to fill the table.

quote:
Here I'm designing the query so I need to know which way is better.

Again, you can't know that without realistic data, and without indexes, you'll get table scans, period. (that sounds familiar, I wonder who said that?) Neither way of writing the query will change that.

Srinika
Master Smack Fu Yak Hacker

1378 Posts

Posted - 2011-06-27 : 10:45:29

quote:
You need to have at least as much data as you expect to fill the table.

Means I have to provide this forum with 40000 records?
or write a script to fill the table with that many random data?

quote:
... Neither way of writing the query will change that.

You mean, calculating the Max in a fields list may have the same speed as getting from a one time derived & joined table?

Srinika

russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2011-06-27 : 10:49:58

You don't have to provide us with the data. You need the data in the table so that YOU can determine the most efficient way to query it.

robvolk
Most Valuable Yak

15732 Posts

Posted - 2011-06-27 : 10:57:58

Just write a script to fill it with data.

quote:
You mean, calculating the Max in a fields list may have the same speed as getting from a one time derived & joined table?

The speed depends on numerous factors, the least of which is what kind of plan the query optimizer generates. Right now, with little or no data, and no indexes, WHATEVER query you write will use table or clustered index scans. These are typically the least optimal operators for large tables. Since you want to get a MAX, which is a singleton value, you'll want an index on that column.

Additionally, without a representative amount of data in the table, the optimizer will determine the query is trivial and will not even attempt to optimize it, and you'll again probably get table scans. Adding an index should help with getting the MAX, but your JOIN condition with the + 'AA' will still force a table scan (I'm not interested in whether you can change the data or not, the point is how the optimizer will process it)

Srinika
Master Smack Fu Yak Hacker

1378 Posts

Posted - 2011-06-27 : 14:43:50

Hi All,
the tables are of a different system and I won't be able to add indexes or do any other modifications.

I'm not asking you to optimize the query, instead I need to know out of the above 2 which one is better.

Srinika

russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2011-06-27 : 15:12:43

Syntactically, I like the 1st one better.

But, as we've been trying to tell you: no one can tell you which is better without data existing, for you to test/view exec plans against.

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources