Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Question about UNION

Author  Topic 

wentihenduo
Starting Member

8 Posts

Posted - 2013-06-26 : 08:36:15
Hi, I have this kind of problem.
Here I have two tables, say, table A and table B, they have different fields,
and I have a table C, C linked to A and C linked to B by C's id.
Now, I have this task, I need to get top 100 rows from A and B.

I have two ways to do so
1) select top 100 AB.* from(
select A.* from A join C on A.cId = C.id where C.user = @user
UNION
select B.* from B join C on B.cId = C.id where C.user = @user) as AB

2) select top 100 AB.* from
( select A.* from A
UNION
select B.* from B) as AB
join C on AB.cId = C.id where C.user = @user

I do not know which way is better or they are the same. By the way, I have already set indexes on cId, id and user.
I do not know that, when two tables unioned together, do their indexes still get to work? if not, is there a faster way to do this task?

Could someone tell me which the best solution is and why?

Thanks!

James K
Master Smack Fu Yak Hacker

3873 Posts

Posted - 2013-06-26 : 08:49:11
Turn on the execution plan in SSMS (control-M) and run both queries (in one batch), and it will show you the relative costs for each query.

Assuming the two queries are logically equivalent (which it seems like, but I haven't looked closely enough), I suspect the first query would perform better because the inner queries are filtered down, so SQL Server does not have to bring in all the rows from table A and table B.
Go to Top of Page

MIK_2008
Master Smack Fu Yak Hacker

1054 Posts

Posted - 2013-06-26 : 09:18:12
In addition to James, if you put the top 100 clause in both inner select statments, will be faster. Since currently it is joining all records of "a with c" And "b with c", then takes a union, And then select randomly 100 records.. By putting top 100 in both inner select statments would restrict the record set of each select (of the inner select statments) to 100 (instead to be All) records. Overall max number of records of the inner statments would be 200.

Additionally if there is no duplication in the result set of the following two queries
1) select A.* from A join C on A.cId = C.id where C.user = @user
2) select A.* from A join C on A.cId = C.id where C.user = @user

Then use UNION ALL instead of UNION, will make it even more faster.

Cheers
MIK
Go to Top of Page

wentihenduo
Starting Member

8 Posts

Posted - 2013-06-26 : 09:45:34
Hi James and Mik, thanks for your reply.
I forgot that this top 100 is not randomly selected, in A and B, they both have "time" field, and I have set index on this "time", in the end of statement, i have "order by time desc", will index still work on "time"?
Go to Top of Page

wentihenduo
Starting Member

8 Posts

Posted - 2013-06-26 : 10:29:01
quote:
Originally posted by James K

Turn on the execution plan in SSMS (control-M) and run both queries (in one batch), and it will show you the relative costs for each query.




Hi, Thanks for your tip, it is very interesting that method 1) takes 59% cost, and method 2) takes only 41% cost.
Go to Top of Page

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2013-06-26 : 12:50:24
quote:
Originally posted by wentihenduo

Hi James and Mik, thanks for your reply.
I forgot that this top 100 is not randomly selected, in A and B, they both have "time" field, and I have set index on this "time", in the end of statement, i have "order by time desc", will index still work on "time"?

Doubtfull, since you are selecting *.
Go to Top of Page
   

- Advertisement -