Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 Site Related Forums
 Article Discussion
 Article: Writing Outer Joins in T-SQL

Author  Topic 

AskSQLTeam
Ask SQLTeam Question

0 Posts

Posted - 2008-02-11 : 10:00:47

Occasionally someone will ask for my help with a query and say that both a right outer join and a left outer join was tried, and still the expected results were not achieved. That made me realize that some developers do not completely understand outer joins and that an article explaining how to use them might help.



Read Writing Outer Joins in T-SQL

smithje
Starting Member

32 Posts

Posted - 2008-02-13 : 10:44:26
Left/Right, does it matter. Is one better than the other? A few years ago consultant on a project for our company advised me to always write my queries to use Left joins. He had worked on the project to convert the original database application to MS SQL when Microsoft took it over. He claimed the design of the query engine handled Left joins more effeciently than right. I converted several queries that processed large datasets to Left join only and got quicker results. I have used Left exclusively since then. Has this concept ever been tested or written about?
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2008-02-13 : 11:15:00
Hi -- they are technically the same, but it is always clearer to use LEFT OUTER JOINS. I strongly recommend to never use RIGHT OUTER JOINS.

When you write a SQL statement, you should express your "base" in your FROM clause, and from there join to auxiliary tables. There is no SELECT statement that cannot be written in that manner, and it is a nice clear, clean way to organize your code. So, if you want ALL customers and ANY orders that match, I think we can all agree that it makes logical sense to express this as:

SELECT ...
FROM customers
OUTER JOIN TO orders

Clearly, we are primarily selecting customers as our "base", and including any Orders that may or may not exist.

As a RIGHT OUTER JOIN, it becomes:

SELECT
FROM Orders
OUTER JOIN TO Customers

which doesn't make sense -- why are we selecting FROM Orders and joining TO Customers, when potentially we want to return Customers that don't have ANY orders?

Anyway, it is rare to get good advice from a consultant, but it appears that you actually did! Avoid RIGHT JOINS, and stick with LEFT JOINS. If a right outer join seems required to make your query work, you should re-write it and change your FROM clause to make it cleaner, simpler and clearer.

(by the same token, I strongly recommend to avoid FULL OUTER JOINS as well: http://weblogs.sqlteam.com/jeffs/archive/2007/04/19/Full-Outer-Joins.aspx )

- Jeff
http://weblogs.sqlteam.com/JeffS
Go to Top of Page

viacoboni
Starting Member

1 Post

Posted - 2008-02-27 : 10:06:34
I wish you would have included something (I'm ashamed to say) I just learned after 15 years of SQL programming - the difference between an ON clause and a WHERE clause in an outer join.

Of course, there's no practical difference in an INNER join. And an example of comparing the RIGHT table of a LEFT join to a constant was given in the article. My problem came understanding what happens when comparing the LEFT table to a constant in a LEFT join.

Please see http://www.sqlservercentral.com/Forums/Topic449462-338-1.aspx for the gory details of how I misunderstood the issue. Essentially it boils down to this: the ON clause will determine which rows will JOIN left table to right table (and therefore whether the right table has data or is NULL-extended), but WILL NOT limit the rows of the left table. It's all right there in BOL, but this was exceeding surprising and subtle to me as well as my colleagues.

Vince
Go to Top of Page

JBonano
Starting Member

2 Posts

Posted - 2008-02-27 : 14:12:19
I have a question which may be a little off topic but how much of a difference would it have been had I moved the date filter to the Where clause?

You Had:
SELECT c.CustomerID, s.SalesOrderID
FROM Sales.Customer c
LEFT OUTER JOIN Sales.SalesOrderHeader s ON c.CustomerID = s.CustomerID
and s.OrderDate between '1/1/2002' and '12/31/2002'
WHERE s.SalesOrderID IS NULL

Moved:
SELECT c.CustomerID, s.SalesOrderID
FROM Sales.Customer c
LEFT OUTER JOIN Sales.SalesOrderHeader s ON c.CustomerID = s.CustomerID
WHERE s.SalesOrderID IS NULL and s.OrderDate between '1/1/2002' and '12/31/2002'

My first assumption is that we would get a full table scan on the SalesOrderHeader table which is not good. Thoughts?

Thanks
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2008-02-27 : 14:36:52
It doesn't matter which is more efficient; they are not equivalent so it is comparing two completely different things.

see:

http://www.sqlteam.com/article/additional-criteria-in-the-join-clause
and
http://weblogs.sqlteam.com/jeffs/archive/2007/05/14/60205.aspx
and
http://weblogs.sqlteam.com/jeffs/archive/2007/10/11/mixing-inner-outer-joins-sql.aspx

- Jeff
http://weblogs.sqlteam.com/JeffS
Go to Top of Page

JBonano
Starting Member

2 Posts

Posted - 2008-02-27 : 16:16:09
Thanks for the information.
Go to Top of Page

RevMike
Starting Member

9 Posts

Posted - 2008-04-04 : 09:26:31
quote:
Originally posted by JBonano

I have a question which may be a little off topic but how much of a difference would it have been had I moved the date filter to the Where clause?

You Had:
SELECT c.CustomerID, s.SalesOrderID
FROM Sales.Customer c
LEFT OUTER JOIN Sales.SalesOrderHeader s ON c.CustomerID = s.CustomerID
and s.OrderDate between '1/1/2002' and '12/31/2002'
WHERE s.SalesOrderID IS NULL

Moved:
SELECT c.CustomerID, s.SalesOrderID
FROM Sales.Customer c
LEFT OUTER JOIN Sales.SalesOrderHeader s ON c.CustomerID = s.CustomerID
WHERE s.SalesOrderID IS NULL and s.OrderDate between '1/1/2002' and '12/31/2002'

My first assumption is that we would get a full table scan on the SalesOrderHeader table which is not good. Thoughts?

Thanks



Actually you just demonstrated a pitfall which I wish had been included in the main article. If your where clause constrains on something from the "optional" side of the outer join, you can very easily convert the outer join to an inner join! In this particular case you will most likely get no rows returned.

The equivalent is actually this...


WHERE s.SalesOrderID IS NULL and
(s.OrderDate between '1/1/2002' and '12/31/2002'
OR s.OrderDate IS NULL)



Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2008-04-04 : 10:11:14
RevMike --

Your first point is correct, but your second is not -- that is not equivalent. See:

http://weblogs.sqlteam.com/jeffs/archive/2007/05/14/60205.aspx

- Jeff
http://weblogs.sqlteam.com/JeffS
Go to Top of Page

RevMike
Starting Member

9 Posts

Posted - 2008-04-04 : 10:34:16
quote:
Originally posted by jsmith8858

RevMike --

Your first point is correct, but your second is not -- that is not equivalent. See:

http://weblogs.sqlteam.com/jeffs/archive/2007/05/14/60205.aspx

- Jeff
http://weblogs.sqlteam.com/JeffS




D'Oh. I know that but didn't have enough coffee. )
Go to Top of Page

jp2code
Posting Yak Master

175 Posts

Posted - 2009-04-14 : 17:17:34
I need some help with JOINs, please.

We have two tables here: Test_Results and EmployeeInfo

EmployeeInfo is a fairly straightforward table with FIRSTNAME and LASTNAME both setup as nvarchar(255) columns, a [COUNT] int column, a NUM nvarchar(255) column, and others that don't matter here. Temporary employees are assigned NUM sequences in the 990000-999999 range. Active employees are assigned NUM sequences below 990000. If an employee goes from Temporary to full time, they are given a new NUM, and the old NUM stays in the table. Similarly, if an employee is terminated and comes to work here again later, they are assigned a new NUM and the old NUM stays in the table. Duplicate Employee records are prevented by filtering the result for the highest [COUNT] value, which is set up as the table's only key. The [COUNT] key is not used anywhere else in the database.

Test_Results has a varchar(50) column called Op_ID. If an employee is logged onto the system and doing tests, their *Full Name* appears here. Their full name is defined with the EmployeeInfo table using FIRSTNAME + ' ' + LASTNAME.

The machines that write to our Test_Results table are manufactured by one of our vendors. Occasionally, their software developers remote in using their Op_ID values. Similarly, the machines themselves run self-diagnostics tests, and these records are written using the machine names for the Op_ID. These Op_ID values are not included in our company's EmployeeInfo table.

How can I join to get the correct employee's name and number when it is there and display the Test_Results.[Op_ID] information when nothing is there?

Here's what I've tried, but it does not show machine entries and it does not get the employee with the highest [COUNT]
select case when (0<ei.[count]) then ei.firstname+' '+ei.lastname+' ('+ei.num+')' else tr.op_id end as 'Operator',
Serial_Number, tr.Date_Time, tr.System_ID, tr.Test_Result
from test_results tr left outer join employeeinfo ei on (tr.op_id like ei.firstname+'%' and tr.op_id like '%'+ei.lastname)

This is driving me crazy.

I have revisited this stored procedure and made changes to it numerous times as errors surface. Every time I make some change, another part of it stops working.

Does anyone understand how to make this query?

Thanks,
Joe


Avoid Sears Home Improvement
Go to Top of Page

smithje
Starting Member

32 Posts

Posted - 2009-04-16 : 12:51:19
If I understand your data and questions correctly the query below should work. I believe your problem is related to having multiple records in the employee info table that match the "firstname" "lastname" record in your Test Results table. You should use the "[count]" value as a identifier in the Test Results table instead of the names.


select
case when ei.[count] is not null then ei.firstname+' '+ei.lastname+' ('+ei.num+')' else tr.op_id end as 'Operator',
Serial_Number, tr.Date_Time, tr.System_ID, tr.Test_Result
from test_results tr left outer join (select [count], firstname, lastname, num
from employeeinfo
where (tr.op_id like ei.firstname+'%' and tr.op_id like '%'+ei.lastname)
and [count] = (select max[count] from employeeinfo
where (tr.op_id like ei.firstname+'%' and tr.op_id like '%'+ei.lastname))) ei on (tr.op_id like ei.firstname+'%' and tr.op_id like '%'+ei.lastname)
Go to Top of Page

smithje
Starting Member

32 Posts

Posted - 2009-04-16 : 17:00:19
Forgot to put Parans in the Max([count]) line.
Go to Top of Page

peteln123
Starting Member

1 Post

Posted - 2010-05-21 : 09:29:09
Online publishing company urgently needs excellent writers!

We are a global company that writes on multiple topics on an ongoing basis. If you are a
writer, or aspiring writer, and want to work at your own pace, we may be the group for you!

Choose to work part-time or full-time from the convenience of your home. You must have an excellent
command of the English language.

Visit - http://www.londonbrokers.net now to get started!

Go to Top of Page

Piyush
Starting Member

8 Posts

Posted - 2011-07-25 : 02:07:12
If you are searching for the Joins.....then you need not to worry at all, because i have posted a complete article with 3 parts on it;
unspammed

Piyush Bajaj
Go to Top of Page

myst_ukon
Starting Member

2 Posts

Posted - 2014-08-22 : 19:00:03
can someone help me to convert these joins. I have tried and tried, but cannot get the same results that this query produces when I try to convert to ansi joins: basically, I need a row for every order that is in Order and/or EA or and one row for each of the 3 returned by cdef = 1 (there are 3 values so if there is an order in Order with no matching OrderNumber in EA, I should get 3 rows back for each value in cdef (alt1, alt2, ex)

SELECT
o.OrderNumber, o.companyname Company, cdef.shortname RegClass,
isnull(xa.Allotment,0) Allotment,
IsNull((select sum(qty) from tab_item where OrderNumber = xa.OrderNumber and ItemCode = cdef.Purchase_Allotment_Item and ItemStatus = 'A'),0) Purchased,
IsNull((select count(*) from tab_reg where regclass = xa.regclass and OrderNumber = xa.OrderNumber),0) Used

FROM
Order o,
EA xa,
ClassDef cdef

WHERE
(cdef.exhibitor = 1)
and xa.regclass =* cdef.shortname
and o.ordernumber *= xa.ordernumber
and o.ordernumber <> 30000
and o.OrderType = 1

Order By
o.companyname
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2014-08-22 : 19:30:09
myst, please start a new thread. You need outer joins.

Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/
Go to Top of Page
   

- Advertisement -