sql12 Joins
Introduction to Joins
Joins are one of the most important operations performed by a relational database
system. An RDBMS uses joins to match rows from one table with rows from another
table. For example, we can use joins to match sales with customers or books with
authors. Without joins, we might have a list of sales and customers or books and
authors, but we would have no way to determine which customers bought which items
or which authors wrote which books.
We can join two tables explicitly by writing a query that lists both tables in the FROM
clause. We can also join two tables by using a variety of different sub-queries. The
logical join types are:
• Inner join
• Outer join
• Cross join
• Cross apply
• Semi-join
• Anti-semi-join
Here is a simple schema and data set that we will use to illustrate each join type:
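The schema listing itself is not reproduced here. The sketch below is a reconstruction, inferred from the query results shown in the following sections. It uses Python's built-in sqlite3 module rather than SQL Server (an assumption, chosen so that it runs without a server). The name of customer 1 never appears in any result, so 'Customer 1' is a placeholder.

```python
import sqlite3

# Reconstructed sample data (an assumption): the rows are inferred from the
# query results shown later. 'Customer 1' is a placeholder name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Customers (Cust_Id int, Cust_Name varchar(10));
    insert into Customers values (1, 'Customer 1');
    insert into Customers values (2, 'John Doe');
    insert into Customers values (3, 'Jane Doe');

    create table Sales (Cust_Id int, Item varchar(10));
    insert into Sales values (2, 'Camera');
    insert into Sales values (3, 'Computer');
    insert into Sales values (3, 'Monitor');
    insert into Sales values (4, 'Printer');
""")

customers = conn.execute("select * from Customers order by Cust_Id").fetchall()
sales = conn.execute("select * from Sales order by Cust_Id, Item").fetchall()
print(customers)
print(sales)
```

Customer 1 has no sales and the 'Printer' sale points at a non-existent customer 4, which is what makes the join types below return different results.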
Inner joins
Inner joins are the most common join type. An inner join looks for pairs of rows, one
from each table, that together satisfy a join predicate. For example, this query uses the
join predicate “S.Cust_Id = C.Cust_Id” to find all Sales and Customers rows with the same
Cust_Id:
select *
from Sales S inner join Customers C
on S.Cust_Id = C.Cust_Id
Cust_Id     Item       Cust_Id     Cust_Name
----------- ---------- ----------- ----------
2           Camera     2           John Doe
3           Computer   3           Jane Doe
3           Monitor    3           Jane Doe
Notes:
• Cust_Id 3 bought two items so this customer row appears twice in the result.
• Cust_Id 1 did not purchase anything and so does not appear in the result.
• We sold a ‘Printer’ to Cust_Id 4. There is no such customer so this sale does
not appear in the result.
Inner joins are fully commutative. “A inner join B” and “B inner join A” are
equivalent.
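The commutativity claim can be checked with a short sketch. It assumes SQLite (through Python's sqlite3 module) in place of SQL Server, and sample rows reconstructed from the results above. 'Customer 1' is a placeholder name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed sample data, reconstructed from the results shown in the text.
conn.executescript("""
    create table Customers (Cust_Id int, Cust_Name varchar(10));
    insert into Customers values (1, 'Customer 1');
    insert into Customers values (2, 'John Doe');
    insert into Customers values (3, 'Jane Doe');
    create table Sales (Cust_Id int, Item varchar(10));
    insert into Sales values (2, 'Camera');
    insert into Sales values (3, 'Computer');
    insert into Sales values (3, 'Monitor');
    insert into Sales values (4, 'Printer');
""")

# The same column list is selected both times so the rows are comparable.
s_join_c = conn.execute("""
    select S.Cust_Id, S.Item, C.Cust_Name
    from Sales S inner join Customers C
    on S.Cust_Id = C.Cust_Id
""").fetchall()
c_join_s = conn.execute("""
    select S.Cust_Id, S.Item, C.Cust_Name
    from Customers C inner join Sales S
    on S.Cust_Id = C.Cust_Id
""").fetchall()

# Row order is not guaranteed without ORDER BY, so compare sorted lists.
same_rows = sorted(s_join_c) == sorted(c_join_s)
print(same_rows)
```

Only the set of rows is guaranteed to match. With "select *" the column order follows the order of the tables in the FROM clause.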
Outer joins
Suppose that we would like to see a list of all sales, even those that do not have a
matching customer. We can write this query using an outer join. An outer join
preserves all rows in one or both of the input tables even if we cannot find a matching
row per the join predicate. For example:
select *
from Sales S left outer join Customers C
on S.Cust_Id = C.Cust_Id
Note that the server returns NULLs for the customer data associated with the ‘Printer’
sale since there is no matching customer. We refer to this row as “NULL extended.”
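As a minimal sketch of NULL extension, the left outer join can be run against SQLite through Python's sqlite3 module (an assumption, since the text targets SQL Server). The sample rows are reconstructed from the results shown earlier, and 'Customer 1' is a placeholder name. Python shows SQL NULL as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed sample data, reconstructed from the results shown in the text.
conn.executescript("""
    create table Customers (Cust_Id int, Cust_Name varchar(10));
    insert into Customers values (1, 'Customer 1');
    insert into Customers values (2, 'John Doe');
    insert into Customers values (3, 'Jane Doe');
    create table Sales (Cust_Id int, Item varchar(10));
    insert into Sales values (2, 'Camera');
    insert into Sales values (3, 'Computer');
    insert into Sales values (3, 'Monitor');
    insert into Sales values (4, 'Printer');
""")

rows = conn.execute("""
    select S.Cust_Id, S.Item, C.Cust_Id, C.Cust_Name
    from Sales S left outer join Customers C
    on S.Cust_Id = C.Cust_Id
    order by S.Cust_Id, S.Item
""").fetchall()

for row in rows:
    print(row)
# The last row is the NULL-extended 'Printer' sale: (4, 'Printer', None, None)
```

All four sales are preserved. Customer 1, who bought nothing, still does not appear, because only the left input (Sales) is preserved.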
Using a full outer join, we can find all customers regardless of whether they purchased
anything and all sales regardless of whether they have a valid customer:
select *
from Sales S full outer join Customers C
on S.Cust_Id = C.Cust_Id
The following table shows which rows will be preserved or NULL extended for each
outer join variation:
Join                     Preserve …
A left outer join B      all A rows
A right outer join B     all B rows
A full outer join B      all A and B rows
Full outer joins are commutative. In addition, “A left outer join B” and “B right outer
join A” are equivalent.
Cross joins
A cross join performs a full Cartesian product of two tables. That is, it matches every
row of one table with every row of the other table, so cross joining tables of m and n
rows returns m × n rows. You cannot specify a join predicate for a cross join using the
ON clause, though you can use a WHERE clause to achieve essentially the same result as
an inner join.
Cross joins are fairly uncommon. Two large tables should never be cross joined, as this
will result in a very expensive operation and a very large result set.
select *
from Sales S cross join Customers C
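The sketch below illustrates both points: the size of the Cartesian product, and the WHERE clause that reduces a cross join to an inner join. It assumes SQLite through Python's sqlite3 module and sample rows reconstructed from the earlier results ('Customer 1' is a placeholder name).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed sample data, reconstructed from the results shown in the text.
conn.executescript("""
    create table Customers (Cust_Id int, Cust_Name varchar(10));
    insert into Customers values (1, 'Customer 1');
    insert into Customers values (2, 'John Doe');
    insert into Customers values (3, 'Jane Doe');
    create table Sales (Cust_Id int, Item varchar(10));
    insert into Sales values (2, 'Camera');
    insert into Sales values (3, 'Computer');
    insert into Sales values (3, 'Monitor');
    insert into Sales values (4, 'Printer');
""")

# Every Sales row paired with every Customers row: 4 * 3 rows.
cross = conn.execute(
    "select * from Sales S cross join Customers C"
).fetchall()

# A WHERE clause on the cross join keeps only the matching pairs ...
filtered = conn.execute("""
    select * from Sales S cross join Customers C
    where S.Cust_Id = C.Cust_Id
""").fetchall()

# ... which is the same set of rows as the inner join.
inner = conn.execute("""
    select * from Sales S inner join Customers C
    on S.Cust_Id = C.Cust_Id
""").fetchall()

print(len(cross), len(filtered), sorted(filtered) == sorted(inner))
```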
Semi-joins and anti-semi-joins
A semi-join returns rows from one table that would join with another table without
performing a complete join. An anti-semi-join returns rows from one table that would
not join with another table; these are the rows that would be NULL extended if we
performed an outer join.
Unlike the other join operators, there is no explicit syntax to write “semi-join,” but
SQL Server uses semi-joins in a variety of circumstances. For example, we may use a
semi-join to evaluate an EXISTS sub-query:
select *
from Customers C
where exists (
select *
from Sales S
where S.Cust_Id = C.Cust_Id
)
Cust_Id     Cust_Name
----------- ----------
2           John Doe
3           Jane Doe
There are left and right semi-joins. A left semi-join returns rows from the left (first)
input that match rows from the right (second) input while a right semi-join returns
rows from the right input that match rows from the left input.
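A minimal sketch of both operators, written as EXISTS and NOT EXISTS sub-queries. It assumes SQLite through Python's sqlite3 module and sample rows reconstructed from the results above. 'Customer 1' is a placeholder, since that customer's name never appears in the text. Whether a given engine actually executes these sub-queries as semi-joins is up to its optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed sample data, reconstructed from the results shown in the text.
conn.executescript("""
    create table Customers (Cust_Id int, Cust_Name varchar(10));
    insert into Customers values (1, 'Customer 1');
    insert into Customers values (2, 'John Doe');
    insert into Customers values (3, 'Jane Doe');
    create table Sales (Cust_Id int, Item varchar(10));
    insert into Sales values (2, 'Camera');
    insert into Sales values (3, 'Computer');
    insert into Sales values (3, 'Monitor');
    insert into Sales values (4, 'Printer');
""")

# Semi-join: customers with at least one sale, each returned once.
semi = conn.execute("""
    select * from Customers C
    where exists (select * from Sales S where S.Cust_Id = C.Cust_Id)
    order by C.Cust_Id
""").fetchall()

# Anti-semi-join: customers with no sale at all.
anti = conn.execute("""
    select * from Customers C
    where not exists (select * from Sales S where S.Cust_Id = C.Cust_Id)
""").fetchall()

print(semi)
print(anti)
```

Jane Doe bought two items but appears only once in the semi-join result, whereas the inner join returns her row twice.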
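The remaining examples use two tables, employee and department, that share a
DepartmentID column. An explicit inner join of the two looks like this:
SELECT *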
FROM employee
INNER JOIN department
ON employee.DepartmentID = department.DepartmentID
The resulting joined table contains two columns named DepartmentID, one
from table Employee and one from table Department.
Natural join
A natural join implicitly joins on every column that has the same name in both tables:
SELECT *
FROM employee NATURAL JOIN department
The result appears slightly different from that of the explicit inner join, however,
because only one DepartmentID column occurs in the joined table.
Expressing joins with the NATURAL JOIN keyword is ambiguous at best, and it leaves
systems open to problems if schema changes occur in the database. For example,
the removal, addition, or renaming of columns changes the semantics of a natural
join. The safer approach is to code the join condition explicitly using a
regular inner join.
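The schema-change hazard can be sketched as follows, assuming SQLite through Python's sqlite3 module. The employee and department rows are a small assumed subset, and the Location column is hypothetical, added only to trigger the problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed sample rows, for illustration only.
conn.executescript("""
    create table employee (LastName text, DepartmentID int);
    insert into employee values ('Rafferty', 31), ('Jones', 33), ('Robinson', 34);
    create table department (DepartmentID int, DepartmentName text);
    insert into department values (31, 'Sales'), (33, 'Engineering'), (34, 'Clerical');
""")

query = "select * from employee natural join department"

cur = conn.execute(query)
columns = [d[0] for d in cur.description]  # DepartmentID appears only once
before = cur.fetchall()

# Hypothetical schema change: both tables gain a column with the same name.
conn.executescript("""
    alter table employee add column Location text;
    alter table department add column Location text;
    update employee set Location = 'Remote';
    update department set Location = 'HQ';
""")

# The unchanged query now also requires employee.Location = department.Location.
after = conn.execute(query).fetchall()

print(columns)
print(len(before), len(after))
```

The query text never changed, yet it silently went from three rows to none. An explicit ON employee.DepartmentID = department.DepartmentID would be unaffected by the new column.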
Left outer join
The result of a left outer join for tables A and B always contains all records of
the "left" table (A), even if the join condition does not find any matching
record in the "right" table (B). This means that if the ON clause matches 0
(zero) records in B for a given record in A, the join will still return a row for
that record, but with NULL in each column from B.
A left outer join returns all the values from the left table, plus matched values
from the right table (or NULL where the join predicate finds no match).
For example, this allows us to find an employee's department, but still to show
the employee even when their department does not exist (contrary to the
inner-join example above, where employees in non-existent departments get
filtered out).
SELECT *
FROM employee
LEFT OUTER JOIN department
ON employee.DepartmentID = department.DepartmentID
Employee.LastName  Employee.DepartmentID  Department.DepartmentName  Department.DepartmentID
Jones              33                     Engineering                33
Rafferty           31                     Sales                      31
Robinson           34                     Clerical                   34
Right outer join
A right outer join closely resembles a left outer join, except with the treatment of
the tables reversed. Every record from the "right" table (B) will appear in the joined
table at least once. If no matching row from the "left" table (A) exists, NULL will
appear in columns from A for those records of B that have no match in A.
A right outer join returns all the values from the right table and matched values
from the left table (or NULL where the join predicate finds no match).
SELECT *
FROM employee
RIGHT OUTER JOIN department
ON employee.DepartmentID = department.DepartmentID
Employee.LastName  Employee.DepartmentID  Department.DepartmentName  Department.DepartmentID
Smith              34                     Clerical                   34
Jones              33                     Engineering                33
Robinson           34                     Clerical                   34
Full outer join
A full outer join combines the results of both left and right outer joins. The
joined table will contain all records from both tables, and fill in NULLs for
missing matches on either side.
SELECT *
FROM employee
FULL OUTER JOIN department
ON employee.DepartmentID = department.DepartmentID
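Some database systems do not support full outer joins directly. The same result can be
emulated as the union of a left outer join and the unmatched rows of a right outer join:
SELECT *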
FROM employee
LEFT JOIN department
ON employee.DepartmentID = department.DepartmentID
UNION
SELECT *
FROM employee
RIGHT JOIN department
ON employee.DepartmentID = department.DepartmentID
WHERE employee.DepartmentID IS NULL
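The emulation can be tried with the sketch below, which assumes SQLite through Python's sqlite3 module. Because older SQLite builds reject RIGHT JOIN, the second branch is written as a left join with the tables swapped, which is equivalent. The sample rows are assumptions: 'Williams' (no department) and 'Marketing' (no employees) are hypothetical rows added so that both sides have an unmatched record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed sample rows; 'Williams' and 'Marketing' are hypothetical.
conn.executescript("""
    create table employee (LastName text, DepartmentID int);
    insert into employee values ('Rafferty', 31), ('Jones', 33), ('Williams', null);
    create table department (DepartmentID int, DepartmentName text);
    insert into department values (31, 'Sales'), (33, 'Engineering'), (35, 'Marketing');
""")

rows = conn.execute("""
    select e.LastName, e.DepartmentID, d.DepartmentName, d.DepartmentID
    from employee e left join department d
      on e.DepartmentID = d.DepartmentID
    union
    -- "employee right join department", written as a swapped left join
    select e.LastName, e.DepartmentID, d.DepartmentName, d.DepartmentID
    from department d left join employee e
      on e.DepartmentID = d.DepartmentID
    where e.DepartmentID is null
""").fetchall()

for row in rows:
    print(row)
```

The WHERE clause keeps only departments without employees, so matched pairs come from the first branch alone. Note that UNION also removes duplicate rows, so this form collapses rows that are genuinely identical in the inputs.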
[Figure: inner join]
The left outer join returns all the rows of the left collection and, where
present, the matching rows of the right collection (in other words, it doesn't
leave out rows from the left collection). In a Venn diagram of the two collections,
that is the entire left circle, including its overlap with the right circle.
The right outer join does exactly the same as the left outer join, but in
reverse, so that is the entire right circle, including its overlap with the left
circle.
[Figure: right outer join]
The full outer join, which is basically a left outer join and right outer join
added together, simply returns everything, that is, both circles in full.