
Joins

Introduction to Joins

Joins are one of the most important operations performed by a relational database
system. An RDBMS uses joins to match rows from one table with rows from another
table. For example, we can use joins to match sales with customers or books with
authors. Without joins, we might have a list of sales and customers or books and
authors, but we would have no way to determine which customers bought which items
or which authors wrote which books.

We can join two tables explicitly by writing a query that lists both tables in the FROM
clause. We can also join two tables by using a variety of different sub-queries.

The different types of joins are:

• Inner join
• Outer join
• Cross join
• Cross apply
• Semi-join
• Anti-semi-join

Here is a simple schema and data set that we will use to illustrate each join type:

create table Customers (Cust_Id int, Cust_Name varchar(10))


insert Customers values (1, 'Craig')
insert Customers values (2, 'John Doe')
insert Customers values (3, 'Jane Doe')

create table Sales (Cust_Id int, Item varchar(10))


insert Sales values (2, 'Camera')
insert Sales values (3, 'Computer')
insert Sales values (3, 'Monitor')
insert Sales values (4, 'Printer')

Inner joins

Inner joins are the most common join type. An inner join returns each pair of rows,
one from each table, that together satisfy a join predicate. For example, this query
uses the join predicate “S.Cust_Id = C.Cust_Id” to find all Sales and Customer rows
with the same Cust_Id:

select *
from Sales S inner join Customers C
on S.Cust_Id = C.Cust_Id

Cust_Id     Item       Cust_Id     Cust_Name
----------- ---------- ----------- ----------
2           Camera     2           John Doe
3           Computer   3           Jane Doe
3           Monitor    3           Jane Doe

Notes:

• Cust_Id 3 bought two items so this customer row appears twice in the result.
• Cust_Id 1 did not purchase anything and so does not appear in the result.
• We sold a ‘Printer’ to Cust_Id 4. There is no such customer so this sale does
not appear in the result.

Inner joins are fully commutative. “A inner join B” and “B inner join A” are
equivalent.
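
The inner join and its commutativity can be checked with a short script. This is only a sketch: it assumes Python's built-in sqlite3 module as a stand-in for the database server, loads the sample data from above, and lists the columns explicitly so that both join orders produce the same column layout.

```python
import sqlite3

# In-memory SQLite database standing in for the server (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Customers (Cust_Id int, Cust_Name varchar(10));
    insert into Customers values (1, 'Craig'), (2, 'John Doe'), (3, 'Jane Doe');
    create table Sales (Cust_Id int, Item varchar(10));
    insert into Sales values (2, 'Camera'), (3, 'Computer'), (3, 'Monitor'), (4, 'Printer');
""")

# "Sales inner join Customers"
s_join_c = conn.execute("""
    select S.Cust_Id, S.Item, C.Cust_Name
    from Sales S inner join Customers C on S.Cust_Id = C.Cust_Id
    order by S.Cust_Id, S.Item
""").fetchall()

# "Customers inner join Sales" with the same select list
c_join_s = conn.execute("""
    select S.Cust_Id, S.Item, C.Cust_Name
    from Customers C inner join Sales S on S.Cust_Id = C.Cust_Id
    order by S.Cust_Id, S.Item
""").fetchall()

print(s_join_c)              # the three matching sale/customer pairs
print(s_join_c == c_join_s)  # True: the join order does not change the rows
```

Only the order of the columns under "select *" depends on which table is written first; the set of matching rows does not.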

Outer joins

Suppose that we would like to see a list of all sales, even those that do not have a
matching customer. We can write this query using an outer join. An outer join
preserves all rows in one or both of the input tables even if we cannot find a matching
row per the join predicate. For example:

select *
from Sales S left outer join Customers C
on S.Cust_Id = C.Cust_Id

Cust_Id     Item       Cust_Id     Cust_Name
----------- ---------- ----------- ----------
2           Camera     2           John Doe
3           Computer   3           Jane Doe
3           Monitor    3           Jane Doe
4           Printer    NULL        NULL

Note that the server returns NULLs for the customer data associated with the ‘Printer’
sale since there is no matching customer. We refer to this row as “NULL extended.”
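
As a rough illustration, the NULL extended row can be picked out programmatically. The sketch below assumes Python's built-in sqlite3 module in place of the server; SQL NULLs come back as Python None.

```python
import sqlite3

# In-memory SQLite database standing in for the server (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Customers (Cust_Id int, Cust_Name varchar(10));
    insert into Customers values (1, 'Craig'), (2, 'John Doe'), (3, 'Jane Doe');
    create table Sales (Cust_Id int, Item varchar(10));
    insert into Sales values (2, 'Camera'), (3, 'Computer'), (3, 'Monitor'), (4, 'Printer');
""")

rows = conn.execute("""
    select S.Cust_Id, S.Item, C.Cust_Id, C.Cust_Name
    from Sales S left outer join Customers C on S.Cust_Id = C.Cust_Id
    order by S.Cust_Id, S.Item
""").fetchall()

# A preserved Sales row with no matching customer has None in the Customers columns.
null_extended = [r for r in rows if r[2] is None]

print(len(rows))       # every Sales row is preserved
print(null_extended)   # the 'Printer' sale, NULL extended
```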

Using a full outer join, we can find all customers regardless of whether they purchased
anything and all sales regardless of whether they have a valid customer:

select *
from Sales S full outer join Customers C
on S.Cust_Id = C.Cust_Id

Cust_Id     Item       Cust_Id     Cust_Name
----------- ---------- ----------- ----------
2           Camera     2           John Doe
3           Computer   3           Jane Doe
3           Monitor    3           Jane Doe
4           Printer    NULL        NULL
NULL        NULL       1           Craig

The following table shows which rows will be preserved or NULL extended for each
outer join variation:

Join                   Preserve …
A left outer join B    all A rows
A right outer join B   all B rows
A full outer join B    all A and B rows

Full outer joins are commutative. In addition, “A left outer join B” and “B right outer
join A” are equivalent.

Cross joins

A cross join performs a full Cartesian product of two tables. That is, it matches every
row of one table with every row of another table. You cannot specify a join predicate
for a cross join using the ON clause, though you can add a WHERE clause to achieve
essentially the same result as an inner join.

Cross joins are fairly uncommon. Two large tables should never be cross joined as this
will result in a very expensive operation and a very large result set.

select *
from Sales S cross join Customers C

Cust_Id     Item       Cust_Id     Cust_Name
----------- ---------- ----------- ----------
2           Camera     1           Craig
3           Computer   1           Craig
3           Monitor    1           Craig
4           Printer    1           Craig
2           Camera     2           John Doe
3           Computer   2           John Doe
3           Monitor    2           John Doe
4           Printer    2           John Doe
2           Camera     3           Jane Doe
3           Computer   3           Jane Doe
3           Monitor    3           Jane Doe
4           Printer    3           Jane Doe
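
Two properties of the cross join can be verified with a small script: the result has one row per pair of input rows, and adding a WHERE clause reduces it to the inner join result. This is a sketch that assumes Python's built-in sqlite3 module as a stand-in for the server.

```python
import sqlite3

# In-memory SQLite database standing in for the server (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Customers (Cust_Id int, Cust_Name varchar(10));
    insert into Customers values (1, 'Craig'), (2, 'John Doe'), (3, 'Jane Doe');
    create table Sales (Cust_Id int, Item varchar(10));
    insert into Sales values (2, 'Camera'), (3, 'Computer'), (3, 'Monitor'), (4, 'Printer');
""")

# Cartesian product: 4 sales x 3 customers.
cross = conn.execute("select * from Sales S cross join Customers C").fetchall()
print(len(cross))  # 12

# Cross join filtered by a WHERE clause ...
via_where = conn.execute("""
    select S.Cust_Id, S.Item, C.Cust_Name
    from Sales S cross join Customers C
    where S.Cust_Id = C.Cust_Id
    order by S.Cust_Id, S.Item
""").fetchall()

# ... compared with the equivalent inner join.
via_inner = conn.execute("""
    select S.Cust_Id, S.Item, C.Cust_Name
    from Sales S inner join Customers C on S.Cust_Id = C.Cust_Id
    order by S.Cust_Id, S.Item
""").fetchall()

print(via_where == via_inner)  # True
```

The row count grows as the product of the two table sizes, which is why cross joining two large tables is so expensive.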

Semi-join and Anti-semi-join

A semi-join returns rows from one table that would join with another table without
performing a complete join. An anti-semi-join returns rows from one table that would
not join with another table; these are the rows that would be NULL extended if we
performed an outer join.

Unlike the other join operators, there is no explicit syntax to write “semi-join,” but
SQL Server uses semi-joins in a variety of circumstances. For example, we may use a
semi-join to evaluate an EXISTS sub-query:

select *
from Customers C
where exists (
select *
from Sales S
where S.Cust_Id = C.Cust_Id
)

Cust_Id     Cust_Name
----------- ----------
2           John Doe
3           Jane Doe

There are left and right semi-joins. A left semi-join returns rows from the left (first)
input that match rows from the right (second) input while a right semi-join returns
rows from the right input that match rows from the left input.
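
The anti-semi-join has no explicit syntax either; a NOT EXISTS sub-query is one common way to express it. The sketch below assumes Python's built-in sqlite3 module as a stand-in for the server and shows the logical results only. Whether an engine actually evaluates these queries with semi-join operators is up to its optimizer.

```python
import sqlite3

# In-memory SQLite database standing in for the server (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Customers (Cust_Id int, Cust_Name varchar(10));
    insert into Customers values (1, 'Craig'), (2, 'John Doe'), (3, 'Jane Doe');
    create table Sales (Cust_Id int, Item varchar(10));
    insert into Sales values (2, 'Camera'), (3, 'Computer'), (3, 'Monitor'), (4, 'Printer');
""")

# Semi-join: customers that have at least one sale, each returned once.
semi = conn.execute("""
    select C.Cust_Id, C.Cust_Name
    from Customers C
    where exists (select * from Sales S where S.Cust_Id = C.Cust_Id)
    order by C.Cust_Id
""").fetchall()

# Anti-semi-join: customers that have no sale at all.
anti = conn.execute("""
    select C.Cust_Id, C.Cust_Name
    from Customers C
    where not exists (select * from Sales S where S.Cust_Id = C.Cust_Id)
    order by C.Cust_Id
""").fetchall()

print(semi)  # [(2, 'John Doe'), (3, 'Jane Doe')]
print(anti)  # [(1, 'Craig')]
```

Note that Jane Doe appears once in the semi-join result even though she has two sales; an inner join would return her row twice.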

Types of inner joins


Equi-join

An equi-join (also known as an equijoin) is a specific type of comparator-based
join, or theta join, that uses only equality comparisons in the join-predicate.
Using other comparison operators (such as <) disqualifies a join as an equi-join.
The inner join query shown earlier is an equi-join. Here is another, on employee
and department tables that share a DepartmentID column:

SELECT *
FROM employee
INNER JOIN department
ON employee.DepartmentID = department.DepartmentID

The resulting joined table contains two columns named DepartmentID, one
from table Employee and one from table Department.

Natural join

A natural join offers a further specialization of equi-joins. The join predicate
arises implicitly by comparing all columns in both tables that have the same
column-name in the joined tables. The resulting joined table contains only one
column for each pair of equally-named columns.

The above sample query for inner joins can be expressed as a natural join in the
following way:

SELECT *
FROM employee NATURAL JOIN department

The result appears slightly different, however, because only one DepartmentID
column occurs in the joined table.

Employee.LastName  DepartmentID  Department.DepartmentName
Smith              34            Clerical
Jones              33            Engineering
Robinson           34            Clerical

Using the NATURAL JOIN keyword to express joins can suffer from ambiguity at
best, and leaves systems open to problems if schema changes occur in the
database. For example, the removal, addition, or renaming of columns changes
the semantics of a natural join. Thus the safer approach involves explicitly
coding the join-condition using a regular inner join.
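
The schema-change hazard can be reproduced with the Customers and Sales tables from the start of this document. The sketch below assumes Python's built-in sqlite3 module (SQLite supports NATURAL JOIN); the Region column is hypothetical and is added only to show how a new shared column name silently changes the join.

```python
import sqlite3

# In-memory SQLite database standing in for the server (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Customers (Cust_Id int, Cust_Name varchar(10));
    insert into Customers values (1, 'Craig'), (2, 'John Doe'), (3, 'Jane Doe');
    create table Sales (Cust_Id int, Item varchar(10));
    insert into Sales values (2, 'Camera'), (3, 'Computer'), (3, 'Monitor'), (4, 'Printer');
""")

# Cust_Id is the only shared column name, so it becomes the implicit join predicate.
cur = conn.execute("select * from Sales natural join Customers order by Item")
columns = [d[0] for d in cur.description]
before = cur.fetchall()
print(columns)  # the shared Cust_Id column appears only once
print(before)

# Hypothetical schema change: both tables gain a column with the same name.
conn.execute("alter table Sales add column Region varchar(10)")
conn.execute("alter table Customers add column Region varchar(10)")

# The same query now also compares Region = Region. The new columns are NULL,
# and NULL never equals NULL, so no row pair satisfies the predicate any more.
after = conn.execute("select * from Sales natural join Customers order by Item").fetchall()
print(after)  # []
```

An explicit "on S.Cust_Id = C.Cust_Id" predicate would have returned the same rows before and after the change.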

Types of Outer Joins


Left outer join

The result of a left outer join for tables A and B always contains all records of
the "left" table (A), even if the join-condition does not find any matching
record in the "right" table (B). This means that if the ON clause matches 0
(zero) records in B, the join will still return a row in the result — but with NULL
in each column from B.

A left outer join returns all the values from the left table, plus matched values
from right table (or NULL in case of no matching join predicate).

For example, this allows us to find an employee's department, but still to show
the employee even when their department does not exist (contrary to the
inner-join example above, where employees in non-existent departments get
filtered out).

Example of a left outer join:

SELECT *
FROM employee
LEFT OUTER JOIN department
ON employee.DepartmentID = department.DepartmentID

Employee.LastName  Employee.DepartmentID  Department.DepartmentName  Department.DepartmentID
Jones              33                     Engineering                33
Rafferty           31                     Sales                      31
Robinson           34                     Clerical                   34

Right outer join

A right outer join closely resembles a left outer join, except with the tables
reversed. Every record from the "right" table (B) will appear in the joined table
at least once. If no matching row from the "left" table (A) exists, NULL will
appear in columns from A for those records that have no match in A.

A right outer join returns all the values from right table and matched values
from left table (or NULL in case of no matching join predicate).

Example of a right outer join:

SELECT *
FROM employee
RIGHT OUTER JOIN department
ON employee.DepartmentID = department.DepartmentID

Employee.LastName  Employee.DepartmentID  Department.DepartmentName  Department.DepartmentID
Smith              34                     Clerical                   34
Jones              33                     Engineering                33
Robinson           34                     Clerical                   34

Full outer join

A full outer join combines the results of both left and right outer joins. The
joined table will contain all records from both tables, and fill in NULLs for
missing matches on either side.

Example of a full outer join:

SELECT *
FROM employee
FULL OUTER JOIN department
ON employee.DepartmentID = department.DepartmentID

Employee.LastName  Employee.DepartmentID  Department.DepartmentName  Department.DepartmentID
Smith              34                     Clerical                   34
Jones              33                     Engineering                33
Robinson           34                     Clerical                   34

Some database systems do not support this functionality directly, but they can
emulate it through the use of left and right outer joins and unions. The same
example can appear as:

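SELECT *
FROM employee
LEFT JOIN department
ON employee.DepartmentID = department.DepartmentID
-- UNION ALL keeps genuine duplicate rows; the two branches cannot overlap
UNION ALL
SELECT *
FROM employee
RIGHT JOIN department
ON employee.DepartmentID = department.DepartmentID
-- keep only the departments that have no matching employee
WHERE employee.DepartmentID IS NULL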
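
The same emulation can be tried on the Sales and Customers tables from the start of this document. This is a sketch under two assumptions: Python's built-in sqlite3 module stands in for a system without FULL OUTER JOIN, and the second branch is written as a left join with the tables swapped, because older SQLite versions do not accept RIGHT JOIN either. It uses UNION ALL, since the second branch returns only unmatched customers and so cannot overlap the first.

```python
import sqlite3

# In-memory SQLite database standing in for the server (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Customers (Cust_Id int, Cust_Name varchar(10));
    insert into Customers values (1, 'Craig'), (2, 'John Doe'), (3, 'Jane Doe');
    create table Sales (Cust_Id int, Item varchar(10));
    insert into Sales values (2, 'Camera'), (3, 'Computer'), (3, 'Monitor'), (4, 'Printer');
""")

full = conn.execute("""
    -- branch 1: every sale, NULL extended when there is no customer
    select S.Cust_Id, S.Item, C.Cust_Id, C.Cust_Name
    from Sales S left join Customers C on S.Cust_Id = C.Cust_Id
    union all
    -- branch 2: only the customers that have no sale
    select S.Cust_Id, S.Item, C.Cust_Id, C.Cust_Name
    from Customers C left join Sales S on S.Cust_Id = C.Cust_Id
    where S.Cust_Id is null
""").fetchall()

for row in full:
    print(row)
```

The five rows match the full outer join result shown earlier for these two tables, although the row order is not guaranteed.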

Visual Representations of Joins:


The most common types of joins can be made very clear by some very simple
illustrations. We already discussed the inner join, which is this area of two
collections (the area marked blue):

[Figure: Inner join]

The left outer join results in all the rows of the left collection, and where
present, the rows of the right collection (in other words, it doesn't leave out
rows from the left collection). That is this area of the two collections:

[Figure: Left outer join]

The right outer join does exactly the same as the left outer join, but in
reverse, so that would be this area of the collection:

[Figure: Right outer join]

The full outer join, which is basically a left outer join and right outer join
added together, simply returns everything, like this:

[Figure: Full outer join]
