
Relational Algebra & SQL

Module 4
Introduction to Relational Algebra
Formal language for the relational model
Provides a basic set of operations for the relational model
Enables a user to specify basic retrieval requests as
relational algebra expressions
Its operations can be divided into two groups:
set operations from mathematical set theory: union, intersection, set
difference, and Cartesian product
operations developed specifically for relational databases: select,
project, and join
Basic Relational Algebra Operations
(Unary):
Selection, Projection

Selection: σ <condition(s)> (<relation>)

Picks tuples from the relation

Projection: π <attribute-list> (<relation>)

Picks columns from the relation
Relational Algebra Operations (Set):
Union, Set Difference

Union: (<relation>) ∪ (<relation>)

New relation contains all tuples from both relations, with duplicate tuples
eliminated.

Set Difference: R – S
Produces a relation with the tuples that are in R but NOT in S.
Relational Algebra Operations (Set):
Cartesian Product, Intersect

Cartesian Product: R × S
Produces a relation that is the concatenation of every tuple of R with every
tuple of S

The above operations (selection, projection, union, set difference, and
Cartesian product) are the 5 fundamental operations of relational algebra.

Intersection: R ∩ S
All tuples that are in both R and S
Relational Algebra Operations (Join):
Theta Join, Natural Join

Theta Join: R ⋈F S = σF (R × S)
Select all tuples from the Cartesian product of
the two relations that match condition F
When F contains only equality "=", it is called
Equijoin

Natural Join: R ⋈ S
Equijoin with the duplicate copies of the common attributes eliminated
Unary Relational Operations
Unary – operations performed on single relation
Select
Project

Select Operation
Used to choose the subset of the tuples of a relation that satisfy a
selection condition
Performs a horizontal partition of the relation
In general, the SELECT operation is denoted by

σ<selection condition>(R)

where σ (sigma) is used to denote the SELECT operator
the selection condition is a Boolean expression (condition) specified on
the attributes of relation R
Select Operation
Example
Select the EMPLOYEE tuples whose department is 4

σDno=4(EMPLOYEE)

Select the EMPLOYEE tuples whose salary is greater
than $30,000

σSalary>30000(EMPLOYEE)
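A minimal Python sketch of these two selections, assuming a relation is modelled as a list of dicts. The sample rows and the helper name `select` are hypothetical; only the attribute names Dno and Salary come from the slides.

```python
# Hypothetical EMPLOYEE instance; the rows are illustrative only.
EMPLOYEE = [
    {"Fname": "Ann", "Lname": "Lee", "Dno": 5, "Salary": 30000},
    {"Fname": "Sue", "Lname": "Kim", "Dno": 4, "Salary": 25000},
    {"Fname": "Dev", "Lname": "Roy", "Dno": 4, "Salary": 43000},
    {"Fname": "Raj", "Lname": "Rao", "Dno": 5, "Salary": 38000},
]


def select(relation, condition):
    """sigma_<condition>(relation): keep the tuples for which condition holds."""
    return [t for t in relation if condition(t)]


# Employees whose department is 4.
dept4 = select(EMPLOYEE, lambda t: t["Dno"] == 4)
# Employees whose salary is greater than 30,000.
high_paid = select(EMPLOYEE, lambda t: t["Salary"] > 30000)

print([t["Lname"] for t in dept4])
print([t["Lname"] for t in high_paid])
```

The result keeps every attribute of the selected tuples, which is the horizontal partition described above.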
Project Operation
Selects certain columns from the table and discards
the other columns
Performs a vertical partition of the relation
The general form of the PROJECT operation is

π<attribute list>(R)

where π (pi) is the symbol used to represent the PROJECT
operation
<attribute list> is the desired sublist of attributes from the
relation R
The PROJECT operation removes any duplicate tuples
Project Operation
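List each employee's first name, last name and salary

πLname, Fname, Salary(EMPLOYEE)

In SQL, the PROJECT attribute list is specified in the
SELECT clause of a query; DISTINCT is needed to match
the duplicate removal performed by PROJECT
E.g.,
SELECT DISTINCT Lname, Fname, Salary
FROM EMPLOYEE;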
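The duplicate removal performed by PROJECT can be sketched in Python as follows. This is an illustration under the assumption that a relation is a list of dicts; the sample rows and the helper name `project` are hypothetical.

```python
# Hypothetical EMPLOYEE instance: two rows differ only in Dno.
EMPLOYEE = [
    {"Fname": "Ann", "Lname": "Lee", "Salary": 30000, "Dno": 5},
    {"Fname": "Ann", "Lname": "Lee", "Salary": 30000, "Dno": 4},
    {"Fname": "Raj", "Lname": "Rao", "Salary": 40000, "Dno": 5},
]


def project(relation, attributes):
    """pi_<attributes>(relation): keep the listed columns, drop duplicate tuples."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:  # a relation is a set, so each tuple appears once
            seen.add(row)
            result.append(dict(zip(attributes, row)))
    return result


names_and_salaries = project(EMPLOYEE, ["Lname", "Fname", "Salary"])
print(names_and_salaries)
```

Three input tuples yield two output tuples, because the two rows for Ann Lee become identical once Dno is discarded.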
Sequences of Operations
It is possible to apply several relational algebra
operations one after the other
E.g., Retrieve the first name, last name, and salary of
all employees who work in department number 5
Both the select and project operations must be applied
to get the result
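In algebra notation this query can be written as πFname, Lname, Salary(σDno=5(EMPLOYEE)). A small Python sketch of the same sequence, with hypothetical sample rows:

```python
# Hypothetical EMPLOYEE instance; only the attribute names follow the slides.
EMPLOYEE = [
    {"Fname": "Ann", "Lname": "Lee", "Salary": 30000, "Dno": 5},
    {"Fname": "Sue", "Lname": "Kim", "Salary": 25000, "Dno": 4},
    {"Fname": "Raj", "Lname": "Rao", "Salary": 40000, "Dno": 5},
]

# Step 1 (SELECT): keep the tuples with Dno = 5.
dept5 = [t for t in EMPLOYEE if t["Dno"] == 5]

# Step 2 (PROJECT): keep three columns; a set removes duplicate tuples.
result = {(t["Fname"], t["Lname"], t["Salary"]) for t in dept5}

print(sorted(result))
```

The intermediate relation `dept5` plays the role of a named temporary result when the expression is written as a sequence of steps instead of one nested expression.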
Rename Operation
Used to rename a relation, its attributes, or both
The general RENAME operation when applied to a
relation R of degree n is denoted by any of the
following three forms:

ρS(B1, B2, ..., Bn)(R)  or  ρS(R)  or  ρ(B1, B2, ..., Bn)(R)

where ρ (rho) is used to denote the RENAME operator
S is the new relation name
B1, B2, ..., Bn are the new attribute names
Renaming in SQL is accomplished by aliasing using
AS, as in the following example:

SELECT E.Fname AS First_name, E.Lname AS Last_name, E.Salary AS Salary
FROM EMPLOYEE AS E
WHERE E.Dno=5;
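The effect of the aliasing query can be imitated in Python. This is only a sketch: the helper `rename` and the sample rows are hypothetical, and a relation is assumed to be a list of dicts.

```python
# Hypothetical EMPLOYEE instance.
EMPLOYEE = [
    {"Fname": "Ann", "Lname": "Lee", "Salary": 30000, "Dno": 5},
    {"Fname": "Sue", "Lname": "Kim", "Salary": 25000, "Dno": 4},
]


def rename(relation, new_names):
    """rho: give attributes new names; attributes not in new_names keep theirs."""
    return [{new_names.get(a, a): v for a, v in t.items()} for t in relation]


# WHERE E.Dno = 5, then keep Fname, Lname, Salary.
dept5 = [
    {"Fname": t["Fname"], "Lname": t["Lname"], "Salary": t["Salary"]}
    for t in EMPLOYEE
    if t["Dno"] == 5
]

# Fname AS First_name, Lname AS Last_name; Salary keeps its name.
result = rename(dept5, {"Fname": "First_name", "Lname": "Last_name"})
print(result)
```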
Set Theory Operations
Set theoretic operations are used to merge the elements of
two sets in various ways, including union, intersection,
and set difference
These are binary operations - applied on two relations
The two relations on which any of these three operations
are applied must have the same type of tuples; this condition is
called union compatibility or type compatibility
Two relations R(A1, A2, ..., An) and S(B1, B2, ..., Bn) are said
to be union compatible (or type compatible) if they
have the same degree n and if dom(Ai) = dom(Bi) for 1 <= i
<= n
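A short Python sketch of the compatibility check and the three set operations. The relation names, the sample tuples, and the use of Python types to stand in for attribute domains are all assumptions made for illustration.

```python
# Hypothetical relations: a tuple of attribute domains plus a set of tuples.
STUDENT = {"domains": (str, str), "tuples": {("Susan", "Yao"), ("Ramesh", "Shah")}}
INSTRUCTOR = {"domains": (str, str), "tuples": {("Susan", "Yao"), ("John", "Smith")}}


def union_compatible(r, s):
    """Same degree n and dom(Ai) = dom(Bi) for every position i."""
    same_degree = len(r["domains"]) == len(s["domains"])
    return same_degree and all(a == b for a, b in zip(r["domains"], s["domains"]))


assert union_compatible(STUDENT, INSTRUCTOR)

union = STUDENT["tuples"] | INSTRUCTOR["tuples"]         # duplicates eliminated
intersection = STUDENT["tuples"] & INSTRUCTOR["tuples"]  # tuples in both
difference = STUDENT["tuples"] - INSTRUCTOR["tuples"]    # in STUDENT, not in INSTRUCTOR

print(len(union), intersection, difference)
```

Python sets already behave like relations here: a tuple that occurs in both inputs appears once in the union.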
Cartesian Product Operation
Also known as cross product or cross join
Denoted by ×
This is also a binary set operation, but the relations on
which it is applied do not have to be union compatible
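A minimal Python illustration of the Cartesian product, using two hypothetical relations that are not union compatible (degree 2 and degree 1):

```python
from itertools import product

# Hypothetical relations stored as lists of tuples.
R = [("a1", "b1"), ("a2", "b2")]   # degree 2, 2 tuples
S = [("c1",), ("c2",), ("c3",)]    # degree 1, 3 tuples

# R x S: concatenate every tuple of R with every tuple of S.
r_times_s = [r + s for r, s in product(R, S)]

for t in r_times_s:
    print(t)
```

The result has degree 2 + 1 = 3 and 2 * 3 = 6 tuples, which is why a Cartesian product is usually followed by a selection.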
Binary Relational Operations
Join Operation (⋈)
Used to combine related tuples from two relations into
single "longer" tuples
The general form of a JOIN operation on two relations
R(A1, A2, ..., An) and S(B1, B2, ..., Bm) is

R ⋈<join condition> S
Variations of Join
Equi Join - involves join conditions with equality
comparisons only

Natural Join - requires that the two join attributes have
the same name in both relations
Denoted by *
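As an illustration, the sketch below implements a theta join as a selection over the Cartesian product and a natural join as an equijoin on the shared attribute names. The relations, rows, and helper names are hypothetical.

```python
# Hypothetical relations as lists of dicts.
EMPLOYEE = [
    {"Name": "SAM", "Dno": 1},
    {"Name": "TOM", "Dno": 2},
    {"Name": "RAM", "Dno": 3},
]
DEPARTMENT = [
    {"Dnumber": 1, "Dname": "Research"},
    {"Dnumber": 2, "Dname": "Sales"},
]


def theta_join(r, s, condition):
    """sigma_F(R x S); assumes R and S have no attribute names in common."""
    return [{**tr, **ts} for tr in r for ts in s if condition(tr, ts)]


def natural_join(r, s):
    """Equijoin on all attributes with the same name; each is kept once."""
    common = set(r[0]) & set(s[0]) if r and s else set()
    return [
        {**tr, **ts}
        for tr in r
        for ts in s
        if all(tr[a] == ts[a] for a in common)
    ]


# Equijoin: the condition uses only "=" and both join attributes are kept.
equi = theta_join(EMPLOYEE, DEPARTMENT, lambda e, d: e["Dno"] == d["Dnumber"])

# Natural join: first give the join attribute the same name in both relations.
DEPT = [{"Dno": d["Dnumber"], "Dname": d["Dname"]} for d in DEPARTMENT]
natural = natural_join(EMPLOYEE, DEPT)

print(equi)
print(natural)
```

RAM has no matching department, so that tuple appears in neither result.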
Additional Relational Operations
Generalized Projection
The generalized projection operation extends the projection
operation by allowing functions of attributes to be included in the
projection list
The generalized form can be expressed as:

πF1, F2, ..., Fn(R)

where F1, F2, ..., Fn are functions over the attributes in relation R
and may involve arithmetic operations and constant values

This operation is helpful when developing reports where
computed values have to be produced in the columns of a query
result
Generalized Projection
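A hedged Python sketch of a generalized projection that produces computed report columns. The attributes Deduction and Years_service, the formulas, and the sample rows are assumptions chosen for illustration, not taken from the slides.

```python
# Hypothetical EMPLOYEE instance with attributes assumed for this example.
EMPLOYEE = [
    {"Lname": "Lee", "Salary": 30000, "Deduction": 2000, "Years_service": 3},
    {"Lname": "Rao", "Salary": 40000, "Deduction": 2500, "Years_service": 5},
]

# Each output column is a function F_i over the attributes of one tuple.
report = [
    {
        "Lname": t["Lname"],
        "Net_salary": t["Salary"] - t["Deduction"],  # arithmetic on two attributes
        "Bonus": 2000 * t["Years_service"],          # attribute times a constant
    }
    for t in EMPLOYEE
]

for row in report:
    print(row)
```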
Aggregate Functions and Grouping
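Common functions applied to collections of numeric
values include SUM, AVERAGE, MAXIMUM and
MINIMUM; COUNT is used for counting tuples or values
We can define an AGGREGATE FUNCTION operation,
using the symbol ℑ

<grouping attributes> ℑ <function list> (R)

where <grouping attributes> is a list of attributes of the relation R
<function list> is a list of (<function> <attribute>) pairs, where
<function> is one of SUM, AVERAGE, MAXIMUM, MINIMUM, COUNT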
Aggregate Functions and Grouping
For example, Retrieve each department number, the
number of employees in the department, and their
average salary
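This grouping query can be sketched in Python as shown here. The sample rows and the output column names Count and Average_salary are hypothetical.

```python
from collections import defaultdict

# Hypothetical EMPLOYEE instance.
EMPLOYEE = [
    {"Name": "SAM", "Dno": 5, "Salary": 30000},
    {"Name": "TOM", "Dno": 5, "Salary": 40000},
    {"Name": "RAM", "Dno": 4, "Salary": 25000},
]

# Grouping attribute: Dno. Collect the salaries of each group.
salaries_by_dno = defaultdict(list)
for t in EMPLOYEE:
    salaries_by_dno[t["Dno"]].append(t["Salary"])

# Function list: COUNT and AVERAGE applied to each group.
result = [
    {
        "Dno": dno,
        "Count": len(salaries),
        "Average_salary": sum(salaries) / len(salaries),
    }
    for dno, salaries in sorted(salaries_by_dno.items())
]

for row in result:
    print(row)
```

One output tuple is produced per distinct value of the grouping attribute.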
Outer Join Operations
OUTER JOINS - developed for the case where the user wants to keep all
the tuples in R, or all those in S, or all those in both relations in the result
of the JOIN, regardless of whether or not they have matching tuples in the
other relation

LEFT OUTER JOIN (⟕) operation keeps every tuple in the first, or left,
relation R in R ⟕ S; if no matching tuple is found in S, then the attributes
of S in the join result are filled or padded with NULL values

RIGHT OUTER JOIN, denoted by ⟖, keeps every tuple in the second,
or right, relation S in the result of R ⟖ S

FULL OUTER JOIN, denoted by ⟗, keeps all tuples in both the left and
the right relations when no matching tuples are found, padding them with
NULL values as needed
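The three outer joins can be sketched in Python with None standing in for NULL. The relations, rows, and helper names below are hypothetical.

```python
# Hypothetical relations; RAM has no department and HR has no employees.
EMPLOYEE = [
    {"Name": "SAM", "Dno": 1},
    {"Name": "TOM", "Dno": 2},
    {"Name": "RAM", "Dno": 3},
]
DEPARTMENT = [
    {"Dnumber": 1, "Dname": "Research"},
    {"Dnumber": 2, "Dname": "Sales"},
    {"Dnumber": 4, "Dname": "HR"},
]


def left_outer_join(r, s, condition, s_attrs):
    """Keep every tuple of r; pad the attributes of s with None if unmatched."""
    result = []
    for tr in r:
        matches = [ts for ts in s if condition(tr, ts)]
        if matches:
            result.extend({**tr, **ts} for ts in matches)
        else:
            result.append({**tr, **{a: None for a in s_attrs}})
    return result


def right_outer_join(r, s, condition, r_attrs):
    """Keep every tuple of s: a left outer join with the roles swapped."""
    return left_outer_join(s, r, lambda ts, tr: condition(tr, ts), r_attrs)


def full_outer_join(r, s, condition, r_attrs, s_attrs):
    """Left outer join plus the tuples of s that matched nothing in r."""
    result = left_outer_join(r, s, condition, s_attrs)
    for ts in s:
        if not any(condition(tr, ts) for tr in r):
            result.append({**{a: None for a in r_attrs}, **ts})
    return result


def same_dept(e, d):
    return e["Dno"] == d["Dnumber"]


left = left_outer_join(EMPLOYEE, DEPARTMENT, same_dept, ["Dnumber", "Dname"])
right = right_outer_join(EMPLOYEE, DEPARTMENT, same_dept, ["Name", "Dno"])
full = full_outer_join(
    EMPLOYEE, DEPARTMENT, same_dept, ["Name", "Dno"], ["Dnumber", "Dname"]
)

print(len(left), len(right), len(full))
```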
Relational Algebra Operators
Symbol (Name)            Example of Use
σ (Selection)            σ salary >= 85000 (instructor)
Return rows of the input relation that satisfy the predicate.
Π (Projection)           Π ID, salary (instructor)
Output specified attributes from all rows of the input relation.
Remove duplicate tuples from the output.
× (Cartesian Product)    instructor × department
Output all pairs of rows from the two input relations (regardless of
whether or not they have the same values on common attributes).
∪ (Union)                Π name (instructor) ∪ Π name (student)
Output the union of tuples from the two input relations.
− (Set Difference)       Π name (instructor) − Π name (student)
Output the set difference of tuples from the two input relations.
⋈ (Natural Join)         instructor ⋈ department
Output pairs of rows from the two input relations that have the
same value on all attributes that have the same name.

Database System Concepts - 6th Edition 2.37 ©Silberschatz, Korth and Sudarshan
 SAILORS (SID, SNAME, RATING, AGE)
 BOATS (BID, BNAME, COLOR)
 RESERVES (SID, BID, DAY)

Heuristic Query Optimization
Using Selectivity and Cost Estimates in Query
Optimization
• Cost-based query optimization: Estimate and compare the
costs of executing a query using different execution strategies
and choose the strategy with the lowest cost estimate.
(Compare to heuristic query optimization)

• Issues
– Cost function
– Number of execution strategies to be considered

Using Selectivity and Cost Estimates in Query
Optimization (2)
• Cost Components for Query Execution
1. Access cost to secondary storage
2. Storage cost
3. Computation cost
4. Memory usage cost
5. Communication cost

Note: Different database systems may focus on different cost
components.
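To make the idea of comparing cost estimates concrete, here is a purely illustrative Python sketch. The strategy names, the component estimates, and the weights are all invented for the example; real optimizers derive such estimates from catalog statistics and use their own cost functions.

```python
# Hypothetical weights: this imaginary system cares most about disk access.
WEIGHTS = {
    "access": 10,
    "storage": 1,
    "computation": 1,
    "memory": 1,
    "communication": 5,
}

# Hypothetical cost estimates (arbitrary units) for three execution strategies.
STRATEGIES = {
    "full_scan_then_join": {
        "access": 500, "storage": 40, "computation": 60,
        "memory": 10, "communication": 0,
    },
    "index_lookup_then_join": {
        "access": 120, "storage": 10, "computation": 80,
        "memory": 20, "communication": 0,
    },
    "sort_merge_join": {
        "access": 300, "storage": 150, "computation": 200,
        "memory": 60, "communication": 0,
    },
}


def estimated_cost(components):
    """Weighted sum of the five cost components."""
    return sum(WEIGHTS[name] * value for name, value in components.items())


costs = {name: estimated_cost(c) for name, c in STRATEGIES.items()}
best = min(costs, key=costs.get)

print(costs)
print("chosen strategy:", best)
```

Changing the weights changes the winner, which mirrors the note that different systems focus on different cost components.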

1. Translating SQL Queries into Relational
Algebra (1)
• Query block: the basic unit that can be translated
into the algebraic operators and optimized.
• A query block contains a single SELECT-FROM-WHERE
expression, as well as GROUP BY and
HAVING clauses if these are part of the block.
• Nested queries within a query are identified as
separate query blocks.
• Aggregate operators in SQL must be included in
the extended algebra.

Translating SQL Queries into Relational
Algebra (2)
SELECT LNAME, FNAME
FROM EMPLOYEE
WHERE SALARY > ( SELECT MAX (SALARY)
FROM EMPLOYEE
WHERE DNO = 5);

Outer block:
SELECT LNAME, FNAME
FROM EMPLOYEE
WHERE SALARY > C
πLNAME, FNAME (σSALARY>C(EMPLOYEE))

Inner block (its result is the constant C used by the outer block):
SELECT MAX (SALARY)
FROM EMPLOYEE
WHERE DNO = 5
ℱMAX SALARY (σDNO=5 (EMPLOYEE))
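The two query blocks can be evaluated one after the other, as this small Python sketch shows. The sample rows are hypothetical; the attribute names follow the query.

```python
# Hypothetical EMPLOYEE instance.
EMPLOYEE = [
    {"LNAME": "Lee", "FNAME": "Ann", "SALARY": 30000, "DNO": 5},
    {"LNAME": "Rao", "FNAME": "Raj", "SALARY": 40000, "DNO": 5},
    {"LNAME": "Kim", "FNAME": "Sue", "SALARY": 55000, "DNO": 4},
    {"LNAME": "Roy", "FNAME": "Dev", "SALARY": 25000, "DNO": 1},
]

# Inner block: MAX(SALARY) over the tuples with DNO = 5.
C = max(t["SALARY"] for t in EMPLOYEE if t["DNO"] == 5)

# Outer block: select SALARY > C, then project LNAME, FNAME.
result = {(t["LNAME"], t["FNAME"]) for t in EMPLOYEE if t["SALARY"] > C}

print(C, result)
```

Because the inner block does not refer to the outer block, it needs to be evaluated only once.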


Using Heuristics in Query Optimization (10)
General Transformation Rules for Relational Algebra
Operations:
1. Cascade of σ: A conjunctive selection condition can be broken
up into a cascade (sequence) of individual σ operations:
σc1 AND c2 AND ... AND cn(R) = σc1 (σc2 (...(σcn(R))...) )
2. Commutativity of σ: The σ operation is commutative:
σc1 (σc2(R)) = σc2 (σc1(R))
3. Cascade of π: In a cascade (sequence) of π operations, all but
the last one can be ignored:
πList1 (πList2 (...(πListn(R))...) ) = πList1(R)
4. Commuting σ with π: If the selection condition c involves
only the attributes A1, ..., An in the projection list, the two
operations can be commuted:
πA1, A2, ..., An (σc (R)) = σc (πA1, A2, ..., An (R))
Elmasri/Navathe, Fundamentals of Database Systems, Fourth Edition Chapter 15-59
Copyright © 2004 Ramez Elmasri and Shamkant Navathe
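The four transformation rules above can be checked on a small example. The Python sketch below is only a spot check on one hypothetical relation, not a proof; the helper names and the sample rows are assumptions.

```python
# Hypothetical relation R.
R = [
    {"Name": "SAM", "Dno": 1, "Salary": 30000},
    {"Name": "TOM", "Dno": 2, "Salary": 45000},
    {"Name": "RAM", "Dno": 1, "Salary": 50000},
]


def select(rel, cond):
    return [t for t in rel if cond(t)]


def project(rel, attrs):
    rows = {tuple(t[a] for a in attrs) for t in rel}  # set removes duplicates
    return [dict(zip(attrs, row)) for row in rows]


def as_set(rel):
    """Order-independent form of a relation, for comparing results."""
    return {tuple(sorted(t.items())) for t in rel}


def c1(t):
    return t["Dno"] == 1


def c2(t):
    return t["Salary"] > 40000


# Rule 1: a conjunctive selection equals a cascade of selections.
rule1 = as_set(select(R, lambda t: c1(t) and c2(t))) == as_set(
    select(select(R, c2), c1)
)
# Rule 2: selections commute.
rule2 = as_set(select(select(R, c2), c1)) == as_set(select(select(R, c1), c2))
# Rule 3: in a cascade of projections only the outermost one matters.
rule3 = as_set(project(project(R, ["Name", "Dno"]), ["Name"])) == as_set(
    project(R, ["Name"])
)
# Rule 4: c1 uses only Dno, which is in the projection list, so they commute.
rule4 = as_set(project(select(R, c1), ["Name", "Dno"])) == as_set(
    select(project(R, ["Name", "Dno"]), c1)
)

print(rule1, rule2, rule3, rule4)
```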
Query Tree
A notation typically used in relational systems to
represent queries internally
A query tree is a tree data structure that corresponds
to a relational algebra expression
Steps in converting a query tree
during heuristic optimization.
(a) Initial (canonical) query tree for SQL query Q.
(b) Moving SELECT operations down the query tree.

(c) Applying the more restrictive SELECT operation first.


(d) Replacing CARTESIAN PRODUCT and SELECT with
JOIN operations.
(e) Moving PROJECT operations down the query tree.
An execution plan is then generated
Steps in converting a query tree
during heuristic optimization.
(a) Initial (canonical) query tree for SQL query Q.
Before (b): apply the general transformation rules for relational algebra.
(b) Moving SELECT operations down the query tree.
(c) Applying the more restrictive SELECT operation first.
(d) Replacing CARTESIAN PRODUCT and SELECT with
JOIN operations.
(e) Moving PROJECT operations down the query tree.
Apply Heuristic Query optimization
Find the employee name and department name of employees who
belong to the ‘SBST’ department and whose name is ‘SAM’

• Emp Table
Execute the Query
STEP 1
Before STEP 2:
Apply the general transformation rules for relational algebra
STEP 2:
Only employees with name ‘SAM’
Read only 36 bytes
STEP 3
• Apply the more restrictive SELECT operation first.
• In this query tree no more restrictive SELECT is available.
• Ignore and proceed to STEP 4.
STEP 4:
Reduced to 44 bytes from 132 bytes
Total 36 bytes
Total 20 bytes
STEP 5:
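The effect of the steps can be traced on a small instance. The sketch below is an illustration only: the original Emp and Dept tables are not reproduced in this text, so the column names (Ename, Edno, Dno, Dname), the rows, and the second department name are hypothetical, and it counts tuples where the slides count bytes. Moving PROJECT operations down the tree is not shown.

```python
# Hypothetical tables standing in for the Emp and Dept tables of the example.
EMP = [
    {"Ename": "SAM", "Edno": 1},
    {"Ename": "TOM", "Edno": 2},
    {"Ename": "RAM", "Edno": 1},
    {"Ename": "SAM", "Edno": 2},
]
DEPT = [{"Dno": 1, "Dname": "SBST"}, {"Dno": 2, "Dname": "SCSE"}]

# Canonical tree: CARTESIAN PRODUCT first, then one big SELECT, then PROJECT.
cross = [{**e, **d} for e in EMP for d in DEPT]
canonical = {
    (t["Ename"], t["Dname"])
    for t in cross
    if t["Ename"] == "SAM" and t["Dname"] == "SBST" and t["Edno"] == t["Dno"]
}

# Optimized tree: SELECTs pushed down to the leaves, then a JOIN.
sam_only = [e for e in EMP if e["Ename"] == "SAM"]
sbst_only = [d for d in DEPT if d["Dname"] == "SBST"]
pairs_examined = len(sam_only) * len(sbst_only)
optimized = {
    (e["Ename"], d["Dname"])
    for e in sam_only
    for d in sbst_only
    if e["Edno"] == d["Dno"]
}

print(len(cross), pairs_examined, canonical, optimized)
```

Both trees return the same answer, but the optimized one combines far fewer tuples before the join condition is applied.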
Exercise 2
• Display the names of all employees who work on the project
named ‘ProjX’

EMPLOYEE
ID   NAME  DEPT  DESIG
123  SAM   111   CLERK
124  TOM   222   MANAGER
125  RAM   333   LECTURER
126  BOB   222   PLUMBER
127  SAM   333   CLERK

PROJECT
PID  PNAME
AAA  ProjX
BBB  ProjY
CCC  ProjZ

WORKSON
EID  PID
123  AAA
124  AAA
125  BBB
126  BBB
127  CCC
126  BBB
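One possible way to evaluate Exercise 2 is sketched below in Python, using the rows of the three tables above. The evaluation order (select on PROJECT first, then join with WORKSON, then with EMPLOYEE) is one reasonable choice following the heuristics, not necessarily the intended solution.

```python
EMPLOYEE = [
    {"ID": 123, "NAME": "SAM", "DEPT": 111, "DESIG": "CLERK"},
    {"ID": 124, "NAME": "TOM", "DEPT": 222, "DESIG": "MANAGER"},
    {"ID": 125, "NAME": "RAM", "DEPT": 333, "DESIG": "LECTURER"},
    {"ID": 126, "NAME": "BOB", "DEPT": 222, "DESIG": "PLUMBER"},
    {"ID": 127, "NAME": "SAM", "DEPT": 333, "DESIG": "CLERK"},
]
PROJECT = [
    {"PID": "AAA", "PNAME": "ProjX"},
    {"PID": "BBB", "PNAME": "ProjY"},
    {"PID": "CCC", "PNAME": "ProjZ"},
]
# Stored as a set, so the repeated (126, "BBB") row appears only once.
WORKSON = {(123, "AAA"), (124, "AAA"), (125, "BBB"), (126, "BBB"), (127, "CCC")}

# SELECT on PROJECT first: only the project named ProjX.
projx_pids = {p["PID"] for p in PROJECT if p["PNAME"] == "ProjX"}

# JOIN with WORKSON on PID, keeping only the employee IDs.
eids = {eid for eid, pid in WORKSON if pid in projx_pids}

# JOIN with EMPLOYEE on ID = EID, then PROJECT on NAME.
names = {e["NAME"] for e in EMPLOYEE if e["ID"] in eids}

print(sorted(names))
```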
