Professional Documents
Culture Documents
Relational Algebra
Relational Algebra
04/25/2024 1
Relational Model
The relational Model of Data is based on the concept of a Relation.
The model was first proposed by Dr. E.F. Codd of IBM in 1970 in the
following paper:
"A Relational Model for Large Shared Data Banks," Communications of
the ACM, June 1970
04/25/2024 2
Relation
RELATION: A table of values
A relation may be thought of as a set of rows, alternately as a set of
columns.
Each row has a value of an item or set of items that uniquely identifies
that row in the table.
04/25/2024 3
Schema & Domain
The Schema of a Relation: R (A1, A2, .....An)
Relation schema R is defined over attributes A1, A2, .....An
For Example -
CUSTOMER (Cust-id, Cust-name, Address, Phone_no)
04/25/2024 4
Tuple
A tuple is an ordered set of values.
04/25/2024 5
Summary
Informal Terms Formal Terms
Table Relation
Column name Attribute
Row Tuple
Values in a column Domain
04/25/2024 6
Relational Algebra
Basic set of operations for the relational model to enable a
user to specify basic retrieval requests.
04/25/2024 7
Relational Algebra
Procedural language
The operators take one or more relations as inputs and give a new
relation as a result.
Select operation
SELECT operation is used to select a subset of the
tuples from a relation that satisfy a selection condition
specified on attributes. It is a filter that keeps only those tuples
that satisfy a qualifying condition.
04/25/2024 9
Select Operation
loan
Loan_number Branch_name amount
L-10 Kolkata 200
L-11 Delhi 900
L-12 Chennai 1300
L-13 Mumbai 1500
L-15 Kolkata 1000
1 7
5 7
12 3
23 10
1 7
23 10
Select operation
SELECT Operation Properties
The SELECT operation <selection condition>(R) produces
a relation S that has the same schema as R
04/25/2024 13
Solve:
Select Operation
Find out the loans which are given from kolkata branch.
Ans- branch_name =“Kolkata” (loan)
Find out the loans for amount greater than 1200 and they are given
from kolkata branch.
Ans- amount >1200 ^ branch_name=“Kolkata” (loan)
loan
Loan_number Branch_name amount
Notation:
A1, A2, …, Ak (r)
where A1, A2 are attribute names and r is a relation name.
α 10 1
β 20 1
β 30 1
β 40 2
∏ (r) A C A C
A,C
α 1 α 1
β 1 = β 1
β 1 β 2
β 2
Project Operation
Duplicate rows removed from result, since relations
are sets
04/25/2024 17
Project Operation
loan
Loan_number Branch_name amount
L-10 Kolkata 200
L-11 Delhi 900
L-12 Chennai 1300
L-13 Mumbai 1500
L-15 Kolkata 2000
loan-number, balance (loan)
loan
Loan_number amount
L-10 200
L-11 900
L-12 1300
L-13 1500
L-15 2000
Union Operation
The result of this operation, denoted by R S, is a relation that
includes all tuples that are either in R or in S or in both R and S.
Duplicate tuples are eliminated.
Defined as: R S = {t | t R or t S}
For R S to be valid.
1. R, S must have the same arity (same number of attributes)
2. The attribute domains must be compatible (e.g., 2nd column
of R deals with the same type of values as does the 2nd
column of S)
The resulting relation for R1R2 has the same attribute names as
the first operand relation R1 (by convention).
Union Operation – Example
Relations r, s:
A B A1 B1
α 1 α 2
α 2 β 3
β 1 s
r
r s: A B
α 1
α 2
β 1
β 3
Union Operation
Depositor
customer_ account_nu
name mber
customer_name
Johnson A-10
Johnson A-11 Johnson
Jones A-12 Jones
Smith A-13
Smith
Lindsay A-15
Lindsay
Borrower
Jackson
c_name loan_numbe
r Adams
Johnson L-10
Jackson L-11
Jones L-12 customer-name (depositor) customer-name
Smith L-13 (borrower)
Lindsay L-15 (find the names of all customers with
Adams L-18
Union Example
Example: Suppose the employee relation schema is:
EMPLOYEE (NAME, SSN, SUPERSSN, DNO)
To retrieve the social security numbers of all employees
who either work in department 5 or directly supervise an
employee who works in department 5, we can use the union
operation as follows:
04/25/2024 22
Set-Intersection Operation
Defined as : The result of this operation, denoted by R S, is
a relation that includes all tuples that are in both R and S.
R S ={ t | t R and t S }
Assume:
R, S have the same arity
attributes of r and s are compatible
Note: R S = R - (R - S)
04/25/2024 23
Set-Intersection Operation -
Example
A B A B
Relation r, s: 1 2
2 3
r 1
s
r s A B
2
04/25/2024 24
Example
Find the names of celebrities who are instructors.
Celebrity Instructors
First_name Last_name F_name L_name
Jyoti Basu Shubhas Bose
Sourav Ganguly Jyoti Basu
Poulami Ghatak Rajib Gandhi
Mamata Banerjee
Mamata Banerjee
Rahul Gandhi
Rahul Gandhi
Celebrity
Ans : Celebrity Instructors
First_name Last_name
Jyoti Basu
Mamata Banerjee
Rahul Gandhi
04/25/2024 25
Set Difference Operation
The result of this operation, denoted by R - S, is a relation that
includes all tuples that are in R but not in S.
Defined as:
R– S = {t | t R and t S}
α 1 α 2
α 2 β 3
β 1 s
r
r - s: A B
α 1
β 1
Example
Find the names of celebrities who are not instructors.
Celebrity Instructors
First_name Last_name F_name L_name
Jyoti Basu Shubhas Bose
Sourav Ganguly Jyoti Basu
Poulami Ghatak Rajib Gandhi
Mamata Banerjee
Mamata Banerjee
Rahul Gandhi
Rahul Gandhi
04/25/2024 28
Example
Find the names of instructors who are not celebrities.
Celebrity Instructors
First_name Last_name F_name L_name
Jyoti Basu Shubhas Bose
Sourav Ganguly Jyoti Basu
Poulami Ghatak Rajib Gandhi
Mamata Banerjee
Mamata Banerjee
Rahul Gandhi
Rahul Gandhi
α 1 α 10 a
β 10 a
β 2 β 20 b
r γ 10 b
s
R x S:
A B C D E
α 1 α 10 a
α 1 β 10 a
α 1 β 20 b
α 1 γ 10 b
β 2 α 10 a
β 2 β 10 a
β 2 β 20 b
β 2 γ 10 b
Composition of Operations
Can build expressions using multiple operations
Example: A=C(r x s)
r x s A B C D E
1 10 a
1 10 a
1 20 b
1 10 b
2 10 a
2 10 a
2 20 b
2 10 b
A B C D E
A=C(r x s)
1 10 a
2 20 a
2 20 b
Rename Operation
Allows us to name, and therefore to refer to, the results of
relational-algebra expressions.
04/25/2024 36
Example Queries
3.Find the names of all customers who have a loan,
an account, or both, from the bank
cname (borrower) cname (depositor)
04/25/2024 37
Example Queries
Find the names of all customers who have a loan at the kolkata
branch.
- Query 1
cname(bname = “Kolkata”
Query 2
cname(loan.loan-no = borrower.loan-no
04/25/2024 38
Example Queries
Find the names of all customers who have a loan at the
Bangalore branch but do not have an account at any
branch of the bank.
– cname(depositor)
04/25/2024 39
Example Queries
Find the largest account balance.
blnc(account) –
04/25/2024 40
Additional Operations
Join
Division
Assignment
04/25/2024 41
Join operation
Definition:
The sequence of cartesian product followed by select
operation is used quite commonly to identify and select
related tuples from two relations, is a special operation, called
JOIN.
04/25/2024 42
Join Operation
Form of join condition: Ai Ѳ Bj
where i) Ai is an attribute of R and Bj is an attribute
of S, have same domain and
ii) Ѳ is one of the comparison operators {=, ≠,
≤, ≥, <, >}
04/25/2024 45
NATURAL JOIN Operation
Because one of each pair of attributes with identical values is
superfluous, a new operation called NATURAL join, was created to
get rid of the second (superfluous) attribute in an equi join condition.
The standard definition of natural join requires that the two join
attributes, or each pair of corresponding join attributes, have
the same name in both relations. If this is not the case, a renaming
operation is applied first.
04/25/2024 46
Example
DEPARTMENT (DNAME, DNUMBER, MGRSSN)
EMPLOYEE (NAME, SSN, SUPERSSN, DNO)
From these two tables we want to retrieve the name of the manager
of each department using natural join.
Solution:
1. Rename MGRSSN to SSN.
2. Apply natural join operation.
04/25/2024 47
Outer Join
An extension of the join operation that avoids loss of
information.
Computes the natural join and then adds tuples form one
relation that do not match tuples in the other relation to the
result of the join.
All info to the left relation is present in the result of left outer
join.
04/25/2024 49
Inner Join – Example
Relation loan loan-numberbranch-nameamount
L-170 Downtown 3000
L-230 Redwood 4000
L-260 Perryridge 1700
Relation borrower
customer-nameloan-number
Jones L-170
Smith L-230
Hayes L-155
For duplicate elimination and grouping, null is treated like any other
value, and two nulls are assumed to be the same
Null Values
Comparisons with null values return the special truth value unknown
Three-valued logic using the truth value unknown:
OR: (unknown or true) = true,
(unknown or false) = unknown
(unknown or unknown) = unknown
04/25/2024 57
Another Division Example
Relations r, s:
A B C D E D E
a a 1 a 1
a a 1 b 1
a b 1 s
a a 1
a b 3
a a 1
a b 1
a b 1
r
r s: A B C
a
a
04/25/2024 58
Division Operation-Query
Retrieve the names of employees who works on all the
projects that ‘John Smith’ works on. Given schemas are:
1. Employee( Fname, Lname, ssn, Addr, salary, bdate)
2. Project( pname, pno, ploc, dno)
3. Works-on( ssn, pno, hours)
Solution:
04/25/2024 60
Banking Example
branch (branch-name, branch-city, assets)
CN(BN=“Downtown”(depositor account))
CN(BN=“Uptown”(depositor account))
Query 2
04/25/2024 63
Extended Relational-Algebra-
Operations
Generalized Projection
Aggregate Functions
04/25/2024 64
Generalized Projection
Extends the projection operation by allowing arithmetic functions to
be used in the projection list.
04/25/2024 65
Aggregate Functions and
Operations
Aggregation function takes a collection of values and returns a single
value as a result.
avg: average value
min: minimum value
max: maximum value
sum: sum of values
count: number of values
Aggregate operation in relational algebra
04/25/2024 66
Aggregate Operation – Example
Relation r: A B C
7
7
3
10
sum-C
g sum(c) (r)
27
04/25/2024 67
Aggregate Operation – Example
Relation account grouped by branch-name:
branch-nameaccount-number balance
Perryridge A-102 400
Perryridge A-201 900
Brighton A-217 750
Brighton A-215 750
Redwood A-222 700
04/25/2024 69
Aggregation
branch-nameaccount-number balance
Perryridge A-102 400
Perryridge A-201 900
Relation schema Brighton A-217 750
Account Brighton A-215 750
Redwood A-222 700
1. g count-distinct(branch-name) (account)
Ans: 3
2. g max(balance) Max-balance
(account)
900
Ans:
04/25/2024 70
Problem Set 1:
Consider the following relational database , where the primary keys are
underlined.
employee (person-name, street, city)
works (person-name, company-name, salary)
company (company-name, city)
manages (person-name, manager-name)
04/25/2024 72
Consider the above relational database. Give a relational-algebra
expression for each of the following queries:
a. Modify the database so that Jones now lives in Newtown.
b. Give all employees of First Bank Corporation a 10 percent salary raise.
c. Give all managers in this database a 10 percent salary raise.
d. Give all managers in this database a 10 percent salary raise, unless the
salary
would be greater than $100,000. In such cases, give only a 3 percent raise.
e. Delete all tuples in the works relation for employees of Small Bank
Corporation.
f. Find the company with the most employees.
g. Find the company with the smallest payroll.
h. Find those companies whose employees earn a higher salary, on average,
than the average salary at First Bank Corporation.
g. Find the accounts held by more than two customers using aggregate
function.
04/25/2024 73
PROBLEM SET -2
Specify the following queries on the COMPANY
relational database schema given below.
Employee(Fname,Minit,Lname,SSn,Bdate,Address,Se
x,Salary,Super_ssn,Dno)
Department(Dname,Dnumber,Mgrssn,Mgr_startdate
)
Dept_location(Dnumber,Dlocation)
Project(Pname, Pnumber, Plocation,Dnum)
Works_on(essn,pno,hours)
Dependent(essn,dependent_ name, sex, bdate,
Relationship)
04/25/2024 74
(a) Retrieve the names of employees in department 5 who work more
than 10 hours per week on the 'ProductX' project.
(b) List the names of employees who have a dependent with the same first
name as
themselves.
(c) Find the names of employees that are directly supervised by 'Franklin
Wong'.
(d) For each project, list the project name and the total hours per week
(by all employees) spent on that project.
(e) Retrieve the names of employees who work on every project.
(f) Retrieve the names of employees who do not work on any project.
(g) For each department, retrieve the department name, and the average
salary of employees working in that department.
(h) Retrieve the average salary of all female employees.
(i) Find the names and addresses of employees who work on at least one
project located in Houston but whose department has no location in
Houston.
(j) List the last names of department managers who have no dependents.
04/25/2024 75