Download as pdf or txt
Download as pdf or txt
You are on page 1of 65

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 1 / 72

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 2 / 72


Type your t

Table "Movies"

id title genre opened


Uni ed representation of all data using tables with rows
101 Aliens Action 1986
and tables
102 Logan Drama 2017
De nitions ⁝ ⁝ ⁝ ⁝
Domains Possible values of an attribute
Table "Cast"
NULL Unknown/invalid values

movie_id actor_id role


Tuples A single row on the table
101 20 Ellen Ripley
Relations Set of tuples (all rows)
102 21 Logan
⁝ ⁝ ⁝

Table "Actors"
Condition that restricts what constitutes valid data
Domain (what type can be inserted?) id name dob
Key (which column(s) identi es others?) 20 Sigourney Weaver 08-10-1949
Foreign Key (does it exist somewhere else?) 21 Hugh Jackman 12-10-1968
⁝ ⁝ ⁝

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 3 / 72


(Foreign) key constraints are not an intrinsic property of a relation
Constraints are speci ed by the database designer to de ne what constitutes valid data

Without any key constraints, the relation on the right is course mc exam
perfectly valid (DBMS does not complain) cs2102 2 yes
However, the meaning is problematic from the cs2102 2 no
application perspective cs2102 2 NULL
(e.g., CS2102 gives different credits, with and without exam?) cs2102 4 yes
cs2102 4 no
Goal: Avoid different values for "mc" and "exam" for the
same course ⁝ ⁝ ⁝

(i.e., "course" must uniquely identify both "mc" and "exam") NULL NULL NULL

How? Pick {course} as the primary key

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 4 / 72


Each foreign key in R1 must satisfy one of the following:

Appear as primary key in R2


Be a NULL value (or a tuple containing at least one NULL value )

at least one of the values of the


referencing columns in R1 shall
be a NULL value, or
the value of each referencing
column in R1 shall be equal to
the value of the corresponding
referenced column in some row
of the referenced table

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 5 / 72


CS2102: Database Systems -- Adi Yoga Sidi Prabawa 6 / 72
Preliminary Unary Operators
Closure Set Operators

Logical Inner Joins


Relational Outer Joins
Arithmetic

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 7 / 72


CS2102: Database Systems -- Adi Yoga Sidi Prabawa 8 / 72
An algebra is a mathematical system consisting of:
Operands Variables or values from which new values can be constructed
Operators Symbols denoting procedures that construct new values from the given values

Operands Relations (or variables representing relations)


Operators Transformation from one or more input relations into one output relation

Unary selection (σ), projection (π), renaming (ρ)


All other operators✱ can be
Binary cross product (×), union (∪), intersection (∩),
expressed using these basic
difference (−) operators✱✱.

✱✱
In fact, intersection can be constructed using union and difference.

Except for "aggregate" and "sorting" operations.
CS2102: Database Systems -- Adi Yoga Sidi Prabawa 9 / 72
Allows for query optimization✱
Rewrite a query such that:
It is equivalent to the original query
It can be evaluated faster than the original query

Which of the following operation is "faster"?

A. (a × b) + (a × c)
B. a × (b + c)
Is it equivalent?


While we do not penalize for inef cient queries, it is useful to know.
CS2102: Database Systems -- Adi Yoga Sidi Prabawa 10 / 72
We say that a set of values is closed under the set of
operators if any combination of the operators produces only
values in the given set.

ℤ+ is closed under {+}


but not under {+, -}

ℤ is closed under {+, -, ×}


but not under {+, -, ×, /}

ℝ is closed under {+, -, ×, /}✱


but not under {+, -, ×, /, √}
ℂ is algebraically closed


Division operation (i.e., /) should not be a division by
zero.
CS2102: Database Systems -- Adi Yoga Sidi Prabawa 11 / 72
We say that a set of values is closed under the set of
operators if any combination of the operators produces only
values in the given set.

Relations are closed under Relational Algebra.

Proof
Inputs are relations
Outputs are relations

⇒ Output of one relation → input to another relation


⇒ No other output results are possible

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 12 / 72


We can chain operations because all inputs
and outputs are relations (output of one operators
can be input of another operators).

Using the image on the right we get

Op5 (
Op2 ( Op1 ( R1 ) ) ,
Op4 (
Op3 ( R2 ) ,
R3

)
)

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 13 / 72


A simpli ed company schema

Employees(name: TEXT, age: INT, role: TEXT)


Managers(name: TEXT, of ce: TEXT)
Teams(ename: TEXT, pname: TEXT, hours: INT)
Projects(name: TEXT, manager: TEXT, start_year: INT, end_year: INT)

(Manager.name) ⇝ (Employees.name)
(Teams.ename) ⇝ (Employees.name)
(Teams.pname) ⇝ (Projects.name)
(Projects.manager) ⇝ (Manager.name)

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 14 / 72


Table "Projects"

name manager start_year end_year


BigAI Judy 2020 2025
FastCash Judy 2018 2025
GlobalDB Jack 2019 2023
CoreOS Judy 2020 2020
Table "Teams" CoolCoin Jack 2015 2020

ename pname hours Table "Employees" Table "Managers"


Sarah BigAI 10
Sam BigAI 5 name age role name of ce

Bill BigAI 15 Sarah 25 dev Judy #03-20


Judy GlobalDB 20 Judy 35 sales Jack #03-10
Max GlobalDB 5 Max 52 dev
Sarah GlobalDB 10 Marie 36 hr
Emma GlobalDB 35 Sam 30 sales
Max CoreOS 40 Bernie 19 NULL L02.tbl
Bill CoreOS 30 Emma 28 dev
Sam CoolCoin 40 Jack 40 dev
Sarah CoolCoin 25 Bill 45 dev
Emma CoolCoin 10

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 15 / 72


CS2102: Database Systems -- Adi Yoga Sidi Prabawa 16 / 72
c1 c2 c1 ∧ c2 c1 ∨ c2 ¬c1

False False False False True


False NULL False NULL True
False True False True True

NULL False False NULL NULL


NULL NULL NULL NULL NULL
NULL True NULL True NULL

True False False True False


True NULL NULL True False
True True True True False

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 17 / 72


c1 c1
c1 ∧ c2 c1 ∨ c2
False NULL True False NULL True
False False False False False False NULL True
c2 NULL False NULL NULL c2 NULL NULL NULL True
True False NULL True True True True True

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 18 / 72


Any relational/arithmetic operation with NULL produces NULL values

Operation Result Operation Result


NULL < 2023 NULL NULL = NULL NULL

NULL ≥ 0 NULL NULL <> NULL NULL

We add operators to treat NULL as values

V1 V2 V1 ≡ V2 V1 ≢ V2

NULL NULL True False


V1 NULL False True

NULL V2 False True


V1 V2 V1 = V2 V1 <> V2

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 19 / 72


CS2102: Database Systems -- Adi Yoga Sidi Prabawa 20 / 72
σ[c](R) where c is a condition that returns a boolean value:
Expr Example Name

1 (expr) σ[(start_year=2020)](Projects) precedence

2 attr op const σ[start_year=2020](Projects) constant selection

2 attr1 op attr2 σ[start_year=end_year](Projects) attribute selection

3 ¬ expr σ[¬(start_year=2020)](Projects) negation

4 expr1 ∧ expr2 σ[start_year=2020 ∧ manager='Judy'](Projects) conjunction

5 expr1 ∨ expr2 σ[start_year=2020 ∨ manager='Judy'](Projects) disjunction

with op ∈ {=, <>, <, ≤, ≥, >, ≡, ≢}.

Operator Precedence: (), op, ¬, ∧, ∨✱


Without the parentheses and op, this is often called Disjunctive Normal Form.
CS2102: Database Systems -- Adi Yoga Sidi Prabawa 21 / 72
σ[c](R) selects all tuples from a relation R (i.e., rows from a table) that satisfy the selection condition c.

The result have the same schema


as the input relation
The number of rows are often
smaller


We can always "rearrange" the rows such that the visual above works because we are working with a set of tuples.
CS2102: Database Systems -- Adi Yoga Sidi Prabawa 22 / 72
Find all projects where Judy is the manager.

$T1 := σ[(manager = 'Judy')](Projects)


name manager start_year end_year

"BigAI" "Judy" 2020 2025

"FastCash" "Judy" 2018 2025

"CoreOS" "Judy" 2020 2020

3 Rows

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 23 / 72


Find all employees that (a) do not work in Sales and are younger than 30 or
(b) work in HR.

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 24 / 72


π[ℓ](R) where ℓ is an ordered list of attributes

Note:
For simplicity, we do not allow the following on ℓ:
Operations (e.g., π[A + A ](R))
1 2
Duplicate (e.g., π[A , A ](R))
1 1
The ordering of the attribute matters (i.e., π[A , A ](R) ≠ π[A , A ](R))
1 2 2 1

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 25 / 72


π[ℓ](R) keeps only the column (i.e., projects) speci ed in the ordered list ℓ and in the same order.

The resulting schema is speci ed


by ℓ
The number of rows may be
smaller
Because relation is de ned as
a set of tuples

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 26 / 72


$T1 := π[pname](Teams)
pname
Find all project names that have
"BigAI"
team members.
"GlobalDB"

"CoreOS"

"CoolCoin"

4 Rows

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 27 / 72


Which of the following relational algebra expression resulted in the output below?

name manager start_year end_year manager name


BigAI Judy 2020 2025 Judy FastCash
FastCash Judy 2018 2025 Jack GlobalDB
GlobalDB Jack 2019 2023 Jack CoolCoin
CoreOS Judy 2020 2020
CoolCoin Jack 2015 2020

Choice Comment
A σ[start_year≤2019](π[manager,name](Projects))
B π[manager,name](σ[start_year<2020](Projects))
C π[manager,name](σ[manager='Jack'](Projects))
D π[name,manager](σ[start_year≤2019](Projects))

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 28 / 72


ρ[ℜ](R) where ℜ is a collection of attribute renaming which is either:
. ℜ is an unordered collection of Bi ← Ai
Example: ρ[B1 ← A1, B2 ← A2](R)
The order of attributes does not matter
. ℜ is an ordered list of attributes
Example: ρ(B1, B2, ..., Bn)(R)
The order of attributes matters
The list must specify all attributes in the original schema
If not renamed, set Bi = Ai

We will follow option (1) (unordered collection of arrows (new ← old)).

Note:
Renaming will be relevant for set and join operations

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 29 / 72


ρ[ℜ](R) renames the attributes listed in ℜ.

The resulting schema is modi ed


by ℜ
The order columns is unchanged
(modulo renaming)
The number of rows remains the
same

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 30 / 72


$T1 := ρ[pname project](Teams)

ename project hours


Rename the teams such that the
"Sarah" "BigAI" 10
project name becomes simple
"Sam" "BigAI" 5
"project".
"Bill" "BigAI" 15

"Judy" "GlobalDB" 20

"Max" "GlobalDB" 5

"Sarah" "GlobalDB" 10

"Emma" "GlobalDB" 35

"Max" "CoreOS" 40

"Bill" "CoreOS" 30

"Sam" "CoolCoin" 40

"Sarah" "CoolCoin" 25

"Emma" "CoolCoin" 10

12 Rows

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 31 / 72


σ[c](R)

The condition c must specify only attributes in R

π[ℓ](R)

The ordered list of attributes ℓ must specify only attributes in R

ρ[ℜ](R)

No two different attributes may be renamed to the same name


No attributes may be renamed more than once in a single operation

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 32 / 72


CS2102: Database Systems -- Adi Yoga Sidi Prabawa 33 / 72
Operation Operator Visualization Description

Union R∪S A relation containing all tuples that are in R or S

Intersection R ∩ S A relation containing all tuples that are in R and S

A relation containing all tuples that are in R but not in


Difference R−S
S

What should be the result of union between {1, 2} and {'A', 'B', 'C'}?

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 34 / 72


Two relations R and S are union-compatible if it satis es both of the following:
R and S have the same number of attributes
The corresponding attributes have the same or compatible domains

Note:
Just because two relations are union-compatible, does not mean the set operation is meaningful

Employees(name: TEXT, age: INT , role: TEXT ) Employees(name: TEXT, role: TEXT, age: INT)
Teams(ename: TEXT, pname: TEXT , hours: INT ) Teams(ename: TEXT, pname: TEXT, hours: INT)

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 35 / 72


$T5 := $T2 - $T4
pname
Find all projects
"CoreOS"
that Bill is
working on but 1 Rows

Sarah is not
working on.

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 36 / 72


The cross product of two relations (R × S) is a relation formed $T1 := R × S
A B C X Y
by combining all pairs of tuples (i.e., rows) from the two input
relations. 1 5 "m" "a" 30

1 5 "m" "b" 10

More formally, let R(A1, A2, ..., An) and S(B1, B2, ..., Bn), then 1 5 "m" "c" 20

2 3 "f" "a" 30

R × S produce a schema 2 3 "f" "b" 10


(R × S)(A1, A2, ..., An, B1, B2, ..., Bn) 2 3 "f" "c" 20

6 Rows
R × S = {(a1, a2, ..., an, b1, b2, ..., bn)
| (a1, a2, ..., an) ∈ R ∧ (b1, b2, ..., bn) ∈ S}

Note:
The set of attributes in R and S must be disjoint
(Attr(R) ∩ Attr(S) = ∅),    where
Attr(R) is the set of attributes in R
Attr(S) is the set of attributes in S

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 37 / 72


Find all pairs of senior employee names (i.e., age ≥ 45) and junior employee names (i.e., age ≤
25).

$T6 := $T2 × $T5


name jname

"Max" "Sarah"

"Max" "Bernie"

"Bill" "Sarah"

"Bill" "Bernie"

4 Rows

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 38 / 72


For all the project names, nd the of ces of the managers.

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 39 / 72


Given two relations R and S, the size of the cross product is |R| × |S|
In practice, many to most queries requiring a cross product also require
Selection operation to remove unnecessary rows
Projection to remove unnecessary/duplicate columns (optional)

π[name,of ce](σ[manager=mname](Projects × (ρ[mname←name](Managers))))

Simplify relational algebra expression


Combine cross product, selection, and (optionally) projection
⋈θ
Avoid generating all |R| × |S| intermediate tuples

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 40 / 72


CS2102: Database Systems -- Adi Yoga Sidi Prabawa 41 / 72
Combine cross product, selection, and (optionally) projection into a single operator
Typically results in simpler relational algebra expressions when formulating queries

Includes only tuples that satisfy the Also include tuples that do not satisfy
condition the condition

⋈[θ] θ-join ⟕[θ] left outer join


⋈= equi join ⟖[θ] right outer join
⋈ natural join ⟗[θ] full outer join

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 42 / 72


θ θ

The θ-join (R ⋈[θ] S) of two relations R and S is de ned as: R ⋈[θ] S = σ[θ](R × S)

Note:
The condition θ can use attributes that appears in R or S

$T2 := Projects ⋈[(manager = mname)] $T1


name manager start_year end_year mname office

For all the projects, nd "BigAI" "Judy" 2020 2025 "Judy" "#03-20"

the of ces of the "FastCash" "Judy" 2018 2025 "Judy" "#03-20"

"GlobalDB" "Jack" 2019 2023 "Jack" "#03-10"


managers. "CoreOS" "Judy" 2020 2020 "Judy" "#03-20"

"CoolCoin" "Jack" 2015 2020 "Jack" "#03-10"

Projects ⋈[manager=mname] 5 Rows

(ρ[mname←name](Managers))

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 43 / 72


θ

Equi join (R ⋈= S) is a special θ-join where the only relational operator that can be used is equality (e.g., = or ≡)

Note:
θ-Join can use arbitrary relational operator (e.g., =, <>, <, ≤, >, ≥, ≡, ≢)
Dedicated operation because attribute selection using equality operator are the most common
Especially when performing inner join following foreign key constraints
In many cases, hash-tables can be used to increase speed

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 44 / 72


θ

Let R(A1, A2, ..., Ai, B1, B2, ..., Bj ) and S( B1, B2, ..., Bj , C1, C2, ..., Ck). We de ne natural join (R ⋈ S):
R ⋈ S = {(a1, a2, ..., ai, b1, b2, ..., bj , c1, c2, ..., ck)
| (a1, a2, ..., ai, b1, b2, ..., bj) ∈ R
∧ (b1, b2, ..., bj, c1, c2, ..., ck) ∈ S}

In other words:
The join is performed over all common attributes between R and S (i.e., (B1, B2, ..., Bj))
We only combine if the value of the common attributes are equal
The output relations contains the common attribute of R and S only once (via projection)

Alternatively, R ⋈ S = π[ℓ](R ⋈[θ] S), where:


ℓ = Attr(R) ∪ (Attr(S) - Attr(R))✱
θ = ∀ Ai ∈ Attr(R) ∩ Attr(S): RAi = SAi


Equivalent to Attr(R) ∪ Attr(S) except that we want to show it more explicitly all attribute in S appears only after R.
CS2102: Database Systems -- Adi Yoga Sidi Prabawa 45 / 72
Inner joins eliminate all tuples that do not Table "Teams" Table "Employees"

satisfy the condition


ename pname hours name age role
(i.e., selection operation)
Sarah BigAI 10 Sarah 25 dev
Sometimes the tuple in R or S that do NOT
Sam BigAI 5 Judy 35 sales
match are also of interest
(i.e., dangling tuple) Bill BigAI 15 Max 52 dev
Judy GlobalDB 20 Marie 36 hr
Max GlobalDB 5 Sam 30 sales
Sarah GlobalDB 10 Bernie 19 NULL
Find all employees that are not
Emma GlobalDB 35 Emma 28 dev
assigned to any project.
Max CoreOS 40 Jack 40 dev
Bill CoreOS 30 Bill 45 dev
Sam CoolCoin 40
Inner join can only nd all employees that are Sarah CoolCoin 25
assigned to at least one project. Emma CoolCoin 10

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 46 / 72


. Perform inner join M = R ⋈θ S
. Add dangling tuples to M from:
R in case of a left outer join
S in case of a right outer join
R and S in case of a full outer join

. "Pad" missing attribute values of dangling tuples with NULL

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 47 / 72


R ⟕[θ] S = R ⋈[θ] S ∪ (dangle(R ⋈[θ] S) × {null(S)})
R ⟖[θ] S = R ⋈[θ] S ∪ (dangle({null(S)} × R ⋈[θ] S))
R ⟗[θ] S = R ⋈[θ] S ∪ (dangle(R ⋈[θ] S) × {null(S)}) ∪ (dangle({null(S)} × R ⋈[θ] S))

where

dangle(R ⋈ S) = set of dangling tuples in R with respect to R ⋈ S


null(R) = n-component tuple of NULL values where n is the number of attributes of R

Find all employees that σ[ename ≡ NULL](Employees ⟕[name=ename] Teams)


are not assigned to any
project.

There are other solutions involving set operations as well.

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 48 / 72


Analog to natural (inner) join

The condition θ is not speci ed


Equality operation is performed over all attributes that R and S have in common

Natural left outer join R ⟕ S


Natural right outer join R ⟖ S
Natural full outer join R ⟗ S

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 49 / 72


Table "Teams"

ename pname hours


Sarah BigAI 10
Sam BigAI 5
Bill BigAI 15
Judy GlobalDB 20
Max GlobalDB 5
Sarah GlobalDB 10
Recap the relations "Teams" and "Managers" on Emma GlobalDB 35

the right. How many row and columns are in the Max CoreOS 40
Bill CoreOS 30
result of the relational algebra expression below? Sam CoolCoin 40
σ[ename ≡ NULL](Managers ⟕[name=ename]Teams) Sarah CoolCoin 25
Emma CoolCoin 10

Table "Managers"
name of ce
Judy #03-20
Jack #03-10

Choice Comment
A 1 row, 5 columns
B 3 rows, 5 columns
C 1 row, 3 columns
D 3 rows, 3 columns

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 50 / 72


CS2102: Database Systems -- Adi Yoga Sidi Prabawa 51 / 72
Find all managers (with their
of ces) of projects that
started in 2020 or later,
where at least one
member of the project
team has to work 30 hours
or more on that project
per week.

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 52 / 72


π[name, manager, office](
σ[hours ≥ 30](Teams)
⋈[pname = name]
π[name, manager, office](
σ[start_year ≥ 2020](Projects)

ρ[manager ← name](Managers)))

Find all managers (with their


of ces) of projects that
started in 2020 or later,
where at least one
member of the project
team has to work 30 hours
or more on that project
per week.

M := ρ[manager ← name](Managers)
P := σ[start_year ≥ 2020](Projects)
T := σ[hours ≥ 30](Teams)
Q1 := π[name, manager, of ce](P ⋈ M)
Q2 := T ⋈[pname = name] Q1
Q := π[name, manager, of ce](Q2)

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 53 / 72


π[name, manager, office](
σ[hours ≥ 30](Teams)
⋈[pname = name]
σ[start_year ≥ 2020](Projects)

ρ[manager ← name](Managers)
)

Find all managers (with their


of ces) of projects that
started in 2020 or later,
where at least one
member of the project
team has to work 30 hours
or more on that project
per week.

M := ρ[manager ← name](Managers)
P := σ[start_year ≥ 2020](Projects)
T := σ[hours ≥ 30](Teams)
Q1 := T ⋈[pname = name] P
Q2 := Q1 ⋈ M
Q := π[name, manager, of ce](Q2)

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 54 / 72


We say that two relational algebra expressions Q1 and Q2 are equivalent (denoted by Q1 ≡ Q2) if for the any input
relations either:
both produces error or both produces the same result
Note:
This may be called strongly equivalent
Weakly equivalent can be de ned as if there is no error then both produces the same result

In general, there are multiple ways to formulate a query to get the same result
This includes:
Order in which join operations are performed
Order in which selection operations are performed (e.g., before or after join operators)
Inserting additional projections to minimize intermediate results
Query optimization: nding the "best" relational algebra expression (CS3223)
Handled by the DBMS transparent to the user

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 55 / 72


σ[c1](σ[c2](R)) ≡ σ[c2](σ[c1](R)) R × S ≢ S × R    (different column order✱)
σ[c1](σ[c2](R)) ≡ σ[c1 ∧ c2](R) R ⋈ S ≢ S ⋈ R    (i.e., non-commutative)
R × (S × T) ≡ (R × S) × T
R ⋈ (S ⋈ T) ≡ (R ⋈ S) ⋈ T    (i.e., associative)
πℓ1(πℓ2(R)) ≢ πℓ1(R)    (unless ℓ1 ⊆ ℓ2) R ⋈[θ1] (S ⋈[θ2] T) ≢ (R ⋈[θ1] S) ⋈[θ2] T
(if θ1 uses T and θ2 uses R)

π[ℓ](σ[θ](R)) ≢ σ[θ](π[ℓ](R))    (unless θ uses only attributes in ℓ)


σ[θ](R × S) ≢ σ[θ](R) × S    (unless θ uses only attributes in R)


You may nd some people allow this to be equivalent but we do not.
CS2102: Database Systems -- Adi Yoga Sidi Prabawa 56 / 72
σ[c1](σ[c2](R)) ≡ σ[c2](σ[c1](R)) R × S ≢ S × R    (different column order✱)
σ[c1](σ[c2](R)) ≡ σ[c1 ∧ c2](R) R ⋈ S ≢ S ⋈ R    (i.e., non-commutative)
R × (S × T) ≡ (R × S) × T
R ⋈ (S ⋈ T) ≡ (R ⋈ S) ⋈ T    (i.e., associative)
πℓ1(πℓ2(R)) ≡ πℓ1(R)    R ⋈[θ1] (S ⋈[θ2] T) ≡ (R ⋈[θ1] S) ⋈[θ2] T

π[ℓ](σ[θ](R)) ≡ σ[θ](π[ℓ](R))
σ[θ](R × S) ≡ σ[θ](R) × S   


You may nd some people allow this to be equivalent but we do not.
CS2102: Database Systems -- Adi Yoga Sidi Prabawa 57 / 72
σ[ role = 'dev'](π[ name, age ](Employees))

σ[ role = 'dev'](ρ[ position ← role ](Employees))

σ[ age = role ](Employees)


We assume that there is no implicit type conversion (e.g., INT to TEXT)

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 58 / 72


Can you use natural join for the rst
and last operations?


σ[ manager = mname ](Projects × ρ[mname ← name](Managers))
≡ Projects ⋈ [ manager = mname ] ρ[mname ← name](Managers)

π[ name ](π[ name , age ](Employees))


≡ π[ name ](Employees)

σ[start_year = 2020] (Projects ⋈[manager = mname] ρ[mname ← name](Managers))


≡ (σ[start_year = 2020](Projects)) ⋈[manager = mname] ρ[mname ← name](Managers)

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 59 / 72


CS2102: Database Systems -- Adi Yoga Sidi Prabawa 60 / 72
Formal method to query relational data
Closure property for arbitrarily complex relational expressions
Basis for database query language such as SQL

Unary selection (σ), projection (π), renaming (ρ)


Binary cross product (×), union (∪), intersection (∩), difference (−)

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 61 / 72


CS2102: Database Systems -- Adi Yoga Sidi Prabawa 62 / 72
Operation Operator Visualization Description

Union R∪S A relation containing all tuples that are in R or S

Intersection R ∩ S A relation containing all tuples that are in R and S

A relation containing all tuples that are in R but not in


Difference R−S
S

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 63 / 72


L Operator R Visualization

Keep
dangling

Keep
dangling

Keep Keep
dangling dangling

CS2102: Database Systems -- Adi Yoga Sidi Prabawa 64 / 72


postgres=# exit
Press any key to continue . . .

CS2102: Database Systems -- Adi Yoga Sidi Prabawa

You might also like