Information Systems Application - Databases: Relational Algebra
© 2009, CMU-ISR 2
Queries
A query is applied to relation instances
The query result is also a relation instance
• Schemas of input relations for a query are fixed
• A query will run regardless of the instance it is applied to
• The schema for the result is also fixed!
Determined by definition of query language
constructs
• Queries often use field names, but can also use
field position
Field position easier for formal definitions; however,
names are more readable
Both used in SQL
Example Instances
R1
Relational Algebra
Basic Operations:
• Selection (σ) Selects a subset of rows from a relation
• Projection (π) Chooses which columns appear in the new relation
• Cross-product (×) Allows us to combine two relations
• Set Difference (−) Rows in relation 1, but not in relation 2
• Union (∪) Tuples in relation 1 or relation 2
• Intersection (∩) Only tuples in both 1 and 2
Additional Operations:
• Intersection, join, division, renaming
Selection
Selects rows that satisfy selection condition
No Duplicates!
Schema of result identical to schema of input relation
Result relation can be the input for another relational algebra operation!
σ_{rating>8}(S2)
sid  sname  rating  age
28   yuppy  9       35.0
58   Rusty  10      35.0
σ_{sname="yuppy"}(σ_{rating>8}(S2))
sid  sname  rating  age
28   yuppy  9       35.0
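As a rough illustration of selection and of composing two selections, here is a minimal Python sketch. It models a relation as a set of named tuples. The `Sailor` type, the `select` helper, and the two extra rows of `S2` (Lubber and guppy) are assumptions made for the example; only the yuppy and Rusty rows appear on the slide.

```python
from collections import namedtuple

# Hypothetical row type matching the slide's schema (sid, sname, rating, age).
Sailor = namedtuple("Sailor", ["sid", "sname", "rating", "age"])

# Assumed sample instance: only the yuppy and Rusty rows come from the slide.
S2 = {
    Sailor(28, "yuppy", 9, 35.0),
    Sailor(31, "Lubber", 8, 55.5),
    Sailor(44, "guppy", 5, 35.0),
    Sailor(58, "Rusty", 10, 35.0),
}

def select(relation, predicate):
    """Selection: keep the rows that satisfy the predicate; schema is unchanged."""
    return {row for row in relation if predicate(row)}

# sigma_{rating>8}(S2)
high_rated = select(S2, lambda r: r.rating > 8)

# The result is itself a relation, so it can be the input of another selection.
yuppies = select(high_rated, lambda r: r.sname == "yuppy")
```

Because the input is a set and selection only drops rows, the result cannot contain duplicates.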
Projection
Deletes attributes that are not in projection list
sname  rating
yuppy  9
S2
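Projection can be sketched in Python along the same lines. The `project` helper, the positional schema tuple, and the sample rows of `S2` are assumptions for illustration, not part of the slides. Building the result as a set is what removes the duplicates that appear once columns are dropped.

```python
def project(rows, schema, fields):
    """Projection: keep only the listed fields; the set removes duplicate rows."""
    positions = [schema.index(f) for f in fields]
    return {tuple(row[i] for i in positions) for row in rows}

schema = ("sid", "sname", "rating", "age")

# Assumed sample instance of S2.
S2 = {
    (28, "yuppy", 9, 35.0),
    (31, "Lubber", 8, 55.5),
    (44, "guppy", 5, 35.0),
    (58, "Rusty", 10, 35.0),
}

names_and_ratings = project(S2, schema, ["sname", "rating"])

# Three sailors share age 35.0, so projecting on age alone yields two rows.
ages = project(S2, schema, ["age"])
```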
Union (S1 ∪ S2)
S2
Union Compatible
Operations take two input relations, which must
be union-compatible
• Same number of fields
• Corresponding fields have same domain, i.e. same type
S1 S3
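The two conditions above can be checked mechanically. The sketch below assumes a schema is a list of `(name, type)` pairs; the helper names, the `RESERVES` schema, and the sample rows are hypothetical. Field names do not have to match, so the result simply inherits the names of the first input.

```python
def union_compatible(schema_a, schema_b):
    """Same number of fields, and corresponding fields have the same type."""
    return len(schema_a) == len(schema_b) and all(
        type_a is type_b for (_, type_a), (_, type_b) in zip(schema_a, schema_b)
    )

def union(schema_a, rows_a, schema_b, rows_b):
    """Union of two union-compatible relations; duplicates collapse in the set."""
    if not union_compatible(schema_a, schema_b):
        raise ValueError("relations are not union-compatible")
    return schema_a, rows_a | rows_b

# Assumed schemas and instances, for illustration only.
SAILORS = [("sid", int), ("sname", str), ("rating", int), ("age", float)]
RESERVES = [("sid", int), ("bid", int), ("day", str)]

s1 = {(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5), (58, "Rusty", 10, 35.0)}
s2 = {(28, "yuppy", 9, 35.0), (58, "Rusty", 10, 35.0)}
r1 = {(22, 101, "10/10/96")}

schema, rows = union(SAILORS, s1, SAILORS, s2)
```

Calling `union(SAILORS, s1, RESERVES, r1)` raises `ValueError`, since the two schemas differ in both length and field types.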
Intersection and Set-Difference
S1 ∩ S2
S1 − S2
sid sname rating age
sid sname rating age
22 Dustin 7 45.0
22 Dustin 7 45.0
31 Lubber 8 55.5
58 Rusty 10 35.0
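With relations modeled as Python sets of tuples, intersection and set-difference map directly onto the built-in set operators. In this sketch the rows of `S1` are the three sailors listed on this slide, while the contents of `S2` are an assumption, so the results hold only for this assumed instance.

```python
# Rows as (sid, sname, rating, age).
S1 = {(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5), (58, "Rusty", 10, 35.0)}

# Assumed instance of S2 (hypothetical rows, for illustration only).
S2 = {
    (28, "yuppy", 9, 35.0),
    (31, "Lubber", 8, 55.5),
    (44, "guppy", 5, 35.0),
    (58, "Rusty", 10, 35.0),
}

intersection = S1 & S2   # tuples in both S1 and S2
difference = S1 - S2     # tuples in S1 but not in S2

# Intersection can be derived from set-difference alone.
derived_intersection = S1 - (S1 - S2)
```

The last line shows why intersection is often treated as a derived operation: it can be expressed using only set-difference.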
Cross Product (S1 × R1)
Each row of S1 is paired with each row of R1
Result Schema has one field per field of S1 and
R1, with field names inherited if possible
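A minimal Python sketch of the cross-product follows. The instances of `S1` and `R1` and the `cross_product` helper are assumptions for illustration. Field names are inherited where they are unique. For names that clash (here `sid`), the sketch prefixes the relation name, which is one possible convention and not necessarily the one used in the lecture.

```python
from itertools import product

def cross_product(name_a, schema_a, rows_a, name_b, schema_b, rows_b):
    """Pair every row of A with every row of B; qualify clashing field names."""
    clashes = set(schema_a) & set(schema_b)

    def qualify(name, schema):
        return tuple(f"{name}.{f}" if f in clashes else f for f in schema)

    schema = qualify(name_a, schema_a) + qualify(name_b, schema_b)
    rows = {a + b for a, b in product(rows_a, rows_b)}
    return schema, rows

# Assumed sample instances.
S1_SCHEMA = ("sid", "sname", "rating", "age")
S1 = {(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5), (58, "Rusty", 10, 35.0)}
R1_SCHEMA = ("sid", "bid", "day")
R1 = {(22, 101, "10/10/96"), (58, 103, "11/12/96")}

schema, rows = cross_product("S1", S1_SCHEMA, S1, "R1", R1_SCHEMA, R1)
```

The result has one field per field of both inputs (7 here) and one row per pair of input rows (3 × 2 = 6 here).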
Renaming Operator
Name conflicts require new field names
Convenient to name relation results
Renaming operator (ρ) for expression ρ(R(F),E),
where:
• E is an arbitrary relation expression
• R is the name given the result from the relation
expression
• F is the list of renamed fields in the form of
Oldname → newname
Oldposition → newname
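The renaming operator can be sketched as a function over a schema. The `rename` helper below is hypothetical, and it assumes that positions are 1-based. It accepts either an old name or an old position as the key, mirroring the two forms listed above. Renaming by position is what resolves the conflict when the same name occurs twice.

```python
def rename(new_name, renames, schema):
    """rho(R(F), E) applied to a schema.

    renames maps an old field name (str) or a 1-based position (int)
    to the new field name. Returns the new relation name and schema.
    """
    fields = list(schema)
    for old, new in renames.items():
        index = old - 1 if isinstance(old, int) else schema.index(old)
        fields[index] = new
    return new_name, tuple(fields)

# Schema of a cross-product in which sid occurs twice (assumed example).
product_schema = ("sid", "sname", "rating", "age", "sid", "bid", "day")

# Rename by position to remove the name conflict.
name, schema = rename("C", {1: "sid1", 5: "sid2"}, product_schema)

# Rename by name when the name is unique.
_, schema_by_name = rename("C", {"day": "date"}, product_schema)
```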
Equi-joins
Special case of the condition join where
the condition c contains only equalities
Resulting schema same as that of cross-
product
Fewer tuples than cross-product
S1 ⋈_{S1.sid = R1.sid} R1
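An equi-join can be sketched as a cross-product filtered by equality conditions. In the Python sketch below, the `equi_join` helper and the sample instances are assumptions. Conditions are given as pairs of 0-based field positions, and both `sid` columns are kept so that the result schema matches that of the cross-product.

```python
def equi_join(rows_a, rows_b, equalities):
    """Keep the pairs (a, b) where a[i] == b[j] for every (i, j) in equalities."""
    return {
        a + b
        for a in rows_a
        for b in rows_b
        if all(a[i] == b[j] for i, j in equalities)
    }

# Assumed sample instances: S1(sid, sname, rating, age), R1(sid, bid, day).
S1 = {(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5), (58, "Rusty", 10, 35.0)}
R1 = {(22, 101, "10/10/96"), (58, 103, "11/12/96")}

# S1 joined with R1 on S1.sid = R1.sid (position 0 in both relations).
joined = equi_join(S1, R1, [(0, 0)])
```

For this sample data the join keeps 2 of the 6 rows that the cross-product would produce.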
Division
Not supported as a primitive operator, but
useful for expressing queries such as:
• Find x which have all y
• Let A have 2 fields, x and y; B have only one
field y:
A/B = {x | ∀ y ∈ B: 〈x,y〉 ∈ A}
• A/B contains all x tuples such that for every y
tuple in B, there is an xy tuple in A
• x and y can be any lists of fields; y is the list of fields in B, and x ∪ y is the list of fields of A
Division Example
A
f1  f2
S1  P1
S1  P2
S1  P3
S1  P4
S2  P1
S2  P2
S3  P2
S4  P2
S4  P4
B1
f2
P2
B2
f2
P2
P4
B3
f2
P1
P2
P4
A/B1
f1
S1
S2
S3
S4
A/B2
f1
S1
S4
A/B3
f1
S1
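The division example can be checked with a direct Python transcription of the definition. In this sketch, `A` is a set of `(x, y)` pairs and each `B` is a set of `y` values. The `divide` helper is hypothetical, and the row `("S4", "P4")` is included on the assumption that it belongs to `A`, since `S4` would otherwise not qualify for A/B2.

```python
def divide(a, b):
    """A/B: every x in A such that (x, y) is in A for every y in B."""
    xs = {x for x, _ in a}
    return {x for x in xs if all((x, y) in a for y in b)}

A = {
    ("S1", "P1"), ("S1", "P2"), ("S1", "P3"), ("S1", "P4"),
    ("S2", "P1"), ("S2", "P2"),
    ("S3", "P2"),
    ("S4", "P2"), ("S4", "P4"),   # ("S4", "P4") is an assumed row
}
B1 = {"P2"}
B2 = {"P2", "P4"}
B3 = {"P1", "P2", "P4"}

result1 = divide(A, B1)
result2 = divide(A, B2)
result3 = divide(A, B3)
```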
Expressing Division Using Basic Operators
For A/B, compute all x values that are not
‘disqualified’ by some y value in B
• An x value is disqualified if, by attaching a y value from B, we obtain an xy tuple that is not in A
Disqualified x values: π_x((π_x(A) × B) − A)
A/B: π_x(A) − (all disqualified x values)
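The "disqualified values" construction can be followed step by step in Python, using only projection, cross-product, and set-difference. The function name and the small sample relations are assumptions chosen for illustration.

```python
def divide_basic(a, b):
    """A/B built from projection, cross-product and set-difference only."""
    xs = {x for x, _ in a}                         # pi_x(A)
    all_pairs = {(x, y) for x in xs for y in b}    # pi_x(A) x B
    missing = all_pairs - a                        # (pi_x(A) x B) - A
    disqualified = {x for x, _ in missing}         # pi_x of the missing pairs
    return xs - disqualified                       # pi_x(A) - disqualified

# Hypothetical sample data: S2 lacks P2, so S2 is disqualified.
A = {("S1", "P1"), ("S1", "P2"), ("S2", "P1")}
B = {"P1", "P2"}

quotient = divide_basic(A, B)
```

Here the only pair missing from `A` is `("S2", "P2")`, so `S2` is the single disqualified value and the quotient is `{"S1"}`.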
Lots of Example Queries
Example 2
Find names of sailors who have reserved a
red boat
• Solution 1: π_{sname}((σ_{color='red'}(Boats)) ⋈ Reserves ⋈ Sailors)
• Solution 2: π_{sname}(π_{sid}((π_{bid}(σ_{color='red'}(Boats))) ⋈ Reserves) ⋈ Sailors)
Which solution is more efficient?
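To compare the two solutions, the Python sketch below evaluates both plans on small sample instances. All rows are assumptions and do not come from the slides. For simplicity the joins keep duplicate columns instead of merging them as a natural join would. Both plans return the same names. The difference is the width of the intermediate results: Solution 1 carries full rows through both joins, while Solution 2 carries only `bid` and then `sid`. Which plan is actually faster depends on the system and the data.

```python
# Assumed sample instances.
sailors = {(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5), (58, "Rusty", 10, 35.0)}
boats = {(101, "Interlake", "blue"), (102, "Interlake", "red"),
         (103, "Clipper", "green"), (104, "Marine", "red")}
reserves = {(22, 101, "10/10/96"), (22, 102, "10/10/96"),
            (31, 102, "11/10/96"), (58, 103, "11/12/96")}

# Solution 1: join whole rows, project on sname at the very end.
red_boats = {b for b in boats if b[2] == "red"}
# Row layout: (bid, bname, color, sid, bid, day)
boats_reserves = {b + r for b in red_boats for r in reserves if b[0] == r[1]}
# Row layout: the six fields above, then (sid, sname, rating, age)
full_join = {t + s for t in boats_reserves for s in sailors if t[3] == s[0]}
solution1 = {(t[7],) for t in full_join}

# Solution 2: project early, so each intermediate result has one field.
red_bids = {b[0] for b in boats if b[2] == "red"}
sids = {r[0] for r in reserves if r[1] in red_bids}
solution2 = {(s[1],) for s in sailors if s[0] in sids}

# Width, in fields, of the widest intermediate row in each plan.
widest_1 = max(len(t) for t in full_join)
widest_2 = 1
```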
Example 3
Relational Algebra Summary
The relational model has a rigorously
defined query language that is simple and
powerful
Relational algebra is more operational;
useful as internal representation for query evaluation plans
There are often several ways of expressing
a given query; the query optimizer should
choose the most efficient version