Download as pdf or txt
Download as pdf or txt
You are on page 1of 4

2/20/2015

Relational Query Languages


Query languages: Allow manipulation and retrieval
of data from a database.
Relational Algebra • Relational model supports simple, powerful QLs:
– Strong formal foundation based on logic.
– Allows for much optimization.
• Query Languages != programming languages!
Ref: Database Management Systems 3rd Ed. R. Ramakrishnan and J. Gehrke – QLs not expected to be “Turing complete”.
– QLs not intended to be used for complex calculations.
– QLs support easy, efficient access to large data sets.
University of Botswana. CSI262 Lecture
2
Notes 2014-2015

Formal Relational Query Languages Preliminaries


• Two mathematical Query Languages form the • A query is applied to relation instances, and the
result of a query is also a relation instance.
basis for “real” languages (e.g. SQL), and for – Schemas of input relations for a query are fixed (but
implementation: query will run regardless of instance!)
– Relational Algebra: More operational, very useful – The schema for the result of a given query is also
fixed! Determined by definition of query language
for representing execution plans. constructs.
– Relational Calculus: Lets users describe what • Positional vs. named-field notation:
they want, rather than how to compute it. – Positional notation easier for formal definitions,
(Nonoperational, declarative.) named-field notation more readable.
– Both used in SQL

University of Botswana. CSI262 Lecture University of Botswana. CSI262 Lecture


3 4
Notes 2014-2015 Notes 2014-2015

Example Instances
• “Sailors” and “Reserves” relations for our
Relational Algebra
examples. • Basic operations:
• We’ll use positional or named field notation, – Selection (ς) Selects a subset of rows from relation.
assume that names of fields in query results – Projection (π ) Deletes unwanted columns from relation.
are `inherited’ from names of fields in query – Cross-product (X) Allows us to combine two relations.
input relations. R1 SID BID Day
– Set-difference (-) Tuples in reln. 1, but not in reln. 2.
22 101 10/10/96 – Union(U) Tuples in reln. 1 and in reln. 2.
58 103 11/12/96 • Additional operations:
– Intersection, join, division, renaming: Not essential, but
S2 SID SName Rating Age (very!) useful.
S1 SID SName Rating Age
22 Dustbin 7 45.0
28 Yuppy 9 35.0 • Since each operation returns a relation, operations can
31 Lubber 8 55.5 be composed! (Algebra is “closed”.)
31 Lubber 8 55.5 44 Guppy 5 35.0
58 Rusty 10 35.0 58 Rusty 10 35.0
University of Botswana. CSI262 Lecture University of Botswana. CSI262 Lecture
5 6
Notes 2014-2015 Notes 2014-2015

1
2/20/2015

Projection Selection
• Deletes attributes that are not in projection list. • Selects rows that satisfy selection condition.
• Schema of result contains exactly the fields in the • No duplicates in result! (Why?)
projection list, with the same names that they had in • Schema of result identical to schema of (only) input
the (only) input relation. relation.
• Projection operator has to eliminate duplicates! • Result relation can be the input for another relational
(Why??) algebra operation! (Operator composition.)
– Note: real systems typically don’t do duplicate elimination
unless the user explicitly asks for it. (Why not?) ς Rating > 8 (S2)
π SName, Rating (S2) π Age(S2) π SName, Rating(ς Rating > 8 (S2))
SID SName Rating Age
SName Rating Age
28 Yuppy 9 35.0 SName Rating
Dustbin 7 35.0
58 Rusty 10 35.0 Yuppy 9
Lubber 8 55.5
Rusty 10
Guppy 5
Rusty 10
University of Botswana. CSI262 Lecture University of Botswana. CSI262 Lecture
7 8
Notes 2014-2015 Notes 2014-2015

Union, Intersection, Set-Difference Cross-Product


• S1 X R1 Each row of S1 is paired with each row of R1.
• All of these operations take two input • Result schema has one field per field of S1 and R1, with
relations, which must be union-compatible: field names `inherited’ if possible.
– Conflict: Both S1 and R1 have a field called SID.
– Same number of fields.
– `Corresponding’ fields have the same type. SID SName Rating Age SID BID Day
22 Dustbin 7 45.0 22 101 10/10/96
• What is the schema of result? 22 Dustbin 7 45.0 58 103 11/12/96
S1 U S2 S1 – S2 31 Lubber 8 55.5 22 101 10/10/96
SID SName Rating Age 31 Lubber 8 55.5 58 103 11/12/96
SID SName Rating Age 22 Dustbin 7 45.0 58 Rusty 10 35.0 22 101 10/10/96
22 Dustbin 7 45.0 58 Rusty 10 35.0 58 103 11/12/96
S1 ∩ S2
31 Lubber 8 55.5 – Renaming operator: ρ(C(1SId1,5SId2), S1X R1)
SID SName Rating Age
58 Rusty 10 35.0
31 Lubber 8 55.5
44 Guppy 5 35.0
58 Rusty 10 35.0
28 Yuppy 9 35.0 9
University of Botswana. CSI262 Lecture
University of Botswana. CSI262 Lecture Notes 2014-2015 Notes 2014-2015 10

Joins Joins
• Condition Join R cS = ς c (R X S) • Equi-Join: A special case of condition join where
(SID) SName Rating Age (SID) BID Day the condition c contains only equalities.
22 Dustbin 7 45.0 58 103 11/12/96
SID SName Rating Age BID Day
31 Lubber 8 55.5 58 103 11/12/96
22 Dustbin 7 45.0 101 10/10/96
• S1 S1.SID < R1.SID R1 58 Rusty 10 35.0 103 11/12/96

• Result Schema same as that of cross product.


S1 SID R1
• Fewer tuples than cross-product, might be able to • Result schema similar to cross-product, but only
compute more efficiently one copy of fields for which equality is specified.
• Sometimes called the theta-join • Natural Join: Equijoin on all common fields
University of Botswana. CSI262 Lecture University of Botswana. CSI262 Lecture
11 12
Notes 2014-2015 Notes 2014-2015

2
2/20/2015

Division Examples of Division A/B


sno pno pno
• Not supported as a primitive operator, but useful for s1 p1
pno pno
p1
p2 p2
expressing queries like: s1 p2 P4 p2
Find sailors who have reserved all boats. s1 p3 B1 P4
B2
• Let A two fields, x and y: B have only one field y: s1 p4 B3
s2 p1
– A/B = {<x> | Ǝ <x, y> ϵ A <y> ϵ B
s2 p2
– i.e. A/B contains all x tuples (sailors) such that for every y s3 p2
tuple (boat) in B, there is an xy tuple in A. s4 p2 sno
sno sno
– Or: if there is the set of y values (boats) associated with an s4 p4 s1
s1 s1
x value (sailor) in A contains all y values in B, the x value is s2
s4
in A/B. A S3
A/B A/B3
• In general, x and y can be any lists of fields; y is the list s4
of fields in B, and x U y is the list of field of A. A/B1

University of Botswana. CSI262 Lecture University of Botswana. CSI262 Lecture


13 14
Notes 2014-2015 Notes 2014-2015

Find names of sailors who’ve reserved boat #103


Expressing A/B Using Basic Operators
• Division is not essential op; just a useful • Solution 1: π sname ((ς bid= 103 Reserves) Sailors)
shorthand.
– (Also true of joins, but joins are so common that
systems implement joins specially.)
• Solution 2: ρ (Temp1 , ς bid= 103 Reserves)
– Idea: For A/B, compute all x values that are not
`disqualified’ by some y value in B. ρ(Temp2, Temp1 Sailors)
– x value is disqualified if by attaching y value from B,
we obtain an xy tuple that is not in A. π sname(Temp2)
Disqualified x values: πx ((πx(A)XB)-A)
• πx ((πx(A)XB)-A) - all disqualified tuples
• Solution 3: π sname (ς bid= 103 (Reserves Sailors))
• A/B = πx(A) - πx ((πx(A)XB)-A)
University of Botswana. CSI262 Lecture University of Botswana. CSI262 Lecture
15 16
Notes 2014-2015 Notes 2014-2015

Find sailors who’ve reserved a red or a green boat


Find names of sailors who’ve reserved a red boat
• Can identify all red or green boats, then find
• Information about boat color only available in Boats;
sailors who’ve reserved one of these boats:
so need an extra join:
ρ(Tempboats,(ς color ‘red’ v color ‘green’ Boats))
π SName (ς color ‘red’ Boats) Reserves Sailors
π SName(Tempboats Reserves Sailors)
• A more efficient solution: • Can also define Tempboats using union!
π SName (π SID((π BID ς color ‘red’ Boats) Reserves Sailors
(How?)
• A query optimizer can find this, given the first • What happens if v is replaced by Λ in this
solution! query?

University of Botswana. CSI262 Lecture University of Botswana. CSI262 Lecture


17 18
Notes 2014-2015 Notes 2014-2015

3
2/20/2015

Find sailors who’ve reserved a red and a green boat Find the names of sailors who’ve reserved all boats

• Previous approach won’t work! Must identify • Uses division; schemas of the input relations
to / must be carefully chosen:
sailors who’ve reserved red boats, sailors
who’ve reserved green boats, then find the ρ(Tempsids, (π SID, BID Reserves) / (π BID Boats))
intersection (note that SID is a key for Sailors):
ρ(Tempred, π SID((ς color ‘red’ Boats) Reserves)) π Sname (Tempsids Sailors)
ρ(Tempgreen, π SID((ς color ‘green’ Boats) Reserves)) To find sailors who’ve reserved all ‘Interlake’
π SName((Tempred ∩ Tempgreen) Sailors) boats:
... / (π BID ς Bname = ‘Interlake’Boats))

University of Botswana. CSI262 Lecture University of Botswana. CSI262 Lecture


19 20
Notes 2014-2015 Notes 2014-2015

Summary
• The relational model has rigorously defined
query languages that are simple and powerful.
• Relational algebra is more operational; useful
as internal representation for query evaluation
plans.
• Several ways of expressing a given query; a
query optimizer should choose the most
efficient version.

University of Botswana. CSI262 Lecture


21
Notes 2014-2015

You might also like