CPSC 421 Database Management Systems: Relational Algebra, More SQL, Repeat

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 17

9/8/09

CPSC 421
Database Management Systems
Lecture 3: 
Relational Algebra, More SQL,
Repeat

* Some material adapted from R. Ramakrishnan, L. Delcambre, and B. Ludaescher

Todays Agenda
Quiz
Finish up MySQL
More on SQL queries
Relational Algebra

CPSC421,2009

9/8/09

Why are mathematical definitions useful?


Can you think of examples when a mathematical
definition improved a practical problem?

CPSC421,2009

Query Languages for Relational DBs


SQL

Rela)onalAlgebra

Prac1caldeni1onof
rela1onaldatabase

Mathema1caldeni1on
ofrela1onaldatabase

Operatesontables
(withduplicatesbags)

Operatesonrela1ons
(i.e.,sets)

Variouskeywords,
statements

Setbasedopera1ons

SELECT,FROM,WHERE,

Intersec1on,Union,

crossfer1liza1on
CPSC421,2009

9/8/09

Commercial DBMSs Sets or Bags?


The default is to produce a bag (or multiset) of rows
as a query answer
If you want a set, use DISTINCT
Why do you think they do this?
Note that even though relational algebra was
originally defined as set-based, SQL queries are
represented internally using relational algebra (w/
extra operators)
There are also versions of relational algebra defined
using bag semantics
CPSC421,2009

The Plan
Present mathematical definition of a relational
database (relational algebra)
Intermix relational algebra and SQL
Well use relational algebra again when we talk about
evaluating and optimizing queries

CPSC421,2009

9/8/09

Mathematically Describing a Relational DB


A relation is a set of tuples
See the original definition of the model optional reading
(Codd 1970)

Define query operators as set-theoretic functions


Together these form the relational algebra

CPSC421,2009

Cross Products
Let A = {a, b, c} and B = {1, 2}
In set theory, the cross product is defined as
A X B = {(a, 1), (b, 1), (c, 1), (a, 2), (b, 2), (c, 2)}
A X B is a set consisting of ordered pairs (2-tuples)
where each pair consists of an element from A and
an element from B

CPSC421,2009

9/8/09

Practice Question
Suppose A = {a, b, c} and B = {1, 2}
What is B X B?
B X B = {(1, 1), (2, 1), (1, 2), (2, 2)}

CPSC421,2009

Defining Relations
Suppose we have the relation
Person(name, salary, num, status) with domains
NameValues = {all possible strings of 30 characters}
SalValues = {real numbers between 0 and 100,000}
StatusValues = {f, p}
NumValues = {integers between 0 and 9999}

Any instance of the relation is always a subset () of


NameValues X SalValues X NumValues X StatusValues
* Note that a domain is a set of simple, atomic values
CPSC421,2009

10

9/8/09

Defining Relations
Each relation instance is a subset of the cross
product of its domains (see the Codd paper)
One element of a relation is called a tuple
If n domains, then n-tuples

A relation is always a set by definition


Recall: If you add the element 2 to the set {1, 2, 3, 4}, then
you still have the set {1, 2, 3, 4}
If you add the tuple {101, J. Smith, 1000, checking} to
the account relation instance you dont change the result

CPSC421,2009

11

Set Theory Refresher


Consider two sets
S1 = {1, 3, 5, 7}
S2 = {1, 2, 3, 4}

What do these return?


S1 S2
S1 S2
S1 S2
S1 X S2

CPSC421,2009

12

9/8/09

Relational Algebra has Additional Operations


Consider two sets
S1 = {1, 3, 5, 7}
S2 = {1, 2, 3, 4}

We will talk about these a bit later


S1 condition S2
S1 S2
condition(S1)
attribute-list(S1)
renaming-specificaton(S1)

CPSC421,2009

13

Relational Algebra as a Query Language


We dont normally use relational algebra directly
Products dont allow you to write relational algebra
queries

But, it is used internally in a DBMS to represent a


query plan
It is also often used in theoretical work on databases
(although fragments of first order logic are frequently used
as well )

CPSC421,2009

14

9/8/09

Relational Algebra Queries w/out Operators


What does the following SQL query return?
SELECT *
FROM Student;
Answer: Student 
(i.e., this query is the identity function)
A relation name by itself is a valid relational algebra
query
Listing the relation name just returns the tuples in
the relation
CPSC421,2009

15

Relational Algebra: Selection operator ()


Account
Number
101
102
103
104
105

Owner
J.Smith
W.Wei
J.Smith
M.Jones
H.Mar1n

Balance
1000.00
2000.00
5000.00
1000.00
10000.00

Type
checking
checking
savings
checking
checking

The relational algebra query: 

Balance<3000(Account)
is similar to the SQL query:
SELECT *
FROM Account
WHERE Balance < 3000;
CPSC421,2009

16

9/8/09

Relational Algebra: Selection operator ()


Select () is a unary operator:

:RR
i.e., it is always applied to a single relation

Balance<3000(Account)
rela1onorrela1onal
algebraexpression

selectoperator

thepredicate(condi1on)
A,ributeCompartor(,>,=,,,<)A,ribute|Constant
CPSC421,2009

17

Examples
Balance < 3000(Account)
Number = 103(Account)
Balance = Number(Account)
Type = checking(Balance < 3000(Account))
Account
Number
101
102
103
104
105

Owner
J.Smith
W.Wei
J.Smith
M.Jones
H.Mar1n

Balance
1000.00
2000.00
5000.00
1000.00
10000.00

CPSC421,2009

Type
checking
checking
savings
checking
checking
18

9/8/09

Relational Algebra: Projection operator ()


Account
Number
101
102
103
104
105

Owner
J.Smith
W.Wei
J.Smith
M.Jones
H.Mar1n

Balance
1000.00
2000.00
5000.00
1000.00
10000.00

Type
checking
checking
savings
checking
checking

The relational algebra query: 

Number,Owner(Account)
is similar to the SQL query:
SELECT Number, Owner 
FROM Account;
CPSC421,2009

19

Relational Algebra: Projection operator ()


Project () is also a unary operator:

:RR
i.e., it is always applied to a single relation

Number,Owner(Account)
rela1onorrela1onal
algebraexpression

projectoperator
listofalributestokeep
CPSC421,2009

20

10

9/8/09

Example
Owner(Account)
Account
Number
101
102
103
104
105

Owner
J.Smith
W.Wei
M.Jones
H.Mar1n

SELECTOwner
FROMAccount;

vs.
Owner
J.Smith
W.Wei
J.Smith
M.Jones
H.Mar1n

Balance
1000.00
2000.00
5000.00
1000.00
10000.00

Type
checking
checking
savings
checking
checking

Rela1onsarealwayssets
Sothequeryanswerisa
setofnames
andJ.Smithappearsjust
onceintheanswer

Owner
J.Smith
W.Wei
J.Smith
M.Jones
H.Mar1n

CPSC421,2009

21

Combining Select and Project


Owner(Balance < 3000(Account))
Balance < 3000(Owner,Balance(Account))
Owner(Balance < 3000(Owner,Balance(Account)))
Type = checking(Balance < 3000(Owner,Balance(Account)))
Do you seen any problems
with these queries?
Are any of these equivalent?
Equivalence implies for 
any DB instance

Account
Number
101
102
103
104
105

CPSC421,2009

Owner
J.Smith
W.Wei
J.Smith
M.Jones
H.Mar1n

Balance
1000.00
2000.00
5000.00
1000.00
10000.00

Type
checking
checking
savings
checking
checking
22

11

9/8/09

Relational Algebra Cross Products


Given A = {a, b, c}, B = {1, 2}, and C = {x, y}, the cross
product (A X B) X C in set theory is
{((a, 1), x), ((b, 1), x), (c, 1), x), , ((c, 2), y)}
Codd simplified cross products in relational algebra
by eliminating parentheses flattening the tuples
{(a, 1, x), (b, 1, x), (c, 1, x), , (c, 2, y)}

CPSC421,2009

23

Relational Algebra: Cross Product operator (X)


Used in the basic definition of a relation
An instance of a relation is a subset of the cross
product of its domains
Is also an operator in the relational algebra
So, we can use it in queries

CPSC421,2009

24

12

9/8/09

Example
Suppose we have these relations in a university DB
Teacher(tnum, tname)
Course(cnum, cname)
This is just a simple example!
If we developed an actual university schema, it would
be much more detailed than this

CPSC421,2009

25

Example
Teacher
tnum
101
105
110

Course
cnum tname
346
Opera1ngSystems
491
SeniorDesign

tname
Smith
Crowley
Yerion

TeacherXCourse
The cross product
produces every
possible combination
of teacher and course

tnum
101
105
110
101
105
110

tname
Smith
Crowley
Yerion
Smith
Crowley
Yerion

CPSC421,2009

cnum
346
346
346
491
491
491

tname
Opera1ngSystems
Opera1ngSystems
Opera1ngSystems
SeniorDesign
SeniorDesign
SeniorDesign

26

13

9/8/09

Relational Algebra: Join operator ()


Account
Number

Owner Balance Type

Deposit
Account

Transac1onId Date Amount

The relational algebra query 

Account Number=Account Deposit


is equivalent to

Number = Account(Account X Deposit)




CPSC421,2009

27

Relational Algebra: Join operator ()


Join () is a binary operator:

:RRR
i.e., it is always applied to a two relations and returns one

AccountNumber=AccountDeposit
rela1onorrela1onal
algebraexpression

rela1onorrela1onal
algebraexpression

thejoinpredicate(condi1on)
A,ributeCompartor(,>,=,,,<)A,ribute
CPSC421,2009

28

14

9/8/09

Relational Algebra: Join operator ()


The join operator is defined for convenience

R1 a1 = a2 R2 a1 = a2(R1 R2)
Thus, we dont really need the join operator
Any query with a join can always be rewritten into cross
product followed by selection
And vice-versa it is also possible to define crossproduct based on join

CPSC421,2009

29

A Few Details about Join


Each simple Boolean predicate in the join condition
must compare an attribute from one relation to an
attribute in the other relation
In the query
Account

Number = Account type = checking Deposit

type=checking is not a join condition

If you have a join with NO condition, then it is just a


cross product
CPSC421,2009

30

15

9/8/09

Examples
Student advisor=number Faculty
Student S S.age < F.age Faculty F
Student S S.salary F.age Faculty F
Join is sometimes called theta-join or -join
where represents any of the 6 comparators
The most common join is called a equi-join (for
equality condition)

R1 a1=a2 R2
CPSC421,2009

31

Challenge Question
How can you in general convert a basic SQL statement
(SELECT-FROM-WHERE) to an equivalent relational
algebra expression
SELECT DISTINCT attributes
FROM T1, T2, 
WHERE conditions

CPSC421,2009

???

32

16

9/8/09

Basic SQL = SPJ (except tables vs. relations)


A basic SQL query of the form
SELECT DISTINCT attributes
FROM T1, T2, 
WHERE conditions
is the same as
attributes(conditions (T1 X T2 X ))
SELECT-FROM-WHERE queries are sometimes described
as equivalent to the Select-Project-Join (SPJ) subset of
relational algebra
Confusingly, SQL SELECT == RA Project (not RA Select!)
CPSC421,2009

33

17

You might also like