03 Ch3 DesignTheoryforRelationalDatabases
TO
DATABASE
Learn by Examples
Design Theory for Relational Databases
Objectives
Functional Dependencies
Normalization
Decomposition
Multi-valued Dependencies
Design Theory for Relations
attributes in a relation
A set of attributes X = {A1, A2, …, An} in R functionally determines another set of attributes Y = {B1, B2, …, Bm}, also in R (written X → Y), if and only if each X value is associated with precisely one Y value.
A functional dependency A1A2…An → B1B2…Bm holds on relation R if, whenever two tuples of R agree on all of the attributes A1, A2, …, An, they must also agree on all of the attributes B1, B2, …, Bm.
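The definition can be checked mechanically on a relation instance. The following is a minimal sketch, assuming rows are stored as Python dictionaries; the function name `fd_holds` and the sample rows are illustrative and not part of the course material. Note that an instance can only refute an FD: passing the check on one instance does not prove the FD holds on the schema in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}  # maps each lhs value to the first rhs value seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[b] for b in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True

# Illustrative rows in the style of the Movies1 example.
movies1 = [
    {"title": "Star Wars", "year": 1977, "length": 124, "starName": "Carrie Fisher"},
    {"title": "Star Wars", "year": 1977, "length": 124, "starName": "Mark Hamill"},
    {"title": "Gone With The Wind", "year": 1939, "length": 231, "starName": "Vivien Leigh"},
]

print(fd_holds(movies1, ["title", "year"], ["length"]))    # True
print(fd_holds(movies1, ["title", "year"], ["starName"]))  # False
```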
3.1 Functional Dependency
I2DB: Design Theory for Relational Databases
Movies1
Example
superkey
Every superkey satisfies the first condition of a
key: it functionally determines all other attributes
of the relation
Every superkey L contains some key K: K ⊆ L
A key is also a super key
Superkeys
Example
Some superkeys
▪ (title,year,starName)
▪ (title,year,starName,length,studioName)
3.2 Rules about Functional Dependencies
Splitting Rule
Combining Rule
We can replace a set of FD’s A1A2…An → Bi, for i=1,2,
Example
A1A2…An → Ai is trivial
The Closure of Attributes
Example
R(A,B,C,D,E,F)
2. Let X be a set of attributes that will become the closure. Initialize X to be {A1, A2, …, An}
3. Repeatedly search for some FD B1B2…Bm → C such that B1, B2, …, Bm are all in X but C is not
a) If such a C is found, add it to X and repeat the search
b) If no such C is found, no more attributes can be added to X, and X is the closure {A1, A2, …, An}+
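The search loop above can be sketched in a few lines. This is an illustrative sketch, not the course's reference implementation: FDs are assumed to be (left side, right side) pairs of single-letter attribute strings, and the sample FDs are our reading of the example set S = {AB → C, BC → AD, D → E, CF → B}.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) pairs."""
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # The left side is already in X, but some right-side attribute is not.
            if set(lhs) <= x and not set(rhs) <= x:
                x |= set(rhs)
                changed = True
    return x

fds = [("AB", "C"), ("BC", "AD"), ("D", "E"), ("CF", "B")]
print(sorted(closure("AB", fds)))  # ['A', 'B', 'C', 'D', 'E']
```

Handling a whole right side at once is equivalent to first splitting each FD into single-attribute right sides.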
Example
R(A,B,C,D,E,F)
S = {AB → C, BC → AD, D → E, CF → B}
Second: since BC → AD gives BC → A and BC → D, and A is already in Result but D is not, add D to Result: Result = {A,B,C,D}
Third: since D → E and E is not in Result, add E to Result: Result = {A,B,C,D,E}
No more changes to Result are possible, so the algorithm finishes here, and {A,B}+ = {A,B,C,D,E}
The Closure of Attributes
R(A,B,C,D,E,F)
S = {AB → C, BC → AD, D → E, CF → B}
Compute {AB}+ ? {BC}+ ?
What are some of the keys of R?
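One way to check answers to questions like these is a brute-force search: a set of attributes is a key if its closure is all of R and no proper subset has that property. The sketch below makes that concrete; `candidate_keys` is a hypothetical helper name, attributes are assumed to be single letters, and the search is exponential, so it is only suitable for small schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) pairs."""
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= x and not set(rhs) <= x:
                x |= set(rhs)
                changed = True
    return x

def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):  # smallest sets first
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(schema):
                keys.append(candidate)
    return keys

fds = [("AB", "C"), ("BC", "AD"), ("D", "E"), ("CF", "B")]
print([sorted(k) for k in candidate_keys("ABCDEF", fds)])
# [['C', 'F'], ['A', 'B', 'F']]
```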
The Closure of Attributes
R(A, B, C, D)
S = {BC → D, D → A, A → B}
Compute {CD}+ ? {BC}+ ?
What are some of the keys of R?
The Closure of Attributes
R(A, B, C, D)
If A1A2…An → B1B2…Bm and
studioName → studioAddr
The transitive rule allows us to construct a third FD:
Exercise 3.2.1 (page 83)
Exercises
Closing Sets of Functional Dependencies
If for any FD in B we remove one or more attributes from the left side,
Example
R(A,B,C)
S = {A → B, A → C, B → A, B → C, C → A, C → B,
some attributes?
That is, suppose we have a relation R with a set of FD’s S, and R1 = π_L(R). What FD’s hold in R1?
Projecting Functional Dependencies
we
Follow from S, and
Method:
▪ For each set of attributes X of R1, compute X+. Add to T all non-
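The method can be sketched as follows, under our reading that T collects every nontrivial FD X → A with A in both X+ and R1. The names `project_fds` and the sample schema R(A,B,C,D) with FDs A → B, B → C, C → D are assumptions chosen for illustration. The result is not minimized; redundant FDs such as AC → D are kept.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) pairs."""
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= x and not set(rhs) <= x:
                x |= set(rhs)
                changed = True
    return x

def project_fds(sub_attrs, fds):
    """Nontrivial FDs X -> A that follow from fds and use only sub_attrs."""
    attrs = sorted(sub_attrs)
    projected = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            x = set(combo)
            # Keep only attributes of R1 that are in X+ but not in X itself.
            for a in sorted((closure(x, fds) & set(attrs)) - x):
                projected.append(("".join(combo), a))
    return projected

fds = [("A", "B"), ("B", "C"), ("C", "D")]
print(project_fds("ACD", fds))
# [('A', 'C'), ('A', 'D'), ('C', 'D'), ('AC', 'D'), ('AD', 'C')]
```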
Two notations
(1) Closing the empty set and the set of all attributes
Example
3) We can delete ‘Fox’ from the set of Studios, then we have no more
studios for Star Wars
How can we eliminate these anomalies?
Decomposing Relations
decompose relations:
Given a relation R(A1,A2,…,An), we may decompose R
▪ S = π_{B1,B2,…,Bm}(R)
▪ T = π_{C1,C2,…,Ck}(R)
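Decomposition is just two projections with duplicate elimination. The sketch below assumes rows are Python dictionaries; the helper name `project` and the three sample rows (taken from the Movies example in these slides) are illustrative.

```python
def project(rows, attrs):
    """Project dict-rows onto attrs, dropping duplicate tuples."""
    result = []
    for row in rows:
        t = tuple(row[a] for a in attrs)
        if t not in result:
            result.append(t)
    return result

movies1 = [
    {"title": "Star Wars", "year": 1977, "length": 124, "genre": "SciFi",
     "studioName": "Fox", "starName": "Carrie Fisher"},
    {"title": "Star Wars", "year": 1977, "length": 124, "genre": "SciFi",
     "studioName": "Fox", "starName": "Mark Hamill"},
    {"title": "Gone With The Wind", "year": 1939, "length": 231, "genre": "drama",
     "studioName": "MGM", "starName": "Vivien Leigh"},
]

s = project(movies1, ["title", "year", "length", "genre", "studioName"])
t = project(movies1, ["title", "year", "starName"])
print(len(s), len(t))  # 2 3
```

The movie facts for Star Wars now appear once in the first projection instead of once per star.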
Decomposing Relations
title               year  length  genre   studioName  starName
Star Wars           1977  124     SciFi   Fox         Carrie Fisher
Star Wars           1977  124     SciFi   Fox         Mark Hamill
Star Wars           1977  124     SciFi   Fox         Harrison Ford
Gone With The Wind  1939  231     drama   MGM         Vivien Leigh
Wayne's World       1992  95      comedy  Paramount   Dana Carvey
Wayne's World       1992  95      comedy  Paramount   Mike Meyers
into
A relation Movies2, consisting of all the attributes except for starName
Example (cont.):
title               year  length  genre   studioName
Star Wars           1977  124     SciFi   Fox
Gone With The Wind  1939  231     drama   MGM
Wayne's World       1992  95      comedy  Paramount

title               year  starName
Star Wars           1977  Carrie Fisher
Star Wars           1977  Mark Hamill
Star Wars           1977  Harrison Ford
Gone With The Wind  1939  Vivien Leigh
Wayne's World       1992  Dana Carvey
Wayne's World       1992  Mike Meyers
Decomposing Relations
Discuss on Example
title               year  length  genre   studioName
Star Wars           1977  124     SciFi   Fox
Gone With The Wind  1939  231     drama   MGM
Wayne's World       1992  95      comedy  Paramount

title               year  starName
Star Wars           1977  Carrie Fisher
Star Wars           1977  Mark Hamill
Star Wars           1977  Harrison Ford
Gone With The Wind  1939  Vivien Leigh
Wayne's World       1992  Dana Carvey
Wayne's World       1992  Mike Meyers

▪ The redundancy is eliminated
▪ The risk of update anomaly is gone
▪ The risk of deletion anomaly is gone
R
A B C
1 2 3
4 2 5

R1
A B
1 2
4 2

R2
B C
2 3
2 5

R3
A B C
1 2 3
1 2 5
4 2 3
4 2 5
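The tables above appear to show a decomposition whose join does not give back the original relation. A short sketch reproduces this, assuming R(A,B,C) holds the tuples (1,2,3) and (4,2,5), R1 and R2 are its projections onto {A,B} and {B,C}, and R3 is their natural join; the variable names are ours.

```python
r = [(1, 2, 3), (4, 2, 5)]  # R(A, B, C)

r1 = sorted({(a, b) for a, b, c in r})  # projection onto A, B
r2 = sorted({(b, c) for a, b, c in r})  # projection onto B, C

# Natural join of R1 and R2 on the shared attribute B.
r3 = sorted((a, b, c) for a, b in r1 for b2, c in r2 if b == b2)

print(r3)  # [(1, 2, 3), (1, 2, 5), (4, 2, 3), (4, 2, 5)]
```

The join contains the two original tuples plus two spurious ones, so the original R cannot be recovered from R1 and R2.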
in BCNF
Method: R=R0, S=S0
▪ Check whether R is in BCNF. If so, nothing to do, return {R}
▪ Otherwise, let X → Y be a BCNF violation, and compute X+
▪ Choose R1 = X+, and let R2 have attributes X and those attributes of R that are not in X+
▪ Compute the sets of FD’s for R1 and R2; let these be S1 and S2
▪ Recursively decompose R1 and R2 using this algorithm. Return the union of the results of these decompositions
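The recursive method can be sketched as below. This is an illustration under stated assumptions, not a definitive implementation: the helper names are ours, FDs are (left side, right side) pairs of attribute sets, the FD projection is brute force, and the first violation found is the one used, so other valid decompositions are possible. The sample input is the Movies example from these slides.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) pairs."""
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= x and not set(rhs) <= x:
                x |= set(rhs)
                changed = True
    return x

def project_fds(attrs, fds):
    """Nontrivial FDs implied by fds that mention only attributes in attrs."""
    attrs = set(attrs)
    out = []
    for size in range(1, len(attrs)):
        for combo in combinations(sorted(attrs), size):
            x = set(combo)
            rhs = (closure(x, fds) & attrs) - x
            if rhs:
                out.append((frozenset(x), frozenset(rhs)))
    return out

def bcnf_decompose(attrs, fds):
    """Return a list of attribute sets, each in BCNF with respect to fds."""
    attrs = set(attrs)
    for lhs, rhs in fds:
        x_plus = closure(lhs, fds) & attrs
        # A nontrivial FD whose left side is not a superkey is a violation.
        if not set(rhs) <= set(lhs) and x_plus != attrs:
            r1 = x_plus
            r2 = set(lhs) | (attrs - x_plus)
            return (bcnf_decompose(r1, project_fds(r1, fds))
                    + bcnf_decompose(r2, project_fds(r2, fds)))
    return [attrs]

movies = {"title", "year", "studioName", "president", "presAddr"}
fds = [
    ({"title", "year"}, {"studioName"}),
    ({"studioName"}, {"president"}),
    ({"president"}, {"presAddr"}),
]
for schema in bcnf_decompose(movies, fds):
    print(sorted(schema))
```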
Decomposition into BCNF
Example:
Movies(title,year,studioName,president,presAddr)
Three FD’s are
▪ title, year → studioName
▪ studioName → president
▪ president → presAddr
Example (cont.)
▪ R2={title,year,studioName}
▪ S1={studioName→president, president→presAddr}
▪ S2={title year → studioName}
Example (cont.)
▪ R12={president,studioName}
▪ S11={president→presAddr}
▪ S12={studioName→president}
▪ So, R11, R12 are both in BCNF
Decomposition into BCNF
Example (cont.)
Final decomposition
▪ {title, year, studioName}
▪ {studioName, president}
▪ {president, presAddr}
3.4 Decomposition: The Good, Bad, and Ugly
Recoverability of Information
Preservation of Dependencies
Remember
Rk(R)
A B C D
a b1 c1 d
a b2 c d2
a3 b c d
A B C D
a b1 c1 d
a b1 c d2
a3 b c d
The Chase Test for Lossless Join
A B C D
a b1 c1 d
a b1 c d2
a3 b c d
A B C D
a b1 c d
a b1 c d2
a3 b c d
The Chase Test for Lossless Join
A B C D
a b1 c d
a b1 c d2
a3 b c d
A B C D
a b1 c d
a b1 c d2
a b c d
The Chase Test for Lossless Join
A B C D
a b1 c1 d
a b1 c d2
a b c d
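The tableaux above can be reproduced with a small chase routine. The sketch below rests on assumptions that are not stated in the surviving slide text: R(A,B,C,D) is decomposed into {A,D}, {A,C}, {B,C,D}, and the FDs are A → B, B → C, CD → A, which is consistent with the symbol changes shown. Attribute names are assumed to be single letters, so an unsubscripted symbol is always the shorter string.

```python
def chase_is_lossless(attrs, parts, fds):
    """Chase test: True if joining the projections onto parts is lossless."""
    # One row per decomposed schema: unsubscripted symbol where the schema
    # has the attribute, subscripted with the row number otherwise.
    rows = [
        {a: a.lower() if a in part else f"{a.lower()}{i}" for a in attrs}
        for i, part in enumerate(parts, start=1)
    ]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for r in rows:
                for s in rows:
                    if any(r[a] != s[a] for a in lhs):
                        continue
                    for b in rhs:
                        if r[b] != s[b]:
                            # Equate the two symbols, keeping the unsubscripted one.
                            keep, drop = sorted((r[b], s[b]), key=len)
                            for t in rows:
                                if t[b] == drop:
                                    t[b] = keep
                            changed = True
    # Lossless if some row has become entirely unsubscripted.
    return any(all(row[a] == a.lower() for a in attrs) for row in rows)

fds = [("A", "B"), ("B", "C"), ("CD", "A")]
print(chase_is_lossless("ABCD", ["AD", "AC", "BCD"], fds))  # True
print(chase_is_lossless("ABC", ["AB", "BC"], []))           # False
```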
FD’s
▪ theater → city
▪ title city → theater
Keys
▪ {title, city}
▪ {theater, title}

theater  city
Guild    Menlo Park
Park     Menlo Park

theater  title
Guild    Antz
Park     Antz
When we join these two decomposed relations, the result relation may not satisfy the title
3NF
Exercises
dependencies-preservation properties
A relation R is in the third normal form (3NF) if and only if whenever a
▪ Or those B1,B2,…Bm that are not among the A’s, are each a member of some key (not
necessarily the same key)
Notation
An attribute that is a member of some key is often said to be prime
For each nontrivial FD, either the left side is a super key, or the right side
3NF BCNF
property
The Synthesis Algorithm for 3NF Schemas
▪ 2) For each FD X → A in G, use {X, A} as the schema of one of the relations in the decomposition
▪ 3) If none of the relation schemas from Step 2 is a superkey for R, add another relation whose schema is a key for R
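Steps 2 and 3 can be sketched as follows, assuming the input G is already a minimal basis. The helper names are ours, and the sample schema R(A,B,C,D,E) with FDs AB → C, C → B, A → D is an assumption chosen for illustration. The key is found by greedily dropping attributes, so which key gets added depends on the order in which attributes are tried.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) pairs."""
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= x and not set(rhs) <= x:
                x |= set(rhs)
                changed = True
    return x

def find_key(schema, fds):
    """Shrink the full attribute set to one key by dropping attributes."""
    key = set(schema)
    for a in sorted(schema, reverse=True):
        if closure(key - {a}, fds) == set(schema):
            key.discard(a)
    return key

def synthesize_3nf(schema, basis):
    """Steps 2 and 3 of the synthesis algorithm, given a minimal basis."""
    schemas = []
    for lhs, rhs in basis:  # Step 2: one relation per FD
        s = set(lhs) | set(rhs)
        if s not in schemas:
            schemas.append(s)
    # Step 3: add a key if no schema is already a superkey for R.
    if not any(closure(s, basis) == set(schema) for s in schemas):
        schemas.append(find_key(schema, basis))
    return schemas

basis = [("AB", "C"), ("C", "B"), ("A", "D")]
print([sorted(s) for s in synthesize_3nf("ABCDE", basis)])
# [['A', 'B', 'C'], ['B', 'C'], ['A', 'D'], ['A', 'B', 'E']]
```

A schema that is a subset of another, such as {B, C} here, is redundant and could be dropped.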
The Synthesis Algorithm for 3NF Schemas
Example 3.27
▪ Since neither of these keys is a subset of the schemas, we must add one of them, say S4(A,B,E)
Lossless Join
Dependency Preservation
Exercise 3.5.1 (page 105)
Exercises
Normal Form Summary
groups.
Example:
Example 2:
▪ R(A,B,C,D) and S = {AB → CD, C → D}
▪ R is in 2NF
Normal Form Summary
Third Normal Form (3NF): A relation that is in 1NF and 2NF and in which NO non-key
▪ R is not in 3NF
▪ R is in 3NF
3.6 Multi-valued Dependencies (self-study)
v1 v2
u
equal
swap Y and Z to get new v1,v2 tuples
Multi valued Dependencies
MVD Rules
Trivial MVD’s
▪ The MVD A1A2…An ->-> B1B2…Bm holds in any relation if {B1,B2,…,Bm} ⊆ {A1,A2,…,An}
Transitive rule
▪ If A1A2…An ->-> B1B2…Bm and B1B2…Bm ->-> C1C2…Ck hold for some relation, then so does A1A2…An ->-> C1C2…Ck
FD promotion
▪ Every FD is an MVD, that is if A1A2…An -> B1B2…Bm, then A1A2…An ->-> B1B2…Bm
Complementation rule
▪ If X ->-> Y, and Z is all the other attributes, then X->->Z
▪ If all the attributes of R are (A1A2…An , B1B2…Bm) then A1A2…An ->-> B1B2…Bm holds in R
Multi valued Dependencies
But unlike FD’s, we cannot generally split the right side of MVD
name       street         city       title                year
C. Fisher  123 Maple St.  Hollywood  Star Wars            1977
C. Fisher  5 Locust Ln.   Malibu     Star Wars            1977
C. Fisher  123 Maple St.  Hollywood  Empire Strikes Back  1980
C. Fisher  5 Locust Ln.   Malibu     Empire Strikes Back  1980
C. Fisher  123 Maple St.  Hollywood  Return of the Jedi   1983
C. Fisher  5 Locust Ln.   Malibu     Return of the Jedi   1983
So, name ->-> street city holds, but neither name ->-> street nor name ->-> city holds
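This claim can be checked on the instance above. The sketch uses the usual definition: X ->-> Y holds if, for every pair of tuples t and u that agree on X, there is a tuple that takes its X and Y values from t and all remaining attributes from u. The function name `mvd_holds` and the tuple layout are illustrative assumptions.

```python
def mvd_holds(rows, attrs, x, y):
    """Check the MVD x ->-> y on a relation instance given as tuples."""
    rest = [a for a in attrs if a not in x and a not in y]
    index = {a: i for i, a in enumerate(attrs)}

    def pick(row, names):
        return tuple(row[index[a]] for a in names)

    present = {(pick(r, x), pick(r, y), pick(r, rest)) for r in rows}
    for t in rows:
        for u in rows:
            if pick(t, x) == pick(u, x):
                # The tuple with t's Y part and u's remaining part must exist.
                if (pick(t, x), pick(t, y), pick(u, rest)) not in present:
                    return False
    return True

attrs = ["name", "street", "city", "title", "year"]
stars = [
    ("C. Fisher", "123 Maple St.", "Hollywood", "Star Wars", 1977),
    ("C. Fisher", "5 Locust Ln.", "Malibu", "Star Wars", 1977),
    ("C. Fisher", "123 Maple St.", "Hollywood", "Empire Strikes Back", 1980),
    ("C. Fisher", "5 Locust Ln.", "Malibu", "Empire Strikes Back", 1980),
    ("C. Fisher", "123 Maple St.", "Hollywood", "Return of the Jedi", 1983),
    ("C. Fisher", "5 Locust Ln.", "Malibu", "Return of the Jedi", 1983),
]

print(mvd_holds(stars, attrs, ["name"], ["street", "city"]))  # True
print(mvd_holds(stars, attrs, ["name"], ["street"]))          # False
print(mvd_holds(stars, attrs, ["name"], ["city"]))            # False
```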
4NF statement
Algorithm 3.33
Method:
▪ Find a 4NF violation in R, say X ->-> Y, where X is not a superkey. If there is none, return; R by itself is a suitable decomposition
▪ If there is such a violation, break the schema of R into two schemas:
  ▪ R1, whose schema is X and Y
  ▪ R2, whose schema is X and all attributes of R that are not among X or Y
▪ Find the FD’s and MVD’s that hold in R1 and R2, and recursively decompose R1 and R2
Decomposition into 4NF
Example 3.34
Why do we care?
Example (cont)
A B C D E F
a b c d1 e1 f1
a b c d2 e2 f2
a b c d1
a b c2 d
Since A->->BC, then we swap the B and C columns to get two new tuples
Example (cont)
>C?
Since B ->-> C and the 2nd and 3rd rows agree on B, we swap their C columns, and we have
A B C D
a b1 c d1
a b c2 d
a b c d1
a b c d
a b c2 d1
a b1 c2 d
Exercise 3.6.2 (page 114)
Exercises