03 Ch3 DesignTheoryforRelationalDatabases

INTRODUCTION
TO
DATABASE
L E A R N B Y E X A M P L E S
D E S I G N T H E O RY F O R
R E L AT I O N A L D ATA B A S E S
Objectives
 Understand concepts of:

I2DB: Design Theory for Relational Databases
 Functional Dependencies
 Normalization
 Decomposition
 Multi-valued Dependencies
Design Theory for Relations
 Helps us examine a design and make

improvements based on a few simple principles

 States the constraints that apply to the relation
 The most common constraint is the functional
dependencies
3.1 Functional Dependency
 A functional dependency: constraint between two sets of

attributes in a relation
 A set of attributes X (include A1A2…An) in R functionally
determine another attribute Y (include B1B2…Bm), also in R,
(written X → Y) if and only if each X value is associated with
precisely one Y value
 A functional dependency A1A2…An  B1B2…Bm holds on
relation R if two tuples of R agree on all of the attributes A1, A2, …,
An then they must also agree on all of the attributes B1, B2, …, Bm
3.1 Functional Dependency
Movies1
 Easy to see that: the following FD is true

 title,year  length, genre, studioName
 Exercise: How about the FD

 title,year  starName
title, year  startName does not hold in Movies1 relation

Keys of Relations
 A set of one or more attributes {A1, A2, …, An} is

a key for a relation R, if
1) Those attributes functionally determine all other
attributes of the relation R
2) No proper subset of {A1, A2, …, An} functionally

determines all other attributes of R, i.e., a key must be
minimal
Keys of Relations
 Example
 (title,year,starName) form a key for Movies1 relation
 (year,starName) is not a key of Movies1 relation

Keys of Relations
 A relation can have more than one keys

 One of them is chosen as primary key
 Others are candidate keys (or alternate keys)

Superkeys
 A set of attributes that contains a key is called a

superkey
 Every superkey satisfies the first condition of a
key: it functionally determines all other attributes
of the relation
 If K is a key, L is a super key, then: K  L
 A key is also a super key
Superkeys
 Example
 Some superkeys
▪ (title,year,starName)
▪ (title,year,starName,length,studioName)
3.2 Rules about Functional
Dependencies
 A set of FD’s S follows from a set of FD’s T

if every relation instance that satisfies all the FD’s

in T also satisfies all the FD’s in S
 Two sets of FD’s S and T are equivalent if and
only if S follows from T, and T follows S
Splitting and Combining Rules
 Splitting Rule
 We can replace an FD A1A2…An  B1B2…Bm by a set of
FD’s A1A2…An  Bi, for i=1,2,…,m
 Combining Rule
 We can replace a set of FD’s A1A2…An  Bi, for i=1,2,
…,m by the single FD A1A2…An  B1B2…Bm

Splitting and Combining Rules
 Example
 title year  length
 title year  genre
 title year  studioName
 title year  length genre studioName

Trivial Functional Dependencies
 The FD’s A1A2…An  B1B2…Bm is trivial(phụ

thuộc hàm tầm thường hiển nhiên, vế phải là
tập con của vế trái), if
{B1, B2, …, Bm}  {A1, A2, …, An}
 Every trivial FD holds in every relation
Trivial Functional Dependencies
 The FD’s A1A2…An  B1B2…Bm is equavilent to
A1A2…An  C1C2…Ck, where C’s are all those B’s

that are not also A’s C’s
A’s
B’s
 Ex: A,B,C,D  C,D,E,F then A,B,C,D  E,F

The Closure of Attributes
 The closure of a set of attributes {A1, A2, …, An}
under FD’s in S (denoted {A1, A2, …, An}+) is the set

of attributes B such that every relation that satisfies
all the FD’s in set S also satisfies A1A2…An B
 That is, A1A2…An B follows from the FD’s of S
 A1, A2, …, An  {A1, A2, …, An}+, because A1A2…
AnAi is trivial
 Example
 R(A,B,C,D,E,F)
 S={AC, A D, D E, E F}, {A}+=? (các phụ thuộc
hàng có liên quan đến bao đóng)

 {A}+={A,C,D,E,F}
 Algorithm 3.7: Closure of a set of attributes

 Input: A set of attributes {A1,A2,…,An} and a set of FD’s S
 Output: The closure {A1,A2,…,An}+
1. If necessary, split the FD’s of S, so each FD in S have singleton right side
2. Let X be a set of attributes that will become the closure. Initialize X to be {A1,A2,
…,An}
3. Repeatedly search for some FD: B1B2…Bm  C, such that B1, B2, …, Bm are in X,
but C is not
a) If such C is found, add to X, and repeat the search
b) If such C is not found, no more attributes can be added to X
4. The set X is the correct value of {A 1, A2, …, An}+

 Example
 R(A,B,C,D,E,F)
 S={ABC, BCAD,DE,CFB}
 Compute {A,B}+ , {B,C}+
[AB] += ABCDE(tâp thuộc tính vẫn còn thiếu F),

ABF
-tương tự [B,C]+ = BCADE, BCF
 Example
 S={ABC, BCAD,DE,CFB}
 Split BC AD into BC A and BC D
 S={ABC, BCA, BC D, DE, CFB}
 Start with Result={A,B}
 First: since AB C, then add C to Result, Result={A,B,C}
 Second: since BC A, BC D, and A is already in Result, but D is not, then add D
to Result, Result={A,B,C,D}
 Third: since D E, and E is not in Result, then add E to Result, Result={A,B,C,D,E}
 No more changes to Result are possible, the algorithm is finished here, and {A,B}
+={A,B,C,D,E}
 R(A,B,C,D,E,F)
 S={ABC, BCAD,DE,CFB}
 Compute {AB}+ ? {BC}+ ?
 What are some the keys of R?
 R(A, B, C, D)
 S={BC  D, D  A, A  B}
 Compute {CD}+ ? {BC}+ ?
 R(A, B, C, D)
 S={A→B, B→C, C→D, D→A}

 Compute {A}+ ? {B}+ ?
The Transitive Rule
If A1A2…AnB1B2…Bm and
B1B2…BmC1C2…Ck hold in relation R, then
A1A2…An  C1C2…Ck also holds in R

 If some of the C’s are among the A’s, we may
eliminate them from the right side
The Transitive Rule
 Example: Let’s see another version of Movies relation

title year length genre studioName studioAddr

Star Wars 1977 124 scifi Fox Holywood
Eight Below 2005 120 drama Disney Buena Vista
Wayne’s World 1992 95 comedy Paramount Holywood
 Two of the FD’s that we might claim to hold are:
title year  studioName
studioName  studioAddr
 The transitive rule allows us to construct the third FD’s:
title year  studioAddr


Exercise 3.2.1 (page 83)
Exercises
Closing Sets of Functional
Dependencies
 Suppose a set of FD’s S, any set of FD’s T equivalent to S is

said to be a basis for S.

Then we say T is a basis for S
 Just work with only FD’s that have singleton right sides
 A minimal basis for FD’s S is a basis B that satisfies three
conditions:
 All the FD’s in B have singleton right sides
 If any FD is removed from B, the result is no longer a basis
 If for any FD in B we remove one or more attributes from the left side,
the result is no longer a basis

Closing Sets of Functional
Dependencies
 Example
 R(A,B,C)
 S={A B, A  C, B  A, B  C, C  A, C  B,
AB  C, BC  A, AC  B, A  BC, B  AC, C  AB}

R and its FD’s have several minimal basis
▪ {A  B, B  A, B  C, C  B}, or
▪ {A  B, B  C, C  A}
What happens to …
 … a set of FD’s S of R when we project R on

some attributes?
 That is, suppose a relation R with set of FD’s S,
and R1=L(R). What FD’s hold in R1?
Projecting Functional
Dependencies
 To find a functional dependencies of projection,

we
 Follow from S, and
 Involve only attributes of R1

Dependencies
 Algorithm 3.12: Projecting a Set of FD’s

 Input: R, R1=L(R), S a set of FD’s that hold in R
 Output: the set of FD’s that hold in R1
 Method:
▪ T is the set of FD’s that hold in R1. Initially, T is empty
▪ For each set of attributes X of R1, compute X+. Add to T all non-
trivial FD’s X → A such that A is both in X+ and an attribute of R1
▪ Construct a minimal basis from T

Dependencies
 Algorithm 3.12: Projecting a Set of FD’s (cont)

 Compute a minimal basis from T

▪ If there is an FD F in T that follows from other FD’s in T, then
remove F from T
▪ Let Y  B is a FD in T , with at least two attributes in Y, and let Z
is Y with one of its attributes removed:
▪ If Z  B follows from the other FD’s in T (including Y  B), then
replace Y  B by Z  B
▪ Repeat the above steps in all possible ways until no more

changes to T can be made
Dependencies
 Two notations
 (1) Closing the empty set and the set of all attributes
cannot yield a nontrivial FD

 (2) If we have already know that the closure of some set
X is all attributes, then we cannot discover any new

FD’s by closing supersets of X
Dependencies
 Example: Suppose R(A,B,C,D) has FD’s A→B, B→C, and C→D.

R1=A,C,D(R). Find the FD’s of R1?

 Compute the closure of the singleton set
▪ {A}+={A,B,C,D}, and B is not in R1, then new FD’s A→C, A→D
▪ {C}+={C,D}, then new FD’s C→D
▪ {D}+={D}, no new FD’s
 Compute the closure of the doubleton set

▪ Since {A}+ include all attributes, no care any more for supersets of {A}
▪ {C,D}+={C,D}, no new FD’s holds in R1
 Finally, there are three FD’s A→C, A→D, C→D hold in R1
 A→D is transitive from A→C, and C→D
 So, minimal basis is {A→C, C→D}

Exercises
 Exercise 3.2.1, 3.2.2 (page 83)

 Exercise 3.2.3 (page 83/84)

3.3 Design of Relational Database
Schema
 Anomalies: problems such as redundancy that occur when

we try to cram too much into a single relation are called

anomalies
 Some kinds of anomalies:
1) Redundancy: information may be repeated unnecessarily in several
tuples
2) Update Anomalies: information may be changed in one tuple, but be
unchanged in another
3) Deletion Anomalies: if a set of values becomes empty, we may lose
other information as a side effect
Anomalies
 Example
title year length genre studioName starName

Star Wars 1977 124 SciFi Fox Carrie Fisher
Star Wars 1977 124 SciFi Fox Mark Hamill
Star Wars 1977 124 SciFi Fox Harrison Ford
Gone With The Wind 1939 231 drama MGM Vivien Leigh
Wayne's World 1992 95 comedy Paramount Dana Carvey
Wayne's World 1992 95 comedy Paramount Mike Meyers
1) The length and genre repeated unnecessarily

2) We’d change the length of Star Wars to 125 in the first tuple, but not in
the other tuples
3) We can delete ‘Fox’ from the set of Studios, then we have no more
studios for Star Wars

… eliminate these anomalies?
How we can …
Decomposing Relations
 One way to eliminate these anomalies is to

decompose relations:
 Given a relation R(A1,A2,…,An), we may decompose R
into two relations S(B1,B2,…,Bm) and T(C1,C2,…,Ck)

such that:
▪ {A1,A2,…,An} = {B1,B2,…,Bm}  {C1,C2,…,Ck}
▪ S = B1,B2,…,Bm(R)
▪ T = C1,C2,…,Bk(R)
 Example: Decompose the Movies1 relation

title
title year length
year length genre
genre studioName
studioName starName
starName
StarWars
Star Wars 1977
1977 124SciFi
124 SciFi Fox
Fox CarrieFisher
Carrie Fisher
StarWars
Star Wars 1977
1977 124SciFi
124 SciFi Fox
Fox MarkHamill
Mark Hamill
StarWars
Star Wars 1977
1977 124SciFi
124 SciFi Fox
Fox HarrisonFord
Harrison Ford
GoneWith
Gone WithTheTheWind
Wind 1939
1939 231drama
231 drama MGM
MGM VivienLeigh
Vivien Leigh
Wayne'sWorld
Wayne's World 1992
1992 95comedy
95 comedy Paramount
Paramount DanaCarvey
Dana Carvey
Wayne'sWorld
Wayne's World 1992
1992 95comedy
95 comedy Paramount
Paramount MikeMeyers
Mike Meyers
into
 A relation Movies2, by all the attributes except for startName
 A relation Movies3, by title, year, and starName

 Example (cont.):
title
title year length
year length genre
genre studioName
studioName starName
starName
StarWars
Star Wars 1977
1977 124SciFi
124 SciFi Fox
Fox CarrieFisher
Carrie Fisher
StarWars
Star Wars 1977
1977 124SciFi
124 SciFi Fox
Fox MarkHamill
Mark Hamill
StarWars
Star Wars 1977
1977 124SciFi
124 SciFi Fox
Fox HarrisonFord
Harrison Ford
GoneWith
Gone WithTheTheWind
Wind 1939
1939 231drama
231 drama MGM
MGM VivienLeigh
Vivien Leigh
Wayne'sWorld
Wayne's World 1992
1992 95comedy
95 comedy Paramount
Dana Carvey
Wayne'sWorld
Wayne's World 1992
1992 95comedy
95 comedy Paramount
Mike Meyers
title
title year length
year length genre
genre studioName
studioName title
title year
year starName
starName
StarWars
Star Wars 1977
1977 124SciFi
124 SciFi Fox
Fox StarWars
Wars 1977 CarrieFisher
Fisher
Star 1977 Carrie
GoneWith
Gone WithTheTheWind
Wind 1939
1939 231drama
231 drama MGM
MGM StarWars
Wars 1977 MarkHamill
Hamill
Star 1977 Mark
Wayne'sWorld
Wayne's World 1992
1992 95comedy
95 comedy Paramount
Paramount StarWars
Wars 1977 HarrisonFord
Ford
Star 1977 Harrison
GoneWith
Gone WithThe
The1939
1939
Wind
Wind VivienLeigh
Vivien Leigh
Wayne'sWorld1992
Wayne's World1992 DanaCarvey
Dana Carvey
Wayne'sWorld1992
Wayne's World1992 MikeMeyers
Mike Meyers
 Discuss on Example
title
title year length
year length genre
genre studioName
studioName
StarWars
Star Wars 1977
1977 124SciFi
124 SciFi Fox
Fox
GoneWith
Gone WithTheTheWind
Wind 1939
1939 231drama
231 drama MGM
MGM
Wayne'sWorld
Wayne's World 1992
1992 95comedy
95 comedy Paramount
Paramount
title
title year
year starName
starName
StarWars
Star Wars 1977
1977 CarrieFisher
Carrie Fisher
 The redundancy is eliminated StarWars
Star Wars 1977
1977 MarkHamill
Mark Hamill
StarWars
Star Wars 1977
1977 HarrisonFord
Harrison Ford
 The risk of update anomaly is gone GoneWith
Gone WithThe
The1939
1939
Wind
Wind VivienLeigh
Vivien Leigh
Wayne'sWorld1992
Wayne's World1992 DanaCarvey
Dana Carvey
 The risk of deletion anomaly is gone Wayne'sWorld1992
Wayne's World1992 MikeMeyers
Mike Meyers
 What is about the repeat of (title, year) in Movies3

relation?
How we can guarantee …
 … the anomalies not to exists?

Boyce-Codd Normal Form
 Boyce-Codd Normal Form (BCNF) is a simple

condition that guarantee anomalies are not exists

 A relation R is in BCNF if and only if whenever there is a
nontrivial FD A1A2…AnB1B2…Bm for R, it is the case

that {A1,A2,…,An} is a super key for R
 Notation
 The left side of every nontrivial FD must be a super key
 The left side of every nontrivial FD must contain a key

Boyce-Code Normal Form
 Example: BCNF or not?

title
title year length
year length genre
genre studioName
studioName starName
starName
StarWars
Star Wars 1977
1977 124SciFi
124 SciFi Fox
Fox CarrieFisher
Carrie Fisher
StarWars
Star Wars 1977
1977 124SciFi
124 SciFi Fox
Fox MarkHamill
Mark Hamill
StarWars
Star Wars 1977
1977 124SciFi
124 SciFi Fox
Fox HarrisonFord
Harrison Ford
GoneWith
Gone WithTheTheWind
Wind 1939
1939 231drama
231 drama MGM
MGM VivienLeigh
Vivien Leigh
Wayne'sWorld
Wayne's World 1992
1992 95comedy
95 comedy Paramount
Dana Carvey
Wayne'sWorld
Wayne's World 1992
1992 95comedy
95 comedy Paramount
Mike Meyers
 This is not in BCNF, because:

 title year  length genre studioName is a FD, and
 {title, year} is not a super key

 Example: BCNF or not?

title
title year length
year length genre
genre studioName
studioName
StarWars
Star Wars 1977
1977 124SciFi
124 SciFi Fox
Fox
GoneWith
Gone WithTheTheWind
Wind 1939
1939 231drama
231 drama MGM
MGM
Wayne'sWorld
Wayne's World 1992
1992 95comedy
95 comedy Paramount
Paramount
 This is in BCNF, because:

 title year  length genre studioName is a FD, and
 {title, year} is a key

 Example: What is about the case that a relation R

has only two attributes A and B?

  R(A,B) is in BCNF
Example: Loss of information
after decomposition
R1
A B
R 1 2 R3
A B C 4 2 A B C
1 2 3
1 2 3 R2
1 2 5
4 2 5 B C
4 2 3
2 3
2 5 4 2 5
 Suppose we have R(A,B,C) but neither of the FD’s B->A

nor B->C holds.
 R is decomposed into R1 and R2 as above
 When we try to re-construct R by Natural Join of R1 and R2, we have: R3
= R1 X R2 (but R3 <> R => We lost information)
Decomposition into BCNF
 We can break any relation schema into a

collection of subsets of its attributes and

 These subsets are the schemas of relations in BCNF
 From the data in decomposed relations, we must be
able to reconstruct the original relation instance exactly

 Algorithm 3.20: BCNF Decomposition Algorithm

 Input: a relation R0 with a set of functional dependencies S0
 Output: a decomposition of R0 into a collection of relations, all of which are
in BCNF
 Method: R=R0, S=S0
▪ Check whether R is in BCNF. If so, nothing to do, return {R}
▪ If there are BCNF violation, let one be X→Y

▪ Compute X+
▪ Choose R1=X+, and let R2 have attributes X and those attributes of R that are not in X+
▪ Compute the sets of FD’s for R1 and R2, let these S1, S2
▪ Recursively decompose R1, R2 using this algorithm. Return the union of the result of these
compositions
 Example:
Movies(title,year,studioName,president,presAddr)
 Three FD’s are
▪ title, year → studioName
▪ studioName → president
▪ president → presAddr
 {title, year} is only key for this relation
 Apply Algorithm 3.20 above:

 Example (cont.)
 Decomposition starts with studioName→president

▪ Compute {studioName}+={studioName,president,presAddr}
▪ R1={studioName,president,presAddr}
▪ R2={title,year,studioName}
▪ S1={studioName→president, president→presAddr}
▪ S2={title year → studioName}
▪ So, in R2, {title,year} is a key, and R2 in BCNF

▪ In R1, studioName is a key, but president→presAddr is not in BCNF
 Example (cont.)
 Decomposition continues with president→presAddr

▪ Compute {president}+={president,presAddr}
▪ R11={president,presAddr}
▪ R12={president,studioName}
▪ S11={president→presAddr}
▪ S12={studioName→president}
▪ So, R11, R12 are both in BCNF
 Example (cont.)
 Final decomposition
▪ {tilte, year, studioName}
▪ {studioName, president}
▪ {president, presAddr}
3.4 Decomposition: The Good,
Bad and Ugly
 Three problems of decomposition

 Elimination of Anomalies be decomposition
 Recoverability of Information
 Preservation of Dependencies
 There is no way to get all three at once

Recovering Information from a
Decomposition
 If we decompose relation R into two-attribute

relations, then we might not get the instance of R

back when we join these decomposed relations
 If we decompose a relation by Algorithm 3.20,
then the original relation can be recovered exactly
by the natural join
The Chase Test for Lossless Join
 Decomposing relation R into relations with sets of

attributes R1, R2, …, Rk

 Given set of FD’s F that hold in R
 Question: R1(R) ⋈ R2(R) ⋈…⋈ Rk(R)=R?
 Remember
 The natural join is associative and commutative
 Any tuple t in R is surely t in R1(R) ⋈ R2(R) ⋈…⋈
Rk(R)
 R1(R) ⋈ R2(R) ⋈…⋈ Rk(R)=R when the FD’s in F
hold for R if and only if every tuple in the join is also in R

 The chase test is to see whether a tuple t in R1(R) ⋈ R2(R)

⋈…⋈ Rk(R) can be proved, using the FD’s in F, also to be

a tuple in R
 If t is in the join, there are t1,t2,…tk in R such that t=R1(t1) ⋈
R2(t2) ⋈…⋈ Rk(tk)
 ti agree with t on the attributes of Ri
 ti has unknown values in its components not in Ri
 We draw a tableau and use a given set of FD’s F to prove

that t is really in R
 Example: R(A,B,C,D), S1={A,D}, S2={A,C},

S3={B,C,D}. FD’s A→B, B→C,CD→A

 The tableau for this decomposition
A B C D
a b1 c1 d
a b2 c d2
a3 b c d
 Goal: (a,b,c,d) appears in this tableau after using FD’s

 Example: (A→B, B→C, CD→A)

A B C D
a b1 c1 d
a b2 c d2
a3 b c d
 Since A→B, then b1=b2, replace b2 by b1, then
A B C D
a b1 c1 d
a b1 c d2
a3 b c d

A B C D
a b1 c1 d
a b1 c d2
a3 b c d
 Since B→C, then c1=c, replace c1 by c, then
A B C D
a b1 c d
a b1 c d2
a3 b c d

A B C D
a b1 c d
a b1 c d2
a3 b c d
 Since CD→A, then a3=a, replace a3 by a, then
A B C D
a b1 c d
a b1 c d2
a b c d
The Chase Test for Lossless
Join

A B C D
a b1 c1 d
a b1 c d2
a b c d
 t=(a,b,c,d) appears in relation R
 We have proved that, if R satisfies the FD’s A→B, B→C,

CD→A, then whenever we project onto {A,D}, {A,C},
{B,C,D} and rejoin, what we get must have been in R
Dependency Preservation
 In some cases, it is not possible that the

decomposition has both the lossless join and

dependency preservation
Dependency Preservation
 Example: Bookings(title, theater, city)

 FD’s
▪ theater → city
▪ title city → theater
theater city theater title
 Keys
Guild Menlo Park Guild Antz
▪ {title, city}
Park Menlo Park Park Antz
▪ {theater, title}
 theater → city violates BCNF
(theater is not a superkey) theater city Title

 Decomposition into two relations Guild Menlo Park Antz
▪ {theater, city} Part Menlo Park Antz
▪ {theater, title}
 When we join these two decomposed relations, the result relation may not satisfy the title
city → theater FD’s

How we can …
 … solve the BCNF’s weakness such as

 The preservation of dependencies
  3NF
Exercises
 Exercise 3.3.1 (page 92)

 Exercise 3.4.1 (page 102)

3.5 Third Normal Form (3NF)
 3NF is the solution that have both the lossless-join, and

dependencies-preservation properties
 A relation R is in the third normal form (3NF) if and only if whenever a
nontrivial FD A1A2…AnB1B2…Bm, , either

▪ {A1,A2,…,An} is a super key,
▪ Or those B1,B2,…Bm that are not among the A’s, are each a member of some key (not
necessarily the same key)
 Notation
 An attribute that is a member of some key is often said to be prime
 For each nontrivial FD, either the left side is a super key, or the right side
consists of prime (member of key) attributes only

Differences between BCNF and
3NF
3NF BCNF
A nontrivial functional dependency: A nontrivial functional dependency:

X  B holds in R, either X  B holds in R, then
(a) X is a super key of R, or (a) X is a super key of R
(b) B is prime attribute of R
Note: A functional dependency X  Y is a full functional

dependency if removal of any attribute A from X means that
the dependency does not hold any more; A partial
functional dependency is not a full functional dependency
The Synthesis Algorithm for 3NF
Schemas
 Goal: decomposing a relation R into a set of

relations such that:

 The relations of the decomposition are all in 3NF
 The decomposition has a lossless join
 The decomposition has the dependency-preservation
property
Schemas
 Algorithm 3.26: Synthesis of 3NF relations

 Input: A relation R, and a set F of functional dependencies that hold for R
 Output: A decomposition of R into a collection of relations, each of which
is in 3NF. The decomposition has the lossless-join and dependency-

preservation properties
 Method: Perform the following steps
▪ 1) Find a minimal basis of F, say G
▪ 2) For each FD X  A in G, use {X, A} as the schema of one of the relations in the
decomposition
▪ 3) If none of the sets of relations from Step2 is a superkey for R, add another
relation whose schema is a key for R
Schemas
 Example 3.27
 Given R(A,B,C,D,E), with FD’s AB C, C  B, A  D
 Notice: the given FD’s are their own minimal basis

▪ Cannot eliminate any of the given FD’s
▪ Cannot eliminate any attribute from a left side
 Start 3NF synthesis by follows (1), (2), and (3)

▪ New relations: S1(A,B,C), S2(B,C), S3(A,D)
▪ S2 is a subset of S1, then we can drop S2
▪ Consider whether we need to add a relation whose scheme is a key

▪ In this case, R has two keys: {A,B,E} and {A,C,E}
▪ Since neither of these keys is a subset of the schemas, we must add one of them, say S4(A,B,E)
▪ The final decomposition of R is S1(A,B,C), S3(A,D), S4(A,B,E)

Schemas
 Why this Algorithm works?

 Lossless Join
 Dependency Preservation
 Third Normal Form


Exercises
Normal Form Summary
 Unnormalized Form (UNF)

 A relation that contains one or more repeating
groups.
 Example:
Manufacture Model CPU RAM HD

HP 2001 2.00 2048 240
2002 2.40 2048 320
Toshiba 2003 2.00 2048 120
2004 2.80 4096 500
Sony 2010 2.40 2048 300
IBM 2011 2.20 4096 300
Normal Form Summary
 First Normal Form (1NF)

 Every component of every tuple is an atomic value
(intersection of each row and column contains one

and only one value)
Manufacture Model CPU RAM HD
HP 2001 2.00 2048 240
HP 2002 2.40 2048 320
Toshiba 2003 2.00 2048 120
Toshiba 2004 2.80 4096 500
Sony 2010 2.40 2048 300
IBM 2011 2.20 4096 300
Normal Form Summary
 Second Normal Form (2NF)

 Based on the concept of full functional dependency.
 X  Y is full functional dependency if Y is functionally dependent on X, but not

on any proper subset of X. i.e: ∄ V⊆ X, V  Y
 Second Normal Form (2NF) : a relation that is in 1NF and every non-primary-
key attribute is fully functionally dependent on the primary key.
 Example 1:
▪ R(A,B,C,D) and S={AB  CD, B  D}, so : key of R is AB
▪ B  D : D is functionally dependent on B, an subset of key AB so D is not fully functionally
dependent on the key AB
▪ R is not in 2NF
 Example 2:
▪ R(A,B,C,D) and S={AB  CD, C  D}
▪ R is in 2NF
Normal Form Summary
 Third Normal Form (3NF)

 Based on the concept of transitive dependency.
 Transitive Dependency is a condition where

▪ A, B and C are attributes of a relation such that if A  B
and B  C,
▪ then C is transitively dependent on A through B. (Provided that A is not functionally dependent on B or
C).
 Third Normal Form (3NF): A relation that is in 1NF and 2NF and in which NO non-key
attribute is transitively dependent on the key

 Example 1: R(A,B,C,D), AB  CD and C  D
▪ AB  C and C  D so that D is transitively dependent on AB
▪ R is not in 3NF
 Example 2: R(A,B,C,D), AB  CD and C  A, C  B

▪ although C is not a superkey but A, B are key attributes (prime attributes)
▪ R is in 3NF
3.6 Multi valued Dependencies
(self studying)
 Suppose we have relation R,

a MVD on R, X ->-> Y, says that if two tuples of R

agree on all the attributes of X, then their
components in Y may be swapped, and the result
will be two tuples that are also in the relation
Multi valued Dependencies
 Example about MVD

Name street city title year

C. Fisher 123 Maple St. Hollywood Star Wars 1977
C. Fisher 5 Locust Ln. Malibu Star Wars 1977
C. Fisher 123 Maple St. Hollywood Empire Strikes Back 1980
C. Fisher 5 Locust Ln. Malibu Empire Strikes Back 1980
C. Fisher 123 Maple St. Hollywood Return of the Jedi 1983
C. Fisher 5 Locust Ln. Malibu Return of the Jedi 1983
 name ->-> street city

C. Fisher 5 Locust Ln. Multi valued Dependencies
Malibu Star Wars 1977
 Example about Ln. MVD
C. Fisher 5 Locust Malibu Return of the Jedi 1983
name street city title year

 name ->-> street city
 t=first tuple, u=fourth tuple

▪ Since t[name]=u[name], there are v1, v2 tuple in relation R such that
▪ Its name is C. Fisher, street and city agree with t, title and year agree with u
or (v1 is third tuple)
▪ Its name is C. Fisher, street and city agree with u, title and year agree with t
(v’2 is second tuple)
 Picture of MVD X->->Y

X (A1, …, An) Y(B1, …, Bm) others (Z)

t
v1 v2
u
equal
swap Y and Z to get new v1,v2 tuples
 MVD Rules
 Trivial MVD’s
▪ The MVD A1A2…An ->-> B1B2…Bm holds in any relation if {B1,B2,…,Bm}  {A1,A2,…,An}
 The transitive rule
▪ If A1A2…An ->-> B1B2…Bm, and B1B2…Bm ->-> C1C2…Ck hold for some relation, then so does
A1A2…An ->-> C1C2…Ck
 FD promotion
▪ Every FD is an MVD, that is if A1A2…An -> B1B2…Bm, then A1A2…An ->-> B1B2…Bm
 Complementation rule
▪ If X ->-> Y, and Z is all the other attributes, then X->->Z
 More Trivial MVD’s
▪ If all the attributes of R are (A1A2…An , B1B2…Bm) then A1A2…An ->-> B1B2…Bm holds in R
 Splitting doesn’t hold

 Like FD’s, we cannot generally split the left side of MVD
 But unlike FD’s, we cannot generally split the right side of MVD
 So, name ->-> street city holds, but neither name ->-> street, nor name
->-> city holds for this relation

What we do with …
 … redundancy caused by MVD’s?


Fourth Normal Form
 The redundancy that comes from MVD’s is not

removable by putting the database schema in

BCNF
 There is a stronger normal form, called 4NF, that
treats MVD’s as FD’s when it comes to
decomposition, but not when determining keys of
the relation
Fourth Normal Form
 4NF statement
 A relation R is in 4NF if whenever X ->-> Y is a nontrivial
MVD, then X is a super key

 Nontrivial MVD means that
 Y is not a subset of X
 X and Y are not, together, all the attributes
 Note that the definition of ‘super key’ still depends

on FD’s only
BCNF Versus 4NF
 Every FD X→Y is also MVD X->->Y

 Thus, if R is in 4NF, it is certainly in BCNF

 But R could be in BCNF and not in 4NF
 Example: R(name,street,city,title,year) is BCNF but not
4NF because MDV name ->-> street city is a nontrivial

(name is not a key of R, but the key of R is all five
attributes)
Decomposition into 4NF
 Algorithm 3.33
 Input: A relation R0, with a set of FD’s and MVD’s S 0
 Output: A decomposition of R 0 into relations all of which are in 4NF
 Method:
▪ Initialize: R=R0, S=S0
▪ Find a 4NF violation in R, say X->->Y, where X is not a super key. If there is none,
return; R is suitable decomposition
▪ If there is such a violation, break the schema of R into two schemas:
▪ XY is one of the decomposed R1 relation
▪ And R2 schema has X and all attributes of R that are not among the X or Y
▪ Find the FD’s, MVD’s that hold in R1, R2. Recursively decompose R1, R2 again
Decomposition into 4NF
 Example 3.34
 The R(name, street, city, title, year) with MDV
name ->-> street city is a violation 4NF

 Decomposition relations:
▪ R1(name, street, city) with MDV name ->-> street, city
▪ And R2(name, title, year) with MDV name ->-> title, year
3.7 An Algorithm for Discovering MVD’s
(self studying)
 Reasoning about MVD’s and FD’s

 Problem: Given a set of MVD’s and/or FD’s that hold for
a relation R, Does a certain MVD or FD also hold in R?

 Solution: Use a tableau to explorer all inferences from
the given set, to see if you can prove the target

dependency
An Algorithm for Discovering
MVD’s
 Why do we care?
 4NF technically requires an MVD violation

▪ Need to infer MVD’s from given FD’s and MVD’s that may not be
violations themselves
 When we decompose, we need to project FD’s, MVD’s

The Closure and the Chase to FD
 Goal: Does X → Y hold in a relation R?

 Solution: Closing X with respect to F and seeing whether

Y  X+
 Method: Using and chasing the tableau by the FD’s F
 A chase-based test:
 Start with a tableau having two rows that agree only on X
 Chase the tableau using the FD’s of F
 If the final tableau agrees in all column of Y, then X → Y holds;
otherwise it does not

The Closure and Chase to FD
 Example: Suppose a relation R(A,B,C,D,E,F) with FD’s

AB→C, BC→AD, D→E, CF→B. Test AB→D hold in R?

 Start with the tableau having two rows that agree on A,B:
A B C D E F
a b c1 d1 e1 f1
a b c2 d2 e2 f2
 As AB→C, then c1=c2, we have

A B C D E F
a b c d1 e1 f1
a b c d2 e2 f2
The Closure and Chase to FD
 Example (cont)
A B C D E F
a b c d1 e1 f1
a b c d2 e2 f2
 As BC→AD, then d1=d2

A B C D E F
a b c d e1 f1
a b c d e1 f2
 We can go no further. Since the two tuples now agree in
the D column, AB→D does follow from the given FD’s

Extending the Chase to MVD’s
 For MVD X->->Y,

 We start with a tableau consisting of two tuples that

agree in X and disagree in all attributes not in the set X.
 We apply the given FD’s to equate symbols, and we
apply the given MVD’s to swap the values in certain
attributes between two existing rows of the tableau to
add new rows to the tableau
MVD’s
 Example: Suppose a relation R(A,B,C,D), and

A→B, B->->C. Test A->->C hold in R?

 Start with the tableau having two rows (for A->->C)
A B C D
a b1 c d1
a b c2 d
 Since A→B, then b1=b

A B C D
a b c d1
a b c2 d
MVD’s
A B C D
 Example (cont)
a b c d1
a b c2 d
 Since B->->C, then we swap two rows that agree in B column

to get two more rows:
A B C D
a b c d1
a b c2 d
a b c2 d1
a b c d
 We get a row (a,b,c,d), which proves that A->->C holds in

relation R
MVD’s
 Example: Suppose a relation R(A,B,C,D), A->->BC, D→C.

Test A→C hold ?

 Start with the tableau with two rows
A B C D
a b1 c1 d1
a b2 c2 d2
 Since A->->BC, then we swap the B and C columns to get two new tuples
into the tableau

A B C D
a b1 c1 d1
a b2 c2 d2
a b2 c2 d1
a b1 c1 d2
MVD’s
 Example (cont)
 We have pairs of rows that agree on D, so we can apply

the FD D→C, then c1=c2
A B C D
a b1 c1 d1
a b2 c2 d2
a b2 c2 d1
a b1 c1 d2
 So, we conclude that A→C
MVD’s
 Example: Suppose a relation R(A,B,C,D), A->->B,

B->->C. Test A->->C hold ?

A B C D
a b1 c d1
a b c2 d
 We have two rows agree on A, then we apply A->->B

A B C D
a b1 c d1
a b c2 d
a b c d1
a b1 c2 d
MVD’s
 Example: Suppose a relation R(A,B,C,D), A->->B, B->->C. A->-
>C?
 B->->C,
A the 2nd,3rd rows
B agree on B, then
C we swap the
D C column, we have
a b1 c d1
a b c2 d
a b c d1
a b c d
a b c2 d1
a b1 c2 d
 We have (a,b,c,d) which proves that A->->C

MVD’s
 Example: Suppose a relation R(A,B,C,D,E), and one of

the decomposed relation S(A,B,C). MVD A->->CD holds

in R. Does A->->C hold in S?
A B C D E
a b1 c d1 e1
a b c2 d e
 As A->->CD, swap the C and D columns

A B C D E
a b1 c d1 e1
a b c2 d e
a b1 c2 d e1
a b c d1 e

Exercises

03 Ch3 DesignTheoryforRelationalDatabases

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

03 Ch3 DesignTheoryforRelationalDatabases

Uploaded by

Copyright:

Available Formats

INTRODUCTION

 Understand concepts of:

 Helps us examine a design and make

improvements based on a few simple principles

 A functional dependency: constraint between two sets of

 Easy to see that: the following FD is true

 Exercise: How about the FD

title, year  startName does not hold in Movies1 relation

 A set of one or more attributes {A1, A2, …, An} is

2) No proper subset of {A1, A2, …, An} functionally

 (title,year,starName) form a key for Movies1 relation

 (year,starName) is not a key of Movies1 relation

 A relation can have more than one keys

 One of them is chosen as primary key

 Others are candidate keys (or alternate keys)

 A set of attributes that contains a key is called a

 A set of FD’s S follows from a set of FD’s T

if every relation instance that satisfies all the FD’s

 We can replace an FD A1A2…An  B1B2…Bm by a set of

FD’s A1A2…An  Bi, for i=1,2,…,m

…,m by the single FD A1A2…An  B1B2…Bm

 title year  length

 title year  genre

 title year  studioName

 title year  length genre studioName

 The FD’s A1A2…An  B1B2…Bm is trivial(phụ

 The FD’s A1A2…An  B1B2…Bm is equavilent to

A1A2…An  C1C2…Ck, where C’s are all those B’s

 Ex: A,B,C,D  C,D,E,F then A,B,C,D  E,F

 The closure of a set of attributes {A1, A2, …, An}

under FD’s in S (denoted {A1, A2, …, An}+) is the set

 S={AC, A D, D E, E F}, {A}+=? (các phụ thuộc

hàng có liên quan đến bao đóng)

 Algorithm 3.7: Closure of a set of attributes

 Output: The closure {A1,A2,…,An}+

1. If necessary, split the FD’s of S, so each FD in S have singleton right side

4. The set X is the correct value of {A 1, A2, …, An}+

 Compute {A,B}+ , {B,C}+

[AB] += ABCDE(tâp thuộc tính vẫn còn thiếu F),

 Split BC AD into BC A and BC D

 S={ABC, BCA, BC D, DE, CFB}

 Start with Result={A,B}

 First: since AB C, then add C to Result, Result={A,B,C}

 S={A→B, B→C, C→D, D→A}

B1B2…BmC1C2…Ck hold in relation R, then

A1A2…An  C1C2…Ck also holds in R

 Example: Let’s see another version of Movies relation

title year length genre studioName studioAddr

 Two of the FD’s that we might claim to hold are:

title year  studioName

title year  studioAddr

 Suppose a set of FD’s S, any set of FD’s T equivalent to S is

said to be a basis for S.

 If any FD is removed from B, the result is no longer a basis

the result is no longer a basis

AB  C, BC  A, AC  B, A  BC, B  AC, C  AB}

 … a set of FD’s S of R when we project R on

 To find a functional dependencies of projection,

 Involve only attributes of R1

 Algorithm 3.12: Projecting a Set of FD’s

 Input: R, R1=L(R), S a set of FD’s that hold in R

 Output: the set of FD’s that hold in R1