Download as pdf or txt
Download as pdf or txt
You are on page 1of 21

Lecture 9:

Database Design
Wednesday, January 25, 2006

Closure of a set of Attributes


Given
Givenaaset
setof
ofattributes
attributes AA11,,,
,AAnn
+
The
Theclosure,
closure,{A
{A11,,,
,AAnn}}+,, isisthe
theset
setof
ofattributes
attributesBB
s.t.
A
,
,
A

B
s.t. A11, , Ann B

Example:

name
name
color
color
category

category department
department
color,
color,category
category
price
price

Closures:
name+ = {name, color}
{name, category}+ = {name, category, color, department, price}
color+ = {color}
2

Closure Algorithm
Start
Startwith
withX={A1,
X={A1,,
,An}.
An}.

Example:

Repeat
Repeatuntil
untilXXdoesnt
doesntchange
change do:
do:

name
name
color
color
category
category
department
department
color,
category
color, category
price
price

BB11,,,
,BBnn
CC isisaaFD
FDand
and
BB1,,,
B
are
all
in
X
,
B
n
are
all
in
X
1
n
then
then add
addCCto
toX.
X.

ifif

{name, category}+ =
{ name, category, color, department, price }
Hence:

name,
name,category
category
color,
color,department,
department,price
price

Example
In class:
R(A,B,C,D,E,F)

Compute {A,B}+

A,
A,BB
CC
A,
D

A, D EE
BB
DD
A,
F

A, F BB

X = {A, B,

Compute {A, F}+ X = {A, F,

}
4

Why Do We Need Closure


With closure we can find all FDs easily
To check if X A
Compute X+
Check if A X+

Using Closure to Infer ALL FDs


Example:

A,
A,BB
CC
A,
A,DD
BB
BB
DD

Step 1: Compute X+, for every X:


A+
A+==A,
A, B+
B+==BD,
BD, C+
C+==C,
C, D+
D+==DD
AB+
AB+==ABCD,
ABCD, AC+
AC+==AC,
AC, AD+
AD+==ABCD
ABCD
++= ABCD (no need to compute why ?)
ABC+
=
ABD+
=
ACD
ABC+ = ABD+ = ACD = ABCD (no need to compute why ?)
+
BCD
BCD+==BCD,
BCD, ABCD+
ABCD+==ABCD
ABCD
Step 2: Enumerate all FDs X Y, s.t. Y X+ and XY = :
AB
AB
CD,
CD,ADBC,
ADBC, ABC
ABC
D,
D,ABD
ABD
C,
C,ACD
ACD
BB

Another Example
Enrollment(student, major, course, room, time)
student major
major, course room
course time

What else can we infer ? [in class, or at home]

Back to Conceptual Design


Now we know how to find more FDs, its easy
Search for bad FDs
If there are such, then decompose the table into two tables,
repeat for the subtables.
When done, the database schema is normalized
Unfortunately, there are several normal forms

Normal Forms
First Normal Form = all attributes are atomic
Second Normal Form (2NF) = old and obsolete
Third Normal Form (3NF) = will discuss
Boyce Codd Normal Form (BCNF) = will discuss
Others...
9

Keys
A superkey is a set of attributes A1, ..., An s.t. for
any other attribute B, we have A1, ..., An B
A key is a minimal superkey
I.e. set of attributes which is a superkey and for which
no subset is a superkey

10

Computing (Super)Keys
Compute X+ for all sets X
If X+ = all attributes, then X is a key
List only the minimal Xs

11

Example
Product(name, price, category, color)
name,
name, category
category
price
price
category
category
color
color
What is the key ?

12

Example
Product(name, price, category, color)
name,
name, category
category
price
price
category
category
color
color
What is the key ?
(name, category) + = name, category, price, color
Hence (name, category) is a key

13

Examples of Keys
Enrollment(student, address, course, room, time)
student
student
address
address
room,
room,time
time
course
course
student,
student,course
course
room,
room,time
time

(find keys at home)

14

Eliminating Anomalies
Main idea:
X A is OK if X is a (super)key
X A is not OK otherwise

15

Example
Name
Fred
Fred
Joe
Joe

SSN
123-45-6789
123-45-6789
987-65-4321
987-65-4321

PhoneNumber
206-555-1234
206-555-6543
908-555-2121
908-555-1234

City
Seattle
Seattle
Westfield
Westfield

SSN
SSN
Name,
Name,City
City
What the key?
{SSN, PhoneNumber}

Hence SSN Name, City


is a bad dependency 16

Key or Keys ?
Can we have more than one key ?
Given R(A,B,C) define FDs s.t. there are two
or more keys

17

Key or Keys ?
Can we have more than one key ?
Given R(A,B,C) define FDs s.t. there are two
or more keys
ABC
ABC
BCA
BCA

or

ABC
ABC
BAC
BAC

what are the keys here ?


Can you design FDs such that there are three keys ?

18

Boyce-Codd Normal Form


A simple condition for removing anomalies from relations:
AArelation
relationRRisisin
inBCNF
BCNFif:
if:
IfIfAA11,,...,
...,AAnn
BBisisaanon-trivial
non-trivialdependency
dependency
in
in RR,, then
then{A
{A11,,...,
...,AAnn}} isisaasuperkey
superkeyfor
forRR
In other words: there are no bad FDs
Equivalently:
X, either (X+ = X)

or (X+ = all attributes)


19

BCNF Decomposition Algorithm


repeat
repeat
choose
chooseAA11,,,
,AAmm
BB11,,,
,BBnnthat
thatviolates
violatesBNCF
BNCF
split
splitRRinto
intoRR11(A
(A11,,,
,AAmm,,BB11,,,
,BBnn))and
andRR22(A
(A11,,,
,AAmm,,[others])
[others])
and
R
continue
with
both
R
continue with both R11 and R22
until
untilno
nomore
moreviolations
violations

Bs

R1

As

Others

R2

Is there a
2-attribute
relation that is
not in BCNF ?
In practice, we have
20
a better algorithm (coming up)

10

Example
Name
Fred
Fred
Joe
Joe

SSN
123-45-6789
123-45-6789
987-65-4321
987-65-4321

PhoneNumber
206-555-1234
206-555-6543
908-555-2121
908-555-1234

City
Seattle
Seattle
Westfield
Westfield

SSN
SSN
Name,
Name,City
City
What the key?
{SSN, PhoneNumber}

use SSN Name, City


to split

21

Example
Name

SSN

City

Fred

123-45-6789 Seattle

Joe

987-65-4321 Westfield

SSN

PhoneNumber

123-45-6789

206-555-1234

123-45-6789

206-555-6543

987-65-4321

908-555-2121

987-65-4321

908-555-1234

SSN Name, City

Lets check anomalies:


Redundancy ?
Update ?
Delete ?

22

11

Example Decomposition
Person(name, SSN, age, hairColor, phoneNumber)
SSN name, age
age hairColor
Decompose in BCNF (in class):

23

BCNF Decomposition Algorithm


BCNF_Decompose(R)
BCNF_Decompose(R)
+
find
find X
X s.t.:
s.t.: X
X X
X+ [all
[all attributes]
attributes]

ifif (not
(not found)
found) then
then R
R isis in
in BCNF
BCNF
+
let
let Y
Y == X
X+ -- X
X
+
let
Z
=
[all
attributes]
let Z = [all attributes]-- X
X+
decompose
decompose RR into
into R1(X
R1(X
Y)
Y) and
and R2(X
R2(X
Z)
Z)
continue
continue to
to decompose
decompose recursively
recursively R1
R1 and
and R2
R2
24

12

Find X s.t.: X X+ [all attributes]

Example BCNF Decomposition


Person(name, SSN, age, hairColor, phoneNumber)
SSN name, age
age hairColor
Iteration
Iteration1:
1:Person
Person
SSN+
=
SSN,
SSN+ = SSN,name,
name,age,
age,hairColor
hairColor
Decompose
Decomposeinto:
into:P(SSN,
P(SSN,name,
name,age,
age,hairColor)
hairColor)
Phone(SSN,
phoneNumber)
Phone(SSN, phoneNumber)
Iteration
Iteration2:
2: PP
age+
age+==age,
age,hairColor
hairColor
Decompose:
Decompose:People(SSN,
People(SSN,name,
name,age)
age)
Hair(age,
hairColor)
Hair(age, hairColor)
Phone(SSN,
Phone(SSN,phoneNumber)
phoneNumber)

R(A,B,C,D)

Example
A+

25

A
A
BB
BB
CC

R(A,B,C,D)
= ABC ABCD

R1(A,B,C)
= BC ABC

B+
R11(B,C)

What are
the keys ?

R12(A,B)

R2(A,D)
What are
the keys ?

What happens if in R we first pick B+ ? Or AB+26 ?

13

Decompositions in General
R(A
R(A11,,...,
...,AAnn,,BB11,,...,
...,BBmm,,CC11,,...,
...,CCpp))

RR11(A
(A11,,...,
...,AAnn,,BB11,,...,
...,BBmm))

RR22(A
(A11,,...,
...,AAnn,,CC11,,...,
...,CCpp))

R1 = projection of R on A1, ..., An, B1, ..., Bm


R2 = projection of R on A1, ..., An, C1, ..., Cp
27

Theory of Decomposition
Sometimes it is correct:
Name

Price

Category

Gizmo

19.99

Gadget

OneClick

24.99

Camera

Gizmo

19.99

Camera

Name

Price

Name

Category

Gizmo

19.99

Gizmo

Gadget

OneClick

24.99

OneClick

Camera

Gizmo

19.99

Gizmo

Camera

Lossless decomposition

28

14

Incorrect Decomposition
Sometimes it is not:
Name

Price

Category

Gizmo

19.99

Gadget

OneClick

24.99

Camera

Gizmo

19.99

Camera

Whats
incorrect ??

Name

Category

Price

Gizmo

Gadget

19.99

Gadget

OneClick

Camera

24.99

Camera

Gizmo

Camera

19.99

Camera

Category

Lossy decomposition

29

Decompositions in General
R(A
R(A11,,...,
...,AAnn,,BB11,,...,
...,BBmm,,CC11,,...,
...,CCpp))

RR11(A
(A11,,...,
...,AAnn,,BB11,,...,
...,BBmm))

RR22(A
(A11,,...,
...,AAnn,,CC11,,...,
...,CCpp))

If A1, ..., An B1, ..., Bm


Then the decomposition is lossless
Note: dont need A1, ..., An C1, ..., Cp

BCNF decomposition is always lossless. WHY ?

30

15

3NF: A Problem with BCNF


Unit

Company

Product

Unit
Unit
Company
Company
Company,
Company,Product
Product
Unit
Unit
Unit+ = Unit, Company

Unit

Company

Unit

Product

Unit
Unit
Company
Company
We loose the FD: Company, Product Unit !!

31

So Whats the Problem?


Unit

Company

Galaga99 UW
Bingo
UW

Unit

Product

Galaga99
Bingo

Databases
Databases

Unit
Unit
Company
Company
No problem so far. All local FDs are satisfied.
Lets put all the data back into a single table again:
Unit

Company

Product

Galaga99
Bingo

UW
UW

Databases
Databases

Violates the FD:

Company,
Company,Product
Product
Unit
Unit

32

16

The Problem
We started with a table R and FD
We decomposed R into BCNF tables R1, R2,
with their own FD1, FD2,
We can reconstruct R from R1, R2,
But we cannot reconstruct FD from FD1, FD2,
33

Solution: 3rd Normal Form


(3NF)
A simple condition for removing anomalies from relations:
AArelation
relationRRisisin
in3rd
3rdnormal
normalform
formifif::
Whenever
Wheneverthere
thereisisaanontrivial
nontrivialdependency
dependencyAA11,,AA22,,...,
...,AAnn
BB
for
for RR,,then
then {A
{A11,,AA22,,...,
...,AAnn}}aasuper-key
super-keyfor
forR,
R,
or
B
is
part
of
a
key.
or B is part of a key.
Tradeoff:
BCNF = no anomalies, but may lose some FDs
3NF = keeps all FDs, but may have some anomalies

34

17

3NF Decomposition Algorithm


3NF_Decompose(R)
3NF_Decompose(R)
let
let K
K == [all
[all attributes
attributes that
that are
are part
part of
of some
some key]
key]
+
+
find
X -- K
K
and
and X
X+ [all
[all attributes]
attributes]
find X
X s.t.:
s.t.: X
X+ -- X

ifif (not
(not found)
found) then
then R
R isis already
already in
in 3NF
3NF
+
let
let Y
Y == X
X+ -- X
X -- K
K
Z
=
[all
attributes]
let
(X
Y)
Y)
let Z = [all attributes]-- (X
decompose
into
R1(X

Y)
and
decompose into R1(X Y) and R2(X
R2(X
Z)
Z)
decompose,
decompose, recursively,
recursively, R1
R1 and
and R2
R2

35

Example of 3NF decomposition


R(A,B,C,D,E):
R(A,B,C,D,E):
AB
AB
CC
CC
DD
DD
BB
DD
EE

Keys: (need to compute X+, for several Xs)


AB, AC, AD
K = {A, B, C, D}

Pick X = C
C+ = BCDE
C BDE is a BCNF violation
For 3NF: remove B, D (part of K):
C E is a 3NF violation
Decompose: R1(C, E), R2(A,B,C,D)

R1 is in 3NF
R2 is in 3NF (because its keys: AB, AC, AD)

36

18

3NF v.s. BCNF Decomposition


A

BCNF

37

19

FDs for E/R Diagrams


Rule 2: If the relation comes from a many-many relationship,
the key of the relation is the set of all attribute keys in the
relations corresponding to the entity sets
name
Product

Person

buys

price

name

ssn

date

buys(name, ssn, date)

39

FDs for E/R Diagrams


Except: if there is an arrow from the relationship to E, then
we dont need the key of E as part of the relation key.
sname

Product

name

card-no

Purchase

CreditCard

Person

Store

ssn

Purchase(name , sname, ssn, card-no)

40

20

FDs for E/R Diagrams


More rules:
Many-one, one-many, one-one relationships
Multi-way relationships
Weak entity sets
(Try to find them yourself, or check book)
41

FDs for E/R Diagrams


Say: the CreditCard determines the Person

sname

Product

name

card-no

Purchase

CreditCard

Person

Store

ssn

Incomplete
(what does
it say ?)

Purchase(name , sname, ssn, card-no)


card-no ssn

42

21

You might also like