Download as pdf or txt
Download as pdf or txt
You are on page 1of 21

Lecture 9: Database Design

Wednesday, January 25, 2006

Closure of a set of Attributes


Given ,,, Givenaaset setof ofattributes attributes A A 11 ,A A nn
+ The ,,{A ,,, } Theclosure closure {A 11 ,A A nn }+,, is isthe theset setof ofattributes attributesB B s.t. A , , A B 11, , A nn B s.t. A

Example:

name name color color category category department department color, color,category category price price

Closures: name+ = {name, color} {name, category}+ = {name, category, color, department, price} color+ = {color} 2

Closure Algorithm
Start Startwith withX={A1, X={A1,, ,An}. An}. Repeat :: Repeatuntil untilX Xdoesnt doesntchange change do do if if B ,,, B 11 ,B B nn C C is isaaFD FDand and B ,,, B 11 ,B B nn are areall allin inX X then then add addC Cto toX. X. Example: name name color color category category department department color, category color, category price price

{name, category}+ = { name, category, color, department, price } Hence: name, name,category category color, color,department, department,price price
3

Example
In class: R(A,B,C,D,E,F) A, A,B B C C A, D E A, D E B B D D A, F A, F B B

Compute {A,B}+

X = {A, B,

} }
4

Compute {A, F}+ X = {A, F,

Why Do We Need Closure


With closure we can find all FDs easily To check if X A
Compute X+ Check if A X+

Using Closure to Infer ALL FDs


Example: A, A,B B C C A, A,D D B B B D B D

Step 1: Compute X+, for every X: A+ A+= =A, A, B+ B+= =BD, BD, C+ C+= =C, C, D+ D+= =D D AB+ = ABCD, AC+ = AC, AD+ = ABCD AB+ = ABCD, AC+ = AC, AD+ = ABCD + ABCD (no need to compute why ?) ABC+ ABC+= =ABD+ ABD+= =ACD ACD+= = ABCD (no need to compute why ?) + + BCD = BCD, ABCD+ = BCD = BCD, ABCD+ =ABCD ABCD Step 2: Enumerate all FDs X Y, s.t. Y X+ and XY = : AB BC, AB CD, CD,AD AD BC, ABC ABC D, D,ABD ABD C, C,ACD ACD B B
6

Another Example
Enrollment(student, major, course, room, time)
student major major, course room course time

What else can we infer ? [in class, or at home]

Back to Conceptual Design


Now we know how to find more FDs, its easy Search for bad FDs If there are such, then decompose the table into two tables, repeat for the subtables. When done, the database schema is normalized Unfortunately, there are several normal forms

Normal Forms
First Normal Form = all attributes are atomic Second Normal Form (2NF) = old and obsolete Third Normal Form (3NF) = will discuss Boyce Codd Normal Form (BCNF) = will discuss Others...
9

Keys
A superkey is a set of attributes A1, ..., An s.t. for any other attribute B, we have A1, ..., An B A key is a minimal superkey
I.e. set of attributes which is a superkey and for which no subset is a superkey

10

Computing (Super)Keys
Compute X+ for all sets X If X+ = all attributes, then X is a key List only the minimal Xs

11

Example
Product(name, price, category, color)
name, name, category category price price category category color color
What is the key ?

12

Example
Product(name, price, category, color)
name, name, category category price price category category color color
What is the key ? (name, category) + = name, category, price, color Hence (name, category) is a key
13

Examples of Keys
Enrollment(student, address, course, room, time)
student student address address room, room,time time course course student, student,course course room, room,time time

(find keys at home)

14

Eliminating Anomalies
Main idea: X A is OK if X is a (super)key X A is not OK otherwise

15

Example
Name Fred Fred Joe Joe SSN 123-45-6789 123-45-6789 987-65-4321 987-65-4321 PhoneNumber 206-555-1234 206-555-6543 908-555-2121 908-555-1234 City Seattle Seattle Westfield Westfield

SSN SSN Name, Name,City City What the key? {SSN, PhoneNumber}

Hence SSN Name, City is a bad dependency 16

Key or Keys ?
Can we have more than one key ? Given R(A,B,C) define FDs s.t. there are two or more keys

17

Key or Keys ?
Can we have more than one key ? Given R(A,B,C) define FDs s.t. there are two or more keys AB AB C C BC BC A A
or

A A BC BC B B AC AC
18

what are the keys here ? Can you design FDs such that there are three keys ?

Boyce-Codd Normal Form


A simple condition for removing anomalies from relations: A Arelation relationR Ris isin inBCNF BCNFif: if: If ,,..., IfA A 11 ...,A A nn B Bis isaanon-trivial non-trivialdependency dependency in ,,..., } in R R,, then then{A {A 11 ...,A A nn } is isaasuperkey superkeyfor forR R In other words: there are no bad FDs Equivalently: X, either (X+ = X)

or (X+ = all attributes)


19

BCNF Decomposition Algorithm


repeat repeat choose ,,, B11 ,,, chooseA A 11 ,A A m ,B B nnthat thatviolates violatesBNCF BNCF mB split (A ,,, , B , , B ) and R (A , , Am , [others]) splitR Rinto intoR R 11 (A 11 ,A A m , B 1 , , B n ) and R 2 (A m 1 n 2 11, , A m, [others]) and R continue with both R 11 and R 22 continue with both R until no more violations until no more violations Is there a 2-attribute relation that is not in BCNF ? In practice, we have 20 a better algorithm (coming up)

Bs

As

Others

R1

R2

10

Example
Name Fred Fred Joe Joe SSN 123-45-6789 123-45-6789 987-65-4321 987-65-4321 PhoneNumber 206-555-1234 206-555-6543 908-555-2121 908-555-1234 City Seattle Seattle Westfield Westfield

SSN SSN Name, Name,City City What the key? {SSN, PhoneNumber}

use SSN Name, City to split

21

Example
Name Fred Joe SSN 123-45-6789 123-45-6789 987-65-4321 987-65-4321 SSN City 123-45-6789 Seattle 987-65-4321 Westfield PhoneNumber 206-555-1234 206-555-6543 908-555-2121 908-555-1234
22

SSN Name, City

Lets check anomalies: Redundancy ? Update ? Delete ?

11

Example Decomposition
Person(name, SSN, age, hairColor, phoneNumber) SSN name, age age hairColor Decompose in BCNF (in class):

23

BCNF Decomposition Algorithm


BCNF_Decompose(R) BCNF_Decompose(R)
+ find X find X X s.t.: s.t.: X X X+ [all [all attributes] attributes]

if if (not (not found) found) then then R R is is in in BCNF BCNF


+ let let Y Y= =X X+ -- X X + let Z = [all attributes] let Z = [all attributes]-- X X+ decompose decompose R R into into R1(X R1(X Y) Y) and and R2(X R2(X Z) Z) continue continue to to decompose decompose recursively recursively R1 R1 and and R2 R2
24

12

Find X s.t.: X X+ [all attributes]

Example BCNF Decomposition


Person(name, SSN, age, hairColor, phoneNumber) SSN name, age age hairColor Iteration Iteration1: 1:Person Person SSN+ = SSN, SSN+ = SSN,name, name,age, age,hairColor hairColor Decompose Decomposeinto: into:P(SSN, P(SSN,name, name,age, age,hairColor) hairColor) Phone(SSN, phoneNumber) Phone(SSN, phoneNumber) Iteration Iteration2: 2: P P age+ = age, hairColor age+ = age, hairColor Decompose: Decompose:People(SSN, People(SSN,name, name,age) age) Hair(age, hairColor) Hair(age, hairColor) Phone(SSN, Phone(SSN,phoneNumber) phoneNumber)

What are the keys ?


25

R(A,B,C,D)

Example
A+ R(A,B,C,D) = ABC ABCD

A A B B B B C C

B+ R11(B,C)

R1(A,B,C) = BC ABC R12(A,B)

R2(A,D)
What are the keys ?

What happens if in R we first pick B+ ? Or AB+26 ?

13

Decompositions in General
R(A ,,..., ,,B ,,..., , C11 ,,..., )) R(A 11 ...,A A nn B 11 ...,B B m ...,C C pp m, C

R (A ,,..., ,,B ,,..., ) R 11 (A 11 ...,A A nn B 11 ...,B B m m)

R (A ,,..., ,,C ,,..., )) R 22 (A 11 ...,A A nn C 11 ...,C C pp

R1 = projection of R on A1, ..., An, B1, ..., Bm R2 = projection of R on A1, ..., An, C1, ..., Cp
27

Theory of Decomposition
Sometimes it is correct:
Name Gizmo OneClick Gizmo Price 19.99 24.99 19.99 Category Gadget Camera Camera

Name Gizmo OneClick Gizmo

Price 19.99 24.99 19.99

Name Gizmo OneClick Gizmo

Category Gadget Camera Camera


28

Lossless decomposition

14

Incorrect Decomposition
Sometimes it is not:
Name Gizmo OneClick Gizmo Price 19.99 24.99 19.99 Category Gadget Camera Camera

Whats incorrect ??

Name Gizmo OneClick Gizmo

Category Gadget Camera Camera

Price 19.99 24.99 19.99

Category Gadget Camera Camera


29

Lossy decomposition

Decompositions in General
R(A ,,..., ,,B ,,..., , C11 ,,..., )) R(A 11 ...,A A nn B 11 ...,B B m ...,C C pp m, C

R (A ,,..., ,,B ,,..., ) R 11 (A 11 ...,A A nn B 11 ...,B B m m)

R (A ,,..., ,,C ,,..., )) R 22 (A 11 ...,A A nn C 11 ...,C C pp

If A1, ..., An B1, ..., Bm Then the decomposition is lossless Note: dont need A1, ..., An C1, ..., Cp

BCNF decomposition is always lossless. WHY ?

30

15

3NF: A Problem with BCNF


Unit Company Product Unit Unit Company Company Company, Company,Product Product Unit Unit Unit+ = Unit, Company Unit Company Unit Product

Unit Unit Company Company We loose the FD: Company, Product Unit !!
31

So Whats the Problem?


Unit Company Unit Product

Galaga99 UW Bingo UW

Galaga99 Bingo

Databases Databases

Unit Unit Company Company No problem so far. All local FDs are satisfied. Lets put all the data back into a single table again: Unit Company Product

Galaga99 Bingo
Violates the FD:

UW UW

Databases Databases
32

Company, Company,Product Product Unit Unit

16

The Problem
We started with a table R and FD We decomposed R into BCNF tables R1, R2, with their own FD1, FD2, We can reconstruct R from R1, R2, But we cannot reconstruct FD from FD1, FD2,
33

Solution: 3rd Normal Form (3NF)


A simple condition for removing anomalies from relations: A Arelation relationR Ris isin in3rd 3rdnormal normalform formif if:: Whenever ,,A ,,..., Wheneverthere thereis isaanontrivial nontrivialdependency dependencyA A 11 A 22 ...,A A nn B B for ,,A ,,..., for R R,,then then {A {A 11 A 22 ...,A A nn} }aasuper-key super-keyfor forR, R, or orB Bis ispart partof ofaakey. key. Tradeoff: BCNF = no anomalies, but may lose some FDs 3NF = keeps all FDs, but may have some anomalies

34

17

3NF Decomposition Algorithm


3NF_Decompose(R) 3NF_Decompose(R) let let K K= = [all [all attributes attributes that that are are part part of of some some key] key]
+ + find X -- K K and and X X+ [all [all attributes] attributes] find X X s.t.: s.t.: X X+ -- X

if (not found) found) then then R R is is already already in in 3NF 3NF if (not
+ let let Y Y= =X X+ -- X X -- K K Z = [all attributes] let (X Y) Y) let Z = [all attributes]-- (X decompose into R1(X Y) and decompose into R1(X Y) and R2(X R2(X Z) Z) decompose, decompose, recursively, recursively, R1 R1 and and R2 R2

35

Example of 3NF decomposition


R(A,B,C,D,E): R(A,B,C,D,E): AB AB C C C C D D D D B B D E DE Keys: (need to compute X+, for several Xs) AB, AC, AD K = {A, B, C, D} Pick X = C C+ = BCDE C BDE is a BCNF violation For 3NF: remove B, D (part of K): C E is a 3NF violation Decompose: R1(C, E), R2(A,B,C,D)
36

R1 is in 3NF R2 is in 3NF (because its keys: AB, AC, AD)

18

3NF v.s. BCNF Decomposition


A B C D E F G H K

3NF
A B C C D E E F G G H K

BCNF

37

FDs for E/R Diagrams


Given a relation constructed from an E/R diagram, what is its key? Rule 1: If the relation comes from an entity set, the key of the relation is the set of attributes which is the key of the entity set.

Person address name ssn

Person(address, name, ssn)

38

19

FDs for E/R Diagrams


Rule 2: If the relation comes from a many-many relationship, the key of the relation is the set of all attribute keys in the relations corresponding to the entity sets name Product price date buys name Person ssn

buys(name, ssn, date)

39

FDs for E/R Diagrams


Except: if there is an arrow from the relationship to E, then we dont need the key of E as part of the relation key.
Product

sname
Purchase Store

name

card-no

CreditCard

Person

ssn

Purchase(name , sname, ssn, card-no)

40

20

FDs for E/R Diagrams


More rules: Many-one, one-many, one-one relationships Multi-way relationships Weak entity sets (Try to find them yourself, or check book)
41

FDs for E/R Diagrams


Say: the CreditCard determines the Person

Product

sname
Purchase Store

name

card-no

CreditCard

Person

ssn

Incomplete (what does it say ?)

Purchase(name , sname, ssn, card-no)


card-no ssn

42

21

You might also like