CSC 214 Lecture 5part2
Lecture Objectives
At the end of this lecture, there should be a good
understanding of
– Update anomalies
– How to determine functional dependencies
– How to produce relations with good properties, that
avoid all the database anomalies
Schema Refinement
Normalization
Introduction
Normalization is the process of deciding which
attributes should be grouped together in a relation,
i.e. assigning attributes to entities.
― It helps eliminate data anomalies
― It reduces data redundancies
― It produces controlled redundancies to link tables
Introduction cont.
When to normalize
– During conceptual data modeling, you should
normalize the ER diagrams that you develop during that
phase.
– During logical database design, you should use
normalization concepts as a quality check for the
relations that are obtained from mapping the ER
diagrams.
Introduction cont.
• Normalization is based on functional dependencies among the
attributes of a relation.
• A relation can be normalized to a specific form to prevent possible
occurrence of update anomalies.
• The four most commonly used normal forms are first (1NF), second
(2NF) and third (3NF) normal forms, and Boyce–Codd normal form
(BCNF).
Data Redundancy
A major aim of relational database design is to group attributes into relations so as to
minimize data redundancy and reduce the file storage space required by the base relations.
The problems associated with data redundancy are illustrated by comparing the
Staff and Branch relations with the StaffBranch relation.
– The StaffBranch relation has redundant data: the details of a branch are repeated for every member
of staff at that branch.
– In contrast, branch information appears only once for each branch in the Branch relation, and
only branchNo is repeated in the Staff relation, to represent where each member of staff works.
Update Anomalies
Relations that contain redundant information may
potentially suffer from update anomalies.
Types of update anomalies include: Insertion, Deletion,
Modification.
Using the StaffBranch table, each type of update anomaly
can be explained as follows:
Insert Anomalies
If a new member of staff is employed and posted to branch B003, then when the
new staff information is added to the StaffBranch table, the details of
branch B003 have to be entered again.
In the process of repeating this information there could be a spelling
mistake, which leads to data inconsistency.
For instance, the B003 address could be written in two different ways that
mean the same thing to a reader but are not the same to the computer:
163 Main St. Glasgow
163 Main Street Glasgow
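The inconsistency can be made concrete with a minimal Python sketch. The staff numbers and the attribute name bAddress are hypothetical placeholders; only the branch number B003 and the two address spellings come from the example above.

```python
# Two StaffBranch rows for the same branch, B003.
# Staff numbers S1/S2 and the column name bAddress are assumptions for illustration.
staff_branch = [
    {"staffNo": "S1", "branchNo": "B003", "bAddress": "163 Main St. Glasgow"},
    {"staffNo": "S2", "branchNo": "B003", "bAddress": "163 Main Street Glasgow"},
]

# Collect every distinct address recorded for each branch.
addresses_by_branch = {}
for row in staff_branch:
    addresses_by_branch.setdefault(row["branchNo"], set()).add(row["bAddress"])

# A branch with more than one recorded address is inconsistent.
inconsistent = {b for b, addrs in addresses_by_branch.items() if len(addrs) > 1}
print(inconsistent)
```

Because the branch address is stored once per member of staff, nothing stops the copies from drifting apart. With a separate Branch relation the address would be stored only once.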
Deletion Anomalies
Assume staff member SA9 is no longer with the company and their
details have to be deleted from the StaffBranch table.
Since SA9 is the only member of staff at branch B007, deleting that row also
removes all information about branch B007,
even though the branch still exists.
Modification Anomalies
Assume the address of branch B005 has to be changed in
StaffBranch. The address has to be corrected in every row that
refers to branch B005.
If a typographic error occurs in any of these corrections, or a row is missed, the
database becomes inconsistent, which in turn affects any further
work that relies on the database.
Functional Dependency
A functional dependency describes the relationship between attributes in a relation.
If A and B are attributes of relation R, B is functionally dependent on A
(denoted A -> B) if each value of A in R is associated with exactly one value
of B in R.
Examples
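As one illustration, the definition can be tested against sample rows with a short Python sketch. The helper name holds is hypothetical, and the rows are a subset of the columns of the Customer_Rental data used in this lecture's worked example. A sample can only refute a dependency, since a real functional dependency must hold for all time, not just for the rows at hand.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}  # maps each lhs value to the rhs value first seen with it
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        if seen.setdefault(left, right) != right:
            return False
    return True


columns = ["Customer_No", "Property_No", "Rent", "Owner_No", "OName"]
data = [
    ("CR76", "PG4", 350, "CO40", "Tina Murphy"),
    ("CR76", "PG16", 450, "CO93", "Tony Shaw"),
    ("CR56", "PG4", 350, "CO40", "Tina Murphy"),
    ("CR56", "PG36", 375, "CO93", "Tony Shaw"),
    ("CR56", "PG16", 450, "CO93", "Tony Shaw"),
]
rows = [dict(zip(columns, values)) for values in data]

# Owner_No -> OName is consistent with the sample rows.
owner_determines_name = holds(rows, ["Owner_No"], ["OName"])
# Owner_No -> Rent is refuted: owner CO93 has properties at 450 and 375.
owner_determines_rent = holds(rows, ["Owner_No"], ["Rent"])
print(owner_determines_name, owner_determines_rent)
```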
Functional Dependency
Main characteristics of functional dependencies used in
normalization:
– have a 1:1 relationship between an attribute on the
left and right-hand side of a dependency;
– hold for all time;
– are nontrivial.
Functional Dependency
Full Functional Dependency:
When an attribute is functionally dependent on the whole of a composite primary key and not on any proper subset of it.
Partial functional dependency:
When a non-key attribute is functionally dependent on only part of a composite primary key.
Transitive Dependency:
A condition where A, B, C are attributes of a relation such that if A -> B and B ->
C, then C is transitively dependent on A through B (provided that A is not
functionally dependent on B or C).
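The three kinds of dependency can be told apart by comparing each determinant with the primary key. The Python sketch below is a simplification under stated assumptions: the helper classify is hypothetical, and the listed dependencies are assumed to be already minimal and to have determinants that are themselves determined by the key. The dependencies are read off from the Customer_Rental example worked through later in this lecture.

```python
def classify(determinant, primary_key):
    """Classify a dependency by how its determinant relates to the primary key."""
    det, key = frozenset(determinant), frozenset(primary_key)
    if det == key:
        return "full"        # depends on the whole key
    if det < key:
        return "partial"     # depends on a proper subset of the key
    if det.isdisjoint(key):
        return "transitive"  # depends on a non-key attribute
    return "other"           # mixed determinant, not covered by this sketch


primary_key = ("Customer_No", "Property_No")
dependencies = [
    (("Customer_No", "Property_No"), ("RentStart", "RentFinish")),
    (("Customer_No",), ("CName",)),
    (("Property_No",), ("Paddress", "Rent", "Owner_No", "OName")),
    (("Owner_No",), ("OName",)),
]

kinds = [classify(lhs, primary_key) for lhs, _ in dependencies]
for (lhs, rhs), kind in zip(dependencies, kinds):
    print(", ".join(lhs), "->", ", ".join(rhs), ":", kind)
```

The partial dependencies are the ones removed when moving to 2NF, and the transitive dependency is the one removed when moving to 3NF.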
Functional Dependency Diagram
Project_Employee Table
Dependency Diagram
Un-normalized data
Customer_No  CName     Property_No  Paddress               RentStart  RentFinish  Rent  Owner_No  OName
CR76         John Kay  PG4          6, Lawrence St. Ikoyi  1-Jul-04   31-Aug-06   350   CO40      Tina Murphy
                       PG16         5, Novar St, Oshodi    1-Sep-06   1-Sep-08    450   CO93      Tony Shaw
CR56         Fred Wey  PG4          6, Lawrence St. Ikoyi  1-Sep-02   10-Jun-04   350   CO40      Tina Murphy
                       PG36         2, Fred St. Apapa      10-Oct-04  1-Dec-06    375   CO93      Tony Shaw
                       PG16         5, Novar St, Oshodi    1-Jan-06   10-Aug-06   450   CO93      Tony Shaw
First Normal Form (1NF)
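A relation in which the intersection of each row and column contains one and
only one value.
Normalizing UNF to 1NF
– Nominate an attribute or group of attributes to act as the key for the unnormalized
table.
– Identify the repeating group(s) in the unnormalized table that repeat for the key
attribute(s).
– Remove the repeating groups, for example by filling in the key data on every row of a repeating group so that each row holds a single value per column.
A table is in 1NF when:
– All key attributes are defined
– There are no repeating groups in the table
– All attributes are dependent on the primary key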
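The flattening step can be sketched in Python as follows. The helper to_1nf and the field name Rentals are hypothetical. The data is the Customer_Rental example from this lecture, restricted to a few columns to keep the sketch short.

```python
# Unnormalized records: each customer carries a repeating group of rentals.
unnormalized = [
    {"Customer_No": "CR76", "CName": "John Kay", "Rentals": [
        {"Property_No": "PG4", "RentStart": "1-Jul-04", "RentFinish": "31-Aug-06"},
        {"Property_No": "PG16", "RentStart": "1-Sep-06", "RentFinish": "1-Sep-08"},
    ]},
    {"Customer_No": "CR56", "CName": "Fred Wey", "Rentals": [
        {"Property_No": "PG4", "RentStart": "1-Sep-02", "RentFinish": "10-Jun-04"},
        {"Property_No": "PG36", "RentStart": "10-Oct-04", "RentFinish": "1-Dec-06"},
        {"Property_No": "PG16", "RentStart": "1-Jan-06", "RentFinish": "10-Aug-06"},
    ]},
]


def to_1nf(records, group):
    """Emit one flat row per repeating-group item, copying the parent attributes."""
    flat = []
    for record in records:
        parent = {k: v for k, v in record.items() if k != group}
        for item in record[group]:
            flat.append({**parent, **item})
    return flat


flat = to_1nf(unnormalized, "Rentals")
# (Customer_No, Property_No) identifies each flattened row.
keys = [(row["Customer_No"], row["Property_No"]) for row in flat]
print(len(flat), len(set(keys)))
```

Every cell of the result holds a single value, at the cost of repeating the customer name on each of that customer's rows. The later normal forms remove this redundancy.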
Second Normal Form (2NF)
Based on the concept of full functional dependency:
A relation that is in 1NF and in which every non-primary-key attribute is fully
functionally dependent on the primary key (no partial dependencies).
Normalizing 1NF to 2NF
– Identify the primary key for the 1NF relation.
– Identify the functional dependencies in the relation.
– If partial dependencies exist on the primary key, remove them by placing the partially dependent attributes
in a new relation along with a copy of their determinant.
Third Normal Form (3NF)
Based on the concept of transitive dependency:
A relation that is in 1NF and 2NF and in which no non-primary-key
attribute is transitively dependent on the primary key.
Normalizing 2NF to 3NF
– Identify the primary key in the 2NF relations.
– Identify the functional dependencies in the relation.
– If transitive dependencies exist on the primary key, remove them by
placing the transitively dependent attributes in a new relation along with a copy of their determinant.
Boyce–Codd Normal Form (BCNF)
BCNF is based on functional dependencies that take into account all candidate keys in a relation;
however, BCNF also has additional constraints compared with the general definition of 3NF.
Normalizing 3NF to BCNF
A relation is in BCNF if and only if every determinant is a candidate key.
Note: The attribute (or group of attributes) on the left side of the arrow in a functional dependency is called the
determinant.
The difference between 3NF and BCNF is that, for a functional dependency A -> B,
3NF allows this dependency in a relation if B is a primary-key attribute and A is not a
candidate key,
whereas BCNF insists that for this dependency to remain in a relation, A must be a
candidate key.
Boyce–Codd Normal Form (BCNF)
Every relation in BCNF is also in 3NF.
However, a relation in 3NF may not be in BCNF.
Violation of BCNF is quite rare.
– Example
– Student_Advisor (SID, Major, Advisor, Maj_GPA)
– Here (SID, Major) is the composite candidate key that was chosen as the primary key at the start
and used for normalization up to 3NF.
– Suppose there is another candidate key, e.g. (SID, Advisor), and Advisor functionally
determines another attribute: Advisor -> Major, i.e. each advisor advises in only one
major. Advisor is then a determinant that is not a candidate key, so the table has to be broken down further. The relation becomes
– (SID, Advisor, Maj_GPA) and (Advisor, Major)
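The BCNF rule stated earlier (every determinant is a candidate key) can be checked mechanically once the dependencies and candidate keys are listed. The Python sketch below is illustrative only: the helper bcnf_violations is hypothetical, and the dependency (SID, Major) -> Advisor, Maj_GPA is assumed from the fact that (SID, Major) is the primary key.

```python
def bcnf_violations(dependencies, candidate_keys):
    """Return the dependencies whose determinant is not a candidate key."""
    keys = {frozenset(key) for key in candidate_keys}
    return [(lhs, rhs) for lhs, rhs in dependencies if frozenset(lhs) not in keys]


# Student_Advisor (SID, Major, Advisor, Maj_GPA)
student_advisor_fds = [
    (("SID", "Major"), ("Advisor", "Maj_GPA")),
    (("Advisor",), ("Major",)),
]
student_advisor_keys = [("SID", "Major"), ("SID", "Advisor")]

# Advisor is a determinant but not a candidate key, so it is reported.
violations = bcnf_violations(student_advisor_fds, student_advisor_keys)
print(violations)

# After decomposition, (Advisor, Major) has Advisor as its key: no violations.
advisor_major_violations = bcnf_violations([(("Advisor",), ("Major",))], [("Advisor",)])
print(advisor_major_violations)
```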
Example: Dreamed House
Customer Rental Details (Page 1, Date: 7 April 2009)
Customer Name: John Kay    Customer Number: CR76
Table 1: Unnormalized Customer_Rental relation
Customer_No  CName     Property_No  Paddress               RentStart  RentFinish  Rent  Owner_No  OName
CR76         John Kay  PG4          6, Lawrence St. Ikoyi  1-Jul-04   31-Aug-06   350   CO40      Tina Murphy
                       PG16         5, Novar St, Oshodi    1-Sep-06   1-Sep-08    450   CO93      Tony Shaw
CR56         Fred Wey  PG4          6, Lawrence St. Ikoyi  1-Sep-02   10-Jun-04   350   CO40      Tina Murphy
                       PG36         2, Fred St. Apapa      10-Oct-04  1-Dec-06    375   CO93      Tony Shaw
                       PG16         5, Novar St, Oshodi    1-Jan-06   10-Aug-06   450   CO93      Tony Shaw
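Customer_Rental relation in 1NF (repeating groups removed)
Customer_No  Property_No  CName     Paddress               RentStart  RentFinish  Rent  Owner_No  OName
CR76         PG4          John Kay  6, Lawrence St. Ikoyi  1-Jul-04   31-Aug-06   350   CO40      Tina Murphy
CR76         PG16         John Kay  5, Novar St, Oshodi    1-Sep-06   1-Sep-08    450   CO93      Tony Shaw
CR56         PG4          Fred Wey  6, Lawrence St. Ikoyi  1-Sep-02   10-Jun-04   350   CO40      Tina Murphy
CR56         PG36         Fred Wey  2, Fred St. Apapa      10-Oct-04  1-Dec-06    375   CO93      Tony Shaw
CR56         PG16         Fred Wey  5, Novar St, Oshodi    1-Jan-06   10-Aug-06   450   CO93      Tony Shaw
We identify three candidate keys for the Customer_Rental relation, one of which, (Customer_No, Property_No), is chosen as the primary key.
[Dependency diagram for the Customer_Rental relation, with arrows labelled: Primary Key, Partial Dependency (two), Transitive Dependency, Candidate Key (two)]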
Functional dependencies
Functional dependencies of the Customer_Rental relation, with (Customer_No, Property_No) as the primary
key.
To decompose the Customer_Rental relation to 2nd normal form, all the partial dependencies have
to be removed into separate tables,
so that every non-key attribute is fully dependent on the key of its relation.
Customer (Customer_No, CName)
Rental (Customer_No, Property_No, RentStart, RentFinish)
Property_Owner (Property_No, Paddress, Rent, Owner_No, OName)
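The three relations above are projections of the 1NF Customer_Rental relation with duplicate rows dropped. A minimal Python sketch, assuming the 1NF rows from this example and a hypothetical helper project:

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples (first occurrence kept)."""
    seen, result = set(), []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attrs, values)))
    return result


columns = ["Customer_No", "Property_No", "CName", "Paddress", "RentStart",
           "RentFinish", "Rent", "Owner_No", "OName"]
data = [
    ("CR76", "PG4", "John Kay", "6, Lawrence St. Ikoyi", "1-Jul-04", "31-Aug-06", 350, "CO40", "Tina Murphy"),
    ("CR76", "PG16", "John Kay", "5, Novar St, Oshodi", "1-Sep-06", "1-Sep-08", 450, "CO93", "Tony Shaw"),
    ("CR56", "PG4", "Fred Wey", "6, Lawrence St. Ikoyi", "1-Sep-02", "10-Jun-04", 350, "CO40", "Tina Murphy"),
    ("CR56", "PG36", "Fred Wey", "2, Fred St. Apapa", "10-Oct-04", "1-Dec-06", 375, "CO93", "Tony Shaw"),
    ("CR56", "PG16", "Fred Wey", "5, Novar St, Oshodi", "1-Jan-06", "10-Aug-06", 450, "CO93", "Tony Shaw"),
]
customer_rental = [dict(zip(columns, values)) for values in data]

customer = project(customer_rental, ["Customer_No", "CName"])
rental = project(customer_rental, ["Customer_No", "Property_No", "RentStart", "RentFinish"])
property_owner = project(customer_rental, ["Property_No", "Paddress", "Rent", "Owner_No", "OName"])

# Each customer and each property is now described once.
print(len(customer), len(rental), len(property_owner))
```

Each partial dependency has moved to its own relation together with a copy of its determinant (Customer_No or Property_No), which is why the customer names and property details are no longer repeated per rental.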
• To transform the decomposed relations to 3rd normal form, we have to identify any
transitive dependencies that exist in the three decomposed relations.
• Of the 2NF relations above, Customer and Rental are already in 3rd normal form. Property_Owner is not,
because it contains the transitive dependency fd4:
fd4: Owner_No -> OName (transitive dependency)
• So, to decompose Property_Owner to 3rd normal form, the transitively dependent attribute OName
is moved to a separate relation together with a copy of its determinant Owner_No, which becomes the primary key of the new relation. This
gives:
Customer (Customer_No, CName)
Rental (Customer_No, Property_No, RentStart, RentFinish)
Property_for_Rent (Property_No, Paddress, Rent, Owner_No)
Owner (Owner_No, OName)
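The split of the property and owner data can be sketched in Python as below. The rows are taken from the example data; the variable names are illustrative. The final check rejoins the two new relations on Owner_No to confirm that the decomposition loses no information.

```python
# Rows of the 2NF relation:
# (Property_No, Paddress, Rent, Owner_No, OName)
property_owner = [
    ("PG4", "6, Lawrence St. Ikoyi", 350, "CO40", "Tina Murphy"),
    ("PG16", "5, Novar St, Oshodi", 450, "CO93", "Tony Shaw"),
    ("PG36", "2, Fred St. Apapa", 375, "CO93", "Tony Shaw"),
]

# Property_for_Rent keeps Owner_No as a link to the Owner relation.
property_for_rent = [
    (prop, addr, rent, owner_no)
    for prop, addr, rent, owner_no, _ in property_owner
]

# Owner stores each owner name once, keyed by Owner_No.
owner = {}
for _, _, _, owner_no, oname in property_owner:
    owner[owner_no] = oname

# Rejoin on Owner_No to rebuild the original rows.
rejoined = [(p, a, r, o, owner[o]) for p, a, r, o in property_for_rent]
print(owner)
print(rejoined == property_owner)
```

Tony Shaw's name is now stored once instead of once per property, so a change of name needs a single update.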
3rd Normal form decomposed relations
Rental Relation
3rd Normal form decomposed relations
To transform the decomposed relations to Boyce–Codd normal form,
we have to check whether any of the 3NF relations above has a determinant that is not a candidate
key.
In every one of the 3NF relations each determinant is a candidate key, so none of them
violates the Boyce–Codd condition and the relations are already in BCNF.
SELF ASSESSMENT
Exercises
For each of the following ER diagrams:
– Transform the diagram to a relational schema that shows
referential integrity constraints.
– For each relation, determine the functional dependencies.
– If any of the relations are not in 3NF, transform them to 3NF.
Exercises
a.
b.
c.
NEXT
Part 3