
Normalization

Dale-Marie Wilson, Ph.D.

Purpose of Normalization

Definition
A technique for producing a set of suitable relations that support the data requirements of an enterprise
Characteristics of a suitable set of relations:
Minimal number of attributes necessary to support the data requirements of the enterprise
Attributes with a close logical relationship found in the same relation
Minimal redundancy, with each attribute represented only once, with the exception of attributes that form all or part of foreign keys

Purpose of Normalization

Benefits of using a database with a suitable set of relations:
Easier for users to access and maintain data
Takes up minimal storage space on the computer

How Normalization
Supports Database Design

Data Redundancy and Update Anomalies

Major aim of relational database design:
To group attributes into relations to minimize data redundancy

Potential benefits:
Updates to stored data achieved with a minimal number of operations
Reduces opportunities for data inconsistencies
Reduces file storage space required by base relations, thus minimizing costs

Data Redundancy and Update Anomalies

Relations that contain redundant information may potentially suffer from update anomalies

Types of update anomalies:
Insertion
Deletion
Modification
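
The three anomaly types can be illustrated with a small sketch. The rows below are hypothetical sample data (not taken from the slides) for a StaffBranch-style relation that repeats the branch address for every member of staff; the helper name `branch_addresses` is likewise an assumption made for illustration.

```python
# Hypothetical rows: the branch address is stored redundantly for each member of staff.
staff_branch = [
    {"staffNo": "S1", "sName": "Ann", "branchNo": "B1", "bAddress": "1 Main St"},
    {"staffNo": "S2", "sName": "Bob", "branchNo": "B1", "bAddress": "1 Main St"},
    {"staffNo": "S3", "sName": "Cy", "branchNo": "B2", "bAddress": "9 High St"},
]

def branch_addresses(rows):
    """Map each branchNo to the set of addresses recorded for it."""
    seen = {}
    for row in rows:
        seen.setdefault(row["branchNo"], set()).add(row["bAddress"])
    return seen

# Modification anomaly: updating one copy of B1's address leaves the other copy stale.
modified = [dict(r) for r in staff_branch]
modified[0]["bAddress"] = "5 New Rd"
inconsistent = {b for b, addrs in branch_addresses(modified).items() if len(addrs) > 1}

# Deletion anomaly: removing the last member of staff at B2 also loses B2's address.
after_delete = [r for r in staff_branch if r["staffNo"] != "S3"]
lost_branches = set(branch_addresses(staff_branch)) - set(branch_addresses(after_delete))

# Insertion anomaly: a new branch with no staff yet has no staffNo to key the row on.
new_branch_row = {"staffNo": None, "sName": None, "branchNo": "B3", "bAddress": "2 Low St"}
insertion_blocked = new_branch_row["staffNo"] is None

print(inconsistent, lost_branches, insertion_blocked)
```

Storing branch details once, in a relation of their own, would remove all three problems in this sketch.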

Lossless-join and Dependency Preservation Properties

Two important properties of decomposition:
Lossless-join property
Ability to find any instance of the original relation from corresponding instances in the smaller relations
Dependency preservation property
Ability to enforce a constraint on the original relation by enforcing some constraint on each of the smaller relations
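
A rough way to see the lossless-join property is to project a relation onto two attribute sets, join the projections back together, and compare the result with the original. The sketch below does this for one hypothetical instance only; the property itself must hold for every legal instance, so a check like this can refute a decomposition but cannot prove it. The rows and the function names `project`, `natural_join`, and `is_lossless` are assumptions for illustration.

```python
def project(rows, attrs):
    """Project rows onto attrs, removing duplicate tuples."""
    return {tuple((a, r[a]) for a in attrs) for r in rows}

def natural_join(left, right):
    """Natural join of two sets of tuples made of (attribute, value) pairs."""
    joined = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            common = ld.keys() & rd.keys()
            if all(ld[a] == rd[a] for a in common):
                joined.add(tuple(sorted({**ld, **rd}.items())))
    return joined

def is_lossless(rows, attrs1, attrs2):
    """True if joining the two projections reproduces exactly the original rows."""
    original = {tuple(sorted(r.items())) for r in rows}
    return natural_join(project(rows, attrs1), project(rows, attrs2)) == original

# Hypothetical instance of a StaffBranch-style relation.
rows = [
    {"staffNo": "S1", "position": "Manager", "branchNo": "B1", "bAddress": "1 Main St"},
    {"staffNo": "S2", "position": "Assistant", "branchNo": "B1", "bAddress": "1 Main St"},
    {"staffNo": "S3", "position": "Assistant", "branchNo": "B2", "bAddress": "9 High St"},
]

# Splitting on branchNo, which determines bAddress, loses nothing.
good = is_lossless(rows, ["staffNo", "position", "branchNo"], ["branchNo", "bAddress"])

# Splitting on position creates spurious tuples when the parts are joined back.
bad = is_lossless(rows, ["staffNo", "position"], ["position", "branchNo", "bAddress"])

print(good, bad)
```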

Functional Dependencies

Determinant refers to the attribute or group of attributes on the left-hand side of the arrow

Functional Dependency
Example

Consider the values shown in the staffNo and sName attributes of the Staff relation

Based on sample data, the following functional dependencies appear to hold:
staffNo → sName
sName → staffNo

Functional Dependencies

The only functional dependency that remains true for all possible values of the staffNo and sName attributes of the Staff relation is:

staffNo → sName
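
The difference between a dependency that appears to hold in sample data and one that holds for all time can be sketched as follows. The function `fd_holds` and the staff rows are hypothetical, introduced only for illustration; the check can show that a dependency is violated by some data, but never that it is guaranteed.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical sample data in which every name happens to be unique.
sample = [
    {"staffNo": "S1", "sName": "Ann"},
    {"staffNo": "S2", "sName": "Bob"},
]
both_hold = fd_holds(sample, ["staffNo"], ["sName"]) and fd_holds(sample, ["sName"], ["staffNo"])

# A second member of staff called Ann is a perfectly legal future value.
extended = sample + [{"staffNo": "S3", "sName": "Ann"}]
still_holds = fd_holds(extended, ["staffNo"], ["sName"])
name_determines_number = fd_holds(extended, ["sName"], ["staffNo"])

print(both_hold, still_holds, name_determines_number)
```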

Functional Dependencies

Characteristics:
Full functional dependency
Determinants should have the minimal number of attributes necessary to maintain the functional dependency with the attribute(s) on the right-hand side
If A and B are attributes of a relation, B is fully functionally dependent on A if B is functionally dependent on A, but not on any proper subset of A

Functional Dependency
Example

staffNo, sName → branchNo

True: each value of (staffNo, sName) is associated with a single value of branchNo

branchNo is also functionally dependent on a subset of (staffNo, sName), namely staffNo
This is a partial dependency
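
Whether a dependency is full or partial can be tested, against a given set of rows, by trying every proper subset of the determinant. The sketch below assumes hypothetical sample rows and the hypothetical helpers `fd_holds` and `partial_determinants`.

```python
from itertools import combinations

def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

def partial_determinants(rows, lhs, rhs):
    """Proper, non-empty subsets of lhs that still determine rhs in these rows."""
    found = []
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if fd_holds(rows, subset, rhs):
                found.append(subset)
    return found

# Hypothetical rows: two members of staff share a name but work at different branches.
rows = [
    {"staffNo": "S1", "sName": "Ann", "branchNo": "B1"},
    {"staffNo": "S2", "sName": "Bob", "branchNo": "B1"},
    {"staffNo": "S3", "sName": "Ann", "branchNo": "B2"},
]

composite_holds = fd_holds(rows, ["staffNo", "sName"], ["branchNo"])
partial = partial_determinants(rows, ["staffNo", "sName"], ["branchNo"])

print(composite_holds, partial)
```

A non-empty result means the dependency on the composite determinant is partial rather than full.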

Functional Dependencies

Characteristics:
One-to-one relationship between the attribute(s) on the left-hand side (determinant) and those on the right-hand side of a functional dependency
Holds for all time
Determinant has the minimal number of attributes necessary to maintain the dependency with the attribute(s) on the right-hand side

Transitive Dependencies

Existence of a transitive dependency can potentially cause update anomalies

A condition where A, B, and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C)

Transitive Dependency
Example

staffNo → sName, position, salary, branchNo, bAddress
branchNo → bAddress

The transitive dependency branchNo → bAddress exists on staffNo via branchNo
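
Given a set of functional dependencies, transitive dependencies on a key can be found mechanically. The sketch below uses the two dependencies from this example; the functions `closure` and `transitive_dependencies` are hypothetical helpers, and the check is a simplified reading of the definition above rather than a complete algorithm.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def transitive_dependencies(key, fds):
    """Return (via, attr) pairs where key -> via -> attr and via does not determine key."""
    key = set(key)
    reachable = closure(key, fds)
    found = set()
    for via, rhs in fds:
        if via <= key or not via <= reachable:
            continue  # the key itself, part of the key, or not determined by the key
        if key <= closure(via, fds):
            continue  # via determines the key, so the dependency is not transitive
        for attr in rhs - via - key:
            found.add((tuple(sorted(via)), attr))
    return found

fds = [
    ({"staffNo"}, {"sName", "position", "salary", "branchNo", "bAddress"}),
    ({"branchNo"}, {"bAddress"}),
]
found = transitive_dependencies({"staffNo"}, fds)
print(found)
```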

Process of Normalization
Formal technique for analyzing a relation based on its primary key and the functional dependencies between the attributes of that relation
Executed as a series of steps
Each step corresponds to a specific normal form with known properties

Identifying Functional
Dependencies

Meaning of each attribute and the relationships between attributes must be well understood
Provided by the enterprise in the form of discussions with users and/or documentation, e.g. the users' requirements specification
If users are unavailable and/or documentation is incomplete
Database designer uses common sense and/or experience to provide the missing information

Identifying Functional
Dependencies Example

In the StaffBranch relation, assume that position held and branch determine a member of staff's salary
Identify functional dependencies for the StaffBranch relation as:
staffNo → sName, position, salary, branchNo, bAddress
branchNo → bAddress
bAddress → branchNo
branchNo, position → salary
bAddress, position → salary

Using Sample Data to Identify FD Example

Consider data for attributes denoted A, B, C, D, and E in the Sample relation

Important:
Must establish that the sample data values shown in the relation are representative of all possible values that can be held by attributes A, B, C, D, and E

Identifying the Primary Key

Main purpose of identifying the set of functional dependencies for a relation:
To specify the set of integrity constraints that must hold on the relation

Integrity constraint to consider first:
Identification of candidate keys
Choice of primary key

Example - Identify Primary Key for StaffBranch Relation

StaffBranch relation has five functional dependencies

Determinants are staffNo, branchNo, bAddress, (branchNo, position), and (bAddress, position)

To identify all candidate key(s):
Identify the attribute (or group of attributes) that uniquely identifies each tuple in the relation

Example - Identify Primary Key for StaffBranch Relation

All attributes not part of a candidate key should be functionally dependent on the key

For StaffBranch relation:
One candidate key, staffNo => one primary key, staffNo
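
The search for candidate keys can be sketched with attribute closure: a set of attributes is a candidate key if its closure covers the whole relation and no smaller key is contained in it. The code below applies this to the five StaffBranch dependencies; `closure` and `candidate_keys` are hypothetical helper names, and the brute-force search is only practical for small relations.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attributes):
                keys.append(frozenset(candidate))
    return keys

attributes = {"staffNo", "sName", "position", "salary", "branchNo", "bAddress"}
fds = [
    ({"staffNo"}, {"sName", "position", "salary", "branchNo", "bAddress"}),
    ({"branchNo"}, {"bAddress"}),
    ({"bAddress"}, {"branchNo"}),
    ({"branchNo", "position"}, {"salary"}),
    ({"bAddress", "position"}, {"salary"}),
]
keys = candidate_keys(attributes, fds)
print(keys)
```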

Example - Identify Primary Key for Sample Relation

Sample relation has four functional dependencies
Determinants are A, B, C, and (A, B)
One determinant, (A, B), functionally determines all other attributes of the relation
(A, B) identified as primary key

Normalization Process

As normalization proceeds:
Relations become progressively more restricted (stronger) in format
Relations become less vulnerable to update anomalies

Normal Forms

Unnormalized form (UNF)
Table that contains one or more repeating groups

To create an unnormalized table:
Transform data from the information source (e.g. form) into table format with columns and rows

First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value

Normal Forms

UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table
Identify the repeating group(s) in the unnormalized table that repeat for the key attribute(s)
Remove the repeating group by either:
Entering appropriate data into the empty columns of rows containing repeating data (flattening the table), or
Placing the repeating data, along with a copy of the original key attribute(s), into a separate relation
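
Both ways of removing a repeating group can be sketched in a few lines. The unnormalized rows below are hypothetical (a property with a nested list of inspections), as are the function names `flatten` and `split_out`.

```python
def flatten(unf_rows, group_attr):
    """Approach 1: copy the non-repeating values into one row per group entry."""
    flat = []
    for row in unf_rows:
        fixed = {a: v for a, v in row.items() if a != group_attr}
        for entry in row[group_attr]:
            flat.append({**fixed, **entry})
    return flat

def split_out(unf_rows, key_attrs, group_attr):
    """Approach 2: move the repeating group, plus a copy of the key, to its own relation."""
    parent = [{a: v for a, v in row.items() if a != group_attr} for row in unf_rows]
    child = [
        {**{a: row[a] for a in key_attrs}, **entry}
        for row in unf_rows
        for entry in row[group_attr]
    ]
    return parent, child

# Hypothetical unnormalized table: "inspections" is the repeating group.
unf = [
    {"propertyNo": "P1", "pAddress": "1 Main St", "inspections": [
        {"iDate": "2024-01-05", "staffNo": "S1"},
        {"iDate": "2024-03-09", "staffNo": "S2"},
    ]},
    {"propertyNo": "P2", "pAddress": "9 High St", "inspections": [
        {"iDate": "2024-02-01", "staffNo": "S1"},
    ]},
]

flat = flatten(unf, "inspections")
parent, child = split_out(unf, ["propertyNo"], "inspections")
print(len(flat), len(parent), len(child))
```

Flattening keeps everything in one relation at the cost of repeating pAddress; splitting avoids that repetition from the start.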

Normal Forms

Second Normal Form (2NF)
A relation that is in 1NF and in which every non-primary-key attribute is fully functionally dependent on the primary key

Based on the concept of full functional dependency

Full functional dependency indicates that if A and B are attributes of a relation, B is fully dependent on A if B is functionally dependent on A but not on any proper subset of A

Normal Forms

1NF to 2NF
Identify the primary key for the 1NF relation
Identify the functional dependencies in the relation
If partial dependencies exist on the primary key
Remove them by placing them in a new relation along with a copy of their determinant
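
The steps above can be sketched as a small function. The attribute list is an assumption, reconstructed from the PropertyInspection example used later in these slides, with (propertyNo, iDate) taken as the primary key; `remove_partial_dependencies` is a hypothetical helper, not a standard routine.

```python
def remove_partial_dependencies(attributes, primary_key, fds):
    """Move attributes that depend on only part of the primary key into new relations.

    fds is a list of (lhs, rhs) set pairs. Returns (remaining, new_relations).
    """
    pk = set(primary_key)
    remaining = list(attributes)
    new_relations = []
    for lhs, rhs in fds:
        lhs = set(lhs)
        if lhs < pk:  # determinant is a proper subset of the key: partial dependency
            moved = [a for a in remaining if a in rhs and a not in pk]
            if moved:
                # New relation: copy of the determinant plus the dependent attributes.
                new_relations.append([a for a in attributes if a in lhs] + moved)
                remaining = [a for a in remaining if a not in moved]
    return remaining, new_relations

# Assumed 1NF relation and dependencies for the property inspection example.
attributes = ["propertyNo", "iDate", "iTime", "pAddress", "comments", "staffNo", "sName", "carReg"]
primary_key = ["propertyNo", "iDate"]
fds = [
    ({"propertyNo", "iDate"}, {"iTime", "comments", "staffNo", "sName", "carReg"}),
    ({"propertyNo"}, {"pAddress"}),
    ({"staffNo"}, {"sName"}),
]

remaining, new_relations = remove_partial_dependencies(attributes, primary_key, fds)
print(remaining, new_relations)
```

Only pAddress moves, because propertyNo is the only determinant that is part of the key; staffNo → sName is left for the next step.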

Normal Forms

Third Normal Form (3NF)
A relation that is in 1NF and 2NF and in which no non-primary-key attribute is transitively dependent on the primary key

Based on the concept of transitive dependency

Transitive dependency is a condition where A, B, and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A through B (provided that A is not functionally dependent on B or C)

Normal Forms

2NF to 3NF
Identify the primary key in the 2NF relation
Identify the functional dependencies in the relation
If transitive dependencies exist on the primary key
Remove them by placing them in a new relation along with a copy of their determinant
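
A minimal sketch of this step, under a simplifying assumption: a dependency is treated as transitive only when its determinant consists entirely of non-key attributes of the relation. The relation and dependencies are those of the property inspection example; `remove_transitive_dependencies` is a hypothetical helper.

```python
def remove_transitive_dependencies(attributes, primary_key, fds):
    """Move attributes determined by a non-key determinant into new relations.

    fds is a list of (lhs, rhs) set pairs. Returns (remaining, new_relations).
    """
    pk = set(primary_key)
    remaining = list(attributes)
    new_relations = []
    for lhs, rhs in fds:
        lhs = set(lhs)
        # Simplification: skip determinants that touch the key or lie outside the relation.
        if lhs & pk or not lhs <= set(remaining):
            continue
        moved = [a for a in remaining if a in rhs and a not in lhs and a not in pk]
        if moved:
            # New relation: copy of the determinant plus the dependent attributes.
            new_relations.append([a for a in attributes if a in lhs] + moved)
            remaining = [a for a in remaining if a not in moved]
    return remaining, new_relations

attributes = ["propertyNo", "iDate", "iTime", "comments", "staffNo", "sName", "carReg"]
primary_key = ["propertyNo", "iDate"]
fds = [
    ({"propertyNo", "iDate"}, {"iTime", "comments", "staffNo", "sName", "carReg"}),
    ({"staffNo"}, {"sName"}),
]

remaining, new_relations = remove_transitive_dependencies(attributes, primary_key, fds)
print(remaining, new_relations)
```

The determinant staffNo stays behind in the original relation, where it acts as a foreign key to the new Staff(staffNo, sName) relation.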

Normal Forms

Boyce-Codd Normal Form (BCNF)
A relation is in BCNF if and only if every determinant is a candidate key

Based on functional dependencies
Takes into account all candidate keys in a relation
Adds constraints compared with 3NF

Normal Forms
Every relation in BCNF is also in 3NF
A relation in 3NF is not necessarily in BCNF
Violation of BCNF is rare
Violations of BCNF may occur when:
Relation contains two (or more) composite candidate keys
Candidate keys overlap, i.e. have at least one attribute in common

Normalization Example
2NF
PropertyInspection(propertyNo, iDate, iTime, comments,
staffNo, sName, carReg)
Property(propertyNo, pAddress)

3NF
PropertyInspect(propertyNo, iDate, iTime, comments, staffNo, carReg)
Property(propertyNo, pAddress)
Staff(staffNo, sName)

Normalization Example
BCNF
Property Relation
fd2: propertyNo → pAddress

Staff Relation
fd3: staffNo → sName

PropertyInspect Relation
fd1: propertyNo, iDate → iTime, comments, staffNo, carReg
fd4: staffNo, iDate → carReg
fd5: carReg, iDate, iTime → propertyNo, comments, staffNo
fd6: staffNo, iDate, iTime → propertyNo, comments
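
The BCNF test can be sketched for the PropertyInspect relation by computing the closure of each determinant. The code checks whether each determinant is a superkey (its closure covers every attribute), which is the usual way to test the "every determinant is a candidate key" condition; `closure` and `bcnf_violations` are hypothetical helper names.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(attributes, named_fds):
    """Names of dependencies whose determinant does not determine the whole relation."""
    fds = list(named_fds.values())
    everything = set(attributes)
    return [name for name, (lhs, _) in named_fds.items() if closure(lhs, fds) != everything]

attributes = {"propertyNo", "iDate", "iTime", "comments", "staffNo", "carReg"}
named_fds = {
    "fd1": ({"propertyNo", "iDate"}, {"iTime", "comments", "staffNo", "carReg"}),
    "fd4": ({"staffNo", "iDate"}, {"carReg"}),
    "fd5": ({"carReg", "iDate", "iTime"}, {"propertyNo", "comments", "staffNo"}),
    "fd6": ({"staffNo", "iDate", "iTime"}, {"propertyNo", "comments"}),
}

violations = bcnf_violations(attributes, named_fds)
print(violations)
```

Only fd4 fails the test: (staffNo, iDate) determines carReg but not the rest of the relation, so carReg is the attribute that has to move to a relation of its own.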

Normalization Example

BCNF

StaffCar(staffNo, iDate, carReg)
Inspection(propertyNo, iDate, iTime, comments, staffNo)
Property(propertyNo, pAddress)
Staff(staffNo, sName)

Loss of fd:
carReg, iDate, iTime → propertyNo, pAddress, comments, staffNo, sName

If the decomposition is not performed, PropertyInspect retains data redundancy:
PropertyInspection(propertyNo, iDate, iTime, comments, staffNo, sName, carReg)

Chapters covered:
Chapters 13 and 14.1-14.3
