Chapter 6 - Normalization of Database Tables

You might also like

Download as pptx, pdf, or txt
Download as pptx, pdf, or txt
You are on page 1of 23

Chapter 6 -Normalization of Database Tables

Learning Objectives
• After completing this chapter, you will be able to:
• Explain normalization and its role in the database design process
• Identify and describe each of the normal forms: 1NF, 2NF, 3NF, BCNF, and 4NF
• Explain how normal forms can be transformed from lower normal forms to
higher normal forms
• Apply normalization rules to evaluate and correct table structures
• Identify situations that require denormalization to generate information
efficiently
• Use a data-modeling checklist to check that the ERD meets a set of minimum
requirements
Database Tables and Normalization
• Normalization is process that evaluating and correcting table structures to minimize
data redundancies
• Normal forms
• First normal form (1NF)
• Second normal form (2NF)
• Third normal form (3NF)
• Do we need more after the 3rd normalization ?
• Structural point of view of normal forms
• Higher normal forms are better than lower normal forms !!
• Properly designed 3NF structures meet the requirement of fourth normal form (4NF)
• Denormalization: produces a lower normal form
• Results in increased performance and greater data redundancy
• Normalization process are same for both new database or modifying existing one.
The Need for Normalization
• Reduce and eliminate redundancy.
• Helps with large and complex entities
• Assigns attributes to tables based on determination.
• Analyzes the relationship among the attributes within each entity.
• Helps validate business rules.
• Determines if the structure can be improved through normalization.
• Improves the existing data structure and creates an appropriate database design.
The Normalization Process
General Rules

• Objective is to ensure that each table conforms to the concept of well-


formed relations
1. Each table represents a single subject
2. Each row/column intersection contains only one value and not a group of values
3. No data item will be unnecessarily stored in more than one table
4. All nonprime attributes in a table are dependent on the primary key
5. Each table has no insertion, update, or deletion anomalies
The Normalization Process
Identify functional dependencies
• Recall functional dependencies
• Value of one or more attributes determines value of one or more attributes

• There are two types of functional dependencies


• Partial dependency: occurs when one or more non key attribute are depending on a part of the primary ( only happens when PK is
composite key )
Example:
Table :Seller(Id, Product, Price)
Candidate Key : Id, Product and Non prime attribute : Price
Price attribute only depends on only Product attribute which is a subset of candidate key, Not the whole candidate key(Id, Product) key .
It is called partial dependency.
So we can say that Product->Price is partial dependency.

• Transitive dependency: non-prime attribute is dependent on another non-prime attribute that is not part of the primary key and more
difficult to identify among a set of data

Example : Table : student(#Id, Name, Age, PostCode, City)


Candidate Key : Id and PostCode ; Age, Name and City are non prime attribute.
“Id” attribute determines PostCode and PostCode functionally determines the City attribute. So transitive functional dependency is Id->City
The Normalization Process
The Normalization Process (1 of 9)
Process steps

• Table format
• No repeating column
1NF • Determine PK

• 1NF
• No partial dependencies
2NF

• 2NF
• No transitive dependencies
3NF
1NF

1. Each column must have a unique name.


2. The order of the rows and columns doesn’t matter.
3. Each column must have a single data type.
4. No two rows can contain identical values.
5. Each column must contain a single value.
6. Columns cannot contain repeating groups.
Un-Normalized Form (UNF) (2 of
9)

Data gathering from business rules ;


Example:
• Each student has a student ID
• Student can take 18 credit class per semester
• Each student must have name and last name
• Each teaching module has unit code and name
• …….
Conversion to First Normal Form (1NF) (3 of 9)
Create flat table from UNF

• Three step procedure


1. Eliminate the repeating
groups
2. Identify the primary key
3. Identify all dependencies

Some tables contain partial dependencies


Update, insertion, or deletion
Dependency Diagram (4 of 9)
After the created flat file or tabular form from UNF look into header of file. Use different color and underline the PK(s)
and explore partial and transitive dependencies with lines.
Conversion to Second Normal Form (2NF) (5 of 9)
• Conversion to 2NF occurs only when the 1NF has a composite primary key
• If the 1NF has a single-attribute primary key, then the table is automatically in 2NF
• The 1NF-to-2NF conversion is simple
• Make new tables to eliminate partial dependencies
• Reassign corresponding dependent attributes
• Table is in 2NF when it:
• Is in 1NF
• Includes no partial dependencies
Conversion to Second Normal Form (2NF) (6 of 9)
• Data anomalies ( delete, insert ,update) happens if
Get 1NF and explore to the 2NF is not applied .
partial dependencies • All non-key attributes must be depend on primary key.
• Each new table should hold only one kind of
information.

2NF

• Two step procedure


1. Get 1NF and explore ;
Is there composite PK or partial dependencies ?
Yes : Second table what we created with StudentID
+UnitCode is composite PK
2.Then all partial key dependencies must be removed
No: skip to 3NF
Conversion to Third Normal Form (3NF) (7 of 9)

• Make new tables to eliminate transitive dependencies


• Reassign corresponding dependent attributes
• Table is in 3NF when it:
• Is in 2NF
• Contains no transitive dependencies
Conversion to Third Normal Form (3NF) (8 of 9)
Get 2NF and explore to
Transitive dependencies

3NF 2N

• Two step procedure


1. Get 2NF and explore ;
Is there transitive dependencies ?
The Boyce-Codd Normal Form(BCNF) (1 of
3)
• Considered to be a special case of 3NF because,
• Every determinant in the table should be a candidate key
• Candidate key: same characteristics as primary key but not chosen to be the
primary key
• BCNF=3NF when the table contains only one candidate key
• Non-prime attribute cannot derive to prime attribute
• Focus PK and other possible candidate key(s)
• Remember when there is no transitive dependency 3NF was equal 2NF and when
there is no partial dependency 2NF was equal to 1NF
BCNF Example (2 of 3) PK StudentID , Subject
This table satisfies;
1st Normal form ;
StudentID subject professor
• All the values are atomic, 101 Java Tom
• Column names are unique and 101 C++ Brandon
• All the values stored in a particular column are of same domain.
2nd Normal Form there is no Partial Dependency. 102 Java Engin
3rd Normal Form there is no Transitive Dependency, 103 C# Lisa
104 Java Tom
 But this table is not in Boyce-Codd Normal Form.

BCNF rule :Non-prime attribute cannot derive to prime attribute

in the table above, studentID, subject form primary key, which means subject column is a prime
attribute.

But, there is one more dependency, professor → subject.

And while subject is a prime attribute, professor is a non-prime attribute, which is not allowed by BCNF.
BCNF Example (3 of 3)
To make this relation(table) satisfy BCNF,
we will split this table into 3 tables,
student table and professor table and subject table.

Please be aware of new PKs!!!

• Subject is no longer a prime attribute, professor is a prime attribute,


• And there is no non-prime attribute derive prime attributes and this solution meets BCNF.

Student Professor Subject


Student PID PID profes subjec SubID subject
ID sor t
a Java
101 1 1 Tom a
b C++
101 2 2 Brando b
n c C#
102 3
3 Engin a
103 4
4 Lisa c
104 1
Fourth Normal Form (4NF) (1 of 2)
• Rules
• All attributes must be dependent on the primary key, but
they must be independent of each other
• No row may contain two or more multivalued facts
about an entity
• Table is in 4NF = 3NF when it has no multivalued 1NF SAME
dependencies DOMAIN
• Multivalued dependency : one key determines multiple
values of two other attributes are independent of each
other
Costumer Adress
Costumer Phone Code
Code
C100 128,1st Street
C100 0777111222
C100 200, 2nd Street
C100 0777111222
C101 0555333322
C101 128,1st Street 4NF issue with two different
C102 0432329923
C102 120,3rd Street Phone and Address tables
Denormalization and Why we need it ?
• Performance and quick reporting is major problem today’s DB and DBAs
• Think major telecomm companies billing reports or
• Large Finance company need their SOC repots in a week ?
• Design goals with normalization
• Creation of normalized relations
• Processing requirements and speed
• When we need Denormalization database tables expands
• Tables are decomposed to conform to normalization requirements
• Joining a larger number of tables takes additional input/output (I/O) operations and processing logic

Simply -> Time –Space and relatively Cost trade off and it is very common in IT
world specially for reporting solution with temporary view tables (virtual
table)
What happens if I don’t normalize the
DB ?
• Defects in un-normalized tables
• Data updates are less efficient because tables are larger
• Application performance may suffer
• Indexing is more cumbersome
• No simple strategies for creating virtual tables known as views
Questions ?

You might also like