Normalization

Introduction to Relational Model
TERMINOLOGY
Normalization: It is a design technique that is widely used as a guide in designing relational databases. Its
theory is based on the concept of normal forms. Normal forms are sets of rules describing what we
should or should not do in our tables. A relational table is said to be in a particular normal form if it
satisfies a certain set of constraints.
Normalization Process: It consists of breaking tables into smaller tables that form a better design. In other
words, it is the process of decomposing relations with anomalies to produce smaller, well-structured relations.
Well-Structured Relation: It is a relation that contains minimal data redundancy and allows users to insert,
delete, and update rows without causing data inconsistencies.
The goal of normalization is to create a set of relational tables that are free of redundant data and that can be
consistently and correctly modified, and to avoid anomalies. This means that all tables in a relational
database should be in the third normal form (3NF). A relational table is in 3NF if and only if all non-key
columns are: (a) mutually independent; and (b) fully dependent upon the primary key.
Mutual independence means that no non-key column is dependent upon any combination of the other columns.
The first two normal forms are intermediate steps to achieve the goal of having all tables in 3NF.
Functional Dependency: It is a constraint between two sets of attributes in a relation. A functional dependency
A1, A2, ..., An → B1, B2, ..., Bn says that if two tuples have the same values for attributes A1, A2, ..., An,
then those two tuples must also have the same values for attributes B1, B2, ..., Bn.
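The definition above can be checked mechanically against sample data. The following is a minimal sketch, not part of the original handout: the function name fd_holds and the sample rows are hypothetical, and the check only shows whether a dependency holds in the given rows, not whether it holds for the relation in general.

```python
from typing import Iterable, Mapping, Sequence


def fd_holds(rows: Iterable[Mapping[str, object]],
             lhs: Sequence[str],
             rhs: Sequence[str]) -> bool:
    """Return True if the dependency lhs -> rhs holds in the given rows."""
    seen = {}  # maps each combination of lhs values to the rhs values first seen with it
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[b] for b in rhs)
        if seen.setdefault(key, value) != value:
            return False  # two tuples agree on lhs but differ on rhs
    return True


# Hypothetical sample rows (assumed for illustration only).
employees = [
    {"employeeID": 1, "name": "Ana", "skill": "SQL"},
    {"employeeID": 1, "name": "Ana", "skill": "Java"},
    {"employeeID": 2, "name": "Ben", "skill": "SQL"},
]

print(fd_holds(employees, ["employeeID"], ["name"]))   # True
print(fd_holds(employees, ["employeeID"], ["skill"]))  # False
```

Here employeeID determines name, but it does not determine skill, because employee 1 appears with two different skills.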
Types of Anomalies
Insertion anomalies: This means that some data cannot be inserted into the database. Likewise, adding new
rows forces the user to create duplicate data.
Update/Modification anomalies: If data items are scattered and are not linked to each other properly, then it
could lead to strange situations. For example, when we try to update one data item having its copies
scattered over several places, a few instances get updated properly while a few others are left with old
values. Such instances leave the database in an inconsistent state.
Deletion anomalies: We try to delete a record, but parts of it are left undeleted because, unknown to us, the
data is also saved somewhere else. In addition, deleting some data can cause other information to be lost.
Likewise, deleting rows may cause a loss of data that would be needed for other future rows.
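The update and deletion anomalies described above can be reproduced on a small redundant table. This is an illustrative sketch using Python's built-in sqlite3 module; the table name employee_skill and all of its rows are assumptions made for the example, not data from the handout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_skill (employeeID INTEGER, name TEXT, job TEXT, skill TEXT)"
)
# Hypothetical rows: the employee details are repeated once per skill.
conn.executemany(
    "INSERT INTO employee_skill VALUES (?, ?, ?, ?)",
    [
        (1, "Ana", "Programmer", "SQL"),
        (1, "Ana", "Programmer", "Java"),
        (2, "Ben", "Analyst", "SQL"),
    ],
)

# Update anomaly: only one of Ana's two rows is changed.
conn.execute(
    "UPDATE employee_skill SET job = 'Manager' WHERE employeeID = 1 AND skill = 'SQL'"
)
jobs_for_ana = {
    row[0]
    for row in conn.execute("SELECT job FROM employee_skill WHERE employeeID = 1")
}
print(jobs_for_ana)  # the same employee now has two different jobs

# Deletion anomaly: removing Ben's only skill row also removes every fact about Ben.
conn.execute("DELETE FROM employee_skill WHERE employeeID = 2 AND skill = 'SQL'")
ben_rows = conn.execute(
    "SELECT COUNT(*) FROM employee_skill WHERE employeeID = 2"
).fetchone()[0]
print(ben_rows)  # 0
```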
FIRST NORMAL FORM
o The first normal form, sometimes called 1NF, states that each attribute or column value must be atomic.
o That is, each attribute must contain a single value, not a set of values or another database row.
o This schema design is not in first normal form because it contains sets of values in the skill column.
o To put this schema in first normal form, we need to turn the values in the skill column into atomic values.
The first, and perhaps most obvious, way is:
All Values are Atomic
o Here we have made one row per skill. This schema is now in first normal form.
o But this arrangement is far from ideal because, for each skill-employee combination, we repeat all the employee
details; thus we have a great deal of redundancy.
o A better solution, and the right way to put this data into first normal form, is:
In this example, we have split the skills off from the employee to form a separate table that only links employee ids and
individual skills. This gets rid of the redundancy problem.
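The split described above can be sketched in a few lines of Python. The rows below are hypothetical stand-ins for the figures that did not survive in this copy of the handout; the comma-separated skill string is an assumed way of representing the non-atomic column.

```python
# Hypothetical non-1NF data: the skill column holds a set of values.
unnormalized = [
    {"employeeID": 1, "name": "Ana", "job": "Programmer", "departmentID": 10,
     "skill": "SQL, Java"},
    {"employeeID": 2, "name": "Ben", "job": "Analyst", "departmentID": 20,
     "skill": "SQL"},
]

employee = []         # one row per employee, without the skill column
employee_skills = []  # one row per (employeeID, skill) pair, all values atomic

for row in unnormalized:
    employee.append({k: v for k, v in row.items() if k != "skill"})
    for skill in row["skill"].split(","):
        employee_skills.append(
            {"employeeID": row["employeeID"], "skill": skill.strip()}
        )

for pair in employee_skills:
    print(pair)
```

Each employee's details now appear once, and each skill is a single atomic value linked back by employeeID.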
SECOND NORMAL FORM
o A schema is said to be in second normal form (also called 2NF) if all the non-key attributes are fully functionally
dependent on the primary key, and the schema is already in first normal form. The term functional dependence
can be defined most easily this way: the attribute B is functionally dependent on attribute A if, knowing the value of
attribute A, you can determine the value of attribute B.
Example: Stud_Num → Stud_name, Course, Address
Consider this schema/table:
Employee (employeeID, name, job, departmentID, skill)
o This schema is in first normal form, but it is not in second normal form. Why not?
Answer: Because not all the attributes are fully functionally dependent on the primary key. Note that the
primary key of the table is a combination of the employeeID and skill.
Here are the functional dependencies found in the table:
employeeID, skill → name, job, departmentID
but we also have,
employeeID → name, job, departmentID
Based on these, we can determine the name, job, and departmentID from the employeeID alone. This means
that the other attributes are only partially functionally dependent on the primary key, NOT fully functionally
dependent on the primary key. Hence, this schema is not in second normal form.
How can we put it into second normal form?
We need to decompose the table into tables in which all the non-key attributes are fully functionally dependent
on the primary key. It is fairly obvious that we can achieve this by breaking the table into two tables:
Employee(employeeID,name,job,departmentID)
EmployeeSkills(employeeID,skill)
As already discussed, this schema is in first normal form because the values are all atomic. It is also in second
normal form because each non-key attribute is now fully functionally dependent on all parts of the primary key.
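As an illustration, the two-table decomposition above could be declared as follows with Python's built-in sqlite3 module. The column types and the sample rows are assumptions; the handout gives only the table and column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE Employee (
    employeeID   INTEGER PRIMARY KEY,
    name         TEXT,
    job          TEXT,
    departmentID INTEGER
);
CREATE TABLE EmployeeSkills (
    employeeID INTEGER REFERENCES Employee(employeeID),
    skill      TEXT,
    PRIMARY KEY (employeeID, skill)  -- the key is the whole row
);
""")

# Hypothetical sample data: the employee details are stored exactly once.
conn.execute("INSERT INTO Employee VALUES (1, 'Ana', 'Programmer', 10)")
conn.executemany("INSERT INTO EmployeeSkills VALUES (?, ?)", [(1, "SQL"), (1, "Java")])

# A join rebuilds the original one-row-per-skill view when it is needed.
rows = conn.execute("""
    SELECT e.employeeID, e.name, e.job, e.departmentID, s.skill
    FROM Employee e
    JOIN EmployeeSkills s ON s.employeeID = e.employeeID
    ORDER BY s.skill
""").fetchall()
print(rows)
```

The join shows that no information is lost by the decomposition: the combined rows can always be recomputed from the two smaller tables.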
THIRD NORMAL FORM
You may sometimes hear the saying "Normalization is about the key, the whole key, and nothing but the key." Second
normal form tells us that attributes must depend on the whole key. Third normal form tells us that all attributes must
depend on nothing but the key.
Formally, for a schema to be in third normal form (3NF), we must remove all transitive dependencies, and the schema
must already be in second normal form. What's a transitive dependency?
Consider this schema:
employeeDepartment(employeeID, name, job, departmentID, departmentName)
This schema contains the following functional dependencies:
employeeID → name, job, departmentID, departmentName
departmentID → departmentName
However, we can see that we also have,
employeeID → departmentName
employeeID → departmentID
and,
departmentID → departmentName
This relationship means that the functional dependency employeeID → departmentName is a transitive
dependency, because it has a middle step (the departmentID → departmentName dependency).
To get to third normal form, we need to remove this transitive dependency.
As with the previous normal forms, to convert to third normal form we decompose this table into multiple tables.
We convert the schema to two tables, employee and department, like this:
Employee(employeeID,name,job,departmentID)
Department(departmentID,departmentName)
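A short sketch of this decomposition, again using sqlite3, shows the practical benefit of removing the transitive dependency. The column types, the sample rows, and the department names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (
    departmentID   INTEGER PRIMARY KEY,
    departmentName TEXT
);
CREATE TABLE Employee (
    employeeID   INTEGER PRIMARY KEY,
    name         TEXT,
    job          TEXT,
    departmentID INTEGER REFERENCES Department(departmentID)
);
INSERT INTO Department VALUES (10, 'Research');
INSERT INTO Employee VALUES (1, 'Ana', 'Programmer', 10);
INSERT INTO Employee VALUES (2, 'Ben', 'Analyst', 10);
""")

# The department name is stored once, so a rename touches a single row.
conn.execute("UPDATE Department SET departmentName = 'R&D' WHERE departmentID = 10")

names = conn.execute("""
    SELECT e.name, d.departmentName
    FROM Employee e
    JOIN Department d ON d.departmentID = e.departmentID
    ORDER BY e.employeeID
""").fetchall()
print(names)  # [('Ana', 'R&D'), ('Ben', 'R&D')]
```

In the single employeeDepartment table, the same rename would have to be repeated on every employee row of that department, which is exactly the update anomaly described earlier.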
NOTE:
o 1NF - means each attribute must be atomic. Meaning each attribute must contain a single value, not a set
of values.
o 2NF - means all non-key attributes must be fully functionally dependent on the primary key.
o 3NF - means that all non-key attributes must depend on nothing but the primary key. Meaning there should be
no transitive dependencies.
Types of Relationships
One-to-One: Occurs when the primary key in one table is linked to the primary key in another table. This
means that the primary key in both tables is identical and that exactly one row in one table is related to
exactly one row in another table. These are not very common and achieve very little. Basically, the
information in a one-to-one relationship could be combined into a single table. These might be used if
there is information which you want to hide from others but still relate to that table.
A one-to-one (1:1) relationship is when at most one instance of an entity A is associated with one
instance of entity B. For example, employees in the company are each assigned their own office. For
each employee there exists a unique office and for each office there exists a unique employee.
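The employee and office example can be sketched with two tables that share the same primary key, as the description above suggests. The table name EmployeeOffice, the officeNumber column, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (
    employeeID INTEGER PRIMARY KEY,
    name       TEXT
);
-- Same primary key as Employee: at most one office row per employee.
-- UNIQUE on officeNumber: at most one employee per office.
CREATE TABLE EmployeeOffice (
    employeeID   INTEGER PRIMARY KEY REFERENCES Employee(employeeID),
    officeNumber TEXT UNIQUE
);
INSERT INTO Employee VALUES (1, 'Ana');
INSERT INTO EmployeeOffice VALUES (1, 'A-101');
""")

try:
    # A second office for the same employee violates the primary key.
    conn.execute("INSERT INTO EmployeeOffice VALUES (1, 'B-202')")
    second_office_allowed = True
except sqlite3.IntegrityError:
    second_office_allowed = False

print(second_office_allowed)  # False
```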
One-to-Many: The one-to-many relationship is used to relate one record in a table with many records in
another. For example, it allows a customer to make more than one order. It is the most common type of
relationship.
These are the most common and most practical for the majority of applications. Examples of these
relationships are:
One student will have many different classes, one student may have many assignments, one
teacher will have many students, and one doctor will have many patients. The primary key is a unique
number and will never be repeated. When it is joined to a foreign key (a key that is the
primary key in another table), this forms a one-to-many relationship.
A one-to-many (1:N) relationship is when for one instance of entity A, there are zero, one, or many
instances of entity B, but for one instance of entity B, there is only one instance of entity A. An example
of a 1:N relationship is: a department has many employees; each employee is assigned to one
department.
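The department and employee example can be sketched with a foreign key on the "many" side. This is an illustrative sketch with assumed column types and hypothetical rows; note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Department (
    departmentID   INTEGER PRIMARY KEY,
    departmentName TEXT
);
CREATE TABLE Employee (
    employeeID   INTEGER PRIMARY KEY,
    name         TEXT,
    departmentID INTEGER NOT NULL REFERENCES Department(departmentID)
);
INSERT INTO Department VALUES (10, 'Research');
INSERT INTO Department VALUES (20, 'Sales');
INSERT INTO Employee VALUES (1, 'Ana', 10);
INSERT INTO Employee VALUES (2, 'Ben', 10);
""")

# Each employee must point at exactly one existing department.
try:
    conn.execute("INSERT INTO Employee VALUES (3, 'Cy', 99)")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False
print(orphan_allowed)  # False

# One department has zero, one, or many employees.
counts = conn.execute("""
    SELECT d.departmentName, COUNT(e.employeeID)
    FROM Department d
    LEFT JOIN Employee e ON e.departmentID = d.departmentID
    GROUP BY d.departmentID
    ORDER BY d.departmentID
""").fetchall()
print(counts)  # [('Research', 2), ('Sales', 0)]
```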
Many-to-Many: Think of it as a pair of one-to-many relationships between two tables. A patient can go to a
hospital on many different dates, which forms a one-to-many relationship; at the same time, on each
date many people can be brought into the hospital, which is also a one-to-many relationship. So an
individual patient may visit the hospital on many dates, and on a given date, many patients may visit the
hospital. A many-to-many relationship is usually not implemented directly; it can generally be turned into
a pair of one-to-many relationships. An example of many-to-many is that many students will have many
classes and many classes will have many students.
A many-to-many (M:N) relationship, sometimes called non-specific, is when for one instance of entity
A, there are zero, one, or many instances of entity B and for one instance of entity B there are zero, one,
or many instances of entity A. An example is: employees can be assigned to no more than two projects
at the same time; each project must have at least three employees assigned.
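The students and classes example can be sketched as a pair of one-to-many relationships through a linking table. The table name Enrollment and all rows below are hypothetical; this is one common way to represent a many-to-many relationship, not necessarily the one the handout's author had in mind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (studentID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Class   (classID   INTEGER PRIMARY KEY, title TEXT);
-- Linking table: one row per (student, class) pair.
CREATE TABLE Enrollment (
    studentID INTEGER REFERENCES Student(studentID),
    classID   INTEGER REFERENCES Class(classID),
    PRIMARY KEY (studentID, classID)
);
INSERT INTO Student VALUES (1, 'Ana');
INSERT INTO Student VALUES (2, 'Ben');
INSERT INTO Class VALUES (1, 'Databases');
INSERT INTO Class VALUES (2, 'Networks');
INSERT INTO Enrollment VALUES (1, 1);
INSERT INTO Enrollment VALUES (1, 2);
INSERT INTO Enrollment VALUES (2, 1);
""")

# One student, many classes.
classes_of_ana = [row[0] for row in conn.execute("""
    SELECT c.title
    FROM Enrollment en JOIN Class c ON c.classID = en.classID
    WHERE en.studentID = 1
    ORDER BY c.title
""")]
print(classes_of_ana)  # ['Databases', 'Networks']

# One class, many students.
students_in_databases = [row[0] for row in conn.execute("""
    SELECT s.name
    FROM Enrollment en JOIN Student s ON s.studentID = en.studentID
    WHERE en.classID = 1
    ORDER BY s.name
""")]
print(students_in_databases)  # ['Ana', 'Ben']
```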
