
Introduction to Relational Database Management Systems

Objectives

In this session, you will learn to:
Describe data redundancy
Appreciate the need for denormalization

Ver. 1.0    Slide 1 of 42
Understanding Data Redundancy

Redundancy:
Increases the time involved in updating, adding, and deleting
data.
Increases the utilization of disk space and, hence, the amount
of disk I/O.
Redundancy can, therefore, lead to:
Inconsistencies when data is inserted, modified, or deleted.
Errors, which are more likely to occur when facts are repeated.
Unnecessary utilization of extra disk space.

Understanding Data Redundancy (Contd.)

The STUDENT table contains the values for each attribute,
as shown in the following diagram.

The details of the students, such as STUDENTID and
STUDENTNAME, are repeated while recording marks of different
semesters.
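
The effect of this repetition can be sketched in a few lines of Python. This is only an illustration: the SEMESTER and MARKS columns and all data values are assumptions, not taken from the diagram. It shows how correcting a name in just one of the repeated rows leaves the table inconsistent.

```python
# Hypothetical STUDENT rows: STUDENTID and STUDENTNAME repeat for every semester.
student_rows = [
    {"STUDENTID": "S01", "STUDENTNAME": "Asha", "SEMESTER": 1, "MARKS": 78},
    {"STUDENTID": "S01", "STUDENTNAME": "Asha", "SEMESTER": 2, "MARKS": 82},
    {"STUDENTID": "S02", "STUDENTNAME": "Ravi", "SEMESTER": 1, "MARKS": 65},
]


def names_per_student(rows):
    """Map each STUDENTID to the set of names recorded for it."""
    names = {}
    for row in rows:
        names.setdefault(row["STUDENTID"], set()).add(row["STUDENTNAME"])
    return names


# A name correction is applied to only one of the repeated rows.
student_rows[0]["STUDENTNAME"] = "Asha K."

# Any student with more than one recorded name is now inconsistent.
inconsistent = {
    sid: names
    for sid, names in names_per_student(student_rows).items()
    if len(names) > 1
}
print(inconsistent)
```

If the name were stored once per student, the same correction would touch a single row and could not disagree with itself.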

Definition of Normalization

Normalization:
Is a method of breaking down complex table structures into
simple table structures by using certain rules.
Has the following benefits:
It helps in maintaining data integrity.
It helps in simplifying the structure of tables, thereby making a
database more compact.
It helps in reducing the null values, which reduces the complexity
of data operations.

Definition of Normalization (Contd.)

Some rules that should be followed to achieve a good
database design are:
Each table should have an identifier.
Each table should store data for a single type of entity.
Columns that accept NULL values should be avoided.
The repetition of values or columns should be avoided.

Definition of Normalization (Contd.)

Normalization results in the formation of tables that satisfy
certain normal forms.
The normal forms are used to remove various types of
anomalies and inconsistencies from the database.
The most important and widely used normal forms are:
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)

Definition of Normalization (Contd.)

The following diagram shows the different levels of normalization.

Definition of Normalization (Contd.)

First Normal Form (1NF):
A table is said to be in 1NF when each cell of the table
contains precisely one value.
The guidelines for converting a table into 1NF are:
Place related data values in a table, and give each set of
similar data values a column name.
There should be no repeating group in the table.
Every table must have a unique primary key.

Definition of Normalization (Contd.)

Consider the PROJECT table, as shown in the following
diagram.
The PROJECT table is not in first normal form because cells
in PROJCODE and HOURS have more than one value.

Definition of Normalization (Contd.)

By applying the 1NF definition to the PROJECT table, you
arrive at the table, as shown in the following diagram.
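
As a rough sketch of this conversion, the Python below flattens multi-valued PROJCODE and HOURS cells into one row per employee and project. The ECODE column and all data values are assumptions made for illustration; only PROJCODE and HOURS are named in the slide.

```python
# Hypothetical unnormalized rows: PROJCODE and HOURS hold several values per cell.
project_unf = [
    {"ECODE": "E101", "PROJCODE": ["P27", "P51"], "HOURS": [90, 101]},
    {"ECODE": "E305", "PROJCODE": ["P27"], "HOURS": [109]},
]


def to_1nf(rows):
    """Emit one row per (employee, project) pair so that every cell is atomic."""
    flat = []
    for row in rows:
        # Pair each project code with the hours recorded at the same position.
        for projcode, hours in zip(row["PROJCODE"], row["HOURS"]):
            flat.append(
                {"ECODE": row["ECODE"], "PROJCODE": projcode, "HOURS": hours}
            )
    return flat


project_1nf = to_1nf(project_unf)
for row in project_1nf:
    print(row)
```

After flattening, ECODE alone no longer identifies a row, so under these assumptions the key becomes the combination ECODE+PROJCODE.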

Definition of Normalization (Contd.)

Consider the STUDENT table, as shown in the following
diagram.
The STUDENT table contains null values in the
PHONE NUMBER2 column, and the number of
entries allowed for storing the telephone numbers
per student is restricted to two.

Definition of Normalization (Contd.)

By applying the 1NF definition to the STUDENT table, you can
arrive at the tables, as shown in the following diagram.
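
One possible shape for the resulting tables is sketched below with Python's built-in sqlite3 module. The table name STUDENT_PHONE, the column names, and the data are assumptions; the diagram may use different names. Storing one row per telephone number removes both the null PHONE NUMBER2 values and the limit of two numbers per student.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT (
        STUDENTID   TEXT PRIMARY KEY,
        STUDENTNAME TEXT NOT NULL
    );
    -- One row per telephone number, so no NULL placeholder column is needed.
    CREATE TABLE STUDENT_PHONE (
        STUDENTID TEXT NOT NULL REFERENCES STUDENT(STUDENTID),
        PHONE     TEXT NOT NULL,
        PRIMARY KEY (STUDENTID, PHONE)
    );
""")
conn.executemany(
    "INSERT INTO STUDENT VALUES (?, ?)",
    [("S01", "Asha"), ("S02", "Ravi")],
)
# S01 has three numbers, which the two-column layout could not hold.
conn.executemany(
    "INSERT INTO STUDENT_PHONE VALUES (?, ?)",
    [("S01", "555-0101"), ("S01", "555-0102"), ("S01", "555-0103"),
     ("S02", "555-0201")],
)

phone_counts = dict(
    conn.execute(
        "SELECT STUDENTID, COUNT(*) FROM STUDENT_PHONE GROUP BY STUDENTID"
    ).fetchall()
)
print(phone_counts)
```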

Definition of Normalization (Contd.)

Functional dependency:
Attribute A is functionally dependent on B if and only if, for
each value of B, there is exactly one value of A.
Attribute B is called the determinant.

Definition of Normalization (Contd.)

Consider the REPORT table, as shown in the following
diagram.

Definition of Normalization (Contd.)

The following diagram shows the functional dependency
between MARKS and ROLL_NUMBER+COURSE_CODE.

Definition of Normalization (Contd.)

The other functional dependencies in the REPORT table are:
COURSE_CODE → COURSE_NAME
COURSE_CODE → T_NAME (Assuming one course is taught by
only one teacher.)
T_NAME → ROOM_NUMBER (Assuming each teacher has his/her
own, unshared room.)
MARKS → GRADE

Definition of Normalization (Contd.)

The COURSE_NAME, T_NAME, and ROOM_NUMBER attributes depend on
only a part of the key (COURSE_CODE) and not on the whole key
(ROLL_NUMBER+COURSE_CODE).
This dependency is called partial dependency, as shown in the
following diagram.
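
Partial dependencies can also be derived mechanically from a list of functional dependencies. The sketch below assumes that ROLL_NUMBER+COURSE_CODE is the key of REPORT and uses two hypothetical helpers, closure and partial_dependents. It computes everything determined by each proper subset of the key.

```python
from itertools import combinations

# Dependencies of the REPORT table: (determinant columns, dependent column).
dependencies = [
    (("ROLL_NUMBER", "COURSE_CODE"), "MARKS"),
    (("COURSE_CODE",), "COURSE_NAME"),
    (("COURSE_CODE",), "T_NAME"),
    (("T_NAME",), "ROOM_NUMBER"),
    (("MARKS",), "GRADE"),
]
primary_key = {"ROLL_NUMBER", "COURSE_CODE"}  # assumed key of REPORT


def closure(attrs, fds):
    """Return every attribute determined by `attrs` under the dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for det, dep in fds:
            if set(det) <= result and dep not in result:
                result.add(dep)
                changed = True
    return result


def partial_dependents(key, fds):
    """Non-key attributes determined by some proper subset of the key."""
    found = set()
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            found |= closure(subset, fds) - set(key)
    return sorted(found)


partial = partial_dependents(primary_key, dependencies)
print(partial)
```

ROOM_NUMBER appears in the result because it is reached from COURSE_CODE through T_NAME.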

Definition of Normalization (Contd.)

ROOM_NUMBER is dependent on T_NAME, and T_NAME is
dependent on COURSE_CODE, as shown in the following
diagram.

Definition of Normalization (Contd.)

The following diagram shows the transitive dependency.

Definition of Normalization (Contd.)

Second Normal Form (2NF):
A table is said to be in 2NF when:
It is in the 1NF, and
No partial dependency exists between non-key attributes and key
attributes.
The guidelines for converting a table into 2NF are:
Find and remove attributes that are functionally dependent on only
a part of the key and not on the whole key. Place them in a
different table.
Group the remaining attributes.

Definition of Normalization (Contd.)

Consider the PROJECT table, as shown in the following diagram.

The preceding table could lead to the following problems:
Insertion: The department of a particular employee cannot be
recorded until the employee is assigned a project.
Updation: If an employee is transferred to another department,
the change will have to be recorded in every row of the
EMPLOYEE table. Any omission will lead to inconsistencies.
Deletion: When an employee completes work on a project, the
employee’s record is deleted. The information regarding the
department to which the employee belongs will also be lost.

Definition of Normalization (Contd.)

The EMPLOYEEDEPT and PROJECT tables are in 2NF, as
shown in the following diagram.
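
The split can be sketched as two projections of the original rows. The column names (ECODE, PROJCODE, DEPT, HOURS), the key ECODE+PROJCODE, and the data are assumptions made for illustration; the helper project_columns is hypothetical.

```python
# Hypothetical 1NF rows. ECODE+PROJCODE is assumed to be the key, and DEPT
# is assumed to depend on ECODE alone (a partial dependency).
project = [
    {"ECODE": "E101", "PROJCODE": "P27", "DEPT": "Systems", "HOURS": 90},
    {"ECODE": "E101", "PROJCODE": "P51", "DEPT": "Systems", "HOURS": 101},
    {"ECODE": "E305", "PROJCODE": "P27", "DEPT": "Sales", "HOURS": 109},
]


def project_columns(rows, columns):
    """Keep only `columns` and drop duplicate rows, preserving first-seen order."""
    kept = []
    for row in rows:
        projected = {col: row[col] for col in columns}
        if projected not in kept:
            kept.append(projected)
    return kept


# Attributes that depend on only part of the key move to their own table.
employeedept = project_columns(project, ["ECODE", "DEPT"])
# The remaining attributes stay with the whole key.
project_2nf = project_columns(project, ["ECODE", "PROJCODE", "HOURS"])

print(employeedept)
print(project_2nf)
```

Each employee's department is now stored once, so a transfer is a single-row change and the department survives when a project row is deleted.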

Definition of Normalization (Contd.)

Third Normal Form (3NF):
A relation is said to be in 3NF if and only if:
It is in 2NF, and
No transitive (indirect) dependency exists between non-key attributes and key attributes.
The guidelines for converting a table into 3NF are:
Find and remove non-key attributes that are functionally dependent on attributes that
are not the primary key. Place them in a different table.
Group the remaining attributes.

Definition of Normalization (Contd.)

Consider the EMPLOYEE table, as shown in the following diagram.

The preceding table could lead to the following problems:
Insertion: The department head of a new department that does not
have any employee at present cannot be entered in the
DEPTHEAD column.
Updation: For a department, DEPTHEAD is repeated. Any change
will have to be made consistently across the table.
Deletion: If an employee record is deleted, the information about
DEPTHEAD will also be deleted.

Definition of Normalization (Contd.)

To convert the EMPLOYEE table into 3NF, you must remove
the DEPTHEAD column and place it in another table, as shown
in the following diagram.
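
A minimal sketch of the converted design, using Python's built-in sqlite3 module, is given below. The DEPARTMENT table name, the ECODE and DEPT columns, and all data are assumptions; only EMPLOYEE and DEPTHEAD are named in the slides. It exercises the insertion, update, and deletion cases in turn.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPARTMENT (
        DEPT     TEXT PRIMARY KEY,
        DEPTHEAD TEXT NOT NULL
    );
    CREATE TABLE EMPLOYEE (
        ECODE TEXT PRIMARY KEY,
        DEPT  TEXT NOT NULL REFERENCES DEPARTMENT(DEPT)
    );
""")

# Insertion: Research has a head even though it has no employees yet.
conn.executemany(
    "INSERT INTO DEPARTMENT VALUES (?, ?)",
    [("Systems", "Mehta"), ("Research", "Nair")],
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?)",
    [("E101", "Systems"), ("E305", "Systems")],
)

# Update: changing a department head is a single-row change.
conn.execute(
    "UPDATE DEPARTMENT SET DEPTHEAD = ? WHERE DEPT = ?", ("Kapoor", "Systems")
)

# Deletion: removing an employee does not remove the department head.
conn.execute("DELETE FROM EMPLOYEE WHERE ECODE = ?", ("E101",))

heads = dict(conn.execute("SELECT DEPT, DEPTHEAD FROM DEPARTMENT").fetchall())
employee_heads = conn.execute("""
    SELECT e.ECODE, d.DEPTHEAD
    FROM EMPLOYEE AS e JOIN DEPARTMENT AS d ON d.DEPT = e.DEPT
    ORDER BY e.ECODE
""").fetchall()
print(heads)
print(employee_heads)
```

The head of an employee's department is still available, but it now takes a join to retrieve it.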

Definition of Normalization (Contd.)

Boyce-Codd Normal Form (BCNF):
The original definition of 3NF was not sufficient in some situations. It was not
satisfactory for the tables:
That had multiple candidate keys.
Where the multiple candidate keys were composite.
Where the multiple candidate keys overlapped (had at least one attribute in common).
The guidelines for converting a table into BCNF are:
Find and remove the overlapping candidate keys. Place the part of the candidate key
and the attribute it is functionally dependent on, in a different table.
Group the remaining items into a table.

Definition of Normalization (Contd.)

Consider the PROJECT table, as shown in the following
diagram.
NAME+PROJCODE can also be chosen as the primary key
and hence, is a candidate key.

The PROJECT table is already in 3NF.

Definition of Normalization (Contd.)

You can remove NAME and ECODE and place them in a
different table, as shown in the following diagram.
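
The BCNF condition can be checked with an attribute-closure test: a dependency violates BCNF when its determinant does not determine every attribute of the table. The sketch below is a simplification that only examines the listed dependencies. The HOURS column and the dependencies themselves are assumptions (ECODE and NAME are taken to identify each other, which is what makes NAME+PROJCODE a candidate key), and closure and bcnf_violations are hypothetical helpers.

```python
def closure(attrs, fds):
    """Return every attribute determined by `attrs` under the dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for det, dep in fds:
            if set(det) <= result and dep not in result:
                result.add(dep)
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Return dependencies inside `relation` whose determinant is not a key."""
    violations = []
    for det, dep in fds:
        # Only dependencies that lie entirely inside this table are examined.
        applies = set(det) <= relation and dep in relation
        if applies and not relation <= closure(det, fds):
            violations.append((det, dep))
    return violations


# Assumed dependencies for PROJECT(ECODE, NAME, PROJCODE, HOURS).
fds = [
    (("ECODE",), "NAME"),
    (("NAME",), "ECODE"),
    (("ECODE", "PROJCODE"), "HOURS"),
    (("NAME", "PROJCODE"), "HOURS"),
]

before = bcnf_violations({"ECODE", "NAME", "PROJCODE", "HOURS"}, fds)
after_employee = bcnf_violations({"ECODE", "NAME"}, fds)
after_project = bcnf_violations({"ECODE", "PROJCODE", "HOURS"}, fds)
print(before)
print(after_employee, after_project)
```

Under these assumptions the original table has two violating dependencies, and neither of the two tables produced by the split has any.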

Just a minute

Which one of the following methods is used to break down
complex table structures into simple table structures by
using certain rules?
Normalization
Denormalization
Generalization
Specialization

Solution:
Normalization

Just a minute

In which normal form do you need to remove non-key
attributes that are functionally dependent on non-primary
key attributes?
First normal form
Second normal form
Third normal form
Boyce-Codd normal form

Solution:
Third normal form

Understanding Denormalization

Sometimes, you have to join multiple tables to get a simple
output.
This affects the performance of a query.
In such cases, a degree of redundancy is introduced either
by adding extra columns or extra tables.

Definition of Denormalization

The intentional introduction of redundancy in a table in order to improve performance
is called denormalization.
Consider the tables, ORDERS and PRODUCTS, as shown in the following diagram.

Definition of Denormalization (Contd.)

To speed up the processing of the query, store the cost of
each order along with the tax, as shown in the following
diagram.
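
The idea can be sketched with Python's built-in sqlite3 module. All column names (PRODUCT_ID, UNIT_PRICE, TAX_RATE, QUANTITY, ORDER_COST) and the data are assumptions; the diagram may define ORDERS and PRODUCTS differently. The cost is first computed with a join, then stored redundantly in ORDERS so that it can be read without one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PRODUCTS (
        PRODUCT_ID TEXT PRIMARY KEY,
        UNIT_PRICE REAL NOT NULL,
        TAX_RATE   REAL NOT NULL
    );
    -- ORDER_COST is the redundant, denormalized column.
    CREATE TABLE ORDERS (
        ORDER_ID   INTEGER PRIMARY KEY,
        PRODUCT_ID TEXT NOT NULL REFERENCES PRODUCTS(PRODUCT_ID),
        QUANTITY   INTEGER NOT NULL,
        ORDER_COST REAL
    );
""")
conn.executemany(
    "INSERT INTO PRODUCTS VALUES (?, ?, ?)",
    [("P1", 20.0, 0.25), ("P2", 8.0, 0.5)],
)
conn.executemany(
    "INSERT INTO ORDERS (ORDER_ID, PRODUCT_ID, QUANTITY) VALUES (?, ?, ?)",
    [(1, "P1", 3), (2, "P2", 2)],
)

# Normalized read: the cost including tax needs a join on every query.
joined = conn.execute("""
    SELECT o.ORDER_ID, o.QUANTITY * p.UNIT_PRICE * (1 + p.TAX_RATE)
    FROM ORDERS AS o JOIN PRODUCTS AS p ON p.PRODUCT_ID = o.PRODUCT_ID
    ORDER BY o.ORDER_ID
""").fetchall()

# Denormalization: compute the cost once and store it with the order.
conn.execute("""
    UPDATE ORDERS SET ORDER_COST = (
        SELECT ORDERS.QUANTITY * p.UNIT_PRICE * (1 + p.TAX_RATE)
        FROM PRODUCTS AS p WHERE p.PRODUCT_ID = ORDERS.PRODUCT_ID
    )
""")

# Denormalized read: a single-table query, no join required.
stored = conn.execute(
    "SELECT ORDER_ID, ORDER_COST FROM ORDERS ORDER BY ORDER_ID"
).fetchall()
print(joined)
print(stored)
```

The stored column repeats information that can be derived from PRODUCTS, so it takes extra space and must be recomputed whenever the quantity or the values it was derived from change.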

Just a minute

The intentional introduction of redundancy in a table in order
to improve performance is called ________________.

Solution:
denormalization

Summary

In this session, you learned that:
Normalization is used to simplify table structures.
Normalization results in the formation of tables that satisfy
certain specified constraints, and represent certain normal
forms. The normal forms are used to ensure that various types
of anomalies and inconsistencies are not introduced in the
database. A table structure is always in a certain normal form.
The most important and widely used normal forms are:
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)
A table is said to be in 1NF when each cell of the table
contains precisely one value.

Summary (Contd.)

The following dependencies are found in normalization:
Functional
Partial
Transitive
A table is said to be in 2NF when it is in the 1NF, and no partial
dependency exists between non-key attributes and key
attributes.
A table is said to be in the 3NF if and only if it is in 2NF, and no
transitive (indirect) dependency exists between non-key
attributes and key attributes.
A table is in BCNF if and only if every determinant is a
candidate key.

Summary (Contd.)

The intentional introduction of redundancy in a table in order to
improve performance is called denormalization.
The decision to denormalize results in a compromise between
performance and data integrity.
Denormalization increases disk space utilization.

What’s Next?

Before the next session, please ensure that you:
Read Chapter 2 and Chapter 3 of the book, Introduction to
Relational Database Management Systems.
Go through the recorded lectures on the following topics
through Cloudscape:
Understanding the First Normal Form and Second Normal Form
Understanding the Third Normal Form and Boyce-Codd Normal
Form
Complete the Lab@Home exercises based on the learning
plan available on the technology page on Cloudscape.
