
SEMINAR REPORT ON NORMALISATION

Guided by:
Mrs. Bijayalaxmi Padhiary

Submitted By-
Mohit Das
Roll no – 15
Branch – I.T.
Sem - 4th
ACKNOWLEDGEMENT

At the very outset, I would like to convey my heartfelt regards and thanks to my esteemed guide, Mrs. Bijayalaxmi Padhiary (Lecturer, Department of I.T. Engineering), for her inspiration and valuable guidance in carrying out my seminar report. I would also like to express my gratitude and special thanks to her for her ideas and moral support.

Finally, I am thankful to my branch teachers for their constant support and encouragement, and I am indebted to my family for their valuable suggestions for the project report.
CERTIFICATE

This is to certify that Mr. Mohit Das has prepared this project report on the topic “NORMALISATION” in partial fulfillment of the diploma course in the Information Technology branch.

This record is an original work and has neither been copied nor submitted earlier for any purpose of evaluation.

EXTERNAL EXAMINER

INTERNAL EXAMINER
PREFACE

In partial fulfillment of the requirements of the 4th semester diploma course conducted by the “STATE COUNCIL OF TECHNICAL AND VOCATIONAL TRAINING, ODISHA”, this project on “NORMALISATION” is submitted to S.K.D.A.V GOVT. POLYTECHNIC, ROURKELA.

The guidelines given under the objective and scope of the syllabus of the course have been followed while preparing this report.

Submitted By-
Mohit Das
Roll no – 15
Branch – I.T.
Sem - 4th
CONTENTS

• Introduction
• Normalisation
• Types of Normalisation
   ✓ First Normal Form (1NF)
   ✓ Second Normal Form (2NF)
   ✓ Third Normal Form (3NF)
• Advantages
• Disadvantages
• Conclusion
INTRODUCTION

Normalization is a database design technique that reduces data redundancy and eliminates undesirable characteristics such as insertion, update, and deletion anomalies. Normalization rules divide larger tables into smaller tables and link them using relationships.

Edgar Codd, the inventor of the relational model, proposed the theory of normalization of data with the introduction of the First Normal Form, and he continued to extend the theory with the Second and Third Normal Forms. Later he joined Raymond F. Boyce to develop the theory of Boyce-Codd Normal Form.
NORMALIZATION IN DBMS

Database normalization is a technique that helps in designing the schema of the database in an optimal manner, so that redundancy is reduced and the anomalies described above are avoided. The core idea of database normalization is to divide the tables into smaller subtables and store pointers to data rather than replicating it. Here is a simple DBMS normalization example.

To understand normalization with example tables, let's assume that we are supposed to store the details of courses and instructors in a university. Here is what a sample database could look like:

Course code | Course venue    | Instructor name | Instructor's phone number
CS101       | Lecture Hall 20 | Prof. George    | +1 6514821924
CS152       | Lecture Hall 21 | Prof. Atkins    | +1 6519272918
CS154       | CS Auditorium   | Prof. George    | +1 6514821924

Here, the table stores the course code, course venue, instructor name, and instructor's phone number. At first, this design seems to be good. However, issues start to develop once we need to modify information. For instance, suppose Prof. George changes his mobile number. In such a situation, we will have to make edits in 2 places. What if someone edited the mobile number against CS101 but forgot to edit it for CS154? This will lead to stale/wrong information in the database.
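
The forgotten edit described above can be reproduced in a few lines. The following is only a minimal sketch in Python: the list-of-dictionaries layout, the helper `phones_by_instructor`, and the replacement number `NEW_PHONE` are assumptions made for illustration and are not part of the original example.

```python
# Unnormalized table from the example: one row per course, with the
# instructor's phone number repeated in every row.
courses = [
    {"course_code": "CS101", "venue": "Lecture Hall 20",
     "instructor": "Prof. George", "phone": "+1 6514821924"},
    {"course_code": "CS152", "venue": "Lecture Hall 21",
     "instructor": "Prof. Atkins", "phone": "+1 6519272918"},
    {"course_code": "CS154", "venue": "CS Auditorium",
     "instructor": "Prof. George", "phone": "+1 6514821924"},
]


def phones_by_instructor(rows):
    """Collect every distinct phone number recorded for each instructor."""
    phones = {}
    for row in rows:
        phones.setdefault(row["instructor"], set()).add(row["phone"])
    return phones


# Hypothetical new number, used only for illustration.
NEW_PHONE = "+1 6510000000"

# Someone edits the number against CS101 but forgets CS154.
courses[0]["phone"] = NEW_PHONE

# Any instructor with more than one recorded number is now inconsistent.
conflicts = {
    name: numbers
    for name, numbers in phones_by_instructor(courses).items()
    if len(numbers) > 1
}
print(conflicts)
```

After the partial edit, Prof. George appears with two different numbers, and nothing in the table says which one is current.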

This problem, however, can be easily tackled by dividing our table into 2 simpler tables:

Table 1 (Instructor):

• Instructor ID
• Instructor name
• Instructor mobile number

Table 2 (Course):

• Course code
• Course venue
• Instructor ID
Now, our data will look like the following:

➢ Table 1 (Instructor):

Instructor's ID | Instructor's name | Instructor's number
1               | Prof. George      | +1 6514821924
2               | Prof. Atkins      | +1 6519272918

➢ Table 2 (Course):

Course code | Course venue    | Instructor ID
CS101       | Lecture Hall 20 | 1
CS152       | Lecture Hall 21 | 2
CS154       | CS Auditorium   | 1


Basically, we store the instructors separately, and in the course table we do not store the entire data of the instructor; we store only the ID of the instructor. Now, if someone wants to know the mobile number of the instructor, he/she can simply look it up in the instructor table. Also, if we were to change the mobile number of Prof. George, it can be done in exactly one place. This avoids the stale/wrong data problem.

Further, if you observe, the mobile number no longer needs to be stored 2 times. We have stored it in just 1 place. This also saves storage. This may not be obvious in the above simple example. However, think about the case when there are hundreds of courses and instructors, and for each instructor we have to store not just the mobile number but also other details like office address, email address, specialization, availability, etc. In such a situation, replicating so much data will increase the storage requirement unnecessarily.
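
The two-table design can be tried out with Python's built-in `sqlite3` module. This is a sketch under assumptions: the table and column names (`instructor`, `course`, `instructor_id`, and so on) and the replacement number `NEW_PHONE` are chosen here for illustration and do not come from the report.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce REFERENCES
conn.executescript("""
    CREATE TABLE instructor (
        instructor_id INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        phone         TEXT NOT NULL
    );
    CREATE TABLE course (
        course_code   TEXT PRIMARY KEY,
        venue         TEXT NOT NULL,
        instructor_id INTEGER NOT NULL REFERENCES instructor(instructor_id)
    );
""")
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?)",
    [(1, "Prof. George", "+1 6514821924"),
     (2, "Prof. Atkins", "+1 6519272918")],
)
conn.executemany(
    "INSERT INTO course VALUES (?, ?, ?)",
    [("CS101", "Lecture Hall 20", 1),
     ("CS152", "Lecture Hall 21", 2),
     ("CS154", "CS Auditorium", 1)],
)

# Hypothetical new number, used only for illustration.
NEW_PHONE = "+1 6510000000"

# The phone number lives in exactly one row, so one UPDATE is enough.
conn.execute(
    "UPDATE instructor SET phone = ? WHERE instructor_id = ?",
    (NEW_PHONE, 1),
)

# Looking up the number for every course goes through the instructor table.
rows = conn.execute("""
    SELECT c.course_code, i.name, i.phone
    FROM course AS c
    JOIN instructor AS i ON i.instructor_id = c.instructor_id
    ORDER BY c.course_code
""").fetchall()
for row in rows:
    print(row)
```

Both CS101 and CS154 pick up the new number through the join, although only one row was edited.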
FIRST NORMAL FORM (1NF)

The first normal form simply says that each cell of a table should contain exactly one value. Let us take an example. Suppose we are storing the courses that a particular instructor takes. We could store them like this:

Instructor's name | Course code
Prof. George      | (CS101, CS154)
Prof. Atkins      | (CS152)


Here, the issue is that in the first row, we are storing 2 courses against Prof. George. This isn't the optimal way since that's not how SQL databases are designed to be used. A better method would be to store the courses separately. For instance:

Instructor's name | Course code
Prof. George      | CS101
Prof. George      | CS154
Prof. Atkins      | CS152


This way, if we want to edit some information related to CS101, we do not have to touch the data corresponding to CS154. Also, observe that each cell now holds a single value and no row is repeated. This is the First Normal Form.
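
The conversion from the multi-valued table to the 1NF table can be sketched in Python. The parenthesised, comma-separated cell format is taken from the example above; the function name `to_first_normal_form` is an assumption made for this illustration.

```python
# Table that violates 1NF: the course cell holds several values at once.
raw_rows = [
    ("Prof. George", "(CS101, CS154)"),
    ("Prof. Atkins", "(CS152)"),
]


def to_first_normal_form(rows):
    """Return one (instructor, course_code) row per course in each cell."""
    flat = []
    for instructor, course_cell in rows:
        # Drop the surrounding parentheses, then split on commas.
        for code in course_cell.strip("()").split(","):
            code = code.strip()
            if code:  # ignore empty pieces such as those from "()"
                flat.append((instructor, code))
    return flat


flat_rows = to_first_normal_form(raw_rows)
for row in flat_rows:
    print(row)
```

Each output row now carries exactly one course code, which matches the second table shown above.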
SECOND NORMAL FORM (2NF)

For a table to be in second normal form, the following 2 conditions are to be met:

1. The table should be in the first normal form.

2. The primary key of the table should consist of exactly 1 column. (Strictly speaking, 2NF only requires that no non-key column depends on just a part of a multi-column primary key; a single-column primary key guarantees this.)

The first point is straightforward since we just studied 1NF. Let us understand the second point, the 1 column primary key. A primary key is a set of columns that uniquely identifies a row. Basically, no 2 rows have the same primary key. Let us take an example:

Course code | Course venue    | Instructor name | Instructor's phone number
CS101       | Lecture Hall 20 | Prof. George    | +1 6514821924
CS152       | Lecture Hall 21 | Prof. Atkins    | +1 6519272918
CS154       | CS Auditorium   | Prof. George    | +1 6514821924

Here, in this table, the course code is unique, so that becomes our primary key. Let us take another example of storing student enrollment in various courses. Each student may enroll in multiple courses. Similarly, each course may have multiple enrollments. A sample table may look like this (student name and course code):

Student name | Course code
Rahul        | CS152
Rajat        | CS101
Rahul        | CS154
Raman        | CS101

Here, the first column is the student name and the second column is the course taken by the student. Clearly, the student name column isn't unique, as there are 2 entries corresponding to the name 'Rahul' in row 1 and row 3. Similarly, the course code column is not unique, as there are 2 entries corresponding to course code CS101 in row 2 and row 4. However, the tuple (student name, course code) is unique, since a student cannot enroll in the same course more than once. So, these 2 columns when combined form the primary key for the table.
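
A two-column primary key of this kind can be declared directly in a database. The sketch below uses Python's built-in `sqlite3` module; the table name `enrollment` and its column names are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE enrollment (
        student_name TEXT NOT NULL,
        course_code  TEXT NOT NULL,
        PRIMARY KEY (student_name, course_code)  -- composite primary key
    )
""")
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?)",
    [("Rahul", "CS152"), ("Rajat", "CS101"),
     ("Rahul", "CS154"), ("Raman", "CS101")],
)

# Repeating a student name or a course code alone is allowed (see the rows
# above), but repeating the same pair violates the primary key.
try:
    conn.execute("INSERT INTO enrollment VALUES (?, ?)", ("Rahul", "CS152"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]
print(duplicate_rejected, row_count)
```

The four original rows are accepted, while the repeated (Rahul, CS152) pair is refused by the database.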
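Under the single-column rule stated above, our enrollment table isn't in the second normal form, because its primary key spans 2 columns. (Under the strict definition it already qualifies, since it has no non-key columns, but the split below is still useful: it replaces the student name with a stable enrollment number.) Following this idea (1NF to 2NF), we can break it into 2 tables:

Students:

Student name | Enrolment number
Rahul        | 1
Rajat        | 2
Raman        | 3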

Here the second column indicates the enrollment number of the student. The enrollment number is unique, so it can serve as a single-column primary key. Now, we can attach each of these enrollment numbers to course codes.

Courses:

Course code | Enrolment number
CS101       | 2
CS101       | 3
CS152       | 1
CS154       | 1

These 2 tables together provide us with the exact same information as our original table.
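
That claim can be checked by joining the two tables back together. This is a small Python sketch; holding the tables as a dictionary and a list of tuples is an assumption made for brevity.

```python
# Original enrollment table: (student name, course code).
original = {
    ("Rahul", "CS152"),
    ("Rajat", "CS101"),
    ("Rahul", "CS154"),
    ("Raman", "CS101"),
}

# Students table: enrolment number -> student name.
students = {1: "Rahul", 2: "Rajat", 3: "Raman"}

# Courses table: (course code, enrolment number).
courses = [("CS101", 2), ("CS101", 3), ("CS152", 1), ("CS154", 1)]

# Join on the enrolment number to rebuild (student name, course code) pairs.
rebuilt = {(students[number], code) for code, number in courses}

print(rebuilt == original)
```

The rebuilt set of pairs equals the original table, so no information was lost by the split.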

THIRD NORMAL FORM (3NF)

Before we delve into the details of third normal form, let us understand the concept of a functional dependency in a table.

Column B is said to be functionally dependent on column A if the value of A determines the value of B, so that changing the value of A in a row may require a change in the value of B. As an example, consider the following table:

Course code | Course venue        | Instructor's name | Department
MA214       | Lecture Hall 18     | Prof. George      | CS Department
ME112       | Auditorium building | Prof. John        | Electronics Department

Here, the department column is dependent on the professor name column. This is because if, in a particular row, we change the name of the professor, we will also have to change the department value. As an example, suppose MA214 is now taken by Prof. Ronald, who happens to be from the Mathematics department. The table will then look like this:

Course code | Course venue        | Instructor's name | Department
MA214       | Lecture Hall 18     | Prof. Ronald      | Mathematics Department
ME112       | Auditorium building | Prof. John        | Electronics Department

Here, when we changed the name of the professor, we also had to change the department column. This is not desirable, since someone who is updating the database may remember to change the name of the professor but may forget to update the department value. This can cause inconsistency in the database.
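
Whether one column determines another can be tested mechanically on a set of rows. The following Python sketch is illustrative only: the helper `determines` is an assumed name, and the third row (course MA101 taught by Prof. Ronald) is a hypothetical addition so that the forgotten update becomes visible.

```python
def determines(rows, a, b):
    """Return True if, in these rows, each value of column a maps to one value of column b."""
    seen = {}
    for row in rows:
        # setdefault stores row[b] the first time row[a] is met and
        # returns the stored value on every later occurrence.
        if seen.setdefault(row[a], row[b]) != row[b]:
            return False
    return True


rows = [
    {"course": "MA214", "instructor": "Prof. George",
     "department": "CS Department"},
    {"course": "ME112", "instructor": "Prof. John",
     "department": "Electronics Department"},
    # Hypothetical extra row, not in the original table.
    {"course": "MA101", "instructor": "Prof. Ronald",
     "department": "Mathematics Department"},
]

before = determines(rows, "instructor", "department")

# Careless update: the instructor of MA214 is changed, the department is not.
rows[0]["instructor"] = "Prof. Ronald"

after = determines(rows, "instructor", "department")
print(before, after)
```

Before the edit every instructor maps to a single department; afterwards Prof. Ronald is listed under two departments, which is the inconsistency described above.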

Third normal form avoids this by breaking this into separate tables:

Course code | Course venue        | Instructor's ID
MA214       | Lecture Hall 18     | 1
ME112       | Auditorium building | 2

Here, the third column is the ID of the professor who's taking the course.

Instructor's ID | Instructor's Name | Department
1               | Prof. Ronald      | Mathematics Department
2               | Prof. John        | Electronics Department

Here, in the above table, we store the details of the professor against his/her ID. This way, whenever we want to reference the professor somewhere, we don't have to put the other details of the professor in that table again. We can simply use the ID.

Therefore, in the third normal form, the following conditions are required:

✓ The table should be in the second normal form.

✓ There should not be any functional dependency between non-key columns; every non-key column should depend only on the primary key.
ADVANTAGES OF NORMALIZATION

Here we can see why normalization is an attractive prospect in RDBMS design.

• A smaller database can be maintained, as normalization eliminates duplicate data. The overall size of the database is reduced as a result.

• Better performance can follow from the above point. As databases become smaller, passes through the data become faster and shorter, improving response time and speed.

• Narrower tables are possible, as normalized tables are fine-tuned and have fewer columns, which allows more data records per page.

• Fewer indexes per table allow quicker maintenance tasks (index rebuilds).

• It also allows joining only the tables that are needed.

Tables are typically smaller than the tables found in non-normalized databases. This usually allows the tables to fit into the buffer, thus offering faster performance.
DISADVANTAGES OF NORMALIZATION

• More tables to join: by spreading out data into more tables, the need to join tables increases and the task becomes more tedious. The database becomes harder to understand as well.

• Tables will contain codes rather than real data, as repeated data will be stored as codes instead of the actual values. Therefore, there is always a need to go to the lookup table.

• The data model becomes extremely difficult to query against, as it is optimized for applications, not for ad hoc querying. (An ad hoc query is a query that cannot be determined before it is issued. It consists of SQL that is constructed dynamically, typically by desktop-friendly query tools.) Hence it is hard to model the database without knowing what the customer wants.

• As the normal form type progresses, performance becomes slower and slower.

• Proper knowledge of the various normal forms is required to carry out the normalization process efficiently. Careless use may lead to a poor design filled with major anomalies and data inconsistency.
CONCLUSION

With the help of data normalization, the data in the database can be modified, whether by inserting, updating, or deleting, without introducing anomalies. Vince's warehouse has a large number of items that need to be entered into the database. The data present in the warehouse includes a huge amount of denormalized data, and this data has to be summarized and normalized, which helps in increasing the performance of the database.