What Is Normalizationbykbs

Normalisation
What is Normalization?
Normalization is a formal process for determining which fields
belong in which tables in a relational database.
Through normalization a collection of data in a record structure is
replaced by successive record structures that are simpler and
more predictable and therefore more manageable.
Normalisation is carried out for the following reasons.
1. To structure the data so that any pertinent (relevant)
relationships between entities can be represented.
2. To permit simple retrieval of data in response to query
and report requests.
3. To simplify the maintenance of the data through
updates, insertions, and deletions.
4. To reduce the need to restructure or reorganize data
when new application requirements arise.
A normalized relational database provides several benefits:
• Elimination of redundant data storage.
• Decomposition of all data groups into two-dimensional
records.
• Elimination of any relationships in which data elements
do not fully depend on the primary key of the record.
• Elimination of any relationships that contain transitive
dependencies.
Normalization ensures that you get the benefits relational
databases offer.
Design Vs Implementation
Designing a database structure and implementing a database structure
are different tasks.
• When you design a structure, it should be described without
reference to the specific database tool you will use to implement
the system, or what concessions you plan to make for
performance reasons. These steps come later.
• After you’ve designed the database structure abstractly, you
implement it in a particular environment.
• Too often, people new to database design combine design and
implementation in one step. Implementing a structure without
designing it quickly leads to flawed structures that are difficult
and costly to modify.
• Design first, implement second, and you’ll finish faster and
cheaper.
Normalized Design: Pros and Cons
We’ve implied that there are various advantages to producing a
properly normalized design before you implement your system. Let’s
look at a detailed list of the pros and cons:
Pros of Normalizing:
• More efficient database structure.
• Better understanding of your data.
• More flexible database structure.
• Easier to maintain database structure.
• Few (if any) costly surprises down the road.
• Validates your common sense and intuition.
• Avoids redundant fields.
• Ensures that distinct tables exist when necessary.
Cons of Normalizing:
• You can’t start building the database before you know what
the user needs.
Here, you can see the pros outweigh the cons.
Terminology
Primary Key
• The primary key is a fundamental concept in relational
database design.
• It’s an easy concept: each record should have something that
identifies it uniquely.
• The primary key can be a single field, or a combination of fields.
• A table’s primary key also serves as the basis of relationships
with other tables. For example, it is typical to relate invoices to
a unique customer ID, and employees to a unique department
ID.
• A primary key should be unique, mandatory, and permanent.
• A classic mistake people make when learning to create
relational databases is to use a volatile field as the primary key.
For example, consider this table:
[Companies]
Company Name
Address
• Company Name is an obvious candidate for the primary key.
Yet, this is a bad idea, even if the Company Name is unique.
• What happens when the name changes after a merger?
Not only do you have to change this record, you have to update
every single related record since the key has changed.
Another common mistake is to select a field that is usually unique
and unchanging. Consider this small table:
[People]
Social Security Number
First Name
Last Name
Date of birth
In the United States all workers have a Social Security Number that
uniquely identifies them for tax purposes. Or does it? As it turns out,
not everyone has a Social Security Number, some people’s Social
Security Numbers change, and some people have more than one. This
is an appealing but untrustworthy key.
The correct way to build a primary key is with a unique and
unchanging value.
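One common way to get a unique and unchanging value is a system-assigned ID. The sketch below is only an illustration, using Python's built-in sqlite3 module: the company_id column, the invoices table, and the sample rows are assumptions made up for the example, not part of the tables above. It shows that when the key is an ID rather than the Company Name, a rename after a merger touches one record only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (
    company_id   INTEGER PRIMARY KEY,   -- hypothetical system-assigned key
    company_name TEXT NOT NULL UNIQUE,  -- still unique, but free to change
    address      TEXT
);
CREATE TABLE invoices (                 -- hypothetical related table
    invoice_id INTEGER PRIMARY KEY,
    company_id INTEGER NOT NULL REFERENCES companies(company_id),
    amount     REAL NOT NULL
);
""")
conn.execute("INSERT INTO companies VALUES (1, 'Acme Ltd', '1 Main St')")
conn.executemany("INSERT INTO invoices VALUES (?, ?, ?)",
                 [(10, 1, 250.0), (11, 1, 90.0)])

# The name changes after a merger: exactly one row is updated,
# and no invoice has to be touched because the key did not change.
cur = conn.execute(
    "UPDATE companies SET company_name = 'Acme Global' WHERE company_id = 1")
rows_changed = cur.rowcount

invoice_names = conn.execute("""
    SELECT i.invoice_id, c.company_name
    FROM invoices i JOIN companies c ON c.company_id = i.company_id
    ORDER BY i.invoice_id
""").fetchall()
print(rows_changed, invoice_names)
```

Had Company Name been the key, the same rename would have required rewriting every invoice row that referred to it.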
Functional Dependency
Closely tied to the notion of a key is a special normalization concept
called functional dependence or functional dependency. The second
and third normal forms verify that your functional dependencies are
correct.
So what is a “functional dependency”?
It describes how one field (or a combination of fields, a composite)
determines another field. Consider an example:
[ZIP Codes]
ZIP Code
City
County
State Abbreviation
State Name
ZIP Code is a unique 5-digit key. What makes it a key? It is a key
because it determines the other fields. For each ZIP Code there is a
single city, county, and state abbreviation. These fields are functionally
dependent on the ZIP Code field. In other words, they belong with this
key. Look at the last two fields, State Abbreviation and State Name.
State Abbreviation determines State Name; in other words, State Name
is functionally dependent on State Abbreviation. State Abbreviation is
acting like a key for the State Name field. Aha! State Abbreviation is
a key, so it belongs in another table. As we’ll see, the third normal
form tells us to create a new States table and move State Name into it.
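The idea that one field determines another can be checked mechanically against sample data. The following is a minimal sketch, not a standard library routine: the determines function, the field names, and the three sample rows are assumptions made up for illustration. Note that sample data can only disprove a dependency; it cannot prove that one holds for all future rows.

```python
def determines(rows, lhs, rhs):
    """Return True if, in these rows, equal values of the `lhs` fields
    always come with equal values of the `rhs` fields."""
    seen = {}
    for row in rows:
        key = tuple(row[f] for f in lhs)
        value = tuple(row[f] for f in rhs)
        # setdefault stores the first value seen for this key and
        # returns the stored one on later visits.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Illustrative rows only (County omitted to keep the example short).
zip_rows = [
    {"zip": "10001", "city": "New York", "state": "NY", "state_name": "New York"},
    {"zip": "12207", "city": "Albany",   "state": "NY", "state_name": "New York"},
    {"zip": "60601", "city": "Chicago",  "state": "IL", "state_name": "Illinois"},
]

zip_is_key = determines(zip_rows, ["zip"], ["city", "state", "state_name"])
abbr_determines_name = determines(zip_rows, ["state"], ["state_name"])
abbr_determines_city = determines(zip_rows, ["state"], ["city"])
print(zip_is_key, abbr_determines_name, abbr_determines_city)
```

In this sample, ZIP Code determines every other field, State Abbreviation determines State Name, and State Abbreviation does not determine City, because NY appears with two different cities.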
Normal Forms
The principles of normalization are described in a series of
progressively stricter “normal forms”. First normal form (1NF) is the
easiest to satisfy, second normal form (2NF), more difficult, and so on.
There are 5 or 6 normal forms, depending on who you read. It is
convenient to talk about the normal forms by their traditional names,
since this terminology is ubiquitous in the relational database industry.
It is, however, possible to approach normalization without using this
language. For example, Michael Hernandez’s helpful Database
Design for Mere Mortals uses plain language. Whatever terminology
you use, the most important thing is that you go through the process.
First Normal Form (1NF)
The first normal form is easy to understand and apply:
A table is in first normal form if it contains no repeating groups.
What is a repeating group, and why is it bad? When you have
more than one field storing the same kind of information in a single
table, you have a repeating group. Repeating groups are the right way
to build a spreadsheet, the only way to build a flat-file database, and
the wrong way to build a relational database. Here is a common
example of a repeating group:
[Customers]
Customer ID
Customer Name
Contact Name 1
Contact Name 2
Contact Name 3
What’s wrong with this approach? Well, what happens when you have a
fourth contact? You have to add a new field, modify your forms, and
rebuild your routines. What happens when you want to query or report
based on all contacts across all customers? It takes a lot of custom
code, and may prove too difficult in practice. The structure we’ve just
shown makes perfect sense in a spreadsheet, but almost no sense in a
relational database. All of the difficulties we’ve described are resolved
by moving contacts into a related table.
[Customers]
Customer ID
Customer Name
[Contacts]
Customer ID (this field relates [Contacts] and [Customers])
Contact ID
Contact Name
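The two related tables above can be sketched with Python's built-in sqlite3 module. This is an illustration under assumptions: the column types, the customer name, and the contact names are made up. It shows that a fourth contact is just another row, and that reporting on all contacts across all customers is a single join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
CREATE TABLE contacts (
    contact_id   INTEGER PRIMARY KEY,
    customer_id  INTEGER NOT NULL REFERENCES customers(customer_id),
    contact_name TEXT NOT NULL
);
""")
conn.execute("INSERT INTO customers VALUES (1, 'Example Traders')")

# Four contacts for one customer: no new field, no schema change.
conn.executemany(
    "INSERT INTO contacts (customer_id, contact_name) VALUES (?, ?)",
    [(1, "Asha"), (1, "Bala"), (1, "Chitra"), (1, "Dev")])

# One query covers all contacts across all customers.
contact_rows = conn.execute("""
    SELECT c.customer_name, k.contact_name
    FROM customers c JOIN contacts k ON k.customer_id = c.customer_id
    ORDER BY k.contact_id
""").fetchall()
print(contact_rows)
```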
Second Normal Form (2NF)
The second normal form helps identify when you’ve combined two
tables into one. Second normal form depends on the concepts of the
primary key, and functional dependency. The second normal form is:
A relation is in second normal form (2NF) if and only if it is in 1NF
and every nonkey attribute is fully dependent on the primary key.
C.J. Date
An Introduction to Database Systems
In other words, your table is in 2NF if:
1) It doesn’t have any repeating groups.
2) Each of the fields that isn’t a part of the key is functionally
dependent on the entire key.
If a single-field key is used, a 1NF table is already in 2NF.
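A short sketch may help with the case of a combined key. The rows below borrow order and item values from the order example used in this document's illustrations; the variable names and the way the split is done are assumptions for illustration only. With a key of (order number, item number), the description and price depend on the item number alone, which is only part of the key, so they are moved to their own structure.

```python
# (order_no, item_no, description, price, qty), key = (order_no, item_no)
order_items = [
    ("102356", "BL200", "Blankets",      300.00, 10),
    ("102356", "TA100", "Table Clothes", 270.00, 10),
    ("102569", "BL200", "Blankets",      300.00, 5),
]

# Description and price depend on item_no only, so they go into
# an item table keyed by item_no.
items = {}
for order_no, item_no, description, price, qty in order_items:
    items[item_no] = (description, price)

# What remains depends on the whole key: the quantity ordered.
order_lines = [(order_no, item_no, qty)
               for order_no, item_no, _desc, _price, qty in order_items]

print(items)
print(order_lines)
```

After the split, the description and price of BL200 are stored once instead of once per order.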
Third Normal Form (3NF)
Third normal form performs an additional level of verification that you
have not combined tables. Here are two different definitions of the
third normal form:
A table should have a field that uniquely identifies each of its
records, and each field in the table should describe the subject that
the table represents.
Michael J. Hernandez
Database Design for Mere Mortals
To test whether a 2NF table is also in 3NF, we ask, “Are any of the
non-key columns dependent on any other non-key columns?”
Chris Gane
Computer Aided Software Engineering
When designing a database it is easy enough to accidentally combine
information that belongs in different tables. In the ZIP Code example
mentioned above, the ZIP Code table included the State Abbreviation
and the State Name. The State Name is determined by the State
Abbreviation, so the third normal form reminds you to move this field
into a new table. Here’s how these tables should be set up:
[ZIP Codes]
ZIP Code
City
County
State Abbreviation
[States]
State Abbreviation
State Name
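The two tables above can be sketched with Python's built-in sqlite3 module. The column types and the sample rows are assumptions for illustration. Each state name is stored once in the States table, and a join brings it back whenever a ZIP Code needs it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE states (
    state_abbreviation TEXT PRIMARY KEY,
    state_name         TEXT NOT NULL
);
CREATE TABLE zip_codes (
    zip_code           TEXT PRIMARY KEY,
    city               TEXT NOT NULL,
    county             TEXT NOT NULL,
    state_abbreviation TEXT NOT NULL REFERENCES states(state_abbreviation)
);
""")
# Illustrative rows only.
conn.execute("INSERT INTO states VALUES ('NY', 'New York')")
conn.executemany("INSERT INTO zip_codes VALUES (?, ?, ?, ?)", [
    ("10001", "New York", "New York", "NY"),
    ("12207", "Albany",   "Albany",   "NY"),
])

# The state name is looked up through the State Abbreviation.
zip_with_state = conn.execute("""
    SELECT z.zip_code, z.city, s.state_name
    FROM zip_codes z JOIN states s
      ON s.state_abbreviation = z.state_abbreviation
    ORDER BY z.zip_code
""").fetchall()
state_rows = conn.execute("SELECT COUNT(*) FROM states").fetchone()[0]
print(zip_with_state, state_rows)
```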
Higher Normal Forms
There are several higher normal forms, including 4NF, 5NF, BCNF, and
PJ/NF. We’ll leave our discussion at 3NF, which is adequate for most
practical needs. If you are interested in higher normal forms, consult a
book like Date’s An Introduction to Database Systems.
Illustrations
Before Normalization
Order   Cust.   Cust. Name    Address                            Order     Item   Item             Item    Qty.     Total
Number  Number                                                   Date      No.    Description      Price   Ordered  Cost
101426  AK100   Arun Kumar    25, Car St., Triplicane, Chennai.  20/04/09  TA100  Table Clothes    270.00  100      27000
102356  SU100   Sukumaran     16, Rajaji St, Chennai             03/05/09  BL200  Blankets         300.00  10       6450
                                                                           CU200  Curtains-Window  150.00  5
                                                                           TA100  Table Clothes    270.00  10
102569  SW100   Swarna & Co.  27, Garden St, Chennai – 34        07/06/09  BL200  Blankets         300.00  5        7500
                                                                           CU100  Curtains-Window  150.00  5
                                                                           SP200  Spreadsheet      350.00  10
102589  ET100   Ethiraj       20, Dowing Street, Chennai         19/07/09  BL200  Blankets         300.00  10       7600
                                                                           CU100  Curtains-Window  150.00  10
                                                                           SP200  Spreadsheet      350.00  5
This table has not been normalized, so it is neither flexible nor easy to use. The item details are redundant: the same
items appear several times, which leads to confusion. Moreover, the customer details have to be entered again for every
order number, which is cumbersome, and modifying the customer or item details later will cause problems. The table is
highly unorganized, so we have to split it into smaller tables to make processing easier. This calls for
normalization.
First Normalization
1. Remove all repeating groups so that each record has a fixed
length. In the table above, a single order number has several
sets of item details.
Order Record
Order   Cust.   Cust. Name    Address                            Order     Total
Number  Number                                                   Date      Cost
101426  AK100   Arun Kumar    25, Car St., Triplicane, Chennai.  20/04/09  27000
102356  SU100   Sukumaran     16, Rajaji St, Chennai             03/05/09  6450
102569  SW100   Swarna & Co.  27, Garden St, Chennai – 34        07/06/09  7500
102589  ET100   Ethiraj       20, Dowing Street, Chennai         19/07/09  7600
Items Purchased Record
Order   Item   Item             Item    Qty.
Number  No.    Description      Price   Ordered
101426  TA100  Table Clothes    270.00  100
102356  BL200  Blankets         300.00  10
102356  CU200  Curtains-Window  150.00  5
102356  TA100  Table Clothes    270.00  10
102569  BL200  Blankets         300.00  5
102569  CU100  Curtains-Window  150.00  5
102569  SP200  Spreadsheet      350.00  10
102569  TA100  Table Clothes    270.00  5
102589  BL200  Blankets         300.00  10
102589  CU100  Curtains-Window  150.00  10
102589  SP200  Spreadsheet      350.00  5
102589  TA100  Table Clothes    270.00  5
Now each record is fixed in length and does not contain any repeating groups.
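The removal of the repeating group can be sketched in a few lines of Python. The nested structure and the variable names are assumptions for illustration; the values for the first two orders are taken from the tables above. Each order's list of items is flattened into one fixed-length item record per item, carrying the order number with it.

```python
# Un-normalized form: each order carries a variable-length list of items.
orders = [
    {"order_no": "101426", "cust_no": "AK100",
     "items": [("TA100", "Table Clothes", 270.00, 100)]},
    {"order_no": "102356", "cust_no": "SU100",
     "items": [("BL200", "Blankets", 300.00, 10),
               ("CU200", "Curtains-Window", 150.00, 5),
               ("TA100", "Table Clothes", 270.00, 10)]},
]

# Order record: one fixed-length row per order.
order_records = [(o["order_no"], o["cust_no"]) for o in orders]

# Items purchased record: one fixed-length row per order and item.
item_records = [(o["order_no"], *item) for o in orders for item in o["items"]]

# The order total can still be derived from the item records.
total_102356 = sum(price * qty
                   for order_no, _item, _desc, price, qty in item_records
                   if order_no == "102356")
print(order_records)
print(item_records)
print(total_102356)
```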
Second Normal Form

Each data item in a record is fully functionally dependent on the record key.
Order   Cust.   Order     Total
Number  Number  Date      Cost
101426  AK100   20/04/09  27000
102356  SU100   03/05/09  6450
102569  SW100   07/06/09  7500
102589  ET100   19/07/09  7600

Cust.   Cust. Name    Address
Number
AK100   Arun Kumar    25, Car St., Triplicane, Chennai.
SU100   Sukumaran     16, Rajaji St, Chennai
SW100   Swarna & Co.  27, Garden St, Chennai – 34
ET100   Ethiraj       20, Dowing Street, Chennai
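Splitting the customer details away from the orders loses nothing, because the customer number kept in each order leads back to them. The sketch below is an illustration only; the variable names are assumptions, and the rows are the first two orders and customers from the tables above. It rebuilds the original combined order record from the two smaller tables.

```python
# Order table: (order_no, cust_no, order_date, total_cost)
orders = [
    ("101426", "AK100", "20/04/09", 27000),
    ("102356", "SU100", "03/05/09", 6450),
]

# Customer table keyed by customer number: (name, address)
customers = {
    "AK100": ("Arun Kumar", "25, Car St., Triplicane, Chennai."),
    "SU100": ("Sukumaran", "16, Rajaji St, Chennai"),
}

# Look up each order's customer to rebuild the combined record.
rejoined = [(order_no, cust_no, *customers[cust_no], order_date, total)
            for order_no, cust_no, order_date, total in orders]
print(rejoined)
```

A change of address now means editing one customer row, however many orders that customer has placed.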
List of attributes  1NF          2NF          3NF
BOOK_NO             BOOK_NO      BOOK_NO      BOOK_NO
BOOK_NAME           BOOK_NAME    BOOK_NAME    BOOK_NAME
SIZE                SIZE         SIZE         SIZE
TIME_PUBL           TIME_PUBL    TIME_PUBL    TIME_PUBL
YEAR_PUBL           YEAR_PUBL    YEAR_PUBL    YEAR_PUBL
PUB_HOUSE           PUB_HOUSE    PUB_HOUSE    PUB_HOUSE
COST                COST         COST         COST
AUTHOR              AUTHOR       AUTHOR       AUTHOR
CHIEF_AUTH          CHIEF_AUTH   CHIEF_AUTH   CHIEF_AUTH
COMP_AUTH           COMP_AUTH    COMP_AUTH    COMP_AUTH
REV_AUTH            REV_AUTH     REV_AUTH     REV_AUTH
LANG_NO             LANG_NO      LANG_NO      LANG_NO
LANG_NAME           LANG_NAME    LANG_NAME    NATION_NO
LANG_VN             LANG_VN      LANG_VN      SPEC_NO
LANG_SYS            LANG_SYS     LANG_SYS     COLL_NO
NATION_NO           NATION_NO    NATION_NO    KW_MASTER
NATION_NAME         NATION_NAME  NATION_NAME  KW_SLAVE
NATION_VN           NATION_VN    NATION_VN    COMMENT
SPEC_NO             SPEC_NO      SPEC_NO
SPEC_NAME           SPEC_NAME    SPEC_NAME    LANG_NO
COLL_NO             COLL_NO      COLL_NO      LANG_NAME
COLL_NAME           COLL_NAME    COLL_NAME    LANG_VN
KW_MASTER           KW_MASTER    KW_MASTER    LANG_SYS
KW_SLAVE            KW_SLAVE     KW_SLAVE
COMMENT             COMMENT      COMMENT      NATION_NO
                                              NATION_NAME
                                              NATION_VN

                                              SPEC_NO
                                              SPEC_NAME

                                              COLL_NO
                                              COLL_NAME
Secondly, we normalize the reader table and its attributes, which gives this table:

List of attributes  1NF          2NF          3NF
READER_NO           READER_NO    READER_NO    READER_NO
READER_NAME         READER_NAME  READER_NAME  READER_NAME
ADDRESS             ADDRESS      ADDRESS      DEPT_NO
BIRTH_DATE          BIRTH_DATE   BIRTH_DATE   ADDRESS
DEPT_NO             DEPT_NO      DEPT_NO      BIRTH_DATE
DEPT_NAME           DEPT_NAME    DEPT_NAME    COMMENT
COMMENT             COMMENT      COMMENT
                                              DEPT_NO
                                              DEPT_NAME
Thirdly, we normalize the book ticket_L&GB table with its attributes:

List of attributes  1NF          2NF          3NF
READER_NAME         READER_NAME  BOOK_NO      BOOK_NO
BOOK_NO             BOOK_NO      BORROW_DATE  BORROW_DATE
BOOK_NAME           BOOK_NAME    RETURN_DATE  RETURN_DATE
BORROW_DATE         BORROW_DATE  COMMENT      COMMENT
RETURN_DATE         RETURN_DATE
COMMENT             COMMENT      READER_NO    READER_NO
                                 READER_NAME  READER_NAME

                                 BOOK_NO      BOOK_NO
                                 BOOK_NAME    BOOK_NAME
Next comes the magazine table, whose list of attributes contains a
repeating group. The attributes MAG_HEAD_NO, MAG_NAME,
START_YEAR, MAG_SHEFL, ISSN_NO, PUB_HOUSE, LANG_NO,
LANG_NAME, LANG_VN, LANG_SYS, NATION_NO, NATION_NAME,
NATION_VN, SPEC_NO, SPEC_NAME, COLL_NO, COLL_NAME, and
COMMENT are decomposed, without data loss, into a new table whose
primary key is MAG_HEAD_NO. The remaining attributes (MAG_HEAD_NO,
MAG_DETL_NO, YEAR, VOLUME, NUMBER, MONTH, QUANTITY) are put in
another table whose primary key consists of two attributes: MAG_HEAD_NO
and MAG_DETL_NO. This gives the table below:

List of attributes  1NF          2NF          3NF
MAG_HEAD_NO         MAG_HEAD_NO  MAG_HEAD_NO  MAG_HEAD_NO
MAG_DETL_NO         MAG_NAME     MAG_NAME     MAG_NAME
MAG_NAME            START_YEAR   START_YEAR   START_YEAR
START_YEAR          MAG_SHEFL    MAG_SHEFL    MAG_SHEFL
MAG_SHEFL           ISSN_NO      ISSN_NO      ISSN_NO
ISSN_NO             PUB_HOUSE    PUB_HOUSE    PUB_HOUSE
PUB_HOUSE           LANG_NO      LANG_NO      LANG_NO
LANG_NO             LANG_NAME    LANG_NAME    NATION_NO
LANG_NAME           LANG_VN      LANG_VN      SPEC_NO
LANG_VN             LANG_SYS     LANG_SYS     COLL_NO
LANG_SYS            NATION_NO    NATION_NO    COMMENT
NATION_NO           NATION_NAME  NATION_NAME
NATION_NAME         NATION_VN    NATION_VN    LANG_NO
NATION_VN           SPEC_NO      SPEC_NO      LANG_NAME
SPEC_NO             SPEC_NAME    SPEC_NAME    LANG_VN
SPEC_NAME           COLL_NO      COLL_NO      LANG_SYS
COLL_NO             COLL_NAME    COLL_NAME
COLL_NAME           COMMENT      COMMENT      NATION_NO
YEAR                                          NATION_NAME
VOLUME              MAG_HEAD_NO  MAG_HEAD_NO  NATION_VN
NUMBER              MAG_DETL_NO  MAG_DETL_NO
MONTH               YEAR         YEAR         SPEC_NO
QUANTITY            VOLUME       VOLUME       SPEC_NAME
COMMENT             NUMBER       NUMBER
                    MONTH        MONTH        COLL_NO
                    QUANTITY     QUANTITY     COLL_NAME

                                              MAG_HEAD_NO
                                              MAG_DETL_NO
                                              YEAR
                                              VOLUME
                                              NUMBER
                                              MONTH
                                              QUANTITY
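The header and detail split described for the magazine table, with a primary key made of two attributes on the detail side, can be sketched with Python's built-in sqlite3 module. The table names mag_head and mag_detail, the column types, the reduced column list, and the sample rows are all assumptions for illustration. The sketch shows that the combined key lets one magazine have many issues while rejecting a repeated (MAG_HEAD_NO, MAG_DETL_NO) pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mag_head (                 -- hypothetical table name
    MAG_HEAD_NO INTEGER PRIMARY KEY,
    MAG_NAME    TEXT NOT NULL
);
CREATE TABLE mag_detail (               -- hypothetical table name
    MAG_HEAD_NO INTEGER NOT NULL REFERENCES mag_head(MAG_HEAD_NO),
    MAG_DETL_NO INTEGER NOT NULL,
    YEAR        INTEGER,
    VOLUME      INTEGER,
    NUMBER      INTEGER,
    MONTH       INTEGER,
    QUANTITY    INTEGER,
    PRIMARY KEY (MAG_HEAD_NO, MAG_DETL_NO)  -- key of two attributes
);
""")
# Made-up sample rows.
conn.execute("INSERT INTO mag_head VALUES (1, 'Sample Journal')")
conn.execute("INSERT INTO mag_detail VALUES (1, 1, 2009, 12, 1, 1, 3)")
conn.execute("INSERT INTO mag_detail VALUES (1, 2, 2009, 12, 2, 2, 3)")

# A second row with the same (MAG_HEAD_NO, MAG_DETL_NO) pair is refused.
try:
    conn.execute("INSERT INTO mag_detail VALUES (1, 2, 2009, 12, 3, 3, 3)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

issue_count = conn.execute(
    "SELECT COUNT(*) FROM mag_detail WHERE MAG_HEAD_NO = 1").fetchone()[0]
print(duplicate_rejected, issue_count)
```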
And the magazine ticket_L&GB table with its attributes:

READER_NAME         READER_NAME  READER_NAME  MAG_HEAD_NO
MAG_HEAD_NO         MAG_HEAD_NO  MAG_HEAD_NO  MAG_DETL_NO
MAG_DETL_NO         MAG_DETL_NO  MAG_DETL_NO  BORROW_DATE
MAG_NAME            MAG_NAME     MAG_NAME     RETURN_DATE
YEAR                YEAR         YEAR         COMMENT
VOLUME              VOLUME       VOLUME
NUMBER              NUMBER       NUMBER       READER_NO
MONTH               MONTH        MONTH        READER_NAME
BORROW_DATE         BORROW_DATE  QUANTITY     MAG_NAME
RETURN_DATE         RETURN_DATE  BORROW_DATE  YEAR
COMMENT             COMMENT      RETURN_DATE  VOLUME
                                 COMMENT      NUMBER
                                              MONTH
.
.
.
