
S2 BCA DBMS M4

Data Redundancy / Need for Normalization


Data redundancy means storing the same information in more than one place. The problems
caused by data redundancy are:
• Space wastage
• Insertion problem
• Update problem
• Deletion problem
Consider the following table, which records courses and the students enrolled in them:

Student Number   Course Number   Student Name   Country   Course Name
S21              9201            Jones          USA       Accounts
S21              9267            Jones          USA       Physics
S24              9267            Smith          INDIA     Physics

Space Wastage: If student S24 takes a new course, his name and country are repeated in the new row,
which wastes space.
Update Anomalies: An update anomaly exists when some, but not all, copies of a duplicated value are
updated. For example, if Jones changes his country, every row that stores Jones's country must be
updated; otherwise the table holds conflicting values for the same student.
Delete Anomalies: A delete anomaly exists when deleting one fact also loses the values of other
attributes. For example, if student S21 leaves the Accounts course, the only row that mentions Accounts
is deleted, so all information about the Accounts course is lost as well.
Insert Anomalies: An insert anomaly occurs when certain attribute values cannot be inserted into the
database without values for other attributes. For example, we cannot add a new course until at least
one student is enrolled in it.
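The update and delete anomalies described above can be reproduced with a small sketch. It holds the example table as a list of Python dictionaries; the key names and the new country value "UK" are assumptions made for illustration and do not come from the notes.

```python
# Rows of the unnormalized example table (key names are illustrative).
enrolments = [
    {"student_no": "S21", "course_no": "9201", "name": "Jones", "country": "USA", "course": "Accounts"},
    {"student_no": "S21", "course_no": "9267", "name": "Jones", "country": "USA", "course": "Physics"},
    {"student_no": "S24", "course_no": "9267", "name": "Smith", "country": "INDIA", "course": "Physics"},
]

# Update anomaly: change Jones's country in only one of his two rows.
# "UK" is a hypothetical new value.
enrolments[0]["country"] = "UK"
countries_for_s21 = {r["country"] for r in enrolments if r["student_no"] == "S21"}
# countries_for_s21 now holds two conflicting values for a single student.

# Delete anomaly: S21 leaves course 9201 (Accounts), the only row naming that course.
remaining = [
    r for r in enrolments
    if not (r["student_no"] == "S21" and r["course_no"] == "9201")
]
courses_left = {r["course"] for r in remaining}
# "Accounts" no longer appears anywhere in the table.
```

Both problems arise because student facts and course facts share the same rows.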
Normalization of a Database/Relation
• Database normalization is a technique for organizing the attributes of database tables.
• Normalization is a systematic approach to decomposing tables in order to eliminate data redundancy.
• Normalization is the process of replacing a relation with a collection of smaller relations.
• Normalization is also known as Decomposition or Schema Refinement.


Levels of Normalization

1. First Normal Form (1NF)
2. Second Normal Form (2NF)
3. Third Normal Form (3NF)
4. Boyce-Codd Normal Form (BCNF / 3.5NF)

1. First Normal Form (1NF)

A table is in 1NF if every field in the table contains only atomic (indivisible) values.

#Book_id   Book_Name   Book_Price   Author_Name
B1         DBMS        750          JOHN,ROY
B2         VB          900          ROY,TOM

In the above Book table the attribute Author_Name is not atomic, because a single field holds more
than one author.

1NF – Decomposition

• Move the repeating groups to a new table.
• Identify a primary key for the new table. The primary key of the new table will be the
primary key of the existing table combined with one or more columns from the new table.

Existing Table
#Book_id   Book_Name   Book_Price
B1         DBMS        750
B2         VB          900

New Table
#Book_id   #Author_Name
B1         JOHN
B1         ROY
B2         ROY
B2         TOM

The primary key of the new table is (Book_id, Author_Name). The Book_id attribute is also a
foreign key in the new table, referencing the existing table.
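The 1NF decomposition above can be sketched in a few lines of Python. This is only an illustration, assuming the authors are stored as a comma-separated string as in the example; the variable names are hypothetical.

```python
# Unnormalized rows: (Book_id, Book_Name, Book_Price, Author_Name).
books = [
    ("B1", "DBMS", 750, "JOHN,ROY"),
    ("B2", "VB", 900, "ROY,TOM"),
]

# Existing table without the repeating group.
book_table = [(book_id, name, price) for book_id, name, price, _ in books]

# New table: one row per (Book_id, Author_Name) pair, so every field is atomic.
book_author_table = [
    (book_id, author.strip())
    for book_id, _, _, authors in books
    for author in authors.split(",")
]
```

Each (Book_id, Author_Name) pair occurs only once, which is why the pair can serve as the primary key of the new table.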


Functional Dependencies

A functional dependency occurs when one attribute in a relation uniquely determines another attribute.
Definition: Let R be a relation, and let A and B be attributes of R. B is functionally dependent
on A if and only if each value of A in R is associated with exactly one value of B in R.
Symbolically, this functional dependency is written A → B (read "A functionally determines B" or
"attribute B is functionally dependent on attribute A").
Example
Consider the Employee table

#Employee number   Employee Name   Salary   City
1                  Dana            50000    San Francisco
2                  Francis         38000    London
3                  Andrew          25000    Tokyo

In this example, if we know the value of Employee number, we can obtain Employee Name, City and
Salary. We therefore say that Employee Name, City and Salary are functionally dependent on
Employee number.
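The definition can be tested against sample rows with a short helper. The sketch below is an illustration only: the function name fd_holds, the dictionary keys and the fourth employee row are assumptions, not part of the notes. Sample data can refute a functional dependency but cannot prove that it holds for all future rows.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


employees = [
    {"emp_no": 1, "name": "Dana", "salary": 50000, "city": "San Francisco"},
    {"emp_no": 2, "name": "Francis", "salary": 38000, "city": "London"},
    {"emp_no": 3, "name": "Andrew", "salary": 25000, "city": "Tokyo"},
]

# Employee number -> (name, salary, city) is consistent with the sample rows.
emp_no_determines_rest = fd_holds(employees, ["emp_no"], ["name", "salary", "city"])

# Hypothetical fourth row (not in the source table): a second London employee.
with_extra = employees + [
    {"emp_no": 4, "name": "Maria", "salary": 41000, "city": "London"},
]
# City -> salary fails, because London now maps to two different salaries.
city_determines_salary = fd_holds(with_extra, ["city"], ["salary"])
```

With the hypothetical row added, Employee number still determines every other attribute, while City does not determine Salary.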


2. Second Normal Form (2NF)

A table is in 2NF if it meets two requirements:
• The table is in first normal form.
• Every non-key attribute in the table is fully functionally dependent on the entire primary key;
that is, all partial dependencies have been removed from the table.
Partial Dependency
If a non-key attribute is functionally dependent on only a part of the primary key, that dependency
is known as a partial dependency.
Example-1
#Teacher_id   #Subject_Name        Teacher_Qualification
T1            DBMS                 MCA
T1            VB                   MCA
T2            Internet Marketing   MBA
T2            Marketing Research   MBA
T3            DBMS                 MSC CS

Consider the above table. Its primary key is (Teacher_id, Subject_Name). The qualification of a
teacher can be determined from Teacher_id alone, so the non-key attribute Teacher_Qualification
depends only on Teacher_id (Subject_Name does not identify the qualification of the teacher).
Teacher_id is only a part of the primary key, so this is a partial dependency and the table is
not in 2NF.
2NF – Decomposition

• If an attribute is dependent on only a part of the primary key, move that attribute and a copy of
that part of the primary key to a new table.
• Make the copied part of the primary key the primary key of the new table.


Old Table
#Teacher_id   #Subject_Name
T1            DBMS
T1            VB
T2            Internet Marketing
T2            Marketing Research
T3            DBMS

New Table
#Teacher_id   Teacher_Qualification
T1            MCA
T2            MBA
T3            MSC CS

The Teacher_id attribute acts as the primary key of the new table.
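The decomposition can be checked by projecting the original rows onto the two new tables and joining them back on Teacher_id. The following is a minimal sketch with hypothetical variable names; it is one way to confirm that no information is lost, not a prescribed method.

```python
# Original rows: (Teacher_id, Subject_Name, Teacher_Qualification).
teaches = [
    ("T1", "DBMS", "MCA"),
    ("T1", "VB", "MCA"),
    ("T2", "Internet Marketing", "MBA"),
    ("T2", "Marketing Research", "MBA"),
    ("T3", "DBMS", "MSC CS"),
]

# Projections onto the two new tables; sets drop the duplicated qualifications.
teacher_subject = sorted({(t, s) for t, s, _ in teaches})
teacher_qualification = sorted({(t, q) for t, _, q in teaches})

# Join the two tables back on Teacher_id.
qualification_of = dict(teacher_qualification)
rejoined = sorted((t, s, qualification_of[t]) for t, s in teacher_subject)
```

The qualification table has three rows instead of five, and the join reproduces the original table exactly.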
Example-2
Consider the Student table
#Student_ID   Student_Name   Marks   #Subject_ID   Subject_Name
S1            A              24      SU1           C
S2            B              24      SU1           C
S1            A              23      SU2           C++
S2            B              22      SU2           C++
The primary key of the above table is (Student_ID, Subject_ID). The functional dependencies are:
Student_ID → Student_Name
Subject_ID → Subject_Name
(Student_ID, Subject_ID) → Marks

These dependencies lead to the following relations:
(#Student_ID, Student_Name)
(#Subject_ID, Subject_Name)
(#Student_ID, #Subject_ID, Marks)


Each dependency is examined in turn:
• Student_ID → Student_Name (Student_Name can be determined from Student_ID alone, and
Student_ID is only a part of the primary key.)
• Subject_ID → Subject_Name (Subject_Name can be determined from Subject_ID alone, and
Subject_ID is only a part of the primary key.)
Both of the above are partial dependencies and violate the rule of 2NF.
• (Student_ID, Subject_ID) → Marks (The marks of a student in a particular subject can be
determined only from the (Student_ID, Subject_ID) combination. The non-key attribute Marks
depends on the entire primary key, so Marks is retained in the original table.)
So the above table can be decomposed into three tables based on these dependencies.

Existing Table
#Student_ID   #Subject_ID   Marks
S1            SU1           24
S2            SU1           24
S1            SU2           23
S2            SU2           22

New Table-1
#Student_ID   Student_Name
S1            A
S2            B

New Table-2
#Subject_ID   Subject_Name
SU1           C
SU2           C++
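Partial dependencies such as the two found above can also be searched for mechanically: test every proper subset of the primary key against every non-key attribute. The sketch below assumes the Student table from this example stored as dictionaries; the function names and key names are hypothetical, and a dependency that holds in sample rows is only a candidate that still has to be confirmed against the meaning of the data.

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def partial_dependencies(rows, key, non_key):
    """List (part_of_key, attribute) pairs where a proper subset of key determines attribute."""
    found = []
    for size in range(1, len(key)):
        for part in combinations(key, size):
            for attr in non_key:
                if fd_holds(rows, part, [attr]):
                    found.append((part, attr))
    return found


students = [
    {"student_id": "S1", "student_name": "A", "marks": 24, "subject_id": "SU1", "subject_name": "C"},
    {"student_id": "S2", "student_name": "B", "marks": 24, "subject_id": "SU1", "subject_name": "C"},
    {"student_id": "S1", "student_name": "A", "marks": 23, "subject_id": "SU2", "subject_name": "C++"},
    {"student_id": "S2", "student_name": "B", "marks": 22, "subject_id": "SU2", "subject_name": "C++"},
]

partials = partial_dependencies(
    students,
    key=("student_id", "subject_id"),
    non_key=["student_name", "marks", "subject_name"],
)
```

For these rows the search reports Student_ID → Student_Name and Subject_ID → Subject_Name, and it does not report Marks, which needs the whole key.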

Transitive Dependency

A transitive dependency occurs when a non-key attribute depends on another non-key attribute.
Definition: Let A, B and C be attributes of relation R, where A is the key attribute, and suppose
the functional dependencies A → B and B → C hold in R.
In relation R, attribute A determines B and B determines C. Therefore attribute A determines C via B.
Symbolically, A → C, and attribute C is transitively dependent on A.
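The step from A → B and B → C to A → C can be traced with an attribute closure, which repeatedly applies the known dependencies until nothing new is added. This is a minimal sketch; the function name closure and the representation of dependencies as pairs of sets are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs once lhs is fully contained in the result.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# The two dependencies from the definition: A -> B and B -> C.
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]

closure_of_a = closure({"A"}, fds)
closure_of_b = closure({"B"}, fds)
```

C appears in the closure of A only because B was added first, which is exactly the transitive dependency A → C via B.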


3. Third Normal Form (3NF)


For a table to be in 3NF, there are two requirements:
• The table is in second normal form.
• The table has no transitive dependencies; that is, no non-key attribute is transitively dependent
on the primary key, or equivalently no non-key attribute depends on another non-key attribute.
Example-1
#Eid   Employee_Name   Pin_Code   Place
E1     JOHN            110007     Delhi University
E2     ROY             110016     Green Park
E3     TOM             110007     Delhi University

In the above table the attribute Place depends directly on the attribute Pin_Code: the value of
Place can be determined from Pin_Code, and Pin_Code is a non-key attribute.
The dependency Pin_Code → Place creates the transitive dependency Eid → Place with Pin_Code as the
intermediate attribute, so the above table is not in 3NF.
3NF – Decomposition

• Move all attributes that cause the transitive dependency to a new table.
• Identify a primary key for the new table.
• Keep the primary key of the new table as a foreign key in the original table.

Existing Table
#Eid   Employee_Name   Pin_Code
E1     JOHN            110007
E2     ROY             110016
E3     TOM             110007

New Table 1
#Pin_Code   Place
110016      Green Park
110007      Delhi University

The Pin_Code attribute is the primary key of the new table, and Pin_Code acts as a foreign key
in the existing table.
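The two tables and the foreign key between them can be tried out with Python's built-in sqlite3 module. The sketch below is illustrative only: the table names pin and employee, the replacement place name "North Campus" and the extra employee E4 are assumptions that do not appear in the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE pin (pin_code TEXT PRIMARY KEY, place TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE employee ("
    " eid TEXT PRIMARY KEY,"
    " employee_name TEXT NOT NULL,"
    " pin_code TEXT NOT NULL REFERENCES pin(pin_code))"
)
conn.executemany(
    "INSERT INTO pin VALUES (?, ?)",
    [("110007", "Delhi University"), ("110016", "Green Park")],
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("E1", "JOHN", "110007"), ("E2", "ROY", "110016"), ("E3", "TOM", "110007")],
)


def employees_with_place():
    """Rebuild the original four-column table by joining on pin_code."""
    return conn.execute(
        "SELECT e.eid, e.employee_name, e.pin_code, p.place"
        " FROM employee e JOIN pin p ON p.pin_code = e.pin_code"
        " ORDER BY e.eid"
    ).fetchall()


original_rows = employees_with_place()

# Hypothetical rename: the place is stored once, so one UPDATE reaches E1 and E3.
conn.execute("UPDATE pin SET place = ? WHERE pin_code = ?", ("North Campus", "110007"))
updated_rows = employees_with_place()

# An employee whose pin code is missing from the pin table is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (?, ?, ?)", ("E4", "SAM", "999999"))
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

Because each place is stored in a single row, the update anomaly of the unnormalized table cannot occur here.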


Example-2
Consider a table that holds the data of a bookstore. Book_ID is the primary key of the table.
#Book_ID   Genre_ID   Genre_Name   Price
100        1          FICTION      100
101        2          TRAVEL       50
102        1          FICTION      95
103        3          SPORTS       150

In the above table there exists a dependency between Genre_ID and Genre_Name: Genre_Name can be
determined from Genre_ID, which is a non-key attribute. Genre_Name is therefore transitively
dependent on Book_ID, and the table can be decomposed into two.

Old Table
#Book_ID   Genre_ID   Price
100        1          100
101        2          50
102        1          95
103        3          150

New Table
#Genre_ID   Genre_Name
1           FICTION
2           TRAVEL
3           SPORTS

4. Boyce-Codd Normal Form (BCNF / 3.5NF)


An attribute that is part of a candidate key is known as a prime attribute. In particular, when the
primary key is composite (consists of more than one attribute), each of its attributes is a prime
attribute.
#Student_ID   Student_Name   Marks   #Subject_ID   Subject_Name

In the above column headings the primary key is the (Student_ID, Subject_ID) combination, so the
prime attributes are:
1. Student_ID
2. Subject_ID


A relation R is in 3NF if:
I. It is in 2NF
II. For each functional dependency X → Y in R, at least one of the following conditions is met
A. X is a key (more precisely, a superkey) of R
B. Y is a prime attribute of R
A relation R is in BCNF if:
I. It is in 3NF
II. For each functional dependency X → Y in R
A. X is a key (more precisely, a superkey) of R

BCNF is stricter than 3NF: every relation that is in BCNF is also in 3NF, but not every relation
that is in 3NF is in BCNF. BCNF rules out dependencies in which a prime attribute depends on
attributes that do not form a key. An example is as follows.

Example:
Consider the table R with attributes (#A, B, #C)
The key of the table is {A, C}
The functional dependencies are
1. AC → B (AC is the key of the table R, so the first 3NF condition is met)
2. B → C (C is a prime attribute of the table R, so the second 3NF condition is met)
So the above table is in 3NF
But the table is not in BCNF
1. AC → B (AC is the key of the table R, so the BCNF condition is met)
2. B → C (B is not a key of the table R, so the BCNF condition is not met)
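The same check can be carried out mechanically with attribute closures. The sketch below is one possible way to do it, with hypothetical function and variable names; it treats "key" as "superkey" and derives the prime attributes from all candidate keys. For this relation the computation finds {A, B} as a second candidate key alongside {A, C}, which does not change the conclusion.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


attributes = {"A", "B", "C"}
fds = [({"A", "C"}, {"B"}), ({"B"}, {"C"})]  # AC -> B and B -> C


def is_superkey(attrs):
    return closure(attrs, fds) == attributes


# Candidate keys are the minimal superkeys; smaller subsets are tried first.
candidate_keys = []
for size in range(1, len(attributes) + 1):
    for combo in combinations(sorted(attributes), size):
        candidate = frozenset(combo)
        if is_superkey(candidate) and not any(k < candidate for k in candidate_keys):
            candidate_keys.append(candidate)

prime = set().union(*candidate_keys)

# BCNF: the left side of every dependency must be a superkey.
violates_bcnf = [(lhs, rhs) for lhs, rhs in fds if not is_superkey(lhs)]
# 3NF also accepts a dependency whose right side consists of prime attributes.
violates_3nf = [(lhs, rhs) for lhs, rhs in violates_bcnf if not (rhs - lhs) <= prime]
```

Only B → C is reported as a BCNF violation, and nothing is reported for 3NF, which matches the reasoning above.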
