Normalization Paper


Database Normalization

What is Normalization?

Normalization is the process of efficiently organizing data in a database. The normalization process has two goals: eliminating redundant data (for example, the same data stored in more than one table) and ensuring that data dependencies make sense (only related data is stored in a table). Both are worthy goals, as they reduce the amount of space a database consumes and ensure that data is logically stored.

The Normal Forms

The database community has developed a series of guidelines for ensuring that databases are normalized. These are referred to as normal forms and are numbered from one (the lowest form of normalization, referred to as first normal form or 1NF) through five (fifth normal form or 5NF). In practical applications, we'll often see 1NF, 2NF, and 3NF, along with the occasional 4NF.

First Normal Form (1NF)

First normal form (1NF) sets the very basic rules for an organized database:

Eliminate duplicative columns from the same table.
Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).

Second Normal Form (2NF)

Second normal form (2NF) further addresses the concept of removing duplicative data:

Meet all the requirements of the first normal form.
Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
Create relationships between these new tables and their predecessors through the use of foreign keys.

Third Normal Form (3NF)

Third normal form (3NF) goes one large step further:

Meet all the requirements of the second normal form.
Remove columns that are not directly dependent upon the primary key.

Fourth Normal Form (4NF)

Finally, fourth normal form (4NF) has one additional requirement:

Meet all the requirements of the third normal form.
A relation is in 4NF if it has no non-trivial multi-valued dependencies other than those whose determinant is a superkey.

Converting a database to the first normal form is rather simple. This first rule calls for the elimination of repeating groups of data through the creation of separate tables of related data. For example, let's think of a single table holding everything about a student. The original table contains several sets of repeating groups of data:

studentID  studentName  major  college  collegeLocation  classID  className  professorID  professorName

The class and professor attributes are repeated a fixed number of times in each row (here, three), allowing each student to take multiple classes. However, what if the student takes more than three classes? This, and other restrictions on this table, should be obvious.
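The limitation described above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table name StudentFlat, the numbered columns (classID1 through classID3), the TEXT column types, and the student name are assumptions, not part of the original table.

```python
import sqlite3

# Hypothetical flat table: the numbered class columns are an assumed way
# of modelling the repeating group; only two attributes per class are shown.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE StudentFlat (
        studentID   TEXT PRIMARY KEY,
        studentName TEXT,
        classID1 TEXT, className1 TEXT,
        classID2 TEXT, className2 TEXT,
        classID3 TEXT, className3 TEXT
    )
""")
conn.execute(
    "INSERT INTO StudentFlat VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    ("134-56-7890", "Example Student",   # placeholder name
     "M148", "Math 148", "H151", "History 151", None, None),
)

# Finding everyone enrolled in one class means checking every slot.
enrolled = conn.execute(
    """
    SELECT studentID FROM StudentFlat
    WHERE classID1 = :c OR classID2 = :c OR classID3 = :c
    """,
    {"c": "H151"},
).fetchall()

# The schema fixes the number of class slots; a fourth class would
# require altering the table itself.
slots = [row[1] for row in conn.execute("PRAGMA table_info(StudentFlat)")
         if row[1].startswith("classID")]
```

Every query on class data has to repeat its condition once per slot, and the slot count is baked into the schema, which is the restriction the separate tables remove.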

First Normal Form

Therefore, let's break this mammoth table down into several smaller tables. The first table contains solely student information (Student):

studentID  studentName  major  college  collegeLocation

The second table contains solely class information (Class):

studentID  classID  className

The third table contains solely professor information (Professor):

professorID  professorName
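As a rough sketch, the three tables could be declared as follows using Python's sqlite3 module. The TEXT column types and the composite primary key on Class are assumptions made for illustration; the text specifies only the table and column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (
        studentID       TEXT PRIMARY KEY,
        studentName     TEXT,
        major           TEXT,
        college         TEXT,
        collegeLocation TEXT
    );
    CREATE TABLE Class (
        studentID TEXT REFERENCES Student(studentID),
        classID   TEXT,
        className TEXT,
        PRIMARY KEY (studentID, classID)   -- assumed key: one row per enrollment
    );
    CREATE TABLE Professor (
        professorID   TEXT PRIMARY KEY,
        professorName TEXT
    );
""")

conn.execute("INSERT INTO Student (studentID) VALUES (?)", ("134-56-7890",))

# Each enrollment is its own row, so the number of classes per student
# is no longer limited by the number of columns.
conn.executemany(
    "INSERT INTO Class VALUES (?, ?, ?)",
    [("134-56-7890", "M148", "Math 148"),
     ("134-56-7890", "H151", "History 151")],
)

class_count = conn.execute(
    "SELECT COUNT(*) FROM Class WHERE studentID = ?", ("134-56-7890",)
).fetchone()[0]
```

Adding another class for the same student is just one more INSERT into Class; no schema change is needed.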

Second Normal Form

Once we have separated the data into their respective tables, we can begin concentrating upon the rule of Second Normal Form: that is, the elimination of redundant data. Referring back to the Class table, typical data stored within might look like:

studentID    classID  className
134-56-7890  M148     Math 148
123-45-7894  P113     Physics 113
534-98-9009  H151     History 151
134-56-7890  H151     History 151

While this table structure is certainly improved over the original, there is still room for improvement. In this case, the className attribute is repeated in every row that refers to the same class. With 60,000 students stored in this table, an update to reflect a change in a course name would have to rewrite every row for that class. Therefore, we can create a separate table that contains classID to className mappings (ClassIdentity):

classID  className
M148     Math 148
P113     Physics 113
H151     History 151

The updated Class table would then be simply:

studentID    classID
134-56-7890  M148
123-45-7894  P113
534-98-9009  H151
134-56-7890  H151
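The split into Class and ClassIdentity can be sketched with sqlite3 as below. The rows are the sample data from the tables above; the column types, the key choices, and the new course name used in the update are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ClassIdentity (
        classID   TEXT PRIMARY KEY,
        className TEXT
    );
    CREATE TABLE Class (
        studentID TEXT,
        classID   TEXT REFERENCES ClassIdentity(classID),
        PRIMARY KEY (studentID, classID)
    );
""")
conn.executemany(
    "INSERT INTO ClassIdentity VALUES (?, ?)",
    [("M148", "Math 148"), ("P113", "Physics 113"), ("H151", "History 151")],
)
conn.executemany(
    "INSERT INTO Class VALUES (?, ?)",
    [("134-56-7890", "M148"), ("123-45-7894", "P113"),
     ("534-98-9009", "H151"), ("134-56-7890", "H151")],
)

# Renaming a course touches one row, however many students are enrolled.
# "World History 151" is a made-up name for the example.
cur = conn.execute(
    "UPDATE ClassIdentity SET className = ? WHERE classID = ?",
    ("World History 151", "H151"),
)
rows_changed = cur.rowcount

# The class name is recovered per enrollment by joining on classID.
history_students = conn.execute("""
    SELECT c.studentID, ci.className
    FROM Class AS c
    JOIN ClassIdentity AS ci ON ci.classID = c.classID
    WHERE c.classID = 'H151'
    ORDER BY c.studentID
""").fetchall()
```

The trade-off is that reading a class name now requires a join, in exchange for a single authoritative copy of each name.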

Revisiting the need to update a recently changed course name, all that it would take is the simple update of one row in the ClassIdentity table! Of course, substantial savings in disk space would also result, due to this elimination of redundancy.

Third Normal Form

Continuing on the quest for complete normalization of the school system database, the next step in the process is to satisfy the rule of the Third Normal Form. This rule seeks to eliminate all attributes from a table that are not directly dependent upon the primary key. In the case of the Student table, the college and collegeLocation attributes depend on the major attribute rather than directly on the studentID. Therefore, we can create a new table that relates the major, college and collegeLocation information:

major  college  collegeLocation

The revised Student table would then look like:

studentID  studentName  major
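A minimal sqlite3 sketch of this step follows. The text does not name the new table, so Major is an assumed name, and all row values other than the student IDs are placeholders invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Major (                 -- assumed table name
        major           TEXT PRIMARY KEY,
        college         TEXT,
        collegeLocation TEXT
    );
    CREATE TABLE Student (
        studentID   TEXT PRIMARY KEY,
        studentName TEXT,
        major       TEXT REFERENCES Major(major)
    );
""")

# Placeholder data: college, location, and student names are invented.
conn.execute(
    "INSERT INTO Major VALUES (?, ?, ?)",
    ("History", "Example College", "Example Hall"),
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [("134-56-7890", "Student A", "History"),
     ("534-98-9009", "Student B", "History")],
)

# The college location is stored once per major, so moving the college
# is a single-row update regardless of how many students share the major.
conn.execute(
    "UPDATE Major SET collegeLocation = ? WHERE major = ?",
    ("New Hall", "History"),
)

locations = conn.execute("""
    SELECT s.studentID, m.collegeLocation
    FROM Student AS s
    JOIN Major AS m ON m.major = s.major
    ORDER BY s.studentID
""").fetchall()
```

Both students see the new location through the join, without any change to the Student table.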

Although in most cases these three Normal Forms are sufficient for proper database normalization, there are still other Forms that go beyond the rules covered thus far.
