
One of the more crucial topics in database management is the process of normalizing the tables in a relational database. The underlying ideas in normalization are simple enough. Through normalization we want to design for our relational database a set of files that (1) contain all the data necessary for the purposes that the database is to serve, (2) have as little redundancy as possible, (3) accommodate multiple values for the types of data that require them, (4) permit efficient updates of the data in the database, and (5) avoid the danger of losing data unknowingly.

The primary reason for normalizing databases to at least the level of the 3rd normal form (the levels are explained below) is that normalization is a potent weapon against the possible corruption of databases stemming from what are called insertion anomalies, deletion anomalies, and update anomalies. These types of error can creep into databases that are insufficiently normalized.

Normalization can be viewed as a series of steps designed, one after another, to deal with ways in which tables can be too complicated for their own good. The purpose of normalization is to reduce the chances for anomalies to occur in a database. The definitions of the various levels of normalization illustrate the complications to be eliminated in order to reduce those chances. At every level, and for every table with a complication, the resolution of the problem turns out to be the establishment of two or more simpler tables which, as a group, contain the same information as the original table but, because of their simpler individual structures, lack the complication.

1st Normal Form (1NF)

A table (relation) is in 1NF if:
1. There are no duplicated rows in the table.
2. Each cell is single-valued (no repeating groups or arrays).
3. Entries in a column (field) are of the same kind.

*The requirement that there be no duplicated rows in the table means that the table has a key (although the key might be made up of more than one column, even, possibly, of all the columns).

2nd Normal Form (2NF)

A table is in 2NF if it is in 1NF and if all non-key attributes are dependent on all of the key. Since a partial dependency occurs when a non-key attribute is dependent on only a part of the composite key, the definition of 2NF is sometimes phrased as: a table is in 2NF if it is in 1NF and if it has no partial dependencies.

3rd Normal Form (3NF)

A table is in 3NF if it is in 2NF and if it has no transitive dependencies.

Boyce-Codd Normal Form (BCNF)

A table is in BCNF if it is in 3NF and if every determinant is a candidate key.

4th Normal Form (4NF)

A table is in 4NF if it is in BCNF and if it has no non-trivial multi-valued dependencies other than those determined by a candidate key.

5th Normal Form (5NF)

A table is in 5NF, also called Projection-Join Normal Form (PJNF), if it is in 4NF and if every join dependency in the table is a consequence of the candidate keys of the table.

Domain-Key Normal Form (DKNF)

A table is in DKNF if every constraint on the table is a logical consequence of the definition of keys and domains.

Insertion Anomaly

An insertion anomaly is a failure to place information about a new database entry into all the places in the database where information about the new entry needs to be stored. In a properly normalized database, information about a new entry needs to be inserted into only one place in the database. In an inadequately normalized database, information about a new entry may need to be inserted into more than one place and, human fallibility being what it is, some of the needed additional insertions may be missed.

Deletion Anomaly

A deletion anomaly is a failure to remove information about an existing database entry when it is time to remove that entry. In a properly normalized database, information about an old, to-be-gotten-rid-of entry needs to be deleted from only one place in the database. In an inadequately normalized database, information about that old entry may need to be deleted from more than one place.

Update Anomaly

An update of a database involves modifications that may be additions, deletions, or both. Thus update anomalies can be either of the kinds discussed above.

All three kinds of anomalies are highly undesirable, since their occurrence constitutes corruption of the database. Properly normalized databases are much less susceptible to corruption than are un-normalized databases.

To normalize databases, there are certain rules to keep in mind. These pages illustrate the basics of normalization in a simplified way, followed by some examples.

Database Normalization Rule 1: Eliminate Repeating Groups. Make a separate table for each set of related attributes, and give each table a primary key.

Unnormalized Data Items for Puppies: puppy number, puppy name, kennel code, kennel name, kennel location, trick ID, trick name, trick where learned, skill level
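As a rough illustration of why a repeating group is awkward, the sketch below models unnormalized puppy records in Python, with the tricks nested inside each puppy. Only the attribute names come from the list above. The sample values (puppy numbers, the second puppy Rex, the trick IDs) and the helper `knows_trick` are made up for illustration.

```python
# Hypothetical sample data; only the attribute names come from the text.
puppies = [
    {
        "puppy_number": 1,
        "puppy_name": "Fifi",
        "tricks": [  # repeating group: zero, one, or many entries per puppy
            {"trick_id": 27, "trick_name": "roll over"},
            {"trick_id": 16, "trick_name": "nose stand"},
        ],
    },
    {"puppy_number": 2, "puppy_name": "Rex", "tricks": []},
]


def knows_trick(records, puppy_name, trick_name):
    """Find the puppy's record first, then scan its nested trick list."""
    for record in records:
        if record["puppy_name"] == puppy_name:
            return any(t["trick_name"] == trick_name for t in record["tricks"])
    return False


can_fifi_roll_over = knows_trick(puppies, "Fifi", "roll over")
can_rex_roll_over = knows_trick(puppies, "Rex", "roll over")
```

Every question about a trick has to walk through a puppy record and then through its list, which is the two-step search that separating the tricks into their own table is meant to avoid.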

In the original list of data, each puppy description is followed by a list of tricks the puppy has learned. Some might know 10 tricks, some might not know any. To answer the question "Can Fifi roll over?" we first need to find Fifi's puppy record, then scan the list of tricks associated with the record. This is awkward, inefficient, and extremely untidy.

Moving the tricks into a separate table helps considerably. Separating the repeating groups of tricks from the puppy information results in first normal form. The puppy number in the trick table matches the primary key in the puppy table, providing a foreign key for relating the two tables with a join operation. Now we can answer our question with a direct retrieval: look to see if Fifi's puppy number and the trick ID for roll over appear together in the trick table.

First Normal Form:

Puppy Table: puppy number, puppy name, kennel code, kennel name, kennel location
Trick Table: puppy number, trick ID, trick name, trick where learned, skill level

Database Normalization Rule 2: Eliminate Redundant Data. If an attribute depends on only part of a multi-valued (composite) key, remove it to a separate table.

The trick name (e.g. roll over) appears redundantly for every puppy that knows it. Just the trick ID would do.

TRICK TABLE (primary key: Puppy Number + Trick ID)

Puppy Number   Trick ID   Trick Name   Where Learned   Skill Level
52             27         roll over    16              9
53             16         Nose Stand   9               5
54             27         roll over    9               9
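The redundancy in this trick table can be seen in a small sketch using Python's built-in `sqlite3` module. The rows are the sample values above, assuming they read as (52, 27, roll over, 16, 9), (53, 16, Nose Stand, 9, 5), and (54, 27, roll over, 9, 9). The SQL column names and the misspelled name `rollover` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE trick (
        puppy_number  INTEGER NOT NULL,
        trick_id      INTEGER NOT NULL,
        trick_name    TEXT    NOT NULL,
        where_learned INTEGER,
        skill_level   INTEGER,
        PRIMARY KEY (puppy_number, trick_id)  -- composite key
    )
    """
)
conn.executemany(
    "INSERT INTO trick VALUES (?, ?, ?, ?, ?)",
    [
        (52, 27, "roll over", 16, 9),
        (53, 16, "Nose Stand", 9, 5),
        (54, 27, "roll over", 9, 9),
    ],
)

# Direct retrieval: do puppy 52 and trick 27 appear together?
knows = conn.execute(
    "SELECT COUNT(*) FROM trick WHERE puppy_number = ? AND trick_id = ?",
    (52, 27),
).fetchone()[0] == 1

# The trick name is stored once per puppy that knows the trick.
copies = conn.execute(
    "SELECT COUNT(*) FROM trick WHERE trick_id = 27"
).fetchone()[0]

# Renaming the trick in only one row leaves two names for one trick ID.
conn.execute(
    "UPDATE trick SET trick_name = 'rollover' "
    "WHERE puppy_number = 52 AND trick_id = 27"
)
names_for_27 = conn.execute(
    "SELECT COUNT(DISTINCT trick_name) FROM trick WHERE trick_id = 27"
).fetchone()[0]
```

The lookup by puppy number and trick ID is direct, but the final count shows how easily the copies of a trick name can drift apart when the name is stored in every row.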

*Note that the trick name depends on only a part (the trick ID) of the multi-valued, i.e. composite, key.

In the trick table, the primary key is made up of the puppy number and the trick ID. This makes sense for the where learned and skill level attributes, since they will be different for every puppy-trick combination. But the trick name depends only on the trick ID. The same name will appear redundantly every time its associated ID appears in the trick table.

Second Normal Form

Puppy Table: puppy number, puppy name, kennel code, kennel name, kennel location
Tricks Table: trick ID, trick name
Puppy-Tricks: puppy number, trick ID, trick where learned, skill level

Suppose you want to reclassify a trick, i.e. to give it a different trick ID. The change has to be made for every puppy that knows the trick. If you miss some of the changes, you will have several puppies with the same trick under different IDs. This is an update anomaly.

Database Normalization Rule 3: Eliminate Columns Not Dependent on Key. If attributes do not contribute to a description of the key, remove them to a separate table.

Puppy Table: puppy number, puppy name, kennel code, kennel name, kennel location

The puppy table satisfies the first normal form, since it contains no repeating groups. It satisfies the second normal form, since it does not have a composite key. But the key is puppy number, and the kennel name and the kennel location describe only a kennel, not a puppy. To achieve the third normal form, they must be moved into a separate table. Since they describe a kennel, kennel code becomes the key of the new kennels table.

Third Normal Form

Puppies: puppy number, puppy name, kennel code
Kennels: kennel code, kennel name, kennel location
Tricks: trick ID, trick name
Puppy-Tricks: puppy number, trick ID, trick where learned, skill level

The motivation for this is the same as for the second normal form: we want to avoid update and delete anomalies. For example, suppose no puppies from the Puppy Farm were currently stored in the database. With the previous design, there would be no record of the kennel's existence.

Boyce-Codd Normal Form (BCNF)

The normal forms up to this point are considered elementary and should be applied to tables routinely during the design process. BCNF and the forms that follow, however, matter only for tables with special structures.

A table is in BCNF (Boyce-Codd Normal Form) if it is already in 3NF and contains no nontrivial functional dependency whose determinant is not a candidate key. That is, no field or set of fields other than a candidate key can determine the value of another field. Let's take the following table:

Student   Subject   Teacher
Smith     Math      Dr. White
Smith     English   Dr. Brown
Jones     Math      Dr. White
Jones     English   Dr. Brown
Doe       Math      Dr. Green

Consider the following conditions:

1. For each subject, every student is taught by one teacher.
2. Every teacher teaches one subject only.
3. Each subject can be taught by more than one teacher.
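These conditions can be checked against the sample rows with a few lines of Python. The sketch below is only an illustration: the helper `holds` and the lowercase column names are made up here. Sample data can refute a functional dependency but cannot prove one, since dependencies come from the rules of the business, not from whichever rows happen to be stored.

```python
COLUMNS = ("student", "subject", "teacher")
rows = [
    ("Smith", "Math", "Dr. White"),
    ("Smith", "English", "Dr. Brown"),
    ("Jones", "Math", "Dr. White"),
    ("Jones", "English", "Dr. Brown"),
    ("Doe", "Math", "Dr. Green"),
]


def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    lhs_idx = [COLUMNS.index(c) for c in lhs]
    rhs_idx = [COLUMNS.index(c) for c in rhs]
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in lhs_idx)
        value = tuple(row[i] for i in rhs_idx)
        # setdefault stores the first value seen for this key
        if seen.setdefault(key, value) != value:
            return False
    return True


# Condition 1: (student, subject) determines teacher.
student_subject_to_teacher = holds(rows, ["student", "subject"], ["teacher"])
# Condition 2: teacher determines subject.
teacher_to_subject = holds(rows, ["teacher"], ["subject"])
# Condition 3: subject does NOT determine teacher.
subject_to_teacher = holds(rows, ["subject"], ["teacher"])
```

The first two checks succeed and the third fails, which matches the three conditions: Math is taught by both Dr. White and Dr. Green.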

It's clear we have the following functional dependency:

Teacher -> Subject

The left side of this dependency is not a candidate key. So, to convert the table from 3NF to BCNF, we take these steps:

1. Identify a determinant in the table that is not a candidate key; it is the left side of the functional dependency (here, Teacher).
2. Remove the attribute on the right side of the functional dependency (here, Subject) from the main table.
3. Make a new table for this dependency, with its key being the left side of the dependency.

The result is the following two tables:

Student   Teacher
Smith     Dr. White
Smith     Dr. Brown
Jones     Dr. White
Jones     Dr. Brown
Doe       Dr. Green

and

Teacher     Subject
Dr. White   Math
Dr. Brown   English
Dr. Green   Math
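One way to convince yourself that this split loses no information is to join the two smaller tables back together and compare the result with the original. The following is a minimal sketch in plain Python using sets of tuples; the variable names are illustrative only.

```python
original = {
    ("Smith", "Math", "Dr. White"),
    ("Smith", "English", "Dr. Brown"),
    ("Jones", "Math", "Dr. White"),
    ("Jones", "English", "Dr. Brown"),
    ("Doe", "Math", "Dr. Green"),
}

# Project the original table onto the two smaller tables.
student_teacher = {(student, teacher) for student, _subject, teacher in original}
teacher_subject = {(teacher, subject) for _student, subject, teacher in original}

# Natural join of the two tables on the shared Teacher column.
rejoined = {
    (student, subject, teacher)
    for student, teacher in student_teacher
    for other_teacher, subject in teacher_subject
    if other_teacher == teacher
}

is_lossless = rejoined == original
```

The join reproduces exactly the five original rows, and each teacher's subject is now stored only once, in the three-row Teacher/Subject table.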

Database Normalization Rule 4: Isolate Independent Multiple Relationships. No table may contain two or more 1:n (one-to-many) or n:m (many-to-many) relationships that are not directly related.

This applies only to designs that include one-to-many and many-to-many relationships. An example of a one-to-many relationship is that one kennel can hold many puppies. An example of a many-to-many relationship is that a puppy can know many tricks and several puppies can know the same tricks.

Puppy Tricks and Costumes: puppy number, trick ID, trick where learned, skill level, costume

Suppose we want to add a new attribute, costume, to the puppy-tricks table. This way we can look for puppies that can both sit up and beg and wear a Groucho Marx mask, for example. The fourth normal form dictates against this (i.e. against using the puppy-tricks table, not against begging while wearing a Groucho mask). The two attributes do not share a meaningful relationship. A puppy may be able to wear a wet suit. This does not mean it can simultaneously sit up and beg. How will you represent this if you store both attributes in the same table?

Fourth Normal Form

Puppies: puppy number, puppy name, kennel code
Kennels: kennel code, kennel name, kennel location
Tricks: trick ID, trick name
Puppy-Tricks: puppy number, trick ID, trick where learned, skill level
Costumes: costume number, costume name
Puppy-Costumes: puppy number, costume number
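The fourth-normal-form design can be sketched as an in-memory SQLite schema using Python's built-in `sqlite3` module. The table and attribute names follow the lists above. The column types, kennel codes, puppy names, and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE kennels (
        kennel_code     TEXT PRIMARY KEY,
        kennel_name     TEXT NOT NULL,
        kennel_location TEXT
    );
    CREATE TABLE puppies (
        puppy_number INTEGER PRIMARY KEY,
        puppy_name   TEXT NOT NULL,
        kennel_code  TEXT REFERENCES kennels (kennel_code)
    );
    CREATE TABLE tricks (
        trick_id   INTEGER PRIMARY KEY,
        trick_name TEXT NOT NULL
    );
    CREATE TABLE puppy_tricks (
        puppy_number  INTEGER REFERENCES puppies (puppy_number),
        trick_id      INTEGER REFERENCES tricks (trick_id),
        where_learned INTEGER,
        skill_level   INTEGER,
        PRIMARY KEY (puppy_number, trick_id)
    );
    CREATE TABLE costumes (
        costume_number INTEGER PRIMARY KEY,
        costume_name   TEXT NOT NULL
    );
    CREATE TABLE puppy_costumes (
        puppy_number   INTEGER REFERENCES puppies (puppy_number),
        costume_number INTEGER REFERENCES costumes (costume_number),
        PRIMARY KEY (puppy_number, costume_number)
    );

    -- Hypothetical sample rows.
    INSERT INTO kennels VALUES ('PF', 'Puppy Farm', NULL), ('K2', 'Other Kennel', NULL);
    INSERT INTO puppies VALUES (1, 'Fifi', 'K2'), (2, 'Rex', 'K2');
    INSERT INTO tricks VALUES (1, 'sit up and beg'), (2, 'roll over');
    INSERT INTO costumes VALUES (1, 'Groucho Marx mask'), (2, 'wet suit');
    INSERT INTO puppy_tricks VALUES (1, 1, NULL, 9), (2, 2, NULL, 5);
    INSERT INTO puppy_costumes VALUES (1, 1), (2, 1), (2, 2);
    """
)

# Puppies that know a given trick and can wear a given costume.
query = """
    SELECT p.puppy_name
    FROM puppies AS p
    JOIN puppy_tricks   AS pt ON pt.puppy_number = p.puppy_number
    JOIN tricks         AS t  ON t.trick_id = pt.trick_id
    JOIN puppy_costumes AS pc ON pc.puppy_number = p.puppy_number
    JOIN costumes       AS c  ON c.costume_number = pc.costume_number
    WHERE t.trick_name = ? AND c.costume_name = ?
    ORDER BY p.puppy_name
"""
matches = [
    row[0]
    for row in conn.execute(query, ("sit up and beg", "Groucho Marx mask"))
]

# A kennel with no puppies still has its own row in the kennels table.
empty_kennels = [
    row[0]
    for row in conn.execute(
        "SELECT kennel_name FROM kennels "
        "WHERE kennel_code NOT IN (SELECT kennel_code FROM puppies)"
    )
]
```

Tricks and costumes are recorded independently, so adding a costume for a puppy never requires touching its trick rows. The question about begging in a Groucho mask is answered by joining the two relationships only when it is asked.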
