Bad DB Design: Duplicate of Data Updating Deleting

You might also like

Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 21

Bad DB Design

Duplicate of data
Updating
Deleting
Redundant
Deleting
Update
Normalization
Normalization is a design technique that is widely
used as a guide in designing relational databases.
Normalization is essentially a two step process
that puts data into tabular form by removing
repeating groups and then removes duplicated
data from the relational tables.
Normalization theory is based on the concepts of
normal forms. A relational table is said to be a
particular normal form if it satisfied a certain set of
constraints. There are currently five normal forms
that have been defined. In this course, we will
cover the first three normal forms
cont
The goal of normalization is to create a set
of relational tables that are free of
redundant data and that can be
consistently and correctly modified. This
means that all tables in a relational
database should be in the third normal
form (3NF).
Normalization
A relational table is in 3NF if and only if all non-key
columns are:
mutually independent and
fully dependent upon the primary key
Mutual independence means that no non-key
column is dependent upon any combination of the
other columns
The first two normal forms are intermediate steps
to achieve the goal of having all tables in 3NF
In order to better understand the 2NF and higher
forms, it is necessary to understand the concepts
of functional dependencies
Functional Dependencies
The concept of functional dependencies is the basis for the
first three normal forms. A column, Y, of the relational table R
is said to be functionally dependent upon column X of R if
and only if each value of X in R is associated with precisely
one value of Y at any given time. X and Y may be
composite. Saying that column Y is functionally dependent
upon X is the same as saying the values of column X
identify the values of column Y. If column X is a primary key,
then all columns in the relational table R must be
functionally dependent upon X.
A short-hand notation for describing a functional
dependency is:
R.x > R.y
which can be read as in the relational table named R,
column x functionally determines (identifies) column y.
Functional Dependencies, example

Motivation: normalization, the process where we break a


relation schema into two or more schemas.
Example: ABCD with FDs AB ->C, C ->D, and D ->A.
Decompose into ABC, AD. What FDs hold in ABC ?
Not only AB ->C, but also C ->A !

AB ->C and C ->B.


Example: A = street address, B = city, C = zip code.
There are two keys, {A,B } and {A,C }.
Example FD
Drinkers(name, addr, drinkLiked, manf,
favdrink).
Reasonable FDs to assert:
1. name -> addr
2. name -> favdrink
3. drinkLiked -> manf
Example DF

name addr drinkLiked manf favDrink


Janeway Voyager Bud A.B. WickedAle
Janeway Voyager WickedAle Petes WickedAle
Spock Enterprise Bud A.B. Bud

Because name -> addr Because name -> favBeer

Because beersLiked -> manf


FDs With Multiple Attributes
No need for FDs with > 1 attribute on
right.
But sometimes convenient to combine FDs
as a shorthand.
Example: name -> addr and name -> favDrink
become name -> addr favDrink
> 1 attribute on left may be essential.
Example: Resturnt Drink -> price
Example, Cont
Consider relation Drinkers(name, addr,
drinkLiked, manf, favdrink).
{name, beersLiked} is a superkey because
together these attributes determine all the other
attributes.
name -> addr favBeer
beersLiked -> manf
{name, drinksLiked} is a key because neither
{name} nor {drinkLiked} is a superkey.
name doesnt -> manf; drinkLiked doesnt -> addr.
Basic Idea
To know what FDs hold in a projection, we
start with given FDs and find all FDs that
follow from given ones.
Then, restrict to those FDs that involve
only attributes of the projected schema.
normalization
What is normalization? Basically, it's the process
of efficiently organizing data in a database.
There are two goals of the normalization
process:
Eliminate redundant data (for example, storing
the same data in more than one table) and
Ensure data dependencies make sense (only
storing related data in a table).
Both of these are worthy goals as they reduce
the amount of space a database consumes and
ensure that data is logically stored.
First Normalization Form 1FN
Eliminate duplicative columns from the
same table. BY the values in each column
of a table are atomic. By atomic we mean
that there are no sets of values within a
column.
Create separate tables for each group of
related data and identify each row with a
unique column or set of columns (the
primary key).
example
Title Author Bio ISBN Subject Page Publi

Beginning Chad Russell, Chad Russell is 159059 MySQL, 512 Apress


MySQL Jon Stephens a programmer 3324 Database
Database and network Design
Design and administrator
Optimization who owns his
own Internet
hosting
company.,.

Book (ISBN, Title, Pages)


Author(Author_ID, First_Name, Last_name)
Subject(Subject_ID, Name)
Publisher (Publisher_ID, Name, Address, City, State, Zip)
The relationship between the Book table and the Author table is a many-to-many
relationship:
Book_Author (ISBN, Author_ID)
Book_Subject (ISBN Subject_ID)
One-to-many relationship exists between the Book table and the Publisher table:
Book (ISBN, Title, Pages, Publisher_ID)
Second normal form 2NF
Where the First Normal Form deals with
atomicity of data, the Second Normal Form (or
2NF) deals with relationships between
composite key columns and non-key columns:
Meet all the requirements of the first normal form.
Any non-key columns must depend on the entire
primary key. In the case of a composite primary key,
this means that a non-key column cannot depend on
only part of the composite key.
Create relationships between these new tables and
their predecessors through the use of foreign keys.
A relation R is in 2nf if every non-primary attribute A in
R is fully Functionally dependent on the primary
key.
Example 2NF

Student Advisor Adv-Room Class#

052144 Mohammed 500 Cs424


052144 Mohammed 500 Cs424
052144 Mohammed 500 Cs424
621464 Sami 501 Cs416
621464 Sami 501 Cs416 Registration
521423 Ibrahiem 215 Cs491 Student Class#
457215 Khalid 312 Cs412 052144 Cs424
052144 Cs424
Student 052144 Cs424
Student Advisor Adv-Room 621464 Cs416
052144 Mohammed 500 621464 Cs416
621464 Sami 501 521423 Cs491
457215 Cs412
Third normal form 3NF
Remove columns that are not dependent
upon the primary key.
Third Normal Form (3NF) requires that all
columns depend directly on the primary
key.
Example:
Publisher (Publisher_ID, Name, Address,
City, State, Zip)
Zip (Zip, City, State)
Example 3NF
In the last example, Adv-Room (the advisor's office number) is
functionally dependent on the Advisor attribute. The solution is to move
that attribute from the Students table to the Faculty table, as shown
below:
Student Advisor Adv-Room

052144 Mohammed 500


621464 Sami 501

Student Faculty
Student Advisor Name Room Dep
052144 Mohammed Mohammed 500 CS
621464 Sami Sami 501 IT

You might also like