
10/4/2018

NORMALISATION
Bamweyana Ivan
LSG: 2102

Relation Rules (Codd, 1970)

1. Only one value in each cell (intersection of row and column)
2. All values in a column are about the same subject
3. Each row is unique
4. No significance in column sequence
5. No significance in row sequence

Normalisation
• Process of converting tables to conform to Codd’s relational rules
• Split tables into new tables that can be joined at query time
  • The relational join
• Several levels of normalization
  • Forms: 1NF, 2NF, 3NF, 4NF and 5NF.
• Normalization creates many “expensive” joins

Normalisation
• Database normalization is the process of organizing the fields and tables of a relational database to minimize redundancy and dependency.
• Normalization usually involves dividing large tables into smaller (and less redundant) tables and defining relationships between them.
• The objective is to isolate data so that additions, deletions, and modifications of a field can be made in just one table and then propagated through the rest of the database via the defined relationships. (wikipedia)

Normalisation
• There are two goals of the normalization process:
  • eliminating redundant data (for example, storing the same data in more than one table) and
  • ensuring data dependencies make sense (only storing related data in a table).
• Both of these are worthy goals as they
  • ensure that data is logically stored, avoiding inconsistency and unreliability in the data.
  • reduce the amount of space a database consumes

Relational joins
• Every table must have a “primary key”
  • A column (or combination of columns) holding a unique value for each tuple
• Joins are effected by finding the same value as the “primary key” in another table
  • this is called the “foreign key”
• Joins may be extended to third and subsequent tables, hence normalisation
• Tables must adhere to “normal form” (be normalised)
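
A minimal sketch of a primary key / foreign key join, using Python's built-in sqlite3 module. The table and column names follow the employee example used in these slides; the function name `join_employee_department` and the in-memory database are assumptions made for illustration only.

```python
import sqlite3


def join_employee_department():
    """Join employee to department by matching the foreign key to the primary key."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE department (
            dept_no   INTEGER PRIMARY KEY,   -- primary key of department
            dept_name TEXT NOT NULL
        );
        CREATE TABLE employee (
            emp_no  INTEGER PRIMARY KEY,     -- primary key of employee
            name    TEXT NOT NULL,
            dept_no INTEGER REFERENCES department(dept_no)  -- foreign key
        );
        INSERT INTO department VALUES (201, 'R&D'), (224, 'IT');
        INSERT INTO employee VALUES
            (1, 'Kevin Jacobs', 201),
            (2, 'Barbara Jones', 224),
            (3, 'Jake Rivera', 201);
    """)
    # The join finds, for each employee row, the department row whose
    # primary key equals the employee's foreign key value.
    rows = conn.execute("""
        SELECT e.emp_no, e.name, d.dept_name
        FROM employee AS e
        JOIN department AS d ON d.dept_no = e.dept_no
        ORDER BY e.emp_no
    """).fetchall()
    conn.close()
    return rows
```

Calling `join_employee_department()` returns one row per employee with the department name looked up at query time, which is the "expensive" step that normalisation trades for reduced redundancy.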


Normal Forms
• Normal forms are a series of developed guidelines for ensuring that databases are normalized.
• They are numbered from one (the lowest form of normalization, referred to as first normal form or 1NF) through five (fifth normal form or 5NF).
• In practical applications, you'll often see 1NF, 2NF and 3NF along with the occasional 4NF. Fifth normal form is very rarely seen.
• It is occasionally necessary to stray from them to meet practical business requirements.

1st Normal Form
• Each attribute must be atomic
  • Atomic means it cannot be further decomposed and simplified
• No repeating columns within a row.
• No multi-valued columns.
• A relation is said to be in 1NF if it contains no non-atomic values and each row can provide a unique combination of values
• 1NF simplifies attributes
  • Queries become easier.

1NF Example

Employee (unnormalized)
emp_no  name           dept_no  dept_name  skills
1       Kevin Jacobs   201      R&D        C, Perl, Java
2       Barbara Jones  224      IT         Linux, Mac
3       Jake Rivera    201      R&D        DB2, Oracle, Java

Employee (1NF)
emp_no  name           dept_no  dept_name  skills
1       Kevin Jacobs   201      R&D        C
1       Kevin Jacobs   201      R&D        Perl
1       Kevin Jacobs   201      R&D        Java
2       Barbara Jones  224      IT         Linux
2       Barbara Jones  224      IT         Mac
3       Jake Rivera    201      R&D        DB2
3       Jake Rivera    201      R&D        Oracle
3       Jake Rivera    201      R&D        Java

2nd Normal Form

Each attribute must be functionally dependent on the primary key.
• Functional dependence - the property of one or more attributes that uniquely determines the value of other attributes.
• Any non-dependent attributes are moved into a smaller (subset) table.
2NF improves data integrity.
• Prevents update, insert, and delete anomalies.
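
The step from the unnormalized Employee table to Employee (1NF) can be sketched in Python as follows. This is only an illustration: the helper name `to_first_normal_form` and the choice of a comma as separator are assumptions, not part of the slides.

```python
def to_first_normal_form(rows, multi_valued_column, separator=","):
    """Emit one row per atomic value of the multi-valued column."""
    result = []
    for row in rows:
        for value in row[multi_valued_column].split(separator):
            atomic_row = dict(row)  # copy so the input rows are not modified
            atomic_row[multi_valued_column] = value.strip()
            result.append(atomic_row)
    return result


# Employee (unnormalized), as given in the slide.
unnormalized = [
    {"emp_no": 1, "name": "Kevin Jacobs", "dept_no": 201,
     "dept_name": "R&D", "skills": "C, Perl, Java"},
    {"emp_no": 2, "name": "Barbara Jones", "dept_no": 224,
     "dept_name": "IT", "skills": "Linux, Mac"},
    {"emp_no": 3, "name": "Jake Rivera", "dept_no": 201,
     "dept_name": "R&D", "skills": "DB2, Oracle, Java"},
]

employee_1nf = to_first_normal_form(unnormalized, "skills")
```

Each resulting row holds exactly one skill, so every cell is atomic, at the cost of repeating name, dept_no and dept_name on every row for the same employee.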

Functional Dependence

Employee (1NF)
emp_no  name           dept_no  dept_name  skills
1       Kevin Jacobs   201      R&D        C
1       Kevin Jacobs   201      R&D        Perl
1       Kevin Jacobs   201      R&D        Java
2       Barbara Jones  224      IT         Linux
2       Barbara Jones  224      IT         Mac
3       Jake Rivera    201      R&D        DB2
3       Jake Rivera    201      R&D        Oracle
3       Jake Rivera    201      R&D        Java

• name, dept_no, and dept_name are functionally dependent on emp_no. (emp_no -> name, dept_no, dept_name)
• skills is not functionally dependent on emp_no since it is not unique to each emp_no.

The key of the 1NF table is therefore (emp_no, skills), yet name, dept_no and dept_name depend on emp_no alone. It is this partial dependency that is eliminated in 2NF.

2NF

Employee (2NF)
emp_no  name           dept_no  dept_name
1       Kevin Jacobs   201      R&D
2       Barbara Jones  224      IT
3       Jake Rivera    201      R&D

Skills (2NF)
emp_no  skills
1       C
1       Perl
1       Java
2       Linux
2       Mac
3       DB2
3       Oracle
3       Java
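
Whether a functional dependency holds in a given set of rows can be checked mechanically. The sketch below is an illustration in Python; the function name `holds` and the dict-based row layout are assumptions. Note that it only tests the sample data: a dependency that holds in the sample is not thereby proven to be a rule of the business.

```python
COLUMNS = ("emp_no", "name", "dept_no", "dept_name", "skills")

# Employee (1NF), as given in the slides.
EMPLOYEE_1NF = [dict(zip(COLUMNS, values)) for values in [
    (1, "Kevin Jacobs", 201, "R&D", "C"),
    (1, "Kevin Jacobs", 201, "R&D", "Perl"),
    (1, "Kevin Jacobs", 201, "R&D", "Java"),
    (2, "Barbara Jones", 224, "IT", "Linux"),
    (2, "Barbara Jones", 224, "IT", "Mac"),
    (3, "Jake Rivera", 201, "R&D", "DB2"),
    (3, "Jake Rivera", 201, "R&D", "Oracle"),
    (3, "Jake Rivera", 201, "R&D", "Java"),
]]


def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent holds for every row in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in determinant)
        value = tuple(row[c] for c in dependent)
        # The same determinant value must always map to the same dependent value.
        if seen.setdefault(key, value) != value:
            return False
    return True


emp_determines_details = holds(EMPLOYEE_1NF, ["emp_no"], ["name", "dept_no", "dept_name"])
emp_determines_skills = holds(EMPLOYEE_1NF, ["emp_no"], ["skills"])
```

Here `emp_determines_details` is true and `emp_determines_skills` is false, matching the two bullets above.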


2NF

• A relation is said to be in 2NF if it is already in 1NF and every non-key attribute fully depends on the primary key of the relation
• If a table has some attributes which are not dependent on the whole primary key of that table, then it is not in second normal form

Data Integrity

Employee (1NF)
emp_no  name           dept_no  dept_name  skills
1       Kevin Jacobs   201      R&D        C
1       Kevin Jacobs   201      R&D        Perl
1       Kevin Jacobs   201      R&D        Java
2       Barbara Jones  224      IT         Linux
2       Barbara Jones  224      IT         Mac
3       Jake Rivera    201      R&D        DB2
3       Jake Rivera    201      R&D        Oracle
3       Jake Rivera    201      R&D        Java

• Insert Anomaly - a new fact cannot be added without unrelated data. E.g., a new department has no employee yet, so its row would need a null emp_no, which the primary key does not allow.
• Update Anomaly - a single name change requires multiple updates, which degrades performance and risks leaving rows inconsistent. E.g., changing IT dept_name to IS
• Delete Anomaly - deleting wanted information. E.g., deleting the IT department removes employee Barbara Jones from the database
This is the purpose of 3NF: to maintain data integrity!
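
The update anomaly can be made concrete by counting how many rows the rename from IT to IS touches. The sketch below uses Python's built-in sqlite3 module; the table names `employee_1nf` and `department` and the function name `rows_touched_by_rename` are assumptions chosen for this illustration.

```python
import sqlite3


def rows_touched_by_rename():
    """Rename department 224 in a flat 1NF table and in a separate department table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employee_1nf (
            emp_no INTEGER, name TEXT, dept_no INTEGER, dept_name TEXT, skills TEXT,
            PRIMARY KEY (emp_no, skills)
        );
        INSERT INTO employee_1nf VALUES
            (1, 'Kevin Jacobs', 201, 'R&D', 'C'),
            (1, 'Kevin Jacobs', 201, 'R&D', 'Perl'),
            (1, 'Kevin Jacobs', 201, 'R&D', 'Java'),
            (2, 'Barbara Jones', 224, 'IT', 'Linux'),
            (2, 'Barbara Jones', 224, 'IT', 'Mac'),
            (3, 'Jake Rivera', 201, 'R&D', 'DB2'),
            (3, 'Jake Rivera', 201, 'R&D', 'Oracle'),
            (3, 'Jake Rivera', 201, 'R&D', 'Java');

        CREATE TABLE department (dept_no INTEGER PRIMARY KEY, dept_name TEXT);
        INSERT INTO department VALUES (201, 'R&D'), (224, 'IT');
    """)
    # Flat table: the name is stored once per (employee, skill) row.
    flat = conn.execute(
        "UPDATE employee_1nf SET dept_name = 'IS' WHERE dept_no = 224"
    ).rowcount
    # Separate department table: the name is stored exactly once.
    normalised = conn.execute(
        "UPDATE department SET dept_name = 'IS' WHERE dept_no = 224"
    ).rowcount
    conn.close()
    return flat, normalised
```

With the slide data the flat table needs two row updates for one logical change, while the separate department table needs one; the gap grows with every additional employee or skill in the department.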

Third Normal Form (3NF)

Remove transitive dependencies.
• Derived dependency or Transitive dependence - two separate entities exist within one table.
• Any transitive dependencies are moved into a smaller (subset) table.
3NF further improves data integrity.
• Prevents update, insert, and delete anomalies.

Transitive Dependence

Employee (2NF)
emp_no  name           dept_no  dept_name
1       Kevin Jacobs   201      R&D
2       Barbara Jones  224      IT
3       Jake Rivera    201      R&D

Note that dept_name is functionally dependent on dept_no. Dept_no is functionally dependent on emp_no, so via the middle step of dept_no, dept_name is functionally dependent on emp_no.

(emp_no -> dept_no, dept_no -> dept_name, thus emp_no -> dept_name)

This is what is called transitive dependency and it is what is eliminated in 3NF
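
A minimal Python sketch of removing the transitive dependency emp_no -> dept_no -> dept_name: the Employee (2NF) rows are projected onto two smaller tables, and a join on dept_no rebuilds the original rows. The function names `decompose_to_3nf` and `rejoin` are assumptions made for this illustration.

```python
# Employee (2NF): (emp_no, name, dept_no, dept_name), as given in the slides.
EMPLOYEE_2NF = [
    (1, "Kevin Jacobs", 201, "R&D"),
    (2, "Barbara Jones", 224, "IT"),
    (3, "Jake Rivera", 201, "R&D"),
]


def decompose_to_3nf(employee_2nf):
    """Project onto Employee(emp_no, name, dept_no) and Department(dept_no, dept_name)."""
    employee = sorted({(emp_no, name, dept_no)
                       for emp_no, name, dept_no, _ in employee_2nf})
    # The set removes the repeated (201, 'R&D') pair, so each department appears once.
    department = sorted({(dept_no, dept_name)
                         for _, _, dept_no, dept_name in employee_2nf})
    return employee, department


def rejoin(employee, department):
    """Join the two tables on dept_no to recover the 2NF rows."""
    dept_names = dict(department)
    return [(emp_no, name, dept_no, dept_names[dept_no])
            for emp_no, name, dept_no in employee]


employee_3nf, department_3nf = decompose_to_3nf(EMPLOYEE_2NF)
```

Because dept_no determines dept_name, `rejoin(employee_3nf, department_3nf)` gives back exactly the original rows, so no information is lost by the split.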

3NF

Employee (2NF)
emp_no  name           dept_no  dept_name
1       Kevin Jacobs   201      R&D
2       Barbara Jones  224      IT
3       Jake Rivera    201      R&D

Employee (3NF)
emp_no  name           dept_no
1       Kevin Jacobs   201
2       Barbara Jones  224
3       Jake Rivera    201

Department (3NF)
dept_no  dept_name
201      R&D
224      IT

Other Normal Forms

Boyce-Codd Normal Form (BCNF)
• Strengthens 3NF by requiring the determinant (left-hand side) of every non-trivial functional dependency to be a superkey
Fourth Normal Form (4NF)
• Eliminate non-trivial multivalued dependencies.
Fifth Normal Form (5NF)
• Eliminate join dependencies not implied by candidate keys.


Revisit team ER diagram

[Figure: team ER diagram. Entities and attributes: games (date, opponent, result); sales (tickets, merchandise); player_stats (aces, blocks, digs, spikes); players (name, start date, end date). Relationships: games generates sales (1:1); games recorded by player_stats (1:N); players tracked by player_stats (1:N).]

Normalizing our team (1NF)

players
player_id  game_id  name          start_date  end_date  aces  blocks  spikes  digs
45         34       Mike Speedy   1/1/00                12    3       20      5
45         35       Mike Speedy   1/1/00                10    2       15      4
45         40       Mike Speedy   1/1/00
78         42       Frank Newmon  5/1/05
102        34       Joe Powers    1/1/02      7/1/05    8     6       18      10
102        35       Joe Powers    1/1/02      7/1/05    10    8       24      12
103        42       Tony Tough    1/1/05                15    10      20      14
[Only one set of statistics (7 2 10 3) survives in the source for the rows of games 40 and 42 above; which row it belongs to cannot be determined.]

games
game_id  date     opponent  result
34       6/3/05   Chicago   W
35       6/8/05   Seattle   W
40       6/15/05  Phoenix   L
42       6/20/05  LA        W

sales
sales_id  game_id  merch  tickets
120       34       5000   25000
122       35       4500   30000
125       40       2500   15000
126       42       6500   40000

Work out the 2NF and 3NF.

Normal Forms-summary
• First Normal Form
  • Table has rows and columns
  • Every row is unique
  • Only one value is in each location
  • Primary key is defined
• Second Normal Form
  • Table should be in 1NF
  • Columns that are not the primary key must be totally dependent on the primary key
  • Each column is only searchable through its table’s primary key
  • This further reduces redundancy and manages delete, update and insert anomalies
• Third Normal Form
  • 3NF is also in 2NF (which is also in 1NF!)
  • All columns that are not primary keys must depend on the primary key
  • In 3NF, all columns depend on the primary key only
  • i.e. it is not possible to use any other (non-PK) column to find the value of a column

4NF (Fourth Normal Form)

• Fourth normal form (4NF) has one additional requirement:
  1. Meet all the requirements of the third normal form.
  2. A relation is in 4NF if it has no non-trivial multi-valued dependencies.
• Multivalued dependencies occur when the presence of one or more rows in a table implies the presence of one or more other rows in that same table.
• Remember, these normalization guidelines are cumulative. For a database to be in 2NF for example, it must first fulfil all the criteria of a 1NF database.
• Normalisation is a good idea but not an absolute requirement; some applications (e.g. dynamic segmentation) require de-normalisation
  • The dynamic segmentation process enables multiple sets of attributes to be associated with any portion of a line feature without segmenting the underlying feature. In the transportation field, examples of such linearly referenced data might include accident sites, road quality, and traffic volume.

Read about-Repeating Group

• http://www.basicsofcomputer.com/modeling_repeating_groups.htm
• ftp://ftp.cba.uri.edu/classes/horton/Class%20Notes/Lecture%20Notes%20-%20Chapter%205%20(part%202).doc
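
As an illustration of the multivalued dependency described under 4NF ("the presence of one or more rows implies the presence of other rows"), the Python sketch below tests whether x ->> y holds in a three-column table. The function name `mvd_holds` and the sample data (an employee's skills and spoken languages) are hypothetical and do not come from the slides.

```python
from collections import defaultdict
from itertools import product


def mvd_holds(rows):
    """Check x ->> y for rows of the form (x, y, z).

    The dependency holds when, for each x, every y value seen with that x
    is paired with every z value seen with that x.
    """
    groups = defaultdict(set)
    for x, y, z in rows:
        groups[x].add((y, z))
    for pairs in groups.values():
        ys = {y for y, _ in pairs}
        zs = {z for _, z in pairs}
        if set(product(ys, zs)) != pairs:
            return False
    return True


# Hypothetical data: (emp_no, skill, language).
complete = [
    (1, "C", "English"),
    (1, "C", "French"),
    (1, "Perl", "English"),
    (1, "Perl", "French"),
]
incomplete = complete[:-1]  # the (1, 'Perl', 'French') row is missing
```

In `complete`, skills and languages are independent facts about the employee, so every combination must be stored; that forced repetition is what 4NF removes by splitting the table into (emp_no, skill) and (emp_no, language). In `incomplete` the dependency does not hold, because one implied row is absent.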
