Dbms-Unit-3 - Aktu
Mukesh Kumar
Assistant Professor (CSE-Deptt)
UNIT-3
Syllabus:
Database Design & Normalization: Functional dependencies, normal forms, first, second, third
normal forms, BCNF, inclusion dependencies, lossless join decompositions, normalization using
FDs, MVDs, and JDs, alternative approaches to database design.
Objectives: At the end of this chapter the reader will be able to
Describe the Rules of RDBMS
Understand functional dependencies
Demonstrate the closure of a set of FDs and of a set of attributes
Decompose a relation into smaller relations
Understand Normalization and various normal forms.
12 Rules of RDBMS
Dr Edgar F. Codd, after extensive research on the Relational Model of database systems, proposed twelve
rules which, according to him, a database must obey in order to be regarded as a true relational
database.
These rules apply to any database system that manages stored data using only its relational capabilities.
This requirement is the foundation rule (often called Rule 0), which acts as a base for all the other rules.
Rule 1: Information Rule
The data stored in a database, whether user data or metadata, must be the value of some table cell. Everything in a
database must be stored in a table format.
Rule 2: Guaranteed Access Rule
Every single data element (value) is guaranteed to be accessible logically with a combination of table-name,
primary-key (row value), and attribute-name (column value). No other means, such as pointers, can be used to
access data.
Rule 3: Systematic Treatment of NULL Values
The NULL values in a database must be given a systematic and uniform treatment. This is a very important rule
because a NULL can be interpreted as one of the following: data is missing, data is not known, or data is not
applicable.
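As a small illustration of uniform NULL handling, the sketch below uses SQLite through Python's standard sqlite3 module. The choice of SQLite, the student table, and its columns are assumptions made for this example; the notes do not name a particular DBMS.

```python
import sqlite3

# In-memory database with one student whose city is unknown (NULL).
# Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (stu_id INTEGER PRIMARY KEY, city TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Delhi"), (2, None)])

# "= NULL" evaluates to unknown for every row, so no row qualifies.
equals_null = conn.execute(
    "SELECT stu_id FROM student WHERE city = NULL"
).fetchall()

# "IS NULL" is the dedicated test for a missing value.
is_null = conn.execute(
    "SELECT stu_id FROM student WHERE city IS NULL"
).fetchall()

print(equals_null, is_null)
```

The comparison with "=" never matches a NULL, while IS NULL finds the row, whatever the reason the value is absent.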
Rule 4: Active Online Catalog
The structure description of the entire database must be stored in an online catalog, known as the data dictionary,
which can be accessed by authorized users. Users can query the catalog with the same query language
they use to access the database itself.
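A minimal sketch of this rule, assuming SQLite (via Python's standard sqlite3 module) as the example system: SQLite keeps its catalog in a table named sqlite_master, which is read with an ordinary SELECT. The student table is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical user table; creating it also adds an entry to the catalog.
conn.execute("CREATE TABLE student (stu_id INTEGER PRIMARY KEY, stu_name TEXT)")

# The catalog is itself a table, queried with the same language as user data.
catalog_rows = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(catalog_rows)
```

Other systems expose their catalogs under different names, so the table queried here is specific to SQLite.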
Rule 5: Comprehensive Data Sub-Language Rule
A database can only be accessed using a language having linear syntax that supports data definition, data
manipulation, and transaction management operations. This language can be used directly or by means of some
application. If the database allows access to data without the help of this language, it is considered a
violation.
Rule 6: View Updating Rule
All the views of a database, which can theoretically be updated, must also be updatable by the system.
Rule 7: High-Level Insert, Update, and Delete Rule
A database must support high-level insertion, update, and deletion. This must not be limited to a single row; that
is, it must also support union, intersection and minus operations to yield sets of data records.
F+ includes {A → B, A → C, CG → H, CG → I, B → H, A → H, CG → HI, AG → I, AG → H}
Closure of Attribute Sets
To test whether a set of attributes α is a superkey, we need to find the set of attributes functionally determined by α.
Let α be a set of attributes. We call the set of attributes determined by α under a set F of functional
dependencies the closure of α under F, denoted α+.
A procedure to compute α+
result := α;
while (changes to result) do
    for each functional dependency β → γ in F do
    begin
        if β ⊆ result then result := result ∪ γ;
    end
Example:
Find AG+ in F = {A → B, A → C, CG → H, CG → I, B → H}
A → B causes us to include B in result. To see this fact, we observe that A → B is in F and A ⊆ result (which
is AG), so result := result ∪ B.
A → C causes result to become ABCG.
CG → H causes result to become ABCGH.
CG → I causes result to become ABCGHI.
To test if α is a superkey, we compute α+, and check if α+ contains all attributes of R.
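The closure procedure can be sketched in Python as follows. The function names closure and is_superkey, and the representation of each dependency as a (left side, right side) pair of sets, are choices made for this illustration. The schema R = (A, B, C, G, H, I) is an assumption, taken to be the attributes that appear in F.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds (a list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:                      # repeat while result keeps growing
        changed = False
        for lhs, rhs in fds:
            # beta -> gamma applies when beta is contained in result
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attrs, all_attributes, fds):
    """attrs is a superkey when its closure contains every attribute of R."""
    return closure(attrs, fds) >= set(all_attributes)


# F from the example above
F = [
    ({"A"}, {"B"}),
    ({"A"}, {"C"}),
    ({"C", "G"}, {"H"}),
    ({"C", "G"}, {"I"}),
    ({"B"}, {"H"}),
]
R = set("ABCGHI")  # assumed schema: the attributes mentioned in F

ag_closure = closure({"A", "G"}, F)
print("".join(sorted(ag_closure)))   # ABCGHI
print(is_superkey({"A", "G"}, R, F))  # True
```

Under the assumed schema, AG is a superkey because AG+ covers all of R, while A alone is not, since A+ is only ABCH.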
Extraneous attributes
An attribute of a functional dependency is said to be extraneous if we can remove it without changing the closure
of the set of functional dependencies. The formal definition of extraneous attributes is as follows. Consider a set F
of functional dependencies and the functional dependency α → β in F.
Attribute A is extraneous in α if A ∈ α, and F logically implies (F − {α → β}) ∪ {(α − A) → β}.
Attribute A is extraneous in β if A ∈ β, and the set of functional dependencies (F − {α → β}) ∪ {α → (β −
A)} logically implies F.
Consider a set F of functional dependencies and the functional dependency α → β in F.
To test if attribute A ∈ α is extraneous in α
1. compute (α − {A})+ using the dependencies in F
2. check that (α − {A})+ contains all attributes of β; if it does, A is extraneous in α
To test if attribute A ∈ β is extraneous in β
1. compute α+ using only the dependencies in F' = (F − {α → β}) ∪ {α → (β − A)},
2. check that α+ contains A; if it does, A is extraneous in β
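The two tests can be sketched in Python. The function names and the (left side, right side) set-pair representation are assumptions for this illustration, and the two small dependency sets F1 and F2 are illustrative examples that do not come from these notes.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def extraneous_in_lhs(attr, fd, fds):
    """Test 1: attr is extraneous in alpha if (alpha - attr)+ under F contains beta."""
    lhs, rhs = fd
    return rhs <= closure(lhs - {attr}, fds)


def extraneous_in_rhs(attr, fd, fds):
    """Test 2: attr is extraneous in beta if alpha+ under F' contains attr,
    where F' replaces alpha -> beta by alpha -> (beta - attr)."""
    lhs, rhs = fd
    reduced = [f for f in fds if f != fd] + [(lhs, rhs - {attr})]
    return attr in closure(lhs, reduced)


# Illustrative set 1: F1 = {A -> C, AB -> C}
fd_ab_c = ({"A", "B"}, {"C"})
F1 = [({"A"}, {"C"}), fd_ab_c]
print(extraneous_in_lhs("B", fd_ab_c, F1))   # True: A alone already gives C
print(extraneous_in_lhs("A", fd_ab_c, F1))   # False: B alone does not give C

# Illustrative set 2: F2 = {A -> C, AB -> CD}
fd_ab_cd = ({"A", "B"}, {"C", "D"})
F2 = [({"A"}, {"C"}), fd_ab_cd]
print(extraneous_in_rhs("C", fd_ab_cd, F2))  # True: C still follows from A -> C
print(extraneous_in_rhs("D", fd_ab_cd, F2))  # False: nothing else yields D
```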
Canonical Cover
A canonical cover for F is a set of dependencies Fc such that
F logically implies all dependencies in Fc,
Fc logically implies all dependencies in F
No functional dependency in Fc contains an extraneous attribute
Each left side of functional dependency in Fc is unique.
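To compute a canonical cover for F:
repeat
    Use the union rule to replace any dependencies in F of the form
        α1 → β1 and α1 → β2 with α1 → β1 β2
    Find a functional dependency α → β with an extraneous attribute either in α or in β
    If an extraneous attribute is found, delete it from α → β
until F does not change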
Example:
R = (A, B, C)
F = {A → BC, B → C, A → B, AB → C}
Combine A → BC and A → B into A → BC
Set is now {A → BC, B → C, AB → C}
A is extraneous in AB → C
Check if the result of deleting A from AB → C is implied by the other dependencies
Yes: in fact, B → C is already present!
Set is now {A → BC, B → C}
C is extraneous in A → BC
Check if A → C is logically implied by A → B and the other dependencies
Yes: using transitivity on A → B and B → C.
Can use attribute closure of A in more complex cases
The canonical cover is: A → B, B → C
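The example above can be reproduced with a short Python sketch of the canonical-cover loop. The function names and the set-pair representation are assumptions for this illustration. The order in which extraneous attributes are removed may differ from the hand-worked steps, and in general a set of dependencies can have more than one canonical cover.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def merge_same_lhs(fds):
    """Union rule: combine a -> b1 and a -> b2 into a -> b1 b2."""
    merged = {}
    for lhs, rhs in fds:
        merged.setdefault(frozenset(lhs), set()).update(rhs)
    # Dependencies whose right side became empty are dropped.
    return [(lhs, frozenset(rhs)) for lhs, rhs in merged.items() if rhs]


def remove_one_extraneous(fds):
    """Delete one extraneous attribute if there is one; return (fds, found)."""
    for i, (lhs, rhs) in enumerate(fds):
        others = fds[:i] + fds[i + 1:]
        for a in sorted(lhs):
            # a is extraneous on the left if (lhs - a)+ under F contains rhs
            if len(lhs) > 1 and rhs <= closure(lhs - {a}, fds):
                return others + [(lhs - {a}, rhs)], True
        for a in sorted(rhs):
            # a is extraneous on the right if lhs+ under F' still contains a
            reduced = others + [(lhs, rhs - {a})]
            if a in closure(lhs, reduced):
                return reduced, True
    return fds, False


def canonical_cover(fds):
    fds = merge_same_lhs(fds)
    while True:
        fds, found = remove_one_extraneous(fds)
        if not found:
            return fds
        fds = merge_same_lhs(fds)


# F = {A -> BC, B -> C, A -> B, AB -> C} from the example above
F = [
    ({"A"}, {"B", "C"}),
    ({"B"}, {"C"}),
    ({"A"}, {"B"}),
    ({"A", "B"}, {"C"}),
]
cover = canonical_cover(F)
formatted = sorted(
    f"{''.join(sorted(l))} -> {''.join(sorted(r))}" for l, r in cover
)
print(formatted)  # ['A -> B', 'B -> C']
```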
Normalization
If a database design is poor, it may contain anomalies, which are a serious problem for any
database administrator. Keeping a database with anomalies consistent is very difficult.
Update anomalies: If data items are scattered and are not linked to each other properly, then it
could lead to strange situations. For example, when we try to update one data item having its
copies scattered over several places, a few instances get updated properly while a few others are
left with old values. Such instances leave the database in an inconsistent state.
Deletion anomalies: We tried to delete a record, but parts of it were left undeleted because,
unknown to us, the data was also saved somewhere else.
Insert anomalies: We tried to insert data in a record that does not exist at all.
Normalization:
Normalization is a method to remove all these anomalies and bring the database to a consistent state.
First Normal Form
First Normal Form requires that all the attributes in a relation have atomic domains. The values in
an atomic domain are indivisible units.
Each attribute must contain only a single value from its pre-defined domain.
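As a rough illustration, the Python sketch below turns rows that hold a list of values in one attribute into rows that hold a single value each. The course and content data are hypothetical and are not taken from these notes.

```python
# Hypothetical unnormalized rows: "content" holds several values at once,
# so its domain is not atomic.
unnormalized = [
    {"course": "Programming", "content": ["Java", "C++"]},
    {"course": "Web", "content": ["HTML", "PHP", "ASP"]},
]

# One row per (course, content) pair: every attribute now holds one value.
first_nf = [
    {"course": row["course"], "content": item}
    for row in unnormalized
    for item in row["content"]
]

for row in first_nf:
    print(row["course"], row["content"])
```

The repetition of the course name that appears after flattening is the kind of redundancy the higher normal forms then address.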
Second Normal Form
Before we learn about the second normal form, we need to understand the following:
Prime attribute: An attribute that is part of a candidate key (here, the primary key) is known as a prime attribute.
Non-prime attribute: An attribute that is not part of any candidate key is said to be a non-prime attribute.
A relation is in 2NF if it is in 1NF and every non-prime attribute is fully functionally dependent on the
primary key. That is, if X → A holds for the key X and a non-prime attribute A, then there should not be
any proper subset Y of X for which Y → A also holds true.
In the above Student_Project relation, the prime attributes are Stu_ID and Proj_ID. According to the
rule, the non-key attributes, i.e. Stu_Name and Proj_Name, must depend on both together and not on
either prime attribute individually.
But we find that Stu_Name can be identified by Stu_ID alone and Proj_Name can be identified by Proj_ID
alone. This is called a partial dependency, which is not allowed in Second Normal Form.
We break the relation in two, as in the above picture, so that no partial dependency remains.
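The partial dependencies in Student_Project can be found mechanically. The Python sketch below assumes that the only dependencies are Stu_ID → Stu_Name and Proj_ID → Proj_Name and that {Stu_ID, Proj_ID} is the only key. The function names, and the single-key Student relation used at the end, are hypothetical choices for this illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, attributes, fds):
    """List (Y, A) where Y is a proper subset of the key and Y -> A
    holds for a non-prime attribute A."""
    non_prime = set(attributes) - set(key)
    found = []
    for size in range(1, len(key)):          # proper, non-empty subsets only
        for subset in combinations(sorted(key), size):
            for attr in sorted(closure(subset, fds) & non_prime):
                found.append((set(subset), attr))
    return found


attributes = {"Stu_ID", "Proj_ID", "Stu_Name", "Proj_Name"}
key = {"Stu_ID", "Proj_ID"}
fds = [({"Stu_ID"}, {"Stu_Name"}), ({"Proj_ID"}, {"Proj_Name"})]

violations = partial_dependencies(key, attributes, fds)
print(violations)

# A relation whose key is a single attribute cannot have a partial dependency.
student_only = partial_dependencies(
    {"Stu_ID"}, {"Stu_ID", "Stu_Name"}, [({"Stu_ID"}, {"Stu_Name"})]
)
print(student_only)
```

The first call reports both partial dependencies named in the text. The second returns an empty list, which is why splitting by key attribute removes the problem.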
Third Normal Form
A relation is in Third Normal Form if it is in Second Normal Form and satisfies the following:
No non-prime attribute is transitively dependent on the primary key.
For any non-trivial functional dependency X → A, either
o X is a superkey, or
o A is a prime attribute.
We find that in the above Student_detail relation, Stu_ID is the key and the only prime attribute. We
find that City can be identified by Stu_ID as well as by Zip itself. Zip is not a superkey, and City is not a
prime attribute. Additionally, Stu_ID → Zip → City, so there exists a transitive dependency.
To bring this relation into third normal form, we break it into two relations: one that keeps Stu_ID, Zip and
the student's remaining attributes, and one that holds Zip and City.
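The 3NF condition can be checked with a short Python sketch. It assumes the dependencies Stu_ID → Zip and Zip → City and, for brevity, leaves out any other attributes of Student_detail. The function names, and the dependencies supplied by hand for each of the two smaller relations, are assumptions for this illustration.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def third_nf_violations(attributes, prime, fds):
    """List (X, A) where X -> A is non-trivial, X is not a superkey,
    and A is not a prime attribute."""
    violations = []
    for lhs, rhs in fds:
        if closure(lhs, fds) >= set(attributes):
            continue                              # X is a superkey
        for attr in sorted(set(rhs) - set(lhs)):  # skip the trivial part
            if attr not in prime:
                violations.append((set(lhs), attr))
    return violations


# Student_detail restricted to the attributes discussed in the text
before = third_nf_violations(
    {"Stu_ID", "Zip", "City"},
    {"Stu_ID"},
    [({"Stu_ID"}, {"Zip"}), ({"Zip"}, {"City"})],
)
print(before)

# The two relations after the split, each with its own dependencies
student_part = third_nf_violations(
    {"Stu_ID", "Zip"}, {"Stu_ID"}, [({"Stu_ID"}, {"Zip"})]
)
zip_part = third_nf_violations(
    {"Zip", "City"}, {"Zip"}, [({"Zip"}, {"City"})]
)
print(student_part, zip_part)
```

Before the split the check reports Zip → City as the violation. After the split Zip is the key of its own relation, so neither relation has a violation left.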