Integrity Constraints (Ref: DBMS by Silberschatz and Galvin)

RT503 Database Management Systems 1 Module 4
Unauthorized users who gain access to a database can damage its data or leave it in an inconsistent
state. A normal, authorized DBMS user can also leave the database in an inconsistent state by accident. So
some restrictions must be placed on the database so that users do not change data accidentally. These
restrictions are called constraints.
Integrity constraints are intended for the normal user. They ensure that changes made to the database
by authorized users do not result in a loss of data consistency. So integrity constraints guard against
accidental damage to the database. There are a number of ways to specify integrity constraints.
They are
Key constraints ( primary keys, foreign keys and candidate key specification)
Using ‘not null’
Using ‘check’ clause
Using assertions
Using triggers
Using functional dependencies
Domain constraints
We know that an attribute has a set of possible values associated with it. This set is called its domain.
For example, in the student table
Student ( stdid, name, marks)
the set of possible values for the attribute stdid is the integers.
For the attribute name, the set of possible values is strings of characters.
For the attribute marks, the set of possible values is the integers.
Types such as integer, character and date are called standard domain types.
Declaring an attribute to be of a particular domain acts as a constraint on the values that it can take. It is
possible for several attributes to have the same domain. For example, in our student table the domain of
stdid is the same as the domain of marks, namely integer. But a query such as ‘find the names of students
whose stdid is equal to some mark’ is not meaningful, even though the two values can be compared.
We can define new domains by using the create domain clause.
That is
create domain Dollars int;
create domain Pounds int;
These statements define the domains Dollars and Pounds to be integers. An attempt to assign a value of
type Dollars to a variable of type Pounds would result in an error: both have the same underlying type,
but they are different domains.
The check clause in SQL permits domains to be restricted in powerful ways. For example, suppose we are
creating a domain Studmarks and the condition is that a Studmarks value should not be more than 100.
We can specify this by
create domain Studmarks int
    constraint marktest check (value <= 100);
Department of Computer Science & Engineering
Complex check conditions can be useful when we want to ensure integrity of data.
Referential integrity
Here we use foreign keys. Sometimes we wish to ensure that a value that appears in one table
for a given set of attributes also appears for a certain set of attributes in another table. This condition is
called referential integrity. We can illustrate it by an example.
Suppose we have a college and the details of all students in the college are stored in the student table.
The college has a library, and the details of all its books are stored in the books table.
Student ( stdid, name, marks)

Books ( Bid, bname, author)
Suppose there is a facility for students to access and reserve books. Suppose the college uses 2 tables,
reserved and accessed, to store these reservations and accesses.
Reserved (stdid, bid, rdate)

Accessed ( stdid, bid, adate)
Suppose we create the student and books tables like this.
create table student (
    stdid int,
    name char(10),
    marks int,
    primary key (stdid)
);
create table books (
    bid int,
    bname char(10),
    author char(10),
    primary key (bid)
);
Suppose the condition in the college is that only students of the college are allowed to access and reserve
books. In other words, only students who have an entry in the student table may access or reserve books,
so the stdid values in the reserved and accessed tables must also be present in the student table.
Suppose another condition is that students are allowed to access and reserve only those books that are
present in the college library. In other words, the bid values in the reserved and accessed tables must also
be present in the books table.
We can specify the above conditions by using a foreign key clause.
That is, the tables reserved and accessed are created by
create table reserved (
    stdid int,
    bid int,
    rdate date,
    foreign key (stdid) references student (stdid),
    foreign key (bid) references books (bid)
);
create table accessed (
    stdid int,
    bid int,
    adate date,
    foreign key (stdid) references student (stdid),
    foreign key (bid) references books (bid)
);
This means that for any tuple inserted into the reserved table, the values of stdid and bid must be present
in the student and books tables respectively.
Likewise, for any tuple inserted into the accessed table, the values of stdid and bid must be present in the
student and books tables respectively.
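The effect of the foreign key clause can be tried out with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for the DBMS, it creates only the reserved table, and the sample student, book and date values are invented for illustration. SQLite enforces foreign keys only after the pragma shown is switched on.

```python
import sqlite3

# In-memory database; SQLite checks foreign keys only when this pragma is on.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE student (stdid INT PRIMARY KEY, name CHAR(10), marks INT);
    CREATE TABLE books   (bid INT PRIMARY KEY, bname CHAR(10), author CHAR(10));
    CREATE TABLE reserved (
        stdid INT,
        bid   INT,
        rdate DATE,
        FOREIGN KEY (stdid) REFERENCES student (stdid),
        FOREIGN KEY (bid)   REFERENCES books (bid)
    );
""")

# Hypothetical sample rows: one student and one book.
conn.execute("INSERT INTO student VALUES (100, 'Anil', 50)")
conn.execute("INSERT INTO books VALUES (1, 'DBMS', 'Navathe')")

def try_reserve(stdid, bid):
    """Return True if the tuple is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO reserved VALUES (?, ?, '2024-01-01')", (stdid, bid))
        return True
    except sqlite3.IntegrityError:
        return False

ok_known = try_reserve(100, 1)            # student and book both exist
ok_unknown_student = try_reserve(999, 1)  # stdid 999 is not in student
ok_unknown_book = try_reserve(100, 42)    # bid 42 is not in books
print(ok_known, ok_unknown_student, ok_unknown_book)  # True False False
```

Only the first insert is accepted; the other two are rejected because the referenced stdid or bid does not exist in the parent table.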
We can also create the tables reserved and accessed by specifying a constraint name for each of these
foreign keys. That is, another way of creating the tables is
create table reserved (
    stdid int,
    bid int,
    rdate date,
    constraint st foreign key (stdid) references student (stdid),
    constraint bks foreign key (bid) references books (bid)
);
create table accessed (
    stdid int,
    bid int,
    adate date,
    constraint stud foreign key (stdid) references student (stdid),
    constraint bk1 foreign key (bid) references books (bid)
);
Here we have given names to these constraints.

So there are 2 foreign key constraints for the reserved table. They are st and bks.
There are 2 foreign key constraints for the accessed table. They are stud and bk1.
These facts can be represented by the schema diagram
Student  (Stdid, Name, Marks)
Books    (Bid, Bname, Author)
Reserved (Stdid, Bid, Rdate)
Accessed (Stdid, Bid, Adate)
in which Stdid of Reserved and Accessed refers to Stdid of Student, and Bid of Reserved and Accessed
refers to Bid of Books.
The other types of constraints are primary key, unique, not null and check constraints.
For example, consider the table student.
Student ( stdid, branch, sem, rn, name, marks)
In this we can see that there are 2 candidate keys. They are stdid and (branch, sem, rn). One we
declare as the primary key, the other we declare as unique.
Suppose we have the constraint that the name and marks of a student should not be null, that is, there
should always be a value in these fields. Suppose we also want to ensure that the value of marks is not
more than 100. We can ensure this by using the check clause.
We can create the table by
create table student (
    stdid int,
    branch char(2),
    sem int,
    rn int,
    name char(10) not null,
    marks int not null,
    primary key (stdid),
    unique (branch, sem, rn),
    check (marks <= 100)
);
OR
We can give a name to each of these constraints as
create table student (
    stdid int,
    branch char(2),
    sem int,
    rn int,
    name char(10) not null,
    marks int not null,
    constraint pk primary key (stdid),
    constraint cdk unique (branch, sem, rn),
    constraint chk check (marks <= 100)
);
If the student table is created in this way, we cannot insert two tuples that have the same stdid value,
since stdid is declared as the primary key.
Also we cannot insert two tuples that have the same (branch, sem, rn) values, since these three
attributes together form another key, which is declared using the unique keyword.
We cannot insert a tuple that has a marks value greater than 100, since the check clause is used to
restrict the marks values to be at most 100.
We have declared the marks and name fields to be not null. So for each tuple that is inserted into the table
there should be some value in the marks and name fields.
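The four rules above can be checked with a short experiment. The sketch below again assumes Python's sqlite3 module in place of the DBMS; the helper accepted and the sample tuples are hypothetical and exist only to trigger each constraint once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        stdid  INT,
        branch CHAR(2),
        sem    INT,
        rn     INT,
        name   CHAR(10) NOT NULL,
        marks  INT NOT NULL,
        CONSTRAINT pk  PRIMARY KEY (stdid),
        CONSTRAINT cdk UNIQUE (branch, sem, rn),
        CONSTRAINT chk CHECK (marks <= 100)
    )
""")

def accepted(row):
    """Try to insert one tuple; report whether the constraints allowed it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

results = {
    "first row":          accepted((100, "CS", 3, 1, "Anil", 50)),
    "same stdid":         accepted((100, "CS", 3, 2, "Binil", 80)),   # primary key
    "same branch/sem/rn": accepted((101, "CS", 3, 1, "Cinil", 70)),   # unique
    "marks over 100":     accepted((102, "CS", 3, 3, "Dinil", 120)),  # check
    "null name":          accepted((103, "CS", 3, 4, None, 60)),      # not null
}
for case, ok in results.items():
    print(case, "->", "accepted" if ok else "rejected")
```

Only the first tuple is accepted; each of the others violates exactly one of the declared constraints.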
Other integrity constraints are triggers, assertions, functional dependencies. These are explained in some
other sections.
Pitfalls in relational database design (ref: Navathe / Silberschatz)
Before we discuss normalization of databases, we look at the drawbacks of a poorly designed
database.
Some of the undesirable properties of a bad design are
Repetition of information
Inability to represent certain information
Problems in updating values
Lossy join decomposition
We can see an example. Suppose the information related to a college is stored as
College (dname, dhod, dphone, stdid, stdname, stdmarks)
College
Dname  Dhod  Dphone  Stdid  Stdname  Stdmarks
CS     Abc   23456   100    Ss1      70
CS     Abc   23456   101    Ss2      20
CS     Abc   23456   102    Ss3      45
EC     Bgh   78905   100    Ss7      67
EC     Bgh   78905   101    Ss8      55
AE     Mkl   34443   100    Ss2      68
CS     Abc   23456   103    Ss4      34
AE     Mkl   34443   101    Ss3      70
Suppose we want to add the details of a new student to the college table.
That is, the student (800, hjk, 50) in the AE department.
In our design we need a tuple with values on all attributes of the college schema. Thus we must repeat
the dhod and dphone of the AE department, and we must add the tuple
AE, Mkl, 34443, 800, hjk, 50
In general, the dhod and dphone for a department must appear once for each student admitted to
that department.
The repetition of information is undesirable. Repeating information wastes space. It also
complicates updating the database. Suppose the phone number of department CS changes from 23456 to 56789.
Under this design many tuples of the college relation need to be changed, so updates are costly.
When we perform the update on this table, we must ensure that every tuple corresponding to the CS
department is updated. Otherwise our table will show 2 different phone numbers for the same department.
By observing this, we can say that this design of our table or database is bad.
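The danger of an incomplete update can be shown on a few rows of the college table. This is a minimal sketch in plain Python; the helper names phones_of and change_phone and the limit parameter are made up for the illustration, and only some of the rows from the table are used.

```python
# Rows of the college relation: (dname, dhod, dphone, stdid, stdname, stdmarks)
college = [
    ("CS", "Abc", "23456", 100, "Ss1", 70),
    ("CS", "Abc", "23456", 101, "Ss2", 20),
    ("CS", "Abc", "23456", 102, "Ss3", 45),
    ("EC", "Bgh", "78905", 100, "Ss7", 67),
    ("CS", "Abc", "23456", 103, "Ss4", 34),
]

def phones_of(rows, dname):
    """All distinct phone numbers stored for one department."""
    return {row[2] for row in rows if row[0] == dname}

def change_phone(rows, dname, new_phone, limit=None):
    """Rewrite dphone for a department; `limit` caps how many tuples are touched."""
    out, touched = [], 0
    for row in rows:
        if row[0] == dname and (limit is None or touched < limit):
            row = row[:2] + (new_phone,) + row[3:]
            touched += 1
        out.append(row)
    return out

partial = change_phone(college, "CS", "56789", limit=2)  # update stops too early
complete = change_phone(college, "CS", "56789")          # every CS tuple updated

print(phones_of(partial, "CS"))   # two different numbers: the table is inconsistent
print(phones_of(complete, "CS"))  # a single number
```

If the department details were stored once in a separate table, the change would touch a single tuple and this inconsistency could not arise.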
We know that a department has a unique phone number, so given a department name we can
uniquely identify the phone number.
We know that a department has many students, so given a department name we cannot uniquely
determine the stdid. In other words, the functional dependency dname → dphone holds on the
college schema, but the functional dependency dname → stdid does not hold.
The fact that a department has a particular phone number and the fact that a department has a
student are independent, so these facts are best represented in separate tables. We will see that we can use
functional dependencies to specify formally when a database design is good.
Another problem with the college relation is that we cannot directly represent the information
about a department (dname, dhod, dphone) if there are no students in that department. This is
because tuples in the college relation require values for stdid, stdname and stdmarks.
One solution is to use null values, but null values are difficult to handle. If we do not
want to deal with null values, we can create the department information only when the first student is
admitted to that department. And if all students of that department leave, we have to delete all
information on that department. This situation is undesirable.
Other problems that can occur are update anomalies (problems in updates) and lossy join
decompositions.
For example, consider the student table
Student ( stdid, branch, name, marks, hod, deptphoneno)
Student
Stdid  Branch  Name  Marks  Hod  Deptphoneno
100    Cs      Abc   60     Def  567890
101    Cs      Bcd   70     Def  567890
102    Ec      Sad   80     Ghj  123456
105    Ec      Abc   10     Ghj  123456
In this table we can see that there is repetition of information. There is one particular
person as hod for each branch. If all the students’ details are stored in this table and there are
100 students in each branch, the hod’s name will be repeated 100 times. The department phone number
will also be repeated 100 times. Suppose the hod of a particular branch changes. Then we have to update
the hod field of every tuple of that branch. If there are 100 tuples for the branch, all 100 tuples
have to be updated. This is the case with deptphoneno also. If we want to
change the phone number of a particular department, it has to be changed in all of these tuples. Such
problems are called update anomalies.
Lossy join decomposition is another pitfall in relational database design. It is explained along with
fourth normal form.
Functional dependency (Ref: Navathe)
This is a very important concept in relational database design. A functional dependency is a
constraint between 2 sets of attributes from the database. First we can see an example.
Consider the student table.

Student
Stdid  Sname  Marks  Rn  Branch  Sem  Hod  Grade
100    Anil   50     1   Cs      3    Abc  D
101    Binil  80     2   Cs      3    Abc  A
102    Cinil  70     3   Cs      3    Abc  B
103    Dinil  80     4   Cs      3    Abc  A
We are considering the student table, and our assumptions are based on a real world view of the student.
We can see that the candidate keys of the table are stdid and (branch, sem, rn). A key means that
the value of the key attribute (or the combination of values, for a key with several attributes) is distinct
for each tuple. For example, the stdid value is different for each row of the student table. The same holds
for the key (branch, sem, rn): the 3 values of these three attributes, taken together, are distinct for each
tuple or row.
Stdid
100
101
102
103
108

Branch  Sem  Rn
cs      3    1
cs      3    2
cs      3    3
cs      5    1
cs      5    2
ec      3    1
ec      3    2
ec      3    3
ec      5    1
ec      5    2
We can see that the key values are distinct for each row.
If we write
Stdid → marks
this is called a functional dependency. That is, stdid functionally determines marks.
Suppose in the above table the values for the attributes are
Stdid  Marks
100    80
101    85
102    70
103    70
104    85
108    70
109    80
The ‘stdid’ values are different for each row since stdid is a candidate key. So for each ‘stdid’
value there is a unique ‘marks’ value. If the ‘stdid’ is 102, its corresponding
‘marks’ value is always 70 in this student table. This means that the value of the ‘marks’ attribute of a
tuple in student depends on, or is determined by, the value of the ‘stdid’ component. We can say that the
value of the ‘stdid’ component of a tuple uniquely (functionally) determines the value of the ‘marks’
attribute. We can also say that there is a functional dependency from ‘stdid’ to ‘marks’, or that ‘marks’ is
functionally dependent on ‘stdid’. The attribute ‘stdid’ is called the left hand side of the FD and ‘marks’
is called the right hand side.
We can write other functional dependencies as

Stdid → sname
Stdid → rn
Stdid → sem
Stdid → branch
Stdid → hod
Stdid → grade
We can also write them together as
Stdid → sname, marks, rn, branch, sem, hod, grade
This is correct because stdid is a key of the table.
We can also write

Branch, sem, rn → marks
We can write this because the left hand side is a key of the table.
Branch  Sem  Rn  Marks
Cs      3    1   50
Cs      3    2   60
Cs      3    3   70
Cs      3    4   50
Ec      3    1   50
Ec      3    2   20
Ec      3    3   30
Looking at this, we can say that

(branch, sem, rn) functionally determines marks.
Also we can write
Branch, sem, rn → stdid

Branch, sem, rn → sname
Branch, sem, rn → hod
Branch, sem, rn → grade
Or together
Branch, sem, rn → stdid, sname, marks, hod, grade
Since stdid and (branch, sem, rn) are both keys of student, each of them determines all the other
attributes. For stdid this gives
Stdid → branch, sem, rn, sname, marks, hod, grade
If we look at that table again, we can find other functional dependencies.
For example
Stdid  Branch  Hod
100    cs      abc
101    cs      abc
103    cs      abc
104    cs      abc
106    ec      bcd
107    ec      bcd
105    cs      abc
108    ec      bcd
We can see that for each branch there is only one hod, that is, for each value of
branch there is a unique hod.
We can write this as
Branch → hod
Next take marks and grade. Suppose the grade is A for marks of 80 and above. Then whenever the
mark 80 occurs, the grade will be A.
So for each value of marks there is a unique grade.
Stdid  Marks  Grade
100    50     D
101    80     A
102    85     A
103    50     D
104    60     C
105    75     B
106    60     C
So we can write
Marks → grade
So we can say that the following functional dependencies hold in the student relation.
Stdid → branch, sem, rn, sname, marks, hod, grade
Branch → hod
Marks → grade
In the student schema these functional dependencies are drawn as arrows over the attributes of
Student (Stdid, Sname, Marks, Rn, Branch, Sem, Hod, Grade)
A functional dependency (FD), denoted by X → Y, between 2 sets of attributes X and Y that
are subsets of R specifies a constraint on the possible tuples that can form a relation state r of R. The
constraint is that for any two tuples t1 and t2 in r that have t1[X] = t2[X], we must also have
t1[Y] = t2[Y].
This means that the values of the Y component of a tuple in r depend on, or are determined by, the
values of the X component. In other words, the values of the X component of a tuple uniquely or
functionally determine the values of the Y component.
We say that there is a functional dependency from X to Y, or that Y is functionally
dependent on X. The abbreviation for functional dependency is FD. The set of attributes X is called the left
hand side of the FD, and Y is called the right hand side of the FD.
A functional dependency is a property of the relation schema R, not of a particular relation state r
of R. So an FD cannot be inferred automatically from a given relation state; it must be explicitly defined
by someone who knows the meaning or semantics of the attributes of R.
Trivial and non trivial functional dependencies
In a functional dependency X → Y, if Y ⊆ X then it is a trivial FD.
X → Y is a non trivial FD when Y is not a subset of X.
For example
A  B  C
Q  L  M
E  J  N
R  B  Y
Q  L  J
T  B  D
U  G  P
R  B  Y
The FDs that hold in this table include
A → B
C → A
These two are non trivial functional dependencies.
We can also write
A, B → B
A, B, C → A, C
These are trivial functional dependencies because the RHS is a subset of the LHS.
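The definition t1[X] = t2[X] implies t1[Y] = t2[Y] can be tested mechanically on a relation state. The sketch below is plain Python with hypothetical helper names holds and is_trivial; it uses the A, B, C table above. Note that such a test can only show that an FD is violated or not violated in one particular state; whether the FD holds on the schema still depends on the meaning of the attributes.

```python
def holds(rows, lhs, rhs):
    """True if no two rows agree on `lhs` but differ on `rhs`."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen and returns the stored value.
        if seen.setdefault(x, y) != y:
            return False
    return True

def is_trivial(lhs, rhs):
    """An FD X -> Y is trivial when Y is a subset of X."""
    return set(rhs) <= set(lhs)

# The example table, one dict per tuple, e.g. {"A": "Q", "B": "L", "C": "M"}.
rows = [dict(zip("ABC", t)) for t in ["QLM", "EJN", "RBY", "QLJ", "TBD", "UGP", "RBY"]]

print(holds(rows, "A", "B"))    # True:  A -> B is not violated
print(holds(rows, "C", "A"))    # True:  C -> A is not violated
print(holds(rows, "B", "A"))    # False: B = 'B' appears with A = 'R' and A = 'T'
print(is_trivial("AB", "B"))    # True
print(is_trivial("ABC", "AC"))  # True
print(is_trivial("A", "B"))     # False
```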
Inference rules for functional dependencies
- Armstrong’s axioms are the basic inference rules.
- Armstrong’s axioms are used to infer the FDs that hold on a relational database.
- An inference rule is a type of assertion. It can be applied to a set of FDs to derive other FDs.
- Using the inference rules, we can derive additional functional dependencies from the initial set F.
- The set of all such dependencies is called the closure of F and is denoted by F+.
Armstrong’s inference rules

The following are the well known inference rules for FDs. Rules 1 to 3 are Armstrong’s axioms; rules 4 to 6
can be derived from them.
1. Reflexive rule
If Y ⊆ X, then X → Y (if Y is a subset of X then X determines Y)
2. Augmentation rule
If X determines Y then XZ determines YZ for any Z
From X → Y we can infer XZ → YZ
3. Transitive rule
From X → Y and Y → Z we can infer X → Z
4. Decomposition rule
From X → YZ we can infer X → Y and X → Z
5. Union rule
From X → Y and X → Z we can infer X → YZ
6. Pseudotransitive rule
From X → Y and WY → Z we can infer WX → Z
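As an illustration of how the later rules follow from the first three, here is one standard derivation of the union rule using only augmentation and transitivity. It is written as a LaTeX sketch; juxtaposition such as XY denotes the union of the attribute sets.

```latex
\begin{align*}
&1.\; X \to Y    && \text{given} \\
&2.\; X \to Z    && \text{given} \\
&3.\; X \to XY   && \text{augmentation of 1 with } X \text{ (since } XX = X\text{)} \\
&4.\; XY \to YZ  && \text{augmentation of 2 with } Y \\
&5.\; X \to YZ   && \text{transitivity on 3 and 4}
\end{align*}
```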
Closure of an Attribute Set

The set of all attributes which can be functionally determined from an attribute or set of attributes
is called the closure of that attribute or set of attributes.
The closure of an attribute set {X} is denoted as {X}+.
The following steps are followed to find the closure of an attribute set.
Step 1: Add the attributes of the given attribute set itself to the closure.
Step 2: For each functional dependency whose LHS is contained in the closure, add the attributes
which are present on its RHS.
Step 3: With the help of the newly added attributes, check which other attributes can be derived
from the other given functional dependencies. Repeat this process until all the attributes
which can be derived have been added to the closure.
Ex 1: Consider relation R(A,B,C,D,E,F,G) with the functional dependencies
F = {A → BC, BC → DE, D → F, CF → G}

Closure of attribute A
A+ = {A}
   = {A,B,C} (using FD A → BC)
   = {A,B,C,D,E} (using FD BC → DE)
   = {A,B,C,D,E,F} (using FD D → F)
   = {A,B,C,D,E,F,G} (using FD CF → G)
A+ = {A,B,C,D,E,F,G}, which is every attribute of R, so A is a candidate key.
Ex 2:
R(A,B,C) with the FDs A → B, B → C, AB → C
A+ = {A,B,C}, so A is a candidate key

B+ = {B,C}
C+ = {C}; C does not determine any other attribute under the above FDs
(AB)+ = {A,B,C}, so AB is a super key
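The closure steps can be written as a short loop. The sketch below is plain Python; the function name closure and the representation of each FD as a pair of strings (LHS, RHS), with one character per attribute, are assumptions made for this illustration. It reproduces the results of Ex 1 and Ex 2.

```python
def closure(attrs, fds):
    """Closure of the attribute set `attrs` under the FDs in `fds`."""
    result = set(attrs)          # step 1: start with the attributes themselves
    changed = True
    while changed:               # step 3: repeat until nothing new is added
        changed = False
        for lhs, rhs in fds:
            # step 2: if the whole LHS is already in the closure, add the RHS
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# Ex 1: R(A,B,C,D,E,F,G)
F1 = [("A", "BC"), ("BC", "DE"), ("D", "F"), ("CF", "G")]
# Ex 2: R(A,B,C)
F2 = [("A", "B"), ("B", "C"), ("AB", "C")]

print(sorted(closure("A", F1)))   # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
print(sorted(closure("B", F2)))   # ['B', 'C']
print(sorted(closure("C", F2)))   # ['C']
print(sorted(closure("AB", F2)))  # ['A', 'B', 'C']
```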
FINDING KEYS USING CLOSURE
Super Key
- If the closure of an attribute set contains all the attributes of the relation, then that
attribute set is called a super key of that relation.
- Thus we can say
the closure of a super key is the entire relation schema.
For example, in Ex 2 above A is a super key. BC is not a super key, because (BC)+ = {B,C} does not
contain A; the closure must contain all the attributes for the set to be a super key.
Candidate Key
If an attribute set is a super key and no proper subset of it has a closure that contains all the
attributes of the relation, then that attribute set is called a candidate key of that relation.
For example
A is a super key, and no proper subset of A determines all the attributes of the relation.
Thus attribute A is also a candidate key for that relation.
AC is a super key but not a candidate key, because its subset A is itself a super key. A candidate
key is a minimal super key.
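Candidate keys can be found by brute force: test attribute sets in order of increasing size and keep every super key that does not contain a smaller key. The sketch below is plain Python under the same assumed representation as before (FDs as pairs of strings); the function names are hypothetical, and the exhaustive search is only practical for small schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of `attrs` under `fds`, where each FD is a (lhs, rhs) pair."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_superkey(attrs, attributes, fds):
    """A super key is a set whose closure is the entire relation schema."""
    return closure(attrs, fds) == set(attributes)

def candidate_keys(attributes, fds):
    """All minimal super keys, found by trying subsets from smallest to largest."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(candidate, attributes, fds):
                keys.append(candidate)
    return keys

F1 = [("A", "BC"), ("BC", "DE"), ("D", "F"), ("CF", "G")]  # Ex 1
F2 = [("A", "B"), ("B", "C"), ("AB", "C")]                  # Ex 2

print(candidate_keys("ABCDEFG", F1))   # [{'A'}]
print(candidate_keys("ABC", F2))       # [{'A'}]
print(is_superkey("AC", "ABC", F2))    # True, but AC is not minimal
print(is_superkey("BC", "ABC", F2))    # False
```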
Equivalence of Functional Dependencies
Let F and G be two sets of functional dependencies.

1. If all the FDs of G can be inferred from the FDs that are present in F, we conclude that F covers G.
2. If all the FDs of F can be inferred from the FDs that are present in G, we conclude that G
covers F.
If both 1 and 2 are satisfied, then F and G are equivalent (F ≡ G).
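The cover test can be done with attribute closures: an FD X → Y can be inferred from F exactly when Y is contained in the closure of X under F. The following Python sketch uses that fact; the sets F, G and H are hypothetical examples, not taken from the text above.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, where each FD is a (lhs, rhs) pair."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def covers(f, g):
    """F covers G if every FD X -> Y in G satisfies Y within the closure of X under F."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)

def equivalent(f, g):
    return covers(f, g) and covers(g, f)

# Hypothetical FD sets over R(A, B, C).
F = [("A", "B"), ("B", "C")]
G = [("A", "B"), ("B", "C"), ("A", "C")]   # adds A -> C, which already follows from F
H = [("A", "B")]                           # lacks B -> C

print(equivalent(F, G))  # True
print(covers(F, H))      # True:  F implies everything in H
print(covers(H, F))      # False: H cannot derive B -> C
print(equivalent(F, H))  # False
```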
Normal forms (ref: DBMS by Navathe)
The normalization process was first proposed by Codd. It takes a relation schema
or a set of tables through a series of tests to check whether it satisfies a certain normal
form. Codd proposed 3 normal forms.
First normal form
Second normal form and
Third normal form
Later a stronger version of the third normal form was proposed. It is called
Boyce Codd normal form
All these normal forms are based on functional dependencies.

Later, fourth and fifth normal forms were proposed. They are based on multivalued
and join dependencies.
We have already studied some drawbacks or pitfalls in relational database design. The main
drawbacks are repetition of information and inability to represent certain information. The purpose of
normalization is to analyze the given relation schemas or tables, based on their functional dependencies and
candidate keys, and to remove the above drawbacks from the database. If a relation schema or table does
not satisfy the normal form tests, it is decomposed into new relations which satisfy the
normal form tests.
We know the concept of candidate keys and the primary key of a table.
Prime attribute
An attribute of relation schema R is called a prime attribute if it is a member of some candidate key
of R. An attribute is called non-prime if it is not a prime attribute, that is, if it is not a member of any
candidate key.
For example
Student ( stdid, branch, sem, rn, sname, marks)
‘Branch’ is a prime attribute because it is a member of the candidate key ( branch, sem, rn).
Likewise ‘sem’ and ‘rn’ are prime attributes.
‘Stdid’ is a prime attribute because it is itself a candidate key.
‘Marks’ is not a prime attribute.
Also ‘sname’ is not a prime attribute.
First normal form (1NF)
It is defined to disallow multivalued attributes, composite attributes and their combinations.

It states that the domain of an attribute must include only atomic (indivisible) values and that the value of
any attribute in a tuple must be a single value from the domain of that attribute.
So first normal form disallows having a set of values, a tuple of values or a combination of both as an
attribute value for a single tuple.
We can explain this using an example.
Consider the student relation.
Student ( stdid, sname, saddress, phoneno)
Student
Stdid  Sname  Saddress              Phoneno
100    Abc    No. 20, KTM, Kerala   567890
102    Bcd    No. 35, EKM, Kerala   564476, 234789
105    Def    No. 41, KTM, Kerala   123245, 367840, 300898
In this relation we can see that there are 3 tuples. There is a composite attribute ‘saddress’ having
three fields: house no, city and state.
There is also a multivalued attribute ‘phoneno’. We can see that student 102 has 2 phones and
student 105 has 3 phones.
According to 1NF, multivalued and composite attributes are not allowed.
We have to find a way to normalize this schema to first normal form.
First we solve the problem caused by the multivalued attribute, here phoneno.
We remove the attribute ‘phoneno’ and place it in a separate table or relation along with the primary
key of student, that is ‘stdid’.
Then we get
Student1 ( stdid, sname, saddress)
Std_phone ( stdid, phoneno)
Here the
primary key of student1 is ‘stdid’ and
the primary key of std_phone is (stdid, phoneno)
Student1
Stdid  Sname  Saddress
100    Abc    No. 20, KTM, Kerala
102    Bcd    No. 35, EKM, Kerala
105    Def    No. 41, KTM, Kerala
Std_phone
Stdid  Phoneno
100    567890
102    564476
102    234789
105    123245
105    367840
105    300898
Next we have to deal with the composite attribute. We can expand ‘saddress’ into 3 attributes,
‘add_house’, ‘add_city’ and ‘add_state’. Then the relations will be
Student1A
Stdid  Sname  Add_house  Add_city  Add_state
100    Abc    No. 20     KTM       Kerala
102    Bcd    No. 35     EKM       Kerala
105    Def    No. 41     KTM       Kerala
Std_phone
Stdid  Phoneno
100    567890
102    564476
102    234789
105    123245
105    367840
105    300898
We can see that student1A and std_phone are in first normal form (1NF).
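The two normalization steps above can be sketched as a small transformation in Python. The nested list used for the phone numbers and the assumption that every address has exactly three comma-separated parts are choices made for this illustration only.

```python
# Unnormalized student rows: the address is composite, the phone list is multivalued.
student = [
    (100, "Abc", "No. 20, KTM, Kerala", ["567890"]),
    (102, "Bcd", "No. 35, EKM, Kerala", ["564476", "234789"]),
    (105, "Def", "No. 41, KTM, Kerala", ["123245", "367840", "300898"]),
]

student1a = []   # (stdid, sname, add_house, add_city, add_state)
std_phone = []   # (stdid, phoneno), one row per phone number

for stdid, sname, saddress, phones in student:
    # Split the composite attribute into its three atomic parts.
    add_house, add_city, add_state = [part.strip() for part in saddress.split(",")]
    student1a.append((stdid, sname, add_house, add_city, add_state))
    # Move the multivalued attribute to its own relation, keyed by (stdid, phoneno).
    std_phone.extend((stdid, phone) for phone in phones)

for row in student1a:
    print(row)
for row in std_phone:
    print(row)
```

Every value in the two resulting relations is atomic, which is what 1NF requires.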
Second normal form (2NF)
A relation schema or table R is in second normal form if it is in 1NF and every non-prime attribute A
in R is fully functionally dependent on the primary key of R (more generally, on every candidate key of R).
Before looking at second normal form in detail, we have to learn some definitions.
Prime attribute: an attribute which is part of some candidate key.
Non-prime attribute: an attribute which is not part of any candidate key.
Partial and full functional dependencies
A functional dependency X → Y is a full functional dependency if removal of any attribute A from X
(that is, any A that belongs to X) means that the dependency does not hold any more.
A functional dependency X → Y is a partial functional dependency if some attribute A can be removed
from X and the dependency still holds.
For example
Student (stdid, branch, sem, rn, name, marks, hod)
We know that the following FDs are correct for this table.
FD1 -- stdid → branch, sem, rn, name, marks, hod
FD2 -- branch, sem, rn → stdid, name, marks

Also
FD3 -- branch, sem, rn → hod
In FD2, if we remove the attribute sem from the LHS (the X part), we can see that
(branch, rn) does not functionally determine stdid, name, marks. The same is true if we remove branch
or rn. So FD2 is a full functional dependency.
In FD3, if we remove the attributes sem and rn, we can see that the FD still holds.
That is, branch → hod is also a functional dependency. So FD3 is a partial functional dependency.
For example
Student1 (stdid, branch, sem, rn, name, hod, marks, grade)
[Figure: dependency diagram of student1 with arrows labelled FD1 to FD4]
We can see that the student1 relation is not in second normal form because of FD3, that is
branch → hod
It violates 2NF because the non-prime attribute hod is partially dependent on the candidate key (branch, sem, rn).
This is a partial functional dependency because
branch, sem, rn → hod still holds if we remove the attributes sem and rn.
The other non-prime attributes are name, marks and grade. They are fully functionally dependent on the keys.
stdid → name
branch, sem, rn → name
stdid → marks
stdid → grade
branch, sem, rn → grade
marks → grade does not violate 2NF, because marks is not a prime attribute, so this is not a partial dependency on any key.
As a next step we normalize student1 to 2NF.
We decompose it by removing the attribute hod, which causes the partial dependency, from student1 and placing it in another relation together with branch.
That is, we decompose student1 into student1A and student1B:
student1A (stdid, branch, sem, rn, name, marks, grade)
student1B (branch, hod)
Both relations are in 2NF.
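The 2NF test can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the notes' own method: the function names are hypothetical, the candidate keys are found by brute force (fine for a schema this small), and marks → grade is assumed to be one of the dependencies of student1.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes determined by `attrs`; `fds` is a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Brute-force search for all minimal keys, smallest first."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attrs):
                keys.append(candidate)
    return keys


def partially_dependent(attrs, fds):
    """Non-prime attributes determined by a proper subset of some key."""
    keys = candidate_keys(attrs, fds)
    prime = set().union(*keys)
    found = set()
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                found |= closure(part, fds) - prime
    return found


student1 = {"stdid", "branch", "sem", "rn", "name", "hod", "marks", "grade"}
student1_fds = [
    ({"stdid"}, student1 - {"stdid"}),
    ({"branch", "sem", "rn"}, {"stdid"}),
    ({"branch"}, {"hod"}),
    ({"marks"}, {"grade"}),  # assumed dependency
]
student1a = student1 - {"hod"}
student1a_fds = [
    ({"stdid"}, student1a - {"stdid"}),
    ({"branch", "sem", "rn"}, {"stdid"}),
    ({"marks"}, {"grade"}),
]

print(partially_dependent(student1, student1_fds))    # {'hod'}
print(partially_dependent(student1a, student1a_fds))  # set()
```

An empty result means no non-prime attribute depends on part of a key, which is the 2NF condition.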
Third normal form (3NF)
3NF is based on the concept of transitive dependency. A relation in 3NF must be in 2NF and must not contain transitive dependencies of non-prime attributes on a key.
A functional dependency X → Z in a relation R is a transitive dependency if there is a set of attributes Y such that X → Y and Y → Z both hold, and Y is neither a candidate key nor a subset of any key of R.
We can see this by an example.
Student3 (stdid, branch, sem, rn, name, marks, grade)
We have shown 3 FDs here. That is
Fd1: stdid → marks
Fd2: marks → grade
Fd3: stdid → grade
We can see that marks is not a prime attribute of student3.
stdid → grade is a transitive dependency because of Fd1 and Fd2.
This is not allowed in 3NF.
A relation R is said to be in 3NF if R is in 2NF and no non-prime attribute of R is transitively dependent on the key of R.
The above relation schema student3 is in 2NF, since no partial dependencies on a key exist. But it is not in 3NF because of the transitive dependency stdid → grade via marks.
We can normalize student3 by decomposing it into two 3NF relation schemas, student3A and student3B, as follows.
student3A (stdid, branch, sem, rn, name, marks)
student3B (marks, grade)
We can see that both relations are in 3NF.
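A small Python sketch can illustrate that this decomposition loses no information: projecting student3 onto the two schemas and joining them again on marks gives back the original rows. The sample rows and the helper names project and natural_join are hypothetical and chosen only for illustration.

```python
def project(rows, columns):
    """Project a list of row dicts onto `columns`, dropping duplicates."""
    result = []
    for row in rows:
        picked = {c: row[c] for c in columns}
        if picked not in result:
            result.append(picked)
    return result


def natural_join(left, right):
    """Join two lists of row dicts on their shared column names."""
    joined = []
    for l_row in left:
        for r_row in right:
            shared = set(l_row) & set(r_row)
            if all(l_row[c] == r_row[c] for c in shared):
                joined.append({**l_row, **r_row})
    return joined


# Hypothetical sample rows; marks -> grade holds in this data.
columns = ["stdid", "branch", "sem", "rn", "name", "marks", "grade"]
raw_rows = [
    (1, "CS", 5, 10, "Anu", 85, "A"),
    (2, "CS", 5, 11, "Binu", 62, "B"),
    (3, "EC", 3, 7, "Cini", 85, "A"),
]
student3 = [dict(zip(columns, values)) for values in raw_rows]

student3a = project(student3, ["stdid", "branch", "sem", "rn", "name", "marks"])
student3b = project(student3, ["marks", "grade"])
rejoined = natural_join(student3a, student3b)


def by_id(row):
    return row["stdid"]


print(len(student3b))  # 2
print(sorted(rejoined, key=by_id) == sorted(student3, key=by_id))  # True
```

In this sample the fact that marks 85 earns grade A is stored once in student3B instead of once per student.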
Example 2:
Emp_dept (ename, ssn, bdate, address, dnumber, dname, dmgrssn)
The above schema is in 2NF, but it is not in 3NF because of transitive dependencies on the key ssn:
ssn → dmgrssn (through dnumber)
ssn → dname (through dnumber)
We can decompose this into
ED1 (ename, ssn, bdate, address, dnumber)
ED2 (dnumber, dname, dmgrssn)
Both ED1 and ED2 are in 3NF.
General definitions of second and third normal forms
General definition of second normal form
A relation schema R is in 2NF if no non-prime attribute A in R is partially dependent on any key of R. We can see an example.
LOTS (propertyid, countyname, lot, area, price, taxrate)
Fd1: propertyid → countyname, lot, area, price, taxrate
Fd2: countyname, lot → propertyid, area, price, taxrate
Fd3: countyname → taxrate
Fd4: area → price
We can see that the LOTS schema violates the general definition of 2NF because taxrate is partially dependent on the candidate key (countyname, lot) due to Fd3.
To normalize LOTS into 2NF, we decompose it into two relations, LOTS1 and LOTS2. We construct LOTS1 by removing the attribute taxrate that violates 2NF and placing it with countyname (the LHS of Fd3, which causes the partial dependency) into another relation LOTS2. Both LOTS1 and LOTS2 are in 2NF. Fd4 does not violate 2NF.
LOTS1 (propertyid, countyname, lot, area, price), on which Fd1, Fd2 and Fd4 hold
LOTS2 (countyname, taxrate), on which Fd3 holds
The relations LOTS1 and LOTS2 are in second normal form.
General definition of third normal form (3NF)
A relation schema R is in 3NF if, whenever a non-trivial functional dependency X → A holds in R, either
a) X is a super key of R
OR
b) A is a prime attribute of R.
If every non-trivial functional dependency satisfies at least one of these conditions, the relation schema is in 3NF.
Using this definition we can directly check whether a relation schema is in 3NF.
Consider the LOTS relation.
LOTS (propertyid, countyname, lot, area, price, taxrate), with the same dependencies Fd1 to Fd4
According to this definition LOTS is not in 3NF, because Fd3 and Fd4 violate both conditions.
Fd1 and Fd2 satisfy the definition because their left-hand sides are super keys.
But in Fd3
countyname → taxrate
countyname by itself is not a super key and taxrate is not a prime attribute.
Also in Fd4
area → price
area is not a super key and price is not a prime attribute.
So LOTS is not in 3NF.
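The general 3NF test can be written as a short Python sketch. It is illustrative only: the function names are hypothetical, Fd1 and Fd2 are written out on the assumption that each candidate key of LOTS determines all other attributes, and only the listed FDs are checked rather than every FD they imply.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes determined by `attrs`; `fds` is a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Brute-force search for all minimal keys, smallest first."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attrs):
                keys.append(candidate)
    return keys


def third_nf_violations(attrs, named_fds):
    """Names of FDs X -> A where X is no superkey and A is not prime."""
    fds = list(named_fds.values())
    prime = set().union(*candidate_keys(attrs, fds))
    bad = []
    for name, (lhs, rhs) in named_fds.items():
        is_superkey = closure(lhs, fds) == set(attrs)
        if not is_superkey and (rhs - lhs) - prime:
            bad.append(name)
    return bad


lots = {"propertyid", "countyname", "lot", "area", "price", "taxrate"}
lots_fds = {
    "Fd1": ({"propertyid"}, lots - {"propertyid"}),       # assumed form
    "Fd2": ({"countyname", "lot"}, lots - {"countyname", "lot"}),  # assumed form
    "Fd3": ({"countyname"}, {"taxrate"}),
    "Fd4": ({"area"}, {"price"}),
}
violations = third_nf_violations(lots, lots_fds)
print(violations)  # ['Fd3', 'Fd4']
```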
To normalize LOTS we decompose it into LOTS2, LOTS1A and LOTS1B.
We construct LOTS1A by removing the attribute price, which violates 3NF, into LOTS1B, and LOTS2 by removing the attribute taxrate, which also violates 3NF.
LOTS2 (countyname, taxrate), on which Fd3 holds
LOTS1A (propertyid, countyname, lot, area), on which Fd1 and Fd2 hold
LOTS1B (area, price), on which Fd4 holds
We can see that all the above relations LOTS2, LOTS1A, LOTS1B are in 3NF.
Equivalently, a relation schema R is in 3NF if every non-prime attribute of R meets both of the following conditions.
It is fully functionally dependent on every key of R.
It is non-transitively dependent on every key of R.
Boyce Codd Normal form (BCNF)
It was first proposed as a simpler form of 3NF, but it was found to be stricter than 3NF: every relation in BCNF is also in 3NF, but a relation in 3NF is not necessarily in BCNF.
A relation schema R is in BCNF if, whenever a non-trivial functional dependency X → A holds in R, X is a superkey of R. The only difference between BCNF and 3NF is that condition (b) of 3NF (which allows A to be prime) is absent from BCNF.
Suppose we have the table Lots1A
Lots1A (propertyid, countyname, lot, area)
Fd1: propertyid → countyname, lot, area
Fd2: countyname, lot → propertyid, area
Fd5: area → countyname
Here the relation Lots1A is in 3NF, but it is not in BCNF.
Fd5 violates BCNF because area is not a superkey. It does not violate 3NF because countyname is a prime attribute. Fd1 and Fd2 satisfy BCNF because their left-hand sides are super keys.
So we remove the attribute countyname and place it in another relation together with area.
Lots1AX (propertyid, area, lot)
Lots1AY (area, countyname)
These relations are in BCNF.
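The BCNF test needs only attribute closures, as the following Python sketch suggests. The function names are hypothetical, and the FD lists for Lots1A, Lots1AX and Lots1AY are an assumed encoding of the dependencies that apply to each schema.

```python
def closure(attrs, fds):
    """Attributes determined by `attrs`; `fds` is a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attrs, named_fds):
    """Names of non-trivial FDs whose left-hand side is not a superkey."""
    fds = list(named_fds.values())
    return [
        name
        for name, (lhs, rhs) in named_fds.items()
        if not rhs <= lhs and closure(lhs, fds) != set(attrs)
    ]


lots1a = {"propertyid", "countyname", "lot", "area"}
lots1a_fds = {
    "Fd1": ({"propertyid"}, {"countyname", "lot", "area"}),
    "Fd2": ({"countyname", "lot"}, {"propertyid", "area"}),
    "Fd5": ({"area"}, {"countyname"}),
}
lots1ax = {"propertyid", "area", "lot"}
lots1ax_fds = {"Fd1": ({"propertyid"}, {"area", "lot"})}
lots1ay = {"area", "countyname"}
lots1ay_fds = {"Fd5": ({"area"}, {"countyname"})}

print(bcnf_violations(lots1a, lots1a_fds))    # ['Fd5']
print(bcnf_violations(lots1ax, lots1ax_fds))  # []
print(bcnf_violations(lots1ay, lots1ay_fds))  # []
```

After the decomposition no single relation contains both countyname and lot, so Fd2 can no longer be checked within one relation. That is the price paid for BCNF in this example.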
Every relation in BCNF is also in 3NF, but a relation in 3NF is not necessarily in BCNF.
For example
R (A, B, C)
Fd1: A, B → C
Fd2: C → B
Here the relation R is in 3NF, because B is a prime attribute. But it is not in BCNF, because in Fd2 the left-hand side C is not a super key of R.
Exercise:
1. Consider the relation R = {A, B, C, D, E, F, G, H, I, J} and the set of functional dependencies
A, B → C
A → D, E
B → F
F → G, H
D → I, J
What is the key of R?
Decompose R into 2NF, then 3NF relations.
Answer
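R (A, B, C, D, E, F, G, H, I, J)
Fd1: A, B → C
Fd2: A → D, E
Fd3: B → F
Fd4: F → G, H
Fd5: D → I, J
The key of R is (A, B), because A and B together determine every other attribute and neither A nor B alone does.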
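The key can be confirmed by computing attribute closures. The following Python sketch is illustrative only; closure and candidate_keys are hypothetical helper names, and the key search is a brute-force scan over all attribute subsets, which is feasible for ten attributes.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes determined by `attrs`; `fds` is a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Brute-force search for all minimal keys, smallest first."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attrs):
                keys.append(candidate)
    return keys


r = set("ABCDEFGHIJ")
r_fds = [
    (set("AB"), set("C")),   # Fd1
    (set("A"), set("DE")),   # Fd2
    (set("B"), set("F")),    # Fd3
    (set("F"), set("GH")),   # Fd4
    (set("D"), set("IJ")),   # Fd5
]

print(sorted(closure("A", r_fds)))  # ['A', 'D', 'E', 'I', 'J']
print(sorted(closure("B", r_fds)))  # ['B', 'F', 'G', 'H']
print([sorted(key) for key in candidate_keys(r, r_fds)])  # [['A', 'B']]
```

The two single-attribute closures also show which attributes depend on only part of the key, which is what the 2NF decomposition that follows has to separate out.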
R is not in 2NF because Fd2 and Fd3 are partial functional dependencies on the key (A, B). So we remove the attributes D, E and F from R. But we can also see
A → D
D → I
D → J
so we have to remove I and J along with D, and
B → F
F → G
F → H
so we have to remove G and H along with F.
So we get the following 2NF relations:
R1 (A, B, C), on which Fd1 holds
R2 (A, D, E, I, J), on which Fd2 and Fd5 hold
R3 (B, F, G, H), on which Fd3 and Fd4 hold
The relations R1, R2 and R3 are in 2NF because they are in 1NF and contain no partial functional dependencies.
DECOMPOSITION TO 3NF
We can take each of R1, R2 and R3 and analyse them.
R1 (A, B, C), with Fd1
R1 is in 3NF because in Fd1 (A, B → C) the left-hand side (A, B) is a super key.
R2 (A, D, E, I, J), with Fd2 and Fd5
R2 is not in 3NF:
Fd2 (A → D, E) satisfies 3NF because A is a super key of R2.
Fd5 (D → I, J) violates 3NF because D is not a super key and I and J are not prime attributes.
So we remove I and J from R2.
We decompose R2 as
R2A (A, D, E), with Fd2
R2B (D, I, J), with Fd5
R2A and R2B are in 3NF.
Consider R3
R3 (B, F, G, H), with Fd3 and Fd4
Fd3 satisfies 3NF because B is a super key of R3.
Fd4 violates 3NF because F is not a super key and G and H are not prime attributes.
We decompose R3 into two relations, R3A and R3B.
R3A (B, F), with Fd3
R3B (F, G, H), with Fd4
R3A and R3B are in 3NF.
So we get the final set of relations as
R1 (A, B, C), with Fd1
R2A (A, D, E), with Fd2
R2B (D, I, J), with Fd5
R3A (B, F), with Fd3
R3B (F, G, H), with Fd4
Numerical Questions:
1. Consider a relation R(A, B, C, D) with the following functional dependencies:

o A -> B
o BC -> D
Determine the minimal cover of the functional dependencies.
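Solution: The given set is already a minimal cover:
o A -> B
o BC -> D
Each right-hand side is a single attribute. Neither B nor C can be dropped from BC -> D, because B+ = {B} and C+ = {C}, so neither alone determines D. Neither dependency can be derived from the other. Note that BC -> D cannot be split into B -> D and C -> D.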
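A minimal cover can be computed with the usual three steps: split right-hand sides, drop unnecessary left-hand-side attributes, then drop redundant dependencies. The Python sketch below is an illustration under the assumption that attributes are single letters; the function names are hypothetical. For A -> B and BC -> D it returns the two dependencies unchanged, since neither B nor C alone determines D.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def minimal_cover(fds):
    """Return a minimal cover of `fds`, given as (lhs, rhs) strings."""
    # Step 1: give every FD a single attribute on its right-hand side.
    cover = [
        (frozenset(lhs), frozenset([attr]))
        for lhs, rhs in fds
        for attr in sorted(rhs)
    ]
    # Step 2: drop left-hand-side attributes that are not needed.
    reduced = []
    for lhs, rhs in cover:
        lhs = set(lhs)
        for attr in sorted(lhs):
            if len(lhs) > 1 and rhs <= closure(lhs - {attr}, cover):
                lhs.discard(attr)
        reduced.append((frozenset(lhs), rhs))
    # Step 3: drop FDs that follow from the remaining ones.
    result = list(dict.fromkeys(reduced))  # remove duplicates, keep order
    for fd in list(result):
        others = [other for other in result if other != fd]
        if fd[1] <= closure(fd[0], others):
            result = others
    return result


cover = minimal_cover([("A", "B"), ("BC", "D")])
for lhs, rhs in cover:
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
# A -> B
# BC -> D
```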
2. Given a relation R(A, B, C, D) with the following functional dependencies:

o AB -> C
o C -> D
Determine if the relation R is in Second Normal Form (2NF) and Third Normal Form (3NF).
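Solution: The only candidate key is AB. R is in 2NF because neither A nor B alone determines C or D, so there is no partial dependency. R is not in 3NF because D is transitively dependent on the candidate key AB through C (AB -> C and C -> D, where C is not a super key and D is not a prime attribute). To bring R into 3NF, decompose it into two relations: R1(A, B, C) and R2(C, D).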
3. Consider a relation R(A, B, C, D) with the following functional dependencies:

o A -> B
o BC -> D
Decompose the relation R into Boyce-Codd Normal Form (BCNF) while preserving dependencies. Show
the resulting decomposed relations.
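Solution: The only candidate key of R is AC, since AC+ = {A, B, C, D} while A+ = {A, B} and C+ = {C}. R is not in BCNF: in A -> B the left-hand side A is not a super key, and in BC -> D the left-hand side BC is not a super key (BC+ = {B, C, D}). Decomposing on BC -> D gives R2(B, C, D) and (A, B, C); decomposing (A, B, C) on A -> B gives R1(A, B) and R3(A, C). The resulting relations are R1(A, B), R2(B, C, D) and R3(A, C). Each is in BCNF, and both dependencies are preserved: A -> B in R1 and BC -> D in R2.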
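The BCNF test for this question can be run with attribute closures. The Python sketch below is illustrative: the function names are hypothetical, and the decomposition R1(A, B), R2(B, C, D), R3(A, C) is one candidate decomposition assumed for the check, reached by splitting first on BC -> D and then on A -> B. The lossless test used is the standard one for a binary split: the shared attributes must determine all of one side.

```python
def closure(attrs, fds):
    """Attributes determined by `attrs`; `fds` is a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attrs, fds):
    """Non-trivial FDs whose left-hand side is not a superkey of `attrs`."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != set(attrs)
    ]


def is_lossless_split(r1, r2, fds):
    """A binary split is lossless if the shared attributes determine r1 or r2."""
    shared_closure = closure(r1 & r2, fds)
    return r1 <= shared_closure or r2 <= shared_closure


fds = [(set("A"), set("B")), (set("BC"), set("D"))]

# Both FDs violate BCNF in R(A, B, C, D), whose only key is AC.
r_violations = bcnf_violations(set("ABCD"), fds)
print(len(r_violations))  # 2

# Split R on BC -> D, then split (A, B, C) on A -> B.
step1 = is_lossless_split(set("BCD"), set("ABC"), fds)
step2 = is_lossless_split(set("AB"), set("AC"), fds)
print(step1, step2)  # True True

# Each resulting relation, with the FDs that apply to it, has no violation.
r1_ok = bcnf_violations(set("AB"), [(set("A"), set("B"))]) == []
r2_ok = bcnf_violations(set("BCD"), [(set("BC"), set("D"))]) == []
r3_ok = bcnf_violations(set("AC"), []) == []
print(r1_ok, r2_ok, r3_ok)  # True True True
```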