Kalinga University Department of Computer Science: Dependencies

Kalinga University
Department of Computer Science
Course-MCA Sem-II
Subject-Database Management System
Subject Code- MCA204
UNIT-III
Dependencies
1. Functional dependency: each value of one property type is associated with exactly one value
of another property type. For example: one identity number can have only one name associated
with it, but the same name can be related to many identity numbers. This is a one-to-many
relation.
2. Mutual functional dependency: the above-mentioned requirement holds in both
‘directions’. For instance: registration number – engine number. This is a one-to-one relation.
3. Functional independence: the previous relation between the two property types does
not exist. An example: the hair colour of an employee and the company’s premises.
4. Transitive functional dependency: within an entity type, the concrete values of one
descriptive property type determine the values of other descriptive property types.
Functional Dependency (FD)
A functional dependency (FD) is a relationship between two attributes, typically between the PK
and the non-key attributes within a table. For any relation R, attribute Y is functionally
dependent on attribute X (usually the PK) if every valid value of X uniquely determines the
value of Y.
A dependency FD: X → Y means that the values of Y are determined by the values of X: two
tuples sharing the same values of X will necessarily have the same values of Y. A functional
dependency FD: X → Y is called trivial if Y is a subset of X.
Functional Dependency
If one piece of information stored in a table uniquely determines another piece of information
in the same table, the relationship is called a Functional Dependency. Consider it an association
between two attributes of the same relation.
If P functionally determines Q, then
P -> Q
Let us see an example −

<Employee>
EmpID EmpName EmpAge
E01 Amit 28
E02 Rohit 31
In the above table, EmpName is functionally dependent on EmpID because EmpName can take
only one value for the given value of EmpID:
EmpID -> EmpName
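The check described above can be sketched in Python. This is a minimal illustration, assuming the table is held as a list of dictionaries; the helper name holds_fd is hypothetical. Note that testing one table instance can only show that a dependency is violated, it cannot prove that the dependency holds for every valid instance.

```python
def holds_fd(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in this table instance."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key;
        # a different value later means the dependency is violated.
        if seen.setdefault(key, value) != value:
            return False
    return True


# The <Employee> table from the example above.
employee = [
    {"EmpID": "E01", "EmpName": "Amit", "EmpAge": 28},
    {"EmpID": "E02", "EmpName": "Rohit", "EmpAge": 31},
]

print(holds_fd(employee, ["EmpID"], ["EmpName"]))  # True
```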
Fully Functional Dependency
An attribute is fully functionally dependent on a set of attributes if it is functionally
dependent on that set and not on any proper subset of it.
For example, an attribute Q is fully functionally dependent on P if it is functionally
dependent on P and not on any proper subset of P.
<ProjectCost>
ProjectID ProjectCost
001 1000
002 5000
<EmployeeProject>
EmpID ProjectID Days (spent on the project)
E099 001 320
E056 002 190
The above relations state:
{EmpID, ProjectID, ProjectCost} -> Days
However, Days is not fully functionally dependent on this set, because the proper subset
{EmpID, ProjectID} alone determines the Days spent on the project by the employee.
This gives our fully functional dependency −
{EmpID, ProjectID} -> Days
Transitive Dependency
When an indirect relationship causes a functional dependency, it is called a Transitive Dependency.
If P -> Q and Q -> R hold, then P -> R is a transitive dependency.
Multivalued Dependency
When the existence of one or more rows in a table implies the existence of one or more other
rows in the same table, a Multivalued Dependency occurs.
If a table has attributes P, Q and R, then Q and R are multi-valued facts of P.
It is represented by a double arrow −
->->
For our example:
P ->-> Q
P ->-> R
In the above case, the Multivalued Dependency exists only if Q and R are independent attributes.
Partial Dependency
Partial Dependency occurs when a nonprime attribute is functionally dependent on part of a
candidate key.
The 2nd Normal Form (2NF) eliminates the Partial Dependency. Let us see an example −
<StudentProject>
StudentID ProjectNo StudentName ProjectName
S01 199 Katie Geo Location
S02 120 Ollie Cluster Exploration

In the above table, we have partial dependency; let us see how −
The prime attributes (those forming the candidate key) are StudentID and ProjectNo.
As stated, a non-prime attribute is partially dependent when it is functionally dependent on
only part of a candidate key. Here the non-prime attributes are StudentName and ProjectName.
StudentName can be determined by StudentID alone, which makes the relation partially dependent.
ProjectName can be determined by ProjectNo alone, which also makes the relation partially dependent.
Normalization
Normalization is the process of organizing the data in a database to avoid data redundancy and
the insertion, update and deletion anomalies that redundancy causes.
Here are the most commonly used normal forms:
• First normal form (1NF)
• Second normal form (2NF)
• Third normal form (3NF)
First normal form (1NF)
As per the rule of first normal form, an attribute (column) of a table cannot hold multiple values.
It should hold only atomic values.
Example: Suppose a company wants to store the names and contact details of its employees. It
creates a table that looks like this:
emp_id emp_name emp_address emp_mobile
101 Herschel New Delhi 8912312390
102 Jon Kanpur 8812121212, 9900012222
103 Ron Chennai 7778881212
104 Lester Bangalore 9990000123, 8123450987
Two employees (Jon & Lester) have two mobile numbers each, so the company stored both numbers
in the same field, as you can see in the table above.
This table is not in 1NF: the rule says “each attribute of a table must have atomic (single)
values”, and the emp_mobile values for employees Jon & Lester violate that rule. To make the
table comply with 1NF we should store the data like this:
emp_id emp_name emp_address emp_mobile
101 Herschel New Delhi 8912312390
102 Jon Kanpur 8812121212
102 Jon Kanpur 9900012222
103 Ron Chennai 7778881212
104 Lester Bangalore 9990000123
104 Lester Bangalore 8123450987
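The conversion to 1NF can be sketched as follows. This is only an illustration, assuming the non-atomic emp_mobile field is modelled as a Python list; the function name to_1nf is hypothetical.

```python
# Unnormalized rows: the last field holds a list of mobile numbers.
raw = [
    (101, "Herschel", "New Delhi", ["8912312390"]),
    (102, "Jon", "Kanpur", ["8812121212", "9900012222"]),
    (103, "Ron", "Chennai", ["7778881212"]),
    (104, "Lester", "Bangalore", ["9990000123", "8123450987"]),
]


def to_1nf(rows):
    """Emit one row per (employee, mobile) pair so every value is atomic."""
    return [
        (emp_id, name, address, mobile)
        for emp_id, name, address, mobiles in rows
        for mobile in mobiles
    ]


flat = to_1nf(raw)
for row in flat:
    print(row)
```

The result has six rows, matching the 1NF table above: Jon and Lester each appear twice, once per mobile number.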
Second normal form (2NF)
A table is said to be in 2NF if both of the following conditions hold:
• The table is in 1NF (First normal form).
• No non-prime attribute is dependent on a proper subset of any candidate key of the table.
An attribute that is not part of any candidate key is known as a non-prime attribute.
Example: Suppose a school wants to store the data of teachers and the subjects they teach. Since
a teacher can teach more than one subject, the table can have multiple rows for the same teacher.
They create a table that looks like this:
teacher_id subject teacher_age
111 Maths 38
111 Physics 38
222 Biology 38
333 Physics 40
333 Chemistry 40
Candidate Keys: {teacher_id, subject}
Non-prime attribute: teacher_age
The table is in 1NF because each attribute has atomic values. However, it is not in 2NF because
the non-prime attribute teacher_age is dependent on teacher_id alone, which is a proper subset of
the candidate key. This violates the rule for 2NF: “no non-prime attribute is dependent on a
proper subset of any candidate key of the table”.
To make the table comply with 2NF we can break it into two tables like this:
teacher_details table:
teacher_id teacher_age
111 38
222 38
333 40
teacher_subject table:
teacher_id Subject
111 Maths
111 Physics
222 Biology
333 Physics
333 Chemistry
Now the tables comply with Second normal form (2NF).
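The split can be reproduced with two projections. The following is a minimal sketch, assuming rows are plain tuples in the column order (teacher_id, subject, teacher_age); the helper project is hypothetical.

```python
teacher = [
    (111, "Maths", 38),
    (111, "Physics", 38),
    (222, "Biology", 38),
    (333, "Physics", 40),
    (333, "Chemistry", 40),
]


def project(rows, positions):
    """Project onto the given column positions, dropping duplicate rows."""
    result = []
    for row in rows:
        projected = tuple(row[i] for i in positions)
        if projected not in result:
            result.append(projected)
    return result


teacher_details = project(teacher, (0, 2))  # teacher_id, teacher_age
teacher_subject = project(teacher, (0, 1))  # teacher_id, subject

# Joining the two tables on teacher_id gives back the original rows.
rejoined = {
    (t_id, subject, age)
    for t_id, subject in teacher_subject
    for d_id, age in teacher_details
    if t_id == d_id
}
print(teacher_details)
print(rejoined == set(teacher))
```

Each teacher's age is now stored once in teacher_details instead of once per subject.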
Example
Consider the following table.
STUD_NO COURSE_NO COURSE_FEE
1 C1 1000
2 C2 1500
1 C4 2000
4 C3 1000
4 C1 1000
2 C5 2000
(Note that many courses have the same course fee.)
Here,
COURSE_FEE alone cannot decide the value of COURSE_NO or STUD_NO;
COURSE_FEE together with STUD_NO cannot decide the value of COURSE_NO;
COURSE_FEE together with COURSE_NO cannot decide the value of STUD_NO.
Hence,
COURSE_FEE is a non-prime attribute, as it does not belong to the only candidate
key {STUD_NO, COURSE_NO}.
But COURSE_NO -> COURSE_FEE, i.e., COURSE_FEE is dependent on COURSE_NO,
which is a proper subset of the candidate key. A non-prime attribute that is dependent on a
proper subset of the candidate key is a partial dependency, so this relation is not in 2NF.
To convert the above relation to 2NF, we need to split the table into two tables:
Table 1: STUD_NO, COURSE_NO
Table 2: COURSE_NO, COURSE_FEE
Table 1
STUD_NO COURSE_NO
1 C1
2 C2
1 C4
4 C3
4 C1
2 C5
Table 2
COURSE_NO COURSE_FEE
C1 1000
C2 1500
C3 1000
C4 2000
C5 2000
Note – 2NF reduces the amount of redundant data that is stored. For instance, if there are 100
students taking course C1, we don't need to store its fee of 1000 in all 100 records; instead we
store it once in the second table as the course fee for C1.
Third Normal form (3NF)

A table design is said to be in 3NF if both of the following conditions hold:
• The table must be in 2NF.
• There is no transitive functional dependency of a non-prime attribute on any super key.
An attribute that is not part of any candidate key is known as a non-prime attribute.
In other words, 3NF can be explained like this: a table is in 3NF if it is in 2NF and, for each
non-trivial functional dependency X -> Y, at least one of the following conditions holds:
• X is a super key of the table
• Y is a prime attribute of the table
An attribute that is part of one of the candidate keys is known as a prime attribute.
Example: Suppose a company wants to store the complete address of each employee. They create
a table named employee_details that looks like this:
emp_id emp_name emp_zip emp_state emp_city emp_district
1001 John 282005 UP Agra Dayal Bagh
1002 Ajeet 222008 TN Chennai M-City
1006 Lora 282007 TN Chennai Urrapakkam
1101 Lilly 292008 UK Pauri Bhagwan
1201 Steve 222999 MP Gwalior Ratan
Super keys: {emp_id}, {emp_id, emp_name}, {emp_id, emp_name, emp_zip}, and so on
Candidate Keys: {emp_id}
Non-prime attributes: all attributes except emp_id are non-prime, as they are not part of any
candidate key.
Here, emp_state, emp_city & emp_district are dependent on emp_zip, and emp_zip is dependent on
emp_id. This makes the non-prime attributes (emp_state, emp_city & emp_district) transitively
dependent on the super key (emp_id), which violates the rule of 3NF.
To make this table comply with 3NF we have to break it into two tables to remove the
transitive dependency:
employee table:
emp_id emp_name emp_zip
1001 John 282005
1002 Ajeet 222008
1006 Lora 282007
1101 Lilly 292008
1201 Steve 222999
employee_zip table:
emp_zip emp_state emp_city emp_district
282005 UP Agra Dayal Bagh
222008 TN Chennai M-City
282007 TN Chennai Urrapakkam
292008 UK Pauri Bhagwan
222999 MP Gwalior Ratan
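The 3NF condition (X is a super key, or Y is a prime attribute) can be checked mechanically from the functional dependencies. The sketch below assumes the two dependencies stated in the example and uses hypothetical helper names closure and third_nf_violations; it is an illustration of the rule, not a complete normalization tool.

```python
def closure(attrs, fds):
    """Attribute closure: every attribute derivable from attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def third_nf_violations(all_attrs, fds, prime):
    """List the (X, A) pairs for which X -> A breaks the 3NF condition."""
    bad = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) == set(all_attrs)
        for attr in rhs:
            if attr in lhs:
                continue  # trivial dependency, always allowed
            if not is_superkey and attr not in prime:
                bad.append((lhs, attr))
    return bad


attrs = ["emp_id", "emp_name", "emp_zip", "emp_state", "emp_city", "emp_district"]
fds = [
    (("emp_id",), ("emp_name", "emp_zip")),
    (("emp_zip",), ("emp_state", "emp_city", "emp_district")),
]

violations = third_nf_violations(attrs, fds, prime={"emp_id"})
print(violations)  # the three dependencies on emp_zip

# After the split, each table is checked against its own dependencies.
employee_ok = third_nf_violations(
    ["emp_id", "emp_name", "emp_zip"], [fds[0]], prime={"emp_id"})
zip_ok = third_nf_violations(
    ["emp_zip", "emp_state", "emp_city", "emp_district"], [fds[1]], prime={"emp_zip"})
print(employee_ok, zip_ok)  # [] []
```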
Example-2
Student Table
student_id name reg_no branch address
10 Akon 07-WY CSE Kerala
11 Akon 08-WY IT Gujarat
12 Bkon 09-WY IT Rajasthan
Subject Table
subject_id subject_name teacher
1 Java Java Teacher
2 C++ C++ Teacher
3 Php Php Teacher
Score Table
score_id student_id subject_id marks
1 10 1 70
2 10 2 75
3 11 1 80
In the Score table, we need to store some more information, which is the exam name and total
marks, so let's add 2 more columns to the Score table.
score_id student_id subject_id marks exam_name total_marks
Requirements for Third Normal Form
For a table to be in the third normal form,
1. It should be in the Second Normal Form.
2. It should not have any Transitive Dependency.
What is Transitive Dependency?
With exam_name and total_marks added, our Score table now stores more data. The primary key
of the Score table is a composite key, which means it is made up of two attributes or columns
→ student_id + subject_id.
Our new column exam_name depends on both student and subject. For example, a mechanical
engineering student will have a Workshop exam but a computer science student won't, and some
subjects have Practical exams while others don't. So we can say that exam_name is
dependent on both student_id and subject_id.
And what about our second new column, total_marks? Does it depend on the Score table's
primary key?
Well, the column total_marks depends on exam_name, because the total score changes with the
exam type. For example, practicals carry fewer marks while theory exams carry more.
But exam_name is just another column in the Score table. It is not the primary key or even a
part of the primary key, and total_marks depends on it.
This is a Transitive Dependency: a non-prime attribute depends on another non-prime
attribute rather than on the prime attributes or the primary key.
How to remove Transitive Dependency?
Again the solution is very simple. Take out the columns exam_name and total_marks from Score
table and put them in an Exam table and use the exam_id wherever required.
Score Table: In 3rd Normal Form
score_id student_id subject_id marks exam_id
The new Exam table
exam_id exam_name total_marks
1 Workshop 200
2 Mains 70
3 Practicals 30
Advantage of removing Transitive Dependency
The advantages of removing transitive dependency are:
• The amount of data duplication is reduced.
• Data integrity is achieved.
Multivalued Dependency
A multivalued dependency occurs when the presence of one or more rows in a table implies the
presence of one or more other rows in that same table; in effect, two or more independent
relations are kept in a single relation. Put another way, two attributes (or columns) in a table
are independent of one another, but both depend on a third attribute. A multivalued dependency
always requires at least three attributes because it consists of at least two attributes that are
dependent on a third.
If, for a single value of A, multiple values of B exist, then the table may have a multivalued
dependency. For A ->> B to be a multivalued dependency, the table should have at least 3
attributes (A, B and C), and B and C should be independent of each other.
Example: Suppose there is a bike manufacturer company which produces two colors(white and
black) of each model every year.
BIKE_MODEL MANUF_YEAR COLOR
M2011 2008 White
M2001 2008 Black
M3001 2013 White
M3001 2013 Black
M4006 2017 White
M4006 2017 Black
Here the columns COLOR and MANUF_YEAR are dependent on BIKE_MODEL and independent
of each other.
In this case, these two columns are said to be multivalued dependent on BIKE_MODEL. These
dependencies are represented as shown below:
1. BIKE_MODEL →→ MANUF_YEAR
2. BIKE_MODEL →→ COLOR
This can be read as "BIKE_MODEL multidetermines MANUF_YEAR" and "BIKE_MODEL
multidetermines COLOR".
Example
Let us see an example −
<Student>
StudentName CourseDiscipline Activities
Amit Mathematics Singing
Amit Mathematics Dancing
Yuvraj Computers Cricket
Akash Literature Dancing
Akash Literature Cricket
Akash Literature Singing
In the above table, we can see that the students Amit and Akash have an interest in more than
one activity.
This is a multivalued dependency because the CourseDiscipline of a student is independent of
the student's Activities, but both are dependent on the student.
Therefore, the multivalued dependencies are −
StudentName ->-> CourseDiscipline
StudentName ->-> Activities
The above relation violates Fourth Normal Form.
To correct it, divide the table into two separate tables and break the Multivalued Dependency −
<StudentCourse>
StudentName CourseDiscipline
Amit Mathematics
Yuvraj Computers
Akash Literature
<StudentActivities>
StudentName Activities
Amit Singing
Amit Dancing
Yuvraj Cricket
Akash Dancing
Akash Cricket
Akash Singing
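This breaks the multivalued dependency. Each table now records a single independent fact about
the student: in <StudentCourse>, StudentName -> CourseDiscipline holds as an ordinary
functional dependency, while in <StudentActivities> the dependency StudentName ->-> Activities
remains but is trivial, because the table contains no other attributes.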
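Whether a multivalued dependency X ->-> Y holds in a given table instance can be tested directly: for every value of X, the (Y, Z) combinations present must be exactly all pairings of that X's Y values with its Z values, where Z is the set of remaining attributes. The sketch below assumes rows stored as dictionaries; the function name holds_mvd is hypothetical.

```python
from itertools import product


def holds_mvd(rows, x, y):
    """Check X ->-> Y on a table instance given as a list of dicts."""
    z = [a for a in rows[0].keys() if a not in x and a not in y]
    groups = {}
    for row in rows:
        key = tuple(row[a] for a in x)
        y_val = tuple(row[a] for a in y)
        z_val = tuple(row[a] for a in z)
        ys, zs, pairs = groups.setdefault(key, (set(), set(), set()))
        ys.add(y_val)
        zs.add(z_val)
        pairs.add((y_val, z_val))
    # Every Y value must appear with every Z value for the same X.
    return all(set(product(ys, zs)) == pairs for ys, zs, pairs in groups.values())


names = ("StudentName", "CourseDiscipline", "Activities")
student = [dict(zip(names, r)) for r in [
    ("Amit", "Mathematics", "Singing"),
    ("Amit", "Mathematics", "Dancing"),
    ("Yuvraj", "Computers", "Cricket"),
    ("Akash", "Literature", "Dancing"),
    ("Akash", "Literature", "Cricket"),
    ("Akash", "Literature", "Singing"),
]]

print(holds_mvd(student, ["StudentName"], ["Activities"]))  # True
```

If a row such as (Amit, Physics, Singing) were added without the matching (Amit, Physics, Dancing) row, the check would fail, which is exactly the "one row implies another row" property of a multivalued dependency.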
Fourth normal form (4NF):
Fourth normal form (4NF) is a level of database normalization where there are no non-trivial
multivalued dependencies other than those whose determinant is a candidate key. It builds on the
first three normal forms (1NF, 2NF and 3NF) and the Boyce-Codd Normal Form (BCNF): in addition
to meeting the requirements of BCNF, a relation must not contain any non-trivial multivalued
dependency whose left-hand side is not a key.
Properties – A relation R is in 4NF if and only if the following conditions are satisfied:
1. It should be in the Boyce-Codd Normal Form (BCNF).
2. The table should not have any non-trivial Multi-valued Dependency (other than on a candidate key).
A table with such a multivalued dependency violates Fourth Normal Form (4NF) because it creates
unnecessary redundancies and can contribute to inconsistent data. To bring it up to 4NF, it is
necessary to break the information into two tables.
If, for a single value of A, multiple values of B exist independently of the other attributes,
then the relation has a multivalued dependency A →→ B.
Example
STUDENT
STU_ID COURSE HOBBY
21 Computer Dancing
21 Math Singing
34 Chemistry Dancing
74 Biology Cricket
59 Physics Hockey
The given STUDENT table is in 3NF, but COURSE and HOBBY are two independent
entities: there is no relationship between COURSE and HOBBY.
In the STUDENT relation, the student with STU_ID 21 has two courses, Computer and Math,
and two hobbies, Dancing and Singing. So there is a multivalued dependency on STU_ID, which
leads to unnecessary repetition of data.
So to bring the above table into 4NF, we can decompose it into two tables:
STUDENT_COURSE
STU_ID COURSE
21 Computer
21 Math
34 Chemistry
74 Biology
59 Physics
STUDENT_HOBBY
STU_ID HOBBY
21 Dancing
21 Singing
34 Dancing
74 Cricket
59 Hockey
Example
<Movie>
Movie_Name Shooting_Location Listing
MovieOne UK Comedy
MovieOne UK Thriller
MovieTwo Australia Action
MovieTwo Australia Crime
MovieThree India Drama
The above is not in 4NF, since
• The same listing can apply to more than one movie, and a movie can have several listings
• The same movie can have many shooting locations, independent of its listings
Let us convert the above table to 4NF −
<Movie_Shooting>
Movie_Name Shooting_Location
MovieOne UK
MovieTwo Australia
MovieThree India
<Movie_Listing>
Movie_Name Listing
MovieOne Comedy
MovieOne Thriller
MovieTwo Action
MovieTwo Crime
MovieThree Drama
Now the violation is removed and the tables are in 4NF.
Join dependency – A join dependency is a further generalization of multivalued dependencies.
Let R1(A, B, C) and R2(C, D) be a decomposition of a given relation R(A, B, C, D). If the join
of R1 and R2 over C is equal to relation R, then we can say that a join dependency (JD) exists.
Alternatively, R1 and R2 are a lossless decomposition of R. A JD ⋈ {R1, R2, …, Rn} is said to
hold over a relation R if R1, R2, …, Rn is a lossless-join decomposition. *((A, B, C), (C, D))
will be a JD of R if the join over the common attribute is equal to the relation R. Here,
*(R1, R2, R3) is used to indicate that the relations R1, R2, R3 and so on are a JD of R.
Let R be a relation schema and R1, R2, R3, …, Rn be a decomposition of R. An instance r(R) is
said to satisfy the join dependency if and only if
π_R1(r) ⋈ π_R2(r) ⋈ … ⋈ π_Rn(r) = r
If a table can be recreated by joining multiple tables, each of which has a subset of the
attributes of the table, then the table has a Join Dependency. It is a generalization of
Multivalued Dependency.
Join Dependency is related to 5NF: a relation is in 5NF only if it is already in 4NF and
cannot be losslessly decomposed any further.
Example
<Employee>
EmpName EmpSkills EmpJob (Assigned Work)
Tom Networking EJ001
Harry Web Development EJ002
Katie Programming EJ002
The above table can be decomposed into the following three tables; therefore it is not in 5NF:
<EmployeeSkills>
EmpName EmpSkills
Tom Networking
Harry Web Development
Katie Programming
<EmployeeJob>
EmpName EmpJob
Tom EJ001
Harry EJ002
Katie EJ002
<JobSkills>
EmpSkills EmpJob
Networking EJ001
Web Development EJ002
Programming EJ002
Our Join Dependency −
{(EmpName, EmpSkills), (EmpName, EmpJob), (EmpSkills, EmpJob)}
The <Employee> relation has this join dependency, so it is not in 5NF. That means the natural
join of the above three relations is equal to our original relation <Employee>.
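The claim that the three projections join back to <Employee> can be verified with a small script. This is a sketch under the assumption that each tuple is represented as a frozenset of (attribute, value) pairs; project and natural_join are hypothetical helper names.

```python
def project(rows, attrs):
    """Projection as a set of frozensets of (attribute, value) pairs."""
    return {frozenset((a, row[a]) for a in attrs) for row in rows}


def natural_join(left, right):
    """Natural join: combine tuples that agree on all shared attributes."""
    joined = set()
    for l_tuple in left:
        for r_tuple in right:
            l_dict, r_dict = dict(l_tuple), dict(r_tuple)
            shared = l_dict.keys() & r_dict.keys()
            if all(l_dict[a] == r_dict[a] for a in shared):
                joined.add(frozenset({**l_dict, **r_dict}.items()))
    return joined


names = ("EmpName", "EmpSkills", "EmpJob")
employee = [dict(zip(names, r)) for r in [
    ("Tom", "Networking", "EJ001"),
    ("Harry", "Web Development", "EJ002"),
    ("Katie", "Programming", "EJ002"),
]]

employee_skills = project(employee, ["EmpName", "EmpSkills"])
employee_job = project(employee, ["EmpName", "EmpJob"])
job_skills = project(employee, ["EmpSkills", "EmpJob"])

joined = natural_join(natural_join(employee_skills, employee_job), job_skills)
print(joined == project(employee, names))  # True
```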
Fifth Normal Form / Project-Join Normal Form (5NF):
A relation R is in 5NF if and only if every join dependency in R is implied by the candidate keys
of R. A relation decomposed into smaller relations must have the lossless-join property, which
ensures that no spurious or extra tuples are generated when the relations are reunited through a
natural join.
Properties – A relation R is in 5NF if and only if it satisfies the following conditions:
1. R should already be in 4NF.
2. It cannot be further decomposed without loss (no join dependency other than those implied
by its candidate keys).
Boyce-Codd Normal Form (BCNF)
Application of the general definitions of 2NF and 3NF may identify additional redundancy
caused by dependencies that violate one or more candidate keys. However, despite these
additional constraints, dependencies can still exist that cause redundancy in 3NF relations.
This weakness in 3NF resulted in the presentation of a stronger normal form
called Boyce–Codd Normal Form (Codd, 1974).
Although 3NF is an adequate normal form for relational databases, it may not remove all
redundancy when a functional dependency X → Y exists in which X is not a candidate key of the
relation. This can be solved by Boyce-Codd Normal Form (BCNF).
Boyce–Codd Normal Form (BCNF) is based on functional dependencies that take into
account all candidate keys in a relation; however, BCNF also has additional constraints
compared with the general definition of 3NF.
A relation is in BCNF iff X is a superkey for every non-trivial functional dependency (FD)
X → Y in the relation.
In other words,
A relation is in BCNF if and only if every determinant is a candidate key.
The normal forms form a hierarchy: every relation in BCNF is also in 3NF, every relation in 3NF
is also in 2NF, and every relation in 2NF is also in 1NF. You came across a similar idea, the
Chomsky hierarchy, in Theory of Computation. The converse does not hold: a relation in 3NF need
not be in BCNF. Ponder over this statement for a while.
To determine the highest normal form of a given relation R with functional dependencies, the
first step is to check whether the BCNF condition holds. If R is found to be in BCNF, it can be
safely deduced that the relation is also in 3NF, 2NF and 1NF. 1NF has the least restrictive
constraint – it only requires a relation R to have atomic values in each tuple. 2NF has a
slightly more restrictive constraint.
3NF has a more restrictive constraint than the first two normal forms but is less restrictive
than BCNF. In this manner, the restriction increases as we move from 1NF towards BCNF.
Example-1:
Find the highest normal form of a relation R(A, B, C, D, E) with FD set as:
{ BC->D, AC->BE, B->E }
Explanation:
• Step-1: As we can see, (AC)+ = {A, C, B, E, D}, but none of its proper subsets can determine
all attributes of the relation, so AC is a candidate key. A and C cannot be derived from any
other attribute of the relation, so there is only one candidate key, {AC}.
• Step-2: Prime attributes are those attributes which are part of a candidate key, {A, C} in
this example; the others are non-prime, {B, D, E} in this example.
• Step-3: The relation R is in 1st normal form, as a relational DBMS does not allow
multi-valued or composite attributes.
The relation is in 2nd normal form because BC->D is in 2nd normal form (BC is not a proper
subset of candidate key AC), AC->BE is in 2nd normal form (AC is the candidate key) and B->E
is in 2nd normal form (B is not a proper subset of candidate key AC).
The relation is not in 3rd normal form because of BC->D (neither is BC a super key nor is D a
prime attribute) and B->E (neither is B a super key nor is E a prime attribute); to satisfy 3rd
normal form, either the LHS of an FD should be a super key or the RHS should be a prime
attribute. So the highest normal form of the relation is 2nd normal form.
Note – A prime attribute cannot be transitively dependent on a key in a BCNF relation.
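The three steps of Example-1 can be automated. The following is a rough sketch, assuming attributes are single characters and each FD is a (lhs, rhs) pair of strings or tuples; the function names closure, candidate_keys and highest_normal_form are hypothetical, and 1NF is taken for granted as in Step-3.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under the functional dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(all_attrs, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so not minimal
            if closure(combo, fds) == set(all_attrs):
                keys.append(combo)
    return keys


def highest_normal_form(all_attrs, fds):
    everything = set(all_attrs)
    keys = candidate_keys(all_attrs, fds)
    prime = set().union(*keys)
    # Split each FD into non-trivial (lhs, single attribute) pairs.
    pairs = [(l, a) for l, r in fds for a in r if a not in l]
    if all(closure(l, fds) == everything for l, _ in pairs):
        return "BCNF"
    if all(closure(l, fds) == everything or a in prime for l, a in pairs):
        return "3NF"
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(key, size):
                # A non-prime attribute determined by part of a key
                # is a partial dependency.
                if (closure(part, fds) - set(part)) - prime:
                    return "1NF"
    return "2NF"


fds = [("BC", "D"), ("AC", "BE"), ("B", "E")]
print(candidate_keys("ABCDE", fds))       # [('A', 'C')]
print(highest_normal_form("ABCDE", fds))  # 2NF
```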
Below we have a college enrolment table with columns student_id, subject and professor.
student_id subject professor
101 Java P.Java

101 C++ P.Cpp
102 Java P.Java2
103 C# P.Chash
104 Java P.Java
As you can see, we have also added some sample data to the table.
In the table above:
• One student can enrol for multiple subjects. For example, the student with student_id 101
has opted for the subjects Java & C++.
• For each subject, a professor is assigned to the student.
• There can be multiple professors teaching one subject, as we have for Java.
What do you think should be the Primary Key?
Well, in the table above student_id and subject together form the primary key, because
using student_id and subject we can find all the columns of the table.
One more important point to note here is that one professor teaches only one subject, but one
subject may have two different professors.
Hence, there is a dependency between subject and professor here, where subject depends on the
professor name.
This table satisfies the 1st Normal Form because all the values are atomic, column names are
unique and all the values stored in a particular column are of the same domain.
This table also satisfies the 2nd Normal Form, as there is no Partial Dependency.
And there is no Transitive Dependency, hence the table also satisfies the 3rd Normal Form.
But this table is not in Boyce-Codd Normal Form.
Why is this table not in BCNF?
In the table above, student_id and subject form the primary key, which means the subject column
is a prime attribute.
But there is one more dependency, professor → subject.
And while subject is a prime attribute, professor is a non-prime attribute and not a super key.
The determinant of professor → subject is therefore not a key, which is not allowed by BCNF.
How to satisfy BCNF?
To make this relation (table) satisfy BCNF, we will decompose this table into two
tables, a student table and a professor table.
Below we have the structure for both the tables.
Student Table
student_id p_id
101 1
101 2
and so on...
And, Professor Table
p_id professor subject

1 P.Java Java
2 P.Cpp C++
and so on...
And now, these relations satisfy Boyce-Codd Normal Form.
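The decomposition into a student table and a professor table can be sketched in a few lines. The surrogate key p_id is numbered here in order of first appearance, which is an assumption made for illustration; the variable names are hypothetical.

```python
enrolment = [
    (101, "Java", "P.Java"),
    (101, "C++", "P.Cpp"),
    (102, "Java", "P.Java2"),
    (103, "C#", "P.Chash"),
    (104, "Java", "P.Java"),
]

# professor -> subject: each professor must map to exactly one subject.
subject_of = {}
for _, subject, professor in enrolment:
    if subject_of.setdefault(professor, subject) != subject:
        raise ValueError(f"{professor} teaches more than one subject")

# Assign a surrogate p_id to every professor (assumed numbering).
p_ids = {professor: i for i, professor in enumerate(subject_of, start=1)}

professor_table = [(p_ids[p], p, s) for p, s in subject_of.items()]
student_table = [(student_id, p_ids[p]) for student_id, _, p in enrolment]

# Joining on p_id recovers the original enrolment rows.
by_id = {p_id: (p, s) for p_id, p, s in professor_table}
rejoined = [(sid, by_id[p_id][1], by_id[p_id][0]) for sid, p_id in student_table]

print(professor_table)
print(student_table)
print(rejoined == enrolment)  # True
```

In the professor table the only determinant is the key, so the dependency professor → subject no longer violates BCNF.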
A more Generic Explanation
In the picture below, we have tried to explain BCNF in terms of relations.
Example: Let's assume there is a company where employees work in more than one department.
EMPLOYEE table:
EMP_ID EMP_COUNTRY EMP_DEPT DEPT_TYPE EMP_DEPT_NO
264 India Designing D394 283
264 India Testing D394 300
364 UK Stores D283 232
364 UK Developing D283 549
In the above table, the functional dependencies are as follows:
1. EMP_ID → EMP_COUNTRY
2. EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
Candidate key: {EMP_ID, EMP_DEPT}
The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone is a key.
To convert the given table into BCNF, we decompose it into three tables:
EMP_COUNTRY table:
EMP_ID EMP_COUNTRY
264 India
364 UK
EMP_DEPT table:
EMP_DEPT DEPT_TYPE EMP_DEPT_NO
Designing D394 283
Testing D394 300
Stores D283 232
Developing D283 549
EMP_DEPT_MAPPING table:
EMP_ID EMP_DEPT
264 Designing
264 Testing
364 Stores
364 Developing
Functional dependencies:
1. EMP_ID → EMP_COUNTRY
2. EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
Candidate keys:
For the first table: EMP_ID
For the second table: EMP_DEPT
For the third table: {EMP_ID, EMP_DEPT}
Now, this is in BCNF because the left-hand side of both functional dependencies is a key.
Inclusion Dependency
o Multivalued dependencies and join dependencies can be used to guide database design,
although both are less common than functional dependencies.
o Inclusion dependencies are quite common, but they typically have little influence on
the design of the database.
o An inclusion dependency is a statement that some columns of a relation are
contained in other columns.
o A foreign key is an example of an inclusion dependency: the referring column(s) of one
relation are contained in the primary key column(s) of the referenced relation.
o Suppose we have two relations R and S which were obtained by translating two entity sets
such that every R entity is also an S entity.
o An inclusion dependency then holds if projecting R on its key attributes yields a
relation that is contained in the relation obtained by projecting S on its key attributes.
o In a design, we should not split groups of attributes that participate in an
inclusion dependency.
o In practice, most inclusion dependencies are key-based, that is, they involve only keys.
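A foreign-key style inclusion dependency can be tested by comparing projections. The sketch below reuses the Student and Score tables from Example-2 earlier in these notes (only the columns needed are shown) and assumes Score.student_id refers to Student.student_id; the function name inclusion_holds is hypothetical.

```python
def inclusion_holds(r_rows, r_cols, s_rows, s_cols):
    """True if the projection of R on r_cols is contained in S on s_cols."""
    s_values = {tuple(row[c] for c in s_cols) for row in s_rows}
    return all(tuple(row[c] for c in r_cols) in s_values for row in r_rows)


student = [
    {"student_id": 10, "name": "Akon"},
    {"student_id": 11, "name": "Akon"},
    {"student_id": 12, "name": "Bkon"},
]
score = [
    {"score_id": 1, "student_id": 10, "subject_id": 1, "marks": 70},
    {"score_id": 2, "student_id": 10, "subject_id": 2, "marks": 75},
    {"score_id": 3, "student_id": 11, "subject_id": 1, "marks": 80},
]

# Every student_id used in Score must exist in Student.
print(inclusion_holds(score, ["student_id"], student, ["student_id"]))  # True
```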
Lossless Decomposition
Decomposition is lossless if it is feasible to reconstruct relation R from the decomposed
tables using joins. This is the preferred choice: no information is lost from the relation
when it is decomposed, and the join results in the same original relation.
Let us see an example −
<EmpInfo>
Emp_ID Emp_Name Emp_Age Emp_Location Dept_ID Dept_Name
E001 Jacob 29 Alabama Dpt1 Operations
E002 Henry 32 Alabama Dpt2 HR
E003 Tom 22 Texas Dpt3 Finance
Decompose the above table into two tables:
<EmpDetails>
Emp_ID Emp_Name Emp_Age Emp_Location
E001 Jacob 29 Alabama
E002 Henry 32 Alabama
E003 Tom 22 Texas
<DeptDetails>
Dept_ID Emp_ID Dept_Name
Dpt1 E001 Operations
Dpt2 E002 HR
Dpt3 E003 Finance
Now, Natural Join is applied on the above two tables. The result will be –
Emp_ID Emp_Name Emp_Age Emp_Location Dept_ID Dept_Name
E001 Jacob 29 Alabama Dpt1 Operations
E002 Henry 32 Alabama Dpt2 HR
E003 Tom 22 Texas Dpt3 Finance
Therefore, the above relation had a lossless decomposition, i.e. no loss of information.
Lossy Decomposition
As the name suggests, when a relation is decomposed into two or more relational
schemas, loss of information is unavoidable when the original relation is retrieved.
Let us see an example −
<EmpInfo>
Emp_ID Emp_Name Emp_Age Emp_Location Dept_ID Dept_Name
E001 Jacob 29 Alabama Dpt1 Operations
E002 Henry 32 Alabama Dpt2 HR
E003 Tom 22 Texas Dpt3 Finance
Decompose the above table into two tables –
<EmpDetails>
Emp_ID Emp_Name Emp_Age Emp_Location
E001 Jacob 29 Alabama
E002 Henry 32 Alabama
E003 Tom 22 Texas
<DeptDetails>
Dept_ID Dept_Name
Dpt1 Operations
Dpt2 HR
Dpt3 Finance
Now, you won’t be able to join the above tables, since Emp_ID isn’t part of
the DeptDetails relation.
Therefore, the above relation has a lossy decomposition.
Lossless Join Decomposition
• Consider a relation R which is decomposed into sub relations R1, R2, …, Rn.
• This decomposition is called a lossless join decomposition when the join of the sub relations
results in the same relation R that was decomposed.
• For a lossless join decomposition, we always have R1 ⋈ R2 ⋈ R3 ⋈ … ⋈ Rn = R, where ⋈
is the natural join operator.
Example: Consider the following relation R(A, B, C)-
A B C
1 2 1
2 5 3
3 3 3
R(A, B, C)
Consider this relation is decomposed into two sub relations R1(A, B) and R2(B, C).
The two sub relations are-
A B
1 2
2 5
3 3
R1(A, B)
B C
2 1
5 3
3 3
R2(B, C)
Now, let us check whether this decomposition is lossless or not.
For lossless decomposition, we must have-
R1 ⋈ R2 = R
Now, if we perform the natural join ( ⋈ ) of the sub relations R1 and R2 , we get-
A B C
1 2 1
2 5 3
3 3 3
This relation is same as the original relation R.

Thus, we conclude that the above decomposition is lossless join decomposition.
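The same check, R1 ⋈ R2 = R, can be carried out in code. This is a minimal sketch that models each relation as a Python set of tuples; it is not a general join implementation.

```python
# The relation R(A, B, C) from the example.
R = {(1, 2, 1), (2, 5, 3), (3, 3, 3)}

# Projections onto (A, B) and (B, C).
R1 = {(a, b) for a, b, c in R}
R2 = {(b, c) for a, b, c in R}

# Natural join on the shared attribute B.
joined = {(a, b, c) for a, b in R1 for b2, c in R2 if b == b2}

print(sorted(joined))
print(joined == R)  # True, so the decomposition is lossless
```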
NOTE
• Lossless join decomposition is also known as non-additive join decomposition.
• This is because the relation obtained by joining the sub relations is the same as the relation
that was decomposed.
• No extraneous tuples appear after joining the sub-relations.
Lossy Join Decomposition
• Consider a relation R which is decomposed into sub relations R1, R2, …, Rn.
• This decomposition is called a lossy join decomposition when the join of the sub relations
does not result in the same relation R that was decomposed.
• The natural join of the sub relations is always found to have some extraneous tuples.
• For a lossy join decomposition, we always have R1 ⋈ R2 ⋈ R3 ⋈ … ⋈ Rn ⊃ R, where ⋈ is
the natural join operator.
Example: Consider a table STUDENT with three attributes roll_no, sname and
department.
Student
Roll_no Sname Dept
111 parimal COMPUTER
222 parimal ELECTRICAL
This relation is decomposed into two relations, no_name and name_dept:
No_name
Roll_no Sname
111 parimal
222 parimal
name_dept
Sname Dept
parimal COMPUTER
parimal ELECTRICAL
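In a lossy decomposition, spurious tuples are generated when a natural join is applied to the
relations in the decomposition. Here the join is on Sname, the only common attribute:
stu_joined
Roll_no Sname Dept
111 parimal COMPUTER
111 parimal ELECTRICAL
222 parimal COMPUTER
222 parimal ELECTRICAL
The tuples (111, parimal, ELECTRICAL) and (222, parimal, COMPUTER) are spurious: they do not
appear in the original Student relation.
The above decomposition is therefore a bad, or lossy, decomposition.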
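The spurious tuples in this example can be listed with a short Python sketch. It assumes relations are stored as lists of dictionaries; the helpers project and natural_join are illustrative names, not library functions.

```python
def project(rel, attrs):
    """Projection: keep only the given attributes and drop duplicate tuples."""
    out = []
    for t in rel:
        row = {a: t[a] for a in attrs}
        if row not in out:
            out.append(row)
    return out


def natural_join(r1, r2):
    """Natural join: combine tuples that agree on all shared attributes."""
    joined = []
    for t1 in r1:
        for t2 in r2:
            if all(t1[a] == t2[a] for a in set(t1) & set(t2)):
                merged = {**t1, **t2}
                if merged not in joined:
                    joined.append(merged)
    return joined


student = [
    {"Roll_no": 111, "Sname": "parimal", "Dept": "COMPUTER"},
    {"Roll_no": 222, "Sname": "parimal", "Dept": "ELECTRICAL"},
]

no_name = project(student, ["Roll_no", "Sname"])
name_dept = project(student, ["Sname", "Dept"])

stu_joined = natural_join(no_name, name_dept)
# Tuples present in the join but absent from the original relation are spurious.
spurious = [t for t in stu_joined if t not in student]

print(len(stu_joined))  # 4
for t in spurious:
    print(t)
```

Because both students share the same Sname, joining on Sname pairs every roll number with every department, which is where the two extra tuples come from.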
Multivalued Dependency
o Multivalued dependency occurs when two attributes in a table are independent of each
other but, both depend on a third attribute.
o A multivalued dependency involves at least two attributes that depend on a third
attribute, which is why it always requires at least three attributes.
Example: Suppose there is a bike manufacturing company which produces two colors (white and
black) of each model every year.
BIKE_MODEL MANUF_YEAR COLOR
M2001 2008 White
M2001 2008 Black
M3001 2013 White
M3001 2013 Black
M4006 2017 White
M4006 2017 Black
Here the columns COLOR and MANUF_YEAR are dependent on BIKE_MODEL and independent
of each other.
In this case, these two columns are said to be multivalued dependent on BIKE_MODEL. The
representation of these dependencies is shown below:
1. BIKE_MODEL →→ MANUF_YEAR
2. BIKE_MODEL →→ COLOR
This can be read as "BIKE_MODEL multidetermines MANUF_YEAR" and "BIKE_MODEL
multidetermines COLOR".
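One way to test a multivalued dependency X →→ Y on a table instance is to check that, for every value of X, each Y value appears with each value of the remaining attributes. The Python sketch below assumes rows are stored as dictionaries; mvd_holds is an illustrative helper, and the second data set (with a 2014 row) is a hypothetical counterexample that does not come from the table above.

```python
from collections import defaultdict
from itertools import product


def mvd_holds(rows, x, y):
    """Check X ->> Y on rows (list of dicts); Z is every attribute not in X or Y."""
    attrs = list(rows[0])
    z = [a for a in attrs if a not in x and a not in y]
    groups = defaultdict(set)
    for t in rows:
        key = tuple(t[a] for a in x)
        groups[key].add((tuple(t[a] for a in y), tuple(t[a] for a in z)))
    for pairs in groups.values():
        y_vals = {p[0] for p in pairs}
        z_vals = {p[1] for p in pairs}
        # Within one X value, every Y value must be paired with every Z value.
        if pairs != set(product(y_vals, z_vals)):
            return False
    return True


# Rows taken from the bike table.
bikes = [
    {"BIKE_MODEL": "M3001", "MANUF_YEAR": 2013, "COLOR": "White"},
    {"BIKE_MODEL": "M3001", "MANUF_YEAR": 2013, "COLOR": "Black"},
    {"BIKE_MODEL": "M4006", "MANUF_YEAR": 2017, "COLOR": "White"},
    {"BIKE_MODEL": "M4006", "MANUF_YEAR": 2017, "COLOR": "Black"},
]

# Hypothetical data: M3001 was also made in 2014, but only in White.
broken = bikes + [{"BIKE_MODEL": "M3001", "MANUF_YEAR": 2014, "COLOR": "White"}]

print(mvd_holds(bikes, ["BIKE_MODEL"], ["COLOR"]))   # True
print(mvd_holds(broken, ["BIKE_MODEL"], ["COLOR"]))  # False
```

In the hypothetical data the row (M3001, 2014, Black) is missing, so COLOR is no longer independent of MANUF_YEAR for that model and the check fails.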
Join Dependency
o Join dependency is a further generalization of multivalued dependency.
o Let R1(A, B, C) and R2(C, D) be a decomposition of a given relation R(A, B, C, D).
o If the join of R1 and R2 over the common attribute C is equal to relation R, then we can say
that a join dependency (JD) exists.
o Equivalently, R1 and R2 form a lossless decomposition of R.
o In general, a JD ⋈ {R1, R2, ..., Rn} is said to hold over a relation R if R1, R2, ..., Rn is a
lossless-join decomposition of R.
o In the * notation, *((A, B, C), (C, D)) is a JD of R if the join of the projections of R on
(A, B, C) and (C, D) is equal to the relation R.
o Likewise, *(R1, R2, R3, ...) indicates that the relations R1, R2, R3 and so on form a JD of R.
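A join dependency can be tested on a given instance by projecting R onto each component, joining the projections, and comparing the result with R. The Python sketch below assumes a relation is stored as a set of frozensets of (attribute, value) pairs; the helper names and both sample instances are hypothetical and chosen only for illustration.

```python
from functools import reduce


def relation(attrs, rows):
    """Build a relation as a set of tuples, each a frozenset of (attribute, value)."""
    return {frozenset(zip(attrs, row)) for row in rows}


def project(rel, attrs):
    """Projection onto attrs; duplicates vanish because the result is a set."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in rel}


def natural_join(r1, r2):
    """Natural join: combine tuples that agree on all shared attributes."""
    out = set()
    for t1 in r1:
        d1 = dict(t1)
        for t2 in r2:
            d2 = dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                out.add(frozenset({**d1, **d2}.items()))
    return out


def jd_holds(rel, components):
    """True if joining the projections of rel onto each component gives rel back."""
    parts = [project(rel, set(c)) for c in components]
    return reduce(natural_join, parts) == rel


# Hypothetical instances of R(A, B, C, D).
r_ok = relation("ABCD", [(1, 2, 3, 4), (5, 6, 3, 4)])
r_bad = relation("ABCD", [(1, 2, 3, 4), (5, 6, 3, 7)])

print(jd_holds(r_ok, ["ABC", "CD"]))   # True
print(jd_holds(r_bad, ["ABC", "CD"]))  # False
```

In r_bad the value C = 3 is paired with two different D values, so joining (A, B, C) with (C, D) produces four tuples instead of the original two, and the JD does not hold on that instance.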
