
Kalinga University

Department of Computer Science

Course-MCA Sem-II
Subject-Database Management System
Subject Code- MCA204
UNIT-III

Dependencies

1. Functional dependency: a value of one property type determines exactly one value of
another property type in the same system. For example: one identity number can have only
one name associated with it, but the same name can be related to many identity numbers.
This is a one-to-many relation.
2. Mutual functional dependency: the above requirement holds in both
‘directions’. For instance: registration number – engine number. This is a one-to-one relation.
3. Functional independence: the previous relation between the two property types does
not exist. An example: the hair colour of an employee and the company’s premises.
4. Transitive functional dependency: within an entity type, the concrete values of one
descriptive property type determine the values of other descriptive properties.

Functional Dependency (FD)

A functional dependency (FD) is a relationship between two attributes, typically between the PK
and other non-key attributes within a table. For any relation R, attribute Y is functionally
dependent on attribute X (usually the PK) if every valid value of X uniquely determines the
value of Y.
A dependency FD: X → Y means that the values of Y are determined by the values of X: two
tuples sharing the same values of X will necessarily have the same values of Y. A functional
dependency FD: X → Y is called trivial if Y is a subset of X.
If one piece of information stored in a table uniquely determines another piece of information in
the same table, the relationship is called a Functional Dependency. Consider it an association
between two attributes of the same relation.

If P functionally determines Q, then

P -> Q

Let us see an example −


<Employee>

EmpID EmpName EmpAge

E01 Amit 28

E02 Rohit 31

In the above table, EmpName is functionally dependent on EmpID because EmpName can take
only one value for the given value of EmpID:

EmpID -> EmpName
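
The check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `fd_holds` and the dictionary layout of the rows are not part of the original notes. Note that a table instance can only disprove a functional dependency; whether it holds in general is a property of the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in this particular set of rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The same lhs value must always come with the same rhs value.
        if seen.setdefault(key, val) != val:
            return False
    return True


# Rows of the <Employee> table from the example above.
employee = [
    {"EmpID": "E01", "EmpName": "Amit", "EmpAge": 28},
    {"EmpID": "E02", "EmpName": "Rohit", "EmpAge": 31},
]

print(fd_holds(employee, ["EmpID"], ["EmpName"]))
```

Adding a second row with EmpID E01 but a different EmpName would make the function return False, because the same EmpID would then map to two names.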

Fully Functional Dependency
An attribute is fully functionally dependent on a set of attributes if it is functionally dependent
on that set and not on any of its proper subsets.

For example, an attribute Q is fully functionally dependent on P if it is functionally dependent
on P and not on any proper subset of P.

Let us see an example −

<ProjectCost>

ProjectID ProjectCost

001 1000

002 5000

<EmployeeProject>

EmpID ProjectID Days (spent on the project)

E099 001 320

E056 002 190

The above relations state:

EmpID, ProjectID, ProjectCost -> Days

However, Days is not fully functionally dependent on this set of attributes, because the proper
subset {EmpID, ProjectID} can already determine the {Days} spent on the project by the
employee.

This gives our fully functional dependency −

{EmpID, ProjectID} -> {Days}

Transitive Dependency
When an indirect relationship causes functional dependency it is called Transitive Dependency.

If P -> Q and Q -> R are true (and Q does not determine P), then P -> R is a transitive dependency.

Multivalued Dependency
A multivalued dependency occurs when the existence of one or more rows in a table implies
the existence of one or more other rows in the same table.

If a table has attributes P, Q and R, then Q and R are multi-valued facts of P.

It is represented by a double arrow −

->->

For our example:

P->->Q
P->->R

In the above case, the multivalued dependency exists only if Q and R are independent of each other.

Partial Dependency
Partial Dependency occurs when a nonprime attribute is functionally dependent on part of a
candidate key.

The 2nd Normal Form (2NF) eliminates the Partial Dependency. Let us see an example −

<StudentProject>

StudentID ProjectNo StudentName ProjectName

S01 199 Katie Geo Location

S02 120 Ollie Cluster Exploration


In the above table, we have partial dependency; let us see how −

The prime attributes are StudentID and ProjectNo, which together form the candidate key.

As stated, a partial dependency exists when a non-prime attribute, here StudentName or
ProjectName, is functionally dependent on only part of a candidate key.

StudentName can be determined by StudentID alone, which is a partial dependency.

ProjectName can be determined by ProjectNo alone, which is also a partial dependency.

Normalization

Normalization is a process of organizing the data in a database to avoid data redundancy,
insertion anomalies, update anomalies and deletion anomalies. The normal forms are discussed
below with examples.

Here are the most commonly used normal forms:

 First normal form(1NF)


 Second normal form(2NF)
 Third normal form(3NF)

First normal form (1NF)

As per the rule of first normal form, an attribute (column) of a table cannot hold multiple values.
It should hold only atomic values.

Example: Suppose a company wants to store the names and contact details of its employees. It
creates a table that looks like this:

emp_id emp_name emp_address emp_mobile

101 Herschel New Delhi 8912312390

102 Jon Kanpur 8812121212, 9900012222

103 Ron Chennai 7778881212

104 Lester Bangalore 9990000123, 8123450987

Two employees (Jon & Lester) have two mobile numbers each, so the company stored both
numbers in the same field, as you can see in the table above.

This table is not in 1NF: the rule says “each attribute of a table must have atomic (single)
values”, and the emp_mobile values for employees Jon & Lester violate that rule. To make the
table comply with 1NF, we should store the data like this:

emp_id emp_name emp_address emp_mobile

101 Herschel New Delhi 8912312390

102 Jon Kanpur 8812121212

102 Jon Kanpur 9900012222

103 Ron Chennai 7778881212

104 Lester Bangalore 9990000123

104 Lester Bangalore 8123450987
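
The conversion above can be sketched in Python as follows. This is a minimal illustration, assuming the multi-valued emp_mobile field is held as a list; the function name `to_1nf` is hypothetical and not part of the original notes.

```python
# Each employee row carries a list of mobile numbers (the non-atomic field).
raw = [
    (101, "Herschel", "New Delhi", ["8912312390"]),
    (102, "Jon", "Kanpur", ["8812121212", "9900012222"]),
    (103, "Ron", "Chennai", ["7778881212"]),
    (104, "Lester", "Bangalore", ["9990000123", "8123450987"]),
]


def to_1nf(rows):
    """Emit one row per (employee, mobile) pair so every field is atomic."""
    return [
        (emp_id, name, address, mobile)
        for (emp_id, name, address, mobiles) in rows
        for mobile in mobiles
    ]


for row in to_1nf(raw):
    print(row)
```

The result has six rows, matching the 1NF table above: one row for Herschel and Ron, two rows each for Jon and Lester.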

Second normal form (2NF)

A table is said to be in 2NF if both the following conditions hold:

 Table is in 1NF (First normal form)


 No non-prime attribute is dependent on the proper subset of any candidate key of table.
An attribute that is not part of any candidate key is known as non-prime attribute.

Example: Suppose a school wants to store the data of teachers and the subjects they teach. Since
a teacher can teach more than one subject, the table can have multiple rows for the same teacher.
They create a table that looks like this:

teacher_id subject teacher_age

111 Maths 38

111 Physics 38

222 Biology 38

333 Physics 40

333 Chemistry 40

Candidate Keys: {teacher_id, subject}


Non prime attribute: teacher_age

The table is in 1NF because each attribute has atomic values. However, it is not in 2NF because
the non-prime attribute teacher_age is dependent on teacher_id alone, which is a proper subset of
the candidate key. This violates the rule for 2NF, which says “no non-prime attribute is
dependent on the proper subset of any candidate key of the table”.
To make the table comply with 2NF we can break it into two tables like this:
teacher_details table:

teacher_id teacher_age

111 38

222 38

333 40

teacher_subject table:

teacher_id Subject

111 Maths

111 Physics

222 Biology

333 Physics

333 Chemistry

Now the tables comply with Second normal form (2NF).

Example
Consider table as following below.
STUD_NO COURSE_NO COURSE_FEE
1 C1 1000
2 C2 1500
1 C4 2000
4 C3 1000
4 C1 1000
2 C5 2000
{Note that, there are many courses having the same course fee. }
Here,
COURSE_FEE cannot alone decide the value of COURSE_NO or STUD_NO;
COURSE_FEE together with STUD_NO cannot decide the value of COURSE_NO;
COURSE_FEE together with COURSE_NO cannot decide the value of STUD_NO;
Hence,
COURSE_FEE is a non-prime attribute, as it does not belong to the only candidate
key {STUD_NO, COURSE_NO} ;

But, COURSE_NO -> COURSE_FEE, i.e., COURSE_FEE is dependent on COURSE_NO,


which is a proper subset of the candidate key. Non-prime attribute COURSE_FEE is dependent
on a proper subset of the candidate key, which is a partial dependency and so this relation is
not in 2NF.
To convert the above relation to 2NF,
we need to split the table into two tables such as :
Table 1: STUD_NO, COURSE_NO
Table 2: COURSE_NO, COURSE_FEE

Table 1
STUD_NO COURSE_NO
1 C1
2 C2
1 C4
4 C3
4 C1
2 C5

Table 2
COURSE_NO COURSE_FEE
C1 1000
C2 1500
C3 1000
C4 2000
C5 2000
Note – 2NF tries to reduce the redundant data stored in memory. For instance, if there
are 100 students taking course C1, we don't need to store its fee of 1000 in all 100 records;
instead we store it once in the second table as the course fee for C1.
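
The split into Table 1 and Table 2 can be sketched in Python as two projections followed by a join on COURSE_NO. This is an illustrative sketch only; the variable names are assumptions, and sets are used so that duplicate rows disappear as they would in a relation.

```python
# Rows of the original relation: (STUD_NO, COURSE_NO, COURSE_FEE).
enrolment = [
    (1, "C1", 1000), (2, "C2", 1500), (1, "C4", 2000),
    (4, "C3", 1000), (4, "C1", 1000), (2, "C5", 2000),
]

# Table 1 keeps the candidate key, Table 2 keeps COURSE_NO -> COURSE_FEE.
table1 = {(stud, course) for (stud, course, _fee) in enrolment}
table2 = {(course, fee) for (_stud, course, fee) in enrolment}

# Joining the two tables on COURSE_NO should give back the original rows.
rejoined = {
    (stud, course, fee)
    for (stud, course) in table1
    for (course2, fee) in table2
    if course == course2
}

print(len(table1), len(table2), rejoined == set(enrolment))
```

Table 2 ends up with five rows rather than six because the fee for C1 is stored only once, which is the redundancy that 2NF removes.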

Third Normal form (3NF)


A table design is said to be in 3NF if both the following conditions hold:

 Table must be in 2NF


 Transitive functional dependency of non-prime attribute on any super key should be
removed.

An attribute that is not part of any candidate key is known as non-prime attribute.

In other words 3NF can be explained like this: A table is in 3NF if it is in 2NF and for each
functional dependency X-> Y at least one of the following conditions hold:

 X is a super key of table


 Y is a prime attribute of table

An attribute that is a part of one of the candidate keys is known as prime attribute.

Example: Suppose a company wants to store the complete address of each employee, they create
a table named employee_details that looks like this:

emp_id emp_name emp_zip emp_state emp_city emp_district

1001 John 282005 UP Agra Dayal Bagh

1002 Ajeet 222008 TN Chennai M-City

1006 Lora 282007 TN Chennai Urrapakkam

1101 Lilly 292008 UK Pauri Bhagwan

1201 Steve 222999 MP Gwalior Ratan

Super keys: {emp_id}, {emp_id, emp_name}, {emp_id, emp_name, emp_zip}…so on


Candidate Keys: {emp_id}
Non-prime attributes: all attributes except emp_id are non-prime as they are not part of any
candidate keys.

Here, emp_state, emp_city & emp_district are dependent on emp_zip, and emp_zip is dependent
on emp_id. That makes the non-prime attributes (emp_state, emp_city & emp_district)
transitively dependent on the super key (emp_id). This violates the rule of 3NF.

To make this table comply with 3NF we have to break the table into two tables to remove the
transitive dependency:

employee table:

emp_id emp_name emp_zip

1001 John 282005

1002 Ajeet 222008

1006 Lora 282007

1101 Lilly 292008

1201 Steve 222999

employee_zip table:

emp_zip emp_state emp_city emp_district

282005 UP Agra Dayal Bagh

222008 TN Chennai M-City

282007 TN Chennai Urrapakkam


292008 UK Pauri Bhagwan

222999 MP Gwalior Ratan

Example-2

Student Table

student_id name reg_no branch address

10 Akon 07-WY CSE Kerala

11 Akon 08-WY IT Gujarat

12 Bkon 09-WY IT Rajasthan

Subject Table

subject_id subject_name teacher

1 Java Java Teacher

2 C++ C++ Teacher

3 Php Php Teacher

Score Table

score_id student_id subject_id marks

1 10 1 70

2 10 2 75
3 11 1 80

In the Score table, we need to store some more information, which is the exam name and total
marks, so let's add 2 more columns to the Score table.

score_id student_id subject_id marks exam_name total_marks

Requirements for Third Normal Form

For a table to be in the third normal form,

1. It should be in the Second Normal form.


2. And it should not have Transitive Dependency.

What is Transitive Dependency?

With exam_name and total_marks added to our Score table, it saves more data now. Primary key
for our Score table is a composite key, which means it's made up of two attributes or columns
→ student_id + subject_id.

Our new column exam_name depends on both student and subject. For example, a mechanical
engineering student will have a Workshop exam but a computer science student won't. And for
some subjects you have Practical exams and for some you don't. So we can say that exam_name is
dependent on both student_id and subject_id.

And what about our second new column total_marks? Does it depend on our Score table's
primary key?

Well, the column total_marks depends on exam_name as with exam type the total score changes.
For example, practicals are of less marks while theory exams are of more marks.

But, exam_name is just another column in the score table. It is not a primary key or even a part
of the primary key, and total_marks depends on it.
This is Transitive Dependency: a non-prime attribute depends on another non-prime
attribute rather than on the prime attributes or primary key.

How to remove Transitive Dependency?

Again the solution is very simple. Take out the columns exam_name and total_marks from Score
table and put them in an Exam table and use the exam_id wherever required.

Score Table: In 3rd Normal Form

score_id student_id subject_id marks exam_id

The new Exam table

exam_id exam_name total_marks

1 Workshop 200

2 Mains 70

3 Practicals 30

Advantage of removing Transitive Dependency

The advantage of removing transitive dependency is,

 Amount of data duplication is reduced.


 Data integrity achieved.

Multivalued Dependency-
A multivalued dependency arises when two or more independent relations are kept in a single
relation. Equivalently, a multivalued dependency occurs when the presence of one or more rows
in a table implies the presence of one or more other rows in that same table. Put another way,
two attributes (or columns) in a table are independent of one another, but both depend on a third
attribute. A multivalued dependency always requires at least three attributes because it consists
of at least two attributes that are dependent on a third.

If, for a single value of A, multiple values of B exist, then the table may have a multivalued
dependency A ->> B. For this, the table should have at least 3 attributes, and B and C
should be independent of each other.

o Multivalued dependency occurs when two attributes in a table are independent of each
other but, both depend on a third attribute.
o A multivalued dependency consists of at least two attributes that are dependent on a third
attribute that's why it always requires at least three attributes.

Example: Suppose there is a bike manufacturer company which produces two colors(white and
black) of each model every year.

BIKE_MODEL MANUF_YEAR COLOR

M2011 2008 White

M2001 2008 Black

M3001 2013 White

M3001 2013 Black

M4006 2017 White

M4006 2017 Black

Here columns COLOR and MANUF_YEAR are dependent on BIKE_MODEL and independent
of each other.
In this case, these two columns are said to be multivalued dependent on BIKE_MODEL. These
dependencies are represented as shown below:

1. BIKE_MODEL → → MANUF_YEAR
2. BIKE_MODEL → → COLOR

This can be read as "BIKE_MODEL multidetermines MANUF_YEAR" and "BIKE_MODEL
multidetermines COLOR".

Example
Let us see an example −

<Student>

StudentName CourseDiscipline Activities

Amit Mathematics Singing

Amit Mathematics Dancing

Yuvraj Computers Cricket

Akash Literature Dancing

Akash Literature Cricket

Akash Literature Singing

In the above table, we can see that students Amit and Akash are interested in more than one
activity.

This is a multivalued dependency because the CourseDiscipline and the Activities of a student
are independent of each other, but both depend on the student.

Therefore, the multivalued dependencies are −

StudentName ->-> CourseDiscipline
StudentName ->-> Activities

The above relation violates Fourth Normal Form.

To correct it, divide the table into two separate tables and break Multivalued Dependency −

<StudentCourse>

StudentName CourseDiscipline

Amit Mathematics

Yuvraj Computers

Akash Literature

<StudentActivities>

StudentName Activities

Amit Singing
Amit Dancing

Yuvraj Cricket

Akash Dancing

Akash Cricket

Akash Singing

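This breaks the multivalued dependency. In <StudentCourse> we now have the functional
dependency −

StudentName -> CourseDiscipline

<StudentActivities> holds only StudentName and Activities, so the multivalued dependency
StudentName ->-> Activities is trivial there and no longer causes redundancy.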
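
Whether a multivalued dependency holds in a given table instance can be checked mechanically: for each value of the determinant, every value of the dependent attribute must appear together with every value of the remaining attributes. The sketch below illustrates this on the <Student> table from the example above; the function name `mvd_holds` and the tuple layout are assumptions made for illustration.

```python
from itertools import product


def mvd_holds(rows, attrs, x, y):
    """Check X ->> Y on one table instance (rows are tuples ordered like attrs)."""
    z = [a for a in attrs if a not in x and a not in y]
    groups = {}
    for row in rows:
        rec = dict(zip(attrs, row))
        key = tuple(rec[a] for a in x)
        y_val = tuple(rec[a] for a in y)
        z_val = tuple(rec[a] for a in z)
        ys, zs, pairs = groups.setdefault(key, (set(), set(), set()))
        ys.add(y_val)
        zs.add(z_val)
        pairs.add((y_val, z_val))
    # For each X value, every Y value must appear with every Z value.
    return all(pairs == set(product(ys, zs)) for ys, zs, pairs in groups.values())


attrs = ("StudentName", "CourseDiscipline", "Activities")
student = [
    ("Amit", "Mathematics", "Singing"),
    ("Amit", "Mathematics", "Dancing"),
    ("Yuvraj", "Computers", "Cricket"),
    ("Akash", "Literature", "Dancing"),
    ("Akash", "Literature", "Cricket"),
    ("Akash", "Literature", "Singing"),
]

print(mvd_holds(student, attrs, ["StudentName"], ["Activities"]))
```

As with functional dependencies, a single instance can only refute a multivalued dependency; whether it holds in general depends on the meaning of the data.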

Fourth normal form (4NF):

Fourth normal form (4NF) is a level of database normalization in which the only non-trivial
multivalued dependencies are those whose determinant is a candidate key (more generally, a
superkey). It builds on the first three normal forms (1NF, 2NF and 3NF) and the Boyce-Codd
Normal Form (BCNF): in addition to meeting the requirements of BCNF, a relation in 4NF must
not contain any non-trivial multivalued dependency whose determinant is not a superkey.

Properties – A relation R is in 4NF if and only if the following conditions are satisfied:

1. It should be in the Boyce-Codd Normal Form (BCNF).


2. The table should not have any non-trivial multivalued dependency whose determinant is not a superkey.

A table with such a multivalued dependency violates the normalization standard of Fourth Normal
Form (4NF) because it creates unnecessary redundancies and can contribute to inconsistent data.
To bring it up to 4NF, it is necessary to break the information into two tables.

o A relation will be in 4NF if it is in Boyce-Codd normal form and has no such multi-valued
dependency.
o If, for a single value of A, multiple values of B exist independently of the remaining
attributes, then the relation has a multi-valued dependency A →→ B.

Example

STUDENT

STU_ID COURSE HOBBY

21 Computer Dancing

21 Math Singing

34 Chemistry Dancing

74 Biology Cricket

59 Physics Hockey

The given STUDENT table is in 3NF, but COURSE and HOBBY are two independent
entities. Hence, there is no relationship between COURSE and HOBBY.

In the STUDENT relation, the student with STU_ID 21 has two courses, Computer and Math,
and two hobbies, Dancing and Singing. So there is a multi-valued dependency on STU_ID,
which leads to unnecessary repetition of data.

So to make the above table into 4NF, we can decompose it into two tables:

STUDENT_COURSE

STU_ID COURSE

21 Computer

21 Math
34 Chemistry

74 Biology

59 Physics

STUDENT_HOBBY

STU_ID HOBBY

21 Dancing

21 Singing

34 Dancing

74 Cricket

59 Hockey

Example
Let us see an example −

<Movie>

Movie_Name Shooting_Location Listing

MovieOne UK Comedy

MovieOne UK Thriller
MovieTwo Australia Action

MovieTwo Australia Crime

MovieThree India Drama

The above is not in 4NF, since

 More than one movie can have the same listing


 Many shooting locations can have the same movie
Let us convert the above table in 4NF −

<Movie_Shooting>

Movie_Name Shooting_Location

MovieOne UK

MovieTwo Australia

MovieThree India

<Movie_Listing>

Movie_Name Listing
MovieOne Comedy

MovieOne Thriller

MovieTwo Action

MovieTwo Crime

MovieThree Drama

Now the violation is removed and the tables are in 4NF.

Join dependency – Join dependency is a further generalization of multivalued dependency.

If the join of R1 and R2 over C is equal to relation R, then we can say that a join
dependency (JD) exists, where R1(A, B, C) and R2(C, D) are the decomposition of a
given relation R(A, B, C, D). Alternatively, R1 and R2 are a lossless decomposition of R. A JD
⋈ {R1, R2, …, Rn} is said to hold over a relation R if R1, R2, …, Rn is a lossless-join
decomposition. *((A, B, C), (C, D)) will be a JD of R if the join of these projections is equal
to the relation R. Here, *(R1, R2, R3) is used to indicate that relations R1, R2, R3 and so on
are a JD of R.

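Let R be a relation schema and R1, R2, R3, …, Rn be a decomposition of R. An instance r(R) is
said to satisfy the join dependency if and only if joining its projections on R1, R2, …, Rn
gives back r:

π_R1(r) ⋈ π_R2(r) ⋈ … ⋈ π_Rn(r) = r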

If a table can be recreated by joining multiple tables, each of which has a subset of the
attributes of the table, then the table has a join dependency. It is a generalization of
multivalued dependency.

Join dependency is related to 5NF: a relation is in 5NF only if it is already in 4NF
and it cannot be decomposed further without loss.

Example
<Employee>
EmpName EmpSkills EmpJob (Assigned Work)

Tom Networking EJ001

Harry Web Development EJ002

Katie Programming EJ002

The above table can be decomposed into the following three tables; therefore it is not in 5NF:

<EmployeeSkills>

EmpName EmpSkills

Tom Networking

Harry Web Development

Katie Programming

<EmployeeJob>
EmpName EmpJob

Tom EJ001

Harry EJ002

Katie EJ002

<JobSkills>

EmpSkills EmpJob

Networking EJ001

Web Development EJ002

Programming EJ002

Our Join Dependency −

{(EmpName, EmpSkills ), ( EmpName, EmpJob), (EmpSkills, EmpJob)}

The <Employee> relation has this join dependency, so it is not in 5NF. In other words, the join
of the above three relations is equal to our original relation <Employee>.
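
The claim that the three projections join back to <Employee> can be checked with a small Python sketch. The helpers `project` and `natural_join` below are hypothetical names written for this illustration; relations are modelled as sets of tuples with a separate list of attribute names.

```python
def project(rows, attrs, keep):
    """Project rows (tuples ordered like attrs) onto the attributes in keep."""
    idx = [attrs.index(a) for a in keep]
    return {tuple(row[i] for i in idx) for row in rows}


def natural_join(r1, a1, r2, a2):
    """Natural join of two relations given as (set of tuples, attribute list)."""
    common = [a for a in a1 if a in a2]
    out_attrs = list(a1) + [a for a in a2 if a not in a1]
    out = set()
    for t1 in r1:
        d1 = dict(zip(a1, t1))
        for t2 in r2:
            d2 = dict(zip(a2, t2))
            if all(d1[c] == d2[c] for c in common):
                merged = {**d1, **d2}
                out.add(tuple(merged[a] for a in out_attrs))
    return out, out_attrs


attrs = ["EmpName", "EmpSkills", "EmpJob"]
employee = {
    ("Tom", "Networking", "EJ001"),
    ("Harry", "Web Development", "EJ002"),
    ("Katie", "Programming", "EJ002"),
}

skills = project(employee, attrs, ["EmpName", "EmpSkills"])
jobs = project(employee, attrs, ["EmpName", "EmpJob"])
job_skills = project(employee, attrs, ["EmpSkills", "EmpJob"])

# Join the three projections step by step and compare with the original.
step, step_attrs = natural_join(skills, ["EmpName", "EmpSkills"], jobs, ["EmpName", "EmpJob"])
joined, joined_attrs = natural_join(step, step_attrs, job_skills, ["EmpSkills", "EmpJob"])

print(joined == employee)
```

If the final comparison is true, the instance satisfies the join dependency over the three attribute pairs; an instance for which the join produced extra rows would not satisfy it.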

Fifth Normal Form / Projected Normal Form (5NF):

A relation R is in 5NF if and only if every join dependency in R is implied by the candidate keys
of R. A relation decomposed into two relations must have loss-less join Property, which ensures
that no spurious or extra tuples are generated, when relations are reunited through a natural join.
Properties – A relation R is in 5NF if and only if it satisfies following conditions:

1. R should be already in 4NF.


2. It cannot be decomposed further without loss (no remaining join dependency).

Boyce-Codd Normal Form (BCNF)

Application of the general definitions of 2NF and 3NF may identify additional redundancy
caused by dependencies that violate one or more candidate keys. However, despite these
additional constraints, dependencies can still exist that will cause redundancy to be present in
3NF relations. This weakness in 3NF, resulted in the presentation of a stronger normal form
called Boyce–Codd Normal Form (Codd, 1974).

Although 3NF is an adequate normal form for relational databases, it may not remove 100% of
the redundancy when there is a functional dependency X → Y in which X is not a
candidate key of the given relation. This can be solved by Boyce-Codd Normal Form (BCNF).

Boyce–Codd Normal Form (BCNF) is based on functional dependencies that take into
account all candidate keys in a relation; however, BCNF also has additional constraints
compared with the general definition of 3NF.

A relation is in BCNF iff X is a superkey for every non-trivial functional dependency (FD)
X → Y in the given relation.

In other words,

A relation is in BCNF if and only if every determinant is a candidate key.

You came across a similar hierarchy known as Chomsky Normal Form in Theory of
Computation. Now, carefully study the hierarchy of normal forms. It can be inferred that every
relation in BCNF is also in 3NF. The converse does not hold: a relation in 3NF need not be in
BCNF. Ponder over this statement for a while.
To determine the highest normal form of a given relation R with functional dependencies, the
first step is to check whether the BCNF condition holds. If R is found to be in BCNF, it can be
safely deduced that the relation is also in 3NF, 2NF and 1NF as the hierarchy shows. The 1NF
has the least restrictive constraint – it only requires a relation R to have atomic values in each
tuple. The 2NF has a slightly more restrictive constraint.

The 3NF has more restrictive constraint than the first two normal forms but is less restrictive
than the BCNF. In this manner, the restriction increases as we traverse down the hierarchy.

Example-1:
Find the highest normal form of a relation R(A, B, C, D, E) with FD set as:

{ BC->D, AC->BE, B->E }

Explanation:

 Step-1: As we can see, (AC)+ ={A, C, B, E, D} but none of its subset can determine all
attribute of relation, So AC will be candidate key. A or C can’t be derived from any other
attribute of the relation, so there will be only 1 candidate key {AC}.
 Step-2: Prime attributes are those attribute which are part of candidate key {A, C} in this
example and others will be non-prime {B, D, E} in this example.
 Step-3: The relation R is in 1st normal form as a relational DBMS does not allow multi-
valued or composite attribute.

The relation is in 2nd normal form because BC->D is in 2nd normal form (BC is not a proper
subset of candidate key AC) and AC->BE is in 2nd normal form (AC is candidate key) and B->E
is in 2nd normal form (B is not a proper subset of candidate key AC).

The relation is not in 3rd normal form because of BC->D (neither is BC a super key nor is D a
prime attribute) and B->E (neither is B a super key nor is E a prime attribute); to satisfy 3rd
normal form, either the LHS of an FD should be a super key or the RHS should be a prime
attribute. So the highest normal form of the relation is 2nd normal form.
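
The steps of Example-1 above can be reproduced with a short Python sketch: compute attribute closures, search for candidate keys, and list the FDs that break the 3NF condition. The function names are assumptions made for this illustration, attributes are single letters, and the brute-force key search is only suitable for small schemas like this one.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under a list of (lhs, rhs) FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(schema, fds):
    """Minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # a smaller key is contained in combo, so it is not minimal
            if closure(combo, fds) == set(schema):
                keys.append(combo)
    return keys


def violates_3nf(schema, fds):
    """FDs (lhs, attribute) where lhs is not a superkey and attribute is non-prime."""
    prime = {a for key in candidate_keys(schema, fds) for a in key}
    bad = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) == set(schema)
        for a in sorted(set(rhs) - set(lhs)):
            if not is_superkey and a not in prime:
                bad.append((lhs, a))
    return bad


schema = "ABCDE"
fds = [("BC", "D"), ("AC", "BE"), ("B", "E")]

print(sorted(closure("AC", fds)))
print(candidate_keys(schema, fds))
print(violates_3nf(schema, fds))
```

The violating dependencies reported for this FD set are BC->D and B->E, which agrees with the explanation above.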

Note –A prime attribute cannot be transitively dependent on a key in BCNF relation.

Below we have a college enrolment table with columns student_id, subject and professor.

student_id subject professor

101 Java P.Java


101 C++ P.Cpp

102 Java P.Java2

103 C# P.Chash

104 Java P.Java

As you can see, we have also added some sample data to the table.

In the table above:

 One student can enrol for multiple subjects. For example, student with student_id 101,
has opted for subjects - Java & C++

 For each subject, a professor is assigned to the student.

 And, there can be multiple professors teaching one subject like we have for Java.

What do you think should be the Primary Key?

Well, in the table above student_id, subject together form the primary key, because
using student_id and subject, we can find all the columns of the table.

One more important point to note here is, one professor teaches only one subject, but one subject
may have two different professors.

Hence, there is a dependency between subject and professor here, where subject depends on the
professor name.

This table satisfies the 1st Normal form because all the values are atomic, column names are
unique and all the values stored in a particular column are of same domain.

This table also satisfies the 2nd Normal Form as there is no Partial Dependency.

And, there is no Transitive Dependency, hence the table also satisfies the 3rd Normal Form.
But this table is not in Boyce-Codd Normal Form.

Why is this table not in BCNF?

In the table above, student_id, subject form primary key, which means subject column is a prime
attribute.

But, there is one more dependency, professor → subject.

Here subject is a prime attribute and professor is a non-prime attribute. A non-prime attribute
determining a prime attribute is not allowed by BCNF.

How to satisfy BCNF?

To make this relation(table) satisfy BCNF, we will decompose this table into two
tables, student table and professor table.

Below we have the structure for both the tables.

Student Table

student_id p_id

101 1

101 2

and so on...

And, Professor Table

p_id professor subject


1 P.Java Java

2 P.Cpp C++

and so on...

And now, this relation satisfies Boyce-Codd Normal Form.

A more Generic Explanation

In the picture below, we have tried to explain BCNF in terms of relations.

Example: Let's assume there is a company where employees work in more than one department.

EMPLOYEE table:
EMP_ID EMP_COUNTRY EMP_DEPT DEPT_TYPE EMP_DEPT_NO

264 India Designing D394 283

264 India Testing D394 300

364 UK Stores D283 232

364 UK Developing D283 549

In the above table Functional dependencies are as follows:

1. EMP_ID → EMP_COUNTRY
2. EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}

Candidate key: {EMP_ID, EMP_DEPT}

The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone is a key.

To convert the given table into BCNF, we decompose it into three tables:

EMP_COUNTRY table:

EMP_ID EMP_COUNTRY

264 India

364 UK

EMP_DEPT table:

EMP_DEPT DEPT_TYPE EMP_DEPT_NO

Designing D394 283

Testing D394 300

Stores D283 232


Developing D283 549

EMP_DEPT_MAPPING table:

EMP_ID EMP_DEPT

264 Designing

264 Testing

364 Stores

364 Developing

Functional dependencies:

1. EMP_ID → EMP_COUNTRY
2. EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}

Candidate keys:

For the first table: EMP_ID


For the second table: EMP_DEPT
For the third table: {EMP_ID, EMP_DEPT}

Now, this is in BCNF because left side part of both the functional dependencies is a key.
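
The BCNF condition used in this example (the left side of every non-trivial FD must be a superkey) can be sketched in Python. The function names and the use of frozensets for FDs are assumptions made for this illustration; each decomposed table is checked only against the FDs listed for it above.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under a list of (lhs, rhs) FDs (sets of names)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Non-trivial FDs whose left side is not a superkey of schema."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != schema
    ]


fd1 = (frozenset({"EMP_ID"}), frozenset({"EMP_COUNTRY"}))
fd2 = (frozenset({"EMP_DEPT"}), frozenset({"DEPT_TYPE", "EMP_DEPT_NO"}))

employee = {"EMP_ID", "EMP_COUNTRY", "EMP_DEPT", "DEPT_TYPE", "EMP_DEPT_NO"}
emp_country = {"EMP_ID", "EMP_COUNTRY"}
emp_dept = {"EMP_DEPT", "DEPT_TYPE", "EMP_DEPT_NO"}
emp_dept_mapping = {"EMP_ID", "EMP_DEPT"}

print(len(bcnf_violations(employee, [fd1, fd2])))
print(bcnf_violations(emp_country, [fd1]))
print(bcnf_violations(emp_dept, [fd2]))
print(bcnf_violations(emp_dept_mapping, []))
```

Both FDs violate BCNF in the original EMPLOYEE table, while none of the three decomposed tables reports a violation.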

Inclusion Dependency
o Multivalued dependencies and join dependencies can be used to guide database design,
although both are less common than functional dependencies.
o Inclusion dependencies are quite common, but they typically have little influence on
the design of the database.
o An inclusion dependency is a statement that some columns of a relation are
contained in other columns.
o A foreign key is an example of an inclusion dependency: the referring column(s) of one
relation are contained in the primary key column(s) of the referenced relation.
o Suppose we have two relations R and S which were obtained by translating two entity sets
such that every R entity is also an S entity.
o An inclusion dependency then holds if projecting R on its key attributes yields a
relation that is contained in the relation obtained by projecting S on its key attributes.
o We should not split groups of attributes that participate in an
inclusion dependency.
o In practice, most inclusion dependencies are key-based, that is, they involve only keys.

Lossless Decomposition
o Decomposition is lossless if it is feasible to reconstruct relation R from the decomposed
tables using joins. This is the preferred choice: no information is lost from the
relation when it is decomposed, and the join results in the same original relation.

o Let us see an example −

o <EmpInfo>

Emp_ID Emp_Name Emp_Age Emp_Location Dept_ID Dept_Name

E001 Jacob 29 Alabama Dpt1 Operations

E002 Henry 32 Alabama Dpt2 HR

E003 Tom 22 Texas Dpt3 Finance

o Decompose the above table into two tables:


o <EmpDetails>

Emp_ID Emp_Name Emp_Age Emp_Location

E001 Jacob 29 Alabama

E002 Henry 32 Alabama

E003 Tom 22 Texas

o
<DeptDetails>

Dept_ID Emp_ID Dept_Name

Dpt1 E001 Operations

Dpt2 E002 HR

Dpt3 E003 Finance

o Now, Natural Join is applied on the above two tables −

o The result will be –


Emp_ID Emp_Name Emp_Age Emp_Location Dept_ID Dept_Name

E001 Jacob 29 Alabama Dpt1 Operations

E002 Henry 32 Alabama Dpt2 HR

E003 Tom 22 Texas Dpt3 Finance

o Therefore, the above relation had lossless decomposition i.e. no loss of information.

Lossy Decomposition
o As the name suggests, when a relation is decomposed into two or more relational
schemas, information is lost and the original relation cannot be retrieved by joining them.

o Let us see an example −

o <EmpInfo>

Emp_ID Emp_Name Emp_Age Emp_Location Dept_ID Dept_Name

E001 Jacob 29 Alabama Dpt1 Operations

E002 Henry 32 Alabama Dpt2 HR

E003 Tom 22 Texas Dpt3 Finance

o Decompose the above table into two tables –


o <EmpDetails>

Emp_ID Emp_Name Emp_Age Emp_Location

E001 Jacob 29 Alabama

E002 Henry 32 Alabama

E003 Tom 22 Texas

o
<DeptDetails>

Dept_ID Dept_Name

Dpt1 Operations

Dpt2 HR

Dpt3 Finance

o Now, you won’t be able to join the above tables, since Emp_ID isn’t part of
the DeptDetails relation.

o Therefore, the above relation has lossy decomposition.

Lossless Join Decomposition


 Consider there is a relation R which is decomposed into sub relations R1 , R2 , …. , Rn.
 This decomposition is called lossless join decomposition when the join of the sub relations
results in the same relation R that was decomposed.
 For lossless join decomposition, we always have- R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn = R , where ⋈
is a natural join operator

Example : Consider the following relation R( A , B , C )-

A B C

1 2 1

2 5 3

3 3 3

R( A , B , C )
Consider that this relation is decomposed into two sub relations R1( A , B ) and R2( B , C ).
The two sub relations are-

A B

1 2

2 5

3 3

R1( A , B )

B C

2 1

5 3

3 3

R2( B , C )
Now, let us check whether this decomposition is lossless or not.
For lossless decomposition, we must have-
R1 ⋈ R2 = R
Now, if we perform the natural join ( ⋈ ) of the sub relations R1 and R2 , we get-

A B C

1 2 1

2 5 3

3 3 3

This relation is same as the original relation R.


Thus, we conclude that the above decomposition is lossless join decomposition.
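
The check R1 ⋈ R2 = R for this example can be sketched in Python with set comprehensions. This is only an illustration; the variable names are assumptions, and the join is written by hand for the single common attribute B.

```python
# The relation R(A, B, C) from the example above.
R = {(1, 2, 1), (2, 5, 3), (3, 3, 3)}

# Projections R1(A, B) and R2(B, C).
R1 = {(a, b) for (a, b, _c) in R}
R2 = {(b, c) for (_a, b, c) in R}

# Natural join of R1 and R2 on the common attribute B.
joined = {(a, b, c) for (a, b) in R1 for (b2, c) in R2 if b == b2}

print(joined == R)
```

Because every value of B occurs only once in R2, each row of R1 matches exactly one row of R2 and no extra tuples appear.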
NOTE
 Lossless join decomposition is also known as non-additive join decomposition.
 This is because the resultant relation after joining the sub relations is same as the decomposed
relation.
 No extraneous tuples appear after joining of the sub-relations.

Lossy Join Decomposition :


 Consider there is a relation R which is decomposed into sub relations R1 , R2 , …. , Rn.
 This decomposition is called lossy join decomposition when the join of the sub relations does
not result in the same relation R that was decomposed.
 The natural join of the sub relations is always found to have some extraneous tuples.
 For lossy join decomposition, we always have- R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn ⊃ R where ⋈ is
a natural join operator
Example : Consider that we have a table STUDENT with three attributes: roll_no , sname and
department.
Student

Roll_no Sname Dept

111 parimal COMPUTER

222 parimal ELECTRICAL

This relation is decomposed into two relations, no_name and name_dept :

No_name

Roll_no Sname

111 parimal

222 parimal

name_dept

Sname Dept

parimal COMPUTER

parimal ELECTRICAL

In lossy decomposition, spurious tuples are generated when a natural join is applied to the
relations in the decomposition.

stu_joined

Roll_no Sname Dept

111 parimal COMPUTER

111 parimal ELECTRICAL

222 parimal COMPUTER

222 parimal ELECTRICAL

The above decomposition is a bad decomposition or Lossy decomposition.
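
The spurious tuples in this example can be found by joining the two projections and subtracting the original relation. The Python sketch below is an illustration only; the variable names are assumptions.

```python
# The Student relation: (Roll_no, Sname, Dept).
student = {
    (111, "parimal", "COMPUTER"),
    (222, "parimal", "ELECTRICAL"),
}

# Projections no_name(Roll_no, Sname) and name_dept(Sname, Dept).
no_name = {(roll, name) for (roll, name, _dept) in student}
name_dept = {(name, dept) for (_roll, name, dept) in student}

# Natural join on the common attribute Sname.
stu_joined = {
    (roll, name, dept)
    for (roll, name) in no_name
    for (name2, dept) in name_dept
    if name == name2
}

# Tuples that appear after the join but were never in the original relation.
spurious = stu_joined - student

print(len(stu_joined), sorted(spurious))
```

The join has four rows, two of which are spurious, because Sname does not identify a single row in either projection.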


Join Dependency
o Join dependency is a further generalization of multivalued dependency.
o If the join of R1 and R2 over C is equal to relation R, then we can say that a join
dependency (JD) exists.
o Here R1(A, B, C) and R2(C, D) are the decompositions of a given relation
R(A, B, C, D).
o Alternatively, R1 and R2 are a lossless decomposition of R.
o A JD ⋈ {R1, R2,..., Rn} is said to hold over a relation R if R1, R2,....., Rn is a
lossless-join decomposition.
o *((A, B, C), (C, D)) will be a JD of R if the join of these projections is equal to the
relation R.
o Here, *(R1, R2, R3) is used to indicate that relations R1, R2, R3 and so on are a JD of R.
