Cat2 DBDM Key


AD3391 Database Design and Management

CAT – II Answer key

1. Define functional dependency?

A functional dependency is a relationship that exists when one attribute uniquely determines another
attribute. If R is a relation with attributes X and Y, a functional dependency between the attributes is
represented as X -> Y, which specifies that Y is functionally dependent on X. A functional dependency (FD) is a
set of constraints between two attributes in a relation.

2. What is decomposition?

Decomposition means dividing a relation R into {R1, R2, ..., Rn}. A good decomposition should be
dependency preserving and lossless.

3. Explain about Loss less Join Decomposition

How can we decide whether a decomposition is lossless?
1. Let R be a relation schema.
2. Let F be a set of functional dependencies on R.
3. Let R1 and R2 form a decomposition of R.
4. The decomposition is a lossless-join decomposition of R if at least one of the following functional dependencies is in F+:
a. R1 ∩ R2 → R1
b. R1 ∩ R2 → R2
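The test above can be sketched in a few lines of Python. This is a minimal illustration, not a library implementation; the schema R(A, B, C) and the FD A -> B used at the end are invented for the example.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under functional dependencies `fds`,
    where each FD is a (lhs_set, rhs_set) pair."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_lossless(r1, r2, fds):
    """Binary decomposition {r1, r2} is lossless iff (r1 ∩ r2)+ covers r1 or r2."""
    c = closure(r1 & r2, fds)
    return r1 <= c or r2 <= c

# R(A, B, C) with A -> B, decomposed into R1(A, B) and R2(A, C):
fds = [({'A'}, {'B'})]
print(is_lossless({'A', 'B'}, {'A', 'C'}, fds))  # True: A -> B puts A+ over R1
```

The common attribute A determines R1, so the decomposition passes the condition.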

4. Differentiate Weak and Strong Entity Sets

Weak entity set: an entity set that does not have a key attribute of its own is called a weak entity set. Strong
entity set: an entity set that has a primary key is termed a strong entity set.

5. Discuss normalization: 1NF, 2NF

1NF databases have some problems, most notably repetition of data: to change a department name, all
tuples of the relation need to be updated, since the department name can exist in multiple rows.

A relation schema R is in 2NF if it is in 1NF and every non-prime attribute A in R is fully functionally
dependent on the primary key.

6. Define a Transaction? List the properties of transaction

Collections of operations that form a single logical unit of work are called transactions.

Atomicity, Consistency, Isolation and Durability.

7. What is a serializable schedule?

To process transactions concurrently, the database server must execute some component statements of
one transaction, then some from other transactions, before continuing to process further operations
from the first. The order in which the component operations of the various transactions are interleaved
is called the schedule. A schedule is serializable if it is equivalent to some serial schedule, i.e., it
produces the same result as executing the transactions one after another.

8. What are the needs for concurrency?

Improved throughput, improved resource utilization, and reduced waiting time.


9. What are the steps followed in executing write(x) command in Transaction?

1. Find the address of the disk block that contains item x.
2. Copy that disk block into a buffer in main memory.
3. Copy item x from the program variable named x into its correct location in the buffer.
4. Store the updated block from the buffer back to disk.
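The four steps can be simulated with dictionaries standing in for the disk and the main-memory buffer. The block address 'block7' and the item names are invented for illustration.

```python
# Dicts standing in for persistent storage and the buffer pool.
disk = {'block7': {'x': 10, 'y': 20}}   # disk block containing item x
buffer = {}

def write_x(program_x):
    address = 'block7'                      # 1. find the disk block holding x
    buffer[address] = dict(disk[address])   # 2. copy the block into the buffer
    buffer[address]['x'] = program_x        # 3. copy program variable x into the buffer
    disk[address] = dict(buffer[address])   # 4. store the updated block back to disk

write_x(99)
print(disk['block7']['x'])  # 99
```

Note that other items in the same block (y here) are carried along unchanged, since I/O happens at block granularity.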

10. Define Two Phase locking protocol?

A transaction is said to follow the Two-Phase Locking protocol if all its locking and unlocking
operations are done in two phases.

 Growing Phase: New locks on data items may be acquired but none can be released.
 Shrinking Phase: Existing locks may be released but no new locks can be acquired.

PART – B

11. a) Discuss in detail the steps involved in ER-to-Relational mapping

The ER model, when conceptualized into diagrams, gives a good overview of the entity relationships,
which is easier to understand. ER diagrams can be mapped to a relational schema; that is, it is
possible to create a relational schema using an ER diagram. We cannot import all the ER constraints
into the relational model, but an approximate schema can be generated. There are several processes
and algorithms available to convert ER diagrams into relational schemas; some of them are
automated and some are manual. We focus here on mapping diagram contents to relational basics.

ER diagrams mainly comprise:

 Entity and its attributes


 Relationship, which is an association among entities.

Mapping Entity

An entity is a real-world object with some attributes.

Mapping Process (Algorithm)

 Create table for each entity.


 Entity's attributes should become fields of tables with their respective data types.
 Declare primary key.

Mapping Relationship

A relationship is an association among entities.

Mapping Process

 Create table for a relationship.


 Add the primary keys of all participating Entities as fields of table with their respective
data types.
 If relationship has any attribute, add each attribute as field of table.
 Declare a primary key composing all the primary keys of participating entities.
 Declare all foreign key constraints.

Mapping Weak Entity Sets

A weak entity set is one which does not have any primary key associated with it.

Mapping Process

 Create table for weak entity set.


 Add all its attributes to table as field.
 Add the primary key of identifying entity set.
 Declare all foreign key constraints.
Mapping Hierarchical Entities

ER specialization or generalization comes in the form of hierarchical entity sets.

Mapping Process

 Create tables for all higher-level entities.


 Create tables for lower-level entities.
 Add primary keys of higher-level entities in the table of lower-level entities.
 In lower-level tables, add all other attributes of lower-level entities.
 Declare primary key of higher-level table and the primary key for lower-level table.
 Declare foreign key constraints.
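As an illustration of the entity-mapping steps above, here is a minimal sketch that emits a CREATE TABLE statement from an entity description. The Employee entity and its attributes are invented for the example.

```python
def entity_to_ddl(name, attributes, primary_key):
    """Map an entity (name, typed attributes, key) to a CREATE TABLE statement."""
    cols = ",\n  ".join(f"{attr} {sqltype}" for attr, sqltype in attributes)
    key = ", ".join(primary_key)
    return f"CREATE TABLE {name} (\n  {cols},\n  PRIMARY KEY ({key})\n);"

ddl = entity_to_ddl(
    "Employee",
    [("emp_id", "INT"), ("name", "VARCHAR(50)"), ("dob", "DATE")],
    ["emp_id"],
)
print(ddl)
```

Relationship tables would be generated the same way, with the participating entities' primary keys as columns plus foreign-key clauses.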
11. b) Explain the concept of lossy and lossless decomposition

Lossless Decomposition
A decomposition is lossless if it is feasible to reconstruct relation R from the decomposed tables using
joins. This is the preferred choice: no information is lost from the relation when it is
decomposed, and the join results in the same original relation.

Let us see an example −

<EmpInfo>

Emp_ID   Emp_Name   Emp_Age   Emp_Location   Dept_ID   Dept_Name
E001     Jacob      29        Alabama        Dpt1      Operations
E002     Henry      32        Alabama        Dpt2      HR
E003     Tom        22        Texas          Dpt3      Finance

Decompose the above table into two tables:

<EmpDetails>

Emp_ID   Emp_Name   Emp_Age   Emp_Location
E001     Jacob      29        Alabama
E002     Henry      32        Alabama
E003     Tom        22        Texas

<DeptDetails>

Dept_ID   Emp_ID   Dept_Name
Dpt1      E001     Operations
Dpt2      E002     HR
Dpt3      E003     Finance

Now, a natural join is applied on the above two tables. The result is:

Emp_ID   Emp_Name   Emp_Age   Emp_Location   Dept_ID   Dept_Name
E001     Jacob      29        Alabama        Dpt1      Operations
E002     Henry      32        Alabama        Dpt2      HR
E003     Tom        22        Texas          Dpt3      Finance

Therefore, the above relation had a lossless decomposition, i.e., no loss of information.

Lossy Decomposition

As the name suggests, a decomposition is lossy when a relation is decomposed into two or more
relational schemas in such a way that the original relation cannot be reconstructed without loss of
information.

Let us see an example −

<EmpInfo>

Emp_ID   Emp_Name   Emp_Age   Emp_Location   Dept_ID   Dept_Name
E001     Jacob      29        Alabama        Dpt1      Operations
E002     Henry      32        Alabama        Dpt2      HR
E003     Tom        22        Texas          Dpt3      Finance

Decompose the above table into two tables:

<EmpDetails>

Emp_ID   Emp_Name   Emp_Age   Emp_Location
E001     Jacob      29        Alabama
E002     Henry      32        Alabama
E003     Tom        22        Texas

<DeptDetails>

Dept_ID   Dept_Name
Dpt1      Operations
Dpt2      HR
Dpt3      Finance

Now, you won’t be able to join the above tables, since Emp_ID isn’t part of
the DeptDetails relation.
Therefore, the above relation has a lossy decomposition.
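The difference can be demonstrated with a small natural-join sketch over lists of dicts. The data is a trimmed version of the EmpInfo example; a decomposition that shares Emp_ID reconstructs the relation, while one that shares no attribute produces spurious tuples.

```python
def natural_join(r1, r2):
    """Join two relations on all attribute names they share."""
    common = set(r1[0]) & set(r2[0])
    return [{**t1, **t2} for t1 in r1 for t2 in r2
            if all(t1[a] == t2[a] for a in common)]

emp = [{'Emp_ID': 'E001', 'Emp_Name': 'Jacob'},
       {'Emp_ID': 'E002', 'Emp_Name': 'Henry'}]
dept_lossless = [{'Dept_ID': 'Dpt1', 'Emp_ID': 'E001'},
                 {'Dept_ID': 'Dpt2', 'Emp_ID': 'E002'}]
dept_lossy = [{'Dept_ID': 'Dpt1', 'Dept_Name': 'Operations'},
              {'Dept_ID': 'Dpt2', 'Dept_Name': 'HR'}]

print(len(natural_join(emp, dept_lossless)))  # 2 rows: original relation recovered
print(len(natural_join(emp, dept_lossy)))     # 4 rows: spurious tuples (lossy)
```

With no common attribute, the join degenerates into a Cartesian product, which is exactly where the spurious tuples come from.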

12.a) Describe properties of relational decomposition

The relational database design algorithms that we present in Section 16.3 start from a
single universal relation schema R = {A1, A2, ..., An} that includes all the attributes of the
database. We implicitly make the universal relation assumption, which states that every
attribute name is unique. The set F of functional dependencies that should hold on the attributes
of R is specified by the database designers and is made available to the design algorithms. Using
the functional dependencies, the algorithms decompose the universal relation schema R into a set
of relation schemas D = {R1, R2, ..., Rm} that will become the relational database schema; D is
called a decomposition of R.

We must make sure that each attribute in R appears in at least one relation schema Ri in the
decomposition so that no attributes are lost; formally, we require that the union of the attributes of
R1, R2, ..., Rm equals the attributes of R.

This is called the attribute preservation condition of a decomposition.

Another goal is to have each individual relation Ri in the decomposition D be in BCNF or 3NF.
However, this condition is not sufficient to guarantee a good database design on its own. We
must consider the decomposition of the universal relation as a whole, in addition to looking at
the individual relations. To illustrate this point, consider the EMP_LOCS(Ename, Plocation)
relation, which is in 3NF and also in BCNF. In fact, any relation schema with only two
attributes is automatically in BCNF. Although EMP_LOCS is in BCNF, it still gives rise to
spurious tuples when joined with EMP_PROJ(Ssn, Pnumber, Hours, Pname, Plocation), which
is not in BCNF (see the result of the natural join in Figure 15.6). Hence, EMP_LOCS represents
a particularly bad relation schema because of its convoluted semantics, by which Plocation gives
the location of one of the projects on which an employee works.
Joining EMP_LOCS with PROJECT(Pname, Pnumber, Plocation, Dnum), which is in
BCNF, using Plocation as the joining attribute also gives rise to spurious tuples. This underscores
the need for other criteria that, together with the conditions of 3NF or BCNF, prevent such bad
designs. In the next three subsections we discuss such additional conditions that should hold on a
decomposition D as a whole.
12. b) Define normalization? Explain 1NF, 2NF, 3NF normal forms.
First Normal Form
If a relation contains a composite or multi-valued attribute, it violates first normal form; conversely, a
relation is in first normal form if it does not contain any composite or multi-valued attribute, i.e., if
every attribute in the relation is single-valued.
 Example 1 – Relation STUDENT in table 1 is not in 1NF because of multi-valued attribute
STUD_PHONE. Its decomposition into 1NF has been shown in table 2.
 Example 2 –
ID Name Courses
------------------
1 A c1, c2
2 E c3
3 M c2, c3
 In the above table Course is a multi-valued attribute so it is not in 1NF. Below Table is in 1NF as
there is no multi-valued attribute
ID Name Course
------------------
1 A c1
1 A c2
2 E c3
3 M c2
3 M c3
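The 1NF conversion above can be sketched in a few lines, using a hypothetical list-of-tuples representation of the table:

```python
# Non-1NF rows: Courses is a multi-valued attribute.
rows = [(1, 'A', ['c1', 'c2']), (2, 'E', ['c3']), (3, 'M', ['c2', 'c3'])]

# Flatten to one row per (student, course) pair, as in the 1NF table above.
flat = [(sid, name, course)
        for sid, name, courses in rows
        for course in courses]
print(flat)
```

The result has five single-valued rows, matching the decomposed table shown above.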
Second Normal Form
To be in second normal form, a relation must be in first normal form and must not contain any
partial dependency. A relation is in 2NF if it has no partial dependency, i.e., no non-prime attribute
(an attribute that is not part of any candidate key) is dependent on any proper subset of any candidate
key of the table. Partial dependency – if a proper subset of a candidate key determines a non-prime
attribute, it is called a partial dependency.
 Example 1 – Consider table-3 as following below.
STUD_NO COURSE_NO COURSE_FEE
1 C1 1000
2 C2 1500
1 C4 2000
4 C3 1000
4 C1 1000
2 C5 2000

 {Note that there are many courses having the same course fee.} Here, COURSE_FEE alone cannot
determine COURSE_NO or STUD_NO; COURSE_FEE together with STUD_NO cannot determine
COURSE_NO; and COURSE_FEE together with COURSE_NO cannot determine STUD_NO.
Hence, COURSE_FEE is a non-prime attribute, as it does not belong to the only candidate key
{STUD_NO, COURSE_NO}. But COURSE_NO -> COURSE_FEE, i.e., COURSE_FEE is
dependent on COURSE_NO, which is a proper subset of the candidate key. A non-prime attribute
depending on a proper subset of the candidate key is a partial dependency, so this relation is not in
2NF. To convert the above relation to 2NF, we split it into two tables: Table 1 (STUD_NO,
COURSE_NO) and Table 2 (COURSE_NO, COURSE_FEE).
Table 1 Table 2
STUD_NO COURSE_NO COURSE_NO COURSE_FEE
1 C1 C1 1000
2 C2 C2 1500
1 C4 C3 1000
4 C3 C4 2000
4 C1 C5 2000
 NOTE: 2NF reduces the redundant data being stored. For instance, if there are 100 students taking
the C1 course, we don’t need to store its fee as 1000 in all 100 records; instead, we can store it once
in the second table as the course fee for C1.
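The split described in the note can be sketched as two projections; deduplicating the (COURSE_NO, COURSE_FEE) pairs is what removes the redundancy:

```python
# The 2NF-violating relation from Table 3, as (STUD_NO, COURSE_NO, COURSE_FEE).
enrolment = [(1, 'C1', 1000), (2, 'C2', 1500), (1, 'C4', 2000),
             (4, 'C3', 1000), (4, 'C1', 1000), (2, 'C5', 2000)]

table1 = [(stud, course) for stud, course, fee in enrolment]      # Table 1
table2 = sorted({(course, fee) for _, course, fee in enrolment})  # Table 2, deduplicated
print(table1)  # six (STUD_NO, COURSE_NO) rows
print(table2)  # five distinct (COURSE_NO, COURSE_FEE) rows
```

Each course fee is now stored exactly once, no matter how many students enrol.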
 Example 2 – Consider following functional dependencies in relation R (A, B , C, D )
AB -> C [A and B together determine C]
BC -> D [B and C together determine D]
In the above relation, AB is the only candidate key and there is no partial dependency, i.e., no proper
subset of AB determines any non-prime attribute, so the relation is in 2NF.
For third normal form, every non-trivial functional dependency X -> Y must satisfy at least one of:
X is a super key.
Y is a prime attribute (each element of Y is part of some candidate key).
Example 1: In relation STUDENT given in Table 4, FD set: {STUD_NO -> STUD_NAME,
STUD_NO -> STUD_STATE, STUD_STATE -> STUD_COUNTRY, STUD_NO -> STUD_AGE}
Candidate Key: {STUD_NO}
For this relation in table 4, STUD_NO -> STUD_STATE and STUD_STATE -> STUD_COUNTRY
are true.
So STUD_COUNTRY is transitively dependent on STUD_NO. It violates the third normal form.
To convert it to third normal form, we decompose the relation STUDENT (STUD_NO,
STUD_NAME, STUD_PHONE, STUD_STATE, STUD_COUNTRY, STUD_AGE) as: STUDENT
(STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_AGE) and STATE_COUNTRY
(STATE, COUNTRY)
Consider relation R(A, B, C, D, E) with A -> BC, CD -> E, B -> D, E -> A. All possible candidate keys of this
relation are {A, E, CD, BC}. Every attribute appears in some candidate key, so all attributes are
prime and the relation is in 3NF.
Example 2: Find the highest normal form of a relation R(A, B, C, D, E) with FD set {BC -> D,
AC -> BE, B -> E}.
Step 1: As we can see, (AC)+ = {A, C, B, E, D}, but no proper subset of AC can determine all attributes of the
relation, so AC is a candidate key. Neither A nor C can be derived from any other attribute of the relation,
so there is only one candidate key, {AC}.
Step 2: Prime attributes are those attributes that are part of candidate key {A, C} in this example and
others will be non-prime {B, D, E} in this example.
Step 3: The relation R is in 1st normal form, as a relational DBMS does not allow multi-valued or
composite attributes. The relation is in 2nd normal form because BC -> D is in 2nd normal form (BC is
not a proper subset of the candidate key AC), AC -> BE is in 2nd normal form (AC is the candidate key), and
B -> E is in 2nd normal form (B is not a proper subset of the candidate key AC).
The relation is not in 3rd normal form because of BC -> D (neither BC is a super key nor D is a prime
attribute) and B -> E (neither B is a super key nor E is a prime attribute); to satisfy 3rd normal form,
either the LHS of an FD should be a super key or the RHS should be a prime attribute. So the highest normal form
of the relation is 2nd normal form.
For example, consider relation R(A, B, C) with A -> BC and B -> A. A and B are both super keys, so the
relation is in BCNF.
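The attribute-closure computation used in Step 1 above can be sketched directly; this is an illustrative snippet, not a library routine:

```python
def closure(attrs, fds):
    """Attribute closure under FDs given as (lhs_set, rhs_set) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# FD set from Example 2: BC -> D, AC -> BE, B -> E.
fds = [({'B', 'C'}, {'D'}), ({'A', 'C'}, {'B', 'E'}), ({'B'}, {'E'})]
print(sorted(closure({'A', 'C'}, fds)))  # ['A', 'B', 'C', 'D', 'E'] — AC is a key
print(sorted(closure({'B', 'C'}, fds)))  # ['B', 'C', 'D', 'E'] — BC is not a key
```

Running the closure on each candidate left-hand side confirms that {AC} is the only candidate key of the example.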
Third Normal Form
A relation is said to be in third normal form if it is in second normal form and there is no transitive
dependency for non-prime attributes.
For every non-trivial functional dependency X -> Y, at least one of the following must hold:
 X is a super key.
 Y is a prime attribute (that is, each element of Y is part of some candidate key).

13.a) Compare and contrast Boyce-Codd Normal Form (BCNF) with 3NF?
3NF
There shouldn’t be any transitive dependency.

There shouldn’t be any non-prime attribute that depends transitively on a candidate key.

It is not as strong as BCNF.

It has high redundancy.

The functional dependencies are already present in 1NF and 2NF.

It is easy to achieve.

It can be used to achieve lossless decomposition.

BCNF
For any relation A->B, ‘A’ should be a super key of that specific relation.

It is stronger than 3NF.

The functional dependencies are present in 1NF, 2NF and 3NF.

It has low redundancy in comparison to 3NF.

The functional dependencies may or may not be preserved.

It is difficult to achieve.

It is difficult to achieve lossless decomposition using BCNF.

13.b) Explain ACID properties and illustrate them through examples

ACID Properties

A transaction is a very small unit of a program and it may contain several low-level tasks. A
transaction in a database system must maintain Atomicity, Consistency, Isolation, and Durability
− commonly known as ACID properties − in order to ensure accuracy, completeness, and data
integrity.

 Atomicity − This property states that a transaction must be treated as an atomic unit, that
is, either all of its operations are executed or none. There must be no state in a database
where a transaction is left partially completed. States should be defined either before the
execution of the transaction or after the execution/abortion/failure of the transaction.
 Consistency − The database must remain in a consistent state after any transaction. No
transaction should have any adverse effect on the data residing in the database. If the
database was in a consistent state before the execution of a transaction, it must remain
consistent after the execution of the transaction as well.
 Durability − The database should be durable enough to hold all its latest updates even if
the system fails or restarts. If a transaction updates a chunk of data in a database and
commits, then the database will hold the modified data. If a transaction commits but the
system fails before the data could be written on to the disk, then that data will be updated
once the system springs back into action.
 Isolation − In a database system where more than one transaction is being executed
simultaneously and in parallel, the property of isolation states that each transaction will
be carried out and executed as if it were the only transaction in the system. No transaction
will affect the existence of any other transaction.

14.a) Explain different types of advanced recovery techniques


Database systems, like any other computer systems, are subject to failures, but the data stored in
them must be available as and when required. When a database fails, it must possess the
facilities for fast recovery. It must also have atomicity, i.e., either transactions are completed
successfully and committed (the effect is recorded permanently in the database) or the
transaction should have no effect on the database.
Types of Recovery Techniques in DBMS
Database recovery techniques are used in database management systems (DBMS) to restore a
database to a consistent state after a failure or error has occurred. The main goal of recovery
techniques is to ensure data integrity and consistency and prevent data loss.
There are mainly two types of recovery techniques used in DBMS
 Rollback/Undo Recovery Technique
 Commit/Redo Recovery Technique
Rollback/Undo Recovery Technique
The rollback/undo recovery technique is based on the principle of backing out or undoing the
effects of a transaction that has not been completed successfully due to a system failure or
error. This technique is accomplished by undoing the changes made by the transaction using
the log records stored in the transaction log. The transaction log contains a record of all the
transactions that have been performed on the database. The system uses the log records to undo
the changes made by the failed transaction and restore the database to its previous state.
Commit/Redo Recovery Technique
The commit/redo recovery technique is based on the principle of reapplying the changes made
by a transaction that has been completed successfully to the database. This technique is
accomplished by using the log records stored in the transaction log to redo the changes made
by the transaction that was in progress at the time of the failure or error. The system uses the
log records to reapply the changes made by the transaction and restore the database to its most
recent consistent state.
In addition to these two techniques, there is also a third technique called checkpoint recovery.
Checkpoint Recovery is a technique used to reduce the recovery time by periodically saving
the state of the database in a checkpoint file. In the event of a failure, the system can use the
checkpoint file to restore the database to the most recent consistent state before the failure
occurred, rather than going through the entire log to recover the database.
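The undo and redo ideas can be sketched with a toy log replay. This is a deliberately simplified model, not a real recovery algorithm (e.g., ARIES): the log format of ('WRITE', txn, item, old, new) and ('COMMIT', txn) records is invented for illustration.

```python
def recover(log, db):
    """Redo committed transactions, undo uncommitted ones."""
    committed = {rec[1] for rec in log if rec[0] == 'COMMIT'}
    # Redo phase: reapply new values of committed transactions, in log order.
    for rec in log:
        if rec[0] == 'WRITE' and rec[1] in committed:
            _, txn, item, old, new = rec
            db[item] = new
    # Undo phase: back out uncommitted writes, newest first.
    for rec in reversed(log):
        if rec[0] == 'WRITE' and rec[1] not in committed:
            _, txn, item, old, new = rec
            db[item] = old
    return db

log = [('WRITE', 'T1', 'A', 100, 150),
       ('COMMIT', 'T1'),
       ('WRITE', 'T2', 'B', 50, 80)]      # T2 never committed
print(recover(log, {'A': 100, 'B': 50}))  # {'A': 150, 'B': 50}
```

T1's committed update to A is redone, while T2's uncommitted update to B is rolled back to its old value.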
Overall, recovery techniques are essential to ensure data consistency and availability
in Database Management System, and each technique has its own advantages and limitations
that must be considered in the design of a recovery system.
14. b) Discuss the violations caused by each of the following: dirty read, non-repeatable read and
phantom read, with suitable examples
As we know, in order to maintain consistency in a database, it follows ACID properties. Among
these four properties (Atomicity, Consistency, Isolation, and Durability) Isolation determines how
transaction integrity is visible to other users and systems. It means that a transaction should take
place in a system in such a way that it is the only transaction that is accessing the resources in a
database system.
Isolation levels define the degree to which a transaction must be isolated from the data
modifications made by any other transaction in the database system. A transaction isolation level
is defined by the following phenomena:
 Dirty Read – A Dirty read is a situation when a transaction reads data that has not yet been
committed. For example, Let’s say transaction 1 updates a row and leaves it uncommitted,
meanwhile, Transaction 2 reads the updated row. If transaction 1 rolls back the change,
transaction 2 will have read data that is considered never to have existed.
 Non Repeatable read – Non Repeatable read occurs when a transaction reads the same row
twice and gets a different value each time. For example, suppose transaction T1 reads data.
Due to concurrency, another transaction T2 updates the same data and commit, Now if
transaction T1 rereads the same data, it will retrieve a different value.
 Phantom Read – Phantom Read occurs when two same queries are executed, but the rows
retrieved by the two, are different. For example, suppose transaction T1 retrieves a set of rows
that satisfy some search criteria. Now, Transaction T2 generates some new rows that match
the search criteria for transaction T1. If transaction T1 re-executes the statement that reads the
rows, it gets a different set of rows this time.
Based on these phenomena, the SQL standard defines four isolation levels: Read Uncommitted,
Read Committed, Repeatable Read, and Serializable.

15.b)Discuss two phase locking protocol and strict two phase locking protocols

Locking and unlocking of the database should be done in such a way that there is no
inconsistency, deadlock, and no starvation.

2PL locking protocol


Every transaction locks and unlocks data items in two different phases.
 Growing phase − All locks are acquired in this phase; none are released.
 Shrinking phase − No new locks are acquired in this phase; existing locks are released.
In the growing phase the transaction reaches a point where all the locks it may need have been
acquired. This point is called the LOCK POINT.
After the lock point has been reached, the transaction enters the shrinking phase.

Types
Two phase locking is of two types −

Strict two phase locking protocol


A transaction can release a shared lock after the lock point, but it cannot release any exclusive
lock until the transaction commits. This protocol produces cascadeless schedules.
Cascading schedule: in such a schedule one transaction is dependent on another, so if
one has to roll back, the other has to roll back as well.

Rigorous two phase locking protocol


A transaction cannot release any lock either shared or exclusive until it commits.
The 2PL protocol guarantees serializability, but cannot guarantee that deadlock will not happen.

Example
Let T1 and T2 be two transactions, T1 = A + B and T2 = B + A.

T1           T2
---------    ---------
Lock-X(A)    Lock-X(B)
Read A       Read B
Lock-X(B)    Lock-X(A)

Here,
Lock-X(B): T1 cannot execute Lock-X(B) since B is locked by T2.
Lock-X(A): T2 cannot execute Lock-X(A) since A is locked by T1.
In the above situation T1 waits for B and T2 waits for A, and the waiting never ends. Neither
transaction can proceed unless one of them releases a lock voluntarily. This situation is
called a deadlock.
The wait-for graph for this situation contains the edges T1 → T2 and T2 → T1, which form a cycle.

Wait for graph: It is used in the deadlock detection method, creating a node for each transaction,
creating an edge Ti to Tj, if Ti is waiting to lock an item locked by Tj. A cycle in WFG indicates a
deadlock has occurred. WFG is created at regular intervals.
15.a) Check whether the following schedule is conflict serializable or not; if it is conflict serializable,
find the serializability order

T1 T2 T3
R(A)
R(B)
R(B)
W(B)
W(A)
W(A)
R(A)

W(A)
Soln:
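The column layout of the schedule above was lost in this copy, so the transaction assignment of each operation cannot be recovered here. The general method, though, can be sketched: build the precedence graph from conflicting operations and topologically sort it; the schedule is conflict serializable iff the graph is acyclic, and the topological order is a serializability order. The sample schedule at the end is hypothetical.

```python
def precedence_graph(schedule):
    """Edge Ti -> Tj when an operation of Ti conflicts with a later op of Tj."""
    edges = set()
    for i, (ti, op1, x) in enumerate(schedule):
        for tj, op2, y in schedule[i + 1:]:
            if ti != tj and x == y and 'W' in (op1, op2):
                edges.add((ti, tj))
    return edges

def serial_order(schedule):
    """Topological order of the precedence graph, or None if it has a cycle."""
    edges = precedence_graph(schedule)
    nodes = {t for t, _, _ in schedule}
    order = []
    while nodes:
        src = next((n for n in nodes
                    if not any(b == n and a in nodes for a, b in edges)), None)
        if src is None:
            return None  # cycle: not conflict serializable
        order.append(src)
        nodes.discard(src)
    return order

# Hypothetical schedule, (transaction, operation, item) in execution order:
s = [('T1', 'R', 'A'), ('T2', 'W', 'A'), ('T2', 'R', 'B'), ('T1', 'W', 'B')]
print(serial_order(s))  # None: T1 -> T2 and T2 -> T1 form a cycle
```

Applying the same functions to the schedule of the question (once its columns are restored) yields either the serializability order or None.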

15.b)Consider the following two transactions :

T1: read(A);
    read(B);
    if A = 0 then B := B + 1;
    write(B)

T2: read(B);
    read(A);
    if B = 0 then A := A + 1;
    write(A)
Add lock and unlock instructions to transactions T1 and T2, so that they observe two phase locking
protocol. Can the execution of these transactions result in deadlock?

Soln:

16.a) A software contract and consultancy firm maintains details of all various projects in which
its employees are currently involved. These details comprise
1.Employee number
2.Employee name
3.Date of Birth
4.Department code
5. Department Name
6.Project code
7.Project description
8.Project Supervisor
Assume the following :
 Each Employee number is unique
 Each department has a single department code
 Each project has a single code and supervisor
 Each employee may work on one or more projects
 Employee names need not necessarily be unique
 Project code, project description and project supervisor are repeating fields.
Normalise this data to third normal form

16.b) Draw an E-R diagram for a restaurant menu ordering system that will facilitate food item
ordering and services within a restaurant.
