DBMS END TERM Paper and Solution 2020
SECTION A
DDL does not use a WHERE clause in its statements, while DML uses a WHERE clause in its statements.
c. What are different Integrity Constraints?
Ans: TYPES OF INTEGRITY CONSTRAINTS
The various types of integrity constraints are:
1. Domain Integrity: Domain integrity defines the valid set of values for an attribute.
2. Entity Integrity Constraint: This rule states that in any relation, no attribute of the
primary key can be null.
3. Referential Integrity Constraint : It states that if a foreign key exists in a relation then either the
foreign key value must match a primary key value of some tuple in its home relation or the
foreign key value must be null.
4. Key Constraints: A Key Constraint is a statement that a certain minimal subset of the fields of a
relation is a unique identifier for a tuple.
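The four constraint types above can be tried out with a minimal sketch, assuming Python's built-in sqlite3 module. The tables dept and emp, their columns, and the sample values are hypothetical and only serve the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only after this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE dept (
    dept_id TEXT NOT NULL PRIMARY KEY,              -- entity integrity + key constraint
    name    TEXT NOT NULL
);
CREATE TABLE emp (
    emp_id  TEXT NOT NULL PRIMARY KEY,
    age     INTEGER CHECK (age BETWEEN 18 AND 65),  -- domain integrity
    dept_id TEXT REFERENCES dept(dept_id)           -- referential integrity (NULL allowed)
);
INSERT INTO dept VALUES ('D1', 'R&D');
""")

def violates(sql):
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

results = {
    "domain": violates("INSERT INTO emp VALUES ('E1', 12, 'D1')"),       # age outside domain
    "entity": violates("INSERT INTO dept VALUES (NULL, 'Sales')"),       # null primary key
    "key": violates("INSERT INTO dept VALUES ('D1', 'Sales')"),          # duplicate key value
    "referential": violates("INSERT INTO emp VALUES ('E2', 30, 'D9')"),  # no such dept
    "null_fk_rejected": violates("INSERT INTO emp VALUES ('E3', 30, NULL)"),
}
```

Every entry except null_fk_rejected should come out True: a null foreign key is accepted, which matches the referential integrity rule stated above.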
• Atomicity
• Consistency
• Isolation
• Durability
SECTION B
Answer 2:
a)
What is ER Diagram? Explain different Components of an ER Diagram with their Notation. Also make an
ER Diagram for Employee Project Management System.
Ans: ER Diagram stands for Entity Relationship Diagram. Also known as an ERD, it is a diagram that displays the
relationships between entity sets stored in a database. In other words, ER diagrams help to explain the logical
structure of databases. ER diagrams are created based on three basic concepts: entities, attributes and
relationships.
ER Diagram Symbols
b). What is Relational Algebra? Explain Different Operations of Relational Algebra with Example.
Ans: Relational algebra is a widely used procedural query language. It takes instances of relations
as input and produces instances of relations as output, using various operations to do so.
Relational algebra operations are performed recursively on a relation: the output of one operation can be
the input of another.
Unary Relational Operations
• SELECT (symbol: σ)
• PROJECT (symbol: π)
• RENAME (symbol: ρ)
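Relational Algebra Operations From Set Theory
• UNION (symbol: ∪)
• INTERSECTION (symbol: ∩)
• DIFFERENCE (symbol: -)
• CARTESIAN PRODUCT (symbol: ×)
Binary Relational Operations
• JOIN
• DIVISION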
SELECT (σ)
The SELECT operation selects the subset of tuples that satisfy a given selection condition (predicate).
It is denoted by the sigma (σ) symbol.
Projection(π)
Projection keeps only the attributes named in the projection list and eliminates all other attributes of the
input relation. The result is a relation that contains a vertical subset of the input relation.
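SELECT and PROJECT can be illustrated with a small sketch, assuming a relation is modelled as a list of Python dicts. The employees data and the helper names select and project are hypothetical.

```python
# A relation is modelled as a list of dicts (hypothetical EMPLOYEE data).
employees = [
    {"id": 1, "name": "Asha", "dept": "IT"},
    {"id": 2, "name": "Ravi", "dept": "HR"},
    {"id": 3, "name": "Meena", "dept": "IT"},
]

def select(relation, predicate):
    """sigma: keep the tuples that satisfy the predicate (horizontal subset)."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """pi: keep only the listed attributes and drop duplicate tuples (vertical subset)."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

it_staff = select(employees, lambda r: r["dept"] == "IT")   # sigma dept = 'IT' (EMPLOYEE)
depts = project(employees, ["dept"])                        # pi dept (EMPLOYEE)
```

Here it_staff holds the two IT tuples, and depts holds one tuple per distinct department because projection removes duplicates.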
Rename (ρ)
Union (∪)
UNION is denoted by the ∪ symbol. It includes all tuples that are in relation A or in relation B, and it
eliminates duplicate tuples. So, set A UNION set B would be expressed as:
A ∪ B
Set Difference (-)
Set difference is denoted by the - symbol. The result of A - B is a relation that includes all tuples that are
in A but not in B.
Example
A - B
Intersection
An intersection is denoted by the symbol ∩
A ∩ B
This defines a relation consisting of all tuples that are in both A and B. A and B must be
union-compatible.
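The three set operations map directly onto Python's set operators. This is a minimal sketch, assuming relations are modelled as sets of tuples; A and B are hypothetical and union-compatible (same number and order of attributes).

```python
# Relations as sets of tuples; both have the attributes (id, code).
A = {(1, "x"), (2, "y"), (3, "z")}
B = {(2, "y"), (4, "w")}

union = A | B          # tuples in A or in B, duplicates removed
difference = A - B     # tuples in A but not in B
intersection = A & B   # tuples in both A and B
```

Because a set cannot hold the same tuple twice, duplicate elimination comes for free in this representation.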
Cartesian Product (×)
Cartesian Product in DBMS is an operation used to merge columns from two relations. Generally, a
cartesian product is rarely a meaningful operation when it is performed alone. However, it becomes
meaningful when it is followed by other operations. It is also called Cross Product or Cross Join.
Example
σ column 2 = '1' (A × B)
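A cross product followed by a selection can be sketched as follows, assuming relations are lists of tuples. The contents of A and B are hypothetical; only the condition column 2 = '1' comes from the example.

```python
from itertools import product

# Hypothetical relations: A has (column 1, column 2), B has (column 3).
A = [("a", "1"), ("b", "2")]
B = [("p",), ("q",)]

# A x B: every tuple of A concatenated with every tuple of B.
cross = [a + b for a, b in product(A, B)]

# sigma column 2 = '1' (A x B): keep rows whose second column equals '1'.
selected = [row for row in cross if row[1] == "1"]
```

The product has len(A) * len(B) rows, and the selection then keeps only the rows that are of interest.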
c).(i) What is highest normal form of the Relation R(W,X,Y,Z) with the set F= { WY → XZ, X →Y }
Ans: WY and WX are the candidate keys of this relation, so W, X and Y are the prime attributes. For 3NF, the left
side of each FD must be a super key or the right side must consist of prime attributes. WY is a super key, and in
X → Y the attribute Y is prime, so the relation is in 3NF.
For BCNF, the left side of each FD must be a super key. X is not a super key, so the relation is not in BCNF.
So the highest normal form is 3NF.
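The keys and the normal-form checks can be verified mechanically with attribute closures. This is a brute-force sketch under the assumption that attributes are single letters; the helper names closure and candidate_keys are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(schema, fds):
    """Minimal attribute sets whose closure is the whole schema (brute force)."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            s = set(combo)
            # Skip supersets of keys already found: they are not minimal.
            if closure(s, fds) == schema and not any(k <= s for k in keys):
                keys.append(s)
    return keys

R = set("WXYZ")
F = [(frozenset("WY"), frozenset("XZ")), (frozenset("X"), frozenset("Y"))]

keys = candidate_keys(R, F)
prime = set().union(*keys)

def is_superkey(lhs):
    return closure(lhs, F) == R

in_bcnf = all(is_superkey(lhs) for lhs, rhs in F)
in_3nf = all(is_superkey(lhs) or (rhs - lhs) <= prime for lhs, rhs in F)
```

With the given F, the sketch finds the keys WX and WY, reports 3NF as satisfied and BCNF as violated, which agrees with the answer above.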
d). Explain the method of testing the serializability. Consider the schedule
S1 and S2 given below
S1: R1(A),R2(B),W1(A),W2(B)
S2: R2(B),R1(A),W2(B), W1(A)
Check whether the given schedules are conflict equivalent or not?
Ans: In both schedules no two operations conflict, because transaction T1 operates only on A and
transaction T2 operates only on B. S2 can therefore be obtained from S1 by swapping adjacent non-conflicting
operations, so the two schedules are conflict equivalent. Both schedules are also conflict serializable.
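Conflict equivalence can be checked by comparing the order of every conflicting pair in the two schedules. This is a minimal sketch, assuming each transaction performs a given operation on a given item at most once; the function names are hypothetical.

```python
def parse(schedule):
    """Turn 'R1(A),W2(B)' into [('R', 1, 'A'), ('W', 2, 'B')]."""
    ops = []
    for token in schedule.replace(" ", "").split(","):
        paren = token.index("(")
        ops.append((token[0].upper(), int(token[1:paren]), token[paren + 1:-1]))
    return ops

def conflict_pairs(ops):
    """Ordered pairs (earlier, later) of conflicting operations in a schedule."""
    pairs = set()
    for i, (k1, t1, x1) in enumerate(ops):
        for k2, t2, x2 in ops[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if t1 != t2 and x1 == x2 and "W" in (k1, k2):
                pairs.add(((k1, t1, x1), (k2, t2, x2)))
    return pairs

def conflict_equivalent(s1, s2):
    """Same operations, and every conflicting pair appears in the same order."""
    a, b = parse(s1), parse(s2)
    return sorted(a) == sorted(b) and conflict_pairs(a) == conflict_pairs(b)

same = conflict_equivalent("R1(A),R2(B),W1(A),W2(B)", "R2(B),R1(A),W2(B), W1(A)")
different = conflict_equivalent("R1(A),W2(A)", "W2(A),R1(A)")
```

For S1 and S2 the set of conflicting pairs is empty in both schedules, so they are reported as conflict equivalent; the second call shows a pair of schedules that is not.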
Ans: The validation-based protocol is also known as the optimistic concurrency control technique. In this
protocol, a transaction is executed in the following three phases:
1. Read phase: In this phase, the transaction T reads the values of the various data items and stores
them in temporary local variables. It performs all its write operations on these temporary variables
without updating the actual database.
2. Validation phase: In this phase, the temporary variable values are validated against the actual
data to see whether serializability would be violated.
3. Write phase: If the transaction passes validation, the temporary results are written
to the database; otherwise the transaction is rolled back.
Validation (Ti): It contains the time when Ti finishes its read phase and starts its validation phase.
o This protocol is used to determine the time stamp for the transaction for serialization using the
time stamp of the validation phase, as it is the actual phase which determines if the transaction
will commit or rollback.
o Hence TS(T) = validation(T).
o The serializability is determined during the validation process. It can't be decided in advance.
o While transactions execute, the protocol allows a greater degree of concurrency, and it works best when
conflicts are few.
o In such workloads, transactions suffer fewer rollbacks.
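The validation test itself can be sketched as below. This follows the common textbook formulation, which is an assumption here: besides Validation(T), it uses a start time and a finish time per transaction, and neither appears in the text above. The Txn class and its field names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Txn:
    name: str
    start: int                     # assumed: time the read phase began
    validation: int                # Validation(T), used as TS(T)
    finish: float = float("inf")   # assumed: time the write phase ended
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)

def validate(tj, others):
    """Return True if tj may enter its write phase, False if it must roll back."""
    for ti in others:
        if ti.validation >= tj.validation:
            continue    # only transactions with TS(Ti) < TS(Tj) are checked
        if ti.finish < tj.start:
            continue    # Ti finished before Tj started: no overlap at all
        if ti.finish < tj.validation and not (ti.write_set & tj.read_set):
            continue    # Ti wrote nothing that Tj read
        return False    # serializability could be violated
    return True

t1 = Txn("T1", start=1, validation=3, finish=5, read_set={"A"}, write_set={"A"})
t2 = Txn("T2", start=2, validation=6, read_set={"A"})
t3 = Txn("T3", start=2, validation=6, read_set={"B"})
t2_ok = validate(t2, [t1])
t3_ok = validate(t3, [t1])
```

T2 read item A while T1 was still writing it, so T2 fails validation; T3 read only B and passes.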
SECTION C
Answer 3:
a) What is Data Abstraction? How the Data Abstraction is achieved in DBMS?
Ans: Data abstraction is the process of hiding unwanted or irrelevant details from the end user. It provides
different views of the data and helps achieve data independence, which also enhances the security of the data.
The database systems consist of complicated data structures and relations. For users to access the data
easily, these complications are kept hidden, and only the relevant part of the database is made accessible
to the users through data abstraction.
Levels of abstraction for DBMS
Database systems include complex data structures. To make data retrieval efficient and to reduce
complexity for users, developers use levels of abstraction that hide irrelevant details from the users.
Levels of abstraction simplify database design.
There are mainly three levels of abstraction in a DBMS: the physical (internal) level, the logical
(conceptual) level, and the view (external) level.
Specialization –
In specialization, an entity is divided into sub-entities based on their characteristics. It is a top-down
approach in which a higher-level entity is specialized into two or more lower-level entities. For example, the
EMPLOYEE entity in an employee management system can be specialized into DEVELOPER, TESTER etc.,
as shown in Figure 2. In this case, common attributes like E_NAME, E_SAL etc. become part of the higher
entity (EMPLOYEE), and specialized attributes like TES_TYPE become part of the specialized entity (TESTER).
Aggregation –
An ER diagram is not capable of representing a relationship between an entity and a relationship, which
may be required in some scenarios. In those cases, a relationship with its corresponding entities is
aggregated into a higher-level entity. Aggregation is an abstraction through which we can represent
relationships as higher-level entity sets.
For example, an employee working for a project may require some machinery. So, a REQUIRE relationship is
needed between the relationship WORKS_FOR and the entity MACHINERY. Using aggregation, the WORKS_FOR
relationship with its entities EMPLOYEE and PROJECT is aggregated into a single entity, and the relationship
REQUIRE is created between the aggregated entity and MACHINERY.
Answer 4:
a) What is Aggregate Function in SQL? Write SQL query for different Aggregate Function.
Ans: An aggregate function performs a calculation on a set of values, and returns a single value. Except
for COUNT(*), aggregate functions ignore null values. Aggregate functions are often used with the GROUP
BY clause of the SELECT statement.
1. COUNT FUNCTION
o The COUNT function is used to count the number of rows in a database table. It works on both
numeric and non-numeric data types.
o COUNT(*) returns the count of all the rows in the specified table, including duplicates and NULLs.
Example:
SELECT COUNT(*)
FROM PRODUCT_MAST;
2. SUM Function
The SUM function is used to calculate the sum of the values in the selected column. It works on numeric fields only.
Syntax: SUM([ALL | DISTINCT] expression)
Example:
SELECT SUM(COST)
FROM PRODUCT_MAST;
3. AVG function
The AVG function is used to calculate the average value of a numeric column. It returns the
average of all non-NULL values.
Syntax: AVG([ALL | DISTINCT] expression)
Example:
SELECT AVG(COST)
FROM PRODUCT_MAST;
4. MAX Function
The MAX function is used to find the maximum value of a certain column. It determines the largest
value of all selected values of a column.
Syntax: MAX([ALL | DISTINCT] expression)
Example:
SELECT MAX(RATE)
FROM PRODUCT_MAST;
5. MIN Function
The MIN function is used to find the minimum value of a certain column. It determines the smallest
value of all selected values of a column.
Syntax: MIN([ALL | DISTINCT] expression)
Example:
SELECT MIN(RATE)
FROM PRODUCT_MAST;
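The five aggregate functions can be run together on a small in-memory table. This sketch assumes Python's built-in sqlite3 module; the column layout of PRODUCT_MAST and its rows are hypothetical, and one COST is deliberately NULL to show how nulls are treated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PRODUCT_MAST (PRODUCT TEXT, COMPANY TEXT, RATE INTEGER, COST INTEGER);
INSERT INTO PRODUCT_MAST VALUES
    ('Item1', 'Com1', 10, 20),
    ('Item2', 'Com2', 20, 50),
    ('Item3', 'Com1', 30, NULL),   -- NULL cost: ignored by SUM, AVG and COUNT(COST)
    ('Item4', 'Com2', 40, 80);
""")

# All aggregates over the whole table collapse to a single row.
row = conn.execute("""
    SELECT COUNT(*), COUNT(COST), SUM(COST), AVG(COST), MAX(RATE), MIN(RATE)
    FROM PRODUCT_MAST
""").fetchone()

# With GROUP BY, each aggregate is computed once per group.
per_company = conn.execute("""
    SELECT COMPANY, SUM(RATE)
    FROM PRODUCT_MAST
    GROUP BY COMPANY
    ORDER BY COMPANY
""").fetchall()
```

COUNT(*) counts all four rows, while COUNT(COST) and AVG(COST) see only the three non-null costs.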
Answer 5:
a) What is Functional Dependency? Explain the procedure of Calculating the Canonical Cover of a
given Functional Dependency Set with suitable example.
Ans: A functional dependency X → Y holds on a relation if any two tuples that agree on the attributes X
also agree on the attributes Y.
Whenever a user updates the database, the system must check whether any functional dependency
is violated in the process. If the new database state violates a dependency, the system must roll back
the update. Checking a huge set of functional dependencies adds unnecessary computation time.
This is where the canonical cover comes into play.
A canonical cover of a set of functional dependencies F is a simplified set of functional dependencies
that has the same closure as the original set F.
Extraneous attributes: An attribute of a functional dependency is said to be extraneous if we can
remove it without changing the closure of the set of functional dependencies.
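One way to compute a canonical cover is sketched below: split right-hand sides, drop extraneous left-hand attributes, drop redundant dependencies, then merge equal left-hand sides. The sample set F = {A→BC, B→C, A→B, AB→C} is a hypothetical example, not taken from the paper, and the function names are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def canonical_cover(fds):
    # Step 1: split right-hand sides into single attributes.
    work = [(frozenset(lhs), frozenset([a])) for lhs, rhs in fds for a in sorted(rhs)]
    # Step 2: remove extraneous attributes from left-hand sides.
    for i, (lhs, rhs) in enumerate(work):
        for attr in sorted(lhs):
            reduced = work[i][0] - {attr}
            if reduced and rhs <= closure(reduced, work):
                work[i] = (reduced, rhs)
    work = list(dict.fromkeys(work))          # drop exact duplicates, keep order
    # Step 3: remove dependencies that the remaining ones already imply.
    for fd in list(work):
        rest = [g for g in work if g != fd]
        if fd[1] <= closure(fd[0], rest):
            work = rest
    # Step 4: merge dependencies with the same left-hand side (union rule).
    merged = {}
    for lhs, rhs in work:
        merged[lhs] = merged.get(lhs, frozenset()) | rhs
    return merged

F = [("A", "BC"), ("B", "C"), ("A", "B"), ("AB", "C")]
cover = canonical_cover(F)
```

For this F, A is extraneous in AB→C and A→C follows from A→B and B→C, so the cover reduces to A→B and B→C. A canonical cover is not always unique; a different processing order can give a different but equivalent result.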
b)
i) Consider the relation R(a,b,c,d) with Set F={a→c, b→d}. Decompose this relation in 2 NF.
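Solution: ab is the only candidate key of the given relation. A relation is in 2NF if it has no partial
dependency, i.e., no non-prime attribute (an attribute that is not part of any candidate key) depends on a
proper subset of a candidate key of the table.
In the given relation both FDs are partial dependencies, so the 2NF decomposition is R1(a, c) with a→c and
R2(b, d) with b→d. A third relation R3(a, b) holding the candidate key is also needed; without it R1 and R2
share no attribute, and the original relation cannot be rebuilt by a join.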
ii) Explain the Loss Less Decomposition with example.
Lossless join decomposition is a decomposition of a relation R into relations R1 and R2 such that the
natural join of the two smaller relations returns the original relation. This is effective in
removing redundancy from databases while preserving the original data.
In other words, lossless decomposition makes it feasible to reconstruct the relation R from the
decomposed tables R1 and R2 by using joins.
In a lossless decomposition, the common attributes of R1 and R2 must form a candidate key or super key
of R1, of R2, or of both.
Decomposition of a relation R into R1 and R2 is a lossless-join decomposition if at least one of the
following functional dependencies is in F+ (the closure of the functional dependencies):
R1 ∩ R2 → R1
OR
R1 ∩ R2 → R2
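The condition above can be tested with an attribute closure. This sketch reuses R(a, b, c, d) with F = {a→c, b→d} from part (i); the function names are hypothetical, and the particular splits tried are chosen only for illustration.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_lossless(r1, r2, fds):
    """True if (R1 ∩ R2) -> R1 or (R1 ∩ R2) -> R2 follows from fds."""
    common_closure = closure(r1 & r2, fds)
    return r1 <= common_closure or r2 <= common_closure

F = [(frozenset("a"), frozenset("c")), (frozenset("b"), frozenset("d"))]

no_common = is_lossless(set("ac"), set("bd"), F)    # nothing shared between the parts
first_split = is_lossless(set("ac"), set("abd"), F) # shared attribute a determines (a, c)
second_split = is_lossless(set("bd"), set("ab"), F) # shared attribute b determines (b, d)
```

Splitting R into (a, c) and (b, d) alone is lossy because the parts share no attribute, while splitting off (a, c) first and then (b, d) from the remainder keeps every step lossless.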
Ans 6:
a) What is Conflict Serializable Schedule? Check the given Schedule S1
is Conflict Serializable or not?
S1: R1(X), R2(X),R2(Y),W2(Y),R1(Y),W1(X)
Ans: A schedule is called conflict serializable if it can be transformed into a serial schedule by swapping
non-conflicting operations.
Conflicting operations: Two operations are said to be conflicting if all of the following conditions hold:
• They belong to different transactions
• They operate on the same data item
• At least one of them is a write operation
Schedule S1
T1        T2
---------------------
r1(X)
          r2(X)
          r2(Y)
          w2(Y)
r1(Y)
w1(X)
The conflicting pairs are r2(X) before w1(X) and w2(Y) before r1(Y). Both give the precedence edge T2 → T1,
so the precedence graph has no cycle. The schedule is therefore conflict serializable and conflict
equivalent to the serial schedule T2, T1.
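The test can be automated by building the precedence graph and looking for a topological order. This is a minimal sketch with hypothetical function names; it returns an equivalent serial order, or None when the graph has a cycle.

```python
def parse(schedule):
    """Turn 'R1(X),W2(Y)' into [('R', 1, 'X'), ('W', 2, 'Y')]."""
    ops = []
    for token in schedule.replace(" ", "").split(","):
        paren = token.index("(")
        ops.append((token[0].upper(), int(token[1:paren]), token[paren + 1:-1]))
    return ops

def precedence_graph(ops):
    """Edge (Ti, Tj) for every conflicting pair where Ti's operation comes first."""
    edges = set()
    for i, (k1, t1, x1) in enumerate(ops):
        for k2, t2, x2 in ops[i + 1:]:
            if t1 != t2 and x1 == x2 and "W" in (k1, k2):
                edges.add((t1, t2))
    return edges

def serial_order(ops):
    """Topological order of the precedence graph, or None if it contains a cycle."""
    nodes = {t for _, t, _ in ops}
    edges = precedence_graph(ops)
    order = []
    while nodes:
        # A transaction is free when no remaining transaction must precede it.
        free = sorted(n for n in nodes if not any(v == n and u in nodes for u, v in edges))
        if not free:
            return None
        order.append(free[0])
        nodes.remove(free[0])
    return order

s1 = parse("R1(X), R2(X),R2(Y),W2(Y),R1(Y),W1(X)")
edges = precedence_graph(s1)
order = serial_order(s1)
cyclic = serial_order(parse("R1(X),W2(X),W1(X)"))
```

For S1 the only edge is T2 → T1, so the serial order is T2 followed by T1. The last call shows a schedule whose graph contains both T1 → T2 and T2 → T1, so no serial order exists.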
Deadlock in DBMS
Deadlock Avoidance –
When a database is stuck in a deadlock, it is always better to avoid the deadlock than to restart
or abort the database. The deadlock avoidance method is suitable for smaller databases, whereas the
deadlock prevention method is suitable for larger databases.
One method of avoiding deadlock is to use application-consistent logic. In the example given above,
transactions that access Students and Grades should always access the tables in the same order. In this
way, in the scenario described above, transaction T1 simply waits for transaction T2 to release the lock
on Grades before it begins. When transaction T2 releases the lock, transaction T1 can proceed freely.
Another method of avoiding deadlock is to apply both a row-level locking mechanism and the READ
COMMITTED isolation level. However, this does not guarantee that deadlocks are removed completely.
Deadlock Detection –
When a transaction waits indefinitely to obtain a lock, the database management system should detect
whether the transaction is involved in a deadlock or not.
The wait-for graph is one of the methods for detecting a deadlock situation. This method is suitable for
smaller databases. In this method, a graph is drawn based on the transactions and their locks on the
resources. If the graph has a closed loop (a cycle), then there is a deadlock.
For the above-mentioned scenario, the Wait-For graph is drawn below
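Detecting a cycle in a wait-for graph can be sketched with a depth-first search. The graph is assumed to be a dict mapping each transaction to the set of transactions it waits for; the function name and the sample graphs are hypothetical.

```python
def has_deadlock(wait_for):
    """Return True if the wait-for graph contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the current path, finished
    color = {t: WHITE for t in wait_for}

    def visit(t):
        color[t] = GREY
        for u in wait_for.get(t, ()):
            state = color.get(u, WHITE)
            if state == GREY:
                return True               # edge back into the current path: a cycle
            if state == WHITE and visit(u):
                return True
        color[t] = BLACK
        return False

    return any(color[t] == WHITE and visit(t) for t in list(wait_for))

# T1 waits for T2 and T2 waits for T1: a closed loop.
deadlocked = has_deadlock({"T1": {"T2"}, "T2": {"T1"}})
# T1 waits for T2, but T2 waits for nobody.
not_deadlocked = has_deadlock({"T1": {"T2"}, "T2": set()})
```

A DBMS would typically run such a check periodically and abort one transaction on the cycle to break the deadlock.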
Deadlock prevention –
For a large database, the deadlock prevention method is suitable. A deadlock can be prevented if the
resources are allocated in such a way that a deadlock never occurs. The DBMS analyzes the operations
to see whether they can create a deadlock situation. If they can, the transaction is never allowed to
execute.
The deadlock prevention mechanism proposes two schemes:
• Wait-Die Scheme –
In this scheme, if a transaction requests a resource that is locked by another transaction, the
DBMS compares the timestamps of the two transactions and allows only an older transaction to wait until
the resource is available.
Suppose there are two transactions T1 and T2, and let the timestamp of any transaction T be TS(T).
If T1 requests a resource that is locked by T2, the DBMS performs the following checks:
If TS(T1) < TS(T2), T1 is the older transaction, so T1 is allowed to wait until the resource is
available. That is, if a younger transaction has locked a resource and an older transaction requests it,
the older transaction waits.
If TS(T1) > TS(T2), T1 is the younger transaction, so T1 is killed and restarted later with a small
random delay but with the same timestamp. That is, if an older transaction holds a resource and a
younger transaction requests it, the younger transaction is killed and restarted with the same
timestamp.
This scheme allows the older transaction to wait but kills the younger one.
• Wound Wait Scheme –
In this scheme, if an older transaction requests a resource held by a younger transaction, the
older transaction forces the younger transaction to abort and release the resource. The
younger transaction is restarted after a small delay but with the same timestamp. If the younger
transaction requests a resource that is held by an older one, the younger transaction is
asked to wait until the older one releases it.
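The two schemes differ only in what happens for each ordering of the timestamps, which a pair of small functions makes explicit. This is a sketch with hypothetical function names; a smaller timestamp means an older transaction.

```python
def wait_die(ts_requester, ts_holder):
    """Wait-Die: an older requester waits, a younger requester dies."""
    return "requester waits" if ts_requester < ts_holder else "requester dies"

def wound_wait(ts_requester, ts_holder):
    """Wound-Wait: an older requester wounds the holder, a younger requester waits."""
    return "holder is aborted" if ts_requester < ts_holder else "requester waits"

decisions = {
    "wait_die, older requests": wait_die(1, 2),
    "wait_die, younger requests": wait_die(2, 1),
    "wound_wait, older requests": wound_wait(1, 2),
    "wound_wait, younger requests": wound_wait(2, 1),
}
```

In both schemes the younger transaction is the one that gets aborted, and it keeps its original timestamp on restart so that it eventually becomes the oldest and cannot starve.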
Answer 7:
Transaction rollback :
• In this scheme, we roll back a failed transaction by using the log.
• The system scans the log backward; for every log record of the failed transaction, it restores
the data item to its old value.
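The backward scan can be sketched as follows. The log record layout used here, ("start", T), ("update", T, item, old value, new value) and ("commit", T), is an assumption made for the illustration, as are the function name and the sample data.

```python
def rollback(db, log, txn):
    """Undo txn by scanning the log backward and restoring old values."""
    for record in reversed(log):
        if record[0] == "update" and record[1] == txn:
            _, _, item, old_value, _new_value = record
            db[item] = old_value          # restore the value from before the update
        elif record == ("start", txn):
            break                         # nothing older belongs to this transaction
    return db

db = {"A": 950, "B": 2050, "C": 600}
log = [
    ("start", "T1"),
    ("update", "T1", "A", 1000, 950),
    ("start", "T2"),
    ("update", "T2", "C", 700, 600),
    ("update", "T1", "B", 2000, 2050),
]
restored = rollback(db, log, "T1")
```

Only the records of T1 are undone; the update made by T2 on C is left untouched.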
Checkpoints :
• Checkpointing is the process of saving a snapshot of the application's state so that it can restart
from that point in case of failure.
• A checkpoint is a point in time at which records are written to the database from the buffers.
• Checkpoints shorten the recovery process.
• When execution reaches a checkpoint, the transactions' updates are written to the database, and
the log records up to that point are removed from the log file. The log file is then updated
with the new transaction steps until the next checkpoint, and so on.
• The checkpoint is used to declare the point before which the DBMS was in a consistent
state and all the transactions were committed.
To shorten recovery, most DBMSs use the concept of checkpoints.
• In this scheme, we use checkpoints to reduce the number of log records that the system
must scan when it recovers from a crash.
• In a concurrent transaction processing system, we require that the checkpoint log record be
of the form <checkpoint L>, where ‘L’ is a list of transactions active at the time of the
checkpoint.
• A fuzzy checkpoint is a checkpoint where transactions are allowed to perform updates even
while buffer blocks are being written out.
Restart recovery :
• When the system recovers from a crash, it constructs two lists.
• The undo-list consists of transactions to be undone, and the redo-list consists of transactions
to be redone.
• The system constructs the two lists as follows: Initially, they are both empty. The system
scans the log backward, examining each record, until it finds the first <checkpoint> record.
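The construction of the two lists can be sketched as below. The text stops at the checkpoint record, so the handling of commit, start and checkpoint records follows the usual textbook procedure and should be read as an assumption; the record layout and the function name are hypothetical.

```python
def build_lists(log):
    """Scan the log backward to the most recent checkpoint; return (undo, redo)."""
    undo, redo = [], []
    for record in reversed(log):
        kind = record[0]
        if kind == "commit":
            redo.append(record[1])            # committed: its updates must be redone
        elif kind == "start" and record[1] not in redo:
            undo.append(record[1])            # started but never committed
        elif kind == "checkpoint":
            # Transactions active at the checkpoint that did not commit are undone too.
            for t in record[1]:
                if t not in redo and t not in undo:
                    undo.append(t)
            break
    return undo, redo

log = [
    ("start", "T1"),
    ("start", "T2"),
    ("checkpoint", ["T1", "T2"]),
    ("commit", "T1"),
    ("start", "T3"),
    ("commit", "T3"),
    ("start", "T4"),
]
undo_list, redo_list = build_lists(log)
```

In this hypothetical log, T1 and T3 committed after the checkpoint and land on the redo-list, while T2 and T4 never committed and land on the undo-list.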