2020_SOLVED_DBMS

1. Design an E-R diagram for an airline reservation system.

The database
must keep track of customers and their reservations, flights and their
status, seat assignments on individual flights, and the schedule and routing
of future flights. Your design should include a list of constraints, including
primary-key and foreign-key constraints.
Ans=>

Primary keys:
Customer -> c_id
Flight -> F_id
Status -> St_id
Seats -> Seat_no
Schedule -> S_id
Route -> R_id
Foreign key: Reserves(seat_no) references Seats(Seat_no)
2. (a) Define instance and schema with one example.
Ans=> Instance: The data stored in a database at a particular moment in time is called an instance of
the database. The database schema defines the variable declarations in the tables that belong to a
particular database; the values of these variables at a given moment are the instance of that database.
For example, let's say we have a single table student in the database. If the table has 100 records
today, then today's instance of the database has 100 records.
Schema: The design of a database is called its schema. There are three levels of schema: physical
schema, logical schema and view schema.
For example, a schema might describe three tables, Course, Student and Section, and the
relationships between them.

(b) Write a short note on Data Independence.


Ans=> Data independence is the property of a DBMS that lets you change the database schema at one
level of a database system without having to change the schema at the next higher level. Data
independence helps keep data separate from the programs that use it.
Types of Data Independence
1. Physical data independence
2. Logical data independence.
(c) What are the problems associated with redundancies within a table? What is the solution to this
problem?
Ans=> Redundancy means having multiple copies of the same data in the database. This problem arises
when a database is not normalized. The problems caused by redundancy are the insertion anomaly,
the deletion anomaly, and the update anomaly.
The solution is normalization, the process of minimizing redundancy in a relation or set of relations
by bringing it into normal forms such as 1NF, 2NF, 3NF and BCNF.

3. (a) What is a candidate key? Is there any difference between a Primary key and a candidate key?
Ans=> Candidate Key: A minimal set of attributes that can uniquely identify a tuple is known as a
candidate key. For example, STUD_NO in the STUDENT relation.

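+---------+------------+----------------------------------------+------------------------------------------+
| Sr. No. | Key        | Primary Key                            | Candidate Key                            |
+---------+------------+----------------------------------------+------------------------------------------+
| 1       | Definition | A unique, non-null key that uniquely   | Also a unique key that identifies a      |
|         |            | identifies a record in a table. A      | record in a table, but a table can have  |
|         |            | table can have only one primary key.   | multiple candidate keys.                 |
+---------+------------+----------------------------------------+------------------------------------------+
| 2       | Null       | A primary key column value cannot be   | A candidate key that is not the primary  |
|         |            | null.                                  | key may allow a null value (as a UNIQUE  |
|         |            |                                        | column does in SQL).                     |
+---------+------------+----------------------------------------+------------------------------------------+
| 3       | Objective  | The primary key is the most important  | Candidate keys indicate which keys can   |
|         |            | part of any relation or table.         | be used as the primary key.              |
+---------+------------+----------------------------------------+------------------------------------------+
| 4       | Use        | A primary key is a candidate key.      | A candidate key may or may not be a      |
|         |            |                                        | primary key.                             |
+---------+------------+----------------------------------------+------------------------------------------+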

(b) What is Multi-value Dependency? Give one example.


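Ans=> MVD or multivalued dependency means that for a single value of attribute ‘a’ multiple values
of attribute ‘b’ exist, independently of the other attributes of the relation. We write it as a ->-> b.
Example: Suppose a person named Geeks is working on 2 projects, Microsoft and Oracle, and has 2
hobbies, Reading and Music. This can be expressed in tabular format in the following way.
+--------+-----------+---------+
| Person | Project   | Hobby   |
+--------+-----------+---------+
| Geeks  | Microsoft | Reading |
| Geeks  | Microsoft | Music   |
| Geeks  | Oracle    | Reading |
| Geeks  | Oracle    | Music   |
+--------+-----------+---------+
Project and Hobby are multivalued attributes as they have more than one value for a single person,
i.e., Geeks. So Person ->-> Project and Person ->-> Hobby.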

(c) Define the concept of aggregation. Give one example of it where this concept is useful.
Ans=> In aggregation, a relationship between two entities is treated as a single entity: the
relationship, together with its participating entities, is aggregated into a higher-level entity.
For example: the relationship in which the Center entity offers the Course entity acts as a single
entity, which is in turn in a relationship with another entity, Visitor. In the real world, a visitor
to a coaching center will never enquire about the Course alone or the Center alone; he will enquire
about both.

4. (i) What is Functional Dependency?


Ans=> A functional dependency (FD) is a constraint between two sets of attributes of a relation: the
value of one attribute (or set of attributes) determines the value of another.
A functional dependency is denoted by an arrow "→". The functional dependency of Y on X
is represented by X → Y.

(ii) Define the “referential integrity rule” with one example.
Ans=> The referential integrity rule in a DBMS is based on primary and foreign keys. The rule states
that every foreign key value must have a matching primary key value, so that a reference from one
table to another is always valid.
<Employee>    <Department>
EMP_ID        DEPT_ID
EMP_NAME      DEPT_NAME
DEPT_ID       DEPT_ZONE
The rule states that each DEPT_ID in the Employee table has a matching valid DEPT_ID in the
Department table.
To allow joins, the primary key and the foreign key must also have the same data type.

(iii) Consider the relation R(A, B, C, D, E), with functional dependencies:
F = {A -> BCD, C -> DE}. Identify the non-prime attributes of the relation R.
Ans=> Non-prime attributes are B, C, D, E. The closure of A is A+ = {A, B, C, D, E}, so A is the only
candidate key and therefore the only prime attribute; every other attribute is non-prime.
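The prime / non-prime split can be checked mechanically by computing attribute closures. The following is a minimal Python sketch; the function names `closure` and `candidate_keys` and the encoding of each FD as a pair of sets are assumptions made for illustration.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD whenever its left side is already covered.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(relation, fds):
    """Brute-force search: minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # a smaller key is contained in it, so it is not minimal
            if closure(candidate, fds) == relation:
                keys.append(candidate)
    return keys

R = set("ABCDE")
F = [(set("A"), set("BCD")), (set("C"), set("DE"))]

keys = candidate_keys(R, F)
prime = set().union(*keys)
non_prime = R - prime
print(keys, sorted(non_prime))
```

The brute-force search is exponential in the number of attributes, which is acceptable for exam-sized relations such as this one.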

(iv) The above relation R is in which normal form?
Ans=> Steps to find the highest normal form of a relation:
1. Find all possible candidate keys of the relation.
2. Divide all attributes into two categories: prime attributes and non-prime attributes.
3. Check for 1st normal form, then 2nd, and so on. If it fails to satisfy the nth normal form
condition, the highest normal form will be n-1.
Here, Prime attribute = A, Non-prime attributes = B, C, D, E
1) BCNF? => A -> BCD: yes (A is a superkey); C -> DE: no (C is not a superkey) [Not in BCNF]
2) 3NF? => A -> BCD: yes; C -> DE: no (C is not a superkey, and D, E are non-prime) [Not in 3NF]
3) 2NF? => A -> BCD: yes; C -> DE: yes (C is not a proper subset of the candidate key, so there is
no partial dependency) [So it is in 2NF]

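(v) Consider the table below:
Student
+-------+---------+--------------+
| Name  | Roll_No | Department   |
+-------+---------+--------------+
| Rohan | 123     | Comp Science |
| Rohan | 456     | Architecture |
+-------+---------+--------------+

The relation student is decomposed into the following two new relations. Is this
decomposition lossless? Explain your answer.
Student_roll
+---------+-------+
| Roll_No | Name  |
+---------+-------+
| 123     | Rohan |
| 456     | Rohan |
+---------+-------+
Student_department
+--------------+-------+
| Department   | Name  |
+--------------+-------+
| Comp Science | Rohan |
| Architecture | Rohan |
+--------------+-------+

Ans => If we decompose a relation R into relations R1 and R2,
• Decomposition is lossy if R1 ⋈ R2 ⊃ R
• Decomposition is lossless if R1 ⋈ R2 = R
Here the only common attribute of Student_roll and Student_department is Name, and Name is not a
key of either relation (both rows have Name = Rohan). Joining the two relations on Name therefore
returns four tuples instead of the original two, so Student_roll ⋈ Student_department ⊃ Student.
So, this decomposition is not lossless.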

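(vi) Does the above decomposition generate any spurious tuple? Explain your answer.
Ans=> Yes, applying a natural join to the two relations creates spurious tuples: (Rohan, 456, comp
science) and (Rohan, 123, Architecture) appear in the result below but not in the original Student
relation.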
SELECT * FROM student_roll NATURAL JOIN student_department;
+-------+---------+--------------+
| name | roll_no | department |
+-------+---------+--------------+
| Rohan | 456 | comp science |
| Rohan | 123 | comp science |
| Rohan | 456 | Architecture |
| Rohan | 123 | Architecture |
+-------+---------+--------------+
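The join above can be reproduced with Python's built-in `sqlite3` module. This is a small sketch on an in-memory database; the table and column names follow the question, while the use of `CREATE TABLE ... AS SELECT` to build the two projections is an assumption about how the decomposition was produced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (name TEXT, roll_no INTEGER, department TEXT);
INSERT INTO student VALUES ('Rohan', 123, 'Comp Science'),
                           ('Rohan', 456, 'Architecture');
-- The two projections that make up the decomposition.
CREATE TABLE student_roll AS SELECT roll_no, name FROM student;
CREATE TABLE student_department AS SELECT department, name FROM student;
""")

original = set(conn.execute("SELECT name, roll_no, department FROM student"))
joined = set(conn.execute(
    "SELECT name, roll_no, department "
    "FROM student_roll NATURAL JOIN student_department"
))

# Tuples produced by the join that were never in the original relation.
spurious = joined - original
print(sorted(spurious))
```

Because `name` is the only shared column and has the same value in every row, the natural join pairs every roll number with every department.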

5. (a) Explain the anomalies that may occur when two integrity rules are violated.
Ans=> The two types of integrity rules are referential integrity rules and entity integrity rules.
Referential integrity rules dictate that a database does not contain orphan foreign key values. This
means that a primary key value cannot be modified or deleted if the value is used as a foreign key in
a child table. If the rule is violated, child rows refer to a parent row that no longer exists.
Entity integrity dictates that the primary key value cannot be null. If the rule is violated, a row can
no longer be uniquely identified.

(b) A relational schema of College (Faculty, Dean, Department, Chairperson, Professor, Rank,
Student) has the following Functional Dependencies {Faculty -> Dean, Dean -> Faculty,
Department -> Chairperson, Professor -> Rank Chairperson, Department -> Faculty,
Student -> Department Faculty Dean, Professor Rank -> Department Faculty}. Find all the
possible candidate keys of College. Decompose this relation up to BCNF.
Ans=> College (Faculty, Dean, Department, Chairperson, Professor, Rank, Student)
(A, B, C, D, E, F, G)
Faculty->Dean A -> B
Dean-> Faculty B -> A
Department-> Chairperson C -> D
Professor-> Rank Chairperson E -> FD
Department-> Faculty C -> A
Student->Department Faculty Dean G -> CAB
Professor Rank-> Department Faculty EF -> CA
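So, the only candidate key is EG (Professor, Student): neither E nor G appears on the right-hand side
of any FD, and (EG)+ = {A, B, C, D, E, F, G}.
*from notes
The decomposed tables are: R1(AB), R21(CD), R221(CA), R222(CEFG)
R222 is not yet in BCNF, because E -> F and G -> C hold in it but neither E nor G is a superkey of
R222. Splitting R222 on G -> C and then on E -> F gives R2221(GC), R2222(EF), R2223(EG). This
decomposition is lossless, although not every FD is preserved.
So, the relations in BCNF are: R1(Faculty, Dean), R21(Department, Chairperson), R221(Department,
Faculty), R2221(Student, Department), R2222(Professor, Rank), R2223(Professor, Student)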
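A decomposition can be sanity-checked by testing each relation for BCNF violations under the College FDs. The sketch below is illustrative only: the helper name `bcnf_violations` is hypothetical, and the three two-attribute relations GC, EF and EG are one possible further split of CEFG, included as an assumption for comparison.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(sub, fds):
    """Determinants X inside `sub` that determine something new in `sub`
    without determining all of `sub` (so X is not a superkey of `sub`)."""
    found = []
    for size in range(1, len(sub)):
        for combo in combinations(sorted(sub), size):
            x = set(combo)
            reach = closure(x, fds) & sub  # closure projected onto the sub-relation
            if reach != x and reach != sub:
                found.append("".join(combo))
    return found

# A=Faculty, B=Dean, C=Department, D=Chairperson, E=Professor, F=Rank, G=Student
raw = [("A", "B"), ("B", "A"), ("C", "D"), ("E", "FD"),
       ("C", "A"), ("G", "CAB"), ("EF", "CA")]
fds = [(set(lhs), set(rhs)) for lhs, rhs in raw]

for name in ["AB", "CD", "CA", "CEFG", "GC", "EF", "EG"]:
    print(name, bcnf_violations(set(name), fds))
```

An empty list means no violating determinant was found for that relation. The check enumerates every subset, so it is only practical for small schemas.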

6. Consider the following relational schemas:
Customer(cust_Id, cname, contact_no, dob)
Cars(reg_no, model, manufacturer, year)
Borrow(cust_Id, reg_no, date, time)
(a) Write down the relational algebra expressions for the following queries:
(i) Find the names of the customers who have borrowed a car
manufactured by 'Hyundai'.
(ii) Find the registration number of the cars that were manufactured
before the year 2015 and borrowed by John.
Ans=> (i) π cname (σ manufacturer = 'Hyundai' (Customer ⋈ Borrow ⋈ Cars))
(ii) π reg_no (σ year < 2015 ∧ cname = 'John' (Customer ⋈ Borrow ⋈ Cars))

(b) Write down the SQL statements for the following queries:
(i) Find the names of the customers who have borrowed a car
manufactured by “TATA”.
(ii) Find the model of the car that is mostly being borrowed by the
customer.
(iii) In which year had most of the cars been manufactured?
(iv) Find the names of customers who have borrowed all cars
available in the car tables.
(v) Find the contact number of the customers who have borrowed at
least one car today.
Ans=> (i) SELECT DISTINCT cname FROM Customer NATURAL JOIN Borrow NATURAL JOIN Cars
WHERE manufacturer = 'TATA';
(ii) SELECT model FROM Cars
WHERE reg_no IN (SELECT reg_no FROM Borrow
GROUP BY reg_no
HAVING COUNT(*) = (SELECT MAX(cnt)
FROM (SELECT COUNT(*) AS cnt FROM Borrow GROUP BY reg_no) AS t));
(iii) SELECT year FROM Cars
GROUP BY year
HAVING COUNT(*) = (SELECT MAX(cnt)
FROM (SELECT COUNT(*) AS cnt FROM Cars GROUP BY year) AS t);
(iv) SELECT cname FROM Customer
WHERE cust_Id IN (SELECT cust_Id FROM Borrow
GROUP BY cust_Id
HAVING COUNT(DISTINCT reg_no) = (SELECT COUNT(*) FROM Cars));
(v) SELECT DISTINCT contact_no FROM Customer NATURAL JOIN Borrow
WHERE date = CURRENT_DATE;
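Queries (ii) and (iv) are the two that most often go wrong, since an aggregate cannot appear in a WHERE clause and "borrowed all cars" is a division query. The sketch below runs one portable formulation of each against SQLite through Python's `sqlite3` module. The sample rows are invented purely for illustration, and the derived-table and `COUNT(DISTINCT ...)` formulations are one possible way to write these queries, not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (cust_Id INTEGER PRIMARY KEY, cname TEXT, contact_no TEXT, dob TEXT);
CREATE TABLE Cars (reg_no TEXT PRIMARY KEY, model TEXT, manufacturer TEXT, year INTEGER);
CREATE TABLE Borrow (cust_Id INTEGER, reg_no TEXT, "date" TEXT, "time" TEXT);
-- Hypothetical sample data.
INSERT INTO Customer VALUES (1, 'John', '111', '1990-01-01'),
                            (2, 'Asha', '222', '1992-05-05');
INSERT INTO Cars VALUES ('WB01', 'Nexon', 'TATA', 2018),
                        ('WB02', 'i20', 'Hyundai', 2014);
INSERT INTO Borrow VALUES (1, 'WB01', '2020-01-01', '10:00'),
                          (1, 'WB02', '2020-01-02', '11:00'),
                          (2, 'WB01', '2020-01-03', '09:00');
""")

# (ii) Model of the most frequently borrowed car: compare each car's borrow
# count with the maximum count taken from a derived table.
most_borrowed = """
SELECT model FROM Cars
WHERE reg_no IN (SELECT reg_no FROM Borrow
                 GROUP BY reg_no
                 HAVING COUNT(*) = (SELECT MAX(cnt)
                                    FROM (SELECT COUNT(*) AS cnt
                                          FROM Borrow GROUP BY reg_no) AS t))
"""

# (iv) Customers who borrowed every car: their number of distinct borrowed
# cars must equal the number of rows in Cars.
borrowed_all = """
SELECT cname FROM Customer
WHERE cust_Id IN (SELECT cust_Id FROM Borrow
                  GROUP BY cust_Id
                  HAVING COUNT(DISTINCT reg_no) = (SELECT COUNT(*) FROM Cars))
"""

models = [row[0] for row in conn.execute(most_borrowed)]
names = [row[0] for row in conn.execute(borrowed_all)]
print(models, names)
```

With this sample data, WB01 is borrowed twice and only John has borrowed both cars.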

7. (a) Suppose that executing a query requires 15 seeks and the transfer of 10 blocks from the
disk. If each block transfer requires 2 milliseconds and each seek requires 1 millisecond, what
will be the cost to execute the query?
Ans=> Number of seeks = 15
Number of blocks = 10
Time required for each block transfer = 2 ms
Time required for each seek = 1 ms
So, Total time = (10 x 2) + (15 x 1)
= 20 + 15
= 35 ms
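The same arithmetic can be written as a one-line cost function, which makes it easy to try other values. The function and parameter names below are illustrative assumptions.

```python
def query_cost_ms(seeks, blocks, seek_ms, transfer_ms):
    """Cost = (number of seeks x seek time) + (blocks transferred x transfer time)."""
    return seeks * seek_ms + blocks * transfer_ms

cost = query_cost_ms(seeks=15, blocks=10, seek_ms=1, transfer_ms=2)
print(cost)  # 35
```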

(b) Consider the following set of relations:
Employee(Emp_Id, Name, Address) Project(Project_Id, PName) Works_For(Emp_Id, Project_Id)
Draw two possible optimized query trees for the following query:
Find the name and the project name of the employees who lived in ‘Kolkata’.
Ans=> *In the note

8. (a) What are the ACID properties of a transaction?


Ans=> In order to maintain consistency in a database, before and after the transaction, certain
properties are followed. These are called ACID properties.
Atomicity: By this, we mean that either the entire transaction takes place at once or doesn’t happen at
all. There is no midway i.e. transactions do not occur partially.
Consistency: This means that integrity constraints must be maintained so that the database is
consistent before and after the transaction. It refers to the correctness of a database.
Isolation: This property ensures that multiple transactions can occur concurrently without leading to
the inconsistency of database state. Transactions occur independently without interference.
Durability: This property ensures that once the transaction has completed execution, the updates and
modifications to the database are stored in and written to disk and they persist even if a system failure
occurs.

(b) What is cascading rollback? Why is it necessary?


Ans=> A cascading rollback occurs in database systems when a transaction (T1) fails and must be
rolled back. Other transactions that depend on T1's actions must also be rolled back because of T1's
failure, thus causing a cascading effect. That is, one transaction's failure causes many to fail.
It is necessary because if a transaction depends on some other transaction and that transaction
fails, the dependent transaction has read wrong input, which leads to wrong output.
Example-

+-------+-------+-------+
| T1    | T2    | T3    |
+-------+-------+-------+
| Read  |       |       |
| Write |       |       |
|       | Read  |       |
|       | Write |       |
|       |       | Read  |
+-------+-------+-------+
Here, if the Write in T1 fails, the Read in T2 depends on the Write in T1, so T2 works with wrong data
unless it is rolled back as well. That is why cascading rollback undoes the dirty reads made by T2
and T3.
(c) Check whether the following two schedules are view
equivalent or not. Explain your answer.
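Schedule 1
+----------+----------+---------+
| T1       | T2       | T3      |
+----------+----------+---------+
| Write(A) |          |         |
|          | Write(A) |         |
|          |          | Read(A) |
+----------+----------+---------+
Schedule 2
+----------+----------+---------+
| T1       | T2       | T3      |
+----------+----------+---------+
| Write(A) |          |         |
|          |          | Read(A) |
|          | Write(A) |         |
+----------+----------+---------+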

Ans=>
1. Initial read: no transaction reads the initial value of A in either schedule, because Read(A) of T3
follows a Write(A) in both, so this condition is satisfied.
2. Final write: in both Schedule 1 and Schedule 2 the final Write(A) is performed by T2, so this
condition is satisfied.
3. Updated read: in Schedule 1, Read(A) of T3 is preceded by Write(A) of T2, whereas in Schedule 2,
Read(A) of T3 is preceded by Write(A) of T1. T3 therefore reads a different value of A in the two
schedules, so this condition is not satisfied.
So, Schedule 1 and Schedule 2 do not satisfy all 3 conditions, and they are not view equivalent.
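The three view-equivalence conditions can be sketched in Python. The encoding of a schedule as (transaction, operation, item) tuples and the function names are assumptions made for illustration. The operation order used for the two schedules is the one implied by the answer: T3 reads from T2 in Schedule 1 and from T1 in Schedule 2, with T2 writing last in both.

```python
def view_profile(schedule):
    """Return (reads_from, final_writer) for a schedule.

    reads_from holds (reader, item, n-th read by that reader, writer it read from);
    the writer is None when the read sees the initial value.
    """
    last_writer = {}
    read_count = {}
    reads_from = set()
    for txn, op, item in schedule:
        if op == "R":
            n = read_count.get((txn, item), 0)
            read_count[(txn, item)] = n + 1
            reads_from.add((txn, item, n, last_writer.get(item)))
        elif op == "W":
            last_writer[item] = txn
    return reads_from, last_writer

def view_equivalent(s1, s2):
    """Same operations, same initial reads, same reads-from pairs, same final writes."""
    return sorted(s1) == sorted(s2) and view_profile(s1) == view_profile(s2)

S1 = [("T1", "W", "A"), ("T2", "W", "A"), ("T3", "R", "A")]
S2 = [("T1", "W", "A"), ("T3", "R", "A"), ("T2", "W", "A")]

print(view_profile(S1))
print(view_profile(S2))
print(view_equivalent(S1, S2))  # False
```

Initial reads and updated reads are handled together here: a read whose recorded writer is `None` is an initial read, and any other entry is a reads-from pair.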

9. (a) In the timestamp-based protocol, if a transaction Ti wants to modify data item A, under which
condition can it modify A?
Ans=> There are three cases:
1. Ts(Ti) < R-Ts(A): The value of A that Ti wants to write was needed earlier by a younger
transaction that has already read A, and the system assumed that value would never be
produced. So the write operation is rejected and Ti is rolled back.
2. Ts(Ti) < W-Ts(A): Ti is trying to write an obsolete value of A, so the write is rejected and
Ti is rolled back.
3. Otherwise, Write(A) is executed and W-Ts(A) is set to Ts(Ti).
This is the only case in which Ti is able to modify A.
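The write rule can be expressed as a small function. This is a minimal sketch of the basic timestamp-ordering check only; the `Item` class and the name `try_write` are hypothetical, and the rollback and restart of a rejected transaction are not modelled.

```python
from dataclasses import dataclass

@dataclass
class Item:
    r_ts: int = 0  # R-Ts(A): largest timestamp of any transaction that read A
    w_ts: int = 0  # W-Ts(A): largest timestamp of any transaction that wrote A

def try_write(ts, item):
    """Apply the write rule for a transaction with timestamp ts.

    Returns True if the write is allowed (and updates W-Ts), False if it is rejected.
    """
    if ts < item.r_ts or ts < item.w_ts:
        return False  # cases 1 and 2: the write arrives too late
    item.w_ts = ts    # case 3: perform the write
    return True

a = Item(r_ts=5, w_ts=3)
print(try_write(4, a))  # False: a younger transaction (timestamp 5) already read A
print(try_write(6, a))  # True: W-Ts(A) becomes 6
print(try_write(5, a))  # False: would overwrite with an obsolete value
```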
(b) Write an algorithm to avoid deadlock.
Ans=> A deadlock can occur only if these four conditions hold together: 1. Mutual Exclusion, 2. Hold
and Wait, 3. No Preemption, 4. Circular Wait. To avoid deadlock we have to make sure that at least
one of them cannot hold.
We can avoid deadlock by:
I. Eliminate Mutual Exclusion:
It is generally not possible to eliminate mutual exclusion because some resources, such as
the tape drive and printer, are inherently non-shareable.
II. Eliminate Hold and Wait:
1. Allocate all required resources to the process before the start of its execution, or
2. Allow the process to make a new request for resources only after releasing its current
set of resources.
III. Eliminate No Preemption:
Preempt resources from a process when they are required by other, higher-priority
processes.
IV. Eliminate Circular Wait:
Assign each resource a number. A process can request resources only in increasing (or
only in decreasing) order of that numbering.

(c) Draw a wait-for graph for the following schedule and try to find out if there is any deadlock in the
system or not.
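+----------+---------+----------+----------+---------+
| T1       | T2      | T3       | T4       | T5      |
+----------+---------+----------+----------+---------+
|          |         | Read(A)  |          |         |
| Read(C)  |         |          |          |         |
|          | Read(B) |          |          |         |
|          |         |          |          | Read(C) |
|          |         | Write(A) |          |         |
| Read(A)  |         |          |          |         |
|          | Read(A) |          |          |         |
|          |         |          | Read(C)  |         |
| Write(A) |         |          |          |         |
|          |         |          | Write(C) |         |
|          |         | Read(B)  |          |         |
|          |         |          | Read(A)  |         |
|          |         |          |          | Read(B) |
|          |         | Write(B) |          |         |
+----------+---------+----------+----------+---------+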
Ans=>

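Operations on each data item, in schedule order (Rn / Wn = read / write by Tn):
+----+----+----+
| A  | B  | C  |
+----+----+----+
| R3 | R2 | R1 |
| W3 | R3 | R5 |
| R1 | R5 | R4 |
| R2 | W3 | W4 |
| W1 |    |    |
| R4 |    |    |
+----+----+----+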

Wait for Graph:


Here, there is a cycle between T2 and T3. So, a deadlock exists between T2 and T3.
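Deadlock detection on a wait-for graph comes down to finding a cycle, which a depth-first search can do. In the sketch below the function name `find_cycle` is hypothetical, and the edge list is an assumption: it assumes locks are held until the transaction ends, so that T2's Read(A) waits for T3's write lock on A, T3's Write(B) waits for T2's read lock on B, and T1's Read(A) also waits for T3.

```python
def find_cycle(wait_for):
    """Return one cycle in the wait-for graph as a list of nodes, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color = {}
    path = []

    def dfs(node):
        color[node] = GREY
        path.append(node)
        for nxt in wait_for.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # Back edge: nxt is already on the current path, so we found a cycle.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                cycle = dfs(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for start in wait_for:
        if color.get(start, WHITE) == WHITE:
            cycle = dfs(start)
            if cycle:
                return cycle
    return None

# An edge Ti -> Tj means "Ti is waiting for a lock held by Tj" (assumed edges).
wait_for = {"T1": ["T3"], "T2": ["T3"], "T3": ["T2"]}
cycle = find_cycle(wait_for)
print(cycle)
```

Any returned cycle identifies the transactions involved in the deadlock; one of them has to be chosen as the victim and rolled back.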

10. (a) Explain 2-phase locking protocol. What is lock up-gradation and down-gradation?
Ans=> A transaction is said to follow the Two-Phase Locking (2PL) protocol if its locking and
unlocking are done in two phases.
Growing Phase: New locks on data items may be acquired but none can be released.
Shrinking Phase: Existing locks may be released but no new locks can be acquired.

2PL enforces serializability, and the equivalent serial order is the order of the transactions' lock
points. The lock point of a transaction is the point in time at which it has obtained all its locks.
Lock up-gradation: Converting an existing shared lock to an exclusive lock.
Lock down-gradation: Converting an exclusive lock to a shared lock.
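(b) Check whether the following schedule is conflict serializable or not?

Schedule S1
+----------+----------+
| T1       | T2       |
+----------+----------+
| Read(X)  |          |
| Write(X) |          |
|          | Read(X)  |
| Read(Y)  |          |
|          | Write(X) |
| Write(Y) |          |
|          | Read(Y)  |
|          | Write(X) |
+----------+----------+
Ans=>
Schedule: R1(X), W1(X), R2(X), R1(Y), W2(X), W1(Y), R2(Y), W2(X)
To check whether this schedule is conflict serializable, we check its precedence graph for any cycle.
Every conflicting pair, for example W1(X) before R2(X) and W1(Y) before R2(Y), has the T1 operation
first, so the only edge is:

T1 → T2

So, there is no cycle, which means S1 is conflict serializable (it is equivalent to the serial
schedule T1, T2).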
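The precedence-graph test can be automated: add an edge Ti → Tj for every pair of conflicting operations in which Ti's operation comes first, then try to order the graph topologically. The sketch below encodes S1 as (transaction, operation, item) tuples; that encoding and the function names are assumptions for illustration.

```python
def precedence_edges(schedule):
    """Edges (Ti, Tj): Ti has an earlier operation that conflicts with one of Tj's.

    Two operations conflict if they come from different transactions, touch the
    same item, and at least one of them is a write.
    """
    edges = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "W" in (op1, op2):
                edges.add((t1, t2))
    return edges

def serial_order(schedule):
    """Return an equivalent serial order (Kahn's topological sort), or None if cyclic."""
    txns = sorted({t for t, _, _ in schedule})
    edges = precedence_edges(schedule)
    indegree = {t: 0 for t in txns}
    for _, dst in edges:
        indegree[dst] += 1
    ready = [t for t in txns if indegree[t] == 0]
    order = []
    while ready:
        t = ready.pop(0)
        order.append(t)
        for src, dst in sorted(edges):
            if src == t:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    ready.append(dst)
    return order if len(order) == len(txns) else None

S1 = [("T1", "R", "X"), ("T1", "W", "X"), ("T2", "R", "X"), ("T1", "R", "Y"),
      ("T2", "W", "X"), ("T1", "W", "Y"), ("T2", "R", "Y"), ("T2", "W", "X")]

print(precedence_edges(S1))  # {('T1', 'T2')}
print(serial_order(S1))      # ['T1', 'T2']
```

A result of `None` would mean the graph has a cycle and the schedule is not conflict serializable.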

11. (a) What is checkpoint?


Ans=> A checkpoint declares a point before which the DBMS was in a consistent state and all
transactions were committed. Checkpoints are recorded in the transaction log during transaction
execution, so that after a failure the recovery process only needs to examine the log from the most
recent checkpoint onward.

(b) What is write-ahead logging?


Ans=> Write-Ahead Logging (WAL) is a standard method for ensuring data integrity. WAL's central
concept is that changes to data files (where tables and indexes reside) must be written only after those
changes have been logged, that is, after log records describing the changes have been flushed to
permanent storage.

(c) Which recovery techniques do not require rollback?


Ans=>
(d) A Database management system uses deferred database modification technique. In an instance of
time, the following log entries were there:
<T1 start>
<T1, A, 100, 150>
<T2 start>
<T2, B, 350, 250>
<T0 commit>
<T2, C, 200, 220>
<T3 start>
<checkpoint {T1, T2, T3}>
<T2 commit>
<T3, B, 250, 100>
<T4 start>
<T4, A, 150, 200>
<T3 commit>
<T4, C, 220, 150>
Crashed...
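Identify the transactions that will be rolled back or redone.
Ans=> T2 and T3 will be redone: both were active at the checkpoint and committed after it, before the
crash. T1 and T4 never committed; under deferred database modification their updates were never
written to the database, so they need no undo and are simply ignored (restarted). T0 committed
before the checkpoint, so it is ignored as well.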
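The usual checkpoint recovery rule for deferred modification can be sketched as follows: take the transactions listed in the last checkpoint plus those that start after it, redo the ones with a commit record after the checkpoint, and discard the rest without any undo. The tuple encoding of the log and the function name `recover` are assumptions for illustration, and the sketch assumes the log contains at least one checkpoint record.

```python
def recover(log):
    """Return (redo, discard) transaction lists for a deferred-modification log."""
    # Index of the most recent checkpoint record.
    cp = max(i for i, rec in enumerate(log) if rec[0] == "checkpoint")
    candidates = list(log[cp][1])  # transactions active at the checkpoint
    committed = set()
    for rec in log[cp + 1:]:
        if rec[0] == "start":
            candidates.append(rec[1])
        elif rec[0] == "commit":
            committed.add(rec[1])
    redo = [t for t in candidates if t in committed]
    # Uncommitted work was never written to the database, so nothing is undone.
    discard = [t for t in candidates if t not in committed]
    return redo, discard

log = [
    ("start", "T1"), ("write", "T1", "A", 100, 150),
    ("start", "T2"), ("write", "T2", "B", 350, 250),
    ("commit", "T0"),
    ("write", "T2", "C", 200, 220),
    ("start", "T3"),
    ("checkpoint", ["T1", "T2", "T3"]),
    ("commit", "T2"),
    ("write", "T3", "B", 250, 100),
    ("start", "T4"), ("write", "T4", "A", 150, 200),
    ("commit", "T3"),
    ("write", "T4", "C", 220, 150),
]

redo, discard = recover(log)
print(redo, discard)  # ['T2', 'T3'] ['T1', 'T4']
```

Under this rule T0 never becomes a candidate, because it committed before the checkpoint.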

12. (a) What is a Heterogeneous Distributed database environment?


Ans=> In a heterogeneous distributed database, different sites have different operating systems,
DBMS products and data models. Its properties are:

• Different sites use dissimilar schemas and software.
• The system may be composed of a variety of DBMSs like relational, network, hierarchical or
object oriented.
• Query processing is complex due to dissimilar schemas.
• Transaction processing is complex due to dissimilar software.
• A site may not be aware of other sites and so there is limited co-operation in processing user
requests.

(b) List 2 advantages of Distributed Database over Centralized Databases.


Ans=> 1. Increased Reliability and availability
2. Management of data with different level of transparency

(c) What is Data Warehouse?


Ans=> A data warehouse is a type of data management system that is designed to enable and support
business intelligence (BI) activities, especially analytics. Data warehouses are solely intended to
perform queries and analysis and often contain large amounts of historical data.
(d) What is the difference between Data Stores and Data Marts?
Ans=> 1. A data store is a large repository of data collected from different sources, whereas a data
mart is only a subtype of a data warehouse.
2. A data store covers all departments in an organization, whereas a data mart focuses on
a specific group.
3. Designing a data store is complicated, whereas a data mart is easy to design.

(e) List 2 characteristics of a Data Warehouse.


Ans=> 1. Some data is denormalized for simplification and to improve performance.
2. Large amounts of historical data are used.

(f) What is the task of Data staging layer?


Ans=> The staging area is mainly used to quickly extract data from its data sources. It is then used
to combine data from multiple sources and to apply transformations, validations and data
cleansing.

(g) What is Data Mining?


Ans=> Data mining is the process of analyzing massive volumes of data to discover business
intelligence that helps companies solve problems, mitigate risks, and seize new opportunities.
