
KLEF

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

19CS2108A DATABASE MANAGEMENT SYSTEMS


TEST 1 KEY & SCHEME OF EVALUATION

Q.1. The entity type DISTRIBUTOR has four subclasses: NORTHERN, SOUTHERN,
CENTRAL and EASTERN. Design an EER diagram segment for each of the following
situations:
i. At a given time, a DISTRIBUTOR must be exactly one of these subclasses.
ii. A DISTRIBUTOR may or may not be one of these subclasses. However, a DISTRIBUTOR
who is one of these subclasses cannot at the same time be one of the other subclasses.
iii. A DISTRIBUTOR may or may not be one of these subclasses. On the other hand, a
DISTRIBUTOR may be any two or even three of these subclasses at the same time.
iv. At a given time a DISTRIBUTOR must be at least one of these subclasses.
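i. At a given time, a DISTRIBUTOR must be exactly one of these subclasses. [1.5M]
Diagram: disjoint, total specialization (circle marked d, double line from DISTRIBUTOR to the circle).

ii. A DISTRIBUTOR may or may not be one of these subclasses. However, a
DISTRIBUTOR who is one of these subclasses cannot at the same time be one of the other
subclasses. [1M]
Diagram: disjoint, partial specialization (circle marked d, single line from DISTRIBUTOR to the circle).

iii. A DISTRIBUTOR may or may not be one of these subclasses. On the other
hand, a DISTRIBUTOR may be any two or even three of these subclasses at the
same time. [1M]
Diagram: overlapping, partial specialization (circle marked o, single line from DISTRIBUTOR to the circle).

iv. At a given time a DISTRIBUTOR must be at least one of these subclasses. [1M]
Diagram: overlapping, total specialization (circle marked o, double line from DISTRIBUTOR to the circle).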


i. At a given time, a DISTRIBUTOR must be exactly one of these subclasses.
[1.5M]
ii. A DISTRIBUTOR may or may not be one of these subclasses. However, a
DISTRIBUTOR who is one of these subclasses cannot at the same time be
one of the other subclasses. [1M]
iii. A DISTRIBUTOR may or may not be one of these subclasses. On the other
hand, a DISTRIBUTOR may be any two or even three of these subclasses at
the same time. [1M]
iv. At a given time a DISTRIBUTOR must be at least one of these
subclasses. [1M]

Q.2. What are the responsibilities of a DBA? If we assume that the DBA is never interested
in running his or her own queries, does the DBA still need to understand query
optimization? Why?

1. Schema definition:
The DBA creates the database schema by executing a set of data definition statements in the DDL.
2. Storage structure and access-method definition.
3. Schema and physical-organization modification:
The DBA carries out changes to the schema and the physical organization to reflect the changing
needs of the organization, or alters the physical organization to improve performance.
4. Granting of authorization for data access:
By granting different kinds of authorization, the DBA regulates which parts of the database
various users can access. The authorization information is kept in a special system structure that
the database system consults whenever someone attempts to access data in the system.
5. Routine maintenance:
Routine maintenance activities include periodically backing up the database to prevent loss of
data, ensuring that enough free disk space is available, and monitoring the jobs running on the
database to make sure performance does not degrade.
Yes, the DBA still needs to understand query optimization. In most environments, the database
administrator is expected to help tune poorly performing queries written by others. After all, the
DBA is the expert and is responsible for overall database performance.
Indeed, tuning a query to eliminate excessive disk I/O or CPU processing will generally buy
more in performance than you can normally get by tuning the System Global Area (SGA) or by
optimizing the placement of data files on disk. The optimizer is that portion of the kernel that
evaluates the SQL statement and determines the optimal way to retrieve the desired result set.
Any 4 responsibilities [4.5M]
Q.3. Specify all the relationships among the records of the database.

1) Each SECTION record is related to a COURSE record.


2) Each GRADE_REPORT record is related to one STUDENT record and one SECTION
record.
3) Each PREREQUISITE record relates two COURSE records: one in the role of a course
and the other in the role of a prerequisite to that course.

Mention 3 Relationships [8M]


Q.4. A company database needs to store information about employees (identified by ssn,
with salary and phone as attributes), departments (identified by dno, with dname and
budget as attributes), and children of employees (with name and age as attributes).
Employees work in departments; each department is managed by an employee; a child
must be identified uniquely by name when the parent (who is an employee; assume that
only one parent works for the company) is known. We are not interested in information
about a child once the parent leaves the company. Draw an ER diagram that captures this
information.

Diagram [4M]
Constraints [4M]

Q.5. Computer Sciences Department frequent fliers have been complaining to Dane County
Airport officials about the poor organization at the airport. As a result, the officials decided
that all information related to the airport should be organized using a DBMS, and you
have been hired to design the database. Your first task is to organize the information about
all the airplanes stationed and maintained at the airport. The relevant information is as
follows:
• Every airplane has a registration number, and each airplane is of a specific model.
• The airport accommodates a number of airplane models, and each model is
identified by a model number (e.g., DC-10) and has a capacity and a weight.
• A number of technicians work at the airport. You need to store the name, SSN, address,
phone number, and salary of each technician.
• Each technician is an expert on one or more plane model(s), and his or her expertise may
overlap with that of other technicians. This information about technicians must also be recorded.
• Traffic controllers must have an annual medical examination. For each traffic controller,
you must store the date of the most recent exam.
• All airport employees (including technicians) belong to a union. You must store the union
membership number of each employee. You can assume that each employee is uniquely
identified by a social security number.
Diagram [7M]
Constraints [5.5M]

Q.6.Design Relational Model for the below mentioned ER diagram.


Employee(ssn, Bdate, Fname ,Lname, Minit, Address,Salary,Sex)
Department(Name, Location, Number, Number of employees)
Project(Name, Number,Location)
Dependent(Name,sex, ssn,Birth_date,relationship)
Works_for(ssn,name,number)
Manages(ssn, start_date, Name, number)
Works_on(ssn,name,number,hours)
Dependentsof(ssn,name)
Tables [6.5 M]
Insert records according to mentioned constraints [6 M]

Q.7. Consider the following relations for a database that keeps track of automobile sales in
a car dealership (OPTION refers to some optional equipment installed on an automobile):
CAR(Serial_no, Model, Manufacturer, Price)
OPTION(Serial_no, Option_name, Price)
SALE(Salesperson_id, Serial_no, Date, Sale_price)
SALESPERSON(Salesperson_id, Name, Phone)
First, specify the foreign keys for this schema, stating any assumptions you make. Next,
populate the relations with a few sample tuples, and then give an example of an insertion in
the SALE and SALESPERSON relations that violates the referential integrity constraints
and of another insertion that does not.
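OPTION(Serial_no) → FOREIGN KEY REFERENCES CAR(Serial_no)
SALE(Serial_no) → FOREIGN KEY REFERENCES CAR(Serial_no)
SALE(Salesperson_id) → FOREIGN KEY REFERENCES SALESPERSON(Salesperson_id)
INSERTING VALUES
INSERT INTO CAR VALUES (1, 'INNOVA1', 'INNOVA', 1000000)
SHOWING VIOLATION OF INTEGRITY CONSTRAINTS
INSERT INTO CAR VALUES (1, 'INNOVA1', 'INNOVA', 1000000) → repeating this insertion violates
the key constraint, because Serial_no is the primary key of CAR and must be unique.
An insertion into SALE violates referential integrity when its Salesperson_id does not appear
in SALESPERSON or its Serial_no does not appear in CAR.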
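The two kinds of SALE insertion the question asks for can be tried out directly. The following is a minimal sketch using Python's built-in sqlite3 module. The sample salesperson, the dates and prices, and the choice of (Salesperson_id, Serial_no) as the primary key of SALE are assumptions made for illustration, not part of the key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE CAR (Serial_no INTEGER PRIMARY KEY, Model TEXT,
                  Manufacturer TEXT, Price REAL);
CREATE TABLE SALESPERSON (Salesperson_id INTEGER PRIMARY KEY,
                          Name TEXT, Phone TEXT);
-- Assumed key for SALE: one row per (salesperson, car).
CREATE TABLE SALE (
    Salesperson_id INTEGER REFERENCES SALESPERSON(Salesperson_id),
    Serial_no      INTEGER REFERENCES CAR(Serial_no),
    Date           TEXT,
    Sale_price     REAL,
    PRIMARY KEY (Salesperson_id, Serial_no)
);
INSERT INTO CAR VALUES (1, 'INNOVA1', 'INNOVA', 1000000);
INSERT INTO SALESPERSON VALUES (10, 'Asha', '555-0100');  -- hypothetical tuple
""")


def try_insert(sql, params):
    """Run one INSERT; report whether the DBMS accepted or rejected it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# Salesperson 10 and car 1 both exist, so this insertion is accepted.
valid = try_insert("INSERT INTO SALE VALUES (?, ?, ?, ?)",
                   (10, 1, "2020-01-15", 950000))
# Salesperson 99 does not exist, so referential integrity is violated.
invalid = try_insert("INSERT INTO SALE VALUES (?, ?, ?, ?)",
                     (99, 1, "2020-01-16", 900000))
# Repeating Serial_no 1 violates the key constraint, not referential integrity.
duplicate_car = try_insert("INSERT INTO CAR VALUES (?, ?, ?, ?)",
                           (1, "INNOVA1", "INNOVA", 1000000))
print(valid, invalid, duplicate_car, sep="\n")
```

An insertion into SALESPERSON cannot violate referential integrity in this schema, because SALESPERSON has no foreign keys; only its key constraint can be violated.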
Explanation for Q.8, item 6 (deleting the Toy department): The did field in the Works
relation is a foreign key that references the Dept relation; this is the referential integrity
constraint chosen. With the action ON DELETE CASCADE added to it, deleting a department
record also deletes the Works records associated with that department. The statement executes
as follows: the Dept relation is searched for a record with dname = 'Toy', and that record is
deleted. The did value of that record is then used to look in the Works relation for records
with a matching did value. All such records are then deleted from the Works relation.

Specify the foreign keys for this schema. [1M]


Populate the relations with a few sample tuples. [1M]
Example of an insertion in the SALE and SALESPERSON relations that violates the
referential integrity constraints. [1M]
Example of an insertion in the SALE and SALESPERSON relations that does not violate
the referential integrity constraints. [1.5M]

Q.8. Answer each of the following questions briefly. The questions are based on the
following relational schema:
Emp(eid: integer, ename: string, age: integer, salary: real)
Works(eid: integer, did: integer, pcttime: integer)
Dept(did: integer, dname: string, budget: real, managerid: integer)
1. Give an example of a foreign key constraint that involves the Dept relation. What are the
options for enforcing this constraint when a user attempts to delete a Dept tuple?
2. Write the SQL statements required to create the preceding relations, including
appropriate versions of all primary and foreign key integrity constraints.
3. Define the Dept relation in SQL so that every department is guaranteed to have a
manager.
4. Write an SQL statement to add John Doe as an employee with eid = 101, age = 32 and
salary = 15,000.
5. Write an SQL statement to give every employee a 10 percent raise.
6. Write an SQL statement to delete the Toy department. Given the referential integrity
constraints you chose for this schema, explain what happens when this statement is
executed.
1. Consider the following example. It is natural to require that the did field of Works should be a
foreign key that refers to Dept.
CREATE TABLE Works ( eid INTEGER NOT NULL,
did INTEGER NOT NULL,
pcttime INTEGER,
PRIMARY KEY (eid, did),
UNIQUE (eid),
FOREIGN KEY (did) REFERENCES Dept )
When a user attempts to delete a Dept tuple, there are four options:
● Also delete all Works tuples that refer to it.
● Disallow the deletion of the Dept tuple if some Works tuple refers to it.
● For every Works tuple that refers to it, set the did field to the did of some
(existing) 'default' department.
● For every Works tuple that refers to it, set the did field to null (possible only when did
is allowed to be null, which the NOT NULL and primary key declarations above rule out).
2. CREATE TABLE Emp ( eid INTEGER,
ename CHAR(10),
age INTEGER,
salary REAL,
PRIMARY KEY (eid) );
CREATE TABLE Dept ( did INTEGER,
dname CHAR(10),
budget REAL,
managerid INTEGER,
PRIMARY KEY (did),
FOREIGN KEY (managerid) REFERENCES Emp ON DELETE SET NULL );
CREATE TABLE Works ( eid INTEGER,
did INTEGER,
pcttime INTEGER,
PRIMARY KEY (eid, did),
FOREIGN KEY (did) REFERENCES Dept ON DELETE CASCADE,
FOREIGN KEY (eid) REFERENCES Emp ON DELETE CASCADE );
3. CREATE TABLE Dept ( did INTEGER, dname CHAR(10), budget REAL, managerid INTEGER NOT NULL,
PRIMARY KEY (did), FOREIGN KEY (managerid) REFERENCES Emp );
4. INSERT INTO Emp (eid, ename, age, salary) VALUES (101, 'John Doe', 32, 15000);
5. UPDATE Emp E SET E.salary = E.salary * 1.10;
6. DELETE FROM Dept D WHERE D.dname = 'Toy';
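The effect of statements 4 to 6 can be observed with a small in-memory database. This is a sketch using Python's sqlite3 module, assuming ON DELETE CASCADE on both foreign keys of Works and ON DELETE SET NULL on Dept.managerid. The 'Books' department and the pcttime values are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce FKs
conn.executescript("""
CREATE TABLE Emp (eid INTEGER PRIMARY KEY, ename CHAR(10),
                  age INTEGER, salary REAL);
CREATE TABLE Dept (did INTEGER PRIMARY KEY, dname CHAR(10), budget REAL,
                   managerid INTEGER REFERENCES Emp ON DELETE SET NULL);
CREATE TABLE Works (eid INTEGER, did INTEGER, pcttime INTEGER,
                    PRIMARY KEY (eid, did),
                    FOREIGN KEY (eid) REFERENCES Emp ON DELETE CASCADE,
                    FOREIGN KEY (did) REFERENCES Dept ON DELETE CASCADE);
INSERT INTO Emp (eid, ename, age, salary) VALUES (101, 'John Doe', 32, 15000);
INSERT INTO Dept VALUES (1, 'Toy', 50000, 101), (2, 'Books', 30000, 101);
INSERT INTO Works VALUES (101, 1, 60), (101, 2, 40);
""")

with conn:
    # Statement 5: give every employee a 10 percent raise.
    conn.execute("UPDATE Emp SET salary = salary * 1.10")
    # Statement 6: delete the Toy department; Works rows for it cascade away.
    conn.execute("DELETE FROM Dept WHERE dname = 'Toy'")

remaining_works = conn.execute("SELECT eid, did FROM Works").fetchall()
new_salary = conn.execute("SELECT salary FROM Emp WHERE eid = 101").fetchone()[0]
print(remaining_works, round(new_salary, 2))
```

Only the Works row for the surviving department remains after the delete, which is the cascading behaviour the chosen constraint asks for.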

1. Give an example of a foreign key constraint that involves the Dept relation. What are the
options for enforcing this constraint when a user attempts to delete a Dept tuple? [1M]
2. Write the SQL statements required to create the preceding relations, including
appropriate versions of all primary and foreign key integrity constraints. [1M]
3. Define the Dept relation in SQL so that every department is guaranteed to have a
manager. [1M]
4. Write an SQL statement to add John Doe as an employee with eid = 101, age = 32 and
salary = 15,000. [0.5M]
5. Write an SQL statement to give every employee a 10 percent raise. [0.5M]
6. Write an SQL statement to delete the Toy department. Given the referential integrity
constraints you chose for this schema, explain what happens when this statement is
executed. [0.5M]

Q.9. Suppose that each of the following Update operations is applied directly to the
database state shown in the above figure. Discuss all integrity constraints violated by each
operation, if any, and the different ways of enforcing these constraints.

i. Insert < 'Robert', 'F', 'Scott', '943775543', '21-JUN-42', '2365 Newcastle Rd, Bellaire, TX', 'M',
58000, '888665555', 1 > into EMPLOYEE.
ii. Delete the WORKS_ON tuples with Essn = ‘333445555’.
iii. Modify the Mgr_ssn and Mgr_start_date of the DEPARTMENT tuple with Dnumber = 5 to
‘123456789’ and ‘2007-10-01’, respectively [2 M+2 M+4 M]
i) This insertion satisfies all constraints, so it is acceptable
ii) No constraint violations.
iii) No constraint violations.
Q.10. Consider the AIRLINE relational database schema shown in the above schema which
describes a database for airline flight information. Each FLIGHT is identified by a
Flight_number, and consists of one or more FLIGHT_LEGs with Leg_numbers 1, 2, 3, and
so on. Each FLIGHT_LEG has scheduled arrival and departure times, airports, and one or
more LEG_INSTANCEs— one for each Date on which the flight travels. FAREs are kept
for each FLIGHT. For each FLIGHT_LEG instance, SEAT_RESERVATIONs are kept, as
are the AIRPLANE used on the leg and the actual arrival and departure times and
airports. An AIRPLANE is identified by an Airplane_id and is of a particular
AIRPLANE_TYPE. CAN_LAND relates AIRPLANE_TYPEs to the AIRPORTs at which
they can land. An AIRPORT is identified by an Airport_code. Consider an update for the
AIRLINE database to enter a reservation on a particular flight or flight leg on a given date.
a. Give the operations for this update.
b. What types of constraints would you expect to check?
c. Which of these constraints are key, entity integrity, and referential integrity constraints
and which are not.
a) One possible answer is given below:
INSERT <FNO,LNO,DT,SEAT_NO,CUST_NAME,CUST_PHONE> into
SEAT_RESERVATION; MODIFY the LEG_INSTANCE tuple with the condition:
( FLIGHT_NUMBER=FNO AND LEG_NUMBER=LNO AND DATE=DT) by setting
NUMBER_OF_AVAILABLE_SEATS = NUMBER_OF_AVAILABLE_SEATS - 1;

These operations should be repeated for each LEG of the flight on which a reservation is made.
This assumes that the reservation has only one seat. More complex operations will be needed for
a more realistic reservation that may reserve several seats at once.
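The INSERT and MODIFY steps above belong in one transaction so that the seat count and the reservation never disagree. The sketch below uses Python's sqlite3 module on a cut-down version of the two relations; the flight number, date and customer values are hypothetical, and the seat check uses "at least one seat left" as the condition for a single-seat reservation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LEG_INSTANCE (
    Flight_number TEXT, Leg_number INTEGER, Date TEXT,
    Number_of_available_seats INTEGER,
    PRIMARY KEY (Flight_number, Leg_number, Date));
CREATE TABLE SEAT_RESERVATION (
    Flight_number TEXT, Leg_number INTEGER, Date TEXT, Seat_number TEXT,
    Customer_name TEXT, Customer_phone TEXT,
    PRIMARY KEY (Flight_number, Leg_number, Date, Seat_number));
INSERT INTO LEG_INSTANCE VALUES ('AA100', 1, '2024-05-01', 1);  -- sample row
""")


def reserve(fno, lno, dt, seat, name, phone):
    """Reserve one seat on one leg instance; return True on success."""
    try:
        with conn:  # commits on success, rolls back on any exception
            cur = conn.execute(
                "UPDATE LEG_INSTANCE "
                "SET Number_of_available_seats = Number_of_available_seats - 1 "
                "WHERE Flight_number = ? AND Leg_number = ? AND Date = ? "
                "AND Number_of_available_seats >= 1",
                (fno, lno, dt))
            if cur.rowcount != 1:
                raise ValueError("no such leg instance or no seats left")
            # The primary key rejects a second reservation of the same seat.
            conn.execute(
                "INSERT INTO SEAT_RESERVATION VALUES (?, ?, ?, ?, ?, ?)",
                (fno, lno, dt, seat, name, phone))
        return True
    except (ValueError, sqlite3.IntegrityError):
        return False


first = reserve("AA100", 1, "2024-05-01", "12A", "Ann", "555-0101")
second = reserve("AA100", 1, "2024-05-01", "12B", "Bob", "555-0102")
print(first, second)
```

For a multi-leg flight, the same function would be called once per leg inside a single enclosing transaction.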
b) We would check that NUMBER_OF_AVAILABLE_SEATS on each LEG_INSTANCE of
the flight is at least 1 before doing any reservation (unless overbooking is permitted),
and that the SEAT_NUMBER being reserved in SEAT_RESERVATION is available.
c) The INSERT operation into SEAT_RESERVATION will check all the key, entity
integrity, and referential integrity constraints for the relation. The check that
NUMBER_OF_AVAILABLE_SEATS on each LEG_INSTANCE of the flight is at least 1
does not fall into any of the above types of constraints (it is a general semantic integrity
constraint).

a. Give the operations for this update. [3M]


b. What types of constraints would you expect to check? [3M]
c. Which of these constraints are key, entity integrity, and referential integrity constraints
and which are not. [2M]

Q.11. Design a relational database schema for a database application of your choice.
a. Declare your relations using the SQL DDL.
b. Specify a number of queries in SQL that are needed by your database application.
c. Based on your expected use of the database, choose some attributes that should have
indexes specified on them.
d. Implement your database, if you have a DBMS that supports SQL.
CREATE TABLE STATEMENT
DROP TABLE Enrollment;
DROP TABLE offering;
DROP TABLE Student;
DROP TABLE Course;
DROP TABLE Faculty;

-------------------- Student --------------------------------

CREATE TABLE Student (
stdNo char(11) not null,
stdFirstName varchar2(30) not null,
stdLastName varchar2(30) not null,
stdCity varchar2(30) not null,
stdState char(2) not null,
stdZip char(10) not null,
stdMajor char(6),
stdClass char(2),
stdGPA decimal(3,2),
CONSTRAINT StudentPk PRIMARY KEY (StdNo) );

-------------------- Course --------------------------------


CREATE TABLE Course(
CourseNo char(6) not null,
crsDesc varchar2(50) not null,
CrsUnits integer,
CONSTRAINT CoursePK PRIMARY KEY (CourseNo) );

-------------------- Faculty --------------------------------


CREATE TABLE Faculty(
FacNo char(11) not null,
FacFirstName varchar2(30) not null,
FacLastName varchar2(30) not null,
FacCity varchar2(30) not null,
FacState char(2) not null,
FacZipCode char(10) not null,
FacRank char(4),
FacHireDate date,
FacSalary decimal(10,2),
FacSupervisor char(11),
FacDept char(6),
CONSTRAINT FacultyPK PRIMARY KEY (FacNo),
CONSTRAINT SupervisorFK FOREIGN KEY (FacSupervisor) REFERENCES Faculty );

-------------------- Offering --------------------------------


CREATE TABLE Offering(
OfferNo INTEGER not null,
CourseNo char(6) not null,
OffTerm char(6) not null,
OffYear INTEGER not null,
OffLocation varchar2(30),
OffTime varchar2(10),
FacNo char(11),
OffDays char(4),
CONSTRAINT OfferingPK PRIMARY KEY (OfferNo),
CONSTRAINT CourseFK FOREIGN KEY (CourseNo) REFERENCES Course,
CONSTRAINT FacultyFK FOREIGN KEY (FacNo) REFERENCES Faculty );

-------------------- Enrollment --------------------------------


CREATE TABLE Enrollment (
OfferNo INTEGER not null,
StdNo char(11) not null,
EnrGrade decimal(3,2),
CONSTRAINT EnrollmentPK PRIMARY KEY (OfferNo, StdNo),
CONSTRAINT OfferingFK FOREIGN KEY (OfferNo) REFERENCES Offering
ON DELETE CASCADE,
CONSTRAINT StudentFK FOREIGN KEY (StdNo) REFERENCES Student
ON DELETE CASCADE );
INSERT STATEMENTS (SOME MAY BE REJECTED, TO DEMONSTRATE THE
CONSTRAINTS)
INSERT INTO student
(stdNo, stdFirstName, stdLastName, stdCity,
stdState, stdMajor, stdClass, stdGPA, stdZip)
VALUES ('123-45-6789','HOMER','WELLS','SEATTLE','WA','IS','FR',3.00,'98121-1111');

INSERT INTO student
(stdNo, stdFirstName, stdLastName, stdCity,
stdState, stdMajor, stdClass, stdGPA, stdZip)
VALUES ('124-56-7890','BOB','NORBERT','BOTHELL','WA','FIN','JR',2.70,'98011-2121');
INSERT INTO student
(stdNo, stdFirstName, stdLastName, stdCity,
stdState, stdMajor, stdClass, stdGPA, stdZip)
VALUES ('234-56-7890','CANDY','KENDALL','TACOMA','WA','ACCT','JR',3.50,'99042-3321');
INSERT INTO student
(stdNo, stdFirstName, stdLastName, stdCity,
stdState, stdMajor, stdClass, stdGPA, stdZip)
VALUES ('345-67-8901','WALLY','KENDALL','SEATTLE','WA','IS','SR',2.80,'98123-1141');
INSERT INTO student
(stdNo, stdFirstName, stdLastName, stdCity,
stdState, stdMajor,...

a. Declare your relations using the SQL DDL. [2M]


b. Specify a number of queries in SQL that are needed by your database application. [2M]
c. Based on your expected use of the database, choose some attributes that should have
indexes specified on them. [4M]
d. Implement your database, if you have a DBMS that supports SQL. [4.5M]
Q.12. Consider the following MAILORDER relational schema describing the data for a
mail order company.
PARTS(Pno, Pname, Qoh, Price, Olevel)
CUSTOMERS(Cno, Cname, Street, Zip, Phone)
EMPLOYEES(Eno, Ename, Zip, Hdate)
ZIP_CODES(Zip, City)
ORDERS(Ono, Cno, Eno, Received, Shipped)
ODETAILS(Ono, Pno, Qty)
Qoh stands for quantity on hand: the other attribute names are self-explanatory. Specify
and execute the following queries using the RA interpreter on the MAILORDER database
schema.
a. Retrieve the names of parts that cost less than $20.00.
b. Retrieve the names and cities of employees who have taken orders for parts costing more
than $50.00.
c. Retrieve the pairs of customer number values of customers who live in the same ZIP Code.
d. Retrieve the names of customers who have ordered parts from employees living in
Wichita
e. Retrieve the names of customers who have ordered parts costing less than $20.00.
f. Retrieve the names of customers who have not placed an order.
g. Retrieve the names of customers who have placed exactly two orders.

a. select pname from parts where price < 20.00;
b. select e.ename, z.city from employees e, zip_codes z, orders o, odetails d, parts p where
e.zip = z.zip and e.eno = o.eno and o.ono = d.ono and d.pno = p.pno and p.price > 50.00;
c. select c1.cno, c2.cno from customers c1, customers c2 where c1.zip = c2.zip and c1.cno
< c2.cno;
d. select cname from customers c where exists (select * from orders o, employees e,
zip_codes z where o.cno = c.cno and o.eno = e.eno and e.zip = z.zip and
z.city = 'Wichita');
e. select cname from customers c where exists (select * from orders o, odetails d, parts p
where o.cno = c.cno and o.ono = d.ono and d.pno = p.pno and p.price < 20.00);
f. select cname from customers where not exists (select * from orders where orders.cno
= customers.cno);
g. select cname from customers where exists (select * from orders o1, orders o2 where
o1.cno = customers.cno and o2.cno = customers.cno and o1.ono <> o2.ono) and not
exists ( select * from orders o1, orders o2, orders o3 where o1.cno = customers.cno and
o2.cno = customers.cno and o3.cno = customers.cno and o1.ono <> o2.ono and o2.ono
<> o3.ono and o1.ono <> o3.ono);

a. Retrieve the names of parts that cost less than $20.00. [1M]
b. Retrieve the names and cities of employees who have taken orders for parts
costing more than $50.00. [1.5M]

c. Retrieve the pairs of customer number values of customers who live in the same
ZIP Code. [2M]
d. Retrieve the names of customers who have ordered parts from employees living in
Wichita. [2M]
e. Retrieve the names of customers who have ordered parts costing less than $20.00.
[2M]
f. Retrieve the names of customers who have not placed an order. [2M]

g. Retrieve the names of customers who have placed exactly two orders. [2M]
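Queries f and g from the answer above can be cross-checked on a tiny instance. The sketch below uses Python's sqlite3 module with hypothetical customers and orders (only the columns these two queries need), and compares the nested NOT EXISTS form of query g with an equivalent GROUP BY ... HAVING formulation, which is an alternative rather than the form used in the key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (cno INTEGER PRIMARY KEY, cname TEXT);
CREATE TABLE orders (ono INTEGER PRIMARY KEY, cno INTEGER);
-- Hypothetical data: Ann has 1 order, Bob 2, Cy 3, Dee none.
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy'), (4, 'Dee');
INSERT INTO orders VALUES (10, 1), (11, 2), (12, 2), (13, 3), (14, 3), (15, 3);
""")

# Query g as written in the key: at least two distinct orders, but not three.
NESTED_EXACTLY_TWO = """
SELECT cname FROM customers
WHERE EXISTS (SELECT * FROM orders o1, orders o2
              WHERE o1.cno = customers.cno AND o2.cno = customers.cno
                AND o1.ono <> o2.ono)
  AND NOT EXISTS (SELECT * FROM orders o1, orders o2, orders o3
                  WHERE o1.cno = customers.cno AND o2.cno = customers.cno
                    AND o3.cno = customers.cno AND o1.ono <> o2.ono
                    AND o2.ono <> o3.ono AND o1.ono <> o3.ono)
"""

# Alternative formulation using grouping and counting.
GROUPED_EXACTLY_TWO = """
SELECT c.cname FROM customers c JOIN orders o ON o.cno = c.cno
GROUP BY c.cno, c.cname HAVING COUNT(*) = 2
"""

# Query f: customers with no order at all.
NO_ORDERS = """
SELECT cname FROM customers
WHERE NOT EXISTS (SELECT * FROM orders WHERE orders.cno = customers.cno)
"""


def names(sql):
    """Run a single-column query and return its values as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


exactly_two_nested = names(NESTED_EXACTLY_TWO)
exactly_two_grouped = names(GROUPED_EXACTLY_TWO)
no_orders = names(NO_ORDERS)
print(exactly_two_nested, exactly_two_grouped, no_orders)
```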
Q1. In what normal form is the LOTS relation schema in Figure with respect to the
restrictive interpretations of normal form that take only the primary key into account?
Will it be in the same normal form if the general definitions of normal form were used?
4.5M

Answer:
If we only take the primary key into account, the LOTS relation schema in Figure
will be in 2NF, since there are no partial dependencies on the primary key. However, it is not
in 3NF, since there are the following two transitive dependencies on the primary key:
PROPERTY_ID# -> COUNTY_NAME -> TAX_RATE, and
PROPERTY_ID# -> AREA -> PRICE.
Now, if we take all keys into account and use the general definitions of 2NF and
3NF, the LOTS relation schema will only be in 1NF, because there is a partial
dependency COUNTY_NAME -> TAX_RATE on the secondary key {COUNTY_NAME,
LOT#}, which violates 2NF.
Scheme
Normal form when only the primary key is taken into account 2M
Will it be in the same normal form if the general definitions of normal form were
used? 2.5M

Q2. Suppose that we have the following three tuples in a legal instance of a relation
schema with three attributes ABC: (1,2,3), (4,2,3) and (5,3,3). Which of the
following dependencies can you infer does not hold over the schema? 4.5M
Answer:

BC → A
(2, 3) → 1
(2, 3) → 4
(3, 3) → 5
It does not hold: the tuples (1,2,3) and (4,2,3) agree on B and C but differ on A.
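The same check can be done mechanically for any candidate dependency. The sketch below is a small helper (the function name fd_holds is an illustrative choice) that tests whether a functional dependency is violated by a given instance. Note that an instance can only refute a dependency; a dependency that survives the check is merely not contradicted by these three tuples.

```python
def fd_holds(rows, attrs, lhs, rhs):
    """Return False if two rows agree on lhs but differ on rhs, else True."""
    position = {name: i for i, name in enumerate(attrs)}
    seen = {}  # lhs values -> rhs values first observed with them
    for row in rows:
        left = tuple(row[position[a]] for a in lhs)
        right = tuple(row[position[a]] for a in rhs)
        if seen.setdefault(left, right) != right:
            return False
    return True


rows = [(1, 2, 3), (4, 2, 3), (5, 3, 3)]

# BC -> A is refuted by (1,2,3) and (4,2,3).
bc_to_a = fd_holds(rows, "ABC", "BC", "A")
# A -> B and B -> C are not contradicted by this instance.
a_to_b = fd_holds(rows, "ABC", "A", "B")
b_to_c = fd_holds(rows, "ABC", "B", "C")
print(bc_to_a, a_to_b, b_to_c)
```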
Scheme
Identifying dependencies that do not hold 2M

Q3. Explain the difference between each of the following:
1. Primary versus secondary indexes.
2. Dense versus sparse indexes.
3. Clustered versus unclustered indexes.
If you were about to create an index on a relation, what considerations would guide
your choice with respect to each pair of properties listed above? 8M
Answer:
1. The main difference between primary and secondary indexes is that a primary index is
an index on a set of fields that includes the primary key of the relation and is therefore
guaranteed not to contain duplicates, while a secondary index is any index that is not a
primary index and may contain duplicates.
2. Dense index: In a dense index, an index entry appears for every search-key value in
the file. In a dense clustering index, the index record contains the search-key value
and a pointer to the first data record with that search-key value. The rest of the records
with the same search-key value would be stored sequentially after the first record,
since, because the index is a clustering one, records are sorted on the same search key.
In a dense nonclustering index, the index must store a list of pointers to all records
with the same search-key value.
Sparse index: In a sparse index, an index entry appears for only some of the
search-key values. Sparse indices can be used only if the relation is stored in sorted
order of the search key, that is if the index is a clustering index. As is true in dense
indices, each index entry contains a search-key value and a pointer to the first data
record with that search-key value. To locate a record, we find the index entry with the
largest search-key value that is less than or equal to the search-key value for which we
are looking. We start at the record pointed to by that index entry, and follow the
pointers in the file until we find the desired record.
3. There can be two kinds of indexes in a relational database: clustered and nonclustered.
A clustered index determines the physical order of the rows in a table, similar to
entries in the yellow pages, which are sorted in alphabetical order. Indexes make queries
run fast: a SELECT query that uses an indexed column is typically much faster than one
that does not.

Suppose you have a table Employee with emp_id as its primary key; then a
clustered index created on the primary key will sort the Employee table by
emp_id.
A nonclustered index, on the other hand, involves one extra step: each index entry points to
the physical location of the record, and the rows themselves are not stored in index order.
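The sparse-index lookup rule described in item 2 (find the index entry with the largest search-key value less than or equal to the target, then scan forward) can be sketched in a few lines. The block contents below are hypothetical sample keys, and each "block" is modelled as a plain Python list; a real file would store records on disk pages.

```python
from bisect import bisect_right

# A data file sorted on the search key, split into blocks (clustering order).
blocks = [[10, 15, 18], [21, 23, 37], [46, 48, 56], [60, 65, 71]]

# Sparse index: one entry per block, holding the first key in that block.
sparse_index = [(block[0], block_no) for block_no, block in enumerate(blocks)]
index_keys = [key for key, _ in sparse_index]


def lookup(key):
    """Return (block number, key) if the key is stored, otherwise None."""
    # Position of the largest index key that is <= the search key.
    pos = bisect_right(index_keys, key) - 1
    if pos < 0:
        return None  # key is smaller than every key in the file
    block_no = sparse_index[pos][1]
    # Scan the block the index entry points to.
    return (block_no, key) if key in blocks[block_no] else None


print(lookup(48), lookup(50), lookup(5))
```

A dense index would instead hold one entry for every key, which removes the in-block scan at the cost of a larger index.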

Scheme
1. 3M
2. 3M
3. 2M

Q4. Consider a view branch_cust created on a bank database as follows:
create view branch_cust as
select branch_name, customer_name
from depositor, account
where depositor.account_number = account.account_number
The database schema is:
branch(branch_name, branch_city, assets)
customer(customer_name, customer_street, cust_city)
loan(loan_number, branch_name, amount)
borrower(customer_name, loan_number)
account(account_number, branch_name, balance)
depositor(customer_name, account_number)
Suppose that the view is materialized; that is, the view is computed and stored. Write
triggers to maintain the view, that is, to keep it up-to-date on insertions to and deletions
from depositor or account. Do not bother about updates. 8M
Answer: For inserting into the materialized view branch_cust we must set a database trigger
on an insert into depositor and account. We assume that the database system uses immediate
binding for rule execution. Further, assume that the current version of a relation is denoted by
the relation name itself, while the set of newly inserted tuples is denoted by qualifying the
relation name with the prefix inserted. The active rules for this insertion are given below.

define trigger insert_into_branch_cust_via_depositor
after insert on depositor
referencing new table as inserted for each statement
insert into branch_cust
select branch_name, customer_name
from inserted, account
where inserted.account_number = account.account_number

define trigger insert_into_branch_cust_via_account
after insert on account
referencing new table as inserted for each statement
insert into branch_cust
select branch_name, customer_name
from depositor, inserted
where depositor.account_number = inserted.account_number

Note that if the execution binding was deferred (instead of immediate), then the result of the
join of the set of new tuples of account with the set of new tuples of depositor would have been
inserted by both active rules, leading to duplication of the corresponding tuples in branch_cust.
The deletion of a tuple from branch_cust is similar to insertion, except that a deletion from
either depositor or account will cause the natural join of these relations to have a lesser number
of tuples. We denote the newly deleted set of tuples by qualifying the relation name with the
keyword deleted.

define trigger delete_from_branch_cust_via_depositor
after delete on depositor
referencing old table as deleted for each statement
delete from branch_cust
select branch_name, customer_name
from deleted, account
where deleted.account_number = account.account_number

define trigger delete_from_branch_cust_via_account
after delete on account
referencing old table as deleted for each statement
delete from branch_cust
select branch_name, customer_name
from depositor, deleted
where depositor.account_number = deleted.account_number
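The trigger definitions above are written in a generic statement-level notation. As a runnable approximation, the sketch below uses Python's sqlite3 module; SQLite supports only row-level triggers with NEW and OLD row references, so each trigger handles one inserted or deleted row instead of a transition table. The trigger names, the sample accounts and the rowid-based deletion (which removes exactly one stored copy per lost join tuple) are assumptions of this sketch, not part of the key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (account_number TEXT PRIMARY KEY, branch_name TEXT,
                      balance REAL);
CREATE TABLE depositor (customer_name TEXT, account_number TEXT,
                        PRIMARY KEY (customer_name, account_number));
-- The materialized view is stored as an ordinary table.
CREATE TABLE branch_cust (branch_name TEXT, customer_name TEXT);

CREATE TRIGGER ins_via_depositor AFTER INSERT ON depositor
BEGIN
    INSERT INTO branch_cust
    SELECT a.branch_name, NEW.customer_name
    FROM account a WHERE a.account_number = NEW.account_number;
END;

CREATE TRIGGER ins_via_account AFTER INSERT ON account
BEGIN
    INSERT INTO branch_cust
    SELECT NEW.branch_name, d.customer_name
    FROM depositor d WHERE d.account_number = NEW.account_number;
END;

CREATE TRIGGER del_via_depositor AFTER DELETE ON depositor
BEGIN
    -- Remove one stored copy of the join tuple that just disappeared.
    DELETE FROM branch_cust WHERE rowid = (
        SELECT MIN(bc.rowid) FROM branch_cust bc, account a
        WHERE a.account_number = OLD.account_number
          AND bc.branch_name = a.branch_name
          AND bc.customer_name = OLD.customer_name);
END;

CREATE TRIGGER del_via_account AFTER DELETE ON account
BEGIN
    -- One copy per depositor of the deleted account.
    DELETE FROM branch_cust WHERE rowid IN (
        SELECT MIN(bc.rowid) FROM branch_cust bc, depositor d
        WHERE d.account_number = OLD.account_number
          AND bc.branch_name = OLD.branch_name
          AND bc.customer_name = d.customer_name
        GROUP BY d.customer_name);
END;
""")


def view_rows():
    """Contents of the stored (materialized) view."""
    return sorted(conn.execute(
        "SELECT branch_name, customer_name FROM branch_cust").fetchall())


def join_rows():
    """What the view definition computes from the base tables right now."""
    return sorted(conn.execute(
        "SELECT a.branch_name, d.customer_name FROM depositor d "
        "JOIN account a ON d.account_number = a.account_number").fetchall())


conn.execute("INSERT INTO account VALUES ('A-1', 'Downtown', 500)")
conn.execute("INSERT INTO depositor VALUES ('Alice', 'A-1')")
conn.execute("INSERT INTO depositor VALUES ('Bob', 'A-2')")  # account comes later
conn.execute("INSERT INTO account VALUES ('A-2', 'Uptown', 700)")
after_inserts = view_rows()
consistent_after_inserts = after_inserts == join_rows()

conn.execute("DELETE FROM depositor WHERE customer_name = 'Alice'")
conn.execute("DELETE FROM account WHERE account_number = 'A-2'")
after_deletes = view_rows()
consistent_after_deletes = after_deletes == join_rows()
print(after_inserts, after_deletes)
```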
Scheme
Trigger 8M
Q5. Consider the following relation:
R (Doctor#, Patient#, Date, Diagnosis, Treat_code, Charge)
In this relation, a tuple describes a visit of a patient to a doctor along with a
treatment code and daily charge. Assume that diagnosis is determined (uniquely) for
each patient by a doctor. Assume that each treatment code has a fixed charge
(regardless of patient). Is this relation in 2NF? Justify your answer and decompose if
necessary. Then argue whether further normalization to 3NF is necessary, and if so,
perform it. 12.5M
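Answer: From the question's text, we can infer the following functional dependencies:
Doctor#, Patient#, Date -> Diagnosis, Treat_code, Charge
Treat_code -> Charge
Reading the diagnosis assumption as holding per visit, no non-key attribute depends on a
proper subset of the key {Doctor#, Patient#, Date}, so R is already in 2NF. It is not in 3NF,
because the non-key attribute Charge is determined by another non-key attribute, Treat_code.
Decomposing on this dependency gives:
R1 (Doctor#, Patient#, Date, Diagnosis, Treat_code)
R2 (Treat_code, Charge)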
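The 2NF and 3NF questions can be checked with attribute closures. The sketch below assumes two dependencies: the one listed in the answer, plus Treat_code -> Charge, which follows from the question's statement that each treatment code has a fixed charge. The helper names closure and partial_dependencies are illustrative choices.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


KEY = {"Doctor#", "Patient#", "Date"}
NONPRIME = {"Diagnosis", "Treat_code", "Charge"}
FDS = [
    (frozenset(KEY), frozenset(NONPRIME)),
    # Assumed from "each treatment code has a fixed charge".
    (frozenset({"Treat_code"}), frozenset({"Charge"})),
]


def partial_dependencies(key, fds, nonprime):
    """Non-prime attributes determined by a proper subset of the key."""
    found = []
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = closure(subset, fds) & nonprime
            if determined:
                found.append((subset, determined))
    return found


key_closure = closure(KEY, FDS)                  # should cover every attribute
partials = partial_dependencies(KEY, FDS, NONPRIME)  # empty list means 2NF holds
treat_closure = closure({"Treat_code"}, FDS)     # exposes the transitive dependency
print(sorted(key_closure), partials, sorted(treat_closure))
```

An empty list of partial dependencies corresponds to 2NF, while a non-key attribute set such as {Treat_code} determining another non-key attribute signals the 3NF violation.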
Scheme
Is this relation in 2NF 6.5M
Justify 6M
Q6. A PARTS file with Part# as key field includes records with the following Part#
values: 23, 65, 37, 60, 46, 92, 48, 71, 56, 59, 18, 21, 10, 74, 78, 15, 16, 20, 24, 28, 39, 43, 47,
50, 69, 75, 8, 49, 33, 38. Suppose the search field values are inserted in the given order in
a B+-tree of order p = 4 and p_leaf = 3; show how the tree will expand and what the final
tree looks like. 12.5M
Answer:
Scheme
Construction of Tree 12.5M
Q7. Predict whether the given schedule is (conflict) serializable and determine its
equivalent serial schedules. r1(X); r3(X); w3(X); w1(X); r2(X); 4.5M
Answer: The above schedule is not conflict serializable. r1(X) precedes the conflicting
operation w3(X), so T1 must come before T3 in any equivalent serial schedule; r3(X)
precedes the conflicting operation w1(X), so T3 must come before T1. The precedence
graph therefore contains the cycle T1 → T3 → T1, and there is no equivalent serial
schedule. Conflicting operations cannot be swapped; only adjacent non-conflicting
operations can.
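Conflict serializability is usually decided with a precedence graph: add an edge Ti → Tj whenever an operation of Ti precedes a conflicting operation of Tj (same item, different transactions, at least one write), then look for a cycle. The following is a minimal sketch of that test; the parsing of the r1(X) notation and the function names are illustrative.

```python
def parse(schedule):
    """Turn 'r1(X); w3(X);' into [('r', 1, 'X'), ('w', 3, 'X')]."""
    ops = []
    for token in schedule.replace(" ", "").strip(";").split(";"):
        action, rest = token[0], token[1:]
        txn, item = rest.split("(")
        ops.append((action, int(txn), item.rstrip(")")))
    return ops


def precedence_edges(ops):
    """Edges (Ti, Tj) for every pair of conflicting operations, Ti first."""
    edges = set()
    for i, (a1, t1, x1) in enumerate(ops):
        for a2, t2, x2 in ops[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (a1, a2):
                edges.add((t1, t2))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "visiting":
            return True
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        if any(visit(nxt) for nxt in graph[node]):
            return True
        state[node] = "done"
        return False

    return any(visit(node) for node in graph)


edges = precedence_edges(parse("r1(X); r3(X);w3(X); w1(X); r2(X);"))
serializable = not has_cycle(edges)
print(sorted(edges), serializable)
```

For the schedule in the question, the edge set contains both (1, 3) and (3, 1), so the graph has a cycle and the test reports that the schedule is not conflict serializable.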
Scheme
Prediction 2.5M
Equivalent Serial Schedule 2M
Q8. Describe the MapReduce join procedures for Sort-Merge join, Partition Join,
N-way Map-side join, and Simple N-way join. 4.5M
Answer:
What is a Join?
The join operation is used to combine two or more database tables based on foreign keys. In
general, companies maintain separate tables for the customer and the transaction records in
their database, and they often need to generate analytic reports using the data present in
such separate tables. Therefore, they perform a join operation on these separate tables using
a common column (foreign key), such as customer id, to generate a combined table. Then,
they analyze this combined table to get the desired analytic reports.
Joins in MapReduce
Just like a SQL join, we can also perform join operations in MapReduce on different data sets.
There are two types of join operations in MapReduce:
● Map Side Join: As the name implies, the join operation is performed in the map
phase itself. Therefore, in the map side join, the mapper performs the join, and it is
mandatory that the input to each map is partitioned and sorted according to the keys.
● Reduce Side Join: As the name suggests, in the reduce side join, the
reducer is responsible for performing the join operation. It is comparatively simpler
and easier to implement than the map side join, because the sorting and shuffling phase
sends the values having identical keys to the same reducer and therefore, by default,
the data is organized for us.
Now, let us understand the reduce side join in detail.
What is Reduce Side Join?
As discussed earlier, the reduce side join is a process where the join operation is performed in
the reducer phase. Basically, the reduce side join takes place in the following manner:
● The mapper reads the input data which are to be combined based on a common column or
join key.
● The mapper processes the input and adds a tag to it to distinguish inputs
belonging to different sources, data sets or databases.
● The mapper outputs the intermediate key-value pair, where the key is the
join key.
● After the sorting and shuffling phase, a key and the list of its values are generated for the
reducer.
● Now, the reducer joins the values present in the list with the key to give the final
aggregated output.
Now, let us take a MapReduce example to understand the above steps in the reduce side join.
MapReduce Example of Reduce Side Join
Suppose that I have two separate datasets of a sports complex:
● cust_details: It contains the details of the customer.
● transaction_details: It contains the transaction records of the customer.
Using these two datasets, I want to know the lifetime value of each customer. To do so, I
will need the following things:
● The person's name along with the frequency of the visits by that person.
● The total amount spent by him/her on purchasing equipment.
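The reduce side join for this example can be simulated in plain Python, without a Hadoop cluster. This is a minimal sketch: the sample records, the record layouts and the function names are hypothetical, and `shuffle` only imitates the grouping that the sort-and-shuffle phase performs.

```python
from collections import defaultdict

# Hypothetical sample records: (cust_id, name) and (txn_id, cust_id, amount).
cust_details = [("C1", "Asha"), ("C2", "Ravi")]
transaction_details = [("T1", "C1", 120.0), ("T2", "C2", 40.0), ("T3", "C1", 80.0)]


def map_customers(record):
    cust_id, name = record
    yield cust_id, ("cust", name)  # the tag marks the source data set


def map_transactions(record):
    _txn_id, cust_id, amount = record
    yield cust_id, ("txn", amount)


def shuffle(pairs):
    """Group values by join key, as the sort-and-shuffle phase does."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_join(cust_id, values):
    """Join one customer's name with the count and total of the transactions."""
    name = next(v for tag, v in values if tag == "cust")
    amounts = [v for tag, v in values if tag == "txn"]
    return name, len(amounts), sum(amounts)


mapped = [kv for r in cust_details for kv in map_customers(r)]
mapped += [kv for r in transaction_details for kv in map_transactions(r)]
result = [reduce_join(key, values) for key, values in shuffle(mapped)]
print(result)  # [('Asha', 2, 200.0), ('Ravi', 1, 40.0)]
```

Each output tuple holds the customer's name, the number of visits and the total amount spent. The sketch assumes every transaction has a matching customer record, so it behaves like an inner join.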

Scheme
Join 2.5M
Reduce Side Join 2M
Q9. Consider the schedule below. Determine whether the below mentioned schedule is
strict, cascadeless, recoverable, or nonrecoverable. Determine the strictest
recoverability condition that the schedule satisfies. S3: r1 (X); r2 (Z); r1 (Z); r3 (X);
r3 (Y); w1 (X); c1; w3 (Y); c3; r2 (Y); w2 (Z); w2 (Y); c2; 8M
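Answer: In this schedule no read-write or write-write conflict arises before a commit: the
only item accessed after another transaction wrote it is Y, and r2 (Y) and w2 (Y) occur
after c3. Hence it is a strict schedule, and therefore also cascadeless and recoverable.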
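The recoverability conditions can be checked mechanically. This is a minimal sketch, assuming schedules contain only read, write and commit operations (no aborts) written in the r1(X) / w1(X) / c1 notation. The `parse` and `classify` helpers are illustrative names.

```python
import re


def parse(schedule):
    """Turn 'r1 (X); c1' into [('r', 1, 'X'), ('c', 1, None)]."""
    pattern = r"([rwc])\s*(\d+)\s*(?:\(\s*(\w+)\s*\))?"
    return [(kind, int(txn), item or None)
            for kind, txn, item in re.findall(pattern, schedule)]


def classify(schedule):
    """Return the strictest condition met: strict, cascadeless, recoverable or nonrecoverable."""
    ops = parse(schedule)
    never = len(ops)  # position used for transactions that never commit
    commit_pos = {t: i for i, (k, t, _) in enumerate(ops) if k == "c"}
    last_writer = {}  # item -> transaction that wrote it last
    strict = cascadeless = recoverable = True
    for i, (kind, txn, item) in enumerate(ops):
        if kind == "c":
            continue
        writer = last_writer.get(item)
        if writer is not None and writer != txn and commit_pos.get(writer, never) > i:
            strict = False  # item touched before its last writer committed
            if kind == "r":
                cascadeless = False  # this is a dirty read
                # The reader must not commit before the transaction it read from.
                if commit_pos.get(txn, never) < commit_pos.get(writer, never):
                    recoverable = False
        if kind == "w":
            last_writer[item] = txn
    if strict:
        return "strict"
    if cascadeless:
        return "cascadeless"
    return "recoverable" if recoverable else "nonrecoverable"


S3 = ("r1 (X); r2 (Z); r1 (Z); r3 (X); r3 (Y); w1 (X); c1; "
      "w3 (Y); c3; r2 (Y); w2 (Z); w2 (Y); c2;")
print(classify(S3))  # strict
```

For contrast, `classify("w1(X); r2(X); c2; c1")` returns "nonrecoverable", because T2 reads T1's uncommitted write and then commits first.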
Scheme
Schedule 8M
Q10. Describe the MapReduce join procedures for Sort-Merge join, Partition Join,
N-way Map-side join, and Simple N-way join. 8M
Answer: What is a Join?
The join operation is used to combine two or more database tables based on foreign keys. In
general, companies maintain separate tables for the customer and the transaction records in
their database, and they often need to generate analytic reports using the data present in
such separate tables. Therefore, they perform a join operation on these separate tables using
a common column (foreign key), such as customer id, to generate a combined table. Then,
they analyze this combined table to get the desired analytic reports.
Joins in MapReduce
Just like a SQL join, we can also perform join operations in MapReduce on different data sets.
There are two types of join operations in MapReduce:
● Map Side Join: As the name implies, the join operation is performed in the map
phase itself. Therefore, in the map side join, the mapper performs the join, and it is
mandatory that the input to each map is partitioned and sorted according to the keys.
● Reduce Side Join: As the name suggests, in the reduce side join, the
reducer is responsible for performing the join operation. It is comparatively simpler
and easier to implement than the map side join, because the sorting and shuffling phase
sends the values having identical keys to the same reducer and therefore, by default,
the data is organized for us.
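The sort-merge step that a map side join relies on can be sketched as follows. This is a minimal sketch, assuming one mapper has received matching partitions of both inputs, each already sorted by the join key. The `merge_join` name and the sample rows are hypothetical.

```python
def merge_join(left, right):
    """Join two lists of (key, value) pairs that are already sorted by key."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1  # no partner on the right for this left key
        elif lk > rk:
            j += 1  # no partner on the left for this right key
        else:
            # Find the run of equal keys on the right, then pair it with each left row.
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            while i < len(left) and left[i][0] == lk:
                out.extend((lk, left[i][1], right[r][1]) for r in range(j, j_end))
                i += 1
            j = j_end
    return out


# Hypothetical partitions, both sorted by customer id.
customers = [("C1", "Asha"), ("C2", "Ravi"), ("C3", "Meena")]
transactions = [("C1", 120.0), ("C1", 80.0), ("C2", 40.0)]
print(merge_join(customers, transactions))
# [('C1', 'Asha', 120.0), ('C1', 'Asha', 80.0), ('C2', 'Ravi', 40.0)]
```

Because both inputs are sorted, each is scanned once and no shuffle is needed, which is why the partitioning and sorting requirement is mandatory for this kind of join.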
Scheme
Join 4.5M
Reduce Side Join 4M
Q11. Consider the three transactions T1, T2, and T3, and the schedules S1 and S2 given
below. Draw the serializability (precedence) graphs for S1 and S2, and state whether
each schedule is serializable or not. If a schedule is serializable, write down the
equivalent serial schedule(s). 12.5M
T1: r1 (X); r1 (Z); w1 (X);
T2: r2 (Z); r2 (Y); w2 (Z); w2 (Y);
T3: r3 (X); r3 (Y); w3 (Y);
S1: r1 (X); r2 (Z); r1 (Z); r3 (X); r3 (Y); w1 (X); w3 (Y); r2 (Y); w2 (Z); w2 (Y);
S2: r1 (X); r2 (Z); r3 (X); r1 (Z); r2 (Y); r3 (Y); w1 (X); w2 (Z); w3 (Y); w2 (Y);
Answer:

T1, T2, T3
_______________________
  | T1    | T2    | T3
T |       |       |
I | r1(X) | r2(Z) | r3(X)
M | r1(Z) | r2(Y) | r3(Y)
E | w1(X) | w2(Z) | w3(Y)
  |       | w2(Y) |
Schedule: S1
_______________________
  | T1    | T2    | T3
  |       |       |
  | r1(X) |       |
T |       | r2(Z) |
I | r1(Z) |       |
M |       |       | r3(X)
E |       |       | r3(Y)
  | w1(X) |       |
  |       |       | w3(Y)
  |       | r2(Y) |
  |       | w2(Z) |
  |       | w2(Y) |

Summary: r3(X) precedes w1(X), giving the edge T3 -> T1; T3 never
writes X, so there is no edge back. r3(Y) and w3(Y) precede r2(Y)
and w2(Y), giving T3 -> T2. r1(Z) precedes w2(Z), giving T1 -> T2;
T1 never writes Z. The precedence graph has no cycles, so this
schedule is serializable. The equivalent serial schedule is
T3, T1, T2.
Schedule: S2
_______________________
  | T1    | T2    | T3
  |       |       |
  | r1(X) |       |
  |       | r2(Z) |
T |       |       | r3(X)
I | r1(Z) |       |
M |       | r2(Y) |
E |       |       | r3(Y)
  | w1(X) |       |
  |       | w2(Z) |
  |       |       | w3(Y)
  |       | w2(Y) |

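Summary: This schedule is not serializable. r2(Y) precedes w3(Y),
giving T2 -> T3, while r3(Y) and w3(Y) precede w2(Y), giving
T3 -> T2, so the precedence graph contains the cycle
T2 -> T3 -> T2. Both T2 and T3 read Y before T3 writes it.
Therefore, when T2 later writes Y from its earlier read, the
update made by T3 is overwritten and lost.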
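The precedence graphs for S1 and S2 can be derived programmatically. This is a minimal sketch, assuming the r1 (X) / w1 (X) notation used in the question. The helper names are illustrative, and serial orders are found by brute force, which is adequate for three transactions.

```python
import re
from itertools import permutations


def parse(schedule):
    """Turn 'r1 (X); w3 (Y)' into [('r', 1, 'X'), ('w', 3, 'Y')]."""
    pattern = r"([rw])\s*(\d+)\s*\(\s*(\w+)\s*\)"
    return [(k, int(t), x) for k, t, x in re.findall(pattern, schedule)]


def precedence_edges(schedule):
    """Edge (Ti, Tj) for each conflicting pair in which Ti's operation comes first."""
    ops = parse(schedule)
    edges = set()
    for i, (k1, t1, x1) in enumerate(ops):
        for k2, t2, x2 in ops[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if t1 != t2 and x1 == x2 and "w" in (k1, k2):
                edges.add((t1, t2))
    return edges


def serial_orders(schedule):
    """Transaction orders consistent with every edge; empty if the graph has a cycle."""
    txns = sorted({t for _, t, _ in parse(schedule)})
    edges = precedence_edges(schedule)
    return [order for order in permutations(txns)
            if all(order.index(a) < order.index(b) for a, b in edges)]


S1 = "r1 (X); r2 (Z); r1 (Z); r3 (X); r3 (Y); w1 (X); w3 (Y); r2 (Y); w2 (Z); w2 (Y);"
S2 = "r1 (X); r2 (Z); r3 (X); r1 (Z); r2 (Y); r3 (Y); w1 (X); w2 (Z); w3 (Y); w2 (Y);"

print(sorted(precedence_edges(S1)))  # [(1, 2), (3, 1), (3, 2)]
print(serial_orders(S1))             # [(3, 1, 2)]
print(sorted(precedence_edges(S2)))  # [(1, 2), (2, 3), (3, 1), (3, 2)]
print(serial_orders(S2))             # []
```

An empty list from `serial_orders` means the precedence graph contains a cycle, so the schedule has no conflict-equivalent serial schedule.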
Scheme
Precedence Graph 6.5M
Equivalent serial schedule 6M
Q12. Consider the execution of two transactions T1 and T2. Assume that the initial
values of X, Y, M and N are 100, 800, 10 and 45 respectively. i. Write the final values of X
and Y as per schedule A. Is this a serializable schedule? ii. Write the final values of X
and Y for all possible serial schedules as per schedule B.

Answer: Suppose we have two concurrent transactions T1 and T2 that both update a data
item d. T1 starts first and reads d for update. As soon as T1 has read d, T2 starts and
reads d for its own update. T1 then updates d to d'. Once T1 is complete, T2 updates d to
d''. T2 is unaware of T1's update because it read d before T1 wrote it; similarly, T1 is
unaware of T2's update. What happens to the final result and to T1's update? Which value
of d is final, d' or d''?

Since T2 is unaware of T1's update and writes last, the update made by T1 is lost. Only the
update made by T2 is retained, and no trace of T1's update is kept. This problem is known
as a lost update.
But T1 is a valid transaction and cannot be ignored; its update is as important as T2's.
T1's update might even have changed the result of T2's update (for example, when the new
value depends on the current value of the column being updated, as in d = d*10). Hence we
cannot lose data updated by any transaction. This type of lost update can be prevented if
the transactions are grouped and executed serially. Suppose T1 is allowed to read and write
d, and only once its write completes is T2 allowed to read d; then the result reflects the
updates of both T1 and T2. T2 still changes the value written by T1, but T1's update is
recorded in the undo log or rollback segment, so we know that an intermediate value existed
between the beginning of the group (T1 and T2 together) and its end (the end of T2). Such
grouping of transactions and defining their order of execution is known as scheduling or
serialization. This type of execution guarantees isolation of transactions: it does not
suffer from dirty reads, non-repeatable reads, deadlocks or lost updates.
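The lost update described above can be reproduced with two hand-interleaved updates. This is a minimal sketch: T2's update d = d*10 comes from the text, while T1's update d = d + 50 and the starting value 100 are hypothetical choices for illustration.

```python
def interleaved(d):
    """T1 and T2 both read d before either writes; T2's write lands last."""
    t1_local = d       # T1 reads d
    t2_local = d       # T2 reads d before T1 has written
    d = t1_local + 50  # T1 writes d'
    d = t2_local * 10  # T2 writes d'' from its stale copy, so T1's update is lost
    return d


def serial(d):
    """T1 runs to completion, then T2 reads the value T1 wrote."""
    d = d + 50  # T1: read and write
    d = d * 10  # T2: read and write, now based on T1's result
    return d


print(interleaved(100))  # 1000
print(serial(100))       # 1500
```

In the interleaved run the final value shows no effect of T1's +50, whereas the serial run reflects both updates.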
Scheme
i. Write the final values of X and Y as per schedule A. 3M
Is this a serializable schedule? 6.5M
ii. Write the final values of X and Y for all possible serial schedules as per schedule B. 3M
