Professional Documents
Culture Documents
ESP VI - CS - Materials 2-401-448
ESP VI - CS - Materials 2-401-448
DATABASES
Syllabus: ER model, relational model (relational algebra, tuple calculus), database design (integrity constraints,
normal forms), query languages (SQL), file structures (sequential files, indexing, B and B+ trees), transactions and
concurrency control.
database system contains not only data but it describes ••Operating system software like Microsoft Win-
the description of the database structure. dows, Linux OS, Mac OS.
••DBMS software such as Oracle 8I, MySQL,
8.1.2 Database Management System Access (Jet, MSDE), SQL Server etc.
••Network softwares are used for sharing share the
A database is a collection of related data. Data is a col- data of database among multiple users.
lection of raw facts or figures, processed to form infor- ••Application programs are developed like C++,
mation. Database management system is a collection of
VB, dotnet etc. are used to access database in
programs for the creation and maintenance of database.
dbms. These are used to access and manipulate
It is an efficient and reliable approach to retrieve data
the data in the database.
for many users. It provides various functions such as:
1. Redundancy control: It provides redundancy 2. Hardware: It consists of all system’s physical
by removing duplicity of data by following rules of devices such as computers, storage devices, I/O
normalization. channels, electromechanical devices etc. It also
2. Data independence: It provides independence includes peripherals, such as, keyboard, mouse,
to application programs from details of data repre- modems, printers, etc.
sentation and storage. It also provides an abstract 3. Data: It is the collection of facts. The database
view of the data to insulate application code from contains the data and the metadata.
such details. 4. Procedures: There are the instructions and rules
3. Data integrity: It promotes and enforces some to design and use the database system. These
integrity rules for reducing data redundancy and includes the following:
increasing data consistency. ••Steps for the installation of DBMS
4. Concurrency control: It supports sharing of ••Steps to use the DBMS or application program
data, so, it has to provide an approach for managing
••Steps for the backup of DBMS
concurrent access of the database. Hence, preserv-
ing the inconsistent state and integrity of the data. ••Steps to change the structure of DBMS
5. Transaction management: It provides an ••Steps for the generation of reports.
approach to ensure that either all the updates for a
5. Data access language: The users can use it to
given transaction will execute or that none of them
access the data to and from the database. The func-
would execute.
tion of data access language is the entry of new data,
6. Backup and recovery: It provides mechanisms
manipulation of the existing data and the retrieval of
for backing up data periodically and recovering
the existing data in the database. The most popular
from different types of failures, thus, preventing
database access language is SQL (Structured Query
loss of data.
Language). Users can perform these functions with
7. Non-Procedural query language: It provides
the help of commands. The role of administrator is
with query language for retrieval and manipulation
to access, to create and to maintain the database.
of data.
6. People: Persons involved to access, to create and
8. Security: It protects unauthorized access in the
to maintain the database are called users. These
database. It ensures the access to authorized users.
are of various types according to the role performed
by them (Fig. 8.1). These are as follows:
8.2 COMPONENTS OF DATABASE ••System Administrator: The role of system
SYSTEMS a dministrator is to supervise the general opera-
tions of DBMS.
••Database Administrator: The role of database
DBMS consists of several components, namely software,
hardware, data, procedures and data access language. administrator (DBA) is to manage the DBMS.
These components are responsible for the definition, col- ••Database Designer: The role of database
lection, management and use of data within the envi- designer is to design the structure of the database.
ronment. Figure 8.1 shows the components of database ••Application Programmer: The role of
system. The description of each component is as follows: application programmer is to create the data
1. Software: It is the collection of programs used entry forms, reports and procedures.
by the computers within the database system. It is ••End User: The role of end user is to use the
used to handle, control and manage the database. application programs by entering new data and
It includes the following software: manipulating and accessing existing data.
5. Network model: In this model, data is repre- Table 8.1 | Notations of fundamental operations
sented as record types. The data in this model has of relational algebra
many-to-many relationship.
6. Hierarchical model: In this model, the data is Fundamental Operation Symbol
represented as a hierarchical tree structure. Select s
Project p
8.4.1 Relational Model
Union ∪
The database in relational model is represented as a col- Intersection ∩
lection of relations (tables). A relation is a kind of set. It Set difference −
is also a subset of a Cartesian product of an unordered
set of ordered tuples. Relational model was proposed by Cartesian product ×
E. F. Codd, which stores data in a tabular form. It con- Rename r
sists of a table where rows represent records and columns Natural join
represent the attributes. It has various terminologies as
follows:
Fundamental operations of relational algebra are as
1. Tuple: It represents a single row of a table, which follows (see Table 8.1):
contains a single record for that relation.
1. Select: It is used to select rows from a relation. It
2. Relation instance: It represents a finite set of
is denoted by s.
tuples in the relational database system.
3. Relation schema: It represents the relation Syntax of select s p (r), r is relation and p is prepo-
name, that is, table name, attributes and their sitional logic.
names. p uses connectors and operators Ù, Ú, =, ≠,<,>, £, ³.
4. Relation key: It represents the unique key for For example, s empname = "John" (emp).
the relation or table. Each row has one or more
attributes, which can identify the row in the table 2. Project: It is used to project columns in a relation.
uniquely. It is denoted by p . The duplicate tuples are auto-
5. Attribute domain: It represents the predefined matically eliminated.
value scope of each attribute. Syntax of projects p A (r), r is relation and A is the
attribute name in a relation.
8.4.1.1 Constraints in Relational Model For example, p empname, sal (emp)
Constraints are the restrictions that one wishes to apply 3. Union: It returns a relation instance, which con-
on database. The following constraints are applied on tains all tuples occurring in the first relation or in
relational model. the second relation. It is denoted as R∪S , where R
and S are two relations. The duplicate tuples are
1. Key constraints: Each relation has at least one automatically eliminated.
minimal subset of attributes, which can identify a This operation is valid for the following:
tuple uniquely. ••Both relations must have the same number of
••No two tuples have identical value for key at-
attributes.
tributes. ••Attribute domains must be compatible.
••Key attribute does not have NULL value.
4. Intersection: It returns a relation instance,
2. Domain constraints: Attributes have specific which contains all tuples occurring in both
domain values in real world. For example, value of relations. It is denoted as R ∩ S , where R and S
age can only be positive. are two relations.
3. Referential integrity constraints: If a relation
5. Set difference: It returns a relation instance,
refers to a key attribute of a different relation then
which contains all tuples that occur in the first
that key element must exist.
relation but not in the second relation. It is denoted
as R − S , where R and S are two relations.
8.4.1.2 Relational Algebra 6. Cartesian product: It returns a relation
instance, which contains all the fields of the first
It is a procedural query language. It takes instances of relation followed by all the fields of the second
relations as input and returns instances of relations as relation. It is denoted as R×S , where R and S are
output. Operators are used to perform queries. two relations.
7. Rename: It returns a relation but without any { t | Condition}, wherte t is a tuple variable and Condition
name. It is used to rename the output relation. It is a conditional expression.
is denoted as r. Preliminaries of tuple calculus are as follows:
8. Joins: It returns combined information from two
or more relations. 1. Constants
2. Predicates
••Condition joins: It accepts a join condition c 3. Boolean and, or, not
and a pair of relation instances as arguments, 4. $ there exists
and returns a relation instance. It is denoted as 5. " for all
s c (R×S ) .
••uijoin: It is a special case of condition joins
8.4.2 ER Model
where the condition c contains equalities.
••Natural join: It is a Cartesian product of two
ER model represents the conceptual view of a database.
relations. It is denoted by . It describes the relation of data to each other (Table 8.2).
It was developed by Peter Chen in 1976. It views real-
8.4.1.3 Tuple Calculus world data as systems of entities and relationships.
ER model has three basic elements: entity, attribute
It is a non-procedural query language. In this, number and relationship. These are discussed in the following
of tuple variables is specified. It is represented as sections.
Weak entity
Order Order item
Attribute
Name
Simple attribute
Order
Name
Derived attribute
Date−of−birth Age
Single-valued attribute
Roll_number
Multi-value attribute
Phone_number
(Continued)
Relationship
One-to-One 1 1
relationship Department has Manager
One-to-Many
1 N
relationship Manager manages Employee
Many-to-One N 1
relationship Employee works Department
Many-to-Many N N
relationship Employee handles Projects
8.4.2.1 Entity For example, Details of Employee’s spouse, order item, etc.
Entities represent the real-world things. These are data Order Order item
objects which maintain different relationships with each
other, for example, Employee, Department, etc. These
are represented by means of rectangles.
8.4.2.2 Attributes
into sub parts. Examples are mobile number, roll For example contact number, email ids.
number.
Order Phone−number
Name
3. Derived attribute: Derived attributes are those Some basic terminologies related to relationship are given
whose value is derived from some other attribute below.
in the database. For example, age of person can be As we have discussed, relations are the core compo-
calculated from date of birth. Dotted oval is used nents in RDBMS, these relations are defined by two
to represent derived attributes. major characteristics–relationship set and the degree of
relationship, defined in the following text.
1. Relationship set: Relationship of similar type
is called relationship set. It has attributes. These
attributes are called descriptive attributes.
2. Degree of relationship: It defines the number
For example, of participating entities in a relationship. They are
of the following types:
••Unary relationship (Degree 1): One entity
participates. For example,
Date−of−birth Age
Manages
Specialization
Course Registration Faculty Employee
1 1
Department has Manager Employee works Department
2. Entity integrity constraint: It defines that Also, sometimes called, Project-Join Normal Form
the primary keys cannot be null. There must be a (PJNF).
proper value in the primary key field.
3. Referential integrity constraint: It is speci- 8.5.3 Attribute Closure
fied between two tables. It is used to maintain
the consistency among rows between the two Set of all attributes functionally determined by X is
tables. called closure of X. Closure set of X is denoted by X+.
4. Foreign key integrity constraint: There are
two types of foreign key integrity constraints:
Problem 8.1: Functional dependencies F = {A → B,
••Cascade update related fields: Whenever the B → C, C → D} is given, find closure of A, B, C
primary key of a row in the primary table is and D.
changed, the foreign key values are updated in
the matching rows in the related table. Solution:
••Cascade delete related rows: Whenever a row in
Closure of A = A+ = (A, B, C, D)
the primary table has been deleted, the match-
ing rows are automatically deleted in the related Closure of B = B+ = (B, C, D)
table. Closure of C = C+ = (C, D)
Closure of D = D+ = (D)
8.5.2 Normal Forms
in a table. Every candidate key is a superkey but not (a) Algorithm for finding decomposition is lossless:
vice versa.
tep 1: Union of all decomposed sub-relation
S
In other words, candidate key is the minimal super- should be equal to relation R.
key. If X is a superkey and none of the proper subset of
X is a superkey, then X is called the minimal superkey R1 ∪ R2 ∪ R3 ∪ … ∪ Rn = R
or candidate key.
tep 2: Any two sub-relations Ri and Rj can be
S
merged into Rij with R1 ∪ R2 only if
Closure of C = C+ = {C, D}
Problem 8.4: Consider a schema R (ABCDEFGHIJ)
Closure of AB = AB+ = {A, B, C, D, E}
and functional dependencies {FDs = (AB → C, A →
So, AB and B are superkeys. But only B is the can- D, B → F, F → GH, D → IJ)} and decompositions
didate key. (a) {D = (ABCDE, BFGH, DIJ)}
(b) {D = (ABCD, DE, BF, FGH, DIJ)}
Check whether the decomposition is lossless or not.
8.5.4.3 Primary Key
Solution:
It is one or more data attributes that uniquely identify (a) Given
an entity. It does not allow null values. R1 = (ABCDE)
1. Alternate key: The candidate key, which is not R2 = (BFGH)
selected as a primary key.
2. Composite key: It consists of two or more R3 = (DIJ)
attributes. Apply algorithm for lossless decomposition:
3. Foreign key: It is an entity that is the reference
to the primary key of another entity. Step 1:
R1 ∪ R2 ∪ R3 = {(ABCDE ) ∪ (BFGH ) ∪ (DIJ )} = (ABCDEFGHIJ
8.5.5 Decomposition R1 ∪ R2 ∪ R3 = {(ABCDE ) ∪ (BFGH ) ∪ (DIJ )} = (ABCDEFGHIJ ) = R
It is required to eliminate redundancy from the schema. Step 1 satisfies the given condition, so it is true.
If a relational schema R has redundancy in the data, Step 2:
then decompose R into two R1 and R2 schema. There For R1 and R2:
are two properties, which should be maintained when
we perform decomposition, it should be a lossless join as R1 ∩ R2 = (ABCDE ) ∩ (BFGH ) = B
well as dependency preserving.
Find closure of B = B+ = {B, F, G, H} .
Condition (ii) is satisfied, so R1 and R2 can be
8.5.5.1 Lossless Join
merged together. After merging R1 and R2,
Let R be a relation schema and let F be a set of func- R12 = (ABCDEFGH)
tional dependency (FD) over R. R is decomposed into R1
and R2. R1 and R2 are called lossless-join decomposition Now merge R12 and R3.
if R = R1 R2 or if we can recover original relation
from the decomposed relation. R12 ∩ R3 = (ABCDEFGH ) ∩ (DIJ ) = D
••Syntax of grant command in logarithmic time. B trees are a general form of binary
trees where a node can have more than one child.The
GRANT privilege_name ON object_name
B-trees are efficient for those systems that read and write
TO {user_name | PUBLIC | role_name}
large blocks of data, that is, databases and file systems.
[with GRANT option];
B trees are self-balancing trees. All the leaf nodes are
For example, GRANT SELECT ON emp TO
at the same level. A B tree with order p has:
user1;
1. Root node may contain minimum 1 key
••Syntax of revoke command p
2. Minimum number of child = − 1
REVOKE privilege_name ON object_name 2
3. Maximum number of children = p
FROM {User_name | PUBLIC | Role_name};
4. Maximum keys = p − 1
For example, REVOKE SELECT ON emp TO
user1; The order of B-tree can be found as follows:
p × P + ( p − 1) (K + Pr ) ≤ Block size
8.7 FILE STRUCTURES where p is order of the tree, P is the block pointer, Pr is
the record pointer and K is the key pointer.
File structure mainly deals with how files are stored on
the disk. Various file organisations are described below. Problem 8.7: The order of B-tree index is the maxi-
mum number of children it can have. Suppose that
8.7.1 Sequential Files a block pointer takes 6 bytes, the search field value
takes 9 bytes, record pointer takes 7 bytes and the
It is a file organisation system where every file record block size is 512 bytes. What is the order of the B
contains an attribute to uniquely identify a particular tree?
record. Records are placed in a sequential order with
some unique key. Solution:
Given that block size = 512 B; record pointer (Pr) =
8.7.2 Indexing 7 B; block pointer (PB) = 6 B; key pointer (K) = 9
B. So, we have
Indexing is a data structure mechanism to efficiently
retrieve records from the database, for example, book p ×P + ( p − 1) (K + Pr ) ≤ Block size
index. It is defined based on its indexing attributes.
Indexing is of three types: 6p + (p − 1)(9 + 7) = 512
Order of leaf nodes: (no transaction will affect the existence of any
Pleaf ×[k + Pr ] + PB ≤ Block size other transaction).
4. Durability: Persistence of data even if the system
where p is the order of internal nodes, Pr is record fails and restarts.
pointer, PB is block pointer, k is key pointer, Pleaf is the
order of leaf nodes.
8.8.2 Schedule
Problem 8.8: The order of an internal node in a B+
A chronological execution sequence of transactions is called
tree index is the maximum number of children it can
schedule. It is a list of actions reading, writing, aborting
have. Suppose that a block pointer takes 6 bytes, the
or committing from a set of transactions. A schedule can
search field value takes 9 bytes, record pointer take
have many transactions. Schedule can be further divided
7 bytes and the block size is 512 bytes. What is the
into two types–serial schedule and concurrent schedule.
order of the internal node and leaf node?
8.8.2.3 Comparison between Serial and Example: Let value of data item A = 1000, X =
Concurrent Schedule 100 and Y = 200, then transaction Ti will update
the database with value 1100. Transaction Tj will
Table 8.5 shows the comparison between serial and con- read data item value as 1100. Transaction Tj will
current schedule. update database with value 1300. As transaction Ti
fails, the database value should be 1200. Therefore,
Table 8.5 | Serial schedule vs. concurrent schedule this problem is known as uncommitted read prob-
lem or dirty read problem.
Serial Schedule Concurrent Schedule
2. RW problem (Write after Read): RW
All serial schedules are All serial schedules are problem is also known incorrect summary prob-
consistent schedules not consistent schedules lem or unrepeatable read problem (Fig. 8.8). Let
Less throughput More throughput there be two transactions Ti and Tj of schedule S.
Poor resource utilization Good resource utilization Transaction Ti read a data item and Tj also read
similar data item. Transactions Ti and Tj both
More response time Less response time have write operation on that data item. If in both
For the given For the given transactions, read operation occurs before commit
transactions, the number transactions, the number of the other transaction and before writing back
of serial schedule is of concurrent schedule is that data item in database by other transaction
very much less than the more than the number of then RW problem will arise.
number of concurrent serial schedule
schedule Transaction Ti Transaction Tj
Read (A)
8.8.2.4 Problems Occurring due to If (A > 0)
Concurrent Schedule A=A+1
Read (A)
In concurrent schedule, more than one transaction is
If (A > 0)
executed simultaneously; and due to this some problem
A=A+1
arises with concurrent schedule given as follows:
Write (A)
1. WR problem (Read after Write): WR problem Commit Write (A)
is also known as dirty read problem or uncommit- Commit
Let there be two transactions Ti and Tj of schedule S. If Schedule Problem Exists Problem
transaction Tj reads a data item which is updated by Remove
transaction Ti and transaction Tj commits before the
commit (or rollback) of transaction Ti , then the given Irrecoverable All None
schedule S is called irrecoverable schedule. Recoverable Incorrect summary Irrecoverable
problem (RW),
Uncommitted Read
8.8.3.2 Recoverable Schedule problem (WR
problem), lost
Let a schedule S have two transactions Ti and Tj . If
update (WW) and
transaction Tj reads a data item which is updated by
cascading rollback
transaction Ti and transaction Tj is not allowed to
problem
commit (or rollback) before the commit (or rollback) of
transaction Ti , then the given schedule S is called recov- Cascading Incorrect summary Irrecoverable,
erable schedule. Recoverable schedule may suffer from rollback problem (RW) and Uncommitted
uncommitted read, lost update and incorrect summary recoverable lost update (WW). Read problem
problem. (WR problem)
and cascading
rollback
8.8.3.3 Cascading Rollback Recoverable problem.
Schedule
Strict Incorrect summary Irrecoverable,
Let there be four transactions (T1 , T2 , T3 , T4 ) in a given recoverable problem (RW) Uncommitted
schedule S. In schedule S, if rollback of a transaction (T1 ) Read problem
results in the rollback of the other transactions (T2 , T3 , T4 ) (WR problem),
(because of their dependency on each other), then this is lost update
called cascading rollback. (WW) and
cascading
If a schedule is recoverable and has no cascading roll- rollback
back, then it is called cascading rollback recoverable problem.
schedule. Incorrect summary and lost update problems
Problem 8.9: A concurrent schedule S has three Step 3: Check for strict recoverable. There
transactions T1 , T2 , T3. Transactions execute read/write should be commit operation between write operations
operation in the following sequence: on similar data items by two different transactions. In
the given problem, transaction T2 and transaction T3
Read1(x), Read2 (z), Read3 (x), Read1(z), Read2 (y), have write operations on data item (y), but no commit
Read3 (y), Write1(x), Commit1 , Write2 (z), Write3 (y), operation is performed by transaction T3 before data
item is updated by transaction T2. Hence, the given
Write2 (y), Commit3 , Commit2 .
schedule is not strict recoverable.
Find out the type of recoverable schedule?
Step 1: Check for recoverable. By using the defi- If a pair of operations do not fulfill the above three con-
nition of recoverable schedule we can find that given ditions, then the pair is call a non-conflict pair.
schedule is recoverable or not. (In this problem, no
transaction has read operation on a data item after
that data item is written by another transaction, so Problem 8.10: A schedule S1 with two transactions
the given schedule is recoverable.) T1 and T2 is given as:
Step 2: Check for cascading recoverable. In Read1(x), Write1(x), Read2 (x), Read1(y), Write1(y),
cascading recoverable schedule, no uncommitted read
Read2 (y)),....
is allowed. In the given problem, there is no uncommit-
ted read so the given schedule is cascadless schedule. Find conflict equivalent schedule to S1 .
Solution:
Transaction T1 Transaction T2
Vertices V1 and V2 represent transactions T1 and T2.
Read1 (x) ------ Edge 1 for
V2 : Read(x) operation and V1 : Write (x) operation
Write1 (x) ------
Edge 2 for
Read1 (y )
V1 : Write(y) operation and V2 : Read (y) operation
------
Read2 (x )
1
------
Write1 (y ) ------ V2
Read2 (y )
V1
------
2
Conflict equivalent schedules.
Solution: T1 T2 T3
Vertices V1, V2 and V3 represent transactions T1, T2 ------ ------ Read (x)
and T3.
------ Read (x) ------
Edge 1 for
V1 : Write(x) operation and V2 : Read (x) operation ------ ------ Write (x)
Edge 2 for Read (x) ------ ------
V2 : Read (x) operation and V3 : Write(x) operation Write (x) ------ ------
Edge 3 for
V2 : Write(y) operation and V1 : Write(y) operation
Solution: We have a concurrent schedule S1 with
Edge 4 for three transactions, so the number of possible serializ-
V1 : Write(y) operation and V3 : Read (y) operation able schedules with three transactions is 8 (23 = 8).
Edge 5 for So, we will pick one serializable schedule S2 and if
V2 : Write(y) operation and V3 : Read (y) operation the given concurrent schedule is be view equivalent
to that schedule, only then would the given schedule
will be view serializable.
3
V1
T1 T2 T3
1
V2 ------ ------ Read (x)
4
------ Read (x) ------
2 and 5 ------ ------ Write (x)
Read (x) ------ ------
V3
Write (x) ------ ------
•• Transaction T issues Write(x) operation to inspect all the operation performed by trans-
(a) If (RTS(x) > Time Stamp( Ti )) actions in a schedule. There are various deadlock
Rollback Ti prevention scheme which uses time stamp ordering
mechanism.
(b) If (WTS(x) > Time Stamp( Ti )) 6. Wait-Die Protocol: Assume TimeStamp (T1 ) < … < TimeSt
Rollback Ti TimeStamp (T1 ) < … < TimeStamp (Tn ) and Ti and Tj are any two
Note: Basic Time Ordering protocol ensures seri- transactions and transaction Ti is older than trans-
alizability, equivalent serial schedule based on time action Tj . This implies
stamp Ordering. It is deadlock free protocol, but
basic time ordering protocol can have starvation TimeStamp (Ti ) < … < TimeStamp (Tj ).
and irrecoverable schedule (because it does not •• If transaction Ti required a lock which is hold
ensure order of commit). by transaction Tj, then transaction Ti is allowed
4. Strict Time Ordering Protocol: In this pro- to wait.
tocol, an additional condition is added in the Basic •• If transaction Tj required a lock which is hold by
Time Ordering protocol. This condition is as follows: transaction Ti then transaction Tj will rollback.
Let a schedule have two transactions Ti and Tj 7. Wound-Wait Protocol: Let TimeStamp (T1 ) < … < TimeSt
where time stamp of ( Ti ) is less than timeTimeStamp
stamp (T1 ) < … < TimeStamp (Tn ) and Ti and Tj are any two
of Tj . If transaction Tj issues Read(x)/ Write(x) transactions and transaction Ti is older than trans-
operation with WTS(x) < Time Stamp (Tj ) then action Tj . This implies
(Tj ) has to be delayed Read(x)/ Write(x) opera-
tion until commit or rollback of transaction Ti TimeStamp (Ti ) < TimeStamp (Tj )
that has performed Write(x). •• If transaction Ti required a lock which is hold
Strict Time Ordering Protocol ensures serializability by transaction Tj then rollback transaction Tj.
and deadlock free. But it may suffer from starvation. •• If transaction Tj required a lock which is hold
5. Deadlock Prevention: To prevent deadlock by transaction Ti then transaction Tj is allowed
situation in a database system it is very important to wait.
IMPORTANT FORMULAS
1. All the 3NF and BCNF decomposition guarantee 5. CP: Child Pointer
lossless join.
6. KV: Key Value
2. All 3NF and BCNF decomposition may not guar-
7. Order of internal node in B+ tree is
antee dependency preservation.
PI . CP + (PI − 1)KV ⇐ Block size
3. Any relation with two attributes is in BCNF.
4. Functional dependency F: X → Y implies that for
any two tuples if t1[X ] = t2[X ], they must have
t1[Y ] = t2[Y ].
Inference Rules:
1. Reflexivity 5. Decomposition
If Y ⊆ X, then X → Y If X → YZ, then X → Y and X → Z
2. Augmentation 6. Pseudo-transitivity
If X → Y, then XZ → YZ for any Z If A → B and BC → D, then AC → D
3. Transitivity 7. Number of rows in cross product = r1 × r2
If X → Y and Y → Z, then X → Z
8. Number of attributes in cross product = m + n
4. Union
If X → Y and X → Z, then X → YZ 9. Complete set of operations are {s, p, ∪, −, ×}
10. In clustering index, file is ordered on non-key field. 12. Different kind of keys — Primary Key, Secondary
Key, Candidate Key, Alternate key, Foreign Key,
11. Secondary indexes are created on unordered file on Superkey, Compound key (or Composite key)
either key or non-key field. (or concatenated key).
SOLVED EXAMPLES
1. Which of the following operations does not modify Solution: Lost update, inconsistent data and
the database? uncommitted (WR) problems are overcome by
database lock.
(a) Sorting (b) Insertion at the beginning
Ans. (d)
(c) Append (d) Modify
6. Database language may consist of the following?
Solution: Sorting does not involving addition/
(a) DDL and DML
insertion or modification of any element.
(b) DML
Ans. (a)
(c) Query language
2. Which of the following operations is based on rela- (d) DDL, DML, query language
tional algebra?
Solution: DDL, DML, query language are all data
(a) rename and union base languages.
(b) select and union Ans. (d)
(c) only union
7. Which of the following is not a function of DBA?
(d) rename, select and union
(a) Network maintenance
Solution: Rename, select and union are some of (b) Routine maintenance
the operations of relational algebra. (c) Defining the schema
Ans. (d) (d) Data access through authorization
3. ACID properties of a transaction are Solution: The role of database administrator is to
(a) atomicity, constant, integrity, durability. manage the DBMS.
(b) atomicity, consistency, isolation, durability. Ans. (a)
(c) atomicity, compact, in-built, durability. 8. In a relational database the category type of infor-
(d) automatically, consistency, isolation, database. mation is represented in
Solution: ACID properties are atomicity, consis- (a) tuple. (b) field.
tency, isolation and durability. (c) primary key. (d) database name.
Ans. (b) Solution: Field shows type of information in rela-
tional database.
4. Which normal form is concerned with removing
Ans. (b)
transitive dependency?
9. Which key is used to identify a tuple uniquely?
(a) 1NF (b) 2NF
(c) 3NF (d) BCNF (a) Primary key (b) Tuple key
(c) Unique key (d) Domain key
Solution: 3NF is also the highest normal form.
Solution: Primary key.
Ans. (c) Ans. (a)
5. The concept Database Lock is used to overcome
10. An association of the information is represented by
the problem of
(a) an attribute. (b) a relationship.
(a) lost update and inconsistent data.
(c) a normal form. (d) the records.
(b) uncommitted dependency and inconsistent data.
(c) inconsistent data only. Solution: Relationship shows association of
(d) lost updates, inconsistent data and uncommitted information.
dependency. Ans. (b)
11. In which stage of database design, all the necessary Solution: It provides data storage and responsive-
fields and their types of a database are listed? ness to queries.
Ans. (a)
(a) Data definition (b) Data field definition
(c) E-R diagram (d) User definition 17. If D1, D2, … Dn are domains in a relational
model, then the relation is a table, which is a
Solution: In data definition stage, all the fields
subset of
and their types are listed.
(a) D1 ⊕ D2 ⊕ … ⊕ Dn (b) D1 × D2 × … × Dn
(c) D1 ∪ D2 ∪ … ∪ Dn (d) D1 ∩ D2 ∩ … ∩ Dn
Ans. (a)
(a) Time stamp protocols avoid deadlock. (a) CD (b) EC (c) AE (d) AC
(b) Locking technique is used to avoid deadlock.
(c) 2Phase locking does not provide serializability. Solution: To find key, we try to find closure.
(d) 2Phase deals with input and storing phase. (CD)+ = {C, D, F}
Solution: Time stamp protocol is a deadlock free (EC)+ = {A, B, C, D, E, F}
protocol. (AE)+ = {A, B, E}
(AC)+ = {A, B, C, F}
Ans. (a)
15. Data integrity control makes use of for
maintaining the integrity of database. Only EC satisfies the closure property.
Ans. (b)
(a) specific alphabets (uppercase and lowercase
alphabets) 20. Give the following relation instance:
(b) passwords
(c) relational algebra x y z
(d) storing on a backup hard disk 1 4 2
1 5 3
Solution: It promotes and enforces some integ- 1 6 3
rity rules for the reducing data redundancy and 3 2 2
increasing data consistency.
Ans. (a) Which of the following functional dependencies are
16. Data warehouse provides satisfied by the instance?
Solution: Functional dependency should uniquely maximum and minimum sizes of the join, respec-
identify the value. From the given options only the tively, are
following satisfies this condition:
(a) m + n and 0. (b) mn and 0.
XY → Z, YZ → X, Y → Z AND Y → X (c) m + n and |m − n|. (d) mn and m + n.
Ans. (b)
Solution: Number of tuples in their join gets mul-
21. Relation R is decomposed using a set of functional tiplied with number of tuples in both the relations.
dependencies F, and relation S is decomposed So, the maximum and minimum tuples will be mn
using another set of functional dependencies G. and 0, respectively.
One decomposition is definitely BCNF, the other is Ans. (b)
definitely 3NF, but it is not known which is which.
To make a guaranteed identification, which on one 23. Given the relations
of the following tests should be used on the decom-
Employee(name, salary, deptno) and depart-
positions? (Assume that the closures of F and G
ment (deptno, deptname, address), which
are available.)
of the following queries cannot be expressed
(a) Dependency-preservation using the basic relational algebra operations
(b) Lossless-join (s, p, , E, ∩, −)?
(c) BCNF definition (a) Department address of every employee
(d) 3NF definition (b) Employees whose name is the same as their
department name
Solution: All the 3NF and BCNF decomposition (c) The sum of all employees’ salaries
guarantee lossless join. All 3NF and BCNF decom- (d) All employees of a given department
position does not guaranty dependency preservation.
According to this, only option (c) can be correct. Solution: Aggregate functions such as sum, avg,
Ans. (c) min and max cannot be expressed in terms of
basic relational algebra operations. These require
22. Consider the join of a relation R with a relation extended relational algebra.
S. If R has m tuples and S has n tuples then the Ans. (c)
1. Which of the following scenarios may lead to an 2. Consider the following SQL query:
irrecoverable error in a database system?
SELECT distinct a1,a2,K,an a
(a) A transaction writes a data item after it is read FROM r1,r2,L,rm r
by an uncommitted transaction. WHERE P
(b) A transaction reads a data item after it is read
by an uncommitted transaction. For an arbitrary predicate P, this query is equiva-
(c) A transaction reads a data item after it is lent to which of the following relational algebra
written by a committed transaction. expressions?
(d) A transaction reads a data item after it is (a) Πa1 ,a2 ,…,a n s p (r1 × r2 × × rm )
written by an uncommitted transaction.
(GATE 2003: 1 Mark) (b) Πa ,a , K ,an s p (r1 r2 L rm)
1 2
consider all the tables r1, r2, … rn. Row (s ) selects Grades: (Roll_number, Course_number, Grade)
all the rows satisfying predicate P. SELECT distinct Name
Ans. (a) FROM Students, Courses, Grades
3. Consider the following functional dependencies in a WHERE Students.Roll_number = Grades.
database: Roll_number
Date_of_birth → Age and Courses.Instructor = Shyam
Age → Eligibility and Courses.Course_number = Grades.
Name → Roll_number Course_number
Roll_number → Name and Grades.grade = A
Course_number → Course_name Which of the following sets is computed by the
above query?
Course_number → Instructor
(Roll_number, Course_number) → Grade (a) N ames of students who have got an A grade in
all courses taught by Shyam.
The relation (Roll_number, Name, Date_of_
(b) Names of students who have got an A grade in
birth, Age) is
all courses.
(a) In Second normal form but not in Third normal (c) Names of students who have got an A grade in
form at least one of the courses taught by Shyam.
(b) In Third normal form but not in BCNF (d) None of the above.
(c) In BCNF (GATE 2003: 2 Marks)
(d) In none of the above
(GATE 2003: 2 Marks) Solution: Use of distinct selects the name only
once. Due to distinct selected names will be unique,
Solution: The applicable FDs are: so one person will be selected only once, in spite of
Date_of_birth → Age getting grade A in any number of courses taught
Age → Eligibility by Shyam.
Ans. (c)
Name → Roll_number
Roll_number → Name 5. Consider three data items D1, D2 and D3, and the fol-
The candidate keys are lowing execution schedule of transactions T1, T2 and
(Roll_number, Date_of_birth) and (Name, T3. In the diagram, R(D) and W(D) denote the actions
Date_of_birth) reading and writing the data item D, respectively.
For candidate key the FD Date_of_birth → Age
T T2 T3
is a partial dependency.
So, the relation is in 1NF. R(D3);
Candidate keys for given relation are: (date of R(D2);
birth, name) and (date of birth, roll_number)
W(D2);
Data of Birth → Age
R(D2);
Age (non-prime) is partially dependent upon Date
of Birth which is prime attribute. According to R(D3);
2NF definition, every non-prime attribute should R(D1);
be fully functional dependent on key, so it is not in W(D1);
2NF. Therefore, it will not be in 3NF and BCNF
W(D2);
also.
Ans. (d) W(D3);
R(D1);
4. Consider the set of relations shown below and the
SQL query that follows: R(D2);
W(D2);
Students: (Roll_number, Name, Date_of_
birth) W(D1);
Courses: (Course number, Course_name,
Instructor) Which of the following statements is correct?
(a) The schedule is serializable as T2; T3; T1. present in (Student * Enroll), where `*’ denotes
(b) The schedule is serializable as T2; T1; T3. natural join?
(c) The schedule is serializable as T3; T2; T1.
(a) 8, 8 (b) 120, 8
(d) The schedule is not serializable.
(c) 960, 8 (d) 960, 120
(GATE 2003: 2 Marks)
(GATE 2004: 1 Mark)
Solution: Let us draw a flow diagram between
the three schedules. An edge is drawn between two Solution: The maximum and minimum number
schedules if there exists read-write or write-write of tuples presented in (Student * Enroll) would be
dependency between them. represented by the minimum of these 120 and 8,
that is, min(120, 8) = 8
Natural join has inbuilt condition of equality. So in
T2 both the relations, minimum and maximum tuples
T3 will be 8.
Ans. (a)
8. It is desired to design an object-oriented employee
record system for a company. Each employee has
a name, unique id and salary. Employees belong
T1 to different categories and their salary is deter-
mined by their category. The functions getName,
getId and compute salary are required. Given the
The schedule has cycle between T1 and T3. So, the class hierarchy below, possible locations for these
schedule is not serializable. functions are:
Ans. (d)
(i) getId is implemented in the superclass
6. Let R1(A, B, C, D) and R2(D, E) be two relation (ii) getId is implemented in the subclass
schema where the primary keys are shown (iii) getName is an abstract function in the
underlined, and let C be a foreign key in R1 refer- superclass
ring to R2. Suppose there is no violation of the (iv) getName is implemented in the superclass
above referential integrity constraint in the corre- (v) getName is implemented in the subclass
sponding relation instances r1 and r2. Which one of (vi) getSalary is an abstract function in the
the following relational algebra expressions would superclass
necessarily produce an empty relation? (vii) getSalary is implemented in the superclass
(a) ΠD (r2 ) − ΠC (r1 ) (b) ΠC (r1 ) − ΠD (r2 ) (viii) getSalary is implemented in the subclass
9. The relation scheme Student Performance (Name, 11. The order of an internal node in a B+ tree index
CourseNo, RollNo, Grade) has the following func- is the maximum number of children it can have.
tional dependencies: Suppose that a child pointer takes 6 bytes, the
Name, CourseNo → Grade search field value takes 14 bytes, and the block
RollNo, CourseNo → Grade size is 512 bytes. What is the order of the internal
Name → RollNo node?
RollNo → Name (a) 24 (b) 25
The highest normal form of this relation scheme is (c) 26 (d) 27
(a) 2NF (b) 3NF (GATE 2004: 2 Marks)
(c) BCNF (d) 4NF
Solution: Order of internal node is PI . CP + (PI − 1)
(GATE 2004: 2 Marks) KV <= Block Size
Child pointer (CP) = 6 B
Solution: Name and RollNo are candidate keys. Key value (KV) = 14 B
The attributes are not repeated. So, the relation Block size = 512 B
schema is in 1 NF. 6 + (PI − 1) × 14 = 512
There is no partial dependency. So, the relation is PI = 26
in 2NF. Ans. (c)
Grade is not fully dependent upon all candidate
keys, so it is not in 3NF. 12. The employee information in a company is stored
Candidate keys for given relation are: (rollNo, in the following relation:
courseNo) and (name, courseNo) there is no partial Employee(name, sex, salary, deptName)
and transitive dependencies. So the highest normal
form is 3NF. name → rollNo and rollNo → name Consider the following SQL query:
are violating the condition of BCNF.
SELECT deptName
Ans. (b)
FROM Employee
10. Consider the relation Student (name, sex, marks), WHERE sex = ‘M’
where the primary key is shown underlined, per-
taining to students in a class that has at least one GROUP by deptName
boy and one girl. What does the following rela- HAVING avg(salary)>
tional algebra expression produce? (SELECT avg (salary) FROM Employee)
(Note: r is the rename operator).
It returns the names of the department in which
Π name (rsex = female (Student) − Π name (Studentsex = female Λx = maleΛmarks≤m rn ,x ,m (Student))
(a) the average salary is more than the average
) − Π name (Studentsex = female Λx = maleΛmarks≤m rn ,x ,m (Student)) salary in the company.
(b) the average salary of male employees is more
(a) Names of girl students with the highest marks than the average salary of all male employees
(b) Names of girl students with more marks than in the company.
some boy student (c) the average salary of male employees is more
(c) Names of girl students with marks not less than than the average salary of employees in the
some boy student same department.
(d) Names of girl students with more marks than (d) the average salary of male employees is more
all the boy students than the average salary in the company.
(GATE 2004: 2 Marks)
(GATE 2004: 2 Marks)
Solution: The query is Part 1 − Part 2. Part 2 Solution: Inner query selects the average salary
of the query results in names of girls having marks of employees in the company. Outer query selects
less than or equal to marks of all the boys. And average salary of male employees. So, the whole
Part 1 selects the names of all the girls in class. So, query finds the department name of whose average
Part 1 − Part 2 will result in names of girl students salary of male employees is more than the average
having more marks than all the boy students. salary in the company.
Ans. (d) Ans. (d)
13. Which one of the following is a key factor for pre- Table r1 Table r2
ferring B+ trees to binary search trees for indexing
A B C A D
database relations?
1 10 100 1 1000
(a) Database relations have a large number of
records. 1 20 200 1 2000
(b) Database relations are sorted on the primary key.
(c) B+ trees require less memory than binary Table s (natural join of r1 and r2)
search trees.
(d) Data transfer from disks is in blocks. A B C D
(GATE 2005: 1 Mark)
1 10 100 1000
Solution: Indexing performs well for large data 1 20 200 1000
blocks. 1 20 100 200
B+ trees are preferred over the binary search
1 20 200 2000
trees.
B+ trees transfer data from disk to primary Ans. (c)
memory in form of data blocks.
In case of B+ trees, data moves in terms of blocks. 16. Let E1 and E2 be two entities in an E/R diagram
Ans. (d) with simple single-valued attributes. R1 and R2 are
two relationships between E1 and E2, where R1 is
14. Which one of the following statements about one-to-many and R2 is many-to-many. R1 and R2
normal forms is FALSE? do not have any attributes of heir own. What is the
(a) BCNF is stricter than 3NF. minimum number of tables required to represent
(b) Lossless, dependency-preserving decomposition this situation in the relational model?
into 3NF is always possible. (a) 2 (b) 3
(c) Lossless, dependency-preserving decomposition (c) 4 (d) 5
into BCNF is always possible. (GATE 2005: 2 Marks)
(d) Any relation with two attributes is in BCNF.
(GATE 2005: 1 Mark) Solution:
R2
Solution: It is not always possible to decompose E1 E2 E1 E2
a table in BCNF and preserve dependencies. For
l x L x
example, a set of functional dependencies {AB → C,
C → B} cannot be decomposed in BCNF. m y L y
Ans. (c) n z M y
15. Let r be a relation instance with schema R = The one-to-many relationships are represented
(A, B, C, D). We define r1 = PA, B, C(R) and r2 = with entity set from one side. In one-to-many rela-
PA, D(r). Let s = r1 * r2 where * denotes natural tionships, each entity in the entity set can be asso-
join. Given that the decomposition of r into r1 and ciated with at most one entity of the other. So, the
r1 is lossy, which one of the following is TRUE? table is not formed for R1. Hence, the tables are
formed for R2, E1 and E2.
(a) s ⊂ r (b) r ∪ s = r
Using cross-reference technique, minimum of 3
(c) r ⊂ s (d) r * s = s
tables are required.
(GATE 2005: 1 Mark)
Ans. (b)
Solution: 17. The following table has two attributes A and C
where A is the primary key and C is the foreign key
Table r
referencing A with on-delete cascade.
A B C D
A C
1 10 100 1000
2 4
1 20 200 1000
3 4
20 200 2000
(Continued)
Continued {A → B, BC → D, E → C, D → A}.
A C What are the candidate keys of R?
4 3 (a) AE, BE (b) AE, BE, DE
5 2 (c) AEH, BEH, BCH (d) AEH, BEH, DEH
7 2 (GATE 2005: 2 Marks)
9 5
Solution: Let S be a candidate key of relation R
6 4 if the closure of S is all attributes of R and there
is no subset of S whose closure is all attributes
The set of all tuples that must be additionally of R.
deleted to preserve referential integrity when the Closure of AEH: AEH+ = {ABCDEH}
tuple (2, 4) is deleted is
Closure of BEH: BEH+ = {ABCDEH}
(a) (3, 4) and (6, 4)
(b) (5, 2) and (7, 2) Closure of DEH: DEH+ = {ABCDEH}
(c) (5, 2), (7, 2) and (9, 5) Ans. (d)
(d) (3, 4), (4, 3) and (6, 4)
20. Consider the following log sequence of two transac-
(GATE 2005: 2 Marks) tions on a bank account, with initial balance 12000,
that transfer 2000 to a mortgage payment and then
Solution: When (2,4) is deleted, as C is a foreign apply a 5% interest.
key referring A with delete on cascade, all entries
with value 2 in C must be deleted. 1. T1 start
So, (5, 2) and (7, 2) are deleted. As a result of 2. T1 B old = 1200 new = 10000
this, 5 and 7 are deleted from A, which causes
(9, 5) to be deleted. 3. T1 M old = 0 new = 2000
Ans. (c) 4. T1 commit
5. T2 start
18. The relation book (title, price) contains the titles
and prices of different books. Assuming that no 6. T2 B old = 10000 new = 10500
two books have the same price, what does the fol- 7. T2 commit
lowing SQL query list?
Suppose the database system crashes just before log
SELECT title record 7 is written. When the system is restarted,
FROM book as B which one statement is true of the recovery procedure?
WHERE (SELECT count(*) (a) We must redo log record 6 to set B to 10500.
FROM book as T (b) We must undo log record 6 to set B to 10000
and then redo log records 2 and 3.
WHERE T.price > B.price) < 5
(c) We need not redo log records 2 and 3 because
(a) Titles of the four most expensive books transaction T1 has committed.
(b) Title of the fifth most inexpensive book (d) We can apply redo and undo operations in arbi-
(c) Title of the fifth most expensive book trary order because they are idempotent.
(d) Titles of the five most expensive books (GATE 2006: 1 Mark)
(GATE 2005: 2 Marks)
Solution: When a transaction is committed, no
Solution: The outer query selects all titles from need to redo or undo operations.
book table. For every selected book, the sub-query Ans. (c)
returns count of those books, which are more expen-
sive than the selected book. The where clause of outer 21. Consider the relation account (customer, balance)
query will be true for five most expensive books. where customer is a primary key and there are
Ans. (d) no null values. We would like to rank customers
according to decreasing balance. The customer with
19. Consider a relation scheme R = (A, B, C, D, E, H ) the largest balance gets rank 1. Ties are not broke
on which the following functional dependencies hold: but ranks are skipped: if exactly two customers
have the largest balance they each get rank 1, and (a) All queries return identical row sets for any database
rank 2 is not assigned. (b) Query 2 and Query 4 return identical row sets for
all databases but there exist databases for which
Query 1: SELECT A.customer, count(B. Query 1 and Query 2 return different row sets.
customer) FROM account A, account B (c) There exist databases for which Query 3 returns
WHERE A.balance ⇐ B.balance group by strictly fewer rows than Query 2
A.customer (d) There exist databases for which Query 4 will
Query 2: SELECT A.customer, 1+count(B. encounter an integrity violation at runtime.
customer) FROM account A, account B
WHERE A.balance < B.balance group by (GATE 2006: 2 Marks)
A.customer Solution: The output of Query 2, Query 3 and
Consider these statements about Query 1 and Query 2. Query 4 will be identical. Query 1 produces dupli-
cate rows. But row set produced by all the queries
I. Q uery 1 will produce the same row set as Query will be same. For example,
2 for some but not all databases.
II. Both Query 1 and Query 2 are correct imple- Table: Enrolled Table: Enrolled
mentation of the specification. Student Course Student Amount
III. Q uery 1 is a correct implementation of the
specification but Query 2 is not. A C1 A 100
IV. Neither Query 1 nor Query 2 is a correct imple- B C1 B 100
mentation of the specification. A C2 D 200
V. Assigning rank with a pure relational query
C C1
takes less time than scanning in decreasing
balance order assigning ranks using ODBC.
Output of Query 1: A B
Which two of the above statements are correct? Output of Query 2: A B
(a) II and V (b) I and III Output of Query 3: A B
(c) I and IV (d) III and V
Output of Query 4: A B
(GATE 2006: 2 Marks) Ans. (a)
Solution: Both query1 and query2 perform the 23. Consider the relation enrolled (student, course) in
same task, but none of them sort the customer which (student, course) is the primary key, and the
according to the decreasing balance. So, option (c) relation paid (student, amount) in which student
is correct. is the primary key. Assume no null values and no
Ans. (c) foreign keys or integrity constraints. Assume that
amounts 6000, 7000, 8000, 9000 and 10000 were
22. Consider the relation enrolled (student, course) in
each paid by 20% of the students. Consider these
which (student, course) is the primary key, and
query plans (Plan 1 on left, Plan 2 on right) to “list
the relation paid (student, amount) where student
all courses taken by students who have paid more
is the primary key. Assume no null values and no
than x”.
foreign keys or integrity constraints. Given the fol-
lowing four queries: A disk seek takes 4 ms, disk data transfer band-
Query 1: SELECT student FROM enrolled WHERE width is 300 MB/s and checking a tuple to see if
student in (SELECT student FROM paid) amount is greater than x takes 10 µs. Which of the
following statements is correct?
Query 2: SELECT student FROM paid WHERE
student in (SELECT student FROM enrolled) (a) P lan 1 and Plan 2 will not output identical row
Query 3: SELECT E.student FROM enrolled sets for all databases.
E, paid P WHERE E.student = P.student (b) A course may be listed more than once in the
Query 4: SELECT student FROM paid WHERE output of Plan 1 for some databases.
exists (c) For x = 5000, Plan 1 executes faster than Plan
2 for all databases.
(SELECT * FROM enrolled WHERE enrolled. (d) For x = 9000, Plan 1 executes slower than Plan
student = paid.student) 2 for all databases.
Which one of the following statements is correct? (GATE 2006: 2 Marks)
Project on course
Solution: Plans need to load both tables’ courses The closure of {AF}+ is {A F D E}. It cannot
and enrolled. So, disk access time is the same for drive all the members of set given in option (c). So,
both plans. option (c) is false.
Plan 2 does lesser number of comparisons com- Ans. (c)
pared to Plan 1.
25. Information about a collection of students is given
So, join operation will require more comparisons
by the relation studinfo(studId, name, sex). The
as the second table will have more rows in Plan 2
relation enroll(studId, courseId) gives which stu-
compared to Plan 1.
dent has enrolled for (or taken) what course(s).
The joined table of two tables will have more rows,
Assume that every course is taken by at least
so, more comparisons are needed to find amounts
one male and at least one female student. What
greater than x.
does the following relational algebra expression
Plan 1 executes faster than Plan 2. First tuples are
represent?
filtered then joined, whereas in plan 2 first tuples
are joined and then filtered, which takes more time. Π courseId (Π studId (s sex ="female"(studInfo)) × Π courseId (enroll))
Π courseId (Π studId (s sex ="female"(studInfo)) × Π courseId (enroll))
Ans. (c)
(a) C ourses in which all the female students are
24. The following functional dependencies are given:
enrolled.
AB → CD, AF → D, DE → F, C → G, F → E, G (b) Courses in which a proper subset of female
→A students are enrolled.
(c) Courses in which only male students are
Which one of the following options is false? enrolled.
(a) {CF}+ = {ACDEFG} (d) None of the above
(b) {BG}+ = {ABCDG} (GATE 2007: 2 Marks)
(c) {AF}+ = {ACDEFG} Solution:
(d) {AB}+ = {ABCDFG}
(i) Select studId of all female students and select
(GATE 2006: 2 Marks) all courseId of all courses.
(ii) The query performs a Cartesian product of the
Solution: Closure of AF or AF+ = {ADEF}, above select two columns in Step 1.
closure of AF does not contain C and G. (iii) It subtracts enroll table from the result of Step 2.
This will remove all the (studId, courseId) pairs, Solution: Q1 selects an employee but does not
which are present in enrol table. compute the result as after the `Where not exists’,
If all female students have registered in courses, it does not have statement to produce the result.
then this course will not be there in the subtracted Q2 selects an employee who gets higher salary
result. than anyone in the department 5. It correctly finds
So, the complete expression returns courses in employees who get higher salary than anyone in
which a proper subset of female students is enrolled. the department 5.
Ans. (b) Ans. (b)
26. Consider the relation employee (name, sex, 28. Which one of the following statements is FALSE?
supervisorName) with name as the key. super-
(a) Any relation with two attributes is in BCNF.
visorName gives the name of the supervisor of
(b) A relation in which every key has only one
the employee under consideration. What does
attribute is in 2NF.
the following Tuple Relational Calculus query
(c) A prime attribute can be transitively dependent
produce?
on a key in a 3NF relation.
{e.name|employee(e) ∧{(∀x) (d) A prime attribute can be transitively depen-
[¬employee(x) ∨ x.supervisorName ≠ dent on a key in a BCNF relation.
e.name ∨ x.sex = “male”)]}/ inserted on (GATE 2007: 2 Marks)
the left of expression
Solution: According to the definition of 3NF, a
(a) Names of employees with a male supervisor.
prime attribute can be transitively dependent on a
(b) Names of employees with no immediate male
key. Option (d) is incorrect because this does not
subordinates.
satisfy the condition of BCNF.
(c) Names of employees with no immediate female
Ans. (d)
subordinates.
(d) Names of employees with a female supervisor. 29. The order of a leaf node in a B+ tree is the maxi-
(GATE 2007: 2 Marks) mum number of (value, data record pointer) pairs
Solution: The query selects all those employees it can hold. Given that the block size is 1 KB, data
whose immediate subordinate is “male”. record pointer is 7 bytes long, the value field is
Ans. (b) 9 bytes long and a block pointer is 6 bytes long,
what is the order of the leaf node?
27. Consider the table employee (empId, name, depart- (a) 63 (b) 64
ment, salary) and the two queries Q1, Q2 below. (c) 67 (d) 68
Assuming that department 5 has more than one (GATE 2007: 2 Marks)
employee, and we want to find the employees who
get higher salary than anyone in the department Solution: Let the order of the leaf node be n.
5, which one of the statements is TRUE for any Block size = 1 KB = 1024 bits
arbitrary employee table? 6 + 7n + (n − 1)9 = 1024
Q1: SELECT e.empId 1 n = 64
FROM employee e Ans. (b)
WHERE not exists (SELECT * FROM
employee s WHERE s.department = “5”
30. Consider the following schedules involving two
and s.salary >= e.salary) transactions. Which one of the following statements
is TRUE?
Q2: SELECT e.empId
FROM employee e S1: r1(X ); r1(Y ); r2(X ); r2(Y ); w2(Y ); w1(X )
WHERE e.salary > Any S2: r1(X ); r2(X ); r2(Y ); w2(Y ); r1(Y ); w1(X )
(SELECT distinct salary FROM em-
ployee s WHERE s.department = “5”) (a) Both S1 and S2 are conflict serializable.
(b) S1 is conflict serializable and S2 is not conflict
(a) Q1 is the correct query. serializable.
(b) Q2 is the correct query. (c) S1 is not conflict serializable and S2 is conflict
(c) Both Q1 and Q2 produce the same answer. serializable.
(d) Neither Q1 nor Q2 is the correct query. (d) Both S1 and S2 are not conflict serializable.
(GATE 2007: 2 Marks) (GATE 2007: 2 Marks)
( ))
Schedule S2
T1 T2
(
IV. ΠP ΠP ,Q (R ) − ΠP ,Q (R ) − ΠP ,Q (S )
(a) I only (b) II only Solution: Collection is in BCNF as there is only one
(c) III only (d) III and IV only functional dependency Title Author → Catalog_no
(GATE 2008: 1 Mark) and {Author, Title} is key for collection.
Book is not in BCNF because Catalog_no is not a Book is in 2NF because every non-prime attribute
key and there is a functional dependency Catalog_ of the table is dependent on either the key [Title,
no → Title Author Publisher Year. Author] or another non-prime attribute.
Book is not in 3NF because non-prime attributes Ans. (c)
(Publisher Year) are transitively dependent on key Linked Answer Questions 35 and 36: Consider the
[Title, Author]. following ER diagram:
M1 M2 M3 P1 P2 N1 N2
M R1 P R2 N
36. Which of the following is a correct attribute set And the schedule T2T1 has the following sequence
for one of the tables for the correct answer to the of operations:
above question? R2[x]R2[y]W2[y]R1[x]W1[x]W1[y]
(a) {M1, M2, M3, P1}
The schedule S2 is conflict-equivalent to T2T1 and
(b) {M1, P1, N1, N2}
S3 is conflict-equivalent to T1T2.
(c) {M1, P1, N1}
Only S2 and S3 are conflict serialzable schedules.
(d) {M1, P1}
To test for conflict serializability, make the wait
(GATE 2008: 2 Marks)
for graph. If there is a cycle in graph, it means not
Solution: conflict serializable.
For R1 correct attribute set: M1, M2, M3, P1 Ans. (b)
For R2 correct attribute set: N1, N2, P1, P2 with P1
38. The following key values are inserted into a B+ tree
as primary key and N1 as weak entity set.
in which order of the internal nodes is 3, and that of
Ans. (a)
the leaf nodes is 2, in the sequence given below. The
37. Consider two transactions T1 and T2, and four order of internal nodes is the maximum number of
schedules S1, S2, S3, S4 of T1 and T2 as given below: tree pointers in each node, and the order of leaf
nodes is the maximum number of data items that
T1: R1[x]W1[x]W1[y] can be stored in it. The B+ tree is initially empty.
T2: R2[x]R2[y]W2[y] 10, 3, 6, 8, 4, 2, 1
0 `AC’ 8200
T1 T2 T3
1 `AC’ 8201
Read(X )
2 `SC’ 8201
Read(Y )
5 `AC’ 8203
Read(Y )
1 `SC’ 8204
Write(Y )
3 `AC’ 8202
Write(X )
What pids are returned by the following SQL query Write(X )
for the above instance of the tables?
Read(X )
SELECT pid
FROM Reservation WHERE class = ‘AC’ AND Write(X )
Which one of the schedules below is the correct multiple accounts or joint accounts. This attri-
serialization of the above? bute stores the primary account number.
(a) T1 → T3 → T2 (b) T2 → T1 → T3 IV. Name: Name of the Student.
(c) T2 → T3 → T1 (d) T3 → T1 → T2 V. Hostel_Room: Room number of the hostel.
(GATE 2010: 2 Marks) Which of the following options is INCORRECT?
(a) BankAccount_Number is a candidate key.
Solution: T1 can complete before T2 and T3 as
(b) Registration_Number can be a primary key.
there is no conflict between Write(X) of T1 and
(c) UID is a candidate key if all students are from
the operations in T2 and T3 which occur before
the same country.
Write(X) of T1.
(d) If S is a superkey such that S ∩ UID is NULL
then S ∪ UID is also a superkey.
T3 can complete before T2 as the Read(Y) of T3
does not conflict with Read(Y) of T2. Similarly,
(GATE 2011: 1 Mark)
Write(X) of T3 does not conflict with Read(Y) and
Write(Y) operations of T2. Solution: When two students hold joint account
After topologically sorting, the sequence is T1 → in that case BankAccount_Num will not uniquely
T3 → T2. determine other attributes.
Ans. (a) Ans. (a)
46. The following functional dependencies hold for 48. Database table by name Loan_Records is given
relations R(A, B, C) and S(B, D, E) below:
B → A,
Borrower Bank_Manager Loan_Amount
A→C
Ramesh Sunderajan 10000.00
The relation R contains 200 tuples and the rela-
tion S contains 100 tuples. What is the maximum Suresh Ramgopal 5000.00
number of tuples possible in the natural join R S? Mahesh Sunderajan 7000.00
(a) 100 (b) 200
(c) 300 (d) 2000 What is the output of the following SQL query?
(GATE 2010: 2 Marks) SELECT count(*) FROM(
(SELECT Borrower, Bank_Manager FROM
Solution: B is a candidate key of R.
Loan_Records) AS S NATURAL JOIN
So, all 200 values of B must be unique in R. (SELECT Bank_Manager, Loan_Amount
There is no functional dependency given for S. FROM Loan_Records) AS T );
For the maximum number of tuples in output,
(a) 3 (b) 9
there are two ways.
(c) 5 (d) 6
(i) All 100 values of B in S are same and there is (GATE 2011: 2 Marks)
an entry in R, which matches with this value.
So, we are having 100 tuples in output.
Solution: Output as Table S:
(ii) All 100 values of B in S are different and these
values are present in R also. So, we are having Borrower Bank_Manager
100 tuples.
Ans. (a) Ramesh Sunderajan
Suresh Ramgopal
47. Consider a relational table with a single record
for each registered student with the following Mahesh Sunderajan
attributes:
Output as Table T:
I. Registration_Number: Unique registration
number for each registered student. Bank_Manager Loan_Amount
II. UID: Unique Identity number, unique at the Sunderajan 10000.00
national level for each citizen.
III. B ankAccount_Number: Unique account Ramgopal 5000.00
number at the bank. A student can have Sunderajan 7000.00
Natural Join of S and T: 50. Which of the following statements are TRUE
about an SQL query?
Borrower Bank_Manager Loan_Amount
I. An SQL query can contain a HAVING clause
Ramesh Sunderajan 10000.00 even if it does not have a GROUP BY clause.
Ramesh Sunderajan 7000.00 II. An SQL query can contain a HAVING clause
only if it has GROUP BY clause.
Suresh Ramgopal 5000.00 III. A ll attributes used in the GROUP BY clause
Mahesh Sunderajan 10000.00 must appear in the SELECT clause.
IV. Not all attributes used in the GROUP BY
Mahesh Sunderajan 7000.00 clause need to appear in the SELECT clause.
(a) I and III (b) I and IV
Ans. (c)
(c) II and III (d) II and IV
49. Consider a database table T containing two col- (GATE 2012: 1 Mark)
umns X and Y each of type integer.
Solution:
After the creation of the table, one record (X = 1,
Y = l) is inserted in the table. I. HAVING clause can also be used with aggregate
function. If we use a HAVING clause without
Let MX and MY denote the respective maximum
a GROUP BY clause, the HAVING condi-
values of X and Y among all records in the table at
tion applies to all rows that satisfy the search
any point in time. Using MX and MY, new records
condition.
are inserted in the table 128 times with X and Y
values being MX + 1, 2 * MY + 1, respectively. II. To verify S,
It may be noted that each time after the insertion, CREATE TABLE temp (id INT, name
values of MX and MY change. VARCHAR(100));
What will be the output of the following SQL INSERT INTO temp VALUES (1, "abc");
query after the steps mentioned above are carried INSERT INTO temp VALUES (2, "abc");
out?
INSERT INTO temp VALUES (3, "bcd");
SELECT Y FROM T WHERE X = 7;
INSERT INTO temp VALUES (4, "cde");
(a) 127 (b) 255 SELECT Count(*) FROM temp GROUP BY name;
(c) 129 (d) 257
(GATE 2011: 2 Marks) Result:
count(*)
Solution: 2
The value of X is calculated using MX + 1 and Y
1
is calculated using 2 × MY + 1
1
For example, X = 1 and Y = 1
Ans. (b)
at the next step, X = MX + 1 = 1 + 1 = 2,
Y = 2 × MY + 1 = 2 × 1 + 1 = 3 51. Given the basic ER and relational models, which of
the following is INCORRECT?
52. Which of the following is TRUE? Solution: Two schedules are conflict-serializable
if the actions belong to different transactions, one
(a) Every relation is 3NF is also in BCNF.
of the actions is a write operation, the actions
(b) A relation R is in 3NF if every non-prime attri-
access the same object (read or write).
bute of R is fully functionally dependent on
every key of R. Two schedules are conflict-equivalent if both sched-
(c) Every relation in BCNF is also in 3NF. ules involve the same set of transactions, and the
(d) No relation can be in both BCNF and 3NF. order of each pair of conflicting actions in both
(GATE 2012, 1 mark) schedules is the same.
Take the following schedule:
Solution: Any relation that is in `x’NF, will auto-
matically be in `x-1’ NF. Now, let us see the order of S: r1(P); r1(Q); r2(Q); r2(P); w1(Q);w2(P)
NF as we discussed in the text, it is — 1NF, 2NF, 3NF,
BCNF, 4NF and 5NF. Therefore, option (c) is correct.
Ans. (c)
53. Suppose R1(A, B) and R2(C, D) are two relation
schemas. Let r1 and r2 be the corresponding rela- T1 T1
tion instances. B is a foreign key that refers to C in
R2. If data in r1 and r2 satisfy referential integrity
constrains, which of the following is ALWAYS
TRUE?
(a) ΠB(r1) − ΠC(r2) = ∅
Cycle exists between two transactions, so T1 and
(b) ΠC(r2) − ΠB(r1) = ∅
T2 are not conflict serializable.
(c) ΠB(r1) − ΠC(r2) A schedule is conflict-serializable when the schedule
(d) ΠB(r1) − ΠC(r2) ≠ ∅ is conflict-equivalent to one or more serial sched-
(GATE 2012: 2 Marks) ules. There are two possible serial schedules–T1
followed by T2 and T2 followed by T1. In both,
Solution: B is a foreign key in r1, which refers to one of the transactions reads the value written by
C in r2. another transaction as a first step.
r1 and r2 satisfy referential integrity constraints. Ans. (b)
So, every value that exists in column B of r1 must Common Data Questions 55 and 56: Consider the
also exist in column C of r2. following relations A, B and C:
Ans. (a)
A
54. Consider the following transactions with data Id Name Age
items P and Q initialized to zero:
12 Arun 60
T1: read (P); 15 Shreya 24
read (Q); 99 Rohit 11
if P=0 then Q:=Q+1;
write (Q). B
T2: read (Q); Id Name Age
read (P) 15 Shreya 24
if Q=0 then P:=P+1; 25 Hari 40
write (P). 98 Rohit 20
Any non-serial interleaving of T1 and T2 for con-
99 Rohit 11
current execution leads to
(a) a serializable schedule C
(b) a schedule that is not conflict-serializable
(c) a conflict-serializable schedule Id Phone Area
(d) a schedule for which precedence graph cannot 10 220 02
be drawn
99 2100 01
(GATE 2012: 2 Marks)
55. How many tuples does the result of the following 57. An index is clustered, if
relational algebra expression contain? Assume that
(a) It is on a set of fields that form a candidate key.
the schema of A ∪ B is the same as that of A.
(b) It is on a set of fields that include the primary key.
(A ∪ B) A.Id> 40 C.Id <15 C (c) The data records of the file are organized in the
same order as the data entries of the index.
(a) 7 (b) 4 (d) The data records of the file are organized not in
(c) 5 (d) 9 the same order as the data entries of the index.
(GATE 2012: 2 Marks)
(GATE 2013: 1 Mark)
59. How many candidate keys does the relation R have? (c) In 3NF, but not in BCNF
(d) In BCNF
(a) 3 (b) 4
(c) 5 (d) 6 (GATE 2013: 2 Marks)
(GATE 2013: 2 Marks)
Solution: A → BC, B → CFH and F → EG are
Solution: Candidate keys AD, BD, ED and FD
partial dependencies.
Ans. (b)
So, they are in 1NF but not in 2NF.
60. The relation R is Ans. (a)
(a) In 1NF, but not in 2NF
(b) In 2NF, but not in 3NF
PRACTICE EXERCISES
Set 1
(c) If we modify the conceptual schema, then there 20. The value of particular field should be less than 50
is no need to modify the external schema. if it is
(d) If we modify the middle level schema, then
(a) An integrity constraint.
any one schema (internal/external) may be
(b) A referential constraint.
modified.
(c) A primary key constraint.
13. Relational algebra can be defined as (d) A normalization constraint.
(a) Read and destroy 22. The cardinality ratio and participation
(b) Read and develop of a table helps us to implement either the cross
(c) Rapid application development referencing design or mutual referencing design.
(d) Rapid application design
(a) Functionality (b) Constraints
16. In DML, RECONNCT command cannot be used (c) SQL queries (d) Domains
with
23. Consider the following schemas:
(a) CLEAN set. (b) FIXED set.
(c) OPTIONAL set. (d) USER set. Branch = (Branch-name, Assets, Branch-city)
Customer = (Customer-name, Bank name,
17. The UWA is used to communicate the contents of
Customer-city)
individual records between
Borrow = (Branch-name, loan number,
(a) Host program and present record. customer account-number)
(b) Host program and host record. Deposit = (Branch-name, Accountnumber,
(c) Host program and DBMS. Customer-name, Balance)
(d) Host program and host language.
Using relational algebra, the query that finds cus-
18. Referential integrity is concerned with tomers having deposits have accountnumber >
23456 is
(a) Primary key. (b) Foreign key.
(c) Alternate key. (d) Project_Join key. (a) Πcustomer-name (s account-number > 23456 (Deposit)
(b) s customer-name (s account-number > 23456 (Deposit)
(c) Πcustomer-name (s account-number > 23456 (Borrow)
19. Match the following:
(d) s customer-name (Π account-number > 23456 (Borrow)
List — I List — II
I. DDL A. LOCK TABLE 24. What deletes the entire file except the file structure?
II. DML B. COMMIT (a) DELETE
III. TCL C. Natural Difference (b) DELETE RECORDS
(c) ZAP
IV. BINARY D. REVOKE
(d) RETAIN STRUCT
Operation
25. The function of a Transaction Manager is to
Codes:
I. Maintain a log of transactions.
I II III IV II. Maintain database images before and after a
(a) A D B C transaction.
(b) A B D C III. Maintain appropriate concurrency control.
(c) D B A C IV. Avoid deadlocks.
(d) D A B C
(a) I, II, III and IV (b) II and III 32. A network schema is used to
(c) I, III and IV (d) I, II and III
(a) restrict to many-to-many relationship.
(b) permit many-to-many relationship.
26. The term granularity relates to
(c) permit to store data in a database.
(a) the size of table. (d) help in storing one-to-many relationship.
(b) the size of data item.
(c) the size of tuple. 33. Storing under unified scheme at a single site is
(d) the size of primary key called as
(a) data mining. (b) universal database.
27. Match the following: (c) data warehousing. (d) advanced database.
34. The task of correcting and cleaning data is called as
I. OLAP A. Back propagation
(a) data organization. (b) pre-processing.
II. OLTP B. Data warehouse
(c) data mining. (d) data structuring.
III. Decision tree C. RDBMS
35. A clustering index consists of .
IV. Neural network D. Classification
(a) declared and ordered primary key
(b) both grouped and ordered primary key and for-
I II III IV eign key
(c) ordered foreign key
(a) D C A B
(d) grouped and ordered candidate key and alter-
(b) B C D A nate key
(c) A B C D
36. Consider the following schema:
(d) D B A C
Emp(Empcode, Name, Sex, Salary, Deptt)
A simple SQL query is executed as follows:
28. Which of the following statements is TRUE?
SELECT Deptt FROM Emp
(a) A prime attribute can be transitively depen-
dent on a key in 5NF relation. WHERE sex = ‘M’
(b) A prime attribute can be transitively depen- GROUP by Dept
dent on a key in 3NF relation.
HAVING avg (Salary) > {select avg (Salary)
(c) A prime attribute can be transitively depen-
from Emp}
dent on a key in PJNF relation.
(d) A prime attribute cannot be transitively depen- The output will be
dent on a key in BCNF relation.
(a) a verage salary of male employee is the average
29. Which normal form is considered as adequate for a salary of the organization.
simple database design? (b) average salary of male employee is less than the
average salary of the organization.
(a) 2NF (b) 3NF
(c) average salary of male employee is equal to the
(c) BCNF (d) PJNF
average salary of the organization.
30. Which of the following is not a type of DBMS? (d) average salary of male employees in a depart-
ment is more than the average salary of the
(a) Hierarchical (b) Network organization.
(c) Relational (d) Graphical
37. Consider a relation X(p, q, r), and the domains are
31. Manager’s salary details are to be hidden from represented by their atomic values. Further, the
Employee Table. This technique is called as functional dependency p → q, q → r is observed,
the relation is in
(a) internal level datahiding.
(b) physical level datahiding. (a) 1NF, not in 2NF
(c) external level datahiding. (b) 2NF, not in 3NF
(d) logical level datahiding. (c) 3NF
(d) 1NF, not in 3NF
(a) In First normal form but not in Second normal 5. Consider the scheme R = (S T U V) and the
form. dependencies S → T, T → U, U → V and V → S.
(b) In Second normal form but not in Third normal Let R = (R1 and R2) be a decomposition such that
form. R1 ∩ R2 = f. The decomposition is
(c) In Third normal form. (a) Not in 2NF
(d) None of the above. (b) In 2NF but not 3NF.
2. Let R(a, b, c) and S(d, e, f) be two relations in (c) In 3NF but not 2NF.
which d is the foreign key of S that refers to the (d) In both 2NF and 3NF.
primary key of R. Consider the following four oper- 6. Consider a schema R(A, B, C, D) and functional
ations on R and S: dependencies A → B and C → D. Then the decom-
I. Insert into R II. Insert into S position of R into R1(AB) and R2(CD) is
III. Delete from R IV. Delete from S (a) Dependency preserving and lossless join.
Which of the following is true about the referential (b) Lossless join but not dependency preserving.
integrity constraint above? (c) Dependency preserving but not lossless join.
(d) Not dependency preserving and not lossless join.
(a) None of I, II and IV can cause its violation.
(b) All of I, II, III and IV can cause its violation. 7. Relation R with an associated set of functional
(c) Both I and IV can cause its violation. dependencies, F, is decomposed into BCNF. The
(d) Both II and III can cause its violation. redundancy (arising out of functional dependencies)
in the resulting set of relation is
3. There are five records in a database
(a) Zero.
Name Age Occupation Category (b) More than zero but less than that of an equiva-
lent 3NF decomposition.
Rama 27 CON A
(c) Proportional to the size of F+.
Abdul 22 ENG A (d) Indeterminate.
8. From the following instance of a relation scheme 13. Which of the following relational calculus expres-
R(A, B, C), we can conclude that sions is not safe?
(b) disk access is much slower than memory access. student names are of length 8 bytes, disk blocks are
(c) d
isk data transfer rates are much less than of size 512 bytes and index pointers are of size 4
memory data transfer rates. bytes. Given this scenario, what would be the best
(d) disks are more reliable than memory. choice of the degree (i.e., the number of pointers
per node of the B+ tree?
18. A B+ tree index is to be built on the name attri-
bute of the relation STUDENT. Assume that all
Set 1
1. (a) 8. (d) 15. (c) 22. (b) 29. (b) 36. (d)
2. (b) 9. (a) 16. (b) 23. (a) 30. (d) 37. (a)
3. (d) 10. (b) 17. (c) 24. (c) 31. (c) 38. (b)
4. (d) 11. (c) 18. (b) 25. (d) 32. (b)
5. (a) 12. (b) 19. (d) 26. (b) 33. (c)
6. (d) 13. (a) 20. (a) 27. (b) 34. (b)
7. (b) 14. (a) 21. (a) 28. (d) 35. (a)
Set 2
1. (a) The relation has the following functional Candidate keys for this relation are {S, T, U, V }
dependencies: As given R1 ∩ R2 = f
a→c So, R1 and R2 might have two attributes. A rela-
b→d tion with two attributes satisfies both 2NF and
{a b} is key for the given relation. 3NF conditions.
Here prime attributes are a and b. 6. (c) Dependency is preserved because there is no
loss of functional dependencies.
c and d are dependent on partial keys, this is not Since R1(AB) and R2(CD) do not have any common
allowed in 2NF. So, the given relation is in 1 NF. attribute. So, it is lossy join.
2. (d) Insertion into S and deletion from R can cause 7. (a) Redundancy is zero when we decompose into
inconsistency. So, II and III can cause violation. BCNF.
3. (c) The index is built from field occupation where 8. (b, c) F: X → Y implies that for any two tuples if
column is sorted in alphabetic order CON, DOC, t1[X] = t2[X], they must have t1[Y] = t2[Y].
ENG, SER and MUS. Index is assigned according
9. (d) DISTINCT key word is used to remove dupli-
to alphabetic position.
cate rows along with SELECT keyword. SQL does
4. (a) 2NF, (b) 3NF not permit attribute names to be repeated in the
same relation.
5. (d) The relation schema R = (S T U V) has follow-
ing functional dependencies 10. (a) Cross join with an empty relation produces the
S→T overall result as empty. Therefore, S is not empty. `r’
should not have any duplicates to have same rows as `r’.
T→U
U→V 11. (a) s c (s c (R)) and (s c (s c (R))) will not be equiv-
1 1 2 1
12. (c) Tuple calculus expression select tuples where Leaf nodes are linked together in B+ trees, hence
A = 10 and B = 20 of relation R. So, the equiva- range queries are faster.
lent relational algebra expression is s(A = 10(r) ∩
sB = 20(r). 17. (b) In B+ trees, disk access is slow and numbers of
hits are less. B+ trees have very high fan-out (typi-
13. (c) The query −t|¬ (t ∈ R1) is syntactically cor- cally on the order of 100 or more), which reduces
rect. But it requests for all tuples that are not in the number of I/O operations required to find an
R1. That set of such t tuples is obviously infinite, element in the tree.
in the context of such as the infinite domain set of 18. (43) Let n be the degree.
all integers. Therefore, this is unsafe query.
Given k key size (length of the name = 8 byte attri-
14. (c) Both relational algebra and relational calculus bute of student)
has same expressive powers. Every query written
in relation algebra can be expressed in relational Disk block size, B = 512 bytes
calculus. Index pointer size, b = 4 bytes
15. (d) Given schedule has W1(A), R2(A) and W2(B), Degree of B+ tree can be calculated if we know
R1(B) conflicting pairs. So the schedule is not seri- the maximum number of key an internal node can
alizable. Hence, it cannot occur in scheme using have; the formula for that is
2PL protocol. (n − 1)k + n × b = Block size
16. (b) Most database systems use indexes built on (n − 1)8 + n × 4 = 512
some form of a B+ tree due to its many advan- 12n = 520
tages, in particular its support for range queries. n = 520/12 = 43