Database Management System


R. R. Institute of Modern Technology, Lucknow

Database Management System


(ECS-402)
CS/IT- 2nd Year

Submitted By:
Rakesh Kumar Roshan
CS/IT Department

INDEX

A Syllabus
1 Unit – I
  1.1 Introduction
    1.1.1 An overview of database management system
    1.1.2 Database system vs. file system
    1.1.3 Database system concept and architecture
    1.1.4 Data model, schema and instances
    1.1.5 Data independence
    1.1.6 Data definition language and data manipulation language
    1.1.7 Overall database structure
  1.2 Data Modeling using the Entity Relationship Model
    1.2.1 ER model concepts
    1.2.2 Notation for ER diagrams
    1.2.3 Mapping constraints
    1.2.4 Keys: super key, candidate key, primary key and foreign key
    1.2.5 Generalization and aggregation
    1.2.6 Extended ER model
    1.2.7 Relationships of higher degree
2 Unit – II
  2.1 Relational Data Model and Language
    2.1.1 Relational data model concepts
    2.1.2 Integrity constraints: entity, referential, key and domain integrity constraints
    2.1.3 Relational algebra
    2.1.4 Relational calculus
    2.1.5 Tuple and domain calculus
  2.2 Introduction to SQL
    2.2.1 Characteristics of SQL
    2.2.2 Advantages of SQL
    2.2.3 SQL data types and literals
    2.2.4 Types of SQL commands
    2.2.5 SQL operators and their procedure
    2.2.6 Tables, views and indexes
    2.2.7 Queries and subqueries
    2.2.8 Aggregate functions
    2.2.9 Insert, update and delete operations
    2.2.10 Joins
    2.2.11 Unions, intersection and minus
    2.2.12 Cursors
    2.2.13 Triggers
    2.2.14 Procedures in PL/SQL
3 Unit – III
  3.1 Database Design & Normalization
    3.1.1 Functional dependencies
    3.1.2 Normal forms: 1NF, 2NF, 3NF, BCNF, 4NF (MVD), 5NF (JD)
    3.1.3 Inclusion dependence
    3.1.4 Lossless join decompositions
4 Unit – IV
  4.1 Transaction Processing Concept
    4.1.1 Transaction system
    4.1.2 Testing of serializability
    4.1.3 Serializability of schedules: conflict & view serializable schedules
    4.1.4 Recoverability
    4.1.5 Recovery from transaction failures
    4.1.6 Log based recovery and checkpoints
    4.1.7 Deadlock handling
  4.2 Distributed Database
    4.2.1 Distributed data storage
    4.2.2 Concurrency control
    4.2.3 Directory system
5 Unit – V
  5.1 Concurrency Control Techniques
    5.1.1 Concurrency control
    5.1.2 Locking techniques for concurrency control
    5.1.3 Timestamping protocols for concurrency control
    5.1.4 Validation based protocol
    5.1.5 Multiple granularity
    5.1.6 Multiversion schemes
    5.1.7 Recovery with concurrent transactions
B Model papers
C References

(A) Database Management System (ECS-402)

Unit–1
Introduction: An overview of database management systems, database system vs. file system, database system concepts and architecture, data models, schemas and instances, data independence and database languages and interfaces, data definition language, DML, overall database structure.
Data modeling using the Entity Relationship Model:
ER model concepts, notation for ER diagrams, mapping constraints, keys, concepts of super key, candidate key, primary key, generalization, aggregation, reduction of ER diagrams to tables, extended ER model, relationships of higher degree.

Unit–2
Relational Data Model and Language: Relational data model concepts, integrity constraints, entity integrity, referential integrity, key constraints, domain constraints, relational algebra, relational calculus, tuple and domain calculus.
Introduction to SQL: Characteristics of SQL, advantages of SQL, SQL data types and literals, types of SQL commands, SQL operators and their procedure, tables, views and indexes, queries and subqueries, aggregate functions, insert, update and delete operations, joins, unions, intersection, minus, cursors, triggers, and procedures in PL/SQL.

Unit–3
Data Base Design & Normalization: Functional dependencies, normal forms, first, second, third normal forms, BCNF, inclusion dependence, lossless join decompositions, normalization using FDs, MVDs, and JDs, alternative approaches to database design.

Unit–4
Transaction Processing Concept: Transaction system, Testing of serializability, serializability of
schedules, conflict & view serializable schedule, recoverability, Recovery from transaction failures, log
based recovery, checkpoints, deadlock handling.
Distributed Database: distributed data storage, concurrency control, directory system.

Unit–5
Concurrency Control Techniques: Concurrency control, locking techniques for concurrency control, timestamping protocols for concurrency control, validation based protocol, multiple granularity, multiversion schemes, recovery with concurrent transactions, case study of Oracle.

UNIT- I

1.1.1 An overview of database management system


Database Management System (DBMS)
A general-purpose software system that enables:
 Creation of large disk-resident databases.
 Posing of data retrieval queries in a standard manner
 Retrieval of query results efficiently.
 Concurrent use of the system by a large number of users in a consistent manner.
 Guaranteed availability of data irrespective of system failures.

What is a Database?

A collection of related pieces of data:


 Representing/capturing the information about a real-world enterprise or part of an enterprise.
 Collected and maintained to serve specific data management needs of the enterprise.
 Activities of the enterprise are supported by the database, and those activities continually update the database.

Why Use a DBMS?

 Data independence and efficient access.


 Reduced application development time.
 Data integrity and security.
 Uniform data administration.
 Concurrent access, recovery from crashes

An Example-
University Database:
Data about students, faculty, courses, research-laboratories, course registration/enrollment etc.
Reflects the state of affairs of the academic aspects of the university. Purpose: To keep an
accurate track of the academic activities of the university.

1.1.2 Database System Vs File System

DBMS Approach
• separation of data and metadata
• flexibility of changing metadata
• program-data independence

Data access language
• standardized (SQL)
• ad-hoc query formulation is easy

System development
• less effort required
• concentration on logical-level design is enough
• components to organize data storage, process queries, manage concurrent access, recover from failures, and manage access control are all available

File Processing System

Files of records are used for data storage.
• data redundancy wastes space
• maintaining consistency becomes difficult
Record structures are hard-coded into the programs.
• structure modifications are hard to perform
Each different data access request (a query)
• is performed by a separate program
• it is difficult to anticipate all such requests
Creating the system
• requires a lot of effort
Managing concurrent access and failure recovery are difficult.

1.1.3 Database system concept and architecture

The External Level represents the collection of views available to different end-users.
The Conceptual Level is the representation of the entire information content of the database.
The Internal Level is the physical level, which shows how the data is stored, how the fields are represented, etc.
Physical level: describes how a record (e.g., customer) is stored.
Logical level: describes the data stored in the database and the relationships among the data.
View level: application programs hide details of data types. Views can also hide information (such as an employee's salary) for security purposes.

1.1.4 Data Model, Schema and Instances

• A data model is a collection of tools for describing:
– Data
– Data relationships
– Data semantics
– Data constraints
• Relational model
• Entity-Relationship data model (mainly for database design)
• Object-based data models (Object-oriented and Object-relational)
• Semi structured data model (XML)
• Other older models:
– Network model
– Hierarchical model

• Schemas and instances are similar to types and variables in programming languages.


• Schema – the logical structure of the database
– Example: The database consists of information about a set of customers and accounts and the relationships between them.
– Analogous to type information of a variable in a program
– Physical schema: database design at the physical level
– Logical schema: database design at the logical level
• Instance – the actual content of the database at a particular point in time
– Analogous to the value of a variable

1.1.5 Data Independence and Data Base Language

Applications insulated from how data is structured and stored.

Physical Data Independence: the ability to modify the physical schema without changing the logical
schema
 Applications depend on the logical schema
 In general, the interfaces between the various levels and components should be well
defined so that changes in some parts do not seriously influence others.

Logical Data Independence: the ability to modify the conceptual schema without changing the external schema or application programs.
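For example, a view can present a stable external schema while the conceptual schema beneath it changes. A minimal sketch (the table and view here are hypothetical, not from these notes):

-- Conceptual schema
CREATE TABLE EMPLOYEE_FULL (EMP_ID NUMBER(5), NAME VARCHAR2(30), SALARY NUMBER(8,2), DEPT_NO NUMBER(3));

-- External schema seen by application programs
CREATE VIEW EMPLOYEE_PUBLIC AS
SELECT EMP_ID, NAME, DEPT_NO FROM EMPLOYEE_FULL;

-- EMPLOYEE_FULL may later be reorganized; as long as the view is redefined
-- to yield the same columns, application programs are unaffected.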

1.1.6 Data Definition Language (DDL) and Data Manipulation Language (DML)

• Language for accessing and manipulating the data organized by the appropriate data model
• Two classes of languages
Procedural – user specifies what data is required and how to get those data
Declarative (nonprocedural) – user specifies what data is required without specifying how to get those
data
• SQL is the most widely used query language
• Specification notation for defining the database schema

Example: create table account (account_number char(10), branch_name char(10), balance integer)

• DDL compiler generates a set of tables stored in a data dictionary

• Data dictionary contains metadata (i.e., data about data)
– Database schema
– Data storage and definition language
• Specifies the storage structure and access methods used
– Integrity constraints
• Domain constraints
• Referential integrity (e.g. branch_name must correspond to a valid branch in the branch table)
– Authorization
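Extending the account example above, a hedged sketch of how domain and referential integrity constraints can be declared in the DDL (the branch table is assumed here purely for illustration):

CREATE TABLE BRANCH (BRANCH_NAME CHAR(10) PRIMARY KEY, CITY CHAR(15));

CREATE TABLE ACCOUNT (
  ACCOUNT_NUMBER CHAR(10) PRIMARY KEY,
  BRANCH_NAME CHAR(10) REFERENCES BRANCH(BRANCH_NAME), -- referential integrity
  BALANCE INTEGER CHECK (BALANCE >= 0)                 -- domain constraint
);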

1.1.7 Overall Database Structure

Figure: 1.1.7
Database Users

Users are differentiated by the way they expect to interact with the system

• Application programmers – interact with system through DML calls

• Sophisticated users – form requests in a database query language

• Specialized users – write specialized database applications that do not fit into the traditional data
processing framework

• Naïve users – invoke one of the permanent application programs that have been written previously
– Examples: people accessing a database over the web, bank tellers, clerical staff

Database Administrator

• Coordinates all the activities of the database system


– has a good understanding of the enterprise's information resources and needs.
• Database administrator's duties include:
– Storage structure and access method definition
– Schema and physical organization modification
– Granting users authority to access the database
– Backing up data
– Monitoring performance and responding to changes
• Database tuning

Data storage and Querying


• Storage management
• Query processing
• Transaction processing

Storage Management

• Storage manager is a program module that provides the interface between the low-level data stored
in the database and the application programs and queries submitted to the system.
• The storage manager is responsible for the following tasks:
– Interaction with the file manager
– Efficient storing, retrieving and updating of data
• Issues:
– Storage access
– File organization
– Indexing and hashing

Transaction Management

• A transaction is a collection of operations that performs a single logical function in a database application.
• Transaction-management component ensures that the database remains in a consistent (correct)
state despite system failures (e.g., power failures and operating system crashes) and transaction
failures.

• Concurrency-control manager controls the interaction among the concurrent transactions, to ensure
the consistency of the database

Figure: 1.1.7(a)

Role of the Database Administrator

Typically there are three types of users for a DBMS:

1. The End User who uses the application. Ultimately, this is the user who actually puts the data in the
system into use in business. This user need not know anything about the organization of data in the
physical level.

2. The Application Programmer who develops the application programs. She has more knowledge about the data and its structure, since she has to manipulate the data using her programs. She also need not have access to, or knowledge of, the complete data in the system.

3. The Database Administrator (DBA) who is like the super-user of the system.

The role of the DBA is very important and is defined by the following functions.
Defining the Schema
The DBA defines the schema, which contains the structure of the data in the application. The DBA determines what data needs to be present in the system and how this data has to be represented and organized.

Liaising with Users


The DBA needs to interact continuously with the users to understand the data in the system and its
use.

Defining Security & Integrity Checks


The DBA finds out the access restrictions to be defined and defines security checks accordingly. Data integrity checks are also defined by the DBA.
Defining Backup / Recovery Procedures
The DBA also defines procedures for backup and recovery. Defining backup procedures includes specifying what data is to be backed up, the periodicity of taking backups, and also the medium and storage place for the backup data.

Monitoring Performance
The DBA has to continuously monitor the performance of the queries and take measures to optimize all
the queries in the application.

1.2 Data Modeling using the ER Model


1.2.1 ER model concepts

The Entity Relationship (ER) data model allows us to describe the data involved in a real-world enterprise in terms of objects and their relationships, and is widely used to develop an initial database design. Within the larger context of the overall design process, the ER model is used in a phase called conceptual database design.

Entity Types, Attributes and Keys:


Entities are specific objects or things in the mini-world that are represented in the database.
For example, Employee or staff, Department or Branch, Project

For example, an EMPLOYEE entity may have a Name, SSN, Address, Sex, Birthdate, and a Department may have a Dname, Dno, DLocation. A specific entity will have a value for each of its attributes. For example, a specific employee entity may have Name='John Smith', SSN='123456789', Address='731, Fondren, Houston, TX', Sex='M', BirthDate='09-JAN-55'.

Types of Attributes:

Attributes may be simple or composite, single-valued or multivalued (drawn with a double ellipse in ER diagrams), and stored or derived (drawn with a dashed ellipse).
1.2.2 Notations for ER diagram

1.2.3 Mapping Cardinality Constraints

• Express the number of entities to which another entity can be associated via a relationship set.
• Most useful in describing binary relationship sets.
• For a binary relationship set the mapping cardinality must be one of the following types:
– One to one
– One to many
– Many to one
– Many to many

Figure1.2.3(a)

Note: Some elements in A and B may not be mapped to any elements in the other set

Figure1.2.3(b)

Note: Some elements in A and B may not be mapped to any elements in the other set
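When such a relationship set is reduced to tables, the cardinality is typically enforced with key and foreign-key declarations. A minimal sketch (the schema is illustrative, not from these notes):

-- Many-to-one: many EMPLOYEE rows may reference one DEPARTMENT row
CREATE TABLE DEPARTMENT (DEPT_NO NUMBER(3) PRIMARY KEY, DNAME VARCHAR2(20));
CREATE TABLE EMPLOYEE (
  SSN CHAR(9) PRIMARY KEY,
  NAME VARCHAR2(30),
  DEPT_NO NUMBER(3) REFERENCES DEPARTMENT(DEPT_NO)
);

-- One-to-one: UNIQUE on the foreign key allows at most one row per department
CREATE TABLE MANAGES (
  DEPT_NO NUMBER(3) UNIQUE REFERENCES DEPARTMENT(DEPT_NO),
  SSN CHAR(9) UNIQUE REFERENCES EMPLOYEE(SSN)
);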

1.2.4 Keys

Graphical Representation in E-R diagram

Rectangle – Entity

Ellipses -- Attribute (underlined attributes are [part of] the primary key)

Double ellipses -- multi-valued attribute

Dashed ellipses-- derived attribute, e.g. age is derivable from birthdate and current date.
Keep all attributes above the entity. Lines have no arrows. Use straight lines only.
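The key concepts of section 1.2.4 carry over directly when an ER diagram is reduced to tables. A hedged sketch (the schema is illustrative): each candidate key is declared UNIQUE NOT NULL, one candidate key is promoted to PRIMARY KEY, and a foreign key references a key of another table:

CREATE TABLE STUDENT (
  ROLL_NO NUMBER(10) PRIMARY KEY,      -- the chosen candidate key
  EMAIL VARCHAR2(40) UNIQUE NOT NULL,  -- another candidate key
  SNAME VARCHAR2(30)
);

CREATE TABLE ENROLLMENT (
  ROLL_NO NUMBER(10) REFERENCES STUDENT(ROLL_NO), -- foreign key
  COURSE_NO NUMBER(5),
  PRIMARY KEY (ROLL_NO, COURSE_NO)
);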

1.2.5 Generalization and Specialization:

Figure: 1.2.5
1.2.6 Extended ER model

 ISA ("is a") hierarchies: as in C++ or other programming languages, attributes are inherited.


 If we declare A ISA B, every A entity is also considered to be a B entity.
• Overlap constraints: Can Joe be an Hourly_Emps as well as a Contract_Emps entity?
(Allowed/disallowed)
• Covering constraints: Does every Employees entity also have to be an Hourly_Emps or a
Contract_Emps entity? (Yes/no)
• Reasons for using ISA:
– To add descriptive attributes specific to a subclass.
– To identify entities that participate in a relationship.

Figure:1.2.6

1.2.7 Relationship of higher degree

The degree of a relationship is the number of entity sets participating in the relationship.

Binary – links two entity sets; set of ordered pairs (most common)

Ternary – links three entity sets; ordered triples (rare). If a relationship exists among the three entities,
all three must be present

N-ary – links n entity sets; ordered n-tuples (very rare). If a relationship exists among the entities, then
all must be present. Cannot represent subsets.

Figure: 1.2.7

UNIT- II

2.1 Relational data model and language


2.1.1 Relational data model concepts

A relation is a table with columns and rows. The term applies only to the logical structure of the database, not the physical structure.
 Attribute is a named column of a relation.
 Domain is the set of allowable values for one or more attributes.
 Tuple is a row of a relation.
 Degree is the number of attributes in a relation.
 Cardinality is the number of tuples in a relation.
 Relational Database is a collection of normalized relations with distinct relation names.

Figure: 2.1.1

Properties of Relations

• No Duplicate Tuples – A relation cannot contain two or more tuples which have the same values for
all the attributes. i.e., in any relation, every row is unique.
• Tuples are unordered – The order of rows in a relation is immaterial.
• Attributes are unordered – The order of columns in a relation is immaterial.
• Attribute Values are Atomic – Each tuple contains exactly one value for each attribute. It may be noted that many of the properties of relations follow from the fact that the body of a relation is a mathematical set.

Codd's Twelve Rules

In his 1985 Computerworld article, Ted Codd presented 12 rules that a database must obey if it is to be considered truly relational. Codd's 12 rules, shown in the following list, have since become a semi-official definition of a relational database. The rules come out of Codd's theoretical work on the relational model and actually represent more of an ideal goal than a definition of a relational database.

1. The information rule. All information in a relational database is represented explicitly at the logical
level and in exactly one way—by values in tables.
2. Guaranteed access rule. Each and every datum (atomic value) in a relational database is guaranteed to be logically accessible by resorting to a combination of table name, primary key value, and column name.

3. Systematic treatment of null values. Null values (distinct from an empty character string or a string of blank characters and distinct from zero or any other number) are supported in a fully relational DBMS for representing missing information and inapplicable information in a systematic way, independent of the data type.

4. Dynamic online catalog based on the relational model. The database description is represented at the logical level in the same way as ordinary data, so that authorized users can apply the same relational language to its interrogation as they apply to the regular data.

5. Comprehensive data sublanguage rule. A relational system may support several languages and various modes of terminal use (for example, the fill-in-the-blanks mode). However, there must be at least one language whose statements are expressible, per some well-defined syntax, as character strings, and that is comprehensive in supporting all of the following items:
• Data definition
• View definition
• Data manipulation (interactive and by program)
• Integrity constraints
• Authorization
• Transaction boundaries (begin, commit, and rollback)

6. View updating rule. All views that are theoretically updateable are also updateable by the system.

7. High-level insert, update, and delete. The capability of handling a base relation or a derived relation as a single operand applies not only to the retrieval of data but also to the insertion, update, and deletion of data.

8. Physical data independence. Application programs and terminal activities remain logically unimpaired whenever any changes are made in either storage representations or access methods.

9. Logical data independence. Application programs and terminal activities remain logically unimpaired when information-preserving changes of any kind that theoretically permit impairment are made to the base tables.

10. Integrity independence. Integrity constraints specific to a particular relational database must be definable in the relational data sublanguage and storable in the catalog, not in the application programs.

11. Distribution independence. A relational DBMS has distribution independence.

12. Nonsubversion rule. If a relational system has a low-level (single-record-at-a-time) language, that low level cannot be used to subvert or bypass the integrity rules and constraints expressed in the higher-level relational language (multiple records at a time).

2.1.2 Integrity Constraints
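The figure for this section is not reproduced here. A hedged SQL sketch of the constraint kinds named in the index (entity, referential, key and domain integrity), using illustrative tables:

CREATE TABLE DEPT (DEPT_NO NUMBER(3) PRIMARY KEY, DNAME VARCHAR2(20));

CREATE TABLE STAFF (
  STAFF_NO NUMBER(5) PRIMARY KEY,             -- entity integrity: key attributes, no NULLs
  SALARY NUMBER(8,2) CHECK (SALARY > 0),      -- domain integrity
  DEPT_NO NUMBER(3) REFERENCES DEPT(DEPT_NO)  -- referential integrity
);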

2.1.3 Relational algebra

Introduction
 Relational algebra and relational calculus are formal languages associated with the relational
model.
 Informally, relational algebra is a (high-level) procedural language and relational calculus a
nonprocedural language.
 However, formally both are equivalent to one another.
 A language that produces a relation that can be derived using relational calculus is relationally
complete.

Relational Algebra

 Relational algebra operations work on one or more relations to define another relation without
changing the original relations.
 Both operands and results are relations, so output from one operation can become input to
another operation.
 Allows expressions to be nested, just as in arithmetic. This property is called closure.
 Five basic operations in relational algebra: Selection, Projection, Cartesian product, Union, and
Set Difference.
 These perform most of the data retrieval operations needed.
 Also have Join, Intersection, and Division operations, which can be expressed in terms of 5 basic
operations.

Relational Algebra Operations

Figure: 2.1.3

Selection (or Restriction)

Works on a single relation R and defines a relation that contains only those tuples (rows) of R that
satisfy the specified condition (predicate).

Projection

Works on a single relation R and defines a relation that contains a vertical subset of R, extracting the values of specified attributes and eliminating duplicates.
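In SQL, selection corresponds to the WHERE clause and projection to the SELECT list. A minimal sketch against the emp table that appears later in section 2.2.8:

-- Selection: only the tuples satisfying the predicate
SELECT * FROM EMP WHERE SAL > 2000;

-- Projection: a vertical subset, duplicates eliminated
SELECT DISTINCT JOB FROM EMP;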

Union

 R ∪ S: the union of two relations R and S defines a relation that contains all the tuples of R, or S, or both R and S, duplicate tuples being eliminated.
 R and S must be union-compatible.
 If R and S have I and J tuples, respectively, their union is obtained by concatenating them into one relation with a maximum of (I + J) tuples.

Set Difference

 R − S: defines a relation consisting of the tuples that are in relation R, but not in S.
 R and S must be union-compatible.

Intersection

 R ∩ S: defines a relation consisting of the set of all tuples that are in both R and S.
 R and S must be union-compatible.

Cartesian Product

 R × S: defines a relation that is the concatenation of every tuple of relation R with every tuple of relation S.
Join Operations

 Join is a derivative of Cartesian product.


 Equivalent to performing a Selection, using join predicate as selection formula, over Cartesian
product of the two operand relations.
 One of the most difficult operations to implement efficiently in an RDBMS and one reason why
RDBMSs have intrinsic performance problems.

Various forms of join operation


 Theta join
 Equijoin (a particular type of Theta join)
 Natural join
 Outer join
 Semijoin

Division

 R ÷ S: defines a relation over the attributes C that consists of the set of tuples from R that match the combination of every tuple in S.

The division operation can be expressed in terms of the basic operations:
T1 ← ΠC(R)
T2 ← ΠC((S × T1) − R)
T ← T1 − T2
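SQL has no direct division operator; the operation is commonly phrased with a double NOT EXISTS. A hedged sketch using hypothetical tables TAKES(SSN, COURSE_NO) and COURSE(COURSE_NO), finding the students who take every course:

SELECT DISTINCT T.SSN
FROM TAKES T
WHERE NOT EXISTS (
  SELECT * FROM COURSE C
  WHERE NOT EXISTS (
    SELECT * FROM TAKES T2
    WHERE T2.SSN = T.SSN AND T2.COURSE_NO = C.COURSE_NO
  )
);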

2.1.4 Relational Calculus

 Relational calculus queries specify what is to be retrieved rather than how to retrieve it.
 No description of how to evaluate a query is given.
 In first-order logic (or predicate calculus), a predicate is a truth-valued function with arguments.
 When we substitute values for the arguments, the function yields an expression, called a proposition, which can be either true or false.
 If a predicate contains a variable (e.g. "x is a member of staff"), there must be a range for x. When we substitute some values of this range for x, the proposition may be true; for other values, it may be false.
 There are two forms of relational calculus: tuple and domain.

2.1.5 Tuple Relational Calculus

 Tuple relational calculus is interested in finding tuples for which a predicate is true. It is based on the use of tuple variables.
 A tuple variable is a variable that "ranges over" a named relation: i.e., a variable whose only permitted values are tuples of the relation.
 To specify the range of a tuple variable S as the Staff relation, we write:
Staff(S)

Tuple Relational Calculus – Example

To find the set of all tuples of Staff with a salary greater than 10000, write:
{S | Staff(S) ∧ S.salary > 10000}
To retrieve a particular attribute, such as salary, write:
{S.salary | Staff(S) ∧ S.salary > 10000}

2.2 Introduction to SQL


2.2.1 Characteristics of SQL

SQL is an ANSI and ISO standard computer language for creating and manipulating databases.
SQL allows the user to create, update, delete, and retrieve data from a database.
SQL is very simple and easy to learn.
SQL works with database programs like DB2, Oracle, MS Access, Sybase,
MS SQL Server etc.

2.2.2 Advantages of SQL

2.2.3 SQL Data Types and literals

Data Type: A group of data that shares some common characteristics and operations.
SQL defines the following data types:

 Character String - A sequence of characters from a predefined character set.


 Bit String - A sequence of bit values: 0 or 1.
 Exact Number - A numeric value whose precision and scale need to be preserved. Precision and scale can be counted at the decimal level or binary level. The decimal precision of a numerical value is the total number of significant digits in decimal form. The decimal scale of a numerical value is the number of fractional digits in decimal form. For example, the number 123.45 has a precision of 5 and a scale of 2. The number 0.012345 has a precision of 6 and a scale of 6.
 Approximate Number - A numeric value whose precision needs to be preserved, and scale
floated to its exponent. An approximate number is always expressed in scientific notation of
"mantissa"E"exponent". Note that an approximate number has two precisions: mantissa
precision and exponent precision. For example, the number 0.12345e1 has a mantissa precision
of 5 and exponent precision of 1.
 Date and Time - A value to represent an instant of time. A date and time value can be divided into many portions, related to a predefined calendar system: year, month, day, hour, minute, second, second fraction, and time zone. A date and time value also has a precision, which controls the number of digits of the second-fraction portion. For example, 1999-1-1 1:1:1.001 has a precision of 3 on the second-fraction portion.

1. Character String - A character string is usually represented in memory as an array of characters. Each
character is represented in 8 bits (one byte) or 16 bits (two bytes) based on the character set and the
character encoding schema. For example, with ASCII character set and its encoding schema, character
"A" will be represented as "01000001". Character "1" will be represented as "00110001". Character
string "ABC" will be represented as "010000010100001001000011".
2. Bit String - The binary representation of a bit string should be easy: a bit string should be represented in memory as it is. Bit string "01000001" should be represented as "01000001". There might be an issue with memory allocation, because computers allocate memory in units of bytes (8 bits per byte). If the length of a bit string is not a multiple of 8 bits, the last allocated byte is not full. How to handle the empty space in the last byte? Different SQL implementations will have different rules.

3. Exact Number - Exact numbers can be divided into two groups: integers and non-integers. An integer is an exact number with a scale of 0. An integer is represented in either 4 bytes or 8 bytes based on the signed binary value system. For example, with 4 bytes, integer "1" will be represented as "00000000000000000000000000000001". Integer "-1" will be represented as "11111111111111111111111111111111" (two's complement).
The representation of exact non-integer numbers is implementation-dependent; a common approach is a scaled (fixed-point) decimal encoding.

4. Approximate Number - An approximate number is normally represented in binary form according to


the IEEE 754 single-precision or double-precision standards, in either 4 bytes or 8 bytes. The binary representation is divided into 3 components, with a different number of bits assigned to each component:

                  Sign  Exponent  Fraction  Total
Single-Precision     1         8        23     32
Double-Precision     1        11        52     64

With the double precision standard, the mantissa precision can go up to 52 binary digits, about 15
decimal digits.

5. Date and Time - A date and time value is usually stored in memory as an exact integer number of 8 bytes, representing an instant by measuring the time period between this instant and a reference time point with millisecond precision (a second-fraction precision of 3). How does MySQL store date and time values? We will try to find out later.

Data Literals
Now we know the types of data, and how they are stored in memory. Next we need to know how data can get into the computer. One way is to enter it through the program source code as a data literal.
Data Literal: A program source element that represents a data value. Data literals can be divided into multiple groups depending on the type of data being represented and how it is represented.

1. Character String Literals are used to construct character strings, exact numbers, approximate
numbers and data and time values. The syntax rules of character string literals are pretty simple:

 A character string literal is a sequence of characters enclosed by quote characters.


 The quote character is the single quote character "'".
 If "'" is part of the sequence, it needs to be doubled as "''".

Examples of character string literals:


'Hello world!'
'Loews L''Enfant Plaza'

'123'
'0.123e-1'
'1999-01-01'
2. Hex String Literals are used to construct character strings and exact numbers. The syntax rules for
hex string literals are also very simple:

 A hex string literal is a sequence of hex digits enclosed by quote characters and prefixed with
"x".
 The quote character is the single quote character "'".

Examples of hex string literals:

x'41424344'
x'31323334'
x'01'
x'0001'
x'ff'
x'ffffffff'
x'ffffffffffffffff'

3. Numeric Literals are used to construct exact numbers and approximate numbers. Syntax rules of
numeric literals are:

 A numeric literal can be written in signed integer form, signed real numbers without exponents,
or real numbers with exponents.

Examples of numeric literals:


1
-22
33.3
-44.44
55.555e5
-666.666e-6

4. Date and Time Literals are used to construct date and time values. The syntax of date and time literals is:
 A date literal is written in the form "DATE 'yyyy-mm-dd'".
 A time literal is written in the form "TIMESTAMP 'yyyy-mm-dd hh:mm:ss'".
Examples of date and time literals:
DATE '1999-01-01'
TIMESTAMP '1999-01-01 01:02:03'

2.2.4 Types of SQL commands

Figure: 2.2.4

SQL commands are commonly grouped into four classes: data definition (CREATE, ALTER, DROP), data manipulation (SELECT, INSERT, UPDATE, DELETE), data control (GRANT, REVOKE), and transaction control (COMMIT, ROLLBACK, SAVEPOINT).

2.2.5 SQL operators and their procedure

Insert command

Update command
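The figures for these commands are not reproduced here; a minimal sketch against the EMP table of section 2.2.8 (the values are illustrative):

SQL> INSERT INTO EMP (EMPNO, ENAME, JOB, SAL, DEPTNO)
     VALUES (7999, 'VERMA', 'CLERK', 900, 20);

SQL> UPDATE EMP SET SAL = SAL + 100 WHERE EMPNO = 7999;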

2.2.6 Tables, views and indexes

SQL> CREATE TABLE EMPLOYEE (NAME VARCHAR2 (15), DOB DATE, SALARY NUMBER (5), DEPT_CODE
NUMBER (3), ADDRESS VARCHAR2 (25));

Table created.

SQL> DESC EMPLOYEE;


Name Null? Type
----------------------------------------- -------- ----------------------------
NAME VARCHAR2 (15)
DOB DATE
SALARY NUMBER (5)
DEPT_CODE NUMBER (3)
ADDRESS VARCHAR2 (25)

2.2.7 Queries and sub queries

SQL> SELECT * FROM CUSTOMERS WHERE ID IN (SELECT ID FROM CUSTOMERS WHERE SALARY >
4500);

+----+----------+-----+---------+----------+
| ID | NAME     | AGE | ADDRESS | SALARY   |
+----+----------+-----+---------+----------+
|  4 | Chaitali |  25 | Mumbai  |  6500.00 |
|  5 | Hardik   |  27 | Bhopal  |  8500.00 |
|  7 | Muffy    |  24 | Indore  | 10000.00 |
+----+----------+-----+---------+----------+

2.2.8 Aggregate functions

SQL> select * from emp;


EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
--------- ---------- --------- --------- --------- --------- --------- ---------
7369 SMITH CLERK 7902 17-DEC-80 800 20
7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30
7521 WARD SALESMAN 7698 22-FEB-81 1250 500 30
7566 JONES MANAGER 7839 02-APR-81 2975 20
7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30
7698 BLAKE MANAGER 7839 01-MAY-81 2850 30
7782 CLARK MANAGER 7839 09-JUN-81 2450 10
7788 SCOTT ANALYST 7566 19-APR-87 3000 20
7839 KING PRESIDENT 17-NOV-81 5000 10
7844 TURNER SALESMAN 7698 08-SEP-81 1500 100 30
7876 ADAMS CLERK 7788 23-MAY-87 1100 20
7900 JAMES CLERK 7698 03-DEC-81 950 30
7902 FORD ANALYST 7566 03-DEC-81 3000 20
7934 MILLER CLERK 7782 23-JAN-82 1300 10

SQL> SELECT MIN(SAL) FROM EMP;


MIN(SAL)
---------
800

SQL> SELECT MAX(SAL) FROM EMP;

MAX(SAL)
---------
5000

SQL> SELECT SUM(SAL) FROM EMP;

SUM(SAL)
---------
29025

SQL> SELECT AVG(SAL) FROM EMP;

AVG(SAL)
---------
2073.2143

SQL> SELECT COUNT(*) FROM EMP;

COUNT(*)
---------
14

2.2.9 Insert, update and delete operations

SQL> INSERT INTO EMPLOYEE VALUES('MAHESH', '01-FEB-1983', 5000, 01, 'LUCKNOW');


1 row created.

SELECT * FROM EMPLOYEE;

NAME DOB SALARY DEPT_CODE ADDRESS


--------------- --------- --------- ----------------- -------------------------
MAHESH 01-FEB-83 5000 1 LUCKNOW

INSERTING MORE VALUES INTO THE TABLE (using substitution variables):


INSERT INTO EMPLOYEE VALUES(&NAME, &DOB, &SALARY, &DEPT_CODE, &ADDRESS)

Enter value for name: 'KAMLESH'


Enter value for dob: '26-MAR-1986'
Enter value for salary: 7000
Enter value for dept_code: 02
Enter value for address: 'MUMBAI'
old 1: INSERT INTO EMPLOYEE VALUES(&NAME, &DOB, &SALARY, &DEPT_CODE, &ADDRESS)
new 1: INSERT INTO EMPLOYEE VALUES('KAMLESH', '26-MAR-1986', 7000, 02, 'MUMBAI')
1 row created.

SELECT * FROM EMPLOYEE;

NAME DOB SALARY DEPT_CODE ADDRESS


--------------- --------- --------- --------- -------------------------
MAHESH 01-FEB-83 5000 1 LUCKNOW
KAMLESH 26-MAR-86 7000 2 MUMBAI

ALTER TABLE EMPLOYEE ADD ROLL_NO NUMBER(10);


Table altered.
2.2.10 Joins

CREATE TABLE FACULTY(SSN NUMBER(5), RANK VARCHAR2(15));

SELECT * FROM FACULTY;

SSN RANK
111 ASST
333 FULL
555 ASSOC

CREATE TABLE STUDENT(SSN NUMBER(5), GPA NUMBER(10));


SELECT * FROM STUDENT;

SSN GPA
333 5
444 6
777 7

SELECT FACULTY.SSN, FACULTY.RANK, STUDENT.SSN, STUDENT.GPA FROM FACULTY, STUDENT WHERE


FACULTY.SSN(+)=STUDENT.SSN;

SSN RANK SSN GPA


333 FULL 333 5
- - 444 6
- - 777 7

SELECT FACULTY.SSN, FACULTY.RANK, STUDENT.SSN, STUDENT.GPA FROM FACULTY, STUDENT WHERE


FACULTY.SSN=STUDENT.SSN(+);

SSN RANK SSN GPA


333 FULL 333 5
555 ASSOC - -
111 ASST - -

SELECT * FROM FACULTY FULL OUTER JOIN STUDENT ON FACULTY.SSN=STUDENT.SSN;

SSN RANK SSN GPA
333 FULL 333 5
555 ASSOC - -
111 ASST - -
- - 444 6
- - 777 7

SELECT * FROM FACULTY INNER JOIN STUDENT ON FACULTY.SSN=STUDENT.SSN;

SSN RANK SSN GPA


333 FULL 333 5

2.2.11 Unions, Intersection and Minus

"Employees Norway":

E_ID E_Name
01 Hansen, Ola
02 Svendson, Tove
03 Svendson, Stephen
04 Pettersen, Kari

"Employees_USA":

E_ID E_Name
01 Turner, Sally
02 Kent, Clark
03 Svendson, Stephen
04 Scott, Stephen

SQL>SELECT E_Name FROM Employees_Norway UNION


SELECT E_Name FROM Employees_USA

E_Name
Hansen, Ola
Svendson, Tove
Svendson, Stephen
Pettersen, Kari
Turner, Sally
Kent, Clark
Scott, Stephen

SQL>SELECT E_NAME FROM Employees_Norway INTERSECT SELECT E_NAME FROM Employees_USA;

E_Name
Svendson, Stephen

SQL>SELECT E_NAME FROM Employees_Norway MINUS SELECT E_NAME FROM Employees_USA;

E_Name
Hansen, Ola
Svendson, Tove
Pettersen, Kari

2.2.12 Cursors

A cursor is a temporary work area created in the system memory when a SQL statement is executed. A
cursor contains information on a select statement and the rows of data accessed by it. This temporary
work area is used to store the data retrieved from the database, and manipulate this data. A cursor can
hold more than one row, but can process only one row at a time. The set of rows the cursor holds is
called the active set. There are two types of cursors in PL/SQL:

Implicit cursors
These are created by default when DML statements like, INSERT, UPDATE, and DELETE statements are
executed. They are also created when a SELECT statement that returns just one row is executed.
Explicit cursors
They must be created when you are executing a SELECT statement that returns more than one row. Even though the cursor stores multiple records, only one record can be processed at a time, which is called the current row. When you fetch a row, the current row position moves to the next row. Both implicit and explicit cursors have the same functionality, but they differ in the way they are accessed.
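A minimal explicit-cursor sketch against the emp table of section 2.2.8 (the output via DBMS_OUTPUT is illustrative):

DECLARE
  CURSOR EMP_CUR IS SELECT ENAME, SAL FROM EMP;
  V_ENAME EMP.ENAME%TYPE;
  V_SAL EMP.SAL%TYPE;
BEGIN
  OPEN EMP_CUR;
  LOOP
    FETCH EMP_CUR INTO V_ENAME, V_SAL;
    EXIT WHEN EMP_CUR%NOTFOUND;  -- stop when the active set is exhausted
    DBMS_OUTPUT.PUT_LINE(V_ENAME || ' earns ' || V_SAL);
  END LOOP;
  CLOSE EMP_CUR;
END;
/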

2.2.13 Triggers
A trigger is a statement that is executed automatically by the system as a side effect of a modification to the database.
To design a trigger mechanism, we must:
– Specify the conditions under which the trigger is to be executed.
– Specify the actions to be taken when the trigger executes.
Triggers were not part of the original SQL standard (they were added in SQL:1999), but many implementations support them.
Suppose that instead of allowing negative account balances, the bank deals with overdrafts by
– setting the account balance to zero
– creating a loan in the amount of the overdraft
– giving this loan a loan number identical to the account number of the overdrawn account
The condition for executing the trigger is an update to the account relation that results in a negative balance value.
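A hedged PL/SQL sketch of this overdraft trigger, assuming account(account_number, branch_name, balance) and loan(loan_number, branch_name, amount) tables; it is one possible rendering, not the only one:

CREATE OR REPLACE TRIGGER OVERDRAFT_TRIGGER
BEFORE UPDATE OF BALANCE ON ACCOUNT
FOR EACH ROW
WHEN (NEW.BALANCE < 0)
BEGIN
  -- create a loan for the overdraft, numbered like the account
  INSERT INTO LOAN (LOAN_NUMBER, BRANCH_NAME, AMOUNT)
  VALUES (:NEW.ACCOUNT_NUMBER, :NEW.BRANCH_NAME, -:NEW.BALANCE);
  -- set the account balance to zero instead of leaving it negative
  :NEW.BALANCE := 0;
END;
/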

2.2.14 Procedures in PL/SQL


A stored procedure, or simply a procedure, is a named PL/SQL block which performs one or more specific tasks. It is similar to a procedure in other programming languages.
A procedure has a header and a body. The header consists of the name of the procedure and the parameters or variables passed to the procedure. The body consists of a declaration section, an execution section and an exception section, similar to a general PL/SQL block.
A procedure is similar to an anonymous PL/SQL block, but it is named for repeated usage.
We can pass parameters to procedures in three ways:

1) IN-parameters
2) OUT-parameters
3) IN OUT-parameters
A procedure may or may not return any value.

General Syntax to create a procedure is:


CREATE [OR REPLACE] PROCEDURE proc_name [list of parameters]
IS
Declaration section
BEGIN
Execution section
EXCEPTION
Exception section
END;
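A minimal concrete example following this syntax, against the emp table of section 2.2.8 (the procedure name and amounts are illustrative):

CREATE OR REPLACE PROCEDURE RAISE_SALARY (P_EMPNO IN NUMBER, P_AMOUNT IN NUMBER)
IS
BEGIN
  UPDATE EMP SET SAL = SAL + P_AMOUNT WHERE EMPNO = P_EMPNO;
END;
/

-- Called from SQL*Plus as:
EXECUTE RAISE_SALARY(7369, 500);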

UNIT- III

3.1 Data Base Design & Normalization

The goal of relational database design is to generate a set of relation schemas that allows us to store information without redundant (repeated) data, while still allowing us to retrieve information easily and efficiently. For this we use normal forms as the set of rules; applying these rules is known as normalization.
Database normalization is a data design and organization process, applied to data structures based on their functional dependencies and primary keys, that helps build relational databases.
Main objective in developing a logical data model for relational database systems is to create an
accurate representation of the data, its relationships, and constraints.
 To achieve this objective, must identify a suitable set of relations.
 Four most commonly used normal forms are first (1NF), second (2NF) and third (3NF) normal
forms, and Boyce Codd normal form (BCNF).
 Based on functional dependencies among the attributes of a relation.
 A relation can be normalized to a specific form to prevent possible occurrence of update
anomalies.
 Formal technique for analyzing a relation based on its primary key and functional dependencies
between its attributes.
 Often executed as a series of steps. Each step corresponds to a specific normal form, which has
known properties.
 As normalization proceeds, relations become progressively more restricted (stronger) in format
and also less vulnerable to update anomalies
3.1.1 Functional Dependency

Main concept associated with normalization.


Functional Dependency
 Describes relationship between attributes in a relation.
 If A and B are attributes of relation R, B is functionally dependent on A (denoted A → B) if each value of A in R is associated with exactly one value of B in R.
 Functional dependency is a property of the meaning (or semantics) of the attributes in a relation.

Figure: 3.1.1

Main characteristics of functional dependencies used in normalization:
 have a 1:1 relationship between attribute(s) on left and right-hand side of a dependency;
 hold for all time;
 are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that reduces the set to a manageable size: we need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation and has the property that every functional dependency in Y is implied by the functional dependencies in X.
The set of all functional dependencies implied by a given set of functional dependencies X is called the closure of X (written X+).
A set of inference rules, called Armstrong's axioms, specifies how new functional dependencies can be inferred from given ones. Let A, B, and C be subsets of the attributes of a relation R. Armstrong's axioms are as follows:
1. Reflexivity: if B is a subset of A, then A → B.
2. Augmentation: if A → B, then A,C → B,C.
3. Transitivity: if A → B and B → C, then A → C.

3.1.2 Normal Forms

Normalization Helps:
• Minimizing data redundancy.
• Minimizing insertion, deletion and update anomalies.
• Reducing input and output delays.
• Reducing memory usage.
• Supporting a single consistent version of the truth.
• It is an industry best practice for table and entity design.
Uses: Database normalization is a useful tool for the requirements analysis and data modelling process of software development. Thus normalization is the process of removing undesirable problems by using functional dependencies and keys.
Here we use the following normal forms:
• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal form (3NF)
• Fourth Normal Form (4NF)
• Fifth Normal Form (5NF)
• Sixth Normal Form (6NF)

Figure: 3.1.2 Relationship between Normal Forms

1. First Normal Form (1NF)

A relation schema R is said to be in First Normal Form (1NF) if the values in the domain of each attribute of the relation are atomic. A relation is in first normal form if it has no repeating groups.
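As an illustration, a hypothetical CUSTOMER(cust_id, name, phone1, phone2, phone3) design carries a repeating group of phone numbers; in 1NF the group moves to its own relation:

CREATE TABLE CUSTOMER (
  CUST_ID NUMBER(5) PRIMARY KEY,
  NAME VARCHAR2(30)
);

CREATE TABLE CUSTOMER_PHONE (
  CUST_ID NUMBER(5) REFERENCES CUSTOMER(CUST_ID),
  PHONE VARCHAR2(15),
  PRIMARY KEY (CUST_ID, PHONE)  -- one row per phone number: no repeating group
);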

2. Second Normal Form (2NF)

• 2NF is a normal form in database normalization. It requires that all data elements in a table are fully functionally dependent on the table's primary key.
• If a data element depends on only part of the primary key, it is moved out to a separate table.
• If the table has a single field as the primary key, it is automatically in 2NF.
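For example, in a hypothetical ORDER_ITEM(order_no, product_no, qty, product_name) table keyed on (order_no, product_no), product_name depends on product_no alone, violating 2NF; it is parsed out as follows:

CREATE TABLE PRODUCT (
  PRODUCT_NO NUMBER(5) PRIMARY KEY,
  PRODUCT_NAME VARCHAR2(30)  -- now depends on the whole key
);

CREATE TABLE ORDER_ITEM (
  ORDER_NO NUMBER(5),
  PRODUCT_NO NUMBER(5) REFERENCES PRODUCT(PRODUCT_NO),
  QTY NUMBER(3),
  PRIMARY KEY (ORDER_NO, PRODUCT_NO)
);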

3. Third Normal Form (3NF)

The 3NF is a normal form used in database normalization to check that all the non-key attributes of a relation depend only on the candidate keys of the relation.
This means that all non-key attributes are mutually independent, or in other words that a non-key attribute cannot be transitively dependent on another non-key attribute.
A relation schema R is in 3NF if every non-prime attribute of R meets both of the following:
• It is fully functionally dependent on every key of R.
• It is non-transitively dependent on every key of R.
OR we can say: a relation schema R is in 3NF if, whenever a nontrivial functional dependency X → A holds in R, either X is a super key of R or A is a prime attribute of R.
4. Boyce Codd Normal Form (BCNF)

BCNF is a normal form used in database normalization. It is a slightly stronger version of 3NF.
A table is in BCNF if and only if:
(a) It is in 3NF, and
(b) For every one of its nontrivial functional dependencies X → Y, X is a super key.
OR: a relation schema R is in BCNF if, whenever a nontrivial functional dependency X → A holds in R, then X is a super key of R.

5. Fourth Normal Form (4NF)

• An entity type is in 4NF if it is in BCNF and there are no nontrivial multivalued dependencies between its attribute types.
• Any entity in BCNF is transformed into 4NF by:
(i) detecting any multivalued dependencies;
(ii) decomposing the entity type.

6. Fifth Normal Form (5NF)

A table is said to be in 5NF if and only if it is in 4NF and every join dependency in it is implied by the candidate keys.

Domain/Key Normal Form
Domain/Key Normal Form is a normal form used in database normalization which requires that the database contain no constraints other than domain constraints and key constraints.
A domain constraint specifies the permissible values for a given attribute, while a key constraint specifies the attributes that uniquely identify a row in a given table.
Domain/key NF is the Holy Grail of relational database design, achieved when every constraint on the relation is a logical consequence of the definition of keys and domains, so that enforcing key and domain constraints causes all other constraints to be met.
Because it avoids all non-temporal anomalies, it is much easier to work with a database in domain/key normal form than to convert lesser databases, which may contain numerous anomalies. However, successfully building a domain/key normal form database remains a difficult task, even for experienced database programmers.
Thus, while domain/key normal form eliminates the problems found in most databases, it tends to be the most costly normal form to achieve.

3.1.3 Inclusion dependency

It is an important integrity constraint for relational databases. For example, an inclusion dependency can say that every MANAGER entry of the R relation appears as an EMPLOYEE entry of the S relation. In general, an inclusion dependency is of the form

R[A1, A2, ..., An] ⊆ S[B1, B2, ..., Bn]    (1.1)

where R and S are relations and A1, ..., An and B1, ..., Bn are attributes. The inclusion dependency (1.1) holds for a database if each tuple that is a member of the relation corresponding to the left-hand side of (1.1) is also in the relation corresponding to the right-hand side of (1.1).
Hence, INDs are valuable for database design, since they permit us to selectively define what data must
be duplicated in what relations.
INDs differ from other commonly studied database dependencies. INDs may be interrelational, whereas the others deal with a single relation at a time.

Objective of Inclusion Dependencies: To formalize two types of interrelational constraints which cannot be expressed using FDs or MVDs:
 Referential integrity constraints
 Class/subclass relationships

Inclusion dependency inference rules


 IDIR1 (reflexivity): R.X < R.X.
 IDIR2 (attribute correspondence): If R.X < S.Y
where X = {A1, A2 ,..., An} and Y = {B1, B2, ..., Bn} and Ai Corresponds-to Bi, then R.Ai < S.Bi for 1 ≤ i
≤ n.
 IDIR3 (transitivity): If R.X < S.Y and S.Y < T.Z, then R.X < T.Z.

3.1.4 Test for Lossless Joins – General Case


For a decomposition involving more than two relations, the simple test cannot be used, so we present
an algorithm for testing the general case.
Given a relation R(A1,A2,...An), a set of functional dependencies, F, and a decomposition of R
into relations R1,R2,...Rm, the following algorithm can be used to determine whether the decomposition
has a lossless join. The algorithm is illustrated in Figure 6.9 below.

Algorithm to test for lossless join:


1. Construct an m by n table, S, with a column for each of the n attributes in R and a row for each of the
m relations in the decomposition.

2. For each cell S(i,j) of S, if the attribute for the column, Aj, is in the relation for the row, Ri, then place
the symbol a(j) in the cell else place the symbol b(i,j) there.

3. Repeat the following process until no more changes can be made to S:
for each FD X → Y in F
for all rows in S that have the same symbols in the columns corresponding to the
attributes of X, make the symbols for the columns that represent attributes of Y equal by the following
rule:
if any row has an a value, a(j), then set the value of that column in all the other
rows equal to a(j)
if no row has an a value, then pick any one of the b values, say b(i,j), and set all
the other rows equal to b(i,j)

4. If, after all possible changes have been made to S, a row is made up entirely of "a" symbols, a(1), a(2), ..., a(n), then the join is lossless. If there is no such row, the join is lossy.

Example:

Consider the relation R(A,B,C,D,E) having decomposition consisting of R1(A,C), R2(A,B,D), and R3(D,E)
with FDs A → C, AB → D, and D → E. Figure 6.9 illustrates the algorithm. Referring to Figure(a), we
construct one row for each relation in the decomposition and one column for each of the five
attributes of R. For each row, we place the value a with the column subscript in any column whose
heading represents an attribute in that relation, and the value b with the usual row and column
subscript in the column for any attribute not in that relation. For example, in the first row, for relation
R1(A,C), we place a(1) in the first column, for A, and a(3) in the third column, for C. Since B does not
appear in R1, we place b(1,2) in its column. Similarly, we place b(1,4) in the D column and b(1,5) in the
E column, since these attributes do not appear in R1. Now we consider the FD A → C, and look for rows
that agree on the value of the left hand side, A. We find that rows 1 and 2 agree on the value a(1).
Therefore, we can set the C values equal. We find that row 1 has an a value, a(3), in the C column, so
we set the C column value of row 2 equal to a(3). Considering the second FD, AB → D, we cannot find
any two rows that agree on both their A and B values, so we are unable to make any changes. Now
considering the FD D → E, we find that row 2 and row 3 agree on their D values, a(4), so we can set
their E values equal. Since row 3 has an E value of a(5), we change the E value of row 2 to a(5) as well.
Now we find that the second row has all a values, and we conclude that the projection has the lossless
join property

R(A,B,C,D,E)

Decomposition: R1(A,C), R2(A,B,D), R3(D,E)

FD's: A → C, AB → D, D → E

| A B C D E

__________________________________________________________________

R1(A,C) | a(1) b(1,2) a(3) b(1,4) b(1,5)

R2(A,B,D) | a(1) a(2) b(2,3) a(4) b(2,5)

R3(D,E) | b(3,1) b(3,2) b(3,3) a(4) a(5)

Figure(a) Initial placement of values

| A B C D E

__________________________________________________________________

R1(A,C) | a(1) b(1,2) a(3) b(1,4) b(1,5)

R2(A,B,D) | a(1) a(2) a(3) a(4) a(5)

R3(D,E) | b(3,1) b(3,2) b(3,3) a(4) a(5)

(b) Table after considering all FDs

Inference Rules for FD

Reflexive Rule: If Y ⊆ X, then X → Y.

Augmentation Rule: If X → Y, then XZ → YZ.

Transitive Rule: If X → Y and Y → Z, then X → Z.

Decomposition or Projective Rule: If X → YZ, then X → Y and X → Z.

Union or Additive Rule: If X → Y and X → Z, then X → YZ.

Pseudo Transitive Rule: If X → Y and WY → Z, then WX → Z.

UNIT- IV

4.1 Transaction Processing Concept


4.1.1 Transaction System

A transaction is an action, or series of actions, carried out by a user or application, which accesses or changes the contents of a database. It transforms the database from one consistent state to another, although consistency may be violated during the transaction.

ACID Properties of Transactions

Four basic (ACID) properties of a transaction are:

Atomicity: A transaction is an atomic unit of processing; it is either performed in its entirety or not performed at all.
Consistency: Must transform database from one consistent state to another.
Isolation: Partial effects of incomplete transactions should not be visible to other transactions.
Durability: Effects of a committed transaction are permanent and must not be lost because of later
failure.
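A minimal sketch of these properties in SQL, reusing the account table assumed in Unit 1 (the account numbers are illustrative):

-- One logical function: transfer 100 from one account to another
UPDATE ACCOUNT SET BALANCE = BALANCE - 100 WHERE ACCOUNT_NUMBER = 'A-101';
UPDATE ACCOUNT SET BALANCE = BALANCE + 100 WHERE ACCOUNT_NUMBER = 'A-102';
COMMIT;  -- durability: both updates persist together
-- Before COMMIT, a failure (or ROLLBACK) undoes both updates (atomicity),
-- and the total of the two balances is preserved (consistency).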

4.1.2 Testing of Serializability

In serializability, ordering of read/writes is important:


(a) If two transactions only read a data item, they do not conflict and order is not important.
(b) If two transactions either read or write completely separate data items, they do not conflict and
order is not important.
(c) If one transaction writes a data item and another reads or writes same data item, order of execution
is important.

Example of Conflict Serializability

Figure: 4.1.2
4.1.3 Serializability of schedules

 Offers less stringent definition of schedule equivalence than conflict serializability.

 Two schedules S1 and S2 are view equivalent if:


 For each data item x, if Ti reads initial value of x in S1, Ti must also read initial value of x in S2.
 For each read on x by Ti in S1, if the value read by Ti is written by Tj, Ti must also read the value of x produced by Tj in S2.
 For each data item x, if last write on x performed by Ti in S1, same transaction must perform
final write on x in S2.
 Schedule is view serializable if it is view equivalent to a serial schedule.

 Every conflict serializable schedule is view serializable, although converse is not true.

 It can be shown that any view serializable schedule that is not conflict serializable contains one
or more blind writes.

 In general, testing whether schedule is serializable is NP-complete.

Figure: 4.1.3

4.1.4 Recoverability

 Serializability identifies schedules that maintain database consistency, assuming no transaction


fails.
 Could also examine recoverability of transactions within schedule.
 If transaction fails, atomicity requires effects of transaction to be undone.
 Durability states that once transaction commits, its changes cannot be undone (without running
another, compensating, transaction).
Two basic concurrency control techniques:
 Locking,
 Timestamping.

4.1.5 Recovering from transaction failures

If a transaction failure occurs in a partitioned database environment, database recovery is usually


necessary on both the failed database partition server and any other database partition server that was
participating in the transaction.

There are two types of database recovery:


 Crash recovery occurs on the failed database partition server after the failure condition is
corrected.
 Database partition failure recovery on the other (still active) database partition servers occurs
immediately after the failure has been detected.
In a partitioned database environment, the database partition server on which a transaction is
submitted is the coordinator partition, and the first agent that processes the transaction is the
coordinator agent. The coordinator agent is responsible for distributing work to other database
partition servers, and it keeps track of which ones are involved in the transaction. When the application
issues a COMMIT statement for a transaction, the coordinator agent commits the transaction by using
the two-phase commit protocol. During the first phase, the coordinator partition distributes a prepare
request to all the other database partition servers that are participating in the transaction. These
servers then respond with one of the following:

Transaction failure recovery on the failed database partition server

If the transaction failure causes the database manager to end abnormally, you can issue
the db2start command with the RESTART option to restart the database manager once the
database partition has been restarted. If you cannot restart the database partition, you can
issue db2start to restart the database manager on a different database partition.

If the database manager ends abnormally, database partitions on the server can be left in an
inconsistent state. To make them usable, crash recovery can be triggered on a database
partition server:
 Explicitly, through the RESTART DATABASE command
 Implicitly, through a CONNECT request when the auto restart database configuration
parameter has been set to ON
Crash recovery reapplies the log records in the active log files to ensure that the effects of all
complete transactions are in the database. After the changes have been reapplied, all
uncommitted transactions are rolled back locally, except for in doubt transactions. There are
two types of in doubt transaction in a partitioned database environment:
 On a database partition server that is not the coordinator partition, a transaction is in
doubt if it is prepared but not yet committed.
 On the coordinator partition, a transaction is in doubt if it is committed but not yet
logged as complete (that is, the FORGET record is not yet written). This situation occurs
when the coordinator agent has not received all the COMMIT acknowledgments from all
the servers that worked for the application.

4.1.6 Log based recovery

Check points
Checkpoint-Recovery is a common technique for imbuing a program or system with fault tolerant
qualities, and grew from the ideas used in systems which employ transaction processing. It allows
systems to recover after some fault interrupts the system, and causes the task to fail, or be aborted in
some way. While many systems employ the technique to minimize lost processing time, it can be used
more broadly to tolerate and recover from faults in a critical application or task. The basic idea behind checkpoint-recovery is the saving and restoration of system state. By saving the current state of the system periodically or before critical code sections, it provides the baseline information needed for the restoration of lost state in the event of a system failure. While the cost of checkpoint-recovery can be high, using techniques like memory exclusion and designing the system to have as small a critical state as possible can reduce the cost of checkpointing enough to be useful even in cost-sensitive embedded applications.
non-volatile storage. The checkpointing mechanism takes a snapshot of the system state and stores the
data on some non-volatile storage medium. Clearly, the cost of a checkpoint will vary with the amount
of state required to be saved and the bandwidth available to the storage mechanism being used to
save the state. In the event of a system failure, the internal state of the system can be restored, and it
can continue service from the point at which its state was last saved. Typically this involves restarting
the failed task or system, and providing some parameter indicating that there is state to be recovered.
Depending on the task complexity, the amount of state, and the bandwidth to the storage device this
process could take from a fraction of a second to many seconds.
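As a rough illustration of the save-and-restore idea, the sketch below checkpoints a task's state to
non-volatile storage at intervals and resumes from the last checkpoint after a restart. The file name,
state layout, and checkpoint interval are assumptions made for the example.

import os
import pickle

CHECKPOINT_FILE = "task.ckpt"   # illustrative location on non-volatile storage

def save_checkpoint(state):
    # Write to a temporary file first and rename atomically, so a crash
    # during the save cannot corrupt the previous checkpoint.
    tmp = CHECKPOINT_FILE + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, CHECKPOINT_FILE)

def restore_checkpoint():
    # Return the last saved state, or None if the task starts fresh.
    if os.path.exists(CHECKPOINT_FILE):
        with open(CHECKPOINT_FILE, "rb") as f:
            return pickle.load(f)
    return None

state = restore_checkpoint() or {"next_item": 0, "total": 0}
for i in range(state["next_item"], 100_000):
    state["total"] += i
    state["next_item"] = i + 1
    if i % 10_000 == 0:            # checkpoint periodically, not every step
        save_checkpoint(state)
save_checkpoint(state)             # final checkpoint at completion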

This technique protects against the transient fault model. Typically, upon state restoration the
system continues processing just as it did previously. This tolerates any transient fault; however, if
the fault was caused by a design error, the system will continue to fail and recover endlessly. In
some cases this may be the most important type of fault to guard against, but not in every case.

4.1.7 Deadlock Handling

A deadlock is a situation in which each process in a set is waiting to access a resource that is
currently held by another process in the set, so that overall progress is nil.

As shown in the partial schedule, transaction T1 is waiting for transaction T2 to release its lock on
data item B, and transaction T2 is waiting for transaction T1 to release its lock on data item A. Such
a cycle of transactions waiting for locks to be released is called a deadlock.
Clearly, these two transactions will make no further progress. Now we can define deadlock as:
"A system is in a deadlock state if there exists a set of transactions such that every transaction in
the set is waiting for another transaction in the set."
More precisely, there exists a set of waiting transactions {T0, T1, ..., Tn} such that T0 is waiting
for a data item that is held by T1, T1 is waiting for a data item that is held by T2, ..., Tn-1 is
waiting for a data item that is held by Tn, and Tn is waiting for a data item that is held by T0. None
of the transactions can make progress in such a situation.

There are two principal methods for dealing with the deadlock problem.

o Deadlock prevention
o Deadlock detection

Deadlock prevention: We can use a deadlock-prevention protocol to ensure that the system will never
enter a deadlock state.

Deadlock detection: In this case, we can allow the system to enter a deadlock state, and then try to
recover using a deadlock detection and deadlock recovery scheme.

Both the above methods may result in transaction rollback. Prevention is commonly used if the
probability that the system would enter a deadlock state is relatively high; otherwise detection and
recovery are more efficient.
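Since a deadlock is exactly a cycle in the waits-for graph (an edge Ti -> Tj means Ti is waiting for a
lock held by Tj), a detection scheme can periodically build the graph and search it for a cycle. A
minimal sketch, with the dictionary-of-sets graph representation chosen just for illustration:

def has_deadlock(waits_for):
    """waits_for maps each transaction to the set it is waiting on."""
    WHITE, GREY, BLACK = 0, 1, 2        # unvisited / on stack / finished
    colour = {t: WHITE for t in waits_for}

    def dfs(t):
        colour[t] = GREY
        for u in waits_for.get(t, ()):
            if colour.get(u, WHITE) == GREY:      # back edge: a cycle
                return True
            if colour.get(u, WHITE) == WHITE and dfs(u):
                return True
        colour[t] = BLACK
        return False

    return any(colour[t] == WHITE and dfs(t) for t in waits_for)

# The situation from the text: T1 waits for T2 and T2 waits for T1.
print(has_deadlock({"T1": {"T2"}, "T2": {"T1"}}))   # True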

Deadlock Prevention

We can prevent deadlocks by giving each transaction a priority and ensuring that lower-priority
transactions are not allowed to wait for higher-priority transactions (or vice versa). One way to
assign priorities is to give each transaction a timestamp when it starts up: the lower the timestamp,
the higher the transaction's priority; that is, the oldest transaction has the highest priority.

If a transaction Ti requests a lock and transaction Tj holds a conflicting lock, the lock manager can use
one of the following two policies:

Wait-die

If Ti has higher priority, it is allowed to wait; otherwise it is aborted. That is, when transaction
Ti requests a data item currently held by Tj, Ti is allowed to wait only if it has a timestamp smaller
than that of Tj (that is, Ti is older than Tj). Otherwise, Ti is rolled back (dies).

Example

Suppose that transactions T22, T23 and T24 have timestamps 5, 10 and 15 respectively. If T22 requests
a data item held by T23, then T22 will wait. If T24 requests a data item held by T23, then T24 will be
rolled back.

        T22 waits                     T24 rolls back
  T22 ----------------------> T23 <---------------------- T24
  (5)                         (10)                        (15)

The wait-die scheme is a non-preemptive scheme, because only a transaction requesting a lock can be
aborted. As a transaction grows older (and its priority increases), it tends to wait for more and more
younger transactions.

Wound-wait

If Ti has higher priority, abort Tj; otherwise Ti waits. That is, when transaction Ti requests a data
item currently held by Tj, Ti is allowed to wait only if it has a timestamp larger than that of Tj
(that is, Ti is younger than Tj). Otherwise, Tj is rolled back (Tj is wounded by Ti).

Example:

Returning to our previous example, with transactions T22, T23 and T24: if T22 requests a data item
held by T23, then the data item will be preempted from T23 and T23 will be rolled back. If T24
requests a data item held by T23, then T24 will wait.

        T22 gets the data item;       T24 waits for T23
        T23 rolls back
  T22 --------------------> T23 <------------------ T24
  (5)                       (10)                    (15)

This scheme is based on a preemptive technique and is a counterpart to the wait-die scheme. In the
wait-die scheme, lower priority transactions can never wait for higher priority transactions. In the
wound-wait scheme, higher priority transactions never wait for lower priority transactions. In either
case no deadlock cycle can develop.
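Both policies reduce to a single timestamp comparison when transaction Ti requests a lock that Tj
holds. A small sketch, using the timestamps from the examples above (this is only the decision rule,
not a full lock manager):

def wait_die(requester_ts, holder_ts):
    # Non-preemptive: an older requester may wait; a younger one dies.
    return "wait" if requester_ts < holder_ts else "abort requester (dies)"

def wound_wait(requester_ts, holder_ts):
    # Preemptive: an older requester wounds the younger holder.
    return "abort holder (wounded)" if requester_ts < holder_ts else "wait"

# T22, T23, T24 have timestamps 5, 10, 15 as in the text.
print(wait_die(5, 10))      # T22 requests an item held by T23: wait
print(wait_die(15, 10))     # T24 requests an item held by T23: dies
print(wound_wait(5, 10))    # T22 requests an item held by T23: T23 wounded
print(wound_wait(15, 10))   # T24 requests an item held by T23: wait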

When a transaction is aborted and restarted, it should be given the same timestamp that it had
originally. Reusing its original timestamp in this way ensures that each transaction will eventually
become the oldest transaction, and thus the one with the highest priority, and will get the locks that
it requires.

Problem of starvation

Whenever transactions are rolled back, it is important to ensure that there is no starvation; that is,
that no transaction gets rolled back repeatedly and is never allowed to make progress.

Both the wound-wait and the wait-die schemes avoid starvation. At any time, there is a transaction
with the smallest timestamp. This transaction cannot be required to roll back in either scheme. Since
timestamps always increase, and since transactions are not assigned new timestamps when they are
rolled back, a transaction that is rolled back will eventually have the smallest timestamp. Thus it will
not be rolled back again.

Differences between wait-die and wound-wait scheme

There are the following differences between the wait-die and wound-wait schemes:

• In the wait-die scheme, an older transaction must wait for a younger one to release its data item;
thus, the older the transaction gets, the more it tends to wait. By contrast, in the wound-wait scheme
an older transaction never waits for a younger transaction.

• In the wait-die scheme, if transaction T24 requests a data item held by T23, then T24 dies and is
rolled back, because the requester is younger than the holder (TS(T24) > TS(T23)). When T24 restarts,
it may reissue the same sequence of requests; if the data item is still held by T23, T24 dies again.
Thus T24 may die several times before acquiring the needed data item. In the wound-wait scheme, by
contrast, a transaction is wounded and rolled back only when an older transaction requests a data item
that it holds. Suppose the data item is held by T23 and T22 requests it: then T23 is wounded and
rolled back, because TS(T23) > TS(T22). When T23 restarts and requests the data item, now held by T22,
it simply waits, since TS(T23) > TS(T22) still holds and a younger transaction may wait for an older
one; so T23 is not rolled back again. Thus there tend to be fewer rollbacks in the wound-wait scheme.

Timeout-Based Schemes

Another simple approach to deadlock handling is based on lock timeouts. In this approach, a
transaction that has requested a lock waits for at most a specified amount of time. If the lock has
not been granted within that time, the transaction is said to time out; it rolls itself back and
restarts. If there was in fact a deadlock, one or more of the transactions involved will time out and
roll back, allowing the others to proceed. This scheme falls somewhere between deadlock prevention,
where a deadlock will never occur, and deadlock detection and recovery.

Uses of Timeout-Based Schemes

The timeout scheme is particularly easy to implement, and it works well if transactions are short and
if long waits are likely to be due to deadlocks.

Limitations

• It is hard to decide how long a transaction must wait before timing out. Too long a wait results in
unnecessary delays once a deadlock has occurred. Too short a wait results in transaction rollback even
when there is no deadlock, leading to wasted resources.

• Starvation is also a possibility with this scheme.

Hence the timeout-based scheme has limited applicability.
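The mechanics can be sketched with Python's threading.Lock, whose acquire call accepts a timeout. The
retry limit and the work callback are assumptions for the example; a real DBMS would also undo the
transaction's partial updates before retrying.

import threading

lock = threading.Lock()

def run_transaction(work, timeout=2.0, max_attempts=5):
    for attempt in range(max_attempts):
        if lock.acquire(timeout=timeout):   # wait at most `timeout` seconds
            try:
                return work()               # lock granted: perform the updates
            finally:
                lock.release()
        # Timed out: assume (possibly wrongly) a deadlock, roll back, retry.
    raise RuntimeError("transaction gave up after repeated timeouts")

print(run_transaction(lambda: "updated row"))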

4.2 Distributed Database


4.2.1 Distributed Data storage

A distributed data store is a computer network where information is stored on more than one node,
often in a replicated fashion. It is usually specifically used to refer to either a distributed
database where users store information on a number of nodes, or a computer network in which users
store information on a number of peer network nodes.
Distributed databases are usually non-relational databases that make quick access to data across a
large number of nodes possible. Some distributed databases expose rich query abilities, while others
are limited to key-value store semantics. Examples of the more limited kind are Google's BigTable
(which is nonetheless much more than a distributed file system or a peer-to-peer network), Amazon's
Dynamo, and Windows Azure Storage.
Because the ability to run arbitrary queries is often less important than availability, designers of
distributed data stores have increased availability at the expense of consistency. But high-speed
read/write access then comes at the cost of reduced consistency: as the CAP theorem proves, a
networked system cannot provide consistency, availability, and partition tolerance all at once.

4.2.2 Concurrency control


Concurrency control is a database management system (DBMS) concept that is used to address the
conflicts that can occur when data is simultaneously accessed or altered in a multi-user system.
Applied to a DBMS, concurrency control is meant to coordinate simultaneous transactions while
preserving data integrity; in short, it is about controlling multi-user access to the database.

To illustrate the concept of concurrency control, consider two travellers who go to electronic kiosks at
the same time to purchase a train ticket to the same destination on the same train. There's only one
seat left in the coach, but without concurrency control, it's possible that both travellers will end up
purchasing a ticket for that one seat. However, with concurrency control, the database wouldn't allow
this to happen. Both travellers would still be able to access the train seating database, but concurrency
control would preserve data accuracy and allow only one traveller to purchase the seat.
This example also illustrates the importance of addressing this issue in a multi-user database.
Obviously, one could quickly run into problems with the inaccurate data that can result from several
transactions occurring simultaneously and writing over each other. The following section provides
strategies for implementing concurrency control.

Concurrency Control Locking Strategies

Pessimistic Locking: This concurrency control strategy involves keeping an entity in a database locked
the entire time it exists in the database's memory. This limits or prevents users from altering the data
entity that is locked. There are two types of locks that fall under the category of pessimistic locking:
write lock and read lock.
With write lock, everyone but the holder of the lock is prevented from reading, updating, or deleting
the entity. With read lock, other users can read the entity, but no one except for the lock holder can
update or delete it.

Optimistic Locking: This strategy can be used when instances of simultaneous transactions, or
collisions, are expected to be infrequent. In contrast with pessimistic locking, optimistic locking doesn't
try to prevent the collisions from occurring. Instead, it aims to detect these collisions and resolve them
on the chance occasions when they occur.
Pessimistic locking provides a guarantee that database changes are made safely. However, it becomes
less viable as the number of simultaneous users or the number of entities involved in a transaction
increases, because the potential for having to wait for a lock to be released also increases.
Optimistic locking can alleviate the problem of waiting for locks to release, but then users have the
potential to experience collisions when attempting to update the database.
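Optimistic locking is commonly implemented with a version column: each update is made conditional on
the version still being the one that was read. A minimal sketch using SQLite; the account table, its
columns, and the withdraw function are illustrative assumptions.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY,"
             " balance INTEGER, version INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100, 0)")

def withdraw(conn, account_id, amount):
    balance, version = conn.execute(
        "SELECT balance, version FROM account WHERE id = ?",
        (account_id,)).fetchone()
    # The conditional UPDATE detects a collision: if another transaction
    # changed the row in the meantime, the version no longer matches,
    # no row is updated, and the caller must re-read and retry.
    cur = conn.execute(
        "UPDATE account SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (balance - amount, account_id, version))
    return cur.rowcount == 1      # True = success, False = collision detected

print(withdraw(conn, 1, 30))      # True on an uncontended attempt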

Lock Problems:
Deadlock:
When dealing with locks two problems can arise, the first of which being deadlock. Deadlock refers to a
particular situation where two or more processes are each waiting for another to release a resource, or
more than two processes are waiting for resources in a circular chain. Deadlock is a common problem
in multiprocessing where many processes share a specific type of mutually exclusive resource. Some
computers, usually those intended for the time-sharing and/or real-time markets, are often equipped
with a hardware lock, or hard lock, which guarantees exclusive access to processes, forcing
serialization. Deadlocks are particularly disconcerting because there is no general solution to avoid
them.
A fitting analogy of the deadlock problem could be a situation like when you go to unlock your car door
and your passenger pulls the handle at the exact same time, leaving the door still locked. If you have
ever been in a situation where the passenger is impatient and keeps trying to open the door, it can be
very frustrating. Basically you can get stuck in an endless cycle, and since both actions cannot be
satisfied, deadlock occurs.

Livelock: Livelock is a special case of resource starvation. A livelock is similar to a deadlock,
except that the states of the processes involved constantly change with regard to one another while
never progressing. The general definition only states that a specific process is not progressing. For example,
the system keeps selecting the same transaction for rollback causing the transaction to never finish
executing. Another livelock situation can come about when the system is deciding which transaction
gets a lock and which waits in a conflict situation.
An illustration of livelock occurs when numerous people arrive at a four way stop, and are not quite
sure who should proceed next. If no one makes a solid decision to go, and all the cars just keep
creeping into the intersection afraid that someone else will possibly hit them, then a kind of livelock
can happen.

Basic Timestamping: Basic timestamping is a concurrency control mechanism that eliminates deadlock.
This method doesn’t use locks to control concurrency, so it is impossible for deadlock to occur.
According to this method a unique timestamp is assigned to each transaction, usually showing when it
was started. This effectively allows an age to be assigned to transactions and an order to be assigned.
Data items have both a read-timestamp and a write-timestamp. These timestamps are updated each
time the data item is read or updated respectively.

Problems arise in this system when a transaction tries to read a data item that has been written by a
younger transaction. This is called a late read: the data item has changed since the initial
transaction start time, and the solution is to roll the transaction back and acquire a new timestamp.
Another problem occurs when a transaction tries to write a data item that has been read by a younger
transaction. This is called a late write: the data item has been read by another transaction since the
start time of the transaction that is altering it. The solution is the same as for the late read
problem: the transaction is rolled back and a new timestamp acquired.
Adhering to the rules of the basic timestamping process allows the transactions to be serialized, and
a chronological schedule of transactions can then be created. Timestamping may not be practical for
large databases with high transaction volumes, since a large amount of storage space would have to be
dedicated to storing the timestamps.
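The late read and late write rules translate into a few lines of code. A sketch, where the Item class
and Rollback exception are illustrative names; a rolled-back transaction would restart with a fresh,
larger timestamp:

class Rollback(Exception):
    pass

class Item:
    def __init__(self):
        self.read_ts = 0     # timestamp of the youngest reader so far
        self.write_ts = 0    # timestamp of the youngest writer so far

def read(item, ts):
    if ts < item.write_ts:                 # late read: a younger txn wrote it
        raise Rollback("late read")
    item.read_ts = max(item.read_ts, ts)

def write(item, ts):
    if ts < item.read_ts or ts < item.write_ts:   # late write
        raise Rollback("late write")
    item.write_ts = ts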

4.2.3 Directory System

A system database directory file exists for each instance of the database manager, and contains one
entry for each database that has been cataloged for this instance. Databases are implicitly cataloged
when the CREATE DATABASE command is issued and can also be explicitly cataloged with the CATALOG
DATABASE command.

For each database created, an entry is added to the directory containing the following information:

• The database name provided with the CREATE DATABASE command
• The database alias name (which is the same as the database name, if an alias name is not specified)
• The database comment provided with the CREATE DATABASE command
• The location of the local database directory
• An indicator that the database is indirect, which means that it resides on the current database
manager instance
• Other system information.

On UNIX platforms and in a partitioned database environment, you must ensure that all database
partitions always access the same system database directory file, sqldbdir, in the sqldbdir subdirectory
of the home directory for the instance. Unpredictable errors can occur if either the system database
directory or the system intention file sqldbins in the same sqldbdir subdirectory are symbolic links to
another file that is on a shared file system.

UNIT- V
5.1 Concurrency Control Techniques

Concurrency control is the process of managing simultaneous operations on the database without having
them interfere with one another. It prevents interference when two or more users access the database
simultaneously and at least one of them is updating data. Although two transactions may each be
correct in themselves, an interleaving of their operations may produce an overall incorrect result.

Need for Concurrency Control

Three classic examples of potential problems caused by concurrency are the lost update problem, the
uncommitted dependency problem, and the inconsistent analysis problem. The first of these is
illustrated below using two sample transactions, Transaction 1 (T1) and Transaction 2 (T2).

[Fig: Two sample transactions: Transaction 1 and Transaction 2]

The problem occurs when two transactions that access the same database items have their operations
interleaved in a way that leaves the value of some database item incorrect. Suppose that transactions
T1 and T2 are submitted at approximately the same time and their operations are interleaved as shown
in the figure; then the final value of X is incorrect, because T2 reads the value of X before T1
changes it in the database, and hence the updated value resulting from T1 is lost. For example, if
X = 80 at the start (originally there were 80 reservations on the flight), N = 5 (T1 transfers 5 seat
reservations from the flight corresponding to X to the flight corresponding to Y), and M = 4 (T2
reserves 4 seats on X), the final result should be X = 79; but with the interleaving of operations
shown in the figure it is X = 84, because the update in T1 that removed the 5 seats from X was lost.

This is the lost update problem.
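The interleaving can be reproduced in a few lines, with local variables standing in for each
transaction's private buffer. The correct serial result is 79; the interleaved run below loses T1's
update:

X = 80                # 80 reservations on the flight initially

t1_local = X          # T1: read_item(X)  -> 80
t2_local = X          # T2: read_item(X)  -> 80, before T1 has written!
t1_local -= 5         # T1: X := X - N, with N = 5
X = t1_local          # T1: write_item(X) -> 75
t2_local += 4         # T2: X := X + M, with M = 4
X = t2_local          # T2: write_item(X) -> 84; T1's update is lost

print(X)              # 84, not the correct value 79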


5.1.1 Concurrency Control Techniques

Two basic concurrency control techniques are:
• Locking
• Timestamping

5.1.2 Locking Techniques for concurrency control

A transaction uses locks to deny access to other transactions and so prevent incorrect updates. A
transaction must claim a shared (read) or exclusive (write) lock on a data item before the
corresponding read or write operation.

Locking - Basic Rules

• If a transaction has a shared lock on an item, it can read but not update the item.
• If a transaction has an exclusive lock on an item, it can both read and update the item.
• Reads cannot conflict, so more than one transaction can hold shared locks simultaneously on the
same item.
• An exclusive lock gives a transaction exclusive access to the item.
• Some systems allow a transaction to upgrade a read lock to an exclusive lock, or downgrade an
exclusive lock to a shared lock.

Two-Phase Locking (2PL)

A transaction follows the 2PL protocol if all of its locking operations precede its first unlock
operation. There are two phases for the transaction (a sketch of the rule follows this list):
• Growing phase - the transaction acquires locks but cannot release any lock.
• Shrinking phase - the transaction releases locks but cannot acquire any new lock.

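A minimal sketch of the rule, where the transaction object itself refuses any lock request made after
its first unlock (class and method names are illustrative):

class TwoPhaseTransaction:
    def __init__(self):
        self.held = set()
        self.shrinking = False    # becomes True at the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violated: lock requested after an unlock")
        self.held.add(item)       # growing phase

    def unlock(self, item):
        self.shrinking = True     # shrinking phase begins
        self.held.discard(item)

t = TwoPhaseTransaction()
t.lock("X"); t.lock("Y")          # growing phase
t.unlock("X")                     # shrinking phase starts here
# t.lock("Z")                     # would raise: forbidden under 2PL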
Preventing the Lost Update Problem Using 2PL

2PL prevents the lost update problem: the second transaction cannot obtain its lock on the data item
until the first transaction has finished updating it and released the lock, so the interleaving shown
earlier cannot occur.

Deadlock

An impasse that may result when two (or more) transactions are each waiting for locks held by the
other to be released. Three general techniques exist for handling deadlock:
• Timeouts.
• Deadlock prevention.
• Deadlock detection and recovery.

Timeouts

A transaction that requests a lock waits for only a system-defined period of time. If the lock has not
been granted within this period, the lock request times out; the DBMS assumes the transaction may be
deadlocked (even though it may not be), and it aborts and automatically restarts the transaction.

Deadlock Prevention

• Wait-Die - only an older transaction can wait for a younger one; otherwise the transaction is
aborted (dies) and restarted with the same timestamp.
• Wound-Wait - only a younger transaction can wait for an older one. If an older transaction requests
a lock held by a younger one, the younger one is aborted (wounded).

Recovery from Deadlock Detection

After a deadlock is detected, several issues must be settled:
• Choice of deadlock victim;
• How far to roll a transaction back;
• Avoiding starvation.

5.1.3 Timestamping

Transactions are ordered globally so that older transactions, those with smaller timestamps, get
priority in the event of conflict.

Timestamp: a unique identifier created by the DBMS that indicates the relative starting time of a
transaction. It can be generated by using the system clock at the time the transaction started, or by
incrementing a logical counter every time a new transaction starts. A conflicting operation is allowed
to proceed only if the last update on that data item was carried out by an older transaction;
otherwise, the transaction requesting the operation is restarted and given a new timestamp. Data items
also carry two timestamps:

• Read-timestamp - the timestamp of the last transaction to read the item;
• Write-timestamp - the timestamp of the last transaction to write the item.

Optimistic Techniques

Based on the assumption that conflict is rare and that it is more efficient to let transactions
proceed without delays to ensure serializability. When a transaction wants to commit, a check is
performed to determine whether a conflict has occurred; if it has, the transaction must be rolled back
and restarted. Because conflicts are assumed rare, this potentially allows greater concurrency than
traditional protocols. There are three phases:
• Read
• Validation
• Write

Optimistic Techniques - Read Phase

Extends from the start of the transaction until immediately before the commit. The transaction reads
values from the database and stores them in local variables; updates are applied to a local copy of
the data.

Optimistic Techniques - Validation Phase

Follows the read phase. For a read-only transaction, it checks that the data read are still the
current values. If there is no interference, the transaction is committed, else it is aborted and
restarted. For an update transaction, it checks that the transaction leaves the database in a
consistent state, with serializability maintained.

Optimistic Techniques - Write Phase

Follows the successful validation phase for update transactions; the updates made to the local copy
are applied to the database.

5.1.4 Validation Based Concurrency Control

This approach is also known as the validation technique: no checking is done while the transaction is
executing. In this scheme, updates made by the transaction are not applied directly to the database items until the transaction
reaches its end. During transaction execution, all updates are applied to local copies of the data items
that are kept for the transaction. At the end of transaction execution, a validation phase checks
whether any of the transaction’s updates violate serializability. Certain information needed by the
validation phase must be kept by the system. If serializability is not violated, the transaction is
committed and the database is updated from the local copies; otherwise, the transaction is aborted
and then restarted later.
There are three phases for this concurrency control protocol:
1. Read phase: A transaction can read values of committed data items from the database. However,
updates are applied only to local copies (versions) of the data items kept in the transaction workspace.
2. Validation phase: Checking is performed to ensure that serializability will not be violated if the
transaction updates are applied to the database.
3. Write phase: If the validation phase is successful, the transaction updates are applied to the
database; otherwise, the updates are discarded and the transaction is restarted.

The idea behind optimistic concurrency control is to do all the checks at once; hence, transaction
execution proceeds with a minimum of overhead until the validation phase is reached. If there is little
interference among transactions, most will be validated successfully. However, if there is much
interference, many transactions that execute to completion will have their results discarded and must
be restarted later. Under these circumstances, optimistic techniques do not work well. The techniques
are called "optimistic" because they assume that little interference will occur and hence that there is
no need to do checking during transaction execution.
The optimistic protocol we describe uses transaction timestamps and also requires that the write_sets
and read_sets of the transactions be kept by the system. In addition, start and end times for some of
the three phases need to be kept for each transaction. Recall that the write_set of a transaction is the
set of items it writes, and the read_set is the set of items it reads. In the validation phase for
transaction Ti, the protocol checks that Ti does not interfere with any committed transactions or with
any other transactions currently in their validation phase. The validation phase for Ti checks that,
for each such transaction Tj that is either committed or is in its validation phase, one of the
following conditions holds:
1. Transaction Tj completes its write phase before Ti starts its read phase.

2. Ti starts its write phase after Tj completes its write phase, and the read_set of Ti has no items in
common with the write_set of Tj.

3. Both the read_set and write_set of Ti have no items in common with the write_set of Tj, and
Tj completes its read phase before Ti completes its read phase.
When validating transaction Ti, the first condition is checked first for each transaction Tj, since (1) is
the simplest condition to check. Only if condition (1) is false is condition (2) checked, and only if (2) is
false is condition (3)—the most complex to evaluate—checked. If any one of these three conditions
holds, there is no interference and Ti is validated successfully. If none of these three conditions holds,
the validation of transaction Ti fails and it is aborted and restarted later because interference may have
occurred.
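The three conditions translate almost directly into code. A sketch, assuming each transaction records
its read and write sets and the boundary times of its phases; the Txn field names are illustrative:

from dataclasses import dataclass

@dataclass
class Txn:
    read_set: set
    write_set: set
    read_start: int
    read_end: int
    write_start: int
    write_end: int

def validate(ti, tj):
    # Condition 1: Tj finished its write phase before Ti started reading.
    if tj.write_end < ti.read_start:
        return True
    # Condition 2: Ti starts its write phase after Tj completes its write
    # phase, and Ti read nothing that Tj wrote.
    if tj.write_end < ti.write_start and not (ti.read_set & tj.write_set):
        return True
    # Condition 3: Ti neither reads nor writes anything Tj wrote, and Tj
    # completes its read phase before Ti completes its read phase.
    if (not (ti.read_set & tj.write_set)
            and not (ti.write_set & tj.write_set)
            and tj.read_end < ti.read_end):
        return True
    return False    # possible interference: abort and restart Ti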

5.1.5 Multiple Granularity


In computer science, multiple granularity locking (MGL) is a locking method used in database
management systems (DBMS) and relational databases.
In MGL, locks are set on objects that contain other objects. MGL exploits the hierarchical nature of
the contains relationship. For example, a database may have files, which contain pages, which further
contain records. This can be thought of as a tree of objects, where each node contains its children.
A lock on a node, such as a shared or exclusive lock, locks the targeted node as well as all of its descendants.
Multiple granularity locking is usually used with non-strict two-phase locking to
guarantee serializability.

Lock Modes

In addition to shared (S) locks and exclusive (X) locks from other locking schemes, like strict two-phase
locking, MGL also uses intention shared and intention exclusive locks. IS locks conflict with X locks, while
IX locks conflict with S and X locks. The null lock (NL) is compatible with everything.
To lock a node in S (or X), MGL has the transaction lock on all of its ancestors with IS (or IX), so if a
transaction locks a node in S (or X), no other transaction can access its ancestors in X (or S and X). This
protocol is shown in the following table:

To get           Must have on all ancestors
IS or S          IS or IX
IX, SIX or X     IX or SIX

Determining what level of granularity to use for locking is done by locking the finest level possible (at
the lowest leaf level), and then escalating these locks to higher levels in the file hierarchy to cover more
records or file elements as needed. This process is known as Lock Escalation. MGL locking modes are
compatible with each other as defined in the following matrix.

Mode   NL    IS    IX    S     SIX   X
NL     Yes   Yes   Yes   Yes   Yes   Yes
IS     Yes   Yes   Yes   Yes   Yes   No
IX     Yes   Yes   Yes   No    No    No
S      Yes   Yes   No    Yes   No    No
SIX    Yes   Yes   No    No    No    No
X      Yes   No    No    No    No    No
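Both tables can be encoded directly, which makes the lock-grant test a pair of lookups. In this
sketch, held_modes are the locks other transactions already hold on the node and ancestor_modes are
the requesting transaction's locks along the path from the root; the names and representation are
illustrative assumptions.

COMPAT = {   # COMPAT[requested][held]: can the two modes coexist?
    "NL":  {"NL": 1, "IS": 1, "IX": 1, "S": 1, "SIX": 1, "X": 1},
    "IS":  {"NL": 1, "IS": 1, "IX": 1, "S": 1, "SIX": 1, "X": 0},
    "IX":  {"NL": 1, "IS": 1, "IX": 1, "S": 0, "SIX": 0, "X": 0},
    "S":   {"NL": 1, "IS": 1, "IX": 0, "S": 1, "SIX": 0, "X": 0},
    "SIX": {"NL": 1, "IS": 1, "IX": 0, "S": 0, "SIX": 0, "X": 0},
    "X":   {"NL": 1, "IS": 0, "IX": 0, "S": 0, "SIX": 0, "X": 0},
}

REQUIRED_ON_ANCESTORS = {   # the "to get / must have" table above
    "IS": {"IS", "IX"}, "S": {"IS", "IX"},
    "IX": {"IX", "SIX"}, "SIX": {"IX", "SIX"}, "X": {"IX", "SIX"},
}

def can_lock(mode, held_modes, ancestor_modes):
    # Every ancestor must already carry a suitable intention lock, and the
    # requested mode must be compatible with locks already on the node.
    ok_path = all(m in REQUIRED_ON_ANCESTORS[mode] for m in ancestor_modes)
    ok_node = all(COMPAT[mode][m] for m in held_modes)
    return ok_path and ok_node

print(can_lock("X", held_modes=[], ancestor_modes=["IX", "IX"]))   # True
print(can_lock("S", held_modes=["X"], ancestor_modes=["IS"]))      # False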

5.1.6 Multiversion Schemes


Multiversion concurrency control (MCC or MVCC) is a concurrency control method commonly used by
database management systems to provide concurrent access to the database, and in programming languages
to implement transactional memory.
If someone is reading from a database at the same time as someone else is writing to it, it is possible
that the reader will see a half-written or inconsistent piece of data. There are several ways of solving
this problem, known as concurrency control methods. The simplest way is to make all readers wait until
the writer is done, which is known as a lock. This can be very slow, so MVCC takes a different approach:
each user connected to the database sees a snapshot of the database at a particular instant in time.
Any changes made by a writer will not be seen by other users of the database until the changes have
been completed (or, in database terms: until the transaction has been committed.)
When an MVCC database needs to update an item of data, it will not overwrite the old data with new
data, but instead mark the old data as obsolete and add the newer version elsewhere. Thus there are
multiple versions stored, but only one is the latest. This allows readers to access the data that was
there when they began reading, even if it was modified or deleted part way through by someone else.
It also allows the database to avoid the overhead of filling in holes in memory or disk structures but
requires (generally) the system to periodically sweep through and delete the old, obsolete data objects.
For a document-oriented database it also allows the system to optimize documents by writing entire
documents onto contiguous sections of disk; when updated, the entire document can be re-written rather
than bits and pieces cut out or maintained in a linked, non-contiguous database structure.
MVCC provides point in time consistent views. Read transactions under MVCC typically use a
timestamp or transaction ID to determine what state of the DB to read, and read these versions of the
data. This avoids managing locks for read transactions because writes can be isolated by virtue of the
old versions being maintained, rather than through a process of locks or mutexes. Writes affect a
future version but at the transaction ID that the read is working at, everything is guaranteed to be
consistent because the writes are occurring at a later transaction ID.
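A minimal sketch of the version-chain idea, assuming commit timestamps are assigned in increasing
order; the class and method names are illustrative:

class MVCCItem:
    def __init__(self, initial):
        self.versions = [(0, initial)]    # (commit_ts, value), oldest first

    def write(self, commit_ts, value):
        # The old version stays in place; a background sweep could later
        # delete versions that no active reader can still see.
        self.versions.append((commit_ts, value))

    def read(self, snapshot_ts):
        # Return the newest version committed at or before the snapshot.
        for ts, value in reversed(self.versions):
            if ts <= snapshot_ts:
                return value

x = MVCCItem("old")
x.write(10, "new")
print(x.read(5))     # "old": this reader's snapshot predates the write
print(x.read(15))    # "new"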

5.1.7 Recovery with Concurrent Transactions

Recovery is the process of restoring the database to a correct state in the event of a failure.

• Two types of storage: volatile (main memory) and non-volatile.
• Volatile storage does not survive system crashes.
• Stable storage represents information that has been replicated in several non-volatile storage
media with independent failure modes.

Types of Failures

Failures range from system crashes, media failures and application software errors to natural physical
disasters, carelessness or unintentional destruction of data or facilities, and sabotage.

Transactions and Recovery

Transactions represent the basic unit of recovery, and the recovery manager is responsible for
atomicity and durability. If a failure occurs between a transaction committing and the database
buffers being flushed to secondary storage then, to ensure durability, the recovery manager has to
redo (roll forward) the transaction's updates. If a transaction had not committed at failure time, the
recovery manager has to undo (roll back) any effects of that transaction for atomicity.
• Partial undo - only one transaction has to be undone.
• Global undo - all transactions have to be undone.

Example: suppose the DBMS starts at time t0 but fails at time tf, and the data written by transactions
T2 and T3 have already been flushed to secondary storage. Transactions still active at the failure
must be undone. In the absence of any other information, the recovery manager has to redo T2, T3, T4,
and T5.

Recovery Facilities

A DBMS should provide the following facilities to assist with recovery:

• Backup mechanism, which makes periodic backup copies of the database.
• Logging facilities, which keep track of the current state of transactions and database changes.
• Checkpoint facility, which enables in-progress updates to the database to be made permanent.
• Recovery manager, which allows the DBMS to restore the database to a consistent state following a
failure.
Log File

The log file contains information about all updates to the database:

• Transaction records.
• Checkpoint records.

Transaction records contain:

• Transaction identifier.
• Type of log record (transaction start, insert, update, delete, abort, commit).
• Identifier of the data item affected by the database action (insert, delete, and update operations).
• Before-image of the data item.
• After-image of the data item.
• Log management information.

The log file may be duplexed or triplexed, and it is sometimes split into separate random-access files.
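A sketch of what such a transaction record might look like, and of how the before- and after-images
support undo and redo; the field names are illustrative assumptions:

from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class LogRecord:
    txn_id: str                   # transaction identifier
    kind: str                     # start/insert/update/delete/abort/commit
    item: Optional[str] = None    # data item affected, if any
    before: Any = None            # before-image: used to undo (roll back)
    after: Any = None             # after-image: used to redo (roll forward)

log = []
log.append(LogRecord("T1", "start"))
log.append(LogRecord("T1", "update", item="X", before=80, after=75))
log.append(LogRecord("T1", "commit"))

# Recovery replays the log: redo committed transactions using the
# after-images, and undo uncommitted ones using the before-images.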

Assignment

(1.) Differentiate between File Processing System and Database Management System.
(2.) What are data models? Explain the classification of data models in brief.
(3.) What do you mean by Database Abstraction? Differentiate among Physical, Conceptual
and View level of data abstraction.
(4.) Discuss the role of Database Administrator (DBA) in Database Management System.
(5.) How to create a user in Oracle? Write down command or queries for create a table, insert,
update, retrieve & delete values from table.
(6.) Draw the block diagram of Overall Database Structure and Explain in detail.
(7.) Construct an E-R diagram for a hospital with a set of patients and set of medical doctors.
Associate with each patient, a log of various tests and examination conducted.
(8.) Differentiate among generalization, specialization and aggregation with respect to the database.
(9.) What are the Codd’s rules for RDBMS?
(10.) What do you mean by integrity constraint? Explain its various types.
(11.) Differentiate among FPS, DBMS and RDBMS with suitable diagram.
(12.) What do you mean by data abstraction? Explains its various levels with suitable diagram.
(13.) Draw an E-R diagram for banking management system.
(14.) Differentiate between DDL and DML. Also create a table for employee and perform insert, update
and delete operation.
(15.) Explain the role of Database User and DBA in details.
(16.) Write short notes on the following:
(a.) PL/SQL
(b.) Cursor
(c.) Trigger
(17.) State Armstrong’s axioms for functional dependency.
(18) Define relational algebra and explain its division operation with examples.
(19) Write down an algorithm to test the general cases of lossy and lossless join decomposition.
(20) What do you mean by normalization? Explain the various goals and motivations behind normalizing
a database.
(21) Differentiate between tuple relational calculus and domain relational calculus.
(22) What do you mean by transaction? Explain ACID property of transaction.
(23) Write short notes on any two of the following with examples:
(a.) BCNF
(b.) 4NF
(c.) PJNF
(d.) 3NF

Quiz

1. The column of the table is referred to as the


(a) tuple (b) attribute (c) entity (d) degree

2. If every non-key attribute is functionally dependent on the primary key, then the relation will be
in
(a) 1NF (b) 2NF (c) 3NF (d) 4NF

3. The set of permitted values for each attribute is called its


(a) entity sets (b) attribute range (c) group (d) domain

4. Redundancy is dangerous as it is a potential threat to data


(a) sufficiency (b) integrity & consistency (c) data abstraction (d) none of these

5. An attribute of one table matching the primary key of another table, is called as
(a) foreign key (b) super key (c) candidate key (d) primary key

6. The employee salary should not be greater than Rs. 5000. This is
(a) feasible constraint (b) referential constraint (c) over-defined constraint (d) integrity
constraint

7. If a relation scheme is in BCNF, then it is also in


(a) 1NF (b) 2NF (c) 3NF (d) All of them

8. Student and courses enrolled, is an example of


(a) one-to-one relationship (b) one-to-many relationship
(c) many-to-many relationship (d) many-to-one relationship

9. E-R modeling technique is a


(a) top-down-approach (b) bottom-up approach
(c) left-right approach (d) none of these

10. A trigger is
(a) a statement that enable to start any DBMS
(b) a statement that is executed by the user when debugging an application program
(c) a condition the system test for the validity of the database user
(d) a statement that is executed automatically by the system as a side effect of a modification to
the database

Model Paper

B. Tech, CS/IT-4th Semester Exam


Database Management System(ECS-402)
Time: 3:00 Hours MM: 100
Note: Attempt all questions.
1. Attempt any four parts:- ( 5 * 4 = 20 )
(a.) Differentiate between File Processing System and Database Management System.
(b.) What is the role of database administrator?
(c.) What is the difference between specialization and generalization with respect to database?
(d.) Define an E-R diagram. Construct an E-R diagram for university.
(e.) Define integrity constraints. List various types of integrity constraints.

2. Attempt any four parts:- ( 5 * 4 = 20 )

(a.) What is union compatibility? What are the relational algebra operators that require the
relations on
which they are applied to be union compatible?
(b.) Explain various aggregate functions available in SQL.
(c.) Explain 3NF with example.
(d.) Explain PJNF with example.
(e.) Consider the following relational schema:
SUPPLIER (SUPPLIER_ID, SUPPLIER_NAME, SUPPLIER_ADDRESS)
PARTS (PART_ID, PART_NAME, COLOUR)
CATALOG (SUPPLIER_ID, PART_NAME, COST)
Write the following query in Relational Algebra
(i.) Find the name of the supplier who supplies YELLOW parts.
(ii.) Find the name of the supplier who supplies both YELLOW and GREEN parts.
(iii.) Find the name of the supplier who supplies all parts.

3. Attempt any two parts:- ( 10 * 2 = 20 )


(a.) Define functional dependency. Given two sets F1 and F2 of functional dependencies for a relation:
F1: A→B, AB→C, D→AC, D→E
F2: A→BC, D→AE. Are the two sets equivalent?
(b.) List ACID property of transaction. Check whether the given schedule S is conflict serializable or
not:
S: r1(X); r2(Z); r1(Z); r3(X); r3(Y); w3(Y); r2(Y); w2(Z); w2(Y); w1(X);
(c.) Write short notes on any two:
(i) Checkpoints (ii) Serial schedule (iii) Directory System (iv) Data replication

4. Attempt any two parts:- ( 10 * 2 = 20 )

(a.) Define log. Differentiate between deferred database modification (DDM) and immediate database modification (IDM) with example.
(b.) Define schedule. Explain conflict schedule with example.
(c.) What do you mean by deadlock? Explain deadlock detection and recovery schemes.

5. Attempt any two parts:- ( 10 * 2 = 20 )


(a.) What do you mean by Lock based protocol? Explain 2PL protocol in detail.
(b.) What do you mean by multiple granularities? How it is implemented in transaction system?
(c.) What do you mean by multiversion scheme? Explain Multiversion 2PL protocol.

B. Tech, CS/IT-4th Semester Exam
Database Management System(ECS-402)
Time: 3:00 Hours MM: 100
Note: Attempt all questions.

1. Attempt any four parts:- ( 5 * 4 = 20 )


(a.) Discuss the three level architecture of database system.
(b.) Define an E-R diagram. Construct an E-R diagram for banking system.
(c.) Differentiate between DBMS and RDBMS.
(d.) Define key. Explain various types of key regarding database.
(e.) How relational algebra is different from relational calculus? Justify your answer.

2. Attempt any four parts:- ( 5 * 4 = 20 )

(a.) Explain mapping cardinalities with example.


(b.) Explain various types of outer join in SQL with example.
(c.) Explain BCNF with example.
(d.) Explain 4NF with example.
(e.) Consider the following relational schema:
SUPPLIER (SUPPLIER_ID, SUPPLIER_NAME, SUPPLIER_ADDRESS)
PARTS (PART_ID, PART_NAME, COLOUR)
CATALOG (SUPPLIER_ID, PART_NAME, COST)
Write the following query in Relational Algebra
(i.) Find the name of the supplier who supplies YELLOW parts.
(ii.) Find the name of the supplier who supplies both YELLOW and GREEN parts.
(iii.) Find the name of the supplier who supplies all parts.
3. Attempt any two parts:- ( 10 * 2 = 20 )

(a.) Define the RAT axioms. Consider the relational schema R(A, B, C, D, E, F, G, H) having a
decomposition consisting of R1(A, B, C, D), R2(A, B, C, E, F) and R3(A, D, F, G, H), with the
functional dependencies F = {AB→C, BC→D, E→F, G→F, H→A, FG→H}.
State whether the given decomposition of schema R is lossy or lossless.
(b.) Explain state diagram of transaction. Check whether the given schedule S is conflict serializable
or not:
S: r1(X); r2(Z); r1(X); r1(Z); r2(Y); r3(Y); w1(X); w2(Z); w3(Y); w2(Y);
(c.) Write short notes on any two:
(i)Shadow paging
(ii) Non serial schedule
(iii) Distributed database
(iv) Data fragmentation
4. Attempt any two parts:- ( 10 * 2 = 20 )

(a.) What do you mean by schedule? Explain View schedule with examples.

(b.) Define deadlock. Explain deadlock prevention method with example.
(c.) What do you mean by recoverability of schedules? Explain cascadeless schedule with example.
5. Attempt any two parts:- ( 10 * 2 = 20 )

(a.) Discuss in detail about Graph based protocol for concurrency control.
(b.) Discuss in detail about Validation based protocol.
(c.) What do you mean by Timestamp based protocol? Explain Thomas' Write rule.

REFERENCES

1. Date C. J., "An Introduction to Database Systems", Addison-Wesley.
2. Korth, Silberschatz, Sudarshan, "Database System Concepts", McGraw-Hill.
3. Elmasri, Navathe, "Fundamentals of Database Systems", Addison-Wesley.
4. O'Neil, "Databases", Elsevier.
5. Leon & Leon, "Database Management Systems", Vikas Publishing House.
6. Bipin C. Desai, "An Introduction to Database Systems", Galgotia Publications.
7. Majumdar & Bhattacharya, "Database Management System", TMH.
8. Ramakrishnan, Gehrke, "Database Management Systems", McGraw-Hill.
9. Kroenke, "Database Processing: Fundamentals, Design and Implementation", Pearson Education.
10. J. D. Ullman, "Principles of Database and Knowledge-Base Systems", Computer Science Press.
11. Maheshwari, Jain, "DBMS: Complete Practical Approach", Firewall Media, New Delhi.
12. www.google.com
