
ADMT Experiments

Tushar Hirapure
18IT1036
B/B2
Experiment No -1

Implementation of Query Optimizer: Simulation of Query Optimizer with a query tool (SQL Fiddle)

Aim - Implementation of Query Optimizer: Simulation of Query Optimizer with a query tool (SQL Fiddle)

Objectives - To understand how query optimization is done and to write efficient queries.

Hardware & Software Required - PC Desktop, Query Optimizer Tool (SQL Fiddle)

Theory -
How the Query Optimizer Works

At the core of the SQL Server Database Engine are two major components: the
Storage Engine and the Query Processor, also called the Relational Engine. The
Storage Engine is responsible for reading data between the disk and memory in
a manner that optimizes concurrency while maintaining data integrity. The
Query Processor, as the name suggests, accepts all queries submitted to SQL
Server, devises a plan for their optimal execution, and then executes the plan
and delivers the required results.

Queries are submitted to SQL Server using the SQL language (or T-SQL, the
Microsoft SQL Server extension to SQL). Since SQL is a high-level declarative
language, it only defines what data to get from the database, not the steps
required to retrieve that data or any of the algorithms for processing the
request. Thus, for each query it receives, the first job of the query processor is
to devise a plan, as quickly as possible, which describes the best possible way
to execute the said query (or, at the very least, an efficient way). Its second
job is to execute the query according to that plan.

Each of these tasks is delegated to a separate component within the query processor; the Query Optimizer devises the plan and then passes it along to the Execution Engine, which will execute the plan and get the results from the database.
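SQL Fiddle lets us view the execution plan the optimizer chooses for a submitted query. As a minimal offline sketch of the same idea, assuming Python with the built-in sqlite3 module (the table and index below are illustrative, not the lab's schema):

import sqlite3

# In-memory database with a hypothetical Student table (assumption, not from the lab setup).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (sid INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute("CREATE INDEX idx_student_age ON Student(age)")

# Ask the query processor which plan it would use, without executing the query.
for row in conn.execute("EXPLAIN QUERY PLAN SELECT name FROM Student WHERE age > 12 AND age < 20"):
    print(row)   # e.g. shows whether the index on age is used for the range scan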
Implementation -

Conclusion and Discussion -


Thus we have learned about query optimization tools (SQL Fiddle).
Experiment No: 2

Translating SQL queries to Relational Algebra and Query Tree

Aim: Query Evaluation and path expressions - Translating SQL queries to Relational Algebra and Query Tree using an online tool (RelaX).
Theory:
Translating an arbitrary SQL query into a logical query plan (i.e., a relational algebra expression): consider a general SELECT-FROM-WHERE statement of the form

SELECT Select-list
FROM R1, R2 T2, ...
WHERE Where-condition

When the statement does not use subqueries in its WHERE-condition, we can translate it into relational algebra by taking the cartesian product of the relations in the FROM clause, applying a selection with the WHERE-condition, and projecting onto the Select-list:

π Select-list (σ Where-condition (R1 × R2 × ...))

Two special cases:
1. There may be no WHERE clause. In that case, it is of course unnecessary to include the selection σ in the relational algebra expression.
2. If we omit the projection π, we obtain the translation of the following special case:
SELECT *
FROM R1, R2 T2, ...
WHERE Where-condition
Query Tree:
A query tree is a tree data structure representing a relational algebra expression. The tables of the query are represented as leaf nodes, and the relational algebra operations are represented as internal nodes. An internal node is executed whenever its operand tables are available; the node is then replaced by the result table. This process continues for all internal nodes until the root node is executed and replaced by the result table.
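As a small illustrative example (the Student(sid, name, age) table here is assumed for illustration, not taken from the lab exercise), the query

SELECT name
FROM Student
WHERE age > 18

translates to π name (σ age>18 (Student)), and its query tree has Student as the leaf node, the selection σ age>18 as its parent, and the projection π name as the root.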

Implementation:-
Conclusion and Discussion:
Thus we have learned to write relational algebra expressions and query trees equivalent to SQL queries.
Experiment No. : 3

Implementation of concurrency control: Two-Phase Locking Protocol

Aim: Implementation of concurrency control - Two-Phase Locking Protocol.

Requirements: Java/Python.

Theory:
What is Concurrency Control?
Concurrency control is a database management systems (DBMS) concept used to address the conflicts that can occur when data is accessed or altered simultaneously in a multi-user system. Concurrency control, when applied to a DBMS, is meant to coordinate simultaneous transactions while preserving data integrity. In short, concurrency control governs multi-user access to the database: when more than one transaction runs at the same time, there is a chance of conflicts that can leave the database in an inconsistent state. To handle these conflicts we need concurrency control in the DBMS, which allows transactions to run simultaneously but manages them in such a way that the integrity of the data remains intact.

What is Two-Phase Locking (2PL)?
• Two-Phase Locking (2PL) is a concurrency control method which divides the execution of a transaction into two phases.
• It ensures conflict serializable schedules.
• If all of a transaction's lock operations (for both reads and writes) precede its first unlock operation, the transaction is said to follow the Two-Phase Locking Protocol. The protocol has two phases: 1. In the Growing Phase, a transaction may obtain locks but may not release any lock. 2. In the Shrinking Phase, a transaction may release locks but may not obtain any lock.
• Two-Phase Locking does not ensure freedom from deadlocks.

Variants of the Two-Phase Locking Protocol
1. Strict Two-Phase Locking Protocol
• Strict Two-Phase Locking avoids cascading rollbacks.
• This protocol not only requires two-phase locking but also that all exclusive locks be held until the transaction commits or aborts.
• It is not deadlock free.
• It ensures that if data is being modified by one transaction, another transaction cannot read it until the first transaction commits.
2. Rigorous Two-Phase Locking Protocol
• Rigorous Two-Phase Locking avoids cascading rollbacks.
• This protocol requires that all shared and exclusive locks be held until the transaction commits.
• Most database systems implement the rigorous two-phase locking protocol.
3. Conservative Two-Phase Locking Protocol
• Conservative Two-Phase Locking is also called Static Two-Phase Locking.
• This protocol is almost free from deadlocks, as all required data items are declared in advance.
• It requires all data items to be locked before the transaction starts.
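As a minimal sketch of how the implementation could be approached in Python (the function name, the schedule format, and the lock/unlock operation names are assumptions, not part of the lab specification), the following checks whether each transaction acquires all of its locks before its first unlock:

# Minimal sketch (assumed format): a schedule is a list of (transaction, action, item)
# tuples, where action is "lock" or "unlock".
def follows_two_phase_locking(schedule):
    unlocked = set()  # transactions that have entered their shrinking phase
    for txn, action, item in schedule:
        if action == "lock":
            if txn in unlocked:
                # Lock requested after an unlock: the growing phase was already over.
                return False
        elif action == "unlock":
            unlocked.add(txn)
    return True

# Example: T1 locks A and B, then unlocks both -> follows 2PL.
s1 = [("T1", "lock", "A"), ("T1", "lock", "B"),
      ("T1", "unlock", "A"), ("T1", "unlock", "B")]
# Example: T2 unlocks A and only then locks B -> violates 2PL.
s2 = [("T2", "lock", "A"), ("T2", "unlock", "A"), ("T2", "lock", "B")]
print(follows_two_phase_locking(s1))  # True
print(follows_two_phase_locking(s2))  # False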

Implementation:
Conclusion: Thus, we have learnt the Two-Phase Locking protocol in concurrency control and its implementation.

Experiment No. : 4

Implementation of Timestamp based Protocol


Aim: Implementation of Timestamp based Protocol - Program to check whether an input schedule follows timestamp ordering.

Requirements: Java/Python.

Theory: Timestamp based Protocol
1) Assumptions
• Every timestamp value is unique and accurately represents an instant
in time.
• No two timestamps can be the same.
• A higher-valued timestamp occurs later in time than a lower-valued timestamp.
Whenever a transaction begins, it receives a timestamp.
This timestamp indicates the order in which the transaction must occur,
relative to the other transactions. So, given two transactions that affect
the same object, the operation of the transaction with the earlier
timestamp must execute before the operation of the transaction with
the later timestamp. However, if the operation of the wrong
transaction is actually presented first, then it is aborted and the
transaction must be restarted. Every object in the database has a read
timestamp, which is updated whenever the object's data is read, and a
write timestamp, which is updated whenever the object's data is
changed. If a transaction wants to read an object,
• but the transaction started before the object's write timestamp it
means that something changed the object's data after the transaction
started. In this case, the transaction is canceled and must be restarted.
• and the transaction started after the object's write timestamp, it
means that it is safe to read the object. In this case, if the transaction
timestamp is after the object's read timestamp, the read timestamp is
set to the transaction timestamp.
• With each transaction Ti in the system, we associate a unique fixed
timestamp, denoted by TS(Ti). This timestamp is assigned by the
database system before the transaction Ti starts execution. If a
transaction Ti has been assigned timestamp TS(Ti), and a new
transaction Tj enters the system, then TS(Ti) < TS(Tj ). There are two
simple methods for implementing this scheme:
• 1. Use the value of the system clock as the timestamp; that is, a
transaction’s timestamp is equal to the value of the clock when the
transaction enters the system.
• 2. Use a logical counter that is incremented after a new timestamp
has been assigned; that is, a transaction’s timestamp is equal to the
value of the counter when the transaction enters the system.
• The timestamps of the transactions determine the serializability
order. Thus, if TS(Ti) < TS(Tj ), then the system must ensure that the
produced schedule is equivalent to a serial schedule in which
transaction Ti appears before transaction Tj .
• To implement this scheme, we associate with each data item Q two timestamp values:
• W-timestamp(Q) denotes the largest timestamp of any transaction that executed write(Q) successfully.
• R-timestamp(Q) denotes the largest timestamp of any transaction that executed read(Q) successfully.
• These timestamps are updated whenever a new read(Q) or write(Q)
instruction is executed.

Algorithm:
1. Read the two transactions.
2. Assign a timestamp value to each transaction (logical counter).
3. Associate a read timestamp and a write timestamp with each data item.
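As a minimal sketch in Python of the basic timestamp-ordering check described above (the schedule format and the function name are assumptions, not part of the lab specification):

# Minimal sketch (assumed format): a schedule is a list of (transaction, op, item)
# tuples, op is "read" or "write"; timestamps are assigned by order of first appearance.
def timestamp_ordering(schedule):
    ts = {}                 # transaction -> timestamp (logical counter)
    r_ts, w_ts = {}, {}     # data item -> largest read / write timestamp so far
    for txn, op, item in schedule:
        if txn not in ts:
            ts[txn] = len(ts) + 1
        if op == "read":
            if ts[txn] < w_ts.get(item, 0):        # item already written by a younger transaction
                return f"{txn} rolled back on read({item})"
            r_ts[item] = max(r_ts.get(item, 0), ts[txn])
        elif op == "write":
            if ts[txn] < r_ts.get(item, 0) or ts[txn] < w_ts.get(item, 0):
                return f"{txn} rolled back on write({item})"
            w_ts[item] = ts[txn]
    return "schedule is allowed under timestamp ordering"

schedule = [("T1", "read", "Q"), ("T2", "write", "Q"), ("T1", "write", "Q")]
print(timestamp_ordering(schedule))   # T1 is rolled back: TS(T1) < W-timestamp(Q)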

Implementation:
Conclusion: Thus, we have learnt the timestamp based protocol.
Experiment No. : 5

Implementation of Log based recovery mechanism


Aim: Implementation of Log based Recovery mechanism.

Requirements: Java/Python programming.

Theory: LOG-BASED RECOVERY
o The log is a sequence of records. The log of each transaction is maintained in some stable storage so that, if any failure occurs, the database can be recovered from it.
o If any operation is performed on the database, it is recorded in the log. The log record must be stored before the actual change is applied to the database.
Let's assume there is a transaction to modify the City of a student. The following logs are written for this transaction:
1. When the transaction is initiated, it writes a 'start' log record.
2. When the transaction modifies the City from 'Noida' to 'Bangalore', another log record is written to the file.
3. When the transaction is finished, it writes another log record to indicate the end of the transaction.
There are two approaches to modifying the database:
1. Deferred database modification: the deferred modification technique occurs if the transaction does not modify the database until it has committed.
2. Immediate database modification: the immediate modification technique occurs if database modification occurs while the transaction is still active. In this technique, the database is modified immediately after every operation; it follows actual database modification.
Log and log records - The log is a sequence of log records, recording all the update activities in the database, and is kept in stable storage. Prior to performing any modification to the database, an update log record is created to reflect that modification. An update log record, written as <Ti, Xj, V1, V2>, has these fields:
1. Transaction identifier (Ti): unique identifier of the transaction that performed the write operation.
2. Data item (Xj): unique identifier of the data item written.
3. Old value (V1): value of the data item prior to the write.
4. New value (V2): value of the data item after the write operation.
Recovery using log records - When the system has crashed, the system consults the log to find which transactions need to be undone and which need to be redone:
1. If the log contains both the record <Ti, start> and the record <Ti, commit>, then transaction Ti needs to be redone.
2. If the log contains the record <Ti, start> but contains neither <Ti, commit> nor <Ti, abort>, then transaction Ti needs to be undone.
Implementation steps:
1. Create a manual log file (text file).
2. The program should take the log file as input, with a menu for a normal run and for recovery.
3. Scan the log file and perform undo for uncommitted transactions and redo for committed transactions.
Sample log file contents:
T0 start
T0,A,950
T0,B,2050
T0 commit
T1 start
T1,C,600
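A minimal sketch in Python of step 3 above, assuming the sample log format shown (transaction,item,value lines plus 'start'/'commit' markers); the function name and the in-memory list are illustrative assumptions:

# Minimal sketch: scan a log in the sample format above and decide redo/undo.
def analyze_log(lines):
    started, committed = [], set()
    for line in lines:
        line = line.strip()
        if line.endswith("start"):
            started.append(line.split()[0])
        elif line.endswith("commit"):
            committed.add(line.split()[0])
    redo = [t for t in started if t in committed]        # committed transactions
    undo = [t for t in started if t not in committed]    # uncommitted transactions
    return redo, undo

# Example using the sample log file contents given above.
sample = ["T0 start", "T0,A,950", "T0,B,2050", "T0 commit", "T1 start", "T1,C,600"]
redo, undo = analyze_log(sample)
print("Redo:", redo)   # ['T0']
print("Undo:", undo)   # ['T1']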

Implementation:

Conclusion: Thus, we have learnt different kinds of failures and the log
based recovery mechanism.
Experiment No. : 6

Case Study - Distributed database for a real-life application and simulation of recovery methods

Aim: Case Study - distributed database for a real-life application and simulation of recovery methods.

Software Required: Desktop PC, 4 GB RAM, Oracle 9i, MS SQL Server 2000, Client/server architecture, MySQL.

Theory: A distributed database is a database that is under the control


of a central database management system (DBMS) in which storage
devices are not all attached to a common CPU. It may be stored in
multiple computers located in the same physical location, or may be
dispersed over a network of interconnected computers. Distributed
database systems operate in computer networking environments
where component failures are inevitable during normal operation.
Failures not only threaten normal operation of the system, but they
may also destroy the consistency of the system by direct damage to the
storage subsystem. To cope with these failures, distributed database systems must provide recovery mechanisms which maintain system consistency. The types of failures that may occur in distributed database systems, and the appropriate recovery actions, are discussed below.

C. Recovery from Power Failure
A power failure causes loss of information in the non-persistent memory. When power is restored, the operating system and the database management system restart, and the recovery manager initiates recovery from the transaction logs. In case of immediate update mode, the recovery manager takes the following actions –
• Transactions which are in active list and failed list are undone and
written on the abort list.
• Transactions which are in before-commit list are redone.
• No action is taken for transactions in commit or abort lists. In case of
deferred update mode, the recovery manager takes the following
actions –
• Transactions which are in the active list and failed list are written onto
the abort list. No undo operations are required since the changes have
not been written to the disk yet.
• Transactions which are in before-commit list are redone.
• No action is taken for transactions in commit or abort lists.

D. Recovery from Disk Failure
A disk failure or hard crash causes a total database loss. To recover from this hard crash, a new disk is prepared, then the operating system is restored, and finally the database is recovered using the database backup and transaction log. The recovery method is the same for both immediate and deferred update modes. The recovery manager takes the following actions –
• The transactions in the commit list and before-commit list are redone
and written onto the commit list in the transaction log.
• The transactions in the active list and failed list are undone and
written onto the abort list in the transaction log.

E. Checkpointing
A checkpoint is a point of time at which a record is written onto the
database from the buffers. As a consequence, in case of a system crash,
the recovery manager does not have to redo the transactions that have
been committed before checkpoint. Periodical checkpointing shortens
the recovery process. The two types of checkpointing techniques are –
• Consistent checkpointing
• Fuzzy checkpointing
Consistent Checkpointing
Consistent checkpointing creates a
consistent image of the database at checkpoint. During recovery,
only those transactions which are on the right side of the last
checkpoint are undone or redone. The transactions to the left side of
the last consistent checkpoint are already committed and needn’t be
processed again. The actions taken for checkpointing are –
1. The active transactions are suspended temporarily.
2. All changes in main-memory buffers are written onto the disk.
3. A “checkpoint” record is written in the transaction log.
4. The transaction log is written to the disk.
5. The suspended transactions are resumed.
If, in step 4, the transaction log is archived as well, then this checkpointing aids in recovery from disk failures and power failures; otherwise it aids recovery from power failures only.

Fuzzy Checkpointing
In fuzzy
checkpointing, at the time of checkpoint, all the active transactions
are written in the log. In case of power failure, the recovery manager
processes only those transactions that were active during checkpoint
and later. The transactions that have been committed before
checkpoint are written to the disk and hence need not be redone.

Example of Checkpointing
Let us consider a system where the time of checkpointing is t_check and the time of system crash is t_fail. Let there be four transactions Ta, Tb, Tc and Td such that –
• Ta commits before checkpoint.
• Tb starts before checkpoint and commits before system crash.
• Tc starts after checkpoint and commits before system crash.
• Td starts after checkpoint and was active at the time of system crash.
In this case, Ta needs no recovery action, Tb and Tc are redone, and Td is undone.

F. Distributed One-phase Commit
Distributed one-phase commit is the simplest commit protocol. Let us consider that there is a controlling site and a number of slave sites where the transaction is being executed. The steps in distributed commit are –
• After each slave has locally completed its transaction, it sends a
“DONE” message to the controlling site.
• The slaves wait for “Commit” or “Abort” message from the controlling
site. This waiting time is called window of vulnerability.
• When the controlling site receives “DONE” message from each slave,
it makes a decision to commit or abort. This is called the commit point.
Then, it sends this message to all the slaves.
• On receiving this message, a slave either commits or aborts and then
sends an acknowledgement message to the controlling site.

G. Distributed Two-phase Commit
Distributed two-phase commit reduces the vulnerability of one-phase commit protocols. The steps performed in the two phases are as follows −

Phase 1: Prepare Phase
• After each slave has locally completed its transaction, it sends a
“DONE” message to the controlling site. When the controlling site has
received “DONE” message from all slaves, it sends a “Prepare” message
to the slaves.
• The slaves vote on whether they still want to commit or not. If a slave
wants to commit, it sends a “Ready” message.
• A slave that does not want to commit sends a “Not Ready” message.
This may happen when the slave has conflicting concurrent transactions
or there is a timeout.

Phase 2: Commit/Abort Phase
• After the controlling site has received “Ready” message from all the slaves −
o The controlling site sends a “Global Commit” message to the slaves.
o The slaves apply the transaction and send a “Commit ACK” message to the controlling site.
o When the controlling site receives “Commit ACK” message from all the slaves, it considers the transaction as committed.
• After the controlling site has received the first “Not Ready” message from any slave −
o The controlling site sends a “Global Abort” message to the slaves.
o The slaves abort the transaction and send an “Abort ACK” message to the controlling site.
o When the controlling site receives the “Abort ACK” message from all the slaves, it considers the transaction as
aborted.

H. Distributed Three-phase Commit
The steps in distributed three-phase commit are as follows −
Phase 1: Prepare Phase - The steps are the same as in distributed two-phase commit.
Phase 2: Prepare to Commit Phase
• The controlling site issues an “Enter Prepared State” broadcast message.
• The slave sites vote “OK” in response.
Phase 3: Commit/Abort Phase - The steps are the same as in two-phase commit, except that the “Commit ACK”/“Abort ACK” message is not required.
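As a minimal, single-process sketch in Python of the two-phase commit decision logic described above (the function name and the dictionary of votes are illustrative assumptions, not a real networked implementation):

# Minimal sketch: the controlling site collects votes from slave sites and
# decides Global Commit only if every slave answered "Ready" (two-phase commit).
def two_phase_commit(votes):
    # Phase 1: Prepare - the controlling site has received "DONE" from all slaves
    # and sent "Prepare"; 'votes' maps slave name -> "Ready" or "Not Ready".
    if all(v == "Ready" for v in votes.values()):
        decision = "Global Commit"
    else:
        decision = "Global Abort"
    # Phase 2: Commit/Abort - broadcast the decision; each slave acknowledges.
    acks = {slave: decision.split()[1] + " ACK" for slave in votes}
    return decision, acks

print(two_phase_commit({"S1": "Ready", "S2": "Ready"}))
# ('Global Commit', {'S1': 'Commit ACK', 'S2': 'Commit ACK'})
print(two_phase_commit({"S1": "Ready", "S2": "Not Ready"}))
# ('Global Abort', {'S1': 'Abort ACK', 'S2': 'Abort ACK'})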
Procedure/Program:
Steps of Distributed Database Design
Top-down approach: first the general concepts and the global framework are defined, and then the details.
Bottom-up approach: first the detail modules are defined, and then the global framework.
If the system is built from scratch, the top-down method is more common. If the system must match existing systems, or some modules are already ready, the bottom-up method is usually used.
General design steps according to the structure:
• analysis of the external, application requirements
• design of the global schema
• design of the fragmentation
• design of the distribution schema
• design of the local schemes
• design of the local physical layers
DDBMS-specific design steps:
• design of the fragmentation
• design of the distribution schema

Conclusion: A real-life distributed database scenario is studied and its requirements are documented. A simulation tool is used to demonstrate the recovery mechanism in a distributed environment.
Experiment No. : 7

Advanced database models: case study based on assignments for temporal, mobile or spatial databases

Aim: Advanced Database Models - Case study for Temporal, Mobile or Spatial databases

A Case Study on Spatio-Temporal Data Mining of Urban Social Management Events Based on Ontology Semantic Analysis (students may refer to a different case study)

Theory: The massive urban social management data with geographical


coordinates from the inspectors, volunteers, and citizens of the city are
a new source of spatio-temporal data, which can be used for the data
mining of city management and the evolution of hot events to improve
urban comprehensive governance. First, an ontology model for urban social management events (USMEs) is presented to accurately extract effective social management events from unstructured USMEs. Second, an exploratory spatial data analysis method based on “event-event” and “event-place” relations, from spatial and temporal aspects, is presented to mine the information in USMEs for urban social comprehensive governance. The data mining results are
visualized as a thermal chart and a scatter diagram for the optimization
of the management resources configuration, which can improve the
efficiency of municipal service management and municipal departments
for decision-making.

1. Materials and Methods
The concept system of
social comprehensive governance is huge and complex, and there are
various kinds of events. The extraction of interesting hot events and the mining of their spatio-temporal information, which is only one of many entry points in this field, is important. It has a broad research space in
the information mining of the social comprehensive management
events based on space-time management, whether in content or
method. The smart city platform adds a geographic coordinate tag for a
variety of events and log data generated from the city management
process, but these data records are from inspectors, volunteers in the
city management, and even citizens; the events are described as
unstructured natural language. This case study proposes a spatio-
temporal data mining approach based on the urban social management
events to extract unstructured natural language information, to find the
event spatio-temporal distribution pattern, and to provide visualized
decision support for the social management and comprehensive control
of the city. The technical framework of the proposed approach is shown
in Figure 1.
The purpose of urban management and comprehensive administration
is to maintain a good environment for social development. During the
process of urban management, there are a large number of work
record data. Thus, how to make good use of these work records to extract the useful information hidden in this historical data is very important for the decision-making of further urban social governance.
The content of city management is huge with a complicated structure
for urban governance. This study puts forward a concept system of
urban social management events. An ontology model is proposed for
the massive spatio-temporal data mining of social management and
comprehensive control events. It designs the process of the
construction of the ontology, builds the ontology using the existing
tools, and realizes the extraction of the hot events in city management
based on the semantic reasoning of ontology with Java-based
frameworks, whose comprehensiveness and accuracy are higher than
that of the old ones. This paper also introduces the spatio-temporal
information mining for discrete USMEs from three perspectives:
geographical statics, spatial aggregation and correlation relationship. A
spatial-temporal correlation data mining between events and locations
or 40 between events and events is proposed to mine the spatial-
temporal information from the discrete and massive city’s
comprehensive management events.

Conclusion: Thus the case study for spatial and temporal data has been
performed.
Experiment No. : 8
Construction of Star Schema and Snowflake Schema for
company database
Implementation:
Experiment No. : 9
OLAP Exercise: a) Construction of cubes b) OLAP operations, OLAP queries
Implementation:
ADMT Assignment 1
Tushar Hirapure
18IT1036
B/B2
Q1 (a) - Given the following SQL query:

Student(sid, name, age, address)
Book(bid, title, author)
Checkout(sid, bid, date)

SELECT S.name
FROM Student S, Book B, Checkout C
WHERE S.sid = C.sid
AND B.bid = C.bid
AND B.author = ’Olden Fames’
AND S.age > 12
AND S.age < 20 ;
And assuming:
• There are 10,000 Student records stored on 1,000 pages.
• There are 50,000 Book records stored on 5,000 pages.
• There are 300,000 Checkout records stored on 15,000 pages.
• There are 500 different authors.
• Student ages range from 7 to 24.
Draw a query tree for the above mentioned query.
Answer -
Q1 (b) - Consider the schema R(a,b), S(b,c), T(b,d), U(b,e).
For the following SQL query, write two equivalent logical plans in relational algebra such
that one is likely to be more efficient than the other. Indicate which one is likely
more efficient. Reason.
SELECT R.a
FROM R, S
WHERE R.b = S.b
AND S.c = 3

Answer -

Two logical plans for the above-mentioned query are:

1. π a (σ c=3 (R ⋈ b=b S))
2. π a (R ⋈ b=b σ c=3 (S))

The second plan is likely to be more efficient because, with the selection applied before the join, fewer tuples need to be joined.
Q2 (a) - Describe ACID properties of Transaction.

Answer - ACID properties ensure the integrity of the data.


1) Atomicity - Once a transaction is initiated, it must be completed. A transaction is an atomic unit of processing; it is either performed in its entirety or not performed at all. It is the responsibility of the transaction recovery subsystem of a DBMS to ensure atomicity. If a transaction fails to complete for some reason, such as a system crash in the midst of transaction execution, the recovery technique must undo any effects of the transaction on the database.
2) Consistency - The database should remain consistent. A transaction preserves consistency if its complete execution takes the database from one consistent state to another.
EXAMPLE:
Before the transaction, account A contains $1000 and account B contains $2000.

        Before   After
A       1000     950
B       2000     2050
Total   3000     3000

Since the total is unchanged, consistency is preserved.

3) Isolation:- A transaction should appear as though it is being executed in isolation from any other transactions. That is, the execution of a transaction should not be interfered with by any other transactions executing concurrently.

EXAMPLE -
1st transaction: transfer $50 from account A to account B.
2nd transaction: transfer $100 from account B to account C.
Here, A plays no role in the 2nd transaction, whereas data item B is shared by both transactions.

4) Durability:- The changes applied by the transaction must persist in the database. These changes must not be lost because of any failure. Once the transaction has completed successfully, its effect on the database is permanent.
Q2 (b) - Determine whether the following schedule is serializable or
not? Justify.

T1 T2

read(A) -
write(A) -
- read(A)

- write(A)
read(B) -

write(B) -
- read(B)
- write(B)

Answer -

Step 1:- write(A) of T1 and read(A) of T2 conflict, hence no swapping takes place and the schedule remains as it is.

Step 2:- write(A) of T2 and read(B) of T1 do not conflict. read(B) is swapped up and write(A) is swapped down.
T1 T2

read(A) -

write(A) -
- read(A)
read(B) -

- write(A)
write(B) -

- read(B)
- write(B)
Step 3 - read(B) read(A) won't cause any conflict hence read(B) will be swapped up
and read(A) will be swapped down.

T1 T2

read(A) -

write(A) -
read(B) -
- read(A)

- write(A)
write(B) -
- read(B)
- write(B)

Step 4 - write(B) write(A) won't cause any conflict and hence their positions will be
swapped.

T1 T2

read(A) -
write(A) -

read(B) -
- read(A)

write(B) -
- write(A)
- read(B)

- write(B)
Step 5 - write(B) and read(A) won't cause any conflict and hence their positions are
swapped.

T1 T2

read(A) -

write(A) -
read(B) -
write(B) -

- read(A)
- write(A)

- read(B)
- write(B)

Hence the schedule is conflict serializable; it is equivalent to the serial schedule T1 followed by T2.
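An equivalent way to verify this (a sketch, not the swapping method used above) is to build a precedence graph and check it for cycles; the schedule format below is an illustrative assumption:

from itertools import combinations

# Sketch: conflict-serializability test via precedence graph (acyclic => serializable).
def is_conflict_serializable(schedule):
    # schedule: ordered list of (txn, op, item), op in {"read", "write"}
    edges = set()
    for (t1, op1, x1), (t2, op2, x2) in combinations(schedule, 2):
        if t1 != t2 and x1 == x2 and "write" in (op1, op2):
            edges.add((t1, t2))          # t1's conflicting operation precedes t2's
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
    # Detect a cycle with depth-first search over the edge set.
    def has_cycle(node, visiting, done):
        visiting.add(node)
        for nxt in graph.get(node, ()):
            if nxt in visiting or (nxt not in done and has_cycle(nxt, visiting, done)):
                return True
        visiting.discard(node)
        done.add(node)
        return False
    done = set()
    return not any(has_cycle(n, set(), done) for n in list(graph) if n not in done)

schedule = [("T1", "read", "A"), ("T1", "write", "A"), ("T2", "read", "A"),
            ("T2", "write", "A"), ("T1", "read", "B"), ("T1", "write", "B"),
            ("T2", "read", "B"), ("T2", "write", "B")]
print(is_conflict_serializable(schedule))   # True: only edge T1 -> T2, no cycle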


Q3 (a) - Explain Timestamp based protocol with example.
Answer - Transaction Timestamp is a unique identifier assigned to each transaction.
The timestamps are typically based on the order in which transactions are started;
hence, if transaction T1 starts before transaction T2, then TS(T1) < TS(T2). Notice that
the older transaction has the smaller timestamp value. The two schemes that prevent
deadlock are called wait-die and wound-wait.

● Wait-die: If TS(Ti) < TS(Tj) (Ti older than Tj), then Ti is allowed to wait; otherwise (Ti younger than Tj) abort Ti (Ti dies) and restart it later with the same timestamp.
● Wound-wait: If TS(Ti) < TS(Tj) (Ti older than Tj), then abort Tj (Ti wounds Tj) and restart it later with the same timestamp; otherwise (Ti younger than Tj) Ti is allowed to wait.
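As a small illustrative sketch (the function and parameter names are assumptions), the decision each scheme makes when transaction Ti requests an item currently held by Tj can be written as:

# Sketch: decision when Ti requests a lock held by Tj, given their timestamps.
def wait_die(ts_i, ts_j):
    return "Ti waits" if ts_i < ts_j else "Ti aborts (dies)"

def wound_wait(ts_i, ts_j):
    return "Tj aborted (wounded)" if ts_i < ts_j else "Ti waits"

print(wait_die(3, 5))     # Ti is older -> Ti waits
print(wound_wait(3, 5))   # Ti is older -> Tj is wounded (aborted)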

The protocol manages concurrent execution such that the timestamps determine the serializability order.

E.g.: TS(Ti) = 3 (executed first), TS(Tj) = 5 (executed second)

1. W-timestamp(Q) is the largest timestamp of any transaction that executed write(Q) successfully.
2. R-timestamp(Q) is the largest timestamp of any transaction that executed read(Q) successfully.

Example -

T1 T2

read(Q) -
write(Q) -

- read(Q)
- write(Q)

Let TS(T1) = 1 and TS(T2) = 2.

Hence, after the schedule executes, W-timestamp(Q) = 2 and
R-timestamp(Q) = 2.
Q3 (b) - Describe ARIES algorithm with suitable example.
Answer - The ARIES recovery procedure consists of three main steps:
1. Analysis:- The analysis step identifies the dirty (updated) pages in the buffer
and the set of transactions active at the time of the crash.
2. Redo:- The REDO phase reapplies the updates from the log to the database. The appropriate point in the log where the REDO operation should start is determined during analysis; from that point, REDO operations are applied until the end of the log is reached. Thus, only the necessary REDO operations are applied during recovery.
3. Undo:- Finally, during the UNDO phase, the log is scanned backwards and the
operations of transactions that were active at the time of the crash are undone in
reverse order.

The information needed for ARIES to accomplish its recovery procedure includes the log, the transaction table, and the dirty page table.

Consider the following recovery example:

There are three transactions, T1, T2 and T3. T1 updates page C, T2 updates page B, and T3 updates page A.

(a) Partial contents of the log

LSN  LAST_LSN  TRAN_ID  TYPE              PAGE_ID  OTHER INFO
1    0         T1       update            C        ----
2    0         T2       update            B        ----
3    1         T1       commit            ----     ----
4    ----      ----     begin checkpoint  ----     ----
5    ----      ----     end checkpoint    ----     ----
6    0         T3       update            A        ----
7    2         T2       update            C        ----
8    7         T2       commit            ----     ----


(b) Transaction and Dirty Page Tables at the time of checkpoint

Transaction Table:
TRAN_ID  LAST_LSN  STATUS
T1       3         commit
T2       2         in progress

Dirty Page Table:
PAGE_ID  LSN
C        1
B        2

(c) Transaction and Dirty Page Tables after the analysis phase

Transaction Table:
TRAN_ID  LAST_LSN  STATUS
T1       3         commit
T2       8         commit
T3       6         in progress

Dirty Page Table:
PAGE_ID  LSN
C        1
B        2
A        6
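As a minimal sketch in Python of the analysis phase over the log in (a) (the record format, the hardcoded checkpoint LSN, and the dictionary-based tables are simplifying assumptions):

# Sketch of the ARIES analysis phase: scan the log forward from the begin-checkpoint
# record and rebuild the transaction table and dirty page table (simplified format).
log = [
    (1, "T1", "update", "C"),
    (2, "T2", "update", "B"),
    (3, "T1", "commit", None),
    (4, None, "begin_checkpoint", None),
    (5, None, "end_checkpoint", None),
    (6, "T3", "update", "A"),
    (7, "T2", "update", "C"),
    (8, "T2", "commit", None),
]
# Tables as stored with the checkpoint (from part (b) of the example).
transaction_table = {"T1": {"last_lsn": 3, "status": "commit"},
                     "T2": {"last_lsn": 2, "status": "in progress"}}
dirty_page_table = {"C": 1, "B": 2}

for lsn, txn, rec_type, page in log:
    if lsn <= 4 or txn is None:   # begin_checkpoint is at LSN 4 in this example
        continue
    if rec_type == "update":
        transaction_table.setdefault(txn, {})["last_lsn"] = lsn
        transaction_table[txn].setdefault("status", "in progress")
        dirty_page_table.setdefault(page, lsn)   # keep the first LSN that dirtied the page
    elif rec_type == "commit":
        transaction_table.setdefault(txn, {})["last_lsn"] = lsn
        transaction_table[txn]["status"] = "commit"

print(transaction_table)  # matches table (c): T1 commit@3, T2 commit@8, T3 in progress@6
print(dirty_page_table)   # matches table (c): C->1, B->2, A->6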
ADMT Assignment - 2

Tushar Hirapure
18IT1036
B/B2
