
Distributed and Object Database Sheet

Lecture 1
Q.1: What are the four things that we can distribute? Give a brief note about
each one of them.
There are four things that might be distributed:
1. Processing logic: the definition of a distributed computing system implicitly assumes that the processing logic or processing elements are distributed.
2. Function: various functions of a computer system could be delegated to various pieces of hardware or software.
3. Data: data used by a number of applications may be distributed to a number of processing sites.
4. Control: the control of the execution of various tasks might be distributed instead of being performed by one computer system.
Q.2: What is a Distributed Database System?
Distributed database system (DDBS) = DDB + D-DBMS
- Distributed database (DDB): a collection of multiple, logically interrelated databases distributed over a computer network.
- Distributed database management system (D-DBMS): the software that manages the DDB and provides an access mechanism that makes the distribution transparent to the users.
Q.3: What is a parallel database System?
A parallel database system is a database system that runs on a multiprocessor system.
Q.4: What are the main advantages of DDBS?
There are four main advantages of DDBS:
1. Transparent Management of Distributed and Replicated Data.
2. Reliability Through Distributed Transactions.
3. Improved Performance.
4. Easier System Expansion.
Q5: What is the meaning of location transparency and naming transparency?
Location Transparency: refers to the fact that the command used to perform a task is
independent of both the location of the data and the system on which an operation is
carried out.
Naming Transparency: means that a unique name is provided for each object in the
database.
Lecture 2

Q6: Which areas are affected by distributed database design? Briefly
explain each one of them.
1. Distributed Directory Management: A directory contains information about data items in the database. It can be centralized at one site or distributed over several sites; there can be a single copy or multiple copies.
2. Distributed Query Processing: Query processing deals with designing algorithms
that analyze queries and convert them into a series of data manipulation operations.
3. Distributed Concurrency Control: Concurrency control involves the
synchronization of accesses to the distributed database, such that the integrity of the
database is maintained.
4. Distributed Deadlock Management: The competition among users for access to
data can result in a deadlock if the synchronization mechanism is based on locking.
5. Reliability of Distributed DBMS: It is important that mechanisms be provided to
ensure the consistency of the database as well as to detect failures and recover from
them.
6. Replication: it is necessary to implement protocols that ensure the consistency of
the replicas. These protocols force the updates to be applied to all the replicas before
the transaction completes.

Q7: What are the functions performed by the DBMS? Explain this with a drawing.
The functions performed by a DBMS can be layered as:
1. Interface Layer: manages the interface to the applications.
2. Control Layer: controls the query by adding semantic integrity and authorization predicates.
3. Query Processing (Compilation) Layer: maps the query into an optimized sequence of lower-level operations.
4. Execution Layer: directs the execution of the access plans, including transaction management and synchronization of algebra operations.
5. Data Access Layer: manages the buffers and data structures that implement the files.
6. Consistency Layer: manages concurrency control and logging for update requests.

Q8: What is the meaning of the autonomy of local systems?

Autonomy is a function of a number of factors, such as whether the individual DBMSs exchange information, whether they can independently execute transactions, and whether one is allowed to modify them.
Q9: Explain the components of a DDBMS with a drawing.

Lecture 4
Q19: What is the input and output of the bond energy algorithm?
- Input: The attribute affinity (AA) matrix.
- Output: The clustered affinity matrix (CA), which is a perturbation of AA.
Q20: How can the clustered affinity matrix (CA) be generated?
The bond energy algorithm takes the attribute affinity (AA) matrix as input and permutes its rows and columns. The clustered affinity matrix (CA) is generated in three steps:
1. Initialization: Place and fix one of the columns of AA in CA.
2. Iteration: Place each of the remaining n-i columns in one of the remaining i+1 positions in the CA matrix. For each column, choose the placement that makes the largest contribution to the global affinity measure.
3. Row order: Order the rows according to the column ordering.
The contribution of placing column Ak between columns Ai and Aj is:

cont(Ai, Ak, Aj) = 2*bond(Ai, Ak) + 2*bond(Ak, Aj) - 2*bond(Ai, Aj)

where the bond between two columns is:

bond(Ax, Ay) = sum over all attributes Az of aff(Az, Ax) * aff(Az, Ay)
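The three steps above can be sketched in Python. This is a minimal illustration, assuming the AA matrix is given as a list of lists; the function names and the greedy placement loop are my own, built around the contribution formula.

```python
# Minimal sketch of the bond energy algorithm (BEA). Assumes a square
# attribute affinity matrix AA as a list of lists; names are illustrative.

def bond(aa, x, y):
    # bond(Ax, Ay) = sum over z of aff(Az, Ax) * aff(Az, Ay).
    # An index outside the matrix stands for the conceptual empty
    # column at either end, whose bond with anything is 0.
    n = len(aa)
    if not (0 <= x < n and 0 <= y < n):
        return 0
    return sum(aa[z][x] * aa[z][y] for z in range(n))

def cont(aa, i, k, j):
    # Contribution to the global affinity measure of placing
    # column Ak between columns Ai and Aj.
    return 2 * bond(aa, i, k) + 2 * bond(aa, k, j) - 2 * bond(aa, i, j)

def bea(aa):
    n = len(aa)
    order = [0]                      # Initialization: fix one column.
    for k in range(1, n):            # Iteration: place remaining columns.
        best_pos, best = 0, None
        for pos in range(len(order) + 1):
            left = order[pos - 1] if pos > 0 else -1
            right = order[pos] if pos < len(order) else n
            c = cont(aa, left, k, right)
            if best is None or c > best:
                best_pos, best = pos, c
        order.insert(best_pos, k)    # Keep the best-contributing slot.
    # Row order: order the rows according to the column ordering.
    ca = [[aa[r][c] for c in order] for r in order]
    return order, ca
```

Running this on a 4-attribute affinity matrix groups strongly related attribute pairs next to each other in CA, which is what the clustering step of vertical fragmentation relies on.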
Lecture 5

Q21: Write a brief note about serializability theory of transactions?


- If in a complete history H, the operations of various transactions are not interleaved (i.e., the
operations of each transaction occur consecutively), the history is said to be serial.
- A history H is said to be serializable if and only if it is conflict equivalent to a serial history.
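The conflict-serializability test implied by this definition can be sketched as a precedence-graph check. The history encoding below, a list of (transaction, operation, data item) tuples, is an assumption for illustration, not notation from the lecture.

```python
# Sketch: a history is conflict serializable iff its precedence graph
# (an edge Ti -> Tj for each conflict where Ti's operation comes first)
# is acyclic. Operations are 'r' (read) or 'w' (write).

def conflicts(op1, op2):
    # Two operations conflict if they belong to different transactions,
    # access the same data item, and at least one of them is a write.
    (t1, a1, x1), (t2, a2, x2) = op1, op2
    return t1 != t2 and x1 == x2 and 'w' in (a1, a2)

def is_conflict_serializable(history):
    edges = set()
    for i in range(len(history)):
        for j in range(i + 1, len(history)):
            if conflicts(history[i], history[j]):
                edges.add((history[i][0], history[j][0]))
    nodes = {t for t, _, _ in history}
    adj = {t: [v for u, v in edges if u == t] for t in nodes}
    state = {t: 0 for t in nodes}  # 0 = unseen, 1 = on stack, 2 = done

    def has_cycle(u):  # depth-first search for a back edge
        state[u] = 1
        for v in adj[u]:
            if state[v] == 1 or (state[v] == 0 and has_cycle(v)):
                return True
        state[u] = 2
        return False

    return not any(state[t] == 0 and has_cycle(t) for t in nodes)
```

For example, the history r1(x) w2(x) w1(x) produces edges T1 -> T2 and T2 -> T1, a cycle, so it is not conflict serializable.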

Q22: Consider the following three transactions:

What is the complete history Hc for these transactions?


A complete history Hc for these transactions, and a history H (as a prefix of Hc),
is depicted as follows:

23. Consider the three transactions of question 22. Is the following history serial
or not? And why?

Answer:
The following history is serial:

Two histories, H1 and H2, defined over the same set of transactions T, are said to be equivalent if, for each pair of conflicting operations Oij and Okl (i ≠ k), whenever Oij ≺H1 Okl, then Oij ≺H2 Okl.
24. Consider the three transactions of question 22. Is the following history serializable or not? And why?

Answer:
The history H' defined over them is conflict equivalent to H as follows:

So, the history is serializable, because a history H is said to be serializable if and only if it is conflict equivalent to a serial history.
25. Consider two bank accounts, x (stored at Site 1) and y (stored at Site 2), and
the following two transactions where T1 transfers $100 from x to y, while T2
simply reads the balances of x and y:

Consider the following two histories that may be generated locally at the two
sites (Hi is the history at Site i):

Are these histories serializable? And is the global history that is obtained
serializable? And why?
- Both of these local histories are serializable.
- However, the global history obtained from them is not serializable, because although each local history is serializable on its own, their serialization orders differ: H1′ serializes T1 → T2 while H2′ serializes T2 → T1.
27. How can concurrency control algorithms be classified?
Pessimistic algorithms: synchronize the concurrent execution of transactions early in their execution life cycle.
Optimistic algorithms: delay the synchronization of transactions until their termination.
28. Write a brief note on each of the following: Locking-Based Concurrency Control Algorithms and Timestamp-Based Concurrency Control Algorithms.
1. Locking-Based Concurrency Control Algorithms:
The main idea is to ensure that a data item shared by conflicting operations is accessed by one operation at a time.
There are two types of locks (commonly called lock modes) associated with each lock unit: read lock (rl) and write lock (wl). Read locks are compatible with each other, whereas read-write and write-write locks are not.
Two lock modes are compatible if two transactions that access the same data item can obtain these locks on that data item at the same time.
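The compatibility rules for the two lock modes can be written out directly. A small sketch, assuming the rl/wl modes from the text; the table and function names are illustrative, not a real lock manager's API.

```python
# Lock-mode compatibility for read locks ('rl') and write locks ('wl').

COMPATIBLE = {
    ('rl', 'rl'): True,   # read locks are compatible with each other
    ('rl', 'wl'): False,  # read-write locks conflict
    ('wl', 'rl'): False,
    ('wl', 'wl'): False,  # write-write locks conflict
}

def can_grant(requested, held_modes):
    # A lock request is granted only if the requested mode is compatible
    # with every lock currently held on the same lock unit.
    return all(COMPATIBLE[(requested, held)] for held in held_modes)
```

So two readers may proceed concurrently, but a writer must wait until no other transaction holds any lock on the item.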
2. Timestamp-Based Concurrency Control Algorithms:
- A timestamp is a simple identifier that serves to identify each transaction uniquely
and is used for ordering.
- Uniqueness is only one of the properties of timestamp generation.
- The second property is monotonicity. Two timestamps generated by the same
transaction manager should be monotonically increasing. Thus, timestamps are values
derived from a totally ordered domain.
- The comparison between the transaction timestamps can be performed only if the
scheduler has received all the operations to be scheduled.
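Uniqueness and monotonicity can both be obtained by pairing a monotonically increasing local counter with a unique site identifier. The (counter, site_id) encoding below is one common scheme for a distributed setting, sketched here as an assumption rather than the lecture's exact construction.

```python
import itertools

# Sketch of distributed timestamp generation: tuples compare first by
# counter and then by site id, giving a total order in which no two
# sites can ever produce equal timestamps.

class TimestampGenerator:
    def __init__(self, site_id):
        self.site_id = site_id
        self.counter = itertools.count(1)  # monotonically increasing

    def next(self):
        # Each call returns a strictly larger timestamp than the last
        # one issued at this site.
        return (next(self.counter), self.site_id)
```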
29. What is the difference between Deadlock prevention and Deadlock
Avoidance?
Deadlock Prevention:
1. Deadlock prevention methods guarantee that deadlocks cannot occur in the first place.
2. The transaction manager checks a transaction when it is first initiated and does not permit it to proceed if it may cause a deadlock.
3. It is required that all of the data items that will be accessed by a transaction be predeclared.
4. The transaction manager then permits a transaction to proceed if all the data items that it will access are available.
5. Such systems are not very suitable for database environments, since access to certain data items may depend on conditions that cannot be resolved until run time.

Deadlock Avoidance:
1. Deadlock avoidance methods employ concurrency control techniques that require potential deadlock situations to be detected in advance.
2. One approach is to order the resources and insist that each process request access to them in that order.
3. The lock units in the distributed database are ordered globally or locally at each site, and transactions always request locks in that order.
4. Another alternative is to make use of transaction timestamps to prioritize transactions and resolve deadlocks by aborting transactions with higher (or lower) priorities.
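The timestamp-based alternative can be sketched with the classic wait-die rule, which the lecture does not name explicitly, so treat this as one illustrative instance: an older transaction (smaller timestamp) is allowed to wait for a younger one, while a younger requester is aborted instead of waiting, so no wait-for cycle can form.

```python
# Sketch of the wait-die deadlock avoidance rule: smaller timestamp
# means older (higher priority) transaction.

def wait_die(requester_ts, holder_ts):
    # Decide what the scheduler does when the requester's lock request
    # conflicts with a lock held by another transaction.
    if requester_ts < holder_ts:
        return 'wait'   # older transaction waits for the younger one
    return 'abort'      # younger transaction dies and restarts later
```

Since transactions only ever wait for younger transactions, the wait-for graph stays acyclic and deadlock is impossible by construction.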

30. What are the fundamental methods of detecting distributed deadlocks?


There are three fundamental methods of detecting distributed deadlocks, referred to as:
1. Centralized Deadlock Detection.
2. Hierarchical Deadlock Detection.
3. Distributed Deadlock Detection.

Lecture 8
44. What are the purposes of data replication?
The purposes of replication are multiple:
1. System availability.
2. Performance.
3. Scalability.
4. Application requirements.
45. What is the difference between strong and weak mutual consistency criteria?
1. Strong mutual consistency criteria:
Require that all copies of a data item have the same value at the end of the execution
of an update transaction.
2. Weak mutual consistency criteria:
Do not require the values of replicas of a data item to be identical when an update
transaction terminates. What is required is that, if the update activity ceases for some
time, the values eventually become identical.
46. Discuss briefly Mutual Consistency and Transaction Consistency.
Mutual Consistency vs. Transaction Consistency
Mutual consistency refers to the replicas converging to the same value, while
transaction consistency requires that the global execution history be serializable.
It is possible for a replicated DBMS to ensure that data items are mutually consistent
when a transaction commits, but the execution history may not be globally
serializable.
47. What are the Update Management Strategies used with different replication protocols?
1. Eager Update Propagation.
2. Lazy Update Propagation.
3. Centralized Techniques.
4. Distributed Techniques.
48. Write short notes about Eager Centralized Replication protocols?
- A master site controls the operations on a data item.
-These protocols are coupled with strong consistency techniques, so that updates to a
logical data item are applied to all of its replicas.
- Once the update transaction completes, all replicas have the same values for the updated data items.
49. Write a short note about Eager Distributed Replication protocols?
- Updates can originate anywhere; they are first applied to the local replica and then propagated to the other replicas.
- If an update originates at a site where a replica of the data item does not exist, it is forwarded to one of the replica sites, which coordinates its execution.
50. Write a short note about Lazy Centralized Replication protocols?
- Lazy centralized replication algorithms are similar to eager centralized replication.
- The important difference is that the propagation does not take place within the
update transaction, but after the transaction commits as a separate refresh transaction.
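The difference between eager and lazy propagation comes down to when the other replicas see the new value. A minimal sketch, assuming an in-memory dictionary of replica values per site; the class and method names are purely illustrative.

```python
# Sketch contrasting eager and lazy update propagation for one
# replicated data item.

class ReplicatedItem:
    def __init__(self, sites):
        self.replicas = {s: 0 for s in sites}
        self.pending = []  # updates awaiting a refresh transaction

    def eager_update(self, value):
        # Eager: all replicas are updated before the transaction
        # completes, so they hold the same value at commit time.
        for site in self.replicas:
            self.replicas[site] = value

    def lazy_update(self, master, value):
        # Lazy: only the master copy is updated inside the transaction;
        # other replicas keep their old value for now.
        self.replicas[master] = value
        self.pending.append(value)

    def refresh(self):
        # A separate refresh transaction, run after commit, propagates
        # pending updates to the remaining replicas.
        for value in self.pending:
            for site in self.replicas:
                self.replicas[site] = value
        self.pending.clear()
```

With the eager path the replicas are never observably divergent; with the lazy path a read at another site between commit and refresh can return a stale value.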
