1 ) Explain the main functions of a database administrator?

Ans :- In a database management system (DBMS), the main function of a database administrator (DBA) is to manage and maintain the database system. This includes:
• Database Design: The DBA is responsible for designing the database schema and determining the relationships between tables, as well as defining the data types and constraints that should be used.
• Database Installation and Configuration: The DBA installs the DBMS software, sets up the database environment, and configures the system settings to optimize performance and security.
• Data Security and Backup: The DBA implements security measures to protect the database from unauthorized access or malicious attacks. They also set up backup and recovery procedures to ensure that data can be restored in case of a system failure or data loss.
• Performance Monitoring and Tuning: The DBA monitors the database system to ensure that it is running smoothly and efficiently. They may use tools to identify and troubleshoot performance issues and optimize database performance.
• User Management: The DBA manages user accounts, privileges, and permissions, and ensures that users have appropriate access to the data they need (a small SQL sketch follows this answer).
• Data Integration and Migration: The DBA may be responsible for integrating data from different sources into the database system and ensuring that data is accurately migrated from one system to another.
Overall, the main function of a DBA in a DBMS is to ensure that the database system is reliable, secure, and performing optimally to meet the needs of the organization.
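As a minimal sketch of the user-management duty (MySQL-style syntax; the user name, password, and school.student table are assumed for illustration):

CREATE USER 'report_user'@'localhost' IDENTIFIED BY 'secret';   -- create the account (name and password are illustrative)
GRANT SELECT ON school.student TO 'report_user'@'localhost';    -- allow read-only access to one table
REVOKE SELECT ON school.student FROM 'report_user'@'localhost'; -- withdraw the privilege when no longer needed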

2 ) Describe the architecture of a database system and explain each component in the system?

Ans :- A database architecture is a representation of the DBMS design. It helps to design, develop, implement, and maintain the database management system. A DBMS architecture allows dividing the database system into individual components that can be independently modified, changed, replaced, and altered, and it helps in understanding the components of a database. A database stores critical information and helps access data quickly and securely, so selecting the correct architecture makes data management easy and efficient. The architecture can be designed as centralized, decentralized, or hierarchical. The main components are:
• Client Application: The client application is the front-end interface that allows users to interact with the database system. This interface can be a graphical user interface (GUI) or a command-line interface (CLI). The client application sends requests to the database system and receives the results.
• Database Management System (DBMS): The DBMS is the software that manages the storage and retrieval of data in the database. It provides an interface between the client application and the database, allowing users to perform operations on the data such as insertion, deletion, and retrieval.
• Database: The database is the collection of data that is managed by the DBMS. It can be thought of as a container that holds data in a structured format. The data is organized into tables, and each table consists of rows and columns.
• Query Processor: The query processor is responsible for processing queries that are submitted to the database system. It analyzes the query to determine which data is needed and how to retrieve it from the database. The query processor also optimizes the query to ensure that it is executed as efficiently as possible.
• Storage Manager: The storage manager is responsible for managing the physical storage of data in the database. It interacts with the file system to read and write data to and from disk. The storage manager also ensures that data is stored in a consistent and reliable manner.
• Concurrency Control: Concurrency control ensures that multiple users can access the database simultaneously without interfering with each other's operations. It manages access to shared resources and ensures that data remains consistent even when multiple users are making changes to the database.
• Backup and Recovery Manager: The backup and recovery manager is responsible for creating backups of the database and restoring the database to a previous state if necessary.
3 ) Define trigger. Explain the need for a trigger with an example?

Ans :- In a database management system (DBMS), a trigger is a set of instructions or a piece of code that automatically executes in response to a specific event or condition that occurs within the
database.Triggers are used to enforce business rules and ensure data consistency by automatically
performing actions such as updating or inserting data, validating data, or performing calculations.
They can also be used to audit changes made to the database, send notifications, or enforce security
constraints. Because a trigger resides in the database and anyone who has the required privilege can
use it, a trigger lets you write a set of SQL statements that multiple applications can use. It lets you
avoid redundant code when multiple programs need to perform the same database operation.
Example for a trigger (MySQL syntax; the original expression was truncated, so the divisor 6 is assumed for illustration):

CREATE TRIGGER sample_trigger
BEFORE INSERT ON student
FOR EACH ROW
SET NEW.total = NEW.marks / 6;  -- derives the total column from marks before each insert (divisor illustrative)

4 ) Define the term Normalization. Why is it necessary to decompose a relation into several relations? With an example, state the anomalies that are removed by decomposition.

Ans :- Normalization is the process of organizing a database in such a way that data is stored in the most efficient and logical manner possible. The goal of normalization is to eliminate redundant data and reduce the likelihood of data anomalies, such as inconsistencies or errors in the data.
When a relation (or table) in a database contains redundant data, it can lead to data anomalies such as update, insertion, and deletion anomalies. These anomalies can cause data inconsistencies and errors, and can make it difficult to maintain the database. To address these issues, the relation needs to be decomposed into several relations that are in a more normalized form. This is done by identifying functional dependencies between attributes and breaking down the relation into smaller, more atomic relations (a worked example follows the notes below).
Relational Decomposition
• When a relation in the relational model is not in appropriate normal form then the
decomposition of a relation is required.
• In a database, it breaks the table into multiple tables.
• If the relation has no proper decomposition, then it may lead to problems like loss of
information.
• Decomposition is used to eliminate some of the problems of bad design like anomalies,
inconsistencies, and redundancy.

Lossless Decomposition
• If the information is not lost from the relation that is decomposed, then the decomposition
will be lossless.
• The lossless decomposition guarantees that the join of relations will result in the same
relation as it was decomposed.
• The relation is said to be lossless decomposition if natural joins of all the decomposition
give the original relation.
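Example (a sketch; the relation and column names are assumed): consider a relation STUDENT(roll, name, dept, dept_head) in which the department head is repeated for every student of that department. This redundancy causes an update anomaly (changing a department's head requires updating many rows), an insertion anomaly (a new department with no students cannot be recorded), and a deletion anomaly (deleting the last student of a department loses the head's name). Decomposing on the functional dependency dept → dept_head removes all three:

-- Original, unnormalized relation: STUDENT(roll, name, dept, dept_head)
-- Lossless decomposition into two relations:
CREATE TABLE department (
  dept      VARCHAR(30) PRIMARY KEY,
  dept_head VARCHAR(50)              -- stored exactly once per department
);
CREATE TABLE student (
  roll INT PRIMARY KEY,
  name VARCHAR(50),
  dept VARCHAR(30),
  FOREIGN KEY (dept) REFERENCES department(dept)
);
-- The natural join of student and department reproduces the original
-- relation, so the decomposition is lossless.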
5 ) Describe the steps involved in query processing and explain the function of each step?

Ans :- Query processing is the process of executing a user's query against a database system. The main steps involved in query processing are:
• Parsing: The first step is parsing the user's query. The query is analyzed to check for syntax errors and to determine its structure. The parser generates a tree structure called a parse tree, which represents the query's syntax.
• Semantic Analysis: The system then checks the query for semantic errors, such as table or column names that do not exist or invalid data types. The semantic analyzer ensures that the query is well-formed and that all the elements referenced in the query are valid.
• Optimization: Once the query has been parsed and validated, the optimizer determines the most efficient way to execute it. The optimizer considers factors such as the size of the tables involved, the indexes available, and the query's complexity to choose the most efficient execution plan.
• Execution: Once the optimizer has determined the best execution plan, the query is executed. During execution, the system reads data from the database and applies the query's operations to generate the result set.
• Result Generation: The final step is generating the result set. The system takes the data obtained from the execution step and formats it into a result set that is returned to the user. The result set is typically presented in a tabular format and may be sorted, filtered, or aggregated based on the query's specifications.
Overall, query processing is a multi-step process that transforms a user's query into a result set. Each step plays a critical role in ensuring that the query is well-formed, efficient, and generates accurate results.
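A minimal sketch of how each stage can accept or reject a statement (MySQL-style syntax; the student table and its columns are assumed):

SELEC name FROM student;    -- rejected at parsing: 'SELEC' is a syntax error
SELECT nmae FROM student;   -- parses, but rejected at semantic analysis: unknown column 'nmae'
SELECT name FROM student
WHERE marks > 50;           -- valid: optimized, executed, and a result set is returned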
6 ) What is meant by query optimization? How is it achieved?
Ans :- The process of selecting an efficient execution plan for processing a query is known as query optimization. After query parsing, the parsed query is delivered to the query optimizer, which works out the different ways in which the query can be run, generates various execution plans, and selects the plan with the lowest estimated cost. The catalog manager assists the optimizer in selecting the optimum plan by providing the cost of each plan. Query optimization is used to access and modify the database in the most efficient way possible; it is the art of obtaining the necessary information in a predictable, reliable, and timely manner. Formally, query optimization is the process of transforming a query into an equivalent form that can be evaluated more efficiently, and its goal is to find an execution plan that reduces the time required to process the query. The main principles of query optimization are as follows:
• Understand how your database is executing your query: The first phase of query optimization is understanding what the database is doing. Different databases have different commands for this. For example, in MySQL one can use "EXPLAIN [SQL Query]" to see the query plan; in Oracle, one can use "EXPLAIN PLAN FOR [SQL Query]" (a sketch follows this list).
• Retrieve as little data as possible: The more data returned by the query, the more resources the database must expend to process and transfer those records. For example, if you only need to fetch one column from a table, do not use "SELECT *".
• Store intermediate results: Sometimes the logic for a query is quite complex. The desired results can be produced using subqueries, inline views, and UNION-type statements, but with those methods the intermediate results are not saved in the database and are used directly within the query. This can lead to performance issues, particularly when the intermediate results contain a huge number of rows; materializing them (for example, into a temporary table) can help.
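A minimal sketch of the first principle in MySQL (the table and column names are assumed):

EXPLAIN SELECT name FROM student WHERE marks > 50;
-- The output describes the chosen plan: which index (if any) is used,
-- the estimated number of rows examined, and the access strategy.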
7 ) What is serializability? Explain the distinction between a serial schedule and a serializable schedule?
Ans :- Serializability of schedules ensures that a non-serial schedule is equivalent to some serial schedule. It allows transactions to execute concurrently, with their operations interleaved, while still preserving database consistency. In simple words, serializability is a way to check whether the execution of two or more transactions maintains the consistency of the database.
A serial schedule is a schedule of transactions in which each transaction is executed one at a time, without any concurrency: each transaction is executed completely before the next transaction begins. A serial schedule is guaranteed to be correct and consistent, but it may not be the most efficient way to execute transactions in a multi-user environment.
A serializable schedule, on the other hand, is a schedule of transactions that is equivalent to a serial schedule even though the transactions may be executed concurrently. In other words, a serializable schedule ensures that the result of the concurrent execution of transactions is the same as if they were executed sequentially.
To achieve serializability, database systems use techniques such as locking and concurrency control. Locking prevents transactions from accessing the same data simultaneously by requiring them to acquire locks on the data; concurrency control manages the interactions between transactions to ensure that they are executed in a serializable manner.
The distinction between a serial schedule and a serializable schedule is important because, in a multi-user environment, it is often necessary to execute transactions concurrently to achieve good performance. By using concurrency control techniques to ensure serializability, database systems can achieve both correctness and efficiency, allowing multiple users to access and modify the database simultaneously without compromising the consistency of the data.
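A small sketch of the distinction, written as two SQL sessions (the account table and values are assumed):

-- Serial schedule: session 1 runs T1 completely, then session 2 runs T2.
-- Serializable schedule: the statements may interleave, but row locks keep
-- the outcome identical to one serial order.

-- Session 1 (T1):
START TRANSACTION;
UPDATE account SET bal = bal - 100 WHERE id = 1;  -- T1 takes an exclusive lock on row 1
COMMIT;                                           -- lock released

-- Session 2 (T2), possibly interleaved with T1:
START TRANSACTION;
UPDATE account SET bal = bal + 100 WHERE id = 1;  -- blocks until T1 commits
COMMIT;                                           -- result equals the serial order T1 then T2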
8 ) List the advantages and disadvantages of two-phase locking?
Ans :- Two-phase locking (2PL) is a concurrency control mechanism used in database systems to ensure that multiple transactions can access shared resources without interfering with each other. The two phases are the growing phase, during which a transaction acquires locks on the resources it needs, and the shrinking phase, during which the transaction releases the locks it acquired (a sketch follows this answer).
Advantages of two-phase locking:
1. Ensures serializability: Two-phase locking guarantees that the execution of transactions is serializable, which means that the results of executing multiple transactions concurrently are equivalent to executing them serially in some order. This property ensures that the database remains consistent and accurate.
2. Deadlock prevention in variants: basic 2PL does not by itself prevent deadlocks, but variants such as conservative 2PL (acquiring all locks before execution begins) or requiring transactions to acquire locks in a consistent order prevent circular dependencies between transactions from forming, which are what cause deadlocks.
3. Concurrency: 2PL allows a high degree of concurrency in database systems. Multiple transactions can access shared resources simultaneously without interfering with each other, which can improve the performance of the database.
Disadvantages of two-phase locking:
1. Locking overhead: Acquiring and releasing locks adds overhead to the execution of transactions, which can slow down the performance of the database. This overhead can be particularly significant in systems with a large number of concurrent transactions.
2. Limited parallelism: Two-phase locking limits the degree of parallelism in database systems. While multiple transactions can access shared resources simultaneously, they cannot do so in an arbitrary order, so the degree of parallelism is limited by the locking protocol.
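A minimal sketch of strict two-phase locking as it appears in SQL (the account table and values are assumed; most engines apply 2PL implicitly, and the explicit FOR UPDATE locks are shown only for illustration):

START TRANSACTION;
-- Growing phase: locks are acquired and none are released.
SELECT bal FROM account WHERE id = 1 FOR UPDATE;  -- exclusive lock on row 1
SELECT bal FROM account WHERE id = 2 FOR UPDATE;  -- exclusive lock on row 2
UPDATE account SET bal = bal - 100 WHERE id = 1;
UPDATE account SET bal = bal + 100 WHERE id = 2;
-- Shrinking phase: under strict 2PL, all locks are released together at commit.
COMMIT;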
9 ) Explain the log-based recovery scheme?
Ans :- Log-based recovery in DBMS provides the ability to maintain or recover data in case of system failure. The DBMS keeps a record of every transaction on some stable storage device to provide easy access to data when the system fails. A log record is created for every operation performed on the database, and the log record must be written to stable storage before the corresponding change is applied to the database (write-ahead logging). To understand log-based recovery, consider how Apache HBase, a distributed data store built on HDFS, uses a write-ahead log (WAL) to recover from system failure. The WAL is written every time a database write occurs, and it is replicated on HDFS. After recording the transaction in the WAL, the write is moved to the memstore, which lives on the HBase region server. As writes accumulate, the memstore fills; once it is full, an HFile is created and flushed to disk. If the system holding the memstore shuts down or restarts, all the data in the memstore cache is lost before it can be flushed to disk as HFiles. After such a failure, the WAL is replayed to reconstruct the data and repopulate the memstore, which can then flush the data to permanent storage as usual.
A log is a series of records. The logs for each transaction are kept in a log file to allow recovery in case of failure, and a log record is kept for each operation performed on the database. It is important to store the log before the actual changes are applied to the database. Take the example of modifying a student's City; this transaction produces the following logs. A start log is produced when the transaction begins:
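• <T1, Start> — written when transaction T1 begins.
• <T1, City, 'Pune', 'Mumbai'> — written before the update is applied, recording both the old and the new value of City (the city names here are assumed for illustration).
• <T1, Commit> — written when T1 completes successfully.
During recovery, a transaction with both a Start and a Commit record is redone using the new values, while a transaction with a Start record but no Commit record is undone using the old values.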
10 ) Explain the timestamp-based locking protocol?
Ans :- The timestamp-ordering protocol ensures serializability among transactions in their conflicting read and write operations. It is the responsibility of the protocol system that each conflicting pair of operations is executed according to the timestamp values of the transactions.
A lock is a mechanism to control concurrent access to a data item. Data items can be locked in two modes:
1. Exclusive (X) mode: the data item can be both read and written. An X-lock is requested using the lock-X instruction.
2. Shared (S) mode: the data item can only be read. An S-lock is requested using the lock-S instruction.
Timestamp-based protocol: The timestamp-based protocol ensures that every pair of conflicting read and write operations is executed in timestamp order. This is the most commonly used concurrency protocol. The older transaction is always given priority in this method, and system time is used to determine the timestamp of the transaction. In this protocol, each transaction is assigned a unique timestamp value, which represents the order in which the transaction requests access to shared resources. When a transaction attempts to read or write a data item, it checks the timestamp of the last transaction that accessed that item. If the last transaction has a lower timestamp than the current transaction, the current transaction is allowed to access the item; otherwise, the transaction is delayed until the resource is released by the previous transaction.
The protocol uses two types of locks: shared and exclusive. Shared locks are used when a transaction only needs to read a data item, while exclusive locks are used when a transaction needs to write to a data item. A transaction can acquire a shared lock only if no other transaction holds an exclusive lock on the same item, and it can acquire an exclusive lock only if no other transaction holds a shared or exclusive lock on the same item (see the SQL sketch after this answer).
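A minimal sketch of shared and exclusive locks in SQL (MySQL/InnoDB syntax; the table and column names are assumed):

-- Shared (S) lock: other transactions may also read-lock the row, but not write it.
SELECT marks FROM student WHERE roll = 1 FOR SHARE;   -- LOCK IN SHARE MODE in older MySQL versions
-- Exclusive (X) lock: no other transaction may lock or write the row.
SELECT marks FROM student WHERE roll = 1 FOR UPDATE;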
11 ) What is deadlock? Explain deadlock detection and recovery?
Ans :- Deadlock is a situation that occurs in a database management system when two or more
transactions are waiting for each other to release resources that they hold. This results in a situation
where no transaction can proceed, and the system becomes stuck, causing a failure in the system's
operations. Deadlocks can occur due to the way concurrency control mechanisms, such as locking
protocols, are implemented in the system.
Deadlock detection is a technique used to identify deadlocks in a database management system. It
involves periodically examining the system's lock tables to identify cycles in the waiting chains of
transactions. If a cycle is detected, it indicates that a deadlock has occurred. Once a deadlock is
detected, the system can take appropriate actions to resolve it.
One way to resolve a deadlock is through deadlock recovery. This technique involves rolling back one or more of the transactions involved in the deadlock to release the resources they hold. The system can determine which transactions to roll back (the victims) based on various factors, such as their priorities or the amount of work they have already completed. The system then releases the resources held by the rolled-back transactions and allows the remaining transactions to proceed.
Deadlock recovery can be a complicated process, as it involves deciding which transactions to roll back and how to ensure that the system remains consistent after the rollbacks. In some cases, the system may need to abort all transactions involved in the deadlock and restart them to ensure consistency.
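A minimal sketch of how a deadlock arises, as two interleaved SQL sessions (the account table and values are assumed):

-- Session 1 (T1):
START TRANSACTION;
UPDATE account SET bal = bal - 50 WHERE id = 1;   -- T1 locks row 1

-- Session 2 (T2):
START TRANSACTION;
UPDATE account SET bal = bal - 50 WHERE id = 2;   -- T2 locks row 2

-- Session 1 (T1):
UPDATE account SET bal = bal + 50 WHERE id = 2;   -- blocks: waits for T2's lock

-- Session 2 (T2):
UPDATE account SET bal = bal + 50 WHERE id = 1;   -- blocks: waits for T1's lock
-- The wait-for graph now contains the cycle T1 -> T2 -> T1, so the
-- detector reports a deadlock and rolls back one transaction as the victim.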
12 ) Explain the timestamp-based ordering protocol with an example?
Ans :- The timestamp ordering protocol is used to order transactions based on their timestamps. The order of the transactions is the ascending order of their creation: the older transaction has the higher priority, which is why it executes first. To determine the timestamp of a transaction, this protocol uses system time or a logical counter. A lock-based protocol manages the order between conflicting pairs of transactions at execution time, but timestamp-based protocols start working as soon as a transaction is created.
For example, assume there are two transactions T1 and T2. Suppose transaction T1 entered the system at time 007 and transaction T2 entered at time 009. T1 has the higher priority, so it executes first, as it entered the system first. The protocol also maintains the timestamps of the last 'read' and 'write' operation on each data item, written R_TS(X) and W_TS(X).
1. Whenever a transaction Ti issues a Read(X) operation, check the following conditions:
• If W_TS(X) > TS(Ti), the operation is rejected and Ti is rolled back.
• If W_TS(X) <= TS(Ti), the operation is executed and R_TS(X) is updated to the maximum of R_TS(X) and TS(Ti).
2. Whenever a transaction Ti issues a Write(X) operation, check the following conditions:
• If TS(Ti) < R_TS(X), the operation is rejected and Ti is rolled back.
• If TS(Ti) < W_TS(X), the operation is rejected and Ti is rolled back; otherwise the operation is executed and W_TS(X) is updated to TS(Ti).
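A short worked trace using the transactions above (TS(T1) = 007, TS(T2) = 009; the data item X and the order of operations are assumed for illustration):
• T2 issues Write(X). R_TS(X) and W_TS(X) start at 0, both less than 009, so the write executes and W_TS(X) becomes 009.
• T1 then issues Read(X). Now W_TS(X) = 009 > TS(T1) = 007, so the read is rejected: T1 would be reading a value written by a "future" transaction. T1 is rolled back and restarted with a new, larger timestamp.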
