DDBMS Pastpaper Solve by M.noman Tariq


Contact: MR.NOMAN.TARIQ@OUTLOOK.COM, 0309-6054532 (if you find any mistake, contact me).


Q.1 Explain multi-database system architecture with its types.
A multi-database system (MDBS) is a distributed database system that integrates two or more pre-
existing databases. It provides a unified view of the data and allows users to access and query data from
all of the component databases as if it were a single database.

Multi-DBMS Architectures


This is an integrated database system formed by a collection of two or more autonomous database
systems.

A multi-DBMS can be expressed through six levels of schemas:

• Multi-database View Level − Depicts multiple user views, each comprising a subset of the integrated distributed database.

• Multi-database Conceptual Level − Depicts the integrated multi-database, comprising global logical multi-database structure definitions.

• Multi-database Internal Level − Depicts the data distribution across different sites and the mapping of the multi-database to local data.

• Local database View Level − Depicts the public view of local data.

• Local database Conceptual Level − Depicts local data organization at each site.

• Local database Internal Level − Depicts physical data organization at each site.

There are two design alternatives for a multi-DBMS:

• Model with multi-database conceptual level.



• Model without multi-database conceptual level.

Describe the top-down approach used for designing distributed databases.

Top-down design method
The top-down design method starts from the general and moves to the specific. In other words, you start with a general idea of what is needed for the system and then work your way down to the more specific details of how its components will interact. This process involves identifying the different entity types and defining each entity's attributes.



Explain the horizontal fragmentation and its types used to design distributed database systems.

Horizontal fragmentation partitions a relation into disjoint subsets of tuples (rows), each defined by a selection predicate, so that each fragment can be stored at the site where it is used most. The union of all horizontal fragments reconstructs the original relation. There are two types:

• Primary horizontal fragmentation − A relation is fragmented using predicates defined on its own attributes (for example, splitting an employee relation by site or department).

• Derived horizontal fragmentation − A relation is fragmented according to the fragmentation of a related relation it joins with (for example, fragmenting assignments to follow the fragments of employees), so that related tuples are stored at the same site.
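A minimal sketch of primary horizontal fragmentation (the EMP relation, its attributes, and the site-based predicates below are illustrative, not taken from the paper): each fragment holds the tuples satisfying one predicate, and the predicates must together be complete and mutually exclusive so that the union of the fragments reconstructs the relation.

# Minimal sketch of primary horizontal fragmentation (illustrative names).
EMP = [
    {"ENO": "E1", "ENAME": "J. Doe",   "TITLE": "Elect. Eng.", "SITE": "Lahore"},
    {"ENO": "E2", "ENAME": "M. Smith", "TITLE": "Analyst",     "SITE": "Karachi"},
    {"ENO": "E3", "ENAME": "A. Lee",   "TITLE": "Analyst",     "SITE": "Lahore"},
]

# Fragmentation predicates: disjoint, and complete over the SITE attribute.
predicates = {
    "EMP1": lambda t: t["SITE"] == "Lahore",
    "EMP2": lambda t: t["SITE"] == "Karachi",
}

fragments = {name: [t for t in EMP if p(t)] for name, p in predicates.items()}

# Reconstruction rule: the union of all horizontal fragments yields EMP again.
union = [t for frag in fragments.values() for t in frag]
assert sorted(map(str, union)) == sorted(map(str, EMP))
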
Write down a note on peer-to-peer architecture.
Peer-to-Peer (P2P) architecture is a fundamental paradigm in computer networking and distributed
systems. Unlike traditional client-server models where one central server manages communication and
resources, P2P architecture distributes these functions among all connected nodes. This decentralized
approach offers various benefits and has found applications in a wide range of fields.



Key Characteristics:
1. P2P networks lack a central authority or server. Instead, all participating nodes are equal peers,
capable of both requesting and providing services or resources.
2. P2P networks often scale effortlessly. As more nodes join the network, it becomes more robust
and efficient, rather than overloading a single central server.
3. The decentralized nature of P2P networks makes them inherently resilient to failures. If one
node goes down, others can still function, ensuring the network's continuity.
4. P2P networks excel at resource sharing. This can include sharing files, computational power
(as in blockchain-based networks), or even internet connectivity.
5. P2P networks can provide a higher level of anonymity and privacy compared to centralized
systems, as they often lack a single point of control or surveillance.

Types of P2P Architectures:


1. Unstructured P2P:

In unstructured P2P networks, nodes connect randomly to other nodes, making it suitable
for applications like file-sharing (e.g., BitTorrent). Search queries may require flooding the
network, making it less efficient for structured searches.

2. Structured P2P:

Structured P2P networks impose a specific topology or structure, such as a Distributed Hash
Table (DHT). This enables efficient data retrieval and indexing, commonly used in distributed
databases or content delivery networks.

3. Hybrid P2P:

Hybrid P2P networks combine elements of both structured and unstructured models to
leverage the strengths of each. They aim to strike a balance between efficient search and
scalability.
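A minimal sketch of how a structured P2P network can place and look up data with a DHT-style hash ring (the peer names and ring size are invented; real systems such as Chord add finger tables, replication, and churn handling):

import hashlib

def h(value: str) -> int:
    """Map a node id or data key onto a 2**16-slot identifier ring."""
    return int(hashlib.sha1(value.encode()).hexdigest(), 16) % 2**16

nodes = sorted(["peer-a", "peer-b", "peer-c"], key=h)

def responsible_node(key: str) -> str:
    """A key is owned by the first node clockwise from its hash (its successor)."""
    k = h(key)
    for node in nodes:
        if h(node) >= k:
            return node
    return nodes[0]  # wrap around the ring

# Every peer computes the same owner, so lookups need no central index.
print(responsible_node("movie.mkv"))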



What is meant by Data Integration? Explain a scenario where we apply data integration. Write in detail the factors affecting Data Integration, with approaches to overcome these factors.
Data integration is the process of combining and harmonizing data from different sources into a unified
view or format. The goal of data integration is to provide a comprehensive and consistent view of data
across an organization, allowing for more effective data analysis, reporting, and decision-making. It
involves bringing together data from various systems, applications, and databases and making it
accessible and usable for various business processes and applications.

Scenario for Data Integration:


A large retail company operates both physical stores and an e-commerce website. They collect data from
various sources, including point-of-sale systems, online transactions, customer feedback, and inventory
management systems. Each source uses different data formats and structures, making it challenging to
gain a holistic view of their business.

Factors Affecting Data Integration:


Data Quality:
Inconsistent or inaccurate data in the source systems can lead to problems during integration.

Data Volume:
Large volumes of data can strain integration processes, impacting performance and scalability.

Data Variety:
Dealing with diverse data formats, such as structured, semi-structured, and unstructured data, can be
challenging.

Data Velocity:
Real-time or near-real-time data integration requirements can put pressure on systems and processes.

Data Governance:
Ensuring data security, compliance, and privacy is crucial.

Approaches to Overcome Data Integration Challenges:


Data Quality Assurance:
Implement data quality checks and validation rules to ensure data accuracy and consistency.

Data Profiling:
Use data profiling tools to understand the structure and quality of data before integration.

Data Standardization:
Establish data standards and conventions to normalize data from different sources.

Master Data Management (MDM):


Employ MDM solutions to create a master repository of essential data, such as customer and product
information.



Data Integration Platforms:
Invest in robust data integration platforms and tools that support various data formats and integration
methods.

Data Governance Framework:


Develop and enforce data governance policies to manage data security, privacy, and compliance.

Scalability and Performance Optimization:


Continuously monitor and optimize integration processes for handling large data volumes and real-time
data.

Change Management:
Implement effective change management practices to ensure that data integration processes adapt to
evolving business needs.
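As a small illustration of the data standardization approach above (the field names and formats are invented for the sketch), records from a point-of-sale system and the e-commerce site are normalized into one unified customer layout:

from datetime import datetime

# Two sources describing the same customer in different formats.
pos_row = {"cust": "ALI RAZA", "dob": "03/11/1990", "spend": "1,250"}
web_row = {"customerName": "ali raza", "birthDate": "1990-11-03", "total": 1250.0}

def standardize_pos(r):
    return {
        "name": r["cust"].title(),
        "dob": datetime.strptime(r["dob"], "%d/%m/%Y").date().isoformat(),
        "total_spend": float(r["spend"].replace(",", "")),
    }

def standardize_web(r):
    return {
        "name": r["customerName"].title(),
        "dob": r["birthDate"],
        "total_spend": float(r["total"]),
    }

# After standardization both sources agree on a single format.
assert standardize_pos(pos_row) == standardize_web(web_row)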

What is Query Optimization, and what are the important phases of Query Optimization?
Query optimization is a fundamental process in database management systems (DBMS) that aims to
improve the efficiency of SQL queries by selecting the most efficient execution plan. This optimization
process is crucial for minimizing response times and resource usage.

The important phases of query optimization typically include:


Parsing and Validation:
The SQL query is parsed and validated to ensure it adheres to the database's syntax and semantics.
Errors and inconsistencies are identified at this stage.

Query Transformation:
The query may undergo transformations to simplify or optimize it. This can involve rewriting subqueries,
using materialized views, or other techniques to enhance performance.

Query Rewrite:
Further query rewriting may be performed to optimize execution, such as converting subqueries into
joins or using available indexes.

Query Optimization:
This is the core phase where the DBMS generates various candidate execution plans. These plans
consider different access methods, join strategies, and operation orders. The goal is to estimate the cost
of each plan and choose the one with the lowest estimated cost.

Cost Estimation:
During query optimization, the DBMS estimates the cost of executing each candidate plan. The cost
includes factors like I/O cost, CPU cost, and memory usage. Accurate statistics about the data and
indexes are crucial for making these cost estimates.



Plan Selection:
After evaluating all candidate execution plans, the DBMS selects the plan with the lowest estimated cost
for query execution.

Plan Execution:
The chosen execution plan is used to execute the query, which may involve accessing tables, performing
joins, and applying filter conditions.

Here is an example query (in relational algebra):

π(ENAME, SAL) ( σ((BUDGET ≥ 200000) ∨ (DUR > 24)) ( EMP ⨝(ENO = ENO) ASG ⨝(PNO = PNO) PROJ ⨝(PNAME = PNAME) PAY ) )
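A toy illustration of the cost-estimation and plan-selection phases over the four relations in this query (the cardinalities, the single selectivity value, and the "sum of intermediate result sizes" cost model are invented for the sketch; a real optimizer estimates I/O, CPU, and network costs from catalog statistics):

SELECTIVITY = 0.001  # assumed fraction of the cross product kept by each join
card = {"EMP": 400, "ASG": 20_000, "PROJ": 200, "PAY": 50}

def plan_cost(order):
    """Cost of a left-deep join order = total size of intermediate results."""
    size = card[order[0]]
    cost = 0.0
    for rel in order[1:]:
        size = size * card[rel] * SELECTIVITY  # estimated join output size
        cost += size
    return cost

candidates = [
    ("EMP", "ASG", "PROJ", "PAY"),
    ("PAY", "PROJ", "ASG", "EMP"),
    ("PROJ", "ASG", "EMP", "PAY"),
]

# Plan selection: pick the candidate with the lowest estimated cost.
best = min(candidates, key=plan_cost)
print(best, plan_cost(best))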

What is a transaction? What is meant by concurrent transactions? What features must a transaction have? Explain briefly.

A transaction is a logical unit of work consisting of one or more database operations (reads and writes) that must be executed as a single whole. Distributed transaction management is the process of ensuring the consistency, isolation, durability, and atomicity (the ACID properties) of transactions that involve multiple resources or databases in a distributed system. In a distributed environment, a transaction may involve operations on different nodes or databases, and coordinating these operations to maintain data integrity is a complex task.

Concurrent transactions
Concurrent transactions are transactions that execute at the same time over the same database. Concurrency control is a fundamental concept in database management systems (DBMS) that deals with managing the execution of multiple transactions simultaneously while maintaining data consistency and integrity. In a multi-user environment, where multiple transactions can execute concurrently, concurrency control mechanisms are employed to prevent conflicts and ensure that the final state of the database remains consistent.

Features of a Transaction
Atomicity:
A transaction is atomic, which means it is treated as a single, indivisible unit of work. Either all the
operations within a transaction are completed successfully, or none of them are. If any part of a
transaction fails, the entire transaction is rolled back, ensuring data consistency.

Consistency:
Transactions must ensure that the database transitions from one consistent state to another. This means
that the database should adhere to predefined rules and constraints, maintaining data integrity.

Isolation:
Transactions should be isolated from each other to prevent interference or conflicts. Multiple
transactions can run concurrently, but they should not be aware of each other. Isolation mechanisms like
locking and concurrency control are used to manage this.



Durability:
Once a transaction is committed, its changes to the database must be permanent and survive system
failures. Durability is achieved through techniques like logging, ensuring that data remains intact even in
the event of crashes.

Concurrency Control:
In a system with concurrent transactions, concurrency control mechanisms are vital. These mechanisms
manage access to shared resources (e.g., data) to prevent conflicts and maintain data consistency.
Techniques like locking and timestamps help control concurrency.

Serializability:
Transactions should be serializable, meaning that their execution should be equivalent to some
sequential order of execution. This ensures that the final state of the database is consistent, regardless of
the order in which transactions are executed.
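A minimal single-node sketch of atomicity using Python's built-in sqlite3 module (the table and the simulated failure are invented for the illustration): either both writes of a funds transfer commit, or neither does.

import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
con.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
con.commit()

try:
    with con:  # opens a transaction; commits on success, rolls back on error
        con.execute("UPDATE account SET balance = balance - 80 WHERE id = 'A'")
        raise RuntimeError("simulated crash between the two writes")
        con.execute("UPDATE account SET balance = balance + 80 WHERE id = 'B'")
except RuntimeError:
    pass

# The partial debit was rolled back, so the database is still consistent.
print(dict(con.execute("SELECT id, balance FROM account")))  # {'A': 100, 'B': 50}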

What are the different categories of NoSQL datastores? Write examples of each category. Write features of any one NoSQL of your choice.
When people use the term “NoSQL database,” they typically use it to refer to any non-relational
database. Some say the term “NoSQL” stands for “non SQL” while others say it stands for “not only SQL.”
Either way, most agree that NoSQL databases are databases that store data in a format other than
relational tables.

Types of NoSQL databases


Over time, four major types of NoSQL databases emerged: document databases, key-value databases,
wide-column stores, and graph databases.

Document databases
A document database stores data in JSON, BSON, or XML documents. Documents in the database can be nested, and particular elements can be indexed for faster querying. Documents can be stored, accessed, and retrieved in a form that is much closer to the data objects used in applications, which means less translation is required to use the data in an application. SQL data, by contrast, must often be assembled and disassembled when moving between applications and storage. Examples: MongoDB, CouchDB.

Key-value stores
This is the simplest type of NoSQL database. Every element is stored as a key-value pair consisting of an attribute name ("key") and a value. This database is like an RDBMS with two columns: the attribute name (such as "state") and the value (such as "Alaska"). Examples: Redis, Amazon DynamoDB.

Column-oriented databases
While an RDBMS stores data in rows and reads it row by row, column-oriented databases are organized as a set of columns. When you want to run analytics on a small number of columns, you can read those columns directly without consuming memory on unwanted data. Columns of the same type benefit from more efficient compression, making reads even faster. Examples: Apache Cassandra, HBase.



Graph databases
A graph database focuses on the relationships between data elements. Each element is stored as a node, and the connections between elements in the database are called links or relationships. Connections are first-class elements of the database, stored directly. A graph database is optimized to capture and search the connections between elements, avoiding the overhead associated with JOINing several tables in SQL. Very few real-world business systems can survive solely on graph databases. Examples: Neo4j, Amazon Neptune.

Features of NoSQL (MongoDB)
Flexible data models
NoSQL databases typically have very flexible schemas. A flexible schema allows you to easily make
changes to your database as requirements change. You can iterate quickly and continuously integrate
new application features to provide value to your users faster.

Horizontal scaling
Most SQL databases require you to scale up vertically (migrate to a larger, more expensive server) when you exceed the capacity of your current server. Conversely, most NoSQL databases allow you to scale out horizontally, meaning you can add cheaper commodity servers whenever you need to.

Fast queries
Queries in NoSQL databases can be faster than SQL databases. Why? Data in SQL databases is typically
normalized, so queries for a single object or entity require you to join data from multiple tables. As your
tables grow in size, the joins can become expensive. However, data in NoSQL databases is typically stored
in a way that is optimized for queries. The rule of thumb when you use MongoDB is data that is accessed
together should be stored together. Queries typically do not require joins, so the queries are very fast.
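An illustrative document-model record (MongoDB-style, shown here as a plain Python dict; the schema is invented) that follows the "data that is accessed together should be stored together" rule of thumb:

# The customer's orders are embedded in one document, so a single lookup
# retrieves everything; a normalized SQL schema would need joins across
# customers, orders, and items tables.
customer = {
    "_id": "c1",
    "name": "Ayesha Khan",
    "city": "Lahore",
    "orders": [
        {"order_id": "o1", "total": 1200, "items": ["keyboard", "mouse"]},
        {"order_id": "o2", "total": 300,  "items": ["usb cable"]},
    ],
}

total_spend = sum(o["total"] for o in customer["orders"])
print(total_spend)  # 1500, computed without any join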



What is distribution transparency? Also explain its types.

Distribution transparency and its types (fragmentation, location, replication, and local mapping transparency) are explained later in this document under "Explain distribution transparency for Read-only Applications."
What is database control in DDBMS? Write its dimensions.
Database control refers to the task of enforcing regulations so as to provide correct data to authentic users and applications of a database. To ensure that correct data is available to users, all data should conform to the integrity constraints defined in the database. In addition, data should be screened from unauthorized users so as to maintain the security and privacy of the database. Database control is one of the primary tasks of the database administrator (DBA).

The three dimensions of database control are:

• Authentication

• Access rights

• Integrity constraints

Authentication
In a distributed database system, authentication is the process through which only legitimate users can
gain access to the data resources.



Access Rights
A user’s access rights refers to the privileges that the user is given regarding DBMS operations such as
the rights to create a table, drop a table, add/delete/update tuples in a table or query upon the table.

Integrity Control
Semantic integrity control defines and enforces the integrity constraints of the database system.

Explain distributed DDBMS architecture?

Already explained in previous questions.

How is concurrency control in a distributed database managed? Explain with suitable examples and diagrams.
Concurrency control in a distributed database system is essential for ensuring that multiple transactions
can access and modify data concurrently while maintaining data consistency and integrity. There are
several techniques and algorithms to manage concurrency control in distributed databases. I'll provide
an overview with suitable examples and diagrams.

1. Two-Phase Commit Protocol (2PC):


The Two-Phase Commit Protocol is used to ensure that all nodes in a distributed database agree on whether a transaction should be committed or aborted (a minimal sketch follows after this list of techniques).

2. Optimistic Concurrency Control:


Optimistic Concurrency Control assumes that conflicts between transactions are infrequent. Transactions are allowed to proceed without acquiring locks. Conflicts are detected and resolved at the time of committing the transaction.

3. Distributed Locking:
Distributed Locking involves acquiring locks on data items to control access by multiple transactions. Lock managers at each node coordinate lock requests and releases.

4. Timestamp-Based Concurrency Control:

Timestamps are assigned to transactions to establish an order of execution. Older transactions are given priority.
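A minimal in-process sketch of the Two-Phase Commit protocol from point 1 (the class and site names are invented; a real implementation exchanges these messages over the network and logs every step for recovery):

class Participant:
    def __init__(self, name, can_commit=True):
        self.name, self.can_commit = name, can_commit

    def prepare(self):           # phase 1: vote
        return self.can_commit   # True = "ready to commit", False = "abort"

    def commit(self):            # phase 2a: apply the decision
        print(f"{self.name}: committed")

    def rollback(self):          # phase 2b: undo the work
        print(f"{self.name}: rolled back")

def two_phase_commit(participants):
    # Phase 1 (voting): the coordinator asks every participant to prepare.
    votes = [p.prepare() for p in participants]
    # Phase 2 (decision): commit only if all voted yes, else abort everywhere.
    if all(votes):
        for p in participants:
            p.commit()
        return "COMMIT"
    for p in participants:
        p.rollback()
    return "ABORT"

print(two_phase_commit([Participant("site-A"), Participant("site-B")]))
print(two_phase_commit([Participant("site-A"), Participant("site-B", can_commit=False)]))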



How are distributed systems different from parallel systems? Explain with examples.

1. Parallel systems can process data simultaneously, increasing the computational speed of a computer system. In distributed systems, applications run on multiple computers linked by communication lines.

2. Parallel systems work through the simultaneous use of multiple computer resources, which can include a single computer with multiple processors. A distributed system consists of a number of computers that are connected and managed so that they share the job processing load among various computers distributed over the network.

3. In parallel systems, tasks are performed more quickly. In distributed systems, tasks are performed comparatively more slowly because of communication over the network.

4. Parallel systems are multiprocessor systems. In distributed systems, each processor has its own memory.

5. A parallel system is also known as a tightly coupled system. Distributed systems are also known as loosely coupled systems.

6. Parallel systems have close communication among their processors. Distributed systems communicate with one another through various communication lines, such as high-speed buses or telephone lines.

7. Parallel systems share a memory, clock, and peripheral devices. Distributed systems do not share memory or a clock, in contrast to parallel systems.

8. In parallel systems, all processors share a single master clock for synchronization. There is no global clock in distributed computing; various synchronization algorithms are used instead.

9. Examples of parallel systems: High-Performance Computing clusters and Beowulf clusters. Examples of distributed systems: Hadoop, MapReduce, and Apache Cassandra.

What are the main functions of a DDBMS? Explain its network-based functions.

Main functions of a DDBMS include:



Data Distribution:
The DDBMS distributes data across multiple sites or nodes in a network, allowing data to be stored and
managed at various locations. This distribution can be transparent to users and applications, making it
appear as a single, unified database.

Data Transparency:
DDBMS provides data transparency to users and applications. This means that users can access and
manipulate distributed data as if it were a single, centralized database without needing to know the
physical location of the data.

Query Processing:
DDBMS supports distributed query processing, allowing users to execute queries that involve data from
multiple sites. The system optimizes query execution to minimize data transfer and processing overhead.

Transaction Management:
DDBMS ensures the consistency and integrity of data across distributed sites by supporting distributed
transactions. It manages distributed transactions through techniques like two-phase commit and ensures
that either all changes made by a transaction are applied at all sites or none at all (atomicity).

Concurrency Control:
DDBMS provides mechanisms for handling concurrent access to distributed data to avoid conflicts and
ensure data consistency. This includes techniques like locking and timestamp-based concurrency control.

Distributed Data Dictionary:


It maintains metadata and data dictionaries across all distributed sites, helping users and applications
understand the structure and location of the data.

Network-based functions of a DDBMS:


Data Distribution and Replication:
DDBMS uses the network to distribute data across various sites. It may also employ replication, where
copies of data are stored at multiple sites to enhance availability and fault tolerance. The network is
essential for synchronizing updates and maintaining data consistency across replicas.

Data Communication:
The DDBMS relies heavily on network communication to exchange data and information between
distributed sites. This includes sending queries, updates, and transaction control messages across the
network.

Data Transfer Optimization:


Network-based functions involve optimizing data transfer between sites. DDBMS uses techniques like
data fragmentation, data replication, and data caching to minimize data transfer over the network,
reducing latency and bandwidth usage.

Security and Authentication:


The network functions include security measures to ensure that data transferred over the network
remains secure and confidential. This involves authentication, authorization, and encryption of data
during transmission.



Fault Tolerance and Load Balancing:
The DDBMS uses the network to monitor the health and status of distributed sites. In the event of a site
failure, it can redirect requests to alternative sites, ensuring fault tolerance. Load balancing can also be
achieved by directing queries to the site with the least load or network congestion.

Scalability:
Network-based functions facilitate the addition of new sites or nodes to the distributed database. The
DDBMS can dynamically adapt to changes in the network topology and scale its resources as needed.
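A minimal sketch of the data distribution and transparency functions described above (the catalog layout, attribute, and site names are invented): the DDBMS keeps a catalog mapping a global table to its fragments, so a user can query EMP without knowing where its rows live, and sites that cannot contribute to a query are pruned.

catalog = {
    "EMP": [
        # (fragment name, owning site, set of CITY values the fragment holds)
        ("EMP1", "site-lahore",  {"Lahore"}),
        ("EMP2", "site-karachi", {"Karachi"}),
    ],
}

def participating_sites(table, city=None):
    """Sites needed to answer SELECT * FROM table [WHERE CITY = city]."""
    return [site for _, site, cities in catalog[table]
            if city is None or city in cities]

print(participating_sites("EMP"))            # both sites: full-table query
print(participating_sites("EMP", "Lahore"))  # ['site-lahore'] (EMP2 pruned)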

How are horizontal fragments made in distributed systems?

Already explained under "Explain the horizontal fragmentation and its types" above.
How is replication done in a DDBMS? Explain with an example.
Replication techniques in a Database Management System (DBMS) involve the process of creating and
maintaining copies of data across multiple servers or nodes. The primary goal of replication is to enhance
data availability, fault tolerance, and performance. Here are some common replication techniques used
in DBMS:

• Master-Slave Replication

• Master-Master Replication

• Snapshot Replication

• Transactional Replication

• Merge Replication

• Peer-to-Peer Replication

• Lazy Replication

Master-Slave Replication:
In this technique, one database server (the master) is designated as the primary source of data, and one
or more other servers (slaves) replicate data from the master. The master handles write operations,
while the slaves handle read operations. This can improve read performance and provide fault tolerance.
However, updates to the master need to be replicated to the slaves to maintain consistency.

Master-Master Replication:
In this approach, multiple database servers act as both masters and slaves. Each server can handle both
read and write operations, which can improve both read and write performance. However, it introduces
complexities in managing data conflicts and maintaining consistency between multiple masters.

Snapshot Replication:
This technique involves periodically taking snapshots of the entire database or specific portions of it and
replicating those snapshots to other servers. Snapshots capture a point-in-time view of the data, and
replication occurs by copying the snapshots to other nodes. This is useful for data warehousing and
reporting purposes.

Transactional Replication:
In this method, changes made to the data (transactions) on one server are replicated in real-time to
other servers. This ensures that the data on all replicas is consistent and up-to-date. It's commonly used
in scenarios where data consistency is critical.

Merge Replication:
Merge replication is used in scenarios where multiple nodes can make updates to the data. Changes are
tracked, and at predefined intervals, these changes are merged across nodes to ensure that each node
has the most recent data.



Peer-to-Peer Replication:
In this technique, all participating servers are treated as equals, and changes can be made to any node.
These changes are then propagated to other nodes to maintain data consistency across the network.

Lazy Replication:
Also known as asynchronous replication, this technique allows for a certain delay between the time a
change is made on the master node and when it's replicated to the slave nodes. This can improve
performance by reducing the immediate overhead of replication.
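A minimal in-process sketch of master-slave replication with eager (synchronous) propagation (the node class and keys are invented; lazy replication would defer the propagation loop):

import random

class Node:
    def __init__(self, name):
        self.name, self.data = name, {}

    def apply(self, key, value):
        self.data[key] = value

master = Node("master")
slaves = [Node("slave-1"), Node("slave-2")]

def write(key, value):
    master.apply(key, value)   # the master handles all writes
    for s in slaves:           # eager replication: ship the change at once
        s.apply(key, value)    # (lazy replication would delay this step)

def read(key):
    return random.choice(slaves).data[key]  # reads load-balanced over slaves

write("stock:widget", 42)
print(read("stock:widget"))  # 42, served by either replica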

What are the integrity rules? Give their types. Explain with examples.
Integrity rules are needed to inform the DBMS about certain constraints in the real world. Specific
integrity rules apply to one specific database.

Example: part weights must be greater than zero.

There are various types of data integrity


Logical Integrity
In a relational database, logical integrity ensures the data remains intact as it is used in several ways. Logical integrity, like physical integrity, defends information from human error and hackers, but in a different way. There are multiple forms of logical integrity.

Referential Integrity
This comprises all procedures and rules enforced to ensure that data is stored and used consistently. It is the notion behind foreign keys.



User-Defined Integrity
These are rules, created by users, that fall outside entity, referential, and domain integrity. If an employer adds a column to record employees' corrective actions, the constraints on this data can be defined as "user-defined."

Domain Integrity
Domain integrity is a set of rules and procedures ensuring that all data items pertain to the correct domains. For instance, if a user types a birth date into a street address field, the system will display an error message that prevents the user from filling that field with the wrong information.

Physical Integrity
Physical integrity is the safeguarding of data's completeness and accuracy during storage and retrieval. Physical integrity is at risk when natural disasters occur, electricity goes out, or hackers interrupt database functions.
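An illustrative sketch of two of these rules expressed as checks (the relations are invented; the weight rule is the example from the text, and real DBMSs enforce both declaratively via CHECK and FOREIGN KEY constraints):

parts = [{"pno": "P1", "weight": 12.5}, {"pno": "P2", "weight": 3.0}]
orders = [{"ono": "O1", "pno": "P1"}, {"ono": "O2", "pno": "P9"}]

# Domain integrity: part weights must be greater than zero.
assert all(p["weight"] > 0 for p in parts)

# Referential integrity: every order must reference an existing part.
known_parts = {p["pno"] for p in parts}
violations = [o for o in orders if o["pno"] not in known_parts]
print(violations)  # [{'ono': 'O2', 'pno': 'P9'}] (a dangling foreign key)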

What are the models of a database? Give its types. Explain with an example.
Database Model:
It determines the logical structure of a database and fundamentally determines in which manner data
can be stored, organized and manipulated.

There are four common types of database model:

1. Hierarchical databases.

2. Network databases.

3. Relational databases.

4. Object-oriented databases.

1. Hierarchical databases
This is one of the oldest database models, developed by IBM for its Information Management System (IMS). In a hierarchical database model, the data is organized into a tree-like structure; in simple language, it is a set of data organized in a tree. This type of database model is rarely used nowadays. Its structure is like a tree, with nodes representing records and branches representing fields.



2. Network databases
This looks like the hierarchical database model, which is why it is often called a modified version of the hierarchical database. The network database model organizes data more like a graph and allows a record to have more than one parent node. The network model is a database model conceived as a flexible way of representing objects and their relationships.

3. Relational databases
The relational model was developed by E. F. Codd in 1970. The software systems used to maintain relational databases are known as relational database management systems (RDBMS). In this model, data is organized in a row-and-column structure, i.e., two-dimensional tables, and relationships are maintained by storing a common field. The model consists of three major components: data structure (relations), data manipulation (relational operators), and data integrity (constraints).

4. Object-oriented databases
An object database is a system in which information is represented in the form of objects, as used in object-oriented programming. Object-oriented databases are different from relational databases, which are table-oriented. The object-oriented data model is based on object-oriented programming language concepts.

How many steps are involved in developing a database? Explain with only a diagram.

Steps involved in developing a database:


Requirements Gathering:
In this initial phase, you gather and document the requirements for the database. This includes
understanding the data that needs to be stored, how it will be accessed, and any specific functionality or
constraints.

Conceptual Design:
Create a high-level conceptual model of the database, often represented using Entity-Relationship
Diagrams (ERDs). This step focuses on defining entities, their attributes, and the relationships between
them.

Normalization:
Normalize the conceptual model to reduce data redundancy and improve data integrity. This involves
breaking down entities and attributes into more granular tables to minimize data duplication.

Schema Design:
Design the schema for the database, specifying the tables, their attributes, data types, and relationships.
This can be represented using a schema diagram.

Physical Design:
Determine the physical storage structure of the database, including indexing strategies, file organization,
and access paths.

Implementation:
Create the database schema in the chosen database management system (DBMS). This step involves
writing SQL scripts or using a graphical interface to create tables, indexes, and other database objects.

Data Loading:
Populate the database with initial data. This may involve data migration from existing sources or manual
data entry.

Application Development:
Develop applications or software systems that will interact with the database. This includes writing code
for data insertion, retrieval, and manipulation.

Testing:
Thoroughly test the database and the applications that use it to ensure they meet the specified
requirements. This involves unit testing, integration testing, and performance testing.

Security and Permissions:


Define security measures and permissions to control access to the database. Specify who can perform
various operations on the data.

Backup and Recovery:


Implement backup and recovery procedures to safeguard data against loss or corruption. This includes
defining backup schedules and strategies.



Documentation:
Document the database schema, data dictionary, and any relevant information about the database and
its usage.

Training:
Provide training to users and administrators who will interact with the database and its applications.

Deployment:
Deploy the database and associated applications in the production environment.

Monitoring and Maintenance:


Continuously monitor the database's performance and health. Perform routine maintenance tasks such
as optimizing queries, applying patches, and scaling as needed.

Optimization:
Identify and address performance bottlenecks, optimize queries, and refine the database design as
necessary to improve efficiency.

Scaling:
If the database grows or experiences increased usage, scale it by adding hardware resources or using
database scaling techniques.

End-user Feedback:
Gather feedback from end-users and stakeholders to make improvements and adjustments to the
database and its applications.



Explain the Top-Down and Bottom-Up approach to the design of data distribution.

Top-Down Design Model:
In the top-down model, an overview of the system is formulated without going into detail for any part of it. Each part is then refined in more and more detail until the entire specification is detailed enough to validate the model. If we look at a problem as a whole, it may appear impossible at first because it is so complex.

Bottom-Up Design Model:


In this design, individual parts of the system are specified in detail. The parts are then linked to form larger components, which are in turn linked until a complete system is formed. Object-oriented languages such as C++ or Java use a bottom-up approach, where each object is identified first.

Explain distribution transparency for Read-only Applications.

Distribution transparency lets the user perceive the database as a single logical entity; if a DDBMS exhibits distribution transparency, the user does not need to know that the data is fragmented.

Fragmentation transparency
In this type of transparency, the user does not have to know about fragmented data; database accesses are based on the global schema. This is much like users of SQL views, where the user might not know that they are employing a view of a table rather than the table itself.

Location transparency
If this type of transparency is provided by the DDBMS, the user needs to know how the data has been fragmented, but does not need to know the location of the data.

Replication transparency
In replication transparency, the user does not know about the copying of fragments. Replication transparency is related to concurrency transparency and failure transparency. Whenever a user modifies a data item, the update is reflected in all copies of the table; however, this operation should not be visible to the user.

Local Mapping transparency

In local mapping transparency, the user needs to specify both the fragment names and the locations of data items, taking into account any replicas that may exist. This makes queries more difficult and time-consuming for the user to write.



Explain the following distributed database concepts:
a) Global Query
b) Global Dictionary

a) Global Query:
In a Distributed Database Management System (DDBMS), a global query refers to a database query that
is executed across multiple distributed databases as if they were a single, unified database. Unlike a
query in a centralized database management system (DBMS), where all data is stored in one location, a
global query in a DDBMS involves retrieving or manipulating data that is distributed across multiple
nodes or locations.

Global queries in DDBMS often require a mechanism to coordinate and distribute the query to the
relevant database nodes, retrieve results, and consolidate them into a coherent response for the user or
application. This coordination is necessary because each node in the distributed database may have its
own schema and data, and the DDBMS must handle the complexities of routing and aggregating the
query results.

b) Global Dictionary
In DDBMS, a global dictionary, sometimes referred to as a global schema or directory, serves as a central
repository of metadata and information about the structure and location of data in the distributed
database. It provides a standardized way to reference and access data distributed across different
database nodes. Key functions of a global dictionary in DDBMS include:

Schema Mapping:
It maintains mappings between the global schema (the way data is logically organized across the
distributed database) and the local schemas of individual database nodes.

Location Transparency:
It keeps track of the physical locations of data within the distributed system, allowing the DDBMS to
route queries to the appropriate nodes.

Data Description:
It stores metadata about tables, attributes, relationships, and other schema-related information, helping
users and applications understand the structure of the distributed database.

Query Optimization:
The global dictionary can be used by the DDBMS to optimize query execution by determining the most
efficient way to access and retrieve data across the distributed nodes.
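A toy sketch tying the two concepts together (the dictionary layout, site names, and data are invented): the global dictionary maps a global table name to the local table at each site, and a global query is scattered to those sites and its partial results gathered into one answer.

global_dictionary = {
    "CUSTOMER": {               # global name -> local name at each site
        "site-1": "cust_north",
        "site-2": "cust_south",
    },
}

local_data = {  # stand-in for the sites' local databases
    ("site-1", "cust_north"): [{"id": 1, "city": "Lahore"}],
    ("site-2", "cust_south"): [{"id": 2, "city": "Karachi"}],
}

def global_select(table):
    """Scatter the query to every site holding a fragment, then gather."""
    rows = []
    for site, local_name in global_dictionary[table].items():
        rows.extend(local_data[(site, local_name)])  # ship subquery to site
    return rows

print(global_select("CUSTOMER"))  # unified result drawn from both sites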

Elaborate how concurrency control in distributed databases is managed, with suitable examples and diagrams.

Already explained.
