Distributed Data Model

A distributed database is a database that consists of two or more files
located at different sites, either on the same network or on entirely
different networks. Portions of the database are stored in multiple
physical locations, and processing is distributed among multiple database
nodes.
A centralized distributed database management system (DDBMS)
integrates data logically so it can be managed as if it were all stored in
the same location. The DDBMS synchronizes all the data periodically
and ensures that data updates and deletes performed at one location will
be automatically reflected in the data stored elsewhere.
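The idea of managing data "as if it were all stored in the same location" can be illustrated with a small sketch. This is not how any particular DDBMS is implemented; the `Site` and `Catalog` classes, the site names, and the sample rows are all hypothetical, chosen only to show a caller querying a table by name without knowing where it lives.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """One physical location holding some of the tables (hypothetical)."""
    name: str
    tables: dict = field(default_factory=dict)  # table name -> list of row dicts


class Catalog:
    """Toy global catalog: maps each logical table name to the site storing it."""

    def __init__(self, sites):
        self.location = {}
        for site in sites:
            for table in site.tables:
                self.location[table] = site

    def select(self, table, predicate=lambda row: True):
        # The caller names only the table; the catalog resolves the site.
        site = self.location[table]
        return [row for row in site.tables[table] if predicate(row)]


# Made-up sites and data, for illustration only.
paris = Site("paris", {"customers": [{"id": 1, "name": "Ana"}]})
tokyo = Site("tokyo", {"orders": [{"id": 10, "customer_id": 1, "total": 25.0}]})
catalog = Catalog([paris, tokyo])

customers = catalog.select("customers")
big_orders = catalog.select("orders", lambda row: row["total"] > 20)
```

The two `select` calls look identical to the caller even though the tables sit at different sites, which is the essence of location independence.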
Features of Distributed Database
• Location independent
• Distributed query processing
• Distributed transaction management
• Hardware independent
• Operating system independent
• Network independent
• Transaction transparency
• DBMS independent
Distributed Database Architecture
Distributed databases can be homogeneous or heterogeneous.
In a homogeneous distributed database system, all the physical locations have the same
underlying hardware and run the same operating systems and database applications.
For a distributed database system to be homogeneous, the data structures at each location must
be either identical or compatible. The database application used at each location must also be
either identical or compatible.
In a heterogeneous distributed database, the hardware, operating systems or database
applications may be different at each location. Different sites may use different schemas and
software, although a difference in schema can make query and transaction processing difficult.
Advantages of distributed databases
• Distributed databases support modular development: a system can be expanded by adding
new computers and local data at a new site and connecting them to the distributed system
without interruption.
• When a centralized database fails, the whole system can come to a complete stop. When a
component fails in a distributed database system, however, the system can continue to
function at reduced performance until the error is fixed.
• Admins can achieve lower communication costs in a distributed database system by
locating data close to where it is used the most. This is not possible in centralized
systems.
Types of distributed databases
1. Replication
Replication creates copies (instances) of the same data at different sites of the database. With
a replica available locally, a site can read identical data without going over the network, thus
avoiding traffic.
Replicated data can be divided into two categories: read-only and writable.
With read-only replicas, revisions are allowed only to the first instance, and the other
replicas are then adjusted to match. Writable replicas can be altered directly, and the change
is applied to the first instance immediately.
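The read-only case can be sketched in a few lines. This is a simplified, in-memory illustration under stated assumptions, not a real replication protocol: the `ReplicatedItem` class and the site names are hypothetical, and propagation is shown as an immediate loop rather than network messages.

```python
class ReplicatedItem:
    """Toy primary-copy replication: one primary value, several read-only replicas."""

    def __init__(self, value, replica_sites):
        self.primary = value
        self.replicas = {site: value for site in replica_sites}

    def write(self, value):
        # Revisions are accepted only at the first (primary) instance ...
        self.primary = value
        # ... and the copies are then adjusted to match it.
        for site in self.replicas:
            self.replicas[site] = value

    def read(self, site):
        # Reads are served from the local copy, so no remote request is needed.
        return self.replicas[site]


# Made-up sites, for illustration only.
price = ReplicatedItem(100, ["london", "mumbai"])
price.write(120)
local_value = price.read("mumbai")
```

A real system would also have to decide what readers see while the copies are still being adjusted; that trade-off is deliberately left out of this sketch.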
2. Fragmentation
In this approach, the relations are fragmented (i.e., divided into smaller parts), and each
fragment is stored at the sites where it is required. The fragments must allow the original
relation to be reconstructed (i.e., there must be no loss of data). Because fragmentation does
not create copies of the data, consistency between copies is not a problem.
Fragmentation of relations can be done in two ways:
• Horizontal fragmentation (splitting by rows): The relation is fragmented into groups of
tuples so that each tuple is assigned to at least one fragment.
• Vertical fragmentation (splitting by columns): The schema of the relation is divided into
smaller schemas. Each fragment must contain a common candidate key to ensure a lossless join.
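Both kinds of fragmentation, and the way the original relation is rebuilt from the fragments, can be shown with plain Python lists. The `employees` relation, its columns, and the split on `city` are invented for illustration; a real system would place each fragment at a different site.

```python
employees = [
    {"id": 1, "name": "Ana", "city": "Paris", "salary": 5000},
    {"id": 2, "name": "Ben", "city": "Tokyo", "salary": 6000},
    {"id": 3, "name": "Chloe", "city": "Paris", "salary": 5500},
]

# Horizontal fragmentation: split by rows, here on the value of "city".
paris_rows = [row for row in employees if row["city"] == "Paris"]
other_rows = [row for row in employees if row["city"] != "Paris"]

# Reconstruction is the union of the row fragments.
from_horizontal = sorted(paris_rows + other_rows, key=lambda row: row["id"])

# Vertical fragmentation: split by columns; both fragments keep the key "id".
public_cols = [{"id": r["id"], "name": r["name"], "city": r["city"]} for r in employees]
payroll_cols = [{"id": r["id"], "salary": r["salary"]} for r in employees]

# Reconstruction is a join of the column fragments on the shared key.
payroll_by_id = {r["id"]: r for r in payroll_cols}
from_vertical = [{**r, **payroll_by_id[r["id"]]} for r in public_cols]
```

In both cases the rebuilt relation equals the original, which is the "no loss of data" requirement. If the vertical fragments did not share the key `id`, there would be no reliable way to match each salary to its row.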
Applications of Distributed Database:
• It is used in corporate management information systems.
• It is used in multimedia applications.
• It is used in military control systems, hotel chains, etc.
• It is also used in manufacturing control systems.
Examples of distributed databases
• There are many distributed databases to choose from. Examples include Apache Ignite,
Apache Cassandra, Apache HBase, Couchbase Server, Amazon SimpleDB, Clusterpoint,
and FoundationDB.
• Apache Ignite specializes in storing and computing large volumes of data across clusters of
nodes. In 2014, Ignite was open sourced by GridGain Systems and later accepted into the
Apache Incubator program. Apache Ignite's database uses RAM as the default storage and
processing tier.
• Apache Cassandra offers support for clusters that span multiple locations, and it features its
own query language, Cassandra Query Language (CQL). Additionally, Cassandra's
replication strategies are configurable.
Distributed Database Model
1. Consists of multiple database files located at different sites
2. Allows multiple users to access and manipulate data
3. Files are delivered quickly from the location nearest the user
4. If one site fails, the data is still retrievable
5. Multiple files from dispersed databases must be synchronized
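Points 3 and 4 can be sketched together: serve a read from the nearest site, and fall back to the next nearest if that site is down. The function `read_from_nearest`, the site names, and the distances are hypothetical, and the sketch assumes the same data has already been synchronized to both sites (point 5).

```python
def read_from_nearest(key, sites, distance_km):
    """Try sites from nearest to farthest; skip any that are down."""
    for name in sorted(sites, key=lambda n: distance_km[n]):
        site = sites[name]
        if site["up"] and key in site["data"]:
            return name, site["data"][key]
    raise LookupError(f"{key!r} is not available at any reachable site")


# Made-up sites holding synchronized copies of the same record.
sites = {
    "frankfurt": {"up": True, "data": {"invoice-7": "paid"}},
    "singapore": {"up": True, "data": {"invoice-7": "paid"}},
}
distance_km = {"frankfurt": 300, "singapore": 10000}  # assumed distances from the user

first = read_from_nearest("invoice-7", sites, distance_km)

sites["frankfurt"]["up"] = False  # simulate a site failure
second = read_from_nearest("invoice-7", sites, distance_km)
```

The first read is answered by the nearer site; after the simulated failure the same call is answered by the remaining site, at the cost of a longer trip.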