
Experiment 6 - Part 1

Aim - To implement a small application using data replication.

Theory -
Overview - A key idea in distributed systems is data replication: the process of
creating and maintaining multiple copies of data on different nodes or servers.
The main objective is to improve fault tolerance, performance, and data
availability by allowing data to be accessed simultaneously from several sources.
Replication reduces access latency by moving data closer to users and
lessens the chance of data loss in the event of a node failure.

Types of Data Replication -

Full Replication -
In full replication, every node in the system stores a complete copy of the entire
dataset. This approach offers high fault tolerance and read performance, although
maintaining consistency becomes more difficult.

Partial Replication -
In partial replication, different nodes store distinct subsets of the dataset. This
method balances the trade-offs between consistency, fault tolerance, and
storage efficiency.

Replication Strategies -

Master-Slave Replication -
In master-slave replication, one node (the master) handles all write operations, and
changes are propagated to one or more replica nodes (slaves). This strategy
simplifies consistency but creates a single point of failure.

Multi-Master Replication -

In multi-master replication, multiple nodes can independently accept write
operations. While this approach improves fault tolerance and write availability,
handling concurrent updates requires a robust conflict resolution mechanism.

Use Cases -
Data replication is frequently used in distributed databases, content delivery
networks (CDNs), and geographically dispersed systems, among other situations
where high availability, fault tolerance, and low latency access to data are
essential.
Implementation:
Replication
1) Install MongoDB
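For reference, one common way to install it, assuming Ubuntu with the official MongoDB apt repository already configured (package names differ per platform):

sudo apt-get update
sudo apt-get install -y mongodb-org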

2) Each replicated database instance keeps its own copy of the data; to support
this, create a separate data directory for each instance
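For example, assuming the instances are named primary_db, replica_01 and replica_02 as in the later steps, the data directories could be created with:

mkdir -p ./primary_db ./replica_01 ./replica_02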
3) Now run the primary DB instance, specifying a unique port and a name for the
replica set

Here the primary_db instance will run on port 6000 and the name of the replica set
is my_replica_set
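A minimal sketch of the command, assuming the ./primary_db directory created in step 2 (other flags such as --bind_ip are omitted):

mongod --port 6000 --dbpath ./primary_db --replSet my_replica_set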

4) Similarly, create new db instances for replica_01 and replica_02, but their
replica set name must be the same as the primary db's

replica_01 running on port 6001

replica_02 running on port 6002
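The corresponding commands would look like the following, each run in its own terminal with its own data directory (paths assumed from step 2):

mongod --port 6001 --dbpath ./replica_01 --replSet my_replica_set
mongod --port 6002 --dbpath ./replica_02 --replSet my_replica_set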

5) Log in to the primary instance to configure it as part of the replica set
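For example, using the mongo shell against the port chosen above:

mongo --port 6000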


6) List all the db instances in the replica set initializer (rs.initiate)
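A minimal sketch of the initialization, run from the primary's shell and assuming all three instances are on localhost at the ports chosen above (the host names are an assumption):

rs.initiate({
  _id: "my_replica_set",
  members: [
    { _id: 0, host: "localhost:6000" },
    { _id: 1, host: "localhost:6001" },
    { _id: 2, host: "localhost:6002" }
  ]
})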

7) View the connection status of all replicated dbs
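From the shell, the status can be viewed with:

rs.status()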


We can see the allocation of roles: the node used for initialization is the primary,
and all the others are secondaries
8) Log in again to the primary_db shell

The shell prompt now shows that primary_db is the PRIMARY node for the replica set
my_replica_set:

my_replica_set:PRIMARY>
9) Similarly, logging in to a replica instance will show its status as SECONDARY

By default, read operations on secondary nodes are blocked (writes are only ever
accepted by the primary).

Allow read operations on a secondary database using:

use admin
db.getMongo().setSlaveOk()
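Note that setSlaveOk() is a legacy helper; in newer shell versions the equivalent calls are rs.secondaryOk() or db.getMongo().setSecondaryOk().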
10) Testing replication
a) Create a new collection students and insert a document into it from the primary
database
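A minimal sketch of this from the primary's shell; the database name test and the document fields are assumptions for illustration:

use test
db.students.insertOne({ name: "Alice", roll_no: 1 })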

b) Now check the replica databases


From replica_02
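To verify this on a secondary, connect to its port, enable secondary reads as in step 9, and query the collection (ports assumed from the earlier steps; the same applies to port 6002):

mongo --port 6001
use test
db.getMongo().setSlaveOk()
db.students.find()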

Consistency
Causal Consistency
If an operation logically depends on a preceding operation, then there is a causal
relationship between the two operations; a causally consistent system guarantees
that they are observed in that order (see the sketch after the server list below).
Servers:
1) Primary DB
Write operation performed on Primary DB

2) Replica 01
Received data from replica 02
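As a rough illustration (an assumed shell-level sketch, not the exact commands used in this experiment), a causally consistent session guarantees that a read issued after a write in the same session observes the effect of that write, even when the read is served by a secondary:

// from a shell connected to the primary
session = db.getMongo().startSession({ causalConsistency: true })
sdb = session.getDatabase("test")
sdb.students.insertOne({ name: "Bob", roll_no: 2 })  // write goes to the primary
sdb.students.find().readPref("secondary")            // read routed to a secondary, sees the write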

Conclusion - With this experiment, we understood the concepts of data replication
and consistency and implemented them on MongoDB. Here we performed data
replication by creating two replicas of the primary database.
