
DATABASE REPLICATION

Mrs Zubeda Shabani Kilua


MSc ISNS
Definition
• Database replication is the process of creating
copies of a database and storing them across
various on-premises or cloud destinations.
• It improves data availability and accessibility.
Every user connected to the system can access
copies of the same (up-to-date) data.
Benefits
• Higher data availability. Your overall system
will still be able to perform adequately even if
one of your replicated databases becomes
unavailable because you’ll have a copy of the
database.
• Reduced server load. A replicated, distributed
database requires less processing for each
server. This means higher performance for
queries.
• More reliable data. As part of the replication
process, data in target systems is processed
and updated to match that of the source
system which helps ensure data integrity.
• Less data movement. Having a distributed
database allows for versions of the data to be
closer to the point of transaction or data
entry.
• Better protection. Achieve redundancy to safeguard
the read performance and availability of mission-
critical databases and ensure business continuity.
• Lower latency. Having copies of your data in
multiple locations means more localized data access,
which can improve your network performance. This
is especially helpful to employees in satellite offices.
• Better application performance. Improve the
scalability and availability of database-dependent
applications.
Challenges for Database Replication

• Inconsistent data. Some of your data may not
correctly sync with the rest of your distributed
system when you’re copying data between
multiple sites at different intervals.
• Lost data. Some of your data may be lost if
database objects are incorrectly configured
within the source database or if the primary
key you use to verify data integrity in the
replica is incorrect.
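
One way to detect the inconsistent or lost data described above is to compare a row count and a checksum of the source table with those of the replica. The following is a minimal sketch only; the function name table_fingerprint and the sample rows are hypothetical and not part of the original slides.

```python
import hashlib

def table_fingerprint(rows):
    """Return (row count, SHA-256 digest) for a list of row tuples.

    Rows are sorted first so the result does not depend on read order.
    """
    digest = hashlib.sha256()
    for row in sorted(rows):
        digest.update(repr(row).encode("utf-8"))
    return len(rows), digest.hexdigest()

# Hypothetical sample data: the same rows read in a different order.
source_rows = [(1, "Asha"), (2, "Juma")]
replica_rows = [(2, "Juma"), (1, "Asha")]
# Hypothetical replica in which one value did not sync correctly.
stale_rows = [(1, "Asha"), (2, "Jum")]

in_sync = table_fingerprint(source_rows) == table_fingerprint(replica_rows)
is_stale = table_fingerprint(source_rows) != table_fingerprint(stale_rows)
print(in_sync, is_stale)
```

A mismatch only says that the two copies differ; finding which rows differ needs a row-by-row comparison on the primary key.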
Types of Database Replication
a. Full-table replication
b. Key-based incremental replication
c. Log-based replication
Full-table replication
• Full-table replication copies every piece of
data within a table from the database to the
cloud destination; this includes new, existing
and updated data.
• Advantages: Because this replicates the entire
table, you will always have the correct data set
after each sync and can ensure that all inserts,
updates and deletes are captured.
• Disadvantages: This is the least efficient type
of database replication and rather resource
intensive as you are copying every piece of
data within a table whether it has changed or
not.
• This can also lead to a burst load on the
source depending on the size and volume of
data within the tables.
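
As an illustration of full-table replication, the sketch below uses two in-memory SQLite databases to stand in for the source and the destination. The customers table, its columns and the function name full_table_replicate are assumptions made for the example. Every sync deletes the destination rows and copies the whole table again, which is why updates and deletes are captured but unchanged rows are also re-copied.

```python
import sqlite3

def full_table_replicate(source, dest, table):
    """Copy every row of `table` from source to dest, replacing the old copy."""
    rows = source.execute(f"SELECT id, name FROM {table}").fetchall()
    with dest:  # delete and re-insert are committed as one transaction
        dest.execute(f"DELETE FROM {table}")
        dest.executemany(f"INSERT INTO {table} (id, name) VALUES (?, ?)", rows)
    return len(rows)

# Hypothetical source and destination, both in memory for the example.
source = sqlite3.connect(":memory:")
dest = sqlite3.connect(":memory:")
for conn in (source, dest):
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")

source.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Asha"), (2, "Juma"), (3, "Neema")],
)
full_table_replicate(source, dest, "customers")

# An update and a hard delete on the source are both picked up by the next sync.
source.execute("UPDATE customers SET name = 'Asha K.' WHERE id = 1")
source.execute("DELETE FROM customers WHERE id = 2")
copied = full_table_replicate(source, dest, "customers")

replica_rows = dest.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
print(copied, replica_rows)
```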
Key-based incremental replication

• Key-based incremental replication is a
database replication method that uses a
replication key, such as a timestamp or an
integer key, to identify new and updated
records.
• Advantages: Key-based incremental
replication is an efficient type of database
replication as it only replicates updated and
inserted rows thus using fewer resources.
• Disadvantages: Any data that’s hard-deleted
from a database won’t be replicated in your
destination of choice without a lot of time and
effort put into processes that could identify
deletes.
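
The behaviour described above can be sketched with SQLite, assuming a hypothetical orders table whose updated_at column (an integer here) serves as the replication key. Each sync copies only rows whose key is greater than the last value seen. The example also shows the disadvantage: a row that is hard-deleted on the source stays in the destination.

```python
import sqlite3

SCHEMA = "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, updated_at INTEGER)"

def incremental_sync(source, dest, last_key):
    """Copy rows with updated_at > last_key and return the new bookmark."""
    rows = source.execute(
        "SELECT id, status, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_key,),
    ).fetchall()
    with dest:  # upsert the changed rows in one transaction
        dest.executemany(
            "INSERT OR REPLACE INTO orders (id, status, updated_at) VALUES (?, ?, ?)",
            rows,
        )
    return rows[-1][2] if rows else last_key

source = sqlite3.connect(":memory:")
dest = sqlite3.connect(":memory:")
source.execute(SCHEMA)
dest.execute(SCHEMA)

source.executemany("INSERT INTO orders VALUES (?, ?, ?)", [(1, "new", 100), (2, "new", 101)])
bookmark = incremental_sync(source, dest, 0)

# One update (captured) and one hard delete (not captured).
source.execute("UPDATE orders SET status = 'paid', updated_at = 102 WHERE id = 1")
source.execute("DELETE FROM orders WHERE id = 2")
bookmark = incremental_sync(source, dest, bookmark)

replica = dest.execute("SELECT id, status, updated_at FROM orders ORDER BY id").fetchall()
print(bookmark, replica)  # row 2 is still present in the replica
```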
Log-based replication

• Log-based replication copies changes by reading
a database’s binary log files, which record the
operations (such as inserts, updates and deletes)
performed within a database.
• Advantages: This type of database replication
is the most efficient, as it reads directly from
the binary log files and doesn’t compete with
other database queries.
• Disadvantages: Log-based replication is only
available for certain databases, and you may not
have access to your database’s logs if it is
hosted by a third party. Also, setting up
log-based replication can be very time-intensive,
difficult, and bug-prone if you build it yourself.
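
The idea behind log-based replication can be shown with a toy, in-memory change log. This is a simplified sketch and not the binary log format of any real database; the entry fields (pos, op, key, row) and the function name apply_log are assumptions made for the example. The replica replays only the entries after the last position it applied, so deletes are captured without querying the source tables.

```python
def apply_log(replica, log, start_pos):
    """Replay log entries with pos > start_pos; return the last applied position."""
    pos = start_pos
    for entry in log:
        if entry["pos"] <= start_pos:
            continue  # already applied in an earlier run
        if entry["op"] == "delete":
            replica.pop(entry["key"], None)
        else:  # insert and update both store the latest row image
            replica[entry["key"]] = entry["row"]
        pos = entry["pos"]
    return pos

# Hypothetical change log, in the order the source applied the operations.
log = [
    {"pos": 1, "op": "insert", "key": 1, "row": {"name": "Asha"}},
    {"pos": 2, "op": "insert", "key": 2, "row": {"name": "Juma"}},
    {"pos": 3, "op": "update", "key": 1, "row": {"name": "Asha K."}},
    {"pos": 4, "op": "delete", "key": 2, "row": None},
]

replica = {}
position = apply_log(replica, log[:2], 0)      # first run sees two entries
position = apply_log(replica, log, position)   # second run applies only pos 3 and 4
print(position, replica)
```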
Database Replication Method

• There are multiple methods for replicating
data from your database. The extensive list of
database replication methods allows you to
determine a method that suits your
infrastructure.
Methods
• Log-Based Incremental Replication.
• Key-Based Incremental Replication.
• Full Table Replication.
• Snapshot Replication.
• Transactional Replication.
• Merge Replication.
• Bidirectional Replication.
Replication process
• Identify your data source
The first step is to identify your primary data
source where data from your organization
originates. This could be any kind of database
on-premises or in the cloud. Next,
determine the destination you’ll replicate the
data to. Potential destinations are major cloud
data warehouses, data lakes or even another
database.
• Determine the scope of your database replication
➢ The next step is to consider the data you need to
replicate from your database.
➢ If you need to replicate an entire database, you
should opt for a full-table database replication
scheme. This ensures that all of your data is
available in your destination. However, if you only
need certain aspects of a database replicated (e.g.,
analytical data), you would select only the source
tables and columns that make up that part of your
database.
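
A partial replication scope is often written down as a mapping from each source table to the columns to copy. The sketch below assumes hypothetical table and column names and simply builds the extract query for each table in scope; a real replication tool would have its own configuration format.

```python
# Hypothetical scope: only these tables and columns are replicated.
REPLICATION_SCOPE = {
    "orders": ["id", "total"],
    "customers": ["id", "country"],
}

def build_extract_queries(scope):
    """Return one SELECT statement per table, limited to the chosen columns."""
    return {
        table: f"SELECT {', '.join(columns)} FROM {table}"
        for table, columns in scope.items()
    }

queries = build_extract_queries(REPLICATION_SCOPE)
for table, sql in queries.items():
    print(table, "->", sql)
```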
• Decide on a database replication frequency
➢ How often do you need the data replicated?
Synchronous replication writes each change to the
primary database and its replicas at the same time.
This is typically used for transactional applications
that require near real-time data updates. It
uses more bandwidth, but it keeps data across
the network synchronized.
➢ Asynchronous replication means that data is
written to the primary database first. Then the
data is replicated to the destination in batches
anywhere from every few minutes to daily.
➢ It is more cost-effective to have the data in
your database sync on a scheduled timeframe,
but there’s also the risk of data loss if recent
changes aren’t properly replicated.
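
The difference between the two modes can be illustrated with a small in-memory simulation. The Primary class below is hypothetical and uses plain dictionaries in place of real databases. In synchronous mode the replica is updated before the write returns; in asynchronous mode changes wait in a queue until the next batch, and anything still in the queue would be lost if the primary failed.

```python
from collections import deque

class Primary:
    """Toy primary database that replicates writes to one replica dictionary."""

    def __init__(self, replica, synchronous):
        self.data = {}
        self.replica = replica
        self.synchronous = synchronous
        self.pending = deque()  # changes not yet shipped (asynchronous mode only)

    def write(self, key, value):
        self.data[key] = value
        if self.synchronous:
            self.replica[key] = value          # replica updated before write returns
        else:
            self.pending.append((key, value))  # shipped later, in a batch

    def flush(self):
        """Ship all pending changes to the replica; return how many were sent."""
        shipped = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.replica[key] = value
            shipped += 1
        return shipped

sync_replica = {}
Primary(sync_replica, synchronous=True).write("a", 1)

async_replica = {}
primary = Primary(async_replica, synchronous=False)
primary.write("a", 1)
primary.write("b", 2)
before_batch = dict(async_replica)  # replica lags until the batch runs
shipped = primary.flush()

print(sync_replica, before_batch, shipped, async_replica)
```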
• Choose a database replication type and
method
➢ Decide on your database replication type: full-table,
key-based or log-based. The right choice
will depend on factors like your source and
destination pairing, the amount of data you
need to replicate and the resources available
for your database replication.
• Use a database replication tool
➢ Database replication improves the availability of
your data by storing it in multiple locations and
potentially reducing the load on your source
database. To ensure your data is properly
replicated, you’ll need to select the right database
replication tool for your use case. This will keep
your systems running smoothly and ensure you
can get the greatest value out of your data.
