
DATA ARCHIVING SYSTEMS

DR.C.V.NARASIMHA MURTHY

Data backup and data archiving differ in how the data is scanned, characterized, and retained
for use; how easily it can be accessed; how long it is stored; and what the end goal for that
data is.

A backup is a copy of the organization’s active, current operational data. This includes any data
that is being used, changed, or accessed regularly. When your system creates a backup copy of
your data, it doesn’t affect the original files, which remain in the same location. These backup
files can be used in recoveries to restore the data to some previous point should the data become
corrupted or be lost.

A backup system should store data for a much shorter time than an archive does. Operational
importance dictates how often the system updates backup data, and that may happen frequently,
even several times a day. Searches against backup data are limited to a single filesystem,
server/VM, or object at a single point in time (e.g. restore all files from /home1 on
fileserver1 to the way they looked last Thursday). Backup systems also do not typically search
by the contents of a file, only by the file, server, or database name.
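
The kind of lookup a backup system performs can be illustrated with a minimal Python sketch. It is not how any particular backup product works; the catalog layout, the sample entries, and the function name select_for_restore are all assumptions made for illustration. The point is that the query is keyed on one server, one path, and one point in time, and never looks inside the files.

```python
from datetime import datetime

# Hypothetical backup catalog: (server, path, time the snapshot was taken).
CATALOG = [
    ("fileserver1", "/home1/report.txt", datetime(2022, 8, 4, 2, 0)),
    ("fileserver1", "/home1/report.txt", datetime(2022, 8, 11, 2, 0)),
    ("fileserver1", "/home1/notes.txt", datetime(2022, 8, 4, 2, 0)),
    ("fileserver2", "/home1/report.txt", datetime(2022, 8, 4, 2, 0)),
]


def select_for_restore(catalog, server, path_prefix, as_of):
    """For each path on `server` under `path_prefix`, pick the newest
    snapshot taken at or before `as_of`. File contents are never read."""
    latest = {}
    for srv, path, taken in catalog:
        if srv != server or not path.startswith(path_prefix) or taken > as_of:
            continue
        if path not in latest or taken > latest[path]:
            latest[path] = taken
    return latest


# Restore /home1 on fileserver1 as it looked on 5 August 2022.
to_restore = select_for_restore(CATALOG, "fileserver1", "/home1/", datetime(2022, 8, 5))
```

Here to_restore maps each file under /home1 to the snapshot that would be used for the restore; the later snapshot of report.txt and the copy on fileserver2 are ignored.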

In contrast, archives serve as data repositories for information that may not be mission critical,
but which must be retained for long periods of time. For example, organizations often archive
regulatory compliance data for as long as they are required to keep it.

Archive files are typically no longer active or current, and they do not change often or need
to be located frequently. Their absence from regular storage is not disruptive to normal
operations and, in fact, saves time and money.

Unlike users of backup systems, users of archive storage solutions search across many files,
servers/VMs, and objects over a range of time (e.g. find all files from the last three years
containing the phrase "apollo"). Archive storage systems more often than not retrieve data
based on its content, not its name or location. As more and more inactive data is sent to the
archive, searchability becomes more critical, especially for compliance reasons.
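
A content-based archive search can be sketched in Python as follows. This is only an illustration under stated assumptions: the sample documents, the field names, and the functions build_index and search are hypothetical, and the sketch matches a single word rather than a full phrase. It builds a simple inverted index (word to documents) so that a query can span every server and a range of dates at once.

```python
from collections import defaultdict
from datetime import date

# Hypothetical archived documents from several servers/VMs.
DOCS = [
    {"server": "fileserver1", "path": "/home1/mission.txt",
     "archived": date(2021, 3, 10), "text": "Apollo launch checklist"},
    {"server": "fileserver2", "path": "/projects/budget.txt",
     "archived": date(2020, 6, 1), "text": "Quarterly budget review"},
    {"server": "vm7", "path": "/mail/2018/memo.txt",
     "archived": date(2018, 1, 15), "text": "Memo about the apollo program."},
    {"server": "vm7", "path": "/mail/2022/memo.txt",
     "archived": date(2022, 2, 2), "text": "Follow-up on Apollo, final."},
]


def build_index(documents):
    """Map each lower-cased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, doc in enumerate(documents):
        for word in doc["text"].lower().split():
            index[word.strip(".,;:!?\"'()")].add(doc_id)
    return index


def search(documents, index, word, since):
    """Return paths of documents archived on or after `since` that contain `word`."""
    hits = index.get(word.lower(), set())
    return sorted(documents[i]["path"] for i in hits
                  if documents[i]["archived"] >= since)


INDEX = build_index(DOCS)
recent_apollo = search(DOCS, INDEX, "apollo", date(2019, 8, 1))
```

The query ignores which server or directory a file came from; only its content and archive date matter, which is the opposite of the backup lookup described earlier.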

In addition, backups and archives differ in their data integrity needs. Data integrity over time
is more important to an enterprise data archiving solution than to a backup system, because
archiving large amounts of data over long periods increases the risk of corruption and other
problems. Safeguards must be put in place to protect against bit rot, the common term for data
corruption over time, as well as against accidental or malicious deletion or corruption.
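
One common safeguard of this kind is to record a checksum for every archived object and re-verify it periodically. The Python sketch below shows the idea under stated assumptions: the archive is modeled as an in-memory dictionary of bytes, and the names build_manifest and verify are hypothetical. It detects corruption and deletion; repairing the data would additionally need a second good copy, which is not shown.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(objects):
    """Record a digest for every archived object at ingest time."""
    return {name: sha256_of(data) for name, data in objects.items()}


def verify(objects, manifest):
    """Return (name, problem) pairs for objects that are missing or
    whose current digest no longer matches the recorded one."""
    problems = []
    for name, expected in manifest.items():
        data = objects.get(name)
        if data is None:
            problems.append((name, "missing"))
        elif sha256_of(data) != expected:
            problems.append((name, "corrupted"))
    return problems


# Hypothetical archive contents at ingest time.
archive = {"audit-2019.csv": b"id,amount\n1,100\n", "policy.txt": b"retain 7 years"}
manifest = build_manifest(archive)

# Simulate bit rot (one changed byte) and an accidental deletion.
archive["audit-2019.csv"] = b"id,amount\n1,900\n"
del archive["policy.txt"]
report = verify(archive, manifest)
```

After the simulated damage, report flags audit-2019.csv as corrupted and policy.txt as missing.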

Dr. C. V. Narasimha Murthy. Notes made at a student's request, Aug 2022.
