
Business Continuity Overview


A good copy is one that is:

- Consistent: captures all relevant data as a coordinated snapshot
- Restorable: either at the volume level or file level
- Durable
- Non-intrusive: does not interrupt current operations when captured
- Near Current


The Disaster Recovery timeline is defined around two key objectives:

- Recovery Point Objective (RPO): How much data loss can you tolerate?
  - 30 minutes
  - 60 minutes
  - 24 hours
  - 1 week
  - Zero is a valid answer!
- Recovery Time Objective (RTO): How long can you tolerate being off-line?
  - Time is defined as not just the time to recover the data, but the time to bring business operations back online
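The two objectives can be checked against a proposed schedule with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes copies are taken at a fixed interval, so the worst-case data loss equals that interval.

```python
from datetime import timedelta


def worst_case_data_loss(copy_interval: timedelta) -> timedelta:
    """Worst case: the disaster strikes just before the next copy is taken."""
    return copy_interval


def meets_rpo(copy_interval: timedelta, rpo: timedelta) -> bool:
    """True if a fixed-interval copy schedule never loses more than the RPO."""
    return worst_case_data_loss(copy_interval) <= rpo


def meets_rto(data_restore: timedelta,
              service_restart: timedelta,
              rto: timedelta) -> bool:
    """RTO covers restoring the data AND bringing operations back online."""
    return data_restore + service_restart <= rto


if __name__ == "__main__":
    # Copies every 15 minutes against a 30-minute RPO.
    print(meets_rpo(timedelta(minutes=15), timedelta(minutes=30)))
    # 50 minutes of restore plus 30 minutes of restart against a 1-hour RTO.
    print(meets_rto(timedelta(minutes=50), timedelta(minutes=30),
                    timedelta(hours=1)))
```

In this simplified model, no periodic schedule satisfies an RPO of zero, because any positive interval between copies leaves a window of possible loss.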


Procedures should be written so that secondary personnel are able to follow them:

- In a disaster situation, primary personnel may not be able to reach the recovery site
- For example, persons other than the primary email administrator should be able to recover the email servers and confirm operation

Business leaders and IT personnel must agree on Recovery Time Objectives and Recovery Point Objectives:

- RPO and RTO are essentially service level agreements for disaster recovery

Frequent review and updates to the plan are necessary to accommodate:

- Changes to server or application software
- Additions to the IT environment, such as a new database or application server
- Deployment of new IT tools, such as Dell EqualLogic Auto-Snapshot Manager

Periodic testing of the plan is necessary to ensure that:

- The plan covers all necessary IT infrastructure
- Personnel are capable of carrying out recovery tasks
- DR infrastructure fully supports all operations
  - Servers
  - Storage
  - Applications
  - Data
  - Network


Each of the following plans is described in more detail later in this lesson.

- Data Backup Plan
  - The plan that describes the detailed backup strategy for each recoverable IT element (server, database, etc.)
- Recovery plans provide detailed steps for recovering each element from those backups
  - Server Recovery Plan
  - Application Recovery Plan
  - Database Recovery Plan
  - Data Network Recovery Plan
  - Voice Network Recovery Plan
- The Test Plan specifies frequency of testing, personnel assignments, pass/fail criteria, etc.
- The Plan Maintenance specifies the update schedule for all the DR plans; it also specifies events that would trigger an out-of-schedule update, such as deployment of a new application


1. On the Backup Server, the backup process starts, which can be initiated manually or through a schedule:
   - The Backup Server notifies VSS on the target server that the data should be backed up.
   - VSS notifies NTFS to prepare for the backup (for example, quiesce and flush the buffer cache).
   - VSS notifies the PS Series group to create a snapshot.
   - PS Series group creates the snapshot.
   - VSS notifies NTFS to resume operation.
2. The Backup Server backs up the data to backup media using the snapshot:
   - VSS notifies the Backup Server to import (mount) the snapshot, and then the Backup Server copies the data from the snapshot to backup media, either disk or tape (such as a PS Series volume located in the SAN or a tape device that is separate from the SAN).
   - The Backup Server notifies VSS that the backup is complete.
3. VSS notifies the PS Series group to delete the snapshot.
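The ordering of the steps above can be sketched as a small simulation. This is not the VSS or PS Series API: `SimulatedVolume`, `run_backup`, and the event names are hypothetical stand-ins that only record the sequence, with a context manager ensuring the file system resumes even if the snapshot fails.

```python
from contextlib import contextmanager
from typing import Iterator, List


class SimulatedVolume:
    """Hypothetical stand-in for a volume on the target server."""

    def __init__(self) -> None:
        self.log: List[str] = []
        self.quiesced = False
        self.snapshot_exists = False

    @contextmanager
    def quiesce(self) -> Iterator[None]:
        # Flush buffers and pause writes; always resume on exit.
        self.log.append("quiesce")
        self.quiesced = True
        try:
            yield
        finally:
            self.quiesced = False
            self.log.append("resume")

    def create_snapshot(self) -> None:
        # A snapshot taken while writes are in flight would not be consistent.
        if not self.quiesced:
            raise RuntimeError("snapshot requested while writes are in flight")
        self.snapshot_exists = True
        self.log.append("create_snapshot")

    def delete_snapshot(self) -> None:
        self.snapshot_exists = False
        self.log.append("delete_snapshot")


def run_backup(volume: SimulatedVolume) -> List[str]:
    # Step 1: prepare the file system, take the snapshot, resume operation.
    with volume.quiesce():
        volume.create_snapshot()
    # Step 2: mount the snapshot and copy its data to backup media.
    volume.log.append("mount_snapshot")
    volume.log.append("copy_to_backup_media")
    # Step 3: the snapshot is no longer needed once the backup completes.
    volume.delete_snapshot()
    return volume.log


if __name__ == "__main__":
    print(run_backup(SimulatedVolume()))
```

Note that the application is paused only for the duration of the snapshot; the slower copy to backup media runs against the snapshot after normal operation has resumed.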


The server recovery plan specifies how to return the processing infrastructure to service following either a physical disaster or a data disaster.

- May also include application recovery (recover both servers and applications in a single step)
  - Email stores
  - Databases
- Specifies processes for recovery at both the local site (for example, after a hardware failure or software failure) and the remote site (fire, flood, etc.)
- Specifies return to normal operation at either the original site, or at a new site if the original site is permanently lost

Dell EqualLogic enhances the plan by:

- Enabling frequent snapshots/replications of the system volume (Boot from SAN)
- Supporting local recovery, remote recovery, fast-failback to the primary site, and multiple recovery points


The database recovery plan specifies restoration processes to recover specific databases that are critical to business operations.

- Recovery processes for both the local site and the remote (DR) site are defined
  - In-place recovery/rollback
  - Restore database at remote site on new server

How Dell EqualLogic enhances the plan:

- With HIT 3.0, application-consistent snapshots and replications for Microsoft SQL Server 2005 can be fully automated
- All volumes that contain database elements are identified and backed up automatically per a user-defined schedule


Cold

- Systems are idled and are not available for use during backup operation
- Application recovery is the same as normal system startup
- Good old backup

Crash Consistent

- A copy of all relevant data is captured at a single point in time
- Application recovery is as if power was lost

Application Consistent

- Application supports the ability to create a coordinated snapshot of its data set
- Application supports recovery from the coordinated snapshot

Inconsistent

- Data are not coordinated; usefulness is limited (but may be better than nothing)


Different types of backup offer varying degrees of:

- Impact to the server being backed up
  - Heavy burden on the server is undesirable, because it necessitates doing backups during a specified maintenance window
  - In the worst case, access to the server is completely restricted during the backup operation (Cold backup)
- Impact to users who are using the server
  - Will the users see poor performance when backup is running?
- Ease of recovery
  - The manner in which the backup is captured will impact both Recovery Time and Recovery Point objectives
  - Application Consistent copies offer the easiest and fastest recovery
  - Crash Consistent copies are easy to create, but recovery may be more time-consuming and laborious


Cold backups are what most people think of when they think of backup.

- Often the cheapest and easiest to implement
  - May be as simple as using command line copy commands
- Requires complete stoppage of the application
- Requires time to complete the backup operation
- Backed-up data is not immediately accessible. The data will be in a format defined by the backup application, and must be restored using the same software used to perform the backup


- Backup window is scheduled during off-hours
- Users are notified in advance of service outage
  - Email
  - Database
  - Phone/Internet access
- When window begins, applications are halted
- Backup application runs and copies all relevant data to tape or disk
- Applications are re-started and users are notified
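The halt, copy, restart sequence above can be sketched in a few lines. This is a minimal illustration, not a backup product: `cold_backup`, `stop_app`, and `start_app` are hypothetical names, and a plain directory copy stands in for the backup application.

```python
import shutil
from pathlib import Path
from typing import Callable


def cold_backup(data_dir: Path,
                backup_dir: Path,
                stop_app: Callable[[], None],
                start_app: Callable[[], None]) -> None:
    """Halt the application, copy its data, then restart it.

    backup_dir must not already exist; shutil.copytree creates it.
    """
    stop_app()  # the application is unavailable from this point
    try:
        shutil.copytree(data_dir, backup_dir)
    finally:
        # Restart even if the copy fails, so the outage is not prolonged.
        start_app()
```

The outage lasts as long as the copy takes, which is why the window is scheduled during off-hours.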


Crash-consistent copies are point-in-time copies of a volume or a set of related volumes.

- May be stored locally (snapshot) or remotely (replica)
- Crash-consistent copies are a very common form of data protection
- Application data is in the same state as if a power failure occurred
- Application must run recovery procedures to make the data usable
  - Procedures are typically run automatically when a server reboots


Current technology enables the creation of application-consistent data copies.

- Point-in-time snapshot capture is coordinated with the application
  - Application is momentarily paused (quiesced) just prior to the snapshot being taken
  - All buffers and cache are flushed, so no data remains unwritten to disk
  - Snapshot is captured of the volume or volume set associated with the application
  - Application is un-paused


Inconsistent snapshots are snapshots that span multiple volumes, but which are not captured in a coordinated manner.

- For example, a database resides on 3 separate volumes, but the volumes are backed up individually at different points in time
- Applications cannot recover from inconsistent snapshots
- It may be possible to manually recover pieces of data
  - For example, recovery of an accidentally deleted file
- Inconsistent snapshots should be avoided (but they may be better than nothing)
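One rough way to spot an uncoordinated set is to compare the capture times of the snapshots that belong together. The sketch below is a simplification under stated assumptions: the function name and the volume names are hypothetical, and matching timestamps are treated as a proxy for coordinated capture, which a real storage system would enforce by capturing the volume set atomically.

```python
from datetime import datetime, timedelta
from typing import Dict


def is_coordinated(snapshot_times: Dict[str, datetime],
                   tolerance: timedelta = timedelta(0)) -> bool:
    """True if every volume in the set was captured within `tolerance`."""
    if not snapshot_times:
        return True  # an empty set has nothing to disagree about
    times = list(snapshot_times.values())
    return max(times) - min(times) <= tolerance


if __name__ == "__main__":
    t = datetime(2024, 1, 1, 2, 0, 0)
    # A database on 3 volumes, each backed up an hour apart.
    staggered = {
        "db-data": t,
        "db-logs": t + timedelta(hours=1),
        "db-index": t + timedelta(hours=2),
    }
    print(is_coordinated(staggered))
```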


Snapshots may be crash-consistent or application-consistent depending on implementation.

- Crash-consistent snapshots are easier to implement, but require more effort during a recovery
- Application-consistent snapshots involve more planning and setup, but offer faster recovery

Locally stored snapshots support favorable RTO and RPO times, but they do NOT protect against a site outage.

Snapshot techniques:

- Copy-on-write
- Allocate-on-write

Snapshot considerations include:

- The time that it takes to perform the snapshot
- The amount of storage space used by the snapshot
- The burden placed on the application server by the snapshot operation
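As an illustration of the first technique, here is a toy copy-on-write volume. It is a teaching sketch, not a description of how PS Series arrays implement snapshots; the class and method names are hypothetical. Taking a snapshot copies nothing, and an old block is preserved only when it is about to be overwritten.

```python
from typing import Dict, List


class CopyOnWriteVolume:
    """Toy block volume with copy-on-write snapshots."""

    def __init__(self, blocks: List[bytes]) -> None:
        self.blocks = list(blocks)
        # One dict per snapshot: block index -> block content at snapshot time.
        self.snapshots: List[Dict[int, bytes]] = []

    def snapshot(self) -> int:
        # Near-instant: no data is copied when the snapshot is taken.
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index: int, data: bytes) -> None:
        # Preserve the old block in every snapshot that has not saved it yet.
        for saved in self.snapshots:
            saved.setdefault(index, self.blocks[index])
        self.blocks[index] = data

    def read_snapshot(self, snap_id: int, index: int) -> bytes:
        # Unchanged blocks are read from the live volume.
        return self.snapshots[snap_id].get(index, self.blocks[index])

    def snapshot_size(self, snap_id: int) -> int:
        # Space used grows with the number of blocks changed since capture.
        return len(self.snapshots[snap_id])
```

This shows two of the considerations listed above: the snapshot itself takes almost no time, and its storage use depends on how much the volume changes afterward. The cost is an extra copy on the first write to each block.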


Typical replication solutions are often time-consuming and require you to back up data and manually transport the backups to a different location. Dell EqualLogic's Snapshots and Auto-Replication capability makes end-to-end data protection possible for customers.

At the primary site, snapshots are used to provide quick recovery, based on:

- Volume changes
- Snapshot schedule
- Risk
- Need to recover

At the remote site:

- Critical volumes or possibly all volumes are replicated
- Backup to tape occurs at the replication secondary site, allowing for all backups to happen at a central site
- Secondary site is available for operation if the primary site should fail

Replication with Dell EqualLogic PS Series storage arrays is performed between groups:

- Primary site is one group
- Secondary site is a second group
