Professional Documents
Culture Documents
Disaster Recovery - Wikipedia
Disaster Recovery - Wikipedia
Disaster Recovery involves a set of policies, t ools and procedures t o enable t he recovery or
cont inuat ion of vit al t echnology infrast ruct ure and syst ems following a nat ural or human-induced
disast er. Disast er recovery focuses on t he IT or t echnology syst ems support ing crit ical business
funct ions,[1] as opposed t o business cont inuit y, which involves keeping all essent ial aspect s of a
business funct ioning despit e significant disrupt ive event s. Disast er recovery can t herefore be
considered a subset of business cont inuit y.[2][3] Disast er Recovery assumes t hat t he primary sit e
is not recoverable (at least for some t ime) and represent s a process of rest oring dat a and
services t o a secondary survived sit e, which is opposit e t o t he process of rest oring back t o it s
original place
IT Service Continuity
IT Service Continuity[4][5] (ITSC) is a subset of business cont inuit y planning (BCP)[6] and
encompasses IT disast er recovery planning and wider IT resilience planning. It also incorporat es
t hose element s of IT infrast ruct ure and services which relat e t o communicat ions such as (voice)
t elephony and dat a communicat ions.
The ITSC Plan reflect s Recovery Point Object ive (RPO - recent t ransact ions) and Recovery
Time Object ive (RTO - t ime int ervals).
Planning includes arranging for backup sit es, be t hey hot , warm, cold, or st andby sit es, wit h
hardware as needed for cont inuit y.
In 2008, t he Brit ish St andards Inst it ut ion launched a specific st andard connect ed and support ing
t he Business Cont inuit y St andard BS 25999 t it led BS25777, specifically t o align comput er
cont inuit y wit h business cont inuit y. This was wit hdrawn following t he publicat ion in March 2011
of ISO/IEC 27031 "Securit y t echniques — Guidelines for informat ion and communicat ion
t echnology readiness for business cont inuit y".[7]
The Recovery Time Objective (RTO)[9][10] is t he t arget ed durat ion of t ime and a service level
wit hin which a business process must be rest ored aft er a disast er (or disrupt ion) in order t o avoid
unaccept able consequences associat ed wit h a break in business cont inuit y.[11]
Schematic representation of the terms RPO and RTO. In this example, the agreed values of RPO and RTO are not fulfilled.
In accept ed business cont inuit y planning met hodology, t he RTO is est ablished during t he
Business Impact Analysis (BIA) by t he owner of a process, including ident ifying opt ions t ime
frames for alt ernat e or manual workarounds.
In a good deal of t he lit erat ure on t his subject , RTO is spoken of as a complement of Recovery
Point Object ive (RPO), wit h t he t wo met rics describing t he limit s of accept able or "t olerable"
ITSC performance in t erms of time lost (RTO) from normal business process funct ioning, and in
t erms of dat a lost or not backed up during t hat period of t ime (RPO) respect ively.[11][12]
A Forbes overview[9] not ed t hat it is Recovery Time Actual (RTA) which is "t he crit ical met ric for
business cont inuit y and disast er recovery."
RTA is est ablished during exercises or act ual event s. The business cont inuit y group t imes
rehearsals (or act uals) and makes needed refinement s.[9][13]
A Recovery Point Objective (RPO) is defined by business cont inuit y planning. It is t he maximum
t arget ed period in which dat a (t ransact ions) might be lost from an IT service due t o a major
incident .[11]
If RPO is measured in minut es (or even a few hours), t hen in pract ice, off-sit e mirrored backups
must be cont inuously maint ained; a daily off-sit e backup on t ape will not suffice.[14]
Recovery t hat is not inst ant aneous will rest ore dat a/t ransact ions over a period of t ime and do so
wit hout incurring significant risks or significant losses.[11]
RPO measures t he maximum t ime period in which recent dat a might have been permanent ly lost
in t he event of a major incident and is not a direct measure of t he quant it y of such loss. For
inst ance, if t he BC plan is "rest ore up t o last available backup", t hen t he RPO is t he maximum
int erval bet ween such backup t hat has been safely vault ed off-sit e.
Business impact analysis is used t o det ermine RPO for each service and RPO is not det ermined
by t he exist ent backup regime. When any level of preparat ion of off-sit e dat a is required, t he
period during which dat a might be lost oft en st art s near t he t ime of t he beginning of t he work t o
prepare backups, not t he t ime t he backups are t aken off-sit e.[12]
RTO and t he RPO must be balanced, t aking business risk int o account , along wit h all t he ot her
major syst em design crit eria.[17]
RPO is t ied t o t he t imes backups are sent offsit e. Offsit ing via synchronous copies t o an offsit e
mirror allows for most unforeseen difficult y. Use of physical t ransport at ion for t apes (or ot her
t ransport able media) comfort ably covers some backup needs at a relat ively low cost . Recovery
can be enact ed at a predet ermined sit e. Shared offsit e space and hardware complet es t he
package needed.[18]
For high volumes of high value t ransact ion dat a, t he hardware can be split across t wo or more
sit es; split t ing across geographic areas adds resiliency.
History
Planning for disast er recovery and informat ion t echnology (IT) developed in t he mid t o lat e
1970s as comput er cent er managers began t o recognize t he dependence of t heir organizat ions
on t heir comput er syst ems.
At t hat t ime, most syst ems were bat ch-orient ed mainframes. Anot her offsit e mainframe could
be loaded from backup t apes pending recovery of t he primary sit e; downt ime was relat ively less
crit ical.
The disast er recovery indust ry[19][20] developed t o provide backup comput er cent ers. One of t he
earliest such cent ers was locat ed in Sri Lanka (Sungard Availabilit y Services, 1978).[21][22]
During t he 1980s and 90s, as int ernal corporat e t imesharing, online dat a ent ry and real-t ime
processing grew, more availabilit y of IT syst ems was needed.
Regulat ory agencies became involved even before t he rapid growt h of t he Int ernet during t he
2000s; object ives of 2, 3, 4 or 5 nines (99.999%) were oft en mandat ed, and high-availabilit y
solut ions for hot -sit e facilit ies were sought .
IT Service Cont inuit y is essent ial for many organizat ions in t he implement at ion of Business
Cont inuit y Management (BCM) and Informat ion Securit y Management (ICM) and as part of t he
implement at ion and operat ion informat ion securit y management as well as business cont inuit y
management as specified in ISO/IEC 27001 and ISO 22301 respect ively.
The rise of cloud comput ing since 2010 cont inues t hat t rend: nowadays, it mat t ers even less
where comput ing services are physically served, just so long as t he net work it self is sufficient ly
reliable (a separat e issue, and less of a concern since modern net works are highly resilient by
design). 'Recovery as a Service' (RaaS) is one of t he securit y feat ures or benefit s of cloud
comput ing being promot ed by t he Cloud Securit y Alliance.[23]
Classification of disasters
Disast ers can be t he result of t hree broad cat egories of t hreat s and hazards. The first cat egory
is nat ural hazards t hat include act s of nat ure such as floods, hurricanes, t ornadoes, eart hquakes,
and epidemics. The second cat egory is t echnological hazards t hat include accident s or t he
failures of syst ems and st ruct ures such as pipeline explosions, t ransport at ion accident s, ut ilit y
disrupt ions, dam failures, and accident al hazardous mat erial releases. The t hird cat egory is
human-caused t hreat s t hat include int ent ional act s such as act ive assailant at t acks, chemical or
biological at t acks, cyber at t acks against dat a or infrast ruct ure, and sabot age. Preparedness
measures for all cat egories and t ypes of disast ers fall int o t he five mission areas of prevent ion,
prot ect ion, mit igat ion, response, and recovery.[24]
Recent research support s t he idea t hat implement ing a more holist ic pre-disast er planning
approach is more cost -effect ive in t he long run. Every $1 spent on hazard mit igat ion (such as a
disast er recovery plan) saves societ y $4 in response and recovery cost s.[25]
2015 disast er recovery st at ist ics suggest t hat downt ime last ing for one hour can cost
Control measures
Cont rol measures are st eps or mechanisms t hat can reduce or eliminat e various t hreat s for
organizat ions. Different t ypes of measures can be included in a disast er recovery plan (DRP).
Disast er recovery planning is a subset of a larger process known as business cont inuit y planning
and includes planning for resumpt ion of applicat ions, dat a, hardware, elect ronic communicat ions
(such as net working), and ot her IT infrast ruct ure. A business cont inuit y plan (BCP) includes
planning for non-IT relat ed aspect s such as key personnel, facilit ies, crisis communicat ion, and
reput at ion prot ect ion and should refer t o t he disast er recovery plan (DRP) for IT-relat ed
infrast ruct ure recovery/cont inuit y.
IT disast er recovery cont rol measures can be classified int o t he following t hree t ypes:
1. Prevent ive measures – Cont rols aimed at prevent ing an event from occurring.
2. Det ect ive measures – Cont rols aimed at det ect ing or discovering unwant ed event s.
3. Correct ive measures – Cont rols aimed at correct ing or rest oring t he syst em aft er a disast er
or an event .
Good disast er recovery plan measures dict at e t hat t hese t hree t ypes of cont rols be
document ed and exercised regularly using so-called "DR t est s".
Strategies
Prior t o select ing a disast er recovery st rat egy, a disast er recovery planner first refers t o t heir
organizat ion's business cont inuit y plan, which should indicat e t he key met rics of Recovery Point
Object ive and Recovery Time Object ive.[28] Met rics for business processes are t hen mapped t o
t heir syst ems and infrast ruct ure.[29]
Failure t o properly plan can ext end t he disast er's impact .[30] Once met rics have been mapped,
t he organizat ion reviews t he IT budget ; RTO and RPO met rics must fit wit h t he available budget .
A cost -benefit analysis oft en dict at es which disast er recovery measures are implement ed.
Adding cloud-based backup t o t he benefit s of local and offsit e t ape archiving, t he New York
Times wrot e, "adds a layer of dat a prot ect ion."[31]
backups made t o disk on-sit e and aut omat ically copied t o off-sit e disk, or made direct ly t o
off-sit e disk
replicat ion of dat a t o an off-sit e locat ion, which overcomes t he need t o rest ore t he dat a (only
t he syst ems t hen need t o be rest ored or synchronized), oft en making use of st orage area
net work (SAN) t echnology
Privat e Cloud solut ions which replicat e t he management dat a (VMs, Templat es and disks) int o
t he st orage domains which are part of t he privat e cloud set up. These management dat a are
configured as a xml represent at ion called OVF (Open Virt ualizat ion Format ), and can be
rest ored once a disast er occurs.
Hybrid Cloud solut ions t hat replicat e bot h on-sit e and t o off-sit e dat a cent ers. These
solut ions provide t he abilit y t o inst ant ly fail-over t o local on-sit e hardware, but in t he event of
a physical disast er, servers can be brought up in t he cloud dat a cent ers as well.
t he use of high availabilit y syst ems which keep bot h t he dat a and syst em replicat ed off-sit e,
enabling cont inuous access t o syst ems and dat a, even aft er a disast er (oft en associat ed wit h
cloud st orage)[32]
In many cases, an organizat ion may elect t o use an out sourced disast er recovery provider t o
provide a st and-by sit e and syst ems rat her t han using t heir own remot e facilit ies, increasingly via
cloud comput ing.
In addit ion t o preparing for t he need t o recover syst ems, organizat ions also implement
precaut ionary measures wit h t he object ive of prevent ing a disast er in t he first place. These may
include:
local mirrors of syst ems and/or dat a and use of disk prot ect ion t echnology such as RAID
surge prot ect ors — t o minimize t he effect of power surges on delicat e elect ronic equipment
use of an unint errupt ible power supply (UPS) and/or backup generat or t o keep syst ems going
in t he event of a power failure
fire prevent ion/mit igat ion syst ems such as alarms and fire ext inguishers
Disaster Recovery as a Service DRaaS is an arrangement wit h a t hird part y, a vendor.[33] Commonly
offered by Service Providers as part of t heir service port folio.
Alt hough vendor list s have been published, disaster recovery is not a product , it 's a service, even
t hough several large hardware vendors have developed mobile/modular offerings t hat can be
inst alled and made operat ional in very short t ime.
Google (Google Modular Dat a Cent er) has developed syst ems t hat could be used for t his
purpose.[35][36]
Bull (mobull)[37]
See also
Cont inuous dat a prot ect ion Recovery Consist ency Object ive
References
4. M. Niemimaa; Steven Buchanan (March 2017). "Information systems continuity process" (https://dl.ac
m.org/citation.cfm?id=3062955) . ACM.com (ACM Digital Library).
7. Forum, Business Continuity (2012-05-03). "ISO 22301 to be published Mid May - BS 25999-2 to be
withdrawn" (https://www.continuityforum.org/content/news/165318/iso-business-continuity-standard-2
2301-replace-bs-25999-2) . www.continuityforum.org. Retrieved 2021-11-20.
10. "Three Reasons You Can't Meet Your Disaster Recovery Time" (https://www.forbes.com/sites/sungarda
s/2013/10/29/three-reasons-you-cant-meet-your-disaster-recovery-time-objectives) . Forbes. October
10, 2013.
12. "How to fit RPO and RTO into your backup and recovery plans" (https://searchstorage.techtarget.com/f
eature/What-is-the-difference-between-RPO-and-RTO-from-a-backup-perspective) . SearchStorage.
Retrieved 2019-05-20.
17. Peter H. Gregory (2011-03-03). "Setting the Maximum Tolerable Downtime -- setting recovery objectives"
(https://books.google.com/books?id=YC49DXW-_60C&pg=PA20) . IT Disaster Recovery Planning For
Dummies. Wiley. pp. 19–22. ISBN 978-1118050637.
18. William Caelli; Denis Longley (1989). Information Security for Managers (https://books.google.com/boo
ks?isbn=1349101370) . p. 177. ISBN 1349101370.
21. Charlie Taylor (June 30, 2015). "US tech firm Sungard announces 50 jobs for Dublin" (https://www.irishti
mes.com/business/technology/us-tech-firm-sungard-announces-50-jobs-for-dublin-1.2267857) . The
Irish Times. "Sungard .. founded 1978"
22. Cassandra Mascarenhas (November 12, 2010). "SunGard to be a vital presence in the banking industry"
(http://www.ft.lk/it-telecom-tech/sungard-to-be-a-vital-presence-in-the-banking-industry/50-7581) .
Wijeya Newspapers Ltd. "SunGard ... Sri Lanka’s future."
24. "Threat and Hazard Identification and Risk Assessment (THIRA) and Stakeholder Preparedness Review
(SPR): Guide Comprehensive Preparedness Guide (CPG) 201, 3rd Edition" (https://www.fema.gov/media
-library-data/1527613746699-fa31d9ade55988da1293192f1b18f4e3/CPG201Final20180525_508c.p
df) (PDF). US Department of Homeland Security. May 2018.
25. "Post-Disaster Recovery Planning Forum: How-To Guide, Prepared by Partnership for Disaster
Resilience" (https://1.usa.gov/1IBkvv0) . University of Oregon's Community Service Center, (C) 2007,
www.OregonShowcase.org. Retrieved October 29, 2018.
29. Gregory, Peter. CISA Certified Information Systems Auditor All-in-One Exam Guide, 2009. ISBN 978-0-07-
148755-9. Page 480.
30. "Five Mistakes That Can Kill a Disaster Recovery Plan" (https://web.archive.org/web/20130116112225/
http://content.dell.com/us/en/enterprise/d/large-business/mistakes-that-kill-disaster.aspx) . Dell.com.
Archived from the original (http://content.dell.com/us/en/enterprise/d/large-business/mistakes-that-kill-
disaster.aspx) on 2013-01-16. Retrieved 2012-06-22.
31. J. D. Biersdorfer (April 5, 2018). "Monitoring the Health of a Backup Drive" (https://www.nytimes.com/2
018/04/05/technology/personaltech/backup-drive-health.html) . The New York Times.
32. Brandon, John (23 June 2011). "How to Use the Cloud as a Disaster Recovery Strategy" (http://www.in
c.com/guides/201106/how-to-use-the-cloud-as-a-disaster-recovery-strategy.html) . Inc. Retrieved
11 May 2013.
35. Kraemer, Brian (June 11, 2008). "IBM's Project Big Green Takes Second Step" (https://web.archive.org/w
eb/20080611114732/http://www.crn.com/hardware/208403225) . ChannelWeb. Archived from the
original (http://www.crn.com/hardware/208403225) on 2008-06-11. Retrieved 2008-05-11.
36. "Modular/Container Data Centers Procurement Guide: Optimizing for Energy Efficiency and Quick
Deployment" (https://web.archive.org/web/20130531191212/http://hightech.lbl.gov/documents/data_c
enters/modular-dc-procurement-guide.pdf) (PDF). Archived from the original (http://hightech.lbl.gov/d
ocuments/data_centers/modular-dc-procurement-guide.pdf) (PDF) on 2013-05-31. Retrieved
2013-08-30.
38. "HP Performance Optimized Datacenter (POD) 20c and 40c - Product Overview" (https://web.archive.or
g/web/20150122213504/http://h18004.www1.hp.com/products/servers/solutions/datacentersolution
s/pod/index.html) . H18004.www1.hp.com. Archived from the original (http://h18004.www1.hp.com/p
roducts/servers/solutions/datacentersolutions/pod/index.html) on 2015-01-22. Retrieved
2013-08-30.
Further reading
Barnes, James (2001). A guide to business continuity planning. Chichest er, NY: John Wiley.
ISBN 9780470845431. OCLC 50321216 (ht t ps://www.worldcat .org/oclc/50321216) .
Bell, Judy Kay (2000). Disaster survival planning : a practical guide for businesses. Port
Hueneme, CA, US: Disast er Survival Planning. ISBN 9780963058027. OCLC 45755917 (ht t ps://
www.worldcat .org/oclc/45755917) .
Fulmer, Kennet h (2015). Business Continuity Planning : a Step-by-Step Guide With Planning
Forms. Brookfield, CT: Rot hst ein Associat es, Inc. ISBN 9781931332804. OCLC 712628907 (ht t
ps://www.worldcat .org/oclc/712628907) , 905750518 (ht t ps://www.worldcat .org/oclc/9057
50518) , 1127407034 (ht t ps://www.worldcat .org/oclc/1127407034) .
DiMat t ia, Susan S (2001). "Planning for Cont inuit y". Library Journal. 126 (19): 32–34. ISSN 0363-
0277 (ht t ps://www.worldcat .org/issn/0363-0277) . OCLC 425551440 (ht t ps://www.worldcat .
org/oclc/425551440) .
Harney, John (July–August 2004). "Business Cont inuit y and Disast er Recovery: Back Up Or Shut
Down" (ht t ps://web.archive.org/web/20080204225856/ht t p://www.edocmagazine.com/archiv
es_ art icles.asp?ID=29114) . AIIM E-DOC Magazine. ISSN 1544-3647 (ht t ps://www.worldcat .o
rg/issn/1544-3647) . OCLC 1058059544 (ht t ps://www.worldcat .org/oclc/1058059544) .
Archived from t he original (ht t p://www.edocmagazine.com/archives_ art icles.asp?ID=29114)
on 2008-02-04.
"ISO 22301:2019(en), Securit y and resilience — Business cont inuit y management syst ems —
Requirement s" (ht t ps://www.iso.org/obp/ui/#iso:st d:iso:22301:ed-2:v1:en) . ISO.
"ISO/IEC 27002:2013(en) Informat ion t echnology — Securit y t echniques — Code of pract ice
for informat ion securit y cont rols" (ht t ps://www.iso.org/obp/ui/#iso:st d:iso-iec:27002:ed-2:v1:
en) . ISO.
External links
"Glossary of t erms for Business Cont inuit y, Disast er Recovery and relat ed dat a mirroring &
z/OS st orage t echnology solut ions" (ht t ps://web.archive.org/web/20201114001623/ht t ps://r
ecoveryspecialt ies.com/glossary.ht ml) . recoveryspecialties.com. Archived from t he original (h
t t ps://recoveryspecialt ies.com/glossary.ht ml) on 2020-11-14. Ret rieved 2021-09-02.
Wikipedia