L-5 Module - 2 Prepare Disaster Recovery and Contingency Plan


INFORMATION TECHNOLOGY

SERVICING MANAGEMENT
Level V

LEARNING GUIDE # 2
Unit of Competence: Prepare Disaster Recovery and Contingency Plan
Module Title : Preparing Disaster Recovery and
Contingency Plan
TTLM Code : ICT ITM5 02 1115

Preparing Disaster Recovery and Contingency Plan

Compiled Getachew G/hana 1|Page


Table of Contents
LO 1: Evaluate impact of system on business continuity
  1. Introduction
     What is Disaster?
     What is disaster recovery?
     Network disaster recovery plan
     IT Recovery Strategies
  2. What is a critical business system?
  3. Security environment
  4. Identifying critical systems and data
     Business Continuity Plan
  5. Impact of system failure
  6. Statutory and business requirements
LO 2: Evaluate threats to system
  2.1. Risk analysis
  2.2. Identify system threats
LO 3: Formulate a prevention and recovery strategy
  3.1. Strategies for dealing with risk
  3.2. Cost of recovery and prevention options
  3.3. Define and develop contingency plan
  3.4. Strategy report
  3.5. Submitting the report
  3.6. Getting approval
LO 4: Develop Disaster Recovery Plan to Support Strategy
  4.1. Identifying resources required for disaster recovery
  4.2. Implementing a disaster prevention and recovery strategy
  4.3. Identifying cut-over criteria



LO 1: Evaluate impact of system on business continuity

1. Introduction
What is a disaster?
• A disaster is a serious disruption, occurring over a relatively short time, of the functioning of a
  community or a society involving widespread human, material, economic or environmental loss and
  impacts, which exceeds the ability of the affected community or society to cope using its own resources.
• A disaster can be caused by people or by nature. For an organization, it results in the IT department not
  being able to perform all or some of its regular roles and responsibilities for a period of time.
• Organizations prepare for everything from natural disasters to cyber-attacks with disaster recovery plans
  that detail a process to resume mission-critical functions quickly and without major losses in revenues or
  business operations.

In contemporary academia, disasters are seen as the consequence of inappropriately managed risk. These
risks are the product of a combination of both hazards and vulnerability. Hazards that strike in areas with low
vulnerability will never become disasters, as in the case of uninhabited regions.

What is disaster recovery?

• The methods, processes, and procedures needed to minimize the impact of a disaster upon information
  and data required for critical business processes.
• The guidelines and activities required to restore systems, operations, and the business to the conditions
  that prevailed prior to the disaster.
• A well-written and properly tested plan that allows recovery personnel to administer recovery efforts that
  result in a timely restoration of services.

In the IT space, disaster recovery focuses on the IT systems that help support critical business functions. The
term “business continuity” is often associated with disaster recovery, but the two terms aren’t completely
interchangeable. Disaster recovery is a part of business continuity, which focuses more on keeping all
aspects of a business running despite the disaster. Because IT systems these days are so critical to the success
of the business, disaster recovery is a main pillar in the business continuity process.

Purpose of disaster recovery

• Preventing the loss of the organization’s resources such as hardware, data and physical IT assets
• Minimizing downtime related to IT
• Keeping the business running in the event of a disaster



A disaster recovery plan (DRP) takes all of the following areas into consideration:

• Network Infrastructure
• Server Infrastructure
• Telephony System
• Data Storage and Backup Systems
• Data Output Devices
• End-user Computers
• Organizational Software Systems
• Database Systems
• IT Documentation

Network disaster recovery plan

A network disaster recovery plan is a set of procedures designed to prepare an organization to respond to an
interruption of network services during a natural or manmade catastrophe.
Voice, data, internet access and other network services often share the same network resources. A network
disaster recovery (DR) plan ensures that all resources and services that rely on the network are back up and
running within a specified time frame after an interruption.
Such a plan usually includes procedures for recovering an organization's local area networks (LANs), wide
area networks (WANs) and wireless networks. It may cover network applications and services, servers,
computers and other devices, along with the data at issue.

Network services are critical to ensuring uninterrupted internal and external communication and data sharing
within an organization. A network infrastructure can be disrupted by any number of disasters, including fire,
flood, earthquake, hurricane, carrier issues, hardware or software malfunction or failure, human error, and
cyber security incidents and attacks.

Any interruption of network services can affect an organization's ability to access, collect or use data and
communicate with staff, partners and customers. Interruptions put business continuity (BC) and data at risk
and can result in huge customer service and public relations problems. A contingency plan for dealing with
any sort of network interruption is vital to an organization's survival.



Some important requirements to consider when preparing a network disaster recovery plan include the
following:

• Use business continuity standards. There are nearly two dozen BC/DR standards and they are a useful
  place to start when creating a contingency plan.
• Determine recovery objectives. Before starting on a plan, the organization must determine its recovery
  time objective (RTO) and recovery point objective (RPO) for each key service and data type. RTO is the
  time an organization has to make a function or service available following an interruption. RPO
  determines the acceptable age of files that an organization can recover from its backup storage to
  successfully resume operations after a network outage. RPO will vary for each type of data.
• Stick to the basics. A network DR plan should reflect the complexity of the network itself and should
  include only the information needed to respond to and recover from specific network-related incidents.
• Test and update regularly. Once complete, a network DR plan should be tested at least twice a year and
  more often if the network configuration changes. It should be reviewed regularly to ensure it reflects
  changes to the network, staff and potential threats, as well as the organization's business objectives.
• Stay flexible. No one approach to creating a network disaster recovery plan will work for every
  organization. Check out different types of plan templates and consider whether specialized network DR
  software or services might be useful.
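The RTO and RPO definitions above can be turned into a simple check after a test or a real outage. The sketch below is illustrative only: the function name, the timestamps and the 4-hour RPO and 3-hour RTO are assumptions, not values taken from any standard or from this guide.

```python
from datetime import datetime, timedelta

def check_recovery_objectives(last_backup, outage_start, service_restored, rpo, rto):
    """Compare one outage against the agreed RPO and RTO (hypothetical helper)."""
    # Data created after the last backup is lost, so this is the age of the
    # files that can be recovered.
    data_loss_window = outage_start - last_backup
    # Time the service was unavailable.
    downtime = service_restored - outage_start
    return {
        "data_loss_window": data_loss_window,
        "downtime": downtime,
        "rpo_met": data_loss_window <= rpo,
        "rto_met": downtime <= rto,
    }

# Assumed example values for a single service.
result = check_recovery_objectives(
    last_backup=datetime(2024, 1, 10, 2, 0),
    outage_start=datetime(2024, 1, 10, 9, 30),
    service_restored=datetime(2024, 1, 10, 12, 0),
    rpo=timedelta(hours=4),
    rto=timedelta(hours=3),
)
print(result["rpo_met"], result["rto_met"])  # False True
```

In this example the service came back within the RTO, but the last backup was 7.5 hours old, so the RPO was missed and backups for this data type would need to run more often.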

IT Recovery Strategies
Recovery strategies should be developed for Information technology (IT) systems, applications and data.
This includes networks, servers, desktops, laptops, wireless devices, data and connectivity. Priorities for
IT recovery should be consistent with the priorities for recovery of business functions and processes that
were developed during the business impact analysis. IT resources required to support time-sensitive business
functions and processes should also be identified. The recovery time for an IT resource should match the
recovery time objective for the business function or process that depends on the IT resource.
Information technology systems require hardware, software, data and connectivity. Without one component
of the “system,” the system may not run. Therefore, recovery strategies should be developed to anticipate
the loss of one or more of the following system components:
• Computer room environment (secure computer room with climate control, conditioned and
  backup power supply, etc.)
• Hardware (networks, servers, desktop and laptop computers, wireless devices and
  peripherals)
• Connectivity to a service provider (fiber, cable, wireless, etc.)
• Software applications (electronic data interchange, electronic mail, enterprise resource
  management, office productivity, etc.)
• Data and restoration

2. What is a critical business system?


A system is critical for a commercial organization if its failure results directly or indirectly in loss of life
(for example, an air traffic control system) and/or major financial loss. When developing a disaster recovery
plan (DRP) it is essential to identify critical systems and ensure they are restored as soon as possible.
Each critical system has a maximum allowable downtime beyond which its loss will severely impact the
business. The shorter the period of time before losses start to occur, the more critical the system is. The size
of the financial loss, relative to the financial worth of the business, is also significant. The greater the
financial loss in percentage terms, the more critical the system.
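The two criteria above (maximum allowable downtime, and loss as a percentage of the worth of the business) can be used to put systems in a rough order of criticality. The sketch below is only one possible way to combine them: the class, the system names and all figures are hypothetical, and sorting by downtime first and loss second is an assumption, not a rule from this guide.

```python
from dataclasses import dataclass

@dataclass
class SystemProfile:
    name: str
    max_downtime_hours: float  # maximum allowable downtime
    estimated_loss: float      # financial loss if that downtime is exceeded

def rank_by_criticality(systems, business_worth):
    """Most critical first: shortest allowable downtime, then largest % loss."""
    def sort_key(system):
        loss_percent = 100 * system.estimated_loss / business_worth
        # Assumption: downtime is compared first, percentage loss breaks ties.
        return (system.max_downtime_hours, -loss_percent)
    return sorted(systems, key=sort_key)

# Hypothetical systems and figures, for illustration only.
systems = [
    SystemProfile("payroll", max_downtime_hours=72, estimated_loss=20_000),
    SystemProfile("online sales", max_downtime_hours=1, estimated_loss=150_000),
    SystemProfile("email", max_downtime_hours=8, estimated_loss=10_000),
]
ranked = rank_by_criticality(systems, business_worth=1_000_000)
print([s.name for s in ranked])  # ['online sales', 'email', 'payroll']
```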

3. Security environment
3.1. Computer Network Security Basics
What is network security?
• While computer systems today have some of the best security systems ever, they are more
  vulnerable than ever before.
• This vulnerability stems from the world-wide access to computer systems via the Internet.
• Network security is preventing attackers from achieving objectives through unauthorized access or
  unauthorized use of computers and networks.
Basic Security Measures
The basic security measures for computer systems fall into the following categories:
1. External security
2. Operational security
3. Surveillance
4. Passwords/authentication
5. Auditing
6. Access rights
7. Standard system attacks
8. Viruses/worms and antivirus tools
9. Firewalls
10. Encryption and decryption techniques
11. Digital signature
12. Security policy
External Security
• Protection from environmental damage such as floods, earthquakes, and heat.
• Physical security such as locking rooms, locking down computers, keyboards, and other devices.
• Electrical protection from power surges.
• Noise protection by placing computers away from devices that generate electromagnetic
  interference.



Operational Security
1. Deciding who has access to what.
2. Limiting time of day access.
3. Limiting day of week access.
4. Limiting access from a location, such as not allowing a user to use a remote login during certain
periods or any time.
Surveillance
• Proper placement of security cameras can deter theft and vandalism.
• Cameras can also provide a record of activities.
• Intrusion detection is a field of study in which specialists try to prevent intrusion and try to
  determine if a computer system has been violated.
Passwords and ID Systems
• Passwords are the most common form of security and the most abused.
• Simple rules help support safe passwords, including:
1. Change your password often.
2. Pick a good, random password (minimum 8 characters, mixed symbols).
3. Don’t share passwords or write them down.
4. Don’t select names and familiar objects as passwords.
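Rule 2 above can be illustrated with a short sketch that generates a random password and checks it. Reading "mixed symbols" as "at least one lowercase letter, one uppercase letter, one digit and one punctuation character" is an assumption made for this example, and both function names are hypothetical.

```python
import secrets
import string

def meets_basic_rules(password):
    """Rule 2 as interpreted here: 8+ characters with mixed character classes."""
    return (
        len(password) >= 8
        and any(c.islower() for c in password)
        and any(c.isupper() for c in password)
        and any(c.isdigit() for c in password)
        and any(c in string.punctuation for c in password)
    )

def generate_password(length=12):
    """Draw random characters until the result satisfies the rule check."""
    if length < 8:
        raise ValueError("length must be at least 8")
    alphabet = string.ascii_letters + string.digits + string.punctuation
    while True:
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        if meets_basic_rules(candidate):
            return candidate

print(meets_basic_rules("password"))           # False
print(meets_basic_rules(generate_password()))  # True
```

The `secrets` module is used instead of `random` because it is intended for security-sensitive values.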
Many new forms of “passwords” are emerging:
• Fingerprints
• Face prints
• Retina scans and iris scans
• Voice prints
• Ear prints
• Nose recognition
Authentication
• Authentication is the process of reliably verifying the identity of someone (or something) by means of:
  • A secret (password [one-time], ...)
  • An object (smart card, ...)
  • Physical characteristics (fingerprint, retina, ...)
  • Trust
A security policy is enforced through security mechanisms. The important security mechanisms are:
1. Encryption
2. Authentication
3. Authorization
4. Auditing



1. Encryption is fundamental to computer security. Encryption transforms data into something an attacker
cannot understand. In other words, encryption provides a means to implement confidentiality. In
addition, encryption allows us to check whether data have been modified. It thus also provides support
for integrity.
2. Authentication is used to verify the claimed identity of a user, client, server, and so on. In the case of
clients, the basic premise is that before a service will do work for a client, the service must learn the
client’s identity. Typically, users are authenticated by means of passwords, but there are many other
ways to authenticate clients, such as biometrics, certificates and multi-factor authentication.
3. Authorization: after a client has been authenticated, it is necessary to check whether that client is
authorized to perform the action requested. Access to records in a medical database is a typical example.
Depending on who accesses the database, permission may be granted to read records, to modify certain
fields in a record, or to add or remove a record.
4. Auditing tools are used to trace which clients accessed what, and in which way. Although auditing does
not really provide any protection against security threats, audit logs can be extremely useful for the
analysis of a security breach, and for subsequently taking measures against intruders. For this reason,
attackers are generally keen not to leave any traces that could eventually lead to exposing their identity.
In this sense, logging accesses sometimes makes attacking a riskier business.
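Authorization and auditing, as described above for a medical database, can be sketched together in a few lines. Everything in this example is hypothetical: the roles, the permission table, the user names and the log format are assumptions chosen to mirror the read, modify, add and remove actions mentioned in the text.

```python
from datetime import datetime, timezone

# Hypothetical mapping of roles to the actions they may perform on records.
PERMISSIONS = {
    "doctor": {"read", "modify"},
    "records_clerk": {"read", "add", "remove"},
    "receptionist": {"read"},
}

audit_log = []  # every request is recorded, whether it was allowed or not

def request_action(user, role, action, record_id):
    """Authorize an already-authenticated user and write an audit entry."""
    allowed = action in PERMISSIONS.get(role, set())
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "role": role,
        "action": action,
        "record": record_id,
        "allowed": allowed,
    })
    return allowed

print(request_action("abebe", "doctor", "modify", 17))       # True
print(request_action("sara", "receptionist", "remove", 17))  # False
print(len(audit_log))                                        # 2
```

Note that the refused request is logged as well; denied attempts are often the most useful entries when a breach is analyzed later.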

Cryptography
• Cryptography is the science and art of transforming messages to make them secure and immune to attacks.
• The original message, before being transformed, is called plaintext. After the message is
  transformed, it is called ciphertext.
• An encryption algorithm transforms the plaintext into ciphertext; a decryption algorithm
  transforms the ciphertext back into plaintext.
• The sender uses an encryption algorithm, and the receiver uses a decryption algorithm.
• A key is a number (or a set of numbers) that the cipher, as an algorithm, operates on.
• To encrypt a message, we need an encryption algorithm, an encryption key, and the plaintext.
  These create the ciphertext.
• To decrypt a message, we need a decryption algorithm, a decryption key, and the ciphertext.
  These reveal the original plaintext.
• We can divide all the cryptography algorithms (ciphers) into two groups: symmetric-key (also
  called secret-key) cryptography algorithms and asymmetric (also called public-key) cryptography
  algorithms.

Cryptography terminologies
• Plaintext: the original message, before being transformed into ciphertext.
• Encryption: the process of converting an original message into a form that cannot be understood
  by unauthorized individuals. To encrypt a message, we need an encryption algorithm, an
  encryption key, and the plaintext.
• Ciphertext or cryptogram: the message after it has been transformed into encrypted text.
• An encryption algorithm transforms the plaintext into ciphertext.
• A decryption algorithm transforms the ciphertext back into plaintext. The sender uses an
  encryption algorithm, and the receiver uses a decryption algorithm.
• Cipher: the term used to refer to different categories of algorithms in cryptography.
• Key or crypto variable: the information used in conjunction with the algorithm to create the
  ciphertext from the plaintext. It can be a series of bits used in a mathematical algorithm or the
  knowledge of how to manipulate the plaintext.
• Key space: the entire range of values that can possibly be used to construct an individual key.
• Cryptosystem: the combination of algorithm, key and key management functions used to
  perform cryptographic operations.
• Steganography: the process of hiding messages, usually within graphic images.



Symmetric-Key Cryptography
• In symmetric-key cryptography, the same key is used by both parties. The sender uses this key and
  an encryption algorithm to encrypt data; the receiver uses the same key and the corresponding
  decryption algorithm to decrypt the data.

Asymmetric-Key Cryptography
• In asymmetric or public-key cryptography, there are two keys: a private key and a public key. The
  private key is kept by the receiver. The public key is announced to the public.
• In public-key encryption/decryption, the public key that is used for encryption is different from the
  private key that is used for decryption. The public key is available to the public; the private key is
  available only to an individual.

Keys used in cryptography



SYMMETRIC-KEY CRYPTOGRAPHY
• Symmetric-key cryptography started thousands of years ago when people needed to exchange secrets
  (for example, in a war).
• We still mainly use symmetric-key cryptography in our network security.

A substitution cipher replaces one symbol with another. In a monoalphabetic substitution cipher, a given
plaintext symbol is always replaced by the same ciphertext symbol.

Example 1
The following shows a plaintext and its corresponding ciphertext. Is the cipher monoalphabetic?

Solution
The cipher is probably monoalphabetic because both occurrences of L are encrypted as O.
Example 2
The following shows a plaintext and its corresponding ciphertext. Is the cipher monoalphabetic?



Solution
The cipher is not monoalphabetic because each occurrence of L is encrypted by a different character. The
first L is encrypted as N; the second as Z.
The shift cipher is sometimes referred to as the Caesar cipher. In this cipher, the encryption algorithm is
"shift key characters down," with key equal to some number. The decryption algorithm is "shift key
characters up."
Example: Use the shift cipher with key = 15 to encrypt the message “HELLO.”
Solution
We encrypt one character at a time. Each character is shifted 15 characters down. Letter H is encrypted
to W. Letter E is encrypted to T. The first L is encrypted to A. The second L is also encrypted to A. And O
is encrypted to D. The cipher text is WTAAD.
Example: Use the shift cipher with key = 15 to decrypt the message “WTAAD.”
Solution
We decrypt one character at a time. Each character is shifted 15 characters up. Letter W is decrypted to H.
Letter T is decrypted to E. The first A is decrypted to L. The second A is decrypted to L. And, finally, D is
decrypted to O. The plaintext is HELLO.
Monoalphabetic ciphers
The Caesar cipher involves replacing each letter of the alphabet with the letter standing three places further
down the alphabet. For example,
plain: meet me after the toga party
cipher: PHHW PH DIWHU WKH WRJD SDUWB
Note that the alphabet is wrapped around, so that the letter following Z is A. We can define the
transformation by listing all possibilities, as follows:
plain: a b c d e f g h i j k l m n o p q r s t u v w x y z
cipher: D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
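The shift cipher worked through above (key = 15 for HELLO, key = 3 for the Caesar example) can be written as a short program. This is a minimal sketch for the 26-letter English alphabet; the function names are chosen for this example only.

```python
import string

ALPHABET = string.ascii_uppercase  # 'A' to 'Z'

def shift_encrypt(plaintext, key):
    """Shift each letter `key` places down the alphabet, wrapping Z around to A."""
    result = []
    for ch in plaintext.upper():
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            result.append(ch)  # spaces and punctuation are left unchanged
    return "".join(result)

def shift_decrypt(ciphertext, key):
    """Decryption is the same shift in the opposite direction."""
    return shift_encrypt(ciphertext, -key)

print(shift_encrypt("HELLO", 15))                        # WTAAD
print(shift_decrypt("WTAAD", 15))                        # HELLO
print(shift_encrypt("meet me after the toga party", 3))  # PHHW PH DIWHU WKH WRJD SDUWB
```

Because there are only 25 useful keys, an attacker can simply try them all, which is why the shift cipher offers no real protection.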
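Transposition Ciphers
The simplest transposition cipher is the rail fence technique, in which the plaintext is written down as a
sequence of diagonals and then read off as a sequence of rows.
For example, to encipher the message "meet me after the toga party" with a rail fence of depth 2, we write
the following:
mematrhtgpry
etefeteoaat
The encrypted message is MEMATRHTGPRYETEFETEOAAT. This sort of cipher would be trivial to
cryptanalyze.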
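The depth-2 rail fence used in the example above amounts to taking every second letter for the first row and the remaining letters for the second row. A minimal sketch (the function name is chosen for this example, and only depth 2 is handled):

```python
def rail_fence_depth2(plaintext):
    """Depth-2 rail fence: letters at even positions, then letters at odd positions."""
    letters = [c for c in plaintext.upper() if c.isalpha()]  # drop spaces
    top_row = letters[0::2]     # 1st, 3rd, 5th, ... letters
    bottom_row = letters[1::2]  # 2nd, 4th, 6th, ... letters
    return "".join(top_row + bottom_row)

print(rail_fence_depth2("meet me after the toga party"))  # MEMATRHTGPRYETEFETEOAAT
```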



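Quiz:
Similarly, encrypt “Let us meet at microlink college” by yourself.
___________
A more complex scheme is to write the message in a rectangle, row by row, and read it off column by
column, taking the columns in the order given by the key. For example,
Key:        4 3 1 2 5 6 7
Plaintext:  a t t a c k p
            o s t p o n e
            d u n t i l t
            w o a m x y z
Ciphertext: TTNAAPTMTSUOAODWCOIXKNLYPETZ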
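The keyed example above (key 4312567) reads the grid column by column: the column labeled 1 first, then the column labeled 2, and so on. A minimal sketch of that procedure follows; the function name is chosen for this example, and it assumes a key made of distinct single digits.

```python
def columnar_encrypt(plaintext, key):
    """Write the text in rows of len(key) letters, then read columns in key order."""
    letters = [c for c in plaintext.upper() if c.isalpha()]
    width = len(key)
    rows = [letters[i:i + width] for i in range(0, len(letters), width)]
    # Column indexes sorted by their key digit: the column labeled '1' comes first.
    column_order = sorted(range(width), key=lambda i: key[i])
    return "".join(
        row[col] for col in column_order for row in rows if col < len(row)
    )

print(columnar_encrypt("attack postponed until two am xyz", "4312567"))
# TTNAAPTMTSUOAODWCOIXKNLYPETZ
```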

A transposition cipher reorders (permutes) symbols in a block of symbols.

Example: Encrypt the message “HELLO MY DEAR,” using a key that places plaintext characters 2, 4, 1 and 3
of each four-character block in ciphertext positions 1, 2, 3 and 4 (the key that the result below is based on).
Solution
We first remove the spaces in the message. We then divide the text into blocks of four characters. We add a
bogus character Z at the end of the third block. The result is HELL OMYD EARZ. We create a three-block
ciphertext ELHLMDOYAZER.
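The block transposition in the HELLO MY DEAR example can be reproduced in code. The key used here, (2, 4, 1, 3), is inferred from the worked result (HELL becomes ELHL), since the key figure itself is not part of this text; treat it, and the function name, as assumptions of this sketch.

```python
def block_transpose(message, key=(2, 4, 1, 3), pad="Z"):
    """Permute the characters of each block; key[i] is the plaintext position
    (counting from 1) that is written to ciphertext position i + 1."""
    text = "".join(c for c in message.upper() if c.isalpha())  # drop spaces, commas
    size = len(key)
    if len(text) % size:
        text += pad * (size - len(text) % size)  # fill the last block with bogus Z
    blocks = [text[i:i + size] for i in range(0, len(text), size)]
    return "".join("".join(block[k - 1] for k in key) for block in blocks)

print(block_transpose("HELLO MY DEAR,"))  # ELHLMDOYAZER
```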
Data Encryption Standard (DES)
• One example of a complex block cipher is the Data Encryption Standard (DES). DES was designed
  by IBM and adopted by the U.S. government as the standard encryption method for nonmilitary and
  nonclassified use.
• The algorithm encrypts a 64-bit plaintext block using a 64-bit key, of which 56 bits are effective key
  bits and 8 are parity bits.



4. Identifying critical systems and data
Many organizations will have identified which systems their business relies on. Nevertheless, it is important
to formalize the identification of these systems and put in place appropriate recovery procedures.
Ideally, the business case or proposal for each new system should identify its importance and a risk analysis
should be undertaken early in the project. This information may already be available in the project
documentation, in which case you would review this material and identify the risk issues that have been
raised.
In the absence of this information, you may need to survey the organization’s business areas or conduct
workshops where managers can consider the critical nature of their systems.
During this process, each system should be considered as a whole. All the parts that make up the system
must be carefully documented. Only then can it be determined what part of the system is critical.
You will need to collect information about how the system uses:
• software
• hardware
• networks (voice and data)
• data
• facilities (chairs, tables, projectors, etc.)

• Software in the form of standard packages is used to access data. It can be readily replaced.
• Data may have been gathered over many years and is unique and irreplaceable.
• Hardware is needed to run the software and access the data. Software requires a minimum hardware
  platform to work properly.
• Networks provide the basis for distributing data.
• Facilities such as chairs, telephones, tables and paper-based forms complete the system.
Since systems become more critical at different times, the maximum allowable downtime can vary
depending on the time of day, week, month or year a disaster happens. For example, many businesses work
to a monthly accounting cycle: losing their financial system at the end of the month would have a greater
impact than in the middle of the month.

An example of critical assessment


Consider the critical systems on your personal computer at home. Assess whether the following situations
make your systems critical or not.
1. You are working late on a 50-page assignment that must be handed in by 9:30am the next day,
   otherwise you will fail the course.
2. You are using the Internet to book a holiday you intend taking in three months’ time.
3. You have developed a spreadsheet to calculate your tax return.
4. You have created a database of CDs, records, tapes and videos which you will need
   to show your insurance company if the collection is destroyed or stolen.
5. You have saved several versions of your favorite computer game.
You may have come up with something like this:

Table 1: Levels of critical systems

Item | Critical assessment
1    | Critical until 9:30am and then not critical
2    | Not critical
3    | Critical when completing tax return
4    | Critical if event occurs
5    | Not critical



Business Continuity Plan

[Figure: Business Continuity Planning Process Diagram]

When business is disrupted, it can cost money. Lost revenues plus extra expenses mean reduced profits.
Insurance does not cover all costs and cannot replace customers that defect to the competition. A business
continuity plan is therefore essential. Development of a business continuity plan includes four steps:
1. Conduct a business impact analysis to identify time-sensitive or critical business functions and
   processes and the resources that support them.
2. Identify, document, and implement strategies to recover critical business functions and processes.
3. Organize a business continuity team and compile a business continuity plan to manage a business
   disruption.
4. Conduct training for the business continuity team, and testing and exercises to evaluate recovery
   strategies and the plan.



Critical systems/data assessment forms
Before starting work on the DRP all critical systems must be identified and documented. Users and
management complete critical systems/data assessment forms with the guidance of IT staff. Once
completed, they form an integral part of the system documentation.
The following are examples of the forms used:
Form 1: Reviewing software used
Q.1 Which application software do you normally use and how often?

Software | Constantly | Frequently | A few times a day | A few times a week | Rarely
         |            |            |                   |                    |
         |            |            |                   |                    |

Use this form to identify the software that is most frequently used. Frequency may or may not indicate the
software is critical. For example, many users may use a word processor every day but this may not be critical
to the organization. Further analysis is required.
Form 2: Reviewing data used or created by the system
System Name: ____________________________
Q.2 What types of data activity do you carry out with each system and where does the source data originate?
Show as a percentage of total time.

Source of data                         | Update     | Create own | Create    | Create own | Create own
                                       | corporate  | data files | shared    | temporary  | longer-term
                                       | data files |            | documents | documents  | documents
From source documents                  |            |            |           |            |
From other data files                  |            |            |           |            |
From irrecoverable sources such as     |            |            |           |            |
  telephone calls                      |            |            |           |            |
Developed at the workstation such as   |            |            |           |            |
  report writing                       |            |            |           |            |
Other - specify                        |            |            |           |            |



Complete this form for each system that is used constantly or frequently, for example, an email or e-
commerce system. You can use it to identify how important the data is, which data items are easy to recover
and which are not.
The form identifies the types of data activity carried out and where the source data originates. The level of
difficulty in restoring data and impact on the organization is then measured in percentage points.
Let’s say, for example, we need to assess an email system. The percentage level of criticality to the
organization is indicated with examples and an explanation of how this level was arrived at.

Table 2: Example of completed Form 2

Activity columns on the form: Update corporate data files; Create own data files; Create shared
documents; Create own temporary documents; Create own longer-term documents.

Entries by source of data (percentage of total time):

From source documents
  10% (e.g. software program: not critical because source documents are replaceable from other areas)

From other data files
  10% (e.g. email address stored on server: not critical because it can be recovered elsewhere)

From irrecoverable sources such as telephone calls
  10% (e.g. diary and calendar: not critical for the running of a business even though data can’t be
  recovered)

Developed at the workstation such as report writing
  60% (e.g. sent emails and attachments: critical for the organization because if email crashes the
  business suffers)
  5% (e.g. meeting room bookings in shared inbox: not critical for running the business even though
  data can’t be recovered)
  5% (e.g. received emails and attachments stored in temporary files: can’t be replaced but not critical
  because email and attachments can be resent)

Note how most of the data files for the email system are developed and created at the workstation. The loss
of these files has a high impact on the individual but not on the business as a whole.
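The observation above can be checked by adding up the percentages from the completed Form 2 for each source of data. The short sketch below uses only the six percentages given in Table 2; the dictionary layout and variable names are assumptions made for this example.

```python
# Percentages from the completed email-system form, grouped by source of data.
form2_email = {
    "From source documents": [10],
    "From other data files": [10],
    "From irrecoverable sources": [10],
    "Developed at the workstation": [60, 5, 5],
}

totals = {source: sum(cells) for source, cells in form2_email.items()}
grand_total = sum(totals.values())   # a completed form should add up to 100
largest_source = max(totals, key=totals.get)

print(grand_total)                             # 100
print(largest_source, totals[largest_source])  # Developed at the workstation 70
```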
The following tables describe the significance of the loss of source files in relation to the purpose for which
they are used.

Table 3: Difficulty in recovering lost source files

Data sourced                             | Issues
From source documents                    | Recovery could be from source documents if they are kept.
From other data files                    | Recovery could be from other data files if they are backed up.
From irrecoverable sources such as       | Recovery impossible unless regular backups of files are made
  telephone calls                        |   and stored externally.
Developed at the workstation such as     | Recovery impossible unless regular backups of files are made
  report writing                         |   and stored externally.
Other - specify                          | Determine how easy it is to get back to a source or original.

Table 4: Impact of loss of source files on business

Data used to                      | Issues
Update corporate data files       | Important data used by many and may be critical.
Create own data files             | May be critical data but restricted impact and short life.
Create shared documents           | May be critical data but restricted impact.
Create own temporary documents    | Unlikely to be critical.
Create own longer-term documents  | May be critical data but restricted impact; may be required again.



Form 3: Resource Requirements
System Name:__________________________
Q.3 Please provide a list of resources required for each system. The list should include pre-printed forms,
office equipment (e.g. photocopy machines, postage meters), computer devices (terminals, PCs, printers,
etc.), telecommunications and network capabilities (modems, fax machines, telephones, telephone
messaging/announcing equipment, etc.) and records (hardcopy, microform, electronic databases).

Details

Complete this form for each system. It helps identify what equipment is needed to run each
system.
Form 4: Analyzing Critical Areas
System Name: _____________________________________________
For each software application used, what would be the impact on the organisation if you could not access the
data for more than one day, between 1 and 8 hours, and less than 1 hour?

Form 4a: Analysing critical areas: impact of more than 1 day

Area of impact Very costly Serious Little or no effect


Impact on cash flow
Impact on profitability
Impact on customer or supplier relations
Impact on legal requirements
Impact on staff or morale

Q.4a Are there any other implications? Please specify.


_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

Form 4b: Analysing critical areas: impact of between 1 and 8 hours

Area of impact Very costly Serious Little or no effect


Impact on cash flow
Impact on profitability
Impact on customer or supplier relations
Impact on legal requirements
Impact on staff or morale

Q.4b Are there any other implications? Please specify.


_____________________________________________________
_____________________________________________________
_____________________________________________________

Form 4c: Analysing critical areas: impact of less than 1 hour

Area of impact Very costly Serious Little or no effect


Impact on cash flow
Impact on profitability
Impact on customer or supplier relations
Impact on legal requirements
Impact on staff or morale

Q.4c Are there any other implications? Please specify.


_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

Q.4d Estimate the maximum amount of time you could operate without access to the system? Why?
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

Q.4e Are there any peak periods when disruption would be more or less serious? Why?
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

Q.4f Are there any applications or data that you believe must be continuously available? Why?
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
Complete this form to identify the impact of system failure in a number of different areas. The answers ‘very
costly’, ‘serious’ and ‘little or no effect’ indicate the size of the financial loss and thus the magnitude of the
impact on the business.
The form should be completed for different time periods to show what the impact of system failure would be
in minutes and hours for time-sensitive critical systems and hours and days for others.
The following table describes the critical areas in Form 4.

Table 5: Description of critical areas

Area | Issue
Impact on cash flow | Businesses must be able to pay their debts and to obtain income. Is the system critical to the cash flow?
Impact on profitability | If sales are lost or expenses incurred then it begins to bite into the ‘bottom line’.
Impact on customer or supplier relations | Customers may put up with delayed shipment of goods once but next time they may go elsewhere.
Impact on legal requirements | Are there contracts or statutory obligations that may incur penalties if missed?
Impact on staff or morale | If systems are regularly down or inaccurate, staff may be harassed by customers or have to undertake extra work to sort out problems.
Note that all these areas can eventually have an impact on profit so the user should identify the
primary area of impact.
Ranking of critical systems
Having identified one or more critical systems, these need to be ranked in order of importance and impact
on the organization. It is unlikely that you will have the time to implement DRP procedures for all systems
so you should initially concentrate on the most important.

Form 5: Ranking of critical systems
Name of system | Main data files | Users or departments impacted | Areas of impact (profit, cash flow etc.) | Maximum time users can be without system | Ranking

Complete this form when ranking critical systems.
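
As a rough illustration of how Form 5 entries could be ranked, the following Python sketch sorts systems by the severity of their impact and then by how long users can be without them. The scoring scale, the `CriticalSystem` class, the ranking rule and the sample systems are all assumptions made for this example; they are not part of the form itself.

```python
from dataclasses import dataclass

# Hypothetical scale based on the three answers used in Form 4.
IMPACT_SCORE = {"very costly": 3, "serious": 2, "little or no effect": 1}


@dataclass
class CriticalSystem:
    name: str
    primary_impact: str        # e.g. "cash flow", "profit"
    impact_rating: str         # one of the Form 4 answers
    max_downtime_hours: float  # longest time users can be without the system


def rank_systems(systems):
    """Return systems ordered from most to least critical.

    Assumed rule: higher impact rating first, then shorter tolerable downtime.
    """
    return sorted(
        systems,
        key=lambda s: (-IMPACT_SCORE[s.impact_rating], s.max_downtime_hours),
    )


# Made-up example entries, not taken from the learning guide.
systems = [
    CriticalSystem("Payroll", "staff morale", "serious", 72),
    CriticalSystem("Point of sale", "cash flow", "very costly", 0.5),
    CriticalSystem("Dispatch", "customer relations", "very costly", 8),
]

for rank, system in enumerate(rank_systems(systems), start=1):
    print(rank, system.name, system.primary_impact, system.max_downtime_hours)
```

In practice the ranking rule would be agreed with the business; the point of the sketch is only that the last two columns of Form 5 give enough information to order the systems.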


Activity
To practice ranking the critical systems go to Activity 3 in the Activities section of the Topic
menu.
5. Impact of system failure
When undertaking risk analysis and disaster planning, it is usual to focus on critical systems, software and
data. The very definition of a critical system is that the business depends upon it and would be severely
impacted if the system were not available.
Forms 1 to 4 assist in analyzing how long the business can cope after a loss.
Many organizations, such as banks, stock exchanges and automated factories, cannot manage more than a
few minutes without their systems. Imagine the state of the rail system or air traffic control without the use
of their computers. Even the local supermarket would suffer loss if the tills went down for several minutes.
When assessing the impact on a business it is usual to consider the financial impact. Profits will suffer if
customers cannot trade with the company. If an e-commerce website is down, for example, customers may
turn to competitors.
There may also be an impact on cash flow. Not so long ago, a bank had to borrow millions of dollars
overnight to cover its needs when its computers went down.
If systems are regularly down or slow then customers may eventually go elsewhere. If faulty systems delay
payments, suppliers may stop delivering essential goods and services.

6. Statutory and business requirements


Statutory and commercial requirements must be considered when assessing the impact of a system failure.
The Act governing the Australian financial industry promotes financial soundness, stability and appropriate
risk management.

Corporate regulation
Business continuity management (BCM) and DRP form part of the core principles of the International
Standards on Prudential Regulation. The Australian Prudential Regulation Authority (APRA) regulates the
Australian financial industry, overseeing banks, general and life insurers and most members of the
superannuation industry.
Financial institutions are subject to auditing by APRA, including on-site visits. APRA determines whether
the business has an adequate and up-to-date DRP in place and whether the testing program is sound. Any
irregularities are noted in an audit report and a formal notice is sent to the business. If the business fails to
rectify problems it can be fined or even suspended from trade.
Organizations trading in the USA are subject to the Sarbanes-Oxley Act of 2002, which has
considerably tightened their operating requirements. Failure to comply can result in heavy fines.
The need to identify which critical systems rely on outside services or resources is paramount in managing
business continuity. Once critical systems are identified, it is necessary to state in the Service Level
Agreement (SLA) with the supplier how business outages will be handled. For example, the SLA may
require a supplier to store excess stock at an offsite storage area or arrange for a competitor to handle supply
until business resumes.
Take, for example, a car manufacturer which purchases components for steering wheels. If the component
supplier is unable to fulfill orders due to a disaster, then the car manufacturer must stop production and lose
millions of dollars. To reduce the risk of this happening, the car manufacturer stipulates in its SLA with the
component supplier that there will be a penalty of $100,000 per day for non-supply of components. The
component supplier is forced to have a Disaster Recovery Plan to ensure production is resumed as fast as
possible, or risk being penalized or even financial collapse.

Activities
Activity 1 – identifying critical systems

Consider this case study.

A clothing retail organization, Urban Wear, intends to develop a website to manage orders and payments for
its products. It will display a picture of each product, its price and availability. Customers will be able to
order and pay for the goods online. The organization believes that this will extend its sales to other countries
and allow 24-hour selling.

What factors would need to be considered in determining whether this new system will be critical to the
business and what the impact might be if it fails?
Write at least 4 questions you need to consider.
Questions include:
 What volume of sales is the new system expected to generate, especially compared to traditional
sales?
o (The higher the percentage of overall sales it generates, the more critical the system
will be.)
 How will the new system impact traditional sales?
o Will customers prefer to use the website rather than visit a store?
o How will this affect the profitability of the stores?
o If it reduces their profitability, what will happen to the stores?
 What are the implications of 24-hour access?
o Will deliveries be made 24 hours a day?
o Can the organization’s current distribution resources cope with overseas orders?
o Does the organization have the skills to maintain a 24-hour website? What extra
ongoing support will be required?
 Are the goods of a type that may attract hackers or terrorists to the site in an attempt to attack it?
 What sensitive information, such as customer credit card details, may be on the site?
Activity 2 – analyzing critical areas

 You have been given the following form for the Urban Wear e-commerce site. Most of the data will
be input online via the Internet.

Table 1: critical areas

Data used to: Update corporate data files | Create own data files | Create shared documents | Create own temporary documents | Create own longer term documents

From source documents | 10%
From other data files | 10%
From irrecoverable sources such as telephone calls | 80%
Developed at the workstation such as report writing |
Other – specify |

1. What issues need to be considered for backup and restoration of data?


This type of data is the most difficult to back up and restore. Most organizations back up once a day, usually
overnight. The first issue to consider is that the system is planned to be available on a continuous basis. This
means that special backup arrangements may need to be considered. These may require the system to be
down for a brief period during backup or the use of backup software that can back up files in use.
2. What problems can occur with backing up online transactions?
Records of transactions can be lost if the system crashes between backups. Suppose a backup is undertaken
at 3 am after which orders continue to be received. At 2 pm the system crashes and needs to be restored from
backup. There may be no record of all the orders received between 3 am and 2 pm. In traditional paper-based
systems the original order would be available which could be re-keyed. It may therefore be necessary to
maintain a transaction log on another server which is a mirror of the data entered on the main file.
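
The 3 am backup scenario can be sketched in a few lines of Python. This is a minimal illustration only: the function name `restore_orders`, the order identifiers and the dates are invented for the example, and a real system would rely on the transaction logging facilities of its database rather than a list in memory.

```python
from datetime import datetime


def restore_orders(backup_orders, backup_time, transaction_log):
    """Rebuild the order list after a crash.

    backup_orders:   orders captured by the last backup
    backup_time:     when that backup was taken
    transaction_log: (timestamp, order) pairs kept on a separate server
    """
    restored = list(backup_orders)
    # Replay, in time order, only the entries recorded after the backup.
    for timestamp, order in sorted(transaction_log, key=lambda entry: entry[0]):
        if timestamp > backup_time:
            restored.append(order)
    return restored


# Hypothetical data: backup at 3 am, crash at 2 pm the same day.
backup_time = datetime(2024, 1, 15, 3, 0)
backup_orders = ["order-001", "order-002"]
transaction_log = [
    (datetime(2024, 1, 15, 2, 30), "order-002"),   # already in the backup
    (datetime(2024, 1, 15, 9, 15), "order-003"),   # would be lost without the log
    (datetime(2024, 1, 15, 13, 40), "order-004"),  # would be lost without the log
]

restored = restore_orders(backup_orders, backup_time, transaction_log)
print(restored)
```

Without the log, only the two orders in the backup could be recovered; with it, the orders received between the backup and the crash are replayed on top of the restored data.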

LO1: Test
Name ________________________________________________________ IdNo __________
Choose the correct answer from the given alternatives

1. A disaster recovery plan is an action plan that charts the procedure for recovering:
A. every business function
B. information systems
C. critical business functions
D. computer hardware
2. What is the purpose of the critical systems/data assessment form for analyzing critical areas?
A. It defines how long critical business systems could cope with a disaster
B. It defines what software is needed by each critical system
C. It ranks the critical systems in terms of importance
D. It provides details of business process and procedures

3. To determine which critical system is the most important you must consider the impact on:
A. staff morale
B. customer satisfaction
C. cash flow
D. the information system

4. A disaster is defined as an event:


A. resulting in great loss or misfortune
B. that causes the business to stop production
C. resulting in minor loss or misfortune
D. that can cause great gains and wealth

5. When considering critical systems and data it is important to collect information about how the
system uses:
A. Software and data
B. Hardware and network
C. Facilities
D. A, B and C
E. A and B only
F. A only
G. None of the above

LO 2: Evaluate threats to system
What is a risk?
A risk can be defined as an event or circumstance that has a negative effect on your business, for example,
the risk of having equipment or money stolen as a result of poor security procedures. Types of risk vary from
business to business.
2.1. Risk analysis
 If you own or manage a business that makes use of IT, it is important to identify risks to your IT
systems and data, to reduce or manage those risks, and to develop a response plan in the event of an IT
crisis. Business owners have legal obligations in relation to privacy, electronic transactions, and staff
training that influence IT risk management strategies.
 IT risks include hardware and software failure, human error, spam, viruses and malicious attacks, as
well as natural disasters such as fires, cyclones or floods.
System Risk Analysis
A system risk analysis is used by business owners to classify systems (endpoints, servers, applications) into
one of three risk categories:
• Low Risk
o System processes and/or stores public data
o System is easily recoverable and reproducible
o System provides an informational / non-critical service
• Moderate Risk
o System processes and/or stores non-public or internal-use data
o System is internally trusted by other networked systems
o System provides a normal or important service
• High Risk
o System processes and/or stores confidential or restricted data
o System is highly trusted by other networked systems
o System provides a critical or organization-wide service
Risk Analysis must take into consideration the sensitivity of data processed and stored by the system, as well
as the likelihood and impact of potential threat events. We use a simple methodology to translate these
probabilities into risk levels and an overall system risk level.
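
One possible way to turn the three risk categories into an overall system risk level is sketched below in Python. The guide does not spell out its methodology, so the rule used here (the highest-rated attribute decides the overall level) is an assumption, as are the `Risk` enum, the lookup tables and the attribute labels.

```python
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MODERATE = 2
    HIGH = 3


# Hypothetical labels for each attribute, mapped to a risk level.
DATA_RISK = {"public": Risk.LOW, "internal": Risk.MODERATE, "confidential": Risk.HIGH}
TRUST_RISK = {"none": Risk.LOW, "internally trusted": Risk.MODERATE, "highly trusted": Risk.HIGH}
SERVICE_RISK = {"non-critical": Risk.LOW, "important": Risk.MODERATE, "critical": Risk.HIGH}


def classify_system(data, trust, service):
    """Assumed rule: the overall level is the highest of the three ratings."""
    return max(DATA_RISK[data], TRUST_RISK[trust], SERVICE_RISK[service])


# A system holding public data that other internal systems trust.
level = classify_system("public", "internally trusted", "non-critical")
print(level.name)
```

Taking the maximum is a deliberately cautious choice: a system that stores only public data is still treated as moderate risk if other systems trust it.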

Threat Event Assessment
Risk assessment is the compilation of risks associated with various potential threat events. A "threat event"
is any event which may cause a loss of confidentiality, integrity, or availability of the system and the data it
stores and/or processes.
Although there may be hundreds of potential threat events related to a system, they can
generally be organized into three main categories:
• Loss of Confidentiality:
o The system and its data are compromised by external hackers
o The system's data is released publicly without approval
o The system erroneously publishes data on its public-facing portions (e.g. a web page)
without authorization
• Loss of Integrity:
o The system and its data can no longer be trusted
o The system's data is incomplete or incorrect
• Loss of Availability:
o The system and its data no longer exist (e.g. hard drive failure, system destroyed)
o The system no longer responds to valid queries from the user or users (system fault)
o The system and its data cannot be accessed by an authorized user (e.g. denial-of-service
attack)

Having identified the organization’s critical systems, it is important to consider possible threats to the
system. A risk analysis will help determine these.
Risk analysis steps
Risk analysis is an analytical process undertaken to evaluate system assets and examine their susceptibility to
threats. Through this process we evaluate the possible commercial losses that may result from the loss of
these assets.

Figure 1 Risk Analysis

In order to undertake a risk analysis you must:
1. identify which system assets are included in the analysis
2. identify threats to the system
3. consider the probability of the event occurring
4. estimate the possible loss that could occur
5. consider safeguards to prevent or recover from the event
6. carry out a cost-benefit analysis of loss versus the cost of the safeguard
7. implement safeguards and a recovery plan.
Why do we carry out a risk analysis?
The basic purpose of a risk analysis is to identify preventive and recovery options for assets. Think about
assets of your own which you would take steps to protect from loss. For example, if you own a car, you
might install an alarm and immobilizer to deter theft. In the same way, a company will also take precautions
with its assets.
Computer systems (including hardware, software and data) are valuable assets of an organisation. It is
therefore very important that a risk analysis be undertaken to identify and safeguard these systems. A major
factor in risk analysis is to identify the impact of systems on business continuity. ‘Mission critical’ systems
require the greatest level of protection.

The loss of IT systems could have a major impact on many businesses. Many would come to a standstill in
minutes without their critical business systems. Even a small company could get into financial difficulties if
it lost its accounting data and did not know who owed it money.

An organization undertakes an IT risk analysis to identify:
• how dependent it is on IT systems
• what could go wrong with these systems
• what system assets it might lose
• what can be done about it.

2.2. Identify system threats


IT systems can comprise many parts including:
• hardware
• software
• networks
• data
• technical skills
• projects.
There are many ways to categorize threats. One way is to consider whether the source of the threat is internal
or external.
Internal threats
Internal threats mainly result from actions by users and/or IT staff. These can include:
• Viruses corrupt or delete data*. Users can unknowingly transfer viruses to the corporate
network via mobile devices such as personal digital assistants or laptop computers. For example, a
user might buy a new laptop and connect it to the Internet at home to check for updates,
unaware that a virus has been downloaded onto the computer. The next day the user takes the laptop
to work and connects it to the corporate network. The virus then spreads throughout the
network, deleting important data. Normally the virus would have been stopped by the corporate
firewall.
• The wrong disk is formatted, destroying data and software. Mistakes are easily made when
formatting a hard disk using the command line. For example, a person on work experience could
accidentally format the wrong hard disk drive by entering a wrong command.
• Sabotage. Data and software* are intentionally destroyed or corrupted.
• Data and software files are deleted. Deleting data can be accidental or intentional. For
example, a person could accidentally press the delete key when moving data or intentionally
delete data through known software system vulnerabilities.

 A password is forgotten so data or software cannot be accessed. For example, a retrenched
employee deliberately doesn’t update a password list.
 Input errors cause data to be corrupted.* If operators input incorrect, duplicated or
unauthorized transactions, then very quickly the data becomes corrupted or inaccurate. How
many stories have you heard about computers sending out a bill for millions of dollars to an old
age pensioner or cheques for two cents?
• Processing errors cause data to be corrupted.* Poor software design changes data.
• Hardware failure occurs so data and software are not available. Hardware and networking
equipment is rated with a mean time to failure (the expected operating time before it fails and
needs to be replaced or repaired) and a mean time to repair (the average time needed to restore
it to service). Preventive maintenance can prolong the time to failure.
 Fraud. Data is corrupted in order to steal assets.*
 Poor testing. Bugs are left in software so errors or delays occur.*
 Incorrect processes or calculations occur in programs so errors or delays occur.*
 Copyright and license agreements are broken which leads to the company being sued by the
owner of the copyright or license provider.

External threats
External threats can include:
• Theft of data and loss of confidential information, especially customer details* transmitted
over the Internet or a wide area network connection.
• Breakdown of the Internet or wide area network connection, or failure of critical systems
hardware.
• Fire or earthquake which renders the system inaccessible.
• Flooding which renders the system inaccessible. Water from sprinklers or sewer lines can cause
flooding of offices.
• Hackers corrupt or steal data*. A discontented customer or ex-employee may decide to post
customers’ credit card details to the Internet.
• Power problems make the system inaccessible. Power spikes or outages can disrupt critical
systems.
• ‘Buggy’ software from a package vendor may cause errors in data or delays.
The more serious external threats are likely to have an impact on the hardware and networks on which the
systems run.

Threats listed above marked with an * may have been previously identified by a security audit or analysis.
The organization’s internal or external auditors may have already performed such an analysis, providing you
with a useful source of information.
Example of system threats
Consider the Urban Home ware Company which has 10 stores located across the state. The company
headquarters are located in the capital city. They have identified their POS and dispatch systems as ‘critical
systems’. What threats can be identified for these systems?
Internal threats
 Viruses – deleting important data. Viruses can spread to stores via dial-up connection to
company headquarters. Point of Sale terminals are not connected to the Internet but are still
susceptible to virus attacks when employees transfer data from CDs or floppy disks.
 Hardware failure. Computer servers or networking equipment fail causing loss or
inaccessibility of data.
 Deleting or changing data. Accidental deleting or changing of data by employees or software
programs.
 Input errors. Mistakes by POS operators.
External Threats
 Theft of data. Corporate espionage by competitors or by a hacker.
 Break down of telephone connections. Inability to transfer data to head office.
 Fire, earthquake, flood or windstorm. Causes disruption to facilities or supply chain.

Activity
To practice identifying threats to the system go to Activity 1 and Activity 2 in the Activities section of the
Topic menu.
Activity 1 identifying possible threats
A small communications company, 4phones, is about to introduce an e-commerce system. List the possible
threats to the system. Identify whether they are internal or external and flag with an * any threats that are also
security threats.
Table 1: Threats

Threat | Category
Hackers attempting to get to the data stored on the site* | External
Hardware failures that stop the site operating | Internal
Denial of service attacks to bring the service down* | External
Data destruction by any means such as a user deleting a file* | Internal
Misuse of information by internal staff* | Internal
Power problems so site is down | External
Overloaded site so response is slow | External
Customers falsifying information to avoid payment* | External
Incorrect information such as wrong prices so customers pay too little* | Internal
Incorrect information such as wrong quantity in stock so customers have to wait for delivery* | Internal
Major disaster so site is down | External

LO2: Test
Name _________________________________________________ IdNo __________
Choose the correct answer from the given alternatives
1. In what order do the following risk analysis steps happen: 1. identify threats, 2. estimate loss, 3.
cost-benefit analysis, 4. consider safeguards, 5. implement plan, 6. consider probability, 7.
identify assets
A. 1324567
B. 7612354
C. 4567213
D. 7162435
2. An organization undertakes a risk analysis in order to identify:
A. how dependent they are on IT systems
B. what could go wrong with IT systems
C. what assets they might lose
D. what can be done about it
E. all of the above
3. An internal threat is caused mainly by the actions of internal employees. Which of the following
is an internal threat:
A. input error so data is corrupted
B. hurricane
C. earthquake
D. buggy software
4. What type of threat could originate from any employee who has physical access to equipment
and legitimate rights to information within the organization:
A. Internal
B. External
C. Direct
D. Indirect

LO 3: Formulate a prevention and recovery strategy
3.1. Strategies for dealing with risk

There are two main strategies for dealing with risk (apart from ignoring it in the hope it will go away):
prevent or recover. Both options have the objective of minimizing the impact of the risk event.
Prevention
With prevention you attempt to decrease the probability (maybe even to 0) of the event occurring or causing
damage. Many events can never be totally eliminated but their impact may be minimized.
For example, an extensive sprinkler system will ensure that any outbreak of fire does minimal damage. It is
almost impossible to totally prevent a fire from occurring in the first place but this is still considered a
preventative action. This type of activity may also be termed risk minimization.
Recovery
Recovery procedures are put in place to ensure that the system can be quickly restored after the event occurs.
For example, the use of a hot-site (one that has a computer system already set up and ready to use) allows for
speedy recovery after a fire has gutted the building. This process may also be termed a contingency. In fact
DRP is sometimes referred to as contingency planning.
Recovery and prevention options
The recovery or prevention option chosen will vary depending upon the threat being analysed. Some of the
more common options are listed in the following table.

Table 1: recovery and prevention options

Used | Option | Type
To recover data/software when it has been destroyed or corrupted. | Backup and recovery | Recovery
To minimize the impact of software bugs and errors. | Testing | Prevention
To stop unauthorized access and data theft or destruction. | User and resource security | Prevention
To stop errors in the data. | System controls | Prevention
To minimize the impact of a major disaster at the main site. | Hot sites – one option among many | Recovery
To stop unauthorized access to data. | Encryption, password control | Prevention
To stop virus attacks. | Virus checking software | Prevention
To minimize user errors. | User training | Prevention
To stop software being copied and breaking license agreements. | Software keys | Prevention
To allow access to data to continue even if a disk fails. | Mirrored disks or RAID (Redundant Array of Inexpensive Disks) systems, clustered systems | Prevention
To stop unauthorized access to data and data destruction. | Access rights | Prevention
To minimize impact of power loss or spikes and surges. | Uninterruptible power supplies (UPS), standby generator | Prevention

3.2. Cost of recovery and prevention options

As you can see, there are many options available to prevent risk from occurring. Some of these are based on
policies or standards and may involve no additional cost. However, some options, such as a hot site, can be
very expensive.
When deciding which options to adopt, you need to weigh the possible cost of the risk event (the single
incident cost) against the cost of the recovery or prevention option. A simple formula can be used to estimate
how much money to allocate to a recovery or prevention measure for an asset of known value.
Loss = Single Incident Cost × Rate of Threat Occurrence
The loss of critical systems can cost major organizations, such as banks, large sums of money. They are
therefore willing to invest in backup sites to keep their systems running in the event of a major disaster. Their
numerous branches and offices provide locations in which they can site the backup equipment.
While a typical small business can still suffer a relatively large loss in the case of critical system failure, it
will probably not choose to create a backup site because of the high cost.

Example of prevention and recovery options


Let us consider the case for installing a power surge protector in an average home. Suppose there is a power
surge while you are operating your computer. It could be seriously damaged or, at the very least, you would
be faced with disruption while your computer is being repaired.
Let’s assume the worst case scenario that the single incident cost is $1200 or the cost of a new computer.
Meanwhile, a computer vendor is selling power surge protectors for $10.
So spending $10 could save $1200 in the long run. While this represents a substantial cost benefit, it may
not be enough to convince some people to purchase such a device, especially if their computer is only used
for games.
However, people who use their home computer for work are likely to have a different attitude. Assume
someone is earning $50 per hour. Their computer is damaged by a power surge and is taken away for repairs
for one day. That person stands to lose around $400 (earnings for an 8-hour day) plus the cost of repairs –
say $1600 in total. Intangible costs also need to be considered: if a customer has their work delayed as a
result, they may decide to send their work elsewhere in the future.
If you live in an area that is prone to power surges, common sense would dictate that you purchase a power
surge protector. Let’s suppose that the probability of a power surge occurring on any given day is 1 in 120,
or roughly three times a year.
Use the following simple formula to estimate how much should be spent to safeguard against power surges.

Loss = Single Incident Cost × Rate of Threat Occurrence

Loss = $1600 × 1/120 = $13.33
Because the rate is per day, this is the expected loss per day. Over a year with roughly three surges, the
expected loss is about 3 × $1600 = $4800, so it would be viable to spend anything up to that amount per year
on protection against power spikes. A $10 surge protector is easily justified.
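
The arithmetic can be checked with a short Python sketch. It treats the 1 in 120 figure as a per-day rate and "roughly three times a year" as the annual rate; both interpretations, and the function and variable names, are assumptions made for this illustration. With the figures from the example, $1600 × 1/120 evaluates to about $13.33.

```python
def expected_loss(single_incident_cost, rate_of_occurrence):
    """Loss = Single Incident Cost x Rate of Threat Occurrence."""
    return single_incident_cost * rate_of_occurrence


single_incident_cost = 1600   # lost earnings plus repairs, from the example
daily_rate = 1 / 120          # assumed: chance of a surge on any given day
incidents_per_year = 3        # assumed: roughly three surges a year
safeguard_cost = 10           # price of the surge protector in the example

loss_per_day = expected_loss(single_incident_cost, daily_rate)
loss_per_year = expected_loss(single_incident_cost, incidents_per_year)

print(f"Expected loss per day:  ${loss_per_day:.2f}")
print(f"Expected loss per year: ${loss_per_year:.2f}")
print("Safeguard worthwhile:", safeguard_cost < loss_per_year)
```

The same comparison, safeguard cost against expected loss over the same period, is the cost-benefit step of the risk analysis.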
So why do so many people not bother to buy a power surge protector? When it comes to risk analysis,
people often seem to adopt an air of optimism – the ‘it will never happen to me’ syndrome. Interestingly, it
is the computer user who has already suffered a loss who buys these devices – a case of shutting the stable
door after the horse has not just bolted but died!
This attitude can also be found in business. There are many managers who are willing to live with a risk
rather than spend the money on something that may never be needed.
Available options
In choosing risk prevention and recovery options to employ you need to consider:
• how critical the system is and how far the organization relies on it
• the surrounding infrastructure and how susceptible the organization is to a risk event
• the existing procedures and controls used and how these may be enhanced
• the equipment that may be available to prevent or recover from the event
• the number of risk events or systems a particular option may cover
• what the option will cost and how much the organisation is prepared to pay.
When you have completed the analysis and considered the risk minimization options, the
findings are compiled in a report to be submitted to management for approval.
Once the risk minimization strategy has been approved it has to be acted upon and the
equipment and procedures put in place.

Example of DPR strategies
Consider the following disaster prevention and recovery strategies available for a typical home computer user.
 save work every few minutes
 regularly back up files according to their importance
 use external backup devices such as tape, zip or CDs
 store important files away from the home, possibly at the office
 use UPS or surge protectors especially in areas prone to power problems
 use telephone surge protectors with modems
 install and update anti-virus software
 create a repair disk
 record serial numbers of all components in case of theft
 keep a fire extinguisher in the vicinity of the computer
 use only licensed software and store all licenses safely
 use passwords and/or encryption to protect confidential files
 avoid storing passwords in dial-up settings (while this makes logging in easier, it also makes it
easy for a thief to access your account)
 use anti-spyware software and firewalls if connected to the Internet
 keep security patches for software (operating systems and applications) up to date.

3.3. Define and develop contingency plan

Contingency planning is developing responses in advance for various situations that might impact business.
Although negative events probably come to mind first, a good contingency plan should also address positive
events that might disrupt operations - such as a very large order.

Contingency planning is a systematic approach to identifying what can go wrong in a situation. Rather than
hoping that everything will turn out OK or that "fate will be on your side", a planner should try to identify
contingency events and be prepared with plans, strategies and approaches for avoiding, coping with or even
exploiting them.
The Importance of Contingency Planning

Every business has the possibility of a situation that adversely impacts operations. If the response to the
situation is poor, it might have a dramatic impact on the future of the business, such as loss of customers,
loss of data, or even the loss of the business.
A good contingency plan should include any event that might disrupt operations. Here are some specific
areas to include in the plan:
 Natural disasters, such as hurricanes, fires, and earthquakes
 Crises, such as threatening employees or customers, on-the-job injuries, and worksite accidents
 Personnel, such as death of a senior manager, or union members going on strike
 Data loss, such as loss due to natural disasters, sabotage, or other criminal action (such as an attack
on a website)
 Mismanagement, such as theft, neglect of critical duties, or accidental destruction
 Product issues, such as a huge order that requires reallocation of plant resources, or a product recall
3.4. Strategy report
A Strategy Report recommends the risk prevention and recovery strategies to be applied and provides a
summary of the risk analysis exercise. A typical report includes the following:
The systems covered by the analysis and any other scope definition
You may be preparing a plan for a single system or for several systems. There may already be a DRP in place
to recover from network or hardware disasters. You should define the areas that this particular plan covers
and also what it does not cover.
The systems that were identified as critical on which the analysis has been based
It is normal to focus on the most critical systems first since, if these are protected, the same processes will
often also protect other systems. You should ensure that readers of your report understand how you arrived at
your findings and why you concluded that a particular system is a critical one.
Parts of the business impacted by the systems
This may be described in terms of business functions or departments.
Possible impacts to the organization of major and minor events
Since you need to persuade managers to spend money on the DRP, you should describe in vivid terms the
impact of a disaster on profitability, cash flow and customer relationships to achieve a dramatic effect.
Current security and control of these systems
If the system is already in production then you should summarize the current situation and identify strengths
and weaknesses.
Assumptions made and the impact of any future developments
Your recommendations may be based on little or no change in the environment. However future business
developments may have an impact on your solution. For example, you may propose to use one of the
organization’s sites as a backup site to minimize the cost. If this site is scheduled for closure, then your plan
may not be practical.
Threat and risk events considered
This summarizes the risk analysis activity that should have been fully documented.
Findings and probabilities used in evaluation
Details of the findings of the risk analysis outlining the method used to determine probability of events
occurring.
Cost to the organization if events occur
Costs should be expressed in such a way that they capture the attention of the managers reading the report.
For example ‘If the system is down for 30 minutes we would lose $1,000,000 in revenue!’
Possible preventative and recovery measures (a major part of the report)
Having described the problems, you can now show how you can solve them. You may need to provide
alternative approaches and options, for example, the facilities provided by a hot site and a cold site.
Cost benefit analysis
This is what the managers will be keen to find out. What is the value of the benefits of the proposed solutions
against the cost? You should also include intangible benefits (for example, improved customer service) in
your analysis even though these are difficult to quantify in dollar terms.
Recommendations
Develop your argument into a recommendation. It may also be worth discussing what may happen if the
recommendations are not followed.
Action items and activities required to implement recommendations
To show that you have thought through the proposal, you should describe how the DRP will be implemented
if it is approved.
3.5. Submitting the report
The report needs to be submitted to management for approval and authorization of the required funding.
Often you will be asked to present and explain your report in person. This is an opportunity, if you are well
prepared, to obtain the desired approval from management.
Your presentation (using PowerPoint or other software) should include the following:
 introduction and approval process
 importance of a DRP
 impact of a disaster event
 real-life example(s)
 what the DRP will offer the business
 threats to be safeguarded against
 recovery and prevention processes
 cost benefit analysis
 how the DRP supports the business
 recommendations
 action plan
 Conclusion and call for approval.

3.6. Getting approval
You may think that you have developed the best DRP in the world. However, you might present it to
management only to have it rejected. Why could this happen?
As you prepare to write your report and/or present your case consider the following issues:
Present to the audience
Use appropriate language. If the audience is made up of non-IT managers, avoid technical terms. Use
business terms and try to show the impact on individuals. For example, ask the payroll manager ‘What
would be the impact of an incorrect tax calculation?’ Ask the production manager ‘How long could you go
without raw materials before laying off workers?’
Make it a business case
All major decisions that management makes are usually based on a business case. Basically this explains
the current situation, what the problems are and how to solve them. Express your argument in values and
key performance indicators. Most organisations focus on the profit of the business. Explain what impact a
disaster would have on this.
Provide examples to support your case
Disasters occur all the time. Perhaps your business recently suffered a power outage. What happened and
how did it cope? Use this and other real-life incidents to demonstrate that these things do happen. Carry out
research and have the facts available to back up your argument.
Consider legal or contractual implications
The business may need to meet certain legal or statutory obligations. How embarrassing would it be for a
hospital if patients’ records were disclosed? What would happen if Tax File Numbers were not kept secure?
Show cost benefits and extra benefits
The heart of the business case is what the DRP will cost and what benefits will be gained. The problem with
DRPs is that if a disaster never occurs, it can appear to be a waste of money. Are there any benefits to be
gained as a by-product? For example, could the use of encryption be used as a marketing tool to encourage
more security-conscious customers?
Work on the budget
Where will the money come from? Can the costs be spread over a period of time? What can be achieved
without cost or simply by reconfiguring the system? Can each business department be individually billed?
Don’t forget to add the cost of the project team carrying out the DRP.
Provide alternatives
Managers like choices. Give them options but don’t give them so many that they get confused. Keep it
simple. Remember, they can still decide to go with the risk and not put in place any recovery or preventative
strategy.
Show you can provide solutions
Describe the threats and the problems but quickly move on to your suggested solutions and the associated
benefits. Managers want to hear solutions not problems. If you follow these guidelines then you should get
the desired response.
The minutes of the meeting and a summary of any changes made to the proposal will form the basis of the
DRP that will be implemented.
If your recommended DRP has been modified by management or they have chosen one of a number of
alternatives that you suggested, you should produce a new document that reflects these decisions. This
should then be signed off as the DRP.

Activities
Activity 1 - identifying prevention and recovery options
4phones, a small communications company, has identified major threats to its new e-commerce system.
Look at some of the threats to the system and suggest possible prevention or recovery options.

1. Hackers attempting to get to the data stored on the site
Prevention – firewalls and encryption
2. Hardware failures that prevent site from operating
Prevention - RAID for disks and other fault tolerance
Recovery - hot site and spare components
3. Denial of service attacks to bring service down
Prevention - firewalls and a monitoring system
4. Data destruction by any means such as a user deleting a file
Prevention - appropriate access rights
Recovery – Backup
5. Misuse of information by internal staff
Prevention - regulations for use of information, checking references for new staff, management
control and observation
6. Power problems causing site to go down
Prevention – UPS and generator
7. Overloaded site causing slow response
Prevention - spare capacity and monitoring system
8. Customers falsifying information to avoid payment
Prevention - verification with finance institution
9. Incorrect information such as wrong prices for products
Prevention – data controls
10. Incorrect information such as quantity in stock so customers have to wait for delivery
Prevention – data controls
11. Major disasters causing site to go down
Recovery - Hot site
Activity 2 – evaluating preventive and recovery options

The City Institute of Technology (CIT) will implement a new system to test students using computerized
testing systems. These tests will include vendor exams such as Microsoft MCSE, Novell CNA, etc.
Before implementing the system, you need to evaluate potential threats and for each threat:
 evaluate what can be done to prevent/minimize or recover from the risk
 consider whether the option would be costly to implement on a scale of 1 to 5 (highest)
 Indicate whether the option should be considered an important or essential business
requirement on a scale of 1 to 5 (highest).
Activity 2 – Evaluating preventive and recovery options

Threat | Options | Cost (1-5) | Business requirement (1-5)
Disasters that stop the centre operating such as fire, flood, earthquake | | |
Hardware problems that stop system operating | | |
Credit card fraud. With the short time frame the student could be tested before any credit card discrepancy was identified. | | |
Student not turning up and exam lapses so $50 is lost. | | |
ISDN links broken delaying download of exams | | |
Hackers who may try to access test data or student data | | |
Internal unauthorised access to test data or student data | | |
Theft or misappropriation of test certificates | | |

Table 1: preventive and recovery options

Threat | Options | Cost (1-5) | Business requirement (1-5)
Disasters that stop the centre operating such as fire, flood, earthquake | Hot site | 5 | 2
Hardware problems that stop system operating | Fault-tolerant systems | 4 | 4
Credit card fraud. With the short time frame the student could be tested before any credit card discrepancy was identified. | 1. Verify identity before taking exam; 2. Insurance to cover fraud; 3. Change business processes to only take cash | 1. 1; 2. 4; 3. 2 | 5
Student not turning up and exam lapses so $50 is lost. | 1. Charge a cancellation fee at time of booking; 2. Set up a system to try to re-use exam before expiry | 1. 1; 2. 3 | 3
ISDN links broken delaying download of exams | Alternate links | 4 | 5
Hackers who may try to access test data or student data | 1. Encryption; 2. Firewalls; 3. Monitoring network | 1. 1; 2. 2; 3. 1 | 5
Internal unauthorised access to test data or student data | 1. Encryption; 2. Firewalls; 3. Monitor network | 1. 1; 2. 2; 3. 1 | 5
Theft or misappropriation of test certificates | Secure system for certificates | 1 | 5
Activity 3 - strategic report
After completing the risk analysis for the 4phones e-commerce project, you believe that RAID (Redundant
Array of Inexpensive Disks) should be used in the server to prevent hardware failure. Write a report that
justifies your decision. Use hypothetical (but practical) data to support your recommendation.
You should have covered the following matters in your report:
 The use of RAID will protect against the failure of a single disk in the server. Since disks are
electromechanical devices, they are the component most susceptible to wear and tear and
subsequent breakdown. They also store the data, which may be difficult or impossible to recover
depending upon when the breakdown occurs. RAID will not protect against other hardware
failures such as power failures or major disasters such as fire.
 The server has been identified as a critical component in the system and its loss could cause
considerable problems and loss of revenue and profit.
 All parts of the system will be impacted by the loss of disks in the server. The cost to the
business of losing the server disks for a day could be $100,000.
 The only current facility to cope with such an event is to restore from backup. This takes four
hours during which time we would not be able to operate the system. In addition the backup
tapes could be on average 12 hours old and so will not have current information.
 While we will eventually have a high-speed link to a backup site, the use of
RAID provides a cost-effective solution until this link is established in 10
months’ time.
 The cost of a RAID system would be in the region of $12,000. We will also
gain an improvement in the performance of disk access in the region of 10%.
 If this recommendation is approved, we can order the RAID components and
have them installed and operating within a week.

LO3:- Test
Choose the correct answer from the given alternatives
1. What are the two main strategies when dealing with risk:
A. deterioration and prevention
B. hindrance and deterioration
C. prevention and recovery
D. prospect and possibility
2. Which of the following is a recovery option:
A. Testing
B. Encryption
C. Backup
D. access rights
3. ACME is considering a recovery option for its accounting and sales database. There is 120GB
of critical information stored on the hard disk drive. The estimated single incident cost of losing
all the information is about $400,000. Use a simple formula to calculate which of the following
recovery options is cost effective, considering the probability of losing the data is 1 in 100.
A. A mirrored site costing $450,000.
B. A tape backup system costing $15,000.
C. A tape backup system costing $7,580.
D. A tape backup system costing $3,980.
4. The outcome of a risk analysis exercise is a disaster prevention and recovery strategy report.
When considering cost benefits it is important to describe:
A. Tangible benefits.
B. Intangible benefits.
C. Tangible and intangible benefits.
D. None of the above.
5. When implementing the prevention and recovery options it is necessary to review the
organization’s policies and procedures. If changes are made to policies, procedures should be
updated to reflect these. It is important to then:
A. Test the procedure.
B. Test the procedures and document the results.
C. Implement the procedures as soon as possible.
D. Do not change company policies.
6. The output of a risk analysis is:
A. An action plan.
B. A disaster recovery plan.
C. A business impact statement.
D. Both a and b.
7. Which of the following is not part of a strategy report:
A. parts of the business impacted by the system
B. threats and risk events considered
C. cost to the organization if events occur
D. how to declare a disaster

LO 4: Develop Disaster Recovery Plan to Support Strategy
4.1. Identifying resources required for disaster recovery

As organizations rely more on technology and electronic data for their daily operations, the amount of data
and information technology infrastructure lost to disasters appears to be increasing. Organizations are
estimated to lose revenue and incur expenses every year due to disasters, unpreparedness, and lost
productivity. Measures must be taken to protect your organization from disasters.

One way your organization can prepare and protect itself from disasters is to create and implement a
disaster recovery plan (DRP). Organizations should create a disaster recovery plan that can address any
type of disaster. The plan should be easy to follow and understand, and be customized to meet the unique
needs of the organization. Typical elements in a disaster recovery plan include the following:

1. Create a disaster recovery team. The team will be responsible for developing, implementing, and
maintaining the DRP. A DRP should identify the team members, define each member’s responsibilities,
and provide their contact information. The DRP should also identify who should be contacted in the event
of a disaster or emergency. All employees should be informed of and understand the DRP and their
responsibility if a disaster occurs.

2. Identify and assess disaster risks. Your disaster recovery team should identify and assess the risks to
your organization. This step should include items related to natural disasters, man-made emergencies, and
technology related incidents. This will assist the team in identifying the recovery strategies and resources
required to recover from disasters within a predetermined and acceptable timeframe.

3. Determine critical applications, documents, and resources. The organization must evaluate its
business processes to determine which are critical to the operations of the organization. The plan should
focus on short-term survivability, such as generating cash flows and revenues, rather than on a long term
solution of restoring the organization’s full functioning capacity. However, the organization must recognize
that there are some processes that should not be delayed if possible. One example of a critical process is the
processing of payroll.

4. Specify backup and off-site storage procedures. These procedures should identify what to back up, by
whom, how to perform the backup, location of backup and how frequently backups should occur. All
critical applications, equipment, and documents should be backed up. Documents that you should consider
backing up are the latest financial statements, tax returns, a current list of employees and their contact
information, inventory records, customer and vendor listings. Critical supplies required for daily
operations, such as checks and purchase orders, as well as a copy of the DRP, should be stored at an off-
site location.

5. Test and maintain the DRP. Disaster recovery planning is a continual process as risks of disasters and
emergencies are always changing. It is recommended that the organization routinely test the DRP to
evaluate the procedures documented in the plan for effectiveness and appropriateness. The recovery team
should regularly update the DRP to accommodate changes in business processes, technology, and
evolving disaster risks.
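The backup details listed in step 4 (what to back up, by whom, how, where and how often) can be recorded in a simple structure. The following Python sketch is only one possible way to do this; the class name `BackupProcedure`, its field names and the sample values are assumptions made for illustration and are not part of any standard.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class BackupProcedure:
    what: str             # what to back up
    responsible: str      # by whom
    method: str           # how the backup is performed
    location: str         # where the backup is stored
    frequency: timedelta  # how frequently the backup should occur

    def is_overdue(self, last_backup: datetime, now: datetime) -> bool:
        """Return True if more time has passed than the agreed frequency."""
        return now - last_backup > self.frequency


# Hypothetical entry for a critical system.
payroll = BackupProcedure(
    what="Payroll database",
    responsible="Systems administrator",
    method="Full backup to tape",
    location="Off-site storage",
    frequency=timedelta(days=1),
)

now = datetime(2024, 1, 10, 9, 0)
late = payroll.is_overdue(datetime(2024, 1, 8, 23, 0), now)     # 34 hours ago
on_time = payroll.is_overdue(datetime(2024, 1, 9, 23, 0), now)  # 10 hours ago
print(late, on_time)
```

Keeping one such record per critical system makes it easier to check, during routine testing of the DRP, whether each backup is actually being taken as often as the plan requires.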

4.2. Implementing a disaster prevention and recovery strategy

Once the DPR strategy has been formally accepted by the business and approved by senior management,
it’s time to implement it. Required actions include:
 changing procedures, e.g. virus checkers to run each time a computer is switched on
 purchasing equipment to provide fault tolerance and standby
 implementing additional controls to identify errors
 improving backup procedures
 increasing security over data and user access
 developing the disaster recovery plan.
These can be categorized as:
 building or implementing in-built system contingencies
 bringing the current site to the standard required
 making changes to policies and procedures
 implementing additional or changed hardware and/or software.
In-built system contingencies
Not all prevention or recovery processes will cost money to implement. Often existing facilities have not
been fully implemented or turned on. These will vary from system to system and it is important for the
team undertaking the risk analysis to be aware of these built-in facilities.
We will examine a few of the built-in facilities of Windows XP Professional and how these may be used to
safeguard against different risk events. These are summarized in the following table:
Table 1: Windows XP system contingencies

Facility | Function
User accounts | Restrict access to authorized users only.
Encryption | Additional level of security to ensure that confidential files are secure.
Permissions | Allows some users restricted access (such as read only) to safeguard the data from destruction or corruption.
Auditing | Tracks events to determine what users have been doing on their computers.
Lock computer | Prevents others from accessing a user’s computer.
Support for smart cards | Restricts access to authorized users only.
Automated System Recovery | Allows quick recovery from an operating system problem.
Support for RAID 5 and mirroring | Allows system to continue working even if a hard disk fails.
Recycle bin | Allows recovery of recently deleted files.
Backup software | Creates backups of files and the whole system.
System restore | Monitors and records system changes. Enables roll back to a previous point in time.
File protection | Protects Windows files from being corrupted by rogue software installs.
Firewall | Prevents malicious attacks by worms and other viruses from the network or Internet.
Controls such as passwords and access permissions may be referred to as logical controls.

Current site configuration


Here we are primarily concerned with systems in terms of software, data and hardware. However, the
security and controls that are implemented at the physical site are also an important consideration in the
risk analysis.
While encryption and user access can be used to prevent unauthorized access, no-one should be able to
physically access a computer in the first place. The following diagram will give you an idea of some of the
levels of physical security that may be applied.
An organization in a secure building with locked doors on each floor with security guards and video
cameras can be confident that an intruder would find it difficult to access a PC and the confidential data it
contains. However, many frauds and errors are perpetrated by trusted employees. That is why there is still
an ongoing need for logical controls and passwords for each user.

Figure 1: Security measures

Review and update policies and procedures


The normal day-to-day operations of an organization are described in its policy and procedures manual.
This may be stored electronically, on the company's Intranet or published as a paper-based manual. After
designing the recovery requirements, you will often need to update this manual to include the changes
required to prevent or recover from a disaster.

As mentioned earlier, many risk events are also security threats which are often identified during a
security audit or review. Similarly, review and investigation of the current procedures also form part of the
Disaster Recovery Planning process to ensure that they meet DRP requirements.
The review process has the following stages:
1. Identify key DRP issues that should have been resolved by the existing processes and
procedures
2. Review and evaluate the operational policies to ensure that they meet the demands imposed by
the DRP
3. Design a series of tests to verify that procedures are in accordance with these policies
4. Carry out the testing and document the results
5. Evaluate the findings and make any recommendations for changes or approve the current
processes.
The procedural changes required will depend upon what is discovered and the DRP strategy adopted. Here
are a few examples:

Table 2: Examples of procedural changes

Strategy adopted | Impact on procedures
Nightly backups to be taken offsite | Backup procedures and the process for getting backups offsite and subsequent retrieval will need to be described.
Software to be fully tested before going into production | Testing procedures (defining what ‘fully tested’ means), documentation and test results to be maintained will need to be described.
Virus checking | Procedures to explain the danger of viruses, how to check for viruses on disks and in e-mails and what to do if a virus is discovered will be required.
Only licensed software to be used | Procedures for checking the numbers of licenses that the organization has and what to do if more are needed will be required. Penalties to be imposed if staff disregard the policy.

A set of procedures for the disaster recovery plan itself will also be required.
Additional or changed hardware and/or software required
A DRP strategy usually requires new or updated hardware and software. Some of these requirements are
detailed in the following table:
Table 3: DRP requirements

Strategy | Hardware or software
Regular backups to tapes | Tape backup unit with sufficient capacity. Tapes for the backup. Appropriate backup software.
Mirrored disks or RAID | Additional disks or disk subsystems.
Fault tolerance systems, duplicated systems | Requires similar hardware to that being duplicated. If a file server is to be duplicated, a matching machine will be needed. May also require additional software licenses.
Virus checking | Virus software licenses for all users.

Think about the hardware and software that would be required by the home user to implement the disaster
prevention and recovery strategies identified earlier, under which:
 work is saved every few minutes
 files are regularly backed up
 external backup devices such as tape, zip or CDs, are used
 important files are stored away from the home, possibly in the office
 UPS or surge protectors are used especially if in an area that suffers power problems
 telephone surge protectors are used with modems
 virus checking software is always used and kept up to date
 a repair disk is always created
 serial numbers of all components are recorded in case of theft
 a fire extinguisher is kept in the vicinity of the computer
 only licensed software is used and all licenses are stored safely
 passwords and/or encryption are used to protect confidential files
 passwords are not stored in dial-up settings
 anti-spyware software and firewalls are always used if connected to the Internet
 security patches for software (operating systems and applications) are kept up to date.
The following hardware and software would be required:
 Backup tape unit (or zip drive or CD writer), tapes (or zip cartridges or CDs),
appropriate backup software and hardware drivers
 UPS and/or surge protectors for power and telephone
 Virus-checking software
 Fire extinguisher.

4.3. Identifying cut-over criteria

How do you know when to activate your disaster recovery plan? If an earthquake destroyed the office
building, the answer would be obvious. But what if a computer virus deleted all the data on one or all of the
servers? Each possible incident needs to be analyzed to determine the impact of the disruption to the
business. The first step is to determine the extent of the impact to establish how long it will take for the
business systems to be restored. If this exceeds the maximum allowable downtime, then a disaster is
declared.
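The cut-over rule described above can be expressed as a simple comparison. The following Python sketch assumes that both the estimated restoration time and the maximum allowable downtime are already known; the function name `should_declare_disaster` and the hour values are hypothetical and chosen only for illustration.

```python
from datetime import timedelta


def should_declare_disaster(estimated_restore_time: timedelta,
                            max_allowable_downtime: timedelta) -> bool:
    """Return True when the expected outage exceeds the allowable downtime."""
    return estimated_restore_time > max_allowable_downtime


# Hypothetical figures: restoring the servers after a virus is expected to
# take 6 hours, while the business can tolerate at most 4 hours of downtime.
virus_outage = should_declare_disaster(timedelta(hours=6), timedelta(hours=4))

# Hypothetical figures: a failed disk can be replaced within 1 hour.
disk_outage = should_declare_disaster(timedelta(hours=1), timedelta(hours=4))

print(virus_outage)  # True
print(disk_outage)   # False
```

In practice the difficult part is estimating the restoration time for each kind of incident, which is why each possible incident should be analyzed in advance.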
The Disaster Recovery Coordinator, with input from upper management, is responsible for deciding
when to activate the disaster recovery plan. If the coordinator is not available, responsibility flows down
the chain of command. This is why it is important for roles and responsibilities to be clearly defined in the
Disaster Recovery Plan. A contact list should be created and maintained containing details of all employees
with after-hours phone numbers. If the organization already has an internal directory listing, it can be modified
accordingly.
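The idea of responsibility flowing down the chain of command can be sketched as a search through an ordered contact list. This Python example is a minimal illustration only; the `Contact` class, the `find_decision_maker` function, and all names and phone numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Contact:
    name: str
    role: str
    after_hours_phone: str
    available: bool = True


def find_decision_maker(chain_of_command: List[Contact]) -> Optional[Contact]:
    """Return the first available person, walking down the chain of command."""
    for contact in chain_of_command:
        if contact.available:
            return contact
    return None


# Hypothetical contact list, ordered from the top of the chain downwards.
chain = [
    Contact("A. Coordinator", "Disaster Recovery Coordinator", "555-0100", available=False),
    Contact("B. Deputy", "Deputy Coordinator", "555-0101"),
    Contact("C. Manager", "IT Manager", "555-0102"),
]

decision_maker = find_decision_maker(chain)
if decision_maker is not None:
    print(f"Call {decision_maker.name} ({decision_maker.role}) "
          f"on {decision_maker.after_hours_phone}")
```

Here the coordinator is marked as unavailable, so the decision passes to the deputy. The sketch only works if the contact list is kept current, which is why the list must be maintained as staff change.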

Figure 2 Example of a generic structure for disaster recovery

Documenting the Disaster Recovery Plan


All that remains is to document the Disaster Recovery Plan. The plan outlines the tasks that need to be
completed to recover from the disaster and return the business to its normal operations. The plan is a
dynamic one – it will constantly change as the business changes. Therefore it is important to review it at
regular intervals to ensure it is up to date.
There are many different possible formats for a DRP. Here is one suggestion:
 Introduction
 Purpose
 Scope
 Authorities (what legal/contractual requirement the DRP complies with)
 Record of change
 Operations
 Systems description and architecture (a general description of all the systems)
 Responsibilities (detailed outline of teams responsible for recovery operations)
 Activation phase (initial actions to detect and assess damage)
 Recovery phase (processes and procedures to complete recovery of each system with
nominated staff positions responsible for each task)
 Details of the post-recovery review to be performed after the completion of the recovery from
any declared disaster.
