
INCIDENT MANAGEMENT PROCESS

Content
1. Introduction
1.1 The life cycle of an incident
1.2 Process short description
1.3 Scope
1.3.1 Support groups
1.3.2 Locations
2. Description of activities
2.1 Incident detection and recording
2.1.1 Source of incident
2.1.2 Recording
2.2 Classification and prioritization
2.2.1 Classification
2.2.2 Prioritization
2.3 Investigation and diagnosis
2.4 Incident closure
2.5 Incident ownership & monitoring
2.6 Incident communication
2.7 Escalation
2.8 Crisis Management
3. The roles and responsibilities
4. The incident KPIs
5. Annexes
5.1 IRP Component List
1. Introduction
This document describes the Incident Management process as part of the overall Service
Management.

The general framework used in service management is based on ITIL (IT Infrastructure
Library) standards and recommendations.

ITIL describes the contours of organizing Service Management.

The APS Incident Management is part of the Service Delivery activities and is led by an Incident
Manager.

1.1 The life cycle of an Incident


All the processes and functions described in ITIL relate to each other. A simple example is
the life cycle of an incident:

FIG Life cycle of an incident: SDE qualifies the support ticket (Analysis & Creation) and assigns the incident to the operational support teams (ADM incident team, or EUS Mac for a Mac server issue); each team performs Analysis & Correction, after which the incident is CLOSED.

Once Operations users raise a support ticket, the Service Delivery team is in charge of handling it.

The Service Delivery team converts the support ticket into an incident when the nature of the request requires it.

Depending on the type of incident (application or infrastructure issue), the corresponding team
is brought in to handle it.

Where the involvement of the development team is needed, the ticket is assigned to them to take
it forward. The development team falls under the ADM hierarchy.
Once the ADM team has provided the analysis and fix, the incident is closed. Incidents
handled by the APS Service Delivery team are closed once the resolution is provided.

Whenever the incident involves a Mac server asset, SDE opens a support ticket with the EUS Mac
team, which analyses and solves the issue. After checking the resolution ticket, the SDE team must
close the incident.
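
The assignment rules described above can be sketched as a small routing function. This is only an illustration: the enum, the function name and the three boolean flags are hypothetical, and the order in which the rules are checked (Mac server first, then development, then infrastructure) is an assumption, since the process text does not state a precedence.

```python
from enum import Enum


class Team(Enum):
    """Teams named in the life cycle description."""
    APS_SERVICE_DELIVERY = "APS Service Delivery"
    ADM = "ADM"
    INFRASTRUCTURE = "Infrastructure"
    EUS_MAC = "EUS Mac"


def route_incident(involves_mac_server: bool,
                   needs_development: bool,
                   is_infrastructure_issue: bool) -> Team:
    """Return the team that works on the incident (precedence is assumed)."""
    if involves_mac_server:
        # SDE opens a support ticket with the EUS Mac team.
        return Team.EUS_MAC
    if needs_development:
        # The development team falls under the ADM hierarchy.
        return Team.ADM
    if is_infrastructure_issue:
        return Team.INFRASTRUCTURE
    # Otherwise the APS Service Delivery team handles and closes it.
    return Team.APS_SERVICE_DELIVERY
```

Whatever team performs the fix, closing the incident remains with the Service Delivery team, so the function only answers who works on the incident, not who closes it.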

1.2 Process short description


The primary goal of the IRP APS Incident Management process is to restore normal service
operation as quickly as possible and minimize the adverse impact on business operations,
thus ensuring that the best possible levels of service quality and availability are maintained.

The IRP APS Incident Management process encompasses the following activities:
- Incident detection and recording
- Classification and initial support
- Investigation and diagnosis
- Resolution and recovery
- Incident closure
- Incident ownership, monitoring and communication

1.3 Scope
The IRP APS Incident Management process considered here covers the support activities
under the responsibility of the Application Support. The supported business activity is:
- IFS IRP

1.3.1 Support groups

The support groups of the Incident Management process as described in the present
document are mainly:
- 1st level (IRP APS Team, which is also the service desk in this case)
- 2nd level (IRP ADM Team or any other infrastructure team)

1.3.2 Locations

The following locations are mainly supported by the Incident Management process:

1) Paris
2) India
3) UK
4) Australia
5) New Zealand
6) Poland
7) USA
8) Singapore
2. DESCRIPTION OF ACTIVITIES

When Service Delivery is notified of an incident, the following activities are
performed:

2.1 Incident detection and recording


The fundamental requirements are:

- All incidents should be registered in the 2Strack tool.
- Service Delivery receives appropriate alerts and maintains overall control.

Incident monitoring remains the responsibility of Service Delivery.

An alert to the Service Manager is required in the case of serious degradation of service levels, in
case it is necessary to take special technical and/or communication action(s).

2.1.1 Source of Incident


There are 4 main entry points for incident detection:

1. Emails sent to the mailbox: PARIS BP2S IRP SDE
2. Calls made to the Hotline number +91444444444444
3. Support tickets in the 2Strack tool assigned to the IRP SDE group
4. Incidents proactively raised by the SDE IT team

The detection of incidents related to hardware, networks, servers, and/or reported by
alerting tools is not covered in this document, as these incidents are under the responsibility of IPS;
however, communication remains key.

2.1.2 Recording
The tool used to have a complete overview of the incident life cycle from beginning to
closure is 2Strack.
A description of the tool, working instructions and a user guide are available.
Link for the user guide:
https://socialbusines.group.echonet/wikis/home?lang=fr

All major production incidents (Priority 1 & 2) are followed by a detailed incident report.
These reports are stored in the Allshare directory below:
https://allshare-bp2s.is.echonet/

2.2 Classification and Prioritization


2.2.1 Classification
The classification is applied once an incident has been declared. It associates the incident
with a particular application and/or service from one of the supported Business Lines in
order to select the right level of priority and allow the assignment of this incident to the
relevant support group (if necessary).

The classification is done in the 2Strack incident ticket by selecting the correct
component.
The full list of components is provided in annex 5.1.
Below is an example of an incident ticket showing the components:

FIG example

2.2.2 Prioritization
Each declared incident gets one of the defined priorities according to the matrix of
priorities.
This matrix of priorities has been created according to business needs and is the
combined result of the impact on the business and the urgency.
The impact is a measure of the business criticality of an incident or problem, often equal
to the extent to which an incident leads to degradation of agreed service levels. The
impact is often measured by the number of affected users and/or clients.

The urgency reflects the necessary speed of solving an incident of a certain impact. A
high-impact incident does not, by default, have to be solved immediately if this impact
does not affect business commitments or service levels significantly.
The resulting priority then reflects the expected effort.

Adapting the ITIL standard impact / urgency matrix to BP2S leads to the following
definitions for the incident "severity".

IMPACT
- Major: System / Service wholly unavailable and requiring investigation to determine the necessary restorative action
- Significant: System / Service significantly degraded and requiring investigation to determine the necessary restorative action
- Minor: System / Service degraded or otherwise provided such that normal business process and usage is impacted

URGENCY
- High: Business commitments, client service levels and financial loss at risk in immediate timeframe
- Medium: Business commitments, client service levels and financial loss at risk within 2 hours
- Low: Business commitments, client service levels and financial loss not reasonably at risk

URGENCY / IMPACT   Major                  Significant            Minor
High               Severity 1 "Critical"  Severity 2 "High"      Severity 2 "High"
Medium             Severity 2 "High"      Severity 2 "Medium"    Severity 3 "Medium"
Low                Severity 3 "Medium"    Severity 4 "Low"       Severity 4 "Low"

Incident Reports are prepared for severity 1 and severity 2 incidents.
An incident's severity can increase in the event of protracted resolution.
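
As an illustration, the impact / urgency matrix can be expressed as a lookup table. The sketch below is not part of the 2Strack tool; the names SEVERITY_MATRIX, severity and requires_incident_report are hypothetical. Cell values are copied as printed in the matrix, including the Medium urgency / Significant impact cell, where the printed number (2) and label ("Medium") do not agree with the priority definitions; that cell should be confirmed before relying on it.

```python
# (urgency, impact) -> (severity number, label), copied from the matrix as printed.
SEVERITY_MATRIX = {
    ("High", "Major"): (1, "Critical"),
    ("High", "Significant"): (2, "High"),
    ("High", "Minor"): (2, "High"),
    ("Medium", "Major"): (2, "High"),
    ("Medium", "Significant"): (2, "Medium"),  # as printed; number and label disagree
    ("Medium", "Minor"): (3, "Medium"),
    ("Low", "Major"): (3, "Medium"),
    ("Low", "Significant"): (4, "Low"),
    ("Low", "Minor"): (4, "Low"),
}


def severity(urgency: str, impact: str) -> tuple:
    """Look up the (number, label) severity for an urgency / impact pair."""
    key = (urgency.strip().capitalize(), impact.strip().capitalize())
    try:
        return SEVERITY_MATRIX[key]
    except KeyError:
        raise ValueError(f"unknown urgency/impact pair: {urgency!r}, {impact!r}") from None


def requires_incident_report(urgency: str, impact: str) -> bool:
    """Incident reports are prepared for severity 1 and severity 2 incidents."""
    number, _label = severity(urgency, impact)
    return number <= 2
```

For example, severity("High", "Major") returns (1, "Critical"), and requires_incident_report("Low", "Minor") returns False.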
Priority: Definition

1 - Critical: An IT service interruption that puts the affected business' client
commitments immediately at risk or presents high risk of financial loss.
- All clients are impacted
- Database is down
- Data breach

2 - High: An IT service interruption or degradation that puts the affected
business' client commitments at risk or presents risk of financial loss.
- 1 Strategic or Market Strategic client is impacted
- Impacted dynamic report is a daily production

3 - Medium: An IT service degradation or deficiency that may (if protracted or
repeated) put the affected business' client commitments at risk of
financial loss.
- 1 non-strategic and non-market-strategic client is impacted
- Impacted dynamic report is a monthly production

4 - Low: An IT service degradation or deficiency that does not put the
affected business' client commitments at risk nor present risk of
financial loss.
- Example: internal IRP KPI template down

2.3 Investigation and Diagnosis


Once an incident is raised, the APS Service Delivery team identifies the correct team to
work on the ticket.
After the initial analysis, the Service Delivery team reaches out to the infrastructure teams for
infrastructure-related issues, providing the details they need to investigate from
their end.
For incidents that need the involvement of the Development team (ADM), the ADM team (level
3 support) is involved to proceed with the investigation.
The primary goal is to identify a workaround to minimize the impact of the incident.

2.4 Incident closure


When the incident has been resolved, Service Delivery ensures that:
- Details of the actions taken to resolve the incident are concise and readable.
- The resolution / action is agreed with the customer, verbally or, preferably, by email.
- All details applicable to the incident, such as outage time, degradation time, root cause and
business impact, are recorded in the 2Strack ticket.
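
A pre-closure check for the details listed above could look like the following sketch. It is only an illustration: the ticket is modelled as a plain dictionary and the field names are hypothetical, not actual 2Strack field names.

```python
# Details the process requires in the ticket before closure (names are assumed).
REQUIRED_CLOSURE_FIELDS = (
    "outage_time",
    "degradation_time",
    "root_cause",
    "business_impact",
)


def missing_closure_details(ticket: dict) -> list:
    """Return the required closure fields that are absent or blank."""
    missing = []
    for field in REQUIRED_CLOSURE_FIELDS:
        value = ticket.get(field)
        # A value of 0 (e.g. no degradation time) counts as recorded.
        if value is None or str(value).strip() == "":
            missing.append(field)
    return missing
```

An empty result means every listed detail is recorded; customer agreement on the resolution still has to be confirmed separately.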

This process is essential in resolving disagreements between a service provider and a customer over
the validity of a closure.

Closing the incident ticket is done solely by Service Delivery members, and mostly by the
Incident Manager.
Once an incident is fixed, an impact gathering template is sent to users, requesting them to
furnish the complete details.

Please provide details for the following areas:

- Client SLAs breached (please provide impacted SLA references):
- Financial impact as Profit & Loss (€):
- Bank image / market reputation / visibility:
- DI (incident declaration into Ops database like forecast) (Y/N):
- Staff overtime incurred as a result of the incident (hrs):
- Communicate this incident into weekly reporting for BP2S Management Board (Y/N):

2.5 Incident ownership & monitoring


Service Delivery is responsible for incident communication throughout the incident life
cycle:
- Regular updates to users through the appropriate distribution lists
- Communication to the technical teams on the progress of incident resolution
- Checking for similar incidents and similarities between incidents
- Raising a problem ticket for recurring incidents to ensure they are resolved permanently

The distribution lists used for communication are:

1) BP2S Global I Incident Reports IRP


2) BP2S Global I Incident Reports IT IRP

The frequency of incident communication is mainly based on the level of priority.

1) Every 1 hour for P1-Critical
2) Every 2 hours for P2-High
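
The communication cadence can be sketched as a small helper that computes when the next update is due. The names UPDATE_INTERVAL and next_communication are hypothetical; the process only defines a frequency for priorities 1 and 2, so the sketch returns None for other priorities rather than assuming one.

```python
from datetime import datetime, timedelta
from typing import Optional

# Interval between updates, keyed by priority number (only P1 and P2 are defined).
UPDATE_INTERVAL = {
    1: timedelta(hours=1),
    2: timedelta(hours=2),
}


def next_communication(priority: int, last_sent: datetime) -> Optional[datetime]:
    """Return the latest time the next update is due, or None if no cadence is defined."""
    interval = UPDATE_INTERVAL.get(priority)
    if interval is None:
        return None
    return last_sent + interval
```

The returned time is a deadline, not a schedule: an update can go out sooner if more information becomes available.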

2.6 Incident communication

The template used for incident communication is below:

BP2S Application Production Support - Incident Communication

Incident Title:
Application Incident Reference:
Priority:
Incident Status:
Application name:
Impact:
Advice to Business:
Impacted locations:
Current status / progress: DD-MM-YYYY HH:MM<timezone>
Next communication: In 1 hour or sooner if more information is available
Estimated Time for Resolution: TBA
Root cause: TBA
Responsibility: Application / Infrastructure / Upstream application / Vendor Operations
Infrastructure Incident Reference: This field is mandatory if you specify the responsibility =
infrastructure
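
The template and its one conditional rule (the infrastructure reference is mandatory when the responsibility is infrastructure) can be sketched as a data class. This is an illustration only: the class name, attribute names and the plain "label: value" rendering are assumptions, not the format of an existing tool.

```python
from dataclasses import dataclass


@dataclass
class IncidentCommunication:
    """Fields of the incident communication template (attribute names are hypothetical)."""
    incident_title: str
    application_incident_reference: str
    priority: str
    incident_status: str
    application_name: str
    impact: str
    advice_to_business: str
    impacted_locations: str
    current_status: str
    responsibility: str
    next_communication: str = "In 1 hour or sooner if more information is available"
    estimated_time_for_resolution: str = "TBA"
    root_cause: str = "TBA"
    infrastructure_incident_reference: str = ""

    def validate(self) -> None:
        # Rule from the template: the infrastructure incident reference is
        # mandatory when responsibility = infrastructure.
        is_infra = self.responsibility.strip().lower() == "infrastructure"
        if is_infra and not self.infrastructure_incident_reference.strip():
            raise ValueError(
                "Infrastructure Incident Reference is mandatory "
                "when responsibility is infrastructure"
            )

    def render(self) -> str:
        """Validate, then return the communication as 'label: value' lines."""
        self.validate()
        rows = [
            ("Incident Title", self.incident_title),
            ("Application Incident Reference", self.application_incident_reference),
            ("Priority", self.priority),
            ("Incident Status", self.incident_status),
            ("Application name", self.application_name),
            ("Impact", self.impact),
            ("Advice to Business", self.advice_to_business),
            ("Impacted locations", self.impacted_locations),
            ("Current status / progress", self.current_status),
            ("Next communication", self.next_communication),
            ("Estimated Time for Resolution", self.estimated_time_for_resolution),
            ("Root cause", self.root_cause),
            ("Responsibility", self.responsibility),
            ("Infrastructure Incident Reference", self.infrastructure_incident_reference),
        ]
        return "\n".join(f"{label}: {value}" for label, value in rows)
```

Validating inside render means a communication with a missing infrastructure reference cannot be produced by accident.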

2.7 Escalation
- Should the IFS IRP APS Support be unable to meet the agreements in
the document below for any reason whatsoever (technical complexity, resource capacity,
lack of details on the reported issue), it is up to the IFS IRP APS team Service
Delivery Manager to raise the issue with the relevant manager of the Business for
arbitration.
- In such a case, the IFS IRP APS team Service Delivery Manager will start a crisis
management process with periodic crisis meetings involving all the parties (the
Business and the IFS IRP APS support, including all the required providers), until
complete resolution of the crisis situation. The IFS IRP APS support Head of IT will
participate in the crisis meetings, depending on the severity of the issue.

The escalation process is described in the document below:

Escalation process communication.pptx

2.8 Crisis Management


Should the IFS IRP APS Support be unable to resolve the incident within the SLA time
for any reason whatsoever (technical complexity, resource capacity, lack of details on the
reported issue), it is up to the duty manager to raise the issue with the relevant manager of
the Business for arbitration.

In such a case, the IFS IRP APS team Service Delivery Manager will start a crisis management
process with periodic crisis meetings involving all the parties (the Business and the IFS IRP
APS support, including all the required providers), until complete resolution of the crisis
situation. The IFS IRP APS support Head of IT will participate in the crisis meetings, depending
on the severity of the issue.

The IRP APS crisis cell management is described in the document below:

IRP Crisis cell management - duty manager role.pptx

3. THE ROLES AND RESPONSIBILITIES

The roles related to the Incident Management process are:

- Incident Manager:
The role of the Incident Manager is to develop, maintain and improve all procedures,
tools, methodologies, etc. needed to restore normal service operation as quickly as possible,
so that the best possible levels of service quality and availability are maintained. 'Normal service
operation' is defined here as service operation within Service Level Agreement (SLA) limits.

- First-, second- and third-line support groups, including specialist support groups and external
suppliers (roles):
The support groups include the Development team and infrastructure teams such as Database
Administrators, Unix system administrators, Windows administrators, Mac system administrators, etc.
The support groups' responsibility is to ensure that the incidents they are informed of are
actioned immediately to minimize the impact.

- Service Delivery Coordinator:
The role of the Service Delivery Coordinator is to manage the first-level support team and to
ensure that all the procedures defined by the Incident Management process are
known, understood and executed by all involved internal and external support teams.

5. Annexes
5.1 IRP Component List

This table lists the components used to identify the application or team affected by an incident.

CIA
CALCUL
CISPEO
CODEX
