Incident Management Process


Owner: Director of Support

Approver: Chief Customer Officer

Classification: Internal

Version 1.0

© Mapal Group. All rights reserved.


About this document

This document describes the Incident Management Process. The Process provides a consistent, simple, and repeatable method for everyone to follow
when system issues are reported by a customer or discovered by Mapal.

Who is accountable?

• Product Manager(s) are accountable for the successful resolution of all incidents within agreed SLA targets.
• The Director of Support is accountable as the process owner and manager.

This document should be used by:

• MAPAL personnel responsible for the restoration of services.
• MAPAL personnel involved in the operation and management of the Incident Management Process.


1.1. What is Incident Management

Incident management is a defined process for logging, recording and resolving incidents.
The aim of incident management is to restore the service to the customer as quickly as possible. This may be through a workaround or temporary fix
while a permanent solution is sought under the problem management process.
1.1.1. Primary goal

The primary goal of the Incident Management process is to restore normal service operation as quickly as possible. The aim is to minimize the
adverse impact on business operations, thus ensuring that the best possible levels of service quality and availability are maintained. ‘Normal service
operation’ is defined here as service operation within SLA limits.
1.2. Process Definition:
Incident Management includes any event which disrupts, or which could disrupt, a service. This includes events which are communicated directly by
users or Mapal staff through the Support Team or through an interface to system monitoring and incident management tools.

1.3. Objectives - Provide a consistent process to track incidents that ensures:

• Incidents are properly logged.
• Incidents are properly routed.
• Incident status is accurately reported.
• Incidents and updates are communicated to customers and stakeholders.
• The queue of unresolved incidents is visible and reported.
• Incidents are properly prioritized and handled in the appropriate sequence.
• The resolution provided meets the requirements of the SLA for the customer.
1.4. Definitions

1.4.1. CSM

Customer Success Management (CSM) is an approach to managing a company's interactions with current and future customers. It often involves using
technology to organize, automate, and synchronize sales, marketing, customer service, and technical support. Mapal uses Dynamics CRM as the
technology provider/platform to manage this data.
1.4.2. Customer

A customer is an end user, or the overall organisation the end user belongs to, that is engaged in a contracted agreement with Mapal Group.

1.4.3. Business Hours (9AM-5PM UTC - Monday to Friday)

Mapal Support operates on a local business hours model. All solutions are developed and managed from the Madrid, Edinburgh, Paris, and
Stockholm offices, including all MAPAL-OS products such as Workforce Management, Inventory, Flow Learning, Compliance, Reputation, and
Analytics.

1.4.4. End User

The end user is the person who uses the Mapal software after it has been fully developed, marketed, and installed. It is also the person who
would raise a query should a Mapal solution not be working correctly. Generally, the terms "user" and "end user" have the same meaning.
1.4.5. Incident

An incident is an unplanned interruption to a Mapal Product or Service, or a reduction in its quality. Failure of any item, software, or hardware used
in the support of a Mapal system that has not yet affected service is also an Incident. An incident is often described as a fault, error, defect, bug,
problem, or something that doesn’t work as designed.

An incident occurs when the operational status of a Mapal solution changes from working to failing or about to fail, resulting in a condition in which
the product is not functioning as it was designed or implemented. The resolution for an incident involves implementing a repair to restore the item to
its original state.

A design flaw does not create an incident. If the product is working as designed, even though the design may not be perceived as correct, the
correction needs to take the form of a change request or idea to modify the design. The request may be expedited based upon the need, but it is still a
modification, not a repair.

A knowledge gap or user related process gap does not create an incident but rather a Question or Request.
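The distinctions drawn in this section can be sketched as a small triage helper. This is only an illustration of the definitions above, not part of any Mapal tooling: the function name, the enum, and the two boolean inputs are hypothetical, and checking for a knowledge gap before checking the design is an assumption about ordering.

```python
from enum import Enum


class CaseType(Enum):
    INCIDENT = "Incident"
    CHANGE_REQUEST = "Change request or idea"
    QUESTION_OR_REQUEST = "Question or Request"


def classify_case(works_as_designed: bool, is_knowledge_or_process_gap: bool) -> CaseType:
    """Classify a reported case using the definitions in section 1.4.5.

    Both inputs are hypothetical flags that a Support Team member would set
    after replicating and validating the report.
    """
    if is_knowledge_or_process_gap:
        # A knowledge gap or user-related process gap is a Question or Request.
        return CaseType.QUESTION_OR_REQUEST
    if works_as_designed:
        # A design flaw is a modification, not a repair.
        return CaseType.CHANGE_REQUEST
    # The product is not functioning as designed or implemented.
    return CaseType.INCIDENT
```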

1.4.5.1. Problem

Problem management differs from incident management in that its main goal is the detection of the underlying causes of an incident and the best
resolution and prevention. In many situations, the goals of problem management can be in direct conflict with the goals of incident management. The
Mapal approach is to restore the service as quickly as possible (incident management) while ensuring that all details are recorded. This enables
problem management to continue once a workaround has been implemented.
1.4.5.2. Incident vs. Problem

An incident is where an error occurs: something does not work the way it is designed.
A problem is different and can be:
• the occurrence of the same incident many times.
• the result of network diagnostics revealing that some systems are not operating in the expected way.

A problem can exist without having immediate impact on the users, whereas incidents are usually more visible and the impact on the user
is more immediate.

1.4.6. Incident Management Process

Incident Management distinguishes between Incidents (service interruptions) and Service Requests (standard requests from users, e.g. password
resets). Service Requests are not fulfilled by Incident Management; instead, they are handled as a data services or change request. There is a
dedicated process for dealing with emergencies.
1.4.7. Incident Priority

Incident Priority is the value given to an Incident to indicate its relative importance, to ensure the appropriate allocation of resources, and to
determine the timeframe within which action and resolution are required. The severity and impact of an incident are used in determining the Incident
Priority for resolution. Incident Priority is based upon a coherent and up-to-date understanding of business impact and severity.

• P4 - #
• P3 - System operational with difficulty or procedural issues. A valid workaround is available. Low numbers of users affected.
• P2 - A core functionality of the system is not operating as expected. No valid workaround is available. Multiple users affected.
• P1 - System completely non-operational and no work can be carried out. All users affected.

Incident Priority matrix (rows: Urgency; columns: Impact)

Urgency \ Impact   Critical   High   Medium   Low
Critical           P1         P1     P2       P2
High               P1         P2     P2       P3
Medium             P2         P2     P3       P3
Low                P4         P4     P4       P4
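The matrix above can be read as a simple lookup keyed on urgency (row) and impact (column). The following is a minimal sketch of that lookup; the names `LEVELS`, `PRIORITY_MATRIX` and `incident_priority` are illustrative and do not refer to any Dynamics CRM configuration.

```python
# Column order for Impact, as in the matrix header.
LEVELS = ("Critical", "High", "Medium", "Low")

# Rows are Urgency; each tuple lists the priority for Impact in LEVELS order.
PRIORITY_MATRIX = {
    "Critical": ("P1", "P1", "P2", "P2"),
    "High": ("P1", "P2", "P2", "P3"),
    "Medium": ("P2", "P2", "P3", "P3"),
    "Low": ("P4", "P4", "P4", "P4"),
}


def incident_priority(urgency: str, impact: str) -> str:
    """Derive the Incident Priority (P1 to P4) from urgency and impact."""
    if urgency not in PRIORITY_MATRIX:
        raise ValueError(f"Unknown urgency: {urgency!r}")
    if impact not in LEVELS:
        raise ValueError(f"Unknown impact: {impact!r}")
    return PRIORITY_MATRIX[urgency][LEVELS.index(impact)]
```

For example, an incident with High urgency and Low impact resolves to P3, while any incident with Low urgency resolves to P4 regardless of impact.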

1.4.7.1. Urgency

Urgency is determined by how many personnel are affected. It is a rating assigned to Incidents, Problems and Changes and, used in conjunction with
Impact, is one of the factors for allocating IT priorities.

There are four grades of urgency (referred to elsewhere in this document as severity):

• Low – Single user affected/no deadline will be missed.
• Medium – Low numbers of users affected.
• High – Multiple users affected. Multiple users from multiple organisations or a single organisation are affected.
• Critical – All users affected. All users from multiple organisations or a single organisation are affected.

1.4.7.2. Impact

Impact is determined by how much the user is restricted from performing their work, and is a measure of the effect of an Incident, Problem or Change
on Business Processes. Impact is often based on how service levels will be affected.

There are four grades of impact:

• Low – System operational with procedural issue. A valid workaround is available.
• Medium – System operational with minor difficulty and/or procedural issues. A valid workaround is available.
• High – A significant part of the system is not functioning, or business critical functionality is not working.
• Critical – System completely non-operational and no work can be carried out.

1.4.8. Incident Repository

The Incident Repository is a database containing relevant information about all Incidents whether they have been resolved or not. General status
information along with notes related to activity should also be maintained in a format that supports standardized reporting. At Mapal, the incident
repository is contained within Customer Service APP, part of Dynamics CRM.

1.4.9. Incident Reports (RCA Reports)

Reports providing incident details including a root cause, timeline of events and corrective measures to be distributed to customers within SLA.


1.4.10. Incident Logs

A record containing the details of an Incident. Each Incident record documents the lifecycle of a single Incident, its cause where available, and
the corrective measures taken to resolve it. These records are maintained in Dynamics.
1.4.11. CSAT

The Customer Satisfaction Survey (CSAT) is a management tool used to gauge Mapal end users' satisfaction with the service received. Mapal
uses Dynamics for this.
1.4.12. Operational Level Agreement (OLA)

Often referred to as the OLA. Operational level management ensures that arrangements are in place with internal and external IT support providers
in the form of Operational Level Agreements (OLAs) and Underpinning Contracts (UCs), respectively.

1.4.13. RCA

Root Cause Analysis (RCA) is an activity that identifies the root cause of an Incident or Problem, and typically concentrates on IT, infrastructure,
or database failures.
RCA reports are required for all P1 and P2 incidents, because these are of higher impact and severity. The information is used to provide details
and reassurance of further preventative measures within problem management.
1.4.14. Response

Time elapsed between the time the incident is reported, and the time receipt is acknowledged and assigned to an individual for resolution.

1.4.15. Resolution

Service is restored to a point where the customer can perform their job. In some instances, this may only be a work around solution until the root
cause of the incident is identified and corrected.

1.4.16. Service Level Agreement (SLA)

Often referred to as the SLA, the Service Level Agreement is the agreement between Mapal and the customer outlining services to be provided, and
operational support levels.

1.4.17. Underpinning Contracts (UC’s)

A Contract between Mapal and a Third-Party service provider. The Third-Party provides goods or Services that support delivery of an IT Service to
end users, Mapal’s Customers. The Underpinning Contract defines targets and responsibilities that are required to meet agreed Service Level Targets
in an SLA.
1.5. Metrics

Metrics are the results of processes and data that are measured and reported to help manage a Process, IT Service, or an Activity.

1.6. RACI Matrix

It is important for the success of all processes that roles, responsibilities and owners are agreed and documented. The person who will later have the
responsibility for running a certain process should also participate in its design. This will ensure that as much experience as possible flows into the
process definition, and that the role owners identify themselves closely with any changes to existing working practice. Responsibilities are assigned
and understood by those required to fulfill activities and tasks as part of the process.

KEY

R   Responsible   Does the work and makes the decisions to ensure a task is achieved.
A   Accountable   Must be one person. Ensures correct and thorough completion of the process.
C   Consulted     Provides information for the process through 2-way communication. Usually several people, subject experts.
I   Informed      Affected by the process, so kept informed through 1-way communication.

2. Roles and Responsibilities


Responsibilities may be delegated, but escalation does not remove responsibility from the individual accountable for a specific action.

2.1. Support Team (Tier I)

Includes Support Specialists and Senior Support Specialists.

• Represents the customer's incident priority and is responsible for communicating the incident status to customers. Works with the technology
teams when required to understand the fault.
• Responsible for reviewing and maintaining an appropriate list of categorised incident records, in support of continual service improvement and
problem management.


2.2. Regional Support Manager (MIM - Major Incident Manager)

Performs and is responsible for the major incident management process, and provides notifications and updates on the progress and resolution of
each major incident to key business stakeholders and customers affected by an incident.
• Responsible for the creation and collection of root cause analysis details for distribution in the RCA incident report.
• Responsible for P1 and P2 incident internal and external communication in accordance with the OLA.
• Responsible for governance of the Major Incident Management Process with Tier I.
2.3. Customer Support – (Tier II)

Includes - System Analysts, DBAs, and Infrastructure Engineers

Supports the incident management process by performing fault fix activities as per the relevant support model and providing the required
communication on progress to the Service Team as per agreed OLA timescales. Ordinarily, a manager is not involved in working with the support
teams to understand the fault, but ensures that the agreed activity with Support and the incident management process takes place.
2.4. Technology Operations (TIM – Technology Incident Manager)

Manages the major incident management process within Technology and provides communication on progress to the Service Manager (MIM) as per
agreed OLA timescales.
• Ensures that details and updates are provided to incident logs for root cause analysis, and that problem records are readily available, in
accordance with the OLA.
• Responsible for P1 and P2 incident communication to the MIM in accordance with the OLA.
• Responsible for governance of the Major Incident Management Process within the Technology division.

2.5. Customer
Acts as the communication and escalation point of contact for incidents, working with a specified support contact.

2.6. RACI Matrix

The below RACI matrix is a high-level summary of activities of defined roles and responsibilities for the Incident Management process agreed across
multiple departments.

Roles: Process Owner; Service Team (TIER I); Support Manager (MIM); Tech-Ops (TIER II); Technology Operations (TIM); User, Internal, Customer

Activity: Support Team incident handling as per the Incident Management process
RACI: R, A, I/C, I, I/C

Activity: Technology Team incident handling as per the Incident Management process
RACI: I/C, I, R, A

Activity: Incident closure, incident log (RCA record), and incident report
RACI: R, A, R, I/C, I

Activity: Process Owner’s responsibilities, including sponsorship, design, and continual improvement of the process and its metrics
RACI: I/C, A/R, I/C, I/C, I

2.7. P1, P2 Service & Technology Communication OLA

P1 Incidents Activity and agreed OLA

TIER I & TIER II

Raise and validate the incident case within 15 minutes of incident identification

Contact the MIM within 15 minutes of incident identification

REGIONAL SUPPORT MANAGER (MIM) & TECH teams (TIM)

Respond to any P1 incident escalated from TIER I & TIER II within 15 minutes with at least an acknowledgement of the incident escalation and as much useful information as possible.


Send incident notifications (TEAMS CHANNEL, with a post in a channel to inform stakeholders and a conversation with the concerned people from tech and support; STATUS PAGE)

Send an email notification to all customers concerned (to be coordinated with the Marketing department and Customer Success teams). This requires the customer lists (by product, by country).

Provide feedback on the latest progress at least every 30 minutes after that initial response unless otherwise agreed and stated clearly in communications

The SLA agreement is to resolve any P1 incident within 4 business hours of it being reported

Tech teams would continue to work on any P1 incidents as a top priority until resolution

P2 Incidents Activity and agreed OLA

TIER I & TIER II

Raise and validate the incident case within 30 minutes of incident identification

Contact the MIM within 30 minutes of incident identification

SUPPORT MANAGER (MIM) & TECH OPS (TIM)

Respond to any P2 incident escalated from TIER I & TIER II within 1 hour with at least an acknowledgement of the incident escalation and as much useful information as possible.

Send incident notifications (EMAIL, TEAMS CHANNEL, with a post in a channel to inform stakeholders and a conversation with the concerned people tech and support, STATUS PAGE)

Provide feedback on the latest progress at least every 60 minutes after the initial response unless otherwise agreed and stated clearly in communications

The SLA agreement is to resolve any P2 incident within 12 hours of it being reported

Tech Ops would continue to work on any P2 incidents as a top priority until resolution
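The P1 and P2 communication timings above lend themselves to a small configuration table. The sketch below is illustrative only: the class, dictionary and function names are hypothetical, and the "first update" deadline assumes the acknowledgement is sent exactly at its own deadline, which is the latest case the OLA allows.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class CommunicationOla:
    raise_and_validate: timedelta  # Tier I & II: raise and validate the case
    contact_mim: timedelta         # Tier I & II: contact the MIM
    acknowledge: timedelta         # MIM & TIM: respond to the escalation
    update_interval: timedelta     # maximum gap between progress updates


# Values taken from section 2.7.
OLA = {
    "P1": CommunicationOla(timedelta(minutes=15), timedelta(minutes=15),
                           timedelta(minutes=15), timedelta(minutes=30)),
    "P2": CommunicationOla(timedelta(minutes=30), timedelta(minutes=30),
                           timedelta(hours=1), timedelta(minutes=60)),
}


def communication_deadlines(priority: str, identified_at: datetime,
                            escalated_at: datetime) -> dict:
    """Return the latest permitted time for each OLA communication step."""
    ola = OLA[priority]
    acknowledge_by = escalated_at + ola.acknowledge
    return {
        "raise_and_validate_by": identified_at + ola.raise_and_validate,
        "contact_mim_by": identified_at + ola.contact_mim,
        "acknowledge_by": acknowledge_by,
        # Assumption: measured from the acknowledgement deadline (worst case).
        "first_update_by": acknowledge_by + ola.update_interval,
    }
```

In practice the update clock would restart from the time each update is actually sent, so a tracker built on this idea would need to record those timestamps.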

2.8. Mapal Support Team

• Owns all reported incidents.
• Ensures that all incidents received by the Support Team are recorded in Dynamics.
• All incidents must be logged in Dynamics (Customer Service Mapal – APP).
• All incidents must be raised to the Support Team for replication and validation, and to follow the incident management process. This includes
incidents reported from all internal departments.
• Identifies the nature of incidents based upon reported symptoms and provides categorisation.
• Prioritizes incidents based upon impact to the users and severity of the issue, utilising the incident priority matrix.
2.9. Regional Support Manager (MIM)

• Responsible for incident closure and for customer and stakeholder communication for all incidents.
• Responsible for assigning incidents to the appropriate technology group for resolution, i.e., assigning the incident to System Analysts or Tech
Ops to provide initial investigation.
• Performs post-resolution customer review to ensure that all services are functioning properly and all incident logging is complete.
• Responsible for creation and distribution of the RCA.
• Prepares reports showing statistics of incidents resolved, SLA achievement and other metrics agreed with Mapal Group Managers,
Directors and the Executive team.

2.10. Mapal Technology Support Group (TIM)

• Includes Infrastructure, Development and QA technical staff involved in supporting services, including but not limited to System Analysts,
Infrastructure Engineers, Database Analysts, and Developers.
• Corrects the issue or provides a workaround to the service that will provide functionality that approximates normal service as closely as
possible and minimises the impact.
• If an incident recurs or is likely to recur, creates or updates the problem management monitoring record so that root-cause analysis can
be performed and a standard workaround can be deployed.
• Completes incident log details for cause and corrective measures.
• Performs ongoing analysis to identify trends and support problem management.

2.11. Incident Manager Contact details

Outside the listed hours of availability for contact and escalation please refer to ‘2.11. Hierarchical Escalation Contact details’
NAME EMAIL MOBILE


3. Incident Categorisation, Prioritization, Target Times
To manage SLAs correctly, effectively, and proactively, incidents must be categorized and prioritized correctly and quickly.

3.1. Categorisation

The goals of proper categorisation are:

• Identify whether what is reported is an incident, along with the products and services impacted and the appropriate SLA and escalation timelines.
• Indicate what support groups need to be involved.
• Provide meaningful reporting on system continuity and reliability.

For each incident, the specific product or service will be identified. It is critical to establish with the user the specific area of the service being
impacted. For example, is it Stock Control, Financial, Human Resources, or another area? If it is Stock Control, is it for Stock Count or
Purchasing? Identifying the impact to operations properly establishes the appropriate Service Level Agreement and relevant Service Level Targets.

In addition, the impact and severity of the incident need to be established. All incidents are important to the user, but incidents that affect large
groups of personnel, business deadlines or critical operational functions need to be addressed before those affecting 1 or 2 users.

Principles of Categorisation:

• Understanding the severity and impact.
• Replicable and validated.

Does the incident cause a work stoppage for the user, or do they have other means of performing their job? For example, a broken link on a
web page is an incident, but if there is another navigation path to the desired page, the incident’s priority (severity) would be low because the user
can still perform the needed function.

The incident may create a work stoppage for only one person, but the impact is far greater because it is a critical operational function. An example of
this scenario would be the person in payroll having an issue which prevents the payroll from processing. The impact affects many more personnel
than just the user.

3.2. Incident Priority

The Incident Priority (P1 to P4) assigned to an incident determines how quickly it is scheduled for resolution. It is set according to a
combination of the severity and impact.

3.3. Target Times

The current targets for response and resolution, based upon Incident Priority, are as follows.

Priority        Response Time       Resolution Time
P1 (Critical)   4 business hours    2 business days
P2 (High)       8 business hours    5 business days
P3 (Medium)     3 business days     No commitment
P4 (Low)        4 business days     No commitment
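Because these targets are expressed in business hours and business days, a due time has to skip evenings and weekends. The sketch below shows one way to compute it using the business hours from section 1.4.3 (9AM-5PM UTC, Monday to Friday). It is illustrative only: it assumes one business day equals 8 business hours, ignores public holidays, and uses naive datetimes interpreted as UTC. All names are hypothetical.

```python
from datetime import datetime, time, timedelta

OPEN, CLOSE = time(9, 0), time(17, 0)  # 9AM-5PM UTC, Monday to Friday
HOURS_PER_DAY = 8                      # assumption: 1 business day = 8 business hours

RESPONSE_HOURS = {"P1": 4, "P2": 8, "P3": 3 * HOURS_PER_DAY, "P4": 4 * HOURS_PER_DAY}
RESOLUTION_HOURS = {"P1": 2 * HOURS_PER_DAY, "P2": 5 * HOURS_PER_DAY}  # P3/P4: no commitment


def add_business_hours(start: datetime, hours: float) -> datetime:
    """Return the moment that falls `hours` business hours after `start`."""
    if hours < 0:
        raise ValueError("hours must not be negative")
    remaining = timedelta(hours=hours)
    current = start
    while True:
        day_open = datetime.combine(current.date(), OPEN)
        day_close = datetime.combine(current.date(), CLOSE)
        if current.weekday() >= 5 or current >= day_close:
            # Weekend or after closing: move to the next day's opening time.
            current = day_open + timedelta(days=1)
            continue
        current = max(current, day_open)  # before opening: wait until 9AM
        available = day_close - current
        if remaining <= available:
            return current + remaining
        remaining -= available
        current = day_open + timedelta(days=1)


def response_due(priority: str, reported_at: datetime) -> datetime:
    return add_business_hours(reported_at, RESPONSE_HOURS[priority])


def resolution_due(priority: str, reported_at: datetime):
    """Return the resolution deadline, or None where there is no commitment."""
    hours = RESOLUTION_HOURS.get(priority)
    return None if hours is None else add_business_hours(reported_at, hours)
```

For instance, a P1 incident reported on a Friday at 15:00 UTC uses two business hours on Friday and the remaining two on Monday morning, so its response is due on Monday at 11:00 UTC.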


4. Incident Management Process
High level summary of the incident management steps:


4.1. Incident Management Process Steps and activities

Incident Reported

Incidents can be reported by the customer, internal or technical staff through various means, e.g., phone, email, or a self-service web interface
(Help Centre).

Support Team (TIER I)

Incident identification

As far as possible, all key components should be monitored so that failures or potential failures are detected early and the incident management
process can be started quickly. Using all available monitoring, Mapal will always aim to resolve an Incident before the end user is impacted.

Incident logging

All incidents must be fully logged, and date/time stamped, regardless of whether they are raised via a customer or whether automatically detected
via an event monitoring alert. All relevant information relating to the nature of the incident must be logged so that a full historical record is
maintained – and so that if the incident must be referred to other support group(s), they will have all relevant information at hand to assist them.

Incident categorisation

If the customer is calling about an issue that is not related to one of the agreed services, or that is not a system issue, then it is not an incident. A
case will still be logged and categorised appropriately as non-system related.

Is this a Question or Request incorrectly categorized as an incident? If so, update the case to reflect that it is and follow the appropriate process.

Incident prioritisation

Before an Incident Priority can be set, the severity and impact need to be assessed. Once the severity and impact are set, the Incident Priority is
derived using the Incident Priority matrix. Refer to ‘1.4.7. Incident Priority’.

Initial diagnosis

A Support Team member must carry out initial diagnosis, using tools and known error information to try to discover the full symptoms of the
incident and to determine exactly what has gone wrong. The Support Team will utilise the collected information on the symptoms and use that
information to initiate a search of the Knowledge available in the Knowledge Hub, Dynamics CRM to find an appropriate solution. If possible,
the Support Team will resolve the incident and close the incident if the resolution is successful.

Is this a major Incident (P1 or P2)? - Major Incident

If this is a major incident, meaning that a service is unavailable in part or whole, the appropriate Mapal stakeholders should be alerted to make
certain that any resources necessary for the resolution are made available immediately.

Incident Closure

Verify with the customer that the resolution was satisfactory, and the customer can perform their work. An incident resolution does not require
that the underlying cause of the incident has been corrected. The resolution only needs to make it possible for normal system activity to resume.

If the customer is satisfied with the resolution, proceed to closure, otherwise continue investigation and diagnosis.

When proceeding with closure the Support Team should also check the following:

Closure categorisation. Check and confirm that the initial incident categorisation was correct or, where the categorisation subsequently turned
out to be incorrect, update the record so that a correct closure categorisation is recorded for the incident – seeking advice or guidance from the
resolving group(s) as necessary.

Formal closure. Formally close the Incident Record.

User satisfaction survey. CSAT survey distributed on CRM incident closure by email.

Incident documentation. Chase any outstanding details and ensure that the Incident log is fully documented so that a full historic record at a
sufficient level of detail is complete. Incident RCA

On-going or recurring problem? Determine (in conjunction with support groups) whether it is likely that the incident could recur and decide
whether any preventive action is necessary to avoid this. In conjunction with Problem monitoring, a problem record should be created from every
incident to document and prevent further root cause problems and for repeat analysis reporting.

Assign to Technology group

Is the necessary information available to resolve the incident? If not, the case should then be assigned to the Development Group that supports
the product.

5. Incident assignment, escalation to the Technology group


Mapal has adopted the ITIL framework for IT service management and, as a standard working practice, recognises that assignment may
change but ownership of incidents always resides with the Support Team. Thus, the responsibility for ensuring that an incident is
escalated to the Tech Ops group when appropriate also resides with the Support Team.

mapal-os.com 9
5.1. Incident Tech Ops (Tier II) assignment steps:

*All escalation process steps are performed by the Support Team. Some of the steps may be automated.

Description

Examine all open incidents and determine actions based upon Incident Priority.

If it is a lower Incident Priority (P3 or P4), assign it to the appropriate Tier II team.

Has the incident been resolved? If not, continue to monitor and provide fortnightly customer updates.

Is this an Incident Priority P1?

If it is a P1 incident, the Support Manager (MIM) on shift should be contacted by phone to initiate the incident assignment.

Join the Incident Teams channel and monitor the status of the P1 incident, providing informational updates to the Support Manager (MIM) every 25
minutes.

Has the incident been resolved? If not, continue to monitor and provide updates.

If the incident has been resolved, provide the Support Manager with RCA details.

Is this an Incident Priority P2?

If it is a P2 incident, the Support Manager (MIM) on shift should be contacted by phone to initiate the incident assignment.

Join the Incident Teams channel and monitor the status of the P2 incident, providing informational updates to the Support Manager (MIM) every 55
minutes.

Has the incident been resolved? If not, continue to monitor and provide updates.

If the incident has been resolved, provide the Support Manager with RCA details.

5.2. Incident Hierarchical Escalation Process:

If the Tech Ops Team Lead (TIM) is not available and 30 minutes have passed, call the Head of Tech Ops.

Is the manager available?

If neither the Team Lead nor the Head of Tech Ops is available, follow the hierarchical escalation chart.

5.3. Hierarchical escalation roles:

Hierarchical escalation should be used if, 30 minutes after the incident has been logged and assigned and contact has been attempted, the
Incident Manager has not been available or has not responded.

Technology (Tech Ops and Infrastructure)    Support Team
Tech Ops Team Leads (TIM)                   Support Managers (MIM)
Head of Tech Ops                            Director of Support
Head of Infrastructure                      Chief Customer Officer
Chief Technology Officer                    Chief Operating Officer
Chief Executive Officer                     Chief Executive Officer
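The escalation chart can be treated as two ordered chains. The sketch below picks the role to contact from the time elapsed without a response. The source only states the 30-minute wait for the first escalation, so applying the same 30-minute interval at every further level is an assumption, and all names are hypothetical.

```python
ESCALATION_CHAINS = {
    "technology": [
        "Tech Ops Team Lead (TIM)",
        "Head of Tech Ops",
        "Head of Infrastructure",
        "Chief Technology Officer",
        "Chief Executive Officer",
    ],
    "support": [
        "Support Manager (MIM)",
        "Director of Support",
        "Chief Customer Officer",
        "Chief Operating Officer",
        "Chief Executive Officer",
    ],
}

# Assumption: the 30-minute wait applies at every level, not only the first.
ESCALATION_INTERVAL_MINUTES = 30


def escalation_contact(chain: str, minutes_without_response: int) -> str:
    """Return the role to contact after the given time without a response."""
    if minutes_without_response < 0:
        raise ValueError("minutes_without_response must not be negative")
    roles = ESCALATION_CHAINS[chain]
    level = minutes_without_response // ESCALATION_INTERVAL_MINUTES
    # Never go past the top of the chain.
    return roles[min(level, len(roles) - 1)]
```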

5.4. Incident Management process:

All incidents must be raised to the Support Team for validation and categorisation of an incident and to follow incident management
process. Includes incidents reported from all internal departments.

5.5. P3 and P4 Minor defect escalation process:

Mapal has an escalation process for minor defects prioritised as P3 or P4.


This exists for all products as a mechanism to raise the priority of any minor defect that, due to the length of its resolution time, has become
increasingly urgent and has the potential to cause high impact to the customer’s business. The objective is to ensure that technical
resource is allocated to provide a quicker resolution and that an estimated resolution date can be set.

This can be applied at any stage of the defect resolution lifecycle, whether or not the SLA has been breached, by updating the incident
to P2 to invoke the incident management process.

6. Major Incident Management Process


6.1. Primary Goal

The primary goal of the Major Incident Management process is to restore normal service operation as quickly as possible and
minimize the adverse impact on business operations, thus ensuring that the best possible levels of service quality and availability are
maintained. ‘Normal service operation’ is defined here as service operation within SLA limits.

6.2. Major incident Process Definition:

A major incident requiring this process is defined as an event which has significant impact or urgency for the business/organisation,
and which demands a response beyond the routine incident management process. This is an inclusive extension of the Incident
Management process and is implemented before the SLA is breached on P1 and P2 incidents or where otherwise deemed necessary.

Major Incidents may either cause, or have the potential to cause, impact on business-critical services or systems or be an incident
that has significant impact and risk on reputation, revenue, legal compliance, regulation, or security of the business/organisation.

Incidents for which the timescale of disruption – to even a relatively small percentage of users – becomes excessive should also be
regarded as major incidents.

It is possible to define some of these major incidents, but most will be prioritised as they happen based on impact and urgency.
Major incidents at Mapal will normally be classified as P1 or P2 priority incidents.

6.3. Major Incident Management Response Process (MIMR)

Major Incident Management Response process is implemented by the Support Manager (MIM).

Support Manager (MIM)

Is it a P1 or P2 incident?

Has it been 4 hours for a P1 or 12 hours for a P2 priority incident, and is it still not resolved?

Follow the standard P1, P2 incident management process until it is clear that the SLA resolution time has been, or is likely to be, breached.
Also invoke MIMR if investigation of the incident has provided evidence that the resolution time will breach the SLA.

Implement the Major Incident Management Response Process.

1st Action - Commandeer an appropriate meeting room if applicable and start a Teams session.

2nd Action - Provide the Teams meeting ID via a Teams multi-channel broadcast.

3rd Action - Call missing attendees via their mobile phones.

 Support Manager (MIM)
 Tech Ops Team Lead (TIM)
 Investigation Team members / technical resources
 Director of Support
 Head of Tech Ops
 Infrastructure Manager
 Chief Customer Officer
 Chief Technology Officer
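
The MIMR trigger described in section 6.3 can be sketched in a few lines of Python. This is only an illustration, not part of any Mapal tooling: the function name, its parameters and the `breach_expected` flag are hypothetical, and it assumes the 4-hour and 12-hour limits are measured as plain elapsed time since the incident was opened. It does not cover the "where otherwise deemed necessary" case, which remains a judgement call.

```python
from datetime import datetime, timedelta

# Thresholds taken from section 6.3: 4 hours for P1, 12 hours for P2.
MIMR_THRESHOLDS = {"P1": timedelta(hours=4), "P2": timedelta(hours=12)}


def should_invoke_mimr(priority, opened_at, now, resolved=False, breach_expected=False):
    """Return True when the Major Incident Management Response should start.

    breach_expected is a hypothetical flag, set when investigation has shown
    that the resolution time will breach the SLA.
    """
    threshold = MIMR_THRESHOLDS.get(priority)
    if threshold is None or resolved:
        # Only unresolved P1 and P2 incidents are considered here.
        return False
    # Assumption: elapsed time is measured on the wall clock, not in business hours.
    return breach_expected or (now - opened_at) >= threshold


opened = datetime(2024, 1, 8, 9, 0)
print(should_invoke_mimr("P1", opened, datetime(2024, 1, 8, 13, 0)))  # True
print(should_invoke_mimr("P2", opened, datetime(2024, 1, 8, 13, 0)))  # False
```

If the limits are meant to run on business hours or on the SLA clock in Salesforce, the elapsed-time calculation would need to change accordingly.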

7. Reports/Metrics, SLA Alerts and Meetings


A critical component of success in meeting the SLA is for Mapal to hold itself accountable for deviations from acceptable
performance. This will be accomplished by producing meaningful reports that can be utilized to focus on areas that need
improvement. The reports must then be used in coordinated activities aimed at improving the support.

7.1. Reports/Metrics

Reports and metrics will be produced monthly, with quarterly summaries, and will include:
 Total number of incidents (as a control measure)
 Breakdown of incidents at each stage (e.g., recorded, open, closed etc.)
 Size of the current incident backlog
 Number and percentage of P1 and P2 incidents
 Percentage of incidents handled within the agreed response time as defined by SLAs
 Number of incidents reopened, and as a percentage of the total
 Number and percentage of incidents incorrectly prioritised and categorised
 Number and percentage of incidents closed by the Service Team without reference to other levels of support (often
referred to as ‘first time fix’)
 Breakdown of incidents by time of day, to help pinpoint peaks and ensure matching of resources
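
As a rough illustration of how several of the figures listed above could be derived from exported incident records, here is a minimal Python sketch. Every field name (`priority`, `stage`, `reopened`, `responded_within_sla`, `first_time_fix`) is hypothetical and does not reflect the actual Salesforce schema, and the backlog is assumed to mean every incident whose stage is not "closed".

```python
from collections import Counter


def pct(count, total):
    """Percentage of the total, rounded to one decimal place (0.0 if total is 0)."""
    return round(100 * count / total, 1) if total else 0.0


def incident_metrics(incidents):
    """Summarise incident records; every dict key used here is a hypothetical field name."""
    total = len(incidents)
    p1_p2 = sum(1 for i in incidents if i["priority"] in ("P1", "P2"))
    reopened = sum(1 for i in incidents if i["reopened"])
    within_sla = sum(1 for i in incidents if i["responded_within_sla"])
    first_time_fix = sum(1 for i in incidents if i["first_time_fix"])
    return {
        "total": total,
        "by_stage": dict(Counter(i["stage"] for i in incidents)),
        # Assumption: the backlog is every incident that is not yet closed.
        "backlog": sum(1 for i in incidents if i["stage"] != "closed"),
        "p1_p2": (p1_p2, pct(p1_p2, total)),
        "within_sla_pct": pct(within_sla, total),
        "reopened": (reopened, pct(reopened, total)),
        "first_time_fix": (first_time_fix, pct(first_time_fix, total)),
    }


sample = [
    {"priority": "P1", "stage": "closed", "reopened": False,
     "responded_within_sla": True, "first_time_fix": False},
    {"priority": "P3", "stage": "open", "reopened": False,
     "responded_within_sla": True, "first_time_fix": False},
]
print(incident_metrics(sample))
```

Counts and percentages are returned together so that a monthly report can show both, as the list above requires.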


7.2. SLA Alerts & Dashboard

Alerting is available for tracking SLA achievement on open incidents. It is configured using entitlements in Salesforce, and email
alerts are distributed to help prevent SLA breaches.
There will be multiple levels of alerting for each incident, based on its age and the time left to resolve it without an SLA
breach. Alerts will be distributed hierarchically to Tech Ops, Cloud Ops, Technical Support and Service, and both
Technology and Operations Leadership.
In addition, an open incident SLA dashboard displays all open cases against SLA achievement, showing the time left
to resolve each case within SLA or whether the SLA has already been breached.
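
The idea of alert levels driven by the time left before a breach can be sketched as follows. This is a hypothetical illustration only: the real alerting is configured through Salesforce entitlements, and the 50% and 25% cut-offs, the level names and the function itself are assumptions made for the example, not Mapal values.

```python
from datetime import datetime, timedelta


def sla_alert_level(opened_at, sla_target, now):
    """Return (alert_level, time_left) for an open incident.

    The 50% and 25% cut-offs are illustrative assumptions, not Mapal values.
    """
    time_left = opened_at + sla_target - now
    if time_left <= timedelta(0):
        return "breached", time_left
    fraction_left = time_left / sla_target  # timedelta / timedelta gives a float
    if fraction_left <= 0.25:
        return "level 2", time_left
    if fraction_left <= 0.50:
        return "level 1", time_left
    return "none", time_left


opened = datetime(2024, 1, 8, 9, 0)
level, left = sla_alert_level(opened, timedelta(hours=8), datetime(2024, 1, 8, 15, 0))
print(level, left)  # level 2 2:00:00
```

The returned time left is the same value a dashboard would show per case; a negative value means the SLA has already been breached.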

7.3. Meetings

The Director of Support conducts fortnightly sessions (Case Management) with Tech Ops to review previous incidents and incident
management.

The goal of the sessions is to identify:

 Processes that are working well and need to be reinforced.
 Patterns related to incidents where support failed to meet targets.
 Alerts, tools, knowledge gaps or additional levels of system access required by Tier I or Tier II.
 Reoccurring incidents (Problems).
 Workaround solutions that need to be developed until the root cause can be corrected.
 Ways to improve the SLA achievement rate.

8. Process Audit: How Did We Do?


The objective is to measure the success of the process against targets, identify the strategy for improvement, and provide continual
service improvement.
Measures include:
 OLAs
 SLAs
 Incident logs
 RCA Incident reports
 Prioritising
 Categorisation
 Graphical results available via Dynamics BI tools

The target for all of the above is 95% completion and success, with a minimum accepted level of 90%.
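
The two thresholds above can be applied to any measured percentage with a small check. In this minimal sketch the function name and the three status labels are hypothetical; only the 95% and 90% figures come from this section.

```python
TARGET = 95.0   # target stated in section 8
MINIMUM = 90.0  # minimum accepted level stated in section 8


def audit_status(achieved_percent):
    """Classify a measured percentage against the section 8 thresholds.

    The status labels are illustrative, not official Mapal terms.
    """
    if achieved_percent >= TARGET:
        return "target met"
    if achieved_percent >= MINIMUM:
        return "acceptable, below target"
    return "below minimum"


for measure, value in [("SLAs", 96.2), ("Categorisation", 91.0), ("RCA Incident reports", 84.5)]:
    print(measure, audit_status(value))
```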

9. Incident Management Statement


The Incident process should be followed for all incidents covered by an existing service level agreement, regardless of whether the
request is eventually managed as a project or through the Incident process.

If Mapal already provides a service to a customer, but that customer wants to significantly expand its use of that service or solution
beyond the existing cost support model, the request should be treated as an additional service request (for managed services
or otherwise) and forwarded to the Customer Success Manager.

Incidents should be prioritized based upon impact to the customer and the availability of a workaround.

Regardless of where an incident is referred to during its life, ownership of the incident always remains with the Support Team. The
Support Team remains responsible for tracking progress, keeping users informed and, ultimately, Incident Closure.

Rules for re-opening incidents: Despite all reasonable care, there will be occasions when incidents recur even though they have been
formally closed. If the incident recurs within one working day, it can be re-opened; beyond this point, a new incident must
be raised and linked to the previous incident(s) as a child case.
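
The re-opening rule can be expressed as a simple date check. This sketch rests on assumptions the rule itself does not settle: working days are taken to be Monday to Friday with no public holidays, and "within one working day" is read as "no later than the next working day after the closure date". The function names are hypothetical.

```python
from datetime import date, timedelta


def next_working_day(day):
    """Return the first Monday-to-Friday date after the given day (holidays ignored)."""
    nxt = day + timedelta(days=1)
    while nxt.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        nxt += timedelta(days=1)
    return nxt


def can_reopen(closed_on, recurred_on):
    """True if the closed incident may be re-opened; otherwise raise a linked child case."""
    return closed_on <= recurred_on <= next_working_day(closed_on)


print(can_reopen(date(2024, 1, 5), date(2024, 1, 8)))  # True: closed Friday, recurred Monday
print(can_reopen(date(2024, 1, 5), date(2024, 1, 9)))  # False: raise a new, linked incident
```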

Workaround solutions should conform to Mapal standards and policies.
