The Shortcut Guide To: Improving IT Service Support Through ITIL

The Shortcut Guide To

Improving IT Service Support through ITIL


sponsored by

Rebecca Herold

Introduction

Introduction to Realtimepublishers
by Don Jones, Series Editor

For several years now, Realtime has produced dozens and dozens of high-quality books that just happen to be delivered in electronic format, at no cost to you, the reader. We've made this unique publishing model work through the generous support and cooperation of our sponsors, who agree to bear each book's production expenses for the benefit of our readers.

Although we've always offered our publications to you for free, don't think for a moment that quality is anything less than our top priority. My job is to make sure that our books are as good as (and in most cases better than) any printed book that would cost you $40 or more. Our electronic publishing model offers several advantages over printed books: You receive chapters literally as fast as our authors produce them (hence the "realtime" aspect of our model), and we can update chapters to reflect the latest changes in technology.

I want to point out that our books are by no means paid advertisements or white papers. We're an independent publishing company, and an important aspect of my job is to make sure that our authors are free to voice their expertise and opinions without reservation or restriction. We maintain complete editorial control of our publications, and I'm proud that we've produced so many quality books over the past years.

I want to extend an invitation to visit us at http://nexus.realtimepublishers.com, especially if you've received this publication from a friend or colleague. We have a wide variety of additional books on a range of topics, and you're sure to find something that's of interest to you, and it won't cost you a thing. We hope you'll continue to come to Realtime for your educational needs far into the future.

Until then, enjoy.

Don Jones

Table of Contents

Introduction to Realtimepublishers
Chapter 1: ITIL Overview and Challenges
  A High-Level ITIL Overview
    Change Management
    Incident Management
    Problem Management
  The Business Value of ITIL
    Efficient IT Benefits Business
      Customer Retention
      Improved Quality
      Greater Efficiency
    Better Communication
      Measurable Results
      Better Audit Outcomes
    Automation Boosts IT Efficiency
  ITIL Challenges
    ITIL Implementation Takes Time
    ITIL Implementation Requires Resources from Across the Enterprise
    ITIL Implementation Requires Understanding
    Baseline Data Must Be Collected
    Personnel Throughout the Enterprise Must Be Involved
    Integration with Other Frameworks Must Be Carefully Planned
  Getting Started With ITIL
    Implementing ITIL
      #1: Be Realistic; Start Small
      #2: Document, Document, Document!
      #3: Obtain Executive Support
  Summary
Chapter 2: Effective Change Management Through ITIL
  The Change Management Process
  Change Management Benefits
  Inputs, Outputs, and Relationships
    Inputs
    Outputs
    Relationships
    About RFCs
  Planning
  Why Do We Need to Create the CMDB?
  What Should Be in the CMDB?
  Why Is Automation Important?
    Automation Has Positive Business Impact
    Automation Tool Features
    Avoid Common Pitfalls
  Costs
    People Costs
    Technology Costs
  Measuring Success
    Change Efficiency Rate
    Change Success Rate
    Change Reschedule Rate
    Change Incident Rate
    Other Useful Metrics
  Summary
Chapter 3: Effective Incident and Problem Management Through ITIL
    Incidents
    Problems
    Errors
  Relationship Between Incident and Problem Management
  Why Is Incident Management Important?
  The Incident Management Process
    Incident Reporting
    Classification and Initial Support
    Matching
    Investigation and Diagnosis
    Resolution and Recovery
    Incident Closure
  Incident Management Benefits
  Incident Management Inputs, Outputs, and Relationships
    Outputs
    Relationships
  Measuring Incident Management Success
    Incident Resolution Efficiency Rate
    Customer Incident Impact Rate
    Incident Reopen Rate
    Incident Labor Utilization Rate
  Why Is Problem Management Important?
  The Problem Management Process
    Problem Control
    Error Control
    Proactive Problem Management
    Information Generation
  Problem Management Benefits
  Inputs, Outputs, and Relationships
    Outputs
    Relationships
  Putting Incident Management and Problem Management into Action
  Costs
    People Costs
    Technology Costs
  Measuring Problem Management Success
    Customer Impact Rate
    Incident Repeat Rate
    Problem Labor Utilization Rate
    Problem Reopen Rate
    Problem Resolution Rate
    Problem Workaround Rate
  Summary
Chapter 4: Supporting Compliance Through ITIL
  IT Compliance Is Relatively Young
  Frameworks Support Compliance
  ITIL Has Been Validated
  ITIL Service Management Supports Compliance
  SOX Mapping to ITIL Service Management
  ITIL Supports Compliance with Many Laws and Regulations
  Compliance with Policies and Procedures
  ITIL Supports Compliance and Improves Business
    Change Management
    Incident Management
    Problem Management
  Compliance Requires Accountability: ITIL Establishes Accountability
  Summary
Chapter 5: Roadmap for Successful ITIL Service Support Implementation
  Getting Ready
    Realizing Improvements Are Needed
    Get Executive Support
    Choose Team Members
    Create Mission Statements
  Perform a Baseline Assessment
    Identify Stakeholders
    Determine Current Situation
    Identify Trouble Spots
    Perform Benchmarks
  Planning
    Document the Business Case
    Set Goals
    Create the Implementation Plan
    Create Policies
    Identify Responsibilities
  Implementation
    Train Personnel
    Implement the Plan
    Use Tools to Manage Change
  Measurement
    Review Status
    Measure Goals
    Measure Changes
    Document Problems and Vulnerabilities
    Plan for Ongoing Management
  Summary

Copyright Statement

© 2007 Realtimepublishers.com, Inc. All rights reserved. This site contains materials that have been created, developed, or commissioned by, and published with the permission of, Realtimepublishers.com, Inc. (the "Materials") and this site and any such Materials are protected by international copyright and trademark laws. THE MATERIALS ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. The Materials are subject to change without notice and do not represent a commitment on the part of Realtimepublishers.com, Inc. or its web site sponsors. In no event shall Realtimepublishers.com, Inc. or its web site sponsors be held liable for technical or editorial errors or omissions contained in the Materials, including without limitation, for any direct, indirect, incidental, special, exemplary or consequential damages whatsoever resulting from the use of any information contained in the Materials. The Materials (including but not limited to the text, images, audio, and/or video) may not be copied, reproduced, republished, uploaded, posted, transmitted, or distributed in any way, in whole or in part, except that one copy may be downloaded for your personal, noncommercial use on a single computer. In connection with such use, you may not modify or obscure any copyright or other proprietary notice. The Materials may contain trademarks, services marks and logos that are the property of third parties. You are not permitted to use these trademarks, services marks or logos without prior written consent of such third parties. Realtimepublishers.com and the Realtimepublishers logo are registered in the US Patent & Trademark Office. All other product or service names are the property of their respective owners.
If you have any questions about these terms, or if you would like information about licensing materials from Realtimepublishers.com, please contact us via e-mail at info@realtimepublishers.com.


[Editor's Note: This eBook was downloaded from Realtime Nexus, The Digital Library. All leading technology guides from Realtimepublishers can be found at http://nexus.realtimepublishers.com.]

Chapter 1: ITIL Overview and Challenges


Most organizations depend upon IT to meet their business goals and to fulfill business processes. This increased dependency has led to more diverse systems and applications within the enterprise, with many of the components highly decentralized and/or highly specialized. This diversity has created a complex business-processing environment, and that complexity makes it very challenging to ensure the availability of business applications and systems. IT complexity, compounded by frequently changing technologies and constantly emerging threats, creates many IT service support challenges and problems. Table 1.1 lists the types of situations that often occur within business and the resulting challenges and problems that must be addressed.
Situation | Challenges and Problems
Deployments and Retirements | When new systems, applications, and technologies are deployed and old systems, applications, and technologies retired, problems arise, incidents occur, and changes must be made throughout the enterprise.
Updates | When applications are updated, the systems and other applications communicating with them will often also need to be updated.
Budget Cuts | When budgets are cut, existing applications and systems support and/or components must often be removed or their support drastically reduced.
Mergers and Acquisitions | When companies merge or acquire others, existing and diverse systems must be combined and changes must be made to ensure that the applications and systems that support the business keep going.
Divestitures | When companies divest of business units, they must remove those portions of the business from the network in an effective and efficient manner to keep business running efficiently and securely.
New Technology Threats | When new technological threats occur, which seems to happen more and more frequently, incidents occur, problems emerge, and changes must be made.
Poor Communication Interfaces | When there is a poor interface between the details of incidents, problem management systems, and error details, problems will not be effectively resolved.
Development Errors that Go into Production | When known errors from the development environment do not get communicated to the production environment, business will experience problems, impacting productivity and often revenues.
Ineffective Change Management Processes | When change management processes consist of an overabundance of disjointed, and often inconsistent, rules and regulations, IT becomes unproductive trying to follow the mess and often gives up following the rules altogether in frustration.
Personnel Reluctance to Change | When personnel are reluctant to adopt new change, incident, or problem management processes, preferring to stay with what they know because it seems to them that is the easier thing to do, the old issues with resolving problems, implementing changes, and responding to incidents are perpetuated, continuing to cost business.

Table 1.1: Example IT service support challenges and problems.

The possibilities are endless. Implementing yet more disconnected processes and procedures alone cannot efficiently address these challenges.
Recently in the U.K., JPMorgan Chase used ITIL to streamline their IT service desk. They have seven sites with more than 700 people that handle 3 million IT service calls per year. In 2004, they merged with Bank One and had to consolidate dozens of IT tools. Before ITIL, four incident-management tools, 14 change-control systems, four knowledge management tools, and 25 request tools were used. After using ITIL to consolidate processes, JPMorgan Chase now has just one incident-management tool, one change-control system, one knowledge management tool, and four request tools. The service desk maintains 93% customer satisfaction ratings and a 75% first-call resolution rate. (Source: ComputerworldUK, April 23, 2007, http://www.computerworlduk.com/management/itbusiness/it-organisation/news/index.cfm?newsid=2689.)

Old management styles that were once used strictly within centralized, single-system computing environments don't work in today's highly diverse and decentralized environments. It is easy within such complexity for errors to happen. Even small failures can impact the entire business. A single hardware problem can impact multiple virtual machines. For example, if an event console cannot accurately perform root cause analysis, that one hardware problem may be reported as multiple separate faults, making it extremely difficult to identify the underlying error and fix the problem.
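The virtual machine scenario above can be illustrated with a small sketch. It is not taken from the guide or from any particular event console; the host names, the VM-to-host map, and the two-VM threshold are all hypothetical. It only shows the idea of grouping fault reports by the physical host they share, so that several reports can be reviewed as one candidate root cause.

```python
from collections import defaultdict

# Hypothetical mapping of virtual machines to the physical host they run on.
VM_HOST = {"vm-web-1": "host-a", "vm-db-1": "host-a", "vm-app-1": "host-b"}


def group_faults_by_host(faults, vm_host):
    """Group (vm, message) fault events by the physical host behind each VM."""
    grouped = defaultdict(list)
    for vm, message in faults:
        # VMs missing from the map are kept under their own name.
        grouped[vm_host.get(vm, vm)].append((vm, message))
    return dict(grouped)


def likely_shared_causes(grouped, threshold=2):
    """Return hosts where at least `threshold` distinct VMs reported faults."""
    return sorted(
        host
        for host, events in grouped.items()
        if len({vm for vm, _ in events}) >= threshold
    )


faults = [
    ("vm-web-1", "disk timeout"),
    ("vm-db-1", "disk timeout"),
    ("vm-app-1", "high latency"),
]
grouped = group_faults_by_host(faults, VM_HOST)
candidates = likely_shared_causes(grouped)
print(candidates)  # ['host-a']
```

Grouping is only a first filter: two virtual machines failing on the same host suggests, but does not prove, a common hardware cause.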
According to IDC research, "Eighty percent of IT system outages are caused by operator and application errors" (Source: Behr, Kim and Spafford. The Visible OPS Handbook. Information Technology Process Institute. Eugene, Oregon. 2006. pg. 10).

A High-Level ITIL Overview


During the 1980s, the British government was experiencing numerous IT service problems and related quality issues. The Central Computer and Telecommunications Agency (CCTA), now named the Office of Government Commerce (OGC), was tasked with developing a way to efficiently and cost-effectively use IT resources within public sector organizations. The goal was to create a successful approach that would be independent of any particular vendor. What resulted was the Information Technology Infrastructure Library (ITIL), a collection of the best practices for the IT service industry developed by practitioners, theorists, and researchers. The core ITIL V2 publications include:

- Service Support
- Service Delivery
- The Business Perspective
- ICT Infrastructure Management
- Application Management
- Security Management
- Planning to Implement Service Management

ITIL V3 is currently being reviewed and has a reported planned release in 2007.

This guide will look at how the ITIL Service Support processes can be applied to businesses. More specifically, it will explore Change Management, Problem Management, and Incident Management.

Change Management

To most efficiently and effectively handle IT changes, there must be one centrally managed Change Management process. The Change Management process must be integrated throughout the entire applications and systems development life cycle (SDLC). Activities that must be managed to process changes include the following, shown in the order they occur:

1. Recording: Ensuring all change sources can submit Requests for Change (RFCs) and that the RFCs are properly recorded.
2. Acceptance: Filtering submitted RFCs and moving those eligible on for consideration.
3. Classification, categorization, and prioritization: Putting each RFC into the appropriate category and establishing a priority.
4. Planning and approval: Consolidating the changes, giving approvals, obtaining resources, and involving the change advisory board (CAB) where necessary.
5. Coordination: Scheduling, development, testing, and implementation.
6. Evaluation and closure: Determining success and learning from the experience.

ITIL Change Management Glossary

ITIL uses many terms that may not be familiar to those of you new to this methodology. To assist with the discussion of Change Management within this and subsequent chapters, the following list highlights common terms that you will see when ITIL Change Management concepts are discussed:

- Change Manager: One of the two authorities within Change Management. This is the position that is responsible for sorting, receiving, and classifying all Requests for Change (RFCs).
- Change Advisory Board (CAB): The second of the authorities within Change Management. This is a type of consulting group that meets on a regular basis to review, assess, prioritize, and plan changes.
- Configuration Items (CIs): IT components and the services provided with them. Examples include computer hardware, computer software, network components, servers of all types, procedures, documentation, and all other components that the IT area controls.
- Configuration Management Database (CMDB): Used to track all the IT components, including the version, status, and relationships for each. All CIs are part of the CMDB.
- Process Scope: This is determined in conjunction with the scope of the Configuration Management and Release Management processes. Determining the scope of Configuration Management is dynamic and can change as associated actions occur and as the information within the CMDB changes.
- Request for Change (RFC): This is used to propose a change to any component of the IT infrastructure or any part of an IT service. An RFC can be a document or record used to enter the details, justification, and authorization for the proposed change.

Figure 1.1 illustrates the relationship between Change Management activities.

[Flowchart: RFC submission and recording; RFC acceptance and filtering; RFC classification and prioritization; decision "Urgent?" (Yes / No); planning and approval; coordination; decision "Working as planned?" (No: initiate back-out plan; Yes: evaluation). Alongside these steps: process information; monitor and update CMDB.]

Figure 1.1: Change Management processes.

Chapter 2 will go over Change Management in more detail.
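As a rough illustration of the Change Management activities listed earlier, an RFC can be modeled as a record that moves through a fixed set of states. This is a minimal sketch, not something ITIL prescribes: the state names, the `Rfc` class, and the transition table are hypothetical, and the choices to allow rejection at acceptance or approval and to close an RFC after a back-out are assumptions made for the example.

```python
from enum import Enum


class RfcState(Enum):
    RECORDED = "recorded"
    ACCEPTED = "accepted"
    CLASSIFIED = "classified"
    APPROVED = "approved"
    COORDINATED = "coordinated"
    BACKED_OUT = "backed out"
    REJECTED = "rejected"
    CLOSED = "closed"


# Hypothetical transition table; the rejection and back-out paths are assumptions.
ALLOWED = {
    RfcState.RECORDED: {RfcState.ACCEPTED, RfcState.REJECTED},
    RfcState.ACCEPTED: {RfcState.CLASSIFIED},
    RfcState.CLASSIFIED: {RfcState.APPROVED, RfcState.REJECTED},
    RfcState.APPROVED: {RfcState.COORDINATED},
    RfcState.COORDINATED: {RfcState.CLOSED, RfcState.BACKED_OUT},
    RfcState.BACKED_OUT: {RfcState.CLOSED},
}


class Rfc:
    """A Request for Change that records every state it passes through."""

    def __init__(self, summary):
        self.summary = summary
        self.state = RfcState.RECORDED
        self.history = [RfcState.RECORDED]

    def advance(self, new_state):
        """Move to new_state, refusing any step the table does not allow."""
        if new_state not in ALLOWED.get(self.state, set()):
            raise ValueError(
                f"cannot move from {self.state.name} to {new_state.name}"
            )
        self.state = new_state
        self.history.append(new_state)


rfc = Rfc("Upgrade mail server")
for step in (
    RfcState.ACCEPTED,
    RfcState.CLASSIFIED,
    RfcState.APPROVED,
    RfcState.COORDINATED,
    RfcState.CLOSED,
):
    rfc.advance(step)
print([state.name for state in rfc.history])
```

Keeping the allowed transitions in one table means an attempt to skip a step, such as implementing a change that was never approved, fails loudly instead of passing unnoticed.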

Incident Management

Incident Management is responsible for managing all incidents from detection and recording through resolution and closure. Incident Management is reactive. The objectives of Incident Management are to reduce or eliminate the business impacts and effects of actual or likely disturbances within IT services, so that personnel can get back to work and business can return to normal as soon as possible. The types of activities that occur within Incident Management include the following, shown in the order they occur:

1. Incident acceptance and recording: Detect or report an incident and then create an incident record.
2. Classification and initial support: Code the incident by type, status, impact, urgency, priority, service level agreement (SLA), and so on. Provide temporary workarounds as applicable.
3. Service request: If necessary, implement the appropriate procedure to request IT services.
4. Matching: Determine whether the incident is known and if there is a workaround.
5. Investigation and diagnosis: If a known solution to the incident does not exist, then investigation occurs.
6. Resolution and recovery: After finding a solution, the issue is resolved.
7. Closure: If the user is satisfied with the solution, the incident is closed.

Progress monitoring and tracking activities occur after each of these steps. During these activities, the incident cycle is monitored to determine how quickly it can be resolved and whether escalation is necessary.
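The ordering of these activities can be sketched in a few lines of Python. This is a hedged illustration rather than a prescribed implementation: the known-error table, the function name, and the choice to stop after the service request procedure are assumptions made for this example.

```python
# Hypothetical known-error table: symptom -> documented workaround.
KNOWN_ERRORS = {
    "vpn timeout": "Reconnect using the backup gateway",
}


def handle_incident(symptom, is_service_request=False, known_errors=KNOWN_ERRORS):
    """Return the ordered list of activities an incident passes through."""
    steps = ["acceptance and recording", "classification and initial support"]
    if is_service_request:
        # Assumption: a service request follows its own procedure from here.
        steps.append("service request procedure")
        return steps
    steps.append("matching")
    workaround = known_errors.get(symptom.strip().lower())
    if workaround is None:
        # No known solution exists, so investigation is required.
        steps.append("investigation and diagnosis")
    else:
        steps.append("apply workaround: " + workaround)
    steps.extend(["resolution and recovery", "closure"])
    return steps
```

For example, calling handle_incident("VPN timeout") skips investigation because a workaround is already recorded, while an unrecognized symptom is routed through investigation and diagnosis.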
ITIL Incident Management Glossary
To assist with the discussion of Incident Management within this and subsequent chapters, the following list highlights common terms that you will see when ITIL Incident Management concepts are discussed:

Incident: "Any event which is not part of the standard operation of a service and which causes, or may cause, an interruption to, or a reduction in the quality of that service" (Source: The ITIL Open Guide site at http://www.itlibrary.org/index.php?page=Incident_Management on May 6, 2007).

Request for Change (RFC): This is used to request a change to the IT infrastructure or any IT service. An RFC can be a document or record used to enter the details, justification, and authorization for the proposed change.

Service Request: A request for a change to be made to an IT service. A Service Request should be made under strict, well-defined procedural controls, making it almost risk free. Examples include establishing a new network user ID and transferring a computer from one department to another.

Figure 1.2 demonstrates the relationship between Incident Management activities.

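[Figure 1.2 flowchart: incident acceptance and recording; classification and initial support; decision point "Service request?"; matching; decision point "Match?" (No: investigation and diagnosis); resolution and recovery; decision point "Resolved?" (Yes: incident closure). The flowchart also includes a "Follow urgency procedures" step, and progress monitoring and tracking runs alongside the other activities.]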

Figure 1.2: Incident Management processes.

Chapter 3 will go over Incident Management in more detail.

Problem Management
So how is a problem different from an incident? Generally, a problem is an unwanted or undesirable situation that, if not addressed soon enough, can become the root cause of an incident. Problem Management takes the entire IT infrastructure into account, using all available information to identify existing and potential failures in the delivery of IT services. Problem Management supports Incident Management by providing alternative workarounds and temporary fixes during an incident but does not have responsibility for actually resolving incidents. Problem Management also involves the analysis of incidents and problems to identify trends and then takes proactive actions to prevent the further occurrence of similar incidents and problems. The types of activities that occur within Problem Management include:
1. Problem identification and recording: Identifying known and new problems and performing trend analysis.
2. Problem classification and allocation: Determining the category, impact, urgency, priority, and status of a problem, then allocating resources for resolution.
3. Problem investigation and diagnosis: Determining the cause of the problem and linking it to the appropriate CIs.
4. Temporary fixes: Implementing necessary temporary or emergency fixes to manage known errors until they can be resolved.
5. Error identification and recording: Identifying the error and then communicating the error to Incident Management if appropriate.
6. Error assessment: Determining what is necessary to resolve known problems and errors.
7. Record error resolution: Determining the most appropriate business solution.
8. Close error and associated problems: Performing a Post-Implementation Review (PIR) and then closing the records.

During each of the first four steps, actions are taken to track and monitor the problem, ensuring clear and comprehensive documentation is maintained. Likewise, during each of steps five through eight, actions are taken to track and monitor the error.
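The trend analysis mentioned in the first activity can be as simple as counting how often incidents are linked to the same CI. The following Python sketch shows one way to do that; the function name, the (incident ID, CI name) record shape, and the threshold of three incidents are assumptions chosen for illustration, not ITIL requirements.

```python
from collections import Counter


def find_problem_candidates(incidents, threshold=3):
    """Return the CIs whose incident count reaches the threshold.

    incidents is an iterable of (incident_id, ci_name) pairs; the
    record shape and the default threshold are illustrative only.
    """
    counts = Counter(ci for _, ci in incidents)
    return sorted(ci for ci, n in counts.items() if n >= threshold)
```

A CI that keeps appearing in incident records is a reasonable candidate for a problem record, even if each individual incident was resolved with a workaround.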
ITIL Problem Management Glossary
To assist with the discussion of Problem Management within this and subsequent chapters, the following list highlights common terms that you will see when ITIL Problem Management concepts are discussed:

Problem: A description of an unwanted situation that specifies the root cause of one or more existing or potential incidents.

Known Error: A problem that has a documented root cause and a workaround.

Post-Implementation Review (PIR): A review that occurs after a change or a project has been implemented, determines whether the change or project was successful, and identifies improvement opportunities.

Figure 1.3 illustrates the relationship between Problem Management activities.

[Figure 1.3 diagram: problem identification and recording; problem classification; problem investigation and diagnosis; RFC and problem resolution and closure, with tracking and monitoring of problems alongside. Then error identification and recording; error assessment; record error resolution; close error and associated problems, with tracking and monitoring of errors alongside.]

Figure 1.3: Problem Management processes.

Chapter 4 will discuss Problem Management in more detail.

The Business Value of ITIL


IT inefficiencies can have a dramatic negative impact on business. The following examples highlight potential impacts:
• Change Management processes can be overly bureaucratic and unproductive. Change Management must consist of more than scarcely followed procedures, rarely attended meetings, and procrastination. Effective Change Management is not an endless bureaucracy of making sure the i's are dotted and the t's are crossed. The goal of Change Management is not dealing with a bunch of red tape; it is getting necessary changes made in the most efficient, effective, and business-supporting way possible.
• Personnel who circumvent the established processes and procedures can create outages and can make rampant ad hoc changes that are not tracked. Personnel typically consider themselves justified in not following the processes because they view the processes as complicated barriers to getting their jobs done.
• IT services that do not support good customer service can create bad relationships with customers, resulting in lost business. Most businesses today depend upon IT applications to perform customer service activities.
• A lack of structured, documented IT service support procedures can leave the organization out of compliance with a growing number of laws and regulations, resulting in time lost to lengthy audits and in costly fines and penalties. Audits are completed much more quickly, and with many fewer findings and penalties, when IT service support is performed using documented and effective procedures that support the business.
• IT resources that are not available when necessary to support the business cost the business in personnel time, lost business, and lost opportunity. Organizations today cannot afford to have the IT resources they depend upon for business go missing in action.

These inefficiencies and negative impacts can be reduced, and most even eliminated, using ITIL.

Efficient IT Benefits Business
ITIL promotes efficient and effective IT practices, which in turn benefits business. All these benefits help to ensure IT becomes a global, efficient, cost-effective, seamless part of the business enterprise.

Customer Retention
IT services become more customer-focused, making your external customers happier and promoting customer loyalty and retention by reducing IT problems that noticeably impact customers. Agreements about IT service quality also improve the relationships with customers. IT services are more clearly and accurately described, in better detail, and in customer language. The customers who depend upon your IT services as part of the product or service they purchased expect that your organization will put them first when they experience problems. If your IT services are not customer-focused when problems occur, you will quickly find yourself on the front page of international news sites, which not only concerns your other customers but also keeps potential customers away.
Avoid headlines like these from the May 16, 2007 abcnews.com Web site and widely discussed on Good Morning America (http://abcnews.go.com/GMA/Technology/story?id=3179394&page=1): "Dell Hell: Computer Giant Faces Consumer Lawsuit and Consumers Allege They Didn't Get the Tech Support They Paid For."

Improved Quality
IT service quality, availability, reliability, and costs are managed better when properly using ITIL, saving time, money, and resources and resulting in better justification for costs related to IT service quality. There are many factors that contribute to this quality improvement:
• Documented roles and responsibilities improve the quality of IT service provisioning.
• Following repeatable, consistent processes that are engineered specifically to support business reduces human errors, resulting in better quality output and outcomes.
• Quality management systems based on ISO 9000 and BS15000 are supported.

As a case in point, United Space Alliance, the largest contractor for the NASA space shuttle program, implemented an integrated asset and service management system using ITIL (Source: http://www-306.ibm.com/software/success/cssdb.nsf/CS/LWIS6ZSLKQ?OpenDocument&Site=software&cty=en_us, May 16, 2007). As a result, they measurably improved service quality and efficiency for their 50,000+ hardware assets and 100,000+ software assets by establishing real-time incident, problem, and change management capabilities. Many organizations are now outsourcing critical IT processes. The IT process structure resulting from ITIL also provides a framework to facilitate more effective outsourcing of IT service elements, allowing the organization to realize better quality from their outsourced vendor. A higher-quality IT service support function brings with it more agile and efficient IT service support, which enhances competitiveness and ultimately improves business.


Greater Efficiency
Commonly used IT processes are better integrated when ITIL is successfully implemented. Rework is reduced and redundant work eliminated by centralizing IT processes. Not only does this result in IT processes having improved scalability and consolidation, but the IT area is more clearly structured, more efficient, and better focused on corporate objectives. Because ITIL is business focused, IT is also better integrated with other business processes throughout the enterprise. Better integration results in improved utilization of IT resources. Changes within the IT infrastructure are easier to manage. Clearly identifiable reference points for internal communications and external communications with vendors and business partners are created, allowing for the effective standardization of procedures. During mergers and acquisitions, IT installations that may have wide-scale differences are consolidated into one coherent management structure by using ITIL concepts and frameworks.

Better Communication
The use of ITIL concepts produces agreed-upon, consistent points of contact within the IT area, which improves communication. Because of the emphasis on documentation, continuous documented learning occurs from IT experiences, helping to prevent mistakes from recurring. As a result of improved communication, IT services better meet business, customer, and user demands and realize improved performance for the IT service delivery and service support areas.

Measurable Results
Using ITIL, demonstrable performance indicators are created and used to support the business. By being able to monitor and respond to these indicators, mission-critical IT services have improved availability, reliability, and security. Centralization and greater efficiency measurably reduce latency at every stage of the IT management cycle, dramatically reducing costs. Reducing latency improves IT project deliverables and delivery times. Baseline IT metrics and ongoing measurements become part of the business as an effect of ITIL implementation.

How significant are the savings that can result from ITIL? Consider Transporeon, a leading European e-logistics solution provider. Transporeon implemented ITIL and an IT process automation system and improved their IT staff's productivity in addition to reducing their overall maintenance cost by more than 40% (Source: http://www.opsware.com/Downloads/CS_2007_05_PAS_Transporeon.pdf on May 16, 2007). A side effect was freeing key resources to allow for faster response times that increased customer satisfaction.


Better Audit Outcomes
ITIL makes audits easier by providing better documentation and up-to-date metrics that the auditors can use instead of having to try to create the metrics themselves based upon numerous and disparate documents. ITIL allows IT management system audits to be more favorable and take less time. Because IT infrastructure and related services are better controlled, fewer audit findings result. Security controls based on COBIT, which auditors overwhelmingly use, are supported. Compliance requirements for the U.S. Sarbanes-Oxley Act, as well as other laws and regulations, are supported by ITIL.

Automation Boosts IT Efficiency
The efficiency of ITIL implementation can be improved with automation. Many critical processes can be automated to streamline business. Automation can be seamlessly integrated using products specifically engineered to complement the ITIL processes.
Automation makes personnel work easier and less training is needed to accomplish the ITIL objectives.

Good, effective tools allow automation to use the same data model, security model, and other IT models your organization has adopted; automation just makes them timelier, more efficient, more consistent, and more likely to be error-free. As a case in point, consider EDS. EDS has one of the world's largest IT organizations. They recently automated more than 65,000 servers across more than 400 worldwide locations in support of their ITIL process. As a result, they reduced costs and improved efficiencies in their IT organization by automating the complete life cycle of business application management and the underlying infrastructure.

The EDS example points out that ITIL can support a global scale of deployment. Automation solutions must be able to scale as large as possible. Automating ITIL processes does not occur overnight; automation must be built into solutions. In addition to supporting more successful and efficient global deployment, automation allows IT service support and delivery processes to be delivered more quickly, saving huge amounts of time and ultimately reducing the resource costs involved with performing actions manually.

As an example, consider the BNSF Railway. In 2005, BNSF became one of the first companies in the railway industry to deploy an extensive wireless network and automate the management of its complex IT infrastructure. Prior to automation, the network engineers would log on to each device manually to make configuration changes, which took significant time. Automation allowed the BNSF network engineers to securely automate password and SNMP community string management, deployment of access control lists (ACLs), and configuration change tracking. Now, BNSF pushes out changes as a batch and uses an automated policy compliance manager to ensure that all deployed changes match the company's required security and compliance policies.

As Greg Britz, Network Operations Manager, BNSF Railway, said, "Automation will win every time over manual IT management as we begin to roll out new services and the network becomes more complex. We will handle configuration updates in milliseconds, compared to the minutes or hours that it took to configure systems manually" (Source: http://www.opsware.com/about/success_BNSF.php on May 8, 2007).
Automation reduces latency at every stage of the management cycle to dramatically reduce costs.
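To make the idea of an automated policy compliance check concrete, here is a minimal Python sketch. It does not represent BNSF's tooling or any vendor product; the configuration keys ("snmp_community", "acls"), the banned community strings, and the function name are all hypothetical choices made for this illustration.

```python
def check_compliance(device_configs, banned_communities=("public", "private")):
    """Return {device: [violations]} for devices that break the policy.

    device_configs maps a device name to a dict with the hypothetical
    keys "snmp_community" and "acls". The two rules checked here are
    illustrative policy rules, not a complete security policy.
    """
    violations = {}
    for device, cfg in device_configs.items():
        found = []
        if cfg.get("snmp_community") in banned_communities:
            found.append("default SNMP community string")
        if not cfg.get("acls"):
            found.append("no ACL deployed")
        if found:
            violations[device] = found
    return violations
```

Running a check like this after every batch of changes turns policy compliance into a repeatable step instead of a manual device-by-device review.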

ITIL processes and solutions must be living. The ITIL rules and capabilities, compliance audits, and other processes must be updated and changed whenever necessary with the least impact. In addition, solutions must be able to self-correct as much as possible and notify administrators when the rules have been changed and when they need to change. Automation allows these living changes to occur much more quickly and efficiently than can be accomplished manually.

ITIL Challenges
The benefits of ITIL can be realized only if ITIL is used correctly. Organizations face similar challenges using ITIL and must be aware of the common mistakes. Avoid these mistakes by understanding and using ITIL components according to the needs of the business that your IT organization supports.

ITIL Implementation Takes Time
Bringing ITIL into the enterprise can take a long time and requires significant coordination and effort. It may very well require a change to the culture of the organization. Being overly ambitious in bringing ITIL into the organization could be frustrating if objectives are not met.

ITIL Implementation Requires Resources from Across the Enterprise
Without sufficient resources, training, support tools, and time, ITIL will not be implemented to the degree at which it can have the most positive business impact. When ITIL is being introduced, additional resources and personnel may be needed until it is well established.

ITIL Implementation Requires Understanding
Without an understanding of the processes being implemented, no improvements will result. Those using ITIL, throughout the enterprise, must understand what the appropriate performance indicators are and how to control the processes.

Baseline Data Must Be Collected
Baseline data must be established to be able to measure impacts and improvements. If no baseline data is collected, improvements in the provisioning of services and cost reductions cannot be measured, and business leaders will not know, in quantitative terms they understand, the value ITIL brought to the organization.
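Comparing current measurements against a baseline is simple arithmetic, as the following Python sketch shows. The metric names and values used with it are invented for illustration; substitute whatever indicators your organization actually tracks.

```python
def percent_change(baseline, current):
    """Percentage change from baseline to current (negative means a reduction)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero to compute a percentage")
    return (current - baseline) / baseline * 100.0


def compare_to_baseline(baseline, current):
    """Return {metric: percent change} for metrics present in both dicts."""
    return {
        name: round(percent_change(baseline[name], current[name]), 1)
        for name in baseline
        if name in current
    }
```

For instance, a hypothetical mean resolution time that falls from 8.0 hours at baseline to 6.0 hours shows up as a change of -25.0 percent, which is a figure business leaders can readily interpret.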

Personnel Throughout the Enterprise Must Be Involved
Successful implementation requires the participation of personnel at all levels of the enterprise, throughout the entire enterprise. If you try to have one department or team implement ITIL, it may very well isolate that group, and a direction may subsequently be set that the rest of the enterprise does not accept or follow.

Integration with Other Frameworks Must Be Carefully Planned
Implementing ITIL with other frameworks, such as COBIT, Six Sigma, and CMMI, is possible. In fact, just one framework alone will not meet the wide range of business needs and processes, and so multiple chosen frameworks should be used in harmony. However, harmony between ITIL and other frameworks cannot be achieved in ways that try to make incompatible components fit; this will lead to frustration and failure, and full ITIL value will not be realized. Certainly, multiple frameworks have compatible components, but careful analysis must occur to determine which components are truly good fits within your particular organization.

Getting Started With ITIL


To successfully and most effectively implement ITIL within your organization, you must first clearly understand your business processes and the IT services that support those processes. This understanding is not only necessary for successful ITIL implementation; it is necessary for good IT governance. Successful IT governance that adds value to business requires that you are able to:
• Identify the IT services that support key business activities
• Establish metrics to measure the quality of IT services delivered and the value IT adds to the business
• Effectively manage the end-to-end IT infrastructure that supports service delivery
• Hold IT accountable for business results

To most successfully implement ITIL, IT organizations can implement an integrated technology solution that addresses the issues previously discussed. Automating the ITIL processes will allow the IT department to more successfully:
• Map services to business needs
• Measure key performance indicators (KPIs) and the actual end-user experience
• Manage IT components across all organizational systems and networks

When choosing a technology solution to support ITIL, organizations should require the solution to support all aspects of IT service support in a unified manner so that use of the product does not end up being counter to ITIL principles.


Implementing ITIL
Expect to have at least some internal resistance to implementing ITIL. There will always be people who would rather stay an ineffective course than put in the time and effort necessary to follow a new path. ITIL initiatives can be successfully championed and initiated by following three simple principles.

#1: Be Realistic; Start Small
Identify the areas within the IT infrastructure where there is the least efficiency, the most problems, and the most user dissatisfaction. Implement ITIL to address those areas. This will give you experience with the ITIL processes while addressing the most significant and business-impacting IT problem areas.

#2: Document, Document, Document!
You will not be able to demonstrate or communicate the value of IT to the business if it is not well documented. Initial measurements and benchmarks must be accurately and consistently documented to validate improvements. Accurately and consistently tracking KPIs will allow IT to continuously measure progress and report this progress to business leaders over time. This is critical for successful and effective management as well as for establishing accountability for business unit and executive management.

#3: Obtain Executive Support
Organizational changes are almost always difficult. It is human nature to want to continue using known processes, even if they are bad or ineffective, instead of learning and implementing something new. Executive support and sponsorship are necessary for successful ITIL implementation, just as they are with any other major enterprise initiative. IT personnel will need to do significant work to implement ITIL processes, so executive support is a must, and its value cannot be overstated. When executive leaders understand the positive impact ITIL can have upon business, they will support the training and other investments necessary to ensure successful and efficient ITIL implementation.


Summary
By relating IT infrastructure to business value, ITIL also helps demonstrate to business leaders the value of IT, supporting IT investments and initiatives. ITIL integrates data to provide a comprehensive cross-tier representation for IT services, bringing better understanding to business leaders throughout the enterprise for how IT supports business success. ITIL is a powerful set of guidelines that enables IT to deliver greater value and better align itself with business needs in efficient and valuable ways. Organizations implementing the ITIL framework must do so with a clearly defined sense of purpose and realize that, as with most ambitious business objectives, it will be achieved one step at a time. ITIL addresses IT service support challenges, as the following examples illustrate:

Change Management
• Helps prevent problems and incidents that typically occur when IT changes are made to accommodate new technologies that are deployed throughout the enterprise
• Helps ensure that all IT infrastructure or device changes are implemented in a consistent, efficient, and repeatable way, which in turn will minimize IT services downtime resulting from errors and bad planning associated with the changes
• Helps to successfully combine and streamline multiple and diverse systems and applications during mergers and acquisitions

Incident Management
• Lessens the impacts of new technology threats, ensuring more efficient and effective recovery
• Helps restore business services and processes as quickly as possible; incidents are recorded within a central repository, enabling IT to most effectively utilize available skills and ensure important tasks are not overlooked during incident response
• With Problem Management, ensures an effective interface exists between the details of Incident Management and Problem Management systems to most effectively resolve the errors and root causes of incidents to keep the errors from recurring

Problem Management
• Helps to ensure that known errors from the development environment are communicated to the production environment
• Minimizes the impact that errors within the IT infrastructure have on the business and helps to prevent recurrence of incidents related to the errors

The upcoming chapters will discuss in detail how the ITIL Change Management, Problem Management, and Incident Management Service Support processes can be applied to help support business activities and goals. Each chapter will detail:
• The basic concept and objectives for the associated ITIL process
• The benefits of implementing the ITIL process
• How to get started with implementing the ITIL process
• Specific metrics for each ITIL process
• Ways to verify the ITIL process
• The costs of potential problems associated with each ITIL process

Chapter 4 will discuss in more detail how ITIL supports compliance. And, finally, Chapter 5 will tie it all together within a roadmap for successful ITIL implementation of the three processes discussed.


Chapter 2: Effective Change Management Through ITIL


Change happens every day and in every way within every business. Information technology (IT) changes were infrequent back when all processing was done within a central mainframe that used dumb terminals for business inputs. Today, however, technology changes ever more often, resulting in more frequent changes to business processes to improve services, reduce the number of incidents, lessen costs, and generally improve business. Complex networks and systems coupled with numerous necessary changes are a recipe for disaster if not successfully managed. Implementing the ITIL Change Management processes can help not only to demonstrate that IT is a necessary cost to your business but also to actually show how IT can add value to your business. There are generally three types of modifications made within Change Management:
• Changes: Planned changes to the IT infrastructure to keep processing going. These changes can range from installing a workstation on the network to relocating a server or a mainframe. These types of changes can introduce new errors into the IT environment, and the errors may not be recognized for a long time unless a good change management process is in place.
• Corrective measures: Put simply, these fix the errors that exist.
• Improvements and innovations: These implement new services, technical capabilities, components, or technologies in the IT architecture. They often result in unexpected consequences and long-term errors.

The Change Management Process


The ITIL Change Management process has two authorities: the Change Manager and the Change Advisory Board (CAB), as defined in Chapter 1. The Change Manager is responsible for obtaining authorizations for the requested changes and for planning and coordinating the implementation of the approved changes. The CAB is a group that regularly meets to review significant change requests and progress, prioritize changes, and help plan the changes. The CAB should include representatives from each IT area.

The scope of the Change Management process is coordinated with the scope of the Configuration Management and Release Management processes. Configuration Management processes determine the impact of changes and update the Configuration Management Database (CMDB). Change Management does not include activities considered Standard Changes, such as creating and deleting user IDs. Activities considered Standard Changes are not done under the Change Management process but are classified as Service Requests and performed as part of the Incident Management process. Because there are typically so many Service Requests, this delineation helps to keep the Change Management process manageable, keeping the CMDB from being overloaded and the performance of the activities from being overly bureaucratic.
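The routing rule described above can be sketched as a small lookup. In this hedged Python example, the catalogue of Standard Changes, the function name, and the returned labels are assumptions made for illustration; each organization defines its own list of Standard Changes.

```python
# Hypothetical catalogue of Standard Changes, based on the examples in the text.
STANDARD_CHANGES = {"create user id", "delete user id"}


def route_request(request_type, standard_changes=STANDARD_CHANGES):
    """Decide which process handles a request.

    Standard Changes are treated as Service Requests under Incident
    Management; everything else requires an RFC under Change Management.
    """
    if request_type.strip().lower() in standard_changes:
        return "Service Request (Incident Management)"
    return "RFC (Change Management)"
```

Keeping the catalogue explicit makes it easy to review which activities bypass the CAB, which is the point of the delineation.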


Change Management Benefits


Numerous incidents significantly impacting business have been the result of IT changes. These incidents occurred because of poor planning, insufficient resources, inadequate testing, sloppy work practices, ignoring business impact analysis for planned changes, and unknown bugs. The goal of implementing Change Management within the ITIL framework is to consistently and efficiently manage the change process, which will, as a consequence, reduce errors and prevent incidents by providing:
• Controlled implementation of changes
• The service and help desk with information on current and future change activity as well as change history
• Up-to-date information to customers on change progress
• Management with a history of how efficiently changes are made

Successful implementation of the ITIL Change Management process will result in many benefits to the business, including:
• Better estimates for proposed change costs
• Better management information about changes, which allows for better problem diagnosis
• Fewer reversed changes
• More smoothly executed back-outs
• Improved IT personnel productivity because of fewer distractions caused by emergency changes or back-out procedures
• Improved user productivity because of more stable IT services
• Better ability to make more frequent changes without creating an unstable IT environment
• Reduced adverse impacts of changes

How will you know if you are realizing these benefits? By maintaining Change Management metrics, which I discuss later in this chapter. However, first it is important to understand what is involved with the Change Management process to help you understand and appreciate the benefits measurements.


Inputs, Outputs, and Relationships


At the core of Change Management are the inputs and outputs for the process and the many important connections Change Management has with the other ITIL processes. Historically, ad hoc change management resulted in wasted time and resources by performing unnecessary changes. These types of changes also typically created problems and resulted in costly incidents. Change Management creates a repeatable, efficient way to implement changes throughout the enterprise.

Inputs
One of the goals of Change Management is to be able to determine with a high level of certainty whether the change being considered is appropriate. This is accomplished using a request for change (RFC), as defined within Chapter 1. The Change Manager will facilitate the CAB to determine whether the change should be made. It is up to the CAB to determine the potential impact of the requested change. To enable this determination, the CAB considers four types of information:
• The RFC
• The CMDB data
• Information from other processes (budgets and the Capacity Database, to name just two examples)
• Forward Schedule of Change (FSC)

The CMDB data is critical for performing the change impact analysis.
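One way to picture change impact analysis is as a walk over the relationships the CMDB records between CIs. The Python sketch below assumes a deliberately simplified CMDB, represented as a dictionary that maps each CI to the CIs that directly depend on it; the CI names and the function name are hypothetical, and a real CMDB holds far more detail (version, status, and several relationship types).

```python
from collections import deque


def impacted_cis(depends_on_me, changed_ci):
    """Return every CI that directly or indirectly depends on changed_ci.

    depends_on_me maps a CI name to the list of CIs that directly
    depend on it. A breadth-first search follows those relationships.
    """
    seen = set()
    queue = deque([changed_ci])
    while queue:
        ci = queue.popleft()
        for dependent in depends_on_me.get(ci, ()):
            if dependent not in seen and dependent != changed_ci:
                seen.add(dependent)
                queue.append(dependent)
    return sorted(seen)
```

With relationships such as a database server supporting an order application, which in turn supports a Web site, a change to the database server is reported as affecting both downstream CIs.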

Outputs
The outputs of the Change Management process include:
• The updated FSC
• Triggers to use for Configuration Management and Release Management
• CAB agenda, minutes, discussions, decisions, and action items
• Change Management reports

The FSC, sometimes called a Change Schedule, lists all approved changes and their planned implementation dates.

Figure 2.1 illustrates the inputs and outputs for the Change Management process.

[Figure 2.1 is a diagram. The inputs (RFCs, CMDB data, data from other processes, and the FSC) feed the Change Management activities (recording, accepting or rejecting, classifying, planning, building, testing, implementing, and evaluating), which produce the outputs (triggers, CAB documents, Change Management reports, and the updated FSC).]

Figure 2.1: Change Management inputs and outputs.

Relationships
Change Management has relationships with all the other ITIL processes. It is important for the success of not only Change Management but for all enterprise-wide ITIL processes that these relationships are appropriately managed. Figure 2.2 illustrates these relationships at a high level.

[Figure 2.2 is a diagram that places Change Management at the center, connected to Incident Management, Configuration Management, IT Service Continuity Management, Capacity Management, Problem Management, Availability Management, Release Management, and Service Level Management. The labels on the connections include RFC, change notice, CI relationships, PSA report, and change impact analysis.]

Figure 2.2: Change Management relationships with other ITIL processes.

As this figure shows, Change Management activities impact all other ITIL processes in one way or another. Effective communication channels must exist to communicate key activities. Let's step through an example to see how all these processes are related.

ACME Super Duper Supplies is going to implement a new e-commerce Web site that will allow online merchandise ordering and payments for its new product, Magic Mover. This is a significant change to its IT infrastructure and will also have a major impact on its business. The change must be implemented in a coordinated way to ensure that all impacted areas are aware of the change and that any potential negative impacts are minimized. By following ACME's Change Management process, Ms. Flint, the manager of the Magic Mover business unit, will help ensure the change is implemented as successfully as possible.
• Ms. Flint submits an RFC for the change to the Change Management team.
• The ACME Change Manager gives the RFC to the CAB, which approves the change.
• The Change Manager works closely with the Configuration Management area to obtain data from the associated CMDB, identify the relationships among the configuration items (CIs) associated with adding Magic Mover to the site, and determine what is affected by the change.
• The CAB works with the Availability Management team to estimate the potential impact of making the changes to add the Magic Mover to the e-commerce Web site. Availability Management will in turn make any changes necessary to protect service availability that may be affected by the change.
• The Change Manager notifies the Incident Management, Problem Management, and Release Management teams of the planned change so that they can determine how this change will affect them.
• The Change Manager sends a report to the Service Level Management team that lists the changes that will need to be made to the SLAs along with the impact of the FSC on service availability.
• The Change Manager communicates the change details to the Capacity Management team so that they can determine the cumulative effect, over an extended time, of adding the Magic Mover item to the e-commerce Web site. They may find that response time will be impacted and that more processing power is necessary.
• The Change Manager and the CAB work closely with the IT Service Continuity Management team to ensure it is aware of all the changes that will be made as a result of adding the Magic Mover to the Web site and to determine how this will impact the existing recovery plans. The team can then ensure that the appropriate steps are taken to update the plans so that recovery can be completed successfully.

Figure 2.3 shows how the Change Management process would flow for the Magic Mover Web site implementation.

[Figure 2.3 is a flowchart. Ms. Flint submits an RFC to add the Magic Mover to the e-commerce site; the CAB reviews the RFC and accepts the change; the CAB classifies and prioritizes the RFC. If the change is urgent, the urgency procedures are followed to speed the changes into production. If it is not urgent, the CAB creates the change plan and waits for approval, and the Magic Mover team works with the Change Management area on implementation. The Magic Mover information is processed, and the CMDB is monitored and updated. If implementation is not going okay, the back-out plan is initiated; otherwise, the success of the Magic Mover implementation is evaluated.]

Figure 2.3: Magic Mover Change Management process flow.
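
The decision points in this flow can be sketched in a few lines of Python. This is only an illustrative sketch, not part of ITIL: the function name `change_flow` and its parameters are hypothetical, the rejection branch is taken from the activities in Figure 2.1 rather than from Figure 2.3, and I assume that an urgent change rejoins the normal flow at implementation.

```python
def change_flow(accepted: bool, urgent: bool, implementation_ok: bool) -> list[str]:
    """Return the ordered steps an RFC passes through (illustrative only)."""
    steps = ["RFC submitted", "CAB reviews RFC"]
    if not accepted:
        # Assumption: a rejected RFC ends the flow here.
        steps.append("RFC rejected")
        return steps
    steps.append("CAB classifies and prioritizes RFC")
    if urgent:
        steps.append("Follow urgency procedures")
    else:
        steps.append("CAB creates change plan and waits for approval")
    # Assumption: both branches continue with implementation.
    steps.append("Implement change; monitor and update CMDB")
    if implementation_ok:
        steps.append("Evaluate implementation success")
    else:
        steps.append("Initiate back-out plan")
    return steps


# Example: a non-urgent change whose implementation runs into trouble.
for step in change_flow(accepted=True, urgent=False, implementation_ok=False):
    print(step)
```

Writing the flow down this way makes it easy to see that every accepted change ends in exactly one of two places: an evaluation or a back-out.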

Table 2.1 provides high-level descriptions of the relationships between Change Management and the other ITIL processes that I pointed out in the Magic Mover scenario. These relationships will be similar for any type of change.

| ITIL Process | Relationship with Change Management |
| --- | --- |
| Availability Management | Helps to estimate the potential impact of changes and determines how a change could affect the availability of a service |
| Capacity Management | Works with Change Management to determine how a change would impact a service and the availability of resources over an extended period of time |
| Configuration Management | Controls change recording and change impact analysis and keeps track of the relationships between the CI and other CIs; ITIL Service Support guidance recommends integrating with Change Management |
| Incident Management | Requests changes to repair the impacts of incidents; also takes information from change notices to identify and repair any impacts from those changes |
| IT Service Continuity Management | Must be aware of changes that could make continuity plans unfeasible or unnecessary and updates plans accordingly |
| Problem Management | Must be aware of changes to be able to identify new errors that result in new problems; must also communicate change requests to fix errors |
| Release Management | Change Management controls rollouts of new releases |
| Service Level Management | Helps to determine the impact of changes on services and business processes; discusses change impacts with customers as appropriate |

Table 2.1: Change Management relationships.

About RFCs
RFCs come from many different sources, as represented in Figure 2.4.

[Figure 2.4 is a diagram showing the sources that feed RFCs: IT personnel, legislation, customers, project management, suppliers, and Problem Management.]

Figure 2.4: RFC originators.

RFCs can contain a wide variety of information depending upon your own unique organization, business, technologies, and so on. A few examples of the types of information to collect on RFCs include:
• Requestor's name, location, phone number, and email address
• Submission date
• RFC identification number
• Problem number
• CI to be changed
• Description of change
• Justification and business benefit for change
• Estimated resources
• Timeframes

The RFC will be recorded when submitted. From the information on the RFC, Change Management will be able to determine whether the request will be treated as a service request, treated as a change, or denied. This categorization is useful because it sorts out the service requests so that the CAB does not need to spend valuable time considering them. Change Management also makes an initial decision to deny RFCs that do not make sense or that are impractical, incomplete, or unnecessary; this saves additional time for the CAB. If the CAB accepts an RFC, it gives the RFC a priority and determines the category to put it in.
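
To make the initial sorting concrete, here is a minimal Python sketch of an RFC record and a triage step. It is an illustration under stated assumptions, not an ITIL-defined schema: the field names, the `is_standard_request` flag, and the rule that an incomplete RFC is denied are all hypothetical simplifications of the judgment Change Management actually applies.

```python
from dataclasses import dataclass
from enum import Enum


class Triage(Enum):
    SERVICE_REQUEST = "service request"
    CHANGE = "change"
    DENIED = "denied"


@dataclass
class RFC:
    # Hypothetical subset of the RFC information listed above.
    rfc_id: str
    requestor: str
    submission_date: str
    ci_to_change: str
    description: str
    justification: str
    is_standard_request: bool = False  # assumed flag set by the Service desk


# Fields that must be filled in before the CAB sees the RFC (an assumption).
REQUIRED_FIELDS = ("requestor", "ci_to_change", "description", "justification")


def triage(rfc: RFC) -> Triage:
    """Sort an RFC before it reaches the CAB."""
    if any(not getattr(rfc, name).strip() for name in REQUIRED_FIELDS):
        return Triage.DENIED  # incomplete RFCs are denied up front
    if rfc.is_standard_request:
        return Triage.SERVICE_REQUEST  # handled without CAB review
    return Triage.CHANGE  # goes on to the CAB for priority and category


rfc = RFC("RFC-0001", "Ms. Flint", "2007-01-15", "e-commerce site",
          "Add Magic Mover ordering", "New product revenue")
print(triage(rfc).value)
```

In practice the denial decision involves far more than empty fields, but even a simple completeness check like this one keeps obviously unusable requests off the CAB agenda.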

Planning
Change Management uses an FSC to keep track of when each change will occur. The FSC informs its recipients of upcoming changes, and it should contain enough information for each recipient to determine whether a change is going to affect them. The FSC allows both the IT and business areas to schedule changes appropriately. The Change Manager may need to obtain the approval of IT management for major changes before submitting an RFC to the CAB. Approval for major changes is typically necessary in three areas:
• Business approval: The areas impacted by the change may need to provide approval.
• Financial approval: The IT area may need to perform a cost/benefit analysis and budget.
• Technical approval: The IT area will need to determine the impact, necessity, and feasibility of the change.

If these approvals are obtained, the CAB will help plan significant changes and act as an advisory committee. To help facilitate effective use of time and make the most informed decisions, the Change Manager should communicate the details of the RFC to CAB members prior to the CAB meeting.

Why Do We Need to Create the CMDB?


To be successful with the ITIL Change Management process, it is important that the people, processes, and technologies work together in a coordinated manner to overcome the political roadblocks that usually inhibit cooperation between groups in the same organization. The primary goal of the CMDB is to house the CIs that exist throughout the enterprise and the relationships between them, revealing the status of each configuration at any time. The contents of the CMDB will vary from organization to organization. For example, an organization with a comparatively small IT area may be experiencing problems with application troubleshooting because the business units throughout the enterprise regularly implement new software at their own discretion. The CMDB solution it implements will need to constantly, and preferably automatically, track application configurations in excruciating detail so that the configurations before and after a problem can be quickly compared to identify root causes.
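
A before-and-after configuration comparison of this kind can be sketched very simply. The following Python example assumes, purely for illustration, that each configuration snapshot is available as a flat dictionary of setting names and values; the function name `diff_config` and the sample settings are hypothetical, and a real CMDB tool would provide its own comparison reports.

```python
def diff_config(before: dict, after: dict) -> dict:
    """Compare two configuration snapshots of the same application."""
    added = {k: after[k] for k in after.keys() - before.keys()}
    removed = {k: before[k] for k in before.keys() - after.keys()}
    changed = {
        k: (before[k], after[k])
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots taken before and after a problem was reported.
before = {"version": "4.1", "port": 8080, "cache": "on"}
after = {"version": "4.2", "port": 8080, "plugin": "reports"}

for kind, entries in diff_config(before, after).items():
    print(kind, entries)
```

Anything that appears under "added", "removed", or "changed" is a candidate root cause to investigate first.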

The situation would likely be quite different for a very large IT department with the same types of application troubleshooting problems. The resolution process would typically span the Help desk area, many different IT experts, the customer relationship area, and the related business unit managers. For a large organization, simply capturing all configuration details in the CMDB will not improve the coordination effort; too many details will confuse the different players, who each need to know only some of the details. Instead, this large organization will probably decide to share the many different infrastructure relationships across the Change Management team members. Its CMDB could then contain the most basic device configuration data, relationship information, and pointers to additional sources of more detailed information.

These two CMDB implementations are very different, but both provide benefits to the business. They allow for faster application and system problem solving and more efficient and error-free changes. By making thoughtful, enterprise-wide efforts to prioritize process improvements, taking into consideration the players involved, and identifying the data sharing and use needs throughout the enterprise, each organization will be able to determine the best items to put into its own CMDB.

The CMDB will provide a centralized enterprise repository of information that contains all the details related to the IT architecture. It will allow for a unified view of every IT component within the enterprise. This centralized, unified capability will allow all the business leaders to make better business, as well as technical, decisions. A CMDB will facilitate the capabilities for:
• Automated discovery
• Service-centric views
• Automated and out-of-the-box change processes
• Integration with other change management solutions

What Should Be in the CMDB?


The CMDB will contain data about hardware and software deployments and will allow users to quickly see and report on the details of the technical environment. It can contain data about servers, network devices, workstations, software, and any other network component. There are multiple tools and methods available to enter data into the CMDB. Most CMDB tools can populate the database and create customized reports that run at scheduled times as well as on demand. The CMDB can track and create reports about the different components within the IT architecture and maintain the current CIs. ITIL also directs that the CMDB should hold data related to CIs; some possibilities are shown in Figure 2.5.

[Figure 2.5 is a diagram of data items that can be held in the CMDB: CIs, CI relationships, assets, computer systems, problems, incidents, contracts, DSL, change requests, SLAs, service model, and Help desk tickets.]

Figure 2.5: CMDB data items.

A single, centralized, all-encompassing CMDB should have all the key information available to the entire organization to track all the CIs in the system, map dependencies of CIs, track the status of CIs, determine the history of CIs, and track requests for change for CI verification. Some of the fields you will want to consider using as keys to track the status for each CI are listed in Table 2.2.

| Key Field | Description |
| --- | --- |
| CI Identifier (ID) | The unique name used to identify the CI |
| CI Description | The description of the CI |
| CI ID Number | The unique number generated by the CMDB |
| CI Category | The category for the CI |
| Owner | The person responsible for the CI |
| Customer | The customer using the CI |
| Date Created | The date the CI was created |
| License Number | Software license number |
| Location | The physical location of the device |
| Make | Manufacturer |
| Model | Model name |
| Model Number | Model number |
| Part Number | Hardware part number |
| Relationship | How the CI is connected to other CIs; for example, Parent/Child, contained within another CI, using another CI, and so on |
| Relationship Number | CI IDs used to create the Relationship Number |
| Scheduled Maintenance | Date for the next scheduled maintenance, if applicable |
| Serial Number | Hardware serial number |
| Status | Information regarding if the CI is registered, accepted, rejected, under development, installed, and so on |
| Supplier | The vendor that supplied the component |
| Ticket Number | Ticket numbers related to this CI |
| Version Number | Software version number |

Table 2.2: Key fields within the CMDB.

These key fields will not only help to make Change Management more efficient; the data can also be used to help measure the success of Change Management activities and to automate some or all of the Change Management process.
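
To show how the Relationship field supports change impact analysis, the following Python sketch walks CI relationships to find everything that could be affected by changing one CI. It is a minimal illustration, assuming the relationships have been exported from the CMDB as a simple "depends on" mapping; the function name `impacted_cis` and the CI names are hypothetical.

```python
from collections import deque


def impacted_cis(depends_on: dict, changed_ci: str) -> set:
    """Return every CI that directly or indirectly depends on changed_ci."""
    # Invert the mapping: for each CI, which CIs depend on it?
    dependents = {}
    for ci, dependencies in depends_on.items():
        for dependency in dependencies:
            dependents.setdefault(dependency, set()).add(ci)

    # Breadth-first walk outward from the changed CI.
    impacted = set()
    queue = deque([changed_ci])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in impacted:
                impacted.add(dependent)
                queue.append(dependent)
    impacted.discard(changed_ci)  # guards against circular relationships
    return impacted


# Hypothetical relationships: each CI maps to the CIs it depends on.
relationships = {
    "web-shop": {"app-server"},
    "app-server": {"db-server"},
    "reporting": {"db-server"},
}
print(sorted(impacted_cis(relationships, "db-server")))
```

The same walk works however deep the chain of relationships goes, which is why accurate relationship data in the CMDB matters so much to the CAB.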


Why Is Automation Important?


A great benefit of the CMDB is that it can work with other IT automation tools. Many data center automation tools exist that can store information in the CMDB. Automating the CMDB lets it more efficiently achieve its goal of providing IT personnel with a centrally managed repository of IT data.

Automation Has Positive Business Impact
The Change Management process enables IT to successfully deliver results that link people, processes, and technology throughout the enterprise to most efficiently meet business demands. Automating these relationships for the collection, orchestration, and management infrastructures will accelerate ITIL processing and changes within the CMDB. When the triggers that launch changes to configurations and workflows are automated, IT personnel do not need as much training as they would if everything were done manually. Time is saved not only by automating the processes but also by reducing the training needed and by not having to correct human errors. This automation will reduce downtime and make Change Management more controlled, consistent, and efficient from both a time perspective and a resources perspective. Automation also has the added benefit of generating activity logs that in turn establish accountability for the actions that occur.

Automation Tool Features
When choosing an automation tool, look for the following features and requirements:
• Capabilities to store, track, sort, and reconcile asset data, configuration data, and operational change activity
• Change event coordination that can track, trigger, and make changes using the ITIL Change Management processes, all the way from RFC submission to implementation
• Capabilities to granularly report each configuration along with the change impact assessments for the relationships between the changes
• Compliance reporting and reconciliation actions necessary for noncompliance issues and settings
• Configuration comparisons and associated reports
• Configuration search and auto-discovery capabilities
• CI definitions listings and reports along with relationship mapping
• Diagnostics capabilities and change tracking
• Ability to enforce compliance controls
• Federated data modeling and data auditing
• Ability to be used throughout the entire IT infrastructure: all servers, applications, network devices, storage locations, and so on
• Ability to maintain baseline configurations and generate reports against those baselines at any point in time
• Event management between all enterprise IT systems
• Dependency mapping of CIs to support business impact assessment, service desk activities, and event management consoles
• Ability to generate role-defined dashboards
• Software developer kits (SDKs) and application program interfaces (APIs) for data integration
• Configurable triggers for workflows and specific events, including sending notifications and creating incidents to investigate

A federated data model enables a CMDB to provide a single source of record for CIs.

A good and effective automation product that is successfully implemented within an organization can provide the following business benefits:
• Increased ITIL adoption throughout the enterprise for all eight ITIL processes
• Accelerated ITIL implementation through automating previously manual tasks and using standardized ITIL processes, which speeds up adoption and business-impact analysis
• Standardization of tasks and workflows throughout the enterprise and at all levels for a noticeable positive impact on day-to-day processes
• Verification that the Change Management processes are being performed correctly and are effectively communicating with the other ITIL processes
• Improved communications between teams, resulting in lower operational costs
• Granular configuration tracking, reconciliation, and auditing capabilities that result in more detailed and accurate compliance reports, enabling audits to be performed more quickly using historical change tracking, point-in-time references, and remediation capabilities
• More efficient provisioning and change management of new application services based on standardized configurations that allow for faster time to production
• Improved IT service availability through better change and configuration management processes that use automated workflows and federated integration across multiple enterprise IT sources
• Reduced hardware costs through server repurposing
• A single source for provisioning and compliance tasks that produces comprehensive tracking, streamlines processes, brings cross-silo teams together, reduces human errors, and lowers operations costs
• More useful and valuable reports: real-time, accurate reports showing the current state of the environment, comparisons to baseline data, and trend analysis of change activity based on CMDB data

Avoid Common Pitfalls
Organizations often fall into common pitfalls during the implementation of ITIL Change Management automation tools:
• Lack of executive sponsorship: Executive visibility and leadership buy-in are essential for addressing political and silo concerns and for getting cooperation enterprise-wide
• Lack of integration: Lack of solution integration between the chosen vendor products and with existing third-party tools used within the enterprise
• Cross-silo configurations: Limited configuration capabilities for network components, servers, applications, storage, and so on that span multiple enterprise silos without a federated data scheme
• Inability to calculate return on investment (ROI): Limited ROI discussion and long-term projection with only a narrow perspective on short-term cost savings
• Excessive deployment and professional services costs: Planning sessions considering deployment timelines and key ROI objectives with a quarter-to-quarter perspective and focus on the role of integrators for deployment customization and data, event, and interface integrations
• Poor ITIL alignment: Overlooked opportunities to standardize processes based on ITIL-defined process workflows to increase the impact of changes on IT services
• Lack of closed-loop processes: Many vendor solutions do not have a closed-loop change management process; without a closed-loop process, you will need to spend time and resources to integrate the products, which will be a very complicated task

Whatever automation tool you use, it should seamlessly integrate all Change Management processes to avoid costly in-house time making it fit with your other systems.

Costs
It is important for you to consider the costs involved with implementing ITIL Change Management processes. These costs fall into two basic categories: people costs and technology costs.

People Costs
You likely already have personnel throughout the enterprise performing Change Management tasks. If you are not already using ITIL, they are likely performing these tasks in silos, which means effort is being duplicated. When implementing Change Management processes, you should be able to use some of the personnel who are freed up by removing that duplication. However, you will still need personnel to serve on the CAB.

Technology Costs
You will need to plan carefully the hardware and software tools you decide to use for implementing Change Management processes and ensure that they integrate with the other ITIL processes. A good, integrated technology tool may be a significant up-front investment, but if chosen and implemented correctly, it will result in long-term savings in other areas of the enterprise.

Measuring Success
Change Management can help improve business. But to demonstrate this, it is important to create statistics and metrics that clearly show the improvements. Success must be documented in terms of improvements to the business. As it has often been said, you cannot manage what you cannot measure. What kind of change management measurements and associated data can be used to measure improvements? The following are some for you to consider and build upon:
• Total changes planned (in the pipeline)
• Total changes implemented
• Number of failed changes
• Number of emergency changes
• Number of unauthorized changes
• Number of rescheduled changes
• Average process time per change
• Number of changes that resulted in incidents
• Change management tooling support level
• Change management process maturity
• Total labor hours to coordinate changes
• Total labor hours available for coordinating changes
• Total labor hours to implement changes

So where do you find this data? It can be found in such places as:
• Change management system reports
• Incident management system reports
• Labor reports
• HR reports
• Audit reports
• CMDB reports

What kind of evaluations can you make from these seemingly nondescript numbers? Now is the fun part, when you get to do some math! The following are just some of the metrics you can calculate from the data.

Change Efficiency Rate
You can determine the change efficiency rate by dividing the total changes implemented by the total changes in the pipeline. For example, if you did 20 changes this week and you had 40 to do in the pipeline, your efficiency rate is 20/40, or 50%. This will tell your management how efficient you are at handling changes. It can also be used to demonstrate how much your implementation of changes has improved with ITIL compared with when you did not use it.

Change Success Rate
You can determine the success rate and failure rate percentages for your changes by dividing the number of failed changes by the total number of changes implemented. For example, if you implemented 10 changes, but 2 of them failed, you would have a 2/10, or 20%, failure rate. Subtracting this from 100% gives you an 80% success rate.

Change Reschedule Rate
You can determine how well you implement changes on schedule by calculating the change reschedule rate. Do so by dividing the number of changes rescheduled by the number of changes you had scheduled. For example, if you had planned for 40 changes this week but rescheduled 5 of them, your reschedule rate would be 5/40, or 12.5%.

Change Incident Rate
A very useful metric to reveal how changes impacted business productivity is the change incident rate. You can calculate this by dividing the number of changes that created incidents by the total number of changes implemented. For example, if you implemented 30 changes this week and 5 of them caused incidents, your change incident rate would be 5/30, or 16.7%.

Other Useful Metrics
These examples should give you a good idea of what metrics you can use to determine your successes and challenges with Change Management processes. There are many more you can compute using the data you have gathered in a successfully implemented Change Management process. A few of these include:
• Emergency change rate
• Average process time per change
• Unauthorized change rate
• Personnel time utilization for changes
• Change Management technology tools support utilization
• Change Management process maturity

Metrics such as these will tell you, and more importantly tell your business leaders, how efficient you are at implementing Change Management process components and where improvements are needed.
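
The four rates worked through above are all the same calculation applied to different counts, so they are easy to automate. The short Python sketch below reproduces the chapter's own example numbers; the helper name `rate` and the variable names are hypothetical, and in practice the counts would come from your change management and incident management system reports.

```python
def rate(part: int, total: int) -> float:
    """Return part/total as a percentage."""
    if total <= 0:
        raise ValueError("total must be greater than zero")
    return 100.0 * part / total


# Counts taken from the worked examples in this chapter.
efficiency_rate = rate(20, 40)       # changes implemented / changes in pipeline
failure_rate = rate(2, 10)           # failed changes / changes implemented
success_rate = 100.0 - failure_rate
reschedule_rate = rate(5, 40)        # rescheduled changes / scheduled changes
incident_rate = rate(5, 30)          # changes causing incidents / changes implemented

print(f"Change efficiency rate: {efficiency_rate:.1f}%")
print(f"Change failure rate:    {failure_rate:.1f}%")
print(f"Change success rate:    {success_rate:.1f}%")
print(f"Change reschedule rate: {reschedule_rate:.1f}%")
print(f"Change incident rate:   {incident_rate:.1f}%")
```

Running the same calculation every week or month gives you a trend line, which is usually more persuasive to business leaders than any single figure.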


Summary
Implementing the ITIL Change Management process will be an evolutionary process. It will take time and investment up front. It will be a learning experience. But, when done correctly, it will make your business more efficient and make IT more valuable in the eyes of your business leaders. Change Management implementation success will take the strong and steady commitment of your executive management to get through these growing pains. Be sure you have that to get the subsequent commitment of your ITIL team members and ultimately improve your Change Management processes.


Chapter 3

Chapter 3: Effective Incident and Problem Management Through ITIL


Before embarking on a discussion of Incident Management and Problem Management, it is good to do some level setting. Three terms often used within Incident and Problem Management are incidents, problems, and errors. How are these different from each other? Let's take a look at each one separately and establish parameters for each.

Incidents
The ITIL Service Support book (http://www.itil.org.uk/support.htm) defines an Incident as "Any event which is not part of the standard operation of a service and which causes, or may cause, an interruption to, or a reduction in the quality of that service." Examples of incidents are:
• A user encounters an error when trying to access an application on the network
• Part of the WAN becomes unavailable, resulting in some users being unable to log onto the network
• Users do not get their expected messages because the email server rejects all incoming messages

Problems
The ITIL Service Support book defines a Problem as "An unknown, underlying cause of one or more incidents." A single problem may generate several incidents. Examples of problems are:
• An application update may have made the application unusable under the same settings as before the update
• A newly installed WAN component may not be working correctly
• The ISP may not have renewed the domain name correctly

Errors
The ITIL Service Support book defines an Error as "A problem for which the root cause has been identified and a workaround or permanent solution has been developed." Errors can be identified through analysis of user complaints or by vendors and development staff prior to production implementation. Examples of errors include:
• The network settings for the desktop or server may have been misconfigured
• A network-monitoring tool may incorrectly flag a WAN circuit as being busy
• The spam filter on the email server may have been configured incorrectly


Relationship Between Incident and Problem Management


There is a close relationship between incidents, problems, and errors:
• Incidents often indicate problems
• Problem investigation often leads to the identification of errors
• Errors that are unresolved can cause incidents and problems

To demonstrate this relationship, consider a common scenario within IT shops. The Service desk receives a call from an end user who got an error message when trying to log into the network. The Service desk logs the report to the incident database. An automated trend analysis determines whether this same incident has been reported, taking into consideration the time, date, and other related details about the incident. The resulting trend analysis is sent to the Problem Management system where commonalities between this and the other reported incidents can be identified. Common failures and configuration items (CIs) are identified and matched with known errors. The Problem Management system will provide a workaround or a temporary fix so that the user can get logged into the network as soon as possible. In the meantime, a request for change (RFC) may be generated to resolve the error. If the number of incidents continues to increase, the priority for implementing the RFC will become higher. When the change is implemented, the Known Errors Database will be updated to indicate the error has been resolved. Figure 3.1 shows the relationships between incidents, problems, and errors.

Figure 3.1: Relationships between incidents, problems, and errors.

Because of these close relationships, it is intuitive to discuss all three together.
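
The matching step in the scenario above can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the `KnownError` fields, the substring match on the symptom, and the rule that the RFC priority rises one level for every ten related incidents are all hypothetical, chosen only to show how incidents, known errors, and RFC priority connect.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KnownError:
    # Hypothetical shape of a Known Errors Database entry.
    error_id: str
    ci: str
    symptom: str
    workaround: str
    resolved: bool = False


def match_known_error(ci: str, symptom: str,
                      known_errors: list) -> Optional[KnownError]:
    """Return the first unresolved known error for this CI and symptom."""
    for error in known_errors:
        if (not error.resolved and error.ci == ci
                and error.symptom.lower() in symptom.lower()):
            return error
    return None


def rfc_priority(incident_count: int, base_priority: int = 3) -> int:
    """Priority 1 is highest; assumed to rise one level per 10 incidents."""
    return max(1, base_priority - incident_count // 10)


known = [KnownError("KE-7", "login-server", "login failed",
                    "Log in through the backup server")]
hit = match_known_error("login-server", "User reports: Login failed at 9:05", known)
print(hit.workaround if hit else "No match; pass to investigation and diagnosis")
print("RFC priority after 25 incidents:", rfc_priority(25))
```

Once the change is implemented, setting `resolved` to True on the entry mirrors the update to the Known Errors Database described in the scenario.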


Why Is Incident Management Important?


More incidents are reported daily. As long as technology continues to evolve, more errors will be created and incidents will continue to occur.
Incidents impact all levels and parts of an organization. An organization that has done no pre-planning will feel the impact of incidents much more significantly than one that is prepared to deal with them.

Incident Management is inherently reactive. With regard to IT incidents, the goal of Incident Management is to reduce or eliminate the effects of actual or possible troubles in IT services to ensure users can get back to work, and the business can get back to being productive, as soon as possible. Incident Management has a short-term focus on restoring service.
Incident Management activities include:
• Incident detection and recording
• Classification and initial support
• Investigation and diagnosis
• Resolution and recovery
• Closure
• Incident ownership, monitoring, tracking, and communication

To most effectively address incidents, they need to be recorded and classified and the resolution for each assigned to the appropriate, qualified personnel. Incident resolution must be monitored consistently and closely to ensure incidents have been completely addressed.


The Incident Management Process


There are seven or eight steps within the Incident Management process, depending upon whether the incident involves a Service Request. Figure 3.2 demonstrates the Incident Management process.
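[Figure 3.2 is a flowchart. An incident is reported and then passes through incident recording and then classification and initial support. If it is a Service Request, the Service Request procedure is followed. If it is not, the incident goes to matching; a match leads to resolution and recovery, and no match leads first to investigation and diagnosis. Once the incident is resolved, it moves to incident closure. Tracking, progress monitoring, and escalation as necessary run alongside the steps.]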

Figure 3.2: The Incident Management process.
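
The branching in this process can be written out as a short Python sketch. It is illustrative only: the function name `incident_path` is hypothetical, and I assume that an unresolved incident returns to investigation and diagnosis and that a Service Request proceeds to closure once its own procedure is complete, since the figure does not spell out either path in text.

```python
def incident_path(is_service_request: bool, matches_known_error: bool,
                  attempts_needed: int = 1) -> list[str]:
    """Return the ordered steps one incident passes through (illustrative)."""
    if attempts_needed < 1:
        raise ValueError("attempts_needed must be at least 1")
    path = ["detect and record", "classify and give initial support"]
    if is_service_request:
        path.append("service request procedure")
    else:
        path.append("match against known errors")
        if not matches_known_error:
            path.append("investigate and diagnose")
        for attempt in range(1, attempts_needed + 1):
            path.append("resolve and recover")
            if attempt < attempts_needed:
                # Assumption: an unresolved incident loops back here.
                path.append("investigate and diagnose")
    path.append("close incident")
    return path


# Example: no known error matches, and the first fix does not hold.
for step in incident_path(False, False, attempts_needed=2):
    print(step)
```

Tracking, progress monitoring, and escalation are not shown as steps because they accompany the whole path rather than occur at one point in it.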

According to the Office of Government Commerce (OGC) Best Management Practice (http://www.best-management-practice.com/gempdf/ITIL_Glossary_V3_1_24.pdf), a Service Request is defined as "A request from a User for information, or advice, or for a Standard Change or for Access to an IT Service. For example to reset a password, or to provide standard IT Services for a new User. Service Requests are usually handled by a Service Desk, and do not require an RFC to be submitted."

Incident Reporting

Incidents can be reported from any part of the enterprise as well as from a number of sources outside the organization. Following a well-thought-out, repeatable process will not only make incident responses more efficient, it will help to prevent similar incidents from recurring. When an incident is reported, it is important to record its details as soon as possible. If you jump headfirst into incident response thinking you can always come back later and record the details, it is likely that documentation will never occur. It is also important for successful resolution of the incident that significant details continue to be recorded so that progress can be accurately monitored. This documentation will also assist with addressing other incidents; learn from your experiences!
If you fail to record the incident details, you will not be able to monitor compliance with SLA levels.

An important note about incident reporting is that no incident should be recorded in the system more than once. Duplicate records will skew the incident reports and make your key performance indicator (KPI) metrics inaccurate. A KPI is a valuable metric that indicates the performance level, or success, of a particular operation or process. Management can use KPIs to make better decisions about IT processes and systems.
The OGC Best Management Practice (http://www.best-management-practice.com/gempdf/ITIL_Glossary_V3_1_24.pdf) defines a KPI as "A Metric that is used to help manage a Process, IT Service or Activity. Many Metrics may be measured, but only the most important of these are defined as KPIs and used to actively manage and report on the Process, IT Service or Activity. KPIs should be selected to ensure that Efficiency, Effectiveness, and Cost Effectiveness are all managed."

Classification and Initial Support

Classification of incidents is often overlooked in typical incident response plans. Classification allows the incident to be categorized and assists with monitoring and reporting. To create your classifications, use the following parameters:
Category: This will include information about the origin of the incident or the support group involved. Examples include such things as processor, network, workstation, organization, procedure, Service Request, and so on.
Priority: This will determine how quickly the incident should be addressed.
Service: This will provide information about the services involved with the incident as covered within the associated SLA.
Support group: This is the group that will assist with incident resolution if the Service Desk cannot resolve it.
Timelines: This will indicate the estimated time it will take to resolve the incident along with planned update times.
Incident reference number: Assign a number both to make the incident data easier to find within your Incident Management system and to give the incident a common reference.
Status: Update the status to show where you are within the incident resolution process.
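The classification parameters above map naturally onto a single record per incident. The following is a minimal sketch in Python, not a prescribed ITIL schema: the class name, field names, and sample values (such as the reference number INC-0001) are illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class IncidentRecord:
    """One classified incident; field names are illustrative assumptions."""
    reference_number: str            # unique ID used to find and cite the incident
    category: str                    # origin or support group, e.g. "network"
    priority: int                    # how quickly the incident should be addressed
    service: str                     # service involved, as covered by the SLA
    support_group: str               # group that assists if the Service Desk cannot resolve it
    reported_at: datetime            # when the incident was recorded
    estimated_resolution: timedelta  # estimated time to resolve (the "timeline")
    status: str = "new"              # position within the resolution workflow

    def target_resolution_time(self) -> datetime:
        """Return the point in time by which resolution is expected."""
        return self.reported_at + self.estimated_resolution


incident = IncidentRecord(
    reference_number="INC-0001",
    category="network",
    priority=2,
    service="email",
    support_group="network support",
    reported_at=datetime(2024, 1, 8, 9, 0),
    estimated_resolution=timedelta(hours=4),
)
print(incident.status, incident.target_resolution_time())
```

Keeping every parameter on one record means monitoring and reporting can work from a single source instead of piecing classification data together later.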

Matching

After the incident is classified and all associated data recorded, check to determine whether this type of incident has occurred before. If so, you can streamline the incident response time by seeing what the solution or workaround was for the previous incident and possibly use the same one, depending upon the symptoms or causal problems and/or errors.

Investigation and Diagnosis

If the Service Desk passes an incident on to a support group, the group will investigate the incident and perform diagnosis to provide resolution. If the initial group cannot resolve the incident within the targeted timeframe, they will pass it on to another support group. This will continue until the incident is resolved.
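The matching step described above can be sketched as a lookup against previously resolved incidents. This is only an illustration under stated assumptions: the PastIncident class, the find_match function, the rule of requiring at least two shared symptoms, and the sample history are all hypothetical, not part of ITIL or of any particular tool.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class PastIncident:
    """A previously resolved incident (hypothetical structure)."""
    reference_number: str
    category: str
    symptoms: FrozenSet[str]
    workaround: str


def find_match(category: str, symptoms: Iterable[str],
               history: List[PastIncident],
               min_shared: int = 2) -> Optional[PastIncident]:
    """Return the past incident in the same category that shares the most
    symptoms, or None if nothing shares at least min_shared symptoms."""
    observed = set(symptoms)
    best: Optional[PastIncident] = None
    best_shared = 0
    for past in history:
        if past.category != category:
            continue
        shared = len(past.symptoms & observed)
        if shared >= min_shared and shared > best_shared:
            best, best_shared = past, shared
    return best


history = [
    PastIncident("INC-0007", "network",
                 frozenset({"timeout", "packet loss"}),
                 "Divert traffic to the backup router"),
    PastIncident("INC-0012", "workstation",
                 frozenset({"blank screen", "no power"}),
                 "Replace the power supply"),
]

match = find_match("network", ["timeout", "packet loss", "slow login"], history)
no_match = find_match("network", ["printer jam"], history)
print(match.workaround if match else "No match: pass to investigation and diagnosis")
```

When no match is found, the incident moves on to investigation and diagnosis, as the text describes.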

Resolution and Recovery

When the incident has been successfully solved, the support group will record all the details about the resolution into the system. If a change must occur to prevent a similar incident from recurring, a request for change (RFC) will be submitted into the Change Management process.
It is possible that you may have an incident that does not get resolved. In this hopefully rare situation, the incident will remain open.

Incident Closure

The support group will send notice to the Service Desk that the incident has been resolved. The Service Desk will then check with the person who reported the incident and ask him or her to check the related application or system to ensure that, from his or her point of view and experience, the incident truly has been addressed correctly. The incident record should be updated to indicate what final category the incident is now in, along with the SLA-related metrics. Throughout the Incident Management process, the Service Desk is responsible for monitoring progress and for updating users and customers about incident resolution status and escalation to other support groups.

Incident Management Benefits


If a well-defined Incident Management process does not exist, there is no clear accountability or responsibility for monitoring and appropriately responding to incidents. With a lack of planning and responsibility, incidents that could have been quickly resolved with a formal structure in place can become unnecessarily expansive and severely damage business by reducing service levels and leaving customers confused because they don't know what to do. In these situations, you often either have many people working on the incident, causing duplicated and often conflicting activities to occur, or you have no one addressing significant issues associated with the incident, prolonging incident resolution to an unacceptable time period. The resulting costs of the incident, not only to IT but to all business customers, will be much higher than they would have been if an Incident Management process were in place.

Having a well-defined, documented, and implemented Incident Management process in place not only benefits the IT areas; it benefits all areas of business. The business benefits by realizing:
More efficient, effective, and expedient incident resolution, which reduces the negative business impact of incidents
More productive personnel, as a result of less downtime from incidents
Incident monitoring that is performed independently and is customer-focused
Availability of SLA business management information
SLA compliance

The IT area benefits will include:
More efficient and effective use of personnel time
Documented tracking of incidents and service requests, with lessened likelihood of losing or incorrectly documenting incident information
A more accurate CMDB, kept updated and audited as incident data is recorded and mapped to CIs
The ability to improve monitoring of and measurement for meeting SLA requirements
Better management of SLA reporting and service quality
Happier customers, because of more effective response to incidents and less downtime

Incident Management Inputs, Outputs, and Relationships


The inputs for Incident Management are pretty clear-cut: incident data. Incidents can occur within any level or part of the enterprise. Although incidents are commonly reported by end users, the incident notification can come from a wide range of sources:
End users
Business leaders
IT leaders
External customers
Business partners
Automated tools

To make incident response as effective and efficient as possible, there should be a basic core of information consistently collected about each incident. These data items will determine the classification of the incident and will contribute to determining its urgency and the speed with which it should be addressed. The data items will also support how the incident is monitored and provide information for the incident report.

Table 3.1 provides the items that should be collected when an incident is reported; these are the details that are input to the Incident Management process.
Input Item: Description
Category: Each incident should be assigned to a category and subcategory to correspond to the incident origin and support group. The following are examples of categories that can be used: Central processing (application, system, mainframe); Network (IP address, segment, router, hub); Organization and Procedures (communication, order, request); Service Request (from the Service Desk); Use and Functionality (availability, backup, capacity, service); Workstation (keyboard, monitor, CPU, storage drive).
Priority: Each incident needs to be assigned a priority to help the support groups understand which incidents need to be addressed immediately versus those that can be addressed at a later time. Priority is often computed by taking a number assigned to Urgency multiplied by a number assigned to Impact. For example, if the Urgency is 1 and the Impact is 2, the Priority is 1 × 2, or 2. Another incident may have an Urgency of 3 and an Impact of 1, so the Priority would be 3 × 1, or 3.
Service: This is a list to identify the services related to the incident. These should reference the applicable SLA requirements. Included within this list will be the escalation times for the services required by the SLA.
Support Group: If the Service Desk doesn't resolve the incident within the SLA time requirements, a support group may be called on to address the incident.
Timelines: The consideration of the SLA requirements with the priority will be used to determine the timelines for incident resolution. These need to be recorded.
Incident reference number: Each incident is assigned a reference number for easy and future reference.
Status: The status, also referenced as workflow position, indicates where progress is within the workflow. Status labels could include such terms as new, accepted, planned, assigned, active, suspended, resolved, closed, and so on.

Table 3.1: Incident Management inputs.
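The Priority calculation in Table 3.1 (Urgency multiplied by Impact) can be sketched in a few lines of Python. The function name and the incident labels are hypothetical. The table does not say which end of the scale is most pressing, so the ordering below assumes that a lower number means a more pressing incident; adjust the sort if your scale runs the other way.

```python
def compute_priority(urgency: int, impact: int) -> int:
    """Priority = Urgency x Impact, as described in Table 3.1."""
    if urgency < 1 or impact < 1:
        raise ValueError("urgency and impact must be positive integers")
    return urgency * impact


# The two worked examples from Table 3.1: (label, urgency, impact).
incidents = [("incident B", 3, 1), ("incident A", 1, 2)]

# ASSUMPTION: lower priority numbers are handled first.
ranked = sorted(incidents, key=lambda item: compute_priority(item[1], item[2]))

for label, urgency, impact in ranked:
    print(label, compute_priority(urgency, impact))
```

With the values from the table, the first incident scores 1 × 2 = 2 and the second scores 3 × 1 = 3.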

The escalation of an incident from the Service Desk to a support group is often described as functional escalation.


Outputs

Figure 3.3 illustrates the inputs and outputs for the Incident Management process.
[Figure 3.3 diagram. Inputs: incidents from the Service Desk, Computer Operations, Procedures, Networking, and other sources of incidents. Incident Management activities: Detection & Recording; Classification & First-line Support; Matching; Investigation & Diagnosis; Resolution & Recovery; Incident closure; Incident Ownership, Monitoring, Tracking & Communication. Outputs: resolutions & workarounds, reports, incident data, RFCs, configuration details, and routing & monitoring, flowing to the Service Desk, Computer Operations, Procedures, Networking, the CMDB, Service Requests, Availability Management, Problem Management, Change Management, Capacity Management, and Service Level Management.]

Figure 3.3: Incident Management inputs and outputs.

Relationships

Incident Management has relationships with most of the other ITIL processes. It is important for the success of not only Incident Management but of enterprise-wide ITIL processes that these relationships are appropriately managed. Figure 3.4 illustrates these relationships at a high level.


[Figure 3.4 diagram: Incident Management at the center, linked to Change Management, Capacity Management, Configuration Management, Availability Management, Problem Management, and Service Level Management. Labels on the links: RFC, Configuration Details, Resolution, Reports, Incident Data, SLA Parameters, Work Arounds.]

Figure 3.4: Incident Management relationships with other ITIL processes.

As this figure shows, Incident Management activities impact other ITIL processes in one way or another. It is important for effective communications channels to exist to communicate key activities. Table 3.2 provides the high-level descriptions about the relationships between Incident Management and the other ITIL processes.


ITIL Process: Relationship with Incident Management
Availability Management: Availability Management uses incident data and records in conjunction with status monitoring data from Configuration Management. Based upon the information, a service can be assigned a status, just like a CI in the CMDB. Information provided by Availability Management records can be used to determine the availability of a service and the response time of the service provider.
Capacity Management: Capacity Management uses information about incidents that are associated with capacity (for example, incidents resulting from lack of storage space, unacceptably slow response times, and so on). These events can send a notice to the Incident Management process via systems managers, business managers, or automated tools.
Configuration Management: The CMDB defines the relationships between resources, services, users, and Service Levels. Because Configuration Management defines the position responsible for each infrastructure component, incidents related to specific components can be most efficiently addressed. The CMDB can also be used to develop workarounds, such as diverting traffic to a different email server or temporarily placing a defined user group on a different print server.
Problem Management: Problem Management provides requirements for the quality of incident documentation and records that assist with determining the causal errors. It provides information about problems, known errors, temporary fixes, and workarounds.
Change Management: How are many incidents resolved? By making changes, such as replacing faulty network components or modifying parameters. Change Management provides information about scheduled changes, change status, and so on that Incident Management needs to determine appropriate actions. Additionally, changes can cause incidents. When this happens, Incident Management will send information and data to Change Management about the incidents.
Service Level Management: Service Level Management is involved with monitoring the customer agreements to ensure support provided meets customer expectations. Incident Management must understand the SLA to ensure this information is considered and used when communicating with users about incidents. Incident reports can also reveal whether service levels are provided accordingly.

Table 3.2: Incident Management relationships.


Measuring Incident Management Success


Incident management metrics can help improve business. To demonstrate this, it is important to create statistics and metrics to clearly show the improvements. Success must be documented in terms of improvements to the business. Yes, the mantra still applies; you cannot manage what you cannot measure. What kind of incident management measurements and associated data can be used to measure improvements? The following list highlights the common incident management metrics typically available for you to consider and build upon:
Total number of incidents reported
Total number of unique incidents
Total number of Severity 1 incidents
Total number of Severity 2 incidents
Total time to resolve Severity 1 incidents
Average time to resolve each Severity 1 incident
Total time to resolve Severity 2 incidents
Average time to resolve each Severity 2 incident
Number of incidents resolved within SLA parameters
Total number of High Severity incidents
Total number of incidents with customer impacts
Number of incidents reopened
Total non-Service Desk labor hours available to work on incidents
Total non-Service Desk labor hours used resolving incidents
Incident Management tools support level
Incident Management process maturity

So where do you find this data? It can be found in such places as:
Incident management system reports
Labor reports
HR reports
Process assessment
Audit report findings
Tool assessment results

What kind of evaluations can you make from these seemingly nondescript numbers? What are your KPIs? Some of these numbers stand on their own to provide meaningful KPIs, such as:
Total number of incidents reported
Total number of unique incidents
Total number of Severity 1 incidents
Total number of Severity 2 incidents
Total time to resolve Severity 1 incidents
Total time to resolve Severity 2 incidents
Number of incidents resolved within service level agreement parameters
Total number of High Severity incidents
Total number of incidents with customer impacts
Total non-Service Desk labor hours available to work on incidents
Total non-Service Desk labor hours used resolving incidents

However, you can do a little math and determine additional useful KPIs. The following are just some of the metrics you can calculate from the data.

Incident Resolution Efficiency Rate

You can determine the incident resolution efficiency rate by dividing the total number of incidents resolved within SLA parameters by the total number of incidents reported. For example, if there were 15 incidents reported this week and 12 of them were resolved within the SLA parameters, your resolution efficiency rate is 12/15 or 80%. This will tell your management how successful you are at resolving incidents in alignment with business requirements. The lower your efficiency rate goes, the more evidence you have that you do not have the resources or tools necessary to appropriately resolve incidents or that your SLA parameters are not realistic.

Customer Incident Impact Rate

You can determine the impact of incidents upon customers by dividing the total number of incidents with customer impact by the total number of incidents reported. For example, if 15 of 20 incidents reported during the week noticeably and measurably impacted customers (making services unavailable, damaging business files customers depend upon, and so on), you would have a 15/20 or 75% customer incident impact rate. This metric will tell you how successful you are at keeping incidents from impacting your customers and can point to where stronger controls are necessary, where systems need to be adjusted, and so on.


Incident Reopen Rate

You can determine the incident reopen rate by dividing the total number of incidents reopened by the total number of incidents reported. For example, if you had 5 incidents reopened during the week, and the total number of incidents reported was 20, your incident reopen rate would be 5/20 or 25%. This metric will tell you how successful you are at permanently resolving incidents. If your incident reopen rate is high, you need to look at your incident response procedures and tools and make changes to lower the rate.

Incident Labor Utilization Rate

A very useful metric to reveal how incidents impact business productivity is the incident labor utilization rate. This metric will tell you how much available labor was used handling incidents. You can calculate this by dividing the total non-Service Desk labor hours used to resolve incidents by the total non-Service Desk labor hours available to resolve incidents. For example, if 55 labor hours were used during the week to resolve incidents, and you had 50 hours available to work on incidents, you would have an incident labor utilization rate of 55/50 or 110%. You were over-utilized this week in working on incidents. You should keep your eye on this number to determine whether you are consistently or often over-utilized. This will help you to decide whether you should add personnel who have responsibilities for handling incidents.

Metrics such as these will tell you, and more importantly tell your business leaders, how efficient your Incident Management process components are and where improvements are needed.
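The four calculated KPIs above are all simple ratios, so they can be sketched with one helper. The function and variable names here are illustrative assumptions; the input figures are the worked examples from the text.

```python
def rate_percent(numerator: float, denominator: float) -> float:
    """Return numerator / denominator expressed as a percentage."""
    if denominator <= 0:
        raise ValueError("denominator must be greater than zero")
    return 100.0 * numerator / denominator


# Incident Resolution Efficiency Rate: resolved within SLA / reported.
resolution_efficiency = rate_percent(12, 15)

# Customer Incident Impact Rate: incidents with customer impact / reported.
customer_impact = rate_percent(15, 20)

# Incident Reopen Rate: incidents reopened / reported.
reopen = rate_percent(5, 20)

# Incident Labor Utilization Rate: non-Service Desk hours used / hours available.
labor_utilization = rate_percent(55, 50)

for name, value in [
    ("Resolution efficiency", resolution_efficiency),
    ("Customer impact", customer_impact),
    ("Reopen", reopen),
    ("Labor utilization", labor_utilization),
]:
    print(f"{name}: {value:.0f}%")
```

A labor utilization value above 100% corresponds to the over-utilized week described in the text.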

Why Is Problem Management Important?


The Problem Management process includes the activities taken to minimize the adverse impacts of problems upon the business that were caused by errors within the IT infrastructure and to prevent recurrence of incidents related to these errors. Problem Management strives to get to the root cause of problems, identifies workarounds or permanent fixes, and eliminates errors.
Problem Management activities include:
Problem control
Error control
Proactive problem prevention
Providing information

Whereas Incident Management is reactive, Problem Management is primarily proactive by taking actions to determine the reasons why there was a failure in the provision of IT services. However, there are some significant reactive actions within Problem Management, such as identifying the cause of previous incidents and providing recommendations for removing those causes. Problem Management is basically an investigative process whereas Incident Management is basically a resolution process.
Many errors may be the cause of a problem. Many problems may be the result of one error.

Problem Management seeks to identify the cause or causes of a problem. The determination of the cause becomes a known error. An RFC can then be submitted to eliminate the known error along with the associated problem or problems.

The Problem Management Process


There are four basic activities involved with the Problem Management process:
Problem control
Error control
Proactive problem management
Information generation

Problem Control

Problem control activities seek to identify problems and determine the root cause of the problems. Once the causes are known, the problems can be turned into known errors that are associated with the base cause of the problem and an associated workaround. Any incident could have associated problems if the cause of the incident is not known. The first step in problem control is identifying the existence of a problem along with recording significant details about the problem. The problem should then be classified according to the following:
Appropriate category, such as hardware or software
Impact upon the business and associated business process and applications
Priority based upon consideration of urgency, impact, risk, and the resources necessary to resolve the problem
Status of the problem
Urgency of finding a solution

The classification of a problem may change throughout the process of resolving the problem. For example, implementing a temporary fix or using a workaround may lessen the urgency and impact. In addition to classification, an impact analysis should be performed to determine how serious the problem is and what potential and actual effects the problem has on IT services. This impact analysis will become the basis to mitigate and manage the risk. Based upon the results of the impact analysis, a priority is assigned to the problem and then the appropriate personnel and resources can be assigned to resolve the problem.

The problem is now ready to be investigated and diagnosed. Investigation and diagnosis will typically need to be repeated multiple times. Each time you will get closer to resolution. Too many IT practitioners believe that resolution should or can occur quickly, but with this attitude, you will be setting yourself up for failure and frustration. Investigation often includes trying to reproduce the problem within an isolated environment. This is a very good tactic. Don't be afraid to call in specialists from the support group to help.

If an acceptable workaround can be established after the cause of the problem is discovered, and the CIs responsible are identified, a relationship between the incident and CIs will allow for a known error to be defined. If an RFC must be submitted to apply a temporary fix, the RFC process must be followed.

Error Control

The error control activities involve monitoring and managing all known errors from the time they are identified until they are resolved. Many areas throughout the enterprise may be involved with error control. When the cause of the problem has been determined and the corresponding CIs identified, the problem can be linked to a known error, which launches the error control process. At this point, data is sent to the Incident Management process to use within any open incidents. An existing workaround for the known error can also be used to assist with incident resolution.

The team working within the Problem Management process will determine what needs to be done to resolve the problem if the errors are known. The team members should compare the possible solutions and choose the one that is the best fit with the associated SLAs, costs, impacts, and urgency. When the decision has been made regarding the best solution for resolving the problem, an RFC can be submitted to Change Management. Although most problems and failures are identified in the production environment, it is important to keep in mind that test and development environments can also have failures and known errors.

When the changes to fix the error have been implemented, a Post Implementation Review (PIR) should be done before closing the problem. Incident Management should be sent the results of the PIR so that they can close the applicable incidents. Throughout the error control activities, there is constant tracking and monitoring to stay abreast of problem and error resolution. Tracking and monitoring will help determine whether the business impact and/or urgency changes, whether the priority changes, and whether the RFC has been successfully implemented and addresses the problem or error.

Proactive Problem Management

The actions that occur within proactive problem management, which basically means actions taken to prevent problems, ensure the quality of the services and underlying infrastructure. Trend analysis occurs along with actions to identify weaknesses. Proactive problem management can have a huge impact on the business by identifying, investigating, and addressing weaknesses throughout the infrastructure components before they result in incidents.

Information Generation

Throughout Problem Management processes, information is generated and shared. The closest relationship is with Incident Management, to which information is passed concerning workarounds and temporary fixes. Information is also obtained from the CMDB to determine the other entities that need to receive information about the problem resolution. The SLA is also used to see what additional entities need to receive information. Figure 3.5 demonstrates the Problem Management process.
[Figure 3.5 diagram. Problem tracking and monitoring spans: Problem identification and recording; Problem classification; Problem investigation and diagnosis; RFC and problem resolution and closure. Error tracking and monitoring spans: Error identification and recording; Error assessment; Record error resolution; Close error and associated problems.]

Figure 3.5: The Problem Management process.
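The trend analysis mentioned under proactive problem management can be sketched as a count of incidents per CI and category, flagging combinations that keep recurring. This is a minimal illustration only: the function name, the threshold of three incidents, and the sample log entries are assumptions, and a real analysis would draw on the incident records and the CMDB.

```python
from collections import Counter
from typing import List, Tuple

Key = Tuple[str, str]  # (configuration item, incident category)


def recurring_weak_points(incident_log: List[Key],
                          threshold: int = 3) -> List[Tuple[Key, int]]:
    """Return (CI, category) pairs seen at least `threshold` times,
    most frequent first."""
    counts = Counter(incident_log)
    return [(key, n) for key, n in counts.most_common() if n >= threshold]


# Hypothetical incident history: one (CI, category) pair per incident.
incident_log = [
    ("web-server-01", "capacity"),
    ("mail-server-01", "network"),
    ("web-server-01", "capacity"),
    ("print-server-02", "workstation"),
    ("web-server-01", "capacity"),
    ("mail-server-01", "network"),
]

weak_points = recurring_weak_points(incident_log)
for (ci, category), count in weak_points:
    print(f"{ci} / {category}: {count} incidents, candidate for a problem record")
```

Pairs that cross the threshold are candidates for investigation before they produce further incidents.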


Problem Management Benefits


The objective of Problem Management is to identify and eliminate the causes of incidents so that actions can be taken to prevent them from happening again. That alone would seem to be a compelling benefit for implementing Problem Management processes. However, in case this does not sway you, the following list highlights a few more benefits:
Improves the quality of IT services by taking actions to reduce the number of incidents and thus reduce the IT workload.
Improves the documentation related to problems, errors, and incidents.
Improves user productivity by addressing and removing errors and problems, which gives users more time to actively perform business-related activities.
Improves support team productivity by providing documentation the team can reference to resolve incidents on an ongoing basis more efficiently, economically, and quickly.
Raises the stature and reputation of IT services. When IT services become more stable and systems availability increases, customers will be more willing to entrust business activities to IT areas.
Documentation can be used to perform trend analysis that can result in implementing procedures and tools to prevent incidents. Documentation is also available and useful for investigations and in preparing RFCs.
Establishes a standard for consistent and thorough incident and problem recording, classification, and reporting.
Provides details on workarounds and temporary fixes, making first-line support personnel more likely to resolve incidents.

Inputs, Outputs, and Relationships


Six other ITIL processes provide input to the Problem Management process:
Incident Management provides incident record data used by Problem Management to identify problems.
Change Management provides PIR results about associated incidents, problems, and errors.
Configuration Management provides information critical for resolving problems, such as infrastructure details, software and hardware configurations, services, architecture blueprints, and so on.
Availability Management provides availability design, planning, and monitoring data.
Capacity Management provides data about storage, bandwidth settings, and other details useful for troubleshooting.
Service Level Management provides SLA data along with other quality data.

Outputs

Problem Management provides output to two other ITIL processes:
Change Management receives RFCs to help resolve problems.
Incident Management receives matching information to determine whether a problem has been associated with other incidents.

Figure 3.6 illustrates the inputs and outputs for the Problem Management process.
[Figure 3.6 diagram. Inputs: information from Incident Management, Capacity Management, Configuration Management, Service Level Management, and Availability Management, plus the PIR from Change Management. Problem Management activities: Problem Control, Error Control, Proactive Problem Management. Outputs: RFCs to Change Management; matching information, workarounds, and quick fixes to Incident Management.]

Figure 3.6: Problem Management inputs and outputs.

Relationships

Problem Management has relationships with six other ITIL processes. It is important for the success of not only Problem Management but of enterprise-wide ITIL processes that these relationships are appropriately managed. Figure 3.7 illustrates these relationships at a high level.


[Figure 3.7 diagram: Problem Management at the center, linked to Change Management, Capacity Management, Configuration Management, Availability Management, Service Level Management, and Incident Management. Labels on the links: PIR, Capacity Data, RFC, Configuration Data, SLA Data, Availability Data, Incident Data, and Matching Information, Workarounds, & quick fixes.]

Figure 3.7: Problem Management relationships with other ITIL processes.

Table 3.3 provides high-level descriptions about the relationships between Problem Management and the other ITIL processes.
ITIL Process: Relationship with Problem Management
Incident Management: Incident Management provides incident record data used by Problem Management to identify problems. Incident Management receives matching information to determine whether this problem has been associated with other incidents.
Change Management: Change Management provides PIR results about associated incidents, problems, and errors. Change Management receives RFCs to help resolve problems.
Configuration Management: Configuration Management provides information critical for resolving problems, such as infrastructure details, software and hardware configurations, services, architecture blueprints, and so on.
Availability Management: Availability Management provides availability design, planning, and monitoring data.
Capacity Management: Capacity Management provides data about storage, bandwidth settings, and other details useful for troubleshooting.
Service Level Management: Service Level Management provides SLA data along with other quality data.

Table 3.3: Problem Management relationships.


Putting Incident Management and Problem Management into Action


Let's revisit our ACME Super Duper Supplies business from previous chapters and step through an example to see how all these Incident Management and Problem Management processes are related. ACME Super Duper Supplies recently implemented a new ecommerce Web site that allows for online merchandise ordering and payments for their new product, Magic Mover. This was a significant change in their IT infrastructure in addition to having a major impact on their business. Much money was invested in this change, so there are great expectations for a large financial return.

The new product was popular right away, and the new Web site got more hits from day to day. This was great news to the business unit manufacturing the product. However, as sales were ramping up to a very profitable level, the Web site crashed and Web site customers received only error messages when trying to get to the site.

The Service Desk received an automated notice that the Web site was down. They contacted Ms. Flint, the manager of the Magic Mover business unit, and notified her of the incident. They also performed first-line support to determine whether they could resolve the incident. They were not able to get the site back up within their goal timeline, so they passed the incident on to the second-line support group. The second-line support group performed a change impact analysis and were able to apply a temporary fix and get the Web site back up and available for business. The support group entered the incident data into the Incident Management system, from which data was sent to the Problem Management system.

The Problem Management team received the information and investigated by performing deep change impact analysis to identify the root cause of the problem. They found that a known error with capacity settings triggered the Web site outage. The Problem Management team submitted an RFC to modify the capacity settings to eliminate the error and prevent the incident from recurring. A PIR was performed to determine whether the changes truly resolved the problem. Figure 3.8 shows how the Incident Management and Problem Management processes flow to address the Magic Mover Web site incident.

[Figure 3.8: two-panel flowchart. The top panel, "Incident Management for the Magic Mover website," runs from the incident alert through Service Desk recording, classification and initial support, matching, support group investigation and diagnosis, and resolution and recovery (with incident data sent to Problem Management) to incident closure. The bottom panel, "Problem Management for the Magic Mover website," runs from receipt of the incident data through problem identification and recording, classification, investigation and diagnosis, error identification and assessment, the RFC for the capacity setting change (via Change Management), and the Post-Implementation Review (PIR) to closure of the error and associated problems.]

Figure 3.8: Magic Mover Incident Management and Problem Management process flow.


Costs
It is important for you to consider the costs involved with implementing ITIL Incident Management and Problem Management processes. These costs will generally fall into two categories: people costs and technology costs.

People Costs
You likely already have personnel throughout the enterprise performing Incident Management and Problem Management tasks, but in an ad hoc or otherwise uncoordinated way. If you are not already using ITIL, they are probably working in silos, meaning they are repeating tasks, leaving out important tasks, or performing conflicting tasks. When implementing Incident Management and Problem Management processes, you should be able to use some of these same personnel, who will be freed up, for the implementation. Personnel costs will include such things as the time of the support group members when they are actively resolving incidents, as well as any training they need to receive. There are also personnel costs in maintaining and upgrading the associated Incident Management and Problem Management systems and tools. A typically significant cost is the up-front time necessary to plan, define, communicate, and implement the Incident Management and Problem Management processes.

Technology Costs
Technology costs will include such things as tools to support the Incident Management and Problem Management processes, possibly hiring outside consultants or technicians to assist in implementing the tools, storage space for incident data, and any training costs that may be necessary. You will need to plan carefully the hardware and software tools you decide to use for implementing the automated portion of the Incident Management and Problem Management processes, and ensure that they integrate with the other ITIL processes.

A good, integrated technology tool may be a significant up-front investment, but if chosen and implemented correctly, it will result in long-term savings in other areas of the enterprise.


Measuring Problem Management Success


Problem Management metrics can help improve business. But to demonstrate this, it is important to create statistics and metrics that clearly show the improvements. Success must be documented in terms of improvements to the business. Shall we repeat the mantra again? You cannot manage what you cannot measure. What kind of Problem Management measurements and associated data can be used to measure improvements? The following are some of the common Problem Management metrics typically available for you to consider and build upon:
• Total number of incidents reported
• Total number of incidents reopened
• Total number of major problems
• Total number of problems in the pipeline
• Total number of problems resolved and removed
• Total number of known errors
• Total number of problems reopened
• Total number of problems with customer impact
• Average time to resolve each problem
• Average time to resolve Severity 1 problems
• Average time to resolve Severity 2 problems
• Total available labor hours allotted to work on problems
• Total labor hours spent working on problems
• Problem Management tools support level
• Problem Management process maturity

So where do you find this data? It can be found in such places as:
• Incident Management system reports
• Problem Management system reports
• Labor reports
• HR reports
• Audit reports
• Process Assessment Audit Report Findings
• Tool Assessment Results


What kind of evaluations can you make from these seemingly nondescript numbers? What are your KPIs? Some of these numbers stand on their own to provide meaningful KPIs:
• Total number of major problems
• Total number of problems in the pipeline
• Total number of problems resolved and removed
• Total number of known errors
• Total number of problems reopened
• Total number of problems with customer impact
• Total available labor hours allotted to work on problems
• Total labor hours spent working on problems

However, you can do a little math and determine additional useful KPIs. The following are just some of the metrics you can calculate from the data.

Customer Impact Rate
You can determine the customer impact rate by dividing the total number of problems with customer impact by the total number of problems in the pipeline. For example, if you had 50 problems in the pipeline this week and 22 of them impacted customers, your customer impact rate is 22/50 or 44%. This will tell your management how well you are keeping problems from impacting your customers and point to where you need more resources, tools, or labor to lower the rate to an acceptable level.

Incident Repeat Rate
When incidents repeat and must be reopened, it points to underlying problems that must be discovered. You can determine the incident repeat rate by dividing the total number of repeat incidents by the total number of incidents. For example, if you had 50 incidents during the week, and 25 of them were repeat incidents, your incident repeat rate is 25/50 or 50%. This will tell your management how effective you are at minimizing repeat incidents. The higher the number, the more investigation and research needs to be done to determine any existing problems at the core of the incidents.

Problem Labor Utilization Rate
You can determine the problem labor utilization rate by dividing the total labor hours spent working on problems by the total labor hours available to work on problems. For example, if you spent 80 hours resolving problems during the week and you had allotted 120 hours to be available for problem resolution, your problem labor utilization rate would be 80/120 or 67%. This metric will indicate how much available labor capacity was used handling problems, and can indicate whether the number of hours allotted is too low, whether more personnel are needed, or whether the procedures for fixing problems need to change.


Problem Reopen Rate
The problem reopen rate is found by dividing the total number of problems reopened by the total number of problems in the pipeline. For example, if you had 60 problems in the pipeline for the week, and 20 of the problems were reopened, your problem reopen rate would be 20/60 or 33%. This metric will tell your management how successful you are at permanently removing problems. Your goal will be to get this rate as low as possible.

Problem Resolution Rate
The problem resolution rate is computed by dividing the total number of problems resolved by the total number of problems in the pipeline. For example, if you had 45 problems in the pipeline for the week, and you resolved 30 of them, your problem resolution rate would be 30/45 or 67%. This metric will tell your management the percentage of problems you successfully addressed and removed. The higher the percentage, the better.

Problem Workaround Rate
The problem workaround rate is found by dividing the total number of known errors by the total number of repeat incidents. For example, if the total number of known errors is 100 and the total number of repeat incidents is 120, your problem workaround rate is 100/120 or 83%. This metric will tell your management the percentage of problems for which you implemented workarounds.

Metrics such as these will tell you, and more importantly tell your business leaders, how efficient you are at implementing Problem Management process components and where improvements are needed.
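The six rates above are all simple ratios, so they are easy to compute from a weekly extract of your reports. The following Python sketch is one possible way to do that. The class name WeeklyCounts, its field names, and the choice to report 0% when a denominator is zero are assumptions made for illustration; they are not part of ITIL or of any particular tool.

```python
from dataclasses import dataclass
from typing import Dict


def rate(part: float, whole: float) -> float:
    """Return part/whole as a percentage; 0.0 when the denominator is zero (an assumption)."""
    if whole == 0:
        return 0.0
    return 100.0 * part / whole


@dataclass
class WeeklyCounts:
    """Raw counts for one reporting period. Field names are illustrative only."""
    problems_in_pipeline: int
    problems_with_customer_impact: int
    problems_reopened: int
    problems_resolved: int
    known_errors: int
    incidents: int
    repeat_incidents: int
    labor_hours_spent: float
    labor_hours_available: float


def problem_kpis(c: WeeklyCounts) -> Dict[str, float]:
    """Compute the six KPI percentages exactly as they are defined in the text."""
    return {
        "customer_impact_rate": rate(c.problems_with_customer_impact, c.problems_in_pipeline),
        "incident_repeat_rate": rate(c.repeat_incidents, c.incidents),
        "problem_labor_utilization_rate": rate(c.labor_hours_spent, c.labor_hours_available),
        "problem_reopen_rate": rate(c.problems_reopened, c.problems_in_pipeline),
        "problem_resolution_rate": rate(c.problems_resolved, c.problems_in_pipeline),
        "problem_workaround_rate": rate(c.known_errors, c.repeat_incidents),
    }


if __name__ == "__main__":
    # Hypothetical numbers for a single week.
    week = WeeklyCounts(
        problems_in_pipeline=50, problems_with_customer_impact=22,
        problems_reopened=10, problems_resolved=30, known_errors=20,
        incidents=50, repeat_incidents=25,
        labor_hours_spent=80, labor_hours_available=120,
    )
    for name, value in problem_kpis(week).items():
        print(f"{name}: {value:.0f}%")
```

Keeping the raw counts in one record and deriving every rate from it makes it harder for two reports to disagree about the same week.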

Summary
Implementing the ITIL Incident Management and Problem Management processes will be an evolutionary process just as Change Management implementation was. It will take time and investment up front. It will be a learning experience. But, when done correctly, it will make your business more efficient; reduce downtime; prevent incidents from happening; save money that otherwise would have been spent constantly addressing recurring incidents, problems, and errors; and make IT more strategic in the eyes of your business leaders. Incident Management and Problem Management implementation success will also take the strong, consistent commitment of your executive management to get through the inevitable growing pains. Be sure you have that to get the subsequent commitment of your ITIL team members, and ultimately improve your Incident Management and Problem Management processes.


Chapter 4: Supporting Compliance Through ITIL


Organizations have faced legal and regulatory requirements for decades. Perhaps the first painfully apparent compliance requirements were experienced by U.S. businesses in 1970. At that time, there was huge concern about the increasingly large numbers of deaths and injuries that occurred at work sites. A new oversight agency, the Occupational Safety and Health Administration (OSHA), was created in 1970 and tasked to create regulations to ensure worker safety. Businesses hated these directives. Many business leaders predicted that following the new safety regulations would cost businesses huge amounts of money, not only because of lost productivity but also because of how much just getting into compliance would cost. Many of the requirements seemed unnecessary based solely upon the cost and time involved in their implementation. However, history has shown that, as a result of OSHA requirements and compliance by organizations, there have been measurably fewer injuries and deaths and significantly less lost work. In addition, there have been fewer workers' compensation losses.

IT Compliance Is Relatively Young


Fast forward a couple of decades and, as Yogi Berra would say, "It's déjà vu all over again." U.S. healthcare organizations reacted with alarm over the passage of the Health Insurance Portability and Accountability Act (HIPAA) of 1996. U.S. financial organizations soon followed suit with their reaction to the passage of the Gramm-Leach-Bliley Act (GLBA), also known as the Financial Modernization Act, of 1999. But probably the biggest whammy, felt by the largest number of organizations, came with passage of the Sarbanes-Oxley (SOX) Act of 2002. Many data protection laws have been enacted throughout the world since around 1995. Organizations now must follow specific requirements to protect information and the IT infrastructures that process and house the data. In addition to these laws, there is now a trend to require organizations that perform certain activities, such as processing credit cards, to implement very specific data protection practices. The perfect example of this is the Payment Card Industry (PCI) Data Security Standard (DSS). Although this standard is not a law, it is a contractual requirement for processing credit cards from Visa, MasterCard, American Express, and others. Protecting information is no longer just a good idea; it is a legal requirement that is best accomplished by using proven, internationally accepted data management frameworks.


Frameworks Support Compliance


Some of the current prominent frameworks for IT and information security governance are ITIL, COBIT, ISO/IEC 17799 (soon to be ISO/IEC 27002), and COSO.

Recall each of these? Let's quickly review:
• Information Technology Infrastructure Library (ITIL) offers best practice approaches to facilitate the delivery of high-quality information technology (IT) services, the earliest version of which was released in 1985.
• Control Objectives for Information and related Technology (COBIT) provides best practices for IT management and controls created by the Information Systems Audit and Control Association (ISACA) and the IT Governance Institute (ITGI) in 1992.
• ISO/IEC 17799 is an information security standard most recently published in June 2005 by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). This standard was renumbered ISO/IEC 27002:2005 in July 2007.
• Committee of Sponsoring Organizations (COSO) of the Treadway Commission is a U.S. private-sector initiative formed in 1985 that makes recommendations to reduce fraud incidents. COSO has a common definition of internal controls, standards, and criteria against which companies and organizations can assess their control systems.

Much has been written in the past few years about ITIL. Why? Because ITIL is a perfect complement to both COBIT and ISO/IEC 17799. It aligns nicely with them, and the three interoperate in many ways. Most organizations that use frameworks will typically use more than one; they realize that just one framework does not address all the issues necessary for effective information management within a complex business environment. With the passage of SOX, it has been common to see organizations use COSO and COBIT in conjunction with ITIL. Auditors overwhelmingly use COBIT to determine appropriate controls when doing SOX reviews. IT areas can benefit from following a standardized framework, such as ITIL, to support COBIT constructs, and at the same time ensure SOX compliance. Why is this? Because COBIT and ITIL provide frameworks covering the areas that must be reviewed, along with the necessary criteria to use for evaluations, when considering the effectiveness of IT service management.

It is important to keep in mind that COBIT and ITIL do not provide explicit solutions to the risks discussed within them. For them to try to do so would be foolhardy considering the very wide range of technology solutions that exist along with the technologies emerging every day. However, COBIT and ITIL, which address general and significant IT control and management issues in basically all organizations, provide an efficient and effective roadmap to follow to successfully implement IT solutions. Because COBIT and ITIL include what are widely accepted as best practices, the documentation and implementation of the concepts will provide the best possible, and defensible, IT management results.


ITIL Has Been Validated


The concepts within frameworks have been tried and tested within numerous organizations, and they work! Frameworks are efficient and effective. Frameworks already exist; you do not need to create something from scratch yourself. You don't need to spend staff and management time creating roughly similar processes (after numerous trials and errors) that may not be as effective as these already existing frameworks. Frameworks can offer a competitive advantage. ITIL offers cost savings, efficiency, and a competitive advantage. Why? The following list highlights just a few of the reasons:
• ITIL satisfies and extends COBIT controls relating to IT Service Management, including Change Management, Problem Management, and Incident Management.
• ITIL improves IT processes and controls. The organizations that have successfully implemented ITIL attest to that.
• ITIL can be used to determine technology requirements and identify possible organizational structure, roles, and responsibilities.
• ITIL Service Support processes enable effective IT services and contain the building blocks of all IT services.
• ITIL is increasingly being used to implement the best practices promoted by COBIT and ISO/IEC 17799.

ITIL Service Management Supports Compliance


ITIL supports compliance with many laws and regulations, such as the USA PATRIOT Act, California SB1386, SOX, the European Union Data Protection Directive 95/46/EC, Basel II, GLBA, HIPAA, the U.S. state breach notice laws, and many more. However, actual ITIL specifications do not contain references to any particular regulations or laws; there would be too many to list, and too many new ones are going into effect. By comparing the requirements of various laws and regulations, though, it becomes clear how ITIL supports compliance. Data protection and privacy laws and regulations throughout the world have many commonalities, and they promote following accepted best practices and standard frameworks. In fact, by following frameworks such as COSO, COBIT, ISO/IEC 17799, and ITIL, organizations will realize compliance with roughly 80% to 85% of the data protection requirements within all these many laws and regulations. Much more time will be spent on compliance activities if they are addressed in an ad-hoc manner or with one-off solutions. By following defined frameworks, much time and resources will be saved in meeting compliance objectives. Using a well-defined framework allows for a comprehensive approach to compliance.


SOX Mapping to ITIL Service Management


It is important to note that standard auditor recommendations are based upon these widely respected and internationally endorsed IT and information security frameworks. Why? Because regulatory oversight agencies reference the use of these frameworks over and over again within their compliance guidance documents. Just consider SOX. SOX gave the Public Company Accounting Oversight Board (PCAOB) responsibility for oversight of SOX compliance. The PCAOB then created several guidance documents to help auditors and organizations determine whether organizations had proper controls in place.

PCAOB "is a private-sector, non-profit corporation, created by the Sarbanes-Oxley Act of 2002, to oversee the auditors of public companies in order to protect the interests of investors and further the public interest in the preparation of informative, fair, and independent audit reports." For more information, see their Web site at http://www.pcaobus.org/.

The PCAOB recommends the COSO and COBIT frameworks be used to meet SOX compliance within various guidance documents they have issued, such as in PCAOB Release No. 2004-001, March 9, 2004, and in their Auditing Standard #2. The PCAOB directed that established frameworks be used by organizations to support consistent and effective internal controls. So, SOX directed the PCAOB to create guidance, and the PCAOB mandated the use of established and effective frameworks for internal controls. ITIL clearly maps to COBIT and COSO. Figure 4.1 demonstrates these relationships.

[Figure 4.1: diagram relating the Sarbanes-Oxley Act, the Securities and Exchange Commission, the PCAOB, auditors, and management to the guidelines (COSO, COBIT) and the IT management frameworks (ITIL, ISO 17799).]

Figure 4.1: How SOX relates to ITIL.


Now let's drill down a little further to the point where the auditors are using COBIT to evaluate your IT controls. Auditors will use the COBIT 4.0, Manage Changes (AI6, AI7) section. The Control Objective is "Controls provide reasonable assurance that system changes of financial reporting significance are authorized and appropriately tested before being moved to production." What does this have to do with financial reporting controls? The Rationale explains it well: "Managing changes addresses how an organization modifies system functionality to help the business meet its financial reporting objectives. Deficiencies in this area could significantly impact financial reporting. For instance, changes to the programs that allocate financial data to accounts require appropriate approvals and testing prior to the change so that proper classification and reporting integrity is maintained." This relates to the general requirements of SOX Section 404, which exist to ensure that proper internal controls are in place for processes, automation, and documentation. IT managers, internal auditors, controllers, process specialists, and IT systems personnel are accountable for ensuring these controls exist. Figure 4.2 shows at a high level how ITIL Service Management processes support SOX Section 404. Details for each are discussed later in the chapter.
Change Management
• Requests for program changes, system changes, and maintenance (including changes to system software) are standardized, logged, approved, documented, and subject to formal change management procedures
• Emergency change requests are documented and subject to formal change management procedures
• Controls are in place to restrict migration of programs to production to authorized individuals only
• IT management implements system software that does not jeopardize the security of the data and programs being stored on the system
• Rapid disclosure of operations, financial reporting and compliance validation and documentation

Incident Management
• IT management has defined and implemented an incident management system such that data integrity and access control incidents are recorded, analyzed, resolved in a timely manner, and reported to management
• A security incident response process exists to support timely response and investigation of unauthorized activities

Problem Management
• The problem management system provides for adequate audit trail facilities, which allow tracing from incident to underlying cause

Figure 4.2: How ITIL Service Management supports SOX Section 404 requirements.


The general ITIL controls that support all three of these IT Service Management processes include:
• Application controls, such as those for the systems development life cycle (SDLC), logging access activities, and processing and reporting financial activities of all types
• IT general controls, such as access controls, authorization, and records retention
• Document controls, such as the existence of policies, procedures, narratives, flowcharts, and configurations

ITIL Supports Compliance with Many Laws and Regulations


As Figure 4.3 highlights, ITIL supports compliance with many other laws and regulations. Later, this chapter will delve deeper into the specifics of how ITIL Change Management, Incident Management, and Problem Management support compliance with these legal requirements.
Law or Regulation: Requirements Supported by ITIL

Basel II: Monitoring and reporting; internal controls; risk management; documentation; and accountability

GLBA: Detecting, preventing, and responding to attacks, intrusions, or other systems failures; testing and monitoring; assigning security and privacy responsibility; providing policies and procedures for access controls; developing an awareness and training program

HIPAA: Providing policies and procedures to prevent, detect, contain, and correct security violations; assigning security and privacy responsibility; offering policies and procedures for access controls; developing an awareness and training program; ensuring there are policies and procedures for responding to an emergency; implementing audit controls, authentication controls, and incident response

European Union Data Protection Directive 95/46/EC: Ensuring data accuracy; providing access controls; assigning responsibility; ensuring data retention

Canada's Personal Information Protection and Electronic Documents Act (PIPEDA): Providing access controls; ensuring data retention and data accuracy

U.S. State Breach Notice Laws: Implementing incident response; assigning accountability

Figure 4.3: Laws and regulations ITIL supports.


Compliance with Policies and Procedures


In addition to complying with laws and regulations, you must comply with your own organization's policies. Unfortunately, too many organizations do not realize this. The security and privacy policies posted on an organization's Web site are legally binding documents. Do you have procedures in place within your organization to support compliance with them? Auditors and regulators will review your organization's internal information security and privacy policies to determine whether your organization is following them. Most organizations have documented policies but do not offer documented procedures to support compliance, and provide little or no training and awareness to communicate those policies to personnel and business partners.

All organizations within the U.S. that are in noncompliance with their own policies are putting themselves at risk of being found in violation of the U.S. Federal Trade Commission Act (FTC Act). Section 5 of the FTC Act declares that unfair or deceptive trade practices are illegal. Not following your own policies is generally considered an unfair and deceptive trade practice. Your policies are basically the promises you make to your customers and employees, so not following them is considered misleading your consumers. These promises may take the form of express or implied claims, and may be written or oral. A few examples of organizations that have received fines and penalties as a result of noncompliance with their own policies include:
• In May 2007, the FTC found Pacific Herbal Sciences to be in violation of the FTC Act, and the company was fined $172,500. The FTC contended that the defendants falsely claimed on their Web site ordering pages that transactions were secure and that customer privacy was protected. The Web site contained the message, "NOTE: To ensure your personal privacy, all of the information that you submit to us after this point will be secured using SSL encryption technology." However, the transactions were not secured in this manner.
• In November 2006, the FTC applied a $3 million penalty against Zango Inc. for their unfair and deceptive business practices because they did not have procedures in place to support their policies.
• In September 2006, the FTC fined Enternet $2 million for being in violation of the FTC Act for misleading consumers with their privacy policies.

It is important to note that the FTC also typically requires violators of the FTC Act to establish formal information security programs and undergo ongoing independent audits of the adequacy of the programs for a period of 20 years. The ongoing purview of the FTC is often more expensive than the dollar penalty.


ITIL Supports Compliance and Improves Business


ITIL Change Management, Incident Management, and Problem Management processes support compliance with laws, regulations, and corporate policies. In addition to supporting compliance, implementing these ITIL processes will result in:
• Cost justification for service quality activities
• Better integration of corporate processes
• Better integration of IT with other business processes throughout the enterprise
• Support for systems audits
• Creation of key performance indicators
• Documented roles and responsibilities in service provisioning
• Enhanced efficiency, resulting in better competitiveness
• Improved availability, reliability, and security of mission-critical IT services
• Improved resource utilization
• Improved process scalability and consolidation
• Improved project deliverables and time to delivery
• Continuous learning process and feedback
• Reduced rework and elimination of redundant work
• Services better able to meet business, customer, and user demands

So, with all these in mind, let's look at the details of how these three ITIL Service Management processes support not only compliance but also business improvement.


Change Management
One of the key internal control objectives in COBIT is managing change. Managing change is also one of the required General IT controls. The foundation of an effective and efficient IT control environment is effective Change Management. Well-defined documented processes based on best practices frameworks, such as ITIL, and supported by automation where possible, are necessary to achieve compliance. The following Change Management activities support compliance requirements:
• Ensuring system changes are authorized and appropriately tested before being moved to production
• Having a documented change management process and keeping it maintained to reflect the current process
• Having change management procedures for all changes within the production environment, including program changes, system maintenance, and infrastructure changes
• Following procedures to control and monitor change requests
• Following procedures to initiate, approve, and track change requests
• Following documented procedures to appropriately test and approve changes before placing them into production
• Ensuring the approval procedures address all the following: operations, security, IT infrastructure management, and IT management
• Following documented procedures to ensure only authorized/approved changes are moved into production
• Maintaining an audit trail, change request log, and supporting documentation
• Ensuring documented procedures for timely implementation of patches to system software
• Maintaining and following documented procedures to control and supervise emergency changes
• Maintaining an audit trail of all emergency activity and following procedures to have it independently reviewed
• Following documented procedures, including back-out activities, for emergency changes
• Following documented procedures to ensure all emergency changes are tested and appropriately approved by systems owners, development staff, and computer operations, as appropriate, before being put into production
• Establishing separation of duties between the staff responsible for moving a program into production and development staff
• Following documented procedures to perform a risk assessment of the potential impact of changes to system software
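Several of the activities above (authorization, testing before production, separation of duties, and back-out planning for emergency changes) lend themselves to an automated gate in front of production. The following Python sketch illustrates the idea under simplifying assumptions. The ChangeRequest record, its field names, and the production_gate function are hypothetical and do not come from ITIL, COBIT, or any specific change management product.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChangeRequest:
    """A hypothetical RFC record; field names are illustrative only."""
    rfc_id: str
    developer: str                       # who built the change
    approved_by: Optional[str] = None    # who authorized the change, if anyone
    tested: bool = False                 # True once testing has been signed off
    deployer: Optional[str] = None       # who will move it into production
    emergency: bool = False
    backout_plan: Optional[str] = None   # expected for emergency changes


def production_gate(change: ChangeRequest) -> List[str]:
    """Return the control violations found; an empty list means the change may proceed."""
    violations: List[str] = []
    if change.approved_by is None:
        violations.append("change is not authorized")
    if not change.tested:
        violations.append("change has not been tested")
    if change.deployer is None:
        violations.append("no deployer assigned")
    elif change.deployer == change.developer:
        violations.append("separation of duties: developer cannot deploy")
    if change.emergency and not change.backout_plan:
        violations.append("emergency change lacks a back-out plan")
    return violations


if __name__ == "__main__":
    rfc = ChangeRequest("RFC-1001", developer="dana", tested=True, deployer="dana")
    for problem in production_gate(rfc):
        print(f"{rfc.rfc_id}: {problem}")
```

Returning every violation at once, rather than stopping at the first, gives the requester a complete list to fix and gives auditors a record of which controls were evaluated.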


The benefits of following the ITIL Change Management process go beyond compliance. The organizational benefits include:
• Cost savings: According to Nouri Association, Inc. (NAI), organizations save 30% to 50% using frameworks with automated controls compared with those that use manual change management controls.
• Increased customer satisfaction: Change management occurs more consistently and dependably. Customers know the status of their change request throughout the entire change process.
• Production environment stability: NAI research shows there is a 15% to 20% decrease in change-related incidents.
• Support for quality assurance (QA) initiatives: Following the structured, well-documented, and consistent processes within ITIL Change Management supports QA recommendations, such as those found within Six Sigma.

To most efficiently and effectively handle IT changes and compliance requirements, the Change Management process should be centrally managed and integrated throughout the entire applications and SDLC. Activities that should be centrally managed to process changes include:
• Recording: Ensuring all change sources can submit requests for change (RFCs) and that the RFCs are properly recorded
• Acceptance: Filtering submitted RFCs and moving those eligible on for consideration
• Classification, categorization, and prioritization: Putting each RFC into the appropriate category and establishing a priority
• Planning and approval: Consolidating the changes, giving approvals, obtaining resources, and involving the change advisory board (CAB) where necessary
• Coordination: Scheduling, development, testing, and implementation
• Evaluation and closure: Determining success and learning from the experience
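One way to picture these centrally managed activities is as an ordered lifecycle that an RFC is not allowed to skip through. The Python sketch below models the six activities as stages with explicit allowed transitions. The REJECTED stage and the points at which an RFC can be rejected are assumptions added for illustration, as are the Stage and Rfc names.

```python
from enum import Enum


class Stage(Enum):
    RECORDING = "recording"
    ACCEPTANCE = "acceptance"
    CLASSIFICATION = "classification, categorization, and prioritization"
    PLANNING_AND_APPROVAL = "planning and approval"
    COORDINATION = "coordination"
    EVALUATION_AND_CLOSURE = "evaluation and closure"
    REJECTED = "rejected"  # assumption: filtered out at acceptance or not approved


# Allowed next stages for each stage; anything else is refused.
ALLOWED = {
    Stage.RECORDING: {Stage.ACCEPTANCE},
    Stage.ACCEPTANCE: {Stage.CLASSIFICATION, Stage.REJECTED},
    Stage.CLASSIFICATION: {Stage.PLANNING_AND_APPROVAL},
    Stage.PLANNING_AND_APPROVAL: {Stage.COORDINATION, Stage.REJECTED},
    Stage.COORDINATION: {Stage.EVALUATION_AND_CLOSURE},
    Stage.EVALUATION_AND_CLOSURE: set(),
    Stage.REJECTED: set(),
}


class Rfc:
    """A hypothetical RFC that records every stage it passes through."""

    def __init__(self, rfc_id: str) -> None:
        self.rfc_id = rfc_id
        self.stage = Stage.RECORDING
        self.history = [Stage.RECORDING]

    def advance(self, to: Stage) -> None:
        """Move to the next stage, refusing any transition that skips a step."""
        if to not in ALLOWED[self.stage]:
            raise ValueError(f"{self.rfc_id}: cannot go from {self.stage.name} to {to.name}")
        self.stage = to
        self.history.append(to)


if __name__ == "__main__":
    rfc = Rfc("RFC-2002")
    for step in (Stage.ACCEPTANCE, Stage.CLASSIFICATION, Stage.PLANNING_AND_APPROVAL,
                 Stage.COORDINATION, Stage.EVALUATION_AND_CLOSURE):
        rfc.advance(step)
    print(" -> ".join(s.name for s in rfc.history))
```

Because the history list is only ever appended to, it doubles as a simple change request log for audit purposes.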


Incident Management
The Incident Management process needs to manage all incidents from detection and recording through to resolution and closure. Incident Management is reactive by nature. The objectives of Incident Management are to reduce or eliminate the business impacts and effects of actual or likely disturbances within IT services so that personnel can get back to work, and the business can return to normal, as soon as possible. Another COBIT internal control objective is managing incidents. The following Incident Management activities also support compliance requirements:
• Documenting and maintaining a formal incident management system.
• Establishing and maintaining formally documented incident management procedures.
• Providing training for, and consistently following, incident management procedures.
• Obtaining clearly documented management support for incident management processes.
• Establishing consistent, well-documented incident reports that include information about the incident, how the incident was analyzed, and how it was resolved.
• Establishing incident management audit trails to track the entire incident resolution lifecycle, from initial report to confirmed resolution.
• Establishing procedures to respond to unauthorized activities in a timely manner.

Well-defined documented procedures, automated where possible, help to further support compliance. Automation helps to ensure procedures are consistently and completely followed and reduces the amount of human error. The types of activities that occur within Incident Management that can be automated to support compliance requirements include:
• Incident acceptance and recording: Detecting and reporting an incident and then creating an incident record
• Classification and initial support: Assigning the incident a type, status, impact, urgency, priority, service level agreement (SLA), and so on to help facilitate the most appropriate response; this should include providing temporary workarounds whenever applicable
• Service request: Documenting and implementing automated procedures to request IT services whenever necessary to support incident response
• Matching: Determining whether the incident is known and if there is a workaround in place


• Investigation and diagnosis: When no known solution to an incident exists, following procedures to launch an investigation
• Resolution and recovery: Following procedures to find a solution, documenting it, and then automatically notifying the appropriate individuals and areas
• Closure: Upon obtaining confirmation from those notified that the solution is satisfactory, following automated procedures to formally close the incident
• Progress monitoring and tracking: Throughout the incident response life cycle, monitoring progress so that the time it takes to resolve the incident is recorded; in addition, ensuring that, when roadblocks occur, the incident is appropriately escalated to the next level of support.
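The classification step is one of the easier activities to automate. The sketch below derives a priority from impact and urgency; the three-level scale and the rule "priority = impact rank + urgency rank - 1" are assumptions chosen for illustration, and an organization would substitute its own matrix and SLA targets.

```python
# Hypothetical three-level scale; rank 1 is the most severe.
RANK = {"high": 1, "medium": 2, "low": 3}


def classify_priority(impact: str, urgency: str) -> int:
    """Return a priority from 1 (most urgent) to 5 (least urgent).

    Assumed rule for this sketch: add the two ranks and subtract one,
    so high/high gives 1 and low/low gives 5.
    """
    return RANK[impact] + RANK[urgency] - 1


def new_incident_record(summary: str, impact: str, urgency: str) -> dict:
    """Create a minimal incident record with its derived priority."""
    return {
        "summary": summary,
        "impact": impact,
        "urgency": urgency,
        "priority": classify_priority(impact, urgency),
        "status": "recorded",
    }
```

Deriving the priority from two recorded inputs, rather than letting each team set it by hand, is one way to get the consistent handling the text calls for.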

Problem Management
So how is a problem different from an incident? As I discussed in Chapter 1, a problem is generally an unwanted or undesirable situation that, if not addressed soon enough, can become the root cause of an incident. Problem Management takes the entire IT infrastructure into account, using all available information, to identify existing and potential failures in the delivery of IT services. Problem Management supports Incident Management by providing alternative workarounds and temporary fixes during an incident but does not have responsibility for actually resolving incidents. Problem Management also involves the analysis of incidents and problems to identify trends and then takes proactive actions to prevent further occurrences of similar incidents and problems. Problem Management also supports COBIT internal control objectives and, as a result, compliance with laws and policies. The following Problem Management activities support compliance requirements:
• Establishing a documented Problem Management system and ensuring it is being used throughout the enterprise
• Establishing formally documented procedures to use the Problem Management system, including consistent reports and review practices
• Following formally documented procedures to create audit trails for Problem Management activities


Well-defined documented Problem Management procedures, automated where possible, help to further support compliance. As with Incident Management, automation helps to ensure procedures are consistently and completely followed and reduces the amount of human error. The types of activities that occur within Problem Management that can be automated to support compliance requirements include:
• Problem identification and recording: Automating problem reporting helps to streamline the identification of known and new problems, in addition to supporting better trend analysis.
• Problem classification and allocation: Determining the category, impact, urgency, priority, and status of a problem then allocating resources for resolution is made more efficient through automation.
• Problem investigation and diagnosis: Determining the cause of the problem and linking it to the appropriate CIs is more accurate and time efficient through automation.
• Temporary fixes: Implementing necessary temporary or emergency fixes to manage known errors until they can be resolved is accomplished much more quickly by using automated processes to identify the temporary fixes.
• Error identification and recording: Identifying the error and then communicating the error to Incident Management, if appropriate, is made easier through automation.
• Error assessment: Determining what is necessary to resolve known problems and errors is made easier through automation.
• Record error resolution: Determining the most appropriate business solution is done more quickly through automation.
• Close error and associated problems: Performing a Post Implementation Review (PIR) and then closing the records is done more accurately and efficiently through automation.


Compliance Requires Accountability: ITIL Establishes Accountability


Another key aspect of achieving compliance is establishing accountability. When management visibly supports and takes ownership of the organization's IT control strategy, accountability is achieved. In IT, control strategy is composed of three types of interrelated controls, all of which support compliance and are a result of implementing ITIL:
• Preventive controls help keep bad and unauthorized things from happening. Examples of preventive controls are policy, segregation of duties, and authorization processes. Compliance requires all these controls. ITIL establishes these controls.
• Detective controls are analytical controls that monitor activities and processes to identify when preventive controls have failed or been circumvented. Examples of detective controls include change auditing and post-deployment verification of changes to the production infrastructure. Compliance requires all these controls. ITIL establishes these controls.
• Corrective controls restore the IT environment to an authorized and appropriate state when the detective controls identify something that is not appropriate. Examples of corrective controls include restoring programs and provisioning tools. Compliance requires all these controls. ITIL establishes these controls.
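As a rough illustration of a detective control, change auditing can be reduced to comparing the changes that were approved with the changes actually observed in production. The function name and the RFC identifiers below are hypothetical; a real audit would draw both lists from the change request log and from a monitoring or integrity-checking tool.

```python
def find_unauthorized_changes(approved_ids, observed_ids):
    """Return observed change IDs that have no matching approval.

    Assumption for this sketch: every production change carries the
    identifier of the change request that authorized it.
    """
    return sorted(set(observed_ids) - set(approved_ids))


# Example with made-up identifiers:
approved = {"RFC-101", "RFC-102"}
observed = ["RFC-101", "RFC-999"]
print(find_unauthorized_changes(approved, observed))  # ['RFC-999']
```

Anything the comparison returns would then be handed to a corrective control, for example by restoring the affected component to its last authorized state.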

An effective IT control strategy will utilize all these controls and be designed to minimize risk to the business. By implementing these controls following ITIL, regulatory and policy compliance in large part can be achieved.

Summary
As organizations continue to look for better ways to manage IT while meeting regulatory and policy compliance, ITIL continues to grow in popularity. As a result, organizations also realize better integration of IT throughout all enterprise business processes. Putting ITIL in place requires careful planning and commitment, and it is usually expensive. ITIL is often best implemented with other frameworks, particularly COBIT, to meet compliance requirements. However, organizations that take a proactive approach to compliance and frameworks implementation realize they also achieve greater efficiency, reduced operational and legal risk, and lower operational expense.


According to studies of high-performing IT organizations by the IT Process Institute, organizations that implemented frameworks as part of their compliance efforts spent less than 10 full-time equivalent (FTE) staff-years on SOX Section 404 activities compared with hundreds of FTEs in other organizations. The organizations working towards frameworks and compliance goals spent less than 5% of their time on IT problem resolution compared with 35% to 45% spent on unplanned, unscheduled work in other IT organizations that were not using frameworks [Behr, K., G. Kim, and G. Spafford, The Visible Ops Handbook, Information Technology Process Institute (ITPI), 2004-2005]. ITIL implementation continues to grow throughout the world, a reminder of the growing importance of international standards. When you are implementing controls and processes to meet compliance requirements so that you can avoid litigation, fines, and penalties under your applicable laws and policies, take the opportunity to also act strategically to incorporate IT throughout all your organization's business decision-making processes. You will find that taking this risk-based, frameworks approach will create valuable benefits beyond compliance. You will see that the resulting strong IT controls strategy will achieve compliance objectives as well as increase IT efficiency and effectiveness.


Chapter 5: Roadmap for Successful ITIL Service Support Implementation


Once you have decided to implement ITIL service support processes (namely Change Management, Incident Management, and Problem Management), what do you do? For the best results, start with a small scope and build upon each success: begin with the areas that have the most urgent need for improvement and then go from there. It often helps to examine how another company has had success with implementing processes. Let's explore the challenges faced by a large multinational organization that are common across many organizations. First consider the following background information about the company that I'll call Generic Manufacturing Company:
• Multinational organization with more than 100,000 employees
• Manufacturing industry
• Offices and facilities in 20 countries
• Multiple service and product business unit organizations
• In the process of implementing a new enterprise IT operating environment
• Historically has experienced major problems with support services; each business unit had its own way of doing things, making it a challenge for the individual area IT environments to communicate with each other
• Wants to standardize hardware, software, and IT tools throughout the organization to allow better communication and coordination
• Wants to centralize Change Management, Problem Management, and Incident Management processes to ensure consistent practices are followed

There were processes in place but each business unit had their own unique way of getting their jobs accomplished. There were still old desktop computers and servers being utilized. There was a centralized Help desk, but the IT areas did not use it. In fact, the IT area told their applications and systems customers to contact the IT staff directly if the customers needed help. The only time the Help desk was called was basically for password resets. Okay, so where to start? This sounds like a job for ITIL!


Getting Ready
Organizations of any size can benefit from centralizing IT processes as much as possible. The IT Service Support processes that tend to impact all organizations regardless of size are Change Management, Problem Management, and Incident Management. The Generic Manufacturing Company can benefit from effective centralization of its many IT processes. The company determined it could benefit by clearly establishing one process and one area to be ultimately responsible for change, problem, and incident handling processes. Consider all the possibilities for where and when to begin. This is the initial stage of ITIL implementation. It is important to get the different areas, currently mistrustful and at odds with each other, to understand the improvements that can be made through cooperative implementation of ITIL. The business management was all for implementing a process to make IT management go more smoothly and, in fact, improve upon business results. The Generic Manufacturing Company developed the ITIL process implementation roadmap that Figure 5.1 shows. All organizations can take this roadmap and modify it to meet their own enterprise's unique environment.

[Figure 5.1 shows five roadmap phases and their activities:]
Create mission statement: Get executive support; Choose team members
Perform baseline assessment: Set scope; Identify stakeholders; Determine current situation; Identify trouble spots; Perform benchmark
Planning: Document business case; Set goals; Create plan
Implementation: Create awareness; Train personnel; Implement plan; Manage organizational change; Manage cultural change
Measurement: Review status; Measure goals; Measure organizational changes; Measure cultural changes; Document problems and vulnerabilities
Also shown: Create policies; Manage risks; Identify responsibilities

Figure 5.1: ITIL process implementation roadmap.


It is not practical to implement all aspects of ITIL Service Support at one time! Not all the required process inputs will be available when the first process is initiated. There will be information quality issues where key process input areas are absent. It will be very difficult, and often impossible, to determine the impact of a change on business services availability, capacity, and continuity when supporting processes do not exist. This increases the possibility of problems occurring when deploying the processes. Be sure that representatives from each of the ITIL Service Support process teams are included in the CAB. These representatives can then be made aware of the impact of the changes on the user community.
Be sure to include information security within your processes. Information security is a critical function within the IT organization. It is critical to include information security in the authorization of all network and information system changes. Doing so ensures that security is accounted for in the development of the change. Information security should also be part of the CAB.

Realizing Improvements Are Needed
The Generic Manufacturing Company performed an assessment to clearly identify and document the problems. Three major trouble areas were discovered:
• The multiple areas of the company were each following different processes to move applications from test to pilot to production environments. One of the areas didn't even have a pilot (end-user quality assurance) environment and moved the applications directly from the test environment to the production environment!
• IT problems were handled very differently throughout the enterprise. A couple of the business unit applications support areas told their end users to call them directly to handle problems. Other business units directed end users to call the corporate Help desk. Documentation for the problems was not consistent, was not centralized, and often did not exist. The same problems seemed to occur over and over again.
• IT-related incidents were recurring with increasing frequency. Many times the same incident occurred in different parts of the enterprise, and different teams handled the same incident differently. The teams handling the incidents did not communicate with each other about what worked well and what didn't work for incident resolution.

Generic Manufacturing Company IT leaders realized these problems were likely having major negative impact on the business. They wanted to look into specifically how much impact, and then determine what could be done to improve upon the situations. These trouble areas are common to most organizations.


Get Executive Support
Executive management must clearly support the implementation of ITIL processes throughout the enterprise. Without this support, the people who must be involved with implementing the necessary changes will not do so and will continue with business as usual. It is human nature to continue doing things as they are currently done; in a day already perceived to be too busy, continuing with the status quo is less work. Unless executive leaders tell personnel that changes must be made, there will not be full cooperation. Lack of cooperation here will lessen the effectiveness of the ITIL implementation and could result in an unsuccessful project. The Generic Manufacturing Company CIO explained the three major problems to the corporate executives. The explanation described the ITIL Service Support processes and how they could be good ways to improve or eliminate the problems. The IT staff asked for, and got, executive support to do a project to determine the extent of the problems throughout the enterprise, and then to determine what specifically should be done to address the problems.
Choose Team Members
Another key component for success is obtaining the qualified personnel to perform the implementation and ongoing tasks necessary within the ITIL Service Support processes. There will be a need for some full-time employees (FTEs), but there will also be the need to add responsibilities to existing positions. Even with the strong and visible support of executive business leaders, buy-in from all stakeholders, the establishment of the CMDB, the creation of a good process definition, and integration and automation of the processes throughout the enterprise, success will not be accomplished if you do not have personnel performing the necessary activities. You must also ensure that your internal customers will be actively involved in the ITIL process development, implementation, and maintenance. This will make certain your customers have tested and accepted the components of the ITIL process, ensuring the requirements have been met. Be sure to obtain feedback to guarantee the customer needs are considered and included within the planning process. There will be similar team members for each of the ITIL processes. The key roles for the ITIL Service Support processes include:
• Change Manager: This position will have authority and accountability to define, validate, and maintain the Change Management process. This position is responsible for oversight of Change Management process monitoring, measuring, reporting, and operations.
• Support Center Manager: This position should help determine how the Incident Management process can provide data relating to the other processes. This is the central point of contact for customers, so this position should ensure that the information identified and classified within the Support Center will be valuable and useful to other parts of the organization.
• Project Manager: This position is responsible for assisting with the initiation of the project, creating the project plan, executing the plan, and then closing the project. The Project Manager is also responsible for providing status updates for each project milestone.


• Human Resources (HR): A representative from HR should create job descriptions for new positions related to the Change Management process. HR can also identify potential internal candidates for the new positions.
• Trainer: A position needs to exist to ensure training will be provided to targeted positions and awareness communications will be provided to the entire enterprise.
• Purchasing Department: Someone from the purchasing department should be enlisted to negotiate the best prices and services possible from the vendors you will use to implement and support the Change Management process.
• Applications Development: Enlist the experts from your applications development areas to help choose Change Management process software. They can also identify the hardware requirements for the software. They should also discuss the products being considered with the Support Center staff to make sure the integration with the other ITIL processes is facilitated.
• Business Unit Managers or Representatives: Your internal customers must be represented within the Change Management process planning, implementation, and ongoing management to ensure the needs of the business are appropriately considered. This person must have in-depth knowledge of how the business works and must be able to translate and communicate the process issues and requirements into business terms for the personnel within the business unit.
• Support Center Staff: These personnel will typically not participate in the implementation of the Change Management process but will provide service delivery through their established communications channels, such as email, telephone, intranet site, and so on. The Support Center owns the Incident Management process. This area is the point of contact for customers and must be able to provide them with good advice and guidance. The Support Center staff must also obtain feedback regarding the problems reported through the Problem Management process. They should also create regular, typically daily, reports of the top-ten most common problems reported each day to allow for addressing reported problems most effectively and lessening the effect of ongoing, repeat problems.
• Incident Team Members: These personnel will work with the application providers to create Service Level Agreements (SLAs) and Operating Level Agreements. They also will manage the Support Center knowledge base and management reporting.
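The daily top-ten report described for the Support Center staff is straightforward to produce once problems are recorded with a consistent category. The sketch below assumes each reported problem has already been reduced to a category string; the function name and the sample categories are hypothetical.

```python
from collections import Counter


def top_problems(reported_categories, limit=10):
    """Return the most frequently reported problem categories.

    `reported_categories` is assumed to be one day's list of category
    strings taken from the Support Center's problem records.
    """
    return Counter(reported_categories).most_common(limit)


# Example with made-up data:
today = ["password reset", "mail delivery", "password reset", "mailbox full"]
for category, count in top_problems(today):
    print(f"{count:3d}  {category}")
```

The report is only as good as the categories behind it, which is one more reason the text stresses consistent recording and classification.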

Create Mission Statements
To achieve success, organizations must first define success. It is necessary to create a mission statement for each of the ITIL processes you are implementing. The following example shows the mission statement Generic Manufacturing Company created. You can use this as an example on which to base your own mission statements. You will need to modify it to fit your own organization's style of writing, along with your industry and environment.


Generic Manufacturing Company Change Management Mission Statement
The mission of the Generic Manufacturing Company Change Management process is to enable technical changes within the production environment in the most efficient and consistent way possible to support business objectives and with the least amount of disruption resulting from making IT changes. The purpose of the Change Management process is to ensure changes made within the IT environments are consistently tracked, reviewed, tested, communicated, implemented, and validated to reduce the negative impacts to the business as much as possible.

Perform a Baseline Assessment


It is important to clearly identify and document the scope for which your ITIL process will apply. For example, if you are going to start the implementation of ITIL by implementing the Change Management process for a specific application, clearly document the application, the persons and positions responsible for each of the changes involved with that particular application, the types of changes possible for that application, and so on.
You must have a complete CMDB in place to be able to successfully manage changes to the infrastructure scope that you have established. An incomplete CMDB will result in Change Management mistakes because of missing or incorrect information.

The Generic Manufacturing Company determined that the scope for the ITIL processes they would first implement would be Change Management, Problem Management, and Incident Management related to the email system used throughout the enterprise.
Identify Stakeholders
You must understand who the stakeholders are; their input is necessary both to understand how to improve the processes and to determine how successfully the processes were implemented. Define, identify, and map the stakeholders for each of the ITIL processes you are implementing. Identify the specific needs for each of the types of stakeholders you've identified. This information will be used in assessing the success of the ITIL process for the corresponding stakeholders. The Generic Manufacturing Company identified the following stakeholders for the email system:
• Information Technology
• Information Security
• Human Resources
• Legal
• Management
• Email Process Owners
• Email Users

These are similar to the stakeholders other organizations will have for the email system.


Determine Current Situation
Organizations must determine the specific needs for change: the need to improve upon the ways IT processes are currently being performed. This need must be documented to validate to the stakeholders why change is necessary. As a generic example, if your organization has $1000 of revenue coming in every hour of every day each week, and your changes typically cost you 5 hours of lost revenue each week, that calculates to $5000 of lost revenue per week, or $260,000 of lost revenue per year, as a result of changes. Making your Change Management process more efficient and effective could save you many hours of time that computes to many thousands of dollars of additional revenue. To demonstrate this, you will need to keep track of some critical measurements. Here are some key measurements to make when implementing the Change Management process:
• How many changes occur within the scope you've identified each month?
• How much time does it take to implement the changes?
• How many incidents and outages occur as a result of the changes?
• How many changes are backed out each month?
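The lost-revenue figures in the generic example can be checked with a short calculation. The function name is hypothetical, and the 52-week year is an assumption; it is the multiplier that turns $5000 per week into the $260,000 quoted above.

```python
def lost_revenue(revenue_per_hour, downtime_hours_per_week, weeks_per_year=52):
    """Return (weekly, annual) revenue lost to change-related downtime.

    Assumes revenue arrives at a constant hourly rate and that downtime
    is the same every week, as in the generic example.
    """
    weekly = revenue_per_hour * downtime_hours_per_week
    annual = weekly * weeks_per_year
    return weekly, annual


print(lost_revenue(1000, 5))  # (5000, 260000)
```

Substituting your own hourly revenue and the downtime you measure for your scope gives a first estimate for the business case.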

The Generic Manufacturing Company identified the following for their email system:
• Even though the same type of email system, Lotus Notes, was being used throughout the enterprise, there were four different email servers being separately maintained by different business units, each located in a different country from the others.
• Upgrades and patches were applied to each of the four email servers as determined by each maintenance team. This resulted in having different versions of Lotus Notes running on the four servers.
• End users reported their email problems to many different areas: sometimes to the IT staff administering one of the servers, sometimes to the central Help desk, sometimes to the Information Security area, and sometimes to their own managers.
• End users also reported email incidents, such as spam or phishing messages, to many different areas.
• A review of each mail server revealed that each server was unavailable because of changes, problems, or incidents anywhere from 1 hour to 8 hours per work week.


Identify Trouble Spots
Where are the trouble spots within your current processes? You must identify the troublesome activities and components within your current processes to ensure not only that you do not recreate them but, more importantly, that you resolve them with your implementation. There are many different tools, automated and manual, you can use for identifying your trouble spots: flowcharts, Pareto charts, and fishbone diagrams, just to name a few.
• A flowchart is a schematic representation, using well-defined symbols, of an algorithm or a process. A flowchart can be used to identify the flow or sequence of events within a process or service.
• A Pareto chart, named after Vilfredo Pareto, is a special type of bar chart where the values plotted are arranged in descending order. A Pareto chart can focus efforts on the problems that have the greatest potential for improvement by illustrating their relative frequency or size within a bar graph.
• A fishbone diagram, also called a "cause and effect" diagram or, after its creator, an "Ishikawa" diagram, shows the causes of a certain event. A fishbone diagram can allow team members to identify and graphically display all the possible causes related to a problem or condition to help determine the root causes.
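The ordering behind a Pareto chart can be computed before anything is drawn. The sketch below sorts causes by frequency in descending order and adds a cumulative percentage, which is the usual way to see which few causes account for most of the problems. The function name and the sample counts are invented for illustration and are not Generic Manufacturing Company data.

```python
def pareto_table(counts):
    """Return (cause, count, cumulative_percent) rows, largest count first."""
    total = sum(counts.values())
    rows = []
    running = 0
    for cause, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        running += n
        rows.append((cause, n, round(100 * running / total, 1)))
    return rows


# Made-up counts of problem causes:
sample = {
    "No training": 20,
    "No documentation": 40,
    "Voice mails": 10,
    "Different areas called": 30,
}
for cause, n, cumulative in pareto_table(sample):
    print(f"{cause:25s} {n:3d} {cumulative:6.1f}%")
```

With these invented numbers, the first two causes account for 70% of the reports, which is the kind of concentration a Pareto chart is meant to expose.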

The Generic Manufacturing Company ITIL implementation team decided they would use fishbone diagrams to identify where their trouble spots were for their email system Change Management, Problem Management, and Incident Management processes. Figure 5.2 shows the diagram they created to show the trouble spots within their email system Problem Management process. This provides an example of how you could use a fishbone diagram to help identify the trouble spots when you are planning for your ITIL Service Support processes.
[Figure 5.2 groups the trouble spots under six cause categories:]
Customers: Inconsistent problem reporting; Problems go unreported
Documentation: No documentation; Inconsistent documentation
Methods: Handwritten notes; Voice mails; Excel spreadsheets; Inconsistent problem handling
Contacts: Responsibility for handling problems is not assigned; Different areas are called
Linkages: Problems are not communicated to other business units; Problems are inconsistently communicated to other business units
Knowledge: No training for problem resolution; No experience for problem resolution

Figure 5.2: An example fishbone diagram for identifying trouble spots.


Perform Benchmarks
After you've identified the problem areas and the need for improvements, you can establish a benchmark. Why create a benchmark? Simple: to be able to determine how much you've improved your process following implementation and to help to continue process improvement. If you do not measure where your organization is (your benchmark), you will not be able to clearly show how much change has occurred as a result of implementing the ITIL processes. ITIL has a process maturity framework (PMF) that organizations can use to measure, or benchmark, a process and then to provide the context for measuring the maturity of the process as time goes on.
The PMF assumes that a Quality Management System (QMS) is in place and that there is a goal to improve one or more aspects of the process: effectiveness, efficiency, economy, or equity. A few of the QMS models ITIL uses include those by Deming, Juran, Baldrige, and Crosby.

Table 5.1 shows the five levels within the ITIL PMF.
Level | PMF | Description
1 | Initial | Little to no documentation or assigned responsibilities for the process.
2 | Repeatable | The process is documented but there are limited operational processes in place; it is not viewed as having significant importance.
3 | Defined | The process is documented and has an owner, objectives, and allocated resources; however, acceptance throughout IT may not exist.
4 | Managed | The process is well-documented and implemented throughout all the business units and IT. The process interfaces with other processes.
5 | Optimized | There is seamless integration of the process throughout IT and business areas. The process has become institutionalized as part of everyday activity.

Table 5.1: The ITIL PMF levels.

The Generic Manufacturing Company's assessment clearly and quickly shows that they are at Level 1 within the PMF.


Planning
When you have the trouble spots documented and the benchmark complete, it is time to formally document the plan to address the issues from your findings.
Document the Business Case
The single biggest factor in successfully implementing ITIL Service Support processes will likely be overcoming resistance to change. To overcome the challenge, partner with all your stakeholders to gain their buy-in for the changes and use their input to create the processes you are implementing. Use the information from your assessment and benchmark to build your business case. The business case is a detailed investment proposal; include within it an analysis of all the costs, benefits, and risks associated with the proposed investment of implementing Change Management, Incident Management, and Problem Management. Put the investment decision into the context of strategic business goals. Position the business objectives and goals with the options involved with each of the ITIL processes that impact the decision makers. Use one of the many management tools, automated or manual, to help you demonstrate the improvements that the business can realize by implementing the ITIL processes. The Generic Manufacturing Company created their report by linking the current situation with the business goals for revenue, and showed how much money they were losing through downtime and the employee time taken to respond to poorly executed changes, along with inconsistent handling of incidents and ongoing problems. They used Pareto charts to demonstrate their current situations and focused upon the problems and the alternatives that were projected to have the most positive impact upon the business. Figure 5.3 shows the Pareto chart for their problems, current and projected, following Problem Management process implementation. Use this example to inspire your plans to make the business case.

Figure 5.3: Example Pareto chart for Problem Management process implementation.
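
As a rough illustration of how the data behind a Pareto chart can be prepared, the following Python sketch sorts problem categories by lost hours and computes a running cumulative percentage. The category names and hour values are hypothetical placeholders chosen for illustration; they are not figures from the Generic Manufacturing Company.

```python
def pareto_table(hours_by_category):
    """Return (category, hours, cumulative_percent) rows, largest first."""
    total = sum(hours_by_category.values())
    if total == 0:
        return []
    rows = []
    running = 0
    ordered = sorted(hours_by_category.items(), key=lambda kv: kv[1], reverse=True)
    for category, hours in ordered:
        running += hours
        rows.append((category, hours, round(100 * running / total, 1)))
    return rows


# Hypothetical hours lost per problem category (illustrative values only).
sample_hours = {
    "Failed email changes": 50,
    "Unpatched servers": 30,
    "Password resets": 15,
    "Printer faults": 5,
}

for category, hours, cumulative in pareto_table(sample_hours):
    print(f"{category:<22} {hours:>4} h  {cumulative:>5}% cumulative")
```

Reading the cumulative column from the top shows how few categories account for most of the lost time, which is the argument a Pareto chart makes visually.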


Set Goals
You must establish documented goals to know whether your efforts are successful. Clearly define your organization's specific goals for implementing ITIL Service Support processes. The following example highlights the goals defined by the Generic Manufacturing Company.

The Generic Manufacturing Company's Goals for Implementing ITIL Service Support Processes

The goal for implementing a formal Incident Management process within the Generic Manufacturing Company is to restore normal service operation as quickly as possible, using consistent practices, and to minimize the negative impact on business operations to ensure the best possible quality and availability of IT service levels as defined within the IT SLA.

The goal for implementing a formal Problem Management process is to minimize the negative impacts of problems and the resulting incidents on the business. The Problem Management process will maximize IT services by correcting problems and preventing recurrences.

The goal for implementing a formal Change Management process is to ensure that standardized and consistent methods and procedures are used to efficiently and promptly handle all IT changes to minimize the impact of change-related problems upon IT service quality, thus improving the day-to-day operations of the organization.

Use the Generic Manufacturing Company Service Support goals as a basis for creating your own customized goals. Communicate the goals within your awareness messages.

Create the Implementation Plan
Carefully develop the implementation plan. Too many organizations spend too little time planning implementations, and then end up spending ten or twenty times longer on the actual implementation than they would have if they had invested more planning time up front. Think through the steps necessary for your own organization. Each business will be different. The Generic Manufacturing Company created detailed implementation plans for their ITIL Service Support processes. Table 5.2 shows a sample from their Incident Management process implementation.


Section  Action                                                          Start Date  Due Date
1        Scope
1.1      Identify areas impacted by the Incident Management process
1.2      Identify linkages to other processes
2        Roles
2.1      Identify existing Incident Management roles
2.2      Update roles and responsibilities
2.3      Identify personnel to fill roles and assume responsibilities
2.4      Identify CAB members
2.5      Identify project team roles
3        Awareness and Training
3.1      Identify groups that need targeted training
3.2      Identify awareness communications necessary
3.3      Decide whether to create training content or bring in from outside
3.4      Create awareness communications
3.5      Create timeline for sending awareness communications
3.6      Identify dates for providing training
3.7      Send awareness communications
3.8      Deliver training

Table 5.2: Sample from an Incident Management process implementation plan.

Use this sample as a starting point for creating your own customized plan. Remember, this is just a sample; yours will need to be more detailed.


Create Policies
You must document the high-level plans that describe your organization's goals with regard to the ITIL processes. This documentation is critical for describing management's decisions regarding their commitment, direction, and planned course of action. Create policies for each of the ITIL Service Management processes you decide to implement. The following list shows some of the Change Management policies created by the Generic Manufacturing Company:

• All changes within the enterprise email systems must follow the established and documented Change Management procedure.
• Each submitted change must have a detailed script describing the change.
• The Change Management Project Manager is responsible for reviewing the scripts and approving change requests before the change can be scheduled.
• Following change request approval, any deviation from the pre-approved script requires Change Management approval.
• Each change request must include the following information:
    • Activity description
    • Details of change justification
    • Possible impacts to the customer and other production systems
    • Scheduled start and end times
    • Necessary resources
• Each approved change request must have a documented fallback option.
• Any vendor contracted to do changes must do the work onsite to enable appropriate oversight of the activities.
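
To show how policies like these could be checked consistently, here is a minimal Python sketch that validates a change request against the required information listed above. The `ChangeRequest` class, its field names, and the `policy_violations` function are hypothetical names invented for this illustration; they do not come from any particular Change Management tool.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ChangeRequest:
    """Hypothetical record holding the information the policy requires."""
    activity_description: str
    justification: str
    possible_impacts: str
    scheduled_start: datetime
    scheduled_end: datetime
    necessary_resources: str
    fallback_option: str = ""


def policy_violations(request):
    """Return a list of human-readable policy problems (empty if none)."""
    problems = []
    required_text = {
        "activity description": request.activity_description,
        "change justification": request.justification,
        "possible impacts": request.possible_impacts,
        "necessary resources": request.necessary_resources,
        "fallback option": request.fallback_option,
    }
    for name, value in required_text.items():
        if not value.strip():
            problems.append(f"missing {name}")
    if request.scheduled_end <= request.scheduled_start:
        problems.append("scheduled end must be after scheduled start")
    return problems


# Illustrative request with no fallback option documented.
example = ChangeRequest(
    activity_description="Apply patch to email gateway",
    justification="Fixes recurring delivery delays",
    possible_impacts="Email unavailable for up to 30 minutes",
    scheduled_start=datetime(2024, 1, 13, 22, 0),
    scheduled_end=datetime(2024, 1, 13, 23, 0),
    necessary_resources="One email administrator",
)
print(policy_violations(example))
```

A check of this kind only confirms that the information is present. Reviewing whether the script and fallback option are adequate remains a job for the person approving the change.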

Use these Change Management policies as examples to help you get started with your own organization's policy development.


Identify Responsibilities
You will need to assign responsibilities within each of the ITIL Service Support processes. Most, if not all, of the people filling these roles will be the same as the implementation team members within the corresponding defined roles. However, these responsibilities differ in that they cover ongoing Service Support activities rather than implementation activities. These are the ITIL Service Support responsibilities the Generic Manufacturing Company defined:

• 1st Level Support: Registers and classifies incident reports, performs immediate actions, and keeps users informed about the situation status at specified intervals. If 1st Level Support cannot resolve the situation, it is transferred to the appropriate 2nd Level Technical Support Group.
• 2nd Level Support: Takes over situations that cannot be solved immediately by 1st Level Support. If necessary, requests external support, such as from software or hardware manufacturers. If the situation cannot be resolved, 2nd Level Support passes it on to Problem Management.
• 3rd Level Support: Resources within the hardware or software manufacturers whose services are requested by 2nd Level Support.
• Incident Manager: Responsible for the effective implementation of the Service Desk and Incident Management process. Carries out reporting procedures. The first point of contact for incidents.
• Problem Manager: Researches the root causes of problems and incidents. If possible, makes workarounds available to the Incident Management team. Develops final solutions for Known Errors.
• Change Manager: Authorizes and documents all IT infrastructure and configuration item changes. Determines and communicates the sequence of individual change stages. Involves the CAB when necessary.
• Release Manager: Responsible for consistently and effectively implementing changes to the IT infrastructure. Plans, monitors, and implements changes in coordination with Change Management.
• Configuration Manager: Prepares and makes available to the Service Management teams the necessary information about the IT infrastructure and services. Maintains the configuration items and related documentation for the components of the IT infrastructure. Documents changes and checks the updated information regularly. Automates the CMDB update process as much as possible.

Build upon these to fit your organization's requirements and needs.


Implementation
When the mission statements have been clearly articulated, they must be effectively communicated throughout the enterprise. If you do not make your personnel aware of the Service Support processes, they will not follow the processes or will not know what the processes require. The implementation plan and the purpose of each process must be clearly and effectively communicated to each of the key stakeholders. Ask for feedback to ensure that the needs from each of their areas are adequately incorporated into the process, as well as to obtain their ongoing support.

Train Personnel
Success cannot be accomplished with a group of implementation folks who have no knowledge about what must be done for each of the ITIL processes. Training the team members for each of your processes is yet another key to success for your process implementation. There must be a clear understanding of the goals for each of the processes. You can provide the team members with this knowledge through a wide variety of methods: assigned reading, classroom training, industry meetings, computer-based training (CBT), or an outside trainer who is an expert in the ITIL process. When implementing ITIL, it is also a good idea to have at least the leader for each of the ITIL processes invest the time necessary to attain ITIL Foundation Level Certification as well as practitioner-level certification in their specific ITIL process.
For more information about ITIL certifications see http://www.itilofficialsite.com/Qualifications/HowtoStart.asp.

Implement the Plan
At this point, you should have a very clearly documented, detailed implementation plan. Provide enough time between the development and implementation phases to allow for one of Murphy's Laws (everything will take twice as long as anticipated) and to allow for training to effectively occur. Make sure all stakeholders are given advance notice of the upcoming changes. Be sure to clearly communicate to them how they will or may be affected by the changes. Use a phased rollout approach to avoid a "big bang" type of situation. Small changes are difficult enough for personnel to deal with. If you try to throw many changes at them at once, you are setting yourself up for failure at best and disaster at worst.


Use Tools to Manage Change
It is important that any tools you choose to implement the ITIL processes actually follow or support the ITIL philosophy. Keep the following in mind when choosing your ITIL Service Support tools:

• Be sure to include the people who will be using the tools for different activities involved with the processes when identifying tool requirements and testing the tools.
• If you are replacing old tools with new ones, remember that there may be more users for the tools than there were before the processes were in place.
• Take into consideration the impact the considered tools will have upon the network and supporting infrastructures.
• Be sure to obtain enough licenses to cover all possible users.
• Start defining the requirements you have for supporting tools early in the Service Support planning processes; don't wait until you are planning to implement the processes.
• Assign a ranking or level of importance to each of the tool requirements you identify.
• Be sure to evaluate the tool in terms of how well it will meet your defined goals and requirements.
• Look for tools that are customizable, and consider how much effort and cost is involved with that customization.
• Take into consideration the amount of training and expertise the possible tools will require. Will you be able to obtain that expertise in-house, or will you need to go outside your enterprise?
• Be sure to test the potential tools thoroughly. Clearly define scenarios and choose testers from throughout your stakeholders to ensure you consider the perspective of all the ultimate tool users.
• Be sure to choose tools that will offer seamless integration with the other IT tools throughout the enterprise to reduce integration risks.
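
One simple way to act on the advice to rank requirements and evaluate each tool against them is a weighted score. The Python sketch below is only an illustration of that idea: the requirement names, importance weights, and 0 to 5 ratings are hypothetical, and a weighted average is just one of several reasonable scoring methods.

```python
def weighted_score(weights, ratings):
    """Average the ratings (0 to 5), weighted by requirement importance.

    Requirements missing from `ratings` count as 0.
    """
    total_weight = sum(weights.values())
    if total_weight == 0:
        raise ValueError("at least one requirement needs a non-zero weight")
    weighted_sum = sum(weight * ratings.get(name, 0) for name, weight in weights.items())
    return weighted_sum / total_weight


# Hypothetical requirement weights (higher means more important).
weights = {
    "Supports ITIL processes": 5,
    "Integrates with existing tools": 4,
    "Customization effort": 3,
    "Training required": 2,
}

# Hypothetical ratings gathered from stakeholder testing.
tool_a = {"Supports ITIL processes": 4, "Integrates with existing tools": 3,
          "Customization effort": 5, "Training required": 2}
tool_b = {"Supports ITIL processes": 5, "Integrates with existing tools": 4,
          "Customization effort": 2, "Training required": 3}

for name, ratings in (("Tool A", tool_a), ("Tool B", tool_b)):
    print(f"{name}: {weighted_score(weights, ratings):.2f}")
```

Because the weights come from your own requirement rankings, the score reflects your priorities rather than a vendor's feature list. It should inform the decision, not replace hands-on testing.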


Measurement
Chapters 2 and 3 provided key performance indicators (KPIs) to use to measure the success of Change Management, Problem Management, and Incident Management. Be sure to use them! You must also measure the success of your implementation activities as you go along. Tracking key measurements will help to ensure optimization of your IT investments and will be used to validate to your customers the effectiveness of the processes.

Review Status
Carefully track the status for each of the activities involved with each of the Service Support processes. You first need to know the current state of each of the processes you are implementing; these are your benchmark values. You then need to track how much time the implementation tasks are consuming. Once implementation is complete, you need to determine how much time the changes take. How much time does responding to incidents take? How much time does resolving problems take? Be sure to carefully document all these status metrics, not only to see your progress but also to answer executive management questions regarding your implementation progress.

Measure Goals
Look at the goals you documented during project planning. How close are you to meeting those goals? Document the goals that have been achieved, along with how close you are to meeting the goals that you have not yet met.

Measure Changes
Document the changes that have occurred not only within IT but also within the entire enterprise as a result of implementing the ITIL Service Support processes. Some of these changes may include:

• Reduced IT costs
• Greater business process productivity
• Improved communications
• Increased IT reliability

It is important to document the changes not only to ensure that improvements can continually be made within the processes but also to provide documentation to validate the value of the processes within the organization. Many organizations have a tendency to abandon processes once improvements have been made. Your business leaders must understand through your well-written documentation that the processes must be preserved, and improved upon as necessary, in order to keep those noticeable and measured improvements in place.

It is also important to be realistic in your efforts. You should not expect to achieve certified ITIL compliance right away, and you may never achieve it. However, with a successful implementation you should expect documented and noticeably improved processes, validated through your measurements.
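
The question "how close are you to meeting those goals?" can be answered with a simple calculation that compares the benchmark value, the current value, and the target. The Python sketch below shows one way to do this; the function name and the sample numbers (mean incident resolution time in hours) are hypothetical and chosen only for illustration.

```python
def progress_toward_goal(benchmark, current, target):
    """Return percent progress from the benchmark toward the target (0 to 100).

    Works whether the goal is to lower a value (such as resolution time)
    or to raise it (such as availability).
    """
    if target == benchmark:
        raise ValueError("target must differ from the benchmark")
    fraction = (current - benchmark) / (target - benchmark)
    # Clamp so regressions report 0 and overshoots report 100.
    return max(0.0, min(1.0, fraction)) * 100


# Hypothetical example: mean incident resolution time in hours.
benchmark_hours = 8.0   # measured before implementation
target_hours = 4.0      # documented goal
current_hours = 5.0     # latest measurement

print(f"{progress_toward_goal(benchmark_hours, current_hours, target_hours):.0f}% of the way to the goal")
```

Clamping the result hides how far a metric has regressed or overshot, so keep the raw benchmark, current, and target values in your status reports alongside the percentage.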


Document Problems and Vulnerabilities
No ITIL Service Support implementation will occur without snags. And there are always vulnerabilities within the enterprise that will put certain portions of the processes at risk. It is important to identify and document these problems and vulnerabilities so that you can most effectively address them. Some of the common problems and vulnerabilities experienced by organizations include:

• Lack of strong and visible commitment from executive management
• Shortage of resources needed for implementation of one or more of the processes
• Shortage of resources needed for ongoing maintenance of one or more of the processes
• Lack of personnel and/or implementation team awareness and understanding of the processes and how they apply to their respective job responsibilities
• Customer service levels or the processes not clearly defined or inadequately defined
• Poor integration with other processes
• Personnel resistance to change
• Workarounds are not effectively or consistently shared with other support staff
• Change updates are not communicated
• Lack of established customer service levels
• Poor or non-existent tools to support the Service Support processes

Plan for Ongoing Management
As I stated earlier, implementing ITIL Service Support processes is not a one-time activity. To keep the processes effective, organizations need to develop and consistently follow plans for ongoing management and operations activities. These activities can be simplified and made more efficient through automation. The following are some examples of how ongoing management and activities can be automated:

• Automated auditing tools can be used to compare the documented infrastructure with what the infrastructure actually looks like.
• Change detection tools can be used to identify changes within the infrastructure and compare them with the approved changes documented within the Change Management database.
• Detection tools can alert IT staff of intrusions and possible threats and vulnerabilities.
• Incident tracking software can be used to identify incidents related to specific changes. This will make it easier to report on them separately from other incidents and will enable incident trends and summary reports to be more easily created.
• IT service monitoring can show the service impact resulting from changes, incidents, and problems.
• Network search tools can be used to identify all the devices throughout the enterprise. Some are especially helpful because they can identify software versions and hardware configurations.
• Report automation software can standardize, speed up, and simplify reporting processes and tasks.
• Self-service tools can allow end users to perform ITIL Service Support activities themselves, without involving other personnel, which will allow the other personnel to continue with their other job responsibilities.
• System infrastructure monitors can be used to monitor device availability during planned outages, changes, incident response, and problem resolution.
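
The change detection idea in the list above comes down to comparing two sets: the changes actually observed in the infrastructure and the changes approved in the Change Management database. The following Python sketch illustrates that comparison under simplifying assumptions. Each change is represented as a hypothetical (configuration item, new version) pair, and the sample data is invented; real tools collect and match this information in their own ways.

```python
def compare_changes(detected, approved):
    """Compare detected changes with approved ones.

    Returns a dict with:
      - "unapproved": changes seen in the infrastructure but never approved
      - "not_applied": approved changes that were not detected
    """
    detected_set = set(detected)
    approved_set = set(approved)
    return {
        "unapproved": sorted(detected_set - approved_set),
        "not_applied": sorted(approved_set - detected_set),
    }


# Hypothetical sample data: (configuration item, new version).
detected_changes = [
    ("mail-server-01", "2.4"),
    ("db-server-02", "11.3"),
]
approved_changes = [
    ("mail-server-01", "2.4"),
    ("web-server-03", "1.9"),
]

report = compare_changes(detected_changes, approved_changes)
print("Unapproved changes:", report["unapproved"])
print("Approved but not detected:", report["not_applied"])
```

Both lists are worth reviewing: unapproved changes point to the process being bypassed, while approved changes that were never detected may indicate failed or postponed work.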

Summary
IT services are an integral part of all business processes within most organizations. ITIL provides a collection of best practices for performing effective Service Support processes. All ITIL Service Support processes are linked and, theoretically, are dependent upon each other. However, to be most successful with ITIL implementation, organizations must first identify where their most pressing problems exist within the enterprise, then establish a manageable scope for initially introducing and implementing the Service Support processes into the enterprise. The processes that most organizations are likely to implement beneficially are Change Management, Problem Management, and Incident Management.

Implementation of the chosen ITIL processes within the chosen scope must be strongly and clearly supported by executive management. The implementation plan must be carefully and thoroughly developed. Mistakes within the implementation plan will not only result in a loss of investment cost and personnel time but also likely result in pushback from the stakeholders against further implementation of the processes, and may even damage the business through disruption of services.

Planning and providing a coordinated approach to IT Service Support design, implementation, and maintenance, otherwise referred to as application life cycle support, will help to ensure IT operations areas deliver services designed to meet business requirements. Coordinating input from the stakeholder and business unit areas will allow the IT organization to better meet service-level requirements and choose new technologies to re-engineer business processes, resulting in greater effectiveness, more efficiency, and positive business impact.

