LITA Lean IT Kaizen Publication

Lean IT Kaizen
Official Publication
V 1.03
October 2015
Table of Contents
Scope and Purpose
Target audience
1 Introduction
1.1 Definitions
1.2 The Kaizen Mindset
1.3 Improvement methods
1.4 DMAIC
1.5 Lean and problems
1.6 Problems in IT
2 Organizing Kaizen
2.1 Daily Kaizen
2.2 Improvement Kaizen
3 A3 Method
3.1 A3
3.2 Contents of a Problem-solving A3
3.3 A3 Status Report and A3 Proposal
3.4 Skills for completing an A3
3.5 Building communication
4 Define Phase
4.1 Problem Statement
4.2 Validating the problem
4.3 Types of problems
4.4 Validating the value of solving the problem
4.5 Ensuring Support for a kaizen
4.6 Stakeholder analysis
4.7 Define phase and A3
4.8 Key steps in the Define phase
4.9 Case Study: Define Phase
5 Measure Phase
5.1 Data
5.2 Measurement Systems
5.3 Baseline and Benchmark
5.4 Value Stream Map
5.5 Measure phase and A3
5.6 Key Steps in the Measure phase
5.7 Case Study: Measure Phase
6 Analyze Phase
6.1 Seven Basic Tools of Quality
6.2 Finding the root cause
6.3 Analyzing a Value Stream Map
6.4 Analysis in IT
6.6 Key Steps for Analyze Phase
6.7 Case study: Analyze Phase
7 Improve Phase
7.1 Idea generation
7.2 Option selection and prioritization
7.3 Testing solutions
7.4 Solutions used in IT
7.5 Improve phase and A3
7.6 Key Steps for Improve Phase
7.7 Case Study: Improve Phase
8 Control Phase
8.1 Achieving Control
8.2 Control Plan
8.3 Monitoring
8.4 Communication Plan
8.5 Closure
8.6 Control phase and A3
8.7 Key steps in the Control phase
8.8 Case Study: Control Phase
9 Appendix 1: References
9.1 Lean Six Sigma Pocket Toolbook (chapters 1-4, 9)
9.2 Understanding A3 Thinking
9.3 A Leader’s Framework for Decision Making
10 Appendix 2: Glossary
11 About the author
11.1 Niels Loader
Scope and Purpose
The purpose of this document is to support the Lean IT Kaizen qualification. The exam questions can all
be answered based on information in this document.
Target audience
The target audience for this document is:
•• Candidates for the Lean IT Kaizen Exam

•• Accredited Training Organizations
Copyright notes
ITIL® is a registered trade mark of AXELOS Limited.
COBIT® is a trademark of ISACA® registered in the United States and other countries.
PRINCE2® is a Registered Trade Mark of AXELOS Limited.
PMI® is a registered Trade Mark of the Project Management Institute, Inc.
Acknowledgements
The author would like to thank everyone who put their time and effort into improving this document.
The author would especially like to thank Troy DuMoulin (Pink Elephant) for the inspiring discussions
to get the right content into the Kaizen syllabus and the first reviews.
Many thanks to the following people for their critical reviews, which helped to improve this
publication:
•• Mike Orzen, Mike Orzen and Associates, Member of Lean IT Association Content Advisory
Board
•• Barry Fairingside, APMG
•• Gary Case, Pink Elephant
•• Rita Pilon, Exin, Member of Lean IT Association Content Team
•• Hans van den Bent, CLOUD-linguistics
•• Marianne Hubregtse, Exin
•• Natasja Soselisa, Quint Wellington Redwood
•• Ilona op de Weegh, Quint Wellington Redwood
1 Introduction
Continuous improvement is one of the pillars of Lean and Lean IT. Ensuring that an IT organization is
competent at continuous improvement in line with the interest of the customer(s) is absolutely vital
to the success of Lean within IT.

In the Lean IT Foundation, we looked at the basics of Kaizen (continuous improvement) and the DMAIC
problem-solving method. In this document, we will build on this material to help you on your journey
to mastering continuous improvement, and becoming a Lean IT Kaizen Lead. The Lean IT Kaizen Lead is
someone who is involved with Lean improvement and could be at any level of the IT organization, in any
‘department’.

We will be using terms defined in the Lean IT Foundation publication. Where a term is central to this
document, we may repeat Foundation material. If you are no longer familiar with a particular term,
please refer to the aforementioned document.

We will take an in-depth look at the key aspects of organizing and running a Kaizen event. We will
also investigate the DMAIC problem-solving method in substantially more detail than we did in the
Foundation document. On top of this, we will use the A3 method to record and communicate the findings
of our Kaizen event.

1.1 Definitions

There are three main words used within the domain of improvement: Kaizen, Kaikaku and Kakushin.

Kaizen is the Japanese word for continuous improvement using small incremental changes. It translates
as change for the better. Kai means change, Zen means for the better. Kaizen is an approach for solving
problems and forms the basis of incremental continual improvement in organizations. A problem is a
difficulty that has to be resolved or dealt with. When applied to the workplace, Kaizen means
continuous improvement involving everyone, managers and workers alike, every day and everywhere,
providing structure to process improvement. Kaizen is about continuously improving: every day, everyone
and everywhere. Many small improvements implemented with Kaizen produce faster results with less risk.
In IT terms, we can equate this to a minor update to a piece of software.

Lean also recognizes that there are moments when a more radical step change is necessary. This type of
change is known as Kaikaku. This refers to a revolutionary change to the existing situation. Following
the software example, Kaikaku would be the upgrade of an application currently in use from one release
level to a new release level. Software providers will often substantially change both the technical
basis of the software and its functionality. For both IT and the user community, this means a large
step change.

A third type of improvement known within Lean is Kakushin. The idea here is that some change will form
a complete departure from the current situation. It is about innovation, transformation, reform and
renewal. Again, in our software example, this may mean replacing a complete application with a
different application that supports the process in a completely different way, for example a
web-based application that fully
automates the registration of orders, the submission of invoices and the generation of a picking order
at order fulfillment. This kind of change will entail the disappearance of many roles and functions
within a business. Both from technological and business process perspectives, this example represents
a complete departure from the current way of working. Another example of Kakushin is where the
organization standardizes a process and supporting software across the entire organization where
previously various groups had different processes and applications to achieve similar goals.

In this document, we will focus on the use of Kaizen within IT organizations. The reason for this is
that purposeful improvement of IT services to customers is, generally, not done consistently and
continuously within IT organizations. The aim of this document is to describe how to embed purposeful
continuous improvement into any IT organization.

1.2 The Kaizen Mindset

As we already described in the Lean IT Foundation, Lean is a way of thinking and acting. We will be
discussing what needs to be done to successfully introduce and execute kaizen within an IT
organization. At each step, we will discuss critical ‘thinking’ aspects.

Before we can start, we must investigate the starting point of Kaizen, and that is developing a Kaizen
mindset. What do we mean by this? We mean that there must be a belief throughout the IT organization,
both among managers and employees, that improving IT services and the way they are delivered can and
must be done on a daily basis.

So what are the core elements of a Kaizen mindset?

1. Seeing and prioritizing problems: are both managers and employees truly prepared to uncover
problems, accept them as a part of daily life and initiate action to identify the problems that most
need solving?

2. Solving problems: are both managers and employees prepared to invest time and other resources to
understand the root causes of problems and resolve problems completely?

3. Sharing lessons learned: are both managers and employees driven to share the lessons learned as a
result of solving problems with others in the IT organization, so that they may benefit from the
lessons learned?

It is important to note at this point that Problem Solving is not about reactively waiting for
problems to appear and then resolving them as they occur. A problem-solving mindset is to first
establish a desired state for the service and/or process, understand the current baseline and gap and,
then, to incrementally close the gaps towards the desired state through Kaizen improvement steps. The
essence is that identifying problems and solving their root causes drives individual and
organizational learning.

There are, in fact, two types of kaizen: Improvement Kaizen and Daily Kaizen. The one we will be
dealing with in detail in this document is the former and we will refer to this as kaizen. It is
focused on carrying out kaizen events to bring about incremental change. Daily Kaizen will be further
discussed in the Lean IT Leadership certification.

Daily kaizen is more closely related to the kaizen mindset as it entails continuously looking
at the environment in which we operate and changing things to make it easier for the people in this
environment to deliver a higher level of customer value, more quickly and more consistently. Why is
daily kaizen more closely related to the kaizen mindset? Because daily kaizen means being constantly
alert to minor (and major) issues that need to be addressed directly.

A simple example of daily kaizen is the following. Imagine a printer on a table. The paper for the
printer is stacked in boxes, each containing five packs of paper, under the table. The result is that
whenever the paper drawer is empty, someone must bend under the table to get a pack of paper out of a
box. A daily kaizen action would be to mark out a rectangle the size of a piece of paper with red tape
on the table next to the printer. A box of paper is placed on the rectangle. When the last pack of
paper is used and the box is discarded, the red tape will signal that a new box of paper needs to be
put on the table. This simple change means that people will mostly no longer need to bend to get paper
for the printer’s paper drawer. Since this makes life more pleasant for everyone for a substantial
length of time, we can see this as a small improvement, and an example of daily kaizen.

Both types of kaizen (daily and improvement) must be present in an IT organization for it to be able
to say that it is continuously improving.

In Lean IT, our mindset is that we accept that our world is filled with problems and we act to solve
the problems on a continuous basis.

1.3 Improvement methods

The most well-known continuous improvement method is the Shewhart Cycle, often referred to as the
Deming Circle. This is the Plan-Do-Check-Act (PDCA) sequence. This cycle is applicable in any
situation, and forms the basis for all improvement within Lean. Its premise is that by following the
plan, do, check and act steps in that order, we are able to purposefully take steps to improve the
capabilities of individuals, organizations, processes and technology. The Deming Circle consists of
the following steps:
•• PLAN: Establish a desired future state or reference, establish the current gap, and define the plan
to design or revise the business process components to improve results
•• DO: Implement the plan and measure its performance
•• CHECK: Assess the results of the mitigation actions through measurements and report the results to
decision makers
•• ACT: Decide on changes needed to improve the process

The Deming Circle creates a feedback loop to ensure that improvements are identified and implemented.

Within IT, the IT Infrastructure Library¹ framework identifies a Continual Service Improvement cycle.
This cycle uses the PDCA-cycle as its basis. Other frameworks and reference models within IT, such as
COBIT² and ISO/IEC 20000, all contain improvement cycles based on the Deming Plan-Do-Check-Act.

¹ ITIL® is a Registered Trade Mark of AXELOS Limited.
² COBIT® is a Registered Trade Mark of ISACA.

1.4 DMAIC

Within Lean IT, we also recognize the Plan-Do-Check-Act cycle as an integral part of continuous
improvement and recommend its use in all circumstances. We have, however, chosen a more specific
problem-solving method
to support the actual execution of Kaizen within IT organizations.

Kaizen events consist of five phases, running from a problem statement to an embedded improvement.
These phases are: define, measure, analyze, improve and control, also known as DMAIC, the preferred
method for problem-solving. This method has proven in practice to be easy to understand and adopt, and
suitable for the majority of problems encountered within IT.

•• In the Define step, we define the problem statement, describe the goal statements, analyze the cost
of poor quality, define the scope with a SIPOC (suppliers, inputs, process, outputs, customers)
diagram, establish the Kaizen project team, create the project charter and planning, get stakeholders’
support and start the project.
•• In the Measure step, we build understanding of current KPIs and performance, develop the Critical
to Quality (CTQ) flowdown, write a data collection plan, try to understand process behavior and
variation, and relate current performance to the Voice of the Customer.
•• In the Analyze step, we collect data and verify the measurement system, study the process with
Value Stream Mapping, identify the types of waste, develop hypotheses about the root cause, analyze
and identify the data distribution and study correlation.
•• In the Improve step, we generate potential solutions by brainstorming, design assessment criteria
for impact and feasibility, decide the improvement to implement, implement or pilot the improvement
and measure the impact on the CTQs.
•• In the Control step, we implement ongoing measurement, we anchor the change in the organization
through effective controls, and we quantify the improvement, capture the learning, and replicate it
across the board. We write the project report and close the actions for our project.

Our objective is to improve the delivery of value to the customer. For this we apply Lean principles
and techniques to the work of IT. We use five dimensions to support the effectiveness of our
improvement activities. We use the DMAIC steps in a disciplined way to solve problems and learn from
them. This is how we continuously improve business performance, through continuously improving IT.

Taking a brief look at Kaikaku and Kakushin: DMEDI (Define, Measure, Explore, Develop, and Implement)
is an approach that, like DMAIC, is strongly based on data and statistical analysis, and can be used
for more radical improvements. It requires the application of creativity in using data to design new
processes, products and services. DMEDI aims at taking a step-change leap over existing processes,
products, or services, and seeks to generate a competitive advantage.

DMEDI is principally used in situations where existing processes, products or services work so poorly
that they need to be designed from scratch, or where the gap between the desired state and the current
performance remains huge. DMEDI can be used if an IT service or process continues to fail to meet
customer expectations even after DMAIC has been used. The application of DMEDI requires a longer lead
time and considerable resources compared to DMAIC. DMEDI is unsuited for Kaizen.
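As an illustration of working through the five DMAIC phases in a disciplined order, the following is a minimal sketch in Python. It is not part of the Lean IT material: the names `Phase`, `KEY_OUTPUT` and `next_phase` are hypothetical, and the single output chosen per phase is simply one item picked from the step descriptions in this section.

```python
from enum import IntEnum


class Phase(IntEnum):
    """The five DMAIC phases, in the order in which they are followed."""
    DEFINE = 1
    MEASURE = 2
    ANALYZE = 3
    IMPROVE = 4
    CONTROL = 5


# One illustrative output per phase, picked from the step descriptions
# in this section (an assumption made for this sketch, not a complete list).
KEY_OUTPUT = {
    Phase.DEFINE: "problem statement",
    Phase.MEASURE: "data collection plan",
    Phase.ANALYZE: "root cause hypotheses",
    Phase.IMPROVE: "piloted improvement",
    Phase.CONTROL: "project report",
}


def next_phase(current, completed_outputs):
    """Return the phase that follows `current`, or None after CONTROL.

    Raises ValueError when the key output of the current phase is not
    yet complete, so that phases cannot be skipped.
    """
    if KEY_OUTPUT[current] not in completed_outputs:
        raise ValueError(f"{current.name} is missing: {KEY_OUTPUT[current]}")
    if current is Phase.CONTROL:
        return None
    return Phase(current + 1)
```

For example, `next_phase(Phase.DEFINE, {"problem statement"})` returns `Phase.MEASURE`, while calling it for a phase whose output is missing raises an error instead of moving on.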
1.5 Lean and problems

Lean has a relatively relaxed relationship with problems. Particularly in Western organizations,
problems do not always appear to be particularly welcome. There is a tendency to believe that higher
levels of management prefer to hear a positive story, and are not open to problems. Whether this is
true or not is irrelevant; the result is more important, and that result is that problems are not
proactively identified, and are often ignored or polished over.

Lean takes a completely different view. Problems are a fully accepted part of people working together.
In fact, the way Lean looks at problems can be summarized by:
•• Most problems are solvable (or partially solvable, or at least their impact can be minimized).
•• Problems are opportunities to make some good things happen.
•• Problems are challenges that encourage people to overcome them.

Lean sees problem-solving as a Leadership activity regarding the identification of the future or
desired state and the relentless pursuit of closing the current gaps between the desired state and the
current baseline. Problem solving is about establishing the way forward, making the difference between
the status quo and a better situation.

When we solve a problem within an IT organization, in essence, we are removing muri, mura and/or muda
from people, process and/or technology.

1.6 Problems in IT

IT is one of the fastest developing areas of society; the phenomenal increase in processing power,
transmission of data across networks and amount of data generated is without historical precedent.
The result is that within IT we are continuously confronted with new situations. Inevitably, new
situations generate new problems that need to be solved.

However, not only new situations require problem-solving. Within IT, we are confronted on a daily
basis with disruptions to existing services caused by unplanned outages or failures. These service
outages impact the users of IT services and require support. Support means restoration of the service
via Incident Management but also the application of Problem Management practices to identify the root
cause and establish solutions to ensure the disruption does not occur again in the future.

The necessity to solve problems permanently has long been recognized within IT. The term ‘Problem’ is
defined in ITIL as ‘the cause of one or more incidents’. In this document, we will refer to this term
with a capital P, in order to distinguish it from the more generic ‘problem’. The Problem is one of
the key units of work within an IT organization.

Problem Management is one of the core operational IT processes, as defined in ITIL. Its aim is to
prevent problems and incidents, eliminate repeating incidents and minimize the impact of incidents
that cannot be prevented.

Problem Management is made up of two parts:
•• The first part is aimed at uncovering the root cause of incidents. A Problem for which the root
cause and a workaround are known is called a Known Error. A workaround is a way of reducing the impact
of the problem and associated incidents when the full resolution is not yet known.
•• The second part focuses on removing the
Problem from the IT service infrastructure.
In many cases, this is done by carrying out a
change.
The DMAIC methodology is completely
compatible with the Problem Management
process. In fact, using DMAIC to solve technical
problems provides additional structure and
discipline. It also helps to broaden the scope,
identification and resolution of problems
in other Lean IT areas such as Process,
Performance, Organization and Behavior &
Attitude.
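The relationship between a Problem, a Known Error and the change that removes the Problem, as described in this section, can be sketched as a small data structure. This is only an illustration under stated assumptions: the class and field names are hypothetical, and the status labels ‘under investigation’ and ‘resolved’ are our own wording rather than ITIL terms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Problem:
    """A Problem: the cause of one or more incidents (field names are hypothetical)."""
    summary: str
    root_cause: Optional[str] = None   # filled in by the first part of Problem Management
    workaround: Optional[str] = None   # reduces impact while the full resolution is unknown
    removed_by_change: bool = False    # set by the second part, usually through a change

    def is_known_error(self) -> bool:
        # A Known Error is a Problem for which the root cause and a workaround are both known.
        return self.root_cause is not None and self.workaround is not None

    def status(self) -> str:
        # The labels other than "known error" are assumptions made for this sketch.
        if self.removed_by_change:
            return "resolved"
        return "known error" if self.is_known_error() else "under investigation"
```

A Problem with only a root cause recorded is still ‘under investigation’ in this sketch; it becomes a Known Error once a workaround is recorded as well.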
2 Organizing Kaizen
As we saw earlier, the aim of kaizen is to ensure that problems are identified and solved, and that
lessons learned are shared within the IT organization. It is vital that leaders within the IT
organization emphasize the need for the kaizen mindset.

2.1 Daily Kaizen

Daily kaizen is the act of responding to everyday occurrences such as incidents, mistakes and other
quality issues and addressing quality issues at the source rather than being satisfied with quick
fixes. It is highly dependent on management and technical leaders adopting the kaizen mindset, to
ensure that they provide their staff with the authority and time necessary to address the quality
issue. The focus of leaders on uncovering and dealing with problems is vital in encouraging employees
to see and tackle problems they meet in their daily work.

Daily kaizen is about ‘stopping the line’ when a problem is uncovered. This is principally a kaizen
mindset issue: do we continue programming and let the testers find the errors in the code, or do we
create an environment in which quality is built in at the source, even if this means stopping or
delaying delivery? What is the ‘andon cord’ in your IT organization? Daily kaizen may well lead to
quick fixes of everyday problems. However, it tends to focus on solving smaller and simpler problems.
The analysis required is less intense than in Improvement Kaizen. This automatically means there is
less deep learning achieved through daily kaizen.

Within IT, we see that daily kaizen is an integral part of the daily and weekly meeting structures we
discussed in the Lean IT Foundation. Both the day start and the week start give ample opportunity to
discuss problems. These problems may give rise to short-term, quick solutions, but may also trigger an
improvement kaizen.

2.2 Improvement Kaizen

This is the most popular and visible form of kaizen within IT. Simply said: improvement kaizen is
about bringing together a group of people who have an interest in having a particular problem solved,
and getting them to solve this problem. It is sometimes referred to as a kaizen event. Improvement
kaizen does have the drawback of requiring a substantial time investment and the results may not
always be as successful as desired.

This sounds simple, but it generally requires some organization and management to ensure that the
right people are involved and the right things happen. In this chapter, we will look into the
governance and organizational aspects of kaizen events.

2.2.1 Sources of kaizen initiatives

Most organizations have no shortage of known errors or opportunities for improvement. Deciding which
Kaizen initiatives deserve the resources involves deciding which is most important to the customer and
the organization. Having made this primary decision, we must check the feasibility of the initiative.

As we saw in the Lean IT Foundation, kaizen initiatives may arise from one or more sources. These
sources are known as the ‘Voices’.
•• The most important voice is the Voice of the Customer (VoC) which gives the
IT organization feedback on how the customer, the user of the IT service, actually experiences the IT service. The only person who can truly give us this feedback is the person who uses the IT service. There are, of course, other voices that help us to uncover problems:
•• Voice of the Business (VoB): for IT, this concerns the ‘business’ of the IT organization itself; not to be confused with the fact that the customer of IT is regularly referred to as “the business”. Even if the VoC does not identify any problems, the VoB may well find problems to be solved. An example could be that the customer is very happy with the quality of the IT services, but the Voice of the Business tells us that cost levels are too high and that budgets will be exceeded before the end of the year. The VoB would indicate that the IT organization needs to carry out a Kaizen to understand where cost is excessive and how it can be reduced.
•• Voice of the Process (VoP): this is about processes not working correctly. Again, the VoC may indicate that the results of the process are satisfactory and the VoB may not have any issues with the costs or quality. However, the process may indicate that, for example, even though changes are delivered on time and with few incidents, the variability of the process gives cause for concern.
•• Voice of the Regulator (VoR): it may seem that regulators primarily have their sights set on particular business sectors. IT is also directly affected by regulators. The Sarbanes-Oxley Act specifically stipulates how IT must create an audit trail for changes. As IT becomes more entwined with the primary processes of business, or even replaces these primary processes with systems that only require humans to see the exceptions, IT will find itself more directly affected by the regulator.

2.2.2 Kaizen team
In order to solve a problem using kaizen, we must accept that the problem is not solvable by an individual; that it is only with the power of a diversity of points of view that the problem will be adequately addressed.

This brings us to two major questions:

1. How many people do I need in the kaizen team?
2. Which roles are there in the team?

To start with the first question, practice has shown that 5 to 8 people is the optimum range. With fewer than 5 participants, the diversity of points of view can be compromised and the work that needs to be done is spread over a small group. It has also been found that where larger teams are required, the scope of the problem is probably too large.

Within the team, we find three basic roles:

•• Kaizen sponsor: this is the owner of the problem, the person who has a direct interest in having the problem solved. In some cases, we may find that the manager of the problem owner is identified as the kaizen sponsor. Generally, this will happen for budgetary or visibility reasons. This person must want to see the kaizen event through to its conclusion, i.e. resolution of the problem. Without this person, there is no point in carrying out a kaizen event, especially when time (and maybe some money) will be spent understanding and solving the problem. The kaizen sponsor must have affinity with the problem and must also be prepared to do what is necessary to get the problem solved. This does not mean that the resolution can be at any price.
•• Kaizen lead: this person manages the kaizen process on behalf of the sponsor and the team. This role ensures that the correct steps are followed as efficiently as possible, so that
the right actions can be taken as quickly as possible to remove the problem. This person must be experienced in managing the kaizen process and ensuring that the team stays on track. A kaizen lead must have facilitation and team-building skills in order to turn the group into an effective team in a short time.
•• Kaizen team member: the people executing this role will do the required work. They must be involved with the problem as it occurs on the work floor. They must have intimate knowledge of the process in which the problem occurs, i.e. they must work in the process on a daily basis. It is useful to have people who are ‘upstream’ and ‘downstream’ of where the problem occurs. Also, having someone who is involved with the problem but can look at it from a dispassionate point of view can be useful to avoid tunnel vision.

Selecting the correct team members for a kaizen team is the next challenge. It is clear that we need diversity. This means the team must include people who work in the process, but also a manager who is close to the process, but not necessarily the manager of the process (who may be the sponsor). The team will need technical skills, e.g. understanding the technology involved in supporting the process, or business and regulatory rules governing the process.

2.2.3 Preparing a kaizen event
The DMAIC cycle exists within an organizational context. As we saw in the Lean IT Foundation, Lean IT organizations work with visual management as part of the Jidoka principle. Jidoka is all about creating an environment in which disturbances to the flow of work through the value streams are made visible, i.e. problems are not left covered up.

In day-to-day working within a Lean IT organization, we see that problems are brought up on a daily and weekly basis. The problems are posted on the improvement board. There is thus always a ready inventory of problems to be solved. This inventory of problems will contain both daily kaizen initiatives that need to be picked up and problems that need some more attention in the form of a kaizen event.

It is from this inventory of problems that the problem with the highest priority must be picked and investigated through a kaizen event. How priority is defined will be discussed later.

The fact that a problem has found its way onto an improvement board means that there is someone who thinks it is important enough to need a resolution. The question is: does this person have the support of others, especially those in the position to allocate resources for the resolution of the problem?

Assuming there is sufficient need to solve a particular problem, usually the kaizen sponsor or a small team of people including the sponsor will create a short kaizen charter in which the problem is described and an indication is given of what resources (people, time, money) are allocated to the resolution of the problem. Also, the time within which a solution should be found will be indicated. This means that an initial stakeholder analysis must have been done.

Based on the kaizen charter, the kaizen event can be planned and prepared. This means organizing basic things such as:

•• a location where the kaizen team can meet
•• whiteboards, flip-overs, marker pens, post-it notes
•• access to data sources
•• invitations to all participants, including the sponsor and kaizen lead

On top of this, there must be agreement on
how and when the team will communicate progress. The minimum communication must be a daily update of the improvement board containing the relevant problem. This should be supplemented by regular submission of the current state of the A3.

Planning a kaizen is often described as a relatively straightforward affair in which activities are planned in a week. Ideally, a kaizen is planned within a short time, wherein the kaizen team dedicates their time to solving the problem.

In practice, within an IT organization, this kind of planning is quite difficult. Especially at the start of a Lean IT transformation, the organization is not attuned to the fact that people are out of the ‘production’ process for a full week. Even after a transformation has taken hold of the IT organization, it remains difficult to clear agendas completely to focus on a kaizen.

A more realistic way of planning the kaizen is to set up five or six meetings of three hours per meeting at regular intervals over a period of two weeks. This gives engineers the time to carry out operational work in the meantime. The agreement must also be made that work related to the kaizen be carried out in between meetings, e.g. data collection, processing of data or preliminary analysis.
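
The difference between the two planning styles can be made concrete with a little arithmetic. The sketch below is illustrative only: the 40-hour dedicated week and the ten working days in two weeks are assumptions, not figures from this publication, and `meeting_plan` is a hypothetical helper.

```python
MEETING_HOURS = 3          # from the text: three hours per meeting
WORKING_DAYS = 10          # assumption: two weeks of five working days
DEDICATED_WEEK_HOURS = 40  # assumption: one full week out of 'production'


def meeting_plan(n_meetings):
    """Return (total meeting hours, working days on which to meet).

    Meetings are spread as evenly as possible over the two-week period,
    starting on working day 1 and ending on working day 10.
    """
    step = (WORKING_DAYS - 1) / (n_meetings - 1)
    days = [round(i * step) + 1 for i in range(n_meetings)]
    return n_meetings * MEETING_HOURS, days


for n in (5, 6):
    hours, days = meeting_plan(n)
    share = hours / DEDICATED_WEEK_HOURS
    print(f"{n} meetings: {hours} h in sessions "
          f"({share:.0%} of a dedicated week), on working days {days}")
```

Under these assumptions the sessions take 15 to 18 hours in total, so the time spent on kaizen work between meetings needs to be agreed explicitly, as noted above.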

These preparatory activities are exactly that: preparation that gets the kaizen event to its point of initiation. In the Define phase, we will see how this input is validated, enriched and brought to a point that the problem can be fully investigated.
3 A3 Method
One of the powerful tools that Toyota has institutionalized within Lean is working with A3 reports. It supports and promotes continuous improvement, and is based on the PDCA cycle.

3.1 A3
A3 is not a clever acronym; it simply refers to the size of a piece of paper. A3 is 29.7 cm by 42 cm (11.7 in by 16.5 in). It is twice the size of A4 and half the size of A2. The beauty of the A3 sheet is that it provides enough space to explain a relatively complicated story, but limits the writer in their verbosity. The aim of the A3 is to encourage conciseness in the communication of a message. It also works as a checklist to ensure strict adherence to the chosen problem-solving methodology, in our case DMAIC.

It is important to understand that there is no hard and fast way to complete an A3 problem-solving sheet. Most A3 sheets tend to have 7 or 8 sections, as we will see below. However, if you wish to have 5 sections focusing on DMAIC, then this is acceptable. It is important that the problem-solving A3 covers the complete PDCA cycle. The key determinants for a good A3 sheet are:

1. Does it help the team compiling it to follow a structured problem-solving method?
2. Does it help people who need to take action on the outcome to understand the logic that led to the outcome?

3.2 Contents of a Problem-solving A3
Let us start with a basic much-used version of the A3 problem-solving sheet. It includes the following elements:

•• Background. In this section, the context in which the problem exists is described. This may include a brief history of the IT organization or department in which the problem exists. The background section will include a description of the problem.
•• Current Condition: Here we describe the current condition surrounding the problem. This may include complications that cause the problem to remain in place.
•• Future State goals. This is a description of the way the situation should be if the problem did not occur. Preferably, we should be able to define in concrete terms what would happen if the problem no longer existed. ‘Concrete’ may even mean setting a numerical target that should be achieved as a result of the resolution of the problem.
•• Analysis. This section includes a short description of the analysis that was done in trying to work out what the root cause of the problem was.
•• Proposed options: Here we find the list of possible solution candidates to the problem.
•• Plan / Improvement: This is where the improvements to be implemented are described and a brief plan is created for their implementation.
•• Follow-Up. After the chosen solutions have been implemented, there must be one or more follow-up actions to ensure that the adopted solution remains in place. There must be at least one action to inform others of the lessons learned from the problem-solving action and/or to communicate the solution to other parts of the organization where they may be suffering from the same issue.

The associated A3 may look like this:
Alternatives may include the following models. The first model includes a flow in which the position of
the Analysis step is clearly seen as an intermediate step.
The second model is based on the DMAIC method.
Each of the three models presented is valid as it helps the team carrying out the kaizen to both follow
a process and to communicate a result.
As we stated earlier, within IT, we not only recognize problems to be solved. We also recognize
Problems. These tend to be issues of a technical nature that are the root cause of incidents. An A3
model for the resolution of these Problems could be the one below.
3.3 A3 Status Report and A3 Proposal
In their book ‘Understanding A3 Thinking’, Sobek and Smalley describe two other forms of A3 report: the A3 Status report and the A3 Proposal report. These are again variations on the above themes, but with different purposes.

The A3 status report is aimed at informing all stakeholders of the progress of the execution of a longer-running project or action. This type of A3 is not so much focused on analysis; rather, it aims to continually check whether the assumptions made continue to be correct and ensure that it is clear which actions need to be taken. An A3 status report will tend to focus on the Check and Act aspects of the PDCA cycle.

The key components of the A3 status report are:
•• Background: in this section, the context is described. This may be a concise version of the problem-solving A3 for which the A3 is a status report.
•• Current Conditions: Here, the progress of the project is described. The changes that have already been made are described.
•• Results: This is the key section of an A3 status report. The current conditions are the consequence of actions taken. These actions have led to results. It is the results on which the decisions are taken whether to continue and, if so, which course of action to take.
•• Remaining Issues/Action Items: The A3 status report ends with the upcoming actions. These may be based on issues encountered during the process of getting to the current condition or they may be actions based on the original plan.

The A3 status report is an important document to support the learning process within the organization. Each status report must lead to some kind of reflection, with lessons learned that lead to action. In this way, the A3 status report is embedded into the daily kaizen, and enhances the continuous improvement mindset.

The A3 proposal is used for creating a recommendation for action. Generally, the A3 proposal will be aimed at implementing new policy or at carrying out a project that entails substantial investment of time and/or money. This A3 report focuses principally on the Plan phase of the PDCA. It will also describe how the Check and Act phases need to be carried out, i.e. it should indicate how the proposal will be monitored as it is being executed and after it has been implemented.

The A3 proposal report is more similar to the A3 problem-solving report.
•• Background. As with the A3 problem-solving report, this section includes the context within which the proposal is being written.
•• Current Condition: This is the key section of this A3. It should be clear from this section why the proposal needs to be made and why it is important to seriously consider its execution. The main issues must be clear to the reader.
•• Proposal. This is a description of the proposed course of action.
•• Analysis/Alternatives. This section is all about the business case for the proposal.
•• Plan details: In this section, the reader is given the details of what will be involved with carrying out the proposed change. It is vital that stakeholders, necessary resources and consequences are made clear.
•• Unresolved issues: this section deals with issues that are not (sufficiently) addressed and that may have an impact on the execution of the proposal. In essence, these are risks that may affect the proposal.
•• Implementation schedule. This is a high-level plan of how the proposal would be implemented.

In all cases, the text in an A3 must be created in such a way that the audience clearly
understands what problem has been solved, what the status is of a particular project or what the proposal is. The A3 must be written from the perspective of the reader!

3.4 Skills for completing an A3
Using A3 reports requires practice. There are skills that need to be acquired and honed to ensure that an A3 becomes a powerful tool.

In order to ensure the information in the A3 is accurate, there are four skills that must be practiced:
•• Summarize: the first key skill is the ability to express thoughts, facts, and other information concisely. Although an A3 sheet looks quite large when it is blank, the act of filling it with the relevant information can be quite a challenge. It is vital, therefore, to stick to the information that has a direct bearing on the issue at hand, be it a problem, a proposal or a status. In order to summarize, we need the next two skills.
•• Analyze: analysis is part of most A3 reports in some form or other. What does it mean? The aim of analyzing is literally to separate something into its constituent parts or elements. It is vital when writing an A3 report to understand the parts of the problem so that only the right information is given. If we are able to discern the parts of a problem, we can also determine which of these parts are relevant to the reader.
•• Synthesize: the opposite is also true. One of the best ways of summarizing is by combining parts or elements. The ability to synthesize can be defined as combining a number of disparate elements to make a coherent whole. This is important when the parts do not immediately appear to have individual relevance to the issue.
•• Visualize: Once we have analyzed, synthesized and summarized, we need to tell a story succinctly. In line with the ‘a picture tells a thousand words’ adage, it is strongly recommended (by all authors on the subject of A3) to turn your story into a visual experience using pictures and graphics to explain what has been investigated and what is proposed as a solution.

These skills will be further specified through the examples in this document.

3.5 Building communication
Using the aforementioned skills will help to determine the parts of the story about your kaizen. You will then need to construct the story in a way that is easy for the stakeholders to understand. This will help stakeholders to accept the solution you are proposing.

There are many ways to construct a story. The one we will deal with in this publication is Barbara Minto’s Pyramid Principle. This is a method that is fully compatible with A3 thinking. In fact, it helps to structure the information and insights gained during the kaizen event.

The problem is framed using the following framework:

Situation: the current situation and ambition of what the situation will look like when the problem is solved

Complication: a description of the things that are keeping the current situation the way it is or preventing the problem from being solved

Key Question: this is the question to be answered; the problem to be solved (in question form)

Answer: This is where the elements of the analysis are structured in order to present a coherent set of motivations supported by arguments, completed by the proposed course of action.
The Situation-Complication-Key Question trilogy will be recognizable in the next chapter as a problem
statement. The answer includes the structuring process required to bring the Measure, Analyze and
Improve steps together.
Using the Pyramid Principle means using a bottom-up approach for grouping arguments (the A’s in
the above figure) in a logical way such that they support a motivation for the answer you give. The
Answer should be supported by three clear motivations as to why this answer is the best answer to
the Key Question. The arguments and motivations will come from the Analyze phase of your kaizen
event. The Answer will be the result of the Improve phase.
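
The shape of such a pyramid can be sketched as a small data structure. This is a minimal illustration, not part of the Pyramid Principle itself: the class names, the `gaps` check and the placeholder texts (A1, A2, ...) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Motivation:
    claim: str
    arguments: list = field(default_factory=list)


@dataclass
class Pyramid:
    situation: str
    complication: str
    key_question: str
    answer: str
    motivations: list = field(default_factory=list)

    def gaps(self):
        """List structural weaknesses (hypothetical checks for this sketch)."""
        found = []
        if not self.key_question.strip().endswith("?"):
            found.append("key question is not phrased as a question")
        for motivation in self.motivations:
            if not motivation.arguments:
                found.append(f"motivation without arguments: {motivation.claim}")
        return found


def group_arguments(tagged_arguments):
    """Bottom-up step: group (motivation, argument) pairs by motivation."""
    grouped = {}
    for claim, argument in tagged_arguments:
        grouped.setdefault(claim, []).append(argument)
    return [Motivation(claim, args) for claim, args in grouped.items()]


# Placeholder findings, as they might come out of an Analyze phase.
findings = [
    ("Motivation 1", "A1"), ("Motivation 1", "A2"),
    ("Motivation 2", "A3"), ("Motivation 2", "A4"),
    ("Motivation 3", "A5"), ("Motivation 3", "A6"),
]
pyramid = Pyramid(
    situation="Current situation and ambition",
    complication="What keeps the situation as it is",
    key_question="How can we solve the problem?",
    answer="Proposed course of action",
    motivations=group_arguments(findings),
)
print(pyramid.gaps())
```

The pyramid is built from the arguments upwards but is read from the Answer downwards, which is why the arguments are grouped first and the Answer is attached last.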
A useful technique in constructing an argumentation pyramid is MECE. This stands for Mutually
Exclusive, Collectively Exhaustive. Mutually Exclusive means that all items in a particular category
only belong to that category, and no other. Collectively Exhaustive means that all possibilities have
been covered.
In an IT context, we may encounter a situation where there is a lack of satisfaction with two services.
Based on a data set including a variety of calls, we would need to have each call put into a single
category, e.g. the call may be an incident, a service request, a request for information or a complaint.
These categories would need to be defined in such a way that all calls in the data set fall into one of
the four categories, and only one of the four categories. In this way, the data set, and the analysis based
on it, would be assured to be MECE.
Subsequently, conclusions drawn and proposals suggested would also be relevant to the correct
calls. Analysis may show that the calls for a particular application are distributed 80% incidents and
20% service requests, whereas a second application may have 20% incidents, 40% requests for
information and 40% service requests. Assuming for one moment that the absolute volumes of calls
are the same, the analysis may conclude that application 1 is technically unsound since it has many
technical disruptions. Further analysis may identify the causes of these disruptions. Secondly, the
analysis may show that there has been insufficient training of users regarding application 2 because
there are many calls for support.
The result is two motivations – resolve the technical problems and train the users – and a series of
arguments leading to these motivations. The answer to the key question may then be: we need to
invest differently in applications 1 and 2, to increase the user satisfaction of the two services.
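
The MECE check and the distribution analysis in this example can be sketched in a few lines. The sketch assumes each call has been tagged with a set of category labels; the function names and the sample call data are hypothetical, with the percentages chosen to match the figures in the text.

```python
from collections import Counter

CATEGORIES = {"incident", "service request",
              "request for information", "complaint"}


def mece_violations(calls):
    """Return (uncategorized, ambiguous) call ids.

    `calls` maps a call id to the set of labels assigned to it.
    Uncategorized calls break 'collectively exhaustive';
    ambiguous calls break 'mutually exclusive'.
    """
    uncategorized = [cid for cid, labels in calls.items()
                     if not labels & CATEGORIES]
    ambiguous = [cid for cid, labels in calls.items()
                 if len(labels & CATEGORIES) > 1]
    return uncategorized, ambiguous


def distribution(categories):
    """Percentage of calls per category for one application."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {category: 100 * n / total for category, n in counts.items()}


# Hypothetical data: ten calls per application.
application_1 = ["incident"] * 8 + ["service request"] * 2
application_2 = (["incident"] * 2 + ["request for information"] * 4
                 + ["service request"] * 4)

tagged = {i: {category} for i, category in enumerate(application_1 + application_2)}
print(mece_violations(tagged))      # both lists are empty when the set is MECE
print(distribution(application_1))
print(distribution(application_2))
```

Running the MECE check before computing the distributions matters: a call counted in two categories, or in none, would distort the percentages on which the motivations are based.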
4 Define Phase
“The beginning of wisdom is the definition of terms” is a quote attributed to Socrates. He might just as well have said “The beginning of solutions is the definition of problems”. And, in practice, it turns out to be true. Once a problem has been defined, the problem can appear to diminish in size or importance. As our understanding increases, so does our feeling of our ability to solve the problem.

This issue of the perceived diminishing size of a problem gets more significant the more we understand about the problem. We will return to this issue as we go through the DMAIC cycle.

Unsurprisingly, ‘Define’ is the starting point for DMAIC, namely with the definition of the problem to be solved.

Before we can start, we need to work out which problem we are going to solve. This may appear to be simple, especially since one of the most prevalent starts to a sentence within IT is “The problem is …” followed by a problem statement of dubious quality. On top of this, the ‘problem statement’ usually includes the preferred solution somewhere in an adjoining sentence, e.g. “The problem is that [x] is not possible with Windows/Linux, but is possible with Linux/Windows”, or “The problem is that development doesn’t provide us in Operations with a decent handover document.”

A good way to start developing the Kaizen mindset within the IT organization is to have a standard response to these ‘The problem is …’ statements. The standard response could be something like “But is that the real problem?” Experience has shown that this simple question gets IT people thinking about problems in a constructive manner.

Possibly more important than defining a problem is the fact that someone believes that it IS in fact a problem and is prepared to invest time and, possibly, money to get the problem solved. In short, we need a sponsor for the problem.

Identifying the sponsor of the kaizen is an absolutely indispensable step that must be confirmed regularly throughout the DMAIC process. As soon as no one feels a need to solve the problem, stop the process instantly. Any further action is waste, since when it comes to actually taking action, no one will feel inclined to make the effort.

As we said in 2.2.2, there are three roles that must be identified and fulfilled before a kaizen event can be organized. The first, we have just seen, is the sponsor. Mostly, this will be someone who has a vested interest in getting the problem solved. This person must then ensure that other people directly involved in the situation where the problem exists are brought together as a team to work on solving the problem. Lastly, the kaizen lead must be added to the team to ensure that the kaizen process is followed. The kaizen lead must also keep an eye on the way the team works together.

4.1 Problem Statement
Problems are mostly visible difficulties with which the IT organization or individuals are confronted. However, problems never exist in isolation. There is always a cause. The cause and factors that keep the cause in place are the entities that we are trying to understand when we seek to solve a problem. One of the most difficult parts of defining a problem is that
every problem has symptoms; phenomena that accompany the problem or serve as evidence that the problem exists. They are, however, not the problem itself.

Before we can start to investigate our problem, we must have a statement that helps the team investigating the problem to focus its attention. We call this a problem statement. The problem statement may be in the form of a question or in the form of a statement. The former is preferable because it is then clear when you have found the answer to the question.

A complete problem statement should include a description of the current situation, the reason why this is not acceptable and an indication of what the ideal situation looks like. Next, the problem itself is described, followed by the question to be answered. It is vital that the problem statement is SMART (Specific, Measurable, Achievable, Realistic, Time-bound), to ensure that the team solving the problem knows when they have been successful.

The other reason for using a question as the form to describe the problem statement is that we can then discern the problem statement from any hypotheses we may have. A hypothesis is ‘a proposition, or set of propositions, set forth as an explanation for the occurrence of some specified group of phenomena, either asserted merely as a provisional conjecture to guide investigation (working hypothesis) or accepted as highly probable in the light of established facts.’ (Dictionary.com)

A hypothesis is a statement that will start with the words “I/We think/believe that …”. The hypothesis is as yet not supported by any factual basis. The hypothesis is based on people’s beliefs as a result of their observations. These are by definition selective and biased, and very much in need of testing through thorough analysis of the data and facts that can be found.

Example of a problem statement and hypotheses:

We need all of our software changes to go to production seamlessly, without defects, where everyone is aware of and informed about the outcomes and status of the change. Right now, we have too many release failures, requiring rollbacks. If we do not address this problem in the short term, we will need to increase resources needed to handle the ensuing incidents and rework. Consequently, we may miss customer deadlines potentially resulting in lost revenue, SLA penalties, lost business, and further damage to our quality reputation.

How can we halve the number of release failures in the next two months?

Three associated hypotheses that could be tested while investigating this problem could be:

1. We think that our ability to test changes is not good enough
2. We believe that the adherence to the change and release process is inconsistent across the IT organization
3. We think that the technology supporting certain software development and release processes is unstable.

4.2 Validating the problem
As we indicated earlier, people have an automatic ability to state that there are flaws. In fact, if we take a look around an average
IT organization, we will be able to find a seemingly unending list of problems. This may seem somewhat disheartening and there are certainly examples of IT organizations where there are so many (perceived) problems that people give up making the effort to solve them, and resort to a fire-fighting mentality. This is NOT the Lean way.

As we said earlier, in Lean IT, our mindset is that we accept that IT is filled with problems and we must act to solve the problems on a continuous basis. More than anything this is a behavioral aspect. Traditionally, management has a tendency to want to hear the good news and a smattering of problems is OK as long as they are on the way to being solved. Lean IT is about seeking out problems and understanding what impact they have on the IT organization’s ability to deliver high value products and services to its customers.

Earlier in this document, we looked at where problems can be found. The four ‘voices’ tell us where things are going wrong and when we need to actually solve a problem.

For now, let’s look at the two voices that are most likely to provide us with problems:

Voice of the Customer

Customers of IT have three basic requirements:
•• Make sure my IT services work
•• Give me new IT capabilities as and when I need them
•• Give me advice on the new or better usage of IT

Each of these statements is the potential source of a problem statement. The fact that the customers of IT are confronted with incidents is a problem to be solved. The fact that IT organizations can take excessive amounts of time to deliver new functionality to their customers is also an ongoing problem within the vast majority of IT organizations. And providing high quality advice in a timely manner is not always possible. So we cannot deny that, in a generic sense, IT organizations have problems to solve.

We are, however, particularly interested in the specific problems within our own IT organization. And this means taking a detailed look at the Voice of the Customer. We will go into more detail later on in this chapter.

Voice of the Process

The second most obvious place to look for problems is in the processes flowing from the customer into the IT organization, or in the internal processes of an IT organization.

This is where we find the link to the well-known IT frameworks. ITIL, ISO/IEC 20000 and COBIT, as prime examples of IT frameworks and standards, define the future states of IT organizations, the ideal situations. By matching these ideal situations to your current situation, you can undoubtedly find discrepancies. These frameworks take a variety of views on the IT organization. However, in essence they aim to ensure that the processes work reliably and effectively. We can therefore use these frameworks to understand how far an IT process is from its ideal state. Thanks to the fact that these frameworks describe the ideal state very specifically, it is quite easy to create a problem statement.

The key aspect is, therefore, not so much finding a problem as determining which problem to tackle.
4.3 Types of problems
There are many problems and in order to solve them it helps if we know what the characteristics of the problem are. In 2002, Dave Snowden published the Cynefin (Welsh for ‘habitat’) model, in which he categorized decision-making into one of five types. Decision-making is directly related to the underlying problem about which a decision must be made. We can therefore use the same categorization to identify the type of problem we are dealing with:

The Cynefin model contains the following five types: simple, complicated, complex, chaotic and disorder.

The first type of problem is the simple (or obvious) type. This is a problem that is caused by the fact that rules have not been followed. The relationship between cause and effect is obvious to (almost) everyone; it is reproducible, repeatable and predictable. These problems can be solved with a Standard Operating Procedure. As long as the SOP is used, the problem should not reoccur. Unfortunately, simple problems can be underestimated by experts. We regularly see this within IT organizations when IT engineers find they need more time to solve a particular incident than they had previously thought, even though the incident was relatively innocent. The engineer may oversimplify the solution, leading to an incomplete solution for the user of the IT service. Problems of this type are often seen as operational problems. Obvious problems are particularly suited to daily kaizen.

The second type of problem is the complicated problem. The relationship between cause and effect requires analysis, and expert knowledge is necessary. Having said this, the problem does follow rules. However, the rules may be more difficult than expected. This type of problem can be solved by using best practices, scenario-planning and systems thinking. Once understood, the rules for resolution can be defined and followed. We find this kind of problem within the technology of IT. Although we sometimes do not understand what happened, by investigating trends and carrying out analysis, we can understand what went wrong and how to solve the problem (often by rolling back a change). There is a right answer that can be found. These are the types of problems where technical experts may disagree because they focus on the elements of the problem they recognize. Generally, these problems are seen as tactical problems.

Simple and complicated problems always require analysis, i.e. breaking the problem down into a sequence of technical events. This is one of the reasons why it is important to record what activities have been carried out within an IT organization; it makes understanding the causes of simple and complicated problems easier and quicker.

Complex problems are problems for which the cause and effect are explainable in
retrospect. The issue has not been seen before and there are no known solutions or best practices
available. The team needs to look for completely new ways to solve the problem. This kind of problem
does not repeat in exactly the same way; outcomes may be unforeseen and patterns emerge over time.
To understand these problems, these patterns must be investigated. This means learning while solving
the problem. This type of problem is particularly related to the people issues within IT
organizations. We may need to carry out several different experiments to understand the dynamics of
the problem and find a solution. There is not necessarily a single right answer and we can use
guidelines to solve the problem. Generally seen as more strategic decisions and problems, they tend
to affect social systems, rather than technical systems. The Define session described at the end of
this chapter is an example of a complex problem.

Lastly there are chaotic problems. With these problems, no cause and effect relationship is directly
perceivable. These problems are not detectable before the fact, there are no clear answers and there
are elements of the problem that we cannot know when it is happening. These problems require crisis
management that focuses on relieving symptoms to create some kind of stability. Typically a leader
will have to act quickly based on the information available to stabilize the situation in order to
buy time for experimentation. In order to tackle chaotic problems, we need to use principles. The
crisis team must act to cause change to the existing situation. All action must be aimed at trying
to create order. When taking action, it is important to do a risk analysis of the action (use
Failure Mode Effect Analysis) to understand what its consequences could be.

As opposed to simple and complicated problems, complex and chaotic problems require synthesis. The
team solving these kinds of problems needs to investigate how factors and symptoms interact to
create the problem.

The final area to address is disorder. This is the situation in which we do not know into which of
the four other categories the problem falls. Causality is not understood at all. From disorder, we
can use the Cynefin model to determine which state the problem is in and then act accordingly. The
danger with disorder is that experts see the problem’s symptoms as being part of a simple type of
problem. This may cause the problem to be underestimated or incorrectly diagnosed.

Generally, within IT, we use kaizen to investigate complicated and complex problems. If a problem
turns out to be simple, the solution will probably be found during the Define phase. We generally
refer to the solutions found in the Define phase as quick wins.

4.4 Validating the value of solving the problem

The best Kaizen selection is based on identifying the problems that best match the current needs,
capabilities and objectives of the IT organization, related to the Voices.

Each problem needs to be investigated at a high level from three perspectives:
•• Results for the customer or business benefits
•• Feasibility
•• Organizational impact
Remember: it is important to not get lost in a ‘mini-Kaizen’ when investigating which problem needs
to be solved. It is about matching the aforementioned criteria in order to create a broadly
prioritized list of possible kaizen
initiatives. Each time a kaizen initiative must be selected, the previous prioritization needs to be
reviewed to ensure that it is still valid. Kaizen candidates on the list may in the meantime have a
lower priority due to a series of daily kaizen actions, or as a result of changes in the customer’s
environment.

Results or business benefits criteria

First, the sponsor (often together with the kaizen team) will carry out an assessment of the
benefits that will be achieved if the problem is solved. Aspects to be considered and questions to
be asked may be:
•• Impact on external customers and requirements: how beneficial is solving the problem for our
customers?
•• Impact on business strategy: what value will this potential Kaizen have in helping us to realize
our business vision or improve our competitive position?
•• Financial impact: what is the expected cost reduction, improved efficiency, increased sales or
market share gain?
•• Urgency: what kind of lead time do we have to address this issue or capitalize on this
opportunity?
•• Trend: is the problem or opportunity getting bigger or smaller over time and what will happen if
we do nothing?
•• Sequence or dependency: are other possible initiatives or opportunities dependent on dealing with
the issue first?

Feasibility criteria

The sponsor and kaizen team members must try to understand what effort must be expended to solve the
problem. The following aspects may be investigated:
•• Resources needed: how many people, how much time, how much money is this kaizen event likely to
need?
•• Expertise available: what knowledge or technical skills will be needed for this event and do we
have them available?
•• Complexity: how complicated or difficult do we anticipate it will be to develop the improvement
solution and implement it?
•• Likelihood of success: based on what we know, what is the likelihood that this Kaizen event will
be successful in a reasonable timeframe?
•• Support or buy-in: how much support for this Kaizen can we anticipate from key stakeholders
within the value stream and will we be able to make a good case for doing this Kaizen event?

Organizational impact criteria

Lastly, the team must look at whether solving a particular problem will provide the organization
with additional benefits.
•• Learning benefits: what new knowledge might we gain from this kaizen?
•• Cross-functional benefits: to what extent will this event help to break down barriers between
groups in the organization and create better collaboration in the entire value stream?
•• Core competencies: how will this possible kaizen event affect our mix and capabilities in core
competencies?

A useful question to ask is: what will happen if we do NOT solve this problem, but a different one
instead?

Problems in IT

Within IT, we find many different types of problems to solve. Some may seem trivial, whereas others
are clearly quite significant. In the table below, we present a number of typical IT problems. This
list is merely a selection and by no means complete.
Problem | Explanation
Technical performance problems | This problem may come in a multitude of forms. Every piece of technology (hardware or software) may be a source of problems. Often, a piece of technology does not so much fail as just not perform well for any number of reasons.
‘Fire-fighting’, focus on solving incidents rather than structural resolution | IT organizations seem to have the time to repeatedly solve incidents but do not make the time to remove the sources of these incidents. This leads to a highly ad hoc way of working, in which the number of incidents (both per unit of time and open at any one moment) continuously creeps higher.
Balance operational and change work | The classic statement here is: “I couldn’t complete the change on time because I had to solve an incident”. The key issue is that IT people are involved in all sorts of work, not just a single type.
Releases or ‘technical weekends’ that cause problems the following work day | A key question within IT organizations is: Why do changes need to lead to more incidents? It should be possible to implement changes without causing further disruptions.
Planning and execution of work | This causes a huge amount of stress within IT organizations, as poor planning has a correlation with switching priorities.
Collaboration between development and operations, or applications and infrastructure | This is probably one of the most classic problems within IT organizations: departments that throw work ‘over the wall’ to each other.
Changes applied without informing users | IT organizations and people are not renowned for their ability to communicate. However, this is a skill that must be mastered, especially in a world where IT services are pervasive.
Constantly changing priorities | This causes context switching and work to be left incomplete (particularly the documenting).
Focus on achieving SLA KPIs | Engineers no longer focus on providing a great service but only look at whether they are achieving the numbers.
Shared resources, dependency on specific individuals | IT people, especially the experts, are often required to be in multiple places at one time. They are allocated to multiple projects and continue to have a role in the operations. This often causes huge delays and highly stressful situations, leading to errors.
Lack of availability and capacity planning | Every IT organization knows that they should plan for the future, understand how its services will perform given the projected developments in their customer’s organization. Very few actually do, leading to network capacity problems, disk space incidents, insufficient processing power or poor human resource planning.
Undoubtedly, while reading the list, you will have recognized problems; others may not be relevant
in your IT organization. There will also be problems for which there are ‘standard’ solutions (“if
you use [standard IT solution], you can solve the problem”). Although the above problems may be
recognizable, their specific causes may be diverse and different per IT organization.
4.5 Ensuring Support for a kaizen
As indicated earlier, a kaizen without support must not be attempted. We saw earlier that the kaizen
sponsor is an absolutely indispensable role to be filled. Where the two are separate, the problem
owner must be identified. These two parties play a critical role in the acceptance of the solutions.
Much attention is paid to sponsorship of the kaizen. However, in the end, if the people on the work
floor, the primary stakeholders of any problem to be solved, do not see the point in solving the
problem, then choose a different problem. If the work floor is not convinced that the problem needs
to be solved, then the acceptance of any solution will be very low.
Other stakeholders obviously include people up and downstream of the place where the problem
is identified. In pretty much every case, the customer will have an interest in the resolution of the
problem, be it from a qualitative perspective or a cost perspective. Looking at IT organizations,
the vast majority of them are internal to a business, governmental organization or an NGO
(Non-Governmental Organization). They have a vested interest in cost reduction and/or quality
improvement. The key question is whether they need to be directly involved in the execution of the
kaizen.
4.6 Stakeholder analysis
Carrying out a stakeholder analysis is all about understanding where the various people involved
stand on a particular issue, and what impact their view has on the success of addressing the issue.
Not all stakeholders should necessarily be involved in the actual Kaizen event. Some stakeholders
provide input or data into the Measure phase, others need to be kept informed and others need to be
actively involved in the actual meetings.
In order to understand stakeholder positions, we must define the issue on which we wish to
understand the positions of various stakeholders (individuals or groups). In the case of a kaizen, the
issue would be the problem at hand. Stakeholders are directly or indirectly involved with the issue,
often referred to as the “chicken or pig dilemma”. When it comes to having a cooked breakfast, the
chicken is indirectly involved by having to provide an egg; the pig is fully and directly involved as
it needs to deliver the bacon. In the stakeholder analysis, the sponsor and kaizen team will need to
understand which stakeholders fall into which category.
On top of that, they will need to identify whether a stakeholder is positive, negative or neutral
regarding the problem, and whether they have a strong and explicit opinion on the subject or
are not outspoken. The stakeholder’s influence must also be investigated: do they have
formal or informal power regarding the problem? Adjust the analysis regularly throughout the kaizen
to understand how the stance of stakeholders changes.
Based on influence (power or impact) and involvement (or interest), we can readily identify types of
stakeholders.
For each of the types of stakeholders, we see that there is a communication strategy involved in
keeping the stakeholders engaged with the problem.
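As an illustration only, the mapping from influence and involvement to a communication strategy can be sketched as a small lookup. The four strategy labels below are taken from the commonly used power/interest grid and are an assumption; they are not necessarily the labels used in this publication's own stakeholder figure. The stakeholder names are hypothetical.

```python
# Minimal sketch of a stakeholder grid. The strategy labels are assumed
# (generic power/interest grid), not taken from this publication.
STRATEGY = {
    ("high", "high"): "manage closely",
    ("high", "low"): "keep satisfied",
    ("low", "high"): "keep informed",
    ("low", "low"): "monitor",
}

def communication_strategy(influence: str, involvement: str) -> str:
    """Return the assumed communication strategy for one stakeholder."""
    return STRATEGY[(influence, involvement)]

# Hypothetical stakeholders: (name, influence, involvement, stance)
stakeholders = [
    ("IT management team", "high", "high", "neutral"),
    ("Infrastructure engineers", "low", "high", "positive"),
]

for name, influence, involvement, stance in stakeholders:
    print(name, stance, communication_strategy(influence, involvement))
```

Because stances change during a kaizen, a table like this would be re-evaluated regularly rather than filled in once.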
4.7 Define phase and A3
The result of the Define phase is that the “Background” section of the A3 can be completed, by
answering the following questions:
1. What is the problem?
2. Who has the problem?
3. What is the scope of the problem?
The questions may seem simple but can take considerable time and effort to answer accurately.
4.8 Key steps in the Define phase
We have looked at a number of aspects of the Define phase. Bringing these together, we find that
there are a number of steps that can be taken to complete the Define phase:
1. Problem selection and owner identification
Use the criteria to determine whether a problem is significant enough to warrant solving in the
short term. Always ensure that there is a person who owns the problem (the kaizen sponsor) and
sponsors the kaizen event. It is vital that the problem sponsor is serious about needing the problem
to be solved. The owner/sponsor must be able to maintain the drive to solve the problem. This is an
important reason why the kaizen event must be kept as short as possible, because there will always
be another problem around the corner that clamors for attention. This step may already have been
sufficiently addressed in preparing the kaizen.
2. Problem statement and kaizen team selection
Create a problem statement, and complete the background section of the A3 and select the right
team members. Select the kaizen team members using the stakeholder analysis. All team members must
agree on the problem to be solved. If there is no agreement, then it is possible that there are two
different problems that need to be solved, or stakeholders have been incorrectly identified. The
kaizen lead may use an Ishikawa diagram as a visualization tool for collating symptoms of the
problem, even though it is usually used for analysis of factors causing the problem. The kaizen lead
can also use the 5 Why technique to understand and scope the problem.

3. Validate scope of the problem

Once we have defined the problem, we must validate its scope. The team must understand whether it is
reasonable to expect the problem to be solved. To do this, we draw a SIPOC for the process in which
the problem occurs. The team may need to adjust the scope based on insights gained from the SIPOC.
During this step, it is useful to understand what type of problem needs to be solved. This will be
an indicative typology to guide the team’s resolution efforts. It is very important not to jump into
a problem too quickly. It takes time (sometimes up to 3 hours) to define the problem and,
particularly, to gain agreement among the team members. The time is often spent learning to listen
to one another. Team members have the tendency to repeat each other with different words. Here, the
role of the kaizen lead is extremely important. He or she must ensure that team members take the
time to listen; often the managers in the team have the most difficulty with this aspect. A
technique that the kaizen lead can use is to ask the person wanting to say something to repeat what
the previous speaker has said, in his or her own words. This technique ensures that the previous
speaker feels heard and the following speaker must address what has been said.

4. Collect Voice of the Customer information

Having understood the scope of the problem, we need to bring together the Voice of the Customer
information that is relevant to this specific problem. Use the CTQ (Critical to Quality) to collate
and structure the information. Note that you may well have selected your problem based on feedback
from the customer. Having validated, and possibly adjusted, the scope of the problem, you may need to
go back to the customer for more specific requirements and wishes regarding the problem at hand. The
team must formulate specific questions for the customer. Likewise, if the problem is based on a
signal from the Voice of the Business or the Voice of the Regulator, the team must formulate
clarification questions for the business or regulator.

5. Create high level plan

You will need to agree on a plan for the execution of the kaizen event. This will include
practicalities such as the availability of team members and the sponsor, availability of meeting
facilities and agreement on the main deadlines. This needs to be done during the Define meeting. All
participants must clear their agendas to ensure that the kaizen can be completed in a short period
of time. Within IT organizations, this can be a considerable issue since IT people can rarely be
taken off their work to do a kaizen full-time. This is certainly one of the key challenges of doing
kaizen in an IT organization. A suitable strategy is to plan five meetings (one for each phase) and
ensure that there are two days between the meetings, so that actions can be carried out. The
meetings should be planned as 3-hour meetings. That can always be changed if the
goals for the meeting have been met. The plan should also include how the (interim) results should
be communicated.

It is critical that all aspects of the Define phase are completed prior to moving on to the Measure
phase. If the problem statement is not fully defined and agreed by the kaizen team, they will not
have a basis on which to complete the following phases. This invariably leads to having to go back
to the Define phase.
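The meeting cadence suggested in step 5 (five meetings, one per DMAIC phase, with two days between meetings) can be sketched as a simple date calculation. This is a minimal illustration under stated assumptions: "two days between" is read as two free calendar days, weekends and holidays are ignored, and the function name and start date are hypothetical.

```python
from datetime import date, timedelta

# The five DMAIC phases, one meeting each.
PHASES = ["Define", "Measure", "Analyze", "Improve", "Control"]

def plan_meetings(start: date, gap_days: int = 2) -> list[tuple[str, date]]:
    """Return (phase, meeting date) pairs with gap_days free days in between.

    Assumption: calendar days are used; weekends are not skipped.
    """
    step = timedelta(days=gap_days + 1)
    return [(phase, start + i * step) for i, phase in enumerate(PHASES)]

# Hypothetical start date for the Define meeting.
for phase, day in plan_meetings(date(2015, 10, 5)):
    print(f"{phase}: {day.isoformat()} (3-hour meeting)")
```

In practice the dates would be adjusted to the availability of the team members and the sponsor, as agreed during the Define meeting.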
4.9 Case Study: Define Phase
Within an IT organization of about 100 people, the teams had been structured in such a way that
the technicians managing the core of the IT infrastructure (networks, servers, databases, operating
systems, etc.) were allocated to customer-oriented support teams. This allocation was based on a
number of hours. For example, the three network engineers were allocated to four teams for an
average of 10 hours per team. This led to a situation of shared resources, on which the demand was
unmanaged and unbridled. The result was a highly tense situation in which the management team
was dissatisfied with the performance of the shared resources, the team leaders felt they did not get
the service they had been promised and the engineers felt as though they were being pulled in all
directions by 5 team leaders and 3 management team members.
This situation had existed for about a year and the situation had become explosive because neither
the support teams nor the infrastructure technicians believed they were getting or delivering high
quality services. The primary stakeholders were the infrastructure engineers, the IT management
team (IT MT) and the team leaders of the customer-oriented teams.
The situation was, in fact, so explosive that the mere mention of the problem to any one of the
stakeholders led to emotionally charged discussions about the quality and capabilities of the other
stakeholders.
The first step was to understand whether there was a desire within the IT management team to
actually solve the problem. The direct response was ‘of course’. To the question ‘Are you prepared
to accept a solution defined in a kaizen?’, there was a long silence. And here we find one of the key
issues within IT organizations starting on their Lean journey: the acceptance of the results of a kaizen.
It is very challenging for IT managers to accept that the solution to a non-technical problem (as in
this and many other cases) may be provided by the work floor. Eventually, one of the members of
the IT management team said he would act as sponsor for the kaizen, even though all MT members
could be defined as problem owners in their own right. The other three agreed they would accept the
result, as well.
The next step was to define the problem. Due to the emotionally charged nature of the discussion,
we started with a pre-kaizen session with each of the three key stakeholder groups. In each of the
sessions, the stakeholders were challenged to define what they thought the problem was. They were
also challenged to define the problem from the point of view of the other stakeholders.
After the exploratory sessions, a first kaizen session was organized, in which delegates from the
three stakeholders came together. A total of 9 people made up the kaizen team for the Define
session. The kaizen lead started the session by explaining the DMAIC procedure to be used
throughout the kaizen. He also explained the goal for this session: to agree on the problem statement
to be solved. He stated the initial problem statement as defined by the problem owner/sponsor, and
also asked all the participants whether they were prepared to do what was necessary to solve the
problem, independent of personal preferences and opinions.
As a result of the pre-kaizen information sessions, it was quite easy for the participants to state
what they thought the problem statement should be. It took a further two hours to finally gain
full agreement among all parties regarding the problem to be solved. The result was particularly
interesting because the problem statement was closest to the problem statement as stated by
the infrastructure engineers. It was not the case that management and team leaders used their
hierarchical power to push through their idea of the problem to be solved. Other cases have proved
that a 2-hour session to define the problem statement for a complicated or complex problem is no
luxury. In fact, it is often very necessary. This is because the team had to ensure that symptoms did
not end up being defined as the problem. The kaizen lead had to continually keep the team focused
on the goal and had to question the team members when he felt the symptoms were being turned into
the problem.
Having agreed on the problem statement, the team needed to agree on how much time they would
take to solve the problem. The urgency and impact dictated that the team needed to work as fast as
possible.
5 Measure Phase
The second phase in the DMAIC cycle is the Measure phase. In this phase, we refine the problem
statement based on measurement. The goal is to ensure that there is a detailed understanding of the
current situation surrounding the problem area. This is done by collecting reliable data on the
variables related to the problem. The aim is to provide information to help identify the underlying
causes of the problem.

5.1 Data

The first step is to define the data to be collected. The data obviously needs to be related to the
problem statement.

5.1.1 Variables

During this phase, we need to fully understand the role of variables in the resolution of problems.
There are essentially three types of variables:
•• Independent variable: this is an input. In the case of problem-solving, the independent variable
can be seen as something that may or may not contribute to the problem. The aim is obviously to find
the independent variables that have the greatest effect on the problem.
•• Dependent variable: this is the output; in effect, this is the problem.
•• Control variable: this kind of variable is particularly useful in experiments. This variable is
kept constant while others are changed so that they can be investigated.
The mathematical notation for the relationship between independent and dependent variables is:

y = f(x)

Where y is the dependent variable and x is the independent variable. The f means that the problem
(dependent variable) is a function of independent variables. In fact, a clearer notation would be:

y = f(x_1, x_2, x_3, ..., x_n)

In this equation, we see that the problem may in fact be caused by any number of independent
variables.

Our aim in the Measure and Analyze phases is to find the independent variables (the x’s) and
understand their impact on the problem (the y).

An example: there is a problem with the ability of a desktop support department to deliver laptops
to customers within the agreed time. This is the y. The equation could look like this:

y = f(laptop, knowledge of employees, process, software, holidays, sickness, …)

We would need to collect data regarding each of the independent variables to understand their effect
on the problem.

It is therefore our task in the Measure phase to determine the independent variables of the problem.

5.1.2 IT Units of Work

One of the key characteristics that differentiates Lean IT from Lean in other areas of business is
the units of work. These units of work are the inputs for processes.

The standard units of work are derived to a certain extent from ITIL.
Unit of work | Description
Incident | A technical malfunction of the IT service affecting the customer
Service Request | A request from the customer, not being a technical malfunction
Problem | The root cause of incident(s)
Standard Change | Change that is carried out according to a checklist or Standard Operating Procedure
Operational activity | Any activity necessary to keep the current IT service running, not being an incident or service request. This category includes events, monitoring and other daily/weekly activities that ensure the health of an IT service.
Non-standard Change | Any change not being a standard change
Advice | A document detailing options for a solution, based on a customer request
Plan | A document covering a course of action in the future (Availability, Capacity, Continuity, Security)
Within Lean, three categories of units of work are identified. Within IT, there are two basic criteria on
which the categories can be defined: size in hours worked and the process dynamics.
•• Runners: these are units of work that occur on a daily basis and tend to require up to one hour of
work for them to be completed. Within IT, we can say that incidents, service requests, standard
changes and operational activities fall in this category. The dynamics of these processes is that
work is statistically predictable (per week) but its exact occurrence is not known. This work cannot
be planned as such but time can be reserved for these units of work.
•• Repeaters: these units of work occur regularly; indicative frequency is weekly. Within IT, we find
high impact incidents, small to medium sized non-standard changes and the smaller advisory
services. This category is partly plannable (advice and changes). However, the high impact incidents
require direct response, and therefore have a dynamic that more closely resembles runners.
Unfortunately, their impact means that solving the incident can require a different effort than
regular incidents.
•• Strangers: these are units of work that have an irregular occurrence. IT ‘strangers’ are large non-
standard changes, large requests for advice and plans, which all tend to occur or be updated on a
monthly or quarterly basis.
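The three categories above can be sketched as a simple classification rule. This is a minimal sketch that follows the prose of this section; the type names, the `high_impact` flag and the `size` values are hypothetical labels chosen for illustration, not fields of any particular Service Management tool. The unit of work "Problem" is not assigned to a category in the text, so the sketch rejects it rather than guess.

```python
from collections import Counter

def classify_unit_of_work(kind: str, high_impact: bool = False,
                          size: str = "small") -> str:
    """Return 'runner', 'repeater' or 'stranger' for one unit of work."""
    if kind == "incident":
        # High impact incidents are listed under repeaters.
        return "repeater" if high_impact else "runner"
    if kind in {"service_request", "standard_change", "operational_activity"}:
        return "runner"
    if kind in {"non_standard_change", "advice"}:
        # Small to medium items are repeaters, large ones are strangers.
        return "stranger" if size == "large" else "repeater"
    if kind == "plan":
        return "stranger"
    raise ValueError(f"no category defined for unit of work: {kind}")

# Hypothetical sample of registered work.
work = [
    {"kind": "incident"},
    {"kind": "incident", "high_impact": True},
    {"kind": "non_standard_change", "size": "large"},
    {"kind": "advice", "size": "small"},
]
counts = Counter(classify_unit_of_work(**item) for item in work)
print(counts)
```

Counting units of work per category in this way gives a first indication of how much time should be reserved (runners) and how much work can be planned (repeaters and strangers).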
5.1.3 Technical Data
The other type of data (alongside the units of work) is technical data. This is data that helps us to
understand the ‘behavior’ of the technology. This type of data includes such entities as
•• Log files: files in which activities (normal and abnormal) of individual IT hardware and,
particularly, software components are recorded
•• Monitoring data: data and information from monitoring tools, including alerts based on
thresholds
•• Technical performance data: data about CPU, memory, network speed/capacity and storage usage

5.1.4 People Data

The final type of data that is of use to manage an IT organization is time.
•• Time represents the people factor, especially within IT where time usage is absolutely critical
in the sense that we have limited time and more than enough to do. Making the best use of available
time must have a top priority within the steering mechanisms within an IT organization.
•• Skills and knowledge capabilities: in order to fully understand the capabilities of the IT
organization, it is insufficient to know how much time is available. We must also know what
knowledge is available and in what quantities.

5.1.5 Go to the Gemba and find data

One of the most important aspects of collecting data is to understand its context. In Lean, we
understand the context by going to the Gemba, the place where the work is done. The key question is:
What do you look for when you go to the Gemba?

The key is to look at how specific data is used within the organization. This means going to the
place where the data is used, understanding who uses it for what purpose. Observe the person using
the data and ask questions to clarify how the data is used.

Going to the Gemba, observing the work being done and understanding how data is used can help to
determine what other data should be recorded. It may also lead to understanding which data is
available but unused or data that is collected but is not useful.

A key aspect of Lean is the definition of Standard Work or the desired state of any activity or
process. Once standard work is defined for processes which have a repeatable nature, the activity of
going to Gemba can in part be focused on looking for variances from the pre-defined standard.

However, standard work does not mean that once defined it remains static. All value systems evolve
over time and require improvement using both the daily and kaizen event models. Another key aspect
of going to Gemba is to observe whether improvement is occurring against standard work. In short, it
is difficult if not impossible to improve something that has not been stabilized based on a
previously defined desired state.

5.2 Measurement Systems

Essentially two types of data can be collected: quantitative data and qualitative data. These two
types of data require different measurement systems in order to collect the data meaningfully.

Quantitative data is numerical data that is always expressed as a number. Qualitative data measures
in non-numerical terms. Often qualitative data is converted to numerical data by giving it a scale
on which answers are scored as numbers. The data, however, remains qualitative, since the distance
between the numbers on the scale is not something we can measure.

The key aspect of any measurement system is that it must be accompanied by one or more visits to the
Gemba to check whether the data is being interpreted correctly, and to understand the context within
which the data is generated and its variance from standard work.
5.2.1 Quantitative Measurement Systems
Quantitative measurement systems are used to gain objective insight into the performance of a
particular entity. Within IT, we have one source of quantitative data.
•• Automated data collection: most systems register data in the course of their operation. In some
cases, the amounts of data are quite substantial. This data is generally held in log files that can be
consulted to find out when something happened.
When using quantitative measurement systems, it may seem as though the data is objective.
However, there are questions that need to be asked regarding the quality of the data.
| Collection method | Reliability issues | Remedy |
|---|---|---|
| Automated data collection | Do we correctly understand what the automated data is telling us? | Ensure that you have skilled technicians who can explain what the system says |
| | A lot of data does not necessarily mean information | Use automated data as a source for the confirmation of hypotheses |
Setting up a measurement procedure (automated data collection)
The first step is to ensure the way the measure is calculated is unambiguous:
•• Definitions of units measured are clear and not open to misinterpretation
•• The way the calculation is carried out is understandable
•• Exceptions are documented in detail
Second, the fields within the database from which the data is taken must be clearly defined:
•• Name of the field in named database
•• If the field is filled in an application, ensure that it is clear which field is used
•• Ensure the field is suitably restricted to ensure input is always valid
Lastly, create an auditable automated routine to ensure that the correct data is selected and
transformed to the outcome of the measure.
Lean IT has many quantitative measures, many related to value stream mapping. Examples include
measuring the lead time of the units of work captured in a Service Management tool, the numbers of
units of work registered and the technical data on system performance. In all cases, the definition of
each piece of data must be identified and validated.
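The measurement procedure described above can be sketched in code. The following is a minimal illustration only, assuming a hypothetical extract of requests with `received` and `closed` time stamps. The class `MeasureDefinition`, the field names and the sample records are assumptions made for this example and do not refer to any particular Service Management tool. The sketch keeps the definition of the measure together with the calculation and logs every excluded record, so that the outcome remains auditable.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class MeasureDefinition:
    """Unambiguous description of a measure (all names are illustrative)."""
    name: str
    unit: str
    start_field: str  # field holding the moment the request was received
    end_field: str    # field holding the moment the request was closed


LEAD_TIME = MeasureDefinition("Lead time", "hours", "received", "closed")


def calculate(records, definition):
    """Return (results, audit_log); every excluded record is logged with a reason."""
    results, audit_log = {}, []
    for rec in records:
        start = rec.get(definition.start_field)
        end = rec.get(definition.end_field)
        if start is None or end is None:
            audit_log.append((rec["id"], "missing timestamp"))
        elif end < start:
            audit_log.append((rec["id"], "end before start"))
        else:
            results[rec["id"]] = (end - start).total_seconds() / 3600
    return results, audit_log


# Hypothetical sample data standing in for a database extract.
tickets = [
    {"id": "T1", "received": datetime(2015, 10, 1, 9), "closed": datetime(2015, 10, 1, 17)},
    {"id": "T2", "received": datetime(2015, 10, 2, 9), "closed": None},
    {"id": "T3", "received": datetime(2015, 10, 3, 9), "closed": datetime(2015, 10, 2, 9)},
]

lead_times, audit_log = calculate(tickets, LEAD_TIME)
print(lead_times)  # {'T1': 8.0}
print(audit_log)   # [('T2', 'missing timestamp'), ('T3', 'end before start')]
```

Returning the audit log alongside the results is one possible design choice: it makes the documented exceptions visible each time the measure is calculated.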
5.2.2 Qualitative Measurement Systems

Qualitative measurement systems principally measure capability or maturity from the perspective
of the people involved. Through the data collection method, qualitative measurement systems do
attempt to create objectivity in the subjective data. This can be done by using a framework of criteria;
most maturity models work on this basis. However, the maturity model is also based on human
perception. This means that qualitative measurement systems are always open to bias, be it based on
the questioning or on the answers. Three forms of qualitative measurement system are the following:
•• Annotated Observation: This basically means watching what happens and noting the number
of times something happens, the amount of time spent on a task, the number of errors made
in finished products and other such observable occurrences. The tool often used here is the
check sheet.
•• Interview: One of the preferred methods of gathering information is through interviewing
people involved or people associated with the aspect that is being investigated. An interview
can involve one or more people involved with the subject matter. Generally, multiple points of
view are sought when gathering information through interviews.
•• Registration: During the course of work in an IT organization, data is recorded on work
units (incidents, changes, problems, service requests). This data provides valuable input for
understanding how the organization performs regarding these units of work. In registering units
of work, the system records time stamps and other data either automatically or as a result of the
action of a user. As a result of the dependency on human action, registration is seen as qualitative
rather than quantitative.
These methods may be combined, and may take the form of asking the person or people being
observed questions about their activities. However, interrupting the work to ask questions does
affect concentration and the overall performance.
| Collection method | Reliability issues | Remedy |
|---|---|---|
| Annotated observation | Are we watching a representative set of actions? | Observe at several different moments |
| | How does the fact that we are observing impact the performance? | Observe multiple subjects |
| Interview | House: “Everybody lies” … except the interviewee, of course | Check statements with evidence, preferably data |
| | Information from an interview is always biased | Use multiple interviews |
| | Involved means having an interest in the outcome of the interview | |
| Registration | Is everything registered and classified correctly? | Check the quality of the data by a visual check of the raw data |
| | Does the registration tool allow access to the desired information? | Do not rely on the tool to provide information; get a database administrator to extract data |
Setting up a qualitative measurement procedure (annotated observation and interview)
In order to create a valid qualitative measurement system, ensure the goal of the observation or
interview is clearly formulated. Define the framework against which answers will be checked.
Determine the questions to be answered and ensure the answers can be unambiguous. With
observation, this can be relatively simple, using “Yes” or “No” type answers, counts or time
measurements. During interviews, answers will often be narrative. Answers must be clearly noted.
Ensure that answers can be, and are, recorded in a way that allows them to be processed into a
suitable measurement result. Both the processing and the raw data must be auditable.
Lean IT examples of qualitative measurements are collecting data for Voice of the Customer or Skills
& Knowledge analysis. In both cases, the information is based on the opinions of the people involved.
Collecting VoC information can be a complicated process, if you are aiming for a detailed view of
the VoC. The easiest way to start collecting VoC data is for the team to ask three simple questions
concerning the problem:
•• What does [problem area] do that has value for you?
•• What does [problem area] do well?
•• Where can [problem area] be improved?
Setting up a qualitative measurement procedure (registration)
Follow the same procedure as with automated data collection.
5.3 Baseline and Benchmark
Baselines and benchmarks are necessary to understand the relative value of the performance.
•• A baseline is the measurement of a situation in order to understand whether a change occurs
based on an intervention after the baseline has been set. This is particularly useful in kaizen
because we are very interested in the effect of changes that have been implemented in the
IT organization. It is vital that during the Measure phase a baseline is set that can be used to
measure progress.
•• A benchmark is a standard or set of standards used in evaluating the performance or
level of quality of an organization. Benchmarking is a measurement used to compare the
organization’s position in relation to other organizations. Benchmarking can also be done
between teams within a single IT organization. Benchmarking may be used during a kaizen
to understand how well others perform a particular activity. This may help to identify what
improvements are possible.
5.4 Value Stream Map
A number of metrics and calculations are required to help measure a Value Stream. Many of these
were introduced in the Foundation publication.
VSM Metrics
The following metrics help the kaizen to prepare data such that it can be used in the Analyze phase.
| Metric | Explanation |
|---|---|
| Lead time | The time between the moment the customer submits their request and the moment they receive the requested item or service |
| Takt rate | Volume of customer demand per time period (takt time is the inverse of this number) |
| Changeover time | Time needed to change from processing one unit of work to processing a different one. Within IT, this is the time lost due to context-switching. This is a type of waiting time. |
| Queue time | The time a unit of work is in a queue. This is a type of waiting time. |
| Machine time | The time a unit of work is being processed by a machine. This is a type of waiting time. |
| Work-in-process | The number of uncompleted units of work that are still in the process. This number is directly related to the lead time (Little’s Law). |
| Capacity | The maximum amount of output that the process can deliver over a period of time |
| Throughput | The actual amount of output over a period of time. This is invariably lower than the capacity as a result of waste. |
| VA / NNVA / NVA time | Time spent on Value Add (often referred to as cycle time), Necessary Non-Value Add and Non-Value Add activities |
VSM Calculations
The most essential calculations in a Value Stream Map are PCE (process cycle efficiency) and Little’s
Law.
Process Cycle Efficiency = VA time / Process lead time
Little’s Law helps us to understand the relationship between lead time and work-in-progress.
Lead time = the number of units of work in the process (WIP) / average completion rate
These calculations can be done over the entire process, but also per process step. This helps to create
a richer picture of the dynamics of the process, and identify where issues exist.
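As a worked illustration of the two calculations, the sketch below applies them to made-up figures for a change process. All numbers and function names are assumptions chosen for the example. Process cycle efficiency divides value-add time by lead time, and Little’s Law gives the average lead time as WIP divided by the average completion rate.

```python
def process_cycle_efficiency(va_time, lead_time):
    """PCE = value-add time / process lead time (same time unit for both)."""
    if lead_time <= 0:
        raise ValueError("lead time must be positive")
    return va_time / lead_time


def littles_law_lead_time(wip, completion_rate):
    """Little's Law: average lead time = WIP / average completion rate."""
    if completion_rate <= 0:
        raise ValueError("completion rate must be positive")
    return wip / completion_rate


# Hypothetical figures for a change process, chosen only for illustration.
va_hours = 6        # value-add time per change, in hours
lead_hours = 120    # process lead time per change, in hours
wip = 40            # changes currently in the process
rate_per_day = 8    # changes completed per day on average

pce = process_cycle_efficiency(va_hours, lead_hours)
lead_days = littles_law_lead_time(wip, rate_per_day)
print(f"PCE = {pce:.0%}")                         # PCE = 5%
print(f"Expected lead time = {lead_days} days")   # Expected lead time = 5.0 days
```

The same two functions can be applied per process step by passing in the VA time, lead time, WIP and completion rate of that step only.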
5.5 Measure phase and A3
At the end of the Measure phase, the kaizen team should be able to complete the Current Condition
and Future State goals sections of the A3.
5.6 Key Steps in the Measure phase
Bringing together the important aspects of the Measure phase into a series of steps, we find the
following points to take into account when carrying out the Measure phase of the kaizen.
1. Identify the outputs and inputs of the process in which the problem occurs
Problems invariably have an effect on the output of one or more processes since it is often the
recipient of the output who indicates that it does not meet the expectations of said recipient.
Having defined the output relevant to the problem, we need to define the input. This leads to an
understanding of which value stream causes the problem. Much of this work will have been done
while making the SIPOC in the Define phase. This step entails collecting the data concerning the
inputs and outputs of the process. Within IT organizations, it is not always clear that there is a
process associated with the problem. Take the case in the Define chapter. This was identified as a
people problem, a time-constraint problem, an attitude problem and many other types, but not as
a process problem. In the end, the key issues were identified by treating the issue as a resourcing
process problem. This allowed the emotion to be removed from the discussion.
2. Create a value stream map of the process
As we stated above, the value stream map (VSM) describes the current situation of the process at
this phase of the kaizen. Describing the current situation may seem easier than it actually is.
Often, people working in the same process have different perceptions as to how the process actually
works. It is vital to go and look at the Gemba to see how the process is actually executed.
Within IT organizations, there is a tendency to say “we already have a process picture” when the
kaizen lead recommends creating a VSM. The process document is usually a couple of years old and
will have been based on a process-oriented implementation. These documents are essentially useless
since they describe a desired situation that has never been achieved; nobody knows the document
except the people who wrote it and it distracts the team from the focus of creating a description of
how the process currently works.
Always take a clean sheet of paper when making an initial VSM.
3. Create and execute the data collection plan
Once we know what the process looks like, we can identify the independent variables that may affect
the problem. When we know the independent variables, we can define the data that needs to be
collected in order to investigate the problem.
There is a strong tendency to believe that IT organizations are difficult to measure. As a result,
kaizen teams within IT may try to take a short cut in the Measure phase. Often, the involvement of a
powerful sponsor will lull the team into a false sense of security.
Powerful sponsorship does not mean we do not need to collect the right data to support the
resolution of the problem. The data is a continuous reminder of how important it is to completely
solve the problem. However difficult collecting data may be, it still must be done. In fact, measuring
the various aspects of IT is not so difficult. The team just needs to be prepared to extract data from
databases. And if there’s one business that has people who know how to do this, it is IT.
4. Validate the measurement system
We have already discussed possible inaccuracies within measurement systems. This is why the
kaizen team must validate the measurement system(s) it uses. The idea here is to show that the data
collected can be reproduced and repeated, in exactly the same way.
Any assumptions made must be made explicit. Any manipulation of data must be documented and
explained so that it is clear on which premises analysis is being done in the Analyze phase.
5. Assess the capability and performance of the process
For each measurement we make, we must set a baseline. In the case of the IT units of work, we tend
to create time series charts and determine average performance over a defined period. Setting a
single data point as a baseline tends to create an arbitrary and highly contentious baseline.
Although there are organizations that sell benchmark data and reports for considerable sums,
there is also doubt as to whether benchmarking truly helps IT organizations. The problem is that IT
organizations are service organizations in which the factors influencing performance and cost may be
similar but may have very different effects from one organization to another. It is therefore vital to
baseline while benchmarking is optional.
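The difference between a single data point and an average over a defined period can be shown with a small sketch. The weekly figures below are invented for illustration, and the function name `baseline` is an assumption of this example.

```python
from statistics import mean

# Hypothetical weekly average lead times (in days) for incidents.
weekly_lead_time_days = [4.0, 6.5, 5.0, 9.0, 5.5, 6.0, 4.5, 7.5]


def baseline(series, periods):
    """Average of the most recent `periods` observations."""
    if periods < 1 or periods > len(series):
        raise ValueError("periods must be between 1 and len(series)")
    return mean(series[-periods:])


single_point = weekly_lead_time_days[3]               # one arbitrary week
period_baseline = baseline(weekly_lead_time_days, 8)  # all eight weeks

print(single_point)     # 9.0
print(period_baseline)  # 6.0
```

In this made-up series, picking the fourth week as the baseline would overstate typical lead time by half, which is why an average over a period is less contentious.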
6. Identify Quick Wins
During the execution of measurements, it may become clear that there is a course of action that
everyone involved agrees on; a solution that can be implemented straightaway. This so-called
quick win should be implemented as soon as possible. On rare occasions, a problem thought to be
complicated or complex may turn out to be simple.
As with the Define phase, it is vital to ensure that the key deliverables from this phase are completed
before moving on to the Analyze phase. In practice, it is almost impossible to think of everything that
needs to be measured. Often, further necessary measurements will emerge as a result of gaps in the
analysis. This should, however, not be a reason for rushing the Measure phase or too easily accepting
that something is not measurable.
5.7 Case Study: Measure Phase
Customers of an IT organization were highly dissatisfied with the service. Part of the IT organization
was responsible for carrying out network installations and changes. Delivery times were completely
unpredictable and were consequently experienced as too long. Expectations were not managed. In
essence, the team needed to process three types of requests, and also requests that included two or
three of the individual types of request.
Every so often, there would be a workload peak as a result of sales activities or management
pressure. This led to stress because, although extra hours were worked, new requests balanced
fulfilled requests and the backlog remained the same, causing intense frustration within the team.
Unfortunately for the team, the expectation was that the number of requests they needed to process
would increase by 100% in the next 2 years. These would need to be processed by the same people.
Inevitably, this led to despondency in the team since they were not able to keep up with the current
workload, let alone twice as much.
The organization’s hypothesis was that processes were not implemented such that they would help
customers.
Having defined the problem, data needed to be collected. Most of it was available in the systems used
by the team. The data that was collected was a data dump of the previous 12 months of requests.
The data required was the date of receipt of the request, the starting date, the completion date and
the closure date. Also, the department responsible for the execution of the request was included
in the dump. Preliminary data processing provided new insights including the average lead time of
requests, WIP inventories, numbers of opened and closed requests per time period and how long
requests spent on each status.
The data was validated and three key conclusions were drawn from this part of the kaizen:
•• Data was incomplete: the start date of the work was not always available
•• Data was unreliable: the time stamp of status 2 was sometimes earlier than that of status 1
•• Data was not used to manage the process
This meant that the kaizen team needed to be careful when drawing conclusions. It was also a trigger
for the operational team to improve their data registration.
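Checks of the kind that led to these conclusions can be automated. The sketch below is not the routine used in the case. It is a minimal illustration that assumes a hypothetical extract with four date fields per request (the field names and sample records are invented) and reports missing dates and dates that are out of sequence.

```python
from datetime import date

# Hypothetical extract; field names are assumptions for illustration.
requests = [
    {"id": "R1", "received": date(2015, 1, 5), "started": date(2015, 1, 7),
     "completed": date(2015, 1, 20), "closed": date(2015, 1, 21)},
    {"id": "R2", "received": date(2015, 1, 6), "started": None,
     "completed": date(2015, 1, 15), "closed": date(2015, 1, 16)},
    {"id": "R3", "received": date(2015, 1, 9), "started": date(2015, 1, 8),
     "completed": date(2015, 1, 18), "closed": date(2015, 1, 19)},
]

STATUS_ORDER = ("received", "started", "completed", "closed")


def validate(record):
    """Return a list of data-quality findings for one request."""
    findings = [f"missing {name} date" for name in STATUS_ORDER if record[name] is None]
    # Compare each available date with the next available one in the sequence.
    present = [(n, record[n]) for n in STATUS_ORDER if record[n] is not None]
    for (n1, d1), (n2, d2) in zip(present, present[1:]):
        if d2 < d1:
            findings.append(f"{n2} is earlier than {n1}")
    return findings


report = {r["id"]: validate(r) for r in requests}
for request_id, findings in report.items():
    print(request_id, findings or "ok")
```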
Based on this data, the kaizen team was able to construct a VSM. The VSM provided insight into
where data was missing, particularly details about waiting times. These gaps were filled in using
a bespoke time registration sheet (in Excel) adapted to the specific situation. The sheet allowed
registration of NVA, NNVA and VA activities. This time registration lasted for two weeks to ensure
representative data. The VSM further uncovered that there was no standard process for each of the
three basic request types.
During the Measure phase, the following quick wins were identified.
•• Daily (manual) measurement was instituted straightaway. It was carried out by the team and
communicated twice daily in short stand-up meetings
•• Specified knowledge-sharing sessions were organized every day based on identified needs
•• Resource-planning to reduce context-switching, i.e. per day people were allocated to a single
task thereby increasing their effectiveness. Rotation schemes ensured that all team members
became proficient at processing all types of requests
The data was enriched and processed into graphs and other graphics so that the kaizen team could
close the Measure phase and proceed to analyzing the data in the Analyze phase.
6 Analyze Phase
The Analyze phase is aimed at getting to the root cause of the problem (finding the key x’s). From the
Define phase, we have a clear problem definition. This has been refined during the Measure phase
and data has been collected. The data will be processed to a certain extent during the Measure phase.
In the Analyze phase, the goal is to translate the data into information that will provide insight into
the key variables that have the greatest impact on the problem. By determining these key variables,
we will be able to provide input for the Improve phase, in which we will try to find the possible
actions that will reduce the negative impact of the variables.
In short, the Analyze phase is about identification, quantification, interrogation and prioritization of
the root causes of the problem we are investigating. We do this using a number of tools. There are
tools that help us make sense of the data we have uncovered during the Measure phase and there
are tools that help us to further decompose the problem into its constituent parts.
6.1 Seven Basic Tools of Quality
A list of the seven basic tools of Quality has existed since as early as the 1950s. It is speculated that
Kaoru Ishikawa created the list as a result of exposure to the teachings of W. Edwards Deming.
Whatever the source, the list of the seven basic tools of Quality has been standardized and is used
universally. The seven tools are: histogram, Pareto chart, scatter diagram, flow chart, control chart,
fishbone (Ishikawa) diagram, and check sheet.
We will investigate each tool and explain how these are constructed. For each, an IT example will be
provided.
6.1.1 Histogram
According to Webster’s Online Dictionary, a histogram is "a representation of a frequency distribution
by means of rectangles whose widths represent class intervals and whose areas are proportional
to the corresponding frequencies." In short, this means that we create a graph in which groups of
numbers are plotted based on how often they appear.
The power of histograms is that they allow us to analyze extremely large datasets by reducing
them to a single graph that can show one or more peaks in data. The histogram also visualizes the
significance of the peaks.
Step Description
Step 1 Select a data set to be plotted. A classic example is the distribution of incidents
according to resolution time (i.e. lead time).
Step 2 Collate the data into groups. In the case of incidents, we bundle them into time-
related groups, e.g. incidents solved within 1 day / 2 days / etc. or younger than 10
days, between 10 and 20 days, etc.
Step 3 Count the number of data points per group. Plot the groups onto a graph with on
the vertical axis the number of data points and on the horizontal axis the names of
the groups. The incident graph will have the number of incidents along the vertical
axis and the time intervals along the horizontal axis.
Step 4 Investigate the pattern that is depicted by the graph. Determine the cause of the
pattern. In the case of the incidents, we can easily see how long incidents have been
open and what the distribution is according to age.
The above diagram shows an IT example of a histogram. This one shows the number of open
incidents with a certain age (time that they are open).
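Steps 2 and 3 above (grouping and counting) can be sketched as follows. The incident ages and the 10-day interval are assumptions chosen for illustration. The sketch prints a simple text histogram rather than using a charting tool.

```python
from collections import Counter

# Hypothetical ages (in days) of open incidents.
incident_ages = [1, 3, 4, 8, 9, 12, 12, 15, 19, 21, 25, 33, 41]
BIN_WIDTH = 10  # group ages into 10-day intervals


def histogram(values, bin_width):
    """Count values per interval [k * bin_width, (k + 1) * bin_width)."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


bins = histogram(incident_ages, BIN_WIDTH)
for start, count in bins.items():
    label = f"{start}-{start + BIN_WIDTH - 1} days"
    print(f"{label:<12}| {'#' * count}")
```

Each row of `#` characters corresponds to one bar of the histogram, so the shape of the distribution is visible before any chart is drawn.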
6.1.2 Pareto chart

The Pareto chart is a way to visualize the relative importance of root causes of problems. It is based
on the principle (the Pareto principle) that a limited number of factors account for most of the impact
on the problem. The Pareto principle is sometimes referred to as the 80-20 rule, i.e. 80% of the
impact is caused by 20% of the factors.
Using the following steps, you can create a Pareto chart. In this case, it may be best to create a pencil-
and-paper version first before entering the data into a tool such as Excel.
Step Description
Step 1 Develop a list of causes to be compared and determine a standard measure for
comparing the causes. Frequency of occurrence, time (lead time, time usage)
and cost are the most commonly used standard measures for comparison. Also, choose the
timeframe in which data needs to be collected.
Step 2 Count the frequency (cost or time) for each item. Add these amounts together to
create the grand total for all items. Calculate the percent of each item in relation
to the grand total, by taking the sum of the item, dividing it by the grand total and
multiplying by 100.
Step 3 List the causes in decreasing order of the measure of comparison, from most
frequent to least frequent. On top of the individual percentages, a cumulative
percent is calculated by adding the cause’s percent of the total to that of all the
other items that come before it in the ranking.
Step 4 List the items on the horizontal axis of a graph from highest to lowest. Label the left
vertical axis with the numbers (frequency, time or cost), then label the right vertical
axis with the cumulative percentages.
Step 5 Draw in the bars for each cause. Draw a line graph of the cumulative percentages.
The first point on the line graph should line up with the top of the first bar.
Step 6 Analyze the diagram by identifying those causes that appear to account for most
of the problem. Identify those causes that account for around 80 percent of the
effect. In most cases, 2 or 3 causes will generate 80% of the effect. There is usually
an inflection point where the graph levels off. If there appears to be no pattern (the
bars are essentially all of the same height), you may need to subdivide the data and
draw separate Pareto charts for each subgroup to see if a pattern emerges.
Comparing Pareto charts of a single problem made at intervals over a period of time will
indicate whether mitigation actions have had a positive effect on the problem.
The above Pareto diagram shows the prevalence of particular causes of an incident, both absolute (in
numbers) and cumulative (in percentage).
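The calculations in Steps 2, 3 and 6 can be sketched as follows. The causes and counts are invented for illustration, and the 80 percent threshold follows the rule of thumb described above. The sketch produces the table behind the chart, not the chart itself.

```python
# Hypothetical incident counts per cause over one month.
cause_counts = {
    "Password reset": 120,
    "Network outage": 45,
    "Software bug": 20,
    "Hardware failure": 10,
    "Other": 5,
}


def pareto_table(counts):
    """Return rows of (cause, count, percent, cumulative percent), largest first."""
    total = sum(counts.values())
    rows, cumulative = [], 0
    for cause, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        cumulative += count
        rows.append((cause, count, 100 * count / total, 100 * cumulative / total))
    return rows


rows = pareto_table(cause_counts)
for cause, count, pct, cum_pct in rows:
    print(f"{cause:<18}{count:>5}{pct:>8.1f}%{cum_pct:>8.1f}%")

# Causes needed to reach roughly 80 percent of the effect.
vital_few = []
for cause, _, _, cum_pct in rows:
    vital_few.append(cause)
    if cum_pct >= 80:
        break
print(vital_few)  # ['Password reset', 'Network outage']
```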
6.1.3 Scatter diagram

A Scatter diagram is a graph that aims to demonstrate the relationship between two sets of
data. We try to understand whether there is a correlation between two sets of data and whether
this correlation is positive or negative. This type of diagram can be used to both interpolate and
extrapolate.
To create a Scatter diagram, follow the steps in the table below.
Step Description
Step 1 Select the two data sets that need to be plotted against one another. A simple
example is the investigation of the lead time (in days) of changes in relation to the
size (in hours) of the change. Note that it is not necessary for the units of the two
data sets to be the same.
Step 2 Create a graph whereby one of the data sets is plotted on the vertical axis and the
other on the horizontal axis.
Step 3 Determine whether there is a correlation between the data sets. This is done by
plotting a straight line known as the line of best fit or trend line through the data
points. The trend line is drawn by ensuring that the line is as close as possible to all
of the data points and has the same number of points above it as below it.
Step 4 The scatter diagram can be used in two ways. It can be used to find the value
of a particular data point within or outside the existing data set (interpolating
and extrapolating). The second analysis is the most important. This is to find the
correlation. The correlation is positive if the values increase together; it is negative
if one decreases as the other increases. Based on this analysis, we can determine
which variables have a positive or negative (or no) correlation, and we can take this
into account when determining the solutions during the Improve phase.
The above scatter diagram shows the relationship between the average time to repair of incidents
in days and the days of inventory within an IT organization. In effect, this is a chart depicting Little’s
Law.
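When the data is available electronically, the trend line and the correlation can be calculated rather than drawn by eye. The sketch below uses a least-squares fit and the Pearson correlation coefficient. These are common techniques substituted here for the manual line-drawing described in Step 3, and the sample data is invented for illustration.

```python
from math import sqrt

# Hypothetical changes: size in hours (x) and lead time in days (y).
size_hours = [2, 4, 6, 8, 10]
lead_time_days = [3, 5, 6, 9, 12]


def _sums(xs, ys):
    """Sums of squares and cross-products around the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sxy, sxx, syy


def fit_line(xs, ys):
    """Least-squares slope and intercept of y = slope * x + intercept."""
    mean_x, mean_y, sxy, sxx, _ = _sums(xs, ys)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def correlation(xs, ys):
    """Pearson correlation coefficient, between -1 and +1."""
    _, _, sxy, sxx, syy = _sums(xs, ys)
    return sxy / sqrt(sxx * syy)


slope, intercept = fit_line(size_hours, lead_time_days)
r = correlation(size_hours, lead_time_days)
print(f"trend line: lead time = {slope:.2f} * size + {intercept:.2f}")
print(f"correlation: {r:.2f} ({'positive' if r > 0 else 'negative or none'})")

# Extrapolating to a change of 12 hours, outside the existing data set.
print(f"predicted lead time for 12 hours: {slope * 12 + intercept:.1f} days")
```

A coefficient close to +1 or -1 indicates a strong positive or negative correlation. It does not by itself show that one variable causes the other.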
6.1.4 Flowchart
The flowchart is one of the simpler of the seven quality tools. It is the visual representation
of a series of steps in a process, and helps to break down a complicated process into a simple series of
steps. This simplification ensures that the process becomes understandable to anyone.
A flowchart shows actions and decisions at points where variations occur in the process. These
decision points are always marked by a question that can be answered with ‘yes’ or ‘no’. The basic
forms are blocks (actions) and diamonds (decisions). There are many other symbols used for drawing
flowcharts. A further elaboration is the use of so-called ‘swimming lanes’. These are horizontal or
vertical lines that separate the activities of different roles or groups responsible for completing a
particular task in the process.
In Value Stream Mapping, a very simple version of flowchart is used. The goal of the VSM is to
understand waste and time usage within the process. The flowchart discussed here is generally used
for a more detailed look at the process.
Step Description
Step 1 Select the process you wish to analyze. You may already have a SIPOC or VSM of
the process. Use this as a starting point.
Step 2 Create a sequential list of the activities and decisions in the process
Step 3 Create blocks for the activities and use diamonds for recording the decision points.
Build the process step-by-step. Use an arrow between two symbols to denote the
flow of the process.
Step 4 The analysis of the flowchart centers around the logic of the steps. Drawing the
flow of steps can indicate where there are (unnecessary) feedback loops, parallel
activities that influence one another, i.e. should be sequential. The use of swimming
lanes indicates whether there are many transfer moments between roles. In
general, transfer moments cause delays within the process.
Below is an IT example of a Flowchart. In this case, the flowchart of the Problem Management
process.
6.1.5 Control chart
Control Charts were defined by Walter Shewhart (the inventor of the PDCA-cycle). The control chart
is essentially a time-series chart. A time-series chart is one in which data is plotted on a chart where
the horizontal axis is a time sequence. The vertical axis can be numbers or another variable whose
value can be different over time.
The difference between a time-series chart and a control chart is that the Control Chart is used to
identify variation in a repeating process. This is done using control limits. Control limits are sometimes
also called Action Limits (Control Limits are calculated; Action Limits may be assigned).
A control chart helps to understand variation. There are two important types of variation: common
cause variation and special cause variation.
•• Common cause variation: the variation due to random shifts in the X’s that are always present
in the process. As a result, the pattern shows variation with ‘noise’, the collective effect of
many minor influences. A process affected only by common cause variation is called stable or in
control. It makes no sense to figure out what the causes are. The only way to improve the
performance is to redesign the (parts of the) process to reduce common cause variation. An
example of this is when a process, e.g. the ability to deliver a new piece of standard software
to the customer, performs consistently at a level that does not meet the requirement of the
customer. We would need to completely redesign the process to improve the performance.
•• Special cause variation: In these cases, the effect of variation can be assigned to a specific
cause which can usually be discovered. Special causes generate patterns in the data. They
provide signals about the problems in the process and how they can be resolved. You cannot
predict if and when the special cause variation will occur and what the impact will be.
Therefore, the process is unstable and unpredictable. Continuing the example above, if the
ability to deliver the software shows a spike in lead times, we can investigate and possibly
remove the reasons for the spike.
The control chart can thus be used to identify whether a process is under control (statistically)
and whether it suffers from special and/or common cause variation. It can also be used to detect
statistically significant trends in measurements, e.g. to identify whether improvements have had an
effect on performance.
[Figure: two control charts, “Process is in control” and “Process is out of control”]
Control charts are best suited to processes where regular measurements can be made. Typically this
is in processes that repeat within a reasonably short space of time. Within IT, we look at Incident,
Service Request and Standard Change processes. Control charts are also very suited to monitoring
technical processes like the ability to load a data warehouse.
Create a control chart using the following steps:
Step Description
Step 1 Identify the objective of using the Control Chart. Typically this will be either to detect
defects or to monitor/investigate a process.
Step 2 Identify the actual measurement to be made, including what to measure, and where
in the process to measure it. Select the measurements based on their ability to
identify problems or defects.
Step 3 Identify the type of Control Charts to use. This will depend on the type of
measurement being made.
Step 4 Choose the measurements that will make up each plotted point on the Control
Chart. Measure more frequently when significant variation can occur over a short
period. Use consecutive measurements, rather than a random sample, as this will
result in less variation within the subgroup, with tighter, more sensitive control
limits.
Step 5 Measure the data. If possible, automate the measurement process. If measurements
are to be collected by hand, design a data collection method that eases both the
collection and the subsequent calculations.
Step 6 Calculate mean and upper and lower control limits. Note that control limits are
usually straight lines.
Step 7 Draw the chart. This should include plotted points, with a line drawn between
successive points, horizontal lines for each of the central line, upper control limit and
lower control limit, and labeling and other information to uniquely identify the chart.
Step 8 Analyze the chart, looking for significant patterns and points, and find the cause
of any identified significant set of points. In the Improve phase, we will look for a
method of correcting the problem. To be clear, the Control Chart shows us when the
problem occurs, but not where.
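Steps 6 and 8 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text above does not give a formula here, so the sketch places the limits at the mean plus and minus three sample standard deviations of a baseline period, which is one common convention (other chart types compute their limits differently). The function names and the lead-time figures are hypothetical.

```python
from statistics import mean, stdev


def control_limits(baseline, sigmas=3.0):
    """Return (lower limit, central line, upper limit) for baseline measurements.

    Assumption: limits sit `sigmas` sample standard deviations from the mean.
    """
    center = mean(baseline)
    spread = stdev(baseline)
    return center - sigmas * spread, center, center + sigmas * spread


def points_outside_limits(measurements, lower, upper):
    """Return (index, value) pairs that fall outside the control limits."""
    return [(i, x) for i, x in enumerate(measurements) if x < lower or x > upper]


# Hypothetical lead times (in days) for software deliveries.
baseline = [4, 5, 6, 5, 4, 6, 5, 5, 4, 6]
lower, center, upper = control_limits(baseline)

# New measurements are checked against the limits from the baseline period.
new_measurements = [5, 6, 12, 4]
signals = points_outside_limits(new_measurements, lower, upper)
print(f"LCL={lower:.2f} mean={center:.2f} UCL={upper:.2f} signals={signals}")
```

A point reported in `signals` tells the team when something unusual happened. Finding out where and why remains the work of the Analyze phase.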
6.1.6 Fishbone (Ishikawa) diagram

Ishikawa diagrams (also called fishbone diagrams) are causal diagrams that show the causes of a
specific event. They were designed by Kaoru Ishikawa in the 1960s. The Ishikawa diagram is generally
used to identify potential factors causing an overall problem. Each cause or reason for imperfection is
a source of variation, i.e. an ‘x’ or an independent variable.
Causes are usually grouped into major categories to identify these sources of variation. Depending on
the industry, there may be up to 7 categories. Within IT, we commonly use four categories: People,
Process, Technology and Policy.
•• People: deals with all aspects to do with people, particularly behavior and attitude, but also
personnel and knowledge issues
•• Process: concerns all issues that relate to processes
•• Technology: any issues or causes related to the technical part of IT services should be posted
in this category
•• Policy: this category deals with all of the factors that determine the environment in which the
people, process and technology exist.
In practice, these categories are Collectively Exhaustive. The factors affecting a problem can,
however, often be placed in one or more categories i.e. the set of categories is not Mutually Exclusive.
An example: is the fact that people do not follow a process a people factor or a process factor? The
answer is that it does not matter as long as the factor is posted on the Ishikawa diagram and its
impact on the problem is analyzed accordingly.
Create an Ishikawa diagram using the following steps:
Step Description
Step 1 An Ishikawa diagram should be made by the Kaizen team. Use a whiteboard
and sticky notes to create the first version of the diagram.
Step 2 Draw a horizontal arrow pointing to the right. At the end of the arrow, write the
problem to be solved.
Step 3 Draw 4 diagonal arrows (2 from below, 2 from above) pointing towards the
horizontal arrow; each arrow should be labeled with one of the categories: People,
Process, Technology and Policy.
Step 4 The team collects as many causes as can be identified. The team can use the 5 Why
method to find and detail root causes. The value of the Ishikawa rises with the
quality and detail of the root causes.
Step 5 Use the other six basic tools to quantify the impact of each cause. This will create a
list of the causes with the greatest impact.
Example of Ishikawa diagram
6.1.7 Check (or tally) sheet

The check sheet is a simple and highly effective tool for collecting quality-related data in a structured way.
It is a way to assess a process and can function as input for other analyses.
The check sheet helps to quantify the causes from the Ishikawa diagram, for which there is limited or
no numerical data to be analyzed.
Set up your check sheet using the following steps
Step Description
Step 1 State the problem for which data is being collected. Identify and record the location
where and time when data will be collected.
Step 2 Create a table with the symptoms or occurrences to be observed and counted in the
left-hand column. Depending on how you wish to measure you may have a single
column in which to mark the number of occurrences or you may have a column per
day of the week.
Step 3 Collecting data may be done by the people actually doing the work or may be done
by an observer. The person collecting the data puts a mark in the recording column
for each time a particular symptom or occurrence takes place.
Step 4 Per registration period (usually a day), the data can be processed into a Pareto
chart for further analysis. Alternatively, a histogram can be used to understand the
relative number of times a particular symptom is observed.
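The hand-off from check sheet to Pareto chart in Step 4 can be sketched as follows. This is a minimal illustration, not a prescribed method: the function name `pareto_table` and the example symptoms are hypothetical, and each tally mark is assumed to be recorded as one entry in a list.

```python
from collections import Counter


def pareto_table(marks):
    """Turn check sheet marks into rows of (symptom, count, cumulative %).

    Rows are sorted from the most to the least frequent symptom,
    which is the order used on a Pareto chart.
    """
    counts = Counter(marks)
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for symptom, count in counts.most_common():
        cumulative += count
        rows.append((symptom, count, round(100 * cumulative / total, 1)))
    return rows


# Hypothetical marks collected during one registration period.
marks = ["password reset"] * 6 + ["printer offline"] * 3 + ["VPN drop"] * 1

for symptom, count, cumulative_pct in pareto_table(marks):
    print(f"{symptom:<16} {count:>3} {cumulative_pct:>6}%")
```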
6.2 Finding the root cause
The seven basic quality tools help us to process the data we have collected and visualize the data in a
way that facilitates getting to the root cause of the problem we are investigating in our kaizen. There
are also tools to help us take a step further and actually get to the root cause.
6.2.1 5 Whys
The 5-Why analysis is a simple root cause analysis that requires the kaizen team to question a failure
through sequential causes. ‘Why’ is asked to find each preceding trigger until we supposedly arrive at
the root cause of a problem.
A Why question can often be answered with multiple answers. Each answer should be supported
by evidence that proves the answer is right. Failure to do this may send the team on a wrong failure
path.
Step 1 Make a table with two columns and 5 rows and write the question from the
problem statement at the top of the table
Step 2 Ask the question: “why did this happen?” Find the answer, supported by evidence,
and write the answer in the left-hand column of the top row.
Step 3 Repeat this question and answer cycle, four more times. List the answers in the left-
hand column of the table.
Step 4 Determine a solution for each of the answers and record these in the right-hand
column.
6.2.2 Cause & Effects matrix

A cause-and-effect matrix helps to determine which factors affect the outcomes of the process being
investigated. It maps the value connection between inputs (the Xs) to outputs (the Ys). With these
relationships visible and quantified, you can determine the most-influential factors contributing to
value.
Follow the steps below to create a cause and effect matrix
Step 1 Start by listing all the possible input factors (the Xs) as individual rows of the matrix.
The inputs should come from a previously completed VSM or Ishikawa diagram.
Step 2 List the multiple outputs (the Ys) of the process across the columns of the matrix.
There may be only a single physical output, however there will also be outputs such
as a performance level, a cost target or a maximum lead time.
Step 3 The key to the cause and effect matrix is building the relationships. Analyze and
quantify the relationships between each listed input and each output by placing a
relationship score (on a scale of 0 to 9) at the matrix intersection of each row and
column. Strong cause-effect relationships are scored as 9s; moderate cause-effect
relationships get 3s; weak relationships are 1s; and having no relationship means a
score of 0.
Step 4 For each matrix row-column intersection, ask yourself whether the associated input
affects the level of variation in the associated output. Then place the appropriate
score in the matrix cell.
Step 5 Summarize the results by calculating the weighted score for each row. For each row,
multiply the first matrix cell score by the first column weight; do this for the second
matrix cell, and so on. Finally, add up the weighted scores for the entire row. Place
this weighted row sum in the far right column of the C&E matrix.
Step 6 Apply Pareto analysis to the scores for each row. Those rows with high scores are
the ones that indicate important, high-leverage input factors. The factors with low
scores can be ignored.
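The calculation in Steps 5 and 6 can be sketched as below. The steps mention a weight per output column without saying how it is chosen, so the weights here are an assumption (for example, an importance rating agreed with the customer). The inputs, outputs and scores are hypothetical.

```python
ALLOWED_SCORES = {0, 1, 3, 9}  # none, weak, moderate, strong relationship


def weighted_row_scores(matrix, output_weights):
    """Return {input: weighted row sum} for a cause and effect matrix.

    `matrix` maps each input (X) to its relationship scores, one per output (Y).
    """
    totals = {}
    for x, scores in matrix.items():
        if len(scores) != len(output_weights):
            raise ValueError(f"{x}: expected {len(output_weights)} scores")
        if not set(scores) <= ALLOWED_SCORES:
            raise ValueError(f"{x}: scores must be 0, 1, 3 or 9")
        totals[x] = sum(s * w for s, w in zip(scores, output_weights))
    return totals


# Hypothetical outputs (Ys): lead time, cost, quality, with assumed weights.
output_weights = [10, 5, 8]
matrix = {
    "Manual testing": [9, 3, 3],
    "Approval step": [9, 0, 1],
    "Server capacity": [1, 3, 0],
}

totals = weighted_row_scores(matrix, output_weights)
# Rank the inputs from highest to lowest weighted score for the Pareto analysis.
ranking = sorted(totals, key=totals.get, reverse=True)
print(ranking, totals)
```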
6.2.3 Failure Mode Effects Analysis (FMEA)

Failure modes and effects analysis (FMEA) is an analysis for identifying all possible failures in a
design, process, product or service. The Failure modes are the ways in which something might
fail. Failures are any errors or defects and can be potential or actual. The effects analysis is about
understanding the consequences of those failures.
Failures are prioritized according to how serious their consequences are. The aim of the FMEA is to
take actions to remove the sources of failure, i.e. the root causes, starting with those with the
greatest impact. FMEA can be used throughout the lifecycle of an IT service, from design to operation
and retirement of the service.
Step 1 List the key process steps in the first column. These may come from the highest
ranked items of a C&E (Cause and Effect) matrix or VSM made previously.
Step 2 In the second column, list the (potential) failures for each process step, i.e. state how
this process step or input could go wrong.
Step 3 Per failure, describe what the effect would mean to the IT organization and the
customer, in the third column.
Step 4 We then complete three columns with ranks from 1 to 10. Before you start, ensure
the team agrees on what each number in the scale means. The three columns are:
1. The severity of the effect - 1 (not severe) to 10 (extremely severe)
2. The frequency of occurrence of the effect - 1 (almost never) to 10 (very frequently)
3. Our ability to detect based on controls in place - 1 (predictable) to 10 (undetectable)
Step 5 Multiply the severity, occurrence, and detection numbers and record this value in
the RPN (risk priority number) column. This is the key number that will be used
to identify the principal causes to address first. An RPN of 1000 (10 x 10 x 10) is
obviously the most critical.
Step 6 Sort the causes by RPN and identify the most critical causes.
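Steps 4 to 6 amount to a small calculation, sketched below. The class name, field names and the example failure modes are hypothetical; only the 1 to 10 scales and the multiplication of severity, occurrence and detection come from the steps above.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    step: str
    failure: str
    severity: int    # 1 (not severe) to 10 (extremely severe)
    occurrence: int  # 1 (almost never) to 10 (very frequently)
    detection: int   # 1 (predictable) to 10 (undetectable)

    def __post_init__(self):
        # Reject ratings outside the agreed 1 to 10 scale.
        for name in ("severity", "occurrence", "detection"):
            if not 1 <= getattr(self, name) <= 10:
                raise ValueError(f"{name} must be between 1 and 10")

    @property
    def rpn(self):
        """Risk priority number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


# Hypothetical rows of an FMEA table.
failure_modes = [
    FailureMode("Nightly data load", "Load job times out", 8, 6, 3),
    FailureMode("Deploy change", "Change applied to wrong server", 9, 2, 7),
    FailureMode("Create report", "Report uses stale data", 5, 4, 9),
]

# Step 6: highest RPN first.
by_priority = sorted(failure_modes, key=lambda fm: fm.rpn, reverse=True)
for fm in by_priority:
    print(fm.rpn, fm.step, "-", fm.failure)
```

Note that a failure with a moderate severity can still top the list when it is hard to detect, as the stale data example shows.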
6.3 Analyzing a Value Stream Map
In the Lean IT Foundation, we looked at the mechanics behind creating the Value Stream Map. In this
publication, we will principally look at how to analyze the VSM.
The Value Stream Map is a mine of useful information. This information is used to identify the places
in a process where a solution is most needed.
VSM Analysis
Having carried out the chosen calculations, there is a series of aspects that we must analyze in more
depth.
•• Time Trap: this is a process step that introduces a delay into the process. A classic example
is the need for an approval. This is not a capacity constraint. A time trap is based on a policy
decision (muri). The waiting times in the VSM must be analyzed carefully to determine the
reason for waiting times. Removing time traps will improve the efficiency of the process.
•• Capacity Constraints: this is a process step that does not have sufficient capacity to process
all of the work it must process in a particular timeframe. These steps are also referred to
as ‘bottlenecks’. Removing capacity constraints allows the process to deliver the value in
the quantities required to meet customer demand. Takt rate is an important metric for
understanding bottlenecks. Each process step for which the takt time is higher than the
takt time for the entire process is a bottleneck. Capacity constraints can cause variability
(mura) throughout the process. A capacity constraint may be caused by lack of resources or
knowledge.
•• Waste: Time traps and capacity constraints are important causes of waiting time. Obviously
we need to analyze the other types of waste (TIMWOOD) in the VSM to ensure that these are
not causing delays or quality issues (muda). We do this by analyzing each step in the process
and determining whether a particular type of waste is present in the step. This waste is
identified with a symbol on the VSM (See Lean IT Foundation)
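The bottleneck check described under Capacity Constraints can be sketched as follows. The sketch rests on two assumptions that the text does not spell out: the takt time of the process is taken as available working time divided by customer demand, and the "takt time" of a step is read as the time that step needs per unit of work. The step names and figures are hypothetical.

```python
def takt_time(available_minutes, demand_units):
    """Time available per unit of work to keep up with customer demand."""
    if demand_units <= 0:
        raise ValueError("demand_units must be positive")
    return available_minutes / demand_units


def bottlenecks(step_minutes_per_unit, process_takt):
    """Return the steps that need more time per unit than the process takt time."""
    return [
        step
        for step, minutes in step_minutes_per_unit.items()
        if minutes > process_takt
    ]


# Hypothetical process: 480 working minutes per day, 24 requests per day.
process_takt = takt_time(480, 24)
steps = {"Intake": 10, "Approval": 15, "Build": 30, "Deploy": 20}

print(process_takt, bottlenecks(steps, process_takt))
```

A step flagged this way is a capacity constraint. A time trap such as an approval would not show up here, because its delay comes from waiting rather than from the work itself.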
6.4 Analysis in IT
Much of what has been dealt with in this chapter is not specific to IT. Does this mean that the analysis
within IT kaizen is the same as non-IT kaizen? From a tool perspective, maybe. However, within IT, we
find that there are specific challenges. Looking at People, Process and Technology, we find a number
of characteristic analyses.
Technology
There is a massive amount of data available within IT organizations regarding the ‘behavior’ of the
technology. Technology delivers data that is very suitable for the creation of control charts and
histograms.
Having said this, Murphy’s Law dictates that the bit of technology that you need to investigate does
not have the right set of monitors in place to ensure that the technology can be researched. It is vital
that monitoring can be put in place quickly to understand the technological aspects of a problem.
The analyses most related to technology are control charts, Pareto charts and scatter diagrams.
Control charts help to understand the behavior of technology over time, Pareto charts help to
rank the importance of causes and scatter diagrams are used to understand whether there is a
relationship between symptoms.
Process
The analyses described above in relation to VSM are all used in relation to IT processes. It is
important, as already stated in the Measure phase, that people within IT organizations (especially
those involved in kaizen) know the difference between the IT units of work and the associated
processes. The dynamics of each unit of work must be understood so that the effects can be
understood.
People
People-related analysis is particularly related to the availability of skills and knowledge, and the usage
of time.
Skills and knowledge can be analyzed using a skills & knowledge matrix. The aim is, particularly, to
understand in which ways muri and mura are caused by choices made regarding people.
One of the IT-specific analyses is time-related. Where traditional Lean analyses look at the time
aspects of processes, within Lean IT, we also manage time on an organizational level. In essence, we
look at VA time versus NNVA and NVA time at an organizational level, i.e. per team, department or
the whole IT organization. This helps us to understand whether teams are faced with substantial
amounts of ad hoc work or whether they have a high diversity of units of work, leading to muda and
mura.
6.5 Analyze phase and A3
Complete Analysis section of A3.
The Analysis phase tends to be the phase
in which most time is spent. The team will spend much effort trying to bring together the various
symptoms, causes, effects and indications of how these may be mitigated.
There are two major pitfalls to avoid in the phase.
•• Don’t fall in love with your analysis

All the hard work that goes into the Analysis phase is in fact only a step to determining the correct
course of action. Most of the charts, graphs, matrices, etc. will not find their way onto the A3; only
the most significant. This does not mean you should throw away the analysis once made. All analyses
should be stored as a baseline for checking the effect of improvements made and for starting future
improvement initiatives.
•• Don’t jump to conclusions

Based on the insights gained in the Analysis phase and the relief that the solution of the problem
is getting close, the team may have a tendency to think that they have found the solution as they
uncover root causes. Jumping to conclusions will mean that other, possibly more significant, root
causes may be overlooked. This is where a kaizen lead proves their value, by keeping the team on
track.
Completing the Analysis part of the A3 ensures that the team focuses on the essence of the
analysis.
6.6 Key Steps for Analyze Phase
To wrap up the Analyze phase, let us take a brief look at the main steps that need to be
accomplished before moving on to the Improve phase.
1. Determine the critical independent variables
The first and probably most important step is to identify the key X’s, the independent variables
that most influence our problem. If we do not identify these correctly, we may spend a lot of time
and effort analyzing aspects that really have little bearing on the problem at hand.
2. Perform the data analysis
We must use at least the seven basic tools of quality to analyze the data collected in the Measure
phase, with the aim of determining which of the X’s have the greatest influence on the problem.
3. Perform the process analysis
We also need to take a detailed look at the Value Stream Map we created in the Measure phase. In the
VSM, we will be trying to identify where there is waste, where the balance between Value Add and
Non-Value Add activities is clearly tipping the wrong way, i.e. too much NNVA and NVA activity. We
also need to do the necessary calculations (PCE and Little’s Law). Our higher level goal is to
understand the flow (or lack of it) in the process by analyzing the throughput and constraints.
4. Determine the root causes
Based on the data and process analyses, we can generate theories to explain potential causes. Use the
5 Why, C&E Matrix and FMEA to narrow the search for the most important root causes. If the kaizen
team does not feel it has found the real cause or does not have sufficient evidence to support the
root causes found, do not hesitate to go back and collect additional data to verify root causes or
find new ones.
Remember: it is vital to not jump to conclusions, especially once one or two root causes are known.
Do not start formulating solutions before you have finished the analysis. It may be a solution, but
is it the best solution or the one that will actually solve the problem completely rather than
partially. Let the analysis run its course and see where the data, the facts, the calculations, the
visualizations and the dissecting of the problem take you.
5. Prioritize the root causes
Lastly we must prioritize the root causes we have found. This priority will be passed on to the
Improve phase, so that the kaizen team can focus on finding solutions for the most pressing root
causes.
Closing the Analyze phase is probably the most critical change of phase. The reason is that prior to
the closure of the Analyze phase, there must be little attention paid to the solutions. As we have
seen, quick wins may be found early on in the cycle. However, the danger of jumping to conclusions is
always present. The kaizen lead must therefore be continuously aware that the team stays focused on
the phase at hand. This is absolutely vital for the Analyze phase where the temptation to go for the
solutions is greatest.
The other danger that can raise its head at this point is distraction. This is a huge problem
within IT organizations, as there are more than
enough problems that need to be tended to. As
the team works through the DMAIC cycle, the
problem and its causes become clearer. As the
team becomes more familiar with the problem,
it appears to become less daunting. When we
do not understand a problem, it tends to seem
more threatening. As the threat decreases,
people have a tendency to downplay the
problem. In the worst case, this can lead to the
problem sponsor being distracted towards a
different problem that is demanding attention.
With this distraction, the kaizen team may lose
interest and the kaizen itself may peter out
before the solutions have truly been found and
implemented. It is absolutely vital that a kaizen
is brought to its logical conclusion, with at least
one solution to the problem being implemented.
This solution must have a visible impact on
reducing the problem.
6.7 Case study: Analyze Phase
In a large organization, the use of Lean principles had been steadily and rapidly increasing within
operational departments. This resulted in a large increase in the number of reports created in
the Business Intelligence (BI) system. As a result of carrying out lots of changes for the customer
departments, the IT department supporting the BI system had not paid attention to how this was
affecting the BI service. The number of incidents and complaints was exploding. Worst of all, the
reports so critical to the business were only being made available by the end of the morning, due to
slow data loading and system crashes. Finally, the problem was escalated to board level and a kaizen
was started. This case is interesting because a number of errors were made in carrying out the
Kaizen.
The problem definition was quite quickly developed: How can we ensure that all reports are available
at 07:00 every weekday morning? The kaizen sponsor, lead and team came to agreement on this
question to be answered within about an hour. This led to the start of the Measure phase. Here, a
mistake was made. The team convinced itself that they did not need to measure everything because
“we know it’s of strategic importance”. The assumption was that the board support to get the BI
systems up to scratch was pretty much unconditional because of the business criticality of the
system. Later on, during a meeting with customers and the kaizen sponsor, the preliminary analysis
of the problem and the proposed solutions were shot to pieces because they were insufficiently
supported by facts and numbers. This led to a re-take of the Measure phase.
In the first iteration, the kaizen team focused on creating an Ishikawa of the problem. This formed
the basis for the ill-fated meeting. The team subsequently set about creating clear control charts
of the performance of the load processes, supplemented by those of the use of memory, network
bandwidth, processing power and disk space. These were accompanied by histograms of the
occurrence of incidents and the implementation of changes. For each of the branches of the Ishikawa,
a Pareto chart was produced.
Initially, no Value Stream Map was made, because the problem was deemed to be a technical and
behavior & attitude problem. In the end, the VSM provided vital insight to solving the problem. It
turned out that a number of steps were sequential, when they could be carried out in parallel. This
was the result of a challenge to the common sense assumptions that had been made many years
back and, since then, had not been challenged. The VSM including both human and technical actions
showed where time
7 Improve Phase
At the end of the Analyze phase, the kaizen team basically has a list of the most important X’s, the
factors that cause the problem. The next goal is to identify improvement options. This is the aim of
the Improve phase.
The Improve phase is really the moment that we start thinking in terms of solutions. Earlier in the
cycle, we may have come across a solution, especially if the problem turns out to be an obvious one.
Assuming that we are dealing with a complicated or complex problem, the Improve phase is the time
to start gathering solutions.
In this section, we will look at ways of generating solution ideas, techniques for selecting and
prioritizing the solutions and testing solutions. All of these methods are useful but the one that
stands head and shoulders above all of them is going to the Gemba.
Observing the Gemba and validating solutions at the Gemba are two of the most important ways to
ensure that the right improvements are implemented in the right way. Going to the Gemba facilitates
the generation of ideas for solutions, especially because ideas can be discussed with the people.
Gemba validation ensures that the implementation of improvements is carried out in a way that
garners support with the people doing the work.
7.1 Idea generation
There are many options when choosing idea generation techniques. The techniques described below
are well-known, often-used, proven techniques that generate many ideas.
7.1.1 Brainstorming
Brainstorming is about generating as many ideas as possible. It is vital that ideas are not evaluated
during the brainstorm session as this limits the creativity. Typically, the brainstorm session will start
with a recap of the key factors causing the problem. These may be posted on a flipchart or on the
wall; a visual solution is recommended. Per factor, the team must generate as many solution ideas as
possible. As time goes by the solutions become more outlandish and strange. This is when you know
that the brainstorm session is reaching its goal. Often, in the absurdity of a proposed solution is a
core truth that helps to develop a more realistic solution. Once the ideas have dried up, the team can
move on to selecting and prioritizing the ideas.
An alternative to brainstorming is brainwriting. In this technique, the principal factors causing the
problem are posted on flipcharts around a room. The team members walk around the room in silence
posting sticky notes with their ideas on them. Participants read each other’s posts and use them as
inspiration to generate new ideas. The fact that a particular post is not explained means that another
person can freely associate or interpret the post as they wish. This again leads to ideas that are out of
the ordinary.
7.1.2 Reverse thinking
Reverse thinking is all about describing what you would like to have happen and then working out
how to make the opposite happen. This method helps to understand what the team should definitely
not do. Once this is clear, the step to understanding the possibilities becomes much easier. Usually
developing 10 to 15 reverse ideas provides sufficient input to look for desired solutions.
This method works because it is fun. Looking at the absolute opposite of what you are trying to
achieve means asking, from an IT perspective: how can I aggravate the current, problematic, situation?
This leads to amazing definitions of how the IT service infrastructure and organization can be
comprehensively sabotaged. Many additional insights have been collected during reverse-thinking
sessions, as engineers in particular try to out-do each other with better ideas. The challenge is then
to identify the opposite solution. A single negative solution may lead to multiple positive solutions.
7.1.3 SCAMPER
A third idea generation technique uses action verbs as triggers to generate ideas. SCAMPER is an
acronym with each letter standing for an action verb which in turn stands for a prompt for creative
ideas.
S – Substitute
C – Combine
A – Adapt
M – Modify
P – Put to another use
E – Eliminate
R – Reverse
Again, the aim is to produce as many ideas as possible. Each cause of the problem is approached
using the seven action verbs, with the aim of understanding what could happen if an aspect of the
cause (or all of it) is substituted, combined, adapted, modified, and so on.
7.2 Option selection and prioritization
Having generated a large number of solutions, we need to make this number manageable. This can be
done through bundling and/or elimination. The question is: how can we select the best solution(s) for
solving the problem?
There are dozens of tools for this task as well. The selection presented below is among the more
commonly used.
7.2.1 Affinity mapping

Affinity mapping reduces the number of solutions by bundling solutions that are linked, similar or
overlapping. The benefit of this way of working is that by bundling we are able to identify the central
themes of a set of solutions. This in turn can provide further insight into the best solution. Affinity
mapping is all about sorting the large number of solutions into a manageable set of clusters.
As with brainwriting, the first part of affinity mapping is done in silence. Team members put sticky
notes with possible solutions together, if they believe the solutions should be clustered for whatever
reason. Then one by one the clusters are discussed, and a header describing the key theme is given to
each set of solutions.
The team then determines which solutions are most suitable from each of the clusters, or potentially
they may develop a different solution based on the insight gained from the bundling exercise.
7.2.2 Solution Matrix

The solution matrix is a simple tool made up of two axes: feasibility and impact. Feasibility represents
the ability of the IT organization to actually implement the solution. Feasibility is high if the costs,
effort and time involved are low. Impact is about judging the effect that the solution will have if it
were implemented. Impact is about the effect on the IT organization and its customers in financial,
performance and/or learning terms. In paragraph 4.4, we already dealt with a number of questions
that can help to determine feasibility and impact.
Example of solution matrix
All of the solutions are then plotted by the team on to the solution matrix. The important part of the
process is that team members discuss the reasons why they believe a given solution should have a
particular impact and feasibility. This helps to understand how the team members see the adoption
of solutions within the organization.
Once all the solutions have been plotted, there will be a group that is clustered in the high
impact, high feasibility quadrant. These will be the solutions that need to be considered first for
implementation.
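Once the team has agreed on the position of each solution, the quadrants can be read off mechanically. The sketch below assumes that impact and feasibility are both scored from 1 to 10 and that a score above 5 counts as high. Neither the scale nor the threshold is prescribed by the solution matrix itself, and the example solutions are hypothetical.

```python
def quadrant(impact, feasibility, threshold=5):
    """Name the quadrant for a solution scored on an assumed 1 to 10 scale."""
    impact_label = "high" if impact > threshold else "low"
    feasibility_label = "high" if feasibility > threshold else "low"
    return f"{impact_label} impact, {feasibility_label} feasibility"


# Hypothetical solutions with (impact, feasibility) scores agreed by the team.
solutions = {
    "Run load steps in parallel": (9, 8),
    "Replace the BI platform": (9, 2),
    "Rename report folders": (2, 9),
}

# Solutions to consider first for implementation.
first_candidates = [
    name
    for name, (impact, feasibility) in solutions.items()
    if quadrant(impact, feasibility) == "high impact, high feasibility"
]
print(first_candidates)
```

The scores are only the outcome of the exercise. The discussion that leads to them is where the team learns how each member sees the adoption of a solution.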
7.2.3 Multi-voting
Each of the above techniques helps to gain control over a large group of solutions. The solution matrix
helps to make a broad prioritization of the solutions, as well. Multi-voting focuses on prioritizing the
solutions. This is done by each team member allocating votes to a set of solutions.
Let’s assume there are 30 solutions. Each team member is given 10 votes (a third of the total number
of solutions that can be voted for). After everyone has voted, the scores are tallied and the top 10
solutions are selected. It is possible to do a second round in which everyone gets 3 or 4 votes in order
to select the best solutions from the previously determined top 10. In this way, the kaizen team can
reduce the number of solutions to a manageable number.
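The tallying of a multi-voting round can be sketched as below. The sketch assumes that each ballot is simply a list of the solutions a team member voted for. Whether a member may give several votes to the same solution is not stated above, so the code does not enforce either rule. All names and ballots are hypothetical.

```python
from collections import Counter


def votes_per_member(number_of_solutions):
    """A third of the number of solutions, as in the example of 30 solutions."""
    return max(1, number_of_solutions // 3)


def tally(ballots, top_n):
    """Count all votes and return the `top_n` solutions with the most votes."""
    counts = Counter()
    for ballot in ballots:
        counts.update(ballot)
    return [solution for solution, _ in counts.most_common(top_n)]


# Hypothetical round with six solutions (A to F), so two votes per member.
ballots = [["A", "B"], ["A", "C"], ["B", "A"]]
allowed = votes_per_member(6)
assert all(len(ballot) <= allowed for ballot in ballots)

print(tally(ballots, top_n=2))
```

A second round works the same way: call `tally` again on new ballots that only contain the solutions selected in the first round.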
7.2.4 Business case development

The last technique is one to use once the number of solutions has been reduced to less than a
handful. The aim is to build comparable business cases for each of the solutions. Each business case
will include both the costs and the returns for the same fixed period of time.
The solutions of a kaizen should give a positive return within a maximum of six months. Anything
more probably means that the solution is too big and costly; the team should look for smaller
solutions, possibly only tackling part of the problem. It is advisable, where possible, to implement part
of the complete solution to a problem at a fraction of the cost, rather than spend huge amounts of
resources to completely solve a problem in one go. The key consideration here is the acceptance of
the change: smaller changes are more easily accepted and assimilated into the way of working than
large changes.
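The six-month guideline can be checked with a simple cumulative calculation, sketched below. The structure of the business case (a one-off cost, a recurring monthly cost and a monthly return) is an assumption made for the illustration, as are the figures. A real business case may contain other cost and return items.

```python
def payback_month(one_off_cost, monthly_cost, monthly_return, horizon_months=6):
    """Return the first month in which cumulative returns cover cumulative costs.

    Returns None when break-even is not reached within the horizon.
    """
    balance = -one_off_cost
    for month in range(1, horizon_months + 1):
        balance += monthly_return - monthly_cost
        if balance >= 0:
            return month
    return None


# Hypothetical business cases compared over the same fixed period.
small_solution = payback_month(one_off_cost=9000, monthly_cost=500, monthly_return=2500)
large_solution = payback_month(one_off_cost=20000, monthly_cost=500, monthly_return=2500)

print(small_solution)  # month in which the small solution breaks even
print(large_solution)  # None: no break-even within six months
```

A result of `None` is the signal described above: the solution is probably too big, and the team should look for a smaller one that tackles part of the problem.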
7.3 Testing solutions
Having selected one or more solutions to implement, the question that must then be answered is:
how will we try out the solution to see whether it works. This will depend very much on the type of
problem being solved.
Type of problem (Cynefin) Solution test
Obvious Implement a pilot using the best practices available in the market.
Complicated Create a small production pilot to understand how it behaves in the live environment.
Complex Use experimentation techniques to understand how the solution ‘behaves’ in practice.
Chaos Determine which actions to take and carry out a risk analysis for each action.
7.4 Solutions used in IT
Based on the table above, we can deduce that there are situations within IT for which solutions have
already been devised. Let us look at some typical solutions used within IT organizations.
Best practices are an area at which the IT industry excels. There are many best practice frameworks
developed for use within IT. These best practice frameworks present sets of rules that have been
developed over many years with contributions from the IT community. The most prominent
examples are:
•• ITIL: the most widely accepted approach to IT Service Management. It describes best practices
within IT organizations covering the entire lifecycle of an IT service from concept to retirement.
It identifies and describes 26 processes, all of which contain solutions to common problems
within IT organizations. A more concise version, related to ISO/IEC 20000, is part 2 of this
standard.
•• Cobit: approaches IT from a strategic governance perspective. Cobit aims to link business
goals to IT objectives including the definition of metrics, and roles and responsibilities. Cobit
primarily provides answers to the governance issues of IT organizations.
•• Scrum: a best practice model for rapid application development. It describes the way to
ensure that software is developed rapidly and that the final product delivers value to the
customer. A number of the problems faced by IT organizations in delivering new software to
customers are solved in this best practice.
•• Prince2/PMI: best practices that help to solve problems in the area of project
management and project governance.
On top of the best practices, IT also has good practices. These are frameworks of principles and tools
that help to improve the ability of IT to deliver and improve its services to customers. In this category,
we find methods like:
•• Lean IT: On top of kaizen problem-solving to continually improve services, Lean IT applies
lean principles to IT. As such, it helps to focus IT processes on single piece flow and delivery of
value to customers. Lean IT describes the principles and good practices that can be applied to
complex and complicated problems. An example of a lean solution is 5S, which guides teams
through a series of steps to take action on how work is organized.
•• Agile: This is a set of principles, originating from the development of software, that can be and
is applied to a variety of areas (e.g. Agile Project Management). The essence is about focusing
on individuals and interaction, working product (software), customer collaboration and
responding to change. Agile can be applied to problems in a similar way to Lean IT.
•• DevOps: A more recent addition to the list of methods, DevOps is a solution that derives
its effectiveness from the integration of a number of critical areas: process, organization,
performance, behavior & attitude and automation. This combination ensures that all aspects of
IT are included in the solution.
7.5 Improve phase and A3
The Improve phase leads to the review of the Future State section to see whether the solutions meet
the requirements of the intended future state. The Proposed Options section of the A3 must be
completed. Finally, the team must also describe the plan for implementing the solutions in the Plan/
Improvement section. This last section will be finalized during the Control phase, when the relevant
details of the control plan are added.
Determine the key actions to be taken per cause. Once actions have been completed, re-score the
occurrence and detection. In most cases we will not change the severity score unless the customer
decides this is not an important issue.
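The re-scoring described above can be illustrated with a small sketch. It assumes the common FMEA convention in which each cause is scored from 1 to 10 for severity, occurrence and detection, and the three scores are multiplied into a Risk Priority Number (RPN). The scale, the example cause and the scores are hypothetical and are not taken from this publication.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FmeaScore:
    """Scores for one cause; a 1 (best) to 10 (worst) scale is assumed."""
    cause: str
    severity: int
    occurrence: int
    detection: int

    def rpn(self) -> int:
        # Risk Priority Number: the product of the three scores.
        return self.severity * self.occurrence * self.detection


# Hypothetical cause, scored before the improvement actions.
before = FmeaScore("Change deployed without test", severity=8, occurrence=6, detection=5)

# After the actions only occurrence and detection are re-scored;
# severity is left unchanged unless the customer decides otherwise.
after = replace(before, occurrence=3, detection=2)

print(before.rpn(), after.rpn())
```

Comparing the two numbers per cause gives the team a simple way to show, on the A3, how far the completed actions have reduced the risk.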
7.6 Key Steps for Improve Phase
As a recap for the Improve phase, let us review the main steps that need to be accomplished before
moving on to the Control phase.
1. Generate potential solutions
Having understood the cause and effect relationships in the Analyze phase, the kaizen team
must now generate as many solution possibilities as they can, using one or more idea generation
techniques. It is important that maximum creativity is used in this step. The more solutions, the
better the chance that there is an easy-to-implement solution that solves the problem. Remember:
we are looking for small effective steps to resolve the problem, not large solutions that require considerable effort to implement.
2. Select and prioritize solutions
From the large number of solutions defined, we must now reduce the collection to a small number of solutions that both have impact and are highly feasible. We do this using the selection and prioritization techniques. If necessary, the team may need to perform small experiments to check whether a solution is suitable.
3. Apply best and good practices
Within IT, where we have a large number of best and good practices, it is very important to check the best and good practices for the particular area where the problem exists. Since IT problems, even complex problems, usually include parts that can be solved using best practices, it is a waste not to apply what others have already learned.
4. Develop “Future State” VSM
Once the team understands which solutions they intend to implement, they can create the future state Value Stream Map. This is important because it helps to focus the improvement efforts and to communicate the intended changes to the other people working in the process who were not part of the kaizen.
5. Pilot the solution and confirm improvement outcomes
During the Improve phase, the kaizen team must check whether the intended solutions actually work. Use the pilot to create documentation required to support the full-scale implementation, e.g. SOP (Standard Operating Procedure), checklists, KPIs (Key Performance Indicators) and metrics. The team may also use the FMEA (Failure Mode and Effects Analysis) to prepare for possible challenges during the implementation.
6. Create implementation plan for full-scale roll-out of solution(s)
Plan the implementation. This is primarily for communication purposes. Hopefully, the solutions to be implemented are small, requiring only a minor amount of training to ensure adoption. However, there may be aspects of the implementation that require substantial communication with people affected by the problem.
The kaizen team and particularly the sponsor must agree on the fact that the solutions will help to alleviate the problem. It is only then that the Improve phase can be closed and the implementation of the solution(s) can begin.
7.7 Case Study: Improve Phase
The IT management team had recently received a number of complaints from customers about the intake of projects within the IT organization. Their complaints centered around the number of times they had to tell their ‘story’ before IT actually got down to carrying out the project.
The Measure and Analyze phases of the Kaizen showed that, depending on the sensitivity of a topic, up to six different people may have a meeting with the customer to define the wishes for the project. First, an account manager would ask what the customer wanted. Second, an information manager would request a meeting to gain some more insight into the technological impact of the project. Then a project manager would turn up and pretty much repeat everything the account manager and information manager had just done, but then from a project execution perspective. If there were budgetary or governance issues, the financial director or operations manager may require a similar explanation as had already been given. And, finally, if there was any kind of problem the CEO would get involved as a referee. There seemed to be many projects with ‘any kind of problem’. In short, the process of getting a project started was extremely time-consuming.
The kaizen involved creating a Value Stream Map of the process and the analysis focused on the roles involved. Each of the roles had documented responsibilities, some of which either overlapped, conflicted or caused transfers. The analysis determined that, in fact, no one was actually responsible for ensuring that a project was defined, so that it could be executed.
The Improve phase was novel in that the kaizen team, which did not include the customer, actually invited all of the known project owners (the principal project customers of IT) to listen to the analysis and help to generate solutions to the problem.
The team, and the invited customers, used classic brainstorming to generate solutions. The kaizen lead introduced a 20-minute period of reverse thinking when she felt the options were drying up. This had the added effect of causing hilarious exchanges between business and IT people. Later, in the evaluation, the team realized that this period of fun actually improved the acceptance of changes that had previously been non-discussable. The ‘non-discussables’ were especially related to people having to relinquish part of their responsibility or authority for the sake of a more efficient process.
The result was quite surprising: the customer recognized that they themselves had caused part of the problem by insisting on a fairly bureaucratic governance on the business side; a classic case of muri.
Policies were adjusted and the information manager was given overall responsibility for ensuring that the project was fully defined. This was done in such a way that a project manager, who was allocated at a later date (sometimes weeks after the project had been defined), could easily read up on what was required and start executing the project.
8 Control Phase
The Control phase is the last step in the DMAIC cycle. In this phase, the goal is to successfully implement and, more importantly, maintain the gains achieved, i.e. it is all about ensuring the sustainability of the improvement. The question that the Kaizen team is trying to answer is, “How can we guarantee the improved performance?” Ensuring that the successes from the Improve phase will continue means transferring the responsibilities for performance to the process owner. One way to look at the concept of establishing controls is to ask yourself what elements, activities, roles, policies, etc. you need to put in place to make sure that, the next time you go to Gemba, the improvement is still in place and preferably better than you left it.
The Control phase has two main focus areas: guaranteeing the increased performance and handing off the improved process to its process owner.
8.1 Achieving Control
A control is, in essence, a procedure or policy; a way to identify whether work is done in the correct way. There are specific IT controls that provide assurance that the information technology used by an organization operates as intended, for example, with the correct authorizations, sufficient audit trails and processes that deliver the correct results. The aim of this phase is to implement controls to ensure that work is done the correct way.
In the previous Kaizen phases, we successfully investigated and, eventually, made improvements to alleviate the causes of the problem. To make structural improvements, we have focused on and dealt with the underlying causes that prevented the performance needed to meet the requirements from the four voices discussed earlier. Now that we have achieved the performance improvements, we need to look at controlling the delivered quality.
As we all know, making changes is challenging and sustaining those changes even more so. That is why we need control. It is often said that standing still means going backwards. We need to continually put energy into the IT organization in order to maintain organizational performance. The problem is that expending energy is tiring. The way to reduce the energy needed in organizations is to create habits.
We need to be diligent and develop the habits and practices necessary to maintain our current state and pursue improvement in the future. That is why we need a kaizen mindset; a true kaizen mindset means enjoying the challenge of counteracting the descent to chaos and continually seeking improvements.
8.2 Control Plan
As we said, the Control phase in our kaizen is aimed at maintaining the changes that were made in order to sustain the improvements. To help the process owner and the people doing the actual work, we must develop a control plan. This plan consists of four basic parts:
•• Documentation: a record of the changes made
•• Monitoring: a way of checking that changes are maintained
•• Response: a way of reacting to deviations or incidents
•• Training: communication of the changes to stakeholders
Without implementing a control plan to ensure problems do not reoccur, the Kaizen cannot be successful in the long run!
8.2.1 Documentation
Our improvements in the way of working need to be institutionalized as habits and routines. One key ingredient to achieve this is documentation. Of course, we all know that documenting alone is insufficient. It would be just a paper tiger (‘something that seems threatening but is ineffectual’). However, without documentation, it is difficult to create a baseline, to establish the right routines and habits. Examples of new documentation include new process steps, standards, procedures, policies and instructions for new or updated systems or tools.
Policy
Creating policy often results in documents that resemble legal documents, not least because they aim to be complete, covering all eventualities and exceptions. A policy should be clear and concise, and should stay within the scope intended. The policy should clearly state its intent and the spirit that should be followed when applying it. New or rewritten policy must obviously not contradict any other policy.
Roles and Responsibilities
Key to gaining control is establishing clear ownership and accountability for results. This needs to be documented so that the ownership and accountability are available for all involved. These roles and responsibilities can also be used in the process documentation.
Process Documentation
Process documentation is always a contentious issue: if it is not there, people complain that they do not know what is expected; if it does exist, nobody reads it. Traditionally, process documentation comes in one of two forms: a process flowchart accompanied by a RACI (Responsible, Accountable, Consulted, Informed) chart or a process flowchart with ‘swimming lanes’ (as described in the Analyze phase). Whichever method you use, the document should be short and simple so that it is easy for people to understand. This style of process documentation is often required for compliance purposes. An additional way to document a process is to post the Value Stream Map on the wall, and regularly organize short meetings to determine which improvements need to be carried out. Posting a Value Stream Map has the effect of keeping the improvement of the process at the top of people’s minds.
Standard Operating Procedure (SOP)
A SOP is a written procedure that describes how a specific task should be carried out. The idea is that by following the SOP, the desired outcome can be guaranteed and created in a consistent and efficient manner. A SOP is sometimes referred to as the ‘best known way’ of doing something, simply because there will one day be a better way of doing the work. This is an example of how the language in a lean organization can differ from that used in a non-lean organization. Within IT, the key area where we use the SOP is in describing the execution of Standard Changes.
A good SOP has a name for each step. It describes what needs to be done per step and how this should be carried out. An excellent and highly effective SOP also includes why a step needs to be carried out in the way described. If people do not understand the ‘why’, they will often ignore that step,
especially if it is an administrative step (see case at the end of the chapter).
An alternative to the SOP is the checklist. Checklists are particularly helpful when the process is non-standard or has limited aspects of repeatability. Although this document works principally as a reminder to carry out particular activities (as does the SOP), it does not guarantee a specific outcome. Rather, it ensures that things are not forgotten.
It is very important to determine the detail of the documentation. This should be based on the risk of not having sufficient detail versus the value of a short and easy-to-understand document. As with everything in Lean, start with focusing on the value.
8.3 Monitoring
To detect any irregularities, we need to know how the implemented changes are affecting performance. To do this, we need monitoring. Our approach for monitoring should focus on:
•• Monitor the process using the updated metrics and measurements made during the Measure phase
•• Evaluate the improvements made in the Improve phase
•• Assess the capability of the process over time and ensure that the solutions work for the long term
Metrics
During the Define and Measure phases, we have identified and created measurement systems for metrics related to the problem. Some of these can be re-used. The improvement we have chosen to implement will undoubtedly be related to a Critical Success Factor for which there is a Key Performance Indicator (see Lean IT Foundation for the definition of a KPI).
The credibility of measurements is highly dependent on their consistency and coherence. Consistency means that they are measured in a repeatable way that is the same across the whole IT organization. Coherence means that the measures are self-consistent across any number of assessments.
Ideally, we will identify leading and lagging indicators. The former will identify whether the performance will decline in the future; the latter look at performance after the fact. Within IT, an example of a leading indicator is the number of Problems solved. This is a leading indicator for a reduction in the number of incidents. Untested changes, on the other hand, are a leading indicator for an increasing number of incidents. The decrease or increase in the number of incidents is the lagging indicator.
Besides the usage of metrics, we need to establish a form of dashboard. The input for the dashboards is the information from the metrics. A dashboard is a visual tool to ensure that both managers and engineers know how they are performing. The dashboard ensures consistency in the use and interpretation of metrics and KPIs since everyone looks at the same consistent and coherent measurements.
Visual management
Aside from the metrics, an excellent way of spotting irregularities is to talk with the people related to the problem area about performance and the issues they face. To do this, we can set up Lean style Visual Management and engage in meaningful performance dialogues with the process owner, engineers and
management.
As we saw in the Lean IT Foundation, Visual Management is about effective communication and real-time updates regarding the work. Performance and workload are shared for visibility and effective communication. Visual management covers steering the work, planning and reviewing progress and, of course, managing improvements on a daily and weekly basis. Including the Visual Management tools as control mechanisms therefore makes sense.
Visual management helps to create consistent and effective communication. It removes the need for a series of one-to-one communications that inherently has the risk of an inconsistent message. The communication is effective because the entire team hears the same message at the same time. Lastly, frequent feedback loops are established. This is based on common knowledge of the chosen solutions to problems.
Performance dialogues
Measurement is vital to understanding the dynamics of our processes. Measurement in itself does nothing; it is what we do with the measurement that counts. Measurement must lead to changes in behavior and we must ensure that our behavior helps us to achieve our goals. Once again, as we saw in the Lean IT Foundation, the performance dialogue is an instrument that helps us better understand what good performance is and how to collaborate in order to create value for customers.
To engage in meaningful communication to control our process, we set up performance dialogues. Their aim is to ensure a structured and objective discussion of performance. These discussions consist of three elements.
•• Share feedback on the performance and spot irregularities
•• Identify root causes and offer suggestions for improvement
•• Determine what needs to be done in order to correct any irregularities and whether support is needed for completion.
The order of the steps may vary depending on whether new irregularities are being discussed, ongoing performance is being evaluated or correcting measures are being reviewed.
Cascade
Quite often, support from other organizational units is required when implementing improvements. Therefore, we must establish rapid and effective communication that can easily cascade through all levels of the organization. This may require changes to the infrastructure of meetings relevant to the problem area. The goal is to propose changes to the infrastructure so that ideas, suggestions and requests for help can flow readily through the channels of the organization.
8.3.1 Response
Although we never know when and what kind of irregularities we will be facing, we can prepare by setting up responses in advance. This means establishing checks that will signal out-of-control conditions and defining actions to be taken.
During the Analyze phase, we looked at the use of the FMEA to understand what could go wrong in a certain situation. To the FMEA, we can add OCAP procedures (Out-of-Control-Action-Plan).
As we saw before, a good FMEA defines:
•• The activities such as inspection, checks or
measurements aimed at control
•• The frequency of the activities, who is responsible and the tools involved
•• The standard or norm that defines what is acceptable and what is not
The interventions grouped together, we call an OCAP or Out-of-Control-Action-Plan. It prescribes what to do in case a failure occurs. It is a living document, which stores knowledge about possible/known issues and related solution strategies. The OCAP makes exception handling and firefighting efficient and effective.
It is vital that engineers are involved in setting up the FMEA and the OCAP to ensure they know what is required. The OCAP was first used at Philips Semiconductors. There, it had been common practice to establish normal operations, but no response was formalized to specific events. By adding the OCAP to the FMEA, they created the possibility to prepare for disruptions. Also, the responsibility for the activities was anchored in the workplace. In effect, the OCAP is like a small-scale contingency plan.
Guidelines for using the OCAP are:
•• OCAP procedures are usually documented as a flowchart
•• Knowledge and experiences with the process are documented
•• The OCAP is a living document and is updated regularly
•• Part of the OCAP is maintaining a log, in which uses of the OCAP are registered for both problem and solution
•• This log offers insight into problems that occur frequently and guides new kaizen initiatives.
8.3.2 Training
The training part in our Control plan is focused on:
•• Ensuring that each person executing a particular role knows their accountability
•• Guidance in understanding and using all the documentation
•• Instructions for how to use the monitoring tools
•• Knowing the response activities
To aid with setting up the necessary training, a knowledge and skills matrix can be used. With this tool, we can readily identify the current and needed knowledge and skills for those involved in the problem area.
If a knowledge and skills matrix already exists, it will need to be updated to include any improvements. A lack of skills and knowledge is a kind of waste and may prevent the kaizen improvements from having the desired impact.
8.4 Communication Plan
Building a communication plan is essentially about ensuring that the right people are given the right information at the right time. When putting together a communication plan, we must include the following variables:
•• Content: what are we communicating about?
•• Audience: for whom is the communication?
•• Purpose: why are we communicating about this content?
•• Timing: when will the communication take place? Is it a one-time event or does it recur?
•• Form: will the communication be presented in the form of a newsletter, email, interactive meeting, presentation, training or other such form?
•• Input: from whom do we need input / consultation prior to the communication event?
•• Actions: who will do what to ensure the communication happens?
•• Capacity: how much time is needed to carry out the communication event planned?
Generic example of communication plan
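One way to keep the eight variables of a communication plan together is a simple record per communication event. The sketch below is illustrative only: the class name, the field names and the example entry are assumptions made for this example and are not part of the Lean IT Kaizen material.

```python
from dataclasses import dataclass, fields


@dataclass
class CommunicationItem:
    """One planned communication event, with one field per plan variable."""
    content: str     # what are we communicating about?
    audience: str    # for whom is the communication?
    purpose: str     # why are we communicating about this content?
    timing: str      # when, and is it a one-time or recurring event?
    form: str        # newsletter, email, meeting, presentation, training, ...
    input_from: str  # from whom do we need input beforehand?
    actions: str     # who will do what to make it happen?
    capacity: str    # how much time is needed?


def missing_variables(item: CommunicationItem) -> list[str]:
    """Return the names of the variables that have not been filled in yet."""
    return [f.name for f in fields(item) if not getattr(item, f.name).strip()]


# Hypothetical entry; the capacity has not been estimated yet.
item = CommunicationItem(
    content="New project intake procedure",
    audience="Project owners",
    purpose="Explain who defines a project from now on",
    timing="One-time, before go-live",
    form="Interactive meeting",
    input_from="Information manager",
    actions="Kaizen lead prepares and presents",
    capacity="",
)
print(missing_variables(item))
```

A check like `missing_variables` is only a convenience: it flags plan entries where one of the variables has been left open before the plan is shared.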
8.5 Closure
The final step in our kaizen is its closure and the hand-off from the kaizen team to the problem owner
(as we saw, this tends to be the kaizen sponsor). We consider the Kaizen closed when the problem
owner has accepted the following deliverables:
•• Improved performance, including the before and after data on metrics, to be used as a baseline for
further improvements
•• A completed Kaizen A3, including lessons learned (both successes and failures) and recommendations
for further improvements
•• Documentation, including Standard Operating Procedures, policies and other documentation
produced during the kaizen such as Value Stream Maps and other tools
•• Operational training, including training on the Standard Operating Procedures and changes in
processes or policies
•• Transfer plan for sharing gained knowledge and new best practices
One of the powerful aspects of running kaizens is to transfer successful implementations across the
entire organization, through replication and standardization. Replication means taking the solution
from the team and applying it to the same type or a similar type of problem. Standardization
means taking the lessons learned from the team and applying those good ideas and solutions to
other problems. The kaizen team should consider standardization and replication opportunities to
significantly increase the impact on the business, to far exceed anticipated results.
The transfer of best practices demands great care and a well-devised implementation method.
Special care should be given to:
•• The people working in other processes. The background of the changes should be well
explained to them.
•• Verification that the changes made work well enough in practice and are transferable to the
new situation. This can be done by carrying out pilots of the improvement actions.
•• Any feedback, especially on complications at the start
•• Acceptance of the changes
Never assume that your proposed improvements will work perfectly at once somewhere else. Usually
the improvements come across some complications. Fine-tuning the improvements may be necessary
and provides a great opportunity to involve the others. Ensure that any feedback given is captured
and used.
When the kaizen event is officially over, a team evaluation may be done to assess how each individual
did as a team member, management may devise rewards to recognize the work of the team, and the
team may share the gained knowledge on how to run a successful Kaizen with others.
8.6 Control phase and A3
In this final step of the kaizen, we must complete the A3. This will entail reviewing the entire A3 to
ensure that the story that needs to be told is actually told. As the team moves through the DMAIC
cycle, each phase that is completed appears to be the most important. And it is, up to that point. The
key message of the A3 is: what are we doing to remove the problem we initially defined?
The analysis that the team spent so much time on, producing valuable insights through
measurements, may be reduced to a few sentences, results or graphs. The proposed options from
the Improve phase may be limited to the top three.
In finalizing the A3, we focus on creating a consistent story based on the prior documentation and,
principally, describe the solution to the problem defined in the background section.
In the Plan / Improvement section, the chosen solution to the problem is described. It is accompanied
by a plan defining how the chosen solution will be implemented in the IT organization. The Follow-up
section is where we describe the activities that we have devised to ensure that the solution remains
embedded in the IT organization, or if necessary how the solution will be disseminated throughout
the IT organization.
8.7 Key steps in the Control phase
To complete the Control phase, let us take a brief look at the main steps that need to be
accomplished.
1. Create a measurement system
Institute the metrics to control the improvement. Ensure that these are included in a dashboard for
use by all people involved. The basis for this measurement system will probably have been laid during
the Define and Measure phases.
2. Create documentation
It is vital to record the changes made. At the same time, we must be careful about creating too much
documentation. Keep policies and process documentation concise. Ensure that the documentation is
written for the right audience. Make use of Standard Operating Procedures and checklists wherever
possible.
3. Create control plan
Ensure that the kaizen team makes a control plan, preferably with help from colleagues outside
the kaizen team. This involvement helps to generate a plan that is supported by a greater number
of people and that contains acceptable controls. The control plan must include all of the four key
aspects: documentation, monitoring, response and training.
4. Communicate to stakeholders
Communicating the results and control measures to the stakeholders is vital. This is the only way to
ensure that everyone involved knows what to do. Part of the communication is achieved through
training; the rest will be achieved through information sessions. Sending an email to inform someone
of the change does not constitute communication to the stakeholders.
5. Present the results as described on the A3
We need to finalize the kaizen A3. The key reason is to ensure that all pertinent information is
collected in one place, and that the results of the kaizen are explained simply. This document can
obviously be used to support the communication of the solution and the control activities to the
stakeholders.
6. Transition ownership
The last step is for the kaizen sponsor to take ownership of the results of the kaizen. In effect, we
move the responsibility from a ‘project team’ back to the hierarchical line. The kaizen sponsor should
be pleased with the result, since a (part of the) problem has been solved.
Therefore, do not forget to celebrate the success with all involved.
8.8 Case Study: Control Phase
A classic problem for many IT organizations is the use and, particularly, the maintenance of the
Configuration Management Database (CMDB). Mostly, considerable effort is put into ensuring that
the configuration items (CIs) are recorded in the CMDB. The problem is that within months (sometimes
weeks), the CMDB is no longer up to date, CIs are missing, details of new CIs have not been entered
and people start complaining about the quality of the CMDB. This leads to general apathy towards
the CMDB, and it spirals into disuse.
One IT organization decided to take this problem seriously. The kaizen resulted in the conclusion
that many of the aspects necessary for the CMDB to be used were in place. What was missing was
a comprehensive set of controls to ensure that everyone was focused on keeping the CMDB up to
date. The result of the kaizen was a complete control plan, which was written by both managers and
engineers, thereby stimulating the adoption of the agreed actions.
Starting with the documentation, it was found that process documentation existed. Its quality was
fine and it turned out to still be relevant. Two pieces of documentation were missing. The first was a
policy. This document was created and consisted of nine points covering definitions, authorizations,
the allocation of responsibility for particular CIs to teams, the basic set of data to be collected and the
way quality should be monitored. The second piece of documentation was not so much missing,
as in need of improvement. This concerned all Standard Change procedures. It was decided to include
the ‘CMDB update’ step at ¾ of the way through the set of steps to ensure that everyone would carry
out the update before the end of the procedure. In the ‘SOP’, it was explained why the step was so
important.
The next step was to define the monitoring activities. First, a set of simple metrics was defined:
the number of CIs with no relationships, the total number of CIs under management per team
and the number of CIs not containing the basic set of data. The metrics were used during the
Visual Management meetings, and the results were both used in performance dialogues and
communicated through the cascade. The management levels agreed that if the metrics did not
show a steadily improving result, they would ensure that action was taken. The responses were
pre-determined and communicated to the teams.
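As an illustration only, the three metrics named above could be computed from an export of the CMDB along the following lines. This is a minimal sketch: the record layout, the field names and the contents of REQUIRED_FIELDS are assumptions made for the example, since the text does not list the basic set of data defined in the policy.

```python
from collections import Counter

# Hypothetical "basic set of data"; the policy's actual fields are not given in the text.
REQUIRED_FIELDS = ("name", "owner_team", "ci_type")


def cmdb_metrics(cis):
    """Return the three control-plan metrics for a list of CI records (dicts)."""
    # Metric 1: CIs that have no relationships to other CIs.
    no_relationships = sum(1 for ci in cis if not ci.get("relationships"))
    # Metric 2: total number of CIs under management per team.
    per_team = Counter(ci.get("owner_team") or "unassigned" for ci in cis)
    # Metric 3: CIs where at least one required field is missing or empty.
    incomplete = sum(
        1 for ci in cis
        if any(not ci.get(field) for field in REQUIRED_FIELDS)
    )
    return {
        "cis_without_relationships": no_relationships,
        "cis_per_team": dict(per_team),
        "cis_missing_basic_data": incomplete,
    }


# Invented sample records, for illustration only.
sample = [
    {"name": "srv-01", "owner_team": "unix", "ci_type": "server", "relationships": ["app-01"]},
    {"name": "app-01", "owner_team": "apps", "ci_type": "application", "relationships": []},
    {"name": "sw-07", "owner_team": "", "ci_type": "switch", "relationships": ["srv-01"]},
]
metrics = cmdb_metrics(sample)
```

Tracking these numbers at a fixed interval gives the trend that the management levels agreed to respond to.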
Lastly, everyone in the whole IT organization was trained in the new way of working. The training
was primarily done by team members (not by management). In general, team members were very
persuasive in their communication to their colleagues as to the why, how and what of the way of
working surrounding the CMDB.
The result was a much better acceptance of the need to maintain the CMDB. The quality of the CMDB
improved over the ensuing months. The quality improvement did not spike and fall back; rather, it
showed a steady improvement trend.
9 Appendix 1: References
9.1 Lean Six Sigma Pocket Toolbook (chapters 1-4, 9)
Authors: Michael L. George et al

ISBN: 0-07-144119-0
Publisher: McGraw Hill
9.2 Understanding A3 Thinking
Authors: Durward K. Sobek II, Art Smalley

ISBN: 978-1-56327-360-5
Publisher: CRC Press
9.3 A Leader’s Framework for Decision Making
Authors: David Snowden, Mary Boone

Publisher: Harvard Business Review
Date: November 2007, p69-76
10 Appendix 2: Glossary
A3 Refers to the size of a piece of paper that provides enough space to
explain a relatively complicated story, but encourages conciseness in
the communication of a message.
A3 Proposal Is used for creating a recommendation for action
A3 Status Report The A3 status report is aimed at informing all stakeholders of the
progress of the execution of a longer-running project or action
Affinity Mapping Bundling solutions that are linked, similar or overlapping in order to
reduce the number of solutions.
Agile A set of principles, originating from the development of software, that
can be and is applied to a variety of areas (e.g. Agile Project Management).
Andon Refers to a system to notify management, maintenance, and other
workers of a quality or process problem.
Analysis An A3 skill where the aim is to separate something into its constituent
parts or elements. It is vital when writing an A3 report to understand
the parts of the problem so that only the right information is given. If
we are able to discern the parts of a problem, we can also determine
which of these parts are relevant to the reader.
What was done to identify the root cause of the problem. (vb. Analyze)
Analyze (Phase) Third phase of the DMAIC cycle in which the analysis of the problem is
done.
Annotated Observation Watching what happens and noting the number of times something
happens, the amount of time spent on a task, the number of errors
made in finished products and other such observable occurrences.
Baseline Baselines and benchmarks are necessary to understand the relative
value of the performance. A baseline is the measurement of a
situation in order to understand whether a change occurs based on an
intervention after the baseline has been set. This is particularly useful
in kaizen because we are very interested in the effect of changes that
have been implemented in the IT organization. It is vital that during the
Measure phase a baseline is set that can be used to measure progress.
Benchmark A benchmark is a standard or set of standards used in evaluating the
performance or level of quality of an organization. Benchmarking may
be used during a kaizen to understand how well others perform a
particular activity. This may help to identify what improvements are
possible.
Capacity The maximum amount of output that the process can deliver over a
period of time
Cause and Effect Diagram See Fishbone diagram.
Cause and Effect Matrix A cause-and-effect matrix helps determine which factors affect the
outcomes of the process being investigated.
Change Over Time Time needed to change from processing one unit of work to processing
a different one. Within IT, this is the time lost due to context-switching.
Check sheet The check sheet is a simple and highly effective tool for collecting
quality-related data in a structured way. It is a way to assess a process
and can function as input for other analyses when there is limited or no
numerical data to be analyzed.
Common cause variation Sources of variation in a process that are inherent to the process, also
referred to as noise.
Continuous Improvement Ongoing process in an organization with the objective to find, resolve
and share solutions to problems. The objective is to achieve perfection, in
other words to improve value streams, product and customer value. A
philosophy of frequently reviewing processes, identifying opportunities
for improvement, and implementing changes to get closer to perfection.
Control Chart The control chart is essentially a time-series chart. A time-series chart
is one in which data is plotted on a chart where the horizontal axis is
a time sequence. The vertical axis can be numbers or another variable
whose value can be different over time.
A control chart helps to understand variation.
Control (Phase) The fifth and final phase of the DMAIC cycle. This phase ensures that
improvements are implemented and anchored into the way of working
Control Plan A plan aimed at maintaining the changes that were made in order
to sustain the improvements. This plan consists of four basic parts:
Documentation, Monitoring, Response and Training.
Control Variable This kind of variable is particularly useful in experiments. This variable
is kept constant while others are changed so that they can be
investigated.
Customer The person or group of people who use your product or service OR the
person next in line in the value stream.
Customer Value A person who buys, uses or derives value from a product/service. Only
the ultimate customer defines value. The person ‘next in line’ is referred
to as a ‘partner in the value stream’, or an ‘internal’ customer.
A capability provided to a customer at the right time at an appropriate
price, as defined by the customer. The more a product or service meets
a customer’s needs in terms of affordability, availability and utility,
the greater value it has. Thus, a product with true value will enable, or
provide the capability for, the customer to accomplish his objective.
Cynefin (Model) A model in which decision-making contexts are categorized into one of five
types: simple, complicated, complex, chaotic and disorder.
Daily Kaizen Act of responding to everyday occurrences such as incidents, mistakes
and other quality issues and addressing quality issues at the source
rather than being satisfied with quick fixes.
Define (Phase) The first phase of the DMAIC cycle, in which the problem to be solved is
defined and agreed
Dependent Variable This is the output; in effect, this is the problem that is captured as part
of the Measure phase.
DevOps DevOps is a solution that derives its effectiveness from the integration
of a number of critical areas: process, organization, performance,
behavior & attitude and automation.
DMAIC Acronym for the five steps in problem solving with Kaizen, i.e.: Define,
Measure, Analyze, Improve and Control.
DMEDI Acronym for the five steps in problem solving with Kaikaku, i.e.: Define,
Measure, Explore, Decide and Implement.
Fishbone diagram The fishbone diagram identifies many possible causes for an effect or
problem. It can be used to structure a brainstorming session.
Five “Whys.” A root-cause analysis tool used to identify the true root cause of a
problem. The question “why” is asked a sufficient number of times
to find the fundamental reason for the problem. Once that cause
is identified, an appropriate countermeasure can be designed and
implemented in order to eliminate re-occurrence.
Flow The smooth, uninterrupted movement of a product or service through
a series of process steps. In true flow, the work product (information,
paperwork, material, etc.) passing through the series of steps never
stops.
Flowchart A flowchart is one of the simpler of the seven quality tools. The
flowchart is the visual representation of a series of steps in a process,
and helps to break down a complicated process into a simple series
of steps. This simplification ensures that the process becomes
understandable to anyone.
Failure Mode and Effect Analysis (FMEA) Failure modes and effects analysis (FMEA) is an analysis for identifying
all possible failures in a design, process, product or service. The failure
modes are the ways in which something might fail. Failures are any
errors or defects and can be potential or actual. The effects analysis is
about understanding the consequences of those failures.
The aim of the FMEA is to take actions to remove the sources of failure,
i.e. the root causes, starting with those with the greatest impact. FMEA
can be used throughout the lifecycle of an IT service, from design to
operation and retirement of the service
Gemba The place where the work is done. Within a lean context, Gemba simply
refers to the location where value is created
Histogram A histogram is "a representation of a frequency distribution by means
of rectangles whose widths represent class intervals and whose areas
are proportional to the corresponding frequencies." In short, this means
that we create a graph in which groups of numbers are plotted based
on how often they appear.
The power of histograms is that they allow us to analyze extremely
large datasets by reducing them to a single graph that can show one or
more peaks in data. The histogram also visualizes the significance of the
peaks.
Hypothesis A hypothesis is a statement that will start with the words “I/We think/
believe that …”. The hypothesis is as yet not supported by any factual
basis. The hypothesis is based on people’s beliefs as a result of their
observations. These are by definition selective and biased, and very
much in need of testing through thorough analysis of the data and facts
that can be found.
Improve (Phase) Fourth phase of the DMAIC cycle. The kaizen team thinks up possible
solutions to the problem based on the analysis done.
Improvement Board Board that presents current problems and the follow-up to resolving
or addressing that problem (also Kaizen Board); an element of Visual
Management
Incident An unplanned interruption to an IT service or reduction in the quality
of an IT service. Failure of a configuration item that has not yet affected
service is also an incident.
Independent Variable In the case of problem-solving, the independent variable can be seen
as something that may or may not contribute to the problem. The aim
is obviously to find the independent variables that have the greatest
effect on the problem.
Ishikawa diagram See Fishbone diagram.
Jidoka Creating an environment in which disturbances to the flow of work
through the value streams are made visible, i.e. problems are not left
covered up.
Kaikaku Japanese for "radical change"; a business concept concerned with
making fundamental, transformational and radical changes to a
production system, unlike Kaizen, which is focused on incremental minor
changes.
Kaizen An improvement philosophy in which continuous incremental
improvement occurs over a sustained period of time, creating more
value and less waste, resulting in increased speed, lower costs and
improved quality. When applied to a business enterprise, it refers to
ongoing improvement involving the entire workforce including senior
leadership, middle management and frontline workers. Kaizen is also
a philosophy that assumes that our way of life (working, social or
personal) deserves to be constantly improved.
Kaizen board See Improvement board
Kaizen charter The document in which the problem is described and an indication is
given of what resources (people, time, money) are to be allocated to the
resolution of the problem
Kaizen Event See DMAIC
Kaizen lead This person manages the kaizen process on behalf of the sponsor and
the team
Kaizen Mindset There must be a belief throughout the IT organization, both among
managers and employees, that improving IT services and the way they
are delivered can and must be done on a daily basis
Kaizen sponsor This person is the owner of the problem, and has a direct interest in
having the problem solved.
Kaizen team member The people executing this role will do the required work. They must be
involved with the problem as it occurs on the work floor
Kakushin This is the third form of improvement. Kakushin focuses on innovation,
reform and renewal. It differs from Kaikaku in that Kaikaku deals with
transformational change of existing structures, systems, etc. Kakushin
deals with the introduction of completely new structures, systems, etc.
Known Error A Problem for which the root cause and a workaround have been
documented
Lead Time The time between the moment the customer submits their request and
the moment they receive the requested item or service.
Little’s Law Lead time = the number of units of work in the process (WIP) /
average completion rate. Helps us understand the relationship between
lead time and work-in-progress.
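A small worked sketch of this relationship, assuming WIP is counted in units of work and the completion rate in units per day (the function name and the figures are invented for illustration):

```python
def lead_time(wip, completion_rate):
    """Little's Law: lead time = work in progress / average completion rate."""
    if completion_rate <= 0:
        raise ValueError("completion rate must be positive")
    return wip / completion_rate


# Illustrative figures: 60 open changes, 12 changes completed per day on average.
days = lead_time(wip=60, completion_rate=12)  # 5.0 days
```

Halving the WIP at the same completion rate halves the lead time, which is why limiting work in progress is a common countermeasure.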
Machine Time The time a unit of work is worked on by a machine. This is a type of
waiting time.
Measure (Phase) Second Phase of the DMAIC cycle. In this phase, facts and figures are
collected to understand the problem we are trying to resolve.
MECE Acronym for Mutually Exclusive, Collectively Exhaustive. Mutually
Exclusive means that all items in a particular category only belong
to that category, and no other. Collectively Exhaustive means that all
possibilities have been covered.
Muda Japanese word for waste. See Non-value-added and Waste.
Multi-Voting Multi-voting focuses on prioritizing the solutions by allowing each team
member to allocate votes to a set of solutions.
Mura Japanese word meaning unevenness; irregularity; lack of uniformity;
variation.
Muri Japanese word meaning overburden; unreasonableness;
excessiveness. Often related to policy-based waste.
Pareto chart or diagram Bar chart showing the causes of a problem or condition, ordered from largest
to smallest contribution. Effective tool to show what the big contributors
to the problem are.
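The ordering behind a Pareto chart can be sketched in a few lines. This example is illustrative only: the function name and the incident categories are invented, and the cumulative percentage column is the usual companion to the bars rather than something specified in this glossary entry.

```python
def pareto_table(counts):
    """Sort causes from largest to smallest and add a cumulative percentage."""
    total = sum(counts.values())
    if total == 0:
        return []
    rows = []
    cumulative = 0
    for cause, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        cumulative += count
        rows.append((cause, count, round(100 * cumulative / total, 1)))
    return rows


# Invented incident counts per cause, for illustration only.
rows = pareto_table({"network": 15, "password reset": 50, "other": 5, "disk full": 30})
```

Reading the cumulative column from the top shows how few causes account for most of the problem.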
PDCA Cycle Plan, Do, Check, Act is a well-known continuous improvement method
often referred to as the Deming Circle. The PDCA cycle is applicable in
any situation, and forms the basis for all improvement within Lean.
Performance Dialogue The aim of a performance dialogue is to ensure a structured and objective
discussion of performance. These discussions consist of three elements.
Poka Yoke Literally, ‘to prevent an unintentional error’. A concept aimed at
ensuring that activities can only be done in one way, the right way;
fool-proofing an activity.
Problem An undesired situation that stands in the way of providing the
necessary customer value; an opportunity to improve. Also, the root
cause of incidents (ITIL Definition of Problem, denoted with a capital P).
Problem Board See Improvement board.
Problem Management A core ITIL operational process whose aim is to prevent problems
and incidents, eliminate repeating incidents and minimize the impact of
incidents that cannot be prevented
Problem Statement A statement that helps the team investigating the problem to focus its
attention. The problem statement may be in the form of a question or
in the form of a statement. The former is preferable because it is then
clear when you have found the answer to the question.
Process Cycle Efficiency (PCE) Process Cycle Efficiency refers to the degree of efficiency of a process
(or set of processes), whether it relates to the level of success of
processing within an organization, the cost-effectiveness of a market,
or the erosion of income by expense.
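In the Measure phase, this publication calculates the figure as Process Cycle Efficiency = VA time / Process lead time. A minimal sketch of that calculation follows; the function name and the example figures are assumptions made for illustration.

```python
def process_cycle_efficiency(value_added_time, process_lead_time):
    """PCE = value-added (VA) time / process lead time, both in the same unit."""
    if process_lead_time <= 0:
        raise ValueError("process lead time must be positive")
    if not 0 <= value_added_time <= process_lead_time:
        raise ValueError("VA time must lie between 0 and the process lead time")
    return value_added_time / process_lead_time


# Illustrative figures: 2 hours of value-added work within a 40-hour lead time.
pce = process_cycle_efficiency(value_added_time=2, process_lead_time=40)
pce_percent = round(pce * 100, 1)  # expressed as a percentage
```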
Pyramid Principle Developed by Barbara Minto, the Pyramid Principle is a method that
is fully compatible with A3 thinking. In fact, it helps to structure the
information and insights gained during the kaizen event.
The problem is framed using the following framework:
Situation-Complication-Key Question-Answer.
Queue Time The time a unit of work is in a queue. This is a type of waiting time
Repeater These units of work occur regularly; indicative frequency is weekly. As
an example within IT, we find high impact incidents, small to medium
sized non-standard changes and the smaller advisory services.
Root cause The underlying or original cause of an incident or problem.
Root cause analysis Studying the fundamental causes of a problem, as opposed to analyzing
symptoms.
Runner Units of work that occur on a daily basis and tend to require up to one
hour of work for them to be completed. Within IT, we can say that
incidents, service requests, standard changes and operational activities
fall in this category.
SCAMPER An idea generation technique that uses action verbs as triggers to
generate ideas. SCAMPER is an acronym with each letter standing for
an action verb which in turn stands for a prompt for creative ideas: S –
Substitute, C – Combine, A – Adapt, M – Modify, P – Put to another use,
E – Eliminate, R – Reverse.
Scatter diagram A graph that aims to demonstrate the relationship between two sets of
data. We try to understand whether there is a correlation between two
sets of data and whether this correlation is positive or negative.
Shewhart Cycle The Plan-Do-Check-Act (PDCA) sequence, often referred to as the
Deming Circle.
SIPOC Supplier, Input, Process, Output, Customer. Diagram used to establish
the Kaizen project team, create the project charter and planning, get
stakeholders’ support and start the project.
SMART Specific, Measurable, Achievable, Realistic, Time-bound
Solution Matrix Matrix in which solutions can be plotted according to two axes:
feasibility and anticipated cost.
Special Cause Variation Source of variation that can be assigned to a specific cause which can
usually be discovered. Special causes generate patterns in the data and
provide signals about the problems in the process and how they can be
resolved.
Standard Operating Procedure (SOP) An SOP is a written procedure that describes how a specific task should
be carried out.
Stranger Units of work that have an irregular occurrence. IT ‘strangers’ are
large non-standard changes, large requests for advice and plans, which
all tend to occur or be updated on a monthly or quarterly basis.
Summarize An A3 skill that is the ability to express thoughts, facts, and other
information concisely
Synthesis Refers to a combination of two or more entities that together form
something new; synthesis is required to address complex and
chaotic problems (vb. Synthesize)
System Thinking Systems thinking has been defined as an approach to problem solving,
by viewing "problems" as parts of an overall system, rather than
reacting to specific parts, outcomes or events, and thereby potentially
contributing to further development of unintended consequences.
It is the process of understanding how those things which may be
regarded as systems influence one another within a complete entity, or
larger system.
Takt Time Volume of customer demand per time period (takt time is the inverse of
this number)
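As a worked sketch of that inverse relationship, takt time can be computed as the available working time divided by the customer demand in the same period. The function name and the figures below are assumptions for illustration.

```python
def takt_time(available_time, demand_units):
    """Takt time = available working time / customer demand in that period."""
    if demand_units <= 0:
        raise ValueError("demand must be positive")
    return available_time / demand_units


# Illustrative figures: 480 working minutes per day, 96 service requests per day.
minutes_per_request = takt_time(available_time=480, demand_units=96)  # 5.0 minutes
```

In this example, one request has to be completed every five minutes on average to keep pace with demand.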
Tally sheet See Check Sheet
Throughput The actual amount of output over a period of time. This is invariably
lower than the capacity as a result of waste.
Value Stream Map (VSM) A technique used to analyze the flow of materials and information
currently required to bring a product or service to a consumer. A
visual representation of all of the process steps (both value-added and
non-value-added) required to transform a customer requirement into
a delivered good or service. A VSM shows the connection between
information flow and product flow, as well as the major process
blocks and barriers to flow. VSMs are used to document current state
conditions as well as design a future state. One of the key objectives
of Value Stream Mapping is to identify non-value adding activities
for elimination. Value Stream Maps, along with the Value Stream
Implementation Plan are strategic tools used to help identify, prioritize
and communicate continuous improvement activities.
Variable, Control This kind of variable is particularly useful in experiments. This variable is
kept constant while others are changed so that they can be investigated
Variable, Dependent This is the output; in effect, this is the problem
Variable, Independent This is an input. In the case of problem-solving, the independent
variable can be seen as something that may or may not contribute to
the problem. The aim is obviously to find the independent variables that
have the greatest effect on the problem.
Visual Management Visual Management is about effective communication and real-time
updates regarding the work.
Visualize An A3 skill used to turn your story into a visual experience using
pictures and graphics to explain what has been investigated and what is
proposed as a solution.
Voice of the Business Concerns the ‘business’ of the IT organization itself; not to be confused
with the fact that the customer of IT is regularly referred to as “the
business”.
Voice of the Customer Gives the IT organization feedback on how the customer, the user of the
IT service, actually experiences the IT service
Voice of the Process Provides information about processes not working correctly.
Voice of the Regulator Those representing regulatory requirements
VSM See Value Stream Map
Work in Progress The number of uncompleted units of work that are still in the process.
This number is directly related to the lead time (Little’s Law)
11 About the author
11.1 Niels Loader
As an advisor to dozens of IT organizations, Niels has extensive knowledge and experience in implementing
IT Service Management, IT Performance Management, Lean IT and DevOps within IT organizations.
In 2010 and 2011, he was one of the initiators of the Lean IT Foundation certification and spent four
years as the Chief Examiner for the APMG Lean IT certification. He is the lead of the Content team of
the Lean IT Association.
The author would like to thank everyone who put their time and effort into improving this document.
The author would especially like to thank Troy DuMoulin for the inspiring discussions to get the right
content into the Lean IT Kaizen syllabus and the first reviews. And many thanks to Gary Case, Rita
Pilon, Hans van den Bent, Marianne Hubregtse and Mike Orzen for their critical reviews, which helped
to improve this publication.
Copyright © 2015 Lean IT Association.
For all your inquiries, please contact info@leanitassociation.com
or visit us at www.leanitassociation.com
