Rapid Equipment Strategy Development Process - Online - Participant Manual - Rev 3.6.1-Cópia
Development
Participant Manual
Version 3.6.1
training@assetivity.com.au | www.assetivity.com.au
DOCUMENT INFORMATION
REVISION HISTORY
© Assetivity 2020. All Rights Reserved. No part may be reproduced by any process without prior written permission from Assetivity Pty Ltd.
This publication is copyright. Other than for the purposes of and subject to the conditions prescribed under the Copyright Act 1968 (as amended), no part of it may in any form or by any means (electronic, mechanical, microcopying, photocopying, recording or otherwise) be reproduced, stored in a retrieval system or transmitted without prior written permission of the copyright owner. Enquiries should be addressed to Assetivity Pty Ltd. Unauthorised reproduction in whole or in part is an infringement of copyright. Assetivity Pty Ltd will actively pursue any breach of its copyright.
Information contained in this training manual has been obtained from AS/NZS Standards, International Standards, data provided by or published by international fastener suppliers, manufacturers and institutions, and by direct calculation and measurement by Assetivity Pty Ltd.
Every care has been taken by the staff of Assetivity Pty Ltd in compilation of the data contained herein and in verification of its accuracy when published, however the content of this training manual is subject to change without notice due to factors outside the control of Assetivity Pty Ltd and this manual should, therefore, be used as a guide only. For example, the products referred to in this publication are continually improved through further research and development and this may lead to information contained in this manual being altered without notice.
This training manual is published and distributed on the basis that the publisher is not responsible for the results of any actions taken by users of information contained in this training manual on the basis of information contained in this manual nor for any error in or omission from this manual. Assetivity Pty Ltd does not accept any responsibility whatsoever for misrepresentation by any person whatsoever of the information contained in this training manual and expressly disclaims all and any liability and responsibility to any person, whether a reader of this training manual or not, in respect of claims, losses or damage or any other matter, either direct or consequential arising out of or in relation to the use and reliance, whether wholly or partially, upon any information contained or products referred to in this manual.
CONTENTS
INTRODUCTION
1.1 General Course Description
1.2 Learning Objectives
BACKGROUND
2.1 Why you may need better Equipment Maintenance Strategies
2.2 Where are the opportunities to improve your current maintenance program?
THE ORIGINS OF FMEA, RCM AND PMO
3.1 FMEA – An Overview
3.2 RCM – An Overview
3.3 PM Optimisation - An Overview
3.4 Similarities and Differences between RCM and PMO
3.5 Which approach should you use?
KEY UNDERLYING CONCEPTS
4.1 What Causes Equipment Failure?
4.2 Who Can Help Prevent Equipment Failure?
4.3 Operating Context
4.4 Failure Patterns
THE RAPID EQUIPMENT STRATEGY DEVELOPMENT PROCESS
STEP 1 – DETERMINE SCOPE OF ANALYSIS
6.1 Objectives
6.2 How Do We Select Systems for Analysis?
6.3 Define the Boundaries around the System
6.4 Other Options for selecting the scope of analysis
Assetivity has developed a series of short training courses to assist individuals and organisations to
enhance their skills in Asset Management, Maintenance Management and Reliability Improvement.
Many of these courses are presented as public courses with training sessions scheduled in major
Australian capital cities at various times during the year. All of these courses can also be presented
in-house for individual organisations and can be customised to meet specific organisational needs.
Many of you may be wondering why you should apply RCM (Reliability Centred Maintenance) or PMO
(Planned or Preventive Maintenance Optimisation) within your organisation. The short answer is that it
will better enable you and your organisation to capitalise on the equipment reliability and plant
capability by optimising the maintenance schedules applied to it. Your maintenance schedules will be
optimised and therefore more efficient, your overall system reliability will improve, unplanned
maintenance will decrease, productivity will improve and much more.
So look, listen, learn and be involved. Your involvement throughout this course is your first step to
being part of a team that will help you and your organisation to obtain the benefits of PMO and RCM.
During this course you will be guided through Assetivity’s Rapid Equipment Strategy Development
process. You will gain an understanding of the key concepts underpinning these steps, and how these
concepts can be applied in practice. These steps are:
9. Gain Approval
In concluding the course, we will discuss what happens after you have applied this 11-step process with
your newfound knowledge.
Achievement of these learning objectives will equip you to function as a team member in an equipment
strategy development activity. We are confident that application of the equipment strategy
development process within your organisation will make a significant difference in the effectiveness of
your PM program and allow you to achieve better business results. These results can include
improvements in:
• Equipment Availability
• Equipment Reliability
• Cost performance
Here is a scenario that may or may not be similar to your situation; it certainly is seen commonly
throughout many organisations in Australia and overseas.
Many organisations will, from time-to-time, embark on a series of studies to improve efficiency or
effectiveness, reduce costs or overheads, increase production or improve some other business
performance measure. Let’s look at an all-too-common example: constructing a new facility.
• After several attempts, and after needing to rework the figures on several occasions, they
determine (surprise, surprise) that the project is justified, and it is approved by the board.
An EPC (Engineer, Procure and Construct) contractor is engaged to perform the detailed
design, and construct and commission the plant.
• This EPC contractor is on a performance-based contract that rewards them for completing
the project on time and on budget. There are no rewards or penalties associated with
ensuring that the designed facility operates efficiently or effectively through its expected
life. Most of the staff at the EPC contractor have spent their entire careers in design
engineering. Few, if any, have ever operated or maintained a plant that they have
designed.
• As a result of this, the designers are not aware of, and often do not take into consideration
the need for:
o Good ergonomics during maintenance activities
o Quick-change components
o Safe access to perform vibration analysis, to perform thickness testing and to take
oil samples
• It soon becomes apparent that the original capital estimate for the project was hopelessly
optimistic. So was the original project schedule.
• In addition, cost escalation means that the project is running well over budget. It is time to
cut costs and reduce scope.
o Reducing the amount of time and money spent on training Operating and
Maintenance personnel
• During commissioning, the capital budget has already been exhausted (this is a very
common occurrence). In addition, the Project Engineers are already starting to think ahead
to their next project. Everyone is keen to get commissioning over and done with as quickly
as possible, and since there was no agreed method for assessing Reliability, Availability,
Operability or Maintainability, Technical Acceptance is largely a superficial exercise. When
the equipment is handed over to Production and Maintenance, there is a long punch list of
outstanding items to be addressed including the development of the PM program.
• After handover...
o The equipment is operated inconsistently, inaccurately, and often outside its design
limitations.
o The PM program is inadequate, and equipment failures occur, but nobody has time
to review and update the PM program, or negotiate changes with OEMs.
o Due to the lack of proper Work Instructions and adequate training, there are
frequent workmanship issues, with regular on-the-job delays, and high levels of
rework.
o Because Production have not participated in, or agreed to, the equipment
maintenance strategies, and because of the short-term pressures to achieve
production targets, they do not release equipment so that vital PM can be
performed, despite the reliability issues.
o Because the equipment is not released for PM, equipment breakdowns, that
otherwise would have been prevented, start to occur.
o Before you know it, you are entering the Equipment Reliability Death Spiral...
• It was developed in haste because the project allowed insufficient time and budget.
• It was over-reliant on OEM guidance as ‘gospel’ when the intended operating environment
was different from what the OEM had assumed.
• It was developed with minimal involvement from operators and maintainers (tradesmen)
who actually knew how the equipment was to be operated and maintained.
• Work instruction documentation was weak and did not provide sufficient detail to conduct
the task required.
Production was probably never able to release equipment or plant for PM to be conducted because:
• Operating targets were developed without allowing for maintenance downtime for PM.
• The production team was not committed to the PM program because they were not
involved in the development process and there was no ‘ownership’ associated with it.
• The plant has undergone several equipment modifications and changes to the operating
parameters and strategies, additional equipment has been added, or equipment has been
removed – all without properly considered changes to the PM program.
• Changes to spare parts suppliers and/or the specifications for spare parts used and installed
on equipment have resulted in more, or less, reliable parts, and therefore more, or less,
required maintenance – but the PM routines have not been adjusted to allow for this.
• Almost certainly, operating and maintenance staff have gained additional experience and
understanding of the way, and frequency, with which the equipment fails, but this
knowledge and experience has not been captured and used to update the PM program.
• Beyond the plant itself, there have been improvements in Condition Monitoring
Technologies since the original PM program was developed that have not been adopted.
• Changes in the external economy have changed demand for your products, or the costs
associated with equipment failures and subsequent change in the business impact of
equipment failures, but the cost justification of PM activities has not been re-evaluated in
light of these changes.
Examples of opportunities to improve your PM program that exist in most organisations are listed
below:
• Ensuring tasks are performed at the optimal frequency i.e. not too often, and not too
infrequently. This is where practical advice from personnel involved in the equipment as
well as work order data/history will be useful.
• Capture and optimise ALL routine equipment care activities – not just those managed within
the CMMS.
• Ensure that Production Strategies and Schedules and Maintenance Strategies and Schedules
are compatible and aligned.
• Identify recurring failures, and either prevent them, through more effective PM, or
eliminate them altogether through redesign.
• Improve the quality and consistency of inspections through improved work instructions and
attempt to have these linked through some documentation management system within
your CMMS.
• Keep a record of the rationale behind each PM task to aid in continuous improvement. If
you continually refine the process you can make the spiral work in reverse. Optimisation is
a continuous improvement project; the first iteration will usually achieve the biggest gains
and future iterations will capitalise on latest data gathered and improved technology as it
becomes available.
The reality is that most organisations over-maintain their equipment, and that most of this over-
maintenance is, at best, a complete waste of time and money, and in other cases may even be
contributing to lower equipment availability and reliability. Applying a structured process, such as
• Tasks that add no value, that do not predict, prevent or mitigate against failure will be
deleted
• Value adding tasks, that predict, prevent or mitigate against failure will be added
As a result of this, typically, significant additional maintenance labour time is made available to
complete planned corrective work – before the work becomes urgent, and before the equipment has
broken down.
FMEA has been around for a very long time. Before any documented format was developed, most
inventors and process experts would try to anticipate what could go wrong with a design or process
before it was developed. The trial-and-error alternative was both costly and time-consuming.
Phase: Information Collection
Step | Outputs
What are the Functions of the Asset? | Equipment Functions
In what ways can it fail? | Failure modes
What are the causes of failure? | Failure causes
What happens when each failure occurs? | Failure Effects
FMEA became very popular and is still in wide use today as a “design for reliability” tool. However,
FMEA doesn’t answer all the questions we have when designing and building a sustainable and reliable
PM plan for our assets. We need more.
The next advance began in 1968, when work by Stanley Nowlan & Howard Heap of United Airlines led
to the addition of a structured decision-making approach to the FMEA process in order to develop
effective maintenance strategies. This approach has become known as “Reliability Centred
Maintenance” or RCM.
The methodology was originally intended to be used for new equipment while in the design phase.
Nowlan and Heap’s research was driven by civil aviation and was originally documented in the aviation
industry Maintenance Steering Group guidelines (MSG1). Their research continued up to 1978,
resulting in a second civil aviation standard (MSG2) and a report to the US Department of Defence that
From the above table you will be able to see that the basic steps involved in an FMEA are incorporated
in the first four steps of the RCM methodology.
PMO was originally intended to be used for existing equipment where comprehensive PM programs are
already in place. It was derived from RCM and developed in response to concerns regarding the time
and effort involved in a “full” RCM analysis. PMO is sometimes referred to as:
• Reverse RCM
• Streamlined RCM
PMO came to prominence in the 1990s and continued to evolve through the early 2000s. The seven key
elements of PMO are:
Notice that PMO does not use FMEA in its Information Collection phase. However, the Decision-Making
process for PMO is identical to that for RCM.
• Both processes use traditional RCM decision logic. The second half of both processes is
essentially identical, using the same decision framework and RCM principles.
• Both approaches focus on the business consequences of equipment failure, as well as the
technical characteristics of those failures.
The Differences between RCM and PMO are mainly in the initial stages of the analysis.
• PMO uses existing PM schedules and tasks (to identify the majority of failure causes),
utilising and building on existing knowledge and only reviewing functions if doubt exists as
to capabilities.
• There can be a tendency when applying RCM to over-analyse and produce too many
functions and too many failure causes.
• Because RCM analysis is time-consuming, often the focus is on performing the analysis,
rather than on implementing the outcomes.
• PMO ideally requires an existing PM program for the equipment (or essentially identical
equipment). If this does not exist, then PMO is very difficult, if not impossible, to perform.
• There is a risk with PMO that existing strategies are kept even though they add no value
because “we have always done it this way”.
• Another potential pitfall of PMO is that some failure modes and causes can get overlooked.
Most sources of failure data span relatively short periods. CMMS get updated about every
3-5 years. Often older history is lost and long-term failures (MTBF >5 years) get missed. If
the initial development of strategies missed something, a PMO review may also miss
something.
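The risk of missing long-interval failures can be illustrated with a quick calculation. The sketch below is an illustration only: it assumes failures occur at a constant rate (an exponential time-to-failure model, which is an assumption and not something stated in this manual), and the function name `prob_no_failure_in_history` is hypothetical. It estimates the chance that a failure mode with a given MTBF leaves no record at all in a history window of a given length.

```python
import math

def prob_no_failure_in_history(mtbf_years: float, history_years: float) -> float:
    """Probability that one item shows zero failures of this mode in the window.

    Assumes a constant failure rate (exponential model). This is an
    illustrative assumption, not a method prescribed by this manual.
    """
    if mtbf_years <= 0 or history_years < 0:
        raise ValueError("mtbf_years must be > 0 and history_years must be >= 0")
    return math.exp(-history_years / mtbf_years)

# Example: 4 years of retained CMMS history, failure mode with a 10-year MTBF.
p = prob_no_failure_in_history(mtbf_years=10.0, history_years=4.0)
print(f"Chance the failure mode never appears in the history: {p:.0%}")
```

Under these assumptions the failure mode has roughly a two-in-three chance of never appearing in the retained history for a single item, which is why team experience should be used alongside CMMS data.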
RCM tends to be more appropriate for brand new equipment, particularly where this equipment is
using new technology that does not exist elsewhere. This reflects RCM’s origins as a baseline,
“blank sheet of paper” approach.
PMO tends to be more appropriate for existing equipment, particularly where an existing PM program
is in place and some operating and maintenance experience has been obtained. The use of the existing
PM programs and experience base lends itself to optimisation.
In this section, we explore the key principles that underlie RCM and PMO-based equipment strategy
development. The most sensible place to begin is with an understanding of what causes failure.
• Overstressing equipment
o Deliberately
o By accident
• Incorrect assembly
o After maintenance
• Incorrect parts
o Incorrect specification
Maintenance is only capable of addressing some of the above failure causes – and is actually a cause in
itself. Consequently, more maintenance is NOT better, and we need to think carefully about when we
should intervene in the running of a machine.
Improving Reliability is a Team Effort
• Maintainers
• Operators
• Control Room attendants
• Condition Monitoring Technicians
• NDT technicians
• Lubricators/Greasers
• Stores people
• Procurement personnel
• Equipment Designers
• Managers
When developing equipment strategies, we typically focus on the routine activities performed by these people.
For the purposes of developing PM tasks, we must look to those who are involved with the day-to-day
care and operation of the equipment as highlighted above. We must, however, also keep in mind that a
PM task may not be the only way – or the best way – to deal with a particular equipment failure.
Identical equipment, in a different operating context, can often require quite different Routine
Maintenance programs. Consider, for example, the exact same make and model of pump that is used in
four different operating contexts:
• The pump pumps water to a cooling tower at a power station. If it fails, there is another,
identical, installed standby pump which can take over its duty.
Why are these PM programs different? It is because the Operating Context in each case is different, and
therefore both the technical characteristics and business consequences of failure are also different.
Consider the PM program that the Original Equipment Manufacturer (OEM) for the pump recommends.
What assumptions does it make regarding the operating context and consequences of failure for the
pump? Is this likely to apply in all cases? Does it apply in your situation? For this (as well as many other
reasons) we need to treat OEM maintenance recommendations with caution.
Historically, equipment was expected to follow a failure distribution similar to that shown below. The
failure rate (roughly equivalent to the Conditional Probability of Failure under specific assumptions)
would be roughly constant for most of the life and then increase rapidly after a certain age. That is, a
few items would suffer random failures throughout the life, but most could be reasonably expected to
operate for a certain period and then “wear out” soon after the end of this period.
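The two regimes in this traditional picture (roughly constant, then rising) can be explored numerically. The following is a minimal sketch, assuming a Weibull life distribution; the distribution, the parameter values and the function name `conditional_failure_probability` are illustrative assumptions and do not come from this manual. It computes the probability that an item fails in the next interval, given that it has survived to its current age.

```python
import math

def conditional_failure_probability(age: float, interval: float,
                                    eta: float, beta: float) -> float:
    """P(fail within the next `interval` | survived to `age`) for a Weibull life.

    eta is the characteristic life and beta the shape parameter; both are
    illustrative assumptions, not values from this manual.
    """
    def reliability(t: float) -> float:
        return math.exp(-((t / eta) ** beta))
    return 1.0 - reliability(age + interval) / reliability(age)

# beta = 1 gives a constant conditional probability (random failures);
# beta > 1 gives a probability that rises with age (wear-out).
for age in (0, 2, 4, 6, 8):
    random_p = conditional_failure_probability(age, 1.0, eta=10.0, beta=1.0)
    wearout_p = conditional_failure_probability(age, 1.0, eta=10.0, beta=4.0)
    print(f"age {age}: random {random_p:.3f}, wear-out {wearout_p:.3f}")
```

With a shape parameter of 1 the conditional probability does not change with age, so age-based replacement gains nothing; with a shape parameter well above 1 it climbs steeply, which is the situation the traditional view assumed for most equipment.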
How would we avoid the majority of failures, therefore? The answer is to replace or overhaul the
equipment at some point shortly before it hits this “wear out” zone. Let’s examine this further by
reference to an example.
Imagine, for a moment, that you are the Chief Engineer for a Civil Aviation Airline in the late 1950s. You
have taken delivery of a brand-new civil aviation aircraft that uses brand new technology (such as jet or
turbine engines!). Your job is to determine the most appropriate Preventive Maintenance program for
this aircraft. You subscribe to the traditionally accepted view of equipment failure. How are you going
to work out at what age each component should be replaced or overhauled?
• Civil Aviation Safety Statistics indicate that there are approximately 60 crashes every million
take-offs.
• The industry is about to introduce the largest passenger aircraft to date – the Boeing 747.
• The aviation regulators (the FAA) come to you and insist that the current situation is not
tolerable – you must reduce the number of crashes due to equipment failure.
It was this imperative that drove the work of Nowlan and Heap referred to above. Their research found
that there were 6 predominant failure patterns that applied to the range of systems/components
commonly found in civil aviation aircraft. These failure patterns are shown below.
• Predicting impending failures and replacing components in a planned manner at a time that
is convenient to operations (Condition-Based Maintenance or Condition Monitoring)
• Modifying the equipment in some way so that the equipment failure does not matter
(Redundancy, Protection)
The results of this paradigm shift were significant improvements in civil aviation safety performance,
accompanied by the bonus of reduced materials costs:
Measure | 1960s | Today
Overall Safety | 60 crashes/million take-offs | <3 crashes/million take-offs
We saw in the previous section that there were a number of key concepts underlying both RCM and
PMO:
• Solutions to equipment failures may take a wide variety of forms and PM is not necessarily
the best.
• Most equipment does not wear out, so traditional maintenance strategies are seldom
applicable.
If we are to obtain the benefits available from RCM and PMO, we must develop a process that captures
the above principles. Additionally, it must include the flexibility to apply either RCM or PMO
approaches as required for specific equipment. Lastly, it must do all of this within a business problem
solving framework, which forces early identification of business drivers for a maintenance review and a
closed loop process to ensure the required performance improvement has been delivered.
Within Assetivity, the process described above is known as Rapid Equipment Strategy Development.
This process is based on the SAE JA1011 RCM standard and, as previously shown, consists of the
following 11 steps:
9. Gain Approval
We will spend most of the rest of the course working through these steps in detail.
The objectives of Step 1 of the Equipment Strategy Development process are to:
• Select systems for Analysis where Equipment Strategy Development will deliver significant
business benefits
• Ensure alignment and agreement regarding the scope of the improvement effort
• Provide a baseline against which the need for future revisions can be assessed
To select the appropriate systems or equipment for analysis, we must first understand the drivers
for the business as a whole, and then align the objectives of the PM review process with those
business drivers.
• Maximise availability/uptime?
• Maximise throughput/performance?
• Criticality
o Safety
o Environment
o Throughput (Bottlenecks/Constraints)
• Costs
At all times you should remain aware of opportunities to generate the greatest possible value for the
effort you invest, particularly for common or similar assets. Take the example of Rail Ore Cars. By
conducting the analysis on one ore car, you may be able to apply the results across many thousands of
other, identical ore cars. So, move the analysis of assets that fall into this category (many like assets)
up the priority order to gain the reliability and cost benefits sooner.
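The effect of like assets on priority can be sketched with a simple ranking. This is a hedged illustration only: the `Candidate` class, the "value per analysis hour" measure and every figure below are hypothetical and are not part of the Rapid Equipment Strategy Development process as documented here.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    annual_benefit_per_asset: float  # estimated benefit of a better strategy (hypothetical)
    like_assets: int                 # identical assets the analysis can be applied to
    analysis_effort_hours: float     # effort needed for one analysis

    def value_per_hour(self) -> float:
        # One analysis is reused across all like assets.
        return self.annual_benefit_per_asset * self.like_assets / self.analysis_effort_hours

def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Order candidates from highest to lowest value per analysis hour."""
    return sorted(candidates, key=lambda c: c.value_per_hour(), reverse=True)

# Hypothetical figures for illustration only.
candidates = [
    Candidate("Crusher", 200_000.0, 1, 80.0),
    Candidate("Rail ore car", 500.0, 2_000, 40.0),
]
for c in rank(candidates):
    print(f"{c.name}: {c.value_per_hour():,.0f} per analysis hour")
```

Even though the benefit per ore car is small in this made-up example, the multiplier from the fleet size moves it ahead of the single high-value asset.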
• Hydraulic systems.
Each of these and many other types of systems will need to be considered before and possibly
throughout your analysis.
Other options exist to assist you in your selection of scope. These can include Equipment Shutdown
tasks only or tasks for one workgroup only. The choice is yours; however, you must ensure you tailor it
to meet your needs and those of the facility whose maintenance program is being optimised.
As we’ve already seen, an understanding of the operating context and subsequent consequences of
failure is critical to establishing an appropriate maintenance program.
The consequences of failure within a specific system can vary widely from a Safety Impact to an
Environmental Impact to the Cost of Downtime. Additionally, there are often buffer stocks before and
after a process that can limit the consequences of shorter duration failures. If your system relies on
these, you must evaluate your plant’s ability to “catch up” after a failure has occurred and been
rectified.
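A rough way to evaluate buffer cover and catch-up ability is sketched below. It is a simplified illustration under stated assumptions: the downstream process draws from the buffer at a steady demand rate during the outage, and the failed equipment must keep meeting that demand while it rebuilds the stock afterwards. The function names and the example figures are hypothetical.

```python
def buffer_covers_outage(buffer_stock: float, demand_rate: float,
                         outage_hours: float) -> bool:
    """True if downstream demand can be met from the buffer for the whole outage."""
    return buffer_stock >= demand_rate * outage_hours

def catch_up_hours(demand_rate: float, max_rate: float, outage_hours: float) -> float:
    """Hours of running at max_rate needed to rebuild the stock drawn down in an outage."""
    if max_rate <= demand_rate:
        return float("inf")  # no spare capacity: the plant can never catch up
    return demand_rate * outage_hours / (max_rate - demand_rate)

# Hypothetical example: 500 t buffer, 100 t/h demand, 120 t/h maximum rate, 4 h outage.
print(buffer_covers_outage(500.0, 100.0, 4.0))  # True: 400 t is drawn from a 500 t buffer
print(catch_up_hours(100.0, 120.0, 4.0))        # 20.0 hours to rebuild the 400 t
```

If the maximum rate is no higher than the demand rate, the buffer can never be rebuilt, and short failures carry the same consequences as if the buffer did not exist once it has been drawn down.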
A system overview document should be used to provide a description of the system being analysed
including a description of system boundaries or scope and the system’s current configuration. This
document should make appropriate references to drawings and manuals, noting current revision
numbers. It should also capture your analysis of the impact of equipment downtime and note any
assumptions made.
The objective of Step 2 is to define and quantify the primary function of the equipment.
Functions are the reason why we own the equipment – we want it to do something, even if it’s just to
look pretty! The primary function is the main reason we own it.
In addition to the primary function, there are generally several secondary functions. There is no need to
consider these, at this point in the process – but we will discuss Secondary Functions in more detail in
Step 4 of the RESD process.
Note that primary functions are identified at the equipment or system level only NOT at lower levels in
the hierarchy (e.g. at a component level).
Each Function should ideally have some form of performance standard associated with it. This
performance standard should be the minimum acceptable level of performance required by operations
in order to achieve their target levels of:
• Throughput.
• Environmental Risk.
• Operating Efficiency/Costs.
In the illustrated system, the performance standard associated with the Primary Function of Pump A is
50 kL/hr.
• Define and quantify the desired level of performance required of the equipment.
• Assess the impact of any gaps between desired and existing equipment capability.
• Inlet pressures
• Back Pressure
• Fluid Temperatures
• Contaminants
• Fluid Density/SG
• Viscosity
• Operating practices
• Production Budgets
• Production targets
• Manufacturer’s Specifications
• Design Intent
• Design Calculations
Note that, for all equipment, there are two functional performance standards – Desired Performance
and Design Capability. As stated above, the relationship between these standards has significant
implications for the reliability of the equipment. There may also be additional complications associated
with Engineering Factors built into the design.
Desired Performance is set by the organisation and is the minimum acceptable level of performance
required by operations in order to achieve target levels of:
• Throughput
• Safety
• Environmental Risk
• Operating Efficiency/Costs
• Reliability
• Availability
Engineering Factors built into the design may allow short-term, infrequent operation beyond the
Design Capability in some cases. This creates the upper limit of capability for the equipment, although
care must be taken that operation in this area does not become frequent or long-term.
As equipment is operated, the Current Capability will vary between full capability (as new, at or above
Design Capability) and no capability (failed, or below Desired Performance) based on the Design
Capability (including Engineering Factors), the Desired Performance and the operating context. The
primary purpose of maintenance is to support the organisation in achieving its goals by maximising the
amount of time the equipment spends above the Desired Performance standard. This is
accomplished through a combination of Preventive, Predictive, Corrective and Breakdown maintenance
activities as discussed previously.
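The idea of "time spent above the Desired Performance standard" can be turned into a simple measure. The sketch below is illustrative only: it assumes capability is sampled at equal time intervals, treats a reading exactly equal to the standard as acceptable, and uses a hypothetical function name and made-up readings.

```python
def fraction_of_time_above_desired(readings: list[float], desired: float) -> float:
    """Share of equally spaced capability readings at or above Desired Performance."""
    if not readings:
        raise ValueError("readings must not be empty")
    return sum(1 for r in readings if r >= desired) / len(readings)

# Hypothetical hourly pump capability readings (kL/hr) against a 50 kL/hr standard.
readings = [60.0, 58.0, 55.0, 49.0, 0.0, 0.0, 60.0, 59.0]
print(fraction_of_time_above_desired(readings, desired=50.0))  # 0.625
```

Note that the 49 kL/hr reading counts against the equipment just as the two breakdown hours do: a pump that is running but cannot meet the Desired Performance has functionally failed.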
We should never forget that the prime reason for doing maintenance is to support the overall objectives
of the organisation. In some cases, exceeding design capability may be a wise business decision if:
• Increased revenues and profits exceed the costs of additional downtime and maintenance
• Initiate a study into the costs and benefits of increasing Design Capability or reducing
Desired Performance
• Continue to develop the best equipment strategies that we can, given current Desired
Performance and Design Capability specifications
The objective of Step 4 is to identify all of the relevant failure modes that could affect the equipment.
A Failure Mode is a Cause of equipment failure. Causes and effects are part of an infinite continuum,
which raises the question “Where do we stop?”
We aim to describe failure modes to a level of detail that permits us to identify appropriate PM
activities. The level will be dictated by what we can achieve at the maintenance ‘coal face’.
There are two approaches to identifying Failure Modes – the RCM approach and the PMO approach:
The RCM Approach The PMO Approach
Secondary Functions
In addition to the primary function, there are generally several secondary functions. These are
additional functional requirements that we have of the equipment given selection for use within a
particular operating context. The maintenance required to ensure continued functionality of these
secondary functions can sometimes be more important (and more costly) than the maintenance
required to assure functionality of the primary function. Secondary functions can be considered under
the following headings:
• Appearance
• Containment
• Contamination
• Environmental
• Economy/Efficiency
• Safety
Protective Functions are functions that are intended to minimise or reduce the consequences in the
event of another failure or abnormal event. Examples of protective functions include:
In the event of a failure or abnormal event, Protective Devices normally act in one of the following ways
(with associated examples):
• Stop equipment.
• Contain hazards.
o e.g. Handrails must be painted yellow; fire pipework must appear red.
Secondary functions relating to containment often require containment for safety or production
efficiency reasons:
Environmental secondary functions are those that are required in order to comply with environmental
standards and regulations:
• e.g. ensure that cooling water discharge at a power station is less than required under
terms of environmental licence
Secondary functions relating to Economy/Efficiency are those that are required in order to meet
operational cost performance targets:
• e.g. fuel consumption for a public bus to be less than a specified limit
• e.g. turbine efficiency to be greater than the minimum level acceptable to operations
Secondary functions relating to Safety are those that are required in order to ensure that the equipment
can be operated safely, within a tolerable level of risk:
• e.g. provide adequate lighting of the road ahead in order to be able to see hazards when
driving at night
• e.g. to stop the vehicle within a specified distance from a specified speed when being
operated under normal conditions
The critical point to note about secondary functions is that, on occasions, these may be more important
than the primary functions in terms of the business consequences associated with their failure, and
failures associated with all functions need to be identified and managed.
The Primary Function of Pump A is to pump at least 50 kL/hr of fluid into Tank 1, therefore:
o Pumps at <50 kL/hr
If the function of a flow meter is to provide an indication of the flow rate with an accuracy of +/-5% then
the Functional Failures of the flow meter are:
o Current PM program
o Failure History
Both the RCM and PMO approach can generate huge numbers of potential failure modes, most of which
In later steps, we will need a concise description of the cause of each failure mode so we can determine
what to do about it. Typically, we use the phrase “due to” to achieve this. For example:
Considering again our pump example, one Functional Failure of Pump A is:
• etc, etc
• There is a dominant failure mode associated with a particular Protective Device which is
likely to be preventable with PM:
• We need to carefully consider alternatives because the task to check the operation of a
specific Protective Device is:
Then...
• CMMS/ERP/EAM system
o Ensure you identify tasks that may be attached at a different location in the
Equipment hierarchy (e.g. Condition Monitoring Tasks, Lubrication Tasks, NDT)
• Operator Checklists and Standard Operating Procedures
For each work order, we should ideally identify the cause of the failure. Points to consider:
• How well does your formally recorded failure history identify failure causes?
Performing Pareto Analysis can assist with identifying the few failures that are deserving of the most
analysis:
[Figure: Pareto chart “Pump Failures”. Vertical axis: No of Work Orders. Horizontal axis: Failure Cause
(Worn impeller, Worn throat bush, Unknown, Seal failure, Bearing failure, Broken shaft, Motor failure).]
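As an illustration only, a Pareto analysis of failure causes can be sketched in a few lines of Python. The cause names are those shown on the chart, but the work order counts below are hypothetical (they are not read from the chart), and the 80% cut-off is an assumed rule of thumb rather than a requirement of the process.

```python
from collections import Counter

# Hypothetical failure causes taken from work orders (illustrative counts only;
# they are NOT the values plotted in the course chart).
work_order_causes = (
    ["Worn impeller"] * 12
    + ["Worn throat bush"] * 8
    + ["Unknown"] * 5
    + ["Seal failure"] * 3
    + ["Bearing failure"] * 2
    + ["Broken shaft"] * 1
    + ["Motor failure"] * 1
)


def pareto(causes):
    """Return (cause, count, cumulative %) rows sorted by descending count."""
    counts = Counter(causes).most_common()
    total = sum(count for _, count in counts)
    rows, running = [], 0
    for cause, count in counts:
        running += count
        rows.append((cause, count, round(100 * running / total, 1)))
    return rows


def vital_few(rows, threshold_pct=80.0):
    """Return the leading causes needed to reach the cumulative threshold."""
    selected = []
    for cause, _, cumulative in rows:
        selected.append(cause)
        if cumulative >= threshold_pct:
            break
    return selected


rows = pareto(work_order_causes)
for cause, count, cumulative in rows:
    print(f"{cause:<18}{count:>4}{cumulative:>8.1f}%")
print("Focus analysis on:", vital_few(rows))
```

With these made-up counts, "Unknown" falls within the causes selected for analysis, which is a reminder to check how well the recorded failure history identifies failure causes.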
• Protective Functions – Functions that are intended to reduce the consequences in the
event of another failure or abnormal event
In the event of a failure or abnormal event, Protective Devices normally act in one of the following
ways:
• Stop equipment
The objective of Step 5 is to define and categorise the effects and consequences of each failure mode.
For us to choose to perform a routine maintenance task, the task must meet two criteria:
• It must have business effectiveness in that it successfully reduces either the likelihood or
the size of the business consequences (and by enough to offset the task cost!).
In order to assess candidate tasks against these criteria, we must first understand the technical
characteristics and business impact of each failure mode.
• Failure Effects
• Failure Consequences
▪ Hidden
▪ Safety
▪ Environmental
▪ Operational
▪ Non-Operational
We should describe failure effects in sufficient detail to permit the assessment of failure consequences.
Often, the biggest impact on operational capability is determined by the amount of downtime
associated with the equipment. This can be difficult to define unless we can answer the following:
• What other assumptions have we made that impact on our estimate of total downtime?
The consequences of failure are divided into four categories as shown below (drawn from the top row of the
SAE JA 1011 RCM decision diagram, which we will use later):
• Those failures where the loss of function, under normal circumstances, would not be
detected
o E.g. conveyor trip wire does not stop conveyor when it is pulled (Hidden)
o E.g. conveyor trips even when nobody pulled trip wire (Evident)
Hidden failures have no noticeable effect until another event occurs. However, they do increase the
risk of more serious consequences if the equipment failure or other event that the protective device
guards against subsequently occurs. In other words, more than one failure (or other independent
event) must occur before we suffer the final consequences of a hidden failure.
Safety and Environmental Consequences
A failure has safety consequences if this failure creates an intolerable risk to personnel safety.
A failure has environmental consequences if this failure creates an intolerable risk of environmental
damage.
We should assess risk as if there were no PM task in place to prevent the occurrence from happening.
This will ensure the overall significance of the risk is understood and PM tasks to mitigate/control it are
given the correct level of importance.
[Figure: risk matrix with a Likelihood axis. Ratings by row:
Low Medium High High
Low Low Medium Medium
Low Low Low Medium]
Who should assess the level of risk?
• What if the impact of the event is likely to occur outside our boundaries?
Corporate guidelines should tell us what level of risk is tolerable and how to deal with risks that are
intolerable or have cross-divisional effects – use your Corporate Risk Matrix if one exists.
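A risk matrix lookup of this kind can be sketched as follows. The ratings are those shown in the matrix above, but the axis labels and their ordering are assumptions made for illustration (rows taken as most to least likely, columns as lowest to highest impact). Your Corporate Risk Matrix, where one exists, takes precedence over this sketch.

```python
# Ratings as they appear in the course matrix, read row by row.
RISK_MATRIX = [
    ["Low", "Medium", "High", "High"],
    ["Low", "Low", "Medium", "Medium"],
    ["Low", "Low", "Low", "Medium"],
]

# Hypothetical axis labels: rows are ASSUMED to run from most to least likely,
# columns from lowest to highest impact.
LIKELIHOODS = ["Likely", "Possible", "Unlikely"]
IMPACTS = ["Minor", "Moderate", "Major", "Severe"]


def risk_rating(likelihood, impact):
    """Look up the risk rating for a likelihood/impact pair."""
    return RISK_MATRIX[LIKELIHOODS.index(likelihood)][IMPACTS.index(impact)]


# Assess the risk as if no PM task were in place, then with the task applied.
# The PM task is assumed to change the likelihood only, not the impact.
before = risk_rating("Likely", "Major")
after = risk_rating("Unlikely", "Major")
print(f"Risk without PM: {before}; risk with PM: {after}")
```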
Operational Consequences
Operational consequences include:
• Lost Production
What units of measure can we use to quantify the impact of these?
Non-Operational Consequences
Failures have non-operational consequences if the only impact of the failure is the direct cost of the
repair. This can usually be expressed in dollar terms, using business rules.
The objective of this step is to select the most appropriate routine maintenance task to deal with each
failure cause.
We use a structured approach to task selection, based on the SAE JA 1011 RCM standard:
Following the above structured approach will provide confidence that we have identified the most
appropriate maintenance task to address each failure mode without undertaking unnecessary analysis
steps. In general, the preferred approach is “one failure mode, one PM task” to ensure clarity of
responsibility and prevent duplication. Note, however, that in some cases – particularly where a failure
mode has safety or environmental consequences – multiple tasks may be required to address the failure
mode satisfactorily.
• Is the task APPLICABLE? From a technical perspective, does the task successfully predict,
prevent or detect the failure cause?
• Is the task EFFECTIVE? From a business perspective, does the task successfully address
the business impact of the failure cause?
In order to determine whether a task is effective, we must assess the business impact of a failure mode
both:
The task is effective if, through prediction, prevention or detection, it successfully addresses
the consequences of failure.
The way to assess this is by using an appropriate risk matrix. The impact of the multiple failure is
generally easy to assess – it is the impact of the original failure if there were no protective device in
place. In addition, PM tasks generally change the likelihood but not the impact of failures, so the
If we take our pump system and assume perfect sensing, perfect switching and independent failures, we
can calculate the likelihood of the multiple failure in a given period through the following formula:
$\mathrm{Pr}_X = 1 / \mathrm{MTBF}_X$
where $\mathrm{Pr}_X$ = the probability of failure of device $X$ in one unit of the time in which MTBF is expressed.
This allows us to calculate an approximation of the probability of the multiple failure. As a simple
example, the probability of failure of our pumping system in a given year with no failure finding is
calculated as follows:
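As a minimal sketch of this kind of calculation, the Python below applies Pr = 1 / MTBF to each device and multiplies the results. It assumes that the multiple failure requires both devices to fail in the same year and that the failures are independent. The device names and MTBF figures are hypothetical and are not taken from the course example.

```python
def annual_failure_probability(mtbf_years):
    """Pr_X = 1 / MTBF_X, with MTBF expressed in years."""
    if mtbf_years < 1:
        raise ValueError("approximation requires an MTBF of at least one period")
    return 1.0 / mtbf_years


def multiple_failure_probability(*mtbfs_years):
    """Product of the per-year failure probabilities of independent devices."""
    probability = 1.0
    for mtbf in mtbfs_years:
        probability *= annual_failure_probability(mtbf)
    return probability


# Hypothetical figures: duty pump MTBF of 5 years, standby pump MTBF of 10 years.
pr_multiple = multiple_failure_probability(5, 10)
print(f"Approximate probability of the multiple failure in a year: {pr_multiple:.3f}")
```

With these assumed figures the result is about 0.02, or roughly a 1 in 50 chance in any given year.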
For a given PM task, the Effectiveness is assessed by comparing the risk before and after application of
the task. The acceptability criteria will depend on whether the hidden failure exposes
safety/environmental or operational/non-operational consequences – see below for details.
Note that, in general, it is very difficult to obtain sufficiently accurate data to estimate MTBF for the
current PM program. It is even more difficult to estimate MTBF for an alternative PM program and
assessment of business effectiveness therefore often relies on “expert judgement”. This is explored
further below as part of the assessment of Applicability for Failure Finding tasks.
Therefore, a PM task to address Safety and Environmental Consequences is Effective if:
It reduces the risk associated with the failure at a cost that is “Reasonably Practicable” given
the extent of the reduction
The risk is assessed using a risk matrix, as with hidden failures. Again, applying a PM task changes the
likelihood but the impact of the failure remains unchanged. If the risk after applying PM is reduced and
the cost is “reasonably practicable” for the level of reduction, then the PM task is effective.
• The standard for safety/environmental risks is usually ALARP (As Low As Reasonably
Practicable), so just reducing the risk to a tolerable level may not be sufficient. Multiple tasks
may be required to achieve ALARP, with each assessed as part of an overall package.
Overall, the package of PM tasks selected MUST reduce the risk to a tolerable level. If this cannot be
achieved, the equipment must be redesigned, or acceptance of the risk must be elevated in accordance
with the organisation’s risk management processes.
Operational Consequences
Operational consequences can include:
• Lost Production
You will note that these are all financial in nature. A PM task to address Operational Consequences is
therefore Effective if:
The cost of doing the task is less than the combined operational and repair costs
associated with the failure
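One way to apply this test is to put both sides on the same time basis. The sketch below compares the yearly cost of the PM task with the yearly failure cost it is expected to avoid. Annualising the comparison is an assumption made here for illustration, and the function name and all figures are hypothetical.

```python
def is_effective_for_operational_consequences(
    annual_task_cost,
    cost_per_failure,
    failures_per_year_without_task,
    failures_per_year_with_task,
):
    """Compare the yearly PM cost with the yearly failure cost it avoids.

    cost_per_failure is the operational cost (e.g. lost production) plus the
    repair cost of one failure.
    """
    avoided_failures = failures_per_year_without_task - failures_per_year_with_task
    annual_saving = avoided_failures * cost_per_failure
    return annual_task_cost < annual_saving


# Hypothetical figures: each failure costs $40,000 in lost production plus repair.
effective = is_effective_for_operational_consequences(
    annual_task_cost=6_000,
    cost_per_failure=40_000,
    failures_per_year_without_task=0.5,
    failures_per_year_with_task=0.1,
)
print("Task is effective:", effective)
```

As noted earlier, failure rates with and without a given PM task are rarely known accurately, so in practice these inputs often come from expert judgement.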
Non-Operational Consequences
Failures will have non-operational consequences if the only impact of the failure is the direct cost of the
repair. This impact is also financial in nature and therefore a PM task to address Non-Operational
Consequences is Effective if:
The cost of doing the task is less than the cost of repairing the consequential damage
associated with the failure
Note that we may also need to consider the cost of consequential damage when assessing the
In order to determine whether a task is applicable, we must consider the technical nature of the task,
and the technical characteristics of the failure which we are trying to predict,
prevent, or detect.
• On condition
o Scheduled Replacement
Historically, the first PM tasks were fixed interval tasks, so let us consider them first.
A routine task which “restores the capability of an item at or before a specified interval (age limit),
regardless of its condition at the time, to a level that provides a tolerable probability of survival to the
end of another specified interval” (SAE JA 1011, emphasis added)
A routine task which “entails discarding an item at or before a specified interval (age limit), regardless
of its condition at the time” and replacing it with a new component. (SAE JA 1011, emphasis added)
• The conditional probability of failure starts to rapidly increase after a specified age
This corresponds to failure patterns A, B and C, which relate to items that are likely to be:
o Erosion (wear)
o Fatigue
The frequency of a Scheduled Restoration or Scheduled Replacement Task is determined by the “life” of
the component, or the age at which the conditional probability of failure starts to increase rapidly.
Note that task intervals are not necessarily based just on calendar time. Other units that may be used
include machine hours run, kilometres travelled, cycles completed, etc.
A potential failure is a clearly identifiable condition that indicates a failure mode is about to occur, or is
in the process of occurring. We can generate a curve showing the relationship between the age of an
item and its resistance to failure, which is known as a P-F Curve.
For each potential failure and associated measuring/monitoring technique, we can define a unique
interval between the first possible point of detection and the functional failure. This is known as the P-F
Interval.
The interval between Condition Monitoring tasks must be less than the shortest likely P-F Interval. It must
also be sufficiently shorter than the P-F Interval to permit effective action to avoid the consequences of
the failure. The frequency of a Condition Monitoring Task is therefore determined by:
• The amount of time required to permit effective action to avoid the consequences of the
failure.
• The task can be done at a frequency which permits effective action to avoid the
consequences of the failure
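The interval rule described above can be sketched as a simple upper bound: if the condition is checked every I time units, then in the worst case a potential failure is found I after it first becomes detectable, leaving the P-F Interval minus I in which to act. The function name and the figures below are hypothetical.

```python
def max_inspection_interval(pf_intervals, response_time):
    """Longest inspection interval that still leaves time to act.

    pf_intervals  -- plausible P-F Intervals for the failure mode (same units)
    response_time -- time needed to plan and carry out effective action
    """
    shortest_pf = min(pf_intervals)
    interval = shortest_pf - response_time
    if interval <= 0:
        raise ValueError(
            "P-F Interval is too short to permit effective action at any interval"
        )
    return interval


# Hypothetical example: P-F Intervals of 8 to 12 weeks, 3 weeks needed to respond.
interval_weeks = max_inspection_interval([8, 10, 12], response_time=3)
print(f"Inspect at least every {interval_weeks} weeks")
```

This gives an upper limit only. A shorter interval may be chosen where a second confirming inspection is wanted before action is taken.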
o Look, Listen, Touch (caution – burn danger), Taste (be even more cautious -
poisoning), Smell (caustic burns)
o Vibration Analysis
o Thermography
o Oil Analysis
o Etc.
• Can be applied to any failure cause, regardless of failure pattern, as long as there is some
warning of failure
• Often can be done with the equipment on-line
• Often can be done without the need for intrusive maintenance activity
• Can only be applied where potential failure conditions can be identified – some items fail
without warning
• A smoke detector
• A Rupture Disc
• An Electrical Fuse
• An Emergency Flare
Ideally, we want to test the entire system or device. But we must also consider the consequences if the
protective device does not operate as expected when we test it – we don’t want to cause the exact
failure that the protective device is there to protect us against! And we should also recognise that some
protective devices cannot be tested at all, because testing them actually destroys them.
• Performing the test does not significantly increase the risk of inducing the multiple failure
we are seeking to avoid
We need to reduce the risk of the multiple failure to tolerable levels. Failure Finding will not change the
impact of the multiple failure, but it can reduce its likelihood.
The longer the Failure Finding Interval, the greater the risk the multiple failure will occur.
We can calculate the level of risk using some statistical approximations and the following formula:
$\mathrm{Pr}_{TIVE} = 1 / \mathrm{MTBF}_{TIVE}$,
where $\mathrm{MTBF}_{TED}$ = the Mean Time Between Failures of the protected equipment
Then:
These formulae only apply for single protective devices. We can also calculate the level of risk for more
complex systems, but these calculations are outside the scope of this course. These situations include:
• Complex multi-level systems (e.g. fire protection systems with smoke alarms, sprinkler
systems, alarms and a communication system to the fire brigade)
• Where there is a risk that doing a failure finding task may induce the multiple failure
In general, the data is usually insufficient or of inadequate quality to support direct calculation of an FFI
and it is necessary to resort to expert judgement. In such cases, the frequency of a Failure Finding Task
is determined by the following principles:
• The higher the impact of the multiple failure the more frequent the task, and
• The higher the reliability of the protected equipment the less frequent the task, and
• The higher the reliability of the protective device the less frequent the task.
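The direction of these three principles can be illustrated with a commonly used approximation from the RCM literature for a single protective device. It assumes that a device tested every FFI time units is, on average, in a failed state for FFI / (2 × MTBF of the protective device) of the time, and that failures are independent. This approximation is an assumption made here for illustration, not necessarily the formula used elsewhere in this manual, and all figures are hypothetical.

```python
def multiple_failure_rate(ffi, mtbf_protective, mtbf_protected):
    """Approximate multiple failures per unit time for one protective device.

    Assumes the protective device is unavailable, on average, for
    ffi / (2 * mtbf_protective) of the time (reasonable only when ffi is much
    shorter than mtbf_protective) and that failures are independent.
    """
    unavailability = ffi / (2.0 * mtbf_protective)
    demand_rate = 1.0 / mtbf_protected
    return demand_rate * unavailability


def failure_finding_interval(tolerable_rate, mtbf_protective, mtbf_protected):
    """Invert the approximation above to get the longest acceptable FFI."""
    return 2.0 * mtbf_protective * mtbf_protected * tolerable_rate


# Hypothetical figures, all in years: protective device MTBF 10, protected
# equipment MTBF 5, tolerable multiple failure rate of 1 in 1,000 years.
ffi_years = failure_finding_interval(0.001, mtbf_protective=10, mtbf_protected=5)
print(f"Failure Finding Interval: {ffi_years:.2f} years")
```

Under this approximation a lower tolerable rate (higher impact) shortens the interval, while a more reliable protected equipment item or protective device lengthens it, which matches the three principles listed above.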
• How to tell whether the item passes or fails the inspection or test – e.g.:
• What action should be taken if the item fails the inspection or test – e.g.:
We will discuss the role of work instructions further in the next section when we talk about human
error.
The objective of this step is to identify any other actions that should be recommended from the PM
review and review of failure history. These could relate to one-time changes that:
• Reduce the risks of failures that are not preventable, predictable or detectable, but where
the current level of risk is not tolerable
• etc...
• All changes require full consideration through a formal “management of change” process in
order to ensure that all the implications of a change are considered.
• The group should not spend large quantities of time identifying one-time changes – only
identify those that are likely to give a rapid return for effort.
• The output of this step is recommendations for further, more detailed investigation through
the appropriate channels.
• Be extremely cautious when recommending equipment redesign. Typically, less than 10%
of outcomes from root cause analysis relate to genuine design issues. Changing the design
may add additional failure modes whose management may create additional problems
greater than those that currently exist. Living with current limitations is sometimes the best
choice.
Possible causes:
• Human Error
• System Error
• Design Error
• Parts Error
Types of Causes:
• Recognition failures
• Branching errors
• Overshoot errors
• Rule-Based
• Skill-based
• Knowledge-based
Actions can be taken in several areas in order to reduce the risk of maintenance error
• Person Measures
o Awareness training
o Control distractions
• Team Measures
• Workplace/Task Measures
o Ensure personnel perform tasks only when appropriately qualified and skilled
• Organisational Measures
o Put in place proactive processes for assessing the risk of future maintenance errors
The objective of this step is to take all of the individual tasks that have been developed in the previous
steps, and group them for execution in such a way that:
• Integrates planned downtime for operational reasons with planned downtime for
maintenance
• Maximises the overall reliability and uptime of the plant (planned and unplanned)
• Group and Sort Tasks by Equipment Shutdown Status, Trade Type, Task Frequency and
Equipment
• Level Workloads
• Scheduled operating breaks (e.g. aisle cleaning, copper boil descale etc)
• Trade Type
• Task Frequency
• Equipment
For a given:
• Shutdown Status
• Task Frequency
o Geographically in the same area,
• The increment of time (or other unit) between the performance of individual schedules
Schedule Cycle
What do we do if there is one task, or a small number of tasks that are to be performed at a non-whole
multiple of the Schedule Interval?
• E.g. 7 weeks
• Reducing the task interval to the next schedule interval (e.g. from 7 weeks to 6 weeks)
• Increasing the task interval to the next schedule interval (e.g. from 7 weeks to 8 weeks).
This should only be done after considering the risks associated with extending the interval.
o On its own
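The first two options can be found mechanically by taking the nearest whole multiples of the Schedule Interval on either side of the task interval. The sketch below assumes a 2-week Schedule Interval, which is consistent with the 6 and 8 week examples above but is not stated explicitly. The function name is hypothetical.

```python
def schedule_interval_options(task_interval, schedule_interval):
    """Return the nearest schedule multiples below and above a task interval."""
    if task_interval < schedule_interval:
        raise ValueError("task interval is shorter than the schedule interval")
    lower = (task_interval // schedule_interval) * schedule_interval
    if lower == task_interval:
        # Already a whole multiple: no adjustment is needed.
        return task_interval, task_interval
    return lower, lower + schedule_interval


# A 7-week task against an assumed 2-week Schedule Interval.
reduced, extended = schedule_interval_options(task_interval=7, schedule_interval=2)
print(f"Reduce to {reduced} weeks or extend to {extended} weeks")
```

The calculation only identifies the candidates. Extending the interval still requires the risk to be considered, as noted above.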
• Using an alternative inspection technique or failure-finding technique which may permit the
inspection or test to be performed with the equipment running
• Selecting a Fixed Interval task, with a longer interval, to replace a Condition Monitoring Task
• Modifying components which are subject to Fixed Interval tasks so that component life is
increased
• Modifying components so that the need for the PM task is eliminated entirely
• Modifying equipment and/or the process, so that the process can run with the equipment
shutdown
These actions require considerable time and should be noted and then discussed and actioned later
outside this forum.
• Stockpile/Work in Process limitations that constrain the length of shutdowns before total
plant throughput is impacted
Where these constraints cannot be resolved, they must be taken into account when packaging tasks
Identify and resolve contradictory tasks in the same schedule (e.g. “check oil level and top up if
required” every 250 hrs, and “drain and replace oil” every 1000 hrs)
Sequence the tasks in an order that permits them to be done with maximum efficiency.
Identify and resolve potential conflicts between work packages (e.g. welding directly overhead another
PM activity).
o Number of people
o Trade types
Levelling Workloads
Sometimes it is useful to adjust PM schedules to avoid large variations in PM workload and downtime
across the Schedule Cycle
[Figure: PM workload across the Schedule Cycle. Vertical axis: Workload (Labour Hours), 0 to 100.
Horizontal axis: 250, 500, 750, 1000, 1250, 1500, 1750 and 2000 hrs.]
This can be done by performing some of the longer interval work as part of shorter interval activities
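A workload profile of this kind can be sketched as follows. The task list, labour hours and the way the longest-interval work is split are all hypothetical. The optional offset is an assumed way of representing long-interval work that has been moved into an earlier, shorter-interval service.

```python
def workload_profile(tasks, schedule_interval, schedule_cycle):
    """Labour hours falling due at each schedule point in one Schedule Cycle.

    tasks -- iterable of (task_interval, labour_hours, offset) tuples. A task
             falls due when the schedule point matches its offset within its
             own interval (offset 0 means every whole multiple of the interval).
    """
    profile = {}
    for point in range(schedule_interval, schedule_cycle + 1, schedule_interval):
        profile[point] = sum(
            hours
            for interval, hours, offset in tasks
            if point % interval == offset % interval
        )
    return profile


# Hypothetical tasks: (interval in operating hours, labour hours, offset).
original = [(250, 10, 0), (500, 15, 0), (1000, 30, 0), (2000, 40, 0)]
# Same total work, but the 2000-hr job is split into two halves that are
# performed with the 750-hr and 1750-hr services instead.
levelled = [(250, 10, 0), (500, 15, 0), (1000, 30, 0), (2000, 20, 750), (2000, 20, 1750)]

before = workload_profile(original, schedule_interval=250, schedule_cycle=2000)
after = workload_profile(levelled, schedule_interval=250, schedule_cycle=2000)
print("Peak before:", max(before.values()), "Peak after:", max(after.values()))
```

With these made-up figures the total labour hours over the cycle are unchanged, but the peak workload falls from 95 to 55 hours.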
• To identify and obtain the necessary funding and resources for implementation of the
recommendations
Recommendations for:
• Improved/additional training
• Modifications to equipment
• Etc.
Who needs to authorise their implementation?
Who has the authority to approve all of the proposed implementation actions?
Who has the financial authority to approve all of the necessary expenditure and/or release the
resources required to perform the work?
We suggest that this be the “asset owner” – someone with overall responsibility for performance of the
assets within the scope of the analysis.
In order to ensure that the review and approval process is performed with high quality:
o Equipment/System knowledge?
How much time does the Asset Owner (or his delegate) need in order to review the recommendations
from the process properly?
• The Review Team should give a brief presentation on the key outputs and
recommendations from the process – highlighting the most significant/contentious changes
• Set a time for a final joint Q&A session between the Asset Owner and the team to resolve
outstanding issues
• To perform all of the activities required in order to implement the outcomes of the analysis
• Documentation changes:
o CMMS records
o Operating/Maintenance Procedures
o Work Instructions
o Bills of Materials
• People issues
o Training
o Communication
• Priorities need to be balanced against the other activities that key personnel will be
involved in
• Who is responsible for ensuring that all the identified activities are completed on time?
• What is the process for monitoring and reporting progress and taking corrective actions as
required?
One of the most critical implementation steps is preparing Maintenance Work Instructions and
Operator Inspection Sheets. In particular, Work Instructions which involve inspections or tests should
include:
• Precise method to be used to perform the inspection, including tools required (if applicable)
e.g.
o Visually inspect bolts for corrosion
o Check bearing temperature using Infrared gun
• What action must be taken if the pass criteria are not met e.g.
• Work instructions are written with the person who is going to read the instruction in mind
• Group complex work instructions into phases, with each phase consisting of many, related
tasks
• Focus on the key risks that may prevent the job from being performed safely and to the
required quality standard
• Are written in the first person, not the third, and use the active voice, not the passive
• Are written in both upper and lower case, not upper case only
• Incorporate appropriate, conspicuous reminders in order to ensure that critical steps are
not omitted
• To monitor the extent to which the recommended changes have been successfully
implemented
• To monitor the effectiveness of the changes made in achieving the business improvement
objectives for the PM review process
• To identify and address any issues that may arise from this monitoring process
• Maximise availability/uptime?
• Maximise throughput/performance?
o Maintenance Costs/Tonne
• For each KPI, we need to establish a baseline performance level against which to compare
performance – e.g. last 12 months average
• The extent to which these are being delivered in full and on time?
• Corrective actions resulting from PM and operator inspections are being completed in an
appropriate time frame?
• Identify the possible strategic direction and next steps, once optimal PM programs are in
place
• Focus on eliminating those things that cause failures and on extending the life of critical
components
• Perform Pareto Analysis to identify those items of equipment or components that are
having the greatest impact on your business in terms of:
o Downtime
o Throughput
o Costs
o Safety
o Etc
o Equipment is operated
o Equipment is maintained
• Trial or select components and equipment that maximise reliability and minimise total
lifecycle costs
Perth Office
PO Box 31
Burswood WA 6100
Phone: +61 8 9474 4044
Melbourne Office
Phone: +61 3 8676 0774
Brisbane Office
PO Box 10856, Brisbane
Queensland 4000
Email: assetivity@assetivity.com.au
Web: www.assetivity.com.au