
International Journal of Production Research, 2013

Vol. 51, No. 18, 5404–5412, http://dx.doi.org/10.1080/00207543.2013.775521

A developed autonomous preventive maintenance programme using RCA and FMEA


Chee-Cheng Chen*

Department of Business Administration, National Pingtung University of Science and Technology, Pingtung, Taiwan
(Received 22 December 2011; final version received 29 January 2013)

Total productive maintenance (TPM) was developed in Japan in 1971 and has since been phased into many manufacturing
firms to promote productivity and competitiveness. Autonomous preventive maintenance (APM), a fundamental pillar of
TPM, comprises a series of systematic activities carried out by first-line direct labour, and its design must take
technical cost, human resources and management issues into account. Failure modes and effects analysis (FMEA) and
root-cause analysis (RCA) are among the most popular failure analysis methods and are widely adopted across
industries. They are often used to examine potential problems in the design and manufacturing phases, uncovering
possible failure causes before the product design and manufacturing process are finalised. This study integrates the
RCA and FMEA techniques to establish an APM system that meets a company’s goal of reducing manufacturing costs and
promoting employee and equipment productivity. Its major contribution is to construct potential equipment failure
modes and their risk priority numbers through the integration of RCA and FMEA, and to transform them into a
selection of APM items and their maintenance frequencies. A strategy for upgrading employees’ technical capability
through effective training is also developed. S Company, a key manufacturer of semiconductor material, serves as a
case study to verify the model’s applicability and suitability.
Keywords: failure modes and effects analysis; root-cause analysis; total productive maintenance; autonomous preventive
maintenance

1. Introduction
The market environment is characterised by rapid technological advancement fuelled by intense economic competition.
The transition to global competition has forced companies to improve their competitiveness by enhancing their
manufacturing performance. One major improvement programme in the field of production and operations management is
total productive maintenance (TPM) (Ahuja and Khamba 2008). The main objective of TPM is to achieve a reliable
manufacturing system by maximising the overall equipment effectiveness, so that plant and equipment productivity is
increased. The most commonly cited TPM practices in the literature are autonomous preventive maintenance (APM),
preventive maintenance (PM), equipment technology emphasis, and cross-functional approaches to training and employee
involvement (Konecny and Thun 2011). In the approach used in most industries, APM and PM actions are performed on an
item at a scheduled time regardless of its actual condition. Because the schedule is often drawn up from the
supplier’s recommendations and adjusted using only staff knowledge of the actual use conditions or past experience,
it is seldom an optimal procedure (Eti, Ogaji, and Probert 2006). Failure modes and effects analysis (FMEA) is a
systematic procedure for analysing a system to identify its potential failure modes and their causes and effects on
system performance. The analysis is best performed early in the development cycle, when removing or mitigating a
failure mode is most cost-effective (Cassanelli et al. 2006; Vinodh and Santhosh 2012). In order to identify where
faults are located in a system and how serious their consequences are, a risk priority analysis should be a
prerequisite to any operation. The FMEA-aided APM programme and its operating system described in this article were
developed through our work with an international electronics firm (S Company). In addition to describing the system,
we review the process used to develop the approach and draw conclusions from its application.

2. Literature review
The literature on fault tree analysis (FTA), FMEA, TPM and APM is varied. In this section, literature that closely relates
to the topic of interest is reviewed, leading to the development of the research focus.

*Email: carl2004@mail.npust.edu.tw
© 2013 Taylor & Francis

2.1 FTA and FMEA


FTA is a flexible technique that is equally applicable to quantitative and qualitative analysis and is easy to use and
understand. It is a deductive technique: it works from the top down, assuming that the system has failed and then
trying to work out why (Ferdous et al. 2007). This is done by working backwards to determine which combinations of
events might have caused the failure. The system failure becomes the top event in the fault tree and the individual
component failures form the basic events, all combined through a network of logic gates
(Mentes and Helvacioglu 2011).
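
The gate-based structure described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the class names, the example tree and all probabilities are hypothetical, and the basic events are assumed to be statistically independent, which real equipment may not satisfy.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class BasicEvent:
    """A component-level failure with an assumed (hypothetical) probability."""
    name: str
    probability: float


@dataclass
class Gate:
    """A logic gate combining lower-level events; kind is 'AND' or 'OR'."""
    kind: str
    inputs: List[Union["Gate", BasicEvent]]


def top_event_probability(node: Union[Gate, BasicEvent]) -> float:
    """Evaluate the tree from the basic events up, assuming independence."""
    if isinstance(node, BasicEvent):
        return node.probability
    child_probs = [top_event_probability(child) for child in node.inputs]
    if node.kind == "AND":
        # All inputs must fail: multiply the probabilities.
        result = 1.0
        for p in child_probs:
            result *= p
        return result
    if node.kind == "OR":
        # At least one input fails: 1 minus the chance that none fails.
        none_fail = 1.0
        for p in child_probs:
            none_fail *= 1.0 - p
        return 1.0 - none_fail
    raise ValueError(f"unknown gate kind: {node.kind!r}")


# Hypothetical tree: the system fails if the motor fails, OR if both the
# sensor and its backup alarm fail.
tree = Gate("OR", [
    BasicEvent("motor fails", 0.10),
    Gate("AND", [BasicEvent("sensor fails", 0.20),
                 BasicEvent("alarm fails", 0.50)]),
])
print(f"P(top event) = {top_event_probability(tree):.4f}")
```

The sketch makes the data requirement of quantitative FTA visible: every basic event needs a probability before the top event can be evaluated.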
FMEA, by contrast, is an inductive technique that works from the bottom up: it assumes that a component failure has
occurred and then assesses the effects of that initial event on the rest of the system. The end result is a table of
failures and their effects on the system, which gives the analyst a quick overview of the possible faults and of the
effects of each failure mode. These effects are usually evaluated against a number of criteria, such as severity (S),
occurrence (O) and detectability (D), which are often combined into an overall estimate of risk. Both FTA and FMEA
are useful for identifying potential faults in a system, and the resulting information can be used to correct or
prevent those faults (Papadopoulos et al. 2011).
FMEA is a widely used reliability tool for identifying potential failures before they occur, with the intention of
minimising the associated risk. Because its basic function is to find, prioritise and minimise failures, it has been
widely used in various manufacturing areas to help resolve reliability-related problems (Geum, Cho, and Park 2011).
It is a decision-making tool for prioritising corrective actions that enhance product/system performance by
eliminating or reducing the failure rate. The major benefits of implementing FMEA are improved product/process
quality and reliability, and hence greater assurance of customer satisfaction (Vinodh and Santhosh 2012).

2.2 TPM and autonomous preventive maintenance


TPM can be regarded as an improvement programme that establishes a comprehensive productive maintenance system
throughout the entire life of the equipment, encompassing all equipment-related fields and involving all employees,
to promote productive maintenance through motivation or voluntary team-based activities (Konecny and Thun 2011). TPM
calls for operator care and involvement in maintaining equipment, and particularly its functional capability; some
call this autonomous maintenance. Here, operators take care of the equipment: they tighten, lubricate and clean it,
and work to avoid defects and failure modes or, where appropriate, to monitor them so that their consequences can be
minimised (Nakajima 1988). TPM emphasises improving maintenance efficiency and effectiveness; that is, defects and
failure modes must be understood, and proactive work is needed to avoid them and/or detect them early enough to
minimise their consequences. In order to identify where faults are located in a system and how serious their
consequences are, a risk analysis should be a prerequisite to any major operation (Moghaddam and Usher 2011).
The overall equipment effectiveness (OEE) is the core metric for measuring the success of a TPM implementation
programme. It is designed to determine the reliability of assets and their capability to deliver the desired
performance, and is the product of three sub-measures: ‘quality rate’, ‘availability’ and ‘performance efficiency’.
OEE is thus obtained as follows:

\[ \mathrm{OEE} = \text{quality rate}\,(Q) \times \text{availability}\,(A) \times \text{performance efficiency}\,(P) \tag{1} \]

The quality rate is the amount of product that is right the first time – without adjustment, recycles, and so on. To
achieve satisfactory performance in this regard, it is necessary to achieve a very high first-time-right rate. Quality rate is
calculated as follows:
\[ \text{Quality rate} = \frac{\text{Good production}}{\text{Good production} + \text{Failed QC}} \]

Availability is the number of hours the plant actually operates divided by the maximum number of hours available in a
month. For example, if a factory plans to operate for 22 h per day for 25 days per month, the maximum available
number of hours is 22 × 25 = 550, and availability is calculated as follows:
\[ \text{Availability} = \frac{550 - (\text{number of hours of total shutdown})}{550} \]

The performance efficiency is the product of the operating speed rate and the net operating rate. The operating speed
rate captures the discrepancy between the ideal speed of the equipment and its actual operating speed. The net
operating rate measures whether a stable processing speed is maintained over a given period of time (for example, a
production shift of 12 h), rather than whether the actual speed is faster or slower than the design standard speed.
It accounts for losses resulting from minor recorded stoppages, as well as those that go unrecorded on daily logs,
such as small problems and adjustment losses.

\[ \text{Performance efficiency} = \text{Net operating rate} \times \text{Operating speed rate} \]

or

\[ \text{Performance efficiency} = \frac{\text{theoretical cycle time} \times \text{processed amount}}{\text{operating time}} \]
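
As an illustration of how the three sub-measures combine in Equation (1), the following Python sketch computes each one from raw production figures. The function names and the monthly figures are hypothetical and chosen only for the example; only the 550 planned hours come from the text above.

```python
def quality_rate(good_units: float, failed_qc_units: float) -> float:
    """Good production divided by total production (good + failed QC)."""
    total = good_units + failed_qc_units
    if total <= 0:
        raise ValueError("total production must be positive")
    return good_units / total


def availability(planned_hours: float, shutdown_hours: float) -> float:
    """(Planned hours - total shutdown hours) divided by planned hours."""
    if planned_hours <= 0:
        raise ValueError("planned hours must be positive")
    return (planned_hours - shutdown_hours) / planned_hours


def performance_efficiency(theoretical_cycle_time: float,
                           processed_amount: float,
                           operating_time: float) -> float:
    """Theoretical cycle time x processed amount, divided by operating time."""
    if operating_time <= 0:
        raise ValueError("operating time must be positive")
    return theoretical_cycle_time * processed_amount / operating_time


def oee(q: float, a: float, p: float) -> float:
    """Equation (1): OEE = Q x A x P."""
    return q * a * p


# Hypothetical month: 22 h/day for 25 days = 550 planned hours, 55 h shutdown.
planned = 22 * 25
q = quality_rate(good_units=950, failed_qc_units=50)
a = availability(planned_hours=planned, shutdown_hours=55)
p = performance_efficiency(theoretical_cycle_time=0.45,  # hours per unit
                           processed_amount=1000,        # units
                           operating_time=planned - 55)  # hours
print(f"Q={q:.3f}  A={a:.3f}  P={p:.3f}  OEE={oee(q, a, p):.3f}")
```

Because OEE is a product, a weak value in any one sub-measure caps the overall figure, which is why the three are tracked separately.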

OEE rests on the principle that the best manufacturing performance occurs when a plant operates at full capacity,
always produces a perfect product and never breaks down. Data on capacity usage, quality performance and equipment
breakdown are therefore required to determine the OEE, and the manufacturing manager at the site is responsible for
providing this information on a timely basis (Chen 2008; Wang and Pan 2011). The metric has become widely accepted as
a quantitative tool essential for measuring productivity in manufacturing operations (Muthiah, Huang, and Mahadevan
2008). The target of TPM activities is to raise the OEE and labour productivity, and eventually to secure zero
equipment failures, zero defects and rework, and zero industrial accidents. TPM initiatives focus on addressing the
six major losses and wastes associated with production systems by implementing continuous and systematic evaluations
of the production system, thereby effecting significant improvements in production facilities (Ahuja and Khamba
2008).
Through the 1980s, the corporate maintenance strategies developed in the US and Japan involved significant paradigm
shifts, such as uptime maintenance, inter-trade flexibility within the maintenance workforce, and an amalgamation of
the roles of plant operators and front-line maintenance personnel. This paved the way for the introduction of APM, a
key element of TPM, which required the implementation of the following procedures.

• Cultivating a sense of ownership in the operator by introducing autonomous maintenance, i.e., the operator
takes responsibility for the primary care of his/her plant.
• Optimising the operator’s skills and knowledge of his/her plant in order to maximise operating effectiveness:
the operator is thus mobilised to detect early signs of wear, maladjustments, leaks, errant chips or loose parts.
He/she is also involved in making improvement suggestions to eliminate losses resulting from the breakdown
or suboptimal performance of the plant.
• Using cross-functional teams, consisting of operators, maintainers, engineers and managers, to improve the
plant’s overall performance.
• Establishing a schedule of clean-up and preventive maintenance to extend the plant’s lifespan and maximise its
uptime (Eti, Ogaji, and Probert 2006).

3. Method
3.1 Recent practice in PM review
In PM, a system that is highly likely to exhibit a demobilising fault is replaced before that failure is allowed to
occur. Scheduled PM is still the most common form of this policy in industry: the PM action is performed on the item
at a scheduled time regardless of its actual condition. Because the schedule is often drawn up from the supplier’s
recommendation, with only limited local knowledge of the actual use conditions or past experience, it is seldom an
optimal procedure/system. Preventive maintenance schedules that minimise resource consumption or maximise
availability can be determined using quantitative decision models based on factual information such as
time-to-failure distributions, and on techniques such as FTA, root-cause analysis (RCA) and FMEA.
In conventional FTA, the process should be fully understood and the probability of failure for the basic events must
be known. It is difficult to estimate exactly the failure rates of the system components or the probability of
occurrence of undesired events, owing to a lack of sufficient data or of sufficient knowledge in the engineering team
(Mentes and Helvacioglu 2011; Renjith et al. 2010). The multidisciplinary team at the lead-frame manufacturer,
S Company, comes from various engineering disciplines (mechanical, electrical, etc.). It is very difficult for the
team to measure these probabilities accurately, let alone to apply more advanced methods such as fuzzy models or grey
relational analysis. A simple, workable and useful method/system for improving PM performance is therefore preferred,
and is what this study proposes.
RCA classifies the problem into associated categories, such as people, procedures or hardware, and tries to prevent
problem recurrence (Eti, Ogaji, and Probert 2006).
FMEA is a structured, bottom-up approach that starts with known potential failure modes at one level and investigates
their effect on the next subsystem level. All complex mechanical systems are composed of several subsystems (SS) that
can be broken down further to the component level, failure mode level and cause level, supported by continuing RCA
effort (Sharma, Kumar, and Kumar 2005). A complete FMEA of a system therefore spans all of the levels in the
hierarchy from bottom to top, as shown in Figure 1.
The widely accepted FMEA procedure starts with identifying all potential failure modes of the system. Following
identification, all possible causes, effects and hazards of each failure should be related to the failure modes.
After the causes and effects of the failure modes have been identified, a target for improvement should be chosen. To
quantify this procedure a risk priority number (RPN) is used, which is the product of the failure severity (S), the
probability of occurrence (O) and the detectability (D), i.e., RPN = S × O × D (Geum, Cho, and Park 2011). The higher
the RPN, the higher the risk posed by the failure mode, and hence the higher its priority for corrective action
(Vinodh and Santhosh 2012). The case company needed to prioritise its corrective actions in order to enhance system
performance by reducing the failure rate. The FMEA table format is shown in Table 1.
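
The RPN calculation and the resulting prioritisation can be sketched in a few lines of Python. This is an illustrative sketch only: the 1 to 10 rating scale is a common FMEA convention assumed here rather than stated in the paper, and the three failure causes and their scores are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FailureCause:
    """One row of an FMEA table, reduced to the three rated criteria."""
    description: str
    severity: int       # S
    occurrence: int     # O
    detectability: int  # D

    def __post_init__(self):
        # Assumed rating scale of 1..10 for each criterion.
        for label, score in (("severity", self.severity),
                             ("occurrence", self.occurrence),
                             ("detectability", self.detectability)):
            if not 1 <= score <= 10:
                raise ValueError(f"{label} must be between 1 and 10")

    @property
    def rpn(self) -> int:
        """Risk priority number: RPN = S x O x D."""
        return self.severity * self.occurrence * self.detectability


def rank_by_rpn(causes: List[FailureCause]) -> List[FailureCause]:
    """Return the causes ordered from highest to lowest RPN."""
    return sorted(causes, key=lambda cause: cause.rpn, reverse=True)


# Hypothetical failure causes, for illustration only.
causes = [
    FailureCause("bearing wear", severity=7, occurrence=4, detectability=5),
    FailureCause("loose belt", severity=5, occurrence=6, detectability=2),
    FailureCause("sensor drift", severity=8, occurrence=3, detectability=9),
]
for cause in rank_by_rpn(causes):
    print(f"{cause.description:<14} RPN = {cause.rpn}")
```

Sorting by RPN gives the order in which corrective actions would be considered; it does not by itself say how often maintenance should be performed.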
Two methodologies, RCA and FMEA, and one transforming mechanism are adopted to build an integrated framework for the
FMEA-aided APM programme and its operating system, which could prove beneficial to maintenance engineers/managers
dealing with the analysis, design and optimisation of both reliability and maintainability issues. The case of
S Company, from the semiconductor industry, is used to illustrate the proposed framework.

3.2 FMEA-aided APM programme establishment


The main steps for establishing an FMEA-aided APM programme, and the techniques required, are shown in Figure 2 and
specified as follows.

Step 1. Choosing a piece of equipment as the system to be studied and forming a multidisciplinary team to build the
‘hierarchical FMEA system structure’, so as to elaborate and verify the ‘causal chain’ of failure dependencies
accurately: Causes → Failure Modes → Effects.

Figure 1. Hierarchical FMEA system structure.



Table 1. Format of FMEA table.

Header fields: System; Subsystem; Company name/Prepared by; FMEA number; FMEA date/Rev.; Page ... of ...

Column group      Columns
FMEA process      (Sub)system and function; Failure mode; Failure effect; Severity (S); Cause of failure;
                  Occurrence (O); Prevention methods; Detection methods; Detectability (D); RPN;
                  Corrective action; Responsibilities/date
Action results    Actions taken; Severity (S); Occurrence (O); Detectability (D); RPN

Figure 2. The main steps for establishing a FMEA-aided APM programme.

Step 2. Performing FMEA. This includes identifying the failure modes; determining the potential effect of each
failure mode; ranking the severity (S) and/or class of the failure mode effects; listing the causes of each failure
mode and ranking their probability of occurrence (O); listing the prevention, design validation/verification or
other activities that will assure design adequacy for the failure mode and/or cause/mechanism under consideration;
ranking the detectability (D) of each failure mode; calculating the RPN for each failure mode; answering the
question ‘What actions need to be taken to prevent the failure mode?’; assigning the appropriate individual, area,
function or team; and setting a realistic completion target within the development programme. Finally, the results
of the actions taken are listed and severity (S), occurrence (O), detectability (D) and RPN are reassessed.

Table 2. FMEA table with required APM.

Column group      Columns
FMEA process      Project and function; Failure mode; Failure effect; Severity (S); Cause of failure;
                  Occurrence (O); Prevention methods; Detection methods; Detectability (D); RPN;
                  Corrective action; Responsibilities/date
Action results    Actions taken; Severity (S); Occurrence (O); Detectability (D); RPN
Required APM      OAM; EPM

Table 3. RPN versus frequency of OAM, EPM (S Company).

                     Severity < 8           Severity ≥ 8
RPN                  OAM        EPM         OAM        EPM
451 < RPN            2 h        1 day       1 h        12 h
351 < RPN ≤ 450      12 h       5 days      5 h        2 days
251 < RPN ≤ 350      1 day      2 weeks     10 h       5 days
151 < RPN ≤ 250      1 week     3 weeks     1 day      1 week
51 < RPN ≤ 150       2 weeks    1 month     3 days     2 weeks
1 < RPN ≤ 50         None       None        1 week     3 weeks

Figure 3. RTDC machine hierarchical FMEA structure. [Diagram not reproduced. Recoverable node labels: Tapping M/C
failure; Tape releasing no good; Taping tool motion failure; L/F incoming structure failure; Tape plate pivot too
loose; Tapping turn table and tape membrane ‘rotary table’ adjustment no good; Out of use; Tooling parts shortage;
Cylinder damage; Air cylinder leakage or pressure too high or low; Lifting wheel motion not smooth; Incoming motor
failure; Cylinder pipe remained w/ water. M/C: machine; w/: with.]

Step 3. Developing a mechanism that transforms the RPN into an APM interval. This mechanism is developed by the
multidisciplinary FMEA team and requires an analysis of past records together with considerations of
effectiveness and cost. The mapping from RPN to APM interval can be very different for plants with different
product lines. The FMEA team members may be charged with reviewing performance data periodically to assess how
much the cost of quality has improved since implementing the system. A ‘required APM’ column, with OAM (operator
autonomous maintenance) and EPM (engineer preventive maintenance) sub-columns, is added to the FMEA table as shown
in Table 2, and the required APM actions and their intervals/frequencies are developed and used to update the PM
working instructions. The mechanism established at S Company is shown in Table 3.
Step 4. Implementing actions and monitoring performance. The operation manager and the FMEA-aided APM project team
members are charged with reviewing performance data periodically to assess how much the equipment has improved
since implementing the project. The improvement cycle from step 1 to step 4 is continuous: the mechanism (the
required APM and its intervals) is tuned and updated dynamically to meet the requirements of prevention,
cost-effectiveness and customer satisfaction at all times.
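
The transformation in Step 3 amounts to a table lookup, which the following Python sketch illustrates using the S Company values from Table 3. The function and variable names are hypothetical. The sketch follows the RPN bounds literally as printed, so boundary values that no row covers (for example RPN = 451) raise an error instead of being assigned to a neighbouring row; how such values are handled in practice is not stated in the paper.

```python
from typing import Optional, Tuple

Interval = Tuple[Optional[str], Optional[str]]  # (OAM interval, EPM interval)

# Each row: (exclusive lower bound, inclusive upper bound or None,
#            intervals for severity < 8, intervals for severity >= 8).
TABLE_3 = [
    (451, None, ("2 h", "1 day"), ("1 h", "12 h")),
    (351, 450, ("12 h", "5 days"), ("5 h", "2 days")),
    (251, 350, ("1 day", "2 weeks"), ("10 h", "5 days")),
    (151, 250, ("1 week", "3 weeks"), ("1 day", "1 week")),
    (51, 150, ("2 weeks", "1 month"), ("3 days", "2 weeks")),
    (1, 50, (None, None), ("1 week", "3 weeks")),  # None = no APM required
]


def apm_intervals(rpn: int, severity: int) -> Interval:
    """Look up the (OAM, EPM) intervals for a given RPN and severity."""
    for lower, upper, low_severity, high_severity in TABLE_3:
        if rpn > lower and (upper is None or rpn <= upper):
            return high_severity if severity >= 8 else low_severity
    raise ValueError(f"RPN {rpn} is not covered by Table 3 as printed")


# The two RPN values reported for the RTDC machine, both with severity 6.
print(apm_intervals(108, severity=6))
print(apm_intervals(216, severity=6))
```

Keeping the mapping as data rather than as branching logic makes it straightforward to retune the intervals during the Step 4 review cycle.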

4. Case study
We selected an RTDC taping machine from S Company as a case study to illustrate the proposed method. The main
function of this machine is to attach tape precisely on to the lead-frame (L/F). The hierarchical FMEA system
structure was built and the FMEA was performed. The hierarchical FMEA structure of the RTDC machine is a graphical
representation of the logical combinations of RTDC machine failures; the relationship between a failure or fault and
the events that cause it is shown in Figure 3. Table 4 is part of the FMEA table for the RTDC machine. One of the
three possible failures shown in Figure 3, and two causes of this failure, are used to calculate the RPN, giving 108
and 216, respectively, as shown in Table 4. Referring to Table 3, the required APM frequencies, including the
intervals for OAM and EPM, for the two causes of failure with RPNs of 108 and 216 can be determined as shown in
Table 5. A very significant improvement in machine maintenance performance and an improvement of at least 20% in the
average OEE of the S Company lead-frame manufacturing plant (from 48.24% to 61.40%) were obtained one year after this
system was phased in, as shown in Table 6.

Table 4. FMEA table of RTDC machine.

                             Cause 1                               Cause 2
FMEA process
  Project and function       To attach tape on L/F precisely (both causes)
  Failure mode               Tape releasing no good (both causes)
  Failure effect             Affects the tape incoming (both causes)
  Severity (S)               6                                     6
  Cause of failure           Tapping turn table axle too loose     Table adjustment no good
  Occurrence (O)             8                                     8
  Prevention methods         Enough parts/tool                     Enough parts/tool
  Detection methods          Adjustment by condition               Adjustment by condition
  Detectability (D)          6                                     8
  RPN                        288                                   384
  Corrective action          Confirm the tape turn table status    Confirm transparent table status
  Responsibilities/date      Tapping Operator/20100923             Tapping Dept. Operator
Action results
  Actions taken              Check the tape turn table             Check transparent table
  Severity (S)               6                                     6
  Occurrence (O)             6                                     6
  Detectability (D)          3                                     6
  RPN                        108                                   216
Required APM
  OAM                        (1)                                   (3)
  EPM                        (2)                                   (4)

Table 5. Required APM frequency for RTDC machine.

Failure mode: Tape releasing not good

                                                                                                   Required APM
Cause of failure mode              Prevention methods           Detection methods           RPN    OAM            EPM
Tapping turn table axle too loose  1. Conducting the training   Check the tape turn table   108    (1) 2 weeks    (2) 1 month
                                   2. Sufficient parts/tool
Table adjustment not good          1. Conducting the training   Check transparent table     216    (3) 1 week     (4) 3 weeks
                                   2. Sufficient parts/tool

Table 6. Performance comparison of machine maintenance (S Company).

              2010 (July–Dec.)                   2011 (July–Dec.)
Operation     Breakdown       Troubleshoot       Breakdown       Troubleshoot
              (time/month)    (hour/month)       (time/month)    (hour/month)
Stamping      5               136                3               53
Plating       41              172                30              104
Cutting       45              20                 35              157
OEE           48.24%*                            61.40%**

* Referring to Equation (1), OEE = Q × A × P = 0.92 × 0.69 × 0.76 = 0.4824.
** OEE = Q × A × P = 0.96 × 0.82 × 0.78 = 0.6140.
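
The figures reported in Tables 4 and 6 can be recomputed with a short Python sketch, which also shows the size of the OEE change both in percentage points and in relative terms. The variable names are hypothetical; all input values are taken from the two tables.

```python
# Table 6 footnotes: (Q, A, P) for each half-year period.
oee_inputs = {
    "2010 (July-Dec.)": (0.92, 0.69, 0.76),
    "2011 (July-Dec.)": (0.96, 0.82, 0.78),
}
oee_values = {period: q * a * p for period, (q, a, p) in oee_inputs.items()}

before = oee_values["2010 (July-Dec.)"]
after = oee_values["2011 (July-Dec.)"]
point_gain = (after - before) * 100            # percentage points
relative_gain = (after - before) / before * 100  # percent of the 2010 value

for period, value in oee_values.items():
    print(f"OEE {period}: {value:.4f}")
print(f"Gain: {point_gain:.2f} percentage points, {relative_gain:.1f}% relative")

# Table 4: (S, O, D) before and after the corrective actions, per cause.
rpn_inputs = {
    "Tapping turn table axle too loose": {"before": (6, 8, 6), "after": (6, 6, 3)},
    "Table adjustment no good": {"before": (6, 8, 8), "after": (6, 6, 6)},
}
rpn_values = {
    cause: {stage: s * o * d for stage, (s, o, d) in stages.items()}
    for cause, stages in rpn_inputs.items()
}
for cause, stages in rpn_values.items():
    print(f"{cause}: RPN {stages['before']} -> {stages['after']}")
```

On these inputs the relative OEE gain is a little over 27%, which is consistent with the reported improvement of at least 20%.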

5. Conclusions
Maintenance efficiency can affect plant availability, costs, business effectiveness, risk, safety, environmental
integrity, product quality and customer service. As a result of the complexity of current equipment, repair and
restoration are more difficult, and new multi-skill tools and techniques are needed. It is now increasingly realised
that achieving high-quality maintenance requires prevention at the source and a focus on identifying and eliminating
the risk of critical failures and the causes of equipment deterioration. The FMEA-aided APM system has proved very
useful in reducing the chance of failure. Identifying potential failures quickly, taking appropriate actions and
making it easier for people to do the right thing are critical to the success of this system.

Acknowledgements
The author thanks the manufacturing management team from S Company for their valuable assistance with this research project. This
study was partly supported by the NSC, Taiwan (Project No. NSC 101-2622-E-020 -009 -CC3).

References

Ahuja, I. P. S., and J. S. Khamba. 2008. “Assessment of Contributions of Successful TPM Initiatives towards
Competitive Manufacturing.” Journal of Quality in Maintenance Engineering 14 (4): 356–374.
Cassanelli, G., G. Mura, F. Fantini, M. Vanzi, and B. Plano. 2006. “Failure Analysis-Assisted FMEA.”
Microelectronics Reliability 46: 1795–1799.
Chen, C. C. 2008. “An Objective-Oriented and Product-Line-Based Manufacturing Performance Measurement.”
International Journal of Production Economics 112 (1): 380–390.
Eti, M. C., S. O. T. Ogaji, and S. D. Probert. 2006. “Development and Implementation of Preventive-Maintenance
Practices in Nigerian Industries.” Applied Energy 83: 1163–1179.
Ferdous, R., F. I. Khan, B. Veitch, and P. R. Amyotte. 2007. “Methodology for Computer-Aided Fault Tree Analysis.”
Trans IChemE, Part B, Process Safety and Environmental Protection 85 (1): 70–80.
Geum, Y., Y. Cho, and Y. Park. 2011. “A Systematic Approach for Diagnosing Service Failure: Service-Specific FMEA
and Grey Relational Analysis Approach.” Mathematical and Computer Modelling 54: 3126–3142.
Konecny, P. A., and J. H. Thun. 2011. “Do It Separately or Simultaneously – An Empirical Analysis of a Conjoint
Implementation of TQM and TPM on Plant Performance.” International Journal of Production Economics 133 (2): 496–507.
Mentes, A., and I. H. Helvacioglu. 2011. “An Application of Fuzzy Fault Tree Analysis for Spread Mooring Systems.”
Ocean Engineering 38: 285–294.
Moghaddam, K. S., and J. S. Usher. 2011. “Preventive Maintenance and Replacement Scheduling for Repairable and
Maintainable Systems Using Dynamic Programming.” Computers & Industrial Engineering 60: 654–665.
Muthiah, K. M. N., S. H. Huang, and S. Mahadevan. 2008. “Automating Factory Performance Diagnostics Using Overall
Throughput Effectiveness (OTE) Metric.” International Journal of Advanced Manufacturing Technology 36: 811–824.
Nakajima, S. 1988. Total Productive Maintenance. London: Productivity Press.
Papadopoulos, Y., M. Walker, D. Parker, E. Rüde, R. Hamann, A. Uhlig, U. Grätz, and R. Lien. 2011. “Engineering
Failure Analysis and Design Optimisation with HiP-HOPS.” Engineering Failure Analysis 18: 590–608.
Renjith, V. R., G. Madhu, V. Lakshmana Gomathi Nayagam, and A. B. Bhasi. 2010. “Two-Dimensional Fuzzy Fault Tree
Analysis for Chlorine Release from a Chlor-Alkali Industry Using Expert Elicitation.” Journal of Hazardous
Materials 183: 103–110.
Sharma, R. K., D. Kumar, and P. Kumar. 2005. “Systematic Failure Mode Effect Analysis (FMEA) Using Fuzzy Linguistic
Modeling.” International Journal of Quality & Reliability Management 22 (9): 986–1004.
Vinodh, S., and D. Santhosh. 2012. “Application of FMEA to an Automotive Leaf Spring Manufacturing Organization.”
The TQM Journal 24 (3): 260–274.
Wang, T. Y., and H. C. Pan. 2011. “Improving the OEE and UPH Data Quality by Automated Data Collection for the
Semiconductor Assembly Industry.” Expert Systems with Applications 38: 5764–5773.
