Cold Front - Vol. 10 No. 4, 2010 Newsletter
OVERVIEW
By their very nature, safety systems are called upon to operate infrequently throughout their life;
consequently, they are susceptible to failures that lie latent or dormant - often not realized until times when
the safety system is actually needed. When safety systems fail, personnel are at increased risk and the
Research Staff
Dan Dettmers 608/262-8221
djdettme@wisc.edu
Vol. 10 No. 4, 2010
Noteworthy
• Mark your calendars now for the 2011 IRC Research and Technology
Forum – May 4-5, 2011 at the Pyle Center in Madison, WI.
• Send items of note for next newsletter to Todd Jekel, tbjekel@wisc.edu.
room shutdown is actuated, the following control actions are taken: (1) activate visual beacons and audible
alarms outside each machinery room entrance as well as those beacons and alarms located within the
machinery room; (2) energize machinery room ventilation system in the emergency mode (opening intake
dampers and starting all exhaust fans); (3) shunt-trip electrical power to all unclassified electrical equipment
within the machinery room.
When refrigeration personnel triggered the break glass switch during their test, they discovered that only
the visual beacon/alarm inside the machinery room successfully activated. Upon diagnosing why the
emergency ventilation system and electrical shutdown did not function, plant personnel discovered a blown
fuse in the electrical control circuit for these safety systems. This deficiency could have existed for nearly a
year since the blown fuse was latent, that is, unknown to plant personnel. Plant personnel also noted that
both the electrical shutdown and the emergency ventilation system actuation were designed to function
using a programmable logic controller (PLC) – more on this design approach later. Disturbingly, these
critical safety systems (electrical shutdown and ventilation) were compromised by the failure of a simple
and relatively reliable electrical component – a fuse.
Recommendation: The control circuits responsible for actuating both the electrical shutdown and the
emergency ventilation were rewired to be “fail-safe.” A fail-safe control circuit
arrangement will trigger the desired control actuation of a safety system on the failure of a control circuit
due to loss of power, severed conductor, or blown fuse.
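The difference between a fail-safe and a non-fail-safe control circuit can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the circuit is modeled as a single series path, and the names `ControlCircuit`, `energize_to_trip`, and `de_energize_to_trip` are hypothetical and do not come from the plant's actual design.

```python
from dataclasses import dataclass


@dataclass
class ControlCircuit:
    """Simplified series circuit: power source -> fuse -> wiring -> switch contact."""
    power_ok: bool = True
    fuse_ok: bool = True
    wiring_ok: bool = True

    def current_flows(self, contact_closed: bool) -> bool:
        # Current reaches the relay coil only if every series element is intact.
        return self.power_ok and self.fuse_ok and self.wiring_ok and contact_closed


def energize_to_trip(circuit: ControlCircuit, shutdown_requested: bool) -> bool:
    """Not fail-safe: the contact closes on demand and current must flow to actuate."""
    return circuit.current_flows(contact_closed=shutdown_requested)


def de_energize_to_trip(circuit: ControlCircuit, shutdown_requested: bool) -> bool:
    """Fail-safe: the contact is held closed in normal operation; loss of current actuates."""
    return not circuit.current_flows(contact_closed=not shutdown_requested)


blown_fuse = ControlCircuit(fuse_ok=False)
# With a blown fuse, the energize-to-trip circuit cannot actuate when needed.
print(energize_to_trip(blown_fuse, shutdown_requested=True))      # False
# The de-energize-to-trip circuit actuates as soon as the fuse blows, revealing the fault.
print(de_energize_to_trip(blown_fuse, shutdown_requested=False))  # True
```

In the fail-safe arrangement, a blown fuse produces an immediate (if unwanted) actuation rather than a latent failure that waits for the next test to be found.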
Interestingly, the failure of this safety system was enabled by humans – plant electrical or refrigeration
personnel (although no one seemed to recall actually pulling the disconnects). Unfortunately, the supply of
electrical power cannot be wired in a “fail-safe” arrangement as can be accomplished for control wiring.
Recommendation: To prevent a re-occurrence of this safety system failure, the plant intends to
make the following changes:
Another option that the plant could consider is the installation of sensors to detect and alarm on the loss of
electrical conductor continuity between the source of electrical supply power and the terminal demand
(the electric motor(s) for the emergency ventilation fans). This is not a “fail-safe” arrangement but another layer of
protection to ensure an uninterrupted supply of electrical power to the actuated devices (fan motors).
Footnote 1: Although excess flow valves are a form of engineering control for safety, they do have their application limitations and
they are susceptible to failure. For further information on limitations of excess flow valves and for case studies on their
failure, see EPA 2007.
system in the event of a catastrophic failure of piping or other components. When the sensed fluid flow
through the valve exceeds a user-specified rate, the controls, via a local PLC, trigger closure of the valve. The
local PLC communicates the state change to the system’s main PLC to alert operators that the valve tripped.
In this plant, multiple excess flow valves are positioned in strategic locations throughout the facility to limit
the loss of ammonia in the event of a catastrophic failure. During annual testing of this engineered safety,
plant personnel could not remotely or locally trip one of the excess flow valves to close. In diagnosing the
cause of the valve failure, the local PLC at the excess flow valve was found to have a dead battery. Plant
personnel test these valves annually; however, their annual testing regimen did not include a PM to replace
the PLC’s batteries. In this case, the battery on the excess flow valve’s PLC could have failed shortly after
the last annual functional test, thereby rendering the valve inoperable for a significant period of time
without plant personnel knowing they had lost valve function.
Recommendation: In addition to periodic replacement of the batteries in each excess flow valve
PLC, the plant is adding a “heart beat” function between the local PLC at the excess flow valve and the
system’s main PLC. The revised arrangement will alarm on a loss of communication or “heart beat” from
the remote PLCs on each valve, thereby providing a red alert to the operators.
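The “heart beat” idea can be illustrated with a short Python sketch. It assumes each local PLC reports in periodically and the main controller alarms on any PLC that has been silent longer than a timeout. The class name `HeartbeatMonitor`, the PLC names, and the 10-second timeout are hypothetical; a real plant would implement this in its PLC programming environment, not in Python.

```python
from typing import Dict, List


class HeartbeatMonitor:
    """Tracks the last heartbeat time of each remote PLC and reports silent ones."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self.last_seen: Dict[str, float] = {}  # PLC name -> time of last heartbeat (s)

    def beat(self, plc: str, now: float) -> None:
        # Record that a heartbeat from this PLC arrived at time `now`.
        self.last_seen[plc] = now

    def stale(self, now: float) -> List[str]:
        # Any PLC silent for longer than the timeout should raise an operator alarm.
        return sorted(p for p, t in self.last_seen.items() if now - t > self.timeout_s)


monitor = HeartbeatMonitor(timeout_s=10.0)
monitor.beat("efv_plc_1", now=0.0)
monitor.beat("efv_plc_2", now=0.0)
monitor.beat("efv_plc_1", now=8.0)   # efv_plc_2 has gone silent (e.g., dead battery)
print(monitor.stale(now=15.0))       # ['efv_plc_2']
```

The point of the design is that silence itself becomes the alarm condition, so a dead local PLC is detected within seconds instead of at the next annual test.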
Upon closer investigation, loose connections (likely due to vibration) on the panel’s motherboard were
found to be responsible for the intermittent loss of machine control. Furthermore, plant personnel found
that the e-stop switches on multiple compressors were incorrectly re-wired after previous electrical service
work had been performed: the e-stop switches were wired through the machine’s PLC.
Recommendations: The e-stop switches were reconfigured to be hard-wired, bypassing the PLC, for
emergency local shutdown of individual compressor packages. The plant changed its annual inspection
and test procedures so that the safety circuits of individual machines can be assessed both while idle and
during operation. Finally, additional information is being gathered to better understand the root causes of
the intermittent electrical connections and required remedial action.
These are just a few of the case studies we have compiled during the past several months. The specific
remedies identified in each case represent one of several possible paths for corrective action but the corrective
action is not the lesson to be learned from this article. The important point is that one or more safety systems
failed without plant personnel being aware. Are critical safety systems in your plant functional or failed? Now
that we have your attention, let’s look more closely at critical safety systems and keys to their successful
operation.
INTRODUCTION
In considering the state-of-health of your critical safety systems, one of the first questions that should come to
mind is: what is a “critical safety system?” In other disciplines, critical safety systems are commonly referred to
as “safety-critical systems.” Here, we will use the former term and recognize that the latter may be used as
well. Consider the following definitions:
• Safety-critical systems – A computer, electronic, or electromechanical system whose failure may cause
injury or death to human beings [Fowler 2010].
• Safety-critical systems – Those systems critical to the safe operation of some larger system, whether this
be a nuclear power station or vehicle [Lees 1996].
• Safety interlock or alarm - Any equipment whose proper functioning is essential to prevent or signal
hazardous process conditions that may threaten personnel or equipment [Barclay 1988].
For the purpose of this article, we define a critical safety system as “engineering controls intended to prevent or
mitigate equipment failures in order to protect people, equipment, infrastructure, or products.”
In the Cold Front Vol. 8 No. 2 (2008) edition, we introduced some basic principles and practices for
hazard control in industrial refrigeration systems. Also included in that edition were definitions of engineering
controls. A key aspect of an engineering control is that it does not rely on human intervention in order to
achieve a desired level of performance. We also discussed the reality that all forms of engineering controls,
including those that we might classify as “critical safety systems,” involve humans – from design through to
installation and ongoing operation. As such, we also need to guard against failures rooted in human error during
the entire life cycle of a safety system.
At a basic level, safety systems consist of the following elements (see Figure 1):
• Input devices
• Controller
• Output devices
[Figure 1: block diagram; surviving labels are Controller, I/O Management (twice), and User Interface]
One could include “wiring” or “interconnections” in this list as well. Input devices are quite varied and commonly
include: pressure sensors, pressure switches, continuous level sensors, discrete level switches, temperature
transducers, ammonia detectors, flow sensors, or other state indicators. The controller receives inputs from one
or more sensors and applies some basic logic before rendering a decision on required control action. As
required, the controller communicates this control action to one or more controlled elements or output devices.
The output devices are varied but could consist of alarms (audible and visual) or energizing/de-energizing relays
or coils to cause the change in state of valves or other electromechanical components. In some cases, the output
devices rely on ancillary systems to achieve their function. For example, a pneumatically actuated butterfly valve
may receive an electrical signal to energize a solenoid valve to change the state of the valve. The loss of the
pneumatic system can compromise the performance of the controlled component.
The successful function of a safety system requires all components to be operable: input devices, controller,
output devices (including any ancillary services), and interconnecting wiring. The entire safety system’s
function can be compromised when any one of these elements fails to perform its assigned task.
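A rough way to see why every element matters is to treat the safety system as a series chain. If the elements are assumed to fail independently, the probability that the whole system works on demand is the product of the individual probabilities. The sketch below uses hypothetical availability figures chosen only for illustration; they are not measured data from any plant.

```python
from math import prod
from typing import Iterable


def series_availability(availabilities: Iterable[float]) -> float:
    """Probability that every element in a series chain works, assuming independence."""
    return prod(availabilities)


# Hypothetical per-element probabilities of working on demand (illustrative only).
elements = {
    "input device": 0.99,
    "controller": 0.995,
    "output device": 0.98,
    "wiring": 0.999,
}
system = series_availability(elements.values())
print(round(system, 3))  # 0.964
```

Under these assumptions the system as a whole is less dependable than its weakest single element, which is why a simple component such as a fuse can defeat an otherwise sound design.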
Historically, the “controller” element for critical safety systems was hard-wired and utilized electro-mechanical
devices for making control decisions. Today, microprocessor-based programmable logic controllers (PLCs) have
become standard for process control and they are finding growing use as the controller for safety systems as
well. In critical safety systems, designers have the option of specifying “safety PLCs” for added robustness and
controller reliability. Safety PLCs have internal redundancy and are significantly more fault-tolerant than
traditional PLCs.
The concept of “fail-safe” design was introduced above. It is one approach that can be used to guard against
simple modes of failure like wiring or input/output malfunctions; however, it is not applicable in all situations.
1. Identify specific individual safety systems and their importance in maintaining personnel and plant
safety
2. Determine the components that make up each safety system and gather information on their failure
modes and frequencies
3. Understand the sequence(s) of control for proper safety system function
4. Establish appropriate inspections and tests to ensure continued safety system functionality
5. Carry out appropriate inspections and tests to maintain the safety system from cradle to grave
The first key to success may seem obvious, but many plants have not clearly identified which safeties integrated
into their industrial refrigeration systems are (or should be) considered “critical safety systems.” Table 1
provides a categorized list of safety systems - differentiating those that would typically be classified as a “critical
safety system” as opposed to a “safety system” (non-critical).
As noted in the table, not all safety systems are classed as “critical.” Each end-user needs to establish specific
and clear criteria that can be used to categorize existing and future safety systems as to whether they are
critical or not. Keep in mind that all safety systems are important but not all are critical. Because of the
extraordinary responsibility they have to protect people and infrastructure, critical safety systems require
special care and attention.
• Initiation (design): The initiation of a critical safety system may grow out of an incident investigation,
process hazard analysis, changes to industry codes & standards, or by other means. The conceptual
and detailed designs of critical safety systems would benefit from independent peer review to determine
whether or not any gaps or flaws exist. Of course, this assumes that the initial safety design is based on
applicable codes, standards, and guidelines.
• Deployment (construction): Upon initial construction/deployment, the critical safety system should be
completely verified as part of a pre-startup safety review [1910.119(i)]. This functional verification
needs to validate proper installation of sensors (including calibration), wiring, and controlled devices.
The final step in functional verification is the demonstration of proper system actuation in all modes. At
least one of the critical safety system failures included in the above case studies would have been
prevented with a rigorous pre-startup safety review.
• Life Management (mechanical integrity): On a periodic basis, appropriate inspections and tests of all
components that make up each critical safety system need to be conducted. The process safety
management standard includes the ongoing management of controls (including critical safety systems)
within the scope of the mechanical integrity element [1910.119(j)(1)(v)]. The PSM standard requires that
appropriate procedures for testing be developed and that all staff involved in inspections and tests be
appropriately trained [1910.119(j)(2) and 1910.119(j)(3)]. Of course, the results of the inspections and
tests need to be properly documented [1910.119(j)(4)(iv)].
Footnote 2: Many of today’s compressor packages are equipped with numerous (twenty or more) separate safety cut-outs. As an
end-user, it is up to you to identify and understand these safety systems as well as performing the necessary ongoing
inspections and tests to ensure their functionality. As noted above, not all compressor safeties will be critical.
A typical interval for inspections and tests of many critical safety systems is annual; however, the
operating experience within your plant should be used to shorten the interval if you are finding failures
similar to those noted above [1910.119(j)(4)(iii)]. We should also point out that some elements of critical
control systems (such as ammonia sensors) often require shorter intervals for inspections.
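The effect of the test interval on latent failures can be estimated with a simple model. This is a sketch under stated assumptions, not a method taken from this article: failures occur at a constant rate, each test finds any failure, and repair is immediate. Under those assumptions the time-averaged probability that an undetected failure is present is 1 - (1 - exp(-rate × interval)) / (rate × interval), which is roughly rate × interval / 2 when that product is small. The failure rate used below is hypothetical.

```python
import math


def average_unavailability(failure_rate_per_year: float, test_interval_years: float) -> float:
    """Time-averaged probability that a latent failure is present between tests.

    Assumes a constant failure rate, perfect tests, and immediate repair.
    """
    lt = failure_rate_per_year * test_interval_years
    if lt == 0:
        return 0.0
    # Average of 1 - exp(-rate * t) over one test interval.
    return 1.0 - (1.0 - math.exp(-lt)) / lt


rate = 0.05  # hypothetical: one latent failure per 20 years on average
for interval in (1.0, 0.5, 0.25):
    print(interval, average_unavailability(rate, interval))
```

Under this model, halving the test interval roughly halves the fraction of time a safety system sits failed without anyone knowing.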
There are two basic approaches to functional testing of critical safety systems: actual testing and
simulated testing. The process of actual testing involves bringing the process or safety system to the trip
point and verifying the change of state of controlled variables and control functions. For example, actual
testing of a compressor high-pressure cut-out would involve slowly closing the discharge stop valve while
continuously monitoring the discharge pressure (upstream of the stop valve) to prove the compressor
actually shuts down as intended at the discharge pressure cut-out set point. Simulated testing involves
testing individual or groups of sensors, controllers, and actuators in a manner that does not actually
change the state of the system. The simulated test is used if the instrument schemes are so complicated
that the interlocks cannot be checked on a single-trip shutdown or if the shutdown creates considerable
process interruption [Sanders 2005].
It is also important to note that the functional readiness of safety systems does not hinge solely on
periodic actual or simulated testing; inspections are also required [Barclay 1988].
Appropriate inspections are important to identify deteriorated components so they can be replaced.
When deficiencies are found in any of the safety system components, repair or replace them immediately.
Use longitudinal inspection and test results to modify the inspection and test intervals as appropriate.
A CALL TO ACTION
Are your critical safety systems functional? Use the following safety system checklist as a tool for verification.
• Review your safety systems and identify those that should be classified as “critical.”
• Review the implementation of your critical safety systems to ensure a fail-safe design is used.
• Consider using diversity for instrumentation that is part of critical safety systems (e.g., flow meter +
refrigerant detectors).
• Use redundancy for sensors that are part of critical controls (e.g., pressure sensors and pressure switches).
• Develop and implement written procedures to functionally verify the operability of your safety systems
and controls including the calibration and function of sensors, trigger devices (such as floats), control
logic, and actuated devices. The functional verification should be conducted post-construction (as part of
a pre-startup safety review), following equipment repair or replacement, following changes (managed
as part of your plant’s MOC program), following a near-miss, and routinely as part of your plant’s mechanical
integrity program.
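The redundancy item in the checklist above can be illustrated with a small voting sketch. It assumes a pressure sensor and an independent pressure switch watch the same condition: either one can trigger the trip, and any disagreement between them is flagged for maintenance. The function name `one_out_of_two`, the units, and the set point are hypothetical, and this is one possible voting arrangement and not a recommendation from this article.

```python
from typing import NamedTuple


class Vote(NamedTuple):
    trip: bool         # True when the safety action should be taken
    discrepancy: bool  # True when the two devices disagree (one may have failed)


def one_out_of_two(pressure_psig: float, setpoint_psig: float, switch_tripped: bool) -> Vote:
    """Trip if either the sensor or the independent switch indicates high pressure."""
    sensor_tripped = pressure_psig >= setpoint_psig
    return Vote(
        trip=sensor_tripped or switch_tripped,
        discrepancy=sensor_tripped != switch_tripped,
    )


# Sensor reads low (perhaps failed) while the switch has tripped:
# the system still trips, and the disagreement is flagged.
print(one_out_of_two(pressure_psig=150.0, setpoint_psig=225.0, switch_tripped=True))
```

Flagging the disagreement matters as much as the trip itself, because it exposes a failed device that would otherwise stay latent.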
IIAR Standard 5 (2008) provides basic minimum requirements for the safe start-up and commissioning of
completed closed-circuit mechanical refrigerating systems utilizing ammonia as the refrigerant and to additions
and modifications made to such systems. Review this trial use document for information related to inspections
and tests for your safety systems.
CONCLUSION
Critical safety systems have to work! What’s the state of your critical safety systems? Will they function in the
unlikely event they are called upon to actuate? How do you know? We hope you found the case studies and
additional information presented in this issue of the Cold Front informative. Use it to continuously
improve the safety and reliability of “cold” in your plants.
REFERENCES
Barclay, D. A., “Protecting Process Safety Interlocks,” Chemical Engineering Progress, pp. 20-24, February
(1988).
CCPS, Guidelines for the Management of Change for Process Safety, Center for Chemical Process Safety, New
York, NY, (2008).
EPA, “Chemical Safety Alert - Emergency Isolation for Hazardous Material Fluid Transfer Systems – Applications
and Limitations of Excess Flow Valves,” U.S. Environmental Protection Agency, EPA 550-F-0-7001,
http://www.epa.gov/oem/docs/chem/EFV_alert.pdf, June, (2007).
Fowler, K. ed., Mission-Critical and Safety-Critical Systems Handbook, Elsevier Publishers, (2010).
IIAR Standard 5, “Start-up and Commissioning of Closed-Circuit Ammonia Mechanical Refrigerating Systems,”
International Institute of Ammonia Refrigeration, June (2008).
Lees, F. P., Loss Prevention in the Process Industries, Elsevier Publishers, 2nd Edition, (1996).
OSHA, “Process safety management of highly hazardous chemicals”, U.S. Occupational Safety and Health
Administration, 29 CFR 1910.119, (1992).
Reindl, D. T., “Mechanical Integrity for Safety Systems and Controls,” 2010 IRC Research and Technology Forum,
http://www.irc.wisc.edu/?/rtforums, (2010).
Sanders, R. E., Chemical Process Safety - Learning from Case Histories, Elsevier Publishers, 3rd Edition, (2005).