Valve OnLine Test

SAFETY

In Tech February 1998 39

Increase plant safety with online valve testing


By Paul Gruhn, Joe Pittman, Susan Wiley, and Tom LeBlanc

Improvements in safety system field devices can dramatically impact the overall safety of a facility.

The weak link in most safety systems is the field devices, specifically the valves. How can this weak link be strengthened? The ISA S84 and IEC 1508/1511 standards along with the American Institute of Chemical Engineers Center for Chemical Process Safety guidelines on safety instrumented (interlock) systems are performance-oriented, not prescriptive. They do not prescribe which logic system or field device configuration to use, how redundant a system should be, or how often a system should be tested. They merely list performance requirements for the system; the greater the process risk, the greater the safety system performance needed. These standards, along with Occupational Safety and Health Administration (OSHA) process safety management (PSM) requirements, state that companies need to determine and document that equipment is designed, maintained, inspected, tested, and operated in a safe manner. When overall system performance, from sensor to final element, is quantified, it is easy to see that valves represent the weak links in most systems.

A discrete shutoff valve's typical failure mode is being stuck. The only way to test for this condition is to stroke the valve, but closing the valve completely and stopping production is not desirable. The valve does not need to be fully stroked to test its functionality. If partially stroking in a simple, reliable, and secure manner is possible online without stopping production, a dramatic safety improvement can result. When the safety impact of this test method is quantified, it typically shows an improvement of one order of magnitude. For large, continuous-operation process facilities, this solution has been described as manna from heaven.

Understanding the metrics

To evaluate systems and make design option comparisons, the failure modes of safety-related systems and the terms defining system performance must be understood. Unfortunately, this is not a universally understood and agreed-upon area.

The term "availability" is used by just about everyone involved in control systems. Unfortunately, its usage varies from person to person, and it isn't even that applicable to safety systems. Safety-related systems can suffer from two failure modes, not just one. Thus, if just one term (e.g., availability) is used, which failure mode is being referenced?

Safety systems can shut down a process when nothing is actually wrong. Such failures are typically called nuisance trips (or spurious, safe, overt, revealed, or initiating failures). What might an availability of 99.9% mean here? Does this mean the system is down 45 minutes once a month, or 9 hours once a year, or 3.7 days once every 10 years? All three downtime options reflect the same availability. Most users know how long their process will be down when production stops. They just want to know, on average, how often such an event might occur. Saying a nuisance trip might occur once every month, once every year, once every 10 years, or once every 100 years is much easier to relate to. This article uses the term nuisance trip rate (NTR), measured in years, which represents the mean time between nuisance trips.

Safety systems may also fail to respond to an actual demand. Such failures have historically been called dangerous, covert, or inhibiting failures. Commonly used terms here are safety availability, probability of failure on demand (pfd), and risk reduction factor (RRF). The problem here is that the ranges of numbers typically used are very difficult for most people to relate to. Table 1 shows a comparison between the numbers and the safety integrity levels (SILs) defined in industry standards. The difference between a safety availability of 99% and 99.99% does not sound significant; after all, it is less than 1%. The difference between RRFs of 100 and 10,000, however, is a bit more obvious. The point is, both ranges of numbers differ by two orders of magnitude. This article uses RRF.

Table 1. Safety integrity levels and performance requirements (for the entire system)

ISA S84 SIL   Safety availability   pfd (1 - safety availability)   RRF (1/pfd)
3             99.9% to 99.99%       0.001 to 0.0001                 1,000 to 10,000
2             99% to 99.9%          0.01 to 0.001                   100 to 1,000
1             90% to 99%            0.1 to 0.01                     10 to 100
0             Process control: not applicable

Shut down regularly or test online

Facilities now operate for extended periods between scheduled shutdowns to maximize reliability and profits. This sometimes means companies wait six years for the chance to test a shutdown valve off-line. This is not a viable solution for interlocks requiring very high safety availability, so online testing must be considered.

However, the thought of testing shutdown valves online strikes terror into the hearts of plant personnel who struggle to produce every pound of product possible. To those who keep hearing "do more with less" and "maximize efficiency and reliability," a spurious trip and lost production due to a bad sensor are unacceptable, and deliberately interrupting a process to test a valve that may never be called on is insane.

However, given the new rules put out by OSHA and ISA S84, operating personnel are realizing that systems must either be regularly shut down to test the valves or designed so valves can be tested without shutting down. Regular shutdown, while possible, is simply not feasible economically for large-scale commodity product units. Therefore, testing online is being explored in greater detail.

A logic box does not a system make

Control system vendors have been providing system performance numbers for years. Unfortunately, many users do not realize that such numbers account only for the logic box, not the entire system. The performance numbers listed in Table 1, however, represent the entire system, including field devices, not the logic box in isolation.

Many specialized dual and triplicated logic boxes are independently certified for SIL 3 applications. Let's assume a generic 200-year NTR and an RRF of 10,000 for such a specialized logic box, as shown in Figures 1 and 2. In other words, the logic box is both safe and fault tolerant. These numbers are realistic and typical for the vendors specializing in these systems.

Figure 1. Nuisance trip performance

Figure 2. Safety performance
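The metrics introduced under "Understanding the metrics" and in Table 1 can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, a year is taken as 8,760 hours and a month as one-twelfth of that, and the SIL bands are simply the RRF ranges from Table 1 (values above 10,000 are treated as SIL 3, which the table does not address).

```python
HOURS_PER_YEAR = 8760.0  # assumption: non-leap year


def downtime_hours(availability, period_years):
    """Total downtime implied by an availability figure over a given period."""
    return (1.0 - availability) * period_years * HOURS_PER_YEAR


def rrf_from_availability(safety_availability):
    """Risk reduction factor: RRF = 1 / pfd, with pfd = 1 - safety availability."""
    pfd = 1.0 - safety_availability
    return 1.0 / pfd


def sil_from_rrf(rrf):
    """Map an RRF onto the Table 1 bands (lower bound of each band inclusive)."""
    if rrf >= 1000:
        return 3
    if rrf >= 100:
        return 2
    if rrf >= 10:
        return 1
    return 0  # process control range, no SIL credit


if __name__ == "__main__":
    # The same 99.9% availability expressed over three different periods.
    print(downtime_hours(0.999, 1.0 / 12.0) * 60.0)  # minutes per month, about 44
    print(downtime_hours(0.999, 1.0))                # hours per year, about 8.8
    print(downtime_hours(0.999, 10.0) / 24.0)        # days per 10 years, about 3.7
    print(sil_from_rrf(66), sil_from_rrf(800))       # bands for two RRFs used later in the article
```

Because the bands are compared with floating-point numbers, an RRF computed to land exactly on a band edge (for example from an availability of exactly 99.9%) may fall on either side of the boundary; round the input first if that matters.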

Simplex field devices decrease performance

Most dual and triplicated systems are installed with simplex field devices. What impact, if any, does this have on overall system performance? Methodologies presented in draft ISA technical report dTR84.02 will be used to evaluate a small interlock system with eight sensors and two valves. Assume a 100-year mean time between failure (MTBF) for each device in each failure mode (i.e., one out of 100 devices causes a nuisance trip in one year, and after one year of testing 100 devices, one is found to be stuck).

In nuisance trip mode, all field devices in the model must be included because it is assumed that any device failing safely will cause a nuisance trip. If there are 10 devices, each having a 100-year NTR, the NTR due to all 10 devices is 10 years. In other words, there will be a nuisance trip, on average, every 10 years due to the field devices alone, as shown in Figure 1.

Things are even simpler for the RRF calculation because not all 10 field devices are included. Assume for this example that both valves must function; therefore, both should be included in the model. However, the system will only fail if a demand is placed on the one sensor that failed. In other words, if a pressure sensor fails, but the shutdown demand occurs on a temperature sensor, the system functions properly. Therefore, only one input, not all eight, should be included in the fail-to-function model.

So, there are one sensor and two valves. Assume an eight-hour repair time and a one-year manual test interval. The pfd from these three devices alone is 0.015, which equates to an RRF of 66, as shown in Figure 2. If the RRF for the field devices is 66, the RRF for the system is essentially 66 as well, because the logic box's pfd of 0.0001 is negligible by comparison. Thus, the overall system only meets SIL 1 requirements, not SIL 3. Ouch! The logic box represents less than 1% of the overall problem.
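The field-device numbers above can be reproduced with a simple series model. This is a sketch under stated assumptions, not the dTR84.02 procedure itself: it assumes failure rates add for devices in series, and it uses the common approximation pfd ≈ λ × TI / 2 for a device tested every TI years, ignoring the eight-hour repair time (whose contribution is negligible here). The function names are hypothetical.

```python
def combined_ntr_years(device_mtbf_years):
    """Nuisance trip rate (mean years between trips) when any one device can trip.

    Assumes safe-failure rates simply add across devices.
    """
    total_rate = sum(1.0 / mtbf for mtbf in device_mtbf_years)
    return 1.0 / total_rate


def simplex_pfd(mtbf_years, test_interval_years):
    """Average pfd of a single periodically tested device (lambda * TI / 2)."""
    failure_rate = 1.0 / mtbf_years
    return failure_rate * test_interval_years / 2.0


# Nuisance trips: all eight sensors and both valves count.
ntr = combined_ntr_years([100.0] * 10)

# Fail-to-function: one sensor plus two valves, each tested once a year.
pfd_field = 3 * simplex_pfd(mtbf_years=100.0, test_interval_years=1.0)
rrf_field = 1.0 / pfd_field

if __name__ == "__main__":
    print(f"NTR from field devices: about {ntr:.0f} years")
    print(f"pfd from field devices: {pfd_field:.3f}")
    print(f"RRF from field devices: about {rrf_field:.1f}")
```

Under these assumptions the sketch gives an NTR of about 10 years, a pfd of 0.015, and an RRF of about 66.7, in line with the figures quoted in the text.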
Transmitters make little overall improvement

During the past decade, more companies used analog transmitters, which offer a variety of benefits, instead of discrete switches. For example, the dynamic nature of a transmitter signal means it is easier to tell if the device is working properly. When multiple devices are used, comparisons can then be made to increase the potential diagnostic levels even further. Impacts of such design options can be quantified, and sensor performance can be shown to increase by an order of magnitude. However, if the valve design remains unchanged, the overall system performance may not improve at all.

For example, assume using a single analog transmitter versus a discrete switch increases the sensor RRF an order of magnitude from 200 to 2,000. If the logic box RRF is 10,000 and the RRF for the two valves together is 100, the overall system has an RRF of 94, as shown in Equation 1 and in Figure 2. Thus, the system is still just at the high end of SIL 1.

$$\mathrm{RRF}_{\text{system}} = \frac{1}{\dfrac{1}{2{,}000} + \dfrac{1}{10{,}000} + \dfrac{1}{100}} \approx 94 \qquad (1)$$

Dual arrangement increases performance

Traditionally, to improve system performance, the valves would be tested more often or dual valves would be used in series, possibly with a bleed valve between them. Such a dual arrangement is called one-out-of-two (1oo2), meaning that either valve can shut down the system. While this is safer than just one valve, the nuisance trip rate suffers since either valve can stop production.

Using the same numbers as in previous examples, but assuming each output valve is now configured as dual redundant and accounting for a small amount of common cause, the nuisance trip rate degrades from 10 to 8 years and the RRF increases from 94 to 900, as shown in Figures 1 and 2. Therefore, dual redundant valves can increase the overall safety by one order of magnitude. This is the traditional solution for valves in high-risk applications.

The obvious drawback, however, is cost. Not only is the capital equipment cost higher for twice as many valves, but maintenance test labor costs increase as well since there are now twice as many valves requiring periodic testing.

Can a simplex valve be safe?

Can a system be designed with simplex valves and still meet SIL 2 performance requirements? The answer is yes. The key is simple, reliable, limited-movement testing. If a valve could be partially stroked to obtain an 80% diagnostic coverage factor (the Pareto principle, also known as the 80/20 rule) and be tested automatically once a day (and still be fully tested once a year), the NTR goes back up from 8 to 10 years for simplex field devices, yet the RRF is 800, as shown in Figures 1 and 2. Therefore, a simplex valve can meet SIL 2 performance requirements.

If valves are never stroked, it can almost be guaranteed that they won't work when needed. Periodically stroking the valve actually increases the reliability of the valve (i.e., increases the MTBF). The RRF of 800 assumes the valve MTBF increased from the original 100 years to 300 years merely due to periodic testing.
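The combination rule in Equation 1 and the effect of partial stroke testing can be sketched numerically. The article does not spell out its partial-stroke formula, so the model below is an assumption: the fraction of failures covered by the partial stroke is treated as detected at the daily test interval, and the remainder at the yearly full test, each using pfd ≈ λ × TI / 2. The sensor RRF of 2,000 and logic box RRF of 10,000 are carried over from the preceding example, and the function names are hypothetical.

```python
def combine_rrf(*rrfs):
    """Overall RRF of subsystems in series: the individual pfd values (1/RRF) add."""
    return 1.0 / sum(1.0 / rrf for rrf in rrfs)


def pfd_with_partial_test(mtbf_years, coverage,
                          partial_interval_years, full_interval_years):
    """Average pfd of one valve with frequent partial tests and rare full tests.

    Assumed model: covered failures are found at the partial-test interval,
    uncovered failures only at the full-test interval.
    """
    failure_rate = 1.0 / mtbf_years
    covered = coverage * failure_rate * partial_interval_years / 2.0
    uncovered = (1.0 - coverage) * failure_rate * full_interval_years / 2.0
    return covered + uncovered


# Equation 1: transmitter (2,000), logic box (10,000), two simplex valves (100).
baseline_rrf = combine_rrf(2000.0, 10000.0, 100.0)

# Simplex valves with daily partial stroke (80% coverage), yearly full test,
# and the 300-year MTBF the article assumes for regularly stroked valves.
valve_pfd = pfd_with_partial_test(300.0, 0.80, 1.0 / 365.0, 1.0)
two_valves_rrf = 1.0 / (2.0 * valve_pfd)
tested_rrf = combine_rrf(2000.0, 10000.0, two_valves_rrf)

if __name__ == "__main__":
    print(f"Baseline system RRF: about {baseline_rrf:.0f}")
    print(f"System RRF with partial stroke testing: about {tested_rrf:.0f}")
```

With these assumptions the baseline comes out near 94 and the partially stroked simplex arrangement near 785, which is in the neighborhood of the RRF of 800 quoted above; the small gap is expected given that the model is a simplification.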

Terminology

MTBF: mean time between failure
NTR: nuisance trip rate
OSHA: Occupational Safety and Health Administration
pfd: probability of failure on demand
PSM: process safety management
RRF: risk reduction factor
SIL: safety integrity level
SIS: safety instrumented system

Cautions for reliability models

It's easy to get overly involved in reliability models and put too much faith in their answers. For example, hardware interlocks were left off the Therac-25 radiation therapy machine because a quantitative reliability analysis of the software showed it to be so good that the designers felt the interlocks weren't needed. Six patients were massively overdosed, several of them fatally, because of software errors. Thus, keep the following in mind:

- Reliability models are like captured foreign spies; if tortured long enough, they'll tell you anything.
- If the modeling process is automated on a computer, don't forget computers are known for their speed, not their intelligence. All programs are designed by people, so carefully check the assumptions, simplifications, and hidden agendas that may be buried in the computer code.
- No one particular modeling technique is more correct than another; all are merely approximations. Different assumptions and simplifications can change the answer by orders of magnitude.

Valve testing methods

Many methods and designs have been proposed for online testing of shutdown valves. The time-honored favorite is to install a bypass valve around every interlock valve. While some minor process upsets may be encountered, the process remains intact and running. Maintenance is also easier; if a valve fails, a bypass allows for changeout online.

This seems simple from the operating side, but drawbacks abound from other perspectives. Economics, obviously, is a large factor. The cost of the additional piping, full-size bypass valves, and instrumentation required for a large operating facility is astronomical. Labor costs to cut pipe and build access platforms for valves are a major portion of any interlock retrofit project. In addition, many existing installations do not have space available for adding piping and valves. When the unit has not been designed for this, the ergonomic assessment can get very interesting. Adding a bypass valve that cannot be readily accessed or is in the way of routine operations is not a practical solution.

Another bypass valve issue is the consequence of the valve being left in the wrong position after testing. This is an important consideration when designing interlocks since it effectively takes the interlock out of service. Many operating personnel believe that interlock bypass valves can be car-sealed in the appropriate position between testing cycles and, thus, be administratively monitored. However, anyone who has ever dealt with large facilities, some with extremely long car seal lists, may disagree. Because operating facilities are being pushed to decrease costs, the feasibility of adding many valves that must regularly be inspected to a car seal list should be taken into account when designing the interlock for long-term maintainability and reliability.

One method used to prevent this problem without increasing inspection requirements is to install limit switches on the bypass valves and alarm when the valves are not in the safe position. This increases I/O counts and wiring requirements and may add confusion for operators due to the additional alarms.

Figure 3 shows a typical installation sketch for two safety instrumented system (SIS) valve scenarios. One uses a device for partial stroke testing; the other uses a traditional bypass arrangement. Limit switches, which ensure the valves are in the proper position both during normal plant operation and during an interlock trip, are provided on both designs. The switches can also be used to document testing if the SIS has an associated sequence-of-events recorder. This documented testing is required by 29 CFR 1910.119 to prove that the required performance is being maintained.

Figure 3. A single testable valve (left), or a standard valve with bypasses (right). Which would you rather implement?

Online partial stroke testing involves stroking the valve through approximately 20% of its travel to prove that the valve is not stuck in position. It obviously does not test whether the valve will fully close and seal completely.

Another method is to use limit switches and actually measure valve movement, or else time the signal to the valve, thus only moving the valve partially. Limit switches are naturally prone to failure, and problems occur if they are out of adjustment. Valve stroke timing has also proven problematic for users willing to admit having tried it.

Some use analog control valves for safety applications. While analog valves are certainly capable of limited-movement testing, they are an expensive alternative compared to typical discrete valves and are more likely to leak, which is not desirable in most applications.

The benefits of a simple, dependable online testing device for such valves are easy to see. From the manufacturing perspective, testing online without a process disruption is very important. On the economic side, cost savings are great because hardware, piping, and labor costs are reduced. Some devices can also be installed without pulling the valve from the line. This saves time during a retrofit turnaround by not needing to clear the line for entry and saves time and labor for removal and reinstallation of valves.

In addition to testing the valve, operation of the logic solver output point, wiring, and fusing for energize-to-trip circuits and operation of the solenoid valve are verified. When looking at the total output system, testing with this type of device verifies approximately 85% of the interlock's output side. This 85% estimate can then be used in interlock availability calculations as the percent credit taken during testing to ensure the interlock still meets its required performance.

Partial stroke testing does not ensure the valve will shut off completely. However, neither does closing the valve completely while the valve remains in the line. Only removing the valve from the line and pressure testing it will verify complete closure and shutoff class.

Behind the byline


Paul Gruhn is a safety system specialist with Moore Products in Houston, Texas. He has a B.S. in mechanical engineering from Illinois Institute of Technology and is a licensed professional engineer in Texas. Paul serves on the national Chemical and Petroleum Industries Division board as the chairman of the Safety System Subcommittee and sits on the SP84 committee. Paul is the developer and instructor for ISA's two-day course EC50, Emergency Shutdown Systems.

Joe Pittman is an instrument engineering adviser with Arco in Channelview, Texas. He currently leads Arco's efforts for compliance with OSHA 29 CFR 1910 regulations as they apply to process safety systems. He was responsible for developing a set of risk-based standards and guidelines to define the classification, design, and maintenance of safety interlock systems.

Susan Wiley is a principal instrument engineer with Arco in Channelview, Texas. She has a B.S. in electrical engineering from the University of Houston. She currently manages safety interlock system retrofit and DCS upgrade projects for Arco.

Tom LeBlanc is national sales manager of Tyco Valves & Controls in Houston, Texas. He has a B.B.A. from the University of Texas at Austin. He has held several sales and marketing positions with Keystone, now Tyco, during the past 15 years.

Figure 4. Testable valve device

One valve/actuator manufacturer uses a simple technique to incorporate online partial stroke testing. An interlocking device is mounted between the actuator and valve body, as shown in Figure 4. This device is generic, can be mounted on any quarter-turn valve, and can be actuated manually or remotely. When placed in test mode, the actuator may be stroked approximately 20 degrees to verify valve movement. After testing, the actuator may then be moved back to its normal position.

The manually operated device is provided with an interlocking unit and proprietary key that allows only authorized users access to the system. Remote control enhances safety system integration. The remote device is supplied with integral proximity switches to provide positive open/close indication of the interlock device itself.

More solution benefits

Just as all systems on an aircraft must function in unison, so must integrated safety systems in the process industry. While obvious performance and safety benefits can be gained through limited-movement valve testing, additional benefits can be gained if testing, logging, and reporting can be automated, centralized, and simplified.

Currently, maintenance personnel must both perform and record field device testing. Records must then be maintained showing the tests were actually performed. Consider the benefits, however, of having the logic box perform the following:

- Send a self-test signal to the valve
- Read back that the test was successfully performed
- Log and store the test date, time, and results
- Print the report when requested

This would not only lower manual testing requirements and associated costs, but also greatly simplify the documentation requirements set forth in the PSM legislation.
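The four automated steps listed above (send, read back, log, report) can be sketched in a few lines. This is a hypothetical illustration, not any vendor's logic solver interface: the class names, the valve tag, and the two callables standing in for the output signal and the position read-back are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List


@dataclass
class StrokeTestRecord:
    """One logged partial stroke test result."""
    valve_tag: str
    timestamp: datetime
    passed: bool


class StrokeTestLog:
    """Runs partial stroke tests through caller-supplied I/O hooks and keeps a log."""

    def __init__(self) -> None:
        self.records: List[StrokeTestRecord] = []

    def run_test(self,
                 valve_tag: str,
                 send_test_signal: Callable[[], None],
                 read_movement_confirmed: Callable[[], bool],
                 now: Callable[[], datetime] = datetime.now) -> StrokeTestRecord:
        send_test_signal()                       # step 1: command the partial stroke
        passed = bool(read_movement_confirmed()) # step 2: read back the result
        record = StrokeTestRecord(valve_tag, now(), passed)
        self.records.append(record)              # step 3: log date, time, and result
        return record

    def report(self) -> str:
        """Step 4: plain-text report, one line per test."""
        return "\n".join(
            f"{r.timestamp.isoformat()} {r.valve_tag} {'PASS' if r.passed else 'FAIL'}"
            for r in self.records
        )


if __name__ == "__main__":
    log = StrokeTestLog()
    # Stub hooks: a real system would drive an output and read position switches.
    log.run_test("XV-101", lambda: None, lambda: True)
    print(log.report())
```

Passing the I/O hooks and the clock in as callables keeps the sketch independent of any particular hardware and makes the logging behavior easy to exercise with stubs.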
