
Fundamental Safety Engineering and Risk Management Concepts, 2012/2013 by N. C. Renton, M. J. Baker and H. Tan

BASIC CONCEPTS IN SAFETY ENGINEERING AND RISK MANAGEMENT

1. Introduction

Today, safety, reliability, and risk management play a vital role in our society. Engineers use these concepts to ensure that systems perform without unacceptable failures, protecting the health and wellbeing of people. Companies use them to ensure that they can return suitable profits for shareholders while meeting the expectations of society and stakeholders in terms of safety performance. The ideas of 'safety' and 'risk' have been framed into legislation with a view to making society adopt practices that reduce risk to an acceptable level.

2. Safety

One of the difficulties in studying safety, risk and reliability is that many of the required terms are also in common usage in the English language. Just as the words stress and strain have precise meanings to engineers that should not be confused with their everyday meanings, the terms used in safety engineering have scientific and engineering definitions, and it is very important to be clear about them. Most of the terms used during the course will be defined where they are first introduced in the lectures or notes. It is, however, helpful to define the term safety here, because its meaning tends to be taken for granted and because, unlike many other terms that will be used, it has no rigorous mathematical basis. A well-considered definition is as follows:

Safety is relative freedom from danger, or threat of harm, injury, or loss to personnel and/or property, whether caused deliberately or by accident.

The word safety therefore involves two concepts. The first is that of a safe state that one needs to be in, in order to feel or be safe. The second is the notion that the chance, or probability, of transferring from the safe state to an unsafe state should be reasonably small.

During this course we shall work to define what this means for engineering systems, how we can calculate the probabilities that failures will occur, and what changes to design are required to reduce these probabilities to an acceptable level, recognising that however hard we might try, nothing can be made completely safe.

It is interesting to note that when safety is considered from the perspective of the consumer of a technology or service, actual safety may differ dramatically from perceived safety. One bad experience can be magnified in the mind of the customer, inflating the perceived unreliability of the product. One plane crash in which hundreds of passengers die will immediately instil fear in a large percentage of the flying population, regardless of actual reliability data about the safety of air travel.

2. The Accident Sequence


The fields of safety and reliability are concerned with the study of a particular class of events, known as 'accident' or 'failure' events. Such events release the energy stored in the hazards present in the engineering system. The release of this energy usually results in undesired consequences such as fatalities, commercial losses, damage to plant, and/or environmental damage. These ideas are combined in the 'accident sequence', described as follows:

$\text{Hazard} \rightarrow \text{Accident/Failure Event} \rightarrow \text{Undesired Consequences}$    (1)

The accident sequence can be used to manage the risk associated with a complex engineering system by examining each of its elements in turn. It will act as a framework for the topics studied in the course.

3. Hazards

The term 'hazard' in the accident sequence refers to an object or substance that has the potential to inflict harm or damage on persons, property or the environment. The hazard has the potential to release or convert energy in any of its forms:

Potential energy - e.g. elastic strain energy, gravitational energy (height, internal pressure).
Kinetic energy - mass and speed (compressors, pumps, and other rotating equipment).
Chemical energy - energy stored in the bonds between molecules and atoms (hydrocarbon fluids, biological agents, sulphuric acid).
Atomic energy - energy stored in the bonds between sub-atomic particles in atoms (radioactive materials).

Identifying hazards, and the magnitude of the energy associated with them, is an important part of managing safety.

4. The Failure Event

The failure event represents the loss of function of the component or system being analysed. However, engineering systems contain random defects and design problems, environmental loads can take on extreme values, and new and unplanned operational challenges can occur. The combination of these factors introduces uncertainty about when the failure will actually occur. An example of the problem we are faced with is shown in Figure 1. The figure shows a histogram of the time to failure of a very large population of valve components. The components were nominally identical, and all operated in a similar environment.


Figure 1: The random variation in the time to failure of a valve component.

Prior to operating an individual valve, it is impossible to state a definitive value for the lifetime of the component, since the value varies in a random manner from one valve to the next. We could, however, use the data in the histogram to make a statement about the probability of the component lasting beyond a particular point in time.

This uncertainty about the future performance of systems is one of the great challenges of engineering. There are two possible approaches. Either a simple reactive approach is used, where preparations are made to cope with the effects of failure; or a proactive approach is used, where an attempt is made to identify the most likely performance of the system and steps are taken to prevent failure. These steps can be taken at all stages of the system's lifecycle: design, manufacturing, installation, operation, and decommissioning.

The proactive approach requires some way of dealing with the uncertainty. This is traditionally achieved by using conservative values of variables and safety factors, for example by assuming low yield strength and high loading in the design calculation. It is possible, however, to use probability to describe the uncertainty associated with failure. This allows us to look into the future and get a feel for whether to expect failure in the next day, month or year. Identifying which outcomes are more likely than others can lead to good decisions on design, replacement, and maintenance, which help prevent unwanted failure events from occurring.

One way of analysing this problem is to define a time interval of interest, say the useful lifetime of the component [0; t]. With some information on the uncertainty, it is possible to answer the question: what is the probability that the failure event occurs in this time interval?

This probability is known as the probability of failure, $P_f$, and it obeys the axioms of the theory of probability (see sections 2 and 3 on pages 10-67 of Lewis [6]). Probability theory will be used in the material on classical reliability techniques later in the course.
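As a rough illustration of how a probability of failure over [0; t] might be read off time-to-failure data such as that in Figure 1, the sketch below counts the fraction of observed lifetimes that fall inside the interval. The lifetimes are synthetic: the Weibull distribution, its parameters, and the function name `empirical_failure_probability` are assumptions made for this example and are not taken from the notes or from the valve data.

```python
import random


def empirical_failure_probability(times_to_failure, t):
    """Estimate Pf over [0, t] as the fraction of observed lifetimes <= t."""
    if not times_to_failure:
        raise ValueError("need at least one observed time to failure")
    failed = sum(1 for lifetime in times_to_failure if lifetime <= t)
    return failed / len(times_to_failure)


# Hypothetical data: synthetic lifetimes in years (assumed Weibull, scale 10, shape 2).
rng = random.Random(0)
lifetimes = [rng.weibullvariate(10.0, 2.0) for _ in range(10_000)]

for t in (1.0, 5.0, 10.0):
    pf = empirical_failure_probability(lifetimes, t)
    print(f"Pf over [0; {t:g}] years = {pf:.3f}, "
          f"probability of lasting beyond = {1.0 - pf:.3f}")
```

The estimate can only increase as t grows, which matches the intuition that a longer interval of interest gives the failure event more opportunity to occur.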


5. Consequences

Once an accident/failure event has occurred and the hazard potential has been released, a number of undesired consequences can result. Quantifying the exact consequences associated with a real failure event is, however, difficult: there is no way of knowing prior to the failure event exactly what will happen. One good example of the uncertainty associated with the consequences of failure events was the Buncefield depot incident in 2005, where a huge release of the hazard potential stored in 10 million litres of petrol, diesel, and aviation fuel resulted in zero fatalities [7]. This was largely due to the timing of the explosion: 6:01 am on a Sunday morning. Had the event occurred on a Monday at 10:00 am, the number of fatalities could well have been in the hundreds or thousands.

Which consequence magnitude should be used to make decisions? Using the worst-case scenario leads to very expensive designs and to resources being spent where they are not needed. Using the best-case scenario usually results in expensive failures. To complicate matters, in most real situations there are many shades of grey between these two extremes. Lindley [8] has demonstrated that the rational approach to decision making in the face of uncertainty is to use the expectation of the undesired consequences. If the undesired consequence of a failure event, for example fatalities (which can only occur as 0, 1, 2, 3, ...), is defined as C, able to take on values in the finite range $c_1 \le c \le c_2$, then the expectation is defined using the following:
$E[C] = \sum_{c = c_1}^{c_2} c \, P(C = c)$    (2)

where $P(C = c)$ is the probability that the variable C takes on the specific value c. In line with Lindley's theory, Equation (2) is a weighted average: each possible consequence value is weighted by its probability of occurring. Practical calculations of expectations through the use of event trees will be studied later in the course.

In addition to the uncertainty associated with the consequences following a failure event, there is the second issue of different kinds of consequences. The following are examples of the different types that can occur:

Human consequences (fatalities, injuries, psychological damage);
Commercial losses (loss of revenue and product, loss of sales);
Environmental damage (air, ground water, sea, habitat); and,
Reputational damage (share price, loss of customers, high staff turnover).

These different consequence types must somehow be assessed in any measurement of the effects of failure events. Incorporating different consequence types into an overall assessment is often done by converting the losses into a system of common units, usually financial. However, placing a value on preventing a fatality or a major environmental release brings its own difficulties. Who decides the cost of a life? Is it appropriate, or even necessary, to try? Understanding of the issues associated with these questions will be developed during the programme.
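As a small worked illustration of the expectation in Equation (2), the sketch below computes the probability-weighted average of a discrete consequence variable. The fatality distribution is invented for the example and the function name `expected_consequence` is hypothetical; neither comes from the notes or from any real incident data.

```python
import math


def expected_consequence(pmf):
    """Return E[C] = sum over c of c * P(C = c) for a discrete consequence C.

    `pmf` maps each possible consequence value c to its probability P(C = c).
    """
    total_probability = sum(pmf.values())
    if not math.isclose(total_probability, 1.0, abs_tol=1e-9):
        raise ValueError(f"probabilities must sum to 1, got {total_probability}")
    return sum(c * p for c, p in pmf.items())


# Hypothetical distribution of fatalities, given that the failure event has occurred.
fatality_pmf = {0: 0.90, 1: 0.06, 2: 0.03, 10: 0.01}

print(f"E[C] = {expected_consequence(fatality_pmf):.2f} fatalities per failure event")
```

With these assumed numbers the expectation is about 0.22 fatalities per failure event: neither the best case (0) nor the worst case (10), but an average weighted by how likely each outcome is.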


6. Risk

Having examined the three individual terms in the accident sequence, the concept of risk is now introduced to bring all three together. One of the difficulties in studying the field of risk management is the level of confusion surrounding the meaning of the word risk. Consider the following:

1. That's a risk.
2. The risk of that happening is very small.
3. There is a small risk.
4. That looks risky.
5. We need to manage Risk.
6. What a massive risk.

In the above, risk is being used to describe different concepts: hazard (1, 6), probability (2, 3), and a combination of probability and consequences (4, 5). These differences are often not a problem in everyday language, but they are a problem when it comes to using risk to describe the performance of an engineering system. Imagine the problems if temperature or load meant different things to different people! You don't have to look too far to see the issue: NASA lost a $125M spacecraft in 1999 because two engineering teams were working in different units.

A number of different definitions for risk have been proposed [9][10]. (The Royal Society gathered together experts from a range of disciplines to discuss the nature of risk in 1992. The resulting report contains nine different definitions [11].) The simplest possible definition of use for this course is as follows:

$\text{Risk} = \text{Probability of the failure event occurring} \times \text{Undesired consequences}$    (3)

Risk is the product of the failure event probability and the failure event consequences. A simple example demonstrates the idea. If the consequences of a minor hydrocarbon release are estimated at 100,000 and the probability of the failure that led to the release is 0.01, then the risk associated with the failure event is simply 0.01 × 100,000 = 1,000, expressed in the same units as the consequences. More generally, if the probability of failure in the time interval [0; t] is denoted $P_f$, and the expectation or 'average' consequences E[C], then Equation (3) can be written as:

$\text{Risk} = P_f \times E[C]$    (4)

This fundamental result is the basis for the material studied in the course. It will be developed and studied over the coming weeks.
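The product in Equation (4) can be sketched in a few lines. The example reuses the minor hydrocarbon release figures quoted in this section (probability 0.01, consequences 100,000); the function name `risk` and its input check are choices made for the illustration only.

```python
def risk(probability_of_failure, expected_consequence):
    """Risk = Pf * E[C], as in Equation (4)."""
    if not 0.0 <= probability_of_failure <= 1.0:
        raise ValueError("probability of failure must lie between 0 and 1")
    return probability_of_failure * expected_consequence


# Figures from the minor hydrocarbon release example in the text.
release_risk = risk(0.01, 100_000)
print(f"Risk of the release event: {release_risk:,.0f}")
```

Because risk is a product, a rare event with severe consequences and a frequent event with mild consequences can carry the same risk, which is why both terms have to be estimated before failure events are compared.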

7. ALARP

ALARP stands for 'as low as is reasonably practicable'. What it means is that, unless the risk is excessively high (in which case the activity must be stopped immediately), the risk is allowable at its assessed value only if the cost of risk reduction is disproportionately high compared with the benefits gained (i.e. the reduction in risk). Otherwise, steps must be taken to reduce the risk until it becomes as low as is reasonably practicable.
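The ALARP reasoning above can be sketched as a simple three-way decision. This is only an illustration under stated assumptions: the notes do not say how 'excessively high' or 'disproportionately high' are quantified, so the `intolerable_limit` and `disproportion_factor` parameters, the function name, and all of the numbers are hypothetical. The sketch also assumes that risk, risk reduction, and cost are expressed in the same financial units.

```python
def alarp_decision(assessed_risk, intolerable_limit, reduction_cost,
                   risk_reduction, disproportion_factor):
    """Classify a risk following the ALARP wording in the text.

    intolerable_limit and disproportion_factor are assumed inputs; the notes
    give no values or method for choosing them.
    """
    if assessed_risk > intolerable_limit:
        return "stop the activity"
    if reduction_cost > disproportion_factor * risk_reduction:
        # Cost is disproportionate to the benefit: risk is allowable as assessed.
        return "accept the risk at its assessed value"
    return "reduce the risk"


# Hypothetical figures, all in the same financial units.
print(alarp_decision(1_000, 50_000, reduction_cost=20_000,
                     risk_reduction=400, disproportion_factor=3))
print(alarp_decision(1_000, 50_000, reduction_cost=900,
                     risk_reduction=400, disproportion_factor=3))
```

In the first call the reduction measure costs far more than the assumed multiple of the benefit, so the risk is accepted as assessed; in the second the measure is comparatively cheap, so the risk must be reduced.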

References

[1] Health and Safety Executive. Offshore Injury, Ill Health, and Incident Statistics 2005/2006. HID Statistics Report, 2007. URL http://www.hse.gov.uk/offshore/statistics/hsr0506.pdf.
[2] J. Uff. The Southall Rail Accident Enquiry Report. Health and Safety Executive, 2000.
[3] Lord Cullen. The Ladbroke Grove Rail Enquiry. Part 1 Report. Health and Safety Executive, 2000.
[4] M. Wright and S. Marsden. Changing business behaviour - would bearing the true cost of poor health and safety performance make a difference? RR436, Health and Safety Executive, 2002. URL http://www.hse.gov.uk/research/crr_pdf/2002/crr02436.pdf.
[5] K. Haefeli, C. Haslam, and R. Haslam. Perceptions of the cost implications of health and safety failures. RR403, Health and Safety Executive, 2005. URL http://www.hse.gov.uk/research/rrhtm/rr403.htm.
[6] E. E. Lewis. Introduction to Reliability Engineering. John Wiley and Sons, 2nd edition, 1996. ISBN 978-0-471-01833-9.
[7] Buncefield Major Incident Investigation Board. Initial report to the Health and Safety Commission and the Environment Agency of the investigation into the explosions and fires at the Buncefield oil storage and transfer depot, Hemel Hempstead, on 11 December 2005. HMSO, 2006.
[8] D. V. Lindley. Making Decisions. 1985.
[9] Health and Safety Executive. The tolerability of risk from nuclear power stations. HMSO, 1992.
[10] BS 4778-3.1:1991. Quality vocabulary. Availability, reliability and maintainability terms. Guide to concepts and related definitions. British Standards Institution, 1991.
[11] The Royal Society. Risk: Analysis, Perception and Management. Report of a Royal Society Study Group, 1992.
