
Eng Part

Hierarchy of controls

Risk

- Effect of uncertainty on objectives


- In terms of consequences of an event and associated likelihood of occurrence

Events

- Occurrence or change in particular set of circumstances

Consequence

- Outcome of an event affecting objectives

Principles of risk mgmt.

- Creates value
- Integral part of org processes
- Part of DM
- Addresses uncertainty
- Based on best available info
- Tailored
- Takes human/cultural factors into account
- Transparent/inclusive
- Dynamic, iterative and responsive to change
- Facilitates continual improvement/enhancement of org

Failure mode

- The effect by which failure is observed on a failed item


o Eg: fracture of a beam is a failure mode resulting from a mechanism (corrosion/fatigue), or
a bearing failing to turn is a failure mode resulting from a mechanism (loss of lubrication)
Risk mgmt. process

Establish context

- Internal/external context (ext: SH, social, cultural, political;
int: policies, obj, strategies in place to achieve them)
- Risk mgmt. context
- Develop criteria (should reflect org's values, obj, resources;
defined at beginning of risk mgmt. process)
- Define structure

Risk identification

- Develop list of sources of risk/events (areas of impact,
events/their causes and potential consequences)

Analyse risks

- Identify existing controls


- Determine consequence, likelihood, lvl of risk

Evaluate risks

- Compare against criteria, set priorities

ALARP

Treat risks

- Identify/assess options
- Prepare/implement treatment plans
- Analyse/eval residual risk

Identify options for risks with positive outcomes

- Engage in activity
- Enhance consequence
- Retain residual opportunity
- Share opportunity
- Enhance likelihood of outcome

Identify options for risks with negative outcomes

- Avoid risk
- Change consequence
- Retain risk
- Share risk
- Reduce likelihood of outcome

Reliability

- Prob that an item can perform its intended function for a specified interval under stated
conditions
- R(t) [survival fnc]: prob that the item does not fail in the time interval (0, t]
- The longer the time, the more probable it is that a failure has occurred (reliability decreases)
- The longer the time, the less probable it is that the item survives

Function

- Items designed to perform one or more required functions

Quality

- Conformity of product to its specification

Non-repairable items: item considered up to its first failure

Time to failure T: continuously distributed RV with pdf f(t)

Hazard rate h(t): conditional prob (per unit time) that an item fails during the next interval, given that it has survived to time t
(increasing h(t) means MORE likely to fail in next instant)

o Rising = wear out (maintenance: consider fixed time replacement, more so as the rate of
increase of the curve gets steeper)
o Constant = random pattern (maintenance: no fixed time replacement, since after
replacing an item at any time, the new identical item would have an equal chance of failing
in the next interval)

Hazard rate (Bathtub curve)

1st interval: material/manufacturing defects

2nd interval: sudden stresses, extreme conditions

3rd interval: wear-out failures, h(t) increases as equipment deteriorates

- A failure distribution with a constant hazard rate is the exponential distribution

Memoryless property:

- Hazard rate for exp distribution = horizontal line
- Prob of failing in the next instant, given that the item has survived so far, is the same as at any earlier time
- Time indep
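
The memoryless property can be checked numerically. This is a minimal sketch, assuming an exponential time to failure with R(t) = exp(-lam * t); the hazard rate value and the time intervals are hypothetical.

```python
import math

def reliability(t, lam):
    """Survival function R(t) = exp(-lam * t) for a constant hazard rate lam."""
    return math.exp(-lam * t)

def conditional_survival(t, s, lam):
    """P(T > t + s | T > t) = R(t + s) / R(t)."""
    return reliability(t + s, lam) / reliability(t, lam)

lam = 0.002  # hypothetical hazard rate, failures per hour

# Chance that a new item survives its first 100 h
new_item = reliability(100, lam)
# Chance that an item already run for 500 h survives a further 100 h
aged_item = conditional_survival(500, 100, lam)

# Both values agree: age gives no information about the next interval
print(new_item, aged_item)
```

This is why fixed time replacement gains nothing when the hazard rate is constant: the replacement item is no more reliable over the next interval than the item it replaced.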

Maintenance:

- Restore asset function when failure occurred


- Prevent system/asset failures

Consumer protection

- Safety requirements
- Environmental requirements

Suspensions/censorings:

- Component still functioning at the time data was obtained


- Obtained from: equipment currently in service, preventive replacements
- Inclusion of suspensions

Right censored (suspended data)

- Component still functioning at the end of the time over which data was obtained
- Don't see failure due to the asset being removed from the test
- Obtained from: equipment currently in service, preventive replacements

Left censored

- Don't know the start date of operation

Uncensored data: data with no suspensions

To determine eta and beta from F(t) graph

- Find eta where F(t) = 0.632

Weibull parameters

Beta (shape/what type failure):

- Beta<1: wear-in failures


- Beta=1: random failures
- Beta>1: wear-out failures

Eta (characteristic life/estimate MTTF):

- Scales distribution over time axis


- Time where 63.2% of components have failed F(eta) = 0.632
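
A short sketch of the role of the two parameters, assuming the standard two-parameter Weibull forms F(t) = 1 - exp(-(t/eta)^beta) and h(t) = (beta/eta)(t/eta)^(beta-1). The value of eta is hypothetical.

```python
import math

def weibull_cdf(t, beta, eta):
    """F(t) = 1 - exp(-(t / eta) ** beta), two-parameter Weibull."""
    return 1.0 - math.exp(-((t / eta) ** beta))

def weibull_hazard(t, beta, eta):
    """h(t) = (beta / eta) * (t / eta) ** (beta - 1)."""
    return (beta / eta) * (t / eta) ** (beta - 1)

eta = 1000.0  # hypothetical characteristic life, hours

for beta in (0.5, 1.0, 3.0):
    # F(eta) = 1 - exp(-1), so each line shows 0.632 whatever the shape
    print(beta, round(weibull_cdf(eta, beta, eta), 3))
```

Evaluating `weibull_hazard` at two ages shows the three regimes: it falls with age for beta < 1, stays constant for beta = 1, and rises for beta > 1.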
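
Regression analysis:

- $y = a + bx$ where a = intercept, b = gradient
- $Y = \beta_0 + \beta_1 X_1 + \varepsilon_1$ where $\varepsilon$ = error term or residual, Y = dependent variable
- for Weibull:
o $Y = \ln \ln \frac{1}{1 - F(t)}$, $X = \ln(t)$, $\beta_1 = \beta$, $\beta_0 = -\beta \ln(\eta)$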
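
The Weibull regression can be sketched as an ordinary least-squares fit of Y = ln ln(1 / (1 - F(t))) against X = ln(t), where the slope estimates beta and the intercept equals -beta ln(eta). This sketch assumes complete (uncensored) data and uses Bernard's median-rank approximation F_i = (i - 0.3) / (n + 0.4) to assign F to each ordered failure time. Neither assumption comes from the notes, and the function name is hypothetical.

```python
import math

def fit_weibull(failure_times):
    """Estimate (beta, eta) by least squares on the linearised Weibull CDF.

    Assumes uncensored data and Bernard's median ranks (both are assumptions).
    """
    times = sorted(failure_times)
    n = len(times)
    xs = [math.log(t) for t in times]
    ys = [
        math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4)))
        for i in range(1, n + 1)
    ]
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx                  # gradient = beta
    intercept = y_bar - slope * x_bar  # intercept = -beta * ln(eta)
    beta = slope
    eta = math.exp(-intercept / slope)
    return beta, eta

# Hypothetical failure times in hours
beta_hat, eta_hat = fit_weibull([105.0, 210.0, 340.0, 480.0, 650.0])
print(beta_hat, eta_hat)
```

Suspended items would need adjusted ranks before fitting, which this sketch does not attempt.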

Steps for project eval & DM

1) Define decision criteria


2) Define base case
3) Collect data against each criterion for the base case
4) Calc $ values for financial criteria/explore sensitivities
a. Projects initiated if estimated benefits outweigh costs
b. Examples of indirect/intangible benefits: business opp, social well-being, economic
activity, customer loyalty, public perception, reduced likelihood/conseq lvls
5) Check approp risks have been evaluated
6) Eval base case against decision criteria

Personal safety and Process safety

Personal safety: Low severity, high freq risks (OHS)

Process safety: high severity, low freq risks

Process safety:
Managing integrity of operating systems and process that handle hazardous substances

- Relies on good design principles, engineering, operating and maintenance practices


- Good design/safety mgmt. sys not effective unless accompanied by good safety culture
- Culture = function(behaviour, shared values)

3 capital model:

- Human: knowledge, skills, experience of ind


- Org: routines/sys
- Social: interactions, leadership, culture
Bow Tie Diagram

Eg: unwanted incident is a large gas leak on the rig

Example

Safety culture:
product of ind/group values, attitudes, perceptions, competencies, and patterns of
behaviour that determine the commitment to, and the style and proficiency of, an org's
health/safety mgmt

- It's pervasive, difficult to define, shared amongst ppl, communicates what's important,
expressed through activities

NPV (Net Present Value)

- deals with TIMING and RISK of cash flows by discounting future cash flows

r = discount rate
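
A minimal sketch of the discounting, assuming the usual form NPV = sum of CF_t / (1 + r)^t with end-of-year cash flows and t = 0 as today. The cash flows and the rate are hypothetical.

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[t] occurs at the end of year t (t = 0 is today)."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))

# Hypothetical project: 1000 spent today, 400 saved at the end of each of 3 years
flows = [-1000.0, 400.0, 400.0, 400.0]

print(npv(0.00, flows))  # undiscounted: benefits exceed cost by 200
print(npv(0.10, flows))  # at r = 10% the NPV is slightly negative (about -5.26)
```

The same cash flows look attractive undiscounted and marginally unattractive at 10%, which is the timing effect the discount rate is meant to capture.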
Payback

- amount of time required for the present value of savings to equal the present
value of costs
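
A sketch of a discounted payback calculation under simplifying assumptions that are not in the notes: the whole cost is paid up front at t = 0, savings arrive at the end of each year, and the answer is reported in whole years.

```python
def discounted_payback(rate, cost, annual_savings):
    """First year in which the cumulative PV of savings covers the up-front cost.

    Returns None if the cost is never recovered within the savings given.
    """
    cumulative = 0.0
    for year, saving in enumerate(annual_savings, start=1):
        cumulative += saving / (1.0 + rate) ** year
        if cumulative >= cost:
            return year
    return None

# Hypothetical figures: cost 1000, savings of 400 per year for 5 years
print(discounted_payback(0.00, 1000.0, [400.0] * 5))  # 3
print(discounted_payback(0.10, 1000.0, [400.0] * 5))  # 4
```

Discounting pushes the payback out by a year here, because the first three years of savings are worth just under 1000 at a 10% rate.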

Rate of return (r*)

- discount rate at which the present value of cash flows in equals the present value of cash flows out (NPV = 0)
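
One way to find r* is to search for the rate at which NPV crosses zero. The sketch below uses bisection, which is one possible method rather than the one prescribed by the notes, and assumes NPV changes sign exactly once between the two bounds. The function names and cash flows are hypothetical.

```python
def npv(rate, cash_flows):
    """Net present value with cash_flows[t] at the end of year t."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))

def rate_of_return(cash_flows, low=0.0, high=1.0, tol=1e-9):
    """Find r* with NPV(r*) = 0 by bisection on [low, high]."""
    f_low = npv(low, cash_flows)
    if f_low * npv(high, cash_flows) > 0:
        raise ValueError("NPV does not change sign on the search interval")
    while high - low > tol:
        mid = (low + high) / 2.0
        f_mid = npv(mid, cash_flows)
        if f_low * f_mid <= 0:
            high = mid                  # root lies in the lower half
        else:
            low, f_low = mid, f_mid     # root lies in the upper half
    return (low + high) / 2.0

# Hypothetical project: spend 1000 today, receive 1100 in one year
print(rate_of_return([-1000.0, 1100.0]))  # close to 0.10
```

Cash flows that change sign more than once can have several such rates, in which case a single r* is not well defined.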

Equivalent Annual Cost (EAC)

- cost per year of owning and operating an asset over its entire lifespan
- CRF: Capital Recovery Factor
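
A sketch of EAC using the standard capital recovery factor CRF = r(1 + r)^n / ((1 + r)^n - 1). It assumes a constant annual operating cost and no salvage value; both are simplifications added here, and the figures are hypothetical.

```python
def capital_recovery_factor(rate, years):
    """CRF: converts a present value into equal annual payments over `years`."""
    if rate == 0:
        return 1.0 / years
    growth = (1.0 + rate) ** years
    return rate * growth / (growth - 1.0)

def equivalent_annual_cost(capital_cost, annual_operating_cost, rate, years):
    """Annualised capital cost plus the (assumed constant) yearly operating cost."""
    return capital_cost * capital_recovery_factor(rate, years) + annual_operating_cost

# Hypothetical comparison of two assets with different lives at r = 10%
short_life = equivalent_annual_cost(1000.0, 50.0, 0.10, 2)
long_life = equivalent_annual_cost(1800.0, 50.0, 0.10, 5)
print(short_life, long_life)
```

Putting both options on a per-year basis is what allows assets with different lifespans to be compared directly.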

Life cycle costing (LCC)

- Life-cycle: time interval between a product's recognition of need or opp and its disposal
- Consumer perspective: business need, purchase, install, commission, operating and
maintenance, disposal
- Manufacturing perspective: product conception, design, prototype, production, logistics,
warranty/support, phase out
o Asset mgmt.: systematic/coordinated activities and practices through which an org manages its
assets, and their associated performance, risks and expenditures over their life cycle,
for the purpose of achieving its org strategic plan

1) Strategic/functional lvl
a. business need, meeting strategic goals, operation req meet standards,
constraints
2) Baseline for cost breakdown structure and cost for each year

12) Trade-off for LCC; operational availability, intrinsic availability, spares cost, manpower
cost, prob of mission success
- Objectives of LCC
o Calculate $ value representing LCC of product as an input to a DM/eval process
together with other inputs. Cost based on defined need
o Support mgmt. considerations affecting decisions during any phase
o Identify attributes of product which influence LCC (Cost drivers) so it can be
managed
Iceberg

When to use LCC

RCM (Reliability Centred Maintenance)

- Actions during life cycle of an item intended to retain it in a state, or restore it to a state
which it can perform the required function
- Obj:
o detect/correct incipient failures before they occur or develop into defects
o detect hidden failures
o increase cost-effectiveness of maintenance program
- Categories of failure conseq:
o Hidden or Evident:
Failures where the loss of function, under normal circumstances would not
be detected (protective device not fail-safe)
Proactive tasks technically feasible if reduce risk of multiple failure to a low
lvl
o Safety/Environmental impact:
Failure has safety conseq if it creates an intolerable risk to personnel safety
Failure has envir conseq if it creates an intolerable risk of envir damage
Proactive tasks worth doing if reduce prob of failure conseq to low lvl
o Operational impact:
Op conseq include: lost production, loss of product quality, increase
operating costs, loss of customer service
Proactive maintenance tasks worth doing if over a period of time it cost less
to do the task than cost of the conseq

Fixed interval task selection

- Applicable when cond prob of failure starts to rapidly increase after a specified age, most
items will survive until that age
Condition monitoring task selection

- P-F interval: time between potential failure (P) and functional failure (F)
- Freq of condition monitoring task determined by P-F interval
- P point: point at which a specific detection technique can detect deterioration

Run to failure task selection


- Applicable for failure modes without safety/envir conseq when it is cheaper, in the long run,
to allow the item to fail

Failure finding task selection


- Applicable for failure modes with hidden failure conseq when possible to test device without
destroying it, possible to test most of the device or sys, does not increase risk of inducing
multiple failure

Developing tactics

1) Function: desired standard of performance


2) Functional failure: occurs when equipment does not perform its intended function
3) Failure mode: how does it fail and what causes each functional failure
4) Failure effects: what happens when each failure occurs
5) Failure conseq: environmental, safety, economics
6) Proactive tasks: what to do to predict/prevent each failure
7) Default actions

Adopting and adapting


- Template tactics from other, similar equipment
- Cut and paste tactics from other, identical equipment
- Using manufacturer's recommended tactics

Hazard/Risks

- a potential source of harm



- Hazard level: combination of severity and likelihood of occurrence of hazard


- Risk level: hazard lvl combined with likelihood of hazard leading to the event + exposure

Bow tie diagram

- Determine need for barriers


- Swiss cheese model of accident causation
o Incidents occur when one or more holes in each slice momentarily align
o This allows a hazard to pass through, leading to an incident
o Holes: active and latent conditions
o Barriers are needed to reduce the chance of the holes in the swiss cheese lining up
- Active failures: unsafe acts or equipment failures directly linked to an initial hazardous event
- Latent failures: contributory factors in the sys that may have been present and not corrected
for some time until they contribute to an incident
- Barriers should be indep
o Examples: process/procedures, eng controls, training, physical separation, monitor

Asset integrity mgmt. (AIM)

- Focuses on preventing major incidents

3 common measures of risk

1) Societal risk
- Freq vs number of fatalities (F-N) curve
2) Potential loss of life (PLL)
- Estimate risk to groups working at specific sites
- Fatalities/year
3) Location specific individual risk (LSIR)

FAR (fatal accident rate)

- Number of fatalities per 100 mill worked (exposed) hours


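- Eg: 8 fatalities per 100000 workers/yr, assuming 40 hr week, 48 weeks per yr working period
- $FAR = \frac{8}{100000} \cdot \frac{1}{40} \cdot \frac{1}{48} \approx 4.17 \times 10^{-8}$ fatalities per exposed hour, i.e. about 4.2 per 100 mill exposed hours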
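
The FAR example can be reproduced in a few lines. This is a sketch: the function name is hypothetical, and the inputs are the figures from the example (8 fatalities, 100000 workers, 40 hr week, 48 weeks per yr).

```python
def fatal_accident_rate(fatalities, workers, hours_per_week, weeks_per_year):
    """Fatalities per 100 million (1e8) exposed hours."""
    exposed_hours = workers * hours_per_week * weeks_per_year
    return fatalities / exposed_hours * 1e8

far = fatal_accident_rate(8, 100_000, 40, 48)
print(round(far, 2))  # 4.17
```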

Permit-to-work

- Formal written sys used to control certain types of work that are potentially hazardous
- Specifies work to be done and precautions to be taken

Software threat categories: Integrity, Security

- Knowledge comes from:


o Info provided by clients and end-users
o From domain experts
o Opinions of the programmer
o Data provided by sensors
o Learning systems
- Verification/validation
o Verification: assess correctness of code
Multiple indep teams to build similar sys, leading to parallel tests
Multiple indep teams to review/test sys
External teams to try penetrate security
- Quality assurance: prevention of errors (gathering/design, program techniques), inspections,
testing
- Methods
o Static: tests which do not involve executing the program
Code inspections: formal, efficient, economical in finding faults in design
Procedure: overview, prep, inspection, review, follow up
Code walk throughs: informal meeting
Reviews are proactive tests
o Dynamic: tests which involve executing the code

Assumptions of reliability models

- defects indep, size of sys constant


- Defects may not be indep (they come in clusters and are not uniformly distributed)

Semantic analysis

- Based on meaning of program


- Formal proofs: prove given program satisfies required property
- Control flow analysis
- Data flow analysis
- Symbolic execution
Test: to detect diff between specified (required) and observed (existing) behaviour

- White box test: internal structure of component


- Black box test: input/output behaviour of component

EE Week

Functional safety

- Part of the overall safety relating to Equipment Under Control (EUC) that depends on correct
functioning of safety-related control system and other risk reduction measures
- Subset of safety

Safety

- Freedom from unacceptable risk

Risk

- Combination of probability of occurrence of harm and severity of that harm

AS61508 BINGO Safety Lifecycle (diagram), and Risk Reduction Process

SIS: Safety Instrumented System

SIF: Safety Instrumented Function

- Likelihood of safety function being performed satisfactorily


1) Hazard identification: HAZOP (Hazard/Operability)
2) Risk Analysis/selection of SIL: LOPA (Layer of Protection Analysis)
- assessing adequacy of protection layers used to mitigate process risk
[Figure: LOPA chain. Initiating event freq -> protection layer 1 (risk reduction factor 1) -> protection layers 2..N (risk reduction factors 2..N) -> outcome (accident) freq]
Trigger <-Prevention Measures -> Incident <- Mitigation Measures -> Harm
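
The LOPA arithmetic can be sketched as the initiating event frequency divided by the risk reduction factor of each protection layer. This assumes the layers are independent of each other and of the initiating event; the function names and all figures are hypothetical.

```python
from math import prod

def outcome_frequency(initiating_frequency, risk_reduction_factors):
    """Accident frequency after all protection layers (independent layers assumed)."""
    return initiating_frequency / prod(risk_reduction_factors)

def remaining_risk_reduction(outcome_freq, tolerable_freq):
    """Extra risk reduction factor still needed to reach the tolerable frequency."""
    return max(1.0, outcome_freq / tolerable_freq)

# Hypothetical case: initiating event 0.1 per yr, two layers with factors 10 and 100
freq = outcome_frequency(0.1, [10.0, 100.0])
gap = remaining_risk_reduction(freq, 1e-5)
print(freq, gap)  # about 1e-4 per yr, so a further factor of about 10 is needed
```

A remaining gap like this is the kind of figure that feeds the selection of a SIL for an added safety function.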

3) Define Safety Requirements: SRS (Safety Requirement Specification)


4) SIL verification (SIL: Safety Integrity Level):
- Method to show that SIS meets requirements of particular SIL level
- V&V report (Verification/Validation): will the sys be reliable enough
5) Ongoing operation, maintenance, repair (Ops & maintenance plans)

Standards for safety

- AS 4024:2006 Safety of Machinery


o Safety categories (CAT)
- AS 61508/61511/62061 Functional Safety
o SIL
o AS61508: addresses functional hazards of new tech advances, safety LC to avoid sys
faults during design/development/commissioning/operation/maintenance
functional safety assessments of components/sys address the correct
performance of the assigned safety fnc as required for necessary lvl of risk
reduction

Safety Categories

Quick release hooks

- Electric capstan: advanced mooring system, ensure vessel is secured fast to jetty providing
solid, reliable anchor points for mooring lines
- Component of integrated sys, provide safe method of securing vessel whilst alongside jetty
- If required, releasing lines even under full tension
- Integral capstan haul in each mooring line
- Hooks are specified with capacity of 150 tonnes
- Double, triple, quadruple hook units selected depending on jetty layout/vessel parameters
- Once mooring line attached to hook, the line is tensioned by the shipboard
- Risks: snapback -> line breaks, and there will be damage within the snapback region
o Avoid standing behind or near a line under tension
o Never stand on or walk over taut lines
o A line could come under sudden additional tension at any moment
o Yellow areas are there for a reason

Car dumper tunnel

- Low O2 lvl in car dumper tunnel


- Hazardous atmosphere
o Excessive dust
Treatment: extraction, moisture addition, PPE requirements (masks)
o Toxic atmosphere by hydrocarbon decomposition in sumps
Oil/water separator, air ventilation

Self-driving beer

Safety lifecycle impact

1) Addressing over-engineered loops


a. Cheaper design/build
b. Less maintenance, higher availability
2) Addressing under-engineered loops
a. Meet min safety requirements
b. Improved reliability, reduced insurance costs
c. Demonstrable duty of care

SIL
AS 4024: Safety of Machinery

Qualitative

- Risks identified/analysed
- Risks pre-controls used to specify requirements
- Assumes design is right
- No consideration of diagnostics/maintenance
effectiveness, common-cause failures, MTBF

CAT1: Hazard analysis/risk assessment

CAT2: Risk reduction measures

CAT3: Safety req

CAT4: Design Critical risk

CAT5: Validation

- Validate the design meets CAT4 req

SIL:

- Concept
- Scope
- Hazard

how well does each concept address the root causes?


Concept                  Risk assessment       Risk reduction
Category                 Risk matrix metric    Unknown
Safety Integrity Level   Quantified value      Calculated value
Client 35%, designer 65%

Verification/Validation example (traffic lights)

1) Verification: have we built it right?


2) Validation: have we built the right thing? (what the client wanted)?

Risk matrix

Hierarchy of controls

Risk spectrum
Incident/accident causation

- Accident: event where a failure leads to at least some undesired -ve conseq
- Near miss: event where an accident could have occurred, but all undesired -ve conseq have
been avoided
- Incident

Causal factors: immediate/direct causes

Root causes: contribute to causal factors

Active failures/errors: occurs at end of operations with immediate effects (tech faults/human error)

Latent failures: effects lie dormant for a long time, only evident when combined with other factors
(planning, design, policies, procedures, etc.)

Models (linear)

- Heinrich's domino theory
o Main domino to target is the unsafe act: taking this away will stop the rest of the
domino seq
- Single root cause model
o Root cause -> Accident
o Favours human error, assigns blame
- Extended root cause model

o Root cause -> Cause -> Cause -> Accident


o Considers underlying causes, but ignores possibility of multiple concurrent contributing factors
- Multi-cause joint effects model
o Multiple RC can contribute

- Extended multi-cause joint effects model


o Multiple RC with direct/indirect pathway to contribute

Models(complex)

- STAMP (Sys Theoretic Accident Model and Process)


o Sys: interrelated components kept in state of dynamic equilibrium by feedback loops
of info and control
o Safety mgmt. controls tasks/imposes constraints to ensure sys safety
o Accident investigations: why controls that were in place failed to detect/prevent
changes that led to accident
- FRAM (Functional Resonance Analysis Method)
o Identify how sys should have functioned to succeed
o Understand variability of functions
RCA (Root Cause Analysis)

- Investigate/categorise root causes of events with safety, health, envir, quality, reliability,
production impacts (what, why, how)
o Reactive method: when performed after incident occurred
o Proactive method: when used to audit systems/processes (forecast possibility of an
incident/undesired event before it occurs)

Steps in incident investigation

1) Immediate incident scene mgmt


2) Set up investigation team
3) Evidence collection
4) Analyse data (event time seq/chart, incident causation model, root causes)
5) Develop recommendations
6) Record, report, present investigation findings
7) Implement actions
8) Review actions
9) Capture/disseminate broader lessons learnt

Success factors

Difficulties
5 why analysis

- To identify RC

Fishbone diagram (cause/effect)

- Splits into diff categories

Fault tree analysis

- Use logical pathways to connect underlying causes to higher lvl causes


- Causes connected through an AND gate are collectively necessary and sufficient to produce
the higher lvl cause
- Causes connected through an OR gate to a higher lvl event are individually sufficient
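
If probabilities are attached to the underlying causes, the AND and OR gates translate into simple formulas. This sketch assumes the input events are statistically independent, which the notes do not state. The tree and its probabilities are hypothetical.

```python
from math import prod

def and_gate(probabilities):
    """All inputs must occur: P = product of p_i (independent inputs assumed)."""
    return prod(probabilities)

def or_gate(probabilities):
    """Any single input is sufficient: P = 1 - product of (1 - p_i)."""
    return 1.0 - prod(1.0 - p for p in probabilities)

# Hypothetical tree: top event = (pump fails AND standby fails) OR power loss
both_pumps = and_gate([0.1, 0.2])
top_event = or_gate([both_pumps, 0.05])
print(both_pumps, top_event)  # about 0.02 and 0.069
```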

Causal tree analysis

- Inclusion of situational factors that contributed


- Underlying factors necessary
Event time seq

Safety case

- Document produced by operator of facility


o Identifies hazard/risks
o How risks are controlled
o Safety mgmt. to ensure controls are effective and consistent

Prescriptive

- Product:
o specific design features
o assurance provided by inspection
o products comply with standards
- Process
o assurance based on if process was followed

Availability

- Inherent: considers corrective maintenance


- Achieved: similar to inherent, PM downtime included, steady state availability
- Operational: real average availability over period of time (includes downtime)
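
The three measures are commonly written as ratios of up time to total time. The sketch below uses the textbook forms, which are not spelled out in the notes: inherent = MTBF / (MTBF + MTTR), achieved = MTBM / (MTBM + mean maintenance time), operational = uptime / (uptime + downtime). All figures are hypothetical.

```python
def inherent_availability(mtbf, mttr):
    """A_i = MTBF / (MTBF + MTTR): corrective maintenance downtime only."""
    return mtbf / (mtbf + mttr)

def achieved_availability(mtbm, mean_maintenance_time):
    """A_a = MTBM / (MTBM + M): corrective and preventive maintenance downtime."""
    return mtbm / (mtbm + mean_maintenance_time)

def operational_availability(uptime, downtime):
    """A_o = uptime / (uptime + downtime): all downtime over the observed period."""
    return uptime / (uptime + downtime)

# Hypothetical asset, all times in hours
print(inherent_availability(990.0, 10.0))      # 0.99
print(achieved_availability(480.0, 20.0))      # 0.96
print(operational_availability(900.0, 100.0))  # 0.9
```

Each step down the list counts more sources of downtime, so for the same asset the values would normally fall from inherent to achieved to operational.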
