
Functional Safety Activities

Borja Fernández Adiego BE/ICS


Outline
1. (very basic) Functional safety concepts

2. CERN Safety Instrumented System example

• Functional Safety activities

• Most common (CERN) Problems

3. Discussion (reliability information, WIC reengineering, …)

4. Optional - Software reliability (?)

• Model checking for PLC programs


Our group
BE-ICS (Industrial Controls and Safety) group

“provides solutions and support for industrial control systems and develops, installs and maintains CERN's safety systems”


Reliability, availability and (functional) safety
• Reliability: … about how long a system performs its intended function
• E.g. Hardware reliability is related to the probability of safe failure + probability of dangerous failure.

• Availability: … about the % of time the equipment is operational


• Availability = Uptime / Total time

• Safety: … we only care about “dangerous” failure modes. Probability of dangerous failure

For example:
• Fails to open: Dangerous failure
• Fails to close: Safe failure

Whether a failure is safe or dangerous depends on the application


Risk reduction

• Find techniques to get into the acceptable risk zone
• Functional safety focuses on prevention techniques

[Figure: risk diagram of the real system (hardware, software) with axes Severity (S) and Probability (P), showing the unacceptable-risk and acceptable-risk zones and the Prevention and Protection directions of risk reduction]

Functional safety and standards

• Functional safety: the part of the overall safety that depends on a system or equipment operating correctly in response to its inputs
• Passive systems are NOT functional safety (e.g. fire resistant door) -> protection

• Active systems are functional safety (e.g. smoke detection by sensors and intelligent activation of fire
suppression system) -> prevention

• Functional Safety Standards:


• IEC 61508: Functional Safety of Electrical / Electronic / Programmable Electronic Safety-related Systems
• IEC 61511: specific for the process industry
• IEC 62061: Safety of machinery
• …
• … + specific industry standards (e.g. nuclear safety)
Functional Safety standards

• They provide:
• Functional Safety terminology:
• SIS (Safety Instrumented System)
• SIF (Safety Instrumented Function)
• …

• Guidelines to develop a SIS:


• Safety life cycle
• Metrics to measure safety
• SIL (Safety Integrity Level)
Switchboard installation

• New magnet test bench facility in building 311

• Different test benches to measure the field of normal-conducting electromagnets

• Magnets will be powered with DC or quasi-DC current up to 1000 A

• The current provided by each converter must be multiplexed to the test benches by a dedicated assembly of electro-mechanical switches (hereafter named “switchboard”)

• Project managed by TE/MSC


Switchboard installation

• 17 Measuring benches
• 7 Power converters
• 3 COMET
• 1 Apolo
• 3 Transtechnik

• Switchboard assembled by the company Boffetti (http://www.boffettigroup.com/)

• Switchboard main components:
• ABB Emax circuit-breakers
• Mersen (FLOHE Foulileret SAS) circular commutators
Switchboard installation

ABB Emax circuit-breakers

Mersen circular commutators


Development of an IEC 61508 SIS

1. Analysis
• Risk analysis
• Safety Instrumented Functions definitions

2. Realization
• Implementation of the Safety Instrumented System
• Steps to prove the SIL of one SIF

3. Commissioning, operation and management of the SISs


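Risk analysis

• FMEA (Failure Mode and Effect Analysis)
• High level analysis: focusing on the design
• 4 items were analysed:
  • Magnet
  • Interbox
  • Switchboard
  • Power converter
• Electrical risk: an interlock system is needed to mitigate this risk

FMEA table columns: Item, Function, Potential Failure Mode, Potential Effect(s) of Failure, Severity (S), Potential Cause(s) of Failure, Occurrence (O), Current Design Controls (Prevention), Current Design Controls (Detection), Detection (D), Risk Priority Number (RPN), Recommended Action(s), FMEA unique identifier number

Item 1.3 Switchboard (FMEA identifier NBS#69)
• Function: switch
• Potential failure mode: blade position
• Potential effect of failure: open circuit under power
• Severity (S): 10
• Potential cause of failure: switch indicates closed position
• Occurrence (O): 3
• Current design controls (prevention): tests during installation
• Current design controls (detection): regular switching tests without load
• Detection (D): 8
• Risk Priority Number (RPN): 240
• Recommended action: installation of a redundant switch

Item 1.1 Magnet (FMEA identifier NBS#39)
• Function: thermal interlock
• Potential failure mode: wrong connections
• Potential effect of failure: wrong interlock scheme
• Severity (S): 10
• Potential cause of failure: mix between thermal (NC) versus water (NO) interlock connectors
• Occurrence (O): 5
• Current design controls (prevention): standard interlock system (CERN OK, external institute NOK)
• Current design controls (detection): none
• Detection (D): 8
• Risk Priority Number (RPN): 400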
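
The two RPN values in the FMEA rows (240 and 400) match the usual product of the severity, occurrence and detection ratings. A minimal sketch of that arithmetic, using only the numbers from the table; the class and field names are illustrative and not taken from the original analysis:

```python
from dataclasses import dataclass


@dataclass
class FmeaRow:
    """One FMEA line, reduced to the numeric ratings (illustrative type)."""
    item: str
    severity: int      # S
    occurrence: int    # O
    detection: int     # D
    identifier: str

    @property
    def rpn(self) -> int:
        # Risk Priority Number as the product S * O * D.
        return self.severity * self.occurrence * self.detection


# Ratings copied from the two rows shown on the slide.
rows = [
    FmeaRow("1.3 Switchboard", 10, 3, 8, "NBS#69"),
    FmeaRow("1.1 Magnet", 10, 5, 8, "NBS#39"),
]

# Highest RPN first, i.e. the item to address first.
ranked = sorted(rows, key=lambda row: row.rpn, reverse=True)
for row in ranked:
    print(f"{row.identifier} {row.item}: RPN = {row.rpn}")
```

Sorting by RPN is only one possible way to prioritise; the slides do not say how the recommended actions were ranked.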
Risk Analysis (FMEA) + Brainstorming

- The configuration from the SCADA is not very critical, as we have feedback from all switches

- Powering the same bench with two (or more) power converters creates a critical situation (damage to the installation and possibly to the workers)

- All safety functions will act on the power converters


How can we reduce the risk?

[Figure: risk scale for the switchboard installation and its control system, running from the original risk to the target risk through levels F1 to F4, with three contributions: risk reduction by conditional modifiers, risk reduction by other risk reduction measures, and risk reduction by the Safety Instrumented Function (SIF)]
Safety Functions (families)

1. Coherence switches: all feedbacks from a switch must be coherent (i.e. one signal TRUE and all the rest FALSE)
2. Breaker status: the breaker must be closed to allow the PC to provide power to the bench
3. Bench configuration: never more than 1 PC can power the same bench
4. Bench status signals: all “bench signals” (EMXX, PBXX, FSLXX, TSHXX and ZSLXX) must be “OK” to allow the PC to provide power to the bench
5. Overcurrent protection: the current of the COMET#3 PC should be limited

Remarks:
• SIFs are independent of the test bench selection (SCADA)
• All safety functions will stop the Power Converters (PC_FPAXX and PC_PERMXX)
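
The first four safety function families are Boolean conditions on the feedback signals, so they can be sketched in a few lines. This is an illustration only and not the PLC program: the function names, the `routing` dictionary and the bench labels are hypothetical, and the overcurrent function (family 5) is left out.

```python
def switch_is_coherent(position_feedbacks):
    """Family 1: exactly one position feedback of a switch is TRUE."""
    return sum(bool(signal) for signal in position_feedbacks) == 1


def bench_configuration_ok(routing):
    """Family 3: no bench is fed by more than one power converter.

    `routing` maps a PC name to the bench it is switched to, or None.
    """
    benches = [bench for bench in routing.values() if bench is not None]
    return len(benches) == len(set(benches))


def power_permit(switch_feedbacks, routing, breaker_closed, bench_signals_ok):
    """Combine families 1 to 4; False means the power converters are stopped."""
    return (
        all(switch_is_coherent(fb) for fb in switch_feedbacks)
        and breaker_closed                      # family 2
        and bench_configuration_ok(routing)
        and all(bench_signals_ok)               # family 4
    )


# Hypothetical situation: two converters routed to the same bench.
print(power_permit(
    switch_feedbacks=[[True, False], [False, True]],
    routing={"PC1": "B01", "PC2": "B01"},
    breaker_closed=True,
    bench_signals_ok=[True, True, True, True, True],
))
```

The conditions are written independently of which bench the SCADA selected, in line with the remark that the SIFs do not depend on the test bench selection.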
Safety Instrumented Function definition
• Risk to mitigate: electrical risk (short-circuit) due to a wrong switchboard configuration. Potential damage to power converters, magnets and people.

• Functionality: each bench should be powered by only 1 power converter

• Mode: low demand operation mode
• Safety Integrity Level: SIL2

[Figure: risk evaluation table]

The SIF must be compliant with:
• SIL2 Hardware Safety Integrity requirements:
  • Architectural constraints
  • Hardware random failures
• SIL2 Systematic Safety Integrity requirements: mechanical stress, EM interference, software errors, etc.
Switchboard SIS architecture

[Figure: SIS architecture. Supervision: WinCC OA with UNICOS. Controller: 317F-2PN/DP PLC with the Siemens Safety Distributed Library. Profisafe links to the remote I/O, drawn as groups of 17, 1 and 3 ET200SP stations, carrying the bench signals (EM01, PB01, FSL01, TSH01, ZSL01, …) and the power converter (PC) signals]

• 135 bench signals (including spares and flashing light signals)
• 113 switchboard signals (including spares)
• 34 power converter signals (including spares)
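Implementation of SIF

[Figure: implementation of the SIF. The switch position inputs MSB311_SW1A_POS2_DI, MSB311_SW1B_POS1_DI and MSB311_SW2_POS2_DI are read through an ET200SP station over Profisafe by the 317F-2PN/DP PLC (OB35); a second ET200SP station drives the Power Permit and Fast Abort signals of each of the two power converters (PC)]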
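Hardware Safety Integrity requirements

Reliability Block Diagram

[Figure: reliability block diagram with three blocks in series. Block 1: switches (PFD_SW1a, PFD_SW1b, PFD_SW2). Block 2: PFD_ET200, PFD_PLC, PFD_ET200. Block 3: two redundant power converter (PC) branches, each with PFD_SR, PFD_FGC_FA, PFD_COMET, PFD_R and PFD_FGC_PP]

$$PFD_{SIF} = 3 \cdot PFD_{sw} + 2 \cdot PFD_{ET200} + PFD_{PLC} + \mathrm{1oo2}(PFD_{PC})$$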
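
The sum over the reliability block diagram can be sketched numerically. The structure follows the PFD_SIF expression on this slide; every number below is hypothetical, the 1oo2 term uses the common simplified approximation (λ_D·T)²/3 without common-cause failures (an assumption, since the slides do not say how the 1oo2 term was evaluated), and the SIL bands are the low-demand PFD ranges of IEC 61508.

```python
def pfd_1oo1(lambda_d, t):
    """Average PFD of a single channel with proof-test interval t."""
    return lambda_d * t / 2.0


def pfd_1oo2(lambda_d, t):
    """Simplified average PFD of a 1oo2 pair, common-cause failures ignored."""
    return (lambda_d * t) ** 2 / 3.0


def pfd_sif(pfd_sw, pfd_et200, pfd_plc, pfd_pc_1oo2):
    """Series sum: 3 switches, 2 ET200 stations, 1 PLC, redundant PCs."""
    return 3 * pfd_sw + 2 * pfd_et200 + pfd_plc + pfd_pc_1oo2


def sil_low_demand(pfd):
    """Return the SIL whose low-demand band contains pfd, or None."""
    for sil, upper in ((4, 1e-4), (3, 1e-3), (2, 1e-2), (1, 1e-1)):
        if upper / 10 <= pfd < upper:
            return sil
    return None


# HYPOTHETICAL inputs, chosen only to exercise the functions.
T = 1.0  # proof-test interval in years
example = pfd_sif(
    pfd_sw=pfd_1oo1(1e-4, T),
    pfd_et200=1e-5,
    pfd_plc=1e-5,
    pfd_pc_1oo2=pfd_1oo2(1e-3, T),
)
print(f"PFD_SIF = {example:.3e}, band: SIL{sil_low_demand(example)}")
```

With real data, the switch and power converter terms would come from the sources of information discussed on the following slides.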


Source of Information (IEC 61508)

1. Site specific (CERN)
• Power converters team (TE-EPC)

2. Industry specific
• Test bench facilities (e.g. SM18)

3. Generic (large number of applications)
• Boffetti, ABB, Mersen

4. Manufacturer data
• ABB: circuit breakers
• Mersen (FLOHE): switches

Why so important:

1. Hardware Safety Integrity requirements:
• Hardware random failures (failure rate, MTTF, etc.)
• Architectural constraints
  • Route 1: SFF (Safe Failure Fraction)
  • Route 2: feedback from the users

2. Systematic Safety Integrity requirements:
• Proven in use
Hardware random failures

$$PFD = \lambda_D \cdot \frac{T}{2}$$

$$MTTF = \frac{1}{\lambda_D}$$

$$MTBF = MTTF + MTTR$$

Where:
• $\lambda_D$ is the (dangerous) failure rate. We consider a constant failure rate, $\lambda(t) = \lambda$
• $T$ is the period of time between the manual tests
• No automatic tests: $C = 0$
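
The factor T/2 can be motivated by a short derivation. It is given here as a sketch under the assumptions listed above (constant dangerous failure rate, no automatic tests), plus two that are not stated on the slide: $\lambda_D T \ll 1$, and a manual test that returns the component to an as-good-as-new state.

```latex
\begin{align*}
PFD(t) &= 1 - e^{-\lambda_D t} \approx \lambda_D t
  && (\lambda_D t \ll 1) \\
PFD_{avg} &= \frac{1}{T}\int_0^T PFD(t)\,dt
  \approx \frac{1}{T}\int_0^T \lambda_D t\,dt
  = \lambda_D \cdot \frac{T}{2}
\end{align*}
```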
Hardware random failures

$$PFD = \lambda_D \cdot \frac{T}{2}$$

PFD for block 1:

• Information provided by manufacturer (Mersen): $\lambda$ = 0.9 E-03
• Assumptions:
  • $\lambda = \lambda_D$ (failure rate = dangerous failure rate)
  • $C = 0$ (no automatic tests)
• Block 1 is made of 3 switches in series, so $PFD_1 = 3 \cdot \lambda_D \cdot T / 2$
• SIL2:
  • If $PFD_1$ = 1 E-03, then T = 0.741 years = 270 days
  • If $PFD_1$ = 1 E-02, then T = 7.41 years
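
The two test intervals quoted for block 1 can be reproduced by solving PFD = λ_D·T/2 for T. The sketch below relies on two assumptions inferred from the numbers rather than stated on the slide: block 1 is treated as three identical switches in series (so the block PFD is three times the single-switch PFD), and the Mersen failure rate is read as failures per year. The function and constant names are illustrative.

```python
def proof_test_interval(pfd_target, failure_rate, n_series=1):
    """Return T such that n_series * failure_rate * T / 2 equals pfd_target."""
    return 2.0 * pfd_target / (n_series * failure_rate)


LAMBDA_D = 0.9e-3   # Mersen figure from the slide; "per year" is an assumption
N_SWITCHES = 3      # assumption: SW1a, SW1b and SW2 in series

# The two PFD values used on the slide for block 1.
for pfd in (1e-3, 1e-2):
    t_years = proof_test_interval(pfd, LAMBDA_D, N_SWITCHES)
    print(f"PFD1 = {pfd:.0e}: T = {t_years:.3f} years ({t_years * 365:.0f} days)")
```

The stricter target gives a manual test interval of about 0.741 years (roughly 270 days), and the looser one about 7.41 years, matching the slide.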
Hardware random failures

PFD for block 2:

• Information provided by manufacturer (Siemens): SIL3 certified devices
• Their contribution is not significant for reaching SIL2 for this SIF
Hardware random failures

PFD for block 3:

• Safety relay: information provided by manufacturer (Siemens): SIL3
• Power converters: no valid information to guarantee SIL
  • Large number of PCs installed at CERN (around 2000 PCs)
  • New Function Generator Controller (FGC3)
• Redundant signals
  • Fast Abort (safe signal, redundant architecture)
  • Power Permit
• Redundant architecture
  • Stop both power converters
• Possibility to add a hardware interlock (recommendation given to SM18 test bench facilities)
Architectural constraints

2 options:
• Route $1_H$: based on hardware fault tolerance (HFT) and safe failure fraction (SFF)
• Route $2_H$: based on HFT and feedback from users (Boffetti, Mersen and the power converter team)
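
For Route 1H, the safe failure fraction is the share of failures that are either safe or dangerous but detected, and the standard then caps the claimable SIL as a function of SFF and HFT. A minimal sketch follows; the table is the one for Type A elements as recalled from IEC 61508-2 and should be checked against the standard before any use, and treating a switch as a Type A element is an assumption made here for illustration.

```python
def safe_failure_fraction(lambda_s, lambda_dd, lambda_du):
    """SFF = (safe + dangerous detected) / (all failures)."""
    total = lambda_s + lambda_dd + lambda_du
    return (lambda_s + lambda_dd) / total


# Maximum SIL for a Type A element, indexed by HFT = 0, 1, 2.
# Recalled from IEC 61508-2; verify against the standard.
TYPE_A_TABLE = [
    (0.60, (1, 2, 3)),          # SFF < 60 %
    (0.90, (2, 3, 4)),          # 60 % <= SFF < 90 %
    (0.99, (3, 4, 4)),          # 90 % <= SFF < 99 %
    (float("inf"), (3, 4, 4)),  # SFF >= 99 %
]


def max_sil_type_a(sff, hft):
    """Look up the SIL cap for a given SFF and hardware fault tolerance."""
    if hft not in (0, 1, 2):
        raise ValueError("hft must be 0, 1 or 2")
    for upper, sils in TYPE_A_TABLE:
        if sff < upper:
            return sils[hft]


# With the conservative assumptions used for the switches in these slides
# (every failure dangerous, no automatic tests), nothing is safe or detected.
sff = safe_failure_fraction(lambda_s=0.0, lambda_dd=0.0, lambda_du=0.9e-3)
print(f"SFF = {sff:.0%}, cap at HFT=0: SIL{max_sil_type_a(sff, 0)}, "
      f"at HFT=1: SIL{max_sil_type_a(sff, 1)}")
```

Under those assumptions the SFF is zero, so a non-redundant element would be capped below SIL2 on this route according to the recalled table.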
Systematic Safety Integrity

• We focus on the reliability of the software (PLC program): IEC 61511 verification of application software

• All the SIFs were formally verified using the PLCverif tool: https://cern.ch/PLCverif
  • This tool applies model checking to the PLC programs

• During the development of the PLC program, PLCverif found “discrepancies” between the SIF specifications (desired functionality) and the SIF implementation (PLC program)

• The 5 SIFs are expressed as 94 verification properties. The PLC program has $2^{174} \approx 2.4 \cdot 10^{52}$ input combinations, so it is “impossible” to check all of them with testing
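
To give a feeling for why exhaustive testing is out of reach, the size of the input space can be computed directly. The sketch takes the exponent 174 from the slide at face value and computes the decimal value rather than copying it; the test rate of one billion input vectors per second is a deliberately generous, hypothetical figure.

```python
N_INPUT_BITS = 174            # exponent quoted on the slide
TESTS_PER_SECOND = 1e9        # hypothetical test throughput
SECONDS_PER_YEAR = 365 * 24 * 3600

# Python integers are arbitrary precision, so the count is exact.
combinations = 2 ** N_INPUT_BITS
years = combinations / TESTS_PER_SECOND / SECONDS_PER_YEAR

print(f"{combinations:.2e} input combinations")
print(f"about {years:.1e} years of testing at {TESTS_PER_SECOND:.0e} tests/s")
```

Model checking avoids this enumeration by reasoning about the program symbolically, which is why it can cover properties that testing cannot.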
Conclusions

• IEC 61508 safety concepts should be considered at design time

• Finding sources of information is a real challenge: there are no failure records at CERN

• Systematic Safety Integrity: successful formal verification of the PLC program using PLCverif

• Hardware safety integrity cannot be proven, but we reasonably increased the safety of the system:
  • Hardware random failure requirements “are not met” due to the power converters
  • Architectural constraint requirements “are not met” due to the power converters and switches
