
Assuring Separation of Safety and Non-safety Related Systems

Bruce Hunter
Thales Training & Simulation
Thales Services Division, Building 314, Garden Island, Sydney
Locked Bag 2700, Potts Point, NSW 2011, Australia
bruce.hunter@thalesgroup.com

Copyright ©2006, Australian Computer Society, Inc. This paper appeared at the 11th Australian Workshop on Safety Related Programmable Systems (SCS'06), Melbourne. Conferences in Research and Practice in Information Technology, Vol. 69. Tony Cant, Ed. Reproduction for academic, not-for profit purposes permitted provided this text is included.

Abstract
Safety standards call for the separation of safety and non-safety related systems. Although good guidance is provided in these standards on how to achieve the required hazard analysis, safety integrity assignment and validation to prove a safe system, there is little available on establishing safety boundaries around the critical components and the proof of isolation from non-safety functions. Delineation between safety and non-safety systems is particularly important where it is impractical to substantiate a Safety Integrity Level of the overall system due to the complexity of some components. In this case it is better to assume high failure probability of the non-safety system and prove isolation from the safety-related system.

This paper explores a conceptual methodology (including the use of Fault Tree Analysis and Common Cause Failure Analysis) for establishing and assuring separation of systems, and illustrates it with real-life examples drawn from training simulators.

Keywords: Functional Safety Separation, Functional Safety Boundaries, Simulator Functional Safety.

1 Introduction
Separating safety-critical and safety-related systems from systems where safety integrity is unable to be established or maintained is an important aspect of system safety design. When implementing a system safety program it is important to suspect all components as being unsafe unless assured otherwise and then target the few areas where safety requirements are allocated. Coupling between components of complex systems can be subtle, and interaction with non-safety related systems has led to harmful outcomes in safety related systems.

An example of this coupling occurred on 19 February 1996, when a Boeing 747-433 Combi aircraft operating as Air Canada flight 899 was on a scheduled passenger/freight flight from Toronto/Lester B. Pearson International Airport, Ontario, to Vancouver International Airport, British Columbia. As the aircraft was taking off, the underside of the tail struck the runway, and, during the climb-out, considerable nose-down stabilizer trim was required to trim the aircraft for flight. The Transportation Safety Board of Canada (TSB) report (A96O0030) findings included:
• “A recently modified computer application, ALPAC, used by load agents to plan loads and compute aircraft weight and balance, incorrectly computed the aircraft take-off C of G.
• The ALPAC-computed aircraft take-off C of G was near the centre of the aircraft flight envelope, while the actual C of G was beyond the aft limit.
• The ALPAC application produced a large error in the aircraft C of G calculation; however, there was no defence in place to detect such a critical error in the application itself, at the aircraft loading stage, or in the flight crew confirmations of load and trim setting.
• The modified computer application was not adequately tested before it was released for operational use.
• The modified computer application was not monitored effectively for accuracy after it was placed in operational use.”
In this case the software that led to the incident was not even aboard the aircraft and was operated by a different party. Interaction across what appear to be valid safety boundaries can sometimes be nebulous. While this failure may be considered as incomplete hazard analysis of the changes to the ALPAC system and the impact on the performance of the loaded aircraft, it also provides a good example of where coupling between systems may be overlooked.

2 Setting Functional Safety Boundaries

2.1 The need for having boundaries
Taking the extreme position, very few systems are fully independent in their operation, and to be completely assured of the absence of interaction or common-cause failure between the safety-related and other systems would take an inordinate amount of time and effort. This could have the opposite effect, delaying the safety benefits of deploying a safety-related system. At some point a determination must be made that all possible influences are controlled or risks sufficiently known so the safety analysis can be bounded.

Taking the above tail-strike incident as an example of an indirect influence on system safety, the safety analysis boundary could well be established around the flight
control systems. Further investigation of the use of the off-board planning system would have identified its criticality to the Centre of Gravity of the loaded aircraft and extended the functional safety boundary to include this and any changes made to it.

2.2 Objectives of Functional Safety Boundaries
This paper introduces a concept of Functional Safety Boundaries, which can be used to contain areas where specific safety integrity measures are to be employed.

Objectives of these boundaries are to:
• Minimise the interfaces across the safety boundary to direct focus on the safety separation implemented in these;
• Minimise likelihood of common-cause failures across the boundary;
• Exclude non-safety related functions where these are volatile or subject to undefined or non-safety related controls; and
• Allow a Safety Integrity Level (SIL) to be achieved within the boundary.

2.3 Identifying safety functions within a boundary
A useful method to establish the functional safety boundary between systems or subsystems is to undertake a Fault Tree Analysis (FTA) of the contributing factors to failure of the system, which may lead to hazardous events identified in the preliminary hazard analysis. The first attempt at a boundary would be around the systems that are implicated in the FTA. This FTA needs to be extensive and complete from all initiating situations to the system failure that is a causal factor for the hazardous event. Then, flowing down the tree, mark off those functions that are related to systems that should be excluded due to:
• The possibility of common-cause failure;
• High levels of complexity and non-deterministic failure rate; or
• Components that may not always be present or enabled.

Failure probabilities are then assigned to the contributing and basic events. Figure 1 shows a very simplified fault tree for a safety-related system and its isolation from non-safety-related and non-deterministic functions (SIL0).
In Figure 1, the boundary is set around the failure associated with the SIL0 system (A), which then requires the failure probability of the associated protective isolation mechanism (B) to be made no greater than the failure probability (C) of the SIL rated system within the boundary to achieve an end failure probability commensurate with the SIL rated system (E).

[Figure 1: fault tree. Events as labelled: A, Failure of System with SIL0 (1E01); B, Protective Isolation Failure (1E-09); C, Failure of system with SIL X (1E-08); D, Non-Safety Related Interaction (1E-08); E, Resulting System Failure (2E-08). A Safety Boundary is marked on the tree.]
Figure 1 Simple FTA of Safety System Coupling

In a similar manner, the boundary must be extended to include common-cause failures that effectively defeat the independence across the boundary as shown in Figure 2.

[Figure 2: diagram labels: Function A; Protective Function B; Related Function C; OR; Common-Cause Failure.]
Figure 2 Setting boundaries outside possible Common-Cause Failures

2.4 The problem with software
At a system level, this process looks reasonably straightforward but the problem comes with setting boundaries with distributed software architectures. In this situation it is very difficult to identify boundaries that don’t involve the possibility of common-cause failures. Some useful work on partitioning in this context has been done by Conmy, Nicholson, Purwantoro and McDermid (2002), Identifying Safety Dependencies in Modular Computer Systems.
If the layering approach from this work is extended to a generic model, common cause failures can be seen to involve lower layers of the architecture (hardware failures, resource sharing failures, communication failures, memory leakage failures etc.). For this reason it is essential that any functional safety boundary include all the layers that support that function, as shown in Figure 3.
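The requirement that a boundary include every supporting layer can be pictured as a closure over a dependency map. The following is a minimal sketch only: the component names and the `SUPPORTS` map are hypothetical and merely echo the kind of layering described above (application, subsystem OS, subsystem hardware, interconnection, common hardware). Any layer that falls inside the closure of both a safety function and another function is a candidate common-cause failure path.

```python
# Hypothetical layering: each component lists the lower layers it relies on.
SUPPORTS = {
    "safety_app": ["subsys_os_a"],
    "other_app": ["subsys_os_b"],
    "subsys_os_a": ["subsys_hw_a"],
    "subsys_os_b": ["subsys_hw_b"],
    "subsys_hw_a": ["interconnect"],
    "subsys_hw_b": ["interconnect"],
    "interconnect": ["common_hw"],
    "common_hw": [],
}


def boundary(function, supports=SUPPORTS):
    """Return the function plus every lower layer it depends on."""
    seen = set()
    stack = [function]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(supports.get(node, []))
    return seen


def shared_layers(safety_fn, other_fn, supports=SUPPORTS):
    """Layers in both closures: candidate common-cause failure paths."""
    return boundary(safety_fn, supports) & boundary(other_fn, supports)


print(sorted(shared_layers("safety_app", "other_app")))
# prints ['common_hw', 'interconnect']
```

In this toy layout the two applications run on separate subsystems, yet the interconnection and common hardware layers still appear in both closures, so they would need to sit inside the boundary or be covered by separation measures.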
[Figure 3: layered diagram, top to bottom: User I/F (four instances); Application (four instances); Subsystem OS (two instances); Subsystem Hardware (two instances); System interconnection protocols; Common System Hardware including network infrastructure.]
Figure 3 Distributed system acceptable boundaries

Common-cause failures and dependencies extending over the distributed communication networks must also be considered and the functional safety boundary set accordingly. These may include:
• Global variables accessed by network
• Security attack and security blocking issues
• Effects of network lock-up on functional safety
The separation requirements over the functional safety boundary must take these failures into account.

2.5 Setting boundaries in the safety lifecycle
As part of the safety lifecycle, identification of Functional Safety Boundaries and Functional Safety Separation should be included in setting overall safety requirements and the allocation of these to systems and their components. The following table identifies the lifecycle phases from IEC 61508, Functional safety of electrical/electronic/programmable electronic safety-related systems, where segmentation and separation of safety should be undertaken.

IEC 61508 Safety Lifecycle Phase | Functional Safety Separation Activities
Phase 4. Overall Safety Requirements | Determine safety boundaries
Phase 5. Overall Safety Requirement Allocation | Determine separation requirements
Phase 9. System Safety Requirements Specification | Specify trans-boundary information allowed and prohibited
Phase 10. Safety-related Systems Realisation | Establishment of separation measures
Phase 13. Overall Safety Validation | Proof of separation of non-safety systems or influences
Phase 14. Overall Operation, Maintenance and Repair | Monitoring for compromised separation
Phase 15. Overall Modification and Retrofit | Re-evaluating safety boundaries and separation

Table 1: Lifecycle Consideration of Safety Boundaries and Separation

3 Assuring Functional Safety Separation
Safety standards do call for independence of safety-related functions but aren’t very specific about what is acceptable or how to dependably achieve this. Although it is a difficult area to quantify for completeness and repeatability, I believe it is important that standards don’t avoid addressing this issue and should specify a generic methodology for assuring independence or separation.

3.1 What the standards say

3.1.1 Key IEC 61508 extracts
IEC 61508 identifies qualitative requirements for independence of safety-related functions.
• IEC 61508-2 clause 7.4.2.3 “Where an E/E/PE safety-related system is to implement both safety and non-safety functions, then all the hardware and software shall be treated as safety-related unless it can be shown that the implementation of the safety and non-safety functions is sufficiently independent (i.e. that the failure of any non-safety-related functions does not cause a dangerous failure of the safety-related functions). Wherever practicable, the safety-related functions should be separated from the non-safety-related functions.
NOTE 1 Sufficient independence of implementation is established by showing that the probability of a dependent failure between the non-safety and safety-related parts is sufficiently low in comparison with the highest safety integrity level associated with the safety functions involved.”
• IEC 61508-2 clause 7.4.2.5 “When independence between safety functions is required (see 7.4.2.3 and 7.4.2.4) then the following shall be documented during the design:
a) the method of achieving independence;
b) the justification of the method.”
Although not quantified, this does support the use of safety boundary setting (in Section 2) and the identification of the level of separation (Section 3.2). However, it does allow varying levels of rigour in establishing the required independence.

IEC 61508-3 (Software Requirements) clause 7.4.2.7 requires: “Where the software is to implement both safety and non-safety functions, then all of the software shall be treated as safety-related, unless adequate independence between the functions can be demonstrated in the design.”

Clause 7.4.2.8 also requires “Where the software is to implement safety functions of different safety integrity levels, then all of the software shall be treated as belonging to the highest safety integrity level, unless adequate independence between the safety functions of the different safety integrity levels can be shown in the design. The justification for independence shall be documented.”
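Read together, the two IEC 61508-3 clauses quoted above amount to a default rule that can be stated in a few lines. The sketch below is an illustration of that reading only, not text from the standard: the function name, the use of SIL 0 to stand for non-safety functions, and the example function names are assumptions made here.

```python
def software_sils(function_sils, independence_demonstrated):
    """Return the SIL each function's software must be treated as.

    function_sils maps a (hypothetical) function name to its allocated SIL,
    with 0 standing for a non-safety function. Without demonstrated
    independence, everything inherits the highest SIL present.
    """
    if not function_sils:
        raise ValueError("at least one function is required")
    if independence_demonstrated:
        return dict(function_sils)
    highest = max(function_sils.values())
    return {name: highest for name in function_sils}


allocated = {"shutdown_logic": 3, "event_logging": 0}
print(software_sils(allocated, independence_demonstrated=False))
# prints {'shutdown_logic': 3, 'event_logging': 3}
print(software_sils(allocated, independence_demonstrated=True))
# prints {'shutdown_logic': 3, 'event_logging': 0}
```

The boolean flag hides the hard part: the clauses require the independence to be demonstrated and justified in the design, which is the gap the rest of this section addresses.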
The concept of Safety Separation Levels could be the basis for demonstration of this “independence” for these clauses.

Notes to clause 7.4.2.8, in the new committee draft, expand the requirements that allow independence to be shown on a single computer system by means of spatial and temporal techniques. In my view some of these may further erode the rigour required by the standard due to the lack of formality in establishing this independence unless substantiated by some form of quantification of the independence level required.

3.1.2 DEF (AUST) 5679A extracts
DEF (AUST) 5679A, The procurement of Computer-Based Safety-Critical Systems, still has a qualitative approach but is more specific about the requirements of independence and its dimensions to be considered.
• Section 15.5 Component Independence - “…. One Component is independent of another if its operation cannot be changed, misdirected, delayed or inhibited by the other Component.”
• Section 15.5.2 “The notion of Component independence has several dimensions. These include:
a) physical isolation (for example with software components this means that each Component runs on a separate processor);
b) diversity of implementation, for example, one Component may be implemented in software, another implemented by hardware or operator procedure;
c) data independence (for example, the input data for the Components is not to be generated by the same mechanism); and
d) control independence, meaning that one Component cannot affect the control flow of another Component…”
• Section 15.5.4 “If a SIL assignment depends on the independence of components, evidence of the independence shall be documented. The documented evidence shall state how independence is achieved and how independence is used as a protective measure.”
DEF (AUST) 5679 is quite helpful in identifying some key concepts of independence along with required practices and evidence that components can be considered as independent. I believe that the techniques of functional safety boundaries and Safety Separation Level would satisfy the evidence required.

3.2 Concept of Safety Separation Level (SSL)
A means of quantifying and comparing independence could be achieved with the use of a Safety Separation Level, defined by assigning a probability of separation failure equivalent to the SIL target failure measures of IEC 61508-1 7.6.2.9, as shown in Table 2.

SSL | Probability of propagating dangerous failure for low demand mode (<1 per year) | Probability of propagating dangerous failure for continuous/high-demand mode
4 | ≥10^-5 to <10^-4 | ≥10^-9 to <10^-8
3 | ≥10^-4 to <10^-3 | ≥10^-8 to <10^-7
2 | ≥10^-3 to <10^-2 | ≥10^-7 to <10^-6
1 | ≥10^-2 to <10^-1 | ≥10^-6 to <10^-5

Table 2 Proposed allocation of dangerous interaction probability to SSL

Taking this concept further, Table 3 shows a proposed method of assigning a Safety Separation Level (SSL) to differences in SIL across the safety boundary. This attempt to quantify independence between safety-related systems meets the intent of IEC 61508 parts 2 and 3 and DEF(AUST) 5679A section 15.5.

System 2 \ System 1 | Unclaimed (SIL0) | SIL1 | SIL2 | SIL3 | SIL4
Unclaimed (SIL0) | N/A | SSL1 | SSL2 | SSL3 | SSL4
SIL1 | SSL1 | N/A | SSL1 | SSL2 | SSL3
SIL2 | SSL2 | SSL1 | N/A | SSL1 | SSL2
SIL3 | SSL3 | SSL2 | SSL1 | N/A | SSL1
SIL4 | SSL4 | SSL3 | SSL2 | SSL1 | N/A

Table 3: Allocation of SSL to differences in systems SIL ratings

Relating this back to the FTA model of separation in Figure 1, independence between SIL0 and SIL4 systems would require an SSL of 4, equivalent to the reliability of a SIL 4 system. Achievement of these separation levels could use compliance routes similar to those identified in IEC 61508-2 section 7.4. Establishing Safety Integrity Levels in a homogeneous system without external interfaces is adequately, although sometimes controversially, dealt with in existing standards. The relationship between SIL differences and proposed minimum requirements for SSL would need further work to justify more than the extremes. Simply, where the SIL requirement is the same, this is effectively an extension of the safety system and therefore no SSL is required. Where there is an interface to a SIL0 system, this requires the same rigour as the higher integrity system.

3.3 Setting separation requirements
Establishing the level of independence could use the effective definitions in DEF (AUST) 5679A section 15.5.2, where each of the dimensional attributes would be assessed against separation characteristics commensurate with the SSL required from Table 3.

These independence dimensions (physical, data independence and control independence, and diversity of implementation) should be assessed for ability to change, misdirect, delay or inhibit safety functions of the safety related system across the functional safety boundary.
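The allocations proposed in Tables 2 and 3, and the link back to Figure 1, can be checked with a short sketch. Several things here are assumptions rather than statements from the paper: the function and variable names; reading Table 3 as the absolute SIL difference (which reproduces every cell of the table); treating a separation measure as acceptable when its failure probability is below the upper limit of the SSL band; and reading Figure 1 as D = A AND B, E = C OR D, which is consistent with the values printed in the figure. The value of A is used exactly as printed (1E01).

```python
# Table 2 (proposed): SSL -> [lower, upper) band for the probability of
# propagating a dangerous failure across the boundary.
LOW_DEMAND = {1: (1e-2, 1e-1), 2: (1e-3, 1e-2), 3: (1e-4, 1e-3), 4: (1e-5, 1e-4)}
CONTINUOUS = {1: (1e-6, 1e-5), 2: (1e-7, 1e-6), 3: (1e-8, 1e-7), 4: (1e-9, 1e-8)}


def required_ssl(sil_1, sil_2):
    """Table 3: SSL equals the SIL difference; None (N/A) when the SILs match."""
    for sil in (sil_1, sil_2):
        if sil not in range(5):
            raise ValueError(f"SIL must be 0..4, got {sil}")
    return abs(sil_1 - sil_2) or None


def within_ssl(probability, ssl, bands=CONTINUOUS):
    """True if the separation failure probability is below the band's upper limit."""
    _lower, upper = bands[ssl]
    return probability < upper


# Figure 1 values as printed.
A, B, C = 1e1, 1e-9, 1e-8
D = A * B  # assumed AND gate: SIL0 system failure and isolation failure
E = C + D  # assumed OR gate, rare-event approximation

ssl = required_ssl(0, 4)
print(ssl, within_ssl(B, ssl), f"{D:.0e}", f"{E:.0e}")
# prints 4 True 1e-08 2e-08
```

Under these assumptions the isolation value B in Figure 1 sits in the SSL 4 continuous-mode band, and the computed D and E agree with the figure's 1E-08 and 2E-08.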
Independence Dimension | SSL0 | SSL1 | SSL2 | SSL3 | SSL4
Diversity | Common development and design implementation. | TBD | TBD | Separate subsystem development approaches and system implementation. Strong prevention of CCF across FSB | Independent development and design technologies. Thorough prevention of CCF across FSB
Physical | Fully integrated (e.g. single MCU) | TBD | TBD | In separate enclosures with special or physical protection. | Fully separated (e.g. housing, environment, power, access)
Data | Dependent on data across system (e.g. global variables) | TBD | TBD | Strong checking on out of range data and protection against flooding of information | No data dependencies across FSB except contained within approved controls or read-only access.
Control | Many system-wide controls without limitation of their impact | TBD | TBD | Few controls and all verified and authenticated for dangerous impact. | Few controls and all verified and authenticated for dangerous impact.

Table 4 Possible SSL independence attributes

Setting a common process for this assessment would require considerable development and agreement before inclusion in a standard could be contemplated, but Table 4 proposes a possible framework (albeit incomplete in this paper) where assignment to SSL objectives may be possible. Further work and substantiation would be required on the assignment of separation levels, but in my view this would be beneficial to accommodate the complexity of emerging systems.

3.4 Separation in the Maintenance Lifecycle
One of the strengths of IEC 61508 is the full life-cycle approach that it takes in respect of establishing and maintaining functional safety. This is particularly important with maintaining independence across functional safety boundaries, as changes to maintenance, repair and updates could defeat the isolation measures taken.

Due to the subtlety and far reaching impact of some safety separation issues (see Introduction), continuing independence is at risk of being compromised through the support, maintenance and upgrade phases of the safety lifecycle. Like other safety requirements, the implementation of functional safety separation must be fully identified in the Safety Case and maintenance documentation. This must be revisited regularly, and whenever the system is changed, to ensure that no unauthorised modifications have been made and that effective functional safety separation is maintained.

4 Practical Examples in Simulator Systems

4.1 Simulation domain specific safety issues
In simulator training devices, the combination of safety and non-safety related systems is an inevitable consequence of the systems involved and the direct interface to trainees through the simulation cues of visual, motion, aural and force-feedback.

One often-identified risk of training simulators is negative training indirectly leading to bad practices on the real platform. To mitigate this risk, full-flight simulators are accredited to standards prior to being placed into service and training credits being claimed. Fidelity checks are based on many factors, including model checks with real aircraft data and cues associated with key training competencies.

Taking the example of the 747 aircraft tail scrape in this paper’s introduction, one of the findings of the TSB was: “The first officer's recent simulator training did not include an aircraft out-of-trim or out-of-balance take-off”. The safety functional boundary for this scenario could have been extended beyond the operation of the aircraft to the specific training task and cues on the simulator. Had this been done, then so long as the simulator faithfully represented the aircraft and controls under these conditions, the simulator would have fulfilled its requirements.

[Figure 4: diagram labels: Flight Data Available; Simulator Model; Physical Flying Envelope; Normal Flight Operation.]
Figure 4 Simulator Modelled Space
The nature of general simulator architectures and the modelling of the aircraft operation do cause real problems in assigning an overall safety integrity level. Figure 4 graphically illustrates this limitation with the impracticality of simulating the complete behaviour of all platform systems in all conditions.

4.2 Generic simulator hazards
Simulators are different from real aircraft in several important areas: they are meant to crash without injury to the occupants; and they only simulate key areas of the aircraft functions. The key hazards associated with simulators are the integrated human interfaces associated with training cues as shown in Table 5.

Hazardous Cues | Dangerous failure impacts
Aural Cues | Issues of occupational deafness if sustained excessive volume
Motion Cues | Issues of crushing, falling and hitting
Visual Cues | No direct hazard other than motion sickness
Control Loading Cues | Feedback cues: issues of entrapment, crush and strike.
Combination | Simulator motion sickness due to the concentration and limited accuracy possible in the simulated environment.

Table 5 Simulator Hazards and Impact Issues

4.3 Example of motion System Safety Integrity
One of the key cues associated with full-flight simulator systems is the “feel” of motion associated with aircraft movement and attitude. The motion system of the simulator takes the acceleration vectors and aircraft attitude from the simulated model and typically applies these to a six-degrees of freedom hydraulic motion platform.

The motion system is considered a safety related system due to the large excursions of movement. The safety boundary of this system encloses all the necessary controls to ensure safe operation and shutdown of the motion system as shown in Figure 5.

[Figure 5: block diagram labels: Safety Boundary; Interlocks and Shutdowns; From SIL0 Host; SIL 3 Motion Control PLC; Hydraulic Valves; Hydraulic Pump Unit; 6 DOF Motion Base. © Thales, used by permission]
Figure 5 Simulator Motion System Overview

Data and control across the Functional Safety Boundary in this case is limited to acceleration and direction information. Local control is applied between the SIL 3 Safety PLC and the Motion hydraulic controls complete with integrated safety interlocks and emergency shutdowns. While the motion requests remain within acceptable limits, the motion system will respond once the instructor gives consent and provided the interlocks remain inactive (data and control independence).

[Figure 6: cut-away illustration. © Thales, used by permission]
Figure 6 C2000X Full Flight Simulator Cut-away

The motion system is physically separated from the rest of the system and independently developed and implemented with different technology (physical independence and diversity). Separation across this safety boundary can be claimed to meet equivalent SSL 3 requirements as per the Table 4 proposal and maintained to meet the accepted risk profile. This would then maintain the motion system PLC as SIL3 without degradation by connection to the host system, which cannot be substantiated as anything more than SIL0.
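The data and control restriction described for the motion system boundary can be pictured as a small acceptance gate on the safety side. This is a sketch under stated assumptions, not the Thales implementation: the `MotionRequest` type, the units, the `MAX_ACCEL` limit and the function name are all hypothetical. It shows only the three conditions named in the text: requests within acceptable limits, instructor consent given, and interlocks inactive.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MotionRequest:
    """Only data allowed across the boundary (hypothetical fields and units)."""
    acceleration: float   # m/s^2
    direction_deg: float  # 0 <= direction < 360


MAX_ACCEL = 5.0  # hypothetical limit, for illustration only


def accept_request(req, instructor_consent, interlock_active, max_accel=MAX_ACCEL):
    """Accept a host request only if every condition holds; otherwise reject."""
    in_bounds = (
        math.isfinite(req.acceleration)
        and math.isfinite(req.direction_deg)
        and abs(req.acceleration) <= max_accel
        and 0.0 <= req.direction_deg < 360.0
    )
    return in_bounds and instructor_consent and not interlock_active


print(accept_request(MotionRequest(1.2, 90.0), True, False))   # prints True
print(accept_request(MotionRequest(50.0, 90.0), True, False))  # prints False
```

The point of the sketch is that the gate is evaluated on the SIL 3 side, so a faulty SIL0 host can at worst have its requests rejected.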
5 Conclusion
This paper has proposed a technique to quantify and implement separation of safety-related systems from other systems by recognising safety boundaries and interaction across those boundaries and their effect on the separation. This technique re-uses methods from existing standards to measure, implement and maintain separation based on the concept of Safety Separation Level with similar criteria to Safety Integrity Level and Functional Safety Boundaries. This allows the safety case to be established for complex systems by applying quantifiable separation requirements to systems where a SIL is difficult, if not impossible, to obtain at the overall system level.

6 Acknowledgements
The author gratefully acknowledges the assistance and support of Philip Swadling and Stephen Carey of Thales Training & Simulation and the independent reviewers for this conference whose valuable contribution has helped me complete this paper.

7 References
IEC 61508 (1998) Parts 1 to 4: Functional safety of electrical/electronic/programmable electronic safety-related systems.

DEF (AUST) 5679 (1998) Land Engineering Agency, Department of Defence. The procurement of Computer-Based Safety-Critical Systems.

Transportation Safety Board of Canada, Report Number A96O0030, Control Difficulty Tail Strike, Air Canada Boeing 747-433 Combi C-GAGL, Toronto/Lester B. Pearson International Airport, Ontario, 19 February 1996.

Conmy, P., Nicholson, M., Purwantoro, Y.M., and McDermid, J. (2002) Safety Analysis and Certification of Open Distributed Systems.
