I 000784520 Thesis
Master’s Thesis
Issued on 16.06.2020
Submitted on 05.11.2020
Declaration
I hereby declare that this thesis is my own work, that I have not presented it elsewhere
for examination purposes and that I have not used any sources or aids other than those
stated. I have marked verbatim and indirect quotations as such.
Abstract
In-the-loop tests can save time, money and effort by helping to identify errors before
they occur in the target environment or at the customer. These tests can be designed in
various forms, depending on the current state of development. From an organizational
point of view, it is desirable to formulate tests only once and to use them in different
environments without further adjustments. This thesis focuses on deviations between
in-the-loop test levels and the possible reuse of tests. In addition to the explanation
and classification in the field of software testing, previous work in this area is also
considered. Computers with x86-64 processors are typically used for the development
of embedded system software, which do not correspond to the processors of the target
platform. Furthermore, the exchange of data does not necessarily take the form as
intended for the target applications, but possibly only in a modelling environment or
through simplified channels. With regard to the desired tests to be carried out, this
can lead to a number of problems, which are identified in the context of this work. In
addition to accuracy and performance aspects, requirements and designs for reusability
are presented.
Contents
1 Introduction 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.3 Structure of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
4 Project Context 30
4.1 Hardware Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4.1.1 Desktop Computers . . . . . . . . . . . . . . . . . . . . . . . . . 30
4.1.2 Microcontroller . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.1.3 Evaluation Board . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.2 Software Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.2.1 COTS Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.2.2 In-House Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.3 Test Level Assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
5 Methodology 37
5.1 Accuracy Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
5.1.1 Floating-Point Numbers . . . . . . . . . . . . . . . . . . . . . . 38
6 Conclusion 68
List of Abbreviations 70
List of Figures 72
List of Tables 73
List of Listings 74
Bibliography 75
Appendix 82
A.1 Ninth-Degree Taylor Approximation of sin x . . . . . . . . . . . . . 82
A.1.1 MATLAB Sources . . . . . . . . . . . . . . . . . . . . . . . . . 82
A.1.2 C Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
A.2 Entity B Implementation including MiL/SiL/HiL Adapters of System
Response Time Measurement and Reusable Test Asset Example . . . . 85
A.2.1 MiL Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
A.2.2 SiL Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
A.2.3 HiL Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
1 Introduction
1.1 Motivation
The development and in particular the verification and validation of avionics systems is
very time-consuming and costly. The increasing number of software-based functions as
well as the growing number of interfaces and dependencies pose additional challenges
to software development. To meet these challenges, attempts are made to test the
development results as early as possible. The use of in-the-loop test methods is a proven
means for this in the practice of controller development and for embedded systems
software development in general. In addition to Hardware-in-the-Loop (HiL), Software-
in-the-Loop (SiL) and Model-in-the-Loop (MiL) methods are increasingly being used
with the aim of making systems more testable in the early phases of development, to
reduce costs and to shorten development times.
Like all companies, Airbus Defence and Space is interested in further improving the
development quality of the systems at manageable costs. The approach to execute
in-the-loop tests early in the process, for example in a simulation environment, should
open up the possibility of achieving a high degree of maturity of the system functionality
before the production of the first prototypes. An unchanged reuse of these tests for
hardware-specific in-the-loop tests is highly desirable from a project management
perspective. The motivation of this work is therefore to answer the questions of the
extent to which test results from MiL, SiL and HiL environments can differ, in which
ways reuse can take place, and where its limits lie.
1.2 Objectives
The following aspects will be dealt with in the context of this work. Development tools,
simulation and test environments from Airbus Defence and Space are used as required
for this purpose:
[Figure: software size in KSLOC for various aircraft programs (F16A, B757, B767, F16D, B747, A310, B737, A320, A330, B777, F22, F35); vertical axis from 0 to 20,000 KSLOC]
and coding errors. Not all of the mentioned factors can be handled via software testing,
but testing can help to reveal the weak spots of a system and focus resources on critical
areas.
According to ISO 29119-1:2013, testing is defined as a "set of activities conducted to
facilitate discovery and/or evaluation of properties of one or more test items" [ISO13,
p. 12]. The purpose of test cases is thus to check whether the requirements of an object
have been met. Ensuring the functional correctness and fulfilment of requirements are
the main goals of software testing, but there are other advantages as well [WHM18]:
tests show their benefits through their positive effects on a project's maintainability
and adaptability. Software that can be adjusted quickly and easily to new conditions
can save a lot of time and, consequently, money.
Figure 2.3 depicts the contents of software testing in a tree-like structure. The
subdivision is primarily based on static and dynamic testing, which will be discussed
further in the following sections. The contents are built upon the explanations of
Frühauf; Ludewig; Sandmayr [FLS07].
The idea of trying something and evaluating the result corresponds to most people's
understanding of testing. This type of testing is often misinterpreted as testing as a
whole, but should more precisely be called dynamic testing. As illustrated in figure 2.3,
dynamic testing is further divided into two subcomponents: black box testing and
white box testing.
The prerequisites of black box testing are specifications. These are intended to describe
the inputs and expected outputs of the system under test (SUT). The internal states
of the SUT are not of relevance, the focus is on the comparison of the specified to
the actual results. For the selection of test cases, there are criteria such as input or
output coverage. This means, for example, that each input or output datum should
be covered by a test case. Even with simple systems, meeting these kinds of criteria
can be impossible due to the combinatorial possibilities. Methods such as equivalence
partitioning or boundary analysis are therefore used, which keep the mass of test cases
at a reasonable level while preserving the greatest possible benefit.
In white box testing (also referred to as glass box testing), the inner structure, essentially
the source code, is available. The selection of test cases depends on traversal options,
where sequence graphs of the program serve as aids. Loops lead particularly quickly
to a high number of test cases, which is why one tries to find compromises with this
method as well. For example, the goal is to traverse each statement, branch, or path
at least once to find errors in this way.
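The branch-oriented selection of test cases described above can be illustrated with a small, purely hypothetical C function; the three branches of its sequence graph determine the minimum set of tests:

```c
#include <assert.h>

/* Hypothetical SUT: clamps a value into the range [lo, hi].
 * Its sequence graph has three branches: x < lo, x > hi, and the
 * fall-through path. */
int clamp(int x, int lo, int hi)
{
    if (x < lo)      /* branch 1 */
        return lo;
    if (x > hi)      /* branch 2 */
        return hi;
    return x;        /* branch 3: fall-through */
}
```

Three test cases, one per branch, already yield full branch coverage and therefore full statement coverage of this function.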
Performance measurement examines the timing behaviour of the software. Response
time analysis and throughput measurements have proven to be efficient instruments
for this; however, they require careful and comprehensive preparation.
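As a minimal sketch of such a response time measurement, timestamps can be taken around a call to the software under test with the standard C clock() function; sut_step() is a hypothetical placeholder, not part of any real tooling:

```c
#include <assert.h>
#include <time.h>

/* Hypothetical stand-in for the software under test. */
int sut_step(int input)
{
    return input * 2;
}

/* Measures the processor time spent in one SUT call, in clock ticks. */
double measure_response_ticks(int input, int *output)
{
    clock_t t0 = clock();
    *output = sut_step(input);
    clock_t t1 = clock();
    return (double)(t1 - t0);
}
```

In practice, a higher-resolution, platform-specific timer would be used; the standard clock() call is chosen here only to keep the sketch portable.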
Dynamic testing has the advantage of being reproducible and therefore objective. The
invested effort can be reused several times, and the target environment is also checked
during execution. The system behaviour is made visible and thus provides insights,
which can be important for the succeeding development. On the other hand, it is hardly
possible to examine important quality properties by testing, such as reusability or
maintainability. Tests can only detect that the behaviour differs from the specification,
but locating the defect often causes more effort than the eventual elimination. It is also
not possible to replicate all real-life situations in a test case, such as the emergency
treatment in a nuclear power plant. Moreover, even tiny programs with a single real
type parameter have at least 2³¹ ≈ 2,000,000,000 execution options, with no continuous
transition guaranteed. A correctness assertion by testing alone cannot be achieved in
any way.
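The sheer size of the input space can be made concrete with a hypothetical example: a single-precision float has roughly 2³² distinct bit patterns, and a defect may hide at exactly one of them while all neighbouring values behave correctly:

```c
#include <assert.h>

/* Hypothetical SUT with an injected defect at exactly one of the
 * ~2^32 possible float inputs: correct everywhere except x == 0.5f. */
float buggy_scale(float x)
{
    if (x == 0.5f)
        return 0.0f;      /* injected defect */
    return 2.0f * x;
}
```

A test suite that samples even millions of values is overwhelmingly likely to miss the single defective input, which illustrates why testing alone cannot prove correctness.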
2 Software Testing Theory 7
Static testing is the second major pillar of software testing with a distinction between
computer and non-computer support. The first category, tests against rules, is intended
to provide information about whether certain norms and standards have been met.
Compiler checks can be considered as one of the first tests of this kind. Today, there is
a wide range of integrated development environments (IDE), which offer tests of this
type. Furthermore, it is also possible to check if, for example, goto statements have
been used, or the rules of the Unified Modeling Language (UML) specification have
been observed in an object-oriented environment.
Similarly, the computer-aided approach allows for consistency checks. Modern tools
are able to unveil errors related to the program flow, such as unreachable code or the
use of variables that were not previously initialized. In addition, warnings can be
issued that do not affect functionality, but are important for maintainability, such as
unused constants or variables. In the literature, these tools are often referred to as
static analysers or, misleadingly, as logic analysers. The mentioned tooling does not
check the logic of the program, but its compliance with rules.
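The kinds of findings described above can be sketched with a small, artificial example; both defects are harmless at runtime but are exactly what typical static analysis reports:

```c
#include <assert.h>

/* Artificial example: functionally correct, yet containing two
 * findings a static analyser would report. */
int absolute(int x)
{
    int unused = 42;   /* variable never read: maintainability warning */
    if (x < 0)
        return -x;
    return x;
    return 0;          /* unreachable statement: flagged by analysis */
}
```

Dynamic tests of this function pass, which shows that such findings are invisible to purely dynamic testing.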
Quantitative inspections cover aspects like: how many modules are fully implemented,
which dependencies exist between the components, or what the rate of test coverage is.
These offer interesting insights into the state and quality of a product and are usually
covered by the use of metrics. The resulting figures can be used, for example, to control
further test activities.
Another big part of static testing is the review. This method requires one of the greatest
efforts, because it is not computer-aided. Like the previously mentioned techniques, the
software is not executed, but visually inspected. In the most common form, the code is
analysed by a group of people, whose goal is to find defects and potential risks, but not
to fix them. Reviews have the advantage that they can be done early in the development
stage, even before executable programs are available. The latency of a defect can be
reduced, which results in lower costs for the development. Not only programs but every
readable document can be checked via review; these include, for example, requirements,
design drafts, user manuals, regulations, test data and so on. The discussion during a
review leads to an alignment of standards between the contributors, knowledge is shared
and a common understanding is supported. Conversely, there are also disadvantages
which should be mentioned. Reviews cannot run alongside a project but take time
for preparation and execution. Another point is that it can be difficult to maintain
an overall view. Especially the interlinkage of object-oriented programs is stronger
compared to imperative programs, and polymorphism additionally complicates the
recognition of the used classes and methods. Third-party software is rarely available at
code level.
Due to this circumstance, reviews are often executed for specific parts of a project only.
For an individual, a review can also be a threat, namely when the tone does not remain
constructive, but accusing.
To sum up, static testing requires effort, for example to set up the tooling or to prepare
reviews; on the other hand, it can save a lot of time and money, because defects can
be found very early in the development process and the fix can often follow very
quickly. For many projects, the signed-off outcomes of static testing are also necessary
to fulfil standards and legal regulations. As a result, it is advised to include static
testing in every test plan and test level.
[Figure 2.4: V-model. Left branch (top to bottom): project request, user requirements definition, software system requirements definition, architectural design, detailed design, code. Right branch (bottom to top): unit tests on compiled modules yielding tested modules, integration tests yielding tested subsystems, system tests yielding the tested system, acceptance tests yielding the accepted software.]
The process and relationships between testing and specifications can be explained with
a traditional software development model, also known as V-model. Figure 2.4 shows
such a model as used by the European Space Agency [ESA94]. In an exemplary project,
the time course begins from the top left. On the way to the code, the requirements
are specified more and more precisely and end up in the detailed design description
before they are finally converted into code. The arrows in the upward direction of the
left branch represent small iterations, which can be triggered as a result of reviews.
The dashed arrows from right to left represent verification activities which will be
discussed below. The model defines specific artefacts for each transition; these have
been omitted for reasons of clarity. The test levels of the right branch are further discussed
in ascending order [Grü17]:
Unit testing
Unit Testing is the test of a software component against its (detailed) design. A
component is defined as a meaningfully testable unit in isolation, for which a
separate specification exists. The term module is often used in object-oriented
environments or if a complete file is used as a test item. In principle, both terms,
component and module, describe functions that operate with a specific set of data
and high internal cohesion, which is why this thesis does not explicitly distinguish
between the two expressions. Besides black box testing, the practical use of white
box testing can also be found at this test level. Unit tests are usually performed
after a code review. This order is more efficient because, if design improvements are
proposed in the review, previously written unit tests may have been written in vain.
Integration testing
Once there are individual tested modules, integration testing can be used to check
whether the interfaces to other modules work as intended in the architectural
design. There are different types of integration tests, which have the task of
checking dependencies between software modules, among hardware parts and
between software and hardware. Typical errors found at this level are different
interpretations of interface specifications. Environment simulation, devices for
generating stress at communication interfaces, protocol analysis tools and data
loggers are examples of useful tools for integration testing.
System testing
The system test is carried out after the integration test. This test is based on
the tested and integrated subsystems and has the purpose of ensuring that the
specified requirements have been met. Errors are also found at this stage because
this is the first time the tests are formulated against the requirements, and not against
the design as in the two previous stages. So, if the design itself is faulty, it often only
becomes apparent in the system test. As far as possible, testing is done in the
target environment and since the overall system only provides limited insight
into internal states, usually only black box techniques are used.
Acceptance testing
The tested system is the subject of acceptance; all lower hierarchy levels are
available and tested. The acceptance test is carried out by the customer or in the
presence of the customer. The aim is to demonstrate the fulfilment of the original
user requirements. Acceptance is of particular importance for contract projects,
as payments are made after acceptance has been passed and the guarantee period
begins.
Vigenschow also uses the V-model to explain the terms verification and validation,
whose interpretation is largely based on Barry Boehm. Verification is described as
testing the conformity between a software product and its specification (doing things
right). This nomenclature can be assigned to unit, integration and system testing. The
validation is stated as testing the suitability of a product in relation to its intended use
(doing the right things), which can be related to acceptance testing [Vig10].
In addition to the V-model, there are a variety of further approaches and process
models. Prototyping, for example, develops separate sections of the target system in a
very short time, with a reduced set of specification, in order to clarify requirements.
In the meantime, agile methods like Scrum or extreme programming (XP) have also
become popular. These completely dispense with comprehensive specifications in the
traditional sense [FLS07].
With many embedded or safety-relevant systems, partial software deliveries are possible
only to a limited extent or make little sense. Current variants of the V-model also take iterative
approaches into account [Wei06]. Either way, checking the quality at every development
stage by testing has proven successful in practice for many years and will probably
continue to do so in the future.
In black box testing the modules are tested solely based on their specification. This
test method can be applied to virtually all levels of software testing: unit, integration,
system and acceptance tests. The combinatorial variety of possible test cases is too high
in almost all practical applications. The following options are used to systematically
reduce them [FLS07; Grü17; MBT04]:
Equivalence Partitioning
The essence of equivalence partitioning is to subdivide the input data domain so
that the same errors are found with any value in an area as with any other value
of that area. Only one test case from each partition is then required, which reduces
the number of test cases to a reasonable level.
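A minimal sketch of equivalence partitioning, using a hypothetical range check; the specification of a valid commanded speed between 0 and 250 km/h is invented for illustration:

```c
#include <assert.h>

/* Hypothetical SUT: validates a commanded speed in km/h.
 * Assumed spec: the valid range is 0..250.
 * Equivalence partitions of the input domain:
 *   P1: v < 0          (invalid, too low)
 *   P2: 0 <= v <= 250  (valid)
 *   P3: v > 250        (invalid, too high)
 * One representative value per partition is sufficient. */
int speed_command_valid(int v)
{
    return v >= 0 && v <= 250;
}
```

Adding the boundary values 0, 250 and 251 to the three representatives corresponds to the boundary analysis mentioned above.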
Of course, there are a lot of other methods. For example, the area of robustness testing,
whereby it is examined whether a system works properly with invalid inputs or stressful
environmental conditions, can also be regarded as a black box technique. Depending
on the form of the specification, use case testing can also be a valid option. In error
guessing, the tester considers which areas of the SUT could be particularly error-prone
and develops test cases accordingly. In safety-critical areas, however, this approach
should be rejected because it is not based on the specification and false assumptions
can cause great damage [Grü17; KS11].
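Robustness testing as described above can be sketched as follows; average() is a hypothetical SUT that must react in a defined way to invalid inputs instead of crashing:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical SUT: computes the integer mean of an array.
 * Robustness requirement (assumed): invalid inputs such as a NULL
 * pointer or an empty array must yield a defined error code. */
int average(const int *values, size_t n, int *result)
{
    if (values == NULL || n == 0)
        return -1;                 /* defined reaction to invalid input */
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += values[i];
    *result = (int)(sum / (long)n);
    return 0;
}
```

The robustness test cases deliberately stimulate the invalid partitions, which a purely specification-of-valid-inputs test would never exercise.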
In contrast to black box testing, the source code of the SUT is available for white
box testing. The central element for the selection of test cases in white box testing is the
sequence graph, which is created from the source code. Since white box tests are based
on the implementation level, they are mainly applied to unit and integration tests.
Measuring the test coverage is of particular importance for this type of testing. It
shows the tester which parts of the program are still untouched or which conditions
have never been run through. The tester then creates test cases to achieve the desired
test coverage. White box test design techniques include the following code coverage
criteria [Grü17; MBT04]:
Statement Coverage
Each node of the graph and thus each statement/instruction of the program is
traversed in at least one test.
Branch Coverage
Each branch and thus every edge of the sequence graph, is traversed in at least
one test. Examples of branch statements are switch, do-while or if-else statements.
Branch coverage implies statement coverage and is synonymous with the term
decision coverage for most authors.
Path Coverage
In path coverage testing, one attempts to traverse all possible program paths from
the start node to the end node. Loops can be problematic here, which is why
full path coverage is rarely sought. Path coverage guarantees complete branch
coverage.
Decision/Condition Coverage
If, in addition to branch coverage, a change of each sub-condition of a boolean
expression is required, it is called decision/condition coverage.
Modified Condition/Decision Coverage (MC/DC)
Every sub-condition that can influence the program branching must be shown to
determine the program flow independently of the others. This is done by varying
just a single sub-condition while holding all other sub-conditions fixed.
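MC/DC can be illustrated with a hypothetical decision over three sub-conditions. For a && (b || c), four test vectors suffice (instead of the eight exhaustive combinations); in each demonstrating pair, exactly one sub-condition changes and the outcome flips:

```c
#include <assert.h>

/* Hypothetical decision with three sub-conditions a, b and c. */
int decision(int a, int b, int c)
{
    return a && (b || c);
}
```

The four vectors (1,1,0), (0,1,0), (1,0,0) and (1,0,1) achieve MC/DC: the first pair shows the independent effect of a, the pairs (1,1,0)/(1,0,0) and (1,0,1)/(1,0,0) show the independent effects of b and c.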
There are a lot more kinds of test coverage criteria than the few presented here. For
software with high safety requirements, however, the deployment of 100% MC/DC is
common today. It is highly recommended for example in ISO 26262, a standard for
safety-relevant automotive systems, as well as in the avionics software development
guidance DO-178C to ensure appropriate testing of the most critical (Level A) software,
whose errors would cause or contribute to failures of system functions resulting in
catastrophic failure conditions of the aircraft [Grü17; DO-11].
The basis for a model, and MBT as a whole, are system models that describe the
specified behaviour of the SUT, abstract test models which are derived from the
requirements, or a combination of these. When the model exists, test generation can
be carried out, which presupposes the choice of a test selection criterion. In addition to
the test case specification and certain coverage goals a variety of other criteria can be
used for this, such as random and stochastic methods, fault-based methods or mutation
analysis. The technological possibilities to generate tests are also extensive. The tests
can be generated before they run (offline) or be adapted during runtime based on
the outputs of the SUT (online). They can be produced by hand or, more commonly,
be generated fully automatically. Theorem provers and model checkers are used for
verifying the properties of a system. Their mathematical procedures can help to find
counterexamples. Symbolic execution is based on constraints, whose instantiation into
resulting traces leads to test cases. Graph search algorithms aim to traverse nodes
and arcs and thereby offer a good possibility of reaching a certain coverage criterion. In
contrast, this takes an undefined period of time with random generation. In the end,
the test case generation leads to executable tests in the form of models, scripts or
code.
Test execution is accomplished by test platforms. The main purpose of these is to
provide the SUT with inputs and to collect the outputs. Possible execution options also
include MiL, SiL and HiL Testing, which is prevalent in the development of embedded
controllers. The following chapter covers these types of tests in detail. The execution
of reactive tests implies an evaluation of the output signals of the SUT, on the basis of
which the subsequent input signals are set. The test execution thus varies with the
behaviour of the SUT. This is not the case with nonreactive tests. Configurable test
log generation, which provides detailed information about test steps, executed methods
and test coverage, is standard for most test platforms today.
In the test evaluation phase, the SUT results are compared with the expected ones
and a verdict is assigned. The evaluation criteria can be based on a specially tailored
test evaluation specification, for example. Requirements coverage aims to cover the
specified SUT requirements. This presumes, however, that the requirements have a high
degree of testability. The reference signal based approach is also known as back-to-back
testing. Here, an ideal set of signals and data sets is generated and the obtained test
results are compared against these. The signals to be compared can either be pure or
differentiated into so-called signal-features, based on their properties such as minimum,
maximum, reaction time, increase rate and so on. Similar to the test creation, the
evaluation can take place automatically or manually and online or offline.
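The signal-feature comparison mentioned above can be sketched as follows; the chosen feature set (minimum and maximum only) and the tolerance are illustrative assumptions, not a fixed scheme:

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Back-to-back evaluation sketch: instead of comparing two signals
 * sample by sample, selected signal features are extracted and
 * compared against the reference within a tolerance. */
typedef struct { double min; double max; } signal_features;

signal_features extract_features(const double *sig, size_t n)
{
    signal_features f = { sig[0], sig[0] };
    for (size_t i = 1; i < n; i++) {
        if (sig[i] < f.min) f.min = sig[i];
        if (sig[i] > f.max) f.max = sig[i];
    }
    return f;
}

int features_match(signal_features a, signal_features b, double tol)
{
    return fabs(a.min - b.min) <= tol && fabs(a.max - b.max) <= tol;
}
```

The feature-based comparison tolerates small numerical deviations between test levels, which a strict sample-wise comparison would report as failures.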
MBT models represent the functional behaviour of the application to be tested (or
parts thereof) and do not reflect the internal SUT design. Therefore, the tests are
usually seen as black box tests [ZSM12]. Grünfelder explicitly mentions that it is
important not to generate code and tests from the same model. Otherwise, an error
in the (SUT) model would affect both the code and the test cases and would not be
visible. The only thing that would be tested against each other would be the respective
code and test generators. It is therefore better to have system models for the tests that
are independent of the implementation [Grü17].
[Figure: MBT taxonomy. Test selection criteria: requirements coverage, test case specification, data coverage, structural model coverage, random and stochastic, fault-based, mutation-analysis based. Test generation technology: automatic/manual, random generation, graph search algorithms, model checking, symbolic execution, theorem proving, online/offline. Test execution options: MiL/SiL/HiL (simulation), reactive/nonreactive execution, generating test logs.]
Graphical modelling languages have proven helpful for the purpose of MBT, especially
when it comes to deriving information and data for dynamic tests. The models used
for this purpose typically describe the behaviour, but architectural models can also support
test data generation. There is a wide range of language standards and notations with
relevance to MBT, some representatives are listed below [ZSM12; Grü17]:
The idea of extracting test cases and other test data from a model is not new.
Corresponding work on this was published as early as the 1970s, cf. [Cho78]. MBT
test cases tend to result in many tests. Hence, MBT is particularly effective when the
tests are not performed manually, but fully automatically [Grü17]. The number of
MBT tools available on the industrial market is growing steadily; a recent overview of
available tools can be found, for example, in “A Survey on Model-Based Testing Tools
for Test Case Generation” [LLS18].
The practical use of MBT offers several benefits. Besides tests and test data, artefacts for
traceability, i.e. the association of each requirement to the related verification activities,
can be generated automatically. Usually the tooling is also able to locate implicitly
covered tests. Quick execution of regression tests is one of the main advantages. If the
specification changes, for example, the entire test suite can be regenerated, executed and
evaluated with little effort. Since the test models operate on an interface abstraction
level, MBT is also very well suited for in-the-loop test approaches as they are presented
in the next section [Pel18].
In-the-loop testing is mainly used in the development of controller software for embedded
systems. The terminology of embedded systems includes a variety of elements. In
general, it is a processor that is involved in a technical context. Its wide range of
applications covers all areas of daily life, such as washing machines, hearing aids,
smartphones, control systems and many more. There is a common characteristic to
be recognized: the interaction with the physical world. Besides the processor, further
components are appropriate memory units (RAM and ROM) as well as analogue-to-digital
and digital-to-analogue converters. Power can be supplied via the mains as well as by batteries.
Embedded systems receive signals via sensors and influence the outside world by
actuators. An electronic control unit (ECU) is an embedded system that controls one
or more electrical systems. The physical environment, including sensors and actuators,
is hereinafter referred to as plant and is connected to the embedded system, respectively
ECU, through interfaces [BN08].
For the test of an embedded system, in the simplest case, appropriate stimuli are applied
to the inputs of the isolated system and the processed outputs are monitored in order
to evaluate them. There is no interlinkage with the environment, which results in a
configuration similar to conventional unit tests. This procedure is also described as
one-way simulation [BN08].
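A one-way simulation in this sense can be sketched in a few lines of C; threshold_monitor() stands in for an arbitrary isolated SUT, and there is deliberately no feedback path from the outputs back to the inputs:

```c
#include <assert.h>

/* Hypothetical SUT: raises an alarm flag when the input exceeds a limit. */
int threshold_monitor(int sensor_value)
{
    return sensor_value > 100;   /* 1 = alarm, 0 = normal */
}

/* One-way simulation: apply a stimulus vector to the isolated SUT and
 * collect the output vector for later evaluation; no plant feedback. */
void one_way_simulate(const int *stimuli, int *outputs, int n)
{
    for (int i = 0; i < n; i++)
        outputs[i] = threshold_monitor(stimuli[i]);
}
```

The recorded output vector is then compared against expectations, exactly as in a conventional unit test.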
Model-in-the-Loop (MiL)
Early phases of the development process are characterized by the presence of
mathematical models such as physical models of the engine or chassis components
as well as the external environment, respectively the plant. The advances in
computing power of today’s desktop PCs allow the models to run together on
the same machine, without any other physical or hardware components. The
prerequisites for this level are therefore models of the ECU’s software, the plant
and additionally a simulation environment in which they can be executed. The
objectives of MiL simulation are to test the model architecture and functions against
requirements, obtain reference signals and to verify the developed algorithms in
a hardware independent approach [Jai09; SPK12].
Software-in-the-Loop (SiL)
In the SiL context, the actual ECU code is used in order to investigate the
properties of the ECU (or part of it) to be developed. The code is derived from
the model and can either be handwritten or generated automatically. As with
the MiL simulation, there is a virtual plant model that can run on the same
machine as the ECU software. SiL simulations allow the analysis of effects due
to restrictions of the target software. If, for example, the software is built for
a fixed-point architecture, the necessary scaling is already part of the software.
Arithmetic problems can be identified by comparing the results of the ECU code
with the functional model [SPK12].
Hardware-in-the-Loop (HiL)
HiL tests are executed with the final ECU software and the designated target
hardware. The environment of the ECU can still be simulated. However, the data
is transformed using appropriate converters to interact with the control unit via
the electrical interface. The environment models must have a so-called real-time
behaviour, which means the response of the models must be guaranteed within
specified time constraints. This is to ensure realistic tests so that communication
with the ECU is the same as in a real environment. The main goal of HiL-testing
is to verify the correct operation of the embedded software in the target processor
and its surrounding components and environment [BN08; SPK12].
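The arithmetic comparison between the functional model and target-oriented code mentioned in the SiL description can be sketched as follows; the Q15 fixed-point scaling (1 sign bit, 15 fractional bits) and the gain value are assumptions for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Functional (floating-point) model of a simple gain block. */
float gain_model(float x)
{
    return 0.5f * x;
}

/* Target-style fixed-point implementation of the same gain in Q15:
 * 0.5 is represented as 16384, and the product is rescaled by >> 15. */
int16_t gain_fixed_q15(int16_t x)
{
    int32_t k = 16384;   /* 0.5 in Q15 */
    return (int16_t)(((int32_t)x * k) >> 15);
}
```

Running both representations on the same stimuli and comparing the results within a tolerance is a back-to-back check for exactly the arithmetic problems described above.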
Another intermediate step between SiL and HiL, processor-in-the-loop (PiL), is often
described in the literature, cf. [KDM18; Vig10; ZSM12]. PiL means that the
software runs in the target processor or an emulation of it, also with a surrounding
virtual environment. The aim is to verify the system behaviour on the target processor
and reveal faults which are caused by, for example, the target compiler or the processor
architecture. Since these kinds of errors would also occur in HiL-testing, this level is
not discussed further.
MiL and SiL are commonly used in the early stages of development, primarily to
find functional errors, optimize the design and create a proof of concept. In order
to find problems regarding the real hardware used later, HiL tests must be executed.
Issues with low-level services can be identified here, such as problems with interfaces to
external hardware and issues with sensors and actuators. These would be very difficult
to assess with MiL- and SiL-testing alone [BN08].
It is desirable to design tests only once and to reuse as many of them as possible across
all levels of testing. This approach allows the same input test signals (patterns) to be
used at all stages. In addition, the results of the individual test levels can be compared
with each other, so that deviations can be determined and possible errors eliminated. Such
an integrated approach is presented in figure 2.7. It should be noted that although
the logical description of inputs and outputs can be reused, the physical representation
of the signals differs between the separate stages [BN08].
How the representation can differ, which problems can arise and what to look out for
in the case of reuse, is dealt with in the main part of this work.
[Figure 2.7: Integrated test approach. A common test pattern drives the model (Model-in-the-Loop), the software (Software-in-the-Loop) and the software and hardware (Hardware-in-the-Loop), each coupled to a plant simulation; the test results of all stages feed a common evaluation.]
costs arise through the use of in-house or open-source solutions, it still requires effort
to automate the required tests. However, the costs are amortized depending on the
number of repetitions. There are different opinions about the necessary number of
recurrences: Broekman and Notenboom report three to six repetitions, while Grünfelder
assumes a factor of ten. Besides that, any changes to the SUT are problematic, as they
mean expensive rework for the tests. Another point is that automated testing only
reveals errors in values that have been programmed to be checked. In contrast, further
errors might also be noticed during manual testing. Fundamentally, existing problems in
processes remain: poor documentation, incomplete tests or poorly trained staff do
not improve because of test automation [BN08; Grü17].
To sum up, test automation does not work equally well for all types of projects. The
additional development effort as well as the possible costs for software licensing and
infrastructure can be problematic, especially for small projects with limited budgets and
constantly changing codebases. Automated testing requires a greater investment of time
upfront, but it saves time and therefore money on regression testing. This means that
test automation is particularly helpful for projects with successive releases of the same
software. By using appropriate abstraction layers for test data and communication,
tests written once can be reused in different stages of development, as will be seen
later in this work. The minimization of human errors and the good transparency make
test automation an important means for the verification and validation of software.
In projects with liability risk and maximum safety standards, documentation of the
answers to the questions outlined above is essential. The planning stage of DO-178C,
for example, requires the development of a so-called Software Verification Plan to
accomplish this task [DO-11].
The next steps comprise the determination and implementation of measures that are
necessary, so that test work can proceed and the goals specified in the plan can be
achieved. It should therefore be monitored that guidelines are met, weaknesses in the
process are identified and solved, changed framework conditions are taken into account
and, ultimately, the timetable is adhered to. As mentioned in section 2.2, software
metrics can be used to express the testing progress in numbers and finally make it
measurable. These metrics are reflected, for example, in the number of successful test
cases per test run [Grü17].
Each project should specify how to deal with found errors. Bug tracking tools are
an important aid to test management. Before using these tools, however,
it should be determined which fields of the input masks are to be filled and when a
bug should be entered: for example, during unit and system tests or during field usage,
or whether bug tracking only starts after a certain milestone. It should also be
determined when and by whom a bug is investigated, which software versions are affected
by an identified bug and how authorities are informed [Grü17].
Similar to the maturity models for evaluating development processes (SPICE, CMM,
CMMi), there are also maturity models for software testing processes. For example,
the Testing Maturity Model (TMM), the Test Process Improvement (TPI) evaluation
model, or TestSPICE. The aim of these models is to evaluate test processes and to make
target shortfalls recognizable as early as possible. The level of maturity determined in
an audit allows purchasers of software to get an idea of the supplier’s processes and
thus to evaluate their own project and liability risk [Grü17].
As a basic approach, the framework in figure 3.1 is discussed in the publication. The
central element is the Integrated Test Management Tool, which consists of four
components: Test Script Developer, Test Manager, Test Handler and Test Report
Viewer. To meet the above-mentioned reuse requirements, the test management tool
uses special configuration files which shall enable independence of test benches, test
platforms, test languages and ECU variants. The abstraction is achieved through
generic ports, which enable access to model ports, ECU signals, data logging and fault
simulation. On top of that, so-called signatures, i.e. function calls or a range of
operations defined for every port, can be invoked from scripts.
Finally, the paper reports on a user study in which tests for fuel cell controls could
be successfully reused for various in-the-loop tests without any adaptations. Products
from the commercially available ETAS tool chain were used for this purpose. This tool
set (INTECRIO, INCA and LABCAR) was also used in the paper “Model-based ECU
development – An Integrated MiL-SiL-HiL Approach” [Jai09]. In these approaches,
INTECRIO is used as a central integration platform for a variety of model standards.
LABCAR is utilized to create, perform and automate ECU tests, and INCA supports
the measurement, display, calibration and evaluation of physical ECU data.
[Figure: The Silver virtual-ECU environment and its interfaces, including XCP, CAN, TCP/IP debugging, Python scripts, S-functions, FMUs, rapid prototyping, MDF/CSV measurement and calibration data, and couplings to tools such as MS Visual Studio, CANape, INCA, Modelica, SIMPACK, AMESim, Simulink and a vehicle model.]
The test automation tool TestWeaver, also developed by QTronic, complements the
current approach. As can be seen in figure 3.3, TestWeaver connects to a SiL, MiL
or HiL simulation. It controls the inputs and observes the outputs. The connection
to the system simulation is done via libraries. For example, a connection block-set
is provided for Simulink. Dymola and SimulationX can be connected via a Modelica
library. Python and C libraries are supplied for Silver or other environments.
Figure 3.3: Generation and evaluation of test scenarios with TestWeaver [TM14]
processor. The serial I/O network for these boards is based on Gigabit Ethernet and is
integrated via PCIe cards. The article reports that I/O libraries are available for
widespread aerospace and automotive bus systems such as ARINC 429, MIL-STD-1553,
CAN, LIN, FlexRay and UART. Because the I/O boards are programmable, more I/O
functions can be added at any time.
[Figure 3.4: Plant models serve both MiL/SiL testing on the VEOS offline simulator with virtual ECUs and HiL testing on the SCALEXIO real-time simulator with real ECUs.]
Himmler mentions in his further publication “From Virtual Testing to HIL Testing -
Towards Seamless Testing” [Him14] a number of (dSPACE) tools which can be used for
simulation control, test automation and so on. The use of standards such as the ASAM
(Association for Standardization of Automation and Measuring Systems) XIL API
and FMI is intended to ensure that overlying applications can exchange tests and
configurations without further modifications.
Similar to the previous publications, this article also presents a test management tool,
the approach of which is shown in figure 3.5. dSPACE has introduced SYNECT to
ease the test management, taking into account the large amount of common data,
models and test configurations, given the context of many types of variant platforms.
Furthermore, traceability shall be granted by dedicated interaction between the test
tools and requirements management tools such as IBM Rational Doors and MKS
Integrity.
[Figure 3.5: SYNECT-based test management spanning virtual ECU, MiL and SiL testing, HiL testing and third-party test tools.]
Figure 3.6: SiL, PiL and HiL example of the Charles Stark Draper Laboratory [CV14]
adjust and monitor them. A list of the software and hardware used in this
project can be found in tables 3.1 and 3.2.
In the presented work, special emphasis was placed on accommodating signal
ICD changes through automation. This was enabled by a centrally held signal database,
which allowed interface changes to be carried out with as little effort as possible. The
subsequent testing stages and configurations provided various test possibilities and
V&V evidence automatically. The modular setup and especially the exclusive use of
commercial-off-the-shelf (COTS) software and hardware in the different stages allowed
easy replacement and extension of existing components. Unfortunately, no further
information was given about the elements under test, but the approach is nevertheless
remarkable and the design well conceived and structured.
3.5 Conclusion
The articles shown represent only a small selection of the available publications and
solutions. The sheer number of tools and standards, as well as their rapid development,
allows more and more solutions and variations. Unfortunately, many of the published
articles give the impression that manufacturers use them as a presentation platform
for their purchasable tools. In addition, except for the last article presented,
the approaches are often kept very abstract. What a concrete use of the
tools in the MiL, SiL and HiL context can look like, and by what means the abstraction
is designed, can usually only be guessed at. Which kinds of tests are affected and in
which areas problems arise when reusing tests between MiL, SiL and HiL testing is
hardly mentioned, which is therefore a motivation for this thesis.
4 Project Context
At the beginning of this thesis, it was planned to carry out the investigations on the
basis of an already developed flight control system. Corresponding models and software, as
well as the (test) target hardware in conjunction with existing environment models,
were intended to underpin the present study. However, several reasons stood against this
approach, including:
Two computers with Windows and Linux operating systems (Host 1 and Host 2) were
on hand. Their characteristics are listed in table 4.1.
4.1.2 Microcontroller
A Texas Instruments microcontroller unit (MCU) with the product name TMS570LC4357
was used for hardware-related considerations in the context of embedded systems. The
derivative incorporates an ARM Cortex-R5F processor. This 32-bit reduced instruction
set computer (RISC) CPU offers several error-detection capabilities, which is why it is
mainly used in safety-critical applications. Some features of the MCU are listed below;
for a more in-depth description, reference is made to the corresponding datasheet and
technical reference manual [Tex16; Tex18]:
• Dual-core CPU in lockstep configuration. The two cores always receive the
same inputs and perform the same calculations. The generated outputs are
then compared with each other. An error signal is generated in the event of a
deviation.
• Up to 300 MHz CPU clock
• 8-stage pipeline and dynamic branch prediction for operation execution
• Floating-Point coprocessor with single and double precision
• 32 KB of instruction and 32 KB of data caches
• 4 MB of program flash
• 512 KB of RAM
• 128 KB of EEPROM, emulated from flash
• Error-correcting code (ECC) for flash, RAM, EEPROM and caches
• Multiple communication interface modules like Ethernet, UART (SCI), I2C, CAN
• Two 12-bit analog-to-digital converters
• Up to 145 pins available for general purpose input/output (GPIO)
• Big-endian (BE32) format supported
During the work, two Texas Instruments evaluation boards with the part number
LAUNCHXL2-570LC43 were available. As figure 4.1 and table 4.2 show, the board
includes the MCU presented in the former paragraph, as well as some user and
connection interfaces such as switches, LEDs, a potentiometer, GPIO pins and an Ethernet
connector. Particularly noteworthy is the USB connector with coupled JTAG interface,
which enables programming, debugging and UART communication. Further information
can be found in the manufacturer’s documentation [Tex15].
Position  Component
P1        TMS570LC4357 microcontroller in 337-pin package
P2        Potentiometer
P3        Power-on and warm reset switches
P4        Ethernet connector
P5        User switches
P6        User LEDs
P7        I/O pin connections for prototyping (exemplarily marked)
P8        USB connector
The following tools are mentioned in the upcoming chapters. Unless otherwise stated,
they were used with their default settings and their supplied libraries. Compiler
optimization switches were disabled unless otherwise noted.
MATLAB
MATLAB is a commercial software from the US company The MathWorks for
solving mathematical problems. The environment allows the development of
complex algorithms and the displaying of results by plotting functions and data.
MATLAB is primarily designed for numerical calculations using matrices, from
which the name is derived: MATrix LABoratory [The20b].
Simulink
Simulink is also a software product from The MathWorks and is mainly used to model
technical or physical systems. Simulink is an add-on product to MATLAB and
requires it to run. Further add-ons such as Simulink Coder (formerly Real-Time
Workshop) enable automatic code generation from the models. Embedded Coder
and Polyspace Code Prover, for example, are able to prove the absence of dynamic
runtime errors using formal methods, but these were not available for this work
[The20d; The20a; The20c].
SCADE Suite
SCADE Suite is a commercial product of Esterel Technologies (a subsidiary of
Ansys) and is based on the formal, synchronous and data flow-oriented language
Lustre. With its included code generator called KCG, correctness can be ensured
on a certified basis and the integration into a production system can be accelerated
and simplified. KCG is certified according to several safety-critical standards like
DO-178C, ISO 26262:2011, IEC 61508:2010 and EN50128:2011, which is why it
is used in the aerospace industry, for example [Est20].
Code Composer Studio
Code Composer Studio is an IDE offered by Texas Instruments that supports
their portfolio of microcontrollers and processors. The tool is based
on the open-source development environment Eclipse and provides various tools
for developing, compiling, debugging and analysing embedded applications
[Tex20a].
HALCoGen
HALCoGen is another tool from Texas Instruments. Its name is an abbreviation
for Hardware Abstraction Layer Code Generator. It allows the user to configure
peripherals, interrupts, clocks, memory areas and other parameters for Texas
Instruments MCU derivatives. After that, initialization and driver code can be
generated and imported directly into IDEs like Code Composer Studio [Tex20b].
The version numbers of the COTS tools quoted above are listed in the following table.
Windows and Linux refer to Host 1, respectively Host 2, as named in 4.1.1. In addition,
the versions of the compilers are also specified, to which reference will be made in the
succeeding chapter.
The following tools were mainly developed or co-developed in-house. For reasons of
classification, some of these are anonymised. A more in-depth description of the range
of functions is provided in the upcoming chapters, if necessary.
Simulation Framework
The company’s internal Simulation Framework (SF) enables multiple simulation
models to execute and interact using a variety of services. Simulation models
are usually executed as independent subtasks inside the framework process. They
are loaded using model driver plugins for specific model standards. The models
are connected to the internal simulation framework network, which allows them
to communicate with other models. In addition, the framework provides model
developers with a toolchain to generate adapter code and to build and link model
binaries.
VISTAS
The Eurocae ED-247 specification and its title "Technical Standard for Virtual
Interoperable Simulation for Tests of Aircraft Systems in Virtual or Hybrid
of all listed tools, either directly or indirectly, through appropriate interfaces, which is
why it has been drawn across all contexts.
For the sake of completeness, it should be mentioned that the modelling tools listed
are able to generate binaries from the models in a selectable programming language,
which can be used for (tool-internal) simulations and tests. MATLAB/Simulink can
even run full HiL tests using the appropriate add-ons. The in-house TSS can also
execute models, including Simulink models. However, fully exploiting all these possibilities
would go beyond the scope of this work, which is why the tools are mainly used and
examined to the extent described above.
5 Methodology
The deviations between the test levels are considered on the basis of accuracy, performance
and reusability, which also determines the subdivision of this chapter. In the
following, the programming language C is mainly used for examples and explanations.
The C standard gives few guarantees regarding the accuracy of results printed in decimal.
With C99, the hexadecimal floating-point literal was introduced, which allows
an exact interpretation of the results [ISO99, Sec. 7.19.6.1]. All presented results
were checked accordingly. Numbers without a suffix are to be interpreted as decimal
numbers.
Data uncertainty
Errors which may arise from data acquisition, such as measurement errors caused
by biased instruments. Additional disruptions may occur due to converting
and storing this data.
Truncation errors
Errors caused by discretization, originating from numerical methods. Taylor's
theorem, for example, states that any smooth function can be approximated
as a polynomial; the number of terms determines the accuracy and therefore
the error.
Rounding errors
Unavoidable errors caused by the limited size and precision of number representations
in computers. Not all quantities can be represented exactly, so values must be rounded.
The following sections focus substantially on the latter point, rounding errors. For this
reason, some basics of floating-point numbers are explained first. Afterwards, rounding
modes are introduced and eventually errors are discussed using suitable examples. In
chapter 5.1.7 accuracy deviations of different test levels are examined.
5 Methodology 38
The set of floating-point numbers is a subset of the rational numbers. Together with
the operations defined on them, they form the finite floating-point arithmetic. Since
the early days of electronic computing, many ways have been devised
to map a continuous set (the real numbers) onto a finite set (the machine
numbers): fixed-point arithmetic, logarithmic and semi-logarithmic number systems,
continued fractions and, more recently, unums and posits, to name just a few examples.
Nevertheless, floating-point arithmetic is by far the most common way of representing real
numbers in modern computers today [MBD18].
Between the 1960s and the 1980s, diverse floating-point arithmetics were designed and
implemented for a variety of computing platforms. All dealt with issues such as binary
formats, overflow, underflow and rounding, but they were often different and incompatible
with each other. Porting numerical code from one platform to another was therefore complex,
error-prone and time-consuming. This led to the publication of the IEEE 754 standard
in 1985 [Hüs18].
IEEE 754-1985 came under revision at the turn of the millennium, and the revision was
adopted in June 2008. IEEE 754-2008 is also known as ISO/IEC/IEEE 60559:2011 and was
superseded by IEEE 754-2019 in July 2019. Besides the definition of arithmetic
formats, interchange formats, rounding rules, operations and exception handling, the
standard also contains recommendations for better numerical results and portability.
In the following, floating-point numbers and their features are only
introduced as far as they are needed in this thesis. For a more detailed treatment, reference
is made to the official standard [IEE19] and pertinent literature such as the Handbook of
Floating-Point Arithmetic [MBD18].
x = s · b^e · m.    (5.1)
According to the standard [IEE19] the components have to be in the following form:
The binary representation requires the arrangement and bit width to be defined.
Therefore the following parameters are introduced:
The standard defines three binary floating-point basic formats, which are listed in table
5.1 together with some of their corresponding parameters. In addition, interchange
formats are specified, which generalize and supplement the basic formats, so that also
16 bit and any multiple of 32 bits are defined.
Note that the numbers listed for emin apply to normal numbers. A normal number is
a floating-point number that can be represented without leading zeros in its significand.
Because of the binary format, the leading bit of the significand has to be one and
can therefore be implied in the memory encoding. A simple example shall clarify the
previous statements.
The number 4.5 shall be saved in a binary floating-point format. As can be seen in
the following two equations, the sign bit s has to be 0 to represent a positive number.
With a base of 2, the exponent e has to be 2 and the significand results in a decimal
number of 1.125.
The significand field string (also called fraction) has to be in the form d0.d1d2...dt,
where di ∈ {0, 1} and i = 0, 1, ..., t, so that the significand fits in as

4.5 = (−1)^0 · 2^2 · Σ_{i=0}^{t} d_i · 2^{−i}    (5.4)

4.5 = (−1)^0 · 2^2 · 1.001b.    (5.5)
5 Methodology 40
Assuming the number shall be saved in a 32-bit format, the binary value for the biased
exponent field is calculated as E = 2 + 127 = 129. The leading digit of the significand,
d0, can be omitted for normal numbers as mentioned before. The resulting
memory encoding is depicted in figure 5.1.

bit 31 (sign): 0
bits 30–23 (biased exponent): 1000 0001
bits 22–0 (fraction): 0010000 00000000 00000000
According to IEEE 754-2019, a valid format also comprises two infinities, +∞ and −∞,
as well as two kinds of NaN (not a number): a quiet NaN (qNaN) and a signalling
NaN (sNaN). Furthermore, +0 and −0 are part of the specification, as are so-called
denormalized or subnormal numbers, whose magnitude is greater than 0 but less than the
smallest normal number, b^emin.
To extend the precision in comparison to the basic formats, the standard specifies
optional extended precision formats. For this purpose, minimum values of the parameters
p and emax are defined for a number of formats. A 64-bit extended binary number must have
a precision p of at least 64 bits and an emax of at least 16383, which applies to the
x87 80-bit extended format, for example [Kus14].
The given values round to the nearest 0.25, which corresponds to two bits right of the
binary point. Numbers 1 and 2 round to the obviously closer number. In 3 and 4, the
given values lie exactly between the next representable numbers. It can be seen that
the resulting value is chosen so that the least significant bit is 0, which is marked in
blue.
5 Methodology 41
In the base-10 system, numbers and fractions can be represented exactly only if the
denominator contains no prime factors other than 2 and 5. Likewise, in the base-2 system,
only numbers built from powers of 2 can be written precisely in a storable form. A very
simple and illustrative example to show the impact of rounding is the binary form of 0.1,
which results in a periodic significand:

0.1 = 1.1001100110011...b · 2^−4

In the following two steps, the number is first converted into a binary32 format with a
boundless fraction, whereafter it is rounded according to round to nearest, ties to even
as described before.
Let x ∈ R and RN(x) be the function that returns the round to nearest, ties to even
value of x. The introduced absolute error is then given by
According to Higham [Hig02], the machine precision ϵmach (also called machine epsilon
or unit roundoff) is defined as

ϵmach = (1/2) · β^(1−p)   in round to nearest mode,
ϵmach = β^(1−p)           in other rounding modes.    (5.13)
In algorithms, rounding errors can accumulate over several iterations. The following
lines of C code are sufficient to show such an effect. Note that the calculation in line 7
is done in a higher internal precision before it is assigned to the variable.
Without rounding, 5.16 would apply and the result would be 0 for any value of k.

ϵabs(k) = k/10 − Σ_{i=1}^{k} 0.1    (5.16)
The resulting deviations for different k can be seen in figure 5.2, where the scaling of
both axes is logarithmic. The same principle can also be applied to larger data types,
with the outcome of smaller rounding errors.
The multiplication of floating-point numbers is simple from a high-level point of view:
the exponents are added and the significands are multiplied, after which the result is
rounded and normalized. Division is done similarly, whereas the divisor's exponent
is subtracted from the dividend's exponent and the dividend's significand is divided
by the divisor's significand. In addition to the previously described rounding errors,
overflow and underflow conditions may appear if range boundaries are exceeded. These
are signalled by the special numbers +∞ or −∞ respectively, but should not play a
major role in carefully evaluated algorithms.
When adding or subtracting two floating-point numbers, the smaller exponent has to
be adapted to the bigger one. This is done by shifting the significand of the smaller
number before the rest is added or subtracted. Finally, rounding and normalization are
applied if necessary. The following example illustrates this procedure. The addition
shall be executed in a single precision format (binary32). 5.18 and 5.19 show the
corresponding summands in their binary layouts.

1.0   ≡ 0 01111111 00000000000000000000000b   (e = 0)      (5.18)
2^−23 ≡ 0 01101000 00000000000000000000000b   (e = −23)    (5.19)
Any number of 2^−24 or less (half an ulp of 1.0) in this example would have the same
effect as if the addition had not been performed at all. This error can occur in the
computation of infinite series: often the initial terms are relatively large compared to
the later terms, so after a few iterations the situation may arise that small quantities
are added to large quantities. Calculating such series in reverse order can be a solution
for this type of error.
f(x) = (1 − cos(x)) / x²    (5.21)

and subsequently to

f(x) = (1 − cos(x)) / x² ≈ 0.88,    (5.24)

which is evidently wrong, because the range is 0 ≤ f(x) < 0.5 for all x ≠ 0. In the
given subtraction 1 − cos(x), the binary numbers are in the form of
After the exponents have been adjusted and the subtraction has been performed, it is
necessary to normalize the remaining bits into the form 1.d1...dt, which in this case
means that no bits remain in the fraction.
The underlying problem here is the mentioned normalization after subtraction, which
causes a padding with 0 instead of further significand digits. This problem is also
known as cancellation error. The reorganization of calculations into mathematically
equivalent terms can be a way to avoid such errors. Since cos(x) = 1 − 2 sin²(x/2),
the given example can be rewritten as
f(x) = 2 sin²(x/2) / x².    (5.28)
Figure 5.3 exhibits the values of 5.21 and 5.28 in the range x = [−1·10⁻³, +1·10⁻³],
calculated with floating-point binary32 precision. It can be seen that 5.28 (blue) is
stable in the given area, in contrast to the subtracting formula 5.21 (red).
[Figure 5.3: y = (1 − cos(x))/x² (red) and y = 2 sin²(x/2)/x² (blue), evaluated in binary32 for x ∈ [−10⁻³, +10⁻³]; the y axis ranges from 0 to 1.]
Rounding also affects fundamental rules of algebra. The C code fragments in the following
listing result in 5.29 and 5.30, which prove that associativity does not necessarily
apply to floating-point calculations.
double a = 1.0;
double b = 1.0 / (long double)(1ull << 53); // 2^(-53)
double c = 1.0 / (long double)(1ull << 53); // 2^(-53)

printf("a + (b + c) = %.17g\n", a + (b + c));
printf("b + (a + c) = %.17g\n", b + (a + c));
a + (b + c) = 1.0000000000000002 (5.29)
b + (a + c) = 1 (5.30)
A violation of the distributive law can also be created, as can be seen in listing 5.3
with the results 5.31 and 5.32.
double x = 1.234567;
double y = 1234.567;
double z = 1.1;

printf("x * (y - z) = %.17g\n", x * (y - z));
printf("x * y - x * z = %.17g\n", x * y - x * z);
x ∗ (y − z) = 1522.7976537890002 (5.31)
x ∗ y − x ∗ z = 1522.7976537889999 (5.32)
Due to the fact that floating-point numbers cannot represent all real numbers
and floating-point operations cannot exactly reproduce true arithmetic operations,
unstable algorithms and unintuitive outcomes can result. At the end of this
subchapter, it should also be emphasized that the errors and inaccuracies presented so
far are independent of the rounding mode, programming language, operating
system and test level used. The following paragraph, on the other hand, also deals with the
influence of architecture-dependent issues.
The IEEE 754 standard may be implemented in hardware or software. One of the
first hardware implementations with remarkable success was released in 1980: the Intel
8087 floating-point coprocessor, whose instruction set is commonly abbreviated to x87.
This floating-point unit (FPU) took over floating-point operations for Intel's 8086
processor. The 16-bit architecture and instruction set of the 8086 laid the foundation
for the well-known and widespread x86 family. In 1985, Intel introduced a 32-bit
extension to that instruction set, which from then on was also referred to as IA-32,
short for Intel Architecture 32-bit. In 2003, AMD developed a 64-bit extension to the
IA-32 instruction set called x86-64. Intel followed with a 64-bit extension in 2005,
designated x64 [Mon08; Int19].
Over the years, the x86 instruction sets have been progressively expanded for
various areas. From the point of view of floating-point arithmetic, the SIMD (Single
Instruction, Multiple Data) supplementary instruction sets SSE (Streaming SIMD
Extensions) and AVX (Advanced Vector Extensions) should be mentioned in particular.
These extensions are especially applicable to digital signal processing and graphics
processing, because SIMD instructions can significantly increase performance in
these fields. Furthermore, the portability of code is promoted by full support of
the IEEE 754 standard, including rounding precision closely aligned to the
floating-point basic formats (see 5.1.2) [MBD18].
In contrast, a crucial point in the design of the x87 coprocessor was the decision for a
double-extended design, with all registers being 80 bits wide [Int89]. The reason for this
5 Methodology 47
decision was to provide additional bits for the exponent and significand to minimize
unavoidable rounding errors and therefore improve numerical accuracy. Intermediate
results are rounded to double-extended precision and only rounded to the target format
when they are written back to memory. This approach is also in line with the
standard, which states [IEE19, p. 29]:
"Unless otherwise specified, each of the computational operations specified by this
standard that returns a numeric result shall be performed as if it first produced an
intermediate result correct to infinite precision and with unbounded range, and then
rounded that intermediate result, if necessary, to fit in the destination’s format."
Although this approach provides better numerical accuracy, it may affect portability,
since results might not be the same as those calculated on a binary32- or binary64-based
architecture. The use of the x87 registers is largely deprecated today. Nevertheless,
for reasons of backward compatibility, the x87-related 80-bit extended FPU registers
are still present in most of today's desktop PCs. As shown later, compiler switches
allow dedicated use of these registers. It can be precarious if the compiler, and hence
the resulting program, builds upon them by default, possibly without the user knowing.
This is the case with the popular GNU C compiler (GCC) if it is configured for i386
(32-bit) targets, for example [Gcc20]. Some possible problems arising from the use of
the x87 80-bit extended format are illustrated below by suitable examples.
Undetected Overflows
1 double x = 1E+308;
2 double y = (x * x) / x;
3
4 printf("%g\n", y);
GCC prints +∞ by default if compiled for 64-bit, regardless of the platform and version
used (see 4.2.1). However, if one compiles with the flag -mfpmath=387 or for a 32-bit
target (-m32), the value 10^308 (1E+308) will be the result. The reason for this is the
use of 80-bit registers, which provide 64 bits of significand and 15 bits of exponent.
Regarding the double type used in C, which is compliant with the IEEE 754-defined
binary64 on x86-64 platforms, the intermediate result of 10^616 generated in line 2
by (x * x) is too large, and consequently +∞ would be the result. Due to the enlarged
exponent size of an extended precision format register, the calculation can continue
and deliver 10^308 in the end.
If the example is compiled with the -ffloat-store option in GCC, the expected
+∞ is again the result, since this option causes the results of intermediate computations
to be stored to memory as 64-bit doubles instead of being kept in registers, and thus
to be rounded to double precision.
A similar case to the last one can be seen in listing 5.5. Here, a possibly activated
optimization may affect the outcome. Even when the x87 registers are used, +∞ is printed
in the default case, but with optimization enabled (-O), lines 2 and 3 are combined and
10^308 is the output.
1 double x = 1E+308;
2 double y = x * x;
3 double z = (y / x);
4
5 printf("%g\n", z);
Another issue is observable once a printout for y is inserted between lines 2 and 3. In
that case, the result would be +∞ again, because the intermediate result would then be
converted to the smaller target format, which is not able to hold this value.
The following example has already been discussed in the context of associativity. If
the 80-bit extended precision registers are used here, the results are the same for both
expressions (compare 5.29, 5.30 and 5.33, 5.34).
1 double a = 1.0;
2 double b = 1.0 / (long double)(1ull << 53); // 2^(-53)
3 double c = 1.0 / (long double)(1ull << 53); // 2^(-53)
4
5 printf("a + (b + c) = %.17g\n", a + (b + c));
6 printf("b + (a + c) = %.17g\n", b + (a + c));
a + (b + c) = 1.0000000000000002 (5.33)
b + (a + c) = 1.0000000000000002 (5.34)
The more interesting case is the second expression (b + (a + c)). Here, the calculation
in the parentheses leads to a value which a double precision number should not be able
to hold. So the higher intermediate precision leads to a more correct, but different,
result compared to a strict 64-bit double-precision calculation.
Double Rounding
1 double a = 1848874847.0;
2 double b = 19954562207.0;
3 double c = a * b;
4
5 printf("c = %.17g\n", c);
The multiplication of the two given numbers leads to a floating-point fraction with a
binary representation of

    10000000000000000000000000000000000000000000000000000 10000000000 01
    |<-------------------- 53 bits --------------------->|
    |<--------------------------- 64 bits ------------------------->|

In the case of a machine that uses x87 registers and the default round-to-nearest,
ties-to-even mode, the first rounding step to fit the 64-bit wide significand gives

    10000000000000000000000000000000000000000000000000000 10000000000

A second step is necessary to fit the target format, which finally results in

    10000000000000000000000000000000000000000000000000000
    → c = 3.6893488147419103e+19

in contrast to a single rounding step, as it is done via SSE without extended precision:

    10000000000000000000000000000000000000000000000000001
    → c = 3.6893488147419111e+19
The observable effect here is called double rounding. It occurs not only during multi-
plication, but can also turn up during any other operation. In addition, the results
remain the same even if the -ffloat-store flag is set and optimization is switched off.
Interestingly, double rounding is not an issue for directed rounding modes, because
these do not necessarily round to the nearest number. The two results for c in the
example only differ from the 14th decimal place onward, but a similar scenario can
also be created with a binary32 target format.
The previous section showed that the final result depends on how the compiler allocates
registers, because the process of rounding may depend on them. Compiler switches also
play a major role, especially optimizer switches, as they may affect the combination or
the order of calculations. Even printing or logging statements can alter the results, as
they change the handling of temporaries. In the C programming language, one may
also have direct access to extended data formats through the long double type. However,
this is implementation-dependent and may reduce to the standard binary64 double or
enlarge to binary128 quadruple precision. The Microsoft C/C++ compiler, which was
also tested, does not support extended precision floating-point numbers, neither through
compiler switches nor through data types like long double. The introduction of the SSE
extensions in modern x86 derivatives has deprecated the usage of the x87 FPU with
its internal 80-bit wide registers. Nevertheless, GCC is still able to use them, and a
possibly unwitting use, due to platform and configuration settings, can lead to errors
which are difficult to find.
The IEEE 754 floating-point standard specifies correct rounding for the arithmetic
operations add, subtract, multiply, divide, fused multiply-add and square root. For more
complex functions like the logarithm or the exponential function, the 2019 revision
contains recommendations; however, such implementations are optional [IEE19].
Trigonometric functions, for example, are usually supplied by system libraries, with
implementation decisions that may be historically justified. Embedded system
suppliers sometimes offer special libraries for transcendental functions, such as Texas
Instruments with the CMSIS DSP Library [Tex20c].
The following example, based on the explanations of Monniaux [Mon08], demonstrates
strikingly how different the results of transcendental functions can be. Table
5.3 shows the results of sin(x) with x = 14885392687, computed by several compilers and
simulation environments. Note that the GCC and Microsoft C/C++ compilers used
were linked against the standard system libraries. MATLAB was configured to show
the long fixed-decimal format by the command format long. The SCADE simulation
used the tool-provided function SinR64_mathext and likewise a 16-digit decimal output
format to examine the result. All computing was done on the same machine (Intel
i7-6700 CPU and Windows 10 Pro). WolframAlpha, which is based on Mathematica,
was used via the online service. The specific version numbering of the tools can be seen
in 4.2.1.
Table 5.3: Results of sin(x) with x = 14885392687 in different environments

    Setup              Result
    WolframAlpha       1.4798091093322176 · 10^-10
    MATLAB             1.4798091093322176 · 10^-10
    GCC 10.1.0         1.4798091093322177 · 10^-10
    TI ARM C/C++       1.4798091093322177 · 10^-10
    Microsoft C/C++    1.6714458398077447 · 10^-10
    SCADE              1.6714458398077447 · 10^-10

The difference between the Microsoft C/C++ compiler or the SCADE environment
and the other values is not just one or two units in the last place, but approximately
11.5 % in total. The value for the sine function was chosen to be close to a multiple
of 2π, so what surfaces here is the limited precision of the argument reduction modulo
2π in the libraries used.
Software or hardware manufacturers frequently use formal proofs to verify their arith-
metic implementations. A less complex procedure is to use certain test programs, which
check whether a system conforms to IEEE 754. This can be helpful if, for example, one
wants to check whether a compiler optimization option violates the standard. Examples
that have also been used during this thesis are:
Some programming packages offer correctly rounded transcendental functions with
arbitrary precision. The GNU MPFR library, for example, is a freely available and
portable C library. It supports a variety of mathematical functions with the same
rounding modes as specified by IEEE 754. It is based on the GNU Multiple Precision
(GMP) library, which is also the basis for computer algebra systems like Maple or the
Multiprecision Computing Toolbox for MATLAB [MPF20].
To sum up, consistent behaviour of transcendental functions across varying configura-
tions cannot easily be guaranteed. Discrepancies between successive generations of
processors, FPUs and libraries cannot be ruled out. Possible standard violations can
be detected by specific test programs. Eventually, deviations can be avoided more
reliably by pure software implementations using GMP or MPFR, for example.
sin(x) = Σ_{n=0}^∞ (−1)^n / (2n+1)! · x^(2n+1)   (5.35)

sin(x) = x − x^3/3! + x^5/5! − x^7/7! + x^9/9! − ...   (5.36)
The higher the degree of the polynomial, the better the precision of the approximation.
For the present evaluation, the exact approximation of sin(x) is of no significance,
but rather the comparison of the results between different test levels. Therefore, a
polynomial of degree nine is used to calculate 50 values in the interval [0, 5].
The first stage comprises the computation with an accuracy of 32 decimal places using
the variable-precision floating-point arithmetic (VPA) functionality provided
by MATLAB. This calculation serves as a reference to determine deviations in the
following stages. The next step constitutes the calculation of the values in a common
x86-64 environment. The implemented C routine includes a loop, which leads to
intermediate results being converted to the target format regularly. This leads to rather
slight differences between the use of the SSE and x87 FPU instruction sets, as can be
seen later. Since the implementation of the exponential function also depends on the
math library used, it was coded by hand as well. The same code basis is compiled and
loaded onto the microcontroller in the final stage. For comparison, all results were saved
in CSV files. The corresponding sources are available in appendix A.1.
The following graph shows the deviations between the MATLAB reference and the
algorithm implemented in C and executed on the different platforms. In the present
example, 38 out of 50 values differed; the deviations are illustrated scaled by a factor
of 10^14. It can be seen that the execution on an x86 machine and on the microcontroller
unit (MCU) diverged in exactly the same positions and by exactly the same magnitude.
The values were also calculated using the x87 instruction set, but the deviations were
so small that they have not been included in figure 5.4 for the sake of clarity.
An overview of the numerical deviations can be found in table 5.4. The minimum
deviations refer to values not equal to zero. The maximum and mean results suggest
that the use of extended precision (x87) leads to marginally better accuracy. In general,
the implemented algorithm provides an accuracy of around 15 decimal digits in
comparison to the reference.
[Figure 5.4: Accuracy deviations of the ninth-degree Taylor approximation of sin x —
differences of the x86 SSE and MCU results from the MATLAB reference, scaled by
10^14, plotted over x [rad] in the interval [0, 5]]
Table 5.4: Numerical deviations from the MATLAB reference

    Platform / Instruction Set    Min             Max             Mean
    x86-64 / x87                  1.39 · 10^-17   2.89 · 10^-15   7.79 · 10^-16
    x86-64 / SSE                  1.39 · 10^-17   4.23 · 10^-15   9.86 · 10^-16
    MCU                           1.39 · 10^-17   4.23 · 10^-15   9.86 · 10^-16
The components presented in chapter 4 are used to analyse the system response time.
A first general configuration is divided into three setups, as shown in the next figure.
Each setup contains two different and process-independent entities, whose structure
shall remain the same from setup to setup. Each entity provides executable software.
Entity A sends a signal (trigger) to Entity B, which returns feedback to A after the
signal has been recognized. The runtime measurement takes place within Entity A.
A more detailed explanation of the implementation follows later in the text. Simple
digital I/O signals are used for data exchange.
In Setup 1, the two entities operate in a simulation environment (SF runtime) within a
desktop PC (Host 1). Setup 2 operates the entities as standalone applications with
VISTAS interface, which packs the digital signals in UDP packets and exchanges them
via internal Ethernet loopback. The hardware-related implementation finally follows in
Setup 3, where the digital signals are processed via the GPIO ports of the two available
MCUs. The MCUs are operated without an additional operating system.
[Figure 5.5: The three setups — Setup 1 with both entities in the SF runtime on a
single host, Setup 2 with standalone entities exchanging VISTAS I/O over the internal
Ethernet loopback, and Setup 3 with Entity A and Entity B on two MCUs connected
by direct GPIO wiring]
The three setups are related to MiL, SiL and HiL implementations. In practice, mixed
configurations of, for example, virtual ECUs connected with VISTAS, similar to setup 2,
together with real hardware as in setup 3, would also be conceivable. With regard to the
response time analysis, however, the configurations should be considered separately.
An extension of the configuration described above can be seen in figure 5.6. Instead of
a single host, the entities in Setup 1.1 and 2.1 run on distributed machines. In both
setups, the signals are transferred via UDP packets.
[Figure 5.6: Setups 1.1 and 2.1 — Entity A on Host 1 and Entity B on Host 2, each
with its adapter, connected via direct Ethernet through the SF runtime (Setup 1.1)
and the VISTAS I/O interface (Setup 2.1)]
The following sequence diagram shows that both entities read the inputs in a main loop,
process them in the central part and write the outputs in the final step. A completed
initialization marks the starting point, which is controlled either centrally by the SF
environment, by exchanging additional messages provided by the LIBIMS-VISTAS
library, or visually via evaluation board LEDs.
Only two asynchronous signals (trigger and feedback) are exchanged between Entity
A and B for the time measurement. This means that no busy polling or additional
concurrency is necessary for transmitting or receiving the signals. The simulation
framework offers an "as-fast-as-possible" mode for operating the cycle. The standalone
applications can use infinite loops without interrupts for this purpose. At the beginning,
the trigger and feedback signals are initialized with false. Entity A starts the time
measurement (start_clock) and sets the trigger signal (set_trigger). Entity B returns
the feedback for the trigger signal (set_feedback). Entity A stops the clock after
receiving the callback (stop_clock) and finally saves the result (plot_results). To
restart, the entities must be reinitialized. In each test level, the adapters only have to
be adapted slightly.
[Figure 5.7: Sequence diagram — after initialization, both entities loop over get_input,
processing and set_output; Entity A starts the clock and sets the trigger while it is
false, Entity B sets the feedback, and Entity A stops the clock and plots the results
once the feedback is true]
The results of the first configuration are shown in figure 5.8 using box plots. 50 samples
each were recorded for Setup 1 and 2. Due to the high reproducibility, only 20 values
were measured for Setup 3. The first setup shows a median value of 7 µs and minimum
and maximum values of 3 µs and 16 µs. The variation of the values is quite small, but
there are isolated outliers, which are marked by x. Setup 2 has a median of 73 µs. The
minimum and maximum values are 30 µs and 129 µs. Setup 1 and Setup 2 therefore
differ by a factor of approximately 10. The system response time with the
MCUs in Setup 3 is significantly faster, with a median of 1.17 µs. From this it can
be concluded that access to GPIO registers can take place much faster than the
packing and unpacking of the Ethernet stack, as well as the data exchange within the
SF runtime.
[Figure 5.8: Box plots of the measured response times for Setups 1, 2 and 3 (x-axis
in µs, 0 to 140)]
The following figure shows the results with the distributed SF runtime and VISTAS
interface. The division of the x-axis is no longer in µs but in ms. It is particularly
striking that the SF runtime connection is no longer faster, but somewhat slower and
with greater dispersion. The longer time measurements for Setups 1.1 and 2.1 can be
explained by the fact that the data now traverses a physical network and is not merely
written back by a loopback adapter within the kernel. For these two setups, the entire
processing chain is no longer CPU-bound, but rather highly dependent on resources
provided by the operating system.
[Figure: Box plots of the measured response times for Setups 1.1 and 2.1 (x-axis in
ms, 0 to 20)]
The presented results are not necessarily intended to evaluate the frameworks and
interfaces used. These can differ according to a variety of settings or by the type and
number of signals or the bus system. Rather, it should be emphasized that there are
individual differences in response times for each test level. It is very difficult to infer
from a successful execution of tests in the simulation context that execution in a HiL
environment will be equally successful. The specific setup must be taken into account
by, for example, worst-case execution time (WCET) analyses and representative
measurements.
In general terms, throughput is the rate of data that a system or subsystem can process
per unit of time. Used in the context of communication networks, it indicates how
much data can be transmitted in a period of time. Aside from intercommunication,
the calculation capacity of a processing unit, especially in embedded systems, can also
have an impact on throughput. Both factors are considered below.
Calculation Capacity
Instead of developing algorithms and metrics specifically for assessing the computation
capacity of the individual platforms, the decision was made to use a freely available
tool for this purpose. CoreMark is a benchmark of the Embedded Microprocessor
Benchmark Consortium (EEMBC), which is also suitable for embedded systems. It is
written in C and uses data structures and algorithms that are common in a variety
of practical applications such as list processing, state machine operations and matrix
multiplications. Furthermore, CoreMark is implemented in such a way that every
operation in the benchmark is driven by values provided at runtime. This prevents
code elimination and pre-computation at compile time. To finally validate the results,
CoreMark performs cyclic redundancy checks (CRCs) [EEM20].
The benchmark was applied to the platforms described in the project context. For
better comparability, the compilation was done for single-core 32-bit targets with
disabled optimizations. The application could be used out-of-the-box on the Linux
system (Host 1). For MinGW on Windows (Host 2) and the ARM CPU of the
microcontroller, the provided porting files had to be adapted. The performance of the
executed runs is expressed in iterations per second. Table 5.5 shows the results for the
processing units used.
The deviations between the platforms can be explained predominantly by the different
processor frequencies. Between Host 1 and Host 2 there is a factor of 1.36 (2.50 GHz
to 3.40 GHz). This roughly corresponds to the difference in iterations per second,
which is 1.50. The remaining portion can be explained by the improved memory
connection and memory generation of Host 2. For example, the cache memories are
larger and DDR4 RAM is installed instead of DDR3 RAM, which allows for higher
clock frequencies and transfer rates. The deviation between the MCU and the desktop
host platforms is significantly greater. The CPU clock with a frequency of 300 MHz
is roughly ten times slower, but even after this deduction, the number of iterations is
still at least a factor of two too small. Smaller caches and a shorter pipeline (8-stage
instead of 14-stage) partially provide an explanation. However, the fact that the object
code is read and executed directly from flash memory, as is not uncommon with
embedded systems, probably has a much greater impact on performance. This requires
additional so-called wait states, which are cycles the CPU must wait before it can
access the memory again.
Intercommunication
5.2.3 Conclusion
This subchapter has shown that it is not an easy task to determine and evaluate
the system performance. It can vary from setup to setup and depends on individual
factors. Each additional level of abstraction to the real system also introduces a further
dimension of complexity and additional unknowns.
Technical advances in computer architectures such as caches, deep pipelines, out-of-
order executions, and branch prediction mean that execution times are no longer
constant, but depend on the execution history. The execution time of a memory access
depends crucially on whether the content of the required memory cell is currently in
the cache or not. In most cases, executions will be faster than, for example, with
the cache turned off, but stochastic arguments are not sufficient for critical real-time
systems. A guarantee of execution times is required. Since measuring does not allow
for a guaranteed execution time in the worst case, static methods are used in the
development of safety-relevant systems. This means that these methods are not based
on measuring execution times, but on the analysis of the program text.
As already mentioned, a common methodology for this is WCET analysis. A
guarantee can be obtained when the longest assumed execution time of each operation
is accumulated along the longest possible path. However, this greatly overestimates
the actual execution time. Tools like aiT from AbsInt specialize in WCET analysis
taking into
account the hardware architecture. An abstract model of the target platform is
required for each analysis, the description of which must be provided by the processor
manufacturer. With this prerequisite, the tool can also include caches and pipelines
in the analysis and determine execution times more precisely [Abs20]. aiT is already
integrated in the SCADE Suite, which has the advantage that the response time
behaviour can be determined at a very early stage of development. Tests using the
target hardware, i.e. in the HiL context, can then serve as confirmation of compliance
with the previously determined limits.
As stated by Großmann et al. and Mäki-Asiala, the reuse of tests can be seen from
three different views [GMW12; Mäk05]:
Vertical
During the development process of a (single) software product, vertical reuse can
be obtained by using similarities between different test levels or test types. This
is the case if, for example, integration tests are also used as part of system tests
or serve as a basis for performance tests.
Horizontal
Specifications can be similar between different products in a product family or
application domain and thus form a solid basis for the development of reusable
tests. Standardized test suites or tests for entire product families are the outcome
of horizontal reuse.
Historical
Historical reuse is probably the most common one. It is the reuse of tests between
product generations or versions. A typical use case is that a new version of
a product should meet the same basic requirements that were also met by its
predecessor. Most of the tests created for the former product version are likely to
be reusable. Only new features or components require the creation of additional
new tests.
Looking at these viewpoints with regard to MiL, SiL and HiL testing, vertical reuse
fits best, but historical reuse also plays a role, since MiL tests are usually carried out
on earlier versions than, for example, HiL tests. The horizontal aspect can be
seen as an additional goal, as tests should ideally not only be reused for the currently
developed product, but for an entire product family or domain.
In order to achieve a high level of reuse, the following general points are recommended.
Ideally, these should be supported by the test management process as well as uniform
tool support [GMW12]:
• Definition of and adherence to standards, norms and conventions
• Definition of the granularity of reuse (e.g. component level or system level)
• Creation of modular and abstract test assets
• Collection and classification of tests for easy locating and retrieval
• Detailed documentation (to make the reusable tests usable)
Creating modular and abstract tests is critical for reuse. How this can be achieved
is demonstrated by Mäki-Asiala using guidelines and the testing language TTCN-3.
Exemplary items are listed below [Mäk05]:
• Modularize tests according to components
• Modularize tests according to features
• Implement test cases using high level functions
• Use preambles, bodies and postambles
• Use conditional branching to alternate test behaviour and execution
• Use parametrization
• Use common types, templates and wildcards
With the exception of some language-specific features such as templates and wildcards,
the listed guidelines can also be applied with the in-house Test Support System (TSS).
The degree of reuse is therefore more a question of test design and organisation than
of the language used.
Figure 5.10 shows a high-level design for using in-the-loop tests. This essentially
corresponds to figure 2.7 in chapter 2.4.3, but three points should be particularly
highlighted here. Firstly, the SUT basically provides the same logical functionality
in all levels, but different technologies and refinements of the interfaces are deployed
in each stage. For this reason, appropriate adapters must be applied to the inputs
and outputs of the SUT. Secondly, the plant model, which includes the modelling of
the environment as well as actuators and sensors, should be decoupled from the SUT
implementation in order to achieve reuse capabilities in the various test levels. Finally,
it should be noted that the tests and their handling are located in the same layer as
the plant model and that these should also be designed as abstractly as possible in
order to achieve a high degree of reuse with the help of the adaption layer.
[Figure 5.10: High-level design of the in-the-loop test setup across the model, software
and ECU layers]
To get a more concrete idea of what the adapters can look like, the previous test for
measuring response times serves as a basis. In chapter 5.2.1, Entity A sends
a signal, which Entity B sends back. Entity A then measures the elapsed time. For
the sake of simplicity, only the functionality of Entity B's response is tested
in conjunction with the TSS. After the trigger signal is stimulated by the TSS, the
feedback signal should be set within a specified time by Entity B. The behaviour of
the SUT essentially corresponds to line 5 in listing 5.8, which remains identical in all
test levels.
1 while (true)
2 {
3 // <get_input>
4
5 output.feedback = input.trigger;
6
7 // <set_output>
8 }
Implementing the adapters with VISTAS at the SiL stage is somewhat more complex,
as can be seen in listing 5.10. During the reading process, the relevant context is
first loaded in line 9, then the corresponding message is fetched and finally the signal
is assigned. To send the message, the write function of the library must be called,
followed by the context on which it is to be sent. In this phase, the SUT code no longer
runs in the SF environment, but as a standalone application. The adapter on the TSS
side is analogous.
For the HiL phase, the binary object of the SUT code is loaded onto the ECU. In
the present example, a connection to the GPIO ports of the controller is necessary,
which results in an adapter as shown in listing 5.11. The test system must also serve
digital inputs and outputs within the adapter functions. The wiring of the GPIO pins
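A minimal sketch of what such a GPIO adapter can look like is given below. Plain variables stand in for the MCU's memory-mapped registers so the sketch can run on a host; the actual register names, addresses and pin assignments are vendor-specific and not taken from the thesis.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the data-in and data-out registers; on the target these
 * would be fixed memory-mapped addresses from the vendor headers. */
static volatile uint32_t gpio_din;
static volatile uint32_t gpio_dout;

#define TRIGGER_PIN  (1u << 0)   /* hypothetical pin assignment */
#define FEEDBACK_PIN (1u << 1)

struct io { bool trigger; bool feedback; };

void get_input(struct io *in)
{
    in->trigger = (gpio_din & TRIGGER_PIN) != 0;   /* sample pin level */
}

void set_output(const struct io *out)
{
    if (out->feedback)
        gpio_dout |= FEEDBACK_PIN;                 /* drive pin high */
    else
        gpio_dout &= ~FEEDBACK_PIN;                /* drive pin low  */
}
```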
In more complex test scenarios, which, for example, also require plant models, the
timing behaviour is particularly important. The execution of the plant model as well
as the test execution must be done within time limits. In control system development,
the SUT typically operates in fixed-length cycles. SUT, plant model and test execution
must take this into account and provide appropriate time synchronization mechanisms
and the possibility of configuration.
Considering the given toolchain, it would also be possible to create test scripts within
MATLAB on the model side, or even to create entire test models within Simulink and
SCADE using function blocks. The existing simulation framework, which can be used
to execute the generated code for MiL/SiL operations, is able to run tests with the help
of a Python scripting interface, too. The TSS in turn contains a powerful proprietary
test scripting language. While functional tests created once in MATLAB or SCADE
can be reused for successive versions of a controller model, they are a disadvantage
from the vertical reuse point of view. Focusing test execution on the TSS would offer
the greatest opportunities for reuse in the given context, since interfaces to different
development environments are available.
As described, the functional properties of the SUT can be tested in the early stages
of development by replacing the physical environment with simulation models and
gradually concretizing the interfaces. However, this type of black box testing only
covers some of the tests that are necessary for the development of embedded systems.
Not covered is, for example, the compatibility of all electronic or mechanical compo-
nents. The proper functionality of a bus interface, for instance, can only be verified
by using the real bus system and only partly by its simulation. The following tests
and characteristics described by Grünfelder are also of great importance, but can only
be transferred to a MiL or SiL environment to a limited extent or with great effort
[Grü17]:
Volume testing
In this type of test, the test object is supplied with vast amounts of test data,
and it is checked whether program errors occur or response times become
unacceptable.
Stress testing
This test is a kind of performance test in which the system is executed beyond
its specified workload. Low memory resources or waiting for lock releases can
reveal problems that do not occur under normal conditions.
Failover and recovery testing
These tests ensure that the test object can recover from a number of hardware,
software, or network malfunctions without losing data or data integrity.
Security testing
Tests to protect against misuse and unauthorized access. Typical security re-
quirements comprise aspects like authentication, authorization, availability and
confidentiality.
In general, resource-related tests such as volume testing and stress testing are only
really meaningful when executed on the target platform, because the systems used for
simulation may be more powerful and the results would therefore be less significant.
Static analyses, on the other hand, are very accurate and can already be carried out at
the model level. Safety and security properties, too, are often highly hardware-dependent.
Watchdog timers, interrupt handlers, and access to memory or memory protection
units are hardware-specific; emulating them for MiL and SiL tests involves great effort
and is probably unprofitable.
Finally, it should be mentioned that test management also contributes to reusability.
Project progress is sometimes measured in new lines of code, with the idea that the
more new lines are written, the more features have been added and the more significant
the project is. This can be counterproductive for reuse, as it rewards quantity instead
of quality. Another aspect is that development for reuse generally takes longer; this
effort pays off in the long run, but has to be raised and financed in advance.
6 Conclusion
The results of this work cover various aspects of software testing. First of all, test types,
methods and principles that are common today were classified, which also include
in-the-loop testing. With regard to the question of how multi-level tests can differ,
an examination of present studies and research results was carried out. It has been
established that these do not, or only to a very limited extent, address possible problems
and inconsistencies that can arise from the reuse of tests within different test levels.
After evaluating the tools and aids provided by Airbus, the actual analysis could begin.
The methodology spanned three main parts for the investigations, which are briefly
taken up again below:
Accuracy
• Calculations on digital platforms lead to errors caused by data uncertainty,
truncation or rounding.
• Rounding errors are independent of the platform and test level used. They
can occur, for example, when calculating with numbers of very large or very
small magnitude.
• The IEEE 754 standard allows extended precision formats, which can cause
portability problems between test levels, such as undetected overflows and
double rounding.
• Transcendental functions are system- and implementation-dependent. Differ-
ences can be avoided when the same libraries are used on all testing platforms.
• Results can depend on compiler switches and logging statements, as these
can change register allocation and ultimately affect the computation.
Performance
• A practical system response time measurement has shown that, depending
on the configuration, different results can occur, which are not necessarily
related to the test level.
• The achievable throughput depends on the system configuration and is
usually considerably lower on the target platform than on a development
platform, as benchmark tests have shown.
Reusability
• Distinctions can be made between different types of reusability (vertical,
horizontal, historical); in-the-loop testing aims for vertical reuse.
• Basic principles and an example of an in-the-loop testing approach have
been presented.
• There are a number of test types that are poorly suited for reuse, such as
stress testing or failover and recovery testing.
Among the points listed, the performance evaluation was of particular interest to
Airbus. The results show that response times can be critical, especially in distributed
systems and in the context of an integrated MiL-, SiL-, HiL-testing approach. For
the MIL-STD-1553 standard, for example, devices have to respond within a period of 4
to 12 µs, which cannot be achieved with the setups shown and the given MiL and SiL
environment [MIL18].
The question of the test language is also a subject of discussion within the company
and in research; cf. [GMW12; Mäk05]. For the reuse and exchange of development
models and data, standards such as the Functional Mock-up Interface (FMI) or the
XML Metadata Interchange (XMI) have already been established. From the testing
side, this is less the case. However, using an abstract test specification language such
as TTCN-3 or UTP would offer the advantage that tests would be implementation-
independent and could therefore be used in different test levels. In addition, these
would not depend on a specific tool or framework and could also be supplied by the
original equipment manufacturer (OEM). Only the respective adapters would have to
be adapted accordingly.
In summary, errors can be found and eliminated earlier and with less effort in the model
phase than in the real system, which is why the possibilities of MiL/SiL testing should
be used more intensively. It should be noted, however, that there may be deviations
in the accuracy of the results. The performance of the respective systems and the
desired type of tests are decisive for the reuse of tests. From an economic point of view,
it is essential that the implementation of the test adapters takes less effort than the
creation of new test cases. This requires a consistent workflow between MiL, SiL, and
HiL tests as well as appropriate tool support to ensure the consistency of requirements,
models and code.
List of Abbreviations
AVX Advanced Vector Extensions
HiL Hardware-in-the-Loop
MiL Model-in-the-Loop
PiL Processor-in-the-Loop
SF Simulation Framework
SiL Software-in-the-Loop
List of Figures
2.1 Growth of source lines of code in aerospace systems [SAV] . . . . . . . 4
2.2 Encapsulation of software testing [FLS07] . . . . . . . . . . . . . . . . . 5
2.3 Structure of software testing [FLS07] . . . . . . . . . . . . . . . . . . . 6
2.4 Software development model [ESA94] . . . . . . . . . . . . . . . . . . . 8
2.5 Taxonomy of model-based testing [ZSM12] . . . . . . . . . . . . . . . . 14
2.6 Closed-loop simulation setup . . . . . . . . . . . . . . . . . . . . . . . . 16
2.7 Integrated MiL-, SiL-, HiL-testing approach . . . . . . . . . . . . . . . 18
List of Tables
3.1 Software summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.2 Hardware summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
List of Listings
5.1 Summation rounding error accumulation. . . . . . . . . . . . . . . . . . 42
5.2 Example of broken associativity . . . . . . . . . . . . . . . . . . . . . . 45
5.3 Example of broken distributivity . . . . . . . . . . . . . . . . . . . . . . 46
5.4 First example of an undetected overflow . . . . . . . . . . . . . . . . . 47
5.5 Second example of an undetected overflow . . . . . . . . . . . . . . . . 48
5.6 Example of impacts on intermediate calculations with x87 registers . . 48
5.7 Example of double rounding . . . . . . . . . . . . . . . . . . . . . . . . 49
5.8 Reusable test design – SUT behaviour . . . . . . . . . . . . . . . . . . 63
5.9 Reusable test design – MiL adapter . . . . . . . . . . . . . . . . . . . . 63
5.10 Reusable test design – SiL adapter . . . . . . . . . . . . . . . . . . . . 64
5.11 Reusable test design – HiL adapter . . . . . . . . . . . . . . . . . . . . 64
5.12 Reusable test design – Test example . . . . . . . . . . . . . . . . . . . . 65
Bibliography
[DO-11] DO-178C STANDARD.
DO-178C, Software Considerations in Airborne Systems and Equipment
Certification. Washington, DC, USA, 2011.
[Abs20] ABSINT.
aiT - Worst-Case Execution Time Analyzer [online].
2020 [visited on 2020-10-28].
Available from: https://www.absint.com/ait/.
[BN08] BROEKMAN, Bart; NOTENBOOM, Edwin.
Testing embedded software. Reprinted. London: Addison-Wesley, 2008.
ISBN 0-321-15986-1.
[CV14] CAMPBELL, Alan; VELEZ, Dianna M.
Efficient testing of simulation V&V for closed-loop operations. In:
IEEE AUTOTEST. IEEE, 2014, pp. 44–51.
Available from DOI: 10.1109/AUTEST.2014.6935120.
[Cho78] CHOW, T. S.
Testing Software Design Modeled by Finite-State Machines.
IEEE Transactions on Software Engineering.
1978, vol. SE-4, no. 3, pp. 178–187. ISSN 0098-5589.
Available from DOI: 10.1109/TSE.1978.231496.
[EEM20] EEMBC.
Embedded Microprocessor Benchmark Consortium: CoreMark - An
EEMBC Benchmark [online]. 2020 [visited on 2020-10-28].
Available from: https://www.eembc.org/coremark/.
[ESA94] ESA, Board-for-Software-Standardisation-and-Control.
ESA PSS-05-0 Software Engineering Standards [online].
1994. 2. Issue [visited on 2020-10-28]. Available from:
http://microelectronics.esa.int/vhdl/pss/PSS-05-0.pdf.
Appendix
A.1.2 C Sources
1 #include <stdio.h>
2 #include <stdlib.h>
3 #include <stdint.h>
4
5 #define RESULT_FILE "sineByTaylor.csv"
6
7
8 static FILE *fp;
9 int32_t factorials[10];
10
11
12 double dpow(double base, uint32_t exp)
13 {
14 double result = 1.0;
15 double pow = base;
16
17 while (exp > 0)
18 {
19 if (exp & 1)
20 {
21 result *= pow;
22 }
23
24 pow *= pow;
25 exp >>= 1;
26 }
27
28 return result;
29 }
30
31 void calcSineByTaylor(double x)
32 {
33 int32_t currentGrade;
34 int32_t multiplier = -1;
35 double sine = x;
36
37 for (currentGrade = 3; currentGrade <= 9; currentGrade += 2)
38 {
39 sine += (dpow(x, currentGrade) / factorials[currentGrade]) * multiplier;
40 multiplier *= -1;
41 }
42
43 fprintf(fp, "%.13a;%.13a\n", x, sine);
44 }
45
46
47 int main()
48 {
49 // Initialize
50 // ------------------------------------------------
51 int32_t i = 0;
52 int32_t steps = 50;
53 int32_t preFactorial = 1;
54 double x = 0.0;
39 }
40
41 if (F_INIT)
42 {
43 input.trigger = false;
44 output.feedback = false;
45
46 R_INIT = 1;
47 }
48
49 if (F_REINIT)
50 {
51 R_REINIT = 1;
52 }
53
54 if (F_RUN)
55 {
56 get_input(&input);
57
58 output.feedback = input.trigger;
59
60 set_output(&output);
61
62 R_RUN = 1;
63 }
64
65 if (F_HOLD)
66 {
67 R_HOLD = 1;
68 }
69
70 if (F_UNLOAD)
71 {
72 R_UNLOAD = 1;
73 }
74 }
48 ims_message_t* feedback_message)
49 {
50 ims_test_actor_t actor = ims_test_init(ACTOR_ID);
51
52 ims_create_context_parameter_t create_parameter = IMS_CREATE_CONTEXT_INITIALIZER;
53 create_parameter.init_file_path = IMS_INIT_FILE;
54
55 ims_create_context(IMS_CONFIG_FILE, VISTAS_CONFIG_FILE, &create_parameter, context);
56 ims_get_equipment(*context, "myEquipment", equipment);
57 ims_get_application(*equipment, "myApplication", application);
58 ims_get_message(*application, ims_discrete, "trigger_signal", 1, 1, ims_input, trigger_message);
59 ims_get_message(*application, ims_discrete, "feedback_signal", 1, 1, ims_output, feedback_message);
60
61 // Signalling model A to be ready
62 TEST_SIGNAL(actor, 1);
63 }
64
65
66 int main()
67 {
68 input_t input = {};
69 output_t output = {};
70
71 ims_node_t context;
72 ims_node_t equipment;
73 ims_node_t application;
74 ims_message_t trigger_message;
75 ims_message_t feedback_message;
76
77 init(&context,
78 &equipment,
79 &application,
80 &trigger_message,
81 &feedback_message );
82
83 while (true)
84 {
85 get_input(&context, &trigger_message, &input);
86
87 output.feedback = input.trigger;
88
89 set_output(&context, &feedback_message, &output);
90 }
91
92 ims_free_context(context);
93
94 return ims_test_end(actor);
95 }