Software Support for Metrology
Best Practice Guide No. 6
March 2001
© Crown copyright 2002
Reproduced by permission of the Controller of HMSO
ISSN 1471-4124
1 Scope 1
1.1 Software Support for Metrology Programme . . . . . . . . . . . . 1
1.2 Structure of the Guide . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.4 Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2 Introduction 7
2.1 Uncertainty and statistical modelling . . . . . . . . . . . . . . . . 7
2.2 The objective of uncertainty evaluation . . . . . . . . . . . . . . 12
2.3 Standard deviations and coverage intervals . . . . . . . . . . . . . 14
3 Uncertainty evaluation 16
3.1 The problem formulated . . . . . . . . . . . . . . . . . . . . . . . 16
3.2 The two phases of uncertainty evaluation . . . . . . . . . . . . . 17
5 Candidate solution approaches 39
5.1 Analytical methods . . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.1.1 Single input quantity . . . . . . . . . . . . . . . . . . . . . 40
5.1.2 Approximate analytical methods . . . . . . . . . . . . . . 42
5.2 Mainstream GUM . . . . . . . . . . . . . . . . . . . . . . . . . . 42
5.3 Numerical methods . . . . . . . . . . . . . . . . . . . . . . . . . . 43
5.3.1 Monte Carlo Simulation . . . . . . . . . . . . . . . . . . . 44
5.4 Discussion of approaches . . . . . . . . . . . . . . . . . . . . . . . 44
5.4.1 Conditions for Mainstream GUM . . . . . . . . . . . . . . 44
5.4.2 When the conditions do or may not hold . . . . . . . . . . 45
5.4.3 Probability density functions or not? . . . . . . . . . . . . 45
5.5 Obtaining sensitivity coefficients . . . . . . . . . . . . . . . . . . 46
5.5.1 Finite-difference methods . . . . . . . . . . . . . . . . . . 46
6 Mainstream GUM 48
6.1 Introduction to model classification . . . . . . . . . . . . . . . . . 48
6.2 Measurement models . . . . . . . . . . . . . . . . . . . . . . . . . 48
6.2.1 Univariate, explicit, real-valued model . . . . . . . . . . . 49
6.2.2 Multivariate, explicit, real-valued model . . . . . . . . . . 51
6.2.3 Univariate, implicit, real-valued model . . . . . . . . . . . 52
6.2.4 Multivariate, implicit, real-valued model . . . . . . . . . . 53
6.2.5 Univariate, explicit, complex-valued model . . . . . . . . 54
6.2.6 Multivariate, explicit, complex-valued model . . . . . . . 55
6.2.7 Univariate, implicit, complex-valued model . . . . . . . . 55
6.2.8 Multivariate, implicit, complex-valued model . . . . . . . 56
6.3 Classification summary . . . . . . . . . . . . . . . . . . . . . . . . 56
9 Examples 79
9.1 Flow in a channel . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
9.2 Graded resistors . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
9.3 Calibration of a hand-held digital multimeter at 100 V DC . . . 84
9.4 Sides of a right-angled triangle . . . . . . . . . . . . . . . . . . . 85
9.5 Limit of detection . . . . . . . . . . . . . . . . . . . . . . . . . . 87
9.6 Constrained straight line . . . . . . . . . . . . . . . . . . . . . . . 91
9.7 Fourier transform . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
10 Recommendations 96
Uncertainty and Statistical Modelling
Chapter 1
Scope
1.3 Summary
This guide provides best practice on the evaluation of uncertainties within metrology and the support given to this discipline by statistical modelling. Central to these considerations is a measurement system or process, having input quantities that are (invariably) inexact, and an output quantity that consequently is also inexact. The input quantities represent measurements or other information obtained from manufacturers' specifications and calibration certificates. The output represents a well-defined physical quantity to be measured, the measurand.1 The objective of uncertainty evaluation is to model the system, including the quantification of the inputs to it, accounting for the nature of their inexactness, and to determine the model output, quantifying the extent and nature of
1 In some instances the output quantities may not individually have physical meaning.
equate, is not treated in this guide. Detailed information on model validation is available
[9].
3 There may be more than one output, in which case a coverage region is required.
4 Mainstream GUM is summarized in GUM Clause 8 and Section 6 of this guide.
are some limitations and assumptions inherent in the Mainstream GUM pro-
cedure and there are applications in metrology in which users of the GUM are
unclear whether the limitations apply or the assumptions can be expected to
hold in their circumstances. In such situations other analytical or numerical
methods can be used, as is stated in the GUM (in Clause G.1.5). This gen-
eral approach is contrasted in this guide with the mainstream approach. In
particular, the limitations and assumptions underlying the easy-to-use formula inherent in Mainstream GUM are highlighted.
The GUM (through Clause G.1.5) does permit the practitioner to employ
alternative techniques whilst remaining GUM-compliant. However, if such
techniques are to be used they must have certain credentials in order to permit
them to be applied in a sensible way. Part of this guide is concerned with these
techniques, their properties and their credentials.
It is natural, in examining the credentials of any alternative scientific ap-
proach, to re-visit established techniques to confirm or otherwise their appro-
priateness. In that sense it is appropriate to re-examine the principles of Main-
stream GUM to discern whether they are fit for purpose. This task is not possi-
ble as a single general health check. The reason is that there are circumstances
when the principles of Mainstream GUM cannot be bettered by any other can-
didate technique, but there are others when the quality of the Mainstream GUM
approach is not quantified. The circumstances in which Mainstream GUM is
unsurpassed are when the model relating the input quantities X1 , . . . , Xn to the
measurand Y is additive, viz.,
Y = a1 X1 + · · · + an Xn ,
for any constants a1 , . . . , an , any value of n, however large or small, and when
the input quantities Xi have independent Gaussian distributions. In other cir-
cumstances, Mainstream GUM generally provides an approximate solution: the
quality of the approximation depends on the model and its input quantities and
the magnitudes of their uncertainties. The approximation may in many cases
be perfectly acceptable for practical application. In some circumstances this
may not be so. See the statement in Clause G.6.6 of the GUM.
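This unsurpassed additive case can be checked numerically. The following sketch (using invented constants ai and standard uncertainties, and assuming Python with NumPy, neither of which is part of this guide) compares the Mainstream GUM combined standard uncertainty with a Monte Carlo estimate for an additive model with independent Gaussian inputs; the two agree to within sampling error.

```python
import numpy as np

rng = np.random.default_rng(0)
a = np.array([2.0, -1.0, 0.5])   # invented constants a1, a2, a3
u = np.array([0.3, 0.2, 0.1])    # invented standard uncertainties of X1, X2, X3

# Mainstream GUM: for Y = a1*X1 + ... + an*Xn with uncorrelated inputs,
# u(y)^2 = sum (ai * u(xi))^2; the result is exact when the Xi are Gaussian.
u_gum = np.sqrt(np.sum((a * u) ** 2))

# Monte Carlo check: sample the Gaussian inputs and evaluate the model directly.
X = rng.normal(0.0, u, size=(1_000_000, 3))
y = X @ a
u_mc = y.std()
```

For a nonlinear model the same comparison would generally reveal a discrepancy, which is the basis of the validation use of Monte Carlo Simulation discussed later in this guide.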
The concept of a model remains central to these alternative approaches. This
guide advocates the use of such an alternative approach in circumstances where
there is doubt concerning the applicability of Mainstream GUM. Guidance is
provided for this approach. The approach is numerical, being based on Monte
Carlo Simulation. It is thus computationally intensive, but nevertheless the
calculation times taken are often only seconds or sometimes minutes on a PC,
unless the model is especially complicated.
It is shown how the alternative approach can also be used to validate Main-
stream GUM and thus in any specific application confirm (or otherwise) that
this use of the GUM is fit for purpose, a central requirement of the Quality
Management Systems operated by many organizations. In instances where the
approach indicates that the use of Mainstream GUM is invalid, the approach
can itself subsequently be used for uncertainty evaluation, in place of Main-
stream GUM, in that it is consistent with the general principles (Clause G.1.5)
of the GUM.
An overall attitude taken to uncertainty evaluation in this guide is that it
consists of two phases. The first phase, formulation, constitutes building the
model and quantifying statistically its inputs. The second phase, calculation,
consists of using this information to determine the model output quantity and
quantify it statistically.
The concepts presented are demonstrated by examples, some chosen to em-
phasize a particular point and others taken from particular areas of metrology.
Each of these examples illustrates Mainstream GUM principles or the recom-
mended alternative approach or both, including the use of the latter as a vali-
dation facility for the former.
An account is included of the current situation concerning the revision of the
GUM, a process that is taking place under the auspices of the Joint Committee
for Guides in Metrology (JCGM).5 This revision is concerned with amplifying
and emphasizing key aspects of the GUM in order to make the GUM more read-
ily usable and more widely applicable. Any published revision to the GUM, at
least in the immediate future, would make no explicit change to the existing
document, but enhance its provisions by the addition of supplemental guides.
The approaches to uncertainty evaluation presented here are consistent with
the developments by the JCGM in this respect, as is the classification of model
types given. This best-practice guide will be updated periodically to account
for the work of this committee. It will also account for the work of standards
committees concerned with various aspects of measurement uncertainty, aware-
ness of requirements in the areas indicated by workshops, etc., organized within
SSfM and elsewhere, and technical developments.
Two of the authors of this guide are members of the Working Group of the
JCGM that is concerned with GUM revision and of other relevant national or
international committees, including British Standards Committee Panel SS/6/-
/3, Measurement Uncertainty, CEN/BT/WG 122, Uncertainty of Measurement,
and ISO/TC 69/SC 6, Measurement Methods and Results.
It is assumed that users of this guide have reasonable familiarity with the
GUM.
A companion document [20] provides specifications of relevant software for
uncertainty evaluation when applying some of the principles considered here.
1.4 Acknowledgements
This guide constitutes part of the deliverable of Project 1.2, Uncertainties and statistical modelling, within the UK Department of Trade and Industry's National Measurement System Software Support for Metrology Programme 1998–2001.
It has benefited from many sources of information. These include
SSfM workshops
6 The contacts are far too numerous to mention. Any attempt to enumerate them would
Chapter 2
Introduction
3 If a prediction were possible, allowance for the effect could be made!
to the systematic effect, both in this situation and in many other situations.
Depending on the application, the random effect may dominate, the systematic
effect may dominate or the effects may be comparable.
In order to make a statement concerning the measurement of the quantity of
interest it is typically required to provide a value for the measurand and an asso-
ciated uncertainty. The value is (ideally) a best estimate of the measurand
and the uncertainty a numerical measure of the quality of the estimate.
The above discussion concerns the measurement of a particular quantity.
However, the quantity actually measured by the device or instrument used is
rarely the result required in practice. For instance, the display on the bathroom
scales does not correspond to the quantity measured. The raw measurement
might be that of the extension of a spring in the scales whose length varies
according to the load (the weight of the person on the scales).
The raw measurement is therefore converted or transformed into the required
form, the measurement result. The latter is an estimate of the measurand, the
physical quantity that is the subject of measurement.
For a perfect (linear) spring, the conversion is straightforward, being based
on the fact that the required weight is proportional to the extension of the
spring. The display on the scales constitutes a graduation or calibration of the
device.
For a domestic mercury thermometer, the raw measurement is the height of
a column of mercury. This height is converted into a temperature using another
proportional relationship: a change in the height of the column is proportional
to the change in temperature, again a calibration.
A relationship of types such as these constitutes a rule for converting the
raw measurement into the measurement result.
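A rule of this proportional kind can be sketched in a few lines; the spring constant and the raw extension value below are hypothetical, chosen purely for illustration.

```python
def weight_from_extension(extension_mm, k=2.5):
    """Proportional calibration rule: convert the raw measurement (the spring
    extension, in mm) into the measurement result (the weight, in kg).
    k is a hypothetical calibration constant in kg per mm."""
    return k * extension_mm

# A raw measurement of 30 mm extension gives a measurement result of 75 kg.
print(weight_from_extension(30.0))
```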
In metrology, there are very many different types of measurement and there-
fore different rules. Even for one particular type of measurement there may
well be more than one rule, perhaps a simple rule (e.g., a proportional rule) for
everyday domestic use, and a sophisticated rule involving more complicated cal-
culations (a nonlinear rule, perhaps) that is capable of delivering more accurate
results for industrial or laboratory purposes.
Often, measurements are repeated and averaged in some way to obtain a
more reliable result.
The situation is frequently more general in another way. There is often a
number of different raw measurements that contribute to the estimation of the
measurand. Here, the concern is not simply repeated measurements, but intrin-
sically different measurements, e.g., some temperature measurements and some
displacement measurements. Also, there may be more than one measurand. For
instance, by measuring the length of a bar at various temperatures it may be
required to determine the coefficient of expansion of the material of which the
bar is made and also to determine the length of the bar at a temperature at
which it may not have been measured, e.g., 27 °C, when measurements were made at 20, 22, 24, 26, 28 and 30 °C.
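By way of illustration, the bar example might be treated as a least-squares straight-line fit; the length values below are invented, and the simple linear length-temperature model is an assumption.

```python
import numpy as np

T = np.array([20.0, 22.0, 24.0, 26.0, 28.0, 30.0])   # temperatures / deg C
L = np.array([1000.00, 1000.05, 1000.09,             # hypothetical measured
              1000.14, 1000.20, 1000.24])            # bar lengths / mm

slope, intercept = np.polyfit(T, L, 1)   # least-squares straight line
L27 = slope * 27.0 + intercept           # predicted length at the unmeasured 27 deg C
alpha = slope / L[0]                     # estimated coefficient of expansion
```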
In addition to raw data, representing measurements, there is another form
of data that is also frequently fed into a rule in order to provide a measurement
result. This additional data relates to a variety of constants, each of which
can be characterized as having a value and a distribution about it to represent
the imperfect knowledge of the value. An example is a material constant such
as modulus of elasticity, another is a calibrated dimension of an artefact such
as a length or diameter, and another is a correction arising from the fact that a
measurement was made at, say, 22 °C rather than the stipulated 20 °C.
The complete set of data items that are required by the rule to enable a
measurement result to be produced is known as the input quantities. The rule
is usually referred to as a model because it is the use of physical modelling
(or perhaps empirical modelling or both types of modelling) [22] of a measure-
ment, measurement system or measurement process that enables the rule to be
established. The model output quantities are used to estimate the measurands.
This guide is concerned with the problem of characterizing the nature of the
errors in the estimates of the measurands given the model, the input quantities
and information concerning the errors in these quantities. Some advice is given
on assigning statistical properties to the input quantities. Because the form
of the model varies enormously over different metrology disciplines, it is largely
assumed that a (physical) model is available (having been derived by the experts
in the appropriate area). The use of statistical modelling is considered, however,
in the context of capturing the error structure of a problem. Model validity is
not specifically addressed. Information is available in a companion publication
[9].
In particular, this guide reviews several approaches to the problem, including
the widely-accepted GUM approach. It reviews the interpretation of the GUM
that is made by many organisations and practitioners concerned with measure-
ments and their analysis and the presentation of measurement results (Section
6 of this guide).
The point is made that this interpretation is subject to limitations that are
insufficiently widely recognized. These limitations have, however, been indicated
[61] and are discussed in Section 5.4.
An approach free from these limitations is presented. It is a numerical
method based on the use of Monte Carlo Simulation (MCS) and can be used
1. in its own right to characterise the nature of the inexactness in the mea-
surement results,
2. to validate the approach based on the above-mentioned interpretation of
the GUM.
MCS itself has deficiencies. They are of a different nature from those of
Mainstream GUM, and to a considerable extent controllable. They are identified
in Section 7.6.
The GUM does not refer explicitly to the use of MCS. However, this option
was recognized during the drafting of the GUM. The ISO/IEC/OIML/BIPM
draft (First Edition) of June 1992, produced by ISO/TAG 4/WG 3, states, as
Clause G.1.5:
If the relationship between Y [the model output] and its input quan-
tities is nonlinear, or if the values available for the parameters char-
acterizing the probabilities of the Xi [the inputs] (expectation, vari-
ance, higher moments) are only estimates and are themselves char-
acterized by probability distributions, and a first order Taylor ex-
pansion is not an acceptable approximation, the distribution of Y
cannot be expressed as a convolution. In this case, numerical meth-
ods (such as Monte Carlo calculations) will generally be required
and the evaluation is computationally more difficult.
In the published version of the GUM [1], this Clause had been modified to
read:
[Clause 6.6] The NIST policy provides for exceptions as follows (see
Appendix C):
It is understood that any valid statistical method that is technically
justified under the existing circumstances may be used to determine
the equivalent of ui [the standard deviation of the ith input quan-
tity], uc [the standard deviation of the output], or U [the half-width
of a coverage interval for the output, under a Gaussian assumption].
Further, it is recognised that international, national, or contractual
agreements to which NIST is a party may occasionally require de-
viation from NIST policy. In both cases, the report of uncertainty
must document what was done and why.
This guide adheres to these broad views. The most important aspect re-
lates to traceability of the results of an uncertainty evaluation. An uncertainty
evaluation should include
1. all relevant information relating to the model and its input quantities,
2. an estimate of the measurand and an associated coverage interval (or
coverage region),
3. the manner in which these results were determined, including all assump-
tions made.
There would also appear to be valuable and relevant interpretations and
considerations in the German standard DIN 1319 [27]. An official English-
language translation of this standard would not seem to be available.
There has been a massive investment in the use of the GUM. It is essential
that this investment is respected and that this guide is not seen as deterring
4 That this interpretation is correct has been confirmed by JCGM/WG1.
the continuation of its use, at least in circumstances where such usage can be
demonstrated to be appropriate.
In this respect, a recommended validation procedure for Mainstream GUM is
provided in this guide. The attitude taken is that if the procedure demonstrates
in any particular circumstance that this usage is indeed valid, the Mainstream
GUM procedure can legitimately continue to be used in that circumstance. The
results of the validation can be used to record the fact that fitness for purpose
in this regard has been demonstrated. If the procedure indicates that there
is doubt concerning the validity of Mainstream GUM usage, there is a case
for investigation. Since in the latter case the recommended procedure forms
a constituent part (in fact the major part) of the validation procedure, this
procedure can be used in place of the Mainstream GUM approach. Such use of
an alternative procedure is consistent with the broader principles of the GUM
(Section 5 of this guide and above).
There is another vital issue facing the metrologist. In a measurement situa-
tion it is necessary to characterise the nature of the errors in the input quantities
and to develop the model for the measurand in terms of these quantities. Car-
rying out these tasks can be far from easy. Some advice is given in this regard.
However, written advice can only be general, although examples and case stud-
ies can assist. In any one circumstance, the metrologist has the responsibility,
perhaps with input from a mathematician or statistician if appropriate, of char-
acterizing the input quantities and building the model.
The Mainstream GUM procedure and the recommended approach using
Monte Carlo Simulation both utilize this information (but in different ways).
As mentioned, the former possesses some limitations that the latter sets out to
overcome (but again see Section 7.7 concerning implementation).
There is often doubt concerning the nature of the errors in the input quanti-
ties. Are they Gaussian or uniform, or do they follow some other distribution?
The metrologist needs to exercise best judgement in this regard (see Section
4.3). Even then there may be aspects that cannot fully be quantified.
It is regarded as important that this incompleteness of knowledge, which can arise in various circumstances, is handled by repeating the exercise of characterizing the errors in the outputs. By this statement it is meant that any
assumptions relating to the nature of the errors in the inputs are changed to
other assumptions that in terms of the available knowledge are equally valid.
Similarly, the information that leads to the model may be incomplete, and therefore changes to the model, consistent with this lack of knowledge, can be made.
The sensitivity of the output quantities can then be assessed as a function
of such perturbations by repeating the evaluation.
The attitude here is that whatever the nature of the input quantities and
the model, even (and especially!) if some subjective decisions are made in their
derivation, the nature of the outputs should then follow objectively and without
qualification from this information, rather than in a manner that is subject to
limitations, in the form of effects that are difficult to quantify and beyond the
control of the practitioner.
In summary, the attitude that is generally promoted in this guide is that
as far as economically possible use should be made of all available knowledge.
In particular, (a) the available knowledge of the input quantities should be em-
bodied within their specification, (b) a model that relates these input quantities
to the measurand should carefully be constructed, and (c) the calculation of
scientific reason for this choice. It almost certainly stems from the traditional
use of 95% in statistical hypothesis testing [11], although the reasons for the
choice in that area are very different. The overriding reason for the use of 95%
in uncertainty evaluation is a practical one. It has become so well established
that for purposes of comparison with other results its use is almost mandated.
Another strong reason for the use of 95% is the considerable influence of the
Mutual Recognition Arrangement concerning the comparison of national mea-
surement standards and of calibration and measurement certificates issued by
national metrology institutes [7]. See Appendix A.4 for a discussion.
Such an interval will be referred to in this guide as a 95% coverage interval.
It can be argued that if a coverage interval at one level of probability is quoted, it can be converted into one at another level. Indeed, a similar
operation is recommended in the GUM, when information concerning an input
distribution is converted into a standard deviation (standard uncertainty in
GUM parlance). The standard deviations together with sensitivity coefficients
are combined to produce the standard deviation of the output, from which a
coverage interval is obtained by multiplication by a factor. The factor is selected
based on the assumption that the output distribution is Gaussian.
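The combination just described can be sketched numerically; the sensitivity coefficients and input standard uncertainties below are invented for illustration.

```python
import numpy as np

c = np.array([1.0, 0.5, -2.0])    # invented sensitivity coefficients
u = np.array([0.10, 0.20, 0.05])  # invented input standard uncertainties

# Combine: u(y)^2 = sum (ci * u(xi))^2 for uncorrelated input quantities.
u_y = np.sqrt(np.sum((c * u) ** 2))

# Expanded uncertainty: multiply by a coverage factor, here k = 1.96 under the
# Gaussian assumption for a 95% coverage interval (often rounded to k = 2).
U = 1.96 * u_y
```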
That this process gives rise to difficulties in some cases can be illustrated
using a simple example. Pre-empting the subsequent discussion, consider the
model Y = X1 + X2 + . . ., where X1 , X2 , . . . are the input quantities and Y
the output quantity. Assume that all terms but X1 have a small effect, and X1
has a uniform distribution. The above-mentioned GUM procedure gives a 95%
coverage interval for Y that is longer than the 100% coverage interval for X1!
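This assertion is easily verified: a rectangular distribution over [−a, a] has standard deviation a/√3, so the Gaussian-based 95% half-width is 1.96a/√3 ≈ 1.13a, longer than the half-width a of the 100% coverage interval for X1.

```python
import numpy as np

a = 1.0                      # X1 uniform (rectangular) on [-a, a]
u1 = a / np.sqrt(3)          # standard uncertainty of a rectangular distribution
gum_half_width = 1.96 * u1   # 95% half-width under the Gaussian assumption

# gum_half_width is about 1.13*a: longer than the 100% interval [-a, a] of X1.
assert gum_half_width > a
```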
Instances of this type would appear to be not uncommon. For instance, the
EA guide [28] gives three examples arising in the calibration area.
This possibility is recognised by the GUM:
[GUM Clause G.6.5] ... Such cases must be dealt with on an individual basis but are often amenable to an analytic treatment (involving, for example, the convolution of a normal distribution with a rectangular distribution ...
The statement that such cases must be dealt with on an individual basis would
appear to be somewhat extreme. Indeed, such a treatment is possible (cf. Sections 5.1 and 5.1.2), but is not necessary, since Monte Carlo Simulation (Section 7) generally operates effectively in cases of this type.
The interpretation [63] of the GUM by the United Kingdom Accreditation
Service recommends the inclusion of a dominant uncertainty contribution by
adding the term linearly to the remaining terms combined in quadrature. This
interpretation generally gives rise to a more valid result, but remains an approximation. The EA Guide [28] provides some analysis of some such cases.
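The linear-plus-quadrature combination described above can be sketched as follows; the contribution values are invented, and this illustrates the recipe as summarized here, not the UKAS document itself.

```python
import numpy as np

U_dominant = 0.50                       # invented dominant contribution
U_rest = np.array([0.10, 0.08, 0.05])   # invented remaining contributions

# Add the dominant term linearly to the others combined in quadrature.
U = U_dominant + np.sqrt(np.sum(U_rest ** 2))
```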
It is emphasized that a result produced according to a fixed recipe that is
not universally applicable, such as Mainstream GUM, may well be only approx-
imately correct, and the degree of approximation difficult to establish.
The concern in this guide is with reliable uncertainty evaluation, in that the
results will not exhibit inconsistent or anomalous behaviour, however simple or
complicated the model may be.
Appendix A reviews some relevant statistical concepts.
As an extreme example, consider a random variable X that can take only two values, a and b, with equal probability. The mean is μ = (a + b)/2 and the standard deviation σ = |b − a|/2. However, given only the values of μ and σ, there is no way of deducing the distribution. If a Gaussian distribution were assumed, it would be concluded that the interval μ ± 1.96σ contained 95% of the distribution. In fact, the interval contains 100% of the distribution, as does the interval μ ± σ, of about half that length.
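The same conclusion follows from a direct calculation, taking a = 0 and b = 2 purely for illustration.

```python
import numpy as np

a, b = 0.0, 2.0
values = np.array([a, b])     # X takes each of the two values with probability 1/2
mu = values.mean()            # (a + b)/2
sigma = abs(b - a) / 2        # |b - a|/2

# Both possible values already lie inside mu +/- sigma, so this short interval
# has 100% coverage; the Gaussian-motivated interval mu +/- 1.96*sigma is
# nearly twice as long yet covers no more of the distribution.
assert np.all(np.abs(values - mu) <= sigma)
```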
Related comments are made in Clause G.6.1 of the GUM. Although knowl-
edge of the mean and standard deviation is valuable information, without further
information it conveys nothing about the manner in which the values are dis-
tributed.5 If, however, it is known that the underlying distribution is Gaussian,
the distribution of the output quantity is completely described since just the
mean and standard deviation fully describe a Gaussian distribution. A sim-
ilar comment can be made for some other distributions. Some distributions
require additional parameters to describe them. For instance, in addition to the mean and standard deviation, a Student's t-distribution requires the number of degrees of freedom to specify it.
Thus, if the form of the distribution is known, from analysis, empirically
or from other considerations, the determination of an appropriate number of
statistical parameters will permit it to be quantified. Once the quantified form
of the distribution is available, it is possible to calculate a percentile, i.e., a
value for the measurement result such that, according to the distribution, the
corresponding percentage of the possible values of the measurement result is
smaller than that value. For instance, if the 25-percentile is determined, 25%
of the possible values can be expected to lie below it (and hence 75% above it).
Consider the determination of the 2.5-percentile and the 97.5-percentile. 2.5%
of the values will lie to the left of the 2.5-percentile and 2.5% to the right of the
97.5-percentile. Thus, 95% of the possible values of the measurement result lie
between these two percentiles. These points thus constitute the endpoints of a
95% coverage interval for the measurand.
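When the distribution is represented by a random sample, as in Monte Carlo Simulation, the two percentiles can be estimated directly; a standard Gaussian sample is used below purely for illustration, for which the endpoints should approach ±1.96.

```python
import numpy as np

rng = np.random.default_rng(3)
samples = rng.normal(0.0, 1.0, size=1_000_000)  # stand-in for possible values

# 2.5% of values lie below p025 and 2.5% above p975, so [p025, p975] is
# (an estimate of) a 95% coverage interval.
p025, p975 = np.percentile(samples, [2.5, 97.5])
```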
The 2.5-percentile of a distribution can be thought of as a point a certain
number of standard deviations below the mean and the 97.5-percentile as a
point a certain number of standard deviations above the mean. The numbers of
5 See, however, the maximum entropy considerations in Appendix D.2.
Then [52],
1. [a, b] is a 100(1 − α)% coverage interval. For instance, if a and b are such that
G(b) − G(a) = 0.95,
95% of the possible values of x lie between a and b,
2. The shortest such interval is given by g(a) = g(b). a lies to the left of the mode (the value of x at which g(x) is greatest) and b to the right,
3. If g(x) is symmetric, not only is the shortest such interval given by g(a) = g(b), but also a and b are equidistant from the mode.
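The shortest interval can also be found numerically from a sample, by sliding a window containing the required fraction of the sorted values and taking the narrowest; the gamma distribution below is an invented example of an asymmetric pdf.

```python
import numpy as np

def shortest_coverage_interval(samples, p=0.95):
    """Shortest interval containing (at least) a fraction p of the samples."""
    s = np.sort(samples)
    m = int(np.ceil(p * len(s)))              # points the interval must contain
    widths = s[m - 1:] - s[: len(s) - m + 1]  # width of every candidate window
    i = int(np.argmin(widths))                # the narrowest window wins
    return s[i], s[i + m - 1]

rng = np.random.default_rng(5)
y = rng.gamma(2.0, size=200_000)              # a skewed, unimodal pdf
lo, hi = shortest_coverage_interval(y)
# For a symmetric pdf the endpoints come out equidistant from the mode; for
# this skewed pdf the shortest interval is shorter than the equal-tail one.
```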
6 In most interpretations of the GUM (Mainstream GUM), the model output is taken as Gaussian or Student's t.
Chapter 3
Uncertainty evaluation
[Figure 3.1: the estimates x1, x2, x3 of the input quantities, with associated uncertainties u(x1), u(x2), u(x3), are propagated through the model f(X) to give the estimate y of the output quantity and its associated uncertainty u(y).]
quantity is unbiased. It is expected that the bias will be negligible in many cases. The bias results from the fact that the value of Y obtained by evaluating the model at the input estimates x is not in general equal to the value of Y given by the mean of the pdf g(y) of Y. These values will be equal when the model is linear in X, and close if the model is mildly nonlinear or if the uncertainties of the input quantities are small. See Section 3.2. A demonstration of the bias is given by the simple model Y = X², where X is assigned a Gaussian distribution with mean zero and standard deviation u. The mean of the pdf describing the input quantity X is zero. The corresponding value of the output quantity Y is also zero. However, the mean value of the pdf describing Y cannot be zero, since Y ≥ 0, with equality occurring only when X = 0. (This pdf is in fact a χ² distribution with one degree of freedom.)
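This demonstration of bias is readily confirmed by simulation, taking u = 1 for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
u = 1.0
x = rng.normal(0.0, u, size=1_000_000)  # X: Gaussian, mean zero, std dev u
y = x ** 2                              # the model Y = X^2

# Evaluating the model at the input estimate x = 0 gives y = 0, but the mean
# of the pdf of Y (a chi-squared distribution with one degree of freedom,
# scaled by u^2) is E[X^2] = u^2, not 0.
print(y.mean())
```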
[Figure 3.2: the pdfs g(x1), g(x2), g(x3) of the input quantities are propagated through the model f(X) to give the pdf g(y) of the output quantity.]
[1, Clauses 2.3.2, 3.3.5] or are based on scientific judgement using all the relevant
information available [1, Clauses 2.3.3, 3.3.5], [61].
In the case of uncorrelated input quantities and a single output quantity,
Phase 2 of uncertainty evaluation is as follows. Given the model Y = f (X),
where X = (X1 , . . . , Xn )T , and the pdfs gi (xi ) (or the distribution functions
Gi (xi )) of the input quantities Xi , for i = 1, . . . , n, determine the pdf g(y) (or
the distribution function G(y)) for the measurand Y .
The mean of g(y) is an unbiased estimate of the output quantity.
Figure 3.2 shows the counterpart of Figure 3.1 in which the pdfs (or the
corresponding distribution functions) of the input quantities are propagated
through the model to provide the pdf (or distribution function) of the measur-
and.
It is reiterated that once the pdf (or distribution function) of the measurand
Y has been obtained, any statistical information relating to Y can be produced
from it. In particular, a coverage interval for the measurand at a stated level of
probability can be obtained.
When the input quantities are correlated, in place of the n individual pdfs
gi(xi), i = 1, . . . , n, there is a joint pdf g(x). An example of a joint pdf is
the multivariate Gaussian pdf (Section 4.3.2). In practice this joint pdf may
be decomposable. For instance, in some branches of electrical, acoustical and
optical metrology, the input quantities may be complex-valued. The real and
imaginary parts of each such variable are generally correlated and thus each has
an associated 2 × 2 covariance matrix. See Section 6.2.5. Otherwise, the input
quantities may or may not be correlated.
Uncertainty and Statistical Modelling
If there is more than one output quantity, Y, these outputs will almost
invariably need to be described by a joint pdf g(y), since each output quantity
generally depends on all the input quantities. See Section 9.7 for an important
exception.
Phase 2, calculation, involves the derivation of the measurement result and
the associated uncertainty, given the information provided by Phase 1. It is
computational and requires no further information from the metrology applica-
tion. The uncertainty is commonly provided as a coverage interval. A coverage
interval can be determined once the distribution function G(y) (Appendix A)
has been derived. The endpoints of a 95% coverage interval3 are given (Section
2.3) by the 0.025- and 0.975-quantiles of G(y), the α-quantile being the value
of y such that G(y) = α.4
It is usually sufficient to quote the uncertainty of the measurement result
to one or at most two significant figures. See Section 7.7.1. In general, fur-
ther figures would be spurious, because the information provided in Phase 1
is typically imprecise, involving a number of estimates and assumptions. The
attitude taken here though is that the second phase should not exacerbate the
consequences of the decisions made in Phase 1.5
The pdf g(y) pertaining to Y cannot generally be expressed in simple or even
closed mathematical form. Formally, if δ(·) denotes the Dirac delta function,

    g(y) = ∫ ∫ · · · ∫ g(x) δ(y − f(x)) dxn dxn−1 . . . dx1        (3.1)
[19]. Approaches for determining g(y) or G(y) are addressed in Section 5. That
several approaches exist is a consequence of the fact that the determination of
g(y) and/or G(y) ranges from being very simple to extremely difficult, depending
on the complexity of the model and its input pdfs.
3 95% coverage intervals are used in this guide, but the treatment applies more generally.
4 There are many intervals having a coverage probability of 0.95, a general interval being
given by the α- and (0.95 + α)-quantiles of G(y), with 0 ≤ α ≤ 0.05. The choice α = 0.025 is
natural for a G(y) corresponding to a symmetric pdf g(y). The resulting interval also has the
shortest length for a symmetric pdf and, in fact, for a unimodal pdf (Section 2.3).
5 This attitude compares with that in mathematical physics where a model (e.g., a partial
differential equation) is constructed and then solved. The construction involves idealizations
and inexact values for dimensional quantities and material constants, for instance. The solu-
tion process involves the application of hopefully sensible and stable methods in order to make
some supported statements about the quality of the solution obtained to the posed problem.
Chapter 4
1. Statistical modelling.
2. Input-output modelling.
Consider a situation in which the ti are known with negligible error and xi
has error ei . Suppose that the nature of the calibration is such that a straight-
line calibration curve y = a1 + a2 x provides an adequate realization. Then, as
part of the statistical modelling process [22], the equations
xi = a1 + a2 ti + ei , i = 1, . . . , n, (4.1)
relate the observations xi to the calibration parameters a1 (intercept) and a2
(gradient) of the line and the errors ei in the measurements.
In order to establish values for a1 and a2 it is necessary to make an appro-
priate assumption about the nature of the errors ei [22]. For instance, if they
can be regarded as realizations of independent, identically distributed Gaussian
variables, the best unbiased estimate of the parameters is given by least squares.
Specifically, a1 and a2 are given by minimizing the sum of the squares of the
deviations xi − a1 − a2 ti , over i = 1, . . . , n, with respect to a1 and a2 , viz.,

    min over a1 , a2 of  Σ_{i=1}^{n} (xi − a1 − a2 ti)².
The model equations (4.1), with the solution criterion (least-squares), constitute
the results of the statistical modelling process for this example.
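As an illustrative sketch (not part of the guide; the data values are invented), the least-squares estimates for this two-parameter model can be computed from the closed-form normal-equations solution:

```python
# Least-squares fit of the straight-line calibration model
# x_i = a1 + a2*t_i + e_i via the closed-form normal-equations solution.
# The data values are purely illustrative.
t = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(t)
t_bar = sum(t) / n
x_bar = sum(x) / n
# a2 (gradient) and a1 (intercept) minimizing sum((x_i - a1 - a2*t_i)^2)
a2 = (sum((ti - t_bar) * (xi - x_bar) for ti, xi in zip(t, x))
      / sum((ti - t_bar) ** 2 for ti in t))
a1 = x_bar - a2 * t_bar
residuals = [xi - a1 - a2 * ti for ti, xi in zip(t, x)]
print(a1, a2)   # intercept and gradient
```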
There may be additional criteria. For instance, a calibration line with a
negative gradient may make no sense in a situation where the gradient represents
a physical parameter whose true value must always be greater than or equal
to zero (or some other specified constant value). The overall criterion in this
case would be to minimize the above sum of squares with respect to a1 and
a2 , as before, with the condition that a2 0. This problem is an example of a
constrained least-squares problem, for which sound algorithms exist [22]. In this
simple case, however, the problem can be solved more easily for the parameters,
but the uncertainties associated with a1 and a2 require special consideration.
See the example in Section 9.6.
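For this single-constraint case, a minimal sketch (an assumption-laden illustration, not the guide's algorithm): since the sum of squares is a convex quadratic, if the unconstrained gradient estimate is negative the constrained optimum lies on the boundary a2 = 0, where the best intercept is simply the mean of the observations.

```python
# Constrained least squares for x_i = a1 + a2*t_i with a2 >= 0.
# For this convex two-parameter problem, if the unconstrained solution
# violates a2 >= 0, the constrained optimum has a2 = 0 and a1 = mean(x).
def fit_line_nonneg_gradient(t, x):
    n = len(t)
    t_bar = sum(t) / n
    x_bar = sum(x) / n
    a2 = (sum((ti - t_bar) * (xi - x_bar) for ti, xi in zip(t, x))
          / sum((ti - t_bar) ** 2 for ti in t))
    a1 = x_bar - a2 * t_bar
    if a2 < 0.0:             # constraint active: clamp to the boundary
        a2 = 0.0
        a1 = x_bar
    return a1, a2

# Illustrative data with a downward trend: the constraint activates.
t = [1.0, 2.0, 3.0, 4.0]
x = [5.2, 4.9, 4.6, 4.1]
print(fit_line_nonneg_gradient(t, x))   # gradient clamped to 0.0
```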
Appendix B discusses some of the modelling issues associated with the sta-
tistical modelling approach of this section and the input-output modelling ap-
proach of the next.
The problem of establishing a simple model for the length of a piece of string,
when measured with a tape, is considered (cf. [5]). The measurand is the length
of the string. As part of Phase 1, formulation, a measurement model for string
length is established. It depends on several input quantities. This model is
expressed here as the sum of four terms. Each of these terms, apart from the
first, is itself expressed as a sum of terms.1 The model takes the form2
String length = Measured string length (1)
+ Tape length error (2)
+ String length error (3)
+ Measurement process error (4),
where
(1) Measured string length = Average of a number of repeated
measurements
(2) Tape length error = Length error due to tape calibration
imperfections
+ Extension in tape due to stretching
(negative if there is shrinking rather
than stretching)
+ Reduction in effective length of tape
due to bending of the tape
(3) String length error = Reduction in effective string length due
to string departing from a straight line
+ Reduction in string length as a result
of shrinking (negative if there is
stretching rather than shrinking)
(4) Measurement process error = Length error due to inability to align
end of tape with end of string due to
fraying of the string ends
+ Length error due to the tape and the
string not being parallel
+ Error committed in assigning a
numerical value to the measurement
indicated by the tape
+ Length error arising from the
statistics of averaging a finite number
of repeated measurements.
Once this model is in place statements can be made about the nature of the
various terms in the model as part of the first phase of uncertainty evaluation.
1 The model can therefore be viewed as a multi-stage model (Section 4.2.3), although of
Phase 2 can then be carried out to calculate the required uncertainty in the
measured length.
There may be some statistical modelling issues in assigning pdfs to the input
quantities. For instance, a Student's t-distribution would probably be assigned
to the measured string length (1), based on the mean and standard deviation of
the repeated measurements, with a number of degrees of freedom one less than
the number of measurements. As another instance, for tape bending, this effect
can be approximated by a χ² variable3 which does not, as required, have zero
mean, since the minimum effect of tape bending on the measurand is zero.
For the straight-line calibration example of Section 4.1 (Example 3), the GUM
model constitutes a formula or prescription (not in general necessarily explicit in
form) derived from the results of the statistical modelling process. Specifically,
the measurement results a = (a1 , a2 )T are given in terms of the input quantities
x = (x1 , . . . , xn )T by an equation of the form
Ha = q. (4.3)
(Compare [22], [1, Clause H.3].) Here, H is a 2 × 2 matrix that depends on the
values of the ti , and q a 2 × 1 vector that depends on the values of the ti and
the xi .
By expressing this equation as the formula

    a = H−1 q,    (4.4)
a GUM model for the parameters of the calibration line is obtained. It is, at
least superficially, an explicit expression4 for the measurement result a. The
form (4.3) is also a GUM model, with the measurement result defined implicitly
by the equation (Section 6).
matrix inversion is not recommended [22]. The form (4.4), or forms like it in other such appli-
cations, should not be regarded as an implementable formula [22]. Rather, numerically stable
matrix factorization algorithms [35] should be employed. This point is not purely academic.
The instabilities introduced by inferior numerical solution algorithms can themselves be an
appreciable source of uncertainty. It is not straightforward to quantify this effect.
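To illustrate the point (a sketch with made-up numbers, not the guide's example): Ha = q is solved via a stable factorization rather than by forming H−1 explicitly.

```python
import numpy as np

# Solving H a = q by a stable factorization (numpy.linalg.solve uses LU
# with partial pivoting) instead of forming inv(H) explicitly.
# H and q are illustrative stand-ins for the quantities in (4.3).
H = np.array([[4.0, 10.0],
              [10.0, 30.0]])
q = np.array([6.0, 20.0])

a = np.linalg.solve(H, q)       # preferred: factorize and solve
a_bad = np.linalg.inv(H) @ q    # discouraged: explicit matrix inversion
print(a)
```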
...
It is the view of the authors of this guide that these statements by Eurachem
are sound. However, this guide takes a further step: it models the measurement
and, through the model, defines and makes a statement about the measurand,
as opposed to the observations. Because a (simple) model is established, this
step arguably exhibits even closer consistency with the GUM.
The Eurachem statements stress that observations are not often constrained
by the same fundamental limits that apply to real concentrations. It is hence
appropriate to demand that the measurand, defined to be the real analyte con-
centration (or its counterpart in other applications) should be constrained to
be non-negative. Also, the observations should not and cannot be constrained,
because they are the values actually delivered by the measurement method.
Further, again consistent with the Eurachem considerations, assign a pdf to
the input quantity, viz., a best (unconstrained) estimate of the analyte con-
centration, that is symmetric about that value. Thus, the input, X, say, is
unconstrained analyte concentration with a symmetric pdf, and the output, Y ,
say, real analyte concentration, with a pdf to be determined.
The rationale behind this simple choice of model is as follows. Should the mean
x of the observations prove to be non-negative, it would naturally be taken as
the estimate of the measurand Y . Such a value would conventionally be used
at points removed from the limit of detection. Should x prove to be negative,
it cannot be used as a physically feasible estimate of Y , since by definition Y
is the real analyte concentration and hence non-negative. Taking y = 0 in this
case is the optimal compromise between the observed values and feasibility (the
closest feasible value to the mean observation).
An example illustrating the use of this model is given in Section 9.5.
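The estimation rule described above can be sketched as follows (illustrative code and data, not from the guide):

```python
# Estimate of a non-negative measurand Y (real analyte concentration) from
# the mean of unconstrained observations: a negative mean is clamped to
# zero, the closest feasible value.
def estimate_concentration(observations):
    x_bar = sum(observations) / len(observations)
    return max(x_bar, 0.0)

print(estimate_concentration([0.3, -0.1, 0.4]))   # positive mean: used as-is
print(estimate_concentration([-0.2, -0.3, 0.1]))  # negative mean: clamped to 0
```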
the argument (force in the above instance).6 The output quantities will be the
values of the curve at these designated points, together with their associated
covariance matrix. Again, because these curve values all depend in general on all
the inputs, the curve parameters, this covariance matrix will be non-diagonal.
There may be no further stage, since the interpolated values provided by
the calibration curve may be the primary requirement.
Otherwise, Stage 3 will be the use of the curve values obtained in Stage 2
to provide further measurement results. As an example, take the area under
(a specified portion of) the calibration curve. Suppose that this area is to be
determined by numerical quadrature because of the impossibility of carrying
out the integration analytically. This result can typically be expressed as a
linear combination of the curve values provided as input quantities. As another
instance, if more than one measurement result is required, e.g., estimates of
gradients to the curve at various points, these again can typically be expressed
as linear combinations of the curve values. They will, for similar reasons to
those above, have a non-diagonal covariance matrix.
The concepts described in Chapter 6 can be applied to the above stages. The
various categories within the classification of that chapter would relate to the
various types of calibration model, depending on whether it can be expressed
explicitly or implicitly or is real-valued or complex-valued. The model is almost
always multivariate in the sense of Chapter 6, i.e., it has more than one output.
data in Stage 1 represents a set of standards, e.g., established responses to specified concen-
trations. At Stage 2, it is required to use the calibration curve to determine the concentration
corresponding to an observed response. The mathematical function representing the curve
then constitutes an implicit model (Section 6.2.3) (e.g., if the calibration curve is a fifth-degree
polynomial).
straight line such that its parameters are uncorrelated [22]. Also see Section 4.2.1. The
example in Clause H.3 of the GUM illustrates this point. Such re-parametrisation is not always
a practical proposition, however, because of the conflict between a numerically or statistically
convenient representation and the requirements of the application. However, the possibility of
re-parametrization should always be considered carefully for at least two reasons. One reason
is that the result corresponding to a sound parametrization can be obtained in a numerically
stable manner [22], whereas a poor parametrization can lead to numerically suspect results.
Another reason is that a poor parametrization leads to artificially large correlations in the
output quantities. Decisions about the natural correlation present in the results cannot be
made in terms of these induced correlation effects.
8 The subdivision into Type A and Type B evaluations of uncertainty will correspond in
some instances to random and systematic effects, respectively, but not in all circumstances.
    g(x) = (1/(σ√(2π))) exp{−(x − μ)²/(2σ²)},    −∞ < x < ∞.

The standardized Gaussian distribution in the variable Z, with zero mean and
unit standard deviation, is

    φ(z) = (1/√(2π)) exp(−z²/2),    −∞ < z < ∞.
The inverse function Φ−1(p) gives the value of z such that Φ(z) = p, a stated
probability.
Tables and software for Φ and its inverse are widely available.
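For instance (a sketch using Python's standard library, one of many software options):

```python
from statistics import NormalDist

# Quantiles of the standardized Gaussian distribution: Phi^{-1}(p) gives
# the z such that Phi(z) = p. The 0.975-quantile underlies the familiar
# "1.96 sigma" 95% coverage interval.
z = NormalDist().inv_cdf(0.975)
p = NormalDist().cdf(z)       # round trip: Phi(Phi^{-1}(0.975)) = 0.975
print(z, p)
```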
It states that any value of x in the interval [a, b] is equally probable and that
the probability of a value of x outside this interval is zero.
Consider two values c and d, where c < d. The probability that X lies
between c and d is straightforwardly confirmed to be

    ∫cd g(x) dx =  0,                  d ≤ a,
                   (d − a)/(b − a),    c ≤ a ≤ d ≤ b,
                   (d − c)/(b − a),    a ≤ c < d ≤ b,
                   (b − c)/(b − a),    a ≤ c ≤ b ≤ d,
                   0,                  b ≤ c.
The authors of this guide regard the above assumption with some scepticism.
Although the PME is perfectly reasonable in the light of such information, it is
the information that leads to its being invoked in this manner that should be
questioned. The belief in uniform distributions as being "right" would appear
to be an overused and misused belief. It can lead to anomalies.
Can taking a uniform pdf be a better model in general than using, say, a
Gaussian? There are indeed genuine instances for the use of a uniform pdf. An
[Figure 4.1: errors in successive values displayed by a simulated instrument having a resolution of two decimal places.]
a sequence of readings corresponding to a slowly varying signal. The errors in the resolved
sequence are serially correlated as a consequence of the resolution of the instrument. Figure 4.1
shows the errors in successive values displayed by a simulated instrument having a resolution
of two decimal places. The values shown are the differences between the values of sin t that
would be displayed by the instrument and the actual values of sin t, for t = 1.00, 1.01, . . . , 1.10
radians. Any analysis of such data that did not take account of the very obvious serial
correlation would yield a flawed result. The effects of serial correlation depend on the relative
sizes of the uncertainties in the signal, the instrument resolution and the magnitudes of the
changes in successive values of the signal (the last-mentioned item depending on the sampling
rate). In hopefully many cases they will be negligible, but it is appropriate to establish when
this is indeed the case. In the context of calibration it is stated [28], but the point is more
general, that the measurement uncertainty associated with the calibration of all low-resolution
indicating instruments is dominated by the finite resolution provided this resolution is the only
dominant source in the uncertainty budget.
Figure 4.2: A uniform pdf with inexact endpoints. The diagram is conceptual:
the height of the pdf would in fact vary with the endpoints in order to maintain
unit area.
left endpoint were between −1.05 and −0.95 and the right endpoint between 0.95 and 1.05.
11 An alternative approach can be used. It could be assumed, for instance, that each endpoint
can be regarded as a Gaussian (rather than a uniform) variable, centred on that endpoint,
with a stated standard deviation. The analysis and the result would differ from that here.
The choice of approach would be made using expert judgement.
Figure 4.3: A histogram produced using Monte Carlo Simulation for the model
X = A + (B − A)V , where A is uniform over [a − d, a + d], B is uniform over
[b − d, b + d] and V is uniform over [0, 1], with a = −1.0, b = 1.0 and d = 0.5.
Figure 4.2 refers. The histogram provides a scaled estimate of the pdf of X.
It corresponds to an input quantity that is defined by a uniform distribution
between inexact limits, themselves being represented by uniform distributions.
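A minimal Monte Carlo sketch of this model (parameter values as in the figure caption; not the guide's code):

```python
import random
import math

# Monte Carlo Simulation for X = A + (B - A)V: a uniform distribution whose
# endpoints A and B are themselves uniform, with a = -1.0, b = 1.0, d = 0.5.
random.seed(42)
a, b, d = -1.0, 1.0, 0.5
n = 200_000
xs = []
for _ in range(n):
    A = random.uniform(a - d, a + d)
    B = random.uniform(b - d, b + d)
    V = random.uniform(0.0, 1.0)
    xs.append(A + (B - A) * V)

mean = sum(xs) / n
std = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
print(std)   # close to 0.625 (cf. 1/sqrt(3) = 0.577 for exact endpoints)
```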
Figure 4.4: As Figure 4.3 except that the endpoints are related as described in
the text.
their location. Between the inner and outer extremities it reduces from the
uniform height to zero in what is approximately a quadratic manner. Beyond the
outer extremities the pdf is zero. The piecewise nature of the pdf is comparable
to that for the sum of uniform pdfs, where the pieces form polynomial segments
[26].
The standard deviation of X, assuming the exactness of the endpoints (equiv-
alent to taking d = 0), is 1/√3 = 0.577. That for the above finite value of d
is 0.625. As might be expected, the inexactness of the endpoints increases the
value. The extent to which this increase (8%) is important depends on circum-
stances.
There will be situations where the inexact endpoints would be expected to
move together, i.e., the knowledge of one of them would imply the other. In
this circumstance the pdf of X is slightly different. See Figure 4.4. The standard
deviation of X is now 0.600 (a 4% increase over that for d = 0), roughly halfway
between that for the above pdf and the pure uniform pdf. The flanks of the pdf
now have greater curvature.
The final statement to be made here concerning uniform distributions with
inexactly defined endpoints is that the effects of such endpoints on the evaluation
Further,
This advice is sound in a qualitative sense but, again, it is unclear when the
circumstances hold. The problem is that the distinction between Phase 1 and
Phase 2 of uncertainty evaluation13 , as indicated in Section 4, becomes blurred.
The intention of the subdivision into the two phases is to permit all decisions
to be made in Phase 1 and the mechanical calculations to be made in Phase 2.
In terms of the set of conditions in GUM Clause 6.6, listed above, it is unclear
what is meant by
- a significant number of input quantities,
- well-behaved probability distributions,
- the standard uncertainties of the xi contributing comparable amounts14,
- the adequacy of linear approximation, and
- the output uncertainty being reasonably small.
The concern is that because none of these considerations is explicitly quanti-
fied, different practitioners might adopt different interpretations of the same
situation, thus causing divergence of results.
Further, an approximation is acceptable, but is it an acceptable approxima-
tion?
14 That is, the standard uncertainties of the input quantities, when scaled by the magnitudes
of the corresponding sensitivity coefficients, contribute comparable amounts.
15 Such an estimate is inconsistent with the intention of the GUM which promotes the use
Chapter 5
Candidate solution
approaches
This chapter covers candidate solution processes for the calculation phase, Phase
2, of the uncertainty evaluation problem formulated in Section 3.1. The start-
ing point is (i) the availability of a model f or f that relates the input quan-
tities X = (X1 , . . . , Xn )T to the scalar measurand Y or vector measurand
Y = (Y1 , . . . , Ym )T through Y = f (X) or Y = f (X), and (ii) assigned probabil-
ity density functions (pdfs) g1 (X1 ), . . . , gn (Xn ) for the input quantities. If the
input quantities are correlated, they will have a joint pdf.
It is required to determine the pdf g(Y ) for the output quantity Y or the
(joint) pdf g(Y) for Y.
Once g(Y ) has been obtained a 95% coverage interval for the (scalar) mea-
surand Y can be derived. Once g(Y) has been obtained, a 95% coverage region
for the (vector) measurand Y can be derived.
Three approaches to the determination of the output pdf for Y or Y are
considered and contrasted:
1. Analytical methods
2. Mainstream GUM
3. Numerical methods.
All three approaches are consistent with the GUM. Mainstream GUM is the
procedure that is widely used and summarized in GUM Clause 8. Analytical
methods and numerical methods fall in the category of other analytical and
numerical methods (GUM Clause G.1.5). Under the heading of Analytical
methods below, mention is also made of Approximate analytical methods.
Figure 5.1: A uniform probability density function (left) for the input X and
the corresponding probability density function for the output Y , where Y is
related to X by the model Y = ln(X).
If the model is Y = ln(X) with X having a uniform pdf in [a, b], the application
of Formula (5.1) gives

    G(y) = 0,                      y ≤ ln(a),
           (exp(y) − a)/(b − a),   ln(a) ≤ y ≤ ln(b),
           1,                      ln(b) ≤ y
(cf. Section 4.3.5). Figure 5.1 depicts the uniform pdf (left) for X and the
corresponding pdf for Y in the case a = 1, b = 3.
This case is important in, say, electromagnetic compatibility measurement,
where conversions are often carried out between quantities expressed in linear
and decibel units using exponential or logarithmic transformations [62].
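A quick numerical sketch of this result (assumed values a = 1, b = 3 as in the figure; not from the guide):

```python
import random
import math

# Check the distribution function of Y = ln(X), X uniform on [a, b]:
# G(y) = (exp(y) - a)/(b - a) for ln(a) <= y <= ln(b).
random.seed(7)
a, b = 1.0, 3.0
n = 200_000
ys = [math.log(random.uniform(a, b)) for _ in range(n)]

y0 = math.log(2.0)
empirical = sum(1 for y in ys if y <= y0) / n
analytic = (math.exp(y0) - a) / (b - a)   # = 0.5 at y0 = ln(2)
print(empirical, analytic)
```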
Figure 5.2: The probability density function for the sum Y = X1 + X2 of two
uniform distributions X1 and X2 of identical semi-widths.
Example 9 The sum of two uniform distributions with the same semi-widths

Suppose the model is

    Y = X1 + X2

and, for i = 1, 2, Xi has a uniform pdf with mean μi and semi-width a (and hence
standard deviation a/√3). Then Y has a symmetric triangular pdf g(y) with
mean μ = μ1 + μ2 , semi-width 2a and standard deviation a√(2/3). Geometrically,
this pdf, for the case μ1 + μ2 = 0, takes the form

    g(y) = 0,                 y ≤ −2a,
           (2a + y)/(4a²),    −2a ≤ y ≤ 0,
           (2a − y)/(4a²),    0 ≤ y ≤ 2a,
           0,                 2a ≤ y.

For general μ1 and μ2 , the pdf is the same, but centred on μ1 + μ2 rather than
zero. Geometrically, this pdf takes the form indicated in Figure 5.2.
Analytical solutions in some other simple cases are available [26, 28].
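A Monte Carlo sketch confirming the standard deviation a√(2/3) (illustrative, with a = 1 and zero means; not the guide's code):

```python
import random
import math

# Sum of two uniform distributions with equal semi-width a: Y = X1 + X2
# has a symmetric triangular pdf with standard deviation a*sqrt(2/3).
random.seed(3)
a = 1.0
mu1, mu2 = 0.0, 0.0
n = 200_000
ys = [random.uniform(mu1 - a, mu1 + a) + random.uniform(mu2 - a, mu2 + a)
      for _ in range(n)]
mean = sum(ys) / n
std = math.sqrt(sum((y - mean) ** 2 for y in ys) / (n - 1))
print(std)   # close to sqrt(2/3) = 0.8165
```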
Figure 5.3: The probability density function for a general linear combination
Y = c1 X1 + c2 X2 of two uniform distributions X1 and X2 . It is symmetric
about its midpoint.
The first of these examples is used subsequently in this guide (Section 9.3)
in the context of the MCS approach to uncertainty evaluation and the results
compared with those of [28] and Mainstream GUM.
[GUM Clause 4.1.6] Each input estimate xi and its associated un-
certainty u(xi ) are obtained from a distribution of possible values
of the input quantity Xi . This probability distribution may be fre-
quency based, that is, based on a series of observations xi,k of Xi ,
or it may be an a priori distribution. Type A evaluations of stan-
dard uncertainty components are founded on frequency distributions
while Type B evaluations are founded on a priori distributions. It
must be recognized that in both cases the distributions are models
that are used to represent the state of our knowledge.
2 To this statement, the comment must be added that some pdfs may be based on both
types of information, viz., prior knowledge and repeated observations. Evaluations of standard
uncertainty in this setting are not purely Type A or Type B. The GUM gives one such instance
(GUM Clause 4.3.8, Note 2.)
1. Obtain from the (joint) input pdf(s) the means x = (x1 , . . . , xn )T and the
standard deviations u(x) = (u(x1 ), . . . , u(xn ))T of the input quantities.
2. Form the covariances u(xi , xj ) of the input quantities from the appropriate
joint pdfs for any pair of correlated input quantities.
3. Form the partial derivatives of first order of the model with respect to the
input quantities. See Section 5.5.
5. Form the model sensitivities as the values of the partial derivatives eval-
uated at these means. See Section 5.5.
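The steps above culminate in the law of propagation of uncertainty. A minimal sketch for two uncorrelated inputs (the model and all numbers are illustrative assumptions, not from the guide):

```python
import math

# Law of propagation of uncertainty for Y = f(X1, X2) = X1 * X2 with
# uncorrelated inputs: u^2(y) = c1^2 u^2(x1) + c2^2 u^2(x2), where the
# sensitivities c_i are the partial derivatives evaluated at the means.
x1, u1 = 2.0, 0.1     # mean and standard uncertainty of X1
x2, u2 = 3.0, 0.2     # mean and standard uncertainty of X2

y = x1 * x2           # estimate of the measurand
c1 = x2               # df/dx1 evaluated at the means
c2 = x1               # df/dx2 evaluated at the means
u_y = math.sqrt((c1 * u1) ** 2 + (c2 * u2) ** 2)
print(y, u_y)
```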
correlated.
4 A quadrature rule is a numerical integration procedure. Examples in the univariate case
uncertainty is affected, as is the estimate of the measurand. The latter point may be less well
appreciated in some quarters. The bias so introduced into the estimate of the measurand is
illustrated in Section 9.5, for example.
values are independent and that the input standard deviations are independent.
8 Implementations made by the authors have been applied to explicit and implicit models
(where Y can and cannot be expressed directly in terms of X), and complex-valued models
(for electrical metrology), with univariate and multivariate outputs.
Finite-difference methods,
Advice on the use of finite-difference methods is given in Section 5.5.1 and some
comments on computer-assisted algebraic methods in Appendix C.
In the context of Phase 2, calculation, of the uncertainty evaluation problem
there is no essential concept of sensitivity coefficients. They are of course re-
quired by the Mainstream GUM approach (Section 5.2). Independently of the
approach used, they also convey valuable quantitative information about the
influences of the various input quantities on the measurand (at least in cases
where model linearization is justified). If an approach is used that does not
require these coefficients for its operation, they can of course additionally be
calculated if needed. Within the context of the Monte Carlo Simulation ap-
proach, it is also possible to apply the concept of sensitivity coefficients. Some
comments are given in Appendix E.
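A sketch of obtaining a sensitivity coefficient by central differences (the model and the step size are illustrative assumptions):

```python
import math

# Central-difference approximation to a sensitivity coefficient
# c1 = df/dx1 at the input estimates, compared with the exact derivative.
def f(x1, x2):
    return x1 * math.exp(x2)      # illustrative model

x1, x2 = 1.5, 0.5
h = 1e-6                          # step size: a typical compromise between
                                  # truncation error and rounding error
c1 = (f(x1 + h, x2) - f(x1 - h, x2)) / (2 * h)
c1_exact = math.exp(x2)           # analytic df/dx1
print(c1, c1_exact)
```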
Chapter 6
Mainstream GUM
In practice, however, this functional relationship does not apply directly for
many of the measurement systems encountered, but may instead (a) take the
form of an implicit relationship, h(Y, X) = 0, (b) involve a number of measur-
ands Y = (Y1 , . . . , Ym )T , or (c) involve complex-valued quantities. Although
measurement models other than (6.1) are not directly considered in the GUM,
the same underlying principles may be used to propagate uncertainties from the
input quantities to the output quantities.
In Section 6.2 a classification of measurement models is given that is more
general than that considered in the GUM. This classification is motivated by ac-
tual measurement systems, examples of which are given. For each measurement
model it is indicated how the uncertainty of the measurement result is evaluated.
Mathematical expressions for the uncertainty are stated using matrix-vector
notation, rather than the subscripted summations given in the GUM, because
generally such expressions are more compact and more naturally implemented
within modern software packages and computer languages.
The Mainstream GUM approach is used throughout this chapter. Any doubt
concerning its applicability should be addressed as appropriate, for instance by
using the concepts of Chapter 8.
developed by JCGM/WG1.
4 In accordance with the GUM, the informal notation that ∂f/∂xi denotes the partial
derivative ∂f/∂Xi evaluated at x = (x1 , . . . , xn )T is used. If derivatives are evaluated at any
other point, a specific indication will be given.
Then, a compact way of writing (6.2), that avoids the use of doubly-scripted
summations, is

    u²(y) = (∂f/∂x) Vx (∂f/∂x)T ,    (6.5)

where ∂f/∂x = (∂f/∂x1 , . . . , ∂f/∂xn ) is the row vector of model sensitivities
and Vx is a matrix of order n containing the covariance terms.
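Expression (6.5) can be sketched numerically (the model, values and covariances are illustrative assumptions, not from the guide):

```python
import numpy as np

# Matrix form (6.5) of the law of propagation of uncertainty:
# u^2(y) = (df/dx) Vx (df/dx)^T, sketched for the illustrative model
# y = x1 + x2 with correlated inputs.
Vx = np.array([[0.04, 0.01],     # variances u^2(x1), u^2(x2) on the
               [0.01, 0.09]])    # diagonal, covariance u(x1, x2) off it

J = np.array([[1.0, 1.0]])       # row vector of sensitivities df/dx
u2_y = (J @ Vx @ J.T).item()
print(u2_y)   # 0.04 + 0.09 + 2*0.01 = 0.15
```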
    X ≡ (D, LS , AS , δA, Θ, δΘ)T
    x ≡ (d, ℓS , αS , δα, θ, δθ)T .    (6.7)
5 This choice of input variables is made for consistency with GUM, Example H.1. Other
The estimate ℓ of the measurand L is

    ℓ = [{1 + αS (θ − δθ)} ℓS + d] / [1 + θ(αS + δα)].
The partial derivatives of the model with respect to the input quantities are

    ∂L/∂D = 1 / {1 + Θ(AS + δA)},

    ∂L/∂LS = {1 + AS (Θ − δΘ)} / {1 + Θ(AS + δA)},

    ∂L/∂AS = {(Θ²δA − ΘδAδΘ − δΘ)LS − ΘD} / {1 + Θ(AS + δA)}²,

    ∂L/∂(δA) = −Θ [{1 + AS (Θ − δΘ)} LS + D] / {1 + Θ(AS + δA)}²,

    ∂L/∂Θ = {(AS + δA)(δΘ AS LS − D) − δA LS} / {1 + Θ(AS + δA)}²,

    ∂L/∂(δΘ) = −AS LS / {1 + Θ(AS + δA)}.
Vy = Jx Vx JxT , (6.9)
h(Y, X) = 0. (6.10)
where M is the total applied mass, ρa and ρm are the densities of air and the
applied masses, gℓ is the local acceleration due to gravity, A0 is the effective
cross-sectional area of the balance at zero pressure, λ is the distortion coeffi-
cient of the piston-cylinder assembly, α is the temperature coefficient, and T is
temperature [47].
There are eight input quantities, X ≡ (A0 , λ, α, T, M, ρa , ρm , gℓ )T , and a
single measurand Y ≡ p related by the implicit model7

    h(Y, X) = 0.    (6.14)
model given by the difference between the left- and right-hand sides of (6.12) could be used.
The efficiency and stability of the solution of the model equation depends on the choice made.
8 Using recognised concepts from numerical linear algebra [35], a numerically stable way to form V_y, that accounts for the fact that J_x is a rectangular matrix and J_y a square matrix, is as follows:
1. Form the Cholesky factor R_x of V_x, i.e., the upper triangular matrix such that R_x^T R_x = V_x.
2. Factorize J_x as the product J_x = Q_x U_x, where Q_x is an orthogonal matrix and U_x is upper triangular.
3. Factorize J_y as the product J_y = L_y U_y, where L_y is lower triangular and U_y is upper triangular.
4. Solve the matrix equation U_y^T M_1 = I for M_1.
5. Solve L_y^T M_2 = M_1 for M_2.
6. Form M_3 = Q_x^T M_2.
7. Form M_4 = U_x^T M_3.
8. Form M = R_x M_4.
9. Orthogonally triangularize M to give the upper triangular matrix R.
10. Form V_y = R^T R.
It is straightforward to verify this procedure using elementary matrix algebra.
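The ten steps can also be exercised numerically with a short sketch. The sizes and matrix values below are illustrative; for step 3, J_y is constructed here directly from triangular factors (in practice an LU factorization would supply them), and the result is compared with the directly evaluated, numerically less stable, expression for V_y.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 2, 4                                # illustrative sizes
Jx = rng.standard_normal((m, n))           # dh/dX: rectangular
Ly = np.tril(rng.standard_normal((m, m))) + 2.0 * np.eye(m)
Uy = np.triu(rng.standard_normal((m, m))) + 2.0 * np.eye(m)
Jy = Ly @ Uy                               # dh/dY: square, nonsingular
A = rng.standard_normal((n, n))
Vx = A @ A.T + n * np.eye(n)               # input covariance (SPD)

Rx = np.linalg.cholesky(Vx).T              # step 1: Vx = Rx.T @ Rx
Qx, Ux = np.linalg.qr(Jx)                  # step 2: Jx = Qx @ Ux
M1 = np.linalg.solve(Uy.T, np.eye(m))      # step 4: Uy.T @ M1 = I
M2 = np.linalg.solve(Ly.T, M1)             # step 5: Ly.T @ M2 = M1
M3 = Qx.T @ M2                             # step 6
M4 = Ux.T @ M3                             # step 7
Mm = Rx @ M4                               # step 8
R = np.linalg.qr(Mm)[1]                    # step 9: triangularize M
Vy = R.T @ R                               # step 10

# Direct (less stable) evaluation for comparison
Jyi = np.linalg.inv(Jy)
Vy_direct = Jyi @ Jx @ Vx @ Jx.T @ Jyi.T
```

The two evaluations of V_y agree to machine precision for well-conditioned data; the factorized route avoids forming J_y^{-1} explicitly.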
V = [ u²(x_R)      u(x_R, x_I)
      u(x_R, x_I)  u²(x_I)   ].
This is a more complicated data structure than for the case of real-valued x.9
For an n-vector x of complex-valued quantities, the uncertainty of x is expressed
using a 2n × 2n matrix V_x:

V_x = [ V_{1,1} · · · V_{1,n}
           ⋮     ⋱      ⋮
        V_{n,1} · · · V_{n,n} ], (6.16)
V_y = J_x V_x J_x^T, (6.17)
where J_x is a 2 × 2n matrix whose first row contains the derivatives of the real part of f with respect to the real and imaginary parts of X, and whose second row contains the derivatives of the imaginary part of f.
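For a single complex quantity (n = 1), the propagation (6.17) involves only 2 × 2 matrices and can be illustrated as follows. The model Y = X² and all numerical values here are invented for illustration; they are not taken from this guide.

```python
import numpy as np

xR, xI = 1.0, 0.5                   # estimate of the complex input X
Vx = np.array([[0.010, 0.002],      # u^2(xR)    u(xR, xI)
               [0.002, 0.040]])     # u(xR, xI)  u^2(xI)

# f(X) = X^2: Re f = xR^2 - xI^2, Im f = 2 xR xI.
# First row of Jx: derivatives of Re f w.r.t. (xR, xI);
# second row: derivatives of Im f.
Jx = np.array([[2.0 * xR, -2.0 * xI],
               [2.0 * xI,  2.0 * xR]])

Vy = Jx @ Vx @ Jx.T                 # covariance of (Re Y, Im Y), as in (6.17)
```

The resulting V_y is the 2 × 2 covariance matrix of the real and imaginary parts of Y.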
V_y = J_x V_x J_x^T,

J_y V_y J_y^T = J_x V_x J_x^T,
Another approach to the example given in Section 6.2.5 is to relate the inputs X ≡ (a, b, c, W)^T and the measurand Y using the (complex-valued) implicit model
h(Y, X) = (cW + 1)Y − (aW + b) = 0.
An advantage of this approach is that the calculation of derivatives and thence sensitivity coefficients is easier.
(cWᵢ + 1)Γᵢ − (aWᵢ + b) = 0, i = 1, 2, 3,
that are solved for the three calibration coefficients a, b and c. There are six (complex-valued) input quantities X ≡ (W₁, Γ₁, W₂, Γ₂, W₃, Γ₃)^T and three (complex-valued) measurands Y ≡ (a, b, c)^T related by a model of the form (6.14), where
Chapter 7
Monte Carlo Simulation
1 ... developed by JCGM/WG1.
2 The determination of coverage regions (for the multivariate case) remains a research problem.
Mean4 , median, mode and other estimates of location such as the total
median [24],
in a multi-stage model (Section 4.2.3), and hence in these circumstances MCS provides valu-
able problem-specific information that would not necessarily be provided by more traditional
approaches to uncertainty evaluation.
4 There is currently a debate in the metrology community concerning whether this value or the value of the model corresponding to the estimates of the input quantities should be used. In many instances it will make negligible practical difference. In some cases, the difference can be appreciable. Consider the simple model Y = X₁², where the pdf assigned to X₁ has mean zero and standard uncertainty u. Taking the mean of values of Y involves averaging a set of non-negative values and hence will be positive. In contrast, the value of Y corresponding to the mean value of X₁ is zero. In this case, the former value is more reasonable, since zero lies at an extremity of the range of possible values for the output quantity and is hardly representative, as a mean should be. In other, somewhat more complicated situations, the mean of the values of the output quantity constitutes a measurement result that contains unwanted bias. In this circumstance, it can be more meaningful to take instead the value of Y corresponding to the mean value of X. In general, circumstances should dictate the choice.
In one sense the degree of arbitrariness is genuine. The Monte Carlo procedure naturally provides fractiles of the distribution function of the output quantity. In particular, the 0.025 and 0.975 fractiles define a 95% coverage interval for the measurand. Such a coverage interval is also given by any other pair of fractiles that differ by 0.95, such as 0.015 and 0.965, or 0.040 and 0.990. In this setting, it is less meaningful to quote a coverage interval in the form y ± U(y), involving the mean y; for comparison with a Monte Carlo interval, it is more appropriate to quote the interval endpoints [y − U(y), y + U(y)] explicitly.
5 The first moment of a pdf is the mean, a measure of location, the second is the variance, a
measure of dispersion or spread about the mean, the third is skewness, a measure of asymmetry
about the mean, and the fourth is kurtosis, a measure of heaviness of the tails of the pdf or
the peakedness in the centre. [51, p143, p329]
The typical figure of 50,000 applies to a 95% level of probability for a moderate problem.
Because of the need to quantify the tails of the output pdf sufficiently well, very many more
trials will be needed for a higher level such as 99%.
yi = f (xi ), i = 1, . . . , M,
10. Take (y(0.025M ) , y(0.975M ) ) as a 95% coverage interval for the output.9
The procedure depends on the ability to sample from the pdfs of the input
quantities. Section 7.7 provides advice on this aspect.
In the procedure, the number M of Monte Carlo trials is selected initially,
at Step 1. The variant, indicated above in this section, in which the number is
chosen adaptively, i.e., as the procedure is followed, is given in Section 7.7.1.
For M = 50,000, since 0.025M = 1,250 and 0.975M = 48,750, a 95% coverage interval is taken as (y₍₁₂₅₀₎, y₍₄₈₇₅₀₎), the values in positions 1250 and 48750 in the ordered list of values y₍₁₎, . . . , y₍₅₀₀₀₀₎.
shape of which depends sensitively on the choice of cell size [33], but in terms of the distribution function. The histogram is, however, a useful visual aid to understanding the nature of the pdf.
9 The 100p-percentile is the (M p)th value in the list. If M p is not an integer, the integer
immediately below M p is taken, if p < 1/2, and the integer immediately above M p, otherwise.
Similar considerations apply in other circumstances.
Figure 7.1: An empirical estimate obtained using Monte Carlo Simulation, with 500 trials, of the distribution function for the model Y = X₁². X₁ has a probability density function that is Gaussian with mean 1.2 and standard deviation 0.5.
Figure 7.2: A histogram of the values used to produce the distribution function
of Figure 7.1. It constitutes a discrete, scaled representation of the correspond-
ing probability density function.
Consider the univariate model Y = X₁², where the single input X₁ has a Gaussian pdf with mean 1.2 and standard deviation 0.5. Determine the pdf and distribution function of Y, the mean and standard deviation of Y, and a 95% coverage interval for the measurand.
First, the number M of Monte Carlo trials was taken as 500. Values xᵢ, i = 1, . . . , M, were sampled from the Gaussian distribution having mean 1.2 and standard deviation 0.5. The corresponding values yᵢ = xᵢ², i = 1, . . . , M, were calculated according to the model. The empirical distribution function of Y was taken, in accordance with the above, as the join of the points (y₍ᵢ₎, pᵢ), i = 1, . . . , M, where pᵢ = (i − 1/2)/M. Figure 7.1 shows the distribution function so obtained.
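In outline, the calculation just described can be reproduced in a few lines of code. The sketch below uses NumPy's pseudo-random number generator as a stand-in for whichever generator is used in practice (Section 7.7):

```python
import numpy as np

rng = np.random.default_rng(0)
M = 50_000                                 # number of Monte Carlo trials
x = rng.normal(1.2, 0.5, size=M)           # sample the pdf of X1
y = np.sort(x ** 2)                        # model values, ordered

mean, std = y.mean(), y.std(ddof=1)        # estimate and standard uncertainty
# 95% coverage interval from positions 0.025M and 0.975M in the ordered list
low, high = y[int(0.025 * M) - 1], y[int(0.975 * M) - 1]
print(f"mean={mean:.2f}, u={std:.2f}, interval=[{low:.2f}, {high:.2f}]")
```

For this model the statistics obtained should lie close to the values discussed in the remainder of this section.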
A histogram of the values of yi appears as Figure 7.2. It constitutes a
discrete, scaled representation of the pdf.
The empirical distribution function is a much smoother function than the empirical pdf, the main reason being that the empirical distribution function constitutes a cumulative sum of the empirical pdf. Taking successive sums is a smoothing process, the converse of taking successive differences. This summation is the discrete counterpart of the integration that relates an analytical pdf to its distribution function.
The exercise was repeated for M = 50,000 trials. See Figures 7.3 and 7.4.
Figure 7.3: As Figure 7.1 but based on 50,000 Monte Carlo trials.
Figure 7.4: As Figure 7.2 but based on 50,000 Monte Carlo trials.
The enhanced smoothness of the results is evident. Statistics computed from the
larger sample would be much more reliable. It can be expected that increasing
the number of trials by a factor of 100, as here, would yield an additional
significant digit of accuracy in the computed statistics [19]. A rule for the
number of trials related to this consideration is given in Section 7.7.1.
The enhanced resolution permits a feature to be discerned in the pdf for M = 50,000 (Figure 7.4) that was not evident in that for M = 500. The pdf is bimodal, there being a narrow peak near the origin, in addition to the main peak.
main peak. This is not an artifact introduced by the sampling procedure, but
a genuine effect. Its presence is due to the fact that 0.8% of the values of
X₁ according to its pdf are negative. These values lie in the left-hand tail of the Gaussian pdf for X₁, i.e., that with mean μ = 1.2 and standard deviation σ = 0.5. The above proportion corresponds to the area under the standardized Gaussian curve (i.e., that with mean zero and standard deviation unity) to the left of the value z = (0 − μ)/σ = −2.4. These values, when squared through the model Y = X₁², are aggregated with those arising from small positive values of
X1 . Even for such a superficially innocent example, there can be a surprise
such as this!
The mean and standard deviation of the output as estimated by Mainstream
GUM are 1.44 and 1.20. Those provided by MCS were 1.70 and 1.26. The stan-
dard deviation in this example is reasonably estimated by Mainstream GUM,
but the mean is lower than the correct value.
A further noteworthy feature arises from this example. In the case M = 50,000, a 95% coverage interval for Y, determined from the 2.5- and 97.5-
percentiles of the empirical distribution function, was [0.1, 4.8]. That provided using Mainstream GUM is [−1.1, 3.9] or, equivalently, 1.4 ± 2.5. The lengths of the coverage intervals are similar. However, the interval provided by Mainstream GUM is shifted left relative to that for MCS. In fact, the portion of the Mainstream GUM coverage interval from −1.1 to zero is infeasible, since, from its definition, Y cannot be negative.
Coverage intervals at other levels of probability were also obtained using MCS and Mainstream GUM. Appreciable differences were again observed. For instance, at a 99.8% level of probability (corresponding to a coverage factor of 3 under the Gaussian assumption), the coverage interval provided by MCS was [0.0, 7.5] and that for Mainstream GUM is [−2.3, 5.2].
Although the above example might seem extreme, situations with large un-
certainties arise in metrology areas such as EMC measurement. Instances where
the quotient of the standard uncertainty and the mean is of order unity also
arise, e.g., in dimensional metrology and in photometry and radiometry. There
are also problems in limits of detection (Section 9.5), where the uncertainties
involved are comparable to the magnitudes of the quantities measured.
The effect of bias in the evaluated endpoints of the coverage interval con-
structed in this way, resulting from the use of a finite sample, can be reduced
using so-called bias-corrected methods [30].10
erations proportional to M log M [55]. A naive sorting algorithm would take a number of
operations proportional to M 2 that might make the computation time unacceptably long.
= (y1 , . . . , yM ).
where a is a location statistic, such as the mean or (spatial) median [58], for the set yᵢ, and the second quantity is a dispersion statistic, such as the covariance matrix V_y, for the set.
The provision of coverage regions in general is currently a research topic.
A simple, practical approach is therefore proposed for current purposes. As
indicated in Section 3.1, even in the univariate case the coverage interval is not
unique. There is far greater freedom of choice in the multivariate case, where
any domain containing 95% of the distribution of possible values constitutes a
95% coverage region. Moreover, a coverage interval can be expressed in terms
of just two quantities, such as the interval endpoints or the interval midpoint
and the semi-width. In more than one dimension, there is an infinite number of
possibilities for the shape of the region.
12 The symbol is (reluctantly) used to denote the matrix of y-vectors, since Y is used to
7.6.1 Disadvantages
1. Availability of pseudo-random number generators. Pseudo-random num-
ber generators are required that are appropriate for the specified pdfs and
joint pdfs of the input quantities that are likely to arise in metrology.
6. Model evaluation. In MCS the model is evaluated for each sample of the
input quantities and hence for a range of values (that may be a number of
standard deviations away from the estimates of the input quantities).
This is in contrast to the Mainstream GUM procedure in which the mea-
surement model is evaluated only at the estimates of the input quantities.
For this reason some issues may arise regarding the numerical procedure
used to evaluate the model, e.g., ensuring its convergence (where iterative
schemes are used) and numerical stability.
7.6.2 Advantages
1. Straightforward use. Software can be implemented such that the user pro-
vides information concerning just the model and the parameters defining
the pdfs of the input quantities.
6. Derivatives are not required. There is no need for algebraic expressions for the first-order partial derivatives of the model with respect to the input quantities, or for the evaluation of these expressions at the mean values of the input quantities.
10. Multi-stage models. MCS can take the output matrix of vectors yi from
one stage of a multi-stage model, and carry out bootstrap-like re-sampling
at the input to the next.
freedom the standard deviation u of the Gaussian distribution should also be regarded as a random variable. In this regard, a pdf should also be attached to u. Because of the Gaussian assumption this pdf is related to the χ² distribution. A pseudo-code is available [36] that accounts for this effect.
Probability density functions that are commonly assigned to the input quantities are the uniform, the Gaussian, the Student's t, and the multivariate Gaussian. Methods, in the form of pseudo-random number generators, for sampling from these pdfs are available in a companion document [20].
Other pdfs arise from time to time: Binomial, Gamma (including Exponential and χ²), Beta, Poisson, truncated Gaussian, etc. The next edition of this document will cover such pdfs. Moreover, a general approach that is applicable to any pdf will be outlined.
Knowing when to stop the MCS process is discussed below. This aspect is important in terms of (a) carrying out an adequate number of trials so that the process has converged according to a specified threshold, and (b) not carrying out an excessive and hence uneconomic number of trials.
Example 22 The number of digits after the decimal point in rounding a standard uncertainty
Suppose the measurement result for a nominally 100 g standard of mass [1, Clause 7.2.2] is y = 100.021 47 g and the standard uncertainty of y, rounded to two significant figures, is u(y) = 0.000 35 g. The number of digits in u(y) after the decimal point is q = 5.
It was stated in Section 7.3 that after M MCS trials, the endpoints of a coverage interval for Y are the 2.5- and 97.5-percentiles y(0.025M) and y(0.975M). The endpoints of a 95% coverage interval for the (100p)th percentile are [6] Mp ± 2{Mp(1 − p)}^{1/2}. For the 2.5- and 97.5-percentiles, these endpoints are 0.025M ± 0.31M^{1/2} and 0.975M ± 0.31M^{1/2}. Let L± denote the integers nearest to 0.025M ± 0.31M^{1/2} and H± those nearest to 0.975M ± 0.31M^{1/2}. Then M is large enough when the lengths y(L+) − y(L−) and y(H+) − y(H−) of the 95% coverage intervals for the endpoints y(L) and y(H) are no larger than 0.5 × 10^{−q}, where q is the number of digits after the decimal point in u(y).
A practical and easily implemented approach consists of carrying out M = 5,000 (say) trials initially, calculating the value of u(y) from these results, forming the value of q, computing the coverage interval and carrying out the above accuracy check. If the check is satisfied, accept the result. Otherwise, carry out a further 5,000 trials, recompute u(y), q and the coverage interval based on the aggregated (increased) number of trials, and carry out the checks again. Repeat the process as many times as necessary.16
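The adaptive procedure can be sketched as below. For illustration the model Y = X₁² of Section 7.3 stands in for the user's model; the block size of 5,000 trials and the stopping rule follow the description above.

```python
import math
import numpy as np

def decimal_digits(u):
    """q: digits after the decimal point when u is rounded to 2 sig. figures."""
    return 1 - math.floor(math.log10(u))

rng = np.random.default_rng(2)
y = np.empty(0)
while True:
    xblock = rng.normal(1.2, 0.5, size=5_000)          # 5,000 further trials
    y = np.sort(np.concatenate([y, xblock ** 2]))      # aggregated, ordered
    M = y.size
    tol = 0.5 * 10.0 ** -decimal_digits(y.std(ddof=1))
    half = 2.0 * math.sqrt(M * 0.025 * 0.975)          # ~ 0.31 * sqrt(M)
    L, H = int(0.025 * M), int(0.975 * M)
    if (y[int(L + half)] - y[int(L - half)] <= tol and
            y[int(H + half)] - y[int(H - half)] <= tol):
        break
low, high = y[L - 1], y[H - 1]
print(M, round(low, 3), round(high, 3))
```

For this model, far more trials are needed to pin down the upper endpoint than the lower one, because the pdf is much thinner in its right-hand tail.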
Consider the example of Section 9.3 concerned with the calibration of a hand-
held multimeter. Table 7.1 shows the application of the above procedure. The
first column in the table gives the total number of trials taken. The second
column gives the number of digits after the decimal point in the value of u(y), the standard deviation of the M values yᵢ of the measurand Y ≡ E_X determined so far. The third column contains the value of u(y). The columns headed y(L) and y(H) give the endpoints of the coverage interval formed from the appropriate percentiles of the yᵢ. The columns headed y(L−) and y(L+) give the lower and upper endpoints of the coverage interval for y(L), and y(H−) and y(H+) similarly for y(H). The columns headed L-len and H-len give the lengths of these endpoint coverage intervals.
It is seen that these lengths reduce approximately as M^{−1/2}, in accordance with the point made in Section 7 concerning convergence.17 60,000 trials were needed to reduce both of these lengths to 0.0005, and thus to give a good degree of assurance that the coverage interval for Y ≡ E_X was computed correctly to three figures after the decimal point, in accordance with providing two significant figures in u(y). The resulting coverage interval for the measurand, defined by
16 Because the value of u(y) is much better defined by a relatively small number of trials
than are the endpoints of a coverage interval, subsequent values of u(y) are unlikely to change
much from their original values. In contrast, the coverage interval will take longer to settle
down.
17 For instance, when M is increased from 15 000 to 60 000, i.e., increased by a factor of
four, L-len and H-len are approximately halved, reducing from 0.0009 to 0.0005 and from
0.0010 to 0.0005, respectively.
Figure 7.5: A histogram estimating the probability density function of the out-
put of the digital voltmeter model produced using MCS.
the fifth and eighth values in the final row of the table, is [0.049, 0.150] V. This interval is to be compared with that in [28] of (0.10 ± 0.05) V or, equivalently, [0.05, 0.15] V, viz., agreement to the published number of figures.
Figure 7.5 shows a histogram that estimates the probability density function
of the output of the digital voltmeter model. The histogram was produced using
the final set of M = 60, 000 model values yi . It is essentially trapezoidal in
shape, to be compared with the statement [28], made following an approximate
analytical treatment, that the distribution is essentially rectangular.
Figure 7.6 shows the corresponding distribution function, estimated from the
ordered yi values, as described in Section 7.3. The endpoints of the estimated
95% coverage interval are indicated in this figure by vertical lines.
Figure 7.6: The distribution function of the output of the digital voltmeter
model estimated using MCS. The endpoints of the estimated 95% coverage
interval are indicated in this figure by vertical lines.
For very complicated models18 it would not be economic to take more than a small number of trials (say 10). In such a case it would be impossible to provide a coverage interval reliably. Rather, a mean and standard deviation should be obtained, and a coverage interval determined assuming a Gaussian distribution.
(Appeal can be made to the Principle of Maximum Entropy.) For multivariate
output quantities a covariance matrix from the results can be calculated and
a multivariate Gaussian distribution assumed. As always, the results obtained
should be accompanied by a statement that indicates clearly how they were
obtained.
Chapter 8
Validation of Mainstream GUM
This chapter describes validation tests that can be carried out to decide whether
Mainstream GUM is applicable in any particular circumstance. There are two
types of test, prior tests and post tests, carried out, respectively, before utilizing
Mainstream GUM and after having applied Mainstream GUM.
The post-test involves the use of MCS as the validation tool.
a formula obtained from (6.2) when all covariances are set to zero, and cᵢ denotes ∂f/∂Xᵢ evaluated at X = x. The test is based on generating, for each model input Xᵢ, not just a single value of the sensitivity coefficient cᵢ but three values that are realistically likely to encompass the range of values of the first-order partial derivative for that input quantity in the neighbourhood of its estimated value. If the variation in these values is small enough, for all input quantities, a first-order evaluation is adequate. "Small enough" means that the variation is no greater than the required numerical accuracy of the solution uncertainty. "In the neighbourhood" means, for each i, the interval about xᵢ that covers, say, 95% of the distribution of values of the input quantity. Thus, if one correct figure in a relative sense is sought, this relative variation is not to exceed 0.05. Compare with Section 7.7.1.
This test is severe. If just one of the model input quantities failed the test,
the test overall would be regarded as failing, which would be unreasonable if
that quantity made a negligible contribution to u(y).
A less extreme and more practical test is (i) to evaluate u(y) for the smallest
of the three sensitivity coefficients for each component, (ii) as (i) but for the
largest sensitivity coefficient, and (iii) to test whether the two estimates of u(y)
agreed to one figure.
For each value of i from 1 to n, take the variation over the range xᵢ^(low) to xᵢ^(high), the 2.5- and 97.5-percentiles of the pdf of xᵢ.1 The rationale for this choice is that the pdf of y is likely to be dominated by contributions from the input pdfs that are sufficiently central to these pdfs: taking the stated values reflects this consideration. The variation of cᵢ regarded as a function of Xᵢ, i.e., cᵢ = cᵢ(Xᵢ), is thus considered over the interval xᵢ ± kᵢu(xᵢ), where kᵢ is the coverage factor such that 95% of the pdf is embraced.2 It is therefore given by the interval [cᵢ^min, cᵢ^max], where

cᵢ^min = min cᵢ(Xᵢ), cᵢ^max = max cᵢ(Xᵢ),
cᵢ^min = min Cᵢ, cᵢ^max = max Cᵢ,
This variant is the one recommended3 , particularly because the third piece
of information, ci (xi ), would have been evaluated in any case as part of the
conventional evaluation of u(y).
The principle can be implemented in the following way.
1. For each value i = 1, . . . , n
1 For an input quantity Xᵢ with a Gaussian pdf, xᵢ ± 1.96u(xᵢ) can be taken. For an input quantity Xᵢ with a uniform pdf, take xᵢ ± 1.65u(xᵢ). The multiplier 1.65 is such that 95% of the uniform pdf is covered. In general, take the appropriate 95% coverage interval.
2 This interval applies to a symmetric pdf. If the pdf is asymmetric the interval becomes [xᵢ − kᵢ^(−)u(xᵢ), xᵢ + kᵢ^(+)u(xᵢ)], where kᵢ^(−) and kᵢ^(+) are factors such that the interval is a 95% coverage interval for Xᵢ. The modifications to be made to the subsequent considerations are straightforward.
3 It is possible to carry out more sophisticated calculations. One approach is to pass a quadratic polynomial through the three sensitivity values, and take its minimum or maximum as appropriate, if such a value lies within the interval. Another possibility is to evaluate cᵢ(Xᵢ) on a fine mesh of values in the interval in order to determine extreme values. Such strategies could be adopted, but their use would be inconsistent with the intent of the "quick test" of this section.
(a) Set kᵢ such that [xᵢ − kᵢu(xᵢ), xᵢ + kᵢu(xᵢ)] is a 95% coverage interval for Xᵢ.
(b) Form cᵢ^(mid) = ∂f/∂Xᵢ, evaluated at x.
(c) Replace the ith component of x by xᵢ^(low) = xᵢ − kᵢu(xᵢ).
(d) Form cᵢ^(low) = ∂f/∂Xᵢ, evaluated at x.
(e) Replace the ith component of x by xᵢ^(high) = xᵢ + kᵢu(xᵢ).
(f) Form cᵢ^(high) = ∂f/∂Xᵢ, evaluated at x.
(g) Restore the original value of x.
(h) Form cᵢ^(min) = min(cᵢ^(low), cᵢ^(mid), cᵢ^(high)).
(i) Form cᵢ^(max) = max(cᵢ^(low), cᵢ^(mid), cᵢ^(high)).
2. Form u^(min)(y) and u^(max)(y) from

{u^(min)(y)}² = Σᵢ₌₁ⁿ {cᵢ^(min)}² u²(xᵢ)

and

{u^(max)(y)}² = Σᵢ₌₁ⁿ {cᵢ^(max)}² u²(xᵢ).
Chapter 9
Examples
The examples in this chapter are intended to illustrate the principles contained
in the body of this guide. Where appropriate, two approaches, Mainstream
GUM and Monte Carlo Simulation, are used and contrasted. Some examples
can be regarded as typical of those that arise in metrology. Others are more extreme, in an attempt to indicate the considerations that are necessary when those of normal circumstances fail to apply. Perhaps unfortunately, such adverse circumstances arise more frequently than would be wished, in areas such as limit of detection, electromagnetic compliance, photometry and dimensional metrology.
Many other examples are given throughout this guide, some to illustrate
basic points and others more comprehensive.
and

4b²h²C_v² − 27B²(h + p)²(C_v^{2/3} − 1) = 0. (9.2)
To calculate Q for any values of the input quantities, it is first necessary to form
CD from the explicit formula (9.1) and Cv from the implicit equation (9.2).1
The first four input quantities are geometric dimensions obtained by a se-
ries of measurements with a steel rule at various locations across the flume.
There are uncertainties in these measurements due to location, rule reading
errors and rule calibration. Head height is measured with an ultrasonic detec-
tor, uncertainties arising from fluctuations in the water surface and instrument
calibration.
All uncertainty sources were quantified and appropriate pdfs assigned. All
pdfs were based on Gaussian or uniform distributions. The standard deviations
of these pdfs (standard uncertainties of the input quantities) were all less than
0.3% relative to the means (estimated values).
Both the Mainstream GUM procedure (Section 6) and the Monte Carlo procedure (Section 7) were applied. The Mainstream GUM results were validated (Section 8) using the Monte Carlo procedure, under the requirement that results be correct to two significant figures. In fact, the coverage interval for y as produced by Mainstream GUM was confirmed correct (by carrying out further Monte Carlo trials) to three significant figures. To quote the comparison in a relative sense, the quotient of (a) the half-length of the 95% coverage interval for y (≡ Q) and (b) the standard uncertainty of y was 1.96, which agrees to three significant figures with the Mainstream GUM value, viz., the (Gaussian) coverage factor for 95% coverage. For further comparison, the corresponding quotients for the 92.5% and 97.5% coverage intervals were 1.78 and 2.24, also in three-figure agreement with Mainstream GUM results. It is concluded that the use of Mainstream GUM is validated for this example for the coverage probabilities indicated.
1 This equation is in fact a cubic equation in the variable C_v^{2/3} and, as a consequence, C_v can be expressed explicitly in terms of B, h and p. Doing so is unwise because of the possible numerical instability due to subtractive cancellation in the resulting form. Rather, the cubic equation can be solved using a recognised stable numerical method.
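As a sketch of the latter approach, (9.2) can be solved for C_v by bisection, which is stable for this purpose. The flume dimensions below are invented solely to make the fragment runnable; the guide does not list its values here.

```python
def solve_cv(b, h, B, p, lo=1.0, hi=2.0, tol=1e-12):
    """Solve 4 b^2 h^2 Cv^2 - 27 B^2 (h+p)^2 (Cv^(2/3) - 1) = 0 by bisection."""
    f = lambda cv: (4 * b**2 * h**2 * cv**2
                    - 27 * B**2 * (h + p)**2 * (cv ** (2.0 / 3.0) - 1.0))
    assert f(lo) > 0.0 > f(hi), "root not bracketed for these inputs"
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

cv = solve_cv(b=0.1, h=0.05, B=0.3, p=0.02)   # illustrative dimensions
```

Bisection on a bracketed root avoids the subtractive cancellation that the explicit cubic formula can suffer; a library root-finder could be substituted.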
If not, and it has resistance in the interval (1.00 ± 0.10) Ω, it is allocated to the C-grade bin. Otherwise, it is allocated to the unclassified bin.
For the circuit application, C-grade resistors are selected. All such resistors have a resistance in the interval [0.90, 0.95] Ω or the interval [1.05, 1.10] Ω. From the knowledge of the manufacturing process, the pdf of the resistance of a resistor before the allocation process is carried out can be taken as Gaussian with mean 1.00 Ω and standard deviation 0.04 Ω.
Consider the use of three such resistors in series within the circuit to form a (nominally) 3 Ω resistance.
The model for the 3 Ω resistance R is
R = R₁ + R₂ + R₃,
where Rᵢ denotes the resistance of resistor i. Each Rᵢ has a pdf as above. What is the pdf of R and what is a 95% coverage interval for R?
Figures are used to show diagrammatically histograms and distribution functions of the results of using MCS to estimate the pdfs. An analytic or semi-analytic treatment is possible, but MCS enables results to be provided rapidly. All results are based on the use of M = 10⁵ Monte Carlo trials.
Figure 9.1 shows a histogram of the pdf of Rᵢ. It is basically Gaussian, with the central and tail regions removed as a consequence of the grading process. Figure 9.2 shows the corresponding distribution function produced by MCS in the manner of Section 7.3. The endpoints of the estimated 95% coverage interval are indicated by vertical dashed lines and the corresponding endpoints under the Gaussian assumption by vertical dotted lines.
Figure 9.3 shows a histogram produced by MCS to estimate the probability density function for R, three C-grade resistors in series.
The pdf is multimodal, possessing four maxima. The mean of the pdf is 3.00 Ω, the sum of the means of the three individual pdfs, as expected. This value is, however, unrepresentative: a value close to it could rarely occur.
Figure 9.4 shows the corresponding distribution function produced in the
manner of Section 7.3.
The pdf of Figure 9.3 could be perceived as an overall bell-shape, with strong
structure within it. Indeed, the counterpart of these results for six resistors in
series, as illustrated in Figures 9.5 and 9.6, lies even more in that direction.
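A sketch of the MCS calculation for this example: the C-grade pdf is sampled by simple rejection from the underlying Gaussian, and the series model is evaluated at each trial. The generator and the rejection scheme are implementation choices, not prescriptions of this guide.

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_c_grade(n):
    """Gaussian(1.00, 0.04) ohm, kept only in [0.90, 0.95] or [1.05, 1.10]."""
    out = np.empty(0)
    while out.size < n:
        r = rng.normal(1.00, 0.04, size=2 * n)
        keep = ((r >= 0.90) & (r <= 0.95)) | ((r >= 1.05) & (r <= 1.10))
        out = np.concatenate([out, r[keep]])
    return out[:n]

M = 100_000                                        # Monte Carlo trials
R = sum(sample_c_grade(M) for _ in range(3))       # three resistors in series
R.sort()
low, high = R[int(0.025 * M) - 1], R[int(0.975 * M) - 1]
print(f"mean={R.mean():.3f} ohm, 95% interval=[{low:.3f}, {high:.3f}] ohm")
```

Because both acceptance bands are symmetric about 1.00 Ω, the mean of R returned by the sketch lies close to 3.00 Ω, while the multimodal structure is visible in a histogram of R.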
Table 9.1 summarises the numerical results, and also includes the results
for n = 10 and 20 resistors. It is reassuring that, considering the appreciable
departure from normality, the coverage interval converges rapidly to that obtained under the assumption that the output pdf is Gaussian (as in Mainstream GUM). There are no grounds for complacency, however: there will be situations where the use of Mainstream GUM is not so favourable.
As stated, an analytical treatment would be possible for this problem. It
might be difficult to justify the effort required, however, unless the analysis
provided some general insight that would give added value to the application.
Using existing MCS software, it required approximately one hour to enter the
problem and produce the numerical and graphical results. The computation
time itself was negligible, being a few seconds in all.
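The calculation can be sketched in a few lines. The two-piece uniform pdf used below for a single Grade-C resistor is an assumption for illustration only, chosen so that the sum of three values is multimodal with mean 3.00; it merely stands in for the Grade-C pdf used by the guide.

```python
import random

random.seed(1)

def grade_c_resistor():
    # illustrative stand-in for the Grade-C pdf: the value is taken as
    # equally likely to lie in [0.85, 0.95] or [1.05, 1.15] ohm (the centre
    # band removed, as when tighter-tolerance parts are selected out)
    lo, hi = ((0.85, 0.95), (1.05, 1.15))[random.random() < 0.5]
    return random.uniform(lo, hi)

N = 200_000
R = sorted(grade_c_resistor() + grade_c_resistor() + grade_c_resistor()
           for _ in range(N))
mean_R = sum(R) / N
interval = (R[int(0.025 * N)], R[int(0.975 * N)])   # central 95% interval
print(mean_R, interval)
```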
EX = ViX − VS + δViX,
where the model input quantities and their pdfs are defined and assigned as
follows:
DMM reading ViX . The voltage indicated by the DMM (the index i meaning
indication). Because of the limited resolution of the device, no scatter is
observed in the indicated values. Therefore, the indicated voltage, 100.1 V,
at the calibrator setting of 100 V, is taken as exact.
Voltage VS generated by the calibrator. The calibration certificate for the
calibrator states that the voltage generated is the value indicated by the
calibrator setting and that the associated expanded uncertainty of measurement associated with the 100 V setting is U = 0.002 V with a coverage factor of k = 2. In the absence of other knowledge a Gaussian
pdf with mean 100 V and standard deviation 0.001 V (obtained from
U/k = 0.002/2) is therefore assigned to this input.
Correction δViX of the indicated voltage of the DMM. The least significant digit of the DMM display corresponds to 0.1 V as a consequence of
This model was analyzed using Monte Carlo Simulation. Details are given in
Section 7.7.1. The error of indication of the DMM was found to be 0.100 V with
a 95% coverage interval of [0.049, 0.150] V. The corresponding result obtained by
an approximate analytical treatment [28] was (0.10 ± 0.05) V, i.e., in agreement with the figures quoted.
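A minimal MCS sketch of this calculation follows. The rectangular pdf for δViX, with semi-width 0.05 V, reflects the 0.1 V resolution of the DMM discussed above; the trial count is arbitrary.

```python
import random

random.seed(7)
N = 200_000
V_iX = 100.1                                        # exact indicated voltage, V
trials = sorted(V_iX - random.gauss(100.0, 0.001)   # V_S: Gaussian, u = 0.001 V
                + random.uniform(-0.05, 0.05)       # dV_iX: rectangular, +/- 0.05 V
                for _ in range(N))

E_X = sum(trials) / N                               # error of indication, V
ci95 = (trials[int(0.025 * N)], trials[int(0.975 * N)])
print(E_X, ci95)
```

The mean reproduces the 0.100 V estimate, and the interval is close to the [0.049, 0.150] V quoted in the text.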
ai = A + δai, i = 1, . . . , nA,
bi = B + δbi, i = 1, . . . , nB,
hi = H + δhi, i = 1, . . . , nH.
A^2 + B^2 = H^2. (9.3)
For consistency, the solution values of A, B and H are to satisfy this condition.
The errors δai, etc. are statistically related in that the instrumental systematic effect will manifest itself in all these values. Its presence means that all measurements are correlated. In order to quantify this correlation, it is
2 Since this correction is based on a number of effects, it does not seem reasonable (cf. Section 4.3.3) to regard the error as equally likely to take any value in this interval. However,
since the effect of this input on the model output is relatively small, and since an intention
is to compare the EA approach with that of Mainstream GUM, M3003 and MCS, the above
form is taken.
δai = δL + eai,
etc., where eai is the random error in ai, etc. The above relationships become
ai = A + δL + eai, i = 1, . . . , nA,
bi = B + δL + ebi, i = 1, . . . , nB,
hi = H + δL + ehi, i = 1, . . . , nH.
The errors eai, etc. are uncorrelated, the covariance matrix being diagonal with
all entries equal to u_rep^2. Best estimates of the sides (and δL) are then given
by an ordinary least-squares problem (Gauss estimation). First, it is necessary
to incorporate the condition (9.3) [22]. There are various ways to do so in
general, but here it is simplest to use the condition to eliminate a variable. One
possibility is to replace H by (A^2 + B^2)^{1/2} or, letting θ denote the angle between
sides A and H, set
A = H cos θ (9.4)
and
B = H sin θ. (9.5)
3 The covariance matrix, of order nA + nB + nH, can be built from
1. var(ai) = var(bi) = var(hi) = u_rep^2 + u^2(δL),
2. all covariances equal to u^2(δL).
(The standard uncertainty u(δL) associated with the instrumental systematic effect may not
be available explicitly from the calibration certificate of the instrument, but should be part of
the detailed uncertainty budget for the calibration.)
Its solution could be found using the Gauss-Newton algorithm or one of its
variants [22].4
Many such problems would be solved in this manner. In this particular case,
by defining new parameters
unknowns θ, H and δL. Equate to zero the partial derivatives of S with respect to θ, H and
δL, to give three algebraic equations. Eliminate H and δL to give a single nonlinear equation
in θ. Solve this equation using a suitable zero-finder (cf. Section 6.2.3). Determine H and
δL by substitution.
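The elimination strategy of this footnote can be sketched in code. This is an illustrative implementation under stated assumptions: the sum of squares S is minimised by eliminating H and δL linearly for each trial θ, with a simple scan-and-refine search in θ replacing the algebraic zero-finder, and the 3-4-5 data below are hypothetical.

```python
import math, random

def fit_triangle(a_meas, b_meas, h_meas):
    """Fit A = H cos(theta), B = H sin(theta), with a common offset dL
    (the instrumental systematic error). For fixed theta the model is
    linear in (H, dL), so those two are eliminated via 2x2 normal
    equations; theta is then located by a coarse scan plus a
    golden-section refinement (a stand-in for an algebraic zero-finder)."""
    groups = [(lambda t: math.cos(t), a_meas),
              (lambda t: math.sin(t), b_meas),
              (lambda t: 1.0, h_meas)]

    def solve(theta):
        # each measurement m satisfies m ~ c*H + dL with c = cos, sin or 1
        scc = sc = scm = sm = n = 0.0
        for basis, meas in groups:
            c = basis(theta)
            for m in meas:
                scc += c * c; sc += c; scm += c * m; sm += m; n += 1
        det = scc * n - sc * sc
        H = (scm * n - sc * sm) / det
        dL = (sm * scc - sc * scm) / det
        S = sum((m - basis(theta) * H - dL) ** 2
                for basis, meas in groups for m in meas)
        return S, H, dL

    grid = [i * (math.pi / 2) / 1000 for i in range(1, 1000)]
    theta0 = min(grid, key=lambda t: solve(t)[0])
    lo, hi = theta0 - 0.002, theta0 + 0.002
    g = (math.sqrt(5) - 1) / 2
    for _ in range(60):
        m1, m2 = hi - g * (hi - lo), lo + g * (hi - lo)
        if solve(m1)[0] < solve(m2)[0]:
            hi = m2
        else:
            lo = m1
    theta = 0.5 * (lo + hi)
    _, H, dL = solve(theta)
    return H * math.cos(theta), H * math.sin(theta), H, dL

# hypothetical data: true sides 3-4-5, offset dL = 0.01, random errors sd 0.002
random.seed(3)
a = [3.0 + 0.01 + random.gauss(0, 0.002) for _ in range(5)]
b = [4.0 + 0.01 + random.gauss(0, 0.002) for _ in range(5)]
h = [5.0 + 0.01 + random.gauss(0, 0.002) for _ in range(5)]
A, B, H, dL = fit_triangle(a, b, h)
print(A, B, H, dL)
```

By construction the returned estimates satisfy A^2 + B^2 = H^2 exactly, which is the point of the parameterization (9.4)-(9.5).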
Figure 9.7: The Gaussian probability density function for the input quantity
for the limit of detection problem. The 95% coverage interval that would con-
ventionally be obtained for the analyte concentration is indicated.
for the measurand and its uncertainty. It utilizes basic statistical modelling
principles.
The framework is as given in Section 4.2.2 on constrained uncertainty evaluation. The GUM model (4.5), repeated here, is
Y = max(X, 0),
here, but can be addressed using Monte Carlo Simulation or other methods as appropriate.
Figure 9.8: As Figure 9.7, but for the case in the text where the mean x =
1.0 ppm and the standard deviation u(x) = 1.0 ppm.
Figure 9.9: The pdf for the input quantity for the limit of detection problem
the values of Y = max(X, 0) that is zero is equal to this value. For the above
numerical values, the fraction is Φ(−1.0) = 0.16. So, 16% of the distribution of
values that can plausibly be ascribed to the measurand take the value zero.
The pdf for the input quantity is indicated in Figure 9.9. That for the output
quantity is depicted in Figure 9.10. In Figure 9.10 16% of the area under the
curve is concentrated at the origin. Strictly, this feature should be denoted by a
Dirac delta function (having infinite height and zero width). For illustrative
purposes only, the function is depicted as a tall thin solid rectangle.
The shortest 95% coverage interval for the measurand therefore has (a) zero
as its left-hand endpoint, and (b) as right-hand endpoint that value of X for
which Φ((X − x)/u(x)) = 0.95, viz., X = 2.6 ppm. Thus, the required 95%
coverage interval is [0.0, 2.6] ppm.6
Figure 9.11 shows the distribution function G(Y ) for Y , with endpoints of the
95% coverage interval indicated by vertical lines. G(Y ) rises instantaneously
at Y = 0 from zero to 0.16 and thereafter behaves as the Gaussian distribution
function.7
The mean and standard uncertainty of G(Y ) are readily shown to be y =
1.1 ppm and u(y) = 0.9 ppm.
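These figures are readily confirmed by a simple Monte Carlo Simulation (the trial count is arbitrary):

```python
import random, statistics

random.seed(2)
x, ux = 1.0, 1.0        # input estimate and standard uncertainty, ppm
N = 200_000
ys = sorted(max(random.gauss(x, ux), 0.0) for _ in range(N))

frac_zero = sum(1 for y in ys if y == 0.0) / N   # mass of the pdf at the origin
y_mean = statistics.fmean(ys)                    # ~1.1 ppm
y_sd = statistics.pstdev(ys)                     # ~0.9 ppm
right_95 = ys[int(0.95 * N)]   # right endpoint of the shortest 95% interval
print(frac_zero, y_mean, y_sd, right_95)
```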
By comparison, the Mainstream GUM approach would yield a result as
6A Monte Carlo Simulation confirms this result.
7 It is apparent from this figure that if a 95% coverage interval with 2.5% of the distribution
in each tail were chosen, the interval would be longer, in fact being [0.0, 3.0] ppm. Of course,
since more than 5% of the distribution is at Y = 0, the left-hand endpoint remains at zero for
any 95% coverage interval.
Figure 9.10: The pdf for the output quantity for the limit of detection problem.
16% of the area under the curve is concentrated at the origin. Strictly, this
feature should be denoted by a Dirac delta function (having infinite height and
zero width). For illustrative purposes the function is depicted as a tall thin
solid rectangle.
Figure 9.11: The distribution function for the limit of detection problem. The
right-hand endpoint of the 95% coverage interval is indicated by a vertical line;
the left-hand endpoint is at the origin.
follows. Since, in the neighbourhood of the input estimate x = 1.0 ppm, the
model behaves as Y = max(X, 0) = X, Mainstream GUM gives y = 1.0 ppm
as a point estimate of the measurand. Moreover, the sensitivity coefficient c is
∂f/∂X = 1,
evaluated at X = 1.0 ppm, viz., c = 1. Thus,
Figure 9.12: The five gauge block length measurements against temperature
(large blobs) and 100 simulations of these measurements obtained from sampling
from assigned Gaussian pdfs. For some of the synthesised sets of five
measurements the gradient of an unconstrained least-squares straight line would
be negative.
Vx = σ^2 I, (9.7)
x = Ay, (9.8)
where
A = | 1  cos θ1  sin θ1  . . .  cos rθ1  sin rθ1 |
    | 1  cos θ2  sin θ2  . . .  cos rθ2  sin rθ2 |
    | .........................................  |
    | 1  cos θn  sin θn  . . .  cos rθn  sin rθn |
is the matrix of order n of Fourier basis-function values. Formally, the Fourier
coefficients are given in terms of the data using
y = A^{-1} x (9.9)
Vy = σ^2 A^{-1} A^{-T} = σ^2 (A^T A)^{-1}.
Now, using the fact that the θi are equiangular and the fundamental properties of the
trigonometric functions, it is straightforward to show that
A^T A = (n/2) diag(2, 1, . . . , 1),
giving
(A^T A)^{-1} = (2/n) diag(1/2, 1, . . . , 1).
Consequently,
Vy = (2σ^2/n) diag(1/2, 1, . . . , 1).
This result states that for data errors that are independent with standard uncertainty
σ, the errors in the Fourier coefficients are (also) independent, with
standard uncertainty equal to σ scaled by the factor √(2/n), where n is the
number of points (with the exception of the constant term, for which the factor
is √(1/n)). Moreover, each Fourier coefficient is a linear combination of n
raw data values, the multipliers being the products of a constant value and
that of values of cosines and sines (and thus lying between −1 and +1). Consequently,
if n is large,9 regardless of the statistical distributions of the errors in
the data, the errors in the Fourier coefficients can be expected to be very close
to being normally distributed. This result is an immediate consequence of the
Central Limit Theorem when using the Fourier transform to analyse large numbers
of measurements having independent errors. Thus, it is valid to regard the
resulting Fourier coefficients as if they were independent Gaussian-distributed
measurements.
The outputs, the Fourier coefficients, from this process become the inputs
to the subsequent stage, viz., the evaluation of the Fourier series h(θ) for any
value of θ. Now, since, as shown, the Fourier coefficients are uncorrelated,
u^2(h(θ)) = u^2(a0) + u^2(a1) cos^2 θ + u^2(b1) sin^2 θ + . . . + u^2(ar) cos^2 rθ + u^2(br) sin^2 rθ.
8 In exact arithmetic, the FFT and (9.9) give identical results, since mathematically they
are both legitimate ways of expressing the solution.
9 In high-accuracy roundness measurement, e.g., n = 2000 would be typical.
u^2(h(θ)) = σ^2/n + (2σ^2/n) cos^2 θ + (2σ^2/n) sin^2 θ + . . . + (2σ^2/n) cos^2 rθ + (2σ^2/n) sin^2 rθ, (9.10)
which simplifies to σ^2. Thus,
u(h(θ)) = σ,
i.e., the uncertainty in the Fourier representation of a data set when evaluated
at any point is identical to the uncertainty in the data itself. This property is
remarkable in that the (interpolatory) replacement of data by other functions
usually gives an amplification of the raw data uncertainty, at least in some
regions of the data.
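The diagonal structure of A^T A, and hence the result u(h(θ)) = σ, can be verified numerically for a small equiangular configuration (the choices n = 7, r = 3, σ = 0.1 and the evaluation angle are arbitrary):

```python
import math

n, r = 7, 3                                # n = 2r + 1 equiangular points
theta = [2 * math.pi * i / n for i in range(n)]
cols = [[1.0] * n]                         # constant column, then cos/sin pairs
for k in range(1, r + 1):
    cols.append([math.cos(k * t) for t in theta])
    cols.append([math.sin(k * t) for t in theta])

# A^T A: should equal (n/2) diag(2, 1, ..., 1) for equiangular angles
ata = [[sum(u * v for u, v in zip(c1, c2)) for c2 in cols] for c1 in cols]

sigma = 0.1
t = 0.37                                   # arbitrary evaluation angle
u2 = sigma ** 2 / ata[0][0]                # var(a0) = sigma^2 / n
for k in range(1, r + 1):
    u2 += sigma ** 2 / ata[2 * k - 1][2 * k - 1] * math.cos(k * t) ** 2
    u2 += sigma ** 2 / ata[2 * k][2 * k] * math.sin(k * t) ** 2
print(u2)   # equals sigma^2, i.e. u(h(theta)) = sigma
```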
Chapter 10
Recommendations
MCS to validate Mainstream GUM is urged when it is unclear whether the latter
is applicable in a certain situation. MCS can also be seen as a widely-applicable
tool for uncertainty evaluation.
4. The model and its input quantities are to be provided in a manner that
maximizes the use of this information consistent with the assumptions
made.
The intention of this guide has been to address these aims as far as reasonably
possible. Further, the two-phase approach advocated in Section 3.2 and followed
in the rest of this guide supports the first point from GUM, Clause 0.4. The
approach to multi-stage models recommended here supports the second point.
Finally, the mathematical formulation and the attitude of this guide support
the third point, through the solution approaches of Section 5.
Bibliography
[2] AFNOR. Aid in the procedure for estimating and using uncertainty in
measurements and test results. Technical Report FD X 07-021, AFNOR,
2000. Informal English translation.
[5] Stephanie Bell. Measurement Good Practice Guide No. 11. A Beginner's
Guide to Uncertainty of Measurement. Technical report, National Physical
Laboratory, 1999.
Appendix A
Some statistical concepts used in this guide are reviewed. The concept of a
random variable is especially important. Inputs and outputs are (or are to
be regarded as) random variables. Some of the elementary theory of random
variables is pertinent to the subsequent considerations.
G(1 ≤ x < 2) = 1/2,
G(2 ≤ x < 3) = 7/8,
G(3 ≤ x) = 1.
The distribution function varies from zero to one throughout its range, never
decreasing.
The probability that X lies in an interval [a, b] is
P(a ≤ X ≤ b) = G(b) − G(a).
Two important statistics associated with a discrete random variable are its
mean and standard deviation.
Let x1 , x2 , . . . denote the possible values of X and p(x) the frequency function
of X. The expected value or mean of a discrete random variable X with
frequency function p(x) is
μ = Σi xi p(xi).
Since a random variable X must take some value, g(x) has the property that
∫ g(x) dx = 1.
The uniform density function is a density function that describes the fact
that X is equally likely to lie anywhere in an interval [a, b]:
g(x) = 1/(b − a), a ≤ x ≤ b,
g(x) = 0, otherwise.
The distribution function G(x) gives the probability that a random variable
takes a value no greater than a specified value, and is defined as for a discrete
random variable:
G(x) = P(X ≤ x), −∞ < x < ∞.
The distribution function can be expressed in terms of the probability density
function as
G(x) = ∫_{−∞}^{x} g(v) dv.
The mean of a continuous random variable X with probability density function
g(x) is
E(X) = ∫ x g(x) dx.
It is often denoted by μ.
The variance of a continuous random variable X with density function g(x)
and expected value μ = E(X) is
var(X) = ∫ (x − μ)^2 g(x) dx.
The variance is often denoted by σ^2 and its square root is the standard deviation σ.
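As an illustration, these integrals can be evaluated numerically (here by a simple midpoint rule) for the uniform density on an arbitrary interval, recovering the mean (a + b)/2 and the variance (b − a)^2/12:

```python
a, b = 2.0, 5.0                      # arbitrary interval
N = 100_000                          # midpoint-rule panels
h = (b - a) / N
xs = [a + (i + 0.5) * h for i in range(N)]
g = 1.0 / (b - a)                    # uniform density value on [a, b]

total = sum(g * h for _ in xs)                     # integral of g(x): 1
mean = sum(x * g * h for x in xs)                  # E(X) = (a + b)/2 = 3.5
var = sum((x - mean) ** 2 * g * h for x in xs)     # var(X) = (b - a)^2/12 = 0.75
print(total, mean, var)
```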
Appendix B
A discussion on modelling
area and in some other branches of metrology, this is not always the case. Not
only can such a heuristic approach give rise to invalid results for the propagated
uncertainty, it can also yield anomalies resulting from the fact that the propagation
of uncertainties associated with a symmetric pdf such as a Gaussian or a
uniform pdf through a nonlinear transformation yields an asymmetric distribution.
The extent of the asymmetry is exaggerated by large input uncertainties.
UKAS has recognised the problems associated with the current guidance
document and is working towards a revision that will incorporate a number
of the considerations included in this best-practice guide. These considerations
include a model-based approach, handling changes of variables in a valid manner,
the use of Monte Carlo Simulation to provide output uncertainties, and the use
of MCS to validate Mainstream GUM.
A model can be viewed as transforming the knowledge of what has been
measured or obtained from calibrations, manufacturers' specifications, etc. into
knowledge of the required measurement results. It is difficult to consider the
errors or uncertainties in this process without this starting point. It is accepted
that there may well be difficulties in developing models in some instances. However,
their development and recording, even initially in a simple way, provides
a basis for review and subsequent improvement.
If, as is permitted by the GUM (Clause G.1.5), other analytical or numerical
methods are used, a 95% coverage interval can be obtained, as implied above
(using Monte Carlo Simulation, for example), free of these limitations. When
these other approaches are employed, there is no concept of a coverage factor.
In this appendix, consideration is given to the relationship of the GUM
(input-output) model to classical statistical modelling.
There is much discussion in many quarters of different attitudes to uncertainty
evaluation. One attitude involves the formal definition of uncertainty of
measurement (GUM Clause 2.2.3 and above); the other is expressed in terms of the
concepts of error and true value. These alternative concepts are stated in GUM
Clause 2.2.4 as the following characterizations of uncertainty of measurement:
1. a measure of the possible error in the estimated value of the measurand
as provided by the result of a measurement;
2. an estimate characterizing the range of values within which the true value
of a measurand lies (VIM, first edition, 1984, entry 3.09).
A second edition of VIM has been published [40]. That edition is currently
being revised by the JCGM.
The extent and ferocity of discussion on matters such as the existence of a
true value almost reaches religious fervour and is such that it provides a major
inhibiting influence on progress with uncertainty concepts. This best-practice
guide does not subscribe to the view that the attitudes are mutually untenable.
Indeed, in some circumstances an analysis can be more straightforward
using the concepts of GUM Clause 2.2.3 and in others the use of Clause 2.2.4
confers advantage. It is essential to recognize that, although philosophically
the approaches are quite different, whichever concept is adopted an uncertainty
component is always evaluated using the same data and related information
(GUM Clause 2.2). Indeed, in Clause E.5 of the GUM a detailed comparison
of the alternative views of uncertainty clearly reconciles the approaches. In
particular, GUM Clause E.5.3 states
ℓ1 = L1 + e0 + e1,
ℓ2 = L2 + e0 + e2,
where e0 is the imperfection error in the steel rule when measuring lengths
close to those of concern, and e1 and e2 are the operator errors in making the
measurements. The errors in the measured lengths are therefore
ℓ1 − L1 = e0 + e1,
ℓ2 − L2 = e0 + e2.
Make the reasonable assumption that the errors e0, e1 and e2 are independent.
Then, if σ0 denotes the standard deviation of e0 and σ that of e1 and e2,
var(ℓ1 − L1) = σ0^2 + σ^2,
var(ℓ2 − L2) = σ0^2 + σ^2,
cov(ℓ1 − L1, ℓ2 − L2) = σ0^2.
ℓ1 − ℓ2 = (L1 − L2) + (e1 − e2).
As expected, the uncertainty in the steel rule does not enter this result.
Compare the above with the Mainstream GUM approach:
Inputs. L1 and L2.
Model. Y = L1 − L2.
Input values. ℓ1 and ℓ2.
Input uncertainties.
u(ℓ1) = u(ℓ2) = (σ0^2 + σ^2)^{1/2}, u(ℓ1, ℓ2) = σ0^2.
∂Y/∂L1 = 1, ∂Y/∂L2 = −1.
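The Mainstream GUM calculation for the model Y = L1 − L2 can be carried out numerically; the values of σ0 and σ below are arbitrary. The systematic contribution σ0^2 appears in both variances and in the covariance, and cancels in the law of propagation of uncertainty u^2(y) = c^T V c:

```python
s0, s = 0.05, 0.02              # assumed sigma_0 (rule) and sigma (operator)
V = [[s0**2 + s**2, s0**2],     # input covariance matrix
     [s0**2, s0**2 + s**2]]
c = [1.0, -1.0]                 # sensitivity coefficients of Y = L1 - L2

# law of propagation of uncertainty: u^2(y) = c^T V c
u2y = sum(c[i] * V[i][j] * c[j] for i in range(2) for j in range(2))
print(u2y)   # equals 2 * s**2: the rule's systematic error does not contribute
```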
Appendix C
Sensitivity coefficients can be difficult to determine by hand for models that are
complicated. The process by which they are conventionally determined is given
in Section 5.5. The partial derivatives required can in principle be obtained using
one of the software systems available for determining derivatives automatically
by applying the rules of algebraic differentiation.
If such a system is used, care needs to be taken that the mathematical
expressions generated are stable with respect to their evaluation at the estimates
of the input quantities. For instance, suppose that (part of) a model is
Y = (X1 − C)^4,
∂Y/∂X1 = 4X1^3 − 12X1^2 C + 12X1 C^2 − 4C^3, (C.1)
and perhaps not contain a facility to generate directly or simplify this expression
to the mathematically equivalent form
∂Y/∂X1 = 4(X1 − C)^3, (C.2)
magnitude):
4x1^3 = 4121.204,
−12x1^2 C = −12118.79,
12x1 C^2 = 11878.81,
−4C^3 = −3881.196.
The sum of these values constitutes the value of c1. To the numerical accuracy
held, this value is 0.028, compared with the correct value of 0.032. The
important point is that a value of order 10^-2 has been obtained by the sum of
positive and negative values of magnitude up to order 10^4. Almost inevitably,
a loss of some six figures of numerical precision has resulted, as a consequence
of subtractive cancellation.
For different values of x1 and C or in other situations the loss of figures could
be greater or less. The concerning matter is that this loss has resulted from such
a simple model. The effects in the case of a sophisticated model or a multi-stage
model could well be compounded, with the consequence that there are dangers
that the sensitivity coefficients formed in this way will be insufficiently accurate.
Therefore, care must be taken in using sensitivity coefficients that are evaluated
from the expressions provided by some software for algebraic differentiation.
Such a system, if used, should evidently be chosen with care. One criterion
in making a choice is whether the system offers comprehensive facilities for
carrying out algebraic simplification, thus ameliorating the danger of loss of
figures. Even then, some form of validation should be applied to the numerical
values so obtained.1
Some systems operate in a different way. They differentiate code rather
than differentiate a formula. Acceptable results can be produced by the user by
carefully constructing the code that defines the function.
1 Numerical analysis issues such as that discussed here will be addressed as part of SSf M2.
Appendix D
The Principle of Maximum Entropy (PME) [42] is a concept that can valuably
be employed to enable maximum use to be made of available information,
whilst at the same time avoiding the introduction of unacceptable bias in the
result obtained. Two internationally respected experts in measurement uncertainty
[65] state that predictions based on the results of Bayesian statistics and
this principle turned out to be so successful in many fields of science [59], particularly
in thermodynamics and quantum mechanics, that experience dictates no
reason for not also using the principle in a theory of measurement uncertainty.
Bayesian statistics have been labelled subjective, but that is the intended
nature of the approach. One builds in knowledge based on experience and other
information to obtain an improved solution. However, if the same knowledge is
available to more than one person, it would be entirely reasonable to expect
that they draw the same conclusion. The application of PME was proposed [66] in
the field of uncertainty evaluation in order to achieve this objective.
To illustrate the principle, consider a problem [66] in which a single unknown
systematic deviation X1 is present in a measurement process. Suppose that
all possible values for this deviation lie within an interval [−L, L], after the
observed measurement result was corrected as carefully as possible for a known
constant value. The value supplied for L is a subjective estimate based on
known properties of the measurement process, including the model inputs. In
principle, the value of L could be improved by aggregating in a suitable manner
the estimates of several experienced people. Let g1 (x1 ) denote the pdf of X1 .
Although it is unknown, g1 (x1 ) will of course satisfy the normalizing condition
∫_{−L}^{L} g1(x1) dx1 = 1. (D.1)
Suppose that from the properties of the measurement process it can be asserted
that on average the systematic deviation is expected to be zero. A second
condition on g1 (x1 ) is therefore
∫_{−L}^{L} x1 g1(x1) dx1 = 0. (D.2)
Suppose an estimate u of the standard deviation of the possible values for the
called the (information) entropy [56] of that pdf. The least-biased probability
assignment is that which maximizes S subject to the conditions (D.1)-(D.3).
Any other pdf that satisfies these conditions has a smaller value of S, thus
introducing a prejudice that might contradict the missing information.
This formulation can fully be treated mathematically [66] and provides the
required pdf. The shape of the pdf depends on the quotient u/L. If u/L
is smaller than 1/√3, the pdf is bell-shaped. If u/L is larger than 1/√3, the
pdf is U-shaped. Between these possibilities, when u/L = 1/√3, the pdf is the
uniform pdf.
It can also be determined that if no information is available about the standard
deviation u, and S is maximized subject to conditions (D.1) and (D.2)
only, the resulting pdf is the uniform pdf.
It is evident that the more information there is available the better the
required pdf can be estimated. The services of a mathematician or a statistician
might be required for this purpose. However, in a competitive environment, or
simply when it is required to state the most realistic (and hopefully the smallest)
and defensible measurement uncertainties, such a treatment might be considered
appropriate.
One approach to a range of such problems might be to categorize the commonest
types of problem, such as those above, viz., when
2. If the measurements are as in 1 above, but are known to contain errors that
can be considered as drawn from a Gaussian distribution, it can be inferred
from the PME that a Student's t pdf with mean, standard deviation and
number of degrees of freedom obtained from the measurements should be
assigned.
1/u^2(x) = 1/u_P^2 + 1/u_M^2.
Clause 4.3.8 of the GUM provides the pdf when limits plus a single measurement
are available.
to help build a realistic model. They would not by their nature constitute measurement results,
but their values and their uncertainties might be of interest as part of model development or
enhancement.
3 Check whether the pdf is Gaussian or Student's t.
Consider that lower and upper bounds a and b for the input quantity Xi are
available. If no further information is available, the PME would yield (a + b)/2 as
the best estimate of Xi and {(b − a)^2/12}^{1/2} as the best estimate of the standard
deviation of Xi. It would also yield the pdf of Xi as the uniform distribution
with limits a and b.
λ + (e^{λa} − e^{λb}) C(λ) = 0,
where
C(λ) = 1/[(xi − a)e^{λa} + (b − xi)e^{λb}].
The PME yields [1, Clause 4.3.8] the pdf of Xi as
C(λ) e^{λXi}.
Appendix E
Nonlinear sensitivity
coefficients