
Rounding the expanded uncertainty

B. D. Hall
Measurement Standards Laboratory of New Zealand,
Callaghan Innovation,
PO Box 31-310, Lower Hutt 5040,
New Zealand.
(email: blair.hall@measurement.govt.nz)

Abstract
When reporting expanded uncertainty, rounding to a given number of significant figures can
affect the associated level of confidence. A simple numerical method is used to measure
the level of confidence obtained after different rounding methods have been applied. This
also offers some insight into the meaning of level of confidence. The findings are consistent
with NIST recommendations that, for a level of confidence of 95%, an expanded
uncertainty should be rounded to two significant figures and the accompanying measured
value reported with the same precision.

Introduction
As a technical expert participating in the assessment of laboratories operating to ISO 17025
[1], I am sometimes asked “How many significant figures should be retained when reporting
expanded uncertainty?”. My answer is something like
The expanded uncertainty, at a level of confidence of 95%, should be rounded to
two significant figures and the measured value should be reported with the same
precision.
This follows excellent guidance given in the NIST publication Good Laboratory Practice for
Rounding Expanded Uncertainties and Calibration Values [2]. However, that guide does not
discuss the actual effects of rounding, so it is of little help when discussing the consequences
of alternative rounding schemes, which are inevitably under consideration when the question
is posed. The Guide to the expression of uncertainty in measurement (GUM) also briefly
discusses the number of digits used when reporting uncertainty but does not consider the effects
of rounding either [3, §7.6.2]. We are not aware of other documents that address this question,
so this paper looks at the process of rounding and tries to give some insight into how different
rounding rules, and the number of digits retained, affect reporting.^1
Rounding a number produces a different number. For instance, if p = 3.141 59 is rounded to
p′ = 3.14 we should not be surprised that a calculation using p′ gives different results to the
same calculation using p; clearly p and p′ are not the same. Rounding of measured values is a
well-known source of error that should be accounted for in the measurement uncertainty budget.
Rounding of uncertainty, however, affects measurement results in a different way.
The expanded uncertainty of a traceable measurement is reported as a way to quantify the risk of
taking the measured value as an approximation for the measurand – the quantity intended to be

© 2019 Springer-Verlag GmbH Germany. This is an author-created, un-copyedited version of an article accepted
for publication in Accreditation and Quality Assurance. The final authenticated version is available online
at: https://doi.org/10.1007/s00769-019-01400-z.

^1 The reader need not be alarmed: the effects of rounding uncertainty are not generally large; other approximations
made during uncertainty analysis may well have a greater impact on the resulting level of confidence. Nonetheless,
we offer here a considered response to this recurring question about rounding.

measured. We may consider a measurement result as an uncertainty interval for the measurand.^2
This interval is constructed with the aim of covering the measurand but, in practice, that cannot
be guaranteed, so we attribute a level of confidence to such intervals. Rounding may give rise to
a different interval, which may cover a measurand that would otherwise have been missed, or
vice versa. So, rounding the uncertainty may affect the level of confidence of the measurement
result.
Level of confidence is related to the long-run behaviour of a measurement process.^3 To understand
this better, we can think in terms of a large number of independent measurements and ask
how many of the intervals obtained actually cover the corresponding measurand.^4 Suppose,
just for a moment, that we actually knew each measurand. We could then classify a result as
a ‘success’, if the measurand was covered, or a ‘failure’, if it was not. We could work through
all the results in this way and evaluate a success rate for the measurement process. This would
be approximately equal to the level of confidence: the more measurements in our sample, the
closer the observed success rate would be to the actual level of confidence. With this
interpretation in mind, we see that measurement results reported with a 95% level of confidence are
expected to cover measurands in 19 out of every 20 independent cases – on average, in the long
run.
Can the long-run success rate of a measurement procedure, and hence the level of confidence, be
assessed? No. In practice, there is no way to determine whether an uncertainty interval covers
a measurand. We can, however, turn to computer simulation as a way of calibrating the method
used to produce uncertainty intervals. We can simulate many independent measurements, round
the results, and observe the effect on the long-run success rate. This approach is analogous to
calibrations carried out in the laboratory. Here we use a computer to create standard data sets,
in which we know the measurands, and use these data to evaluate the performance of our data
processing.
The next section describes the simple numerical method used to assess the level of confidence
obtained when using different formatting rules. We present an algorithm for simulating
measurement results and describe the different rounding formats that will be assessed. The
following section presents the results, which are discussed in the final section before presenting our
conclusions.

^2 The measurement uncertainty statement accompanying a measured value (an estimate of the measurand) may
be presented in different ways. Most common, perhaps, is an expanded uncertainty at a given level of confidence.
Alternatively, a standard uncertainty with known degrees of freedom could be given, or even a fully specified
probability distribution. In any case, an uncertainty interval may be constructed from the information provided.
^3 The interpretation of probability in terms of a long-run relative frequency of events is likely to be familiar to
most readers, but there are alternative interpretations. A Bayesian view, based on a state of knowledge, has been
emphasized in a recent supplement to the GUM [4], and a fiducial view has been considered [5]. The resulting
difficulty with terminology is unfortunate; but, although the interpretations may vary, repeated simulations can
be applied to any method that generates uncertainty intervals. Indeed, examples of fiducial methods have been
accompanied by such tests (see, e.g., [6]).
^4 Independent measurements generate results that do not depend on each other; the occurrence of one particular
result makes it no more or less likely that another will occur. For this discussion, it is also best to assume that all
sources of measurement error have been randomised; that is, measurements made under conditions of metrological
reproducibility.

Assessing rounding effects by simulation


A measurement model
We consider a simple scenario, in which a measurement is influenced by one dominant source
of error. Following the GUM [3], a measurement model can be expressed using three terms:
the measurand Y, the ith measured value yi and a term Ei representing the ith measurement
error. The model is

Y = yi − Ei . (1)

Only the measured value yi will be known to the metrologist; it provides an approximate value
for the measurand. The measurand is fixed but unknown and the values of Ei vary unpredictably
from one measurement to the next. Here, we attribute those values to random noise and assume
that they follow a Gaussian distribution. In GUM terminology, the standard uncertainty of yi,
written as u(yi), is associated with the standard deviation of this distribution. So, measured
values are distributed about Y.
As a first step in evaluating the uncertainty interval for a result, an expanded uncertainty is
calculated by multiplying the standard uncertainty by a coverage factor k0.95 = 1.96,^5

U(yi) = k0.95 · u(yi) . (2)

This coverage factor adjusts the width of the interval to obtain the desired level of confidence –
in this case 95%. The interval [l−, l+] is then determined by the lower limit l− = yi − U(yi) and
the upper limit l+ = yi + U(yi). Note that yi is unpredictable, so the location of the interval can
vary from one measurement to the next.
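
To make the construction concrete, here is a minimal sketch in Python; the numerical values
for the measured value and standard uncertainty are invented for the example.

    k95 = 1.96                # coverage factor for a 95 % level of confidence
    y, u = 10.23, 0.37        # measured value and standard uncertainty (invented)

    U = k95 * u               # expanded uncertainty, eq. (2)
    lo, hi = y - U, y + U     # lower and upper limits of the uncertainty interval

    print(f"U = {U:.4f}, interval = [{lo:.4f}, {hi:.4f}]")
    # U = 0.7252, interval = [9.5048, 10.9552]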

Assessing the level of confidence by simulating measurements


The following steps can be used to simulate independent measurement results.
1. Choose a value for the measurand Y
2. Choose a value for standard uncertainty u(yi)
3. Calculate the expanded uncertainty U(yi) = k0.95 · u(yi)
4. Apply a formatting rule to round U(yi) to obtain U′(yi)
5. Draw a value for Ei from a Gaussian random number generator with a mean of zero and
a standard deviation of u(yi)^6
6. Calculate the measured value yi = Y + Ei
7. Round yi to obtain y′i, to the same precision as in step 4
8. Calculate the limits l′+ = y′i + U′(yi) and l′− = y′i − U′(yi)
9. Determine whether Y is included in the uncertainty interval [l′−, l′+]

Note that the procedure used at step 4 can be changed to allow different formatting rules to be
investigated. This step formats the expanded uncertainty to a finite number of significant figures
and the same number of significant figures must be used in step 7. However, only conventional
rounding is used at step 7 (see ‘ordinary rounding’ in the section on different formatting rules).
By executing this algorithm many times, a relative frequency of successes can be obtained that
gives a measure of the level of confidence.
^5 We make the simplifying assumption that the standard deviation is known. This is referred to as infinite degrees
of freedom. More generally, the factor k0.95 depends on the number of degrees of freedom and grows larger as the
degrees of freedom decrease.
^6 The simulation draws an independent error from a distribution that does not change from one measurement to
the next. Enduring sources of systematic error, which would bias results, and ‘drift’ in the measurement system are
not considered.
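
A minimal sketch of these steps in Python might look as follows. This is not the author’s
code: the helper round_sig is our own name, the expanded uncertainty is drawn uniformly
between 0.1 and 1.0 (as described later in the results section), and ordinary rounding is used
at step 4. Note that Python’s built-in round() resolves ties to the even digit rather than
always rounding a five up, which differs very slightly from strict ordinary rounding.

    import math
    import random

    def round_sig(x, d):
        """Round x to d significant digits (ties resolved by Python's round())."""
        decimals = d - 1 - math.floor(math.log10(abs(x)))
        return round(x, decimals)

    def simulate(n, d, Y=0.0, k95=1.96):
        successes = 0
        for _ in range(n):
            U = random.uniform(0.1, 1.0)     # steps 2-3: choose u(yi) so that
            u = U / k95                      # U(yi) is uniform over a decade
            U_r = round_sig(U, d)            # step 4: format the expanded uncertainty
            E = random.gauss(0.0, u)         # step 5: draw a measurement error
            y = Y + E                        # step 6: the measured value
            decimals = d - 1 - math.floor(math.log10(U_r))
            y_r = round(y, decimals)         # step 7: same precision as U_r
            lo, hi = y_r - U_r, y_r + U_r    # step 8: interval limits
            successes += (lo <= Y <= hi)     # step 9: does the interval cover Y?
        return successes / n

    print(simulate(1_000_000, d=2))          # should be close to the 0.9529 in Table 1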
Different formatting rules
A number of different formatting rules are described below, and measurements of the levels of
confidence obtained using them are reported in the results section. The parameter d is the
number of significant digits retained.^7

Ordinary rounding
Retain the first d significant digits of a number and, if the next significant digit is greater than
or equal to five, increment the least significant retained digit. For example, 2.543 rounds to 2.5
when d = 2, while 2.553 rounds to 2.6.

Round up
Identify the first d significant digits of a number and increment the least significant digit unless
the remaining digits are all zero. For example, 2.543 rounds up to 2.6 when d = 2, while 2.500
rounds to 2.5.

Round down
Retain the first d significant digits of a number. For example, 2.543 rounds down to 2.5 when
d = 2.

Round and drop a digit
This is a hybrid formatting rule that can produce d or d − 1 significant digits. Round the number
(using ordinary rounding) and retain the first d − 1 significant digits if the least significant of
these digits is five or greater; otherwise retain the number rounded to d digits. For example,
5.431 becomes 5 when d = 2, while 4.452 becomes 4.5 and 4.422 becomes 4.4.
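
For reference, the four rules might be implemented as follows. This is a sketch under the
assumption of a positive argument (always the case for an uncertainty) and d ≥ 2 for the
hybrid rule; the function names are ours, not from any standard library.

    import math

    def _decimals(x, d):
        """Decimal places that retain d significant digits of a positive x."""
        return d - 1 - math.floor(math.log10(x))

    def round_ordinary(x, d):
        """Round half up on the digit following the d-th significant digit."""
        n = _decimals(x, d)
        return math.floor(x * 10**n + 0.5) / 10**n

    def round_up(x, d):
        """Increment the last retained digit unless the remainder is zero."""
        n = _decimals(x, d)
        return math.ceil(x * 10**n) / 10**n

    def round_down(x, d):
        """Simply drop every digit after the first d significant ones."""
        n = _decimals(x, d)
        return math.floor(x * 10**n) / 10**n

    def round_and_drop(x, d):
        """Keep d - 1 digits when the (d-1)-th significant digit is >= 5 (d >= 2)."""
        digit = int(x / 10**math.floor(math.log10(x)) * 10**(d - 2)) % 10
        return round_ordinary(x, d - 1 if digit >= 5 else d)

    # the worked examples from the text, all with d = 2
    assert round_ordinary(2.543, 2) == 2.5 and round_ordinary(2.553, 2) == 2.6
    assert round_up(2.543, 2) == 2.6 and round_up(2.500, 2) == 2.5
    assert round_down(2.543, 2) == 2.5
    assert round_and_drop(5.431, 2) == 5.0 and round_and_drop(4.452, 2) == 4.5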

Results
The algorithm described above was used to assess the level of confidence produced by the
different formatting rules. The results are shown in Table 1.

                 d = 1     d = 2     d = 3
round            0.9691    0.9529    0.9506
round up         0.9809    0.9555    0.9506
round down       0.9484    0.9498    0.9500
round and drop     -       0.9593    0.9521
base-line        0.9479    0.9498    0.9505

Table 1: The level of confidence obtained with different rounding formats when the number of
significant digits retained is d. The table shows the relative frequency of successes in simulations
of 1 000 000 independent measurements, where a ‘success’ is an uncertainty interval that covers
the measurand. Note that the nominal level of confidence without rounding is 0.9500, or 95%.
The ‘base-line’ results are for simulations where the measured value was rounded but not the
expanded uncertainty.

Without loss of generality we set Y = 0. In each simulation, a value of u(yi) was drawn from
a uniform random number generator, which was adjusted so that the expanded uncertainty was
uniformly distributed between 0.1 and 1.0.^8 Each result in the table was evaluated from a sample
of 1 000 000 simulations. For this sample size, and when the nominal level of confidence is
95 %, we expect the number of successes observed to vary by about 218 from run to run.^9 This
corresponds to a success rate variability of about 0.000 22 or 0.022%.

^7 We do not report on the balanced even–odd rounding method described in [2], because it is not implemented
in computer systems. However, we do comment on a very slight performance difference between this method and
ordinary rounding in the results section.
^8 Simulations that held the uncertainty fixed were found to be slightly sensitive to the choice of u(yi), due to
the behaviour of the rounding algorithms. The pattern of behaviour we observed repeats each decade, so drawing
uniformly distributed variates over this interval obtains a representative average of different rounding behaviours.
^9 For a nominal level of confidence p, the standard deviation in the number of successes observed is
√(np(1 − p)), where n is the number of simulations. For p = 0.95 and n = 1 000 000 this is
√(1 000 000 × 0.95 × 0.05) ≈ 217.9.
To identify effects that might be due solely to rounding the measured values, we also ran
simulations where the measured value was rounded but not the expanded uncertainty. These are
reported as ‘base-line’ results in Table 1. Here, even the case of d = 1 shows only a very small
deviation from the nominal probability 0.950. This improved further, to 0.9494 for d = 1, when a
balanced even–odd rounding algorithm was used instead of ordinary rounding [2].

Discussion and conclusions


Comments about the results
The level of confidence values reported in Table 1 support the convention of retaining two
significant figures in the expanded uncertainty. Nevertheless, retaining three figures is even
better, which should remind us that rounding degrades the information and is done only for the
convenience of human readers. Retaining two significant figures may be seen as an acceptable
compromise.
Although it is not uncommon to see expanded uncertainty reported using just one significant
digit, this should be discouraged. When d = 1, the effects of ordinary rounding and rounding
up are conservative, leading to probabilities higher than 0.950. Also, the success rates observed
can vary by several percent, depending on the value of the uncertainty, which means that the
level of confidence is rather poorly defined.
When using two significant digits, there is little to distinguish between ordinary rounding and
rounding up or down. Rounding up is sometimes adopted as a conservative laboratory policy,
but we see here that the effect is small – less than 1%. Similarly, rounding down (dropping
digits) might be thought of as optimistic, but the effects are negligible. Depending on the value
of the uncertainty, the success rate when d = 2 can vary by almost half a percent. The ‘round
and drop’ format allows an extra digit to be dropped from time to time. It behaves better than
ordinary rounding with d = 1, but worse than ordinary rounding with d = 2.^10

^10 The method may be justified by the effect of rounding on the measured value. When d = 2, a significant digit
may sometimes be dropped while introducing no more than 10% relative error to a measured value. The method has
been used at the Measurement Standards Laboratory for many years, but ordinary rounding with two digits is now
preferred.
Although no results are presented for cases of finite degrees of freedom, these can be investigated
with a slight modification to our algorithm. We looked at ten degrees of freedom but found a
barely perceptible difference (of the order of 0.1%) in the average level of confidence at d = 1,
and none for d = 2 and d = 3. For two degrees of freedom, and with d = 1, ordinary rounding
and rounding up are noticeably conservative (of the order of 1% and 2%, respectively), but when
d = 2 and d = 3 the results obtained were essentially the nominal values.
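
One way such a modification might look is sketched below, under classical assumptions that
are ours rather than taken from the paper: the coverage factor becomes a Student-t quantile,
the estimated standard uncertainty is drawn from a scaled chi distribution, and rounding is
omitted for brevity. The use of numpy and scipy is also our choice.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(seed=1)
    nu = 10                                # degrees of freedom
    k95 = stats.t.ppf(0.975, nu)           # ≈ 2.23, larger than 1.96 (see footnote 5)
    u_true = 0.3                           # true standard deviation of the errors
    n = 1_000_000

    E = rng.normal(0.0, u_true, n)                   # measurement errors
    s = u_true * np.sqrt(rng.chisquare(nu, n) / nu)  # estimated standard uncertainties
    covered = np.abs(E) <= k95 * s                   # does y ± k95·s cover Y = 0?
    print(covered.mean())                            # ≈ 0.95 before any rounding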

Comments about the level of confidence of single measurements


When a level of confidence is reported for a single result, it is tempting to think of it as the
probability that the reported uncertainty interval covers the measurand. However, this misinterprets
the probability statement. The measurand is fixed [3, §1.2] and so is the uncertainty interval,
which is calculated from measurement data. So, once the measurement is made, nothing can
vary; there is nothing left to chance and so it does not make sense to talk about probability.
An analogous situation arises just before the final step in our simulation process. There, an
interval has been calculated and the remaining step will test whether it covers the measurand.
The test result can only be true or false; we cannot say that, for instance, the measurand is 95 %
covered. Therefore, as already stated in the introduction, the level of confidence relates to the
performance of the measurement process being used; it is not a probability statement about a
single result.

Conclusion
This paper provides a simple interpretation of the effect of rounding expanded uncertainties
on measurement results and shows that rounding effects can be assessed by simulation. Our
results support the advice that, for a 95% level of confidence, the expanded uncertainty should
be rounded to two significant figures and the measured value reported with the same precision.

Acknowledgments
This work was funded by the New Zealand Government. The author is grateful to D. R. White
and A. Koo for critical review and helpful comments, and to the anonymous referees for their
constructive comments and the suggestion to include reference [5].

References
[1] ISO/IEC 17025:2017 (2017) General requirements for the competence of testing and calibration
laboratories. International Organization for Standardization, Geneva, 3rd edn.
[2] NIST (2014) Good Laboratory Practice for Rounding Expanded Uncertainties and Calibration
Values (GLP 9). National Institute of Standards and Technology.
[3] BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (2008) Evaluation of measurement data –
Guide to the expression of uncertainty in measurement. JCGM 100:2008 (GUM 1995 with
minor corrections). BIPM Joint Committee for Guides in Metrology, Sèvres, 1st edn.
[4] BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (2008) Evaluation of measurement data –
Supplement 1 to the “Guide to the expression of uncertainty in measurement” – Propagation
of distributions using a Monte Carlo method. JCGM 101:2008. BIPM Joint Committee for
Guides in Metrology, Sèvres, 1st edn.
[5] Guthrie W F, Liu H K, Rukhin A L, Toman B, Wang J C M, Zhang N F (2009) In: Pavese
F, Forbes A B (eds) Data Modeling for Metrology and Testing in Measurement Science.
Modeling and Simulation in Science, Engineering and Technology. Birkhäuser, Boston, pp
1–45.
[6] Wang C M, Iyer H K (2009) Fiducial intervals for the magnitude of a complex-valued
quantity. Metrologia 46:81.
