
Uncertainty, error propagation and data analysis

Measurements
Measuring physical quantities is central to experimental physics. A physical
quantity has a name, a value and a unit, as in the length L = 6.32 m, where the symbol
L for the measured length is also introduced.

Data collection
It is important that the original data is noted down immediately, be it on paper or typed
into a computer. Write clearly and always include units; a photo of the recorded data
makes excellent documentation in a short journal, so it is not necessary to spend much
time on formatting. Ideally, all participants in an experiment should record the
measured data.

Reading measuring equipment


[Figure: analogue dial with a scale from 3.3 to 3.7, the needle standing between 3.4 and 3.5]
Digital instruments are easy to work with. Usually, the value of a
measurement can be recorded directly; remember to always include the
unit. Analogue instruments with a scale, on the other hand, require more
attention. With e.g. a dial, a measurement can lie between two
marks, and one must interpolate suitably between these two points. The middle is
always an option, but one can often be more precise. If the dial is between 3.4 s and 3.5 s,
the next decimal can be estimated by eye, e.g. 3.47 s as in the figure, or 3.425 s at a
different needle position.

Uncertainty
If a measurement is repeated a few times, the same outcome is not
observed every time. Even though it is assumed that a measurement has a “true” value,
we cannot expect to find it. In principle, all measurements have an uncertainty, which
one naturally strives to minimize. We therefore present the result of a measurement
as a read-off number, an uncertainty, and a unit. Symbolically, this is
written as x ± δx, where the unit is implied by the definition of the variable.

Notation
Physical quantities are presented as the best estimate and the corresponding
uncertainty with a unit attached, as in 1.2 ± 0.3 m:

x ± δx

Absolute uncertainty: δx > 0

Relative uncertainty (given in percent): δx/|x|
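As a small illustration, here is a minimal Maple sketch (using the ScientificErrorAnalysis package introduced in the examples below) of the two notions:

with(ScientificErrorAnalysis):
q := Quantity(1.2, 0.3):       # best estimate 1.2 m with absolute uncertainty 0.3 m
GetError(q)/abs(GetValue(q));  # relative uncertainty: 0.25, i.e. 25%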

Assessment of the uncertainty of a single measurement


As in the example above, we determine a time to be 3.47 s. It is natural to ask
what the uncertainty on this measurement is. If there are no additional measurements,
statistics cannot be used (as it will be in the statistical analysis below). We must
therefore assess the uncertainty ourselves. In this example, one could say that
δt = 0.005 s. This results in the time being 3.470 ± 0.005 s. Notice the zero added to
the measurement value. It can be necessary to round, but this is covered later.
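A minimal Maple sketch of recording this single reading with its assessed uncertainty (the Quantity command is introduced in the Maple examples below):

with(ScientificErrorAnalysis):
t := Quantity(3.470, 0.005);   # read-off value and assessed uncertainty, in seconds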

Types of uncertainty, precision and accuracy


There are multiple sources of uncertainty in a measurement. The first is personal error,
due to human mistakes, which can in principle be eliminated with care. The same cannot
be said for the two other sources of uncertainty: random error and systematic error. The
difference between them is illustrated in the figure below. Every filled circle is a
measurement, the star represents the true value, and the cross is the average of the
measurements. The distance from the true value to the average is the deviation
(the systematic error in the figure), which can be caused by an instrument that
systematically measures larger values, or by a source of error that is not taken into
account in the model (e.g. ignoring air resistance when determining the gravitational
acceleration). The random variation can be due to read-off error, vibrations in the setup
and temperature fluctuations. The spread of the measurements around the average is
called the random error.
We use the term accuracy for how close our measurement (the average) is to the true
value and precision for the spread around the average. High precision means that we
can reproduce the measurements well, but high precision does not necessarily imply
high accuracy.

[Figure: measurements with random error (filled circles) scattered around the mean value (cross), which is offset from the true value (star) by the systematic error]

Sources of error
Sources of error are conditions that affect the result in a certain direction. It is
important to try to identify and, if possible, eliminate sources of error during
measurements. A sign of an error source is a deviation that is larger than the
uncertainty and points in a certain direction, since deviations are calculated with a sign.

Maple example
In this example, a variable with absolute and relative uncertainty is defined. Also shown
is how to get the value and uncertainty for further use.
with(ScientificErrorAnalysis);
a := Quantity(1.23, 0.04);                # value with absolute uncertainty
a := Quantity(1.23, 0.04)
a := Quantity(1.23, 0.04, 'relative');    # relative uncertainty: 4% of 1.23
a := Quantity(1.23, 0.0492)
GetValue(a);
1.23
GetError(a);
0.0492

Rounding of quantities
Rounding is determined by the uncertainty, i.e. the uncertainty is rounded first,
followed by rounding of the value. Usually, the uncertainty is known up to a single
significant figure. If the measurement value is 7.1383 s and the uncertainty is
determined to be 0.0336 s, the uncertainty is rounded to a single significant figure. Next,
the measurement value is rounded, so it contains the same number of digits after the
decimal point as the uncertainty. Here, the rounded result becomes 7.14 ± 0.03 s. To
avoid rounding too much, an extra significant figure is included if the most significant
figure in the uncertainty is 1 or 2. If the uncertainty above had been 0.0236 s, the result
would have been 7.138 ± 0.024 s.

Maple example
In the following examples, rounding of numbers is demonstrated. The round method
rounds the uncertainty to the given number of significant figures. The round3g method
knows that if the most significant digit in the uncertainty is 1 or 2, an extra digit must be
included.
u := Quantity(7.1383, 0.0336);

u := Quantity(7.1383, 0.0336)
ApplyRule(u, round3g[1]);
Quantity(7.14, 0.03)
u := Quantity(7.1383, 0.0236);
u := Quantity(7.1383, 0.0236)
ApplyRule(u, round3g[1]);
Quantity(7.138, 0.024)

Statistical analysis of measurements


It is usually a good idea to make multiple measurements of the same quantity, if
possible, because it can help uncover errors. If n measurements are made with the
results x₁, x₂, ⋯, xₙ, then the average (mean) is the best estimate of the quantity.
Average

x̄ = (x₁ + x₂ + ⋯ + xₙ)/n = (1/n) ∑ᵢ₌₁ⁿ xᵢ
Uncertainty on each measurement
From the measurement values and the average, the standard deviation δx can be
calculated, which can be interpreted as the uncertainty on each measurement.

δx = √(((x₁ − x̄)² + (x₂ − x̄)² + ⋯ + (xₙ − x̄)²)/(n − 1)) = √((∑ᵢ₌₁ⁿ (xᵢ − x̄)²)/(n − 1))

Uncertainty on the average


The uncertainty on the average used as the best estimate of the measured quantity can
now be calculated from the standard deviation and the number of measurements.
δx̄ = δx/√n

The result is x̄ ± δx̄, which can be used in further calculations.
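Before the built-in commands are used in the Maple example below, here is a sketch of the same computation written out directly from the formulas, using the data from that example:

n := 10:
xdata := [1.22, 1.23, 1.20, 1.40, 1.25, 1.21, 1.20, 1.31, 1.26, 1.28]:
xavg := add(xdata[i], i = 1 .. n)/n;                          # average: 1.256
dxeach := sqrt(add((xdata[i] - xavg)^2, i = 1 .. n)/(n - 1)); # standard deviation: 0.0620...
dxavg := dxeach/sqrt(n);                                      # uncertainty on the average: 0.0196...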
Maple example
In this example, a calculation of the average, standard deviation and uncertainty on the
average is shown.
with(LinearAlgebra);
with(Statistics);
with(ScientificErrorAnalysis);
x := Vector([1.22, 1.23, 1.20, 1.40, 1.25, 1.21, 1.20, 1.31, 1.26, 1.28]):

N := RowDimension(x);
N := 10
xavg := Mean(x);
xavg := 1.25600000000000
xstd := StandardDeviation(x);
xstd := 0.0620394139953698
xsdom := StandardError(Mean, x);
xsdom := 0.0196185852927495
xres := Quantity(xavg, xsdom);
xres := Quantity(1.25600000000000, 0.0196185852927495)
xresround := ApplyRule(xres, round3g[1]);
xresround := Quantity(1.256, 0.020)

Error propagation
In an experiment, the measurement values by themselves are often not the desired
result. The result can be one or more quantities that depend on the measurement
values. When quantities with uncertainties are used in a calculation, the uncertainty is
propagated to the result. Imagine that we have measured a number of quantities with
corresponding uncertainties x₁ ± δx₁, x₂ ± δx₂, ⋯, xₙ ± δxₙ and would like to
calculate a quantity based on these: y = f(x₁, x₂, ⋯, xₙ). The general method of finding
the uncertainty on y is called the law of error propagation or uncertainty propagation.
The method takes different forms depending on whether the uncertainties
δx₁, δx₂, ⋯, δxₙ are independent or dependent. If the quantities are measured
separately, so that the errors do not influence one another, the uncertainties are
normally independent – most uncertainties are. If, on the other hand, the same
measured quantity (with its uncertainty) enters the calculation several times, the
uncertainties are dependent.
Independent uncertainties

δy = √((∂y/∂x₁ · δx₁)² + (∂y/∂x₂ · δx₂)² + ⋯ + (∂y/∂xₙ · δxₙ)²)

Dependent uncertainties

δy = |∂y/∂x₁| δx₁ + |∂y/∂x₂| δx₂ + ⋯ + |∂y/∂xₙ| δxₙ
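To make the independent-uncertainty formula concrete, here is a sketch of the computation done by hand in Maple, using the same function and values as in the combine example further below (D[1] and D[2] give the partial derivatives):

f := (x1, x2) -> exp(x1)*sin(x2):
x1 := 2.84: dx1 := 0.03:
x2 := 6.73: dx2 := 0.012:
dy := sqrt((D[1](f)(x1, x2)*dx1)^2 + (D[2](f)(x1, x2)*dx2)^2);
# dy = 0.289..., matching the uncertainty from combine(exp(x)*sin(y), errors) below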

As you can see, the error propagation depends on the partial derivatives of the function
being calculated. There are a few important special cases which give much simpler
calculations than direct use of the law. These are summarized below.
Addition and subtraction
When adding and subtracting quantities,

y = x₁ ± x₂ ± ⋯ ± xₙ,  where x₁ ± δx₁, x₂ ± δx₂, ⋯, xₙ ± δxₙ,

simpler formulas can be used, where the absolute uncertainties are summed directly
(for dependent uncertainties) or in so-called quadrature (for independent
uncertainties).
Independent uncertainties

δy = √(δx₁² + δx₂² + ⋯ + δxₙ²)

Dependent uncertainties

δy = δx₁ + δx₂ + ⋯ + δxₙ

Multiplication and division


When multiplying and dividing quantities,

y = (x₁ · x₂ · ⋯ · xₙ)/(z₁ · z₂ · ⋯ · zₘ),  where x₁ ± δx₁, ⋯, xₙ ± δxₙ and z₁ ± δz₁, ⋯, zₘ ± δzₘ,

the relative uncertainties are added directly (for dependent uncertainties) or in
quadrature (for independent uncertainties).
Independent uncertainties

δy/|y| = √((δx₁/x₁)² + (δx₂/x₂)² + ⋯ + (δzₘ/zₘ)²)

Dependent uncertainties

δy/|y| = δx₁/|x₁| + δx₂/|x₂| + ⋯ + δzₘ/|zₘ|
Maple example
Note that one can use most of the normal mathematical operations, but they must be
passed as an argument to the combine method, with the extra parameter errors.
with(ScientificErrorAnalysis);
x := Quantity(2.84, 0.03);
x := Quantity(2.84, 0.03)

y := Quantity(6.73, 0.12e-1);
y := Quantity(6.73, 0.012)
u := combine(x+y, errors);             # addition: absolute uncertainties in quadrature
u := Quantity(9.57, 0.03231098884)
ApplyRule(u, round3g[1]);
Quantity(9.57, 0.03)
u := combine(x*y, errors);             # multiplication: relative uncertainties in quadrature
u := Quantity(19.1132, 0.2047560900)
ApplyRule(u, round3g[1]);
Quantity(19.11, 0.20)
u := combine(exp(x)*sin(y), errors);   # general function: full law of error propagation
u := Quantity(7.395638965, 0.2890233512)
ApplyRule(u, round3g[1]);
Quantity(7.40, 0.29)

Weighting measurements
If different measurements and uncertainty assessments of a quantity are carried out,
there will be different estimates of that quantity, e.g. x₁ ± δx₁ and x₂ ± δx₂. We can use
these to make an even better estimate of the quantity than either of the individual
measurements. This is done by calculating the weighted average: instead of assuming
the measurements to be equally good, the one with the smaller uncertainty is weighted
more, the other less. Usually, the reciprocal of the square of the uncertainty is used as
the weight factor, i.e. w₁ = 1/δx₁² and w₂ = 1/δx₂². The best estimate of the quantity and
its uncertainty are then given by

x = (w₁x₁ + w₂x₂)/(w₁ + w₂)

δx = 1/√(w₁ + w₂)
One can also use individual weights in regression analysis, but this is not examined in
this note.
Maple example
with(ScientificErrorAnalysis):    # needed for Quantity, GetValue, GetError, ApplyRule
x1 := Quantity(0.123, 0.017);
x1 := Quantity(0.123, 0.017)

x2 := Quantity(0.131, 0.04);
x2 := Quantity(0.131, 0.04)
w1 := 1/GetError(x1)^2;
w1 := 3460.207612
w2 := 1/GetError(x2)^2;
w2 := 625.0000000
x := (w1*GetValue(x1) + w2*GetValue(x2))/(w1 + w2);
x := 0.1242239280
dx := 1/sqrt(w1 + w2);
dx := 0.01564562561
x := Quantity(x, dx);
x := Quantity(0.1242239280, 0.01564562561)
ApplyRule(x, round3g[1]);
Quantity(0.124, 0.016)

Combining systematic and random uncertainty
If both systematic and random uncertainties are present, we assume that they are
independent. The two types of uncertainty are combined in quadrature:

δx = √(δx_systematic² + δx_random²)

The systematic uncertainty of the experimental equipment is sometimes stated in the
manual, often as a relative uncertainty.
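A minimal Maple sketch with hypothetical numbers (the value, the random uncertainty and the 1% relative systematic uncertainty are all made up for illustration):

xval := 5.43:                # measured value (hypothetical)
dxrandom := 0.02:            # random uncertainty from repeated measurements (hypothetical)
dxsystematic := 0.01*xval:   # 1% relative systematic uncertainty from the manual (hypothetical)
dx := sqrt(dxsystematic^2 + dxrandom^2);
# dx = 0.0579..., so the result would be stated as 5.43 ± 0.06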

Normal distribution
The normal distribution is one of the most commonly occurring distributions in
experimental measurements. Measurement values are often, but not always, normally
distributed.
The normal distribution has the following probability density function, with mean value
μ and width σ:

f(x) = (1/(σ√(2π))) e^(−(x−μ)²/(2σ²))
The probability that a measurement lies in a certain interval is the integral of the
probability density function over that interval:

P(z₁ < z < z₂) = ∫_{z₁}^{z₂} f(z) dz

Normalization is often used, so that the mean value is 0 and the width is 1:

z = (x − μ)/σ

The probability density function then becomes

f(z) = (1/√(2π)) e^(−z²/2)

P(−1 < z < 1) = 0.6827
P(−2 < z < 2) = 0.9545
P(−3 < z < 3) = 0.9973
P(−4 < z < 4) = 0.9999
Maple example
The four probabilities above are the probabilities that a normally distributed
result lies within 1 to 4 standard deviations of the mean. They are calculated
explicitly below. Lastly, the probability of a result lying between 0.4 and 0.9 is
calculated.
with(Statistics);
for x to 4 do evalf(CDF(NormalDistribution(0, 1), x) - CDF(NormalDistribution(0, 1), -x)); end do;
0.6826894920
0.9544997360
0.9973002039
0.9999366575
CDF(NormalDistribution(0, 1), 0.9)-CDF(NormalDistribution(0, 1), 0.4);
0.160518133042916

Comparing the measurement and the expected value


Assume that a measurement has been done and the uncertainty has been evaluated,
x ± δx. We would like to compare this to an expected value, x_expected. It is natural to
calculate the normalized value

z = |x − x_expected| / δx

and to assume that the measurement is normally distributed. If z is less than 2
uncertainties or standard deviations, the quantities are not unreasonably far from
each other. If z is larger than 3, the two quantities are far from each other. If z is
between 2 and 3, it is a gray area.
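A small Maple sketch of the comparison (the numbers are hypothetical):

x := 9.94: dx := 0.12:    # measured value with its uncertainty (hypothetical)
xexpected := 9.82:        # expected value, e.g. from a table (hypothetical)
z := abs(x - xexpected)/dx;
# z = 1.0, well below 2, so the measurement is compatible with the expected value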

Linear regression
Consider the model y = a + bx, with a dependent variable y, an independent
variable x, and two parameters a and b. Assume that there is a series of corresponding
(measured) values of the dependent and independent variables:
(x₁, y₁), (x₂, y₂), ⋯, (xₙ, yₙ).
We now find the parameter values that minimize the residual sum of squares,

RSS = ∑ᵢ₌₁ⁿ (yᵢ − a − bxᵢ)²,

the so-called method of least squares. The expression in the parentheses is called a
residual. The example below shows how to find the model parameters. We can also find
the residual standard deviation

δy = √(RSS/(n − 2))

which at the same time gives the uncertainty on the y-values, assuming it is the same
for all of them.
Maple example
In the example, x is the independent variable and y is the dependent variable. One call
outputs only the regression function, and another outputs additional information,
including the uncertainty on the parameters. The last three calls show how to get the
parameters, the uncertainty on the parameters, and the residual standard deviation.
with(Statistics);
x := Vector([30., 40., 50., 60., 70., 80., 90., 100., 110., 120., 130., 140.]);
y := Vector([19., 19., 22., 25., 21., 25., 26., 23., 27., 29., 30., 29.]);
LinearFit(B*t+A, x, y, t);
LinearFit(B*t+A, x, y, t, summarize = true);
Summary:
----------------
Model: 16.410256+.96153846e-1*t
----------------
Coefficients:
              Estimate  Std. Error  t-value  P(>|t|)
Parameter 1   16.4103   1.2998      12.6251  0.0000
Parameter 2    0.0962   0.0142       6.7866  0.0000
----------------
R-squared: 0.8216, Adjusted R-squared: 0.8038
LinearFit(B*t+A, x, y, t, output = parametervector);
LinearFit(B*t+A, x, y, t, output = standarderrors);
LinearFit(B*t+A, x, y, t, output = residualstandarddeviation);

Quadratic regression
As in the linear regression, there is a series of corresponding values
(x₁, y₁), (x₂, y₂), ⋯, (xₙ, yₙ) on which we would like to perform the quadratic
regression y = a + bx + cx². Here, the quantity RSS = ∑ᵢ₌₁ⁿ (yᵢ − a − bxᵢ − cxᵢ²)² is
minimized. The residual standard deviation is found by the expression

δy = √(RSS/(n − 3))

which at the same time gives the uncertainty on the y-values, assuming it is the same
for all of them.
Maple example
In this example, first a linear regression is carried out, then a quadratic regression.
The residual plot of the linear fit shows a systematic pattern, revealing structure in the
data that the linear model misses; the residuals of the quadratic fit no longer show this
pattern.
with(plots):
with(Statistics):
x := Vector([1, 2, 3, 4, 5, 6]);
y := Vector([3.4, 5.2, 7.0, 9.0, 11.2, 13.4]);
N := 6;
plot(x, y, style = point);
LinearFit(B*t+A, x, y, t);
f := unapply(%, t);
display(plot(x, y, style = point), plot(f(t), t = x(1) .. x(N), color = red));
res := LinearFit(B*t+A, x, y, t, output = residuals);
plot(x, res, style = point);

LinearFit(C*t^2+B*t+A, x, y, t);
f := unapply(%, t);
display(plot(x, y, style = point), plot(f(t), t = x(1) .. x(N), color = red));
res := LinearFit(C*t^2+B*t+A, x, y, t, output = residuals);
plot(x, res, style = point);
