
Engineering Maths 1 (Part 1) - Basic Statistics and Error Analysis

Course Structure:
Lecture 1
Nature of errors
Summarising data:
o Measures of central tendency: mean, median, mode
o Measures of dispersion: variance, standard deviation, median absolute deviation
Combination and propagation of errors
Lecture 2
Distributions: normal, log-normal, Student's t distribution
Confidence limits
Central limit theorem
Tests for normality:
o Kolmogorov-Smirnov
o Lilliefors
Lecture 3
Parametric statistical tests:
o F test (variances)
o t tests (means)
Outliers
Rejection of data
Lecture 4
Regression and correlation
Correlation coefficient
Least squares fitting
Lecture 5
Software packages:
o Excel/OpenOffice Calc
o Sigmaplot
o Origin
o SPSS
Introduction
This course is intended to provide a basic introduction to the subject of statistics and to impart
the necessary skills for primary data analysis. The emphasis will be on the use of
straightforward statistical tests and methods that will ensure that data can be clearly and
unambiguously interpreted and used.
The principal thesis underlying this lecture course is that any quantitative result must be
accompanied by an estimate of the errors it contains for it to be of real value. The five
lectures will describe techniques that can be used to determine the effect of errors as well
as to provide estimates for uncertainty in a given measurement or calculation. The emphasis
of this course is on the pragmatic rather than the theoretical.
Reading List
Statistics and Chemometrics for Analytical Chemistry, 6th Edition, J.N. Miller and J.C.
Miller, Prentice-Hall, ISBN 0273730428.
Chemometrics: Statistics and Computer Applications in Analytical Chemistry, M. Otto,
Wiley-VCH, ISBN 3527314180.
Statistical Procedures for Analysis of Environmental Monitoring Data and Risk Assessment,
E.A. McBean and F.A. Rovers, Prentice-Hall, ISBN 0136750184.
Statistical Tables, J. Murdoch and J.A. Barnes, 4th Edition, Macmillan, ISBN 0333558596.
In addition, there is a very useful Web resource for statistics at:
http://www.itl.nist.gov/div898/handbook/
Introduction to Statistics
Classification of Data and Errors
Summarising Data
Measures of Dispersion
Classification of Data and Errors
Statistics may be defined as the collection, ordering and analysis of data. Data consists of sets
of recorded observations or values. Any quantity that may be described by more than one
value is a variable, and there are two types of variable:
Discrete: also known as step variables
Discrete variables are those that can be counted (one bean, two beans, three beans etc) or they
may be described as variables that are described by a fixed set of values. In other words, each
value of the variable can be associated with an integer index.
Continuous
Continuous variables are those described by a continuous range of values. The range of
possible values may be limited or may extend from -∞ to +∞. The result is dependent upon
the precision of the measurement or the accuracy of the observer.
Every measurement is subject to two types of errors:
Systematic: also known as determinate errors
These errors are built into the observation and affect the accuracy of measurement. They are
caused by imperfections in the instruments or methods used to make the measurement. In
theory, a determinate error may be quantified and corrected for. Figure 1 illustrates a very
simple situation in which a systematic error might occur.
Random: also known as indeterminate errors
These are due to random and uncontrolled variations in the implementation of the
measurement. Such errors affect the precision of the observation. Statistical methods may be
used to assess the effect of a random or indeterminate error.
Summarising Data
The Mean
The mean of a set of measurements is defined by Equation (1):
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i        Equation (1)

where x̄ is the sample mean, x_i is a member of the sample and n is the size of the sample.
The mean is also referred to as the first moment of a sample.
There are two other ways of reporting repeated measurements of the same variable, the mode
and the median:
The Mode
The mode is the value of the most frequent observation in a set of data of the same variable.
Furthermore, for data presented as a frequency histogram (a plot of number of occurrences
against value of the variable) the mode may be evaluated using graphical techniques. In this
case, the mode is the value that gives the tallest bar in the histogram.
The Median
The median is the mid-point in a set of data of the same variable. If the number of
observations is odd, the median is the centre value in a sorted list of observations, while for
an even number of observations the median is the average of the two observations on either
side of the centre.
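Where it helps, the three measures can be checked with Python's standard statistics module; a minimal sketch, using an illustrative list of repeated readings:

import statistics

# Illustrative repeated measurements of the same variable (units: g)
readings = [12.475, 12.469, 12.481, 12.466, 12.474, 12.465, 12.475, 12.473]

mean = statistics.mean(readings)      # arithmetic mean, Equation (1)
median = statistics.median(readings)  # mid-point of the sorted observations
mode = statistics.mode(readings)      # most frequent observation

print(f"mean = {mean:.4f} g, median = {median:.4f} g, mode = {mode:.3f} g")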
Figure 1. An example of a source of systematic error in a measurement. [The original figure showed a transducer, driven by a stimulus and a power supply, connected to a voltmeter (V) through dissimilar copper and aluminium leads at temperature T.]
Measures of Dispersion
The most obvious measure of dispersion is the range, which is the difference between the largest and smallest values observed for a variable. The range does not, however, convey any useful information about the distribution of values within the range.
Median Absolute Deviation (MAD)
The median absolute deviation is defined by Equation (2):

MAD = \mathrm{median}\left(\,\left|x_i - \mathrm{median}(X)\right|\,\right)        Equation (2)

The MAD is useful in statistical analysis as it is a way of independently estimating the population standard deviation, σ. In this case, MAD/0.6745 is used as the estimate of σ. We
will come on to the difference between sample and population standard deviations later.
The Variance (S²)
The variance of a sample is given by Equation (3):

S^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2        Equation (3)
Using the square of the difference between a sample value and the mean ensures that the
variance is always positive.
The standard form of Equation (3) can be rewritten to avoid the need to first calculate the mean and then subtract it from each x_i:
S^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]        Equation (3a)
In this case, all that needs to be calculated are the sums of x_i and x_i².
Standard deviation (S)
Standard deviation is the square root of the variance and so may be described simply by
Equation (4):
S = \sqrt{\frac{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}{n-1}}        Equation (4)
Or using the square root of Equation (3a):
S = \sqrt{\frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]}        Equation (4a)
To use the alternate forms of these equations, we simply accumulate the sums of x_i and x_i² and use these values in Equations (3a) or (4a).
NOTE ON THE EQUATIONS TO USE:
You should always use the forms involving sums (Equations 3a and 4a) instead of the formal
definitions (Equations 3 and 4) as the formal definitions are prone to accumulated rounding
errors. Later examples where you should use an alternative equation will also be noted.
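As a rough sketch of the point above, the fragment below computes the variance and standard deviation of the Example 1 masses both from the formal definitions (Equations 3 and 4) and from the accumulated sums (Equations 3a and 4a), together with the RSD of Equation (5):

import math

masses = [12.475, 12.469, 12.481, 12.466, 12.474, 12.465, 12.475, 12.473,
          12.481, 12.472, 12.482, 12.475, 12.485, 12.473, 12.465, 12.485,
          12.468, 12.477, 12.450, 12.513]
n = len(masses)
mean = sum(masses) / n

# Formal definition: subtract the mean from every value, then square (Equations 3 and 4)
var_formal = sum((x - mean) ** 2 for x in masses) / (n - 1)

# Sums form: accumulate sum(x) and sum(x^2) only (Equations 3a and 4a)
sum_x = sum(masses)
sum_x2 = sum(x * x for x in masses)
var_sums = (sum_x2 - sum_x ** 2 / n) / (n - 1)

s = math.sqrt(var_sums)
rsd = s / mean * 100          # relative standard deviation, Equation (5)
print(f"S^2 = {var_sums:.3e} (formal form {var_formal:.3e}), S = {s:.4f} g, RSD = {rsd:.3f}%")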
Alternative Ways of Expressing Dispersion
There are other ways of expressing the standard deviation. The relative standard deviation
expresses the ratio of the standard deviation to the mean as a percentage, as shown in
Equation (5):
\mathrm{RSD} = \frac{S}{\bar{x}} \times 100\%        Equation (5)
while the coefficient of variation simply expresses the dispersion as the ratio of the standard
deviation over the sample mean, Equation (6):
\mathrm{Coefficient\ of\ Variation} = \frac{S}{\bar{x}}        Equation (6)
For scales that begin at zero, the coefficient of variation is independent of units, and
consequently is sometimes a more convenient way of reporting dispersion. However, the
coefficient of variation should not be used for scales that do not have a common zero origin
or for data sets that contain negative values. If negative values are possible, the mean could
turn out to be close to zero, leading to very large values for the coefficient of variation.
Example 1
20 mass measurements were made of a flask of reagent. The units and measurements were in
grams:
12.475 12.469 12.481 12.466
12.474 12.465 12.475 12.473
12.481 12.472 12.482 12.475
12.485 12.473 12.465 12.485
12.468 12.477 12.450 12.513
The first task is to sort the data into ascending order. The mean is given by Equation (1)
above, and in this case n is 20, so Equation 1 reduces to:
\bar{x} = \frac{1}{20}\sum_{i=1}^{20} x_i

or

\bar{x} = \frac{249.504}{20} = 12.4752\ \mathrm{g}

12.4752 g claims a greater precision than the observation was capable of and so is rounded to the same precision as the original data. So x̄ = 12.475 g to three decimal places. The table below shows the individual measurements, their squares, deviations from the mean and the square of the deviation from the mean for this set of measurements, as well as the sums of these values.
Observation    x_i        x_i²           x_i - x̄    (x_i - x̄)²
1 12.513 156.575169 0.038 0.001444
2 12.485 155.875225 0.010 0.000100
3 12.485 155.875225 0.010 0.000100
4 12.482 155.800324 0.007 0.000049
5 12.481 155.775361 0.006 0.000036
6 12.481 155.775361 0.006 0.000036
7 12.477 155.675529 0.002 0.000004
8 12.475 155.625625 0.000 0.000000
9 12.475 155.625625 0.000 0.000000
10 12.475 155.625625 0.000 0.000000
11 12.474 155.600676 -0.001 0.000001
12 12.473 155.575729 -0.002 0.000004
13 12.473 155.575729 -0.002 0.000004
14 12.472 155.550784 -0.003 0.000009
15 12.469 155.475961 -0.006 0.000036
16 12.468 155.451024 -0.007 0.000049
17 12.466 155.401156 -0.009 0.000081
18 12.465 155.376225 -0.010 0.000100
19 12.465 155.376225 -0.010 0.000100
20 12.450 155.002500 -0.025 0.000625
Sums: 249.504 3112.615078 - 0.002778
The variance, S², of this sample is obtained by:

\sum_{i=1}^{20}\left(x_i - \bar{x}\right)^2 = 2.778\times 10^{-3}

S^2 = \frac{\sum_{i=1}^{20}\left(x_i - \bar{x}\right)^2}{19} = \frac{2.778\times 10^{-3}}{19} = 1.462\times 10^{-4}

S = \sqrt{1.462\times 10^{-4}} = 0.012\ \mathrm{g}
Alternatively,
S = \sqrt{\frac{3112.615078 - \frac{(249.504)^2}{20}}{19}} = \sqrt{\frac{3112.615078 - 3112.612301}{19}} = 0.012\ \mathrm{g}

\mathrm{RSD} = \frac{S}{\bar{x}}\times 100\% = \frac{0.012}{12.475}\times 100\% = 0.096\%
What happens if you round too soon? The table below shows the result of rounding the
squares of the deviations to different numbers of decimal places:
Obs    x_i      x_i - x̄    (x_i - x̄)² 3 dp    (x_i - x̄)² 4 dp    (x_i - x̄)² 5 dp    (x_i - x̄)² 6 dp
1 12.513 0.038 0.001 0.0014 0.00144 0.001444
2 12.485 0.01 0 0.0001 0.00010 0.000100
3 12.485 0.01 0 0.0001 0.00010 0.000100
4 12.482 0.007 0 0 0.00004 0.000049
5 12.481 0.006 0 0 0.00003 0.000036
6 12.481 0.006 0 0 0.00003 0.000036
7 12.477 0.002 0 0 0.00000 0.000004
8 12.475 0 0 0 0.00000 0.000000
9 12.475 0 0 0 0.00000 0.000000
10 12.475 0 0 0 0.00000 0.000000
11 12.474 -0.001 0 0 0.00000 0.000001
12 12.473 -0.002 0 0 0.00000 0.000004
13 12.473 -0.002 0 0 0.00000 0.000004
14 12.472 -0.003 0 0 0.00000 0.000009
15 12.469 -0.006 0 0 0.00003 0.000036
16 12.468 -0.007 0 0 0.00004 0.000049
17 12.466 -0.009 0 0 0.00008 0.000081
18 12.465 -0.01 0 0.0001 0.00010 0.000100
19 12.465 -0.01 0 0.0001 0.00010 0.000100
20 12.45 -0.025 0 0.0006 0.00062 0.000625
Sums: 249.504 0 0.001 0.0024 0.00271 0.002778
S 0.007255 0.011239 0.011943 0.012092
To finish with this data set, we note that the mode (the most frequent observation) is 12.475g,
while the median is given by the average of observations 10 and 11:
Median = (12.475 + 12.474)/2 = 12.4745 g = 12.475 g (rounded)
In this case, the Mean, Mode and Median are the same.
The MAD of this set of measurements is the median of the absolute deviations from the
median. The table below gives a sorted list of these mass measurements and their absolute
deviations from the median.
Observation    x_i       |x_i - median(X)|
1 12.513 0.038
2 12.450 0.025
3 12.485 0.010
4 12.485 0.010
5 12.465 0.010
6 12.465 0.010
7 12.466 0.009
8 12.482 0.007
9 12.468 0.007
10 12.481 0.006
11 12.481 0.006
12 12.469 0.006
13 12.472 0.003
14 12.477 0.002
15 12.473 0.002
16 12.473 0.002
17 12.474 0.001
18 12.475 0.000
19 12.475 0.000
20 12.475 0.000
Since the median of the absolute deviations is the average of observations 10 and 11, we can
see that in this case MAD = 0.006 g. Our estimate of the population standard deviation will
be σ = 0.006/0.6745 = 8.895×10⁻³ g, or 0.009 g (rounded to the same number of decimal places as the original measurements). This is somewhat lower than the sample standard
deviation calculated from the measurements. The estimate derived from the MAD is likely to
be closer to the true value, as it effectively ignores the values furthest away from the median,
which have a disproportionate effect on the calculation of the sample standard deviation
(because the deviation from the mean is squared.)
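A minimal sketch of the MAD calculation (Equation 2) and the MAD-based estimate of σ for the Example 1 masses; note that the worked table above takes deviations from the rounded median (12.475 g), so its MAD of 0.006 g is slightly smaller than the value obtained from the unrounded median:

import statistics

masses = [12.475, 12.469, 12.481, 12.466, 12.474, 12.465, 12.475, 12.473,
          12.481, 12.472, 12.482, 12.475, 12.485, 12.473, 12.465, 12.485,
          12.468, 12.477, 12.450, 12.513]

med = statistics.median(masses)          # 12.4745 g, reported as 12.475 g above
abs_devs = [abs(x - med) for x in masses]
mad = statistics.median(abs_devs)        # Equation (2)
sigma_est = mad / 0.6745                 # robust estimate of the population standard deviation
print(f"median = {med:.4f} g, MAD = {mad:.4f} g, sigma estimate = {sigma_est:.4f} g")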
NOTE ON ROUNDING:
While the results of your data manipulations should be presented to the same precision as the
original data, avoid the trap of rounding too soon. Only round your data at the last stage of
the calculation, keeping any extra significant figures until the final reading. This can be
illustrated clearly in the alternative calculation of standard deviation above, where the answer
depends on the small difference between two very large numbers. If these were rounded to,
say, three decimal places the result would be erroneous. Similarly, in the table on the
previous page, the differences x_i - x̄ have three decimal places, so when squared require six
decimal places. If these squares are rounded to fewer than six places, the final answer will
almost certainly be wrong.
In addition, if you are using a calculated value in subsequent calculations, use the unrounded
value if possible to preserve the precision in the final answer.
NOTE ON UNITS:
Do not forget to state the units of a measurement. In the example above, the units of the
variable are grams (g), and it is incorrect to state a result without these units. The units for
standard deviation are always the same as the units of the original measurement. You should
note, however, that measures of dispersion such as the RSD and Coefficient of Variation are
always dimensionless, as you are dividing one unit by the same unit (as the standard deviation
and mean have the same units). You will lose marks in examinations if you leave out the units
(or add units to a dimensionless quantity).
Figure 2 shows the effect of an outlier value on the mean, median and mode using the dataset
shown above. The data has been generated by removing one of the readings at 12.475g and
replacing it with the value shown on the x axis. The mean and mode are shown to be
seriously affected by the outlier, while the median value is unaffected. The mode is
particularly affected in this case as there are five values which then occur twice in the
remaining dataset. If the outlier value is one of these five values, then a single value of the
mode is defined, otherwise the dataset is multi-modal.
[Figure 2: central value (g) plotted against the outlier value (g), with separate curves for the mean, median and mode.]
Propagation of Errors
Many measurements are composite; that is, they are the result of combining more than one
measurement into a single value. For example, the density of a substance can be determined
by dividing the mass by the volume. Since the mass and volume measurements each have
their own errors, if we are to estimate the error present in the density measurement, we need
the know how to combine errors.
Random errors tend to cancel each other out (consider the drunkard's walk), but systematic errors are vectors which do not cancel out. Thus, the propagation of systematic and random errors is undertaken in slightly different ways.
Consider the trivial example where the final result of an experiment, x, is given by the
equation:
x = a + b
If a and b each have a systematic error of +1 then the whole systematic error of x is +2. On
the other hand, for a and b each having an indeterminate error of ±1 the random error in x is not ±2, for there will be occasions when the random error in a is positive while that in b is
negative and vice versa.
In addition, if we have measuring instruments that have a finite resolution, we can use the
propagation of determinate errors to determine the resolution of a composite measurement. In
this context, the resolution of an instrument is typically half of the smallest scale reading. In
the case of a ruler marked in millimetres, the resolution would be ±0.5 mm.
Propagation of Determinate Errors
Figure 3 shows how an error in a length measurement rapidly propagates, as length is used to
determine area and then volume. In many systems a small error in a critical factor results in a
disproportionate bias in the final response. Error analysis of such a system enables key factors
to be identified and hence controlled. Error analysis undertaken at the beginning of an
experiment will often highlight errors which need to be reduced, and sometimes
experimental workers will change their design when error analysis highlights that their efforts
are concentrated in the wrong place. Error analysis should be part of experimental design and
planning at the early stages.
For most practical applications the contribution of an uncertainty in a factor towards the
uncertainty in the final response is obtained by the product of the (partial) differential of the
response with respect to the factor and the uncertainty, or error, in the respective factor. For a
two factor system:
y = f(x, z)

where the uncertainties in x and z are δx and δz. If

y = xz

then

y + \delta y = (x + \delta x)(z + \delta z) = xz + x\,\delta z + z\,\delta x + \delta x\,\delta z

giving

\delta y = x\,\delta z + z\,\delta x + \delta x\,\delta z

Since δxδz will be very small, we can discard it, giving:

\delta y = x\,\delta z + z\,\delta x

but since

z = \left(\frac{\partial y}{\partial x}\right)_z \quad\text{and}\quad x = \left(\frac{\partial y}{\partial z}\right)_x

then
Figure 3. Accumulation of Errors in Composite Measurements. [A small error δx in a length measurement (x ± δx) leads to a larger error in an area estimate (error 2xδx + δx²) and a still larger error in a volume estimate (error 3x²δx + 3xδx² + δx³).]
\delta y = \left(\frac{\partial y}{\partial x}\right)_z \delta x + \left(\frac{\partial y}{\partial z}\right)_x \delta z        Equation (7)
For a multivariate system, y = f(x_1, x_2, \ldots, x_n):

\delta y = \frac{\partial y}{\partial x_1}\,\delta x_1 + \frac{\partial y}{\partial x_2}\,\delta x_2 + \cdots + \frac{\partial y}{\partial x_n}\,\delta x_n        Equation (8)
For resolution calculations:

\delta y = \left|\frac{\partial y}{\partial x_1}\right|\delta x_1 + \left|\frac{\partial y}{\partial x_2}\right|\delta x_2 + \cdots + \left|\frac{\partial y}{\partial x_n}\right|\delta x_n        Equation (9)
Equation (9) above outlines the calculation of an experimental resolution. Measurements and
their associated calculations have a resolution determined by the sensitivity of the instruments
used. Unless explicitly stated it may be taken as half the scale division of the instrument
concerned. Equation (9) is used to determine the experimental resolution at the levels for
which measurements were made.
Example 3
Determination of the density, ρ, of the material weighed in Example 1. The volume of the material was measured five times. The data obtained were: 6.0, 6.0, 5.8, 5.7 and 6.3 cm³.
This data may be summarised as:
x̄ = 5.98 cm³, S_V = 0.239 cm³, n = 5 and the resolution = 0.05 cm³
Or, rounded:
x̄ = 6.0 cm³, S_V = 0.2 cm³, n = 5 and the resolution = 0.05 cm³
The density of the material is determined from Equation (10):

\rho = \frac{m}{V}        Equation (10)
The resolution of the experiment is given by:

\delta\rho = \left(\frac{\partial\rho}{\partial m}\right)_V \delta m + \left(\frac{\partial\rho}{\partial V}\right)_m \delta V

since

\left(\frac{\partial\rho}{\partial m}\right)_V = \frac{1}{V} \quad\text{and}\quad \left(\frac{\partial\rho}{\partial V}\right)_m = -\frac{m}{V^2}

then
\delta\rho = \frac{\delta m}{V} + \frac{m\,\delta V}{V^2}

Since δm = 5×10⁻⁴ g, δV = 5×10⁻² cm³, m = 12.475 g and V = 6.0 cm³, then

\delta\rho = \frac{5\times 10^{-4}}{6.0} + \frac{12.475\times 5\times 10^{-2}}{6.0^2} = 1.74\times 10^{-2}\ \mathrm{g\,cm^{-3}}

The resolution of the density determination experiment was 0.017 g cm⁻³.
Determinate errors caused by external factors such as temperature, pressure, relative humidity
or supply voltage for example should be corrected for using Equation (8). It is important not
to confuse the calculation of an experimental resolution with the calculation of bias or error
resulting from a system operating outside its calibrated parameters.
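As a sketch of how Equation (9) was applied in Example 3, the function below propagates the instrument resolutions through ρ = m/V; the function name and layout are illustrative only:

def density_resolution(m, v, dm, dv):
    """Propagate instrument resolutions through rho = m / V (Equation 9)."""
    d_rho_dm = 1.0 / v          # partial derivative of rho with respect to m
    d_rho_dv = -m / v ** 2      # partial derivative of rho with respect to V
    return abs(d_rho_dm) * dm + abs(d_rho_dv) * dv

# Values quoted in Example 3: m = 12.475 g, V = 6.0 cm^3,
# balance resolution 5e-4 g, volume resolution 5e-2 cm^3
print(f"{density_resolution(12.475, 6.0, 5e-4, 5e-2):.4f} g cm^-3")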
Propagation of Indeterminate Errors
There are a variety of treatments for analysing the propagation of indeterminate errors and
these fall under the headings of linear combinations and multiplicative expressions. Examples
of these treatments may be found in Statistics and Chemometrics for Analytical Chemistry, 6th Edition, by J.N. Miller and J.C. Miller. Firstly, we will consider the case where the derived value is a function of a single variable, as shown in Equation (11):

For y = f(x):\qquad S_y = S_x\,\frac{dy}{dx}        Equation (11)
Note the similar form of Equation (11) to Equation (9).
In some cases variance is used in the place of standard deviation in Equation (11) to give an
estimate for the variance in the derived quantity, as shown in Equation (12):
For y = f(x):\qquad S_y^2 = S_x^2\left(\frac{dy}{dx}\right)^2        Equation (12)
NOTE: \left(\frac{dy}{dx}\right)^2 is the square of the first differential and is not the same as \frac{d^2y}{dx^2}, the second differential.
And for y = f(x_1, x_2, \ldots, x_n):

S_y^2 = \left(\frac{\partial y}{\partial x_1}\right)^2 S_{x_1}^2 + \left(\frac{\partial y}{\partial x_2}\right)^2 S_{x_2}^2 + \cdots + \left(\frac{\partial y}{\partial x_n}\right)^2 S_{x_n}^2        Equation (13)
Equations (12) and (13) are strictly applicable only for linear functions of the form:
y = k_0 + k_1 x_1 + k_2 x_2 + \cdots + k_n x_n

Since

\frac{\partial y}{\partial x_1} = k_1,\quad \frac{\partial y}{\partial x_2} = k_2,\quad \ldots,\quad \frac{\partial y}{\partial x_n} = k_n

Then:

S_y^2 = k_1^2 S_{x_1}^2 + k_2^2 S_{x_2}^2 + \cdots + k_n^2 S_{x_n}^2

S_y = \sqrt{k_1^2 S_{x_1}^2 + k_2^2 S_{x_2}^2 + \cdots + k_n^2 S_{x_n}^2}        Equation (14)
For multiplicative combinations, the treatment is slightly different:
y = \frac{k\,x_1 x_2}{x_3 x_4}

\left(\frac{S_y}{y}\right)^2 = \left(\frac{S_{x_1}}{x_1}\right)^2 + \left(\frac{S_{x_2}}{x_2}\right)^2 + \left(\frac{S_{x_3}}{x_3}\right)^2 + \left(\frac{S_{x_4}}{x_4}\right)^2        Equation (15)
When we have a combination such as:

y = x^n

we do not treat this as x × x × … × x, as the errors in x are no longer independent (the error is the same for each occurrence of x). In this case, we would use a term:

\left(\frac{S_y}{y}\right)^2 = \left(\frac{n\,S_x}{x}\right)^2

This arises from Equation (12).
An additional treatment of Equation (15) can be applied to the coefficients of variation of the
different functions, resulting in Equation (16):
C_y^2 = C_{x_1}^2 + C_{x_2}^2 + \cdots + C_{x_n}^2        Equation (16)

where C_i is the coefficient of variation of i.
With such a range of possible methods for determining the propagation of indeterminate error
through an experiment it is important that you are consistent in your treatment of data and
that you clearly state how you have determined your estimates for mean and standard
deviation. Whatever method you choose, it is important to ensure that the original data is
summarised so that readers may use it to perform their own calculations, if they so choose.
For many functions and applications Equation (15) yields an estimate for standard deviation
that is larger than other treatments. This is a prudent measure to adopt, as it means that you
will not be underestimating the variance in your measurement.
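A minimal sketch, assuming the simple linear and multiplicative forms above: the helpers below implement Equation (14) and Equation (15); the names and the example numbers are illustrative only:

import math

def sd_linear(ks, sds):
    """S_y for y = k0 + k1*x1 + ... + kn*xn (Equation 14)."""
    return math.sqrt(sum((k * s) ** 2 for k, s in zip(ks, sds)))

def relative_sd_multiplicative(values, sds):
    """S_y / y for purely multiplicative combinations (Equation 15)."""
    return math.sqrt(sum((s / x) ** 2 for x, s in zip(values, sds)))

# Linear example: y = x1 - x2 with equal standard deviations of 0.006
print(sd_linear([1, -1], [0.006, 0.006]))            # about 0.0085

# Multiplicative example: rho = m / V with illustrative standard deviations
m, s_m, v, s_v = 12.475, 0.012, 6.0, 0.2
s_rho = (m / v) * relative_sd_multiplicative([m, v], [s_m, s_v])
print(s_rho)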
Example 4
A common operation is the subtractive measurement of the mass of some material. In this
case, a vessel is weighed empty and then weighed with the material. The mass of material is
then determined by subtraction. What then is the indeterminate error in the weight of
material?
Firstly, we must distinguish between repeated weighings, where the same object is weighed a
number of times using the same instrument, and replicate determinations, where the analysis
is repeated a number of times. In the first case the repeated measurements allow us to
estimate the indeterminate error in the instrument, while the second case allows us to estimate
the indeterminate error in the entire analysis procedure, including the weighing instrument.
In the case of repeated weighings, we should note that the repeated weighings of the empty
and full vessel are not paired. That is, we cannot make an estimate of the standard deviation
of the weight of precipitate by subtracting the first, then second, then third etc empty and full
weights, then working out the standard deviation of these values. The example below will
illustrate this:
Observation    Full weight (g)    Empty weight (g)    Difference (g)
1 12.465 9.965 2.500
2 12.466 9.966 2.500
3 12.482 9.982 2.500
4 12.468 9.968 2.500
5 12.481 9.981 2.500
6 12.481 9.981 2.500
7 12.469 9.969 2.500
8 12.472 9.972 2.500
9 12.477 9.977 2.500
10 12.473 9.973 2.500
Mean 12.473 9.973 2.500
Standard deviation 0.006 0.006 0.000
The standard deviations of the full and empty weights are both 0.006 g, which we can take to
be the indeterminate error in the weighing scales. The standard deviation of the difference,
however, is zero. This cannot be correct, as we know that the weighing scales have a
significant indeterminate error. This arises because of the false pairing of the weighings. If
we take a different order of weighings, the results are different:
Observation    Full weight (g)    Empty weight (g)    Difference (g)
1 12.465 9.965 2.500
2 12.466 9.966 2.500
3 12.482 9.968 2.514
4 12.468 9.969 2.499
5 12.481 9.972 2.509
6 12.481 9.973 2.508
7 12.469 9.969 2.500
8 12.472 9.977 2.495
9 12.477 9.981 2.496
10 12.473 9.981 2.492
Mean 12.473 9.972 2.501
Standard deviation 0.006 0.006 0.007
Now we can see that the difference has a non-zero standard deviation, which is different from
the standard deviations of the full and empty weighings. How do we make a good estimate of
the standard deviation of the difference?
The relationship between the mass of material and the mass of the full and empty vessels is given by:

m_m = m_1 - m_2

where m_1 is the mass of the vessel plus material and m_2 is the mass of the vessel alone. This is a linear combination, so we can use Equation 14 to estimate the indeterminate error in the mass of material. In this case, k_0 = 0, k_1 = 1, k_2 = -1, k_{3..n} = 0 and x = m. Substituting these parameters in Equation 14 gives:
S_{m_m}^2 = k_1^2 S_{m_1}^2 + k_2^2 S_{m_2}^2 = (1)^2 S_{m_1}^2 + (-1)^2 S_{m_2}^2 = S_{m_1}^2 + S_{m_2}^2
In general, if using the same instrument to perform the measurements, then:

S_{m_1} = S_{m_2}

So:

S_{m_m} = \sqrt{2}\, S_{m_1}

That is, the error in the final mass measurement is larger by a factor of √2 than the individual
errors in the mass measurements. We can see why this should be by noting that the errors in
the two measurements are independent of each other. In other words, the indeterminate error
in one measurement has no effect on the magnitude of the error in the other measurement. In
more mathematical language, we can consider that the errors are orthogonal. This can be
illustrated graphically:
So, for our examples above, the standard deviations of the full and empty weights are both
0.006 g. Our estimate of the standard deviation of the mass of precipitate is then:
S_{m_m} = \sqrt{S_{m_1}^2 + S_{m_2}^2} = \sqrt{0.006^2 + 0.006^2} = 0.0085\ \mathrm{g}
[Figure: S_{m_1} and S_{m_2} drawn as orthogonal vectors, with S_{m_m} as the resultant.]
If the values were paired, then it would be appropriate to use the standard deviation of the
differences to estimate the indeterminate error in the mass of precipitate.
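A small simulation sketch of this point: when the full and empty weighings are independent, the spread of their difference comes out close to √2 times the spread of a single weighing (Gaussian instrument noise and all numbers assumed for illustration):

import random
import statistics

random.seed(1)
true_full, true_empty, instrument_sd = 12.473, 9.973, 0.006

diffs = []
for _ in range(10000):
    full = random.gauss(true_full, instrument_sd)    # independent weighing of the full vessel
    empty = random.gauss(true_empty, instrument_sd)  # independent weighing of the empty vessel
    diffs.append(full - empty)

print(f"S of the difference ~ {statistics.stdev(diffs):.4f} g "
      f"(sqrt(2) x 0.006 = {0.006 * 2 ** 0.5:.4f} g)")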
Distributions
The data in Example 1 may be plotted as number of occurrences against observed value. Such
a graph is known as a frequency distribution or frequency histogram (Figure 4a). If the data
are plotted as summed number of occurrences against observed value the graph is known as a
cumulative frequency distribution curve, Figure 4b.
The standard deviation (S) provides a measure of the spread of data, but it does not
necessarily indicate the way in which the data are distributed. In the example above the data
are clustered around the centre of the distribution, which is the mean. We could in theory
make an infinite number of mass measurements and so we could completely define the values
of recorded mass.
The infinite set of possible observations is the population. The 20 measurements taken are a sample of the population. If there are no determinate errors then the mean of the population is the true value of the mass. The mean of the population is denoted by μ. Similarly the standard deviation of the population would be a measure of the true distribution, and is denoted by σ.
The true mean of a distribution is given by the symbol μ, and is the maximum of the probability density function, its formal definition being given by Equation (17):

\mu = \int_{-\infty}^{+\infty} x\, f(x)\, dx        Equation (17)

A measurement is an estimate of the probability density function of the observed variable. The position of the measurement is given by an estimate of the mean, while the shape, or spread, of the distribution is provided by an estimate for the standard deviation:
x̄ is an estimate for μ.
S is an estimate for σ.
All of the statistical tests described in these notes are parametric tests; they make the
assumption that the data follow a particular distribution, in most cases the Normal or
Gaussian distribution.
As more observations are made of a variable so a continuous distribution begins to be
defined. As the number of observations increases so too does the quality of the definition of
the distribution until, when an infinite number of observations are available, the distribution
is completely defined.
If the distribution curve obtained from an infinite number of observations is normalised so
that the area under it is equal to 1 then the function that describes that distribution is known
as the probability density function (PDF). The area under the probability density function is 1
(by definition) and is the probability of observing x over all possible values:
Figure 4. (a) Frequency distribution of the observed masses (number of observations against observed mass, g). (b) Cumulative frequency distribution of the observed masses (cumulative number of observations against observed mass, g).
P(x) = \int_{-\infty}^{+\infty} F(x)\,dx = 1        Equation (18)

Consequently the area under the curve between two values of x is the probability of observing a value of x in the defined range, see Figure 5 and Equation (19):

P(x \rightarrow x+\delta x) = \int_{x}^{x+\delta x} F(x)\,dx        Equation (19)
There are many types of probability density function, but the three most important are:
Normal distribution, also known as a Gaussian distribution
Poisson distribution, also known as a Stochastic distribution
Binomial distribution
The probability density function of most physical measurements can be described or
approximated by a normal distribution. Other important distributions include the χ², exponential and bivariate distributions. It should be emphasised, though, that non-normality is rare and that some distributions which are not normal may be rendered normal by taking the logarithm of the variable. In addition, binomial distributions approximate towards a normal distribution for large numbers of observations.
A normal distribution is defined by μ and σ and the general form is given in Equation (20); Figure 5 summarises the form of a normal distribution with a mean of 0 and a standard deviation of 1 (N(0,1)).
f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}        Equation (20)
For the normal distribution, the area under the curve bounded by ±1σ will contain approximately 68.3% of the population, increasing the bounded area to ±2σ will draw in approximately 95.5% of the population, while a further increase to ±3σ will increase the proportion of the population to approximately 99.7%, as illustrated in Figure 6.
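These coverage figures can be checked from the standard normal cumulative distribution function, which can be written with math.erf from the Python standard library; a minimal sketch:

import math

def normal_cdf(z):
    """Cumulative probability of the standard normal distribution N(0,1)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for k in (1, 2, 3):
    coverage = normal_cdf(k) - normal_cdf(-k)   # probability of lying within +/- k sigma
    print(f"+/-{k} sigma: {coverage * 100:.1f}% of the population")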
Figure 5. A Normal distribution having a mean of 0 and a standard deviation of 1.
(N(0,1)).
Figure 6. Areas under the Normal distribution for ±1σ, ±2σ and ±3σ.
Another important distribution is the log-normal. This is often found in cases where the
variable cannot take values below a particular limit, unlike the normal distribution which is defined from -∞ to +∞. Examples include aerosol particle size distributions and antibody
concentrations in blood, where the variable cannot go below zero. The log-normal
distribution can be converted into a normal distribution by taking the log of the variable, as
shown in Figure 7.
Figure 7. The Log-Normal distribution and its transformation into a Normal
distribution.
The final distribution we will consider is the t-distribution. The derivation of the t-
distribution was first published in 1908 by William Sealy Gosset, who worked at the
Guinness Brewery in Dublin. He was not allowed to publish under his own name, so the
paper was written under the pseudonym Student. The t-distribution and the associated
theory became well-known through the work of R.A. Fisher, who called the distribution
Student's distribution.
Gosset studied the distribution of:

T_n = \frac{\bar{x} - \mu}{S_n/\sqrt{n}}

and showed it was of the form:

f(x) = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\!\left(\frac{\nu}{2}\right)}\left(1 + \frac{x^2}{\nu}\right)^{-\frac{\nu+1}{2}}

where ν = n - 1. ν is known as the degrees of freedom of the variable. In this case it is one less than the number of observations. This is because if we know the mean we can determine any observation from the remaining n - 1 observations. In other words, we only have n - 1 independent observations. The t-distribution is independent of μ and σ, so no estimate of σ is required; S can be used instead. For large n, the t-distribution tends towards the normal
distribution, while at low n the tails of the t-distribution are higher than those of the normal
distribution. Figure 8 shows the normal distribution (black line) against the t distribution for
degrees of freedom of 1, 2, 3, 4, 5, 10, 15, 20 and 30. The t distribution is of use in
establishing probabilities where the number of observations is less than ~30, while the normal
distribution can be used where the number of observations is > 30 as the difference between
the two distributions is then insignificant.
Figure 8. The Normal distribution compared to the t distribution for degrees of freedom
of 1, 2, 3, 4, 5, 10, 15, 20 and 30.
Sampling Distributions
If a limited set of observations is made on a variable, a range of values is obtained and from these the sample mean and standard deviation can be obtained. It is unlikely that the sample mean is equal to the true value for the mean, μ, and similarly that the sample standard deviation S is equal to the true standard deviation σ.
Furthermore, if another set of readings is taken giving new values for the sample mean and standard deviation it is unlikely that these new values obtained in a second set would agree with those obtained from a first set of observations.
If this process is repeated a distribution for x̄ is obtained, and this distribution is the sampling distribution of the mean. The sampling distribution of the mean has a mean equal to that of the original population. Its standard deviation, however, is different and is referred to as the standard error of the mean, defined in Equation (21):

\mathrm{SEM} = \frac{\sigma}{\sqrt{n}}        Equation (21)

where SEM is the standard error of the mean, σ is the standard deviation of the observed variable's PDF and n is the number of observations per sample.
The obvious corollary to this definition is that the more measurements that are used to define
a variable the more precise the result will be.
The Central Limit Theorem
The central limit theorem is important and lies at the centre of many statistical techniques
applied to experimental data. It may be summarised as follows.
If we take samples randomly from a non-normally distributed population, the resulting
distribution of the mean becomes more normally distributed as the sample size increases.
More generally, the central limit theorem states that the distribution of a sum or average of
many random quantities is close to normal. This is true even if the quantities are not
independent (as long as they are not too strongly associated) and even if they have different
distributions (as long as no one random quantity is so large it outweighs all the others). It is
also only true if the underlying distributions have a finite standard deviation.
The central limit theorem suggests why normal distributions are found to be common models
for observed data. Any measurement that is the sum of many small random influences will
have an approximately normal distribution, even if the distributions for these individual
influences are not normally distributed. This theorem is also important because many
statistical tests assume that data are normally distributed, and the central limit theorem shows
that normal or near-normal distributions occur naturally.
To summarise, the sample mean of a random sample of a population having a mean μ and a standard deviation σ will have:

\mu(\bar{x}) = \mu

\sigma(\bar{x}) = \frac{\sigma}{\sqrt{n}}

The sample mean is an unbiased estimator of the population mean μ. In addition, the central limit theorem indicates that for large samples, the sampling distribution of x̄ is approximately normally distributed with a mean of μ and a standard deviation of σ/√n.
Figure 9 illustrates how increasing sample size changes the observed distribution of even non-
normally distributed variables to a more normal distribution.
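A small simulation sketch of the central limit theorem, drawing samples from a uniform (decidedly non-normal) population and watching the spread of the sample means follow σ/√n; all parameters are illustrative:

import random
import statistics

random.seed(0)
# Uniform population on [0, 1): mu = 0.5, sigma = 1/sqrt(12), clearly non-normal
sigma = (1 / 12) ** 0.5
for n in (1, 5, 30):
    sample_means = [statistics.mean(random.random() for _ in range(n))
                    for _ in range(5000)]
    print(f"n = {n:2d}: SD of sample means = {statistics.stdev(sample_means):.4f}, "
          f"sigma/sqrt(n) = {sigma / n ** 0.5:.4f}")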
Accuracy and Precision
Accuracy is defined by the trueness of a measurement. An accurate measurement is one which produces a value for x̄ equal to μ, without any systematic error.
A precise measurement produces a value for S that is as close to zero as possible. Indeterminate error may arise from the variable under observation as well as from the measurement technique being applied. We assume that the variable under observation has no variance associated with it. If the variable does have some variance, then a precise measurement would produce a sample standard deviation (S) as close as possible to the underlying population standard deviation (σ) of the observed variable.
It is important to remember that precision and accuracy describe two different properties of a
measurement, and that precise data does not imply accurate data and vice-versa. Accuracy is
tested by calibration and validation methods, in particular through the analysis of certified
national and international standards.
Precision, that is indeterminate error, is best reported through the use of confidence limits at a
specified level of probability. The use of confidence limits has much to recommend it, in that
it explicitly states that indeterminate errors are being reported, whereas a bold statement
giving a range may give the impression that it represents a determinate error.
A clear and straightforward format for reporting confidence limits would be:
x̄ ± E at the P% confidence limit for n measurements
Figure 10 shows the relationship between the form of the probability distribution function and
the terms accurate and precise.
Confidence Limits
Confidence limits define a range with a probability that the true value, μ, lies within it. The format is:

\mu = \bar{x} \pm E at the P% confidence level for n measurements.        Equation (22)

Confidence limits are determined by the value for the standard deviation. However, as shown previously, for a small sample (n < ~30) the estimate, S, is unlikely to be an accurate assessment of the true value, σ. Consequently, Student's t-distribution is used to derive a value for E:

E = t_{P,\,n-1}\,\frac{S}{\sqrt{n}}        Equation (23)
Example 2
Figure 10. Accuracy and Precision. [Four distributions of normalised number of observations against observed value: A, precise and accurate; B, imprecise and accurate; C, precise and inaccurate; D, imprecise and inaccurate.]
The mass measurements used in Example 1 have been summarised as:
N = 20, x̄ = 12.475 g, S = 0.012 g
The data are rounded to the resolution of the original measurement.
Table 1 is a copy of a table of Student's t-distribution for different percentage confidence limits. This is similar in layout to Table 7 (page 17) in Murdoch and Barnes, with the addition of column headings for 2α, the two-tailed probability. From Table 1 it can be seen that for ν = n - 1 = 19 the values for t are 2.093, 2.861 and 3.883 for the 95, 99 and 99.9% probabilities respectively. We use the two-tailed value in this case (use the column with the appropriate 2α value: 95% confidence is 2α = 0.05, 99% is 2α = 0.01, etc.). The calculation of the 95% confidence limits would be as follows:

\mathrm{Mass} = 12.475 \pm 2.093 \times \frac{0.012}{\sqrt{20}}

Mass = 12.475 ± 0.0056 g at the 95% confidence limit for 20 measurements
In this case the error is reported to two significant figures as rounding to the same resolution
as the original mass measurement would result in a significantly different value of
probability.
The confidence level simply says that there is 95%, 99% or 99.9% confidence that the true value lies within the specified confidence limits. In other words, there is less than a 1 in 20 (for 95%), 1 in 100 (for 99%) or 1 in 1000 (for the 99.9% confidence level) chance that the true value lies outside the confidence limits. The drawback to using higher confidence levels is that the confidence limits have to be drawn wider (the t value increases).
If the number of observations is > 30, we can use the Normal distribution instead of the t
distribution, as the two are virtually the same. Thus, for 95, 99 and 99.9% confidence levels
the appropriate values would be 1.960, 2.576 and 3.291, as shown in the last row of Table 1,
where the Normal and t-distributions are the same.
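A minimal sketch of the Example 2 calculation; the t value is read from Table 1 (ν = 19, 2α = 0.05) rather than computed, so only the standard library is needed:

import math

n, mean, s = 20, 12.475, 0.012
t_crit = 2.093                            # Table 1: two-tailed, nu = n - 1 = 19, 2*alpha = 0.05

half_width = t_crit * s / math.sqrt(n)    # E from Equation (23)
print(f"Mass = {mean:.3f} +/- {half_width:.4f} g "
      f"at the 95% confidence level for {n} measurements")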
Table 1. Critical Values for t-Test
ν      α = 0.1   0.05    0.025   0.01    0.005   0.0025   0.001    0.0005
      2α = 0.2   0.1     0.05    0.02    0.01    0.005    0.002    0.001
1 3.078 6.314 12.706 31.821 63.657 127.321 318.309 636.619
2 1.886 2.920 4.303 6.965 9.925 14.089 22.327 31.599
3 1.638 2.353 3.182 4.541 5.841 7.453 10.215 12.924
4 1.533 2.132 2.776 3.747 4.604 5.598 7.173 8.610
5 1.476 2.015 2.571 3.365 4.032 4.773 5.893 6.869
6 1.440 1.943 2.447 3.143 3.707 4.317 5.208 5.959
7 1.415 1.895 2.365 2.998 3.499 4.029 4.785 5.408
8 1.397 1.860 2.306 2.896 3.355 3.833 4.501 5.041
9 1.383 1.833 2.262 2.821 3.250 3.690 4.297 4.781
10 1.372 1.812 2.228 2.764 3.169 3.581 4.144 4.587
11 1.363 1.796 2.201 2.718 3.106 3.497 4.025 4.437
12 1.356 1.782 2.179 2.681 3.055 3.428 3.930 4.318
13 1.350 1.771 2.160 2.650 3.012 3.372 3.852 4.221
14 1.345 1.761 2.145 2.624 2.977 3.326 3.787 4.140
15 1.341 1.753 2.131 2.602 2.947 3.286 3.733 4.073
16 1.337 1.746 2.120 2.583 2.921 3.252 3.686 4.015
17 1.333 1.740 2.110 2.567 2.898 3.222 3.646 3.965
18 1.330 1.734 2.101 2.552 2.878 3.197 3.610 3.922
19 1.328 1.729 2.093 2.539 2.861 3.174 3.579 3.883
20 1.325 1.725 2.086 2.528 2.845 3.153 3.552 3.850
21 1.323 1.721 2.080 2.518 2.831 3.135 3.527 3.819
22 1.321 1.717 2.074 2.508 2.819 3.119 3.505 3.792
23 1.319 1.714 2.069 2.500 2.807 3.104 3.485 3.768
24 1.318 1.711 2.064 2.492 2.797 3.091 3.467 3.745
25 1.316 1.708 2.060 2.485 2.787 3.078 3.450 3.725
26 1.315 1.706 2.056 2.479 2.779 3.067 3.435 3.707
27 1.314 1.703 2.052 2.473 2.771 3.057 3.421 3.690
28 1.313 1.701 2.048 2.467 2.763 3.047 3.408 3.674
29 1.311 1.699 2.045 2.462 2.756 3.038 3.396 3.659
30 1.310 1.697 2.042 2.457 2.750 3.030 3.385 3.646
40 1.303 1.684 2.021 2.423 2.704 2.971 3.307 3.551
60 1.296 1.671 2.000 2.390 2.660 2.915 3.232 3.460
120 1.289 1.658 1.980 2.358 2.617 2.860 3.160 3.373
∞   1.282 1.645 1.960 2.326 2.576 2.807 3.090 3.291
The α values are used for the one-tailed test, and the 2α values for the two-tailed test.
Statistical Tests
Indeterminate error means that it is unlikely that the mean observed value of a variable will
exactly agree with a previous set of observations, or the mean derived from an alternative
measurement technique. In evaluating data a decision often needs to be made as to whether
the difference is real, or whether it is due to indeterminate error in each of the measurements. When the difference between means is small compared to their respective standard deviations, statistical tests can be used to support the judgment of the analyst.
Statistical tests are no replacement for common sense. Sometimes statistical tests result in
non-sensible conclusions, and in such cases it is up to the experimental worker to decide the
result. In many cases, the decision will be to seek more experimental data.
The statistical tests described here are:
Kolmogorov-Smirnov and Lilliefors tests, used to determine the probability that a set of
observations is drawn from a particular distribution.
F-tests, used for the comparison of standard deviations.
t-tests, which are used to compare means.
Before we can perform any statistical test, we must first establish a null hypothesis (H₀). This is an exact statement of something we initially suppose to be true. For example, we may propose that the means of two samples are the same, and that any observed difference is a result of random error:

H_0: \bar{x}_1 = \bar{x}_2

The alternate hypothesis (H₁) is not simply the opposite of H₀. In this case, there are three possibilities:

H_1: \bar{x}_1 < \bar{x}_2
H_1: \bar{x}_1 > \bar{x}_2
H_1: \bar{x}_1 \neq \bar{x}_2
The first two alternate hypotheses are one-tailed. The third alternate hypothesis is two-tailed.
We use one-tailed tests where the direction of difference is important, while two-tailed tests
are used where the direction of difference is unimportant. We also need to establish
beforehand the tails of the test. For example, if we were performing a clinical trial of a drug
designed to reduce blood pressure the null hypothesis would be that the blood pressure before
and after treatment would be the same, while the alternate hypothesis would be that the mean
blood pressure after treatment would be lower than before treatment. In this case we would
use a one-tailed test because we have an expectation before performing the trial that the
direction of difference would be important.
Alternatively, if we are just testing whether two analytical methods give the same result, the
alternate hypothesis would be simply that the two results are different, and we would use a
two-tailed test.
As with the setting of confidence limits, we also need to set an appropriate confidence level
before performing the test. The confidence level (P) is expressed as a percentage, and can be
related to the probability (α) in the following ways:
For a one-tailed test:

P = (1 - \alpha)\times 100\% \quad\text{or}\quad \alpha = 1 - \frac{P}{100\%}

For a two-tailed test:

P = (1 - 2\alpha)\times 100\% \quad\text{or}\quad \alpha = \frac{1}{2}\left(1 - \frac{P}{100\%}\right)
There are two types of possible errors when performing statistical tests:
Type 1: rejection of the null hypothesis even though it is in fact true
Type 2: acceptance of the null hypothesis even though it is false
Figure 11 illustrates the meaning of the two types of error. The solid line shows the sampling
distribution of the mean if the null hypothesis (the exact statement) is true. The dashed line
shows a possible distribution that fits the alternate hypothesis.
Figure 11. Illustration of Type I and Type II Errors.
Reducing the chance of a Type 1 error increases the chance of a Type 2 error, and conversely
reducing the chance of a Type 2 error increases the chance of a Type 1 error. We can see in
Figure 11 that if we increase x_c we reduce the area (or probability) of the Type 1 error, but
at the expense of increasing the area (probability) of the Type 2 error. Only by increasing the
sample size can we reduce the chances of both types of error. This is because we then reduce
the standard error of the mean. Figure 12 illustrates this concept.
Figure 12. Increasing sample size to reduce both Type 1 and Type 2 errors.
Tests for Normality
If we look at the weight data given earlier, we can plot it as a frequency distribution (Figure
4a). We can see that the data appears at least approximately normally distributed. More
rigorously, we can create a fractional cumulative frequency distribution and plot this against
the value normalised to the normal distribution with a mean of 0 and a standard deviation of 1
(N(0,1)). This is done by subtracting the mean and dividing by the standard deviation:
\text{Standard normal value (SNV)} = \frac{x_i - \bar{x}}{S}        (Equation 22)
The fractional cumulative frequency for any value is simply given by the cumulative
frequency divided by the number of data points PLUS ONE:
\text{fractional cumulative frequency} = \frac{\text{cumulative frequency}}{n + 1}
The divisor is n+1 to ensure that the centre of the fractional cumulative frequency distribution
is at 0.5. If the divisor was n, then the range of fractional cumulative frequencies would vary
from a minimum of 1/n to n/n, giving a mean or central value of (n+1)/2n. Using the divisor
n+1 means that the range of fractional cumulative frequencies would vary from a minimum
of 1/(n+1) to n/(n+1), giving a mean or central value of (n+1)/2(n+1), or 0.5.
We can replot our weight data as the fractional cumulative frequency against the standard
normal variable with an overlaid line showing the expected cumulative distribution for a
normal distribution, as shown in Figure 13. The individual points are close to the expected
frequencies (as shown by the solid line). This indicates that this dataset is reasonably close to
a normal distribution. This method can be performed manually using Normal probability
graph paper, where the Y axis has been made non-linear to give a straight line instead of the
sigmoidal curve shown below in Figure 13.
Figure 13. Graph of Data Points on Cumulative Frequency Plot
Once we have this graph, how can we tell if the individual data points are far enough from
the line to be considered not part of the normal distribution? We can apply the Kolmogorov-
Smirnov (K-S) test to determine if any of the points lie so far from the line that the data can
be considered non-normally distributed. In this test (and its variants) the null hypothesis is
that the observations are all drawn from the hypothesized distribution, and the alternate
hypothesis is that they are not all drawn from the hypothesized distribution. The maximum
deviation from the expected cumulative frequency is determined and compared to a table of
critical values at a given confidence level. If the maximum deviation is greater than this
critical value, then the null hypothesis is rejected and the alternate hypothesis accepted.
Two tests are possible with this method; the first determines whether the data fits a particular
normal distribution whose parameters are determined in advance and the second whether the
data fits a normal distribution whose parameters are the sample mean and standard deviation.
The first method, which requires the distribution to be known in advance, uses the critical
values in Table 3 (also Table 16, page 28 in Murdoch and Barnes). The second method uses a
modified table of critical values derived by Hubert Lilliefors, with Table 4 giving the critical
values for a number of different confidence levels. There is no table of Lilliefors values in
Murdoch and Barnes, so this will be given in the paper if an exam question requires the use of
this table. We will use the second method for this data set. The two last columns of the table
are the expected cumulative frequency for a given value of the standard normal value and the
absolute difference between the expected and actual cumulative frequencies. To apply the
Lilliefors variant of the K-S test, we first establish our null hypothesis, that the data comes
from a normally distributed population with a mean and standard deviation equal to the
sample mean and standard deviation. We then find the maximum difference between the
expected and actual cumulative frequencies (0.119). We then compare this to the critical
value from the Lilliefors table at the appropriate confidence level (95% in this case). For 20
values we find the critical value to be 0.192. Our maximum difference is less than the critical
value, so we accept the null hypothesis that the data are normally distributed with a mean
given by the sample mean and a standard deviation given by the sample standard deviation.
In this case, we have generated the standard normal values using the mean and standard
deviation determined from the data, so we are using Lilliefors variant of the K-S test. To
determine if the data fit this particular normal distribution, we simply use the mean and
standard deviation of that distribution to generate the standard normal values. The expected
cumulative frequency can be obtained from tables, or generated directly in spreadsheets such
as Microsoft Excel.
Table 5 gives the area in the upper tail of the Normal distribution. This is the same as the
table in Murdoch and Barnes. The required value can be found from this table very simply. If
the SNV whose cumulative frequency you wish to determine is negative, you simply find the
row that corresponds to the first two digits of the SNV (ignoring the sign), then look along
this row to the column that corresponds to the third digit. That is then the expected
cumulative frequency. If the SNV is positive, then you look up the cumulative frequency as
before, then subtract it from 1 to get the required value. As an example, consider the SNV of
0.579 in the table below. Rounded, this is 0.58, which gives a value of 0.2810. To get the
correct value, we subtract this from 1 to give 0.7190. This is close to the true value in the
table, which was generated directly from the unrounded SNV of 0.5790.
Table 2. Sorted list of weights showing the actual and expected cumulative frequencies.
Value    Standard Normal Value (normalised to N(0,1))    Number of occurrences    Cumulative frequency    Fractional cumulative frequency    Expected cumulative frequency    |expected - actual| cumulative frequency
12.450 -2.0679 1 1 0.0476 0.0193 0.0283
12.465 -0.8272 2 3 0.1429 0.2041 0.0612
12.466 -0.7444 1 4 0.1905 0.2283 0.0378
12.468 -0.5790 1 5 0.2381 0.2813 0.0432
12.469 -0.4963 1 6 0.2857 0.3098 0.0241
12.472 -0.2481 1 7 0.3333 0.4020 0.0688
12.473 -0.1654 2 9 0.4286 0.4343 0.0057
12.474 -0.0827 1 10 0.4762 0.4670 0.0092
12.475 0.0000 3 13 0.6190 0.5000 0.1190
12.477 0.1654 1 14 0.6667 0.5657 0.1010
12.481 0.4963 2 16 0.7619 0.6902 0.0717
12.482 0.5790 1 17 0.8095 0.7187 0.0908
12.485 0.8272 2 19 0.9048 0.7959 0.1089
12.513 3.1432 1 20 0.9524 0.9992 0.0468
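As a sketch of the test procedure, the loop below broadly reproduces Table 2: each sorted mass is converted to an SNV, its expected cumulative frequency is obtained from the normal CDF, and the largest |expected - actual| difference is compared with the Lilliefors critical value for n = 20 at the 95% level (0.192, Table 4). Ties are ranked individually here, so the intermediate differences differ slightly from the table, but the conclusion is the same:

import math
import statistics

masses = sorted([12.475, 12.469, 12.481, 12.466, 12.474, 12.465, 12.475, 12.473,
                 12.481, 12.472, 12.482, 12.475, 12.485, 12.473, 12.465, 12.485,
                 12.468, 12.477, 12.450, 12.513])
n = len(masses)
mean = statistics.mean(masses)
s = statistics.stdev(masses)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

max_diff = 0.0
for rank, x in enumerate(masses, start=1):
    snv = (x - mean) / s                  # standard normal value
    actual = rank / (n + 1)               # fractional cumulative frequency
    expected = normal_cdf(snv)            # expected cumulative frequency
    max_diff = max(max_diff, abs(expected - actual))

critical = 0.1920                         # Lilliefors critical value, n = 20, alpha = 0.05 (Table 4)
verdict = "accept H0: consistent with a normal distribution" if max_diff < critical else "reject H0"
print(f"max |expected - actual| = {max_diff:.3f}, critical value = {critical}, {verdict}")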
Table 3. Kolmogorov-Smirnov critical values
n      α=0.20   α=0.15   α=0.10   α=0.05   α=0.01
1 0.900 0.925 0.950 0.975 0.995
2 0.684 0.726 0.776 0.842 0.929
3 0.565 0.597 0.642 0.708 0.828
4 0.494 0.525 0.564 0.624 0.733
5 0.446 0.474 0.510 0.565 0.669
6 0.410 0.436 0.470 0.521 0.618
7 0.381 0.405 0.438 0.486 0.577
8 0.358 0.381 0.411 0.457 0.543
9 0.339 0.360 0.388 0.432 0.514
10 0.322 0.342 0.368 0.410 0.490
11 0.307 0.326 0.352 0.391 0.468
12 0.295 0.313 0.338 0.375 0.450
13 0.284 0.302 0.325 0.361 0.433
14 0.274 0.292 0.314 0.349 0.418
15 0.266 0.283 0.304 0.338 0.404
16 0.258 0.274 0.295 0.328 0.392
17 0.250 0.266 0.286 0.318 0.381
18 0.244 0.259 0.278 0.309 0.371
19 0.237 0.252 0.272 0.301 0.363
20 0.231 0.246 0.264 0.294 0.356
25 0.210 0.220 0.240 0.270 0.320
30 0.190 0.200 0.220 0.240 0.290
35 0.180 0.190 0.210 0.230 0.270
>35    1.07/√n    1.14/√n    1.22/√n    1.36/√n    1.63/√n
Table 4. Lilliefors modified critical values
n      α=0.20   α=0.15   α=0.10   α=0.05   α=0.01
4 0.3027 0.3216 0.3456 0.3754 0.4129
5 0.2893 0.3027 0.3188 0.3427 0.3959
6 0.2694 0.2816 0.2982 0.3245 0.3728
7 0.2521 0.2641 0.2802 0.3041 0.3504
8 0.2387 0.2502 0.2649 0.2875 0.3331
9 0.2273 0.2382 0.2522 0.2744 0.3162
10 0.2171 0.2273 0.2410 0.2616 0.3037
11 0.2080 0.2179 0.2306 0.2506 0.2905
12 0.2004 0.2101 0.2228 0.2426 0.2812
13 0.1932 0.2025 0.2147 0.2337 0.2714
14 0.1869 0.1959 0.2077 0.2257 0.2627
15 0.1811 0.1899 0.2016 0.2196 0.2545
16 0.1758 0.1843 0.1956 0.2128 0.2477
17 0.1711 0.1794 0.1902 0.2071 0.2408
18 0.1666 0.1747 0.1852 0.2018 0.2345
19 0.1624 0.1700 0.1803 0.1965 0.2285
20 0.1589 0.1666 0.1764 0.1920 0.2226
21 0.1553 0.1629 0.1726 0.1881 0.2190
22 0.1517 0.1592 0.1690 0.1840 0.2141
23 0.1484 0.1555 0.1650 0.1798 0.2090
24 0.1458 0.1527 0.1619 0.1766 0.2053
25 0.1429 0.1498 0.1589 0.1726 0.2010
26 0.1406 0.1472 0.1562 0.1699 0.1985
27 0.1381 0.1448 0.1533 0.1665 0.1941
28 0.1358 0.1423 0.1509 0.1641 0.1911
29 0.1334 0.1398 0.1483 0.1614 0.1886
30 0.1315 0.1378 0.1460 0.1590 0.1848
31 0.1291 0.1353 0.1432 0.1559 0.1820
32 0.1274 0.1336 0.1415 0.1542 0.1798
33 0.1254 0.1314 0.1392 0.1518 0.1770
34 0.1236 0.1295 0.1373 0.1497 0.1747
35 0.1220 0.1278 0.1356 0.1478 0.1720
Table 5. Area in the upper tail of the Normal distribution
z     0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.5000 0.4960 0.4920 0.4880 0.4840 0.4801 0.4761 0.4721 0.4681 0.4641
0.1 0.4602 0.4562 0.4522 0.4483 0.4443 0.4404 0.4364 0.4325 0.4286 0.4247
0.2 0.4207 0.4168 0.4129 0.4090 0.4052 0.4013 0.3974 0.3936 0.3897 0.3859
0.3 0.3821 0.3783 0.3745 0.3707 0.3669 0.3632 0.3594 0.3557 0.3520 0.3483
0.4 0.3446 0.3409 0.3372 0.3336 0.3300 0.3264 0.3228 0.3192 0.3156 0.3121
0.5 0.3085 0.3050 0.3015 0.2981 0.2946 0.2912 0.2877 0.2843 0.2810 0.2776
0.6 0.2743 0.2709 0.2676 0.2643 0.2611 0.2578 0.2546 0.2514 0.2483 0.2451
0.7 0.2420 0.2389 0.2358 0.2327 0.2296 0.2266 0.2236 0.2206 0.2177 0.2148
0.8 0.2119 0.2090 0.2061 0.2033 0.2005 0.1977 0.1949 0.1922 0.1894 0.1867
0.9 0.1841 0.1814 0.1788 0.1762 0.1736 0.1711 0.1685 0.1660 0.1635 0.1611
1.0 0.1587 0.1562 0.1539 0.1515 0.1492 0.1469 0.1446 0.1423 0.1401 0.1379
1.1 0.1357 0.1335 0.1314 0.1292 0.1271 0.1251 0.1230 0.1210 0.1190 0.1170
1.2 0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985
1.3 0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.0823
1.4 0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0721 0.0708 0.0694 0.0681
1.5 0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559
1.6 0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.0455
1.7 0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.0367
1.8 0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294
1.9 0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.0233
2.0 0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183
2.1 0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143
2.2 0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
2.3 0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084
2.4 0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064
2.5 0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
2.6 0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
2.7 0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
2.8 0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
2.9 0.0019 0.0018 0.0018 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
3.0 0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010
3.1 0.0010 0.0009 0.0009 0.0009 0.0008 0.0008 0.0008 0.0008 0.0007 0.0007
3.2 0.0007 0.0007 0.0006 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.0005
3.3 0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003
3.4 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
F-Tests
F-tests are used for the comparison of standard deviations of two samples. They can be used
to determine if one set of data is more precise (a one-tailed test), or is different in its
precision (a two-tailed test). The F-test looks at the ratio of two sample variances:

F = S₁² / S₂²        Equation (23)

S₁ and S₂ are chosen such that F ≥ 1.
The critical values for F are determined by the numbers of observations in each of the two
samples, n₁ and n₂, the confidence level and the type of test performed. The degrees of
freedom for an F-test are given by n₁ − 1 and n₂ − 1. The null hypothesis is that the variances are
equal:
H₀: S₁² = S₂²
Table 6 is an example of a one-tailed F-test table at the 95% confidence limit, while Table 7
is the two-tailed variant. If the calculated value for F is less than the critical value obtained
from the appropriate F-table then the hypothesis that the two variances of the sample
populations are equal at the stated confidence level is accepted. If the calculated value for F is
greater than the critical value given in the F table then this hypothesis is rejected, and the
alternative hypothesis accepted.
Example 4
A different experimental worker repeated the measurement in Example 1. The data obtained
were:
x̄₂ = 12.501 g, S₂ = 0.019 g, n₂ = 5

The original data were:

x̄₁ = 12.475 g, S₁ = 0.012 g, n₁ = 20
The null hypothesis adopted is

H₀: S₁² = S₂²

that is, that there is no significant difference in the variance of both samples at the 95% (P =
0.05) confidence level.
S₁² = 1.44 × 10⁻⁴ g²,   S₂² = 3.61 × 10⁻⁴ g²

F = (3.61 × 10⁻⁴) / (1.44 × 10⁻⁴) = 2.51
The critical value for F for a two-tailed test at the 95% confidence level, P = 0.05, for degrees
of freedom of 4 and 19 is F(0.05, 4, 19) = 3.56. A two-tailed test is used because we have no reason
to suppose that one set of measurements will be more precise than the other.
In this case, the calculated F value is less than the critical value (2.51 < 3.56), in other words,
we accept the null hypothesis that there is no significant difference between the two
variances.
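The same comparison can be scripted. The sketch below (not from the original notes) is a minimal Python version of Example 4 in which scipy.stats.f.ppf supplies the critical value instead of Table 7 (for a two-tailed test at the 95% level this is the upper 2.5% point of the F distribution).

from scipy import stats

# Example 4: compare the variances of the two sets of weighings
s1, n1 = 0.012, 20                          # original data
s2, n2 = 0.019, 5                           # repeat data

F = s2**2 / s1**2                           # arranged so that F >= 1
df_num, df_den = n2 - 1, n1 - 1             # 4 and 19 degrees of freedom

F_crit = stats.f.ppf(1 - 0.05 / 2, df_num, df_den)   # two-tailed, 95% confidence level

print(f"F = {F:.2f}, critical value = {F_crit:.2f}")
# F = 2.51 is below the critical value of 3.56, so the null hypothesis of
# equal variances is accepted.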
The table of F values in Murdoch and Barnes (Table 9, pages 20-21) is laid out somewhat
differently to Tables 6 and 7 below. In Murdoch and Barnes the one-tailed values for α =
0.05, 0.025, 0.01 and 0.001 are given for each combination of numerator (columns) and
denominator (rows). This corresponds to P = 95, 97.5, 99 and 99.9% one-tailed or P = 90, 95,
98 and 99.8% two-tailed. The value corresponding to α = 0.025 is bracketed to make it easier
to see.
Table 6. Table of one-tailed critical F values at the 95% confidence level.
Columns: numerator degrees of freedom (ν₁ = n₁ − 1). Rows: denominator degrees of freedom (ν₂ = n₂ − 1).
ν₂     1       2       3       4       5       6       7       8       9       10
1 161.45 199.50 215.71 224.58 230.16 233.99 236.77 238.88 240.54 241.88
2 18.51 19.00 19.16 19.25 19.30 19.33 19.35 19.37 19.38 19.40
3 10.13 9.55 9.28 9.12 9.01 8.94 8.89 8.85 8.81 8.79
4 7.71 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00 5.96
5 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.77 4.74
6 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10 4.06
7 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68 3.64
8 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39 3.35
9 5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18 3.14
10 4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02 2.98
11 4.84 3.98 3.59 3.36 3.20 3.09 3.01 2.95 2.90 2.85
12 4.75 3.89 3.49 3.26 3.11 3.00 2.91 2.85 2.80 2.75
13 4.67 3.81 3.41 3.18 3.03 2.92 2.83 2.77 2.71 2.67
14 4.60 3.74 3.34 3.11 2.96 2.85 2.76 2.70 2.65 2.60
15 4.54 3.68 3.29 3.06 2.90 2.79 2.71 2.64 2.59 2.54
16 4.49 3.63 3.24 3.01 2.85 2.74 2.66 2.59 2.54 2.49
17 4.45 3.59 3.20 2.96 2.81 2.70 2.61 2.55 2.49 2.45
18 4.41 3.55 3.16 2.93 2.77 2.66 2.58 2.51 2.46 2.41
19 4.38 3.52 3.13 2.90 2.74 2.63 2.54 2.48 2.42 2.38
20 4.35 3.49 3.10 2.87 2.71 2.60 2.51 2.45 2.39 2.35
Columns: numerator degrees of freedom ν₁ = 11 to 20. Rows: denominator degrees of freedom (ν₂ = n₂ − 1).
ν₂     11      12      13      14      15      16      17      18      19      20
1 242.98 243.91 244.69 245.36 245.95 246.46 246.92 247.32 247.69 248.01
2 19.40 19.41 19.42 19.42 19.43 19.43 19.44 19.44 19.44 19.45
3 8.76 8.74 8.73 8.71 8.70 8.69 8.68 8.67 8.67 8.66
4 5.94 5.91 5.89 5.87 5.86 5.84 5.83 5.82 5.81 5.80
5 4.70 4.68 4.66 4.64 4.62 4.60 4.59 4.58 4.57 4.56
6 4.03 4.00 3.98 3.96 3.94 3.92 3.91 3.90 3.88 3.87
7 3.60 3.57 3.55 3.53 3.51 3.49 3.48 3.47 3.46 3.44
8 3.31 3.28 3.26 3.24 3.22 3.20 3.19 3.17 3.16 3.15
9 3.10 3.07 3.05 3.03 3.01 2.99 2.97 2.96 2.95 2.94
10 2.94 2.91 2.89 2.86 2.85 2.83 2.81 2.80 2.79 2.77
11 2.82 2.79 2.76 2.74 2.72 2.70 2.69 2.67 2.66 2.65
12 2.72 2.69 2.66 2.64 2.62 2.60 2.58 2.57 2.56 2.54
13 2.63 2.60 2.58 2.55 2.53 2.51 2.50 2.48 2.47 2.46
14 2.57 2.53 2.51 2.48 2.46 2.44 2.43 2.41 2.40 2.39
15 2.51 2.48 2.45 2.42 2.40 2.38 2.37 2.35 2.34 2.33
16 2.46 2.42 2.40 2.37 2.35 2.33 2.32 2.30 2.29 2.28
17 2.41 2.38 2.35 2.33 2.31 2.29 2.27 2.26 2.24 2.23
18 2.37 2.34 2.31 2.29 2.27 2.25 2.23 2.22 2.20 2.19
19 2.34 2.31 2.28 2.26 2.23 2.21 2.20 2.18 2.17 2.16
20 2.31 2.28 2.25 2.22 2.20 2.18 2.17 2.15 2.14 2.12
Table 7. Table of two-tailed critical F values at the 95% confidence level.
Columns: numerator degrees of freedom (ν₁ = n₁ − 1). Rows: denominator degrees of freedom (ν₂ = n₂ − 1).
ν₂     1       2       3       4       5       6       7       8       9       10
1 647.79 799.50 864.16 899.58 921.85 937.11 948.22 956.66 963.28 968.63
2 38.51 39.00 39.17 39.25 39.30 39.33 39.36 39.37 39.39 39.40
3 17.44 16.04 15.44 15.10 14.88 14.73 14.62 14.54 14.47 14.42
4 12.22 10.65 9.98 9.60 9.36 9.20 9.07 8.98 8.90 8.84
5 10.01 8.43 7.76 7.39 7.15 6.98 6.85 6.76 6.68 6.62
6 8.81 7.26 6.60 6.23 5.99 5.82 5.70 5.60 5.52 5.46
7 8.07 6.54 5.89 5.52 5.29 5.12 4.99 4.90 4.82 4.76
8 7.57 6.06 5.42 5.05 4.82 4.65 4.53 4.43 4.36 4.30
9 7.21 5.71 5.08 4.72 4.48 4.32 4.20 4.10 4.03 3.96
10 6.94 5.46 4.83 4.47 4.24 4.07 3.95 3.85 3.78 3.72
11 6.72 5.26 4.63 4.28 4.04 3.88 3.76 3.66 3.59 3.53
12 6.55 5.10 4.47 4.12 3.89 3.73 3.61 3.51 3.44 3.37
13 6.41 4.97 4.35 4.00 3.77 3.60 3.48 3.39 3.31 3.25
14 6.30 4.86 4.24 3.89 3.66 3.50 3.38 3.29 3.21 3.15
15 6.20 4.77 4.15 3.80 3.58 3.41 3.29 3.20 3.12 3.06
16 6.12 4.69 4.08 3.73 3.50 3.34 3.22 3.12 3.05 2.99
17 6.04 4.62 4.01 3.66 3.44 3.28 3.16 3.06 2.98 2.92
18 5.98 4.56 3.95 3.61 3.38 3.22 3.10 3.01 2.93 2.87
19 5.92 4.51 3.90 3.56 3.33 3.17 3.05 2.96 2.88 2.82
20 5.87 4.46 3.86 3.51 3.29 3.13 3.01 2.91 2.84 2.77
Columns: numerator degrees of freedom ν₁ = 11 to 20. Rows: denominator degrees of freedom (ν₂ = n₂ − 1).
ν₂     11      12      13      14      15      16      17      18      19      20
1 973.03 976.71 979.84 982.53 984.87 986.92 988.73 990.35 991.80 993.10
2 39.41 39.41 39.42 39.43 39.43 39.44 39.44 39.44 39.45 39.45
3 14.37 14.34 14.30 14.28 14.25 14.23 14.21 14.20 14.18 14.17
4 8.79 8.75 8.71 8.68 8.66 8.63 8.61 8.59 8.58 8.56
5 6.57 6.52 6.49 6.46 6.43 6.40 6.38 6.36 6.34 6.33
6 5.41 5.37 5.33 5.30 5.27 5.24 5.22 5.20 5.18 5.17
7 4.71 4.67 4.63 4.60 4.57 4.54 4.52 4.50 4.48 4.47
8 4.24 4.20 4.16 4.13 4.10 4.08 4.05 4.03 4.02 4.00
9 3.91 3.87 3.83 3.80 3.77 3.74 3.72 3.70 3.68 3.67
10 3.66 3.62 3.58 3.55 3.52 3.50 3.47 3.45 3.44 3.42
11 3.47 3.43 3.39 3.36 3.33 3.30 3.28 3.26 3.24 3.23
12 3.32 3.28 3.24 3.21 3.18 3.15 3.13 3.11 3.09 3.07
13 3.20 3.15 3.12 3.08 3.05 3.03 3.00 2.98 2.96 2.95
14 3.09 3.05 3.01 2.98 2.95 2.92 2.90 2.88 2.86 2.84
15 3.01 2.96 2.92 2.89 2.86 2.84 2.81 2.79 2.77 2.76
16 2.93 2.89 2.85 2.82 2.79 2.76 2.74 2.72 2.70 2.68
17 2.87 2.82 2.79 2.75 2.72 2.70 2.67 2.65 2.63 2.62
18 2.81 2.77 2.73 2.70 2.67 2.64 2.62 2.60 2.58 2.56
19 2.76 2.72 2.68 2.65 2.62 2.59 2.57 2.55 2.53 2.51
20 2.72 2.68 2.64 2.60 2.57 2.55 2.52 2.50 2.48 2.46
It should also be noted that in Table 9 in Murdoch and Barnes some numerator degrees of
freedom are missing (for example, ν₁ = 9), so values must be interpolated. If, for example, the
two-tailed critical F value for ν₁ = 9, ν₂ = 9 (F(0.025, 9, 9)) was required, we would use the
bracketed values (α = 0.025) on either side, giving

F(0.025, 9, 9) ≈ [ F(0.025, 8, 9) + F(0.025, 10, 9) ] / 2 = (4.10 + 3.96) / 2 = 4.03
We can see that this is correct from Table 7 above.
The t-Test
t-Tests are used to evaluate the likelihood that observed differences between means are a
result of indeterminate error. In a t-test, the null hypothesis is that x̄₁ = x̄₂, or that there is no
difference between the means.
Statistical theory is used to calculate the probability that observed differences between the
means are due to indeterminate errors. The null hypothesis will be rejected if the probability
of differences being due to chance is less than the confidence limit adopted. There are three
forms of t-test:
Comparison of a mean against a reference
Comparison of the means of two samples
Comparison of sets of means
Comparison of a Mean Against a Reference
Remembering that

Tₙ = (x̄ − μ) / (Sₙ / √n)

we can express the equation for confidence limits (equation 22) as:

μ = x̄ ± t(p, n−1) × S / √n        Equation (24)
where μ is the reference value, x̄ is the mean, t(p, n−1) is Student's t-value for the adopted confidence
level p and n − 1 degrees of freedom, S is the sample standard deviation and n is the number of
observations.
This expression can be re-arranged to give Equation (25):

t(p, n−1) = (x̄ − μ) √n / S        Equation (25)
This expression may be used to decide if x̄ and μ are equivalent, or significantly different. In
this case, the null hypothesis is:

H₀: x̄ = μ
First we calculate t, and then look up the critical value for t with n − 1 degrees of freedom at
the selected confidence limit. If the modulus of t is greater than the critical value then the null
hypothesis is rejected. Since we are only interested in determining if the means are different,
we use a two-tailed t-test. If we wanted to determine if the mean was greater or less than the
reference value (a particular direction of difference), we would use a one-tailed test.
Example 5
The material weighed in Example 1 was obtained from a machine set to deliver 12.45g per
operation. Is the unit operating outside its specification?
In this case the null hypothesis is that x̄ = μ.

μ = 12.45 g, x̄ = 12.475 g, n = 20, S = 0.012 g
t = (12.475 − 12.45) × √20 / 0.012 = 9.317
The critical two-tailed value for t at the 95% confidence limit with 19 degrees of freedom is
2.093. The calculated t-value is much greater, so the null hypothesis (that the means are
equal) is rejected and the alternative hypothesis (that the means are different) is accepted. The
machine is operating out of specification.
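A minimal Python sketch of the same test (not part of the original notes) is given below; scipy.stats.t.ppf provides the critical value, and the t statistic is computed directly from the summary figures using Equation (25).

import math
from scipy import stats

# Example 5: is the mean weight consistent with the 12.45 g set point?
mu, xbar, s, n = 12.45, 12.475, 0.012, 20

t = (xbar - mu) * math.sqrt(n) / s             # Equation (25)
t_crit = stats.t.ppf(1 - 0.05 / 2, n - 1)      # two-tailed, 95%, 19 degrees of freedom

print(f"t = {t:.3f}, critical value = {t_crit:.3f}")
# |t| = 9.317 exceeds 2.093, so the null hypothesis is rejected.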
Comparison of the Means of Two Samples: t-Tests
The t-test is used to test that the difference between the means of two samples is significant,
and cannot be accounted for by indeterminate error. The first part of a t-test is to test whether
the variances of the two samples are significantly different, so a two-tailed F-test is used. For
samples with the same distribution, that is, where the variances are not significantly different (also
known as homoscedastic), the t-value is calculated from the following expression, Equation (26):

t = (x̄₁ − x̄₂) / ( S √(1/n₁ + 1/n₂) )        Equation (26)
The number of degrees of freedom for t in Equation (26) is given by:
n₁ + n₂ − 2
and

S² = [ (n₁ − 1) S₁² + (n₂ − 1) S₂² ] / (n₁ + n₂ − 2)        Equation (27)
S is a pooled value for the standard deviation, and should be applied if the F-test shows no
significant difference in the standard deviations. This pooled value is used because we know
from the F test that the observations are drawn from distributions with the same standard
deviation. We use a weighted average of the two standard deviations.
If the standard deviations are significantly different, a modification of Equation (26) can be
used:
t = (x̄₁ − x̄₂) / √( S₁²/n₁ + S₂²/n₂ )        Equation (28)
and the degrees of freedom are calculated from:

ν = ( S₁²/n₁ + S₂²/n₂ )² / [ (S₁²/n₁)² / (n₁ + 1) + (S₂²/n₂)² / (n₂ + 1) ] − 2        Equation (29)
rounded to the nearest integer.
Example 6
Although there is a clear difference between the mean mass values given in Examples 1 and
4, the uncertainty associated with the second measurement is larger than the first, as a result
of the smaller sample size. A t-test will be useful in confirming this observation. The null
hypothesis is:
H₀: x̄₁ = x̄₂
We have already performed the F-test to determine if the standard deviations are significantly
different, and found that there is no significant difference at the 95% confidence level.
Calculating t from Equations (26) and (27):
S = √[ (19 × 0.012² + 4 × 0.019²) / (20 + 5 − 2) ] = 0.014 g

and

|t| = |12.475 − 12.501| / ( 0.014 × √(1/20 + 1/5) ) = 3.375
The critical value for t for (20 + 5 - 2 ) = 23 degrees of freedom at the 95% confidence level
(P = 0.05) is 2.069. Since our t value is much larger, we reject the null hypothesis that the
means are the same and accept the alternative hypothesis that the means are different at the
95% confidence level. We use a two-tailed test in this case as we are only testing if there is a
difference in the two means, not whether one mean is greater than the other.
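For completeness, a sketch of the same comparison in Python (not from the original notes): scipy.stats.ttest_ind_from_stats applies the pooled, equal-variance form directly to the summary statistics. Because it carries the unrounded pooled standard deviation through the calculation, the t value it reports will not match the hand calculation above exactly, but the conclusion is the same.

from scipy import stats

# Example 6: compare the two mean weights using the pooled (homoscedastic) t-test
res = stats.ttest_ind_from_stats(mean1=12.475, std1=0.012, nobs1=20,
                                 mean2=12.501, std2=0.019, nobs2=5,
                                 equal_var=True)          # Equations (26) and (27)

t_crit = stats.t.ppf(1 - 0.05 / 2, 20 + 5 - 2)            # two-tailed, 95%, 23 d.o.f.
print(f"|t| = {abs(res.statistic):.2f}, critical value = {t_crit:.3f}")
# |t| exceeds 2.069, so the two means are significantly different at the 95% level.
# Setting equal_var=False gives a Welch-type test for unequal variances
# (scipy uses the Welch-Satterthwaite degrees of freedom rather than Equation (29)).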
Rejection of Data
It is often the case that a set of data may contain a datum that is clearly different from the rest
of the sample. Such data may contain a determinate error as well as an indeterminate error. If
the experiment was not operating correctly at the time that the "outlier" was measured then to
include it is misleading. However, removing inconvenient results so that the recorded
observations fit preconceived models is wrong - even though some very well known scientists
have succumbed to this temptation (Mendel, Darwin). Consequently, a great deal of care
needs to be exercised when dealing with outliers.
If an outlier exists in experimental data, the first thing to be done is to check the experimental
procedures operating at the time the datum was obtained. Check records and observations to
try to identify the cause of the outlier. At this point deficiencies in record keeping are often
highlighted. Should this be the case, changes to the experimental procedure need to be made
to ensure that appropriate records are kept so that error tracing in future becomes more
reliable.
If no determinate error is identified the argument is applied that while no changes to the
system have been observed it is nevertheless highly unlikely that the outlier lies on the
probability density function, as the frequency of such an observation is so low that it would
require a very large sample size normally to observe it. Consequently such an outlier is more
likely to contain determinate error, even though that error is not known, and therefore the
outlier may be rejected.
Note: The most important data are the ones that don't conform to existing models.
Criteria for the Rejection of Data
There are three simple tests to use in deciding whether or not to reject outliers.
Chauvenet's Criterion
In this case the null hypothesis is that all the observations are drawn from the same
distribution. The presumed outlier is removed from the sample and new values for the mean
and standard deviation are calculated. The confidence limits based on the new mean and
standard deviation are then determined at the appropriate confidence level and if the rejected
outlier lies outside the confidence limits then it may be discarded. This criterion may only be
applied once to a set of data, and it carries substantially more weight if strict confidence
limits are applied, e.g. 99% or even better 99.9%.
Dixon's Q, Also Known as a Q-Test
The null hypothesis again is that all the observations are drawn from the same Normal
distribution. A ratio, Dixon's Q, is calculated using the equations in Table 8, which also gives
the critical values for α = 0.1, 0.05 and 0.01 (P = 90, 95 and 99%). In this case the distinction
between one and two tailed tests is irrelevant, as the outlier must lie either below or above the
rest of the data points. The Q value is calculated using the equations depending on the sample
size n and whether the presumed outlier is below (left hand equation) or above (right hand
equation) the rest of the data. The calculated Q value is compared to the critical value at the
required confidence level and if it is below the critical value the null hypothesis is accepted.
If it is larger than the critical value the null hypothesis is rejected and the alternate hypothesis
(that the point is drawn from a different distribution (is an outlier)) is accepted.
Table 8. Dixon's Q test equations and critical values.
Rank difference ratio (Q statistic: suspect value low / suspect value high)        n     α=0.10   α=0.05   α=0.01
Q = (x₂ − x₁)/(xₙ − x₁)   or   (xₙ − xₙ₋₁)/(xₙ − x₁)                                3     0.886    0.941    0.988
                                                                                     4     0.679    0.765    0.889
                                                                                     5     0.557    0.642    0.780
                                                                                     6     0.482    0.560    0.698
                                                                                     7     0.434    0.507    0.637
Q = (x₂ − x₁)/(xₙ₋₁ − x₁)   or   (xₙ − xₙ₋₁)/(xₙ − x₂)                              8     0.650    0.710    0.829
                                                                                     9     0.594    0.657    0.776
                                                                                     10    0.551    0.612    0.726
                                                                                     11    0.517    0.576    0.679
                                                                                     12    0.490    0.546    0.642
                                                                                     13    0.467    0.521    0.615
                                                                                     14    0.448    0.501    0.593
Q = (x₃ − x₁)/(xₙ₋₂ − x₁)   or   (xₙ − xₙ₋₂)/(xₙ − x₃)                              15    0.472    0.525    0.616
                                                                                     16    0.454    0.507    0.595
                                                                                     17    0.438    0.490    0.577
                                                                                     18    0.424    0.475    0.561
                                                                                     19    0.412    0.462    0.547
                                                                                     20    0.401    0.450    0.535
                                                                                     21    0.391    0.440    0.524
                                                                                     22    0.382    0.430    0.514
                                                                                     23    0.374    0.421    0.505
                                                                                     24    0.367    0.413    0.497
                                                                                     25    0.360    0.406    0.489
Grubbs' Test
Grubbs' test is the ISO recommended method for removal of outliers. The null hypothesis is
again that all measurements are from the same population. The suspect value is the one furthest
away from the mean. This test assumes that the observations are Normally distributed.
We calculate:

G = | suspect value − x̄ | / S        Equation (30)
which is the standard normal value for the suspected outlier. The presence of an outlier
increases both the numerator and the denominator (as the outlier increases S as well),
so the G statistic cannot increase indefinitely. In fact, G cannot exceed:

G(max) = (n − 1) / √n
We compare the calculated G statistic against critical values at an appropriate confidence
level, and if the calculated G value is less than the critical value we accept the null
hypothesis; otherwise we reject it and accept the alternate hypothesis (that the observation is
an outlier). This test should generally only be used once to remove an outlier. Table 9 gives
critical values for Grubbs' test at α = 0.05, 0.01 (P = 95, 99%) for sample sizes from 3 up to
600.
Table 9. Critical values for Grubbs' test at α = 0.05, 0.01 (P = 95, 99%)
n    Gcrit(α=0.05)   Gcrit(α=0.01)     n    Gcrit(α=0.05)   Gcrit(α=0.01)     n     Gcrit(α=0.05)   Gcrit(α=0.01)
3 1.1543 1.1547 15 2.5483 2.8061 80 3.3061 3.6729
4 1.4812 1.4962 16 2.5857 2.8521 90 3.3477 3.7163
5 1.7150 1.7637 17 2.6200 2.8940 100 3.3841 3.7540
6 1.8871 1.9728 18 2.6516 2.9325 120 3.4451 3.8167
7 2.0200 2.1391 19 2.6809 2.9680 140 3.4951 3.8673
8 2.1266 2.2744 20 2.7082 3.0008 160 3.5373 3.9097
9 2.2150 2.3868 25 2.8217 3.1353 180 3.5736 3.9460
10 2.2900 2.4821 30 2.9085 3.2361 200 3.6055 3.9777
11 2.3547 2.5641 40 3.0361 3.3807 300 3.7236 4.0935
12 2.4116 2.6357 50 3.1282 3.4825 400 3.8032 4.1707
13 2.4620 2.6990 60 3.1997 3.5599 500 3.8631 4.2283
14 2.5073 2.7554 70 3.2576 3.6217 600 3.9109 4.2740
Example 7.
The weight data in Example 1 has one point (12.513 g) that is 3.14 standard deviations away
from the mean. Is this observation an outlier?
We will apply all three tests to this data, using the null hypothesis that this observation is
from the same (Normal) distribution as the rest of the observations. We will use a confidence
level of 95% for all three tests.
Chauvenet's criterion.
Removing the suspected outlier from the set leaves 19 observations with a mean of 12.473 g
and a standard deviation of 0.00841 g.
The confidence limits are given by:

x̄ ± t(p, n′−1) × S / √n′ = 12.473 ± 2.101 × 0.00841 / √19 = 12.473 ± 0.004054 g, i.e. 12.469 g to 12.477 g

where n′ = n − 1 and n is the original number of observations before removal of the outlier.
The t value (two-tailed, P = 95%) is 2.101. Since our presumed outlier is outside of the
confidence range of 12.469 to 12.477 g, we reject the null hypothesis (that the observation is
part of the same population as the rest of the data) and accept the alternate hypothesis that the
observation is an outlier from a different population.
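The same check is easy to automate. The sketch below (not part of the original notes) removes the suspect value, recomputes the mean and standard deviation, and tests whether the suspect value falls outside the resulting 95% confidence limits.

import numpy as np
from scipy import stats

# The 20 weights (g) from Example 1
weights = np.array([12.450, 12.465, 12.465, 12.466, 12.468, 12.469, 12.472,
                    12.473, 12.473, 12.474, 12.475, 12.475, 12.475, 12.477,
                    12.481, 12.481, 12.482, 12.485, 12.485, 12.513])
suspect = 12.513

rest = weights[weights != suspect]                 # the 19 remaining observations
m, s, k = rest.mean(), rest.std(ddof=1), len(rest)

t = stats.t.ppf(1 - 0.05 / 2, k - 1)               # two-tailed, 95%, 18 d.o.f. (2.101)
lo, hi = m - t * s / np.sqrt(k), m + t * s / np.sqrt(k)

print(f"confidence limits: {lo:.3f} to {hi:.3f} g")
print("outlier" if suspect < lo or suspect > hi else "not an outlier")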
Dixon's Q
The null hypothesis is that all of the observations are drawn from the same population. Since
the sample size is 20 and the presumed outlier is greater than the rest of the observations, we
calculate the Q statistic thus:
n Observation
1 12.450
2 12.465
3 12.465 (x₃)
4 12.466
5 12.468
6 12.469
7 12.472
8 12.473
9 12.473
10 12.474
11 12.475
12 12.475
13 12.475
14 12.477
15 12.481
16 12.481
17 12.482
18 12.485 (xₙ₋₂)
19 12.485
20 12.513 (xₙ)

Q = (xₙ − xₙ₋₂) / (xₙ − x₃) = (12.513 − 12.485) / (12.513 − 12.465) = 0.028 / 0.048 = 0.583
The critical value for P = 95% is 0.450. Our Q statistic is greater than the critical value, so we
reject the null hypothesis and accept the alternate hypothesis that the observation is an outlier
from a different population.
Grubbs' test
The null hypothesis is that all of the observations are drawn from the same population. We
have already calculated the standard normal value for the outlier, but will show the
calculation again:
G = | suspect value − x̄ | / S = (12.513 − 12.475) / 0.01209 = 0.038 / 0.01209 = 3.143
The critical value for P = 95% (α = 0.05) for 20 samples is 2.7082. Our calculated G value is
well above this, so we reject the null hypothesis and accept the alternate hypothesis that the
observation is an outlier from a different population.
We can see that all three methods give the same result, that the observation 12.513 g is an
outlier.
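Dixon's Q and Grubbs' G are equally short to compute. The sketch below (not from the original notes) reproduces the two calculations of Example 7 in Python.

import numpy as np

weights = np.array([12.450, 12.465, 12.465, 12.466, 12.468, 12.469, 12.472,
                    12.473, 12.473, 12.474, 12.475, 12.475, 12.475, 12.477,
                    12.481, 12.481, 12.482, 12.485, 12.485, 12.513])
x = np.sort(weights)
n = len(x)

# Dixon's Q for n = 15-25 with the suspect value above the rest (Table 8)
Q = (x[-1] - x[-3]) / (x[-1] - x[2])
print(f"Q = {Q:.3f}  (critical value 0.450 for n = 20 at P = 95%)")

# Grubbs' G statistic (Equation 30), using the full sample mean and standard deviation
G = np.abs(x - x.mean()).max() / x.std(ddof=1)
print(f"G = {G:.3f}  (critical value 2.7082 for n = 20 at alpha = 0.05)")
# Both statistics exceed their critical values, so 12.513 g is flagged as an outlier.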
Concluding Comments on the Rejection of Data
It is important to always retain and report outliers, as they may contain important information
that you are unaware of. Explain the basis for their exclusion from your data analysis. Do not
hide such data, even if it is highly inconvenient and even embarrassing to report it; history
may attach a great deal more importance to it than you do.
Linear Regression Analysis
Instrumental analysis techniques are often used to determine the concentration of an analyte
over a wide range. Such comparative methods of analysis usually rely on a calibration curve
obtained from the analysis of reference standards. The material presented to the instrument
containing an unknown concentration of analyte yields a response from which the analyte
concentration can be interpolated using the calibration graph.
Sometimes in the investigation of an unknown system the relationship between the system's
response and a factor may be studied over a wide range of factor levels. In using such data a
curve is fitted to show the relationship between the factor and the response, or the analyte
concentration and the instrument signal.
Important questions are raised in adopting such an approach.
Is the graph linear? If not, what is the form of the curve?
As each calibration point is subject to indeterminate error, then what is the best straight
line through the data?
What errors are present in the fitted curve?
What is the error in a determined concentration?
What is the limit of detection?
An important assumption in conventional linear regression analysis, and other curve fitting
techniques, is that there is no error in x axis values, that is, the standard deviation of the
individual x-axis values is very much less than the standard deviation of the individual y-axis
values. More complex forms of regression analysis, such as the Reduced Major Axis method,
can produce a regression line where both the x and y axis values have significant variances.
When presented with a set of data where a linear relationship is claimed between a factor and
its associated response an assessment of the accuracy of that claim may be obtained through
the product-moment correlation coefficient, r. This is also known simply as the correlation
coefficient. The correlation coefficient is used to estimate the "goodness of fit" of data to a
straight line and is given by Equation (31):
r = Σ (xᵢ − x̄)(yᵢ − ȳ) / √[ Σ (xᵢ − x̄)² × Σ (yᵢ − ȳ)² ]        Equation (31)

where all sums run from i = 1 to n.
or:

r = [ n Σ xᵢyᵢ − (Σ xᵢ)(Σ yᵢ) ] / √{ [ n Σ xᵢ² − (Σ xᵢ)² ] [ n Σ yᵢ² − (Σ yᵢ)² ] }        Equation (31a)
The advantage of Equation (31a) is that only the sums of x, y, xy, x² and y² need be
accumulated.
A "perfect" straight line fit will result in a value for r of t 1. The sign of r is determined by
the sign of the gradient of the line. The correlation coefficient effectively measures how much
of the variance in the y (dependent) variable is accounted for by the variance in the x
(independent) variable. If all of the variance in y is accounted for by the variance in x, then
the variables are perfectly correlated with r = ±1. It should be noted that the correlation
coefficient can give very low correlations when there is an obvious relationship between the
dependent and independent variables. Figure 14 illustrates such a case.
Sometimes there is a large indeterminate error present in the measurement of the response,
always plotted on the y-axis. In such cases a t-test may be used to determine if a low value for
r is significant. The null hypothesis adopted in such instances is that y is not correlated to x,
or, as an exact statement, that r = 0. A two-tailed t-test is used if the sign of the slope is not
specified in advance (we only want to detect whether r ≠ 0, i.e. H₁: r ≠ 0); a one-tailed test is used
if we have some reason to believe in advance that the slope will be of a particular sign (we want to
detect if r < 0 or r > 0). If t is greater than the critical value then the null hypothesis is rejected. We
calculate a t statistic using Equation (32):
t = r √(n − 2) / √(1 − r²)        Equation (32)
where the number of degrees of freedom is n − 2. The degrees of freedom are n − 2, NOT n − 1,
because two values (the slope and intercept) have been derived from the data; given these, only
n − 2 of the points are free to vary, the remaining two being determined.
An alternative method is to use tables of critical values of the correlation coefficient, as found
in Table 10 (page 22) of Murdoch and Barnes. Table 10 below (generated in Excel) also gives
the critical values. The α values are for the one-tailed test and the 2α values are for the two-
tailed test. It can be seen that, for large numbers of degrees of freedom, even low values of the
correlation coefficient mean that there is significant correlation.
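As a short illustration (not part of the original notes, and using made-up calibration data), the correlation coefficient and the t statistic of Equation (32) can be computed as follows; the x and y values below are arbitrary.

import numpy as np
from scipy import stats

# Hypothetical calibration data, for illustration only
x = np.array([0.0, 2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([0.02, 0.21, 0.39, 0.62, 0.79, 1.01])
n = len(x)

r = np.corrcoef(x, y)[0, 1]                      # product-moment correlation, Equation (31)
t = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)       # Equation (32)
t_crit = stats.t.ppf(1 - 0.05 / 2, n - 2)        # two-tailed, 95%, n - 2 degrees of freedom

print(f"r = {r:.4f}, t = {t:.2f}, critical t = {t_crit:.2f}")
# If t exceeds the critical value, the correlation is significant at the 95% level.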
Figure 14. Illustration of obvious correlation where the correlation coefficient is low (X Data against Y Data; r = 0.0000).
Table 10. Critical values of the correlation coefficient.
α       0.05     0.025    0.005     0.0025    0.0005      0.00025
2α      0.1      0.05     0.01      0.005     0.001       0.0005
ν = 1   0.98769  0.99692  0.999877  0.999969  0.99999877  0.99999969
2 0.9000 0.9500 0.9900 0.995000 0.999000 0.999500
3 0.8054 0.8783 0.9587 0.97404 0.99114 0.99442
4 0.7293 0.8114 0.9172 0.9417 0.9741 0.98169
5 0.6694 0.7545 0.8745 0.9056 0.9509 0.96287
6 0.6215 0.7067 0.8343 0.8697 0.9249 0.9406
7 0.5822 0.6664 0.7977 0.8359 0.8983 0.9170
8 0.5494 0.6319 0.7646 0.8046 0.8721 0.8932
9 0.5214 0.6021 0.7348 0.7759 0.8470 0.8699
10 0.4973 0.5760 0.7079 0.7496 0.8233 0.8475
11 0.4762 0.5529 0.6835 0.7255 0.8010 0.8262
12 0.4575 0.5324 0.6614 0.7034 0.7800 0.8060
13 0.4409 0.5140 0.6411 0.6831 0.7604 0.7869
14 0.4259 0.4973 0.6226 0.6643 0.7419 0.7689
15 0.4124 0.4821 0.6055 0.6470 0.7247 0.7519
16 0.4000 0.4683 0.5897 0.6308 0.7084 0.7358
17 0.3887 0.4555 0.5751 0.6158 0.6932 0.7207
18 0.3783 0.4438 0.5614 0.6018 0.6788 0.7063
19 0.3687 0.4329 0.5487 0.5886 0.6652 0.6927
20 0.3598 0.4227 0.5368 0.5763 0.6524 0.6799
25 0.3233 0.3809 0.4869 0.5243 0.5974 0.6244
30 0.2960 0.3494 0.4487 0.4840 0.5541 0.5802
40 0.2573 0.3044 0.3932 0.4252 0.4896 0.5139
50 0.2306 0.2732 0.3542 0.3836 0.4432 0.4659
Least Squares Fit
A least squares fit is used to draw a straight line through data that minimises the residuals in
the y-axis, as shown in Figure 15. We are, in fact, minimising the sum of the squares of the
residuals (the difference between the actual y value and the y value calculated from the
regression line). For a straight line of the form y = a + bx the coefficients b and a are given by:

b = Σ (xᵢ − x̄)(yᵢ − ȳ) / Σ (xᵢ − x̄)²        Equation (33)
or,

b = [ n Σ xᵢyᵢ − (Σ xᵢ)(Σ yᵢ) ] / [ n Σ xᵢ² − (Σ xᵢ)² ]        Equation (33a)
a = ȳ − b x̄        Equation (34)

or,

a = (Σ yᵢ) / n − b (Σ xᵢ) / n        Equation (34a)
The alternate forms of both expressions use only the sums of x, y, xy and x².
It is important to provide an estimate of the uncertainty in the slope and intercept calculated
through a least squares fit. This is especially so when involved in the characterisation of a
system's response to a proposed factor.
The first stage is to calculate the y residuals. These are the differences between the calculated
data ŷᵢ and the observed data yᵢ for a given value of xᵢ. Having done this, a statistic Sy/x is
obtained from Equation (35):
Sy/x = √[ Σ (yᵢ − ŷᵢ)² / (n − 2) ]        Equation (35)
This is the standard deviation of the y residuals; that is, the standard deviation of the
difference between each data point and the y value given by the best fit line.
Alternatively, if we define:
Figure 15. Residuals for least-squares fitting (X Data against Y Data, with one residual indicated).
Qx = Σ xᵢ² − (Σ xᵢ)²/n,    Qy = Σ yᵢ² − (Σ yᵢ)²/n,    Qxy = Σ xᵢyᵢ − (Σ xᵢ)(Σ yᵢ)/n

b = Qxy / Qx,    r = Qxy / √(Qx Qy)
then,

Sy/x = √[ ( Qy − Qxy² / Qx ) / (n − 2) ]        Equation (35a)
which does not involve the calculation of the individual ŷᵢ values and can be performed
using only the sums of x, y, xy, x² and y².
Sy/x is used to estimate the standard deviation in b, Sb, and a, Sa, as given in Equations (36)
and (37):
Sb = Sy/x / √[ Σ (xᵢ − x̄)² ] = Sy/x / √Qx        Equation (36)
Sa = Sy/x × √[ Σ xᵢ² / ( n Σ (xᵢ − x̄)² ) ] = Sb × √( Σ xᵢ² / n )        Equation (37)
These estimates for the standard deviation may be used in the normal way in t and F-tests and
used to provide estimates of appropriate confidence limits. These estimates of the standard
deviation of the coefficients that define the linear least squares fit may be used to determine
the uncertainty that accompanies a value x₀ obtained by interpolation when an unknown
sample yields a response of y₀. An alternative approach is to calculate the standard deviation of
x₀ using Equation (38), which approximates the value of the standard deviation:
Sx0 = ( Sy/x / b ) √[ 1 + 1/n + (y₀ − ȳ)² / ( b² Σ (xᵢ − x̄)² ) ]        Equation (38)
If y₀ is the mean of m measurements then Equation (38) is modified and becomes:

Sx0 = ( Sy/x / b ) √[ 1/m + 1/n + (y₀ − ȳ)² / ( b² Σ (xᵢ − x̄)² ) ]        Equation (39)
Alternatively, we can use the forms involving the Q parameters:

Sx0 = ( Sy/x / b ) √[ 1 + 1/n + Qx (y₀ − ȳ)² / Qxy² ]

or

Sx0 = ( Sy/x / b ) √[ 1/m + 1/n + Qx (y₀ − ȳ)² / Qxy² ]
These expressions are valid if:

t² Sy/x² Qx / Qxy² < 0.05

where t is the value of the t statistic for the appropriate confidence level and n − 2 degrees of
freedom. See http://www.rsc.org/images/Brief22_tcm18-51117.pdf for a fuller discussion.
How can we minimise the error Sx0? We can reduce the 1/m term by making replicate
measurements (m > 1). We can also reduce the 1/n term by increasing the number of data
points in our calibration line. We can reduce the (y₀ − ȳ)² factor by working close to the
centre of the data, and we can also use a well-determined line, where b >> 0. Finally, we can
maximise the factor Σ (xᵢ − x̄)² by using a wide range of x values.
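The whole fitting procedure can be collected into a few lines. The sketch below (not part of the original notes, using made-up calibration data) evaluates the Q parameters, the slope, intercept, Sy/x, Sb and Sa, and the approximate standard deviation of an interpolated x₀ from Equation (38).

import numpy as np

# Hypothetical calibration data, for illustration only
x = np.array([0.0, 2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([0.02, 0.21, 0.39, 0.62, 0.79, 1.01])
n = len(x)

Qx = np.sum(x**2) - np.sum(x)**2 / n
Qy = np.sum(y**2) - np.sum(y)**2 / n
Qxy = np.sum(x * y) - np.sum(x) * np.sum(y) / n

b = Qxy / Qx                                    # slope, Equation (33a)
a = y.mean() - b * x.mean()                     # intercept, Equation (34)

Syx = np.sqrt((Qy - Qxy**2 / Qx) / (n - 2))     # Equation (35a)
Sb = Syx / np.sqrt(Qx)                          # Equation (36)
Sa = Syx * np.sqrt(np.sum(x**2) / (n * Qx))     # Equation (37)

# Interpolate an unknown giving a single response y0, with its standard deviation (Equation 38)
y0 = 0.50
x0 = (y0 - a) / b
Sx0 = (Syx / b) * np.sqrt(1 + 1 / n + (y0 - y.mean())**2 / (b**2 * Qx))

print(f"y = {a:.4f} + {b:.4f} x   (Sy/x = {Syx:.4f}, Sb = {Sb:.4f}, Sa = {Sa:.4f})")
print(f"x0 = {x0:.3f} +/- {Sx0:.3f} (one standard deviation)")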
Some graphing packages (such as Sigmaplot) can show the confidence interval for a best-fit
line. Figure 16 shows such a plot. The curved lines are the confidence interval, in this
case at a confidence level of 95%. In essence, these lines show the region over which
there is a 95% probability of finding the true best fit line.
The confidence interval shows why interpolation is acceptable, while extrapolation can be
very dangerous. As can be seen in Figure 16, the confidence interval is smallest in the centre
of the data range. The confidence interval diverges away from the centre of the data, as
shown in Figure 17. The further away from the centre of the data range, the larger the
confidence interval. This can be seen clearly in the estimate of the standard deviation of x₀
using Equations (38) and (39), which both include a term (y₀ − ȳ)² that increases rapidly
the further y₀ is from ȳ.
Figure 16. Sigmaplot graph of ozone concentration (ppm) against time (hours), showing the best fit line and 95% confidence intervals.
Figure 17. Interpolation versus extrapolation: coagulation time (s) against age of plasma (s), plotted over the measured range and over a much wider, extrapolated range.
Summary of Least Squares Fit
The approach described above is widely used, but it is flawed:
It assumes that x values are free of errors. This is not necessarily true.
It assumes that errors in y values are constant. This is rarely true. All y values are given
equal weighting regardless of the uncertainty associated with them.
Nevertheless, used carefully, linear regression analysis and least squares fitting provide useful
information.
Non-linear Regression
The linear regression method outlined above works for any linear relationship. Many
relationships between variables do not follow a simple linear relationship, but may be
linearised by the appropriate transformation, allowing linear regression to be performed. An
example would be:
y = a x^b

which may be linearised by taking logs of both sides:

log y = log a + b log x
We can now plot log y against log x to determine the slope and intercept, and hence the
values of a and b. Figure 18 shows how this is done for a data set of fish length and mass.
From the log-log plot we obtain log a = −1.954 (a = 0.0111 g) and b = 3.013. The equation
relating fish mass to length is then:

y = 0.0111 x^3.013
The correlation coefficient for the transformed data is 0.8733 for 29 observations. If we wish
to see if this correlation is significant, we can determine the t value as shown in Equation (32):

t = r √(n − 2) / √(1 − r²) = 0.8733 × √(29 − 2) / √(1 − 0.8733²) = 9.314
Our null hypothesis is that there is no correlation between the variables (r = 0). Because we
expect in advance that the fish mass should increase with fish length (positive correlation),
the alternate hypothesis will be that r > 0, which means that we should use a one-tailed test.
The critical value for a one-tailed t-test with 27 degrees of freedom at the 99% confidence
level is 2.473, which is much less than our calculated t value. Accordingly, we reject the null
hypothesis (that there is no correlation) and accept the alternative hypothesis that there is a
positive correlation between fish mass and length. Figure 18 shows graphs of both the
original and transformed data, along with 99% confidence lines on either side of the least-
squares fit line.
We see that the exponent in the relationship is very close to 3, which is what we would expect
if the fish show isometric growth - that is, the fish grow uniformly in all three dimensions.
We can perform a t-test on the exponent against the reference value (3), by first calculating
the standard deviation of the exponent (the slope in this case):

Sy/x = 0.1944

Sb = 0.1944 / 0.601 = 0.3235

We calculate t using:

t = ( b − 3.000 ) √n / Sb = (3.013 − 3.000) × √29 / 0.3235 = 0.2164
The critical value for 27 degrees of freedom at the 99% confidence level for a two-tailed test
is 2.771. Our t value is well below this value, so we accept the null hypothesis that the slope
is not significantly different from 3.000 at the 99% confidence level.
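A sketch of the linearisation in Python follows (not from the original notes); since the roach measurements themselves are not reproduced in these notes, the length and mass values below are made up to stand in for them.

import numpy as np

# Hypothetical length (cm) and mass (g) values standing in for the roach data set
length = np.array([5.2, 6.8, 8.1, 9.5, 10.7, 12.0, 13.4, 14.8, 16.1])
mass = np.array([1.6, 3.5, 6.0, 9.4, 13.6, 19.0, 26.8, 35.9, 46.5])

logx, logy = np.log10(length), np.log10(mass)
b, loga = np.polyfit(logx, logy, 1)          # slope and intercept of the log-log line
a = 10**loga

r = np.corrcoef(logx, logy)[0, 1]            # correlation of the transformed data
print(f"mass = {a:.4f} * length^{b:.3f},  r = {r:.4f}")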
Limits of Detection
Much debate has taken place regarding the definition for the limit of detection of a
measurement system. IUPAC has proposed that the limit of detection be the response at zero
analyte concentration plus three standard deviations of the response at zero analyte. If a
calibration curve is used then the limit of detection is given by the y axis intercept plus three
times the standard deviation associated with that value, Equation (40):
Limit of detection (LOD) = a + 3 Sa        Equation (40)
Or using Sy/x:

Limit of detection (LOD) = a + 3 Sy/x        Equation (41)
Equation (40) should be used if Sa has been estimated independently by making replicate
blank measurements. Equation (41) should be used where replicate blank measurements have
not been made. It may be possible to record the presence of an analyte below this level, but
the uncertainty associated with such an observation is such as to make it unreliable. Figure 19
illustrates the determination of the LOD from the intercept and standard deviation of the
intercept.
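As a minimal sketch (not part of the original notes), the LOD follows directly from the fitted intercept and its standard deviation; the numbers below are hypothetical, standing in for the output of a least-squares fit such as the one sketched earlier, and the corresponding concentration follows from the slope.

# Hypothetical fit results: intercept a, its standard deviation Sa, and slope b
a, Sa, b = 0.021, 0.008, 0.098

lod_response = a + 3 * Sa                    # Equation (40): response at the limit of detection
lod_concentration = (lod_response - a) / b   # converted to a concentration via the slope
print(f"LOD response = {lod_response:.3f}, LOD concentration = {lod_concentration:.2f}")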
Figure 18. Linear-linear and log-log plots of fish mass (g) against fish length (cm) (panels: "Roach Data - Raw Data", mass (g) against length (cm); "Roach Data - Log data", log(mass) against log(length)).
Figure 19. Determination of the limit of detection (LOD) from the intercept (a) and
standard deviation of the intercept (Sa): the LOD corresponds to the concentration at which the response reaches a + 3Sa.
Data Processing
All these statistical tests and calculations, along with curve and line fitting can be done using
statistical packages and spreadsheets. These are an important part of a modern measurement
laboratory's equipment.
Often procedures will be given for determining the appropriate statistical parameters and
producing calibration curves which require no thought on the part of the operator. May I
warn you against the trap of relying on a computer package that you are unfamiliar
with. Make sure you understand what the program does before you trust it with your data, and
always highlight results derived from computer programs that you are unfamiliar with.
Fortunately, most software packages come with excellent tutorials and texts that fully
describe how they operate and the basic theory behind the programs.
Packages such as MINITAB and SPSS give a far more comprehensive range of statistical
tests than covered in these lectures. Most of the graphs shown in these lecture notes (with
associated confidence limits) were produced using the SigmaPlot for Windows package, while
most spreadsheets have sophisticated graphing facilities.
Of the software available within The University of Manchester, SAS is the most
comprehensive package for statistics. It can do far more than you are ever likely to require. It
is also possible to perform statistical manipulations using spreadsheets such as Microsoft
Excel. It should be noted, however, that Excel reports the results of its statistical tests in
terms of probabilities, not confidence levels.
Sigmaplot for Windows is a sophisticated scientific graphing package, permitting the creation
of 2D and 3D plots. It has limited statistical capabilities, with its strong point being its ability
to add confidence limits and prediction intervals to graphs. Origin is another graphics
package with similar capabilities.
Finally, the arithmetic behind most of the methods outlined here is very simple, involving
summations and the odd square root. It is not impossible to write your own programs to
perform these tests using packages such as Visual BASIC or Borland Delphi. Of course, if
you write your own programs you should check the results using test data before performing
any significant analyses.