
CHAPTER 6:

RANDOM ERRORS
IN CHEMICAL
ANALYSIS
“All measurements contain random errors.”

• They can never be totally eliminated and are often the
major source of uncertainty in a determination.
• Random errors are caused by the many uncontrollable
variables that are an inevitable part of every analysis.
• Most contributors to random error cannot be positively
identified.
• Even if we can identify sources of uncertainty, it is usually
impossible to measure them because most are so small that
they cannot be detected individually.
• The accumulated effect of the individual uncertainties,
however, causes replicate measurements to fluctuate
randomly around the mean of the set.
Figure 6-1 Three-dimensional plot showing absolute error in
Kjeldahl nitrogen determination for four different analysts.
What Are the Sources of Random Errors?
A Gaussian, or normal error
curve, is a curve that shows the
symmetrical distribution of data
around the mean of an infinite
set of data.
The spread in a set of replicate measurements is the difference
between the highest and lowest result. It results directly from an
accumulation of all random uncertainties in the experiment.
Figure 6-3 A histogram (A) showing distribution of the 50 results in Table 6-3
and a Gaussian curve (B) for data having the same mean and standard
deviation as the data in the histogram.
Sources of random uncertainties in the calibration of a pipet
include:
(1) visual judgments, such as the level of the water with
respect to the marking on the pipet and the mercury level in
the thermometer;
(2) variations in the drainage time and in the angle of the
pipet as it drains;
(3) temperature fluctuations, which affect the volume of the
pipet, the viscosity of the liquid, and the performance of the
balance; and
(4) vibrations and drafts that cause small variations in the
balance readings.
STATISTICAL TREATMENT OF RANDOM ERROR
“Statistical methods allow us to categorize and characterize
data in different ways and to make objective and intelligent
decisions about data quality and interpretation.”

Sample – a finite number of experimental observations; a
tiny fraction of the infinite number of observations
Population (or universe) – the collection of all measurements of
interest to the experimenter
Properties of Gaussian Curves
Parameters used to define a population or distribution:
• Population mean, μ
• Population standard deviation, σ

Population mean, μ – the true mean of the population; in the
absence of any systematic error, this is also the true value for
the measured quantity.

$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$

The sample mean and the sample standard deviation are
examples of statistics that estimate the parameters μ and σ,
respectively.
Sample mean, x̄ – the mean of a limited sample drawn from the
population of the data

$$\bar{x} = \frac{\sum_{i=1}^{N} x_i}{N}$$

The probable difference between μ and x̄ decreases rapidly
as the number of measurements making up the sample
increases.
Measures of Precision
1. Population standard deviation, σ
- a measure of the precision of a population of data, given
mathematically by:

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$

Note that z = (x − μ)/σ is the deviation of a data point from the
mean relative to one standard deviation. That is, when x − μ = σ, z is
equal to one; when x − μ = 2σ, z is equal to two; and so forth.
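The definition of z above can be sketched in a few lines of Python. The function name `z_score` and the population values (μ = 50.0, σ = 2.0) are illustrative assumptions, not values from this chapter.

```python
def z_score(x, mu, sigma):
    """Deviation of x from the population mean, in units of sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# Hypothetical population parameters, chosen only for illustration.
mu, sigma = 50.0, 2.0
for x in (52.0, 54.0, 47.0):
    print(f"x = {x}: z = {z_score(x, mu, sigma):+.1f}")
```

A point one standard deviation above the mean gives z = +1, and a point below the mean gives a negative z.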
The normal error curve has several general properties:
1. The mean occurs at the central point of maximum frequency.
2. There is a symmetrical distribution of positive and negative
deviations about the maximum.
3. There is an exponential decrease in frequency as the
magnitude of the deviations increases.
Thus, small uncertainties are observed much more often than
very large ones.
Areas under a Gaussian Curve
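As a rough illustration of this heading, the fraction of a Gaussian population lying within ±kσ of the mean can be computed from the error function. The helper name `area_within` is an assumption for this sketch, which relies on the standard identity area = erf(k/√2) rather than on anything stated in the slides.

```python
import math


def area_within(k):
    """Fraction of a Gaussian population within +/- k sigma of the mean."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(f"within +/-{k} sigma: {100 * area_within(k):.1f}%")
```

The printed fractions are roughly 68.3%, 95.4% and 99.7%, which is why small deviations are observed far more often than large ones.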
2. Sample standard deviation, s
- measures how closely the data are clustered about the mean

$$s = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N-1}} = \sqrt{\frac{\sum_{i=1}^{N} d_i^2}{N-1}}$$

where d_i = x_i − x̄ is the deviation of result i from the mean.
N − 1 is called the number of degrees of freedom. When N − 1 is used
instead of N, s is said to be an unbiased estimator of the
population standard deviation, σ.

The smaller the s, the more closely the data are clustered
about the mean.
Alternatively,

$$s = \sqrt{\frac{\sum_{i=1}^{N} x_i^2 - \dfrac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}}{N-1}}$$
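The two expressions for s are algebraically equivalent, which the following sketch checks numerically. The function names and the five data values are hypothetical and serve only as an illustration.

```python
import math


def sample_std(data):
    """Definition form: sqrt(sum((x - mean)^2) / (N - 1))."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean = sum(data) / n
    return math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))


def sample_std_shortcut(data):
    """Alternative form: sqrt((sum(x^2) - (sum(x))^2 / N) / (N - 1))."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two measurements")
    sum_x = sum(data)
    sum_x2 = sum(x * x for x in data)
    # Guard against a tiny negative value caused by floating-point round-off.
    return math.sqrt(max(0.0, sum_x2 - sum_x ** 2 / n) / (n - 1))


data = [10.1, 10.3, 9.9, 10.2, 10.0]  # hypothetical replicate results
print(f"definition form:  {sample_std(data):.4f}")
print(f"alternative form: {sample_std_shortcut(data):.4f}")
```

The alternative form subtracts two large, nearly equal numbers, so it is more sensitive to round-off when s is small compared with the mean. Carrying extra digits through the calculation avoids the problem.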
3. Standard deviation of the mean, s_m
- measures the uncertainty in the mean of N measurements
(also called the standard error of the mean)

$$s_m = \frac{s}{\sqrt{N}}$$
Precision can be improved by:
• Increasing the number of measurements
- because s_m falls only as 1/√N, improving the precision of the
mean by a factor of 10 requires 100 times as many measurements
• Decreasing s → a better way
- by being more precise in individual operations, by
changing the procedure, and by using more precise measurement tools
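The square-root dependence on N can be seen in a short sketch. The value s = 0.10 and the name `standard_error` are assumptions chosen for illustration.

```python
import math


def standard_error(s, n):
    """Standard deviation of the mean: s divided by the square root of N."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return s / math.sqrt(n)


s = 0.10  # hypothetical sample standard deviation
for n in (1, 4, 25, 100):
    print(f"N = {n:3d}: s_m = {standard_error(s, n):.3f}")
```

Going from 1 to 100 measurements reduces s_m only tenfold, whereas halving s halves s_m directly. This is the sense in which decreasing s is the better way.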
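Other ways of expressing precision:
• Variance, s² – the square of the standard deviation

$$s^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N-1}$$

• Relative standard deviation, RSD, and coefficient of variation, CV

$$RSD = \frac{s}{\bar{x}} \qquad RSD \text{ in ppt} = \frac{s}{\bar{x}} \times 1000 \text{ ppt} \qquad CV = \frac{s}{\bar{x}} \times 100\%$$

• Spread or range, w

$$w = \text{highest value} - \text{lowest value}$$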
The following results were obtained in the replicate determination
of the lead content of a blood sample: 0.752, 0.756,
0.752, 0.751, 0.760 ppm Pb.
Calculate (a) the variance, (b) the relative standard deviation in
parts per thousand, (c) the coefficient of variation, and (d) the
spread.

x̄ = 0.754 ppm Pb and s = 0.0038 ppm Pb

(a) s² = (0.0038)² = 1.4 × 10⁻⁵
(b) RSD = (0.0038 / 0.754) × 1000 ppt = 5.0 ppt
(c) CV = (0.0038 / 0.754) × 100% = 0.50%
(d) w = 0.760 − 0.751 = 0.009 ppm Pb
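The worked example can be checked with Python's standard `statistics` module, whose `stdev` and `variance` functions use N − 1 in the denominator. This is a sketch for verification only; the variable names are arbitrary.

```python
import statistics

results = [0.752, 0.756, 0.752, 0.751, 0.760]  # ppm Pb, from the example

mean = statistics.mean(results)
s = statistics.stdev(results)            # sample standard deviation (N - 1)
variance = statistics.variance(results)  # sample variance (N - 1)
rsd_ppt = s / mean * 1000                # relative standard deviation, ppt
cv_percent = s / mean * 100              # coefficient of variation, %
spread = max(results) - min(results)     # range w

print(f"mean     = {mean:.3f} ppm Pb")
print(f"s        = {s:.4f} ppm Pb")
print(f"variance = {variance:.1e}")
print(f"RSD      = {rsd_ppt:.1f} ppt")
print(f"CV       = {cv_percent:.2f} %")
print(f"spread   = {spread:.3f} ppm Pb")
```

Because the script carries unrounded values of x̄ and s through the calculation, intermediate digits differ slightly from the hand calculation, but the rounded answers agree.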
Reliability of s as a Measure of Precision

Most of the statistical tests described are based upon sample
standard deviations, and the probability of correctness of the
results of these tests improves as the reliability of s becomes
greater. Uncertainty in the calculated value of s decreases as
N increases. When N is greater than 20, s and σ can be
assumed identical for all practical purposes.

Pooling Data to Improve the Reliability of s


• Data from a series of similar samples accumulated over time
can often be pooled to provide an estimate of s superior to
the value for any individual subset.

$$s_{pooled} = \sqrt{\frac{\sum_{i=1}^{N_1} (x_i - \bar{x}_1)^2 + \sum_{j=1}^{N_2} (x_j - \bar{x}_2)^2 + \sum_{k=1}^{N_3} (x_k - \bar{x}_3)^2 + \cdots}{N_1 + N_2 + N_3 + \cdots - N_T}}$$

where: N1 = number of data in set 1
N2 = number of data in set 2
NT = number of data sets that are pooled
N1 + N2 + N3 + … − NT = number of degrees of freedom
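A minimal sketch of the pooled calculation follows. The function name `pooled_std` and the two small data sets are hypothetical. Each subset contributes its own sum of squared deviations about its own mean, and one degree of freedom is lost per subset.

```python
import math


def pooled_std(*subsets):
    """Pooled standard deviation over several subsets of replicate data."""
    if any(len(subset) == 0 for subset in subsets):
        raise ValueError("every subset needs at least one value")
    n_total = sum(len(subset) for subset in subsets)
    dof = n_total - len(subsets)  # N1 + N2 + ... - NT
    if dof <= 0:
        raise ValueError("not enough data for a pooled estimate")
    sum_sq = 0.0
    for subset in subsets:
        mean = sum(subset) / len(subset)
        sum_sq += sum((x - mean) ** 2 for x in subset)
    return math.sqrt(sum_sq / dof)


# Hypothetical subsets, chosen so the arithmetic is easy to follow:
# sums of squares are 2 and 2, degrees of freedom are 5 - 2 = 3.
print(f"s_pooled = {pooled_std([1.0, 2.0, 3.0], [4.0, 6.0]):.4f}")
```

Note that the subsets may have different means; only the scatter within each subset enters the pooled estimate.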
STANDARD DEVIATION OF CALCULATED RESULTS
1)

2)
3)

The relative standard deviation of y = a³ is not the same as the
relative standard deviation of the product of three
independent measurements y = abc, where a = b = c.
4)
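The note on y = a³ can be illustrated numerically. The propagation formulas on the original slides did not survive extraction, so this sketch assumes the standard first-order rules: for y = a^x the relative standard deviation is |x| times that of a, and for a product of independent quantities the relative standard deviations add in quadrature. The function names and the 1% value are hypothetical.

```python
import math


def rsd_of_power(rsd_a, exponent):
    """Relative standard deviation of y = a**exponent (one measurement)."""
    return abs(exponent) * rsd_a


def rsd_of_product(*rsds):
    """Relative standard deviation of a product of independent quantities."""
    return math.sqrt(sum(r ** 2 for r in rsds))


rsd_a = 0.01  # hypothetical 1% relative standard deviation
print(f"y = a^3: {rsd_of_power(rsd_a, 3):.4f}")
print(f"y = abc: {rsd_of_product(rsd_a, rsd_a, rsd_a):.4f}")
```

In y = a³ the same measurement error is cubed, so the uncertainties cannot partially cancel. In y = abc the three errors are independent and the combined relative uncertainty grows only as √3.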
REPORTING COMPUTED DATA

A numerical result is worthless to users of the data unless they
know something about its quality.

Ways of indicating reliability:
• give a confidence interval at the 90% or 95% confidence
level
• report the absolute standard deviation or the coefficient of
variation of the data
• use significant figures
The SIGNIFICANT FIGURES in a number are all the certain
digits plus the first uncertain digit.

• Express data in scientific notation to avoid confusion in
determining whether terminal zeros are significant.
• Rules for determining the number of significant figures:
1. Disregard all initial zeros.
2. Disregard all final zeros unless they follow a
decimal point.
3. All remaining digits, including zeros
between nonzero digits, are significant.
• Sums and Differences
➢ The result should contain the same number of decimal places as
the number with the smallest number of decimal places.
➢ When adding and subtracting numbers in scientific notation,
express the numbers to the same power of 10. For example,
  2.432 × 10^6 =   2.432   × 10^6
+ 6.512 × 10^4 = + 0.06512 × 10^6
− 1.227 × 10^5 = − 0.1227  × 10^6
                   2.37442 × 10^6 (round to 2.374 × 10^6)
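The same sum can be reproduced with Python's `decimal` module, which keeps track of how many decimal places each term was written with. This is only a sketch of the rule; the variable names are arbitrary, and all three terms are expressed as multiples of 10^6.

```python
from decimal import Decimal

# All terms expressed to the same power of ten (x 10^6), as in the example.
terms = [Decimal("2.432"), Decimal("0.06512"), Decimal("-0.1227")]

total = sum(terms)

# Number of decimal places in each term; the result keeps the fewest.
fewest_places = min(-t.as_tuple().exponent for t in terms)
rounded = total.quantize(Decimal(1).scaleb(-fewest_places))

print(f"unrounded sum: {total} x 10^6")
print(f"reported sum:  {rounded} x 10^6")
```

Building the terms from strings matters here: a binary float would not remember that 2.432 was written to three decimal places.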

• Products and Quotients
➢ Round off the answer so that it contains the same number of
significant digits as the original number with the smallest number of
significant digits.
• Logarithms and Antilogarithms
1. In a logarithm of a number, keep as many digits to the right
of the decimal point as there are significant figures in the
original number.
2. In an antilogarithm of a number, keep as many digits as
there are digits to the right of the decimal point in the original
number.

➢ The number of significant figures in the mantissa, or the
digits to the right of the decimal point of a logarithm, is the
same as the number of significant figures in the original
number. Thus, log (9.57 × 10^4) = 4.981. Since 9.57 has 3
significant figures, there are 3 digits to the right of the
decimal point in the result.
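The logarithm rule can be sketched as a small helper that formats log₁₀ of a value with as many decimal places as the value has significant figures. The name `log10_sig` is hypothetical, and the number of significant figures must be supplied by the user, because a float does not record it.

```python
import math


def log10_sig(value, sig_figs):
    """Format log10(value) with sig_figs digits after the decimal point."""
    if value <= 0:
        raise ValueError("value must be positive")
    return f"{math.log10(value):.{sig_figs}f}"


# 9.57 x 10^4 has three significant figures, so the mantissa keeps three digits.
print(log10_sig(9.57e4, 3))
```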
Rounding Data
➢ In rounding a number ending in 5, always round so that
the result ends with an even number. Thus, 0.635 rounds
to 0.64 and 0.625 rounds to 0.62.
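The round-to-even rule corresponds to the `ROUND_HALF_EVEN` mode of Python's `decimal` module. In this sketch the helper name `round_half_even` is an assumption, and the numbers are passed as strings so that the trailing 5 is represented exactly rather than as a binary approximation.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_half_even(text, places):
    """Round a decimal string to `places` decimals, ties going to even."""
    quantum = Decimal(1).scaleb(-places)  # e.g. 0.01 for two places
    return Decimal(text).quantize(quantum, rounding=ROUND_HALF_EVEN)


for value in ("0.635", "0.625"):
    print(f"{value} -> {round_half_even(value, 2)}")
```

Rounding ties to the even digit avoids the systematic upward drift that always rounding a 5 up would introduce.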

We should note that it is seldom justifiable to keep more than one significant
figure in the standard deviation because the standard deviation contains error as
well.

It is especially important to postpone rounding until the
calculation is completed. At least one extra digit beyond the
significant digits should be carried through all of the
computations in order to avoid a rounding error.
