Week 2 Chapter 2 Sample Prep Statistics PDF
CHAPTER 1
Sample preparation-continued
2 March 2015
Week 2 1
Preparing solutions
How to prepare a solution?
How to prepare a solution with a
certain concentration?
Expression of concentration
Concentration is the quantity of solute in a known volume or mass of solution or solvent.
$$\text{Concentration} = \frac{\text{amount of solute}}{\text{amount of solution}}$$
Units for concentration
• Molarity (M)
• Formality (F)
• Normality (N) (not common)
• Molality (m)
• Parts per thousand (ppt)
• Parts per million (ppm)
• Parts per billion (ppb)
• Percent concentration (%w/w, %w/v, %v/v)
$$\text{Molarity (M)} = \frac{\text{number of moles of solute}}{\text{liters of solution}}$$
Molarity is the concentration of a particular chemical species in solution.
• For substances that DO NOT ionize in solution, such as glucose, molarity and formality are the SAME.
• For substances that ionize in solution, such as NaCl, molarity and formality are DIFFERENT.
Consider a solution labelled 0.1 M NaCl. The molarity of NaCl is strictly zero, since there is no undissociated NaCl in solution. The solution is instead 0.1 M in Na+ and 0.1 M in Cl–. When we state that a solution is 0.1 M NaCl, we understand it to consist of Na+ and Cl– ions.
The formality of NaCl is 0.1 F, because formality represents the total amount of NaCl in solution.
Concentration in terms of
percent composition
The concentrations of substances in commercial aqueous reagents, organic solvents and commercial household products are usually expressed as percent composition.
Example: An HCl reagent bottle labeled 37% contains 37 g HCl per 100 g solution.
The % w/w unit may also be expressed as a fraction, e.g. 37% (w/w) can be expressed as 37 parts per hundred.
How about smaller fractions?
• Parts per thousand (ppt):
$$\text{ppt} = \frac{\text{g of solute}}{10^{3}\ \text{g solution}} = \text{mg/g}$$
As the density of an aqueous solution is often very close to 1.00 g/mL, we usually equate 1 g of water with 1 mL of water (an approximation).
Preparation of solution
Example: How to prepare 250 mL of 0.100 M Na from NaOH
solid? [MW: NaOH = 40, Na= 23.00]
1. Calculate the weight (g) of NaOH that is equivalent to the required moles
of Na in solution
Calculations ???
2. Weigh ??? g of solid (generally to the nearest 0.1 mg, i.e. to 4 decimal places in grams)
3. Dissolve in water, transfer (quantitatively with rinsing) to a 250 mL
volumetric flask, and dilute to the mark
[Figure: Calculate → Weigh → Dissolve and transfer]
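The calculation in step 1 can be sketched as follows. This is a minimal illustration, assuming that each mole of NaOH supplies one mole of Na+ and using the molar mass quoted in the example (40 g/mol); the variable names are illustrative only.

```python
# Values taken from the example: 250 mL of 0.100 M Na+, NaOH molar mass 40 g/mol.
volume_l = 0.250          # final solution volume in liters
target_molarity = 0.100   # required Na+ concentration in mol/L
mw_naoh = 40.0            # molar mass of NaOH in g/mol, as given in the example

moles_na = target_molarity * volume_l   # moles of Na+ needed
moles_naoh = moles_na                   # assumption: 1 mol NaOH gives 1 mol Na+
mass_naoh_g = moles_naoh * mw_naoh      # grams of NaOH to weigh

print(f"Weigh about {mass_naoh_g:.4f} g of NaOH")
```

In practice the mass actually weighed is recorded to 0.1 mg and the true concentration is recalculated from it.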
Dilution of solution
The moles of solute in the concentrated solution (1) equal the moles of solute in the dilute solution (2):
$$M_1 V_1 = M_2 V_2$$
M1 : Initial concentration of solution
V1 : Volume of concentrated solution transferred
M2 : Concentration of final solution (diluted)
V2 : Final volume of the diluted solution
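The dilution relation can be rearranged to find the volume of stock to transfer. The sketch below is illustrative only; the function name and the numbers (a 1.00 M stock diluted to 250 mL of 0.100 M) are hypothetical and do not come from the slides.

```python
def volume_of_stock(m1, m2, v2):
    """Return V1 from M1*V1 = M2*V2 (V1 has the same unit as V2)."""
    if m2 > m1:
        raise ValueError("dilution cannot increase the concentration")
    return m2 * v2 / m1

# Hypothetical numbers: dilute a 1.00 M stock to 250 mL of 0.100 M.
v1 = volume_of_stock(m1=1.00, m2=0.100, v2=250.0)
print(f"Transfer {v1:.1f} mL of stock and dilute to 250 mL")
```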
Week 2+3
CHAPTER 2
Data analysis and statistics
DATA ANALYSIS
Everyday phrases used in describing the results of an analysis:
• “pretty sure”
• “very sure”
• “most likely”
• “probably”
These are replaced by statistical tests, which are important for understanding the significance of data and therefore for setting limitations on each step of the analysis.
ACCURACY
•Degree of agreement between the measured value and the true value (which may not be known!)
•Therefore, it is the degree of agreement between the measured value and the accepted true value
PRECISION
•Degree of agreement between replicate measurements of the same quantity; repeatability of a result.
•Expressed by the standard deviation, the coefficient of variation, the range of the data or as a confidence interval (e.g. 95%) about the mean value
•How similar are values obtained in exactly the same way?
•Useful for measuring deviation from the mean:
$$d_i = x_i - \bar{x}$$
Accuracy is how close you get to the bullseye. Precision is how close the repetitive shots are to one another. It is nearly impossible to have accuracy without good precision.
[Figure: four target diagrams showing high precision with high accuracy, high precision with low accuracy, low precision with low accuracy, and low precision with high accuracy]
Types of error in chemical analysis
Accuracy is expressed in terms of error.
A. Systematic (determinate) error
•Determinable, and can presumably be either avoided or corrected.
•Can be constant (e.g. an uncalibrated weight used in all weighings) or variable (e.g. a buret whose volume readings are in error by different amounts at different volumes).
•Readings are all too high or all too low, which affects accuracy.
•How to detect? Analyze a reference sample, or determine the recovery after adding a known standard.
Significant figures (SF) in multiplication and division
•Use the same number of digits as the number with the fewest digits.
$$\frac{40.1 \times 0.1633}{204.228} = 3.21 \times 10^{-2} \quad (3\ \text{sf})$$
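The rounding rule above can be checked numerically. The helper below is a small sketch (the name `round_sig` is hypothetical) that rounds a result to a chosen number of significant figures; it is applied to the slide's own example, where 40.1 has the fewest digits (3 sf).

```python
from math import floor, log10

def round_sig(value, sig):
    """Round value to `sig` significant figures."""
    if value == 0:
        return 0.0
    # Position of the leading digit decides how many decimals to keep.
    decimals = sig - 1 - floor(log10(abs(value)))
    return round(value, decimals)

result = 40.1 * 0.1633 / 204.228   # unrounded quotient
print(round_sig(result, 3))        # limited to 3 sf by 40.1
```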
Application of statistics in data analysis
•Defining the confidence interval (RANGE) of values around the mean of a data set ($\bar{x}$) within which the population mean ($\mu$) can be expected with a given probability.
•Estimating the probability that the experimental mean ($\bar{x}$) and the true value ($\mu$) are different, or that two experimental means are different (t-test).
•Estimating the probability that data from two experiments are different in precision (F-test).
Statistical treatment of random error
Standard Deviation (SD)
– a very important precision indicator
Each set of analytical results should be accompanied by an indication of the precision of the analysis. The standard deviation measures the precision of a set of data, and can be calculated using an Excel spreadsheet.
Sample standard deviation:
$$s = \sqrt{\frac{\sum_i (x_i - \bar{x})^{2}}{N - 1}}$$
Population standard deviation:
$$\sigma = \sqrt{\frac{\sum_i (x_i - \mu)^{2}}{N}}$$
where
$x_i$ = individual values of x
$\bar{x}$ = sample mean, $\mu$ = population mean
N = number of replicate measurements
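As an alternative to a spreadsheet, the two standard deviation formulas can be sketched in a few lines of Python. The data are the calcium-in-rock percentages quoted in the confidence-interval example of these notes; treating them as a whole population in the second function (with the mean standing in for µ) is purely for illustration.

```python
import math

def sample_sd(values):
    """Sample standard deviation s, with N - 1 in the denominator."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))

def population_sd(values):
    """Population standard deviation, with N in the denominator."""
    n = len(values)
    mean = sum(values) / n  # stands in for the population mean here
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

data = [14.35, 14.41, 14.40, 14.32, 14.37]  # % Ca, from the notes
print(f"s = {sample_sd(data):.3f}, sigma = {population_sd(data):.3f}")
```

For small N the two values differ noticeably; they converge as N grows.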
Relative Standard Deviation (RSD)
– also a very important precision indicator
Variance, V (the square of the standard deviation, V = s²)
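The formulas on this slide did not survive extraction, so the sketch below assumes the usual definitions: RSD (%) = 100 · s / mean, and variance V = s². It uses Python's standard `statistics` module and the calcium-in-rock data quoted elsewhere in these notes.

```python
import statistics

data = [14.35, 14.41, 14.40, 14.32, 14.37]  # % Ca, from the notes

mean = statistics.mean(data)
s = statistics.stdev(data)             # sample standard deviation (N - 1)
rsd_percent = 100 * s / mean           # assumed definition of RSD
variance = statistics.variance(data)   # sample variance, equal to s**2

print(f"RSD = {rsd_percent:.2f} %, V = {variance:.5f}")
```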
Examples:
[Figure: HPLC chromatograms of toxins for Replicates 1, 2 and 3, each showing peaks 1 and 2. Can you guess which result (peak) has the higher precision? Accompanying file: example sd rsd.xlsx]
Confidence limits (CL) and confidence interval (CI): how sure are you?
•Calculating the SD for a set of data gives an indication of the precision inherent in a particular procedure.
•Unless the data set is large, the SD does not by itself give any information about how close the experimentally determined mean ($\bar{x}$) is to the true mean value (µ).
•Statistics allows us to estimate the range within which the true value might fall, with a given probability, defined by the experimental mean and SD.
•This range is the confidence interval, and the limits of this range are the confidence limits.
The likelihood that the true value falls within the range is called the probability or confidence level, usually expressed as a percentage.
Confidence interval for small data set (N ≤ 20)
Values of t at various confidence levels
Data for the analysis of calcium in rock are 14.35%, 14.41%, 14.40%, 14.32% and 14.37%. Within what range are you 95% confident that the true value lies?
$$\text{CI}(\mu) = \bar{x} \pm \frac{ts}{\sqrt{N}}$$
Solution:
Mean, $\bar{x}$ = 14.37
SD, s = 0.037
From the table, at 95 % confidence level, N - 1 = 4, t = 2.78.
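The remaining arithmetic for this example can be sketched as below. The t value is hard-coded from the notes (2.78 at 95%, N − 1 = 4) instead of being looked up from a library, and the variable names are illustrative.

```python
import math
import statistics

data = [14.35, 14.41, 14.40, 14.32, 14.37]  # % Ca, from the example
t_95 = 2.78                                  # from the notes: 95 %, N - 1 = 4

mean = statistics.mean(data)
s = statistics.stdev(data)                      # sample standard deviation
half_width = t_95 * s / math.sqrt(len(data))    # t*s / sqrt(N)

print(f"mu = {mean:.2f} +/- {half_width:.2f} % (95 % confidence)")
print(f"range: {mean - half_width:.2f} % to {mean + half_width:.2f} %")
```

With these inputs the half-width comes out at roughly 0.05%, so the interval is about 14.37 ± 0.05%.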
Summary, at different confidence levels:
If the confidence level is increased, the confidence interval (CI) also widens, so the probability that the true mean value (µ) lies within the interval increases.
Confidence interval for large data set (N > 20)
•Since the exact value of the population mean (µ) cannot be determined, one must use statistical theory to set limits around the measured mean that probably contain µ.
•The CI only has meaning if the measured standard deviation, s, is a good approximation of the population standard deviation, σ, and there is no bias in the measurement.
Table 1 Values for z at various confidence levels
Confidence Level, % z
50 0.67
68 1.00
80 1.29
90 1.64
95 1.96
96 2.00
99 2.58
99.7 3.00
99.9 3.29
Examples:
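Other usage of confidence interval (CI)
•To determine the number of replicates needed for the mean to be within the confidence interval.
•To determine systematic error.
$$\mu = \bar{x} \pm \frac{ts}{\sqrt{N}} \qquad \mu = \bar{x} \pm \frac{z\sigma}{\sqrt{N}}$$
$$N = \left(\frac{z\sigma}{\bar{x} - \mu}\right)^{2} \qquad N = \left(\frac{ts}{\bar{x} - \mu}\right)^{2}$$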
Example:
Calculate the number of replicates needed to reduce the confidence interval to 1.5 g/mL at the 95% confidence level, given s = 2.4 g/mL.
$$N = \left(\frac{ts}{\bar{x} - \mu}\right)^{2} = \left(\frac{1.96 \times 2.4}{1.5}\right)^{2} \approx 10$$
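The same calculation as a short sketch. It uses the 95% value 1.96 from the example and rounds up, on the assumption that a fractional number of replicates has to be rounded to the next whole measurement.

```python
import math

z_95 = 1.96        # value used in the example at the 95 % confidence level
s = 2.4            # standard deviation, in the units of the example
half_width = 1.5   # desired confidence interval, same units

n_exact = (z_95 * s / half_width) ** 2   # N = (t*s / (x_bar - mu))^2
n_required = math.ceil(n_exact)          # round up to whole replicates

print(f"N = {n_exact:.2f}, so {n_required} replicates are needed")
```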
B. To determine systematic error
1) Calculation method 1
Example
A standard solution gave an absorption reading of 0.470 at a particular wavelength.
Ten measurements were done on a sample and the mean gave a value of 0.461,
with standard deviation (s) of 0.003. Show whether systematic error exists in the
measurements at 95% confidence level.
Solution
At 95% confidence level, N = 10, t = 2.26,
$$t_{calc} = \frac{|\bar{x} - \mu|\sqrt{N}}{s} = \frac{|0.461 - 0.470|\sqrt{10}}{0.003} = 9.49$$
$t_{calc} > t_{table}$. Is systematic error present?
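The comparison in this example can be sketched as follows. The function name `t_calc` is illustrative, and the table value 2.26 is taken from the notes instead of being computed.

```python
import math

def t_calc(mean, mu, s, n):
    """t = |mean - mu| * sqrt(N) / s for comparing a mean with a known value."""
    return abs(mean - mu) * math.sqrt(n) / s

t = t_calc(mean=0.461, mu=0.470, s=0.003, n=10)
t_table = 2.26   # from the notes: 95 % confidence level, N = 10

print(f"t_calc = {t:.2f}, t_table = {t_table}")
print("t_calc exceeds t_table" if t > t_table else "t_calc does not exceed t_table")
```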
Testing a hypothesis
[Flowchart: Observations → Hypothesis → Model → Valid? (YES / NO → Reject)]
Significance tests
• This approach tests whether the difference between two results is significant (due to systematic error) or not significant (merely due to random error).
Null hypothesis, Ho
The values of two measured quantities are taken not to differ (significantly) UNLESS we can prove that the two values are significantly different.
•If the calculated value is smaller than the table value, the hypothesis is accepted, and vice versa.
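Steps in t-test
1) Comparing two mean values $\bar{x}$ and $\mu$
i) If $\sigma$ is not known,
$$\mu = \bar{x} \pm \frac{ts}{\sqrt{N}} \quad\Rightarrow\quad t = \frac{|\bar{x} - \mu|\sqrt{N}}{s}$$
If $\sigma$ is known,
$$\mu = \bar{x} \pm \frac{z\sigma}{\sqrt{N}}$$
ii) Calculate t or z (tcalc) from the data.
iii) Compare tcalc and ttable.
iv) If tcalc > ttable, reject the null hypothesis (Ho), i.e. $\bar{x} \neq \mu$.
• The difference is due to systematic error.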
The sulphur content in a sample of kerosene was found to be 0.123%. A new method was used on the same sample and the following data were obtained:
%Sulphur: 0.112, 0.118, 0.115, 0.119
Null hypothesis, Ho: $\mu = \bar{x}$
$\bar{x}$ = 0.116%, $\mu$ = 0.123%, s = 0.0032
$$t_{calc} = \frac{|\bar{x} - \mu|\sqrt{N}}{s} = \frac{|0.116 - 0.123|\sqrt{4}}{0.0032} = 4.38$$
ttable = 3.18 (95%, N − 1 = 3)
Since tcalc > ttable, Ho is rejected: the two means are significantly different, and thus systematic error is present.
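Other Solution:
$$\bar{x} - \mu = 0.116 - 0.123 = -0.007 \quad \text{(experimental data)}$$
$$\bar{x} - \mu = \pm\frac{ts}{\sqrt{N}} = \pm\frac{3.18 \times 0.0032}{\sqrt{4}} = \pm 0.0051 \quad \text{(table value)}$$
Since $|\bar{x} - \mu|_{calculated} > |\bar{x} - \mu|_{table}$ (i.e. 0.007 > 0.0051), Ho is rejected: the difference ($\bar{x} - \mu$) is significant and there is systematic error in the measurement.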
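Both routes to the conclusion can be sketched side by side. The summary statistics and the table value are copied from the sulphur example; the variable names are illustrative.

```python
import math

x_bar, mu = 0.116, 0.123   # experimental mean and accepted value, % S
s, n = 0.0032, 4           # standard deviation and number of replicates
t_table = 3.18             # from the notes: 95 %, N - 1 = 3

# Route 1: compare the calculated t with the table value.
t_calc = abs(x_bar - mu) * math.sqrt(n) / s

# Route 2: compare the observed difference with t*s/sqrt(N).
observed_diff = abs(x_bar - mu)
allowed_diff = t_table * s / math.sqrt(n)

reject_by_t = t_calc > t_table
reject_by_diff = observed_diff > allowed_diff
print(f"t_calc = {t_calc:.3f}, allowed difference = {allowed_diff:.4f}")
print("Ho rejected" if reject_by_t and reject_by_diff else "Ho retained")
```

The two routes are algebraically equivalent, so they always lead to the same decision.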
2) Comparing two mean values $\bar{x}_1$ and $\bar{x}_2$
•Normally used to determine whether two samples are identical or not.
•The difference between the means of two sets of the same analysis provides information on the similarity of the samples or the existence of random error.
Ho: $\bar{x}_1 = \bar{x}_2$
We want to test whether $\bar{x}_1 - \bar{x}_2 = 0$, where
$$\mu_1 = \bar{x}_1 \pm \frac{t s_1}{\sqrt{N_1}} \qquad \mu_2 = \bar{x}_2 \pm \frac{t s_2}{\sqrt{N_2}}$$
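Assume $\mu_1 = \mu_2$ and $\sigma_1 = \sigma_2$:
$$t_{calc} = \frac{|\bar{x}_1 - \bar{x}_2|}{s_p}\sqrt{\frac{N_1 N_2}{N_1 + N_2}}$$
where $s_p$ is the pooled standard deviation.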
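A sketch of this calculation is given below. The notes do not show how the pooled standard deviation is obtained, so the standard pooled formula is assumed here; the input numbers are hypothetical and serve only to exercise the functions.

```python
import math

def pooled_sd(s1, n1, s2, n2):
    """Pooled standard deviation (standard formula, assumed; not shown in the notes)."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def t_two_means(x1, s1, n1, x2, s2, n2):
    """t_calc = |x1 - x2| / s_p * sqrt(N1*N2 / (N1 + N2))."""
    sp = pooled_sd(s1, n1, s2, n2)
    return abs(x1 - x2) / sp * math.sqrt(n1 * n2 / (n1 + n2))

# Hypothetical summary statistics for two sets of five replicates each.
t = t_two_means(x1=10.0, s1=0.2, n1=5, x2=10.3, s2=0.2, n2=5)
print(f"t_calc = {t:.2f}")
```

The result is compared with the table value of t for N1 + N2 − 2 degrees of freedom.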
Example;
Since the tabulated values of F are always greater than 1, the smaller variance (the more precise result) always becomes the denominator.
V1 > V2, so F > 1
F values
$$F = \frac{V_1}{V_2} = \frac{s_1^{2}}{s_2^{2}}$$
Example:
The determination of CO in a mixture of gases using the standard procedure gave an s value of 0.21 ppm. The method was modified twice, giving s1 of 0.15 (10 degrees of freedom, N − 1) and s2 of 0.12 (10 degrees of freedom, N − 1).
Are the two modified methods more precise than the standard?
Solution:
Ho: s1 = sstd and Ho: s2 = sstd
For the standard method, s → σ and the number of degrees of freedom becomes infinite.
Refer to the F table:
Numerator = ∞ and denominator = 10, giving the critical value Ftable = 2.54
Conclusions
Ho : s1 = sstd.
and
Ho : s2 = sstd.
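The F ratios for the two hypotheses can be sketched as below, using only the numbers given in the example and placing the larger variance in the numerator. The printed verdicts are simply what those numbers imply, since the slide's own conclusions did not survive; the function name is illustrative.

```python
def f_ratio(s_a, s_b):
    """F = larger variance / smaller variance, so that F >= 1."""
    larger, smaller = max(s_a, s_b), min(s_a, s_b)
    return larger**2 / smaller**2

s_std, s1, s2 = 0.21, 0.15, 0.12   # standard deviations from the example
f_table = 2.54                      # critical value quoted in the notes

f1 = f_ratio(s_std, s1)
f2 = f_ratio(s_std, s2)

for label, f in (("method 1", f1), ("method 2", f2)):
    verdict = "Ho rejected" if f > f_table else "Ho retained"
    print(f"{label}: F = {f:.2f} vs F_table = {f_table} -> {verdict}")
```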
The Dixon test (Q test)
A way of detecting an outlier, i.e. a data point that does not belong to the set.
Example:
Data: 10.05, 10.10, 10.15, 10.05, 10.45, 10.10
$$Q_{expt} = \frac{|x_q - x_n|}{w}$$
where,
xq = the questionable data point
xn = its nearest neighbour
w = the difference between the highest and the lowest value (range).
Values of Q
Solution:
$$Q_{expt} = \frac{|x_q - x_n|}{w} = \frac{10.45 - 10.15}{10.45 - 10.05} = 0.75$$
Qcritical (95%, n = 6) = 0.625
Qexpt > Qcritical, so the data point (10.45) can be rejected.
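The Q calculation above can be sketched as a small function. The name `q_expt` is illustrative, and the critical value is copied from the notes instead of being looked up programmatically.

```python
def q_expt(data, suspect):
    """Q = |suspect - nearest neighbour| / range of the data."""
    spread = max(data) - min(data)      # w, highest minus lowest value
    others = list(data)
    others.remove(suspect)              # drop one copy of the suspect value
    nearest = min(others, key=lambda x: abs(x - suspect))
    return abs(suspect - nearest) / spread

data = [10.05, 10.10, 10.15, 10.05, 10.45, 10.10]   # from the example
q_critical = 0.625                                   # 95 %, n = 6, from the notes

q = q_expt(data, suspect=10.45)
print(f"Q_expt = {q:.2f}")
print("reject 10.45" if q > q_critical else "retain 10.45")
```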
Example:
An analysis of calcite gave the following percentages of CaO:
55.45, 56.04, 56.23, 56.00, 56.08
Q: Should any data point be rejected at the 95% confidence level?
Solution:
•Arrange the data in order:
55.45, 56.00, 56.04, 56.08, 56.23
•Suspected data: 55.45 OR 56.23
Qtable for 5 determinations at 95% = 0.710
For 56.23:
$$Q_{calc} = \frac{56.23 - 56.08}{56.23 - 55.45} = 0.19$$
Qcalc < Qtable. Data cannot be rejected.
For 55.45:
$$Q_{calc} = \frac{|55.45 - 56.00|}{56.23 - 55.45} = 0.71$$
Qcalc = Qtable, so it does not exceed the critical value. Data cannot be rejected.