Lecture 2-Data Analysis - Part2
Chapter 2
Statistics in Analytical Chemistry- Part 2
Hypothesis test
• Experimental results seldom agree exactly with those
predicted from a theoretical model:
– Scientists/engineers frequently must judge whether a numerical
difference is a result of random errors or of systematic errors. Certain
statistical tests are useful in sharpening these judgments.
– Tests of this kind use a null hypothesis, which assumes that the
numerical quantities being compared are not different.
• Comparison of an experimental mean with a known
value (true or predicted value).
– A large number of measurements or known σ.
– A small number of measurements or unknown σ.
Comparing an Experimental Mean with a Known Value
• Test procedure:
– Step 1: Formulation of an appropriate test statistic:
• z statistic: a large number of measurements or known σ.
• t statistic: small numbers of measurements with unknown σ.
• If not sure: use t statistic.
– Obtain tcrit from Table 7.3 with (N1 + N2 - 2) degrees of freedom
– Compare t with tcrit:
• If t < tcrit: null hypothesis is accepted → no difference between the means
• If t > tcrit: null hypothesis is rejected → significant difference between the means
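The two test statistics named in Step 1 can be sketched in a few lines. This is a minimal illustration, assuming the standard forms z = (x̄ − μ0)/(σ/√N) and t = (x̄ − μ0)/(s/√N); the function names and the four replicate values are hypothetical and do not come from the slides.

```python
import math
import statistics

def z_statistic(sample_mean, mu0, sigma, n):
    """z statistic: many measurements, or a known population sigma."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))

def t_statistic(values, mu0):
    """t statistic: few measurements, sigma estimated by the sample s."""
    n = len(values)
    s = statistics.stdev(values)  # sample standard deviation (N - 1 in the denominator)
    return (statistics.mean(values) - mu0) / (s / math.sqrt(n))

# Hypothetical replicate results and a hypothetical known value.
replicates = [3.29, 3.22, 3.30, 3.23]
known_value = 3.19

t_value = t_statistic(replicates, known_value)
z_value = z_statistic(statistics.mean(replicates), known_value, sigma=0.04, n=len(replicates))
print(f"t = {t_value:.3f}, z = {z_value:.3f}")
```

The computed statistic is then compared with the tabulated critical value for the chosen confidence level, exactly as in the comparison step above.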
Comparison of Two Experimental Means
• t test for differences in the means:
– Example: 2 barrels of wine were analyzed for their alcohol content to
determine whether they were from different sources. On the basis of
6 analyses, the average content of the 1st barrel was 12.61% ethanol. 4
analyses of the 2nd barrel gave a mean of 12.53% alcohol. The 10
analyses yielded a pooled standard deviation s_pooled of 0.070%. Do the data
indicate a difference between the wines?
– Null hypothesis H0: μ1 = μ2, and alternative hypothesis Ha: μ1 ≠ μ2.
– The test statistic t:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{pooled}\sqrt{\frac{N_1 + N_2}{N_1 N_2}}}$$
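The wine example can be checked numerically. The sketch below assumes the usual pooled form of the two-sample statistic, t = (x̄1 − x̄2) / (s_pooled·√((N1 + N2)/(N1·N2))). The critical value 2.31 (95% confidence level, 8 degrees of freedom) is taken from a standard t table and is an assumption here, since the slides do not state it.

```python
import math

def pooled_t(mean1, mean2, s_pooled, n1, n2):
    """Two-sample t statistic using a pooled standard deviation."""
    return (mean1 - mean2) / (s_pooled * math.sqrt((n1 + n2) / (n1 * n2)))

# Values from the wine example.
t_wine = pooled_t(12.61, 12.53, 0.070, 6, 4)
dof = 6 + 4 - 2

# Assumption: tabulated two-tailed critical t for 8 degrees of freedom at 95%.
T_CRIT_95_8DF = 2.31

reject_h0 = abs(t_wine) > T_CRIT_95_8DF
print(f"t = {t_wine:.3f}, dof = {dof}, reject H0: {reject_h0}")
```

With these inputs t is about 1.77, which is below the assumed critical value, so the data give no indication of a difference between the two wines at this confidence level.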
• Paired data:
– Use of pairs of measurements on the same sample to minimize
sources of variability that are not of interest.
– The paired t test uses the same type of procedure as the normal t test
except that pairs of data are analyzed.
– Null hypothesis is H0: μd = Δ0, where Δ0 is a specific value of the
difference to be tested, often zero.
– Alternative hypothesis: μd ≠ Δ0; μd < Δ0 or μd > Δ0
– The test statistic t:
$$t = \frac{\bar{d} - \Delta_0}{s_d / \sqrt{N}}$$
• Where $\bar{d}$ is the average difference, $\bar{d} = \frac{\sum d_i}{N}$; $d_i$: difference in each data pair
• $s_d$ is the standard deviation of the difference:
$$s_d = \sqrt{\frac{\sum_{i=1}^{N} d_i^2 - \frac{\left(\sum_{i=1}^{N} d_i\right)^2}{N}}{N - 1}}$$
– Example: A new automated procedure for determining glucose in
serum (Method A) is to be compared with the established method
(Method B). Both methods are performed on serum from the same 6
patients to eliminate patient-to-patient variability. Do the following
results confirm a difference in the two methods at the 95% confidence level?
– Since t > tcrit = 2.57 (at the 95% confidence level and 5 degrees of freedom) → reject the
null hypothesis and conclude that the 2 methods give different results.
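A paired t test can be sketched as follows, assuming t = (d̄ − Δ0)/(s_d/√N) with s_d computed from the pairwise differences. The six pairs of results below are hypothetical illustration values, not the glucose data from the slide (which is not reproduced in this text); only tcrit = 2.57 for 5 degrees of freedom is taken from the example.

```python
import math

def paired_t(results_a, results_b, delta0=0.0):
    """Return (mean difference, s_d, t) for paired measurements."""
    d = [a - b for a, b in zip(results_a, results_b)]
    n = len(d)
    d_bar = sum(d) / n
    # s_d from the sum of squares and the squared sum of the differences
    s_d = math.sqrt((sum(v * v for v in d) - sum(d) ** 2 / n) / (n - 1))
    return d_bar, s_d, (d_bar - delta0) / (s_d / math.sqrt(n))

# Hypothetical paired results for six samples (arbitrary units).
method_a = [10.2, 11.5, 9.8, 12.1, 10.9, 11.3]
method_b = [10.0, 11.1, 9.9, 11.8, 10.5, 11.0]

d_bar, s_d, t_paired = paired_t(method_a, method_b)
T_CRIT_95_5DF = 2.57  # value quoted in the example for 5 degrees of freedom
print(f"d_bar = {d_bar:.3f}, s_d = {s_d:.3f}, t = {t_paired:.3f}, "
      f"reject H0: {abs(t_paired) > T_CRIT_95_5DF}")
```

Pairing removes the sample-to-sample variation from the comparison, because only the differences within each pair enter the calculation.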
Comparison of Precision: F test
• F test: can be used when
– Comparing the variances (or standard deviations) of two populations
under the provision that the populations follow the normal (Gaussian)
distribution.
– Comparing more than two means and in linear regression analysis.
• Critical values of F at the 0.05 significance level are shown:
– Two degrees of freedom: one associated with the numerator and the
other with the denominator.
– Can be used in either a one-tailed mode or a two-tailed mode.
• Example: A standard method for the determination of CO level in
gaseous mixtures is known from many hundreds of measurements to have
a standard deviation s of 0.21 ppm CO. A modification of the method
yields a value for s of 0.15 ppm CO for a pooled data set with 12 degrees
of freedom. A 2nd modification, also based on 12 degrees of freedom, has
an s of 0.12 ppm CO. Is either modification significantly more precise than
the original?
– Null hypothesis H0: $\sigma_{std}^2 = \sigma^2$ (where $\sigma_{std}^2$ is the variance of the
standard method and $\sigma^2$ is the variance of the modified method).
Alternative hypothesis is one-tailed, Ha: $\sigma^2 < \sigma_{std}^2$
– The variances of the modifications are placed in the denominator:
• Calculate test statistic F for 1st and 2nd modifications:
$$F_1 = \frac{(0.21)^2}{(0.15)^2} = 1.96 \qquad F_2 = \frac{(0.21)^2}{(0.12)^2} = 3.06$$
– F2 > Fcrit: reject the null hypothesis. The 2nd method does appear to
give better precision at the 95% confidence level.
• Comparing the two modifications with each other, Fcrit = 2.69. Since F < 2.69,
we must accept H0 and conclude that the two methods give equivalent precision.
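The F ratios in this example are simple enough to verify directly. In the sketch below, the standard method is treated as having infinite degrees of freedom (it is "known from many hundreds of measurements"), and the critical value 2.30 for (∞, 12) degrees of freedom is taken from a standard F table at the 0.05 level; both are assumptions, since the slides only state Fcrit = 2.69 for the last comparison.

```python
def f_statistic(s_numerator, s_denominator):
    """F is the ratio of two variances (squared standard deviations)."""
    return s_numerator ** 2 / s_denominator ** 2

# Standard deviations from the example, in ppm CO.
s_std, s_mod1, s_mod2 = 0.21, 0.15, 0.12

F_CRIT_INF_12 = 2.30  # assumption: standard F table, 0.05 level, df = (infinity, 12)
F_CRIT_12_12 = 2.69   # value quoted in the example

f1 = f_statistic(s_std, s_mod1)    # standard method vs 1st modification
f2 = f_statistic(s_std, s_mod2)    # standard method vs 2nd modification
f12 = f_statistic(s_mod1, s_mod2)  # 1st vs 2nd modification (larger variance on top)

print(f"F1 = {f1:.2f}, better than standard: {f1 > F_CRIT_INF_12}")
print(f"F2 = {f2:.2f}, better than standard: {f2 > F_CRIT_INF_12}")
print(f"F(1 vs 2) = {f12:.2f}, different precision: {f12 > F_CRIT_12_12}")
```

Under these assumptions only the 2nd modification exceeds its critical value, while the direct comparison of the two modifications stays below 2.69.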
Detection of gross errors: Q test
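• The Q test is used to decide whether a suspected result should be
retained or rejected:
• Calculate Q, the gap between the questionable result $x_q$ and its nearest
neighbor $x_n$, divided by the spread $w$ of the entire set:
$$Q = \frac{|x_q - x_n|}{w}$$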
• Example: The analysis of a calcite sample yielded CaO percentages of
55.95, 56.00, 56.04, 56.08, and 56.23. The last value appears anomalous;
should it be retained or rejected at the 95% confidence level?
– The difference between 56.23 and 56.08 is 0.15%. The spread (56.23 –
55.95) is 0.28%. Thus:
$$Q = \frac{0.15}{0.28} = 0.54$$
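The same calculation can be written as a short function. This is a sketch assuming Q = |suspect − nearest neighbor| / spread; the critical value 0.710 for 5 observations at the 95% confidence level comes from a standard Q table and is an assumption, as the slides do not reproduce the table.

```python
def q_statistic(values, suspect):
    """Gap between the suspect value and its nearest neighbor, divided by the spread."""
    others = list(values)
    others.remove(suspect)
    nearest = min(others, key=lambda v: abs(v - suspect))
    spread = max(values) - min(values)
    return abs(suspect - nearest) / spread

# CaO percentages from the calcite example.
cao = [55.95, 56.00, 56.04, 56.08, 56.23]

Q_CRIT_95_N5 = 0.710  # assumption: standard Q table, 5 observations, 95% confidence

q_value = q_statistic(cao, suspect=56.23)
reject_outlier = q_value > Q_CRIT_95_N5
print(f"Q = {q_value:.2f}, reject 56.23: {reject_outlier}")
```

Because Q is below the assumed critical value, the suspect result would be retained at the 95% confidence level.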
Standardization and calibration
• Calibration:
– Determines the relationship between the analytical
response and the analyte concentration.
– Procedure:
• A series of external standards containing the analyte in
known concentrations is prepared.
• Calibration is accomplished by obtaining the response signal
(absorbance, peak height, peak area) as a function of the known
analyte concentration.
• A calibration curve is prepared by plotting the data or by fitting
them to a suitable mathematical equation.
The least-squares method
• Assumptions:
1. A linear relationship actually exists between
the measured response y and the standard
analyte concentration x, described by
equation y = mx + b → regression model.
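• The least-squares method forms the sum of the squares of the
residuals, SSresid, and finds the slope and intercept that minimize it.
Slope: $m = S_{xy} / S_{xx}$
Intercept: $b = \bar{y} - m\bar{x}$
where $S_{xx} = \sum (x_i - \bar{x})^2$ and $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$.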
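• Total sum of the squares, SStot, is defined as:
$$SS_{tot} = \sum (y_i - \bar{y})^2$$
• The coefficient of determination is $R^2 = 1 - SS_{resid} / SS_{tot}$.
– The closer R2 is to unity, the better the linear model explains the y
variations.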
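A least-squares calibration line and its R² can be sketched without any library. The code assumes the usual definitions m = Sxy/Sxx, b = ȳ − m·x̄, SSresid = Σ(yi − (m·xi + b))², SStot = Σ(yi − ȳ)² and R² = 1 − SSresid/SStot; the calibration data are hypothetical.

```python
def least_squares(x, y):
    """Return slope m, intercept b and R^2 for the line y = m*x + b."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    m = s_xy / s_xx
    b = y_bar - m * x_bar
    ss_resid = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return m, b, 1 - ss_resid / ss_tot

# Hypothetical calibration standards: concentration vs instrument response.
conc = [0.0, 1.0, 2.0, 3.0, 4.0]
signal = [0.1, 2.1, 3.9, 6.1, 7.9]

slope, intercept, r_squared = least_squares(conc, signal)
print(f"m = {slope:.3f}, b = {intercept:.3f}, R^2 = {r_squared:.4f}")

# Concentration of an unknown from its measured response (hypothetical reading).
unknown_conc = (5.0 - intercept) / slope
```

The last line shows the typical use of the fitted line: inverting y = mx + b to read an unknown concentration from its response.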
Transformed variables
• The least-squares method can be applied to nonlinear models by
converting them into simple linear models, as shown in Table
8.3:
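As one illustration of such a transformation (Table 8.3 itself is not reproduced in this text, so the choice of model is an assumption), an exponential model y = b·e^(mx) becomes linear after taking logarithms: ln y = ln b + m·x. The sketch below generates noise-free hypothetical data from known parameters and recovers them with an ordinary straight-line fit.

```python
import math

def fit_line(x, y):
    """Least-squares slope and intercept for y = m*x + b."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    m = s_xy / s_xx
    return m, y_bar - m * x_bar

# Hypothetical data following y = 2.0 * exp(0.5 * x).
x = [0.0, 1.0, 2.0, 3.0]
y = [2.0 * math.exp(0.5 * xi) for xi in x]

# Transform: fit ln(y) against x, then undo the transform for b.
ln_y = [math.log(yi) for yi in y]
m_est, ln_b = fit_line(x, ln_y)
b_est = math.exp(ln_b)
print(f"m = {m_est:.3f}, b = {b_est:.3f}")
```

With real, noisy data the transformation also changes how the errors are weighted, so the fitted parameters are not identical to those from a direct nonlinear fit.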
Using Excel
• Calculation of slope and intercept:
• Plotting a graph and the least-squares fit:
– Create a chart using Excel's built-in Chart Wizard.