Analytical Chemistry

Chapter 2
Statistics in Analytical Chemistry- Part 2

Instructor: Nguyen Thao Trang


Semester I 2016-2017
Outline
• Hypothesis test

• Detection of gross errors

• Standardization and calibration

Hypothesis test
• Experimental results seldom agree exactly with those
predicted from a theoretical model:
– Scientists and engineers must frequently judge whether a numerical
difference is the result of random errors or of systematic errors. Certain
statistical tests are useful in sharpening these judgments.
– Tests of this kind use a null hypothesis, which assumes that the
numerical quantities being compared are not different.

• Specific examples of hypothesis tests:
– Compare the mean of an experimental data set with what is believed to be the true value;
– Compare the mean to a predicted or cutoff (threshold) value;
– Compare the means or the standard deviations from two or more sets
of data.

Hypothesis test
• Comparison of an experimental mean with a known
value (true or predicted value).
– A large number of measurements or known σ.
– A small number of measurements or unknown σ.

• Comparison between two experimental means.


– t test for differences of the means.
– t test for paired data.

• Comparison of precision: F test

Comparing an Experimental Mean with a Known Value

• A statistical hypothesis test is used to draw conclusions about
the population mean μ and its closeness to the known value μ0.

• A known value (μ0):
– The true or accepted value based on prior knowledge or experience.
– Predicted from theory.
– A threshold value for making decisions about the presence or absence
of a constituent.

Comparing an Experimental Mean with a Known Value

• Two contradictory outcomes:
1. Null hypothesis H0: μ = μ0
2. Alternative hypothesis Ha, in whose favor the null hypothesis is rejected:
– Two-tailed: μ ≠ μ0
– One-tailed: μ > μ0 or μ < μ0

• Example: determining whether the concentration of lead in an
industrial wastewater discharge exceeds the maximum
permissible amount of 0.05 ppm:
– H0: μ = 0.05 ppm
– Ha: μ > 0.05 ppm

Comparing an Experimental Mean with a Known Value

• Test procedure:
– Step 1: Formulation of an appropriate test statistic:
• z statistic: a large number of measurements or known σ.
• t statistic: small numbers of measurements with unknown σ.
• If not sure: use t statistic.

– Step 2: Identification of a rejection region:
• The null hypothesis is rejected if the test statistic lies within the
rejection region.

Comparing an Experimental Mean with a Known Value

• A large number of measurements (or known σ) – z test statistic:
– State the null hypothesis H0: μ = μ0
– Form the test statistic:
  $z = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{N}}$
  where x̄ is the experimental mean and N is the number of measurements.
– State the alternative hypothesis, Ha, and determine the rejection region:
• For Ha: μ ≠ μ0, reject H0 if z ≥ zcrit or if z ≤ –zcrit
• For Ha: μ > μ0, reject H0 if z ≥ zcrit
• For Ha: μ < μ0, reject H0 if z ≤ –zcrit
– zcrit: critical value of z listed in Table 7.1 (Chapter 2, p. 37) at
different values of confidence level.

Comparing an Experimental Mean with a Known Value

• A large number of measurements (or known σ) – z test statistic:
– For Ha: μ ≠ μ0, reject H0 if z ≥ zcrit or if z ≤ –zcrit → reject for either a
positive or a negative value of z whose magnitude exceeds the critical
value → two-tailed test
• At the 95% confidence level: zcrit = 1.96

Comparing an Experimental Mean with a Known Value

• A large number of measurements (or known σ) – z test statistic:
– For Ha: μ > μ0, reject H0 if z ≥ zcrit → reject for a positive value of z
that exceeds the critical value → one-tailed test.
– For Ha: μ < μ0, reject H0 if z ≤ –zcrit → reject for a negative value of z
whose magnitude exceeds the critical value → one-tailed test.
• At the 95% confidence level: zcrit = 1.64
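
The critical values used for one-tailed and two-tailed z tests can be reproduced from the standard normal distribution. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `z_crit` and its `tails` parameter are illustrative assumptions, not part of the course material.

```python
from statistics import NormalDist


def z_crit(confidence: float, tails: int = 2) -> float:
    """Critical z for a given confidence level (e.g. 0.95).

    tails=2 splits the rejection probability between both ends of the
    standard normal curve; tails=1 puts all of it in one end.
    """
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    alpha = 1.0 - confidence          # total rejection probability
    return NormalDist().inv_cdf(1.0 - alpha / tails)


two_tailed_95 = z_crit(0.95, tails=2)   # about 1.96
one_tailed_95 = z_crit(0.95, tails=1)   # about 1.64
two_tailed_99 = z_crit(0.99, tails=2)   # about 2.58
print(round(two_tailed_95, 2), round(one_tailed_95, 2), round(two_tailed_99, 2))
```

For the same confidence level, the one-tailed critical value is smaller because the whole rejection probability sits in a single tail.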

Comparing an Experimental Mean with a Known Value

• A large number of measurements (or known σ) – z test statistic:
– Example: A class of 30 students determined the activation energy of a
chemical reaction to be 27.7 ± 5.2 kcal/mol. Are the data in agreement
with the literature value of 30.8 kcal/mol at (1) the 95% confidence
level and (2) the 99% confidence level?
• With N = 30, s should be a good estimate of σ. The null
hypothesis is H0: μ = 30.8 kcal/mol; the alternative hypothesis is
Ha: μ ≠ 30.8 kcal/mol.
• Calculate z:
  $z = \dfrac{27.7 - 30.8}{5.2 / \sqrt{30}} = -3.27$
• Look up zcrit:
zcrit = 1.96 for the 95% confidence level
zcrit = 2.58 for the 99% confidence level
• Since z = –3.27 ≤ –1.96, we reject the null hypothesis at the
95% confidence level. Since z ≤ –2.58 as well, we also reject it at the 99% confidence level.
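
The activation-energy example can be checked numerically. This is a small sketch under the same assumption as the example (s is taken as σ); the function names `z_statistic` and `reject_two_tailed` are hypothetical helpers introduced only for illustration.

```python
import math


def z_statistic(mean: float, mu0: float, sigma: float, n: int) -> float:
    """z = (mean - mu0) / (sigma / sqrt(n))."""
    return (mean - mu0) / (sigma / math.sqrt(n))


def reject_two_tailed(z: float, z_critical: float) -> bool:
    """Two-tailed rule: reject H0 when z >= zcrit or z <= -zcrit."""
    return abs(z) >= z_critical


# Values from the example: 27.7 +/- 5.2 kcal/mol, N = 30, literature 30.8.
z = z_statistic(mean=27.7, mu0=30.8, sigma=5.2, n=30)
reject_95 = reject_two_tailed(z, 1.96)   # zcrit at the 95% confidence level
reject_99 = reject_two_tailed(z, 2.58)   # zcrit at the 99% confidence level
print(f"z = {z:.2f}, reject at 95%: {reject_95}, reject at 99%: {reject_99}")
```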
Comparing an Experimental Mean with a Known Value

• A small number of measurements (or unknown σ) – t test statistic:
– State the null hypothesis H0: μ = μ0
– Form the test statistic:
  $t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{N}}$
  where s is the sample standard deviation and N is the number of measurements.
– State the alternative hypothesis, Ha, and determine the rejection region:
• For Ha: μ ≠ μ0, reject H0 if t ≥ tcrit or if t ≤ –tcrit
• For Ha: μ > μ0, reject H0 if t ≥ tcrit
• For Ha: μ < μ0, reject H0 if t ≤ –tcrit
– tcrit: critical value of t listed in Table 7.3 (Chapter 3, p. 44) at
different values of confidence level, for N – 1 degrees of freedom.

Comparing an Experimental Mean with a Known Value

• A small number of measurements (or unknown σ) – t test statistic:
– Example: A new procedure for the rapid determination of the
percentage of sulfur in kerosenes was tested on a sample known from
its method of preparation to contain 0.123% S (μ0 = 0.123%). The
results were % S = 0.112, 0.118, 0.115, and 0.119. Do the data indicate
that there is a bias in the method at the 95% confidence level?
• The null hypothesis is H0: μ = 0.123% S, and the alternative
hypothesis is Ha: μ ≠ 0.123% S.
• Calculate the mean, the standard deviation, and t: x̄ = 0.116% S, s = 0.0032% S
  $t = \dfrac{0.116 - 0.123}{0.0032 / \sqrt{4}} = -4.375$
• Look up Table 7.3: at the 95% confidence level and 3 degrees of
freedom, tcrit = 3.18
• Calculated t (–4.375) < –tcrit (–3.18) → a significant difference at the
95% confidence level and thus bias in the method.
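
The sulfur example can be reproduced with Python's standard library. This is a sketch, not the course's own procedure; the variable names are illustrative. Because the code keeps s unrounded, it gives t ≈ –4.43 rather than the –4.375 that results from rounding s to 0.0032, and the conclusion is the same.

```python
import math
import statistics

results = [0.112, 0.118, 0.115, 0.119]   # % S, from the example
mu0 = 0.123                               # known value, % S
t_crit = 3.18                             # 95%, 3 degrees of freedom (Table 7.3)

n = len(results)
mean = statistics.mean(results)
s = statistics.stdev(results)             # sample standard deviation (N - 1)
t = (mean - mu0) / (s / math.sqrt(n))

biased = abs(t) >= t_crit                 # two-tailed rejection rule
print(f"mean = {mean:.3f}, s = {s:.4f}, t = {t:.2f}, bias indicated: {biased}")
```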
Comparison of Two Experimental Means
• t test for differences in the means:
– Null hypothesis: the 2 means are identical and any difference is the
result of random errors: H0: μ1 = μ2
– Alternative hypothesis: Ha: μ1 ≠ μ2
– The test statistic t is calculated by:
  $t = \dfrac{\bar{x}_1 - \bar{x}_2}{s_{pooled}\sqrt{\dfrac{N_1 + N_2}{N_1 N_2}}}$
• x̄1 and x̄2 are the means of set 1 and set 2.
• spooled is the pooled estimate of σ (Chapter 2, p. 30).
• N1 and N2 are the numbers of results in set 1 and set 2.
– Obtain tcrit from Table 7.3 with (N1 + N2 – 2) degrees of freedom.
– Compare t with tcrit:
• If |t| < tcrit: the null hypothesis is accepted → no significant difference between the means
• If |t| > tcrit: the null hypothesis is rejected → significant difference between the means
Comparison of Two Experimental Means
• t test for differences in the means:
– Example: 2 barrels of wine were analyzed for their alcohol content to
determine whether they were from different sources. On the basis of
6 analyses, the average content of the 1st barrel was 12.61% ethanol. 4
analyses of the 2nd barrel gave a mean of 12.53% alcohol. The 10
analyses yielded spooled of 0.070%. Do the data indicate a difference
between the wines?
– Null hypothesis H0: μ1 = μ2, and alternative hypothesis Ha: μ1 ≠ μ2.
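– The test statistic t:
  $t = \dfrac{12.61 - 12.53}{0.070\sqrt{\dfrac{6 + 4}{6 \times 4}}} = 1.771$
– tcrit at the 95% confidence level (degrees of freedom: 10 – 2 = 8) = 2.31
– As 1.771 < 2.31 → the null hypothesis is accepted: no significant difference in the
alcohol content between the 2 barrels.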
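
The wine example works from summary statistics only, so it is easy to verify. A minimal sketch follows; `t_two_means` is a hypothetical helper name, and the critical value is simply the one quoted in the example.

```python
import math


def t_two_means(mean1, mean2, s_pooled, n1, n2):
    """t = (mean1 - mean2) / (s_pooled * sqrt((n1 + n2) / (n1 * n2)))."""
    return (mean1 - mean2) / (s_pooled * math.sqrt((n1 + n2) / (n1 * n2)))


# Summary values from the example (% ethanol).
t = t_two_means(mean1=12.61, mean2=12.53, s_pooled=0.070, n1=6, n2=4)
t_crit = 2.31                 # 95%, 6 + 4 - 2 = 8 degrees of freedom
different = abs(t) > t_crit   # two-tailed comparison
print(f"t = {t:.3f}, significant difference: {different}")
```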

Comparison of Two Experimental Means
• Paired data:
– Use of pairs of measurements on the same sample to minimize
sources of variability that are not of interest.
– The paired t test uses the same type of procedure as the normal t test
except that pairs of data are analyzed.
– The null hypothesis is H0: μd = Δ0, where Δ0 is a specific value of the
difference to be tested, often zero.
– Alternative hypothesis: μd ≠ Δ0; μd < Δ0 or μd > Δ0
– The test statistic t:
  $t = \dfrac{\bar{d} - \Delta_0}{s_d / \sqrt{N}}$
• Where d̄ is the average difference, $\bar{d} = \dfrac{\sum d_i}{N}$; di: difference in each data pair
• sd is the standard deviation of the difference:
  $s_d = \sqrt{\dfrac{\sum_{i=1}^{N} d_i^2 - \dfrac{\left(\sum_{i=1}^{N} d_i\right)^2}{N}}{N - 1}}$
Comparison of Two Experimental Means
• Paired data:
– Example: A new automated procedure for determining glucose in
serum (Method A) is to be compared with the established method
(Method B). Both methods are performed on serum from the same 6
patients to eliminate patient-to-patient variability. Do the following
results confirm a difference in the two methods at the 95% confidence level?

– Hypotheses: If μd is the true average difference between the 2 methods, the
null hypothesis is H0: μd = 0 and the alternative hypothesis is Ha: μd ≠ 0.
– Test statistic t:

– Since t > tcrit = 2.57 (at the 95% confidence level and 5 degrees of freedom) → reject the
null hypothesis and conclude that the 2 methods give different results.
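
The paired t procedure can be sketched in a few lines. The readings below are hypothetical numbers chosen only to exercise the formulas; they are not the patient data of the glucose example, and `paired_t` is an illustrative helper name.

```python
import math


def paired_t(set_a, set_b, delta0=0.0):
    """Paired t statistic: t = (d_mean - delta0) / (s_d / sqrt(N))."""
    if len(set_a) != len(set_b):
        raise ValueError("paired data need equally long sets")
    d = [a - b for a, b in zip(set_a, set_b)]
    n = len(d)
    d_mean = sum(d) / n
    # s_d from the sum-of-squares form: sqrt((sum d^2 - (sum d)^2 / N) / (N - 1))
    s_d = math.sqrt((sum(x * x for x in d) - sum(d) ** 2 / n) / (n - 1))
    return (d_mean - delta0) / (s_d / math.sqrt(n))


# Hypothetical paired readings, NOT the patient data of the example above.
method_a = [10.2, 11.5, 9.8, 12.1, 10.9, 11.3]
method_b = [10.0, 11.1, 9.9, 11.8, 10.5, 11.0]

t = paired_t(method_a, method_b)
t_crit = 2.57                 # 95%, 5 degrees of freedom
print(f"t = {t:.2f}, reject H0: {abs(t) > t_crit}")
```

Working on the differences removes the sample-to-sample variation, which is why the pairing can reveal a small systematic offset between methods.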
Comparison of Precision: F test
• F test: can be used when
– Comparing the variances (or standard deviations) of two populations,
provided that the populations follow the normal (Gaussian)
distribution.
– Comparing more than two means, and in linear regression analysis.

• F test for comparison of the variances:
– Null hypothesis H0: σ1² = σ2²;
– Alternative hypothesis Ha: σ1² ≠ σ2² (two-tailed test) or σ1² > σ2² (one-tailed test).
– Calculate the test statistic F: $F = \dfrac{s_1^2}{s_2^2}$ (place the larger variance in the numerator).
– Compare F with Fcrit at the desired significance level.

Comparison of Precision: F test
• Critical values of F at the 0.05 significance level are shown:

– Two degrees of freedom: one associated with the numerator and the
other with the denominator.
– Can be used in either a one-tailed mode or a two-tailed mode.
Comparison of Precision: F test
• Example: A standard method for the determination of the CO level in
gaseous mixtures is known from many hundreds of measurements to have
a standard deviation s of 0.21 ppm CO. A modification of the method
yields a value for s of 0.15 ppm CO for a pooled data set with 12 degrees
of freedom. A 2nd modification, also based on 12 degrees of freedom, has
an s of 0.12 ppm CO. Is either modification significantly more precise than
the original?
– Null hypothesis H0: σstd² = σ² (where σstd² is the variance of the
standard method and σ² is the variance of the modified method).
The alternative hypothesis is one-tailed, Ha: σ² < σstd².
– The variances of the modifications are placed in the denominator:
• Calculate the test statistic F for the 1st and 2nd modifications:
  $F_1 = \dfrac{s_{std}^2}{s_1^2} = \dfrac{(0.21)^2}{(0.15)^2} = 1.96 \qquad F_2 = \dfrac{s_{std}^2}{s_2^2} = \dfrac{(0.21)^2}{(0.12)^2} = 3.06$
• Because sstd is a good estimate of σ, the number of degrees of
freedom of the numerator can be taken as infinite; at the 95%
confidence level, Fcrit = 2.30.
Comparison of Precision: F test
• Example:
– F1 < Fcrit: accept the null hypothesis. There is no improvement in
precision.
– F2 > Fcrit: reject the null hypothesis. The 2nd modification does appear to
give better precision at the 95% confidence level.
– Comparison between the 2 modifications:
• Null hypothesis: σ1² = σ2²;
• Calculate the test statistic F:
  $F = \dfrac{s_1^2}{s_2^2} = \dfrac{(0.15)^2}{(0.12)^2} = 1.56$
• With Fcrit = 2.69. Since F < 2.69, we must accept H0 and conclude that
the two methods give equivalent precision.
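
The three F ratios of the CO example can be recomputed directly from the standard deviations. This is a minimal sketch; `f_statistic` is a hypothetical helper, and the critical values are the ones quoted on the slides rather than computed.

```python
def f_statistic(s_numerator: float, s_denominator: float) -> float:
    """F = s_numerator**2 / s_denominator**2 (ratio of variances)."""
    return s_numerator ** 2 / s_denominator ** 2


s_std, s_mod1, s_mod2 = 0.21, 0.15, 0.12      # ppm CO, from the example

# One-tailed tests against the standard method (Fcrit = 2.30).
f1 = f_statistic(s_std, s_mod1)
f2 = f_statistic(s_std, s_mod2)
# Comparison of the two modifications, larger variance on top (Fcrit = 2.69).
f12 = f_statistic(max(s_mod1, s_mod2), min(s_mod1, s_mod2))

print(f"F1 = {f1:.2f} -> improved: {f1 > 2.30}")
print(f"F2 = {f2:.2f} -> improved: {f2 > 2.30}")
print(f"F(1 vs 2) = {f12:.2f} -> different precision: {f12 > 2.69}")
```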

Detection of gross errors: Q test
• The Q test is used to decide whether a suspected result should be
retained or rejected.

• Calculate Q:
  $Q = \dfrac{|x_q - x_n|}{w}$
  where xq is the questionable result, xn is its nearest neighbor, and w is
  the spread of the entire set.

• Compare Q with the critical values Qcrit in Table 7-5:
If Q > Qcrit, the questionable result can be rejected with the indicated
degree of confidence.
Detection of gross errors: Q test
• Example: The analysis of a calcite sample yielded CaO percentages of
55.95, 56.00, 56.04, 56.08, and 56.23. The last value appears anomalous;
should it be retained or rejected at the 95% confidence level?

– The difference between 56.23 and 56.08 is 0.15%. The spread (56.23 –
55.95) is 0.28%. Thus:
  $Q = \dfrac{0.15}{0.28} = 0.54$
– For 5 measurements, Qcrit at the 95% confidence level is 0.71. Because
0.54 < 0.71, we must retain the outlier at the 95% confidence level.
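
The calcite example can be verified with a short function. This is a sketch that assumes the suspect value is the smallest or largest result; `q_statistic` is an illustrative name, and Qcrit is the value quoted in the example.

```python
def q_statistic(values, suspect):
    """Q = |x_q - x_n| / w for a suspect value at either end of the data."""
    data = sorted(values)
    if len(data) < 3:
        raise ValueError("need at least 3 results")
    if suspect == data[-1]:
        nearest = data[-2]
    elif suspect == data[0]:
        nearest = data[1]
    else:
        raise ValueError("the suspect value must be the smallest or largest result")
    spread = data[-1] - data[0]          # w: range of the entire set
    return abs(suspect - nearest) / spread


cao = [55.95, 56.00, 56.04, 56.08, 56.23]   # % CaO, from the example
q = q_statistic(cao, suspect=56.23)
q_crit = 0.71                               # 5 results, 95% confidence
print(f"Q = {q:.2f}, reject the suspect value: {q > q_crit}")
```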

Standardization and calibration
• Calibration:
– Determines the relationship between the analytical
response and the analyte concentration.

– Usually accomplished by the use of chemical standards.

– Standards comparison methods:

• Direct comparison: compare a property of the analyte with a
standard such that the property being tested matches or nearly
matches that of the standard.
• Titration procedure: the analyte reacts with a standardized
reagent (the titrant) in a reaction of known stoichiometry.
External standard calibration
• External standards:
– Prepared separately from the sample.

– Used to calibrate instruments and procedures when there are no
interference effects from matrix components in the analyte solution.

– Procedure:
• A series of such external standards containing the analyte in
known concentrations is prepared.
• Calibration is accomplished by obtaining the response signal
(absorbance, peak height, peak area) as a function of the known
analyte concentration.
• A calibration curve is prepared by plotting the data or by fitting
them to a suitable mathematical equation.

The least-squares method
• Assumptions:
1. A linear relationship actually exists between
the measured response y and the standard
analyte concentration x, described by the
equation y = mx + b → regression model.
2. Any deviation of the individual points from
the straight line arises from error in the
measurement.

• The vertical deviation of each point from the
straight line is called a residual.

The least-squares method
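• The least-squares method minimizes the sum of the squares of the
residuals, SSresid:
  $SS_{resid} = \sum_{i=1}^{N} [y_i - (b + m x_i)]^2$
  $S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \dfrac{(\sum x_i)^2}{N}$
  $S_{yy} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - \dfrac{(\sum y_i)^2}{N}$
  $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \dfrac{\sum x_i \sum y_i}{N}$
Where xi and yi are individual pairs of data for x and y;
N is the number of data pairs;
x̄ and ȳ are the average values of x and y.

Slope: $m = \dfrac{S_{xy}}{S_{xx}}$
Intercept: $b = \bar{y} - m\bar{x}$
Standard deviation about the regression: $s_r = \sqrt{\dfrac{S_{yy} - m^2 S_{xx}}{N - 2}}$
Standard deviation of the slope: $s_m = \sqrt{\dfrac{s_r^2}{S_{xx}}}$
Standard deviation of the intercept: $s_b = s_r \sqrt{\dfrac{\sum x_i^2}{N \sum x_i^2 - (\sum x_i)^2}}$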
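
The least-squares quantities can be computed in a few lines of plain Python. The sketch below assumes the standard textbook expressions for the slope, intercept, and their standard deviations; the calibration data are hypothetical, and `least_squares` is an illustrative helper name.

```python
import math


def least_squares(x, y):
    """Return slope, intercept and their standard deviations for y = m*x + b."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    s_yy = sum((yi - y_mean) ** 2 for yi in y)
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

    m = s_xy / s_xx                                   # slope
    b = y_mean - m * x_mean                           # intercept
    # max() guards against a tiny negative value from rounding on a perfect fit
    s_r = math.sqrt(max(s_yy - m ** 2 * s_xx, 0.0) / (n - 2))
    s_m = s_r / math.sqrt(s_xx)                       # std. dev. of the slope
    sum_x2 = sum(xi ** 2 for xi in x)
    s_b = s_r * math.sqrt(sum_x2 / (n * sum_x2 - sum(x) ** 2))
    return {"m": m, "b": b, "s_r": s_r, "s_m": s_m, "s_b": s_b}


# Hypothetical calibration data: concentration (x) vs. instrument response (y).
conc = [0.0, 1.0, 2.0, 3.0, 4.0]
signal = [0.1, 2.1, 3.9, 6.1, 7.9]

fit = least_squares(conc, signal)
unknown_conc = (5.0 - fit["b"]) / fit["m"]   # invert the line for a response of 5.0
print(fit, unknown_conc)
```

The last line shows how a calibration line is used in practice: the response of an unknown is converted to a concentration by solving y = mx + b for x.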

The least-squares method
• The total sum of the squares, SStot, is defined as:
  $SS_{tot} = \sum (y_i - \bar{y})^2$
• Coefficient of determination (R2): measures the fraction of the
observed variation in y that is explained by the linear relationship:
  $R^2 = 1 - \dfrac{SS_{resid}}{SS_{tot}}$
– The closer R2 is to unity, the better the linear model explains the y
variations.
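
The coefficient of determination follows directly from the two sums of squares. A minimal sketch, using hypothetical data and a slope and intercept supplied by hand; `r_squared` is an illustrative helper name.

```python
def r_squared(x, y, m, b):
    """R^2 = 1 - SS_resid / SS_tot for the line y = m*x + b."""
    y_mean = sum(y) / len(y)
    ss_resid = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1.0 - ss_resid / ss_tot


# Hypothetical data and a hand-supplied line y = 1.96x + 0.10.
conc = [0.0, 1.0, 2.0, 3.0, 4.0]
signal = [0.1, 2.1, 3.9, 6.1, 7.9]
r2 = r_squared(conc, signal, m=1.96, b=0.10)
print(f"R^2 = {r2:.4f}")
```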

Transformed variables
• The least-squares method can be applied to nonlinear models by
converting them into a simple linear model, as shown in Table
8.3:
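
As an illustration of a transformed variable, an exponential model y = b·e^(mx) becomes linear after taking logarithms: ln y = ln b + m·x. The sketch below fits that transformed line; the choice of the exponential model, the noise-free data, and the helper `fit_line` are assumptions made for illustration and are not taken from the table.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = m*x + b."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    m = s_xy / s_xx
    return m, y_mean - m * x_mean


# Hypothetical data generated from y = 2.0 * exp(0.5 * x) without noise.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0 * math.exp(0.5 * xi) for xi in x]

# Transform: ln(y) = ln(b) + m*x is linear in x.
m, ln_b = fit_line(x, [math.log(yi) for yi in y])
b = math.exp(ln_b)            # back-transform the intercept
print(f"m = {m:.3f}, b = {b:.3f}")
```

With real, noisy data the transformation also changes how the errors are weighted, so the fitted parameters are estimates for the transformed model.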

Using Excel
• Calculation of slope and intercept:

Using Excel
• Plotting a graph and the least-squares fit:
– Create a chart using the built-in Chart Wizard of Excel.
– Right-click on any data point and then click on Add Trendline.
