
Chapter 7
Statistical Intervals Based on a Single Sample
Outline

7.1 Introduction to Confidence Intervals
7.2 Large-Sample Confidence Intervals for A Population Mean and Proportion
7.3 Confidence Intervals Based on A Normal Population Distribution
7.4 Confidence Intervals for the Variance and Standard Deviation of A Normal Population
Introduction to Confidence Intervals

Population mean µ
Sample mean X̄
Variability of point estimates

[Figure: sample means of 100 samples plotted against sample number; the point estimates fluctuate around the population mean]

Interval estimates: X̄ ± margin of error
The Idea of Interval Estimates
Leslie Kish (1910-2000)

[Figure: four histograms of interval estimates with margins of error 0.5, 1, 1.5, and 2; a larger margin of error gives a wider interval]
Principle of Normal Interval Estimates
Let X1, X2, · · · , Xn be a sample of size n taken from N(µ, σ²), and let X̄ = (1/n) Σⁿᵢ₌₁ Xᵢ. If σ is known, the normal table shows that

    P[µ − 1.96 σ/√n < X̄ < µ + 1.96 σ/√n] = 0.95.

It is straightforward to show that

    P[X̄ − 1.96 σ/√n < µ < X̄ + 1.96 σ/√n] = 0.95.

This means the probability that the random interval [X̄ − 1.96 σ/√n, X̄ + 1.96 σ/√n] contains the unknown parameter µ is 0.95, or that its realization

    (x̄ − 1.96 σ/√n, x̄ + 1.96 σ/√n)

is said to include µ with 95% confidence.
In general, for any α > 0 there exists a zα/2 such that

    P[X̄ − zα/2 σ/√n < µ < X̄ + zα/2 σ/√n] = 1 − α.

A 100(1 − α)% confidence interval for µ is (x̄ − zα/2 σ/√n, x̄ + zα/2 σ/√n).
The term "confidence interval" was coined by Jerzy Neyman (1934, Journal of the Royal Statistical Society, 97, 558-625, p. 562). However, Pierre Simon Laplace (1749-1827) had already introduced confidence interval procedures (1814, Théorie analytique des probabilités, 2nd ed.).
Interpretation of Confidence Intervals
Recall that the probability of an event is the long-run relative frequency with which the event occurs. A 100(1 − α)% confidence interval for µ is interpreted as follows: if the process of taking a sample of size n is repeated many times, 100(1 − α)% of all interval estimates constructed over the repetitions will cover µ in the long run.
In short, the interval (x̄ − zα/2 σ/√n, x̄ + zα/2 σ/√n) has 100(1 − α)% confidence of covering µ.
In a short run of repetitions, however, the proportion of 95% confidence intervals that actually cover µ can differ from 95%.

[Figure: three simulations of twenty 95% confidence intervals each; most, but not all, of the intervals cover µ]
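The long-run claim can be checked by simulation. A minimal sketch in Python (the N(100, 5²) population, the sample size n = 25, and z = 1.96 are illustrative choices, not from the slides):

```python
import math
import random

random.seed(1)
mu, sigma, n, z = 100.0, 5.0, 25, 1.96   # illustrative population and sample size

reps = 10000
covered = 0
for _ in range(reps):
    xbar = sum(random.gauss(mu, sigma) for _ in range(n)) / n
    half = z * sigma / math.sqrt(n)      # known-sigma margin of error
    if xbar - half < mu < xbar + half:
        covered += 1

print(covered / reps)                    # close to 0.95 in the long run
```

Any single realized interval either covers µ or it does not; the 95% describes the procedure, not one particular interval.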
Keyboard Heights

Example
Industrial engineers who specialize in ergonomics are concerned with designing
workspaces and devices operated by workers so as to achieve high productivity and
comfort. Nakaseko, M., Grandjean, E., Hünting, W., and Gierer, R. (1985, Studies on
Ergonomically Designed Alphanumeric Keyboards, Human Factors: The Journal of the
Human Factors and Ergonomics Society 27(2), 175-187) reported on a study of preferred
height for an experimental keyboard with large forearm-wrist support. A sample of
n = 31 trained typists was selected, and the preferred keyboard height was determined
for each typist. The resulting sample average preferred height was x̄ = 80.0 cm. The
paper assumed the preferred height to be normally distributed with σ = 2.0 cm.
1 Construct a 95% confidence interval for µ. (79.3, 80.7).
For the specific interval like (79.3, 80.7), the possibility of covering the true mean
µ is either 0 or 1.
2 Construct a 80% confidence interval for µ. (79.54, 80.46).
3 Construct a 99% confidence interval for µ. (79.08, 80.92).
4 What can you observe?
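These intervals can be reproduced in a few lines of Python; the z values 1.96, 1.282, and 2.576 are the usual normal-table critical values:

```python
import math

xbar, sigma, n = 80.0, 2.0, 31           # summary data from the keyboard study

for conf, z in [(0.95, 1.96), (0.80, 1.282), (0.99, 2.576)]:
    moe = z * sigma / math.sqrt(n)       # margin of error with known sigma
    print(f"{conf:.0%} CI: ({xbar - moe:.2f}, {xbar + moe:.2f})")
```

The interval widens as the confidence level increases: higher confidence costs precision.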

Margin of Error
The margin of error of a confidence interval equals its half-width and
determines the accuracy of the interval. The accuracy of a
measurement system is the degree of closeness of measurements of a
quantity to its true value. The precision of a measurement system, also
called reproducibility or repeatability, is the degree of closeness of
repeated measurements under the same conditions. One strategy for
decreasing the margin of error of a confidence interval at a desired level is to
increase the sample size.

[Figure: four panels illustrating accurate & precise, inaccurate & precise, accurate & imprecise, and inaccurate & imprecise measurements]

Sample Size Determination
The sample size necessary for a 100(1 − α)% confidence interval
(x̄ − zα/2 σ/√n, x̄ + zα/2 σ/√n) with a margin of error d is n = (zα/2 σ/d)², rounded up to the next integer.

Example
A limnologist wishes to estimate the mean phosphate content per unit volume of
lake water. It is known from previous studies that the phosphate content follows a
normal distribution with a standard deviation of 4 milligrams. How many water
samples must the limnologist analyze to be 90% certain that the error of her
estimation doesn’t exceed 0.8 milligrams? (68)

Example
A research worker wants to determine the average time it takes a mechanic to
rotate the tires of a car and she wants to be able to assert with 95% confidence
that the mean of her sample is off by at most 0.50 minute. If she can assume
from her past experience that the working time follows a normal distribution with
a standard deviation 1.6 minutes, how large a sample will she have to take? (40)
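Both answers come from rounding n = (zα/2 σ/d)² up to the next integer; a quick check in Python (z values hardcoded from the normal table):

```python
import math

def sample_size(z, sigma, d):
    """Smallest n with z * sigma / sqrt(n) <= d."""
    return math.ceil((z * sigma / d) ** 2)

print(sample_size(1.645, 4.0, 0.8))   # limnologist, 90% confidence -> 68
print(sample_size(1.96, 1.6, 0.5))    # tire rotation, 95% confidence -> 40
```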

A General Procedure for Confidence Intervals
Find a random variable that is a function of X1, X2, . . . , Xn and the unknown
parameter θ of interest, say h(X1, X2, . . . , Xn; θ), whose probability distribution
does not depend on θ or on any other unknown parameters.
For any α between 0 and 1, find constants a and b such that
P[a < h(X1 , X2 , . . . , Xn ; θ) < b] = 1 − α.
If the inequality a < h(X1, X2, . . . , Xn; θ) < b can be rearranged as
L(X1, X2, . . . , Xn) < θ < U(X1, X2, . . . , Xn), so that
P[L(X1, X2, . . . , Xn) < θ < U(X1, X2, . . . , Xn)] = 1 − α, then a 100(1 − α)%
confidence interval for θ is (L(x1, x2, . . . , xn), U(x1, x2, . . . , xn)).

Example
A theoretical model suggests that the time to the breakdown of an insulating fluid
between electrodes at a particular voltage has an exponential distribution with
parameter λ. A random sample of 10 breakdown times yield the following data (in
minutes): 41.53, 18.73, 2.99, 30.34, 12.33, 117.52, 73.02, 223.63, 4.00, 26.78. Find a
95% confidence interval for the population average breakdown time. Note that 2λXᵢ has
a χ² distribution with 2 degrees of freedom, so h(X1, X2, . . . , Xn; λ) = 2λ Σⁿᵢ₌₁ Xᵢ has
a χ² distribution with 2n degrees of freedom.
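Inverting the pivot by hand: with Σxᵢ = 550.87 and the χ² (20 df) percentiles 9.591 (2.5th) and 34.170 (97.5th) from the table, the event 9.591 < 2λΣxᵢ < 34.170 gives an interval for λ, which inverts to one for the mean 1/λ. A sketch in Python (the two quantiles are hardcoded table values):

```python
times = [41.53, 18.73, 2.99, 30.34, 12.33, 117.52, 73.02, 223.63, 4.00, 26.78]
total = sum(times)                       # about 550.87

chi2_lo, chi2_hi = 9.591, 34.170         # 2.5th and 97.5th percentiles of chi-square(20)

# 9.591 < 2*lambda*total < 34.170  =>  bounds for lambda, then invert for the mean 1/lambda
lam_lo, lam_hi = chi2_lo / (2 * total), chi2_hi / (2 * total)
mean_lo, mean_hi = 1 / lam_hi, 1 / lam_lo

print(f"lambda in ({lam_lo:.5f}, {lam_hi:.5f})")
print(f"mean breakdown time in ({mean_lo:.1f}, {mean_hi:.1f}) minutes")  # about (32.2, 114.9)
```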

Bootstrap Confidence Intervals
Let X1 , X2 , . . . , Xn be a random sample of size n. A 100(1 − α)% confidence
bootstrap percentile interval for a specific population parameter is obtained by
1 generating B bootstrap samples,
2 calculating, for each sample, the value of a statistic that estimates the parameter, and
3 finding the 100(α/2)th and 100(1 − α/2)th percentiles of all the values of that statistic, say L and U.
The 100(1 − α)% confidence bootstrap percentile interval for the parameter is
(L, U ).

Example
Twenty HIV-positive subjects received an experimental antiviral drug and their CD4 cell
counts (or T-cells, white blood cells) in hundreds were recorded at baseline and after
one year of treatment. The two measurements are correlated with a sample correlation
coefficient of θ̂ = 0.723. We wish to construct a 95% confidence interval for the
population correlation coefficient θ. (Thomas J. DiCiccio and Bradley Efron, 1996,
Bootstrap confidence intervals, Statistical Science, 11(3), 189-228)

Bootstrap Confidence Intervals (cont’d)
Subject Baseline One year Subject Baseline One year
1 2.12 2.47 11 4.15 4.74
2 4.35 4.61 12 3.56 3.29
3 3.39 5.26 13 3.39 5.55
4 2.51 3.02 14 1.88 2.82
5 4.04 6.36 15 2.56 4.23
6 5.10 5.93 16 2.96 3.23
7 3.77 3.93 17 2.49 2.56
8 3.35 4.09 18 3.03 4.31
9 4.10 4.88 19 2.66 4.37
10 3.35 3.81 20 3.00 2.40

[Figure: scatter plot of one-year CD4 counts against baseline counts]

baseline<-c(2.12,4.35,3.39,2.51,4.04,5.10,3.77,3.35,4.10,3.35,4.15,3.56,3.39,1.88,2.56,2.96,2.49,3.03,2.66,3)
oneyear<-c(2.47,4.61,5.26,3.02,6.36,5.93,3.93,4.09,4.88,3.81,4.74,3.29,5.55,2.82,4.23,3.23,2.56,4.31,4.37,2.4)
data<-data.frame(cbind(baseline,oneyear))
set.seed(12345)
#correlation function
cor.mat<-function(data) cor(data[,1],data[,2])
boot.conf.int<-function(data, func, num,conf.level){
#Take samples from matrix
resamples <-lapply(1:num, function(i) data[sample.int(nrow(data), replace=T),])
#Calculate statistics
r.statistic<-sapply(resamples, func);original<-func(data)
lower<-quantile(r.statistic,probs=(1-conf.level)/2)
upper<-quantile(r.statistic,probs=(1+conf.level)/2)
out<-matrix(c(lower,original,upper),nrow=1,byrow=TRUE)
colnames(out)<-c("Lower limit","Statistic","Upper limit")
out }
boot.conf.int(data,cor.mat,1000,0.95)
Lower limit Statistic Upper limit
[1,] 0.5085269 0.7231654 0.8654607

Bootstrap Confidence Intervals (cont’d)
Subject Baseline One year Subject Baseline One year
1 2.12 2.47 11 4.15 4.74
2 4.35 4.61 12 3.56 3.29
3 3.39 5.26 13 3.39 5.55
4 2.51 3.02 14 1.88 2.82
5 4.04 6.36 15 2.56 4.23
6 5.10 5.93 16 2.96 3.23
7 3.77 3.93 17 2.49 2.56
8 3.35 4.09 18 3.03 4.31
9 4.10 4.88 19 2.66 4.37
10 3.35 3.81 20 3.00 2.40

library(boot)
baseline<-c(2.12,4.35,3.39,2.51,4.04,5.10,3.77,3.35,4.10,3.35,4.15,3.56,3.39,1.88,2.56,2.96,2.49,3.03,2.66,3)
oneyear<-c(2.47,4.61,5.26,3.02,6.36,5.93,3.93,4.09,4.88,3.81,4.74,3.29,5.55,2.82,4.23,3.23,2.56,4.31,4.37,2.4)
data<-data.frame(cbind(baseline,oneyear))
#correlation function
cor.m<-function(data,d) cor(data[d,1],data[d,2])
HIV.out<-boot(data,cor.m,1000)
boot.ci(HIV.out,conf=c(0.90,0.95),type=c("norm","basic","perc"))

BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS


Based on 1000 bootstrap replicates

CALL :
boot.ci(boot.out = HIV.out, conf = c(0.9, 0.95), type = c("norm",
"basic", "perc"))

Intervals :
Level Normal Basic Percentile
90% ( 0.5763, 0.8815 ) ( 0.6008, 0.9002 ) ( 0.5461, 0.8455 )
95% ( 0.5471, 0.9107 ) ( 0.5837, 0.9410 ) ( 0.5053, 0.8626 )
Calculations and Intervals on Original Scale
Large-Sample Confidence Intervals for A Population Mean
Example
Punctuality of patients in keeping appointments is of interest to a medical administrator.
In a study of patient flow through the offices of general practitioners, it was found that a
random sample of 35 patients were 17.2 minutes late for their appointments, on the
average, with a sample standard deviation of 8 minutes. What is the 90% confidence
interval for the population mean amount of time late for appointments?

Let X1, X2, . . . , Xn be a random sample from a population having a mean µ and
finite standard deviation σ. When the sample size is sufficiently large
(superstitiously, n > 30), a 100(1 − α)% confidence interval for µ is given by
(X̄ − zα/2 S/√n, X̄ + zα/2 S/√n), where zα/2 is the upper α/2 critical value of the
standard normal distribution. This is because (X̄ − µ)/(S/√n) approximately
follows N(0, 1) by the central limit theorem.
Interpretation: In repeated sampling of the same sample size from the population,
100(1 − α)% of all intervals of the form (x̄ − zα/2 s/√n, x̄ + zα/2 s/√n) will, in the
long run, include the population mean µ.

Solution: x̄ = 17.2, s = 8, z0.05 = 1.645, 17.2 ± 1.645(8/√35), (15.0, 19.4).
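The same arithmetic in Python (z0.05 = 1.645 hardcoded from the normal table):

```python
import math

xbar, s, n, z = 17.2, 8.0, 35, 1.645     # summary statistics from the study
moe = z * s / math.sqrt(n)               # z times the estimated standard error
print(f"({xbar - moe:.1f}, {xbar + moe:.1f})")   # -> (15.0, 19.4)
```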

Large-Sample Confidence Intervals (cont’d)
Example
Let X be the time (in seconds) for a firefighter to place a ladder against a building,
pulling out a section of fire hose, dragging a weighted object, and crawling in a
simulated attic. There are 158 randomly selected firefighters.
425 427 287 296 270 294 365 279 319 289 297 267 336 374 380
386 334 238 281 222 261 370 291 334 350 291 294 389 417 256
266 302 254 356 400 276 312 353 305 291 268 421 386 342 286
228 285 293 399 352 294 276 269 323 438 378 269 317 317 254
354 438 313 297 333 386 320 331 300 226 276 312 264 236 287
262 304 285 264 289 368 321 291 254 327 277 285 363 289 240
317 299 339 417 286 280 278 288 266 303 350 273 303 296 261
292 403 269 221 247 395 228 275 278 317 255 276 284 373 416
225 283 336 240 347 403 278 328 305 247 302 410 385 268 302
296 264 322 268 234 292 252 339 342 257 315 317 308 259 246
232 250 266 279 286 328 304 406

Construct a 95% confidence interval for the mean time of firefighters who complete the
test, where n = 158, X̄ = 307.77, and s = 51.852.

Solution: (X̄ − zα/2 S/√n, X̄ + zα/2 S/√n) = 307.77 ± 1.96 × 51.852/√158 = (299.69, 315.86).
Therefore, we are 95% confident that the true mean time is contained in (299.69, 315.86).

Large-Sample Confidence Intervals (cont’d)
Example
Consider the following 80 determinations of daily emission (in tons, randomly selected)
of sulfur oxides from an industrial plant. Construct a 99% confidence interval for the
plant’s true average daily emission of sulfur oxides.
emission<-c(15.8,22.7,26.8,19.1,18.5,14.4,8.3,25.9,26.4,9.8,22.7,15.2,23.0,29.6,21.9,10.5,17.3,6.2,18.0,
22.9,24.6,19.4,12.3,15.9,11.2,14.7,20.5,26.6,20.1,17.0,22.3,27.5,23.9,17.5,11.0,20.4,16.2,20.8,13.3,18.1,
24.8,26.1,20.9,21.4,18.0,24.3,11.8,17.9,18.7,12.8,15.5,19.2,7.7,22.5,19.3,9.4,13.9,28.6,19.4,21.6,13.5,
24.6,20.0,24.1,9.0,17.6,16.7,16.9,23.5,18.4,25.7,20.1,13.2,23.7,10.7,19.0,14.5,18.1,31.8,28.5)

Solution: Here n = 80, x̄ = 18.89625, and s = 5.65646.

    18.89625 ± 2.575829 × 5.65646/√80 = (17.26727, 20.52523).
Therefore, we are 99% confident that the interval from 17.27 tons to 20.52 tons
contains the true average daily emission.
> mean(emission)+c(-1,1)*qnorm(0.995)*sd(emission)/sqrt(length(emission))
[1] 17.26727 20.52523
t.test(emission,alternative="two.sided",conf.level=0.99)
One Sample t-test
data: emission
t = 29.8797, df = 79, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
99 percent confidence interval: 17.22700 20.56550
mean of x: 18.89625

Large-Sample Confidence Intervals (cont’d)
A large-sample upper 100(1 − α)% confidence bound for µ is
µ < x̄ + zα · s/√n, and a large-sample lower 100(1 − α)% confidence
bound for µ is µ > x̄ − zα · s/√n.
A general method for constructing large-sample confidence intervals is as
follows. Let X1, X2, · · · , Xn be a random sample on which the
confidence interval for a parameter θ is to be based. Suppose that θ̂
is an estimator of θ satisfying the following properties:
1 θ̂ is unbiased (at least approximately);
2 θ̂ has approximately a normal distribution;
3 the standard deviation of θ̂, say σθ̂, or its estimate sθ̂ is available.
Then a large-sample 100(1 − α)% confidence interval for θ is

(θ̂ − zα/2 · σθ̂ , θ̂ + zα/2 · σθ̂ ) or (θ̂ − zα/2 · sθ̂ , θ̂ + zα/2 · sθ̂ ).

In words,

point estimate ± (z critical value)(estimated standard error).


Sample Size Determination for Large-Sample CIs
The sample size necessary for a 100(1 − α)% confidence interval (x̄ − zα/2 s/√n, x̄ + zα/2 s/√n) with
a margin of error d is not as straightforward as it was for the case of known σ. This is because
the width 2zα/2 s/√n depends on s, which is not available before the data have been collected.

The only option for an investigator who wishes to specify a desired width is to make an
educated guess as to what the value of s might be. Being conservative and guessing a larger
value of s leads to an n larger than necessary.
If the investigator can specify a reasonably accurate value of the population range (the
difference between the largest and smallest values), then, provided the population distribution is
not too skewed, dividing the range by 4 gives a ballpark value of what s might be.

Example
The charge-to-tap time (in minutes) for carbon steel in one type of open hearth furnace is to be
determined for each heat in a sample of size n. If the investigator believes that almost all times in
the distribution are between 320 and 440, what sample size would be appropriate for estimating
the population average time to within 5 minutes with a confidence level of 95%?
Solution: A reasonable value for s is (440 − 320)/4 = 30. Thus

    n = (z0.025 (30)/5)² = ((1.96)(30)/5)² = 138.3 ≈ 139.
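A quick check of the arithmetic in Python (range/4 as the ballpark guess for s, z0.025 = 1.96):

```python
import math

s_guess = (440 - 320) / 4                 # range/4 ballpark for s
n = math.ceil((1.96 * s_guess / 5) ** 2)  # margin of error d = 5 minutes
print(n)                                  # -> 139
```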

Large-Sample Confidence Intervals for A Population
Proportion
Example
1. A study is interested in estimating the prevalence rate of breast cancer among 50- to
54-year-old women whose mothers have had breast cancer. In a random sample of 10,000 such
women, 400 are found to have had breast cancer at some point in their lives. The best point
estimate of the prevalence rate p in population is the sample proportion p̂ = 400/10000 = 0.04.
What is the 95% confidence interval for the population prevalence rate?
2. A government agency wishes to assess the prevailing rate of unemployment in a particular
county. It is correctly felt that this assessment could be made quickly and effectively by
sampling a small fraction of the labor force in the county and counting the number of persons
currently unemployed. Suppose that 500 randomly selected persons are interviewed and 41 are
found to be unemployed. A descriptive summary of this finding is provided by the sample
41
proportion of unemployed p̂ = 500 = 0.082. What is the 99% confidence interval for the
unemployed proportion p in the entire county population?

Parameter: population proportion p (success probability)
Data: X, the number of successes in n Bernoulli trials
Statistic: point estimate p̂ = X/n
Standard error: S.E.(p̂) = √(p(1 − p)/n), estimated by Ŝ.E.(p̂) = √(p̂(1 − p̂)/n)
Wald and Wolfowitz Confidence Interval
Let X be the number of successes in n Bernoulli trials.
For large sample sizes, the central limit theorem shows that X/n is approximately
normally distributed with mean p and standard deviation √(p(1 − p)/n).
The random interval X/n ± zα/2 √(p(1 − p)/n) is a candidate 100(1 − α)%
confidence interval for p. However, the standard deviation involves the unknown
parameter p, so Wald, A. and Wolfowitz, J. (1939, Confidence limits for
continuous distribution functions, The Annals of Mathematical Statistics, 10,
105-118) recommended using the estimated standard error √((X/n)(1 − X/n)/n)
in the endpoints of the confidence interval.
For large n with x ≥ 10 and n − x ≥ 10, a 100(1 − α)% confidence interval for p is
given by

    x/n − zα/2 √((x/n)(1 − x/n)/n) < p < x/n + zα/2 √((x/n)(1 − x/n)/n).

Because of its simplicity, the Wald and Wolfowitz confidence interval is widely used. However,
recent research shows that the performance of this interval is far more erratic and inadequate
than commonly appreciated unless np(1 − p) is quite large. See Brown, L.D., Cai, T. and
DasGupta, A. (2001, Interval estimation for a binomial proportion (with discussion), Statistical
Science 16, 101-133) for details.
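Applied to the breast-cancer prevalence example from the earlier slide (x = 400, n = 10000), the Wald interval can be computed directly; a sketch in Python with z0.025 = 1.96 hardcoded:

```python
import math

def wald_ci(x, n, z=1.96):
    """Wald-Wolfowitz large-sample CI for a binomial proportion."""
    p_hat = x / n
    moe = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - moe, p_hat + moe

lo, hi = wald_ci(400, 10000)
print(f"({lo:.6f}, {hi:.6f})")           # about (0.036159, 0.043841)
```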

Wilson or Score Confidence Interval
An approximate 100(1 − α)% confidence interval for p is

    (x + z²α/2/2 ± zα/2 √(x(n − x)/n + z²α/2/4)) / (n + z²α/2).

The Wilson or score confidence interval comes from the solutions in p of the equation

    (x/n − p) / √(p(1 − p)/n) = ±zα/2.

Wilson, Edwin Bidwell (1927, Probable inference, the law of succession,
and statistical inference, Journal of the American Statistical Association
22, 209-212).
The actual confidence level of the Wilson CI is quite close to the nominal level
specified by the choice of zα/2 for virtually all sample sizes and values of p.
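For the same breast-cancer example (x = 400, n = 10000), the Wilson formula evaluates as follows; a sketch in Python:

```python
import math

def wilson_ci(x, n, z=1.96):
    """Wilson (score) CI for a binomial proportion."""
    center = x + z**2 / 2
    half = z * math.sqrt(x * (n - x) / n + z**2 / 4)
    denom = n + z**2
    return (center - half) / denom, (center + half) / denom

lo, hi = wilson_ci(400, 10000)
print(f"({lo:.6f}, {hi:.6f})")           # about (0.036333, 0.044021)
```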

Agresti-Coull ”Plus Four” and Jeffreys Confidence Intervals
A 100(1 − α)% Agresti-Coull confidence interval for p is

    (x + 2)/(n + 4) ± zα/2 √( ((x + 2)/(n + 4)) (1 − (x + 2)/(n + 4)) / (n + 4) )

or, with p̃ = (x + z²α/2/2)/(n + z²α/2),

    p̃ ± zα/2 √( p̃(1 − p̃)/(n + z²α/2) ).

A 100(1 − α)% Jeffreys confidence interval for p is

    (B(α/2; x + 0.5, n − x + 0.5), B(1 − α/2; x + 0.5, n − x + 0.5)),

where B(γ; m1, m2) denotes the γ quantile of the Beta(m1, m2)
distribution. The lower limit is taken to be 0 when x = 0, and the upper
limit 1 when x = n.
Agresti, A. and Coull, B.A. (1998, The American Statistician 52(2), 119-126). Agresti, A. and Caffo, B. (2000, The
American Statistician 54(4), 280-288). Brown, Cai, and DasGupta (2001, Statistical Science 16(2), 101-133).
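The plus-four interval amounts to adding two successes and two failures and then applying the Wald formula; for the same breast-cancer example (x = 400, n = 10000), a sketch in Python:

```python
import math

def plus_four_ci(x, n, z=1.96):
    """Agresti-Coull 'plus four' CI: add 2 successes and 2 failures."""
    p_tilde = (x + 2) / (n + 4)
    moe = z * math.sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    return p_tilde - moe, p_tilde + moe

lo, hi = plus_four_ci(400, 10000)
print(f"({lo:.6f}, {hi:.6f})")
```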
Coverage Probability
For n = 100 and a nominal 95% confidence level,

[Figure: coverage probability plotted against p for the Wald, Wilson, plus-four, and Jeffreys equal-tailed intervals]

Agresti-Caffo's "plus four" interval dominates the other intervals in coverage,
but it is also longer on average and is quite conservative for p near 0 or 1.
Various Confidence Intervals for p
prop.CI<-function(X,n,conf.level){
if (X==0) {cp.lower<-0; cp.upper<-1-((1-conf.level)/2)^(1/n)}
else if (X==n) {cp.lower<-((1-conf.level)/2)^(1/n);cp.upper<-1}
else{
cp.lower<-(1+(n-X+1)/(X*qf((1-conf.level)/2,2*X,2*(n-X+1))))^(-1)
cp.upper<-(1+(n-X)/((X+1)*qf((1+conf.level)/2,2*(X+1),2*(n-X))))^(-1)
} #else if keeps the boundary cases X=0 and X=n from being overwritten
wald.lower<-X/n-qnorm((1+conf.level)/2)*sqrt((X/n)*(1-X/n)/n)
wald.upper<-X/n+qnorm((1+conf.level)/2)*sqrt((X/n)*(1-X/n)/n)
wilson.lower<-(X+(qnorm((1+conf.level)/2))^2/2-qnorm((1+conf.level)/2)*sqrt(X*(n-X)/n+
(qnorm((1+conf.level)/2))^2/4))/(n+(qnorm((1+conf.level)/2))^2)
wilson.upper<-(X+(qnorm((1+conf.level)/2))^2/2+qnorm((1+conf.level)/2)*sqrt(X*(n-X)/n+
(qnorm((1+conf.level)/2))^2/4))/(n+(qnorm((1+conf.level)/2))^2)
plus4.lower<-(X+2)/(n+4)-qnorm((1+conf.level)/2)*sqrt(((X+2)/(n+4))*(1-(X+2)/(n+4))/(n+4))
plus4.upper<-(X+2)/(n+4)+qnorm((1+conf.level)/2)*sqrt(((X+2)/(n+4))*(1-(X+2)/(n+4))/(n+4))
if (X==0) jeffreys.lower<-0
else jeffreys.lower<-qbeta((1-conf.level)/2,(X+0.5),(n-X+0.5))
if (X==n) jeffreys.upper<-1
else jeffreys.upper<-qbeta((1+conf.level)/2,(X+0.5),(n-X+0.5))
cp.CI<-c(round(cp.lower,digits=4),round(cp.upper,digits=4))
wald.CI<-c(round(wald.lower,digits=6),round(wald.upper,digits=6))
wilson.CI<-c(round(wilson.lower,digits=6),round(wilson.upper,digits=6))
plus4.CI<-c(round(plus4.lower,digits=6),round(plus4.upper,digits=6))
jeffreys.CI<-c(round(jeffreys.lower,digits=6),round(jeffreys.upper,digits=6))
list(Clopper.Pearson.CI=cp.CI,Wald.CI=wald.CI,Wilson.CI=wilson.CI,Plus4.CI=plus4.CI,Jeffreys.CI=jeffreys.CI)
}
> prop.CI(400,10000,0.95)
$Clopper.Pearson.CI
[1] 0.0362 0.0440
$Wald.CI
[1] 0.036159 0.043841
$Wilson.CI
[1] 0.036333 0.044021
$Plus4.CI
[1] 0.036336 0.044032
$Jeffreys.CI
[1] 0.036291 0.043975
Sample Sizes for Confidence Intervals
The sample size required to achieve a 100(1 − α)% confidence interval of width w is

    nWald = ⌈4z²α/2 p(1 − p)/w²⌉

for Wald's interval and

    nWilson = ⌈(z²α/2/w²) { 2p(1 − p) − w² + √([2p(1 − p) − w²]² + w²(1 − w²)) }⌉

for Wilson's interval. Neither can be used directly because both involve the unknown p.
1 The most conservative sample size is obtained by taking p = 0.5.
2 Use a prior or expert estimate of p, say p0, instead of p.

Example
Mandel, J. (1997, Repeatability and Reproducibility for Pass/Fail Data, Journal of
Testing and Evaluation 25(2), 151-153) reported that, among 48 trials in a particular
laboratory, 16 resulted in ignition of a particular type of substrate by a lighted cigarette.
Find a 95% confidence interval for the probability that a randomly selected trial of this
kind results in ignition. If the width of a 95% confidence interval cannot exceed 0.1,
what is the necessary sample size?
> p.hat<-16/48; p.tilde<-(p.hat+qnorm(0.975)^2/(2*48))/(1+qnorm(0.975)^2/48)
> c(p.hat, p.tilde)
[1] 0.3333333 0.3456834
> p.hat+c(-1,1)*qnorm(0.975)*sqrt(p.hat*(1-p.hat)/48)
[1] 0.1999747 0.4666920
> p.tilde+c(-1,1)*qnorm(0.975)*sqrt(p.hat*(1-p.hat)/48+qnorm(0.975)^2/(4*48^2))/(1+qnorm(0.975)^2/48)
[1] 0.2167678 0.4745989

Sample Sizes for Confidence Intervals (cont’d)
Example
> p0<-0.5; w<-0.1
> wald.size<-4*qnorm(0.975)^2*p0*(1-p0)/w^2
> wald.size
[1] 384.1459
> wilson.size<-(qnorm(0.975)^2/w^2)*(2*p0*(1-p0)-w^2+sqrt((2*p0*(1-p0)-w^2)^2+w^2*(1-w^2)))
> wilson.size
[1] 380.3044

Given a preset α-level and an anticipated value of successful probability p0 , what integer value
of n will lead to an interval width of some desired value w ?
Piegorsch, W.W. (2004, Sample sizes for improved binomial confidence intervals,
Computational Statistics & Data Analysis 46, 309-316) provided sample sizes for Wald’s,
Wilson’s Agresti-Coull’s and Jeffreys’ intervals.
Goncalves, et al. (2012, Sample size for estimating a binomial proportion: comparison of
different methods, Journal of Applied Statistics 39(11), 2453-2473) had more details on
Clopper-Pearson intervals.
More recent results and references are given in Thulin, Måns (2014, The cost of using exact
confidence intervals for a binomial proportion, Electronic Journal of Statistics, 8(1), 817-840).
The sample size from Agresti-Coull's interval is nAgresti-Coull = ⌈4z²α/2 p0(1 − p0)/w²⌉ − z²α/2.
The sample size from Jeffreys' interval, nJeffreys, is the smallest integer greater than or
equal to the solution of

    d1 d2 [F1−α/2(d1, d2) − Fα/2(d1, d2)] / {[d2 + d1 Fα/2(d1, d2)][d2 + d1 F1−α/2(d1, d2)]} = w,

where d1 = 2n(1 − p0) + 1, d2 = 2np0 + 1, and Fγ(ν1, ν2) is the γ quantile of the F(ν1, ν2) distribution.
Confidence Intervals of Population Mean from Small Size
Samples
For a random sample of small size, say n < 30, the distribution of

    T = (X̄ − µ)/(S/√n) ∼ t with n − 1 degrees of freedom

if X1, X2, · · · , Xn is a random sample from N(µ, σ²).
A two-sided 100(1 − α)% confidence interval for a normal population
mean when the sample size n < 30 is

    (x̄ − tα/2,n−1 s/√n, x̄ + tα/2,n−1 s/√n),

where tα/2,n−1 is the upper t critical value given in Table A.5. Use
tα,n−1 for one-sided confidence intervals.
The rationale is P[−tα/2,n−1 < (X̄ − µ)/(S/√n) < tα/2,n−1] = 1 − α, so that

    P[X̄ − tα/2,n−1 S/√n < µ < X̄ + tα/2,n−1 S/√n] = 1 − α.
Healing of Skin Wounds
Biologists studying the healing of skin wounds measured the rate at which new
cells closed a razor cut made in the skin of an anesthetized newt. Here are the
data from a randomly selected 18 newts measured in micrometers per hour
(µm/h): 29, 27, 34, 40, 22, 28, 14, 35, 26, 35, 12, 30, 23, 18, 11, 22, 23, 33. Find
a 95% confidence interval for the mean rate for all newts of this species.
Check normality:

[Figure: normal Q-Q plot of the newt data; the points fall roughly along a line]

Solution: n = 18, df = 17, x̄ = 25.67, s = 8.324, t0.025,17 = 2.11,

    25.67 ± 2.11 × 8.324/√18, (21.53, 29.81).
Practical interpretation: In repeated sampling of sample size 18 from the
population, we are 95% confident that the population mean healing rate for all
newts of this species is between 21.53 and 29.81 micrometers per hour.
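The calculation above can be reproduced with Python's statistics module (t0.025,17 = 2.110 hardcoded from the t table):

```python
import math
import statistics

# healing rates (micrometers/hour) for the 18 newts
rates = [29, 27, 34, 40, 22, 28, 14, 35, 26, 35, 12, 30, 23, 18, 11, 22, 23, 33]

n = len(rates)
xbar = statistics.mean(rates)
s = statistics.stdev(rates)              # sample standard deviation (n - 1 divisor)
t = 2.110                                # t_{0.025, 17}

moe = t * s / math.sqrt(n)
print(f"xbar = {xbar:.2f}, s = {s:.3f}, 95% CI: ({xbar - moe:.2f}, {xbar + moe:.2f})")
```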
Trout Habitat
A master's student in wildlife management studied trout habitat in the upper Shavers Fork
watershed in West Virginia. The springtime water pH values of 29 randomly selected tributary
sample sites were 6.2, 6.3, 5.0, 5.8, 4.6, 4.7, 4.7, 5.4, 6.2, 6.0, 5.4, 5.9, 6.2, 6.1, 6.0, 6.3,
6.2, 5.8, 6.2, 6.3, 6.3, 6.3, 6.4, 6.5, 6.6, 6.1, 6.3, 4.4, 6.7. Find a 90% confidence interval
for the mean springtime water pH of the tributary water basin around the Shavers Fork
watershed.
Check normality:

[Figure: normal Q-Q plot of the trout data]

Questionable solution: n = 29, df = 28, x̄ = 5.8931, s = 0.6380, t0.05,28 = 1.701,

    5.8931 ± 1.701 × 0.6380/√29, (5.69, 6.09).
Practical interpretation: In repeated sampling of sample size 29 from the population,
we are 90% confident that the population mean springtime water pH of the tributary
water basin around the Shavers Fork watershed is between 5.69 and 6.09.
The one-sample t confidence interval is robust to small or even moderate departures from
normality unless the sample size is quite small. This means the actual confidence level
will be reasonably close to the nominal level of 90%. However, if the data are highly
nonnormal or the sample size is too small, the actual confidence level can be
considerably different from the nominal level of 90%.
Navigate Mazes
Psychology experiments sometimes involve testing the ability of rats to navigate mazes.
The mazes are classified according to levels of difficulty as measured by the mean length
of time it takes rats to find the food at the end. One researcher selects a maze that is
claimed to take rats an average of one minute to solve and is interested in whether the
claim is true or not. The researcher randomly selects 21 rats and records their times of
solving the maze. Here are the data in seconds: 38.4, 46.2, 62.5, 38.0, 62.8, 33.9, 50.4,
35.0, 52.8, 60.1, 55.1, 57.6, 55.5, 49.5, 40.9, 44.3, 93.8, 47.9, 69.2, 46.2, 56.3. Find a
90% confidence interval of the population mean solving time of rats.
Check normality:
[Figure: normal Q-Q plot of the maze data (Sample Quantiles vs. Theoretical Quantiles)]
Questionable solution: n = 21, df = 20, x̄ = 52.2095, s = 13.5646, t_{0.05,20} = 1.725,

52.2095 ± 1.725 × 13.5646/√21, i.e., (47.10, 57.32).
Practical interpretation: In repeated sampling of samples of size 21 from the population,
we are 90% confident that the population mean solving time of rats is between 47.10 and
57.32 seconds.
In the presence of outliers, the actual confidence level of the one-sample t confidence
interval may differ considerably from its nominal level of 90%.

30 / 1
CIs of Bernoulli Proportion from Small Size Samples
Example
A researcher is interested in estimating the rate of bladder cancer in rats that have been fed a diet high in saccharin. Among
twenty rats randomly selected for the experiment, two have developed bladder cancer. What is the 95% confidence interval for
the rate of developing bladder cancer?

For small sample size, construction of confidence intervals for the binomial parameter has several obstacles.
1 Since X and X/n are values of discrete random variables, it may be impossible to get an interval for which the degree of
confidence is exactly 100(1 − α)%.
2 The standard deviation of the sampling distribution of the number of successes, as well as that of the proportion of
successes, involves the parameter p that we are trying to estimate.
Clopper, C.J. and Pearson, E.S. (1934, The use of confidence or fiducial limits illustrated in the case of the binomial, Biometrika
26(4), 404-413) proposed the following exact method to construct a 100(1 − α)% confidence interval for p.
Let x0 be the observed number of successes, pL be the probability such that

P[X ≥ x0 | pL] = Σ_{k=x0}^{n} C(n,k) pL^k (1 − pL)^{n−k} = α/2,

and pU be the probability such that

P[X ≤ x0 | pU] = Σ_{k=0}^{x0} C(n,k) pU^k (1 − pU)^{n−k} = α/2.

Then a 100(1 − α)% confidence interval of p is (pL , pU ).

The exact CI due to Clopper and Pearson can be written as (qbeta(α/2; x, n − x + 1), qbeta(1 − α/2; x + 1, n − x)) or

[ (1 + (n − x + 1)/(x F_{1−α/2, 2x, 2(n−x+1)}))^{−1} , (1 + (n − x)/((x + 1) F_{α/2, 2(x+1), 2(n−x)}))^{−1} ],

where F_{α,ν1,ν2} denotes the upper α quantile of the F distribution with ν1 and ν2 degrees of freedom. See Leemis, L.S. and
Trivedi, K.S. (1996, The American Statistician, 50, 63-68, p. 67) for more details.


31 / 1
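The two tail equations can be solved numerically without any special functions by bisection, since each binomial tail probability is monotone in p. A Python sketch (the slides use R's qbeta; the helper names here are hypothetical), applied to the rat example with x = 2 cancers among n = 20 rats:

```python
from math import comb

def binom_cdf(k, n, p):
    """P[X <= k] for X ~ BIN(n, p); decreasing in p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def clopper_pearson(x, n, alpha=0.05, tol=1e-10):
    """Exact CI for p: bisect on P[X >= x | pL] = alpha/2 and P[X <= x | pU] = alpha/2."""
    def solve(f, target):
        # f is decreasing in p on (0, 1); find p with f(p) = target
        lo, hi = 0.0, 1.0
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    # P[X >= x | p] = alpha/2  <=>  P[X <= x-1 | p] = 1 - alpha/2
    pL = 0.0 if x == 0 else solve(lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    pU = 1.0 if x == n else solve(lambda p: binom_cdf(x, n, p), alpha / 2)
    return pL, pU

pL, pU = clopper_pearson(2, 20)     # two bladder cancers among twenty rats
print(round(pL, 4), round(pU, 4))   # close to (0.0123, 0.3170)
```

The result agrees with the beta-quantile form (qbeta(0.025; 2, 19), qbeta(0.975; 3, 18)).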
Notes on Confidence Intervals for Small Samples
A confidence interval is called robust if the confidence level does not
change very much when assumptions of the procedure are violated.
Student’s t intervals are robust to nonnormality when there are no
outliers and the distribution is roughly symmetric and unimodal.
Heavy tails, heavy skewness, and outliers are devastating. For 50,000
samples, we have
[Figure: simulated coverage probabilities. Left panel: Student's t intervals from LN(1,2) data, coverage vs. sample size 5-30; right panel: Clopper-Pearson intervals, coverage vs. p from 0 to 1. Both vertical axes show coverage probability from 0.5 to 1.0.]
Generally, Clopper-Pearson intervals provide more confidence than the
nominal level. When strict conservativeness is not critical, the "plus four"
interval is recommended (based on a comparison of 1000 samples of size 100).
32 / 1
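The slides do not spell out the "plus four" formula; the sketch below assumes the usual Agresti-Coull-style adjustment (add two successes and two failures, then apply the Wald formula), illustrated on the saccharin-rat counts x = 2, n = 20:

```python
from math import sqrt
from statistics import NormalDist

def plus_four_ci(x, n, conf=0.95):
    """'Plus four' interval: add 2 successes and 2 failures, then use the Wald formula."""
    z = NormalDist().inv_cdf((1 + conf) / 2)
    p = (x + 2) / (n + 4)                      # adjusted sample proportion
    half = z * sqrt(p * (1 - p) / (n + 4))
    return p - half, p + half

lo, hi = plus_four_ci(2, 20)
print(round(lo, 4), round(hi, 4))   # close to (0.0176, 0.3158)
```

For these counts the interval is quite close to the exact Clopper-Pearson interval, at a fraction of the computational effort.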
A Prediction Interval for a Single Future Value
A random sample of 10 hot dogs has been chosen and their fat content (in percentage)
are as follows: 25.2, 21.3, 22.8, 17.0, 29.8, 21.0, 25.5, 16.0, 20.9, 19.5. It is assumed that
the fat content follows a normal distribution. A 95% CI for the population mean fat
content is x̄ ± t_{0.025,9} · s/√n = 21.90 ± 2.262 · 4.134/√10 = 21.90 ± 2.96 = (18.94, 24.86).
This confidence interval has no use for a customer who wants to estimate the fat content
in his next hot dog. What is useful for the customer is an interval that will, with a
specified degree of confidence, contain the next observation from the population. This
interval is called a prediction interval for a single future value.
A 100(1 − α)% prediction interval (PI) for a single observation to be selected from a
normal population is x̄ ± t_{α/2,n−1} · s √(1 + 1/n). Use t_{α,n−1} for one-sided prediction intervals.
Let X1 , X2 , · · · , and Xn be a sample from a normally distributed population and suppose we wish
to predict the value of a single future observation, say Xn+1 . The prediction interval holds
because

(X̄ − Xn+1) / (S √(1 + 1/n)) ∼ t_{n−1}.

Interpretation: In repeated sampling of the same sample size from the population,
100(1 − α)% of all intervals of the form x̄ ± t_{α/2,n−1} s √(1 + 1/n) will, in the long run,
include a future observation of the population.

n = 10, df = 9, x̄ = 21.90, s = 4.134, t_{0.025,9} = 2.262, 21.90 ± (2.262)(4.134)√(1 + 1/10),
(12.09, 31.71).

33 / 1
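Putting the CI and the PI side by side makes the difference in width obvious. A Python sketch on the fat-content data, with t_{0.025,9} = 2.262 hardwired from the slide:

```python
from math import sqrt
from statistics import mean, stdev

fat = [25.2, 21.3, 22.8, 17.0, 29.8, 21.0, 25.5, 16.0, 20.9, 19.5]
n = len(fat)
xbar, s = mean(fat), stdev(fat)
t = 2.262                                  # t_{0.025,9}, from the slide

# 95% CI for the population mean vs. 95% PI for a single future hot dog
ci = (xbar - t * s / sqrt(n), xbar + t * s / sqrt(n))
pi = (xbar - t * s * sqrt(1 + 1/n), xbar + t * s * sqrt(1 + 1/n))
print([round(v, 2) for v in ci])           # -> [18.94, 24.86]
print([round(v, 2) for v in pi])           # -> [12.09, 31.71]
```

The PI is much wider: it must absorb both the uncertainty in x̄ and the intrinsic variability of one new observation.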
A Prediction Interval for Several Future Values
Example
The weights of 100 newly minted U.S. pennies, measured to 10^−4 g but reported only to
the nearest 0.02g (Youden, W.J., Experimentation and Measurement) are 2.99(1),
3.01(4), 3.03(4), 3.05(4), 3.07(7), 3.09(17), 3.11(24), 3.13(17), 3.15(13), 3.17(6),
3.19(2), 3.21(1), where the number in parentheses after each number is the frequency of
the number. Find a 90% prediction interval for the weight of the 101st penny. How
about a prediction interval containing the 101st through (100 + m)th weights?
weight<-rep(c(2.99,3.01,3.03,3.05,3.07,3.09,3.11,3.13,3.15,3.17,3.19,3.21),c(1,4,4,4,7,17,24,17,13,6,2,1))
mean(weight)+c(-1,1)*qt(0.95,length(weight)-1)*sd(weight)*sqrt(1+1/length(weight))
[1] 3.035878 3.179722

A 100(1 − α)% prediction interval containing all m future observations is

x̄ ± s · t_{α/(2m), n−1} √(1 + 1/n).

A 100(1 − α)% prediction interval containing the mean of m future observations is

x̄ ± s · t_{α/2, n−1} √(1/m + 1/n).
34 / 1
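Both formulas are easy to evaluate on the penny data. A Python sketch (the slide uses R); t_{0.05,99} ≈ 1.660 is hardwired as a table value, and m = 5 future pennies is an arbitrary illustration:

```python
from math import sqrt
from statistics import mean, stdev

vals  = [2.99, 3.01, 3.03, 3.05, 3.07, 3.09, 3.11, 3.13, 3.15, 3.17, 3.19, 3.21]
freqs = [1, 4, 4, 4, 7, 17, 24, 17, 13, 6, 2, 1]
weight = [v for v, f in zip(vals, freqs) for _ in range(f)]
n = len(weight)                    # 100 pennies
xbar, s = mean(weight), stdev(weight)
t = 1.660                          # t_{0.05,99} (table value)

# 90% PI for a single future penny (the 101st)
pi1 = (xbar - t * s * sqrt(1 + 1/n), xbar + t * s * sqrt(1 + 1/n))
# 90% PI for the mean of the next m = 5 pennies
m = 5
pim = (xbar - t * s * sqrt(1/m + 1/n), xbar + t * s * sqrt(1/m + 1/n))
print(pi1, pim)
```

The single-observation interval reproduces the R output (3.035878, 3.179722); the interval for the mean of five future pennies is noticeably narrower.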
Prediction Intervals for Future Counts
Suppose that the past data consist of X successes out of n trials from
a BIN(n, p) distribution with a success probability p, 0 < p < 1. Let
Y be the future number of successes out of m trials from a BIN(m, p)
distribution.
A large-sample approximate level 100(1 − α)% two-sided prediction
interval (L(X ), U(X )) for the future number Y of occurrences based
on the observed value X of the past occurrences for the binomial
distribution constructed by Nelson, W (1982, Applied Life Data
Analysis) is p
Ŷ ± zα/2 mp̂(1 − p̂)(m + n)/n,
where p̂ = X /n, Ŷ = mp̂, and zα/2 is the upper α/2 quantile of the
standard normal, when X , n − X , Y , m − Y all are large.
The Nelson prediction interval is derived from the fact that

(Y − m p̂) / √( p̂ (1 − p̂) m (m + n)/n )

is approximately N(0, 1).
35 / 1
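A Python sketch of Nelson's interval (the deck's own R version appears on the Hearing Loss slide below), applied to the screening counts used there: x = 23 events in n = 23061 past births, m = 24930 future births, 90% level:

```python
from math import sqrt
from statistics import NormalDist

def nelson_pi(x, n, m, alpha=0.10):
    """Nelson's large-sample prediction interval for a future binomial count Y out of m trials."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_hat = x / n                  # estimated success probability from the past data
    y_hat = m * p_hat              # point prediction of the future count
    se = sqrt(m * p_hat * (1 - p_hat) * (m + n) / n)
    return y_hat - z * se, y_hat + z * se

lo, hi = nelson_pi(x=23, n=23061, m=24930)
print(round(lo, 2), round(hi, 2))  # close to the R output 13.04, 36.69
```

The observed 1995-1996 count of 38 falls outside this interval, matching the slide's conclusion.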
Prediction Intervals for Future Counts (cont’d)
Wang, Hsiuying (2010, Closed form prediction intervals applied for disease counts,
The American Statistician 64(3), 250-256) proposed several improved prediction
intervals with better coverage probability than the existing method.
One prediction interval is derived using an approach that is similar to the
construction of the Wilson confidence intervals and it has the form (A ± B)/C, where

A = mn[2x z²_{α/2}(n + z²_{α/2} + m) + (2x + z²_{α/2})(m + n)²],

B = {mn(m + n) z²_{α/2} (m + n + z²_{α/2})² [2(n − x)(n²(2x + z²_{α/2}) + 4mnx + 2m²x)
+ n z²_{α/2}(n(2x + z²_{α/2}) + 3mn + m²)]}^{1/2},

and

C = 2n[(n + z²_{α/2})(m² + n(n + z²_{α/2})) + mn(2n + 3z²_{α/2})].

The prediction interval above is based on the fact that the random variable

(Y − m p̂) / √( ((X + Y)/(n + m)) (1 − (X + Y)/(n + m)) · m(n + m)/n ) ∼ N(0, 1)

approximately.

36 / 1
Prediction Intervals for Future Counts (cont’d)
To avoid the poor coverage probability when the parameter is near the boundaries,
the prediction limits are obtained by inverting

y = m p̂ ± z_{α/2} √W(x, y),

where

W(x, y) = ((x + z²_{α/2}/2 + y)/(n + z²_{α/2} + m)) (1 − (x + z²_{α/2}/2 + y)/(n + z²_{α/2} + m)) · m(n + m)/n,

instead of inverting

y = m p̂ ± z_{α/2} √( ((x + y)/(n + m)) (1 − (x + y)/(n + m)) · m(n + m)/n ).

Patel, J. and Samaranayake, V.A. (1991, Prediction intervals for some discrete
distributions, Journal of Quality Technology 23, 270-278) proposed an interval of
the form (0, X + d) as an upper prediction interval or (X − d, m) as a lower
prediction interval for Y, where d is a positive integer. It turns out that d is the
smallest integer satisfying

min_{0≤p≤1} Σ_{x=0}^{n} C(n,x) p^x (1 − p)^{n−x} Σ_{y=0}^{x+d} C(m,y) p^y (1 − p)^{m−y} ≥ 1 − α.

37 / 1
Hearing Loss
Example
The data were collected from a hearing screening program for all births with transient evoked
otoacoustic emissions in all eight maternity hospitals in the state of Rhode Island over a 4-year
period during 1993-1996. We assume that the number of children with hearing loss follows a
binomial distribution in each year.

Year
1993 1994 1995 1996 Total
Normal nursery liveborns 9885 13176 12694 12236 47991
Identified with permanent hearing loss 11 12 20 18 61

score.pi<-function(n,m,x,alpha){
#x is the number of successes in n trials
#m is the total number of trials in the future; 1-alpha is the prediction level
z<-qnorm(1-alpha/2);A<-m*n*(2*x*(z^2)*(n+z^2+m)+(2*x+z^2)*(m+n)^2)
B<-sqrt(m*n*(m+n)*(z^2)*((m+n+z^2)^2)*(2*(n-x)*((n^2)*(2*x+z^2)+4*m*n*x+2*(m^2)*x)+
n*(z^2)*(n*(2*x+z^2)+3*m*n+m^2)))
C<-2*n*((n+z^2)*(m^2+n*(n+z^2))+m*n*(2*n+3*z^2));c((A-B)/C,(A+B)/C)
}
#Use 1993 and 1994 (23061, 23) to predict 1995 and 1996 (24930, 38)
> score.pi(n=23061,m=24930,x=23,alpha=0.1); [1] 14.24376 38.40283 (38 is in)
Nelson.pi<-function(n,m,x,alpha){
#x is the number of successes in n trials
#m is the total number of trials in the future; 1-alpha is the prediction level
z<-qnorm(1-alpha/2);p.hat<-x/n
c(m*p.hat-z*sqrt(m*p.hat*(1-p.hat)*(m+n)/n),m*p.hat+z*sqrt(m*p.hat*(1-p.hat)*(m+n)/n))
}
> Nelson.pi(n=23061,m=24930,x=23,alpha=0.1); [1] 13.03807 36.69004 (38 is not in)

38 / 1
Tolerance Intervals
A 95% confidence interval for the mean mileage of cars of a specific model
provides a consumer, who is considering buying a car of the model, the
information on the total gasoline consumption over 500 miles or the average
distance a car with a full tank of gasoline can carry him or her.
a 95% prediction interval provides the consumer the above information on
the car (s)he is going to buy (one future observation).
a 95% tolerance interval provides mileage range information on cars that will
be produced (future observations), so that a design engineer can guarantee
that a certain portion of the population mileages resides in it.
The concept of tolerance intervals is due to Walter Andrew Shewhart
(1891-1967).
Let X1 , . . . , and Xn be a random sample from a population of N(µ, σ 2 ). A
100γ%-content and 100(1 − α)%-confidence tolerance interval (TI)
[L(X1 , . . . , Xn ), U (X1 , . . . , Xn )] is an interval such that

PX1 ,...,Xn {PX [L(X1 , . . . , Xn ) ≤ X ≤ U (X1 , . . . , Xn )] ≥ γ} ≥ 1 − α.

39 / 1
Tolerance Intervals (cont’d)
Let X̄ and S be the sample mean and standard deviation. A κ-factor,
100γ%-content and 100(1 − α)%-confidence TI has the form of
[X̄ − κγ,α,n S, X̄ + κγ,α,n S] such that
PX̄ ,S [P(X̄ − κγ,α,n S ≤ X ≤ X̄ + κγ,α,n S) ≥ γ] ≥ 1 − α
for γ, α ∈ [0, 1], where κγ,α,n is the two-sided tolerance critical value. Both
two-sided and one-sided critical values are given in Table A.6.
Interpretation: For a random sample of size n from a normal population N(µ, σ 2 ) a
tolerance interval of the form x̄ ± κγ,α,n s captures at least 100γ% of the values in
the normal population with a confidence level of 100(1 − α)%.
Critical values for tolerance intervals of 100γ = 90, 95, and 99, and α = 0.05, and
α = 0.01 are given in Table A.6.
Type                                   Two-sided (95% confidence)     One-sided (99% confidence)
% of Population Captured, 100γ         ≥ 90%    ≥ 95%    ≥ 99%        ≥ 90%    ≥ 95%    ≥ 99%
Sample Size n = 13                     2.587    3.081    4.044        2.677    3.290    4.472
            n = 30                     2.140    2.549    3.350        2.030    2.516    3.447
            n = ∞                      1.645    1.960    2.576        1.282    1.645    2.326

40 / 1
Tolerance Intervals (cont’d)
Example
The times (in seconds) of the first sprinkler activation for a random
selected 13 fire prevention sprinkler systems using an aqueous film-forming
foam were 27, 41, 22, 27, 23, 35, 30, 33, 24, 27, 28, 22, 24. Find a
two-sided tolerance interval to capture at least 99% times of the
population with a 95% level of confidence.
Solution: n = 13, x̄ = 27.92308, s = 5.619335, κ0.99,0.95,13 = 4.044,
27.92308 ± (4.044)(5.619335), (5.2, 50.6).
> library(tolerance)
> normtol.int(c(27, 41, 22, 27, 23, 35, 30, 33, 24, 27, 28, 22, 24), alpha = 0.05, P = 0.99, side = 2)
alpha P x.bar 2-sided.lower 2-sided.upper
1 0.05 0.99 27.92308 5.023494 50.82266

A 95% prediction interval for a future observation: df = 12,
t_{0.025,12} = 2.179, 27.92308 ± (2.179)(5.619335)√(1 + 1/13), (15.2, 40.6).
A 95% confidence interval for the population mean time: df = 12,
t_{0.025,12} = 2.179, 27.92308 ± (2.179)(5.619335)/√13, (24.5, 31.3).
41 / 1
Critical Values for Tolerance Intervals
If the data are from a normally distributed population,
Howe, W. G. (1969, JASA 64,610-620) provided
Two-sided tolerance critical value ≈ √( (n − 1)(1 + 1/n) z²_{(1+γ)/2} / χ²_{α,n−1} ),

where z_α and χ²_{α,n−1} are α quantiles. This is method "HE2" in
K.factor in R.
Weissberg, A. and Beatty, G. (1960, Technometrics 2(4), 483-500)
calculated critical values by solving for r in the equation
Φ(1/√n + r) − Φ(1/√n − r) = γ. The critical value is r √((n − 1)/χ²_{α;n−1}). This is
method "WBE" in K.factor in R.
Natrella, M.G. (1966, Experimental statistics. Handbook 91) showed

One-sided critical value ≈ [ z_γ + √( z²_γ − (1 − z²_{1−α}/(2(n−1))) (z²_γ − z²_{1−α}/n) ) ] / (1 − z²_{1−α}/(2(n−1))).
42 / 1
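Howe's factor needs only a normal quantile and a lower χ² quantile, so it can be sketched with the Python standard library; the χ² quantile below is the Wilson-Hilferty approximation (cited later in these slides), so the result differs slightly from the exact "HE2" values of R's tolerance package:

```python
from math import sqrt
from statistics import NormalDist

def chi2_quantile_wh(alpha, nu):
    """Lower-alpha chi-square quantile via the Wilson-Hilferty approximation."""
    z = NormalDist().inv_cdf(alpha)
    return nu * (1 - 2 / (9 * nu) + z * sqrt(2 / (9 * nu))) ** 3

def howe_k2(n, gamma=0.90, alpha=0.05):
    """Approximate two-sided tolerance factor (Howe, 1969)."""
    z = NormalDist().inv_cdf((1 + gamma) / 2)
    return sqrt((n - 1) * (1 + 1 / n) * z ** 2 / chi2_quantile_wh(alpha, n - 1))

print(round(howe_k2(10), 3))   # close to 2.838 (R's K.factor, method "HE2")
print(round(howe_k2(30), 3))   # close to the Table A.6 value 2.140
```

With an exact χ² quantile in place of the approximation, the values match K.factor's "HE2" output.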
Critical Values for Tolerance Intervals (cont’d)
#Howe’s formula
> n<-10
> p<-c(0.9,0.95,0.99)
> alpha<-0.05
> sqrt((n-1)*(1+1/n)*qnorm((1+p)/2)^2/qchisq(alpha,n-1))
[1] 2.838191 3.381913 4.444588

#Natrella’s table
> n<-30
> p<-c(0.9,0.95,0.99)
> alpha<-0.01
> (qnorm(p)+sqrt(qnorm(p)^2-(1-qnorm(alpha)^2/(2*(n-1)))*(qnorm(p)^2-qnorm(alpha)^2/n)))
/(1-qnorm(alpha)^2/(2*(n-1)))
[1] 2.034222 2.525496 3.467542

#R Tolerance package
> library(tolerance)
> K.factor(seq(2,10,1),alph=0.05,P=0.9,side=2,method="HE2")
[1] 32.126129 8.386221 5.369902 4.274622 3.711869 3.368132
[7] 3.135364 2.966607 2.838191
> K.table(n = seq(10,15,1), alpha=0.05,side=2, P=c(0.90, 0.95, 0.99), by.arg = "n")
$‘10‘ 0.9 0.95 0.99
0.95 2.85966 3.407495 4.478207
$‘11‘ 0.9 0.95 0.99
0.95 2.75611 3.284107 4.316049
$‘12‘ 0.9 0.95 0.99
0.95 2.672037 3.183929 4.184391
$‘13‘ 0.9 0.95 0.99
0.95 2.602273 3.100799 4.075141
$‘14‘ 0.9 0.95 0.99
0.95 2.543345 3.030582 3.98286
$‘15‘ 0.9 0.95 0.99
0.95 2.49283 2.97039 3.903754
..... 43 / 1
Aluminum Contents
Example
The article Albin, Susan L. (1990, The lognormal distribution for modeling quality data when mean is near zero, Journal of
Quality Technology, 22(2), 105-110) described a plastic recycling pilot plant run by Rutgers University. The most important
material reclaimed from beverage bottles is PET plastic. A serious impurity is aluminum, which later can clog the filters in
extruders when the recycled material is used. The following are the amounts (in ppm by weight of aluminum) found in 26
bihourly samples of PET recovered at the plant over roughly a two-day period. 291, 222, 125, 79, 145, 119, 244, 118, 182, 63, 30,
140, 101, 102, 87, 183, 60, 191, 119, 511, 120, 172, 70, 30, 90, 115.
1 Are the data from a normal distribution? What can we do?
2 Construct a 90% two-sided confidence interval for the mean aluminum contents of such specimens at the Rutgers
recycling facility.
3 Construct a 95% two-sided confidence interval for the mean aluminum contents of such specimens at the Rutgers
recycling facility. How does this interval compare to the 90% one?
4 Construct a 90% prediction interval for a single additional aluminum content measurement at the Rutgers recycling
facility.
5 Construct a 99% two-sided tolerance interval for 90% of aluminum contents of such specimens at the Rutgers recycling
facility.

pet<-c(291,222,125,79,145,119,244,118,182,63,30,140,101,102,87,183,60,191,119,511,120,172,70,30,90,115)

> t.test(pet,alt="two.sided",conf.level=0.9) > t.test(pet,alt="two.sided",conf.level=0.95)


90 percent confidence interval: 95 percent confidence interval:
109.7560 175.5517 102.9883 182.3194

> mean(pet)+c(-1,1)*qt(0.95,length(pet)-1)*sd(pet)*sqrt(1+1/length(pet))
[1] -28.2883 313.5960

> normtol.int(pet,alpha=0.01,P=0.9,side=2,method="HE2")
alpha P x.bar 2-sided.lower 2-sided.upper
1 0.01 0.9 142.6538 -99.79583 385.1035
44 / 1
Confidence Intervals for Population Variances

Example
Cowan, Renk, and Vander worked with a manufacturer of high precision metal
parts on a project involving a computer numerically controlled lathe. A critical
dimension of one particular part produced on the lathe had engineering
specifications of the form

Nominal dimension ± 0.0020in

where 0.0020 is the ±3σ acceptable machine capability. This means the standard
deviation should be around 0.0007.
An important practical issue in such situations is whether or not the machine is
capable of meeting specifications of this type. One way of addressing this concern
is to collect the data and perform inference for the intrinsic machine short-term
variability, represented as a standard deviation. Here are 20 measurements (in
0.0001 inch) of parts machined on the lathe over a three-hour period:

8, 9, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 12, 12, 12, 12, 13.

45 / 1
Confidence Intervals for A Population Variance (cont’d)
Let X1 , · · · , Xn be a random sample from a normal population
N(µ, σ²). Then, the sample variance

S² = Σ_{i=1}^{n} (Xi − X̄)² / (n − 1)

is an unbiased estimator of σ², i.e., E[S²] = σ², and

(n − 1)S²/σ² ∼ χ²_{n−1}.

A 100(1 − α)% confidence interval of σ² is

(n − 1)s²/χ²_{α/2,n−1} < σ² < (n − 1)s²/χ²_{1−α/2,n−1},

where χ²_{1−α/2,n−1} and χ²_{α/2,n−1} can be obtained from Table A.7.
For ν > 40,

χ²_{α,ν} ≈ ν(1 − 2/(9ν) + z_α √(2/(9ν)))³.
46 / 1
Confidence Intervals for A Population Variance (cont’d)
Example
The accompanying data on breakdown voltage of electrically stressed
circuits appeared in Gray, E.W. (1985, Damage of flexible printed wiring
boards associated with lightning-induced voltage surges, IEEE
Transactions on Components, Hybrids, and Manufacturing Technology
8(1), 214-220) 1470, 1510, 1690, 1740, 1900, 2000, 2030, 2100, 2190,
2200, 2290, 2380, 2390, 2480, 2500, 2580, 2700. Let σ 2 denote the
variance of the breakdown voltage distribution. Construct a 95% CI of σ 2 .

Solutions: s² = 137324.3, χ²_{0.975,16} = 6.908 and χ²_{0.025,16} = 28.845. The
interval is

( 16(137324.3)/28.845 , 16(137324.3)/6.908 ) = (76172.3, 318064.4).
voltage<-c(1470,1510,1690,1740,1900,2000,2030,2100,2190,2200,2290,2380,2390,2480,2500,2580,2700)
> library("TeachingDemos")
> sigma.test(voltage,alt="two.sided",conf.level=0.95)
95 percent confidence interval:
76171.31 318079.76
sample estimates: var of voltage 137324.3

47 / 1
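The breakdown-voltage computation follows directly from the chi-square pivot; a Python sketch with the two χ² quantiles hardwired from Table A.7:

```python
from statistics import variance

voltage = [1470, 1510, 1690, 1740, 1900, 2000, 2030, 2100, 2190, 2200,
           2290, 2380, 2390, 2480, 2500, 2580, 2700]
n = len(voltage)
s2 = variance(voltage)                   # sample variance, about 137324.3

chi2_lo, chi2_hi = 6.908, 28.845         # chi^2_{0.975,16}, chi^2_{0.025,16} (Table A.7)
ci = ((n - 1) * s2 / chi2_hi, (n - 1) * s2 / chi2_lo)
print([round(v, 1) for v in ci])         # close to the slide's (76172.3, 318064.4)
```

Note the larger quantile goes with the lower limit: dividing by the bigger χ² value shrinks (n − 1)s², so it yields the left endpoint.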
Confidence Intervals for A Population Variance (cont’d)
Standard confidence interval of variance has the shortest width (Cohen, 1972, Improved confidence intervals for the
variance of a normal distribution, Journal of the American Statistical Association 67, 382-387) but is super-sensitive to
minor violations of the normality assumption.

[Figure: "Confidence Intervals of Variance from t(5)"; coverage probability (0.75-0.95) of the standard interval vs. sample size 20-100.]

Box, G.E.P. (1953, Non-normality and tests on variances, Biometrika 40(3-4), 318-335) proposed the following
approximate two-sided 100(1 − α)% confidence interval for the variance σ²:

( r s²/χ²_{α/2,n−1} , r s²/χ²_{1−α/2,n−1} ),

where χ²_{1−α/2,n−1} and χ²_{α/2,n−1} can be obtained from Table A.7, r = 2n/(γ̂_e + 2n/(n − 1)) is the adjusted degree of
freedom, and

γ̂_e = [ n(n + 1)/((n − 1)(n − 2)(n − 3)) ] Σ_{i=1}^{n} ((xi − x̄)/s)⁴ − 3(n − 1)²/((n − 2)(n − 3)).
48 / 1
Confidence Intervals for A Population Variance (cont’d)
Bonett, D.G. (2006, Approximate confidence interval for standard deviation of
nonnormal distributions, Computational Statistics & Data Analysis 50, 775-782)
recommended the approximate two-sided 100(1 − α)% confidence interval for the
variance as follows:

( c s² exp(−z_{α/2} se), c s² exp(z_{α/2} se) ),

where z_{α/2} is the upper α/2 percentile of the standard normal distribution, and se
is an asymptotic estimate of the standard error of the log-transformed sample
variance,

se = c √( (γ̂_e + 2 + 3/n)/(n − 1) ),

where γ̂_e = [ n/(n − 1)² ] Σ_{i=1}^{n} ((xi − m)/s)⁴ − 3, c = n/(n − z_{α/2}),
and m is a trimmed mean with trim proportion equal to 1/(2√(n − 4)).

[Figure: "Confidence Intervals of Variance from t(5)"; coverage probability (0.75-0.95) vs. sample size 20-100, comparing the Standard, Box, Bonett, and Bootstrap intervals.]
49 / 1
Monitoring Coffee Machine
Example
The machine that fills 500-gram coffee containers for a large food processor is monitored
by the quality control department. The machine was designed so that the weights of the
500-gram containers would have a normal distribution with a mean of 506.6 grams and
a standard deviation of 4 grams. This would fill a population of containers with coffee,
at most 5% of which weighed less than 500 grams. A random sample of 30 containers is
selected every hour and the weights of the sample are 501.4, 498.0, 498.6, 499.2, 495.2,
501.4, 509.5, 494.9, 498.6, 497.6, 505.5, 505.1, 499.8, 502.4, 497.0, 504.3, 499.7, 497.9,
496.5, 498.9, 504.9, 503.2, 503.0, 502.6, 496.8, 498.2, 500.1, 497.9, 502.2, 503.2.

weight<-c(501.4,498.0,498.6,499.2,495.2,501.4,509.5,494.9,498.6,497.6,505.5,505.1,499.8,502.4,497.0,504.3,
499.7, 497.9,496.5,498.9,504.9,503.2,503.0,502.6,496.8,498.2,500.1,497.9,502.2,503.2)

> mean(weight);sd(weight)
[1] 500.4533
[1] 3.43348
#Check normality
> qqnorm(weight); qqline(weight)
> shapiro.test(weight)
Shapiro-Wilk normality test
data: weight
W = 0.962, p-value = 0.3479
#t interval for the mean
> t.test(weight,alt="two.sided",conf.level=0.99)
99 percent confidence interval:
498.7255 502.1812
> var.int(weight,conf.level=0.99)
$standard.CI
[1] 6.532352 26.055239
$Box.CI
[1] 6.479758 26.439387
$Bonett.CI
[1] 6.09531 27.28460
$Bootstrap.CI
[1] 6.792212 21.755506
50 / 1
Confidence Intervals for A Population Standard Deviation
Let X1 , · · · , Xn be a random sample from a normal population N(µ, σ²). Then the
sample standard deviation S = √( Σ_{i=1}^{n} (Xi − X̄)²/(n − 1) ) is NOT an unbiased estimator of σ.
In fact,

E(S) = √(2/(n − 1)) · Γ(n/2)/Γ((n − 1)/2) · σ ≈ [ 4(n − 1)/(4n − 3) ] σ.

A 100(1 − α)% confidence interval of σ is

s √((n − 1)/χ²_{α/2,n−1}) < σ < s √((n − 1)/χ²_{1−α/2,n−1}),

where χ²_{1−α/2,n−1} and χ²_{α/2,n−1} can be obtained from Table A.7.

For ν > 40,

χ²_{α,ν} ≈ ν(1 − 2/(9ν) + z_α √(2/(9ν)))³.

For nonnormal data, use bootstrap t type confidence intervals for the population variance
(Cojbasic, Vesna and Tomovic, Andrija, 2007, Nonparametric confidence intervals for
population variances of one sample and the difference of variances of two samples,
Computational Statistics & Data Analysis 51(12), 5562-5578).
51 / 1
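The bias factor E(S)/σ (often written c4) and its rational approximation can be compared numerically with math.gamma; a short Python sketch:

```python
from math import sqrt, gamma

def c4_exact(n):
    """E(S)/sigma for a normal sample of size n."""
    return sqrt(2 / (n - 1)) * gamma(n / 2) / gamma((n - 1) / 2)

def c4_approx(n):
    """The slide's approximation 4(n-1)/(4n-3)."""
    return 4 * (n - 1) / (4 * n - 3)

for n in (5, 10, 30):
    print(n, round(c4_exact(n), 4), round(c4_approx(n), 4))
```

Both are below 1 and approach 1 as n grows, which is why S slightly underestimates σ on average.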
Confidence Intervals for A Poisson Parameter
1 Wald interval: (X̄ − λ)/√(X̄/n) ∼ N(0, 1) approximately; x̄ ± z_{α/2} √(x̄/n).
2 Schwertman and Martinez interval: x̄ ± 0.5 ± z_{α/2} √((x̄ ± 0.5)/n) (Schwertman, N.C. and
Martinez, R.A., 1994, Approximate Poisson confidence limits, Communications in
Statistics-Theory and Methods 23(5), 1507-1529).
3 Score interval: (X̄ − λ)/√(λ/n) ∼ N(0, 1) approximately; x̄ + z²_{α/2}/(2n) ± z_{α/2} √((4x̄ + z²_{α/2}/n)/(4n)).
4 Variance stabilizing: (√X̄ − √λ)/√(1/(4n)) ∼ N(0, 1) approximately; x̄ + z²_{α/2}/(4n) ± z_{α/2} √(x̄/n).
5 Re-centered variance stabilizing: (√(X̄ + c) − √(λ + c))/√(1/(4n)) ∼ N(0, 1) approximately;
x̄ + z²_{α/2}/(4n) ± z_{α/2} √((x̄ + 3/8)/n). (Anscombe, F.J., 1948, Biometrika 35, 246-254).
6 Begaud interval: ([√(k + 0.02) − z_{α/2}]²/n, [√(k + 0.96) + z_{α/2}]²/n) when Σ_{i=1}^{n} Xi = k.
Begaud, B., Karin, M., Abdelilah, A., Pascale, T., Nicholas, M., and Yola, M. (2005, An
easy to use method to approximate Poisson confidence limits, European Journal of
Epidemiology 20(3), 213-216).
7 Approximate bootstrap confidence (ABC): k/n + (1/(6√k) ± z_{α/2}) √k / (n [1 − (1/(6√k) ± z_{α/2})/(6√k)]²) when
Σ_{i=1}^{n} Xi = k. Swift, B.M. (2009, Comparison of confidence intervals for a Poisson
Mean-Further considerations, Communications in Statistics-Theory and Methods, 38,
748-759).
52 / 1
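To make the first and third formulas concrete, a Python sketch comparing the Wald and score intervals for an illustrative sample of k = 10 events over n = 5 intervals (so x̄ = 2):

```python
from math import sqrt
from statistics import NormalDist

def poisson_wald(xbar, n, conf=0.95):
    """Item 1: Wald interval xbar +/- z * sqrt(xbar/n)."""
    z = NormalDist().inv_cdf((1 + conf) / 2)
    h = z * sqrt(xbar / n)
    return xbar - h, xbar + h

def poisson_score(xbar, n, conf=0.95):
    """Item 3: score interval, from solving (xbar - lam)^2 = z^2 * lam / n for lam."""
    z = NormalDist().inv_cdf((1 + conf) / 2)
    center = xbar + z**2 / (2 * n)
    h = z * sqrt((4 * xbar + z**2 / n) / (4 * n))
    return center - h, center + h

print(poisson_wald(2.0, 5))    # roughly (0.76, 3.24)
print(poisson_score(2.0, 5))   # roughly (1.09, 3.68)
```

The score interval is shifted to the right of the Wald interval, and its lower limit can never go negative (at x̄ = 0 it equals 0), which is one reason it behaves better for small counts.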
Confidence Intervals for A Poisson Parameter (cont’d)
8 Exact intervals: exact confidence limits for discrete random variables are derived
from exact tests on the parameters. For one-parameter discrete distributions (for
example, binomial or Poisson) the exact confidence limits are the solutions to
P[Y ≥ y | µ = L(y)] = α/2 and P[Y ≤ y | µ = U(y)] = α/2, where Y is the
statistic. Note that Fpois(k, λ) = 1 − Fχ²(2λ; 2(k + 1)) and
P[X = k] = Fχ²(2λ; 2k) − Fχ²(2λ; 2(k + 1)) (Johnson, N.L., Kotz, S., and Kemp,
A.W., 2005, Univariate Discrete Distributions, 3rd Ed., p. 197). Garwood, F.
(1936, Fiducial limits for the Poisson distribution, Biometrika 28, 437-442)
proposed

χ²_{1−α/2,2k}/(2n) < λ < χ²_{α/2,2k+2}/(2n)

when Σ_{i=1}^{n} Xi = k.
9 When χ² quantiles are not available, the Wilson and Hilferty transformation can
be used to derive the 100(1 − α)% confidence interval for λ as

(k/n)(1 − 1/(9k) − z_{α/2}/(3√k))³ < λ < ((k + 1)/n)(1 − 1/(9(k + 1)) + z_{α/2}/(3√(k + 1)))³

(Breslow, N.E. and Day, N.E., 1987, Statistical Methods in Cancer Research, Vol. 2,
The Design and Analysis of Cohort Studies, 69-71).
10 Breslow and Day (1987) also suggested (k/n)[1 − z_{α/2}/(2√k)]² < λ < ((k + 1)/n)[1 + z_{α/2}/(2√(k + 1))]², based
on 2(√(Σ_{i=1}^{n} Xi) − √(nλ)) ≈ ±z_{α/2}, where k + 1 is used empirically to improve the
approximation for small k.
53 / 1
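Item 9's closed form is easy to evaluate. A Python sketch for an illustrative k = 10 events with n = 1 (for comparison, the exact Garwood limits are χ²_{0.975,20}/2 ≈ 4.795 and χ²_{0.025,22}/2 ≈ 18.390):

```python
from math import sqrt
from statistics import NormalDist

def poisson_ci_wh(k, n, conf=0.95):
    """Breslow-Day closed form for the Garwood limits via the Wilson-Hilferty approximation."""
    z = NormalDist().inv_cdf((1 + conf) / 2)
    lo = (k / n) * (1 - 1 / (9 * k) - z / (3 * sqrt(k))) ** 3
    hi = ((k + 1) / n) * (1 - 1 / (9 * (k + 1)) + z / (3 * sqrt(k + 1))) ** 3
    return lo, hi

lo, hi = poisson_ci_wh(10, 1)
print(round(lo, 3), round(hi, 3))   # near the exact Garwood limits (4.795, 18.390)
```

Even for a count as small as 10, the closed form lands within about 0.01 of the exact limits, which is why it is a useful fallback when χ² quantiles are unavailable.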
Confidence Intervals for A Poisson Parameter (cont’d)
Baker (2002, A comparison of nine confidence intervals for a Poisson parameter when the
expected number of events is ≤ 5, The American Statistician 56(2), 85-89)
recommended: if the investigator cannot tolerate coverage smaller than the nominal value,
exact intervals are the only choice; if the investigator tolerates some anti-conservativeness
and believes that the need for a closed expression outweighs the greater expected width,
score intervals are the choice because they come close to maintaining the nominal coverage,
but their expected widths are greater than those of exact intervals.
Patil, V.V. and Kulkarni, H.V. (2012, Comparison of confidence intervals for the Poisson
mean: Some new aspects, REVSTAT Statistical Journal 10(2), 211-227) compared
nineteen methods and recommended: if the mean is expected to be between 0 and 2, the
Schwertman-Martinez and modified Wald (Wald intervals with lower and upper limits replaced
by zero and − ln(α/2) when x = 0) intervals are the best because they have the highest
coverage probabilities and shortest expected lengths. If the mean is expected to be larger
than 4, exact intervals based on χ² and those based on D.P. Byar's approximation are
uniformly satisfactory.
Wilson and Hilferty transformations:

p(n/2 − 1/p)^{1/2 − 1/p} [ (χ²_n/2)^{1/p} − (n/2 − 1/p)^{1/p} ] ∼ N(0, 1) approximately for p ≥ 1,

and (χ²_n/n)^{1/3} ∼ N(1 − 2/(9n), 2/(9n)) approximately (Wilson, E.B. and Hilferty,
M.M., 1931, The distribution of chi-square, Proceedings of the National Academy of
Sciences, 17, 684-688).

Fisher proposed √(2χ²_n) − √(2n − 1) ∼ N(0, 1) approximately.

54 / 1
