MATH 403 Engineering Data Analysis 95 132
Chapter 6
Sampling Distributions and Point Estimation of Parameters
Introduction
Statistical methods are used to make decisions and draw conclusions about
techniques utilize the information in a sample for drawing conclusions. This chapter
One major area of statistical inference is parameter estimation. In
practice, the engineer will use sample data to compute a number that is in some sense a
reasonable value (a good guess) of the true population mean. This number is called a
point estimate. In this chapter, we will see that procedures are available for developing
At the end of this module, it is expected that the students will be able to:
2. Calculate and explain the important role of the normal distribution as a sampling
3. Solve and explain important properties of point estimators, including bias, variance,
population parameter. We know that before the data are collected, the observations are
considered to be random variables, say, X1, X2, …, Xn. Therefore, any function of the
For example, the sample mean X̄ and the sample variance S² are statistics and
random variables. A simple way to visualize this is as follows. Suppose we take a sample
of n = 10 observations from a population and compute the sample average, getting the
observations from the same population and the resulting sample average is 10.4. The
sample average depends on the observations in the sample, which differ from sample to
sample because they are random variables. Consequently, the sample average (or any
distribution is very important and is discussed and illustrated later in the chapter.
represent the parameter of interest. We use the Greek symbol θ (theta) to represent the
parameter. The symbol θ can represent the mean μ, the variance σ², or any parameter of
interest to us. The objective of point estimation is to select a single number based on
sample data that is the most plausible value for θ. The numerical value of a sample
statistic is used as the point estimate. In general, if X is a random variable with probability
random sample of size n from X, the statistic Θ̂ = h(X1, X2, …, Xn) is called a point estimator
the sample has been selected, Θ̂ takes on a particular numerical value θ̂ called the point
estimate of θ.
Point estimation is the process of using the data available to estimate the unknown value
of a parameter, when some representative statistical model has been proposed for the
unknown mean μ. The sample mean X̄ is a point estimator of the unknown population mean μ.
That is, μ̂ = X̄. After the sample has been selected, the numerical value x̄ is the point
estimate of μ. Thus, if x1 = 25, x2 = 30, x3 = 29, and x4 = 31, the point estimate of μ is
\[ \bar{x} = \frac{25 + 30 + 29 + 31}{4} = 28.75 \]
Similarly, if the population variance σ² is also unknown, a point estimator for σ² is the
sample variance S², and the numerical value s² = 6.9 calculated from the sample data is
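As a quick check of the arithmetic above, the following minimal Python sketch computes both point estimates for the four observations. The variable names are illustrative only and are not part of the module.

```python
from statistics import mean, variance

# The four observations from the example above
x = [25, 30, 29, 31]

x_bar = mean(x)     # point estimate of the population mean
s2 = variance(x)    # sample variance, computed with divisor n - 1

print(f"x_bar = {x_bar}, s^2 = {s2:.2f}")
```

The sample variance uses the divisor n − 1, which is what makes S² an unbiased estimator of σ².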
Practice Problem:
estimate the mean and variance of X, we observe a random sample X1, X2, …, X7. We
166.8, 171.4, 169.1, 178.5, 168.0, 157.9, 170.1
Find the values of the sample mean, the sample variance, and the sample standard
predictions. For example, we might claim, based on the opinions of several people
interviewed on the street, that in a forthcoming election 60% of the eligible voters in the
city of Detroit favor a certain candidate. In this case, we are dealing with a random sample
of opinions from a very large finite population. As a second illustration we might state that
the average cost to build a residence in Charleston, South Carolina, is between $330,000
and $335,000, based on the estimates of 3 contractors selected at random from the 30
now building in this city. The population being sampled here is again finite but very small.
millilitres per drink. A company official who computes the mean of 40 drinks obtains x̄ =
236 millilitres and, on the basis of this value, decides that the machine is still dispensing
drinks with an average content of μ = 240 millilitres. The 40 drinks represent a sample
from the infinite population of possible drinks that will be dispensed by this machine.
Random Sample
distributed. These random variables are known as a random sample. The random
variables X1, X2, …, Xn are a random sample of size n if (a) the Xi's are independent
random variables and (b) every Xi has the same probability distribution.
Statistic
if X1, X2, …, Xn is a random sample of size n, the sample mean X̄, the sample variance
S², and the sample standard deviation S are statistics. Because a statistic is a random
Sampling distribution
sampling distribution of a statistic depends on the distribution of the population, the size
of the samples, and the method of choosing the samples. The probability distribution of
that a random sample of size n is taken from a normal population with mean μ and
variance σ². Now each observation in this sample, say, X1, X2, …, Xn, is a normally and
independently distributed random variable with mean μ and variance σ². Then because
linear functions of independent, normally distributed random variables are also normally
distributed as discussed in the previous chapters, we conclude that the sample mean
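\[ \bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} \]
has a normal distribution with mean
\[ \mu_{\bar{X}} = \frac{\mu + \mu + \cdots + \mu}{n} = \mu \]
and variance
\[ \sigma^2_{\bar{X}} = \frac{\sigma^2 + \sigma^2 + \cdots + \sigma^2}{n^2} = \frac{\sigma^2}{n} \]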
the sampling distribution of the sample mean will still be approximately normal with mean
μ and variance σ²/n if the sample size n is large. This is one of the most useful theorems in
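finite or infinite) with mean μ and finite variance σ² and if X̄ is the sample mean,
the limiting form of the distribution of
\[ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \]
as n → ∞ is the standard normal distribution.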
Figure 1. Illustration of the Central Limit Theorem (distribution of X̄ for n = 1,
Figure 1 illustrates how the theorem works. It shows how the distribution of X̄
becomes closer to normal as n grows larger, beginning with the clearly nonsymmetric
remains μ for any sample size and the variance of X̄ gets smaller as n increases.
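The behaviour described above can be explored with a small simulation. The sketch below is only an illustration under stated assumptions: it uses an Exponential(1) population (mean 1, variance 1) as an example of a clearly nonsymmetric distribution, and the sample sizes, number of repetitions, and seed are arbitrary choices that do not come from the module.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def simulate_xbar(n, reps=5000):
    """Return (mean, variance) of `reps` simulated sample means of size n.

    Each sample is drawn from an Exponential(1) population (an assumption).
    """
    means = [sum(random.expovariate(1.0) for _ in range(n)) / n
             for _ in range(reps)]
    m = sum(means) / reps
    v = sum((x - m) ** 2 for x in means) / reps
    return m, v

results = {}
for n in (1, 5, 30):
    results[n] = simulate_xbar(n)
    m, v = results[n]
    print(f"n = {n:2d}: mean of X-bar = {m:.3f}, variance of X-bar = {v:.4f} "
          f"(theory: {1 / n:.4f})")
```

In this sketch the simulated mean of X̄ should stay near 1 for every n, while the simulated variance should shrink roughly like 1/n, in line with the statement that the variance of X̄ is σ²/n.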
Example 1. An electrical firm manufactures light bulbs that have a length of life that is
approximately normally distributed, with mean equal to 800 hours and a standard
deviation of 40 hours. Find the probability that a random sample of 16 bulbs will have an
Solution:
σ_X̄ = 40/√16 = 10. The desired probability is given by the area of the shaded region
\[ Z = \frac{775 - 800}{10} = -2.5 \]
and therefore P(X̄ < 775) = P(Z < −2.5) = 0.0062.
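The calculation can be checked numerically. This is a minimal sketch that assumes a threshold of 775 hours, the value consistent with z = −2.5 and a standard error of 10 hours. It relies only on `statistics.NormalDist` from the Python standard library.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 800, 40, 16   # population mean, standard deviation, sample size
threshold = 775              # assumed threshold (hours), consistent with z = -2.5

se = sigma / sqrt(n)         # standard deviation of the sample mean
z = (threshold - mu) / se    # standardized value of the threshold
p = NormalDist().cdf(z)      # P(Z < z) for a standard normal Z

print(f"se = {se}, z = {z}, P(X-bar < {threshold}) = {p:.4f}")
```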
Practice Problem:
ohms and a standard deviation of 10 ohms. The distribution of resistance is normal. Find
the probability that a random sample of n = 25 resistors will have an average resistance
If we have two independent populations with means μ1 and μ2 and variances σ1²
and σ2², and if X̄1 and X̄2 are the sample means of two independent random samples of
sizes n1 and n2 from these populations, then the sampling distribution of the statistic Z
below is approximately standard normal if the conditions of the central limit theorem
apply. If the two populations are normal, the sampling distribution of Z is exactly standard
normal.
\[ Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} \]
Example 1. Two independent experiments are run in which two different types of paint
are compared. Eighteen specimens are painted using type A, and the drying time, in
hours, is recorded for each. The same is done with type B. The population standard
Assuming that the mean drying time is equal for the two types of paint, find
P(X̄A − X̄B > 1.0), where X̄A and X̄B are average drying times for samples of size
nA = nB = 18.
Solution:
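\[ \mu_{\bar{X}_A - \bar{X}_B} = \mu_A - \mu_B = 0 \]
and variance
\[ \sigma^2_{\bar{X}_A - \bar{X}_B} = \frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B} = \frac{1}{18} + \frac{1}{18} = \frac{1}{9} \]
\[ Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} \]
\[ Z = \frac{1 - (\mu_A - \mu_B)}{\sqrt{1/9}} = \frac{1 - 0}{\sqrt{1/9}} = 3 \]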
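The standardized value above leads directly to the requested probability. The following sketch assumes both population standard deviations equal 1.0 hour, which is the value implied by the variance terms 1/18 + 1/18 in the solution, and evaluates the upper tail with the standard library.

```python
from math import sqrt
from statistics import NormalDist

sigma_a = sigma_b = 1.0   # assumed population standard deviations (hours)
n_a = n_b = 18            # specimens per paint type
diff = 1.0                # observed difference of interest, X-bar_A - X-bar_B

se = sqrt(sigma_a**2 / n_a + sigma_b**2 / n_b)   # sqrt(1/9)
z = (diff - 0.0) / se                            # the means are assumed equal
p = 1 - NormalDist().cdf(z)                      # P(Z > z)

print(f"z = {z:.2f}, P(X-bar_A - X-bar_B > {diff}) = {p:.4f}")
```

A probability this small suggests that a difference of one hour would be unlikely if the two mean drying times were really equal.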
Practice Problem:
1. The television picture tubes of manufacturer A have a mean lifetime of 6.5 years and
a standard deviation of 0.9 year, while those of manufacturer B have a mean lifetime of
6.0 years and a standard deviation of 0.8 year. What is the probability that a random
sample of 36 tubes from manufacturer A will have a mean lifetime that is at least 1 year
more than the mean lifetime of a sample of 49 tubes from manufacturer B? Given the
following information.
For example, the value x̄ of the statistic X̄, computed from a sample of size n, is a point
estimate of the population parameter μ. Similarly, p̂ = x/n is a point estimate of the true
An estimator should be “close” in some sense to the true value of the unknown
is equal to θ. This is equivalent to saying that the mean of the probability distribution of
Bias of an Estimator
E(Θ̂) = θ
E(Θ̂) − θ
Example 1. Let X1, X2, X3, ......, Xn be a random sample. Show that the sample mean
Solution:
B(Θ̂) = E(Θ̂) − θ
= E(X̄) − θ
= EXi − θ
= 0
above example Θ̂1 = X1.
B(Θ̂1) = E(Θ̂1) − θ
= EX1 − θ
= 0
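A simulation can make the two results above concrete: both the sample mean and a single observation have zero bias, yet they are not equally good estimators. The sketch below is illustrative only. The normal population and the values of θ, σ, n, the number of repetitions, and the seed are all hypothetical choices, not values from the module.

```python
import random

random.seed(1)
theta, sigma, n, reps = 10.0, 2.0, 10, 20000   # hypothetical settings

xbar_values, x1_values = [], []
for _ in range(reps):
    sample = [random.gauss(theta, sigma) for _ in range(n)]
    xbar_values.append(sum(sample) / n)   # estimator 1: the sample mean
    x1_values.append(sample[0])           # estimator 2: the first observation

def average(values):
    return sum(values) / len(values)

def spread(values):
    """Variance of the simulated estimates around their own average."""
    m = average(values)
    return sum((v - m) ** 2 for v in values) / len(values)

bias_xbar = average(xbar_values) - theta
bias_x1 = average(x1_values) - theta
var_xbar, var_x1 = spread(xbar_values), spread(x1_values)

print(f"bias: X-bar {bias_xbar:+.3f}, X1 {bias_x1:+.3f}")
print(f"variance: X-bar {var_xbar:.3f}, X1 {var_x1:.3f}")
```

Both simulated biases should be close to zero, while the variance of the single observation should be roughly n times that of the sample mean.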
Practice Problem:
1. Suppose that X is a random variable with mean μ and variance σ². Let X1, X2, …, Xn
be a random sample of size n from the population represented by X. Show that the sample
mean X̄ and sample variance S² are unbiased estimators of μ and σ², respectively.
Suppose that Θ̂1 and Θ̂2 are unbiased estimators of θ. This indicates that the
distribution of each estimator is centered at the true value of θ. However, the variance
of these distributions may be different. Figure 4 illustrates the situation. Because Θ̂1 has
a smaller variance than Θ̂2, the estimator Θ̂1 is more likely to produce an estimate close
to the true value of θ. A logical principle of estimation when selecting among several
If we consider all unbiased estimators of θ, the one with the smallest variance is
If X1, X2, … , Xn is a random sample of size n from a normal distribution with mean μ
When we do not know whether an MVUE exists, we could still use a minimum
have a random sample of n observations X1, X2, … , Xn, and we wish to compare two
possible estimators for μ: the sample mean and a single observation from the sample,
say, Xi. Note that both X̄ and Xi are unbiased estimators of μ; for the sample mean, we
have V(X̄) = σ²/n from previous chapters, and the variance of any observation is V(Xi)
= σ². Because V(X̄) < V(Xi) for sample sizes n ≥ 2, we would conclude that the sample
desirable to give some idea of the precision of estimation. The measure of precision
usually employed is the standard error of the estimator that has been used.
the standard error involves unknown parameters that can be estimated, substitution
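Suppose that we are sampling from a normal distribution with mean μ and variance
σ². Now the distribution of X̄ is normal with mean μ and variance σ²/n, so the standard
error of X̄ is
\[ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \]
If we did not know σ but substituted the sample standard deviation S into the
preceding equation, the estimated standard error of X̄ would be
\[ SE(\bar{X}) = \hat{\sigma}_{\bar{X}} = \frac{S}{\sqrt{n}} \]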
Table 1 presents standard errors for some sample statistics, together with their
formulas. Sampling distributions for these statistics, or at least their means and
standard deviations (standard errors), can often be found. Some of these, together with
Example 1. An article in the Journal of Heat Transfer (Trans. ASME, Sec. C, 96, 1974,
p. 59) described a new method of measuring the thermal conductivity of Armco iron. Using
a temperature of 100°F and a power input of 550 watts, the following 10 measurements
A point estimate of the mean thermal conductivity at 100 °F and 550 watts is the sample
mean or
The standard error of the sample mean is σ_X̄ = σ/√n, and because σ is unknown, we
may replace it by the sample standard deviation s = 0.284 to obtain the estimated
standard error of X̄ as
\[ SE(\bar{X}) = \frac{s}{\sqrt{n}} = \frac{0.284}{\sqrt{10}} = 0.0898 \]
squared error of the estimator can be important. The mean squared error of an estimator
Figure 5. A biased estimator Θ̂1 that has smaller variance than the unbiased estimator Θ̂2
\[ MSE(\hat{\Theta}) = E(\hat{\Theta} - \theta)^2 \]
\[ MSE(\hat{\Theta}) = E[\hat{\Theta} - E(\hat{\Theta})]^2 + [\theta - E(\hat{\Theta})]^2 = V(\hat{\Theta}) + (\text{bias})^2 \]
That is, the mean squared error of Θ̂ is equal to the variance of the estimator
plus the squared bias. If Θ̂ is an unbiased estimator of θ, the mean squared error
estimators. Let Θ̂1 and Θ̂2 be two estimators of the parameter θ, and let MSE(Θ̂1)
and MSE(Θ̂2) be the mean squared errors of Θ̂1 and Θ̂2. Then the relative
efficiency of Θ̂2 to Θ̂1 is defined as
\[ \frac{MSE(\hat{\Theta}_1)}{MSE(\hat{\Theta}_2)} \]
If this relative efficiency is less than 1, we would conclude that Θ̂1 is a more efficient
estimator of θ than Θ̂2.
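The decomposition of the mean squared error and the relative efficiency ratio can both be illustrated numerically. The sketch below is a hypothetical example that does not come from the module: it compares two estimators of a normal population variance, the usual sample variance (divisor n − 1, unbiased) and the version with divisor n (biased). The values of σ², n, the number of repetitions, and the seed are arbitrary assumptions.

```python
import random

random.seed(2)
sigma2, n, reps = 4.0, 5, 40000   # hypothetical settings

unbiased, biased = [], []
for _ in range(reps):
    x = [random.gauss(0.0, sigma2 ** 0.5) for _ in range(n)]
    m = sum(x) / n
    ss = sum((xi - m) ** 2 for xi in x)   # sum of squared deviations
    unbiased.append(ss / (n - 1))         # estimator 1: divisor n - 1
    biased.append(ss / n)                 # estimator 2: divisor n

def mse_parts(values, target):
    """Return (MSE, variance, bias) of simulated estimates of `target`."""
    mean_v = sum(values) / len(values)
    variance = sum((v - mean_v) ** 2 for v in values) / len(values)
    bias = mean_v - target
    mse = sum((v - target) ** 2 for v in values) / len(values)
    return mse, variance, bias

mse_1, var_1, bias_1 = mse_parts(unbiased, sigma2)
mse_2, var_2, bias_2 = mse_parts(biased, sigma2)
rel_eff = mse_1 / mse_2   # relative efficiency of estimator 2 to estimator 1

print(f"estimator 1: MSE = {mse_1:.3f} = {var_1:.3f} + ({bias_1:+.3f})^2")
print(f"estimator 2: MSE = {mse_2:.3f} = {var_2:.3f} + ({bias_2:+.3f})^2")
print(f"relative efficiency MSE1 / MSE2 = {rel_eff:.3f}")
```

In this setting the ratio should come out greater than 1, so the biased estimator has the smaller mean squared error even though it is not centered on σ².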
REFERENCES:
Walpole, Ronald E., et al., Probability and Statistics for Engineers and Scientists, 9th ed.,
Pearson Education Inc., 2016
Montgomery, Douglas C., et al., Applied Statistics and Probability for Engineers, 7th ed.,
John Wiley & Sons (Asia) Pte Ltd, 2018
Spiegel, Murray R., et al., Probability and Statistics, 4th ed., McGraw Hill Companies Inc.,
2013
https://www.probabilitycourse.com/chapter8/8_2_5_solved_probs.php
CHAPTER TEST
1. A population consists of the four numbers 3, 7, 11, 15. Consider all possible
samples of size two that can be drawn with replacement from this population.
Find
(a) The population mean, (b) the population standard deviation, (c) the mean of
the sampling distribution of means, (d) the standard deviation of the sampling
distribution of means. Verify (c) and (d) directly from (a) and (b) by use of suitable
formulas.
(a) 3 or more points, (b) 6 or more points, (c) between 2 and 5 points?
3. A normal population has a variance of 15. If samples of size 5 are drawn from
this population, what percentage can be expected to have variances (a) less
than 10, (b) more than 20, (c) between 5 and 10?
10.2, and 9.4 lb, respectively. Determine unbiased estimates of (a) the population
Find the distribution of the sample mean of a random sample of size n=40?
1
𝑓(𝑥, 𝑦) = { 2 (2𝑥 + 3𝑦), 4 ≤ 𝑥 ≤ 6
0,
Chapter 7
STATISTICAL INTERVALS
Introduction
represent an uncertainty that exists in the data because we work with samples that are
obtained from a larger population or process. Statistical intervals are staples of the quality
and validation practitioner’s statistical tool box. Statistical intervals can manifest as plus-
or-minus limits on test data, represent a margin of error in a scientific poll, or indicate the
level of confidence associated with a predicted value. This chapter discusses the three
most common intervals, namely the confidence interval, the prediction interval, and the
tolerance interval. In this
At the end of this module, it is expected that the students will be able to:
A way to avoid this is to report the estimate in terms of a range of plausible values
usually 90%, 95%, or 99%, which is a measure of the reliability of the procedure. An
about the precision of estimation is conveyed by the length of the interval. A short interval
implies precise estimation. We cannot be certain that the interval contains the true,
unknown population parameter—we use only a sample from the full population to
compute the point estimate and the interval. However, the confidence interval is
constructed so that we have high confidence that it does contain the unknown population
parameter. Confidence intervals are widely used in engineering and the sciences.
The basic ideas of a confidence interval (CI) are most easily understood by initially
considering a simple situation. Suppose that we have a normal population with unknown
mean μ and known variance σ². This is a somewhat unrealistic scenario because typically
both the mean and variance are unknown. However, in subsequent sections, we present
where 𝑍𝛼/2 is the upper 100α/2 percentage point of the standard normal distribution.
For small samples selected from non-normal populations, we cannot expect our
degree of confidence to be accurate. However, for samples of size n ≥ 30, with the shape
of the distributions not too skewed, sampling theory guarantees good results.
Example 1. ASTM Standard E23 defines standard test methods for notched bar impact
testing of metallic materials. The Charpy V-notch (CVN) technique measures impact
energy and is often used to determine whether or not a material experiences a ductile-to-
brittle transition with decreasing temperature. Ten measurements of impact energy (J) on
specimens of A238 steel cut at 60°C are as follows: 64.1, 64.7, 64.5, 64.6, 64.5, 64.3,
64.6, 64.8, 64.2, and 64.3. Assume that impact energy is normally distributed with σ = 1
J. We want to find a 95% CI for μ, the mean impact energy. The required quantities are
Solution:
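\[ \bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]
\[ 64.46 - 1.96\frac{1}{\sqrt{10}} \le \mu \le 64.46 + 1.96\frac{1}{\sqrt{10}} \]
\[ 63.84 \le \mu \le 65.08 \]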
Based on the sample data, a range of highly plausible values for mean impact
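The interval in this example can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library. The variable names are illustrative, and the critical value is computed with `NormalDist().inv_cdf` rather than read from a table, so it is about 1.96 rather than exactly 1.96.

```python
from math import sqrt
from statistics import NormalDist, mean

# Impact energy measurements (J) from the example
energy = [64.1, 64.7, 64.5, 64.6, 64.5, 64.3, 64.6, 64.8, 64.2, 64.3]
sigma = 1.0    # population standard deviation, assumed known (J)
conf = 0.95    # confidence level

x_bar = mean(energy)
z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # upper alpha/2 point, about 1.96
half_width = z * sigma / sqrt(len(energy))

lower, upper = x_bar - half_width, x_bar + half_width
print(f"{lower:.2f} <= mu <= {upper:.2f}")
```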
Practice Problem:
in 36 different locations in a river is found to be 2.6 grams per milliliter. Find the 95% and
99% confidence intervals for the mean zinc concentration in the river. Assume that the
population standard deviation is 0.3 gram per milliliter. Ans. 95%: 2.50 < μ < 2.70; 99%: 2.47 < μ < 2.73.
This means that in using x̄ to estimate μ, the error E = |x̄ − μ| is less than or equal
to Zα/2 (σ/√n) with confidence 100(1 − α)%. This is shown graphically in Figure 1.
In situations where the sample size can be controlled, we can choose n so that we are
100(1 − α)% confident that the error in estimating μ is less than a specified bound on
the error E. The appropriate sample size is found by choosing n such that Zα/2 (σ/√n) = E,
which gives
\[ n = \left( \frac{z_{\alpha/2}\,\sigma}{E} \right)^2 \]
Example 1. Consider the CVN test described in Example 1 and suppose that we want to
determine how many specimens must be tested to ensure that the 95% CI on μ for
A238 steel cut at 60°C has a length of at most 1.0 J. Because the bound on error in
Solution:
\[ n = \left( \frac{z_{\alpha/2}\,\sigma}{E} \right)^2 = \left( \frac{(1.96)(1)}{0.5} \right)^2 = 15.37 \]
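Because the number of specimens must be a whole number, the computed value is rounded up. The sketch below wraps the sample-size formula in a small helper. The function name `sample_size` and its parameters are hypothetical, introduced here only for illustration.

```python
from math import ceil
from statistics import NormalDist

def sample_size(sigma, error, conf=0.95):
    """Smallest n for which the CI half-width z * sigma / sqrt(n) is at most `error`."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # upper alpha/2 point
    return ceil((z * sigma / error) ** 2)

# CVN example: sigma = 1 J, total CI length 1.0 J, so the bound on error is E = 0.5 J
n_required = sample_size(sigma=1.0, error=0.5)
print(n_required)
```

Rounding up rather than to the nearest integer keeps the interval length within the stated limit.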
Known
The confidence interval in Equation 8.5 gives both a lower confidence bound and
an upper confidence bound for μ. Thus, it provides a two-sided CI. It is also possible
to obtain one-sided confidence bounds for μ by setting either the lower bound l= −∞
\[ \mu \le \bar{x} + z_{\alpha}\frac{\sigma}{\sqrt{n}} \]
\[ \bar{x} - z_{\alpha}\frac{\sigma}{\sqrt{n}} \le \mu \]
Example 1. The same data for impact testing from Example 1 are used to construct a
lower, one-sided 95% confidence interval for the mean impact energy. Recall that x̄ =
Solution:
\[ \bar{x} - z_{\alpha}\frac{\sigma}{\sqrt{n}} \le \mu \]
\[ 64.46 - 1.64\frac{1}{\sqrt{10}} \le \mu \]
\[ 63.94 \le \mu \]
The lower limit for the two-sided interval in Example 1 was 63.84. Because Zα <
Zα/2, the lower limit of a one-sided interval is always greater than the lower limit of a
two-sided interval of equal confidence. The one-sided interval does not bound μ from above
so that it still achieves 95% confidence with a slightly larger lower limit. If our interest is
only in the lower limit for μ, then the one-sided interval is preferred because it provides
equal confidence with a greater limit. Similarly, a one-sided upper limit is always less than
Practice Problem:
that the variance in reaction times to these types of stimuli is 4 sec² and that the
distribution of reaction times is approximately normal. The average time for the subjects
is 6.2 seconds. Give an upper 95% bound for the mean reaction time. Ans: 6.858
seconds.
\[ \frac{\bar{X} - \mu}{S / \sqrt{n}} \]
\[ \bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}} \]
100(1 − α) %.
Example 1. An article in the 1993 volume of the Transactions of the American Fisheries
largemouth bass.
A sample of fish was selected from 53 Florida lakes, and mercury concentration in
Solution:
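The required quantities are n = 53, x̄ = 0.5250, s = 0.3486, and Z0.025 = 1.96. The
approximate 95% CI on μ is
\[ \bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}} \]
\[ \bar{x} - z_{0.025}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + z_{0.025}\frac{s}{\sqrt{n}} \]
\[ 0.5250 - 1.96\frac{0.3486}{\sqrt{53}} \le \mu \le 0.5250 + 1.96\frac{0.3486}{\sqrt{53}} \]
\[ 0.4311 \le \mu \le 0.6189 \]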
This interval is fairly wide because there is substantial variability in the mercury
interval.
If x̄ and s are the mean and standard deviation of a random sample from a normal
by
\[ \bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \]
where tα/2,n−1 is the upper 100α/2 percentage point of the t distribution with n − 1 degrees
of freedom.
Adhesion Tests on Plasma Sprayed Thermal Barrier Coatings” (1989, Vol. 11(4), pp.
275–282)] describes the results of tensile adhesion tests on 22 U-700 alloy specimens.
Solution:
The sample mean is x̄ = 13.71, and the sample standard deviation is s = 3.55.
Figures 8.6 and 8.7 show a box plot and a normal probability plot of the tensile adhesion
test data, respectively. These displays provide good support for the assumption that the
(CI) is
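\[ \bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \]
\[ 13.71 - 2.080\frac{3.55}{\sqrt{22}} \le \mu \le 13.71 + 2.080\frac{3.55}{\sqrt{22}} \]
\[ 12.14 \le \mu \le 15.28 \]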
The CI is fairly wide because there is a lot of variability in the tensile adhesion test
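The t interval from this example can be recomputed from its summary statistics. The sketch below takes the critical value 2.080 (the upper 2.5% point of the t distribution with 21 degrees of freedom) directly from the worked solution instead of computing it, because the Python standard library has no t quantile function. The variable names are illustrative.

```python
from math import sqrt

n, x_bar, s = 22, 13.71, 3.55   # summary statistics from the example
t_crit = 2.080                  # t value for alpha/2 = 0.025 and 21 degrees of freedom, as used in the solution

half_width = t_crit * s / sqrt(n)
lower, upper = x_bar - half_width, x_bar + half_width
print(f"{lower:.2f} <= mu <= {upper:.2f}")
```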
t - distribution
\[ T = \frac{\bar{X} - \mu}{S / \sqrt{n}} \]
Distribution
are needed. When the population is modelled by a normal distribution, the tests and
intervals described in this section are applicable. The following result provides the basis
χ² Distribution
Let X1, X2, …, Xn be a random sample from a normal distribution with mean μ
\[ \chi^2 = \frac{(n-1)S^2}{\sigma^2} \]
normal distribution with unknown variance σ², then a 100(1 − α)% confidence interval
on σ² is
\[ \frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}} \]
where χ²α/2,n−1 and χ²1−α/2,n−1 are the upper and lower 100α/2 percentage points of the
interval for σ has lower and upper limits that are the square roots of the
respectively
\[ \frac{(n-1)s^2}{\chi^2_{\alpha,\,n-1}} \le \sigma^2 \quad \text{and} \quad \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha,\,n-1}} \]
Example 1. An automatic filling machine is used to fill bottles with liquid detergent. A
(fluid ounce). If the variance of fill volume is too large, an unacceptable proportion of
bottles will be under- or overfilled. We will assume that the fill volume is approximately
Solution:
\[ \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha,\,n-1}} \]
\[ \sigma^2 \le \frac{(20-1)\,0.0153}{\chi^2_{0.95,\,19}} = 0.0287 \ (\text{fluid ounce})^2 \]
This last expression may be converted into a confidence interval on the standard
σ ≤ 0.17
Therefore, at the 95% level of confidence, the data indicate that the process
standard deviation could be as large as 0.17 fluid ounce. The process engineer or
manager now needs to determine whether a standard deviation this large could lead to
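The upper confidence bound in this example can be verified as follows. The sketch assumes the tabulated value 10.117 for the lower 5% point of the chi-square distribution with 19 degrees of freedom. This number is taken from standard chi-square tables and does not appear in the text above. It is hard-coded because the Python standard library provides no chi-square quantile function.

```python
from math import sqrt

n, s2 = 20, 0.0153     # sample size and sample variance (fluid ounce^2)
chi2_lower = 10.117    # assumed table value for the point with area 0.95 to its right, 19 degrees of freedom

var_upper = (n - 1) * s2 / chi2_lower   # 95% upper bound on the variance
sd_upper = sqrt(var_upper)              # corresponding bound on the standard deviation

print(f"sigma^2 <= {var_upper:.4f}, sigma <= {sd_upper:.2f}")
```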
If we have two populations with means μ1 and μ2 and variances σ1² and σ2²,
respectively, a point estimator of the difference between μ1 and μ2 is given by the statistic
random samples, one from each population, of sizes n1 and n2, and compute x̄1 − x̄2, the
difference of the sample means. Clearly, we must consider the sampling distribution
of X̄1 − X̄2.
\[ (\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]
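\[ Z = \frac{X - np}{\sqrt{np(1-p)}} = \frac{\hat{P} - p}{\sqrt{p(1-p)/n}} \]
\[ \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
where Zα/2 is the upper 100α/2 percentage point of the standard normal distribution.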
100(1 − α)% confident that the error is less than some specified value E. If we set
\[ E = z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \]
and solve for n, the appropriate sample size is
\[ n = \left( \frac{z_{\alpha/2}}{E} \right)^2 p(1-p) \]
respectively.
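\[ \hat{p} - z_{\alpha}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le p \quad \text{and} \quad p \le \hat{p} + z_{\alpha}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]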
of a variable. This is a different problem than estimating the mean of that variable, so a
confidence interval is not appropriate. In this section, we show how to obtain a 100(1 − α)
A prediction interval provides bounds on one (or more) future observations from
the population. For example, a prediction interval could be used to bound a single, new
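\[ \bar{x} - z_{\alpha/2}\,\sigma\sqrt{1 + \frac{1}{n}} < x_0 < \bar{x} + z_{\alpha/2}\,\sigma\sqrt{1 + \frac{1}{n}} \quad (\sigma \text{ known}) \]
\[ \bar{x} - t_{\alpha/2}\,s\sqrt{1 + \frac{1}{n}} < x_0 < \bar{x} + t_{\alpha/2}\,s\sqrt{1 + \frac{1}{n}} \quad (\sigma \text{ unknown}) \]
where tα/2 is the t-value with v = n − 1 degrees of freedom, leaving an area of α/2 to the
right.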
Example 1. A meat inspector has randomly selected 30 packs of 95% lean beef. The
sample resulted in a mean of 96.2% with a sample standard deviation of 0.8%. Find a
99% prediction interval for the leanness of a new pack. Assume normality.
Solution:
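\[ \bar{x} - t_{\alpha/2}\,s\sqrt{1 + \frac{1}{n}} < x_0 < \bar{x} + t_{\alpha/2}\,s\sqrt{1 + \frac{1}{n}} \]
\[ 96.2 - (2.756)(0.8)\sqrt{1 + \frac{1}{30}} < x_0 < 96.2 + (2.756)(0.8)\sqrt{1 + \frac{1}{30}} \]
\[ 93.96 < x_0 < 98.44 \]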
Notice that the prediction interval is considerably longer than the CI. This is because the
observation.
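To see how the extra factor √(1 + 1/n) widens the interval, the sketch below evaluates the prediction interval for the lean beef example and, for comparison, the half-width of the corresponding confidence interval on the mean. The critical value 2.756 is taken from the worked solution. Computing the confidence interval half-width alongside it is an addition made here for comparison only.

```python
from math import sqrt

n, x_bar, s = 30, 96.2, 0.8   # summary statistics from the example (percent lean)
t_crit = 2.756                # t value for alpha/2 = 0.005 and 29 degrees of freedom, as used in the solution

pi_half = t_crit * s * sqrt(1 + 1 / n)   # half-width of the prediction interval
ci_half = t_crit * s / sqrt(n)           # half-width of the CI on the mean, for comparison

lower, upper = x_bar - pi_half, x_bar + pi_half
print(f"prediction interval: {lower:.2f} < x0 < {upper:.2f}")
print(f"half-widths: prediction {pi_half:.3f}, confidence {ci_half:.3f}")
```

The prediction half-width is several times the confidence half-width because it must also cover the variability of a single new observation.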
Practice Problem:
Due to the decrease in interest rates, the First Citizens Bank received a lot of
amount of $257,300. Assume a population standard deviation of $25,000. For the next
customer who fills out a mortgage application, find a 95% prediction interval for the loan
amount.
might like to calculate limits that bound 95% of the viscosity values.
x̄ − ks, x̄ + ks, or x̄ ± ks
where k is a tolerance interval factor found in Table I. Values are given for
γ = 90%, 95%, and 99%, and for 90%, 95%, and 99% confidence.
bounds can also be computed. The tolerance factors for these bounds are also given in
Table I.
Example 1. Consider Example 7. With the information given, find a tolerance interval that
gives two-sided 95% bounds on 90% of the distribution of packages of 95% lean
Recall from Example 7 that n = 30, the sample mean is 96.2%, and the sample
Solution:
x̄ ± ks
we find that the lower and upper bounds are 94.5 and 97.9. We are 95% confident that
the above range covers the central 90% of the distribution of 95% lean beef packages.
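The bounds quoted above follow from x̄ ± ks once the tolerance factor is known. The sketch below assumes k = 2.140, the two-sided factor commonly tabulated for n = 30 with 95% confidence and 90% coverage. This value is an assumption taken from standard tolerance-factor tables, since the factor itself is not stated in the text.

```python
n, x_bar, s = 30, 96.2, 0.8   # summary statistics recalled in the example
k = 2.140                     # assumed tolerance factor (95% confidence, 90% coverage, n = 30)

lower, upper = x_bar - k * s, x_bar + k * s
print(f"tolerance limits: {lower:.1f} to {upper:.1f}")
```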
REFERENCES:
Walpole, Ronald E., et al., Probability and Statistics for Engineers and Scientists, 9th ed., Pearson
Education Inc., 2016
Montgomery, Douglas C., et al., Applied Statistics and Probability for Engineers, 7th ed., John
Wiley & Sons (Asia) Pte Ltd, 2018
Spiegel, Murray R., et al., Probability and Statistics, 4th ed., McGraw Hill Companies Inc., 2013