
ENGINEERING DATA ANALYSIS

Sampling Distribution and Point Estimation of Parameters

Introduction:

 The field of statistical inference consists of those methods used to make decisions or to draw conclusions about a population.
 These methods utilize the information contained in a sample from the population in drawing conclusions.
 Statistical inference may be divided into two major areas: parameter estimation and hypothesis testing.

6.1. Point Estimation:

In statistics, point estimation involves the use of sample data to calculate a single
value (known as a point estimate since it identifies a point in some parameter
space) which is to serve as a "best guess" or "best estimate" of an unknown
population parameter (for example, the population mean). More formally, it is the
application of a point estimator to the data to obtain a point estimate.

Point estimation can be contrasted with interval estimation: such interval estimates are typically either confidence intervals, in the case of frequentist inference, or credible intervals, in the case of Bayesian inference.

 A point estimate is a reasonable value of a population parameter.


 Data collected, X1, X2, ..., Xn, are random variables.
 Functions of these random variables, such as the sample mean X̄ and the sample variance S2, are also random variables called statistics.
 Statistics have their own distributions, which are called sampling distributions.

6.2. Sampling Distribution and the Central Limit Theorem

6.2.1. Sampling Distribution

Suppose that we draw all possible samples of size n from a given population.
Suppose further that we compute a statistic (e.g., a mean, proportion, standard
deviation) for each sample. The probability distribution of this statistic is called
a sampling distribution. And the standard deviation of this statistic is called
the standard error.

Variability of a Sampling Distribution

The variability of a sampling distribution is measured by its variance or its standard deviation. The variability of a sampling distribution depends on three factors:

 N: The number of observations in the population.

 n: The number of observations in the sample.

 The way that the random sample is chosen.

If the population size is much larger than the sample size, then the sampling distribution has roughly the same standard error whether we sample with or without replacement. On the other hand, if the sample represents a significant fraction (say, 1/20 or more) of the population, the standard error will be meaningfully smaller when we sample without replacement.

Sampling Distribution of the Mean

Suppose we draw all possible samples of size n from a population of size N. Suppose further that we compute a mean score for each sample. In this way, we create a sampling distribution of the mean.

We know the following about the sampling distribution of the mean. The mean of the sampling distribution (μx̄) is equal to the mean of the population (μ). And the standard error of the sampling distribution (σx̄) is determined by the standard deviation of the population (σ), the population size (N), and the sample size (n). These relationships are shown in the equations below:

μx̄ = μ

σx̄ = [ σ / sqrt(n) ] * sqrt[ (N - n) / (N - 1) ]

In the standard error formula, the factor sqrt[ (N - n) / (N - 1) ] is called the finite population correction or fpc. When the population size is very large relative to the sample size, the fpc is approximately equal to one, and the standard error formula can be approximated by:

σx̄ = σ / sqrt(n)

You often see this "approximate" formula in introductory statistics texts. As a
general rule, it is safe to use the approximate formula when the sample size is no
bigger than 1/20 of the population size.
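As a quick numerical sketch of the two formulas, the helper below compares the exact standard error (with fpc) against the approximation, using hypothetical values (σ = 20, N = 10,000, n = 50):

```python
import math

def se_mean(sigma, n, N=None):
    """Standard error of the sample mean; applies the finite
    population correction (fpc) when the population size N is given."""
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))  # fpc factor
    return se

# Hypothetical population: sigma = 20, N = 10,000, sample size n = 50
exact = se_mean(20, 50, N=10_000)   # with fpc
approx = se_mean(20, 50)            # without fpc
print(round(exact, 2), round(approx, 2))
```

Since n/N = 0.005 is far below 1/20, the two values agree closely, as the rule above predicts.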

Sampling Distribution of the Proportion

In a population of size N, suppose that the probability of the occurrence of an event (dubbed a "success") is P, and the probability of the event's non-occurrence (dubbed a "failure") is Q = 1 - P. From this population, suppose that we draw all possible samples of size n. And finally, within each sample, suppose that we determine the proportion of successes p and failures q. In this way, we create a sampling distribution of the proportion.

We find that the mean of the sampling distribution of the proportion (μp) is equal to the probability of success in the population (P). And the standard error of the sampling distribution (σp) is determined by the standard deviation of the population (σ), the population size, and the sample size. These relationships are shown in the equations below:

μp = P

σp = [ σ / sqrt(n) ] * sqrt[ (N - n) / (N - 1) ]

σp = sqrt[ PQ/n ] * sqrt[ (N - n) / (N - 1) ]

where σ = sqrt[ PQ ]

Like the formula for the standard error of the mean, the formula for the standard error of the proportion uses the finite population correction, sqrt[ (N - n) / (N - 1) ]. When the population size is very large relative to the sample size, the fpc is approximately equal to one, and the standard error formula can be approximated by:

σp = sqrt[ PQ/n ]

You often see this "approximate" formula in introductory statistics texts. As a general rule, it is safe to use the approximate formula when the sample size is no bigger than 1/20 of the population size.
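The proportion formula can be sketched the same way; the values here (P = 0.5, n = 120) are hypothetical:

```python
import math

def se_proportion(P, n, N=None):
    """Standard error of the sample proportion, with optional
    finite population correction."""
    se = math.sqrt(P * (1 - P) / n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))  # fpc factor
    return se

# Hypothetical: P = 0.5, n = 120, effectively infinite population
print(round(se_proportion(0.5, 120), 5))
```

When N is given and the sample is a sizable fraction of it, the fpc pulls the standard error down below the approximate value.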

6.2.2 Central Limit Theorem

The central limit theorem states that the sampling distribution of the mean of any independent random variable will be normal or nearly normal, if the sample size is large enough.

How large is "large enough"? The answer depends on two factors.

 Requirements for accuracy. The more closely the sampling distribution needs to resemble a normal distribution, the more sample points will be required.
 The shape of the underlying population. The more closely the original population resembles a normal distribution, the fewer sample points will be required.

In practice, some statisticians say that a sample size of 30 is large enough when
the population distribution is roughly bell-shaped. Others recommend a sample
size of at least 40. But if the original population is distinctly not normal (e.g., is
badly skewed, has multiple peaks, and/or has outliers), researchers like the
sample size to be even larger.
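A small simulation illustrates the theorem. Even for a distinctly skewed population (here an exponential distribution, a hypothetical choice for illustration), the sample means of n = 30 observations cluster around the population mean with spread close to σ/sqrt(n):

```python
import random
import statistics

random.seed(1)
n = 30        # sample size ("large enough" by the rule of thumb above)
reps = 5000   # number of samples drawn

# Skewed population: exponential with mean 1 (its sd is also 1)
means = [statistics.fmean(random.expovariate(1.0) for _ in range(n))
         for _ in range(reps)]

# The sampling distribution of the mean centers on the population mean (1.0)
# with standard error close to sigma/sqrt(n) = 1/sqrt(30) ~ 0.18.
print(round(statistics.fmean(means), 2), round(statistics.stdev(means), 2))
```

Plotting a histogram of `means` would show a nearly symmetric bell shape even though the underlying population is badly skewed.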

How to Choose Between T-Distribution and Normal Distribution

The t distribution and the normal distribution can both be used with statistics that
have a bell-shaped distribution. This suggests that we might use either the t-
distribution or the normal distribution to analyze sampling distributions. Which
should we choose?

Guidelines exist to help you make that choice. Some focus on the population
standard deviation.

 If the population standard deviation is known, use the normal distribution.

 If the population standard deviation is unknown, use the t-distribution.

Other guidelines focus on sample size.

 If the sample size is large, use the normal distribution. (See the
discussion above in the section on the Central Limit Theorem to
understand what is meant by a "large" sample.)

 If the sample size is small, use the t-distribution.

Example: In practice, researchers employ a mix of the above guidelines. On this site, we use the normal distribution when the population standard deviation is known and the sample size is large. We might use either distribution when the standard deviation is unknown and the sample size is very large. We use the t-distribution when the sample size is small, unless the underlying distribution is not normal. The t distribution should not be used with small samples from populations that are not approximately normal.

Assume that a school district has 10,000 6th graders. In this district, the average
weight of a 6th grader is 80 pounds, with a standard deviation of 20 pounds.
Suppose you draw a random sample of 50 students. What is the probability that
the average weight of a sampled student will be less than 75 pounds?

Solution: To solve this problem, we need to define the sampling distribution of the
mean. Because our sample size is greater than 30, the Central Limit Theorem
tells us that the sampling distribution will approximate a normal distribution.

To define our normal distribution, we need to know both the mean of the
sampling distribution and the standard deviation. Finding the mean of the
sampling distribution is easy, since it is equal to the mean of the population.
Thus, the mean of the sampling distribution is equal to 80.

The standard deviation of the sampling distribution can be computed using the following formula:

σx̄ = [ σ / sqrt(n) ] * sqrt[ (N - n) / (N - 1) ]

σx̄ = [ 20 / sqrt(50) ] * sqrt[ (10,000 - 50) / (10,000 - 1) ]

σx̄ = 2.81

Let's review what we know and what we want to know. We know that the
sampling distribution of the mean is normally distributed with a mean of 80 and a
standard deviation of 2.81. We want to know the probability that a sample mean
is less than or equal to 75 pounds.

Because we know the population standard deviation and the sample size is
large, we'll use the normal distribution to find probability. To solve the problem,
we plug these inputs into the Normal Probability Calculator: mean = 80, standard
deviation = 2.81, and normal random variable = 75. The Calculator tells us that
the probability that the average weight of a sampled student is less than 75
pounds is equal to 0.038.

Note: Since the population size is more than 20 times greater than the sample size, we could have used the "approximate" formula σx̄ = σ / sqrt(n) to compute the standard error. Had we done that, we would have found a standard error equal to 20 / sqrt(50), or 2.83.
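The calculator lookup can be reproduced with the standard normal CDF; `NormalDist` is part of the Python standard library:

```python
from statistics import NormalDist

mu, sigma, N, n = 80, 20, 10_000, 50

# Standard error with the finite population correction
se = (sigma / n ** 0.5) * ((N - n) / (N - 1)) ** 0.5

# P(sample mean < 75) under the normal sampling distribution
p = NormalDist(mu, se).cdf(75)
print(round(p, 3))  # matches the 0.038 found above
```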

Example: Find the probability that of the next 120 births, no more than 40% will
be boys. Assume equal probabilities for the births of boys and girls. Assume also
that the number of births in the population (N) is very large, essentially infinite.

Solution: The Central Limit Theorem tells us that the proportion of boys in 120
births will be approximately normally distributed.

The mean of the sampling distribution will be equal to the mean of the population
distribution. In the population, half of the births result in boys; and half, in girls.
Therefore, the probability of boy births in the population is 0.50. Thus, the mean
proportion in the sampling distribution should also be 0.50.

The standard deviation of the sampling distribution (i.e., the standard error) can be computed using the following formula:

σp = sqrt[ PQ/n ] * sqrt[ (N - n) / (N - 1) ]

Here, the finite population correction is equal to 1.0, since the population size (N) was assumed to be infinite. Therefore, the standard error formula reduces to:

σp = sqrt[ PQ/n ]

σp = sqrt[ (0.5)(0.5) / 120 ]

σp = 0.04564

Let's review what we know and what we want to know. We know that the
sampling distribution of the proportion is normally distributed with a mean of 0.50
and a standard deviation of 0.04564. We want to know the probability that no
more than 40% of the sampled births are boys.

Because we know the population standard deviation and the sample size is
large, we'll use the normal distribution to find probability. To solve the problem,
we plug these inputs into the Normal Probability Calculator: mean = .5, standard
deviation = 0.04564, and the normal random variable = .4. The Calculator tells us
that the probability that no more than 40% of the sampled births are boys is equal
to 0.014.

Note: This problem can also be treated as a binomial experiment. Elsewhere, we showed how to analyze a binomial experiment. The binomial experiment is actually the more exact analysis. It produces a probability of 0.018 (versus the probability of 0.014 that we found using the normal distribution). Without a computer, the binomial approach is computationally demanding. Therefore, many statistics texts emphasize the approach presented above, which uses the normal distribution to approximate the binomial.
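Both analyses can be checked numerically; the exact binomial sum uses `math.comb` (40% of 120 births is 48 boys):

```python
import math
from statistics import NormalDist

n, P = 120, 0.5
se = math.sqrt(P * (1 - P) / n)   # 0.04564

# Normal approximation: P(sample proportion <= 0.40)
p_normal = NormalDist(P, se).cdf(0.40)

# Exact binomial: P(at most 48 boys out of 120)
p_exact = sum(math.comb(n, k) for k in range(49)) * 0.5 ** n

print(round(p_normal, 3), round(p_exact, 3))
```

The small gap between the two answers is the cost of approximating a discrete distribution with a continuous one.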

6.3. General Concept of Point Estimation

We'll start the lesson with some formal definitions. In doing so, recall that we
denote the n random variables arising from a random sample as subscripted
uppercase letters:
X1, X2, ..., Xn
The corresponding observed values of a specific random sample are then
denoted as subscripted lowercase letters:

x1, x2, ..., xn

Definition. The range of possible values of the parameter θ is called the parameter space Ω (the Greek letter "omega").

For example, if μ denotes the mean grade point average of all college students, then the parameter space (assuming a 4-point grading scale) is:

Ω = {μ: 0 ≤ μ ≤ 4}

And, if p denotes the proportion of students who smoke cigarettes, then the parameter space is:

Ω = {p: 0 ≤ p ≤ 1}

Definition. The function of X1, X2, ..., Xn, that is, the
statistic u(X1, X2, ..., Xn), used to estimate θ is called a point
estimator of θ.

For example, the function:

X̄ = (1/n) Σ Xi

is a point estimator of the population mean μ. The function:

p̂ = (1/n) Σ Xi

(where Xi = 0 or 1) is a point estimator of the population proportion p. And, the function:

S2 = [1/(n - 1)] Σ (Xi - X̄)2

is a point estimator of the population variance σ2.

Definition. The function u(x1, x2, ..., xn) computed from a set of
data is an observed point estimate of θ.

For example, if the xi are the observed grade point averages of a sample of 88 students, then:

x̄ = (1/88) Σ xi

is a point estimate of μ, the mean grade point average of all the students in the population.

And, if xi = 0 if a student has no tattoo, and xi = 1 if a student has a tattoo, then:

p̂ = (1/88) Σ xi = 0.11

is a point estimate of p, the proportion of all students in the population who have a tattoo.

6.3.1. Unbiased Estimator and Variance of a Point Estimator

On the previous page, we showed that if Xi are Bernoulli random variables with parameter p, then:

p̂ = (1/n) Σ Xi

is the maximum likelihood estimator of p. And, if Xi are normally distributed random variables with mean μ and variance σ2, then:

μ̂ = X̄ = (1/n) Σ Xi and σ̂2 = (1/n) Σ (Xi - X̄)2

are the maximum likelihood estimators of μ and σ2, respectively. A natural question then is whether or not these estimators are "good" in any sense. One measure of "good" is "unbiasedness."

Definition. If the following holds:

E[u(X1, X2, ..., Xn)] = θ

then the statistic u(X1, X2, ..., Xn) is an unbiased estimator of the parameter θ. Otherwise, u(X1, X2, ..., Xn) is a biased estimator of θ.

Example: If Xi is a Bernoulli random variable with parameter p, then:

p̂ = (1/n) Σ Xi

is the maximum likelihood estimator (MLE) of p. Is the MLE of p an unbiased estimator of p?

Solution. Recall that if Xi is a Bernoulli random variable with parameter p, then E(Xi) = p. Therefore:

E(p̂) = E[ (1/n) Σ Xi ] = (1/n) Σ E(Xi) = (1/n) Σ p = (1/n)(np) = p

The first equality holds because we've merely replaced p̂ with its definition. The second equality holds by the rules of expectation for a linear combination. The third equality holds because E(Xi) = p. The fourth equality holds because when you add the value p up n times, you get np. And, of course, the last equality is simple algebra.

In summary, we have shown that:

E(p̂) = p

Therefore, the maximum likelihood estimator is an unbiased estimator of p.
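The unbiasedness of p̂ can also be seen empirically: averaged over many simulated samples, the sample proportion lands on p. The parameter values below are hypothetical:

```python
import random
import statistics

random.seed(0)
p, n, reps = 0.3, 50, 2000   # hypothetical Bernoulli parameter and sample size

# Draw many samples; compute p-hat = (1/n) * sum(X_i) for each
p_hats = [sum(random.random() < p for _ in range(n)) / n for _ in range(reps)]

# E(p-hat) = p, so the average of the estimates should be close to 0.3
print(round(statistics.fmean(p_hats), 2))
```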


Example: If Xi are normally distributed random variables with mean μ and variance σ2, what is an unbiased estimator of σ2? Is S2 unbiased?

Solution. Recall that if Xi is a normally distributed random variable with mean μ and variance σ2, then:

(n - 1)S2/σ2 ~ χ2(n - 1)

Also, recall that the expected value of a chi-square random variable is its degrees of freedom. That is, if:

X ~ χ2(r)

then E(X) = r. Therefore:

E(S2) = E[ (σ2/(n - 1)) · ((n - 1)S2/σ2) ] = (σ2/(n - 1)) · E[ (n - 1)S2/σ2 ] = (σ2/(n - 1)) · (n - 1) = σ2

The first equality holds because we effectively multiplied the sample variance by 1. The second equality holds by the law of expectation that tells us we can pull a constant through the expectation. The third equality holds because of the two facts we recalled above. That is:

E[ (n - 1)S2/σ2 ] = n - 1

And, the last equality is again simple algebra.

In summary, we have shown that, if Xi is a normally distributed random variable with mean μ and variance σ2, then S2 is an unbiased estimator of σ2. It turns out, however, that S2 is always an unbiased estimator of σ2, that is, for any model, not just the normal model. (You'll be asked to show this in the homework.) And, although S2 is always an unbiased estimator of σ2, S is not an unbiased estimator of σ.
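A simulation contrasts S2 (divide by n - 1) with the MLE σ̂2 (divide by n): the first averages to σ2, while the second averages to σ2(n - 1)/n. The sample size and σ2 below are hypothetical:

```python
import random
import statistics

random.seed(42)
n, reps = 5, 20000   # small samples from N(0, 1), so sigma^2 = 1

s2_vals, mle_vals = [], []
for _ in range(reps):
    x = [random.gauss(0, 1) for _ in range(n)]
    xbar = sum(x) / n
    ss = sum((xi - xbar) ** 2 for xi in x)
    s2_vals.append(ss / (n - 1))   # unbiased S^2
    mle_vals.append(ss / n)        # MLE, biased low

# E(S^2) = 1.0, while E(MLE) = (n - 1)/n = 0.8
print(round(statistics.fmean(s2_vals), 2), round(statistics.fmean(mle_vals), 2))
```

The bias of the MLE shrinks as n grows, which is why the distinction matters most for small samples.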

Example: Let T be the time that is needed for a specific task in a factory to be completed. In order to estimate the mean and variance of T, we observe a random sample T1, T2, ..., T6. Thus, the Ti's are i.i.d. and have the same distribution as T. We obtain the following values (in minutes):

18, 21, 17, 16, 24, 20

Find the values of the sample mean, the sample variance, and the sample standard deviation for the observed sample.

The sample mean is

T̄ = (18 + 21 + 17 + 16 + 24 + 20) / 6 = 19.33

The sample variance is given by

S2 = [1/(6 - 1)] Σ (Ti - T̄)2 = 8.67

Finally, the sample standard deviation is given by

S = sqrt(8.67) = 2.94
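The same numbers fall out of Python's `statistics` module, whose `variance` and `stdev` use the n - 1 divisor:

```python
import statistics

times = [18, 21, 17, 16, 24, 20]   # observed task times, in minutes

t_bar = statistics.mean(times)     # sample mean
s2 = statistics.variance(times)    # sample variance (divides by n - 1)
s = statistics.stdev(times)        # sample standard deviation

print(round(t_bar, 2), round(s2, 2), round(s, 2))  # 19.33 8.67 2.94
```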

6.3.2. Standard Error

The standard error of an estimator is its standard deviation:

se(θ̂) = std(θ̂)

Let's calculate the standard error of the sample mean estimator X̄:

se(X̄) = std(X̄) = σ / sqrt(n)

where σ is the standard deviation std(X) being estimated. We don't know the standard deviation σ of X, but we can approximate the standard error based upon some estimated value s for σ. Irrespective of the value of σ, the standard error decreases with the square root of the sample size n. Quadrupling the sample size halves the standard error.

6.3.3. Mean Squared Error

We seek estimators that are unbiased and have minimal standard error. Sometimes these goals are incompatible. Consider Exhibit 4.2, which indicates PDFs for two estimators of a parameter θ. One is unbiased. The other is biased but has a lower standard error. Which estimator should we use?

Exhibit 4.2: PDFs are indicated for two estimators of a parameter θ. One is unbiased. The other is biased but has lower standard error.

Mean squared error (MSE) combines the notions of bias and standard error. It is defined as

MSE(θ̂) = E[(θ̂ - θ)2] = bias(θ̂)2 + se(θ̂)2

Once we have determined the bias and standard error of an estimator, calculating its mean squared error is easy. Faced with alternative estimators for a given parameter, it is generally reasonable to use the one with the smallest MSE.
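The decomposition MSE = bias2 + se2 can be checked numerically. The sketch below compares the unbiased sample mean with a deliberately shrunken (biased) estimator, 0.8 · X̄, a hypothetical choice made purely for illustration:

```python
import random
import statistics

random.seed(7)
theta, sigma, n, reps = 10.0, 4.0, 8, 20000   # hypothetical true mean and sd

def mse(estimates, theta):
    bias = statistics.fmean(estimates) - theta
    se = statistics.stdev(estimates)   # standard error of the estimator
    return bias ** 2 + se ** 2         # MSE = bias^2 + se^2

xbar, shrunk = [], []
for _ in range(reps):
    x = [random.gauss(theta, sigma) for _ in range(n)]
    m = sum(x) / n
    xbar.append(m)          # unbiased estimator
    shrunk.append(0.8 * m)  # biased estimator with a smaller standard error

print(round(mse(xbar, theta), 2), round(mse(shrunk, theta), 2))
```

Here the shrinkage is too aggressive: despite its smaller standard error, the biased estimator ends up with the larger MSE, so the sample mean would be preferred.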

References:

Holton, G. (n.d.). Value-at-Risk, Second Edition. [Available online] Retrieved from: https://www.value-at-risk.net/bias/

Pishro-Nik, H. (n.d.). Introduction to Probability, Statistics, and Random Processes. [Available online] Retrieved from: https://www.probabilitycourse.com/chapter8/8_2_2_point_estimators_for_mean_and_var.php

The Pennsylvania State University (2018). Probability Theory and Mathematical Statistics. [Available online] Retrieved from: https://newonlinecourses.science.psu.edu/stat414/node/192/

Stat Trek (n.d.). Sampling Distribution. [Available online] Retrieved from: https://stattrek.com/sampling/sampling-distribution.aspx

Statistical Intervals
Confidence Intervals

In statistical inference, one wishes to estimate population parameters using observed sample data.

A confidence interval gives an estimated range of values which is likely to include an unknown population parameter, the estimated range being calculated from a given set of sample data. A confidence interval can be constructed at any confidence level, with the most common being 95% or 99%.

The common notation for the parameter in question is θ. Often, this parameter is the population mean μ, which is estimated through the sample mean x̄.

The level C of a confidence interval gives the probability that the interval produced by the method employed includes the true value of the parameter θ.

Example

Suppose a student measuring the boiling temperature of a certain liquid observes the
readings (in degrees Celsius) 102.5, 101.7, 103.1, 100.9, 100.5, and 102.2 on 6 different
samples of the liquid. He calculates the sample mean to be 101.82. If he knows that the
standard deviation for this procedure is 1.2 degrees, what is the confidence interval for
the population mean at a 95% confidence level?

In other words, the student wishes to estimate the true mean boiling temperature of the liquid using the results of his measurements. If the measurements follow a normal distribution, then the sample mean will have the distribution N(μ, σ/sqrt(n)). Since the sample size is 6, the standard deviation of the sample mean is equal to 1.2/sqrt(6) = 0.49.

The selection of a confidence level for an interval determines the probability that the confidence interval produced will contain the true parameter value. Common choices for the confidence level C are 0.90, 0.95, and 0.99. These levels correspond to percentages of the area of the normal density curve. For example, a 95% confidence interval covers 95% of the normal curve -- the probability of observing a value outside of this area is less than 0.05. Because the normal curve is symmetric, half of the area is in the left tail of the curve, and the other half of the area is in the right tail of the curve. For a confidence interval with level C, the area in each tail of the curve is equal to (1-C)/2. For a 95% confidence interval, the area in each tail is equal to 0.05/2 = 0.025.

The value z* representing the point on the standard normal density curve such that the
probability of observing a value greater than z* is equal to p is known as the
upper p critical value of the standard normal distribution. For example, if p = 0.025, the
value z* such that P(Z > z*) = 0.025, or P(Z < z*) = 0.975, is equal to 1.96. For a
confidence interval with level C, the value p is equal to (1-C)/2. A 95% confidence
interval for the standard normal distribution, then, is the interval (-1.96, 1.96), since 95%
of the area under the curve falls within this interval.

Confidence Intervals for Unknown Mean and Known Standard Deviation

For a population with unknown mean μ and known standard deviation σ, a confidence interval for the population mean, based on a simple random sample (SRS) of size n, is x̄ ± z* σ/sqrt(n), where z* is the upper (1-C)/2 critical value for the standard normal distribution.

An increase in sample size will decrease the length of the confidence interval without reducing the level of confidence. This is because the standard deviation of the sample mean decreases as n increases. The margin of error m of a confidence interval is defined to be the value added or subtracted from the sample mean which determines the length of the interval: m = z* σ/sqrt(n).
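Completing the boiling-temperature example with this formula (z* = 1.96 for 95% confidence):

```python
import math

readings = [102.5, 101.7, 103.1, 100.9, 100.5, 102.2]
sigma = 1.2    # known standard deviation of the measurement procedure
z_star = 1.96  # upper 0.025 critical value of N(0, 1)

xbar = sum(readings) / len(readings)   # 101.82
se = sigma / math.sqrt(len(readings))  # 0.49
margin = z_star * se

lo, hi = xbar - margin, xbar + margin
print(round(lo, 2), round(hi, 2))  # 100.86 102.78
```

So the student can be 95% confident that the true mean boiling temperature lies between about 100.86 and 102.78 degrees.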

Confidence Intervals for Unknown Mean and Unknown Standard Deviation

In most practical research, the standard deviation for the population of interest is not known. In this case, the standard deviation σ is replaced by the estimated standard deviation s, also known as the standard error. Since the standard error is an estimate for the true value of the standard deviation, the distribution of the sample mean is no longer normal with mean μ and standard deviation σ/sqrt(n). Instead, the sample mean follows the t distribution with mean μ and standard deviation s/sqrt(n). The t distribution is also described by its degrees of freedom. For a sample of size n, the t distribution will have n-1 degrees of freedom. The notation for a t distribution with k degrees of freedom is t(k). As the sample size n increases, the t distribution becomes closer to the normal distribution, since the standard error approaches the true standard deviation for large n.

For a population with unknown mean μ and unknown standard deviation σ, a confidence interval for the population mean, based on a simple random sample (SRS) of size n, is x̄ ± t* s/sqrt(n), where t* is the upper (1-C)/2 critical value for the t distribution with n-1 degrees of freedom, t(n-1).
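If the student in the earlier example had not known σ, he would estimate it from the readings and use a t critical value instead; t* = 2.571 for 95% confidence with 5 degrees of freedom is taken from a t table, since the Python standard library has no t quantile function:

```python
import math
import statistics

readings = [102.5, 101.7, 103.1, 100.9, 100.5, 102.2]
n = len(readings)

xbar = statistics.mean(readings)
s = statistics.stdev(readings)  # estimated standard deviation
t_star = 2.571                  # upper 0.025 critical value of t(5), from a table

margin = t_star * s / math.sqrt(n)
lo, hi = xbar - margin, xbar + margin
print(round(lo, 2), round(hi, 2))
```

The resulting interval is somewhat wider than the known-σ interval, reflecting the extra uncertainty from estimating the standard deviation.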

Prediction Intervals

Predicting the next future observation with a 100(1-α)% prediction interval

Suppose that X1, X2, ..., Xn is a random sample from a normal population. We wish to predict the value of Xn+1, a single future observation. A point prediction of Xn+1 is X̄, the sample mean. The prediction error is Xn+1 - X̄. The expected value of the prediction error is

E(Xn+1 - X̄) = 0

and the variance of the prediction error is

V(Xn+1 - X̄) = σ2 + σ2/n = σ2(1 + 1/n)

because the future observation Xn+1 is independent of the mean of the current sample X̄. The prediction error is normally distributed. Therefore

Z = (Xn+1 - X̄) / [ σ sqrt(1 + 1/n) ]

has a standard normal distribution. Replacing σ with S results in

T = (Xn+1 - X̄) / [ S sqrt(1 + 1/n) ]

which has a t distribution with n - 1 degrees of freedom. Manipulating T as we have done previously in the development of a CI leads to a prediction interval on the future observation.

Definition:

A 100(1-α)% prediction interval on a single future observation from a normal distribution is given by

x̄ - t(α/2, n-1) s sqrt(1 + 1/n) ≤ Xn+1 ≤ x̄ + t(α/2, n-1) s sqrt(1 + 1/n)

EXAMPLE: (Prediction Intervals)

Reconsider the tensile adhesion tests on specimens of U-700 alloy described in Example 8-4. The load at failure for the specimens was observed, giving a sample mean x̄ and sample standard deviation s, and the 95% confidence interval on μ was computed earlier. We plan to test a twenty-third specimen. A 95% prediction interval on the load at failure for this specimen is

x̄ ± t(0.025, 21) s sqrt(1 + 1/22)

Notice that the prediction interval is considerably longer than the CI.
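Since the alloy numbers are not reproduced here, the boiling-temperature readings from the confidence-interval section serve as stand-in data for a sketch of the definition; a 95% prediction interval for a seventh reading would be:

```python
import math
import statistics

readings = [102.5, 101.7, 103.1, 100.9, 100.5, 102.2]
n = len(readings)

xbar = statistics.mean(readings)
s = statistics.stdev(readings)
t_star = 2.571  # t(0.025, 5), from a t table

# Prediction interval uses sqrt(1 + 1/n), not sqrt(1/n)
margin = t_star * s * math.sqrt(1 + 1 / n)
lo, hi = xbar - margin, xbar + margin
print(round(lo, 2), round(hi, 2))
```

The margin here is about ±2.73, far wider than the ±1.03 of the confidence interval built from the same data, because a single future observation carries the full population variability on top of the uncertainty in x̄.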

Tolerance Interval

Consider a population of semiconductor processors. Suppose that the speed of these processors has a normal distribution with mean μ = 600 megahertz and standard deviation σ = 30 megahertz. Then the interval from 600 - 1.96(30) = 541.2 to 600 + 1.96(30) = 658.8 megahertz captures the speed of 95% of the processors in this population, because the interval from -1.96 to 1.96 captures 95% of the area under the standard normal curve. The interval from μ - 1.96σ to μ + 1.96σ is called a tolerance interval.

If μ and σ are unknown, an interval intended to capture a specific percentage of the values of the population will (probably) contain less than this percentage, because of sampling variability in x̄ and s.

A tolerance interval for capturing at least γ% of the values in a normal distribution with confidence level 100(1-α)% is

x̄ - ks, x̄ + ks

where k is a tolerance interval factor found in Appendix Table XI. Values are given for γ = 90%, 95%, and 99%, and for 95% and 99% confidence.

One-sided tolerance bounds can also be computed. The tolerance factors for these bounds are also given in Appendix Table XI.

EXAMPLE: (Tolerance Intervals)

Let's reconsider the tensile adhesion tests. The load at failure for the specimens was observed, and the sample mean x̄ and sample standard deviation s were computed. We want to find a tolerance interval for the load at failure that includes 90% of the values in the population with 95% confidence. From Appendix Table XI, we obtain the tolerance factor k for n = 22, γ = 90%, and 95% confidence. The desired tolerance interval is

(x̄ - ks, x̄ + ks)

which reduces to (23.67, 39.75). We can be 95% confident that at least 90% of the values of load at failure for this particular alloy lie between 23.67 and 39.75 megapascals.
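The known-parameter tolerance interval from the semiconductor-processor example at the start of this section is straightforward to reproduce:

```python
mu, sigma = 600.0, 30.0  # processor speed distribution, in megahertz
z = 1.96                 # captures the middle 95% of a normal population

# Tolerance interval mu +/- z*sigma when mu and sigma are known
lo, hi = mu - z * sigma, mu + z * sigma
print(round(lo, 1), round(hi, 1))  # 541.2 658.8
```

When μ and σ must be estimated, z is replaced by the larger factor k from a tolerance table, which widens the interval to account for sampling variability.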
References:

Easton, V. J. and McColl, J. H. Statistics Glossary v1.1.

Montgomery, D. C. and Runger, G. C. Applied Statistics and Probability for Engineers (Third Edition).

Chapter VIII

TEST OF HYPOTHESIS FOR A SINGLE SAMPLE

8.1. Hypothesis Testing

What is Hypothesis Testing?

Hypothesis testing was introduced by Ronald Fisher, Jerzy Neyman, Karl Pearson and Pearson's son, Egon Pearson. Hypothesis testing is a statistical method that is used in making statistical decisions using experimental data. Hypothesis testing is basically the evaluation of an assumption that we make about a population parameter.

A statistical hypothesis is an assumption about a population parameter. This assumption may or may not be true.

Hypothesis testing refers to the formal procedures used by statisticians to accept or reject statistical hypotheses.

Statistical Hypotheses
The best way to determine whether a statistical hypothesis is true would be to
examine the entire population. Since that is often impractical, researchers typically
examine a random sample from the population. If sample data are not consistent with
the statistical hypothesis, the hypothesis is rejected.

There are two types of statistical hypotheses.

Null hypothesis. The null hypothesis, denoted by H0, is usually the hypothesis that sample observations result purely from chance.

Alternative hypothesis. The alternative hypothesis, denoted by H1 or Ha, is the hypothesis that sample observations are influenced by some non-random cause.

Hypothesis testing is an act in statistics whereby an analyst tests an assumption regarding a population parameter. The methodology employed by the analyst depends on the nature of the data used and the reason for the analysis. Hypothesis testing is used to infer the result of a hypothesis performed on sample data from a larger population.

Hypothesis testing in statistics is a way for you to test the results of a survey or experiment to see if you have meaningful results. You're basically testing whether your results are valid by figuring out the odds that your results have happened by chance. If your results may have happened by chance, the experiment won't be repeatable and so has little use.

Hypothesis testing can be one of the most confusing aspects for students, mostly because before you can even perform a test, you have to know what your null hypothesis is. Often, those tricky word problems that you are faced with can be difficult to decipher. But it's easier than you think; all you need to do is:

1. Figure out your null hypothesis,

2. State your null hypothesis,

3. Choose what kind of test you need to perform,

4. Either support or reject the null hypothesis.

What is a Hypothesis Statement?

If you are going to propose a hypothesis, it's customary to write a statement. Your statement will look like this:

"If I...(do this to an independent variable)...then (this will happen to the dependent variable)."

A good hypothesis statement should:

 Include an "if" and "then" statement (according to the University of California).
 Include both the independent and dependent variables.
 Be testable by experiment, survey or other scientifically sound technique.
 Be based on information in prior research (either yours or someone else's).
 Have design criteria (for engineering or programming projects).

What is the Null Hypothesis?

If you trace back the history of science, the null hypothesis is always the accepted fact. Simple examples of null hypotheses that are generally accepted as being true are:

1. DNA is shaped like a double helix.

2. There are 8 planets in the solar system (excluding Pluto).

3. Taking Vioxx can increase your risk of heart problems (a drug now taken off the
market).

How do I State the Null Hypothesis?

You won't be required to actually perform a real experiment or survey in elementary statistics (or even disprove a fact like "Pluto is a planet"!), so you'll be given word problems from real-life situations. You'll need to figure out what your hypothesis is from the problem. This can be a little trickier than just figuring out what the accepted fact is. With word problems, you are looking to find a fact that is nullifiable (i.e., something you can reject).

8.1.1 One-tailed and Two-tailed Hypothesis

Critical Regions in a Hypothesis Test

In hypothesis tests, critical regions are ranges of the distributions where the values represent statistically significant results. Analysts define the size and location of the critical regions by specifying both the significance level (alpha) and whether the test is one-tailed or two-tailed.

Consider the following two facts:

1. The significance level is the probability of rejecting a null hypothesis that is correct.

2. The sampling distribution for a test statistic assumes that the null hypothesis is
correct.

Consequently, to represent the critical regions on the distribution for a test statistic, you merely shade the appropriate percentage of the distribution. For the common significance level of 0.05, you shade 5% of the distribution.

Two-Tailed Hypothesis Tests

Two-tailed hypothesis tests are also known as nondirectional and two-sided tests because you can test for effects in both directions. When you perform a two-tailed test, you split the significance level percentage between both tails of the distribution. In the example below, I use an alpha of 5%, and the distribution has two shaded regions of 2.5% each (2 * 2.5% = 5%).

When a test statistic falls in either critical region, your sample data are sufficiently
incompatible with the null hypothesis that you can reject it for the population.

In a two-tailed test, the generic null and alternative hypotheses are the following:

Null: The effect equals zero.

Alternative: The effect does not equal zero.

The specifics of the hypotheses depend on the type of test you perform because
you might be assessing means, proportions, or rates.

Advantages of two-tailed hypothesis tests


You can detect both positive and negative effects. Two-tailed tests are standard
in scientific research where discovering any type of effect is usually of interest to
researchers.

One-Tailed Hypothesis Tests


One-tailed hypothesis tests are also known as directional and one-sided tests
because you can test for effects in only one direction. When you perform a one-tailed
test, the entire significance level percentage goes into the extreme end of one tail of the
distribution.

With an alpha of 5%, for example, the distribution has one shaded
region of 5%. When you perform a one-tailed test, you must determine whether the
critical region is in the left tail or the right tail. The test can detect an effect only in the
direction that has the critical region. It has absolutely no capacity to detect an effect in
the other direction.

In a one-tailed test, you have two options for the null and alternative hypotheses,
which correspond to where you place the critical region.

You can choose either of the following sets of generic hypotheses:

Null: The effect is less than or equal to zero.

Alternative: The effect is greater than zero.

Or:

Null: The effect is greater than or equal to zero.

Alternative: The effect is less than zero.
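The rejection boundary for each of these two options can be sketched the same way (standard library only; the 1.645 value matches the z-table entry used in Section 8.5):

```python
from statistics import NormalDist

alpha = 0.05  # the entire significance level goes into one tail

# Right-tailed test (alternative: effect > 0): reject H0 if z0 > z_upper.
z_upper = NormalDist().inv_cdf(1 - alpha)

# Left-tailed test (alternative: effect < 0): reject H0 if z0 < z_lower.
z_lower = NormalDist().inv_cdf(alpha)

print(round(z_upper, 3))  # 1.645
print(round(z_lower, 3))  # -1.645
```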

Again, the specifics of the hypotheses depend on the type of test you perform.

Notice how for both possible null hypotheses the tests can't distinguish between
zero and an effect in a particular direction. For example, in the second set above,
the null combines "the effect is greater than or equal to zero" into a single category. That
test can't differentiate between zero and greater than zero.

Advantages and disadvantages of one-tailed hypothesis tests


One-tailed tests have more statistical power to detect an effect in one direction
than a two-tailed test with the same design and significance level. One-tailed tests occur
most frequently for studies where one of the following is true:

1. Effects can exist in only one direction.

2. Effects can exist in both directions but the researchers only care about an
effect in one direction. There is no drawback to failing to detect an effect in the other
direction. (Not recommended.)

The disadvantage of one-tailed tests is that they have no statistical power to detect
an effect in the other direction.

8.1.2 P-value in Hypothesis Test

The P value, or calculated probability, is the probability of finding the observed,
or more extreme, results when the null hypothesis (H0) of a study question is true; the
definition of "extreme" depends on how the hypothesis is being tested. P is also
described in terms of rejecting H0 when it is actually true; however, it is not a direct
probability of this state.

The term significance level (alpha) is used to refer to a pre-chosen probability


and the term "P value" is used to indicate a probability that you calculate after a given
study.

If your P value is less than the chosen significance level then you reject the null
hypothesis i.e. accept that your sample gives reasonable evidence to support the
alternative hypothesis. It does NOT imply a "meaningful" or "important" difference; that is
for you to decide when considering the real-world relevance of your result.
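This decision rule can be sketched for a two-sided z-test (a minimal illustration using only the standard library; the z0 value below is illustrative, not from the source):

```python
from statistics import NormalDist

alpha = 0.05   # pre-chosen significance level
z0 = 3.25      # illustrative two-sided z-test statistic

# P value: probability of a result at least this extreme when H0 is true.
p_value = 2 * (1 - NormalDist().cdf(abs(z0)))

# Reject H0 when the P value falls below the significance level.
reject_h0 = p_value < alpha
print(reject_h0)  # True: p is roughly 0.001, well below 0.05
```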

Type I error is the false rejection of the null hypothesis and type II error is the
false acceptance of the null hypothesis. As an aide-mémoire: think that our cynical society
rejects before it accepts.

The significance level (alpha) is the probability of type I error. The power of a
test is one minus the probability of type II error (beta). Power should be maximised when
selecting statistical methods.

The following table shows the relationship between power and error in
hypothesis testing:

                           DECISION
TRUTH             Accept H0:               Reject H0:
H0 is true:       correct decision,        type I error,
                  P = 1 - alpha            P = alpha (significance)
H0 is false:      type II error,           correct decision,
                  P = beta                 P = 1 - beta (power)

H0 = null hypothesis
P = probability
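Since power = 1 - beta, it can be computed directly for a two-sided z-test with known sigma (a sketch under those assumptions; the function name is mine, not from the source):

```python
from math import sqrt
from statistics import NormalDist

def power_two_sided_z(delta, sigma, n, alpha=0.05):
    """Power (1 - beta) of a two-sided z-test against a true mean shift delta."""
    nd = NormalDist()
    z_a = nd.inv_cdf(1 - alpha / 2)                    # two-sided critical value
    shift = delta * sqrt(n) / sigma                    # shift of z0 under the alternative
    beta = nd.cdf(z_a - shift) - nd.cdf(-z_a - shift)  # P(fail to reject H0)
    return 1 - beta

# Beta gets smaller (power gets larger) as the sample size grows.
print(power_two_sided_z(1.0, 2.0, 25))
print(power_two_sided_z(1.0, 2.0, 100))
```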

Notes about Type I error:


 is the incorrect rejection of the null hypothesis
 maximum probability is set in advance as alpha
 is not affected by sample size as it is set in advance
 increases with the number of tests or end points (i.e. do 20 tests of
H0 and 1 is likely to be wrongly significant for alpha = 0.05)

Notes about Type II error:


 is the incorrect acceptance of the null hypothesis
 probability is beta
 beta depends upon sample size and alpha
 can't be estimated except as a function of the true population effect

 beta gets smaller as the sample size gets larger
 beta gets smaller as the number of tests or end points increases

8.1.3 General Procedure for Test of Hypothesis

1. From the problem context, identify the parameter of interest.

2. State the null hypothesis, H0.

3. Specify an appropriate alternative hypothesis, H1.

4. Choose a significance level, α.

5. Determine an appropriate test statistic.

6. State the rejection region for the statistic.

7. Compute any necessary sample quantities, substitute these into the


equation for the test statistic, and compute that value.

8. Decide whether or not H0 should be rejected and report that in the


problem context.

8.2 Test on the Mean of a Normal Distribution, Variance Known

Example:

Aircrew escape systems are powered by a solid propellant. The burning rate of
this propellant is an important product characteristic. Specifications require that the
mean burning rate must be 50 centimeters per second. We know that the standard
deviation of burning rate is σ = 2 centimeters per second. The experimenter decides to
specify a type I error probability or significance level of α = 0.05 and selects a random
sample of n = 25 and obtains a sample average burning rate of x̄ = 51.3 centimeters per
second. What conclusions should be drawn?

Solving the problem by following the eight-step procedure results:

1. The parameter of interest is µ, the mean burning rate.

2. Ho: µ = 50 centimeters per second

3. H1: µ ≠ 50 centimeters per second

4. α = 0.05

5. The test statistic is:

   z0 = (x̄ - µ0) / (σ / √n)

6. Reject H0 if z0 > 1.96 or if z0 < -1.96. Note that this results from step 4, where we
specified α = 0.05, and so the boundaries of the critical region are at z0.025 = 1.96 and
-z0.025 = -1.96.

7. Computations: Since x̄ = 51.3 and σ = 2,

   z0 = (51.3 - 50) / (2 / √25) = 3.25

8. Conclusion: Since z0 = 3.25 > 1.96, we reject H0: µ = 50 at the 0.05 level of
significance. Stated more completely, we conclude that the mean burning rate differs
from 50 centimeters per second, based on a sample of 25 measurements. In fact, there
is strong evidence that the mean burning rate exceeds 50 centimeters per second.
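The computation in steps 5-7 can be checked in a few lines (a sketch using only the standard library):

```python
from math import sqrt

# Burning-rate example: H0: mu = 50 vs. H1: mu != 50, sigma known.
xbar, mu0, sigma, n = 51.3, 50.0, 2.0, 25
z0 = (xbar - mu0) / (sigma / sqrt(n))

print(round(z0, 2))    # 3.25
print(abs(z0) > 1.96)  # True: z0 lies in the critical region, so reject H0
```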

8.3 Test on the Mean of a Normal Distribution, Variance Unknown

Example:

The increased availability of light materials with high strength has revolutionized
the design and manufacture of golf clubs, particularly drives. Clubs with hollow heads
and very thin faces can result in much longer tee shots, especially for players of modest
skills. This is due partly to the “spring-like effect” that the thin face imparts to the ball.
Firing a golf ball at the head of the club and measuring the ratio of the outgoing velocity
of the ball to the incoming velocity can quantify this spring-like effect. The ratio of
velocities is called the coefficient of restitution of the club. An experiment was performed
in which 15 drivers produced by a particular club maker were selected at random and
their coefficients of restitution measured. In the experiment the golf balls were fired from
air cannon so that the incoming velocity and spin rate of the ball could be precisely
controlled. It is of interest to determine if there is evidence (with α = 0.05) to support a
claim that the mean coefficient of restitution exceeds 0.82. The observations follow:

0.8411 0.8191 0.8182 0.8125 0.8750

0.8580 0.8532 0.8483 0.8276 0.7983

0.8042 0.8730 0.8282 0.8359 0.8660

The sample mean and sample standard deviation are x̄ = 0.83725 and s =
0.02456. The normal probability plot of the data supports the assumption that the
coefficient of restitution is normally distributed. Since the objective of the experiment is to
demonstrate that the mean coefficient of restitution exceeds 0.82, a one-sided
alternative hypothesis is appropriate.

Using the eight-step procedure for hypothesis:

1. The parameter of interest is the mean coefficient of restitution, µ.

2. H0: µ = 0.82

3. H1: µ > 0.82. We want to reject H0 if the mean coefficient of restitution exceeds 0.82.

4. α = 0.05

5. The test statistic is:

   t0 = (x̄ - µ0) / (s / √n)

6. Reject H0 if t0 > t0.05,14 = 1.761

[Figure: normal probability plot of the coefficient of restitution data.]

7. Computations: Since x̄ = 0.83725, s = 0.02456, µ0 = 0.82, and n = 15, we have

   t0 = (0.83725 - 0.82) / (0.02456 / √15) = 2.72

8. Conclusions: Since t0 = 2.72 > 1.761, we reject H0 and conclude at the 0.05 level of
significance that the mean coefficient of restitution exceeds 0.82.
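The t-statistic can be reproduced from the raw observations (a sketch using only the standard library; tiny differences from the rounded values in the text are rounding effects):

```python
from math import sqrt

# Coefficient-of-restitution measurements for the 15 drivers.
data = [0.8411, 0.8191, 0.8182, 0.8125, 0.8750,
        0.8580, 0.8532, 0.8483, 0.8276, 0.7983,
        0.8042, 0.8730, 0.8282, 0.8359, 0.8660]
mu0, n = 0.82, len(data)

xbar = sum(data) / n                                    # sample mean
s = sqrt(sum((x - xbar) ** 2 for x in data) / (n - 1))  # sample std. deviation
t0 = (xbar - mu0) / (s / sqrt(n))                       # test statistic

# One-sided test: reject H0 if t0 exceeds t(0.05, 14) = 1.761 (table value).
print(round(t0, 2))
print(t0 > 1.761)  # True: reject H0
```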

8.4 Test on the Variance and Standard Deviation of a Normal Distribution

Example:

An automatic filling machine is used to fill bottles with liquid detergent. A random
sample of 20 bottles results in a sample variance of fill volume of s 2 = 0.0153 (fluid
ounces)2. If the variance of fill volume exceeds 0.01 (fluid ounces) 2, an unacceptable
proportion of bottles will be underfilled or overfilled. Is there evidence in the sample data
to suggest that the manufacturer has a problem with underfilled or overfilled bottles? Use
α = 0.05, and assume that fill volume has a normal distribution.

Using the eight-step procedure:


1. The parameter of interest is the population variance σ².

2. H0: σ² = 0.01

3. H1: σ² > 0.01

4. α = 0.05

5. The test statistic is:

   χ0² = (n - 1)s² / σ0²

6. Reject H0 if χ0² > χ²0.05,19 = 30.14.

7. Computations:

   χ0² = 19(0.0153) / 0.01 = 29.07

8. Conclusions: Since χ0² = 29.07 < χ²0.05,19 = 30.14, we conclude that there is no strong
evidence that the variance of fill volume exceeds 0.01 (fluid ounces)².
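The variance test reduces to a one-line computation (a sketch; the 30.14 critical value is the standard chi-square table entry for 19 degrees of freedom at alpha = 0.05):

```python
# Fill-volume example: H0: sigma^2 = 0.01 vs. H1: sigma^2 > 0.01.
n, s2, sigma0_sq = 20, 0.0153, 0.01

chi2_0 = (n - 1) * s2 / sigma0_sq  # test statistic (n-1)s^2 / sigma0^2
chi2_crit = 30.14                  # chi-square table value for 19 df, alpha = 0.05

print(round(chi2_0, 2))            # 29.07
print(chi2_0 > chi2_crit)          # False: fail to reject H0
```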

8.5 Test on a Population Proportion

Example:

A semiconductor manufacturer produces controllers used in automobile engine


applications. The customer requires that the process fallout or fraction defective at a
critical manufacturing step not exceed 0.05 and that the manufacturer demonstrate
process capability at this level of quality using α = 0.05. The semiconductor
manufacturer takes a random sample of 200 devices and finds that four of them are
defective. Can the manufacturer demonstrate process capability for the customer?

Using the eight-step procedure:

1. The parameter of interest is the process fraction defective p.

2. H0: p = 0.05

3. H1: p < 0.05

This formulation of the problem will allow the manufacturer to make a strong
claim about process capability if the null hypothesis H 0: p = 0.05 is rejected.

4. α = 0.05

5. The test statistic is

   z0 = (x - np0) / √(np0(1 - p0))

where x = 4, n = 200, and p0 = 0.05.

6. Reject H0: p = 0.05 if z0 < -z0.05 = -1.645

7. Computations: The test statistic is

   z0 = (4 - 200(0.05)) / √(200(0.05)(0.95)) = -1.95

8. Conclusions: Since z0 = -1.95 < -z0.05 = -1.645, we reject H0 and conclude that the
process fraction defective p is less than 0.05. The P-value for this value of the test
statistic z0 is P = 0.0256, which is less than α = 0.05. We conclude that the process is
capable.
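Steps 5-8 can be checked numerically (a sketch, standard library only; the P value uses the same normal approximation as the text):

```python
from math import sqrt
from statistics import NormalDist

# Process-capability example: H0: p = 0.05 vs. H1: p < 0.05.
x, n, p0 = 4, 200, 0.05

z0 = (x - n * p0) / sqrt(n * p0 * (1 - p0))  # normal-approximation statistic
p_value = NormalDist().cdf(z0)               # lower-tail P value

print(round(z0, 2))    # -1.95
print(p_value < 0.05)  # True: reject H0, the process is capable
```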

References:

Majaski, C. (2019). Hypothesis Testing. [Available online] Retrieved from
https://www.investopedia.com/terms/h/hypothesistesting.asp

Statistics Solutions (2013). Hypothesis Testing. [Available online] Retrieved from
http://www.statisticssolutions.com/academic-solutions/resources/directory-of-statistical-analyses/hypothesis-testing/

Glen, S. (2019). Hypothesis Testing. [Available online] Retrieved from
https://www.statisticshowto.datasciencecentral.com/probability-and-statistics/hypothesis-testing/

Frost, J. (2019). One-Tailed and Two-Tailed Hypothesis Tests Explained. [Available
online] Retrieved from
https://statisticsbyjim.com/hypothesis-testing/one-tailed-two-tailed-hypothesis-tests/
