
Statistical Intervals for a Single Sample Confidence Interval on the Mean of a Normal Distribution, Variance Known Confidence Inte

HYPOTHESIS TESTING

SREYA V.V.
Department of Mathematics
BPIT, ROHINI


Overview
1 Statistical Intervals for a Single Sample
2 Confidence Interval on the Mean of a Normal Distribution,
Variance Known
3 Confidence Interval on the Mean of a Normal Distribution,
Variance Unknown
4 Confidence Interval on the Variance and Standard Deviation of
a Normal Distribution
5 Hypothesis testing
6 Types of errors
7 P-values in Hypothesis Tests
8 Connection between Hypothesis Tests and Confidence Intervals
9 Tests on the Mean of a Normal Distribution, Variance Known
10 Tests on the Mean of a Normal Distribution, Variance Unknown
11 Tests on the Variance and Standard Deviation of a Normal
Distribution
12 Goodness of fit
13 Simple Linear Regression and Correlation

Confidence interval
An interval estimate for a population parameter is called a
confidence interval. Information about the precision of estimation
is conveyed by the length of the interval. A short interval implies
precise estimation.

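Confidence Interval on the Mean of a Normal Distribution, Variance Known

If x̄ is the mean of a random sample of size n from a normal population with known variance σ², a 100(1 − α)% confidence interval on µ is
$\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n} \le \mu \le \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}$,
where $z_{\alpha/2}$ is the upper 100(α/2)% point of the standard normal distribution.

1 − α is called the confidence coefficient.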



Problem

A confidence interval estimate is desired for the gain in a circuit on a semiconductor device. Assume that gain is normally distributed with standard deviation σ = 20. Find a 95% CI for µ when n = 10 and x̄ = 1000.

Solution: Given n = 10, σ = 20 and x̄ = 1000.

95% CI ⇒ 100(1 − α)% = 95% ⇒ α = 0.05 ⇒ α/2 = 0.025.
$z_{\alpha/2}$ is the upper 100(α/2)% point, i.e., the 100 × 0.025% = 2.5% point.
Now, $P(0 \le Z \le z_{\alpha/2}) = 0.5 - 0.025 = 0.475$. From the standard normal table, we get $z_{\alpha/2} = z_{0.025} = 1.96$.
Hence, the 95% CI is $1000 - 1.96(20/\sqrt{10}) \le \mu \le 1000 + 1.96(20/\sqrt{10})$, that is, 987.6 ≤ µ ≤ 1012.4.
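As a numerical check of this problem, the following minimal Python sketch computes the same interval using only the standard library. The variable names are illustrative, and `NormalDist().inv_cdf` stands in for the table lookup of $z_{0.025}$.

```python
from math import sqrt
from statistics import NormalDist

# Values from the problem statement
n, sigma, xbar = 10, 20.0, 1000.0
alpha = 0.05

# Upper alpha/2 point of the standard normal distribution (about 1.96)
z = NormalDist().inv_cdf(1 - alpha / 2)

# Half-width of the two-sided interval: z * sigma / sqrt(n)
half_width = z * sigma / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width
print(f"z = {z:.3f}, CI = ({lower:.2f}, {upper:.2f})")
```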

Problem
A random sample has been taken from a normal distribution and
the following confidence intervals constructed using the same data:
(38.02, 61.98) and (39.95, 60.05)
1 What is the value of the sample mean?

2 One of these intervals is a 95% CI and the other is a 90% CI.

Which one is the 95% CI and why?


Solution:
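1 We have $\bar{X} - z_{\alpha/2}\,\sigma/\sqrt{n} \le \mu \le \bar{X} + z_{\alpha/2}\,\sigma/\sqrt{n}$.
Here, 38.02 ≤ µ ≤ 61.98.
Equating the left-hand sides of these two inequalities, we have
$\bar{X} - z_{\alpha/2}\,\sigma/\sqrt{n} = 38.02$, and equating the right-hand sides, we have
$\bar{X} + z_{\alpha/2}\,\sigma/\sqrt{n} = 61.98$.
Hence $2\bar{X} = 100 \Rightarrow \bar{X} = 50$.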
2 The 95% CI is (38.02, 61.98) and the 90% CI is (39.95,
60.05). The higher the confidence level, the wider the CI.

Sample Size for Specified Error on the Mean, Variance Known

If x̄ is used as an estimate of µ, we can be 100(1 − α)% confident that the error will not exceed a specified amount E when the sample size is
$n = \left(\dfrac{z_{\alpha/2}\,\sigma}{E}\right)^2$
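A short sketch of the sample-size formula, assuming the usual convention that a non-integer result is rounded up. The function name `sample_size` and the inputs σ = 20, E = 5 are hypothetical values chosen for illustration, not taken from the slides.

```python
from math import ceil
from statistics import NormalDist

def sample_size(sigma, error, confidence=0.95):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= error."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # (z * sigma / E)^2, rounded up to the next whole observation
    return ceil((z * sigma / error) ** 2)

# Hypothetical inputs: sigma = 20, allowed error E = 5, 95% confidence
n_required = sample_size(sigma=20, error=5)
print(n_required)
```

Halving the allowed error roughly quadruples the required sample size, since E enters the formula squared.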

One-Sided Confidence Bounds on the Mean, Variance Known

PROBLEM: Given $z_{\alpha} = z_{0.05} = 1.64$, n = 10, σ = 1, and x̄ = 64.46, find the lower one-sided 95% confidence bound on µ.
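A lower one-sided 100(1 − α)% bound replaces $z_{\alpha/2}$ by $z_{\alpha}$ and keeps only the lower limit, $\bar{x} - z_{\alpha}\,\sigma/\sqrt{n} \le \mu$. The sketch below evaluates it for this problem, once with the table value 1.64 given above and once with the value computed by the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, sigma, xbar = 10, 1.0, 64.46

z_table = 1.64                          # value given in the problem
z_exact = NormalDist().inv_cdf(0.95)    # about 1.645

# Lower one-sided 95% confidence bound on the mean
lower_table = xbar - z_table * sigma / sqrt(n)
lower_exact = xbar - z_exact * sigma / sqrt(n)
print(f"{lower_table:.3f} (table z), {lower_exact:.3f} (computed z)")
```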

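Large-Sample Confidence Interval for µ

When n is large, replacing σ by the sample standard deviation S has little effect on the distribution of Z. This leads to the following useful result: $Z = (\bar{X} - \mu)/(S/\sqrt{n})$ has an approximate standard normal distribution, so
$\bar{x} - z_{\alpha/2}\, s/\sqrt{n} \le \mu \le \bar{x} + z_{\alpha/2}\, s/\sqrt{n}$
is a large-sample confidence interval for µ with confidence level of approximately 100(1 − α)%.

Generally, n should be at least 40 to use this result reliably. The central limit theorem generally holds for n ≥ 30, but the larger sample size is recommended here.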

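Confidence Interval on the Mean of a Normal Distribution, Variance Unknown

t confidence interval on µ
If x̄ and s are the mean and standard deviation of a random sample of size n from a normal distribution with unknown variance σ², a 100(1 − α)% confidence interval on µ is
$\bar{x} - t_{\alpha/2,n-1}\, s/\sqrt{n} \le \mu \le \bar{x} + t_{\alpha/2,n-1}\, s/\sqrt{n}$,
where $t_{\alpha/2,n-1}$ is the upper 100(α/2)% point of the t distribution with n − 1 degrees of freedom.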

PROBLEM: A research engineer for a tire manufacturer is investigating tire life for a new rubber compound and has built 16 tires and tested them to end-of-life in a road test. The sample mean and standard deviation are 60,139.7 and 3645.94 kilometers. Find a 95% confidence interval on mean tire life. (Given $t_{0.025,15} = 2.131$)
Solution: $\bar{x} \pm t_{0.025,15}\, s/\sqrt{n} = 60139.7 \pm 2.131(3645.94)/\sqrt{16}$, so 58,197.33 ≤ µ ≤ 62,082.07.
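The arithmetic for the tire-life interval can be sketched as follows. The critical value is the one given in the problem (the Python standard library has no t quantile function), and the variable names are illustrative.

```python
from math import sqrt

# Values from the problem statement
n, xbar, s = 16, 60139.7, 3645.94
t_crit = 2.131   # t_{0.025,15}, as given in the problem

# Half-width of the two-sided t interval: t * s / sqrt(n)
half_width = t_crit * s / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width
print(f"{lower:.2f} <= mu <= {upper:.2f}")
```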

Confidence Interval on the Variance and Standard Deviation of a Normal Distribution

Problem

A rivet is to be inserted into a hole. A random sample of n = 15 parts is selected, and the hole diameter is measured. The sample standard deviation of the hole diameter measurements is s = 0.008 millimeters. Construct a 99% lower confidence bound for σ².
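For a normal population, a 100(1 − α)% lower confidence bound on the variance is $(n-1)s^2/\chi^2_{\alpha,n-1} \le \sigma^2$. The sketch below applies it to this problem. The critical value 29.14 is an assumption taken from a standard chi-square table (it is not given in the slides), and the variable names are illustrative.

```python
from math import sqrt

# Values from the problem statement
n, s = 15, 0.008

# Assumption: upper 1% point of chi-square with 14 degrees of freedom,
# read from a standard table (not supplied in the slides)
chi2_crit = 29.14

# Lower 99% confidence bound on the variance, and on the standard deviation
var_lower = (n - 1) * s**2 / chi2_crit
sd_lower = sqrt(var_lower)
print(f"sigma^2 >= {var_lower:.3e} mm^2, sigma >= {sd_lower:.5f} mm")
```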

Hypothesis testing

Statistical Hypothesis
A statistical hypothesis is a statement about the parameters of one
or more populations.

Null Hypothesis H0
A null hypothesis is a claim, usually a statement of equality, about a certain
parameter of the population.

Alternative Hypothesis H1
An alternative hypothesis is a statement that contradicts the null hypothesis.

Example
A machine was producing chocolate bars with an average weight of 100 g.
After maintenance, a worker claims that the machine no longer
produces chocolate bars of 100 g.
Here,

H0 : µ = 100 g
H1 : µ ≠ 100 g

Because the alternative hypothesis specifies values of µ that could be either greater or less than 100 g, it is called a two-sided alternative hypothesis. In some situations, we may wish to formulate a one-sided alternative hypothesis, such as H1 : µ < 100 g or H1 : µ > 100 g.

Test of a hypothesis
A procedure leading to a decision about the null hypothesis is
called a test of a hypothesis.

The sample mean can take on many different values. For example,
if 98.5 ≤ x̄ ≤ 101.5, we will not reject the null hypothesis
H0 : µ = 100, and if either x̄ < 98.5 or x̄ > 101.5, we will reject the
null hypothesis in favor of the alternative hypothesis H1 : µ ≠ 100.
The values of x̄ that are less than 98.5 or greater than 101.5
constitute the critical region for the test; all values that are in the
interval 98.5 ≤ x̄ ≤ 101.5 form a region for which we will fail to
reject the null hypothesis. By convention, this is usually called the
acceptance region. The boundaries between the critical and
acceptance regions are called the critical values.

Types of errors

Type I Error
Rejecting the null hypothesis H0 when it is true is defined as a type
I error.

Type II Error
Failing to reject (Accept) the null hypothesis when it is false is
defined as a type II error.

Probability of Type I Error (α)

α = P(type I error) = P(reject H0 when H0 is true)

Sometimes the type I error probability is called the significance level, the α-error, or the size of the test.

Probability of Type II Error (β)
β = P(type II error) = P(fail to reject H0 when H0 is false)

An increase in sample size reduces β when α is held constant, and generally reduces both α and β when the critical values are held constant.



Calculating Type I Error

Let X̄ be the sample mean of a sample of size n.
Let the acceptance region be x1 ≤ X̄ ≤ x2.
Standardize using $z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$, with µ set to its value under H0.
Let $z_l$ and $z_r$ be the corresponding critical values after standardization.
The area beyond $z_l$ and $z_r$, that is, $P(Z < z_l) + P(Z > z_r)$, is the probability of type I error.
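The steps above can be sketched in Python as follows. The function name `type_one_error` is hypothetical, and the inputs (acceptance region 98.5 ≤ X̄ ≤ 101.5, µ0 = 100, σ = 2, n = 9) are illustrative assumptions rather than a worked problem from the slides.

```python
from math import sqrt
from statistics import NormalDist

def type_one_error(x1, x2, mu0, sigma, n):
    """alpha for the acceptance region x1 <= Xbar <= x2 when H0: mu = mu0 is true."""
    se = sigma / sqrt(n)          # standard error of the sample mean
    z_l = (x1 - mu0) / se         # standardized lower critical value
    z_r = (x2 - mu0) / se         # standardized upper critical value
    std_normal = NormalDist()
    # Area below z_l plus area above z_r
    return std_normal.cdf(z_l) + (1 - std_normal.cdf(z_r))

# Illustrative (assumed) inputs
alpha = type_one_error(98.5, 101.5, mu0=100, sigma=2, n=9)
print(f"alpha = {alpha:.4f}")
```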

Example

Solution: Given n = 9, σ = 2
H0 : µ = 100
H1 : µ ≠ 100

a) α = P(type I error) = P(reject H0 | H0 is true)

b) β = P(type II error) = P(accept H0 | H0 is false)



Power of a statistical test


The power of a statistical test is the probability of rejecting the null hypothesis H0 when the alternative hypothesis is true.

The power is computed as 1 − β, and it can be interpreted as the probability of correctly rejecting a false null hypothesis.

P-values in Hypothesis Tests

P-value
The P-value is the smallest level of significance that would lead to
rejection of the null hypothesis H0 with the given data.

Example

Consider the two-sided hypothesis test for burning rate

H0 : µ = 50, H1 : µ ≠ 50

with n = 16 and σ = 2.5. Suppose that the observed sample mean is x̄ = 51.3 centimeters per second.
Consider the acceptance region 48.7 ≤ x̄ ≤ 51.3.

The P-value is the area of the shaded region, that is, the two tails x̄ < 48.7 and x̄ > 51.3 of the distribution of X̄ under H0, when x̄ = 51.3. Here $z_0 = (51.3 - 50)/(2.5/\sqrt{16}) = 2.08$, so $P = 2[1 - \Phi(2.08)] = 0.038$.

The null hypothesis H0 : µ = 50 would be rejected at any level of significance greater than or equal to 0.038.
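A minimal sketch of the two-sided P-value for this burning-rate example, using the standard library's normal distribution in place of the table. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values from the example
mu0, sigma, n, xbar = 50.0, 2.5, 16, 51.3

# Standardized test statistic
z0 = (xbar - mu0) / (sigma / sqrt(n))

# Two-sided P-value: area in both tails beyond |z0|
p_value = 2 * (1 - NormalDist().cdf(abs(z0)))
print(f"z0 = {z0:.2f}, P-value = {p_value:.3f}")
```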

Connection between Hypothesis Tests and Confidence Intervals

If [l, u] is a 100(1 − α)% confidence interval for the parameter θ, the test with level of significance α of the hypothesis

H0 : θ = θ0, H1 : θ ≠ θ0

leads to rejection of H0 if and only if θ0 is not in the interval [l, u].
For the problem with x̄ = 51.3, σ = 2.5 and n = 16, the 95% CI for µ is $51.3 \pm 1.96(2.5/\sqrt{16})$, that is, 50.075 ≤ µ ≤ 52.525. Hence µ = 50 lies
outside the CI, and we reject H0 at α = 0.05.

General Procedure for Hypothesis Tests



Tests on the Mean of a Normal Distribution, Variance Known

Problem

Air crew escape systems are powered by a solid propellant. The burning rate of this propellant is an important product characteristic. Specifications require that the mean burning rate must be 50 centimeters per second. We know that the standard deviation of burning rate is σ = 2 centimeters per second. The experimenter decides to specify a type I error probability or significance level of α = 0.05, selects a random sample of n = 25, and obtains a sample average burning rate of x̄ = 51.3 centimeters per second. What conclusions should be drawn?
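With σ known, the test statistic is $z_0 = (\bar{x} - \mu_0)/(\sigma/\sqrt{n})$, and a two-sided test at level α rejects H0 : µ = 50 when $|z_0| > z_{\alpha/2}$. The sketch below works through this problem under that standard setup; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values from the problem statement
mu0, sigma, n, xbar, alpha = 50.0, 2.0, 25, 51.3, 0.05

std_normal = NormalDist()
z0 = (xbar - mu0) / (sigma / sqrt(n))        # observed test statistic
z_crit = std_normal.inv_cdf(1 - alpha / 2)   # about 1.96

reject_h0 = abs(z0) > z_crit
p_value = 2 * (1 - std_normal.cdf(abs(z0)))
print(f"z0 = {z0:.2f}, critical value = {z_crit:.2f}, reject H0: {reject_h0}")
print(f"P-value = {p_value:.4f}")
```

Since $z_0$ exceeds the critical value, the data give strong evidence that the mean burning rate differs from 50 centimeters per second.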

Probability of Type II Error


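Failing to reject the null hypothesis when it is false is defined as a type II error.
Suppose H0 : µ = µ0, H1 : µ ≠ µ0.
Suppose the null hypothesis is false and the true value of the mean is µ = µ0 + δ, with δ ≠ 0.
Then the probability of type II error for the two-sided test with known σ is
$\beta = \Phi\!\left(z_{\alpha/2} - \delta\sqrt{n}/\sigma\right) - \Phi\!\left(-z_{\alpha/2} - \delta\sqrt{n}/\sigma\right)$,
where Φ is the standard normal cumulative distribution function.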

Problem

Suppose that the true burning rate of a rocket propellant is 49 centimetres per second. Specifications require that the mean burning rate must be 50 centimetres per second. What is β for the two-sided test with α = 0.05, σ = 2, and n = 25?
Here, µ = 49 and µ0 = 50. Hence δ = µ − µ0 = −1 (because the two-sided test is symmetric, β is the same as for δ = 1), and $z_{\alpha/2} = 1.96$.
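For a two-sided z test, β can be evaluated from $\beta = \Phi(z_{\alpha/2} - \delta\sqrt{n}/\sigma) - \Phi(-z_{\alpha/2} - \delta\sqrt{n}/\sigma)$. A minimal sketch for this problem follows; the function name `type_two_error` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def type_two_error(delta, sigma, n, alpha=0.05):
    """beta for a two-sided z test when the true mean is mu0 + delta."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2)
    shift = delta * sqrt(n) / sigma   # standardized distance from mu0
    return std_normal.cdf(z - shift) - std_normal.cdf(-z - shift)

beta = type_two_error(delta=-1.0, sigma=2.0, n=25)
power = 1 - beta
print(f"beta = {beta:.3f}, power = {power:.3f}")
```

The test therefore fails to detect a shift of 1 centimetre per second roughly 30% of the time with this sample size.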

Problem

The heat evolved in calories per gram of a cement mixture is approximately normally distributed. The mean is thought to be 100, and the standard deviation is 2. You wish to test H0 : µ = 100 versus H1 : µ ≠ 100 with a sample of n = 9 specimens. Calculate the P-value if the observed statistic is x̄ = 98.
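A short sketch of this P-value calculation, using the two-sided rule $P = 2[1 - \Phi(|z_0|)]$. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values from the problem statement
mu0, sigma, n, xbar = 100.0, 2.0, 9, 98.0

z0 = (xbar - mu0) / (sigma / sqrt(n))          # equals -3
p_value = 2 * (1 - NormalDist().cdf(abs(z0)))  # two-sided P-value
print(f"z0 = {z0:.2f}, P-value = {p_value:.4f}")
```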

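Tests on the Mean of a Normal Distribution, Variance Unknown

To test H0 : µ = µ0 when σ² is unknown, the test statistic is $T_0 = (\bar{X} - \mu_0)/(S/\sqrt{n})$. If the null hypothesis is true, T0 has a t distribution with n − 1 degrees of freedom. When we know the distribution of the test statistic when H0 is true (this is often called the reference distribution or the null distribution), we can calculate the P-value from this distribution or locate the critical region for a fixed significance level.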

P-value for two-sided test

To test H0 : µ = µ0 against the two-sided alternative H1 : µ ≠ µ0, the value of the test statistic t0 is calculated.
The P-value is found from the t distribution with n − 1 degrees of freedom (denoted by $T_{n-1}$).
Because the test is two-tailed, the P-value is the sum of the probabilities in the two tails of the t distribution:
$P = 2P(T_{n-1} > |t_0|)$
where t0 is the observed value of the test statistic.
Reject H0 if $t_0 > t_{\alpha/2,n-1}$ or $t_0 < -t_{\alpha/2,n-1}$ for a fixed significance level α.

P-value for one-tailed test


For the one-sided alternative hypotheses,
H0 : µ = µ0, H1 : µ > µ0
$P = P(T_{n-1} > t_0)$,
Reject H0 if $t_0 > t_{\alpha,n-1}$.

For H0 : µ = µ0, H1 : µ < µ0
$P = P(T_{n-1} < t_0)$,
where t0 is the observed value of the test statistic. Reject H0 if $t_0 < -t_{\alpha,n-1}$.

Problem
Given body temperatures of 25 females: 97.8, 97.2, 97.4, 97.6,
97.8, 97.9, 98.0, 98.0, 98.0, 98.1, 98.2, 98.3, 98.3, 98.4, 98.4,
98.4, 98.5, 98.6, 98.6, 98.7, 98.8, 98.8, 98.9, 98.9, and 99.0.
Test the hypothesis H0 : µ = 98.6 versus H1 : µ ≠ 98.6, using
α = 0.05. Find the P-value.
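The test can be sketched with the standard library alone. Because Python's standard library has no t distribution, the tail probability below is obtained by integrating the t density numerically with Simpson's rule; the helper names `t_pdf` and `t_upper_tail` are hypothetical, and the critical value $t_{0.025,24} = 2.064$ is an assumption taken from a standard t table.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev

temps = [97.8, 97.2, 97.4, 97.6, 97.8, 97.9, 98.0, 98.0, 98.0, 98.1,
         98.2, 98.3, 98.3, 98.4, 98.4, 98.4, 98.5, 98.6, 98.6, 98.7,
         98.8, 98.8, 98.9, 98.9, 99.0]
mu0 = 98.6
n = len(temps)
xbar, s = mean(temps), stdev(temps)      # sample mean and standard deviation
t0 = (xbar - mu0) / (s / sqrt(n))        # observed test statistic

def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T_df > t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

p_value = 2 * t_upper_tail(abs(t0), n - 1)   # two-sided P-value
t_crit = 2.064                               # assumed table value t_{0.025,24}
reject_h0 = abs(t0) > t_crit
print(f"xbar = {xbar:.3f}, s = {s:.4f}, t0 = {t0:.3f}")
print(f"P-value = {p_value:.4f}, reject H0: {reject_h0}")
```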

Tests on the Variance and Standard Deviation of a Normal Distribution

An automated filling machine is used to fill bottles with liquid detergent. A random sample of 20 bottles results in a sample variance of fill volume of s² = 0.0153 (fluid ounces)². If the variance of fill volume exceeds 0.01 (fluid ounces)², an unacceptable proportion of bottles will be underfilled or overfilled. Is there evidence in the sample data to suggest that the manufacturer has a problem with underfilled or overfilled bottles? Use α = 0.05, and assume that fill volume has a normal distribution.
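This is a one-sided test of H0 : σ² = 0.01 against H1 : σ² > 0.01 with test statistic $\chi_0^2 = (n-1)s^2/\sigma_0^2$, which has a chi-square distribution with n − 1 degrees of freedom under H0. The sketch below carries out the fixed-level test; the critical value 30.14 is an assumption read from a standard chi-square table, and the variable names are illustrative.

```python
# Values from the problem statement
n, s2, sigma0_sq = 20, 0.0153, 0.01

# Test statistic: (n - 1) s^2 / sigma_0^2
chi2_0 = (n - 1) * s2 / sigma0_sq

# Assumption: upper 5% point of chi-square with 19 degrees of freedom (table value)
chi2_crit = 30.14

reject_h0 = chi2_0 > chi2_crit
print(f"chi2_0 = {chi2_0:.2f}, critical value = {chi2_crit}, reject H0: {reject_h0}")
```

Because the statistic falls just below the critical value, the sample does not provide strong evidence at α = 0.05 that the variance exceeds 0.01 (fluid ounces)².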

Testing for Goodness of fit

It is used when the population distribution is unknown.
The test procedure requires a random sample of size n from the population whose probability distribution is unknown.
These n observations are arranged in a frequency histogram having k bins or class intervals.
Let $O_i$ be the observed frequency in the ith class interval.
From the hypothesized distribution, compute the expected frequency in the ith class interval, denoted $E_i$.
The test statistic is $\chi_0^2 = \sum_{i=1}^{k} (O_i - E_i)^2 / E_i$.

If the population follows the hypothesized distribution, $\chi_0^2$ has, approximately, a chi-square distribution with k − p − 1 degrees of freedom, where p represents the number of parameters of the hypothesized distribution estimated by sample statistics.
For a fixed-level test, we would reject the hypothesis that the distribution of the population is the hypothesized distribution if the calculated value of the test statistic satisfies $\chi_0^2 > \chi^2_{\alpha,k-p-1}$.
P-value = $P(\chi^2_{k-p-1} > \chi_0^2)$

Problem

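Test the goodness of fit against a Poisson distribution for the number of defects on n = 60 boards. The observed frequencies, as used in the computation below, are 32 boards with 0 defects, 15 with 1, 9 with 2, and 4 with 3.

The estimate of the mean number of defects per board is the sample average, that is,
(32 × 0 + 15 × 1 + 9 × 2 + 4 × 3)/60 = 0.75.
Hence the estimated parameter of the Poisson distribution is λ = 0.75.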

Because each class interval corresponds to a particular number of defects, we may find the $p_i$ from the Poisson probability function with λ = 0.75: $P(X = x) = e^{-0.75}(0.75)^x / x!$.

The expected frequencies are computed by multiplying the sample size n = 60 by the probabilities $p_i$. That is, $E_i = n \times p_i$.
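The whole goodness-of-fit computation can be sketched as below, using the frequencies 32, 15, 9, 4 that appear in the estimate of the mean. Two choices are assumptions that the surviving text does not state: the last class is treated as "3 or more defects", and the last two classes are pooled so that no expected frequency is very small. The critical value 3.84 is an assumed table value for chi-square with 1 degree of freedom.

```python
from math import exp, factorial

observed = [32, 15, 9, 4]     # boards with 0, 1, 2, 3 defects
n = sum(observed)             # 60 boards
lam = sum(i * o for i, o in enumerate(observed)) / n   # estimated mean, 0.75

# Poisson probabilities for 0, 1, 2 defects; last class assumed to be "3 or more"
probs = [exp(-lam) * lam**i / factorial(i) for i in range(3)]
probs.append(1 - sum(probs))
expected = [n * p for p in probs]

# Assumption: pool the last two classes to avoid a very small expected count
obs_pooled = observed[:2] + [observed[2] + observed[3]]
exp_pooled = expected[:2] + [expected[2] + expected[3]]

chi2_0 = sum((o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled))
df = len(obs_pooled) - 1 - 1  # k - p - 1, with p = 1 estimated parameter
chi2_crit = 3.84              # assumed table value: upper 5% point, 1 df
reject_h0 = chi2_0 > chi2_crit
print(f"chi2_0 = {chi2_0:.2f}, df = {df}, reject H0: {reject_h0}")
```

Under these assumptions the statistic stays below the critical value, so the Poisson hypothesis is not rejected at α = 0.05.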

Simple Linear Regression and Correlation

Regression Analysis
The collection of statistical tools that are used to model and
explore relationships between variables that are related in a
nondeterministic manner is called regression analysis.

It is probably reasonable to assume that the mean of the random variable Y is related to x by the following straight-line relationship:

$E(Y \mid x) = \mu_{Y|x} = \beta_0 + \beta_1 x$

β0 and β1 are called regression coefficients.


Although the mean of Y is a linear function of x, the actual observed value y does not fall exactly on a straight line. The appropriate way to generalize this to a probabilistic linear model is to assume that the expected value of Y is a linear function of x but that for a fixed value of x, the actual value of Y is determined by the mean value function (the linear model) plus a random error term, say

Y = β0 + β1 x + ϵ

where ϵ is a random error with mean zero and (unknown) variance σ². We call this model the simple linear regression model because it has only one independent variable or regressor.

Suppose that we have n pairs of observations (x1, y1), (x2, y2), ..., (xn, yn). The estimates of β0 and β1 should result in a line that is (in some sense) a “best fit” to the data. The estimates are chosen to minimize the sum of squared vertical deviations of the observations from the line; this criterion is called the method of least squares.
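Minimizing the sum of squared deviations gives the standard estimates $\hat\beta_1 = S_{xy}/S_{xx}$ and $\hat\beta_0 = \bar{y} - \hat\beta_1\bar{x}$, where $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$ and $S_{xx} = \sum (x_i - \bar{x})^2$. The sketch below implements these formulas with exact fractions. The function name `least_squares_line` is hypothetical, and the data are illustrative values assumed for this example, not data taken from the slides.

```python
from fractions import Fraction

def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line as exact fractions."""
    n = len(xs)
    x_mean = Fraction(sum(xs), n)
    y_mean = Fraction(sum(ys), n)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    return intercept, slope

# Illustrative (assumed) data
xs = [1, 3, 4, 6, 8, 9, 11, 14]
ys = [1, 2, 4, 4, 5, 7, 8, 9]
b0, b1 = least_squares_line(xs, ys)
print(f"y = {b0} + {b1} x")
```

For these values the fitted line has intercept 6/11 and slope 7/11. Exact fractions are used only to keep the result readable; floating-point arithmetic works the same way.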

Problem

Fit a least-squares line to the data



The required least-squares line is $y = \frac{6}{11} + \frac{7}{11}x$.

Problem

Fit a least-squares line to the data



The required least-squares line is y = 35.82 + 0.476x.
