CHAPTER 4 : Decision Making for a Single Sample

POINT ESTIMATE & HYPOTHESIS TESTING

• A point estimate of some population parameter $\theta$ is a single numerical value $\hat{\theta}$
of a statistic $\hat{\Theta}$.

• The point estimator $\hat{\Theta}$ is an unbiased estimator for the parameter $\theta$ if
$$E(\hat{\Theta}) = \theta$$
If the estimator is not unbiased, then the difference
$$E(\hat{\Theta}) - \theta$$
is called the bias of the estimator $\hat{\Theta}$.
• If we consider all unbiased estimators of $\theta$, the one with the smallest variance is called the
minimum variance unbiased estimator (MVUE).
• The mean square error of an estimator $\hat{\Theta}$ of the parameter $\theta$ is defined as
$$\mathrm{MSE}(\hat{\Theta}) = E(\hat{\Theta} - \theta)^2$$
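As a worked step (a standard identity that these notes do not state), adding and subtracting $E(\hat{\Theta})$ inside the square suggests how the mean square error splits into the variance of the estimator plus its squared bias. The cross term drops out because $E[\hat{\Theta} - E(\hat{\Theta})] = 0$. The symbol $V(\cdot)$ for variance is an assumption of this sketch.

```latex
\begin{aligned}
\mathrm{MSE}(\hat{\Theta})
  &= E\big[\hat{\Theta} - E(\hat{\Theta}) + E(\hat{\Theta}) - \theta\big]^2 \\
  &= E\big[\hat{\Theta} - E(\hat{\Theta})\big]^2 + \big[E(\hat{\Theta}) - \theta\big]^2 \\
  &= V(\hat{\Theta}) + (\text{bias})^2
\end{aligned}
```

For an unbiased estimator the bias is zero, so the mean square error reduces to the variance.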
• The standard error of a statistic is the standard deviation of its sampling distribution. If the
standard error involves unknown parameters whose values can be estimated, substitution of
these estimates into the standard error results in an estimated standard error. For the sample
mean, the estimated standard error and the standardized statistic are
$$\hat{\sigma}_{\bar{x}} = \frac{S}{\sqrt{n}}, \qquad Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

• A statistical hypothesis is a statement about the parameters of one or more populations.

• Rejecting the null hypothesis $H_0$ when it is true is defined as a type I error.

• Failing to reject the null hypothesis when it is false is defined as a type II error.
$$\alpha = P(\text{type I error}) = P(\text{reject } H_0 \text{ when } H_0 \text{ is true})$$
$$\beta = P(\text{type II error}) = P(\text{fail to reject } H_0 \text{ when } H_0 \text{ is false})$$
• The values of $\bar{x}$ that are less than 48.5 or greater than 51.5 form the critical region.
• A widely used procedure in hypothesis testing is to use a type I error or significance level of
$\alpha = 0.05$. This value has evolved through experience and may not be appropriate for all
situations.
• The power of a statistical test is the probability of rejecting the null hypothesis $H_0$ when the
alternative hypothesis is true.
• The P-value is the smallest level of significance that would lead to rejection of the null
hypothesis $H_0$.
• The P-value is not the probability that the null hypothesis is false, nor is $1 - P$ the probability
that the null hypothesis is true. The null hypothesis is either true or false
(there is no probability associated with this), and so the proper interpretation of the P-value is
in terms of the risk of wrongly rejecting $H_0$.
• In formulating one-sided alternative hypotheses, we should remember that rejecting $H_0$ is
always a strong conclusion. Consequently, we should put the statement about which it is
important to make a strong conclusion in the alternative hypothesis. In real-world
problems, this will often depend on our point of view and experience with the situation.
• General procedure :
- Parameter of interest : From the problem context, identify the parameter of interest.
- Null hypothesis, $H_0$ : State the null hypothesis, $H_0$.
- Alternative hypothesis, $H_1$ : Specify an appropriate alternative hypothesis, $H_1$.
- Test statistic : State an appropriate test statistic.
- Reject $H_0$ if : Define the criteria that will lead to rejection of $H_0$.
- Computations : Compute any necessary sample quantities, substitute these into the
equation for the test statistic, and compute that value.
- Conclusions : Decide whether or not $H_0$ should be rejected and report that in the problem
context. This could involve computing a P-value or comparing the test statistic to a set of
critical values.
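The steps of the general procedure can be sketched in code. The following is a minimal illustration, assuming a two-sided test on a mean with known standard deviation and the usual statistic $Z_0 = (\bar{x} - \mu_0)/(\sigma/\sqrt{n})$. The function name `z_test_two_sided` and all numbers are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_sided(xbar, mu0, sigma, n, alpha=0.05):
    """Test H0: mu = mu0 against H1: mu != mu0 when sigma is known."""
    std_normal = NormalDist()  # standard normal distribution N(0, 1)
    # Test statistic: Z0 = (xbar - mu0) / (sigma / sqrt(n))
    z0 = (xbar - mu0) / (sigma / sqrt(n))
    # Rejection criterion: reject H0 if |Z0| > z_{alpha/2}
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Two-sided P-value: smallest significance level that would reject H0
    p_value = 2 * (1 - std_normal.cdf(abs(z0)))
    return z0, z_crit, p_value, abs(z0) > z_crit


# Hypothetical sample summary (not taken from the notes)
z0, z_crit, p_value, reject = z_test_two_sided(xbar=51.3, mu0=50, sigma=2, n=25)
print(f"z0 = {z0:.2f}, critical value = {z_crit:.2f}, P-value = {p_value:.4f}")
print("Reject H0" if reject else "Fail to reject H0")
```

Reporting both the critical-value comparison and the P-value mirrors the last step of the procedure, where either can be used to reach the conclusion.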
INFERENCE ON THE MEAN OF A POPULATION, VARIANCE KNOWN
• $X_1, X_2, \ldots, X_n$ is a random sample of size $n$ from a population.
The population is normally distributed, or if it is not, the conditions of the central limit theorem
apply.
• Under the previous assumptions, the quantity
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
has a standard normal distribution, $N(0,1)$.
• Testing hypotheses on the mean, variance known (z-test)

• Probability of a type II error for the two-sided alternative hypothesis on the mean, variance
known
• Sample size for two-sided alternative hypothesis on the mean, variance known
For the two-sided alternative hypothesis on the mean with variance known and significance
level $\alpha$, the sample size required to detect a difference between the true and hypothesized mean
of $\delta$ with power at least $1 - \beta$ is
$$n \cong \frac{(z_{\alpha/2} + z_{\beta})^2 \sigma^2}{\delta^2}$$
where
$$\delta = \mu - \mu_0$$
If n is not an integer, the convention is to always round the sample size up to the next integer.
• Sample size for one-sided alternative hypothesis on the mean, variance known
$$n \cong \frac{(z_{\alpha} + z_{\beta})^2 \sigma^2}{\delta^2}$$
where
$$\delta = \mu - \mu_0$$
If n is not an integer, the convention is to round the sample size up to the next integer.
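The two sample-size formulas differ only in the percentage point used for $\alpha$. A small sketch, assuming the values of $\delta$, $\sigma$, $\alpha$ and $\beta$ are supplied by the analyst; the function name `sample_size_mean` and the example numbers are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_mean(delta, sigma, alpha=0.05, beta=0.10, two_sided=True):
    """Sample size to detect a shift delta with power 1 - beta (sigma known)."""
    z = NormalDist().inv_cdf  # z(1 - a) is the upper 100a percentage point
    # Two-sided test uses z_{alpha/2}; one-sided test uses z_{alpha}
    z_alpha = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    z_beta = z(1 - beta)
    n = (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2
    return ceil(n)  # round up to the next integer, as the notes recommend


# Hypothetical inputs: detect a shift of 1 unit when sigma = 2
n_two_sided = sample_size_mean(delta=1, sigma=2)
n_one_sided = sample_size_mean(delta=1, sigma=2, two_sided=False)
print(n_two_sided, n_one_sided)
```

A one-sided alternative needs fewer observations for the same $\alpha$ and $\beta$ because $z_{\alpha} < z_{\alpha/2}$.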
• Confidence interval on the mean, variance known
If $\bar{x}$ is the sample mean of a random sample of size n from a population with known variance
$\sigma^2$, a $100(1 - \alpha)\%$ confidence interval on $\mu$ is given by
$$\bar{x} - \frac{z_{\alpha/2}\,\sigma}{\sqrt{n}} \le \mu \le \bar{x} + \frac{z_{\alpha/2}\,\sigma}{\sqrt{n}}$$
where $z_{\alpha/2}$ is the upper $100\alpha/2$ percentage point and $-z_{\alpha/2}$ is the lower $100\alpha/2$ percentage
point of the standard normal distribution in Appendix A Table I.
• Sample size for a specified E on the mean, variance known
If $\bar{x}$ is used as an estimate of $\mu$, we can be $100(1 - \alpha)\%$ confident that the error $|\bar{x} - \mu|$ will
not exceed a specified amount E when the sample size is
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$$
• One-sided confidence bounds on the mean, variance known
The $100(1 - \alpha)\%$ upper-confidence bound for $\mu$ is
$$\mu \le u = \bar{x} + z_{\alpha}\,\sigma/\sqrt{n}$$
and the $100(1 - \alpha)\%$ lower-confidence bound for $\mu$ is
$$\bar{x} - z_{\alpha}\,\sigma/\sqrt{n} = l \le \mu$$
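The interval, the one-sided bounds, and the sample size for a specified error E in this section can be computed as follows. This is only a sketch: the function names and the sample summary are hypothetical, and the non-integer sample size is rounded up by assumption, matching the convention used for the hypothesis-test sample sizes.

```python
from math import ceil, sqrt
from statistics import NormalDist

z = NormalDist().inv_cdf  # z(1 - a) is the upper 100a percentage point


def mean_ci_known_sigma(xbar, sigma, n, alpha=0.05):
    """Two-sided 100(1 - alpha)% confidence interval on mu."""
    half_width = z(1 - alpha / 2) * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


def mean_bounds_known_sigma(xbar, sigma, n, alpha=0.05):
    """One-sided bounds (l, u); each is a separate 100(1 - alpha)% bound."""
    margin = z(1 - alpha) * sigma / sqrt(n)
    return xbar - margin, xbar + margin


def sample_size_for_error(E, sigma, alpha=0.05):
    """Sample size so that |xbar - mu| <= E with 100(1 - alpha)% confidence."""
    return ceil((z(1 - alpha / 2) * sigma / E) ** 2)


# Hypothetical sample summary
lo, hi = mean_ci_known_sigma(xbar=64.46, sigma=1, n=10)
l, u = mean_bounds_known_sigma(xbar=64.46, sigma=1, n=10)
n_needed = sample_size_for_error(E=0.5, sigma=1)
print(f"95% CI: {lo:.2f} to {hi:.2f}; bounds: l = {l:.2f}, u = {u:.2f}; n = {n_needed}")
```

The one-sided bounds use $z_{\alpha}$ rather than $z_{\alpha/2}$, so each bound sits closer to $\bar{x}$ than the corresponding end of the two-sided interval.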
INFERENCE ON THE MEAN OF A POPULATION, VARIANCE UNKNOWN
• Test statistic
$$T_0 = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$$
• Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal distribution with unknown mean $\mu$ and
unknown variance $\sigma^2$. The quantity
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
has a t distribution with n-1 degrees of freedom.
• Testing hypotheses on the mean of a normal distribution, variance unknown.

• Confidence interval on the mean of a normal distribution, variance unknown


If $\bar{x}$ and s are the mean and standard deviation of a random sample from a normal distribution
with unknown variance $\sigma^2$, a $100(1 - \alpha)\%$ CI on $\mu$ is given by
$$\bar{x} - t_{\alpha/2,n-1}\, s/\sqrt{n} \le \mu \le \bar{x} + t_{\alpha/2,n-1}\, s/\sqrt{n}$$
where $t_{\alpha/2,n-1}$ is the upper $100\alpha/2$ percentage point of the t distribution with n-1 degrees of
freedom.
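A short sketch of the t statistic and the t-based interval. Python's standard library has no t distribution, so the percentage point is passed in as a table value; here $t_{0.025,9} = 2.262$ is used. The data and function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, mu0):
    """T0 = (xbar - mu0) / (s / sqrt(n)) for H0: mu = mu0."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))


def t_confidence_interval(sample, t_crit):
    """CI on mu; t_crit is t_{alpha/2, n-1} looked up in a t table."""
    n = len(sample)
    xbar = mean(sample)
    half_width = t_crit * stdev(sample) / sqrt(n)
    return xbar - half_width, xbar + half_width


# Hypothetical measurements, n = 10, so n - 1 = 9 degrees of freedom
data = [9.8, 10.2, 10.4, 9.9, 10.1, 10.3, 9.7, 10.0, 10.2, 10.4]
T_CRIT = 2.262  # table value t_{0.025, 9}

t0 = t_statistic(data, mu0=10.0)
lo, hi = t_confidence_interval(data, T_CRIT)
print(f"t0 = {t0:.3f}, 95% CI: {lo:.3f} to {hi:.3f}")
```

Since $|t_0|$ is below 2.262 here, this illustrative sample would not lead to rejecting $H_0: \mu = 10$ at $\alpha = 0.05$, which agrees with 10 lying inside the interval.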

INFERENCE ON THE VARIANCE OF A NORMAL POPULATION


• Test statistic :
$$X_0^2 = \frac{(n - 1)S^2}{\sigma_0^2}$$
• Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal distribution with unknown mean $\mu$ and
unknown variance $\sigma^2$. The quantity
$$X^2 = \frac{(n - 1)S^2}{\sigma^2}$$
has a chi-square distribution with n-1 degrees of freedom, abbreviated as $\chi^2_{n-1}$. In general, the
probability density function of a chi-square random variable is
$$f(x) = \frac{1}{2^{k/2}\,\Gamma\!\left(\frac{k}{2}\right)}\, x^{(k/2)-1} e^{-x/2}, \qquad x > 0$$
where k is the number of degrees of freedom and $\Gamma(k/2)$ was defined in Section 4-5.1.
• Testing hypotheses on the variance of a normal distribution

• Confidence interval on the variance of a normal distribution


If $s^2$ is the sample variance from a random sample of n observations from a normal distribution
with unknown variance $\sigma^2$, a $100(1 - \alpha)\%$ CI on $\sigma^2$ is
$$\frac{(n - 1)s^2}{\chi^2_{\alpha/2,n-1}} \le \sigma^2 \le \frac{(n - 1)s^2}{\chi^2_{1-\alpha/2,n-1}}$$
where $\chi^2_{\alpha/2,n-1}$ and $\chi^2_{1-\alpha/2,n-1}$ are the upper and lower $100\alpha/2$ percentage points of the
chi-square distribution with n-1 degrees of freedom, respectively. To find a CI on the standard
deviation $\sigma$, simply take the square root throughout the interval above (equation 4-62).
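A sketch of the variance test statistic and interval. As with the t distribution, the standard library has no chi-square quantiles, so the percentage points are passed in from a table; for 9 degrees of freedom, $\chi^2_{0.025,9} = 19.02$ and $\chi^2_{0.975,9} = 2.70$. The data, the hypothesized variance, and the function names are hypothetical.

```python
from math import sqrt
from statistics import variance


def chi2_statistic(sample, sigma0_sq):
    """X0^2 = (n - 1) s^2 / sigma0^2 for H0: sigma^2 = sigma0^2."""
    n = len(sample)
    return (n - 1) * variance(sample) / sigma0_sq


def variance_ci(sample, chi2_upper, chi2_lower):
    """CI on sigma^2; arguments are the upper and lower chi-square table points."""
    n = len(sample)
    s2 = variance(sample)  # sample variance with n - 1 in the denominator
    return (n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower


# Hypothetical measurements, n = 10
data = [9.8, 10.2, 10.4, 9.9, 10.1, 10.3, 9.7, 10.0, 10.2, 10.4]

x0_sq = chi2_statistic(data, sigma0_sq=0.04)
var_lo, var_hi = variance_ci(data, chi2_upper=19.02, chi2_lower=2.70)
sd_lo, sd_hi = sqrt(var_lo), sqrt(var_hi)  # CI on sigma: take square roots
print(f"X0^2 = {x0_sq:.2f}")
print(f"95% CI on variance: {var_lo:.4f} to {var_hi:.4f}")
print(f"95% CI on std dev:  {sd_lo:.4f} to {sd_hi:.4f}")
```

Note that the larger percentage point goes in the denominator of the lower limit, which is why the interval is not symmetric about $s^2$.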
INFERENCE ON A POPULATION PROPORTION
• Let $X$ be the number of observations in a random sample of size n that belong to the class
associated with p. Then the quantity
$$Z = \frac{X - np}{\sqrt{np(1 - p)}}$$
has approximately a standard normal distribution, $N(0,1)$.
• Testing hypotheses on a binomial proportion
• The approximate $\beta$-error for the two-sided alternative $H_1: p \ne p_0$ is

If the alternative is $H_1: p < p_0$,

whereas if the alternative is $H_1: p > p_0$,

• Sample size for a two-sided hypothesis test on a binomial proportion

If n is not an integer, round the sample size up to the next larger integer. For a one-sided
alternative, replace $z_{\alpha/2}$ in the equation by $z_{\alpha}$.
• Confidence interval on a binomial proportion
If $\hat{p}$ is the proportion of observations in a random sample of size n that belong to a class of
interest, an approximate $100(1 - \alpha)\%$ CI on the proportion p of the population that belongs
to this class is

where $z_{\alpha/2}$ is the upper $100\alpha/2$ percentage point of the standard normal distribution.
• Sample size for a specified error E on a binomial proportion.
If $\hat{p}$ is used as an estimate of p, we can be $100(1 - \alpha)\%$ confident that the error $|\hat{p} - p|$ will
not exceed a specified amount E when the sample size is
$$n = \left(\frac{z_{\alpha/2}}{E}\right)^2 p(1 - p)$$
For a specified error E, an upper bound on the sample size for estimating p is
$$n = \left(\frac{z_{\alpha/2}}{E}\right)^2 \frac{1}{4}$$
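The proportion formulas can be sketched as below. The confidence-interval formula itself did not survive in these notes, so `proportion_ci` assumes the usual large-sample normal-approximation form $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$; treat that as an assumption rather than the notes' own equation. Function names and numbers are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def proportion_ci(x, n, alpha=0.05):
    """Approximate CI on p (assumed normal-approximation form)."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def proportion_sample_size(E, alpha=0.05, p=None):
    """Sample size for error E; with p unknown, use the upper bound p(1-p) <= 1/4."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    pq = 0.25 if p is None else p * (1 - p)
    return ceil((z / E) ** 2 * pq)


# Hypothetical sample: 10 items of interest out of 85
lo, hi = proportion_ci(x=10, n=85)
n_with_guess = proportion_sample_size(E=0.05, p=0.12)  # prior guess for p
n_upper_bound = proportion_sample_size(E=0.05)         # no guess for p
print(f"95% CI on p: {lo:.3f} to {hi:.3f}")
print(n_with_guess, n_upper_bound)
```

The upper bound is more conservative because $p(1 - p)$ reaches its maximum of 1/4 at $p = 0.5$, so it never underestimates the required sample size.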
TESTING FOR GOODNESS OF FIT
• Test statistic for the chi-square goodness-of-fit test
$$X_0^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
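The goodness-of-fit statistic is a direct sum over the k classes, with $O_i$ the observed and $E_i$ the expected frequency. A minimal sketch with hypothetical counts (120 rolls of a die assumed fair); the function name is illustrative only.

```python
def chi_square_gof_statistic(observed, expected):
    """X0^2 = sum over classes of (O_i - E_i)^2 / E_i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 120 rolls, expected frequency 20 per face under H0
observed = [18, 22, 25, 15, 20, 20]
expected = [20] * 6

x0_sq = chi_square_gof_statistic(observed, expected)
print(f"X0^2 = {x0_sq:.2f}")
```

The computed value would then be compared with a chi-square percentage point from a table to decide whether the hypothesized distribution is rejected.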
