
Review of Statistics

• Hypothesis Testing for Mean of Samples

• Confidence Interval for Mean of Samples


Hypothesis Testing for MEAN
The hypothesis testing problem (for the mean): make a provisional decision
based on the evidence at hand whether a null hypothesis is true, or instead
that some alternative hypothesis is true. That is, test
• H0: E(Y) = μY,0 vs. H1: E(Y) > μY,0 (1-sided, >)
• H0: E(Y) = μY,0 vs. H1: E(Y) < μY,0 (1-sided, <)
• H0: E(Y) = μY,0 vs. H1: E(Y) ≠ μY,0 (2-sided)

• A statistical test uses the data obtained from a sample to make a decision about whether the null hypothesis should be rejected.
• The numerical value obtained from a statistical test is called the test value.
• The level of significance is the maximum probability of committing a type I error. This probability is symbolized by α (Greek letter alpha). That is, P(type I error) = α.
• The critical value separates the critical region from the noncritical region. The symbol for critical value is C.V.
• The critical or rejection region is the range of values of the test value that indicates that there is a significant difference and that the null hypothesis should be rejected.
• The noncritical or nonrejection region is the range of values of the test value that indicates that the difference was probably due to chance and that the null hypothesis should not be rejected.
• A one-tailed test indicates that the null hypothesis should be rejected when the test value is in the critical region on one side of the mean.
• In a two-tailed test, the null hypothesis should be rejected when the test value is in either of the two critical regions.

1. If σ is known, use the z test. The variable must be normally distributed if n < 30.
2. If σ is unknown but n ≥ 30, use the t test.
3. If σ is unknown and n < 30, use the t test. The variable must be approximately normally distributed.

Find critical value in z-statistics
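
As a rough sketch of how z critical values can be found without a printed table, the snippet below uses Python's standard-library `statistics.NormalDist`. The function name `z_critical` and its `tail` argument are illustrative choices, not part of the original slides.

```python
from statistics import NormalDist

def z_critical(alpha, tail="two"):
    """Return the N(0,1) critical value for significance level alpha.

    tail is a hypothetical argument: "two", "right", or "left".
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        return std_normal.inv_cdf(1 - alpha / 2)  # reject if |z| exceeds this
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)      # reject if z exceeds this
    if tail == "left":
        return std_normal.inv_cdf(alpha)          # reject if z falls below this
    raise ValueError("tail must be 'two', 'right', or 'left'")

for tail in ("two", "right", "left"):
    print(tail, round(z_critical(0.05, tail), 3))
```

At α = 0.05 this gives about 1.96 for a two-tailed test and about ±1.645 for the one-tailed tests, which agrees with a standard z table.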
Finding critical value for t-statistics
(d.f. = n – 1)

Comments on Student t distribution, ctd.

2. If the sample size is moderate (several dozen) or large (hundreds or more), the difference between the t-distribution and N(0,1) critical values is negligible. Here are some 5% critical values for 2-sided tests:

degrees of freedom (n – 1)    5% t-distribution critical value
10                            2.23
20                            2.09
30                            2.04
60                            2.00
∞                             1.96
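
The critical values in the table above can be checked numerically. The sketch below is one possible approach using only the standard library: it integrates the Student t density with Simpson's rule and finds the critical value by bisection. The function names, the step count, and the search bounds are assumptions made for illustration; a statistics package would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    # lgamma keeps the normalising constant stable for large df
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0: Simpson's rule on [0, x] plus symmetry about 0."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical_two_sided(alpha, df):
    """Find c such that P(|T| > c) = alpha, by bisection on [0, 50]."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < 1 - alpha / 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for df in (10, 20, 30, 60):
    print(df, round(t_critical_two_sided(0.05, df), 2))
```

To two decimals the computed values match the tabulated ones, and they approach the N(0,1) value of 1.96 as the degrees of freedom grow.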

Running a z test on your data requires five steps:

1. State the null hypothesis and the alternative hypothesis.
2. Choose an alpha level.
3. Find the critical value of z in a z table.
4. Make the decision, using any of the three methods (traditional, P-value, or confidence interval).
5. Summarize the results.

If Yi, i = 1,…, n is i.i.d. N(μY, σY²), then the t-statistic has the Student t-distribution with n – 1 degrees of freedom.
The critical values of the Student t-distribution are tabulated in the back of most statistics books. Remember the recipe?

Running a t test on your data requires five steps:

1. State the null hypothesis and the alternative hypothesis.
2. Choose an alpha level.
3. Find the critical value of t in a t table.
4. Make the decision, using any of the three methods (traditional, P-value, or confidence interval).
5. Summarize the results.
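
For reference, the t-statistic mentioned above is commonly written as follows. The symbols for the sample mean and the sample standard deviation are assumed notation here, since the original formula did not survive in the text:

```latex
t = \frac{\bar{Y} - \mu_{Y,0}}{s_Y / \sqrt{n}},
\qquad
s_Y^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( Y_i - \bar{Y} \right)^2 ,
\qquad
\text{d.f.} = n - 1 .
```

The z test has the same form, with the known population standard deviation in place of the sample standard deviation.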
Methods used to test hypotheses
The three methods used to test hypotheses are

• 1. The traditional method

• 2. The P-value method

• 3. The confidence interval method

The Traditional Method

Example 1. A researcher wishes to see if the mean number of days that a basic, low-price, small automobile sits on a dealer’s lot is 29. A sample of 30 automobile dealers has a mean of 30.1 days for basic, low-price, small automobiles. At α = 0.05, test the claim that the mean time is greater than 29 days. The standard deviation of the population is 3.8 days.

Example 2. A medical investigation claims that the average number of infections per week at a hospital in southwestern Pennsylvania is 16.3. A random sample of 10 weeks had a mean number of 17.7 infections. The sample standard deviation is 1.8. Is there enough evidence to reject the investigator’s claim at α = 0.05?
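
The two exercises above (automobiles and hospital infections) can be worked through with the traditional method roughly as follows. This is a sketch, not a worked solution from the original slides. The t critical value 2.262 (d.f. = 9, two-tailed, α = 0.05) is taken from a standard t table and does not appear in the source.

```python
import math
from statistics import NormalDist

alpha = 0.05

# Automobile example: sigma is known, so use a right-tailed z test.
z = (30.1 - 29) / (3.8 / math.sqrt(30))
z_crit = NormalDist().inv_cdf(1 - alpha)  # right-tail critical value
reject_auto = z > z_crit

# Infection example: sigma is unknown and n = 10, so use a two-tailed t test.
t = (17.7 - 16.3) / (1.8 / math.sqrt(10))
t_crit = 2.262  # assumption: standard t-table value for d.f. = 9, not from the source
reject_infection = abs(t) > t_crit

print(f"z = {z:.3f}, C.V. = {z_crit:.3f}, reject H0: {reject_auto}")
print(f"t = {t:.3f}, C.V. = ±{t_crit}, reject H0: {reject_infection}")
```

The z value of about 1.59 falls short of 1.645, so the null hypothesis is not rejected in the automobile example. The t value of about 2.46 exceeds 2.262, so the investigator's claim is rejected in the infection example.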

p-value Method

A researcher wishes to test the claim that the average cost of tuition and fees
at a four-year public college is greater than $5700. She selects a random
sample of 36 four-year public colleges and finds the mean to be $5950. The
population standard deviation is $659. Is there evidence to support the claim at
α = 0.05? Use the P-value method.
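
A minimal sketch of the P-value method for the tuition exercise above, using the standard-library normal distribution. The variable names are illustrative only.

```python
import math
from statistics import NormalDist

mu0, xbar, sigma, n, alpha = 5700, 5950, 659, 36, 0.05

# Test value: sigma is known, so the z statistic applies.
z = (xbar - mu0) / (sigma / math.sqrt(n))

# Right-tailed test (H1: mean > 5700): the P-value is the area to the right of z.
p_value = 1 - NormalDist().cdf(z)

reject = p_value <= alpha
print(f"z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {reject}")
```

The test value is about 2.28 and the P-value is about 0.011, which is below 0.05, so the null hypothesis is rejected and the data support the claim.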

Comments on Student t distribution, ctd.
4. You might not know this. Consider the t-statistic testing the hypothesis
that two means (groups s, l) are equal:

$$
t = \frac{\bar{Y}_s - \bar{Y}_l}{\sqrt{\dfrac{s_s^2}{n_s} + \dfrac{s_l^2}{n_l}}}
  = \frac{\bar{Y}_s - \bar{Y}_l}{SE(\bar{Y}_s - \bar{Y}_l)}
$$

Even if the population distribution of Y in the two groups is normal, this
statistic doesn’t have a Student t distribution!
There is a statistic testing this hypothesis that does have a Student t
distribution, the “pooled variance” t-statistic (see SW, Section 3.6). However,
the pooled variance t-statistic is only valid if the variances of the normal
distributions are the same in the two groups. Would you expect this to
be true, say, for men’s v. women’s wages?
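
To make the contrast concrete, the sketch below computes both the unpooled statistic from the formula above and the pooled variance statistic from summary data. The function names and the numbers are hypothetical, chosen only to show how the two can diverge when the group variances and sample sizes differ.

```python
import math

def unpooled_t(mean_s, var_s, n_s, mean_l, var_l, n_l):
    """t-statistic with SE = sqrt(s_s^2/n_s + s_l^2/n_l), as in the formula above."""
    se = math.sqrt(var_s / n_s + var_l / n_l)
    return (mean_s - mean_l) / se

def pooled_t(mean_s, var_s, n_s, mean_l, var_l, n_l):
    """Pooled variance t-statistic; only valid if the two variances are equal."""
    pooled_var = ((n_s - 1) * var_s + (n_l - 1) * var_l) / (n_s + n_l - 2)
    se = math.sqrt(pooled_var * (1 / n_s + 1 / n_l))
    return (mean_s - mean_l) / se

# Hypothetical summary statistics: unequal variances and unequal sample sizes.
args = dict(mean_s=20.0, var_s=16.0, n_s=10, mean_l=17.0, var_l=64.0, n_l=40)
print("unpooled:", round(unpooled_t(**args), 3))
print("pooled:  ", round(pooled_t(**args), 3))
```

With these made-up numbers the unpooled statistic is about 1.68 and the pooled one about 1.14. When the two sample sizes are equal, the two standard errors coincide and the statistics agree.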

Confidence Interval
• A 95% confidence interval for μY is an interval that contains the true
value of μY in 95% of repeated samples.
• Digression: What is random here? The values of Y1,...,Yn and thus any
functions of them – including the confidence interval. The confidence
interval will differ from one sample to the next. The population
parameter, μY, is not random; we just don’t know it.
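
The "95% of repeated samples" interpretation can be illustrated with a small simulation. This is a sketch under assumed settings: the population mean, standard deviation, sample size, number of repetitions, and seed are all arbitrary illustrative values, and the interval uses a known σ.

```python
import math
import random
from statistics import NormalDist, fmean

def coverage(mu=5.0, sigma=2.0, n=25, reps=2000, level=0.95, seed=0):
    """Fraction of simulated samples whose interval contains the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sigma / math.sqrt(n)  # fixed because sigma is known
    hits = 0
    for _ in range(reps):
        # The sample, and therefore the interval, changes on every repetition.
        ybar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if ybar - half_width <= mu <= ybar + half_width:
            hits += 1
    return hits / reps

print("share of intervals containing the true mean:", coverage())
```

The true mean stays fixed throughout; only the interval endpoints vary from sample to sample, and the share of intervals that contain the mean should come out close to 0.95.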
Confidence Intervals for the Mean When σY² Is Known

Confidence Intervals for the Mean When σY² Is Unknown

Ten randomly selected people were asked how long they slept at night. The mean time was
7.1 hours, and the standard deviation was 0.78 hour. Find the 95% confidence interval of the
mean time. Assume the variable is normally distributed.
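
A sketch of the calculation for the sleep exercise above. Because σ is unknown and n = 10, the interval uses the t distribution with 9 degrees of freedom. The critical value 2.262 is taken from a standard t table and is not given in the source.

```python
import math

n, xbar, s = 10, 7.1, 0.78

t_crit = 2.262  # assumption: standard t-table value for d.f. = 9, 95% confidence

# Margin of error: critical value times the estimated standard error of the mean.
margin = t_crit * s / math.sqrt(n)
lower, upper = xbar - margin, xbar + margin

print(f"95% confidence interval: {lower:.2f} < mean < {upper:.2f}")
```

This gives an interval of roughly 6.54 to 7.66 hours.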
Testing the Difference Between Two Means of Independent Samples:
Using the t Test

Hypothesis Testing for the Difference of Two Means: Independent Samples

Confidence Intervals for the Difference of Two Means: Independent Samples