
Statistical inference: estimation and hypothesis testing


G&P Appendix D
Outcomes
• At the end of this presentation you should be able to:
o Define/describe key terms and concepts such as the null hypothesis, confidence
interval, level of significance, acceptance region, test statistic and critical value
o Find critical t values
o Test hypothesis μ=
Statistical inference
•It is the study of the relationship between the population and the
sample drawn from it.
•Drawing conclusions about the population, based on a random sample
from the population.
•The example of the NYSE…
•Estimation is the first step in statistical inference.
oParameter, estimator, estimate…
•Hypothesis testing is about testing an expectation about what value a
particular parameter may have.
Estimation of parameters
• We want to know the mean and variance of a random variable X.
o Suppose we have a random sample of size n from a known probability distribution.
o The sample mean is an estimator of the population mean.
o The sample variance is an estimator of the population variance.
• Estimation can be point estimation or interval estimation.
• Point estimation estimates the population parameter with a single
numerical value, e.g. X̄.
• But this point estimator is also a random variable (r.v.) that will vary from
sample to sample.
o It may therefore be better to estimate an interval of values instead.
Point estimation
• In the NYSE example:
o The point estimator of µx is the sample mean X̄.
o The sample mean of the 28 P/E ratios is computed as 23.25 (Table D-1).
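As a minimal sketch of point estimation, the snippet below computes the sample mean and sample variance with Python's standard library. The data are hypothetical and are not the 28 P/E ratios of Table D-1.

```python
import statistics

# Hypothetical P/E ratios, for illustration only (NOT the data of Table D-1).
pe_ratios = [18.0, 25.5, 31.0, 14.5, 22.0, 27.5]

x_bar = statistics.mean(pe_ratios)    # point estimate of the population mean
s2 = statistics.variance(pe_ratios)   # sample variance (divisor n - 1)
s = statistics.stdev(pe_ratios)       # sample standard deviation

print(f"mean = {x_bar:.3f}, variance = {s2:.3f}, sd = {s:.3f}")
```

A different sample would give different numbers, which is why the estimator itself is a random variable.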
Interval estimation G&P p.491
• Rather than saying X̄ = 23.25 is the best estimate of the true mean, one can
also say that the true mean lies between two specific values with a certain
probability.
• To obtain an interval estimator for µx you employ the notion of a probability
distribution of an estimator discussed in Appendix C.
• Critical t (Table E2)
• Z vs t-values
• upper and lower critical t
• % of the area under the t distribution curve between upper and lower t
• Use the t distribution to obtain the interval estimator equation (D.6).

• Example: n=?; df=?
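Critical t values are normally read from the t table. As an illustrative cross-check only, the sketch below finds the upper critical t numerically, assuming n = 28 (so df = n − 1 = 27) and a 95% central area. The helper names (`t_pdf`, `t_central_area`, `critical_t`) are made up for this example; the integration uses Simpson's rule and the search uses bisection.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_central_area(t, df, steps=2000):
    """Area under the t density between -t and +t (Simpson's rule)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3  # density is symmetric: double the area on [0, t]

def critical_t(confidence, df):
    """Find t so that the area between -t and +t equals `confidence`."""
    lo, hi = 0.0, 100.0
    for _ in range(60):  # bisection
        mid = (lo + hi) / 2
        if t_central_area(mid, df) < confidence:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n = 28
df = n - 1
t_crit = critical_t(0.95, df)
print(f"df = {df}, critical t = ±{t_crit:.3f}")
```

The result should agree with the tabulated value of about 2.052 for 27 degrees of freedom.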


Test of significance
One- or two-tailed test

$$P\left(\bar{X} - 2.052\,\frac{S_x}{\sqrt{n}} \le \mu_x \le \bar{X} + 2.052\,\frac{S_x}{\sqrt{n}}\right) = 0.95$$

• P(23.25 − 2.052(9.49/√28) ≤ µx ≤ 23.25 + 2.052(9.49/√28)) = 0.95


• Plugging in values gives 19.57 ≤ µx ≤ 26.93 as the 95% confidence interval for µx.
• Interval estimation provides a range of values (19.57 to 26.93) that will include
the true value with a certain degree of confidence or probability.
• Important concepts to note:
o Critical t values (find in Table)
o Confidence interval: an interval constructed so that it contains the true parameter
with probability 1 − α (here 95%)
o Level of significance (α): the probability of committing a type I error, i.e. rejecting
a true null hypothesis
o Type II error: not rejecting (accepting) a false hypothesis
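The interval calculation above can be sketched in a few lines. The function name `t_confidence_interval` is hypothetical; the inputs are the NYSE figures quoted above (X̄ = 23.25, Sx = 9.49, n = 28) and the tabulated critical t of 2.052.

```python
import math

def t_confidence_interval(x_bar, s, n, t_crit):
    """Return (lower, upper) = x_bar -/+ t_crit * s / sqrt(n)."""
    half_width = t_crit * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

lower, upper = t_confidence_interval(x_bar=23.25, s=9.49, n=28, t_crit=2.052)
print(f"{lower:.2f} <= mu_x <= {upper:.2f}")
```

A higher confidence level means a larger critical t and therefore a wider interval.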
Properties of estimators
• Estimators should have the following properties that statisticians find desirable:
• Linearity.
o The estimator is a linear function of the sample observations.
• Unbiasedness.
o If in repeated applications of the method the mean value of the estimator coincides with the true
parameter value.
• Minimum variance.
o If its variance is smaller than that of any other estimator of the parameter value.
• Efficiency.
o If only unbiased estimators of a parameter are considered, the one with the smallest variance is
efficient.
• Consistency.
o If the estimator approaches the true value of the parameter as the sample size increases.
• BLUE.
o If the estimator is linear, unbiased and has minimum variance in the class of all linear unbiased
estimators.
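Unbiasedness and consistency can be illustrated with a small simulation. This is a sketch under assumed conditions: a hypothetical normal population with mean 20 and standard deviation 5, with the sample mean as the estimator.

```python
import random
import statistics

random.seed(0)
TRUE_MEAN, TRUE_SD = 20.0, 5.0  # hypothetical population parameters

def sample_mean(n):
    """Draw a random sample of size n and return its mean."""
    return statistics.fmean([random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(n)])

# Unbiasedness: over many repeated samples the estimates average out
# close to the true mean.
means_small = [sample_mean(10) for _ in range(5000)]
avg_of_means = statistics.fmean(means_small)

# Consistency: with a larger sample size the estimates cluster more
# tightly around the true mean.
means_large = [sample_mean(400) for _ in range(1000)]
sd_small = statistics.stdev(means_small)
sd_large = statistics.stdev(means_large)

print(f"average of sample means (n=10): {avg_of_means:.2f}")
print(f"spread of sample means: n=10 -> {sd_small:.2f}, n=400 -> {sd_large:.2f}")
```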
Hypothesis testing
•Hypothesis testing is about testing an expectation about what
value a particular parameter may have.
•The NYSE example:
oThe null hypothesis H0: μx=18.5
oThe alternative hypothesis may be H1:μx>18.5, or H1:μx<18.5, or
H1:μx≠18.5
oWe will develop decision rules to determine whether the sample
evidence supports the null hypothesis.
oIf the sample evidence supports the null hypothesis we do not reject
H0, but if it does not, we reject H0.
Confidence interval approach
• The NYSE example:
o Sample mean computed as 23.25.
o The true variance is unknown and we replace it with the sample
variance.
o We know that the standardized sample mean, t = (X̄ − μx)/(Sx/√n), follows the t
distribution (here with 27 df), from which we obtain the following 95% confidence
interval: 19.57 ≤ μx ≤ 26.93
o If this interval (the acceptance region) includes the value of the parameter
under H0, we do not reject the null hypothesis.
o In this case 18.5 lies outside the interval, so we reject H0: μx = 18.5.
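The decision rule of the confidence interval approach amounts to a single comparison. The function name `reject_by_interval` is hypothetical; the interval and the hypothesized value come from the NYSE example.

```python
def reject_by_interval(hypothesized_value, lower, upper):
    """Reject H0 when the hypothesized value lies outside the interval."""
    return not (lower <= hypothesized_value <= upper)

lower, upper = 19.57, 26.93  # 95% confidence interval from the example
decision = reject_by_interval(18.5, lower, upper)
print("reject H0" if decision else "do not reject H0")
```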
Test of significance approach
• In the test of significance approach the idea is to use a test statistic
and its probability distribution under the hypothesized value of μx
• In the NYSE example we know that

$$t = \frac{\bar{X} - \mu_x}{S_x/\sqrt{n}}$$

and we know the values of X̄, Sx and n.
• If we specify the value of μx under the H0 we will have a unique t value
that we can look up in the t table to find out the probability of
obtaining such a t value.
o As |t| gets larger we will be more inclined to reject the null.
Test of significance approach
•The NYSE example:
o If we use a 5% level of significance (95% confidence) and the degrees of freedom
are 27, the two-tailed critical values are –2.052 and +2.052.

$$t = \frac{23.25 - 18.5}{9.49/\sqrt{28}} \approx 2.65$$

o The calculated t value of about 2.65 lies in the right-tail critical region and we
therefore reject the null hypothesis that the true average P/E ratio is
18.5.
•Note the use of the term statistically significant.
•Note that hypotheses are “rejected” or “not rejected”.
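A minimal sketch of the test of significance approach with the NYSE figures. The function name `t_statistic` is hypothetical and the critical value is the tabulated 2.052 for 27 df.

```python
import math

def t_statistic(x_bar, mu_0, s, n):
    """t = (x_bar - mu_0) / (s / sqrt(n)), with mu_0 the value under H0."""
    return (x_bar - mu_0) / (s / math.sqrt(n))

t = t_statistic(x_bar=23.25, mu_0=18.5, s=9.49, n=28)
t_crit = 2.052               # two-tailed 5% critical value, 27 df (from the t table)
reject = abs(t) > t_crit     # two-tailed decision rule

print(f"t = {t:.2f}; reject H0: {reject}")
```

For a one-tailed alternative only one tail would be used, with the corresponding one-tailed critical value from the table.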
Choosing the level of significance and the p value
• The choice of 1, 5 or 10% as the level of significance is arbitrary.
• It is preferable to find the p (probability) value, which is the lowest
significance level at which a null hypothesis can be rejected.
• The p value is the probability of obtaining a test statistic at least as extreme
as the one observed, if the null hypothesis is true.
• See the tables for the NYSE example…
• Later on EViews will determine the p-values for you
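To make the p value concrete, here is a rough sketch that approximates a two-sided p value by integrating the t density numerically (Simpson's rule). The helper names are hypothetical, and the t value of 2.65 with 27 df is the one from the NYSE example. Statistical software would normally report this figure directly.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p_value(t, df, steps=2000):
    """P(|T| >= |t|) for T following the t distribution with df degrees of freedom."""
    t = abs(t)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # area between -t and +t
    return 1 - central_area

p = two_sided_p_value(2.65, 27)
print(f"two-sided p value = {p:.4f}")
```

The p value is below 0.05, which is consistent with rejecting H0 at the 5% level, but above 0.01, so H0 would not be rejected at the 1% level.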
F tests of significance

• From App C:

$$F = \frac{S_x^2}{S_y^2} = \frac{\sum (X_i - \bar{X})^2/(m-1)}{\sum (Y_i - \bar{Y})^2/(n-1)}$$

• with $H_0: \sigma_x^2 = \sigma_y^2$
•Table E-3: Critical F values.


•See p.510 for a summary of the steps involved in testing a statistical
hypothesis.
•More about this in Topic 6
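The F ratio of two sample variances can be sketched as follows. The two samples are hypothetical and `f_ratio` is a made-up helper name. The computed F would then be compared with the critical F value from the table for (m − 1, n − 1) degrees of freedom.

```python
import statistics

def f_ratio(x, y):
    """Return F = S_x^2 / S_y^2 and its (numerator, denominator) degrees of freedom."""
    f = statistics.variance(x) / statistics.variance(y)  # variances use divisor n - 1
    return f, (len(x) - 1, len(y) - 1)

# Hypothetical samples of sizes m = 5 and n = 6, for illustration only.
x = [12.0, 15.0, 9.0, 14.0, 10.0]
y = [11.0, 12.0, 10.0, 13.0, 12.0, 11.0]

f, dfs = f_ratio(x, y)
print(f"F = {f:.3f} with {dfs} degrees of freedom")
```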
Activities
• A random sample of 30 students finds that the average time it takes a student to
walk to campus is 35.8 minutes, while the standard deviation is 8.5 minutes.
Determine a 99% confidence interval for the average time to walk to campus.
• Security Services estimates that it takes 30 minutes to lock the EW and to dim all
the lights. In a sample of 20 nights, the average time it took was 30.6
minutes and the standard deviation 1.6 minutes. Use the confidence interval
approach to test the hypothesis with a 10% level of significance (two-sided test).
• Management of the Rugby Institute suspects that the weight of their players is
increasing. Previously, the average forward weighed 85kg with a standard
deviation of 5kg. But a current random sample of 26 players indicates that the
average weight is 92kg. Does this information support their fears? (Use the test
of significance approach with α=0.05, two-sided test.)
• End of Unit 5 in study guide
