
Advanced Econometrics

ANOVA
&
Hypothesis Testing
Dr. Sudipa Majumdar
THE GAUSS MARKOV THEOREM
The Gauss–Markov Theorem states that:
Given the assumptions of the classical linear regression
model, the OLS estimator β̂2 is the best linear
unbiased estimator (BLUE) of β2.

B - It is the “best” since it has minimum variance in the class
of all linear unbiased estimators.
L - It is a linear function of the random variable Y.
U - It is unbiased: its expected value E(β̂2) equals the true value β2.

An unbiased estimator with the least variance is known
as an “efficient” estimator.
OLS estimators are BLUE.
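Unbiasedness can be illustrated numerically. Below is a minimal Monte Carlo sketch in Python (not from the slides; the model y = β1 + β2x + u and all parameter values are illustrative): across many repeated samples, the OLS slope estimates average out to the true β2.

```python
# Minimal Monte Carlo sketch (illustrative, not from the slides):
# the OLS slope estimator is unbiased, so its average over many
# repeated samples is close to the true beta2.
import numpy as np

rng = np.random.default_rng(42)
b1_true, b2_true, n, reps = 2.0, 0.5, 50, 5000
x = rng.uniform(0, 10, size=n)        # fixed regressors, as the CLRM assumes

estimates = np.empty(reps)
for r in range(reps):
    u = rng.normal(0, 2, size=n)      # classical errors: mean 0, constant variance
    y = b1_true + b2_true * x + u
    # OLS slope in deviation form: sum(x_dev * y_dev) / sum(x_dev ** 2)
    x_dev, y_dev = x - x.mean(), y - y.mean()
    estimates[r] = (x_dev @ y_dev) / (x_dev @ x_dev)

print(f"true beta2 = {b2_true}, mean of OLS estimates = {estimates.mean():.4f}")
```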
ANOVA
ANALYSIS OF VARIANCE (Anova)
• Analysis of variance (ANOVA) for data with 20 observations,
3 independent variables and 1 dependent variable.

• TSS = 5793.43, with total df = (n − 1) = (20 − 1) = 19
• 3 independent variables + 1 constant give k = 4 parameters, so
ESS = 5649.48, with model df = (k − 1) = (4 − 1) = 3
• RSS = 143.95, with residual df = (n − k) = (20 − 4) = 16
• MS (mean square) = sum of squares / degrees of freedom
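A minimal Python sketch of this arithmetic, using the sums of squares quoted above (the figures come from the slide; the code itself is illustrative):

```python
# Degrees of freedom and mean squares from the sums of squares on the slide.
TSS, ESS, RSS = 5793.43, 5649.48, 143.95
n, k = 20, 4                                         # observations, estimated parameters

df_total, df_model, df_resid = n - 1, k - 1, n - k   # 19, 3, 16
MS_model = ESS / df_model                            # mean model SS, ~1883.16
MS_resid = RSS / df_resid                            # mean residual SS, ~9.00
print(MS_model, MS_resid)
```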
ANOVA & MODEL FIT

F(3, 16), the F-statistic
= Mean Model SS / Mean Residual SS

How well do the predictors (X variables taken together)
reliably predict the dependent variable?

H0: The X variables have no predictive power; all slope coefficients = 0.

When the associated p-value < 0.05, we reject H0 at the 5% level of
significance (95% confidence level) and conclude that the regression as a
whole is statistically significant.
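A short SciPy sketch of the F test for the example above (illustrative):

```python
# F(3, 16) statistic and its p-value for the slide's ANOVA figures.
from scipy import stats

MS_model = 5649.48 / 3        # mean model SS
MS_resid = 143.95 / 16        # mean residual SS
F = MS_model / MS_resid       # ~209.3

p_value = stats.f.sf(F, 3, 16)                    # upper-tail area of F(3, 16)
print(f"F(3, 16) = {F:.2f}, p = {p_value:.2e}")   # p << 0.05: reject H0
```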
COEFFICIENT OF DETERMINATION
R-squared = Explained SS / Total SS = 5649.48 / 5793.43 = 0.9752 = 97.52%

Adding more and more predictors tends to increase the R-squared.
Adjusted R-squared is R-squared adjusted for the number of predictors:
Adj R-squared = 1 − (Residual SS / df residual) / (Total SS / df total)
= 1 − (143.95 / 16) / (5793.43 / 19) = 0.9705 = 97.05%

Root MSE = standard deviation of the residuals (error term),
a measure of the spread of the residuals:
Root MSE = √(Residual SS / df residual) = √(143.95 / 16) = 2.9995
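These fit statistics can be reproduced directly from the sums of squares (a minimal sketch; figures as quoted on the slide):

```python
# R-squared, adjusted R-squared, and Root MSE from the quoted sums of squares.
import math

TSS, ESS, RSS = 5793.43, 5649.48, 143.95
n, k = 20, 4

r2 = ESS / TSS                                    # ~0.9752
adj_r2 = 1 - (RSS / (n - k)) / (TSS / (n - 1))    # ~0.9705
root_mse = math.sqrt(RSS / (n - k))               # ~2.9995
print(f"R2 = {r2:.4f}, adj R2 = {adj_r2:.4f}, Root MSE = {root_mse:.4f}")
```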
Testing of Hypothesis
THE PROBABILITY
DISTRIBUTION
• Recall, solving the normal equations, we derived
β̂2 = Σxiyi / Σxi² (in deviation form, where xi = Xi − X̄ and yi = Yi − Ȳ).
• The Xi are fixed, pre-determined variables; β̂2 is a function of Y,
which in turn is a function of the disturbance term ui,
which is random.
• Therefore, the probability distribution of β̂2 will depend on the
probability distribution of ui.
• The nature of the probability distribution of ui assumes an
extremely important role in hypothesis testing.
• In order to carry out hypothesis testing, we assume that the
error terms follow a normal distribution.
Hypothesis Testing
It is usually assumed that the u’s follow a normal distribution.
The classical normal linear regression model assumes that each ui is
distributed normally with
Mean: E(ui) = 0
Variance: E(ui²) = σ²
Covariance: cov(ui, uj) = E(ui uj) = 0 for i ≠ j
ui ∼ N(0, σ²)
where ∼ means “distributed as”; N is the normal distribution;
the terms in parentheses are the mean and the variance.

For normally distributed variables, cov(ui, uj) = 0 implies independence:
ui and uj are uncorrelated and independently distributed.
ui ∼ NID(0, σ²)
NID - normally and independently distributed.
Normality assumption
1. ui represent the combined influence of independent variables
that are not explicitly introduced in the regression model. The
influence of these omitted or neglected variables should be
small and at best random.
2. The Central Limit Theorem (CLT) of statistics states that if you have
a population with mean μ and standard deviation σ and take
sufficiently large random samples from the population, with
replacement, then the distribution of the sample means will be
approximately normally distributed. This holds true
regardless of whether the source population is normal or
skewed, provided the sample size is sufficiently large
(see the simulation sketch after this list).
3. The CLT provides a theoretical justification for the assumption
of normality of ui.
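A minimal CLT simulation (the exponential population and all parameter values are illustrative assumptions, not from the slides): means of samples drawn from a strongly skewed population are far less skewed than the population itself.

```python
# CLT sketch: sample means of a skewed (exponential) population are
# approximately normal, so their skewness is close to zero.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
population_draws = rng.exponential(scale=1.0, size=10_000)
sample_means = rng.exponential(scale=1.0, size=(10_000, 50)).mean(axis=1)

print(f"skewness of population draws: {stats.skew(population_draws):.2f}")  # ~2
print(f"skewness of sample means:     {stats.skew(sample_means):.2f}")      # ~0.3 for n = 50
```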
Normality assumption
4. With the normality assumption, the probability
distributions of the OLS estimators can be derived.
OLS estimators are functions of ui. Therefore, if the ui are normally
distributed, so are the OLS estimators β̂1 and β̂2.

5. The normal distribution involves only two parameters, the mean and the
variance.

6. The normality assumption helps us to derive the exact
probability distributions of the OLS estimators and also
enables us to use the t, F, and χ² statistical tests for
regression models.
Interval estimation
• Consider the consumption–income example where the estimated
marginal propensity to consume (MPC), β̂2, was 0.5091.
• It was a point estimate of the unknown population MPC β2.
• How reliable is this estimate?
• Because of sampling fluctuations, a single estimate is likely to differ
from the true value, although in repeated sampling its mean value
is expected to equal the true value: E(β̂2) = β2.
• Instead of relying on the point estimate alone, we may
construct an interval around the point estimator on either side, such
that this interval has a 95% probability of including the true
parameter value.
Assume that we want to find out how “close” β̂2 is to β2.
We try to find an interval (β̂2 − δ, β̂2 + δ) that contains
the true β2.
Interval estimation
Pr(β̂2 − δ ≤ β2 ≤ β̂2 + δ) = 1 − α
Such an interval is known as a confidence interval.
1 − α is the level of confidence;
α (0 < α < 1) is known as the level of significance.
The endpoints of the confidence interval are the critical values:
β̂2 − δ is the lower confidence limit;
β̂2 + δ is the upper confidence limit.
If α = 0.05, the significance level is 5% and the confidence level is 95%.
The probability that the interval includes the true β2 is 0.95, or 95%.
Interval estimation
If β̂2 is normally distributed, we can standardize it using the standard normal:

Z = (β̂2 − β2) / se(β̂2)

If the true population variance σ² is known, then for a normally distributed
variable with mean µ and variance σ², the area under the normal curve
between µ ± σ is about 68%,
between µ ± 2σ is about 95%,
and between µ ± 3σ is about 99.7%.

But σ² is rarely known.
So, instead of the normal distribution, we use the t-distribution to establish
a confidence interval for β2:

t = (β̂2 − β2) / SE(β̂2)

The larger the standard error of the estimator, the greater the uncertainty
in estimating the true value of the unknown parameter.
So, the SE of an estimator is a measure of the precision of the estimator.
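A hedged sketch of the resulting 95% t-interval for the MPC estimate 0.5091. The standard error and degrees of freedom below are assumed for illustration only; the slides do not report them:

```python
# 95% confidence interval for beta2-hat using the t-distribution.
from scipy import stats

beta2_hat = 0.5091   # MPC estimate from the slides
se_beta2 = 0.036     # assumed standard error, for illustration only
df = 8               # assumed df = n - 2 for a bivariate model with n = 10

t_crit = stats.t.ppf(0.975, df)    # two-sided 5% critical value
lower = beta2_hat - t_crit * se_beta2
upper = beta2_hat + t_crit * se_beta2
print(f"95% CI for beta2: ({lower:.4f}, {upper:.4f})")
```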
The region outside the confidence interval is the REJECTION REGION,
and we reject the null hypothesis. When we reject the null, we say that
our finding is statistically significant.

If the estimated value is within the confidence interval, it is in the
ACCEPTANCE REGION. Then we do not reject the null hypothesis, and we say
that our finding is not statistically significant.

Test for normality using the Jarque–Bera test in STATA.
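The slide names STATA; for illustration, SciPy offers an equivalent Jarque–Bera test. The simulated residuals below are a stand-in, not data from the slides:

```python
# Jarque-Bera normality test applied to (simulated) regression residuals.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
residuals = rng.normal(0, 3, size=200)        # stand-in for OLS residuals

jb_stat, jb_p = stats.jarque_bera(residuals)
print(f"JB = {jb_stat:.2f}, p = {jb_p:.3f}")  # large p: do not reject normality
```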


“2-t” Rule of Thumb
We wish to find out whether Y is related to X or not.
The null hypothesis to be tested is H0: β2 = 0.

• If the degrees of freedom (n − 2) ≥ 20,
• then at the 5% level of significance (95% confidence level),
the null hypothesis β2 = 0 can be rejected if |t| > 2.

So, one need not consult the t table for the statistical
significance of the coefficient.
If we choose a different level of significance, then the
appropriate t value needs to be determined from the table.
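The rule of thumb can be checked against exact critical values (a short illustrative sketch):

```python
# Two-sided 5% critical values of the t-distribution: close to 2 for df >= 20,
# approaching the standard-normal cutoff 1.96 as df grows.
from scipy import stats

for df in (10, 20, 30, 60, 120):
    print(df, round(stats.t.ppf(0.975, df), 3))
```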
Interpret the
a) ANOVA
b) Goodness of Fit of the Model
c) t-statistics
Power of the test
Level of Significance
Type I error: probability of rejecting a true null hypothesis
: Pr(Reject H0 | H0 is true)
= α (the significance level)
Type II error: probability of not rejecting a false null hypothesis
: Pr(Not reject H0 | H0 is false)

POWER OF THE TEST = Pr(Reject H0 | H0 is false)
= 1 − Prob(Type II error)

Ideally, we would like small probabilities of both Type I and Type II
errors, but there is a trade-off between making a Type I error and
making a Type II error.
Level of Significance
Classical statistics concentrates on the Type I error,
treating it as more costly than a Type II error.
We seek to minimize the Type I error and maximize the “Power
of the Test”.

The probability of a Type I error is commonly fixed at the 1%, 5%, or 10%
level, corresponding to 99%, 95%, and 90% confidence intervals.
A simulation sketch of both error rates follows below.
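A minimal simulation of the Type I error rate and the power of a test at α = 0.05 (illustrative assumptions throughout: a one-sample t-test, normal data, and the parameter values below):

```python
# Type I error rate and power of a one-sample t-test, estimated by simulation.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
alpha, n, reps = 0.05, 30, 5000

def rejection_rate(true_mean):
    """Share of simulated samples in which H0: mean = 0 is rejected."""
    rejections = 0
    for _ in range(reps):
        sample = rng.normal(true_mean, 1.0, size=n)
        _, p = stats.ttest_1samp(sample, popmean=0.0)
        rejections += p < alpha
    return rejections / reps

print(f"Type I error rate (H0 true): {rejection_rate(0.0):.3f}")  # ~0.05 = alpha
print(f"Power (true mean = 0.5):     {rejection_rate(0.5):.3f}")  # = 1 - Pr(Type II)
```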
The p-value

Once the t statistic is obtained from the data, we refer to the
statistical table to find the p-value: the exact probability of
committing a Type I error (the observed level of significance).

We ideally want the p-value to be below the chosen level of
significance (1%, 5%, or 10%).

For a given sample size, as |t| increases, the p-value
decreases, and we can reject the null hypothesis with
increasing confidence.
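A short sketch of this relationship (the t values are illustrative; df = 16 matches the residual degrees of freedom in the earlier ANOVA example):

```python
# Two-sided p-values shrink as |t| grows, for fixed degrees of freedom.
from scipy import stats

df = 16
for t in (1.0, 2.0, 3.0, 5.0):
    p = 2 * stats.t.sf(t, df)          # two-sided p-value
    print(f"|t| = {t}: p = {p:.4f}")
```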
Interpret the significance of
a) x1
b) x2
c) x3
d) constant
