
Outline

Introduction
Confidence interval and Hypothesis testing using slope
Tests based on correlation
Exercises

Chapter 12 - Lecture 2
Inferences about regression coefficient

Andreas Artemiou

April 19th, 2010

Andreas Artemiou Chapter 12 - Lecture 2 Inferences about regression coefficient



Introduction
  Review
  Facts about slope
Confidence interval and Hypothesis testing using slope
  Test Statistic
  Confidence interval
  Hypothesis testing
  Test using ANOVA Table
  Example
Tests based on correlation
  Review
  Estimation
  Tests
  Example
  Other inferences concerning $\rho$
Exercises

Review

- In previous lectures we saw that the regression coefficient $\beta_1$ is a parameter that can be estimated from a sample.
- In previous chapters we saw that a sample can be used to make statistical inferences about a parameter.
- It follows that we can use the fitted regression line to make inferences about the regression slope, which is the subject of this lecture.


Slope facts

- $E(\hat\beta_1) = \beta_1$
- $\mathrm{Var}(\hat\beta_1) = \dfrac{\sigma^2}{S_{xx}}$
- What can we say about the distribution of $\hat\beta_1$ when $n$ is large?
- Using this fact, we can build a test statistic to make inferences about the slope of the regression line. What test statistic can we use?
- What is a problem with the test statistic above?
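
The variance fact on this slide can be verified in a short derivation. The sketch below assumes the usual simple linear regression setup (fixed $x_i$, independent $Y_i$ with common variance $\sigma^2$) and uses $\sum_i (x_i - \bar x) = 0$ to drop $\bar Y$ from $S_{xy}$; these assumptions are stated here for the derivation and are not spelled out on the slide.

```latex
\[
\hat\beta_1 = \frac{S_{xy}}{S_{xx}}
            = \frac{\sum_{i=1}^{n} (x_i - \bar x)\, Y_i}{S_{xx}},
\qquad
\mathrm{Var}(\hat\beta_1)
  = \frac{\sum_{i=1}^{n} (x_i - \bar x)^2 \, \sigma^2}{S_{xx}^2}
  = \frac{\sigma^2 \, S_{xx}}{S_{xx}^2}
  = \frac{\sigma^2}{S_{xx}} .
\]
```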


Test statistic

- So the test statistic will be the following:

  $$T = \frac{\hat\beta_1 - \beta_1}{S/\sqrt{S_{xx}}} = \frac{\hat\beta_1 - \beta_1}{S_{\hat\beta_1}}$$

- Can you find the distribution of the above?


Constructing a confidence interval

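- Starting from the fact that:

  $$P\left(-t_{n-2,\alpha/2} < \frac{\hat\beta_1 - \beta_1}{S_{\hat\beta_1}} < t_{n-2,\alpha/2}\right) = 1 - \alpha$$

- We get the following $(1-\alpha)100\%$ confidence interval for $\beta_1$:

  $$\hat\beta_1 \pm t_{n-2,\alpha/2}\, S_{\hat\beta_1}$$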
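
As a rough illustration of the interval $\hat\beta_1 \pm t_{n-2,\alpha/2}\, S_{\hat\beta_1}$, here is a minimal Python sketch. The function name `slope_confidence_interval` is hypothetical, the critical value is passed in from a t table so that only the standard library is needed, and the data are the five midterm scores from this lecture's example.

```python
import math


def slope_confidence_interval(x, y, t_crit):
    """Return (lower, upper) for beta_1_hat +/- t_crit * S_beta1_hat.

    t_crit is the table value t_{n-2, alpha/2}; it is supplied by the
    caller (assumption: read from a t table) rather than computed here.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = s_xy / s_xx                  # beta_1_hat
    sse = s_yy - slope * s_xy            # error sum of squares
    s = math.sqrt(sse / (n - 2))         # estimate of sigma
    se_slope = s / math.sqrt(s_xx)       # S_beta1_hat

    half_width = t_crit * se_slope
    return slope - half_width, slope + half_width


midterm1 = [50, 70, 75, 80, 95]
midterm2 = [40, 65, 95, 90, 100]
# t_{3, 0.025} = 3.182 is taken from a standard t table (n - 2 = 3, 95% level).
lower, upper = slope_confidence_interval(midterm1, midterm2, t_crit=3.182)
print(f"95% CI for the slope: ({lower:.3f}, {upper:.3f})")
```

Passing the critical value as an argument keeps the sketch dependency-free; with a statistics library available, the quantile could be computed instead of looked up.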


Hypothesis test

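- Null Hypothesis: $H_0: \beta_1 = \beta_{10}$
- Test statistic: $t = \dfrac{\hat\beta_1 - \beta_{10}}{s_{\hat\beta_1}} \sim t_{n-2}$
- Rejection Regions:
  - $t \ge t_{n-2,\alpha}$ if $H_A: \beta_1 > \beta_{10}$
  - $t \le -t_{n-2,\alpha}$ if $H_A: \beta_1 < \beta_{10}$
  - $t \ge t_{n-2,\alpha/2}$ or $t \le -t_{n-2,\alpha/2}$ if $H_A: \beta_1 \ne \beta_{10}$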
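
The three rejection regions can be written as a small decision function. This is only a sketch: the name `rejects_h0` and the alternative labels are hypothetical, the critical value is assumed to come from a t table, and the statistic 2.9 used in the usage lines is a made-up value, not a result from the lecture.

```python
def rejects_h0(t_stat, t_crit, alternative):
    """Apply the rejection regions for H0: beta_1 = beta_10.

    t_crit must be t_{n-2, alpha} for the one-sided alternatives
    ("greater", "less") and t_{n-2, alpha/2} for "two-sided".
    """
    if alternative == "greater":
        return t_stat >= t_crit
    if alternative == "less":
        return t_stat <= -t_crit
    if alternative == "two-sided":
        return abs(t_stat) >= t_crit
    raise ValueError(f"unknown alternative: {alternative!r}")


# Made-up statistic with n - 2 = 3 degrees of freedom and alpha = 0.05:
# table values are t_{3, 0.05} = 2.353 and t_{3, 0.025} = 3.182.
print(rejects_h0(2.9, 2.353, "greater"))
print(rejects_h0(2.9, 3.182, "two-sided"))
```

The same statistic can lead to different decisions under a one-sided and a two-sided alternative, because the two-sided test splits $\alpha$ between both tails.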


Hypothesis test using ANOVA


- In Chapter 6 we saw that if a random variable $U \sim t_v$, then $U^2 \sim F_{1,v}$.
- Last lecture, I showed how SSR and SSE can be used to construct an ANOVA table.
- The F test statistic in that table (see also the next slide) is the square of a special case of the t statistic from the previous slide.
- So the ANOVA table gives another way to carry out the test, but only when $\beta_{10} = 0$, that is, when the null hypothesis is $H_0: \beta_1 = 0$.
- The case $\beta_{10} = 0$ is considered the most useful test and is also called the model utility test. Why do you think that case is of extreme importance?
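
The link between the two tests can be made explicit. The sketch below assumes the standard identity $SSR = \hat\beta_1^2 S_{xx}$ for simple linear regression and writes $s^2 = SSE/(n-2)$, as in the ANOVA table; with $\beta_{10} = 0$ the squared t statistic is exactly the F statistic.

```latex
\[
T^2 = \left( \frac{\hat\beta_1 - 0}{s / \sqrt{S_{xx}}} \right)^{2}
    = \frac{\hat\beta_1^{2}\, S_{xx}}{s^{2}}
    = \frac{SSR}{s^{2}}
    = F ,
\qquad
T \sim t_{n-2} \;\Longrightarrow\; F = T^{2} \sim F_{1,\,n-2}.
\]
```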

ANOVA Table

Table: ANOVA table

Source of                Sum of      Mean
variation     df         Squares     Squares               F
Regression    1          SSR         SSR                   SSR/s^2
Error         n - 2      SSE         s^2 = SSE/(n - 2)
Total         n - 1      SSTo
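
To see the table filled in with numbers, here is a minimal Python sketch that computes each entry from raw data. The function name `anova_table` and the dictionary layout are hypothetical choices for illustration; the data are the midterm scores from this lecture's example.

```python
def anova_table(x, y):
    """Return the simple-linear-regression ANOVA rows as a dict."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    ssto = s_yy                  # total sum of squares
    ssr = s_xy ** 2 / s_xx       # regression sum of squares
    sse = ssto - ssr             # error sum of squares
    s2 = sse / (n - 2)           # mean square error, s^2
    f_stat = ssr / s2            # F = SSR / s^2

    return {
        "Regression": {"df": 1, "SS": ssr, "MS": ssr, "F": f_stat},
        "Error": {"df": n - 2, "SS": sse, "MS": s2},
        "Total": {"df": n - 1, "SS": ssto},
    }


table = anova_table([50, 70, 75, 80, 95], [40, 65, 95, 90, 100])
for source, row in table.items():
    cells = ", ".join(
        f"{key}={value:.3f}" if isinstance(value, float) else f"{key}={value}"
        for key, value in row.items()
    )
    print(f"{source}: {cells}")
```

The F value is then compared with the critical value of the $F_{1,n-2}$ distribution at the chosen significance level.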


Example

- I want to find the regression line that relates the scores on the two midterms in Stat 319. I randomly select five students. Their Midterm 1 scores are 50, 70, 75, 80, 95 and, in the same order, their Midterm 2 scores are 40, 65, 95, 90, 100.
- Find a 95% confidence interval for the regression slope.
- Use a t-test to see whether there is a relationship between the scores of the two midterms at significance level 0.02.
- Use an F-test to see whether there is a relationship between the two scores at significance level 0.02.


Correlation between two random variables

- In Stat 318, we defined the correlation coefficient as a measure of how strongly two random variables $X$ and $Y$ are related. The formula was:

  $$\rho = \rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}$$

- $\rho$ takes values between -1 and 1.
- The closer the value is to 1, the stronger the positive relationship. The closer the value is to -1, the stronger the negative relationship. The closer it is to 0, the weaker the relationship.


Estimating correlation from a sample


- Let's assume we want to know the correlation between the height and weight of male students at PSU. To compute it exactly we would need to ask all 25000 male students their height and weight, find the covariance of the two random variables and their variances, and calculate the correlation.
- It is much easier to take a sample and estimate the correlation.
- That means that, as we learned in Chapter 5, $\rho$ is a population parameter.
- If we want to estimate it from a sample, the formula used is:

  $$\hat\rho = r = \frac{S_{xy}}{\sqrt{S_{xx}\, S_{yy}}}$$

- The square of this estimator, $r^2$, is equal to the Coefficient of Determination we saw last lecture; $r$ itself has the same sign as $S_{xy}$, and hence as the fitted slope.
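
A short Python sketch of the estimator $r = S_{xy}/\sqrt{S_{xx} S_{yy}}$ follows. The helper name `sample_correlation` is hypothetical, and the data are the midterm scores from this lecture's examples; the last lines compare $r^2$ with $SSR/SSTo$, using $SSR = S_{xy}^2/S_{xx}$ and $SSTo = S_{yy}$.

```python
import math


def sample_correlation(x, y):
    """Return r = S_xy / sqrt(S_xx * S_yy) for paired samples x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy / math.sqrt(s_xx * s_yy)


midterm1 = [50, 70, 75, 80, 95]
midterm2 = [40, 65, 95, 90, 100]
r = sample_correlation(midterm1, midterm2)
print(f"r = {r:.4f}, r^2 = {r ** 2:.4f}")

# Coefficient of determination computed directly as SSR / SSTo.
# The sums 1515, 1070 and 2530 are S_xy, S_xx and S_yy for this data set.
r_squared_from_anova = (1515 ** 2 / 1070) / 2530
print(f"SSR / SSTo = {r_squared_from_anova:.4f}")
```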

Hypothesis testing

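- The following test is valid only for testing the null hypothesis $H_0: \rho = 0$.
- Test statistic: $t = \dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}} \sim t_{n-2}$
- Rejection Regions:
  - $t \ge t_{n-2,\alpha}$ if $H_A: \rho > 0$
  - $t \le -t_{n-2,\alpha}$ if $H_A: \rho < 0$
  - $t \ge t_{n-2,\alpha/2}$ or $t \le -t_{n-2,\alpha/2}$ if $H_A: \rho \ne 0$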
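
It may help to see that this statistic is the same number as the slope t statistic for $H_0: \beta_1 = 0$. The derivation below is a sketch that assumes the simple linear regression identities $\hat\beta_1 = S_{xy}/S_{xx}$, $SSE = S_{yy}(1 - r^2)$ and $s^2 = SSE/(n-2)$.

```latex
\[
\frac{\hat\beta_1}{s / \sqrt{S_{xx}}}
  = \frac{S_{xy}}{\sqrt{S_{xx}}\; s}
  = \frac{S_{xy}\,\sqrt{n-2}}{\sqrt{S_{xx}}\,\sqrt{S_{yy}\,(1 - r^{2})}}
  = \frac{r\,\sqrt{n-2}}{\sqrt{1 - r^{2}}} .
\]
```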


Example

- I want to find the regression line that relates the scores on the two midterms in Stat 319. I randomly select five students. Their Midterm 1 scores are 50, 70, 75, 80, 95 and, in the same order, their Midterm 2 scores are 40, 65, 95, 90, 100.
- Perform a hypothesis test to see whether there is significant evidence of a positive relationship between the two scores at significance level 0.05.


Extending the test to more cases

- The last test we saw about $\rho$ can be used only for the null hypothesis $H_0: \rho = 0$.
- What happens if we want to test the null hypothesis $H_0: \rho = \rho_0$ when $\rho_0 \ne 0$?
- We will use the Fisher transformation and the random variable:

  $$V = \frac{1}{2}\log\left(\frac{1+R}{1-R}\right)$$


Distribution

- The random variable $V$ defined on the previous slide approximately follows a normal distribution:

  $$V \approx N\left(\mu_V = \frac{1}{2}\log\left(\frac{1+\rho}{1-\rho}\right),\; \sigma_V^2 = \frac{1}{n-3}\right)$$


Hypothesis testing

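- Null hypothesis: $H_0: \rho = \rho_0$
- Test statistic (approximately standard normal under $H_0$):

  $$z = \frac{\frac{1}{2}\log\left(\frac{1+r}{1-r}\right) - \frac{1}{2}\log\left(\frac{1+\rho_0}{1-\rho_0}\right)}{1/\sqrt{n-3}} \sim N(0, 1)$$

- Rejection Regions:
  - $z \ge z_{\alpha}$ if $H_A: \rho > \rho_0$
  - $z \le -z_{\alpha}$ if $H_A: \rho < \rho_0$
  - $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$ if $H_A: \rho \ne \rho_0$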
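
The two-sided version of this test can be sketched in Python using only the standard library. The function name `fisher_z_test` is hypothetical, and the inputs in the usage lines ($r = 0.8$, $n = 28$, $\rho_0 = 0.5$) are made-up values chosen for illustration, not data from the lecture.

```python
import math
from statistics import NormalDist


def fisher_z_test(r, n, rho0, alpha=0.05):
    """Two-sided test of H0: rho = rho0 via the Fisher transformation.

    Returns (z, z_crit, reject), where z_crit is z_{alpha/2}.
    """
    v = 0.5 * math.log((1 + r) / (1 - r))             # observed V
    mu0 = 0.5 * math.log((1 + rho0) / (1 - rho0))     # mean of V under H0
    z = (v - mu0) / (1 / math.sqrt(n - 3))            # standardized statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)      # standard normal quantile
    return z, z_crit, abs(z) >= z_crit


# Made-up inputs for illustration only.
z, z_crit, reject = fisher_z_test(r=0.8, n=28, rho0=0.5, alpha=0.05)
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

For a one-sided alternative, the comparison would use $z_\alpha$ instead of $z_{\alpha/2}$ and only one tail.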


Confidence interval for $\mu_V$

- Based on the previous results it is easy to construct a confidence interval for $\mu_V$.
- A $(1-\alpha)100\%$ confidence interval for $\mu_V$ is given by:

  $$V \pm \frac{z_{\alpha/2}}{\sqrt{n-3}}$$


Confidence interval for $\rho$

- Our objective is not to create a confidence interval for $\mu_V$. Our objective is to create a confidence interval for $\rho$.
- A $(1-\alpha)100\%$ confidence interval for $\rho$ is given by:

  $$\left(\frac{e^{2c_1} - 1}{e^{2c_1} + 1},\; \frac{e^{2c_2} - 1}{e^{2c_2} + 1}\right)$$

  - $c_1$ is the lower endpoint of the interval for $\mu_V$
  - $c_2$ is the upper endpoint of the interval for $\mu_V$
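
The two steps (interval on the transformed scale, then back-transformation) can be sketched as follows. The function name `correlation_confidence_interval` is hypothetical and the inputs $r = 0.8$, $n = 28$ are made-up values for illustration. Note that $(e^{2c} - 1)/(e^{2c} + 1)$ is the hyperbolic tangent of $c$, so the back-transformation is the inverse of the Fisher transformation.

```python
import math
from statistics import NormalDist


def correlation_confidence_interval(r, n, confidence=0.95):
    """Return (lower, upper) for rho using the Fisher transformation."""
    v = 0.5 * math.log((1 + r) / (1 - r))                    # observed V
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    half_width = z_crit / math.sqrt(n - 3)
    c1, c2 = v - half_width, v + half_width                  # endpoints on the V scale

    def back_transform(c):
        # (e^{2c} - 1) / (e^{2c} + 1), which equals tanh(c)
        return (math.exp(2 * c) - 1) / (math.exp(2 * c) + 1)

    return back_transform(c1), back_transform(c2)


# Made-up inputs for illustration only.
lower, upper = correlation_confidence_interval(r=0.8, n=28, confidence=0.95)
print(f"95% CI for rho: ({lower:.3f}, {upper:.3f})")
```

Because the transformation is nonlinear, the resulting interval is generally not symmetric around $r$.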


Example

- I want to find the regression line that relates the scores on the two midterms in Stat 319. I randomly select five students. Their Midterm 1 scores are 50, 70, 75, 80, 95 and, in the same order, their Midterm 2 scores are 40, 65, 95, 90, 100.
- Perform a hypothesis test at significance level 0.05 to see whether there is significant evidence that the correlation coefficient is different from 0.5.
- Find a 99% confidence interval for $\rho$.


Exercises

- Section 12.3, page 609
  - Exercises 31, 32, 33, 34, 35, 36, 37, 38, 41
- Section 12.5, page 623
  - Exercises 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67
