
Simple Linear Regression

Prepared by:
Prof. Dr Bahaman Abu Samah
Department of Professional Development and Continuing Education
Faculty of Educational Studies
Universiti Putra Malaysia
Serdang
Introduction
– Simple linear regression is an extension of the Pearson
product-moment correlation
– Purpose
1. Determine the relationship between two variables, generally
between an independent variable (IV) and a dependent variable (DV)
2. Predict the DV from the IV
Requirement
– Scales of measurement for variables:
   DV: interval or ratio
   IV: interval or ratio
– Ex: Regression between age (X) and CGPA (Y)
   Age: ratio
   CGPA: interval
– A non-metric (categorical) IV must first be transformed into
dummy variables (coded 0 and 1)
– Number of dummy variables equals k − 1, where k is the number
of categories
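The k − 1 rule for dummy variables can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the function name `dummy_code`, the choice of reference category, and the sample values are all hypothetical.

```python
def dummy_code(values, reference):
    """Return k - 1 dummy (0/1) columns for a categorical variable.

    The reference category gets no column of its own; it is the case
    where every dummy variable equals 0.
    """
    categories = sorted(set(values))
    non_reference = [c for c in categories if c != reference]
    return {c: [1 if v == c else 0 for v in values] for c in non_reference}


# Hypothetical categorical IV with k = 2 categories -> 1 dummy variable.
gender = ["male", "female", "female", "male"]
dummies = dummy_code(gender, reference="male")
print(dummies)
```

With three categories the same function returns two columns, which is the k − 1 count stated on the slide.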
Assumptions
To apply regression analysis:
1. The independent and dependent variables are bivariately
normally distributed in the population
2. The cases represent a random sample from the population
What to Expect
– Derive the regression/prediction equation
– R² and R
– Hypothesis testing
   Regression model
   Slope
Components of Simple Linear Regression
Descriptive
– Calculate b1 and b0
– Prediction equation: $\hat{Y} = b_0 + b_1 X$
– R and R²
Inferential
– Hypothesis test: regression model
– Hypothesis test: slope
Descriptive
Basis for Best Fit Line
– Use the least squares method to identify the line
– The line is called the least squares regression line
– This method minimizes SSE (the error sum of squares)
[Figure: a plot of paired observations of X (average assignment scores) and Y (test scores), asking "Which one is the best-fit line?"]
Least Squares Method
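The line that minimizes the sum of squared differences
between the observed points and the line
[Figure: scatter plot of Y against X with a fitted line; points above the line are marked + and points below are marked −]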
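The least squares idea can be checked numerically. The sketch below uses hypothetical data and the standard closed-form least squares solution, then compares its SSE with a few arbitrarily chosen alternative lines; the helper names `sse` and `least_squares` are illustrative only.

```python
def sse(xs, ys, b0, b1):
    """Sum of squared differences between observed Y and the line b0 + b1*X."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


def least_squares(xs, ys):
    """Closed-form least squares intercept and slope (deviation form)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ssx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / ssx
    return mean_y - b1 * mean_x, b1


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 6]
b0, b1 = least_squares(xs, ys)
best = sse(xs, ys, b0, b1)

# Arbitrary alternative lines: each should give a larger SSE.
candidates = [(b0 + 0.5, b1), (b0, b1 + 0.2), (b0 - 1.0, b1 - 0.1)]
other = [sse(xs, ys, c0, c1) for c0, c1 in candidates]
print(best, other)
```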
Derive Prediction Equation
– Calculate b1 and b0

$$b_1 = \frac{SXY}{SSX} = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{n}}{\sum X^2 - \dfrac{(\sum X)^2}{n}}$$

$$b_0 = \bar{y} - b_1 \bar{x}$$

– Prediction equation

$$\hat{Y} = b_0 + b_1 X$$
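The raw-sum formulas for SXY, SSX, b1 and b0 translate directly into a few lines of code. The sketch below is one possible transcription; the function name `fit_line` and the four data points are hypothetical.

```python
def fit_line(xs, ys):
    """Return (b0, b1) using the raw-sum formulas for SXY and SSX."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    sxy = sum_xy - sum_x * sum_y / n   # SXY
    ssx = sum_x2 - sum_x ** 2 / n      # SSX
    b1 = sxy / ssx                     # slope
    b0 = sum_y / n - b1 * sum_x / n    # intercept: y-bar minus b1 * x-bar
    return b0, b1


# Hypothetical data, for illustration only.
b0, b1 = fit_line([1, 2, 3, 4], [2, 4, 5, 8])
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```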
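Prediction Equation
$$\hat{Y} = b_0 + b_1 X$$

Ŷ    Predicted value of Y
b0   Y-intercept
b1   Slope (regression coefficient), b1 = ΔY/ΔX
[Figure: regression line of Y against X, showing the intercept b0 and the slope b1 as the ratio of ΔY to ΔX]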
Interpreting Prediction Equation
– A prediction equation indicates the influence of the IV on
the DV. For example:
$$\hat{Y} = 1.07 + .75X$$
– For every one (1) unit increase in X, predicted Y increases
by .75 unit
– The amount and direction of the change in Y are given by b1
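As a quick check of this interpretation, the sketch below evaluates the slide's example equation (intercept 1.07, slope .75) at two X values one unit apart. The X values 4 and 5 are arbitrary choices for illustration.

```python
def predict(x, b0=1.07, b1=0.75):
    """Predicted Y from the example equation Y-hat = 1.07 + .75X."""
    return b0 + b1 * x


# Difference in predicted Y for a one-unit increase in X (4 -> 5).
change = predict(5) - predict(4)
print(round(change, 2))
```

The difference equals the slope regardless of which starting X is chosen, which is why b1 alone determines the change in predicted Y.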
Inferential

Hypothesis Test: Regression Model
What to Expect?
– State HO and HA
– Test: calculate the F-ratio and determine the critical value
– Decision
– Conclusion
– Effect size (f²)
Manual decision criteria
Criteria             Decision
Fcal ≥ Fcritical     Reject HO
Fcal < Fcritical     Fail to reject HO
5-Step Hypothesis Test
1. State HO and HA
2. Calculate the F-ratio
3. Determine the critical value (Fcritical)
4. Decision
5. Conclusion
Step 1: State HO & HA
HO: Y = β0 + ei
HA: Y = β0 + β1X + ei
[Figure: plot of Y against X contrasting the line under HO (Y = β0 + ei) with the line under HA (Y = β0 + β1X + ei)]
Step 2: Calculate Test Statistics
1. Calculate sum of squares
Total sum of squares:        $SST = \sum Y^2 - \dfrac{(\sum Y)^2}{n}$
Regression sum of squares:   $SSR = \dfrac{(SXY)^2}{SSX}$
Error sum of squares:        $SSE = SST - SSR$
2. Determine degrees of freedom
Regression        p
Error/Residual    n − p − 1
Total             n − 1

p   Number of independent variables
n   Sample size
Summary ANOVA Table
Source           SS    df          MS    F
Regression       SSR   p           MSR   F
Error/Residual   SSE   n − p − 1   MSE
Total            SST   n − 1
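The table leaves the MS and F cells symbolic. Assuming the usual ANOVA convention that each mean square is its sum of squares divided by its degrees of freedom (which agrees with the worked example later in this deck), the cells can be written as:

```latex
MSR = \frac{SSR}{p}, \qquad
MSE = \frac{SSE}{n - p - 1}, \qquad
F = \frac{MSR}{MSE}
```

With a single independent variable (p = 1), the error degrees of freedom reduce to n − 2.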
Step 3: Critical Value
$$F_{critical} = F_{p,\; n-p-1}(\alpha)$$
where p is the numerator (regression) df and n − p − 1 is the
denominator (error) df
Step 4: Decision
– Only two (2) possible decisions: reject HO or fail to reject HO
Manual:
Criteria             Decision
Fcal ≥ Fcritical     Reject HO
Fcal < Fcritical     Fail to reject HO
SPSS:
Criteria             Decision
sig-F ≤ α            Reject HO
sig-F > α            Fail to reject HO
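Both decision rules are simple comparisons, as the sketch below shows. The function names are hypothetical; the sample values (F = 23.757, critical value 5.32, sig-F = .001, α = .05) are the ones used in this deck's worked example.

```python
def decide_manual(f_cal, f_critical):
    """Manual rule: compare the calculated F-ratio with the critical value."""
    return "Reject HO" if f_cal >= f_critical else "Fail to reject HO"


def decide_spss(sig, alpha=0.05):
    """SPSS rule: compare the reported significance (p-value) with alpha."""
    return "Reject HO" if sig <= alpha else "Fail to reject HO"


print(decide_manual(23.757, 5.32))
print(decide_spss(0.001, alpha=0.05))
```

The two rules are expected to agree, because sig-F ≤ α corresponds to Fcal reaching the critical value for that α.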
Step 5: Conclusion
Reject HO
   The regression model fits the data at the α level of significance
Fail to reject HO
   The regression model does not fit the data at the α level of
   significance
Additional Analysis
Model Summary
Coefficient of determination, R²
$$R^2 = \frac{SSR}{SST}$$
– Amount of variance in Y explained by X
– Ranges: 0 ≤ R² ≤ 1

Multiple correlation coefficient, R
$$R = \sqrt{R^2} = \sqrt{\frac{b_1\,(SXY)}{SSY}}$$
– Relationship between X and Y
– Ranges: 0 ≤ R ≤ 1
– SSY is the total sum of squares of Y (the same quantity as SST)
Cohen’s Effect Size, f²
$$f^2 = \frac{R^2}{1 - R^2}$$
– R² is the amount of variance in Y explained by X (0 ≤ R² ≤ 1)

f²     Effect size
.02    Small
.15    Medium
.35    Large

Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Hillsdale,
NJ: Lawrence Erlbaum.
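The effect size calculation can be sketched as below, using R² = .748 from the worked example later in this deck. Two assumptions are made that the slide does not state: each benchmark is treated as the lower bound of its category, and values below .02 are labelled "negligible".

```python
def cohen_f2(r_squared):
    """Cohen's f-squared from the coefficient of determination."""
    return r_squared / (1 - r_squared)


def effect_label(f2):
    """Label f2 with Cohen's benchmarks (assumed to be lower bounds)."""
    if f2 >= 0.35:
        return "large"
    if f2 >= 0.15:
        return "medium"
    if f2 >= 0.02:
        return "small"
    return "negligible"  # label assumed here; not part of the slide


f2 = cohen_f2(0.748)
print(round(f2, 3), effect_label(f2))
```

Unlike R², f² is not bounded by 1; it grows without limit as R² approaches 1.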
Hypothesis Test: Slope
What to Expect?
– State HO and HA
– Test: calculate the t-value and determine the critical value
– Decision
– Conclusion
Manual decision criteria
Criteria             Decision
tcal ≥ tcritical     Reject HO
tcal < tcritical     Fail to reject HO
5-Step Hypothesis Test
1. State HO and HA
2. Calculate the test statistic (t)
3. Determine the critical value
4. Decision
5. Conclusion
Step 1: State HO & HA
HO: β1 = 0
HA: β1 ≠ 0
[Figure: plot of Y against X contrasting the line under HO (β1 = 0) with the line under HA (β1 ≠ 0)]
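Step 2: Calculate Test Statistics
$$t = \frac{b_1 - \beta_1}{\sqrt{\dfrac{MSE}{SSX}}}$$
where β1 is the value stated in HO (0) and MSE is taken from the
summary ANOVA table

Summary ANOVA Table
Source       SS    df          MS    F
Regression   SSR   p           MSR   MSR/MSE
Error        SSE   n − p − 1   MSE
Total        SST   n − 1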
Step 3: Critical Value
The critical value:
– df = n − 2
– One-tailed: $t_{\alpha,\; df}$
– Two-tailed: $t_{\alpha/2,\; df}$
Step 4: Decision
– Only two (2) possible decisions: reject HO or fail to reject HO
Manual (two-tailed test, comparing absolute values):
Criteria                 Decision
|tcal| ≥ |tcritical|     Reject HO
|tcal| < |tcritical|     Fail to reject HO
SPSS:
Criteria                 Decision
sig-t ≤ α                Reject HO
sig-t > α                Fail to reject HO
Step 5: Conclusion

Reject HO
   X contributes significantly towards Y
Fail to reject HO
   X does not contribute significantly towards Y
Effect Size
$$f^2 = \frac{R^2}{1 - R^2}$$

f²     Effect size
.02    Small
.15    Medium
.35    Large
Example/Exercise
Data were collected from a randomly selected sample to determine the
relationship between average assignment scores and test scores in
statistics. The data are presented in the table below.

1. Calculate b1 and b0 and derive the prediction equation
2. Test the hypothesis for the regression model at α = .05
3. Calculate the coefficient of determination and the multiple
correlation coefficient. Interpret the two values.
4. Test the hypothesis for the slope at the .05 level of significance.

Data set (scores):
ID   Assign   Test
1    8.5      88
2    6        66
3    9        94
4    10       98
5    8        87
6    7        72
7    5        45
8    6        63
9    7.5      85
10   5        77
Data: 5950 SL Regression 1 Class
1. Derive Regression/Prediction equation
X = average assignment score, Y = test score (data set as listed in
the exercise)

Summary stat:
n      10
ΣX     72
ΣY     775
ΣX²    544.5
ΣY²    62,441
ΣXY    5,795.5

$$b_1 = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{n}}{\sum X^2 - \dfrac{(\sum X)^2}{n}} = \frac{5{,}795.5 - \dfrac{(72)(775)}{10}}{544.5 - \dfrac{(72)^2}{10}} = \frac{215.5}{26.1} = 8.257$$

$$\bar{y} = \frac{\sum Y}{n} = \frac{775}{10} = 77.5 \qquad \bar{x} = \frac{\sum X}{n} = \frac{72}{10} = 7.2$$

$$b_0 = \bar{y} - b_1 \bar{x} = 77.5 - 8.257\,(7.2) = 18.050$$

Prediction equation:
$$\hat{y} = 18.05 + 8.257x$$
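The hand calculation can be reproduced from the raw data set as a check. The sketch below uses only the ten (Assign, Test) pairs from the exercise; variable names are illustrative. Carrying b1 at full precision gives an intercept of about 18.052, which matches the SPSS coefficient table, whereas the hand calculation with b1 rounded to 8.257 gives 18.050.

```python
assign = [8.5, 6, 9, 10, 8, 7, 5, 6, 7.5, 5]      # X: average assignment scores
test = [88, 66, 94, 98, 87, 72, 45, 63, 85, 77]   # Y: test scores

n = len(assign)
sum_x, sum_y = sum(assign), sum(test)
sum_x2 = sum(x * x for x in assign)
sum_xy = sum(x * y for x, y in zip(assign, test))

sxy = sum_xy - sum_x * sum_y / n   # SXY
ssx = sum_x2 - sum_x ** 2 / n      # SSX
b1 = sxy / ssx                     # slope
b0 = sum_y / n - b1 * sum_x / n    # intercept

print(round(sxy, 1), round(ssx, 1), round(b1, 3), round(b0, 3))
```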
Interpretation of the regression equation
$$\hat{y} = 18.05 + 8.257x$$

For every 1 unit increase in average assignment score (X), the
predicted test score (Y) increases by 8.257 units
[Figure: regression line of Y against X with intercept 18.05 and slope 8.257 = ΔY/ΔX]
2. Hypothesis test – Regression model
a. Hypotheses
HO: Y = β0 + ei
HA: Y = β0 + β1X1 + ei
b. Calculate test statistic
– Sum of squares (using the summary statistics n = 10, ΣY = 775,
ΣY² = 62,441, SXY = 215.5, SSX = 26.1)

$$SST = \sum Y^2 - \frac{(\sum Y)^2}{n} = 62{,}441 - \frac{775^2}{10} = 62{,}441 - 60{,}062.5 = 2{,}378.5$$

$$SSR = \frac{(SXY)^2}{SSX} = \frac{215.5^2}{26.1} = 1{,}779.320$$

$$SSE = SST - SSR = 2{,}378.5 - 1{,}779.320 = 599.180$$

– Summary ANOVA table
Source       SS          df   MS          F
Regression   1,779.320   1    1,779.320   23.757
Error        599.180     8    74.898
Total        2,378.500   9
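The ANOVA table entries can be recomputed from the raw data as a check. The sketch below assumes each mean square is its sum of squares divided by its degrees of freedom, which is what the table values imply; variable names are illustrative.

```python
x = [8.5, 6, 9, 10, 8, 7, 5, 6, 7.5, 5]        # average assignment scores
y = [88, 66, 94, 98, 87, 72, 45, 63, 85, 77]   # test scores

n, p = len(x), 1   # sample size, number of independent variables

sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
ssx = sum(a * a for a in x) - sum(x) ** 2 / n

sst = sum(b * b for b in y) - sum(y) ** 2 / n   # total sum of squares
ssr = sxy ** 2 / ssx                            # regression sum of squares
sse = sst - ssr                                 # error sum of squares

msr = ssr / p              # regression mean square, df = p
mse = sse / (n - p - 1)    # error mean square, df = n - p - 1
f_ratio = msr / mse

for source, ss, df in [("Regression", ssr, p),
                       ("Error", sse, n - p - 1),
                       ("Total", sst, n - 1)]:
    print(f"{source:<12}{ss:>12.3f}{df:>4}")
print(f"MSE = {mse:.3f}, F = {f_ratio:.3f}")
```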
c. Critical value
$$F_{1,\,8}(.05) = 5.32$$
Decision criteria
Criteria             Decision
Fcal ≥ Fcritical     Reject HO
Fcal < Fcritical     Fail to reject HO

d. Decision
Since Fcal (23.757) is greater than Fcritical (5.32), reject HO
e. Conclusion
The regression model fits the data at the .05 level of
significance
3. R² and R
Using SSR = 1,779.320 and SST = 2,378.5 from the summary ANOVA table:

$$R^2 = \frac{SSR}{SST} = \frac{1{,}779.320}{2{,}378.5} = .748$$

About 75% of the variance in test scores is explained by assignment
scores

$$R = \sqrt{R^2} = \sqrt{.748} = .865$$
OR
$$R = \sqrt{\frac{b_1\,(SXY)}{SSY}} = \sqrt{\frac{8.257\,(215.5)}{2{,}378.5}} = \sqrt{.748} = .865$$

There is a positive and high correlation between assignment
scores and test scores
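The two routes to R give the same value because SSR = (SXY)²/SSX = b1 · SXY. The sketch below checks this on the exercise data; variable names are illustrative.

```python
import math

x = [8.5, 6, 9, 10, 8, 7, 5, 6, 7.5, 5]        # average assignment scores
y = [88, 66, 94, 98, 87, 72, 45, 63, 85, 77]   # test scores
n = len(x)

sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
ssx = sum(a * a for a in x) - sum(x) ** 2 / n
ssy = sum(b * b for b in y) - sum(y) ** 2 / n   # same quantity as SST
b1 = sxy / ssx

r_squared = (sxy ** 2 / ssx) / ssy       # SSR / SST
r_from_r2 = math.sqrt(r_squared)         # R = sqrt(R^2)
r_from_b1 = math.sqrt(b1 * sxy / ssy)    # R = sqrt(b1 * SXY / SSY)

print(round(r_squared, 3), round(r_from_r2, 3), round(r_from_b1, 3))
```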
4. Hypothesis test – Slope
a. Hypotheses
HO: β1 = 0
HA: β1 ≠ 0
b. Calculate test statistic
Using MSE = 74.898 from the summary ANOVA table and SSX = 26.1:

$$t = \frac{b_1 - \beta_1}{\sqrt{\dfrac{MSE}{SSX}}} = \frac{8.257 - 0}{\sqrt{\dfrac{74.898}{26.1}}} = \frac{8.257}{1.694} = 4.874$$
c. Critical value
$$t_{.025,\,8} = 2.306$$
Decision criteria
Criteria                 Decision
|tcal| ≥ |tcritical|     Reject HO
|tcal| < |tcritical|     Fail to reject HO

d. Decision
Since |tcal| (4.874) is greater than |tcritical| (2.306), reject HO
e. Conclusion
Assignment scores contribute significantly towards test
scores at the .05 level of significance
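The slope test can be recomputed from the raw data. The sketch below also checks a known property of simple linear regression: with one independent variable, the squared t statistic for the slope equals the F-ratio of the regression model (4.874² ≈ 23.757), so the two tests lead to the same decision. Variable names are illustrative.

```python
import math

x = [8.5, 6, 9, 10, 8, 7, 5, 6, 7.5, 5]        # average assignment scores
y = [88, 66, 94, 98, 87, 72, 45, 63, 85, 77]   # test scores
n = len(x)

sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
ssx = sum(a * a for a in x) - sum(x) ** 2 / n
sst = sum(b * b for b in y) - sum(y) ** 2 / n

b1 = sxy / ssx
ssr = sxy ** 2 / ssx
mse = (sst - ssr) / (n - 2)     # error df = n - p - 1 with p = 1

se_b1 = math.sqrt(mse / ssx)    # standard error of the slope
t = (b1 - 0) / se_b1            # hypothesised slope under HO is 0
f_ratio = ssr / mse             # MSR / MSE with p = 1

print(round(se_b1, 3), round(t, 3), round(t ** 2, 3), round(f_ratio, 3))
```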
SPSS Computation
SPSS Procedure
SPSS Output Results
1. Regression Model
Step 1: State HO & HA
HO: Y = β0 + ei
HA: Y = β0 + β1X + ei
Step 2: Report Test Statistics
F = 23.757, sig-F = .001
Step 3: Set Alpha
α = .05
Step 4: Decision
Since sig-F (.001) < α (.05), reject HO

SPSS:
Reject HO: sig-F ≤ α
Fail to reject HO: sig-F > α
Step 5: Conclusion
Reject HO
The regression model fits the data at the .05
level of significance
Coefficient of determination, R²
R² = .748
– 74.8% of the variance in Y is explained by X
– Ranges: 0 ≤ R² ≤ 1

Multiple correlation coefficient, R
R = .865
– Positive and high relationship between X and Y
– Ranges: 0 ≤ R ≤ 1
2. Slope
Step 1: State HO & HA
HO: β1 = 0
HA: β1 ≠ 0

Step 2: Report Test Statistics
t = 4.874, sig-t = .001
Step 3: Set Alpha
α = .05
Step 4: Decision
Since sig-t (.001) < α (.05), reject HO

SPSS:
Reject HO: sig-t ≤ α
Fail to reject HO: sig-t > α
Step 5: Conclusion
Reject HO
Assignment scores contribute significantly
towards test scores at the .05 level of significance
APA REPORTING STYLE
A simple linear regression was calculated to predict
test score based on average assignment score.
A significant regression equation was found (F(1, 8) =
23.757, p = .001), with an R² of .748. Participants’
predicted test score is equal to 18.052 + 8.257
(average assignment score) when average assignment
score is measured in [unit of measure]. Test score
increased by 8.257 units for each one-unit increase in
average assignment score.
APA REPORTING STYLE
Table 1. Simple Linear Regression Analysis between Assignment
Scores and Test Scores

Model                 Unstandardized Coefficients   Standardized   t       p
                      B         Std. Error          Coefficients
(Constant)            18.052    12.50
Assignment scores     8.257     1.694               .865           4.874   .001

R = .865, R Square = .748, F = 23.757, p = .001
