Applied Statistics (D1074) : Simple Linear Regression and Correlation
Week 11-14
(1) Correlation Analysis
1.1 Definition
[Figure: diagram relating Variable 1 and Variable 2, with scatterplots of y against x]
Bina Nusantara University 5
[Figure: scatterplot of y against x showing no relationship]
[Figure: scatterplots of y against x for r = -1, r = -0.6, r = 0, r = +0.3 and r = +1]
\[ r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}} \]
r = Sample correlation coefficient
n = Sample size
x = first variable
y = second variable
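The definition of the sample correlation coefficient can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `pearson_r` and the four-point data set are hypothetical, chosen so that y is an exact linear function of x.

```python
import math

def pearson_r(x, y):
    """Sample correlation coefficient r_xy for paired samples x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator: sum of cross-products of deviations from the means
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator: square root of the product of the two sums of squares
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)  # undefined if x or y is constant

# Hypothetical data: perfectly increasing, then perfectly decreasing
x = [1, 2, 3, 4]
r_up = pearson_r(x, [2, 4, 6, 8])
r_down = pearson_r(x, [8, 6, 4, 2])
```

With these inputs `r_up` is +1 and `r_down` is -1, the two extremes of the coefficient.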
1.3 Testing a Correlation
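Test statistic (all three cases):
\[ t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} \]

Hypothesis                                          Critical Values (reject $H_0$ if)
$H_0: \rho \ge \rho_0$,  $H_1: \rho < \rho_0$        $t < -t_{\alpha,\,n-2}$
$H_0: \rho \le \rho_0$,  $H_1: \rho > \rho_0$        $t > t_{\alpha,\,n-2}$
$H_0: \rho = \rho_0$,  $H_1: \rho \ne \rho_0$        $t < -t_{\alpha/2,\,n-2}$ or $t > t_{\alpha/2,\,n-2}$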
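The test can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the critical value is taken from a t table rather than computed, and the statistic is used in its usual role of testing against a hypothesized correlation of zero. The inputs r = 0.95 and n = 10 are assumed values for the demonstration.

```python
import math

def correlation_t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("need n > 2")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

def reject_h0(t, t_crit, alternative):
    """Critical-value rule; t_crit is the positive table value
    (t_{alpha, n-2} for one-sided tests, t_{alpha/2, n-2} for two-sided)."""
    if alternative == "less":        # H1: rho < rho_0
        return t < -t_crit
    if alternative == "greater":     # H1: rho > rho_0
        return t > t_crit
    if alternative == "two-sided":   # H1: rho != rho_0
        return t < -t_crit or t > t_crit
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")

# Assumed inputs: r = 0.95, n = 10, and t_{0.025, 8} = 2.306 from a t table
t_stat = correlation_t_statistic(0.95, 10)
decision = reject_h0(t_stat, 2.306, "two-sided")
```

Here `t_stat` is roughly 8.6, well beyond the critical value, so the two-sided test rejects the null hypothesis.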
Example 1
\[ r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}} = \frac{2840}{\sqrt{(568)(15730)}} = 0.95 \]
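The arithmetic of Example 1 can be checked directly from the three sums quoted in the calculation. The variable names below are illustrative; only the numbers come from the example.

```python
import math

s_xy = 2840    # sum of (x_i - x_bar)(y_i - y_bar)
s_xx = 568     # sum of (x_i - x_bar)^2
s_yy = 15730   # sum of (y_i - y_bar)^2

r_xy = s_xy / math.sqrt(s_xx * s_yy)
print(round(r_xy, 2))  # 0.95
```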
(2) The Simple Linear Regression Model
2.1 Modeling
[Figure: diagram linking the independent variable to the dependent variable]
Example 2
Model :
Predict : 201.7
Impact : Yes
Where
x = independent variable
y = dependent variable
n = number of data
\[ y = \beta_0 + \beta_1 x + \varepsilon \]
[Figure: scatterplots of y against x showing linear relationships and nonlinear relationships]
2.6 Estimation
\[ y = \beta_0 + \beta_1 x + \varepsilon \]
\[ \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x \]
where
$\hat{y}$ = estimated (or predicted) y value
$\hat{\beta}_0$ = estimate of the regression intercept
$\hat{\beta}_1$ = estimate of the regression slope
$x$ = independent variable
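\[ \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \]
\[ \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \]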
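The two least-squares estimates translate directly into code. The following is a minimal sketch: the function name `fit_line` is hypothetical, and the data are made up so that the points lie exactly on the line y = 1 + 2x.

```python
def fit_line(x, y):
    """Return (beta0_hat, beta1_hat) for the least-squares line."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta1_hat = s_xy / s_xx                 # slope: S_xy / S_xx
    beta0_hat = y_bar - beta1_hat * x_bar   # intercept: y_bar - slope * x_bar
    return beta0_hat, beta1_hat

# Hypothetical data lying exactly on y = 1 + 2x
b0, b1 = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

Because the points are collinear, the fit recovers the intercept 1 and slope 2.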
Estimation of variance
\[ \hat{\sigma}^2 = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n-2} \]
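A short sketch of the variance estimate follows. The function name is hypothetical. The ten data points are an assumption: the slides do not list raw data, so an illustrative data set is used whose sums of squares match the values 2840, 568 and 15730 quoted in the examples, together with the fitted line with intercept 60 and slope 5.

```python
def residual_variance(x, y, beta0_hat, beta1_hat):
    """Estimate sigma^2 as SSE / (n - 2) for a fitted line."""
    n = len(x)
    if n <= 2:
        raise ValueError("need more than two observations")
    # SSE: sum of squared differences between observed and fitted values
    sse = sum((yi - (beta0_hat + beta1_hat * xi)) ** 2 for xi, yi in zip(x, y))
    return sse / (n - 2)

# Illustrative data (an assumption, not taken from the slides)
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]
sigma2_hat = residual_variance(x, y, 60, 5)
```

For this data set the sum of squared residuals is 1530, so the estimate is 1530 / 8 = 191.25.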
2.7 Coefficient of Determination
$R^2$, the coefficient of determination of the regression line, is defined as the proportion of the total sample variability in the Y's explained by the regression model.
\[ R^2 = r_{xy}^2 \]
\[ R = r_{xy} \]
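The link between the explained proportion of variability and the squared correlation can be checked numerically. The sketch below reuses the three sums quoted in Example 1 and relies on the standard identity SSR = S_xy^2 / S_xx for a simple linear regression, which is an assumption in the sense that it is not written out on the slides.

```python
import math

s_xy = 2840   # sum of (x_i - x_bar)(y_i - y_bar)
s_xx = 568    # sum of (x_i - x_bar)^2
sst = 15730   # total sum of squares, sum of (y_i - y_bar)^2

ssr = s_xy ** 2 / s_xx                   # variability explained by the line
r_squared = ssr / sst                    # proportion of variability explained
r_xy = s_xy / math.sqrt(s_xx * sst)      # sample correlation coefficient
```

Here `r_squared` is about 0.903 and equals `r_xy ** 2`, so roughly 90% of the variability in y is explained by the fitted line.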
Example 3
Scatterplot
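\[ \hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{2840}{568} = 5 \]
\[ \hat{y} = 60 + 5x \]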
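The calculation in Example 3 can be reproduced end to end in Python. The raw data are not available in the extracted slides, so the ten points below are an assumption: an illustrative data set whose sums reproduce the quoted values 2840 and 568.

```python
# Illustrative data (an assumption, not taken from the slides)
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

n = len(x)
x_bar = sum(x) / n   # 14.0
y_bar = sum(y) / n   # 130.0

s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)

beta1_hat = s_xy / s_xx                 # slope
beta0_hat = y_bar - beta1_hat * x_bar   # intercept

print(s_xy, s_xx, beta1_hat, beta0_hat)  # 2840.0 568.0 5.0 60.0
```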
(3) Inference on the Parameter
3.1 Parameter $\beta_1$
Critical value: $t_{\alpha/2,\,n-2}$
Example 4
Critical value: $t_{\alpha/2,\,n-2} = t_{5\%/2,\,8} = 2.306$
Conclusion: Reject $H_0$
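A test of the slope against zero can be sketched as below. Several things here are assumptions: the test statistic is the standard t = slope / standard error with n - 2 degrees of freedom (the formula itself is not visible in the extracted slides), the data are an illustrative ten-point set consistent with 8 degrees of freedom, and the critical value 2.306 is the tabulated value for a two-sided 5% test with 8 degrees of freedom.

```python
import math

# Illustrative data (an assumption, not taken from the slides)
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Estimated error variance and standard error of the slope
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
sigma2_hat = sse / (n - 2)
se_b1 = math.sqrt(sigma2_hat / s_xx)

t_stat = b1 / se_b1            # test statistic for H0: beta_1 = 0
t_crit = 2.306                 # t_{0.025, 8} from a t table
reject_h0 = abs(t_stat) > t_crit
```

For this data set the statistic is about 8.62, so the null hypothesis of a zero slope is rejected.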
3.2 Regression Line
[Figure: the line $y = b_0 + b_1 x$ plotted against x, with $\bar{x}$ and $x_p$ marked on the x-axis, showing the confidence interval for the mean of y, given $x_p$, and the prediction interval for an individual y, given $x_p$]
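The two kinds of interval around the regression line can be sketched as follows. The interval formulas are the standard textbook ones and are an assumption here, since the formula slides did not survive extraction. The function name, the illustrative data, the point x_p = 10 and the table value 2.306 are likewise assumptions.

```python
import math

def intervals_at(x, y, x_p, t_crit):
    """Return (y_hat, ci_half_width, pi_half_width) at x_p."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))             # residual standard error
    y_hat = b0 + b1 * x_p
    # Term that grows as x_p moves away from x_bar
    h = 1 / n + (x_p - x_bar) ** 2 / s_xx
    ci_half = t_crit * s * math.sqrt(h)      # mean of y, given x_p
    pi_half = t_crit * s * math.sqrt(1 + h)  # individual y, given x_p
    return y_hat, ci_half, pi_half

# Illustrative data (an assumption, not taken from the slides)
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]
y_hat, ci_half, pi_half = intervals_at(x, y, x_p=10, t_crit=2.306)
```

At x_p = 10 the fitted value is 110, with a half-width of about 11.4 for the confidence interval and about 33.9 for the prediction interval. The prediction interval is always the wider of the two because it also accounts for the variability of an individual observation.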
(4) The Analysis of Variance Table
4.1 Definition
- Hypothesis :
$H_0 : \beta_1 = 0$
$H_1 : \beta_1 \neq 0$
- Test statistics :
4.3 ANOVA Table
\[ SST = \sum_{i=1}^{n} (y_i - \bar{y})^2 \]
\[ SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \]
\[ SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 \]
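The three sums of squares, and the decomposition SST = SSR + SSE that underlies the ANOVA table, can be computed as below. The data are an illustrative set (an assumption, since the slides do not list raw data), and the F statistic MSR / MSE with 1 and n - 2 degrees of freedom is the standard form, not a formula recovered from the slides.

```python
# Illustrative data (an assumption, not taken from the slides)
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
y_fit = [b0 + b1 * xi for xi in x]   # fitted values

sst = sum((yi - y_bar) ** 2 for yi in y)                 # total
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_fit))    # error
ssr = sum((fi - y_bar) ** 2 for fi in y_fit)             # regression

msr = ssr / 1          # regression mean square (1 degree of freedom)
mse = sse / (n - 2)    # error mean square (n - 2 degrees of freedom)
f_stat = msr / mse

print(sst, sse, ssr)   # 15730.0 1530.0 14200.0
```

For this data set the F statistic is about 74.25, and the total sum of squares splits exactly into the regression and error parts.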
4.4 The Sum of Squares for a Simple Linear Regression
Example 5
(5) Residual Analysis
5.1 Residuals
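[Figure: plot of Y with the fitted line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$. The differences $y - \hat{y}$ between the observations and the predictions are called residuals or errors.]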
The residuals are defined as $e_i = y_i - \hat{y}_i$.
Residual analysis can be used to:
- Identify data points that are outliers
- Check whether the fitted model is appropriate
- Check whether the error variance is constant
- Check whether the error terms are normally distributed
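Computing the residuals and screening them for outliers can be sketched as below. The data are illustrative (an assumption), and the screening rule, flagging points whose residual exceeds two residual standard errors in absolute value, is a common rule of thumb rather than a rule stated in the slides.

```python
import math

# Illustrative data (an assumption, not taken from the slides)
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residuals e_i = y_i - y_hat_i
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Residual standard error, used to put the residuals on a common scale
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

# Rule of thumb (an assumption): flag |e_i / s| > 2 as a possible outlier
outliers = [i for i, e in enumerate(residuals) if abs(e / s) > 2]
```

The residuals of a least-squares line with an intercept sum to zero, which is a quick sanity check on the fit. For this data set no point is flagged. Plotting `residuals` against `x` or against the fitted values is the usual way to check the remaining points in the list above.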
(6) Application with Minitab
Correlation Analysis