
Correlation and regression

correlation
A stereo and sound equipment store’s manager wants to
determine the relationship between the number of weekend
television commercials shown and the sales at the store
during the following week. The sample data is as shown:

Week   No. of commercials (x)   Sales volume, Rs.1000s (y)
 1              2                          50
 2              5                          57
 3              1                          41
 4              3                          54
 5              4                          54
 6              1                          38
 7              5                          63
 8              3                          48
 9              4                          59
10              2                          46
covariance

Covariance is a descriptive measure of the linear relationship between two variables.

Covariance is given by sxy = [∑(xi - xbar) (yi - ybar)] / (n-1)
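
As a quick check of the formula, the sketch below computes the sample covariance for the commercials and sales data from the table above. The function name `sample_covariance` and the variable names are illustrative choices, not part of the original slides.

```python
# Weekend commercials (x) and following-week sales in Rs.1000s (y),
# copied from the sample data table above.
commercials = [2, 5, 1, 3, 4, 1, 5, 3, 4, 2]
sales = [50, 57, 41, 54, 54, 38, 63, 48, 59, 46]


def sample_covariance(x, y):
    """Return s_xy = sum((xi - xbar) * (yi - ybar)) / (n - 1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    cross_products = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return cross_products / (n - 1)


s_xy = sample_covariance(commercials, sales)
print(s_xy)  # 11.0
```

The positive value suggests that weeks with more commercials tend to be followed by higher sales.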


SCATTER DIAGRAM METHOD

[Figure: scatter diagram. Source: Wikipedia]
Simple linear regression
Regression model:               y = β0 + β1 x + ε
Regression equation:            E(y) = β0 + β1 x
Sample data:                    (x1, y1), (x2, y2), ..., (xn, yn)
Estimated regression equation:  ŷ = b0 + b1 x
Sample statistics:              b0, b1

b0 and b1 provide estimates of β0 and β1.
Least squares method

Sample data from 10 pizza parlor restaurants situated near college campuses.

Restaurant   Student population (‘000s)   Sales (Rs.‘000s)
     1                   2                       58
     2                   6                      105
     3                   8                       88
     4                   8                      118
     5                  12                      117
     6                  16                      137
     7                  20                      157
     8                  20                      169
     9                  22                      149
    10                  26                      202
Least squares method

Least squares criterion

• Min ∑(yi - ŷi)²

Estimated regression equation is

ŷ = b0 + b1 x

where b1 = [∑(xi - xbar) (yi - ybar)] / ∑(xi - xbar)²

and b0 = ybar - b1 xbar
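
The two formulas can be applied directly to the pizza parlor data. The sketch below is a minimal plain-Python version; the helper name `least_squares` is an illustrative choice, not something defined in the slides.

```python
# Student population ('000s) and sales (Rs.'000s) for the 10 restaurants.
population = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
sales = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]


def least_squares(x, y):
    """Return (b0, b1) for the estimated regression equation y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator and denominator of the slope formula.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = least_squares(population, sales)
print(b0, b1)  # 60.0 5.0
```

Here xbar = 14 and ybar = 130, the numerator is 2840 and the denominator is 568, which gives b1 = 5 and b0 = 130 - 5(14) = 60.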


[Chart: scatter plot of sales against student population (000s), with the fitted line y = 5x + 60]
SSE

Restaurant   xi Student population   yi Sales      Predicted sales   Error      Squared error
             (‘000s)                 (Rs.‘000s)    ŷ = 5x + 60       (yi - ŷ)   (yi - ŷ)²
     1                2                  58              70            -12          144
     2                6                 105              90             15          225
     3                8                  88             100            -12          144
     4                8                 118             100             18          324
     5               12                 117             120             -3            9
     6               16                 137             140             -3            9
     7               20                 157             160             -3            9
     8               20                 169             160              9           81
     9               22                 149             170            -21          441
    10               26                 202             190             12          144
                                                                       SSE =       1530
SST

Restaurant   xi Student population   yi Sales      Error      Squared error
             (‘000s)                 (Rs.‘000s)    (yi - ȳ)   (yi - ȳ)²
     1                2                  58          -72         5184
     2                6                 105          -25          625
     3                8                  88          -42         1764
     4                8                 118          -12          144
     5               12                 117          -13          169
     6               16                 137            7           49
     7               20                 157           27          729
     8               20                 169           39         1521
     9               22                 149           19          361
    10               26                 202           72         5184
                                     ȳ = 130        SST =       15730
Coefficient of determination

SSE = ∑(yi - ŷi)²

SST = ∑(yi - ȳ)²

SSR = ∑(ŷi - ȳ)² = SST - SSE

r² = SSR / SST

Correlation coefficient r = (sign of b1) √(r²)
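
These quantities can be checked numerically for the pizza parlor data, using the fitted line ŷ = 60 + 5x from the chart. The sketch below is one possible way to do it; the variable names are illustrative.

```python
import math

# Pizza parlor data: student population ('000s) and sales (Rs.'000s).
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

# Fitted coefficients taken from the estimated regression equation y_hat = 60 + 5x.
b0, b1 = 60.0, 5.0

y_bar = sum(y) / len(y)
y_hat = [b0 + b1 * xi for xi in x]

sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained variation
sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
ssr = sst - sse                                        # explained variation

r2 = ssr / sst
# The correlation coefficient takes the sign of the slope b1.
r = math.copysign(math.sqrt(r2), b1)

print(sse, sst, ssr)              # 1530.0 15730.0 14200.0
print(round(r2, 4), round(r, 4))  # 0.9027 0.9501
```

About 90% of the variation in sales is explained by the linear relationship with student population.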


Coefficient of determination

xi    1    2    3    4    5

yi    3    7    5   11   14

The estimated regression equation for these data is ŷ = 0.20 + 2.60x

A) Compute SSE, SST and SSR

B) Compute r²

C) Compute r
Testing for significance

E(y) = β0 + β1 x. If the value of β1 = 0, then E(y) = β0: the mean value of y does not depend on x, and hence x and y are not linearly related. If β1 ≠ 0, x and y are linearly related.

To test the significance of the relationship, conduct a hypothesis test to determine whether the value of β1 is zero.

MSE, the mean square error, provides an estimate of σ².

s² is an unbiased estimator of σ².

So s² = MSE = SSE / (n - 2)

SSE has n - 2 degrees of freedom because two parameters (β0 and β1) must be estimated to compute SSE.
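
For the pizza parlor data, SSE = 1530 and n = 10, so the estimate of σ² follows in one step. A minimal sketch, with illustrative variable names:

```python
import math

n = 10        # number of restaurants in the sample
sse = 1530.0  # sum of squared errors from the SSE table

# Two degrees of freedom are lost because b0 and b1 are estimated from the data.
mse = sse / (n - 2)

# s, the standard error of the estimate, is the square root of MSE.
s = math.sqrt(mse)

print(mse)          # 191.25
print(round(s, 3))  # 13.829
```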
Testing for significance

• H0: β1 = 0

• Ha: β1 ≠ 0

Sampling distribution of b1

• E(b1) = β1

• Standard deviation: σb1 = σ / √∑(xi - xbar)²

• Estimated standard deviation: sb1 = s / √∑(xi - xbar)²
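
To illustrate, the sketch below runs the t test on the pizza parlor data. It assumes the usual test statistic t = b1 / sb1 with n - 2 degrees of freedom, which the slides do not state explicitly, and takes the critical value 2.306 from a standard t table (8 degrees of freedom, two-tailed, alpha = .05). Variable names are illustrative.

```python
import math

# Student population ('000s) for the 10 restaurants.
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
n = len(x)

b1 = 5.0      # slope of the estimated regression equation
sse = 1530.0  # sum of squared errors from the SSE table

# s is the square root of MSE = SSE / (n - 2).
s = math.sqrt(sse / (n - 2))

# Estimated standard deviation of b1.
x_bar = sum(x) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_b1 = s / math.sqrt(s_xx)

# Assumption: test statistic t = b1 / s_b1 with n - 2 degrees of freedom.
t_stat = b1 / s_b1

# Assumption: critical value from a standard t table (8 df, alpha = .05, two-tailed).
T_CRITICAL = 2.306

print(round(s_b1, 4))    # 0.5803
print(round(t_stat, 2))  # 8.62
print("Reject H0" if abs(t_stat) > T_CRITICAL else "Do not reject H0")
```

Since 8.62 exceeds 2.306, H0: β1 = 0 is rejected, which supports a significant linear relationship between student population and sales.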


exercise

xi    2    3    5    1    8

yi   25   25   20   30   16

• A) Compute the mean square error

• B) Use the t test to test for significance at alpha = .05

• C) Use the F test to test the hypothesis at the .05 level of significance. Present the results in ANOVA table format.
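
One way to check answers to this exercise is sketched below. It assumes the usual F statistic, F = MSR / MSE with MSR = SSR / 1, which the slides do not define, and reads the critical value 10.13 from a standard F table (1 and 3 degrees of freedom, alpha = .05). The layout of the printed ANOVA table is an illustrative choice.

```python
# Exercise data from the text above.
x = [2, 3, 5, 1, 8]
y = [25, 25, 20, 30, 16]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares estimates of the slope and intercept.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Sums of squares.
y_hat = [b0 + b1 * xi for xi in x]
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
sst = sum((yi - y_bar) ** 2 for yi in y)
ssr = sst - sse

# Mean squares: regression has 1 degree of freedom, error has n - 2.
msr = ssr / 1
mse = sse / (n - 2)
f_stat = msr / mse

# Assumption: critical value from a standard F table (1 and 3 df, alpha = .05).
F_CRITICAL = 10.13

print(f"{'Source':<12}{'SS':>10}{'df':>4}{'MS':>10}{'F':>8}")
print(f"{'Regression':<12}{ssr:>10.2f}{1:>4}{msr:>10.2f}{f_stat:>8.2f}")
print(f"{'Error':<12}{sse:>10.2f}{n - 2:>4}{mse:>10.2f}")
print(f"{'Total':<12}{sst:>10.2f}{n - 1:>4}")
print("Reject H0" if f_stat > F_CRITICAL else "Do not reject H0")
```

With a single independent variable, the F test and the two-tailed t test lead to the same conclusion, because F equals the square of t.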
