Regression and Factor Analysis


Statistics and Advanced Data Analysis with SPSS for Windows

Kanlaya Vanichbuncha


1. Wednesday, August 4, 2021, 18:30–21:30

➢ Simple and Multiple Linear Regression


▪ Assumptions
▪ Examining the conditions (assumptions)
▪ Example

2. Thursday, August 19, 2021, 18:30–21:30


➢ Multiple Linear Regression
➢ Factor Analysis
➢ Example

3. Thursday, August 26, 2021, 18:30–21:30

➢ Logistic Regression Analysis and ROC Curve


▪ Assumptions
▪ Examining the conditions (assumptions)
▪ Example

4. Thursday, September 2, 2021, 18:30–21:30

➢ Discriminant Analysis
▪ Assumptions
▪ Examining the conditions (assumptions)
▪ Example

5. Thursday, September 9, 2021, 18:30–21:30


➢ Cluster Analysis
▪ Example

6. Thursday, September 16, 2021, 18:30–21:30

▪ Log Linear Model

Regression Analysis
• Analyzes the relationship among variables.
• Causal relationship.

Objectives
1. Study the pattern of the relationship between the variables.
2. Estimate or forecast.
Simple Regression
One dependent and one independent variable; both are quantitative (interval or ratio scale).

  Independent (X)        Dependent (Y)
  Income                 Expense
  Advertising expense    Sales revenue
  Income                 Satisfaction score
Types of Data

1. Cross-sectional data
2. Time-series data
3. Longitudinal data
Simple Linear Regression Analysis
The relationship between the dependent variable (Y) and the independent variable (X) has a linear form:

  Yᵢ = β₀ + β₁Xᵢ + eᵢ ,  i = 1, 2, …, N

Y  = dependent variable – e.g., annual sales
X  = independent variable – e.g., advertising budget
β₀ = Y intercept of the regression line
β₁ = slope of the regression line
e  = error – the difference between the actual value (Y) and the predicted value (Ŷ)
Ŷ denotes the value predicted by the regression line.
[Figure omitted: two scatter plots of the line Y = β₀ + β₁X + e, one with β₁ > 0 and one with β₁ < 0; β₀ marks the intercept.]

β₁ = the slope of the line, i.e., the average change in Y for each change of one unit (either increase or decrease) in X; β₁ is the regression coefficient.
β₀ = the Y-intercept, i.e., the estimated value of Y when X = 0.
β₀ is measured in the units of the dependent variable Y; β₁ is measured in units of Y per one unit of X.
Figure 1: Positive linear relationship. Figure 2: Negative linear relationship.
[Plots omitted: Y against X with an upward-sloping line and a downward-sloping line, respectively.]
For Sample Data

  Yᵢ = β₀ + β₁Xᵢ + eᵢ      …(1)
  Ŷᵢ = b₀ + b₁Xᵢ           …(2)

  (1) − (2):  eᵢ = Yᵢ − Ŷᵢ

  β̂₀ = b₀ ,  β̂₁ = b₁

b₀ and b₁ are estimated by
1. Ordinary Least Squares (OLS)
2. Maximum Likelihood Estimation (MLE)
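As a minimal sketch (not part of the original slides), the OLS fit of equation (2) can be reproduced in Python with statsmodels; the data values here are hypothetical:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical sample data (not from the slides)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])

fit = sm.OLS(Y, sm.add_constant(X)).fit()  # add_constant supplies the b0 column

b0, b1 = fit.params                        # estimates of beta0 and beta1
print(b0, b1)
print(fit.resid.sum())                     # residuals e_i = Y_i - Yhat_i sum to ~0
```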
Example 1
X = advertising expense (Baht 100,000)
Y = sales revenue (million baht)

Regression equation:

  Predicted Sales = 28.94 + 5.18·Adv

b₁ = 5.18 million baht
b₀ = 28.94 million baht

• If we do not advertise (Adv = 0), then predicted sales = 28.94 + 5.18(0) = 28.94 million baht.
• If advertising expense increases by 100,000 baht, we can expect mean sales to increase by 5.18 million baht.
• If next year we set advertising expense = 10 (i.e., Baht 1,000,000), we can expect mean sales = 28.94 + 5.18(10) = 80.74 million baht.
Example 2

  X:  2   4   8   10  13  14
  Y:  50  38  26  25   7   2

Step 1: Use a scatter plot to examine the relationship between X and Y.
[Scatter plot omitted: Y (0–60) against X (0–15), showing a downward-sloping pattern.]

Step 2: Fit Ŷᵢ = b₀ + b₁Xᵢ, which gives b₁ = −3.368 and b₀ = 54.417:

  Ŷᵢ = 54.417 − 3.368Xᵢ

Error (residual) term: eᵢ = Yᵢ − Ŷᵢ

  Xᵢ     Yᵢ    Ŷᵢ        eᵢ = Yᵢ − Ŷᵢ    eᵢ²
  2      50    47.681     2.319          5.378
  4      38    40.945    −2.945          8.673
  8      26    27.473    −1.473          2.170
  10     25    20.737     4.263         18.173
  13      7    10.633    −3.633         13.199
  14      2     0.529     1.471          2.164
  Total                                 49.757
Testing Hypotheses about β₀ and β₁

1. Test about β₁ (slope of the regression line)
  H₀: β₁ = 0  (there is no linear relationship between Y and X)
  H₁: β₁ ≠ 0  (X and Y are linearly related)

  Test statistic:  t = (b₁ − 0) / SE(b₁) ,  df = n − 2
  Reject H₀ if t > t₁₋α/₂,ₙ₋₂ or t < −t₁₋α/₂,ₙ₋₂

2. Test about β₀ (intercept)
  H₀: β₀ = 0
  H₁: β₀ ≠ 0

  Test statistic:  t = (b₀ − 0) / SE(b₀) ,  df = n − 2
  Reject H₀ if t > t₁₋α/₂,ₙ₋₂ or t < −t₁₋α/₂,ₙ₋₂
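A short sketch of the slope test on the same hypothetical data as before (α = 0.05 is an assumed significance level, not a value from the slides):

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # same hypothetical data
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
fit = sm.OLS(Y, sm.add_constant(X)).fit()

alpha = 0.05                                   # assumed significance level
t_b1 = fit.tvalues[1]                          # t = (b1 - 0) / SE(b1)
t_crit = stats.t.ppf(1 - alpha / 2, df=len(Y) - 2)

# Reject H0: beta1 = 0 when t > t_crit or t < -t_crit
print(t_b1, t_crit, abs(t_b1) > t_crit, fit.pvalues[1])
```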
The Coefficient of Determination: R²

▪ R² is the proportion of the total variation in the dependent variable Y that is explained by the variation of the independent variable X.

  Sum of squares total (SST) = sum of squares regression (SSReg) + sum of squares error (SSE)
  SST = SSReg + SSE

  R² = SSReg / SST ,  0 ≤ R² ≤ 1  or  0% ≤ R² ≤ 100%

where, summing over i = 1, …, n:
  SSTotal = SST = Σ(Yᵢ − Ȳ)²
  SSReg = Σ(Ŷᵢ − Ȳ)²
  SSE = SSError = SSResidual = Σ(Yᵢ − Ŷᵢ)²
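The decomposition can be checked directly; a minimal sketch on the same hypothetical data (not SPSS output):

```python
import numpy as np
import statsmodels.api as sm

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # same hypothetical data
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
fit = sm.OLS(Y, sm.add_constant(X)).fit()

Y_hat = fit.fittedvalues
SST = ((Y - Y.mean()) ** 2).sum()
SSReg = ((Y_hat - Y.mean()) ** 2).sum()
SSE = ((Y - Y_hat) ** 2).sum()

print(SST, SSReg + SSE)                 # SST = SSReg + SSE
print(SSReg / SST, fit.rsquared)        # both give R^2
```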
The Coefficient of Correlation
Pearson correlation describes the strength of the linear relationship between two quantitative variables:

  r = [nΣXY − (ΣX)(ΣY)] / √{[nΣX² − (ΣX)²][nΣY² − (ΣY)²]} ,  −1 ≤ r ≤ +1

r = +1 or r = −1: perfect correlation.
r = +1: X and Y are perfectly related in a positive linear form.
r = −1: X and Y are perfectly related in a negative linear form.
r = 0: there is no linear relationship between the two variables.

[Scale: r runs from −1 (perfect negative correlation) through −.5 (moderate negative), 0 (none), and .5 (moderate positive) to +1 (perfect positive correlation); values near the ends are strong, values near 0 are weak.]
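A brief sketch (hypothetical data, not from the slides) computing r both from the formula above and with scipy:

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical data
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

n = len(x)
num = n * (x * y).sum() - x.sum() * y.sum()
den = np.sqrt((n * (x ** 2).sum() - x.sum() ** 2) *
              (n * (y ** 2).sum() - y.sum() ** 2))
r_formula = num / den

r, p_value = stats.pearsonr(x, y)          # same r, plus a significance test
print(r_formula, r, p_value)
```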
Conditions or Assumptions

1. Errors eᵢ ~ Normal(0, σ²)
2. V(e) = σ² is constant
3. eₜ and eₜ₊₁ are independent

Properties of b₀ and b₁ for the Least Squares Method

1. Σeᵢ = Σ(Yᵢ − Ŷᵢ) = 0
2. (X̄, Ȳ) is a point on the regression line
3. Σeᵢ² is minimized
Examining the Conditions
1. e is normal
  - Chi-square test
  - Kolmogorov-Smirnov test (any sample size n)
  - Shapiro-Wilk test (n ≤ 50)
2. V(e) is constant
  (if V(e) is not constant, there is a heteroscedasticity problem)
  - Plot e against Ŷ or X
3. eₜ and eₜ₊₁ are independent
  - Durbin-Watson test
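The normality checks in item 1 map onto scipy as follows (a sketch on the earlier hypothetical fit; note the Kolmogorov-Smirnov call here plugs in estimated parameters, which SPSS's Lilliefors correction handles differently):

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # same hypothetical data
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
e = sm.OLS(Y, sm.add_constant(X)).fit().resid  # residuals

w, p_sw = stats.shapiro(e)                     # Shapiro-Wilk (n <= 50)
ks, p_ks = stats.kstest(e, "norm", args=(e.mean(), e.std(ddof=1)))

# Large p-values are consistent with normally distributed errors
print(p_sw, p_ks)
```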
Checking the condition that V(e) is constant

• Homoskedasticity (equal error variance): examine the errors at different values of X. Are they roughly equal?
[Plot omitted: residuals e against X or Ŷ, scattered evenly around 0; the example plots HAPPY (Y) against INCOME (X) with roughly equal spread across incomes from 0 to 100,000.]

• Heteroskedasticity (unequal error variance):
[Plot omitted: the same HAPPY-INCOME example where, at higher values of X, the error variance increases a lot.]
Example 3

  Ŷ = 2.286 + 2.679X

  X    Y     Ŷ       Error
  1     4    4.96    −.96
  2     8    7.64     .36
  3     9   10.32   −1.32
  4    16   13.00    3.00
  5    13   15.68   −2.68
  6    24   18.36    5.64
  7    17   21.04   −4.04

  Σeᵢ = 0

[Residual plot omitted: e against X; the residuals grow in magnitude as X increases.]
Refitting after a square-root transformation of Y (each Y value below is the square root of the corresponding Y in Example 3, making the model nonlinear in the original Y) gives:

  Ŷ = 1.906 + 0.397X

  X    Y      Ŷ      Error (nonlinear model)   Error (linear model)
  1    2.00   2.30   −.30                      −.96
  2    2.83   2.70    .13                       .36
  3    3.00   3.10   −.10                     −1.32
  4    4.00   3.49    .51                      3.00
  5    3.61   3.89   −.29                     −2.68
  6    4.90   4.29    .61                      5.64
  7    4.12   4.68   −.56                     −4.04

  Σeᵢ = 0 in both models, but the transformed model's residuals are far smaller.

[Residual plot omitted: e against X for the transformed model, scattered within about ±1.]
Checking the condition that eₜ and eₜ₊₁ are independent

Use the Durbin-Watson (D-W) statistic:

  D-W = Σₜ₌₂ⁿ (eₜ − eₜ₋₁)² / Σₜ₌₁ⁿ eₜ² ,  0 ≤ D-W ≤ 4

1. D-W ≈ 2 => eₜ and eₜ₊₁ are independent
2. D-W < 2 => eₜ and eₜ₊₁ have a positive relationship
3. D-W > 2 => eₜ and eₜ₊₁ have a negative relationship
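The statistic is easy to compute directly; a minimal sketch on the earlier hypothetical residuals (statsmodels' built-in durbin_watson should agree):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # same hypothetical data
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
e = sm.OLS(Y, sm.add_constant(X)).fit().resid  # residuals in time order

dw = ((e[1:] - e[:-1]) ** 2).sum() / (e ** 2).sum()
print(dw)                            # values near 2 suggest independent errors
print(durbin_watson(e))              # statsmodels' built-in gives the same value
```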
Regression: Outliers
▫ Choose "influence" and "distance" statistics such as Cook's distance, DFFIT, and the standardized residual.
▫ High values signal potential outliers.
▫ Note: this is less useful if you have a VERY large dataset, because you have to look at each case value.

• Example: study time and student achievement.
▫ X variable: average # hours spent studying per day
▫ Y variable: score on a reading test

  Case   X     Y
  1      2.6   28
  2      1.4   13
  3       .65  17
  4      4.1   31
  5       .25   8
  6      1.9   16
  7      3.5    6

[Scatter plot omitted: Y (0–30) against X (0–4).]
Regression: Outliers

• Note: even if the regression assumptions are met, slope estimates can have problems.
• Example: outliers -- cases with extreme values that differ greatly from the rest of your sample.
  ▫ More formally: "influential cases"
• Outliers can result from:
  ▫ Errors in coding or data entry
  ▫ Highly unusual cases
  ▫ Or, sometimes they reflect important "real" variation
• Even a few outliers can dramatically change estimates of the slope, especially if N is small.
Regression: Outliers
• Outlier example:
[Plot omitted: an extreme case pulls the regression line up; a second line shows the fit with the extreme case removed from the sample.]
Outlier Diagnostics
• Cook's D identifies cases that are strongly influencing the regression line.
  ▫ SPSS calculates a value for each case: go to the "Save" menu and click on Cook's D.
• How large a Cook's D is a problem?
  ▫ Rule of thumb: values greater than 4 / (n − k − 1)
  ▫ Example: n = 7, k = 1: cut-off = 4/5 = .80
  ▫ Cases with higher values should be examined.
• Example: outlier/influential case statistics

  Hours   Score   Residual   Std Residual   Cook's D
  2.60    28        9.32      1.01          .124
  1.40    13       −1.97      −.215         .006
   .65    17        4.33       .473         .070
  4.10    31        7.70       .841         .640
   .25     8       −3.43      −.374         .082
  1.90    16        −.515     −.056         .0003
  3.50     6      −15.4      −1.68          .941
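A sketch of the same diagnostic with statsmodels (assumed usage; it should approximately reproduce the Cook's D column above from the study-time data):

```python
import numpy as np
import statsmodels.api as sm

hours = np.array([2.60, 1.40, 0.65, 4.10, 0.25, 1.90, 3.50])
score = np.array([28.0, 13.0, 17.0, 31.0, 8.0, 16.0, 6.0])

res = sm.OLS(score, sm.add_constant(hours)).fit()
cooks_d, _ = res.get_influence().cooks_distance

n, k = len(score), 1
cutoff = 4 / (n - k - 1)            # rule of thumb: 4/5 = .80 here
print(np.round(cooks_d, 3))
print(cooks_d > cutoff)             # flags the influential case
```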
Outliers

• Results with the outlier included:

  Model Summary(b)
  Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
  1       .466a   .217       .060                9.1618
  a. Predictors: (Constant), HRSTUDY
  b. Dependent Variable: TESTSCOR

  Coefficients(a)
  Model          B        Std. Error   Beta   t        Sig.
  1 (Constant)   10.662   6.402               1.665    .157
    HRSTUDY       3.081   2.617        .466   1.177    .292
  a. Dependent Variable: TESTSCOR

• Results with the outlier removed:

  Model Summary(b)
  Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
  1       .903a   .816       .770                4.2587
  a. Predictors: (Constant), HRSTUDY
  b. Dependent Variable: TESTSCOR

  Coefficients(a)
  Model          B       Std. Error   Beta   t       Sig.
  1 (Constant)   8.428   3.019               2.791   .049
    HRSTUDY      5.728   1.359        .903   4.215   .014
  a. Dependent Variable: TESTSCOR
Multiple Linear Regression
One dependent variable and at least 2 independent variables (k independent variables that are potentially related to the dependent variable):

  Y = β₀ + β₁X₁ + β₂X₂ + … + βₖXₖ + e ,  k ≥ 2

[Diagram: independent variables of any scale — X₁ (nominal), X₂ (ordinal), X₃ (interval), X₄ (ratio) — point to the dependent variable Y (interval/ratio).]
[Conceptual model: perceived self-efficacy; job characteristics (skill variety, task identity, task significance, autonomy in the job, feedback); and happiness at work → work behavior.]

Source: สิริศา จักรบุญมา and ผศ.ดร.ถวัลย์ เนียมทรัพย์, "Factors influencing the work behavior of maintenance employees in a state enterprise."
[Conceptual model: good citizenship (conservation dimension; knowledge dimension), social support (appraisal dimension; informational dimension), hometown, belief in internal locus of control, and participation in student affairs (benefit-receiving dimension) → appropriate political expression behavior of students.]

Source: วรรณกร พลพิชัย and ผศ.น.ท.หญิง ดร.งามลมัย ผิวเหลือง, "The influence of participation in student activities, social support, belief in internal locus of control, and good citizenship on the appropriate political expression behavior of Kasetsart University students."
Estimate Y by Ŷ:

  Ŷ = b₀ + b₁X₁ + b₂X₂ + … + bₖXₖ   (for k = 3: Ŷ = b₀ + b₁X₁ + b₂X₂ + b₃X₃)

Conditions
1. Errors ~ Normal(0, σ²)
2. V(e) = σ² is constant (if not: heteroscedasticity)
3. eₜ and eₜ₊₁ are independent
4. X₁, …, Xₖ are independent of one another (if not: multicollinearity)

Example for k = 3:
  Y  = sales revenue (million baht)
  X₁ = advertising expense (Baht 100,000)
  X₂ = selling price per unit (baht)
  X₃ = number of branches

If b₁ = 7: when advertising expense increases by 1 unit (Baht 100,000), mean sales (Y) will increase by 7 million baht, holding the selling price per unit and the number of branches fixed.
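A minimal sketch of such a k = 3 model with the statsmodels formula API; the data frame and all of its values are hypothetical:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data frame with the variables described above
df = pd.DataFrame({
    "Sales":    [35.0, 41.0, 52.0, 48.0, 60.0, 66.0, 73.0, 70.0],
    "Adv":      [1.0, 2.0, 3.0, 3.0, 4.0, 5.0, 6.0, 6.0],
    "Price":    [20.0, 19.0, 19.0, 18.0, 18.0, 17.0, 17.0, 16.0],
    "Branches": [2, 2, 3, 3, 4, 4, 5, 5],
})

m = smf.ols("Sales ~ Adv + Price + Branches", data=df).fit()
print(m.params)                  # b0, b1, b2, b3
print(m.fvalue, m.f_pvalue)      # overall F test of H0: beta1 = beta2 = beta3 = 0
```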
Testing Hypotheses

Step 1: Test the effect of the set of independent variables
  H₀: β₁ = β₂ = … = βₖ = 0
  H₁: at least one βᵢ ≠ 0 ;  i = 1, 2, …, k

  SV              df          SS      MS = SS/df   F
  Regression (X)  k           SSReg   MSReg        MSReg/MSE
  Error           n − k − 1   SSE     MSE
  Total           n − 1       SST

  Test statistic: F = MSReg / MSE
Conclusion:
1.1 Accept H₀: β₁ = β₂ = … = βₖ = 0
  - No linear relationship between Y and X₁, X₂, …, Xₖ => stop.
1.2 Reject H₀ => at least one of the Xᵢ's has a linear relationship with Y.
  - Test each X (each β) in Step 2.

Step 2: Test the effect of each independent variable
  H₀: βᵢ = 0
  H₁: βᵢ ≠ 0

  t = (bᵢ − 0) / SE(bᵢ) ,  SE(bᵢ) = √(MSE / ΣXᵢ²)
The Coefficient of Determination: R²
R² is the proportion of the total variation in the dependent variable Y that is explained by the variation of the independent variables X.

  SS total = SS regression + SS error
  SST = SSReg + SSResidual

  R² = SSReg / SST = 1 − SSE/SST ,  0 ≤ R² ≤ 1  or  0% ≤ R² ≤ 100%

Note: as more independent variables (X) are added to the regression model, R² will never decrease, even if the additional variables are not related to Y.

Adjusted R² for Multiple Regression

  R²adj = 1 − [SSE/(n − k − 1)] / [SST/(n − 1)] = 1 + (R² − 1)·(n − 1)/(n − k − 1)
Multicollinearity

The problem of multicollinearity exists when the independent variables (X's) are correlated with one another.

Problems caused by a high degree of multicollinearity among the independent variables:
1. Contradiction
  Step 1 accepts H₀: β₁ = β₂ = … = βₖ = 0, but Step 2 rejects H₀: βᵢ = 0
  • or Step 1 rejects H₀ but Step 2 accepts H₀
2. The b's are unstable.
3. SE(b) will be very large, so the t statistic is too small => accept H₀ even when X and Y are strongly related.
Detecting Multicollinearity
1. Compute the pairwise correlations between the X's. Multicollinearity may be a serious problem if any pairwise correlation is bigger than 0.5.

2. Tolerance:  Tolerance(Xᵢ) = 1 − R²(Xᵢ) ,  0 ≤ Tolerance ≤ 1

  where R²(Xᵢ) is the coefficient of determination (0 ≤ R² ≤ 1) of the regression of Xᵢ on the other independent variables:
  Xᵢ = a + b₁X₁ + b₂X₂ + … + bᵢ₋₁Xᵢ₋₁ + bᵢ₊₁Xᵢ₊₁ + … + bₖXₖ

3. VIF (Variance Inflation Factor):  VIF(Xᵢ) = 1 / Tolerance(Xᵢ) ,  1 ≤ VIF < ∞
1. VIF(Xᵢ) > 10 => multicollinearity may be influencing the least squares estimates of the regression coefficients (bᵢ).

2. Mean VIF = (1/k)·Σⱼ₌₁ᵏ VIF(Xⱼ) > 10 => a serious problem of multicollinearity.

Corrections for Multicollinearity

1. Remove variables that are highly correlated with others.
2. Use exploratory factor analysis.
3. Ridge regression.
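A sketch of the tolerance and VIF diagnostics with statsmodels, on the same hypothetical data frame as the multiple-regression sketch:

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Same hypothetical data frame as in the multiple-regression sketch
df = pd.DataFrame({
    "Sales":    [35.0, 41.0, 52.0, 48.0, 60.0, 66.0, 73.0, 70.0],
    "Adv":      [1.0, 2.0, 3.0, 3.0, 4.0, 5.0, 6.0, 6.0],
    "Price":    [20.0, 19.0, 19.0, 18.0, 18.0, 17.0, 17.0, 16.0],
    "Branches": [2, 2, 3, 3, 4, 4, 5, 5],
})

exog = sm.add_constant(df[["Adv", "Price", "Branches"]])
for i, name in enumerate(exog.columns):
    if name == "const":
        continue                     # skip the intercept column
    vif = variance_inflation_factor(exog.values, i)
    print(name, round(vif, 2), round(1 / vif, 3))   # VIF and tolerance
```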
Variable Selection

1. Enter / Remove
2. Backward Elimination
3. Forward Selection
4. Stepwise Regression
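For readers working outside SPSS, a hypothetical sketch of forward selection with scikit-learn (note that SPSS selects on an F-to-enter criterion, while this sketch selects on cross-validated fit):

```python
import pandas as pd
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

# Same hypothetical data frame as in the multiple-regression sketch
df = pd.DataFrame({
    "Sales":    [35.0, 41.0, 52.0, 48.0, 60.0, 66.0, 73.0, 70.0],
    "Adv":      [1.0, 2.0, 3.0, 3.0, 4.0, 5.0, 6.0, 6.0],
    "Price":    [20.0, 19.0, 19.0, 18.0, 18.0, 17.0, 17.0, 16.0],
    "Branches": [2, 2, 3, 3, 4, 4, 5, 5],
})
X, y = df[["Adv", "Price", "Branches"]], df["Sales"]

sfs = SequentialFeatureSelector(
    LinearRegression(), n_features_to_select=2, direction="forward", cv=2
)
sfs.fit(X, y)
print(list(X.columns[sfs.get_support()]))   # predictors kept by forward selection
```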
Example 1: data file: diaper.sav

[Diagram: nine product attributes — Size, Price, …, Tape — enter a multiple linear regression with Preference as the dependent variable.]
1.1 Enter method with 9 independent variables

[SPSS screenshots omitted: in the Linear Regression dialog, click Statistics, Plots, and Save to request the statistics, residual plots, and saved diagnostics used below.]
Descriptive Statistics
              Mean   Std. Deviation   N
  Preference  3.99   1.961            296
  Size        4.80   1.048            296
  price       4.75   1.005            296
  Value       4.81   1.166            296
  unisex      4.13   1.500            296
  style       4.03   1.325            296
  absorb      3.92   1.172            296
  leakage     3.89   1.153            296
  Comfort     3.83   1.203            296
  tape        3.35   1.167            296
ANOVA(a)
  Model          Sum of Squares   df    Mean Square   F        Sig.
  1 Regression   857.721          9     95.302        98.661   .000b
    Residual     276.265          286   .966
    Total        1133.986         295
  a. Dependent Variable: Preference
  b. Predictors: (Constant), tape, Value, style, absorb, Size, Comfort, price, unisex, leakage
Coefficients(a)
  Model          B        Std. Error   Beta    t         Sig.
  1 (Constant)   −3.590   .307                 −11.694   .000
    Size          .524    .113         .280    4.644     .000
    price         .124    .122         .063    1.009     .314
    Value        −.027    .074         −.016   −.369     .712
    unisex        .430    .089         .329    4.819     .000
    style        −.014    .097         −.010   −.148     .882
    absorb        .370    .149         .221    2.479     .014
    leakage       .271    .157         .159    1.731     .085
    Comfort       .074    .081         .045    .910      .363
    tape          .031    .066         .018    .467      .641
  a. Dependent Variable: Preference
Model Summary(b)
  Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
  1       .870a   .756       .749                .983
  a. Predictors: (Constant), tape, Value, style, absorb, Size, Comfort, price, unisex, leakage
  b. Dependent Variable: Preference
1.2 Enter method with 3 independent variables: Size, unisex, absorb

ANOVA(a)
  Model          Sum of Squares   df    Mean Square   F        Sig.
  1 Regression   850.232          3     283.411       291.64   .000b
    Residual     283.755          292   .972
    Total        1133.986         295
  a. Dependent Variable: Preference (preference score)
  b. Predictors: (Constant), absorb (absorbency), Size (pieces per box), unisex (usable for both sexes)
Coefficients(a)
  Model          B        Std. Error   Beta   t         Sig.   Tolerance   VIF
  1 (Constant)   −3.387   .281                −12.04    .000
    Size          .603    .066         .322   9.112     .000   .685        1.45
    unisex        .437    .048         .334   9.174     .000   .646        1.54
    absorb        .684    .059         .409   11.656    .000   .696        1.43
  a. Dependent Variable: Preference
Model Summary(b)
  Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
  1       .866a   .750       .747                .986
  a. Predictors: (Constant), absorb, Size, unisex
  b. Dependent Variable: Preference
2. Forward
Variables Entered/Removed(a)
  Model   Variables Entered   Variables Removed   Method
  1       absorb              .                   Forward (criterion: probability-of-F-to-enter <= .050)
  2       unisex              .                   Forward (criterion: probability-of-F-to-enter <= .050)
  3       Size                .                   Forward (criterion: probability-of-F-to-enter <= .050)
  4       leak                .                   Forward (criterion: probability-of-F-to-enter <= .050)
  a. Dependent Variable: Preference
Coefficients(a)
  Model          B        Std. Error   Beta   t         Sig.   Tolerance   VIF
  1 (Constant)   −.748    .276                −2.711    .007
    absorb       1.209    .067         .723   17.925    .000   1.000       1.000
  2 (Constant)   −1.713   .241                −7.116    .000
    absorb        .825    .064         .493   12.892    .000   .749        1.336
    unisex        .597    .050         .457   11.942    .000   .749        1.336
  3 (Constant)   −3.387   .281                −12.048   .000
    absorb        .684    .059         .409   11.656    .000   .696        1.436
    unisex        .437    .048         .334   9.174     .000   .646        1.548
    Size          .603    .066         .322   9.112     .000   .685        1.459
  4 (Constant)   −3.484   .283                −12.332   .000
    absorb        .380    .147         .227   2.580     .010   .109        9.190
    unisex        .424    .048         .325   8.909     .000   .637        1.570
    Size          .614    .066         .328   9.311     .000   .682        1.467
    leak          .332    .148         .195   2.242     .026   .112        8.964
  a. Dependent Variable: Preference
Model Summary(e)
  Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
  1       .723a   .522       .521                1.358
  2       .824b   .679       .676                1.115
  3       .866c   .750       .747                .986
  4       .868d   .754       .751                .979
3. Backward
Variables Entered/Removed(a)
  Model   Variables Entered                   Variables Removed   Method
  1       tape, Value, style, absorb, Size,   .                   Enter
          Comfort, price, unisex, leakage(b)
  2       .                                   style               Backward (criterion: probability of F-to-remove >= .100)
  3       .                                   Value               Backward (criterion: probability of F-to-remove >= .100)
  4       .                                   tape                Backward (criterion: probability of F-to-remove >= .100)
  5       .                                   price               Backward (criterion: probability of F-to-remove >= .100)
  6       .                                   Comfort             Backward (criterion: probability of F-to-remove >= .100)
  a. Dependent Variable: Preference
  b. All requested variables entered.
3. Backward
Coefficients(a)
  Model          B        Std. Error   Beta   t        Sig.   Tolerance   VIF
  6 (Constant)   −3.484   .283                −12.3    .000
    Size          .614    .066         .328   9.311    .000   .682        1.467
    unisex        .424    .048         .325   8.909    .000   .637        1.570
    absorb        .380    .147         .227   2.580    .010   .109        9.190
    leak          .332    .148         .195   2.242    .026   .112        8.964
  a. Dependent Variable: Preference
Model Summary(g)
  Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
  1       .870a   .756       .749                .983
  2       .870b   .756       .750                .981
  3       .870c   .756       .750                .980
  4       .870d   .756       .751                .978
  5       .869e   .755       .751                .978
  6       .868f   .754       .751                .979
4. Stepwise
Variables Entered/Removed(a)
  Model   Variables Entered   Variables Removed   Method
  1       absorb              .                   Stepwise (criteria: probability-of-F-to-enter <= .050, probability-of-F-to-remove >= .100)
  2       unisex              .                   Stepwise (criteria: probability-of-F-to-enter <= .050, probability-of-F-to-remove >= .100)
  3       Size                .                   Stepwise (criteria: probability-of-F-to-enter <= .050, probability-of-F-to-remove >= .100)
  4       leak                .                   Stepwise (criteria: probability-of-F-to-enter <= .050, probability-of-F-to-remove >= .100)

Coefficients(a)
  Model          B        Std. Error   Beta   t         Sig.   Tolerance   VIF
  4 (Constant)   −3.484   .283                −12.332   .000
    absorb        .380    .147         .227   2.580     .010   .109        9.190
    unisex        .424    .048         .325   8.909     .000   .637        1.570
    Size          .614    .066         .328   9.311     .000   .682        1.467
    leak          .332    .148         .195   2.242     .026   .112        8.964
  a. Dependent Variable: Preference
Factor Analysis
▪ FA is a multivariate statistical technique used to summarize the information contained in a large number of variables in a small number of subsets or factors.
▪ FA is used to identify underlying dimensions or constructs in the data and to reduce the number of variables by eliminating redundancy.

Types of Factor Analysis
1. Exploratory Factor Analysis
  • Studies the interrelationships among the variables in an effort to find a new set of variables.
2. Confirmatory Factor Analysis
  • Confirms a hypothesized relationship structure.
[Diagram: ten variables X1, X2, …, X10 grouped into three factors — Factor 1: X2, X4, X10; Factor 2: X1, X7, X8, X9; Factor 3: X3, X5, X6.]
Example
  Factor: quality of the service providers
    Variables: service of nurses, service of doctors, service of office employees
  Factor: quality of the products
    Variables: quality of drugs, quality of treatment, quality of equipment
Factor Analysis

1. Study the interrelationships among the variables in an effort to find a new set of variables.
2. Confirm a hypothesized relationship structure.

1. Exploratory Factor Analysis
  1.1 Principal Component Analysis
  1.2 Common Factor Analysis
2. Confirmatory Factor Analysis
Exploratory Factor Analysis

➢ Studies the structure of the factor model when the underlying theory is not known or specified a priori.
➢ Data are used to help reveal or identify the structure of the factor model.
➢ The researcher has no knowledge of the factor structure and is essentially seeking to identify the factor model that would account for the covariances among the variables.
➢ Exploratory factor analysis software: SPSS, SAS, etc.
[Diagram: seven observed variables Var1–Var7 loading on two components, component 1 and component 2.]
Confirmatory Factor Analysis

▪ Used to validate a hypothesized model and estimate the model parameters.
▪ The precise structure of the model is known beforehand.

Software: Amos, EQS, MPLUS, LISREL

If we wish to confirm or negate a hypothesized structure => use confirmatory FA.
[Path diagram: observed variables X1–X7 (with error terms e1–e7) load on factors F1 and F2; observed variables Y1–Y7 (with error terms e8–e14) load on factors F3 and F4; D1–D4 are disturbance terms on the factors.]
Types of Exploratory Factor Analysis
1. Principal Component Analysis (PCA)
2. Common Factor Analysis

Exploratory Factor Analysis: PCA
[Diagram: Item 1 (Variable 1), Item 2 (Variable 2), and Item 3 (Variable 3) act as independent variables forming a component (the dependent variable).]
I. Principal Component Analysis (PCA)
(Measurement Model)

1. Analyzes the total variance in the data set.
  How much of the total variance do the components account for?
2. The variance of each variable = 1.
3. Full solution:
  number of components (factors) = number of measured variables.
4. No unique error.
Exploratory Factor Analysis (Principal Component Analysis)
[Diagram: Variable1–Variable7 loading on component 1/factor 1 and component 2.]
II. Common Factor Analysis
(PAF: Principal Axis Factoring)
Note: the factor structure is descriptive of what the variables have in common.
1. Total variance = common variance (shared) + unique variance

[Venn diagram: three items I1, I2, I3; the overlap of I1, I2, and I3 is their common variance, and the non-overlapping part of I3 is the unique variance of I3.]
II. Common Factor Analysis
(PAF: Principal Axis Factoring)
1. Factor = independent variable;
   indicator / measurement = dependent variable.

[Diagram: the factor (independent) points to Item 1, Item 2, and Item 3 (dependent), each with its own error term e1, e2, e3.]
II. Common Factor Analysis
(PAF: Principal Axis Factoring)

[Diagram: two latent/unobserved factors, Factor 1 and Factor 2; observed items Item1–Item6 each carry an error/unique term e1–e6.]
Steps for Exploratory Factor Analysis

Step 1: Check that the data are appropriate for EFA.
  H₀: X₁, X₂, …, Xₚ are independent
  H₁: X₁, X₂, …, Xₚ are related

Use the KMO (Kaiser-Meyer-Olkin) measure:

  KMO = Σᵢ≠ⱼ rᵢⱼ² / (Σᵢ≠ⱼ rᵢⱼ² + Σᵢ≠ⱼ aᵢⱼ²) ,  0 ≤ KMO ≤ 1

  rᵢⱼ = bivariate correlation between Xᵢ and Xⱼ
  aᵢⱼ = partial correlation between Xᵢ and Xⱼ, controlling for the other X's

Also use Bartlett's test of sphericity.
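A sketch implementing the KMO formula above with numpy (an assumed implementation, not SPSS's; the sample matrix `data` is hypothetical):

```python
import numpy as np

def kmo(data):
    """Kaiser-Meyer-Olkin measure of sampling adequacy."""
    R = np.corrcoef(data, rowvar=False)       # bivariate correlations r_ij
    R_inv = np.linalg.inv(R)
    d = np.sqrt(np.outer(np.diag(R_inv), np.diag(R_inv)))
    A = -R_inv / d                            # partial correlations a_ij
    off = ~np.eye(R.shape[0], dtype=bool)     # exclude the diagonal (i = j)
    r2, a2 = (R[off] ** 2).sum(), (A[off] ** 2).sum()
    return r2 / (r2 + a2)

rng = np.random.default_rng(0)
data = rng.normal(size=(300, 9)) @ rng.normal(size=(9, 9))  # hypothetical sample
print(kmo(data))
```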
Steps for Factor Analysis

  KMO         Recommendation
  ≥ 0.9       Marvelous
  0.80 +      Meritorious
  0.70 +      Middling
  0.60 +      Mediocre
  0.50 +      Miserable
  Below 0.5   Unacceptable
Fundamental Steps for EFA
Step 2: Select the factor extraction method.
1. Principal Component Analysis (PCA)
  : no distributional assumptions.
2. Common Factor
  ◼ Principal Axis Factoring (PAF)
  ◼ Maximum Likelihood (MLE)
    : assumes multivariate normality, but provides a goodness-of-fit evaluation.
  ➢ etc.
Fundamental Steps for EFA
Step 3: Consider the number of factors.
▪ Determine the appropriate number of factors by
  1. Eigenvalues (a programmatic sketch follows after this slide)
  2. A scree plot of the eigenvalues

Step 4: Identify the original/observed variables for each factor by considering the factor loadings. If they cannot be identified => go to Step 5.

Step 5: Rotate the axes of the factors.
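A minimal sketch of the eigenvalue (Kaiser) criterion, on the same hypothetical `data` matrix as the KMO sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(300, 9)) @ rng.normal(size=(9, 9))  # hypothetical sample

R = np.corrcoef(data, rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]          # largest first

print(eigenvalues)                   # plot against rank for a scree plot
print(int((eigenvalues > 1).sum()))  # Kaiser rule: retain eigenvalues > 1
```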
Step 6: Identify the original/observed variables for each factor by considering the factor loadings.
Step 7: Interpret the factors and evaluate the quality of the solution.
  ➢ Consider the meaningfulness and interpretability of the factors.
  ▪ Eliminate poorly defined factors.
  ▪ Re-run and (ideally) replicate the factor analysis: if items or factors are dropped in the preceding step, re-run the EFA.
Step 8: Name the factors.


Example 2. Exploratory Factor Analysis: PCA
Analyze => Dimension Reduction => Factor…
[SPSS dialog screenshots omitted.]
KMO and Bartlett's Test
  Kaiser-Meyer-Olkin Measure of Sampling Adequacy    .806
  Bartlett's Test of Sphericity   Approx. Chi-Square  2372.596
                                  df                  36
                                  Sig.                .000
Communalities
            Initial   Extraction
  Size      1.000     .861
  price     1.000     .890
  Value     1.000     .785
  unisex    1.000     .944
  style     1.000     .943
  absorb    1.000     .840
  leakage   1.000     .869
  Comfort   1.000     .804
  tape      1.000     .615
Extraction Method: Principal Component Analysis.
Total Variance Explained
             Initial Eigenvalues         Extraction Sums of Squared Loadings   Rotation Sums of Squared Loadings
  Component  Total   % of Var   Cum %    Total   % of Var   Cum %              Total   % of Var   Cum %
  1          5.019   55.764     55.764   5.019   55.764     55.764             3.058   33.981     33.981
  2          1.509   16.764     72.528   1.509   16.764     72.528             2.550   28.337     62.318
  3          1.024   11.374     83.903   1.024   11.374     83.903             1.943   21.584     83.903
  4           .561    6.230     90.133
  5           .327    3.635     93.768
  6           .269    2.986     96.754
  7           .137    1.525     98.279
  8           .099    1.105     99.384
  9           .055     .616    100.000
Extraction Method: Principal Component Analysis.
Component Matrix(a)
             Component
             1      2      3
  Size       .761   .485   .217
  price      .751   .524   .227
  Value      .669   .487   .316
  unisex     .752   .148   −.596
  style      .726   .126   −.633
  absorb     .828   −.383  .085
  leakage    .820   −.439  .070
  Comfort    .757   −.454  .158
  tape       .636   −.423  .179
Extraction Method: Principal Component Analysis.
a. 3 components extracted.
Rotated Component Matrix(a)
             Component
             1      2      3
  Size       .226   .865   .247
  price      .194   .891   .239
  Value      .187   .858   .118
  unisex     .253   .272   .897
  style      .242   .226   .913
  absorb     .845   .239   .264
  leakage    .874   .189   .265
  Comfort    .864   .181   .157
  tape       .767   .143   .085
Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 4 iterations.
Rotation Methods
I. Orthogonal Rotation
  1. Varimax (Kaiser 1958)
  2. Quartimax
  3. Equamax
  Note: gives one rotated factor matrix (structural coefficients).
II. Oblique Factor Rotation
  ▪ Direct Oblimin
  Note: gives 2 different rotated factor matrices:
  1. Pattern matrix (pattern coefficients)
  2. Structure matrix
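A sketch of a varimax-rotated principal-component extraction using the third-party factor_analyzer package (an assumption; the package is not part of SPSS or these slides), applied to the hypothetical `data` matrix from before:

```python
import numpy as np
from factor_analyzer import FactorAnalyzer

rng = np.random.default_rng(0)
data = rng.normal(size=(300, 9)) @ rng.normal(size=(9, 9))  # hypothetical sample

fa = FactorAnalyzer(n_factors=3, method="principal", rotation="varimax")
fa.fit(data)

print(fa.loadings_)                  # rotated loadings (cf. the SPSS output)
print(fa.get_communalities())
```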
Orthogonal Rotation / Factor Rotation

[Diagrams omitted: variables X1–X10 plotted in the F1–F2 factor space before and after an orthogonal rotation of the axes; a further example shows X1–X6 with rotated axes F1 and F2.]
II. Common Factor Analysis
(PAF: Principal Axis Factoring)

Communalities
            Initial   Extraction
  Size      .766      .820
  price     .784      .911
  Value     .558      .588
  unisex    .817      .911
  style     .803      .877
  absorb    .893      .843
  leakage   .900      .905
  Comfort   .655      .701
  tape      .453      .420
Extraction Method: Principal Axis Factoring.
Total Variance Explained
          Initial Eigenvalues          Extraction Sums of Squared Loadings   Rotation Sums of Squared Loadings(a)
  Factor  Total   % of Var   Cum %     Total   % of Var   Cum %              Total
  1       5.019   55.764     55.764    4.818   53.534     53.534             3.972
  2       1.509   16.764     72.528    1.283   14.254     67.788             3.565
  3       1.024   11.374     83.903     .876    9.734     77.522             3.298
  4        .561    6.230     90.133
  5        .327    3.635     93.768
  6        .269    2.986     96.754
  7        .137    1.525     98.279
  8        .099    1.105     99.384
  9        .055     .616    100.000
Extraction Method: Principal Axis Factoring.
a. When factors are correlated, sums of squared loadings cannot be added to obtain a total variance.
Factor Matrix(a)
             Factor
             1      2      3
  Size       .750   .454   .228
  price      .753   .526   .259
  Value      .627   .372   .237
  unisex     .758   .145   −.562
  style      .727   .122   −.578
  absorb     .818   −.401  .112
  leakage    .820   −.472  .102
  Comfort    .724   −.399  .135
  tape       .572   −.288  .101
Extraction Method: Principal Axis Factoring.
a. 3 factors extracted. 11 iterations required.

Pattern Matrix(a)
             Factor
             1      2      3
  Size       .012   .881   −.034
  price      −.048  .972   −.011
  Value      .037   .763   .029
  unisex     .002   .027   −.939
  style      .003   −.019  −.945
  absorb     .892   .022   −.028
  leakage    .962   −.052  −.027
  Comfort    .850   .004   .028
  tape       .640   .025   .009
Extraction Method: Principal Axis Factoring.
Rotation Method: Oblimin with Kaiser Normalization.
a. Rotation converged in 5 iterations.
Factor Correlation Matrix
  Factor   1       2       3
  1        1.000   .504    −.534
  2        .504    1.000   −.536
  3        −.534   −.536   1.000
Extraction Method: Principal Axis Factoring.
Rotation Method: Oblimin with Kaiser Normalization.
References

Kanlaya Vanichbuncha, การวิเคราะห์ข้อมูลหลายตัวแปร (Multivariate Data Analysis).
Kanlaya Vanichbuncha, การวิเคราะห์สถิติขั้นสูงด้วย SPSS (Advanced Statistical Analysis with SPSS).
Kanlaya Vanichbuncha, สถิติสำหรับงานวิจัย (Statistics for Research).
Kanlaya Vanichbuncha (กัลยา วานิชย์บัญชา), การวิเคราะห์สมการโครงสร้าง (SEM) ด้วย AMOS (Structural Equation Model Analysis (SEM) with AMOS).
