SEM Ppt2final
PROF.SANJIV MITTAL
SEM: The Basic
Structural equation modeling
Structural Equation Modelling (SEM) is a
powerful method to estimate multiple and
simultaneous relationships involving
several dependent variables and
explanatory variables, and allows for the
inclusion of latent variables which cannot
be directly measured but can be expressed
as a function of other measurable
variables
Latent v/s Observed Variables
In the behavioral sciences, researchers are interested in
studying theoretical constructs that cannot be
observed directly. These abstract phenomena are
called latent variables or factors, e.g. self-concept,
motivation, teacher ability, service quality,
etc. To measure them, the researcher must
operationally define the latent variable of interest
in terms of behavior believed to represent it, typically
using attitudinal scales administered as interview questions;
the responses are the observation scores. These measured scores,
i.e. measurements, are termed observed or
manifest variables.
Latent v/s Observed Variables
In SEM terminology they serve as indicators
of the underlying constructs that they
represent. Hence it is necessary to have this
bridging process between the observed
variables and unobserved latent variables.
Exogenous and Endogenous Latent Variables
Latent variables are of two types: exogenous or
endogenous. Exogenous latent variables are also
called independent variables; they “cause”
fluctuations in the values of other latent variables
in the model. Changes in the values of exogenous
variables are not explained by the model. They are
influenced by background factors external to
the model, such as demographics (gender, age,
socio-economic status, etc.).
Endogenous latent variables, on the other hand,
are called dependent variables. They are
influenced by the exogenous variables, directly or
indirectly, in the model.
SEM
SEM extends the regression assumptions as follows:
Several dependent variables can be considered at
the same time
Explanatory variables can be assumed to be
measured with a random error
Endogenous variables can be used to explain
dependent variables
Correlation between explanatory variables is
allowed for
And this is not all. Another key feature of SEM is
the possibility of including latent (unobservable)
variables in the model, as either endogenous or
exogenous variables.
SEM is ……
a family of statistical techniques
which incorporates and
integrates
Path analysis
Linear regression
Factor analysis
Causality
Causality has theoretical basis
[Figure: example causal diagrams relating Education and Success in Life; Price, Supply and Demand; Unemployment Rate, Windows of Opportunity for Crime and No. of Crimes]
Cause and Effect
Philosophers have had a great deal to say
about the conditions necessary to infer
causality, i.e. cause and effect:
the cause should occur before the effect is
observed, and
the cause should never occur without the
presence of the effect.
Statistical Modeling
A Statistical Model DOES NOT necessarily have a
theoretical basis – it may either
‘make sense’ or be ‘nonsense’
[Figure: example statistical models relating Weight, Heart Disease, Income, Smoking, No. of Road Accidents and No. of Newspaper Readers]
Variables in SEM
Manifest variables can be directly measured and serve
as indicators/items
Latent variables are not observable and are the actual
components of the causal relationship
Just like variables in regression analysis one can distinguish
between exogenous constructs and endogenous constructs in
a causal relationship
Exogenous constructs only act as explanatory factors and
do not depend on any other construct
Endogenous constructs play the role of the dependent
variables and they appear on the left-hand side in at least
one of the SEM causal relationships
Examples of marketing applications for SEM
Scarcity and willingness to buy: how perceived scarcity influences
willingness to buy through a set of mediating variables, such as perceived
quality, perceived monetary sacrifice and perceived symbolic benefits
(Wu and Hsing, 2006)
SEM & related techniques
Structural Equation Modelling is a
comprehensive method which includes as
special cases:
◦ confirmatory factor analysis
◦ path analysis
◦ multivariate regression (simultaneous
equation systems)
[Figure: path diagram of two exogenous latent variables/constructs, Verbal Ability and Quantitative Ability, each measured by four indicators (y1 to y4 and x5 to x8) with their error terms (d1 to d8, e1 to e4)]
SEM Nomenclature
Independent variables, which are assumed to
be measured without error, are called
exogenous or upstream variables;
Dependent or mediating variables are called
endogenous or downstream variables.
Manifest or observed variables or indicators
are directly measured by researchers
Latent or unobserved variables are not directly
measured but are inferred from the relationships
or correlations among measured variables in
the analysis. Examples: self-concept,
motivation, powerlessness, verbal ability,
capitalism, social class.
SEM Nomenclature (cont.)
SEM illustrates relationships among
observed and unobserved variables using
path diagrams.
Ovals or circles represent latent variables,
Rectangles or squares represent measured
variables.
Residuals/error terms are always
unobserved, so they are represented by
ovals or circles.
SEM – Definition
SEM is an extension of the general linear
model (GLM) that enables a researcher to
test a set of regression equations
simultaneously.
SEM consists of TWO components;
Structural Model
illustrates the relationships among the latent
constructs or endogenous variables
Measurement Model
represents how the constructs are related to their
indicators or manifest variables
Example
In psychology, the theory
postulates that …
[Figure: Ability/Intelligence (exogenous latent construct 1), Aspirations (endogenous latent construct 1) and Achievement (endogenous latent construct 2)]
Full Latent Variable Model
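[Figure: full latent variable model. The structural model links one exogenous latent variable to two endogenous latent variables; the measurement model links the latent variables to the indicators x1 to x3 and y1 to y6, each with an error term.]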
Structural Model
The structural model allows for certain
relationships among the latent variables,
depicted by lines or arrows (in a path diagram)
In the path diagram, we specified that Ability
and Achievement were related in a specific way.
That is, intelligence had some influence on later
achievement.
Thus, one result from the structural model is an
indication of the extent to which these a priori
hypothesized relationships are supported by our
sample data.
Structural Model (Cont.)
The structural equation addresses the
following questions:
Are Ability and Achievement related?
Exactly how strong is the influence of
Ability on Achievement?
Could there be other latent variables that
we need to consider to get a better
understanding of the influence on
Achievement?
Example:
ONE Latent (unobserved) Exogenous Variable &
TWO Latent (unobserved) Endogenous Variables
[Figure: structural model with one exogenous and two endogenous latent variables; measurement model with indicators x1 to x3 and y1 to y6 and their error terms]
Measurement Model
Specifying the relationship between the latent
variables and the observed variables
Answers the questions:
1) To what extent are the observed variables
actually measuring the hypothesized latent
variables?
2) Which observed variable is the best measure of
a particular latent variable?
3) To what extent are the observed variables
actually measuring something other than the
hypothesized latent variable?
Exploratory Factor Analysis (EFA) or
Confirmatory Factor Analysis (CFA) is used to
determine the significant observed variables
related to each of the latent variables
Measurement Model (Cont.)
The relationships between the observed
variables and the latent variables are described
by factor loadings
Factor loadings provide information about the
extent to which a given observed variable is able
to measure the latent variable. They serve as
validity coefficients.
Measurement error is defined as that portion of
an observed variable that is measuring
something other than what the latent variable is
hypothesized to measure. It serves as a measure
of reliability.
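As a rough illustration of how a factor loading and measurement error relate, the sketch below assumes standardized variables and indicators that each load on a single latent variable. Under that assumption the squared loading is the share of the indicator's variance explained by the latent variable, and the remainder is error variance. The function names and the loading values are hypothetical, chosen only for illustration.

```python
def indicator_reliability(std_loading: float) -> float:
    """Share of an indicator's variance explained by its latent variable."""
    return std_loading ** 2


def error_variance(std_loading: float) -> float:
    """Unexplained share of variance for a standardized indicator."""
    return 1.0 - std_loading ** 2


# Hypothetical standardized loadings for three indicators of one construct.
loadings = {"item1": 0.80, "item2": 0.60, "item3": 0.30}

for name, lam in loadings.items():
    print(name, round(indicator_reliability(lam), 2), round(error_variance(lam), 2))
```

A high loading leaves little room for error, while a low loading means the indicator mostly measures something other than the intended latent variable.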
Exploratory FA (EFA)
In EFA the factor structure or theory about a
phenomenon is NOT KNOWN.
For example, the researcher is interested in
measuring “the achievement of personnel”.
Suppose the researcher has no knowledge (or very little theory)
regarding
the factors that contribute to achievement
the no. of indicators of each factor
which indicators represent which factor
In such a case, the researcher may collect data
and ‘explore’ for a factor or theory which can
explain the correlations among the indicators.
The Factor Analytic Model
The best known method to investigate the
relationship between a set of observed and latent
variables is factor analysis. Here the
researcher examines the covariance among a set
of observed variables to gather information on
the underlying constructs or factors.
It is normally performed in situations when the
links between the observed and latent variables
are unknown or uncertain. So analysis in an
exploratory mode proceeds to determine how
and to what extent the observed variables are
linked to their underlying factors.
The Factor Analytic Model
The researcher thus attempts to find the
minimal number of factors that account
for the covariances or correlations among the
observed variables. In factor analysis
these relationships are represented by
factor loadings. It is exploratory in nature
as the researcher has no prior knowledge that
these items do, indeed, measure the
intended factors.
Confirmatory FA (CFA)
In CFA the precise factor structure or theory about a
phenomenon is KNOWN or specified a priori.
For example, a researcher is interested in measuring
“consumer preference” for a product.
Suppose that ‘based on previous research’ it is hypothesized
(the theory) that a construct or factor to measure ‘consumer
preference’ is
a one-dimensional construct with 7 indicators or items as its
measures
The obvious question is:
How well do the empirical data conform to the theory of
consumer preferences? Or
How well do the data fit the model?
In such a case, CFA is used to do empirical ‘confirmation’ or
‘testing’ of the theory
Confirmatory Factor Analysis
CFA is used when the researcher has prior knowledge
of the underlying latent variable structure. The
a priori relationships between the observed measures and
the underlying factors are then
tested statistically using SEM. Each item is
free to load on its own factor but restricted to have
zero loadings on the remaining factors. The measurement model is
evaluated statistically by determining the adequacy of its goodness of fit
to the sample data.
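The zero-loading restriction can be sketched with the standard factor-analytic identity for the model-implied covariance matrix, Sigma = Lambda * Phi * Lambda' + Theta. The example below is a minimal sketch, assuming a hypothetical two-factor model with four items; all numbers are made up for illustration, and the error variances are chosen so that every indicator has unit variance.

```python
def implied_covariance(loadings, factor_cov, error_var):
    """Model-implied covariance matrix: Lambda * Phi * Lambda^T + diag(Theta)."""
    p, m = len(loadings), len(factor_cov)
    sigma = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            s = 0.0
            for a in range(m):
                for b in range(m):
                    s += loadings[i][a] * factor_cov[a][b] * loadings[j][b]
            # Error variances only enter on the diagonal (uncorrelated errors assumed).
            sigma[i][j] = s + (error_var[i] if i == j else 0.0)
    return sigma


# Hypothetical CFA pattern: items 1-2 load only on factor 1, items 3-4 only on factor 2.
LAMBDA = [[0.8, 0.0], [0.7, 0.0], [0.0, 0.9], [0.0, 0.6]]
PHI = [[1.0, 0.5], [0.5, 1.0]]        # factor variances fixed to 1, correlation 0.5
THETA = [0.36, 0.51, 0.19, 0.64]      # error variances
SIGMA = implied_covariance(LAMBDA, PHI, THETA)

for row in SIGMA:
    print([round(v, 3) for v in row])
```

Goodness of fit then amounts to asking how close this implied matrix is to the covariance matrix observed in the sample.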
Reliability
• Definition: the extent to which a variable or set of
variables is consistent in what it is intended to
measure
• If multiple measurements are taken, reliable
measures will all be consistent in their values
• It is the degree to which the observed variable
measures the “true” value and is “error free”
• It is different from validity
Reliability
• The degree to which scores are free from
random measurement error
• Reliability measures
– Internal Consistency Reliability
– Test-retest Reliability
– Alternate Forms Reliability
Reliability
• Levels of Reliability
– 0.90 Excellent
– 0.80 Very Good
– 0.70 Adequate
– <0.70 Poor
Example: Reliability of Observed Variables
Cronbach’s alpha was computed for all the variables
Variable      No. of items   Reliability
Variable1     10             .91
Variable2     10             .87
Variable3     10             .58
Variable4     10             .70
Variable5     12             .72
Variable6     12             .80
Variable7     12             .80
Variable8     12             .87
Variable9     10             .84
Variable10    7              .71
Variable11    4              .48
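For reference, Cronbach's alpha can be computed from raw item scores as k/(k-1) * (1 - sum of item variances / variance of the total score). The sketch below is a minimal illustration, not the procedure used to produce the table: the response data are hypothetical, and reading the reliability levels as lower bounds (0.90 and above is Excellent, and so on) is an assumption.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: one list of respondent scores per item (all lists the same length).
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    sum_item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - sum_item_var / variance(totals))


def reliability_level(alpha):
    """Label a coefficient, reading the listed levels as lower bounds (an assumption)."""
    if alpha >= 0.90:
        return "Excellent"
    if alpha >= 0.80:
        return "Very Good"
    if alpha >= 0.70:
        return "Adequate"
    return "Poor"


# Hypothetical responses: 3 items answered by 5 respondents.
scores = [[2, 4, 3, 5, 4], [3, 5, 3, 4, 4], [2, 5, 4, 5, 3]]
print(round(cronbach_alpha(scores), 3))

# A few of the coefficients reported in the table.
for name, alpha in [("Variable1", 0.91), ("Variable3", 0.58),
                    ("Variable4", 0.70), ("Variable11", 0.48)]:
    print(name, reliability_level(alpha))
```

On this reading, Variable3 and Variable11 fall below the 0.70 level and would be labelled Poor.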
Types of Reliability
Test-retest
Assessed by administering the same instrument to
the same sample respondent at two points in time,
and computing the correlation between two sets of
scores.
Internal consistency reliability
The extent to which individual items that constitute a
test correlate with one another or with the test total.
In short, it measures how consistently respondents
respond to the items within scale.
Validity
Definition: the extent to which an item or set of
items correctly represents the construct of
study, i.e. the degree to which it is free from any
systematic or non-random error
Validity deals with
How well the construct is defined by the item/s
(what should be measured)
While Reliability deals with
How consistent the item/s is/are in measuring the
construct (HOW it is measured)
Validity
Whether the scores measure what they are
supposed to measure
Types of validity
Construct Validity
Criterion-Related Validity or Nomological Validity
(Correlation with an external standard)
Convergent Validity (SEM Confirmatory Factor Analysis
helps to establish convergent validity)
Discriminant Validity (Can be determined through SEM
Confirmatory Factor Analysis)
Types of Measurement Scale
There are 4 types of measurement scale in a scale
instrument
◦ Nominal Scale
◦ Ordinal
◦ Interval Scales
◦ Ratio
Some other common scales like Likert scales,
Semantic Differential Scales, Dichotomous
Scales etc can be categorized into the 4 above
Metric and Non-metric Scales
Metric scales are quantitative data where the
parameters of the scale form a continuum
◦ Interval or Ratio scale data
Non-metric scales are qualitative data:
attributes, characteristics or categorical
properties that identify or describe a subject or
object
◦ Possibly Nominal or Ordinal scale data
VARIABLE SCALES
SEM in general assumes observed variables are
measured on a linear continuous scale. Only metric
or interval scale data are used in SEM.
Correlation
Perhaps the most basic semantic
◦ Definition: the linear relationship of two variables
The strength of relationship is determined by the
correlation coefficient and R² .
There are 2 common types of correlation
coefficient
◦ Pearson Product Moment Correlation (Interval)
◦ Spearman Ranking Correlation (Ordinal)
The former is the one we will use in this case
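To make the two coefficients concrete, the sketch below computes a Pearson correlation directly from its definition and a Spearman correlation as the Pearson correlation of the ranks. It is a plain-Python illustration with hypothetical data; the helper names are not from the slides.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: y increases with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
r = pearson(x, y)
print(round(r, 4), round(r ** 2, 4), round(spearman(x, y), 4))
```

Because the relationship is monotonic but not linear, the rank correlation is perfect while the Pearson coefficient, and hence R², falls slightly short of one.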
Covariance
Estimation
Maximum likelihood estimation (MLE)
• the manifest indicators must follow a multivariate normal distribution (i.e.
they are normally distributed for any value of the other indicators)
• the latent constructs are also assumed to be normally distributed
Key aspects of SEM estimation
• Individual cases only enter the estimation process to obtain the covariance
matrix
• SEM does not use the individual observations (cases) for the estimation of
the parameters
• Estimation is based on the covariance matrix not on the individual cases
• Thus, degrees of freedom refer to the elements in the covariance
matrix
• An adequate sample size is still needed
• Identification problems may emerge when many of the elements of the
covariance matrix are close to zero unless the sample size is large enough
• A simple rule of thumb requires at least fifteen observations per measured
variable or indicator
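Since estimation works on the covariance matrix, the information available is the number of distinct variances and covariances, p(p+1)/2 for p observed variables. The sketch below counts these elements, subtracts the free parameters to get degrees of freedom, and applies the fifteen-observations rule of thumb. The function names and the example of 8 indicators with 17 free parameters are hypothetical, and the count ignores means and intercepts.

```python
def covariance_elements(n_indicators: int) -> int:
    """Distinct variances and covariances among p observed variables: p(p+1)/2."""
    return n_indicators * (n_indicators + 1) // 2


def degrees_of_freedom(n_indicators: int, n_free_parameters: int) -> int:
    """Covariance-matrix elements minus freely estimated parameters."""
    return covariance_elements(n_indicators) - n_free_parameters


def minimum_sample_size(n_indicators: int, per_indicator: int = 15) -> int:
    """Rule of thumb from the slide: at least fifteen observations per indicator."""
    return per_indicator * n_indicators


# Hypothetical model: 8 indicators and 17 free parameters.
print(covariance_elements(8), degrees_of_freedom(8, 17), minimum_sample_size(8))
```

Note that the degrees of freedom here do not depend on the number of cases, only on the size of the covariance matrix and the number of parameters.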
Testing
Goodness-of-fit indicators
Three Types of Fit Indices
1. Absolute Fit Indices
Goodness-of-fit indicators
2. Incremental Fit Indices
These goodness-of-fit indices are expected to be as close as
possible to one (and not below 0.90):
• normed fit index (NFI)
• relative fit index (RFI)
• incremental fit index (IFI)
• Tucker-Lewis coefficient (TLI)
• comparative fit index (CFI)
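The indices in this list all compare the fitted model with the independence (baseline) model. The sketch below uses the commonly cited textbook formulas, which are not given on the slides and are assumed here to match what the software reports. The chi-square (CMIN) values and degrees of freedom are those of the default and independence models in the AMOS output shown later in this deck.

```python
def nfi(chi_m, chi_b):
    """Normed fit index."""
    return 1 - chi_m / chi_b


def rfi(chi_m, df_m, chi_b, df_b):
    """Relative fit index."""
    return 1 - (chi_m / df_m) / (chi_b / df_b)


def ifi(chi_m, df_m, chi_b):
    """Incremental fit index."""
    return (chi_b - chi_m) / (chi_b - df_m)


def tli(chi_m, df_m, chi_b, df_b):
    """Tucker-Lewis coefficient."""
    return (chi_b / df_b - chi_m / df_m) / (chi_b / df_b - 1)


def cfi(chi_m, df_m, chi_b, df_b):
    """Comparative fit index, based on non-centrality (chi-square minus df)."""
    num = max(chi_m - df_m, 0.0)
    den = max(chi_b - df_b, chi_m - df_m, 0.0)
    return 1.0 if den == 0 else 1 - num / den


# Default model and independence model from the AMOS output in this deck.
CHI_M, DF_M = 619.428, 74
CHI_B, DF_B = 1736.822, 105

for name, value in [("NFI", nfi(CHI_M, CHI_B)),
                    ("RFI", rfi(CHI_M, DF_M, CHI_B, DF_B)),
                    ("IFI", ifi(CHI_M, DF_M, CHI_B)),
                    ("TLI", tli(CHI_M, DF_M, CHI_B, DF_B)),
                    ("CFI", cfi(CHI_M, DF_M, CHI_B, DF_B))]:
    print(name, round(value, 3))
```

All five values come out well below 0.90 for the default model, which is in line with that model being rejected later in the deck.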
Measurement Error
There can exist some degree of measurement
error when we cannot perfectly measure a
concept. For example, when asking straightforwardly
about one’s household income, we know
some people will answer incorrectly, i.e. by either
overstating or understating it; this is called measurement
error. Accounting for measurement error helps to strengthen
the coefficients in our dependence models, which is
why we use the error concept in the SEM model.
Example
Retail managers are interested in knowing
how individuals become loyal, committed
customers. They ask the research
question: How do customer perceptions
of three key strategic elements (price,
service and atmosphere) determine customer
acceptance of the store, measured by
customer share and customer commitment?
Example contd…..
From their experience they developed a series of
relationships they feel explain the process:
Better price perceptions increase customer share.
Better service perceptions increase customer
share.
Better store atmosphere perceptions increase
customer share.
Higher customer share increases customer
commitment.
Structural Equation Model
The relationships identified by the retail
managers include five constructs: perceptions of
price, service, and store atmosphere, along with
customer share and customer commitment. The
first step is to identify which constructs can be
considered exogenous versus endogenous.
From our relationships we can identify three
exogenous constructs and two endogenous
constructs:
Exogenous Constructs    Endogenous Constructs
Price                   Customer Share
Service                 Customer Commitment
Atmosphere
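The same classification can be derived mechanically from the hypothesized relationships: a construct that is never the target of a dependence path is exogenous, and any construct that receives at least one path is endogenous. The sketch below is a small illustration using the four relationships stated by the retail managers; the function name and data layout are hypothetical.

```python
# Each pair is (cause, effect), taken from the managers' hypothesized relationships.
PATHS = [
    ("Price", "Customer Share"),
    ("Service", "Customer Share"),
    ("Atmosphere", "Customer Share"),
    ("Customer Share", "Customer Commitment"),
]


def classify(paths):
    """Split constructs into exogenous (never a target) and endogenous (a target at least once)."""
    targets = {effect for _, effect in paths}
    constructs = []
    for cause, effect in paths:
        for c in (cause, effect):
            if c not in constructs:
                constructs.append(c)
    exogenous = [c for c in constructs if c not in targets]
    endogenous = [c for c in constructs if c in targets]
    return exogenous, endogenous


exogenous, endogenous = classify(PATHS)
print("Exogenous:", exogenous)
print("Endogenous:", endogenous)
```

Customer Share is endogenous even though it also acts as a predictor of Customer Commitment, because it is explained by other constructs in the model.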
[Figure: path diagram in which Price, Service and Atmosphere each point to Customer Share, which in turn points to Customer Commitment]
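Path Diagram Figure 1
[Figure 1: path diagram with estimates. Price to Customer Share = .065; Service to Customer Share = .219; Atmosphere to Customer Share = .454; Customer Share to Customer Commitment = .500; correlations among the exogenous constructs: Service and Price = .200, Service and Atmosphere = .150, Price and Atmosphere = .200]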
Covariance matrix
Y(cc) = .50(cs)
Direct Path
Service to Customer Share = .219
Indirect Paths
Service to Price to Customer Share = .200 x .065 = .013
Service to Atmosphere to Customer Share = .150 x .454 = .068
Total = .219 + .013 + .068 = .300
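The path arithmetic can be checked in a few lines: each indirect path is the product of the coefficients along it, and the direct and indirect paths are then added. The sketch below uses only the coefficients quoted for Service; the variable names are illustrative.

```python
# Coefficients quoted in the path diagram (Figure 1).
SERVICE_TO_SHARE = 0.219          # direct path
CORR_SERVICE_PRICE = 0.200        # Service <-> Price
PRICE_TO_SHARE = 0.065            # Price -> Customer Share
CORR_SERVICE_ATMOSPHERE = 0.150   # Service <-> Atmosphere
ATMOSPHERE_TO_SHARE = 0.454       # Atmosphere -> Customer Share

# An indirect path is the product of the coefficients along it.
via_price = CORR_SERVICE_PRICE * PRICE_TO_SHARE
via_atmosphere = CORR_SERVICE_ATMOSPHERE * ATMOSPHERE_TO_SHARE

# The direct and indirect paths add up to the total association.
total = SERVICE_TO_SHARE + via_price + via_atmosphere

print(round(via_price, 3), round(via_atmosphere, 3), round(total, 3))
```

The product .150 x .454 is about .068, so the figure of .30 is best read as the total over the direct path and the two indirect paths (.219 + .013 + .068).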
In reflective Model
a) The measures denote the effects of the underlying
latent construct.
b) A change in the latent variable causes variance in all
measures simultaneously, and
c) All measures in a reflective model are expected to be positively
correlated.
Reflective v/s Formative Model
[Figure: reflective and formative measurement models relating measures to a latent construct]
Graphical interface
Alternative software
• AMOS reads SPSS data sets
• LISREL, the first computer program, which dates back to 1973 after the
pioneering work of the statistician Karl Jöreskog (1967 and 1969) and which has
evolved together with the method.
• SAS also provides a procedure for estimating structural equation models, PROC
CALIS.
• Other packages designed for structural equation models are EQS and Mplus
Drawing the path diagram
Open AMOS graphics from the AMOS
menu
The AMOS graphic interface
Some useful functions
Draw a square (manifest variable)
Draw a circle (latent variable)
Draw a single arrow (causation)
Draw a double arrow (correlation)
Draw a latent variable or add indicators to a latent variable
When clicking on an indicator, associate it with a latent unobserved variable (generally an error); otherwise it creates the latent variable
Copy objects
Move objects
Delete objects
The path diagram
The data
To include the observed variables it is now time to open the SPSS
file:
or simply click on this
button
Adding names of variables from the data-set
Unobservable variables and errors
It is also necessary to give names to the latent variables and to the errors
(circles)
Latent variables: Right-click on the desired circle, select PROPERTIES and
assign a name and a label
Errors: all remaining variables can be assigned an automated name as follows
The model is now ready
[Figure: AMOS path diagram. Latent variable A with indicators bba to bbk and error terms e1 to e11; A, SN and PBC as predictors of ITP, with error term e12]
Before estimation
Intercept is required
for estimation if there
are missing data
Choose estimation
method
Decide whether results
for the independence
and saturated models should
be shown
Analysis properties
Estimation
Estimation output
As the estimation procedure
terminates this button becomes
available to show estimates
directly in the path diagram
Output view
AMOS output window
Summary information
Final estimates
Goodness-of-fit evaluation
Degrees of freedom and Chi-square
Variable summary
Parameter estimates
Output
Computation of DoF(Default model)
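The degrees of freedom are the number of distinct sample moments minus the number of distinct parameters to be estimated. The sketch below reproduces that subtraction with the figures from the CMIN table in this deck's goodness-of-fit output, taking the saturated model's 119 parameters as the number of sample moments (a saturated model has zero degrees of freedom by construction). The function names are illustrative.

```python
def model_df(sample_moments: int, n_parameters: int) -> int:
    """Degrees of freedom: distinct sample moments minus estimated parameters."""
    return sample_moments - n_parameters


def cmin_per_df(cmin: float, df: int) -> float:
    """Chi-square (CMIN) divided by its degrees of freedom."""
    return cmin / df


SAMPLE_MOMENTS = 119   # NPAR of the saturated model in the CMIN table

default_df = model_df(SAMPLE_MOMENTS, 45)        # default model estimates 45 parameters
independence_df = model_df(SAMPLE_MOMENTS, 14)   # independence model estimates 14

print(default_df, round(cmin_per_df(619.428, default_df), 3))
print(independence_df, round(cmin_per_df(1736.822, independence_df), 3))
```

These reproduce the DF and CMIN/DF columns of the CMIN table for the default and independence models.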
Estimates
Path           Estimate   S.E.    C.R.     P
bba <--- A     1.000
bbb <--- A     .985       .074    13.298   ***
bbc <--- A     -.097      .067    -1.461   .144
bbd <--- A     .836       .067    12.496   ***
bbe <--- A     1.045      .078    13.485   ***
bbf <--- A     .853       .061    13.872   ***
bbg <--- A     .887       .074    11.953   ***
bbh <--- A     .965       .079    12.296   ***
bbi <--- A     -.297      .064    -4.628   ***
bbj <--- A     .674       .077    8.807    ***
bbk <--- A     -.028      .084    -.327    .744
q7 <--- A      .073       .012    6.307    ***
q7 <--- sn     .008       .008    1.051    .293
q7 <--- pbc    .016       .010    1.581    .114
The coefficient for bba is constrained to one in order to ensure identification
The rows for bba to bbk are the loadings of the manifest indicators on the latent variable attitude (A)
Attitudes have a positive and significant (albeit small) impact on intentions
SN & PBC do not have a significant impact on intentions
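The C.R. column is the estimate divided by its standard error, and the P column can be recovered by treating the critical ratio as a standard normal z statistic with a two-sided test. The sketch below is a minimal check of that reading (the normal approximation is an assumption about how the P values were produced); it uses only the standard library.

```python
from math import erfc, sqrt


def critical_ratio(estimate: float, std_error: float) -> float:
    """Critical ratio: parameter estimate divided by its standard error."""
    return estimate / std_error


def two_sided_p(cr: float) -> float:
    """Two-sided p-value, treating the critical ratio as a standard normal z."""
    return erfc(abs(cr) / sqrt(2))


# Critical ratios reported in the Estimates table.
for label, cr in [("bbc <--- A", -1.461), ("bbk <--- A", -0.327),
                  ("q7 <--- sn", 1.051), ("q7 <--- pbc", 1.581),
                  ("bbb <--- A", 13.298)]:
    print(label, round(two_sided_p(cr), 3))
```

A ratio recomputed from the rounded table entries, such as critical_ratio(-.097, .067), differs slightly from the printed C.R. because the estimate and standard error are shown to three decimals only. Rows marked *** correspond to p-values that round to zero at three decimals.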
Standardized regression weights
Standardized weights can be interpreted like correlations, but they assume causation
Path           Estimate
bba <--- A     .735
bbb <--- A     .656
bbc <--- A     -.072
bbd <--- A     .616
bbe <--- A     .661
bbf <--- A     .682
bbg <--- A     .594
bbh <--- A     .633
bbi <--- A     -.228
bbj <--- A     .447
bbk <--- A     -.017
q7 <--- A      .324
q7 <--- sn     .048
q7 <--- pbc    .071
The behavioral belief which is most important to measure the latent construct (attitude) is bba, i.e. chicken taste
The weights for bbc, bbi and bbk are negatively related to attitudes. They measure the following beliefs:
bbc: difficulty of preparation
bbi: the agreement with the statement that chicken lacks flavour
bbk: animal welfare concern
A standardized weight of 0.32 indicates a positive but small relation between attitudes and intentions to purchase chicken
Correlations (no causation)
Covariances: (Group number 1 - Default model)
Estimate S.E. C.R. P
sn <--> pbc 3.712 4.219 .880 .379
pbc <--> A -2.880 3.548 -.812 .417
sn <--> A 12.017 4.461 2.694 .007
Goodness-of-fit (1)
Model Fit Summary
CMIN
Model NPAR CMIN DF P CMIN/DF
Default model 45 619.428 74 .000 8.371
Saturated model 119 .000 0
Independence model 14 1736.822 105 .000 16.541
Parsimony-Adjusted Measures
Model PRATIO PNFI PCFI
Default model .705 .453 .469
Saturated model .000 .000 .000
Independence model 1.000 .000 .000
In a good model all these indicators should be above 0.9
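The parsimony-adjusted measures can be reproduced from the CMIN table: PRATIO is the model's degrees of freedom divided by those of the independence model, and PNFI and PCFI multiply NFI and CFI by that ratio. The sketch below assumes the usual textbook definitions of NFI and CFI, which the slides do not spell out; the function names are illustrative.

```python
def pratio(df_m: int, df_b: int) -> float:
    """Parsimony ratio: model df relative to independence-model df."""
    return df_m / df_b


def nfi(chi_m: float, chi_b: float) -> float:
    """Normed fit index (assumed definition)."""
    return 1 - chi_m / chi_b


def cfi(chi_m: float, df_m: int, chi_b: float, df_b: int) -> float:
    """Comparative fit index (assumed definition)."""
    num = max(chi_m - df_m, 0.0)
    den = max(chi_b - df_b, chi_m - df_m, 0.0)
    return 1.0 if den == 0 else 1 - num / den


# Default and independence models from the CMIN table.
CHI_M, DF_M = 619.428, 74
CHI_B, DF_B = 1736.822, 105

ratio = pratio(DF_M, DF_B)
pnfi = ratio * nfi(CHI_M, CHI_B)
pcfi = ratio * cfi(CHI_M, DF_M, CHI_B, DF_B)

print(round(ratio, 3), round(pnfi, 3), round(pcfi, 3))
```

The three values agree with the Default model row of the Parsimony-Adjusted Measures table to three decimals.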
Goodness-of-fit (2)
NCP
Model NCP LO 90 HI 90
Default model 545.428 469.732 628.594
Saturated model .000 .000 .000
Independence model 1631.822 1500.462 1770.572
FMIN
Model FMIN F0 LO 90 HI 90
Default model 1.241 1.093 .941 1.260
Saturated model .000 .000 .000 .000
Independence model 3.481 3.270 3.007 3.548
RMSEA
Model RMSEA LO 90 HI 90 PCLOSE
Default model .122 .113 .130 .000
Independence model .176 .169 .184 .000
Good models have a pclose value above 0.05
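The NCP, FMIN, F0 and RMSEA point estimates are all simple functions of CMIN, the degrees of freedom and the sample size. The sketch below reproduces them under two assumptions that are not stated on the slides: the standard single-group formulas, and a sample size of N = 500, which is inferred here from FMIN = CMIN / (N - 1). The confidence limits and PCLOSE need the non-central chi-square distribution and are not reproduced.

```python
from math import sqrt

N = 500  # assumption: inferred from FMIN = CMIN / (N - 1), not stated in the slides


def ncp(cmin: float, df: int) -> float:
    """Non-centrality parameter estimate: chi-square minus df, floored at zero."""
    return max(cmin - df, 0.0)


def fmin(cmin: float, n: int) -> float:
    """Minimum value of the discrepancy function."""
    return cmin / (n - 1)


def f0(cmin: float, df: int, n: int) -> float:
    """Population discrepancy estimate."""
    return ncp(cmin, df) / (n - 1)


def rmsea(cmin: float, df: int, n: int) -> float:
    """Root mean square error of approximation."""
    return sqrt(f0(cmin, df, n) / df)


for label, cmin, df in [("Default model", 619.428, 74),
                        ("Independence model", 1736.822, 105)]:
    print(label, round(ncp(cmin, df), 3), round(fmin(cmin, N), 3),
          round(f0(cmin, df, N), 3), round(rmsea(cmin, df, N), 3))
```

With N = 500 the computed values agree with the NCP, FMIN and RMSEA tables to three decimals, which supports the inferred sample size.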
Information criteria
AIC
Model AIC BCC
Default model 709.428 712.218
Saturated model 238.000 245.376
Independence model 1764.822 1765.690
ECVI
Model ECVI LO 90 HI 90 MECVI
Default model 1.422 1.270 1.588 1.427
Saturated model .477 .477 .477 .492
Independence model 3.537 3.273 3.815 3.538
HOELTER
Model HOELTER .05 HOELTER .01
Default model 77 85
Independence model 38 41
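The AIC and ECVI columns follow from CMIN and the number of parameters: AIC adds twice the number of parameters to the chi-square, and ECVI rescales AIC by N - 1. The sketch below assumes these standard single-group formulas and, as an assumption not stated on the slides, a sample size of N = 500 inferred from FMIN = CMIN / (N - 1). BCC, MECVI and HOELTER are not reproduced.

```python
N = 500  # assumption: inferred from FMIN = CMIN / (N - 1), not stated in the slides


def aic(cmin: float, n_parameters: int) -> float:
    """Akaike information criterion: chi-square plus twice the parameter count."""
    return cmin + 2 * n_parameters


def ecvi(cmin: float, n_parameters: int, n: int) -> float:
    """Expected cross-validation index: AIC divided by N - 1."""
    return aic(cmin, n_parameters) / (n - 1)


# (label, CMIN, NPAR) from the CMIN table.
for label, cmin, npar in [("Default model", 619.428, 45),
                          ("Saturated model", 0.0, 119),
                          ("Independence model", 1736.822, 14)]:
    print(label, round(aic(cmin, npar), 3), round(ecvi(cmin, npar, N), 3))
```

The saturated model fits perfectly but pays for its 119 parameters, so a lower AIC than the default model here says more about the default model's poor fit than about the saturated model being useful.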
A competing model
The theoretical model is rejected but the coefficients are significant
Competing model strategy
• try and remove all the non-significant components of the model
• add some other explanatory variables
Problem: measurement of the latent construct for attitude, because
the presence of items with negative wording might lead to the
identification of more than one latent factor
Additional explanatory variable
• some variable explaining habit, which could be correlated with attitude and
influence ITP
• For example, variable q2b measures the frequency of chicken purchases and
we label it as habit
The modified model
[Figure: modified AMOS path diagram with estimates. Latent variable A measured by bba, bbb, bbd and bbe (error terms e1, e2, e4, e5); A, SN and Habits as predictors of ITP (error term e12)]
Estimates
Parameter        Raw Estimate   S.E.   C.R.     P     Std. Estimate
Intercept bba    37.307         .509   73.344   ***
Intercept bbb    32.417         .567   57.216   ***
Intercept bbd    29.377         .513   57.300   ***
Intercept bbe    34.267         .593   57.774   ***
Intercept q7     3.848          .169   22.763   ***
Goodness-of-fit
Model NPAR CMIN DF P CMIN/DF NFI-Delta1
Final model 22 20.99 13 0.073 1.615 0.967
Model LO 90 HI 90 RMSEA LO 90 HI 90 PCLOSE
Final model 0 0.049 0.035 0 0.062 0.801
Model HOELTER 0.05 HOELTER 0.01
Final model 532 659
Independence model 33 38
The diagnostics are now ok
Defining a Model
In SEM there are two models which are used:
1. The measurement model, which represents how measured
variables come to represent constructs.
2. The structural model, which shows how constructs are
associated with each other.
Common types of theoretical relationships in a SEM model are as
follows.
a. Relationship between a construct and a measured variable: Exogenous construct and X
b. Relationship between a construct and multiple measured variables: Exogenous construct and X1, X2, X3
c. Dependence relationship between two constructs (a structural relationship): Construct 1 and Construct 2
d. Correlational relationship between constructs
[Figure: an exogenous construct with indicators X1 to X4 and an endogenous construct with indicators Y1 to Y4; two exogenous constructs with indicators X1 to X4 and X5 to X8]