Bollen On SEM Constructs
JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide
range of content in a trusted digital archive. We use information technology and tools to increase productivity and
facilitate new forms of scholarship. For more information about JSTOR, please contact support@jstor.org.
Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at
http://about.jstor.org/terms
This content downloaded from 129.180.1.217 on Mon, 29 Jan 2018 20:02:50 UTC
All use subject to http://about.jstor.org/terms
Research Commentary
Although the literature on alternatives to effect indicators is growing, there has been little attention given to
evaluating causal and composite (formative) indicators. This paper provides an overview of this topic by contrasting ways of assessing the validity of effect and causal indicators in structural equation models (SEMs).
It also draws a distinction between composite (formative) indicators and causal indicators and argues that
validity is most relevant to the latter. Sound validity assessment of indicators is dependent on having an
adequate overall model fit and on the relative stability of the parameter estimates for the latent variable and
indicators as they appear in different models. If the overall fit and stability of estimates are adequate, then a
researcher can assess validity using the unstandardized and standardized validity coefficients and the unique
validity variance estimate. With multiple causal indicators or with effect indicators influenced by multiple
latent variables, collinearity diagnostics are useful. These results are illustrated with a number of correctly
and incorrectly specified hypothetical models.
Keywords: Causal indicators, effect indicators, formative indicators, reflective indicators, measurement,
validity, structural equation models, scale construction, composites
Bollen/Evaluating Effect, Composite, and Causal Indicators in SEMs
The purpose of this paper is to present ideas on evaluating the validity of causal, composite, or formative indicators in measurement models. This task is facilitated by considering the selection and elimination of effect indicators in that the similarities and differences between these traditional and nontraditional indicators are illuminating. A division is also made between models whose only possible flaw is that one or more of the indicators are not a measure of the intended concept and a second set of models where more serious structural misspecifications are present. But the task is the same: evaluating the validity of indicators.

The next section presents background information to clarify my terminology, the role played by theory, a summary table of validity checks, and my position on when to eliminate indicators. Next are sections that examine these different indicators in models that are correctly specified, followed by sections on structurally misspecified models.

The measurement literature contains a diversity of terms with sometimes the same term applied in different ways. For this reason I clarify my basic terminology and assumptions.

Concepts are the starting point in measurement. Concepts refer to ideas that have some unity or something in common. The meaning of a concept is spelled out in a theoretical definition. The dimensionality of a concept is the number of distinct components that it encompasses. Each component is analytically separate from other components, but it is not easily subdivided. The theoretical definition should make clear the number of dimensions in a concept. Each dimension is represented by a single latent variable in a SEM, so if a concept is unidimensional we need a single latent variable; if it is bidimensional, we will need two latent variables.

SEMs are sets of equations that encapsulate the relationships among the latent variables, observed variables, and error variables. Parameters such as coefficients, variances, and covariances from the same model are estimable in different ways (e.g., ML, PLS) with the different estimators having different properties. So the model and estimation method are distinct and should not be confused.

Indicators are observed variables that measure a latent variable. I use indicator and measure to mean the same thing. One primary distinction is whether the indicator is influenced by the latent variable or vice versa. Effect (or reflective) indicators depend on the latent variable. Causal indicators affect the latent variable. Blalock (1964) appears to be the first social or behavioral scientist to make the distinction between causal and effect indicators. The term formative indicator came much later than causal indicators and had a meaning more restrictive than causal indicators. Fornell and Bookstein (1982, p. 292) defined formative indicators as follows:

    when constructs are conceived as explanatory combinations of indicators (such as "population change" or "marketing mix") that are determined by a combination of variables, their indicators should be formative.

Formative indicators in this original sense are largely variables that define a convenient composite variable where conceptual unity is not a requirement. That is, the composite variable with its formative indicators might just be a useful summary device for the effect of several variables on other variables (e.g., see Heise's [1972] sheaf coefficient). In other situations, researchers' use of the term formative indicators departs from its original meaning and they appear to mean the same as causal indicators.

To avoid the ambiguity of the term formative indicator, I use composite indicator to refer to an indicator that is part of a linear composite variable (Grace and Bollen 2006). Although composite indicators might share similarities, they need not have conceptual unity. In other words, I use composite indicators to refer to the original use and sometimes current use of formative indicators. Causal indicators differ from the composite indicators in that causal indicators tap a unidimensional concept (or dimension of a concept) and the latent variable that represents the concept is not completely determined by the causal indicators. That is, an error term affects the latent variable as well. As I will explain later, whether an indicator is a composite or causal indicator has implications for evaluating validity.
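The directional distinction between effect and causal indicators can be made concrete with a small simulation. The following sketch is my own illustration (not from the paper), with arbitrary parameter values; it generates data under each data-generating process and shows one observable consequence: effect indicators of a common factor are correlated by construction, while causal indicators need not correlate at all.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Effect (reflective) indicators: the latent variable eta causes the x's.
eta = rng.normal(size=n)
lambdas = np.array([1.0, 0.8, 0.6])          # illustrative loadings
x_effect = eta[:, None] * lambdas + rng.normal(size=(n, 3))

# Causal indicators: the x's cause the latent variable, plus an error zeta.
x_causal = rng.normal(size=(n, 3))           # here mutually independent
gammas = np.array([0.5, 0.4, 0.3])           # illustrative coefficients
zeta = rng.normal(size=n)
eta2 = x_causal @ gammas + zeta

# Effect indicators of one factor correlate by construction;
# the independent causal indicators do not.
r_effect = np.corrcoef(x_effect, rowvar=False)
r_causal = np.corrcoef(x_causal, rowvar=False)
print(r_effect[0, 1], r_causal[0, 1])
```

The contrast is the point Bollen (1984) makes later in the paper: intercorrelation is expected among effect indicators but is not required of causal indicators.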
The Necessity of Theory

Theory enters measurement throughout the process. We need theory to define a concept and to pick out its dimensions. We need theory to develop or to select indicators that match the theoretical definition and its dimensions. Theory also enters in determining whether the indicator influences the latent variable or vice versa. Each relation specified between a latent variable and an indicator is a theoretical hypothesis. Empirical estimates and tests aid in assessing whether the set of relations are consistent with the data. However, empirical consistency between the data and model alone is insufficient to support the validity of measures. There must be theory behind the relations to make the validity argument credible.

On the other hand, theory might suggest that an indicator is valid, but the empirical estimates linking the latent variable and indicator might be weak to zero. If this occurs in a well-fitting model, these results cast doubt on the validity of the measure even if theory suggests its validity. The burden would be on the researcher to explain why a valid measure is not related to its latent variable. It could be that the measure is valid, but that the model is misspecified in a way that leads to low values of the validity statistics. This explanation is most convincing if the researcher is specific on the nature of the misspecifications and why these problems would lead to low validity estimates.

Although I will not repeatedly discuss this in the remaining sections, I assume throughout that there is measurement theory to support the models and the relations between the latent variables and indicators within them. The empirical estimation and tests are intended to evaluate these ideas.

Validity Checks

I define the validity of a measure as follows: "The validity of a measure x_i of ξ_j is the magnitude of the direct structural relation between ξ_j and x_i" (Bollen 1989, p. 197). Causal indicators occur when the observed indicators have a direct effect on the latent variable whereas effect indicators are indicators where the latent variable has a direct effect on the indicator.

Table 1 gives a summary of the validity assessment that I would recommend (Bollen 1989, pp. 197-206). The first step is to compare the definition of the concept (or dimension of the concept) that is the object of measurement to the indicator and to check whether the indicator corresponds to this definition. This is a type of face validity check in that the researcher assesses whether it is sensible to view the indicator as matching the idea embodied by the concept.

The second check in Table 1 is to examine the overall fit of the model. Assuming that the researcher is using an estimator such as ML that provides a test of whether the model adequately reproduces the variances, covariances, and means of the observed variables, a chi-square test statistic and various fit indexes are available to help in assessing model fit. The fit indexes to choose and the weight to give to them is subject to controversy and this is an issue that cannot be easily resolved. However, the researcher will need to make the judgment as to whether model fit is adequate enough to use the other validity checks that are specific to the individual indicators.

Another aspect of assessing measurement validity is having the factor and measures embedded in a fuller model with additional causes and consequences of the main latent variable. Part of assessing validity is determining whether a measure and latent variable relate to other distinct latent variables as expected. Sometimes this is called external validity. For instance, there often are variables thought to be determinants or consequences of a specific latent variable. If I estimate a model with this full set of relationships and I find the expected effects, then this is further support for the validity of the latent variable and its measures. Failure to find such relationships undermines validity even if the preceding validity measures are satisfactory. This idea is closely related to construct validity where part of the validity assessment is done by evaluating if a construct relates to other constructs as predicted by theory or substantive considerations. When such variables are available, they enhance the validation process.

If the researcher is satisfied with model fit, the remaining validity checks bear examination. Using the definition of validity from the previous section, the factor loadings (λ) for effect indicators and the regression coefficients (γ) for causal indicators provide the unstandardized validity coefficients. Here the researcher makes sure that the coefficients have the correct sign and looks for their statistical and substantive significance. The standardized validity coefficients are treated similarly except that these coefficients give the expected standard deviation shift in the dependent variable for a one standard deviation shift in the other variable net of other influences on the dependent variable.

The unique validity variance in the last row gives the variance in the dependent variable uniquely attributable to a variable. If we have an effect indicator, it gives the explained variance uniquely attributable to the latent variable that it measures. If we have a causal indicator, it gives the uniquely attributable variance in the latent variable due to the causal indicator. When a single latent variable influences the effect indicator, the unique validity variance is the same as the usual R² for an indicator. Typically there are two or more causal indicators,
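The indicator-specific quantities just described can be computed directly from model parameters. The sketch below is my own illustration with hypothetical parameter values (not numbers from the paper); it evaluates the unstandardized coefficient, the standardized coefficient, and the unique validity variance for a single effect indicator y = α + λη + ε.

```python
import numpy as np

# Hypothetical parameters for one effect indicator y = alpha + lam*eta + eps.
lam = 0.8        # unstandardized validity coefficient (factor loading)
var_eta = 1.0    # variance of the latent variable
var_eps = 0.36   # unique (error) variance of the indicator

# Model-implied variance of the indicator.
var_y = lam**2 * var_eta + var_eps

# Standardized validity coefficient: sd shift in y per sd shift in eta.
lam_std = lam * np.sqrt(var_eta) / np.sqrt(var_y)

# Unique validity variance: with a single latent cause it equals R-square,
# the proportion of the indicator's variance due to the latent variable.
r_square = lam**2 * var_eta / var_y

print(lam_std, r_square)
```

With one latent cause, lam_std squared equals r_square, which is why the unique validity variance reduces to the ordinary R² in that case.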
change the nature of the latent variable. But here also if the
coefficient of the causal indicator is not significant or is the
wrong sign, it is possible that you have an invalid causal
indicator. Decisions on whether to eliminate indicators must
Suppose that I am not sure that all five measures are valid. How can I examine their validity and discard any that are not valid? Assume that I am certain of the validity of at least one of these measures, say y1, so I scale the latent η to this indicator by setting λ1 = 1 and α1 = 0, leading to y1 = η + ε1. Scaling the latent variable is sufficient to identify this model; that is, it is possible to find unique values for the remaining intercepts, factor loadings, and other parameters in this model.⁴ (Without scaling the latent variable, the model is underidentified.)

Referring to Table 1, the first check for the remaining indicators is whether they correspond to the definition of the concept that is represented by η. If they do not, then this is sufficient grounds to eliminate an indicator. Assuming that they do match the definition, then the next step is to assess whether the model represented in equation (1) fits the data. This requires that we estimate the model with a procedure that provides overall fit statistics. For instance, if the ML estimator is appropriate, then a likelihood ratio chi-square test is available to compare the hypothesized model to a saturated model. Adjustments to this chi-square test are readily available when the data exhibit significant nonnormality. There also are additional fit indexes that can supplement the chi-square test. If the fit is not adequate, then the researcher will need to find alternative specifications that might be better suited to the data. If the fit is judged to be good, then the other means of assessing validity are turned to next.

Closely related is the standardized validity coefficient, which gives the expected number of standard deviations that the indicator (y_j) differs for a one standard deviation difference in the latent variable (η). The unique validity variance gives the variance in the measure that is uniquely attributable to a specific latent variable. Assuming that only η directly affects y_j, the unique validity variance is equivalent to the R² for the indicator (y_j). This in turn is the amount of variance in the indicator explained by the latent variable. Finally, collinearity analysis is not relevant for this example because only a single latent variable affects the measure. Collinearity would be an issue if, say, there were two correlated latent variables influencing the same measure.

Another aspect of assessing measurement validity is having the factor and measures embedded in a fuller model with additional causes and consequences of η. In this hypothetical example and sometimes in practice, we do not have such other latent variables for external validity checks, but this would be possible if this measurement structure were embedded in a larger model.

Returning to the primary task of indicator selection in the effect indicator model in Figure 1, I can now provide some guidance. First, each indicator should have a sufficiently
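To illustrate why fixing λ1 = 1 identifies the remaining loadings: the one-factor model implies Cov(y_j, y_k) = λ_j λ_k V(η), so ratios of covariances recover the free loadings. The numpy sketch below is my own illustration with made-up loadings; it uses a method-of-moments shortcut rather than the ML estimation the paper relies on.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
true_lam = np.array([1.0, 0.9, 0.7, 0.5, 1.1])   # lambda_1 fixed at 1 (scaling)

# Simulate the one-factor model y_j = lam_j * eta + eps_j.
eta = rng.normal(size=n)
y = eta[:, None] * true_lam + rng.normal(size=(n, 5))
S = np.cov(y, rowvar=False)

# The model implies Cov(y_j, y_k) = lam_j * lam_k * V(eta), so with
# lam_1 = 1 the loading of y_2 is recoverable as Cov(y_2, y_3) / Cov(y_1, y_3).
lam2_hat = S[1, 2] / S[0, 2]
print(lam2_hat)
```

The ratio estimate lands near the population loading of 0.9, showing that scaling by one indicator pins down the rest of the parameters.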
³Here, as with the causal indicator and composite indicator model, we can estimate a measurement model and need not form indexes prior to analysis.

⁴If the scaling indicator has zero validity, I would expect estimation problems and even if estimates are obtained, the R-square for this indicator would be near zero.

A consistent estimator is an estimator that converges on the true parameter values of the model as the sample size goes to infinity. A related concept is asymptotically unbiased. Asymptotically unbiased means that in large samples the expected value of the estimator is the population parameter. The usual maximum likelihood estimator that I use in this paper has these two properties. Partial least squares (PLS), sometimes called a component-based SEM estimator, does not, so this makes it difficult to use PLS to make validity assessments. That is, if you have biased estimates of parameters in the model, using these to guide validity assessments could be misleading.
Figure 1. Five Effect Indicators, One Latent Variable
large and statistically significant unstandardized and standardized factor loading and R-square. For instance, an indicator with an R-square of less than 0.10 would be undesirable in that 90 percent or more of its variance would be due to error. Similarly a statistically nonsignificant factor loading would suggest a possibly invalid indicator. Any indicator with a statistically insignificant loading is a candidate for elimination since it suggests that the measure is not valid. Finally, when η is embedded in a fuller model it and its effect

disturbance with E(ζ) = 0 and COV(x_j, ζ) = 0 for j = 1 to 5. Figure 2 is a path diagram of the equation.

The presence of the error (ζ) in equation (2) is in recognition that other variables not in the model are likely to determine the values of the latent variable. It seems unlikely that we could identify and measure all of the variables that go into the latent variable and the error collects together all of those omitted causes.
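A small simulation makes the role of ζ concrete. The sketch below is my own illustration with hypothetical parameter values (not from the paper); it generates a latent variable from five causal indicators plus a disturbance and shows that the indicators explain only part of the latent variable's variance.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

# Five causal indicators x and a disturbance zeta, as in eta = gamma'x + zeta.
x = rng.normal(size=(n, 5))
gamma = np.array([0.4, 0.3, 0.5, 0.2, 0.6])   # illustrative coefficients
zeta = 0.8 * rng.normal(size=n)               # error on the latent variable
eta = x @ gamma + zeta

# Because of zeta, the causal indicators do not fully determine eta:
explained = np.var(x @ gamma) / np.var(eta)
print(explained)
```

If zeta were dropped, the explained share would be exactly 1 and eta would be a composite variable rather than a latent variable, which is the distinction the paper draws between causal and composite indicators.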
Our second validity criterion is to check the overall fit of the model. The model as specified is overidentified, so using an estimator like ML we can test its fit compared to the saturated model and supplement this with other fit indexes. A poor fit means we need to stop to determine the source of the problem before going further. A good fit suggests that we move on to the other validity checks.

The definition of validity given earlier refers to the direct structural relation between the measure and the latent variable. In the causal indicator model, the unstandardized validity coefficient is γ_j and the standardized validity coefficient is its standardized counterpart⁸ and these provide evidence relevant to the validity of the causal indicators. A researcher should examine the sign, magnitude, and statistical significance of these coefficients. In addition, the unique validity variance reveals the incremental variance explained by each causal indicator. A valid causal indicator should have a strong effect on η as gauged by these standards.

It is impossible to give exact numerical values for cutoffs for each of these validity measures since the minimum validity acceptable would depend on several factors such as the findings from prior studies, the degree to which collinearity prevents crisp estimates of effects, the difficulty of measuring the latent variable, etc. However, a coefficient of a causal indicator with the wrong sign or that is not statistically significant would appear to be invalid and a candidate for exclusion. Alternatively, a causal indicator with a correct sign and statistically significant coefficient with an increase of the squared multiple correlation by 10 percent or more has support for its validity.

Before eliminating a causal indicator, a researcher must consider whether the allegedly invalid causal indicator suffers from high multicollinearity with the other causal indicators. Since η has more than one determinant, our validity assessment is more complicated than the previous effect indicator example in that we need to assess the impact of each separately from the other causal indicators. If x_j were uncorrelated (or nearly so) with all other causal indicators, then the task would be easier in that I would not attribute to one causal indicator the effects of another. But when the causal indicators are highly correlated, there is the potential for the well-known multiple regression problem of multicollinearity: An explanatory variable (a causal indicator) is highly correlated with the other explanatory variables (causal indicators) and it is more difficult to estimate its unique effects. This difficulty is reflected in larger standard errors than if there were lower collinearity and these larger standard errors imply less precise (more variable) coefficient estimates and validity coefficients. In the situation of high multicollinearity, a competing explanation for low unstandardized and standardized validity coefficients and low unique validity variance is that the causal indicator was so highly associated with the other causal indicators that estimates of distinct impact were not possible.

One factor that makes this collinearity issue less worrisome for causal indicators is that such indicators need not correlate (Bollen 1984). This means in general we would not expect high degrees of collinearity among causal indicators, although it is possible. Fortunately, there are regression diagnostics available to assess when collinearity is severe or mild (e.g., Belsley et al. 1980) and many of these are adaptable to the causal indicator model.
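One standard diagnostic adaptable to this setting is the variance inflation factor (VIF), 1/(1 − R²) from regressing each causal indicator on the others. The numpy sketch below uses illustrative simulated data of my own; a real application would run it on the actual causal indicators.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5_000

# Two nearly collinear causal indicators plus one uncorrelated indicator.
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)   # highly correlated with x1
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """Variance inflation factor for column j: Var(x_j) / Var(residual)
    after regressing x_j on the remaining (centered) columns."""
    Xc = X - X.mean(axis=0)
    target = Xc[:, j]
    others = np.delete(Xc, j, axis=1)
    beta, *_ = np.linalg.lstsq(others, target, rcond=None)
    resid = target - others @ beta
    return float(target.var() / resid.var())

print([round(vif(X, j), 1) for j in range(3)])
```

The collinear pair shows VIFs far above common rule-of-thumb thresholds (around 10), while the uncorrelated indicator sits near 1, which is the pattern that would flag "estimates of distinct impact were not possible" as the competing explanation for a weak validity coefficient.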
latent variables are highly correlated, the same collinearity issue might emerge. If an effect indicator has factor complexity of one, then the collinearity issue is absent.

Five Composite Indicators, One Composite

The distinction between composite (formative) and causal indicators is subtle. I represent the composite indicator model for five indicators below:

    C = αc + γc1x1 + γc2x2 + γc3x3 + γc4x4 + γc5x5    (4)

This equation shares with equation (2) that the indicators (x's) help set the value of the left-hand side variable, but the differences from causal indicators are more important than this similarity. For one thing, the equation with composite indicators has no error (or disturbance) term since the composite indicators completely determine the composite. The composite variable C rather than the latent variable η is on the left-hand side of the equation to reflect that we are dealing with a linear composite variable (C) rather than a latent variable (η).

At least as important, but less obvious, is that composite indicators need not share unidimensional conceptual unity. That is, composite indicators might be combined into a composite as a way to conveniently summarize the effect of several variables that do not tap the same concept although they may share a similar "theme." For instance, age, race, and gender effects could be summarized in a demographic composite even though in this context demographic is not seen as a unidimensional latent variable. Indeed, another researcher might have a different set of demographic variables to form a demographic composite.

A troubling aspect of composite indicators not sharing unidimensional conceptual unity is whether it even makes sense to discuss the validity of these variables. Using either the preceding definition of validity (the validity of a measure x_i of ξ_j is the magnitude of the direct structural relation between ξ_j and x_i) or the less formal definition of whether a variable measures what it is supposed to measure, if the composite indicators are just convenient ingredients to form a linear composite variable, it is not clear that assessments of validity apply. In other words, if we have no unidimensional theoretical concept in mind, how can we ask whether the composite indicators measure it? With composite indicators and a composite variable, the goal might not be to measure a scientific concept, but more to have a summary of the effects of several variables.

Even if the unidimensional concept criterion is satisfied by the composite indicators, it seems unlikely that there are many situations where an error term would be absent as in (4). This would mean that the latent variable that represents the unidimensional concept is an exact linear function of its indicators, which would seem to be a rarity.⁹ Given that the concept of validity makes little sense for arbitrary composites formed from composite indicators, I will not discuss the validity measures for composite indicators.

Many researchers gloss over the distinction between composite indicators and causal indicators that I make here.¹⁰ This distinction likely underlies some of the controversies in this area. For instance, if composite indicators form a composite without having unidimensional conceptual unity, then their coefficients could well be unstable as the composite is allowed to influence different outcome variables. Returning to the demographic composite, we would not expect the effects of age, race, and gender to be the same regardless of the outcome variable. On the other hand, if these are causal indicators intended to tap a single latent variable, then the coefficients of the causal indicators should be stable across different outcomes. This could explain some of the differences expressed in Howell et al. (2007) versus Bollen (2007) where the former is referring to composite (formative) indicators and the latter refers to causal indicators.

Summary of Structurally Correct Models

When we have "clean" models in which the researcher has essentially the correct specification and the only question is whether one or more of either the effect or causal indicators are valid, the task of indicator assessment is simplified. We can use the regression coefficients from the causal indicator to the latent variable (as in (2)) or from the latent variable to the effect indicator (as in (1)) to form the unstandardized and standardized validity coefficients and test their statistical significance to assess measurement validity. And we can use the unique validity variance as another measure of an indicator's validity. Low or statistically insignificant values on these measures are symptoms of low validity and suggest potential removal of such measures. Moderate to high values reinforce their validity. Exact cutoff values for indicators do not exist since their performance must be assessed in the context of what prior research has found under similar conditions.

⁹Linear identities might make sense. For instance, Population Change = Births - Deaths +/- Net Migration, given accurate measures.

¹⁰My earlier writings have not made this distinction clear.
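The instability of composite weights across outcomes is easy to demonstrate numerically. In the sketch below (my own illustration; the three x's are arbitrary stand-ins for demographic variables), the OLS weights a sheaf-style composite would need differ depending on which outcome it is asked to summarize.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100_000
x = rng.normal(size=(n, 3))   # stand-ins for, e.g., age, race, gender

# Two outcomes in which the three variables have different relative effects.
z1 = x @ np.array([0.6, 0.1, 0.1]) + rng.normal(size=n)
z2 = x @ np.array([0.1, 0.1, 0.6]) + rng.normal(size=n)

def composite_weights(x, z):
    """OLS coefficients of z on x: the weights a composite variable
    would need to summarize x's effect on this particular outcome."""
    b, *_ = np.linalg.lstsq(x, z, rcond=None)
    return b

w1 = composite_weights(x, z1)
w2 = composite_weights(x, z2)
print(np.round(w1, 2), np.round(w2, 2))
```

The first variable dominates the weights for one outcome and the third variable for the other, illustrating why coefficients of composite indicators need not be stable across outcomes, whereas causal indicators of a single latent variable should be.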
One complication with causal indicators is that we cannot identify or estimate any of the validity measures in a model when only causal indicators are present. The latent variable with causal indicators must emit at least one path to identify and estimate the unstandardized validity coefficients and it must emit at least two paths to identify and estimate the standardized validity measures (Bollen and Davis 2009b).

The main remaining difficulty in indicator selection occurs with causal indicators or with effect indicators that depend on more than one collinear latent variable. In either of these cases, high collinearity might make it difficult to obtain sharp estimates of the distinct effects of the explanatory variables. In practice, causal indicators are often not highly collinear and effect indicators often depend on one latent variable, so the collinearity issue is less prevalent. Finally, composite indicators as an arbitrary linear combination of indicators are not subject to validity analysis in that the composite need not represent a unidimensional concept so the question of validity is not appropriate.

Indicators in Structurally Misspecified Models

The discussion to this point has focused on models that are structurally correctly specified. Real world modeling is more complex and makes the task of indicator selection more complicated. I turn now to some of these issues. As before, I treat effect indicator models first and then turn to causal indicator models to facilitate understanding the similarities and differences between these models. The effect indicators are included to contrast the similarities and differences between causal and effect indicators. For the reasons given in the last subsection, I do not discuss composite indicators further.

Five Effect Indicators, Two Latent Variables

Suppose we once again have five effect indicators, but now have two latent variables in the true model. The equations are

    y1 = α1 + η1 + ε1
    y2 = α2 + λ21η1 + ε2
    y3 = α3 + λ31η1 + λ32η2 + ε3    (5)
    y4 = α4 + λ42η2 + ε4
    y5 = α5 + η2 + ε5

where I make the usual assumptions that the unique factors (εj) have means of zero and are uncorrelated with the latent variables and with each other. Figure 3a is a path diagram of this model. In practice, researchers do not know the true generating model and are likely to make specification mistakes. Figures 3b to 3d are three examples of misspecified models. In Figure 3b, I mistakenly assume a single factor for all five indicators; Figure 3c also assumes a single factor but only uses y1, y3, and y5 as indicators; and Figure 3d has a single factor but takes y1, y2, and y3 as the only indicators.

Simulating data that conform to Figure 3a (λ11 = 1, λ21 = .8, λ31 = .6, λ32 = .5, λ42 = .8, λ52 = 1, αi = 0, V(ε) = 1, V(ηk) = 1, COV(η1, η2) = .6, N = 1000), I estimated the factor loadings and R-squares for the models that matched Figures 3a to 3d using the ML estimator. These results are reported in Table 2. The first columns with the results for the true model in Figure 3a show estimates that are within two standard errors of the population values. The low chi-square and high p-value and the other fit indices consistently point toward a good fitting model. I could interpret the validity measures for this correct version of the model as I illustrated with the other correct models in the first part of the paper, but will not do so. Rather I examine what happens in the incorrect models.

Figure 3b mistakenly has one factor rather than two. All the factor loadings remain statistically and substantively significant, which might lead a researcher to view these as valid measures. But we know that y4 and y5 are invalid measures of the first factor (η1) and that y3 measures both η1 and η2, although the model only permits η1 to have an impact.

In applied settings we are not privileged to the knowledge of the true model, so how can we know that there is a validity issue with some of the measures in Figure 3b? This is where the previous discussion about overall model fit is relevant. The chi-square of 70.8 with 5 df is highly statistically significant for the model in Figure 3b. The Relative Noncentrality Index (RNI) and Incremental Fit Index (IFI) pass conventional standards, but the Tucker-Lewis Index (TLI) and Bayesian Information Criterion (BIC) point toward an unacceptable fit.¹¹ The overall poor fit for some of the measures serves as a cautionary note about structural misspecifications and the misspecification creates uncertainty about the accuracy of the validity statistics.

The models in Figures 3c and 3d provide additional signs of trouble and potential invalidity. If these five indicators truly measured a single factor, then I would expect that fitting a single factor to a subset of the five indicators would lead to the same factor loadings and other parameter estimates for any parameters shared between the models. Any differences in factor loadings for the same variable should be due to just

¹¹Other fit indexes could be used, but in my experience these represent a useful variety of measures that perform well.
Bollen/Evaluating Effect, Composite, and Causal Indicators in SEMs
Figure 3. Five Indicators with Correct (3a) and Three Incorrect Specifications (3b-3d) [path diagrams omitted]

Estimates for the models in Figures 3a-3d (dashes mark indicators excluded from a model or statistics unavailable for just-identified models; scaling loadings are fixed at 1.000, and the remaining loading estimates are not recoverable from the scan):

                 3a        3b        3c        3d
R-square
  y1            0.464     0.338     0.300     0.474
  y2            0.331     0.263       -       0.324
  y3            0.479     0.520     0.648     0.410
  y4            0.369     0.274       -         -
  y5            0.506     0.353     0.275       -
Chi-Square      1.142    70.801     0.000     0.000
DF              3         5         0         0
p-value         0.767     0.000       -         -
RNI             1.002     0.928       -         -
IFI             1.002     0.928       -         -
TLI             1.007     0.856       -         -
BIC           -19.581    36.262       -         -
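To make the disagreement among the fit indices reported for Figure 3b concrete, here is a minimal sketch of how RNI, IFI, TLI, and the chi-square-based BIC are computed. The baseline (independence) chi-square of roughly 923.9 with 10 df and the sample size N = 1000 are assumptions, back-solved to be consistent with the reported indices; they are not stated in this excerpt.

```python
import math

def fit_indices(chi_m, df_m, chi_b, df_b, n):
    """Common SEM fit indices from the fitted model's chi-square (chi_m, df_m)
    and the baseline (independence) model's chi-square (chi_b, df_b)."""
    rni = 1.0 - (chi_m - df_m) / (chi_b - df_b)                      # Relative Noncentrality Index
    ifi = (chi_b - chi_m) / (chi_b - df_m)                           # Incremental Fit Index
    tli = ((chi_b / df_b) - (chi_m / df_m)) / ((chi_b / df_b) - 1)   # Tucker-Lewis Index
    bic = chi_m - df_m * math.log(n)                                 # BIC relative to the saturated model
    return rni, ifi, tli, bic

# Misspecified single-factor model of Figure 3b: chi-square 70.801 with 5 df.
# Baseline chi-square (~923.9, 10 df) and N = 1000 are assumed values.
rni, ifi, tli, bic = fit_indices(70.801, 5, 923.9, 10, 1000)
print(round(rni, 3), round(ifi, 3), round(tli, 3), round(bic, 3))
```

Under these assumptions RNI and IFI come out near 0.93, comfortably above conventional cutoffs, while TLI drops to about 0.86 and BIC is large and positive, reproducing the pattern in which some indices pass and others flag the misspecification.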
sampling fluctuations. In Figure 3c, I keep the assumption of a single factor, but only use y₁, y₃, and y₅ as indicators. Figure 3d also is a single factor, but I only use y₁, y₂, and y₃. Contrasting the estimates from the models that correspond to

Figure 4c is a model where once again a single latent variable is incorrectly used, but only one effect indicator (y₁) is included. Figure 4d is the same setup except that only y₂ serves as an effect indicator.
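The instability that Figures 3c and 3d are designed to reveal can be sketched numerically. Assuming illustrative (invented) population parameters for the two-factor model of equation (5), fitting a just-identified single factor to different triads of indicators by the standard triad formulas recovers sharply different "loadings" for y₃:

```python
# Illustrative parameter values for the two-factor model of equation (5);
# all numbers are assumptions for this sketch, not the paper's estimates.
l21 = 0.8             # loading of y2 on eta1 (y1's loading fixed at 1.0)
l31, l32 = 0.5, 0.5   # y3 loads on both eta1 and eta2
l52 = 0.9             # loading of y5 on eta2 (y4's loading fixed at 1.0)
phi11 = phi22 = 1.0   # factor variances
phi12 = 0.3           # factor covariance

# Population covariances implied by the two-factor model.
cov = {
    ("y1", "y2"): l21 * phi11,
    ("y1", "y3"): l31 * phi11 + l32 * phi12,
    ("y1", "y5"): l52 * phi12,
    ("y2", "y3"): l21 * (l31 * phi11 + l32 * phi12),
    ("y3", "y5"): l52 * (l31 * phi12 + l32 * phi22),
}

def one_factor_triad(s_ab, s_ac, s_bc):
    """Just-identified one-factor solution for indicators (a, b, c) with
    lambda_a fixed to 1: returns (factor variance, lambda_b, lambda_c)."""
    phi = s_ab * s_ac / s_bc
    return phi, s_ab / phi, s_ac / phi

# Single factor fit to {y1, y3, y5} versus {y1, y2, y3}: compare y3's loading.
_, load_y3_135, _ = one_factor_triad(cov[("y1", "y3")], cov[("y1", "y5")], cov[("y3", "y5")])
_, _, load_y3_123 = one_factor_triad(cov[("y1", "y2")], cov[("y1", "y3")], cov[("y2", "y3")])
print(round(load_y3_135, 3), round(load_y3_123, 3))
```

Here y₃'s single-factor loading is about 2.17 in one subset and 0.65 in the other, exactly the symptom of an incorrect number of latent variables described above; under a correctly specified single factor the two subsets would agree up to sampling fluctuations.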
Figure 4. Causal Indicator Models with Correct (4a) and Three Incorrect Specifications (4b-4d)
Table 3. Estimates from the Causal Indicator Models in Figures 4a-4d (recoverable entries; dashes mark indicators excluded from a model or values not recoverable from the scan):

                 4a             4b             4c             4d
Lambda (s.e.)
  y1          1.000 (-)      1.000 (-)      1.000 (-)         -
  y2          1.000 (-)      1.227 (0.099)      -          1.000 (-)
R-square
  η1          1.000          0.898          1.000             -
  η2          1.000            -              -             1.000
What happens when I use the wrong model? Figure 4b mistakenly assumes that there is a single latent variable with two effect indicators. The column in Table 3 that corresponds to it shows all causal indicators with statistically significant coefficients. Their nonnegligible magnitude suggests that these measures are valid. Indeed, the R-square for η₁ suggests that these causal indicators collectively explain nearly 90 percent of the variance in the latent variable. It is only when I note the large and statistically significant chi-square and other fit indices that doubt is cast on the validity assessment. The model's fit is not good. This raises questions about the validity analysis since structural misspecifications in the model could distort these estimates. A comparison of the results for Figure 4b to those for the true Figure 4a model reveals the problem of incorrectly assuming a single latent variable when two are present.

Figures 4c and 4d represent a case where a researcher might have different single effect indicators with which to estimate a model. Table 3 has columns that represent the results of estimating these models. Interestingly, the estimates that correspond to the model in Figure 4c correspond to estimates of the impact of these causal indicators on η₁, while the estimates that correspond to the model in Figure 4d are the estimates of the impact of these same causal indicators but on η₂. This implies that as long as a researcher recognizes that the effect indicators (y₁ and y₂) correspond to different latent variables, then it is possible to estimate the unstandardized validity coefficients and determine which causal indicator is valid for which latent variable. I tested the single latent variable model in the column that corresponds to Figure 4b. I found a poor fit, as is expected given that the effect indicators actually measure two different latent variables. A good fit would be more likely if they did tap the same latent variable.

Some researchers have been puzzled by estimates of the impact of causal indicators that differ depending on the effect indicator (or other outcome variable) in the analysis. They have interpreted this as suggesting that this is a special problem with causal indicators. However, this is more of a problem with indicators not measuring the same latent variable. An analogous problem was found with effect indicators in the models represented in Figures 3b to 3d. In Figure 3b, I mistakenly assumed that all five effect indicators measure a single latent variable, and the factor loadings differ from the true model with two factors. In Figures 3c and 3d, the factor loadings differ depending on which three effect indicators are used, a symptom of the incorrect number of latent variables, not of an inherent problem with effect indicators. So assuming a single latent variable when there are more and having indicators that tap different latent variables can lead to different assessments of validity whether a researcher is evaluating causal or effect indicators. For contrasting views in a discussion of this issue, see Howell et al. (2007), Bagozzi (2007), and Bollen (2007).

Conclusion

Indicator selection and elimination begins with clear measurement theory that defines a concept, its dimensions, and the necessary latent variables to represent them. The definitions also guide the selection of indicators and specifying whether the measures are causal or effect indicators.14 Once these relations are specified in a SEM, a researcher can begin to assess the validity of indicators. An important step in the empirical analysis is determining whether the measurement model is an adequate description of the data using statistics such as the likelihood ratio chi-square test that compares the model fit to a saturated model. Additional fit indices often supplement the chi-square test. In addition, if the estimates are relatively stable using different effect indicators for the same latent variable, this is supportive of the model specification. We would not expect such stability when using a subset of causal indicators. Embedding this measurement structure in a larger model and checking whether the coefficients change beyond sampling fluctuations is another check on validity. If the results differ, this raises doubt about the specification.

If the fit of the model is adequate, then the researcher can examine the unstandardized and standardized validity coefficients and the unique validity variance measures to gauge the degree of validity. The analyst needs to consider the substantive and statistical significance of these estimates. For causal indicators or effect indicators that depend on two or more latent variables, it would be wise to perform collinearity diagnostics. All of these statistics provide information on the validity of the indicators. Indicators with low validity coefficients and low unique validity variance (in the absence of high collinearity) are candidates for elimination even if theory initially supported their use. This is true whether they are causal or effect indicators. On the other hand, statistically significant and substantively large unstandardized and standardized validity coefficients with moderate to high unique validity variance support the validity of an indicator.

What makes these assessments most difficult is deciding whether the model as a whole is adequate for the data. In many SEMs with large samples there is considerable statistical power to detect even minor mistakes in the model specification. In practice, there will be ambiguity in assessing

14. The study by Bollen and Ting (2000) has an empirical test to help determine whether a subset or all indicators are causal or effect indicators.
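One way to run the collinearity diagnostics recommended above is through variance inflation factors, which equal the diagonal of the inverted correlation matrix of the indicators. A self-contained sketch with an invented correlation matrix for three hypothetical causal indicators x1-x3 (Belsley et al. 1980 describe fuller condition-index diagnostics):

```python
def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with
    partial pivoting."""
    n = len(m)
    a = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot_row = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot_row] = a[pivot_row], a[col]
        piv = a[col][col]
        a[col] = [x / piv for x in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]

# Hypothetical correlations among three causal indicators; x1 and x2 are
# nearly redundant (r = 0.9), which is the situation VIFs are meant to flag.
R = [[1.0, 0.9, 0.3],
     [0.9, 1.0, 0.3],
     [0.3, 0.3, 1.0]]

# VIF_i = 1 / (1 - R_i^2), where R_i^2 is from regressing indicator i on the
# rest; equivalently, the diagonal of the inverted correlation matrix.
vifs = [invert(R)[i][i] for i in range(len(R))]
print([round(v, 2) for v in vifs])
```

With these numbers the near-duplicate pair x1 and x2 each get a VIF of about 5.3 while x3 stays near 1, showing how the diagnostic isolates the indicators whose validity coefficients will be unstable.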
overall model fit. This in turn leaves open the possibility of a structural misspecification that is sufficient to alter the parameter estimates vital to assessing indicator validity. Thus we must recognize and be willing to live with the fact that mistakes are possible where an invalid indicator is retained or a valid one discarded. It is this fact that reinforces the need for a theoretical basis to back up our specifications and empirical tests.

Finally, if composite (formative) indicators are combined to be a convenient summary of a set of variables, then the issue of validity is not present since there is no claim that these indicators share conceptual unity. With composite (formative) indicators, there is no expectation of stable coefficients across different dependent variables that the composite might predict. This distinction between causal indicators and composite (formative) indicators is not typically made and, when ignored, can contribute to controversy.

Acknowledgments

I gratefully acknowledge the support of NSF SES 0617276. My thanks to Shawn Bauldry for research assistance and to the reviewers and editors for helpful comments.

References

Bagozzi, R. P. 2007. "On the Meaning of Formative Measurement and How It Differs from Reflective Measurement," Psychological Methods (12:2), pp. 229-237.

Belsley, D. A., Kuh, E., and Welsch, R. E. 1980. Regression Diagnostics, New York: Wiley.

Blalock, H. M. 1964. Causal Inferences in Nonexperimental Research, Chapel Hill, NC: University of North Carolina Press.

Bollen, K. A. 1984. "Multiple Indicators: Internal Consistency or No Necessary Relationship," Quality and Quantity (18:4), pp. 377-385.

Bollen, K. A. 1989. Structural Equations with Latent Variables, New York: Wiley.

Bollen, K. A. 2007. "Interpretational Confounding Is Due to Misspecification, Not to Type of Indicator: Comment on Howell, Breivik, and Wilcox," Psychological Methods (12:2), pp. 219-228.

Bollen, K. A., and Davis, W. R. 2009a. "Causal Indicator Models: Identification, Estimation, and Testing," Structural Equation Modeling (16:3), pp. 498-522.

Bollen, K. A., and Davis, W. R. 2009b. "Two Rules of Identification for Structural Equation Models," Structural Equation Modeling (16:3), pp. 523-536.

Bollen, K. A., and Lennox, R. 1991. "Conventional Wisdom on Measurement: A Structural Equation Perspective," Psychological Bulletin (110:2), pp. 305-314.

Bollen, K. A., and Ting, K. 2000. "A Tetrad Test for Causal Indicators," Psychological Methods (5:1), pp. 3-22.

Diamantopoulos, A., Riefler, P., and Roth, K. P. 2008. "Advancing Formative Measurement Models," Journal of Business Research (61:12), pp. 1203-1218.

Fornell, C., and Bookstein, F. L. 1982. "A Comparative Analysis of Two Structural Equation Models: LISREL and PLS Applied to Market Data," in A Second Generation of Multivariate Analysis (Vol. 1), C. Fornell (ed.), New York: Praeger, pp. 289-324.

Grace, J. B., and Bollen, K. A. 2008. "Representing General Theoretical Concepts in Structural Equation Models: The Role of Composite Variables," Environmental and Ecological Statistics (15), pp. 191-213.

Heise, D. R. 1972. "Employing Nominal Variables, Induced Variables, and Block Variables in Path Analyses," Sociological Methods and Research (1), pp. 147-173.

Howell, R. D., Breivik, E., and Wilcox, J. B. 2007. "Reconsidering Formative Measurement," Psychological Methods (12:2), pp. 205-218.

Jarvis, C., MacKenzie, S., and Podsakoff, P. 2003. "A Critical Review of Construct Indicators and Measurement Model Misspecification in Marketing and Consumer Research," Journal of Consumer Research (30:2), pp. 199-218.

Joreskog, K. G., and Goldberger, A. S. 1975. "Estimation of a Model with Multiple Indicators and Multiple Causes of a Single Latent Variable," Journal of the American Statistical Association (70:351), pp. 631-639.

MacCallum, R. C., and Browne, M. W. 1993. "The Use of Causal Indicators in Covariance Structure Models: Some Practical Issues," Psychological Bulletin (114:3), pp. 533-541.

Petter, S., Straub, D., and Rai, A. 2007. "Specifying Formative Constructs in Information Systems Research," MIS Quarterly (31:4), pp. 623-656.

About the Author

Kenneth A. Bollen is the H. R. Immerwahr Distinguished Professor of Sociology at the University of North Carolina at Chapel Hill. He is a member of the Statistical Core and a Fellow of the Carolina Population Center and an adjunct professor of Statistics at UNC Chapel Hill. He was Director of the Odum Institute for Research in Social Science from 2000 to 2010.

Dr. Bollen is a Fellow of the American Association for the Advancement of Science (AAAS) and Chair of the Social, Economic, and Political Sciences Section K of AAAS. He is a former Fellow at the Center for Advanced Study in the Behavioral Sciences and is the Year 2000 recipient of the Lazarsfeld Award for Methodological Contributions in Sociology. The ISI named him among the World's Most Cited Authors in the Social Sciences. He is coauthor of Latent Curve Models: A Structural Equations Approach (with P. Curran, 2006, Wiley) and author of Structural Equations with Latent Variables (1989, Wiley) and of over 100 published papers. Bollen has been an instructor at the Interuniversity Consortium for Political and Social Research (ICPSR) Summer Training Program in Quantitative Methods at the University of Michigan in Ann Arbor since 1980. His primary areas of statistical research are structural equation models, measurement models, and latent growth curve models.