Latent Variable and Psychometric Network Model Re-Analyses of Freed, Hamilton, and Long
(2017)
Sara Anne Goring1, Christopher J. Schmank1, Michael J. Kane2, & Andrew R. A. Conway1
Abstract
It is well established that reading comprehension ability varies among individuals. However, researchers disagree about whether this variation reflects domain-general abilities (e.g., working memory capacity, WMC; Just & Carpenter, 1992) or domain-specific language skills acquired through experience (MacDonald & Christiansen, 2002). Recently, Freed, Hamilton, and Long (2017) reported that individual differences in reading comprehension can be largely attributed to language experience and concluded that WMC is not an important factor. After re-analyzing structural equation models from Freed et al. and generating a psychometric network model of their data, we find that WMC is more important to individual differences in reading comprehension than was suggested by Freed et al. Overall, it was confirmed that both domain-general and domain-specific processes are associated with reading comprehension.
Latent Variable and Psychometric Network Model Re-Analyses of Freed, Hamilton, and Long
(2017)
Considerable research at the intersection of memory and language has sought to identify the sources of individual differences in reading comprehension. Although individual differences in reading comprehension are well established, researchers disagree about their sources. One perspective argues that individual differences are due to domain-general abilities such as working memory capacity (WMC; Just & Carpenter, 1992). An alternative viewpoint is that they are due to domain-specific skills, such as vocabulary and other language abilities derived through experience (MacDonald & Christiansen, 2002).
Just and Carpenter (1992) proposed that individual differences in reading comprehension
result from variation in WMC resources available to generate activation. During comprehension,
language units become available for processing and computation by reaching a specific level of
activation. Cognitive resources are finite, however, and when capacity limits are nearly reached,
scaling-back procedures are initiated so only the most important items maintain or achieve
activation. These scaling-back procedures include pulling activation from previously processed
items to be used for upcoming items. The effect of this is seen when a student forgets
information from a previous sentence while processing a new difficult one. Having more
available resources is an advantage because capacity limits will not be reached as easily.
READING COMPREHENSION AND WORKING MEMORY 4
The domain-general perspective has been supported by research that compares subjects
with different levels of WMC. Just and Carpenter (1992) reported that adults with greater WMC
(higher-spans) process language more quickly and accurately than do adults with lesser WMC
(lower-spans). These differences become more evident when processing difficult or unfamiliar
lexical content and grammatical structures. For example, compared to lower-span subjects,
higher-span subjects read irregular reduced relative clauses faster (Ferreira & Clifton, 1986; Just
& Carpenter, 1992). Reduced relative clauses are difficult in that they do not contain an explicit
relative pronoun or complementizer (e.g., “that/who was”). Without the complementizer to orient
readers to the expected outcome, they are forced to adjust expectations mid-sentence. Consider
the phrase, “The defendant examined by the lawyer shocked the jury.” The word defendant could
initially be interpreted as the subject or action-taker of the phrase, setting up expectations for a
different outcome (e.g., “The defendant examined the evidence...”). Cognitive resources are thus
needed to adjust expectations and resolve ambiguity quickly. According to Just and Carpenter,
higher-span subjects had shorter reading times because they had more resources to process the
syntactic clues that suggested the correct interpretation. Lower-span subjects required more time
and resources to reread the clauses and resolve the confusion. This WMC effect has been
confirmed by other studies that have used ambiguity or syntactic complexity to increase the
difficulty of processing, thus increasing the amount of activation required and decreasing the
performance of lower-span subjects (Just & Carpenter, 1992; King & Just, 1991; MacDonald, Just, & Carpenter, 1992). Furthermore, in previous studies, researchers have reported that, similar to verbal span tasks, complex numerical span measures of WMC predict reading comprehension, although to a lesser extent than verbal
spans (Daneman & Merikle, 1996). This indicates that factors other than language-specific
resources are involved in the relationship between WMC and reading comprehension. Indeed,
McVay and Kane (2012) examined the relationship between WMC, reading comprehension, and mind-wandering, testing whether a general propensity for attentional lapses would influence performance on WMC and reading comprehension tasks and account for correlations between these two abilities. As predicted, not only did a WMC latent variable derived from verbal and non-verbal tasks predict comprehension, but mind-wandering rate (an index of attentional control) also partially mediated the relationship between WMC and comprehension. Subjects
with lower WMC were more prone to mind-wandering during both reading and non-reading
tasks, and this general propensity for inattention was related to decreased comprehension.
Additionally, a recent meta-analysis found that there is a positive association between reading
comprehension and executive function that remains consistent across age, regardless of the type
of executive function or reading comprehension measures that were used (Follmer, 2018). These
results (see also Unsworth & McMillan, 2013) confirm that the domain-general, attentional-control aspect of WMC contributes to reading comprehension.
We note that researchers from the domain-general perspective also acknowledge the role of domain-specific skills, such as vocabulary, linguistic fluency, and background knowledge (Baddeley, Logie, & Nimmo-Smith, 1985; Cromley & Azevedo, 2007). However, the domain-general perspective emphasizes the contribution of executive attention processes (Kane, Conway, Hambrick, & Engle, 2007; Kane & Engle, 2003). Supporting this perspective,
studies have confirmed that individual differences in WMC do significantly predict variation in
performance for reading comprehension tasks, and WMC also correlates with many language-related measures (Daneman & Carpenter, 1983; Mason & Just, 2007; Daneman & Merikle, 1996). Thus, according to this viewpoint, domain-specific and domain-general processes are both necessary, but the general, attention-related processes are emphasized.
Domain-specific Perspective
According to MacDonald and Christiansen (2002), verbal WMC is not a separate property that can vary independently
from other language processes, but an emergent characteristic of a complex, multi-layer system.
According to this view, language processing tasks and tests of verbal WMC both measure
language ability and are conceptually indistinguishable from one another. So, when measuring
verbal WMC per this viewpoint, one is essentially just measuring language ability, not a separate
process with an individual capacity, as purported by the domain-general view. According to the
domain-specific perspective, the ease and efficiency of language processing depends on the
complexity of the linguistic input being processed and how often similar input has been
experienced. Reading more often exposes the language processes to a wider variety of structures
and content, conditioning the entire system for more efficient processing. For example, low-experience readers have more difficulty processing irregular words than regular ones, unless the word is high-frequency and therefore likely to be more familiar. However, highly experienced readers process irregular and regular words with similar ease, regardless of frequency/familiarity (MacDonald & Christiansen, 2002). Experience is thus the important factor in the domain-specific view, which stresses the combined effects of language-specific skills such as vocabulary, verbal/reading fluency, and previously obtained cultural knowledge (Baddeley et al., 1985; Cromley & Azevedo, 2007; Verhoeven & Van Leeuwe, 2008). Having more experience, or
being exposed to more written and oral language, builds these language-specific skills
(MacDonald & Christiansen, 2002). Indeed, subjects’ processing of irregular sentence structures
improves from pre- to post-test after they are given additional practice with them (Wells,
Christiansen, Race, Acheson, & MacDonald, 2009). Thus, studies from the domain-specific perspective emphasize the role of language experience in reading comprehension.
MacDonald and Christiansen (2002) offered evidence for the domain-specific perspective
via reinterpretations of previous results (initially presented by Just & Carpenter, 1992) and
network simulations. For example, as discussed above, Just and Carpenter found that compared
to low-span subjects, high-span subjects had faster reading times for reduced relative clauses.
Just and Carpenter attributed these differences in performance to WMC limitations. However,
MacDonald and Christiansen ascribed these results to previous language experience. Compared
to less experienced readers (comparable to lower-span subjects), highly experienced readers (or
higher-span subjects) had more familiarity with irregular sentence structures and could anticipate
the sentence resolution (King & Just, 1991; MacDonald & Christiansen, 2002). This resulted in
faster reading times for highly experienced readers compared to less experienced readers.
Additionally, MacDonald and Christiansen supported their view with simulated network studies.
Prior to linguistic training, a simulated network generated results comparable to the performance
of low-span subjects. However, after training with irregular sentence structures, the network generated results comparable to the performance of high-span subjects. It is worth noting that this experience-based, domain-specific perspective cannot account for positive correlations between non-verbal WMC tasks and reading comprehension, due to the assertion that skill necessitates specific experience. Regardless, research continues to develop on both sides of this debate.
Freed, Hamilton, and Long (2017) examined the role of previous language experience in reading comprehension, relative to several other cognitive constructs (e.g., WMC, word decoding). A particular focus of this study was determining the role and importance
of WMC, relative to word decoding, language experience, verbal fluency, perceptual speed,
inhibition, and reasoning. Using structural equation modeling (SEM) techniques, they reported
the best fitting model to consist of direct relationships predicting reading comprehension from
only language experience and reasoning (see Figure 1). All other latent variables (including
WMC) were only related to reading comprehension indirectly, via overlapping correlations.
Regarding WMC, Freed et al. concluded that their results “question the need to include WMC” in models of reading comprehension.
The Freed et al. (2017) conclusion is not only strong, but it conflicts with much of the
reading comprehension literature. Such claims require further investigation. Although the Freed
et al. study is impressive in the number of relevant constructs it explored empirically, we identify
First, SEM techniques are not appropriate to answer the exploratory, open-ended
research questions proposed by Freed et al. (2017). SEM is a confirmatory analysis that is largely
theory-driven and typically guided by a specific, a priori model to test. Freed et al., in contrast,
conducted their analyses using a data-driven method with no explicit theoretical model, but
rather started the process by conducting an exploratory principal component analysis (PCA).
When conducting the SEM, the authors used program-generated modification indices to delete
“irrelevant” pathways between variables in their model, based solely on a data-driven basis with
no theoretical justification or interpretation. This approach obviously blurs the line between a
priori and post-hoc hypothesis development and makes the models less interpretable.
Additionally, we argue that the process of deleting pathways did not sufficiently improve fit to justify the changes to the model.
Second, the Freed et al. (2017) interpretation of direct versus indirect effects was
unclear. They correctly identified direct paths, such as the direct effect of language experience on
reading comprehension, but they misclassified correlations between latent variables as evidence
for indirect paths. For example, language experience had a direct relationship to reading
comprehension, and word decoding was correlated with language experience. Freed considered
this to reflect an indirect effect of word decoding on reading comprehension, via language
experience. For this to be a true indirect effect, word decoding would need a directed path to language experience, rather than a mere correlation with it.
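To make this distinction concrete, an indirect effect in path analysis is the product of the directed path coefficients along the chain; a covariance between two predictors does not license this computation. A minimal sketch with hypothetical path values (not figures from Freed et al.):

```python
# Hypothetical illustration: an indirect effect is the product of
# directed path coefficients along a chain, e.g.
#   decoding -> language experience -> reading comprehension.
# All values below are made up for illustration.

def indirect_effect(*path_coefficients):
    """Indirect effect along a directed chain = product of its paths."""
    effect = 1.0
    for coefficient in path_coefficients:
        effect *= coefficient
    return effect

# Hypothetical standardized paths:
a = 0.50  # decoding -> language experience (would need to be directed)
b = 0.76  # language experience -> reading comprehension
print(indirect_effect(a, b))  # 0.38
```

If the decoding-to-experience link is only a covariance, the product has no causal interpretation, which is the error we argue Freed et al. made.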
Third, choices made during the analysis plan and presentation were inconsistent with
standard practices. Freed et al. (2017) used a PCA with varimax (orthogonal) rotation to extract
their components/factors, but then allowed latent variables to correlate in the SEMs. This is
inconsistent across analyses, as the orthogonal rotation prevents correlations between factors, but
the SEM allowed for these relationships. Additionally, forcing orthogonality between
components/factors causes important information about the relationship between these factors to
be lost. Moreover, using an oblique rotation would have allowed the authors to analyze
correlations between components, had they existed, or shown orthogonality if that were the case.
Additionally, Freed et al. followed their PCA, which is an exploratory analysis (and which
identifies components rather than latent factors), with SEM which is a confirmatory factor-
analytic technique. Using a confirmatory analysis on an exploratory model generated from the
original data sample will always demonstrate good model fit, but that does not establish that the
model is valid. Finally, Freed et al. failed to report standardized regression coefficients in their published figures, making their SEM models uninterpretable. All of these factors combined raise concerns about the reliability of the reported results and conclusions.
Yet our most important critique of Freed et al. (2017) was the lack of a theoretically-
based rationale for the decisions made throughout the modeling process, particularly when
designing their models or determining which model to draw conclusions from. Predictors were
chosen for the initial model based on evidence of each individual variable’s predictive validity of
reading comprehension, but there was no theoretical basis presented for how these variables hang
together as a model of reading comprehension. So rather than starting with a theory-based model
to test, models were designed by including all (or a subset of) predictors and all possible pathways. Then, data-driven modification indices were used to remove pathways without any further explanation or justification for what these changes meant from a theoretical perspective. Interpretability aside, this actually limited the amount of information that could be obtained from certain models.
For example, for Model 1 (see Figure 1), allowing all of the factors to have a direct path to
reading comprehension would have identified the strength of each factor and allowed for a
comparison of the predictive strength between certain variables of interest, like reasoning and
WMC. Additionally, Freed arbitrarily removed factors between models in order to explore
specific relationships and the stability or robustness of certain associations. However, this goes
against standard practices for SEM, and this process fed into the larger issue of not justifying
why they chose the particular model (Model 1) as the basis for their conclusions. Many of the Freed et al. models demonstrated comparable fit indices and proportions of variance explained. Specifically, compared to Freed et al.'s chosen model (Model 1), a similar model, also presented by Freed et al. (Model 2, see Figure 2), contained all of the latent variables, but Freed et al. had removed
reasoning from this model. For Model 2, the direct relationship between reading comprehension
and language experience was maintained, but Freed claimed the removal of reasoning for Model
2 allowed for a significant direct relationship between WMC and reading comprehension. The
model fit indices were similar between the first model containing reasoning [χ2(333) = 542.88, p
< .001; CFI = 0.91; TLI = 0.90; RMSEA = 0.04], and the model with reasoning removed,
[χ2(239) = 390.63, p < .001; CFI = 0.92; TLI = 0.90; RMSEA = 0.04]. Also, including reasoning
in the model only added a negligible 2.91% of explanatory power, so there is not a clear data-
driven reason for why Model 1 was chosen over Model 2 (with reasoning: 76.68%, without
reasoning: 73.77%). Finally, including both reasoning and WMC in the same model could have
introduced the potential for redundancies or multicollinearity due to their conceptual overlap and
shared variance (Engle, 2001; 2002). Although ideally all measured predictors should be
included in a model, rather than being cherry-picked, this factor structure lacked a theoretical
basis to begin with, particularly considering that Freed et al. presented previous research supporting each variable's ability to predict reading comprehension, yet most variables were not given direct paths to reading comprehension in the model. Thus, considering that this model was not based on a testable theory, and that it was not remarkably better fitting than the other models conducted by Freed et al., it is unclear why it was chosen as the basis for their conclusions.
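The 2.91% figure is simply the difference between the two models' reported proportions of explained variance, which can be verified directly:

```python
# Difference in variance explained (in percent) between Freed et al.'s
# model with reasoning (76.68%) and without reasoning (73.77%),
# using the values reported in their paper.
r2_with_reasoning = 76.68
r2_without_reasoning = 73.77
gain = round(r2_with_reasoning - r2_without_reasoning, 2)
print(gain)  # 2.91
```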
The current study replicated and extended the work by Freed et al. (2017) to better
understand their SEM process and what the data imply about variation in comprehension. The
current study was conducted using Freed’s published correlation matrix and descriptive statistics.
A covariance matrix was generated to run a confirmatory factor analysis (CFA) on their original
measurement model to assess whether their model provided an adequate fit to the data. Next,
three SEMs were assessed: (a) Model 1, containing direct paths from language experience and
reasoning to reading comprehension, which represents the first reanalysis of a model from Freed,
see Figure 2; (b) Model 2, containing direct paths from WMC and language experience to
reading comprehension, which was the second reanalysis of a model from Freed, see Figure 3;
and (c) Model 3, which is the only SEM that was not initially presented by Freed and contained
direct paths from all latent variables to reading comprehension, see Figure 4.
Replicating Model 1 (Figure 2) and Model 2 (Figure 3) from Freed et al. (2017) served a
practical purpose. Since Freed et al. did not present their standardized path weights, their models
were largely uninterpretable. Thus, replication was necessary to get a better understanding of the models and the relative strength of their paths.
The extension component of Study 1 was to analyze a third model that was not produced
by Freed et al. (2017), containing direct paths from all predictor factors to reading
comprehension, Model 3 (Figure 4). The purpose of Model 3 was to test whether, when all
factors are given a direct path to reading comprehension, WMC would maintain a significant
direct path to reading comprehension while reasoning does not. Such an outcome would directly
challenge the conclusion drawn by Freed et al., that reasoning maintains a significant direct path
to comprehension while WMC does not. Model 3 will also allow an assessment of all the predictors simultaneously. Finally, reproducing all three models will provide a means to examine and compare model fit of
all of the SEMs utilizing the factor structure designed by Freed et al.
Method
We present only the methodological details that are necessary to understand and
evaluate our re-analyses of the Freed et al. (2017) data. For more information regarding the methodology, see Freed et al. (2017).
Subjects. Three hundred and fifty-seven young adults, sampled from a four-year
university and a community college, participated in the study. However, only 346 participants
were used in the subsequent analyses, consistent with Freed et al. (2017).
Measures. Twenty-six manifest variables were explained by seven factors (not including reading comprehension or its related observed variables): decoding, WMC, inhibition, language experience, reasoning, perceptual speed, and fluency. The target variable, reading comprehension, was measured using two manifest variables, described below.
Decoding. In the Phonological Decision task, subjects viewed two non-word letter
strings, and determined which one, if spoken aloud, would sound like a real word (e.g.,
HOWSE). For Non-Word Naming, subjects viewed 100 pronounceable non-words (e.g.,
plambust) and pronounced them as quickly as possible. For the Orthographic Decision task,
subjects viewed two letter strings and decided which was a correctly spelled word (e.g., DEAL
vs. DEEL).
WMC. Reading Span involved 15 sets of sentences and 60 target words. After subjects viewed a sentence, they decided whether it made sense or not, followed by presentation of an
unrelated target word. After presentation of 2–7 sentences, subjects recalled all target words in
order. For Alphabet Span, subjects viewed 25 lists of words, with 2–7 words in each. Subjects
recalled each word list in alphabetical order. Minus Span consisted of subjects viewing 35 sets
(2–8 numbers in each set) of random numbers; following presentation of each number, subjects
subtracted a value of 2 from each number. Afterwards, subjects reported the differences in order
per set. Finally, for Visual Number Span, subjects viewed 24 sets of digits (4–13 digits presented
sequentially in each set). Subjects recalled each digit list in reverse order of presentation.
Inhibition. In a Stroop task, subjects named the hue for 105 letter strings (one at a time)
that consisted of either a color name or an unrelated word; color names appeared in either
congruent or incongruent hues (e.g., GREEN in red hue). Subjects' naming latencies were used
for analysis.
Language Experience. Language experience measures assessed vocabulary, background knowledge, and print exposure (also called reading frequency).
Vocabulary was measured using: (a) Form F or Form G from the Nelson-Denny Reading Test,
which required subjects to complete sentences with the final word missing, and (b) two more traditional, multiple-choice vocabulary tests from the Ekstrom Battery of Factor-Referenced Tests (Ekstrom et al., 1976). Background knowledge was assessed with tests covering a variety of topics. Cultural literacy was assessed using a test from Test-Prep Your IQ with the Essentials
of Cultural Literacy (Zahler & Zahler, 2003) about American History, Geography,
Myth/Religion, Science, and Art. Science knowledge was tested with a novel measure, in which
subjects distinguished names of scientists from foils. A total of 100 names were listed and half
were Nobel Prize winners (scientists); the others were names from the National Academy of
Two measures of print exposure assessed reading frequency: the Author Recognition
Test (Stanovich & West, 1989) and the Reading Habits Questionnaire (Scales & Rhee, 2001). In
the former, subjects distinguished real authors (from the New York Times Best Seller List) from foils; in the latter, subjects responded to self-report scales about their reading habits, skills, and
preferences.
Reasoning. Reasoning measures included the first two sets from Raven’s Advanced
Progressive Matrices (Raven, 1962) and measures from the Ekstrom Battery of Factor-
Referenced Tests (Ekstrom et al., 1976). Raven’s Advanced Matrices consisted of 48 items in
which subjects induced the missing element to complete a pattern. The Ekstrom Battery included
(a) Arithmetic Aptitude (two sections of 15 arithmetic word problems); (b) Mathematics Aptitude; and (c) Necessary Arithmetic Operations (subjects were asked to determine which numerical operations were necessary to solve two
sections of 15 items).
Perceptual Speed. In Letter Comparison, subjects identified whether two letter strings were the same or different, from two timed lists of 21 pairs of letter strings. For Pattern Comparison, subjects determined whether two lists of 15 pairs of patterns were the same or different from each other within a 30 s time limit. In Finding A's, subjects had to find five words containing the letter “a” in each of five columns of words within 2 min. For the Identical Pictures measure, subjects viewed rows of geometrical figures and determined which of them were identical to the first figure in the row. Subjects completed 48 rows of these figures in 2 lists, with 2 min per list.
Fluency. For Word Beginnings, subjects reported as many words as possible that started
with a specified letter within 3 min per prompt (a total of 2 prompts were given). In Word
Endings, subjects reported as many words as possible that ended with a prompted letter (2
prompts given with 3 min per prompt). Two pairs of prompts were given with 3 min to complete each pair.
Reading Comprehension. Reading comprehension was measured using the comprehension section of the Nelson-Denny Reading Test, which consisted of reading a passage and answering
questions from either Form F (36 questions) or Form G (38 questions; Brown, Bennett, & Hanna,
1980; Brown, Fischo, & Hanna, 1993). A novel measure, the Investigator-Generated-Test,
presented 10 multiple-choice questions that assessed memory and understanding of the main ideas of a text passage.
Data Analysis Plan. Descriptive statistics and correlations (See Table 1; N = 357) of all
measured variables provided by Freed et al. (2017) were used to generate a covariance matrix.
This covariance matrix was used to replicate their measurement model directly, using a CFA1.
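This rescaling step can be sketched as follows (Python shown for illustration only; the actual analyses were run in R with lavaan, and the matrix below is hypothetical, not Freed et al.'s data). Each covariance is the correlation scaled by the two variables' standard deviations, cov(i, j) = sd_i × r_ij × sd_j:

```python
# Rescale a correlation matrix into a covariance matrix using reported
# standard deviations: cov[i][j] = sd[i] * corr[i][j] * sd[j].
# The 3-variable matrix below is hypothetical.

def correlation_to_covariance(corr, sds):
    """Convert a correlation matrix (list of lists) and a list of
    standard deviations into a covariance matrix."""
    n = len(sds)
    return [[sds[i] * corr[i][j] * sds[j] for j in range(n)]
            for i in range(n)]

corr = [[1.0, 0.5, 0.3],
        [0.5, 1.0, 0.4],
        [0.3, 0.4, 1.0]]
sds = [2.0, 1.0, 3.0]
cov = correlation_to_covariance(corr, sds)
# The diagonal of cov holds the variances (squared SDs).
```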
Next, 3 SEMs were generated (two were replications of models presented in Freed): (a) Model 1,
the Freed et al. chosen model that had direct paths from language experience and reasoning to
reading comprehension (Model 1 in Freed et al.); (b) Model 2, which is the same as Model 1
except the latent variable for reasoning has been deleted, and there is now a direct path to reading
comprehension from WMC (Model 3 in Freed et al.); (c) Model 3, a novel model that contains
all latent variables and includes direct paths from each of them to reading comprehension (not
presented in Freed et al.). Model fit statistics were used to assess and compare the fit of these
models to the data in accordance with the standards suggested by Kline (2005) and Schreiber, Nora, Stage, Barlow, and King (2006). We analyzed these models in R using the lavaan package (Rosseel, 2012).

1 Despite using their correlation matrix and specifying the exact same models presented by Freed et al. (2017), there is a large disparity in model fit between the current models and those conducted by Freed et al. The current authors used all of the information available from Freed et al.'s paper and supplementary materials, thus the cause of this disparity remains unclear. Results and discussion will concern the analyses conducted in the current project.
Results
Confirmatory Factor Analysis. A CFA was used to test the 7-factor measurement
model originally proposed by Freed et al. (2017; see Figure 1). This model includes factors
representing: WMC, reasoning, perceptual speed, decoding, fluency, inhibition, and language
experience. Inhibition is not a true factor, as it only has one indicator variable. Based on the
summary of model fit statistics presented in Table 2, the measurement model demonstrated
acceptable fit for the ratio of χ2 to df and for the SRMR. However, the comparative fit indices (CFI/TLI) and the RMSEA were not in the acceptable range. This could be due to some redundancy across
factors, for example WMC and reasoning were correlated at .58 (p < .001) and language
experience and fluency were correlated at .57 (p < .001). Regardless of fit, for the sake of direct comparison with Freed et al., we retained this measurement model in the subsequent SEMs.
Structural Equation Model 1. The first model was a replication of Model 1 from Freed
et al. (2017), consisting of the seven latent predictor variables from the measurement model and
a latent variable representing reading comprehension (See Figure 1 for the associated factor
loadings). The model accounted for 79% of the variance in reading comprehension with two
predictor variables that had direct effects: language experience (β = .76, p < .001) and reasoning (β = .24, p = .001). These predictor variables also had shared covariances with the other predictor
variables in the model. The remaining latent variables (WMC, decoding, perceptual speed,
fluency, inhibition) did not have direct paths to reading comprehension, but were related to other
predictor variables through shared covariance (see Figure 2 for path values). The only
paths/relationships that were included were consistent with what Freed et al. also maintained in
their model. For model fit statistics see Table 2. Model 1 demonstrated acceptable fit for the
following indices: the ratio of χ2 to df, and SRMR. Similar to the measurement model, the CFI/TLI and RMSEA were not in the acceptable range.
Structural Equation Model 2. The second model was a replication of Model 3 from
Freed et al. (2017), consisting of the six latent predictor variables from the measurement model
(language experience, WMC, decoding, perceptual speed, fluency, inhibition) and reading
comprehension. The model accounted for 77% of the variance in reading comprehension with
two predictor variables with direct effects: WMC (β = .22, p = .001) and language experience (β = .77, p < .001). The remaining predictor variables did not have direct paths but were related to
certain other predictor variables (including WMC and language experience) via shared
covariance (See Figure 3 for path values). The only paths/relationships that were included were
consistent with what Freed et al. also maintained in their model. For model fit statistics see Table
2. Model 2 demonstrated acceptable fit for the ratio of χ2 to df, and SRMR. Similar to previous
models, CFI/TLI and RMSEA, were not in the acceptable range. Model 2 did produce smaller
AIC and BIC values than Model 1, indicating better fit than the previous model containing all
predictor variables. However, this is only because this model is less complex, owing to the removal of the reasoning factor.
Structural Equation Model 3. The third model was not included in Freed et al. (2017)
but consists of their original 7-factor model structure (WMC, decoding, perceptual speed,
fluency, inhibition, language experience, and reasoning) and reading comprehension. Rather than
removing pathways from the model (as done in Model 1 and Model 2 by Freed et al.), each
exogenous variable had a direct path to reading comprehension and was allowed to correlate with
all other latent variables. The purpose of this model was to determine whether any pathways
were overlooked in the first two models. In fact, all latent variables except for reasoning,
perceptual speed, and inhibition had significant direct paths to reading comprehension,
including: language experience (β = .76, p < .001), WMC (β = .20, p = .027), decoding (β = −.20, p = .018), and fluency (β = −.25, p = .037). This model accounted for 85% of the variance in
reading comprehension. Model fit statistics indicate comparable fit to that of Model 1 and Model
2, demonstrating acceptable fit for the ratio of χ2 to df, and SRMR. Again, the comparative fit
indices (CFI, TLI), RMSEA, and 90% confidence intervals around RMSEA were not in the
acceptable range.
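As a sanity check on reported fit, RMSEA can be recomputed directly from a model's χ2, df, and sample size. A short sketch using one common formulation and the χ2 values Freed et al. (2017) published for their models with and without reasoning (SEM software can differ slightly in the denominator, so treat this as approximate):

```python
import math

# Recompute RMSEA from chi-square, degrees of freedom, and sample size
# using the common formulation
#   RMSEA = sqrt(max(chi2 - df, 0) / (df * (N - 1))).
# Some software uses N instead of N - 1; both round to 0.04 here.

def rmsea(chi2, df, n):
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

N = 346  # subjects retained in the analyses
# Freed et al.'s model with reasoning and model with reasoning removed:
print(round(rmsea(542.88, 333, N), 2))  # 0.04
print(round(rmsea(390.63, 239, N), 2))  # 0.04
```

Both values reproduce the reported RMSEA of 0.04, so the reported χ2 and RMSEA values are at least internally consistent.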
Discussion
The purpose of Study 1 was to reassess conclusions made by Freed et al. (2017) that
WMC does not need to be included in cognitive models of reading comprehension. Freed et al.
used SEM techniques to assess the role of domain-general and domain-specific factors
underlying reading comprehension and determined that language experience was the strongest
predictor of this ability. Additionally, Freed et al. concluded that when both WMC and reasoning
were included in a model, only reasoning maintained a significant direct path to reading
comprehension (see Figure 2). We reassessed two of the Freed et al. models: Model 1 (Figure 2), the model on which Freed et al. based their final conclusions in the original publication, and Model 2 (Figure 3), a second, similar model presented in the original publication without a latent variable representing reasoning. The unacceptable fit of both models
to the data demonstrates the lack of clear justification for choosing one model over the other.
Additionally, Freed et al. did not provide a theory-based reason for choosing Model 1 over
Model 2, even though these models offer different interpretations of the data. Model 1 indicated
no direct relationship between WMC and reading comprehension; Model 2 did indicate a
relationship. These concerns highlight that neither a data-driven nor a theoretically
driven explanation was provided to support the choice of model, which ultimately calls into
question the conclusions based on that choice.
In the current study, we tested a third and novel SEM that was not presented by Freed et
al. (2017), Model 3, containing direct paths from all factors to reading comprehension. Fluency,
decoding, WMC, and language experience were all significant predictors of reading
comprehension, with language experience as the strongest predictor, a result consistent with both
theoretical perspectives (Just & Carpenter, 1992; MacDonald & Christiansen, 2002). More importantly, this model also tested whether,
when both WMC and reasoning are given direct paths to reading comprehension, only the path
for reasoning remains significant. We disconfirmed the Freed et al. conclusion and found that
WMC maintained a significant direct path to reading comprehension, while reasoning did not.
However, as with the two previous SEMs, Model 3 demonstrated unacceptable fit to the
data. This suggests that there are likely core issues with the measurement model designed by
Freed et al., which served as the factor structure for all three SEMs. Some of the model
fit indices from our CFA measurement model were outside of the acceptable range, similar to the
SEMs. Additionally, many of the factor loadings were inconsistent with one another. For
example, the seven factor loadings for language experience ranged from .37 to .88.
Although the low value is still within the “acceptable” range, from a theoretical perspective, this
wide range of loadings seems to indicate conceptual and statistical
inconsistencies within the factor. This is unsurprising, as Freed et al. initially intended this factor to
be three separate factors: vocabulary, print exposure, and background knowledge. However, they
reported that these measures all loaded onto a single factor, which was therefore labeled language experience.
Many of the other factors also have uneven factor loadings, and this could indicate that some
manifest variables may be double-loading or simply inappropriate for the given factor.
Additionally, the factor representing inhibition is not truly a factor at all, but rather a constant, as
there is only one manifest variable underlying this factor. This constant is inappropriate to use in
the measurement model and could be contributing to the poor fit of the models, particularly
because it is not correlated with both indicators of reading comprehension (See Table 1). Due to
these serious concerns regarding the measurement model, we next sought to reexamine the
relationship between WMC and reading comprehension using alternative techniques that do not
rely on a latent variable measurement model.
Study 2 served as an additional extension of the work of Freed et al. (2017), using an
alternative, more exploratory statistical analysis. Specifically, we used the correlations and
descriptive statistics reported by Freed et al. to generate psychometric network models.
Psychometric network analysis uses partial correlations between all observed variables to depict
a construct as an emergent quality generated from a network of interacting mechanisms (Epskamp,
Borsboom, & Fried, 2018). Rather than using latent variables, each observed variable is represented by a node,
and partial correlations between variables are represented through connections called edges.
Instead of using a common latent factor to explain the relationships between observed variables,
all one-to-one relationships between observed variables are depicted. Recently, network analysis
has been used in the field of clinical psychology to assess networks of psychological disorders,
such as symptoms of clinical depression (McNally et al., 2015; McNally, Mair, Mugno, &
Riemann, 2017; Van der Maas et al., 2017). For more information about using network analysis
in psychological research, see Epskamp, Borsboom, and Fried (2018).
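To make the edge-estimation step concrete, the partial correlations that define network edges can be derived from the inverse of the correlation matrix (the precision matrix). A minimal Python sketch with an invented 3 × 3 correlation matrix (the authors' analyses used R's qgraph, and these data are not from Freed et al.):

```python
import numpy as np

# Invented correlation matrix for three hypothetical measures
R = np.array([[1.0, 0.5, 0.4],
              [0.5, 1.0, 0.3],
              [0.4, 0.3, 1.0]])

P = np.linalg.inv(R)              # precision (inverse correlation) matrix
d = np.sqrt(np.diag(P))
partials = -P / np.outer(d, d)    # standardize and flip the sign
np.fill_diagonal(partials, 1.0)
# Each off-diagonal entry of `partials` is the correlation between two
# variables after conditioning on all remaining variables (a network edge).
```

Note that the partial correlation between the first two variables (≈ .43) is smaller than their zero-order correlation (.50), because variance shared with the third variable has been conditioned out; regularization would additionally shrink weak edges to exactly zero.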
There are several reasons to use network analysis to examine the processes underlying
reading comprehension. First, network analysis is more appropriate than SEM considering the
research question. The goal set by Freed et al. (2017) was to determine how reader
characteristics are related, focusing on the role of WMC and language experience (vocabulary
and word knowledge). This was an exploratory, open-ended approach that should have been
paired with an exploratory analysis. Network analysis is much more exploratory than SEM, and
network models can be adjusted to favor either a sparser, more conservative structure or a more
interconnected, exploratory one.
A second justification for using network analysis is to better examine how each manifest
variable relates to the outcome variable and to all of the other observed variables in the network.
Unlike latent variable modeling, network models allow you to visualize and explore relationships
among all of the manifest variables. Visually, you can determine whether a node is more relevant
to the network by examining the number of edges connecting each node to the other nodes in the
model. Edge appearance can also indicate information about variable relationships, such as
direction and magnitude. Moreover, how closely the nodes are clustering together also indicates
important association information, as the closer the clustering the stronger the relationship.
A third justification for using network analysis is that it provides information about each
variable and their individual contribution to the network. There are data-driven indicators of the
relevancy of each node compared to other nodes in the network, including centrality indices and
clustering coefficients that are computed for each variable. Centrality indices specify the
importance or relevance of each variable to the network through measures of strength (total edge
weights connecting the node to other nodes), betweenness (how often a node bridges paths
between other nodes), and closeness (proximity to other nodes; Epskamp and Fried, 2016).
However, there is some controversy surrounding the use and interpretation of centrality indices,
particularly because they can become artificially inflated and difficult to interpret (Dablander &
Hinne, 2018; Bringmann et al., 2018). For that reason, the current project will not use centrality
indices as a means to compare the strength of each node relative to the outcome variable, but
rather will plot them alongside the clustering coefficient to find extreme scores indicating
redundancy or irrelevancy. The clustering coefficient generated for each variable is an index of
redundancy (Epskamp & Fried, 2016). It is desirable to have a certain level of clustering, as it
indicates that a node is related to other nodes in the network, but extreme values could indicate
redundancy. Comparing a node’s clustering coefficient alongside its associated centrality indices
will allow assessment of whether each variable is offering unique and relevant information to the
network. Nodes that have low centrality, but a high clustering coefficient indicate that these
nodes are likely redundant with other nodes in the network. Nodes that have both low centrality
and low clustering indicate that the node is not related to many other variables in the network
and could be considered irrelevant. Both of these types of nodes are potential candidates for
removal from the network as they could be negatively impacting the overall fit of the model to
the data.
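The strength and clustering indices described above can be computed directly from a weighted adjacency matrix. A Python sketch on an invented 4-node network (strength as summed absolute edge weights; a simple unweighted local clustering coefficient, whereas packages such as qgraph offer weighted variants):

```python
import numpy as np

# Invented edge-weight (partial correlation) matrix for a 4-node network
W = np.array([[0.0, 0.3, 0.2, 0.0],
              [0.3, 0.0, 0.4, 0.1],
              [0.2, 0.4, 0.0, 0.0],
              [0.0, 0.1, 0.0, 0.0]])

# Strength centrality: total absolute edge weight attached to each node
strength = np.abs(W).sum(axis=1)

def local_clustering(W):
    # Fraction of each node's neighbor pairs that are themselves connected
    # (unweighted version, used here only for illustration).
    A = (W != 0).astype(int)
    coefs = []
    for i in range(len(A)):
        nbrs = np.flatnonzero(A[i])
        k = len(nbrs)
        if k < 2:                 # undefined with fewer than two neighbors; use 0
            coefs.append(0.0)
            continue
        links = A[np.ix_(nbrs, nbrs)].sum() / 2
        coefs.append(links / (k * (k - 1) / 2))
    return np.array(coefs)

clustering = local_clustering(W)
```

In this toy network the last node has low strength and zero clustering, the profile the text describes as an "irrelevant" node.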
A fourth benefit of network analysis is that, like other types of modeling, fit indices can
be obtained to indicate network model fit to the data. Network models generate fit indices similar
to those used to evaluate SEMs and factor analyses, such as CFI/TLI, RMSEA, SRMR, and χ2
values. Currently, these indices allow only qualitative comparisons across different types of models.
However, these indices can be compared between multiple network models to determine whether
changes made, such as the removal of nodes, have improved overall fit.
A fifth and final benefit of network analysis is that it can be used as a complement to
other types of analyses, without the controversial subjectivity that comes with the use of latent
variables (Borsboom, Mellenbergh, & Van Heerden, 2003). There is debate as to whether latent
variables truly represent and measure something in reality or are simply mathematical artifacts.
Moreover, the same factor can be defined or named differently by different researchers, leading to
inconsistent interpretations across studies. Accordingly, a key component of this project is to
demonstrate how network analysis can be used as a tool to confirm, modify, and inform future
models (both latent variable and network). More so than latent variable models, network models
are easy to produce and modify, allowing researchers to quickly explore alternative specifications.
Essentially, this analysis offers an alternative perspective on the data that can provide insight into
the relationships being investigated. Using network analysis, the clustering of the nodes can be
used to confirm latent variables or factor structures, as we would expect nodes belonging to the
same factors to be clustered closely together. If the clustering is inconsistent with the factor
structure of the latent variable model, this could indicate changes that need to be made to the
measurement model. In addition, data-driven criteria (e.g., centrality and clustering indices) have
been established that can be used to remove superfluous variables. If removing these nodes results in
improvement to the overall fit of the network model, this indicates that removing these variables
from the latent variable model could also improve the fit of the latent variable model. Finally, the
exploratory nature of network modeling may uncover information about the relationships
between observed variables, latent variable models, or the measures used that is not readily
apparent from latent variable modeling alone.
The purpose of Study 2 is to reexamine how specific domain-general (i.e., WMC) and
domain-specific (i.e., language experience) factors relate to reading comprehension. In addition,
Study 2 demonstrates how psychometric network modeling, an approach that is rapidly gaining
traction in psychology, can complement traditional latent variable analyses.
Method
The participants, materials, and procedure for Study 2 are the same as those in Study 1.
Results
Psychometric Network Model 1. The first network model was generated using the
correlations between all the measured variables (see Figure 5 for network visualization). The
network analysis was conducted using the qgraph package available in R, using the graphical
least absolute shrinkage and selection operator (gLASSO) based on the extended Bayesian
information criterion (EBIC; Friedman, Hastie, & Tibshirani, 2008). Regularization techniques
are used to eliminate spurious edges between nodes that can occur due to sampling error, and
setting the EBIC hyperparameter (gamma) conservatively, as done for this network (γ = .50),
favors a sparser network model. A second component, the tuning parameter (lambda), was set to
λ = .01, which limits spurious edges while retaining true edges, per the
recommendations for psychometric network analysis (Epskamp, Lunansky, Tio, & Borsboom,
2018). Assessment of the fit indices was again consistent with the recommendations of Kline (2005).
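The EBIC that the gamma hyperparameter enters has a simple closed form. A Python sketch follows (the formula is the Foygel-Drton-style EBIC commonly used for regularized Gaussian graphical models; all numeric values are invented):

```python
import math

def ebic(log_lik, n_edges, n_obs, n_nodes, gamma=0.5):
    # Extended BIC for a Gaussian graphical model:
    #   EBIC = -2*logL + E*log(n) + 4*gamma*E*log(p)
    # where E = number of nonzero edges, n = sample size, p = number of nodes.
    return (-2.0 * log_lik
            + n_edges * math.log(n_obs)
            + 4.0 * gamma * n_edges * math.log(n_nodes))

# Higher gamma penalizes each edge more heavily, favoring sparser networks;
# the candidate with the lowest EBIC is retained (values here are invented).
dense = ebic(log_lik=-95.0, n_edges=10, n_obs=200, n_nodes=10)
sparse = ebic(log_lik=-100.0, n_edges=5, n_obs=200, n_nodes=10)
```

With γ = .50, the sparser candidate wins here despite its slightly lower likelihood, illustrating why a conservative gamma yields a sparser network.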
For the most part, node clusters were consistent with latent variable loadings, meaning
observed variables belonging to the same latent variables tended to cluster together, consistent
with Freed et al. (2017). However, there was an exception to this: Raven’s Progressive Matrices
clustered more closely with WMC nodes rather than Reasoning (which was reflected by several
math reasoning tasks). This likely contributed to the less than acceptable fit indices for the
measurement model CFA. Unlike the previous SEMs, model fit statistics for the network model
nearly all indicated good fit. This was true for all but CFI/TLI values which, although lower than
desired, were still substantially higher than in the previous models (see Table 2). Although at face
value the network model fit statistics seem to indicate better fit than the SEMs, the network and
latent variable models can only be compared qualitatively.
Psychometric Network Model 2. The next step was to determine whether there were
any redundant or irrelevant nodes in need of removal to improve the network fit. Although the
previous network model displayed acceptable fit to the data, there was still room for
improvement. One concern was the disproportionate number of language experience nodes
compared to the number of nodes representing the other
concepts/factors in the network. More nodes representing one concept compared to the others
could influence the structure or layout of the network. Three plots were generated with the
clustering coefficient along the X-axis and each of the centrality indices (betweenness, strength,
and closeness) along the Y-axis (see Figure 6). To be conservative, only nodes with the most
extreme values were considered for removal. Nodes that are highly redundant would be in the
lower right quadrant, indicating a high clustering coefficient, but low centrality. Nodes that are
highly irrelevant would be in the lower left quadrant, indicating low clustering and low
centrality. The node representing the Stroop effect (the only inhibition measure)
appeared largely unrelated to the rest of the network, as indicated by the lack of edges
connected to it (see Figure 5). Moreover, the plots confirmed that this node was
irrelevant, showing the lowest centrality and clustering of all the nodes (see Figure
6). Thus, this node was flagged for removal from the network. When examining nodes for
potential redundancy, the node representing Reading Questionnaire (a language experience node)
seemed to be the most extreme, with very low centrality and the highest clustering relative to the
other nodes in the network; thus, this node was also flagged for removal.
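The quadrant-based screening described above can be sketched as a small function; the quartile cutoffs are our illustrative choice (in the paper the flagging was done visually from the plots in Figure 6), and the node scores are invented:

```python
import numpy as np

def flag_nodes(centrality, clustering):
    # 'Redundant' = low centrality but high clustering;
    # 'irrelevant' = low centrality and low clustering.
    # Quartile cutoffs are an illustrative choice, not taken from the paper.
    centrality = np.asarray(centrality, dtype=float)
    clustering = np.asarray(clustering, dtype=float)
    c_low = np.quantile(centrality, 0.25)
    k_low = np.quantile(clustering, 0.25)
    k_high = np.quantile(clustering, 0.75)
    redundant = (centrality <= c_low) & (clustering >= k_high)
    irrelevant = (centrality <= c_low) & (clustering <= k_low)
    return redundant, irrelevant

# Invented scores for four nodes: node 0 looks redundant, node 3 irrelevant
redundant, irrelevant = flag_nodes([0.1, 0.9, 0.8, 0.1], [0.9, 0.5, 0.4, 0.05])
```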
After these nodes were removed, the network was estimated again and the redundancy plots
were generated a second time to determine whether any issues of redundancy remained (see
Figure 7). Two nodes still demonstrated a high degree of redundancy, Advanced Vocabulary and
Author Recognition Test (language experience nodes). For each measure of centrality, these
nodes showed extreme redundancy and limited centrality, thus these nodes were removed as
well. Overall, then, the inhibition factor from the SEMs only contained one measure, making it
an inappropriate factor to use in the first place. Compounding the issue, the inhibition node in the
network model was unrelated to the other nodes in the network, but for one edge. Additionally, the
three language experience nodes that were removed seemed conceptually overlapping at face value
or statistically redundant (Reading Questionnaire, Advanced Vocabulary, and Author Recognition
Test). Removing these nodes allowed each concept within the network to be represented equally
by two to four nodes, rather than biasing the network toward language experience.
The network analysis was run again with the four nodes removed, with all the same
parameters set as the previous network model (Network Model 2; See Figure 8). As expected,
language experience nodes were more closely clustered with reading comprehension nodes due
to the conceptual overlap of these two abilities. However, removal of the redundant and
irrelevant nodes allowed the relationship between WMC and reading comprehension to be more
pronounced. This was evidenced by the closer clustering (compared to Network Model 1) and
visible edges between these two concepts’ nodes. As with the previous model, the reasoning
node representing Raven’s Progressive Matrices was clustered more closely with the WMC than
the reasoning nodes, highlighting a likely issue with the measurement model underlying the
SEMs presented by Freed et al. (2017). The changes made to the network substantially improved
the overall fit of the model (see Table 2).
Psychometric Network Model 3. A final network model focused only on the
relationships among reading comprehension, WMC, and language experience. This was
done to test the prediction that, although language experience nodes would be more strongly
related to reading comprehension, WMC nodes would also have connections to reading
comprehension. Thus, we conducted a model with just the nodes representing these three
concepts and also set the code to display the edge weights connecting the nodes in the network.
This was done to offer numerical evidence for the relationships among these observed variables.
Network Model 3 (parameters set consistent with previous networks; See Figure 9; Table 2)
demonstrated excellent fit to the data. Language experience nodes had stronger edge weights
connecting them to reading comprehension, but each WMC node had partial correlations connecting
them to at least one reading comprehension node (See Discussion for possible explanations and
elaboration). Although the edge weights connecting WMC and reading comprehension were
small (between .10 and .11), these represent partial correlations that were significant enough to
not be removed by the tuning parameter that was set to remove spurious edges.
Discussion
The purpose of Study 2 was to reexamine the relationships among WMC, language
experience, and reading comprehension using psychometric network analysis, and to explore
potential problems with the measurement model underlying the latent variable models from Freed et al. (2017).
The initial psychometric network model, although it fit the data acceptably, was
imbalanced (many more language experience nodes than nodes for other concepts), with several
redundant nodes and one irrelevant node (Network Model 1; see Figure 5). Once the irrelevant
and redundant nodes were removed from the analysis, the second network model indicated
excellent fit to the data (Network Model 2). Network Model 2 demonstrated the relationship
between WMC and reading comprehension nodes, as well as language experience and reading
comprehension nodes via multiple connecting edges and close node clusters (see Figure 8). To
explore these relationships further, language experience, WMC, and reading comprehension
were used in a separate network model. As expected, Network Model 3 showed that language
experience had strong associations to reading comprehension, but also demonstrated significant
partial correlations between each WMC node and one of the reading comprehension nodes. This,
combined with the results from SEM Model 3, provides a convincing case for the relationship
between WMC and reading comprehension. At the very least, these results challenge the
conclusion from Freed et al. (2017) that WMC is not necessary for models of reading
comprehension.
Another goal of this study was to demonstrate how network analysis can be used as a
complementary tool to latent-variable modeling. This was achieved by using network analysis to
explore the issues that likely contributed to the poor fit of the SEMs from Freed et al. (2017),
specifically relating to the measurement model used. One of the reasoning nodes (Raven’s
Progressive Matrices) clustered more closely to the WMC nodes than the other reasoning nodes.
All of the other reasoning measures were arithmetic-based, so it is possible that Raven’s
Progressive Matrices has more in common with the WMC complex-span tasks. Regardless, this
indicates that the reasoning factor is not consistently defined, which is more than likely
contributing to the poor fit of the models produced under this factor structure. Another concern for
the measurement model involves the inhibition factor which, as a single node in the network
analysis, was largely unrelated to the other nodes in the network, with only one edge connecting it
to another node. Indeed, removing this node from the network improved the overall fit, likely
indicating that removing this variable from the measurement model could also improve the fit of
the SEMs.
Other issues uncovered by the network models directly involve the underlying measures
used in the SEMs presented by Freed et al. (2017). In addition to the irrelevant inhibition node,
the network model also indicated that three of the language experience nodes were highly
redundant with one another (Reading Questionnaire, Advanced Vocabulary, and Author
Recognition Test). This is unsurprising as many of the nodes overlapped conceptually (e.g., three
different measures of vocabulary were used). Moreover, for the latent variable model, the
language experience factor initially was supposed to be three separate factors that were
subsequently combined and defined inconsistently. Thus, some of the language experience
measures were too similar, while others were too dissimilar from each other. The removal of
these nodes improved the fit of the network model, again indicating that removing these
measures from the measurement model could improve the fit of the overall SEM.
Another problem with the measures used was emphasized by the inconsistent
relationships between WMC nodes and each of the reading comprehension nodes. All WMC
nodes had edges connecting them to the reading comprehension node representing Investigator-
Generated Comprehension. However, none of the WMC nodes had any significant connections
to the node representing Nelson-Denny Comprehension. Notably, the Nelson-Denny
Comprehension measure has been criticized as not being a true measure of reading
comprehension, but rather of background knowledge and
reasoning (Ready, Chaudhry, Schatz, & Strazzullo, 2013). This challenge is supported by results
indicating those with higher IQ or reading ability are able to accurately answer the test questions
without even reading the associated passages. If reading comprehension is not truly being
exerted or assessed by the measure, then it is possible that some of the skills or processes
underlying reading comprehension were not employed to complete the task. For example,
perhaps the Nelson-Denny Comprehension Test does not tax working memory enough
to allow the relationship between the WMC nodes and the Nelson-Denny Comprehension node to be
detected. In contrast, perhaps the Investigator-Generated Comprehension measure
was more reflective of reading comprehension and thus properly taxed WMC, allowing for the
relationship between this node and WMC nodes to materialize. Adding to this, the constructs of
reading comprehension and language experience were not convincingly separate from one
another. In fact, these two factors were correlated at .87 (p < .001), which indicates redundancy.
Moreover, Nelson-Denny Comprehension served as a reading comprehension node, while
Nelson-Denny Vocabulary was used as a language experience node. These two highly correlated
measures should not have been employed in the same model to represent distinct factors, as the
strong relationship between them will potentially bias the model. The strong relationship
between reading comprehension and language experience may also reflect shared method
variance, as Nelson Denny Vocabulary was embedded in a reading comprehension context, and
both of the reading comprehension measures may have presented difficult vocabulary. Relatedly,
Nelson-Denny Vocabulary was also highly correlated with another language experience node, Extended Range Vocabulary
Test (r = .59, p < .05). Both of these issues seem to indicate that reading comprehension and
language experience needed to be distinguished better from one another, such as only using the
language experience measures that were the most dissimilar to the reading comprehension
measures (e.g., Author Recognition Test, Scientist Recognition Test). Perhaps, due to the
previously mentioned problems with the Nelson-Denny Comprehension Test, this measure should
be avoided in future research.
General Discussion
Considerable research has sought to identify the processes underlying reading comprehension
that contribute to variation in this ability across individuals. The ongoing debate concerns
whether domain-general abilities, such as WMC, are related to better reading comprehension
(Just & Carpenter, 1992). Alternatively, the domain-specific perspective attributes this variation
to language skills that are improved upon with increased experience (MacDonald & Christiansen, 2002). The
current project confirms aspects from both of these perspectives, with both language-specific
factors and WMC accounting for variation in reading comprehension. Language experience was
the strongest predictor, but WMC, decoding, and fluency were significant predictors as well.
However, given the issues discovered concerning the measurement model and the
psychometric measures used, these results warrant some skepticism. Nevertheless, we suggest
that, taken together, the results of our re-analyses conflict with the original conclusions from Freed et al.
(2017), that WMC is not necessary for cognitive models of reading comprehension. The latent
variable models and network models converged to suggest that WMC was related to variation in
reading comprehension. Moreover, the network models not only provided a
visualization of the relationship between WMC and reading comprehension, in addition to all of
the relationships among the various other cognitive processes underlying reading
comprehension, but also revealed aspects of these relationships that may not have been as readily
apparent using latent variable modeling. For example, it was revealed that WMC nodes were only
related to one measure of reading comprehension which then prompted exploration into why the
Nelson-Denny Comprehension Test may not have been appropriate to use. Finally, although this
would be outside of the scope of the current project, the network model revealed a variety of
different associations between all of the observed variables that could have been further
explored, an advantage that is not freely offered by latent variable modeling. Essentially, network
analysis offers an additional, exploratory lens on the data.
There were a few limitations to the current project, the most obvious being that the
current authors did not use the full dataset collected by Freed et al. (2017). We were able to
reconstruct the measurement model using the reported correlation matrix, descriptive statistics,
and model descriptions, but there were still large and unexplainable differences in model fit
between the current models and the models presented by Freed et al. This could be due to a number of
reasons, such as rounding error; the correlation matrix from Freed et al. was rounded to two decimal
places, so some of the correlations were reported as .00 when they were smaller than .01. The
current CFA may also have differed from the PCA conducted by Freed et al., in that we allowed the
factors to correlate to be consistent with the SEM, whereas Freed et al. used an orthogonal rotation
in their PCA (but allowed factors to correlate in their SEMs). It is also possible that different
estimation methods were employed between the current models and those presented by Freed et al.
The current project used maximum likelihood (which tends to be standard practice), but Freed et al.
did not indicate which method was used for their analyses. Finally, some of the less common
techniques used by Freed et al., such as backward-building techniques in the creation of their
models, may have contributed to these differences as well.
Other noted limitations can be expected for any project utilizing a secondary analysis, in
that we had no control over which variables were included, how these variables were measured,
nor the factor structure underlying the models. This lack of control over certain aspects was
particularly limiting once it was determined that there were problems with the measurement
model and some of the measures used. As discussed previously, the decisions by Freed et al. about
which variables/measures to include and how to define the factors not only impacted the results of
the SEMs but also could have constrained possibilities for the network analyses. With psychometric
network analysis, the overall structure is reliant on the variables chosen to include in the model.
Accordingly, overrepresenting one construct in the model (e.g., language experience), can lead to
bias in the network structure such that the entire network is built around one (set of) strong
association(s). Although we were able to remove redundant/irrelevant nodes and improve the fit,
having more control over the measures used would have allowed us to investigate and represent
the relationships in the network better, particularly the relationship between WMC and reading
comprehension.
Moving forward, we recommend applying both latent variable modeling and network
analysis to the study of reading comprehension (as well as other cognitive abilities). Using both
techniques in tandem will allow
for exploration of reading comprehension from the traditional latent variable perspective, while
also benefitting from the new tools network analysis has to offer. For example, network analysis
can be used as an exploratory first step that guides the structure of the measurement model.
Changes made to the network model that improve the overall fit can inform potential adjustments
for the latent-variable model as well. However, like other exploratory techniques, it would be
inappropriate to conduct an exploratory network analysis and a CFA/SEM on the same dataset.
Thus, it would be necessary to collect new data after conducting the network analysis, prior to
moving onto other confirmatory analyses. Additionally, for future research utilizing latent-
variable modeling to explore reading comprehension, it will be necessary to pay careful attention
to the measures used and how they load onto the specific factors in the measurement model.
Measures that are redundant, not sufficiently taxing, or too highly correlated with measures from
other factors, should not be used in latent-variable or network models. Network analysis can
also aid in this task by providing centrality indices and clustering coefficients, which can confirm
whether each measure is appropriately related to the other nodes in the network and not
redundant or irrelevant. A confirmatory network analysis can also be used afterwards to
corroborate the factor structure by verifying whether the nodes cluster consistently with what the
measurement model
predicts.
Overall, these re-analyses confirmed the significant role that WMC plays in reading
comprehension. In the final SEM tested, WMC, but not reasoning, maintained a significant direct
path to reading comprehension when both factors were included as predictor variables in the
model. Additionally, the network analyses indicated significant associations between WMC and
reading comprehension nodes. Consistent with Freed et al. (2017), language experience seems to
be the strongest predictor of reading comprehension. However, WMC is still an important factor
that should not be excluded from cognitive models of this ability.
Acknowledgments
Sara Anne Goring would like to thank her co-authors Christopher J. Schmank, Michael J. Kane,
and Andrew R. A. Conway. She would also like to give a special thanks to Kathy Pezdek for all
her help throughout the writing process. Thank you to Ester Navaro and Kevin Rosales as well.
References
Baddeley, A., Logie, R., Nimmo-Smith, I., & Brereton, N. (1985). Components of fluent reading.
Journal of Memory and Language, 24, 119-131. https://doi.org/10.1016/0749-596x(85)90019-1
Borsboom, D., Mellenbergh, G. J., & Van Heerden, J. (2003). The theoretical status of latent
variables. Psychological Review, 110, 203-219.
Bringmann, L. F., Elmer, T., Epskamp, S., Krause, R. W., Schoch, D., Wichers, M., ... &
Snippe, E. (2018). What do centrality measures measure in psychological networks? Preprint.
Brown, J., Bennett, J., & Hanna, G. (1980). The Nelson-Denny reading test. Boston: Houghton
Mifflin.
Brown, J., Fischo, V., & Hanna, G. (1993). The Nelson-Denny reading test. Boston: Houghton
Mifflin.
Cromley, J. G., & Azevedo, R. (2007). Testing and refining the direct and inferential mediation
model of reading comprehension. Journal of Educational Psychology, 99, 311-325.
https://doi.org/10.1037/0022-0663.99.2.311
Dablander, F., & Hinne, M. (2018). Node centrality measures are a poor substitute for causal
inference. Preprint.
Daneman, M., & Merikle, P. M. (1996). Working memory and language comprehension: A
meta-analysis. Psychonomic Bulletin & Review, 3, 422-433. https://doi.org/10.3758/bf03214546
Ekstrom, R. B., Dermen, D., & Harman, H. H. (1976). Manual for kit of factor-referenced
cognitive tests. Princeton, NJ: Educational Testing Service. https://doi.org/10.21236/ad0410915
Epskamp, S., Borsboom, D., & Fried, E. I. (2018). Estimating psychological networks and their
accuracy: A tutorial paper. Behavior Research Methods, 50, 195-212.
https://doi.org/10.3758/s13428-017-0862-1
Epskamp, S., Lunansky, G., Tio, P., & Borsboom, D. (2018, April). Recent developments on the
http://psychosystems.org/author/sachaepskamp
Ferreira, F., & Clifton, C., Jr. (1986). The independence of syntactic processing. Journal of
Memory and Language, 25, 348-368.
Freed, E. M., Hamilton, S. T., & Long, D. L. (2017). Comprehension in proficient readers: The
nature of individual variation. Journal of Memory and Language, 97, 135-153.
https://doi.org/10.1016/j.jml.2017.07.008
Friedman, J. H., Hastie, T., & Tibshirani, R. (2008). Sparse inverse covariance estimation with
the graphical lasso. Biostatistics, 9, 432-441.
Just, M. A., & Carpenter, P. A. (1992). A capacity theory of comprehension: Individual
differences in working memory. Psychological Review, 99, 122-149.
https://doi.org/10.1037/0033-295x.99.1.122
Kane, M. J., Conway, A. R. A., Hambrick, D. Z., & Engle, R. W. (2007). Variation in working
memory capacity as variation in executive attention and control. In A. R. A. Conway, C.
Jarrold, M. J. Kane, A. Miyake, & J. N. Towse (Eds.), Variation in working memory
(pp. 21-48). New York, NY: Oxford University Press.
Kane, M. J., & Engle, R. W. (2003). Working-memory capacity and the control of attention: The contributions of goal neglect, response competition, and task set to Stroop interference. Journal of Experimental Psychology: General, 132, 47–70. https://doi.org/10.1037/0096-3445.132.1.47
King, J., & Just, M. A. (1991). Individual differences in syntactic processing: The role of working memory. Journal of Memory and Language, 30, 580–602. https://doi.org/10.1016/0749-596x(91)90027-h
Kline, R. B. (2005). Principles and Practice of Structural Equation Modeling. New York, NY: Guilford.
READING COMPREHENSION AND WORKING MEMORY 41
MacDonald, M. C., & Christiansen, M. H. (2002). Reassessing working memory: Comment on Just and Carpenter (1992) and Waters and Caplan (1996). Psychological Review, 109, 35–54. https://doi.org/10.1037//0033-295x.109.1.35
MacDonald, M. C., Just, M. A., & Carpenter, P. A. (1992). Working memory constraints on the processing of syntactic ambiguity. Cognitive Psychology, 24, 56–98. https://doi.org/10.1016/0010-0285(92)90003-k
Mason, R. A., & Just, M. A. (2007). Lexical ambiguity in sentence comprehension. Brain Research, 1146, 115–127.
McNally, R. J., Robinaugh, D. J., Wu, G. W. Y., Wang, L., Deserno, M., & Borsboom, D. (2015). Mental disorders as causal systems: A network approach to posttraumatic stress disorder. Clinical Psychological Science, 3, 836–849.
McNally, R. J., Mair, P., Mugno, B., & Riemann, B. C. (2017). Co-morbid obsessive-compulsive disorder and depression: A Bayesian network approach. Psychological Medicine, 47, 1204–1214. https://doi.org/10.1017/S0033291716003287
McVay, J. C., & Kane, M. J. (2012). Why does working memory capacity predict variation in reading comprehension? On the influence of mind wandering and executive attention. Journal of Experimental Psychology: General, 141, 302–320. https://doi.org/10.1037/a0039137
Ready, R. E., Chaudhry, M. F., Schatz, K. C., & Strazzullo, S. (2013). “Passageless”
Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1–36. http://www.jstatsoft.org/v48/i02/
Scales, A. M., & Rhee, O. (2001). Adult reading habits and patterns. Reading Psychology, 22,
175–203. https://doi.org/10.1080/027027101753170610
Schreiber, J. B., Nora, A., Stage, F. K., Barlow, E. A., & King, J. (2006). Reporting structural equation modeling and confirmatory factor analysis results: A review. The Journal of Educational Research, 99, 323–338.
Seidenberg, M. S. (1985). The time course of phonological code activation in two writing systems. Cognition, 19, 1–30.
Stanovich, K. E., & West, R. F. (1989). Exposure to print and orthographic processing. Reading Research Quarterly, 24, 402–433.
Unsworth, N., & McMillan, B. D. (2013). Mind wandering and reading comprehension: Examining the roles of working memory capacity, interest, motivation, and topic experience. Journal of Experimental Psychology: Learning, Memory, and Cognition, 39, 832–842.
van der Maas, H. L. J., Kan, K.-J., Marsman, M., & Stevenson, C. E. (2017). Network models for cognitive development and intelligence. Journal of Intelligence, 5, 16. https://doi.org/10.3390/jintelligence5020016
Verhoeven, L., & Van Leeuwe, J. (2008). Prediction of the development of reading comprehension: A longitudinal study. Applied Cognitive Psychology, 22, 407–423. https://doi.org/10.1002/acp.1414
Wells, J. B., Christiansen, M. H., Race, D. S., Acheson, D. J., & MacDonald, M. C. (2009). Experience and sentence processing: Statistical learning and relative clause comprehension. Cognitive Psychology, 58, 250–271. https://doi.org/10.1016/j.cogpsych.2008.08.002
Zahler, D., & Zahler, K. (2003). Test prep your IQ cultural literacy (1st ed.). Lawrenceville, NJ:
Peterson’s.
Figure 1. Measurement model from Freed et al. (2017), with standardized path values obtained from the current analysis. It includes seven exogenous latent variables representing individual-difference measures (Inhibition is not a true latent factor) and the endogenous latent variable, Comprehension. Although Comprehension was not included in the CFA, it is shown here to display indicator loadings. See Table 1 for indicator variable names.
Table 1
Bivariate Correlations for Individual Difference and Comprehension Measures (From Freed et al. 2017).
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
1. Orthographic Decision (OrD)
2. Phonological Decision (PhD) .52
3. Non-word Naming (NnN) .26 .20
4. Phoneme Transposition† -.19 -.23 -.10
5. Reading Span (RdS) .05 -.07 -.12 .17
6. Alphabet Span (AlS) -.14 -.11 -.09 .14 .45
7. Minus Span (MnS) -.12 -.10 .00 .23 .41 .27
8. Visual Number Span (VNS) -.03 .10 -.02 .13 .21 .23 .33
9. Go-No Go† -.17 -.09 -.15 .07 -.11 -.06 .12 -.00
10. Stroop Interference (StI) .05 .13 .12 -.09 .01 .02 .00 -.03 -.10
11. Author Recognition Test (ART) -.18 -.22 -.08 .31 .12 .25 .08 .02 -.06 .00
12. Reading Questionnaire (RdQ) .09 .03 -.00 .20 .13 -.00 .04 .04 -.02 .07 .35
13. Cultural Intelligence (ClI) -.17 -.32 -.21 .28 .26 .28 .16 -.03 .04 -.10 .51 .36
14. Scientist Recognition Test (SRT) -.10 -.18 -.07 .10 .15 .20 .10 -.01 -.03 -.08 .36 .17 .48
15. Extended Range Vocabulary (ERV) -.14 -.25 -.16 .26 .29 .31 .14 -.01 -.05 -.06 .58 .26 .64 .43
16. Advanced Vocabulary (AdV) -.09 -.20 -.13 .22 .19 .25 .05 -.01 -.05 .15 .46 .28 .57 .36 .67
17. Nelson-Denny Vocabulary (NDV) -.22 -.33 -.21 .35 .27 .32 .13 -.08 -.06 -.15 .58 .37 .71 .37 .73 .61
18. Raven's Progressive Matrices (RPM) -.05 .07 .01 .19 .24 .32 .33 .31 -.08 .04 .14 -.02 .12 .14 .22 .14 .18
19. Arithmetic Aptitude Test (AAT) -.16 -.06 -.11 .21 .28 .31 .26 .22 -.02 -.11 .13 -.10 .27 .19 .29 .24 .29 .29
20. Mathematic Aptitude Test (MAT) -.11 -.06 -.09 .21 .22 .26 .29 .16 .03 -.02 .16 -.02 .39 .21 .35 .26 .32 .28 .71
21. Necessary Arithmetic Operations (NAO) -.05 -.09 -.11 .25 .31 .33 .32 .11 -.02 -.08 .24 .03 .37 .17 .37 .28 .42 .29 .63 .61
22. Letter Comparisons (LtC) -.16 -.13 -.10 .13 .10 .12 .09 .06 .01 -.05 .14 .12 .14 .02 .17 .14 .10 .00 .21 .19 .21
23. Finding As (FnA) -.19 -.13 -.06 .17 .05 .08 .13 .11 .00 -.17 .16 .11 .13 .00 .12 .06 .17 .03 .13 .15 .18 .33
24. Number Comparison (NmC) -.26 -.05 -.14 .07 .01 .04 .15 .16 .04 -.11 .03 -.08 -.05 -.05 -.12 -.04 -.05 .04 .25 .21 .17 .23 .38
25. Pattern Comparison† -.15 -.07 -.05 .06 .12 .04 .08 .03 -.04 -.04 .15 -.06 .12 .14 .18 .12 .15 .06 .03 .09 .11 .00 .11 .04
26. Identical Pictures (IdP) -.16 -.15 -.06 .06 .08 .13 .10 .17 .08 .02 .15 .04 .22 .05 .17 .14 .26 .12 .17 .20 .22 .18 .26 .20 .08
27. Word Beginnings (WrB) -.24 -.31 -.11 .22 .18 .31 .20 .11 -.05 .16 .24 .06 .33 .18 .36 .32 .42 .08 .25 .26 .24 .17 .27 .12 .17 .19
28. Word Endings (WrE) -.24 -.30 .00 .09 .15 .33 .15 .09 -.08 -.02 .23 -.04 .23 .13 .28 .24 .29 .16 .16 .20 .27 .11 .18 .14 .12 .20 .48
29. Word Beginnings and Endings (WBE) -.25 -.29 -.13 .18 .19 .28 .14 .09 -.04 -.03 .26 -.06 .29 .21 .37 .28 .39 .15 .24 .22 .22 .01 .16 .12 .18 .16 .44 .44
30. Nelson-Denny Comprehension (NDC) -.30 -.34 -.19 .29 .17 .16 .18 -.03 .04 -.22 .36 .13 .49 .22 .36 .38 .55 .24 .27 .31 .38 .18 .18 .24 .08 .16 .26 .20 .21
31. Investigator-Generated Comprehension (IGC) -.06 -.22 -.11 .26 .35 .34 .27 -.04 -.05 -.04 .37 .21 .50 .31 .59 .47 .57 .27 .29 .35 .36 .10 .12 -.03 .08 .13 .29 .19 .25 .48
M 762.64 2659.04 1053.33 0.88 0.64 0.73 0.83 0.35 0.14 96.98 0.18 3.58 0.57 0.25 0.32 0.35 0.87 0.72 0.50 0.37 0.52 0.19 0.31 0.28 0.96 0.71 26.36 31.50 19.23 0.79 0.08
SD 141.63 1106.41 484.23 0.13 0.14 0.12 0.14 0.07 0.14 102.89 0.10 0.96 0.14 0.11 0.15 0.14 0.08 0.13 0.19 0.15 0.18 0.05 0.08 0.06 0.06 0.14 7.92 8.13 6.94 0.16 0.09
N 261 252 243 224 285 284 281 278 256 245 279 330 262 279 278 270 283 280 278 261 274 274 274 275 262 260 276 262 277 263 303
Note. Significant correlations (p < .05) are shown in bold. † = Measures removed by Freed et al.
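Because Table 1 reports only the lower triangle, anyone re-entering it for reanalysis must mirror the values into a full symmetric correlation matrix. A minimal sketch using the four WMC measures (rows 5–8 of the table); the variable names here are illustrative:

```python
import numpy as np

# Lower-triangular correlations among the WMC measures from Table 1
# (RdS, AlS, MnS, VNS; rows 5-8).
labels = ["RdS", "AlS", "MnS", "VNS"]
lower = [
    [1.00],
    [0.45, 1.00],
    [0.41, 0.27, 1.00],
    [0.21, 0.23, 0.33, 1.00],
]

# Build the full symmetric matrix by mirroring the lower triangle.
n = len(labels)
R = np.zeros((n, n))
for i, row in enumerate(lower):
    for j, r in enumerate(row):
        R[i, j] = R[j, i] = r

assert np.allclose(R, R.T)          # symmetric
assert np.allclose(np.diag(R), 1)   # unit diagonal
```

The same mirroring extends to the full 31 x 31 matrix, after which the means, standard deviations, and pairwise Ns in the bottom rows supply everything needed to refit the models.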
Table 2
Model Fit Statistics for Latent Variable and Network Models of Reading Comprehension Data

Model           χ²       df    CFI (TLI)   RMSEA   AIC        BIC        SRMR
SEM: Model 2    643.14   239   .85 (.82)   .07     21426.92   21753.86   .07
SEM: Model 3    881.75   323   .84 (.81)   .07     24760.29   25187.25   .07
Network Model   193.70   175   .99 (.99)   .02     21064.96   21006.05   .03
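As a sanity check on the fit statistics, RMSEA can be recomputed from each model's χ² and degrees of freedom. A minimal Python sketch; the total N = 330 used here is an assumption (the largest sample size reported in Table 1), not a value stated with the table:

```python
import math

def rmsea(chi2, df, n):
    """RMSEA = sqrt(max(chi2 - df, 0) / (df * (N - 1)))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

# Chi-square and df values from Table 2; N = 330 is an assumption.
models = {
    "SEM: Model 2":  (643.14, 239),
    "SEM: Model 3":  (881.75, 323),
    "Network Model": (193.70, 175),
}
for name, (chi2, df) in models.items():
    print(name, round(rmsea(chi2, df, 330), 2))
# -> .07, .07, and .02, matching the table's RMSEA column
```

The close agreement with the tabled values (.07, .07, .02) also illustrates why the network model fits so well: its χ² barely exceeds its degrees of freedom.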
Figure 2. SEM Model 1 (Model 1 from Freed et al., 2017), containing seven latent predictor variables representing individual differences and the outcome variable representing Reading Comprehension. Path values are standardized and obtained from the current analysis.
Figure 3. SEM Model 2 (Model 3 from Freed et al., 2017), containing six latent predictor variables (Reasoning removed from the model) representing individual differences and the outcome variable representing Reading Comprehension. Path values are standardized and obtained from the current analysis.
Figure 4. SEM Model 3 (current model, not presented in Freed et al., 2017), containing seven latent predictor variables representing individual differences and the outcome variable representing Reading Comprehension. Dotted/dashed lines indicate non-significant paths.
[Figure 5: network model with nodes Reasoning, WMC, Decoding, Fluency, Inhibition, Reading Comprehension, Perceptual Speed, and Language Experience.]
Figure 6. Redundancy plots, with the clustering coefficient on the x-axis and centrality indices (betweenness, strength, and closeness) on the y-axis. The boxes indicate measures that were removed for being either redundant (RQ) or irrelevant (SI).
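The centrality and clustering indices plotted in the redundancy figures are standard weighted-graph statistics. A minimal sketch with networkx; the graph below is a toy stand-in whose node names borrow Table 1 abbreviations and whose edge weights are illustrative correlations, not the estimated network's partial correlations:

```python
import networkx as nx

# Toy weighted network standing in for the estimated psychometric network.
G = nx.Graph()
G.add_weighted_edges_from([
    ("NDC", "IGC", 0.48), ("NDC", "NDV", 0.55), ("NDC", "ERV", 0.36),
    ("IGC", "ERV", 0.59), ("NDV", "ERV", 0.73), ("NDV", "ClI", 0.71),
])

# Strength: sum of edge weights incident to each node.
strength = dict(G.degree(weight="weight"))

# Betweenness and closeness treat larger weights as shorter distances.
for u, v, d in G.edges(data=True):
    d["dist"] = 1.0 / d["weight"]
betweenness = nx.betweenness_centrality(G, weight="dist")
closeness = nx.closeness_centrality(G, distance="dist")

# Weighted clustering coefficient (the x-axis of the redundancy plots).
clustering = nx.clustering(G, weight="weight")
```

Plotting each centrality index against the clustering coefficient, as in the figures, flags nodes that are central only because they sit inside a tightly interconnected (and hence possibly redundant) cluster.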
Figure 7. Redundancy plots, with the clustering coefficient on the x-axis and centrality indices (betweenness, strength, and closeness) on the y-axis. The boxes indicate measures that were removed for being redundant (ART, AV).
[Figure 8: network model with nodes Language Experience, Fluency, Reading Comprehension, Decoding, WMC, Perceptual Speed, and Reasoning.]
Figure 9. Network model containing only Reading Comprehension (blue; NDC: Nelson-Denny Comprehension, IGC: Investigator-Generated Comprehension), WMC (pink; VNS: Visual Number Span, MnS: Minus Span, RdS: Reading Span, AlS: Alphabet Span), and Language Experience (green; ClI: Cultural Intelligence, NDV: Nelson-Denny Vocabulary, ERV: Extended Range Vocabulary, SRT: Scientist Recognition Test); γ = .50, λ = .01.
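The γ and λ values reported in the figure captions are the hyperparameters of EBIC-based graphical-lasso estimation: λ is the sparsity penalty, selected so that the extended BIC (with hyperparameter γ = .50) is minimized. The sketch below is a rough scikit-learn approximation of that routine under stated assumptions, not the authors' actual code (which would typically use qgraph's EBICglasso in R):

```python
import numpy as np
from sklearn.covariance import graphical_lasso

def ebic_glasso(S, n, lambdas, gamma=0.5):
    """Fit the graphical lasso over a grid of penalties and return the
    precision matrix minimizing EBIC; gamma = .50 mirrors the captions."""
    p = S.shape[0]
    best_ebic, best_K = np.inf, None
    for lam in lambdas:
        _, K = graphical_lasso(S, alpha=lam)  # K is the precision matrix
        # Gaussian log-likelihood of the fitted precision matrix.
        ll = (n / 2.0) * (np.linalg.slogdet(K)[1] - np.trace(S @ K))
        # E: nonzero off-diagonal entries in the upper triangle (edges).
        E = int(np.sum(np.abs(np.triu(K, k=1)) > 1e-8))
        ebic = -2.0 * ll + E * np.log(n) + 4.0 * E * gamma * np.log(p)
        if ebic < best_ebic:
            best_ebic, best_K = ebic, K
    return best_K

def partial_corr(K):
    """Edge weights in the network: partial correlations from precision K."""
    d = np.sqrt(np.diag(K))
    return -K / np.outer(d, d) + 2 * np.eye(K.shape[0])
```

Larger γ values penalize spurious edges more heavily and yield sparser networks; the surviving partial correlations are what the figures draw as edges.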