
Running head: READING COMPREHENSION AND WORKING MEMORY 1

The Role of Working Memory in Reading Comprehension:

Latent Variable and Psychometric Network Model Re-Analyses of Freed, Hamilton, and Long

(2017)

Sara Anne Goring1, Christopher J. Schmank1, Michael J. Kane2, & Andrew R. A. Conway1

Claremont Graduate University1

University of North Carolina at Greensboro2



Abstract

It is well established that reading comprehension ability varies among individuals. However,

there is disagreement regarding whether those differences can be attributed to differences in

domain-general abilities (e.g., working memory capacity, WMC; Just & Carpenter, 1992) or

domain-specific abilities (e.g., language experience; MacDonald & Christiansen, 2002).

Recently, Freed, Hamilton, and Long (2017) reported that individual differences in reading

comprehension can be largely attributed to language experience and concluded that WMC is not

an important factor. After re-analyzing structural equation models from Freed et al. and

generating a psychometric network model of their data, we find that WMC is more important to

individual differences in reading comprehension than was suggested by Freed et al. Overall, it

was confirmed that both domain-general and domain-specific processes are associated with

variation in reading comprehension.

Keywords: reading comprehension, domain-general, domain-specific, network modeling



The Role of Working Memory in Reading Comprehension:

Latent Variable and Psychometric Network Model Re-Analyses of Freed, Hamilton, and Long

(2017)

Considerable research at the intersection of memory and language has sought to identify

the cognitive processes involved in reading comprehension. Although individual differences in

reading comprehension are well established, researchers disagree about their sources. One

perspective argues that individual differences are due to domain-general abilities such as

working memory capacity (WMC; Just & Carpenter, 1992). An alternative viewpoint is that

variation in reading comprehension is due to domain-specific processes, such as verbal fluency

and other language abilities derived through experience (MacDonald & Christiansen, 2002). A

wealth of research supports both domain-general and domain-specific perspectives regarding

individual differences in reading ability.

The Domain-General Perspective

Just and Carpenter (1992) proposed that individual differences in reading comprehension

result from variation in WMC resources available to generate activation. During comprehension,

language units become available for processing and computation by reaching a specific level of

activation. Cognitive resources are finite, however, and when capacity limits are nearly reached,

scaling-back procedures are initiated so only the most important items maintain or achieve

activation. These scaling-back procedures include pulling activation from previously processed

items to be used for upcoming items. The effect of this is seen when a student forgets

information from a previous sentence while processing a new difficult one. Having more

available resources is an advantage because capacity limits will not be reached as easily.

The domain-general perspective has been supported by research that compares subjects

with different levels of WMC. Just and Carpenter (1992) reported that adults with greater WMC

(higher-spans) process language more quickly and accurately than do adults with lesser WMC

(lower-spans). These differences become more evident when processing difficult or unfamiliar

lexical content and grammatical structures. For example, compared to lower-span subjects,

higher-span subjects read irregular reduced relative clauses faster (Ferreira & Clifton, 1986; Just

& Carpenter, 1992). Reduced relative clauses are difficult in that they do not contain an explicit

relative pronoun or complementizer (e.g., “that/who was”). Without the complementizer to orient

readers to the expected outcome, they are forced to adjust expectations mid-sentence. Consider

the phrase, “The defendant examined by the lawyer shocked the jury.” The word defendant could

initially be interpreted as the subject or action-taker of the phrase, setting up expectations for a

different outcome (e.g., “The defendant examined the evidence...”). Cognitive resources are thus

needed to adjust expectations and resolve ambiguity quickly. According to Just and Carpenter,

higher-span subjects had shorter reading times because they had more resources to process the

syntactic clues that suggested the correct interpretation. Lower-span subjects required more time

and resources to reread the clauses and resolve the confusion. This WMC effect has been

confirmed by other studies that have used ambiguity or syntactic complexity to increase the

difficulty of processing, thus increasing the amount of activation required and decreasing the

performance of lower-span subjects (Just & Carpenter, 1992; King & Just, 1991; MacDonald,

Just, & Carpenter, 1992).

Recent research has also emphasized the attentional-control components of WMC. In

previous studies, researchers have reported that, similar to verbal span tasks, complex numerical

span measures of WMC predict reading comprehension, although to a lesser extent than verbal

spans (Daneman & Merikle, 1996). This indicates that factors other than language-specific

resources are involved in the relationship between WMC and reading comprehension. Indeed,

McVay and Kane (2012) examined the relationship between WMC, reading comprehension, and

attentional-control to investigate the shared attentional-control mechanisms involved in both

WMC and reading comprehension tasks and account for correlations between these two abilities.

As predicted, not only did a WMC latent variable derived from verbal and non-verbal tasks

predict reading comprehension, but mind-wandering propensity (indicating a lack of attentional-

control) also partially mediated the relationship between WMC and comprehension. Subjects

with lower WMC were more prone to mind-wandering during both reading and non-reading

tasks, and this general propensity for inattention was related to decreased comprehension.

Additionally, a recent meta-analysis found that there is a positive association between reading

comprehension and executive function that remains consistent across age, regardless of the type

of executive function or reading comprehension measures that were used (Follmer, 2018). These

results (see also Unsworth & McMillan, 2013) confirm that the domain-general, attentional-

control components of WMC are related to individual differences in reading comprehension.

We note that researchers from the domain-general perspective also acknowledge the role

of domain-specific processes in producing variation in reading comprehension, such as

vocabulary, linguistic fluency, and background knowledge (Baddeley, Logie, & Nimmo-Smith,

1985; Cromley & Azevedo, 2007). However, the domain-general perspective emphasizes the

activation or attentional-control components of WMC as the dominant mechanisms underlying

differences in complex cognitive functions (e.g., language learning, reading comprehension;

Kane, Conway, Hambrick & Engle, 2007; Kane & Engle, 2003). Supporting this perspective,

studies have confirmed that individual differences in WMC do significantly predict variation in

performance for reading comprehension tasks, and WMC also correlates with many language-

specific processes such as integration of knowledge or resolution of lexical ambiguity (Daneman

& Carpenter, 1983; Mason & Just, 2007; Daneman & Merikle, 1996). Thus, according to this

viewpoint, domain-specific and domain-general processes are both necessary, but the general

processes make a greater difference.

The Domain-Specific Perspective

Researchers from the domain-specific perspective have focused on contributions of

language-specific processes to variation in reading comprehension. According to MacDonald

and Christiansen (2002), verbal WMC is not a separate property that can vary independently

from other language processes, but an emergent characteristic of a complex, multi-layer system.

According to this view, language processing tasks and tests of verbal WMC both measure

language ability and are conceptually indistinguishable from one another. So, when measuring

verbal WMC per this viewpoint, one is essentially just measuring language ability, not a separate

process with an individual capacity, as purported by the domain-general view. According to the

domain-specific perspective, the ease and efficiency of language processing depends on the

complexity of the linguistic input being processed and how often similar input has been

experienced. Reading more often exposes the language processes to a wider variety of structures

and content, conditioning the entire system for more efficient processing. For example, low-

experience readers have more difficulties processing irregular words than regular ones, unless it

is a high-frequency word and likely to be more familiar. However, highly experienced readers

process irregular and regular words with similar ease, regardless of frequency/familiarity

(MacDonald & Christiansen, 2002; Seidenberg, 1985).



Unlike the domain-general perspective, the domain-specific view treats previous language experience as the most important factor and stresses the combined effects of language-specific skills involved in reading comprehension, including vocabulary, word knowledge, verbal/reading fluency, and previously obtained cultural knowledge (Baddeley et al., 1985; Cromley & Azevedo, 2007; Verhoeven & Van Leeuwe, 2008). Having more experience, or

being exposed to more written and oral language, builds these language-specific skills

(MacDonald & Christiansen, 2002). Indeed, subjects’ processing of irregular sentence structures

improves from pre- to post-test after they are given additional practice with them (Wells,

Christiansen, Race, Acheson, & MacDonald, 2009). Thus, studies from the domain-specific

perspective typically include some measure of previous experience or exposure to language

when examining comprehension.

MacDonald and Christiansen (2002) offered evidence for the domain-specific perspective

via reinterpretations of previous results (initially presented by Just & Carpenter, 1992) and

network simulations. For example, as discussed above, Just and Carpenter found that compared

to low-span subjects, high-span subjects had faster reading times for reduced relative clauses.

Just and Carpenter attributed these differences in performance to WMC limitations. However,

MacDonald and Christiansen ascribed these results to previous language experience. Compared

to less experienced readers (comparable to lower-span subjects), highly experienced readers (or

higher-span subjects) had more familiarity with irregular sentence structures and could anticipate

the sentence resolution (King & Just, 1991; MacDonald & Christiansen, 2002). This resulted in

faster reading times for highly experienced readers compared to less experienced readers.

Additionally, MacDonald and Christiansen supported their view with simulated network studies.

Prior to linguistic training, a simulated network generated results comparable to the performance

of low-span subjects. However, after training with irregular sentence structures, the network

produced output comparable to high-span subjects. From the domain-general perspective, it is

worth noting that this experience-based, domain-specific perspective cannot account for positive

correlations between non-verbal WMC tasks and reading comprehension, due to the assertion

that skill necessitates specific experience. Regardless, research continues to develop on both

sides of the domain-general versus domain specific debate.

Freed, Hamilton, and Long (2017)

Freed, Hamilton, and Long (2017) examined the role of previous language experience in

reading comprehension, as well as other domain-specific and domain-general variables (e.g.,

WMC, word decoding). A particular focus of this study was determining the role and importance

of WMC, relative to word decoding, language experience, verbal fluency, perceptual speed,

inhibition, and reasoning. Using structural equation modeling (SEM) techniques, they reported

the best fitting model to consist of direct relationships predicting reading comprehension from

only language experience and reasoning (see Figure 1). All other latent variables (including

WMC) were only related to reading comprehension indirectly, via overlapping correlations.

Regarding WMC, Freed et al. concluded, “The authors question the need to include WMC in our

theories of variability in adult reading comprehension” (p. 135).

The Freed et al. (2017) conclusion is not only strong, but it conflicts with much of the

reading comprehension literature. Such claims require further investigation. Although the Freed

et al. study is impressive in the number of relevant constructs it explored empirically, we identify

several serious methodological limitations to the work.

First, SEM techniques are not appropriate to answer the exploratory, open-ended

research questions proposed by Freed et al. (2017). SEM is a confirmatory analysis that is largely

theory-driven and typically guided by a specific, a priori model to test. Freed et al., in contrast,

conducted their analyses using a data-driven method with no explicit theoretical model, but

rather started the process by conducting an exploratory principal component analysis (PCA).

When conducting the SEM, the authors used program-generated modification indices to delete

“irrelevant” pathways between variables in their model, based solely on a data-driven basis with

no theoretical justification or interpretation. This approach obviously blurs the line between a

priori and post-hoc hypothesis development and makes the models less interpretable.

Additionally, we argue that the process of deleting pathways did not sufficiently improve fit to

justify the use of these techniques.

Second, the Freed et al. (2017) interpretation of direct versus indirect effects was

unclear. They correctly identified direct paths, such as the direct effect of language experience on

reading comprehension, but they misclassified correlations between latent variables as evidence

for indirect paths. For example, language experience had a direct relationship to reading

comprehension, and word decoding was correlated with language experience. Freed et al. considered

this to reflect an indirect effect of word decoding on reading comprehension, via language

experience. For this to be a true indirect effect, word decoding would need a direct path to

language experience, not just a correlation.

Third, choices made during the analysis plan and presentation were inconsistent with

standard practices. Freed et al. (2017) used a PCA with varimax (orthogonal) rotation to extract

their components/factors, but then allowed latent variables to correlate in the SEMs. This is

inconsistent across analyses, as the orthogonal rotation prevents correlations between factors, but

the SEM allowed for these relationships. Additionally, forcing orthogonality between

components/factors causes important information about the relationship between these factors to

be lost. Moreover, using an oblique rotation would have allowed the authors to analyze

correlations between components, had they existed, or shown orthogonality if that were the case.

Additionally, Freed et al. followed their PCA, which is an exploratory analysis (and which

identifies components rather than latent factors), with SEM, which is a confirmatory factor-

analytic technique. Using a confirmatory analysis on an exploratory model generated from the

original data sample will always demonstrate good model fit, but that does not establish that the

model is valid. Finally, Freed et al. failed to report standardized regression coefficients in their published figures, making their SEM models uninterpretable. All of these combined factors raise

concerns regarding the validity of the authors’ original conclusions.

Yet our most important critique of Freed et al. (2017) was the lack of a theoretically-

based rationale for the decisions made throughout the modeling process, particularly when

designing their models or determining which model to draw conclusions from. Predictors were

chosen for the initial model based on evidence of each individual variable’s predictive validity of

reading comprehension, but there was no theoretical basis presented for how these variables hang

together as a model of reading comprehension. So rather than starting with a theory-based model

to test, models were designed by including all (or a subset of) the predictors and all possible pathways. Then

data-driven modification indices were used to remove pathways without any further explanation

or justification for what these changes meant from a theoretical perspective. Interpretability

aside, this approach limited the amount of information that could be obtained from certain models.

For example, for Model 1 (see Figure 1), allowing all of the factors to have a direct path to

reading comprehension would have identified the strength of each factor and allowed for a

comparison of the predictive strength between certain variables of interest, like reasoning and

WMC. Additionally, Freed et al. arbitrarily removed factors between models in order to explore

specific relationships and the stability or robustness of certain associations. However, this goes

against standard practices for SEM, and this process fed into the larger issue of not justifying

why they chose the particular model on which to base their conclusions (Model 1). Many of the

Freed et al. models demonstrated comparable fit indices and proportion of variance explained.

Specifically, compared with the model chosen by Freed et al. (Model 1), a similar model that they also presented (Model 2, see Figure 2) contained all of the latent variables except reasoning, which had been removed. For Model 2, the direct relationship between reading comprehension and language experience was maintained, but Freed et al. claimed that the removal of reasoning allowed for a significant direct relationship between WMC and reading comprehension. The

model fit indices were similar between the first model containing reasoning [χ2(333) = 542.88, p

< .001; CFI = 0.91; TLI = 0.90; RMSEA = 0.04], and the model with reasoning removed,

[χ2(239) = 390.63, p < .001; CFI = 0.92; TLI = 0.90; RMSEA = 0.04]. Also, including reasoning

in the model only added a negligible 2.91% of explanatory power, so there is not a clear data-

driven reason for why Model 1 was chosen over Model 2 (with reasoning: 76.68%, without

reasoning: 73.77%). Finally, including both reasoning and WMC in the same model could have

introduced the potential for redundancies or multicollinearity due to their conceptual overlap and

shared variance (Engle, 2001; 2002). Although ideally all measured predictors should be

included in a model, rather than being cherry-picked, this factor structure lacked a theoretical

basis to begin with, particularly considering that Freed et al. presented previous research supporting each variable's ability to predict reading comprehension, yet most variables were not given direct paths to reading comprehension in the model. Thus, considering this model was not based on a testable

theory, and it was not remarkably better fitting than the other models conducted by Freed, it

remains unclear what motivated the conclusions drawn from Model 1.



Study 1: Replication and Extension of Freed et al. (2017)

The current study replicated and extended the work by Freed et al. (2017) to better

understand their SEM process and what the data imply about variation in comprehension. The

current study was conducted using Freed’s published correlation matrix and descriptive statistics.

A covariance matrix was generated to run a confirmatory factor analysis (CFA) on their original

measurement model to assess whether their model provided an adequate fit to the data. Next,

three SEMs were assessed: (a) Model 1, containing direct paths from language experience and

reasoning to reading comprehension, which represents the first reanalysis of a model from Freed et al. (see Figure 2); (b) Model 2, containing direct paths from WMC and language experience to reading comprehension, which was the second reanalysis of a model from Freed et al. (see Figure 3); and (c) Model 3, the only SEM not initially presented by Freed et al., containing direct paths from all latent variables to reading comprehension (see Figure 4).

Replicating Model 1 (Figure 2) and Model 2 (Figure 3) from Freed et al. (2017) served a

practical purpose. Since Freed et al. did not present their standardized path weights, their models

were largely uninterpretable. Thus, replication was necessary to get a better understanding of the

models and appropriate conclusions.

The extension component of Study 1 was to analyze a third model that was not produced

by Freed et al. (2017), containing direct paths from all predictor factors to reading

comprehension, Model 3 (Figure 4). The purpose of Model 3 was to test whether, when all

factors are given a direct path to reading comprehension, WMC would maintain a significant

direct path to reading comprehension while reasoning does not. Such an outcome would directly

challenge the conclusion drawn by Freed et al., that reasoning maintains a significant direct path

to comprehension while WMC does not. Model 3 will also allow an assessment of all the

measured domain-general and domain-specific variables underlying reading comprehension.

Finally, reproducing all three models will provide a means to examine and compare model fit of

all of the SEMs utilizing the factor structure designed by Freed et al.

Method

We present only the methodological details that are necessary to understand and

evaluate our re-analyses of the Freed et al. (2017) data. For more information regarding

participants, measures, and procedures, see Freed et al. (2017).

Subjects. Three hundred and fifty-seven young adults, sampled from a four-year

university and a community college, participated in the study. However, only 346 participants

were used in the subsequent analyses, consistent with Freed et al. (2017).

Measures. Twenty-six manifest variables were explained by seven predictor factors (not counting reading comprehension or its observed indicators): decoding, WMC, inhibition, language experience, reasoning, perceptual speed, and fluency. The target variable, reading comprehension, was measured using two manifest variables explained by a single factor.

Decoding. In the Phonological Decision task, subjects viewed two non-word letter

strings, and determined which one, if spoken aloud, would sound like a real word (e.g.,

HOWSE). For Non-Word Naming, subjects viewed 100 pronounceable non-words (e.g.,

plambust) and pronounced them as quickly as possible. For the Orthographic Decision task,

subjects viewed two letter strings and decided which was a correctly spelled word (e.g., DEAL

vs. DEEL).

WMC. Reading Span involved 15 sets of sentences and 60 target words. After viewing each sentence, subjects decided whether it made sense, and an unrelated target word was then presented. After presentation of 2–7 sentences, subjects recalled all target words in

order. For Alphabet Span, subjects viewed 25 lists of words, with 2–7 words in each. Subjects

recalled each word list in alphabetical order. Minus Span consisted of subjects viewing 35 sets

(2–8 numbers in each set) of random numbers; following presentation of each number, subjects subtracted 2 from it. Afterwards, subjects reported the differences in order

per set. Finally, for Visual Number Span, subjects viewed 24 sets of digits (4–13 digits presented

sequentially in each set). Subjects recalled each digit list in reverse order of presentation.

Inhibition. In a Stroop task, subjects named the hue for 105 letter strings (one at a time)

that consisted of either a color name or an unrelated word; color names appeared in either

congruent or incongruent hues (e.g., GREEN in red hue). Subjects' naming latencies were used

for analysis.

Language Experience. Language experience was a broad factor including measures of

vocabulary, background knowledge, and print exposure (also called reading frequency).

Vocabulary was measured using (a) Form F or Form G from the Nelson-Denny Reading Test, which required subjects to complete sentences with the final word missing, and (b) two more traditional, multiple-choice vocabulary tests from the Ekstrom Battery (Ekstrom, Dermen, & Harman, 1976): Extended Range and Advanced.

Background knowledge measures assessed expertise in specific domains across a range

of topics. Cultural literacy was assessed using a test from Test-Prep your IQ with the Essentials

of Cultural Literacy (Zahler & Zahler, 2003) about American History, Geography,

Myth/Religion, Science, and Art. Science knowledge was tested with a novel measure, in which

subjects distinguished names of scientists from foils. A total of 100 names were listed; half were Nobel Prize winners (the scientists) and the others were names from the National Academy of Sciences (categorized by Freed et al. as "foils").

Two measures of print exposure assessed reading frequency: the Author Recognition

Test (Stanovich & West, 1989) and the Reading Habits Questionnaire (Scales & Rhee, 2001). In

the former, subjects distinguished real authors (from New York Times Best Seller List) from

foils; in the latter, subjects responded to self-report scales about their reading habits, skills, and

preferences.

Reasoning. Reasoning measures included the first two sets from Raven’s Advanced

Progressive Matrices (Raven, 1962) and measures from the Ekstrom Battery of Factor-

Referenced Tests (Ekstrom et al., 1976). Raven’s Advanced Matrices consisted of 48 items in

which subjects induced the missing element to complete a pattern. The Ekstrom Battery included

(a) Arithmetic Aptitude (two sections of 15 arithmetic word problems); (b) Mathematic Aptitude

(two sections of 15 algebraic, multiple-choice word problems); (c) Necessary Arithmetic

Operations (subjects asked to determine which numerical operations were necessary to solve two

sections of 15 items).

Perceptual Speed. In Letter Comparison, subjects identified whether two letter strings were the same or different, across two timed lists of 21 pairs. For Pattern Comparison, subjects determined whether pairs of patterns were the same or different, across two lists of 15 pairs with a 30 s time limit. In Finding A's, subjects had to find five words containing the letter "a" from five columns of words within 2 min. For the Identical Pictures measure, subjects viewed rows of geometrical figures and determined which were identical to the first figure in the row; subjects completed 48 rows of figures across 2 lists, with 2 min per list.

Fluency. For Word Beginnings, subjects reported as many words as possible that started with a specified letter (2 prompts given, 3 min per prompt). For Word Endings, subjects reported as many words as possible that ended with a prompted letter (2 prompts given, 3 min per prompt). All responses were written.

Reading Comprehension. Measures of comprehension included the comprehension

section of Nelson-Denny Reading Test, which consisted of reading a passage and answering

questions from either Form F (36 questions) or Form G (38 questions; Brown, Bennett, & Hanna,

1980; Brown, Fischo, & Hanna, 1993). A novel measure, the Investigator-Generated-Test,

presented 10 multiple-choice questions that assessed memory and understanding of the main

ideas and themes from 10 texts.

Data Analysis Plan. Descriptive statistics and correlations (See Table 1; N = 357) of all

measured variables provided by Freed et al. (2017) were used to generate a covariance matrix.

This covariance matrix was used to replicate their measurement model directly, using a CFA1.

Next, 3 SEMs were generated (two were replications of models presented in Freed): (a) Model 1,

the Freed et al. chosen model that had direct paths from language experience and reasoning to

reading comprehension (Model 1 in Freed et al.); (b) Model 2, which is the same as Model 1

except the latent variable for reasoning has been deleted, and there is now a direct path to reading

comprehension from WMC (Model 3 in Freed et al.); (c) Model 3, a novel model that contains

all latent variables and includes direct paths from each of them to reading comprehension (not

presented in Freed et al.). Model fit statistics were used to assess and compare the fit of these

1 Despite using their correlation matrix and specifying the exact same models presented by Freed et al. (2017), there is a large disparity in model fit between the current models and those conducted by Freed et al. The current authors used all of the information available from the Freed et al. paper and supplementary materials; thus, the cause of this disparity remains unclear. Results and discussion concern the analyses conducted in the current project.

models to the data in accordance with the standards suggested by Kline (2005) and Schreiber,

Nora, Stage, Barlow, and King (2006). We analyzed these models in R using the lavaan package

(Rosseel, 2012).
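To make this workflow concrete, the following is a minimal sketch of how such a re-analysis can be specified in lavaan from a published correlation matrix. The object names and indicator labels (e.g., freed_cor, ReadSpan) are hypothetical placeholders rather than the labels used by Freed et al. (2017), and only two of the predictor factors are written out.

```r
# Minimal sketch (not the authors' actual script); 'freed_cor' is assumed to hold
# the published correlation matrix and 'freed_sd' the reported standard deviations.
library(lavaan)

# Convert the published correlation matrix to a covariance matrix.
freed_cov <- cor2cov(R = freed_cor, sds = freed_sd)

# Abbreviated specification of Model 3 (hypothetical indicator names); the
# remaining predictor factors would be defined in the same way.
model3 <- '
  # measurement model (abbreviated)
  WMC      =~ ReadSpan + AlphaSpan + MinusSpan + VisNumSpan
  LangExp  =~ NDVocab + ExtRangeVocab + AdvVocab + CultLit + SciKnow + ART + ReadQ
  ReadComp =~ NDComp + InvGenComp

  # structural model: reading comprehension regressed on every predictor factor
  ReadComp ~ WMC + LangExp   # + Decoding + Fluency + Reasoning + Speed + Inhibition
'

fit3 <- sem(model3, sample.cov = freed_cov, sample.nobs = 346)
fitMeasures(fit3, c("chisq", "df", "cfi", "tli", "rmsea", "srmr", "aic", "bic"))
summary(fit3, standardized = TRUE)   # standardized paths comparable to Figures 2-4
```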

Results

Confirmatory Factor Analysis. A CFA was used to test the 7-factor measurement

model originally proposed by Freed et al. (2017; see Figure 1). This model includes factors

representing: WMC, reasoning, perceptual speed, decoding, fluency, inhibition, and language

experience. Inhibition is not a true factor, as it only has one indicator variable. Based on the

summary of model fit statistics presented in Table 2, the measurement model demonstrated

acceptable fit for the ratio of χ2 to df and for the SRMR. However, the comparative fit indices (CFI/TLI) and the RMSEA were not in the acceptable range. This could be due to some redundancy across

factors; for example, WMC and reasoning were correlated at .58 (p < .001) and language

experience and fluency were correlated at .57 (p < .001). Regardless of fit, for the sake of better

understanding the data we continued onto the SEMs.

Structural Equation Model 1. The first model was a replication of Model 1 from Freed

et al. (2017), consisting of the seven latent predictor variables from the measurement model and

a latent variable representing reading comprehension (See Figure 1 for the associated factor

loadings). The model accounted for 79% of the variance in reading comprehension with two

predictor variables that had direct effects: language experience (β = .76, p < .001) and reasoning

(β = .24, p = .001). These predictor variables also had shared covariances with the other predictor

variables in the model. The remaining latent variables (WMC, decoding, perceptual speed,

fluency, inhibition) did not have direct paths to reading comprehension, but were related to other

predictor variables through shared covariance (see Figure 2 for path values). The only

paths/relationships that were included were consistent with what Freed et al. also maintained in

their model. For model fit statistics see Table 2. Model 1 demonstrated acceptable fit for the

following indices: χ2 to df, and SRMR. Similar to the measurement model, the CFI/TLI and

RMSEA were not in the acceptable range.

Structural Equation Model 2. The second model was a replication of Model 3 from

Freed et al. (2017), consisting of the six latent predictor variables from the measurement model

(language experience, WMC, decoding, perceptual speed, fluency, inhibition) and reading

comprehension. The model accounted for 77% of the variance in reading comprehension with

two predictor variables with direct effects: WMC (β = .22, p = .001) and language experience (β = .77, p < .001). The remaining predictor variables did not have direct paths but were related to

certain other predictor variables (including WMC and language experience) via shared

covariance (See Figure 3 for path values). The only paths/relationships that were included were

consistent with what Freed et al. also maintained in their model. For model fit statistics see Table

2. Model 2 demonstrated acceptable fit for the ratio of χ2 to df, and SRMR. Similar to previous models, the CFI/TLI and RMSEA were not in the acceptable range. Model 2 did produce smaller

AIC and BIC values than Model 1, indicating better fit than the previous model containing all

predictor variables. However, this is only because the model is less complex, owing to the removal of reasoning and its associated paths.

Structural Equation Model 3. The third model was not included in Freed et al. (2017)

but consists of their original 7-factor model structure (WMC, decoding, perceptual speed,

fluency, inhibition, language experience, and reasoning) and reading comprehension. Rather than

removing pathways from the model (as done in Model 1 and Model 2 by Freed et al.), each

exogenous variable had a direct path to reading comprehension and was allowed to correlate with

all other latent variables. The purpose of this model was to determine whether any pathways

were overlooked in the first two models. In fact, all latent variables except for reasoning,

perceptual speed, and inhibition had significant direct paths to reading comprehension,

including language experience (β = .76, p < .001), WMC (β = .20, p = .027), decoding (β = -.20, p = .018), and fluency (β = -.25, p = .037). This model accounted for 85% of the variance in

reading comprehension. Model fit statistics indicate comparable fit to that of Model 1 and Model

2, demonstrating acceptable fit for the ratio of χ2 to df, and SRMR. Again, the comparative fit

indices (CFI, TLI), RMSEA, and 90% confidence intervals around RMSEA were not in the

acceptable range.

Discussion

The purpose of Study 1 was to reassess conclusions made by Freed et al. (2017) that

WMC does not need to be included in cognitive models of reading comprehension. Freed et al.

used SEM techniques to assess the role of domain-general and domain-specific factors

underlying reading comprehension and determined that language experience was the strongest

predictor of this ability. Additionally, Freed et al. concluded that when both WMC and reasoning

were included in a model, only reasoning maintained a significant direct path to reading

comprehension (see Figure 2). We reassessed two of the Freed et al. models: Model 1 (Figure 2), the model on which Freed et al. based their final conclusions in the original publication, and Model 2 (Figure 3), a second, similar model presented in the original publication without a latent variable representing reasoning. The unacceptable fit of both models

to the data demonstrates the lack of clear justification for choosing one model over the other.

Additionally, Freed et al. did not provide a theory-based reason for choosing Model 1 over

Model 2, even though these models offer different interpretations of the data. Model 1 indicated

no direct relationship between WMC and reading comprehension; Model 2 did indicate a

relationship. These concerns highlight that there was neither a data-driven nor theoretically-

driven explanation provided to support their choice of model. This ultimately calls into question

the conclusions that were based on this model.

In the current study, we tested a third and novel SEM that was not presented by Freed et

al. (2017), Model 3, containing direct paths from all factors to reading comprehension. Fluency,

decoding, WMC, and language experience were all significant predictors of reading

comprehension, with language experience as the strongest predictor, a result consistent with both

domain-specific and domain-general perspectives of reading comprehension (Just & Carpenter,

1992; MacDonald & Christiansen, 2002). More importantly, this model also tested whether,

when both WMC and reasoning are given direct paths to reading comprehension, only the path

for reasoning remains significant. We disconfirmed the Freed et al. conclusion and found that

WMC maintained a significant direct path to reading comprehension, while reasoning did not.

However, as with the two previous SEMs, Model 3 demonstrated unacceptable fit to the

data. This indicates that there are likely core issues with the measurement model underlying the

SEMs designed by Freed and used as the factor structure for all three SEMs. Some of the model

fit indices from our CFA measurement model were outside of the acceptable range, similar to the

SEMs. Additionally, when examining the factor loadings, many were inconsistent with each

other. For example, the seven factor loadings for language experience range from .37 to .88.

Although the low value is still within the “acceptable” range, from a theoretical perspective, the

wide range of factor loading values seems to indicate some conceptual and statistical

inconsistencies with this factor. This is unsurprising, as this factor was initially intended by Freed et al. to comprise three separate factors: vocabulary, print exposure, and background knowledge. However, it

was reported that these all loaded onto one factor, which was therefore labeled language experience.

Many of the other factors also have uneven factor loadings, and this could indicate that some

manifest variables may be double-loading or simply inappropriate for the given factor.

Additionally, the factor representing inhibition is not truly a factor at all, but rather a constant, as

there is only one manifest variable underlying this factor. This constant is inappropriate to use in

the measurement model and could be contributing to the poor fit of the models, particularly

because it is not correlated with both indicators of reading comprehension (See Table 1). Due to

these serious concerns regarding the measurement model, we next sought to reexamine the

relationship between WMC and reading comprehension using alternative techniques that do not

involve latent variables.

Study 2: Psychometric Network Analysis of Freed et al. (2017)

Study 2 served as an additional extension to the work of Freed et al. (2017), using an

alternative, more exploratory statistical analysis. Specifically, we used the correlations and

descriptive statistics reported by Freed et al. to generate psychometric network models. These

analyses are relatively new to psychology, particularly to cognitive psychology. Psychometric

network analysis uses partial-correlations between all observed variables to display a network of

interconnected nodes. Network analysis approaches observed psychological processes as an

emergent quality generated from a network of interacting mechanisms (Epskamp, Borsboom, &

Fried, 2017). Rather than using latent variables, each observed variable is represented by a node,

and partial-correlations between variables are represented through connections called edges.

Instead of using a common latent factor to explain the relationships between observed variables,

all one-to-one relationships between observed variables are depicted. Recently, network analysis

has been used in the field of clinical psychology to assess networks of psychological disorders,

such as symptoms of clinical depression (McNally et al., 2015; McNally, Mair, Mugno, &

Riemann, 2016; Van der Maas et al., 2017). For more information about using network analysis

techniques, see Epskamp and Fried (2016).
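As a brief, self-contained illustration of the core computation (toy values, not the Freed et al. data), the edge weights of an unregularized network are simply the partial correlations obtained by inverting and rescaling the correlation matrix:

```r
# Toy illustration of how partial-correlation edge weights are obtained:
# invert the correlation matrix (the precision matrix) and rescale it.
pcor_from_cor <- function(R) {
  omega <- solve(R)                                # precision matrix
  P <- -omega / sqrt(diag(omega) %o% diag(omega))  # standardize off-diagonal entries
  diag(P) <- 0                                     # no self-loops in the network
  P
}

# Three made-up variables; each off-diagonal entry of the result is the
# association between two nodes after controlling for all remaining nodes.
R <- matrix(c(1.0, 0.5, 0.4,
              0.5, 1.0, 0.3,
              0.4, 0.3, 1.0), nrow = 3)
pcor_from_cor(R)
```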

There are several reasons to use network analysis to examine the processes underlying

reading comprehension. First, network analysis is more appropriate than SEM considering the

research question. The goal set by Freed et al. (2017) was to determine how reader

characteristics are related, focusing on the role of WMC and language experience (vocabulary

and word knowledge). This was an exploratory, open-ended approach that should have been

paired with an exploratory analysis. Network analysis is much more exploratory than SEM and

can also be modified to favor a more conservative or exploratory model. Allowing for more

flexibility, network models can also be adjusted to select for a sparser or more interconnected

network and parameters can be set to intentionally prune spurious edges.

A second justification for using network analysis is to examine better how each manifest

variable relates to the outcome variable and to all of the other observed variables in the network.

Unlike latent variable modeling, network models allow researchers to visualize and explore relationships among all of the manifest variables. Visually, one can determine whether a node is more relevant

to the network by examining the number of edges connecting each node to the other nodes in the

model. Edge appearance can also indicate information about variable relationships, such as

direction and magnitude. Moreover, how closely the nodes are clustering together also indicates

important association information, as the closer the clustering the stronger the relationship.

A third justification for using network analysis is that it provides information about each

variable and their individual contribution to the network. There are data-driven indicators of the

relevancy of each node compared to other nodes in the network, including centrality indices and

clustering coefficients that are computed for each variable. Centrality indices specify the

importance or relevance of each variable to the network through measures of strength (total edge

weights connecting the node to other nodes), betweenness (how often a node bridges paths

between other nodes), and closeness (proximity to other nodes; Epskamp & Fried, 2016).

However, there is some controversy surrounding the use and interpretation of centrality indices,

particularly because they can become artificially inflated and difficult to interpret (Dablander &

Hinne, 2018; Bringmann et al., 2018). For that reason, the current project will not use centrality

indices as a means to compare the strength of each node relative to the outcome variable, but

rather will plot them alongside the clustering coefficient to find extreme scores indicating

redundancy or irrelevancy. The clustering coefficient generated for each variable is an index of

redundancy (Epskamp & Fried, 2016). It is desirable to have a certain level of clustering, as it

indicates that a node is related to other nodes in the network, but extreme values could indicate

redundancy. Comparing a node’s clustering coefficient alongside its associated centrality indices

will allow assessment of whether each variable is offering unique and relevant information to the

network. Nodes that have low centrality, but a high clustering coefficient indicate that these

nodes are likely redundant with other nodes in the network. Nodes that have both low centrality

and low clustering indicate that the node is not related to many other variables in the network

and could be considered irrelevant. Both of these types of nodes are potential candidates for

removal from the network as they could be negatively impacting the overall fit of the model to

the data.
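A rough sketch of this screening step in R is shown below. It assumes an already-estimated qgraph network object (net), and the helper functions and column names used (e.g., clustWS for the Watts-Strogatz clustering coefficient) reflect the qgraph package as we understand it and may differ across versions.

```r
# Sketch of the node-screening plots described above (assumes a qgraph object 'net').
library(qgraph)

cent  <- centrality_auto(net)$node.centrality  # Betweenness, Closeness, Strength per node
clust <- clustcoef_auto(net)                   # clustering coefficients per node

# Pair each node's clustering coefficient with its strength centrality
# (repeat with Betweenness and Closeness for the other two panels).
screen <- data.frame(node       = rownames(cent),
                     strength   = cent$Strength,
                     clustering = clust$clustWS)  # assumed column name

plot(screen$clustering, screen$strength, pch = 19,
     xlab = "Clustering coefficient", ylab = "Strength centrality")
text(screen$clustering, screen$strength, labels = screen$node, pos = 3, cex = 0.7)

# Lower-right points (high clustering, low centrality) suggest redundant nodes;
# lower-left points (low on both) suggest irrelevant nodes.
```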

A fourth benefit of network analysis is that, like other types of modeling, fit indices can

be obtained to indicate network model fit to the data. Network models generate fit indices similar

to those used to compare SEM and factor analysis, such as CFI/TLI, RMSEA, SRMR, and χ2

values. Currently, this only allows for qualitative comparisons across different types of models.

However, these indices can be compared between multiple network models to determine whether

changes made, such as the removal of nodes, have improved overall fit.
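For instance, qgraph's ggmFit utility can, to our understanding, return such indices for an estimated network; the objects below (net_weights, freed_cor) are assumed placeholders, with net_weights being a regularized partial-correlation matrix such as the one estimated later in Study 2, and the exact arguments may vary across package versions.

```r
# Hedged example: fit indices for a Gaussian graphical model, given an estimated
# partial-correlation (edge-weight) matrix, the observed correlation matrix, and N.
library(qgraph)

fit_net <- ggmFit(net_weights, covMat = freed_cor, sampleSize = 346)
fit_net   # prints chi-square, CFI/TLI, RMSEA, and related indices
```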

A fifth and final benefit of network analysis is that it can be used as a complement to other types of analyses, without the controversial subjectivity that comes with the use of latent variables (Borsboom, Mellenbergh, & Van Heerden, 2003). There is debate as to whether latent variables truly represent and measure something in reality or are simply mathematical artifacts. Moreover, the same factor could be defined or named differently by different researchers, leading to questions of reliability. Because network analysis is a relatively new technique in cognitive psychology, a

component of this project is to demonstrate how network analysis can be used as a tool to

confirm, modify, and inform future models (both latent variable and network). More so than

latent variable models, network models are much easier to produce/modify and allow researchers

to visualize the underlying interconnectedness of the observed variables in a meaningful way.

Essentially this analysis offers an alternative perspective on the data that can provide insight into

the relationships being investigated. Using network analysis, the clustering of the nodes can be

used to confirm latent variables or factor structures, as we would expect nodes belonging to the

same factors to be clustered closely together. If the clustering is inconsistent with the factor

structure of the latent variable model, this could indicate changes that need to be made to the

measurement model. Additionally, redundant or irrelevant nodes in the network can be

established that can be used to remove superfluous variables. If removing these nodes results in

improvement to the overall fit of the network model, this indicates that removing these variables

from the latent variable model could also improve the fit of the latent variable model. Finally, the

exploratory nature of network modeling may uncover other information about the relationships

between observed variables, latent variable models or measures used that are not readily

anticipated or visible using complex latent variable modeling.

The purpose of Study 2 is to reexamine how specific domain-general (i.e., WMC) and

domain-specific (i.e., language experience) factors relate to reading comprehension. As well,

Study 2 demonstrates how psychometric network modeling, an approach that is rapidly gaining

adherents in individual-differences research, can be used in conjunction with latent variable

modeling to inform future models of cognitive abilities.

Method

The participants, materials, and procedure for Study 2 are the same as those in Study 1.

Results

Psychometric Network Model 1. A network analysis was generated from partial

correlations between all the measured variables (See Figure 5 for network visualization). The

network analysis was conducted using the qgraph package available in R, using the graphical

least absolute shrinkage and selection operator (gLASSO) based on the extended Bayesian

Information Criterion (EBIC) regularization technique (recommended by Friedman, Hastie, &

Tibshirani, 2008). Regularization techniques eliminate spurious edges between nodes that can occur due to sampling error, and setting the EBIC hyperparameter (gamma) conservatively, as done for this network (γ = .50), favors a sparser network model. A second component, the tuning parameter (lambda), was set to a value (λ = .01) that limits spurious edges while retaining true edges, as per the recommendations for psychometric network analysis (Epskamp, Lunansky, Tio, & Borsboom, 2018). Assessment of the fit indices is again consistent with the recommendations of Kline (2005) and Schreiber et al. (2006).
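A minimal sketch of this estimation step is given below, assuming the published correlation matrix (freed_cor), N = 346, and a grouping list (node_groups) mapping nodes to constructs; the lambda.min.ratio argument is our reading of the tuning-parameter setting described above, not a value reported by Freed et al.

```r
# Sketch of Network Model 1 estimation (assumed object names; N = 346 as in Study 1).
library(qgraph)

# EBIC-regularized graphical LASSO: gamma = .50 favors a sparser network.
net_weights <- EBICglasso(S = freed_cor, n = 346,
                          gamma = 0.50,
                          lambda.min.ratio = 0.01)

# Visualize the regularized partial-correlation network; 'node_groups' is an
# assumed list mapping each construct (WMC, language experience, etc.) to its nodes.
net1 <- qgraph(net_weights, layout = "spring", groups = node_groups,
               labels = colnames(freed_cor))
```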



For the most part, node clusters were consistent with the latent variable loadings, meaning that observed variables belonging to the same latent variable in Freed et al. (2017) tended to cluster together. However, there was an exception: Raven's Progressive Matrices

clustered more closely with the WMC nodes than with the reasoning nodes (which reflected several math reasoning tasks). This likely contributed to the less than acceptable fit indices for the

measurement model CFA. Unlike the previous SEMs, model fit statistics for the network model

nearly all indicated good fit. This was true for all but CFI/TLI values which, although lower than

desired, were still substantially higher than those of the previous models (See Table 2). Although at face

value the network model fit statistics seem to indicate a better fit than the SEMs, the network

model could only be compared to the previous models qualitatively.

Psychometric Network Model 2. The next step was to determine whether there were

any redundant or irrelevant nodes in need of removal to improve the network fit. Although the

previous network model displayed acceptable fit to the data, there was still room for

improvement. Additionally, the network seemed imbalanced, considering the number of

language experience nodes compared to the number of nodes representing the other

concepts/factors in the network. More nodes representing one concept compared to the others

could influence the structure or layout of the network. Three plots were generated with the

clustering coefficient along the X-axis and each of the centrality indices (betweenness, strength,

and closeness) along the Y-axis (see Figure 6). To be conservative, only nodes with the most

extreme values were considered for removal. Nodes that are highly redundant would be in the

lower right quadrant, indicating a high clustering coefficient, but low centrality. Nodes that are

highly irrelevant would be in the lower left quadrant, indicating low clustering and low

centrality. The node representing the Stroop effect (the only measure representing inhibition)

appeared largely unrelated to the rest of the network, as indicated by the lack of edges

connected to this node (See Figure 5). Moreover, the plots also confirmed this node was

irrelevant, demonstrating the lowest possible centrality and clustering of all the nodes (see Figure

6). Thus, this node was flagged for removal from the network. When examining nodes for

potential redundancy, the node representing Reading Questionnaire (a language experience node)

seemed to be the most extreme, with very low centrality and the highest clustering compared to

other nodes (Figure 6). This node was also removed.

After the nodes were removed, the network was estimated again and the redundancy plots were generated a second time to determine whether any issues of redundancy remained (see

Figure 7). Two nodes still demonstrated a high degree of redundancy, Advanced Vocabulary and

Author Recognition Test (language experience nodes). For each measure of centrality, these

nodes showed extreme redundancy and limited centrality; thus, these nodes were removed as

well. Overall, then, the inhibition factor from the SEMs only contained one measure, making it

an inappropriate factor to use in the first place. Compounding the issue, the inhibition node in the network model was connected to the rest of the network by only one edge. Additionally, the three language experience nodes that were removed (Advanced Vocabulary, Reading Questionnaire, Author Recognition Test) overlapped conceptually at face value or from a practical perspective. Removing these nodes allowed each concept within the network to be equally

represented by two to four nodes, rather than biasing the network towards language experience

by representing this concept with seven nodes.

The network analysis was run again with the four nodes removed, with all the same

parameters set as the previous network model (Network Model 2; See Figure 8). As expected,

language experience nodes were more closely clustered with reading comprehension nodes due

to the conceptual overlap of these two abilities. However, removal of the redundant and

irrelevant nodes allowed the relationship between WMC and reading comprehension to be more

pronounced. This was evidenced by the closer clustering (compared to Network Model 1) and

visible edges between these two concepts’ nodes. As with the previous model, the reasoning

node representing Raven's Progressive Matrices clustered more closely with the WMC nodes than with the other reasoning nodes, highlighting a likely issue with the measurement model underlying the

SEMs presented by Freed et al. (2017). The changes made to the network substantially improved

the fit of the model (See Table 2).

Psychometric Network Model 3. Finally, we decided to take a closer look at the

relationships between reading comprehension and WMC and language experience. This was

done to test the prediction that, although language experience nodes would be more strongly

related to reading comprehension, WMC nodes would also have connections to reading

comprehension. Thus, we conducted a model with just the nodes representing these three

concepts and also set the code to display the edge weights connecting the nodes in the network.

This was done to offer numerical evidence for the relationships among these observed variables.

Network Model 3 (parameters set consistent with previous networks; See Figure 9; Table 2)

demonstrated excellent fit to the data. Language experience nodes had stronger edge weights

connecting them to reading comprehension, but each WMC node had partial correlations connecting

them to at least one reading comprehension node (See Discussion for possible explanations and

elaboration). Although the edge weights connecting WMC and reading comprehension were

small (between .10 and .11), these represent partial correlations that were strong enough not to be removed by the tuning parameter that was set to prune spurious edges.

Discussion

Study 2 reexamined the role of the domain-specific and domain-general processes

underlying reading comprehension, specifically WMC, using an alternative analysis.

Additionally, network modeling explored potential problems pertaining to the measurement

model underlying the latent variable models from Freed et al. (2017).

The initial psychometric network model, although it fit the data acceptably, was

imbalanced (many more language experience nodes than nodes for other concepts), with several

redundant nodes and one irrelevant node (Network Model 1; See Figure 5). Once the irrelevant

and redundant nodes were removed from the analysis, the second network model indicated

excellent fit to the data (Network Model 2). Network Model 2 demonstrated the relationship

between WMC and reading comprehension nodes, as well as language experience and reading

comprehension nodes via multiple connecting edges and close node clusters (see Figure 8). To

explore these relationships further, language experience, WMC, and reading comprehension

were used in a separate network model. As expected, Network Model 3 showed that language

experience had strong associations to reading comprehension, but also demonstrated significant

partial correlations between each WMC node and one of the reading comprehension nodes. This finding, combined with the results from SEM Model 3, provides a convincing case for the relationship

between WMC and reading comprehension. At the very least these results challenge the

conclusion from Freed et al. (2017) that WMC is not necessary for models of reading

comprehension.

Another goal of this study was to demonstrate how network analysis can be used as a

complementary tool to latent-variable modeling. This was achieved by using network analysis to

explore the issues that likely contributed to the poor fit of the SEMs from Freed et al. (2017),

specifically relating to the measurement model used. One of the reasoning nodes (Raven’s

Progressive Matrices) clustered more closely to the WMC nodes than the other reasoning nodes.

All of the other reasoning measures were arithmetic-based, so it is possible that Raven’s

Progressive Matrices has more in common with the WMC complex-span tasks. In any case, this clustering indicates that the reasoning factor is not consistently defined, which likely contributes to the poor fit of the models produced under this factor structure. Another concern for

the measurement model involves the inhibition factor, which, represented as a single node in the network analysis, was largely irrelevant to the other nodes in the network, with only one edge connecting it to another node. Indeed, removing this node from the network improved the overall fit, likely

indicating that removing this variable from the measurement model could also improve the fit of

the SEMs.

Other issues uncovered by the network models directly involve the underlying measures

used in the SEMs presented by Freed et al. (2017). In addition to the irrelevant inhibition node,

the network model also indicated that three of the language experience nodes were highly

redundant with one another (Reading Questionnaire, Advanced Vocabulary, and Author

Recognition Test). This is unsurprising as many of the nodes overlapped conceptually (e.g., three

different measures of vocabulary were used). Moreover, in the latent variable model, the language experience factor was initially intended to be three separate factors that were subsequently combined, leaving the factor inconsistently defined. Thus, some of the language experience

measures were too similar, while others were too dissimilar from each other. The removal of

these nodes improved the fit of the network model, again indicating that removing these

measures from the measurement model could improve the fit of the overall SEM.

Another problem with the measures used was emphasized by the inconsistent

relationships between WMC nodes and each of the reading comprehension nodes. All WMC

nodes had edges connecting them to the reading comprehension node representing Investigator-

Generated Comprehension. However, none of the WMC nodes had any significant connections

to the other comprehension node, Nelson-Denny Comprehension. The Nelson-Denny

Comprehension measure has been criticized as not being a true measure of reading

comprehension, but rather as reflecting individual differences in general knowledge or verbal

reasoning (Ready, Chaudhry, Schatz, & Strazzullo, 2013). This criticism is supported by results indicating that individuals with higher IQ or reading ability can accurately answer the test questions without reading the associated passages. If reading comprehension is not truly exercised or assessed by the measure, then it is possible that some of the skills or processes

underlying reading comprehension were not employed to complete the task. For example,

perhaps the Nelson-Denny Comprehension Test does not properly tax working memory enough

to allow the relationship between WMC nodes and the Nelson-Denny Comprehension node to be

demonstrated. In contrast, it is possible that the Investigator-Generated Comprehension measure

was more reflective of reading comprehension and thus properly taxed WMC, allowing for the

relationship between this node and WMC nodes to materialize. Adding to this, the constructs of

reading comprehension and language experience were not convincingly separate from one

another. In fact, these two factors were correlated at .87 (p < .001), which indicates redundancy.

Additionally, Nelson-Denny Comprehension was used as a comprehension node, but Nelson-

Denny Vocabulary was used as a language experience node. These two highly correlated

measures should not have been employed in the same model to represent distinct factors, as the

strong relationship between them will potentially bias the model. The strong relationship

between reading comprehension and language experience may also reflect shared method

variance, as Nelson-Denny Vocabulary was embedded in a reading comprehension context, and

both of the reading comprehension measures may have presented difficult vocabulary.

Similarly, the other reading comprehension node, Investigator-Generated Comprehension,

was also highly correlated with another language experience node, Extended Range Vocabulary

Test (r = .59, p < .05). Both of these issues indicate that reading comprehension and language experience needed to be better distinguished from one another, for example by using only the language experience measures that are most dissimilar to the reading comprehension measures (e.g., Author Recognition Test, Scientist Recognition Test). Perhaps, given the

previously mentioned problems of the Nelson-Denny Comprehension Test, this measure should

not have been used in the analyses at all.

General Discussion

Theories of reading comprehension largely focus on the underlying cognitive processes

that contribute to variation in this ability across individuals. The ongoing debate concerns

whether variation in reading comprehension is due to individual differences in domain-general

versus domain-specific processes. The domain-general perspective posits that differences in

WMC contribute to variation in reading comprehension; specifically, having greater WMC is

related to better reading comprehension (Just & Carpenter, 1992). Alternatively, the domain-

specific perspective posits that variation in reading comprehension is due to language-specific skills that improve with increased experience (MacDonald & Christiansen, 2002). The

current project confirms aspects from both of these perspectives, with both language-specific

factors and WMC accounting for variation in reading comprehension. Language experience was

the strongest predictor, but WMC, decoding, and fluency were significant predictors as well.

Admittedly, given the issues discovered concerning the measurement model and the psychometric measures used, these results warrant some skepticism. Nevertheless, we suggest that, taken together, the results of our re-analyses conflict with the original conclusion from Freed et al. (2017) that WMC is not necessary for cognitive models of reading comprehension. The latent

variable models and network models converged to suggest that WMC was related to variation in

reading comprehension. Additionally, psychometric network analysis allowed for a richer

visualization of the relationship between WMC and reading comprehension, as well as of the relationships among the various other cognitive processes underlying reading comprehension. This visualization confirmed an association between WMC and reading comprehension, but also revealed aspects of these relationships that may not have been as readily apparent using latent variable modeling. For example, it revealed that WMC nodes were only

related to one measure of reading comprehension, which then prompted exploration into why the

Nelson-Denny Comprehension Test may not have been appropriate to use. Finally, although this

would be outside of the scope of the current project, the network model revealed a variety of

different associations between all of the observed variables that could have been further

explored, an advantage that is not freely offered by latent variable modeling. Essentially, network

analysis allows us to view an ability as a network of interacting processes, which seems to be a

more realistic depiction of reading comprehension.

There were a few limitations to the current project, the most obvious being that the

current authors did not use the full dataset collected by Freed et al. (2017). We were able to

reconstruct the measurement model using the reported correlation matrix, descriptive statistics,

and model descriptions, but there were still large, unexplained differences in model fit between the current models and the models presented by Freed et al. These differences could be due to a number of factors, such as rounding error: the correlation matrix from Freed et al. was rounded to two decimal places, so some correlations were reported as .00 when they were smaller than .01. The current CFA may also have differed from the PCA reported by Freed et al., in that we allowed the factors to correlate to be consistent with the SEM, whereas Freed et al. used an orthogonal rotation in their PCA (but allowed factors to correlate in their SEMs). It is also possible that different estimation methods were employed between the current models and those presented by Freed et al. The current project used maximum likelihood (which tends to be standard practice), but Freed et al. did not indicate which estimation method was used for their analyses. Finally, some of the less common techniques used by Freed et al., such as backward-building techniques in the creation of their models, may also have contributed to differences in model fit between the two studies.
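
To make this reconstruction concrete, a CFA can be fit directly to a published correlation matrix in lavaan (Rosseel, 2012) with maximum likelihood estimation and correlated factors, as in the simplified sketch below. Only a hypothetical two-factor fragment of the measurement model is shown; freed_cor and n_obs are placeholder names for the reported correlation matrix and sample size.

# Simplified sketch: fitting a CFA to a reported correlation matrix in lavaan.
# Only two factors are shown for brevity; 'freed_cor' and 'n_obs' are
# hypothetical placeholders for the published correlation matrix and N.
library(lavaan)

model <- '
  WMC     =~ RdS + AlS + MnS + VNS
  LangExp =~ ClI + NDV + ERV + SRT
'

fit <- cfa(model,
           sample.cov  = freed_cor,  # correlation matrix supplied as the input covariance
           sample.nobs = n_obs,
           std.lv      = TRUE,       # latent variances fixed to 1; factors covary freely
           estimator   = "ML")       # maximum likelihood, as in the current project

fitMeasures(fit, c("chisq", "df", "cfi", "tli", "rmsea", "srmr", "aic", "bic"))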

Other noted limitations can be expected for any project utilizing a secondary analysis, in

that we had no control over which variables were included, how these variables were measured,

nor the factor structure underlying the models. This lack of control over certain aspects was

particularly limiting once it was determined that there were problems with the measurement

model and some of the measures used. As discussed previously, the decisions by Freed et al. regarding which variables/measures to include and how to define the factors affected the results of the SEMs, but could also have constrained possibilities for the network analyses. With psychometric

network analysis, the overall structure is reliant on the variables chosen to include in the model.

Accordingly, overrepresenting one construct in the model (e.g., language experience) can lead to

bias in the network structure such that the entire network is built around one (set of) strong

association(s). Although we were able to remove redundant/irrelevant nodes and improve the fit,

having more control over the measures used would have allowed us to investigate and represent

the relationships in the network better, particularly the relationship between WMC and reading

comprehension.

Future investigations should use a combination of latent-variable modeling and network

analyses to better understand the processes underlying individual differences in reading

comprehension (as well as other cognitive abilities). Using both techniques in tandem will allow

for exploration of reading comprehension from the traditional latent variable perspective, while

also benefitting from the new tools network analysis has to offer. For example, network analysis

can be used as an exploratory first step that guides the structure of the measurement model.

Changes made to the network model that improve the overall fit can inform potential adjustments

for the latent-variable model as well. However, as with other exploratory techniques, it would be

inappropriate to conduct an exploratory network analysis and a CFA/SEM on the same dataset.

Thus, it would be necessary to collect new data after conducting the network analysis, prior to

moving on to confirmatory analyses. Additionally, for future research utilizing latent-

variable modeling to explore reading comprehension, it will be necessary to pay careful attention

to the measures used and how they load onto the specific factors in the measurement model.

Measures that are redundant, not sufficiently taxing, or too highly correlated with measures from

other factors should not be used in latent-variable or network models. Network analysis can

also aid in this task by providing centrality indices and clustering coefficients, which can confirm

whether any measure used is appropriately related to the other nodes in the network and not

irrelevant or redundant. Finally, once a measurement model has been constructed, a

confirmatory network analysis can be used to corroborate the factor structure by

verifying whether the nodes are clustering consistently with what the measurement model

predicts.
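
As a rough sketch of this diagnostic workflow (using qgraph's built-in helpers; the object net is assumed to be a partial correlation matrix from an earlier estimation step), centrality indices and clustering coefficients can be tabulated to flag candidate redundant or irrelevant nodes before a measurement model is finalized.

# Hedged sketch: screening nodes for redundancy or irrelevance with qgraph.
# 'net' is assumed to be a weighted (partial correlation) adjacency matrix,
# e.g., the output of EBICglasso().
library(qgraph)

cent  <- centrality_auto(net)$node.centrality  # betweenness, closeness, strength per node
clust <- clustcoef_auto(net)                   # local clustering coefficients per node

# Nodes with high clustering but low centrality are candidates for redundancy;
# nodes with near-zero strength and few edges are candidates for removal as
# irrelevant (cf. the redundant and irrelevant nodes identified in Study 2).
diagnostics <- cbind(cent, clust)
round(diagnostics, 2)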

Overall, these re-analyses confirmed the significant role that WMC plays in reading

comprehension. In the final SEM tested, WMC, but not reasoning, maintained a significant direct path to reading comprehension when both factors were included as predictor variables in the

model. Additionally, the network analyses indicated significant associations between WMC and

reading comprehension nodes. In conclusion, as suggested by Freed et al. (2017), language experience seems to be the strongest predictor of reading comprehension. However, WMC is still an important factor in reading comprehension and should be included in future cognitive models of

this ability.

Acknowledgments

Sara Anne Goring would like to thank her co-authors Christopher J. Schmank, Michael J. Kane,

and Andrew R. A. Conway. She would also like to give a special thanks to Kathy Pezdek for all

her help throughout the writing process. Thank you as well to Ester Navaro and Kevin Rosales for feedback on the manuscript.



References

Baddeley, A., Logie, R., Nimmo-Smith, I., & Brereton, N. (1985). Components of fluent reading.

Journal of Memory and Language, 24, 119-131. https://doi.org/10.1016/0749-

596x(85)90019-1

Borsboom, D., Mellenbergh, G. J., & Van Heerden, J. (2003). The theoretical status of latent

variables. Psychological Review, 110(2), 203-219.

Bringmann, L. F., Elmer, T., Epskamp, S., Krause, R. W., Schoch, D., Wichers, M., ... &

Bringmann, L. (2018). What do centrality measures measure in psychological

networks? ResearchGate preprint, November.

Brown, J., Bennett, J., & Hanna, G. (1980). The Nelson-Denny reading test. Boston: Houghton

Mifflin.

Brown, J., Fischo, V., & Hanna, G. (1993). The Nelson-Denny reading test. Boston: Houghton

Mifflin.

Cromley, J. G., & Azevedo, R. (2007). Testing and refining the direct and inferential mediation

model of reading comprehension. Journal of Educational Psychology, 99, 311.

https://doi.org/10.1037/0022-0663.99.2.311

Dablander, F., & Hinne, M. (2018). Node centrality measures are a poor substitute for causal inference. Scientific Reports, 1, 6846.

Daneman, M., & Carpenter, P. A. (1983). Individual differences in integrating information

between and within sentences. Journal of Experimental Psychology: Learning, Memory,

and Cognition, 9, 561. https://doi.org/10.1037//0278-7393.9.4.561



Daneman, M., & Merikle, P. M. (1996). Working memory and language comprehension: A

meta-analysis. Psychonomic Bulletin & Review, 3, 422-433.

https://doi.org/10.3758/bf03214546

Ekstrom, R. B., Dermen, D., & Harman, H. H. (1976). Manual for kit of factor-referenced

cognitive tests (Vol. 102). Princeton, NJ: Educational Testing Service.

https://doi.org/10.21236/ad0410915

Engle, R. W. (2001). What is working memory capacity? In H. L. Roediger III, J. S. Nairne, I.

Neath, & A. M. Surprenant (Eds.), Science conference series. The nature of

remembering: Essays in honor of Robert G. Crowder (pp. 297-314). Washington, DC,

US: American Psychological Association. https://dx.doi.org/10.1037/10394-016

Engle, R. W. (2002). Working memory capacity as executive attention. Current Directions in

Psychological Science, 11, 19 –23. https://doi.org/10.1111/1467-8721.00160

Epskamp, S., Borsboom, D., & Fried, E. I. (2018). Estimating psychological networks and their

accuracy: A tutorial paper. Behavior Research Methods, 50, 195-212.

https://doi.org/10.3758/s13428-017-0862-1

Epskamp, S., & Fried, E. I. (2018). A tutorial on regularized partial correlation

networks. Psychological Methods, 23, 617-634. http://dx.doi.org/10.1037/met0000167

Epskamp, S., Lunansky, G., Tio, P., & Borsboom, D. (2018, April). Recent developments on the

performance of graphical LASSO networks. [Blog post]. Retrieved from

http://psychosystems.org/author/sachaepskamp

Ferreira, F., & Clifton Jr, C. (1986). The independence of syntactic processing. Journal of

Memory and Language, 25, 348-368. https://doi.org/10.1016/0749-596x(86)90006-9



Follmer, D. J. (2018). Executive function and reading comprehension: A meta-analytic

review. Educational Psychologist, 53(1), 42-60.

Freed, E. M., Hamilton, S. T., & Long, D. L. (2017). Comprehension in proficient readers: The

nature of individual variation. Journal of Memory and Language, 97, 135-153.

https://doi.org/10.1016/j.jml.2017.07.008

Friedman, J. H., Hastie, T., & Tibshirani, R. (2008). Sparse inverse covariance estimation with

the graphical lasso. Biostatistics, 9, 432-441. doi:10.1093/biostatistics/kxm045.

Just, M. A., & Carpenter, P. A. (1992). A capacity theory of comprehension: individual

differences in working memory. Psychological Review, 99, 122.

https://doi.org/10.1037/0033-295x.99.1.122

Kane, M. J., Conway, A. R., Hambrick, D. Z., & Engle, R. W. (2007). Variation in working

memory capacity as variation in executive attention and control. Variation in Working

Memory, 1, 21-48. https://doi.org/10.1093/acprof:oso/9780195168648.003.0002

Kane, M. J., & Engle, R. W. (2003). Working-memory capacity and the control of attention: The

contributions of goal neglect, response competition, and task set to Stroop interference.

Journal of Experimental Psychology: General, 132, 47. https://doi.org/10.1037/0096-

3445.132.1.47

King, J., & Just, M. A. (1991). Individual differences in syntactic processing: The role of

working memory. Journal of Memory and Language, 30, 580-602.

https://doi.org/10.1016/0749-596x(91)90027-h

Kline, R. B. (2005). Principles and practice of structural equation modeling. New York, NY: Guilford.

MacDonald, M. C., & Christiansen, M. H. (2002). Reassessing working memory: Comment on

Just and Carpenter (1992) and Waters and Caplan (1996). Psychological Review, 109, 35-

54. https://doi.org/10.1037//0033-295x.109.1.35

MacDonald, M. C., Just, M. A., & Carpenter, P. A. (1992). Working memory constraints on the

processing of syntactic ambiguity. Cognitive Psychology, 24, 56-98.

https://doi.org/10.1016/0010-0285(92)90003-k

Mason, R. A., & Just, M. A. (2007). Lexical ambiguity in sentence comprehension. Brain

Research, 1146, 115-127. https://doi.org/10.1016/j.brainres.2007.02.076

McNally, R. J., Robinaugh, D. J., Wu, G. W. Y., Wang, L., Deserno, M., & Borsboom, D. (2015).

Mental disorders as causal systems: A network approach to post-traumatic stress disorder.

Clinical Psychological Science, 3, 836-849. https://doi.org/10.1177/2167702614553230

McNally, R. J., Mair, P., Mugno, B., & Riemann, B. C. (2017). Co-morbid obsessive-compulsive

disorder and depression: A Bayesian network approach. Psychological Medicine, 47,

1204-1214. https://doi.org/10.1017/S0033291716003287

McVay, J. C., & Kane, M. J. (2012). Why does working memory capacity predict variation in

reading comprehension? On the influence of mind wandering and executive attention.

Journal of Experimental Psychology: General, 141, 302.

https://doi.org/10.1037/a0039137

Raven, J. C. (1962). Advanced progressive matrices. London: HK Lewis.

Ready, R. E., Chaudhry, M. F., Schatz, K. C., & Strazzullo, S. (2013). “Passageless”

administration of the Nelson–Denny Reading Comprehension Test: Associations with IQ

and reading skills. Journal of Learning Disabilities, 46, 377-384.



Rosseel, Y. (2012). Lavaan: An R package for structural equation modeling and more. Version

0.5–12 (BETA). Journal of Statistical Software, 48(2), 1-36. Retrieved from

http://www.jstatsoft.org/v48/i02/.

Scales, A. M., & Rhee, O. (2001). Adult reading habits and patterns. Reading Psychology, 22,

175–203. https://doi.org/10.1080/027027101753170610

Schreiber, J. B., Nora, A., Stage, F. K., Barlow, E. A., & King, J. (2006). Reporting structural

equation modeling and confirmatory factor analysis results: A review. The Journal of

Educational Research, 99(6), 323-338. https://doi.org/10.3200/JOER.99.6.323-338

Seidenberg, M. S. (1985). The time course of phonological code activation in two writing

systems. Cognition, 19, 1-30. https://doi.org/10.1016/0010-0277(85)90029-0

Stanovich, K. E., & West, R. F. (1989). Exposure to print and orthographic processing. Reading

Research Quarterly, 24, 402-433. https://doi.org/10.2307/747605

Unsworth, N., & McMillan, B. D. (2013). Mind wandering and reading comprehension:

Examining the roles of working memory capacity, interest, motivation, and topic

experience. Journal of Experimental Psychology: Learning, Memory, and Cognition, 39,

832-842.

van der Maas, H. L. J., Kan, K.-J., Marsman, M., & Stevenson, C. E. (2017). Network models for

cognitive development and intelligence. Journal of Intelligence, 5, 16.

https://doi.org/10.3390/jintelligence5020016

Verhoeven, L., & Van Leeuwe, J. (2008). Prediction of the development of reading

comprehension: A longitudinal study. Applied Cognitive Psychology, 22, 407-423.

https://doi.org/10.1002/acp.1414

Wells, J. B., Christiansen, M. H., Race, D. S., Acheson, D. J., & MacDonald, M. C. (2009).

Experience and sentence processing: Statistical learning and relative clause

comprehension. Cognitive Psychology, 58, 250-271.

https://doi.org/10.1016/j.cogpsych.2008.08.002

Zahler, D., & Zahler, K. (2003). Test prep your IQ cultural literacy (1st ed.). Lawrenceville, NJ:

Peterson’s.

Figure 1. Measurement model from Freed et al. (2017), with standardized path values obtained from the current analysis. This includes 7 exogenous latent variables representing
individual difference measures (Inhibition not a true factor) and the latent variable representing
the endogenous variable, Comprehension. Although Comprehension was not included in the
CFA, it is entered here to show indicator loadings. See Table 1 for indicator variable names.

Table 1

Bivariate Correlations for Individual Difference and Comprehension Measures (From Freed et al. 2017).

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
1. Orthographic Decision (OrD)
2. Phonological Decision (PhD) .52
3. Non-word Naming (NnN) .26 .20
4. Phoneme Transposition† -.19 -.23 -.10
5. Reading Span (RdS) .05 -.07 -.12 .17
6. Alphabet Span (AlS) -.14 -.11 -.09 .14 .45
7. Minus Span (MnS) -.12 -.10 .00 .23 .41 .27
8. Visual Number Span (VNS) -.03 .10 -.02 .13 .21 .23 .33
9. Go-No Go† -.17 -.09 -.15 .07 -.11 -.06 .12 -.00
10. Stroop Interference (StI) .05 .13 .12 -.09 .01 .02 .00 -.03 -.10
11. Author Recognition Test (ART) -.18 -.22 -.08 .31 .12 .25 .08 .02 -.06 .00
12. Reading Questionnaire (RdQ) .09 .03 -.00 .20 .13 -.00 .04 .04 -.02 .07 .35
13. Cultural Intelligence (ClI) -.17 -.32 -.21 .28 .26 .28 .16 -.03 .04 -.10 .51 .36
14. Scientist Recognition Test (SRT) -.10 -.18 -.07 .10 .15 .20 .10 -.01 -.03 -.08 .36 .17 .48
15. Extended Range Vocabulary (ERV) -.14 -.25 -.16 .26 .29 .31 .14 -.01 -.05 -.06 .58 .26 .64 .43
16. Advanced Vocabulary (AdV) -.09 -.20 -.13 .22 .19 .25 .05 -.01 -.05 .15 .46 .28 .57 .36 .67
17. Nelson-Denny Vocabulary (NDV) -.22 -.33 -.21 .35 .27 .32 .13 -.08 -.06 -.15 .58 .37 .71 .37 .73 .61
18. Raven's Progressive Matrices (RPM) -.05 .07 .01 .19 .24 .32 .33 .31 -.08 .04 .14 -.02 .12 .14 .22 .14 .18
19. Arithmetic Aptitude Test (AAT) -.16 -.06 -.11 .21 .28 .31 .26 .22 -.02 -.11 .13 -.10 .27 .19 .29 .24 .29 .29
20. Mathematic Aptitude Test (MAT) -.11 -.06 -.09 .21 .22 .26 .29 .16 .03 -.02 .16 -.02 .39 .21 .35 .26 .32 .28 .71
21. Necessary Arithmetic Operations (NAO) -.05 -.09 -.11 .25 .31 .33 .32 .11 -.02 -.08 .24 .03 .37 .17 .37 .28 .42 .29 .63 .61
22. Letter Comparisons (LtC) -.16 -.13 -.10 .13 .10 .12 .09 .06 .01 -.05 .14 .12 .14 .02 .17 .14 .10 .00 .21 .19 .21
23. Finding As (FnA) -.19 -.13 -.06 .17 .05 .08 .13 .11 .00 -.17 .16 .11 .13 .00 .12 .06 .17 .03 .13 .15 .18 .33
24. Number Comparison (NmC) -.26 -.05 -.14 .07 .01 .04 .15 .16 .04 -.11 .03 -.08 -.05 -.05 -.12 -.04 -.05 .04 .25 .21 .17 .23 .38
25. Pattern Comparison† -.15 -.07 -.05 .06 .12 .04 .08 .03 -.04 -.04 .15 -.06 .12 .14 .18 .12 .15 .06 .03 .09 .11 .00 .11 .04
26. Identical Pictures (IdP) -.16 -.15 -.06 .06 .08 .13 .10 .17 .08 .02 .15 .04 .22 .05 .17 .14 .26 .12 .17 .20 .22 .18 .26 .20 .08
27. Word Beginnings (WrB) -.24 -.31 -.11 .22 .18 .31 .20 .11 -.05 .16 .24 .06 .33 .18 .36 .32 .42 .08 .25 .26 .24 .17 .27 .12 .17 .19
28. Word Endings (WrE) -.24 -.30 .00 .09 .15 .33 .15 .09 -.08 -.02 .23 -.04 .23 .13 .28 .24 .29 .16 .16 .20 .27 .11 .18 .14 .12 .20 .48
29. Word Beginnings and Endings (WBE) -.25 -.29 -.13 .18 .19 .28 .14 .09 -.04 -.03 .26 -.06 .29 .21 .37 .28 .39 .15 .24 .22 .22 .01 .16 .12 .18 .16 .44 .44
30. Nelson-Denny Comprehension (NDC) -.30 -0.34 -.19 .29 .17 .16 .18 -.03 .04 -.22 .36 .13 .49 .22 .36 .38 .55 .24 .27 .31 .38 .18 .18 .24 .08 .16 .26 .20 .21
31. Investigator-Generated Comprehension (IGC) -.06 -.22 -.11 .26 .35 .34 .27 -.04 -.05 -.04 .37 .21 .50 .31 .59 .47 .57 .27 .29 .35 .36 .10 .12 -.03 .08 .13 .29 .19 .25 .48
M 762.64 2659.04 1053.33 0.88 0.64 0.73 0.83 0.35 0.14 96.98 0.18 3.58 0.57 0.25 0.32 0.35 0.87 0.72 0.50 0.37 0.52 0.19 0.31 0.28 0.96 0.71 26.36 31.50 19.23 0.79 0.08
SD 141.63 1106.41 484.23 0.13 0.14 0.12 0.14 0.07 0.14 102.89 0.10 0.96 0.14 0.11 0.15 0.14 0.08 0.13 0.19 0.15 0.18 0.05 0.08 0.06 0.06 0.14 7.92 8.13 6.94 0.16 0.09
N 261 252 243 224 285 284 281 278 256 245 279 330 262 279 278 270 283 280 278 261 274 274 274 275 262 260 276 262 277 263 303
Note. Significant correlations (p < .05) are shown in bold. † = Measures removed by Freed et al.

Table 2

Model Fit Statistics for Latent Variable and Network Models of Reading Comprehension

Data.

Statistics

Models               χ²        df    CFI (TLI)     RMSEA    AIC         BIC         SRMR
CFA¹                 717.49    279   .85 (.83)     .07      23104.72    23481.67    .06
SEM: Model 1²        905.49    333   .83 (.81)     .07      24764.05    25152.52    .07
SEM: Model 2³        643.14    239   .85 (.82)     .07      21426.92    21753.86    .07
SEM: Model 3⁴        881.75    323   .84 (.81)     .07      24760.29    25187.25    .07
Network Model 1⁵     596.76    299   .91 (.89)     .05      24564.41    24539.44    .06
Network Model 2⁶     193.70    175   .99 (.99)     .02      21064.96    21006.05    .03
Network Model 3⁷     8.94      24    1.00 (1.02)   < .001   8609.17     8623.75     .01

Note. ¹ Figure 1. ² Figure 2. ³ Figure 3. ⁴ Figure 4. ⁵ Figure 5. ⁶ Figure 8. ⁷ Figure 9. df = degrees of
freedom; CFI = Comparative fit index; TLI = Tucker-Lewis index; RMSEA = Root mean square
error of approximation; SRMR = Standardized root mean square residual; AIC = Akaike
information criterion; BIC = sample-size-adjusted Bayesian information criterion.

Figure 2. SEM Model 1 (Model 1 from Freed et al. 2017) containing 7 predictor variables
representing latent individual difference variables, and the outcome variable representing
Reading Comprehension. Path values are standardized and obtained from the current analysis.

Figure 3. SEM Model 2 (Model 3 from Freed et al. 2017) containing 6 predictor variables
(Reasoning deleted from model) representing latent individual difference variables, and the
outcome variable representing Reading Comprehension. Path values are standardized and
obtained from the current analysis.

Figure 4. SEM Model 3 (Current model, not presented in Freed et al. 2017) containing 7
predictor variables representing latent individual difference variables, and the outcome variable
representing Reading Comprehension. Dotted / dashed lines indicate a non-significant path.

Figure 5. Network Model 1 of Reading Comprehension (Undirected; γ = .50, λ = .01). Node groups displayed in the figure: Reasoning, WMC, Decoding, Fluency, Inhibition, Reading Comprehension, Perceptual Speed, and Language Experience.



Figure 6. Redundancy plots, with the clustering coefficient on the x-axis and centrality indices (betweenness, strength, and closeness) on the y-axis. The boxes indicate measures that were removed for being either redundant (RQ) or irrelevant (SI).

Figure 7. Redundancy plots, with the clustering coefficient on the x-axis and centrality indices (betweenness, strength, and closeness) on the y-axis. The boxes indicate measures that were removed for being redundant (ART, AV).

Figure 8. Network Model 2 of Reading Comprehension (Redundant nodes removed; undirected; γ = .50, λ = .01). Node groups displayed in the figure: Language Experience, Fluency, Reading Comprehension, Decoding, WMC, Perceptual Speed, and Reasoning.

Figure 9. Network Model 3, containing only the Reading Comprehension (blue; NDC: Nelson-Denny Comprehension, IGC: Investigator-Generated Comprehension), WMC (pink; VNS: Visual Number Span, MnS: Minus Span, RdS: Reading Span, AlS: Alphabet Span), and Language Experience (green; ClI: Cultural Intelligence, NDV: Nelson-Denny Vocabulary, ERV: Extended Range Vocabulary, SRT: Scientist Recognition Test) nodes (γ = .50, λ = .01).
