
Psychotherapy Research

ISSN: (Print) (Online) Journal homepage: https://www.tandfonline.com/loi/tpsr20

Can cognitive inflexibility reduce symptoms of anxiety and depression? Promoting the structural nested mean model in psychotherapy research

Henrik Børsting Jacobsen, Peter Solvoll Lyby, Thomas Johansen, Silje Endresen Reme & Ole Klungsøyr

To cite this article: Henrik Børsting Jacobsen, Peter Solvoll Lyby, Thomas Johansen, Silje
Endresen Reme & Ole Klungsøyr (2023) Can cognitive inflexibility reduce symptoms of anxiety
and depression? Promoting the structural nested mean model in psychotherapy research,
Psychotherapy Research, 33:8, 1096-1116, DOI: 10.1080/10503307.2023.2221808

To link to this article: https://doi.org/10.1080/10503307.2023.2221808

© 2023 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group

Published online: 11 Jul 2023.

Full Terms & Conditions of access and use can be found at https://www.tandfonline.com/action/journalInformation?journalCode=tpsr20
Psychotherapy Research, 2023
Vol. 33, No. 8, 1096–1116, https://doi.org/10.1080/10503307.2023.2221808

RESEARCH ARTICLE

Can cognitive inflexibility reduce symptoms of anxiety and depression? Promoting the structural nested mean model in psychotherapy research

HENRIK BØRSTING JACOBSEN¹,²*, PETER SOLVOLL LYBY¹, THOMAS JOHANSEN³, SILJE ENDRESEN REME², & OLE KLUNGSØYR⁴,⁵*

¹CatoSenteret Rehabilitation Center, Son, Norway; ²The Mind-Body Lab, Department of Psychology, University of Oslo, Oslo, Norway; ³Norwegian National Advisory Unit on Occupational Rehabilitation, Rauland, Norway; ⁴Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway & ⁵Department for Research and Innovation, Division of Mental Health and Addiction, Oslo Centre for Biostatistics and Epidemiology, Oslo, Norway
(Received 29 June 2022; revised 30 May 2023; accepted 31 May 2023)

ABSTRACT
Objective: To estimate the causal effect of executive functioning on the remission of depression and anxiety symptoms in an observational dataset from a vocational rehabilitation program. A further aim is to promote a method from the causal inference literature and to illustrate its value in this setting.
Method: With longitudinal data (four time points over 13 months) from four independent sites, we compiled a dataset with 390 participants. At each time point, participants were tested on executive function and self-reported symptoms of anxiety and depression. We used g-estimation to evaluate whether objectively tested cognitive flexibility affected depressive/anxious symptoms and tested for moderation. Multiple imputation was used to handle missing data.
Results: The g-estimation showed a strong causal effect of cognitive inflexibility in reducing depression and anxiety, modified by education level. In a counterfactual framework, a hypothetical intervention that could lower cognitive flexibility seemed to cause improvement in mental distress at the subsequent time point (negative sign) for low education. The less flexibility, the larger the improvement. For high education, the same but weaker effect was found, with a change in sign: negative during the intervention and positive during follow-up.
Discussion: An unexpected and strong effect of cognitive inflexibility on symptom improvement was found. This study demonstrates how to estimate causal psychological effects with standard software in an observational dataset with substantial missing data and shows the value of such methods.

Keywords: executive functioning; anxiety; depression; cognitive flexibility

Clinical and methodological significance of this article: The results in the current study challenge the scientific consensus that higher cognitive flexibility leads to improvement in symptoms of depression and anxiety. Estimating causal effects with a structural nested mean model with g-estimation yields unbiased effect estimates when conventional methods are likely to fail.

HBJ and OK contributed equally to the manuscript.
Correspondence concerning this article should be addressed to Henrik Børsting Jacobsen, The Mind-Body Lab, Department of Psychology, University of Oslo, Oslo, Norway. Email: henrbors@uio.no
This article has been corrected with minor changes. These changes do not impact the academic content of the article.

© 2023 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The terms on which this article has been published allow the posting of the Accepted Manuscript in a repository by the author(s) or with their consent.

Executive functions (EFs) are an umbrella term for top-down regulation central to our emotional, cognitive, and behavioral control (Diamond, 2013; Miyake & Friedman, 2012). They are generally thought to encompass inhibitory control and interference control, working memory, and cognitive flexibility
(Miyake & Friedman, 2012). EFs can uniquely predict fundamental aspects of adult life such as complex reasoning, literacy, and education from as early as 2 years of age (Mulder et al., 2017), and are closely linked to general functioning and work life (Huizinga et al., 2018). Recent trials of multidisciplinary rehabilitation indicate an important role of EF capacity in return-to-work, quality of life, and overall functioning (Aasvik et al., 2015; Aasvik et al., 2017; Jacobsen et al., 2016; Johansen et al., 2019), but the “why” and the “how” of this relationship remain unclear.

A viable candidate for illuminating this relationship is understanding the impact of symptoms of anxiety and depression on EFs. A large body of research associates mood disorders with poor EFs (Snyder, 2013; Snyder et al., 2015), and EF capacity in the form of indecisiveness and distractibility has been included in the DSM-5 classification of mood disorders (First et al., 2021). Several influential psychological models echo the DSM by proposing EF deficits as early signs of depression and anxiety (Fresco et al., 2007; Hayes et al., 2006; Nolen-Hoeksema, 2011; Price & Duman, 2020; Teasdale et al., 2002; Wells, 2011; Wells & Matthews, 1996). As such, deficits in EFs are often conceptualized as transdiagnostic risk factors for developing mood disorders (Liu et al., 2021; Snyder et al., 2015).

However, while EFs have been shown to prospectively predict mood disorders (Caspi et al., 2020; Letkiewicz et al., 2014; Mac Giollabhui et al., 2019), and cognitive inflexibility appears particularly salient (MacPherson et al., 2021; Stange et al., 2016; Stange et al., 2017), the relationship is not straightforward (Fu et al., 2021). The nature and role of EF deficits vary in psychopathology, as well as within different subclassifications of mood disorders (Snyder et al., 2015). When comparing patients with depression to healthy controls, systematic reviews and meta-analyses observe deficits across attention, executive and memory measures, with no evidence of specific domains being affected more than others (Rock et al., 2014). To further complicate the picture, EFs can also deteriorate because of mood disorders, and symptom remission has been documented to happen independently of improvements in EFs (Porter et al., 2015; Torrent et al., 2012).

Moreover, there are only a handful of studies looking at the relationship between treatment effects, symptom remission, and EF deficits, and these show conflicting results (Porter et al., 2015). In bipolar patients, poor performance on verbal recall has been associated with symptom severity after cognitive–behavioral group therapy (Sachs et al., 2020). On the other hand, following antidepressant therapy, improvements in depression and anxiety can happen without improvements in EFs (Devanand et al., 2003; Levkovitz et al., 2002). Moreover, such treatment effects are strongly modified by age and years of education (Gudayol-Ferré et al., 2021). In summary, it is not yet established whether EF deficits drive, predict, or are caused by psychopathology, or if improvements in EFs from therapies depend on age or level of education (Gudayol-Ferré et al., 2021). Although our knowledge of EFs in mood disorders has grown, a major drawback is that the prospective role of EF deficits in mood disorders is poorly understood (Fu et al., 2021).

A formal framework that provides technical notation to conceptualize causation can give insight into this relationship. Such a framework is that of “potential outcomes” or “counterfactuals”, employed in a wide variety of disciplines to assess causality. If an outcome would have differed had some exposure or action been other than it was, then the exposure or action can be said to cause the outcome. In statistics, a formal notation for the counterfactual approach was described by Neyman (1923), and later developed and extended to observational studies by Rubin (1974, 1978). Robins (1986) extended it further to include multiple and time-varying exposures, and it was related to a graphical representation of causality by Spirtes et al. (1993) and Pearl (1995, 2009).

An Introduction to Robins’ g-methods

Even in the perfectly designed randomized controlled trial (RCT), the treatment comparison can be a biased estimate of the “true” treatment effect due to non-adherence or differential loss to follow-up. Moreover, conducting an RCT is often not feasible, leaving the causal question unanswered. The overwhelming number of research questions of a causal nature in the health sciences has inspired the development of methods for causal inference for any study design to come as close as possible to the “true” causal effect, with as few and mild assumptions as possible. In epidemiology, causal questions have been assessed from observational studies as RCTs are hard to come by, and progress in methodology has benefitted other fields. Identification and estimation of causal effects, including the underlying assumptions (e.g., for the effect of a time-varying exposure, treatment, or action on an outcome), have been studied in detail with a variety of applications (Hernán & Robins, 2020), but might still need justification in psychological research. From here on, the variable of which the causal effect is of interest is called the “action” variable, to represent the specific action itself or a hypothetical intervention that could change the action variable.
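
To make the counterfactual definition concrete, the following is a minimal simulation sketch and not part of the study. It assumes a hypothetical binary action `a`, a single hypothetical confounder `u`, and a true additive effect of 2.0; every name and number is an illustrative assumption. Because both potential outcomes are generated for each simulated person, the average causal effect can be compared with the naive difference in observed means.

```python
import random


def simulate(n=20000, true_effect=2.0, seed=1):
    """Simulate potential outcomes with one confounder (all values hypothetical)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)                      # confounder: affects action and outcome
        y0 = 3.0 * u + rng.gauss(0.0, 1.0)           # potential outcome without the action
        y1 = y0 + true_effect                        # potential outcome with the action
        a = 1 if u + rng.gauss(0.0, 1.0) > 0 else 0  # action is more likely when u is high
        y = y1 if a == 1 else y0                     # consistency: one outcome is observed
        rows.append((a, y, y0, y1))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


rows = simulate()
# Average causal effect: contrast of potential outcomes over the whole population.
causal_effect = mean(y1 - y0 for _, _, y0, y1 in rows)
# Naive contrast: observed outcomes of those who acted versus those who did not.
naive_difference = (mean(y for a, y, _, _ in rows if a == 1)
                    - mean(y for a, y, _, _ in rows if a == 0))
print(round(causal_effect, 2), round(naive_difference, 2))
```

In this sketch the naive difference is inflated because `u` raises both the chance of acting and the outcome, which is the kind of bias the methods introduced in this section are designed to remove.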

Figure 1. Causal diagram (DAG) representing the relation between a time-varying exposure ($A_0$, $A_1$), a confounder ($L_1$) for the relation between $A_1$ and $Y$ and an unmeasured confounder for the relation between $L_1$ and $Y$.

In observational studies of a time-varying action, time-varying confounding is always a potential challenge. This is commonly illustrated in epidemiology with a causal diagram, a so-called causal directed acyclic graph (causal DAG) (Pearl, 1995). In a causal DAG, with arrows representing direct causal effects, a statistical association between an action variable and an outcome can only be produced by the following three causal structures: (i) a true cause and effect, (ii) common cause (confounding): if the action and the outcome share a common cause that is not adjusted for, and (iii) common effects (selection/collider bias): if the action and the outcome have a common effect, or a descendant of a common effect, that is adjusted for.

If the time-varying confounder is itself affected by earlier actions, conventional methods for confounder control (by stratification/adjustment in a regression) are flawed (Figure 1). With interest in the joint effect of $A_0$ and $A_1$ on $Y$ (Daniel et al., 2013), adjusting for $L_1$ in a standard regression model would block part of the effect of $A_0$ (through $A_1$) and potentially lead to bias of the effect of $A_0$ through the unmeasured confounder $U$ for the $L_1$–$Y$ relationship ($L_1$ becomes a collider).

Robins’ generalized methods (g-methods) were developed for identification and estimation of such effects. They form a family of methods that includes the g-formula (g-computation), g-estimation for Structural Nested Models (SNMs) (Robins et al., 1992; Robins, 1994, 1997; Vansteelandt & Joffe, 2014; Vansteelandt & Sjolander, 2016), and inverse probability weighting (IPW) for Marginal Structural Models (MSMs) (Robins et al., 1999). In the counterfactual framework, a marginal causal effect is defined by the theoretical contrast between the mean outcome if everybody in the population was exposed to the action at both timepoints and the mean outcome if nobody was exposed to the action at either timepoint. Assumptions on which the g-methods rely form the link between this hypothetical effect and the observations:

1. “Counterfactual consistency” allows us to equate the observed outcome to the counterfactual outcome that would be observed under the observed action level. This equality should hold, regardless of the nature of the assignment mechanism (e.g. whether it was randomized or observational). If, for example, there are multiple versions of the action (let’s say, therapy with different content) which might give rise to different outcomes depending on which version is administered, this renders the counterfactual outcome not well defined, and consistency is violated (VanderWeele & Hernán, 2013).
2. “Sequential conditional exchangeability” is the same as “no unmeasured confounding” in the time-varying case. It implies that the counterfactual outcomes are independent of the observed action levels, conditional on past covariates and action levels (in Figure 1 it corresponds to no unmeasured variables that affect $A_0$ and $Y$, or $A_1$ and $Y$). From here on, the term “no unmeasured confounding” is used and is meant for every time point.
3. “Positivity” is met when there are individuals in all levels of the action variable, within all confounder and prior action levels (not needed for g-estimation). This implies well-defined and positive probabilities for each action level.

Under these assumptions, the hypothetical observational study in Figure 1 equals a sequentially randomized trial in which the action level was randomized at baseline and randomized again at time 1 with probability depending on $L_1$ (Naimi et al., 2017). Of the three g-methods, IPW estimation for MSMs is by far the most popular method (and newest). It appears much simpler than the two others, and is easy to implement in standard software. However, the less known method of g-estimation outperforms IPW in several ways: it is almost always more efficient, more robust to unmeasured confounding, particularly well-suited for a continuous action variable, can accommodate moderation by time-varying covariates, and does not require the positivity assumption (Vansteelandt & Sjolander, 2016). G-computation is even more efficient, but highly computer-intensive with many parametric assumptions (distributional assumptions for confounders etcetera) and also cannot handle moderation by time-varying covariates.

Structural nested mean models (SNMMs) solve the problems in conventional confounder control (stratification/adjustment in regression) by avoiding the adjustment for post-action variables (adjustment for $L_1$ in Figure 1). This is achieved by modeling the outcome at each time conditional on the action variable and covariate history up to that
time, after having removed the effects of subsequent action variables to disentangle the unique contributions of each action at each time (Vansteelandt & Joffe, 2014). The parameters in the SNMMs are estimated with g-estimation, and “nested” refers to a conceptualization of longitudinal data as resulting from a nested series of trials (Picciotto & Neophytou, 2016). Starting with the last time-point (with no post-action variables), the action–outcome relation is estimated and adjusted for past actions and covariates. Iteration for each prior time-point after having “removed” the effect of the later action variable on the outcome ends with a counterfactual outcome for everyone being predicted under an “action-free” regime. Combined with the no unmeasured confounding assumption (for every time-point), the g-estimation finds the counterfactual outcome (with the action removed) that satisfies the independence assumption between this counterfactual and the observed action variable. This counterfactual is a function of the causal action effect through parameters in the SNMM, and inversion of this function yields the g-estimated parameters.

The limited popularity of g-estimation has largely been related to a lack of implementation in standard software. In 2016, Vansteelandt and Sjolander described how to obtain the g-estimator for a linear SNMM by combining separate regression models for the outcome and the action variable (Vansteelandt & Sjolander, 2016). If either of these models (but not both) is mis-specified, the g-estimator for the causal parameters in the SNMM is still consistent (provided the SNMM is also correct), a property which is called “double-robust.” This indirectly implies some protection against unmeasured confounding in the outcome model (one form of misspecification), also shared by the instrumental variable approach (Ertefaie et al., 2017; Joffe & Brensinger, 2003). It can be shown that the g-estimator is equivalent to an instrumental variable (two-stage least squares) estimator (Joffe & Brensinger, 2003; Naimi et al., 2017). Double-robustness means that consistent causal parameters in the SNMM are obtained even if the model for the action variable, or for the outcome (but not both), has some form of misspecification. This property can be exploited in the model fitting, with one model held constant while varying the other. Large variation in the causal parameter estimates indicates that the model which was held constant is wrong. In this way, double-robustness can guide model selection (Wallace et al., 2016).

More applications of g-estimation have been called for (Vansteelandt & Joffe, 2014; Vansteelandt & Sjolander, 2016). In two subfields of epidemiology there has been an increase in applications in recent years (Picciotto & Neophytou, 2016), which should easily translate to psychotherapy research and other mental health therapies. In pharmacoepidemiology, g-estimation has been applied to adjust for non-adherence and to determine optimal personalized strategies, which has increased interest in the area of dynamic treatment regimens (Tsiatis et al., 2020). A dynamic treatment regime is a sequence of interventions and is of interest in all personalized treatment of chronic diseases, including many mental disorders. In occupational epidemiology, g-estimation has been used to adjust for the so-called “healthy worker survivor bias,” which creates the illusion that an unhealthy exposure is protective, because the workers of poorer health are more likely to reduce their exposure. This is also seen in psychotherapy research and the mental health field, where patients with more severe symptoms tend to elicit more frequent and comprehensive treatment, thereby generating an inverse association between treatment and outcome. G-estimation is appropriate to adjust for such feedback.

Even though the g-estimation can be implemented in standard software, and the regression models for the action variable and for the outcome are flexible (within standard limitations for regression models), there is a lack of flexibility for link functions other than identity or log link (Vansteelandt & Sjolander, 2016). The implementation of log-linear SNMMs for counts and similar models for survival endpoints can be informed by the development of the linear SNMM (Vansteelandt & Sjolander, 2016). Based on semi-parametric theory (Tsiatis, 2020), Robins formulated the g-estimator as the solution of an unbiased estimating equation, without distributional assumptions (consistency under mild regularity conditions) (Robins, 1994).

Traditionally, causality in epidemiology and psychology has been assessed in structural equation models (SEMs). Even though some of the more recent causal inference methods have developed out of the SEM literature, e.g. causal graphs (DAGs), traditional SEMs have many limitations with respect to causal inference when, e.g., the effect of an action on an outcome, or mediation, is of interest (Rijnhart et al., 2021; VanderWeele, 2012). SEMs tend to make more assumptions, like linearity in functional form among covariates and multivariate normal distribution. In fact, in a regression model, internal relations among covariates are not modeled, only the relation between the action variable and outcome (covariates play the role of confounders or moderators). This means that bias from unmeasured confounding is more likely in a multivariate SEM, since any common cause of two variables in a causal graph (representing the SEM) also
must be included (VanderWeele, 2012). The usual assumption of uncorrelated errors in an SEM would be violated in Figure 1, and it therefore indirectly implies an assumption of no unmeasured confounding for the $A_0$–$Y$, $A_1$–$Y$, and $L_1$–$Y$ relationships (De Stavola et al., 2015).

One reason for the popularity of the SEM model in psychology is that it can be extended to account for measurement error, within restrictions of normality and uncorrelated errors. A realistic causal DAG is a general and useful way of depicting different types of measurement errors. It should simultaneously represent biases arising from confounding, selection (collider-bias) and measurement error, and thereby reveal how (if possible) to correct for these (Hernán & Robins, 2020). Sensitivity analysis techniques for direct and indirect effects in causal mediation have been developed, for unmeasured confounding as well as for the presence of measurement error (VanderWeele, 2015).

The main aim of this study is to demonstrate how the structural nested mean model with g-estimation can be used to disentangle causal pathways in an observational longitudinal dataset of mental health variables with some design limitations. The causal effect of a hypothetical intervention that could change cognitive flexibility on remission of depression and anxiety symptoms over a 12-month period is assessed. As EF capacity is strongly related to socio-demographics such as age and education, these were, in addition to coping style, considered as candidate moderators (VanderWeele, 2015). It was also an aim to see whether symptoms of depression and anxiety were associated with differences in baseline EFs or long-term return-to-work for our participants.

Methods

Setting

Individuals completing either inpatient or outpatient multidisciplinary occupational rehabilitation programs based on multimodal cognitive behavior therapy were recruited from four clinics, alongside healthy controls. The duration of the rehabilitation programs varied between the clinics from 3 to 12 weeks.

Participants

A total of 390 participants were recruited for this study. Of these, 317 were patients on partial or full sick leave who volunteered to take part in the study, and 73 were healthy volunteers who were currently working full time. A total of 187 participated in an inpatient occupational rehabilitation program and 93 in an outpatient program. Seventy-three workers in the control group who volunteered to take part were all working full time and had no sick leave during the testing period. They were recruited from the wider community and from employees of three rehabilitation clinics and included a wide selection of different blue- and white-collar workers. The two groups were matched for age, sex, and number of days between pre- and post-tests. Of participants in rehabilitation, 80% had an ICD-10 diagnosis either in the categories F, mental and behavioral disorders or M, diseases of the musculoskeletal system and connective tissue. Exclusion criteria for the rehabilitation and control group were a history of head injury or having applied for disability pension. The flow of participants is illustrated in Figure 2.

Design

This study had an explorative, non-randomized pre–post design. All participants in the study were assessed with cognitive and emotional tests and work and health questionnaires at four time points spanning a 13-month period and are included in the estimation of effects irrespective of treatment group.

Multidisciplinary Return-to-Work (RTW) Rehabilitation Program

The patients were referred to occupational rehabilitation by general practitioners or social security offices. The main aim of rehabilitation was RTW, and the programs lasted between 3 and 12 weeks. The patients were followed up by an interdisciplinary team including at least four of the following professionals: physician, physiotherapist, psychologist, work consultant, coach, nurse/psychiatric nurse, and sports pedagogue. The assessment of work ability, physical fitness, and current work and health situation was carried out to tailor rehabilitation efforts.

Materials

Computerized Cognitive and Emotional Tests (Exposure Variables)

Eight validated tests from the Cambridge Neuropsychological Test Automated Battery (CANTAB) were used to assess cognitive and emotional functioning on a touch screen. The order of the tests
was fully counterbalanced across participants at each testing session and within each group. All participants were introduced to the touch screen by way of a motor screening task performed before testing both at pre- and post-tests. This screening was performed to familiarize the participants with the touch screen and reduce as much as possible any initial apprehension before testing. A description of computerized cognitive and emotional tests is given in brief below.

Figure 2. Time course of cognitive flexibility ($IED_t$) and mental distress (depression/anxiety symptoms) by $HADD_t$, $HADA_t$ and $HADT_t$ (with $f(t)$ from Equation 1 included).

Rapid Visual Information Processing, Spatial Working Memory, Spatial Recognition Memory, Stockings of Cambridge (a version of the Tower of London task measuring executive planning), Emotion Recognition Task. All tests were administered on a touch-sensitive computer screen.

Cognitive Flexibility (Primary Action Variable)

Intra-Extra Dimensional Shift (IED) tests the participants’ cognitive flexibility and is a computerized version of the Wisconsin Card Sorting Test (Grant & Berg, 1993). It measures a person’s ability to acquire and reverse rules and taps into lower-order cognitive functions such as discriminating between visual stimuli and flexibly maintaining and changing an attentional set (Cambridge Cognition, n.d.). On the screen, four white boxes appear, and a stimulus is shown inside two of them. The stimuli are composed of color-filled shapes and white lines. The participant is asked to select a stimulus at random. Following a selection, the participant gets feedback on the answer: correct or incorrect. The task is to discover the rule determining which stimulus is correct. We chose IED total errors adjusted as our action variable. IED total errors is a summation of all errors made at the different stages. For those who fail a stage and are therefore not given a chance to attempt further stages, an additional sum of 25 is added for each stage not attempted. A lower score represents better performance (higher flexibility).

Depressive and Anxious Symptoms (Response Variable)

The 14-item Hospital Anxiety and Depression Scale (HADS) measures mental distress and is divided into an anxiety and a depression scale, each with 7 items (Zigmond & Snaith, 1983). Twenty-one is the maximum score on each scale. The cut-off for mental distress is usually set at 7 or above on the anxiety scale (HAD-A) and/or the depression scale (HAD-D) (Wu et al., 2021), and this is validated for a Norwegian population (Bjelland et al., 2002). In the current sample, the HADS showed satisfactory internal consistency (Cronbach’s α = 0.89) at baseline.
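
The two scoring rules described in this section can be sketched in a few lines. This is an illustrative sketch only, not the authors' scoring code: the function names, the input layout, the nine-stage task length used in the example, the 0 to 3 item scoring, and the reading of HADT as the sum of both subscales are all assumptions. The penalty of 25 per stage not attempted and the cut-off of 7 follow the text.

```python
def ied_total_errors_adjusted(errors_per_attempted_stage, n_stages, penalty=25):
    """Total errors over attempted stages plus a fixed penalty per stage not attempted."""
    attempted = len(errors_per_attempted_stage)
    if attempted > n_stages:
        raise ValueError("more attempted stages than stages in the task")
    return sum(errors_per_attempted_stage) + penalty * (n_stages - attempted)


def hads_scores(anxiety_items, depression_items, cutoff=7):
    """Sum the two 7-item subscales and flag distress at or above the cut-off on either."""
    if len(anxiety_items) != 7 or len(depression_items) != 7:
        raise ValueError("each HADS subscale has 7 items")
    hada, hadd = sum(anxiety_items), sum(depression_items)
    return {
        "HADA": hada,
        "HADD": hadd,
        "HADT": hada + hadd,  # assumed here to be the total across both subscales
        "distress": hada >= cutoff or hadd >= cutoff,
    }


# Hypothetical participant: four stages attempted in a task assumed to have nine stages.
ied = ied_total_errors_adjusted([0, 1, 2, 25], n_stages=9)
# Hypothetical item responses, each assumed to be scored 0 to 3.
hads = hads_scores([1, 1, 2, 1, 1, 1, 1], [0, 1, 0, 1, 0, 1, 0])
print(ied, hads)
```

Passing the number of stages explicitly keeps the sketch independent of any particular version of the task.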

Moderators, Time-Varying

Theoretically Originated Measure of the Cognitive Activation Theory of Stress (TOMCATS)

TOMCATS is designed to measure the concept of response outcome expectancies as defined in the Cognitive Activation Theory of Stress (CATS) [2, 18]. In this context, it was proposed as a moderator of the unmeasured effects of stress on executive functioning. Response outcome expectancies are thought to drive or limit sustained stress activation, thus potentially moderating the adverse effects of cortisol on brain areas involved in EFs, thus acting on depressive symptoms.

The inventory consists of three factors that represent the three response outcome expectancy dimensions of CATS: positive expectancy (one item), no expectancy (two items) and negative expectancy (three items). All items are rated on a four-point scale from “not true at all” to “completely true.” For this study, we only used the negative expectancy item that has been associated with depressive symptoms, namely an expectancy of hopelessness (TChope) and helplessness (TChelp). In the current sample, the TOMCATS showed satisfactory internal consistency (Cronbach’s α = 0.81) at baseline.

Non-linear Decreasing Magnitude Over Time

In treatment-effect trials, effects often decrease over time. In the present study, moderation from a non-linear function of time $f(t)$ with a similar shape as the outcome proved a good fit and is given by (similar to the Box–Cox transformation):

$$
f(t) =
\begin{cases}
(365^{\lambda} - t^{\lambda})/\lambda, & \lambda \neq 0 \\
\log\left(\dfrac{365}{t+1}\right), & \lambda = 0
\end{cases}
\qquad (1)
$$

where $\lambda > 0$ ($< 0$) yields a function that decreases less (more) rapidly than the negative $\log(t)$, and $\lambda = 1$ yields a linear function of time (in days).

Moderators, Time Constant

Education Levels

For this analysis, education was divided into two levels: low = up to high school (<13 years of education) and high = college and university degree (>13 years of education).

Statistical Analyses

Baseline demographics for selected variables were analyzed using means and standard deviations for continuous variables or percent for categorical variables. A comparison of average percent sick leave over 12 months and EFs at baseline was tested for education using independent sample t-tests.

Causal Effects

The counterfactual approach is used to assess the causality between cognitive flexibility and subsequent depressive/anxiety symptoms. In terms of the effect of an action on an outcome, the counterfactual outcome $Y(a)$ is the potentially unobserved outcome for the hypothetical action level $a$. In the following, random variables and their realizations are represented by upper- and lower-case letters, respectively.

A conditional average causal effect of a hypothetical intervention setting the action variable to $a$ versus 0 (0 can represent action-free or any reference value) among subjects with covariate value $l$ can be formulated by the linear structural mean model (SMM) (Robins, 1994; Vansteelandt & Joffe, 2014; Vansteelandt & Sjolander, 2016) as

$$
E(Y(a) - Y(0) \mid l) = \psi' z a,
$$

where $E(\cdot \mid \cdot)$ is the conditional population mean, $z$ is a covariate vector possibly depending on $l$, and $\psi$ is the vector of causal parameters of interest.

For example, with $l$ representing gender and $z = 1$, the additive action effect on an outcome from action level $a$ relative to no action is equal between genders. On the other hand, $z = l$ would describe different action effects between men and women.

With time-varying action variable $A_t$, vector of covariates $L_t$ and outcome $Y_t$, observations are made at time $t = 1, 2, \ldots$ with the history up until (and including) $t$ denoted by $\bar{A}_t$ and $\bar{L}_t$, for example $\bar{A}_2 = \{A_1, A_2\}$. $Y_s(\bar{a}_t, 0)$ is the counterfactual outcome at time $s = 1, 2, \ldots$, with $s > t$, for an action history equal to $\bar{a}_t$ up until time $t$ and zero thereafter. This construct facilitates assessments of a causal effect of a time-varying action on the following outcome as well as on all subsequent ones, formulated by the structural nested mean model (SNMM) (Robins, 1994):

$$
E(Y_s(\bar{a}_t, 0) - Y_s(\bar{a}_{t-1}, 0) \mid \bar{a}_{t-1}, \bar{l}_t) = \psi' z_{st} a_t
\qquad (2)
$$

The action effect posited in (2) represents the difference in outcome when the action level at time $t$ is set equal to $a_t$ and 0 thereafter, relative to when action level at time $t$ and onward is set equal to 0, conditional on action and covariate history. In other words, the contribution to the action effect from a specific time-point (so-called “blip”) when later contributions are
“removed.” This effect is identifiable and can be estimated from observed data under the usual causal assumptions of consistency and no unmeasured confounding (see Introduction).

The causal effects to be assessed in the present application are the effects of a hypothetical intervention that could change cognitive flexibility (IED_t) at time t, on subsequent depression and anxiety symptoms, by HADD_s, HADA_s, and HADT_s, s > t, and with potential effect-modifiers TChope_t and TChelp_t. Different causal hypotheses can easily be assessed with different choices of z_st in (2) (Appendix), although lack of power is a limitation with increased model complexity. A simple model with moderation from a time-fixed covariate and a non-linear function of time proved to have a good fit in the present application (with HADD_s as outcome), and is given by

$$ E\big(HADD_s(\overline{ied}_t, 0) - HADD_s(\overline{ied}_{t-1}, 0) \mid \overline{ied}_{t-1}, \bar{l}_t\big) = \psi' z_{st}\, ied_t = (\psi_0 + \psi_1 HighEdu + \psi_2 f(t))\, ied_t \tag{3} $$

for s = 2, 3, 4, s > t, and with f(t) from (1).

Following Vansteelandt and Sjolander, fitting the SNMM for the causal effect of the time-varying IED_t on HADD_s, HADA_s, and HADT_s, s > t, can be done in three steps, by combining ordinary regression models for IED_t, one model for each time-point (e.g. linear regression) and a longitudinal model for the outcome, including the SNMM terms, fitted and refitted (Appendix):

I. Regressing the action variable IED_t on its history, baseline, and preceding time-varying covariates and outcome for each, including non-linearities and interactions. With data in a “wide” format, these models could for example be fitted with the linear model command in the statistical software R (R Core Team, 2014): “lm(IEDt ~ . . .)” (OLS regression) for t = 1, 2, 3, with the fitted values (called “propensity scores”), saved and denoted P_1, P_2, and P_3 respectively.

II. Regressing the observed outcome, e.g., HADD_t, on covariates and moderators as well as the SNMM terms with previous action (1, HighEdu, f(t − 1)) ied_{t−1} and propensity score (1, HighEdu, f(t − 1)) p_{t−1} in a repeated measures regression model, by forming a dataset in “long” format and for example fitting the model with the GEE command in R: “geem((HADD2, HADD3, HADD4) ~ . . . )” (independence working correlation). The (1, HighEdu, f(t − 1)) ied_{t−1} coefficients are the preliminary SNMM parameter estimates ψ̂^(0) = (ψ̂_0^(0), ψ̂_1^(0), ψ̂_2^(0)) in (3).

III. Predicting $Y_s(\bar{A}_t, 0)$ for all t, s with s > t by means of ψ̂^(0) (removing subsequent effects from the observed outcome, see Appendix), denoted H_31, H_42, and H_41. The updated and improved ψ̂^(1) is found by a second independence GEE with these predictions as outcomes in an extended data set (long format), and the same covariates as above, for example fitted by the R command: “geem((HADD2, HADD3, HADD4, H31, H42, H41) ~ . . . )” and again with the SNMM parameter estimates ψ̂^(1) = (ψ̂_0^(1), ψ̂_1^(1), ψ̂_2^(1)) as the (1, HighEdu, f(t)) ied_t coefficients. Standard errors are estimated by bootstrap.

In the outcome model (steps II and III), f(t) was included as one component in l_t with λ = 0.2, 0.1, 0.4 giving good fit for HADD_t, HADA_t, and HADT_t respectively as a function of days (Figure 3).

The steps above describe how g-estimation can be achieved by the simple combination of standard regression models. To achieve an automatic model selection and identify the “best” model for each bootstrap sample (and thereby precise bootstrap standard errors), the OLS regression in step I can be replaced by, for example, an automatic lasso-regression algorithm (as in the present application where the R-package glmnet was used) (Friedman et al., 2010). Automatic model selection was also performed in the calculation of the censoring weights (adjusting for potential selection bias from loss to follow-up, see Appendix).

Even though the procedure is flexible (e.g. to include automatic model selection), the implementation requires some experience in statistical programming (see R-code for the present application in the supplementary file). Recent development in software has made g-estimation based on this method available, in the R-package gestools (De Stavola et al., 2015). This package takes as input the data set in long format and changes it to a convenient form, to ensure there exists a data entry for each person at each time point, generates lagged versions of the time-varying covariates, and covariate-histories (set to zero at the start). It performs g-estimation for different types of action variables and for continuous or dichotomous outcomes.

A causal graph (DAG) of the design is shown in Figure 3.
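The regression-based idea behind steps I and II can be sketched for a single time point. This is a minimal simulation under simplifying assumptions, not the authors' code: one action A, one covariate L, a constant blip (z = 1), and a hand-written least-squares helper `ols`; all variable names, the data-generating model and the true effect 0.3 are hypothetical. The propensity model contains a non-linear term (L squared) that the outcome regression omits, so an ordinary regression of Y on A and L is confounded, while adding the fitted propensity score as a covariate recovers the effect.

```python
import random


def ols(X, y):
    """Least-squares coefficients from the normal equations (Gauss-Jordan elimination)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    M = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c:
                factor = M[r][c] / M[c][c]
                M[r] = [a - factor * b for a, b in zip(M[r], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]


random.seed(1)
n = 5000
true_psi = 0.3  # hypothetical causal effect of A on Y

# Simulated data: L confounds A and Y, partly through L squared
L = [random.gauss(0, 1) for _ in range(n)]
A = [1 + 0.5 * l + 0.5 * l * l + random.gauss(0, 1) for l in L]
Y = [true_psi * a + 2 * l + 1.5 * l * l + random.gauss(0, 1) for a, l in zip(A, L)]

# Step I: regress the action on covariates (including a non-linearity), keep fitted values
b = ols([[1, l, l * l] for l in L], A)
P = [b[0] + b[1] * l + b[2] * l * l for l in L]

# Step II: regress the outcome on action, propensity score and covariate;
# the coefficient of A is the g-estimate of psi
psi_hat = ols([[1, a, p, l] for a, p, l in zip(A, P, L)], Y)[1]

# For comparison: ordinary regression without the propensity score (omits L squared)
naive = ols([[1, a, l] for a, l in zip(A, L)], Y)[1]

print(f"g-estimate: {psi_hat:.3f}, naive estimate: {naive:.3f}, truth: {true_psi}")
```

In the longitudinal setting described above, the same construction is repeated with one propensity model per time point, the interaction terms (1, HighEdu, f(t − 1)) multiplying both the action and the propensity score, and a refit in step III.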
Figure 3. Causal graph (DAG) of study design, with time-varying action variable, covariates and outcome. Directed arrows represent poss-
ible direct causal effects (red arrows are short-term effects being estimated), in a longitudinal design with 390 adults in an occupational reha-
bilitation program in Norway.
Note. At: action variable (IEDt), Lt: baseline and time-varying covariates, Yt: outcome (HADDt, HADAt, HADTt), Ct: censoring-indicator
(loss-to-follow up) – surrounding box represents conditioning (the outcome is observed only for those not lost-to-follow up)

Multiple Imputation (MI) to Handle Missing Values

Incomplete data were of concern for the outcomes (HADD_t, HADA_t, HADT_t), the action variable (IED_t), and different covariates. Missing values in an outcome were considered as loss-to-follow-up and the person was not allowed to re-enter. Potential selection bias from loss-to-follow-up was adjusted for by inverse probability of censoring weights (Appendix).

To address the issue of incomplete data in covariates, multiple imputation (MI) (Appendix) was performed under the assumption of missing at random (MAR), with the R-package mice (Buuren & Groothuis-Oudshoorn, 2011). The outcomes served as covariates both in the propensity score and outcome, steps 1, 2, and were imputed as missing covariate, but not as missing outcome. Out of 69 variables in the dataset, the covariates with missing values are shown in supplementary table 1.

Efficiency gain from MI was assessed by the fraction of missing information (FMI), approximated by FMI = r/(1 + r) (for a high number of imputations), where r is the relative increase in variance due to the missingness (Madley-Dowd et al., 2019; Schafer, 1999) (Appendix). The FMI is parameter specific and quantifies the loss of information due to missingness, while accounting for the amount of information retained by other variables (Madley-Dowd et al., 2019). A low number of imputations in MI is usually sufficient, for example with an FMI of 20%, 10 imputations correspond to a relative efficiency above 98% (Schafer, 1999).

With respect to standard errors in the g-estimation algorithm, the bootstrap is recommended (Vansteelandt & Sjolander, 2016). To account for uncertainty from incomplete data and to achieve unbiased standard errors from the g-estimation, the bootstrap was performed for each imputed dataset. Ten complete datasets were generated from the imputation algorithm, and for each complete dataset a bootstrap parameter estimate with the standard error was generated (500 resamples), and finally combined according to Rubin’s rules (Rubin, 1987). Model selection was carried out for every bootstrap sample, both in the censoring weights and propensity score models, to identify the best model. A large set of covariates was entered for each bootstrap sample, and the automatic lasso regression algorithm returned the best cross-validated model for that sample, in terms of minimum prediction error, while avoiding overfitting.

Results

Selected demographics and baseline covariates in the sample are shown in Table I and baseline scores on executive functioning for the whole sample are shown in supplementary table 2.

At baseline, those with higher education scored significantly better compared to those with low education on the Stockings of Cambridge, Emotional recognition, and Rapid visual processing, but not on cognitive flexibility (see Table II).
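The pooling and efficiency quantities used in the multiple-imputation section can be sketched as follows. This is an illustration only: the function names and the five example estimates with their standard errors are hypothetical, and the formulas are the standard ones (Rubin's rules for pooling, FMI approximated by r/(1 + r), and relative efficiency 1/(1 + FMI/m) for m imputations).

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Combine per-imputation estimates and standard errors with Rubin's rules."""
    m = len(estimates)
    q_bar = mean(estimates)                       # pooled point estimate
    within = mean([se ** 2 for se in std_errors])  # average within-imputation variance
    between = variance(estimates)                 # between-imputation variance (m - 1 divisor)
    total = within + (1 + 1 / m) * between        # total variance
    r = (1 + 1 / m) * between / within            # relative increase in variance
    fmi = r / (1 + r)                             # large-m approximation of the FMI
    return q_bar, math.sqrt(total), fmi


def relative_efficiency(fmi, m):
    """Efficiency of m imputations relative to infinitely many."""
    return 1 / (1 + fmi / m)


# Hypothetical per-imputation results for one coefficient
estimates = [0.041, 0.044, 0.043, 0.045, 0.042]
std_errors = [0.018, 0.018, 0.018, 0.018, 0.018]
pooled, pooled_se, fmi = pool_rubin(estimates, std_errors)
print(f"pooled = {pooled:.4f}, SE = {pooled_se:.4f}, FMI = {fmi:.4f}")

# The example quoted in the text: FMI of 20% with 10 imputations
print(f"relative efficiency = {relative_efficiency(0.20, 10):.4f}")
```

With an FMI of 0.20 and m = 10 the relative efficiency is 1/1.02, which is just above 98%.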
Table I. Selected demographics at baseline for all participants (n = 390).

Variable                                   Mean    SD
Age                                        45.2    9.8
HADS
  Anxiety (0–21; 0 = no anxiety)            7.7    4.5
  Depression (0–21; 0 = no depression)      5.7    4.0
  Total                                    13.2    7.9

Variable                                   n       %
Gender
  Female                                   267     69
  Male                                     123     34
Education
  High (13 years or more)                  189     50
  Low (up to or including 12 years)        189     50
Work status
  Not in work                              172     44
  Part-time work                           126     32
  Full-time work                            74     19

Note: SD, standard deviation; HADS, Hospital Anxiety and Depression Scale.

Results from g-Estimation

Time-course of the mean level of the action variable (IED_t) and outcomes (HADD_t, HADA_t, and HADT_t), stratified by low- versus high education (HighEdu_t), is shown in Figure 2. The overall feature in both groups was clear improvement during the intervention period (mean length = 28 days), represented by a decrease in IED_t (increased flexibility) and depression/anxiety symptoms (HADD_t, HADA_t, and HADT_t) followed by gradually flatter mean levels during follow-up. Also, a constant difference between the groups, in favor of the high-education group, was evident. Included in the plot is the function f(t) from Equation (1), used in the outcome model (steps 2 and 3) as one component of l_t, representing a good fit of the observed outcomes.

Different causal models (SNMMs) were fitted to assess long-term effects, group differences in short-term effects (interactions with the action variable or other covariates) and whether TChope_t or TChelp_t could play a role as moderators for IED_t’s influence on mental distress (Appendix). No such significant effects were found, but few repeated measures with limited observation time, and model complexity resulted in a lack of power for one or more of these tests.

The simpler SNMM from Equation (3), modeling a constant and short-term low- versus high education effect as well as a time-varying part, showed significant causal parameters for all outcomes. Final combination estimates from the 10 imputed datasets are shown in Table III. The effect of IED_t on HADD_t had a non-significant intercept (ψ_0), a significant difference between low- and high education (ψ̂_1 = 0.043, p = 0.016) and a significant time-

Table II. Baseline scores on executive functioning measures divided by high and low education levels.

Education low (n = 189) Education high (n = 189) Statistics

Mean SD Mean SD T (df)# p-value

Attention
Simple reaction time
Reaction time (ms) 256.2 56.5 246.9 41.4 1.813 (374) .071
Choice reaction time
Reaction time (ms) 324.4 68.2 313.7 50.3 1.722 (373) .086
Rapid visual information processing
Latency (ms) 414.2 93.8 395.5 74.2 2.132 (367) .034
Probability of hit 0.58 0.19 0.67 0.17 −4.461 (368) <.001
Memory
Spatial working memory
Total between errors 13.2 10.3 11.5 9.2 1.658 (374) .098
Spatial recognition memory
Response time (ms) 2796.3 1144.7 2583.7 991.2 1.928 (375) .055
Total correct (%) 80.4 10.0 80.7 10.7 –.272 (375) .786
Executive function
Stockings of Cambridge
Choice duration (ms) 4315.1 2389.5 4125.3 2120.2 .815 (374) .416
Total correct 8.7 2.1 9.2 1.9 −2.537 (375) .012
Intra-extra dimensional set shift
Total errors adjusted 26.2 24.5 22.0 19.2 1.855 (374) .064
Emotion recognition
Emotion recognition task
Total correct (%) 57.3 10.6 60.7 8.8 −3.400 (374) <.001
Table III. Estimated causal effects (SNMM coefficients) on mental distress by depressive symptoms (HADD_{t+1}), anxiety (HADA_{t+1}), and sum (HADT_{t+1}) (separate models), of hypothetical interventions on cognitive flexibility (IED_t), by g-estimation.

           HADD_{t+1}                            HADA_{t+1}                            HADT_{t+1}
           Estimate  95% CI            p-value   Estimate  95% CI            p-value   Estimate  95% CI            p-value
Int         0.002    −0.0021, 0.006    0.33       0.0035   −0.002, 0.009     0.2        0.0038   8e-6, 0.008       0.0495
HighEdu     0.043    0.008, 0.08       0.016      0.048    0.017, 0.08       0.003      0.12     0.04, 0.2         0.0026
f(t)       −0.005    −0.009, −0.001    0.016     −0.009    −0.015, −0.003    0.0036    −0.006    −0.01, −0.002     0.0022

Note. Parameter estimates refer to ψ_0, ψ_1, ψ_2 in the SNMM = (ψ_0 + ψ_1 HighEdu + ψ_2 f(t)) IED_t (Equation 3), where f(t) is the function in (1) with λ = 0.2, 0.1, 0.4 respectively.

varying part (ψ̂_2 = −0.005, p = 0.016). The effect of IED_t on HADA_t had similar characteristics with a non-significant intercept (ψ_0), a significant difference between low- and high education (ψ̂_1 = 0.048, p = 0.003) and a significant time-varying part (ψ̂_2 = −0.009, p = 0.0036). The effect of IED_t on the total score, HADT_t, had a significant intercept (ψ̂_0 = 0.0038, p = 0.05), a significant difference between low- and high education (ψ̂_1 = 0.12, p = 0.0026) and a significant time-varying part (ψ̂_2 = −0.006, p = 0.0022).

These estimates are inserted in the SNMM from Equation (3) (ψ_0 + ψ_1 HighEdu + ψ_2 f(t)) to give the coefficient for the short-term effect for each time point (Table IV). The causal effect had the same characteristics for all three outcomes.

For low education, the effect was negative overall, strongest during the intervention period and gradually weaker. In other words, a hypothetical intervention that could increase inflexibility seemed to cause improvement in mental distress at the subsequent time-point: the more inflexibility, the larger the improvement. For high education, a change in sign was estimated: the effect was negative during the intervention and positive during follow-up (for HADD_t and HADT_t immediately following the intervention, and for HADA_t the sign-change happened in the last period). In this group, a hypothetical intervention that could increase inflexibility seemed to improve mental distress during the intervention period (although weaker than for the low-education group). After the intervention period, the effect changed sign: a hypothetical intervention that could increase inflexibility had a negative influence on mental distress.

In terms of magnitude, a hypothetical intervention that could inflict inflexibility in the low-education group, equal to 26 points on the IED_t scale relative to no inflexibility (IED_t = 0) at Time1 (baseline), would yield a change of −0.053 × 26 = −1.37 (95% CI: −2.46, −0.27) points on the HADD_t scale at Time2 (improvement). A similar intervention at Time2 to change IED_t with 20 points relative to no inflexibility in the low-education group would yield a change of −0.033 × 20 = −0.65 (95% CI: −1.18, −0.13) points on the HADD_t scale at Time3, and finally to change IED_t with 19 points at Time3 relative to no inflexibility in the low-education group would yield a change of −0.02 × 19 = −0.38 (95% CI: −0.68, −0.08) points on the HADD_t scale at Time4. In other words, a total change in HADD_t of approximately −2.4 points would be expected for the low-education group. In contrast, the change in the high-education group would be −0.01 × 22 = −0.21 (95% CI: −1.14, 0.71) HADD_t points at Time2, 0.01 × 17 = 0.18 (95% CI: −0.27, 0.62) HADD_t points at Time3 and 0.023 × 15 = 0.35 (95% CI: 0.11, 0.59) HADD_t points at Time4, a total change of 0.31 HADD_t points.

Table IV. Time-varying estimated SNMM coefficients from Table III, for each time-point and level of education, describing short-term causal effects over time on mental distress by depressive symptoms (HADD_{t+1}), anxiety (HADA_{t+1}), and sum (HADT_{t+1}) (separate models) of hypothetical interventions on cognitive flexibility (IED_t).

          HADD_{t+1}                      HADA_{t+1}                      HADT_{t+1}
          Low education  High education   Low education  High education   Low education  High education
Time1     −0.055         −0.012           −0.065         −0.016           −0.14          −0.019
Time2     −0.035          0.008           −0.036         −0.012           −0.101          0.019
Time3     −0.021          0.022           −0.021          0.028           −0.067          0.054

Note. Estimates refer to ψ′z_{t+1,t} = (ψ_0 + ψ_1 HighEdu + ψ_2 f(t)) (time-varying coefficients in Equation 3), where f(t) is the function in (1) with λ = 0.2, 0.1, 0.4 respectively.
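How a time-varying coefficient is assembled from the estimates can be sketched in a few lines. The function names are illustrative, the coefficient values are the HADD estimates from Table III with λ = 0.2, and the days used in the loop (1, 30, 120) are hypothetical, since the exact measurement days are not stated in this section; the printed numbers are therefore not expected to reproduce Table IV.

```python
import math


def f(t, lam):
    """Non-linear function of time (in days) from Equation (1)."""
    if lam == 0:
        return math.log(365 / (t + 1))
    return (365 ** lam - t ** lam) / lam


def short_term_coefficient(psi, high_edu, t, lam):
    """psi_0 + psi_1 * HighEdu + psi_2 * f(t), the coefficient multiplying IED_t."""
    psi0, psi1, psi2 = psi
    return psi0 + psi1 * high_edu + psi2 * f(t, lam)


psi_hadd = (0.002, 0.043, -0.005)  # Int, HighEdu, f(t) estimates for HADD (Table III)
lam_hadd = 0.2

for day in (1, 30, 120):  # hypothetical days since baseline
    low = short_term_coefficient(psi_hadd, 0, day, lam_hadd)
    high = short_term_coefficient(psi_hadd, 1, day, lam_hadd)
    print(f"day {day:3d}: low education {low:+.3f}, high education {high:+.3f}")
```

Because f(t) decreases towards zero at t = 365 and the estimate of ψ_2 is negative, the coefficient for the low-education group moves from clearly negative towards ψ_0, and the high-education coefficient is always shifted upwards by ψ_1.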
Other dimensions of EF were tested similarly, without any significant effects. Dividing the significance level (0.05) by the number of tests to correct for multiple testing (a conservative Bonferroni correction) would still result in significant effects for HADA_t and HADD_t.

In the imputation, both prior and subsequent measurements for each missing value were allowed in the model. Stability in the effect estimates across the imputations indicated successful imputation. Estimated FMIs were between 1.6 and 4.3%, considerably smaller than the proportions of missing IED_t values during follow-up (except at baseline): 0.8%, 10.3% and 30.8%, reflecting that other variables succeeded in retaining information for the missing IED_t values.

Selection bias from loss-to-follow-up was assessed by comparison of results with and without censoring-weights (Appendix). Estimation without weights (without adjusting for loss-to-follow-up) showed an underestimation of the magnitude (in the direction of zero). Relative bias was largest for HADD_t, with 8.5% and 7.1% for ψ_1 and ψ_2 (significant coefficients), and smallest for HADA_t, with 0.2% and 1.1% for ψ_1 and ψ_2, respectively. The difference in bias was not explained by different samples.

Return-to-Work in Low vs. High Education

Testing whether those with low education differed with regard to work, an independent samples t-test indicated no significant difference in the percentage of sick leave post-treatment to 12 months after completing rehabilitation, t(175) = 0.2, p = .88.

Additional analyses were performed to check for baseline differences in cognitive flexibility and depressive and anxious symptoms in the RTW versus non-RTW groups. Here, depressive symptoms were associated with RTW (p = 0.04), but no substantial differences were detected for anxiety or cognitive flexibility (p = .13).

Discussion

Causal effects of cognitive inflexibility on improvement in symptoms of depression and anxiety were found in this observational dataset. Despite limitations in the data, the model proved sufficient to “tease out” strong causal effects. The inflexibility domain of EF had a much stronger signal/noise ratio than the other domains, consistent across all the centers, and maintained significance when corrected for multiple testing. The effect decreased over time and was moderated by education.

In the low-education group, cognitive inflexibility was higher than in the high-education group over the whole period and seemed to have a strong causal effect on subsequent anxiety and depressive symptoms, which was not the case for the high-education group. The effect weakened after the treatment period. A total change in HADD_t (depressive symptoms) over the whole observation period for the low-education group was estimated to be around 2.4 points improvement relative to perfect flexibility, which is clinically relevant. The high-education group did have a similar, although weak, effect during the treatment period, and with a change of sign, indicating that more cognitive inflexibility led to more depressive/anxious symptoms after treatment. These results might explain why there is no consistent relationship between EF capacity and mood disorders, and why some benefit from rehabilitation while others do not (Mikkelsen & Rosholm, 2018).

Education as a moderator was found in a recent systematic review and meta-analysis on the effects of antidepressants on executive functioning (Gudayol-Ferré et al., 2021). Improvement in EFs was overall stronger in the higher educated following treatment (Gudayol-Ferré et al., 2021), indicating that education sometimes can be a moderator in this relationship.

Theoretically, the importance and positive benefits of high cognitive flexibility are recognized in leading models of cognitive behavioral therapy (CBT) (Clark & Beck, 2010; Hayes et al., 2011). The more traditional CBT models state that with repeated activation, negative self-schemas can crystalize into a depressive or anxious mode, which then generalizes to more and milder stressful life events. Newer developments such as metacognitive therapy also state that emotional disorders are maintained by the activation of maladaptive thinking styles. Here, the cognitive attentional syndrome is characterized by extended negative thinking in the form of rumination and worry and is associated with increased self-focused attention and maladaptive coping behaviors (Wells, 2011). In both cognitive models, maladaptive information processing is thought to result in a weakening of cognitive control. This could be operationalized as inflexibility or inability to access more adaptive, alternative modes of thinking. What is surprising with our current findings is that such inflexibility seems to be beneficial.

Our results might therefore indicate an important nuance in our understanding of cognitive flexibility and symptoms of anxiety and depression. While it is fairly established that shifting ability is central for supporting flexible and effective self-regulation (Kashdan & Rottenberg, 2010), the ability to shift attention from a task at hand to new aspects of a
situation more critical is highly adaptive, but it is also costly in the form of energy expenditure for the brain. An automized cost–benefit analysis of shifting between sets of stimuli in any demanding novel situation has previously been demonstrated (Szalma & Matthews, 2015). This signifies that a given challenge must be significant or perceived as important to motivate shifts in attention. Less flexibility may thus be associated with good coping in less demanding day-to-day situations, only becoming negatively associated with good coping in crises or extended challenging situations.

Following this reasoning, our results could be explained by our participants’ perception of their challenges related to sick leave. Challenges caused by sick leave in the Nordic region are often considered manageable for the individual due to available, generous benefits, support and health care systems (Elstad & Vabø, 2008). It could be that the beneficial role of inflexibility appears in our study due to the context of sick leave being less dramatic than in countries with no or little welfare or sick leave benefits.

Another interpretation also worth considering in light of our results is our action variable testing cognitive flexibility, IED set shift. In this test, a rule is established by shaping a given participant’s behavior with contingencies related to choosing an abstract pattern (correct/wrong). Once a rule is established, it is then altered without warning, and the speed and accuracy of acquiring a new rule, accumulated over several stages, is the outcome from the test. Its design dictates that those who resist changes to rules and are less sensitive to cues from their context score poorly on this test. But here it is also these participants who improve in their symptomatology the most. The counter-intuitive nature of our data could also, therefore, be viewed considering the behavioral construct of rule-governed behavior. A conceptualization of EFs as a subset of rule-governed behavior characterized by flexibility was given by Hayes et al. in the mid-1990s (Hayes et al., 1996). When one compares learning behavior under the control of instructions (i.e., rule-governed behavior) with behaviors shaped by contingencies such as reinforcement by trial and error, participants who are instructed tend to show more insensitivity to negative rewards than contingency-shaped participants (Vaughan, 1989). This entails that when instructed to expect fewer symptoms from doing a behavior such as running, a less flexible participant would ignore experience of non-relief and defer to the instruction of “less symptoms.” Within rule-governed behavior, cognitive inflexibility, therefore, would refer to a tendency to maintain current behavior disregarding experience to a larger degree, which could be adherence to instructed benefits from following “rules” set in a rehabilitation program, such as expecting fewer symptoms from certain behaviors. As an example of this, a previous randomized controlled trial indicated support for our current finding in showing that low-educated participants benefited more from an intervention in smoking cessation, due to increased adherence to instructive stories (Strecher et al., 2008).

Indeed, the current rehabilitation programs would involve multiple examples of increased behavioral activation under the control of instructions from therapists. Given these conditions, one could envision generalized tracking developing during the stay from instructed functional relationships among events (e.g., drawing out rules based on observation and instructions from therapists) (Villatte et al., 2015). This generalized tracking of a rule in the abstract sense could then improve anxiety and depression through rigidity over time and perceived symptom relief disregarding evidence to the contrary. This is substantiated by the HADS being a patient-reported outcome, meaning that one does not know if others in close personal relationships, for example, would judge this perceived symptom relief in the same way. Thus, this improvement could be strictly subjective, something that is indicated by the lack of substantive associations with return-to-work. Of interest to the current findings is also a prior publication on this multicenter trial showing significant effects from other cognitive functions on return-to-work, but not between cognitive flexibility and return-to-work (Johansen et al., 2019).

Methodological Discussion

The structural nested mean model with g-estimation demonstrated efficient estimation in this application. The study of causal reciprocal effects between two variables (so-called cross-lagged effects) dates to early time-series analysis, and was incorporated in the SEM framework (Jöreskog, 1970; Jöreskog & Sörbom, 1979), in the form of the cross-lagged panel model (CLPM), which became especially popular in the behavioral and psychological science research. A recent review of medical journals found 270 papers published between 2009 and 2019 that used CLPM (Usami et al., 2019b). The CLPM represents a different tradition for modeling causal effects than the counterfactual framework in epidemiology but shares the same fundamental “no unmeasured confounding” assumption. The CLPM’s sensitivity for this assumption has been criticized, e.g., yielding non-satisfactory adjustment for stable trait factors (Lucas, 2022). Extensions of
CLPM and several alternatives with latent variables treatment or exposure and outcome is done away
have been developed (Usami et al., 2019a), but with (Robins et al., 1999). The SNMM with g-esti-
they all have in common that they are severely mation is well suited for sensitivity analysis, because
biased in their causal cross-lagged effect-estimate, g-estimation does not require no unmeasured con-
when the data-generating process deviates from founding for the identification of parameters, it
their strict parametric assumptions (Lüdtke & merely requires that the magnitude of unmeasured
Robitzsch, 2022). To date, it seems to be a lack of confounding is known (Robins et al., 1999; Van-
consensus in psychological and behavioral science, steelandt & Joffe, 2014; Yang & Lok, 2018). By
of whether to keep (Orth et al., 2021) or to letting the bias-function be zero for no unmeasured
abandon (Lucas, 2022) the CLPM model. confounding and varying both parameters in this
Consequently, the counterfactual framework has function and the functional form itself, sensitivity
gained acceptance in defining the causal effects for unmeasured confounding can be assessed
(bias) in general (Lüdtke & Robitzsch, 2022; Rijn- (Yang & Lok, 2018). However, Yang and Lok
hart et al., 2021; Usami, 2022) and for comparison described this in the form of complicated estimating
of methods (Lüdtke & Robitzsch, 2022). In some equations (Yang & Lok, 2018), which have not
cases, authors from the SEM literature even rec- been translated to g-estimation by nested regression
ommend causal mediation analysis (counterfactual models from Vansteelandt & Sjolander, used in this
framework) rather than the traditional mediation application.
models (SEM) (Rijnhart et al., 2021). The strength A practical problem is to determine if a certain
of the SNMM with g-estimation has been recog- magnitude of unmeasured confounding (on the
nized, but unwillingness to depart from the latent scale of the parameters in the bias-function) rep-
variable models makes a comparison between the resents sensitivity or not. Therefore, when the
two traditions difficult. Usami used the SNMM as association of interest is not well described, as in
part of fitting the SEM, to estimate joint effects of time-varying treatments with improved control for stable traits, owing to the SNMM's robustness to functional form and misspecification (Usami, 2022). Despite these advances in robustness, the SEM component still relies on several parametric (and distributional) assumptions, some of which are untestable.

The assumption of “no unmeasured confounding” is not testable from data, and is the reason that causal inference from observational data is hazardous (Hernán & Robins, 2020). In the counterfactual framework, rather than introducing further assumptions on unmeasured factors that would each need separate assessment (and would not be testable), it seems more appealing to apply the already existing framework for sensitivity analysis (Robins et al., 1999; VanderWeele & Arah, 2011; Klungsøyr et al., 2009). In this framework, one needs a model (a parsimonious function) for the association between the mean of the counterfactual outcome variable and the action, treatment or exposure, within levels of measured confounders. This bias function, which represents the magnitude of confounding due to unmeasured factors, is not identified from the data, and needs to be considered plausible by subject-matter experts. The sensitivity of the causal effect is described by varying this function (and its parameters). In this way, far fewer sensitivity parameters need to be varied, and the “impossible” decision as to whether to view U as univariate or multivariate, continuous or discrete, and its relation to the action variable, is avoided. In the present case, such an undertaking might not be worth the effort. Further, the degree of missing data will necessarily affect the interpretation of the results in all cases. Plausible significant causal effects rely on “no misspecification” in the imputation models (as well as in other model components), which are strong assumptions. Adding further assumptions, in the form of bias functions for unmeasured confounding, might seem questionable. Nevertheless, the g-estimator is itself robust to unmeasured confounding by allowing for misspecification in one of its components (e.g. unmeasured confounding in the outcome model); that is, it is doubly robust. One gets two attempts at success: consistent estimates are achieved if either the propensity score model or the outcome model, in addition to the SNMM, is correctly specified. Even an efficiency gain can be achieved by including a “wrong” model. In an RCT with a dichotomous action (or treatment or exposure), where the propensity score model is known (probability for treatment) and a consistent effect estimate would be expected, adding a misspecified outcome model in a doubly robust estimator usually leads to a lower standard error in the effect estimate (Tsiatis et al., 2020). Insight into one of the two models will, in a doubly robust method, be of extra value. In the present application, confidence in the comprehensive propensity score model (a series of lasso regressions with a rich set of covariates, both time-constant and time-varying) indicates that the estimated causal effect is not completely a result of unmeasured confounding.
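The double-robustness property can be illustrated with a small simulation. The sketch below is an illustration only and is not the analysis code of this study: the data-generating model, the coefficient values (0.8, 1.5, and a true effect of -0.5), and the function names `simulate` and `g_estimate` are all assumptions made for the example. It uses a point exposure, treats the nuisance models as known functions rather than fitted regressions, and solves the linear g-estimating equation in closed form.

```python
import random


def simulate(n, c_true, seed):
    """Draw n triples (L, A, Y): confounder L, continuous exposure A, outcome Y.

    Hypothetical data-generating model, chosen only for illustration:
      A = 0.8 * L + noise           so E[A | L] = 0.8 * L
      Y = c_true * A + 1.5 * L + noise   so E[Y - c_true * A | L] = 1.5 * L
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        l = rng.gauss(0.0, 1.0)
        a = 0.8 * l + rng.gauss(0.0, 1.0)
        y = c_true * a + 1.5 * l + rng.gauss(0.0, 1.0)
        rows.append((l, a, y))
    return rows


def g_estimate(rows, propensity, outcome_model):
    """Solve sum (A - p(L)) * (Y - c * A - m(L)) = 0 for c.

    propensity(l) is the working model for E[A | L = l];
    outcome_model(l) is the working model for E[Y - c * A | L = l].
    The equation is linear in c, so the solution is a simple ratio.
    """
    num = sum((a - propensity(l)) * (y - outcome_model(l)) for l, a, y in rows)
    den = sum((a - propensity(l)) * a for l, a, y in rows)
    return num / den


rows = simulate(n=20000, c_true=-0.5, seed=1)

right_p = lambda l: 0.8 * l   # correctly specified propensity model
wrong_p = lambda l: 0.0       # misspecified: ignores the confounder
right_m = lambda l: 1.5 * l   # correctly specified outcome model
wrong_m = lambda l: 0.0       # misspecified: ignores the confounder

print("propensity right, outcome wrong:", g_estimate(rows, right_p, wrong_m))
print("propensity wrong, outcome right:", g_estimate(rows, wrong_p, right_m))
print("both wrong:                     ", g_estimate(rows, wrong_p, wrong_m))
```

Under these assumptions, the first two estimates should land near the true value of -0.5, while the estimate with both working models misspecified should be clearly biased, since the confounder is then ignored altogether. The sketch does not address the efficiency claim for RCTs, and in practice both nuisance models would be estimated from the data.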
Strengths and Limitations

Some limitations concerning a causal interpretation of the results are: the observational nature of the data, a limited observation period and number of repeated measures, time-varying confounding, and substantial missingness. On the other hand, the multicenter design with four independent sites provided data from many participants with a good geographic spread in the most populous region of Norway. The tests of executive functioning were selected to cover relevant and important aspects of executive functioning, and the HADS is a widely used and suitable response variable. The structural nested mean model with g-estimation is appropriate for causal inference in these types of data, outperforming several alternative methods.

Conclusion

The results from this study identify an unexpected causal relationship between cognitive inflexibility and improvement in symptoms of anxiety and depression. This challenges the scientific consensus, even though the causal interpretation must be viewed as exploratory, due to limitations in the design. Confirmation of these results from other studies will add nuance to the understanding of which variables to target in therapy and why. With growing acceptance of defining causal effects with counterfactuals within the psychological and behavioral sciences, and the continuing lack of consensus on the most appropriate model for causal inference, this application is meant to promote the structural nested mean model with g-estimation. It has fewer non-testable assumptions than the traditional latent variable models and does not rely on specialized software.

Disclosure Statement

No potential conflict of interest was reported by the author(s).

Supplemental data

Supplemental data for this article can be accessed online at https://doi.org/10.1080/10503307.2023.2221808.

References

Aasvik, J. K., Woodhouse, A., Børsting Jacobsen, H., Borchgrevink, P. C., Stiles, T. C., & Landrø, N. I. (2015). Subjective memory complaints among patients on sick leave are associated with symptoms of fatigue and anxiety. Frontiers in Psychology, 6, https://doi.org/10.3389/fpsyg.2015.01338
Aasvik, J. K., Woodhouse, A., Stiles, T. C., Jacobsen, H. B., Landmark, T., Glette, M., Borchgrevink, P. C., & Landrø, N. I. (2017). Effectiveness of working memory training among subjects currently on sick leave due to complex symptoms. Frontiers in Psychology, 7, 2003. https://doi.org/10.3389/fpsyg.2016.02003
Bjelland, I., Dahl, A. A., Haug, T. T., & Neckelmann, D. (2002). The validity of the hospital anxiety and depression scale. Journal of Psychosomatic Research, 52(2), 69–77. https://doi.org/10.1016/S0022-3999(01)00296-3
Buuren, S. V., & Groothuis-Oudshoorn, K. (2011). mice: multivariate imputation by chained equations in R. Journal of Statistical Software, 45(3), 1–67. https://doi.org/10.18637/jss.v045.i03
Cambridge Cognition. (n.d.). Executive function. https://cambridgecognition.com/executive-function/
Caspi, A., Houts, R. M., Ambler, A., Danese, A., Elliott, M. L., Hariri, A., Rasmussen, L. J. H., Reuben, A., Richmond-Rakerd, L., Sugden, K., Wertz, J., Williams, B. S., & Moffitt, T. E. (2020). Longitudinal assessment of mental health disorders and comorbidities across 4 decades among participants in the Dunedin birth cohort study. JAMA Network Open, 3(4), e203221. https://doi.org/10.1001/jamanetworkopen.2020.3221
Clark, D. A., & Beck, A. T. (2010). Cognitive theory and therapy of anxiety and depression: Convergence with neurobiological findings. Trends in Cognitive Sciences, 14(9), 418–424. https://doi.org/10.1016/j.tics.2010.06.007
Daniel, R. M., Cousens, S., De Stavola, B., Kenward, M. G., & Sterne, J. (2013). Methods for dealing with time-dependent confounding. Statistics in Medicine, 32(9), 1584–1618. https://doi.org/10.1002/sim.5686
De Stavola, B. L., et al. (2015). Mediation analysis with intermediate confounding: Structural equation modeling viewed through the causal inference lens. American Journal of Epidemiology, 181(1), 64–80. https://doi.org/10.1093/aje/kwu239
Devanand, D. P., Pelton, G. H., Marston, K., Camacho, Y., Roose, S. P., Stern, Y., & Sackeim, H. A. (2003). Sertraline treatment of elderly patients with depression and cognitive impairment. International Journal of Geriatric Psychiatry, 18(2), 123–130. https://doi.org/10.1002/gps.802
Diamond, A. (2013). Executive functions. Annual Review of Psychology, 64(1), 135–168. https://doi.org/10.1146/annurev-psych-113011-143750
Elstad, J. I., & Vabø, M. (2008). Job stress, sickness absence and sickness presenteeism in Nordic elderly care. Scandinavian Journal of Public Health, 36(5), 467–474. https://doi.org/10.1177/1403494808089557
Ertefaie, A., Small, D. S., Flory, J. H., & Hennessy, S. (2017). A tutorial on the use of instrumental variables in pharmacoepidemiology. Pharmacoepidemiology and Drug Safety, 26(4), 357–367. https://doi.org/10.1002/pds.4158
First, M. B., Gaebel, W., Maj, M., Stein, D. J., Kogan, C. S., Saunders, J. B., Poznyak, V. B., Gureje, O., Lewis-Fernández, R., Maercker, A., Brewin, C. R., Cloitre, M., Claudino, A., Pike, K. M., Baird, G., Skuse, D., Krueger, R. B., Briken, P., Burke, J. D., Woods, D. W. (2021). An organization- and category-level comparison of diagnostic requirements for mental disorders in ICD-11 and DSM-5. World Psychiatry, 20(1), 34–51. https://doi.org/10.1002/wps.20825
Fresco, D. M., Segal, Z. V., Buis, T., & Kennedy, S. (2007). Relationship of posttreatment decentering and cognitive reactivity to relapse in major depression. Journal of Consulting and Clinical Psychology, 75(3), 447–455. https://doi.org/10.1037/0022-006X.75.3.447
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1), 1–22. https://doi.org/10.18637/jss.v033.i01
Fu, Z., Brouwer, M., Kennis, M., Williams, A., Cuijpers, P., & Bockting, C. (2021). Psychological factors for the onset of depression: a meta-analysis of prospective studies. BMJ Open, 11(7), e050129. https://doi.org/10.1136/bmjopen-2021-050129
Grant, D. A., & Berg, E. A. (1993). Wisconsin card sorting test. Journal of Experimental Psychology.
Gudayol-Ferré, E., Duarte-Rosas, P., Peró-Cebollero, M., & Guàrdia-Olmos, J. (2021). The effect of second-generation antidepressant treatment on the executive functions of patients with major depressive disorder: a meta-analysis study with structural equation models. Psychiatry Research, 296, 113690. https://doi.org/10.1016/j.psychres.2020.113690
Hayes, S. C., Gifford, E. V., & Ruckstuhl Jr, L. (1996). Relational frame theory and executive function: A behavioral approach.
Hayes, S. C., Luoma, J. B., Bond, F. W., Masuda, A., & Lillis, J. (2006). Acceptance and commitment therapy: model, processes and outcomes. Behaviour Research and Therapy, 44(1), 1–25. https://doi.org/10.1016/j.brat.2005.06.006
Hayes, S. C., Strosahl, K. D., & Wilson, K. G. (2011). Acceptance and commitment therapy: The process and practice of mindful change. Guilford Press.
Hernán, M., & Robins, J. (2020). Causal inference: What If. Chapman & Hall/CRC.
Huizinga, M., Baeyens, D., & Burack, J. A. (2018). Editorial: Executive function and education. Frontiers in Psychology, 9, https://doi.org/10.3389/fpsyg.2018.01357
Jacobsen, H. B., Aasvik, J. K., Borchgrevink, P. C., Landrø, N. I., & Stiles, T. C. (2016). Metacognitions are associated with subjective memory problems in individuals on sick leave due to chronic fatigue. Frontiers in Psychology, 7, https://doi.org/10.3389/fpsyg.2016.00729
Joffe, M. M., & Brensinger, C. (2003). Weighting in instrumental variables and G-estimation. Statistics in Medicine, 22(8), 1285–1303. https://doi.org/10.1002/sim.1380
Johansen, T., Jensen, C., Eriksen, H. R., Lyby, P. S., Dittrich, W. H., Holsen, I. N., Jakobsen, H., & Øyeflaten, I. (2019). Occupational rehabilitation is associated with improvements in cognitive functioning. Frontiers in Psychology, 10, 2233.
Jöreskog, K. G. (1970). A general method for analysis of covariance structures. Biometrika, 57(2), 239–251. https://doi.org/10.1093/biomet/57.2.239
Jöreskog, K. G., & Sörbom, D. (1979). Advances in factor analysis and structural equation models. Abt Books.
Kashdan, T. B., & Rottenberg, J. (2010). Psychological flexibility as a fundamental aspect of health. Clinical Psychology Review, 30(7), 865–878. https://doi.org/10.1016/j.cpr.2010.03.001
Klungsøyr, O., Sexton, J., Sandanger, I., & Nygård, J. F. (2009). Sensitivity analysis for unmeasured confounding in a marginal structural Cox proportional hazards model. Lifetime Data Analysis, 15, 278–294.
Letkiewicz, A. M., Miller, G. A., Crocker, L. D., Warren, S. L., Infantolino, Z. P., Mimnaugh, K. J., & Heller, W. (2014). Executive function deficits in daily life prospectively predict increases in depressive symptoms. Cognitive Therapy and Research, 38(6), 612–620. https://doi.org/10.1007/s10608-014-9629-5
Levkovitz, Y., Caftori, R., Avital, A., & Richter-Levin, G. (2002). The SSRIs drug Fluoxetine, but not the noradrenergic tricyclic drug Desipramine, improves memory performance during acute major depression. Brain Research Bulletin, 58(4), 345–350. https://doi.org/10.1016/S0361-9230(01)00780-8
Liu, H., Funkhouser, C. J., Langenecker, S. A., & Shankman, S. A. (2021). Set shifting and inhibition deficits as potential endophenotypes for depression. Psychiatry Research, 300, 113931. https://doi.org/10.1016/j.psychres.2021.113931
Lucas, R. E. (2022). It’s time to abandon the cross-lagged panel model. Preprint.
Lüdtke, O., & Robitzsch, A. (2022). A comparison of different approaches for estimating cross-lagged effects from a causal inference perspective. Structural Equation Modeling: A Multidisciplinary Journal, 29(6), 888–907.
Mac Giollabhui, N., Olino, T. M., Nielsen, J., Abramson, L. Y., & Alloy, L. B. (2019). Is worse attention a risk factor for or a consequence of depression, or are worse attention and depression better accounted for by stress? A prospective test of three hypotheses. Clinical Psychological Science, 7(1), 93–109. https://doi.org/10.1177/2167702618794920
MacPherson, H. A., Kudinova, A. Y., Schettini, E., Jenkins, G. A., Gilbert, A. C., Thomas, S. A., Kim, K. L., Radoeva, P. D., Fenerci, R. L. B., Yen, S., Hower, H., Hunt, J., Keller, M. B., & Dickstein, D. P. (2021). Relationship between cognitive flexibility and subsequent course of mood symptoms and suicidal ideation in young adults with childhood-onset bipolar disorder. European Child & Adolescent Psychiatry, 31(2), 299–312. https://doi.org/10.1007/s00787-020-01688-0
Madley-Dowd, P., Hughes, R., Tilling, K., & Heron, J. (2019). The proportion of missing data should not be used to guide decisions on multiple imputation. Journal of Clinical Epidemiology, 110, 63–73. https://doi.org/10.1016/j.jclinepi.2019.02.016
Mikkelsen, M. B., & Rosholm, M. (2018). Systematic review and meta-analysis of interventions aimed at enhancing return to work for sick-listed workers with common mental disorders, stress-related disorders, somatoform disorders and personality disorders. Occupational and Environmental Medicine, 75(9), 675–686. https://doi.org/10.1136/oemed-2018-105073
Miyake, A., & Friedman, N. P. (2012). The nature and organization of individual differences in executive functions. Current Directions in Psychological Science, 21(1), 8–14. https://doi.org/10.1177/0963721411429458
Mulder, H., Verhagen, J., Van der Ven, S. H. G., Slot, P. L., & Leseman, P. P. M. (2017). Early executive function at age two predicts emergent mathematics and literacy at age five. Frontiers in Psychology, 8. https://doi.org/10.3389/fpsyg.2017.01706
Naimi, A. I., Cole, S. R., & Kennedy, E. H. (2017). An introduction to g methods. International Journal of Epidemiology, 46(2), 756–762.
Neyman, J. (1923). Sur les applications de la théorie des probabilités aux expériences agricoles: Essai des principes. Excerpts reprinted (1990) in English (D. Dabrowska and T. Speed, Trans.). Statistical Science, 5, 463–472.
Nolen-Hoeksema, S. (2011). Lost in thought: The perils of rumination.
Orth, U., Clark, D. A., Donnellan, M. B., & Robins, R. W. (2021). Testing prospective effects in longitudinal research: Comparing seven competing cross-lagged models. Journal of Personality and Social Psychology, 120(4), 1013–1034. https://doi.org/10.1037/pspp0000358
Pearl, J. (1995). Causal diagrams for empirical research. Biometrika, 82(4), 669–688. https://doi.org/10.1093/biomet/82.4.669
Pearl, J. (2009). Causality: Models, reasoning, and inference (2nd ed). Cambridge University Press.
Picciotto, S., & Neophytou, A. M. (2016). G-estimation of structural nested models: Recent applications in two subfields of epidemiology. Current Epidemiology Reports, 3(3), 242–251. https://doi.org/10.1007/s40471-016-0081-9
Porter, R. J., Robinson, L. J., Malhi, G. S., & Gallagher, P. (2015). The neurocognitive profile of mood disorders - a review of the evidence and methodological issues. Bipolar Disorders, 17, 21–40. https://doi.org/10.1111/bdi.12342
Price, R. B., & Duman, R. (2020). Neuroplasticity in cognitive and psychological mechanisms of depression: an integrative model. Molecular Psychiatry, 25(3), 530–543. https://doi.org/10.1038/s41380-019-0615-x
R Core Team. (2014). R: A language and environment for statistical computing. R Foundation for Statistical Computing. http://www.r-project.org/
Rijnhart, J. J. M., Valente, M., MacKinnon, D. P., Twisk, J. W. R., & Heymans, M. W. (2021). The use of traditional and causal estimators for mediation models with a binary outcome and exposure-mediator interaction. Structural Equation Modeling: A Multidisciplinary Journal, 28(3), 345–355. https://doi.org/10.1080/10705511.2020.1811709
Robins, J. M. (1994). Correcting for non-compliance in randomized trials using structural nested mean models. Communications in Statistics - Theory and Methods, 23(8), 2379–2412. https://doi.org/10.1080/03610929408831393
Robins, J. M. (1997). Causal inference from complex longitudinal data. In M. Berkane (Ed.), Latent variable modeling and applications to causality. Lecture notes in statistics (120) (pp. 69–117). Springer-Verlag.
Robins, J. M., Blevins, D., Ritter, G., & Wulfsohn, M. (1992). G-estimation of the effect of prophylaxis therapy for Pneumocystis carinii pneumonia on the survival of AIDS patients. Epidemiology, 3(4), 319–336. https://doi.org/10.1097/00001648-199207000-00007
Robins, J. M., Rotnitzky, A., & Scharfstein, D. (1999). Sensitivity analysis for selection bias and unmeasured confounding in missing data and causal inference models. In M. E. Halloran & D. Berry (Eds.), Statistical models in epidemiology: The environment and clinical trials (pp. 1–92). Springer-Verlag.
Rock, P. L., Roiser, J., Riedel, W. J., & Blackwell, A. (2014). Cognitive impairment in depression: a systematic review and meta-analysis. Psychological Medicine, 44(10), 2029–2040. https://doi.org/10.1017/S0033291713002535
Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5), 688–701. https://doi.org/10.1037/h0037350
Rubin, D. B. (1978). Bayesian inference for causal effects: The role of randomization. The Annals of Statistics, 6(1), 34–58. https://doi.org/10.1214/aos/1176344064
Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. Wiley.
Sachs, G., Berg, A., Jagsch, R., Lenz, G., & Erfurth, A. (2020). Predictors of functional outcome in patients with bipolar disorder: Effects of cognitive psychoeducational group therapy after 12 months. Frontiers in Psychiatry, 11, https://doi.org/10.3389/fpsyt.2020.530026
Schafer, J. L. (1999). Multiple imputation: a primer. Statistical Methods in Medical Research, 8(1), 3–15. https://doi.org/10.1177/096228029900800102
Snyder, H. R. (2013). Major depressive disorder is associated with broad impairments on neuropsychological measures of executive function: a meta-analysis and review. Psychological Bulletin, 139(1), 81–132. https://doi.org/10.1037/a0028727
Snyder, H. R., Miyake, A., & Hankin, B. L. (2015). Advancing understanding of executive function impairments and psychopathology: bridging the gap between clinical and cognitive approaches. Frontiers in Psychology, 6, https://doi.org/10.3389/fpsyg.2015.00328
Spirtes, P., Glymour, C., & Scheines, R. (1993). Causation, prediction, and search. MIT Press.
Stange, J. P., Alloy, L. B., & Fresco, D. M. (2017). Inflexibility as a vulnerability to depression: A systematic qualitative review. Clinical Psychology: Science and Practice, 24(3), 245–276. https://doi.org/10.1111/cpsp.12201
Stange, J. P., Connolly, S. L., Burke, T. A., Hamilton, J. L., Hamlat, E. J., Abramson, L. Y., & Alloy, L. B. (2016). Inflexible cognition predicts first onset of major depressive episodes in adolescence. Depression and Anxiety, 33(11), 1005–1012. https://doi.org/10.1002/da.22513
Strecher, V. J., McClure, J. B., Alexander, G. L., Chakraborty, B., Nair, V. N., Konkel, J. M., Greene, S. M., Collins, L. M., Carlier, C. C., Wiese, C. l. J., Little, R. J., Pomerleau, C. S., & Pomerleau, O. F. (2008). Web-based smoking-cessation programs. American Journal of Preventive Medicine, 34(5), 373–381. https://doi.org/10.1016/j.amepre.2007.12.024
Szalma, J. L., & Matthews, G. (2015). 12 Motivation and Emotion in Sustained Attention.
Teasdale, J. D., Moore, R. G., Hayhurst, H., Pope, M., Williams, S., & Segal, Z. V. (2002). Metacognitive awareness and prevention of relapse in depression: empirical evidence. Journal of Consulting and Clinical Psychology, 70(2), 275–287. https://doi.org/10.1037/0022-006X.70.2.275
Torrent, C., Martinez-Arán, A., del Mar Bonnin, C., Reinares, M., Daban, C., Solé, B., Rosa, A. R., Tabarés-Seisdedos, R., Popovic, D., Salamero, M., & Vieta, E. (2012). Long-term outcome of cognitive impairment in bipolar disorder. The Journal of Clinical Psychiatry, 73(7), e899–e905. https://doi.org/10.4088/JCP.11m07471
Tsiatis, A. A., Davidian, M., Holloway, S. T., & Laber, E. B. (2020). Dynamic treatment regimes: Statistical methods for precision medicine. CRC Press.
Usami, S., Murayama, K., & Hamaker, E. L. (2019a). A unified framework of longitudinal models to examine reciprocal relations. Psychological Methods, 24(5), 637–657. https://doi.org/10.1037/met0000210
Usami, S., Todo, N., & Murayama, K. (2019b). Modeling reciprocal effects in medical research: Critical discussion on the current practices and potential alternative models. PLOS ONE, 14(9), e0209133.
Usami, S. (2022). Within-person variability score-based causal inference: A two-step estimation for joint effects of time-varying treatments. Psychometrika.
VanderWeele, T. J. (2012). Invited commentary: Structural equation models and epidemiologic analysis. American Journal of Epidemiology, 176(7), 608–612. https://doi.org/10.1093/aje/kws213
VanderWeele, T. J. (2015). Explanation in causal inference. Methods for mediation and interaction. Oxford University Press.
VanderWeele, T. J., & Arah, O. A. (2011). Bias formulas for sensitivity analysis of unmeasured confounding for general outcomes, treatments, and confounders. Epidemiology, 22(1), 42–52. https://doi.org/10.1097/EDE.0b013e3181f74493
VanderWeele, T. J., & Hernán, M. A. (2013). Causal inference under multiple versions of treatment. Journal of Causal Inference, 1(1), 1–20. https://doi.org/10.1515/jci-2012-0002
Vansteelandt, S., & Joffe, M. (2014). Structural nested models and G-estimation: The partially realized promise. Statistical Science, 29(4), 707–731. https://doi.org/10.1214/14-STS493
Vansteelandt, S., & Sjolander, A. (2016). Revisiting g-estimation of the effect of a time-varying exposure subject to time-varying confounding. Epidemiologic Methods, 5(1), 37–56. https://doi.org/10.1515/em-2015-0005
Vaughan, M. (1989). Rule-governed behavior in behavior analysis. In Rule-governed behavior (pp. 97–118). Springer.
Villatte, M., Villatte, J. L., & Hayes, S. C. (2015). Mastering the clinical conversation: Language as intervention. Guilford Publications.
Wallace, M. P., Moodie, E. E. M., & Stephens, D. A. (2016). Model assessment in dynamic treatment regimen estimation via double robustness. Biometrics, 72(3), 855–864. https://doi.org/10.1111/biom.12468
Wells, A. (2011). Metacognitive therapy for anxiety and depression. Guilford Press.
Wells, A., & Matthews, G. (1996). Modelling cognition in emotional disorder: The S-REF model. Behaviour Research and Therapy, 34(11-12), 881–888. https://doi.org/10.1016/S0005-7967(96)00050-2
Wu, Y., Levis, B., Sun, Y., He, C., Krishnan, A., Neupane, D., Bhandari, P. M., Negeri, Z., Benedetti, A., & Thombs, B. D. (2021). Accuracy of the Hospital Anxiety and Depression Scale Depression subscale (HADS-D) to screen for major depression: systematic review and individual participant data meta-analysis. BMJ, 373, n972.
Yang, S., & Lok, J. J. (2018). Sensitivity analysis for unmeasured confounding in coarse structural nested mean models. Statistica Sinica, 28(4), 1703–1723.
Zigmond, A. S., & Snaith, R. P. (1983). The hospital anxiety and depression scale. Acta Psychiatrica Scandinavica, 67(6), 361–370. https://doi.org/10.1111/j.1600-0447.1983.tb09716.x

Appendix

A1. G-estimation in Linear Structural Nested Mean Models (SNMMs)

A linear structural mean model (SMM) (Robins, 1994; Vansteelandt & Sjolander, 2016; Vansteelandt & Joffe, 2014) is one of Robins’ g-methods (Robins et al., 1992), a model for the conditional causal effect (conditional on covariates) of any variable of interest (e.g. treatment or exposure) on an outcome. It is linear in an unknown (finite-dimensional) parameter $c$, which quantifies the causal effect of a hypothetical level of the exposure on a counterfactual outcome, denoted $Y(a)$ (the potentially unobserved outcome for the hypothetical exposure level $a$). One of the basic assumptions for causal inference is “consistency,” that $Y(a) = Y$ for those who had exposure level $A = a$. Two others are “positivity” (not necessary for the SNMM) and “no unmeasured confounding” (Vansteelandt & Sjolander, 2016). A linear SMM may be formulated as

$$E\{Y(a) - Y(0) \mid l\} = c' z a \tag{A1}$$

which describes the effect of setting the exposure to $a$ versus 0 (by some intervention) within subjects with covariate values $l$, on the additive scale, where 0 is some reference value (not necessarily “no exposure”), and $z$ is a covariate vector that may depend on (e.g. be a subset of) $l$. This is the “point exposure case,” which is the building block for the general case.

The SMM is fitted in two steps. First, the exposure is regressed on a vector of covariates $L$. Let the “propensity score” $P$ be the fitted value from this regression (e.g. a linear regression), whether the exposure is continuous or categorical (even though the term is usually reserved for a dichotomous exposure). Second, the outcome is regressed on the vector of covariates $L$, the terms $ZA$ from the SMM, and $ZP$, where the exposure $A$ has been replaced by the propensity score, given by

$$E(Y \mid l, a) = b_0 + b_1' l + b_2' z p + c' z a \tag{A2}$$

The estimate of the causal parameter $c$ is equivalent (in large samples) to the g-estimator and has desirable properties, such as double robustness (Vansteelandt & Sjolander, 2016). The generalization to a conditional causal effect of a time-varying exposure, with time-varying covariates and outcome, is based on the linear structural nested mean model (SNMM) (Robins, 1994). The exposure and covariates $A_t$ and $L_t$ are thought to be assessed at $t = 1, \ldots, T-1$, with the history up until time $t$ denoted by $\bar{A}_t$ and $\bar{L}_t$. Similarly, the future of a variable is denoted by an underbar, $\underline{L}_t = \{L_t, L_{t+1}, \ldots\}$. The SNMM can be formulated as

$$E\{Y_s(\bar{a}_t, 0) - Y_s(\bar{a}_{t-1}, 0) \mid \bar{a}_{t-1}, \bar{l}_t\} = c' z_{st} a_t \tag{A3}$$

where $s > t$, and $Y_s(\bar{a}_t, 0)$ is the counterfactual outcome where the exposure is set to $\bar{a}_t$ up until time $t$, and zero thereafter. The “no unmeasured confounding” (sequential conditional exchangeability) assumption is formulated as $\underline{Y}_{t+1}(\bar{a}_{t-1}, 0) \perp A_t \mid \bar{L}_t, \bar{A}_{t-1} = \bar{a}_{t-1}$ (conditional independence between $A_t$ and all future values of $Y_s(\bar{a}_t, 0)$).

This construct facilitates the estimation of, for each observed outcome $Y_s$ at time point $s$, not only the causal effect of the exposure immediately preceding the outcome (time $s-1$), but also of earlier exposures (times $s-2, \ldots$), for which the effects of later exposures are subtracted from the outcome. In this way the SNMM can be broken down into a sequence of SMMs and fitted in the same way as in the point exposure case.

First, the exposure at each time $t = 1, \ldots, T-1$ is regressed on the history of exposures $\bar{A}_{t-1}$ and covariates $\bar{L}_t$, for example in a linear regression, and the fitted value from this regression, called the
propensity score $P_t$, is given by

$$E(A_t \mid \bar{a}_{t-1}, \bar{l}_t) = g_0 + g_1 a_{t-1} + g_2' l_t \tag{A4}$$

The causal effect of the exposure at each time on the subsequent outcome is found by viewing the data as a sequence of point-exposure SMMs, with the consistency assumption that $Y_t(\bar{a}_{t-1})$ equals $Y_t$, $t = 1, \ldots, T$, for those with observed exposure history $\bar{A}_{t-1} = \bar{a}_{t-1}$, and with the univariate outcome regression in (A2) replaced by a repeated-measures regression model such as the Generalized Estimating Equation (GEE) model with independence working correlation (Vansteelandt & Sjolander, 2016). The histories $\bar{A}_{t-2}$ and $\bar{L}_{t-1}$ are considered baseline covariates (possibly different from the vector of covariates in the exposure regression) and are conditioned on, as is the sequence of propensity scores $P_{t-1}$:

$$E(Y_t \mid \bar{a}_{t-1}, \bar{l}_{t-1}) = b_0 + b_1 \bar{a}_{t-2} + b_2' \bar{l}_{t-1} + b_3' z_{t,t-1} p_{t-1} + c' z_{t,t-1} a_{t-1} \tag{A5}$$

This yields a preliminary estimate $\hat{c}^{(0)}$. To estimate the effect of $A_{t-2}, A_{t-3}, \ldots$, each one is again considered a point exposure, with the predictions of $Y_s(\bar{A}_{s-2}, 0), Y_s(\bar{A}_{s-3}, 0), \ldots$ as outcomes. The unbiased prediction of $Y_s(\bar{A}_t, 0)$, here denoted by $H_{st}$ for arbitrary $s$ and $t$, $s > t$, is found by using $\hat{c}^{(0)}$ to subtract the cumulative effect of observed exposure past time $t$ from the observed outcome:

$$H_{st} = Y_s - \sum_{u=t+1}^{s-1} \hat{c}^{(0)\prime} Z_{su} A_u \tag{A6}$$

The updated and improved estimate of $c$ is found by considering $H_{st}$ for all $s$ and $t$ as repeatedly measured outcomes, with a new independence GEE in the model

$$E(H_{st} \mid \bar{a}_t, \bar{l}_t) = b_0 + b_1 \bar{a}_{t-1} + b_2' \bar{l}_t + b_3' z_{st} p_t + c' z_{st} a_t \tag{A7}$$

which yields $\hat{c}^{(1)}$. Standard errors by bootstrap are recommended (Vansteelandt & Sjolander, 2016).

The structural nested mean model (SNMM) (Robins, 1994) with g-estimation is particularly well suited for estimating an effect of a time-varying continuous exposure, and for assessing effect modification by time-varying covariates, which is not possible with the more popular marginal structural model (MSM) (Vansteelandt & Sjolander, 2016).

Different causal hypotheses can easily be assessed with different SNMMs. In the application presented, some candidates were:

$$E\{Y_s(\bar{a}_t, 0) - Y_s(\bar{a}_{t-1}, 0) \mid \bar{a}_{t-1}, \bar{l}_t\} = c' z_{st} a_t = \{c_0 I(s = t+1) + c_1 (s - t - 1)\} a_t \tag{A8}$$

$$E\{Y_s(\bar{a}_t, 0) - Y_s(\bar{a}_{t-1}, 0) \mid \bar{a}_{t-1}, \bar{l}_t\} = c' z_{st} a_t = (c_0 + c_1 \mathrm{trt} + c_2 l_t + c_3 \mathrm{trt} \times l_t) a_t \tag{A9}$$

$$E\{Y_s(\bar{a}_t, 0) - Y_s(\bar{a}_{t-1}, 0) \mid \bar{a}_{t-1}, \bar{l}_t\} = c' z_{st} a_t = \{c_0 + c_1 f(t)\} a_t \tag{A10}$$

for $t = 1, 2, 3$, $s > t$, where $I(\cdot)$ is the indicator function, such that $I(\text{arg}) = 1$ if arg is true.

The SNMM in (A8) differentiates between short- and long-term effects, e.g. how a long-term effect may decrease over time. The SNMMs in (A9) and (A10) parameterize short-term effects. (A9) allows a treatment-group difference in effect, time-varying effect modification from $l_t$, and also a treatment-group difference in the effect modification, in the most general form. The SNMM in (A10) is a special case of (A9) with effect modification from a deterministic function of time.

Strengths of the SNMMs and the associated method of g-estimation, compared to the MSM and associated inverse probability weighting, are more efficient effect estimates and the so-called “double robustness” property, which provides some protection against model misspecification (Vansteelandt & Sjolander, 2016).

A2. Multiple Imputation

Compared to a “complete case analysis,” both bias reduction and efficiency gain can be achieved by multiple imputation (MI). The availability of standard software led to an increase in applications of MI in medical research after 2005. An upper limit for the proportion of missing values in key variables has been postulated, but with little evidence to support it (Madley-Dowd et al., 2019). A more useful tool for determining potential efficiency gain from MI is the fraction of missing information (FMI), approximated by $\mathrm{FMI} = r/(1 + r)$ (for a high number of imputations), where $r$ is the relative increase in variance due to the missingness, given by (Madley-Dowd et al., 2019; Schafer, 1999)

$$r = \left(1 + \frac{1}{m}\right) \frac{s^2_{\text{between}}}{s^2_{\text{within}}} \tag{A11}$$
where $m$ is the number of imputations, the between variance $\sigma^2_{\text{between}}$ is the sample variance of the coefficients across imputations, and the within variance $\sigma^2_{\text{within}}$ is the sample mean of the estimated variances across imputations. The FMI is a parameter-specific measure that quantifies the loss of information due to missingness, while accounting for the amount of information retained by other variables (Madley-Dowd et al., 2019). If the estimated FMI (in %), say for an exposure effect parameter, is less than the proportion of missing values in the exposure, it means that other variables contain information about this parameter which is recovered by the imputation, even in the case of a high proportion of missing values. However, for a high proportion of missing values, unbiasedness in the MI relies heavily on the untestable assumptions of MAR, and on no misspecification in the imputation and analysis models. A low number of imputations in MI is usually sufficient, by an argument based on relative efficiency (variance compared to the case with a very large number of imputations). For example, with an FMI of 20%, 10 imputations correspond to a relative efficiency above 98%, since $(1 + \mathrm{FMI}/m)^{-1} = 1/1.02 \approx 0.98$ (Schafer, 1999).

A3. Censoring from Loss-to-Follow-up

To adjust for potential selection-bias from loss-to-follow-up, inverse probability of censoring weighting is possible under the “missing at random” assumption, which in the SNMM means that at each time, the missingness in the outcome is independent of future exposures, covariates and outcomes, conditional on exposures, covariates and outcomes measured up to that time (Vansteelandt & Sjolander, 2016). Adjustment for censoring by loss-to-follow-up is achieved by weighting the outcome regression with inverse probability weights, similar (but not identical) to the weights used in fitting marginal structural models. If $C_t$ is an indicator equal to 1 when a person has been lost to follow-up by time $t$, and zero otherwise, the “stabilized weights,” which are the ones used and have the least variability, express the conditional probability of $C_s = 0$, $s > t$ (that the person will continue to stay in the study) conditional on exposure and covariate history, divided by the same probability with updated exposure and covariate history (Vansteelandt & Sjolander, 2016):

$$
w_{Tt} = I(C_T = 0) \prod_{s=t+1}^{T} \frac{\Pr(C_s = 0 \mid C_{s-1} = 0, \bar{A}_{t-1}, \bar{L}_t)}{\Pr(C_s = 0 \mid C_{s-1} = 0, \bar{A}_{s-1}, \bar{L}_{s-1})} \tag{A12}
$$

The expression in (A12) reduces to 1 if censoring at each time $s > t$ has no residual dependence on exposure and covariate values past time $t$, conditional on $\bar{A}_{t-1}, \bar{L}_t$. Since the SNMM already conditions on exposure and confounder history up to time $t$, no further adjustment is needed for remaining in the study up to time $t$. Therefore, the weights are usually less variable than the weights for the marginal structural model (Vansteelandt & Sjolander, 2016). In the present application, the weights for the outcomes $H_{st}$ (in the GEE) take the form ($A_0 = 0$):

$$
\begin{aligned}
w_{21} &= I(C_2 = 0)\, \frac{\Pr(C_2 = 0 \mid C_1 = 0, A_0, \bar{L}_1)}{\Pr(C_2 = 0 \mid C_1 = 0, A_1, \bar{L}_1)} \\[4pt]
w_{31} &= I(C_3 = 0)\, \frac{\Pr(C_2 = 0 \mid C_1 = 0, A_0, \bar{L}_1)}{\Pr(C_2 = 0 \mid C_1 = 0, A_1, \bar{L}_1)} \times \frac{\Pr(C_3 = 0 \mid C_2 = 0, A_0, \bar{L}_1)}{\Pr(C_3 = 0 \mid C_2 = 0, \bar{A}_2, \bar{L}_2)} \\[4pt]
w_{32} &= I(C_3 = 0)\, \frac{\Pr(C_3 = 0 \mid C_2 = 0, A_1, \bar{L}_2)}{\Pr(C_3 = 0 \mid C_2 = 0, \bar{A}_2, \bar{L}_2)} \\[4pt]
w_{41} &= I(C_4 = 0)\, \frac{\Pr(C_2 = 0 \mid C_1 = 0, A_0, \bar{L}_1)}{\Pr(C_2 = 0 \mid C_1 = 0, A_1, \bar{L}_1)} \times \frac{\Pr(C_3 = 0 \mid C_2 = 0, A_0, \bar{L}_1)}{\Pr(C_3 = 0 \mid C_2 = 0, \bar{A}_2, \bar{L}_2)} \times \frac{\Pr(C_4 = 0 \mid C_3 = 0, A_0, \bar{L}_1)}{\Pr(C_4 = 0 \mid C_3 = 0, \bar{A}_3, \bar{L}_3)} \\[4pt]
w_{42} &= I(C_4 = 0)\, \frac{\Pr(C_3 = 0 \mid C_2 = 0, A_1, \bar{L}_2)}{\Pr(C_3 = 0 \mid C_2 = 0, \bar{A}_2, \bar{L}_2)} \times \frac{\Pr(C_4 = 0 \mid C_3 = 0, A_1, \bar{L}_2)}{\Pr(C_4 = 0 \mid C_3 = 0, \bar{A}_3, \bar{L}_3)} \\[4pt]
w_{43} &= I(C_4 = 0)\, \frac{\Pr(C_4 = 0 \mid C_3 = 0, \bar{A}_2, \bar{L}_3)}{\Pr(C_4 = 0 \mid C_3 = 0, \bar{A}_3, \bar{L}_3)}
\end{aligned}
\tag{A13}
$$

A.4 The Lasso

The lasso is a “shrinkage” method that performs well at minimizing prediction error in a regression model. Increasing the number of predictors in a regression model usually means less bias and more variance, leading to overfitting. To avoid overfitting, the lasso shrinks the coefficients by minimizing the residual sum of squares subject to an ℓ1 constraint on the coefficients, which sets some of the coefficients exactly to zero when the corresponding predictors are found to have small influence. In the linear regression model

$$
y_i = \beta_0 + \sum_{j=1}^{p} x_{ij} \beta_j + \varepsilon_i \tag{A14}
$$

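The lasso solution can be written as

$$
\hat{\beta}^{\text{lasso}} = \underset{\beta}{\operatorname{argmin}} \sum_{i=1}^{n} \left( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij} \beta_j \right)^2 \quad \text{subject to} \quad \sum_{j=1}^{p} |\beta_j| \le s \tag{A15}
$$

where $s$ controls the amount of shrinkage (a smaller $s$ means more shrinkage). The optimal $s$, the one that minimizes prediction error, can be found by cross-validation: the data are divided into, say, 10 equal-sized parts, 9 of which are used to fit a model with a specific shrinkage and one part to calculate the prediction error. When this is repeated so that each part is held out once, the mean prediction error is a good estimate of the out-of-sample prediction error for that amount of shrinkage.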
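The cross-validation procedure can be illustrated with a minimal Python sketch. It is not the implementation used in the study. Several things are assumptions made for illustration only: the data are simulated, the function names (`soft_threshold`, `fit_lasso`, `cv_error`) are hypothetical, and the lasso is fitted in its penalized form, minimizing $(1/2n)\,\mathrm{RSS} + \lambda \sum_j |\beta_j|$ by coordinate descent, rather than in the constrained form with bound $s$. The two forms give the same solutions for corresponding values of $\lambda$ and $s$, with a larger $\lambda$ corresponding to a smaller $s$.

```python
import random

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def fit_lasso(X, y, lam, n_iter=50):
    """Coordinate descent for (1/2n)*RSS + lam*sum(|b_j|); intercept is unpenalized."""
    n, p = len(X), len(X[0])
    cols = [[row[j] for row in X] for j in range(p)]
    z = [sum(c * c for c in col) / n for col in cols]
    b0, b = 0.0, [0.0] * p
    r = list(y)  # current residuals y - b0 - X b
    for _ in range(n_iter):
        shift = sum(r) / n  # intercept update: absorb the mean residual
        b0 += shift
        r = [ri - shift for ri in r]
        for j, col in enumerate(cols):
            # correlation of predictor j with the residual that excludes predictor j
            rho = sum(c * (ri + c * b[j]) for c, ri in zip(col, r)) / n
            new = soft_threshold(rho, lam) / z[j]
            r = [ri - c * (new - b[j]) for c, ri in zip(col, r)]
            b[j] = new
    return b0, b

def cv_error(X, y, lam, k=10):
    """Mean held-out squared prediction error over k folds."""
    n = len(X)
    errors = []
    for f in range(k):
        held = list(range(f, n, k))  # fold f holds out indices f, f+k, f+2k, ...
        held_set = set(held)
        train = [i for i in range(n) if i not in held_set]
        b0, b = fit_lasso([X[i] for i in train], [y[i] for i in train], lam)
        mse = sum((y[i] - b0 - sum(x * bj for x, bj in zip(X[i], b))) ** 2
                  for i in held) / len(held)
        errors.append(mse)
    return sum(errors) / k

# Simulated data (assumption): 5 predictors, only the first two affect the outcome.
random.seed(0)
n, p = 100, 5
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [2.0 * row[0] - 1.5 * row[1] + random.gauss(0, 0.5) for row in X]

lambdas = [0.01, 0.1, 0.5, 5.0]          # candidate amounts of shrinkage
cv = {lam: cv_error(X, y, lam) for lam in lambdas}
best = min(cv, key=cv.get)               # shrinkage with lowest CV error
b0, b = fit_lasso(X, y, best)            # refit on all data
print(best, [round(c, 2) for c in b])
```

With the heaviest penalty in the grid, all coefficients are shrunk to exactly zero, which is the variable-selection property described above. In practice an established lasso routine with built-in cross-validation would normally be used instead of hand-written code.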
