Journal of Consulting and Clinical Psychology © 2015 American Psychological Association
2016, Vol. 84, No. 1, 67–78 0022-006X/16/$12.00 http://dx.doi.org/10.1037/ccp0000055
Reflective Functioning as Predictor of Working Alliance and Outcome in the Treatment of Depression
Annika Ekeblad, Linköping University and Sundsvall Hospital, Västernorrland County Council, Sweden
Fredrik Falkenström, Linköping University and Uppsala University
Rolf Holmqvist, Linköping University
Aims: Although considerable attention has been paid to the concept of mentalization in psychotherapy, there is little research on mentalization as predictor of psychotherapy process and outcome. Using data from a randomized controlled trial of cognitive–behavioral therapy and interpersonal psychotherapy for depression, we studied mentalization in 85 outpatients with major depressive disorder (MDD) according to the Diagnostic and Statistical Manual of Mental Disorders. It was hypothesized that patients showing lower capacity for mentalization would experience poorer quality of alliance and worse outcome.
Method: Depressive symptoms were measured each session using the Beck Depression Inventory-II. Mentalization was measured as reflective functioning (RF) on a slightly shortened version of the Adult Attachment Interview. A measure of depression-specific reflective functioning (DSRF), measuring mentalization about depressive symptoms, was also used. The Working Alliance Inventory-Short Form Revised was completed after each session by both therapist and patient. Longitudinal multilevel modeling was used to analyze data.
Results: The patients had on average very low RF (M = 2.62, SD = 1.22). Lower pretreatment RF/DSRF predicted significantly lower therapist-rated working alliance during treatment. RF did not affect patient-rated alliance, but lower DSRF predicted lower patient-rated alliance across treatment. Patients with higher RF/DSRF had better outcomes on self-rated depression.
Conclusions: The findings showed lower than normal capacity for mentalization in patients with MDD. Lower RF/DSRF predicted worse treatment outcome. More research is needed to understand how RF affects psychotherapy response and how RF is affected after recovery from depression.
What is the public health significance of this article?

This study shows that the capacity for mentalization, that is, the capacity for understanding humans
as being motivated by a more or less unobservable mental state (e.g., intentions, wishes, feelings,
thoughts), is important for getting optimal results from interpersonal psychotherapy and cognitive–
behavioral therapy in the treatment of depression. For depressed patients with severely restricted
mentalization, some adaption of treatment or some other treatment may be preferred.
Keywords: psychotherapy process, mentalization, reflective functioning, working alliance, major depression
This article was published Online First November 23, 2015.
Annika Ekeblad, Department of Behavioural Research and Learning, Linköping University, and Psychiatric Clinic, Sundsvall Hospital, Västernorrland County Council, Sweden; Fredrik Falkenström, Department of Behavioural Research and Learning, Linköping University, and Center for Clinical Research Sörmland, Uppsala University; Rolf Holmqvist, Department of Behavioural Research and Learning, Linköping University.
The authors would like to thank all participating patients and therapists: Without your work this trial would not have been possible. We also want to thank the Psychiatric Clinic, Sundsvall Hospital, Västernorrland County Council, for making this research possible, together with FoU Västernorrland, the Rehsam Fund, 2010/013, the L. J. Boëthius Research Fund, Emil Andersson Research Fund, the Swedish Research Council for Health, Working Life and Welfare (Grant 2013-0203) and Vårstavi Foundation.
Correspondence concerning this article should be addressed to Annika Ekeblad, Kristinagatan 8, S-87160 Härnösand Sweden. E-mail: annika.ekeblad@lvn.se

In recent years the concept of mentalization (Fonagy, Gergely, Jurist, & Target, 2002) has attracted interest from both clinicians and researchers in the field of clinical psychology and psychotherapy. Mentalization is defined as the capacity to understand human behavior in terms of underlying mental states, that is, thoughts, feelings, wishes, needs, and so forth, and can be seen as consisting of three dimensions: implicit versus explicit mentalization; mentalization about self versus mentalization about other people; mentalization about thoughts versus mentalization about feelings (Bateman & Fonagy, 2004). According to the theory developed by Fonagy and his coworkers (Bateman & Fonagy, 2004; Fonagy et al., 2002; Fonagy & Target, 2000; Target & Fonagy, 1996), mentalization is developed in attachment relationships with caregivers in infancy and onward. Attachment figures’ mirroring infants’ mental states in a congruent (i.e., correct) and marked (i.e., showing that what is reflected is the infant’s mental state and not the adult’s own experience) way is thought to promote development of representations of mental states that can be used for reflection.

Mentalization has been operationalized for research purposes as reflective functioning (RF; Fonagy, Target, Steele, & Steele, 1998). Conceptually, RF is defined as a continuum from an unwillingness to think in mental state terms at all or to avoid such thinking actively or passively, through stages in which mental state language is used but with no clear indication of reflectiveness, and a bit higher up on the scale to basic demonstrations of mentalization such as a capacity to distinguish internal from external reality, or the limitations of direct knowledge of mental states. On the highest end of the scale, RF ratings show more-developed instances of “playing with reality” (e.g., Fonagy & Target, 1996, p. 1), in which more-complex interactional sequences involving mental states and behavior affecting each other are considered. Usually RF ratings are done on transcribed Adult Attachment Interviews (AAI; George, Kaplan, & Main, 1985). When rated from the AAI, the RF score is interpreted as the person’s capacity to mentalize in the context of attachment relationships actualized in the interview. The RF scale has also been used to rate mentalization about discrete symptoms and problems on the Panic-Specific Reflective Functioning Interview (Rudden, Milrod, Target, Ackerman, & Graf, 2006). In this interview, the RF scores are interpreted as the capacity to understand psychiatric symptoms in terms of underlying mental states.

Impaired RF has been hypothesized to be related to various psychopathologies and to the development of the psychotherapy process. For instance, deficient mentalization has been shown in patients with borderline personality disorder (Fonagy et al., 2002; Fonagy & Target, 2000), and partial deficits in mentalization have also been related to other psychiatric conditions such as anorexia nervosa (Skårderud, 2007a, 2007b, 2007c), panic disorder and depression (Lemma, Target, & Fonagy, 2011; Luyten, Fonagy, Lemma, & Target, 2012), obsessive–compulsive disorder (Kullgard, Persson, Möller, Falkenström, & Holmqvist, 2013), and posttraumatic stress disorder (PTSD; Markowitz & Meehan, 2009). There is some research suggesting that in anxiety disorders the general level of RF is not lowered but that mentalization specific to the anxiety disorder symptoms is low (Kullgard et al., 2013; Rudden et al., 2006).

Mentalization and Depression

The basic theoretical assumption of a mentalization-based approach to depression (Luyten et al., 2012) is that depressive symptoms reflect responses to threats to attachment relations, either because of (impending) separation, rejection, loss; (impending) failure experiences; or a combination of these. It is assumed that threats to attachment relations result in feelings of abandonment and/or shame, which lead to depressed mood and reduced capacity for mentalization (Fonagy et al., 2002). Reduced mentalization may in turn make it harder to resolve, or even worsen, the issues at hand, with the result that depressed mood may increase further.

Some studies have shown lower RF on the AAI in depressed patients than in nonclinical samples (Fischer-Kern et al., 2013, 2008). However, there are also studies showing close to normal RF in depressed patient groups (Taubner, Kessler, Buchheim, Kächele, & Staun, 2011). The patients in these studies differ in the level of functioning and comorbidity, which may explain the differences. The studies by Fischer-Kern et al. (2013, 2008) include severely comorbid patients (e.g., substance abusing and psychotic inpatients), whereas the study by Taubner et al. (2011) was based on patients who were stable enough to attend psychoanalytic psychotherapy several times weekly.

The causal direction of a possible relationship between impaired mentalization and depression is not established. It may be that depression impairs the capacity for mentalization or that an impaired capacity to mentalize is a risk factor for developing depression. A third alternative is a bidirectional influence, in that depression lowers mentalization and lowered mentalization creates interpersonal problems that exacerbate depression (Luyten et al., 2012). Still another explanation would be that there is a third variable, such as a general cognitive impairment due to depression severity, that causes both worsening of the depression and impaired mentalization (Montag et al., 2010).

Mentalization and Psychotherapy Process/Outcome

Interestingly, the concepts of mentalization and working alliance may be seen as having a common theoretical root. An article by Sterba (1934) described the need for patients in psychotherapy to develop an internal “split” between an observing and an experiencing part. The idea of an “observing ego” is related to mentalization in the sense that mentalizing requires the capacity to disengage from immediate experience and reflect on it. In order to engage in the therapeutic work the patient must also in a way separate between a cooperating part of the mind (the alliance) and an experiencing that faces the therapeutic challenges. Several contemporary authors have noted that mentalization to some extent may be a prerequisite for effective use of psychotherapy (Lemma et al., 2011; Taubner et al., 2011). The work of psychotherapy may demand a capacity to reflect on mental states and relationships, which implies that patients who are not used to, or even actively resist, thinking in mental state terms may be at a disadvantage when seeking psychotherapeutic help. In addition, because lower RF generally implies less-secure internal working models of attachment (Fonagy, Steele, Steele, Moran, & Higgitt, 1991), patients with lower RF may resist more-intimate or emotionally intense aspects of the therapeutic relationship as well. It is also likely that different kinds of psychotherapeutic interventions require more or less mentalization from the patient.

There is little research on mentalization as a predictor of psychotherapy outcome and/or process, and the results of the existing studies are conflicting. A recent review (Katznelson, 2014) concluded that “the evidence is inconclusive at best with regards to how levels of RF are predictive of psychotherapeutic outcome” (p. 115). A study by Fonagy et al. (1996), using an early version of the RF scale, investigated mentalization in inpatients at a London
hospital. Results showed that the only variable that predicted outcome was attachment classification.

A small pilot study (n = 24) by Müller, Kaufhold, Overbeck, and Grabhorn (2006) showed a significant and relatively strong relationship between initial RF and psychotherapy outcome (r = −.46) for a mixed-diagnostic psychiatric sample (eating disorders and depressive disorders). Patients with higher RF had better outcome. This study used the Operationalized Psychodynamic Diagnostic Interview (OPD Task Force, 2008) supplemented by five demand questions taken from the AAI to rate RF in addition to the probe “Why do you think your father/mother/partner behaved that way in this particular situation?” (Müller et al., 2006, p. 488) after every relationship episode described. Because the context for rating RF differs from those based on the AAI, it may be that the RF ratings used in the study by Müller et al. (2006) are not comparable to those for the RF as it is usually scored. Another small study by Taubner et al. (2011) used RF ratings on the full AAI to predict outcome for chronically depressed patients in long-term psychoanalytic treatment. In this study there was no statistically significant relationship between RF and outcome. More studies comparing RF ratings on different interviews are needed to find out in what ways RF is stable across different types of interviews.

Empirical findings about associations between RF and the psychotherapy process are even sparser. Taubner et al. (2011) found a moderate association between pretreatment RF and the early patient-rated therapeutic alliance (r = .48, p = .07). There was also a trend for high RF to predict improvement in alliance during treatment (d = .95, p = .08). There is thus clearly a need for further research on RF as a predictor and/or moderator of process and outcome in psychotherapy. In the present study we tested the predictive power of RF rated on the AAI on the working alliance and on outcome. In addition, we tested the associations between a new measure, the Depression-Specific Reflective Functioning (DSRF; Falkenström, June 2010) interview, and psychotherapy process and outcome.

We hypothesized that patients with low RF/DSRF scores would have more difficulties establishing a working alliance with their therapists, because of their general difficulty in fulfilling a general requirement of therapy, that is, reflecting on thoughts, feelings, and so forth in their relationships. This could be expected to result in difficulties complying with the therapeutic tasks of, for instance, cognitive restructuring (which requires reflection on thoughts as separate from external reality) in cognitive–behavioral therapy (CBT), or working with affects or communication analysis used in interpersonal psychotherapy (IPT). In addition, insecure internal working models of attachment associated with low RF might interfere more generally with the formation of a productive therapeutic relationship. This would be revealed by lower scores on working alliance measures in both therapist and patient ratings. We also predicted that patients with lower RF scores would show less improvement of symptoms during treatment. Finally, we explored potential differences between treatment types in how RF affects the working alliance and outcome. Specifically, we anticipated that due to the intense focus on interpersonal relationships in IPT, the alliance and outcome in IPT would be more adversely affected by lower RF than in CBT. Because RF is likely to be affected by general cognitive impairments in patients with MDD, we wanted to also control for this possibility using tests of cognitive functioning.

Method

Participants

The participating patients were all diagnosed with major depressive disorder (MDD) by experienced psychiatrists and clinical psychologists using the Structured Clinical Interview for DSM–IV (SCID; First, Spitzer, Gibbon, & Williams, 2002). They had been referred to the psychiatric clinic for treatment for depression. Most of them came from primary care, but some came from the psychiatric clinic’s inpatient care and some from other units at the psychiatric clinic. All patients had previously received treatment for depression, most commonly medication in primary care, with no or only partial response, and they were referred to the psychiatric clinic for further treatment. The inclusion criteria were age 18 to 65 and MDD diagnosis. Exclusion criteria were psychosis, ongoing substance addiction, serious neuropsychiatric disorder, or active self-harm behavior. Ongoing medication was not an exclusion criterion, but the recommendation was to avoid changes in medication during therapy. To be included in the study the patient had to accept random allocation to the therapy methods and video filming of all sessions. A total of 96 patients were included out of 99 who were asked to participate. Patients who met the inclusion criteria were fully informed about the study and gave their written consent. The study was approved by the Regional Ethical Review Board in Linköping (2010/348-31). The randomization procedure was done by a psychologist at the clinic not otherwise involved in the project.

Therapists

All participating therapists worked on a regular basis at the psychiatric clinic in Sundsvall. There were nine therapists providing IPT, with 4.9 as the mean number of patients treated. There were 25 therapists providing CBT, treating on average 1.9 patients.

Training and Supervision

All therapists had basic training in psychotherapy and specialist training in the treatment method they provided. During the trial the therapists in both the IPT and the CBT arm had regular supervision (once to twice a month) with trained supervisors in the respective methods and were given the opportunity to attend more days of training in their respective therapy forms.

Treatments

IPT was delivered according to the standard manual (Weissman, Markowitz, & Klerman, 2000). In this trial, 14 sessions were used. CBT was implemented according to the two principal manuals that exist for CBT for depression (Beck, Rush, Shaw, & Emery, 1979; Martell, Dimidjian, & Herman-Dunn, 2010). The CBT therapists were proficient in both manuals and used them to different degrees according to their clinical judgment. Some therapists also included mindfulness-based interventions in their CBT treatment (Segal, Williams, & Teasdale, 2013), again, according to clinical judgment. This is how CBT is usually implemented in clinical practice in psychiatric care in Sweden.
Treatment adherence was assessed by rating Sessions 3 and 7 using the Collaborative Study Psychotherapy Rating Scale (Evans, Piasecki, Kriss, & Hollon, 1984) from videotaped therapy sessions. The IPT ratings were higher in IPT therapies than in CBT therapies (2.20 vs. 1.33), F(1, 69) = 45.17, p < .001, and the CBT ratings were higher in CBT therapies than in IPT therapies (2.33 vs. 1.42), F(1, 69) = 52.95, p < .001. IPT therapists were more adherent to IPT than to CBT (2.20 vs. 1.42), F(1, 33) = 9.06, p < .001, and CBT therapists were more adherent to CBT than to IPT (2.33 vs. 1.33), F(1, 35) = 7.53, p < .001. Thus, IPT and CBT could be distinguished as separate therapies.

Measures

Beck Depression Inventory-II (BDI–II; Beck & Steer, 1996). The BDI–II is a widely used instrument to self-assess depressive symptoms. The scale consists of 21 items, each item rated from 0 to 3. The BDI–II has shown high reliability; capacity to discriminate between depressed and nondepressed participants; and improved concurrent, content, and structural validity (Wang & Gorenstein, 2013). It was completed before each session.

Montgomery Åsberg Depression Rating Scale (MADRS; Montgomery & Asberg, 1979). The MADRS is an observer-rated depression scale where depression symptoms are assessed using a clinical interview. The scale consists of 10 items where the symptom severity is rated on a scale ranging from 0 to 6, resulting in a total score of maximum 60, where higher scores indicate more-severe depression. MADRS is especially designed to be sensitive to change in symptom levels (Montgomery & Asberg, 1979). The scale has shown satisfactory internal consistency and good psychometric properties (Holländare, Andersson, & Engström, 2010; Svanborg & Åsberg, 1994). The interview was conducted before and after the treatment.

Working Alliance Inventory. The Working Alliance Inventory-Short Form (WAI–S; Tracey & Kokotovic, 1989) for therapist-rated, and revised short form (WAI-SR; Hatcher & Gillaspy, 2006) for patient-rated working alliance were used. Both forms consist of a 7-point Likert scale with four items designed to measure each of the three aspects of the alliance (agreement on goals, tasks, and emotional bonds), making a total of 12 items. The Working Alliance Inventory is one of the more widely used measures of the therapeutic alliance, and several studies have shown good reliability and validity (Horvath & Greenberg, 1989, 1994; Samstag et al., 2008). Swedish translations were used in the present study (Falkenström, Hatcher, & Holmqvist, 2015). The WAI was completed after each session by the patient and the therapist independently from each other. The largest factor analysis of the factor structure of the patient-rated WAI showed that it was best represented by a bifactor model with one general factor and two specific (“group”) factors corresponding to the Task/Goal and Bond subscales, respectively (Falkenström et al., 2015). However, because the specific factors contained little reliable variance, the recommendation is to use the sum score of all items. Although based on a much smaller sample, the therapist-rated version produced a similar finding (Tracey & Kokotovic, 1989).

Trail Making Test (TMT; Reitan, 1985). The TMT is a neuropsychological test included in many test batteries. It provides information on visual search, scanning, speed of processing, mental flexibility, and executive functions. TMT–A assesses cognitive processing speed, and TMT–B executive functions. The test is easily completed, and participants are assessed on the basis of the amount of time taken to finish it. The TMT is a commonly used test and has good psychometric properties (Lim et al., 2013; Tombaugh, Kozak, & Rees, 1999). In the present study the TMT was used to control for the possibility that RF/DSRF are lowered by general cognitive impairments due to depression.

Verbal Fluency in FAS form. FAS (based on the letters F, A, and S used in the test) is an easily completed and sensitive cognitive test assessing phonemic verbal fluency, and the score used is the number of words beginning with each letter. The test is used in many test batteries and has good psychometric properties (Lim et al., 2013; Ruff, Light, Parker, & Levin, 1996; Tombaugh et al., 1999). In the present study the FAS was used to control for the possibility that RF/DSRF were affected by general verbal fluency.

Adult Attachment Interview (AAI; George, Kaplan, & Main, 1985). In this semistructured interview, respondents are asked to describe their childhood attachment experiences and to evaluate possible impacts of these experiences on their own personality and behavior. The interview is transcribed verbatim, and the transcription is used to rate RF. In the present study a slightly shortened version of the interview was used (Questions 1–11). Because this version includes most of the so-called demand questions (Taubner et al., 2013), it is possible to use it for rating RF.

Reflective Functioning (Version 5; Fonagy et al., 1998). The Reflective Functioning scale is usually used to rate mentalization on responses to the AAI. High RF is characterized by AAI interview passages showing explicit mentalization, especially when interview questions are posed to which a response without any reflection on mental states would seem odd (so-called demand questions; e.g., “Why do you think your parents behaved the way they did during your childhood?”). The four overarching categories of responses that are scored as reflective functioning are

1. Understanding of the nature of mental states,
2. Explicit efforts to tease out mental states underlying behavior,
3. Recognizing developmental aspects of mental states, and
4. Mental states in relation to the interviewer.

Scores are given to individual passages throughout the interview, but in the end these are weighed together to create one final RF score between −1 and 9, where −1 is negative or antireflective RF, 5 is considered ordinary RF, and 9 is exceptional RF. The Reflective Functioning scale has shown the expected one-factor structure, good reliability, and stability over time (Taubner et al., 2013).

Depression-Specific Reflective Functioning (Falkenström, 2010). Rudden et al. (2006) developed the Panic-Specific Reflective Functioning interview in order to test the hypothesis that in specific Axis I disorders, mentalization in general may not be impaired but there may still be specific mentalization deficits around the understanding of a certain symptom area. In this trial we included a Swedish version of this interview, adapted slightly by Falkenström (2010) to be used with depression, therefore called
Depression-Specific Reflective Functioning (Depression-Specific be biased. On the other hand, including many nonsignificant
RF). The English version is in the Appendix. The interview was interactions would lead to less-reliable estimates and thus to a loss
scored using the original Reflective Functioning manual (Fonagy of statistical power. Because of these considerations, a model
et al., 1998). The Depression-Specific RF measure has not previ- trimming approach was used, in which all predictors were first
ously been subjected to reliability or validity tests. entered as full factorial interactions with all polynomials of time.
Assessments were made at baseline by an external rater, after If the highest order cross-level interaction was nonsignificant (e.g.,
Session 5 by the treating therapist or by the same external rater, the interaction between the predictor and the cubic term in a cubic
and at completion of therapy by the same external rater as at model), this interaction was removed and the model reestimated.
baseline. Follow-up is planned after 12 months. Ratings of BDI–II This procedure was repeated until either an interaction was found
were made by the patient before every therapy session, and the statistically significant or all interaction terms for this predictor
WAI–SR and WAI–S were scored after each session. The AAI and were removed. So, if for instance the interaction with the cubic
Depression-Specific RF interviews were transcribed verbatim and term was nonsignificant, this was removed. If, upon reestimation,
rated independently by two authors of this article. The more the interaction with the quadratic term was significant, this was
experienced rater (Fredrik Falkenström) was trained at the Anna retained together with the main effect and the lower order inter-
Freud Centre and has shown reliability with the authors of the RF action with the linear term.
scale. RF and DSRF were grand-mean-centered to facilitate interpre-
tation of intercepts. Analyses were made on the intention-to-treat
sample—that is, all patients who were randomized and came to
Procedure
Session 1—with completer status entered as covariate (dummy-
Patients were referred to the psychiatric clinic for treatment of coded with completer coded as “1” and dropout as “0”). All
depression. The recruitment started in the fall of 2010 and ended multilevel models were analyzed with full maximum likelihood
in November 2013. The patients who fulfilled the inclusion criteria estimation, using Stata (Version 13.1).
were asked about participation and were given a written informa-
tion sheet about the study with contact information. If patients
Results
gave written informed consent to participate they were randomized
to either CBT or IPT, and therapy started. All therapy sessions
were videotaped in order to make treatment adherence assessment Descriptive Statistics
possible.
The pretreatment characteristics of the study population (n ⫽
96) are presented in Table 1. The randomization seemed to be
Statistical Analyses successful except for the pretreatment MADRS scores, where the
IPT group had significantly higher scores than did the CBT group
Data was analyzed using multilevel growth curve modeling
(29.2 vs. 25.9, p ⬍ .05). Of the participants, 68.1% were on
(Singer & Willet, 2003). With regard to depression severity, re-
antidepressant medication when referred to the trial but still ful-
peated measurements of the BDI–II were modeled using a random
filled the inclusion criteria for major depressive disorder. The
intercepts and slopes model. Because differences in depression
mean age was 34.2 years, there were 68.8% women in the sample
level at treatment termination was of particular interest, the time
of participants, and 49.5% had at least one personality disorder
variable was centered on Session 14 (the last session according to
according to the Structured Clinical Interview for DSM–IV (2nd
the research protocol). With time centered on the last session, the
ed.; SCID–II; First, Gibbon, Spitzer, Williams, & Benjamin,
random intercept represents end-of-treatment status with respect to
1997). Pretreatment depression severity was in the moderate to
depression severity— estimated using available information from
severe range (BDI–II M ⫽ 36.5, SD ⫽ 9.5; MADRS M ⫽ 27.5,
all sessions. Depression end state is more reliably estimated this
SD ⫽ 6.5). There was no difference in outcome between patients
way than by using observed BDI–II scores at the last session only.
getting antidepressants and those who did not (p ⫽ .74).
The main effect of RF then represents the regression of end-state
Interrater reliability for the two raters was tested using intraclass
depression on pretreatment RF. In addition to the main effect of
correlation (ICC; two-way mixed model for the scores of a single
RF, the cross-level interactions with linear and quadratic time,
rater). For RF, ICC ⫽ .84, 95% confidence interval (CI) [.76, .89],
representing the effect of RF on change rate and change pattern
and for DSRF ICC ⫽ .83, 95% CI [.74, .89], indicating excellent
during treatment, were also tested.
reliability. Ratings from the more experienced rater (Fredrik Falk-
When WAI scores were analyzed, time was instead centered on
enström) were used in the final analyses.1 In this study, 85 tran-
the Session 1, because working alliance in the initial phase of
therapy is of particular importance theoretically (Horvath & Lu- scripts of AAI and 79 DSRF interviews were available. Six pa-
borsky, 1993). The main effect of pretreatment RF then represents tients were not interviewed, due to the interviewers’ sickness; two
model-implied working alliance at the beginning of treatment, pairs of interviews were impossible to use, due to language prob-
estimated using all available information. As with the BDI–II, lems; two patients never showed up for the interviews; and for one
cross-level interactions between pretreatment RF and time repre- patient there were technical problems making the interviews im-
sent the effect of mentalization on the rate of change in working possible to transcribe. Six additional DSRF interviews could not be
alliance during treatment. used, some due to lost videotapes and some because they were
In all analyses, we were most interested in the main effects of
predictors. Still, if significant interactions exist with any of the 1
We also tested the same models using the mean of both raters or the
polynomials of time, these have to be modeled or else results may ratings of only the second rater, and results were roughly the same.
Table 1
Pretreatment Characteristics of the Study Population

Variable                         Total sample (N = 96)   CBT (n = 48)   IPT (n = 48)   Test value   p
Mean age in years (SD)           34.2 (10.82)            32.0 (9.89)    36.5 (11.32)   2.08         .04
Women (%)                        68.8                    68.8           68.8                        ns
Pretreatment BDI–II mean (SD)    36.5 (9.5)              36.7 (9.6)     36.2 (9.6)     .08          ns
Pretreatment MADRS mean (SD)     27.5 (6.5)              25.9 (6.2)     29.2 (6.5)     6.1          < .05
Personality disorder (%)         49.5                    44.7           54.3           .87 (1)      ns
Antidepressant medication (%)    68                      67             71             .44 (94)     ns
On sick benefits at start (%)    37.1                    42.0           35.0           .68 (92)     ns
RF (SD)                          2.62 (1.22)             2.58 (1.05)    2.64 (1.41)    .21          ns
DSRF (SD)                        2.37 (.98)              2.30 (.91)     2.45 (1.06)    .66          ns

Note. CBT = cognitive–behavioral therapy; IPT = interpersonal psychotherapy; BDI–II = Beck Depression Inventory—II; MADRS = Montgomery Åsberg Depression Rating Scale; RF = reflective functioning; DSRF = depression-specific reflective functioning.
written on paper instead of spoken. The mean RF level in this group of 85 patients was 2.62 (SD = 1.23, range = 1–6). The mean DSRF (N = 79) was slightly lower at 2.37 (SD = 0.98, range = 1–5). A paired-samples t test showed that the difference in means between RF and DSRF was close to significant, F(79) = 1.77, p = .08. There was no difference in RF/DSRF prior to treatment between patients assigned to IPT and those assigned to CBT: for RF, F(82) = 0.21, p = .83; for DSRF, F(79) = 0.66, p = .51. Women had significantly higher RF than did men (mean difference = 0.76, SE = 0.27, p = .006).
Table 2 shows intercorrelations between RF, DSRF, age, pretreatment depression severity, cognitive processing speed (Trail Making Test—A), executive functions (Trail Making Test—B), and verbal fluency (FAS). RF and DSRF were significantly intercorrelated (r = .48, p < .001). RF was also significantly and positively related to patient-rated initial depression severity (pretreatment BDI–II: r = .28, p = .01; Session 1 BDI–II: r = .27, p = .01) but not to observer-rated initial depression severity (r = .05, ns). It is important to note that the direction of relationship between self-rated depression severity and RF was positive, meaning that higher RF was related to more-severe depressive symptoms. DSRF was significantly related to executive functions (TMT–B: r = .29, p = .01) but not to any of the other cognitive tests, and RF was unrelated to all cognitive tests (see Table 2). Overall, this suggests that RF and DSRF were not confounded with cognitive impairment or by depression severity.
Both RF and DSRF variables were significantly skewed toward lower values. Log-transforming them reduced skewness to nonsignificance. However, in practice it turned out that model fit for multilevel models was slightly better when nontransformed variables were used, and because interpretation of parameter estimates is more intuitive when variables are untransformed, the raw, but centered, RF and DSRF values were used.

Table 2
Correlations Between RF, DSRF, Age, Depression Severity, Cognitive Processing Speed, Executive Functions, and Verbal Fluency

Variable               RF      DSRF
DSRF                   .48*
Age                    .04     .01
Pretreatment BDI–II    .28*    .21
Session 1 BDI–II       .27*    .11
Pretreatment MADRS     .05     −.04
TMT–A                  −.05    −.01
TMT–B                  .18     .29*
FAS                    .16     .22

Note. RF = reflective functioning; DSRF = depression-specific reflective functioning; BDI–II = Beck Depression Inventory—II; MADRS = Montgomery Åsberg Depression Rating Scale; TMT–A = Trail Making Test (cognitive processing speed); TMT–B = Trail Making Test (executive functions); FAS = Verbal Fluency test.
* p < .05.

Patient Pretreatment Reflective Functioning Predicting Patient-Rated Working Alliance
Initial tests showed that the intraclass correlation (ICC), showing the amount of nesting at different levels, was zero for the therapist level and .75 for the patient level, indicating that most of the variance (75%) was at the patient level and the rest was at the repeated-measures level. Observed WAI–SR scores were then plotted over time, separately for each patient, as recommended by Singer and Willett (2003). Some of the trajectories appeared relatively linear, whereas others seemed more quadratic, suggesting that a random quadratic model might be a good approximation of the data. Starting with a linear random intercepts and slopes model, we tested increasingly complex models using the likelihood ratio test. A random cubic model with no constraints on the covariances among random effects turned out to be the best fit to the data. Because initial BDI–II was significantly related to both the level of WAI–SR and the level of RF, initial BDI–II was used as a covariate.
In the initial model with all cross-level interactions included, there were no statistically significant interactions between RF and the growth terms (ps = .18, .18, .26, for interactions with linear, quadratic, and cubic terms, respectively), showing that RF was unrelated to growth in patient-rated working alliance. In the final, trimmed model, the main effect of RF was not quite significant (coefficient = .15, SE = .09, p = .10), indicating that RF was unrelated to initial patient-rated working alliance as well.
The main effect of DSRF was statistically significant in the predicted direction (DSRF main effect = .33, SE = .11, p = .002) in the final, trimmed model (all the cross-level interactions with
growth terms in the initial model were nonsignificant; p > .05). This indicates that higher DSRF was related to a higher patient-rated alliance across treatment. Using the estimated standard deviation of the intercept of WAI–SR together with the observed standard deviation of DSRF, we calculated a standardized regression coefficient of β = .27. Adding treatment (IPT vs. CBT) as a moderator did not improve model fit, likelihood ratio χ²(2) = 1.23, p = .54, indicating that the effect of DSRF on working alliance level was constant across treatments. Figure 1 shows the effect of DSRF on patient-rated working alliance.

Patient Pretreatment Reflective Functioning Predicting Therapist-Rated Working Alliance
The ICC for the WAI–S was .34 at the patient level and .36 at the therapist level, showing that the largest source of variance for therapist ratings was the therapists (i.e., 36% of variance compared to 34% for the patient level and 30% for the repeated-measures level). Thus, including a third level of random effect(s) seemed critical. As with patient-rated scores, the WAI–S scores of therapists were first plotted separately for each patient. The trajectories for individual patients seemed more linear than for the patient data. However, model testing showed that the WAI–S trends over time were best approximated using a random quadratic model (a cubic model was tested but did not converge) with only the intercept varying among therapists. A model with covariances among patient-level random effects estimated without constraints yielded the best fit to data. The main effect of RF on therapist-rated working alliance was not statistically significant (p = .48). However, the cross-level interactions between RF and linear and quadratic time were both statistically significant (RF × Linear Coefficient = .05, SE = .02, p = .002; RF × Quadratic Coefficient = −.003, SE = .001, p = .009). There was no moderating effect of treatment, likelihood ratio χ²(3) = 2.89, p = .41. To show the effect of RF on the course of therapist-rated alliance, we plotted model-predicted fixed effects over sessions (see Figure 2). As Figure 2 shows, for patients rated higher on the RF scale, therapist-rated alliance increased faster until around midtreatment, with this effect leveling off slightly toward the end of treatment.
The same effect was shown using the DSRF (DSRF × Linear Coefficient = .06, SE = .02, p = .004; DSRF × Quadratic Coefficient = −.003, SE = .001, p = .003). As with RF, the main effect of DSRF was nonsignificant (p = .92), and there was no effect of treatment as moderator, likelihood ratio χ²(3) = 4.04, p = .26. A graph of the effect of DSRF on therapist-rated working alliance was very similar to Figure 2, so no separate graph is provided.

Patient Pretreatment Reflective Functioning Predicting Psychotherapy Outcome
As the main outcome analysis of this trial (Ekeblad, Falkenström, Andersson, Vestberg, & Holmqvist, 2015) showed, a quadratic Level 1 model was the best representation of the change in BDI–II over time. Because previous analyses showed a tendency for superiority of IPT to CBT, treatment was entered as a covariate. Also, RF was significantly related to pretreatment BDI–II scores, so pretreatment BDI–II was used as a covariate in all analyses.2 The interpretation of the main effect of RF then is the effect on change in BDI–II from pre- to posttreatment, rather than just the effect on final session BDI–II score. In the initial, full model, the cross-level interaction between RF and the quadratic change rate was not statistically significant (RF × Quadratic Change Rate = −.00, SE = .02, p = .85), so this term was constrained to zero. However, in the trimmed model the cross-level interaction with the linear term was statistically significant. Results showed that in the final, trimmed model, RF significantly predicted change in depression severity from pre- to posttreatment (RF main effect = −3.19, SE = 1.16, p = .006) as well as linear change rate (RF × Linear Change Rate = −.30, SE = .10, p = .002). The interpretation of this model is that improvement in BDI–II scores from Session 1 to 14 is 3.2 points greater for each one-point increase in RF. Using the estimated standard deviation for
BDI–II at termination (SD = 11.09) and the observed standard deviation for RF (SD = 1.22), the effect of RF on change in BDI–II amounts to a standardized regression coefficient of .35, that is, a medium-sized effect of RF on outcome. A moderator model, in which the main effect of RF and the effect of RF on linear change rate in BDI–II over sessions were moderated by treatment, was tested but found not to improve model fit, likelihood ratio χ²(2) = 2.45, p = .29.
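As a rough check of the arithmetic in the preceding paragraph, the standardized coefficient can be reproduced by rescaling the unstandardized RF main effect by the ratio of the two standard deviations quoted in the text, and the likelihood ratio p value can be recovered from the chi-square statistic. The sketch below is illustrative only: the function names `standardized_coefficient` and `lr_test_p_df2` are hypothetical, the inputs are the rounded values reported here, and the original computation may have used unrounded estimates.

```python
import math


def standardized_coefficient(b: float, sd_predictor: float, sd_outcome: float) -> float:
    """Rescale an unstandardized slope b into standard-deviation units."""
    return b * sd_predictor / sd_outcome


def lr_test_p_df2(chi2: float) -> float:
    """Upper-tail p value for a chi-square statistic with 2 degrees of freedom.

    For df = 2 the chi-square survival function has the closed form exp(-x / 2),
    so no statistics library is needed for this special case.
    """
    return math.exp(-chi2 / 2.0)


# Rounded values taken from the text. The sign of the RF effect is dropped:
# higher RF goes with a larger drop in BDI-II.
beta_rf = standardized_coefficient(b=3.19, sd_predictor=1.22, sd_outcome=11.09)
p_moderator = lr_test_p_df2(2.45)

print(f"standardized RF effect: {beta_rf:.2f}")
print(f"moderator LR test p:    {p_moderator:.2f}")
```

With the rounded inputs, both printed values agree with the figures reported above to two decimals. The closed form applies only to tests with two degrees of freedom; tests with other degrees of freedom would need a general chi-square survival function.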
The same models were tested using DSRF as predictor. As with RF, the cross-level interaction with the quadratic term was nonsignificant in the full factorial model (DSRF × Quadratic Change Rate = −.01, SE = .02, p = .68), so this effect was constrained to zero. The main effect of DSRF in the trimmed model was statistically significant (DSRF main effect = 4.49, SE = 1.38, p = .001), as was the cross-level interaction with the linear term (DSRF × Linear Change Rate = −.34, SE = .12, p = .005). The main effect of DSRF is interpreted as each one-point increase in DSRF being associated with a treatment effect that is 4.5 BDI–II points larger. This relationship corresponded to a standardized regression
Figure 1. Depression-specific reflective functioning (DSRF) predicting higher level of patient-rated working alliance across sessions. The bottom line (blue) indicates DSRF at 1.4, the middle line (red) DSRF at 2.3, and top line (green) DSRF at 3.3. WAI–SR = Working Alliance Inventory—Short Form Revised. See the online article for the color version of this figure.
2 We chose not to use the Session 1 BDI–II score as covariate, out of a concern that this would amount to using the same information twice (i.e., Session 1 BDI–II used as both a predictor and an outcome).
coefficient of β = .41. The moderator model for treatment on the effect of DSRF on outcome was tested but found not to improve model fit, likelihood ratio χ²(2) = 2.35, p = .31. The results for RF and DSRF as predictors of outcome are illustrated in Figure 3, which shows outcome at the mean and at plus/minus one standard deviation of RF (left panel) and DSRF (right panel).
Finally, because RF and DSRF were significantly intercorrelated, it was deemed of interest to test whether they predicted outcome independently or whether it was the common variance between the two that was most important. RF and DSRF were thus entered together as predictors. Results showed that DSRF still significantly predicted BDI–II at termination (DSRF main effect = −3.11, SE = 1.55, p = .04) but not change over time (DSRF × Linear Change Rate = −.18, SE = 0.13, p = .16). RF was close to significance for BDI–II at termination (RF main effect = −2.12, SE = 1.25, p = .09) and significant for linear change rate (RF × Linear Change Rate = −.25, SE = .11, p = .02). Because these results partly conflict (i.e., the last-session estimate was significant but the change rate was not for DSRF, and vice versa for RF), they were plotted for closer inspection. Plots showed that although the model was controlled for the pretreatment BDI–II measure, there was still a correlation between RF and Session 1 BDI–II. A post hoc test was performed in which estimated Session 14 BDI–II minus estimated Session 1 BDI–II was compared between patients one standard deviation above and below the mean for RF. This test showed a statistically significant difference of 7.70 BDI–II points (SE = 3.35, p = .02) between the groups. However, for DSRF this test showed a nonsignificant difference of 4.71 BDI–II points (SE = 3.39, p = .16). Thus, it seems that RF contained some unique variance that still predicted outcome after controlling for DSRF, whereas the opposite seemed not to be true.3
3 This can also be seen in the left-hand panel of Figure 3, where higher RF is associated with slightly higher estimates of Session 1 BDI–II scores, meaning that the effect of RF on change during treatment was slightly underestimated using this model.
Figure 2. Reflective functioning (RF) predicting change in therapist-rated working alliance across sessions. The bottom line (blue) indicates RF at 1.4, the middle line (red) RF at 2.6, and top line (green) RF at 3.8. WAI–S = Working Alliance Inventory—Short Form. See the online article for the color version of this figure.

Discussion
The aim of this study was to analyze mentalization measured as RF and DSRF as predictor of alliance and outcome in the treatment of MDD using IPT and CBT. The main finding was that patients with higher initial RF and DSRF attained more symptom reduction, and there were indications that the working alliance was worse for patients with lower RF.
Our results replicate those of previous studies (Fischer-Kern et al., 2013; Fischer-Kern et al., 2008; Taubner et al., 2011) that have found depressed patients to have a reduced capacity for reflecting on mental states in the context of attachment. There seems to be an important distinction between patients with depressive disorders and patients with anxiety disorders with regard to the general level of RF. In their study of psychotherapy for panic disorder, Rudden et al. (2006) found that panic disorder patients on average had normal general RF, and Kullgard et al. (2013) found almost normal general RF scores in patients with obsessive compulsive disorder (OCD). Although it is possible that this difference would be due to the generally lowered cognitive functioning often seen in depression but usually not in anxiety disorders, we found little evidence for a relationship between cognitive impairment and RF in our study.
The average RF score for the depressed patients in this study was on about the same level as what other studies have shown for borderline personality disorder (BPD) patients (Fonagy et al., 1996; Levy et al., 2006) and considerably lower than what has been shown for nonclinical samples (around RF = 5; e.g., Falkenström et al., 2014; Fonagy et al., 1998). It should be noted that there was considerable Axis II comorbidity in the present sample, which to some extent may explain the low RF. There may also be different causal explanations for low RF in Axis II disorders such as BPD and Axis I conditions such as MDD. Because MDD is a less-chronic condition than are personality disorders, it is conceivable that low RF is a result of MDD rather than the cause. If this is the case, then RF should return to normal levels when patients recover from depression. If not, it is likely that RF is more of a personality factor making patients vulnerable to depression. Such a vulnerability theory would be strengthened if patients who, in addition to recovering from depression, also increase their RF during treatment were less likely to relapse during follow-up.
RF and DSRF were not associated with observer-rated depression severity. The significant relationship between RF and pretreatment patient-rated BDI–II was in the opposite direction from what was expected; higher RF was related to higher initial depression severity (see Table 2). If this is not a spurious correlation, it suggests that patients with greater reflective capacity may be more aware of and attentive to depressive symptoms. The finding also indicates that it is highly unlikely that low RF was a simple side effect of higher depression severity. Observer-rated depression was not associated with RF level. A possible explanation for the discrepancy between self- and observer-rated depression could be that low-RF patients are less aware of their depression than are high-RF patients but that the depression of low-RF patients is nevertheless communicated (perhaps nonverbally through posture, tone of voice, and so forth) so that trained clinicians pick it up. Upon further speculation, it may be that depression in low-RF patients is perceived not as a mental state but as
Figure 3. Reflective functioning (RF; left panel) and depression-specific reflective functioning (DSRF; right panel) predicting change in depression severity from pre- to posttreatment. Left panel: The mainly bottom line (green) indicates RF at 1.4, the middle line (red) RF at 2.6, and top line (blue) RF at 3.8. Right panel: The bottom line (green) indicates DSRF at 1.4, the middle line (red) DSRF at 2.3, and top line (blue) DSRF at 3.3. BDI–II = Beck Depression Inventory—II. See the online article for the color version of this figure.
objective reality, or what is called psychic equivalence in the mentalization literature (to some extent clinical depression may always be experienced in psychic equivalence mode, but perhaps more strongly so for some patients). If true, this might indicate an alternative interpretation of the finding that low-RF patients have worse outcome than do high-RF patients, namely that change in depression in low-RF patients does not show on self-report measures, due to their incapacity to accurately perceive their own mental states. Although this might undermine our findings in the sense that it could mean that low-RF patients do not have worse outcome than do high-RF patients after all, it is still reasonable to think that incapacity to self-rate depression would be an obstacle to treatment success. It is even conceivable that this would be especially so in brief symptom-focused treatments such as CBT and IPT, because these treatments start focusing on depressive symptoms from the first session with the assumption that patients know that they are depressed.
The associations between RF and DSRF ratings on the one hand and alliance on the other were complex. Patient-rated alliance increased substantially during treatment irrespective of pretreatment DSRF or RF ratings. But the level of patient-rated alliance throughout treatment depended on the level of pretreatment DSRF, with higher DSRF scores being related to higher alliance. No significant relation was found between RF ratings and the level of patient-rated alliance. One interpretation of this may be that patients with less capacity to mentalize about their depressive symptoms may feel frustrated about perceived demands from the therapist to understand the psychological causes for their depression. This lower alliance would be specifically related to symptom-specific mentalization and not depend on the general reflective capacity.
Therapist-rated alliance was predicted by pretreatment RF and DSRF in the sense that in therapies with higher RF and DSRF ratings the therapist-rated alliance increased faster until around midtreatment. No associations with alliance levels at the beginning of therapy were found for either RF or DSRF ratings. Thus, it seems as if therapists who meet patients with higher mentalization capacity gradually, over time, feel stimulated and engaged and perceive the therapeutic collaboration as better than when meeting patients with lower capacity for mentalization. This is a response both to the general mentalization ability and to reflections about symptoms.
Both RF and DSRF ratings predicted symptom reduction significantly. Higher pretreatment RF was associated with faster symptom reduction. No difference was found between the therapy methods in this respect. It is apparently important to assess the patients’ ability to reflect about their experiences in attachment relationships and about their symptoms in order to be able to predict their response to these two therapy forms.
Previous findings have shown RF to be relatively independent of other measures of similar constructs such as mindfulness and affect consciousness (Falkenström et al., 2014) and personality (Fonagy et al., 1998). Because of this relative independence of RF compared to other variables, it is the more striking that the ratings so strongly predicted outcome across the two most empirically supported treatments of depression. Our findings indicate that these treatments work better with patients who have a better
capacity for mentalization. Bateman and Fonagy (2004) have long argued that patients with borderline personality disorder need a treatment focused specifically on restoring the patients’ capacity for mentalization. It may be that depressed patients with a reduced capacity for mentalization also need some form of mentalization-based treatment to recover. Another possibility is that patients with the most reduced RF/DSRF are better helped by longer term treatment, or by other forms of treatment/combined treatments, such as pharmacologic treatment and supportive care.
It could be argued that because DSRF is a much shorter interview and is much easier to rate than is RF on AAI, it could replace the AAI interview for RF scoring. However, because there were some differences in their ability to predict outcome, it might be wise to do both interviews until research can firmly establish whether they measure the same or different constructs.

Assets and Limitations
Strengths of this study are the use of structured interviews to assess patient diagnosis, repeated measures to assess change in depressive symptoms and alliance, randomization to assign patients to treatment, and the use of manualized treatments with integrity checks. In addition, an important asset is the relatively large number of patients who were assessed for RF on the AAI and DSRF. This data set made it possible to analyze the impact of pretreatment mentalization capacity on symptom development and alliance with a repeated-measures design, which allows for analyses of the successive development of symptoms and alliance. The possibility to check relationships between RF and several possible confounders, including tests of cognitive performance, is another strength.
Limitations include the slightly heterogeneous CBT intervention. We do not know whether the current results would hold for more “pure” forms of cognitive therapy or behavioral activation, or whether they hold for only the combination of CBT techniques used in this study. On the other hand, this combined treatment is most likely more representative of how CBT is done in clinical practice than the pure versions tested in most trials, so in this respect it increases generalizability and external validity. In addition, the DSRF measure has not been thoroughly validated previously, and the use of a slightly shortened AAI should also be seen as a limitation. In particular, the shortened AAI did not include questions about losses, which may be particularly relevant in a depressed sample (Taubner et al., 2011). Finally, it would be important in future research to test whether RF predicts change in observer-rated depression.

Conclusions
Results indicate a poor average capacity for mentalizing in patients diagnosed with MDD. The patients with lower mentalizing capacity had significantly less-successful outcome than did those with better mentalizing capacity. The finding may indicate a possibility to differentiate between patients who are more or less apt for these psychotherapies, or who may need longer treatments. It should however be noted that patients with low RF also improved to some extent. Future research should focus on understanding in more detail why these patients are not improving as well as those with better mentalizing capacity, and how these patients can be helped better.

References
Bateman, A. W., & Fonagy, P. (2004). Mentalization-based treatment of BPD. Journal of Personality Disorders, 18, 36–51. http://dx.doi.org/10.1521/pedi.18.1.36.32772
Beck, A. T., Rush, A. J., Shaw, B. F., & Emery, G. (1979). Cognitive therapy of depression. New York, NY: Guilford Press.
Beck, A. T., & Steer, R. A. (1996). Beck Depression Inventory: Manual (Swedish version). Stockholm, Sweden: Psykologiförlaget.
Ekeblad, A., Falkenström, F., Andersson, G., Vestberg, R., & Holmqvist, R. (2015). Randomized trial of interpersonal psychotherapy and cognitive behavioral therapy for major depressive disorder in a community-based psychiatric outpatient clinic. Manuscript under review.
Evans, M. E., Piasecki, M., Kriss, M. R., & Hollon, S. D. (1984). Raters’ manual for the Collaborative Study Psychotherapy Rating Scale—Form 6. Minneapolis: University of Minnesota and the St. Paul-Ramsey Medical Center.
Falkenström, F. (2010, June). Depression-specific reflective functioning. Paper session at the 41st Meeting of the Society for Psychotherapy Research. Asilomar, CA.
Falkenström, F., Hatcher, R. L., & Holmqvist, R. (2015). Confirmatory factor analysis of the patient version of the Working Alliance Inventory-Short Form Revised. Assessment, 22, 581–593. http://dx.doi.org/10.1177/1073191114552472
Falkenström, F., Solbakken, O. A., Möller, C., Lech, B., Sandell, R., & Holmqvist, R. (2014). Reflective functioning, affect consciousness, and mindfulness: Are these different functions? Psychoanalytic Psychology, 31, 26–40. http://dx.doi.org/10.1037/a0034049
First, M. B., Gibbon, M., Spitzer, R. L., Williams, J. B. W., & Benjamin, L. S. (1997). Structured Clinical Interview for DSM-IV Axis II Personality Disorders (SCID-II). Washington, DC: American Psychiatric Press, Inc.
First, M. B., Spitzer, R. L., Gibbon, M., & Williams, J. B. (2002). Structured Clinical Interview for DSM–IV–TR Axis I Disorders, Research Version, Patient Edition (SCID-I/P). New York, NY: Biometrics Research, New York State Psychiatric Institute.
Fischer-Kern, M., Fonagy, P., Kapusta, N. D., Luyten, P., Boss, S., Naderer, A., . . . Leithner, K. (2013). Mentalizing in female inpatients with major depressive disorder. Journal of Nervous and Mental Disease, 201, 202–207. http://dx.doi.org/10.1097/NMD.0b013e3182845c0a
Fischer-Kern, M., Tmej, A., Kapusta, N. D., Naderer, A., Leithner-Dziubas, K., Löffler-Stastka, H., & Springer-Kremser, M. (2008). Mentalisierungsfähigkeit bei depressiven Patientinnen: Eine Pilotstudie [The capacity for mentalization in depressive patients: A pilot study]. Zeitschrift für Psychosomatische Medizin und Psychotherapie, 54, 368–380. http://dx.doi.org/10.13109/zptm.2008.54.4.368
Fonagy, P., Gergely, G., Jurist, E. L., & Target, M. (2002). Affect regulation, mentalization, and the development of the self. New York, NY: Other Press.
Fonagy, P., Leigh, T., Steele, M., Steele, H., Kennedy, R., Mattoon, G., . . . Gerber, A. (1996). The relation of attachment status, psychiatric classification, and response to psychotherapy. Journal of Consulting and Clinical Psychology, 64, 22–31. http://dx.doi.org/10.1037/0022-006X.64.1.22
Fonagy, P., Steele, M., Steele, H., Moran, G., & Higgitt, A. (1991). The capacity for understanding mental states: The reflective self in parent and child and its significance for security of attachment. Infant Mental Health Journal, 12, 201–218. http://dx.doi.org/10.1002/1097-0355(199123)12:3<201::AID-IMHJ2280120307>3.0.CO;2-7
Fonagy, P., & Target, M. (1996). Playing with reality: I. Theory of mind and the normal development of psychic reality. International Journal of Psychoanalysis, 77 (Pt 2), 217–233.
Fonagy, P., & Target, M. (2000). Mentalization and personality disorder in
children: A current perspective from the Anna Freud Centre. In T. Lubbe (Ed.), The borderline psychotic child: A selective integration (pp. 69–89). Philadelphia, PA: Taylor & Francis.
Fonagy, P., Target, M., Steele, H., & Steele, M. (1998). Reflective-functioning manual: Version 5 for application to Adult Attachment Interview. Unpublished manual.
George, C., Kaplan, N., & Main, M. (1985). The Berkeley Adult Attachment Interview: Interview protocol. Berkeley: University of California, Department of Psychology.
Hatcher, R. L., & Gillaspy, J. A. (2006). Development and validation of a revised short version of the Working Alliance Inventory. Psychotherapy Research, 16, 12–25. http://dx.doi.org/10.1080/10503300500352500
Holländare, F., Andersson, G., & Engström, I. (2010). A comparison of psychometric properties between Internet and paper versions of two depression instruments (BDI-II and MADRS-S) administered to clinic patients. Journal of Medical Internet Research, 12(5), e49. http://dx.doi.org/10.2196/jmir.1392
Horvath, A. O., & Greenberg, L. S. (1989). Development and validation of the Working Alliance Inventory. Journal of Counseling Psychology, 36, 223–233. http://dx.doi.org/10.1037/0022-0167.36.2.223
Horvath, A. O., & Greenberg, L. S. (1994). The Working Alliance: Theory, research, and practice. Oxford, England: Wiley.
Horvath, A. O., & Luborsky, L. (1993). The role of the therapeutic alliance in psychotherapy. Journal of Consulting and Clinical Psychology, 61, 561–573. http://dx.doi.org/10.1037/0022-006X.61.4.561
Katznelson, H. (2014). Reflective functioning: A review. Clinical Psychology Review, 34, 107–117. http://dx.doi.org/10.1016/j.cpr.2013.12.003
Kullgard, N., Persson, P., Möller, C., Falkenström, F., & Holmqvist, R. (2013). Reflective functioning in patients with obsessive–compulsive disorder (OCD): Preliminary findings of a comparison between reflective functioning (RF) in general and OCD-specific reflective functioning. Psychoanalytic Psychotherapy, 27, 154–169. http://dx.doi.org/10.1080/02668734.2013.795909
Lemma, A., Target, M., & Fonagy, P. (2011). The development of a brief psychodynamic intervention (dynamic interpersonal therapy) and its application to depression: A pilot study. Psychiatry, 74, 41–48. http://dx.doi.org/10.1521/psyc.2011.74.1.41
Levy, K. N., Meehan, K. B., Kelly, K. M., Reynoso, J. S., Weber, M., Clarkin, J. F., & Kernberg, O. F. (2006). Change in attachment patterns and reflective function in a randomized control trial of transference-focused psychotherapy for borderline personality disorder. Journal of Consulting and Clinical Psychology, 74, 1027–1040. http://dx.doi.org/10.1037/0022-006X.74.6.1027
Montgomery, S. A., & Åsberg, M. (1979). A new depression scale designed to be sensitive to change. British Journal of Psychiatry, 134, 382–389. http://dx.doi.org/10.1192/bjp.134.4.382
Müller, C., Kaufhold, J., Overbeck, G., & Grabhorn, R. (2006). The importance of reflective functioning to the diagnosis of psychic structure. Psychology and Psychotherapy, 79, 485–494. http://dx.doi.org/10.1348/147608305X68048
OPD Task Force (Eds.) (2008). Operationalized Psychodynamic Diagnosis OPD-2: Manual of diagnosis and treatment planning. Cambridge, MA: Hogrefe & Huber.
Reitan, R. (1985). Halstead-Reitan Neuropsychological Test Battery: Theory and clinical interpretation. Tucson, Arizona: Reitan Neuropsychology. ISBN 0934515026
Rudden, M., Milrod, B., Target, M., Ackerman, S., & Graf, E. (2006). Reflective functioning in panic disorder patients: A pilot study. Journal of the American Psychoanalytic Association, 54, 1339–1343. http://dx.doi.org/10.1177/00030651060540040109
Ruff, R. M., Light, R. H., Parker, S. B., & Levin, H. S. (1996). Benton Controlled Oral Word Association Test: Reliability and updated norms. Archives of Clinical Neuropsychology, 11, 329–338. http://dx.doi.org/10.1093/arclin/11.4.329
Samstag, L. W., Muran, J. C., Wachtel, P. L., Slade, A., Safran, J. D., & Winston, A. (2008). Evaluating negative process: A comparison of working alliance, interpersonal behavior, and narrative coherency among three psychotherapy outcome conditions. American Journal of Psychotherapy, 62, 165–194.
Segal, Z. V., Williams, J. M. G., & Teasdale, J. D. (2013). Mindfulness-based cognitive therapy for depression (2nd ed.). New York, NY: Guilford Press.
Singer, J. D., & Willett, J. B. (2003). Applied longitudinal data analysis: Modeling change and event occurrence. New York, NY: Oxford University Press. http://dx.doi.org/10.1093/acprof:oso/9780195152968.001.0001
Skårderud, F. (2007a). Eating one’s words. Part I. “Concretised metaphors” and reflective function in anorexia nervosa—An interview study. European Eating Disorders Review, 15, 163–174. http://dx.doi.org/10.1002/erv.777
Skårderud, F. (2007b). Eating one’s words. Part II. The embodied mind and reflective function in anorexia nervosa—Theory. European Eating Disorders Review, 15, 243–252. http://dx.doi.org/10.1002/erv.778
Skårderud, F. (2007c). Eating one’s words: Part III. Mentalisation-based psychotherapy for anorexia nervosa—An outline for a treatment and training manual. European Eating Disorders Review, 15, 323–339. http://dx.doi.org/10.1002/erv.817
Stata (Version 13.1) [Computer software]. College Station, TX: StataCorp.
Sterba, R. (1934). The fate of the ego in analytic therapy. The International Journal of Psychoanalysis, 15, 117–126.
Lim, J., Oh, I. K., Han, C., Huh, Y. J., Jung, I.-K., Patkar, A. A., . . . Jang, B.-H. (2013). Sensitivity of cognitive tests in four cognitive domains in discriminating MDD patients from healthy controls: A meta-analysis. International Psychogeriatrics, 25, 1543–1557. http://dx.doi.org/10
.1017/S1041610213000689 Svanborg, P., & Åsberg, M. (1994). A new self-rating scale for depression
Luyten, P., Fonagy, P., Lemma, A., & Target, M. (2012). Depression. In and anxiety states based on the Comprehensive Psychopathological
A. W. Bateman & P. Fonagy (Eds.), Handbook of mentalizing in mental Rating Scale. Acta Psychiatrica Scandinavica, 89, 21–28. http://dx.doi
health practice (pp. 385– 417). Arlington, VA: American Psychiatric .org/10.1111/j.1600-0447.1994.tb01480.x
Publishing. Target, M., & Fonagy, P. (1996). Playing with reality: II. The development
Markowitz, J. C., & Meehan, K. (2009). Reflective functioning and PTSD. of psychic reality from a theoretical perspective. International Journal of
Paper presented at the 40th Meeting of the Society for Psychotherapy Psychoanalysis, 77, 459 – 479.
Research, Santiago de Chile, Chile. Taubner, S., Hörz, S., Fischer-Kern, M., Doering, S., Buchheim, A., &
Martell, C. R., Dimidjian, S., & Herman-Dunn, R. (2010). Behavioral Zimmermann, J. (2013). Internal structure of the Reflective Functioning
activation for depression: A clinician’s guide. New York, NY: Guilford Scale. Psychological Assessment, 25, 127–135. http://dx.doi.org/10
Press. .1037/a0029138
Montag, C., Ehrlich, A., Neuhaus, K., Dziobek, I., Heekeren, H. R., Heinz, Taubner, S., Kessler, H., Buchheim, A., Kächele, H., & Staun, L. (2011).
A., & Gallinat, J. (2010). Theory of mind impairments in euthymic The role of mentalization in the psychoanalytic treatment of chronic
bipolar patients. Journal of Affective Disorders, 123, 264 –269. http:// depression. Psychiatry, 74, 49 –57. http://dx.doi.org/10.1521/psyc.2011
dx.doi.org/10.1016/j.jad.2009.08.017 .74.1.49
Tombaugh, T. N., Kozak, J., & Rees, L. (1999). Normative data Wang, Y.-P., & Gorenstein, C. (2013). Psychometric properties of the Beck
stratified by age and education for two measures of verbal fluency: Depression Inventory-II: A comprehensive review. Revista Brasileira de
FAS and animal naming. Archives of Clinical Neuropsychology, 14, Psiquiatria (São Paulo, Brazil), 35, 416 – 431. http://dx.doi.org/10.1590/
167–177. 1516-4446-2012-1048
Tracey, T. J., & Kokotovic, A. M. (1989). Factor structure of the Working Weissman, M. M., Markowitz, J. C., & Klerman, G. L. (2000). Compre-
Alliance Inventory. Psychological Assessment, 1, 207–210. http://dx.doi hensive guide to interpersonal psychotherapy. New York, NY: Basic
.org/10.1037/1040-3590.1.3.207 Books.
Appendix
Depression-Specific Reflective Functioning Interview
Adapted from Rudden, Milrod, Target, Ackerman, and Graf (2006).

1. Why do you think you are depressed?
2. Have your ideas about why you are depressed changed over time?
3. Do you ever notice that you get more depressed by certain events, thoughts, or feelings?
   If yes: What may this be?
   If yes: Do you have any ideas about how these things might connect to your depression?

Received February 2, 2015
Revision received August 18, 2015
Accepted August 21, 2015
