Accepted Manuscript

Are the Insomnia Severity Index and Pittsburgh Sleep Quality Index valid outcome
measures for Cognitive Behavioral Therapy for Insomnia? Inquiry from the
perspective of response shifts and longitudinal measurement invariance in their
Chinese versions

Po-Yi Chen, Ya-Wen Jan, Chien-Ming Yang

PII: S1389-9457(17)30179-X
DOI: 10.1016/j.sleep.2017.04.003
Reference: SLEEP 3374

To appear in: Sleep Medicine

Received Date: 11 January 2017


Revised Date: 20 March 2017
Accepted Date: 6 April 2017

Please cite this article as: Chen P-Y, Jan Y-W, Yang C-M, Are the Insomnia Severity Index and
Pittsburgh Sleep Quality Index valid outcome measures for Cognitive Behavioral Therapy for Insomnia?
Inquiry from the perspective of response shifts and longitudinal measurement invariance in their Chinese
versions, Sleep Medicine (2017), doi: 10.1016/j.sleep.2017.04.003.

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to
our customers we are providing this early version of the manuscript. The manuscript will undergo
copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please
note that during the production process errors may be discovered which could affect the content, and all
legal disclaimers that apply to the journal pertain.

Keywords:
Cognitive Behavioral Therapy for Insomnia
CBT-I
Insomnia Severity Index
Pittsburgh Sleep Quality Index
Measurement invariance
Response shift

ABSTRACT

Objective: The purpose of this study was to examine whether the Insomnia Severity
Index (ISI) and Pittsburgh Sleep Quality Index (PSQI) are valid outcome measures for
Cognitive Behavioral Therapy for Insomnia (CBT-I). Specifically, we tested whether
the factorial parameters of the ISI and the PSQI could remain invariant against CBT-I,
which is a prerequisite to using their change scores as an unbiased measure of the
treatment outcome of CBT-I.

Methods: A clinical data set including scores on the Chinese versions of the ISI and
the PSQI obtained from 114 insomnia patients prior to and after a 6-week CBT-I
program in Taiwan was analyzed. A series of measurement invariance (MI) tests were
conducted to compare the factorial parameters of the ISI and the PSQI before and
after the CBT-I treatment program.

Results: Most factorial parameters of the ISI remained invariant after CBT-I.
However, the factorial model of the PSQI changed after CBT-I treatment. An extra
loading and three residual correlations were added into the factorial model after
treatment.

Conclusions: The partial strong invariance of the ISI supports its use as a valid
outcome measure for CBT-I. In contrast, various changes in the factor model of the
PSQI indicate that it may not be an appropriate outcome measure for CBT-I. Some
possible causes for the changes of the constructs of the PSQI following CBT-I are
discussed.

1. Introduction

Insomnia is defined as the subjective complaint of dissatisfaction with sleep
quantity and/or quality [1]. Thus, a self-rating measure for the severity of insomnia is
important in studies of insomnia treatments. Researchers often use the difference
scores on self-rating scales assessed before and after treatment as an index for patients’
improvement associated with the treatment [2]. Although the use of change scores for
treatment effect is a common practice, potential threats that might confound the
interpretations of the scores are often neglected. From the perspective of
psychometrics, for example, longitudinal measurement invariance (MI) against the
treatment should be a prerequisite for using the change on a scale as a measure for
treatment outcome. A lack of MI can bias the interpretation of the change scores and
therefore threaten the validity of the results [3].

1.1. Importance of MI in evaluating the effects of psychological interventions
MI is an important element of psychological tests. It concerns whether the target
construct is measured in the same way across occasions [4,5]. In longitudinal studies,
testing MI is similar to examining whether researchers use the same ruler to measure a
target across time points [6], where the “scales of the ruler” are the factorial
parameters like factor loadings or intercepts. The longitudinal change scores of a
self-report measure can be meaningfully interpreted only when the “scale” of the ruler
is identical across time points. Otherwise, researchers will have difficulty
differentiating the “true change” of the targets (eg, decrease in severity of insomnia
due to Cognitive Behavioral Therapy for Insomnia [CBT-I]) from simple shifts in the
scale of the ruler.

In the scenario of CBT-I, the treatment might not only alleviate the symptoms of
insomnia; it could also change patients’ attitudes and concepts of sleep and insomnia.
Given that previous studies have shown that the cognitive changes caused by
psychological interventions might change the factorial parameters underlying
questionnaires, it is reasonable to suspect that some questionnaires that researchers
use in CBT-I could also be affected. These changes could hinder researchers in
accurately estimating the treatment efficacy of CBT-I. For example, if CBT-I
strengthens an item’s relation with the underlying construct of insomnia in a patient,
then the factor loading of this item could also increase, because loadings are
usually considered as links between observable indicators and latent constructs. Given
this situation, the observed change score on this item for insomnia severity will, on
average, underestimate the “real” treatment efficacy, for the decrease in subjective
ratings on the targeted latent construct will be offset by the increase of factor loadings
[7−10]. More detailed explanations of the influences of noninvariant factorial
parameters on the validity of change scores can be found in the Appendix (online
supplementary materials).
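
The offsetting effect described above can be made concrete with the standard linear factor model. The sketch below assumes a single item loading on a single factor with zero-mean residuals; the symbols (x for the item score, τ for the intercept, λ for the loading, η for the latent construct, κ for its mean) are introduced here for illustration and are not taken from the Appendix.

```latex
% Item score at occasion t (t = 1 pretreatment, t = 2 posttreatment)
x_{t} = \tau_{t} + \lambda_{t}\,\eta_{t} + \varepsilon_{t},
\qquad E(\varepsilon_{t}) = 0, \quad E(\eta_{t}) = \kappa_{t}

% Expected observed change score, split into true change and parameter shifts
E(x_{2} - x_{1})
  = (\tau_{2} - \tau_{1}) + \lambda_{2}\kappa_{2} - \lambda_{1}\kappa_{1}
  = \lambda_{1}(\kappa_{2} - \kappa_{1}) + (\lambda_{2} - \lambda_{1})\,\kappa_{2} + (\tau_{2} - \tau_{1})

% Under strong invariance (\tau_{1} = \tau_{2}, \ \lambda_{1} = \lambda_{2} = \lambda)
E(x_{2} - x_{1}) = \lambda\,(\kappa_{2} - \kappa_{1})
```

Only in the last case is the observed change proportional to the latent change κ2 − κ1. Otherwise, a shift in the loading or the intercept is mixed into the change score and can offset part of a true decrease.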

From the perspective of psychometrics, checking whether factorial parameters
are invariant across time is exactly the issue addressed by MI tests in confirmatory
factor analysis (CFA). The four most common MI tests in CFA are the configural, weak,
strong, and strict invariance tests. These four tests focus on the factor structure,
loadings, intercepts, and residual variances, respectively, and are usually conducted in
sequence. According to the literature [7], strong invariance is a key property to ensure
the longitudinal comparability of a self-report measure.
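
As a compact summary, the four nested models can be written as cumulative equality constraints on a two-occasion factor model. The matrix symbols (Λ for loadings, τ for intercepts, Θ for residual variances) follow common CFA notation and are introduced here only for illustration.

```latex
\begin{aligned}
\text{Model at occasion } t:\quad
  & \mathbf{x}_{t} = \boldsymbol{\tau}_{t} + \boldsymbol{\Lambda}_{t}\boldsymbol{\eta}_{t} + \boldsymbol{\varepsilon}_{t},
    \qquad \operatorname{Var}(\boldsymbol{\varepsilon}_{t}) = \boldsymbol{\Theta}_{t}, \quad t = 1, 2 \\
\text{Configural:}\quad
  & \text{same pattern of zero and nonzero entries in } \boldsymbol{\Lambda}_{1} \text{ and } \boldsymbol{\Lambda}_{2} \\
\text{Weak:}\quad
  & \boldsymbol{\Lambda}_{1} = \boldsymbol{\Lambda}_{2} \\
\text{Strong:}\quad
  & \boldsymbol{\Lambda}_{1} = \boldsymbol{\Lambda}_{2}, \;\; \boldsymbol{\tau}_{1} = \boldsymbol{\tau}_{2} \\
\text{Strict:}\quad
  & \boldsymbol{\Lambda}_{1} = \boldsymbol{\Lambda}_{2}, \;\; \boldsymbol{\tau}_{1} = \boldsymbol{\tau}_{2}, \;\;
    \operatorname{diag}(\boldsymbol{\Theta}_{1}) = \operatorname{diag}(\boldsymbol{\Theta}_{2})
\end{aligned}
```

Each model adds one set of constraints to the previous one, which is why the tests are conducted in sequence.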

1.2. Response shifts caused by cognitive behavioral therapy (CBT) and their
corresponding clinical interpretations in a CFA framework
In contrast to the invariant properties mentioned above, the changes (ie,
noninvariance) in factor loadings and intercepts can be associated with a
psychological phenomenon called response shifts [11−13]. A response shift
can be defined as a change in the meaning of one’s self-report on the target
construct due to the following: 1) changes in the internal scale that one uses
for self-evaluation (ie, recalibration); 2) rearrangements in the order of
importance of the items (components) that one uses to compose the target
construct (ie, reprioritization); or 3) conceptual changes in the way that one
defines the target construct (ie, reconceptualization) [14].

Various methods have been developed to detect response shifts. Oort proposed a
procedure to test response shifts with invariance tests in CFA (ie, to identify the
noninvariant parts of a factorial model) [12]. Fokkema et al. extended Oort’s work
to the scenario of CBT for depression and offered possible clinical interpretations of
different kinds of noninvariance [11]. First, they proposed that changes in a
questionnaire’s factor structure across time (ie, failure to pass the configural
invariance test) can be considered as evidence of reconceptualization, because it means
that patients used different items to define the underlying construct after the
treatment of CBT. Second, if the factor structure underlying a questionnaire is
invariant after treatment but the loadings of some items become higher (ie, failure to
pass the weak invariance test), then it represents a reprioritization because these
items have become more indicative for the patients. Third, the changes in items’
intercepts (failure to pass the strong invariance test) represent uniform recalibration
(change in the initial point of self-evaluation). An increase in an item’s intercept
might indicate that patients have become more sensitive to the symptom depicted by the
item after CBT treatment. Fourth, changes in residual variance can be considered
nonuniform recalibration. Table 1 presents a summary of Oort’s definition of response
shifts (in a CFA framework), corresponding MI tests, and possible clinical
interpretations proposed by Fokkema et al.
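
The correspondence just described can be sketched as a small lookup structure. This is only an illustrative restatement of the four cases in the paragraph above; the dictionary and function names are hypothetical and do not come from Oort or Fokkema et al.

```python
# Mapping from the first failed invariance test to the noninvariant parameter
# and the response-shift interpretation described in the text.
# Names are illustrative assumptions, not part of any published procedure.
RESPONSE_SHIFT_BY_FAILED_TEST = {
    "configural": ("factor structure", "reconceptualization"),
    "weak": ("factor loadings", "reprioritization"),
    "strong": ("intercepts", "uniform recalibration"),
    "strict": ("residual variances", "nonuniform recalibration"),
}


def interpret_failed_test(level: str) -> str:
    """Return a short description for the invariance level that failed."""
    parameter, shift = RESPONSE_SHIFT_BY_FAILED_TEST[level.lower()]
    return f"Noninvariant {parameter}: {shift}"


print(interpret_failed_test("strong"))
```

For example, a failed strong invariance test is read as noninvariant intercepts, that is, uniform recalibration.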

1.3. Response shifts that have been found in psychological interventions
Considering that the aims of psychological treatments such as CBT usually
involve reshaping patients’ cognitions, it is reasonable to expect that some parts
of the factorial model will change after treatment [3,11,15−17]. For example,
researchers have recently found evidence indicating that the Beck Depression
Inventory (BDI) is not invariant against CBT for depression [11]. Specifically,
it was found that most of the intercepts and two factor loadings of the BDI
changed after treatment. This phenomenon not only reflects the response shifts
caused by CBT but also indicates that the BDI failed to pass the strong
invariance test. Given these results, the authors concluded that the scores on
the BDI obtained before and after CBT might not be comparable to each other.
Wu also found similar results [15] and concluded that, due to the confounding
caused by response shifts (ie, measurement noninvariance), using the BDI as a
measure for psychological treatment outcome may entail bias.

1.4. Possible response shifts on the ISI and PSQI caused by CBT-I
Several studies have demonstrated the potential influence of response shifts
(noninvariance) on self-report measures after psychological interventions.
Consequently, researchers in different areas have begun to examine the MI of
their own measures with MI tests after psychological interventions [11,15−17].
Among the treatments for insomnia, CBT-I has been demonstrated to be
effective and is recommended as a first-line treatment for insomnia [18].
CBT-I, like many other psychological interventions, also contains the element
of reshaping patients’ cognitions and could therefore cause response shifts
(noninvariance). As far as we know, no studies to date have addressed the MI
of the outcome measures in treatment studies of CBT-I.

Among the rating scales commonly used in insomnia studies, the Insomnia
Severity Index (ISI) and the Pittsburgh Sleep Quality Index (PSQI) are the
recommended measures for global sleep and insomnia symptoms in a standard
research assessment protocol for insomnia [19−21]. The ISI is one of the most
common outcome measures in CBT-I studies [22−25]. The PSQI is also widely used
as an outcome measure in CBT-I studies [2,25,26]. For example, in a recent
systematic review and meta-analysis of CBT-I for chronic insomnia, the PSQI was one
of the two questionnaires found to be used consistently enough for meta-analysis
at the posttreatment time point [25]. In another recent meta-analysis of the treatment
effect of group CBT-I, the PSQI was used in four of the seven included studies [26].
Therefore, in the current study, we examined the MI properties of the ISI
and the PSQI against CBT-I.

1.5. Research hypotheses
Most of the items on the ISI directly focus on patients’ subjective feelings
about their insomnia symptoms [21]. Furthermore, the three-factor model of
the ISI proposed by Bastien et al. [22] has been successfully replicated in clinical,
nonclinical, and cross-cultural studies [27,28]. In contrast, the PSQI was
developed to measure general sleep quality [20], a construct that is still not
well defined [29]. Even though a three-factor model of the PSQI has been
demonstrated in one study [30], it could not be replicated in other studies (eg,
Otte et al. [31]). Recently, a review article of the PSQI concluded that its structural
validity is only moderate [32]. Given these properties and results, we
hypothesized that the invariance against CBT-I of the ISI would be stronger
than that of the PSQI. More details about the psychometric properties of the
ISI and the PSQI we used in this study are provided in the following Method
section.

2. Method
2.1. Study participants


The participants were 114 insomnia patients recruited from a general hospital and
communities in northern Taiwan (mean age = 43.15 years; 85 women and 29 men).
The inclusion criteria were age 20−65 years and a DSM-IV diagnosis of primary
insomnia. The exclusion criteria were as follows: presence of psychotic disorders or
substance-related disorders; other sleep disorders such as periodic limb movement
disorder (PLMD; periodic movement disorder index ≥ 15 movements/h) and sleep
apnea (apnea−hypopnea index ≥ 5 events/h); medical conditions that might interfere
with sleep; use of medications or psychotropic drugs that might affect sleep; and
employment as a shift-worker or other reason for habitual irregular sleep schedule.

2.2. Procedure

Potential participants went through a clinical interview to screen for possible
psychiatric, medical, and sleep disorders. A structured interview, the Mini
International Neuropsychiatric Interview, was also conducted to screen for
psychiatric disorders. After passing the screening, the participants were asked
to fill out a packet of questionnaires, including the Chinese version of the ISI
and the Chinese version of the PSQI (pretreatment test). They then underwent
1 night of polysomnography (PSG) recording to rule out sleep-related
breathing disorders, PLMD, and other sleep disorders. All participants
received six weekly sessions of CBT-I. The CBT-I program involved a
multifaceted intervention with educational, behavioral, and cognitive
components [20]. The contents of the program included sleep education
regarding basic sleep regulation and the etiology of insomnia, sleep hygiene
principles, behavioral techniques for insomnia (eg, sleep restriction therapy,
stimulus control instructions), relaxation training, principles for hypnotic
tapering, and cognitive restructuring. The participants were requested to
complete the packet of questionnaires again after completing the CBT-I
program (post-test). The procedures were approved by an ethics review board,
and each participant signed an informed consent form.

2.3. Measures
This study evaluated the MI properties of the ISI and PSQI against CBT-I. The ISI is a
self-rating scale designed to assess the subjective perception of the severity of
insomnia [21,22]. The questionnaire contains seven items with Likert-type scales that
measure the symptoms, associated features, and impact of insomnia. The Chinese
version of the ISI that we used for this study was translated and validated by Yang et
al. in Taiwan [33]. It was found to have sufficient internal reliability (Cronbach’s α =
0.94). Furthermore, the Chinese version of the ISI has been shown to successfully
differentiate insomnia patients from healthy participants, with sensitivity and
specificity both > 0.9. A recent study based on data collected in Hong Kong, Taiwan,
and Canada further supported the structural validity and cross-cultural comparability
(measurement invariance) of the ISI [27].

The PSQI is a 19-item questionnaire that is designed to assess general sleep
quality over a 1-month period [20]. A global score and seven component scores can be
derived from the scale. It has been used in both research and clinical settings for the
evaluation of sleep quality and screening for sleep disturbances. Tsai et al. translated
the scale and examined the psychometric properties of the Chinese version of the
PSQI [34]. They reported acceptable internal reliability (Cronbach’s α = 0.85) of the
Chinese version of the PSQI in clinical patients in Taiwan. Its convergent validity was
also supported by significant correlations between the scores obtained on the
PSQI and the parameters from sleep logs. However, it is noteworthy that the factorial
model of the PSQI could be different across nations and regions. For example, in a
cross-national study, Gelaye et al. found a two-factor model of the PSQI in samples
from Chile, Ethiopia, and Thailand, whereas a three-factor model was found in
samples from Peru [35].

2.4. Data analysis

R 3.1.1 and its package lavaan were used for all of the analyses [36,37]. The
full information maximum likelihood (FIML) method was used to handle
missing data [38]. In the current study, we fitted the longitudinal configural →
weak invariance → strong invariance → strict invariance models in sequence.
The fit of each of the invariance models was evaluated with global fit
indices and Δχ² tests. Specifically, the configural invariance assumption was
evaluated with the global fit of the configural invariance model. The fit of the other
invariance models was evaluated with Δχ² tests. An invariance assumption
held if its corresponding Δχ² test result was nonsignificant (p > .05). On the
other hand, if an invariance model did not demonstrate sufficient fit to the
data, we referred to the modification indices to release model constraints [7].
Our configural invariance models were established based on the factorial
models proposed by Bastien et al. [22] and Cole et al. [30] for the ISI and
PSQI, respectively. The global fit indices indicated that these two models fit
the ISI and PSQI well with our pretreatment data (both models’ CFI and TLI >
0.90, RMSEA < 0.08, the criterion mentioned by Little [7] for a longitudinal
CFA model).
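
For reference, the nested-model comparison applied at each step can be written as follows. The subscripts (c for the more constrained model, f for the less constrained one) are labels introduced here for illustration. The expression is the ordinary chi-square difference test and does not reflect any scaling correction that the software may apply.

```latex
\Delta\chi^{2} = \chi^{2}_{c} - \chi^{2}_{f}, \qquad
\Delta df = df_{c} - df_{f}, \qquad
p = \Pr\!\left(\chi^{2}_{\Delta df} \ge \Delta\chi^{2}\right)
```

The added equality constraints are retained when p > .05; otherwise, constraints are released one at a time with the help of modification indices.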

3. Results
3.1. Longitudinal MI tests for the ISI


The descriptive statistics of the items of the ISI and the PSQI are presented in
Table 2. The global fit indices of the MI models that we fitted to our ISI data
are presented in Table 3. The configural, weak, and strong MI models all fit
the ISI data acceptably (CFI and TLI > 0.90, RMSEA < 0.08). The only
potential misfit in the first three MI models was the Δχ² test between the
weak and strong invariance models (Δχ² = 9.98, df = 4, p < .05). Therefore,
we referred to modification indices and let the intercept of ISI item 4 (worry)
be freely estimated (the Δχ² between this model and the weak invariance model
was 5.617, df = 1, p > .01). After establishing the partial strong invariance
model (ie, M4 in Table 3), we then further examined the partial strict
invariance model with all corresponding error variances set to be equal (ie, M5
in Table 3). The Δχ² test again indicated that the assumption that all
corresponding errors were identical was not tenable (M4 versus M5 in Table 3,
Δχ² = 37.148, df = 7, p < .01). Therefore, by referring to modification indices,
we then allowed three error variances in the posttreatment items (items 1b, 5,
7) to be freely estimated in sequence (M6−M8 in Table 3) to make the Δχ²
test become nonsignificant again (M5 versus M8 in Table 3, Δχ² = 10.665, df
= 5, p > .05). The estimates of the final partial strict invariance model (M8 in
Table 3) are presented in Table 4. That table shows that the intercept term that
was allowed to be freely estimated became higher after treatment (ie, for ISI_4,
worry, intercept 1.737 → 2.073). On the other hand, three error variances that
were allowed to be freely estimated became lower after treatment (ie, ISI_1b:
0.203 → 0.01; ISI_3: 0.450 → 0.099; and ISI_5: 0.507 → 0.301).
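
The significance decisions for the reported Δχ² values can be checked by hand. The following is a minimal sketch that uses only the Python standard library. The function name chi2_sf is hypothetical, and the routine assumes integer degrees of freedom and an uncorrected chi-square reference distribution, which may differ from what the analysis software applied.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) for a chi-square variate with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting values: df = 1 uses erfc, df = 2 uses exp.
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1
    else:
        q, k = math.exp(-half), 2
    # Recurrence: Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1).
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


# (label, delta chi-square, delta df) as reported in Section 3.1.
REPORTED = [
    ("weak vs strong invariance", 9.98, 4),
    ("weak vs partial strong invariance", 5.617, 1),
    ("partial strong vs strict invariance", 37.148, 7),
    ("final partial strict invariance model", 10.665, 5),
]

for label, delta_chi2, delta_df in REPORTED:
    p_value = chi2_sf(delta_chi2, delta_df)
    print(f"{label}: p = {p_value:.4f}")
```

Under these assumptions, the first and third comparisons fall below .05, the second falls between .01 and .05, and the last falls just above .05.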
3.2. Response shifts in the factor model of the PSQI after CBT-I
The global fit indices of the configural MI model based on the PSQI data
indicated insufficient fit (χ² = 94.062, df = 61, CFI = 0.896 < 0.90,
RMSEA = 0.069, TLI = 0.845 < 0.90). The lack of fit of the configural MI
model indicated that the factor structure of the PSQI changed and that the
constructs that the PSQI measures were reconceptualized after CBT-I.

To identify the causes of the misfit, we again referred to modification indices to
allow three between-residual correlations in the posttreatment PSQI model, namely,
those between components 3 (sleep duration) and 6 (used sleeping medicine), 4
(habitual sleep efficiency) and 1 (subjective sleep quality), and 6 (used sleeping
medicine) and 7 (daytime functioning), to be freely estimated. Furthermore, we also
added an extra loading between the sleep efficiency factor and the item on subjective
sleep quality (PSQI_1 in Table 5). The global fit indices indicated that this partial
configural invariance model fit the data acceptably (χ² = 73.665, df = 58, CFI = 0.941,
RMSEA = 0.049, TLI = 0.923). The estimates of this final model are presented in Table 5.
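
As a rough check on the fit statistics above, the RMSEA can be recomputed from the reported χ², degrees of freedom, and sample size. The sketch below assumes the textbook single-group point estimate with N − 1 in the denominator (some software uses N instead), and the function names are hypothetical. The CFI and TLI values are taken as reported because they depend on a baseline model that is not given in the text.

```python
import math


def rmsea(chisq: float, df: int, n: int) -> float:
    """RMSEA point estimate from the model chi-square, its df, and the sample size.

    Assumption: textbook formula sqrt(max(chisq - df, 0) / (df * (n - 1))).
    """
    return math.sqrt(max(chisq - df, 0.0) / (df * (n - 1)))


def acceptable_fit(cfi: float, tli: float, rmsea_value: float) -> bool:
    """Criterion cited in the text: CFI and TLI > 0.90 and RMSEA < 0.08."""
    return cfi > 0.90 and tli > 0.90 and rmsea_value < 0.08


N = 114  # number of participants reported in the Method section

# Values reported for the PSQI configural model before and after modification.
MODELS = {
    "configural": {"chisq": 94.062, "df": 61, "cfi": 0.896, "tli": 0.845},
    "partial configural": {"chisq": 73.665, "df": 58, "cfi": 0.941, "tli": 0.923},
}

for name, m in MODELS.items():
    value = rmsea(m["chisq"], m["df"], N)
    ok = acceptable_fit(m["cfi"], m["tli"], value)
    print(f"{name}: RMSEA = {value:.3f}, acceptable = {ok}")
```

The recomputed RMSEA values agree with the reported 0.069 and 0.049 to three decimals, and only the modified model meets all three criteria.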

4. Discussion

The current study examined the MI of two self-rating scales, the ISI and PSQI,
which are commonly used as outcome measures in treatment studies for insomnia.
Our results showed that the ISI had stronger MI against CBT-I in comparison to the
PSQI. Most items of the ISI showed strong MI against CBT-I. Only one item,
measuring participants’ worry about their insomnia, had an increase in its intercept
after treatment. Considering the rule of thumb about partial MI and longitudinal
comparability suggested by Little [7] (ie, two of three items within a factor are still
invariant), a single noninvariant item should not have a profound impact on the
comparability of ISI scores. Therefore, it is reasonable to assert that our results
provide evidence that the ISI is an appropriate outcome measure for CBT-I.
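
To illustrate why a single shifted intercept has a limited but nonzero effect, the sketch below computes the expected change on one item under a linear factor model. The two intercepts are the estimates reported for ISI item 4 in the Results. The loading and the latent means are hypothetical values chosen only for illustration, and the function name is likewise an assumption.

```python
def expected_item_change(tau_pre, tau_post, loading, kappa_pre, kappa_post):
    """Expected post-minus-pre item mean: (tau2 + lam * k2) - (tau1 + lam * k1)."""
    return (tau_post + loading * kappa_post) - (tau_pre + loading * kappa_pre)


TAU_PRE, TAU_POST = 1.737, 2.073    # reported intercepts for ISI item 4 (worry)
LOADING = 1.0                       # hypothetical loading, equal at both occasions
KAPPA_PRE, KAPPA_POST = 0.0, -1.0   # hypothetical latent means (one-unit improvement)

# Change that would be observed with the shifted posttreatment intercept.
observed = expected_item_change(TAU_PRE, TAU_POST, LOADING, KAPPA_PRE, KAPPA_POST)
# Change that would be observed if the intercept had stayed invariant.
invariant = expected_item_change(TAU_PRE, TAU_PRE, LOADING, KAPPA_PRE, KAPPA_POST)
# The difference equals the intercept shift, TAU_POST - TAU_PRE.
bias = observed - invariant

print(f"observed = {observed:.3f}, invariant = {invariant:.3f}, bias = {bias:.3f}")
```

Under these assumed values, the item would register only part of the latent improvement, because the higher posttreatment intercept offsets some of the decrease. The other items, whose intercepts remained invariant, are not affected in this way.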

As for the other noninvariant ISI parameters (ie, three error variances became lower
after treatment), the directions of their changes are very similar to results found in a
previous study on CBT for depression. Fokkema et al. [11] also reported that after
treatment with CBT for depression, some items’ residual variances decreased. The
decrease in residual variances on these items may reflect the fact that patients’
understanding of the symptoms measured by these items had become more unified
after intervention.

The PSQI, as hypothesized, did demonstrate more noninvariance
than the ISI. The failure to pass the configural invariance test indicated that the
constructs underlying the PSQI were reconceptualized after treatment. Specifically,
an extra loading was added to the subscale for subjective sleep quality from the
sleep-efficiency factor. Moreover, the correlations between the error variances of the
four subscales (ie, habitual sleep efficiency, used sleeping medicine, sleep duration,
and subjective sleep quality) also changed after treatment. These correlations suggest
that these four items reflect some newly present common variance that cannot be
explained by the original factors. These properties make the change scores of the
PSQI less interpretable, given that the meanings of the target constructs change after
CBT-I treatment.

One possible explanation for the changes in the factorial structure of the PSQI
after CBT-I might be the cognitive reshaping associated with the treatment. Both the
educational and cognitive components of CBT-I might change patients’ beliefs and
attitudes about their sleep quality [39]. For example, CBT-I therapists usually provide
information based on sleep science and challenge patients’ definitions of good quality
and sufficient quantity of sleep (eg, by discussing with patients the belief “I need 8
hours of sleep to feel refreshed and function well during the day” [40]). Furthermore,
the behavioral component of CBT-I might also reshape the patients’ beliefs about the
consequences of sleeplessness after they experience better sleep quality resulting from
behavioral practices that restrict their sleep (eg, sleep restriction therapy, stimulus
control instructions). Such cognitive reshaping may change the way that the patient
interprets some items on the PSQI and therefore may change the factorial structure of
the scale. Furthermore, unlike most items of the ISI, which ask the subjects to rate the
severity of their insomnia symptoms directly, the PSQI contains items with different
rating approaches that measure different aspects of sleep quality. Some of the items
ask the patients to fill in the actual timings of their sleep patterns, others have the
patients rate the frequency of sleep-related symptoms, and still others ask the patients
to rate their general level of sleep quality. Therefore, some aspects could be more
susceptible than others to the impact of conceptualization reshaping by CBT-I. This
property might also make the factor structure of the PSQI more likely to be affected
by the treatment.

5. Conclusion

In summary, the current study examined the longitudinal MI properties of the ISI
and PSQI against CBT-I. Our results support the longitudinal comparability of the ISI
before and after CBT-I treatment. The change score on the ISI could therefore be a
reliable measure of improvement due to CBT-I. On the other hand, the PSQI score
reflects a subjective perception of general sleep quality, as well as some concepts that
might be changed by CBT-I. Its underlying constructs were shown to have changed,
as indicated by the failure to pass the configural invariance test after CBT-I. From the
perspective of longitudinal MI, the PSQI is therefore not an appropriate outcome
measure for CBT-I.

Despite the significance of our findings, some limitations should be considered
in the interpretation of our results. First, the results that we obtained were based on a
clinical sample in Taiwan assessed with the Chinese versions of the PSQI and ISI.
Even though a previous study has demonstrated the cross-cultural comparability of
the ISI, other cultural factors could still affect the generalizability of our results. This
is especially true for the PSQI, given that a previous study has reported that the factor
models of the PSQI could be different across countries [35]. Therefore, future
research can examine the longitudinal invariance among other versions of these two
measures.

The other limitation of the current study is that we examined the response shift
with only quantitative methods. As a result, we can make only indirect inferences
about the reasons behind the patients’ response shifts. This is a common limitation
shared by quantitative longitudinal invariance (response shift) studies [11,15]. To
further explore the reasons behind the response shifts after CBT-I, future studies can
conduct a qualitative interview following the longitudinal invariance analysis to
address this issue [41].

Acknowledgements

The authors gratefully acknowledge the financial support of the National Science Council
(NSC95-2413-H-004-020-MY3, NSC101-2410-H-004-082-MY3). Our thanks also go
to the members of the sleep laboratory at National Chengchi University for their help
in the process of data collection. Finally, we also thank all of our participants and the
Department of Psychology at National Chengchi University for their support.

References

1. American Psychiatric Association. Diagnostic and Statistical Manual of Mental
Disorders, Fifth Edition (DSM-5). Washington, DC: American Psychiatric
Association; 2013.
2. Okajima I, Komada Y, Inoue Y. A meta‐analysis on the treatment effectiveness of
cognitive behavioral therapy for primary insomnia. Sleep Biol Rhythms
2011;9:24–34.
3. Pentz MA, Chou C. Measurement invariance in longitudinal clinical research
assuming change from development and intervention. J Consult Clin Psychol
1994;62:450–462.
4. Brown TA. Confirmatory factor analysis for applied research. New York:
Guilford Press; 2006.
5. Meade AW, Lautenschlager GJ. A Monte-Carlo study of confirmatory factor
analytic tests of measurement equivalence/invariance. Struct Equat Model
2004;11:60–72.
6. Coertjens L, Donche V, De Maeyer S, Vanthournout G, Van Petegem P.
Longitudinal measurement invariance of Likert-type learning strategy scales:
Are we using the same ruler at each wave? J Psychoeduc Assess 2012;30:577–87.
7. Little TD. Longitudinal structural equation modeling. New York: Guilford Press;
2013.
8. Kline RB. Principles and practice of structural equation modeling. New York:
Guilford Press; 2015.
9. Millsap RE. Statistical approaches to measurement invariance. New York:
Routledge; 2012.
10. Widaman KF, Reise SP. Exploring the measurement invariance of psychological
instruments: Applications in the substance use domain. In: Bryant MJ, Windle M,
editors. The science of prevention: Methodological advances from alcohol and
substance abuse research. 1997, p. 281–324.
11. Fokkema M, Smits N, Kelderman H, Cuijpers P. Response shifts in mental
health interventions: An illustration of longitudinal measurement invariance.
Psychol Assess 2013;25:520–31.
12. Oort FJ. Using structural equation modeling to detect response shifts and true
change. Qual Life Res 2005;14:587–598.
13. Oort FJ, Visser MR, Sprangers MA. Formal definitions of measurement bias and
explanation bias clarify measurement and conceptual perspectives on response
shift. J Clin Epidemiol 2009;62:1126–1137.
14. Sprangers MA, Schwartz CE. Integrating response shift into health-related
quality of life research: A theoretical model. Soc Sci Med 1999;48:1507–1515.
15. Wu PC. Response shifts in depression intervention for early adolescents. J Clin
Psychol 2016;72:663–75.
16. Elhai JD, Contractor AA, Biehn TL, Allen JG, Oldham J, Ford JD, et al. Changes
in the Beck Depression Inventory−II’s underlying symptom structure over 1
month of inpatient treatment. J Nerv Ment Dis 2013;201:371–6.
17. Smith D, Woodman R, Harvey P, Battersby M. Self-perceived distress and
impairment in problem gamblers: A study of pre- to post-treatment measurement
invariance. J Gambl Stud 2016;32:1065–78.
18. U.S. Department of Health and Human Services, National Institutes of Health.
NIH State-of-the-Science Conference Statement on manifestations and
management of chronic insomnia in adults. NIH Consensus and
State-of-the-Science Statements 2005;22:1–30.
19. Buysse DJ, Ancoli-Israel S, Edinger JD, Lichstein KL, Morin CM.
Recommendations for a standard research assessment of insomnia. Sleep
2006;29:1155–73.
20. Buysse DJ, Reynolds CF, Monk TH, Berman SR, Kupfer DJ. The Pittsburgh
Sleep Quality Index: A new instrument for psychiatric practice and research.
Psychiatry Res 1989;28:193–213.
21. Morin CM. Insomnia: Psychological assessment and management. New York:
Guilford Press; 1993.
22. Bastien CH, Vallières A, Morin CM. Validation of the Insomnia Severity Index
as an outcome measure for insomnia research. Sleep Med 2001;2:297–307.
23. Cheng SK, Dizon J. Computerised cognitive behavioural therapy for insomnia: A
systematic review and meta-analysis. Psychother Psychosom 2012;81:206–16.
24. Johnson JA, Rash JA, Campbell TS, Savard J, Gehrman PR, Perlis M, et al. A
systematic review and meta-analysis of randomized controlled trials of cognitive
behavior therapy for insomnia (CBT-I) in cancer survivors. Sleep Med Rev
2016;27:20–8.
25. Trauer JM, Qian MY, Doyle JS, Rajaratnam SM, Cunnington D. Cognitive
behavioral therapy for chronic insomnia: A systematic review and meta-analysis.
Ann Intern Med 2015;163:191–204.
26. Koffel EA, Koffel JB, Gehrman PR. A meta-analysis of group cognitive
behavioral therapy for insomnia. Sleep Med Rev 2015;19:6–16.
27. Chen PY, Yang CM, Morin CM. Validating the cross-cultural factor structure and
invariance property of the Insomnia Severity Index: Evidence based on ordinal
EFA and CFA. Sleep Med 2015;16:598–603.

28. Fernandez-Mendoza J, Rodriguez-Muñoz A, Vela-Bueno A,
Olavarrieta-Bernardino S, Calhoun SL, Bixler EO, et al. The Spanish version of
the Insomnia Severity Index: A confirmatory factor analysis. Sleep Med
2012;13:207–10.
29. Hartmann JA, Carney CE, Lachowski A, Edinger JD. Exploring the construct of
subjective sleep quality in patients with insomnia. J Clin Psychiatry
2015;76:768–73.
30. Cole JC, Motivala SJ, Buysse DJ, Oxman MN, Levin MJ, Irwin MR. Validation
of a 3-factor scoring model for the Pittsburgh Sleep Quality Index in older adults.
Sleep 2006;29:112–6.
31. Otte JL, Rand KL, Carpenter JS, Russell KM, Champion VL. Factor analysis of
the Pittsburgh Sleep Quality Index in breast cancer survivors. J Pain Symptom
Manage 2013;45:620–7.
32. Mollayeva T, Thurairajah P, Burton K, Mollayeva S, Shapiro C, Colantonio A.
The Pittsburgh Sleep Quality Index as a screening tool for sleep dysfunction in
clinical and non-clinical samples: A systematic review and meta-analysis. Sleep
Med Rev 2016;25:52–73.
33. Yang CM, Hsu SC, Lin SC, Chou YY, Chen IY. Reliability and validity of the
Chinese version of the Insomnia Severity Index. Arch Clin Psychol 2009;4:95–104.
34. Tsai PS, Wang SY, Wang MY, Su CT, Yang TT, Huang CJ, et al. Psychometric
evaluation of the Chinese version of the Pittsburgh Sleep Quality Index (CPSQI)
in primary insomnia and control subjects. Qual Life Res 2005;14:1943–52.
35. Gelaye B, Lohsoonthorn V, Lertmeharit S, Pensuksan WC, Sanchez SE, Lemma
S, et al. Construct validity and factor structure of the Pittsburgh Sleep Quality
Index and Epworth Sleepiness Scale in a multi-national study of African, South
East Asian and South American college students. PLoS One 2014;9:e116383.
36. R Core Team. R: A language and environment for statistical
computing. Vienna, Austria: R Foundation for Statistical Computing; 2014.
Available at: http://www.R-project.org/.
37. Rosseel Y. lavaan: An R package for structural equation modeling. J Stat Softw
2012;48:1–36.
38. Enders CK. Applied missing data analysis. New York: Guilford; 2010.
39. Morin CM, Blais F, Savard J. Are changes in beliefs and attitudes about sleep
related to sleep improvements in the treatment of insomnia? Behav Res Ther
2002;40:741–52.
40. Morin CM, Espie CA. Insomnia: A clinical guide to assessment and treatment.
New York: Plenum; 2003.

41. Lugtig P, Boeije HR, Lensvelt-Mulders GJ. Change? What change?
Methodology 2012;8:115–23.


Table 1
Changes in factor models underlying a questionnaire, their corresponding response
shifts, and potential psychological inferences mentioned in Oort [12] and Fokkema
et al. [11].

Types of changes | Corresponding response shifts | Possible psychological interpretations
Factor structure (configural invariance) | Reconceptualization, reprioritization, or (uniform and nonuniform) recalibration, dependent on how researchers modify the posttreatment model when the configural model is not supported | Adding an extra factor loading represents the psychological phenomenon that subjects have used a new item to define the underlying construct; that is, the underlying construct has been “reconceptualized” (Oort [12], p. 591)
Factor loadings (weak invariance test) | Reprioritization | A decrease in the loading of an item might indicate the item has become less indicative for the underlying construct it measures (Fokkema et al. [11], p. 526)
Intercepts (strong invariance test) | Recalibration (uniform) | A significant increase in the intercept of an item might indicate subjects have become more sensitive to the symptoms that item measures (Fokkema et al. [11], p. 529)
Residual variances (strict invariance test) | Recalibration (nonuniform) | If the major loadings remain the same, but residual variance decreases, it might indicate the meaning of items has become clearer to the subjects after treatment (Fokkema et al. [11], p. 529)

Table 2
Descriptive statistics for items of the ISI and subscores of the PSQI.

Variable | Mean, Pre (Post)ᵃ | SD, Pre (Post) | Skewness, Pre (Post) | Kurtosis, Pre (Post)
ISI_1a | 2.34 (1.21) | 1.26 (1.05) | –0.09 (1.03) | –1.13 (0.66)
ISI_1b | 2.52 (1.61) | 1.11 (1.03) | –0.11 (0.61) | –0.94 (–0.13)
ISI_1c | 2.14 (1.53) | 1.15 (1.09) | 0.04 (0.52) | –0.94 (–0.19)
ISI_2 | 3.09 (1.78) | 0.79 (0.86) | –0.37 (0.52) | –0.79 (0.14)
ISI_3 | 2.45 (1.55) | 1.00 (0.85) | –0.13 (0.49) | –0.46 (–0.31)
ISI_4 | 1.74 (1.15) | 1.08 (0.99) | –0.01 (0.39) | –0.70 (–0.94)
ISI_5 | 2.83 (1.23) | 1.01 (1.00) | –0.36 (0.68) | –0.82 (0.10)
PSQI_1ᵇ | 2.34 (1.34) | 0.59 (0.57) | –0.26 (0.63) | –0.71 (0.13)
PSQI_2 | 2.15 (1.55) | 0.94 (0.79) | –0.63 (0.21) | –0.88 (–0.55)
PSQI_3 | 2.12 (1.85) | 0.76 (0.73) | –0.93 (–0.43) | 1.17 (0.18)
PSQI_4 | 0.93 (1.62) | 1.08 (1.22) | 0.79 (–0.13) | –0.76 (–1.57)
PSQI_5 | 1.30 (1.24) | 0.57 (0.52) | 1.10 (1.17) | 1.01 (1.59)
PSQI_6 | 1.88 (1.30) | 1.36 (1.31) | –0.52 (0.30) | –1.61 (–1.67)
PSQI_7 | 1.41 (1.28) | 0.87 (0.83) | 0.32 (0.31) | –0.61 (–0.53)

ISI, Insomnia Severity Index; PSQI, Pittsburgh Sleep Quality Index; SD, standard
deviation.
ᵃ Descriptive statistic at pretreatment (descriptive statistic at posttreatment), eg,
mean of item ISI_1a pre CBT-I (mean of item ISI_1a post CBT-I).
ᵇ PSQI_1 – PSQI_7 are the seven subscores calculated from the 19 PSQI items. They
are respectively 1) subjective sleep quality, 2) sleep latency, 3) sleep duration, 4)
habitual sleep efficiency, 5) sleep disturbances, 6) use of sleeping medications, and 7)
daytime dysfunction (Buysse et al. [20]).


Table 3
Fit indices of the longitudinal invariance models of the ISI.

Model | χ² | df | Δχ² | CFI | ΔCFI | TLI | RMSEA
Configural invariance (M1) | 80.10 | 57 | | 0.967 | | 0.947 | 0.060
Weak invariance (M2) | 87.20 | 63 | 7.099 (with M1) | 0.965 | 0.002 | 0.950 | 0.058
Strong invariance (M3) | 97.18 | 67 | 9.980* (with M2) | 0.956 | 0.009 | 0.941 | 0.063
Partial strong invariance (M4)ᵃ | 92.81 | 66 | 5.617 (with M3) | 0.961 | 0.004 | 0.947 | 0.060
Partial strict invariance 1 (M5)ᵇ | 129.96 | 73 | 37.148** (with M4) | 0.918 | | 0.898 | 0.083
Partial strict invariance 2 (M6)ᶜ | 115.90 | 72 | 23.077** (with M4) | 0.937 | | 0.920 | 0.073
Partial strict invariance 3 (M7)ᵈ | 106.92 | 71 | 14.11* (with M4) | 0.948 | | 0.934 | 0.067
Partial strict invariance 4 (M8)ᵉ | 103.72 | 71 | 10.908 (with M4) | 0.953 | | 0.940 | 0.064

CFI, comparative fit index; ISI, Insomnia Severity Index; RMSEA, root mean square
error of approximation; TLI, Tucker-Lewis index.
ᵃ M4: the partial strong invariance model in which all corresponding factor loadings
and 6 of 7 intercepts are constrained to be equal across times. Only the intercept of
ISI_4 is allowed to differ across times.
ᵇ M5: the partial strict invariance model in which all corresponding factor loadings,
6 of 7 intercepts (the exception being ISI_4), and all residual variances are
constrained to be equal across times.
ᶜ M6: almost identical to M5, the only difference being that the residual variance of
ISI_3 is now allowed to differ across times.
ᵈ M7: almost identical to M5, the only difference being that the residual variance of
ISI_1b is now allowed to differ across times.
ᵉ M8: almost identical to M7, the only two differences being that the residual variance
of ISI_5 is now allowed to differ across times and the residual variance of ISI_1b in
the posttreatment model is fixed at 0.01 due to the presence of a Heywood case.
*p < .05.
**p < .01.
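
As an illustration of how the Δχ² column of Table 3 can be read, the sketch below recomputes several of the nested-model comparisons from the reported χ² and df values. It assumes an ordinary (unscaled) chi-square difference test, which the table does not state explicitly; `chi2_sf` and `chi2_difference` are hypothetical helper names written for this example and are not the code used in the study (which relied on lavaan in R). Because the inputs are rounded to two decimals, the recomputed differences can deviate from the table in the last digit.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^r / r!.
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: normal tail plus a finite correction sum.
    root = math.sqrt(x)
    tail = math.erfc(root / math.sqrt(2))             # equals 2 * P(Z > sqrt(x))
    phi = math.exp(-x / 2) / math.sqrt(2 * math.pi)   # standard normal density at sqrt(x)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return tail + 2 * phi * total


# (chi-square, df) pairs copied from Table 3.
FIT = {
    "M1": (80.10, 57), "M2": (87.20, 63), "M3": (97.18, 67), "M4": (92.81, 66),
    "M5": (129.96, 73), "M6": (115.90, 72), "M7": (106.92, 71), "M8": (103.72, 71),
}


def chi2_difference(restricted, baseline):
    """Return (delta chi-square, delta df, p value) for two nested models."""
    chi_r, df_r = FIT[restricted]
    chi_b, df_b = FIT[baseline]
    d_chi, d_df = chi_r - chi_b, df_r - df_b
    return d_chi, d_df, chi2_sf(d_chi, d_df)


# A subset of the comparisons listed in Table 3.
for restricted, baseline in [("M2", "M1"), ("M3", "M2"), ("M5", "M4"),
                             ("M6", "M4"), ("M7", "M4"), ("M8", "M4")]:
    d_chi, d_df, p = chi2_difference(restricted, baseline)
    print(f"{restricted} vs {baseline}: delta chi2 = {d_chi:.2f}, "
          f"delta df = {d_df}, p = {p:.3f}")
```

Under these assumptions, a nonsignificant difference (for example, M8 against M4) indicates that the added equality constraints do not significantly worsen model fit, which is the pattern expected of a retained model.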

Table 4
Estimates of the final partial strict invariance model for the ISI.

Item | Loadings (Factor 1, impact; Factor 2, satisfaction; Factor 3, severity) | Intercept | Error variance
ISI_1a | .163, .536 | 2.365 | .851
ISI_1b | .971 | 2.492 | .263 (.01)ᵇ
ISI_1c | .789 | 2.182 | .534
ISI_2 | .683 | 3.096 | .179
ISI_3 | .745 | 2.476 | .450 (.099)
ISI_4 | .683 | 1.737 (2.073)ᵃ | .492
ISI_5 | .336, .336 | 2.784 | .507 (.301)

ISI, Insomnia Severity Index.
ᵃ Numbers in parentheses indicate estimates in the posttreatment part of the model that
are allowed to differ from the pretreatment model. These parameters were freely
estimated after referring to modification indices.
ᵇ The original estimate of the error variance of ISI_1b in the posttreatment model was
negative. As a result, we fixed the variance at 0.01.
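
The freed intercept of ISI_4 can be read through the item-level equations given in the Appendix. As a rough worked example, assuming the same latent level at both time points and measurement errors that average to zero (simplifying assumptions made here for illustration only), the equal loading of .683 cancels out and the expected difference in the item score reduces to the difference between the two intercepts:

```latex
\begin{align*}
E[y_{\text{post}} - y_{\text{pre}} \mid \eta_{\text{post}} = \eta_{\text{pre}} = \eta]
  &= (\tau_{\text{post}} + \lambda\eta) - (\tau_{\text{pre}} + \lambda\eta) \\
  &= \tau_{\text{post}} - \tau_{\text{pre}} = 2.073 - 1.737 = 0.336 .
\end{align*}
```

In other words, under these assumptions a patient at a given level of the latent factor would be expected to score about 0.34 points higher on ISI_4 after treatment than before it.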



Table 5
Unstandardized loadings, intercepts, and correlations between residuals in the partial
configural invariance model of the PSQI.

Item | Loadings (Factors 1–3)ᶠ | Intercept | Error variance | Correlations between residuals
PSQI_1ᵃ | 0.313, 0.312ᵇ (0.241) | 1.377 (2.342) | 0.090 (0.292) |
PSQI_2 | 0.518 (0.399) | 1.553 (2.150) | 0.359 (0.689) |
PSQI_3 | 0.404ᶜ (0.457)ᵈ | 1.838 (2.118) | 0.349 (0.346) | 0.142 (with PSQI_6 in posttreatment model)
PSQI_4 | 1.071 (1.014) | 0.931 (1.620) | 0.010ᵉ (0.352) | −0.203 (with PSQI_1 in posttreatment model)
PSQI_5 | 0.123 (0.097) | 1.250 (1.302) | 0.256 (0.310) |
PSQI_6 | 0.518 (−0.377) | 1.301 (1.877) | 1.433 (1.646) | −0.198 (with PSQI_7 in posttreatment model)
PSQI_7 | 0.436 (0.217) | 1.281 (1.409) | 0.478 (0.687) |

PSQI, Pittsburgh Sleep Quality Index.
ᵃ PSQI_1 – PSQI_7 are the seven subscores calculated from the 19 PSQI items. They
are respectively 1) subjective sleep quality, 2) sleep latency, 3) sleep duration, 4)
habitual sleep efficiency, 5) sleep disturbances, 6) use of sleeping medications, and 7)
daytime dysfunction (Buysse et al. [20]).
ᵇ Extra estimates that exist only in the posttreatment model (one of the two loadings
of PSQI_1 and the correlations between residuals); these have no pretreatment value
in parentheses.
ᶜ Values outside parentheses represent the estimates in the posttreatment model.
ᵈ Values in parentheses represent the corresponding estimates in the pretreatment
model for the same parameters.
ᵉ This error variance is fixed at 0.01 due to a Heywood case.
ᶠ Factor 1: sleep efficiency; factor 2: perceived sleep quality; factor 3: daily
disturbances.


Appendix

This appendix provides further details about how changes in factorial
parameters, such as factor loadings, can affect the accuracy of change scores. Assume
that one measures a patient’s insomnia severity with a self-rating scale before and
after Cognitive Behavioral Therapy for Insomnia (CBT-I) and obtains two observed
scores, $y_{\text{pre\_score}}$ and $y_{\text{post\_score}}$. It is common practice for
researchers to calculate the change score of this scale as
$y_{\text{pre\_score}} - y_{\text{post\_score}}$ and to use it directly as a quantity
that reflects the treatment efficacy of CBT-I. However, if we further decompose the
observed scores with confirmatory factor analysis (CFA), a common method for
validating construct measures, we will be able to verify the implicit assumptions of
the change scores.


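In a CFA framework, the two observed scores can be further decomposed into
the following two regression-like equations:

$$y_{\text{pre\_score}} = \tau_{\text{pre}} + \lambda_{\text{pre}}\,\eta_{\text{pre}} + \varepsilon_{\text{pre}} \qquad \text{(equation 1)}$$

$$y_{\text{post\_score}} = \tau_{\text{post}} + \lambda_{\text{post}}\,\eta_{\text{post}} + \varepsilon_{\text{post}} \qquad \text{(equation 2)}$$

In equation 1, $\tau_{\text{pre}}$ is the pretreatment intercept of the measured item,
$\lambda_{\text{pre}}$ is the pretreatment factor loading of the measured item,
$\eta_{\text{pre}}$ is the initial level of subjective insomnia of the patient before
treatment, and $\varepsilon_{\text{pre}}$ is the measurement error. In equation 2,
$\tau_{\text{post}}$ is the posttreatment intercept, $\lambda_{\text{post}}$ is the
posttreatment factor loading, $\eta_{\text{post}}$ is the level of subjective insomnia
after treatment, and $\varepsilon_{\text{post}}$ is the measurement error.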


In the two equations above, an observed score ($y$) is linked to the latent construct
($\eta$) by a factor loading $\lambda$ (analogous to a regression coefficient) and an
intercept $\tau$. Factor loadings and intercepts are parameters that can be estimated
in CFA. From these two equations, one can tell that the change score will on average
be directly proportional to the improvement in subjective insomnia (that is,
$\eta_{\text{pre}} - \eta_{\text{post}}$) only when the corresponding intercepts and
loadings in the two equations are “invariant” (ie,
$\lambda_{\text{pre}} = \lambda_{\text{post}}$ and $\tau_{\text{pre}} = \tau_{\text{post}}$).
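
A short derivation may make this condition explicit. Taking expectations of equations 1 and 2 for given latent levels, and assuming that the measurement errors have mean zero (a standard CFA assumption that is not stated above), the expected change score is:

```latex
\begin{align*}
E[\,y_{\text{pre\_score}} - y_{\text{post\_score}} \mid \eta_{\text{pre}}, \eta_{\text{post}}\,]
  &= (\tau_{\text{pre}} - \tau_{\text{post}})
     + \lambda_{\text{pre}}\eta_{\text{pre}} - \lambda_{\text{post}}\eta_{\text{post}} \\
  &= \lambda\,(\eta_{\text{pre}} - \eta_{\text{post}})
     \quad \text{if } \lambda_{\text{pre}} = \lambda_{\text{post}} = \lambda
     \text{ and } \tau_{\text{pre}} = \tau_{\text{post}} .
\end{align*}
```

When either equality fails, the first line no longer reduces to a multiple of $\eta_{\text{pre}} - \eta_{\text{post}}$, so the change score mixes true change with changes in the measurement parameters.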

In the context of CBT-I, one can imagine that the treatment will not only relieve a
patient’s insomnia severity (ie, make $\eta$ smaller after the treatment), but also
strengthen an item’s relation (factor loading) with the underlying construct of
subjective insomnia (ie, make $\lambda_{\text{post}} > \lambda_{\text{pre}}$). In this
situation, researchers might not be able to see the efficacy of CBT-I in the change
score $y_{\text{pre\_score}} - y_{\text{post\_score}}$, because the decrease in
subjective insomnia $\eta_{\text{post}}$ could be directly offset by the inflation of
the factor loading $\lambda_{\text{post}}$ and would therefore not be reflected in
$y_{\text{post\_score}}$.
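
The offsetting effect described above can be checked with a small numerical sketch. All parameter values below are hypothetical numbers chosen only to make the arithmetic easy to follow; they are not estimates from this study, and `expected_item_score` is an illustrative helper name rather than part of any library.

```python
def expected_item_score(tau, lam, eta):
    """Expected observed score y = tau + lam * eta (the error term averages to zero)."""
    return tau + lam * eta


# Hypothetical parameter values (illustration only, not study estimates).
tau = 1.0                      # intercept, identical before and after treatment
lam_pre, lam_post = 0.5, 1.0   # the factor loading doubles after treatment
eta_pre, eta_post = 2.0, 1.0   # latent insomnia improves by one unit

y_pre = expected_item_score(tau, lam_pre, eta_pre)      # 1.0 + 0.5 * 2.0
y_post = expected_item_score(tau, lam_post, eta_post)   # 1.0 + 1.0 * 1.0

observed_change = y_pre - y_post     # change score seen by the researcher
latent_change = eta_pre - eta_post   # true improvement on the latent construct

# For comparison: the same improvement when the loading stays invariant.
y_post_invariant = expected_item_score(tau, lam_pre, eta_post)
invariant_change = y_pre - y_post_invariant

print(f"latent improvement:              {latent_change}")
print(f"change score, loading inflated:  {observed_change}")
print(f"change score, loading invariant: {invariant_change}")
```

With these made-up values the change score is zero even though the latent level improved by one unit, whereas with an invariant loading the change score equals the loading times the latent improvement.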


Highlights

• The validity of the Chinese versions of the Insomnia Severity Index (ISI) and the
Pittsburgh Sleep Quality Index (PSQI) as outcome measures for Cognitive
Behavioral Therapy for Insomnia (CBT-I) was examined with longitudinal
invariance models.
• The Chinese version of the ISI was found to have a good structure after CBT-I and
therefore is a valid outcome measure for the treatment efficacy of CBT-I.
• The Chinese version of the PSQI showed changes in its factorial model after
CBT-I; thus, it may not be an appropriate outcome measure for CBT-I.
