Child Abuse & Neglect 27 (2003) 169–190

Development and validation of a brief screening version of the Childhood Trauma Questionnaire☆
David P. Bernstein a,∗, Judith A. Stein b, Michael D. Newcomb c,
Edward Walker d, David Pogge e,f, Taruna Ahluvalia e,f, John Stokes e,
Leonard Handelsman g, Martha Medrano h, David Desmond h, William Zule h
a Department of Psychology, Fordham University, Dealy Hall, 3rd Floor, Bronx, NY 10458, USA
b Department of Psychology, University of California, Los Angeles, CA, USA
c Department of Psychology, University of Southern California, Los Angeles, CA, USA
d Department of Psychiatry, University of Washington School of Medicine, Seattle, WA, USA
e Department of Psychology, Four Winds Hospital, Katonah, NY, USA
f Fairleigh Dickinson University, Teaneck, NJ, USA
g Department of Psychiatry, Duke University School of Medicine, Durham, NC, USA
h Department of Psychiatry, San Antonio Health Sciences Center, University of Texas, San Antonio, TX, USA
Received 15 June 2001; received in revised form 14 August 2002; accepted 14 August 2002

Abstract
Objective: The goal of this study was to develop and validate a short form of the Childhood Trauma
Questionnaire (the CTQ-SF) as a screening measure for maltreatment histories in both clinical and
nonreferred groups.
Method: Exploratory and confirmatory factor analyses of the 70 original CTQ items were used to create
a 28-item version of the scale (25 clinical items and three validity items) and test the measurement
invariance of the 25 clinical items across four samples: 378 adult substance abusing patients from New
York City, 396 adolescent psychiatric inpatients, 625 substance abusing individuals from southwest
Texas, and 579 individuals from a normative community sample (combined N = 1978).
Results: Results showed that the CTQ-SF’s items held essentially the same meaning across all four
samples (i.e., measurement invariance). Moreover, the scale demonstrated good criterion-related validity
in a subsample of adolescents on whom corroborative data were available.

☆ Dr. Stein and Dr. Newcomb are supported by a grant from the National Institute on Drug Abuse, DA-01070-28.
Dr. Walker is supported by a grant from the National Institute of Mental Health, K20MH01106.
∗ Corresponding author.

0145-2134/02/$ – see front matter © 2002 Elsevier Science Ltd. All rights reserved.
doi:10.1016/S0145-2134(02)00541-0

Conclusions: These findings support the viability of the CTQ-SF across diverse clinical and nonreferred
populations.
© 2002 Elsevier Science Ltd. All rights reserved.

Keywords: Child abuse; Neglect; Measures; Validity

Introduction

Over the past two decades, research on the prevalence, causes, and consequences of child
abuse and neglect has increased exponentially (Crouch & Milner, 1993; Finkelhor, 1994;
Kendall-Tackett, Meyer Williams, & Finkelhor, 1993; Knutson, 1995; Malinosky-Rummell
& Hansen, 1993). However, many of the empirical studies in this area are limited by se-
rious methodological shortcomings, including a lack of standardized, adequately validated
instruments for retrospectively assessing abuse and neglect (Briere, 1992). Many previous
studies have used methods such as chart review or single questions or items to assess mal-
treatment, although such approaches may be unreliable and lack sensitivity (Briere & Zaidi,
1989). Moreover, studies have often focused on a single form of childhood trauma, typically
sexual or physical abuse, despite evidence that multiple types of maltreatment often co-occur
(Briere & Runtz, 1988; Rosenberg, 1987). As a result, it has been difficult to disentangle the
effects of particular types of trauma from those of other coexisting forms or from the impact
of maltreatment in general. Little systematic attention has been paid to issues concerning
instrument format, for example, whether maltreatment phenomena are more adequately as-
certained using self-report questionnaire or interview methods (Dill, Chu, Grob, & Eisen,
1991; Walker, Bernstein, & Keegan, 1997). A related issue is whether childhood traumas are
better conceptualized as dichotomous events (i.e., events that either did or did not occur) or as
experiences that vary along continuous dimensions such as frequency, severity, and duration
(Lipschitz, Bernstein, Winegar, & Southwick, 1999; Walker et al., 1997). Finally, although
several instruments have been developed that incorporate a more methodologically sophis-
ticated approach to the assessment of childhood trauma (Bernstein & Fink, 1998; Bernstein
et al., 1994; Bifulco, Brown, & Harris, 1994; Ditomasso, 1995; Fink, Bernstein, Handelsman,
Foote, & Lovejoy, 1995; Gallagher, Flye, Hurt, Stone, & Hull, 1992; Herman, Perry, & van der
Kolk, 1989; Meyer, Muenzenmaier, Cancienne, & Struening, 1996; Sanders & Becker-Lausen,
1995; Straus & Hamby, 1997; Straus, Hamby, Finkelhor, Moore, & Runyan, 1998; Zanarini,
Gunderson, Marino, Schwarz, & Frankenburg, 1989), comparatively little attention has been
paid to their validity. While published reports on many of these instruments contain information
about reliability, most contain little information about criterion-related validity or construct
validity. With the exception of the Childhood Trauma Questionnaire (Bernstein & Fink, 1998;
Bernstein et al., 1994), none of these instruments has been validated with respect to the critical
question of whether they correctly detect abuse and neglect histories (i.e., criterion-related
validity).
This lack of attention to instrument validity is of particular concern, given the controversy
over the accuracy of retrospective reports of childhood trauma. Many authors have noted that
a variety of factors can affect the accuracy of recollections for childhood events, including
normative ones, such as the degradation of memories over time, and pathological ones, such
as dissociation and repression (Allen, 1995; Bernstein et al., 1995; Rogers, 1995). The “false
memory syndrome” is another example of inaccurate recall, in this case, one that is purportedly
iatrogenic in nature (Loftus, 1993). On the other hand, some authors have noted that memo-
ries for childhood experiences may actually be enhanced in cases where events are unusual,
unexpected, or consequential, such as childhood trauma (Brewin, Andrews, & Gotlib, 1993).
One experimental study found that recall was improved for emotionally arousing events and
that this enhancement was related to greater beta-adrenergic activation (Cahill, Prins, Weber,
& McGaugh, 1994). In light of these controversies, it is essential that trauma researchers
demonstrate the validity of their retrospective assessments.
To address the need for reliable and valid assessment of a broad range of maltreatment
experiences, Bernstein and colleagues developed a 70-item self-administered inventory, the
Childhood Trauma Questionnaire (CTQ; Bernstein & Fink, 1998; Bernstein et al., 1994). The
CTQ uses multiple Likert-type items to create dimensional scales, thereby enhancing reliabil-
ity and maximizing statistical power. Cut scores can be applied to identify individuals with
histories of abuse and neglect. In initial studies of adult substance abusers, the CTQ showed ex-
cellent test-retest reliability over a 2- to 6-month interval as well as convergent and discriminant
validity with a structured trauma interview (Bernstein et al., 1994; Fink et al., 1995). Principal
components analysis of the CTQ items yielded four rotated factors, which were labeled physical
and emotional abuse, emotional neglect, sexual abuse, and physical neglect (Bernstein et al.,
1994). Similar factor analytic results were obtained in a study of adolescent psychiatric patients,
with the exception that physical and emotional abuse items loaded on separate factors, rather
than a single factor, and that the numbers of items loading highly on each respective factor were
somewhat different than in the original study (Bernstein, Ahluvalia, Pogge, & Handelsman,
1997). In the adolescent study, it was possible to corroborate histories obtained with the CTQ
through the use of independent evidence, such as information from referring clinicians and
agencies, and the reports of other informants. When compared to therapists’ trauma ratings
based on all available data about the patient, the CTQ showed good sensitivity and satisfactory
or better specificity, supporting its criterion-related validity (Bernstein et al., 1997).
The goals of the present study were twofold. First, we wished to develop a short form of the
CTQ that would take no more than 5 minutes to self-administer, to provide more rapid screening
for maltreatment histories in both clinical and nonreferred populations. The original 70-item
version of the CTQ, which requires 10–15 minutes to give, may be too lengthy for settings in
which time constraints are present (e.g., primary care medical settings) or may unduly increase
respondent burden, when the CTQ is included in a battery of other tests. A short form of the
scale, on the other hand, would overcome some of these limitations. Second, we were interested
in examining two important aspects of the construct validity of the CTQ short form: (1) the
“measurement invariance” of its factor structure across clinical and nonreferred groups (Hoyle
& Smith, 1994), and (2) its criterion-related validity, that is, its relationship to independent
validating criteria. Measurement invariance refers to the question of whether a measure holds
the same meaning across groups and encompasses several related issues: whether the number
and nature of the latent dimensions (i.e., factors) represented by a measure are equivalent across
the groups; whether the pattern of factor loadings is the same across groups; and whether the
covariances among the latent dimensions are equivalent across groups (Hoyle & Smith, 1994).
All of these issues can be addressed through the use of confirmatory factor analysis, a special
case of structural equation modeling. In practical terms, measurement invariance means that the
CTQ short form would be equally useful in both normal and clinical populations, an essential
property in a screening instrument. Moreover, measurement invariance is a precondition for
a comparison of means between groups, for example, using the CTQ short form to compare
levels of child abuse and neglect across different populations.
Although factor analytic studies of the 70-item CTQ have produced similar results across
different populations, they have not demonstrated measurement invariance in the strict sense,
in that somewhat different factor structures were obtained (i.e., four vs. five factors, different
numbers of items per factor) (Bernstein et al., 1994, 1997). In the present study, our aim was
to reduce the number of items on each factor to produce a scale with a relatively simple factor
structure that would be invariant across diverse clinical and nonreferred groups. In particular,
we dropped items from the original CTQ that loaded highly on more than one factor, so that
the resulting factors would be as discriminable as possible across multiple populations. We
tested the measurement invariance of the CTQ short form in 1,978 individuals drawn from
four separate samples: a primarily male sample of adult substance abusers enrolled in inpatient
and outpatient treatment programs in New York City, male and female adolescent psychiatric
inpatients, male and female substance abusers in a community sample from the Southwest,
and a normative sample of male and female participants in a longitudinal study selected from
greater Los Angeles County. Two of the four data sets—the adult substance abusers from New
York City and the sample of adolescent psychiatric patients—had been used previously to
examine the validity of the 70-item version of the CTQ (Bernstein et al., 1994, 1997). How-
ever, we felt justified in using them again in conjunction with the two new samples, because
the four samples together provided a diverse set of participants on which to validate the short
form of the scale, and because the short form is a substantially different version of the CTQ
requiring separate validation. We also performed latent means analyses to test the hypothesis
that the adult substance abusers from New York City and the Southwest and the adolescent
psychiatric patients would report higher levels of child maltreatment than the normative com-
munity sample. The failure to find such differences would be a serious blow to our claims for
the scale’s validity, in light of extensive research documenting the high prevalence of mal-
treatment in clinical populations (Crouch & Milner, 1993; Finkelhor, 1994; Kendall-Tackett
et al., 1993; Knutson, 1995; Malinosky-Rummell & Hansen, 1993). Finally, we examined the
criterion-related validity of the CTQ short form in a subgroup of the adolescent psychiatric
patients on whom corroborative data were available in the form of therapists’ trauma ratings.

Methods

Participants

Four diverse sets of participants were used in this study: adult substance abusing patients
from New York City, adolescent psychiatric inpatients, substance abusing individuals from a
community sample in southwest Texas, and individuals from a normative community sample
in Los Angeles County.

Adult substance abusing patients. The first sample consisted of 378 adult substance-dependent
patients seeking treatment at two facilities: inpatient drug and alcohol detoxification and re-
habilitation units located at the VA Medical Center in the Bronx, NY (N = 252) and an
outpatient methadone maintenance program affiliated with the Mount Sinai Medical Center in
New York City (N = 126). VA patients were consecutive admissions who were given the CTQ
during their first week in the hospital as part of a battery of self-report measures and structured
interviews. Mount Sinai patients were enrolled in a NIDA-funded treatment demonstration
project that examined the efficacy of cognitive-behavioral therapy in methadone-maintained
heroin addicts with comorbid cocaine addiction. Patients were randomly assigned to a high-intensity
cognitive-behavioral treatment group or a low-intensity “treatment as usual” control
group. All of the Mount Sinai patients were given the CTQ during the intake phase of the study
prior to assignment to one of the treatment groups. The VA and Mount Sinai patients were quite
similar with respect to their demographic and clinical characteristics and were therefore com-
bined into a single sample for the analyses reported here. The patients in the combined sample
were mostly minority (African-American = 50.3%, Hispanic = 33.7%, White = 13.4%),
predominantly male (85.6%) inner-city addicts and alcoholics who ranged in age from 24 to
68 years (M = 40.2 years, SD = 8.8 years). Most had extensive lifetime histories of polysub-
stance abuse and dependence, with alcohol (90.1%), cocaine (68.3%), cannabis (60%), and
heroin (39.2%) being the most frequently used substances. The sample of VA and Mount Sinai
patients used in the present study partially overlapped with the one described in an earlier report
on the validity of the original version of the CTQ (Bernstein et al., 1994), and was obtained
by supplementing the original sample with an additional 92 participants drawn by the same
method from the same population.

Adolescent psychiatric inpatients. The second sample consisted of 396 psychologically
disturbed adolescents admitted to the inpatient unit of a private psychiatric hospital in Katonah,
NY. The adolescents were given the CTQ approximately 1 week after admission as part of a clin-
ical battery of psychological tests, including the Wechsler Intelligence Scale for Children-Third
Edition (WISC-III), the Wide Range Achievement Test-Version III (WRAT-III), and a variety
of self-report measures. Approximately 25% of the adolescents were unable to complete the
CTQ and the other self-report measures due to low intelligence (WISC Full Scale IQ < 80)
or poor reading skills (WRAT-III Reading Level below sixth grade) and were excluded from
the study. The adolescents were diverse with respect to age (M = 14.9 years, SD = 1.4 years,
range = 12–17 years), gender (male = 43%, female = 57%), and ethnicity (White = 67.9%,
Hispanic = 13.3%, African-American = 11.2%), and spanned a range of family income from
upper- and middle-income families with private health insurance to families in poverty (pa-
tients with Medicaid coverage = 51%). Although the adolescents were admitted for a variety
of psychiatric conditions, the most frequent presenting problems were suicide risk (48.9%),
substance abuse (37.8%), and mood disorders (35.2%). The adolescent sample used in this
study is identical to the one described in an earlier report on the validity of the original version of
the CTQ (Bernstein et al., 1997).
In both the clinical sample of adult substance abusers and the sample of adolescent
psychiatric patients, data on participants’ CTQ responses were extracted from their testing files.
Specific informed consent for the CTQ was not solicited because it was subsumed under the
general consent for clinical evaluation and treatment services obtained from patients and/or
their parents or legal guardians.

Adult substance abusers in the community. The third sample consisted of 625 male and female
participants in the Community Outreach for the Prevention of AIDS (COPA) project, an ongoing
National Institute on Drug Abuse (NIDA) Cooperative Agreement research project. Adults
residing in South Texas who used injection drugs or crack cocaine were recruited for the study through
community outreach. To be eligible for the study, participants had to screen positive for cocaine,
opiates, or methamphetamine based on urine toxicology, and to have not received drug treatment
in the prior 30 days. All participants were given an HIV risk behavior interview developed by
NIDA, an HIV antibody test, and a NIDA-developed educational intervention that included
HIV pretest counseling. At the time of initial evaluation, participants were given the CTQ and
a variety of other self-report measures as part of a substudy funded by the Hogg Foundation to
examine the relationship between HIV risk behavior and history of childhood victimization.
Participants who were unable to complete the CTQ and the other self-report measures on
their own were administered the scales verbally. Participants were 64% male, 60% Hispanic
(28% African-American, 11% non-Hispanic White), and ranged in age from 18 to 54 years
(M = 34 years). Most of the sample (57%) had not graduated from high school. Seventy-seven
percent of participants were injection drug users with heroin (44%), crack cocaine (38%), and
intravenous cocaine (14%) being the most frequently used primary substances.

Normative community sample of adults. The fourth sample was obtained from all 579 current
participants in a 20-year longitudinal study of community adolescents that began in 1976
(Newcomb, 1997). When the study began, participants were 7th, 8th, and 9th grade students in
11 Los Angeles County schools. Assessments have occurred every 4 years. At present their
average age is 34.9 years (range = 33–37 years); they are 67% Caucasian, 14% African-American,
10% Hispanic, and 8% Asian-Pacific Islander. Their average income is US $45,000. Their
average education is some college; 37% have only a high school diploma, and 28% have a BA/BS or
higher degree. The sample is 72% women (N = 417), and most participants are married and
have full-time jobs. The greater preponderance of women has been a feature of this sample since
its inception. Numerous studies have been published based on this longitudinal sample (e.g.,
Newcomb, 1994, 1997; Newcomb & Bentler, 1988; Scheier & Newcomb, 1993; Stein,
Newcomb, & Bentler, 1987, 1993). The CTQ was included in the most recent wave of the
survey, which was sent to the participants by mail. They were given US $30 to complete the
questionnaire.

Measures

Childhood Trauma Questionnaire (CTQ). The original CTQ is a 70-item self-administered
inventory that was developed to provide reliable and valid retrospective assessment of child
abuse and neglect (Bernstein et al., 1994). Items on the CTQ ask about experiences in childhood
and adolescence and are rated on a 5-point, Likert-type scale with response options ranging
from Never True to Very Often True (sample CTQ items are given in an earlier report, Bernstein
et al., 1994). The CTQ has five clinical scales—physical, sexual, and emotional abuse, and
physical and emotional neglect—which have been empirically derived (Bernstein et al., 1994,
1997). The CTQ scales were based on the following definitions of abuse and neglect. Sexual
abuse was defined as “sexual contact or conduct between a child younger than 18 years of
age and an adult or older person.” Physical abuse was defined as, “bodily assaults on a child
by an adult or older person that posed a risk of or resulted in injury.” Emotional abuse was
defined as, “verbal assaults on a child’s sense of worth or well-being or any humiliating or
demeaning behavior directed toward a child by an adult or older person.” Physical neglect was
defined as, “the failure of caretakers to provide for a child’s basic physical needs, including
food, shelter, clothing, safety, and health care” (poor parental supervision was also included in
this definition if it placed children’s safety in jeopardy). Emotional neglect was defined as, “the
failure of caretakers to meet children’s basic emotional and psychological needs, including
love, belonging, nurturance, and support.” In the short version of the CTQ, each type of
maltreatment is represented by five items to provide adequate reliability and content coverage
while substantially reducing the overall number of items in the scale. The CTQ also has a
three-item Minimization/Denial validity scale that was developed to detect the underreporting
of maltreatment (Bernstein & Fink, 1998). In the present study, the two treatment samples—
adult substance abusers and adolescent psychiatric patients—received the original 70-item
version of the CTQ. The two community samples—adult substance abusers in the Southwest
and the normative sample—were given the short form of the CTQ from which the three-item
validity scale and many of the other CTQ items had been excluded to reduce respondent burden.
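As a rough illustration of the scale structure described above (five clinical scales, five Likert-type items each), the following Python sketch scores a set of responses. It is only a sketch under stated assumptions: responses are assumed to be coded 1 (Never True) to 5 (Very Often True), scale scores are assumed to be simple sums, and the item identifiers, item-to-scale assignment, and set of reverse-scored items are hypothetical placeholders rather than the actual CTQ-SF scoring key.

```python
from typing import Dict, List, Set

# Hypothetical item identifiers: five items per clinical scale.
# These names are placeholders, not the real CTQ-SF item numbers.
SCALES: Dict[str, List[str]] = {
    "emotional_abuse": [f"ea{i}" for i in range(1, 6)],
    "physical_abuse": [f"pa{i}" for i in range(1, 6)],
    "sexual_abuse": [f"sa{i}" for i in range(1, 6)],
    "emotional_neglect": [f"en{i}" for i in range(1, 6)],
    "physical_neglect": [f"pn{i}" for i in range(1, 6)],
}

# Hypothetical set of positively worded items that would be reverse-scored.
REVERSED: Set[str] = {"en1", "en3", "pn2"}


def score_ctq_sf(responses: Dict[str, int]) -> Dict[str, int]:
    """Sum the five items of each scale, reverse-scoring where flagged.

    Assumes each response is an integer from 1 to 5.
    """
    scores: Dict[str, int] = {}
    for scale, items in SCALES.items():
        total = 0
        for item in items:
            value = responses[item]
            if not 1 <= value <= 5:
                raise ValueError(f"{item}: response {value} is outside 1-5")
            if item in REVERSED:
                value = 6 - value  # 1 <-> 5, 2 <-> 4, 3 stays 3
            total += value
        scores[scale] = total
    return scores
```

Under these assumptions each scale score ranges from 5 to 25. Any cut score applied to such totals would have to come from the published scoring materials; none is assumed here.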

Therapists’ maltreatment ratings. Therapists’ ratings of abuse and neglect were obtained on
a subsample of the adolescent psychiatric patients (N = 179) who had also received the
CTQ. In an earlier study (Bernstein et al., 1997), these ratings were used to validate the full
70-item CTQ. In the present study, these data were reanalyzed to provide external validation
for the short form of the questionnaire. After the adolescent patients were discharged from the
hospital, their primary therapists were given the Child Maltreatment Ascertainment Interview, a
structured interview eliciting detailed information about their patients’ histories of childhood
trauma (Bernstein et al., 1997). The therapists were given a synopsis of each case based
on information that was extracted from the clinical record, but were kept blind to the CTQ
responses of the adolescents. The therapists were then presented with standardized definitions
of four kinds of maltreatment (physical, sexual, and emotional abuse, and physical neglect) and
asked to determine their patients’ maltreatment status (definitely or definitely not maltreated,
or uncertain), based on all available information about the case.
The therapists’ maltreatment ratings showed excellent interrater reliability (kappas = .9
to 1.0), when two therapists were asked to rate 10 case vignettes that were abstracted from
patients’ clinical charts (Bernstein et al., 1997). Thus, the therapists were able to apply the
maltreatment definitions in a uniform manner.
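The kappa statistic reported above corrects raw agreement between two raters for the agreement expected by chance. A minimal sketch of Cohen's kappa for two raters follows; it is included only to make the statistic concrete, and the ratings used with it would be the reader's own data (none of the study's ratings are reproduced here).

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters who each rated the same cases."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of cases")
    n = len(rater_a)
    # Observed proportion of cases on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category; kappa is undefined.
        # Returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

For example, two raters who agree on 9 of 10 vignettes, with chance agreement of .50, would obtain a kappa of .80.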
These therapists’ ratings were used as the validity criterion in this study. The therapists
had extensive contact with the patients and their families during the typically lengthy hos-
pitalizations (length of stay: M = 6.9 weeks, SD = 5.6 weeks). Moreover, the therapists
were privy to information from a variety of other sources, such as reports of child welfare
investigations, referring clinicians and agencies, and other members of the multidisciplinary
treatment team. In a majority of cases of sexual abuse (62.8%), physical abuse (67.7%), and
physical neglect (75.6%), the therapists were able to support their judgments with independent
evidence, such as knowledge of Child Protective Services investigations, criminal or family
court charges/appearances, or removal of the child from the parental home (Bernstein et al.,
1997). Thus, although the therapists’ ratings were based in part on information provided by
the patient, these were substantiated with independent data in most instances and were not
influenced by responses to the CTQ.

Analyses

Although each of the items on the original CTQ was intended to represent only one factor,
many loaded highly on more than one factor. The initial goal of the data analysis was therefore
to identify five items from each of the five hypothesized factors of the CTQ that would load
highly together and overlap only moderately with the other factors, leaving a briefer (25 items
plus the three-item validity scale) and more easily interpretable form of the questionnaire.
We wanted to reduce the full form by about two-thirds from 70 to 25 items plus the 3
validity items, producing a 28-item short form. We also wanted to establish reliable subscales
that were equally balanced among the five types of maltreatment and had sufficient items to
provide a breadth of content. Five items seemed a reasonable compromise on each of these
issues. We did not want to give more weight to one type of trauma than to another, but
rather give equal credence to all types of trauma, some of which have received little attention
(i.e., emotional abuse and neglect, and physical neglect).
First, after excluding the three validity items, we conducted exploratory factor analyses of
the remaining 67 CTQ items that were given to the adult substance abusers and adolescent
psychiatric patients using the BMDP 4M factor analysis program with maximum likelihood
estimation and direct quartimin rotation. Where appropriate, items were reverse-scored in
the analyses to keep the items positively correlated among themselves. We did not expect the
factors to be orthogonal since previous research (Bernstein et al., 1994, 1997) indicated that the
factors are highly related to each other. Twenty-five items were retained that had factor loadings
greater than .50 on their intended factors and low loadings (<.30) on the other factors. We thus
developed reasonably distinct factors corresponding to the a priori constructs developed for
the original CTQ—physical, sexual, and emotional abuse, and physical and emotional neglect.
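The retention rule described above (a loading greater than .50 on the intended factor and loadings below .30 on every other factor) can be sketched as a simple filter over a rotated loading matrix. This is an illustration only: the item names, factor labels, and loading values are invented, and comparing absolute values of the loadings is an assumption of this sketch rather than something stated in the text.

```python
from typing import Dict, List

Loadings = Dict[str, Dict[str, float]]  # item -> {factor: loading}


def select_items(
    loadings: Loadings,
    intended: Dict[str, str],
    primary_min: float = 0.50,
    cross_max: float = 0.30,
) -> List[str]:
    """Keep items that load above primary_min on their intended factor
    and below cross_max on all other factors."""
    kept: List[str] = []
    for item, row in loadings.items():
        target = intended[item]
        primary_ok = abs(row[target]) > primary_min
        cross_ok = all(
            abs(value) < cross_max
            for factor, value in row.items()
            if factor != target
        )
        if primary_ok and cross_ok:
            kept.append(item)
    return kept


# Invented loadings for three hypothetical items on two factors.
example_loadings: Loadings = {
    "item_a": {"physical_abuse": 0.72, "emotional_abuse": 0.10},
    "item_b": {"physical_abuse": 0.55, "emotional_abuse": 0.41},  # cross-loads
    "item_c": {"physical_abuse": 0.05, "emotional_abuse": 0.48},  # weak primary
}
example_intended = {
    "item_a": "physical_abuse",
    "item_b": "physical_abuse",
    "item_c": "emotional_abuse",
}
```

In this invented example only `item_a` would be retained; `item_b` is dropped for its cross-loading and `item_c` for its weak primary loading.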
Next, confirmatory and multisample latent variable analyses were performed using the
EQS structural equations modeling program (Bentler, 1995). Latent variables are error-free
constructs that are composed of the shared variance or relations among a number of manifest or
indicator variables (Bentler & Stein, 1992). These analyses compare a proposed hypothetical
model with a set of actual data. The closeness of the hypothetical model to the empirical data
is evaluated statistically through goodness-of-fit indexes, which include the χ²/degrees of
freedom ratio and various other fit indexes. A χ² value no more than twice the degrees of freedom
in the model generally indicates a plausible, well-fitting model, since with large sample sizes
it is difficult to obtain a nonsignificant χ² value (Newcomb, 1994).
Goodness-of-fit of the models was principally evaluated with Satorra-Bentler robust fit
statistics [the Satorra-Bentler chi-square (S-B χ²) and the Robust Comparative Fit Index
(RCFI)] since the data were multivariately kurtose (Bentler, 1990; Bentler & Dudgeon, 1996).
The RCFI ranges between 0 and 1 and compares the improvement of fit of a hypothesized
model to a model of complete independence among the measured variables while adjusting
for sample size. Values over .90 are desirable because they indicate that 90% or more of the
covariation in the data is reproduced by the hypothesized model (Bentler, 1990, 1995). We also
report another indicator of model fit, the root mean square error of approximation (RMSEA;
Steiger, 1990). An RMSEA of about .07 or less is considered reasonable (Browne & Cudeck, 1993).
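To make these fit criteria concrete, the sketch below computes the χ²/df ratio, a comparative fit index, and an RMSEA point estimate using the standard textbook formulas. It is not the EQS implementation. Applying these formulas to Satorra-Bentler scaled statistics, and using N − 1 rather than N in the RMSEA denominator, are assumptions of this sketch.

```python
import math


def chi_square_ratio(chi_square: float, df: int) -> float:
    """Chi-square divided by its degrees of freedom."""
    if df <= 0:
        raise ValueError("df must be positive")
    return chi_square / df


def cfi(chi_model: float, df_model: int, chi_null: float, df_null: int) -> float:
    """Comparative fit index of a model relative to the independence model."""
    model_excess = max(chi_model - df_model, 0.0)
    baseline = max(chi_model - df_model, chi_null - df_null, 0.0)
    if baseline == 0.0:
        return 1.0  # neither model shows any misfit beyond its df
    return 1.0 - model_excess / baseline


def rmsea(chi_square: float, df: int, n: int) -> float:
    """RMSEA point estimate for a single-group model with sample size n."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))
```

With the values reported in the Results for the New York City sample (S-B χ² = 484.98, df = 262, N = 378), `chi_square_ratio` rounds to 1.85 and `rmsea` rounds to .05, in line with the reported figures. The independence-model χ² needed for `cfi` is not given in the text, so no CFI is recomputed here.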

Models

Preliminary confirmatory factor analyses. Initial confirmatory factor analyses (CFA) were
performed for each group separately with each hypothesized latent construct predicting its
proposed five manifest indicators selected from results of the exploratory factor analysis de-
scribed above. All latent constructs intercorrelated freely since we expected them to be signif-
icantly correlated with each other. This analysis assessed the adequacy of the proposed factor
structure and the relationships among the latent variables. To improve the fit of the models,
a few correlated error residuals suggested by the Lagrange Multiplier Test (LM test, Chou &
Bentler, 1990) were allowed between the measured variables if they made sense theoretically.
We did not allow any complex factor loadings in which an indicator would load on more than
one factor as we planned to contrast the factor structures of the four groups and wanted the
models to be as similar as possible.

Multisample analyses. After the separate confirmatory factor analyses, we tested multiple
group hypotheses about invariance across the four groups in their factor structures (Hoyle
& Smith, 1994). We contrasted the adult substance abusers, the adolescents, the southwest
Texas sample, and the community sample members to see whether the revised instrument held
essentially the same meaning for them. We were testing to see whether the instrument would
be equally useful in clinical and community samples and also wanted to contrast the means of
the latent constructs.
Constraints on the equality of the factor loadings in the CFA models were imposed (Bentler,
1995; Byrne, 1994; Byrne, Shavelson, & Muthén, 1989). After testing a baseline unconstrained
model, the factor loading of each measured variable on its latent factor was constrained to
equality across the groups. The tenability of this constrained model was determined with
the same goodness-of-fit indexes described above, χ²-difference tests, and results of the LM
test, which in this context provides information concerning which equality constraints are not
plausible and should be released to improve the overall fit of the model.
We also assessed the differences between the samples in latent means (Hoyle & Smith, 1994;
Stein & Gelberg, 1995). First we contrasted the latent means of the two clinical populations.
Then, in a four-group analysis, we contrasted the latent means of the clinical groups against
the community sample using the community sample as the reference group. This technique
provides a statistic analogous to a z-score and was meant to determine if the clinical samples
reported more maltreatment than the community sample.

Criterion-related validation

As noted above, therapists’ independent ratings of four types of childhood trauma were ob-
tained from the Child Maltreatment Ascertainment Interview (CMAI) for 179 of the adolescent
psychiatric patients. Ratings of “present” on the CMAI were coded as “2,” ratings of “absent”
were coded as “1,” and ratings of “uncertain” were coded as “1.5.” To evaluate whether scores
from the CTQ short form corresponded well with those independently obtained ratings, we first
performed a CFA in which the five latent variables from the CTQ short form were intercorre-
lated with the four scores from the CMAI (physical, sexual, and emotional abuse, and physical
neglect). We then tested a predictive model to observe whether constructs from the CTQ short
form could predict analogous measured constructs from the CMAI. Initially, all possible pre-
dictive paths were included simultaneously and nonsignificant paths were dropped gradually.
This procedure was a test of both the convergent and discriminative validity of the CTQ short
form (i.e., childhood trauma variables on the CTQ should be related to corresponding variables
on the CMAI, and not to noncorresponding variables) and the criterion-related validity of the
CTQ short form (i.e., the ability of the CTQ to predict an independent criterion variable).
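
The coding of the CMAI ratings described above can be sketched in a few lines. This is only an illustration: the names `CMAI_CODES` and `code_cmai_rating` and the example ratings are hypothetical, and only the three numeric codes come from the text.

```python
# Numeric codes for CMAI therapist ratings, as stated in the text:
# "absent" = 1, "uncertain" = 1.5, "present" = 2.
CMAI_CODES = {"absent": 1.0, "uncertain": 1.5, "present": 2.0}


def code_cmai_rating(rating: str) -> float:
    """Map a CMAI rating label to its numeric code (hypothetical helper)."""
    key = rating.strip().lower()
    if key not in CMAI_CODES:
        raise ValueError(f"Unknown CMAI rating: {rating!r}")
    return CMAI_CODES[key]


# Made-up ratings for one maltreatment type, for illustration only.
example_ratings = ["present", "absent", "uncertain", "Present"]
coded = [code_cmai_rating(r) for r in example_ratings]
print(coded)
```

Coding "uncertain" at the midpoint keeps those cases in the analysis as an intermediate value rather than dropping them.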

Results

Confirmatory factor analysis

Table 1 reports the means, standard deviations, and factor loadings for the individual items
that were selected to form the five latent constructs. Alpha coefficients for each group are also
reported.
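
For readers who want to reproduce alpha coefficients of this kind, the following is a minimal sketch of the standard Cronbach's alpha formula, with reverse scoring for the items marked (R) on the 1 to 5 response scale. The function names and the response data are hypothetical, and the source does not show how its own coefficients were computed.

```python
from statistics import variance
from typing import Sequence


def reverse_score(score: int, low: int = 1, high: int = 5) -> int:
    """Reverse a Likert score on the low..high range (1-5 for the CTQ)."""
    return low + high - score


def cronbach_alpha(items: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; `items` holds one sequence of respondent scores per item."""
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score per respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_var_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))


# Made-up responses from five respondents to three items of one scale.
item_a = [1, 2, 3, 4, 5]
item_b = [2, 2, 3, 5, 5]
item_c_raw = [5, 3, 3, 2, 1]  # positively worded item, reverse-scored below
item_c = [reverse_score(x) for x in item_c_raw]

alpha = cronbach_alpha([item_a, item_b, item_c])
print(f"alpha = {alpha:.2f}")
```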
All manifest variables loaded significantly (p ≤ .001) on their hypothesized latent factors in
all four groups. Model modification was minimal and is described below. The fit indexes were
quite good, indicating that the hypothesized factor structures were plausible for all four
groups: (1) adult substance abusers from New York City, S-B χ²(262, N = 378) = 484.98;
p < .001; χ²/df = 1.85; RCFI = .92; RMSEA = .05; (2) adolescents, S-B χ²(263, N =
396) = 527.77; p < .001; χ²/df = 2.01; RCFI = .94; RMSEA = .05; (3) substance abusers
from the Southwest, S-B χ²(262, N = 625) = 654.47; p < .001; χ²/df = 2.49; RCFI = .93;
RMSEA = .05; and (4) normative community sample, S-B χ²(263, N = 579) = 491.12;
p < .001; χ²/df = 1.87; RCFI = .93; RMSEA = .06. All RCFI values were greater than .90,
all but one of the χ²/df ratios were near 2:1 or less, and RMSEAs were acceptable
in all four groups.
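
The χ²/df ratios quoted above follow directly from the reported chi-square values and degrees of freedom. The short check below recomputes them; the dictionary name and group labels are illustrative, and the numbers are copied from the text. The recomputed ratio for the Southwest sample is about 2.498, so the reported 2.49 appears to be truncated rather than rounded.

```python
# Reported Satorra-Bentler chi-square, degrees of freedom, and chi-square/df
# ratio for each single-group CFA (values copied from the text).
reported_fits = {
    "NYC adult substance abusers": (484.98, 262, 1.85),
    "Adolescent inpatients": (527.77, 263, 2.01),
    "Southwest substance abusers": (654.47, 262, 2.49),
    "Community sample": (491.12, 263, 1.87),
}

ratios = {}
for group, (chi_square, df, reported_ratio) in reported_fits.items():
    ratios[group] = chi_square / df
    print(f"{group}: computed {ratios[group]:.3f}, reported {reported_ratio:.2f}")
```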
A few nonhypothesized covariances among the error residuals were added to each model
based on suggestions from the LM test. These correlations reflect unique associations between
variables that are not accounted for by the latent factor. They may capture either method or
content similarity. It is not surprising that a few were needed for each group and these in no
way altered the fundamental factor structure.

Table 1
Means, standard deviations, and factor loadings of measured variables (CTQ short form itemsᵃ) in the confirmatory factor analysis

                                    Adolescents         Substance abusers   Community members   Texas sample
                                    (N = 396)           (N = 378)           (N = 579)           (N = 625)
Item                                Mean (SD)ᵇ Loadingᶜ Mean (SD)  Loading  Mean (SD)  Loading  Mean (SD)  Loading

I. Emotional abuse (adolescents’ coefficient α = .89, drug abusers’ α = .84, community sample members’ α = .87, Texas α = .88)
Called names by family              2.7 (1.5)  .82      2.2 (1.2)  .60      1.9 (1.2)  .73      2.2 (1.3)  .69
Parents wished was never born       2.1 (1.3)  .72      1.6 (1.0)  .67      1.4 (.9)   .69      1.8 (1.3)  .70
Felt hated by family                2.5 (1.5)  .81      1.7 (1.2)  .78      1.7 (1.1)  .73      2.0 (1.3)  .83
Family said hurtful things          2.7 (1.4)  .84      2.2 (1.2)  .79      2.1 (1.1)  .83      2.2 (1.3)  .82
Was emotionally abused              2.5 (1.5)  .77      1.9 (1.3)  .80      1.8 (1.3)  .85      2.1 (1.5)  .83

II. Physical abuse (α = .86, .81, .83, .85)
Hit hard enough to see doctor       1.3 (.8)   .60      1.4 (.9)   .74      1.1 (.5)   .67      1.3 (.9)   .63
Hit hard enough to leave bruises    2.0 (1.3)  .91      1.8 (1.2)  .74      1.3 (.8)   .84      1.9 (1.3)  .78
Punished with hard objects          2.1 (1.4)  .75      3.0 (1.4)  .49      2.2 (1.2)  .56      2.5 (1.4)  .66
Was physically abused               2.0 (1.5)  .82      1.6 (1.2)  .75      1.4 (1.0)  .82      1.8 (1.3)  .87
Hit badly enough to be noticed      1.4 (1.0)  .69      1.3 (.9)   .73      1.1 (.5)   .63      1.4 (1.0)  .72

III. Sexual abuse (α = .95, .93, .92, .94)
Was touched sexually                1.9 (1.5)  .91      1.7 (1.2)  .75      1.6 (1.0)  .80      1.8 (1.4)  .90
Hurt if didn’t do something sexual  1.4 (1.0)  .71      1.3 (.8)   .75      1.1 (.6)   .68      1.4 (1.0)  .73
Made to do sexual things            1.6 (1.2)  .87      1.5 (1.0)  .82      1.4 (.9)   .85      1.6 (1.2)  .90
Was molested                        1.7 (1.4)  .95      1.4 (1.0)  .91      1.4 (1.0)  .91      1.7 (1.4)  .93
Was sexually abused                 1.7 (1.4)  .93      1.4 (1.0)  .94      1.4 (1.0)  .89      1.7 (1.4)  .92

IV. Emotional neglect (α = .89, .88, .91, .85)
Felt loved (R)                      2.3 (1.3)  .86      1.9 (1.2)  .78      1.8 (.9)   .80      2.1 (1.3)  .79
Made to feel important (R)          2.5 (1.3)  .72      2.3 (1.2)  .67      2.0 (1.1)  .72      2.7 (1.5)  .47
Was looked out for (R)              2.6 (1.3)  .77      2.0 (1.1)  .80      1.9 (1.0)  .84      2.3 (1.3)  .83
Family felt close (R)               3.0 (1.3)  .76      2.1 (1.2)  .78      2.2 (1.1)  .84      2.4 (1.3)  .81
Family was source of strength (R)   2.9 (1.4)  .85      2.1 (1.2)  .83      2.1 (1.1)  .90      2.4 (1.4)  .81

V. Physical neglect (α = .78, .68, .61, .68)
Not enough to eat                   1.4 (.9)   .57      1.5 (.9)   .41      1.2 (.6)   .28      1.7 (1.1)  .40
Got taken care of (R)               2.1 (1.2)  .79      1.8 (1.2)  .65      1.7 (1.0)  .82      2.0 (1.3)  .60
Parents were drunk or high          1.6 (1.2)  .53      1.4 (.8)   .49      1.3 (.7)   .41      1.6 (1.1)  .51
Wore dirty clothes                  1.4 (.8)   .58      1.4 (.8)   .61      1.2 (.5)   .30      1.5 (.9)   .42
Got taken to doctor (R)             1.7 (1.0)  .69      1.5 (.9)   .56      1.3 (.8)   .44      1.8 (1.2)  .66

ᵃ Items presented in abbreviated form; (R) = reverse-scored item.
ᵇ Range of all variables = 1–5. 1 = never true; 2 = rarely true; 3 = sometimes true; 4 = often true; 5 = very often true.
ᶜ All factor loadings significant, p ≤ .001.

For the adult substance abusers, three additional covariances were added: one was between
the residuals of two physical abuse items (“People in my family hit me so hard it left me with
bruises or marks,” and “I was punished with a belt, a board, a cord, or some other hard object”),
one was between two sexual abuse items (“Someone tried to touch me in a sexual way, or tried
to make me touch them,” and “Someone tried to make me do sexual things or watch sexual
things”), and one was between two emotional abuse items (“People in my family called me
things like ‘stupid,’ ‘lazy,’ or ‘ugly,’ ” and “People in my family said hurtful or insulting things
to me”). For the adolescents, two additional correlated error residuals were added. One was
between two sexual abuse items (“Someone tried to make me do sexual things or watch sexual
things,” and “Someone threatened to hurt me or tell lies about me unless I did something sexual
with them”). The other was between two emotional neglect items (“I felt loved,” and “People
in my family felt close to each other”). Three correlated errors were added for the Southwest
sample. The first was between two emotional abuse items (“People in my family said hurtful
or insulting things to me,” and “People in my family called me things like ‘stupid,’ ‘lazy,’ or
‘ugly’ ”); the second was between two physical abuse items (“I got hit or beaten so badly that it
was noticed by someone like a neighbor, teacher, or doctor,” and “People in my family hit me
so hard that it left me with bruises or marks”); and the third was between two physical neglect
items (“I didn’t have enough to eat,” and “I had to wear dirty clothes”). Three correlated errors
were also added for the community sample. The first was between two emotional neglect items
(“I knew there was someone to take care of me and protect me,” and “There was someone in
my family who helped me feel that I was important or special”). The second was between two
sexual abuse items (“Someone tried to make me do sexual things or watch sexual things,” and
“Someone tried to touch me in a sexual way or tried to make me touch them”). The third was
between “I believe that I was physically abused,” and “I believe that I was emotionally abused.”
Relationships among the latent variables are reported in Table 2. All relationships among
the latent variables were significant (p ≤ .001). The relationships between emotional abuse
and physical abuse were particularly high in all groups (.75 among the adolescents, .80 among
the adult substance abusers, .77 among the Texas sample, .87 among the community sample)
as were the relationships between emotional neglect and physical neglect for the four groups
(.88, adolescents; .79, adult substance abusers; .84, Texas sample, .90, community sample).
Relatively smaller although still significant relationships were observed between sexual abuse
and the other latent variables.

Multiple group comparisons

Factor structure. The factor structure is the relationship between the latent and measured
variables. The baseline multiple group model for all four sets with no equality constraints
imposed on it served as a comparison for further models (Model 1 of Table 3). Values for an
absolute null model are also reported for comparison purposes (Model 4). The multiple group
comparison in which the factor structures (measurement models) were constrained to equality
across the groups (Model 2) suggested that the factor structures for the adolescents, adult
substance abusers in New York City, Southwest substance abusers, and normative community
sample were reasonably similar (RCFI = .92), although there was a significant decrement in
fit from the unrestricted model (the two constraints released in the two group analyses were
not constrained in the four group analysis). The χ²-difference between Model 2 with equality
constraints on the factor structure and an unrestricted model (Model 1) was 212.86 with
df = 58, which was significant (p < .001). This lack of equivalence reflects significant group
differences in relationships among some of the measured variables especially between the
community and clinical samples and how they relate to their associated latent variables. After
releasing five of the 58 constraints, the fit improved considerably (Model 3, χ²-difference =
55.00/53 df, nonsignificant, p > .10). The five constraints that were dropped centered mostly
around the equivalence of the loadings of the normative sample as compared to the clinical
groups on three items: “I believe that I was emotionally abused,” “There was someone in
my family who helped me feel that I was important or special,” and “I knew that there was
someone to take care of me and protect me.”

Table 2
Correlations among the latent factorsᵃ

                          I      II     III    IV
Adolescents
I. Emotional abuse        –
II. Physical abuse        .75    –
III. Sexual abuse         .44    .41    –
IV. Emotional neglect     .77    .55    .30    –
V. Physical neglect       .72    .59    .45    .84

Substance abusers
I. Emotional abuse        –
II. Physical abuse        .81    –
III. Sexual abuse         .50    .41    –
IV. Emotional neglect     .78    .52    .27    –
V. Physical neglect       .63    .54    .29    .78

Community sample members
I. Emotional abuse        –
II. Physical abuse        .73    –
III. Sexual abuse         .47    .59    –
IV. Emotional neglect     .83    .50    .30    –
V. Physical neglect       .80    .62    .35    .90

Texas sample
I. Emotional abuse        –
II. Physical abuse        .87    –
III. Sexual abuse         .59    .60    –
IV. Emotional neglect     .61    .53    .40    –
V. Physical neglect       .65    .65    .43    .84

ᵃ All correlations significant, p ≤ .001.

Table 3
Results of multiple group analyses between adolescents, substance abusers, Texas sample, and community sample

Model                                                S-B χ²     df     RCFI (RMSEA)   χ²-difference from Model 1
1. Baseline four-group model, no constraints         2152.47    1049   .93 (.023)     NA
2. Four-group model, constrained measurement model   2365.33    1107   .92 (.024)     212.86/58 df
3. Model 2 without 5 equality constraints            2207.47    1102   .93 (.023)     55.00/53 df
4. Absolute null model                               17044.74   1200   NA             –

Note: S-B χ²: Satorra-Bentler scaled chi-square; RCFI: robust comparative fit index; RMSEA: root mean square
error of approximation; NA: not applicable.
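
The nested-model comparisons in Table 3 can be reproduced with a simple χ²-difference calculation. The sketch below uses only the Python standard library; `chi2_sf` and `difference_test` are hypothetical helper names. It assumes the difference is referred to an ordinary chi-square distribution, which matches the simple subtraction shown in Table 3. Differences of Satorra-Bentler scaled statistics are not exactly chi-square distributed, so the p-values are approximate.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate with integer df.

    Uses the closed-form finite sums for integer degrees of freedom
    (Abramowitz & Stegun 26.4.4 and 26.4.5).
    """
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= half / r
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail plus a finite sum over odd powers of sqrt(x).
    root = math.sqrt(x)
    tail = math.erfc(root / math.sqrt(2.0))
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return tail + math.sqrt(2.0 / math.pi) * math.exp(-half) * total


# (S-B chi-square, df) for Models 1-3, copied from Table 3.
models = {1: (2152.47, 1049), 2: (2365.33, 1107), 3: (2207.47, 1102)}


def difference_test(restricted: int, baseline: int = 1):
    """Return (delta chi-square, delta df, approximate p) for two nested models."""
    chi_r, df_r = models[restricted]
    chi_b, df_b = models[baseline]
    delta_chi, delta_df = chi_r - chi_b, df_r - df_b
    return delta_chi, delta_df, chi2_sf(delta_chi, delta_df)


for model in (2, 3):
    delta_chi, delta_df, p = difference_test(model)
    print(f"Model {model} vs. Model 1: {delta_chi:.2f}/{delta_df} df, p = {p:.3g}")
```

Under these assumptions, the fully constrained model (Model 2) differs significantly from the baseline, while the partially constrained model (Model 3) does not, in line with the text.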

Latent means analysis

We tested for differences between the latent means of the factors in the two clinical samples
and then among the four groups. In these types of models, the factor structure and the observed
intercepts (means) are initially constrained to equality between the groups (see Byrne, 1994).
In the initial latent means analysis contrasting only the two clinical samples, all latent means
were significantly higher for the adolescent group than for the adult substance abusers group
except for the physical abuse latent mean (emotional abuse z = 7.25, p ≤ .001; physical abuse
z = .62, ns; sexual abuse z = 2.79, p ≤ .01; emotional neglect z = 7.42, p ≤ .001; physical
neglect z = 2.27, p ≤ .05). Although this model fit well (RCFI = .98, RMSEA = .054), one
equivalence constraint between the observed intercepts of the two groups was released since
it was reported to be extremely untenable in the LM test (χ² = 149.06, 1 df, p < .001). This
indicator was the physical abuse item, “I was punished with a belt, a board, a cord, or some
other hard object.” As reported in Table 1, this particular item was endorsed more highly by the
substance abusers in New York City, although other items on this factor tended to be endorsed
more highly by the adolescents. Once that constraint was dropped, the physical abuse factor
latent mean was significantly higher for the adolescent group (z = 2.33, p ≤ .05).
In the four group latent means model (RCFI = .96), using the community sample as the
reference group, we found that the adult substance abusers reported more emotional abuse
(z = 2.61, p ≤ .01), more physical abuse (z = 6.78, p ≤ .001), and more physical neglect
(z = 4.68, p ≤ .001) than the community sample. There was no significant difference on the
means for the sexual abuse factor. The adolescents reported more emotional abuse (z = 10.84,
p ≤ .001), more physical abuse (z = 7.85, p ≤ .001), more sexual abuse (z = 4.19,
p ≤ .001), more emotional neglect (z = 10.14, p ≤ .001), and more physical neglect
(z = 7.01, p ≤ .001) than the community sample. The Texas sample reported more emotional
abuse (z = 4.47, p ≤ .001), more physical abuse (z = 8.91, p ≤ .001), more sexual abuse
(z = 5.01, p ≤ .001), more emotional neglect (z = 6.17, p ≤ .001), and more physical
neglect (z = 10.23, p ≤ .001) than the community sample.
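
The significance levels attached to the z statistics above can be checked against the standard normal distribution. The sketch below assumes two-tailed tests, which the text does not state explicitly. The helper names are hypothetical, and the z values are those reported for the two-group comparison of the clinical samples before the intercept constraint was released.

```python
import math


def two_tailed_p(z: float) -> float:
    """Two-tailed p-value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def significance_label(p: float) -> str:
    """Bucket a p-value into the thresholds used in the text."""
    for threshold in (0.001, 0.01, 0.05):
        if p <= threshold:
            return f"p <= {threshold:g}"
    return "ns"


# z statistics from the two-group latent means analysis (adolescents vs.
# adult substance abusers), copied from the text.
z_values = {
    "emotional abuse": 7.25,
    "physical abuse": 0.62,
    "sexual abuse": 2.79,
    "emotional neglect": 7.42,
    "physical neglect": 2.27,
}

labels = {name: significance_label(two_tailed_p(z)) for name, z in z_values.items()}
for name, label in labels.items():
    print(f"{name}: z = {z_values[name]:.2f}, {label}")
```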

Criterion-related validation

First, a CFA validation model was run using the subsample of 179 adolescents available for
this analysis. We added the same correlated error residuals as in the original CFA model with
the complete set of adolescents. This model fit the data very well: S-B χ²(344, N = 179) =
534.55; p < .001; χ²/df = 1.55; RCFI = .93. Correlations between the therapist ratings and
the CTQ latent factors are reported in Table 4. Therapist ratings are arranged in columns. The
highest correlation in each column coincides with the analogous CTQ latent construct.
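
The convergent pattern described here can be verified directly from the Table 4 correlations. The sketch below is illustrative, with hypothetical variable names. It assumes that the therapist "Neglect" rating corresponds to the CTQ physical neglect factor, since the Method section lists physical neglect as the fourth CMAI score.

```python
# Correlations from Table 4: rows are CTQ latent factors, columns are
# therapist (CMAI) ratings in the order listed in `ratings`.
ratings = ["Physical abuse", "Sexual abuse", "Emotional abuse", "Neglect"]
table4 = {
    "Emotional abuse": [0.51, 0.38, 0.48, 0.27],
    "Physical abuse": [0.59, 0.27, 0.45, 0.28],
    "Sexual abuse": [0.18, 0.75, 0.20, 0.23],
    "Emotional neglect": [0.42, 0.22, 0.38, 0.36],
    "Physical neglect": [0.43, 0.27, 0.32, 0.50],
}

# Assumed mapping from each therapist rating to its analogous CTQ factor.
analogous = {
    "Physical abuse": "Physical abuse",
    "Sexual abuse": "Sexual abuse",
    "Emotional abuse": "Emotional abuse",
    "Neglect": "Physical neglect",
}

# For each therapist rating, find the CTQ factor with the largest correlation.
strongest = {}
for col, rating in enumerate(ratings):
    strongest[rating] = max(table4, key=lambda factor: table4[factor][col])

for rating in ratings:
    match = "matches" if strongest[rating] == analogous[rating] else "differs from"
    print(f"{rating}: strongest CTQ factor is {strongest[rating]} ({match} the analogous factor)")
```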
We then used the CTQ latent factors as predictors of the therapist ratings. All factors were
used as predictors of all constructs simultaneously. We allowed covariances (correlations)
among the predictor variables and significant covariances among the error residuals of the
outcome variables. We gradually dropped paths if they were nonsignificant until only
significant paths remained. The fit indices for this final path model reflected an excellent fit:
χ²(361, N = 179) = 550.08; p < .001; χ²/df = 1.52; RCFI = .93; RMSEA = .05. Results
are reported in Figure 1.

Figure 1. Significant regression paths among latent variables in the structural equation model predicting observer
ratings (N = 179). Regression coefficients are standardized (ᵃ p ≤ .05, ᵇ p ≤ .01, ᶜ p ≤ .001). Correlations among
predictors and correlations among residuals of outcomes are not depicted, for readability.

Table 4
Correlations between CTQ latent factors and therapist observation scores for 179 adolescents from the validation
CFA

                          Therapist ratings
CTQ factors               Physical abuse   Sexual abuse   Emotional abuse   Neglect
I. Emotional abuse        .51ᵃ             .38ᵃ           .48ᵃ              .27ᵃ
II. Physical abuse        .59ᵃ             .27ᵃ           .45ᵃ              .28ᵃ
III. Sexual abuse         .18ᵇ             .75ᵃ           .20ᵇ              .23ᵇ
IV. Emotional neglect     .42ᵃ             .22ᵇ           .38ᵃ              .36ᵃ
V. Physical neglect       .43ᵃ             .27ᵃ           .32ᵃ              .50ᵃ

ᵃ p ≤ .001.
ᵇ p ≤ .01.

We found that the CTQ constructs significantly predicted analogous observational scores by
the therapists. In most cases, there was considerable discriminative validity between similar
observed and reported variables, except that CTQ physical abuse also predicted observed
emotional abuse. Since these constructs are highly related to each other, these results are not
unexpected.
To refine these results, we needed to determine empirically whether the path from the
CTQ physical abuse factor to the physical abuse rating was significantly larger than the path
from CTQ physical abuse to the emotional abuse rating variable. Therefore, we ran a model
that constrained these paths to equivalence and then examined the χ²-difference test between
these nested models. The difference test revealed that the paths were significantly different in
magnitude (p < .01), thereby providing additional evidence of the discriminant validity of
the CTQ.

Discussion

The results of the confirmatory factor analyses indicate that with few exceptions the items
on the CTQ short form performed equivalently across four diverse populations with differ-
ing maltreatment histories, supporting the measurement invariance of the scale. In the ini-
tial analyses where each sample was examined separately, the proposed five-factor structure
of the CTQ short form (i.e., physical, sexual, and emotional abuse, and physical and emo-
tional neglect) provided a good fit for the data in all four groups: adult substance abusing
patients in New York City, adolescent psychiatric inpatients, adult substance abusers in the
Southwest, and normative community sample members. To provide a more stringent test of
measurement invariance, we then compared the four groups directly, first using an uncon-
strained baseline model and then introducing equality constraints on the model. When we
constrained the factor structure (i.e., the relationships between items and their latent variables)
to equality, the model provided a good fit for the data, once a few constraints were released.
Thus, individuals in the four groups, which differed widely in terms of age, sex, ethnicity,
SES, psychopathology, and life experiences, responded to the scale’s items in a reasonably
equivalent manner, indicating that the items held essentially the same meaning across diverse
populations.
Importantly, the main precondition for the utility of a scale across different groups is the
invariance of its factor structure (Byrne, 1994). For example, a scale with an invariant factor
structure can be used to perform latent means analyses, even when its covariance structure
shows some nonequivalence between groups (Byrne et al., 1989). Thus, despite small differ-
ences in the covariance structure of the scale particularly in the community sample, our results
support the use of the CTQ short form as a screening instrument for maltreatment in both
clinical and nonreferred groups.
The CTQ short form also showed good evidence of criterion-related validity in a subgroup
of psychiatrically referred adolescents on whom corroborative data were available. When the
CTQ short form’s latent maltreatment variables were compared to analogous therapists’ ratings
of abuse and neglect based on all available information about the patients, the correspondence
between the two sets of measures was quite precise, supporting the convergent and discrim-
inant validity of the CTQ short form. Although the CTQ short form’s physical abuse factor
was related to both physical and emotional abuse ratings made by the therapists, this was
not unexpected. Indeed, the high intercorrelation between the physical and emotional abuse
factors across the four samples supports the clinical observation that physical abuse almost
always occurs in the context of emotional abuse (Claussen & Crittenden, 1991), although the
converse—emotional abuse in the absence of physical abuse—is more common. Moreover,
the physical abuse factor was significantly more highly associated with therapists’ physical
abuse ratings than with their emotional abuse ratings, supporting the discriminant validity of
the physical abuse factor.
The latent means analyses showed that, as expected, the two substance abusing samples
and the sample of adolescent psychiatric patients reported higher levels of maltreatment in
nearly all areas than the normative community sample. One exception was the nonsignificant
difference in levels of sexual abuse reported by the community sample and the adult substance
abusers. This lack of a difference is probably attributable to the predominance of males in the
substance abusing group, whereas the community sample has a higher proportion of females.
To test this possibility, we examined gender differences in the community sample and found
that men reported significantly less sexual abuse than the women (p < .001). This result
corroborates our conclusion that the lack of a mean difference on the sexual abuse factor between
the adult drug abusers and community samples was due to the differing proportions of men
and women in these groups.
In general, the adolescent psychiatric inpatients reported the highest levels of maltreat-
ment, the normative community sample members the lowest levels of maltreatment, and the
adult substance abusers were for the most part intermediate between the other two groups,
although still showing quite substantial maltreatment. Several studies have reported the preva-
lence of abuse and neglect in clinically referred samples of adolescents (Cavaiola & Schiff,
1988; Sanders & Giolas, 1991; Sansonnet-Hayden, Haley, Marriage, & Fine, 1987) and adult
substance abusers (Kroll, Stock, & James, 1985; Schaefer, Sobieraj, & Hollyfield, 1988) that
are far in excess of those found in the general population. For example, in a recent study
of the same adolescent psychiatric patients reported on here (Bernstein et al., 1997), over
50% of patients were rated as abused or neglected by their therapists, and over 70% reported
maltreatment on the CTQ, when cut scores were used to determine caseness. However, few
previous studies have directly compared the prevalence or severity of maltreatment across
different groups, in part because measurement invariance is a precondition for the valid-
ity of such comparisons. By demonstrating the measurement invariance of the CTQ short
form, the present study helps lay the groundwork for more accurate comparisons of the ex-
tent of child abuse and neglect across both clinical and nonreferred populations in future
studies.
The findings of this study must be considered in light of certain methodological limitations.
First, two of the four data sets used in this study (i.e., the adult substance abusers in New
York City and the adolescent psychiatric patients) were used to derive the CTQ short form
by exploratory factor analysis and also to test the measurement invariance of the CTQ short
form by confirmatory factor analysis. Although it would have been preferable to obtain com-
pletely new clinical samples to cross validate the exploratory factor analysis results, this was
not feasible, due to the time and expense that would have been required to gather additional
clinical samples of adequate size. On the other hand, our finding that the CTQ short form
showed measurement invariance in two entirely new samples, including a normative commu-
nity sample, suggests that our results are not merely circular. Second, in the adolescent sample,
the therapists’ maltreatment ratings based on the CMAI were clustered within therapists (i.e.,
each therapist made ratings on more than one patient) and were therefore nonindependent;
however, it is unlikely that this lack of independence affected the external validity results,
because the therapists’ ratings had excellent interrater reliability. Nevertheless, this possibility
cannot be entirely ruled out. More importantly, however, although we found strong evidence
for criterion-related validity in the adolescent sample, no such analyses were performed for
the other three groups due to the absence of direct corroborative data. The verification of
self-reported childhood trauma poses inherent difficulties, including the passage of time and
the secrecy that often surrounds these experiences. For this reason, corroborative data are often
difficult or impossible to obtain, particularly in samples of adults. In the adolescent sample,
we capitalized on the fact that the events in question were relatively recent ones and that the
adolescents’ therapists were privy to many sources of corroborative information, such as child
welfare records and interviews with family members and other informants. No comparable
data were available in the other three samples. Thus, the criterion-related validity of the CTQ
short form in other populations remains to be established.
In summary, our findings provide strong support for the coherence and viability of the con-
structs measured by the CTQ short form, including the invariance of its factor structure across
diverse populations and its criterion-related validity in an adolescent psychiatric population
in which independent corroborative evidence was obtained. The CTQ short form’s brevity of
administration and assessment of multiple types of maltreatment should give it broad utility
in both clinical and nonreferred groups. As a clinical screening instrument, the CTQ short
form, which takes about 5 minutes to administer, can quickly identify individuals with histories
of maltreatment so that appropriate treatments can be provided. As a research tool, its ease
and quickness of administration make it well suited for treatment studies and for large scale
epidemiological and multivariate correlational studies. In future studies, we will continue to
examine the validity of trauma histories obtained with the CTQ, using a variety of research
strategies including corroboration by additional sources of independent evidence.

Acknowledgments

The secretarial and production assistance of Wendy Sallin and Gisele Pham is gratefully
acknowledged.

References

Allen, J. (1995). The spectrum of accuracy in memories of childhood trauma. Harvard Review of Psychiatry, 3,
84–95.
Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107, 238–246.
Bentler, P. M. (1995). EQS structural equations program manual. Encino, CA: Multivariate Software Inc.
Bentler, P. M., & Dudgeon, P. (1996). Covariance structure analysis: Statistical practice, theory, and directions.
Annual Review of Psychology, 47, 563–592.
Bentler, P. M., & Stein, J. A. (1992). Structural equation modeling in medical research. Statistical Methods in
Medical Research, 1, 159–181.
Bernstein, D., & Fink, L. (1998). Childhood Trauma Questionnaire: A retrospective self-report. San Antonio, TX:
The Psychological Corporation.
Bernstein, D. P., Fink, L., Handelsman, L., Foote, J., Lovejoy, M., Wenzel, K., Sapareto, E., & Ruggiero, J. (1994).
Initial reliability and validity of a new retrospective measure of child abuse and neglect. American Journal of
Psychiatry, 151, 1132–1136.
Bernstein, D. P., Fink, L., Handelsman, L., Foote, J., Lovejoy, M., Wenzel, K., Sapareto, E., & Ruggiero, J. (1995).
Validity of child abuse measurements: Dr. Bernstein and colleagues reply. American Journal of Psychiatry, 152,
1535–1537.
Bernstein, D. P., Ahluvalia, T., Pogge, D., & Handelsman, L. (1997). Validity of the Childhood Trauma Ques-
tionnaire in an adolescent psychiatric population. Journal of the American Academy of Child and Adolescent
Psychiatry, 36, 340–348.
Bifulco, A., Brown, G., & Harris, T. (1994). Child Experience of Care and Abuse (CECA): A retrospective interview
measure. Journal of Child Psychology & Psychiatry & Allied Disciplines, 35, 1419–1435.
Brewin, C. R., Andrews, B., & Gotlib, I. H. (1993). Psychopathology and early experience: A reappraisal of
retrospective reports. Psychological Bulletin, 113, 82–98.
Briere, J. (1992). Methodological issues in the study of sexual abuse effects. Journal of Consulting and Clinical
Psychology, 60, 196–203.
Briere, J., & Runtz, M. (1988). Multivariate correlates of childhood psychological and physical maltreatment among
university women. Child Abuse & Neglect, 12, 331–341.
Briere, J., & Zaidi, L. Y. (1989). Sexual abuse histories and sequelae in female psychiatric emergency room patients.
American Journal of Psychiatry, 146, 1602–1606.
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen & J. S. Long (Eds.),
Testing structural equation models (pp. 136–162). Newbury Park, CA: Sage.
Byrne, B. M. (1994). Structural equation modeling with EQS and EQS/Windows. Thousand Oaks, CA: Sage.
Byrne, B. M., Shavelson, R. J., & Muthén, B. (1989). Testing for the equivalence of factor covariance and mean
structures: The issue of partial measurement invariance. Psychological Bulletin, 105, 456–466.
Cahill, L., Prins, B., Weber, M., & McGaugh, J. L. (1994). Beta-adrenergic activation and memory for emotional
events. Nature, 371, 702–704.
Cavaiola, A., & Schiff, M. (1988). Behavioral sequelae of physical and/or sexual abuse in adolescents. Child Abuse
& Neglect, 12, 181–188.
Chou, C. P., & Bentler, P. M. (1990). Model modification in covariance structure modeling: A comparison among
likelihood ratio, Lagrange Multiplier, and Wald tests. Multivariate Behavioral Research, 25, 115–136.
Claussen, A., & Crittenden, P. (1991). Physical and psychological maltreatment: Relations among types of mal-
treatment. Child Abuse & Neglect, 15, 5–18.
Crouch, J., & Milner, J. (1993). Effects of child neglect on children. Criminal Justice and Behavior, 20, 49–65.
Dill, D. L., Chu, J. A., Grob, M. C., & Eisen, S. V. (1991). The reliability of abuse history reports: A comparison
of two inquiry formats. Comprehensive Psychiatry, 32, 166–169.
Ditomasso, M. (1995). Remembering development and validation of an instrument to measure adults’ recall of
maltreatment in childhood. Unpublished doctoral dissertation.
Fink, L., Bernstein, D. P., Handelsman, L., Foote, J., & Lovejoy, M. (1995). Initial reliability and validity of the
Childhood Trauma Interview: A new multidimensional measure of childhood interpersonal trauma. American
Journal of Psychiatry, 152, 1329–1335.
Finkelhor, D. (1994). Current information on the scope and nature of child sexual abuse. Sexual Abuse of Children,
4, 31–53.
Gallagher, R. E., Flye, B. L., Hurt, S. W., Stone, M. H., & Hull, J. W. (1992). Retrospective assessment of traumatic
experiences (RATE). Journal of Personality Disorders, 36, 99–108.
Herman, J. L., Perry, J. C., & van der Kolk, B. A. (1989). Childhood trauma in borderline personality disorder.
American Journal of Psychiatry, 146, 490–495.
Hoyle, R. H., & Smith, G. T. (1994). Formulating clinical research hypotheses as structural equation models: A
conceptual overview. Journal of Consulting and Clinical Psychology, 62, 429–440.
Kendall-Tackett, K. A., Meyer Williams, L., & Finkelhor, D. (1993). Impact of sexual abuse on children: A review
and synthesis of recent empirical studies. Psychological Bulletin, 113, 164–180.
Knutson, J. F. (1995). Psychological characteristics of maltreated children: Putative risk factors and consequences.
Annual Review of Psychology, 46, 401–431.
Kroll, P., Stock, D., & James, M. (1985). The behavior of adult alcoholic men abused as children. Journal of
Nervous and Mental Disease, 173, 689–693.
Lipschitz, D. S., Bernstein, D. P., Winegar, R. K., & Southwick, S. M. (1999). Hospitalized adolescents’ reports of
sexual and physical abuse: A comparison of two self-report measures. Journal of Traumatic Stress, 12, 641–654.
Loftus, E. F. (1993). The reality of repressed memories. American Psychologist, 48, 518–537.
Malinosky-Rummell, R., & Hansen, D. J. (1993). Long-term consequences of childhood physical abuse.
Psychological Bulletin, 114, 68–79.
Meyer, I., Muenzenmaier, K., Cancienne, J., & Struening, E. (1996). Reliability and validity of a measure of sexual
and physical abuse histories among women with serious mental illness. Child Abuse & Neglect, 20, 213–219.
Newcomb, M. D. (1994). Drug use and intimate relationships among women and men: Separating specific from
general effects in prospective data using structural equations models. Journal of Consulting and Clinical
Psychology, 62, 463–476.
Newcomb, M. D. (1997). General deviance and psychological distress: Impact of family support/bonding over 12
years from adolescence to adulthood. Criminal Behaviour and Mental Health, 7, 369–400.
Newcomb, M. D., & Bentler, P. M. (1988). Consequences of adolescent drug use: Impact on the lives of young
adults. Beverly Hills, CA: Sage.
Rogers, M. (1995). Factors influencing recall of childhood sexual abuse. Journal of Traumatic Stress, 8, 691–716.
Rosenberg, M. S. (1987). New directions for research on the psychological maltreatment of children. American
Psychologist, 42, 166–171.
Sanders, B., & Becker-Lausen, E. (1995). The measurement of psychological maltreatment: Early data on the Child
Abuse and Trauma Scale. Child Abuse & Neglect, 19, 315–323.
Sanders, B., & Giolas, M. (1991). Dissociation and childhood trauma in psychologically disturbed adolescents.
American Journal of Psychiatry, 148, 50–53.
Sansonnet-Hayden, H., Haley, G., Marriage, K., & Fine, S. (1987). Sexual abuse and psychopathology in
hospitalized adolescents. Journal of the American Academy of Child and Adolescent Psychiatry, 26, 753–757.
Schaefer, M., Sobieraj, K., & Hollyfield, R. (1988). Prevalence of childhood physical abuse in adult male veteran
alcoholics. Child Abuse & Neglect, 12, 141–149.
Scheier, L. M., & Newcomb, M. D. (1993). Multiple dimensions of affective and cognitive disturbance: Latent
variable models in a community sample. Psychological Assessment: A Journal of Consulting and Clinical
Psychology, 5, 230–234.
Steiger, J. H. (1990). Structural model evaluation and modification: An interval estimation approach. Multivariate
Behavioral Research, 25, 173–180.
Stein, J. A., & Gelberg, L. (1995). Homeless men and women: Differential associations among substance abuse,
psychosocial factors, and severity of homelessness. Experimental and Clinical Psychopharmacology, 3, 75–86.
Stein, J. A., Newcomb, M. D., & Bentler, P. M. (1987). An eight-year study of multiple influences on drug use and
drug use consequences. Journal of Personality and Social Psychology, 53, 1094–1105.
Stein, J. A., Newcomb, M. D., & Bentler, P. M. (1993). Differential effects of parent and grandparent drug use on
behavior problems of male and female children. Developmental Psychology, 29, 31–43.
Straus, M., & Hamby, S. (1997). Measuring physical and psychological maltreatment of children with the Conflict
Tactics Scales. In G. Kantor & J. Jasinski (Eds.), Out of darkness: Contemporary perspectives on family violence
(pp. 119–135). Thousand Oaks, CA: Sage.
Straus, M., Hamby, S., Finkelhor, D., Moore, D., & Runyan, D. (1998). Identification of child maltreatment with the
Parent-Child Conflict Tactics Scales: Development and psychometric data for a national sample of American
parents. Child Abuse & Neglect, 22, 249–270.
Walker, E. A., Bernstein, D. P., & Keegan, D. (1997). A comparison of interview and questionnaire methods of
assessing childhood interpersonal trauma. Unpublished manuscript.
Zanarini, M. C., Gunderson, J. G., Marino, M. F., Schwarz, E. O., & Frankenburg, F. R. (1989). Childhood
experiences of borderline patients. Comprehensive Psychiatry, 30, 18–25.
