Copyright © The British Psychological Society

Unauthorised use and reproduction in any form (including the internet and other electronic means)
is prohibited without prior permission from the Society.

British Journal of Mathematical and Statistical Psychology (2005), 58, 359–375
© 2005 The British Psychological Society

www.bpsjournals.co.uk

Latent variable models for misclassified polytomous outcome variables

Jens C. Eickhoff¹* and Yasuo Amemiya²

¹ Department of Biostatistics and Medical Informatics, University of Wisconsin–Madison, USA
² IBM TJ Watson Research Center, Yorktown Heights, USA

1. Introduction
The problem of analysing concepts or variables which are not directly observable and
can only be measured through related indicators arises frequently in practice. In these
situations, latent variable modelling provides a useful statistical technique. Statistical
methods for analysing covariances and other relationships between latent and observed
variables originated with psychometricians in the form of factor analysis,
later extended to the more general structural equation analysis (Bentler, 1995; Jöreskog
& Sörbom, 1996). Today, latent variable models are extensively used in the behavioural
and social sciences.
In many social and behavioural science studies, outcome variables are measured in
polytomous form, for example as Likert-type outcome variables (e.g. ‘disagree’,
‘neutral’, ‘agree’) that measure psychosocial phenomena such as feelings, attitudes or
opinions. While methods for analysing latent variable models with continuous outcome
variables have been extensively studied, the development of methods on models where
the outcome variables are in non-continuous form remains an active area of research.
Bock and Lieberman (1970) considered a maximum likelihood method for factor
analysis models with dichotomous outcome variables and one factor. However, direct
maximum likelihood analysis for models with higher-dimensional latent variables
involves computational difficulties as it requires maximization over multiple intractable
integrals. This led to the development of multi-stage generalized least squares (GLS)
estimation based on limited first and second-order samples using polychoric and
polyserial correlations (Muthén, 1984; Lee & Poon, 1987; Lee, Poon, & Bentler, 1995).
Multi-stage GLS estimation procedures for structural equation models with polytomous
outcome variables have been implemented in popular psychometric software packages

* Correspondence should be addressed to Jens C. Eickhoff, Department of Biostatistics and Medical Informatics, University
of Wisconsin–Madison, 238 WARF Building, 610 Walnut Street, Madison, WI 53726-2397, USA
(e-mail: cickoff@biostat.wisc.edu).

DOI:10.1348/000711005X64970

including LISCOMP (Muthén, 1987), EQS (Bentler, 1995), LISREL/PRELIS (Jöreskog &
Sörbom, 1996) and Mplus (Muthén & Muthén, 1998). With the availability of high-speed
computers, computationally intensive methods involving Monte Carlo EM and the Gibbs
sampler have recently been proposed for models with continuous and polytomous data
to perform maximum likelihood and Bayesian analysis (Sammel, Ryan, & Legler, 1997;
Shi & Lee, 1998; Lee & Shi, 2001). Moustaki (1996, 2000) and Jöreskog and Moustaki
(2001) proposed a generalized linear latent variable model that allows latent variables to
be linked to manifest variables of different types. A wide range of latent variable models
with continuous and discrete outcomes can be fitted using the STATA program GLLAMM
(Rabe-Hesketh, Skrondal, & Pickles, 2002). Parameter estimation is carried out in
GLLAMM using direct maximum likelihood estimation where the marginal likelihood
function is approximated by adaptive quadrature.
The polytomous outcome variables of latent variable models in psychosocial research
applications are often associated with misclassification. For instance, one might expect
participants to deliberately give incorrect answers to sensitive or touchy questions (such
as ‘Have you ever used any illegal drugs?’). Ignoring these erroneous responses might
lead to errors in inference about the latent variables. The topic of misclassification of
binary and polytomous responses in logistic regression analysis has been intensively
studied (see Bollinger & David, 1997; Cheng & Hsueh, 1999). It has been demonstrated
that even low rates of misclassification in polytomous responses have pronounced
effects on estimation and tests. Cheng and Hsueh (2003) proposed a method to adjust for
bias in the estimation of logistic regression models. However, little work has been done
to incorporate response errors into the generalized latent variable modelling framework.
The main purpose of this paper is to address the problem of response errors in
psychosocial applications and subsequently to develop a latent variable model with
polytomous outcome variables allowing for misclassification which might provide a
useful tool for data analysis in many social and behavioural research studies. In many
applications, researchers are interested in comparing latent variable characteristics
across different groups (e.g. treatment groups vs. control group). Thus, we propose an
approach here which allows for misclassification and which is also appropriate for
coherent multi-group analysis. This article is organized as follows. In Section 2 the
general model and the motivation for our approach are discussed. The maximum likelihood
estimation procedure via Monte Carlo EM algorithm is described in Section 3. In
Sections 4 and 5, a simulation study is performed and an application from a substance
abuse intervention study is discussed. Finally, a brief conclusion is given in Section 6.

2. Model and motivation


Consider a set of G groups which may represent different treatment groups, sex groups,
etc. Let $y_i^{(g)} = \left(y_{1i}^{(g)}, \ldots, y_{pi}^{(g)}\right)'$ denote a set of p polytomous outcome variables for the
gth group, measured on the ith individual, $i = 1, \ldots, N^{(g)}$, where $N^{(g)}$ denotes the
number of observations within group g. We assume independence between the groups
and that each $y_{ki}^{(g)}$ is in polytomous form with $c(k)$ categories $c_1, \ldots, c_{c(k)}$.
To motivate our model, assume that for each group the p outcome variables can be
explained by a small number q ($q < p$) of unobservable latent variables
$f_i^{(g)} = \left(f_{1i}^{(g)}, \ldots, f_{qi}^{(g)}\right)'$ with density function $p\left(f_i^{(g)}; \eta_{f^{(g)}}\right)$. Conditionally on $f_i^{(g)}$,
assume that the elements of $y_i^{(g)}$ are independent observations, and that each $y_{ki}^{(g)}$ relates
to the latent variables through a logistic response probability function, i.e.
$$P\left(y_{ki}^{(g)} \le c_j \mid f_i^{(g)}\right) = \left[1 + \exp\left\{-\left(\alpha_{kj}^{(g)} + \beta_k^{(g)\prime} f_i^{(g)}\right)\right\}\right]^{-1}$$

for $k = 1, \ldots, p$, $j = 1, \ldots, c(k) - 1$, and $\alpha_{k1}^{(g)} < \cdots < \alpha_{k(c(k)-1)}^{(g)}$. The intercept and
slope parameters, $\alpha_{kj}^{(g)}$ and $\beta_k^{(g)}$, describe the measurement properties for the kth
outcome variable. To incorporate the response error, we assume that instead of
observing the true outcome variables $y_{1i}^{(g)}, \ldots, y_{pi}^{(g)}$ directly, we observe possibly
misclassified outcome variables $y_{1i}^{(g)*}, \ldots, y_{pi}^{(g)*}$. An intuitive way to describe the
relationship between observed and true outcome variables would be to include a
misclassification parameter representing the conditional probability that the observed
response falls into category $c_j$ given that the true response is in category $c_l$, i.e.

$$P\left(y_{ki}^{(g)*} = c_j \mid y_{ki}^{(g)} = c_l\right) = \pi_{kjl}^{(g)}, \quad \text{for } j, l = 1, \ldots, c(k).$$

This would add $G \sum_{k=1}^{p} c(k)\,(c(k) - 1)$ additional parameters to the model and clearly,
without further restrictions, the model would not be identifiable. It is difficult to give necessary
and sufficient identification conditions for this type of parametrization and this will not be
attempted here. Instead, we propose a different type of parametrization, which can be
considered reasonable in many applications with meaningful parameter interpretation and
which provides identifiability. Specifically, we assume a monotone misclassification
pattern, i.e. misclassification may occur only in one direction in the sense that if the true
response of a participant falls into any category above $c_j$ then the observed response may be
any category $c_l \le c_j$. On the other hand, it is assumed that no misclassification can occur for
the case where the observed response falls into any category above $c_j$ given that the true
response is from any category $c_l \le c_j$. To illustrate the monotone misclassification pattern,
we consider a question assessing alcohol drinking behaviour. Participants are asked the
question ‘How would you describe your alcohol drinking intake?’ Assume that the possible
responses are categorized as 1 = ‘low’, 2 = ‘moderate’ and 3 = ‘high’. If the true drinking
behaviour of a participant is in the ‘low’ category, it is reasonable to assume the observed
response will not fall into the category ‘moderate’ or ‘high’. However, if the true drinking
behaviour of the participant is in the ‘moderate’ category, we are willing to allow for an
understatement in the observed response, i.e. the observed response might fall into the
‘low’ category but not into the ‘high’ category. Finally, if the participant is an excessive
drinker, one may suspect that he/she might understate his or her drinking behaviour, i.e.
the observed response might fall into the ‘low’ or ‘moderate’ category. We can express
the monotone misclassification pattern as

$$P\left(y_{ki}^{(g)*} \le c_j \mid y_{ki}^{(g)} > c_j\right) = \gamma_{kj}^{(g)},$$

$$P\left(y_{ki}^{(g)*} > c_j \mid y_{ki}^{(g)} \le c_j\right) = 0.$$

The conditional distribution of the observed outcome variables given the latent variables
can be written as

$$P\left(y_{ki}^{(g)*} \le c_j \mid f_i^{(g)}\right) = P\left(y_{ki}^{(g)*} \le c_j \mid y_{ki}^{(g)} \le c_j\right) P\left(y_{ki}^{(g)} \le c_j \mid f_i^{(g)}\right) + P\left(y_{ki}^{(g)*} \le c_j \mid y_{ki}^{(g)} > c_j\right) P\left(y_{ki}^{(g)} > c_j \mid f_i^{(g)}\right).$$

Hence, we can write the latent variable model with monotone misclassification as

$$P\left(y_{ki}^{(g)*} \le c_j \mid f_i^{(g)}\right) = \gamma_{kj}^{(g)} + \left(1 - \gamma_{kj}^{(g)}\right)\left[1 + \exp\left\{-\left(\alpha_{kj}^{(g)} + \beta_k^{(g)\prime} f_i^{(g)}\right)\right\}\right]^{-1}, \qquad (1)$$

for $g = 1, \ldots, G$; $k = 1, \ldots, p$; $j = 1, \ldots, c(k) - 1$, and $i = 1, \ldots, N^{(g)}$. This monotone
misclassification pattern covers a wide range of applications. Modification of this
pattern can be easily implemented. For example, one modification would be to allow for
misclassification for the case where the observed response falls into any category $c_l \ge c_j$
given that the true response falls into a category below $c_j$. Another modification of a
monotone misclassification pattern would be to assume an ordered structure for the
misclassification parameters.
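As an illustration, the response probabilities implied by model (1) for a single outcome variable can be sketched as follows. This is a minimal sketch, not the authors' implementation: the function names and the example values are hypothetical, and the code does not check that the misclassification parameters produce non-decreasing cumulative probabilities.

```python
import math


def cumulative_probs(alphas, beta, f, gammas):
    """Return P(y* <= c_j | f) for j = 1, ..., c-1 under model (1).

    alphas: intercepts alpha_k1, ..., alpha_k(c-1) for one outcome variable
    beta:   slope vector beta_k
    f:      latent variable vector f_i
    gammas: misclassification parameters gamma_k1, ..., gamma_k(c-1)
    """
    linear = sum(b * x for b, x in zip(beta, f))
    probs = []
    for alpha, gamma in zip(alphas, gammas):
        # Logistic probability for the true (unobserved) response.
        true_cum = 1.0 / (1.0 + math.exp(-(alpha + linear)))
        # Observed response: downward misclassification with probability gamma.
        probs.append(gamma + (1.0 - gamma) * true_cum)
    return probs


def category_probs(alphas, beta, f, gammas):
    """Return P(y* = c_j | f) for j = 1, ..., c by differencing."""
    cum = cumulative_probs(alphas, beta, f, gammas) + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]


# Hypothetical example: four categories, one latent variable.
example = category_probs([-1.0, 0.0, 1.0], [1.0], [0.0], [0.1, 0.1, 0.1])
```

Setting every misclassification parameter to zero recovers the ordinary cumulative logistic model, which is a quick sanity check on the sketch.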
Model (1) contains the factor indeterminacy inherent in this type of latent variable
model. That is, the same model can be expressed using transformed parameters and
factors. To remove this indeterminacy, we use a parametrization suitable for multi-group
analysis. With possible reordering of the p outcome variables, $k = 1, \ldots, p$, we assume that,
for the first q outcome variables,

$$\alpha_{11}^{(g)}, \ldots, \alpha_{q1}^{(g)} = 0,$$

and

$$\beta_{q \times q}^{(g)} = \left(\beta_1^{(g)} \cdots \beta_q^{(g)}\right) = I_q,$$

where q denotes the dimension of the latent variable and $I_q$ is the $q \times q$ identity
matrix. This is an interpretable and meaningful identification parametrization where
the group characteristic latent variable distribution parameters $\eta_{f^{(g)}}$ are unrestricted.
Therefore, differences between groups can be assessed by comparing $\eta_{f^{(g)}}$ over
different groups.

3. Maximum likelihood estimation via Monte Carlo EM algorithm


3.1. The likelihood function
We consider the maximum likelihood estimation for the parameters in model (1) which
include the intercept and slope parameters, $\alpha_{kj}^{(g)}$ and $\beta_k^{(g)}$; the misclassification
parameters, $\gamma_{kj}^{(g)}$; and the latent variable distribution parameters $\eta_{f^{(g)}}$. Let
$f = \left(f^{(1)\prime}, \ldots, f^{(G)\prime}\right)'$ and

$$\theta_k^{(g)} = \left(\gamma_{k1}^{(g)}, \ldots, \gamma_{k(c(k)-1)}^{(g)}, \alpha_{k1}^{(g)}, \ldots, \alpha_{k(c(k)-1)}^{(g)}, \beta_k^{(g)\prime}\right)', \qquad (2)$$

for $k = 1, \ldots, p$; $g = 1, \ldots, G$. Furthermore, we define the vector of all model
parameters as

$$\psi = \left(\theta_1^{(1)\prime}, \ldots, \theta_p^{(1)\prime}, \ldots, \theta_1^{(G)\prime}, \ldots, \theta_p^{(G)\prime}, \eta_{f^{(1)}}', \ldots, \eta_{f^{(G)}}'\right)'. \qquad (3)$$

For future reference we also define the group-specific parameter

$$\psi^{(g)} = \left(\theta_1^{(g)\prime}, \ldots, \theta_p^{(g)\prime}, \eta_{f^{(g)}}'\right)'. \qquad (4)$$
The log-likelihood function of the observed data $y^* = \left(y^{(1)*\prime}, \ldots, y^{(G)*\prime}\right)'$
is given by

$$l(\psi \mid y^*) = \sum_{g=1}^{G} \sum_{i=1}^{N^{(g)}} \log\left[\int \prod_{k=1}^{p} \prod_{j=1}^{c(k)} \left\{P\left(y_{ki}^{(g)*} = c_j \mid f_i^{(g)}; \theta_k^{(g)}\right)\right\}^{\delta_{kij}^{(g)}} p\left(f_i^{(g)}; \eta_{f^{(g)}}\right) df_i^{(g)}\right], \qquad (5)$$

where

$$\delta_{kij}^{(g)} = \begin{cases} 1, & \text{if } y_{ki}^{(g)*} = c_j, \\ 0, & \text{otherwise,} \end{cases}$$

for $j = 1, \ldots, c(k)$.
The direct maximization of the log-likelihood function given in (5) is difficult,
involving intractable multiple integrals which cannot be evaluated in closed form. To
solve this difficulty, we treat the latent variable $f_i^{(g)}$ as a missing variable and utilize the
EM approach (Dempster, Laird, & Rubin, 1977).

3.2. An EM algorithm
The E-step of the EM approach computes the conditional expectation of the complete
log-likelihood function given the observed variables evaluated at the current parameter
estimate. Specifically, the E-step at iteration j + 1 computes

$$Q(\psi \mid \psi_{(j)}) = E\left[l_c(\psi \mid y^*, f) \mid y^*, \psi_{(j)}\right] = \sum_{g=1}^{G} \sum_{i=1}^{N^{(g)}} E\left[\log p\left(y_i^{(g)*}, f_i^{(g)}; \psi^{(g)}\right) \mid y_i^{(g)*}; \psi_{(j)}^{(g)}\right], \qquad (6)$$

where $\psi_{(j)}$ and $\psi_{(j)}^{(g)}$ are the estimates of $\psi$ and $\psi^{(g)}$ from (3) and (4) at iteration j. This
expectation is complicated as it is with respect to the conditional density of the latent
variables given the observed variables, for which, in our setting, no closed form is
available. However, we can rewrite this conditional density as

$$p\left(f_i^{(g)} \mid y_i^{(g)*}; \psi_{(j)}^{(g)}\right) = \frac{p\left(y_i^{(g)*} \mid f_i^{(g)}; \psi_{(j)}^{(g)}\right) p\left(f_i^{(g)}; \eta_{f^{(g)}(j)}\right)}{\int p\left(y_i^{(g)*} \mid f_i^{(g)}; \psi_{(j)}^{(g)}\right) p\left(f_i^{(g)}; \eta_{f^{(g)}(j)}\right) df_i^{(g)}},$$

and substitute it into equation (6). That is,

$$Q(\psi \mid \psi_{(j)}) = \sum_{g=1}^{G} \sum_{i=1}^{N^{(g)}} \frac{\int \log p\left(y_i^{(g)*}, f_i^{(g)}; \psi^{(g)}\right) p\left(y_i^{(g)*} \mid f_i^{(g)}; \psi_{(j)}^{(g)}\right) p\left(f_i^{(g)}; \eta_{f^{(g)}(j)}\right) df_i^{(g)}}{\int p\left(y_i^{(g)*} \mid f_i^{(g)}; \psi_{(j)}^{(g)}\right) p\left(f_i^{(g)}; \eta_{f^{(g)}(j)}\right) df_i^{(g)}}. \qquad (7)$$

Evaluation of the expression above requires integration over $f_i^{(g)}$. Numerical integral
approximation such as adaptive quadrature (Liu & Pierce, 1994) can be used to
approximate expression (7). Adaptive quadrature is commonly used in generalized
mixed effect models to evaluate marginal likelihoods (see Goldstein, 1991; Lesaffre &
Spiessens, 2001; Rabe-Hesketh et al., 2002). However, numerical quadrature can become
unreliable, especially for high-dimensional integrations (see Meng & Schilling, 1996).
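Our approach is to use a version of Monte Carlo integration. Specifically, at iteration j + 1
of our EM algorithm, we draw a large number M of independent samples,

$$\hat{f}_{1i}^{(g)}, \ldots, \hat{f}_{Mi}^{(g)} \sim p\left(f_i^{(g)}; \eta_{f^{(g)}(j)}\right),$$

and approximate (7) by

$$\hat{Q}(\psi \mid \psi_{(j)}) = \sum_{g=1}^{G} \sum_{i=1}^{N^{(g)}} \frac{\sum_{m=1}^{M} \log p\left(y_i^{(g)*}, \hat{f}_{mi}^{(g)}; \psi^{(g)}\right) p\left(y_i^{(g)*} \mid \hat{f}_{mi}^{(g)}; \psi_{(j)}^{(g)}\right)}{\sum_{m=1}^{M} p\left(y_i^{(g)*} \mid \hat{f}_{mi}^{(g)}; \psi_{(j)}^{(g)}\right)}. \qquad (8)$$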

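In the M-step, the updated parameter estimates $\psi_{(j+1)}$ are obtained by maximizing (8).
Because of the conditional independence of the observed variables given the latent
variable, we can separate the parameter space of $\psi$ into components corresponding to
each outcome variable and the latent variable. Hence, expression (8) can be written as

$$\hat{Q}(\psi \mid \psi_{(j)}) = \sum_{g=1}^{G} \sum_{k=1}^{p} \hat{Q}_k^{(g)}\left(\theta_k^{(g)} \mid \psi_{(j)}\right) + \sum_{g=1}^{G} \hat{Q}_{f^{(g)}}\left(\eta_{f^{(g)}} \mid \psi_{(j)}\right),$$

with

$$\hat{Q}_k^{(g)}\left(\theta_k^{(g)} \mid \psi_{(j)}\right) = \sum_{i=1}^{N^{(g)}} \frac{\sum_{m=1}^{M} \log p\left(y_{ki}^{(g)*} \mid \hat{f}_{mi}^{(g)}; \theta_k^{(g)}\right) p\left(y_i^{(g)*} \mid \hat{f}_{mi}^{(g)}; \psi_{(j)}^{(g)}\right)}{\sum_{m=1}^{M} p\left(y_i^{(g)*} \mid \hat{f}_{mi}^{(g)}; \psi_{(j)}^{(g)}\right)} \qquad (9)$$

and

$$\hat{Q}_{f^{(g)}}\left(\eta_{f^{(g)}} \mid \psi_{(j)}\right) = \sum_{i=1}^{N^{(g)}} \frac{\sum_{m=1}^{M} \log p\left(\hat{f}_{mi}^{(g)}; \eta_{f^{(g)}}\right) p\left(y_i^{(g)*} \mid \hat{f}_{mi}^{(g)}; \psi_{(j)}^{(g)}\right)}{\sum_{m=1}^{M} p\left(y_i^{(g)*} \mid \hat{f}_{mi}^{(g)}; \psi_{(j)}^{(g)}\right)}, \qquad (10)$$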
where $\theta_k^{(g)}$ and $\psi^{(g)}$ are defined in (2) and (4), and $\eta_{f^{(g)}}$ denotes the latent variable
distribution parameter. Therefore, the M-step can be carried out by maximizing (9) with
respect to each $\theta_k^{(g)}$, and (10) with respect to each $\eta_{f^{(g)}}$, separately. Note that
$\hat{Q}_k^{(g)}\left(\theta_k^{(g)} \mid \psi_{(j)}\right)$ and $\hat{Q}_{f^{(g)}}\left(\eta_{f^{(g)}} \mid \psi_{(j)}\right)$ are in the form of a weighted likelihood. Thus, each
maximization can be carried out by modifying the existing maximum likelihood
procedures. Specifically, $\hat{Q}_k^{(g)}\left(\theta_k^{(g)} \mid \psi_{(j)}\right)$ can be maximized by an iteratively reweighted
least squares procedure. Expression (10) depends on the density function $p\left(f_i^{(g)}; \eta_{f^{(g)}}\right)$. For
example, if $f_i^{(g)}$ is normally distributed with mean $\mu_{f^{(g)}}$ and covariance matrix $\Sigma_{f^{(g)}}$, then
the closed-form solutions for the next step estimates $\hat{\mu}_{f^{(g)}(j+1)}$ and $\hat{\Sigma}_{f^{(g)}(j+1)}$ are given by

$$\hat{\mu}_{f^{(g)}(j+1)} = \frac{\sum_{i=1}^{N^{(g)}} \sum_{m=1}^{M} \hat{f}_{mi}^{(g)}\, \hat{w}_{mi}^{(g)}\left(\theta_{k(j)}^{(g)}\right)}{\sum_{i=1}^{N^{(g)}} \sum_{m=1}^{M} \hat{w}_{mi}^{(g)}\left(\theta_{k(j)}^{(g)}\right)},$$

$$\hat{\Sigma}_{f^{(g)}(j+1)} = \frac{\sum_{i=1}^{N^{(g)}} \sum_{m=1}^{M} \left(\hat{f}_{mi}^{(g)} - \hat{\mu}_{f^{(g)}(j+1)}\right)\left(\hat{f}_{mi}^{(g)} - \hat{\mu}_{f^{(g)}(j+1)}\right)' \hat{w}_{mi}^{(g)}\left(\theta_{k(j)}^{(g)}\right)}{\sum_{i=1}^{N^{(g)}} \sum_{m=1}^{M} \hat{w}_{mi}^{(g)}\left(\theta_{k(j)}^{(g)}\right)},$$

where

$$\hat{w}_{mi}^{(g)}\left(\theta_{k(j)}^{(g)}\right) = \frac{\prod_{k=1}^{p} \prod_{j=1}^{c(k)} \left\{p\left(y_{ki}^{(g)*} = c_j \mid \hat{f}_{mi}^{(g)}; \theta_{k(j)}^{(g)}\right)\right\}^{\delta_{kij}^{(g)}}}{\sum_{s=1}^{M} \prod_{k=1}^{p} \prod_{j=1}^{c(k)} \left\{p\left(y_{ki}^{(g)*} = c_j \mid \hat{f}_{si}^{(g)}; \theta_{k(j)}^{(g)}\right)\right\}^{\delta_{kij}^{(g)}}}.$$

Because the weights sum to one over m for each individual i, both denominators equal $N^{(g)}$.
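The normalized weights and the closed-form update for a normally distributed latent variable can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the conditional likelihood values for each Monte Carlo draw are assumed to be available already, and plain Python lists are used instead of a matrix library.

```python
def normalized_weights(likelihoods):
    """Turn the values p(y_i* | f_hat_mi; psi_(j)), m = 1..M, for one
    individual into weights w_hat_mi that sum to one."""
    total = sum(likelihoods)
    return [value / total for value in likelihoods]


def update_normal_params(draws, weights):
    """Weighted mean vector and covariance matrix of the latent draws.

    draws[i][m]   is the q-vector f_hat_mi for individual i
    weights[i][m] is the matching weight w_hat_mi
    """
    q = len(draws[0][0])
    # Flatten the (individual, draw) structure into (vector, weight) pairs.
    pairs = [(f, w) for fi, wi in zip(draws, weights) for f, w in zip(fi, wi)]
    total = sum(w for _, w in pairs)
    mean = [sum(w * f[a] for f, w in pairs) / total for a in range(q)]
    cov = [
        [
            sum(w * (f[a] - mean[a]) * (f[b] - mean[b]) for f, w in pairs) / total
            for b in range(q)
        ]
        for a in range(q)
    ]
    return mean, cov


# Hypothetical example: one individual, two draws of a two-dimensional factor.
w = normalized_weights([1.0, 3.0])
mean, cov = update_normal_params([[[0.0, 0.0], [2.0, 2.0]]], [w])
```

In a real implementation the likelihood values would typically be handled on the log scale before normalizing, since products over many outcome variables can underflow.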

3.3. Estimation of standard errors


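Let $\hat{\psi}$ denote the maximum likelihood estimate obtained by the Monte Carlo EM, and
$\hat{\psi}^{(g)}$ be the part of $\hat{\psi}$ corresponding to $\psi^{(g)}$ as defined in (4). Then the empirical
observed information matrix is given by

$$\hat{I}_e(\hat{\psi}) = \sum_{g=1}^{G} \sum_{i=1}^{N^{(g)}} \frac{\partial}{\partial \psi^{(g)}} \log p\left(y_i^{(g)*}; \hat{\psi}^{(g)}\right) \frac{\partial}{\partial \psi^{(g)}} \log p\left(y_i^{(g)*}; \hat{\psi}^{(g)}\right)'. \qquad (11)$$

Note that the gradient vector of the complete log-likelihood function given the observed
data for observation i in group g can be written as

$$\frac{\partial}{\partial \psi^{(g)}} E\left[l_i^c\left(\psi \mid y_i^{(g)*}, f_i^{(g)}\right) \mid y_i^{(g)*}\right] = \frac{\partial}{\partial \psi^{(g)}} \log p\left(y_i^{(g)*}; \psi\right),$$

i.e. the individual score functions of the observed data can be computed as a by-product
of the EM algorithm. Therefore, expression (11) can be approximated by

$$\hat{I}_e(\hat{\psi}) \approx \sum_{g=1}^{G} \sum_{i=1}^{N^{(g)}} \frac{\partial}{\partial \psi^{(g)}} E\left[l_i^c\left(\hat{\psi} \mid y_i^{(g)*}, f_i^{(g)}\right) \mid y_i^{(g)*}\right] \frac{\partial}{\partial \psi^{(g)}} E\left[l_i^c\left(\hat{\psi} \mid y_i^{(g)*}, f_i^{(g)}\right) \mid y_i^{(g)*}\right]',$$

where

$$\frac{\partial}{\partial \psi^{(g)}} E\left[l_i^c\left(\hat{\psi} \mid y_i^{(g)*}, f_i^{(g)}\right) \mid y_i^{(g)*}\right]$$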
can be obtained in the same way as described in Section 3.2 using Monte Carlo
integration. This Monte Carlo integration needs to be performed only once after the
convergence of the Monte Carlo EM algorithm has been determined.
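Once the individual score vectors are available, the step from expression (11) to standard errors is mechanical. The sketch below assumes that the Monte Carlo approximated scores have already been computed and stored one row per individual. The function names are hypothetical, and taking the square roots of the diagonal of the inverted information matrix is the usual convention rather than a detail stated in the text.

```python
import math


def empirical_information(scores):
    """Sum of outer products s_i s_i' over individuals, as in (11)."""
    r = len(scores[0])
    info = [[0.0] * r for _ in range(r)]
    for s in scores:
        for a in range(r):
            for b in range(r):
                info[a][b] += s[a] * s[b]
    return info


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (pure Python)."""
    n = len(matrix)
    aug = [
        list(row) + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("information matrix is numerically singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [v - factor * c for v, c in zip(aug[row], aug[col])]
    return [row[n:] for row in aug]


def standard_errors(scores):
    """Square roots of the diagonal of the inverse empirical information."""
    inverse = invert(empirical_information(scores))
    return [math.sqrt(inverse[a][a]) for a in range(len(inverse))]


# Hypothetical scores for three individuals and two parameters.
se = standard_errors([[1.0, 0.0], [0.0, 2.0], [1.0, 0.0]])
```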

3.4. Convergence of Monte Carlo EM algorithm


It is a notoriously difficult task to assess the convergence of the Monte Carlo EM
algorithm. Specifically, the celebrated monotonicity property of the deterministic EM
algorithm no longer holds as the E-step is approximated through stochastic integration.
In the deterministic EM algorithm, the change in the parameter value from two
consecutive iterations is decreasing with increasing iterations. However, in the Monte
Carlo EM algorithm this change may be swamped by the Monte Carlo error. Therefore,
one should increase the Monte Carlo sample size M with increasing iterations. Several
strategies have been suggested for increasing M (Chan & Kuk, 1997; McCulloch, 1997).
In these methods the Monte Carlo EM is implemented in three phases. First, during a
‘burn-in’ phase, a small Monte Carlo size M is used, resulting in a relatively large Monte
Carlo error which does not affect the change in the E-step much. During the growing
phase, the Monte Carlo sample size is increased linearly with each Monte Carlo EM
iteration to reduce the Monte Carlo error. Finally, during the stationary phase, the
algorithm is run with a large Monte Carlo sample size for several iterations so that the
effect of the Monte Carlo error is small. The convergence of the Monte Carlo EM
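algorithm can then be determined during the stationary phase. One way is to
approximate the log-likelihood function (5) during the stationary phase by Monte Carlo
integration, i.e.

$$\hat{l}(\psi \mid y^*) = \sum_{g=1}^{G} \sum_{i=1}^{N^{(g)}} \log\left(\frac{1}{M} \sum_{m=1}^{M} \left[\prod_{k=1}^{p} \prod_{j=1}^{c(k)} \left\{p\left(y_{ki}^{(g)*} = c_j \mid \hat{f}_{mi}^{(g)}; \theta_k^{(g)}\right)\right\}^{\delta_{kij}^{(g)}}\right]\right), \qquad (12)$$

using the Monte Carlo sample $\hat{f}_{1i}^{(g)}, \ldots, \hat{f}_{Mi}^{(g)}$ from the E-step of the EM algorithm.
Convergence is determined at iteration j + 1 if

$$\left|\hat{l}\left(\hat{\psi}_{(j+1)} \mid y^*\right) - \hat{l}\left(\hat{\psi}_{(j)} \mid y^*\right)\right| < \delta, \qquad (13)$$

where $\delta$ has a predefined value.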
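The Monte Carlo average inside the logarithm involves likelihood values that can be extremely small, so a direct implementation may underflow. One way to guard against this, sketched here under the assumption that the conditional log-likelihoods for each draw are already available, is the log-sum-exp device. The function names are hypothetical and the device itself is a standard numerical technique, not something described in the text.

```python
import math


def mc_loglik_contribution(cond_logliks):
    """log((1/M) * sum_m exp(cond_logliks[m])) for one individual,
    computed stably by factoring out the largest term."""
    top = max(cond_logliks)
    mean_scaled = sum(math.exp(v - top) for v in cond_logliks) / len(cond_logliks)
    return top + math.log(mean_scaled)


def mc_loglik(all_cond_logliks):
    """Sum the individual contributions over all individuals and groups."""
    return sum(mc_loglik_contribution(v) for v in all_cond_logliks)


# Hypothetical example: two individuals with two draws each.
value = mc_loglik([[math.log(0.2), math.log(0.4)], [-2000.0, -2000.0]])
```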


In our numerical example and simulation studies, we used the following set-up to
increase the Monte Carlo sample size: M = 50 for iterations 1–19 (burn-in phase), M is
increased linearly by 150 for iterations 20–53 (growing phase), M = 5,000 after iteration
53 (stationary phase). The pre-defined tolerance level $\delta$ in (13) was fixed at 0.001. In
order to ensure that the algorithm was not terminated prematurely, we did not stop the
algorithm in our numerical examples until condition (13) was fulfilled for three
consecutive iterations.
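The sample-size schedule and the stopping rule can be written down compactly. The sketch below is one reading of the set-up described above, not the authors' code: the function names are hypothetical, and capping the growing phase at 5,000 is an assumption made so that the schedule joins the stationary phase smoothly.

```python
def mc_sample_size(iteration, burn_in_end=19, growth_end=53,
                   m_start=50, step=150, m_max=5000):
    """Monte Carlo sample size M used at a given EM iteration (1-based)."""
    if iteration <= burn_in_end:
        return m_start  # burn-in phase: small, constant M
    if iteration <= growth_end:
        # Growing phase: add `step` draws per iteration, capped at m_max.
        return min(m_max, m_start + step * (iteration - burn_in_end))
    return m_max  # stationary phase


def has_converged(loglik_history, tol=0.001, runs=3):
    """True once condition (13) has held for `runs` consecutive iterations.

    loglik_history holds the approximated log-likelihood values (12),
    one per iteration of the stationary phase.
    """
    if len(loglik_history) < runs + 1:
        return False
    recent = loglik_history[-(runs + 1):]
    return all(abs(b - a) < tol for a, b in zip(recent, recent[1:]))
```

Only values recorded during the stationary phase should be passed to `has_converged`, since the text determines convergence in that phase.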

3.5. Model diagnostic and misclassification test


3.5.1. Goodness of fit
We introduce a goodness-of-fit test statistic which can be used to test the
appropriateness of a particular latent variable model with misclassified polytomous
outcome variables (1). Let $\hat{l}(\hat{\psi} \mid y^*)$ denote the Monte Carlo approximated log-likelihood
function at the maximum likelihood estimate $\hat{\psi}$ as given in (12). Moreover, let $l_{sat}(\hat{w} \mid y^*)$
denote the log-likelihood function of the saturated model, that is, the log-likelihood
function of a multinomial distribution with $\prod_{k=1}^{p} c(k)$ categories where $\hat{w}$ is the vector of
observed frequencies for each category. Then, under the null hypothesis that the
proposed latent variable model (1) is appropriate,

$$X^2 = -2\left(\hat{l}(\hat{\psi} \mid y^*) - l_{sat}(\hat{w} \mid y^*)\right) \qquad (14)$$

is approximately distributed as a $\chi^2$ with $\prod_{k=1}^{p} c(k) - 1 - r$ degrees of freedom, where p
denotes the number of outcome variables, $c(k)$ the number of categories for outcome
variable k, and r the number of free parameters in model (1). This $\chi^2$ approximation
requires a large sample size to ensure a sufficiently large number of observations in each
response pattern category of the saturated multinomial model. In practice, the $\chi^2$
approximation of (14) tends to be numerically unstable because the expected
frequencies of some response pattern categories of the saturated multinomial model are
often less than 5. However, the test is still very useful for model selection by comparing
the corresponding $\chi^2$ values. Alternatively, one might consider using standard goodness-
of-fit measures such as the Akaike information criterion (AIC: Akaike, 1987) or the
Schwarz Bayesian criterion (SBC: Schwarz, 1978). These criteria take into account the
value of the likelihood evaluated at the maximum likelihood estimates and the number
of parameters estimated. Sclove (1987) gives a review of these model selection criteria
used in multivariate analysis. The advantage of AIC and SBC over $\chi^2$ difference statistics
is that they can be used for non-hierarchical model comparisons. The AIC is given by
$-2\hat{l}(\hat{\psi} \mid y^*) + 2r$ and the SBC by $-2\hat{l}(\hat{\psi} \mid y^*) + r \log\left(\sum_{g=1}^{G} N^{(g)}\right)$, respectively, where r
denotes the number of free parameters.
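The two information criteria and the degrees of freedom of the test based on (14) are simple functions of the maximized log-likelihood. The following sketch uses a hypothetical function name and made-up input values. The degrees-of-freedom line applies the single-group formula given above, which is an assumption when several groups are analysed jointly.

```python
import math


def fit_criteria(loglik, n_free_params, group_sizes, categories):
    """Return (AIC, SBC, degrees of freedom of the chi-square test).

    loglik:        approximated log-likelihood at the ML estimate
    n_free_params: r, the number of free parameters
    group_sizes:   [N^(1), ..., N^(G)]
    categories:    [c(1), ..., c(p)]
    """
    total_n = sum(group_sizes)
    aic = -2.0 * loglik + 2.0 * n_free_params
    sbc = -2.0 * loglik + n_free_params * math.log(total_n)
    df = math.prod(categories) - 1 - n_free_params
    return aic, sbc, df


# Made-up values: four dichotomous outcomes, 200 observations, r = 11.
aic, sbc, df = fit_criteria(-500.0, 11, [200], [2, 2, 2, 2])
```

Smaller AIC or SBC values indicate the preferred model, and the models being compared do not need to be nested.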

3.5.2. Test of misclassification


A researcher might be particularly interested in testing whether a particular
misclassification is present. Standard tests using asymptotic normal or $\chi^2$ approximation
cannot be used to test the null hypothesis in model (1) that there is no misclassification.
One might consider using a likelihood-ratio test approach under non-standard
conditions where the null distribution is a mixture of two $\chi^2$ distributions as described
in Miller (1977). We suggest instead fixing an ignorable or tolerable misclassification
level $\gamma_0 > 0$ and testing
$H_0: \gamma_{kj} \le \gamma_0$
versus
$H_A: \gamma_{kj} > \gamma_0$,
using a standard testing procedure. In practice, one would fix the misclassification level
$\gamma_0$ to 0.025, 0.05, or even 0.1. It should be noted that fixing a very small $\gamma_0$ requires a large
sample size to ensure the asymptotic normality under the null hypothesis.
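The text does not say which standard testing procedure is meant. One plausible reading, assumed here and not confirmed by the source, is a one-sided Wald-type z-test based on the estimate and its standard error from Section 3.3. The function name and the example values are hypothetical.

```python
import math


def misclassification_z_test(gamma_hat, se, gamma0=0.05):
    """One-sided z-test of H0: gamma <= gamma0 against HA: gamma > gamma0.

    Returns the z statistic and the upper-tail p-value under the
    standard normal approximation.
    """
    z = (gamma_hat - gamma0) / se
    # Upper-tail normal probability: 1 - Phi(z) = 0.5 * erfc(z / sqrt(2)).
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return z, p_value


# Made-up example: estimate 0.20 with standard error 0.05, tolerance 0.05.
z, p_value = misclassification_z_test(0.20, 0.05, gamma0=0.05)
```

A small p-value would suggest that the misclassification exceeds the tolerated level.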

4. A simulation study
To examine the adequacy of the proposed method, a simulation experiment was
performed. We consider here a single-group confirmatory factor analysis model with
correlated factors. The outcome variables are misclassified according to a monotone
misclassification pattern as described in Section 2. Specifically, let

$$P\left(y_{ki}^* \le c_j \mid f_i\right) = \gamma_{kj} + (1 - \gamma_{kj})\left[1 + \exp\left\{-\left(\alpha_{kj} + \beta_k' f_i\right)\right\}\right]^{-1}, \qquad (15)$$

where $k = 1, \ldots, p$; $j = 1, \ldots, c(k) - 1$; $i = 1, \ldots, N$ and

$$f_i \sim N\left(\mu_f,\ \mathrm{diag}\left\{\sigma_{f1}^2, \ldots, \sigma_{fq}^2\right\} + \rho\left(1_q 1_q' - I_q\right)\right).$$

In order to generalize the simulation results we consider two experimental conditions:

C1. Four dichotomous outcome variables (p = 4) and two correlated factors (q = 2).
The slope parameter matrix is given by

$$\beta = \begin{pmatrix} 1 & \beta_1 & 0 & 0 \\ 0 & 0 & 1 & \beta_2 \end{pmatrix},$$

where the parameters with value zero or one are fixed and are not estimated.
To carry out the simulation study, the following parameter values were chosen:
$\alpha_k = 0$ (k = 1, 2, 3, 4); $\beta_1 = \beta_2 = 1$; $\mu_f = (1, 1)'$; $\sigma_{f1}^2 = \sigma_{f2}^2 = 1$; $\rho = 0.5$; $\gamma_1 = \gamma_2 = 0.1$
and $\gamma_3 = \gamma_4 = 0.2$.

C2. Twelve polytomous outcome variables (p = 12), each with four categories and
six correlated factors (q = 6). The slope parameter matrix is given by

$$\beta = \begin{pmatrix}
1 & \beta_1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & \beta_2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & \beta_3 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & \beta_4 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & \beta_5 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & \beta_6
\end{pmatrix},$$

where the parameters with value zero or one are fixed and are not estimated. To carry
out the simulation study, the following parameter values were chosen: $\alpha_{kj} = 0$ for
$k = 1, \ldots, 12$ ($j = 1, \ldots, 4$); $\beta_1 = \cdots = \beta_6 = 1$; $\mu_f = (1, 1, 1, 1, 1, 1)'$; $\sigma_{f1}^2 = \cdots = \sigma_{f6}^2 = 1$; $\rho = 0.5$;
$\gamma_{kj} = 0.1$ for $k = 1, \ldots, 6$ and $\gamma_{kj} = 0.2$ for $k = 7, \ldots, 12$ ($j = 1, \ldots, 4$).
Two models were fitted under conditions C1 and C2:
M1. Model (15) where all misclassification parameters are set to be free parameters.
M2. Model (15) where all misclassification parameters are fixed to 0, i.e. the
misclassification is ignored.
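To make the data-generating process concrete, the following sketch draws one data set under condition C1 using the parameter values listed above. It is an illustration written for this purpose, not the authors' R code. The function name is hypothetical, the two response categories are coded 1 (lower category) and 2, and the correlated factors are generated through a hand-written Cholesky step.

```python
import math
import random


def simulate_c1(n, seed=0):
    """Simulate n observed response vectors (four dichotomous items)
    from model (15) under condition C1."""
    rng = random.Random(seed)
    gammas = [0.1, 0.1, 0.2, 0.2]
    # Columns of the slope matrix with beta1 = beta2 = 1: items 1 and 2
    # load on factor 1, items 3 and 4 load on factor 2.
    loadings = [(1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, 1.0)]
    alpha, rho = 0.0, 0.5
    data = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Factors with mean 1, variance 1 and correlation rho.
        f1 = 1.0 + z1
        f2 = 1.0 + rho * z1 + math.sqrt(1.0 - rho ** 2) * z2
        row = []
        for (b1, b2), gamma in zip(loadings, gammas):
            p_true = 1.0 / (1.0 + math.exp(-(alpha + b1 * f1 + b2 * f2)))
            # Probability that the observed response is in the lower category.
            p_obs = gamma + (1.0 - gamma) * p_true
            row.append(1 if rng.random() < p_obs else 2)
        data.append(row)
    return data


sample = simulate_c1(2000, seed=1)
```

Fitting M2 to such data amounts to treating `p_obs` as if it were `p_true`, which is the misspecification the simulation study is designed to examine.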

The Monte Carlo EM algorithm as described in Section 3 was used to perform this
simulation study to compute parameter estimates. The number of replications for each
simulation was 500. Simulations were conducted for sample size N = 200 and
N = 1,000, respectively. The stopping rule described in Section 3.4 was employed to
determine when to terminate the Monte Carlo EM algorithm. Specifically, the tolerance
level $\delta$ in expression (13) was set to 0.001. Starting values were randomly chosen within
a 25% range of the true parameter values. All computations were performed in R version
1.8.1 (Ihaka & Gentleman, 1996), a publicly available statistical analysis environment
(http://www.r-project.org).

4.1. Aspects of performance


In this simulation study, we evaluate two aspects of performance: accuracy of
parameter estimates and goodness of fit.
For each parameter of model M1 under experimental conditions C1 and C2, we
computed the mean for selected parameter estimates across the 500 replications. The
precision of the standard error estimates was evaluated by computing the ratios of the
empirical sampling standard deviation (SD) of the parameter estimates to the mean
estimated standard errors (SE) across the 500 replications. Standard error estimates are
precise if the ratios SD/SE are close to 1.0. Standard errors were estimated as described
in Section 3.3.
The goodness-of-fit test statistics (14) for model M1 and model M2 were computed
under condition C1 (four dichotomous outcomes), and they were compared to the
asymptotic null distributions, the $\chi^2$ distribution with 4 degrees of freedom for model
M1, and the $\chi^2$ distribution with 8 degrees of freedom for model M2. First, we
compared the mean and variance of the 500 goodness-of-fit test statistic values for
models M1 and M2 with the null distribution mean and variance, that is, 4 and 8 for
model M1, and 8 and 16 for model M2. We also compared the empirical distribution
of the goodness-of-fit test statistic values with the asymptotic null distribution. The
Kolmogorov–Smirnov test (Gibbons & Chakraborti, 1992) was used to test whether
the 500 goodness-of-fit statistic values for models M1 and M2 were distributed as $\chi_4^2$
and $\chi_8^2$, respectively. Due to the larger number of outcome variables under
experimental condition C2 (12 polytomous, each with four categories), we compared
the goodness of fit between model M1 and model M2 using AIC and SBC. Specifically,
the differences of AIC values between model M2 and model M1 and the differences of
SBC values between model M2 and M1 were compared.
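The comparison with the asymptotic null distribution can be sketched without a statistics library, because the $\chi^2$ distribution function has a closed form when the degrees of freedom are even, as they are here (4 and 8). The function names are hypothetical, and the p-value uses the standard asymptotic Kolmogorov series, which is an assumption about how the test might be carried out rather than a detail given in the text.

```python
import math


def chi2_cdf_even(x, df):
    """Chi-square CDF for even df: 1 - exp(-x/2) * sum_{i < df/2} (x/2)^i / i!"""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    if x <= 0:
        return 0.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return 1.0 - math.exp(-half) * total


def ks_statistic(values, cdf):
    """Largest distance between the empirical CDF and a reference CDF."""
    ordered = sorted(values)
    n = len(ordered)
    distance = 0.0
    for i, x in enumerate(ordered):
        reference = cdf(x)
        distance = max(distance, (i + 1) / n - reference, reference - i / n)
    return distance


def ks_asymptotic_pvalue(distance, n, terms=100):
    """Asymptotic p-value 2 * sum_k (-1)^(k-1) * exp(-2 k^2 lambda^2)."""
    lam = math.sqrt(n) * distance
    if lam < 0.2:
        return 1.0  # the series is numerically 1 in this range
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                 for k in range(1, terms + 1))
    return max(0.0, min(1.0, 2.0 * series))


# Usage with a list of test statistic values `stats` from model M1:
#   d = ks_statistic(stats, lambda x: chi2_cdf_even(x, 4))
#   p = ks_asymptotic_pvalue(d, len(stats))
```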

4.2. Results
The Monte Carlo EM algorithm used in this simulation appears to be robust. There were
no convergence problems experienced among the 500 replications. All simulations
were performed on a SunBlade 100 with a 500 MHz processor. The Monte Carlo EM
algorithm converged on average after fewer than 45 iterations. Within each iteration, the
M-step required most of the computation time while the evaluation of the E-step required
relatively little time. Specifically, the computation time to reach convergence was on
average 162 seconds under experimental condition C1 and 284 seconds under
experimental condition C2.

4.2.1. Accuracy of parameter estimates


The means and standard deviations of the parameter estimates as well as the means of
the estimated parameter standard errors across the 500 replications for model M1 under
experimental conditions C1 and C2 are summarized in Tables 1 and 2, respectively. When
N = 200, the slope parameters $\beta_k$ are slightly overestimated. The situation improves
when the sample size is increased to 1,000. The ratios SD/SE of all model parameters are
close to 1.0, indicating reliable standard error estimation even for sample sizes of
N = 200.

Table 1. Means and ratios of the empirical sampling standard deviation (SD) of each parameter estimate
to the mean estimated standard errors (SE) across the 500 replications for model M1 under
experimental condition C1 (four dichotomous outcomes with two correlated factors)

                          N = 200               N = 1,000
Parameter (true value)    Mean Est.   SD/SE     Mean Est.   SD/SE

$\gamma_1$ (0.1)          0.091       1.038     0.095       0.846
$\gamma_2$ (0.1)          0.109       1.090     0.098       0.902
$\gamma_3$ (0.2)          0.218       1.383     0.192       1.091
$\gamma_4$ (0.2)          0.211       1.395     0.204       0.891
$\beta_1$ (1.0)           1.176       0.977     1.038       0.966
$\beta_2$ (1.0)           1.166       0.969     1.012       0.922
$\rho$ (0.5)              0.498       0.879     0.498       0.896
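
The SD/SE column can be computed for any parameter from the per-replication estimates and their estimated standard errors. The sketch below is a minimal illustration of that ratio; the function name `sd_se_ratio` and the five input values are hypothetical, and we assume the sample (n − 1) standard deviation, which may differ from the convention used in the simulation study.

```python
import statistics


def sd_se_ratio(estimates, standard_errors):
    """Empirical SD of the estimates divided by the mean estimated SE."""
    if len(estimates) != len(standard_errors) or len(estimates) < 2:
        raise ValueError("need two or more replications with matching lengths")
    empirical_sd = statistics.stdev(estimates)      # sample SD (n - 1 divisor)
    mean_se = statistics.mean(standard_errors)
    return empirical_sd / mean_se


# Hypothetical values for one parameter across five replications (not from the paper).
estimates = [1.0, 1.2, 0.8, 1.1, 0.9]
standard_errors = [0.2, 0.2, 0.2, 0.2, 0.2]

ratio = sd_se_ratio(estimates, standard_errors)
print(f"SD/SE = {ratio:.3f}")
```

A ratio near 1.0 means the estimated standard errors track the actual sampling variability; a ratio well below 1.0, as in this toy input, would indicate that the standard errors are too large on average.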

Table 2. Means and ratios of the empirical sampling standard deviation (SD) of each parameter estimate
to the mean estimated standard errors (SE) across 500 replications for model M1 under experimental
condition C2 (12 polytomous outcomes with six correlated factors)

                          N = 200               N = 1,000
Parameter (true value)    Mean Est.   SD/SE     Mean Est.   SD/SE

$\gamma_{11}$ (0.1)       0.093       0.935     1.022       1.100
$\gamma_{13}$ (0.1)       0.094       0.921     0.097       1.125
$\gamma_{71}$ (0.2)       0.197       1.012     0.203       1.026
$\gamma_{73}$ (0.2)       0.209       1.049     0.206       1.114
$\beta_1$ (1.0)           1.018       0.980     1.021       0.964
$\beta_2$ (1.0)           1.001       0.993     1.008       0.902
$\beta_3$ (1.0)           0.998       0.961     1.009       0.902
$\beta_4$ (1.0)           1.013       0.987     1.071       1.131
$\beta_5$ (1.0)           1.088       0.907     0.997       0.978
$\beta_6$ (1.0)           0.995       1.014     1.006       0.968
$\rho$ (0.5)              0.5005      1.207     0.490       0.850

4.2.2. Goodness of fit

The results on the goodness-of-fit $\chi^2$ test under experimental condition C1 are
summarized in Table 3. When fitting model M1, the empirical distribution of the 500
goodness-of-fit test statistic values reasonably resembles a $\chi^2$ distribution with 4
degrees of freedom. For N = 200, the Type I error appears to be larger than the desired
levels of 5% and 10%, so that correct models tend to be wrongly rejected. This is greatly
improved when the sample size is increased to N = 1,000. Since the data were generated
according to the misclassification model, the empirical distribution of the goodness-of-fit
test statistic values of model M2 does not resemble a $\chi^2$ distribution with 8 degrees of
freedom. Specifically, the test statistic values are much larger than expected under the null
distribution. The sample means and variances are not close to the expected values of 8 and 16,
and the p-values of the Kolmogorov–Smirnov tests are very small. The discrepancy between
the empirical and the null distribution becomes larger as the sample size increases.

Table 3. Goodness of fit under experimental condition C1: means and variances of goodness-of-fit test
statistic values across 500 replications; percentage frequencies for which the test statistic values are
larger than the 95th and 90th percentiles of the $\chi^2_4$ and $\chi^2_8$ distributions; Kolmogorov–Smirnov (KS) test
p-value

                          N = 200               N = 1,000
                          M1        M2          M1        M2

Mean                      7.21      12.81       4.83      38.21
Variance                  16.83     26.10       9.12      98.22
% > 95th percentile       17.02     25.81       7.02      43.29
% > 90th percentile       23.51     32.33       12.32     56.04
KS test p-value           < 0.01    < 0.01      0.17      < 0.01
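
The summary rows of Table 3 (mean, variance and exceedance percentages) can be obtained for any set of simulated test statistic values along the following lines. This is a minimal sketch: the names `chi2_sf_even` and `summarize_fit_statistics` are our own, the closed-form tail probability is valid only for even degrees of freedom, and the five input values are hypothetical. A value exceeds the 95th percentile of the reference distribution exactly when its upper-tail probability is below 0.05.

```python
import math
import statistics


def chi2_sf_even(x, df):
    """Upper-tail probability P(X > x) for chi-square with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def summarize_fit_statistics(values, df):
    """Compare simulated test statistics with a chi-square(df) null distribution."""
    n = len(values)
    tails = [chi2_sf_even(v, df) for v in values]
    return {
        "mean": statistics.mean(values),
        "variance": statistics.variance(values),
        "expected_mean": df,              # mean of chi-square(df)
        "expected_variance": 2 * df,      # variance of chi-square(df)
        "pct_above_95th": 100.0 * sum(p < 0.05 for p in tails) / n,
        "pct_above_90th": 100.0 * sum(p < 0.10 for p in tails) / n,
    }


# Hypothetical statistic values (not from the paper), judged against chi-square(4).
summary = summarize_fit_statistics([1.2, 3.4, 5.0, 8.1, 10.2], df=4)
for key, value in summary.items():
    print(key, value)
```

Under a correctly specified model the two percentages should be close to 5% and 10%, and the mean and variance close to the degrees of freedom and twice the degrees of freedom, respectively.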
For experimental condition C2, AIC and SBC values were used to assess goodness of
fit. Table 4 shows the means and ranges for the differences of AIC and SBC values
between model M2 and model M1. All mean differences are positive, indicating that AIC
and SBC values of the ‘correct’ model (M1) tend to be smaller than the corresponding
values of the model ignoring the misclassification (M2). Since the number of parameters
in model M1 is considerably larger than in model M2, the difference in SBC values is
smaller than in AIC values. Moreover, in 18 out of the 500 replications for N = 200 and
in 7 out of the 500 replications for N = 1,000, the SBC values of the ‘incorrect’ model
M2 were smaller than the corresponding SBC values of the ‘correct’ model (M1).

Table 4. Goodness of fit under experimental condition C2: means and ranges for the differences of
AIC and SBC values between model M2 and model M1

                   N = 200                     N = 1,000

Difference AIC     186.03 (68.56, 270.82)      314.29 (170.10, 438.34)
Difference SBC     52.67 (-64.78, 137.46)      137.61 (-6.58, 241.06)
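
The relationship between the two criteria can be made concrete with a small sketch. We assume the usual definitions AIC = −2 log L + 2p and SBC = −2 log L + p log N; the log-likelihoods and parameter counts below are hypothetical and are not taken from the simulation study.

```python
import math


def aic(loglik, n_params):
    """Akaike information criterion (smaller is better)."""
    return -2.0 * loglik + 2.0 * n_params


def sbc(loglik, n_params, n_obs):
    """Schwarz Bayesian criterion (smaller is better)."""
    return -2.0 * loglik + n_params * math.log(n_obs)


# Hypothetical fits: M1 models misclassification (more parameters), M2 ignores it.
ll_m1, p_m1 = -1500.0, 60
ll_m2, p_m2 = -1620.0, 24
n = 200

diff_aic = aic(ll_m2, p_m2) - aic(ll_m1, p_m1)
diff_sbc = sbc(ll_m2, p_m2, n) - sbc(ll_m1, p_m1, n)
print("Difference AIC (M2 - M1):", round(diff_aic, 2))
print("Difference SBC (M2 - M1):", round(diff_sbc, 2))
```

Under these definitions the two differences are linked by (AIC difference) − (SBC difference) = (p2 − p1)(2 − log N). Because log N exceeds 2 once N is at least 8, each extra parameter in M1 is penalized more heavily by SBC, which is why the SBC difference is the smaller of the two.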

5. Application from a substance abuse prevention study


In this section, we consider an application from a substance abuse prevention study
where the primary outcome variables are in polytomous form and where the subject
matter suggests that the responses are associated with misclassification errors. The
example is from the Capable Family and Youth (CAFAY) project (Goldberg, Spoth, Meek,
& Molgaard, 2001), which was conducted at the Institute for Social and Behavioral
Research at Iowa State University. The CAFAY project is a study of family- and school-
based programmes designed to prevent teen substance abuse and other problem
behaviours through youth and family skill training. Participants in the study were
families of seventh-graders enrolled in 36 schools in northeast Iowa. Participants were
randomly assigned to one of three experimental conditions. One group of students
received the Life-Skills Training Program (LSTP) at school through a programme
developed by Cornell University Medical Center, and their families participated in the
Iowa Strengthening Families Program (ISFP), adapted by the Institute for Social and
Behavioral Research. This group will be referred to as the LSTP-ISFP group. The second
group consists of families whose seventh-graders received only the school-based training
(LSTP group). The third group consists of families where the parents received reading
material on youth development (control group). There were 216 families assigned to the
LSTP-ISFP group, 183 to the LSTP group, and 202 to the control group. Data were based
on in-home and in-school interviews one year after the intervention programme had
been completed. From the research questions addressed in this study, we focus on how
the three experimental conditions affect the substance abuse behaviour of the target
children of the participating families. Based on the knowledge of the underlying study, a
confirmatory factor analysis model, which involves two non-overlapping factors, is used
to analyse the data. The first factor is termed target substance behaviour (TSB). It
represents the substance abuse behaviour of the target child in each family and consists
of three indicators. The target children were asked about their activities concerning
tobacco, alcohol and other substances during the last 12 months. They were asked to
answer ‘yes’ or ‘no’ to the following three questions: (i) ‘Have you smoked any
cigarettes?’ (ii) ‘Have you been drunk from drinking beer, wine, wine coolers, or liquor?’
and (iii) ‘Have you used marijuana?’. The second factor is termed parents’ substance
behaviour (PSB) and represents the parents’ behaviour concerning tobacco and alcohol
use when their child is present. This factor consists of three indicators. The parents of
the target child were asked to answer ‘yes’ or ‘no’ to the following three questions about
tobacco and alcohol activities during the last 12 months: (i) ‘Has your child brought
cigarettes to a family member or lit a cigarette for you or another family member?’ (ii)
‘Has your child brought, opened, or poured a drink containing alcohol for you or
another family member?’ (iii) ‘Has your child seen you drink alcohol?’. Note that for all
six indicators it is realistic to assume a monotone misclassification pattern as described
in Section 2. For instance, consider the question ‘Have you used marijuana?’. If a target
child did not smoke marijuana during the last 12 months it is reasonable to assume a ‘no’
response to that question. However, if the child did indeed smoke marijuana during the
last 12 months, he or she might respond ‘yes’ (correct classification) or ‘no’
(misclassification) to that particular question.
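
The monotone misclassification pattern described above can be written as a short calculation. In the sketch below, a true ‘no’ is always reported as ‘no’, while a true ‘yes’ is reported as ‘no’ with probability gamma. We assume a logistic form for the probability of a true ‘no’ given the factor value; the function names and all numerical values are hypothetical.

```python
import math


def prob_true_no(eta, alpha, beta):
    """Probability of a true 'no' given factor value eta (logistic form assumed)."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * eta)))


def prob_observed_no(eta, alpha, beta, gamma):
    """Probability of an observed 'no' under monotone misclassification.

    A true 'no' is never misreported; a true 'yes' is reported as 'no'
    with probability gamma.
    """
    p_no = prob_true_no(eta, alpha, beta)
    return p_no + gamma * (1.0 - p_no)     # equals gamma + (1 - gamma) * p_no


# Hypothetical parameter values for illustration only.
for eta in (-50.0, 0.0, 2.0):
    print(eta, prob_observed_no(eta, alpha=0.0, beta=1.0, gamma=0.1))
```

One consequence is that the probability of an observed ‘no’ can never fall below gamma, however unlikely a true ‘no’ becomes; this floor is what allows the misclassification parameter to be separated from the measurement parameters.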
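Let $y_1^{(g)}$, $y_2^{(g)}$ and $y_3^{(g)}$ represent the three indicators for TSB and $y_4^{(g)}$, $y_5^{(g)}$ and
$y_6^{(g)}$ the three indicators for PSB for experimental condition $g$ ($g$ = LSTP-ISFP, LSTP,
control). As described in Section 2, we can write a latent variable model with
misclassified outcome variables and two non-overlapping factors as

$$
\begin{aligned}
P\bigl(y_{ki}^{(g)} = \text{‘no’} \mid \mathrm{TSB}_i^{(g)}\bigr) &= \gamma_k^{(g)} + \frac{1 - \gamma_k^{(g)}}{1 + \exp\{-(\alpha_k^{(g)} + \beta_k^{(g)}\,\mathrm{TSB}_i^{(g)})\}}, \qquad k = 1, 2, 3, \\
P\bigl(y_{ki}^{(g)} = \text{‘no’} \mid \mathrm{PSB}_i^{(g)}\bigr) &= \gamma_k^{(g)} + \frac{1 - \gamma_k^{(g)}}{1 + \exp\{-(\alpha_k^{(g)} + \beta_k^{(g)}\,\mathrm{PSB}_i^{(g)})\}}, \qquad k = 4, 5, 6,
\end{aligned}
\tag{16}
$$

with

$$
\begin{pmatrix} \mathrm{TSB}_i \\ \mathrm{PSB}_i \end{pmatrix}^{(g)}
\sim N\left(
\begin{pmatrix} \mu_{\mathrm{TSB}}^{(g)} \\ \mu_{\mathrm{PSB}}^{(g)} \end{pmatrix},
\begin{pmatrix} \sigma_{\mathrm{TSB}}^{2(g)} & \sigma_{\mathrm{TSB,PSB}}^{(g)} \\ \sigma_{\mathrm{TSB,PSB}}^{(g)} & \sigma_{\mathrm{PSB}}^{2(g)} \end{pmatrix}
\right),
$$

for $g = 1, 2, 3$; $k = 1, \ldots, 6$ and $i = 1, \ldots, N^{(g)}$. To identify this model we fix
$\alpha_1^{(g)} = \alpha_4^{(g)} = 0$ and $\beta_1^{(g)} = \beta_4^{(g)} = 1$. This form allows straightforward interpretation of the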
parameters corresponding to the factor variables which is particularly useful for multi-
group analysis, e.g. for comparing the intervention effects on the latent constructs TSB
and PSB across the three experimental groups. The Monte Carlo EM algorithm as
described in Section 3 was used to compute the maximum likelihood estimates. The
stopping rule described in Section 3.4, with $\delta = 0.001$, was employed to determine
when to terminate the Monte Carlo EM algorithm. Convergence was reached rapidly
after 89 iterations. We first tested for measurement and misclassification error invariance
by comparing the general model (16) with the more restrictive model which assumes
invariance of measurement and misclassification parameters across the three sampling
groups. The p-value of the likelihood-ratio test was 0.23 so that for further analysis the
restricted model was used, i.e. we set
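$$
\begin{aligned}
\gamma_k^{(1)} &= \gamma_k^{(2)} = \gamma_k^{(3)} = \gamma_k, \\
\alpha_k^{(1)} &= \alpha_k^{(2)} = \alpha_k^{(3)} = \alpha_k, \\
\beta_k^{(1)} &= \beta_k^{(2)} = \beta_k^{(3)} = \beta_k,
\end{aligned}
$$

for $k = 1, \ldots, 6$. A tolerable misclassification level of $\gamma_0 = 0.05$ was determined for this
study by the CAFAY project scientists. The hypothesis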

$$
H_0: \gamma_k \le 0.05, \qquad H_a: \gamma_k > 0.05
$$

was tested for each $k = 1, \ldots, 6$. The p-values for each test are displayed in Table 5. The
results indicate that there is significant misclassification for the target child questions
(i) and (iii), and for parents’ question (i), when testing at a significance level of 10%.
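
The likelihood-ratio comparison of the general model (16) with its invariance-restricted version, used earlier in this section, can be sketched as follows. The log-likelihood values are hypothetical. The degrees of freedom are our own count and are not stated in the text: with the identification constraints, each group has 6 misclassification, 4 intercept and 4 slope parameters, so equating them across three groups removes 2 × 14 = 28 free parameters. The helper `chi2_sf_even` is again limited to even degrees of freedom.

```python
import math


def chi2_sf_even(x, df):
    """Upper-tail probability P(X > x) for chi-square with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(loglik_general, loglik_restricted, df):
    """Return the LR statistic and its chi-square(df) p-value."""
    stat = 2.0 * (loglik_general - loglik_restricted)
    return stat, chi2_sf_even(stat, df)


# Our own parameter count per group: gamma_1..6, alpha_{2,3,5,6}, beta_{2,3,5,6}.
free_per_group = 6 + 4 + 4
n_groups = 3
df = (n_groups - 1) * free_per_group       # 28 equality restrictions

# Hypothetical maximized log-likelihoods (not from the paper).
stat, p_value = likelihood_ratio_test(-1480.0, -1496.5, df)
print("LR statistic:", stat, "df:", df, "p-value:", round(p_value, 3))
```

A large p-value, as in the application, gives no evidence against invariance, so the restricted model is retained for the subsequent analysis.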

Table 5. ML estimates for misclassification and measurement parameters; p-values for testing
$H_0: \gamma_k \le 0.05$ against $H_a: \gamma_k > 0.05$

Misclassification parameter         Measurement parameter

Parameter    MLE      p-value       Parameter    MLE      Parameter    MLE

$\gamma_1$   0.182    0.030         $\alpha_2$   0.562    $\beta_2$    -0.643
$\gamma_2$   0.081    0.219         $\alpha_3$   2.350    $\beta_3$    -0.662
$\gamma_3$   0.111    0.064         $\alpha_5$   0.125    $\beta_5$    -0.547
$\gamma_4$   0.143    0.017         $\alpha_6$   0.138    $\beta_6$    -0.283
$\gamma_5$   0.070    0.309
$\gamma_6$   0.088    0.224

The maximum likelihood estimates for the measurement parameters, $\alpha_k$ and $\beta_k$, and
misclassification parameters $\gamma_k$ of the fitted model are also shown in Table 5. Note that
the estimates for the misclassification parameters have a meaningful interpretation:
$\hat{\gamma}_1 = 0.182$ indicates that a target child who smoked cigarettes during the last 12 months
will answer ‘no’ with probability 0.182 to the question ‘Have you smoked any
cigarettes?’ The maximum likelihood estimates for the parameters corresponding to the
latent constructs TSB and PSB are summarized in Table 6. A lower factor mean is
associated with a higher probability of answering ‘no’ to any of the six questions. The
researchers were particularly interested in comparing the factor means of the two latent
constructs TSB and PSB between the three experimental conditions. A test of
equivalence between the factor means of the three experimental conditions provided a
p-value of less than 0.01 for TSB and of 0.02 for PSB. Comparing the factor means of both
latent constructs from the LSTP-ISFP and LSTP groups with the control group factor
means provided a p-value of less than 0.01 and of 0.03. This indicates that there is a
significant intervention effect for the LSTP as well as for the combined LSTP-ISFP on the
latent constructs TSB and PSB after accounting for misclassification.

Table 6. Maximum likelihood estimates for latent variable parameters

                              LSTP-ISFP    LSTP      Control

$\mu_{\mathrm{TSB}}$          -1.391       -1.282    -0.940
$\mu_{\mathrm{PSB}}$          -2.655       -2.461    -2.436
$\sigma^2_{\mathrm{TSB}}$     1.102        0.688     0.905
$\sigma_{\mathrm{TSB,PSB}}$   0.116        0.161     0.123
$\sigma^2_{\mathrm{PSB}}$     0.332        0.269     0.273

6. Discussion
In this paper, we have introduced a new approach to analysing latent variable models
with misclassified polytomous outcome variables. As we have noted, modelling
misclassification patterns imposes identification difficulties. In order to guarantee model
identifiability, we have assumed a monotone misclassification pattern in our approach,
that is, misclassification may occur only in one direction. The assumption of monotone
misclassification is reasonable in many social and behavioural science studies where
data are based on interview questions. The parametrization proposed in our approach is
flexible and provides meaningful parameter interpretation. Modifications of the
proposed monotone misclassification pattern could be considered. For example, one
modification could be to assume an ordered structure of misclassification parameters,
which would provide an identifiable model. However, the parameter estimation in such
models could become more difficult due to the constraints in the parameter space.
The maximum likelihood estimation in our approach is performed using an EM
algorithm where the E-step is computed via simple Monte Carlo integration. The M-step
is computed using an iterative procedure. We have also demonstrated how to assess the
convergence of the Monte Carlo EM algorithm. Furthermore, we have proposed a
simple goodness-of-fit test statistic to evaluate the appropriateness of the model and a
misclassification test to determine whether there is misclassification for some or all
outcome variables.
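
To give a feel for the simple Monte Carlo integration mentioned above, the sketch below approximates the marginal probability of one subject's response pattern by averaging the conditional probability over draws from the factor distribution. It is not the algorithm of Section 3: for brevity it uses a single normally distributed factor, the response function has the same form as model (16), and the function names, parameter values and number of draws are all hypothetical.

```python
import math
import random


def prob_no(eta, alpha, beta, gamma):
    """P(observed 'no' | eta), same functional form as model (16)."""
    return gamma + (1.0 - gamma) / (1.0 + math.exp(-(alpha + beta * eta)))


def pattern_likelihood(responses, eta, params):
    """Conditional probability of a 'no'/'yes' response pattern given eta."""
    lik = 1.0
    for answer, (alpha, beta, gamma) in zip(responses, params):
        p = prob_no(eta, alpha, beta, gamma)
        lik *= p if answer == "no" else 1.0 - p
    return lik


def marginal_likelihood_mc(responses, params, mu, sigma, n_draws=2000, seed=0):
    """Average the conditional likelihood over draws eta ~ N(mu, sigma^2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        eta = rng.gauss(mu, sigma)
        total += pattern_likelihood(responses, eta, params)
    return total / n_draws


# Hypothetical (alpha, beta, gamma) triples for three indicators of one factor.
params = [(0.0, 1.0, 0.10), (0.5, 0.8, 0.05), (2.0, 0.7, 0.10)]
approx = marginal_likelihood_mc(["no", "yes", "no"], params, mu=-1.0, sigma=1.0)
print("approximate marginal probability:", round(approx, 4))
```

The same averaging over factor draws would be applied to the quantities needed in the E-step; increasing the number of draws reduces the Monte Carlo error at the cost of computation time.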
Interest in latent variable models with polytomous outcome variables has recently
increased in the social and behavioural sciences. We conclude that the incorporation of
misclassification error in the latent variable framework is a useful tool with many
practical applications in psychosocial studies.

References
Akaike, H. (1987). Factor analysis and AIC. Psychometrika, 52, 317–332.
Bentler, P. M. (1995). EQS: Structural equation program manual. Los Angeles: BMDP Statistical
Software.
Bock, R. D., & Lieberman, M. (1970). Fitting a response model for n dichotomously scored items.
Psychometrika, 35, 179–197.
Bollinger, C. R., & David, M. H. (1997). Modeling discrete choice with response error: Food stamp
participation. Journal of the American Statistical Association, 92, 827–835.
Chan, J. S., & Kuk, A. Y. (1997). Estimation for probit-linear mixed models with correlated random
effects. Biometrics, 53, 86–97.
Cheng, K. F., & Hsueh, H. M. (1999). Correcting bias due to misclassification in the estimation of
logistic regression models. Statistics and Probability Letters, 44, 229–240.
Cheng, K. F., & Hsueh, H. M. (2003). Estimation of a logistic regression model with mis-measured
observations. Statistica Sinica, 13, 111–127.
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via
the EM algorithm. Journal of the Royal Statistical Society, Series B, 39, 1–38.
Gibbons, J. D., & Chakraborti, S. (1992). Nonparametric statistical inference (3rd ed.). New York:
Marcel Dekker.
Goldberg, C. J., Spoth, R. L., Meek, J., & Molgaard, V. (2001). The capable families and youth
project: Extension-University-Community Partnerships. Journal of Extension, 39(3).
http://www.joe.org/
Goldstein, H. (1991). Nonlinear multilevel models with an application to discrete response data.
Biometrika, 78, 45–51.
Ihaka, R., & Gentleman, R. (1996). R: A language for data analysis and graphics. Journal of
Computational and Graphical Statistics, 5, 299–314.
Jöreskog, K., & Moustaki, I. (2001). Factor analysis of ordinal variables: A comparison of three
approaches. Multivariate Behavioral Research, 36, 347–387.
Jöreskog, K., & Sörbom, D. (1996). LISREL 8: User’s reference guide. Chicago: Scientific Software
International.
Lee, S.-Y., & Poon, W. Y. (1987). Two-step estimation of multivariate polychoric correlations.
Communications in Statistics: Theory and Methods, 16, 307–320.
Lee, S.-Y., Poon, W. Y., & Bentler, P. M. (1995). A two-stage estimation of structural equation models
with continuous and polytomous variables. British Journal of Mathematical and Statistical
Psychology, 48, 339–358.
Lee, S.-Y., & Shi, J. Q. (2001). Maximum likelihood estimation of two-level latent variable models
with mixed continuous and polytomous data. Biometrics, 57, 787–794.
Lesaffre, E., & Spiessens, B. (2001). On the effect of the number of quadrature points in a logistic
random-effects model: An example. Applied Statistics, 50, 325–335.
Liu, Q., & Pierce, D. A. (1994). A note on Gauss–Hermite quadrature. Biometrika, 81, 624–629.
McCulloch, C. E. (1997). Maximum likelihood algorithms for generalized linear mixed models.
Journal of the American Statistical Association, 92, 162–170.
Meng, X. L., & Schilling, S. (1996). Fitting full-information item factor models and empirical
investigation of bridge sampling. Journal of the American Statistical Association, 91,
1254–1267.
Miller, J. J. (1977). Asymptotic properties of maximum likelihood estimation in the mixed model of
the analysis of variance. Annals of Statistics, 5, 746–762.
Moustaki, I. (1996). A latent trait and a latent class model for mixed observed variables. British
Journal of Mathematical and Statistical Psychology, 49, 313–334.
Moustaki, I. (2000). A latent variable model for ordinal variables. Applied Psychological
Measurement, 24, 211–223.
Muthén, B. (1984). A general structural equation model with dichotomous, ordered categorical
and continuous latent variable indicators. Psychometrika, 49, 115–132.
Muthén, B. (1987). LISCOMP: Analysis of linear statistical equations using a comprehensive
measurement model. Mooresville, IN: Scientific Software, Inc.
Muthén, B., & Muthén, L. (1998). M-plus user’s guide. Los Angeles: Muthén & Muthén.
Rabe-Hesketh, S., Skrondal, A., & Pickles, A. (2002). Reliable estimation of generalized linear
mixed models using adaptive quadrature. Stata Journal, 2, 1–21.
Sammel, M. D., Ryan, L. M., & Legler, J. M. (1997). Latent variable models for mixed discrete and
continuous outcomes. Journal of the Royal Statistical Society, Series B, 59, 667–678.
Schwarz, G. (1978). Estimating the dimensions of a model. Annals of Statistics, 6, 461–464.
Sclove, S. (1987). Application of model-selection criteria to some problems in multivariate
analysis. Psychometrika, 52, 333–343.
Shi, J. Q., & Lee, S.-Y. (1998). Bayesian sampling-based approach for factor analysis models with
continuous and polytomous data. British Journal of Mathematical and Statistical
Psychology, 51, 233–252.

Received 17 July 2003; revised version received 3 December 2004
