Pereira et al

Comparison of one- and two-dimensional item analysis models of the Mini-Mental State
Examination in elderly healthy and clinical Brazilian samples
Running title: Comparison of IRT models of the MMSE
Danilo Assis Pereira 1,2
Antônio Gimenez Giglio 3
Sabri Lakhdari 3
Carlos Tomaz 4
_______________________
Danilo Pereira: Substantial contribution to conception and design, analysis and interpretation
of data;
Antônio Giglio: Substantial contribution to acquisition of data, revising it critically for
important intellectual content;
Sabri Lakhdari: Substantial contribution to acquisition of data, revising it critically for
important intellectual content;
Carlos Tomaz: Drafting the article and revising it critically for important intellectual content;
Final approval of the version to be published.
________________________
1 IBNeuro – Brazilian Institute of Neuropsychology and Cognitive Sciences, Brasilia DF,
Brazil.
2 PhD Candidate, Postgraduate Program in Health Sciences, Faculty of Health Sciences,
University of Brasilia (UnB), Brasilia DF, Brazil.
3 Geriatric and Gerontology Ambulatory of Regional Hospital of Asa Norte (HRAN), Brasilia
DF, Brazil.
4 PhD, Full Professor, Laboratory of Neuroscience and Behavior, Institute of Biology, UnB,
Brasilia DF, Brazil.
Correspondence: Danilo Assis Pereira, Instituto Brasileiro de Neuropsicologia e Ciências
Cognitivas, SHCR CS 504, Bloco C, Entr.37, 1º andar, 70331-535, Brasília-DF, Brasil.
Email: danilo@ibneuro.org
The authors declare no conflict of interest.

ABSTRACT
Introduction: The MMSE is known to be subject to both ceiling and floor effects, depending on
the tested sample. The aim of this study was to compare one-dimensional and
multidimensional IRT (MIRT) models in a cognitively impaired and a normal group. Methods:
Secondary MMSE data from 329 elderly individuals in a Brazilian sample were used. A cognitively
impaired group (N=109; Clinical Dementia Rating, CDR 1, 2 and 3) was compared with a normal
group (N=220; CDR 0 and 0.5). Classical test theory (CTT), IRT and MIRT were used in both
groups. Results: With MIRT analysis, the MMSE achieved a better goodness-of-fit index than with
the Rasch and other one-dimensional IRT models. Writing had the highest factor loading on the
first factor for the impaired group (0.91). Calculation and Recall tasks were very discriminative
in the first latent trait, but not in the second, for the impaired and non-impaired groups.
Calculation was very discriminative in the first dimension for the impaired group (a=2.48), but
not for the non-impaired group (a=0.99). Differential item functioning (DIF) was found in almost
all MMSE items when the cognitively impaired and non-impaired groups were compared. The MMSE had
unacceptable reliability in the non-impaired group but acceptable reliability in the cognitively
impaired group. Some MMSE items showed a ceiling effect (Registration and Naming) even in dementia
patients. Discussion: The results indicate that the MMSE is a multidimensional screening
instrument and that its items discriminate differently in cognitively impaired and normal elderly
subjects. The most discriminative items for the Memory factor were Orientation to time,
Orientation to place and Recall. Calculation, Repetition, Comprehension, Reading, Writing
and Drawing tasks were more discriminative for the Executive function factor.
Key words: MMSE, cognitive impairment, dementia, Alzheimer’s disease, item response
theory.
The Mini-Mental State Examination (MMSE) (1) is the most widely used screening
instrument for cognitive dysfunction. Originally conceived as a brief assessment of five
cognitive domains (Orientation, Attention, Registration, Recall and Language), the MMSE is
often subjected to various investigations of its factor structure and predictive value, showing
considerable sensitivity and specificity for dementia screening (2, 3). In addition to the modified
written (4-6) and telephone (7) versions, the second edition of the MMSE (8) was recently
published in the following three main versions: brief (16 points), standard (30 points) and
extended (90 points). The total score of the MMSE has been used continuously in various
studies (9-11).
Although psychometric analyses of the MMSE have been conducted (12-14), few
studies have performed an item-by-item evaluation (9). For an item analysis, a measure must
be unidimensional (i.e., measuring only one latent trait or ability) and locally independent (local
dependence occurs when items are strongly related and thus dependent on each other) (15). To
examine the one-dimensionality of a test, psychometricians often use the scree test and factor
analysis (15). Factor analysis is essential in psychometric studies because it identifies the
items that best represent a latent trait and shows the covariance between each item and the
factor.
The first factor analysis of the MMSE was performed in 1987 by Fillenbaum and
colleagues (16), who used data from 36 subjects with a probable diagnosis of Alzheimer’s
disease (AD). The items were divided into seven categories and a two-factor solution using
oblique rotation was applied, explaining 66% of the variance. Concentration, Language and
Praxis loaded onto the first factor, and Recall, Temporal Orientation and Spatial Orientation
loaded onto the second factor. A second study used 60 subjects from a
longitudinal study (17). Factor analysis was performed using 11 MMSE items. The MMSE
scores were submitted to a principal components analysis, which revealed that the five-factor
solution accounted for 75% of the variance. Serial subtraction, Recall, Orientation to time,
Orientation to place and Design copying loaded onto Factor 1. The second factor comprised
Naming and Obeying commands. An MMSE factor analysis study that examined psychiatric
patient data found the following nine polytomous items in the first component: Orientation to
time and to place, Registration, Saying “world” backwards, Calculation, Recall, Repetition,
Writing and Drawing (18).
Examining more than 8,000 subjects over 50 years of age using 31 dichotomous items,
researchers performed a factor analysis of tetrachoric correlations and found five factors, as
follows: Concentration (Calculation and the word “world”), Language (naming and three commands)
and Praxis, Orientation (spatial and temporal items), Memory (recall) and Attention
(registration items). The easiest item was Naming (clock, pencil) and the most difficult item
was Calculation (serial subtraction of 7 from 100). The first latent trait explained 41% of the
variance. Although the scree method found five factors, the authors suggested that the MMSE
was a one-dimensional instrument for the elderly because all items had positive loadings on the
first factor and the first eigenvalue was remarkably large compared with the second
(19).
Schultz-Larsen and colleagues (20) used item response theory (IRT) and the Rasch
model to assess the dimensionality and differential item functioning (DIF) of the MMSE (21).
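In IRT, there are two types of DIF: uniform and non-uniform. Uniform DIF indicates that the
item favors one group consistently across the whole continuum of cognitive function, whereas
non-uniform DIF indicates that the item favors a particular group only at certain points along
that continuum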
(22). The researchers proposed an MMSE reduction (23), evaluating the sensitivity,
specificity and predictive values. This shortened version of the MMSE contains the following
nine items: Orientation to place (state, county, city, building and floor), Registration, Recall,
Reading and Drawing. Although they used the Rasch model, the two-parameter logistic model
(2PL) is preferable because it includes a measure of item discrimination (24). Moreover,
studies have detected the existence of DIF in particular MMSE items, which can be
considered to reflect biased items (9, 20, 23, 25, 26).
Item response theory (IRT) has several arguments favoring its use in psychometrics: a)
the standard error of measurement differs between the scores (or response patterns) but is
generalizable between populations (27); b) a shorter test may be more reliable than the
extended versions (28); c) comparing the test scores between multiple forms is appropriate
when test difficulty levels vary between individuals; d) nonbiased estimates of item properties
can be obtained from unrepresentative samples (29); e) test scores acquire meaning when they
are compared with the distance of the items (30); f) interval-scale properties are
obtained by applying reasonable measurement models; g) items with mixed formats can
produce results relevant to the test score; h) changes in scores can be meaningfully compared
when initial score levels differ (31); i) factor analysis of all items provides more complete
information than the factor analysis of the test; and j) characteristics of the item can be
directly related to the psychometric properties (30).
Unlike IRT, the classical test theory (CTT) is based on the true score model (in which
the person’s observed score is typically the unweighted sum of responses to the items of the
instruments). In assessing ability, this sum reflects the number of correct answers. CTT
assumes that instrument errors are not correlated with scores of the latent trait in a different
tool. The assumption that an instrument’s errors have no correlation with errors in a different
instrument is considered “weak” because the instruments depend on the type of response that
the individual provides (22). The IRT assumption is considered “strong” because the
estimation of the location is invariant with respect to the instrument; thus, the accuracy of this
estimate is known at the group and individual levels and the evaluated parameter estimation
extends beyond the sample used for its estimation (22). In IRT, items and persons are located
on the same continuum. Although items may differ in their locations, the ability of an item to
differentiate individuals remains constant. In general, each item provides information for
estimating the location of the person. The total information of the instrument is the sum of the
information provided by its items (the concept of total information can be used for specific instruments
with psychometric properties) (22).
Taking into account the above-mentioned aspects, the purpose of the current study was
to obtain the MMSE item parameters and determine which cognitive domains (in one or two
dimensions) are more discriminative for different latent traits.
METHODS
This study used secondary medical record data from three sources that included the
MMSE. The first two sources were the medical records of patients seen at two local hospitals
in Brasilia, Brazil and the third source was a previous study (32) of active seniors in the
community. At the hospital, patients were evaluated by two geriatricians and one
neuropsychologist. Diagnosis was based on a clinical and neuropsychological battery
following the recommendations of the National Institute of Neurological and
Communicative Disorders and Stroke and the Alzheimer’s Disease and Related Disorders Association
(NINCDS-ADRDA). Clinical evaluation was performed between 2007 and 2009 and the
research was approved by the Ethics and Human Research Committee of the Public Health
Department of the Federal District.
The study used data from 329 elderly individuals (100 men) who were grouped
according to their clinical condition: a community group (67 men) without cognitive
impairment and a clinical group (33 men) with cognitive impairment. There was a significant
difference between age and the MMSE raw scores (p<0.01); older individuals and those
with a lower education level showed greater cognitive impairment. Table 1 summarizes the
characteristics of the groups.

TABLE 1
Statistical analysis
First, residuals of the tetrachoric correlation were analyzed on 30 dichotomous items
using NOHARM software (Normal Ogive Harmonic Analysis Robust Method, version 4.0)
(33). This program takes advantage of the relationship between nonlinear factor analysis and
the normal ogive model to fit one-dimensional and multidimensional normal ogive models.
The residual matrix, threshold values, unique variances, factor loadings and item parameters
revealed that some MMSE items could not be properly calibrated because of their very low
variability (more than 95% correct responses on Registration and Naming). Orientation to
time, Orientation to place and Recall had higher loadings on the first factor. Drawing,
Reading, Writing, Comprehension and Calculation exhibited higher loadings on the second
factor. The Tanaka index of goodness-of-fit was 0.988 for the two-dimensional solution. The
sum of squares of residuals (lower off-diagonals) was 0.02 and root mean square of residuals
(lower off-diagonals) was 0.007, which indicates small errors. The final discrimination
constants ranged from -0.37 to 3.37. The threshold values ranged from -0.24 to 2.25.
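As a consistency check on the two residual summaries, the following sketch assumes that the root mean square is taken over the lower off-diagonal elements of the 30 × 30 residual matrix (435 elements). This is our reading of the NOHARM output rather than a documented formula, and the sum of squares is used as reported (rounded to 0.02).

```python
import math

N_ITEMS = 30                                # dichotomous items analyzed
N_PAIRS = N_ITEMS * (N_ITEMS - 1) // 2      # lower off-diagonal elements
SUM_SQ_RESIDUALS = 0.02                     # reported (rounded) sum of squares

def rmsr(sum_sq, n_pairs):
    """Root mean square residual over the off-diagonal elements (assumed definition)."""
    return math.sqrt(sum_sq / n_pairs)

if __name__ == "__main__":
    print(N_PAIRS, rmsr(SUM_SQ_RESIDUALS, N_PAIRS))
```

Under this assumption the value rounds to the reported 0.007.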
Second, the 30-point MMSE was divided into 22 unrelated items (according to the local
independence assumption). Nineteen items were treated as dichotomous (0 or 1), as follows:
Orientation to time (year, month, date, day-week, hour), Orientation to place (state, city, town,
building, floor), Recall (table, watch, pen), Naming (key, pencil), Repetition, Reading,
Writing and Drawing. The three remaining items were treated as polytomous, as follows:
Registration (ranging from 0 to 3), Attention/calculation (0 to 5) and Comprehension of
commands (0 to 3).
Two groups were created for item calibration, non-impaired (elderly individuals without
cognitive impairment, defined as Clinical Dementia Rating (CDR) 0 and CDR 0.5, N=220)
and impaired (elderly individuals with cognitive impairment, defined as CDR 1, CDR 2 and
CDR 3, N=109). Frequencies of categorical responses on the 22-item MMSE for the non-impaired
and impaired groups showed that Registration (table,
watch, pen) and Naming (key, pencil) had an extremely high proportion of correct
responses (>0.95) in both groups; thus, these items were excluded from the analysis. The
Rasch and IRT analyses were performed using the remaining 19-item MMSE. The categories of
the Comprehension task were collapsed and it was treated as a dichotomous item.
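The item bookkeeping described above can be sketched as follows. The item names and score ranges are taken from the text; the dictionary layout and function names are illustrative assumptions.

```python
# Dichotomous items (scored 0 or 1), grouped by task as listed in the text.
DICHOTOMOUS = {
    "Orientation to time": ["year", "month", "date", "day-week", "hour"],
    "Orientation to place": ["state", "city", "town", "building", "floor"],
    "Recall": ["table", "watch", "pen"],
    "Naming": ["key", "pencil"],
    "Repetition": ["repetition"],
    "Reading": ["reading"],
    "Writing": ["writing"],
    "Drawing": ["drawing"],
}

# Polytomous items and their maximum scores.
POLYTOMOUS_MAX = {
    "Registration": 3,
    "Attention/calculation": 5,
    "Comprehension of commands": 3,
}

def count_items(dichotomous, polytomous_max):
    """Number of items: each dichotomous item plus each polytomous item."""
    return sum(len(v) for v in dichotomous.values()) + len(polytomous_max)

def max_score(dichotomous, polytomous_max):
    """Maximum raw score: one point per dichotomous item plus polytomous maxima."""
    return sum(len(v) for v in dichotomous.values()) + sum(polytomous_max.values())

def drop_tasks(dichotomous, polytomous_max, excluded):
    """Remove the tasks excluded because of near-perfect performance."""
    kept_d = {k: v for k, v in dichotomous.items() if k not in excluded}
    kept_p = {k: v for k, v in polytomous_max.items() if k not in excluded}
    return kept_d, kept_p

EXCLUDED = {"Registration", "Naming"}
KEPT_D, KEPT_P = drop_tasks(DICHOTOMOUS, POLYTOMOUS_MAX, EXCLUDED)

if __name__ == "__main__":
    print(count_items(DICHOTOMOUS, POLYTOMOUS_MAX), max_score(DICHOTOMOUS, POLYTOMOUS_MAX))
    print(count_items(KEPT_D, KEPT_P))
```

The sketch recovers the 22 items, the 30-point total and the 19 items retained for calibration.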
Classical test theory statistics were calculated and items were calibrated using Rasch
analysis (one-parameter model) to determine item fit. A two-parameter model for
dichotomous items and a graded response model for polytomous items were used for the
one-dimensional and multidimensional item analyses. Goodness-of-fit statistics of the one- and
two-dimensional models were compared for the impaired and non-impaired groups.
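For reference, the one-dimensional models named above are usually written as follows. The notation (θ for ability, a for discrimination, b for location) is the common textbook form and is an assumption here; the parameterization used internally by each software package may differ.

```latex
% Rasch (one-parameter) model for dichotomous item i and person j
P(X_{ij}=1 \mid \theta_j) = \frac{\exp(\theta_j - b_i)}{1 + \exp(\theta_j - b_i)}

% Two-parameter logistic (2PL) model: adds the item discrimination a_i
P(X_{ij}=1 \mid \theta_j) = \frac{1}{1 + \exp[-a_i(\theta_j - b_i)]}

% Graded response model for an item with ordered categories k = 0, \dots, m_i
P^{*}_{ik}(\theta_j) = P(X_{ij} \ge k \mid \theta_j)
  = \frac{1}{1 + \exp[-a_i(\theta_j - b_{ik})]}, \qquad k = 1, \dots, m_i

P(X_{ij}=k \mid \theta_j) = P^{*}_{ik}(\theta_j) - P^{*}_{i,k+1}(\theta_j),
  \qquad P^{*}_{i0} = 1, \quad P^{*}_{i,m_i+1} = 0
```

The Rasch model is the special case of the 2PL model in which all items share the same discrimination.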
RESULTS
Classical test theory statistics of the 19-item MMSE were calculated (Table 2) using
STATA software (version 12) (34). The Cronbach’s alpha of the non-impaired group was
0.38 (unacceptable reliability) and the alpha of the impaired group was 0.76 (acceptable
reliability). The item-rest correlation (with the item excluded from the test) for the non-
impaired group was remarkably low for most items (r<0.3), except Calculation (0.38),
Reading (0.34) and Drawing (0.35). For the impaired group, the item-rest correlation was
greater than 0.3 for all items except Recall (table, watch, pen).
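The two classical statistics reported here can be computed as in the sketch below, which uses the usual definitions (alpha from item and total-score variances; item-rest correlation as the Pearson correlation between an item and the sum of the remaining items). The response matrix is hypothetical and the function names are illustrative; this is not the Stata routine used in the study.

```python
import math
from statistics import mean, variance

def cronbach_alpha(rows):
    """rows: one list of item scores per person."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either variable is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def item_rest_correlation(rows, i):
    """Correlation between item i and the total score with item i removed."""
    item = [row[i] for row in rows]
    rest = [sum(row) - row[i] for row in rows]
    return pearson(item, rest)

# Hypothetical responses (4 persons x 3 items), not data from this study.
TOY = [[1, 1, 1], [0, 0, 0], [1, 1, 0], [0, 1, 0]]

if __name__ == "__main__":
    print(cronbach_alpha(TOY), item_rest_correlation(TOY, 0))
```

An item answered correctly by nearly everyone has almost no variance, which pushes its item-rest correlation and the overall alpha toward zero; this is the pattern seen in the non-impaired group.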
TABLE 2
A Rasch model analysis of the MMSE was performed using Winsteps software
(version 3.73) (35). Rasch analysis is suitable for item calibration even with data from
small groups (22). Mean-square (MNSQ) and standardized fit statistics (ZSTD) were plotted for
inlier-sensitive or information-weighted fit (infit) and outlier-sensitive fit (outfit) (Table 3).
MNSQ values should range from 0.5 to 1.5 and ZSTD values should range from -1.9 to 1.9.
The non-impaired group had many items with low MNSQ for infit (day-week, hour, city)
and outfit (year, day-week, hour, building); person reliability was 0.35 and item
reliability was 0.98. Calculation and Recall (watch) displayed a ZSTD ≥ 3, indicating that the
data were highly unpredictable. For the impaired group, only Repetition demonstrated a low
MNSQ; however, many items indicated that other dimensions may exist (ZSTD ≤ -2). The
person reliability was 0.76 and the item reliability was 0.98.
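The mean-square statistics can be sketched for a single dichotomous item as follows, using the standard definitions: outfit is the unweighted mean of squared standardized residuals, and infit is the information-weighted version. Person abilities and the item difficulty are taken as given; the function names are illustrative and this is not the Winsteps implementation.

```python
import math

def rasch_p(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_fit(responses, thetas, b):
    """Return (infit MNSQ, outfit MNSQ) for one dichotomous item."""
    sq_residuals = []
    variances = []
    for x, theta in zip(responses, thetas):
        p = rasch_p(theta, b)
        variances.append(p * (1.0 - p))       # model variance of the response
        sq_residuals.append((x - p) ** 2)     # squared raw residual
    # Outfit: mean of squared standardized residuals (sensitive to outliers).
    outfit = sum(r / v for r, v in zip(sq_residuals, variances)) / len(responses)
    # Infit: squared residuals weighted by the model variance (information).
    infit = sum(sq_residuals) / sum(variances)
    return infit, outfit

def outside_range(mnsq, low=0.5, high=1.5):
    """Flag a mean-square outside the 0.5 to 1.5 range used in the text."""
    return mnsq < low or mnsq > high

if __name__ == "__main__":
    # Values from Table 3 (non-impaired group): Year outfit and Month infit.
    print(outside_range(0.32), outside_range(0.88))
```

Values below 0.5 indicate responses that are more predictable than the model expects, and values above 1.5 indicate unexpected responses.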
TABLE 3
Based on the ZSTD results, other dimensions may constrain the response patterns. A
multidimensional two-parameter logistic model (M2PL) was calculated for dichotomous
items (Table 4) and a multidimensional graded model for polytomous items, where a1 is the
discriminating parameter in the first latent trait and a2 is the discriminating parameter in the
second latent trait. IRTPro software (version 2.1) (36) was used for these calculations. The
Bock-Aitkin EM algorithm was selected, using a maximum number of 500 cycles and 50 M-
step iterations. The SEM algorithm tolerance was set to 000.1 and supplemented EM was
used as the standard error computation algorithm. The EAP method was used to estimate the
theta parameter (ability).
The discrimination parameter should range between 0.3 (very low) and 4 (very high).
The c parameter is the item intercept in the multidimensional latent space (since threshold, or
difficulty, b parameters do not have meaning for multidimensional models). However, we can
calculate an index, the multidimensional item location, that can be interpreted as a location
parameter. Multidimensional item location is given by -c/a (where a is the single value that
best represents how well the item can discriminate across all the dimensions). The item’s
location in the multidimensional space is thus given by a single value (b) and should range
between -3.0 (very easy) and 3.0 (very difficult). For the non-impaired group, only Recall
items (table, watch, pen) were discriminative in the first latent trait (a1 = 1.78, 2.58, 2.39) but
not the second (a2 = 0.01, -0.34, -0.07), as shown in Table 4. In the graded model, the
location parameter (c) was estimated for each response category for the non-impaired
and impaired groups.
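A minimal sketch of the M2PL response function and of the location index follows. It assumes that the single discrimination value a in -c/a is Reckase's multidimensional discrimination (MDISC, the length of the slope vector), which matches the description above but is our assumption. The slopes are the Recall values reported for the non-impaired group; the intercept is hypothetical because the c values appear only in Table 4.

```python
import math

def m2pl_probability(theta, a, c):
    """M2PL: P(correct) = logistic(a1*theta1 + a2*theta2 + ... + c)."""
    z = sum(ak * tk for ak, tk in zip(a, theta)) + c
    return 1.0 / (1.0 + math.exp(-z))

def mdisc(a):
    """Multidimensional discrimination: Euclidean length of the slope vector."""
    return math.sqrt(sum(ak * ak for ak in a))

def mdiff(a, c):
    """Multidimensional item location: -c divided by the overall discrimination."""
    return -c / mdisc(a)

# Slopes (a1, a2) of the Recall items in the non-impaired group, from the text.
RECALL_SLOPES = {
    "table": (1.78, 0.01),
    "watch": (2.58, -0.34),
    "pen": (2.39, -0.07),
}
HYPOTHETICAL_C = -1.0  # illustrative intercept, not an estimate from this study

if __name__ == "__main__":
    for name, slopes in RECALL_SLOPES.items():
        print(name, mdisc(slopes), mdiff(slopes, HYPOTHETICAL_C))
```

With slopes this unbalanced, the overall discrimination is almost identical to a1, which is consistent with these items being discriminative in the first latent trait only.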
TABLE 4
The multidimensional IRT analysis showed that the MMSE items have discriminative
parameters in two dimensions and that they function differently in the impaired and non-
impaired groups (DIF). For the impaired and non-impaired groups, the Recall task was very
discriminative in the first latent trait (Table 4) but not the second. Calculation was very
discriminative in the first dimension for the impaired group (2.48) but not the non-impaired
group (0.99).
Factor loadings were obtained using oblique quartimax rotation. Table 5 shows
more two-factor loadings in the impaired (14 items) than the non-impaired group (8 items).
TABLE 5
Confirmatory factor analysis (CFA) was calculated for the multidimensional and one-
dimensional models of the impaired and non-impaired groups. Statistics were based on the
loglikelihood. Considering only the impaired group, the values of -2 loglikelihood were
2307.89 and 2224.87 for the one- and two-dimensional models, respectively. The difference
between the values of -2 loglikelihood for the one-dimensional model (2307.89) and the two-
dimensional model (2224.87) may be interpreted as a χ²-distributed statistic on 1 degree of
freedom. This difference of 83.02 was highly significant (p<0.001). This result provides
strong evidence that these data require a two-dimensional model to be fitted. For the non-
impaired group, the values of -2 loglikelihood were 3231.59 and 3127.92 for the one- and
two-dimensional models, respectively. The difference (103.67) was highly significant
(p<0.001), suggesting a better fit in the two-dimensional model in both groups.
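The likelihood-ratio comparison can be reproduced with the sketch below. It takes the single degree of freedom stated above as given and uses the closed form of the chi-square survival function for 1 degree of freedom, erfc(sqrt(x/2)); the function name is illustrative.

```python
import math

def lr_test_df1(neg2ll_restricted, neg2ll_general):
    """Likelihood-ratio statistic and p-value, valid for 1 degree of freedom only."""
    stat = neg2ll_restricted - neg2ll_general
    if stat < 0:
        raise ValueError("the general model should not fit worse than the restricted one")
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# -2 loglikelihood values reported in the text (one- vs two-dimensional model).
IMPAIRED = (2307.89, 2224.87)
NON_IMPAIRED = (3231.59, 3127.92)

if __name__ == "__main__":
    for label, (one_dim, two_dim) in (("impaired", IMPAIRED), ("non-impaired", NON_IMPAIRED)):
        print(label, lr_test_df1(one_dim, two_dim))
```

Both differences (83.02 and 103.67) give p-values far below 0.001 under this reference distribution.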
DISCUSSION
The Mini-Mental State Examination was originally conceived to describe five cognitive
domains: Orientation, Registration, Attention-Calculation, Recall and Language. However,
previous studies that explored the dimensions of the MMSE had conflicting results (16, 17,
37). A five-factor first-order solution has been found in two studies (12, 19), but the
instrument satisfied the assumptions of unidimensionality.
The factor structure of the MMSE was investigated using exploratory and
confirmatory factor analyses (37). The following three-factor solution was found: Commands,
Repetition, Registration tasks loaded onto the first factor; Writing, Reading and Drawing
tasks loaded onto the second factor; and Attention/Calculation, Recall, Orientation to time and
Orientation to place loaded onto the third factor. The confirmatory analysis suggested three
interrelated components, simple processing, complex processing and Attention/Memory.
During the clinical phase, the last factor was explained by self-care ability (66.9% of the
variance). More independent subjects performed better on the third factor, independent of
differences in age, gender and education. In our study, only two factors were found and most
of the items were discriminative in both latent traits.
Using IRT and Rasch models and two cognitively different groups of elderly, Schultz-
Larsen and colleagues found two dimensions of cognitive function in the MMSE (38). In their
study some items were age-correlated (Orientation to time, Attention/Calculation, Naming,
Repetition and Command) and others were non-age correlated (Orientation to place,
Registration, Recall, Reading and Copying). They suggested that a two-scale solution (scales
A and B) is a “stable and statistically supported framework for interpreting data obtained by
means of the MMSE”.
In the present study, the Rasch and one-dimensional two-parameter models did not fit
the data well. MNSQ and ZSTD values indicated that many of the MMSE items had
unexpected values. Using two-dimensional analysis, the MMSE obtained a better goodness-
of-fit index. It is likely that the latent space involves two abilities: memory and executive
function. Writing had the highest factor loading on the first factor (executive function) for the
impaired group (0.91, Table 5), followed by Calculation (0.8), Repetition (0.74) and Reading
(0.69). The Recall task and Orientation to time had the highest factor loadings on the second
factor (memory). Considering the multidimensional item location calculated from the c parameter
(-c/a), Calculation and Recall were the most difficult tasks, as was also found in the Taiwan study (39).
Calculation was very discriminative for the impaired (2.48) but not for the non-impaired
group (0.99).
A shorter version of the MMSE (only 16 items) (39) was proposed using Orientation
to time, Orientation to place, Recall and Calculation with a cut-off point of 11. These items
were selected because they were identified as the most efficient from the MMSE, using
unidimensional IRT. However, in our study, two different latent traits can be measured.
Following the MIRT discrimination parameters, two forms of the MMSE should be proposed:
Form A (Memory tasks, 13 items) using Orientation to time, Orientation to place and Recall
(without scoring the Registration task) and Form B (Executive function tasks, 11 items) with
Calculation, Repetition, Comprehension, Reading, Writing and Drawing tasks.
Although the MMSE parameters were quite different between the non-impaired and
impaired groups, education level was lower in the latter (the mean was 5.0 for CDR 3), which can
be a problem when the groups’ data are compared, since many items require formal education
skills. Further analyses should be performed using groups that differ in cognitive status but
are matched for age and education.
In sum, the MMSE is subject to both ceiling and floor effects, depending on the tested
sample (3, 9, 11). The sensitivity, specificity and predictive values are influenced by the base
rate (frequency) of the disorder in the general population (40). These attributes also vary with
the cutoff score and can be affected by the frequency or marginal distribution and reliability
of the measure in the normative population (41). Therefore, using clinical sample data for this
type of instrument is preferable, as the test shows little variability in responses in a normal
population (ceiling effect). Because of this low variance, item-rest correlation and alpha are
smaller in the non-impaired than in the impaired group.
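The dependence of predictive values on the base rate follows directly from Bayes' theorem and can be sketched as below. The sensitivity, specificity and prevalence values are hypothetical and are not estimates for the MMSE.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (positive predictive value, negative predictive value)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

if __name__ == "__main__":
    # Same hypothetical test applied at a low and at a high base rate.
    for prevalence in (0.1, 0.5):
        print(prevalence, predictive_values(0.8, 0.9, prevalence))
```

With fixed sensitivity and specificity, the positive predictive value falls as the disorder becomes rarer, which is why results from a clinical sample do not transfer directly to community screening.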
Multidimensional IRT models fit the MMSE data better than Rasch and one-
dimensional IRT models. Differential item functioning was found in almost all MMSE items
when the cognitively impaired and non-impaired groups were compared. The MMSE has
unacceptable reliability when used in the non-impaired population but has acceptable
reliability when impaired people answer the test. Some MMSE items display a ceiling effect
(Registration and Naming), even in dementia patients.
In general, the most discriminative items for the Memory factor were Orientation to time,
Orientation to place and Recall. Calculation, Repetition, Comprehension, Reading, Writing
and Drawing tasks were more discriminative for the Executive function factor.
REFERENCES
1. Folstein MF, Folstein SE, McHugh PR. "Mini-mental state". A practical method
for grading the cognitive state of patients for the clinician. Journal of Psychiatric Research
1975;12:189-198
2. Fayers P, Hjermstad M, Ranhoff A, et al. Which Mini-Mental State Exam Items Can
Be Used to Screen for Delirium and Cognitive Impairment? Journal of Pain and Symptom
Management 2005;30:41-50
3. Folstein MF, Folstein SE, Fanjiang G, eds. MMSE Mini-Mental State Examination.
Lutz, FL: Psychological Assessment Resources, Inc.; 2001
4. Teng E, Chui H. The modified Mini-Mental State Examination. Journal of Clinical
Psychiatry 1987;48:314-318
5. McDowell I. The modified Mini-Mental State Test. In: McDowell I, ed. Measuring health: a
guide to rating scales and questionnaires. New York: Oxford University Press; 2006:441-449
6. Tombaugh T, McDowell I, Kristjansson B, et al. Mini-Mental State Examination
(MMSE) and the Modified MMSE (3MS): a psychometric comparison and normative data.
Psychological Assessment 1996;8:48-59
7. Newkirk LA, Kim JM, Thompson JM, et al. Validation of a 26-Point Telephone
Version of the Mini-Mental State Examination. Journal of Geriatric Psychiatry and Neurology
2004;17:81-87
8. Folstein M, Folstein S, White T, et al. MMSE-2 - Mini-Mental State Examination, 2nd
Edition - User's Manual. Lutz, FL: PAR, Psychological Assessment Resources; 2010
9. Teresi J. Mini-Mental State Examination (MMSE): Scaling the MMSE using item
response theory (IRT). Journal of Clinical Epidemiology 2007;60:256-259
10. Wouters H, van Gool WA, Schmand B, et al. Three sides of the same coin: measuring
global cognitive impairment with the MMSE, ADAS-cog and CAMCOG. International
Journal of Geriatric Psychiatry 2009;25:770-779
11. Kraemer HC, Moritz DJ, Yesavage J. Adjusting mini-mental state examination scores
for age and educational level to screen for dementia: correcting bias or reducing validity?
International Psychogeriatrics 1998;10:43-51

12. Baños JH, Franklin LM. Factor structure of the Mini-Mental State Examination in
adult psychiatric inpatients. Psychological Assessment 2002;14:397-400
13. Brugnolo A, Nobili F, Barbieri MP, et al. The factorial structure of the mini mental
state examination (MMSE) in Alzheimer's disease. Archives of Gerontology and Geriatrics
2009;49:180-185
14. Elhan A, Kutlay S, Küçükdeveci A, et al. Psychometric properties of the Mini-Mental
State Examination in patients with acquired brain injury in Turkey. Journal of Rehabilitation
Medicine 2005;37:306-311
15. Champlain AD, Gessaroli ME. Assessing the dimensionality of item response matrices
with small sample sizes and short test lengths. Applied Measurement in Education
1998;11:231-253
16. Fillenbaum G, Heyman A, Wilkinson W, et al. Comparison of two screening tests in
Alzheimer's disease: the correlation and reliability of the Mini-Mental State Examination and
the Modified Blessed Test. Archives of Neurology 1987;44:924-927
17. Tinklenberg J, Brooks JO III, Tanke ED, et al. Factor Analysis and Preliminary Validation of
the Mini-Mental Examination from a Longitudinal Perspective. International Psychogeriatrics
1990;2:123-134
18. de Leon J, Baca-García E, Simpson GM. A factor analysis of the Mini-Mental State
Examination in schizophrenic disorders. Acta Psychiatrica Scandinavica 1998;98:366-368
19. Jones RN, Gallo JJ. Dimensions of the Mini-Mental State Examination among
community dwelling older adults. Psychological Medicine 2000;30:605-618
20. Schultz-Larsen K, Kreiner S, Lomholt R. Mini-Mental Status Examination: Mixed
Rasch model item analysis derived two different cognitive dimensions of the MMSE. Journal
of Clinical Epidemiology 2007;60:268-279

21. Conrad KJ, Smith EV. International Conference on Objective Measurement:
Applications of Rasch Analysis in Health Care. Medical Care 2004;42:I-1
22. de Ayala RJ. The theory and practice of item response theory. New York, NY: The
Guilford Press; 2009
23. Schultz-Larsen K, Lomholt R, Kreiner S. Mini-Mental Status Examination: A short
form of MMSE was as accurate as the original MMSE in predicting dementia. Journal of
Clinical Epidemiology 2007;60:260-267
24. Edelen MO, Thissen D, Teresi JA, et al. Identification of Differential Item Functioning
Using Item Response Theory and Likelihood-Based Model Comparison Approach -
application to the Mini-Mental State Examination. Medical Care 2006;44:S134-S142
25. Morales LS, Flowers C, Gutierrez P, et al. Item and Scale Differential Functioning of
the Mini-Mental State Exam Assessed Using the Differential Item and Test Functioning
(DFIT) Framework. Medical Care 2006;44:S143-S151
26. Ramirez M, Teresi JA, Holmes D, et al. Differential Item Functioning (DIF) and the
Mini-Mental State Examination (MMSE) - overview, sample, and issues of translation.
Medical Care 2006;44:S95-S106
27. Brennan RL. Generalizability Theory and Classical Test Theory. Applied
Measurement in Education 2010;24:1-21
28. Edelen MO, Reeve BB. Applying item response theory (IRT) modeling to
questionnaire development, evaluation, and refinement. Quality of Life Research 2007;16:5-18
29. Crane PK, Narasimhalu K, Gibbons LE, et al. Item response theory facilitated
cocalibrating cognitive tests and reduced bias in estimated rates of decline. Journal of Clinical
Epidemiology 2008;61:1018-1027.e1019
30. Embretson S, Reise S. Item response theory for psychologists. Mahwah, New Jersey:
Lawrence Erlbaum Associates Inc. Publishers; 2000
31. Teresi JA, Ramirez M, Jones RN, et al. Modifying Measures Based on Differential
Item Functioning (DIF) Impact Analyses. Journal of Aging and Health 2012
32. Pereira D, Satler C, Medeiros L, et al. Philadelphia Brief Assessment of Cognition in
healthy and clinical Brazilian sample. Arquivos de Neuro-Psiquiatria 2012;70:175-179
33. Fraser C, McDonald R. NOHARM. Version 4.0; 2012
34. StataCorp. Stata. Version 12. College Station, TX: StataCorp LP; 2011
35. Linacre J. Winsteps. Version 3.73. Beaverton, Oregon; 2011
36. Cai L, Thissen D, du Toit S. IRTPro. Version 2.1.21111.16001. Skokie, IL: Scientific Software
International, Inc; 2011
37. Shyu Y-I, Yip P-K. Factor Structure and Explanatory Variables of the Mini-Mental
State Examination (MMSE) for Elderly Persons in Taiwan. Journal of Formosan Medical
Association 2001;100:676-683
38. Schultz-Larsen K, Lomholt RK, Kreiner S. Mini-Mental Status Examination: A short
form of MMSE was as accurate as the original MMSE in predicting dementia. Journal of
Clinical Epidemiology 2007;60:260-267
39. Lou MF, Dai YT, Huang GS, et al. Identifying the most efficient items from the Mini-
Mental State Examination for cognitive function assessment in older Taiwanese patients.
Journal of Clinical Nursing 2007;16:502-508
40. Guggenmoos-Holzmann I, Houwelingen HCv. The (In) Validity of sensitivity and
specificity. Statistics in Medicine 2000:1783-1792
41. Shrout P, Fleiss J. Reliability and case detection. In: Wing JK, Bebbington P, Robbins
LN, eds. What is a case? London: Grant Mcintyre; 1981:117-128

Table 1 – Age, education and MMSE scores by CDR (Clinical Dementia Rating) level.

                   CDR 0            CDR 0.5          CDR 1            CDR 2            CDR 3            Total
                   N  Mean S.D.     N  Mean S.D.     N  Mean S.D.     N  Mean S.D.     N  Mean S.D.     N  Mean
Age       Men     48  74.0  7.9    19  77.5  8.5    21  76.0  7.2    11  81.1  4.7     1  73.0  0.0   100  75.9
          Women  108  70.5  7.9    40  78.4  8.7    58  79.6  7.5    16  77.1  9.7     2  86.0  1.4   224  74.9
          Total  161  71.8  8.1    59  78.1  8.6    79  78.6  7.6    27  78.7  8.2     3  81.7  7.6   329  75.2
Education Men     48   9.0  4.2    19  10.4  6.0    21   8.1  5.8    11  11.9  6.3     1  12.0  0.0   100   9.4
          Women  108   8.9  4.5    40   9.7  6.1    58   9.5  5.8    16   6.5  5.8     2   1.5  0.7   224   9.0
          Total  161   9.0  4.5    59   9.9  6.0    79   9.1  5.8    27   8.7  6.5     3   5.0  6.1   329   9.1
MMSE      Men     48  26.1  4.0    19  25.1  3.5    21  20.5  5.1    11  12.8  4.9     1   4.0  0.0   100  23.1
          Women  108  26.3  4.1    40  20.7  6.4    58  17.6  6.2    16  13.2  7.7     2  13.5  9.2   224  22.0
          Total  161  26.1  4.1    59  22.1  6.0    79  18.4  6.0    27  13.0  6.6     3  10.3  8.5   329  22.3

Table 2 – Classical test theory statistics of MMSE (19 items).
               Non-impaired                              Impaired
Item           Response  Std.  Item-Rest    Coefficient  Response  Std.  Item-Rest    Coefficient
               Average   Dev.  Correlation  α            Average   Dev.  Correlation  α
Year           0.98      0.15   0.23        0.36         0.27      0.45  0.49         0.74
Month          0.96      0.19   0.10        0.37         0.39      0.49  0.32         0.75
Date           0.94      0.24   0.05        0.38         0.19      0.40  0.31         0.75
Day-week       0.99      0.10   0.08        0.38         0.35      0.48  0.30         0.75
Hour           1.00      0.07  -0.05        0.38         0.74      0.44  0.43         0.75
State          0.95      0.22   0.18        0.36         0.33      0.47  0.41         0.75
City           0.99      0.10  -0.05        0.39         0.69      0.46  0.41         0.75
Town           0.94      0.24   0.24        0.35         0.42      0.50  0.37         0.75
Building       0.98      0.13   0.18        0.37         0.59      0.49  0.45         0.74
Floor          0.96      0.19   0.12        0.37         0.54      0.50  0.45         0.74
Calculation    3.86      1.50   0.38        0.29         1.44      1.67  0.45         0.79
Table          0.85      0.37  -0.07        0.41         0.25      0.44  0.28         0.76
Watch          0.67      0.47  -0.01        0.40         0.14      0.35  0.22         0.76
Pen            0.64      0.48   0.06        0.38         0.09      0.29  0.06         0.77
Repetition     0.98      0.15   0.07        0.38         0.78      0.42  0.41         0.75
Comprehension  0.92      0.27   0.15        0.36         0.71      0.46  0.26         0.76
Reading        0.89      0.32   0.34        0.32         0.46      0.50  0.42         0.75
Writing        0.95      0.23   0.24        0.35         0.47      0.50  0.55         0.74
Drawing        0.82      0.39   0.35        0.31         0.35      0.48  0.40         0.75
Item-Rest = item-test with item deleted.
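
The statistics reported in Table 2 can be illustrated with a minimal sketch. It assumes that the "Coefficient α" column is Cronbach's alpha recomputed with the item removed (this is an assumption, not stated in the table), and it uses a small invented response matrix (`toy`), not the study data. All function names are hypothetical.

```python
from statistics import mean, variance

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def item_rest_correlation(rows, j):
    """Correlation of item j with the total score computed without item j."""
    item = [r[j] for r in rows]
    rest = [sum(r) - r[j] for r in rows]
    return pearson(item, rest)

def alpha_if_deleted(rows, j):
    """Alpha of the scale after removing item j (assumed meaning of the column)."""
    return cronbach_alpha([r[:j] + r[j + 1:] for r in rows])

# Invented toy data: 6 respondents x 4 dichotomous items (illustration only).
toy = [[1, 1, 1, 0], [1, 1, 0, 1], [1, 0, 1, 1],
       [0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 1, 0]]

for j in range(4):
    print(j, round(item_rest_correlation(toy, j), 2), round(alpha_if_deleted(toy, j), 2))
```

An item with a low or negative item-rest correlation, such as several non-impaired items in Table 2, contributes little to the scale's internal consistency.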
Table 3 – Rasch model analysis (N=326).
                         Infit             Outfit                     Infit             Outfit
Item            Measure  MNSQ     ZSTD     MNSQ     ZSTD     Measure  MNSQ     ZSTD     MNSQ     ZSTD
Year           -0.19     0.64    -0.8      0.32ᵃ   -1.6      0.99     0.82    -1.8      0.67    -1.6
Month           0.27     0.88    -0.3      0.73    -0.5      0.39     0.95    -0.4      0.93    -0.4
Date            0.72     0.99     0.1      1.06     0.3      1.49     0.96    -0.3      0.75    -0.8
Day-week       -0.57     0.38ᵃ   -1.4      0.33ᵃ   -1.4      0.57     0.99     0.0      0.93    -0.4
Hour           -0.76     0.21ᵃ   -1.9      0.30ᵃ   -1.5     -1.07     0.50    -3.2ᵇ     0.55    -3.2ᵇ
State           0.51     0.88    -0.3      0.78    -0.4      0.68     0.89    -1.1      0.77    -1.2
City           -0.57     0.41ᵃ   -1.3      0.58    -0.7     -0.92     0.59    -2.6ᵇ     0.64    -2.4ᵇ
Town            0.72     0.90    -0.3      0.57    -1.3      0.22     0.93    -0.6      1.15     1.0
Building       -0.37     0.51    -1.1      0.34ᵃ   -1.4     -0.51     0.72    -1.9      0.73    -1.8
Floor           0.27     0.86    -0.3      0.74    -0.5     -0.30     0.77    -1.7      0.73    -1.9
Calculation    -6.55     1.35     3.4ᶜ     1.28     2.5ᶜ    -2.54     2.39     8.2ᶜ     2.26     6.4ᶜ
Table           1.81     1.29     1.9      1.17     0.9      1.12     0.99    -0.1      0.83    -0.7
Watch           2.96     1.16     2.1ᶜ     1.09     0.9      1.88     0.98    -0.1      1.06     0.3
Pen             3.12     1.08     1.1      1.03     0.4      2.41     1.06     0.4      1.40     1.0
Repetition     -0.19     0.69    -0.7      0.63    -0.6     -1.21     0.45ᵃ   -3.7ᵇ     0.58    -2.9ᵇ
Comprehension  -5.32     0.61    -4.5ᵇ     0.65    -3.6ᵇ    -3.76     1.09     0.8      1.34     2.5ᶜ
Reading         1.45     0.88    -0.7      0.74    -1.1      0.02     0.86    -1.2      0.82    -1.2
Writing         0.62     0.88    -0.4      0.65    -0.9     -0.01     0.73    -2.3ᵇ     0.67    -2.3ᵇ
Drawing         2.07     0.87    -1.1      0.75    -1.5      0.57     0.89    -1.1      0.79    -1.2
Mean             .00     0.81    -0.3      0.72    -0.6       .00     0.92    -0.7      0.93    -0.6
S.D.            2.32     0.30     1.6      0.29     1.2      1.44     0.39     2.4      0.39     2.2
Note. ᵃ MNSQ < 0.5 indicates that the item may cause misleadingly high reliability and
separation coefficients. ᵇ ZSTD ≤ -2 indicates that other dimensions may be constraining the
response patterns. ᶜ ZSTD ≥ 3 indicates that the data are particularly unexpected or
unpredictable.
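
The flagging rules stated in the note to Table 3 can be written as a short sketch. The function name `fit_flags` is hypothetical, and the cut-offs are exactly those given in the note, applied to one MNSQ/ZSTD pair (infit or outfit) at a time.

```python
def fit_flags(mnsq, zstd):
    """Return the Table 3 footnote markers that apply to one MNSQ/ZSTD pair."""
    flags = []
    if mnsq < 0.5:   # a: may cause misleadingly high reliability and separation
        flags.append("a")
    if zstd <= -2:   # b: other dimensions may be constraining the responses
        flags.append("b")
    if zstd >= 3:    # c: data are particularly unexpected or unpredictable
        flags.append("c")
    return flags

# Value pairs taken from rows of Table 3.
print(fit_flags(0.32, -1.6))  # Year, first outfit pair
print(fit_flags(0.45, -3.7))  # Repetition, second infit pair
print(fit_flags(2.39, 8.2))   # Calculation, second infit pair
```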
Table 4 – Multidimensional 2PL and graded model item parameter estimates, logit: aθ + c
Item a1 a2 c a1 a2 c
Year 3.86 3.81 5.75 2.32 2.13 -1.69
Month 2.45 1.13 3.77 0.82 1.28 -0.66
Date 1.88 1.67 2.46 1.58 3.04 -2.34
Day-week 1.89 1.00 3.44 1.37 2.77 -0.94
Hour 1.28 1.86 4.93 1.50 2.39 1.46
State 2.19 1.82 3.38 1.96 1.26 -1.02
City 2.12 1.82 5.92 0.80 1.19 1.23
Town 1.38 1.28 2.82 0.98 1.34 -0.48
Building 1.90 1.88 4.53 1.26 0.94 0.44
Floor 1.29 1.15 3.72 1.41 0.88 0.16
Table 1.78 0.01 1.43 1.54 2.76 -1.42
Watch 2.58 -0.34 0.07 1.82 4.54 -2.00
Pen 2.39 -0.04 -0.07 0.78 >6.0 -2.73
Repetition 0.76 1.53 3.97 1.97 0.49 1.82
Comprehension 0.52 0.84 2.81 0.32 0.12 0.97
Reading 1.16 0.59 3.31 1.63 0.13 -0.25
Writing 1.02 3.89 4.75 4.10 0.87 -0.33
Drawing 1.12 0.58 2.35 1.20 0.11 -0.93
Non-impaired
Item a1 a2 c1 c2 c3 c4 c5
Calculation 0.99 1.43 3.39 1.79 1.12 0.43 -0.55
Commands 0.85 0.63 3.87 3.27
Impaired
Calculation 2.48 0.73 0.58 -0.46 -1.65 -2.32 -2.88
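
To show how the slope-intercept parameters in Table 4 translate into response probabilities, the sketch below evaluates the logit a1θ1 + a2θ2 + c for a dichotomous item and, for a graded item, takes differences of cumulative probabilities. It assumes the graded intercepts c1 to c5 are cumulative (P(X ≥ k) uses ck), which is the usual slope-intercept convention but is not spelled out in the table. The function names are hypothetical and θ = (0, 0) is an arbitrary illustrative ability point.

```python
import math

def m2pl_prob(theta, a, c):
    """P(correct) under a slope-intercept M2PL item: logit = a . theta + c."""
    logit = sum(ai * ti for ai, ti in zip(a, theta)) + c
    return 1.0 / (1.0 + math.exp(-logit))

def graded_probs(theta, a, cs):
    """Category probabilities for a graded item with ordered intercepts cs.

    Assumes P(X >= k) = logistic(a . theta + c_k); categories run 0..len(cs).
    """
    cum = [1.0] + [m2pl_prob(theta, a, ck) for ck in cs] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(cs) + 1)]

theta = (0.0, 0.0)  # illustrative point on both latent traits

# "Year", first set of columns in Table 4: a1 = 3.86, a2 = 3.81, c = 5.75.
p_year = m2pl_prob(theta, (3.86, 3.81), 5.75)

# "Calculation", non-impaired graded parameters from Table 4.
p_calc = graded_probs(theta, (0.99, 1.43), (3.39, 1.79, 1.12, 0.43, -0.55))

print(round(p_year, 3))
print([round(p, 3) for p in p_calc])
```

A large positive intercept, as for Year in the first set of columns, means the item is answered correctly with high probability at average trait levels, which is consistent with a ceiling effect.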
Table 5 – Oblique CF-Quartimax Rotated Loadings.
Item λ1 λ2 λ1 λ2
Year 0.92 0.66 0.60
Month 0.53 0.36 0.56
Date 0.63 0.41 0.79
Day-week 0.65 0.39 0.79
Hour 0.46 0.73
State 0.80 0.68 0.44
City 0.36 0.54
Town 0.50 0.40 0.41 0.56
Building 0.97 0.54 0.41
Floor 0.53 0.56 0.59 0.37
Calculation 0.30 0.60 0.80
Table 0.43 0.77
Watch 0.49 0.35 0.88
Pen 0.35 1.00
Repetition 0.56 0.74
Comprehension 0.26 0.43 0.38
Reading 0.64 0.69
Writing 0.83 0.91
Drawing 0.68 0.32 0.57
* Factor correlations: non-impaired = 0.01; impaired = -0.35. Low factor loadings (<0.3) were omitted.

