Estimating Population or Group Sensitivity and Its Precision From A Set of Individual D
British Journal of Mathematical and Statistical Psychology (2005), 58, 55–63
© 2005 The British Psychological Society
www.bpsjournals.co.uk
Jian Bi*
Sensometrics Research and Service, Richmond, Virginia, USA
1. Introduction
In signal detection theory (Green & Swets, 1974), the index $d'$ is a measure of individual
sensitivity. The major advantage of using $d'$ to measure sensitivity is that the index is
theoretically unaffected by the decision criteria and methods used. In many situations,
however, researchers are interested in estimating population sensitivity from a set of
participants, where the set of participants is a random sample from the population.
There are two main methods for estimating population sensitivity, averaging and pooling
(Swets & Pickett, 1982; Hautus, 1997). The ‘pooling’ method involves
pooling all data from a set of participants and then calculating a $d'$ value from the pooled
data as if they had been obtained from a single participant.
The ‘averaging’ method involves calculating an individual $d'$ for each participant and then
averaging the set of individual $d'$ values to obtain an estimate of population sensitivity.
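The difference between the two methods can be illustrated with a minimal sketch. It assumes the yes/no task under the equal-variance Gaussian model, where $d' = z(H) - z(F)$; the participant counts and the function names are hypothetical and are not taken from the paper.

```python
from statistics import NormalDist

def d_prime(hit_rate: float, fa_rate: float) -> float:
    """Yes/no d' under the equal-variance Gaussian model: z(H) - z(F)."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# Hypothetical counts per participant (illustration only):
# (hits, signal trials, false alarms, noise trials)
participants = [(45, 50, 10, 50), (30, 50, 5, 50), (40, 50, 25, 50)]

def averaged_d_prime(data):
    """'Averaging': one d' per participant, then the arithmetic mean."""
    values = [d_prime(h / n_s, f / n_n) for h, n_s, f, n_n in data]
    return sum(values) / len(values)

def pooled_d_prime(data):
    """'Pooling': sum the counts over participants, then a single d'."""
    hits = sum(row[0] for row in data)
    signal = sum(row[1] for row in data)
    false_alarms = sum(row[2] for row in data)
    noise = sum(row[3] for row in data)
    return d_prime(hits / signal, false_alarms / noise)

averaged = averaged_d_prime(participants)
pooled = pooled_d_prime(participants)
print(f"averaged d' = {averaged:.3f}, pooled d' = {pooled:.3f}")
```

With these made-up counts the participants differ in both sensitivity and criterion, and the pooled value comes out lower than the averaged one.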
* Correspondence should be addressed to J. Bi, 9212 Groomfield Rd, Richmond, VA 23236, USA (e-mail: bbdjcy@aol.com).
DOI:10.1348/000711005X38357
Macmillan and Kaplan (1985), Hanley (1989), Metz (1989), Macmillan and Creelman
(1991), Irwin, Hautus, and Stillman (1992), Hautus (1997) and others discussed the
topic with emphasis on the ‘pooling’ method. Dorfman and Berbaum (1986) developed
a jackknife procedure and a computer program to estimate population parameters
including sensitivity and their standard errors from pooled rating-method data. Hautus
(1997) conducted Monte Carlo simulations for the two methods.
The most important reason for using the ‘pooling’ method is that when individual $d'$ values
cannot be calculated because the data for each participant are too sparse, the ‘pooling’ method might
be the last resort for estimating population sensitivity. However, unnecessary pooling
carries several hazards. The consensus in the literature is that the ‘pooling’
method should be used only when the data for each participant are too sparse, and only under the
rather restrictive conditions that the sensitivities and decision criteria of the participants are
very close (Metz, 1989; Dorfman, Berbaum, & Metz, 1992). If these conditions are not
met, the estimate of population sensitivity obtained by the ‘pooling’ method may be distorted
and biased, sometimes by a substantial amount (Metz, 1989). In this situation the
jackknife procedure developed by Dorfman and Berbaum (1986) for pooled ratings data
might avoid these pitfalls (Irwin et al., 1992). Another weakness of the ‘pooling’ method
is that it often overestimates the precision of the estimator for sensitivity because the
values in a pooled data set are not independent (Metz, 1989). The ‘averaging’ method
does not suffer from the problems normally associated with the ‘pooling’ method.
Another merit of the ‘averaging’ method is that it can handle individual $d'$ values obtained from
different methods. The ‘averaging’ method is usually preferred in the literature and
should be used whenever possible (Macmillan & Kaplan, 1985; Metz, 1989; Hautus,
1997).
There is relatively little discussion of the ‘averaging’ method in the literature. The
implicitly assumed and commonly used estimator in the ‘averaging’ method is the
arithmetic mean. The question arises whether the arithmetic mean is an appropriate estimator
of population or group sensitivity. Using the arithmetic mean
ignores the fact that an individual $d'$ is itself an estimator of individual sensitivity
with an inherent variance. When these variances are heterogeneous, the arithmetic mean
is not a good estimator of a population parameter, and using it may lead to an
estimate with high variation. Another important fact, also often ignored when
estimating population sensitivity using the arithmetic mean, is that the variance of
an individual $d'$ involves both between- and within-subject variation in a random effects
model for the estimation of population sensitivity and its precision. Failing to account
for both of these components of variance leads to an underestimate of the variation and
an overestimate of the precision of a population sensitivity estimator.
The objective of the present paper is to propose a weighted mean, instead of
the arithmetic mean, for estimating population or group sensitivity. No particular novelty in
statistics is claimed for the proposed approach. The justification for presenting it is that
it is valid and relevant to the issue. As far as the author is aware, the approach has not
previously been proposed, and the unsuitability of the arithmetic mean for estimating
population $d'$ has not been given adequate attention in the psychometric
literature.
general agreement that sensitivity data are lognormally distributed rather than normally
distributed (Gaddum, 1933; Bliss, 1934). As a measure of sensitivity, $d'$ always takes
non-negative values. Strictly speaking, $d'$ cannot be normally distributed, but its logarithm
can be, because only the transformed variable is defined over the whole range
from $-\infty$ to $\infty$. It is reasonable to assume that individual sensitivity in terms of $d'$
follows a lognormal distribution. Analysis of $d'$ data should be conducted after they have
been log-transformed.
In order to estimate population sensitivity using the averaged $d'$ method, the
individual $d'_i$, $i = 1, 2, \ldots, k$, should be log-transformed as $d'_i = \log(d'_i)$, where $\log(d'_i)$
is the natural logarithm of $d'_i$, and the log-transformed $d'_i$ is assumed to be normally distributed.
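A minimal sketch of the transformation step follows. The text here does not state how a variance on the original scale is carried over to the log scale, so the first-order (delta-method) approximation $\mathrm{Var}(\log d') \approx \mathrm{Var}(d')/d'^2$ used below is an assumption, and the function name `to_log_scale` is hypothetical.

```python
import math

def to_log_scale(d_prime: float, variance: float) -> tuple[float, float]:
    """Return (log d', approximate variance of log d').

    The variance conversion is a delta-method approximation,
    assumed here for illustration; it is not taken from the paper.
    """
    if d_prime <= 0:
        raise ValueError("the log transform requires a positive d'")
    return math.log(d_prime), variance / d_prime ** 2

# Hypothetical individual estimate: d' = 2.0 with variance 0.04.
log_d, log_var = to_log_scale(2.0, 0.04)
print(round(log_d, 4), round(log_var, 4))
```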
In this paper, $d^*$ denotes a true population or group sensitivity and $d_i$ denotes the
true individual sensitivity, while $d'_p$ (or $d'_g$) and $d'_i$ denote the estimates of $d^*$ and $d_i$ on
the log scale. Once the population or group sensitivity and the variance of the
estimator, $V(d'_p)$ or $V(d'_g)$, have been estimated, a confidence interval for $d^*$ can be
constructed based on the assumption of a normal distribution for the log-transformed
data.
The estimated population or group sensitivity and the $1 - \alpha$ confidence limits on the
log scale should be back-transformed to the original scale.
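The back-transformation can be sketched as follows, assuming a normal-theory interval on the log scale that is then exponentiated. The function name `back_transformed_interval` is hypothetical; the inputs are the log-scale estimate 0.6351 and variance 0.0035 reported in Section 5.1.

```python
import math
from statistics import NormalDist

def back_transformed_interval(log_estimate, log_variance, alpha=0.05):
    """Exponentiate a log-scale estimate and its 1 - alpha normal limits."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    half_width = z * math.sqrt(log_variance)
    return (
        math.exp(log_estimate),
        math.exp(log_estimate - half_width),
        math.exp(log_estimate + half_width),
    )

estimate, lower, upper = back_transformed_interval(0.6351, 0.0035)
print(round(estimate, 2), round(lower, 2), round(upper, 2))  # 1.89 1.68 2.12
```

Because the exponential is not linear, the resulting interval is asymmetric about the point estimate on the original scale.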
$$d'_i = d^* + L_i + e_i, \qquad i = 1, 2, \ldots, k. \tag{1}$$
In this paper, population sensitivity and group sensitivity are different concepts with
different meanings. If the $k$ subjects are a random sample of a population, (1) is a
random effects model, where $d^*$ is the population sensitivity, $L_i = d_i - d^*$ is the effect of the $i$th
subject, $d_i$ is the sensitivity of the $i$th subject and $e_i$ is the deviation of $d'_i$
from $d_i$. Assume that $L_i$ and $e_i$ are mutually independent, and that $L_i \sim N(0, \sigma^2)$ and
$e_i \sim N(0, \sigma_i^2)$, where $\sigma^2$ and $\sigma_i^2$ are the between-subject and within-subject variances,
respectively. The expectation and variance of $d'_i$ are given by
$$E(d'_i) = d^*, \qquad \mathrm{Var}(d'_i) = \sigma^2 + \sigma_i^2.$$
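The variance decomposition can be checked numerically with a small simulation of model (1). This is only a sketch: the parameter values are made up, every subject is given the same within-subject variance for simplicity, and the function name `simulate_log_d_primes` is hypothetical.

```python
import random

def simulate_log_d_primes(d_star, sigma2_between, sigma2_within, k, seed=0):
    """Draw k values d'_i = d* + L_i + e_i with independent normal terms."""
    rng = random.Random(seed)
    draws = []
    for _ in range(k):
        subject_effect = rng.gauss(0.0, sigma2_between ** 0.5)  # L_i
        within_error = rng.gauss(0.0, sigma2_within ** 0.5)     # e_i
        draws.append(d_star + subject_effect + within_error)
    return draws

# Hypothetical parameters on the log scale (illustration only).
sample = simulate_log_d_primes(0.6, 0.10, 0.05, k=100_000)
sample_mean = sum(sample) / len(sample)
sample_var = sum((x - sample_mean) ** 2 for x in sample) / len(sample)

# The mean should be close to d* = 0.6 and the variance close to
# the sum of the two components, 0.10 + 0.05 = 0.15.
print(round(sample_mean, 3), round(sample_var, 3))
```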
If the $k$ participants are not a random sample of a population, in other words, if interest is
confined to the sensitivity of the specified $k$ participants, (1) is a fixed effects model. In
this model, $d^*$ represents group sensitivity and $L_i$ is not a random variable, that is, $\sigma^2 = 0$. Hence
$$\mathrm{Var}(d'_i) = \sigma_i^2.$$
In practice, the true variance of an individual $d'$ is not known and must be estimated from
the data. Furthermore, because the participants are randomly chosen from a population,
a random effects model with two sources of variation, between- and within-subject
variance, should be used in estimating population sensitivity and the variance
of the estimator. This paper proposes an iterative procedure to estimate population
sensitivity and the variance of the estimator. The procedure is based on a random effects
model.
If interest is confined to the sensitivity of a particular group of participants, for
example a panel, then the consensus value of a set of individual $d'$ values represents group
sensitivity. The estimate of group sensitivity is based on a fixed effects model. This paper
also shows how to estimate group sensitivity and its precision.
$$V(d'_p) = \left(\sum_{i=1}^{k} \hat{w}_i\right)^{-1}. \tag{4}$$
Rukhin and Vangel (1998) investigated the theoretical properties of the Mandel–Paule
algorithm and compared it with the maximum likelihood estimator. They showed that the
Mandel–Paule solution for the semi-weighted mean can be interpreted as a simplified
version of the maximum likelihood method and concluded that it is a satisfactory rule
from many perspectives. They also showed that, because the estimator $\hat{\sigma}^2$ in (3) is not a
consistent estimator of the between-subject variance $\sigma^2$, a better variance estimator than (4)
for the semi-weighted mean in (2) is given by
$$V(d'_p) = \frac{\sum_{i=1}^{k} \hat{w}_i^2 (d'_i - d'_p)^2}{\left(\sum_{i=1}^{k} \hat{w}_i\right)^2}. \tag{5}$$
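The two variance estimators can be compared in a short sketch. Equation (2) and the definition of $\hat{w}_i$ are not reproduced in this section, so the code assumes the usual Mandel–Paule form, $\hat{w}_i = 1/(\hat{\sigma}^2 + \hat{\sigma}_i^2)$ with $d'_p = \sum \hat{w}_i d'_i / \sum \hat{w}_i$. The function name and the input values are hypothetical.

```python
def semi_weighted_mean(log_d, var_within, sigma2_between):
    """Semi-weighted mean with the variance estimators of (4) and (5).

    Assumes weights w_i = 1 / (sigma2_between + var_within_i), the usual
    Mandel-Paule form; this is an assumption, not quoted from the paper.
    """
    weights = [1.0 / (sigma2_between + v) for v in var_within]
    total = sum(weights)
    mean = sum(w * d for w, d in zip(weights, log_d)) / total
    var_eq4 = 1.0 / total
    var_eq5 = sum(w ** 2 * (d - mean) ** 2
                  for w, d in zip(weights, log_d)) / total ** 2
    return mean, var_eq4, var_eq5

# Hypothetical log-scale estimates and within-subject variances.
mean, var_eq4, var_eq5 = semi_weighted_mean([0.5, 0.7], [0.02, 0.02], 0.0)
print(round(mean, 4), round(var_eq4, 4), round(var_eq5, 4))
```

Estimator (4) depends only on the weights, whereas (5) also uses the observed spread of the individual estimates around the semi-weighted mean.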
where $\hat{W}_i = 1/\hat{\sigma}_i^2$. No iteration is needed to calculate the ordinary weighted mean.
The variance of the weighted mean in (6) can be estimated from
$$V(d'_g) = \frac{1}{\sum_{i=1}^{k} \hat{W}_i}. \tag{7}$$
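A minimal sketch of the fixed effects calculation follows. Equation (6) is not reproduced in this section, so the code assumes the ordinary inverse-variance weighted mean, $d'_g = \sum \hat{W}_i d'_i / \sum \hat{W}_i$, together with the variance in (7). The function name and the data are hypothetical.

```python
def weighted_group_mean(log_d, var_within):
    """Inverse-variance weighted mean and its variance (fixed effects).

    Assumes the usual weighted-mean form for (6); weights are
    W_i = 1 / var_within_i, and the variance follows (7).
    """
    weights = [1.0 / v for v in var_within]
    total = sum(weights)
    mean = sum(w * d for w, d in zip(weights, log_d)) / total
    return mean, 1.0 / total

# Hypothetical log-scale estimates and within-subject variances.
group_mean, group_var = weighted_group_mean([0.4, 0.8], [0.01, 0.03])
print(round(group_mean, 4), round(group_var, 4))
```

The more precise estimate (variance 0.01) receives three times the weight of the less precise one, so the weighted mean lies closer to 0.4 than the arithmetic mean of 0.6 does.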
5. Numerical examples
5.1 Estimating sensitivity of a consumer population
As an illustration of the procedure for estimating population sensitivity, the artificial
data in Table 1 are used. The data relate to 30 consumers, drawn randomly from a
consumer population, and consist of individual $d'_i$ values and their variances on both the original
and log scales.
In order to estimate the sensitivity of the population that the 30 consumers
represent, an iterative procedure based on (3) is used to estimate the between-subject
variance, $\hat{\sigma}^2$, on the log scale, for the population. The results for the first ten iterations
are given in the Appendix. The process converges to 0.0995 by the
seventh iteration. According to (2) and (6), the population sensitivity on the log scale is
$d'_p = 0.6351$, with variance $V(d'_p) = 0.0035$. Hence, the
estimated population sensitivity and the 95% confidence limits on the original scale are
$$d'_p = \exp(0.6351) = 1.89;$$
$$d'_L = \exp(0.6351 - 1.96 \times \sqrt{0.0035}) = 1.68;$$
$$d'_U = \exp(0.6351 + 1.96 \times \sqrt{0.0035}) = 2.12.$$
Table 1. Individual $d'$ and its variance for the example in Section 5.1
Table 2. Individual $d'$ and its variance for the example in Section 5.2
References
Bi, J. (2002). Variance of $d'$ for the same-different method. Behavior Research Methods,
Instruments, & Computers, 34(1), 37–45.
Bi, J., Ennis, D. M., & O’Mahony, M. (1997). How to estimate and use the variance of $d'$ from
difference tests. Journal of Sensory Studies, 12, 87–104.
Bliss, C. I. (1934). The method of probits. Science, 79, 38–39.
Cochran, W. G. (1954). The combination of estimates from different experiments. Biometrics, 10,
101–129.
Dorfman, D. D., & Alf, E. Jr. (1968). Maximum likelihood estimation of parameters of signal
detection theory – A direct solution. Psychometrika, 33, 117–124.
Dorfman, D. D., & Alf, E. Jr. (1969). Maximum likelihood estimation of parameters of signal
detection theory and determination of confidence intervals – Rating method data. Journal of
Mathematical Psychology, 6, 487–496.
Dorfman, D. D., & Berbaum, K. S. (1986). RSCORE-J: Pooled rating-method data: A computer
program for analyzing pooled ROC curves. Behavior Research Methods, Instruments, &
Computers, 18, 452–462.
Dorfman, D. D., Berbaum, K. S., & Metz, C. E. (1992). Receiver operating characteristic rating
analysis: generalization to the population of readers and patients with the jackknife method.
Investigative Radiology, 27, 723–731.
Gaddum, J. H. (1933). Reports on biological standards. III: Methods of biological assay depending
on a quantal response. Special Report Series, Medical Research Council, London, No. 183.
Gourevitch, V., & Galanter, E. (1967). A significance test for one parameter isosensitivity functions.
Psychometrika, 32, 25–33.
Green, D. M., & Swets, J. A. (1974). Signal detection theory and psychophysics. Huntington, NY:
Krieger.
Hanley, J. A. (1989). Receiver operating characteristic (ROC) methodology: the state of the art.
Critical Reviews in Diagnostic Imaging, 29, 307–335.
Hautus, M. J. (1997). Calculating estimates of sensitivity from group data: Pooled versus averaged
estimators. Behavior Research Methods, Instruments, & Computers, 29, 556–562.
Irwin, R. J., Hautus, M. J., & Stillman, J. A. (1992). Use of receiver operating characteristic in the
study of taste perception. Journal of Sensory Studies, 7, 291–314.
Keene, O. N. (1995). The log transformation is special. Statistics in Medicine, 14, 811–819.
Macmillan, N. A., & Creelman, C. D. (1991). Detection theory: A user’s guide. Cambridge:
Cambridge University Press.
Macmillan, N. A., & Kaplan, H. L. (1985). Detection theory analysis of group data: Estimating
sensitivity from average hit and false-alarm rates. Psychological Bulletin, 98, 185–199.
Mandel, J. (1991). Evaluation and control of measurements. New York: Marcel Dekker.
Mandel, J., & Paule, R. C. (1970). Interlaboratory evaluation of a material with unequal number of
replicates. Analytical Chemistry, 42, 1194–1197.
Metz, C. E. (1989). Some practical issues of experimental design and data analysis in radiological
ROC studies. Investigative Radiology, 24, 234–245.
Miller, J. (1996). The sampling distribution of $d'$. Perception & Psychophysics, 58, 65–72.
Ogilvie, J. C., & Creelman, C. D. (1966). Maximum likelihood estimations of ROC curve parameters.
Paper read before the Eastern Psychological Association, New York, April.
Ogilvie, J. C., & Creelman, C. D. (1968). Maximum-likelihood estimation of receiver operating
characteristic curve parameters. Journal of Mathematical Psychology, 5, 377–391.
Paule, R. C., & Mandel, J. (1982). Consensus values and weighting factors. Journal of Research of
the National Bureau of Standards, 87, 377–385.
Rukhin, A. L., & Vangel, M. G. (1998). Estimation of a common mean and weighted means
statistics. Journal of the American Statistical Association, 93, 303–308.
Swets, J. A., & Pickett, R. M. (1982). Evaluation of diagnostic systems: Methods from signal
detection theory. New York: Academic Press.
Appendix
Using Newton’s method, the solution of $F(\hat{\sigma}^2) = 0$ can be obtained by the
iterative process
$$\hat{\sigma}^2(n+1) = \hat{\sigma}^2(n) - \frac{F(\hat{\sigma}^2(n))}{F'(\hat{\sigma}^2(n))}, \tag{9}$$
where $\hat{\sigma}^2(n)$ and $\hat{\sigma}^2(n+1)$ denote the values of $\hat{\sigma}^2$ at the $n$th and the $(n+1)$th
iterations, respectively; $F(\hat{\sigma}^2(n))$ denotes the value of the function $F$ at $\hat{\sigma}^2(n)$;
and $F'(\hat{\sigma}^2(n))$ denotes the derivative of $F$ at $\hat{\sigma}^2(n)$. It can be shown that
$$F'(\hat{\sigma}^2) \approx -\sum_{i=1}^{k} \hat{w}_i^2 (d'_i - d'_p)^2.$$
For example, using the data in Table 1 for $d'_i$, $\hat{\sigma}_i^2$ and $k = 30$, and selecting an initial value
$\hat{\sigma}^2 = 0.001$, the values of $\hat{\sigma}^2$ at the first ten iterations are given below. The process
converges to 0.0995 by the seventh iteration.
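| $n$ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| $\hat{\sigma}^2$ | 0.0010 | 0.0058 | 0.0168 | 0.0363 | 0.0632 | 0.0877 | 0.0983 | 0.0995 | 0.0995 | 0.0995 | 0.0995 |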
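The iteration can be sketched in a few lines. The definition of $F$ is not reproduced in this section, so the code assumes the usual Mandel–Paule estimating equation, $F(\hat{\sigma}^2) = \sum \hat{w}_i (d'_i - d'_p)^2 - (k - 1)$ with $\hat{w}_i = 1/(\hat{\sigma}^2 + \hat{\sigma}_i^2)$, and uses the derivative given above. Returning zero when $F(0) \le 0$ is a further assumption. The function name and the five data points are hypothetical; the Table 1 data are not used.

```python
def mandel_paule_sigma2(log_d, var_within, start=0.001, tol=1e-12, max_iter=100):
    """Newton iteration (9) for the between-subject variance.

    Assumes F(s2) = sum(w_i * (d_i - d_p)**2) - (k - 1) with
    w_i = 1 / (s2 + var_within_i); this form is an assumption.
    """
    k = len(log_d)

    def f_and_slope(s2):
        w = [1.0 / (s2 + v) for v in var_within]
        d_p = sum(wi * di for wi, di in zip(w, log_d)) / sum(w)
        f = sum(wi * (di - d_p) ** 2 for wi, di in zip(w, log_d)) - (k - 1)
        slope = -sum(wi ** 2 * (di - d_p) ** 2 for wi, di in zip(w, log_d))
        return f, slope

    # No positive root exists if F(0) <= 0; truncate at zero (assumed convention).
    if f_and_slope(0.0)[0] <= 0:
        return 0.0

    s2 = start
    for _ in range(max_iter):
        f, slope = f_and_slope(s2)
        new_s2 = max(s2 - f / slope, 0.0)  # keep the variance non-negative
        if abs(new_s2 - s2) < tol:
            return new_s2
        s2 = new_s2
    return s2

# Hypothetical log-scale estimates and within-subject variances.
log_d = [0.2, 0.9, 0.5, 1.3, 0.7]
var_within = [0.02, 0.03, 0.025, 0.04, 0.02]
sigma2_hat = mandel_paule_sigma2(log_d, var_within)
print(round(sigma2_hat, 4))
```

Starting below the root, as with the initial value 0.001 used in the example above, the iterates increase towards the solution, which matches the pattern in the table.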