Confidence Intervals For The Mahalanobis Distance: Communication in Statistics-Simulation and Computation March 2001
Benjamin Reiser
Department of Statistics
University of Haifa
Haifa 31905, Israel
ABSTRACT
1. INTRODUCTION
defined to be
$$\delta^2 = (\mu_x - \mu_y)' \, \Sigma^{-1} (\mu_x - \mu_y). \tag{1}$$
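As a concrete illustration of definition (1), the following minimal sketch evaluates δ² for given population means and a common covariance matrix. The function names `solve_linear` and `mahalanobis_sq` are hypothetical, and the numbers are illustrative only; they are not taken from the paper.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def mahalanobis_sq(mu_x, mu_y, sigma):
    """delta^2 = (mu_x - mu_y)' Sigma^{-1} (mu_x - mu_y), as in (1)."""
    diff = [a - b for a, b in zip(mu_x, mu_y)]
    weighted = solve_linear(sigma, diff)  # Sigma^{-1} (mu_x - mu_y)
    return sum(d * w for d, w in zip(diff, weighted))


# With an identity covariance matrix, delta^2 is the squared Euclidean distance.
print(mahalanobis_sq([1.0, 0.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))
# Correlation between the components changes the distance.
print(mahalanobis_sq([1.0, 1.0], [0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]]))
```

Solving the linear system instead of forming the inverse explicitly is a design choice of this sketch; both give the same value of δ².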
$$A = \Phi(\delta / \sqrt{2}). \tag{4}$$
Reiser and Faraggi (1997) develop a confidence interval for A.
If a confidence interval for δ² could be obtained, confidence intervals for
OVL, OER and A would automatically follow, as these are all monotone
transformations of δ².
Madansky and Olkin (1969) provide an approximate method based on the
asymptotic distribution of the likelihood ratio statistic, which could be applied to
provide an approximate confidence interval for (1). A Bayesian approach for
inference on (1) (Radhakrishnan, 1984) is possible but we will not consider it in
this paper.
In Section 2 we review the approach Reiser and Faraggi (1997) use to obtain
a confidence interval for A and point out that it provides an almost exact
solution for the confidence interval of δ² and consequently for OER, OVL and
A. In Section 3 we further examine the properties of this solution by means of a
simulation study and provide an example. Some concluding comments are
provided in Section 4.
$$\frac{mn}{m+n} \cdot \frac{m+n-p-1}{(m+n-2)\,p} \, \hat{\delta}^2 = D^2 \sim F_{p,\,m+n-p-1}(\lambda) \tag{5}$$
with $\lambda = \frac{mn}{m+n}\,\delta^2$, where $F_{v_1,v_2}(\lambda)$ denotes a non-central F variate with $v_1$ and
$v_2$ degrees of freedom and non-centrality parameter $\lambda$
interval for δ². If Prob(F_{p,m+n-p-1}(0) ≤ D²) is less than 1-α/2 [α/2], then no
solution is obtained for (6) [(7)] and the lower [upper] bound for δ² is taken to be zero. If
not for this restriction, the interval obtained from solving (6) and (7) would have
exactly 1-α coverage. By solving only one of (6) or (7), the corresponding
one-sided confidence bounds are obtained. Frequently, when dealing with
distances or functionals such as OER, one-sided bounds are of greater interest
than two-sided intervals.
We numerically investigate the coverage of this procedure in Section 3. Lam
(1987) provides graphs which give solutions of (6) and (7) for certain values of
α and of the degrees of freedom. We had no difficulty in finding numerical
solutions using the GAUSS Nonlinear Equations program.
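The numerical solution can be sketched in Python. This is an illustration under stated assumptions, not the GAUSS program used in the paper: it assumes that (6) and (7) take the usual pivotal form, namely that the lower bound solves Prob(F_{p,m+n-p-1}(λ) ≤ D²) = 1-α/2 and the upper bound solves the same equation with α/2, with λ = mnδ²/(m+n) as in (5). The non-central F distribution function is written out with the standard library only (a Poisson mixture of incomplete beta functions); a library routine such as `scipy.stats.ncf` would normally be used instead. All function names are hypothetical.

```python
from math import exp, lgamma, log, log1p


def _betacf(a, b, x):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c, d = 1.0, 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, 500):
        for num in (m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m)),
                    -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1))):
            d = 1.0 + num * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + num / c
            c = c if abs(c) > tiny else tiny
            h *= d * c
        if abs(d * c - 1.0) < 1e-14:
            break
    return h


def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b) + a * log(x) + b * log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def ncf_cdf(x, dfn, dfd, lam):
    """CDF of the non-central F as a Poisson(lam/2) mixture of beta CDFs."""
    if x <= 0.0:
        return 0.0
    y = dfn * x / (dfn * x + dfd)
    if lam <= 0.0:
        return betainc(dfn / 2.0, dfd / 2.0, y)
    half, total, j = lam / 2.0, 0.0, 0
    while True:
        weight = exp(-half + j * log(half) - lgamma(j + 1.0))
        total += weight * betainc(dfn / 2.0 + j, dfd / 2.0, y)
        # Stop once past the Poisson mode and the weights are negligible.
        if j > half and weight < 1e-16:
            return total
        j += 1


def mahalanobis_ci(delta2_hat, m, n, p, alpha=0.05):
    """Two-sided interval for delta^2 by inverting the non-central F CDF."""
    nu = m + n - p - 1
    scale = m * n / (m + n)                            # lambda = scale * delta^2
    d2 = scale * nu / ((m + n - 2) * p) * delta2_hat   # statistic D^2 in (5)

    def bound(target):
        # The CDF at D^2 decreases in lambda; with no root the bound is zero.
        if ncf_cdf(d2, p, nu, 0.0) <= target:
            return 0.0
        lo, hi = 0.0, 1.0
        while ncf_cdf(d2, p, nu, hi) > target:
            lo, hi = hi, 2.0 * hi
        for _ in range(60):                            # bisection on lambda
            mid = 0.5 * (lo + hi)
            if ncf_cdf(d2, p, nu, mid) > target:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi) / scale

    return bound(1.0 - alpha / 2.0), bound(alpha / 2.0)


# Sample sizes and estimate taken from the example in Section 3.
lower, upper = mahalanobis_ci(0.359, m=114, n=33, p=3, alpha=0.05)
print(round(lower, 3), round(upper, 3))
```

For these inputs the paper reports the interval (0, 0.875); under the assumptions above the sketch should return a zero lower bound and an upper bound close to that value, up to rounding of the inputs. A one-sided bound follows by calling the inner root-finder with α in place of α/2.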
If the true value of δ² is large, (6) and (7) will generally be solvable.
Solutions will tend not to be available for small δ². Denote by F^γ_{p,m+n-p-1} the γ
$$D^2 \le F^{1-\alpha/2}_{p,\,m+n-p-1} \tag{8}$$
while the upper bound for δ² will be taken to be zero if and only if
$$D^2 \le F^{\alpha/2}_{p,\,m+n-p-1}. \tag{9}$$
Note that in the extreme case of the true value of δ² being 0, events (8) and (9)
δ² quite small, but not zero, the actual coverage will tend to be close to the
nominal value of 1-α since the probabilities in (i), (ii) and (iii) will not change
very much but coverage will tend to occur only in case (ii). Reiser and Faraggi
(1999) have made similar remarks for the univariate (p=1) problem.
out the simulation using two multivariate normal populations with common
covariance matrix taken to be the identity matrix and choose the population
means to provide a range of values for δ².
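One replicate of such a simulation might be generated as in the following sketch. It assumes that the estimate of δ² is the plug-in form (x̄ - ȳ)′ S⁻¹ (x̄ - ȳ), with S the pooled sample covariance matrix using divisor m+n-2; the function name `simulate_delta2_hat` and the seed are hypothetical.

```python
import random


def simulate_delta2_hat(mu_x, mu_y, m, n, rng):
    """Draw two N(mu, I) samples and return the plug-in estimate of delta^2."""
    p = len(mu_x)
    xs = [[rng.gauss(mu, 1.0) for mu in mu_x] for _ in range(m)]
    ys = [[rng.gauss(mu, 1.0) for mu in mu_y] for _ in range(n)]
    mean_x = [sum(col) / m for col in zip(*xs)]
    mean_y = [sum(col) / n for col in zip(*ys)]
    # Pooled covariance matrix with divisor m + n - 2 (an assumption here).
    s = [[0.0] * p for _ in range(p)]
    for sample, mean in ((xs, mean_x), (ys, mean_y)):
        for obs in sample:
            dev = [v - c for v, c in zip(obs, mean)]
            for i in range(p):
                for j in range(p):
                    s[i][j] += dev[i] * dev[j] / (m + n - 2)
    diff = [a - b for a, b in zip(mean_x, mean_y)]
    # Solve S w = diff by Gauss-Jordan elimination (S is positive definite).
    aug = [row + [d] for row, d in zip(s, diff)]
    for col in range(p):
        piv = aug[col][col]
        aug[col] = [v / piv for v in aug[col]]
        for row in range(p):
            if row != col:
                f = aug[row][col]
                aug[row] = [v - f * c for v, c in zip(aug[row], aug[col])]
    return sum(d * aug[i][p] for i, d in enumerate(diff))


rng = random.Random(2001)
# Means chosen so that the true delta^2 equals 1 under an identity covariance.
estimates = [simulate_delta2_hat([1.0, 0.0], [0.0, 0.0], 10, 10, rng)
             for _ in range(5)]
print([round(e, 3) for e in estimates])
```

Because the covariance matrix is the identity, placing the whole mean difference in one coordinate is enough to reach any target value of δ².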
An extensive simulation study was carried out for δ² = 0, 0.0001, 0.125,
0.25, 0.5, 1 and 2; α = 0.05 and 0.10; m = n = 10, 20, 50; and p = 2, 3 and 5. For
every combination of these parameter values, 10,000 simulations were carried
out. Confidence intervals for δ² were computed as described in
Section 2. The observed proportion of cases in which the confidence intervals
contained the true value of δ² was noted and is denoted by CP in Table I. For
the sake of brevity we provide results only for some of the δ² values. The
simulations were programmed in GAUSS (1994). In addition, Table I presents
the proportion of cases in which the true value fell below (above) the lower
(upper) confidence bound. We denote these proportions by LT and RT respectively. These
measure the adequacy of the coverage of one-sided confidence bounds. An
estimated probability marked in bold face indicates that the 95% confidence
interval for that probability (based on a binomial sample of 10,000 simulated
data sets) does not contain the targeted nominal value.
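The bold-face rule can be illustrated with a short check. The paper does not state which binomial interval was used; this sketch assumes the usual normal-approximation (Wald) interval, and the function name `flags_departure` is hypothetical.

```python
from math import sqrt


def flags_departure(p_hat, nominal, n_sim=10_000, z=1.96):
    """Return True if the Wald 95% interval for p_hat excludes the nominal value."""
    half_width = z * sqrt(p_hat * (1.0 - p_hat) / n_sim)
    return not (p_hat - half_width <= nominal <= p_hat + half_width)


# Two coverage estimates from Table I (p = 5, delta^2 = 0.125, nominal 0.95).
print(flags_departure(0.948, 0.95))  # m = n = 10 -> False
print(flags_departure(0.940, 0.95))  # m = n = 20 -> True
```

With 10,000 replicates the half-width is roughly 0.004 to 0.005 near 0.95, so an estimate can be flagged while still being close to the nominal value in practical terms.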
The confidence interval gives simulated coverages close to their nominal
values except for the case of δ² = 0. For this case the coverage is conservative
and is in fact 1-α/2 instead of 1-α. The interval is asymmetric, with all cases
falling outside the interval falling below the lower bound. This is as expected
based on our theoretical explanation in Section 2. For all other cases LT is close
to RT.
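The 1-α/2 coverage at δ² = 0 can be checked directly from conditions (8) and (9), assuming that F^γ denotes the γ quantile of the central F distribution. The interval covers zero exactly when its lower bound is zero, so:

```latex
\begin{align*}
\delta^2 = 0 \;&\Rightarrow\; D^2 \sim F_{p,\,m+n-p-1}(0), \\
\Pr(\text{interval covers } 0)
  &= \Pr(\text{lower bound} = 0)
   = \Pr\!\left(D^2 \le F^{1-\alpha/2}_{p,\,m+n-p-1}\right)
   = 1 - \alpha/2, \\
\Pr(\text{upper bound} = 0)
  &= \Pr\!\left(D^2 \le F^{\alpha/2}_{p,\,m+n-p-1}\right)
   = \alpha/2 .
\end{align*}
```

All misses therefore occur with a strictly positive lower bound, which matches the δ² = 0 rows of Table I, where LT is close to α/2 and RT is 0.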
One should note that for δ² ≠ 0, even in the cases (bold) in which the
approximate confidence intervals for the probabilities do not include the nominal
value, the estimated probabilities are quite close to their nominal values. The
differences between the estimated and nominal probabilities have little practical
importance. The performance of this procedure does not depend on p.
TABLE I. Coverage Probabilities for the Mahalanobis Distance based on 10,000 Simulations.

α = 0.05 (nominal coverage 0.95)
            m=n=10            m=n=20            m=n=50
p   δ²      CP    LT    RT    CP    LT    RT    CP    LT    RT
2   0       .975  .025  0     .977  .023  0     .975  .025  0
3   0       .972  .028  0     .972  .028  0     .976  .024  0
5   0       .974  .026  0     .966  .034  0     .974  .026  0
2   .0001   .949  .025  .026  .949  .025  .026  .953  .023  .024
3   .0001   .948  .027  .025  .948  .027  .025  .953  .025  .022
5   .0001   .949  .026  .025  .943  .032  .025  .946  .028  .026
2   .125    .952  .025  .023  .952  .025  .023  .955  .023  .022
3   .125    .954  .023  .023  .954  .023  .023  .953  .024  .023
5   .125    .948  .027  .025  .940  .033  .027  .947  .027  .026
2   1.0     .950  .025  .025  .950  .025  .025  .950  .025  .025
3   1.0     .953  .023  .024  .953  .023  .024  .949  .025  .026
5   1.0     .945  .029  .026  .942  .033  .025  .952  .024  .024

α = 0.10 (nominal coverage 0.90)
            m=n=10            m=n=20            m=n=50
p   δ²      CP    LT    RT    CP    LT    RT    CP    LT    RT
2   0       .951  .049  0     .953  .047  0     .949  .051  0
3   0       .947  .053  0     .947  .053  0     .953  .047  0
5   0       .943  .057  0     .937  .063  0     .951  .049  0
2   .0001   .900  .047  .053  .900  .047  .053  .903  .049  .048
3   .0001   .896  .055  .049  .896  .055  .049  .903  .050  .047
5   .0001   .900  .051  .049  .890  .061  .049  .895  .051  .054
2   .125    .899  .050  .051  .899  .050  .051  .904  .049  .047
3   .125    .903  .050  .047  .903  .050  .047  .903  .049  .048
5   .125    .898  .053  .049  .889  .059  .052  .896  .052  .052
2   1.0     .897  .053  .050  .897  .053  .050  .898  .050  .052
3   1.0     .906  .045  .049  .906  .045  .049  .897  .051  .052
5   1.0     .894  .055  .051  .888  .059  .053  .900  .048  .052

Values in bold are the estimated coverage probabilities whose 95% confidence interval (based on a binomial sample of
10,000 simulated data sets) does not include the targeted nominal probability.

TABLE II. Coverage Probabilities for OER based on 10,000 Simulations using the Jackknife.

α = 0.05 (nominal coverage 0.95)
                  m=n=10            m=n=20            m=n=50
p   OER   δ²      CP    LT    RT    CP    LT    RT    CP    LT    RT
2   .500  0       .821  .039  .141  .845  .037  .118  .867  .029  .104
3   .500  0       .738  .019  .243  .801  .021  .178  .836  .017  .147
5   .500  0       .550  .007  .443  .674  .008  .319  .739  .008  .253
2   .498  .0001   .816  .041  .143  .854  .038  .108  .874  .032  .094
3   .498  .0001   .746  .023  .231  .810  .019  .171  .846  .022  .132
5   .498  .0001   .562  .008  .430  .685  .009  .306  .757  .010  .233
2   .430  .125    .865  .050  .084  .897  .049  .055  .919  .040  .041
3   .430  .125    .828  .034  .138  .884  .038  .078  .912  .040  .048
5   .430  .125    .682  .015  .303  .833  .022  .146  .893  .031  .076
2   .301  1.0     .863  .027  .110  .904  .021  .075  .921  .024  .055
3   .301  1.0     .820  .026  .154  .885  .026  .089  .910  .023  .067
5   .301  1.0     .707  .021  .272  .832  .024  .144  .898  .027  .075

α = 0.10 (nominal coverage 0.90)
                  m=n=10            m=n=20            m=n=50
p   OER   δ²      CP    LT    RT    CP    LT    RT    CP    LT    RT
2   .500  0       .739  .049  .212  .767  .046  .187  .789  .037  .174
3   .500  0       .644  .028  .327  .709  .029  .263  .741  .023  .236
5   .500  0       .459  .010  .531  .572  .011  .417  .625  .012  .363
2   .498  .0001   .736  .052  .213  .770  .049  .181  .801  .040  .159
3   .498  .0001   .651  .031  .318  .718  .026  .256  .757  .031  .212
5   .498  .0001   .461  .013  .526  .580  .014  .406  .655  .015  .330
2   .430  .125    .804  .066  .131  .846  .065  .088  .866  .063  .071
3   .430  .125    .759  .047  .194  .883  .059  .119  .856  .062  .082
5   .430  .125    .600  .024  .376  .757  .036  .208  .833  .049  .118
2   .301  1.0     .803  .048  .149  .845  .045  .110  .862  .048  .090
3   .301  1.0     .754  .049  .198  .822  .048  .130  .854  .046  .100
5   .301  1.0     .630  .036  .333  .764  .044  .192  .834  .050  .116

Our simulation study indicates that the confidence interval for the
Mahalanobis distance (and consequently for OVL, OER and A) presented in this
paper performs well even for small sample sizes and all δ² except for δ²
exactly zero, which does not really occur in practice.
As an example of this methodology, consider the data discussed by Srivastava
and Carter (1983, pp. 236-237), in which three psychological tests are administered
to 114 patients suffering from anxiety and 33 patients suffering from hysteria.
The estimate of the Mahalanobis distance δ² is 0.359, and the resulting 95%
confidence interval obtained from (6) and (7) is (0, 0.875). Depending on the
particular focus of scientific interest, the corresponding confidence intervals for
any of the measures described in Section 1 can be obtained. For example, the
95% confidence interval for OER is (0.32, 0.5), indicating, in the context of
discriminant analysis, a high probability of misclassifying patients.
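The step from the interval for δ² to the interval for OER can be sketched as follows. The forms OER = Φ(-δ/2) and OVL = 2Φ(-δ/2) are assumed here as the standard expressions for two normal populations with a common covariance matrix, and A is taken as Φ(δ/√2); the assumed OER form reproduces the interval quoted above. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal distribution function


def oer(delta2):
    """Overall error rate, assumed here to be Phi(-delta / 2)."""
    return PHI(-sqrt(delta2) / 2.0)


def ovl(delta2):
    """Overlapping coefficient, assumed here to be 2 * Phi(-delta / 2)."""
    return 2.0 * PHI(-sqrt(delta2) / 2.0)


def roc_area(delta2):
    """Generalized ROC criterion A, taken here as Phi(delta / sqrt(2))."""
    return PHI(sqrt(delta2) / sqrt(2.0))


def transform_interval(lower, upper, func):
    """Map an interval for delta^2 through a monotone function."""
    a, b = func(lower), func(upper)
    return (min(a, b), max(a, b))


# Interval for delta^2 reported in the example above.
print([round(v, 2) for v in transform_interval(0.0, 0.875, oer)])  # [0.32, 0.5]
```

OER and OVL decrease in δ², so the endpoints swap roles under the transformation, while A increases and keeps their order; taking the minimum and maximum handles both cases.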
An alternative confidence interval for the OER can be obtained using the
jackknife-based methodology of Dorvlo (1992). In order to compare this
procedure with that based on solving (6) and (7), we repeated the simulation
study summarized in Table I for the jackknife-based method. The results are
presented in Table II. Due to the one-to-one relationship in (2) between the
Mahalanobis distance and OER, the two tables are directly comparable. It is clear
that the jackknife provides coverage substantially below the nominal level. The
coverage increases with sample size but is still inadequate for m=n=50,
and it decreases with increasing p. We also compared the average length of the
confidence intervals for the two methods. In the majority of cases the average
length of the jackknife-based confidence intervals was substantially greater than
that of the non-central F based procedure. In the few cases where it was shorter,
the differences were quite small. It is important to note that, due to the clear
inadequacy in the coverage of the jackknife procedure, there is not much value in
comparing these lengths. For OER, interest is often focused on an upper
confidence bound. Comparing the columns headed RT in Tables I and II shows
that the jackknife does particularly poorly on the upper bound, while the
non-central F based method performs well.
4. CONCLUDING REMARKS
ACKNOWLEDGEMENTS
The author thanks the referee for comments which led to an improved
presentation.
BIBLIOGRAPHY
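Rao, P.S.R.S. and Dorvlo, A.S.S. The Jackknife Procedure for the Probabilities
of Misclassification. Communications in Statistics - Simulation and
Computation, 1985, 14, 774-790.
Srivastava, M.S. and Carter, E.M. An Introduction to Applied Multivariate
Statistics, 1983, North Holland: New York.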