
Communications in Statistics - Simulation and Computation

ISSN: 0361-0918 (Print) 1532-4141 (Online) Journal homepage: http://www.tandfonline.com/loi/lssp20

Stabilizing the Performance of Kurtosis Estimator of Multivariate Data

S. Ejaz. Ahmed , M. Hafidz Omar & Anwar H. Joarder

To cite this article: S. Ejaz. Ahmed , M. Hafidz Omar & Anwar H. Joarder (2012) Stabilizing
the Performance of Kurtosis Estimator of Multivariate Data, Communications in Statistics -
Simulation and Computation, 41:10, 1860-1871, DOI: 10.1080/03610918.2011.624237

To link to this article: http://dx.doi.org/10.1080/03610918.2011.624237

Published online: 13 Jun 2012.

Communications in Statistics—Simulation and Computation® , 41: 1860–1871, 2012
Copyright © Taylor & Francis Group, LLC
ISSN: 0361-0918 print/1532-4141 online
DOI: 10.1080/03610918.2011.624237

S. EJAZ. AHMED¹, M. HAFIDZ OMAR², AND ANWAR. H. JOARDER²

¹ Department of Mathematics and Statistics, University of Windsor, Windsor, Ontario, Canada
² Department of Mathematics and Statistics, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia

The estimation of the kurtosis parameter of the underlying distribution plays a
central role in many statistical applications. The central theme of the article is
to improve the estimation of the kurtosis parameter using a priori information.
More specifically, we consider the problem of estimating the kurtosis parameter of a
multivariate population when some prior information regarding the parameter is
available. The rationale is that the sample estimator of the kurtosis parameter has a
large estimation error. In this situation we consider shrinkage and pretest estimation
methodologies and reappraise their statistical properties. The estimation based on
these strategies yields a relatively smaller estimation error in comparison with the
sample estimator in the candidate subspace. A large sample theory of the suggested
estimators is developed and compared. The results demonstrate that the suggested
estimators outperform the estimator based on the sample data only in the candidate
subspace. In an effort to appreciate the relative behavior of the estimators in a finite
sample scenario, a Monte-Carlo simulation study is planned and performed. The
result of the simulation study strongly corroborates the asymptotic result. To illustrate
the application of the estimators, some examples are showcased based on recently
published data.

Keywords Asymptotic properties; Kurtosis; Local alternatives; Monte-Carlo
simulation; Parameter stability; Relative precision; Shrinkage and pretest
estimation.

Mathematics Subject Classification Primary 62E20, 65C05; Secondary 62F12,
62J07, 65C60, 62F03, 62F05.

Received December 23, 2010; Accepted September 8, 2011

Address correspondence to M. Hafidz Omar, Department of Mathematics and Statistics,
King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia; E-mail:
omarmh@kfupm.edu.sa

1. Introduction and Preliminaries

Skewness and kurtosis have been used in tests of normality, robustness, outliers,
modified tests and other situations. The kurtosis parameter is embedded in many
inference problems (see Douglas, 2006; An and Ahmed, 2008). For example, the
estimation of the asymptotic variance for process capability indices, the coefficient of
variation, and effect size indices depends on the kurtosis parameter as well as other
parameters. More importantly, estimating the kurtosis of the underlying distribution is
exceedingly important in implementing the restricted maximum likelihood (REML)
procedure, since the asymptotic distribution of the REML estimator of the ratio
of two variances depends on the estimation of the kurtosis parameter (Jiang, 1996,
1997). The asymptotic variance of many important indices is a function of the kurtosis
parameter, and hence an accurate and precise estimation of kurtosis is essential. Kim
and White (2004) argued that the role of higher moments has become increasingly
important in the financial literature mainly because the traditional measure of risk,
the variance, has failed to capture fully the “true risk” of the distribution of stock
market returns (see also Harvey and Siddique, 2000). In its own right, kurtosis
measures the “peakedness” of a distribution; a distribution whose kurtosis exceeds
three is called “leptokurtic” and is usually associated with heavier tails than those of
the normal distribution. Thus, this research is motivated by the diverse applications and
involvement of the sample kurtosis in the arena of statistical inference. Note that
the kurtosis parameter estimation is not “stable,” especially in the presence of a few
outliers. For the above reasons, we consider some alternative estimation strategies
for the kurtosis parameter in this paper. Our objective is to combine sample and
non-sample information (NSI) in the estimation process for the kurtosis parameter
of a multivariate normal distribution.

1.1. Model and Statement of the Problem


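Let $X$ be a $p$-dimensional random variable with mean vector $\mu$ and covariance
matrix $\Sigma$. Then the kurtosis parameter is defined by

$$\kappa = E\left\{\left[(X - \mu)' \Sigma^{-1} (X - \mu)\right]^2\right\}.$$

For the multivariate normal distribution, we have $\kappa = p(p + 2)$. For the bivariate case,
with $p = 2$, $\kappa$ has a simplified version in terms of centered product moments
(Joarder and Abujiya, 2008). The estimate of the kurtosis measure based on a sample
$(X_1, \ldots, X_n)$ is given by

$$\hat\kappa = \frac{1}{n} \sum_{i=1}^{n} \left[(X_i - \bar X)' S^{-1} (X_i - \bar X)\right]^2,$$

where

$$\bar X = \frac{1}{n} \sum_{i=1}^{n} X_i \quad \text{and} \quad S = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar X)(X_i - \bar X)'.$$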
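As a concrete illustration of the sample estimator defined above, the following minimal sketch computes the sample kurtosis for a small data matrix using only the Python standard library. The function names `solve` and `sample_kurtosis` and the toy data are hypothetical choices made for this illustration; they are not taken from the article.

```python
def solve(a, b):
    """Solve a y = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(m)]  # augmented matrix [a | b]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, m + 1):
                aug[r][k] -= factor * aug[col][k]
    y = [0.0] * m
    for i in reversed(range(m)):  # back substitution
        tail = sum(aug[i][k] * y[k] for k in range(i + 1, m))
        y[i] = (aug[i][m] - tail) / aug[i][i]
    return y


def sample_kurtosis(rows):
    """Mean of squared quadratic forms d' S^{-1} d, with S using divisor n - 1."""
    n, p = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(p)]
    dev = [[r[j] - mean[j] for j in range(p)] for r in rows]
    s = [[sum(d[j] * d[k] for d in dev) / (n - 1) for k in range(p)]
         for j in range(p)]
    total = 0.0
    for d in dev:
        y = solve(s, d)  # y = S^{-1} d without forming the inverse explicitly
        total += sum(dj * yj for dj, yj in zip(d, y)) ** 2
    return total / n


# Hypothetical toy data: the four corners of a square in the plane.
corners = [[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]]
print(sample_kurtosis(corners))
```

Solving the linear system for each centered observation avoids forming the inverse of S explicitly; for large samples a matrix library would be the more natural choice.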

Note that $\hat\kappa$ is sensitive to outliers or unusual observations, and a few contaminated
observations will have an adverse effect on its magnitude and consequently on its
estimated asymptotic variance. For this reason, we plan to stabilize the kurtosis
parameter estimation by incorporating the available NSI in the estimation process.
The article is organized as follows. The preliminary test method and the
improved estimation procedures based on shrinkage are considered in Sec. 2 along
with some asymptotic results. In Sec. 3, we compare our estimators with the sample
estimate and show that our methods are asymptotically superior to the sample
estimate when the NSI is nearly credible. The results of the simulation experiment
are given in Sec. 4. The examples are given in Sec. 5. Finally, we provide concluding
remarks in Sec. 6.

2. Improved Estimation Strategies


Our main focus here is to improve the estimation of $\kappa$ when it is generally
assumed that the sample data may come from a distribution that is fairly close
to a multivariate normal distribution. The data may be contaminated by a few
observations, which will have a very negative impact on the sample estimate,
$\hat\kappa$. Hence, in an effort to stabilize the parameter estimation of $\kappa$, we consider
the problem of estimating $\kappa$ when some prior information regarding the kurtosis
parameter is available. In a number of real world problems, the practitioner may
have both an approximation of $\kappa$ that provides a constant $\kappa_o$ and sample
information that provides a point estimator $\hat\kappa$. The quality of $\kappa_o$ is unknown;
however, the analyst appreciates its ability to approximate $\kappa$. Our problem is to
combine the approximation $\kappa_o$ and the sample result $\hat\kappa$. Consequently, we consider
estimators based on shrinkage and pretest estimation.
Suppose the analyst wishes to report the point estimator defined by the linear
combination

$$\hat\kappa^{S} = c\kappa_o + (1 - c)\hat\kappa, \qquad (2.1)$$

in which we would choose, in ideal circumstances, the coefficient $c$ so as to minimize
the mean squared error (MSE). Further, $c$ may also be defined as the degree of
confidence in the prior information $\kappa_o$. The value of $c \in [0, 1]$ may be assigned
by the experimenter according to confidence in the prior value $\kappa_o$. If $c = 0$,
then we use the sample data only. We may choose an estimator of the optimal $c$ that
minimizes the variance. However, the optimal value of $c$ depends on the unknown
parameter $\kappa$ and thus it is not accessible. Estimators constructed as linear (or, more
precisely, convex) combinations of other estimators or guessed values, as in (2.1),
are called composite estimators. The composite estimator $\hat\kappa^{S}$ can be interpreted as a
shrinkage estimator (SE), as it shrinks the sample estimator $\hat\kappa$ towards $\kappa_o$. Ledoit
and Wolf (2003) applied this strategy to estimate the covariance matrix. They
suggested that shrinking the MLE of the covariance matrix towards structured
covariance matrices can produce a relatively small estimation error in comparison
with the MLE. Ahmed and Krzanowski (2004) and others pointed out that such an
estimator yields a smaller MSE when the a priori information $\kappa_o$ is correct or nearly
correct. We will demonstrate that $\hat\kappa^{S}$ will have a smaller MSE than $\hat\kappa$ when $\kappa$ is close
to $\kappa_o$. However, $\hat\kappa^{S}$ becomes considerably biased and inefficient when the restriction
may not be judiciously justified. Thus, the performance of this shrinkage procedure
depends upon the correctness of the uncertain prior information. As such, when
the prior information is not trustworthy, it may be desirable to formulate a
shrinkage pretest estimator (SPE), denoted by $\hat\kappa^{SP}$, which incorporates a pretest on
$\kappa_o$. Thus, we consider the shrinkage pretest estimator which is defined by

$$\hat\kappa^{SP} = \hat\kappa\, I(\Lambda_n \ge c_\alpha) + \left[(1 - c)\hat\kappa + c\kappa_o\right] I(\Lambda_n < c_\alpha), \qquad (2.2)$$

where $I(A)$ is the indicator function of the set $A$ and $\Lambda_n$ is the test statistic for the null
hypothesis $H_o: \kappa = \kappa_o$, as defined below. We consider testing $H_o: \kappa = \kappa_o$ against
$H_a: \kappa \ne \kappa_o$ (or $\kappa < \kappa_o$ or $\kappa > \kappa_o$). A natural choice of $\kappa_o$ will be $\kappa_o = p(p + 2)$. Hence,
the test statistic is given by

$$\Lambda_n = \frac{n(\hat\kappa - \kappa_o)^2}{8p(p + 2)}.$$

For large $n$ ($\ge 50$) and under the null hypothesis, the test statistic $\Lambda_n$ follows a
$\chi^2$-distribution with one degree of freedom, which provides the asymptotic critical
values. Thus, the critical value $c_\alpha$ of $\Lambda_n$ may be approximated by $\chi^2_{1,\alpha}$, the upper
$100\alpha\%$ critical value of the $\chi^2$ distribution with 1 degree of freedom.
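The pretest described above can be sketched in a few lines. The example below is a minimal illustration that assumes the normal-theory value p(p + 2) as the default null value and obtains the chi-square critical value with one degree of freedom as the square of a standard normal quantile. The function names and the numerical inputs are hypothetical.

```python
from statistics import NormalDist


def pretest_statistic(kappa_hat, n, p, kappa_o=None):
    """n (kappa_hat - kappa_o)^2 / (8 p (p + 2)); kappa_o defaults to p (p + 2)."""
    if kappa_o is None:
        kappa_o = p * (p + 2)
    return n * (kappa_hat - kappa_o) ** 2 / (8 * p * (p + 2))


def chi2_1_upper(alpha):
    """Upper 100*alpha% point of chi-square(1): the square of a N(0, 1) quantile."""
    return NormalDist().inv_cdf(1 - alpha / 2) ** 2


# Hypothetical inputs: kappa_hat = 9.1 from n = 50 bivariate observations.
stat = pretest_statistic(9.1, n=50, p=2)
crit = chi2_1_upper(0.05)
print(stat, crit, "reject H_o" if stat >= crit else "do not reject H_o")
```

The identity used for the critical value holds because a chi-square variable with one degree of freedom is the square of a standard normal variable.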
It is important to note that for a fixed alternative that is different from the null
hypothesis, the power of the test statistic will converge to one as $n \to \infty$. Hence, to
explore the asymptotic power properties of $\Lambda_n$, we confine ourselves to a sequence
of local alternatives $\{K_n\}$. In the present work, such a sequence is specified by

$$K_n: \kappa_n = \kappa_o + \frac{\delta}{\sqrt{n}}, \qquad (2.3)$$

where $\delta$ is a fixed real number. Stochastic convergence of $\hat\kappa$ to the parameter $\kappa$
ensures that $\hat\kappa \xrightarrow{p} \kappa$ under local alternatives as well, where the notation $\xrightarrow{p}$ means
convergence in probability.
The following theorem, which we present without proof, characterizes the
asymptotic powers of the test statistics under local alternatives.

Theorem 2.1. Under local alternatives in (2.3) the following results hold.

1. $\sqrt{n}(\hat\kappa - \kappa_o) \xrightarrow{D} N(\delta, 8p(p + 2))$.
2. $\Lambda_n$ has asymptotically a noncentral $\chi^2$-distribution with 1 degree of freedom and
noncentrality parameter $\Delta = \delta^2 / [8p(p + 2)]$.

Hence, the power calculations of the proposed test statistic can be accomplished
by using the noncentral $\chi^2$-distribution.
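To make the power calculation concrete, the sketch below evaluates the asymptotic power of the pretest as a function of the local shift. It assumes the standard convention in which the square of an N(θ, 1) variable is noncentral chi-square with one degree of freedom and noncentrality θ², so the noncentral distribution function can be written with the normal distribution function alone. The function name `asymptotic_power` and the grid of shifts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def asymptotic_power(delta, p, alpha=0.05):
    """P(Z^2 >= critical value) for Z ~ N(theta, 1), theta = delta / sqrt(8 p (p + 2))."""
    nd = NormalDist()
    root_c = nd.inv_cdf(1 - alpha / 2)  # square root of the chi-square(1) critical value
    theta = delta / sqrt(8 * p * (p + 2))
    return 1.0 - (nd.cdf(root_c - theta) - nd.cdf(-root_c - theta))


# Power for a bivariate population (p = 2) over a hypothetical grid of shifts.
for delta in (0.0, 8.0, 16.0, 24.0, 32.0):
    print(delta, round(asymptotic_power(delta, p=2), 4))
```

At a zero shift the function returns the nominal level, and the power increases towards one as the shift grows in absolute value.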
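Further, the SPE can be written in a more computationally attractive form as
follows:

$$\hat\kappa^{SP} = \hat\kappa - c(\hat\kappa - \kappa_o)\, I(\Lambda_n < c_\alpha). \qquad (2.4)$$

Thus, the classical pretest estimator (PE) is readily obtained by substituting $c = 1$
in the above relation,

$$\hat\kappa^{P} = \hat\kappa - (\hat\kappa - \kappa_o)\, I(\Lambda_n < c_\alpha). \qquad (2.5)$$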

The above PE is due to Bancroft (1944). The proposed SPE (Ahmed, 1992) may
be viewed as an improved PE which reduces to $\hat\kappa$ and to the PE for $c = 0$ and $c = 1$,
respectively. In the literature, a discussion about pretesting can be found in Giles
and Giles (1993), Magnus (1999), Ohtani (1999), Reif and Vlček (2002), and Khan
and Ahmed (2003), among many others.
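The three estimators of this section differ only in how they react to the outcome of the pretest. The following minimal sketch expresses the shrinkage estimator and the indicator form of the shrinkage pretest estimator directly; the function names and all numerical inputs are hypothetical, and the test statistic and critical value are assumed to have been computed beforehand.

```python
def shrinkage(kappa_hat, kappa_o, c):
    """Shrinkage estimator: convex combination of the prior value and the sample estimate."""
    return c * kappa_o + (1 - c) * kappa_hat


def shrinkage_pretest(kappa_hat, kappa_o, c, stat, crit):
    """Shrink towards kappa_o only when the pretest does not reject (stat < crit)."""
    indicator = 1.0 if stat < crit else 0.0
    return kappa_hat - c * (kappa_hat - kappa_o) * indicator


def pretest(kappa_hat, kappa_o, stat, crit):
    """Classical pretest estimator: the shrinkage pretest estimator with c = 1."""
    return shrinkage_pretest(kappa_hat, kappa_o, 1.0, stat, crit)


# Hypothetical values: sample estimate 9.0, prior value 8.0, critical value 3.84.
for stat in (1.0, 5.0):
    print(stat,
          shrinkage(9.0, 8.0, 0.5),
          pretest(9.0, 8.0, stat, 3.84),
          shrinkage_pretest(9.0, 8.0, 0.5, stat, 3.84))
```

When the pretest does not reject, the shrinkage pretest estimator coincides with the shrinkage estimator; when it rejects, both pretest-type estimators fall back to the sample estimate.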

3. Asymptotic Bias and Mean Squared Error

Note that our results are based on the asymptotic normality of $\hat\kappa$, so our results
will be of an asymptotic nature. The asymptotic bias of an estimator $\tilde\kappa$ of $\kappa$ is
defined as

$$AB(\tilde\kappa) = \lim_{n \to \infty} E\left[\sqrt{n}(\tilde\kappa - \kappa)\right]. \qquad (3.1)$$

Under the local alternatives, $AB(\hat\kappa^{S}) = -c\delta$ is an unbounded function of $\delta$. The
expression of $AB(\hat\kappa^{SP})$ is obtained with the aid of the following lemma from Judge
and Bock (1978).

Lemma 3.1. Let $Z \sim N(\theta, 1)$. Then we have the following:

$$E\left[Z\, I(0 < Z^2 < x)\right] = \theta\, P\left(\chi^2_{3,\theta^2/2} < x\right),$$

where $\chi^2_{3,\theta^2/2}$ is distributed as a chi-square with 3 degrees of freedom and noncentrality
parameter $\theta^2/2$.

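Applying the lemma to (2.4) gives $AB(\hat\kappa^{SP}) = -c\delta\, G_3(\chi^2_{1,\alpha}; \Delta)$, where $G_\nu(x; \Delta)$
denotes the distribution function of a noncentral chi-square with $\nu$ degrees of freedom
and noncentrality parameter $\Delta$. Since $\lim_{\delta \to \infty} \delta\, G_3(\chi^2_{1,\alpha}; \Delta) = 0$, we can safely
conclude that $\hat\kappa^{SP}$ is asymptotically unbiased with respect to $\delta$. For $c = 1$,
$AB(\hat\kappa^{P}) = -\delta\, G_3(\chi^2_{1,\alpha}; \Delta)$.
The quantities $AB(\hat\kappa^{SP})$ and $AB(\hat\kappa^{P})$ are 0 at $\delta = 0$. The bias functions of both
pretest estimators increase in magnitude to a maximum as $\delta$ increases, then decrease towards 0
as $\delta$ further increases. Also, it is seen from the bias expression that the larger the
value of $c$ is, the greater the variation in the bias values is.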
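The behavior of the bias functions can be checked numerically. The sketch below is an illustration under stated assumptions: rather than evaluating a noncentral chi-square distribution function, it uses the closed form of the truncated normal mean E[Z 1{|Z| < a}] for Z ~ N(θ, 1), which by Lemma 3.1 plays the role of the G_3 term multiplied by θ. The function names and the grid of shifts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def truncated_mean(theta, a):
    """E[Z 1{|Z| < a}] for Z ~ N(theta, 1), in closed form."""
    lo, hi = -a - theta, a - theta  # bounds for W = Z - theta ~ N(0, 1)
    return (STD_NORMAL.pdf(lo) - STD_NORMAL.pdf(hi)
            + theta * (STD_NORMAL.cdf(hi) - STD_NORMAL.cdf(lo)))


def asymptotic_bias(delta, p, c, alpha=0.05):
    """Asymptotic biases of the S, P and SP estimators at local shift delta."""
    sigma = sqrt(8 * p * (p + 2))
    a = STD_NORMAL.inv_cdf(1 - alpha / 2)  # sqrt of the chi-square(1) critical value
    m1 = truncated_mean(delta / sigma, a)
    return {"S": -c * delta, "P": -sigma * m1, "SP": -c * sigma * m1}


# Hypothetical grid of shifts for a bivariate population with c = 0.5.
for delta in (0.0, 8.0, 16.0, 40.0, 80.0):
    print(delta, asymptotic_bias(delta, p=2, c=0.5))
```

Under these assumptions the bias of the shrinkage estimator grows linearly in the shift, while the biases of the two pretest-type estimators vanish both at a zero shift and for very large shifts.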
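Under the local alternatives in (2.3), we present the expressions for the AMSE
for the estimators under consideration.

$$AMSE(\hat\kappa^{S}) = AMSE(\hat\kappa) - AMSE(\hat\kappa)\, c(2 - c) + AMSE(\hat\kappa)\, c^2 \Delta,$$

where $AMSE(\hat\kappa) = 8p(p + 2)$, and

$$AMSE(\hat\kappa^{SP}) = AMSE(\hat\kappa) - AMSE(\hat\kappa)\, c(2 - c)\, G_3(\chi^2_{1,\alpha}; \Delta)
+ AMSE(\hat\kappa)\, c\Delta \left\{2G_3(\chi^2_{1,\alpha}; \Delta) - (2 - c)\, G_5(\chi^2_{1,\alpha}; \Delta)\right\}.$$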

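The expression of $AMSE(\hat\kappa^{SP})$ is readily obtained with the use of the following
lemma from Judge and Bock (1978).

Lemma 3.2. Let $Z \sim N(\theta, 1)$. Then we have the following:

$$E\left[Z^2 I(0 < Z^2 < x)\right] = P\left(\chi^2_{3,\theta^2/2} < x\right) + \theta^2 P\left(\chi^2_{5,\theta^2/2} < x\right).$$

For $c = 1$ we get the AMSE of $\hat\kappa^{P}$ as follows:

$$AMSE(\hat\kappa^{P}) = 8p(p + 2) + 8p(p + 2)\Delta \left\{2G_3(\chi^2_{1,\alpha}; \Delta) - G_5(\chi^2_{1,\alpha}; \Delta)\right\}
- 8p(p + 2)\, G_3(\chi^2_{1,\alpha}; \Delta)$$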
and $AMSE(\hat\kappa^{P}) \ge AMSE(\hat\kappa)$ whenever

$$\Delta \ge G_3(\chi^2_{1,\alpha}; \Delta) \left\{2G_3(\chi^2_{1,\alpha}; \Delta) - G_5(\chi^2_{1,\alpha}; \Delta)\right\}^{-1}. \qquad (3.2)$$

Thus, we notice that the range of the parameter space in (2.4) is smaller.
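The risk difference

$$AMSE(\hat\kappa^{P}) - AMSE(\hat\kappa^{SP}) = 8p(p + 2)\Delta \left\{2(1 - c)\, G_3(\chi^2_{1,\alpha}; \Delta) - (1 - c)^2 G_5(\chi^2_{1,\alpha}; \Delta)\right\}
- 8p(p + 2)(1 - c)^2 G_3(\chi^2_{1,\alpha}; \Delta)$$

suggests that $AMSE(\hat\kappa^{P}) \le AMSE(\hat\kappa^{SP})$ when

$$\Delta \le (1 - c)\, G_3(\chi^2_{1,\alpha}; \Delta) \left\{2G_3(\chi^2_{1,\alpha}; \Delta) - (1 - c)\, G_5(\chi^2_{1,\alpha}; \Delta)\right\}^{-1}.$$

Thus, $\hat\kappa^{SP}$ outshines $\hat\kappa^{P}$ when

$$\Delta > (1 - c)\, G_3(\chi^2_{1,\alpha}; \Delta) \left\{2G_3(\chi^2_{1,\alpha}; \Delta) - (1 - c)\, G_5(\chi^2_{1,\alpha}; \Delta)\right\}^{-1}.$$

However, at $\Delta = 0$, the shrinkage estimator will be the best choice. Also, both
pretest estimators have smaller AMSE than that of $\hat\kappa$ in the candidate space.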
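The AMSE comparisons of this section can be explored numerically. The sketch below is a hedged illustration: it evaluates the AMSE expressions through the first two truncated moments of Z ~ N(θ, 1), with θ² equal to the noncentrality Δ, which by Lemmas 3.1 and 3.2 stand in for the G_3 and G_5 terms. The function names `truncated_moments` and `amse`, and the chosen shifts, are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def truncated_moments(theta, a):
    """Return E[Z 1{|Z| < a}] and E[Z^2 1{|Z| < a}] for Z ~ N(theta, 1)."""
    lo, hi = -a - theta, a - theta  # bounds for W = Z - theta ~ N(0, 1)
    mass = STD_NORMAL.cdf(hi) - STD_NORMAL.cdf(lo)
    d1 = STD_NORMAL.pdf(lo) - STD_NORMAL.pdf(hi)  # integral of w phi(w) over (lo, hi)
    d2 = mass + lo * STD_NORMAL.pdf(lo) - hi * STD_NORMAL.pdf(hi)  # integral of w^2 phi(w)
    m1 = d1 + theta * mass
    m2 = d2 + 2 * theta * d1 + theta ** 2 * mass
    return m1, m2


def amse(delta, p, c, alpha=0.05):
    """Asymptotic MSE of the sample, S, P and SP estimators at local shift delta."""
    sigma2 = 8 * p * (p + 2)
    theta = delta / sqrt(sigma2)
    a = STD_NORMAL.inv_cdf(1 - alpha / 2)
    m1, m2 = truncated_moments(theta, a)

    def pretest_type(w):
        # 1 - w (2 - w) E[Z^2 I] + 2 w theta E[Z I], scaled by the sample AMSE
        return sigma2 * (1 - w * (2 - w) * m2 + 2 * w * theta * m1)

    return {
        "sample": float(sigma2),
        "S": sigma2 * ((1 - c) ** 2 + c ** 2 * theta ** 2),
        "P": pretest_type(1.0),
        "SP": pretest_type(c),
    }


# Hypothetical grid of shifts for a bivariate population with c = 0.5.
for delta in (0.0, 16.0, 400.0):
    print(delta, amse(delta, p=2, c=0.5))
```

In this sketch the shrinkage estimator has the smallest AMSE at a zero shift, its AMSE grows without bound as the shift increases, and the AMSE of both pretest-type estimators returns to that of the sample estimator for very large shifts.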

4. A Monte-Carlo Simulation Study


We now conduct a Monte-Carlo simulation to provide empirical outcomes to
the theory developed in the earlier sections of this article. The objective of this
simulation study is to examine the behavior of the relative precisions of $\hat\kappa^{S}$, $\hat\kappa^{P}$ and
$\hat\kappa^{SP}$ to $\hat\kappa$. We consider $H_o: \kappa = \kappa_o$ against $H_a: \kappa = \kappa_o + \delta$, where $\delta$ is a real
shift in the neighborhood of $\kappa_o$, for data from various distributions. Using
Monte Carlo simulations, we calculate the various kurtosis estimators discussed
earlier. The simulated relative precision (SRP) of the estimator $\tilde\kappa$ for various values
of $\delta$ is

$$SRP(\tilde\kappa) = \frac{SMSE(\hat\kappa)}{SMSE(\tilde\kappa)},$$

where $SMSE(\tilde\kappa)$ and $SMSE(\hat\kappa)$ are the empirical mean square errors of $\tilde\kappa$ and $\hat\kappa$,
respectively. An SRP above one therefore indicates that $\tilde\kappa$ has a smaller simulated
MSE than $\hat\kappa$.
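A stripped-down version of such a simulation can be sketched as follows. This is an illustration under simplifying assumptions, not the design used in the study: the data are univariate standard normal (p = 1, true kurtosis 3), far fewer replications are drawn, and the null value `kappa_o` is passed in directly instead of being induced by a non-normal data distribution. All function and parameter names are hypothetical.

```python
import random
from statistics import NormalDist


def univariate_kurtosis(xs):
    """Sample kurtosis for p = 1, with the variance computed using divisor n - 1."""
    n = len(xs)
    m = sum(xs) / n
    s2 = sum((x - m) ** 2 for x in xs) / (n - 1)
    return sum(((x - m) ** 2 / s2) ** 2 for x in xs) / n


def simulate_srp(kappa_o, kappa_true=3.0, n=50, reps=2000, c=0.5, alpha=0.05, seed=0):
    """Simulated relative precision of the S, P and SP estimators for N(0, 1) data."""
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(1 - alpha / 2) ** 2
    sse = {"sample": 0.0, "S": 0.0, "P": 0.0, "SP": 0.0}
    for _ in range(reps):
        k_hat = univariate_kurtosis([rng.gauss(0.0, 1.0) for _ in range(n)])
        stat = n * (k_hat - kappa_o) ** 2 / 24.0  # 8 p (p + 2) with p = 1
        accept = stat < crit
        estimates = {
            "sample": k_hat,
            "S": c * kappa_o + (1 - c) * k_hat,
            "P": kappa_o if accept else k_hat,
            "SP": k_hat - c * (k_hat - kappa_o) if accept else k_hat,
        }
        for name, value in estimates.items():
            sse[name] += (value - kappa_true) ** 2
    # The common factor 1 / reps cancels in the ratio of empirical MSEs.
    return {name: sse["sample"] / sse[name] for name in ("S", "P", "SP")}


print(simulate_srp(kappa_o=3.0))
```

When the null value equals the true kurtosis, the shrinkage estimator's error is exactly (1 - c) times the sample estimator's error, so its simulated relative precision equals 1 / (1 - c)² regardless of the random draws; moving `kappa_o` away from 3 shows how quickly that advantage erodes.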
In the next subsections, we consider several data distributions from those very
unlike the normal distribution in shape, such as the chi-square distribution, to those
that are closer to the normal distribution to a certain degree such as the t and
the contaminated normal distributions. We also consider the case when we have
bivariate data following the bivariate t distribution.

4.1. Data from Contaminated Normal Distribution


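In this simulation study, we consider a mixture of normal variables. The probability
density function (pdf) of a mixture of $k$ normal random variables $X$ is defined by

$$f(x) = \sum_{j=1}^{k} p_j\, \phi_j(x; \mu_j, \sigma_j^2),$$

where $0 \le p_j \le 1$ for $j = 1, \ldots, k$, $\sum_{j=1}^{k} p_j = 1$, and

$$\phi_j(x; \mu_j, \sigma_j^2) = \frac{1}{\sqrt{2\pi}\,\sigma_j} \exp\left\{-\frac{(x - \mu_j)^2}{2\sigma_j^2}\right\}.$$

The population kurtosis of this distribution is given as below:

$$\kappa^* = \frac{1}{\sigma^4} \sum_{j=1}^{k} p_j \left[3\sigma_j^4 + 6(\mu_j - \mu)^2 \sigma_j^2 + (\mu_j - \mu)^4\right],$$

where $\mu$ and $\sigma^2$ denote the mean and variance of the mixture. If $\mu_1 = \mu_2$, An and
Ahmed (2008) showed that $\kappa^*$ simplifies to

$$\kappa^* = \frac{3(p_1 \sigma_1^4 + p_2 \sigma_2^4)}{(p_1 \sigma_1^2 + p_2 \sigma_2^2)^2}.$$

Under the same conditions, they also showed that $\kappa^*$ has a minimum value of 3
when either $p_1 = 0$ or 1. If $\mu_1 = \mu_2$, $\kappa^*$, on the other hand, has a maximum value
of $\frac{3}{4}(\sigma_1^2/\sigma_2^2 + \sigma_2^2/\sigma_1^2 + 2)$ when $p_1 = \sigma_2^2/(\sigma_1^2 + \sigma_2^2)$ or, equivalently,
$p_2 = \sigma_1^2/(\sigma_1^2 + \sigma_2^2)$. Thus,
in our simulation study, we considered mixtures of two normal distributions with
the specifications $\mu_1 = \mu_2$, $\sigma_1/\sigma_2 = 1/3$, and $0 \le p_1 \le 1$. From this mixture specification,
we generated 5,000 replication samples of size 50 for each 0.01 increment of $p_1$ and
calculated the estimators.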
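The range of population kurtosis values covered by this mixture design can be checked with a short calculation. The sketch below evaluates the equal-means formula on the 0.01 grid of mixing proportions, taking standard deviations 1 and 3 as one hypothetical pair that satisfies the ratio 1/3; the function name `mixture_kurtosis` is also hypothetical.

```python
def mixture_kurtosis(p1, s1, s2):
    """Kurtosis of a two-component normal mixture with equal means."""
    p2 = 1 - p1
    return 3 * (p1 * s1 ** 4 + p2 * s2 ** 4) / (p1 * s1 ** 2 + p2 * s2 ** 2) ** 2


# Mixing proportions 0.00, 0.01, ..., 1.00 with sigma_1 / sigma_2 = 1 / 3.
grid = [i / 100 for i in range(101)]
values = [mixture_kurtosis(p1, 1.0, 3.0) for p1 in grid]

best = max(range(len(grid)), key=lambda i: values[i])
closed_form_max = 0.75 * (1 / 9 + 9 + 2)  # (3/4)(s1^2/s2^2 + s2^2/s1^2 + 2)
print(grid[best], values[best], closed_form_max)
```

With this ratio the closed-form maximum is 25/3, attained at a mixing proportion of 0.9 for the component with the smaller variance, so the design sweeps population kurtosis values between 3 and roughly 8.33.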

4.2. Skewed Data


To simulate skewed data, 5,000 replication samples of size 50 were randomly
selected from central $\chi^2$ distributions with degrees of freedom 1–60 in each case.

4.3. Heavy-Tailed Data


To simulate heavy-tailed data, 5,000 samples of size 50 were randomly selected from
central $t$-distributions with degrees of freedom 5–60 in each case.

4.4. Heavy-Tailed Bivariate Data


To simulate heavy-tailed bivariate data, 5,000 samples of size 50 were randomly
selected from bivariate $t$-distributions with degrees of freedom 5–60 in each case.
For the bivariate $t$-distribution, the mean vector is $\mu = (3, 4)'$ and the covariance
matrix is $\Sigma = \begin{pmatrix} 1 & 0.5 \\ 0.5 & 1 \end{pmatrix}$. The bivariate normal distribution on which the comparison is
made has the same mean vector and covariance matrix.

Figure 1. Relative MSE precision of various kurtosis estimators when $c = 0.2$.

4.5. Results and Discussion


For each distribution above, we considered values of $c = 0.2$, 0.5, and 0.8 when
$n = 50$. To conserve space, for $n = 30$ we report only the case $c = 0.5$, via Fig. 4
in the Appendix. The results of the
simulation study are captured in Figs. 1 through 3.
Figure 1 shows the relative MSE precision of the various estimators versus
different values of $\delta$ when $c = 0.2$ for each type of data. When $\delta$ values are
closer to 0, it appears that $\hat\kappa^{P}$ outperforms the other estimators. However, for
skewed and univariate and bivariate heavy-tailed data, $\hat\kappa^{S}$ appears to dominate the
other estimators for larger $\delta$ values. All the shrinkage estimators, $\hat\kappa^{S}$, $\hat\kappa^{P}$ and $\hat\kappa^{SP}$,
generally outperform $\hat\kappa$ except at certain ranges of higher $\delta$ values.
Figure 2 shows the relative MSE precision of the various estimators versus
different values of $\delta$ when $c = 0.5$. When the data follow a contaminated normal
distribution, the relative MSE precision generally seems to be largest for $\hat\kappa^{P}$, followed by $\hat\kappa^{S}$ and then
by $\hat\kappa^{SP}$. The relative precision for each estimator appears to be lower than but similar to
that for $\hat\kappa$ only at higher values of $\delta$. For skewed data, $\hat\kappa^{S}$ generally outperforms
$\hat\kappa^{P}$ except at higher $\delta$ values, where it performs the worst of all four estimators. For
the univariate and bivariate heavy-tailed data, $\hat\kappa^{S}$ seems to generally outperform
$\hat\kappa^{P}$, which in turn outperforms $\hat\kappa^{SP}$ for approximately $\delta < 4$. Thereafter, all three
estimators are fairly close to each other.
Figure 3 shows the relative MSE precision of the various estimators versus
different values of $\delta$ when $c = 0.8$. Here, $\hat\kappa^{S}$ performed better than the other
estimators. $\hat\kappa^{P}$ outperforms $\hat\kappa^{SP}$. All the shrinkage estimators, $\hat\kappa^{S}$, $\hat\kappa^{P}$ and $\hat\kappa^{SP}$,
generally outperform $\hat\kappa$ except at certain ranges of higher $\delta$ values, where they are
either lower or similar in performance to $\hat\kappa$.
In a nutshell, the simulation study is in agreement with the analytical findings
of the preceding section.

Figure 2. Relative MSE precision of various kurtosis estimators when $c = 0.5$.

Figure 3. Relative MSE precision of various kurtosis estimators when $c = 0.8$.

5. Examples
In this section, we share some examples involving real data from different fields of
application.
The first real data set is concerned with gas exchange during exercise for 17 patients
with mild chronic obstructive pulmonary disease (Barbera et al., 1991). The data
consist of measurements of PaO2 gas exchanges during exercise and at rest, and
patients’ emphysema scores. With this trivariate data, we are interested in testing
the null hypothesis $H_o: \kappa = 15$ against the alternative hypothesis $H_a: \kappa \ne 15$. We
calculated from the data $\hat\kappa = 14.6171$ and $\hat\kappa^{S} = 14.8086$ with $c = 0.5$, and $\Lambda_n =
0.0207672$. In this example, the null hypothesis is not rejected at the 0.05 level of
significance. So, the selected estimators of $\kappa$ are $\hat\kappa^{P}$ and $\hat\kappa^{SP}$ with values equal to 15
and 14.8086, respectively. Based on our earlier discussion, we suggest using $\hat\kappa^{SP}$ of
14.8086.
The final real data set consists of monthly stock returns over 20 months (Sutradhar
and Ali, 1986). The data comprise monthly stock returns for General Electric,
Standard Oil, and Sears compared to the New York Stock Exchange. With this
multivariate data, we are interested in testing the null hypothesis $H_o: \kappa = 24$ against
the alternative hypothesis $H_a: \kappa \ne 24$. We calculated from the data $\hat\kappa = 28.0355$ and
$\hat\kappa^{S} = 26.0177$ with $c = 0.5$, and $\Lambda_n = 1.69635$. For $\alpha = 0.05$, based on our earlier
discussion, we suggest using $\hat\kappa^{SP} = 26.0177$.
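The reported figures for both examples can be retraced from the sample estimates alone. The sketch below is a minimal check that takes the published sample kurtosis values, sample sizes, and dimensions as inputs, with the dimension of the stock return data taken to be 4 because the null value 24 equals 4 × 6. The function name `analyse` and the dictionary keys are hypothetical.

```python
from statistics import NormalDist


def analyse(kappa_hat, n, p, c=0.5, alpha=0.05):
    """Pretest statistic and the S, P and SP estimates for kappa_o = p (p + 2)."""
    kappa_o = p * (p + 2)
    stat = n * (kappa_hat - kappa_o) ** 2 / (8 * p * (p + 2))
    crit = NormalDist().inv_cdf(1 - alpha / 2) ** 2  # chi-square(1) critical value
    accept = stat < crit
    shrunk = c * kappa_o + (1 - c) * kappa_hat
    return {
        "stat": stat,
        "reject": not accept,
        "S": shrunk,
        "P": float(kappa_o) if accept else kappa_hat,
        "SP": shrunk if accept else kappa_hat,
    }


gas_exchange = analyse(14.6171, n=17, p=3)   # trivariate data, 17 patients
stock_returns = analyse(28.0355, n=20, p=4)  # four return series, 20 months
print(gas_exchange)
print(stock_returns)
```

In both cases the statistic falls below the 5% critical value, so the shrinkage pretest estimate coincides with the shrinkage estimate, in line with the values quoted in the text up to rounding of the inputs.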

6. Conclusions
In this article, we reappraised the statistical properties of shrinkage and
shrinkage preliminary test estimators in the context of kurtosis parameter
estimation. We demonstrated that the preliminary test estimator has a risk that is a bounded function
of the approximation error and that it offers substantial MSE reduction when the
approximation is nearly correct. The results from the simulation study are
encouraging and agree with the asymptotic findings of the article. The suggested
shrinkage preliminary test estimation is easy to implement and free from any tuning
parameter or hyperparameter. It also gives comparable performance in simulation and real
data empirical studies. The estimation of the skewness parameter can also be examined
in a similar way, but is not pursued here for the sake of brevity. The estimation
of kurtosis via Bayesian methods is also not pursued in this article since the main
focus of this article is stabilization of the kurtosis estimator through shrinkage
estimation.

Appendix

Figure 4. Relative MSE precision of various kurtosis estimators when $c = 0.5$ and $n = 30$.

Acknowledgments
The research work of Professor Ahmed is supported by a grant from the Natural
Sciences and Engineering Research Council of Canada, and a part of this investigation
was conducted while he was visiting the King Fahd University of Petroleum and
Minerals (KFUPM), Dhahran, KSA. Professors Joarder and Omar would also like
to express gratitude to KFUPM for providing facilities for this research.

References
Ahmed, S. E. (1992). Shrinkage preliminary test estimation in multivariate normal
distributions. Journal of Statistical Computation and Simulation 43:177–195.
Ahmed, S. E., Krzanowski, W. J. (2004). Biased estimation in a simple multivariate
regression model. Computational Statistics and Data Analysis 45:689–696.
An, L., Ahmed, S. E. (2008). Improving the performance of kurtosis estimator. Computational
Statistics and Data Analysis 52:2669–2681.
Bancroft, T. A. (1944). On the biases in estimation due to the use of preliminary tests of
significance. Annals of Mathematical Statistics 15(2):190–204.
Barbera, J. A., Roca, J., Ramirez, J., Wagner, P. D., Ussetti, P., Rodriguez-Roisin, R.
(1991). Gas Exchange During Exercise in Mild Chronic Obstructive Pulmonary Disease:
Correlation with Lung Structure. American Review of Respiratory Disease 144:520–525.
Douglas, G. B. (2006). Confidence interval for a coefficient of quartile variation.
Computational Statistics and Data Analysis 50:2953–2957.
Giles, J. A., Giles, D. E. A. (1993). Preliminary-test estimation of the regression scale
parameter when the loss function is asymmetric. Communications in Statistics–Theory and
Methods 22(6):1709–1733.
Harvey, C. R., Siddique, A. (2000). Conditional skewness in asset pricing tests. Journal of
Finance 55:1263–1295.
Jiang, J. (1996). REML estimation: Asymptotic behavior and related topics. The Annals of
Statistics 24(1):255–286.
Jiang, J. (1997). Wald consistency and the method of Sieves in REML estimation. The Annals
of Statistics 25(4):1781–1803.
Joarder, A. H., Abujiya, M. R. (2008). Standardized moments of bivariate chi-square
distribution. Journal of Applied Statistical Science 4:1–9.
Judge, G. G., Bock, M. E. (1978). The Statistical Implications of Pre-test and Stein-Rule
Estimators in Econometrics. Amsterdam: North-Holland Publishing Co.
Khan, B. U., Ahmed, S. E. (2003). Improved estimation of coefficient vector in a regression
model. Communication Statistics Simulation and Computations 32(3):747–769.
Kim, T.-H., White, H. (2004). On more robust estimation of skewness and kurtosis. Finance
Research Letters 1:56–73.
Ledoit, O., Wolf, M. (2003). Improved estimation of the covariance matrix of stock returns
with an application to portfolio selection. Journal of Empirical Finance 10:603–621.
Magnus, J. R. (1999). The traditional pretest estimator. Theory Probability Appl.
44(2):293–308.
Ohtani, K. (1999). MSE performance of a heterogeneous pre-test estimator. Statistics &
Probability Letters 41:65–71.
Reif, J., Vlček, K. (2002). Optimal pre-test estimators in regression. Journal Econometrics
110:91–102.
Sutradhar, B. C., Ali, M. M. (1986). Estimation of the parameters of a regression model with
multivariate t error variable, Communications Statistics—Theory Methods 15(2):429–450.
