
Communications in Statistics - Simulation and Computation

ISSN: (Print) (Online) Journal homepage: https://www.tandfonline.com/loi/lssp20

Two-parameter estimator for the inverse Gaussian regression model

Muhammad Nauman Akram, Muhammad Amin & Muhammad Amanullah

To cite this article: Muhammad Nauman Akram, Muhammad Amin & Muhammad
Amanullah (2022) Two-parameter estimator for the inverse Gaussian regression model,
Communications in Statistics - Simulation and Computation, 51:10, 6208-6226, DOI:
10.1080/03610918.2020.1797797

To link to this article: https://doi.org/10.1080/03610918.2020.1797797

Published online: 30 Jul 2020.



Two-parameter estimator for the inverse Gaussian regression model

Muhammad Nauman Akram (a), Muhammad Amin (a), and Muhammad Amanullah (b)

(a) Department of Statistics, University of Sargodha, Sargodha, Pakistan; (b) Department of Statistics, Bahauddin Zakariya University, Multan, Pakistan

ABSTRACT

The inverse Gaussian regression model (IGRM) is frequently applied when the response variable is positively skewed and well fitted by the inverse Gaussian distribution. The maximum likelihood estimator (MLE) is generally used to estimate the unknown regression coefficients of the IGRM. The MLE performs well when the explanatory variables are uncorrelated with each other, but the presence of multicollinearity inflates the variance and standard error of the MLE, resulting in a loss of efficiency of the estimates. Consequently, the MLE is not a trustworthy method for estimating the unknown regression coefficients of the IGRM under multicollinearity. To combat multicollinearity, we propose a two-parameter estimator (TPE) for the IGRM to improve the efficiency of the estimates. Moreover, the mean squared error criterion is used to compare the performance of the TPE with other biased estimators and the MLE in a Monte Carlo simulation study and a real example. Based on these results, we suggest that the TPE based on the Asar and Genç method for the IGRM is better than the other competitive estimators.

ARTICLE HISTORY
Received 11 March 2020; Accepted 14 July 2020

KEYWORDS
Inverse Gaussian regression model; Liu estimator; Maximum likelihood estimator; Mean squared error; Multicollinearity; Ridge estimator; TPE

1. Introduction
The linear regression model (LRM) is used when the distribution of the response variable is normal. If this assumption is violated and the response variable is well fitted by some exponential family of distributions such as the binomial, gamma, Poisson, geometric, or inverse Gaussian (IG), then we use the generalized linear model (GLM) instead of the LRM. The inverse Gaussian regression model (IGRM) is mainly applied when the response variable is positively skewed (Amin, Amanullah, and Aslam 2016; Amin, Amanullah, and Qasim 2020). The IGRM has several applications, e.g., in engineering, the physical sciences, the social sciences, the medical sciences, the environment and business (Amin, Amanullah, and Aslam 2016; Amin, Amanullah and Qasim 2020; Akram, Amin and Qasim 2020; Kinat, Amin, and Mehmood 2020). The maximum likelihood estimator (MLE) is more suitable than the ordinary least squares (OLS) estimator for estimating the IGRM coefficients, describing different phenomena, and investigating those phenomena quantitatively.
It is very common in multiple regression modeling that the explanatory variables are correlated with one another, especially in health, economics and the social sciences. This correlation among explanatory variables is called multicollinearity. Frisch (1934) initially discussed this issue in the LRM. As in the LRM, the multicollinearity issue may arise in the IGRM. For estimating the unknown regression coefficients of the IGRM, the MLE is the most commonly used method, but it is known that the variances and standard errors of the regression coefficients become large in the presence of multicollinearity (McClendon 2002). Thus, it is generally difficult to draw valid statistical inferences when multicollinearity exists (Asar and Genç 2018).

CONTACT Muhammad Nauman Akram nauman_akram18@yahoo.com Department of Statistics, University of Sargodha, Sargodha, Pakistan.
© 2020 Taylor & Francis Group, LLC
To minimize the effect of multicollinearity, biased estimation methods have generally attracted more attention than any other approach. Among these, the ridge estimator, initially proposed by Hoerl and Kennard (1970a, 1970b), is considered the most attractive alternative to the MLE. Different approaches have been adopted for the selection of the optimal value of the shrinkage parameter k; for example, see Kibria (2003), Muniz and Kibria (2009), Alkhamisi and Shukur (2008), Asar, Karaibrahimoglu, and Genç (2014), Asar and Genç (2015) and Kibria and Banik (2016). Moreover, Segerstedt (1992) extended this work to the GLM. Many researchers have worked on the selection of k for the GLM with different responses, some for discrete response models and some for continuous ones. For a detailed study, we refer to Månsson and Shukur (2011), Khalaf et al. (2014), Kibria, Månsson and Shukur (2012), Månsson (2012), Qasim et al. (2019), Amin, Akram, and Amanullah (2020) and Amin et al. (2020a, 2020b). Another method to combat multicollinearity is the Liu estimation method proposed by Liu (1993). The advantage of the Liu estimator over the ridge estimator is that it is a linear function of the Liu parameter d, whose range is 0 to 1 (Qasim, Amin, and Amanullah 2018). For a detailed study of choosing the Liu parameter for the LRM, we suggest Liu (1993), Akdeniz and Kaçiranlar (2001), Hubert and Wijekoon (2006), Alheety and Kibria (2009), Kibria (2012) and Qasim et al. (2019). However, the literature on the Liu parameter for the GLM is limited; a few studies suggest Liu parameters for specific cases of the GLM, e.g., Månsson (2013), Qasim, Amin, and Amanullah (2018) and Naveed et al. (2020). A third estimation method to combat multicollinearity, the two-parameter estimation (TPE) method, combines the advantages of both the ridge and Liu estimators. Many researchers have proposed TPEs for the LRM to minimize the effect of multicollinearity: for instance, Yang and Chang (2010) introduced a new TPE for the LRM, and Huang, Wu, and Yi (2017) introduced the restricted almost unbiased TPE for the LRM. On the other hand, the literature on the TPE for the GLM is very limited. For different characterizations of this method in various forms of the GLM, see the following. Huang and Yang (2014) developed a TPE for the negative binomial regression model (NBRM) to mitigate the problem of correlated explanatory variables. Asar and Genç (2018) proposed a two-parameter ridge estimator for binary logistic regression. Furthermore, Asar and Genç (2018) suggested a TPE for the Poisson regression model (PRM), and in the most recent study, Amin et al. (2019) assessed the performance of the Asar and Genç and Huang and Yang TPE methods for the gamma regression model. The above literature shows that no study of the TPE for the IGRM is available. So, we propose a TPE for the IGRM to overcome the effect of multicollinearity among the explanatory variables.

The purpose of this article is to propose a TPE for the IGRM and to suggest some estimation methods for its shrinkage parameters to improve the efficiency of the IGRM estimates. We also assess the performance of the biased estimation methods (ridge, Liu and TPE) for the IGRM with the help of a Monte Carlo simulation study and a real-life application, where the mean squared error (MSE) is considered as the evaluation criterion. Furthermore, the TPE methods of Asar and Genç (2018) and Huang and Yang (2014) for the IGRM are compared to find out which estimation method outperforms the others in the presence of high multicollinearity.
The organization of this article is as follows. In Sec. 2, we present the methodology, which contains a review of the IGRM; we also give some biased estimation methods, i.e., the ridge, Liu and TPE for the IGRM, with the necessary mathematical derivations, and present theoretical comparisons based on the matrix mean squared error (MMSE) and MSE. The design of the Monte Carlo simulation and its results are displayed in Sec. 3. The application of the proposed methods is given in Sec. 4. Finally, some concluding remarks are given in Sec. 5.

2. Methodology
2.1. Inverse Gaussian regression model
Suppose that $y_1, y_2, \ldots, y_n$ are independently and identically distributed observations that come from the IG distribution with mean $\mu_i$. The IG distribution is a member of the exponential family, having density

\[ f(y; \theta, \phi) = \exp\left\{ \frac{y\theta - b(\theta)}{\phi} + c(y, \phi) \right\}, \tag{1} \]

where $\theta$ is the location parameter, $b(\theta)$ is the cumulant function, $\phi > 0$ is the dispersion parameter, and $c(y, \phi)$ is called the normalization term. The likelihood function of Eq. (1) is given as

\[ l(y_i; \theta_i, \phi) = \prod_{i=1}^{n} \exp\left\{ \frac{y_i\theta_i - b(\theta_i)}{\phi} + c(y_i, \phi) \right\}. \tag{2} \]

Let $y_i$ be a random variable from the IG distribution with drift parameter $\mu$ and volatility parameter $\lambda$, denoted IG($\mu$, $\lambda$); then the probability density function is defined by

\[ f(y; \mu, \lambda) = \sqrt{\frac{\lambda}{2\pi y^3}}\, \exp\left[ -\frac{\lambda (y-\mu)^2}{2\mu^2 y} \right], \quad y > 0,\ \mu > 0,\ \lambda > 0. \tag{3} \]

For the IGRM, we consider the reciprocal square-root link function $\mu = 1/\sqrt{\eta}$, where $\eta_i = x_i'\beta$ is the linear predictor, $\beta = (\beta_0, \beta_1, \ldots, \beta_p)'$ is the vector of regression coefficients including the intercept, and $x_i = (1, x_{i1}, \ldots, x_{ip})'$, so that there are $r = p + 1$ coefficients. The density in Eq. (3) can be rewritten as

\[ f(y; \mu, \lambda) = \exp\left[ -\frac{\lambda(y^2 + \mu^2 - 2y\mu)}{2\mu^2 y} + \frac{1}{2}\ln(\lambda) - \frac{1}{2}\ln(2\pi y^3) \right], \]

or

\[ f(y; \mu, \lambda) = \exp\left[ \lambda\left( -\frac{y}{2\mu^2} + \frac{1}{\mu} \right) - \frac{1}{2}\left\{ \ln\left( \frac{2\pi y^3}{\lambda} \right) + \frac{\lambda}{y} \right\} \right]. \tag{4} \]

Thus the mean and variance of the IG distribution are, respectively,

\[ E(y) = b'(\theta) = \frac{\partial b}{\partial \mu}\,\frac{\partial \mu}{\partial \theta} = \mu = \frac{1}{\sqrt{\eta}}, \qquad \mathrm{Var}(y) = \phi\, b''(\theta) = \phi\, V(\mu) = \frac{\mu^3}{\lambda}, \]

with $V(\mu) = \mu^3$ and $\phi = 1/\lambda$. Applying the logarithm to Eq. (4) over the sample, we have

\[ l(\mu_i, \lambda) = \sum_{i=1}^{n}\left[ \lambda\left( -\frac{y_i}{2\mu_i^2} + \frac{1}{\mu_i} \right) - \frac{1}{2}\left\{ \ln\left( \frac{2\pi y_i^3}{\lambda} \right) + \frac{\lambda}{y_i} \right\} \right], \tag{5} \]

where $1/\mu_i = \sqrt{x_i'\beta}$. Then Eq. (5) becomes
\[ l(\beta, \lambda) = \sum_{i=1}^{n}\left[ \lambda\left( -\frac{y_i\, x_i'\beta}{2} + \sqrt{x_i'\beta} \right) - \frac{1}{2}\left\{ \ln\left( \frac{2\pi y_i^3}{\lambda} \right) + \frac{\lambda}{y_i} \right\} \right]. \tag{6} \]

The estimate of $\beta$ can be obtained by taking the derivative of Eq. (6) and equating it to zero:

\[ \frac{\partial\, l(\beta, \lambda)}{\partial \beta} = \lambda \sum_{i=1}^{n}\left( -\frac{y_i}{2} + \frac{1}{2}\left( x_i'\beta \right)^{-1/2} \right) x_i = 0. \tag{7} \]

The estimate of $\beta$ is attained by solving Eq. (7); equivalently, with $\lambda = 1/\phi$,

\[ \frac{\partial l}{\partial \beta_j} = -\frac{1}{2\phi} \sum_{i=1}^{n}\left( y_i - \frac{1}{\sqrt{x_i'\beta}} \right) x_{ij} = 0. \tag{8} \]

Since Eq. (8) is a non-linear function of the regression coefficients, it is solved iteratively with the help of Fisher's method of scoring. In every single iteration, the parameters are updated as $\hat\beta^{(r+1)} = \hat\beta^{(r)} + (X'W^{(r)}X)^{-1}X'(y - \hat\mu^{(r)})$, where $\hat\beta^{(r)}$ represents the estimate of $\beta$ at the $r$th iteration. If the estimates $\hat\beta^{(r+1)}$ converge to $\hat\beta$ as $r$ approaches infinity, we obtain

\[ \hat\beta_{MLE} = H^{-1}X'\hat W \hat z, \tag{9} \]

where $H = X'\hat W X$, $\hat z_i = \hat\eta_i + (y_i - \hat\mu_i)/\hat\mu_i^3$ is called the adjusted response variable and $\hat W = \mathrm{diag}(\hat\mu_i^3)$, $i = 1, 2, 3, \ldots, n$. The MLE is asymptotically normal with mean vector $\beta$ and asymptotic covariance matrix $\mathrm{Cov}(\hat\beta_{MLE}) = \hat\phi\,(X'\hat W X)^{-1}$. To obtain the MMSE and scalar MSE of an estimator, one utilizes the spectral decomposition of the matrix $H$, i.e., $H = Q\Lambda Q'$, where $Q$ is the matrix of the eigenvectors of $H$ and $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_r)$ is the diagonal matrix of eigenvalues of $H$. Thus, the MMSE and scalar MSE of Eq. (9) can be respectively given as

\[ \mathrm{MMSE}\big(\hat\beta_{MLE}\big) = \phi\, Q\Lambda^{-1}Q', \tag{10} \]

\[ \mathrm{MSE}\big(\hat\beta_{MLE}\big) = \phi \sum_{j=1}^{r} \frac{1}{\lambda_j}, \tag{11} \]

where $\lambda_j$ represents the $j$th eigenvalue of $H$. When there exists a high correlation among the explanatory variables, the weighted matrix of cross products, $H$, is ill-conditioned, resulting in high variance of the MLE. It is then very difficult to estimate and interpret the regression parameters, as the estimated coefficients become too large (Amin et al. 2020a). The dispersion parameter $\phi$ is estimated as

\[ \hat\phi = \frac{1}{n - p - 1}\sum_{i=1}^{n} \frac{(y_i - \hat\mu_i)^2}{\hat\mu_i^3}. \tag{12} \]
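The Fisher scoring iteration above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the starting value, the positivity guard on the linear predictor, and the use of the generic GLM weights $W = 1/\{V(\mu)g'(\mu)^2\}$ and working response $z = \eta + (y - \mu)g'(\mu)$ for the link $g(\mu) = \mu^{-2}$ are assumptions of this sketch.

```python
import numpy as np

def igrm_mle(X, y, tol=1e-8, max_iter=100):
    """Fisher scoring (IRLS) for the IGRM with link eta = 1/mu^2."""
    beta = np.linalg.lstsq(X, 1.0 / y**2, rcond=None)[0]  # crude start: eta ~ 1/y^2
    for _ in range(max_iter):
        eta = np.clip(X @ beta, 1e-8, None)   # keep the linear predictor positive
        mu = eta**-0.5
        g_prime = -2.0 * mu**-3               # g'(mu) for g(mu) = mu^(-2)
        W = 1.0 / (mu**3 * g_prime**2)        # IRLS weights 1/{V(mu) g'(mu)^2}
        z = eta + (y - mu) * g_prime          # adjusted (working) response
        beta_new = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta
```

At convergence the weighted normal equations reproduce the form of Eq. (9), $\hat\beta = (X'\hat WX)^{-1}X'\hat W\hat z$, up to the constant factor absorbed into the weights.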

2.2. TPE for the IGRM

When there exists a high correlation among the explanatory variables, we may face estimation and inference issues for the IGRM coefficients. Different studies have been conducted in the literature to overcome the effect of multicollinearity. Among these, the ridge estimator proposed by Hoerl and Kennard (1970a, 1970b) is considered the most popular. Later on, Algamal (2019) proposed the inverse Gaussian ridge estimator (IGRE) as

\[ \hat\beta_{IGRE} = (H + kI)^{-1} H\, \hat\beta_{MLE}, \quad k > 0, \tag{13} \]

where $k$ is the ridge parameter, estimated as $k = \hat\phi/\hat\alpha_j^2$, and $I$ is the identity matrix of order $r \times r$.

Akram et al. (2020) introduced the inverse Gaussian Liu estimator (IGLE) to overcome the limitations of the IGRE, defined as

\[ \hat\beta_{IGLE} = (H + I)^{-1} (H + dI)\, \hat\beta_{MLE}, \quad 0 < d < 1, \tag{14} \]

where

\[ d = \frac{\displaystyle\sum_{j=1}^{r}\frac{\alpha_j^2 - \phi}{(\lambda_j+1)^2}}{\displaystyle\sum_{j=1}^{r}\frac{\phi + \alpha_j^2\lambda_j}{\lambda_j(\lambda_j+1)^2}} \]

is called the Liu parameter.

Asar and Genç (2018) proposed a TPE combining the ridge and Liu estimators for the PRM, which is also valid for the IGRM. So, we propose an estimator called the two-parameter Asar and Genç (TPAG) estimator, defined as

\[ \hat\beta_{TPAG} = (H + kI_p)^{-1} (H + kdI_p)\, \hat\beta_{MLE}, \tag{15} \]

where $k > 0$ and $0 < d < 1$. The TPAG is a general estimator encompassing the MLE, IGRE and IGLE. For instance, if we put $k = 0$ in Eq. (15), then $\hat\beta_{TPAG} = \hat\beta_{MLE}$; if $k = 1$, then $\hat\beta_{TPAG} = \hat\beta_{IGLE}$; and if $d = 0$, then $\hat\beta_{TPAG} = \hat\beta_{IGRE}$.

Following Huang and Yang (2014), who suggested a TPE relating the ridge and Liu estimators, the two-parameter Huang and Yang (TPHY) estimator is defined as

\[ \hat\beta_{TPHY} = (H + I)^{-1} (H + dI)(H + kI)^{-1} H\, \hat\beta_{MLE}, \tag{16} \]

where $k > 0$ and $0 < d < 1$.
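The four competing estimators in Eqs. (13)-(16) are simple matrix transforms of the MLE. The following sketch assumes $H$ and $\hat\beta_{MLE}$ have already been computed (for instance by the IRLS fit); the limiting cases stated after Eq. (15) serve as a built-in correctness check.

```python
import numpy as np

def igre(H, beta_mle, k):
    """Inverse Gaussian ridge estimator, Eq. (13)."""
    return np.linalg.solve(H + k * np.eye(H.shape[0]), H @ beta_mle)

def igle(H, beta_mle, d):
    """Inverse Gaussian Liu estimator, Eq. (14)."""
    I = np.eye(H.shape[0])
    return np.linalg.solve(H + I, (H + d * I) @ beta_mle)

def tpag(H, beta_mle, k, d):
    """Two-parameter Asar and Genc estimator, Eq. (15)."""
    I = np.eye(H.shape[0])
    return np.linalg.solve(H + k * I, (H + k * d * I) @ beta_mle)

def tphy(H, beta_mle, k, d):
    """Two-parameter Huang and Yang estimator, Eq. (16)."""
    I = np.eye(H.shape[0])
    inner = np.linalg.solve(H + k * I, H @ beta_mle)   # ridge step
    return np.linalg.solve(H + I, (H + d * I) @ inner)  # Liu step
```

Setting $k = 0$, $k = 1$, or $d = 0$ in `tpag` reproduces the MLE, the IGLE, and the IGRE respectively, exactly as in the text.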

2.3. MSE and MMSE properties of the estimators

In this section, we give the MMSE and MSE of the estimators, which are further used to assess the proposed and competing estimators. For a generic estimator $\hat\beta$, the MMSE and MSE are correspondingly written as

\[ \mathrm{MMSE}\big(\hat\beta\big) = E\big[(\hat\beta - \beta)(\hat\beta - \beta)'\big] = \mathrm{Var}\big(\hat\beta\big) + \mathrm{bias}\big(\hat\beta\big)\,\mathrm{bias}\big(\hat\beta\big)', \]

\[ \mathrm{MSE}\big(\hat\beta\big) = \mathrm{tr}\big[\mathrm{MMSE}\big(\hat\beta\big)\big] = E\big[(\hat\beta - \beta)'(\hat\beta - \beta)\big] = \mathrm{tr}\big[\mathrm{Var}\big(\hat\beta\big)\big] + \mathrm{bias}\big(\hat\beta\big)'\,\mathrm{bias}\big(\hat\beta\big), \]

where $\mathrm{tr}$ is the trace operator, and $\mathrm{Var}(\hat\beta)$ and $\mathrm{Bias}(\hat\beta) = E(\hat\beta) - \beta$ represent the variance and bias of the estimator, the bias being the difference between the expected value and the true population parameter $\beta$.

Algamal (2019) computed the MMSE and MSE of the IGRE as

\[ \mathrm{MMSE}\big(\hat\beta_{IGRE}\big) = \phi\, Q\Lambda_k^{-1}\Lambda\Lambda_k^{-1}Q' + b_{IGRE}\,b_{IGRE}', \tag{17} \]

where $\Lambda_k = \mathrm{diag}(\lambda_1 + k, \lambda_2 + k, \ldots, \lambda_r + k)$ and $b_{IGRE} = \mathrm{Bias}(\hat\beta_{IGRE}) = -kQ\Lambda_k^{-1}\alpha$, and

\[ \mathrm{MSE}\big(\hat\beta_{IGRE}\big) = \phi\sum_{j=1}^{r}\frac{\lambda_j}{(\lambda_j+k)^2} + k^2\sum_{j=1}^{r}\frac{\alpha_j^2}{(\lambda_j+k)^2}, \tag{18} \]

where $\alpha = Q'\beta$ and $k$ ($k > 0$) is the ridge biasing parameter suggested by Hoerl and Kennard (1970a).

Akram et al. (2020) computed the MMSE and MSE of the IGLE, which are respectively given as

\[ \mathrm{MMSE}\big(\hat\beta_{IGLE}\big) = \phi\, Q\Lambda_1^{-1}\Lambda_d\Lambda^{-1}\Lambda_d\Lambda_1^{-1}Q' + b_{IGLE}\,b_{IGLE}', \tag{19} \]

where $\Lambda_1 = \mathrm{diag}(\lambda_1 + 1, \ldots, \lambda_r + 1)$, $\Lambda_d = \mathrm{diag}(\lambda_1 + d, \ldots, \lambda_r + d)$ and $b_{IGLE} = \mathrm{Bias}(\hat\beta_{IGLE}) = (d-1)Q\Lambda_1^{-1}\alpha$, and

\[ \mathrm{MSE}\big(\hat\beta_{IGLE}\big) = \phi\sum_{j=1}^{r}\frac{(\lambda_j+d)^2}{\lambda_j(\lambda_j+1)^2} + (d-1)^2\sum_{j=1}^{r}\frac{\alpha_j^2}{(\lambda_j+1)^2}. \tag{20} \]

The MMSE and scalar MSE of the TPAG can be obtained as

\[ \mathrm{MMSE}\big(\hat\beta_{TPAG}\big) = \phi\, Q\Lambda_k^{-1}\Lambda_{kd}\Lambda^{-1}\Lambda_{kd}\Lambda_k^{-1}Q' + b_{TPAG}\,b_{TPAG}', \tag{21} \]

where $\Lambda_{kd} = \mathrm{diag}(\lambda_1 + kd, \ldots, \lambda_r + kd)$ and $b_{TPAG} = \mathrm{Bias}(\hat\beta_{TPAG}) = k(d-1)Q\Lambda_k^{-1}\alpha$, and

\[ \mathrm{MSE}\big(\hat\beta_{TPAG}\big) = \phi\sum_{j=1}^{r}\frac{(\lambda_j+kd)^2}{\lambda_j(\lambda_j+k)^2} + k^2(d-1)^2\sum_{j=1}^{r}\frac{\alpha_j^2}{(\lambda_j+k)^2}, \tag{22} \]

or

\[ \mathrm{MSE}\big(\hat\beta_{TPAG}\big) = c_1(k,d) + c_2(k,d), \]

where $c_1(k,d)$ and $c_2(k,d)$ represent the variance and squared-bias terms of Eq. (22), respectively.

The MMSE and MSE of the Huang and Yang (2014) TPE are, respectively, given as

\[ \mathrm{MMSE}\big(\hat\beta_{TPHY}\big) = \phi\, Q\Lambda_1^{-1}\Lambda_d\Lambda_k^{-1}\Lambda\Lambda_k^{-1}\Lambda_d\Lambda_1^{-1}Q' + b_{TPHY}\,b_{TPHY}', \tag{23} \]

where $b_{TPHY} = \mathrm{Bias}(\hat\beta_{TPHY}) = Q(\Lambda_1^{-1}\Lambda_d\Lambda_k^{-1}\Lambda - I)\alpha$, and

\[ \mathrm{MSE}\big(\hat\beta_{TPHY}\big) = \phi\sum_{j=1}^{r}\frac{\lambda_j(\lambda_j+d)^2}{(\lambda_j+1)^2(\lambda_j+k)^2} + \sum_{j=1}^{r}\frac{\{(1+k-d)\lambda_j+k\}^2\,\alpha_j^2}{(\lambda_j+1)^2(\lambda_j+k)^2}. \tag{24} \]
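The scalar MSE formulas in Eqs. (11), (18), (20), (22) and (24) depend only on the eigenvalues $\lambda_j$, the rotated coefficients $\alpha = Q'\beta$, and the dispersion $\phi$, so they are cheap to evaluate directly; a sketch:

```python
import numpy as np

def mse_mle(lam, phi):
    """Scalar MSE of the MLE, Eq. (11)."""
    return phi * np.sum(1.0 / lam)

def mse_igre(lam, alpha, phi, k):
    """Scalar MSE of the ridge estimator, Eq. (18)."""
    return phi * np.sum(lam / (lam + k)**2) + k**2 * np.sum(alpha**2 / (lam + k)**2)

def mse_igle(lam, alpha, phi, d):
    """Scalar MSE of the Liu estimator, Eq. (20)."""
    return (phi * np.sum((lam + d)**2 / (lam * (lam + 1)**2))
            + (d - 1)**2 * np.sum(alpha**2 / (lam + 1)**2))

def mse_tpag(lam, alpha, phi, k, d):
    """Scalar MSE of the TPAG estimator, Eq. (22)."""
    return (phi * np.sum((lam + k * d)**2 / (lam * (lam + k)**2))
            + k**2 * (d - 1)**2 * np.sum(alpha**2 / (lam + k)**2))

def mse_tphy(lam, alpha, phi, k, d):
    """Scalar MSE of the TPHY estimator, Eq. (24)."""
    den = (lam + 1)**2 * (lam + k)**2
    return (phi * np.sum(lam * (lam + d)**2 / den)
            + np.sum(((1 + k - d) * lam + k)**2 * alpha**2 / den))
```

Note that Eq. (22) collapses to Eq. (11) at $k = 0$, to Eq. (18) at $d = 0$, and to Eq. (20) at $k = 1$, which provides a useful correctness check on the implementation.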

Huang and Yang (2014) considered the following theorems for the TPE in the NBRM. These theorems are also valid for the TPE in the IGRM, and are discussed below.

Theorem 2.1. The asymptotic variance $c_1(k,d)$ and the squared bias $c_2(k,d)$ are two continuous functions of $k$ and $d$. For fixed $d$ ranging between zero and one, $c_1(k,d)$ is a monotonically decreasing and $c_2(k,d)$ a monotonically increasing function of $k$; for fixed $k > 0$, $c_1(k,d)$ is a monotonically increasing and $c_2(k,d)$ a monotonically decreasing function of $d$.

Proof. We know that $c_1(k,d)$ and $c_2(k,d)$ are continuous functions of $k$ and $d$. Differentiating Eq. (22) with respect to $k$, we have

\[ \frac{\partial\,\mathrm{MSE}\big(\hat\beta_{TPAG}\big)}{\partial k} = \sum_{j=1}^{r}\left[ -\frac{2(1-d)^2k^2\alpha_j^2}{(k+\lambda_j)^3} + \frac{2(1-d)^2k\alpha_j^2}{(k+\lambda_j)^2} + \frac{2d(dk+\lambda_j)\phi}{\lambda_j(k+\lambda_j)^2} - \frac{2(dk+\lambda_j)^2\phi}{\lambda_j(k+\lambda_j)^3} \right]. \tag{25} \]

After some simplification, Eq. (25) becomes

\[ \frac{\partial\,\mathrm{MSE}\big(\hat\beta_{TPAG}\big)}{\partial k} = -\sum_{j=1}^{r} \frac{2(1-d)\left[\lambda_j\phi + k\{d\phi - (1-d)\alpha_j^2\lambda_j\}\right]}{(k+\lambda_j)^3}. \tag{26} \]

In particular, the variance and squared-bias parts of Eq. (25) give

\[ \frac{\partial c_1}{\partial k} = -2\phi(1-d)\sum_{j=1}^{r}\frac{\lambda_j+kd}{(\lambda_j+k)^3} < 0, \qquad \frac{\partial c_2}{\partial k} = 2(1-d)^2k\sum_{j=1}^{r}\frac{\lambda_j\alpha_j^2}{(\lambda_j+k)^3} > 0. \]

Now differentiating again with respect to $k$, we have

\[ \frac{\partial^2\,\mathrm{MSE}\big(\hat\beta_{TPAG}\big)}{\partial k^2} = \sum_{j=1}^{r}\left[ \frac{6(1-d)^2k^2\alpha_j^2}{(k+\lambda_j)^4} - \frac{8(1-d)^2k\alpha_j^2}{(k+\lambda_j)^3} + \frac{2(1-d)^2\alpha_j^2}{(k+\lambda_j)^2} + \frac{2d^2\phi}{\lambda_j(k+\lambda_j)^2} - \frac{8d(dk+\lambda_j)\phi}{\lambda_j(k+\lambda_j)^3} + \frac{6(dk+\lambda_j)^2\phi}{\lambda_j(k+\lambda_j)^4} \right]. \tag{27} \]

After simplification, Eq. (27) becomes

\[ \frac{\partial^2\,\mathrm{MSE}\big(\hat\beta_{TPAG}\big)}{\partial k^2} = \sum_{j=1}^{r} \frac{2(1-d)\left[ \lambda_j\{(1-d)\alpha_j^2\lambda_j + (3-d)\phi\} - 2k\{(1-d)\alpha_j^2\lambda_j - d\phi\} \right]}{(k+\lambda_j)^4}. \tag{28} \]

So, for fixed $d$ between 0 and 1, $c_1(k,d)$ is a monotonically decreasing and $c_2(k,d)$ a monotonically increasing function of $k$. Now differentiating Eq. (22) in terms of $d$, we have

\[ \frac{\partial\,\mathrm{MSE}\big(\hat\beta_{TPAG}\big)}{\partial d} = \sum_{j=1}^{r}\left[ -\frac{2(1-d)k^2\alpha_j^2}{(k+\lambda_j)^2} + \frac{2k(dk+\lambda_j)\phi}{\lambda_j(k+\lambda_j)^2} \right]. \tag{29} \]

After some mathematical manipulation, Eq. (29) becomes

\[ \frac{\partial\,\mathrm{MSE}\big(\hat\beta_{TPAG}\big)}{\partial d} = \sum_{j=1}^{r} \frac{2k\left[\lambda_j\phi + k\{d\phi - (1-d)\alpha_j^2\lambda_j\}\right]}{\lambda_j(k+\lambda_j)^2}. \tag{30} \]

Finally, taking the second derivative of Eq. (30) with respect to $d$, we have

\[ \frac{\partial^2\,\mathrm{MSE}\big(\hat\beta_{TPAG}\big)}{\partial d^2} = \sum_{j=1}^{r}\frac{2k^2\alpha_j^2}{(k+\lambda_j)^2} + \sum_{j=1}^{r}\frac{2k^2\phi}{\lambda_j(k+\lambda_j)^2} = \sum_{j=1}^{r} \frac{2k^2(\alpha_j^2\lambda_j + \phi)}{\lambda_j(k+\lambda_j)^2} > 0. \tag{31} \]

Since $\partial c_1/\partial d = 2k\phi\sum_j(\lambda_j+kd)/\{\lambda_j(\lambda_j+k)^2\} > 0$ and $\partial c_2/\partial d = -2(1-d)k^2\sum_j\alpha_j^2/(\lambda_j+k)^2 < 0$, it is obvious that $c_1(k,d)$ is a monotonically increasing and $c_2(k,d)$ a monotonically decreasing function of $d$.
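Theorem 2.1 can also be checked numerically from Eq. (22); the eigenvalues, rotated coefficients and dispersion below are assumed purely for illustration and are not from the paper.

```python
import numpy as np

# Hypothetical spectrum and parameters for a numerical check of Theorem 2.1.
lam = np.array([0.1, 1.0, 5.0])     # assumed eigenvalues of H
alpha = np.array([0.6, -0.5, 0.3])  # assumed alpha = Q' beta
phi = 0.5                           # assumed dispersion

def c1(k, d):
    """Variance term of Eq. (22)."""
    return phi * np.sum((lam + k * d)**2 / (lam * (lam + k)**2))

def c2(k, d):
    """Squared-bias term of Eq. (22)."""
    return k**2 * (1 - d)**2 * np.sum(alpha**2 / (lam + k)**2)

ks = np.linspace(0.01, 5.0, 200)
d0 = 0.4
v1 = np.array([c1(k, d0) for k in ks])
v2 = np.array([c2(k, d0) for k in ks])
assert np.all(np.diff(v1) < 0)   # c1 decreases in k for fixed d in (0, 1)
assert np.all(np.diff(v2) > 0)   # c2 increases in k

ds = np.linspace(0.01, 0.99, 200)
k0 = 1.0
w1 = np.array([c1(k0, d) for d in ds])
w2 = np.array([c2(k0, d) for d in ds])
assert np.all(np.diff(w1) > 0)   # c1 increases in d for fixed k > 0
assert np.all(np.diff(w2) < 0)   # c2 decreases in d
```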

2.4. Theoretical comparisons based on MMSE and scalar MSE

Asar and Genç (2018) showed that the performance of their suggested estimator is better than the estimator of Huang and Yang (2014) and the MLE, but they did not provide a comparison for the IGRM. So, we compare the MMSE and MSE of the TPAG with the MLE, IGRE, IGLE and TPHY for the IGRM to assess its performance theoretically.

Lemma 2.2. Let $M$ be a positive definite (p.d.) matrix, $a$ be a vector of nonzero constants and $c$ be a positive constant. Then $cM - aa' > 0$ if and only if $a'M^{-1}a < c$ (Farebrother 1976).

Theorem 2.3. Suppose that $k > 0$, $0 < d < 1$ and $b_{TPAG} = \mathrm{Bias}(\hat\beta_{TPAG})$. Then $\mathrm{MMSE}(\hat\beta_{MLE}) - \mathrm{MMSE}(\hat\beta_{TPAG}) > 0$ if

\[ b_{TPAG}'\left[\phi\,Q\big(\Lambda^{-1} - \Lambda_k^{-1}\Lambda_{kd}\Lambda^{-1}\Lambda_{kd}\Lambda_k^{-1}\big)Q'\right]^{-1} b_{TPAG} < 1. \]

Proof. By using Eqs. (10) and (21), we obtain

\[ \mathrm{MMSE}\big(\hat\beta_{MLE}\big) - \mathrm{MMSE}\big(\hat\beta_{TPAG}\big) = \phi\,Q\big(\Lambda^{-1} - \Lambda_k^{-1}\Lambda_{kd}\Lambda^{-1}\Lambda_{kd}\Lambda_k^{-1}\big)Q' - b_{TPAG}\,b_{TPAG}'. \]

In terms of scalar MSE, one gets

\[ \mathrm{MSE}\big(\hat\beta_{MLE}\big) - \mathrm{MSE}\big(\hat\beta_{TPAG}\big) = \phi\sum_{j=1}^{r}\left[\frac{1}{\lambda_j} - \frac{(\lambda_j+kd)^2}{\lambda_j(\lambda_j+k)^2}\right] - b_{TPAG}'\,b_{TPAG}, \tag{32} \]

which simplifies to

\[ \mathrm{MSE}\big(\hat\beta_{MLE}\big) - \mathrm{MSE}\big(\hat\beta_{TPAG}\big) = \phi\sum_{j=1}^{r}\frac{k(1-d)\{k(1+d)+2\lambda_j\}}{\lambda_j(\lambda_j+k)^2} - b_{TPAG}'\,b_{TPAG}. \tag{33} \]

The matrix $\Lambda^{-1} - \Lambda_k^{-1}\Lambda_{kd}\Lambda^{-1}\Lambda_{kd}\Lambda_k^{-1}$ is p.d. if $(\lambda_j+k)^2 - (\lambda_j+kd)^2 > 0$; this algebraic expression equals $\{(\lambda_j+k)-(\lambda_j+kd)\}\{(\lambda_j+k)+(\lambda_j+kd)\} = k(1-d)\{k(1+d)+2\lambda_j\}$, which is positive whenever $k > 0$ and $0 < d < 1$. The proof is then ended by Lemma 2.2. ∎

Theorem 2.4. Suppose that $k > 0$, $0 < d < 1$, $b_{IGRE} = \mathrm{Bias}(\hat\beta_{IGRE})$ and $b_{TPAG} = \mathrm{Bias}(\hat\beta_{TPAG})$. Then $\mathrm{MMSE}(\hat\beta_{IGRE}) - \mathrm{MMSE}(\hat\beta_{TPAG}) > 0$ if

\[ b_{TPAG}'\left[\phi\,Q\big(\Lambda_k^{-1}\Lambda\Lambda_k^{-1} - \Lambda_k^{-1}\Lambda_{kd}\Lambda^{-1}\Lambda_{kd}\Lambda_k^{-1}\big)Q' + b_{IGRE}\,b_{IGRE}'\right]^{-1} b_{TPAG} < 1. \]

Proof. By using Eqs. (17) and (21), we obtain

\[ \mathrm{MMSE}\big(\hat\beta_{IGRE}\big) - \mathrm{MMSE}\big(\hat\beta_{TPAG}\big) = \phi\,Q\big(\Lambda_k^{-1}\Lambda\Lambda_k^{-1} - \Lambda_k^{-1}\Lambda_{kd}\Lambda^{-1}\Lambda_{kd}\Lambda_k^{-1}\big)Q' + b_{IGRE}\,b_{IGRE}' - b_{TPAG}\,b_{TPAG}'. \]

The comparison of scalar MSE can be obtained as

\[ \mathrm{MSE}\big(\hat\beta_{IGRE}\big) - \mathrm{MSE}\big(\hat\beta_{TPAG}\big) = \phi\sum_{j=1}^{r}\left[\frac{\lambda_j}{(\lambda_j+k)^2} - \frac{(\lambda_j+kd)^2}{\lambda_j(\lambda_j+k)^2}\right] + b_{IGRE}'\,b_{IGRE} - b_{TPAG}'\,b_{TPAG}, \]

or equivalently

\[ \mathrm{MSE}\big(\hat\beta_{IGRE}\big) - \mathrm{MSE}\big(\hat\beta_{TPAG}\big) = -\phi\sum_{j=1}^{r}\frac{kd(kd+2\lambda_j)}{\lambda_j(\lambda_j+k)^2} + b_{IGRE}'\,b_{IGRE} - b_{TPAG}'\,b_{TPAG}. \tag{34} \]

The matrix $\Lambda_k^{-1}\Lambda\Lambda_k^{-1} - \Lambda_k^{-1}\Lambda_{kd}\Lambda^{-1}\Lambda_{kd}\Lambda_k^{-1}$ would be p.d. if $\lambda_j^2 - (\lambda_j+kd)^2 > 0$; this expression equals $\{\lambda_j - (\lambda_j+kd)\}\{\lambda_j + (\lambda_j+kd)\} = -kd(kd+2\lambda_j)$, so the variance part alone is not p.d. for $k > 0$ and $0 < d < 1$ and the bias term $b_{IGRE}\,b_{IGRE}'$ must compensate. The theorem then follows from Lemma 2.2 applied to the bracketed matrix. ∎

Theorem 2.5. Suppose that $k > 0$, $0 < d < 1$, $b_{IGLE} = \mathrm{Bias}(\hat\beta_{IGLE})$ and $b_{TPAG} = \mathrm{Bias}(\hat\beta_{TPAG})$. Then $\mathrm{MMSE}(\hat\beta_{IGLE}) - \mathrm{MMSE}(\hat\beta_{TPAG}) > 0$ if

\[ b_{TPAG}'\left[\phi\,Q\big(\Lambda_1^{-1}\Lambda_d\Lambda^{-1}\Lambda_d\Lambda_1^{-1} - \Lambda_k^{-1}\Lambda_{kd}\Lambda^{-1}\Lambda_{kd}\Lambda_k^{-1}\big)Q' + b_{IGLE}\,b_{IGLE}'\right]^{-1} b_{TPAG} < 1. \]

Proof. By using Eqs. (19) and (21), we obtain

\[ \mathrm{MMSE}\big(\hat\beta_{IGLE}\big) - \mathrm{MMSE}\big(\hat\beta_{TPAG}\big) = \phi\,Q\big(\Lambda_1^{-1}\Lambda_d\Lambda^{-1}\Lambda_d\Lambda_1^{-1} - \Lambda_k^{-1}\Lambda_{kd}\Lambda^{-1}\Lambda_{kd}\Lambda_k^{-1}\big)Q' + b_{IGLE}\,b_{IGLE}' - b_{TPAG}\,b_{TPAG}'. \tag{35} \]

The variance difference of the IGLE and TPAG is p.d. if $(\lambda_j+d)^2(\lambda_j+k)^2 - (\lambda_j+kd)^2(\lambda_j+1)^2 > 0$. Thus, if $k > 0$ and $0 < d < 1$ and this condition holds, the theorem is ended by Lemma 2.2. ∎

Theorem 2.6. Let $k > 0$, $0 < d < 1$, $\lambda_j(d - kd - 1) - kd > 0$, and $b_{TPHY} = \mathrm{Bias}(\hat\beta_{TPHY})$. Then $\mathrm{MMSE}(\hat\beta_{TPHY}) - \mathrm{MMSE}(\hat\beta_{TPAG}) > 0$ if

\[ b_{TPAG}'\left[\phi\,Q\big(\Lambda_1^{-1}\Lambda_d\Lambda_k^{-1}\Lambda\Lambda_k^{-1}\Lambda_d\Lambda_1^{-1} - \Lambda_k^{-1}\Lambda_{kd}\Lambda^{-1}\Lambda_{kd}\Lambda_k^{-1}\big)Q' + b_{TPHY}\,b_{TPHY}'\right]^{-1} b_{TPAG} < 1. \]

Proof. By using Eqs. (23) and (21), we obtain

\[ \mathrm{MMSE}\big(\hat\beta_{TPHY}\big) - \mathrm{MMSE}\big(\hat\beta_{TPAG}\big) = \phi\,Q\big(\Lambda_1^{-1}\Lambda_d\Lambda_k^{-1}\Lambda\Lambda_k^{-1}\Lambda_d\Lambda_1^{-1} - \Lambda_k^{-1}\Lambda_{kd}\Lambda^{-1}\Lambda_{kd}\Lambda_k^{-1}\big)Q' + b_{TPHY}\,b_{TPHY}' - b_{TPAG}\,b_{TPAG}'. \]

The scalar MSE difference can be obtained as

\[ \mathrm{MSE}\big(\hat\beta_{TPHY}\big) - \mathrm{MSE}\big(\hat\beta_{TPAG}\big) = \phi\sum_{j=1}^{r}\left[\frac{\lambda_j(\lambda_j+d)^2}{(\lambda_j+1)^2(\lambda_j+k)^2} - \frac{(\lambda_j+kd)^2}{\lambda_j(\lambda_j+k)^2}\right] + b_{TPHY}'\,b_{TPHY} - b_{TPAG}'\,b_{TPAG}. \tag{36} \]

Since $b_{TPHY}\,b_{TPHY}'$ is non-negative definite, it is enough to prove that $\phi\,Q(\Lambda_1^{-1}\Lambda_d\Lambda_k^{-1}\Lambda\Lambda_k^{-1}\Lambda_d\Lambda_1^{-1} - \Lambda_k^{-1}\Lambda_{kd}\Lambda^{-1}\Lambda_{kd}\Lambda_k^{-1})Q' - b_{TPAG}\,b_{TPAG}'$ is p.d., which holds if $\lambda_j(d - kd - 1) - kd > 0$. Thus, the theorem is ended by Lemma 2.2. ∎
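As a sanity check on the algebra in the proof of Theorem 2.3, the following sketch (with assumed eigenvalues, coefficients and dispersion, not values from the paper) verifies that the direct form of the variance difference in Eq. (32) agrees with the factored form in Eq. (33) and is positive for $k > 0$, $0 < d < 1$.

```python
import numpy as np

# Hypothetical eigenvalues/parameters, assumed for illustration only.
lam = np.array([0.2, 1.5, 4.0])
phi, k, d = 0.5, 0.8, 0.3

# Variance part of MSE(MLE) - MSE(TPAG): direct form of Eq. (32) ...
direct = phi * np.sum(1.0 / lam - (lam + k * d)**2 / (lam * (lam + k)**2))
# ... and the factored form of Eq. (33).
factored = phi * np.sum(k * (1 - d) * (k * (1 + d) + 2 * lam) / (lam * (lam + k)**2))

assert np.isclose(direct, factored)
assert direct > 0   # the variance reduction is positive for k > 0, 0 < d < 1
```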

2.5. Selection of the shrinkage parameters k and d

The TPAG is an effective method to combat multicollinearity compared with the other estimation methods, but there is no firm rule for finding the optimum value of the ridge parameter $k$. Hoerl and Kennard (1970a, 1970b) and Kibria (2003) suggested various methods for estimating appropriate shrinkage parameters for the LRM. To find the optimum value, we differentiate Eq. (22) with respect to $k$ for fixed $d$ and set the expression equal to zero:

\[ \frac{\partial\,\mathrm{MSE}\big(\hat\beta_{TPAG}\big)}{\partial k} = -\sum_{j=1}^{r}\frac{2(1-d)\left[\lambda_j\phi + k\{d\phi - (1-d)\alpha_j^2\lambda_j\}\right]}{(\lambda_j+k)^3} = 0. \tag{37} \]

After simplification of Eq. (37), we get, componentwise,

\[ k_j = \frac{\lambda_j\phi}{\alpha_j^2\lambda_j(1-d) - d\phi}. \tag{38} \]

As $\alpha$ is unknown, we estimate it by $\hat\alpha = Q'\hat\beta_{MLE}$. Since each individual parameter $k_j$ should be non-negative, we calculate the expression for $d$ by differentiating Eq. (22) with respect to $d$ and setting it to zero:

\[ \frac{\partial\,\mathrm{MSE}\big(\hat\beta_{TPAG}\big)}{\partial d} = 2k\phi\sum_{j=1}^{r}\frac{\lambda_j+kd}{\lambda_j(\lambda_j+k)^2} + 2k^2(d-1)\sum_{j=1}^{r}\frac{\alpha_j^2}{(\lambda_j+k)^2} = 0. \]

On simplification (componentwise), one gets

\[ d_j = \frac{k\alpha_j^2\lambda_j - \phi\lambda_j}{\phi k + k\alpha_j^2\lambda_j}, \quad 0 < d < 1. \]

We propose to estimate $d$ as

\[ d^{*} = \min_{1\le j\le r}\left( \frac{k\hat\alpha_j^2\hat\lambda_j - \hat\phi\hat\lambda_j}{\hat\phi k + k\hat\alpha_j^2\hat\lambda_j} \right), \tag{39} \]

where $\min$ represents the minimum function, so that $d^{*}$ lies between 0 and 1. After estimating $d$, we consider two methods to choose the optimum value of the ridge parameter $k$:

\[ k_1^{*} = \sum_{j=1}^{r} k_j^{*}, \tag{40} \]

\[ k_2^{*} = \mathrm{diag}\big(k_j^{*}\big). \tag{41} \]

The following algorithm is used to estimate the values of the shrinkage parameters.
Step 1: Select $d^{*}$ from Eq. (39).
Step 2: Estimate $k$ by using one of Eqs. (40) and (41), inserting the value of $d^{*}$.
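The two-step selection above can be sketched as follows. Note two assumptions of this sketch that the text leaves open: the $k$ appearing inside Eq. (39) is seeded with the Hoerl-Kennard-type value $\hat\phi/\max_j\hat\alpha_j^2$, and a non-positive denominator in Eq. (38) is guarded with the ridge value $\hat\phi/\hat\alpha_j^2$.

```python
import numpy as np

def tpag_params(lam, alpha_hat, phi_hat):
    """Select (d*, k1*) for the TPAG estimator via Eqs. (38)-(40).

    lam: eigenvalues of H; alpha_hat = Q' beta_mle; phi_hat: dispersion."""
    k0 = phi_hat / np.max(alpha_hat**2)   # assumed seed for the k in Eq. (39)
    # Eq. (39): componentwise d, take the minimum, clip into [0, 1].
    d_all = ((k0 * alpha_hat**2 * lam - phi_hat * lam)
             / (phi_hat * k0 + k0 * alpha_hat**2 * lam))
    d = float(np.clip(np.min(d_all), 0.0, 1.0))
    # Eq. (38): individual k_j, guarding a non-positive denominator (assumption).
    den = alpha_hat**2 * lam * (1.0 - d) - d * phi_hat
    den_safe = np.where(den > 0, den, 1.0)
    kj = np.where(den > 0, lam * phi_hat / den_safe, phi_hat / alpha_hat**2)
    return d, float(np.sum(kj))           # Eq. (40): k1* = sum_j k_j*
```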

Huang and Yang (2014) suggested ideal values of the shrinkage parameters for the NBRM. Based on their work, we suggest some optimal values of the shrinkage parameters for the IGRM to assess the performance of their estimator. Initially, to find the optimal value of $k$, take the derivative of Eq. (24) for fixed $d$ and equate it to zero:

\[ \frac{\partial\,\mathrm{MSE}\big(\hat\beta_{TPHY}\big)}{\partial k} = -2\phi\sum_{j=1}^{r}\frac{\lambda_j(\lambda_j+d)^2}{(\lambda_j+k)^3(\lambda_j+1)^2} + 2\sum_{j=1}^{r}\frac{\alpha_j^2\lambda_j(\lambda_j+d)\{(k+1-d)\lambda_j+k\}}{(\lambda_j+k)^3(\lambda_j+1)^2} = 0. \tag{42} \]

After some mathematical manipulation, we have, componentwise,

\[ k_j = \frac{(d-1)\alpha_j^2\lambda_j + \phi}{\alpha_j^2(\lambda_j+1)}, \tag{43} \]

where $\hat\alpha = Q'\hat\beta_{MLE}$. Replacing $\lambda_j$ and $\alpha_j$ by their estimates, Eq. (43) can be written as

\[ k^{*} = \min_{1\le j\le r}\left( \frac{(\hat d-1)\hat\alpha_j^2\hat\lambda_j + \hat\phi}{\hat\alpha_j^2(\hat\lambda_j+1)} \right). \tag{44} \]

It is interesting to note that when $d = 1$, the numerator of Eq. (44) reduces to $\hat\phi$, closely related to the estimator $k = \hat\phi/\hat\alpha_j^2$ suggested by Hoerl and Kennard (1970a, 1970b). Likewise, we find an optimal value of $d$ by fixing $k > 0$:

\[ \hat d = \frac{\hat\alpha_{\max}^2\{\hat\lambda_{\max}(1+k)+k\} - \hat\lambda_{\max}\hat\phi}{\hat\lambda_{\max}\hat\alpha_{\max}^2 + \hat\phi}, \tag{45} \]

where $\hat\lambda_{\max}$ represents the maximum eigenvalue of $H$ and $\hat\alpha_{\max}^2$ is the maximum element of $\hat\alpha_j^2$. To ensure that the value of $d$ lies between zero and one, Eq. (45) is truncated as

\[ \hat D_1 = \max\big(0,\ \hat d\big). \tag{46} \]

Another optimal value of $d$ is obtained by minimizing Eq. (24), i.e., setting $\partial\,\mathrm{MSE}(\hat\beta_{TPHY})/\partial d = 0$ and solving for $d$:

\[ \hat d_2 = \frac{\displaystyle\sum_{j=1}^{r}\frac{\hat\lambda_j\left[\{\hat\lambda_j(k+1)+k\}\hat\alpha_j^2 - \hat\lambda_j\hat\phi\right]}{(\hat\lambda_j+k)^2(\hat\lambda_j+1)^2}}{\displaystyle\sum_{j=1}^{r}\frac{\hat\lambda_j\big(\hat\alpha_j^2\hat\lambda_j+\hat\phi\big)}{(\hat\lambda_j+k)^2(\hat\lambda_j+1)^2}}. \tag{47} \]

For practical purposes, Eq. (47) is truncated as

\[ \hat D_2 = \max\big(0,\ \hat d_2\big). \tag{48} \]

Thus, the following limitation on $k$ is obtained:

\[ \max\left(0,\ \frac{\hat\lambda_{\max}\big(1-\hat\alpha_{\max}^2\big) - \hat\phi}{\hat\alpha_{\max}^2\big(1+\hat\lambda_{\max}\big)}\right) < k < \frac{\hat\phi}{\hat\alpha_{\max}^2}, \tag{49} \]

where $\hat\lambda_{\max}$ is the maximum eigenvalue of $H$ and $\hat\alpha_{\max}^2$ represents the maximum element of $\hat\alpha_j^2$. These shrinkage parameters are further used in the simulation to assess the effectiveness of the estimators and to conclude which one is more consistent in handling the issue of multicollinearity. The following steps are used to find the optimal values of $k$ and $d$.

Table 1. Estimated MSE of the estimators when p = 4.

                                             TPAG                   TPHY
n     a_o    MLE      IGRE     IGLE     (d*, k1*)  (d*, k2*)   (D1, K1)   (D2, K2)
When φ = 0.1
25 0.80 0.4308 0.1071 0.0521 0.0016 0.0041 0.4077 0.4081
50 0.1640 0.0454 0.0293 0.0011 0.0027 0.1577 0.1582
100 0.0767 0.0226 0.0163 0.0009 0.0027 0.0738 0.0741
200 0.0373 0.0123 0.0073 0.0007 0.0029 0.0353 0.0355
25 0.90 0.8112 0.1906 0.0633 0.0017 0.0034 0.7875 0.7873
50 0.2957 0.0760 0.0390 0.0010 0.0022 0.2899 0.2903
100 0.1383 0.0365 0.0226 0.0009 0.0022 0.1362 0.1365
200 0.0667 0.0198 0.0108 0.0008 0.0025 0.0653 0.0654
25 0.95 1.6307 0.4025 0.0898 0.0028 0.0053 1.6006 1.5986
50 0.6085 0.1586 0.0552 0.0013 0.0023 0.5992 0.5996
100 0.2869 0.0763 0.0375 0.0011 0.0023 0.2850 0.2852
200 0.1274 0.0349 0.0163 0.0009 0.0023 0.1267 0.1268
25 0.99 7.7633 1.8062 0.2373 0.0067 0.0107 7.6015 7.6024
50 3.0138 0.7748 0.1520 0.0023 0.0035 2.9918 2.9920
100 1.2958 0.3141 0.0759 0.0018 0.0028 1.2938 1.2941
200 0.5771 0.1454 0.0446 0.0013 0.0021 0.5770 0.5770
When / ¼ 0:25
25 0.80 1.2678 0.2985 0.2026 0.0028 0.0074 1.1507 1.1550
50 0.4161 0.1177 0.1031 0.0018 0.0047 0.4011 0.4020
100 0.1717 0.0488 0.0324 0.0013 0.0034 0.1678 0.1681
200 0.0749 0.0228 0.0213 0.0010 0.0028 0.0725 0.0727
25 0.90 2.3659 0.5381 0.3026 0.0040 0.0118 2.2407 2.2441
50 0.7579 0.1946 0.1558 0.0024 0.0055 0.7427 0.7436
100 0.3216 0.0889 0.0859 0.0014 0.0034 0.3195 0.3197
200 0.1332 0.0372 0.0372 0.0011 0.0025 0.1320 0.1321
25 0.95 4.4524 0.9311 0.4443 0.0070 0.0137 4.2560 4.2571
50 1.3765 0.3187 0.2094 0.0031 0.0059 1.3532 1.3540
100 0.5731 0.1482 0.1307 0.0018 0.0032 0.5711 0.5714
200 0.2769 0.0776 0.0726 0.0011 0.0025 0.2766 0.2766
25 0.99 22.2594 4.5505 1.9071 0.0776 0.0499 21.5727 21.5696
50 7.2919 1.6534 0.7535 0.0078 0.0151 7.2194 7.2196
100 2.9891 0.7733 0.4284 0.0040 0.0071 2.9864 2.9866
200 1.3360 0.3546 0.2577 0.0023 0.0041 1.3347 1.3357

Step 1: Select the ridge parameter k using Eq. (49).
Step 2: Compute D1 and D2 by inserting the value of k.
Step 3: Estimate k using Eq. (44) with D1 and D2.
Step 4: If the value of k in Step 3 is negative, then retain the value of k from Step 1.
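Steps 1-4 can be sketched as follows. One detail is an assumption of this sketch: Eq. (49) only gives an admissible interval for $k$, so the midpoint of that interval is used as the Step 1 value.

```python
import numpy as np

def tphy_params(lam, alpha_hat, phi_hat):
    """Shrinkage-parameter selection for the TPHY estimator:
    Eq. (49) for an admissible k, Eqs. (45)-(48) for D1 and D2,
    and Eq. (44) for k given d (a sketch of Steps 1-4)."""
    lmax = np.max(lam)
    a2max = np.max(alpha_hat**2)
    # Step 1: pick k inside the interval of Eq. (49) (midpoint, by assumption).
    lo = max(0.0, (lmax * (1.0 - a2max) - phi_hat) / (a2max * (1.0 + lmax)))
    hi = phi_hat / a2max
    k = 0.5 * (lo + hi)
    # Step 2: D1 from Eqs. (45)-(46) ...
    d1 = (a2max * (lmax * (1.0 + k) + k) - lmax * phi_hat) / (lmax * a2max + phi_hat)
    D1 = max(0.0, d1)
    # ... and D2 from Eqs. (47)-(48).
    w = 1.0 / ((lam + k)**2 * (lam + 1.0)**2)
    num = np.sum(lam * ((lam * (k + 1.0) + k) * alpha_hat**2 - lam * phi_hat) * w)
    den = np.sum(lam * (alpha_hat**2 * lam + phi_hat) * w)
    D2 = max(0.0, num / den)
    # Steps 3-4: k from Eq. (44); fall back to the Step 1 value if negative.
    def k_of(d):
        kk = np.min(((d - 1.0) * alpha_hat**2 * lam + phi_hat)
                    / (alpha_hat**2 * (lam + 1.0)))
        return kk if kk > 0 else k
    return (D1, k_of(D1)), (D2, k_of(D2))
```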

3. Monte Carlo simulation

In this section, we give the simulation layout and results used to evaluate the performance of the proposed estimators for the IGRM. This evaluation is done under different simulation factors which may affect the results.

3.1. Simulation layout

The response variable $y_i$ of the IGRM is generated from the IG($\mu_i$, $\lambda$) distribution, where

\[ \mu_i = E(y_i) = (\beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_r x_{ir})^{-1/2}, \quad i = 1, 2, \ldots, n. \tag{50} \]

Eq. (50) is the mean function, and data are generated for p = 4, 6, 8 explanatory variables.

Table 2. Estimated MSE of the estimators when p = 4.

                                             TPAG                   TPHY
n     a_o    MLE      IGRE     IGLE     (d*, k1*)  (d*, k2*)   (D1, K1)   (D2, K2)
When φ = 0.5
25 0.80 3.7498 0.7899 0.7355 0.0070 0.0190 3.3587 3.3765
50 0.9107 0.2323 0.2308 0.0030 0.0111 0.8675 0.8702
100 0.2869 0.0752 0.0111 0.0019 0.0050 0.2812 0.2816
200 0.1238 0.0370 0.0237 0.0014 0.0037 0.1215 0.1216
25 0.90 6.3070 1.2440 1.1232 0.0206 0.0225 5.6296 5.6421
50 1.5702 0.3692 0.2219 0.0036 0.0082 1.5252 1.5274
100 0.5469 0.1459 0.1399 0.0023 0.0061 0.5425 0.5431
200 0.2328 0.0682 0.0472 0.0016 0.0039 0.2326 0.2326
25 0.95 12.3475 2.2608 1.8949 0.0305 0.0407 11.3708 11.3843
50 3.1993 0.7419 0.7393 0.0075 0.0170 3.1487 3.1494
100 1.0473 0.2642 0.2329 0.0033 0.0071 1.0454 1.0457
200 0.4564 0.1296 0.1180 0.0020 0.0050 0.4561 0.4467
25 0.99 60.9087 11.3919 9.5083 0.6397 0.1597 57.0514 57.0663
50 14.1051 3.1247 2.4783 0.0052 0.0299 13.9957 13.9967
100 5.0549 1.2259 1.1118 0.0067 0.0153 5.0510 5.0509
200 2.1618 0.5752 0.5518 0.0035 0.0075 2.1615 2.1412
When / ¼ 2
25 0.80 87.9296 14.6164 13.6712 2.1554 0.2891 67.9993 68.8710
50 7.2160 1.3374 0.5979 0.5125 0.0384 6.4128 6.4604
100 1.1363 0.2426 0.6358 0.0033 0.0107 1.0798 1.0847
200 0.3635 0.0983 0.0238 0.0027 0.0081 0.3517 0.3526
25 0.90 147.8530 24.6815 23.2293 86.4208 0.6976 122.1939 122.9046
50 12.1098 1.9959 0.0560 0.0240 0.0460 11.2110 11.2426
100 2.0416 0.4025 0.1074 0.0043 0.0129 2.0004 2.0039
200 0.6980 0.1789 0.1445 0.0032 0.0094 0.6885 0.6896
25 0.95 297.4180 55.7837 51.4703 2.1923 1.1917 258.8407 259.2597
50 23.5453 3.8645 2.2740 0.1147 0.0587 21.6453 21.6776
100 4.3490 0.7684 0.2567 0.0102 0.0193 4.3143 4.3149
200 1.4019 0.3587 0.2849 0.0050 0.0137 1.3983 1.3988
25 0.99 1077.3300 136.2320 132.7220 15.6978 2.6152 762.3609 762.9061
50 114.4280 19.0281 8.4796 2.3669 0.2994 105.5276 105.5843
100 21.6735 4.1295 3.8644 0.0687 0.0790 21.5056 21.5068
200 6.3034 1.5145 1.3218 0.0098 0.0328 6.2847 6.2832

∑_{j=1}^{r} β_j² = 1; for further information, see Kibria (2003). The correlated explanatory variables are generated using the following expression:

x_ij = (1 − a_o²)^(1/2) f_ij + a_o f_i(r+1),   i = 1, 2, …, n,   j = 1, 2, …, r   (51)

where a_o represents the correlation among the explanatory variables and f_ij denotes pseudo-random numbers generated from the standard normal distribution. For a detailed description, we recommend the studies of Kibria (2003) and Månsson, Kibria, and Shukur (2012). Four values of a_o are considered, i.e., a_o = 0.80, 0.90, 0.95, 0.99, and four sample sizes, n = 25, 50, 100, 200. To investigate the performance of the MLE, IGRE, IGLE, TPAG and TPHY, the MSE criterion is used for the assessment of the proposed estimators; it is defined by

MSE(β̂) = (1/R) ∑_{i=1}^{R} (β̂_i − β)′(β̂_i − β),

where R represents the total number of replications, set to 2000. The simulation study is carried out using R 3.5.2.
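As a concrete illustration of this layout, the following minimal Python sketch generates correlated predictors via Eq. (51), draws inverse Gaussian responses with the mean function of Eq. (50), fits the MLE by iteratively reweighted least squares for the canonical inverse-squared link, and evaluates the simulated MSE criterion. The intercept value, the clipping of the linear predictor, and the reduced replication count are our own assumptions chosen to make the sketch runnable, not choices taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)

n, r = 200, 4        # sample size and number of slope coefficients
a_o = 0.90           # degree of multicollinearity
phi = 0.25           # dispersion parameter, so lambda = 1 / phi

# Eq. (51): correlated predictors built from standard-normal pseudo random numbers
f = rng.standard_normal((n, r + 1))
X = np.sqrt(1.0 - a_o**2) * f[:, :r] + a_o * f[:, [r]]

# Slope coefficients restricted so that sum_j beta_j^2 = 1 (Kibria 2003)
beta = np.full(r, 1.0 / np.sqrt(r))
intercept = 3.0                                  # hypothetical shift
eta = np.clip(intercept + X @ beta, 0.1, None)   # keep linear predictor positive
mu = eta ** -0.5                                 # Eq. (50): mu = (x'beta)^(-1/2)

def igrm_mle(X, y, iters=30):
    """MLE of the IGRM via IRLS for the canonical link eta = mu^(-2)."""
    Xd = np.column_stack([np.ones(len(y)), X])   # add intercept column
    mu = y.copy()
    coef = np.zeros(Xd.shape[1])
    for _ in range(iters):
        W = mu**3 / 4.0                          # IRLS weights 1/(V(mu) g'(mu)^2)
        z = mu**-2.0 - 2.0 * (y - mu) / mu**3    # working response
        coef = np.linalg.solve(Xd.T @ (Xd * W[:, None]), Xd.T @ (W * z))
        mu = np.clip(Xd @ coef, 1e-2, None) ** -0.5
    return coef

# Simulated MSE criterion: average of (beta_hat - beta)'(beta_hat - beta)
true = np.concatenate([[intercept], beta])
R, sse = 200, 0.0
for _ in range(R):
    y = rng.wald(mu, 1.0 / phi)                  # IG responses (Wald = IG)
    b_hat = igrm_mle(X, y)
    sse += (b_hat - true) @ (b_hat - true)
print("simulated MSE of the MLE:", sse / R)
```

A two-parameter estimator would replace the solve step with a ridge/Liu-type shrunken system; the loop and MSE criterion stay the same.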

Table 3. Estimated MSE of the estimators when p = 6.

n     a_o    MLE       IGRE     IGLE     TPAG(d*, k1)  TPAG(d*, k2)  TPHY(D1, K1)  TPHY(D2, K2)
When φ = 0.1
25 0.80 0.6751 0.1591 0.0753 0.0017 0.0068 0.6282 0.6295
50 0.2593 0.0660 0.0502 0.0012 0.0048 0.2532 0.2529
100 0.1207 0.0347 0.0290 0.0009 0.0041 0.1182 0.1184
200 0.0572 0.0177 0.0127 0.0008 0.0042 0.0570 0.0569
25 0.90 1.2996 0.2958 0.1008 0.0020 0.0079 1.2203 1.2228
50 0.4811 0.1202 0.0713 0.0012 0.0052 0.4738 0.4736
100 0.2169 0.0587 0.0432 0.0010 0.0037 0.2147 0.2149
200 0.1011 0.0288 0.0197 0.0008 0.0039 0.1011 0.1012
25 0.95 2.5682 0.5917 0.1364 0.0027 0.0107 2.4422 2.4445
50 0.9116 0.2167 0.0922 0.0015 0.0058 0.9020 0.9020
100 0.4300 0.1194 0.0704 0.0012 0.0040 0.4281 0.4282
200 0.1958 0.0541 0.0316 0.0010 0.0036 0.1927 0.1948
25 0.99 11.7156 2.7452 0.3378 0.0070 0.0390 11.1664 11.1738
50 4.5658 1.1400 0.1934 0.0033 0.0164 4.5479 4.5482
100 2.0177 0.5340 0.1394 0.0020 0.0058 2.0167 2.0167
200 0.9027 0.2286 0.0804 0.0014 0.0039 0.9017 0.9016
When φ = 0.25
25 0.80 2.0916 0.3960 0.2542 0.0029 0.0134 1.8608 1.8681
50 0.6410 0.1499 0.1314 0.0019 0.0095 0.6173 0.6182
100 0.2392 0.0637 0.0588 0.0012 0.0052 0.2353 0.2356
200 0.1172 0.0357 0.0422 0.0011 0.0051 0.1170 0.1171
25 0.90 3.9713 0.7591 0.3945 0.0038 0.0222 3.6442 3.6548
50 1.1302 0.2601 0.2204 0.0021 0.0093 1.1043 1.1055
100 0.4473 0.1172 0.1134 0.0014 0.0056 0.4430 0.4432
200 0.2167 0.0622 0.0508 0.0011 0.0041 0.2154 0.2118
25 0.95 7.8451 1.5355 0.6809 0.0078 0.0422 7.2791 7.2945
50 2.2228 0.4967 0.3343 0.0027 0.0121 2.1938 2.1934
100 0.8606 0.2201 0.2099 0.0016 0.0051 0.8561 0.8560
200 0.4062 0.1123 0.1121 0.0013 0.0043 0.4055 0.4049
25 0.99 36.5503 6.7751 2.4529 0.4001 0.1059 34.1044 34.1285
50 10.9406 2.4443 1.0870 0.0075 0.0551 10.8202 10.8221
100 4.0904 1.0072 0.5745 0.0035 0.0138 4.0807 4.0811
200 2.0241 0.5591 0.4332 0.0022 0.0127 2.0231 2.0217

3.2. Simulation results


The estimated MSEs of the proposed estimators under various effective factors, such as the degree of multicollinearity (a_o), the sample size (n), the number of explanatory variables (p) and the dispersion parameter (φ), are given in Tables 1–6. It is clear from the simulated results that, for all evaluated situations, the MSE values of β̂_TPAG are smaller than those of the other estimation methods under the different simulated factors. For fixed n, p and φ, the degree of multicollinearity has a direct impact on the estimated MSEs of the IGRM estimators: as the level of multicollinearity increases from mild to severe, the MSE values of the estimators also increase, and this holds for all simulation scenarios. This deterioration is most severe for the MLE under severe multicollinearity. Assessing the results by sample size, with all other factors held constant, the simulated MSE decreases as the sample size increases. Moreover, increasing p and φ increases the estimated MSE values of the estimators. Nevertheless, the performance of β̂_TPAG with (d*, k1) and (d*, k2) remains superior to the other estimation methods; for instance, the estimated MSE of the MLE is three times larger than that of β̂_TPAG.
Hence β̂_TPAG with parameter (d*, k1) is considered a robust estimation method for estimating the unknown parameters of the IGRM in the presence of severe multicollinearity. Although β̂_TPHY with (D1, K1) and (D2, K2) yields a smaller MSE than the MLE, it attains a larger MSE than the IGRE and IGLE. Additionally, from Tables 1–6, we found

Table 4. Estimated MSE of the estimators when p = 6.

n     a_o    MLE       IGRE     IGLE     TPAG(d*, k1)  TPAG(d*, k2)  TPHY(D1, K1)  TPHY(D2, K2)
When φ = 0.5
25 0.80 6.0930 0.9845 0.1020 0.0063 0.0400 5.2080 5.2396
50 1.5966 0.3294 0.2505 0.0028 0.0171 1.5088 1.5127
100 0.4416 0.1108 0.1086 0.0018 0.0086 0.4319 0.4330
200 0.1894 0.0572 0.0303 0.0015 0.0066 0.1905 0.1903
25 0.90 13.7424 2.3567 2.3441 0.0241 0.0908 12.0457 12.1076
50 2.7539 0.5461 0.6661 0.0032 0.0209 2.6595 2.6628
100 0.8203 0.2019 0.3026 0.0021 0.0091 0.8085 0.8091
200 0.3639 0.1057 0.1659 0.0016 0.0072 0.3659 0.3656
25 0.95 26.1001 4.0405 4.0275 2.8660 0.0959 23.1431 23.2248
50 4.9815 0.9787 0.9578 0.0053 0.0304 4.8386 4.8435
100 1.6859 0.4206 0.3630 0.0025 0.0108 1.6734 1.6744
200 0.6632 0.1266 0.1128 0.0018 0.0074 0.5666 0.4665
25 0.99 116.0001 18.5658 17.0782 3.0734 0.4708 102.8950 102.9870
50 25.3257 5.1126 4.2352 0.0207 0.1284 24.8552 24.8630
100 7.5439 1.8235 1.7313 0.0067 0.0360 7.5047 7.5060
200 3.4425 0.9752 0.9167 0.0035 0.0226 3.1448 3.3447
When φ = 2
25 0.80 163.6170 22.1057 21.3789 0.2556 0.9386 127.4251 128.6456
50 13.9317 1.9206 1.8585 0.0404 0.0796 11.9982 12.0981
100 2.0818 0.3567 0.2939 0.0034 0.0225 1.9661 1.9750
200 0.6309 0.1499 0.1416 0.0025 0.0120 0.6182 0.6189
25 0.90 327.8830 37.1460 34.4431 1.0995 1.2700 244.8162 246.2732
50 27.8044 3.9040 1.7775 0.3920 0.1212 24.5874 24.6808
100 3.8784 0.6736 0.6192 0.0046 0.0340 3.6881 3.7006
200 1.1557 0.2681 0.1375 0.0027 0.0144 1.1451 1.1465
25 0.95 807.3380 100.4770 69.7909 4.6646 4.1621 584.5510 587.1855
50 59.7684 8.8121 5.2913 0.5602 0.3599 53.2102 53.3345
100 7.3134 1.2040 1.1275 0.0062 0.0303 7.0883 7.1002
200 2.2628 0.5338 0.4062 0.0034 0.0218 2.2502 2.2509
25 0.99 3069.9800 366.2120 190.0010 24.7072 11.0610 1909.7800 1911.1150
50 234.8800 31.9180 29.1674 3.8352 0.7484 204.9293 205.0676
100 34.0014 5.8119 4.0300 0.0940 0.1736 33.3838 33.3991
200 9.9222 2.1770 2.1495 0.0088 0.0720 9.8937 9.8942

that the best option is β̂_TPAG with (d*, k1) and the second-best choice is β̂_TPAG with (d*, k2) in almost all conditions considered in our simulation study.
Finally, we conclude that, overall, β̂_TPAG with (d*, k1) performs best in terms of minimum MSE. One should therefore apply β̂_TPAG with shrinkage parameter (d*, k1) to obtain valid statistical inferences.

4. Application: liver cirrhosis death rate data set


The performance of the proposed estimators is also evaluated using a real data set: the liver cirrhosis death rate data taken from Brownlee (1965). The sample comprises 46 observations; the response variable y indicates the standardized liver cirrhosis death rate, and there are four explanatory variables: x1, the percent of the population that is urban; x2, the number of children ever born to women 45–49 years old; x3, the consumption of wine per capita; and x4, the spirits consumption.
First, it is important to identify the probability distribution of the response variable in order to build a suitable regression model. For this purpose, three tests are considered: Anderson–Darling, Cramér–von Mises and Pearson χ². The results in Table 7 show that the liver cirrhosis data set is well fitted by the IG distribution; in particular, the Cramér–von Mises test identifies the IG distribution as the most suitable, with a test statistic (p-value) of 0.0509 (0.8047). Moreover, the dispersion parameter estimated from Eq. (12) is 0.00068.
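A distribution check of this kind can be reproduced in outline with scipy. The sketch below uses synthetic inverse Gaussian data in place of the Brownlee (1965) observations (which are not reproduced here), so the mean, scale and seed are illustrative assumptions, not the paper's values.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical positively skewed sample standing in for the 46 death-rate
# observations; the mean and scale are illustrative assumptions
y = rng.wald(40.0, 60.0, size=46)

# Fit the inverse Gaussian; scipy's invgauss(m, scale=s) corresponds to
# IG(mean = m * s, lambda = s), with the location fixed at zero
m, loc, scale = stats.invgauss.fit(y, floc=0)

# Cramer-von Mises test of the fitted IG distribution
res = stats.cramervonmises(y, "invgauss", args=(m, loc, scale))
print(f"CvM statistic = {res.statistic:.4f}, p-value = {res.pvalue:.4f}")
```

A large p-value means the IG distribution cannot be rejected for y. Note that p-values computed with estimated parameters are approximate, since the test's reference distribution assumes the parameters are known.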

Table 5. Estimated MSE of the estimators when p = 8.

n     a_o    MLE       IGRE     IGLE     TPAG(d*, k1)  TPAG(d*, k2)  TPHY(D1, K1)  TPHY(D2, K2)
When φ = 0.1
25 0.80 1.3750 0.2808 0.1082 0.0016 0.0094 1.3045 1.3065
50 0.4190 0.1032 0.0784 0.0012 0.0066 0.4107 0.4111
100 0.1766 0.0486 0.0402 0.0009 0.0057 0.1756 0.1756
200 0.0829 0.0249 0.0216 0.0008 0.0055 0.0631 0.0820
25 0.90 2.6874 0.5282 0.1417 0.0018 0.0091 2.6075 2.6088
50 0.8351 0.1995 0.1080 0.0012 0.0058 0.8267 0.8268
100 0.3291 0.0869 0.0628 0.0009 0.0045 0.3287 0.3288
200 0.1580 0.0442 0.0351 0.0008 0.0045 0.1575 0.1573
25 0.95 4.8646 0.9219 0.1686 0.0027 0.0111 4.7359 4.7383
50 1.6041 0.3717 0.1425 0.0016 0.0065 1.5914 1.5916
100 0.6590 0.1737 0.0996 0.0011 0.0040 0.6588 0.6587
200 0.3030 0.0825 0.0571 0.0009 0.0039 0.2040 0.2038
25 0.99 24.5921 4.7454 0.4814 0.0084 0.0358 24.2154 24.2183
50 8.1467 1.9095 0.3086 0.0029 0.0154 8.1148 8.1153
100 3.2310 0.8207 0.2223 0.0021 0.0066 3.2295 3.2296
200 1.4594 0.3797 0.1444 0.0013 0.0039 1.4532 1.4530
When φ = 0.25
25 0.80 5.5967 0.9754 0.5079 0.0041 0.0346 5.0747 5.0913
50 1.1089 0.2409 0.2315 0.0018 0.0128 1.0703 1.0716
100 0.3983 0.1061 0.1058 0.0013 0.0076 0.3957 0.3959
200 0.1774 0.0539 0.0480 0.0011 0.0070 0.1578 0.1477
25 0.90 10.2367 1.8160 0.7827 0.0052 0.0436 9.5873 9.6047
50 2.2273 0.4784 0.3755 0.0022 0.0138 2.1757 2.1779
100 0.7251 0.1906 0.1066 0.0016 0.0073 0.7226 0.7225
200 0.3309 0.0943 0.0117 0.0011 0.0062 0.3308 0.3316
25 0.95 19.8670 3.2688 1.4342 0.0091 0.0695 18.8210 18.8399
50 3.9995 0.8062 0.5041 0.0029 0.0170 3.9378 3.9400
100 1.4272 0.3572 0.3380 0.0017 0.0084 1.4226 1.4227
200 0.6665 0.1915 0.1130 0.0015 0.0072 0.6618 0.6628
25 0.99 92.9084 15.3984 5.5590 0.3773 0.2503 90.0573 90.0751
50 21.1377 4.4681 1.7972 0.0076 0.0376 20.9268 20.9304
100 7.4629 1.8725 1.0767 0.0036 0.0214 7.4460 7.4464
200 3.2117 0.8817 0.6841 0.0022 0.0135 3.1177 3.2113

The correlation matrix of the liver cirrhosis data set is given in Table 8. It is clearly observed from Table 8 that the correlations among the explanatory variables are all greater than 0.98, indicating severe multicollinearity in the data set. Since the correlation matrix alone is not sufficient for detecting multicollinearity, we also use the condition index (CI) to check for the existence of a multicollinearity issue. The CI = [max(λ_j)/min(λ_j)]^(1/2) of the data is 38.6598, which is greater than 30, indicating the presence of severe multicollinearity among the explanatory variables.
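The condition index computation can be sketched as follows. The design matrix here is a synthetic, deliberately near-collinear stand-in (the actual death-rate predictors are not reproduced), so the numbers it prints are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical near-collinear stand-in for the four explanatory variables
base = rng.standard_normal(46)
X = np.column_stack([base + 0.01 * rng.standard_normal(46) for _ in range(4)])

# Standardize the columns so that Z'Z is (n times) the correlation matrix of X
Z = (X - X.mean(axis=0)) / X.std(axis=0)
eigvals = np.linalg.eigvalsh(Z.T @ Z)

# Condition index CI = sqrt(max eigenvalue / min eigenvalue);
# CI > 30 is the usual rule of thumb for severe multicollinearity
ci = float(np.sqrt(eigvals.max() / eigvals.min()))
print(f"condition index = {ci:.2f}")
```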
The estimated coefficients and MSEs of the IG regression estimators are summarized in Table 9. It is observed that the biased IG regression estimators have smaller MSE values than the MLE. Moreover, TPAG with parameter (d*, k1) has a smaller estimated MSE than the other biased estimators. Hence, the results of the simulation study and the real data set illustrate that TPAG performs best in the presence of mild to severe multicollinearity.

5. Concluding remarks
In this paper, we propose a TPE for the IGRM to overcome the effect of multicollinearity, based on the work of Asar and Genç (2018) and Huang and Yang (2014), and derive the optimal values of the shrinkage parameters. Moreover, we propose some methods to estimate the shrinkage

Table 6. Estimated MSE of the estimators when p = 8.

n     a_o    MLE       IGRE     IGLE     TPAG(d*, k1)  TPAG(d*, k2)  TPHY(D1, K1)  TPHY(D2, K2)
When φ = 0.5
25 0.80 19.1160 2.7883 2.5617 0.0111 0.0610 16.7180 16.7834
50 2.5575 0.4754 0.3095 0.0029 0.0225 2.4080 2.4149
100 0.7356 0.1805 0.1802 0.0019 0.0118 0.7213 0.7221
200 0.2913 0.0843 0.0142 0.0015 0.0095 0.2910 0.2911
25 0.90 39.1671 5.8548 5.1340 0.2035 0.1411 35.2584 35.3289
50 5.1772 0.9367 0.8062 0.0034 0.0248 4.9708 4.9772
100 1.4241 0.3393 0.2966 0.0022 0.0138 1.4082 1.4090
200 0.5554 0.1530 0.1540 0.0015 0.0083 0.5512 0.5531
25 0.95 66.0231 9.2630 7.9243 7.4469 0.2176 61.0847 61.1607
50 10.3753 1.9151 1.9150 0.0053 0.0383 10.0799 10.0908
100 2.6650 0.6348 0.5395 0.0024 0.0141 2.6418 2.6427
200 1.1182 0.3088 0.2741 0.0019 0.0118 1.1151 1.1180
25 0.99 343.1514 48.9809 42.0222 0.3654 0.8461 310.8070 310.9075
50 51.8494 9.2572 8.1505 0.1577 0.1804 50.6598 50.6731
100 14.0387 3.2415 3.0166 0.0066 0.0617 13.9779 13.9800
200 5.3411 1.4552 1.2733 0.0031 0.0252 5.2445 5.1441
When φ = 2
25 0.80 641.40 65.18 34.75 16.01 2.91 464.93 468.49
50 30.442 3.928 1.831 0.104 0.186 25.788 25.950
100 4.529 0.734 0.353 0.004 0.044 4.260 4.276
200 1.004 0.214 0.165 0.002 0.015 0.994 0.994
25 0.90 1333.12 148.45 55.88 3.21 4.23 998.52 1001.21
50 66.050 8.334 6.732 0.124 0.283 58.095 58.262
100 7.225 1.171 1.150 0.004 0.034 6.947 6.958
200 1.955 0.415 0.323 0.002 0.019 1.937 1.939
25 0.95 2661.97 228.62 78.49 2.47 7.31 1719.63 1721.90
50 122.07 15.49 12.51 0.152 0.33 106.65 106.80
100 14.19 2.13 1.43 0.01 0.05 13.78 13.80
200 3.85 0.79 0.29 0.00 0.02 3.82 3.82
25 0.99 13869.59 1031.88 257.58 2.23 52.95 7647.59 7652.11
50 595.46 65.97 23.16 0.13 0.24 488.67 488.86
100 74.07 11.23 10.63 0.79 0.31 72.06 72.08
200 19.67 4.20 0.15 0.01 0.10 19.55 19.55

Table 7. Goodness-of-fit tests for the candidate probability distributions.

Goodness-of-fit tests        Normal   Exponential   Gamma    IG       Weibull
AD Statistic 0.7227 9.0060 0.2787 0.2864 0.6264
p-value 0.0578 0.0000 0.4555 0.7548 0.1026
CVM Statistic 0.1232 1.8195 0.0500 0.0509 0.1047
p-value 0.0538 0.0000 0.6493 0.8047 0.0899
Pearson v2 Statistic 13.5652 53.1304 10.087 7.4783 7.4782
p-value 0.0595 0.0000 0.1837 0.5808 0.3808
AD = Anderson–Darling, CVM = Cramér–von Mises.
Note: The bold value indicates that the IG distribution is well fitted.

Table 8. Correlation between the IG explanatory variables of death rate data.


x1 x2 x3 x4
x1 1 0.9987 0.9949 0.9835
x2 0.9987 1 0.9985 0.9914
x3 0.9949 0.9985 1 0.9964
x4 0.9835 0.9914 0.9964 1

parameters k and d. The performance of the suggested estimators is assessed through a Monte
Carlo simulation study and a real data application in terms of minimum MSE. Based on the

Table 9. Estimated coefficients and MSEs of the listed estimators for the liver cirrhosis death rate data.

Estimators      β̂_0     β̂_1     β̂_2     β̂_3     β̂_4     MSE
MLE             4.102   0.392   0.714   1.233   0.047   0.845
IGRE            4.102   0.403   0.701   1.221   0.036   0.798
IGLE            3.868   0.348   0.638   1.105   0.045   0.760
TPAG(d*, k1)    4.075   0.172   0.743   1.094   0.319   0.382
TPAG(d*, k2)    3.975   0.626   0.852   0.587   1.311   0.643
TPHY(D1, K1)    4.102   0.390   0.714   1.232   0.049   0.837
TPHY(D2, K2)    4.075   0.170   0.744   1.092   0.322   0.754

results of the simulation and the example, we conclude that TPAG(d*, k1) performs best and is the recommended option to combat multicollinearity in the IGRM. The MLE, in contrast, is not a good choice in the presence of mild to severe multicollinearity, and the MSE of TPHY is larger than that of the other biased estimation methods in all considered situations. We therefore suggest using TPAG with shrinkage parameter (d*, k1) instead of the other estimators for estimating the parameters of the IGRM with correlated explanatory variables.

ORCID
Muhammad Nauman Akram http://orcid.org/0000-0001-6688-808X
Muhammad Amin http://orcid.org/0000-0002-7431-5756

References
Akdeniz, F., and S. Kaçiranlar. 2001. More on the new biased estimator in linear regression. Indian Journal of
Statistics 63 (3):321–25. doi:10.2307/25053183.
Akram, M. N., M. Amin, and M. Qasim. 2020. A new Liu-type estimator for the inverse Gaussian regression
model. Journal of Statistical Computation and Simulation 90 (7):1153–72. doi:10.1080/00949655.2020.1718150.
Algamal, Z. Y. 2019. Performance of ridge estimator in inverse Gaussian regression model. Communications in
Statistics–Theory and Methods 48 (15):3836–49. doi:10.1080/03610926.2018.1481977.
Alheety, M. I., and B. M. G. Kibria. 2009. On the Liu and almost unbiased Liu estimators in the presence of multi-
collinearity with heteroscedastic or correlated errors. Surveys in Mathematics and Its Applications 4:155–67.
Alkhamisi, M. A., and G. Shukur. 2008. Developing ridge parameters for SUR model. Communications in Statistics
– Theory and Methods 37 (4):544–64. doi:10.1080/03610920701469152.
Amin, M., M. Amanullah, and M. Aslam. 2016. Empirical evaluation of the inverse Gaussian regression residuals
for the assessment of influential points. Journal of Chemometrics 30 (7):394–404. doi:10.1002/cem.2805.
Amin, M., M. Amanullah, and M. Qasim. 2020. Diagnostic techniques for the inverse Gaussian regression model.
Communications in Statistics – Theory and Methods. Advance online publication. doi:10.1080/03610926.2020.
1777308.
Amin, M., M. N. Akram, and M. Amanullah. 2020. On the James-Stein estimator for the poisson regression model.
Communications in Statistics – Simulation and Computation. Advance online publication. doi:10.1080/03610918.
2020.1775851.
Amin, M., M. Qasim, and M. Amanullah. 2019. Performance of Asar and Genç and Huang and Yang’s two-param-
eter estimation methods for the gamma regression model. Iranian Journal of Science and Technology,
Transactions A: Science 43 (6):2951–63. doi:10.1007/s40995-019-00777-3.
Amin, M., M. Qasim, M. Amanullah, and S. Afzal. 2020a. Performance of some ridge estimators for the gamma
regression model. Statistical Papers 61 (3):997–1026. doi:10.1007/s00362-017-0971-z.
Amin, M., M. Qasim, S. Afzal, and K. Naveed. 2020b. New ridge estimators in the inverse Gaussian regression:
Monte Carlo simulation and application to chemical data. Communications in Statistics-Simulation and
Computation. Advance online publication. doi:10.1080/03610918.2020.1797794.
Asar, Y., A. Karaibrahimoglu, and A. Genç. 2014. Modified ridge regression parameters: A comparative Monte
Carlo study. Hacettepe Journal of Mathematics and Statistics 43 (5):827–41.

Asar, Y., and A. Genç. 2015. On some new modifications of ridge estimators. Kuwait Journal of Sciences 44:4–57. arXiv:1512.02773v1.
Asar, Y., and A. Genç. 2018. A new two-parameter estimator for the Poisson regression model. Iranian Journal of
Science and Technology, Transactions A: Science 42 (2):793–803. doi:10.1007/s40995-017-0174-4.
Brownlee, K. A. 1965. Statistical theory and methodology in science and engineering. New York: Wiley.
Farebrother, R. W. 1976. Further results on the mean square error of ridge regression. Journal of the Royal
Statistical Society: Series B (Methodological) 38 (3):248–50. doi:10.1111/j.2517-6161.1976.tb01588.x.
Frisch, R. 1934. Statistical confluence analysis by means of complete regression systems. The Economic Journal 45
(180):741–42. doi:10.2307/2225583.
Hoerl, A. E., and R. W. Kennard. 1970a. Ridge regression: Biased estimation for nonorthogonal problems.
Technometrics 12 (1):55–67. doi:10.1080/00401706.1970.10488634.
Hoerl, A. E., and R. W. Kennard. 1970b. Ridge regression: Applications to nonorthogonal problems. Technometrics
12 (1):69–82. doi:10.1080/00401706.1970.10488635.
Huang, H., J. Wu, and W. Yi. 2017. On the restricted almost unbiased two-parameter estimator in linear regression model. Communications in Statistics - Theory and Methods 46 (4):1668–78. doi:10.1080/03610926.2015.1026991.
Huang, J., and H. Yang. 2014. A two-parameter estimator in the negative binomial regression model. Journal of
Statistical Computation and Simulation 84 (1):124–34. doi:10.1080/00949655.2012.696648.
Hubert, M. H., and P. Wijekoon. 2006. Improvement of the Liu estimator in linear regression model. Statistical
Papers 47 (3):471–79. doi:10.1007/s00362-006-0300-4.
Khalaf, G., K. Månsson, P. Sjölander, and G. Shukur. 2014. A tobit ridge regression estimator. Communications in
Statistics - Theory and Methods 43 (1):131–40. doi:10.1080/03610926.2012.655881.
Kibria, B. M. G. 2003. Performance of some new ridge regression estimators. Communications in Statistics -
Simulation and Computation 32 (2):419–35. doi:10.1081/SAC-120017499.
Kibria, B. M. G. 2012. Some Liu and ridge-type estimators and their properties under the ill-conditioned Gaussian
linear regression model. Journal of Statistical Computation and Simulation 82 (1):1–17. doi:10.1080/00949655.
2010.519705.
Kibria, B. M. G., and S. Banik. 2016. Some ridge regression estimators and their performances. Journal of Modern
Applied Statistical Methods 15 (1):206–38. doi:10.22237/jmasm/1462075860.
Kibria, B. M. G., K. Månsson, and G. Shukur. 2012. Performance of some logistic ridge regression estimators.
Computational Economics 40 (4):401–14. doi:10.1007/s10614-011-9275-x.
Kinat, S., M. Amin, and T. Mehmood. 2020. GLM-based control charts for the Inverse Gaussian response variable.
Quality and Reliability Engineering International 36 (2):765–83. doi:10.1002/qre.2603.
Liu, K. 1993. A new class of biased estimate in linear regression. Communication in Statistics-Theory and Methods
22 (2):393–402. doi:10.1080/03610929308831027.
Månsson, K. 2012. On ridge estimators for the negative binomial regression model. Economic Modelling 29 (2):
178–84. doi:10.1016/j.econmod.2011.09.009.
Månsson, K. 2013. Developing a Liu estimator for the negative binomial regression model: Method and applica-
tion. Journal of Statistical Computation and Simulation 83 (9):1773–80. doi:10.1080/00949655.2012.673127.
Månsson, K., and G. Shukur. 2011. A Poisson ridge regression estimator. Economic Modelling 28 (4):1475–81. doi:
10.1016/j.econmod.2011.02.030.
Månsson, K., B. M. G. Kibria, and G. Shukur. 2012. On Liu estimators for the logit regression model. Economic
Modelling 29 (4):1483–8. doi:10.1016/j.econmod.2011.11.015.
McClendon, M. J. 2002. Multiple regression and causal analysis. US: Waveland Press.
Muniz, G., and B. M. G. Kibria. 2009. On some ridge regression estimators: An empirical comparisons.
Communications in Statistics - Simulation and Computation 38 (3):621–30. doi:10.1080/03610910802592838.
Naveed, K., M. Amin, S. Afzal, and M. Qasim. 2020. New shrinkage parameters for the inverse Gaussian Liu
regression. Communications in Statistics-Theory and Methods. Advance online publication. doi:10.1080/
03610926.2020.1791339.
Qasim, M., B. M. G. Kibria, K. Månsson, and P. Sjölander. 2019. A new Poisson Liu regression estimator: Method
and application. Journal of Applied Statistics. Advance online publication. doi:10.1080/02664763.2019.1707485.
Qasim, M., M. Amin, and M. Amanullah. 2018. On the performance of some new Liu parameters for the gamma
regression model. Journal of Statistical Computation and Simulation 88 (16):3065–80. doi:10.1080/00949655.
2018.1498502.
Segerstedt, B. 1992. On ordinary ridge regression in generalized linear models. Communications in Statistics -
Theory and Methods 21 (8):2227–46. doi:10.1080/03610929208830909.
Yang, H., and X. Chang. 2010. A new two-parameter estimator in linear regression. Communications in Statistics-
Theory and Methods 39 (6):923–34. doi:10.1080/03610920902807911.
