Two-Parameter Estimator For The Inverse Gaussian Regression Model
To cite this article: Muhammad Nauman Akram, Muhammad Amin & Muhammad
Amanullah (2022) Two-parameter estimator for the inverse Gaussian regression model,
Communications in Statistics - Simulation and Computation, 51:10, 6208-6226, DOI:
10.1080/03610918.2020.1797797
1. Introduction
The linear regression model (LRM) is used when the distribution of the response variable is normal. If this assumption is violated and the response variable is well fitted by some member of the exponential family of distributions, such as the binomial, gamma, Poisson, geometric or inverse Gaussian (IG), then the generalized linear model (GLM) is used instead of the LRM. The inverse Gaussian regression model (IGRM) is mainly applied when the response variable is positively skewed (Amin, Amanullah, and Aslam 2016; Amin, Amanullah, and Qasim 2020). The IGRM has applications in many fields, e.g., engineering, the physical sciences, the social sciences, the medical sciences, the environment and business (Amin, Amanullah, and Aslam 2016; Amin, Amanullah, and Qasim 2020; Akram, Amin, and Qasim 2020; Kinat, Amin, and Mehmood 2020). The maximum likelihood estimator (MLE) is more suitable than the ordinary least squares (OLS) estimator for estimating the IGRM coefficients, describing different phenomena, and investigating them quantitatively.
It is very common in multiple regression modeling that the explanatory variables are correlated with one another, especially in health, economics and the social sciences. This correlation among explanatory variables is called multicollinearity. Frisch (1934) initially discussed this issue in the LRM; like the LRM, the IGRM may also suffer from it. The MLE is the most commonly used method for estimating the unknown regression coefficients of the IGRM, but the variances and standard errors of the regression coefficients become very large in the presence of multicollinearity (McClendon 2002). Thus, it is generally difficult to construct valid statistical inferences when multicollinearity exists (Asar and Genç 2018).
To minimize the effect of multicollinearity, biased estimation methods have generally attracted more attention than any other approach. Among these, ridge estimation, initially proposed by Hoerl and Kennard (1970a, 1970b), is considered the most attractive alternative to the MLE. Different approaches have been adopted for selecting the optimal value of the shrinkage parameter k; see, for example, Kibria (2003), Muniz and Kibria (2009), Alkhamisi and Shukur (2008), Asar, Karaibrahimoglu, and Genç (2014), Asar and Genç (2015) and Kibria and Banik (2016). Segerstedt (1992) extended this work to the GLM, and many researchers have since worked on the selection of k for GLMs with different responses, some for discrete and some for continuous response models. For a detailed study, we refer to Månsson and Shukur (2011), Khalaf et al. (2014), Kibria, Månsson, and Shukur (2012), Månsson (2012), Qasim et al. (2019), Amin, Akram, and Amanullah (2020) and Amin et al. (2020a, 2020b). Another method to combat multicollinearity is the Liu estimation method, proposed by Liu (1993). The advantage of the Liu estimator over the ridge estimator is that it is a linear function of the Liu parameter d, whose range is 0 to 1 (Qasim, Amin, and Amanullah 2018). For a detailed study of the choice of the Liu parameter in the LRM, we suggest Liu (1993), Akdeniz and Kaçiranlar (2001), Hubert and Wijekoon (2006), Alheety and Kibria (2009), Kibria (2012) and Qasim et al. (2019). The literature on the Liu parameter for the GLM is more limited, although a few studies suggest Liu parameters for specific cases of the GLM, e.g., Månsson (2013), Qasim, Amin, and Amanullah (2018) and Naveed et al. (2020). A further estimation method to combat multicollinearity, the two-parameter estimation (TPE) method, combines the advantages of both the ridge and Liu estimators. Several TPEs have been proposed for the LRM to minimize the effect of multicollinearity: for instance, Yang and Chang (2010) introduced a new TPE for the LRM, and Huang, Wu, and Yi (2017) introduced the restricted almost unbiased TPE for the LRM. On the other hand, the literature on the TPE for the GLM is very limited; for different characterizations of this method in various forms of the GLM, see the following studies. Huang and Yang (2014) developed a TPE for the negative binomial regression model (NBRM) to minimize the problem of correlated explanatory variables. Asar and Genç (2018) proposed a two-parameter ridge estimator for binary logistic regression and suggested a TPE for the Poisson regression model (PRM), and, most recently, Amin et al. (2019) assessed the performance of the Asar and Genç and Huang and Yang TPE methods for the gamma regression model. The above literature shows that no study of the TPE for the IGRM is available. So, we propose a TPE for the IGRM to overcome the effect of multicollinearity among the explanatory variables.
The purpose of this article is to propose a TPE for the IGRM and to suggest some estimation methods for the shrinkage parameters to improve the efficiency of the IGRM estimates. We also assess the performance of the biased estimation methods (ridge, Liu and TPE) for the IGRM with the help of a Monte Carlo simulation study and a real-life application, with the mean squared error (MSE) as the evaluation criterion. Furthermore, the TPE methods of Asar and Genç (2018) and Huang and Yang (2014) are compared for the IGRM to find out which estimation method outperforms the others in the presence of high multicollinearity.
The organization of this article is as follows. In Sec. 2, we present the methodology, which contains a review of the IGRM; we also give the biased estimation methods, i.e., ridge, Liu and TPE, for the IGRM with some mathematical manipulations, and present a theoretical comparison based on the matrix mean squared error (MMSE) and MSE. The design of the Monte Carlo simulation and its results are given in Sec. 3. The application of the proposed methods is given in Sec. 4. Finally, some concluding remarks are given in Sec. 5.
2. Methodology
2.1. Inverse Gaussian regression model
Suppose that y1, y2, ..., yn are independently and identically distributed observations from the IG distribution with mean \(\mu_i\). The IG distribution is a member of the exponential family, with density

\[ f(y;\theta,\phi) = \exp\left\{ \frac{y\theta - b(\theta)}{\phi} + c(y,\phi) \right\}, \tag{1} \]

where \(\theta\) is the location parameter, \(b(\theta)\) is the cumulant function, \(\phi > 0\) is the dispersion parameter, and \(c(y,\phi)\) is the normalization term. The likelihood function corresponding to Eq. (1) is

\[ l(y_i;\theta_i,\phi) = \prod_{i=1}^{n} \exp\left\{ \frac{y_i\theta_i - b(\theta_i)}{\phi} + c(y_i,\phi) \right\}. \tag{2} \]
Let \(y_i\) be a random variable from the IG distribution with drift parameter \(\mu\) and volatility parameter \(\lambda\), denoted IG(\(\mu,\lambda\)); its probability density function is

\[ f(y;\mu,\lambda) = \sqrt{\frac{\lambda}{2\pi y^{3}}}\, \exp\left[ -\frac{\lambda (y-\mu)^{2}}{2\mu^{2} y} \right], \qquad y>0,\ \mu>0,\ \lambda>0. \tag{3} \]
For the IGRM, we consider the reciprocal square-root link, \(\mu = 1/\sqrt{\eta}\), i.e., \(g(\mu) = \mu^{-2} = \eta\), where \(\eta_i = x_i'\beta\) is the linear predictor, \(\beta = (\beta_0, \beta_1, \ldots, \beta_p)'\) is the vector of regression coefficients including the intercept, and \(x_i = (1, x_{i1}, \ldots, x_{ip})'\) holds the \(r = p+1\) explanatory variables. Expanding the quadratic in the exponent, the density in Eq. (3) can be rewritten as

\[ f(y;\mu,\lambda) = \exp\left[ -\lambda\left( \frac{y}{2\mu^{2}} - \frac{1}{\mu} \right) - \frac{1}{2}\ln\left( \frac{2\pi y^{3}}{\lambda} \right) - \frac{\lambda}{2y} \right]. \tag{4} \]
Thus the mean and variance of the IG distribution are, respectively,

\[ E(y) = b'(\theta) = \frac{\partial b}{\partial \mu}\frac{\partial \mu}{\partial \theta} = \mu = \frac{1}{\sqrt{\eta}}, \]
\[ \mathrm{Var}(y) = \phi\, b''(\theta) = \phi\, V(\mu) = \phi\,\mu^{3} = \frac{\mu^{3}}{\lambda}, \]

where \(V(\mu) = \mu^{3}\) is the variance function and \(\phi = 1/\lambda\).
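These moment formulas are easy to verify by simulation. The sketch below (Python/NumPy, our own illustration; the parameter values are arbitrary) uses NumPy's Wald sampler, which parameterizes the inverse Gaussian by (mean, scale) = (\(\mu\), \(\lambda\)):

```python
import numpy as np

# Check E(y) = mu and Var(y) = mu^3 / lambda for the IG(mu, lambda) law.
rng = np.random.default_rng(0)
mu, lam = 2.0, 5.0
y = rng.wald(mean=mu, scale=lam, size=1_000_000)

print(y.mean())   # close to mu = 2.0
print(y.var())    # close to mu^3 / lam = 8 / 5 = 1.6
```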
Applying the logarithm to Eq. (4) over the sample, we have

\[ l(\mu_i,\lambda) = \sum_{i=1}^{n} \left[ -\lambda\left( \frac{y_i}{2\mu_i^{2}} - \frac{1}{\mu_i} \right) - \frac{1}{2}\ln\left( \frac{2\pi y_i^{3}}{\lambda} \right) - \frac{\lambda}{2y_i} \right], \tag{5} \]

where \(\mu_i = (x_i'\beta)^{-1/2}\). Then Eq. (5) becomes

\[ l(\beta,\lambda) = \sum_{i=1}^{n} \left[ -\lambda\left( \frac{y_i\, x_i'\beta}{2} - \sqrt{x_i'\beta} \right) - \frac{1}{2}\ln\left( \frac{2\pi y_i^{3}}{\lambda} \right) - \frac{\lambda}{2y_i} \right]. \tag{6} \]
The estimate of \(\beta\) is obtained by taking the derivative of Eq. (6) and equating it to zero:

\[ \frac{\partial l(\beta,\lambda)}{\partial \beta} = -\lambda \sum_{i=1}^{n} \left[ \frac{y_i}{2} - \frac{1}{2}(x_i'\beta)^{-1/2} \right] x_i = 0. \tag{7} \]

Eq. (7) is non-linear in the regression coefficients, so it is solved iteratively with the help of Fisher's method of scoring. At each iteration, the parameters are updated as

\[ \hat\beta^{(r+1)} = \hat\beta^{(r)} + \big(X'\hat W^{(r)}X\big)^{-1} X'\hat W^{(r)}\big(\hat z^{(r)} - \hat\eta^{(r)}\big), \tag{8} \]

where \(\hat\beta^{(r)}\) represents the estimate of \(\beta\) at the r-th iteration. If the estimates \(\hat\beta^{(r+1)}\) converge to \(\hat\beta\) as r approaches infinity, we obtain

\[ \hat\beta_{\mathrm{MLE}} = H^{-1} X'\hat W \hat z, \tag{9} \]

where \(H = X'\hat W X\), \(\hat W = \mathrm{diag}(\hat\mu_1^{3}, \ldots, \hat\mu_n^{3})\), and \(\hat z_i = \hat\eta_i - 2(y_i - \hat\mu_i)/\hat\mu_i^{3}\), \(i = 1, 2, \ldots, n\), is the adjusted response variable. The MLE is asymptotically normal with mean vector \(\beta\) (i.e., zero bias) and covariance matrix \(\mathrm{Cov}(\hat\beta_{\mathrm{MLE}}) = \hat\phi\,(X'\hat W X)^{-1}\). To find the MMSE and scalar MSE of an estimator, one utilizes the spectral decomposition of the matrix H, i.e., \(H = Q\Lambda Q'\), where Q is the matrix of the eigenvectors of H and \(\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_r)\) is the diagonal matrix of its eigenvalues. Thus, the MMSE and scalar MSE of Eq. (9) are, respectively,

\[ \mathrm{MMSE}\big(\hat\beta_{\mathrm{MLE}}\big) = \phi\, Q\Lambda^{-1}Q', \tag{10} \]
\[ \mathrm{MSE}\big(\hat\beta_{\mathrm{MLE}}\big) = \phi \sum_{j=1}^{r} \frac{1}{\lambda_j}, \tag{11} \]

where \(\lambda_j\) represents the j-th eigenvalue of H. When there exists high correlation among the explanatory variables, the weighted cross-product matrix H is ill-conditioned, resulting in high variance of the MLE. It then becomes very difficult to estimate and interpret the regression parameters, as the estimated coefficients are too large (Amin et al. 2020a). The dispersion parameter \(\phi\) is estimated as

\[ \hat\phi = \frac{1}{n-p-1} \sum_{i=1}^{n} \frac{(y_i - \hat\mu_i)^{2}}{\hat\mu_i^{3}}. \tag{12} \]
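The scoring scheme of Eqs. (7)-(9) together with the dispersion estimate of Eq. (12) can be sketched as follows. This is a minimal illustrative implementation in Python/NumPy (the paper's computations were done in R); the starting value and the positivity safeguard on the linear predictor are our own assumptions, not part of the paper:

```python
import numpy as np

def fit_igrm_mle(X, y, tol=1e-10, max_iter=100):
    """Fisher scoring (IRLS) for the IGRM with link mu = eta^(-1/2)."""
    n, r = X.shape
    # Crude starting value: regress 1/y^2 on X, since eta = 1/mu^2 ~ 1/y^2.
    beta = np.linalg.lstsq(X, 1.0 / y**2, rcond=None)[0]
    for _ in range(max_iter):
        eta = np.maximum(X @ beta, 1e-8)    # keep the linear predictor positive
        mu = eta ** -0.5
        w = mu**3                            # diagonal of W-hat
        z = eta - 2.0 * (y - mu) / mu**3     # adjusted response z-hat
        beta_new = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))
        if np.max(np.abs(beta_new - beta)) < tol:
            beta = beta_new
            break
        beta = beta_new
    mu = np.maximum(X @ beta, 1e-8) ** -0.5
    phi = np.sum((y - mu) ** 2 / mu**3) / (n - r)   # Eq. (12), r = p + 1
    return beta, phi

# Illustrative check on simulated data (lambda = 50, i.e. phi = 0.02):
rng = np.random.default_rng(3)
n = 20_000
X = np.column_stack([np.ones(n), rng.uniform(0.0, 1.0, n)])
beta_true = np.array([1.0, 0.5])
y = rng.wald((X @ beta_true) ** -0.5, 50.0)
beta_hat, phi_hat = fit_igrm_mle(X, y)
print(beta_hat, phi_hat)   # beta_hat ~ (1.0, 0.5), phi_hat ~ 0.02
```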
To combat multicollinearity in the IGRM, Algamal (2019) proposed the inverse Gaussian ridge estimator (IGRE), defined as

\[ \hat\beta_{\mathrm{IGRE}} = (H + kI)^{-1} H\, \hat\beta_{\mathrm{MLE}}, \qquad k > 0, \tag{13} \]

where k (k > 0) is the ridge parameter, which is estimated as \(k = \hat\phi/\hat\alpha_j^{2}\), and I is the identity matrix of order \(r \times r\).

Akram et al. (2020) introduced the inverse Gaussian Liu estimator (IGLE) to overcome the limitations of the IGRE, defined as

\[ \hat\beta_{\mathrm{IGLE}} = (H + I)^{-1}(H + dI)\, \hat\beta_{\mathrm{MLE}}, \qquad 0 < d < 1, \tag{14} \]

where

\[ d = \frac{\sum_{j=1}^{r} (\alpha_j^{2} - \phi)\big/(\lambda_j + 1)^{2}}{\sum_{j=1}^{r} (\phi + \lambda_j\alpha_j^{2})\big/\big[\lambda_j(\lambda_j + 1)^{2}\big]} \]

is called the Liu parameter.
Asar and Genç (2018) proposed a TPE combining the ridge and Liu estimators for the PRM; the same construction is valid for the IGRM. So, we propose an estimator, called the two-parameter Asar and Genç (TPAG) estimator, defined as

\[ \hat\beta_{\mathrm{TPAG}} = (H + kI)^{-1}(H + kdI)\, \hat\beta_{\mathrm{MLE}}, \tag{15} \]

where k > 0 and 0 < d < 1. The TPAG is a general estimator that contains the MLE, IGRE and IGLE as special cases: putting k = 0 in Eq. (15) gives \(\hat\beta_{\mathrm{TPAG}} = \hat\beta_{\mathrm{MLE}}\); k = 1 gives \(\hat\beta_{\mathrm{TPAG}} = \hat\beta_{\mathrm{IGLE}}\); and d = 0 gives \(\hat\beta_{\mathrm{TPAG}} = \hat\beta_{\mathrm{IGRE}}\).

Following Huang and Yang (2014), a TPE relating the IGRE and the IGLE, called the two-parameter Huang and Yang (TPHY) estimator, is defined as

\[ \hat\beta_{\mathrm{TPHY}} = (S + I)^{-1}(S + dI)(S + kI)^{-1} S\, \hat\beta_{\mathrm{MLE}}, \tag{16} \]

where \(S = X'\hat W X = H\), k > 0 and 0 < d < 1.
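All four estimators in Eqs. (13)-(16) are simple matrix transformations of the MLE, so they can be computed in a few lines once \(H = X'\hat W X\) and \(\hat\beta_{\mathrm{MLE}}\) are available. A hedged Python/NumPy sketch (the function name and interface are ours, not the paper's):

```python
import numpy as np

# IGRE  = (H + kI)^(-1) H b_mle                         Eq. (13)
# IGLE  = (H + I)^(-1) (H + dI) b_mle                   Eq. (14)
# TPAG  = (H + kI)^(-1) (H + k*d*I) b_mle               Eq. (15)
# TPHY  = (H + I)^(-1) (H + dI) (H + kI)^(-1) H b_mle   Eq. (16), S = H
def shrink(H, b_mle, k=0.0, d=1.0, kind="tpag"):
    I = np.eye(H.shape[0])
    if kind == "igre":
        return np.linalg.solve(H + k * I, H @ b_mle)
    if kind == "igle":
        return np.linalg.solve(H + I, (H + d * I) @ b_mle)
    if kind == "tpag":
        return np.linalg.solve(H + k * I, (H + k * d * I) @ b_mle)
    if kind == "tphy":
        ridge = np.linalg.solve(H + k * I, H @ b_mle)
        return np.linalg.solve(H + I, (H + d * I) @ ridge)
    raise ValueError(kind)
```

The special cases stated in the text follow directly: with k = 0 the TPAG returns the MLE, with k = 1 it coincides with the IGLE, and with d = 0 it coincides with the IGRE.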
\[ \mathrm{MSE}\big(\hat\beta_{\mathrm{MLE}}\big) = \mathrm{tr}\big[\mathrm{MMSE}\big(\hat\beta_{\mathrm{MLE}}\big)\big] = E\big[(\hat\beta_{\mathrm{MLE}} - \beta)'(\hat\beta_{\mathrm{MLE}} - \beta)\big] \]
\[ = \mathrm{tr}\big[\mathrm{Var}\big(\hat\beta_{\mathrm{MLE}}\big)\big] + \mathrm{bias}\big(\hat\beta_{\mathrm{MLE}}\big)'\,\mathrm{bias}\big(\hat\beta_{\mathrm{MLE}}\big), \]

where tr is the trace operator, and \(\mathrm{Var}(\hat\beta_{\mathrm{MLE}})\) and \(\mathrm{bias}(\hat\beta_{\mathrm{MLE}})\) represent the variance and bias of the estimator, respectively, with \(\mathrm{bias}(\hat\beta_{\mathrm{MLE}}) = E(\hat\beta_{\mathrm{MLE}}) - \beta\), the difference between the expected value and the true population parameter \(\beta\).
Algamal (2019) computed the MMSE and MSE of the IGRE as

\[ \mathrm{MMSE}\big(\hat\beta_{\mathrm{IGRE}}\big) = \phi\, Q \Lambda_k^{-1} \Lambda \Lambda_k^{-1} Q' + b_{\mathrm{IGRE}}\, b_{\mathrm{IGRE}}', \tag{17} \]

where \(\Lambda_k = \mathrm{diag}(\lambda_1 + k, \lambda_2 + k, \ldots, \lambda_r + k)\) and \(b_{\mathrm{IGRE}} = \mathrm{Bias}(\hat\beta_{\mathrm{IGRE}}) = -kQ\Lambda_k^{-1}\alpha\), and

\[ \mathrm{MSE}\big(\hat\beta_{\mathrm{IGRE}}\big) = \phi \sum_{j=1}^{r} \frac{\lambda_j}{(\lambda_j + k)^{2}} + k^{2} \sum_{j=1}^{r} \frac{\alpha_j^{2}}{(\lambda_j + k)^{2}}, \tag{18} \]

where \(\alpha = Q'\beta\) and k (k > 0) is the ridge biasing parameter suggested by Hoerl and Kennard (1970a).
Akram et al. (2020) computed the MMSE and MSE of the IGLE, which are respectively given as

\[ \mathrm{MMSE}\big(\hat\beta_{\mathrm{IGLE}}\big) = \phi\, Q \Lambda_1^{-1} \Lambda_d \Lambda^{-1} \Lambda_d \Lambda_1^{-1} Q' + b_{\mathrm{IGLE}}\, b_{\mathrm{IGLE}}', \tag{19} \]

where \(\Lambda_1 = \mathrm{diag}(\lambda_1 + 1, \ldots, \lambda_r + 1)\), \(\Lambda_d = \mathrm{diag}(\lambda_1 + d, \ldots, \lambda_r + d)\) and \(b_{\mathrm{IGLE}} = \mathrm{Bias}(\hat\beta_{\mathrm{IGLE}}) = (d-1)Q\Lambda_1^{-1}\alpha\), and

\[ \mathrm{MSE}\big(\hat\beta_{\mathrm{IGLE}}\big) = \phi \sum_{j=1}^{r} \frac{(\lambda_j + d)^{2}}{\lambda_j(\lambda_j + 1)^{2}} + (d-1)^{2} \sum_{j=1}^{r} \frac{\alpha_j^{2}}{(\lambda_j + 1)^{2}}. \tag{20} \]

Similarly, the MMSE and MSE of the proposed TPAG estimator are

\[ \mathrm{MMSE}\big(\hat\beta_{\mathrm{TPAG}}\big) = \phi\, Q \Lambda_k^{-1} \Lambda_{kd} \Lambda^{-1} \Lambda_{kd} \Lambda_k^{-1} Q' + b_{\mathrm{TPAG}}\, b_{\mathrm{TPAG}}', \tag{21} \]

where \(\Lambda_{kd} = \mathrm{diag}(\lambda_1 + kd, \ldots, \lambda_r + kd)\) and \(b_{\mathrm{TPAG}} = \mathrm{Bias}(\hat\beta_{\mathrm{TPAG}}) = k(d-1)Q\Lambda_k^{-1}\alpha\), and

\[ \mathrm{MSE}\big(\hat\beta_{\mathrm{TPAG}}\big) = \phi \sum_{j=1}^{r} \frac{(\lambda_j + kd)^{2}}{\lambda_j(\lambda_j + k)^{2}} + k^{2}(d-1)^{2} \sum_{j=1}^{r} \frac{\alpha_j^{2}}{(\lambda_j + k)^{2}}, \tag{22} \]

or, compactly,

\[ \mathrm{MSE}\big(\hat\beta_{\mathrm{TPAG}}\big) = c_1(k,d) + c_2(k,d), \]

where \(c_1(k,d)\) and \(c_2(k,d)\) represent the variance and squared-bias terms, respectively.
The MMSE and MSE of the Huang and Yang (2014) TPE are, respectively,

\[ \mathrm{MMSE}\big(\hat\beta_{\mathrm{TPHY}}\big) = \phi\, Q \Lambda_1^{-1} \Lambda_d \Lambda_k^{-1} \Lambda \Lambda_k^{-1} \Lambda_d \Lambda_1^{-1} Q' + b_{\mathrm{TPHY}}\, b_{\mathrm{TPHY}}', \tag{23} \]
\[ \mathrm{MSE}\big(\hat\beta_{\mathrm{TPHY}}\big) = \phi \sum_{j=1}^{r} \frac{\lambda_j(\lambda_j + d)^{2}}{(\lambda_j + k)^{2}(\lambda_j + 1)^{2}} + \sum_{j=1}^{r} \frac{\alpha_j^{2}\big[(k+1-d)\lambda_j + k\big]^{2}}{(\lambda_j + k)^{2}(\lambda_j + 1)^{2}}. \tag{24} \]

Huang and Yang (2014) established the following theorems for the TPE in the NBRM; they remain valid for the TPE in the IGRM and are discussed below.

Theorem 2.1. The asymptotic variance \(c_1(k,d)\) and the squared bias \(c_2(k,d)\) are two continuous functions of k and d. For fixed d between zero and one, \(c_1(k,d)\) is a monotonically decreasing and \(c_2(k,d)\) a monotonically increasing function of k; for fixed k > 0, \(c_1(k,d)\) is a monotonically increasing and \(c_2(k,d)\) a monotonically decreasing function of d.
Proof. We know that \(c_1(k,d)\) and \(c_2(k,d)\) are two continuous functions of k and d. Differentiating Eq. (22) with respect to k, we have

\[ \frac{\partial\, \mathrm{MSE}\big(\hat\beta_{\mathrm{TPAG}}\big)}{\partial k} = -\sum_{j=1}^{r} \frac{2(d-1)^{2}k^{2}\alpha_j^{2}}{(\lambda_j + k)^{3}} + \sum_{j=1}^{r} \frac{2(d-1)^{2}k\,\alpha_j^{2}}{(\lambda_j + k)^{2}} + \sum_{j=1}^{r} \frac{2d(\lambda_j + kd)\phi}{\lambda_j(\lambda_j + k)^{2}} - \sum_{j=1}^{r} \frac{2(\lambda_j + kd)^{2}\phi}{\lambda_j(\lambda_j + k)^{3}}, \tag{25} \]

which simplifies to

\[ \frac{\partial\, \mathrm{MSE}\big(\hat\beta_{\mathrm{TPAG}}\big)}{\partial k} = -2(1-d)\sum_{j=1}^{r} \frac{\phi(\lambda_j + kd) - (1-d)k\lambda_j\alpha_j^{2}}{(\lambda_j + k)^{3}}. \tag{26} \]

Taking the second derivative with respect to k gives

\[ \frac{\partial^{2}\, \mathrm{MSE}\big(\hat\beta_{\mathrm{TPAG}}\big)}{\partial k^{2}} = -2(1-d)\sum_{j=1}^{r} \frac{\big[d\phi - (1-d)\lambda_j\alpha_j^{2}\big](\lambda_j + k) - 3\big[\phi(\lambda_j + kd) - (1-d)k\lambda_j\alpha_j^{2}\big]}{(\lambda_j + k)^{4}}. \tag{27} \]

After simplification, Eq. (27) becomes

\[ \frac{\partial^{2}\, \mathrm{MSE}\big(\hat\beta_{\mathrm{TPAG}}\big)}{\partial k^{2}} = 2(1-d)\sum_{j=1}^{r} \frac{(3-d)\phi\lambda_j + 2d\phi k + (1-d)\lambda_j^{2}\alpha_j^{2} - 2(1-d)k\lambda_j\alpha_j^{2}}{(\lambda_j + k)^{4}}. \tag{28} \]
So, it is clear that for fixed d between 0 and 1, \(c_1(k,d)\) and \(c_2(k,d)\) are monotonically decreasing and increasing functions of k, respectively. Now, differentiating Eq. (22) with respect to d, we have

\[ \frac{\partial\, \mathrm{MSE}\big(\hat\beta_{\mathrm{TPAG}}\big)}{\partial d} = -\sum_{j=1}^{r} \frac{2(1-d)k^{2}\alpha_j^{2}}{(\lambda_j + k)^{2}} + \sum_{j=1}^{r} \frac{2k\phi(\lambda_j + kd)}{\lambda_j(\lambda_j + k)^{2}}, \tag{29} \]

or equivalently

\[ \frac{\partial\, \mathrm{MSE}\big(\hat\beta_{\mathrm{TPAG}}\big)}{\partial d} = \sum_{j=1}^{r} \frac{2k\big[\phi(\lambda_j + kd) - (1-d)k\lambda_j\alpha_j^{2}\big]}{\lambda_j(\lambda_j + k)^{2}}. \tag{30} \]

Finally, taking the second derivative of Eq. (30) with respect to d, we have

\[ \frac{\partial^{2}\, \mathrm{MSE}\big(\hat\beta_{\mathrm{TPAG}}\big)}{\partial d^{2}} = \sum_{j=1}^{r} \frac{2k^{2}\alpha_j^{2}}{(\lambda_j + k)^{2}} + \sum_{j=1}^{r} \frac{2k^{2}\phi}{\lambda_j(\lambda_j + k)^{2}}, \]

or equivalently

\[ \frac{\partial^{2}\, \mathrm{MSE}\big(\hat\beta_{\mathrm{TPAG}}\big)}{\partial d^{2}} = \sum_{j=1}^{r} \frac{2k^{2}(\lambda_j\alpha_j^{2} + \phi)}{\lambda_j(\lambda_j + k)^{2}} > 0. \tag{31} \]

So, it is obvious that \(c_1(k,d)\) and \(c_2(k,d)\) are monotonically increasing and decreasing functions of d, respectively.
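Theorem 2.1 is easy to check numerically from the decomposition MSE = c1(k,d) + c2(k,d) of Eq. (22). The sketch below uses arbitrary illustrative eigenvalues, coefficients and dispersion, not the paper's data:

```python
import numpy as np

# c1(k,d) = phi * sum_j (l_j + k*d)^2 / (l_j * (l_j + k)^2)   (variance)
# c2(k,d) = k^2 (d-1)^2 * sum_j a_j^2 / (l_j + k)^2           (squared bias)
l = np.array([5.0, 1.0, 0.2, 0.01])   # eigenvalues of H (ill-conditioned)
a = np.array([0.5, -0.3, 0.8, 0.1])   # alpha = Q' beta
phi = 0.5

def c1(k, d):
    return phi * np.sum((l + k * d) ** 2 / (l * (l + k) ** 2))

def c2(k, d):
    return k**2 * (d - 1) ** 2 * np.sum(a**2 / (l + k) ** 2)

ks = np.linspace(0.01, 5, 200)
assert np.all(np.diff([c1(k, 0.4) for k in ks]) < 0)  # c1 decreasing in k
assert np.all(np.diff([c2(k, 0.4) for k in ks]) > 0)  # c2 increasing in k

ds = np.linspace(0.01, 0.99, 200)
assert np.all(np.diff([c1(1.0, d) for d in ds]) > 0)  # c1 increasing in d
assert np.all(np.diff([c2(1.0, d) for d in ds]) < 0)  # c2 decreasing in d
print("Theorem 2.1 monotonicity confirmed on this example")
```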
Theorem 2.4. Suppose that k > 0, 0 < d < 1, \(b_{\mathrm{IGRE}} = \mathrm{Bias}(\hat\beta_{\mathrm{IGRE}})\) and \(b_{\mathrm{TPAG}} = \mathrm{Bias}(\hat\beta_{\mathrm{TPAG}})\). Then \(\mathrm{MMSE}(\hat\beta_{\mathrm{IGRE}}) - \mathrm{MMSE}(\hat\beta_{\mathrm{TPAG}}) > 0\) if

\[ b_{\mathrm{TPAG}}'\big[\phi\, Q\big(\Lambda_k^{-1}\Lambda\Lambda_k^{-1} - \Lambda_k^{-1}\Lambda_{kd}\Lambda^{-1}\Lambda_{kd}\Lambda_k^{-1}\big)Q' + b_{\mathrm{IGRE}}\, b_{\mathrm{IGRE}}'\big]^{-1} b_{\mathrm{TPAG}} < 1. \]

Proof. By Eqs. (17) and (21),

\[ \mathrm{MMSE}\big(\hat\beta_{\mathrm{IGRE}}\big) - \mathrm{MMSE}\big(\hat\beta_{\mathrm{TPAG}}\big) = \phi\, Q\big(\Lambda_k^{-1}\Lambda\Lambda_k^{-1} - \Lambda_k^{-1}\Lambda_{kd}\Lambda^{-1}\Lambda_{kd}\Lambda_k^{-1}\big)Q' + b_{\mathrm{IGRE}}\, b_{\mathrm{IGRE}}' - b_{\mathrm{TPAG}}\, b_{\mathrm{TPAG}}'. \]

The j-th diagonal element of the bracketed difference is proportional to \(\lambda_j^{2} - (\lambda_j + kd)^{2} = \big(\lambda_j - (\lambda_j + kd)\big)\big(\lambda_j + (\lambda_j + kd)\big)\), i.e., to \(kd(kd + 2\lambda_j)\) in magnitude. Thus, if k > 0 and 0 < d < 1, the proof is completed by Lemma 2.2.

Similarly, the difference of the MMSE matrices of the IGLE and the TPAG is positive definite if \((\lambda_j + d)^{2}(\lambda_j + k)^{2} - (\lambda_j + kd)^{2}(\lambda_j + 1)^{2} > 0\). Thus, if k > 0 and 0 < d < 1, the proof is completed by Lemma 2.2.
Theorem 2.6. Let k > 0, 0 < d < 1, \(\lambda_j(d - kd - 1) - kd > 0\), and \(b_{\mathrm{TPHY}} = \mathrm{Bias}(\hat\beta_{\mathrm{TPHY}})\). Then \(\mathrm{MMSE}(\hat\beta_{\mathrm{TPHY}}) - \mathrm{MMSE}(\hat\beta_{\mathrm{TPAG}}) > 0\) if

\[ b_{\mathrm{TPAG}}'\big[\phi\, Q\big(\Lambda_1^{-1}\Lambda_d\Lambda_k^{-1}\Lambda\Lambda_k^{-1}\Lambda_d\Lambda_1^{-1} - \Lambda_k^{-1}\Lambda_{kd}\Lambda^{-1}\Lambda_{kd}\Lambda_k^{-1}\big)Q' + b_{\mathrm{TPHY}}\, b_{\mathrm{TPHY}}'\big]^{-1} b_{\mathrm{TPAG}} < 1. \]

Proof. By Eqs. (23) and (21), we obtain

\[ \mathrm{MMSE}\big(\hat\beta_{\mathrm{TPHY}}\big) - \mathrm{MMSE}\big(\hat\beta_{\mathrm{TPAG}}\big) = \phi\, Q\big(\Lambda_1^{-1}\Lambda_d\Lambda_k^{-1}\Lambda\Lambda_k^{-1}\Lambda_d\Lambda_1^{-1} - \Lambda_k^{-1}\Lambda_{kd}\Lambda^{-1}\Lambda_{kd}\Lambda_k^{-1}\big)Q' + b_{\mathrm{TPHY}}\, b_{\mathrm{TPHY}}' - b_{\mathrm{TPAG}}\, b_{\mathrm{TPAG}}'. \]
We propose to estimate d as

\[ d^{*} = \min_{1 \le j \le r}\left( \frac{k\alpha_j^{2}\lambda_j - \phi\lambda_j}{\phi k + k\alpha_j^{2}\lambda_j} \right), \tag{39} \]

where min represents the minimum function, so that \(d^{*}\) lies between 0 and 1. After estimating d, we consider two methods to choose the optimum value of the ridge parameter k. Setting the j-th summand of \(\partial\,\mathrm{MSE}(\hat\beta_{\mathrm{TPAG}})/\partial k\) to zero gives the per-component value \(k_j = \phi\lambda_j/\big[(1-d^{*})\lambda_j\alpha_j^{2} - d^{*}\phi\big]\), and the two choices are

\[ k_1 = \sum_{j=1}^{r} k_j, \tag{40} \]
\[ k_2 = \mathrm{diag}(k_j), \tag{41} \]

where \(k_2\) uses an individual shrinkage value for each component. The following algorithm is used to estimate the shrinkage parameters.
Step 1: Select \(d^{*}\) from Eq. (39).
Step 2: Estimate k by one of Eqs. (40) and (41), inserting the value of \(d^{*}\).
Huang and Yang (2014) suggested ideal values of the shrinkage parameters for the NBRM. Based on their work, we suggest some optimal values of the shrinkage parameters for the IGRM to assess the performance of their estimator. Initially, to find the optimal value of k, take the derivative of Eq. (24) with respect to k for a fixed value of d and equate it to zero:

\[ \frac{\partial\, \mathrm{MSE}\big(\hat\beta_{\mathrm{TPHY}}\big)}{\partial k} = -2\phi \sum_{j=1}^{r} \frac{\lambda_j(\lambda_j + d)^{2}}{(\lambda_j + k)^{3}(\lambda_j + 1)^{2}} + 2 \sum_{j=1}^{r} \frac{\alpha_j^{2}\lambda_j(\lambda_j + d)\big[(k+1-d)\lambda_j + k\big]}{(\lambda_j + k)^{3}(\lambda_j + 1)^{2}} = 0. \tag{42} \]

After some mathematical manipulation, we have

\[ k = \frac{(d-1)\alpha_j^{2}\lambda_j + (d + \lambda_j)\phi}{\alpha_j^{2}(\lambda_j + 1)}, \tag{43} \]

where \(\hat\alpha = Q'\hat\beta_{\mathrm{MLE}}\), with \(\lambda_j\) and \(\alpha_j\) replaced by their unbiased estimates. So, Eq. (43) can be written as

\[ k^{*} = \min_{1 \le j \le r}\left( \frac{(d-1)\hat\alpha_j^{2}\hat\lambda_j + (d + \hat\lambda_j)\hat\phi}{\hat\alpha_j^{2}(\hat\lambda_j + 1)} \right). \tag{44} \]

It is interesting to note that when d = 1, Eq. (44) reduces to \(k = \hat\phi/\hat\alpha_j^{2}\), which is the value suggested by Hoerl and Kennard (1970a, 1970b). Likewise, we find an optimal value of d by fixing k > 0:

\[ \hat d = \frac{(\hat\lambda_{\max}k + \hat\lambda_{\max} + k)\hat\alpha_{\max}^{2} - \hat\lambda_{\max}\hat\phi}{\hat\lambda_{\max}\hat\alpha_{\max}^{2} + \hat\phi}, \tag{45} \]

where \(\hat\lambda_{\max}\) represents the maximum eigenvalue of H and \(\hat\alpha_{\max}^{2}\) is the maximum element of \(\hat\alpha_j^{2}\). To ensure that the value of d lies between zero and one, Eq. (45) is defined as

\[ \hat D_1 = \max\left(0,\ \frac{(\hat\lambda_{\max}k + \hat\lambda_{\max} + k)\hat\alpha_{\max}^{2} - \hat\lambda_{\max}\hat\phi}{\hat\lambda_{\max}\hat\alpha_{\max}^{2} + \hat\phi}\right). \tag{46} \]

Another optimal value of d is obtained by minimizing Eq. (24), i.e., setting \(\partial\,\mathrm{MSE}(\hat\beta_{\mathrm{TPHY}})/\partial d = 0\) and solving for d:

\[ \hat d_2 = \frac{\displaystyle\sum_{j=1}^{r} \frac{\{\hat\lambda_j(k+1) + k\}\hat\alpha_j^{2}\hat\lambda_j - \hat\lambda_j^{2}\hat\phi}{(\hat\lambda_j + k)^{2}(\hat\lambda_j + 1)^{2}}}{\displaystyle\sum_{j=1}^{r} \frac{\hat\lambda_j(\hat\alpha_j^{2}\hat\lambda_j + \hat\phi)}{(\hat\lambda_j + k)^{2}(\hat\lambda_j + 1)^{2}}}, \tag{47} \]

\[ \hat D_2 = \max\big(0,\ \hat d_2\big). \tag{48} \]

Thus, the following limitation on k is obtained:

\[ \max\left\{0,\ \frac{\hat\lambda_{\max}(\hat\phi - \hat\alpha_{\max}^{2})}{\hat\alpha_{\max}^{2}(1 + \hat\lambda_{\max})}\right\} < k < \frac{\hat\phi}{\hat\alpha_{\max}^{2}}. \tag{49} \]

These shrinkage parameters are further used in the simulation to assess the effectiveness of the estimators and to conclude which one handles the issue of multicollinearity most consistently. The steps above are followed to find the optimal values of k and d.
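The parameter formulas in Eqs. (44)-(48) depend only on the eigenvalues \(\hat\lambda_j\) of H, the transformed coefficients \(\hat\alpha = Q'\hat\beta_{\mathrm{MLE}}\), and \(\hat\phi\), so they are inexpensive to evaluate. A hedged Python sketch with illustrative inputs (the numbers are ours, not from the paper's data):

```python
import numpy as np

l = np.array([4.0, 1.5, 0.3, 0.05])   # eigenvalues of H
a = np.array([0.6, -0.4, 0.9, 0.2])   # alpha-hat = Q' b_mle
phi = 0.4                             # dispersion estimate

def k_star(d):
    # Eq. (44): minimum over j of the per-component solution of Eq. (43)
    return np.min(((d - 1) * a**2 * l + (d + l) * phi) / (a**2 * (l + 1)))

def d1_hat(k):
    # Eqs. (45)-(46): uses the largest eigenvalue and the largest alpha^2
    lm, am2 = np.max(l), np.max(a**2)
    d = ((lm * k + lm + k) * am2 - lm * phi) / (lm * am2 + phi)
    return max(0.0, d)

def d2_hat(k):
    # Eqs. (47)-(48)
    den = (l + k) ** 2 * (l + 1) ** 2
    num = np.sum(((l * (k + 1) + k) * a**2 * l - l**2 * phi) / den)
    return max(0.0, num / np.sum(l * (a**2 * l + phi) / den))

print(k_star(0.5), d1_hat(0.5), d2_hat(0.5))
```

As noted in the text, with d = 1 the value k_star(1.0) collapses to the Hoerl-Kennard choice min_j(phi / alpha_j^2).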
such that \(\sum_{j=1}^{r} \beta_j^{2} = 1\); for further information, please see Kibria (2003). Correlated explanatory variables are generated using the expression

\[ x_{ij} = \big(1 - a_o^{2}\big)^{1/2} f_{ij} + a_o f_{i,r+1}, \qquad i = 1, 2, \ldots, n,\quad j = 1, 2, \ldots, r, \tag{51} \]

where \(a_o\) determines the correlation among the explanatory variables (any two generated regressors have correlation \(a_o^{2}\)) and the \(f_{ij}\) are pseudo-random numbers generated from the standard normal distribution. For a detailed description, we recommend the studies of Kibria (2003) and Månsson, Kibria, and Shukur (2012). Four different values of \(a_o\) are considered, \(a_o\) = 0.80, 0.90, 0.95, 0.99, together with four sample sizes, n = 25, 50, 100, 200. To investigate the performance of the MLE, IGRE, IGLE, TPAG and TPHY, the MSE criterion is used for the assessment of the proposed estimators, defined as

\[ \mathrm{MSE}(\hat\beta) = \frac{1}{R}\sum_{i=1}^{R} (\hat\beta_i - \beta)'(\hat\beta_i - \beta), \]

where R represents the total number of replications, set to 2,000. The simulation study is carried out in R (version 3.5.2).
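The regressor-generation scheme of Eq. (51) can be sketched as follows (a Python/NumPy illustration; seed and dimensions are arbitrary). The shared factor \(f_{i,r+1}\) is what induces the pairwise correlation of \(a_o^{2}\):

```python
import numpy as np

rng = np.random.default_rng(42)
n, r, ao = 100_000, 4, 0.95

f = rng.standard_normal((n, r + 1))                 # f_ij ~ N(0, 1)
X = np.sqrt(1 - ao**2) * f[:, :r] + ao * f[:, [r]]  # Eq. (51)

corr = np.corrcoef(X, rowvar=False)
print(corr[0, 1])   # close to ao^2 = 0.9025
```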
The correlation matrix of the liver cirrhosis data set can be seen in Table 8. It is clearly observed from Table 8 that the pairwise correlations among the explanatory variables are all greater than 0.99, indicating severe multicollinearity in the data set. Since the correlation matrix alone is not sufficient for the detection of multicollinearity, we also use the condition index, \(\mathrm{CI} = \sqrt{\max_j(\lambda_j)/\min_j(\lambda_j)}\). For these data, CI = 38.6598, which is greater than 30, indicating the presence of severe multicollinearity among the explanatory variables.
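The condition-index check can be sketched as follows; here it is applied to the cross-product matrix of standardized regressors on synthetic, nearly collinear data (the data are ours; the CI > 30 rule of thumb is the one used in the text):

```python
import numpy as np

def condition_index(X):
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize columns
    eig = np.linalg.eigvalsh(Xs.T @ Xs)         # ascending eigenvalues
    return np.sqrt(eig[-1] / eig[0])

rng = np.random.default_rng(1)
z = rng.standard_normal((500, 1))
X = z + 0.01 * rng.standard_normal((500, 3))    # nearly collinear columns
print(condition_index(X) > 30)                  # True
```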
The estimated coefficients and MSEs of the IG regression estimators are summarized in Table 9. According to Table 9, the biased IG regression estimators all have smaller MSE values than the MLE. Moreover, the TPAG with parameters \((d^{*}, k_1)\) has the smallest MSE among the biased estimators. Hence, the results of the simulation study and the real data set illustrate that the performance of the TPAG is best in the presence of mild to severe multicollinearity.
5. Concluding remarks
In this paper, we propose the TPE for the IGRM to overcome the effect of multicollinearity, based on the work of Asar and Genç (2018) and Huang and Yang (2014), and derive optimal values of the shrinkage parameters. Moreover, we propose some methods to estimate the shrinkage parameters k and d. The performance of the suggested estimators is assessed through a Monte Carlo simulation study and a real data application in terms of minimum MSE.

Table 9. Estimated coefficients and MSEs of the listed estimators for the liver cirrhosis death rate data.

Estimators        β̂0      β̂1      β̂2      β̂3      β̂4      MSE
MLE              4.102    0.392    0.714    1.233    0.047    0.845
IGRE             4.102    0.403    0.701    1.221    0.036    0.798
IGLE             3.868    0.348    0.638    1.105    0.045    0.760
TPAG(d*, k1)     4.075    0.172    0.743    1.094    0.319    0.382
TPAG(d*, k2)     3.975    0.626    0.852    0.587    1.311    0.643
TPHY(D1, k1)     4.102    0.390    0.714    1.232    0.049    0.837
TPHY(D2, k2)     4.075    0.170    0.744    1.092    0.322    0.754

Based on the results of the simulation and the example, we conclude that the TPAG \((d^{*}, k_1)\) performs better and is the best option to combat multicollinearity in the IGRM. The MLE, by contrast, is not a good choice in the presence of mild to severe multicollinearity. In addition, the MSEs of the TPHY are, for all considered situations, larger than those of the other biased estimation methods. So, we suggest using the TPAG with shrinkage parameters \((d^{*}, k_1)\) instead of the other estimators for estimating the parameters of the IGRM with correlated explanatory variables.
ORCID
Muhammad Nauman Akram http://orcid.org/0000-0001-6688-808X
Muhammad Amin http://orcid.org/0000-0002-7431-5756
References
Akdeniz, F., and S. Kaçiranlar. 2001. More on the new biased estimator in linear regression. Indian Journal of Statistics 63 (3):321–25. doi:10.2307/25053183.
Akram, M. N., M. Amin, and M. Qasim. 2020. A new Liu-type estimator for the inverse Gaussian regression model. Journal of Statistical Computation and Simulation 90 (7):1153–72. doi:10.1080/00949655.2020.1718150.
Algamal, Z. Y. 2019. Performance of ridge estimator in inverse Gaussian regression model. Communications in Statistics - Theory and Methods 48 (15):3836–49. doi:10.1080/03610926.2018.1481977.
Alheety, M. I., and B. M. G. Kibria. 2009. On the Liu and almost unbiased Liu estimators in the presence of multicollinearity with heteroscedastic or correlated errors. Surveys in Mathematics and Its Applications 4:155–67.
Alkhamisi, M. A., and G. Shukur. 2008. Developing ridge parameters for SUR model. Communications in Statistics - Theory and Methods 37 (4):544–64. doi:10.1080/03610920701469152.
Amin, M., M. Amanullah, and M. Aslam. 2016. Empirical evaluation of the inverse Gaussian regression residuals for the assessment of influential points. Journal of Chemometrics 30 (7):394–404. doi:10.1002/cem.2805.
Amin, M., M. Amanullah, and M. Qasim. 2020. Diagnostic techniques for the inverse Gaussian regression model. Communications in Statistics - Theory and Methods. Advance online publication. doi:10.1080/03610926.2020.1777308.
Amin, M., M. N. Akram, and M. Amanullah. 2020. On the James-Stein estimator for the Poisson regression model. Communications in Statistics - Simulation and Computation. Advance online publication. doi:10.1080/03610918.2020.1775851.
Amin, M., M. Qasim, and M. Amanullah. 2019. Performance of Asar and Genç and Huang and Yang's two-parameter estimation methods for the gamma regression model. Iranian Journal of Science and Technology, Transactions A: Science 43 (6):2951–63. doi:10.1007/s40995-019-00777-3.
Amin, M., M. Qasim, M. Amanullah, and S. Afzal. 2020a. Performance of some ridge estimators for the gamma regression model. Statistical Papers 61 (3):997–1026. doi:10.1007/s00362-017-0971-z.
Amin, M., M. Qasim, S. Afzal, and K. Naveed. 2020b. New ridge estimators in the inverse Gaussian regression: Monte Carlo simulation and application to chemical data. Communications in Statistics - Simulation and Computation. Advance online publication. doi:10.1080/03610918.2020.1797794.
Asar, Y., A. Karaibrahimoglu, and A. Genç. 2014. Modified ridge regression parameters: A comparative Monte Carlo study. Hacettepe Journal of Mathematics and Statistics 43 (5):827–41.
Asar, Y., and A. Genç. 2015. On some new modifications of ridge estimators. Kuwait Journal of Science 44:4–57. arXiv:1512.02773v1.
Asar, Y., and A. Genç. 2018. A new two-parameter estimator for the Poisson regression model. Iranian Journal of Science and Technology, Transactions A: Science 42 (2):793–803. doi:10.1007/s40995-017-0174-4.
Brownlee, K. A. 1965. Statistical theory and methodology in science and engineering. New York: Wiley.
Farebrother, R. W. 1976. Further results on the mean square error of ridge regression. Journal of the Royal Statistical Society: Series B (Methodological) 38 (3):248–50. doi:10.1111/j.2517-6161.1976.tb01588.x.
Frisch, R. 1934. Statistical confluence analysis by means of complete regression systems. The Economic Journal 45 (180):741–42. doi:10.2307/2225583.
Hoerl, A. E., and R. W. Kennard. 1970a. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics 12 (1):55–67. doi:10.1080/00401706.1970.10488634.
Hoerl, A. E., and R. W. Kennard. 1970b. Ridge regression: Applications to nonorthogonal problems. Technometrics 12 (1):69–82. doi:10.1080/00401706.1970.10488635.
Huang, H., J. Wu, and W. Yi. 2017. On the restricted almost unbiased two-parameter estimator in linear regression model. Communications in Statistics - Theory and Methods 46 (4):1668–78. doi:10.1080/03610926.2015.1026991.
Huang, J., and H. Yang. 2014. A two-parameter estimator in the negative binomial regression model. Journal of Statistical Computation and Simulation 84 (1):124–34. doi:10.1080/00949655.2012.696648.
Hubert, M. H., and P. Wijekoon. 2006. Improvement of the Liu estimator in linear regression model. Statistical Papers 47 (3):471–79. doi:10.1007/s00362-006-0300-4.
Khalaf, G., K. Månsson, P. Sjölander, and G. Shukur. 2014. A tobit ridge regression estimator. Communications in Statistics - Theory and Methods 43 (1):131–40. doi:10.1080/03610926.2012.655881.
Kibria, B. M. G. 2003. Performance of some new ridge regression estimators. Communications in Statistics - Simulation and Computation 32 (2):419–35. doi:10.1081/SAC-120017499.
Kibria, B. M. G. 2012. Some Liu and ridge-type estimators and their properties under the ill-conditioned Gaussian linear regression model. Journal of Statistical Computation and Simulation 82 (1):1–17. doi:10.1080/00949655.2010.519705.
Kibria, B. M. G., and S. Banik. 2016. Some ridge regression estimators and their performances. Journal of Modern Applied Statistical Methods 15 (1):206–38. doi:10.22237/jmasm/1462075860.
Kibria, B. M. G., K. Månsson, and G. Shukur. 2012. Performance of some logistic ridge regression estimators. Computational Economics 40 (4):401–14. doi:10.1007/s10614-011-9275-x.
Kinat, S., M. Amin, and T. Mehmood. 2020. GLM-based control charts for the inverse Gaussian response variable. Quality and Reliability Engineering International 36 (2):765–83. doi:10.1002/qre.2603.
Liu, K. 1993. A new class of biased estimate in linear regression. Communications in Statistics - Theory and Methods 22 (2):393–402. doi:10.1080/03610929308831027.
Månsson, K. 2012. On ridge estimators for the negative binomial regression model. Economic Modelling 29 (2):178–84. doi:10.1016/j.econmod.2011.09.009.
Månsson, K. 2013. Developing a Liu estimator for the negative binomial regression model: Method and application. Journal of Statistical Computation and Simulation 83 (9):1773–80. doi:10.1080/00949655.2012.673127.
Månsson, K., and G. Shukur. 2011. A Poisson ridge regression estimator. Economic Modelling 28 (4):1475–81. doi:10.1016/j.econmod.2011.02.030.
Månsson, K., B. M. G. Kibria, and G. Shukur. 2012. On Liu estimators for the logit regression model. Economic Modelling 29 (4):1483–8. doi:10.1016/j.econmod.2011.11.015.
McClendon, M. J. 2002. Multiple regression and causal analysis. US: Waveland Press.
Muniz, G., and B. M. G. Kibria. 2009. On some ridge regression estimators: An empirical comparison. Communications in Statistics - Simulation and Computation 38 (3):621–30. doi:10.1080/03610910802592838.
Naveed, K., M. Amin, S. Afzal, and M. Qasim. 2020. New shrinkage parameters for the inverse Gaussian Liu regression. Communications in Statistics - Theory and Methods. Advance online publication. doi:10.1080/03610926.2020.1791339.
Qasim, M., B. M. G. Kibria, K. Månsson, and P. Sjölander. 2019. A new Poisson Liu regression estimator: Method and application. Journal of Applied Statistics. Advance online publication. doi:10.1080/02664763.2019.1707485.
Qasim, M., M. Amin, and M. Amanullah. 2018. On the performance of some new Liu parameters for the gamma regression model. Journal of Statistical Computation and Simulation 88 (16):3065–80. doi:10.1080/00949655.2018.1498502.
Segerstedt, B. 1992. On ordinary ridge regression in generalized linear models. Communications in Statistics - Theory and Methods 21 (8):2227–46. doi:10.1080/03610929208830909.
Yang, H., and X. Chang. 2010. A new two-parameter estimator in linear regression. Communications in Statistics - Theory and Methods 39 (6):923–34. doi:10.1080/03610920902807911.