RenSun Sankhya2004 ComparisonBayesFreqtstPrediction

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 29

Sankhya : The Indian Journal of Statistics

2004, Volume 66, Part 4, pp 678-706


c 2004, Indian Statistical Institute
Comparison of Bayesian and Frequentist Estimation and
Prediction for a Normal Population
Cuirong Ren
South Dakota State University, Brookings, USA
Dongchu Sun
University of Missouri, Columbia, USA
Dipak K. Dey
University of Connecticut, Storrs, USA
Abstract
Comparisons of estimates between Bayes and frequentist methods are inter-
esting and challenging topics in statistics. In this paper, Bayes estimates
and predictors are derived for a normal distribution. The commonly used
frequentist predictor such as the maximum likelihood estimate (MLE) is a
plug-in procedure by substituting the MLE of into the predictive dis-
tribution. We examine Bayes prediction under the -absolute error losses,
the LINEX losses and the entropy loss as special case of the -absolute er-
ror losses. If the variance is unknown, the joint conjugate prior is used to
estimate the unknown mean for the -absolute error losses and an ad hoc
method by replacing the unknown variance by the sample variance for the
LINEX losses. Bayes estimates are also extended to the linear combinations
of regression coecients. Under certain assumptions for a design matrix, the
asymptotic expected losses are derived. Under suitable priors, Bayes esti-
mate and predictor perform better than the MLE. Under the LINEX loss, the
Bayes estimate under the Jereys prior is superior to the MLE. However, for
prediction, it is not clear whether Bayes prediction or MLE performs better.
Under some circumstances, even when one loss is the true loss function,
Bayes estimate under another loss performs better than the Bayes estimate
under the true loss. This serves as a warning to naive Bayesians who
assume that Bayes methods always perform well regardless of circumstances.
AMS (2000) subject classication. 62F15, 62H10, 62H12.
Keywords and phrases. Bayes estimation, Jereys prior, loss function, max-
imum likelihood estimator, risk function.
1 Preliminary
In the last twenty years, there has been considerable attention on com-
parisons between Bayesian and frequentist procedures under some known
Bayesian and frequentist estimation and prediction 679
smooth loss functions such as the squared error loss. Use of symmetric loss
functions, such as the squared error loss, is convenient for many practical
problems. In most cases, they are reasonable choices. However, for some es-
timation and prediction problems, this may be inappropriate. Asymmetric
functions have been shown to be useful, see Varian (1975), Zellner (1986).
Moorhead and Wu (1998), Spiring and Yeung (1998), Chandra (2001), etc.
Joseph (2004) considered the solder mask thickness of printed circuit board
and discovered a symmetric loss function would not be appropriate for the
total loss from customers and manufactures.
It has been thought that Bayesian methods always perform well regard-
less of circumstances. However, this is not true. In this paper, we found in
a number of cases the results run counter to intuition. To take one example
among these, Section 4.1 compares the Bayes estimator
B
, calculated as
the Bayes estimator under LINEX loss L
2
, with
n
, which is the Bayes esti-
mator under the -absolute error loss L
1
. L
1
and L
2
will be dened in the
section. Under some circumstances, even when L
1
is the true loss func-
tion,
B
performs better than
n
. Therefore, this kind of results can serve
as a warning to these naive Bayesians. Because normal distributions play
an important role in statistics, we now study the estimation and prediction
for normal distributions.
Suppose that X
n
= (X
1
, . . . , X
n
) is a random sample from N(,
2
),
where
2
> 0 is known and IR is an unknown parameter. We want to
estimate unknown mean . We rst consider a class of the -absolute error
loss functions for an estimate = (X
n
),
L
1
(, ) = | |

, (1)
where is a known positive constant. When = 2, we have the squared
error loss, and if = 1, we have the commonly used absolute error loss.
We also consider a class of LINEX loss functions,
L
2
(, ) = e
c()
c( ) 1, (2)
where c is a nonzero constant. Varian (1975) introduced this useful class of
LINEX loss functions that arises approximately exponentially on one side of
zero and approximately linearly on the other side in a study of real estate
assessment. Zellner (1986) also described interesting applications in using
such LINEX losses.
Another commonly used loss function is the entropy loss, (cf. Robert,
1997). If f(.|) and f(.|) are the distributions associated with the true
680 Cuirong Ren, Dongchu Sun and Dipak K. Dey
parameter and the estimate , the entropy distance is dened as
L
3
(, ) =
_

log
_
f(x|)
f(x|)
_
f(x|)dx. (3)
For the normal example here, it is easy to show that
L
3
(, ) =
( )
2

2
. (4)
Because
2
is known, the entropy loss function is equivalent to the squared
error loss. Thus we will not study this further here.
In Section 2, we derive the expected risk for MLE of and the predic-
tive risk when the plug-in estimate is used. Under L
1
, we obtain Bayes
estimate of with conjugate prior or with joint conjugate prior (,
2
)
depending on whether
2
is known or unknown. We also obtain a prediction
of normal distribution, which is an approximation of Bayes prediction under
L
1
if the variance
2
is known with the conjugate prior of . In Section
3, we derive Bayes estimate and prediction for the LINEX losses and the
expected losses under the LINEX. In Section 4, a higher-order asymptotic
theory for these estimates and predictions is developed, and these risks are
compared theoretically under these loss functions, respectively. In Section
5, we extend the estimates of the parameters in previous sections to linear
combination of regression coecients under these two kinds of loss functions
L
1
and L
2
. Numerical results are presented in Section 6. Several lemmas,
which are useful in verifying the main results, are presented in the Appendix.
The results in the paper show that for estimates, if it is not clear for param-
eter , using the MLE is not too bad. For prediction, as Smith (1997, 1999)
points out that the usual Bayesian method may be inferior to a crude MLE
plug-in. However, if we really have some information, the Bayes estimate
and predictors can perform better.
2 Risk Expansion Under -absolute Error Loss
2.1. The MLE. Suppose that we have independent and identical ob-
servations X
1
, X
2
, . . . , X
n
from a normal distribution N(,
2
) with known
variance
2
. The MLE of normal population mean is the sample mean

X
n
=
1
n

n
i=1
X
i
. We rst give the frequentist risk for the MLE. The proof
is omitted.
Theorem 2.1 The expected loss for the MLE is
IE|

X
n
|

=
_
2

n
/2

_
+ 1
2
_
. (5)
Bayesian and frequentist estimation and prediction 681
Now we are interested in the predictive distribution for a future observation
X

,
(x; ) P{X

x} =
_
x

_
, x IR.
The MLE of (x; ) is then the plug-in estimate, which is optimal relative
to a zero-one loss function in many problems:

M
n
=
_
x

X
n

_
.
Theorem 2.2 The predictive risk in using the MLE of has the form:
IE|
M
n
|

=
_
2

()

n
/2

_
+1
2
__
1+
(+1){(3+1)
2
4}
24n
_
+O(n
2/2
),
(6)
where = (x )/ and is the density function of the standard normal
distribution.
The proof of the theorem is in Appendix B.
2.2. Bayes Estimator of . If a conjugate prior for is employed,
i.e. is N(
0
,
2
0
), where
0
IR and
2
0
> 0 are known, then the posterior
distribution for is N(
n
,
2
n
), where

n
=
n

X
n

2
+

0

2
0
n

2
+
1

2
0
=

X
n
1 +

2
n
2
0
+

0
n
2
0
1 +

2
n
2
0
, (7)
and

2
n
=
1
n

2
+
1

2
0
=

2
n
1
1 +

2
n
2
0
. (8)
To nd the Bayes estimate of under the -absolute error loss function
(1), we need the following lemma.
Lemma 2.3 Suppose a random variable Y is symmetric about 0 with a
density f(y), which is decreasing when |y| increases. If, for some > 0,
IE|Y |

< , then
IE(|Y b|

) IE(|Y |

), for any b IR. (9)


682 Cuirong Ren, Dongchu Sun and Dipak K. Dey
The proof follows since |Y b| is stochastically larger than |Y |.
Remark 1. The monotonicity condition can be dropped if 1 but
not if < 1.
From Lemma 1, the Bayes estimate of is the posterior mean
n
under
the -absolute error loss function (1) since the posterior distribution of
is symmetric about
n
and the density of
n
is decreasing in |
n
|.
Therefore, we have the following result:
Theorem 2.4 Dene b
0
= (
0
)/
2
0
.
a) The expected loss for the Bayes estimate
n
has the closed form expression,
IE|
n
|

=
_
2

exp(
b
2
0
2n
)
n
/2
(1 +

2
n
2
0
)

k=0
2
k
b
2k
0
(
+1
2
+ k)
(2k)!n
k
. (10)
b) When n is large,
IE|
n
|

=
_
2

n
/2

_
+1
2
__
1+

2
2n
2
0
_
a
1
()+
a
2
(, )
n
__
+o(n
2/2
),
(11)
where
a
1
() =
(
0
)
2

2
0
2, (12)
a
2
(, ) =

2

2
0
_
+ 1 +
( 2)(
0
)
4
12
4
0

(
0
)
2

2
0
_
. (13)
Proof. For a), let Z =

n(

X
n
)/. Thus, Z is N(0, 1). We get,
IE|
n
|

= IE

X
n
1 +

2
n
2
0

_

_

0
n
2
0
_
_
_
1 +

2
n
2
0
__

=
_
1+

2
n
2
0
_

IE

X
n

_
1+

2
n
2
0
__

_

0
n
2
0
_
_
_
1+

2
n
2
0
__

n
/2
_
1 +

2
n
2
0
_

IE

Z
b
0
n
1/2

.
Then, by Lemma A.1 a) given in Appendix A, (10) follows.
For b), using (40) where =
2
/(n
2
0
) and Lemma A.1 b), (11) follows.
Bayesian and frequentist estimation and prediction 683
Now, let us compare the MLE and the Bayes estimate of under the
loss function (1). It follows from Theorem 2.1 and Theorem 2.4 Part b) that
IE|
n
|

IE|

X
n
|

=
_
2
2

+2
n

2
+1

2
0

_
+ 1
2
__
a
1
() +
a
2
(, )
n
_
+ o(n
2/2
).
The leading term in the dierence is proportional to a
1
() and is also propor-
tional to (
0
)
2
/
2
0
2. Thus, when (
0

2
0
,
0
+

2
0
), the Bayes
estimate
n
performs better than the MLE

X
n
asymptotically. However,

X
n
asymptotically performs better than
n
if / (
0

2
0
,
0
+

2
0
). When
the variance of the prior,
2
0
, is large enough, both a
1
() and a
2
(, ) go to
zero. In fact, in this case
n
is close to

X
n
, so the dierence of the expected
loss goes to zero.
2.3. Bayes estimator for unknown
2
. In practice,
2
is often unknown.
A joint conjugate prior for and
2
is then considered:
|
2
N(
0
,
2
/
0
)

2
Inv-
2
(
0
,
2
0
),
which corrsponds to the joint prior density
p(,
2
)
1
(
2
)
(
0
/2+1)
exp
_

1
2
2
{
0

2
0
+
0
(
0
)
2
}
_
, (14)

0
,
0
,
0
, and
2
0
are known constants. This joint prior density sometimes is
denoted by N-Inv-
2
(
0
,
2
0
/
0
;
0
,
2
0
); its four parameters can be identied
as the location and scale of and the degrees of freedom and scale of
2
,
respectively. From Gelman et al, (2003), the joint posterior distribution is
p(,
2
|X
n
)
1
(
2
)
(
0
/2+1)
exp
_

1
2
2
[
0

2
0
+
0
(
0
)
2
]
_
(
2
)
n/2
exp
_

1
2
2
[(n 1)s
2
+ n(

X
n
)
2
]
_
= N-Inv-
2
(
n
,
2
n
/
n
;
n
,
2
n
),
where

n
=

0

0
+ n

0
+
n

0
+ n

X
n
,

n
=
0
+ n,

n
=
0
+ n,

n

2
n
=
0

2
0
+ (n 1)s
2
+

0
n

0
+ n
(

X
n

0
)
2
.
684 Cuirong Ren, Dongchu Sun and Dipak K. Dey
Then, integrating the joint posterior density with respect to
2
, one can
show that the marginal posterior density for is
p(|X
n
)
_
1 +

n
(
n
)
2

n

2
n
_
(
n
+1)/2
= t

n
(|
n
,
2
n
/
n
).
Because the marginal posterior density for is symmetric about
n
and
decreases in |
n
|, from Lemma 2.3, the Bayes estimate of is the posterior
mean
n
under L
1
.
Remark 2. The joint conjugate prior of (,
2
) with density (14) in-
cludes several commonly used priors. For example, if one takes
0
=
2
0
= 0
and
0
= 1, it becomes the Jereys prior; it is the reference prior if

0
=
2
0
= 0 and
0
= 2.
Theorem 2.5 Dene b
1
=
0
(
0
)/.
a) The expected loss for the Bayes estimate
n
has the closed form expression,
IE|
n
|

=
_
2

exp(
b
2
1
2n
)
n
/2
(1 +

0
n
)

k=0
2
k
b
2k
1
(
+1
2
+ k)
(2k)!n
k
. (15)
b)When n is large,
IE|
n
|

=
_
2

(
+1
2
)
n
/2
_
1+

2n
_
b
2
1
2
0
+
(+1)
2
0
b
2
1

0
+
(2)b
4
1
12
n
_
_
+o(n
2/2
), (16)
Proof. With some algebra, one can show that
IE|
n
|

n
/2
_
1 +

0
n
_

Z
b
1

.
Then, by applying the results in Lemma A.1, one can show that the conclu-
sions hold in the theorem.
Remark 3. If
0
= 0, that is, has a vague prior, then conclusions are
the same as in Theorem 2.1. Therefore, under L
1
, the expected loss for the
Bayes estimate
n
is the same as the expected loss of the MLE when the
prior is the Jereys prior or reference prior.
Bayesian and frequentist estimation and prediction 685
2.4 Bayes prediction. Finding the closed form of the Bayes estimate
for the predictive distribution = (), where = (x )/ under the
loss function (1), is not easy. Instead, we rst examine two special cases.
Through these two special cases, we will give a reasonable approximate es-
timate of . We will compute the predictive risk of this estimate under the
-absolute loss and compare it with MLE.
Case 1. = 1. In this case, the Bayes estimate of is the posterior
median. The posterior median of is the posterior mean
n
, because the
posterior distribution of is N(
n
,
2
n
), which is symmetric about
n
. Fur-
thermore, because is an increasing function, the Bayes estimate of is
(s
0
), where s
0
= (x
n
)/.
Case 2. = 2. The Bayes estimate of is its posterior mean,
IE( | X
n
) =
_

_
x

_
1

2
n
e

(
n
)
2
2
2
n
d
=
_

_
s
0


n
t

_
(t) dt =
_
s
0
_
1 +
2
n
/
2
_
,
where
n
is the posterior standard deviation, given in (8). The last equality
is derived based on that both sides are P{X s
0

n
T/}, where X and T
are independent N(0, 1). Because
2
n
/
2
= 1/n+o(1/n) and 11/

1 + x =
x/2 + o(x),
_
s
0

1+
2
n
/
2
_
is equal to

_
s
0
s
0
_
1
1
_
1 +
2
n
/
2
_
_
= (s
0
s
0
/(2n) + o(1/n))
= (s
0
(11/2n)) + o(n
1
).
From these two special cases, the Bayes estimates of under the loss function
(1) is close to (s
0
). Therefore, we only consider the estimate of of the
form,
(s
0
{1 + g()/n}), (17)
for some function g of .
Proposition 1. g(), which minimizes IE{|(s
0
(1+g()/n))()|

|X
n
},
is equal to the argument t (a function of ), which minimizes the function,
h(t) = IE

_
b +
bt
n
_

_
b +

n

Z
_

, t IR, (18)
where b = s
0
and the expectation is taken with respect to Z N(0, 1).
686 Cuirong Ren, Dongchu Sun and Dipak K. Dey
The proof is simple and omitted.
Lemma 2.6 For any constant b IR, the minimum point t
0
of the func-
tion h(t) dened in (18) has the form,
t
0
=
1
2
+ O(n
1
).
The proof is in Appendix B.
From Lemma 2.6, we have
_
x
n

_
1
1
2n
__
, denoted by
n
, is an
approximation of Bayes estimate of = (
x

) under the loss function (1),


i.e.

n
=
_
x
n

_
1
1
2n
__
. (19)
Theorem 2.7 Dene
a
3
(, ) =
(b
0
)
2
2
+
( + 1)
2
8 + 4
12


2

2
0
,
where b
0
= (
0
)/
2
0
and = (x )/. Then the predictive risk of
n
has the expansion,
IE|
n
|

=
_
2

()

n
/2

_
+1
2
__
1+
a
3
(, )
n
_
+O(n
2/2
). (20)
Corollary 1 The dierence of predictive risks between
n
and the
M
n
is
IE|
n
|

IE|
M
n
|

=
_
2

()

d
3
(, )
n
/2+1

_
+ 1
2
_
+ O(n
2/2
),
where
d
3
(, ) = a
3
(, )
( + 1){(3 + 1)
2
4}
24
.
Proof. It follows from Theorem 2.2 and Theorem 2.7 immediately.
If a diuse prior, p() 1, is employed, i.e.
2
0
, we have, by (19)

0n
=
_
x

X
n

_
1
1
2n
__
.
Bayesian and frequentist estimation and prediction 687
Corollary 2 a) The predictive risk for
0n
has the form
IE|
0n
|

=
_
2

()

n
/2

_
+1
2
__
1+
(21){(3+1)
2
4}
12n
_
+O(n
2/2
).
b) The dierence of predictive risks between
0n
and
M
n
is
IE|
0n
|

IE|
M
n
|

=
_
2

()

n
/2+1

_
+1
2
_
(1){(3+1)
2
4}
8
+O(n
2/2
). (21)
From (21), we have the following conclusions. If = 1, both of them
perform approximately the same. For > 1, when is close to zero,
0n
performs better than
M
n
; otherwise,
M
n
performs better than
0n
. For
0 < < 1, when is close to zero,
M
n
performs better than
0n
; otherwise,

0n
performs better than
M
n
.
3 Bayes Estimators and Risks Under the LINEX Loss
Now, let us consider the LINEX loss function (2) to nd the estimate of
with known . Zellner (1986) showed that the Bayes estimate, denoted by

B
, under the LINEX loss (2) is

B
=
1
c
log
_
IE(e
c
| X
n
)
_
. (22)
If the posterior distribution of is N(
n
,
2
n
), then IE(e
c
|X
n
) = exp(c
n
+
c
2

2
n
/2) and the Bayes estimate reduces to

B
=
n

1
2
c
2
n
, (23)
where
n
and
2
n
are given in (7) and (8), respectively.
Theorem 3.1 Under a normal prior N(
0
,
2
0
), the expected loss of

B
under the LINEX loss is
IE{L
2
(
B
, )} = exp
_
cs
n
(cs
n
2

b)
2n
_
+
cs
n

b
n
1, (24)
where
s
n
=

1 +

2
n
2
0
and

b =
_

0

2
0
+
c
2
_
.
688 Cuirong Ren, Dongchu Sun and Dipak K. Dey
Proof. Note that

B
=
1
1 +

2
n
2
0
_

X
n


2
n
_

0

2
0
+
c
2
__
=
s
n
(Z n
1/2

b)
n
1/2
,
where Z =

n(

X
n
)/. We get,
IE{L
2
(
B
, )} = IE
_
exp
_
s
n
(Z n
1/2

b)
n
1/2
_

s
n
(Z n
1/2

b)
n
1/2
1
_
= exp
_

cs
n

b
n
_
exp
_
c
2
s
2
n
2n
_
+
cs
n

b
n
1
because since Z N(0, 1), and IE(e
bZ
) = exp(b
2
/2) for any b. This com-
pletes the proof.
If a noninformative prior, p() 1, is employed, then the optimal esti-
mate relative to the LINEX loss function (2) is

0B
=

X
n

c
2
2n
,
which is the same as (23) when
2
0
. Thus, we have
Corollary 3 The expected loss
0B
under the LINEX loss is
IE{L
2
(
0B
, )} =
c
2

2
2n
.
4 Comparisons
In this section, we will derive and compare the frequentist risks for esti-
mates and predictions under the -absolute loss and the LINEX loss.
4.1. Under the -absolute error loss. We will derive the frequentist risks
for these estimates in the previous section under the -absolute loss and
compare them with MLE.
Theorem 4.1 a) The frequentist risk of
B
under the loss function (1)
is
IE|
B
|

=
_
2

n
/2

_
+1
2
_
_
1+

n
_

b
2
2

2
0
+
a
4
(, )
n
__
+o(n
2/2
),
(25)
Bayesian and frequentist estimation and prediction 689
where

b is dened in Theorem 3.1 and
a
4
(, ) =
( + 1)
4
2
4
0
+
( 2)

b
4
24


2

b
2
2
2
0
.
b) The frequentist risk of
0B
under the loss function (1) is
IE|
0B
|

=
_
2

n
/2

_
+1
2
__
1+

2
c
2
8n
+
( 2)
4
a
4
384n
2
_
+o(n
2/2
).
c) The dierence of the expected losses of
n
and
B
under the loss function
(1) is
IE|
n
|

IE|
B
|

=
_
2
2

(b
2
0

b
2
)
n
/2+1

_
+1
2
_
_
1+
(2)(b
2
0
+

b
2
)
12n

2
n
2
0
_
+o(n
2/2
),
where b
0
= (
0
)/
2
0
.
Proof. For Part a), if b in Theorem 2.4 is replaced by

b, then it is done.
When
0
in Part a), we get Part b). Part c) follows from Part a) and
Theorem 2.4 Part b).
It is easy to see IE|
0B
|

> IE|

X
n
|

uniformly for a large n and


the MLE is superior to the Bayes estimate under the loss function (1). A
special case, when = 2, has been proved in Zellner (1986) for a small
sample size n.
For a large n, when is close to
0
,
n
performs better than
B
. When
is far away from
0
, if and
0
are the same sign, then
n
performs
better than
B
. Otherwise,
B
performs better than
n
.
4.2. Under the LINEX loss. It would be interesting to compare the risks
under the LINEX loss for the three estimates, the MLE

X
n
and the posterior
mean
n
(the Bayes estimate under the loss function (1)) and
B
(the Bayes
estimate under the LINEX loss). We give the corresponding risks of

X
n
and

n
. The proof is similar to that of Theorem 3.1 and is omitted.
Theorem 4.2 a) The expected loss of

X
n
under the LINEX loss is
IE{L
2
(

X
n
, )} = exp
_
c
2

2
2n
_
1.
690 Cuirong Ren, Dongchu Sun and Dipak K. Dey
b) The expected loss of
n
under the LINEX loss is
IE{L
2
(
n
, )}=exp
_
cs
n
(cs
n
2b
0
)
2n
_
+
cs
n
b
0
n
1,
where s
n
is dened in Theorem 3.1 and b
0
= (
0
)/
2
0
.
From Theorem 3.1 and Theorem 4.2, for a large n, we have the expan-
sions,
IE{L
2
(
B
, )} =
c
2

2
2n
+
c
2

4
2
2
0
n
2
_
(
0
)
2

2
0
2
_
+ o(n
2
), (26)
IE{L
2
(

X
n
, )} =
c
2

2
2n
+
c
4

4
8n
2
+ o(n
2
), (27)
IE{L
2
(
n
, )} =
c
2

2
2n
+
c
2

4
2n
2
_
_

0

2
0

c
2
_
2

2
0
_
+ o(n
2
). (28)
Thus, IE{L
2
(
0B
, )}<IE{L
2
(

X
n
, )} uniformly and consequently, the Bayes
estimate is superior to MLE under the LINEX loss function. Zellner (1986)
proved that

X
n
is an inadmissible estimate for under the LINEX loss.
Because
IE{L
2
(
n
, )} IE{L
2
(
B
, )} =
c
3

4
{4(
0
) c
2
0
}
8n
2

2
0
+ o(n
2
),

B
performs better than
n
when c{4(
0
)/
2
0
c} < 0, but
n
performs
better than
B
when c{4(
0
)/
2
0
c} > 0.
When
2
is unknown, we replace
2
by S
2
n
=
1
n1

n
i=1
(X
i


X
n
)
2
. The
new estimate of is

B
=
n

1
2
c
2
n
,
where

2
n
=
S
2
n
n
_
1 +
S
2
n
n
2
0
_. (29)
By Lemma A.4, we have the same expansion as (26) for IE{L
2
(

B
, )}.
4.3. Predictive risks under the LINEX loss. The Bayes estimate of
under the LINEX loss, denoted by
n
, according to (22), is

n
=
1
c
log
_
IE(e
c
| X
n
)
_
.
Bayesian and frequentist estimation and prediction 691
First, we give the expansion for
n
. Then we will compare the risks under
the LINEX loss for three estimates
n
,
M
n
and
n
.
Lemma 4.3 For large n, the following expansion holds.

n
= (s
0
)
(s
0
){s
0
+ c(s
0
)}
2n
+ O(n
2
),
where s
0
= (x
n
)/.
Proof. Note that
IE(e
c
|X
n
)=
1

2
n
_

e
c(
xu

)
e

(u
n
)
2
2
2
n
du=
_

e
c(s
0

n
t

)
(t)dt
because N(
n
,
2
n
) given X
n
. From the expression
n
= O(n
1/2
), we
have

_
s
0

n
t

_
=(s
0
)
(s
0
)
n
t


s
0
(s
0
)
2
n
t
2
2
2

(s
2
0
1)(s
0
)
3
n
t
3
6
3
+O(n
2
t
4
),
Using the fact (38) for any small positive number a and the Taylor expansion
for exp{
c(s
0
)
n
t

+
cs
0
(s
0
)
2
n
t
2
2
2
+
c(s
2
0
1)(s
0
)
3
n
t
3
6
3
}, we have
IE(e
c
| X
n
) = e
c(s
0
)
_
1 +
cs
0
(s
0
){1 + s
0
(s
0
)}
2n
_
+ O(n
2+4a
).
Let a 0, then the proof is complete.
Theorem 4.4 The predictive risk of
n
under the LINEX loss has the
expansion,
IE{L
2
(
n
, )}
=
c
2
()
2
2n
_
1 +
1
n
_
(b
0
2)
2


2
2

2
2

2
0
+c() 2
__
+ o(n
2
),
where = (x )/.
When the variance of the prior,
2
0
, i.e., the noninformative prior
is employed, the Bayes predictor of is denoted by
0n
.
692 Cuirong Ren, Dongchu Sun and Dipak K. Dey
Corollary 4 a) The Bayes predictor of for the noninformative prior
has the expansion,

0n
=
_
x

X
n

1
2n

_
x

X
n

__
x

X
n

+ c
_
x

X
n

__
+ O(n
2
).
b) The predictive risk of
0n
under the LINEX loss has the expansion,
IE{L
2
(
0n
, )} =
c
2
()
2
2n
_
1 +
7
2
+ 2c() 4
2n
_
+ o(n
2
).
Theorem 4.5 a) The predictive risk of
M
n
under the LINEX loss has
the expansion,
IE{L
2
(
M
n
, )} =
c
2
()
2
2n
_
1 +
{c() 3}
2
2
2
4
4n
_
+ o(n
2
).
b) The predictive risk of
n
under the LINEX loss has the expansion,
IE{L
2
(
n
, )}
=
c
2
()
2
2n
_
1 +
1
n
_
_
b
0

( + 2)
2
+
c()
2
_
2


2
2

2
2

2
0

__
+ o(n
2
).
Corollary 5 The predictive risk of
0n
under the LINEX loss has the
expansion,
IE{L
2
(
0n
, )} =
c
2
()
2
2n
_
1 +
{c() ( + 2)}
2
2
2
4
4n
_
+ o(n
2
).
5 Linear Combination of Regression Coecients
The results we present above can be adapted to multiple regression con-
text,
y = X +e, (30)
where X is an nk full column rank matrix, is a k-dimensional vector of
unknown regression coecients, the error vector e follows N
n
(0,
2
I
n
), and

2
is assumed to be known.
If we would like to estimate a linear combination of regression coecients,
namely, = l

, where l = 0 is a given k-dimensional vector. Suppose that


Bayesian and frequentist estimation and prediction 693
the prior for is N
k
(
0
,
0
), where
0
and
0
are known. The posterior of
given y is N
k
(

n
, V
n
), where

n
=
_

1
0
+
X

2
_
1
_

1
0

0
+
X

2
_
, and V
n
=
_

1
0
+
X

2
_
1
.
Thus, given y is N(l

n
, l

V
n
l). The Bayes estimate of is the posterior
mean

n
= l

n
under L
1
, and its Bayes estimate relative to the LINEX loss
function is

B
= l

c
2
l

V
n
l.
If a diuse prior is applied for , which is equivalent to
1
0
= O, then we
have the Bayes estimate of under L
1
is

= l

(X

X)
1
X

y, which is MLE
of , denoted by

M
= l

M
n
and under the LINEX loss, l

(X

X)
1
X

y
c
2
2
l

(X

X)
1
l, denoted by

0
B
, which one can nd in Zellner (1986).
Let us assume that
_
_
_
_
1
n
X

X
_
D+
1
n
E
__
_
_
_
= o(1), (31)
where is the Euclidean norm of matrices, which is dened as A =
_
m
i=1

n
j=1
a
2
ij
_
1/2
. Here A = (a
ij
)
mn
. D is positive denite matrix and
E is a symmetric matrix. Note that l

n
is normal with mean of l

1
0
+X

X/
2
_
1

1
0
(
0
) and variance

2
n
= l

1
0
+X

X/
2
_
1
X

X
_

1
0
+X

X/
2
_
1
l/
2
.
Thus, from Lemma A.1, we have the following:
IE{L
1
(

n
, )} =

n
_
2

exp
_

2
n
2
_

k=0
2
k

2k
n
(
+1
2
+ k)
(2k)!
,
where
n
= l

1
0
+
X

2
_
1

1
0
(
0
)/
n
. By the assumption of X

X
in (31), one can nd

2
n
=

2
l

D
1
l
n


2
l

D
1
(2
2

1
0
+E)D
1
l
n
2
+ o(n
2
),
and

n

n
=

2
l

D
1
n
_
I
(
2

1
0
+E)D
1
n
_

1
0
(
0
) + o(n
2
).
We have the following theorem, whose proof is omitted.
694 Cuirong Ren, Dongchu Sun and Dipak K. Dey
Theorem 5.1 (1) Under L
1
loss function, the expected losses of the
Bayes estimates

n
and

B
are
IE{L
1
(

n
, )}
=
_
(2
2
l

D
1
l)

_
1 +
( + 1)
2
a
5
()
2
a
6
(
2
)
2nl

D
1
l
_
+o(n
1/2
),
IE{L
1
(

B
, )}
=
_
(2
2
l

D
1
l)

_
1 +
( + 1)
2
{2a
5
() + cl

D
1
l}
2
4a
6
(
2
)
8nl

D
1
l
_
+o(n
1/2
),
respectively, where
a
5
() = l

D
1

1
0
(
0
),
a
6
(
2
) = l

D
1
(2
2

1
0
+E)D
1
l.
(2) Under L
2
loss function, the expected losses of the Bayes estimates

n
and

B
are
IE{L
2
(

n
, )} = exp
_
c
n
(c
n

n
)
2
_
+ c
n

n
1,
IE{L
2
(

B
, )} = exp
_
c(2c
2
n
2
n

n
cl

V
n
l)
4
_
+ c
_

n

n
1
2
l

V
n
l 1
_
,
respectively.
From Theorem 5.1, one can obtain the expected losses of the Bayes es-
timates when the diuse prior is applied under these two kinds of loss func-
tions. For example
IE{L
1
(

M
n
, )}
=
_
(2
2
l

D
1
l)

_
1
l

D
1
ED
1
l
2nl

D
1
l
_
+o(n
1/2
),
IE{L
1
(

0
B
, )}
=
_
(2
2
l

D
1
l)

_
1+
(+1)
2
(cl

D
1
l)
2
4l

D
1
ED
1
l
8nl

D
1
l
_
+o(n
1/2
).
6 Numerical Results
In the following numerical examples, we assume that variance
2
= 1
and the constant c in the linex loss (2) to be 1.
Bayesian and frequentist estimation and prediction 695

0

0

0

0

0
= 1,
p p p p p p p p p p p p

0
= 2,
0
= 10
Figure 1. The proportion of the difference between E|
n
|

and E|

X
n
|

at n = 20, (a) = 0.5; (b) = 1; (c) = 1.5; (d) = 2.


In Figure 1, we plot the proportion of the risk dierence IE|
n
|

IE|

X
n
|

, which is a
1
() +a
2
(, )/n versus
0
, where a
1
and a
2
are
dened in (12) and (13), respectively. We choose n = 20, = 0.5, 1, 1.5, 2
and
0
= 1, 2, 10. All plots show that if
0
, the mean of the prior, is close
to , the Bayes estimate
n
is better than the MLE

X
n
. Otherwise, the
MLE is better than the Bayes estimate. Also, these plots show that when

2
0
is large, the MLE and the Bayes estimate perform almost the same.
696 Cuirong Ren, Dongchu Sun and Dipak K. Dey

0
= 1,
p p p p p p p p p p p p

0
= 2,
0
= 10
Figure 2. The proportion of the difference between the risks of
n
and
M
n
for

0
= 1, 2 and 10, (a) = 1,
0
= 0.5; (b) = 2,
0
= 0.5; (c) = 1,
0
= 2;
(d) = 2,
0
= 2.
In Figure 2, we plot d
3
(, ), which is the proportion of IE|
n
|

IE|
M
n
|

versus , where = (x )/, for = 1, 2,


0
= 0.5, 2,
and
0
= 1, 2, 10. We see that if
0
= 0.5,
n
performs almost better
than
M
n
. However, for
0
= 2, it is not clear which is better. As
0
increases, the dierence goes to zero and consequently
n
performs better
than
M
n
. This suggests that if the prior mean is well chosen and
0
is large,
we should choose
n
as a predictor of .
Bayesian and frequentist estimation and prediction 697

0
= 1,
p p p p p p p p p p p p

0
= 2,
0
= 10
Figure 3. The proportion of the difference between the predictive risks of

n
and
n
for
0
= 1, 2 and 10, (a) = 1,
0
= 0.5; (b) = 2,
0
= 0.5; (c)
= 1,
0
= 2; (d) = 2,
0
= 2.
In Figure 3, we plot {b
0
(+2)/2+c()/2}
2
(b
0
2)
2
+2+c(),
i.e. the proportion of IE{L
2
(
n
, )} IE{L
2
(
n
, )}, versus , where b
0
=
(
0
)/
2
0
. We choose = 1, 2,
0
= 0.5, 2 and = 1, 2, 10. For
= 1, the plots show that
n
is better than
n
. However, it is not clear
which is better in general.
698 Cuirong Ren, Dongchu Sun and Dipak K. Dey

0
= 1,
p p p p p p p p p p p p

0
= 2,
0
= 10
Figure 4. The proportion of the difference between the predictive risks of

n
and
M
n
for
0
= 1, 2 and 10, (a)
0
= 0.5; (b)
0
= 2.
In Figure 4, we plot the proportion of IE{L
2
(
n
, )} IE{L
2
(
M
n
, )},
which is (b
0
2)
2

2
2

2
0
+c() {c() 3}
2
/4 1 versus . It shows
that when
0
is close to ,
n
performs better than
M
n
. Otherwise,
M
n
performs better than
n
.
Appendix
A Some Lemmas
We give some lemmas that will be used for proving main results.
Lemma A.1 Let Z N(0, 1).
a) For any constant b, we have
IE

Z
b
n
1/2

=
_
2

exp
_

b
2
2n
_

k=0
2
k
b
2k
(
+1
2
+ k)
(2k)!n
k
. (32)
b) When n is large,
IE

Z
b
n
1/2

=
_
2


_
+1
2
__
1+
b
2
2n
+
(2)b
4
24n
2
_
+o(n
2
). (33)
Bayesian and frequentist estimation and prediction 699
Proof. For Part a), let Y = Z b/

n. Then Y is N(b/

n, 1). Thus,
IE

Z
b
n
1/2

=
1

2
_

0
y

_
e

1
2
(y+bn
1/2
)
2
+ e

1
2
(ybn
1/2
)
2
_
dy.
Note that
e

1
2
(y+
b
n
1/2
)
2
+ e

1
2
(y
b
n
1/2
)
2
= 2 exp
_

1
2
_
y
2
+
b
2
n
__

k=0
(by)
2k
(2k)!n
k
.
Because each term in the above series and y

are nonnegative, we can ex-


change the summation and the integration and use the following fact
_

0
t

t
2
2
dt = 2
1
2
(
+ 1
2
), > 0, (34)
to each term, the result then follows. For Part b), it is easy to see that

k=3
2
k
b
2k
(
+1
2
+ k)
(2k)!n
k
= o(n
2
).
Using Part a) and the Taylor expansion for exp{b
2
/(2n)}, the result follows.

Lemma A.2 For any xed a, as s +,


(s + a)
s
a
(s)
= 1 +
a(a 1)
2s
+
a(a 1)(a 2)(3a 1)
24s
2
+ o(s
2
). (35)
Proof. From Bowman and Shenton (1988, p. 26) about the expansion
formula for log (y), we have (y) =

2y
y
1
2
e
y+
1
12y
+o(y
2
)
, as y .
Therefore,
(s + a)
s
a
(s)
=

2(s + a)
s+a
1
2
exp{s a +
1
12(s+a)
+ o(s
2
)}
s
a

2s
s
1
2
exp{s +
1
12s
+ o(s
2
)}
=
_
1 +
a
s
_
s+a
1
2
exp
_
a
a
12s(s + a)
_
+ o(s
2
)
= exp
_
a
a
12s(s + a)
+
_
s + a
1
2
_
log
_
1 +
a
s
_
_
+ o(s
2
).
Because
log(1 + t) = t
t
2
2
+
t
3
3
+ o(t
3
), as t 0,
700 Cuirong Ren, Dongchu Sun and Dipak K. Dey
we have
_
s + a
1
2
_
log
_
1 +
a
s
_
=
_
s + a
1
2
__
a
s

1
2
_
a
s
_
2
+
1
3
_
a
s
_
3
+ o(s
3
)
_
= a +
a(a 1)
2s
+
a
2
(3 2a)
12s
2
+ o(s
2
).
With Taylor expansion for exponential function, the result then follows.
Lemma A.3 For any xed a > 0 and y > 0, as s +,
1
(s)
_

0
t
s1
e
t
(1 +
at
s
2
)
y
dt = 1
ya
s
+
y(y + 1)a
2
2s
2
+ o(s
2
). (36)
Proof. With some algebra, we have that (36) is equivalent to
J
1
(s)
_

0
t
s1
e
t
(1 +
at
s
2
)
y
g
_
at
s
2
, y
_
dt = o(s
2
), (37)
where
g(x, y) = (1 + x)
y
_
1 yx +
1
2
y(y + 1)x
2
_
1, x > 0 and y > 0.
Because g(0, y) = 0 and
g(x, y)
x
=
1
2
y(y + 1)(y + 2)(1 + x)
y1
x
2
> 0,
g(x, y) > 0. Consequently, J has an upper bound
1
(s)
_

0
t
s1
e
t
g
_
at
s
2
, y
_
dt.
Note that
g(x, y)
y
= (1+x)
y
__
1yx+
y(y+1)
2
x
2
_
log(1+x)+yx
2

x(2x)
2
_
= (1+x)
y
_
x
2
log(1+x)
2
_
y
2
+
_
1
2
x
+
2
log(1+x)
_
y
_
+log(1+x)
x(2x)
2
_
,
which is positive for x, y > 0 because log(1 + x) > x, and log(1 + x)
1
2
x(2 x) > 0 for x > 0. Consequently, g(x, y) is an increasing function in
y. Choosing an integer k(> y), we get
J
1
(s)
_

0
t
s1
e
t
_
_
1 +
at
s
2
_
k
_
1
kat
s
2
+
k(k + 1)a
2
t
2
2s
4
_
1
_
dt
=
1
(s)
_

0
t
s1
e
t
_
k

i=0
_
k
i
_
a
i
t
i
s
2i
_
1
kat
s
2
+
k(k + 1)a
2
t
2
2s
4
_
1
_
dt.
Bayesian and frequentist estimation and prediction 701
Using the following fact,
1
s
2i
(s)
_

0
t
s+i1
e
t
dt =
(s + i)
s
2i
(s)
= o(s
2
), for i > 2,
the result then follows.
Lemma A.4 Suppose that a random variable Y
n
is
2
n1
. Dene
a
7
(y, ) =
y(y 1)(3y
2
7y + 8)
6

y(y 1)
2

2
0

2y
2

2
0
+
( + 1)
4
2
4
0
.
Then, for any real y and > 0,
IE
_
(
Y
n
n1
)
y
(1 +

2
Y
n
(n1)n
2
0
)

_
= 1 +
y(y 1)
2
/
2
0
n
+
a
7
(y, )
n
2
+ o(n
2
).
Proof. Note that
IE
_
(
Y
n
n1
)
y
(1+

2
Y
n
(n1)n
2
0
)

_
=
1
2
(n1)/2
(n 1)
y
(
n1
2
)
_

0
t
(n1)/2+y1
e
t/2
(1+

2
t
(n1)n
2
0
)

dt
=
(
n1
2
+ y)
(
n1
2
)
y
(
n1
2
)
1
(
n1
2
+y)
_

0
u
(n1)/2+y1
e
u
(1+
2
2
u
(n1)n
2
0
)

du.
By lemma A.2, we have
(
n1
2
+ y)
(
n1
2
)
y
(
n1
2
)
= 1 +
y(y 1)
n
+
y(y 1)(3y
2
7y + 8)
6n
2
+ o(n
2
).
By lemma A.3,
1
(
n1
2
+ y)
_

0
u
(n1)/2+y1
e
u
(1+
2
2
u
(n1)n
2
0
)

du
= 1

2
n
2
0
+

2
n
2

2
0
_
2y +
(+1)
2
a
2
0
_
+ o(n
2
).
The result then follows.
B Proofs of Main Results
Proof of Theorem 2.2. Let Y =

n(

X
n
)/, which is N(0, 1).
Then
702 Cuirong Ren, Dongchu Sun and Dipak K. Dey
IE|
M
n
|

= IE

_
x

X
n

_
()

= IE

Y
n
1/2
+
_
()

=
_

0
__

_
y
n
1/2
+
_
()
_

+
_
()
_

y
n
1/2
+
__

_
(y)dy.
Let
g
1
(y) =
_

_
y
n
1/2
+
_
()
_

and g
2
(y) =
_
()
_

y
n
1/2
+
__

.
Then, for any a > 0,
IE|
M
n
|

=
_

0
{g
1
(y) + g
2
(y)}(y) dy
=
_
n
a
0
{g
1
(y)+g
2
(y)}(y)dy+
_

n
a
{g
1
(y)+g
2
(y)}(y)dy.
Note that 0 g
i
(y) 1, i = 1, 2, and for any positive numbers (a, l),
lim
n+
_

n
a
y
l
(y) dy = 0. (38)
Then
IE|
M
n
|

=
_
n
a
0
{g
1
(y) + g
2
(y)}(y) dy + o(n
2
).
It follows from Taylor expansion that if a (0, 1/2) and y (0, n
a
),
g
1
(y)
=
_
()
y
n
1/2
+
1
2

()
y
2
n
+
1
6

()
y
3
n
3/2
+
1
24

()
y
4
n
2
+ O(n
5/2
y
5a
)
_

=
_
()y
n
1/2
_

_
1 +

()
2()
y
n
1/2
+

()
6()
y
2
n
+

()
24()
y
3
n
3/2
+ O(n
2+4a
)
_

=
_
()y
n
1/2
_

_
1
y
2n
1/2
+
(
2
1)y
2
6n
+
(3
2
)y
3
24n
3/2
+ O(n
2+4a
)
_

.
The last equality follows from the facts that

() = (),

() = (
2
1)()

() = (3
2
)(). (39)
Bayesian and frequentist estimation and prediction 703
Using the Taylor expansion,
(1 + )

= 1 + +
( 1)
2

2
+
( 1)( 2)
6

3
+ O(
4
), (40)
we get g
1
(y) is equal to
_
()y
n
1/2
_

_
1
y
2n
1/2
_

{(3+1)
2
4}y
12n
1/2

{42(+1)
2
}y
2
24n
_
+O(n
2+4a
)
_
.
Similarly, we have the expansion for g
2
(y). Therefore,
g
1
(y) + g
2
(y) =
_
()y
n
1/2
_

_
2 +
{(3 + 1)
2
4}y
2
12n
_
+ O(n
2/2+4a
).
Using (38) again, we have
_
n
a
0
{g
1
(y) + g
2
(y)}(y) dy
=
_

0
_
()y
n
1/2
_

_
2 +
{(3 + 1)
2
4}y
2
12n
_
(y) dy + O(n
2/2+4a
).
Then, using (34) and (39), we obtain
IE|
M
n
|

=
_
2

()

n
/2

_
+ 1
2
__
1 +
( + 1){(3 + 1)
2
4}
24n
_
+O(n
2/2+4a
).
Let a 0, then the result follows.
Proof of Lemma 2.6. It is easy to see that h

(t) = b(b+bt/n){h
1
(t)
h
2
(t)}/n, where
h
1
(t) =
_ bt
n
n

_
b +
bt
n
_

_
b +

n
y

_
_
1
(y) dy,
h
2
(t) =
_

bt
n
n
_

_
b +

n
y

_
b +
bt
n
__
1
(y) dy.
Let u = y + bt/(n
n
). We have
h
1
(t) =
_

0
g
3
(u)
_
u +
bt
n
n
_
du =
__
n
a
0
+
_

n
a
_
g
3
(u)
_
u +
bt
n
n
_
du,
704 Cuirong Ren, Dongchu Sun and Dipak K. Dey
where a (0, 1/2) and
g
3
(u) =
_

_
b +
bt
n
_

_
b +
bt
n


n
u

__
1
.
Because
n
= O(n
1/2
) and g
3
(u)/u
1
= O(n
(1)/2
), then using equation
(38), the second integral is negligible. Because u < n
a
,

_
u+
bt
n
n
_
=
exp
_

1
2
_
u
2
+
b
2

2
t
2
n
2

2
n
__

2
_
1+
btu
n
n
+
1
2
_
btu
n
n
_
2
+O(n

3
2
+3a
)
_
.
Using facts (39), one can obtain that g
3
(u) equals to
_
(b)
n
u

_
1
_
1+(1)
_
b
n
u
2

b
2
t
n
+
{(32)b
2
4}
2
n
u
2
24
2
_
+O(n

3
2
+3a
)
_
.
Similarly, if we use the transformation u = y bt/(n
n
) for h
2
(t), we have
h
2
(t) =
__
n
a
0
+
_

n
a
_
g
4
(u)
_
u +
bt
n
n
_
du,
where
g
4
(u) =
_

_
b +
bt
n
+

n
u

_
b +
bt
n
__
1
.
Thus, we have the similar expansions for (u +bt/(n
n
)) and g
4
(u). Then
g
3
(u)
_
u +
bt
n
n
_
g
4
(u)
_
u +
bt
n
n
_
=
(b)
1

2
n
u

2
2
exp
_

1
2
_
u
2
+
b
2

2
t
2
n
2

2
n
___
t +
1
2
+ O(n
1+3a
)
_
.
It is easy to see that when t
0
=
1
2
+ O(n
1
), h

(t) = o(n
2
), i.e., h(t)
can reach the minimum value at t
0
.
Proof of Theorem 2.7. Dene
W
n
=

X
n

/n
1/2

1
n
1/2
_
_
_
b
0

( 1)(1 +

2
n
2
0
)
2(1
1
2n
)
_
_
_
.
Then
IE|
n
|

= IE

_
x
n

_
1
1
2n
__
()

= IE

_


n
n
1/2
W
n
_
()

. (41)
Bayesian and frequentist estimation and prediction 705
Clearly, W
n
has a normal distribution with mean t
n
/

n and variance 1,
where
t
n
= b
0
+
( 1)
_
1 +

2
n
2
0
_
2
_
1
1
2n
_ and
n
=
1
1
2n
1 +

2
n
2
0
.
Thus, (41) equals to
_

0
__

_
+

n
w
n
1/2
_
()
_

_
w +
t
n
n
1/2
_
+
_
()
_


n
w
n
1/2
__

_
w
t
n
n
1/2
__
dw
=
1

2
e
t
2
n
2n
_

0
e

w
2
2
_
e

t
n
w
n
1/2
_

_
+

n
w
n
1/2
_
()
_

+ e
t
n
w
n
1/2
_
()
_


n
w
n
1/2
__

_
dw.
Let
a
8
(t
n
,
n
) =

2
n
{(3 + 1)
2
4}
24
+
t
2
n
2
+

n
t
n
2
.
By using Taylor expansion for {( +
n
w/

n) ()}

and {() (

n
w/

n)}

, similar to g
1
(y) and g
2
(y) in Theorem 2.2, we get, for a
(0, 1/2),
IE|
n
|

=
_
2

e
t
2
n
2n
_
()
n
n
1/2
_

_

0
w

e
w
2
/2
_
1 +
a
8
(t
n
,
n
)w
2
n
_
dw
+O(n
2/2+4a
)
=
_
2

()

n
/2

_
+1
2
_
e
t
2
n
/(2n)

n
_
1 +
+1
n
a
4
4(t
n
,
n
)
_
+O(n
2/2+4a
).
Let a 0, and note that
t
n
=
_
( 1)
2
b
0
_
+ O(n
1
),

n
= 1

n
_

2
0
+
1
2
_
+ O(n
2
),
the conclusion to the theorem now follows.
706 Cuirong Ren, Dongchu Sun and Dipak K. Dey
Acknowlegements. Suns research is partially supported by the National
Science Foundation grants DMS-9972598 and SES-0095919, and a grant from
Missouri Department of Conservation. The authors would like to thank the
editor, an associate editor, and an anonymous referee for many constructive
suggestions for revising the paper.
References
Bowman, K.O. and Shenton, L.R. (1988). Properties of Estimators for the Gamma
Distribution. Marcel Dekker, New York.
Chandra, M.J. (2001). Statistical Quality Control. CRC Press, Boca Raton.
Gelman, A., Carlin, J.B., Stern, H.S. and Rubin, D.B. (2003). Bayesian Data
Analysis, 2nd edition, Chapman & Hall, London.
Joseph, V.R. (2004). Quality loss functions for nonnegative variables and their appli-
cations. J. Quality Technology 36, 129-138.
Moorhead, P.R. and Wu, C.F.J. (1998). Cost-driven parameter design. Technometrics
40, 111-119.
Robert, C.P. (1997). The Bayesian choice A Decision-Theoretic Motivation. Springer-
Verlag, New York.
Smith, R. (1997). Predictive inference, rare events and hierarchical models. Technical
report, University of North Carolina, Chapel Hill.
Smith, R.L. (1999). Bayesian and frequentist approaches to parametric predictive in-
ference, (with discussion). Bayesian Statistics 6, J.M. Bernardo, J.O. Berger, A.P.
Dawid and A.F.M. Smith, eds., Oxford University Press, Oxford, 589-612.
Spiring, F.A. and Yeung, A.S. (1998). A general class of loss functions with industrial
applications. J. Quality Technology 30, 152-162.
Varian, H.R. (1975). A Bayesian approach to real estate assessment. Studies in
Bayesian Econometrics and Statistics in Honor of Leonard J. Savage, S.E. Fienberg
and A. Zellner, eds., North-Holland, Amsterdam, 195-208.
Zellner, A. (1986). Bayesian estimation and prediction using asymmetric loss function.
J. Amer. Statist. Assoc., 81, 446-451.
Cuirong Ren
Department of Plant Science
South Dakota State University
Box 2207A
Brookings, SD 57007, USA
E-mail: cuirong ren@sdstate.edu
Dongchu Sun
Department of Statistics
University of Missouri
Columbia, MO 65211, USA
E-mail: dsun@stat.missouri.edu
Dipak K. Dey
Department of Statistics
University of Connecticut
Storrs, CT 06269, USA
E-mail: dey@snet.net
Paper received: September 2003; revised October 2004.

You might also like