
Problem set 5

Numerical Methods for EOR

07/03/2023

Week 5-Simulation
This week we focus on simulation, and we continue to use the data of the assignment.

Reading material
Read chapter 20 of Jones et al.

Problem 0
First, note that if Y follows a translated Gamma distribution, i.e. Y = X + c with X Gamma distributed, we have
$$F_Y(y) = \Pr(Y \le y) = \Pr(X + c \le y) = \Pr(X \le y - c) = F_X(y - c),$$
with $F_X(\cdot)$ the distribution function of a Gamma distribution with parameters $\alpha$ and $\beta$. The probability of
finding an observation in the $i$-th row of the table (with lower bound $l_i$ and upper bound $u_i$) is then
$$\Pr(l_i < Y \le u_i) = F_X(u_i - c; \alpha, \beta) - F_X(l_i - c; \alpha, \beta).$$
Note that for the last row we have
$$\Pr(Y > l_{10}) = 1 - \Pr(Y \le l_{10}) = 1 - F_X(l_{10} - c; \alpha, \beta).$$
As a consequence, the likelihood function takes the standard multinomial form
$$L(\alpha, \beta, c) = \prod_i \left( F_X(u_i - c; \alpha, \beta) - F_X(l_i - c; \alpha, \beta) \right)^{n_i},$$

with $n_i$ the number of observations in row $i$. The loglikelihood to be optimized is then
$$\ell(\alpha, \beta, c) = \sum_i n_i \log\left( F_X(u_i - c; \alpha, \beta) - F_X(l_i - c; \alpha, \beta) \right).$$

We implement this below.


table1 <- cbind(c(0,2.5,7.5,12.5,17.5,22.5,32.5,47.5,67.5,87.5),
                c(2.5,7.5,12.5,17.5,22.5,32.5,47.5,67.5,87.5,Inf),
                c(41,48,24,18,15,14,16,12,6,23))

# p = (alpha, beta, c); d = table with lower bounds, upper bounds and counts
loglik <- function(p,d){
  upper <- d[,2]
  lower <- d[,1]
  n     <- d[,3]
  # for the open-ended last interval, F_X(Inf) is simply 1
  ll <- n*log(ifelse(upper<Inf,pgamma(upper-p[3],p[1],p[2]),1)-
              pgamma(lower-p[3],p[1],p[2]))
  sum( ll )
}

We need decent starting values. The minimum of the domain of a Gamma distribution is 0, so we take that
as the starting value for c. Then, we take a very rough approach: suppose all observations in a row lie at the
center of that interval. Then it is easy to estimate the mean and variance of the resulting pseudo-data, and to
obtain starting values for $\alpha$ and $\beta$ from $E[X] = \alpha/\beta$ and $\mathrm{var}(X) = \alpha/\beta^2$.
# pseudo-data: put every observation at its interval midpoint
# (for the open-ended last interval we use its lower bound)
interval.center <- c((table1[1:9,1]+table1[1:9,2])/2,table1[10,1])
pseudo.data <- rep(interval.center,table1[,3])
mean.p.d <- mean(pseudo.data)
var.p.d  <- var(pseudo.data)
# method-of-moments starting values: beta = mean/var, alpha = beta*mean
beta0  <- mean.p.d/var.p.d
alpha0 <- beta0*mean.p.d

p0 <- c(alpha=alpha0,beta=beta0,c=0)
m <- optim(p0,loglik,control=list(fnscale=-1),
d=table1,hessian=T)
print(m)

## $par
## alpha beta c
## 0.36449625 0.01257362 1.88088830
##
## $value
## [1] -468.4725
##
## $counts
## function gradient
## 154 NA
##
## $convergence
## [1] 0
##
## $message
## NULL
##
## $hessian
## alpha beta c
## alpha -1531.2802 16670.3179 -107.19179
## beta 16670.3179 -404466.4366 204.15174
## c -107.1918 204.1517 -21.66854
rbind(par=m$par,
      se=sqrt(diag(solve(-m$hessian))))

## alpha beta c
## par 0.36449625 0.012573623 1.8808883
## se 0.05052261 0.002519416 0.3161037
The optimizer has converged, and the estimate for c seems reasonable. Note that it differs significantly from 0,
so in this case, a translated Gamma distribution should provide a better fit than just a Gamma distribution.
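As a quick check of that claim, one can compute the Wald statistic for c from the stored optim output (a minimal sketch, reusing the object m and its Hessian from above):

# Wald statistic: estimate divided by its standard error
z.c <- m$par["c"]/sqrt(diag(solve(-m$hessian)))["c"]
z.c   # about 6, well above the usual 1.96 cutoff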

Problem 1
In problem 4 of the assignment, you were asked to estimate the parameters of a translated Gamma distribution
based on grouped data. Use a simulated data set to check your likelihood function as follows. First, generate
data according to the Gamma distribution that you estimated in the first part of problem 4 of the assignment.
Use the same lower and upper bounds as in Table 1 of the assignment (hint: to create the groups, use
ggplot2::cut_width). Generate 400 random numbers and assign them to the ten groups. Use this
simulated table to estimate the parameters of your Gamma distribution. If you increased the
sample size to 40000, would you expect the standard deviations of your estimated parameters to decrease by
a factor of 10? Assess this, and explain your result.
table2.10 <- cbind(lower = c(0,2.5,7.5,12.5,17.5,22.5,32.5,47.5,67.5,87.5),
                   upper = c(2.5,7.5,12.5,17.5,22.5,32.5,47.5,67.5,87.5,Inf),
                   freq  = c(41,48,24,18,15,14,16,12,6,23))

loglik <- function(p,d){
  upper <- d[,2]
  lower <- d[,1]
  n     <- d[,3]
  ll <- n*log(ifelse(upper<Inf,pgamma(upper-p[3],p[1],p[2]),1)-
              pgamma(lower-p[3],p[1],p[2]))
  sum( ll )
}

p0 <- c(alpha=0.47,beta=0.014,c=0)
m <- optim(p0,loglik,hessian=T,control=list(fnscale=-1),
d=table2.10)

## Warning in pgamma(upper - p[3], p[1], p[2]): NaNs produced


## Warning in pgamma(lower - p[3], p[1], p[2]): NaNs produced
## Warning in pgamma(upper - p[3], p[1], p[2]): NaNs produced
## Warning in pgamma(lower - p[3], p[1], p[2]): NaNs produced
m

## $par
## alpha beta c
## 0.36448483 0.01256196 1.88139535
##
## $value
## [1] -468.4725
##
## $counts
## function gradient
## 124 NA
##
## $convergence
## [1] 0
##
## $message
## NULL
##
## $hessian
## alpha beta c
## alpha -1531.3511 16685.7389 -107.26790
## beta 16685.7389 -405229.6073 204.16466
## c -107.2679 204.1647 -21.70944

Now we simulate 400 observations from the estimated Gamma distribution, and bin them in the same type of
table:
x400 <- m$par[3]+rgamma(400,m$par[1],m$par[2])

# use cut to bin


x.table <- cut(x400,breaks=c(0,table2.10[,"upper"]))
table(x.table)

## x.table
## (0,2.5] (2.5,7.5] (7.5,12.5] (12.5,17.5] (17.5,22.5] (22.5,32.5]
## 90 94 31 34 27 23
## (32.5,47.5] (47.5,67.5] (67.5,87.5] (87.5,Inf]
## 24 30 14 33
replication.table2.10 <- table2.10
replication.table2.10[,"freq"] <- table(x.table)
m2 <- optim(p0,loglik,hessian=T,control=list(fnscale=-1),
d=replication.table2.10)

## Warning in pgamma(upper - p[3], p[1], p[2]): NaNs produced


## Warning in pgamma(lower - p[3], p[1], p[2]): NaNs produced
## Warning in pgamma(upper - p[3], p[1], p[2]): NaNs produced
## Warning in pgamma(lower - p[3], p[1], p[2]): NaNs produced
x40000 <- m$par[3]+rgamma(40000,m$par[1],m$par[2])
x.table <- cut(x40000,breaks=c(0,table2.10[,"upper"]))
table(x.table)

## x.table
## (0,2.5] (2.5,7.5] (7.5,12.5] (12.5,17.5] (17.5,22.5] (22.5,32.5]
## 7743 9067 4043 2788 2107 3054
## (32.5,47.5] (47.5,67.5] (67.5,87.5] (87.5,Inf]
## 3126 2590 1575 3907
replication.table2.10 <- table2.10
replication.table2.10[,"freq"] <- table(x.table)
m3 <- optim(p0,loglik,hessian=T,control=list(fnscale=-1),
d=replication.table2.10)

## Warning in pgamma(upper - p[3], p[1], p[2]): NaNs produced

## Warning in pgamma(upper - p[3], p[1], p[2]): NaNs produced


## Warning in pgamma(upper - p[3], p[1], p[2]): NaNs produced
## Warning in pgamma(lower - p[3], p[1], p[2]): NaNs produced
cbind(original=m$par,sd.original=sqrt(diag(solve(-m$hessian))),
n400=m2$par,sd.n400=sqrt(diag(solve(-m2$hessian))),
n40000=m3$par,sd.n40000=sqrt(diag(solve(-m3$hessian))))

## original sd.original n400 sd.n400 n40000 sd.n40000
## alpha 0.36448483 0.050510339 0.33504832 0.036188115 0.36233994 0.0037501113
## beta 0.01256196 0.002516809 0.01284136 0.001931353 0.01242269 0.0001824151
## c 1.88139535 0.315744218 1.85136595 0.250673632 1.86751787 0.0241354584

Going from n = 400 to n = 40000 multiplies the number of observations by 100, so under standard maximum
likelihood asymptotics the standard errors should shrink by roughly a factor of sqrt(100) = 10. The columns
sd.n400 and sd.n40000 above are indeed about a factor 10 apart for all three parameters.
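As a quick numerical check (a minimal sketch, reusing the fitted objects m2 and m3 from above), one can look at the ratio of the two sets of standard errors directly:

# ratio of standard errors: n = 400 fit versus n = 40000 fit
sqrt(diag(solve(-m2$hessian)))/sqrt(diag(solve(-m3$hessian)))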

Problem 2
Suppose $X_1, \dots, X_n \sim \text{Exp}(\lambda)$, i.i.d. Then we know that
$$\sqrt{n}(\hat\lambda - \lambda_0) \overset{\text{asy}}{\sim} N\left(0, I(\lambda_0)^{-1}\right),$$

with $\hat\lambda$ the maximum likelihood estimator of $\lambda$, $\lambda_0$ the true value, and
$$I(\lambda) = -E\left[\frac{\partial^2 \ell_i}{\partial \lambda^2}\right].$$
In the case of the exponential distribution we have
$$\ell(\lambda; x_i) = \log \lambda - \lambda x_i,$$
so $\partial^2 \ell_i / \partial \lambda^2 = -1/\lambda^2$ and
$$-E\left[\frac{\partial^2 \ell_i}{\partial \lambda^2}\right] = -\frac{\partial^2 \ell_i}{\partial \lambda^2} = \frac{1}{\lambda^2}.$$
Summarizing, for the exponential distribution, we have
$$\sqrt{n}(\hat\lambda - \lambda_0) \overset{\text{asy}}{\sim} N\left(0, \lambda_0^2\right).$$

Take λ0 = 1 and show that this relation holds approximately for n = 10, n = 1000, and n = 10000. Use a
qq-plot to assess (approximate) normality (hint: ggplot2::stat_qq).

Solution
We start with n = 10 and generate B samples, from each of which we calculate the ML estimator
$\hat\lambda = 1/\bar{x}$. The B standardized values $\sqrt{n}(\hat\lambda - 1)$ should then be approximately standard normal.
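For completeness, this closed form follows from setting the derivative of the log-likelihood $\sum_i \ell(\lambda; x_i)$ to zero:
$$\frac{\partial}{\partial \lambda} \sum_{i=1}^n \left(\log \lambda - \lambda x_i\right) = \frac{n}{\lambda} - \sum_{i=1}^n x_i = 0 \quad\Longrightarrow\quad \hat\lambda = \frac{n}{\sum_{i=1}^n x_i} = \frac{1}{\bar{x}}.$$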
set.seed(123456)
B <- 100000 # number of replications
n <- 10
l.hat10 <- rep(NA,B)
for (b in 1:B){
x.b <- rexp(n,rate=1)
l.hat10[b] <- 1/mean(x.b)
}
z10 <- sqrt(n)*(l.hat10-1) # should be approximately normal

qq10 <- data.frame(lambda.hat=z10,n=10)

n <- 100
l.hat100 <- rep(NA,B)
for (b in 1:B){
x.b <- rexp(n,rate=1)
l.hat100[b] <- 1/mean(x.b)
}
z100 <- sqrt(n)*(l.hat100-1) # should be approximately normal

qq100 <- data.frame(lambda.hat=z100,n=100)

n <- 1000
l.hat1000 <- rep(NA,B)

for (b in 1:B){
x.b <- rexp(n,rate=1)
l.hat1000[b] <- 1/mean(x.b)
}
z1000 <- sqrt(n)*(l.hat1000-1) # should be approximately normal

qq1000 <- data.frame(lambda.hat=z1000,n=1000)


library(dplyr)    # for bind_rows()
library(ggplot2)

qq <- bind_rows(qq10,qq100,qq1000)
ggplot(qq,aes(sample=lambda.hat)) + stat_qq() +
  facet_wrap(~n,ncol=2) + geom_abline(intercept=0,slope=1,col="grey")

[Figure: normal QQ-plots of the standardized estimates, one panel per n (10, 100, 1000); theoretical quantiles on the horizontal axis, sample quantiles on the vertical axis, with the 45-degree reference line.]
Clearly, the asymptotic distribution is not reasonable at all for n = 10. As the number of observations increases,
the normal approximation improves. It is perhaps more instructive to look at the densities of the standardized
estimates for the different sample sizes.
ggplot(qq) + geom_density(aes(x=lambda.hat)) + facet_wrap(~n) +
stat_function(fun=dnorm,args=list(mean=0,sd=1),col="red")

[Figure: kernel density estimates of the standardized estimates, one panel per n (10, 100, 1000), with the standard normal density overlaid in red; lambda.hat on the horizontal axis, density on the vertical axis.]

Problem 3
We continue with the exponential distribution with λ = 1, but now we look at the maximum
$$M_n = \max(X_1, \dots, X_n),$$
with $X_i$ i.i.d. Exp(λ).


1. Show that $\lim_{n\to\infty} \Pr(M_n - \log n \le x) = \exp(-\exp(-x))$ (so the asymptotic distribution of the
maximum follows a generalized extreme value distribution, and not a normal distribution).
2. Show by means of a qq-plot that
$$\Pr(M_n - \log n \le x) \approx \exp(-\exp(-x))$$
for fixed n. Also compare the simulated distribution to a normal distribution.


First, we derive the asymptotic distribution of the maximum. Since the $X_i$ are independent,
$$\Pr(M_n - \log n \le x) = \Pr(M_n \le x + \log n) = \Pr(X_1 \le x + \log n, \dots, X_n \le x + \log n)$$
$$= \Pr(X_1 \le x + \log n)^n = \left(1 - \exp(-x - \log n)\right)^n = \left(1 - \frac{1}{n}\exp(-x)\right)^n.$$

A standard limit in analysis is
$$\lim_{y\to\infty}\left(1 - \frac{z}{y}\right)^y = \exp(-z),$$
so we have
$$\lim_{n\to\infty} \Pr(M_n - \log n \le x) = \lim_{n\to\infty}\left(1 - \frac{1}{n}\exp(-x)\right)^n = \exp(-\exp(-x)).$$

This is a special case of the so-called Generalized Extreme Value distribution, with shape parameter ξ = 0.
This distribution function is also known as the Gumbel distribution.
We use the same setup as above to see whether the small-sample distribution of the maximum is well
approximated by the limit distribution.
pgumbel <- function(x){ exp(-exp(-x)) }  # Gumbel distribution function

set.seed(123456)
B <- 10000 # number of replications
n <- 10
m10 <- rep(NA,B)
for (b in 1:B){
x.b <- rexp(n,rate=1)
m10[b] <- max(x.b)
}
z10 <- m10-log(10) # should be approximately Gumbel

qq10 <- data.frame(centered.max=z10,n=10)

n <- 100
m100 <- rep(NA,B)
for (b in 1:B){
x.b <- rexp(n,rate=1)
m100[b] <- max(x.b)
}
z100 <- m100-log(100) # should be approximately Gumbel

qq100 <- data.frame(centered.max=z100,n=100)

n <- 1000
m1000 <- rep(NA,B)
for (b in 1:B){
x.b <- rexp(n,rate=1)
m1000[b] <- max(x.b)
}
z1000 <- m1000-log(1000) # should be approximately Gumbel
qq1000 <- data.frame(centered.max=z1000,n=1000)

n <- 10000
m10000 <- rep(NA,B)
for (b in 1:B){
x.b <- rexp(n,rate=1)
m10000[b] <- max(x.b)
}
z10000 <- m10000-log(10000) # should be approximately Gumbel

qq10000 <- data.frame(centered.max=z10000,n=10000)


qq <- bind_rows(qq10,qq100,qq1000,qq10000)
ggplot(qq,aes(sample=centered.max)) + stat_qq() +
facet_wrap(~n,ncol=2) + geom_abline(intercept=0,slope=1,col="grey")

[Figure: normal QQ-plots of the centered maxima, one panel per n (10, 100, 1000, 10000); theoretical quantiles on the horizontal axis, sample quantiles on the vertical axis, with the 45-degree reference line.]
Clearly, the distribution of $M_n - \log n$ is not well approximated by a standard normal distribution. However,
we derived above that it should be well approximated by a Gumbel distribution. To make the corresponding
qq-plot we need the quantiles of the Gumbel distribution: solving $p = \exp(-\exp(-x))$ for $x$ gives
$$x = -\log(-\log p)$$
as the inverse of the distribution function.


qgumbel <- function(p){
  p <- ifelse(p>0,p,1e-8)   # avoid p = 0, which would give -Inf
  -log(-log(p))
}

ggplot(qq,aes(sample=centered.max)) + stat_qq(distribution = qgumbel) +
  facet_wrap(~n,ncol=2) + geom_abline(intercept=0,slope=1,col="grey")

[Figure: Gumbel QQ-plots of the centered maxima, one panel per n (10, 100, 1000, 10000); theoretical Gumbel quantiles on the horizontal axis, sample quantiles on the vertical axis, with the 45-degree reference line.]
One could argue that the asymptotic approximation is better in this case than for the maximum likelihood
estimator above.
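One way to make this comparison concrete is to measure how far each simulated distribution is from its limit, for example via the largest distance between the empirical and the limiting distribution function. Below is a minimal sketch of such a check at n = 1000; it assumes the objects l.hat1000 (Problem 2) and m1000 (Problem 3), as well as the pgumbel function above, are still in the workspace.

# rough comparison at n = 1000: maximum absolute distance between the
# empirical CDF (evaluated at the simulated values) and the limiting CDF
z.mle <- sqrt(1000)*(l.hat1000 - 1)   # standardized ML estimates, limit N(0,1)
z.max <- m1000 - log(1000)            # centered maxima, limit Gumbel
max(abs(ecdf(z.mle)(sort(z.mle)) - pnorm(sort(z.mle))))
max(abs(ecdf(z.max)(sort(z.max)) - pgumbel(sort(z.max))))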

