Mra HW2
Presented by:
Edmond Sacla Aide (2159278)
Melvin Estolano (2159122)
Fan Huang (2263527)
Lecturer:
Prof. Dr. Marc Aerts
Prof. Dr. Christel Faes
Microbial Risk Assessment (MRA) - Project 2 (2022–2023)
1 Introduction
This project builds on a study that investigated the health risk posed by the presence of tetrahydrocannabinol (THC) in milk and other food of animal origin. Here, we focus on the repeated-dose oral toxicity study in rats (the subchronic NTP rat study of 1996), and specifically on the endpoint oestrus cycle length.
The oestrus cycle length is the time from the onset of one oestrus to the onset of the next. In several mammalian species the average oestrus cycle lasts about 21 days. Rats, in contrast, are polyoestrous animals with rapid cycles of typically 4 to 5 days, and this short cycle length makes them an ideal animal for investigating changes that occur during the reproductive cycle. The female reproductive cycle begins with the onset of proestrus, which lasts for about four to six days. This phase is characterized by a bloody discharge from the vagina and increased sexual activity. Oestrus, or "standing heat", lasts for about one to two days and occurs in conjunction with ovulation. Oestrus is associated with a vaginal discharge that contains pheromones and can be detected by males as a signal that the female is sexually receptive (Marcondes et al. [2002]).
In this project, four candidate models (Exponential, Inverse exponential, Hill, and Logistic) were used to explore the dose-response relationship between dose level and oestrus cycle length and to estimate the benchmark dose (BMD). A model-averaging approach was also investigated.
2 Methodology
2.1 Description of the data
Table 1 summarizes the data used in this project: the oestrus cycle length observed in the rat subchronic NTP study (1996). It contains, for each dose level, the number of animals randomized to that dose, the mean oestrus cycle length (in days), and its standard error (in days).
Dose (mg/kg b.w. per day)   Number of animals   Mean oestrus cycle length (days)   Standard error (days)
0                           7                   4.57                               0.20
5                           9                   5.00                               0.24
15                          7                   5.57                               0.20
50                          7                   5.71                               0.18
150                         6                   5.33                               0.33
500                         4                   6.00                               0.00
The main idea behind model averaging is that different models have different strengths and weaknesses, so it is preferable to use the information from all of them when making estimations. Model averaging can be seen as a step towards incorporating all available knowledge into an estimation problem: every candidate model fitted to the collected data contributes to the final estimate. The model-averaged BMD is a weighted sum of the four model-specific estimates:
\[
\widehat{BMD} = \sum_{i=1}^{4} w_i \, \widehat{BMD}_i
\]
The weights $w_i$ are based on the difference in AIC between model $i$ and the model with the lowest AIC, $\Delta_i = AIC_i - AIC_{\min}$. This ensures that the numerator for the best-performing model equals 1 and that the numerators of the remaining models are less than 1. The denominator is a normalizing constant, so that the weights sum to 1. The weights are calculated as follows:
\[
w_i = \frac{\exp\left(-\tfrac{1}{2}\Delta_i\right)}{\sum_{j=1}^{4} \exp\left(-\tfrac{1}{2}\Delta_j\right)}
\]
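As a worked illustration, the weights can be computed directly from the AIC values reported in the results table of this report. The following R sketch is not part of the original analysis code; the variable names (`aic`, `bmd`, `delta`, `w`, `bmd_ma`) are chosen here for illustration, and the inputs are the rounded values from the table.

```r
# AIC values and BMD estimates (rescaled dose units) as reported in the results table
aic <- c(EX = 79.48784, IN = 79.72795, HL = 79.87877, LG = 79.43064)
bmd <- c(EX = 0.00106, IN = 0.00332, HL = 0.00118, LG = 0.00095)

# AIC differences relative to the best (lowest-AIC) model
delta <- aic - min(aic)

# Akaike weights: the numerator is 1 for the best model and the weights sum to 1
w <- exp(-delta / 2) / sum(exp(-delta / 2))
round(w, 2)        # 0.27 0.24 0.22 0.28

# Model-averaged BMD as the weighted sum of the model-specific estimates
bmd_ma <- sum(w * bmd)
round(bmd_ma, 5)   # 0.00159
```

The rounded weights and the averaged estimate agree with the values in the model-averaging table.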
In estimating the variance of the model-averaged estimate, we consider two variance components. The first is the model-specific variance estimate, conditional on the model, which reflects the within-model variance. The second is the between-model variance, which reflects the variation of the estimates across the models. The two components are summed for each model and then combined using the model weights to obtain the final variance estimate:
\[
\widehat{var}(\widehat{BMD}) = \left[ \sum_{i=1}^{4} w_i \sqrt{ \widehat{var}(\widehat{BMD}_i) + \left(\widehat{BMD}_i - \widehat{BMD}\right)^2 } \right]^2
\]
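The variance formula can be sketched in a few lines of R. This is an illustration only, assuming the rounded estimates, standard errors and AIC values from the results table as inputs; the names `se_ma` and `var_ma` are introduced here and do not appear in the original code.

```r
# Inputs: rounded values from the results table (assumed here for illustration)
aic <- c(79.48784, 79.72795, 79.87877, 79.43064)
bmd <- c(0.00106, 0.00332, 0.00118, 0.00095)   # model-specific BMD estimates
se  <- c(0.00154, 0.00207, 0.00110, 0.00164)   # model-specific standard errors

# Akaike weights and model-averaged estimate
w      <- exp(-(aic - min(aic)) / 2)
w      <- w / sum(w)
bmd_ma <- sum(w * bmd)

# Within-model variance (se^2) plus between-model variance ((bmd - bmd_ma)^2),
# combined on the standard-deviation scale and weighted, then squared
se_ma  <- sum(w * sqrt(se^2 + (bmd - bmd_ma)^2))
var_ma <- se_ma^2
```

Because the inputs are rounded table values, `se_ma` need not reproduce the reported model-averaged standard error exactly.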
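Wald type:
\[
\widehat{BMD} \pm 1.645 \times se(\widehat{BMD})
\]
Asymptotic normality:
\[
\widehat{BMD} \pm 1.645 \times \frac{1}{\sqrt{n\, I(\widehat{BMD})}}
\]
Here $I(\widehat{BMD})$ is the Fisher information, which is equivalent to
\[
I(BMD) = -E\left[ \frac{d^2}{dBMD^2} \log f(x \mid BMD) \right].
\]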
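Both interval types can be illustrated with the exponential-model estimate from the results table. The sketch below assumes, as the appendix code does, that the Fisher information is approximated by 1 / se^2 and that n is the total number of animals; the variable names are chosen here for illustration.

```r
est   <- 0.00106                      # BMD estimate, exponential model (rescaled units)
se    <- 0.00154                      # its standard error
n_tot <- sum(c(7, 9, 7, 7, 6, 4))     # total number of animals
z     <- qnorm(0.95)                  # the 1.645 multiplier

# Wald-type interval
wald <- est + c(-1, 1) * z * se

# Asymptotic-normality interval, with I(BMD) approximated by 1 / se^2
# (an assumption mirroring the appendix code)
info  <- 1 / se^2
asymp <- est + c(-1, 1) * z / sqrt(n_tot * info)

# Back-transform the bounds to mg/kg b.w. per day
asymp_original_scale <- asymp * 500
```

With these inputs the asymptotic interval rounds to (0.00066, 0.00146), the values tabulated for the exponential model.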
The background response (a) equals the mean of the endpoint of interest at the minimum dose level, whereas the maximum response (c) equals the mean of the endpoint at the maximum dose level divided by the background response. Therefore, we do not need to search for starting values of a and c when computing the maximum likelihood estimates, as they are determined by the data.
Although the potency parameter (b) controls when a curve reaches its maximum and the steepness parameter (d) controls how steep the curve is, it is hard to find good starting values for them simply by inspecting a plot of the fitted curves against the data. Consequently, we used a simple grid search to find, for each candidate model, the starting values that lead to the lowest AIC, as listed below.
• Exponential: b = 5, d = 0.1
• Inverse exponential: b = 0.8, d = 1.2
• Hill: b = 0.2, d = 2
• Logistic: b = 5, d = 0.6
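The grid search can be sketched as follows for the Hill model. This is a minimal illustration using base R `optim` rather than the `mle` call of the appendix; the grid values, the function name `negloglik_hill` and the penalty value `1e10` are assumptions made here, not the settings used in the analysis.

```r
# Summary data from Table 1 (dose rescaled by 500, as in the appendix)
dose_res <- c(0, 5, 15, 50, 150, 500) / 500
n_i <- c(7, 9, 7, 7, 6, 4)
m_i <- c(4.57, 5.00, 5.57, 5.71, 5.33, 6.00)
s_i <- c(0.20, 0.24, 0.20, 0.18, 0.33, 0.00) * sqrt(n_i)

# Negative log-likelihood of the Hill model based on summary statistics
negloglik_hill <- function(par) {
  a <- par[1]; b <- par[2]; cc <- par[3]; d <- par[4]; sigma <- par[5]
  if (sigma <= 0 || b <= 0 || d <= 0) return(1e10)   # keep the search in a valid region
  mu <- a * (1 + (cc - 1) * dose_res^d / (b + dose_res^d))
  ll <- sum(-n_i * log(2 * pi) / 2 - n_i * log(sigma) -
            (n_i - 1) * s_i^2 / (2 * sigma^2) -
            n_i * (m_i - mu)^2 / (2 * sigma^2))
  if (!is.finite(ll)) return(1e10)
  -ll
}

# Hypothetical grid of starting values for b and d; a and c start from the data
grid <- expand.grid(b = c(0.2, 1, 5), d = c(0.5, 1, 2))
grid$aic <- NA_real_
for (k in seq_len(nrow(grid))) {
  start <- c(m_i[1], grid$b[k], m_i[6] / m_i[1], grid$d[k], 1)
  fit <- tryCatch(optim(start, negloglik_hill), error = function(e) NULL)
  # AIC = 2 * negative log-likelihood + 2 * number of parameters
  if (!is.null(fit)) grid$aic[k] <- 2 * fit$value + 2 * length(start)
}

# Starting values that led to the lowest AIC
best <- grid[which.min(grid$aic), ]
```

The same loop can be repeated for the other candidate models by swapping the mean function inside the likelihood.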
3 Results
3.1 Maximum likelihood estimation based on summary statistics
ML inference for normally distributed data based on the sample mean and variance is equivalent to ML inference based on the individual data because, first, the observations come from a normal distribution and, second, the sample mean and sample variance are sufficient statistics for the normal distribution.
\[
y \mid x \sim N(\mu(x), \sigma^2)
\]
The density of $y \mid x$ is given by:
\[
f(y_{ij}) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{1}{2\sigma^2} \left(y_{ij} - \mu(x_i)\right)^2 \right)
\]
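The likelihood is given by:
\[
L(\mu(x), \sigma^2) = \prod_{i=1}^{n} \prod_{j=1}^{n_i} f(y_{ij}) = \prod_{i=1}^{n} \prod_{j=1}^{n_i} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{1}{2\sigma^2} \left(y_{ij} - \mu(x_i)\right)^2 \right)
\]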
The expression of the log-likelihood function in terms of the (dose-specific) sample mean and
variance is given below:
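\[
\begin{aligned}
\log L(\mu(x), \sigma^2) &= \sum_{i=1}^{n} \sum_{j=1}^{n_i} \ln f(y_{ij}) = \sum_{i=1}^{n} \left( -\frac{n_i}{2}\ln(2\pi) - n_i \ln(\sigma) - \frac{1}{2\sigma^2} \sum_{j=1}^{n_i} \left(y_{ij} - \mu(x_i)\right)^2 \right) \\
&= \sum_{i=1}^{n} \left( -\frac{n_i}{2}\ln(2\pi) - n_i \ln(\sigma) - \frac{1}{2\sigma^2} \left[ \sum_{j=1}^{n_i} (y_{ij} - m_i)^2 + \sum_{j=1}^{n_i} \left(m_i - \mu(x_i)\right)^2 \right] + 0 \right) \\
&= \sum_{i=1}^{n} \left( -\frac{n_i}{2}\ln(2\pi) - n_i \ln(\sigma) - \frac{(n_i - 1) S_i^2}{2\sigma^2} - \frac{n_i}{2\sigma^2} \left(m_i - \mu(x_i)\right)^2 \right), \quad \text{with } S_i = SE_i \sqrt{n_i} \\
\Rightarrow \log L(a, b, c, d, \sigma^2) &= \sum_{i=1}^{n} \left( -\frac{n_i}{2}\ln(2\pi) - n_i \ln(\sigma) - \frac{(n_i - 1) S_i^2}{2\sigma^2} - \frac{n_i}{2\sigma^2} \left(m_i - \mu(x_i)\right)^2 \right)
\end{aligned}
\tag{1}
\]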
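The equivalence between the individual-data and the summary-statistic log-likelihood can be checked numerically. The sketch below uses simulated observations for a single dose group (hypothetical data, not the study data) and arbitrary candidate values for the mean and standard deviation.

```r
set.seed(1)
y     <- rnorm(8, mean = 5, sd = 0.6)   # hypothetical individual observations, one dose group
mu    <- 5.2                            # arbitrary candidate value for mu(x_i)
sigma <- 0.7                            # arbitrary candidate value for sigma

# Log-likelihood from the individual observations
ll_individual <- sum(dnorm(y, mean = mu, sd = sigma, log = TRUE))

# Log-likelihood from the sample size, sample mean and sample variance only
n  <- length(y)
m  <- mean(y)
S2 <- var(y)
ll_summary <- -n * log(2 * pi) / 2 - n * log(sigma) -
  (n - 1) * S2 / (2 * sigma^2) - n * (m - mu)^2 / (2 * sigma^2)

all.equal(ll_individual, ll_summary)
```

The two values agree up to floating-point error, which is why the group means and standard errors of Table 1 are sufficient for fitting the models.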
Figure 1 shows the fitted curves for each model. All models fit the data about equally well, as their AIC values are very similar (see Table 3). If a single best model must be chosen, the logistic model is preferred because it has the lowest AIC.
\[
\mu^*(BMD) = \frac{\mu(BMD) - \mu(0)}{\mu(0)} = q \;\Rightarrow\; \mu(BMD) = \mu(0)(q + 1) \tag{2}
\]
Exponential
Inverse exponential
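The derivations for these two models follow the same steps as the Hill and Logistic derivations in this section. The sketch below assumes the mean functions used in the appendix code, $\mu(x) = a[1 + (c-1)(1 - e^{-b x^{d}})]$ for the Exponential model and $\mu(x) = a[1 + (c-1)\,e^{-b x^{-d}}]$ for the Inverse exponential model. The resulting expressions agree with the BMD formulas coded in the appendix, but the intermediate steps are a reconstruction.

```latex
\begin{aligned}
\text{Exponential:}\quad & a\left[1 + (c-1)\left(1 - e^{-b\,BMD^{d}}\right)\right] = a(q+1) \\
\Rightarrow\; & e^{-b\,BMD^{d}} = 1 - \frac{q}{c-1} \\
\Rightarrow\; & BMD = \left[-\frac{1}{b}\ln\left(1 - \frac{q}{c-1}\right)\right]^{1/d} \\[6pt]
\text{Inverse exponential:}\quad & a\left[1 + (c-1)\,e^{-b\,BMD^{-d}}\right] = a(q+1) \\
\Rightarrow\; & -b\,BMD^{-d} = \ln\left(\frac{q}{c-1}\right) \\
\Rightarrow\; & BMD = \left[\frac{-b}{\ln\left(q/(c-1)\right)}\right]^{1/d}
\end{aligned}
```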
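Hill
\[
\begin{aligned}
\mu(BMD) &= a\left[1 + (c-1)\frac{BMD^{d}}{b + BMD^{d}}\right] = a(q+1) \\
&\Rightarrow (c-1)\frac{BMD^{d}}{b + BMD^{d}} = q \\
&\Rightarrow BMD^{d}\,(c - 1 - q) = bq \\
&\Rightarrow BMD = \left[\frac{bq}{c - 1 - q}\right]^{1/d}
\end{aligned}
\tag{5}
\]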
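Logistic
\[
\begin{aligned}
\mu(BMD) &= a\left[1 + (c-1)\frac{\exp(b\,BMD^{d}) - 1}{\exp(b\,BMD^{d}) + 1}\right] = a(q+1) \\
&\Rightarrow (c-1)\frac{\exp(b\,BMD^{d}) - 1}{\exp(b\,BMD^{d}) + 1} = q \\
&\Rightarrow \exp(b\,BMD^{d}) = \frac{(c-1) + q}{(c-1) - q} \\
&\Rightarrow BMD = \left\{\frac{1}{b}\ln\left[\frac{(c-1) + q}{(c-1) - q}\right]\right\}^{1/d}
\end{aligned}
\tag{6}
\]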
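The closed-form expressions for the Hill and Logistic models can be verified by plugging the BMD back into the mean function and checking that the relative change from background equals q. The parameter values below are hypothetical and serve only this check; the function names are introduced here.

```r
# Mean functions in the parametrization used for the Hill and Logistic models
mu_hill <- function(x, a, b, c, d) a * (1 + (c - 1) * x^d / (b + x^d))
mu_logistic <- function(x, a, b, c, d) {
  a * (1 + (c - 1) * (exp(b * x^d) - 1) / (exp(b * x^d) + 1))
}

# Closed-form BMDs for a relative change q in the mean
bmd_hill     <- function(q, b, c, d) (b * q / (c - 1 - q))^(1 / d)
bmd_logistic <- function(q, b, c, d) (log((c - 1 + q) / (c - 1 - q)) / b)^(1 / d)

# Hypothetical parameter values (not the fitted estimates)
x_h <- bmd_hill(q = 0.05, b = 0.5, c = 1.3, d = 0.8)
x_l <- bmd_logistic(q = 0.05, b = 0.5, c = 1.3, d = 0.8)

# Relative change from the background a at the BMD; both should equal q = 0.05
rel_h <- (mu_hill(x_h, a = 4.6, b = 0.5, c = 1.3, d = 0.8) - 4.6) / 4.6
rel_l <- (mu_logistic(x_l, a = 4.6, b = 0.5, c = 1.3, d = 0.8) - 4.6) / 4.6
```

Both formulas require c - 1 > q; otherwise the requested relative change exceeds the maximum response and no BMD exists.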
The BMD estimates are expressed in rescaled units (dose divided by 500), so, for example, the estimate for the exponential model corresponds to 0.00106 × 500 = 0.53 mg/kg b.w. per day. In terms of confidence intervals, the asymptotic normality-based intervals are narrower than the Wald-type intervals. In addition, their lower bounds are not negative, which makes them more appropriate for a dose.
                                           Wald                  Asymptotic
Model                 Estimate   Std.err.  Lower      Upper      Lower     Upper     df   AIC
Exponential           0.00106    0.00154   -0.00148   0.00360    0.00066   0.00146   5    79.48784
Inverse exponential   0.00332    0.00207   -0.00009   0.00673    0.00278   0.00386   5    79.72795
Hill                  0.00118    0.00110   -0.00063   0.00299    0.00089   0.00147   5    79.87877
Logistic              0.00095    0.00164   -0.00175   0.00365    0.00052   0.00138   5    79.43064
Figure 2 shows the fitted curves with the BMD estimates marked by red points. The BMD estimates are all small and differ little from one another (from 0.00095 to 0.00332), which makes them hard to distinguish on the rescaled dose axis.
First, the AIC values are similar across the models, so each model is expected to contribute about equally to the model average. Looking at the model-averaged statistics, the estimate equals 0.00159 × 500 = 0.795 mg/kg b.w. per day, which falls between the smallest and the largest model-specific estimates. Second, the model-averaged standard error is quite large. Lastly, for both the Wald-type and the asymptotic confidence intervals, the model-averaged interval is wider than those of all individual models except the Inverse exponential model, because the model-averaged standard error (0.00190) is exceeded only by that of the Inverse exponential model (0.00207).
                                           Wald                  Asymptotic
Model                 Estimate   Std.err.  Lower      Upper      Lower     Upper     df   AIC        Weight
Exponential           0.00106    0.00154   -0.00148   0.00360    0.00066   0.00146   5    79.48784   0.27
Inverse exponential   0.00332    0.00207   -0.00009   0.00673    0.00278   0.00386   5    79.72795   0.24
Hill                  0.00118    0.00110   -0.00064   0.00299    0.00089   0.00147   5    79.87877   0.22
Logistic              0.00095    0.00164   -0.00175   0.00365    0.00052   0.00138   5    79.43064   0.28
Model-average         0.00159    0.00190   -0.00153   0.00471    0.00110   0.00208   -    -          -
Figure 3 adds the model-averaged curve, which lies between the model-specific curves.
4 Conclusion
In this analysis, four candidate models were used to explore the dose-response relationship between dose (in mg/kg b.w. per day) and the change in oestrus cycle length (in days), to estimate the benchmark dose, and to investigate the effect of model averaging.
The four candidate dose-response models fit the observed data about equally well, with the logistic model performing slightly better according to its AIC value. Model averaging, which combines the estimates of all candidate models weighted by their AIC, gives a BMD estimate of 0.795 mg/kg b.w. per day. For both the Wald-type and the asymptotic-normality confidence intervals, the model-averaged interval is wider than all but that of the Inverse exponential model, probably because of its relatively large standard error.
The BMDL estimates are also given as a reference point for establishing the tolerable daily intake. Regardless of the type of BMDL, it is safer to adopt the BMDL of the Logistic model, because it gives the lowest dose level that can be taken in without showing an adverse effect. The model-averaged BMDL, which combines the characteristics of all candidate models, is higher; that is, it implies that a larger dose is needed before an adverse effect appears.
References
FK Marcondes, FJ Bianchi, and AP Tanno. Determination of the estrous cycle phases of rats:
some helpful considerations. Brazilian Journal of Biology, 62:609–614, 2002.
Kim Z Travis, Ian Pate, and Zoe K Welsh. The role of the benchmark dose in a regulatory context.
Regulatory Toxicology and Pharmacology, 43(3):280–291, 2005.
5 Appendix
R code
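#--------- Libraries
library(stats)
library(stats4)      # mle()
library(AICcmodavg)  # modavgCustom()
library(msm)
library(bbmle)
library(locfit)
#-------------- Dataset
dose <- c(0, 5, 15, 50, 150, 500)                      # mg/kg b.w. per day
dose_res <- dose / 500                                 # dose rescaled to [0, 1]
dose_log <- log10(dose + 1)
obs <- c(7, 9, 7, 7, 6, 4)                             # number of animals per dose
mean_ocl <- c(4.57, 5.00, 5.57, 5.71, 5.33, 6.00)      # mean oestrus cycle length (days)
se_ocl <- c(0.20, 0.24, 0.20, 0.18, 0.33, 0.00)        # standard error (days)
sd_ocl <- se_ocl * sqrt(obs)                           # standard deviation per dose group
df <- data.frame(dose, dose_res, dose_log, obs, mean_ocl, se_ocl, sd_ocl)
attach(df)
# Helper transformations
logit <- function(x) log(x / (1 - x))
expit <- function(x) 1 / (1 + exp(-x))
# Data-based starting values: background (a) and maximum response (c = mean.max / mean.0)
mean.0 <- mean(mean_ocl[dose == min(dose)])
mean.max <- mean(mean_ocl[dose == max(dose)])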
#Hill model
# Negative log-likelihood based on the dose-specific sample means and variances (equation 1)
minuslogliklnHL <- function(a, b, c, d, sigma) {
  nll <- 0
  for (i in seq_along(dose_res)) {
    # Hill mean at dose i
    U <- a * (1 + (c - 1) * (dose_res[i]^d / (b + dose_res[i]^d)))
    nll <- nll - (-obs[i] * log(2 * pi) / 2 -
                    obs[i] * log(sigma) -
                    (obs[i] - 1) * (sd_ocl[i]^2) / (2 * sigma^2) -
                    obs[i] * ((mean_ocl[i] - U)^2) / (2 * sigma^2))
  }
  return(nll)
}
#Logistic model
# Negative log-likelihood based on the dose-specific sample means and variances (equation 1)
minuslogliklnLG <- function(a, b, c, d, sigma) {
  nll <- 0
  for (i in seq_along(dose_res)) {
    # Logistic mean at dose i
    U <- a * (1 + (c - 1) * (exp(b * dose_res[i]^d) - 1) / (exp(b * dose_res[i]^d) + 1))
    nll <- nll - (-obs[i] * log(2 * pi) / 2 -
                    obs[i] * log(sigma) -
                    (obs[i] - 1) * (sd_ocl[i]^2) / (2 * sigma^2) -
                    obs[i] * ((mean_ocl[i] - U)^2) / (2 * sigma^2))
  }
  return(nll)
}
#--------- ML estimation
EXfit<-mle(minuslogliklnEX,start=list(a=mean.0, b=5,
c=mean.max/mean.0,d=0.1,sigma=1), optim = stats::optim)
INfit<-mle(minuslogliklnIN,start=list(a=mean.0, b=0.8,
c=mean.max/mean.0,d=1.2,sigma=1))
HLfit<-mle(minuslogliklnHL,start=list(a=mean.0, b=0.2,
c=mean.max/mean.0,d=2,sigma=1))
LGfit<-mle(minuslogliklnLG,start=list(a=mean.0, b=5,
c=mean.max/mean.0,d=0.6,sigma=1))
# Closed-form BMD per model, in rescaled dose units
# q is the benchmark response (relative change in the mean) and must be defined beforehand
a1=coef(EXfit)[1];b1=coef(EXfit)[2];c1=coef(EXfit)[3];d1=coef(EXfit)[4]
EXBMD<-((-1/b1) *log(1-(q/(c1-1))))^(1/d1)
a2=coef(INfit)[1];b2=coef(INfit)[2];c2=coef(INfit)[3];d2=coef(INfit)[4]
INBMD<-(-b2/log(q/(c2-1)))^(1/d2)
a3=coef(HLfit)[1];b3=coef(HLfit)[2];c3=coef(HLfit)[3];d3=coef(HLfit)[4]
HLBMD<-(q*b3/(c3-q-1))^(1/d3)
a4=coef(LGfit)[1];b4=coef(LGfit)[2];c4=coef(LGfit)[3];d4=coef(LGfit)[4]
LGBMD<- ((1/b4)*log((c4-1+q)/(c4-1-q)))^(1/d4)
# Vector of model-specific BMD estimates
estVEC<-round(c(EXBMD,INBMD,HLBMD,LGBMD),5)
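#--------- Asymptotic CI
# The half-width uses the unconditional (model-averaged) standard error divided by
# sqrt(n), not the estimate itself; this matches the tabulated model-average interval
malc2<- ma$Mod.avg.est-qnorm(0.95)*(1/sqrt(sum(obs)/ma$Uncond.SE^2))
mauc2<-ma$Mod.avg.est+qnorm(0.95)*(1/sqrt(sum(obs)/ma$Uncond.SE^2))
# Back-transform to the original dose scale (mg/kg b.w. per day)
OS_malc2=malc2*500
OS_mauc2=mauc2*500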
#--------- Computing the model averaged estimates for grid of dose levels
### BMD visualization
plot(mean_ocl~dose_res,ylab="Mean oestrus cycle length (in days)",
xlab="Rescaled dose (mg/kg b.w. per day)",pch=19,lwd=2)
lines(dosegrid,EXfc,lwd=2,col="black")
lines(dosegrid,INfc,lwd=2,col="blue")
lines(dosegrid,HLfc,lwd=2,col="green")
lines(dosegrid,LGfc,lwd=2,col="yellow")
abline(v=EXBMD,col="black",lwd=1,lty=3)
abline(v=INBMD,col="blue",lwd=1,lty=3)
abline(v=HLBMD,col="green",lwd=1,lty=3)
abline(v=LGBMD, col="yellow",lwd=1,lty=3)
#--------- Plot for the model average estimate for the BMD
gridsize<-length(dosegrid)
MAfit<-rep(NA,gridsize)
# Quantities that do not depend on the dose level are computed once, outside the loop
LL<-c(logLik(EXfit)[1],logLik(INfit)[1],logLik(HLfit)[1],logLik(LGfit)[1])
# VECTOR WITH NUMBER OF PARAMETERS FOR THE DIFFERENT MODELS
Ks<-AIC(EXfit,INfit,HLfit,LGfit)[,1]
# VECTOR OF NAMES FOR THE DIFFERENT MODELS
Modnames<-c("EXfit","INfit","HLfit","LGfit")
model.se.ests<-c(EXse,INse,HLse,LGse)
for (k in (1:gridsize)){
  d<-dosegrid[k]
  # Model-specific fitted means at dose d
  EXfc<-coef(EXfit)[1]*(1+(coef(EXfit)[3]-1)*(1-exp(-coef(EXfit)[2]*
    d^coef(EXfit)[4])))
  INfc<-coef(INfit)[1]*(1+(coef(INfit)[3]-1)*exp(-coef(INfit)[2]*
    d^(-coef(INfit)[4])))
  HLfc<-coef(HLfit)[1]*(1+(coef(HLfit)[3]-1)*d^coef(HLfit)[4]/(coef(HLfit)[2]+
    d^coef(HLfit)[4]))
  LGfc<-coef(LGfit)[1]*(1+(coef(LGfit)[3]-1)*(exp(coef(LGfit)[2]*
    d^coef(LGfit)[4])-1)/(exp(coef(LGfit)[2]*d^coef(LGfit)[4])+1))
  model.ests<-c(EXfc,INfc,HLfc,LGfc)
  # Model-averaged fitted mean at dose d (AIC-based weights)
  ma<-modavgCustom(logL = LL, K = Ks, modnames = Modnames, estimate = model.ests,
                   se = model.se.ests, nobs = ntot, second.ord = FALSE)
  MAfit[k]<-ma$Mod.avg.est
}
# Model-averaged BMD (rescaled units) and model-averaged curve
abline(v=0.001592278,col="orange",lwd=1,lty=3)
lines(dosegrid,MAfit,col="orange",lwd=2,lty=3)
# Asymptotic CI (qnorm(0.95) = 1.645, the multiplier used in the text and tables)
llci2<- estVEC-qnorm(0.95)*(1/sqrt(sum(obs)/seVEC^2))
rlci2<-estVEC+qnorm(0.95)*(1/sqrt(sum(obs)/seVEC^2))
# Tabulation of results
wald_tab1<-round(cbind(estVEC,seVEC,llci,rlci),5)
asymp_tab1<-round(cbind(estVEC,seVEC,llci2,rlci2),5)
# Asymptotic CI for the model-averaged estimate
llci3<- maBMDL$Mod.avg.est-qnorm(0.95)*(1/sqrt(sum(obs)/maBMDL$Uncond.SE^2))
rlci3<-maBMDL$Mod.avg.est+qnorm(0.95)*(1/sqrt(sum(obs)/maBMDL$Uncond.SE^2))