Article in Communications in Statistics – Simulation and Computation, September 2016.
DOI: 10.1080/03610918.2016.1217011


Revised manuscript

Opportunities of the minimum Anderson-Darling estimator as a variant of the maximum likelihood method

Mathias Raschke, freelancer, Stolze-Schrey-Str. 1, 65195 Wiesbaden, mathiasraschke@t-online.de

Abstract: We reveal that the minimum Anderson-Darling (MAD) estimator is a variant of the maximum likelihood method. Furthermore, it is shown that the MAD estimator offers excellent opportunities for parameter estimation when there is no explicit formulation of the distribution model. The computation time of the MAD estimator with an approximated cumulative distribution function is much shorter than that of the classical maximum likelihood method with an approximated probability density function. Additionally, we examine the performance of the MAD estimator for the generalized Pareto distribution and demonstrate a further advantage of the MAD estimator with a problem from seismic hazard analysis.

Key words: minimum distance estimator, minimum Anderson-Darling estimator, likelihood function, Bernoulli distribution, generalized Pareto distribution, earthquake ground motion relation

1 Introduction

The minimum-distance estimators which apply the empirical distribution function (EDF) are well-known inference methods (e.g. Wolfowitz, 1957; Drossos and Philippou, 1980; Parr and Schucany, 1980; Parr, 1981). The basic idea of these estimators is to minimise a distance d with

$$d[F(x;\hat{\boldsymbol{\theta}}), F_n(x)] = \min_{\boldsymbol{\theta}\in\Theta} d[F(x;\boldsymbol{\theta}), F_n(x)], \qquad (1)$$

wherein θ is the parameter vector of the cumulative distribution function (CDF) F, and F_n is the empirical distribution function of a sample of size n. One variant is the minimum Anderson-Darling (MAD) estimator of Boos (1982), which applies the Anderson-Darling distance A (Anderson and Darling, 1954)

$$A(\boldsymbol{\theta}) = n \int_{-\infty}^{\infty} \frac{\big(F(x;\boldsymbol{\theta}) - F_n(x)\big)^2}{F(x;\boldsymbol{\theta})\big(1 - F(x;\boldsymbol{\theta})\big)}\, dF(x;\boldsymbol{\theta}). \qquad (2)$$

This distance is also frequently applied in goodness-of-fit tests for different distribution types (Stephens, 1986). The corresponding statistic is

$$A(\boldsymbol{\theta}) = -n - \frac{1}{n}\sum_{i=1}^{n} (2i-1)\ln\!\big(F(X_i;\boldsymbol{\theta})\big) + \big(2(n-i)+1\big)\ln\!\big(1 - F(X_i;\boldsymbol{\theta})\big), \qquad (3)$$

wherein X_i is a realisation of the ordered sample. The point estimation by the MAD estimator is

$$A(\hat{\boldsymbol{\theta}}) = \min_{\boldsymbol{\theta}\in\Theta} A(\boldsymbol{\theta}). \qquad (4)$$

The MAD estimator performs very well for many location-scale distributions: according to Boos (1982, Tab. 1), its mean squared error $MSE = E\big((\hat{\theta} - \theta_0)^2\big)$ is frequently only marginally higher than the MSE of the well-known maximum likelihood (ML) method.
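As a concrete illustration of Eqs. (3) and (4), the MAD point estimate can be obtained by direct numerical minimisation of A(θ). The sketch below is in Python (the paper's own computations use VB.net), with a one-parameter exponential model as an arbitrary stand-in:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def ad_statistic(theta, x_sorted, cdf):
    """Anderson-Darling distance A(theta) of Eq. (3) for an ordered sample."""
    n = len(x_sorted)
    F = np.clip(cdf(x_sorted, theta), 1e-12, 1 - 1e-12)  # guard the logarithms
    i = np.arange(1, n + 1)
    return -n - np.mean((2 * i - 1) * np.log(F)
                        + (2 * (n - i) + 1) * np.log(1 - F))

def exp_cdf(x, sigma):
    """Illustrative model: exponential CDF with scale parameter sigma."""
    return 1.0 - np.exp(-x / sigma)

rng = np.random.default_rng(1)
sample = np.sort(rng.exponential(scale=2.0, size=200))

# MAD point estimation, Eq. (4): minimise A over the parameter space
res = minimize_scalar(ad_statistic, bounds=(0.1, 10.0),
                      args=(sample, exp_cdf), method="bounded")
sigma_hat = res.x  # close to the true scale of 2
```

The same pattern carries over to any model for which F can be evaluated (or approximated) for a trial parameter vector.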


There are more types of distance estimators. For example, Basu et al. (2011) deal primarily with estimators which apply the distance between probability density functions (PDFs). The minimum-distance estimators which are based on the EDF are currently not very popular in statistics, and publications are infrequent (e.g., Kozek, 1998). However, there is a certain interest in minimum-distance methods in the actuarial research community (e.g. Clemente et al., 2012; Skřivánková and Juhás, 2012), and Coronel-Brizio and Hernandez-Montoya (2005) use the Anderson-Darling distance for threshold selection in extreme value analysis. Our attention to the MAD estimator arose from a special problem: the estimation of the parameters of a known mechanism which generates a random variable X without explicit formulations for the PDF and CDF of X. Our first idea was to apply an ML estimator with point estimation

$$l(\hat{\boldsymbol{\theta}}) = \max_{\boldsymbol{\theta}\in\Theta} l(\boldsymbol{\theta}), \qquad (5)$$

wherein the logarithmic likelihood function is

$$l(\boldsymbol{\theta}) = \sum_{i=1}^{n} \ln\!\big(f(x_i;\boldsymbol{\theta})\big). \qquad (6)$$

The PDF can be approximated for a fixed parameter vector by a large sample generated by Monte Carlo simulation and Kernel density estimation according to Silverman (1986). However, the computational burden would be high (the same applies to the computation of a PDF by a multidimensional integral), and there is the danger of more than one maximum of l(θ) due to the approximation. The second idea is to apply an estimator which uses the CDF. The latter can be approximated easily by the EDF of an ordered simulated sample X_1 ≤ X_2 ≤ … ≤ X_{n*} of size n* with

$$\hat{F}_{n^*}(x_i) = \frac{i}{n^*+1}. \qquad (7)$$
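Eq. (7) is straightforward to implement; the following Python sketch approximates an unknown CDF by the EDF of a large simulated sample, with linear interpolation between the order statistics as one possible choice:

```python
import numpy as np

def edf_cdf(sim_sample):
    """Approximate an unknown CDF by the EDF of a simulated sample, Eq. (7):
    F(x_i) = i/(n* + 1), interpolated linearly between the order statistics."""
    xs = np.sort(np.asarray(sim_sample))
    probs = np.arange(1, len(xs) + 1) / (len(xs) + 1)
    return lambda x: np.interp(x, xs, probs, left=0.0, right=probs[-1])

rng = np.random.default_rng(0)
F_hat = edf_cdf(rng.exponential(scale=1.0, size=100_000))
# F_hat is now close to the true exponential CDF, e.g. F_hat(ln 2) is near 0.5
```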

The MAD estimator would be suitable for this second approach. Nevertheless, we focus at first on the ML method because it is the most important estimation method with very good asymptotic behaviour. In the following section, we develop a special version of the ML method which applies the CDF; the resulting variant of the ML estimator is equivalent to the MAD estimator. In section 3, we demonstrate a first advantage of the MAD estimator by comparing the computing speed of the MAD estimator with the CDF approximated by the EDF of a large simulated sample to that of the ML method with the PDF approximated by Kernel smoothing. Then we briefly examine the performance of the MAD estimator for the generalized Pareto distribution (GPD) in section 4, as actuarial science is interested in distance estimators and the GPD. Finally, we demonstrate a further advantage of the MAD estimator with an example from earthquake engineering: the estimation of the variance of the individual random component of a ground motion relation.

2 MAD estimator and ML method

The CDF F of a continuous random variable X ∈ ℝ determines an infinite number of Bernoulli distributions. A Bernoulli distribution is a discrete, binary distribution with case B and complementary case \B, parameter p and mass function (cf. Johnson et al., 2005)

$$P(B) = p \quad\text{and} \qquad (8a)$$
$$P(\backslash B) = 1 - p. \qquad (8b)$$

The logarithmic likelihood function for the ML estimation of a Bernoulli distribution is

$$l(p) = n_B \ln(p) + (n - n_B)\ln(1-p), \qquad (9)$$

with sample size n and the number of observations n_B for case B. The corresponding point estimation is

$$\hat{p} = \frac{n_B}{n}. \qquad (10)$$

At each point x on the real line, the case B of a Bernoulli distribution is defined by X ≤ x, and case \B means X > x. The parameter p of Eqs. (8, 9) is therein determined by

$$p = F(x). \qquad (11)$$

Furthermore, the number n_{B(x)} is the number of observations with X ≤ x, and Eq. (9) gives

$$l\big(F(x)\big) = n_{B(x)}\ln\!\big(F(x)\big) + \big(n - n_{B(x)}\big)\ln\!\big(1 - F(x)\big). \qquad (12)$$

We want to estimate the actual parameter vector θ₀ of F, which applies at every point x. This is carried out by mixing (12) with the PDF f of X, and we introduce the Bernoulli likelihood function

$$l_B(\boldsymbol{\theta}) = \int_{-\infty}^{\infty} f(x;\boldsymbol{\theta}_0)\Big(n_{B(x)}\ln\!\big(F(x;\boldsymbol{\theta})\big) + \big(n - n_{B(x)}\big)\ln\!\big(1 - F(x;\boldsymbol{\theta})\big)\Big)\,dx. \qquad (13)$$

Now we replace f(x; θ₀) by the empirical distribution function of the current sample, as we do not know the actual θ₀, and write the Bernoulli likelihood function for the ordered sample X₁ ≤ … ≤ X_i ≤ … ≤ X_n

$$l_B(\boldsymbol{\theta}) = \frac{1}{n}\sum_{i=1}^{n} (i-0.5)\ln\!\big(F(X_i;\boldsymbol{\theta})\big) + (n-i+0.5)\ln\!\big(1 - F(X_i;\boldsymbol{\theta})\big). \qquad (14)$$

The numbers n_{B(x)} and n − n_{B(x)} of observations are therein replaced by i − 0.5 and n − i + 0.5. The value 0.5 is applied because the observation x_i at point x = x_i could be interpreted as x_i ≤ x or x ≥ x_i in the approximation. Now we have the MAD estimator as a special variant of the ML method with

$$l_B(\hat{\boldsymbol{\theta}}) = \max_{\boldsymbol{\theta}\in\Theta} l_B(\boldsymbol{\theta}), \qquad (15)$$

as l_B(θ) of Eq. (14) obviously has its maximum where A(θ) of Eq. (3) has its minimum. This relationship between the MAD estimator and the ML method explains why the asymptotic MSE of the MAD estimator is frequently only a bit higher than the MSE of the ML method according to Boos (1982), even though the issue of the estimation error of the MAD estimator was not discussed by him. Regarding this issue, we only refer to the empirical variant of the well-known Fisher information matrix (see e.g. Coles, 2001; Upton and Cook, 2008). The jackknife method of Efron (1979) could also be applied to estimate the variance of the estimation error.
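The equivalence can also be stated explicitly: comparing Eqs. (3) and (14) term by term gives l_B(θ) = −(A(θ) + n)/2, so the two criteria are affinely related and share the same optimiser. A short numerical check in Python, with an exponential CDF as an arbitrary example:

```python
import numpy as np

def ad_and_bernoulli(theta, x_sorted, cdf):
    """A(theta) of Eq. (3) and l_B(theta) of Eq. (14) for the same ordered sample."""
    n = len(x_sorted)
    F = np.clip(cdf(x_sorted, theta), 1e-12, 1 - 1e-12)
    i = np.arange(1, n + 1)
    A = -n - np.mean((2 * i - 1) * np.log(F) + (2 * (n - i) + 1) * np.log(1 - F))
    lB = np.mean((i - 0.5) * np.log(F) + (n - i + 0.5) * np.log(1 - F))
    return A, lB

exp_cdf = lambda x, s: 1.0 - np.exp(-x / s)
rng = np.random.default_rng(2)
x = np.sort(rng.exponential(scale=2.0, size=50))
for theta in (1.0, 2.0, 3.0):
    A, lB = ad_and_bernoulli(theta, x, exp_cdf)
    # affine relation l_B = -(A + n)/2: maximising l_B <=> minimising A
    assert abs(lB + (A + len(x)) / 2) < 1e-9
```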

If the empirical distribution function (7) is used for the application of the MAD estimator, then the numerical stability of the estimation procedure can be ensured by a fixed start value for the initialisation of the random generator for every considered parameter vector θ. Furthermore, we point out that the MAD estimator has an advantage over the ML method: it can be applied directly and with much less modification in the case of censored data. Only the limits of the counting variable i have to be changed in Eq. (14).


3 Computing speed of estimations with approximated distributions

Here, we compare the computing speed of the MAD estimator with an approximated CDF to that of the conventional ML method with an approximated PDF. The latter is provided by Kernel density estimation according to Silverman (1986). For the performance analysis, we construct a random variable X by the generating mechanism

$$X = \frac{Y_1 + \sqrt{Y_2} + \sqrt[4]{Y_3}}{Y_4\sqrt{Y_5}}, \quad X \ge 0, \qquad (16)$$

wherein the Y_i are independent, exponentially distributed random variables with PDF

$$f_Y(x) = \frac{1}{\sigma}\exp\!\left(-\frac{x}{\sigma}\right), \quad x \ge 0,\; \sigma > 0. \qquad (17)$$

The PDF f of X is parametrized only by σ = 1 of Eq. (17) in our example. We know neither the PDF nor the CDF of X; the constructed example thus also demonstrates the limits of conventional estimation methods. An interesting detail is that the approximated CDF F of X is, in its upper half, very similar to the well-known generalized Pareto distribution (see, e.g., Beirlant et al., 2004) with CDF

$$F(x) = 1 - \left(1 + \gamma\frac{x}{\sigma}\right)^{-1/\gamma}, \quad \sigma > 0, \qquad (18)$$

with σ = 5 and extreme value index γ = 1. This means that the expectation E(X) is infinite (cf. Beirlant et al., 2004, section 5.3) and a moment estimator would not work. The CDFs and the survival functions are shown in Figures 1a and b. The CDF of X is computed by the empirical distribution function according to Eq. (7) with a large sample of n* = 100,000 generated by Monte Carlo simulation. The parameter σ of f_Y and f is estimated in the performance analysis for a sample of n = 100, which is depicted in Figure 1c.

Figure 1.
We apply a simple optimization algorithm (nearest neighbour; highest relative exactness 1‰; start value σ = 1) to maximize the logarithmic likelihood functions. The numerical procedures are programmed in VB.net using the interpolation tools of the mathematics library of Extreme Optimization (www.extremeoptimization.com). The bandwidth of the Kernel for the approximation of the PDF is computed from the sample variance of the observations (Fig. 1c) according to the optimal bandwidth for a Gauss distribution. We consider Gaussian, Epanechnikov and Biweight kernels (Silverman, 1986, Tab. 3.1).
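The MAD branch of this comparison can be sketched as follows, in Python rather than the VB.net used for the reported timings; linear interpolation of the simulated EDF and SciPy's bounded scalar minimiser stand in here for the interpolation tools and the nearest-neighbour search:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def draw_x(sigma, size, rng):
    """Generating mechanism of Eq. (16) with exponential Y_i of Eq. (17)."""
    y = rng.exponential(scale=sigma, size=(5, size))
    return (y[0] + np.sqrt(y[1]) + y[2] ** 0.25) / (y[3] * np.sqrt(y[4]))

def approx_cdf(sigma, x, n_star=100_000, seed=42):
    """EDF approximation of the unknown CDF of X, Eq. (7). A fixed seed keeps
    the simulated CDF numerically stable across trial parameter values."""
    sim = np.sort(draw_x(sigma, n_star, np.random.default_rng(seed)))
    probs = np.arange(1, n_star + 1) / (n_star + 1)
    return np.interp(x, sim, probs, left=probs[0], right=probs[-1])

def ad_distance(sigma, x_sorted):
    """Anderson-Darling distance of Eq. (3) with the approximated CDF."""
    n = len(x_sorted)
    F = np.clip(approx_cdf(sigma, x_sorted), 1e-9, 1 - 1e-9)
    i = np.arange(1, n + 1)
    return -n - np.mean((2 * i - 1) * np.log(F) + (2 * (n - i) + 1) * np.log(1 - F))

rng = np.random.default_rng(7)
obs = np.sort(draw_x(1.0, 100, rng))         # "observed" sample, true sigma = 1
res = minimize_scalar(ad_distance, bounds=(0.2, 5.0), args=(obs,), method="bounded")
sigma_hat = res.x
```

Because the same seed is reused for every trial σ, the objective varies continuously in σ, which is exactly the stabilisation discussed at the end of section 2.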

The required computation time on an ACPI x64-based PC is listed in Table 1. Obviously, the computation time for the Bernoulli estimator with approximated CDF is much shorter than the computation time for the conventional ML estimation with approximated PDF. The interpolation method does not have much influence on the computation time and estimation results of the MAD method.

Tab. 1.

4 Performance of the MAD estimator for the GPD

The current interest in minimum distance estimators for the GPD is relatively high in actuarial science

(e.g., Ruckdeschel and Horbenkom, 2010; Skřivánková and Juhás, 2012) and the performance of the

MAD estimator was not researched by Boss (1982) for GPD. Hence we compute the MSE of the

parameter estimation for finite sample sizes n=50 and 100. These are the sample sizes which have

already been considered in the performance analyse of Hüsler et al. (2011, Figure 2-5), wherein the

Moment – ML method was also introduced. The scale parameter is =1 in all researched cases. The

considered extreme value index  is between -2 and 2. The MSE is quantified empirically by point

estimations of 50,000 samples for each parameter variant. The samples are generated by Monte Carlo

simulation. The results are depicted in Figure 2. The MSE of 𝛾̂ of the MAD estimator is larger than of

the ML and Moment – ML method. But the MSE of scale parameter 𝜎̂ is the smallest for MAAD

estimator if >1. We underline that the ML method has not always a solution for <-0.5 (cf.

Grimshaw, 1993). That is why we only consider the ML method for ≥-0.5.

Figure 2.
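A single replication of such a Monte Carlo experiment can be sketched in Python as follows, maximising the Bernoulli likelihood of Eq. (14) (equivalently, minimising A of Eq. (3)) with a Nelder-Mead search; γ = 0.5, σ = 1 and n = 100 are illustrative choices:

```python
import numpy as np
from scipy.optimize import minimize

def gpd_cdf(x, gamma, sigma):
    """Generalized Pareto CDF of Eq. (18); gamma -> 0 gives the exponential CDF."""
    if abs(gamma) < 1e-9:
        return 1.0 - np.exp(-x / sigma)
    z = np.clip(1.0 + gamma * x / sigma, 1e-12, None)
    return 1.0 - z ** (-1.0 / gamma)

def neg_lB(params, x_sorted):
    """Negative Bernoulli likelihood l_B of Eq. (14) for the GPD."""
    gamma, sigma = params
    if sigma <= 0.0:
        return np.inf
    n = len(x_sorted)
    F = np.clip(gpd_cdf(x_sorted, gamma, sigma), 1e-12, 1.0 - 1e-12)
    i = np.arange(1, n + 1)
    return -np.mean((i - 0.5) * np.log(F) + (n - i + 0.5) * np.log(1.0 - F))

# one replication: true gamma = 0.5, sigma = 1, sample size n = 100
rng = np.random.default_rng(3)
u = rng.uniform(size=100)
x = np.sort(1.0 / 0.5 * ((1.0 - u) ** (-0.5) - 1.0))  # GPD quantile transform
res = minimize(neg_lB, x0=np.array([0.1, 1.0]), args=(x,), method="Nelder-Mead")
gamma_hat, sigma_hat = res.x
```

Repeating this over many samples and averaging the squared estimation errors yields the empirical MSE curves of Figure 2.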


5 Application of the MAD estimator with approximated CDF to an issue of seismic hazard analysis

In seismic hazard analysis, the annual probability of exceedance of the local earthquake shaking intensity Y is estimated (cf. Raschke, 2013). The ground motion relation is an element of the seismic hazard model and describes the relation between concrete event parameters and source distance, being members of the predictor vector X, and the local earthquake shaking intensity Y with

$$Y = g(\mathbf{X})\,\varepsilon_a\,\varepsilon_b, \quad E(Y) = g(\mathbf{X}), \quad E(\varepsilon_a) = E(\varepsilon_b) = 1, \qquad (19a)$$
$$\ln(Y) = \ln\!\big(g(\mathbf{X})\big) + \ln(\varepsilon_a) + \ln(\varepsilon_b). \qquad (19b)$$

The random variable Y is the absolute peak of the ground motion acceleration or a similar quantity. The event-specific random component (a random variable; a residual in the sense of a regression model) is ε_b and has only one realization per event, while the individual random component ε_a has one realization per event and site. The value of the corresponding variances V(ε_a) and V(ln(ε_a)) can considerably influence the results of the hazard estimation for large return periods. Hence the correct estimation of V(ε_a) is very important. Eq. (19) looks like a regression model, but V(ε_a) may not be estimated by regression analysis (residual variance) because of the area-equivalence according to Raschke (2013). The variance V(ε_a) has to be estimated by a special procedure. Therein, we use the fact that there are two horizontal component intensities Y₁ and Y₂ with random components ε_{a,1} and ε_{a,2}. The shaking intensities Y_i are the maxima of the absolute values of the time histories y_i(t) of the earthquake ground acceleration with

$$Y_i = \max_t |y_i(t)|. \qquad (20)$$

We can also consider the shaking intensities Y(w) = Y(w + π), which depend on the orientation angle w, with

$$Y(w) = \varepsilon_a(w)\,\varepsilon_b\, g(\mathbf{X}). \qquad (21)$$

The issue is illustrated by the polar plot of an earthquake time history and the resulting shaking intensities Y(w) in Figure 3a. Therein, the random components ε_{a,1} and ε_{a,2} determine the random difference Δ according to Eq. (19b) with

$$\Delta = \log(Y_1) - \log(Y_2) = \log(\varepsilon_{a,1}) - \log(\varepsilon_{a,2}), \quad E(\Delta) = 0. \qquad (22)$$

In practice, the concrete orientation angles w₁ and w₂ are frequently the geographic directions north-south and east-west. However, it is important that the components are perpendicular to each other.

For the estimation of V(ε_a), we approximate the stochastic mechanism which generates ε_a by random impulses Z_i with uniformly distributed random directions v_i according to Fig. 3b. The random element ε_a is

$$\varepsilon_a(w) = \max\{Z_1|\cos(w - v_1)|,\, Z_2|\cos(w - v_2)|,\, \ldots,\, Z_k|\cos(w - v_k)|\}. \qquad (23)$$

The natural distribution of the Z_i is the Gumbel distribution (s. Johnson et al., 1994; and the appendix), as each Z_i works like a cluster peak. Furthermore, we assume a Poisson distributed number k of random impulses (s. appendix).

Figure 3.
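The impulse mechanism of Eqs. (22, 23) can be simulated directly. The Python sketch below uses illustrative parameter values rather than the fitted SHARE values; the direction range [0, π) (sufficient because |cos| has period π), the logarithm base and the guard against k = 0 are assumptions of this sketch:

```python
import numpy as np

def simulate_delta(lam, mean_z, var_z, size, rng):
    """Simulate the random difference Delta of Eq. (22) from the impulse model
    of Eq. (23): k ~ Poisson(lam) impulses with Gumbel amplitudes Z_i and
    uniform directions v_i; the two components are the perpendicular
    orientations w = 0 and w = pi/2."""
    b = np.sqrt(6.0 * var_z) / np.pi      # Gumbel scale from V(Z), Eq. (A3)
    a = mean_z - 0.57722 * b              # Gumbel location from E(Z), Eq. (A2)
    out = np.empty(size)
    for j in range(size):
        k = max(rng.poisson(lam), 1)      # guard against k = 0 (an assumption)
        z = a - b * np.log(-np.log(rng.uniform(size=k)))  # Gumbel draws, Eq. (A1)
        v = rng.uniform(0.0, np.pi, size=k)
        e1 = np.max(z * np.abs(np.cos(0.0 - v)))          # eps_a at w = 0
        e2 = np.max(z * np.abs(np.cos(np.pi / 2 - v)))    # eps_a at w = pi/2
        out[j] = np.log10(e1) - np.log10(e2)
    return out

rng = np.random.default_rng(11)
delta = simulate_delta(lam=8.0, mean_z=1.0, var_z=0.05, size=5000, rng=rng)
# E(Delta) = 0 by symmetry; V(Delta) links the observable sample to lam
```

Such simulated samples of Δ are exactly what the EDF approximation of section 2 and the moment-matching described below operate on.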

We have no explicit formulations for the distributions of ε_a and Δ, and approximate these by the aforementioned Monte Carlo simulation and Kernel smoothing or by the EDF. The ML and MAD methods can be applied to these approximations for parameter estimation. Here, we combine them with the moment estimator to ensure numerically that E(ε_a) = 1 and that the empirical variance of the observed difference Δ is equal to the modelled one. In this way the parameters E(Z) and V(Z) are determined by the moment estimator, and only the intensity λ of the Poisson distributed number k (s. appendix) has to be estimated by the ML or MAD method. We have computed the likelihood functions for a sample of Δ from data of peak ground accelerations of the SHARE project (Giardini et al., 2013; Share_Metafile_v3.2a.xls, free-field observations with moment magnitudes) with n = 1,829 and V̂(Δ) = 0.119. We approximate the PDF and the CDF for the likelihood functions by a sample of size n* = 100,000 generated by Monte Carlo simulation. Therein we have generated 200 random impulses Z for every realization of ε_{a,1}, ε_{a,2} and Δ, although the generated random integer number k of impulses is much smaller; this procedure ensures smoother likelihood function graphs. For the same reason, the random generator (Mersenne-Twister of the library Extreme Optimization) was restarted with the same start value for every considered value of the Poisson intensity λ.

Figure 4.

The likelihood functions are shown in Figure 4 in steps of 0.1 of the Poisson intensity λ. The advantage of the Bernoulli likelihood function of the MAD estimator is obvious: it has a smooth graph with only one maximum, in contrast to the classical likelihood function of the ML method. The disadvantage of the MAD estimator is the smaller slope of the Bernoulli likelihood function. The point estimations of λ are 7.9 and 8.0. The actual target parameter is the corresponding variance V(ε_a), which is estimated at 0.058 for both estimation methods. Additionally, we have checked the estimation by a comparison of the modelled distribution of the random difference Δ with the EDF of the observed sample. The results are shown in Figure 5; the symmetric distributions correspond very well.

Figure 5.

6 Conclusions

We have researched the opportunities of the MAD estimator. Its advantages are:

• It is a special kind of ML estimator (sect. 2).
• Its MSE is not much larger than that of the ML method (Boos, 1982). In some cases of finite sample size, its MSE can even be smaller (sect. 4).
• It can be applied with an approximated CDF. The computing speed is much higher than for the ML method with an approximated PDF (sect. 3).
• The MAD estimator with approximated CDF is not affected by a larger number of local maxima of l_B(θ) (minima of A(θ)), in contrast to the ML method with approximated PDF (sect. 5). Automatic procedures can therefore be applied more easily to the MAD estimator than to the ML method.
• It can be applied directly to censored data.

In other words, the MAD estimator offers a good and practical solution for special estimation problems which can occur in the modelling of complex phenomena. Its biggest disadvantage is that its MSE can be higher than the MSE of the ML method.


References

Anderson, T.W., Darling, D.A. (1954). A test of goodness of fit. Journal of the American Statistical Association 49:765-769.

Basu, A., Shioya, H., Park, C. (2011). Statistical Inference: The Minimum Distance Approach. Monographs on Statistics and Applied Probability, Chapman & Hall, Taylor & Francis Group, Boca Raton.

Beirlant, J., Goegebeur, Y., Teugels, J., Segers, J. (2004). Statistics of Extremes: Theory and Applications. Wiley Series in Probability and Statistics, Chichester: Wiley & Sons.

Boos, D. (1982). Minimum Anderson-Darling estimation. Communications in Statistics - Theory and Methods 11:2747-2774.

Clemente, G.P., Savelli, N., Zappa, D. (2012). Modelling and calibration for non-life underwriting risk: from empirical data to risk capital evaluation. Presentation, ASTIN Colloquium, 1-4 October 2013, Mexico City.

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. London: Springer.

Coronel-Brizio, H.F., Hernandez-Montoya, A.R. (2005). On fitting the Pareto-Levy distribution to stock market index data: selecting a suitable cutoff value. Physica A 354:437-449.

Drossos, C.A., Philippou, A.N. (1980). A note on minimum distance estimates. Annals of the Institute of Statistical Mathematics 32:121-123.

Efron, B. (1979). Bootstrap methods: another look at the jackknife. The Annals of Statistics 7:1-26.

Giardini, D., Woessner, J., Danciu, L., Valensise, G., Grünthal, G., Cotton, F., Akkar, S., Basili, R., Stucchi, M., Rovida, A., Stromeyer, D., Arvidsson, R., Meletti, F., Musson, R., Sesetyan, K., Demircioglu, M.B., Crowley, H., Pinho, R., Pitilakis, K., Douglas, J., Fonseca, J., Erdik, M., Campos-Costa, A., Glavatovic, B., Makropoulos, K., Lindholm, C., Camelbeeck, T. (2013). Seismic Hazard Harmonization in Europe (SHARE): Online Data Resource, http://portal.share-eu.org:8080/jetspeed/portal/, doi: 10.12686/SED-00000001-SHARE (downloaded March 2015).

Grimshaw, S. (1993). Computing maximum likelihood estimates for the generalized Pareto distribution. Technometrics 35:185-191.

Hüsler, J., Li, D., Raschke, M. (2011). Estimation for the generalized Pareto distribution using maximum likelihood and goodness of fit. Communications in Statistics - Theory and Methods 40:2500-2510.

Johnson, N.L., Kemp, A.W., Kotz, S. (2005). Univariate Discrete Distributions, 3rd ed. Wiley Series in Probability and Statistics. New York: Wiley.

Johnson, N.L., Kotz, S., Balakrishnan, N. (1994). Continuous Univariate Distributions, vol. 1, 2nd ed. New York: Wiley.

Kozek, A.S. (1998). On minimum distance estimation using Kolmogorov-Lévy type metrics. Australian & New Zealand Journal of Statistics 40:317-333.

NIED (National Research Institute for Earth Science and Disaster Prevention) (2015). Strong-motion networks, http://www.kyoshin.bosai.go.jp/ (downloaded March 2015).

Parr, W.C., Schucany, W.R. (1980). Minimum distance and robust estimation. Journal of the American Statistical Association 75:616-624.

Parr, W.C. (1981). Minimum distance estimation: a bibliography. Communications in Statistics - Theory and Methods 10:1205-1224.

Raschke, M. (2013). Statistical modelling of ground motion relations for seismic hazard analysis. Journal of Seismology 17:1157-1182.

Ruckdeschel, P., Horbenkom, N. (2010). Robustness properties of estimators in generalized Pareto models. Scientific report: Berichte des Fraunhofer ITWM, Nr. 182 (http://www.itwm.fraunhofer.de/fileadmin/ITWM-Media/Zentral/Pdf/Berichte_ITWM/2010/bericht_182.pdf).

Silverman, B.W. (1986). Density Estimation for Statistics and Data Analysis. London: Chapman & Hall/CRC.

Skřivánková, V., Juhás, M. (2012). EVT methods as risk management tools. Conference paper, 6th International Scientific Conference Managing and Modelling of Financial Risks, 10-11 September 2012, Ostrava.

Stephens, M.A. (1986). Tests based on EDF statistics. In: D'Agostino, R.B., Stephens, M.A. (eds.), Goodness-of-Fit Techniques. Statistics: Textbooks and Monographs, Vol. 68, Marcel Dekker, New York.

Upton, G., Cook, I. (2008). A Dictionary of Statistics, 2nd ed. Oxford: Oxford University Press.

Wolfowitz, J. (1957). The minimum distance method. The Annals of Mathematical Statistics 28:75-88.

Appendix

The Gumbel distribution of a real-valued random variable X has the CDF (cf. Johnson et al., 1994, sect. 22)

$$F(x) = \exp\!\left(-\exp\!\left(-\frac{x-a}{b}\right)\right) \qquad (A1)$$

with parameters a and b, and has the moments

$$E(X) = a + b\gamma, \quad \gamma \approx 0.57722, \quad\text{and} \qquad (A2)$$

$$V(X) = \frac{b^2\pi^2}{6}. \qquad (A3)$$

The Poisson distribution for a discrete random variable K ≥ 0 is formulated with (cf. Johnson et al., 2005, sect. 4)

$$P(K = x) = \frac{\lambda^x \exp(-\lambda)}{x!} \qquad (A4)$$

with Poisson intensity λ, and has the moments

$$E(K) = V(K) = \lambda. \qquad (A5)$$


Tables

Table 1: Computation time of the MAD estimator and the ML method with approximated distributions

Sample size n* = 100,000:
  Estimator:                        ML                           MAD
  Kernel / interpolation:           Gauss    Epan.    Biwei.     Linear   Spline
  Time [s], first computation
  of the likelihood function:       0.59     1.08     1.57       0.02     0.05
  Time [s], entire estimation:      8.87     8.77     13.29      0.80     0.73
  Point estimation σ̂:               0.79     1.00     1.00       1.17     1.08

Sample size n* = 1,000,000:
  Estimator:                        ML                           MAD
  Kernel / interpolation:           Gauss    Epan.    Biwei.     Linear   Spline
  Time [s], first computation
  of the likelihood function:       4.99     10.36    15.38      0.16     0.20
  Time [s], entire estimation:      83.00    109.36   432.55     9.07     10.31
  Point estimation σ̂:               0.99     0.99     0.99       1.16     1.16


Figures

a) b) c)
Figure 1: Distribution of the researched example of X according to Eq. (16, 17) with σ = 1 and a generalized Pareto distribution according to Eq. (18) with σ = 5 and γ = 1: a) CDF, b) survival function, c) analysed sample of X.

a) b) c) d)
Figure 2: MSE of the different estimation methods for the GPD: a) MSE of γ̂ for n = 50, b) MSE of γ̂ for n = 100, c) MSE of σ̂ for n = 50, d) MSE of σ̂ for n = 100 (light blue line: Moment-ML method, broken red line: ML method, solid black line: MAD method; supporting points according to Hüsler et al., 2011).

a) b)
Figure 3: Modelling of the random component ε_a: a) example of an earthquake time history [cm/s²] of station FKS013 in Japan (NIED, 2015; record time 2001/10/02 17:20:12) and shaking intensity Y(w), b) random impulse Z_i, random angle v_i and orientation w of Eq. (23).


a) b)
c) d)
Figure 4: Likelihood functions of the ML and MAD methods for the parametrisation of the generation of Δ according to Eq. (22, 23): a) l(λ) of the ML method according to Eq. (6), b) l_B(λ) of the MAD estimator according to Eq. (15), c) detail of a), d) detail of b).

a) b)
Figure 5: Validation of the modelled distribution of the random difference Δ: a) CDFs, b) CDFs with logarithmic scale.
