
Available online at www.sciencedirect.com

ScienceDirect
Procedia Engineering 192 (2017) 713 – 718

TRANSCOM 2017: International scientific conference on sustainable, modern and safe transport

Application of four probability distributions for wind speed modeling
Ivana Pobočíková^a, Zuzana Sedliačková^a, Mária Michalková^a,*
^a Department of Applied Mathematics, Faculty of Mechanical Engineering, University of Žilina, Univerzitná 1, 010 26 Žilina, Slovakia

Abstract

When modeling the wind speed at a given location, probability distributions prove to be a useful tool. In this study we
consider four probability distributions: the 2-parameter Weibull, the 3-parameter Weibull, the 2-parameter Gamma and
the 2-parameter Lognormal. All of them are applied to the wind speed data recorded at the airport in Dolný Hričov. The
parameters of the distributions are estimated by the maximum likelihood method. To select the best-fitting distribution
we use the $\chi^2$-test, the Kolmogorov-Smirnov test, the Akaike information criterion, the Bayesian information
criterion, the coefficient of determination and the root mean square error. Based on the results, the 3-parameter
Weibull distribution performs best and the 2-parameter Weibull distribution performs second best.
© 2017 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/by-nc-nd/4.0/).
Peer-review under responsibility of the scientific committee of TRANSCOM 2017: International scientific conference on
sustainable, modern and safe transport.
Keywords: Wind speed; Weibull distribution; Gamma distribution; Lognormal distribution; goodness of fit test; Akaike information criteria;
Bayesian information criteria; coefficient of determination; root mean square error

1. Introduction

The wind direction and the wind speed are important meteorological information for air transport safety. High wind
speed or wind shear causes turbulence and problems during take-off and landing. At a given location, the wind speed
varies throughout the day, from day to day, season to season and year to year. Wind speed data analysis might
contribute to increasing safety. As the wind speed is a random phenomenon, a probability distribution is required for
wind speed modeling. Several probability distributions have been used in the literature to describe the wind speed.
The Weibull distribution is commonly used [1-5], though some authors have indicated that it should not be used in a
generalized way; they used and compared other probability distributions that provide a better fit to the wind
speed [6-12]. For an overview of the probability distributions and the parameter estimation methods used with wind
speed data see [13].

* Corresponding author. Tel.: +421 41 513 4952
E-mail address: maria.michalkova@fstroj.uniza.sk
1877-7058 © 2017 The Authors. Published by Elsevier Ltd.
doi:10.1016/j.proeng.2017.06.123

2. Methods

The wind speed data used in this study were measured in Dolný Hričov in January 2010. The meteorological station
was located at the airport in Žilina. The continuously recorded wind speed data (in m/s) were averaged per day and
stored as daily values.
Four probability distributions are considered: the 2-parameter Weibull, the 3-parameter Weibull, the 2-parameter
Gamma and the 2-parameter Lognormal. These distributions are the most preferred ones for modeling wind speed data.
For all considered distributions, the parameters are estimated using the maximum likelihood method (MLM). In the
presented study the parameter estimation and the calculations are performed in the statistical software STATISTICA
and in MATLAB.
The fitted distributions undergo several goodness of fit tests in order to determine the most suitable probability
distribution for the wind speed data. The following goodness of fit tests are used: the $\chi^2$-test and the
Kolmogorov-Smirnov test (KS). To evaluate the goodness of fit we also use the Akaike information criterion (AIC), the
Bayesian information criterion (BIC), the coefficient of determination ($R^2$) and the root mean square error (RMSE).

2.1. Maximum likelihood method

Let $X_1, X_2, \dots, X_n$ be a random sample of size $n$ from the distribution with the probability density function
$f(x, \theta)$, where $\theta = (\theta_1, \theta_2, \dots, \theta_k)$, $\theta \in \Theta$, is the unknown parameter
(in general a vector), and let $x_1, x_2, \dots, x_n$ be a realization of the random sample. The log-likelihood
function of this random sample is given by

$$\ln L(x_1, x_2, \dots, x_n; \theta) = \sum_{i=1}^{n} \ln f(x_i, \theta). \tag{1}$$

The maximum likelihood estimates $\hat{\theta}$ of the parameter $\theta$ are the values of $\theta$ that maximize (1)
with respect to $\theta$.
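
To make equation (1) concrete, the following is a minimal Python sketch (the paper itself used STATISTICA and MATLAB). It evaluates the log-likelihood for an arbitrary log-density and maximizes it by a crude grid search. The sample values, the grid and the function names are illustrative assumptions, not the Dolný Hričov data or the authors' code; the Weibull log-density is used only as an example of $\ln f(x, \theta)$.

```python
import math

def log_likelihood(log_pdf, sample, theta):
    """Equation (1): sum of ln f(x_i, theta) over the sample."""
    return sum(log_pdf(x, *theta) for x in sample)

def weibull_log_pdf(x, alpha, beta):
    """ln f(x) for the 2-parameter Weibull density (example density only)."""
    return (math.log(alpha) - alpha * math.log(beta)
            + (alpha - 1.0) * math.log(x) - (x / beta) ** alpha)

# Hypothetical daily mean wind speeds in m/s (NOT the data used in the paper).
sample = [0.6, 0.9, 1.1, 1.3, 1.4, 1.6, 1.9, 2.3]

# Crude grid search over (alpha, beta): keep the candidate with the
# largest log-likelihood. A real analysis would use a proper optimizer.
grid = [(a / 10, b / 10) for a in range(10, 51) for b in range(5, 31)]
theta_hat = max(grid, key=lambda th: log_likelihood(weibull_log_pdf, sample, th))
```

Working with the log-density directly, rather than taking the logarithm of the density, avoids underflow for parameter values far from the optimum.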

2.2. Probability distributions

Weibull distribution: The probability density function $f(x)$ and the cumulative distribution function $F(x)$ of the
2-parameter Weibull distribution are given by

$$f(x) = \frac{\alpha}{\beta^{\alpha}}\, x^{\alpha-1} \exp\left(-\left(\frac{x}{\beta}\right)^{\alpha}\right), \qquad F(x) = 1 - \exp\left(-\left(\frac{x}{\beta}\right)^{\alpha}\right),$$

for $x > 0$, $\alpha > 0$, $\beta > 0$. $\alpha$ is the dimensionless shape parameter and $\beta$ is the scale parameter
in units of the wind speed. The MLM estimates are obtained by solving iteratively

$$\frac{\sum_{i=1}^{n} x_i^{\alpha} \ln x_i}{\sum_{i=1}^{n} x_i^{\alpha}} - \frac{1}{\alpha} - \frac{1}{n}\sum_{i=1}^{n} \ln x_i = 0, \qquad \beta = \left(\frac{1}{n}\sum_{i=1}^{n} x_i^{\alpha}\right)^{1/\alpha}.$$
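
The iterative solution of the shape equation can be sketched as follows. This is a minimal Python illustration that assumes bisection as the root finder (the paper does not state which iteration was used); the bracket `[lo, hi]`, the tolerance and the sample are hypothetical choices.

```python
import math

def weibull_mle(sample, lo=0.05, hi=50.0, tol=1e-10):
    """Solve the Weibull shape equation by bisection, then compute the scale."""
    n = len(sample)
    mean_log = sum(math.log(x) for x in sample) / n

    def g(alpha):
        # Left-hand side of the shape equation; it is increasing in alpha.
        num = sum(x ** alpha * math.log(x) for x in sample)
        den = sum(x ** alpha for x in sample)
        return num / den - 1.0 / alpha - mean_log

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    alpha = 0.5 * (lo + hi)
    beta = (sum(x ** alpha for x in sample) / n) ** (1.0 / alpha)
    return alpha, beta

# Hypothetical daily mean wind speeds in m/s (NOT the data used in the paper).
sample = [0.6, 0.9, 1.1, 1.3, 1.4, 1.6, 1.9, 2.3]
alpha_hat, beta_hat = weibull_mle(sample)
```

Bisection is slow but robust here because the left-hand side of the shape equation is monotone in $\alpha$; the sketch assumes positive observations that are not all equal.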

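3-parameter Weibull distribution: The probability density function $f(x)$ and the cumulative distribution
function $F(x)$ of the 3-parameter Weibull distribution are given by

$$f(x) = \frac{\alpha}{\beta^{\alpha}}\, (x-\theta)^{\alpha-1} \exp\left(-\left(\frac{x-\theta}{\beta}\right)^{\alpha}\right), \qquad F(x) = 1 - \exp\left(-\left(\frac{x-\theta}{\beta}\right)^{\alpha}\right),$$

for $x \ge \theta$, $\alpha > 0$, $\beta > 0$. $\alpha$ is the dimensionless shape parameter, $\beta$ is the scale
parameter in units of the wind speed and $\theta$ is the location parameter. The MLM estimates are obtained by solving
iteratively

$$\frac{\sum_{i=1}^{n} (x_i-\theta)^{\alpha} \ln (x_i-\theta)}{\sum_{i=1}^{n} (x_i-\theta)^{\alpha}} - \frac{1}{\alpha} - \frac{1}{n}\sum_{i=1}^{n} \ln (x_i-\theta) = 0,$$

$$\beta = \left(\frac{1}{n}\sum_{i=1}^{n} (x_i-\theta)^{\alpha}\right)^{1/\alpha}, \qquad \frac{\alpha}{\alpha-1}\,\frac{\sum_{i=1}^{n} (x_i-\theta)^{\alpha-1}}{\sum_{i=1}^{n} (x_i-\theta)^{\alpha}} = \frac{1}{n}\sum_{i=1}^{n} \frac{1}{x_i-\theta}.$$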
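
As a check on the three estimating equations of the 3-parameter Weibull distribution, they can be derived from the log-likelihood. The following is a sketch of that standard derivation; it is not reproduced from the paper.

```latex
\begin{align*}
\ln L &= n\ln\alpha - n\alpha\ln\beta
        + (\alpha-1)\sum_{i=1}^{n}\ln(x_i-\theta)
        - \beta^{-\alpha}\sum_{i=1}^{n}(x_i-\theta)^{\alpha}, \\
\frac{\partial \ln L}{\partial \beta} = 0
  &\;\Rightarrow\; \beta^{\alpha} = \frac{1}{n}\sum_{i=1}^{n}(x_i-\theta)^{\alpha}, \\
\frac{\partial \ln L}{\partial \theta} = 0
  &\;\Rightarrow\; \frac{\alpha}{\beta^{\alpha}}\sum_{i=1}^{n}(x_i-\theta)^{\alpha-1}
     = (\alpha-1)\sum_{i=1}^{n}\frac{1}{x_i-\theta}.
\end{align*}
```

Substituting the expression for $\beta^{\alpha}$ into the last line and dividing by $n(\alpha-1)$ gives the location equation. Since the left-hand side of the last line is positive, a stationary point in $\theta$ can exist only for $\alpha > 1$.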

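Gamma distribution: The probability density function $f(x)$ and the cumulative distribution function $F(x)$ of the
2-parameter Gamma distribution are given by

$$f(x) = \frac{1}{\Gamma(\alpha)\,\beta^{\alpha}}\, x^{\alpha-1} \exp\left(-\frac{x}{\beta}\right), \qquad F(x) = \frac{\gamma\left(\alpha, \frac{x}{\beta}\right)}{\Gamma(\alpha)},$$

for $x > 0$, $\alpha > 0$, $\beta > 0$, where $\gamma(p, x) = \int_0^x e^{-t} t^{p-1}\,dt$, $p > 0$, is the lower incomplete
Gamma function. $\alpha$ is the shape parameter and $\beta$ is the scale parameter. The MLM estimates are obtained by
solving iteratively

$$\beta = \frac{\bar{x}}{\alpha}, \qquad \psi(\alpha) = \ln\alpha - \ln\bar{x} + \frac{1}{n}\sum_{i=1}^{n} \ln x_i,$$

where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ is the sample mean and $\psi(p) = \frac{\partial \ln\Gamma(p)}{\partial p}$,
$p > 0$, is the digamma function.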
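
The Gamma shape equation can be solved numerically along the following lines. This is a minimal standard-library Python sketch under stated assumptions: the digamma function is approximated by its recurrence plus an asymptotic series, the root is found by bisection, and the sample is hypothetical. None of these choices is taken from the paper.

```python
import math

def digamma(p):
    """Approximate psi(p) via psi(p) = psi(p + 1) - 1/p and an asymptotic series."""
    result = 0.0
    while p < 6.0:
        result -= 1.0 / p
        p += 1.0
    inv2 = 1.0 / (p * p)
    # psi(p) ~ ln p - 1/(2p) - 1/(12 p^2) + 1/(120 p^4) - 1/(252 p^6)
    result += (math.log(p) - 0.5 / p
               - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252)))
    return result

def gamma_mle(sample, lo=1e-3, hi=1e3, tol=1e-10):
    """Solve ln(alpha) - psi(alpha) = ln(mean) - mean(ln x) by bisection."""
    n = len(sample)
    mean = sum(sample) / n
    s = math.log(mean) - sum(math.log(x) for x in sample) / n
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # ln(alpha) - psi(alpha) decreases in alpha, so a value above s
        # means the trial alpha is still too small.
        if math.log(mid) - digamma(mid) > s:
            lo = mid
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)
    return alpha, mean / alpha

# Hypothetical daily mean wind speeds in m/s (NOT the data used in the paper).
sample = [0.6, 0.9, 1.1, 1.3, 1.4, 1.6, 1.9, 2.3]
alpha_hat, beta_hat = gamma_mle(sample)
```

The equation solved here is the shape equation rearranged as $\ln\alpha - \psi(\alpha) = \ln\bar{x} - \frac{1}{n}\sum \ln x_i$; the scale then follows from $\beta = \bar{x}/\alpha$.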
Lognormal distribution: The probability density function $f(x)$ and the cumulative distribution function $F(x)$ of
the 2-parameter Lognormal distribution are given by

$$f(x) = \frac{1}{x\sigma\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{\ln x-\mu}{\sigma}\right)^{2}\right), \qquad F(x) = \Phi\left(\frac{\ln x-\mu}{\sigma}\right),$$

for $x > 0$, $\mu \in R$, $\sigma > 0$, where $\Phi(x)$ is the cumulative distribution function of the standard normal
distribution. $\sigma$ is the scale parameter and $\mu$ is the location parameter. The MLM estimates are given in
closed form by

$$\mu = \frac{1}{n}\sum_{i=1}^{n} \ln x_i, \qquad \sigma^{2} = \frac{1}{n}\sum_{i=1}^{n} \left(\ln x_i - \frac{1}{n}\sum_{j=1}^{n} \ln x_j\right)^{2}.$$
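
Because the Lognormal estimates are available in closed form, no iteration is needed. A minimal Python sketch, with a hypothetical sample and function name:

```python
import math

def lognormal_mle(sample):
    """Closed-form MLM estimates (mu, sigma) of the 2-parameter Lognormal."""
    logs = [math.log(x) for x in sample]
    n = len(logs)
    mu = sum(logs) / n
    # The maximum likelihood variance uses the divisor n, not n - 1.
    sigma2 = sum((v - mu) ** 2 for v in logs) / n
    return mu, math.sqrt(sigma2)

# Hypothetical daily mean wind speeds in m/s (NOT the data used in the paper).
sample = [0.6, 0.9, 1.1, 1.3, 1.4, 1.6, 1.9, 2.3]
mu_hat, sigma_hat = lognormal_mle(sample)
```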

2.3. Goodness of fit tests and model selection criteria

Let $X_{(1)}, X_{(2)}, \dots, X_{(n)}$ be the order statistics of the random sample $X_1, X_2, \dots, X_n$ from the
distribution with the cumulative distribution function $F(x)$ and let $x_{(1)}, x_{(2)}, \dots, x_{(n)}$ be the
observations in ascending order, so that $x_{(1)} \le x_{(2)} \le \dots \le x_{(n)}$.
The empirical distribution function is defined as

$$F_n(x) = \frac{1}{n}\sum_{i=1}^{n} I\left(x_{(i)} \le x\right),$$

where $I\left(x_{(i)} \le x\right) = 1$ if $x_{(i)} \le x$ and 0 otherwise.
Kolmogorov-Smirnov test: We want to test the hypothesis
$H_0$: the data follow the specified distribution with cumulative distribution function $F(x)$,
$H_A$: the data do not follow the specified distribution with cumulative distribution function $F(x)$.
The KS test statistic is computed as

$$D_n = \max_{1 \le i \le n} \left\{ \left| \frac{i}{n} - F\left(x_{(i)}\right) \right|, \left| F\left(x_{(i)}\right) - \frac{i-1}{n} \right| \right\}.$$
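
The KS statistic above translates directly into code. The following Python sketch computes $D_n$ for any hypothesized CDF; it does not compute the p-value. The sample is hypothetical, and the Weibull parameters are the estimates reported for the 2-parameter Weibull in Table 1, used here only as an illustrative fitted CDF.

```python
import math

def ks_statistic(sample, cdf):
    """D_n: largest gap between the empirical CDF steps and the fitted CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, abs(i / n - f), abs(f - (i - 1) / n))
    return d

def weibull_cdf(x, alpha, beta):
    """CDF of the 2-parameter Weibull distribution."""
    return 1.0 - math.exp(-(x / beta) ** alpha)

# Hypothetical daily mean wind speeds in m/s (NOT the data used in the paper).
sample = [0.6, 0.9, 1.1, 1.3, 1.4, 1.6, 1.9, 2.3]
d_n = ks_statistic(sample, lambda x: weibull_cdf(x, 2.5009, 1.4954))
```

Both gaps are checked at each order statistic because the empirical distribution function jumps there, so the largest deviation can occur just before or just after the jump.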

$\chi^2$-test: The $\chi^2$-test is applied to binned data. This test is valid for large samples. The expected
frequency in each bin should be at least 5. The $\chi^2$-test statistic is defined as

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},$$

where $O_i$ is the observed frequency for bin $i$ and $E_i$ is the expected frequency for bin $i$, $i = 1, 2, \dots, k$,
calculated by $E_i = n\left(F(a_i) - F(a_{i-1})\right)$, where $F(x)$ denotes the cumulative distribution function and
$a_{i-1}$, $a_i$ are the lower and upper limits for bin $i$.
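
A minimal Python sketch of the $\chi^2$ statistic for binned data follows. The bin edges, the observed counts and the uniform CDF in the usage lines are hypothetical and serve only to show the mechanics; the p-value computation is not included.

```python
def chi_square_statistic(observed, edges, cdf):
    """Pearson chi-square statistic for counts in bins [edges[i], edges[i+1])."""
    n = sum(observed)
    stat = 0.0
    for i, o in enumerate(observed):
        # Expected frequency: n times the probability mass of bin i under the CDF.
        e = n * (cdf(edges[i + 1]) - cdf(edges[i]))
        stat += (o - e) ** 2 / e
    return stat

# Hypothetical example: 60 observations in three bins, tested against a
# uniform distribution on [0, 3], so each bin expects 20 observations.
observed = [10, 20, 30]
edges = [0.0, 1.0, 2.0, 3.0]
stat = chi_square_statistic(observed, edges, lambda x: x / 3.0)
```

In this toy case the statistic is $(10-20)^2/20 + 0 + (30-20)^2/20 = 10$.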
In this study the p-value is used in hypothesis testing. The hypothesis $H_0$ is rejected at the chosen significance
level $\alpha$ if p-value $< \alpha$. When comparing two different distributions, the distribution with the higher
p-value is likely to fit better, regardless of the level of significance [10].
Akaike's information criterion: The AIC can be calculated using [14]

$$\mathrm{AIC} = -2 \ln L\left(x_1, x_2, \dots, x_n; \hat{\theta}\right) + 2m,$$

where $L\left(x_1, x_2, \dots, x_n; \hat{\theta}\right)$ is the maximized value of the likelihood function for the
estimated model, $m$ is the number of parameters to be estimated and $n$ is the number of observed data.
Bayesian information criterion: The BIC can be calculated using [15]

$$\mathrm{BIC} = -2 \ln L\left(x_1, x_2, \dots, x_n; \hat{\theta}\right) + m \ln n.$$

The AIC and the BIC are measures of the goodness of fit of an estimated statistical model. These criteria do not
represent tests of the model in the sense of hypothesis testing; rather, they are tools for model selection. The model
with the smaller values of the AIC and the BIC is the preferred model.
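
Both criteria are one-line computations once the maximized log-likelihood is known. A minimal Python sketch with hypothetical input values:

```python
import math

def aic(log_lik, m):
    """Akaike information criterion: -2 ln L + 2 m."""
    return -2.0 * log_lik + 2.0 * m

def bic(log_lik, m, n):
    """Bayesian information criterion: -2 ln L + m ln n."""
    return -2.0 * log_lik + m * math.log(n)

# Hypothetical values: maximized log-likelihood -10, 2 parameters, 100 observations.
example_aic = aic(-10.0, 2)
example_bic = bic(-10.0, 2, 100)
```

The two criteria differ only in the penalty per parameter, 2 for the AIC and $\ln n$ for the BIC, so the BIC penalizes additional parameters more strongly whenever $n \ge 8$.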

The coefficient of determination: The coefficient of determination can be calculated using

$$R^2 = \frac{\sum_{i=1}^{n} \left(\hat{F}(x_i) - \bar{F}\right)^2}{\sum_{i=1}^{n} \left(\hat{F}(x_i) - \bar{F}\right)^2 + \sum_{i=1}^{n} \left(F_n(x_i) - \hat{F}(x_i)\right)^2},$$

where $\hat{F}(x)$ is the estimated cumulative distribution function, $\bar{F} = \frac{1}{n}\sum_{i=1}^{n} \hat{F}(x_i)$
and $F_n(x)$ is the empirical distribution function. The range of $R^2$ is from 0 to 1. A larger value of $R^2$
indicates a better fit of the theoretical distribution to the wind speed data.
The root mean square error: The root mean square error can be calculated using

$$\mathrm{RMSE} = \left[\frac{1}{n}\sum_{i=1}^{n} \left(F_n(x_i) - \hat{F}(x_i)\right)^2\right]^{1/2}.$$

A lower value of the RMSE indicates a better fit of the theoretical distribution to the wind speed data.
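
The coefficient of determination and the RMSE both compare the empirical and the estimated distribution functions at the observations. A minimal Python sketch, where the two input lists are hypothetical values of $F_n(x_i)$ and $\hat{F}(x_i)$:

```python
import math

def r_squared(f_emp, f_fit):
    """R^2 from empirical CDF values f_emp and fitted CDF values f_fit."""
    n = len(f_fit)
    f_bar = sum(f_fit) / n
    explained = sum((fh - f_bar) ** 2 for fh in f_fit)
    residual = sum((fe - fh) ** 2 for fe, fh in zip(f_emp, f_fit))
    return explained / (explained + residual)

def rmse(f_emp, f_fit):
    """Root mean square error between empirical and fitted CDF values."""
    n = len(f_fit)
    return math.sqrt(sum((fe - fh) ** 2 for fe, fh in zip(f_emp, f_fit)) / n)

# Hypothetical CDF values at four observations (NOT from the paper).
f_emp = [0.25, 0.50, 0.75, 1.00]
f_fit = [0.20, 0.50, 0.80, 0.90]
r2 = r_squared(f_emp, f_fit)
err = rmse(f_emp, f_fit)
```

For these toy values the explained sum of squares is 0.30 and the residual sum of squares is 0.015, giving $R^2 = 0.30/0.315 \approx 0.952$ and RMSE $= \sqrt{0.015/4} \approx 0.061$.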

3. Results and discussion

Table 1. MLM estimates of the parameters and goodness of fit tests (test statistic with p-value in parentheses).

Distribution        | MLM estimates of the parameters                         | $\chi^2$-test   | KS test         | AIC     | BIC     | $R^2$  | RMSE
Weibull             | $\alpha = 2.5009$, $\beta = 1.4954$                     | 0.2167 (0.6415) | 0.0894 (0.9248) | 57.5254 | 60.1934 | 0.9781 | 0.0447
3-parameter Weibull | $\alpha = 3.4996$, $\beta = 1.9692$, $\theta = -0.4383$ | 0.5091 (0.4755) | 0.0773 (0.9925) | 58.5468 | 62.8488 | 0.9860 | 0.0357
Gamma               | $\alpha = 3.9409$, $\beta = 0.3375$                     | 0.9901 (0.6096) | 0.1442 (0.5569) | 61.5560 | 64.4240 | 0.9510 | 0.0643
Lognormal           | $\mu = 0.1530$, $\sigma = 0.5915$                       | 6.5611 (0.0873) | 0.1707 (0.3289) | 67.9062 | 70.7742 | 0.9131 | 0.0819

Table 1 shows the parameter estimates obtained for the considered distributions, the test statistics and p-values of
the goodness of fit tests, and the values of the model selection criteria. Based on the p-values, none of the four
distributions is rejected at the significance level $\alpha = 0.05$, so all four provide an acceptable fit to the wind
speed data. The values of RMSE, AIC and BIC for the 3-parameter Weibull and the 2-parameter Weibull distributions are
smaller than those for the Gamma and the Lognormal distributions. The values of $R^2$ for the 3-parameter Weibull and
the 2-parameter Weibull distributions are greater than 0.97, which indicates that these two distributions fit the wind
speed data very well. In terms of $R^2$ and RMSE the 3-parameter Weibull distribution performs best. In terms of AIC
and BIC the 2-parameter Weibull distribution performs best. The Lognormal distribution provides the poorest fit to the
wind speed data; the Gamma distribution performs better.
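
The ranking described above can be reproduced from the reported criteria alone. In the following Python sketch the numbers are copied from Table 1; the dictionary layout and variable names are illustrative.

```python
# Values copied from Table 1: (AIC, BIC, R2, RMSE) per distribution.
table1 = {
    "Weibull": (57.5254, 60.1934, 0.9781, 0.0447),
    "3-parameter Weibull": (58.5468, 62.8488, 0.9860, 0.0357),
    "Gamma": (61.5560, 64.4240, 0.9510, 0.0643),
    "Lognormal": (67.9062, 70.7742, 0.9131, 0.0819),
}

# Smaller is better for AIC, BIC and RMSE; larger is better for R2.
best_aic = min(table1, key=lambda d: table1[d][0])
best_bic = min(table1, key=lambda d: table1[d][1])
best_r2 = max(table1, key=lambda d: table1[d][2])
best_rmse = min(table1, key=lambda d: table1[d][3])
worst_rmse = max(table1, key=lambda d: table1[d][3])
```

With these values, `best_aic` and `best_bic` select the 2-parameter Weibull, while `best_r2` and `best_rmse` select the 3-parameter Weibull.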

Fig. 1 illustrates how well the considered probability distributions describe the wind speed data.

Fig. 1. Comparison of probability density functions and wind speed histogram.

4. Conclusion

Based on the results of the goodness of fit tests and the model selection criteria, the 3-parameter Weibull and the
2-parameter Weibull distributions perform best compared to the Gamma and Lognormal distributions. They can be used as
distributions that adequately describe the considered wind speed data in Dolný Hričov.

References

[1] J. N. Celik, A statistical analysis of wind power density based on the Weibull and Rayleigh models at the southern region of Turkey, Renewable
Energy, 29 (2003), 593-604.
[2] I.Y.F Lun, J. C. Lam, A study of Weibull parameters using long-term wind observations, Renewable Energy, vol. 20, (2000), 145-153.
[3] P. Ramírez, J. A. Carta, Influence of the data sampling interval in the estimation of parameters of the Weibull wind speed probability density
distribution a case study, Energy Conversion and Management, 46 (2005), 2419–2438.
[4] J. W. Seguro, T. W. Lambert, Modern estimation of the parameters of the Weibull speed distribution for wind energy analysis, Journal of Wind
Engineering and Industrial Aerodynamics, 85 (2000), 75-84.
[5] J. M. Stevens, P. T. Smulders, The estimation of the parameters of the Weibull wind speed distribution for wind energy utilization purposes,
Wind Engineering, vol 3(2), (1979), 132-45.
[6] J. A. Carta, P. Ramírez, Analysis of two-component mixture Weibull statistics for estimation of wind speed distributions, Renewable Energy,
32 (3) (2007), 518-531.
[7] A. Z. Dhunny, R. M. Lollchund, R. Boojhawon, S. D. D. V. Rughooputh, Statistical Modeling of Wind Speed data for Mauritius, International
Journal of Renewable Energy Research, 4 (4) (2014), 1056-1064.
[8] R. Kollu, S. R. Rayapudi, S. V. L. Narasimham, K. M. Pakkurthi, Mixture probability distribution functions to model wind speed distributions,
International Journal of Energy and Environmental Engineering, Vol. 3, article :27 (2012).
[9] E. C. Morgan, M. Lackner, R. M. Vogel, L. G. Baise, Probability distributions for offshore wind speeds, Energy Conversion and Management,
55 (2011), 15-26.
[10] V. Yilmaz, H. E. Celik, A statistical approach to estimate the wind speed distribution A case of Gelibogu region, Dogus Universiteti Dergisi,
9 (1) (2008), 122-132.
[11] J. Zhou, E. Erdem, G. Li, J. Shi, Comprehensive evaluation of wind speed distribution models: A case study for North Dakota sites, Energy
Conversion and Management, 51 (2010), 1449-1458.
[12] I. Pobočíková, Z. Sedliačková, J. Šimon, Statistical analysis of wind speed data based on Weibull and Rayleigh distribution, Communications-
Scientific Letters of the University of Žilina, Vol. 16, No. 3a (2014), 136-141.
[13] J. A. Carta, P. Ramírez, S. Velazques, A review of wind speed probability distributions used in wind energy analysis: Case studies in the
Canary Islands, Renewable and Sustainable Energy Reviews, 13 (2009), 933-955.
[14] H. Akaike, A new look at the statistical model identification, Automatic Control, IEEE Transactions on, 19 (6) (1974), 716-723.
[15] G. Schwarz, Estimating the dimension of a model, Annals of Statistics, 6 (2) (1978), 461-464.
