
Vol. 7, 2022-15

Numerical Determination of the Probability Density of Functions of Randomistic Variables

Hugo Hernandez
ForsChem Research, 050030 Medellin, Colombia
hugo.hernandez@forschem.org

doi: 10.13140/RG.2.2.13204.99206

Abstract

The change of variable theorem is a useful equation for analytically determining the resulting
probability density of an arbitrary function of one or more independent randomistic variables.
The term randomistic may represent purely deterministic variables, purely random variables, or
a combination of both. The change of variable theorem requires an inverse of the original
function where one of the independent variables is explicitly solved in terms of all other
variables. Unfortunately, this is not always the case, and therefore the analytical change of
variable theorem cannot be used in those situations. In addition, when two or more
independent variables are involved, the analytical change of variable theorem requires solving
one or more definite integrals, and there are situations where the integrals cannot be
expressed as known analytical functions. In this report, a numerical version of the change of
variable theorem is presented for obtaining the probability density of a function, when the
analytical change of variable theorem does not succeed or cannot be used. The proposed
method is implemented in R language, and its use is illustrated with several examples. Most
examples considered are comparative, where the exact analytical solution is known, in order to
validate the performance of the method. Similitude percentages above 95% were obtained. Of
course, both the accuracy and computational demand of the numerical method will strongly
depend on the step sizes and tolerance considered. Additional examples are included to show
that numerical solutions are possible even when the analytical method fails.

Keywords

Change of Variable Theorem, Cumulative Probability, Distribution Functions, Monte Carlo,
Nonlinear Functions, Numerical Methods, Probability Density, Randomistics, Similitude

Cite as: Hernandez, H. (2022). Numerical Determination of the Probability Density of Functions of
Randomistic Variables. ForsChem Research Reports, 7, 2022-15, 1 - 40. doi: 10.13140/RG.2.2.13204.99206
Publication Date: 03/10/2022.
Numerical Determination of the Probability
Density of Functions of Randomistic Variables
Hugo Hernandez
ForsChem Research
hugo.hernandez@forschem.org

1. Introduction

A randomistic variable is a linear combination of a deterministic variable and a pure random
variable [1]. A continuous randomistic variable can be described by its probability density
function (pdf), and a discrete one by a finite probability density function [2]. If the pdf of a
randomistic variable is known, all its properties can be determined, including the expected
value, variance (and standard deviation), moments, quantiles (including the median), and mode.
If one or more randomistic variables with known pdf are transformed by an arbitrary function,
the pdf of the resulting variable can be determined using the Change of Variable Theorem [3].
That is, a mathematical transformation of randomistic variables is also randomistic. Even if the
pdfs of the original randomistic variables are relatively simple functions (e.g. uniform
distributions), an analytical expression for the pdf of the transformation can be difficult, and
in some cases impossible, to obtain. Perhaps for this reason the change of variable theorem is
seldom used for solving science and engineering problems involving randomistic variables.

The purpose of this report is the introduction of a simple numerical approach for determining
the pdf of an arbitrary function of one or more randomistic variables with known pdf, and using
such numerical pdf for determining any property of the resulting variable (expected value,
variance, moments, etc.). The derivation of this numerical approach is explained in Section 2
considering functions of single randomistic variables. In Section 3, the general numerical
method for nonlinear functions of multiple randomistic variables is presented. The
implementation of the method in R language (https://cran.r-project.org/) is shown in Section 4.
Finally, some illustrative examples are considered in Section 5.

2. Functions of a Single Randomistic Variable

Let us consider an arbitrary randomistic variable $X$ with a probability density function (pdf)
given by the mathematical function $\rho_X(x)$ with the following properties:

$$\rho_X(x) \geq 0 \tag{2.1}$$

$$\int_{-\infty}^{\infty} \rho_X(x)\,dx = 1 \tag{2.2}$$

Let us now consider the following arbitrary nonlinear transformation of $X$:

$$Y = f(X) \tag{2.3}$$

where $Y$ is also a randomistic variable, with pdf given by [3]:

03/10/2022 ForsChem Research Reports Vol. 7, 2022-15


10.13140/RG.2.2.13204.99206 www.forschem.org (2 / 40)

$$\rho_Y(y) = \sum_i \rho_X\left(f_i^{-1}(y)\right) \left| \frac{d f_i^{-1}(y)}{dy} \right| \tag{2.4}$$

$f^{-1}$ represents the inverse function of $f$, corresponding to (from Eq. 2.3):

$$X = f^{-1}(Y) \tag{2.5}$$

Since the nonlinear function $f$ can be non-injective (multiple values of $X$ yield the same
value of $Y$), the inverse function may result in multiple solutions. Thus, the subscript $i$
represents each of the possible values of $f^{-1}(y)$.
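As a worked illustration of Eq. (2.4), the sketch below applies the theorem to a case where the inverse is known. The choice of $Y = X^2$ with $X$ standard normal is an assumption made here for illustration; the names `rho_x` and `rho_y` are hypothetical. The function is non-injective, so both branches of the inverse contribute.

```r
# Eq. (2.4) applied to Y = X^2 with X standard normal (illustrative choice).
# The two inverse branches are x = +sqrt(y) and x = -sqrt(y), each with
# |d f^{-1} / dy| = 1 / (2 * sqrt(y)).
rho_x <- dnorm
rho_y <- function(y) {
  branch_pos <- rho_x(sqrt(y)) / (2 * sqrt(y))
  branch_neg <- rho_x(-sqrt(y)) / (2 * sqrt(y))
  branch_pos + branch_neg
}
y   <- c(0.1, 0.5, 1, 2, 5)
cvt <- rho_y(y)
ref <- dchisq(y, df = 1)   # known pdf of the square of a standard normal
print(cbind(y, cvt, ref))
```

The two columns should agree, since the square of a standard normal variable follows a chi-square distribution with one degree of freedom.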

Eq. (2.4) is difficult to evaluate when the function is not analytically invertible. For example, if
we consider the relatively simple function:

( )
(2.6)

then we find that the corresponding inverse function cannot be described with known
analytical functions, and much less can we determine its derivative.

One possible approach is approximating the inverse using a series expansion about a point $x_0$ as
follows [4]:

( ) ( ) ( ) ( ) ( )
( ( )) ( ( ))
( ) ( ) ( )
( ) ( ) ( ) ( ) ( )( ) ( )
( ( ))
( )
( ( ))
( )

(2.7)
where

( ( )) ( ) ( ( )) ( )

( )
( )
(2.8)

This method requires that $f'(x_0) \neq 0$, and for certain functions many terms of the
polynomial are needed in order to provide a good representation of the inverse.
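To make the limitation concrete, here is a minimal sketch of a second-order series reversion compared against a numerical root finder. The test function $f(x) = x + e^x$, the expansion point, and all names are assumptions chosen for illustration; this is not the function of Eq. (2.6), and only the first two terms of the standard reversion series are used.

```r
# Hypothetical test function (NOT the function of Eq. 2.6): f(x) = x + exp(x)
f   <- function(x) x + exp(x)
df  <- function(x) 1 + exp(x)      # first derivative f'(x)
d2f <- function(x) exp(x)          # second derivative f''(x)
x0 <- 0
y0 <- f(x0)
# Second-order series reversion of f about x0 (requires f'(x0) != 0)
f_inv_series <- function(y) {
  x0 + (y - y0) / df(x0) - d2f(x0) / (2 * df(x0)^3) * (y - y0)^2
}
# Reference inverse obtained with a numerical root finder
f_inv_root <- function(y) {
  uniroot(function(x) f(x) - y, interval = c(-5, 5), tol = 1e-10)$root
}
y <- 1.2
approx_x <- f_inv_series(y)
exact_x  <- f_inv_root(y)
print(c(series = approx_x, root = exact_x))
```

Close to `y0` the truncated series tracks the root well; farther away more terms would be needed, which is the drawback noted above.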


Let us now consider a numerical alternative to obtain $\rho_Y(y)$ using Eq. (2.4). First of all,
notice that Eq. (2.2) can only be valid as long as:

$$\lim_{x \to -\infty} \rho_X(x) = \lim_{x \to \infty} \rho_X(x) = 0 \tag{2.9}$$

otherwise the integral in Eq. (2.2) would not be finite. Thus, we can impose arbitrary limits on
the randomistic variable ($X \in [x_{min}, x_{max}]$) such that:

$$\int_{x_{min}}^{x_{max}} \rho_X(x)\,dx \geq 1 - \epsilon \tag{2.10}$$

where $\epsilon$ represents a small tolerance in the error for the total probability of the
variable. If Eq. (2.10) is not satisfied, the bounds of $X$ must be expanded until it becomes
valid. The integral in Eq. (2.10) is evaluated numerically using a small resolution $\Delta x$.
In addition, a resolution in $Y$ will also be required ($\Delta y$). $\epsilon$, $\Delta x$, and
$\Delta y$ are input parameters of the numerical method. As these parameters become smaller, the
accuracy of the method improves, but the computational demand increases. A trade-off between
accuracy and computational cost is always required. Also, as with any numerical method, the
results obtained will depend strongly on the parameter values used.
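The bound-expansion idea can be sketched as follows. This is a simplified illustration under stated assumptions: the interval is widened symmetrically by doubling its width, whereas the paper's own implementation (PDvector, Section 4.2) extends each side separately. The names `trap_prob`, `expand_bounds`, `eps` and `max_iter` are hypothetical.

```r
# Trapezoidal estimate of the total probability on [lo, hi] with step dx
trap_prob <- function(rho, lo, hi, dx) {
  x <- seq(lo, hi, by = dx)
  y <- rho(x)
  sum(y) * dx - (y[1] + y[length(y)]) * dx / 2
}
# Double the interval width (symmetrically) until the condition of Eq. (2.10) holds
expand_bounds <- function(rho, lo = -1, hi = 1, dx = 0.01, eps = 1e-6, max_iter = 50) {
  for (k in seq_len(max_iter)) {
    if (1 - trap_prob(rho, lo, hi, dx) <= eps) break
    half <- (hi - lo) / 2
    lo <- lo - half
    hi <- hi + half
  }
  c(lo, hi)
}
lims <- expand_bounds(dnorm)
print(lims)
```

A smaller `eps` or a heavier-tailed density leads to wider limits and therefore to more grid points for the same `dx`.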

For each value $x_k$ of $X$ considered within the bounds $[x_{min}, x_{max}]$ (using the
resolution $\Delta x$) we must determine:

• The value of the probability density function $\rho_X(x_k)$.
• The value of the function $y_k = f(x_k)$.
• The numerical derivative of the function,
  $f'(x_k) \approx \frac{f(x_k + \Delta x) - f(x_k - \Delta x)}{2 \Delta x}$. A central finite
  difference is considered, except for the extreme values where forward and backward
  finite differences are used for the lower and upper bound, respectively. If the absolute
  value of the numerical derivative is less than a small tolerance $\delta$, then we may set
  $|f'(x_k)| = \delta$ in order to avoid handling extremely large values of the probability
  density function of $Y$. For simplicity, we may assume that $\delta = \epsilon$.
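The derivative step in the list above can be sketched as a small helper. The function name `num_deriv` and the argument `delta` are hypothetical; the sketch returns the absolute value of the derivative, already floored at the tolerance, since that is the quantity that enters the density calculation.

```r
# Absolute finite-difference derivative of f on a uniform grid x, floored at delta
num_deriv <- function(f, x, delta = 1e-6) {
  n  <- length(x)
  dx <- x[2] - x[1]
  fx <- f(x)
  d  <- numeric(n)
  d[1] <- (fx[2] - fx[1]) / dx                            # forward difference (lower bound)
  d[n] <- (fx[n] - fx[n - 1]) / dx                        # backward difference (upper bound)
  if (n > 2) {
    d[2:(n - 1)] <- (fx[3:n] - fx[1:(n - 2)]) / (2 * dx)  # central differences
  }
  pmax(abs(d), delta)                                     # avoid division by (near) zero later
}
x <- seq(-1, 1, by = 0.5)
d <- num_deriv(function(x) x^2, x)
print(d)
```

At `x = 0` the derivative of `x^2` vanishes, so the returned value there is the floor `delta` rather than zero.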

If the value $y_k = f(x_k)$ is obtained for the first time (within the resolution $\Delta y$
considered), then:

$$\rho_Y(y_k) = \frac{\rho_X(x_k)}{\left| f'(x_k) \right|} \tag{2.11}$$


If the value $y_k$ (considering the resolution $\Delta y$) is obtained again, the probability
density must be accumulated:

$$\rho_Y(y_k) \leftarrow \rho_Y(y_k) + \frac{\rho_X(x_k)}{\left| f'(x_k) \right|} \tag{2.12}$$

Finally, the probability density values are normalized in order to strictly satisfy Eq. (2.2). The
vector of numerical values obtained for $\rho_Y$ can be used to determine the properties of $Y$:
expected value, variance, moments, etc. They can also be numerically integrated to yield the
cumulative probability function of $Y$. The vector obtained for $\rho_Y$ can then be
approximated by a mathematical function using splines [5,6].
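The post-processing described in this paragraph can be sketched with base R. The density vector below is only a stand-in (an exponential density sampled on a grid), and the helper `trapz` and all variable names are hypothetical; the spline uses `splinefun` from the base `stats` package.

```r
# Stand-in for a numerical result: density values sampled on a uniform grid
y    <- seq(0, 20, by = 0.01)
rhoy <- dexp(y, rate = 2)                 # illustrative values only
trapz <- function(x, v) sum((v[-1] + v[-length(v)]) / 2 * diff(x))
rhoy <- rhoy / trapz(y, rhoy)             # normalise so that the total probability is 1
mean_y <- trapz(y, y * rhoy)              # expected value
var_y  <- trapz(y, (y - mean_y)^2 * rhoy) # variance
# Cumulative probability by running trapezoidal integration
cdf_y  <- c(0, cumsum((rhoy[-1] + rhoy[-length(rhoy)]) / 2 * diff(y)))
# Continuous approximation of the density vector by a cubic spline
rho_spline <- splinefun(y, rhoy, method = "natural")
print(c(mean = mean_y, variance = var_y, rho_at_0.505 = rho_spline(0.505)))
```

For this stand-in the mean and variance should come out close to 0.5 and 0.25, the known values for an exponential variable with rate 2.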

3. Nonlinear Functions of Multiple Randomistic Variables

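An arbitrary function of multiple randomistic variables can be described in general as follows:

$$Y = f(X_1, X_2, \ldots, X_n) \tag{3.1}$$

where the probability density function (pdf) of each randomistic variable $X_j$ in the function
($\rho_{X_j}$) is known.

In this case, the pdf of the function, obtained by the change of variable theorem [3], is:

$$\rho_Y(y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \prod_{j=2}^{n} \left( \rho_{X_j}(x_j)\,dx_j \right) \sum_i \rho_{X_1}\left( f_i^{-1}(y, x_2, \ldots, x_n) \right) \left| \frac{\partial f_i^{-1}(y, x_2, \ldots, x_n)}{\partial y} \right| \tag{3.2}$$

where the inverse function is obtained by solving Eq. (3.1) for $X_1$:

$$X_1 = f^{-1}(Y, X_2, \ldots, X_n) \tag{3.3}$$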

When multiple randomistic variables are considered, the inversion of Eq. (3.1) might become
more difficult, particularly for certain variables. However, if at least one variable can be
explicitly solved, then the inversion problem can be solved.
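A simple case where one variable can be solved explicitly is the sum of two variables. The sketch below assumes $Y = X_1 + X_2$ with both variables standard uniform (an illustrative choice, not an example taken from the report), so that $x_1 = y - x_2$ and the derivative of the inverse with respect to $y$ is 1. The change of variable theorem then reduces to a convolution integral, evaluated here with the base R function `integrate`.

```r
# Change of variable theorem for Y = X1 + X2 with X1, X2 ~ U(0,1) (illustrative choice).
# Solving for X1 gives x1 = y - x2, so |d f^{-1} / dy| = 1.
rho_y <- function(y) {
  integrand <- function(x2) dunif(x2) * dunif(y - x2)
  integrate(integrand, lower = 0, upper = 1)$value
}
tri <- function(y) ifelse(y < 0 | y > 2, 0, 1 - abs(y - 1))  # exact triangular pdf
y   <- c(0.25, 0.5, 1.5, 1.75)
num <- sapply(y, rho_y)
print(cbind(y, numerical = num, exact = tri(y)))
```

For functions where the remaining integrals have no closed form, the same integral has to be evaluated numerically, which is the motivation for the grid-based method of this section.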


The main issue in the case of multiple variables is the analytical complexity involved in the
multiple integrals emerging in Eq. (3.2). For this reason, a numerical approach is also desirable.

The numerical evaluation of $\rho_Y$ in the multivariable case is similar to the single-variable
situation. First, each variable $X_j$ in the function is bounded satisfying the condition:

$$\int_{x_{j,min}}^{x_{j,max}} \rho_{X_j}(x_j)\,dx_j \geq 1 - \epsilon \tag{3.4}$$

The error tolerance $\epsilon$ is the same for all variables.

Each independent variable $X_j$ is partitioned in different levels between
$[x_{j,min}, x_{j,max}]$ using a step size or resolution $\Delta x_j$. Then, a multidimensional
grid is obtained by considering all possible combinations of the different levels of the
independent variables. Notice that the number of possible combinations increases exponentially
with the number of independent variables considered. This has an important effect on the
computational load of the method. For that reason, a maximum number of combinations to be
evaluated, $n_{max}$, can be defined. If the total number of combinations is less than or equal
to $n_{max}$, then all possible combinations are considered in the evaluation. If the total
number of combinations is more than $n_{max}$, then a random sample of $n_{max}$ combinations
is selected.
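A minimal sketch of the grid construction and of the random subsample follows. The level values, the limit `nmax = 50`, and the sampling scheme (drawing distinct combinations by index, without building the full grid) are assumptions made for illustration; the report does not specify how the random combinations are drawn.

```r
set.seed(1)   # only to make the random subsample reproducible
levels <- list(x1 = seq(-1, 1, by = 0.5),
               x2 = seq(0, 1, by = 0.25),
               x3 = seq(0, 2, by = 1))
n_levels <- unname(lengths(levels))
n_total  <- prod(n_levels)             # 5 * 5 * 3 = 75 combinations
nmax <- 50
if (n_total <= nmax) {
  grid <- expand.grid(levels)          # full factorial grid
} else {
  # Sample nmax distinct combinations by their linear index and decode
  # each index into one level position per variable
  idx  <- arrayInd(sample(n_total, nmax), n_levels)
  grid <- as.data.frame(Map(function(v, k) v[k], levels, as.data.frame(idx)))
}
print(dim(grid))
```

With three variables and five levels or fewer each, the full grid is small; with realistic step sizes the product of the level counts grows quickly, which is why a cap is useful.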

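For each particular combination $(x_1, x_2, \ldots, x_n)$ in the set of values considered we
must determine:

• The value of the probability density function of each independent variable,
  $\rho_{X_j}(x_j)$.
• The function value $y = f(x_1, x_2, \ldots, x_n)$.
• The numerical derivative of the function with respect to the first independent variable,
  $\frac{\partial f}{\partial x_1} \approx \frac{f(x_1 + \Delta x_1, x_2, \ldots, x_n) - f(x_1 - \Delta x_1, x_2, \ldots, x_n)}{2 \Delta x_1}$.
  If the absolute value of the numerical derivative is less than a small tolerance $\delta$,
  then we may set $\left| \partial f / \partial x_1 \right| = \delta$ in order to avoid handling
  extremely large values of the probability density function of $Y$.

If the value $y$ is obtained for the first time (within the resolution $\Delta y$ considered),
then:

$$\rho_Y(y) = \frac{\rho_{X_1}(x_1) \prod_{j=2}^{n} \rho_{X_j}(x_j)}{\left| \frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_1} \right|} \tag{3.5}$$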

If the value $y$ (considering the resolution $\Delta y$) is obtained again, the probability
density must be accumulated:

$$\rho_Y(y) \leftarrow \rho_Y(y) + \frac{\rho_{X_1}(x_1) \prod_{j=2}^{n} \rho_{X_j}(x_j)}{\left| \frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_1} \right|} \tag{3.6}$$

Finally, the probability density values are normalized in order to strictly satisfy Eq. (2.2).
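The whole multivariable procedure can be sketched in a few lines for a case with a known answer. The example assumes $Y = X_1 + X_2$ with standard uniform variables, a full grid, and hypothetical names throughout; it is a simplified illustration of the accumulation and normalization steps, not the report's EvalCVT function.

```r
dx <- 0.01; dy <- 0.05; delta <- 1e-6
f  <- function(x1, x2) x1 + x2                     # illustrative function
x1 <- seq(0, 1, by = dx)
x2 <- seq(0, 1, by = dx)
grid <- expand.grid(x1 = x1, x2 = x2)              # all combinations of levels
fy   <- f(grid$x1, grid$x2)
# Central difference with respect to the first variable, floored at delta
dfdx1 <- pmax(abs(f(grid$x1 + dx, grid$x2) - f(grid$x1 - dx, grid$x2)) / (2 * dx), delta)
w <- dunif(grid$x1) * dunif(grid$x2) / dfdx1       # contribution of each combination
y <- seq(0, 2, by = dy)
j <- round((fy - y[1]) / dy) + 1                   # nearest y level for each combination
# Accumulate the contributions that fall on the same y level
rhoy <- as.numeric(tapply(w, factor(j, levels = seq_along(y)), sum))
rhoy[is.na(rhoy)] <- 0                             # levels that were never reached
area <- sum(rhoy) * dy - (rhoy[1] + rhoy[length(y)]) * dy / 2
rhoy <- rhoy / area                                # normalisation (trapezoidal rule)
exact <- 1 - abs(y - 1)                            # triangular pdf of the sum
print(max(abs(rhoy - exact)))
```

The remaining discrepancy with the triangular density comes from the finite step sizes; reducing `dx` relative to `dy` makes it smaller at the cost of a larger grid.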

4. Algorithm Implementation

The numerical algorithm for determining the probability density and cumulative probability of
functions of randomistic variables by means of the Change of Variable Theorem was
implemented in R language (https://cran.r-project.org/). The R functions created for this
purpose are presented and briefly explained in this Section.

4.1. PEval

This function numerically evaluates the probability (P) of a variable with known probability
density function (rho) within a range of values (xlim), using an optional step size (dx). By
default, the step size is 1/1000 of the given range.

Usage: P=PEval(rho,xlim,dx)

Example: Evaluate the probability of a normal variable within 1 standard deviation about the
mean.
PEval(rho=dnorm,xlim=c(-1,1),dx=1e-4)
0.6826895

R Code:
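PEval<-function(rho,xlim,dx=NULL){
  # Default step size: 1/1000 of the given range
  if (is.null(dx)) dx=(xlim[2]-xlim[1])/1000
  n=round((xlim[2]-xlim[1])/dx)
  x=xlim[1]+(0:n)*dx
  y=rho(x)
  # Trapezoidal rule: full sum minus half of the two end terms
  P=sum(y*dx)-(y[1]+y[n+1])*dx/2
  return(P)
}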

4.2. PDvector

This function creates a discrete probability density vector (PDv) from a known probability
density function (rho) within an optional range of values (xlim) using an optional step size (dx).
The default range is [0,1], and the default step size is 1/1000 of the initial range. The PD vector
guarantees that the full distribution is covered within an optional tolerance (tol). The default
tolerance is 10^-6. If the total probability is less than 1 - tol, then the range of values is
automatically extended until the desired tolerance is reached.

Usage: PDv=PDvector(rho,xlim,dx,tol)

Example: Create a probability density vector for the standard normal random variable using a
step size of 1.
PDvector(rho=dnorm,dx=1)
Var pdf
1 -5 1.486720e-06
2 -4 1.338302e-04
3 -3 4.431848e-03
4 -2 5.399097e-02
5 -1 2.419707e-01
6 0 3.989423e-01
7 1 2.419707e-01
8 2 5.399097e-02
9 3 4.431848e-03
10 4 1.338302e-04
11 5 1.486720e-06
12 6 6.075883e-09
13 7 9.134720e-12
14 8 5.052271e-15

R Code:
PDvector<-function(rho,xlim=NULL,dx=NULL,tol=1e-6){
  if (is.null(xlim)) xlim=c(0,1)
  if (is.null(dx)) dx=(xlim[2]-xlim[1])/1000
  P=PEval(rho,xlim,dx)
  nlow=0.5
  nhigh=0.5
  # Widen the range until the covered probability is within tol of 1
  while ((1-P)>tol){
    nlow=2*nlow
    exitflag=0
    while (exitflag==0){
      # Try to extend the lower limit by nlow steps
      Pnew=PEval(rho,xlim=c(xlim[1]-nlow*dx,xlim[2]),dx)
      if ((Pnew-P)<tol){
        exitflag=1
      } else {
        xlim[1]=xlim[1]-nlow*dx
        P=Pnew
        nlow=2*nlow
        if ((1-Pnew)<tol){
          exitflag=1
        }
      }
      # Extend the upper limit, doubling the extension while it still adds probability.
      # This inner loop always ends with exitflag=1, so the enclosing loop runs once
      # per pass of the outer while loop.
      nhigh=2*nhigh
      exitflag=0
      while (exitflag==0){
        Pnew=PEval(rho,xlim=c(xlim[1],xlim[2]+nhigh*dx),dx)
        if ((Pnew-P)<tol){
          exitflag=1
        } else {
          xlim[2]=xlim[2]+nhigh*dx
          P=Pnew
          nhigh=2*nhigh
          if ((1-Pnew)<tol){
            exitflag=1
          }
        }
      }
    }
  }
  # round() guards against a non-integer count caused by floating-point division
  n=round((xlim[2]-xlim[1])/dx)
  Var=xlim[1]+(0:n)*dx
  pdf=rho(Var)
  PDv=data.frame(Var,pdf)
  return(PDv)
}

4.3. EValue

This function evaluates the expected value (E) of a function (fun) of a single variable with a
known probability density function (PDF). fun can be given as a single-variable R function, or as
a text describing the function of "x" in R language. PDF can be given as a function (rho) or as a
probability density vector (PDv) previously created with the function PDvector.

Usage: E=EValue(fun,PDF)

Examples:
Evaluate the expected value of the square standard normal distribution, using the probability
density as a function.
EValue(fun="x^2",PDF=dnorm)
0.9999944

Evaluate the same expected value again, but this time using a probability density vector with
step size 0.1.
EValue(fun="x^2",PDF=PDvector(dnorm,dx=0.1))
1

R Code:
EValue<-function(fun,PDF){
  if (is.function(PDF)){
    PDF=PDvector(PDF)
  }
  x=PDF$Var
  rho=PDF$pdf
  dx=x[2]-x[1]
  n=length(x)
  if (is.function(fun)==TRUE){
    y=fun(x)
  } else {
    # fun given as text: evaluate the expression with x bound to the grid values
    y=eval({x<-PDF$Var;parse(text=fun)})
  }
  # Trapezoidal rule for the integral of fun(x)*rho(x)
  E=sum(y*rho)*dx-(y[1]*rho[1]+y[n]*rho[n])*dx/2
  return(E)
}

4.4. CDvector

This function creates a full cumulative distribution vector (CDv) from a known probability
density function (PDF). PDF can be given as a function (rho) or as a probability density vector
(PDv) previously created with the function PDvector. When the function rho is used, the values
for the initial limits (xlim), step size (dx) and tolerance (tol) can be optionally introduced.

Usage:
CDv=CDvector(PDF=rho,xlim,dx,tol)
CDv=CDvector(PDF=PDv)

Example: Obtain the cumulative probability vector from the probability density vector of the
standard normal distribution with a step size of 1.
CDvector(PDF=PDvector(dnorm,dx=1))
Var cdf
1 -5 7.433598e-07
2 -4 6.840183e-05
3 -3 2.351241e-03
4 -2 3.156265e-02
5 -1 1.795435e-01
6 0 5.000000e-01
7 1 8.204565e-01
8 2 9.684373e-01
9 3 9.976488e-01
10 4 9.999316e-01
11 5 9.999993e-01
12 6 1.000000e+00
13 7 1.000000e+00
14 8 1.000000e+00

R Code:
CDvector<-function(PDF,xlim=NULL,dx=NULL,tol=1e-6){
  if (is.function(PDF)){
    PDF=PDvector(PDF,xlim,dx,tol)
  }
  Var=PDF$Var
  dx=Var[2]-Var[1]
  pdf=PDF$pdf
  cdf=0*pdf
  cdf[1]=pdf[1]*dx/2
  # Running trapezoidal integration of the density
  for (i in 2:length(Var)){
    cdf[i]=cdf[i-1]+(pdf[i-1]+pdf[i])*dx/2
  }
  # Rescale so that the last value is exactly 1
  cdf=cdf/max(cdf)
  CDv=data.frame(Var,cdf)
  return(CDv)
}
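
The accumulation loop in CDvector can also be written without an explicit loop. The following is a sketch of an equivalent vectorised form using `cumsum`; the helper name `cum_trapz` is hypothetical, and the check uses the same grid as the example above (integers from -5 to 8).

```r
# Vectorised trapezoidal accumulation (sketch equivalent to the loop in CDvector)
cum_trapz <- function(Var, pdf) {
  dx  <- Var[2] - Var[1]
  cdf <- pdf[1] * dx / 2 + c(0, cumsum((pdf[-1] + pdf[-length(pdf)]) * dx / 2))
  cdf / max(cdf)                     # rescale so that the last value is exactly 1
}
Var <- -5:8
cdf <- cum_trapz(Var, dnorm(Var))
print(data.frame(Var, cdf))
```

For long vectors the vectorised form avoids the per-element overhead of the R loop while producing the same values.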

4.5. xfun

This is a family of functions used to obtain specific values from a particular distribution. The
functions are: dfun (probability density value), pfun (cumulative probability value), qfun
(quantile value), rfun (random value). These functions evaluate the corresponding value (d, p,
q, r) from a known probability distribution (PDF). PDF can be given as a function (rho) or as a
probability density vector (PDv) previously created with the function PDvector. If the function
rho is used, the initial limits (xlim), step size (dx) and tolerance (tol) are set as default.
Functions pfun, qfun and rfun use the function CDvector.

Usage:
d=dfun(q,PDF)
p=pfun(q,PDF)
q=qfun(p,PDF)
r=rfun(n,PDF)

Examples:
Evaluate the probability density function of the standard normal distribution at a value of π/2.
dfun(q=pi/2,PDF=dnorm)
0.1161772

Evaluate the cumulative probability of the standard normal distribution at π/2.
pfun(q=pi/2,PDF=dnorm)
0.9418852

Evaluate the inverse cumulative probability of the standard normal distribution at a value of 1/3.
qfun(p=1/3,PDF=dnorm)
-0.4307275

Generate 5 random numbers from a probability density vector of the standard normal
distribution (specific values will change every time the function is evaluated).
rfun(n=5,PDF=dnorm)
1.46327002 -1.24235830 -0.02832977 1.50078946 -0.35320490

R Code:
dfun<-function(q,PDF){
  if (is.function(PDF)){
    PDF=PDvector(PDF)
  }
  # Linear interpolation of the density vector at q
  d=approx(PDF[,1],PDF[,2],xout=q)$y
  return(d)
}

pfun<-function(q,PDF){
  if (is.function(PDF)){
    PDF=PDvector(PDF)
  }
  CDF=CDvector(PDF)
  # Linear interpolation of the cumulative vector at q
  p=approx(CDF[,1],CDF[,2],xout=q)$y
  return(p)
}

qfun<-function(p,PDF){
  if (is.function(PDF)){
    PDF=PDvector(PDF)
  }
  CDF=CDvector(PDF)
  # Inverse interpolation: cumulative probability -> variable value
  q=approx(CDF[,2],CDF[,1],xout=p)$y
  return(q)
}

rfun<-function(n,PDF){
  if (is.function(PDF)){
    PDF=PDvector(PDF)
  }
  CDF=CDvector(PDF)
  # Uniform random probabilities mapped through the inverse cumulative vector
  rp=runif(n)
  r=approx(CDF[,2],CDF[,1],xout=rp)$y
  return(r)
}
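
The random generator above relies on inverse-transform sampling by interpolation. The sketch below isolates that idea in a self-contained form; the grid limits, the use of `pnorm` as a stand-in for a numerically built cumulative vector, and the sample size are assumptions made for illustration.

```r
set.seed(42)                              # reproducibility of this illustration only
Var <- seq(-6, 6, by = 0.01)
cdf <- pnorm(Var)                         # stand-in for a numerically built CDF vector
u   <- runif(10000)                       # uniform probabilities
r   <- approx(cdf, Var, xout = u)$y       # inverse-transform sampling by interpolation
# approx() returns NA for probabilities outside the tabulated range of cdf
print(c(mean = mean(r, na.rm = TRUE), sd = sd(r, na.rm = TRUE)))
```

Because `approx` does not extrapolate by default, draws that fall beyond the tabulated tails come back as `NA`; a wider grid or a smaller tolerance makes this rarer.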

4.6. PDSummary

This function summarizes the main properties of a variable with known probability density
function (PDF). PDF can be given as a function (rho) or as a probability density vector (PDv)
previously created with the function PDvector. When the function rho is used, the values for
the initial limits (xlim), step size (dx) and tolerance (tol) can be optionally introduced. The
properties include: Variance, standard deviation, raw moments, central moments, skewness,
kurtosis, quartiles, median and mode.
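
For reference, the moment-based quantities in this list can be written as follows. The definitions mirror the R code given for this subsection (the kurtosis is the excess kurtosis, with 3 subtracted); the symbols $M_n$, $\mu_n$, $\mathrm{Sk}$ and $\mathrm{Ku}$ are notation chosen here, and the integrals are evaluated by the trapezoidal rule over the bounded range of the density vector.

```latex
M_n = \int x^n \, \rho_X(x)\,dx
\qquad
\mu_n = \int (x - M_1)^n \, \rho_X(x)\,dx

\sigma^2 = \mu_2
\qquad
\mathrm{Sk} = \frac{\mu_3}{\sigma^3}
\qquad
\mathrm{Ku} = \frac{\mu_4}{\sigma^4} - 3
```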

Usage:
OUT=PDSummary(PDF,moments)
Mn=rawmoment(n,PDF)
Mcn=centralmoment(n,PDF)
Var=PDFVar(PDF)
sigma=PDFsd(PDF)
Sk=Skewness(PDF)
Ku=Kurtosis(PDF)
Qn=PDFQ(n,PDF)
median=PDFmedian(PDF)
Mode=PDFmode(PDF)


Example: Determine the main properties of the standard normal distribution.


OUT=PDSummary(dnorm)
PDF Properties Summary
Central Tendency and Location Measures:
Mean: -7.870456e-07
Median: -1.889555e-07
Mode: 0
Q1: -0.67449
Q3: 0.6744895
Dispersion Measures:
IQ Range: 1.348979
Variance: 0.9999944
St.Dev.: 0.9999972
Coeff.Var.: -127057087 %
Other Measures:
Skewness: -1.912881e-05
Kurtosis: -0.0001266078
Distribution Moments:
Moment Raw Central
1 1 -7.870456e-07 -1.558859e-13
2 2 9.999944e-01 9.999944e-01
3 3 -2.148978e-05 -1.912865e-05
4 4 2.999840e+00 2.999840e+00
5 5 -5.875848e-04 -5.757798e-04

R Code:
rawmoment<-function(n,PDF,xlim=NULL,dx=NULL,tol=1e-6){
  if (is.function(PDF)){
    PDF=PDvector(PDF,xlim,dx,tol)
  }
  x=PDF$Var
  rho=PDF$pdf
  dx=x[2]-x[1]
  nx=length(x)
  y=x^n
  # Trapezoidal rule for the n-th raw moment
  M=sum(y*rho)*dx-(y[1]*rho[1]+y[nx]*rho[nx])*dx/2
  return(M)
}

centralmoment<-function(n,PDF,xlim=NULL,dx=NULL,tol=1e-6){
  if (is.function(PDF)){
    PDF=PDvector(PDF,xlim,dx,tol)
  }
  x=PDF$Var
  rho=PDF$pdf
  dx=x[2]-x[1]
  nx=length(x)
  # Expected value, then the n-th moment about it
  E=sum(x*rho)*dx-(x[1]*rho[1]+x[nx]*rho[nx])*dx/2
  y=(x-E)^n
  M=sum(y*rho)*dx-(y[1]*rho[1]+y[nx]*rho[nx])*dx/2
  return(M)
}

PDFVar<-function(PDF,xlim=NULL,dx=NULL,tol=1e-6){
  if (is.function(PDF)){
    PDF=PDvector(PDF,xlim,dx,tol)
  }
  V=centralmoment(2,PDF)
  return(V)
}

PDFsd<-function(PDF,xlim=NULL,dx=NULL,tol=1e-6){
  if (is.function(PDF)){
    PDF=PDvector(PDF,xlim,dx,tol)
  }
  sigma=sqrt(PDFVar(PDF))
  return(sigma)
}

PDFQ<-function(n,PDF,xlim=NULL,dx=NULL,tol=1e-6){
  if (is.function(PDF)){
    PDF=PDvector(PDF,xlim,dx,tol)
  }
  # n-th quartile: cumulative probability n/4
  p=n/4
  Q=qfun(p,PDF)
  return(Q)
}

PDFmedian<-function(PDF,xlim=NULL,dx=NULL,tol=1e-6){
  if (is.function(PDF)){
    PDF=PDvector(PDF,xlim,dx,tol)
  }
  med=qfun(0.5,PDF)
  return(med)
}

PDFmode<-function(PDF,xlim=NULL,dx=NULL,tol=1e-6){
  if (is.function(PDF)){
    PDF=PDvector(PDF,xlim,dx,tol)
  }
  # Grid value(s) where the density vector is largest
  Mode=PDF$Var[which(PDF$pdf==max(PDF$pdf))]
  return(Mode)
}

Skewness<-function(PDF,xlim=NULL,dx=NULL,tol=1e-6){
  if (is.function(PDF)){
    PDF=PDvector(PDF,xlim,dx,tol)
  }
  Sk=centralmoment(3,PDF)/(PDFsd(PDF)^3)
  return(Sk)
}

Kurtosis<-function(PDF,xlim=NULL,dx=NULL,tol=1e-6){
  if (is.function(PDF)){
    PDF=PDvector(PDF,xlim,dx,tol)
  }
  # Excess kurtosis (zero for a normal distribution)
  Ku=centralmoment(4,PDF)/(PDFVar(PDF)^2)-3
  return(Ku)
}

PDSummary<-function(PDF,moments=5,xlim=NULL,dx=NULL,tol=1e-6){
  # Silence warnings during the evaluation; the previous setting is restored at the end
  defaultW <- getOption("warn")
  options(warn = -1)
  if (is.function(PDF)){
    PDF=PDvector(PDF,xlim,dx,tol)
  }
  Mean=EValue("x",PDF)
  Median=PDFmedian(PDF)
  Mode=PDFmode(PDF)
  Q1=PDFQ(1,PDF)
  Q3=PDFQ(3,PDF)
  cat("PDF Properties Summary","\n","\n")
  cat("Central Tendency and Location Measures:","\n")
  cat(" Mean: ",Mean,"\n")
  cat(" Median: ",Median,"\n")
  cat(" Mode: ",Mode,"\n")
  cat(" Q1: ",Q1,"\n")
  cat(" Q3: ",Q3,"\n","\n")
  Var=PDFVar(PDF)
  Sd=PDFsd(PDF)
  Cv=100*Sd/Mean
  IQR=Q3-Q1
  cat("Dispersion Measures:","\n")
  cat(" IQ Range: ",IQR,"\n")
  cat(" Variance: ",Var,"\n")
  cat(" St.Dev.: ",Sd,"\n")
  cat(" Coeff.Var.: ",Cv,"%","\n","\n")
  Sk=Skewness(PDF)
  Ku=Kurtosis(PDF)
  cat("Other Measures:","\n")
  cat(" Skewness: ",Sk,"\n")
  cat(" Kurtosis: ",Ku,"\n","\n")
  if (moments>0){
    cat("Distribution Moments:","\n")
    Moment=c()
    Raw=c()
    Central=c()
    for (i in 1:moments){
      Moment=c(Moment,i)
      Raw=c(Raw,rawmoment(i,PDF))
      Central=c(Central,centralmoment(i,PDF))
    }
    Moments=data.frame(Moment,Raw,Central)
    print(Moments)
  } else {
    Moments=NULL
  }
  OUT=list(Mean,Median,Mode,Var,Sd,Cv,Sk,Ku,Moments)
  names(OUT)=c("Mean","Median","Mode","Var","Stdev","CVar","Skewness","Kurtosis",
               "Moments")
  options(warn = defaultW)
  return(OUT)
}

4.7. plotPDF

This function plots the probability density function (PDF) and the cumulative probability
function (CDF) of a variable with known probability density. PDF can be given as a function
(rho) or as a probability density vector (PDv) previously created with the function PDvector.
When the function rho is used, the values for the initial limits (xlim), step size (dx) and
tolerance (tol) can be optionally introduced. The CDF is obtained using the CDvector function.
If an optional second PDF (PDF2) is used, then both PDFs are shown in the same plot.

Usage:
plotPDF(PDF=rho,xlim,dx,tol)
plotPDF(PDF=PDv)
plotPDF(PDF=PDv,PDF2=PDv2)

Examples:
Plot the distribution functions for the standard normal distribution.
plotPDF(PDF=dnorm)

Plot the distribution functions for both the standard normal distribution and the standard
uniform distribution using the probability density vector with step size 0.01 in the range [-4,4]
(tol=0.001 is used to avoid changes in the limits).
plotPDF(PDF=PDvector(dnorm,xlim=c(-4,4),dx=0.01,tol=0.001),
PDF2=PDvector(dunif,xlim=c(-4,4),dx=0.01,tol=0.001))

R Code:
plotPDF<-function(PDF,PDF2=NULL,x=NULL,xlim=NULL,dx=NULL,tol=1e-6){
  if (is.function(PDF)){
    PDF=PDvector(PDF,xlim,dx,tol)
  }
  CDF=CDvector(PDF)
  Var=PDF$Var
  pdf=PDF$pdf
  cdf=CDF$cdf
  # Two panels: density on the left, cumulative probability on the right
  par(mfrow=c(1,2))
  if (is.null(PDF2)==FALSE){
    if (is.function(PDF2)){
      PDF2=PDvector(PDF2,xlim,dx,tol)
    }
    CDF2=CDvector(PDF2)
    Var2=PDF2$Var
    pdf2=PDF2$pdf
    cdf2=CDF2$cdf
    xlim=c(min(Var,Var2),max(Var,Var2))
    # Cap the density axis at 1/tol; na.rm guards against NA density values
    ylim=c(0,min(max(pdf,pdf2,na.rm=TRUE),1/tol))
    plot(Var,pdf,type="l",col="blue",xlab="Variable Value",
         ylab="Probability Density",xlim=xlim,ylim=ylim)
    lines(Var2,pdf2,type="l",col="red")
    plot(Var,cdf,type="l",col="blue",xlab="Variable Value",
         ylab="Cumulative Probability",xlim=xlim,ylim=c(0,1))
    lines(Var2,cdf2,type="l",col="red")
    legend(x="bottomright",legend=c("PDF1","PDF2"),col=c("blue","red"),
           lty=c(1,1))
  } else {
    plot(Var,pdf,type="l",col="blue",xlab="Variable Value",
         ylab="Probability Density")
    plot(Var,cdf,type="l",col="blue",xlab="Variable Value",
         ylab="Cumulative Probability",ylim=c(0,1))
  }
  par(mfrow=c(1,1))
}

4.8. EvalCVT1

This function performs the numerical evaluation of the Change of Variable Theorem (CVT) for
determining the probability density (yPDF) of a variable (y) which is a function (fun) of a single
variable (x) with known probability density function (PDF). fun can be given as a single-variable
R function, or as a text describing the function of "x" in R language. PDF can be given as a
function (rho) or as a probability density vector (PDv) previously created with the function
PDvector. When the function rho is used, the values for the initial limits (xlim), step size (dx)
and tolerance (tol) can be optionally introduced. The range of values (ylim) and step size (dy) of
variable y can be optionally defined. A Boolean parameter (smooth) determines whether the
probability density of y is smoothed or not, using a geometric moving average. By default,
smooth is set as FALSE. The function also plots the resulting probability density and cumulative
probability of the function (using plotPDF). The function EValue is used for normalizing the
resulting yPDF.
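
The smoothing option can be illustrated in isolation. The sketch below reproduces the geometric moving average used in the R code of this subsection (weights 1, 2, 1 over three neighbouring values, with zeros assumed beyond both ends); the helper name `geo_smooth` and the test vector are hypothetical.

```r
# Geometric moving average over three neighbouring density values,
# with weights 1, 2, 1 (zeros are assumed beyond both ends of the vector)
geo_smooth <- function(rho) {
  prev <- c(0, rho[-length(rho)])   # value at the previous level
  nxt  <- c(rho[-1], 0)             # value at the next level
  (prev * rho^2 * nxt)^(1 / 4)
}
rho <- c(1, 4, 16, 4, 1)
smoothed <- geo_smooth(rho)
print(smoothed)
```

Because the average is geometric, any level whose neighbour is zero is also set to zero. This suppresses isolated spikes, but it also trims the first and last nonzero levels of the density.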

Usage:
yPDF=EvalCVT1(fun,PDF=rho,ylim,dy,xlim,dx,tol,smooth)
yPDF=EvalCVT1(fun,PDF=PDv,ylim,dy,smooth)


Example: Determine the probability density function of the square standard normal
distribution using a step size of 0.1 for y, considering y in the range [0,10], without smoothing.
yPDF=EvalCVT1(fun="x^2",PDF=dnorm,ylim=c(0,10),dy=0.1,smooth=FALSE)

R Code:
EvalCVT1<-function(fun,PDF,ylim=NULL,dy=NULL,xlim=NULL,dx=NULL,tol=1e-6,smooth=FALSE){
  if (is.function(PDF)){
    PDF=PDvector(PDF,xlim,dx,tol)
  }
  rhox=PDF$pdf
  x=PDF$Var
  if (is.function(fun)==TRUE){
    f=fun(x)
  } else {
    # fun given as text: evaluated with x bound to the grid values
    f=eval({parse(text=fun)})
  }
  dx=x[2]-x[1]
  if (is.null(ylim)==TRUE){
    # Default range of y: finite function values where the density of x is positive
    f0=f[(is.infinite(f)==FALSE)&(is.na(f)==FALSE)&(rhox>0)]
    ymin=min(f0)
    ymax=max(f0)
  } else {
    ymin=ylim[1]
    ymax=ylim[2]
  }
  yrange=ymax-ymin
  if (is.null(dy)==TRUE){
    dy=yrange/1000
  } else {
    dy=abs(dy)
  }
  # Extend the y range by 10% on each side, aligned to multiples of dy
  ylim=c(dy*floor((ymin-0.1*yrange)/dy),dy*ceiling((ymax+0.1*yrange)/dy))
  ny=round((ylim[2]-ylim[1])/dy)
  y=ylim[1]+(0:ny)*dy
  rhoy=0*(0:ny)
  # Function values rounded to the resolution dy
  f1=dy*round(f/dy)
  nx=length(x)
  for (i in 1:nx){
    # Index of the y level reached by x[i]
    j=1+round((f1[i]-ylim[1])/dy)
    if ((j>=1) & (j<=(ny+1))){
      if (i==1){
        # Forward difference at the lower bound
        if ((abs(f1[i+1]-f1[i])/dx)>tol) {
          rhoy[j]=rhoy[j]+rhox[i]*dx/abs(f1[i+1]-f1[i])
        } else {
          rhoy[j]=rhoy[j]+rhox[i]/tol
        }
      }
      if (i>1 & i<nx){
        # Central difference at interior points
        if ((0.5*abs(f1[i+1]-f1[i-1])/dx)>tol) {
          rhoy[j]=rhoy[j]+2*rhox[i]*dx/abs(f1[i+1]-f1[i-1])
        } else {
          rhoy[j]=rhoy[j]+rhox[i]/tol
        }
      }
      if (i==nx){
        # Backward difference at the upper bound
        if ((abs(f1[i]-f1[i-1])/dx)>tol) {
          rhoy[j]=rhoy[j]+rhox[i]*dx/abs(f1[i]-f1[i-1])
        } else {
          rhoy[j]=rhoy[j]+rhox[i]/tol
        }
      }
    }
  }
  if (smooth==TRUE){
    # Geometric moving average with weights 1, 2, 1
    rhoy0=c(0,rhoy[1:(length(rhoy)-1)])
    rhoy2=c(rhoy[2:(length(rhoy))],0)
    pdf=(rhoy0*rhoy^2*rhoy2)^(1/4)
  } else {
    pdf=rhoy
  }
  Var=y
  yPDF=data.frame(Var,pdf)
  # Normalize so that the total probability is 1
  pdf=pdf/EValue("x^0",yPDF)
  yPDF=data.frame(Var,pdf)
  plotPDF(yPDF)
  return(yPDF)
}

4.9. EvalCVT

This function performs the numerical evaluation of the Change of Variable Theorem (CVT) for
determining the probability density (yPDF) of a variable (y) which is a function (fun) of one or
more variables (x), each with a known probability density function (PDF). fun can be
given as a single-vector R function, or as a text describing the function of the x vector ("x[i, ]")
in R language (where the row i represents the variable number and the columns represent the
different combinations considered in the analysis). PDF can be given as a list of functions (rho)
or as a list of probability density vectors (PDv) previously created with the function PDvector.
When functions (rho) are used, the values for the initial limits (xlim) and step sizes (dx) can be
optionally introduced as lists. Also, the overall tolerance (tol) can be defined. The range of
values (ylim) and step size (dy) of variable y can be optionally defined. A Boolean parameter
(smooth) determines whether the probability density of y is smoothed or not, using a
geometric moving average. By default, smooth is set as TRUE. The function also plots the
resulting probability density and cumulative probability of the function (using plotPDF). The
function EValue is used for normalizing the resulting PDF. If a single independent variable is
used, the function EvalCVT1 is employed. If the number of possible combinations is larger than
nmax, then the numerical evaluation is performed by a Monte Carlo method (random
selection) using only nmax random evaluations of the multivariate function. By default, a
maximum of 100,000 combinations is considered.

Usage:
yPDF=EvalCVT(fun,PDF=rho,ylim,dy,xlim,dx,tol,smooth,nmax)
yPDF=EvalCVT(fun,PDF=PDv,ylim,dy,smooth,nmax)

Examples:
Determine the probability density function of the sum of squares of three standard normal
distributions using a step size of 0.1 for x and 0.1 for y, considering y in the range [0,50],
without smoothing. The PD vectors are limited to the range [-4,4] with a tolerance of 10^-4.
xPDv=PDvector(dnorm,xlim=c(-4,4),dx=0.1,tol=1e-4)
yPDF=EvalCVT(fun="x[1,]^2+x[2,]^2+x[3,]^2",PDF=list(xPDv,xPDv,xPDv)
,ylim=c(0,50),dy=0.1,smooth=FALSE)

Determine the probability density function of the previous example using the smoothing
function this time.
xPDv=PDvector(dnorm,xlim=c(-4,4),dx=0.1,tol=1e-4)
yPDF=EvalCVT(fun="x[1,]^2+x[2,]^2+x[3,]^2",PDF=list(xPDv,xPDv,xPDv)
,ylim=c(0,50),dy=0.1,smooth=TRUE)

R Code:
EvalCVT<-function(fun,PDF,ylim=NULL,dy=NULL,xlim=NULL,dx=NULL,tol=1e-6,
smooth=FALSE,nmax=100000){
if (is.data.frame(PDF)==TRUE | is.function(PDF)==TRUE){

yPDF=EvalCVT1(fun=fun,PDF=PDF,ylim=ylim,dy=dy,xlim=xlim,dx=dx,tol=tol,
smooth=smooth)
} else {
nvar=length(PDF)
if (nvar==1){

yPDF=EvalCVT1(fun=fun[[1]],PDF=PDF[[1]],ylim=ylim,dy=dy,xlim=xlim,dx=dx,
tol=tol,smooth=smooth)
} else {
if (is.list(xlim)==FALSE){
if (is.null(xlim)==TRUE){
xlim=list(c(0,1))
for (i in 2:nvar){
xlim[[i]]=c(0,1)
}
} else {
xlim0=xlim
xlim=list(xlim)
for (i in 2:nvar){
xlim[[i]]=xlim0
}
}
} else {
if (length(xlim)<nvar){
for (i in (length(xlim)+1):nvar){
xlim[[i]]=c(0,1)
}
}
}
if (is.list(dx)==FALSE){
if (is.null(dx)==TRUE){
dx=list((xlim[[1]][2]-xlim[[1]][1])/1000)
for (i in 2:nvar){
dx[[i]]=(xlim[[i]][2]-xlim[[i]][1])/1000
}
} else {
dx0=dx
dx=list(dx)
for (i in 2:nvar){
dx[[i]]=dx0
}
}
} else {
if (length(dx)<nvar){
for (i in (length(dx)+1):nvar){
dx[[i]]=(xlim[[i]][2]-xlim[[i]][1])/1000
}
}
}
rhox=list(c())
xvar=list(c())
nx=c()
pnx=1
for (i in 1:nvar){

if (is.function(PDF[[i]])){
PDF[[i]]=PDvector(PDF[[i]],xlim[[i]],dx[[i]],tol)
}
rhox[[i]]=PDF[[i]]$pdf
xvar[[i]]=PDF[[i]]$Var
nx=c(nx,length(xvar[[i]]))
pnx=pnx*nx[i]
}
x1=c()
if (pnx>nmax){
for (i in 1:nvar){
x1=c(x1,sample(xvar[[i]],size=nmax,replace=TRUE))
}
x1=matrix(x1,nrow=nvar,ncol=nmax,byrow=TRUE)
} else {
pnx0=1
for (i in 1:nvar){
x1=c(x1,rep(xvar[[i]],each=pnx0,times=pnx/(pnx0*nx[i])))
pnx0=pnx0*nx[i]
}
x1=matrix(x1,nrow=nvar,ncol=pnx,byrow=TRUE)
}
x0=x1
x0[1,]=x0[1,]-dx[[1]]
x2=x1
x2[1,]=x2[1,]+dx[[1]]
if (is.function(fun)==TRUE){
f0=fun(x0)
f1=fun(x1)
f2=fun(x2)
} else {
x=x0
f0=eval({parse(text=fun)})
x=x1
f1=eval({parse(text=fun)})
x=x2
f2=eval({parse(text=fun)})
}
if (is.null(ylim)==TRUE){
fnew=f1[(is.infinite(f1)==FALSE)&(is.na(f1)==FALSE)]
ymin=min(fnew)
ymax=max(fnew)
} else {
ymin=ylim[1]
ymax=ylim[2]
}
yrange=ymax-ymin
if (is.null(dy)==TRUE){
dy=yrange/1000
} else {
dy=abs(dy)
}
ylim=c(dy*floor((ymin-
0.1*yrange)/dy),dy*ceiling((ymax+0.1*yrange)/dy))
ny=round((ylim[2]-ylim[1])/dy)
y=ylim[1]+(0:ny)*dy
f0=dy*round(f0/dy)
f1=dy*round(f1/dy)
f2=dy*round(f2/dy)
rhoy=0*(0:ny)

Pry=c()
for (k in 1:ncol(x1)){
Prx=1
for (i in 1:nvar){
Prx=Prx*rhox[[i]][which(xvar[[i]]==x1[i,k])]*dx[[i]]
}
df=abs(f2[k]-f0[k])/2
if (df>tol*dx[[1]]){
Pry=c(Pry,Prx/df)
} else {
Pry=c(Pry,Prx/(tol*dx[[1]]))
}
j=1+round((f1[k]-ylim[1])/dy)
if ((j>=1) & (j<=(ny+1))){
rhoy[j]=rhoy[j]+Pry[k]
}
}
if (smooth==TRUE){
rhoy0=c(0,rhoy[1:(length(rhoy)-1)])
rhoy2=c(rhoy[2:(length(rhoy))],0)
pdf=(rhoy0*rhoy^2*rhoy2)^(1/4)
} else {
pdf=rhoy
}
Var=y
yPDF=data.frame(Var,pdf)
pdf=pdf/EValue("x^0",yPDF)
yPDF=data.frame(Var,pdf)
plotPDF(yPDF)
}
}
return(yPDF)
}
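
The choice between the full grid and the Monte Carlo selection in the listing above can be illustrated with a small stand-alone sketch. The grid values below are hypothetical and chosen only to keep the output small. The full-grid branch copies the rep(...) construction of the listing, and the result is cross-checked against base R's expand.grid:

```r
# Hypothetical small grids for two variables (illustration only)
xvar <- list(c(1, 2, 3), c(10, 20))
nvar <- length(xvar)
nx <- sapply(xvar, length)
pnx <- prod(nx)               # total number of combinations (6 here)
nmax <- 100000                # same default as in EvalCVT

if (pnx > nmax) {
  # Monte Carlo branch: nmax random draws of each variable, with replacement
  x1 <- t(sapply(xvar, function(v) sample(v, size = nmax, replace = TRUE)))
} else {
  # Full-grid branch, built as in the listing: variable 1 varies fastest
  x1 <- c()
  pnx0 <- 1
  for (i in 1:nvar) {
    x1 <- c(x1, rep(xvar[[i]], each = pnx0, times = pnx / (pnx0 * nx[i])))
    pnx0 <- pnx0 * nx[i]
  }
  x1 <- matrix(x1, nrow = nvar, ncol = pnx, byrow = TRUE)
}

# Cross-check against expand.grid (one combination per column)
grid <- unname(t(as.matrix(expand.grid(xvar))))
all.equal(x1, grid)
```

In the example of this Section, three PD vectors of 81 points each (assuming the grid from -4 to 4 in steps of 0.1 is kept unchanged by PDvector) would give 81^3 = 531,441 combinations, which exceeds the default nmax, so the Monte Carlo branch would be taken.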

4.10. similitude

This function determines the percentage of similitude (simil) between two probability density
functions (PDF1 and PDF2). Both PDFs can be given as functions (rho) or as probability density
vectors (PDv) previously created with the function PDvector. When PDF1 is a function (rho1), the
values for the limits (xlim) can be optionally introduced. If PDF1 is a PD vector, its corresponding
x limit values are used for the calculations. The default number of steps (intervals) used for the
calculation is 100000. This function is an adaptation from a previous work [7].

Usage:
simil=similitude(PDF1=rho1,PDF2=rho2,xlim,nsteps)
simil=similitude(PDF1=PDv1,PDF2=rho2)
simil=similitude(PDF1=rho1,PDF2=PDv2,xlim,nsteps)
simil=similitude(PDF1=PDv1,PDF2=PDv2)

Example: Determine the similitude between the smoothed probability density vector obtained
in the previous Section and the chi-square distribution with 3 degrees of freedom.

xPDv=PDvector(dnorm,xlim=c(-4,4),dx=0.1,tol=1e-4)
yPDF=EvalCVT(fun="x[1,]^2+x[2,]^2+x[3,]^2",PDF=list(xPDv,xPDv,xPDv)
,ylim=c(0,50),dy=0.1,smooth=TRUE)
dchisq3<-function(x){
y=dchisq(x,3)
y[which(is.na(y)==TRUE)]=0
return(y)
}
similitude(PDF1=yPDF,PDF2=dchisq3)
Similitude (%)
97.43182

R Code:
similitude<-function(PDF1,PDF2,xlim=c(-10,10),nsteps=100000){
if (is.function(PDF1)==TRUE){
xlim1=NULL
} else {
xlim1=c(min(PDF1[,1]),max(PDF1[,1]))
}
if (is.function(PDF2)==TRUE){
if (is.null(xlim1)==TRUE){
xmin=xlim[1]
xmax=xlim[2]
} else {
xmin=xlim1[1]
xmax=xlim1[2]
}
} else {
if (is.null(xlim1)==TRUE){
xmin=min(PDF2[,1])
xmax=max(PDF2[,1])
} else {
xlim2=c(min(PDF2[,1]),max(PDF2[,1]))
xmin=max(xlim1[1],xlim2[1])
xmax=min(xlim1[2],xlim2[2])
}
}
i=1:(nsteps+1)
x=xmin+(xmax-xmin)*(i-1)/nsteps
Var=x
if (is.function(PDF1)==TRUE){
f1=match.fun(PDF1)

rho1=f1(x)
} else {
rho1=approx(PDF1[,1],PDF1[,2],xout=x)$y
}
pdf=rho1
PDF1=data.frame(Var,pdf)
if (is.function(PDF2)==TRUE){
f2=match.fun(PDF2)
rho2=f2(x)
} else {
rho2=approx(PDF2[,1],PDF2[,2],xout=x)$y
}
pdf=rho2
PDF2=data.frame(Var,pdf)
rhomin=pmin(rho1,rho2)

simil=200*sum(rhomin,na.rm=TRUE)/(sum(rho1,na.rm=TRUE)+sum(rho2,na.rm=TRUE))
names(simil)=c("Similitude (%)")
plotPDF(PDF1,PDF2=PDF2)
return(simil)
}
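
The core of the calculation above is the ratio between the summed pointwise minimum of the two densities and the average of their sums. A minimal stand-alone sketch of that ratio, without plotting or PD vector handling (simil_sketch is a hypothetical name, and both arguments are assumed to be vectorized density functions), is:

```r
# Overlap-based similitude (%) between two densities on a common regular grid.
# simil_sketch is a hypothetical helper, not the similitude function itself.
simil_sketch <- function(rho1, rho2, xlim = c(-10, 10), nsteps = 100000) {
  x <- seq(xlim[1], xlim[2], length.out = nsteps + 1)
  r1 <- rho1(x)
  r2 <- rho2(x)
  200 * sum(pmin(r1, r2)) / (sum(r1) + sum(r2))
}

same <- simil_sketch(dnorm, dnorm)                               # identical densities
shifted <- simil_sketch(dnorm, function(x) dnorm(x, mean = 1))   # unit shift
c(same = same, shifted = shifted)
```

For densities of unit area this ratio is the overlapping area expressed as a percentage: identical densities give 100%, and two unit-variance normal densities one standard deviation apart give 200*pnorm(-0.5), about 61.7%.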

5. Selected Examples

In this Section, a selection of representative examples is considered for illustrative purposes.
Some of these examples were solved analytically using the Change of Variable Theorem in a
previous report [3].

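5.1. $y = 1/x$, $x \sim U(0,1)$

The first example is the reciprocal function of a type III standard uniform random variable
(limited between $0$ and $1$). The exact analytical probability density function is [3]:

$$\rho_y(y) = \begin{cases} \dfrac{1}{y^2}, & y \ge 1 \\ 0, & y < 1 \end{cases} \tag{5.1}$$

and the corresponding cumulative probability function is:

$$P_y(y) = \begin{cases} 1 - \dfrac{1}{y}, & y \ge 1 \\ 0, & y < 1 \end{cases} \tag{5.2}$$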

The numerical CVT and subsequent comparison with the exact result is performed as follows:

yPDF=EvalCVT("1/x",dunif,ylim=c(1,100),dy=0.1,dx=0.00001)
Ex51<-function(x){
y=x^(-2)
y[which(x<1)]=0
return(y)
}
similitude(PDF1=yPDF,PDF2=Ex51)
Similitude (%)
95.60502

The graphical output generated in R is presented in Figure 1. Figure 1 compares the numerical
results obtained in R with the exact solution given in Eq. (5.1).
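
For reference, Eq. (5.1) follows directly from the univariate change of variable theorem. A short derivation is sketched below, where the symbols $\rho_x$ and $\rho_y$ for the densities of $x$ and $y$ are a notational assumption of this sketch:

```latex
x = \frac{1}{y}, \qquad \left|\frac{dx}{dy}\right| = \frac{1}{y^2}, \qquad
\rho_y(y) = \rho_x\!\left(\frac{1}{y}\right)\left|\frac{dx}{dy}\right| = \frac{1}{y^2}
\quad \text{for } 0 < \frac{1}{y} \le 1 \;\Leftrightarrow\; y \ge 1
```

The density decays only as $1/y^2$, so a probability of $1/100$ lies beyond the upper limit of 100 used for ylim in the code above. This truncated tail may contribute to the difference between the numerical and exact results.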

Figure 1. Graphical comparison of probability density and cumulative probability functions
between the numerical and exact solutions of the change of variable theorem for $y = 1/x$, where
$x$ is a type III standard uniform random variable.

The numerical results obtained are strongly influenced by the parameter values considered. In
this example, the similitude between the numerical and exact probability density functions is
satisfactory (about 95.6%).

5.2. $y = 1/x$, $x \sim U(2,10)$

The second example is a modified version of the previous example, considering a variable
uniformly distributed between $2$ and $10$. The exact solution to this problem is [3]:

$$\rho_y(y) = \begin{cases} \dfrac{1}{8y^2}, & \dfrac{1}{10} \le y \le \dfrac{1}{2} \\ 0, & \text{otherwise} \end{cases} \tag{5.3}$$

$$P_y(y) = \begin{cases} 0, & y < \dfrac{1}{10} \\ \dfrac{1}{8}\left(10 - \dfrac{1}{y}\right), & \dfrac{1}{10} \le y \le \dfrac{1}{2} \\ 1, & y > \dfrac{1}{2} \end{cases} \tag{5.4}$$

Figure 2 summarizes the results obtained using the following lines of code:
xPDF=EvalCVT("2+8*x",dunif,ylim=c(0,12),dy=0.0001,dx=0.000001,
smooth=TRUE)
yPDF=EvalCVT("1/x",xPDF,ylim=c(0,0.6),dy=0.001,smooth=TRUE)
Ex52<-function(x){
y=1/(8*x^2)
y[which(x<1/10)]=0
y[which(x>1/2)]=0
return(y)
}
similitude(PDF1=yPDF,PDF2=Ex52)
Similitude (%)
99.26432

Better results were obtained in this case, because the possible values of $y$ were limited to a
finite range. The error involved in the numerical solution can be further reduced by decreasing
the step sizes, but this will also increase the computational demand of the method. Also notice
that a linear transformation of the standard uniform distribution was the first step in the
numerical solution.
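
The effect of the step size can be illustrated with a stand-alone sketch that does not use EvalCVT. It applies a simple histogram-style approximation (an assumption of this sketch, not the algorithm of the report): each cell of the x grid carries the probability rho(x)*dx, which is added to the y bin where f(x) falls. The helper name bin_density is hypothetical.

```r
# Histogram-style estimate of the density of y = f(x): each x cell carries the
# probability mass rho(x)*dx, which is added to the y bin where f(x) falls.
bin_density <- function(f, rho, xlim, dx, ylim, dy) {
  x <- seq(xlim[1] + dx / 2, xlim[2] - dx / 2, by = dx)   # cell midpoints
  mass <- rho(x) * dx
  edges <- seq(ylim[1], ylim[2], by = dy)
  nb <- length(edges) - 1
  bin <- findInterval(f(x), edges, rightmost.closed = TRUE)
  dens <- numeric(nb)
  for (j in seq_len(nb)) dens[j] <- sum(mass[bin == j]) / dy
  dens
}

f <- function(x) 1 / x
rho <- function(x) dunif(x, min = 2, max = 10)
edges <- seq(0.1, 0.5, by = 0.01)
# Exact bin averages of 1/(8*y^2), from the cumulative probability (10 - 1/y)/8
exact <- diff((10 - 1 / edges) / 8) / 0.01

err_coarse <- max(abs(bin_density(f, rho, c(2, 10), 0.1, c(0.1, 0.5), 0.01) - exact))
err_fine <- max(abs(bin_density(f, rho, c(2, 10), 0.001, c(0.1, 0.5), 0.01) - exact))
c(coarse = err_coarse, fine = err_fine)
```

With the coarse step, some y bins receive no grid point at all while others receive a full cell of probability, so the maximum error is large. The fine step uses 100 times more function evaluations and gives a much smaller error, which is the trade-off described above.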

Figure 2. Graphical comparison of probability density and cumulative probability functions
between the numerical and exact solutions of the change of variable theorem for $y = 1/x$, where
$x$ is a uniform random variable between 2 and 10.

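5.3. $y = 1/x$, $x$ standard deterministic

In this example, the independent variable is the standard deterministic variable [8]. The
probability density function of the standard deterministic variable is:

$$\rho_x(x) = \delta(x-1) = \begin{cases} \infty, & x = 1 \\ 0, & x \ne 1 \end{cases} \tag{5.5}$$

where $\delta$ is Dirac's delta function.

The cumulative probability of $x$ is:

$$P_x(x) = H(x-1) = \begin{cases} 1, & x \ge 1 \\ 0, & x < 1 \end{cases} \tag{5.6}$$

where $H$ is Heaviside's step function.

The analytical result of the change of variable theorem for $y = 1/x$ is $\rho_y(y) = \delta(y-1)$ [8]. The
numerical solution can be obtained by approximating the standard deterministic variable using
the standard uniform random variable as follows:

$$x \approx 1 - \frac{\epsilon}{2} + \epsilon u \tag{5.7}$$

where $u$ is the standard uniform random variable and $\epsilon \to 0$. The numerical results obtained
assuming $\epsilon = 0.001$ are shown in Figure 3.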

Figure 3. Graphical comparison of probability density and cumulative probability functions
between the numerical and exact solutions of the change of variable theorem for $y = 1/x$, where
$x$ is a standard deterministic variable.

The corresponding code used was the following:


xPDF=EvalCVT("1-0.001/2+0.001*x",dunif,ylim=c(0.5,1.5),
dy=0.0001,dx=0.00001)
yPDF=EvalCVT("1/x",xPDF,ylim=c(0.5,1.5),dy=0.0001)
similitude(PDF1=yPDF,PDF2=xPDF)
Similitude (%)
100
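
The first line of the code above maps the standard uniform variable onto a narrow interval of width 0.001 centered at 1. The following sketch shows why such a variable approaches the deterministic value 1 as the width shrinks; the symbols $\epsilon$ (interval width) and $\rho_x$ are notational assumptions of this sketch:

```latex
\rho_x(x) = \begin{cases} 1/\epsilon, & |x - 1| \le \epsilon/2 \\ 0, & \text{otherwise} \end{cases},
\qquad E(x) = 1, \qquad \mathrm{Var}(x) = \frac{\epsilon^2}{12} \xrightarrow[\epsilon \to 0]{} 0,
\qquad \frac{1}{1 + \epsilon/2} \le \frac{1}{x} \le \frac{1}{1 - \epsilon/2}
```

For a width of 0.001, the standard deviation of $x$ is about $2.9 \times 10^{-4}$ and $1/x$ stays within roughly $\pm 0.0005$ of 1, so the densities of $x$ and $1/x$ are nearly indistinguishable at the resolution used, which is consistent with the similitude reported above.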

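5.4. $y = |x|$, $x \sim N(0,1)$

The following example considers the absolute value of a standard normal random variable. The
exact solution to this problem, using the univariate change of variable theorem, is [3]:

$$\rho_y(y) = \begin{cases} \sqrt{\dfrac{2}{\pi}}\, e^{-y^2/2}, & y \ge 0 \\ 0, & y < 0 \end{cases} \tag{5.8}$$

$$P_y(y) = \begin{cases} \operatorname{erf}\!\left(\dfrac{y}{\sqrt{2}}\right), & y \ge 0 \\ 0, & y < 0 \end{cases} \tag{5.9}$$

Figure 4. Graphical comparison of probability density and cumulative probability functions
between the numerical and exact solutions of the change of variable theorem for $y = |x|$,
where $x$ is a standard normal random variable.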

Figure 4 summarizes the satisfactory results obtained using the following lines of code:
yPDF=EvalCVT("abs(x)",PDvector(dnorm,xlim=c(-5,5),dx=0.0001),
ylim=c(0,5),dy=0.01,smooth=TRUE)

Ex54<-function(x){
y=sqrt(2/pi)*exp(-(x^2)/2)
y[which(x<0)]=0
return(y)
}
similitude(PDF1=yPDF,PDF2=Ex54)
Similitude (%)
99.47906
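
The factor $\sqrt{2/\pi}$ used in Ex54 comes from the two branches of the inverse, $x = y$ and $x = -y$, each contributing to the density of $y$. A short derivation, with $\rho_x$ and $\rho_y$ as assumed notation for the densities of $x$ and $y$:

```latex
\rho_y(y) = \sum_{k} \rho_x(x_k)\left|\frac{dx_k}{dy}\right|
= \rho_x(y) + \rho_x(-y)
= \frac{2}{\sqrt{2\pi}}\, e^{-y^2/2}
= \sqrt{\frac{2}{\pi}}\, e^{-y^2/2}, \qquad y \ge 0
```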

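5.5. $y = x_1 + x_2$, $x_1, x_2 \sim U(0,1)$

This is a multivariate example. In this case, variable $y$ is the result of adding two independent
type III standard uniform variables. The corresponding solution using the multivariate change
of variable theorem is [3]:

$$\rho_y(y) = \begin{cases} y, & 0 \le y \le 1 \\ 2 - y, & 1 < y \le 2 \\ 0, & \text{otherwise} \end{cases} \tag{5.10}$$

$$P_y(y) = \begin{cases} 0, & y < 0 \\ \dfrac{y^2}{2}, & 0 \le y \le 1 \\ 1 - \dfrac{(2-y)^2}{2}, & 1 < y \le 2 \\ 1, & y > 2 \end{cases} \tag{5.11}$$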

Now, the numerical solution can be obtained as follows:


yPDF=EvalCVT("x[1,]+x[2,]",PDF=list(dunif,dunif),ylim=c(0,2),dy=0.05)
Ex55<-function(x){
y=x
y[which(x>1)]=2-y[which(x>1)]
y[which(x<0 | x>2)]=0
return(y)
}
similitude(PDF1=yPDF,PDF2=Ex55)
Similitude (%)
99.24617
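
The triangular density coded in Ex55 can be recovered from the multivariate change of variable theorem by solving for $x_1 = y - x_2$ and integrating over $x_2$. A compact sketch, with $\rho_{x_1}$, $\rho_{x_2}$ and $\rho_y$ as assumed notation for the densities:

```latex
\rho_y(y) = \int_{-\infty}^{\infty} \rho_{x_1}(y - x_2)\,\rho_{x_2}(x_2)\,dx_2
= \int_{\max(0,\,y-1)}^{\min(1,\,y)} dx_2
= \min(1, y) - \max(0, y - 1), \qquad 0 \le y \le 2
```

This equals $y$ for $0 \le y \le 1$ and $2 - y$ for $1 < y \le 2$.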

The comparative results are shown in Figure 5. The computational demand of this method
increases exponentially with the number of variables involved in the function. For this reason,
when the full grid of possible combinations exceeds nmax, the Monte Carlo method is used
instead. Nevertheless, a satisfactory result (over 99% similitude) was again obtained. The exact
analytical solution to this particular problem is not straightforward, and it requires
considerable effort to obtain. For highly nonlinear functions, and for a larger number of
variables, the analytical complexity of the solution increases, and in some cases an analytical
solution may not be available for comparison.

Figure 5. Graphical comparison of probability density and cumulative probability functions
between the numerical and exact solutions of the change of variable theorem for $y = x_1 + x_2$,
where $x_1$ and $x_2$ are type III standard uniform variables.

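5.6. $y = \sqrt{x_1^2 + x_2^2}$, $x_1, x_2 \sim N(0,1)$

This example also involves two variables, but in this case they are standard normal random
variables. The analytical solution is the following [3]:

$$\rho_y(y) = \begin{cases} y\, e^{-y^2/2}, & y \ge 0 \\ 0, & y < 0 \end{cases} \tag{5.12}$$

$$P_y(y) = \begin{cases} 1 - e^{-y^2/2}, & y \ge 0 \\ 0, & y < 0 \end{cases} \tag{5.13}$$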

The numerical solution can be obtained as follows (limiting the standard normal distributions
to the range [-5,5]):

yPDF=EvalCVT("sqrt(x[1,]^2+x[2,]^2)",PDF=list(dnorm,dnorm),ylim=c(0
,7),dy=0.1,smooth=TRUE)
Ex56<-function(x){
y=x*exp(-0.5*x^2)
y[which(x<0)]=0
return(y)

}
similitude(PDF1=yPDF,PDF2=Ex56)
Similitude (%)
98.60999

Figure 6. Graphical comparison of probability density and cumulative probability functions
between the numerical and exact solutions of the change of variable theorem for
$y = \sqrt{x_1^2 + x_2^2}$, where $x_1$ and $x_2$ are standard normal variables.

Figure 6 shows again a satisfactory prediction of both distribution functions, with a
similitude over 98%. Since the Monte Carlo method is used, slightly different results may be
obtained in different runs.

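5.7. $y = \sqrt{x_1^2 + x_2^2 + x_3^2}$, $x_1, x_2, x_3 \sim N(0,1)$

The next example corresponds to the Maxwell-Boltzmann distribution involving three
independent variables. The solution to this problem is [3]:

$$\rho_y(y) = \begin{cases} \sqrt{\dfrac{2}{\pi}}\, y^2 e^{-y^2/2}, & y \ge 0 \\ 0, & y < 0 \end{cases} \tag{5.14}$$

$$P_y(y) = \begin{cases} \operatorname{erf}\!\left(\dfrac{y}{\sqrt{2}}\right) - \sqrt{\dfrac{2}{\pi}}\, y\, e^{-y^2/2}, & y \ge 0 \\ 0, & y < 0 \end{cases} \tag{5.15}$$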
The numerical solution to this problem can be obtained as follows:

yPDF=EvalCVT("sqrt(x[1,]^2+x[2,]^2+x[3,]^2)",PDF=list(dnorm,dnorm,dnorm),
ylim=c(0,9),dy=0.1,smooth=TRUE,nmax=500000)
Ex57<-function(x){
y=sqrt(2/pi)*x^2*exp(-0.5*x^2)
y[which(x<0)]=0
return(y)
}
similitude(PDF1=yPDF,PDF2=Ex57)
Similitude (%)
98.91535
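
The exact densities coded in Ex56 and Ex57 can be cross-checked against base R, because the sum of squares of k independent standard normal variables follows a chi-square distribution with k degrees of freedom (as used in Section 4.10). The sketch below applies the univariate change of variable y = sqrt(s) to dchisq; the helper name dchi is hypothetical.

```r
# Density of y = sqrt(s) when s follows a chi-square distribution with k
# degrees of freedom: rho_y(y) = 2*y*dchisq(y^2, k) for y >= 0
dchi <- function(y, k) ifelse(y >= 0, 2 * y * dchisq(y^2, df = k), 0)

y <- seq(0, 9, by = 0.1)
rayleigh <- y * exp(-0.5 * y^2)                     # same expression as Ex56
maxwell <- sqrt(2 / pi) * y^2 * exp(-0.5 * y^2)     # same expression as Ex57

err2 <- max(abs(dchi(y, 2) - rayleigh))   # k = 2, Section 5.6
err3 <- max(abs(dchi(y, 3) - maxwell))    # k = 3, Section 5.7
c(err2, err3)                             # both at rounding-error level
```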

Figure 7. Graphical comparison of probability density and cumulative probability functions
between the numerical and exact solutions of the change of variable theorem for
$y = \sqrt{x_1^2 + x_2^2 + x_3^2}$, where $x_1$, $x_2$ and $x_3$ are standard normal variables.

The performance of the numerical solution can be observed in Figure 7. This problem required
increasing the maximum number of evaluations in order to obtain a satisfactory performance
(about 98.9% similitude using 500,000 evaluations).

5.8. $y = x_1 / \sqrt{x_1^2 + x_2^2 + x_3^2}$, $x_1, x_2, x_3 \sim N(0,1)$

The last comparative example represents the contribution of a single velocity component to
the overall velocity of a particle, assuming a standard normal distribution of each velocity
component. The analytical solution to this problem is [3]:

$$\rho_y(y) = \begin{cases} \dfrac{1}{2}, & -1 \le y \le 1 \\ 0, & |y| > 1 \end{cases} \tag{5.16}$$

$$P_y(y) = \begin{cases} 0, & y < -1 \\ \dfrac{y + 1}{2}, & -1 \le y \le 1 \\ 1, & y > 1 \end{cases} \tag{5.17}$$

Even if the resulting distribution is relatively simple (uniform distribution), the analytical
solution of the integrals from the change of variable theorem is not so easily obtained. In fact,
modified Bessel functions of the second kind must be integrated during the solution process.

The numerical solution to this problem can be obtained using the following code in R:
yPDF=EvalCVT("x[1,]/sqrt(x[1,]^2+x[2,]^2+x[3,]^2)",PDF=list(dnorm,dnorm,dnorm),
ylim=c(-1,1),dy=0.01,smooth=TRUE,nmax=200000)
Ex58<-function(x){
y=1/2+0*x
y[which(abs(x)>1)]=0
return(y)
}
similitude(PDF1=yPDF,PDF2=Ex58)
Similitude (%)
95.1479
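
As an independent sanity check of the uniform result (a sketch based on direct random sampling, not on the numerical CVT of this report), the function can be evaluated on random standard normal triplets and compared with the mean, variance and cumulative probability of a uniform variable on [-1, 1]. The seed and sample size are arbitrary choices.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n <- 100000
x <- matrix(rnorm(3 * n), nrow = 3)
y <- x[1, ] / sqrt(colSums(x^2))

# A uniform variable on [-1, 1] has mean 0 and variance 1/3
c(mean = mean(y), variance = var(y))

# Largest gap between the empirical and exact cumulative probabilities
yy <- seq(-1, 1, by = 0.01)
gap <- max(abs(ecdf(y)(yy) - (yy + 1) / 2))
gap
```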

Figure 8. Graphical comparison of probability density and cumulative probability functions
between the numerical and exact solutions of the change of variable theorem for
$y = x_1 / \sqrt{x_1^2 + x_2^2 + x_3^2}$, where $x_1$, $x_2$ and $x_3$ are standard normal variables.

Figure 8 summarizes the numerical results obtained. In this case, the probability density
function obtained is noisy, in spite of the smoothing option being used. This noise decreases
the similitude percentage with respect to previous examples. However, the similitude value of
about 95.1% obtained is still satisfactory, as illustrated by the cumulative probability
function, whose exact solution is closely represented by the numerical result.

5.9. $y = x + \ln(1 + x)$, $x \sim U(0,1)$

This problem was already proposed in Section 2 as an example of a function without explicit
inverse. Thus, the analytical evaluation of the change of variable theorem is unfeasible.
However, the numerical evaluation is possible. The probability density of the function can be
obtained using:

yPDF=EvalCVT("x+log(1+x)",PDF=PDvector(dunif,xlim=c(0,1),dx=0.00001
),ylim=c(0,1.7),dy=0.01,smooth=TRUE)

Figure 9. Results obtained for the probability density and cumulative probability functions using
the numerical solution of the change of variable theorem for $y = x + \ln(1 + x)$, where $x$ is a
type III standard uniform variable.

These results are graphically presented in Figure 9. The probability density and cumulative
probability of the resulting transformation can be calculated using the functions dfun and pfun.
Similarly, quantile values and random numbers can be extracted from the distribution using
qfun and rfun, respectively. In addition, the main properties of this distribution are summarized
as follows:
OUT=PDSummary(yPDF)
PDF Properties Summary
Central Tendency and Location Measures:
Mean: 0.8847004
Median: 0.9036036
Mode: 1.67
Q1: 0.4751766
Q3: 1.304247
Dispersion Measures:
IQ Range: 0.82907
Variance: 0.2319373
St.Dev.: 0.4815987
Coeff.Var.: 54.43636 %

Other Measures:
Skewness: -0.09633249
Kurtosis: -1.179385
Distribution Moments:
Moment Raw Central
1 1 0.8847004 4.884981e-17
2 2 1.0146322 2.319373e-01
3 3 1.2972752 -1.076041e-02
4 4 1.7616890 9.793984e-02
5 5 2.4862137 -1.083074e-02
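
Some of the values above can be cross-checked without the numerical CVT, because the function x + log(1 + x) is monotonically increasing on [0, 1]. The following sketch uses only base R; the names f, finv and rho_y are hypothetical, and the inverse is obtained numerically with uniroot since no explicit inverse is available.

```r
f <- function(x) x + log(1 + x)          # monotonically increasing on [0, 1]

# Numerical inverse x = f^{-1}(y)
finv <- function(y) uniroot(function(x) f(x) - y, c(0, 1), tol = 1e-12)$root

# Univariate CVT with rho_x = 1 on [0, 1]: rho_y(y) = 1/f'(x) = (1 + x)/(2 + x)
rho_y <- function(y) {
  x <- finv(y)
  (1 + x) / (2 + x)
}

mean_y <- integrate(f, 0, 1)$value       # E(y) = integral of f(x)*rho_x(x) dx
quartiles <- f(c(0.25, 0.5, 0.75))       # quantiles map through a monotonic f
dens_median <- rho_y(f(0.5))             # density at the median of y
```

Under these assumptions the mean is 2 ln 2 - 1/2, about 0.8863, and the quartiles are about 0.473, 0.905 and 1.310, close to the values reported by PDSummary. The density increases from 1/2 at y = 0 to 2/3 at y = 1 + ln 2, which is consistent with a mode near the upper end of the range.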

5.10. $y = x_1 \cos(x_2) + x_2 \sin(x_1)$, $x_1, x_2 \sim N(0,1)$

As an example of a multivariate function without an explicit inverse function on any variable we
have the following situation:

$$y = x_1 \cos(x_2) + x_2 \sin(x_1) \tag{5.18}$$

Particularly, let us consider that the independent variables are standard normal random
variables in the range [$-2\pi$, $2\pi$]. The numerical solution is in this case:

yPDF=EvalCVT("x[1,]*cos(x[2,])+x[2,]*sin(x[1,])",PDF=list(PDvector(
dnorm,xlim=c(-2*pi,2*pi),dx=0.001),PDvector(dnorm,xlim=c(-2*pi,
2*pi),dx=0.001)),ylim=c(-4*pi,4*pi),dy=0.05,smooth=TRUE)

Figure 10. Results obtained for the probability density and cumulative probability functions
using the numerical solution of the change of variable theorem for $y = x_1 \cos(x_2) + x_2 \sin(x_1)$,
where $x_1$ and $x_2$ are standard normal variables.

Figure 10 summarizes the results obtained for the probability density and cumulative probability.
Again, these functions can be approximated using dfun and pfun, respectively. The main
properties of this distribution are summarized as follows:
OUT=PDSummary(yPDF)
PDF Properties Summary
Central Tendency and Location Measures:
Mean: 0.005943125
Median: -0.003117462
Mode: 0
Q1: -0.590475
Q3: 0.6110587
Dispersion Measures:
IQ Range: 1.201534
Variance: 0.9934216
St.Dev.: 0.9967054
Coeff.Var.: 16770.73 %
Other Measures:
Skewness: -0.003262576
Kurtosis: 0.4976511
Distribution Moments:
Moment Raw Central
1 1 0.005943125 1.308963e-17
2 2 0.993456879 9.934216e-01
3 3 0.014481859 -3.230435e-03
4 4 3.451918044 3.451784e+00
5 5 -0.007062769 -1.096356e-01
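
Although the density has no closed form, the first two moments can be checked analytically. The sketch below assumes untruncated independent standard normal variables (the probability outside $\pm 2\pi$ is negligible) and uses the known result $E[\cos(t x)] = e^{-t^2/2}$:

```latex
E(y) = E(x_1)\,E(\cos x_2) + E(x_2)\,E(\sin x_1) = 0

E(y^2) = E(x_1^2)\,E(\cos^2 x_2) + E(x_2^2)\,E(\sin^2 x_1) + 2\,E(x_1 \sin x_1)\,E(x_2 \cos x_2)
       = \frac{1 + e^{-2}}{2} + \frac{1 - e^{-2}}{2} + 0 = 1
```

The exact mean and variance are therefore 0 and 1, in reasonable agreement with the numerical values above. The very large coefficient of variation simply reflects a mean close to zero.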

5.11. $y = 1 + \cos(xy/3)$, $x \sim N(0,1)$

The last example involves an implicit function. An approximate explicit function must first be
created by numerically solving the implicit function. Here, the solution is obtained by
optimization, using the OAToptim function introduced in an earlier report [9]. The approximate
function is then generated using cubic spline regression (rspline) [6]. The code used for
creating this approximate function is the following:
nlfun<-function(y){
load(file="temp.dat")
f=(y-1-cos(x*y/3))^2
return(f)
}
X=-2*pi+0.02*pi*(0:200)
y=c()
for (i in 1:length(X)){
x=X[i]
save(x,file="temp.dat")

OUT=OAToptim(nlfun,x0=1,display=FALSE,optmode='min')
y=c(y,OUT[[1]])
}
plot(X,y)
appfun=rspline(y,X)[[1]]
xPDF=PDvector(dnorm,xlim=c(-2*pi,2*pi),dx=0.0001,tol=0.0001)
yPDF=EvalCVT("appfun(x)",PDF=xPDF,ylim=c(0,2),dy=0.01,smooth=TRUE)

Figure 11 shows the resulting numerical solution found by optimization, and the cubic spline
approximate explicit function of $y$ in terms of $x$. Figure 12 shows the distribution functions
obtained for $y$.

Figure 11. Optimal cubic spline regression model describing the explicit dependence of variable
$y$ on variable $x$. R2adj = 0.9999954. Number of segments = 10.

Figure 12. Results obtained for the probability density and cumulative probability functions
using the numerical solution of the change of variable theorem for the implicit function
$y = 1 + \cos(xy/3)$, where $x$ is a standard normal variable in the range [$-2\pi$, $2\pi$].

The main properties of this distribution are:


OUT=PDSummary(yPDF)
PDF Properties Summary
Central Tendency and Location Measures:
Mean: 1.836634
Median: 1.885314
Mode: 1.99
Q1: 1.754617
Q3: 1.958811
Dispersion Measures:
IQ Range: 0.2041937
Variance: 0.02320143
St.Dev.: 0.1523201
Coeff.Var.: 8.293442 %
Other Measures:
Skewness: -1.246654
Kurtosis: 1.173768
Distribution Moments:
Moment Raw Central
1 1 1.836634 2.897552e-16
2 2 3.396425 2.320143e-02
3 3 6.318808 -4.405730e-03
4 4 11.818099 2.246765e-03
5 5 22.206903 -9.201951e-04

6. Conclusion

In the present work, a numerical approximation of the change of variable theorem has been
proposed for determining the probability density of a function of one or more randomistic
variables. The method was implemented in R language and illustrated with several example
functions. Most of the numerical results were compared with the exact analytical solution of
the change of variable theorem, resulting in similitude values higher than 95%. It was also
shown that the numerical approach provides the probability density of functions for which the
analytical solution of the change of variable theorem cannot be obtained. The resulting
probability density function can be used for determining any property of the function,
including expected value, raw and central moments, variance, standard deviation, skewness,
kurtosis, quartiles, etc.

Acknowledgment and Disclaimer

This report provides data, information and conclusions obtained by the author(s) as a result of original
scientific research, based on the best scientific knowledge available to the author(s). The main purpose
of this publication is the open sharing of scientific knowledge. Any mistake, omission, error or inaccuracy
published, if any, is completely unintentional.

This research did not receive any specific grant from funding agencies in the public, commercial, or not-
for-profit sectors.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC
4.0). Anyone is free to share (copy and redistribute the material in any medium or format) or adapt
(remix, transform, and build upon the material) this work under the following terms:
- Attribution: Appropriate credit must be given, providing a link to the license, and indicating if
changes were made. This can be done in any reasonable manner, but not in any way that
suggests endorsement by the licensor.
- NonCommercial: This material may not be used for commercial purposes.

References

[1] Hernandez, H. (2022). Standard Deterministic, Standard Random, and Randomistic Variables.
ForsChem Research Reports, 7, 2022-06, 1-18. doi: 10.13140/RG.2.2.36316.87688.
[2] Hernandez, H. (2020). On the Discreteness of Measured Variables and the Continuous
Approximation. ForsChem Research Reports, 5, 2020-20, 1-18. doi: 10.13140/RG.2.2.27740.00646.
[3] Hernandez, H. (2017). Multivariate Probability Theory: Determination of Probability Density
Functions. ForsChem Research Reports, 2, 2017-13, 1-13. doi: 10.13140/RG.2.2.28214.60481.
[4] Hernandez, H. (2020). Approximate Function Inversion by Series Expansions. ForsChem Research
Reports, 5, 2020-04, 1-11. doi: 10.13140/RG.2.2.36280.70406.
[5] Hernandez, H. (2020). Reconstructing Probability Distributions using Quantile-based Splines.
ForsChem Research Reports, 5, 2020-21, 1-23. doi: 10.13140/RG.2.2.14827.36645.
[6] Hernandez, H. (2022). Cubic Spline Regression using OAT Optimization. ForsChem Research
Reports, 7, 2022-13, 1-34. doi: 10.13140/RG.2.2.12703.02722.
[7] Hernandez, H. (2018). Comparison of Methods for the Reconstruction of Probability Density
Functions from Data Samples. ForsChem Research Reports, 3, 2018-12, 1-52. doi:
10.13140/RG.2.2.30177.35686.
[8] Hernandez, H. (2022). Standard Deterministic, Standard Random, and Randomistic Variables.
ForsChem Research Reports, 7, 2022-06, 1-18. doi: 10.13140/RG.2.2.36316.87688.
[9] Hernandez, H. and Ochoa, S. (2022). Adaptive Step-size One-at-a-time (OAT) Optimization.
ForsChem Research Reports, 7, 2022-12, 1-44. doi: 10.13140/RG.2.2.15208.14087.
