Numerical Determination of The Probability Density of Functions of Randomistic Variables

Vol. 7, 2022-15
Hugo Hernandez
ForsChem Research, 050030 Medellin, Colombia
hugo.hernandez@forschem.org
doi: 10.13140/RG.2.2.13204.99206
Abstract
The change of variable theorem is a useful equation for analytically determining the resulting
probability density of an arbitrary function of one or more independent randomistic variables.
The term randomistic may represent purely deterministic variables, purely random variables, or
a combination of both. The change of variable theorem requires an inverse of the original
function where one of the independent variables is explicitly solved in terms of all other
variables. Unfortunately, this is not always the case, and therefore the analytical change of
variable theorem cannot be used in those situations. In addition, when two or more
independent variables are involved, the analytical change of variable theorem requires solving
one or more definite integrals, and there are situations where the integrals cannot be
expressed as known analytical functions. In this report, a numerical version of the change of
variable theorem is presented for obtaining the probability density of a function, when the
analytical change of variable theorem does not succeed or cannot be used. The proposed
method is implemented in R language, and its use is illustrated with several examples. Most
examples considered are comparative, where the exact analytical solution is known, in order to
validate the performance of the method. Similitude percentages above 95% were obtained. Of
course, both the accuracy and computational demand of the numerical method will strongly
depend on the step sizes and tolerance considered. Additional examples are included to show
that numerical solutions are possible even when the analytical method fails.
Keywords
Change of Variable Theorem, Cumulative Probability, Distribution Functions, Monte Carlo,
Nonlinear Functions, Numerical Methods, Probability Density, Randomistics, Similitude
Cite as: Hernandez, H. (2022). Numerical Determination of the Probability Density of Functions of
Randomistic Variables. ForsChem Research Reports, 7, 2022-15, 1 - 40. doi: 10.13140/RG.2.2.13204.99206
Publication Date: 03/10/2022.
1. Introduction
A randomistic variable is a linear combination of a deterministic variable and a pure random
variable [1]. Randomistic variables are described by their corresponding probability density
function (pdf) if they are continuous, or by a finite probability density function if they are
discrete [2]. If the pdf of a randomistic variable is known, all its properties can be determined, including:
Expected value, variance (and standard deviation), moments, quantiles (including the median),
mode, etc. If one or more randomistic variables with known pdf are transformed by an
arbitrary function, it is possible to determine the pdf of the resulting variable using the Change
of Variable Theorem [3]. That is, the mathematical transformation of randomistic variables is
also randomistic. Even if the pdf’s of the original randomistic variables are relatively simple
functions (e.g. uniform distributions), an analytical expression for the pdf of the transformation
can be difficult and in some cases impossible to obtain. Perhaps it is for this reason that the
change of variable theorem is seldom used for solving science and engineering problems
involving randomistic variables.
The purpose of this report is the introduction of a simple numerical approach for determining
the pdf of an arbitrary function of one or more randomistic variables with known pdf, and using
such numerical pdf for determining any property of the resulting variable (expected value,
variance, moments, etc.). The derivation of this numerical approach is explained in Section 2
considering functions of single randomistic variables. In Section 3, the general numerical
method for nonlinear functions of multiple randomistic variables is presented. The
implementation of the method in R language (https://cran.r-project.org/) is shown in Section 4.
Finally, some illustrative examples are considered in Section 5.
2. Functions of a Single Randomistic Variable
Let us consider an arbitrary randomistic variable $X$ with a probability density function (pdf)
given by the mathematical function $\rho_X(x)$ with the following properties:
$$\rho_X(x) \ge 0 \tag{2.1}$$
$$\int_{-\infty}^{\infty} \rho_X(x)\,dx = 1 \tag{2.2}$$
Let us now consider the following arbitrary nonlinear transformation of $X$:
$$Y = f(X) \tag{2.3}$$
where $Y$ is also a randomistic variable, with pdf given by [3]:
$$\rho_Y(y) = \sum_k \rho_X\left(f_k^{-1}(y)\right) \left| \frac{d f_k^{-1}(y)}{dy} \right| \tag{2.4}$$
$f^{-1}$ represents the inverse function of $f$, corresponding to (from Eq. 2.3):
$$X = f^{-1}(Y) \tag{2.5}$$
Since the nonlinear function $f$ can be non-injective (multiple values of $X$ yield the same value of
$Y$), the inverse function may result in multiple solutions. Thus, the subscript $k$ represents each
of the possible values of $f^{-1}(y)$.
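As a short illustration of how the change of variable theorem handles a non-injective function (a worked example under assumed notation: $\rho_X$ and $\rho_Y$ denote the pdfs of the original and transformed variables, and $f_k^{-1}$ the $k$-th inverse branch), consider $Y = X^2$ with $X$ a standard normal variable. There are two inverse branches, and summing their contributions gives:

```latex
f_1^{-1}(y) = \sqrt{y}, \qquad f_2^{-1}(y) = -\sqrt{y}, \qquad
\left| \frac{d f_k^{-1}(y)}{dy} \right| = \frac{1}{2\sqrt{y}} \quad (y > 0)

\rho_Y(y) = \frac{\rho_X(\sqrt{y}) + \rho_X(-\sqrt{y})}{2\sqrt{y}}
          = \frac{e^{-y/2}}{\sqrt{2 \pi y}} \quad (y > 0)
```

This is the chi-square density with one degree of freedom, which is the distribution of the square of a standard normal variable.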
Eq. (2.4) is difficult to evaluate when the function is not analytically invertible. For example, if
we consider the relatively simple function:
( )
(2.6)
then we find that the corresponding inverse function cannot be described with known
analytical functions, and much less can we determine its derivative.
One possible approach is approximating the inverse using a series expansion about as
follows [4]:
( ) ( ) ( ) ( ) ( )
( ( )) ( ( ))
( ) ( ) ( )
( ) ( ) ( ) ( ) ( )( ) ( )
( ( ))
( )
( ( ))
( )
(2.7)
where
( ( )) ( ) ( ( )) ( )
( )
( )
(2.8)
This method requires that ( ) , and for certain functions it requires many terms in the
polynomial in order to provide a good representation of the inverse.

Let us now consider a numerical alternative to obtain $\rho_Y(y)$ using Eq. (2.4). First of all,
notice that Eq. (2.2) can only be valid as long as:
$$\lim_{x \to -\infty} \rho_X(x) = \lim_{x \to \infty} \rho_X(x) = 0 \tag{2.9}$$
otherwise the integral in Eq. (2.2) would not be finite. Thus, we can impose arbitrary limits on
the randomistic variable ($X \in [x_{min}, x_{max}]$) such that:
$$\int_{x_{min}}^{x_{max}} \rho_X(x)\,dx \ge 1 - \varepsilon \tag{2.10}$$
where $\varepsilon$ represents a small tolerance in the error for the total probability of the variable. If Eq.
(2.10) is not satisfied, the bounds of $X$ must be expanded until it becomes valid. The integral in
Eq. (2.10) is evaluated numerically using a small resolution $\Delta x$. In addition, a resolution in $Y$ will
also be required ($\Delta y$). $\varepsilon$, $\Delta x$, and $\Delta y$ are input parameters of the numerical method. As these
parameters become smaller, the accuracy of the method improves, but the computational
demand increases. A trade-off between accuracy and computational cost is always required.
Also, as with any numerical method, the results obtained will greatly depend on the parameter
values used.
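The bound-expansion step can be sketched in a few lines of R. This is a minimal illustration, not the implementation given later in this report: the helper names trap_prob and expand_bounds are hypothetical, the bounds are widened symmetrically by a fixed amount, and the trapezoidal rule is assumed for the integral.

```r
# Trapezoidal estimate of the probability contained in [lo, hi] with step dx
trap_prob <- function(rho, lo, hi, dx) {
  x <- seq(lo, hi, by = dx)
  y <- rho(x)
  sum(y) * dx - (y[1] + y[length(y)]) * dx / 2
}

# Hypothetical helper: widen the bounds until the contained probability
# is at least 1 - eps (or max_iter expansions have been tried)
expand_bounds <- function(rho, lo = 0, hi = 1, dx = 0.01, eps = 1e-6, max_iter = 60) {
  width <- hi - lo
  for (iter in seq_len(max_iter)) {
    if (trap_prob(rho, lo, hi, dx) >= 1 - eps) break
    lo <- lo - width
    hi <- hi + width
  }
  c(lo = lo, hi = hi)
}

# Example: bounds for the standard normal pdf, starting from [0, 1]
b <- expand_bounds(dnorm)
print(b)
```

Smaller values of eps or dx give wider bounds or finer grids, at the cost of more function evaluations.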
For each value $x_i$ of $X$ within the bounds $[x_{min}, x_{max}]$ (using the resolution $\Delta x$) we
must determine:
• The value of the probability density function $\rho_X(x_i)$.
• The value of the function $f(x_i)$.
• The numerical derivative of the function, $f'(x_i) \approx \frac{f(x_i + \Delta x) - f(x_i - \Delta x)}{2 \Delta x}$. A central finite
difference is considered, except for the extreme values where forward and backward
finite differences are used for the lower and upper bound, respectively. If the absolute value of
the numerical derivative is less than a small tolerance $\delta$, then we may set $|f'(x_i)| = \delta$ in
order to avoid handling extremely large values of the probability density function of $Y$.
For simplicity, we may assume that $\delta$ equals the probability tolerance $\varepsilon$ of Eq. (2.10).
If the value $y_j = f(x_i)$ is obtained for the first time (within the resolution $\Delta y$ considered), then:
$$\rho_Y(y_j) = \frac{\rho_X(x_i)}{\left| f'(x_i) \right|} \tag{2.11}$$
If the value $y_j$ (considering the resolution $\Delta y$) is obtained again, the probability density must be
accumulated:
$$\rho_Y(y_j) \leftarrow \rho_Y(y_j) + \frac{\rho_X(x_i)}{\left| f'(x_i) \right|} \tag{2.12}$$
Finally, the probability density values are normalized in order to strictly satisfy Eq. (2.2). The
vector of numerical values obtained for the pdf of $Y$ can be used to determine the properties of $Y$:
Expected value, variance, moments, etc. They can also be numerically integrated to yield the
cumulative probability function of $Y$. The vector obtained for the pdf of $Y$ can then be approximated by a
mathematical function using splines [5,6].
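The accumulate-and-normalize idea can be sketched as follows. Note the assumption: this is a simplified variant, not the procedure of Eqs. (2.11)-(2.12). Instead of weighting each point by the reciprocal of the numerical derivative, each grid point contributes its probability mass (density times step in x) to the nearest level of y. The helper name cvt_bin and its arguments are hypothetical.

```r
# Simplified variant (assumption): accumulate probability mass per y level
cvt_bin <- function(f, rho, xlim, dx, ylim, dy) {
  x <- seq(xlim[1], xlim[2], by = dx)
  y <- seq(ylim[1], ylim[2], by = dy)
  mass <- rho(x) * dx                      # probability carried by each x level
  j <- 1 + round((f(x) - ylim[1]) / dy)    # index of the nearest y level
  ok <- j >= 1 & j <= length(y)            # discard values outside ylim
  acc <- tapply(mass[ok], factor(j[ok], levels = seq_along(y)), sum)
  acc[is.na(acc)] <- 0                     # y levels that were never reached
  pdf <- as.numeric(acc) / dy              # mass -> density
  pdf <- pdf / (sum(pdf) * dy)             # normalize to unit total probability
  data.frame(Var = y, pdf = pdf)
}

# Example: square of a standard normal variable on y in [0, 10]
yPDF <- cvt_bin(function(x) x^2, dnorm, xlim = c(-6, 6), dx = 0.001,
                ylim = c(0, 10), dy = 0.1)
head(yPDF)
```

The resulting vector can be compared with dchisq(y, 1), keeping in mind that the first level (y = 0) is affected by the singularity of the exact density at the origin.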
3. Nonlinear Functions of Multiple Randomistic Variables
An arbitrary function of multiple randomistic variables can be described in general as follows:
$$Y = f(X_1, X_2, \ldots, X_n) \tag{3.1}$$
where the probability density function (pdf) of each randomistic variable in the function, $\rho_{X_i}(x_i)$,
is known.
In this case, the pdf of the function, obtained by the change of variable theorem [3], is:
$$\rho_Y(y) = \int \int \cdots \int \prod_{i=2}^{n} \left( \rho_{X_i}(x_i)\,dx_i \right) \sum_k \rho_{X_1}\left( f_k^{-1}(y, x_2, \ldots, x_n) \right) \left| \frac{\partial f_k^{-1}(y, x_2, \ldots, x_n)}{\partial y} \right| \tag{3.2}$$
where the inverse function $f^{-1}$ is obtained by solving Eq. (3.1) for $X_1$:
$$X_1 = f^{-1}(Y, X_2, \ldots, X_n) \tag{3.3}$$
When multiple randomistic variables are considered, the inversion of Eq. (3.1) might become
more difficult, particularly for certain variables. However, if at least one variable can be
explicitly solved, then the inversion problem can be solved.

The main issue in the case of multiple variables is the analytical complexity involved in the
multiple integrals emerging in Eq. (3.2). For this reason, a numerical approach is also desirable.
The numerical evaluation of the pdf of $Y$ in the multivariable case is similar to the single-variable
situation. First, each variable $X_i$ in the function is bounded satisfying the condition:
$$\int_{x_{i,min}}^{x_{i,max}} \rho_{X_i}(x_i)\,dx_i \ge 1 - \varepsilon \tag{3.4}$$
The error tolerance $\varepsilon$ is the same for all variables.
Each independent variable $X_i$ is partitioned in different levels between $[x_{i,min}, x_{i,max}]$ using a step
size or resolution $\Delta x_i$. Then, a multidimensional grid is obtained by considering all possible
combinations of the different levels of the independent variables. Notice that the number of
possible combinations may increase exponentially with the number of independent variables
considered. This will have an important effect on the computational load of the method. For
that reason, a maximum number of combinations to be evaluated, $N_{max}$, can be defined. If the
total number of combinations is less than or equal to $N_{max}$, then all possible combinations are
considered in the evaluation. If the total number of combinations is more than $N_{max}$, then a
random sample of $N_{max}$ combinations is selected.
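The choice between the full grid and a random sample can be sketched as below. This is an illustrative sketch only: build_points is a hypothetical helper, the full grid is built with expand.grid, and the random sample draws the level of each variable independently with replacement.

```r
# Hypothetical helper: evaluation points for a list of level vectors,
# one column per variable and one row per combination
build_points <- function(lv, nmax = 100000) {
  total <- prod(lengths(lv))               # number of possible combinations
  if (total <= nmax) {
    pts <- as.matrix(expand.grid(lv))      # full factorial grid
  } else {
    # random selection: nmax combinations, each level drawn independently
    pts <- sapply(lv, function(v) sample(v, size = nmax, replace = TRUE))
  }
  unname(pts)
}

lv <- list(seq(-4, 4, by = 0.5), seq(0, 1, by = 0.25))
grid <- build_points(lv)              # all 17 * 5 combinations
samp <- build_points(lv, nmax = 20)   # only 20 randomly selected combinations
dim(grid)
dim(samp)
```

With three variables at 1000 levels each, the full grid would already contain 10^9 points, which is why the cap on the number of combinations matters in practice.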
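For each particular combination $(x_1, x_2, \ldots, x_n)$ in the set of values considered we must determine:
• The value of the probability density function of each independent variable, $\rho_{X_i}(x_i)$.
• The function value $f(x_1, x_2, \ldots, x_n)$.
• The numerical derivative of the function with respect to the first independent variable,
$\frac{\partial f}{\partial x_1} \approx \frac{f(x_1 + \Delta x_1, x_2, \ldots, x_n) - f(x_1 - \Delta x_1, x_2, \ldots, x_n)}{2 \Delta x_1}$. If the absolute value of the numerical
derivative is less than a small tolerance $\delta$, then we may set $\left| \partial f / \partial x_1 \right| = \delta$ in order
to avoid handling extremely large values of the probability density function of $Y$.
If the value $y_j = f(x_1, x_2, \ldots, x_n)$ is obtained for the first time (within the resolution $\Delta y$ considered), then:
$$\rho_Y(y_j) = \frac{\rho_{X_1}(x_1) \prod_{i=2}^{n} \rho_{X_i}(x_i)\,\Delta x_i}{\left| \partial f / \partial x_1 \right|} \tag{3.5}$$
If the value $y_j$ (considering the resolution $\Delta y$) is obtained again, the probability density must be
accumulated:
$$\rho_Y(y_j) \leftarrow \rho_Y(y_j) + \frac{\rho_{X_1}(x_1) \prod_{i=2}^{n} \rho_{X_i}(x_i)\,\Delta x_i}{\left| \partial f / \partial x_1 \right|} \tag{3.6}$$
Finally, the probability density values are normalized in order to strictly satisfy Eq. (2.2).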
4. Algorithm Implementation
The numerical algorithm for determining the probability density and cumulative probability of
functions of randomistic variables by means of the Change of Variable Theorem was
implemented in R language (https://cran.r-project.org/). The R functions created for this
purpose are presented and briefly explained in this Section.
4.1. PEval
This function numerically evaluates the probability (P) of a variable with known probability
density function (rho) within a range of values (xlim), using an optional step size (dx). By
default, the step size is 1/1000 of the given range.
Usage: P=PEval(rho,xlim,dx)
Example: Evaluate the probability of a normal variable within 1 standard deviation about the
mean.
PEval(rho=dnorm,xlim=c(-1,1),dx=1e-4)
0.6826895
R Code:
PEval<-function(rho,xlim,dx=NULL){
if (is.null(dx)) dx=(xlim[2]-xlim[1])/1000
n=round((xlim[2]-xlim[1])/dx)
x=xlim[1]+(0:n)*dx
y=rho(x)
P=sum(y*dx)-(y[1]+y[n+1])*dx/2
return(P)
}
4.2. PDvector
This function creates a discrete probability density vector (PDv) from a known probability
density function (rho) within an optional range of values (xlim) using an optional step size (dx).
The default range is [0,1], and the default step size is 1/1000 of the initial range. The PD vector
guarantees that the full distribution is covered within an optional tolerance (tol). The default
tolerance is $10^{-6}$. If the total probability is less than $1 - \mathrm{tol}$, then the range of values is
automatically extended until the desired tolerance is reached.
Usage: PDv=PDvector(rho,xlim,dx,tol)
Example: Create a probability density vector for the standard normal random variable using a
step size of 1.
PDvector(rho=dnorm,dx=1)
Var pdf
1 -5 1.486720e-06
2 -4 1.338302e-04
3 -3 4.431848e-03
4 -2 5.399097e-02
5 -1 2.419707e-01
6 0 3.989423e-01
7 1 2.419707e-01
8 2 5.399097e-02
9 3 4.431848e-03
10 4 1.338302e-04
11 5 1.486720e-06
12 6 6.075883e-09
13 7 9.134720e-12
14 8 5.052271e-15
R Code:
PDvector<-function(rho,xlim=NULL,dx=NULL,tol=1e-6){
if (is.null(xlim)) xlim=c(0,1)
if (is.null(dx)) dx=(xlim[2]-xlim[1])/1000
P=PEval(rho,xlim,dx)
nlow=0.5
nhigh=0.5
while ((1-P)>tol){
nlow=2*nlow
exitflag=0
while (exitflag==0){
Pnew=PEval(rho,xlim=c(xlim[1]-nlow*dx,xlim[2]),dx)
if ((Pnew-P)<tol){
exitflag=1
} else {
xlim[1]=xlim[1]-nlow*dx
P=Pnew
nlow=2*nlow
if ((1-Pnew)<tol){
exitflag=1
}
}
nhigh=2*nhigh
exitflag=0
while (exitflag==0){
Pnew=PEval(rho,xlim=c(xlim[1],xlim[2]+nhigh*dx),dx)
if ((Pnew-P)<tol){
exitflag=1

} else {
xlim[2]=xlim[2]+nhigh*dx
P=Pnew
nhigh=2*nhigh
if ((1-Pnew)<tol){
exitflag=1
}
}
}
}
}
n=(xlim[2]-xlim[1])/dx
Var=xlim[1]+(0:n)*dx
pdf=rho(Var)
PDv=data.frame(Var,pdf)
return(PDv)
}
4.3. EValue
This function evaluates the expected value (E) of a function (fun) of a single variable with a
known probability density function (PDF). fun can be given as a single-variable R function, or as
a text describing the function of "x" in R language. PDF can be given as a function (rho) or as a
probability density vector (PDv) previously created with the function PDvector.
Usage: E=EValue(fun,PDF)
Examples:
Evaluate the expected value of the square standard normal distribution, using the probability
density as a function.
EValue(fun="x^2",PDF=dnorm)
0.9999944
Evaluate the same expected value again, but this time using a probability density vector with
step size 0.1.
EValue(fun="x^2",PDF=PDvector(dnorm,dx=0.1))
1
R Code:
EValue<-function(fun,PDF){
if (is.function(PDF)){
PDF=PDvector(PDF)
}
x=PDF$Var
rho=PDF$pdf
dx=x[2]-x[1]
n=length(x)
if (is.function(fun)==TRUE){
y=fun(x)
} else {

y=eval({x<-PDF$Var;parse(text=fun)})
}
E=sum(y*rho)*dx-(y[1]*rho[1]+y[n]*rho[n])*dx/2
return(E)
}
4.4. CDvector
This function creates a full cumulative distribution vector (CDv) from a known probability
density function (PDF). PDF can be given as a function (rho) or as a probability density vector
(PDv) previously created with the function PDvector. When the function rho is used, the values
for the initial limits (xlim), step size (dx) and tolerance (tol) can be optionally introduced.
Usage:
CDv=CDvector(PDF=rho,xlim,dx,tol)
CDv=CDvector(PDF=PDv)
Example: Obtain the cumulative probability vector from the probability density vector of the
standard normal distribution with a step size of 1.
CDvector(PDF=PDvector(dnorm,dx=1))
Var cdf
1 -5 7.433598e-07
2 -4 6.840183e-05
3 -3 2.351241e-03
4 -2 3.156265e-02
5 -1 1.795435e-01
6 0 5.000000e-01
7 1 8.204565e-01
8 2 9.684373e-01
9 3 9.976488e-01
10 4 9.999316e-01
11 5 9.999993e-01
12 6 1.000000e+00
13 7 1.000000e+00
14 8 1.000000e+00
R Code:
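CDvector<-function(PDF,xlim=NULL,dx=NULL,tol=1e-6){
# If a density function is given, build its probability density vector first
if (is.function(PDF)){
PDF=PDvector(PDF,xlim,dx,tol)
}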
Var=PDF$Var
dx=Var[2]-Var[1]
pdf=PDF$pdf
cdf=0*pdf
cdf[1]=pdf[1]*dx/2
for (i in 2:length(Var)){
cdf[i]=cdf[i-1]+(pdf[i-1]+pdf[i])*dx/2

}
cdf=cdf/max(cdf)
CDv=data.frame(Var,cdf)
return(CDv)
}
4.5. xfun
This is a family of functions used to obtain specific values from a particular distribution. The
functions are: dfun (probability density value), pfun (cumulative probability value), qfun
(quantile value), rfun (random value). These functions evaluate the corresponding value (d, p,
q, r) from a known probability distribution (PDF). PDF can be given as a function (rho) or as a
probability density vector (PDv) previously created with the function PDvector. If the function
rho is used, the initial limits (xlim), step size (dx) and tolerance (tol) are set as default.
Functions pfun, qfun and rfun use the function CDvector.
Usage:
d=dfun(q,PDF)
p=pfun(q,PDF)
q=qfun(p,PDF)
r=rfun(n,PDF)
Examples:
Evaluate the probability density function of the standard normal distribution at a value of $\pi/2$.
dfun(q=pi/2,PDF=dnorm)
0.1161772
Evaluate the cumulative probability of the standard normal distribution at $\pi/2$.

pfun(q=pi/2,PDF=dnorm)
0.9418852
Evaluate the inverse cumulative probability of the standard normal distribution at a value of 1/3.
qfun(p=1/3,PDF=dnorm)
-0.4307275
Generate 5 random numbers from a probability density vector of the standard normal
distribution (specific values will change every time the function is evaluated).
rfun(n=5,PDF=dnorm)
1.46327002 -1.24235830 -0.02832977 1.50078946 -0.35320490
R Code:
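# Probability density value at q (linear interpolation of the PD vector)
dfun<-function(q,PDF){
if (is.function(PDF)){
PDF=PDvector(PDF)
}
d=approx(PDF[,1],PDF[,2],xout=q)$y
return(d)
}
# Cumulative probability value at q
pfun<-function(q,PDF){
if (is.function(PDF)){
PDF=PDvector(PDF)
}
CDF=CDvector(PDF)
p=approx(CDF[,1],CDF[,2],xout=q)$y
return(p)
}
# Quantile value for the cumulative probability p (inverse interpolation)
qfun<-function(p,PDF){
if (is.function(PDF)){
PDF=PDvector(PDF)
}
CDF=CDvector(PDF)
q=approx(CDF[,2],CDF[,1],xout=p)$y
return(q)
}
# n random values obtained by inverse transform sampling
rfun<-function(n,PDF){
if (is.function(PDF)){
PDF=PDvector(PDF)
}
CDF=CDvector(PDF)
rp=runif(n)
r=approx(CDF[,2],CDF[,1],xout=rp)$y
return(r)
}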
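The quantile and random-value functions rely on inverting a tabulated cumulative probability by interpolation. The following self-contained sketch shows that idea in isolation. It is a simplified illustration that builds its own grid and cumulative vector, starting the cumulative sum at zero, rather than calling the functions listed above; the variable names are hypothetical.

```r
# Tabulate a density and its cumulative probability (trapezoidal rule)
dx <- 0.01
x <- seq(-6, 6, by = dx)
pdf <- dnorm(x)
cdf <- c(0, cumsum((head(pdf, -1) + tail(pdf, -1)) * dx / 2))
cdf <- cdf / max(cdf)                     # force the last value to be exactly 1

# Quantile: interpolate x as a function of the cumulative probability
q_med <- approx(cdf, x, xout = 0.5)$y

# Random values: feed uniform random numbers to the same inverse interpolation
set.seed(1)
r <- approx(cdf, x, xout = runif(1000))$y
summary(r)
```

Because the interpolation is linear, the accuracy of the quantiles depends on the step size of the tabulated vector, especially in the tails where the cumulative probability is nearly flat.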
4.6. PDSummary
This function summarizes the main properties of a variable with known probability density
function (PDF). PDF can be given as a function (rho) or as a probability density vector (PDv)
previously created with the function PDvector. When the function rho is used, the values for
the initial limits (xlim), step size (dx) and tolerance (tol) can be optionally introduced. The
properties include: Variance, standard deviation, raw moments, central moments, skewness,
kurtosis, quartiles, median and mode.
Usage:
OUT=PDSummary(PDF,moments)
Mn=rawmoment(n,PDF)
Mcn=centralmoment(n,PDF)
Var=PDFVar(PDF)
sigma=PDFsd(PDF)
Sk=Skewness(PDF)
Ku=Kurtosis(PDF)
Qn=PDFQ(n,PDF)
median=PDFmedian(PDF)
Mode=PDFmode(PDF)

Example: Determine the main properties of the standard normal distribution.

OUT=PDSummary(dnorm)
PDF Properties Summary
Central Tendency and Location Measures:
Mean: -7.870456e-07
Median: -1.889555e-07
Mode: 0
Q1: -0.67449
Q3: 0.6744895
Dispersion Measures:
IQ Range: 1.348979
Variance: 0.9999944
St.Dev.: 0.9999972
Coeff.Var.: -127057087 %
Other Measures:
Skewness: -1.912881e-05
Kurtosis: -0.0001266078
Distribution Moments:
Moment Raw Central
1 1 -7.870456e-07 -1.558859e-13
2 2 9.999944e-01 9.999944e-01
3 3 -2.148978e-05 -1.912865e-05
4 4 2.999840e+00 2.999840e+00
5 5 -5.875848e-04 -5.757798e-04
R Code:
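# n-th raw moment (trapezoidal integration of x^n times the density)
rawmoment<-function(n,PDF,xlim=NULL,dx=NULL,tol=1e-6){
if (is.function(PDF)){
PDF=PDvector(PDF,xlim,dx,tol)
}
x=PDF$Var
rho=PDF$pdf
dx=x[2]-x[1]
nx=length(x)
y=x^n
M=sum(y*rho)*dx-(y[1]*rho[1]+y[nx]*rho[nx])*dx/2
return(M)
}
# n-th central moment (about the expected value E)
centralmoment<-function(n,PDF,xlim=NULL,dx=NULL,tol=1e-6){
if (is.function(PDF)){
PDF=PDvector(PDF,xlim,dx,tol)
}
x=PDF$Var
rho=PDF$pdf
dx=x[2]-x[1]
nx=length(x)
E=sum(x*rho)*dx-(x[1]*rho[1]+x[nx]*rho[nx])*dx/2
y=(x-E)^n
M=sum(y*rho)*dx-(y[1]*rho[1]+y[nx]*rho[nx])*dx/2
return(M)
}
PDFVar<-function(PDF,xlim=NULL,dx=NULL,tol=1e-6){
if (is.function(PDF)){
PDF=PDvector(PDF,xlim,dx,tol)
}
V=centralmoment(2,PDF)
return(V)
}
PDFsd<-function(PDF,xlim=NULL,dx=NULL,tol=1e-6){
if (is.function(PDF)){
PDF=PDvector(PDF,xlim,dx,tol)
}
sigma=sqrt(PDFVar(PDF))
return(sigma)
}
# n-th quartile (n = 1, 2, 3)
PDFQ<-function(n,PDF,xlim=NULL,dx=NULL,tol=1e-6){
if (is.function(PDF)){
PDF=PDvector(PDF,xlim,dx,tol)
}
p=n/4
Q=qfun(p,PDF)
return(Q)
}
PDFmedian<-function(PDF,xlim=NULL,dx=NULL,tol=1e-6){
if (is.function(PDF)){
PDF=PDvector(PDF,xlim,dx,tol)
}
med=qfun(0.5,PDF)
return(med)
}
PDFmode<-function(PDF,xlim=NULL,dx=NULL,tol=1e-6){
if (is.function(PDF)){
PDF=PDvector(PDF,xlim,dx,tol)
}
Mode=PDF$Var[which(PDF$pdf==max(PDF$pdf))]
return(Mode)
}
Skewness<-function(PDF,xlim=NULL,dx=NULL,tol=1e-6){
if (is.function(PDF)){
PDF=PDvector(PDF,xlim,dx,tol)
}
Sk=centralmoment(3,PDF)/(PDFsd(PDF)^3)
return(Sk)
}
# Excess kurtosis (zero for a normal distribution)
Kurtosis<-function(PDF,xlim=NULL,dx=NULL,tol=1e-6){
if (is.function(PDF)){
PDF=PDvector(PDF,xlim,dx,tol)
}
Ku=centralmoment(4,PDF)/(PDFVar(PDF)^2)-3
return(Ku)
}
PDSummary<-function(PDF,moments=5,xlim=NULL,dx=NULL,tol=1e-6){
# Warnings are silenced here and restored before returning
defaultW <- getOption("warn")
options(warn = -1)
if (is.function(PDF)){
PDF=PDvector(PDF,xlim,dx,tol)
}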
Mean=EValue("x",PDF)
Median=PDFmedian(PDF)
Mode=PDFmode(PDF)
Q1=PDFQ(1,PDF)
Q3=PDFQ(3,PDF)
cat("PDF Properties Summary","\n","\n")
cat("Central Tendency and Location Measures:","\n")
cat(" Mean: ",Mean,"\n")
cat(" Median: ",Median,"\n")
cat(" Mode: ",Mode,"\n")
cat(" Q1: ",Q1,"\n")
cat(" Q3: ",Q3,"\n","\n")
Var=PDFVar(PDF)
Sd=PDFsd(PDF)
Cv=100*Sd/Mean
IQR=Q3-Q1
cat("Dispersion Measures:","\n")
cat(" IQ Range: ",IQR,"\n")
cat(" Variance: ",Var,"\n")
cat(" St.Dev.: ",Sd,"\n")
cat(" Coeff.Var.: ",Cv,"%","\n","\n")
Sk=Skewness(PDF)
Ku=Kurtosis(PDF)
cat("Other Measures:","\n")
cat(" Skewness: ",Sk,"\n")
cat(" Kurtosis: ",Ku,"\n","\n")
if (moments>0){
cat("Distribution Moments:","\n")
Moment=c()
Raw=c()
Central=c()
for (i in 1:moments){
Moment=c(Moment,i)
Raw=c(Raw,rawmoment(i,PDF))
Central=c(Central,centralmoment(i,PDF))
}
Moments=data.frame(Moment,Raw,Central)
print(Moments)
} else {
Moments=NULL
}
OUT=list(Mean,Median,Mode,Var,Sd,Cv,Sk,Ku,Moments)
names(OUT)=c("Mean","Median","Mode","Var","Stdev","CVar","Skewness","Kurtosis","Moments")
options(warn = defaultW)
return(OUT)
}
4.7. plotPDF
This function plots the probability density function (PDF) and cumulative probability function
(CDF) of a known probability density function (PDF). PDF can be given as a function (rho) or as a
probability density vector (PDv) previously created with the function PDvector. When the
function rho is used, the values for the initial limits (xlim), step size (dx) and tolerance (tol) can

be optionally introduced. The CDF is obtained using the CDvector function. If an optional second
PDF (PDF2) is used then both PDFs are shown in the same plot.
Usage:
plotPDF(PDF=rho,xlim,dx,tol)
plotPDF(PDF=PDv)
plotPDF(PDF=PDv,PDF2=PDv2)
Examples:
Plot the distribution functions for the standard normal distribution.
plotPDF(PDF=dnorm)
Plot the distribution functions for both the standard normal distribution and the standard
uniform distribution using the probability density vector with step size 0.01 in the range [-4,4]
(tol=0.001 is used to avoid changes in the limits).
plotPDF(PDF=PDvector(dnorm,xlim=c(-4,4),dx=0.01,tol=0.001),
PDF2=PDvector(dunif,xlim=c(-4,4),dx=0.01,tol=0.001))
R Code:
plotPDF<-function(PDF,PDF2=NULL,x=NULL,xlim=NULL,dx=NULL,tol=1e-6){
# If a density function is given, build its probability density vector first
if (is.function(PDF)){
PDF=PDvector(PDF,xlim,dx,tol)
}
CDF=CDvector(PDF)
Var=PDF$Var
pdf=PDF$pdf
cdf=CDF$cdf
par(mfrow=c(1,2))
if (is.null(PDF2)==FALSE){
if (is.function(PDF2)){
PDF2=PDvector(PDF2,xlim,dx,tol)
}
CDF2=CDvector(PDF2)
Var2=PDF2$Var
pdf2=PDF2$pdf
cdf2=CDF2$cdf
xlim=c(min(Var,Var2),max(Var,Var2))
ylim=c(0,max(pdf,pdf2,na.rm=TRUE))
ylim=c(0,min(max(pdf,pdf2),1/tol))
plot(Var,pdf,type="l",col="blue",xlab="Variable Value",ylab="Probability Density",xlim=xlim,ylim=ylim)
lines(Var2,pdf2,type="l",col="red")
plot(Var,cdf,type="l",col="blue",xlab="Variable Value",ylab="Cumulative Probability",xlim=xlim,ylim=c(0,1))
lines(Var2,cdf2,type="l",col="red",ylim=c(0,1))
legend(x="bottomright",legend=c("PDF1","PDF2"),col=c("blue","red"),lty=c(1,1))
} else {
plot(Var,pdf,type="l",col="blue",xlab="Variable Value",ylab="Probability Density")
plot(Var,cdf,type="l",col="blue",xlab="Variable Value",ylab="Cumulative Probability",ylim=c(0,1))
}
par(mfrow=c(1,1))
}
4.8. EvalCVT1
This function performs the numerical evaluation of the Change of Variable Theorem (CVT) for
determining the probability density (yPDF) of a variable (y) which is a function (fun) of a single
variable (x) with known probability density function (PDF). fun can be given as a single-variable
R function, or as a text describing the function of "x" in R language. PDF can be given as a
function (rho) or as a probability density vector (PDv) previously created with the function
PDvector. When the function rho is used, the values for the initial limits (xlim), step size (dx)
and tolerance (tol) can be optionally introduced. The range of values (ylim) and step size (dy) of
variable y can be optionally defined. A Boolean parameter (smooth) determines whether the
probability density of y is smoothed or not, using a geometric moving average. By default,
smooth is set as FALSE. The function also plots the resulting probability density and cumulative
probability of the function (using plotPDF). The function EValue is used for normalizing the
resulting yPDF.
Usage:
yPDF=EvalCVT1(fun,PDF=rho,ylim,dy,xlim,dx,tol,smooth)
yPDF=EvalCVT1(fun,PDF=PDv,ylim,dy,smooth)

Example: Determine the probability density function of the square standard normal
distribution using a step size of 0.1 for y, considering y in the range [0,10], without smoothing.
yPDF=EvalCVT1(fun="x^2",PDF=dnorm,ylim=c(0,10),dy=0.1,smooth=FALSE)
R Code:
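EvalCVT1<-function(fun,PDF,ylim=NULL,dy=NULL,xlim=NULL,dx=NULL,tol=1e-6,smooth=FALSE){
# If a density function is given, build its probability density vector first
if (is.function(PDF)){
PDF=PDvector(PDF,xlim,dx,tol)
}
rhox=PDF$pdf
x=PDF$Var
# fun may be an R function or a text expression in terms of "x"
if (is.function(fun)==TRUE){
f=fun(x)
} else {
f=eval({parse(text=fun)})
}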
dx=x[2]-x[1]
if (is.null(ylim)==TRUE){
f0=f[(is.infinite(f)==FALSE)&(is.na(f)==FALSE)&(rhox>0)]
ymin=min(f0)
ymax=max(f0)
} else {
ymin=ylim[1]
ymax=ylim[2]
}
yrange=ymax-ymin
if (is.null(dy)==TRUE){
dy=yrange/1000
} else {
dy=abs(dy)
}
ylim=c(dy*floor((ymin-0.1*yrange)/dy),dy*ceiling((ymax+0.1*yrange)/dy))
ny=round((ylim[2]-ylim[1])/dy)
y=ylim[1]+(0:ny)*dy
rhoy=0*(0:ny)
f1=dy*round(f/dy)
nx=length(x)
for (i in 1:nx){
j=1+round((f1[i]-ylim[1])/dy)
if ((j>=1) & (j<=(ny+1))){
if (i==1){
if ((abs(f1[i+1]-f1[i])/dx)>tol) {
rhoy[j]=rhoy[j]+rhox[i]*dx/abs(f1[i+1]-f1[i])
} else {
rhoy[j]=rhoy[j]+rhox[i]/tol
}
}
if (i>1 & i<nx){
if ((0.5*abs(f1[i+1]-f1[i-1])/dx)>tol) {
rhoy[j]=rhoy[j]+2*rhox[i]*dx/abs(f1[i+1]-f1[i-1])
} else {
rhoy[j]=rhoy[j]+rhox[i]/tol
}
}
if (i==nx){
if ((abs(f1[i]-f1[i-1])/dx)>tol) {
rhoy[j]=rhoy[j]+rhox[i]*dx/abs(f1[i]-f1[i-1])
} else {
rhoy[j]=rhoy[j]+rhox[i]/tol
}
}
}
}
if (smooth==TRUE){
rhoy0=c(0,rhoy[1:(length(rhoy)-1)])
rhoy2=c(rhoy[2:(length(rhoy))],0)
pdf=(rhoy0*rhoy^2*rhoy2)^(1/4)
} else {
pdf=rhoy
}
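Var=y
yPDF=data.frame(Var,pdf)
# Normalize so that the total probability equals 1, then rebuild the vector
pdf=pdf/EValue("x^0",yPDF)
yPDF=data.frame(Var,pdf)
plotPDF(yPDF)
return(yPDF)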
}
4.9. EvalCVT
This function performs the numerical evaluation of the Change of Variable Theorem (CVT) for
determining the probability density (yPDF) of a variable (y) which is a function (fun) of one or
more variables (x), each of them with a known probability density function (PDF). fun can be
given as a single-vector R function, or as a text describing the function of the x vector ("x[i, ]")
in R language (where the row i represents the variable number and the columns represent the
different combinations considered in the analysis). PDF can be given as a list of functions (rho)
or as a list of probability density vectors (PDv) previously created with the function PDvector.
When functions (rho) are used, the values for the initial limits (xlim) and step sizes (dx) can be
optionally introduced as lists. Also, the overall tolerance (tol) can be defined. The range of
values (ylim) and step size (dy) of variable y can be optionally defined. A Boolean parameter
(smooth) determines whether the probability density of y is smoothed or not, using a
geometric moving average. In the code listed below, smooth defaults to FALSE. The function also
plots the resulting probability density and cumulative probability of the function (using plotPDF). The
function EValue is used for normalizing the resulting PDF. If a single independent variable is
used, the function EvalCVT1 is employed. If the number of possible combinations is larger than
nmax, then the numerical evaluation is performed by a Monte Carlo method (random
selection) using only nmax random evaluations of the multivariate function. By default, a
maximum of 100,000 combinations is considered.
Usage:
yPDF=EvalCVT(fun,PDF=rho,ylim,dy,xlim,dx,tol,smooth,nmax)
yPDF=EvalCVT(fun,PDF=PDv,ylim,dy,smooth,nmax)
Examples:
Determine the probability density function of the sum of squares of three standard normal
distributions using a step size of 0.1 for x and 0.1 for y, considering y in the range [0,50],
without smoothing. The PD vectors are limited to the range [-4,4] with a tolerance of $10^{-4}$.
xPDv=PDvector(dnorm,xlim=c(-4,4),dx=0.1,tol=1e-4)
yPDF=EvalCVT(fun="x[1,]^2+x[2,]^2+x[3,]^2",PDF=list(xPDv,xPDv,xPDv)
,ylim=c(0,50),dy=0.1,smooth=FALSE)
Determine the probability density function of the previous example using the smoothing
function this time.
yPDF=EvalCVT(fun="x[1,]^2+x[2,]^2+x[3,]^2",PDF=list(xPDv,xPDv,xPDv)
,ylim=c(0,50),dy=0.1,smooth=TRUE)
R Code:
EvalCVT<-function(fun,PDF,ylim=NULL,dy=NULL,xlim=NULL,dx=NULL,tol=1e-6,smooth=FALSE,nmax=100000){
if (is.data.frame(PDF)==TRUE | is.function(PDF)==TRUE){
yPDF=EvalCVT1(fun=fun,PDF=PDF,ylim=ylim,dy=dy,xlim=xlim,dx=dx,tol=tol,smooth=smooth)
} else {
nvar=length(PDF)
if (nvar==1){
yPDF=EvalCVT1(fun=fun[[1]],PDF=PDF[[1]],ylim=ylim,dy=dy,xlim=xlim,dx=dx,tol=tol,smooth=smooth)
} else {
if (is.list(xlim)==FALSE){
if (is.null(xlim)==TRUE){
xlim=list(c(0,1))
for (i in 2:nvar){
xlim[[i]]=c(0,1)
}
} else {
xlim0=xlim
xlim=list(xlim)
for (i in 2:nvar){
xlim[[i]]=xlim0
}
}
} else {
if (length(xlim)<nvar){
for (i in (length(xlim)+1):nvar){
xlim[[i]]=c(0,1)
}
}
}
if (is.list(dx)==FALSE){
if (is.null(dx)==TRUE){
dx=list((xlim[[1]][2]-xlim[[1]][1])/1000)
for (i in 2:nvar){
dx[[i]]=(xlim[[i]][2]-xlim[[i]][1])/1000
}
} else {
dx0=dx
dx=list(dx)
for (i in 2:nvar){
dx[[i]]=dx0
}
}
} else {
if (length(dx)<nvar){
for (i in (length(dx)+1):nvar){
dx[[i]]=(xlim[[i]][2]-xlim[[i]][1])/1000
}
}
}
rhox=list(c())
xvar=list(c())
nx=c()
pnx=1
for (i in 1:nvar){

if (is.function(PDF[[i]])){
PDF[[i]]=PDvector(PDF[[i]],xlim[[i]],dx[[i]],tol)
}
rhox[[i]]=PDF[[i]]$pdf
xvar[[i]]=PDF[[i]]$Var
nx=c(nx,length(xvar[[i]]))
pnx=pnx*nx[i]
}
x1=c()
if (pnx>nmax){
for (i in 1:nvar){
x1=c(x1,sample(xvar[[i]],size=nmax,replace=TRUE))
}
x1=matrix(x1,nrow=nvar,ncol=nmax,byrow=TRUE)
} else {
pnx0=1
for (i in 1:nvar){
x1=c(x1,rep(xvar[[i]],each=pnx0,times=pnx/(pnx0*nx[i])))
pnx0=pnx0*nx[i]
}
x1=matrix(x1,nrow=nvar,ncol=pnx,byrow=TRUE)
}
x0=x1
x0[1,]=x0[1,]-dx[[1]]
x2=x1
x2[1,]=x2[1,]+dx[[1]]
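# Evaluate the function at x1 and at x1 shifted by -dx and +dx in the first variable
if (is.function(fun)==TRUE){
f0=fun(x0)
f1=fun(x1)
f2=fun(x2)
} else {
x=x0
f0=eval({parse(text=fun)})
x=x1
f1=eval({parse(text=fun)})
x=x2
f2=eval({parse(text=fun)})
}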
if (is.null(ylim)==TRUE){
fnew=f1[(is.infinite(f1)==FALSE)&(is.na(f1)==FALSE)]
ymin=min(fnew)
ymax=max(fnew)
} else {
ymin=ylim[1]
ymax=ylim[2]
}
yrange=ymax-ymin
if (is.null(dy)==TRUE){
dy=yrange/1000
} else {
dy=abs(dy)
}
ylim=c(dy*floor((ymin-0.1*yrange)/dy),dy*ceiling((ymax+0.1*yrange)/dy))
ny=round((ylim[2]-ylim[1])/dy)
y=ylim[1]+(0:ny)*dy
f0=dy*round(f0/dy)
f1=dy*round(f1/dy)
f2=dy*round(f2/dy)
rhoy=0*(0:ny)

Pry=c()
for (k in 1:ncol(x1)){
Prx=1
for (i in 1:nvar){
Prx=Prx*rhox[[i]][which(xvar[[i]]==x1[i,k])]*dx[[i]]
}
df=abs(f2[k]-f0[k])/2
if (df>tol*dx[[1]]){
Pry=c(Pry,Prx/df)
} else {
Pry=c(Pry,Prx/(tol*dx[[1]]))
}
j=1+round((f1[k]-ylim[1])/dy)
if ((j>=1) & (j<=(ny+1))){
rhoy[j]=rhoy[j]+Pry[k]
}
}
if (smooth==TRUE){
rhoy0=c(0,rhoy[1:(length(rhoy)-1)])
rhoy2=c(rhoy[2:(length(rhoy))],0)
pdf=(rhoy0*rhoy^2*rhoy2)^(1/4)
} else {
pdf=rhoy
}
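Var=y
yPDF=data.frame(Var,pdf)
# Normalize so that the total probability equals 1, then rebuild the vector
pdf=pdf/EValue("x^0",yPDF)
yPDF=data.frame(Var,pdf)
plotPDF(yPDF)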
}
}
return(yPDF)
}
4.10. similitude
This function determines the percentage of similitude (simil) between two probability density
functions (PDF1 and PDF2). Both PDFs can be given as functions (rho) or as probability density
vectors (PDv) previously created with the function PDvector. When PDF1 is a function (rho1), the
values for the limits (xlim) can be optionally introduced. If PDF1 is a PD vector, its corresponding
x limit values are used for the calculations. The default number of steps (intervals) used for the
calculation is 100000. This function is an adaptation from a previous work [7].
Usage:
simil=similitude(PDF1=rho1,PDF2=rho2,xlim,nsteps)
simil=similitude(PDF1=PDv1,PDF2=rho2)
simil=similitude(PDF1=rho1,PDF2=PDv2,xlim,nsteps)
simil=similitude(PDF1=PDv1,PDF2=PDv2)
Example: Determine the similitude between the smoothed probability density vector obtained
in the previous Section and the $\chi^2$ distribution with 3 degrees of freedom.

,ylim=c(0,50),dy=0.1,smooth=TRUE)
dchisq3<-function(x){
y=dchisq(x,3)
y[which(is.na(y)==TRUE)]=0
return(y)
}
similitude(PDF1=yPDF,PDF2=dchisq3)
Similitude (%)
97.43182
R Code:
similitude<-function(PDF1,PDF2,xlim=c(-10,10),nsteps=100000){
  # x limits of PDF1 (only known when PDF1 is a PD vector)
  if (is.function(PDF1)==TRUE){
    xlim1=NULL
  } else {
    xlim1=c(min(PDF1[,1]),max(PDF1[,1]))
  }
  # Evaluation range, depending on the type of PDF2
  if (is.function(PDF2)==TRUE){
    if (is.null(xlim1)==TRUE){
      xmin=xlim[1]
      xmax=xlim[2]
    } else {
      xmin=xlim1[1]
      xmax=xlim1[2]
    }
  } else {
    if (is.null(xlim1)==TRUE){
      xmin=min(PDF2[,1])
      xmax=max(PDF2[,1])
    } else {
      xlim2=c(min(PDF2[,1]),max(PDF2[,1]))
      xmin=max(xlim1[1],xlim2[1])
      xmax=min(xlim1[2],xlim2[2])
    }
  }
  # Common evaluation grid
  i=1:(nsteps+1)
  x=xmin+(xmax-xmin)*(i-1)/nsteps
  Var=x
  # Density of PDF1 on the grid
  if (is.function(PDF1)==TRUE){
    f1=match.fun(PDF1)
    rho1=f1(x)
  } else {
    rho1=approx(PDF1[,1],PDF1[,2],xout=x)$y
  }
  pdf=rho1
  PDF1=data.frame(Var,pdf)
  # Density of PDF2 on the grid
  if (is.function(PDF2)==TRUE){
    f2=match.fun(PDF2)
    rho2=f2(x)
  } else {
    rho2=approx(PDF2[,1],PDF2[,2],xout=x)$y
  }
  pdf=rho2
  PDF2=data.frame(Var,pdf)
  # Overlap of both densities relative to their average area
  rhomin=pmin(rho1,rho2)
  simil=200*sum(rhomin,na.rm=TRUE)/(sum(rho1,na.rm=TRUE)+sum(rho2,na.rm=TRUE))
  names(simil)=c("Similitude (%)")
  plotPDF(PDF1,PDF2=PDF2)
  return(simil)
}
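The similitude measure itself is just the overlap of the two densities relative to their average area. A minimal sketch for two density functions, assuming a hypothetical helper `simil_grid` that reuses the same ratio but skips the PD vector handling and the plot, could be:

```r
# Hypothetical helper: similitude (%) of two density functions on a uniform grid
simil_grid <- function(rho1, rho2, xlim = c(-10, 10), nsteps = 100000) {
  x  <- xlim[1] + (xlim[2] - xlim[1]) * (0:nsteps) / nsteps
  r1 <- rho1(x)
  r2 <- rho2(x)
  200 * sum(pmin(r1, r2), na.rm = TRUE) /
    (sum(r1, na.rm = TRUE) + sum(r2, na.rm = TRUE))
}

s_same  <- simil_grid(dnorm, dnorm)                          # identical densities
s_shift <- simil_grid(dnorm, function(x) dnorm(x, mean = 1)) # unit shift in the mean
```

Identical densities give 100%. For two unit-variance normal densities whose means differ by 1, the overlap area is $2\Phi(-0.5)$, so the similitude should be close to 61.7%.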
5. Selected Examples
In this Section, a selection of representative examples is considered for illustrative purposes.
Some of these examples were solved analytically using the Change of Variable Theorem in a
previous report [3].
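5.1. $Y = 1/X$, $X \sim U(0,1)$
The first example is the reciprocal function of a type III standard uniform random variable
(limited between $0$ and $1$). The exact analytical probability density function is [3]:
$$\rho_Y(y) = \begin{cases} \dfrac{1}{y^2}, & y \ge 1 \\ 0, & y < 1 \end{cases} \tag{5.1}$$
and the corresponding cumulative probability function is:
$$P(Y \le y) = \begin{cases} 1 - \dfrac{1}{y}, & y \ge 1 \\ 0, & y < 1 \end{cases} \tag{5.2}$$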
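Both expressions can be recovered in one step from the univariate change of variable theorem. The following sketch assumes the notation $\rho_X$ and $\rho_Y$ for the densities of the independent variable and of its reciprocal (symbols chosen here for illustration):

```latex
\begin{align*}
x &= \frac{1}{y}, \qquad \left|\frac{dx}{dy}\right| = \frac{1}{y^2} \\
\rho_Y(y) &= \rho_X\!\left(\frac{1}{y}\right)\left|\frac{dx}{dy}\right|
           = 1 \cdot \frac{1}{y^2}, \qquad y \ge 1 \\
P(Y \le y) &= \int_1^y \frac{ds}{s^2} = 1 - \frac{1}{y}, \qquad y \ge 1
\end{align*}
```

The restriction $y \ge 1$ comes from $0 < x \le 1$, where the uniform density equals 1.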
The numerical CVT and subsequent comparison with the exact result is performed as follows:

yPDF=EvalCVT("1/x",dunif,ylim=c(1,100),dy=0.1,dx=0.00001)
Ex51<-function(x){
y=x^(-2)
y[which(x<1)]=0
return(y)
}
similitude(PDF1=yPDF,PDF2=Ex51)
Similitude (%)
95.60502
The graphical output generated in R is presented in Figure 1, which compares the numerical
results with the exact solution given in Eq. (5.1).
Figure 1. Graphical comparison of probability density and cumulative probability functions
between the numerical and exact solutions of the change of variable theorem for $Y = 1/X$, where
$X$ is a type III standard uniform random variable.
The numerical results obtained are strongly influenced by the parameter values considered
(step sizes, limits and tolerance). In this example, the similitude between the numerical and
exact probability density functions is satisfactory (95.6%).
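One parameter effect that can be quantified directly is the upper limit `ylim[2]=100` used in this example. The sketch below evaluates the exact cumulative probability of Eq. (5.2) at that limit and cross-checks it with a plain Monte Carlo sample; the seed and sample size are arbitrary choices, and this check is independent of EvalCVT.

```r
# Exact cumulative probability of the reciprocal of a standard uniform variable
pY <- function(y) ifelse(y >= 1, 1 - 1 / y, 0)

tail_mass <- 1 - pY(100)        # probability beyond the upper limit ylim[2] = 100
set.seed(1)                     # arbitrary seed, for reproducibility only
ysim <- 1 / runif(1e5)          # Monte Carlo sample of the reciprocal
frac_above <- mean(ysim > 100)  # empirical fraction beyond the limit
```

About 1% of the probability lies beyond the plotted range, which may account for part of the difference between the numerical and exact densities.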
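5.2. $Y = 1/X$, $X \sim U(2,10)$
The second example is a modified version of the previous example, considering a variable
uniformly distributed between $2$ and $10$. The exact solution to this problem is [3]:
$$\rho_Y(y) = \begin{cases} \dfrac{1}{8y^2}, & \tfrac{1}{10} \le y \le \tfrac{1}{2} \\ 0, & \text{otherwise} \end{cases} \tag{5.3}$$
$$P(Y \le y) = \begin{cases} 0, & y < \tfrac{1}{10} \\ \dfrac{10 - 1/y}{8}, & \tfrac{1}{10} \le y \le \tfrac{1}{2} \\ 1, & y > \tfrac{1}{2} \end{cases} \tag{5.4}$$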
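A quick sanity check of Eq. (5.3) is to confirm that it integrates to one. The sketch below uses base R `integrate`; the function name `rho52` is chosen here for illustration only.

```r
# Density of Eq. (5.3): 1/(8 y^2) between 1/10 and 1/2, zero elsewhere
rho52 <- function(y) {
  r <- 1 / (8 * y^2)
  r[y < 1/10 | y > 1/2] <- 0
  r
}
area   <- integrate(rho52, lower = 1/10, upper = 1/2)$value           # total probability
mean_y <- integrate(function(y) y * rho52(y), 1/10, 1/2)$value        # expected value
```

The area should be 1, and the expected value should match $\ln(5)/8 \approx 0.201$, which follows from integrating $y/(8y^2)$ over the same range.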
Figure 2 summarizes the results obtained using the following lines of code:
xPDF=EvalCVT("2+8*x",dunif,ylim=c(0,12),dy=0.0001,dx=0.000001,
smooth=TRUE)
yPDF=EvalCVT("1/x",xPDF,ylim=c(0,0.6),dy=0.001,smooth=TRUE)
Ex52<-function(x){
y=1/(8*x^2)
y[which(x<1/10)]=0
y[which(x>1/2)]=0
return(y)
}
similitude(PDF1=yPDF,PDF2=Ex52)
Similitude (%)
99.26432
Better results were obtained in this case, because the possible values of $Y$ were limited to a
finite range. The error involved in the numerical solution can be further reduced by decreasing
the step sizes, but this will also increase the computational demand of the method. Also notice
that a linear transformation of the standard uniform distribution was the first step in the
numerical solution.
Figure 2. Graphical comparison of probability density and cumulative probability functions
between the numerical and exact solutions of the change of variable theorem for $Y = 1/X$, where
$X$ is a uniform random variable between 2 and 10.

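5.3. $Y = 1/X$, $X$ standard deterministic
In this example, the independent variable is the standard deterministic variable [8]. The
probability density function of the standard deterministic variable is:
$$\rho_X(x) = \delta(x-1) = \begin{cases} \infty, & x = 1 \\ 0, & x \ne 1 \end{cases} \tag{5.5}$$
where $\delta$ is Dirac’s delta function.
The cumulative probability of $X$ is:
$$P(X \le x) = H(x-1) = \begin{cases} 0, & x < 1 \\ 1, & x \ge 1 \end{cases} \tag{5.6}$$
where $H$ is Heaviside’s step function.
The analytical result of the change of variable theorem for $Y = 1/X$ is $\rho_Y(y) = \delta(y-1)$ [8]. The numerical
solution can be obtained by approximating the standard deterministic variable using the
standard uniform random variable as follows:
$$X \approx 1 - \frac{\epsilon}{2} + \epsilon U \tag{5.7}$$
where $U$ is the standard uniform random variable and $0 < \epsilon \ll 1$. The numerical results obtained
assuming $\epsilon = 0.001$ are shown in Figure 3.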
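The quality of this approximation can be judged from the density of the approximating variable. The sketch below assumes the symbols $\epsilon$ for the width of the uniform window and $U$ for the standard uniform variable, and reads the value $0.001$ from the code used in this example:

```latex
\begin{align*}
\rho_X(x) &\approx \begin{cases} 1/\epsilon, & 1 - \epsilon/2 \le x \le 1 + \epsilon/2 \\ 0, & \text{otherwise} \end{cases} \\
E[X] &= 1, \qquad \operatorname{Var}(X) = \frac{\epsilon^2}{12} \\
\epsilon = 0.001 &\;\Rightarrow\; \sigma_X = \frac{\epsilon}{\sqrt{12}} \approx 2.9 \times 10^{-4}
\end{align*}
```

As $\epsilon \to 0$ the window collapses onto $x = 1$ while its area stays equal to one, which is the behaviour expected from Dirac's delta function.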

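Figure 3. Graphical comparison of probability density and cumulative probability functions
between the numerical and exact solutions of the change of variable theorem for $Y = 1/X$, where
$X$ is a standard deterministic variable.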

The corresponding code used was the following:

xPDF=EvalCVT("1-0.001/2+0.001*x",dunif,ylim=c(0.5,1.5),
dy=0.0001,dx=0.00001)
yPDF=EvalCVT("1/x",xPDF,ylim=c(0.5,1.5),dy=0.0001)
similitude(PDF1=yPDF,PDF2=xPDF)
Similitude (%)
100
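5.4. $Y = |X|$, $X \sim N(0,1)$
The following example considers the absolute value of a standard normal random variable. The
exact solution to this problem, using the univariate change of variable theorem, is [3]:
$$\rho_Y(y) = \begin{cases} \sqrt{\dfrac{2}{\pi}}\, e^{-y^2/2}, & y \ge 0 \\ 0, & y < 0 \end{cases} \tag{5.8}$$
$$P(Y \le y) = \begin{cases} \operatorname{erf}\!\left(\dfrac{y}{\sqrt{2}}\right), & y \ge 0 \\ 0, & y < 0 \end{cases} \tag{5.9}$$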
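Eqs. (5.8) and (5.9) can be cross-checked against the standard normal functions available in base R. The sketch below assumes the usual identity between the error function and `pnorm`, since base R has no built-in `erf`; the function names are illustrative.

```r
rho54 <- function(y) ifelse(y >= 0, sqrt(2 / pi) * exp(-y^2 / 2), 0)  # Eq. (5.8)
erf   <- function(z) 2 * pnorm(z * sqrt(2)) - 1                       # error function via pnorm
P54   <- function(y) ifelse(y >= 0, erf(y / sqrt(2)), 0)              # Eq. (5.9)

area   <- integrate(rho54, 0, Inf)$value   # total probability of Eq. (5.8)
p1     <- P54(1)                           # cumulative probability at y = 1
p1_ref <- pnorm(1) - pnorm(-1)             # probability of -1 <= X <= 1 for standard normal X
```

The area should be 1, and both probabilities should coincide (about 0.683).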

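Figure 4. Graphical comparison of probability density and cumulative probability functions
between the numerical and exact solutions of the change of variable theorem for $Y = |X|$,
where $X$ is a standard normal random variable.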
Figure 4 summarizes the satisfactory results obtained using the following lines of code:
yPDF=EvalCVT("abs(x)",PDvector(dnorm,xlim=c(-5,5),dx=0.0001),
ylim=c(0,5),dy=0.01,smooth=TRUE)
Ex54<-function(x){
y=sqrt(2/pi)*exp(-(x^2)/2)
y[which(x<0)]=0
return(y)
}
similitude(PDF1=yPDF,PDF2=Ex54)
Similitude (%)
99.47906
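5.5. $Y = X_1 + X_2$, $X_1, X_2 \sim U(0,1)$
This is a multivariate example. In this case, variable $Y$ is the result of adding two independent
type III standard uniform variables. The corresponding solution using the multivariate change
of variable theorem is [3]:
$$\rho_Y(y) = \begin{cases} y, & 0 \le y \le 1 \\ 2 - y, & 1 < y \le 2 \\ 0, & \text{otherwise} \end{cases} \tag{5.10}$$
$$P(Y \le y) = \begin{cases} 0, & y < 0 \\ \dfrac{y^2}{2}, & 0 \le y \le 1 \\ 1 - \dfrac{(2-y)^2}{2}, & 1 < y \le 2 \\ 1, & y > 2 \end{cases} \tag{5.11}$$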
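For a sum of two independent variables, the multivariate change of variable theorem reduces to a convolution integral, which can be evaluated numerically as a check of Eq. (5.10). The helper `conv_density` below is hypothetical and assumes that both densities share the same support `[lower, upper]`.

```r
# Density of a sum of two independent variables by direct convolution
conv_density <- function(y, rho1 = dunif, rho2 = dunif, lower = 0, upper = 1) {
  a <- max(lower, y - upper)   # range of x1 where both densities are non-zero
  b <- min(upper, y - lower)
  if (a >= b) return(0)
  integrate(function(x1) rho1(x1) * rho2(y - x1), a, b)$value
}
vals <- sapply(c(0.5, 1, 1.5), conv_density)
```

According to Eq. (5.10), the three values should be 0.5, 1 and 0.5.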
Now, the numerical solution can be obtained as follows:

yPDF=EvalCVT("x[1,]+x[2,]",PDF=list(dunif,dunif),ylim=c(0,2),
dy=0.05)
Ex55<-function(x){
y=x
y[which(x>1)]=2-y[which(x>1)]
y[which(x<0 | x>2)]=0
return(y)
}
similitude(PDF1=yPDF,PDF2=Ex55)
Similitude (%)
99.24617
The comparative results are shown in Figure 5. The computational demand of this method
increases exponentially with the number of variables involved in the function. For this reason,
the Monte Carlo method is preferred over the full grid of possible combinations. Nevertheless,
a satisfactory result (over 99% similitude) was again obtained. The exact analytical solution to
this particular problem is not so straightforward, and it requires plenty of effort to obtain the
result. For highly nonlinear functions, and for a larger number of variables, the analytical
complexity of the solution increases, and in some cases an exact solution may not be available
for comparison.
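The growth of the full grid can be made concrete with a small sketch. The number of levels `nx` and the evaluation budget `nmax` below are illustrative values, not the defaults of EvalCVT; the comparison mirrors the `if (pnx>nmax)` test of the listing, and the sampling mirrors its use of `sample` with replacement.

```r
# Number of grid combinations versus a fixed evaluation budget
nx     <- 1001                     # levels per variable (illustrative value)
nvar   <- 1:4
pnx    <- nx^nvar                  # full-grid size for 1 to 4 variables
nmax   <- 1e5                      # hypothetical cap on evaluations
use_mc <- pnx > nmax               # TRUE where random sampling replaces the full grid

# Monte Carlo alternative for two variables: nmax random level combinations
set.seed(1)                        # arbitrary seed
lev <- seq(0, 1, length.out = nx)
x1  <- rbind(sample(lev, nmax, replace = TRUE),
             sample(lev, nmax, replace = TRUE))
```

With these values the full grid already exceeds the budget for two variables (about $10^6$ combinations), while the sampled matrix keeps a fixed size of `nvar` rows by `nmax` columns.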

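Figure 5. Graphical comparison of probability density and cumulative probability functions
between the numerical and exact solutions of the change of variable theorem for $Y = X_1 + X_2$,
where $X_1$ and $X_2$ are type III standard uniform variables.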
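5.6. $Y = \sqrt{X_1^2 + X_2^2}$, $X_1, X_2 \sim N(0,1)$
This example also involves two variables, but in this case they are standard normal random
variables. The analytical solution is the following [3]:
$$\rho_Y(y) = \begin{cases} y\, e^{-y^2/2}, & y \ge 0 \\ 0, & y < 0 \end{cases} \tag{5.12}$$
$$P(Y \le y) = \begin{cases} 1 - e^{-y^2/2}, & y \ge 0 \\ 0, & y < 0 \end{cases} \tag{5.13}$$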
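One way to see where Eqs. (5.12) and (5.13) come from is to integrate the joint density of the two standard normal variables over a disc of radius $y$ in polar coordinates. This is a sketch of that route, with $r$ and $\theta$ introduced here as the polar coordinates of $(X_1, X_2)$:

```latex
\begin{align*}
P(Y \le y) &= \iint_{x_1^2 + x_2^2 \le y^2} \frac{1}{2\pi}\, e^{-(x_1^2 + x_2^2)/2}\, dx_1\, dx_2 \\
           &= \int_0^{2\pi}\!\!\int_0^{y} \frac{1}{2\pi}\, e^{-r^2/2}\, r\, dr\, d\theta
            = 1 - e^{-y^2/2}, \qquad y \ge 0 \\
\rho_Y(y)  &= \frac{d}{dy} P(Y \le y) = y\, e^{-y^2/2}, \qquad y \ge 0
\end{align*}
```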
The numerical solution can be obtained as follows (limiting the standard normal distributions
to the range [-5,5]):
yPDF=EvalCVT("sqrt(x[1,]^2+x[2,]^2)",PDF=list(dnorm,dnorm),
ylim=c(0,7),dy=0.1,smooth=TRUE)
Ex56<-function(x){
y=x*exp(-0.5*x^2)
y[which(x<0)]=0
return(y)
}
similitude(PDF1=yPDF,PDF2=Ex56)
Similitude (%)
98.60999

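Figure 6. Graphical comparison of probability density and cumulative probability functions
between the numerical and exact solutions of the change of variable theorem for
$Y = \sqrt{X_1^2 + X_2^2}$, where $X_1$ and $X_2$ are standard normal variables.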
Figure 6 again shows a satisfactory prediction of both distribution functions, with a
similitude over 98%. Since the Monte Carlo method is used, slightly different results may be
obtained in different runs.
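Because the listing of EvalCVT draws its random combinations with `sample`, fixing the seed of R's random number generator before each call should make a run repeatable. The sketch below only demonstrates the base R mechanism; the seed value is arbitrary, and it is an assumption that EvalCVT uses no other source of randomness.

```r
set.seed(2022)                                   # arbitrary seed value
a <- sample(seq(-5, 5, by = 0.01), size = 5, replace = TRUE)
set.seed(2022)                                   # same seed, same draws
b <- sample(seq(-5, 5, by = 0.01), size = 5, replace = TRUE)
same <- identical(a, b)                          # the two samples coincide
```

Calling `set.seed` with a fixed value immediately before EvalCVT would then be expected to reproduce the same similitude in every run.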
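5.7. $Y = \sqrt{X_1^2 + X_2^2 + X_3^2}$, $X_1, X_2, X_3 \sim N(0,1)$
The next example corresponds to the Maxwell-Boltzmann distribution involving three
independent variables. The solution to this problem is [3]:
$$\rho_Y(y) = \begin{cases} \sqrt{\dfrac{2}{\pi}}\, y^2 e^{-y^2/2}, & y \ge 0 \\ 0, & y < 0 \end{cases} \tag{5.14}$$
$$P(Y \le y) = \begin{cases} \operatorname{erf}\!\left(\dfrac{y}{\sqrt{2}}\right) - \sqrt{\dfrac{2}{\pi}}\, y\, e^{-y^2/2}, & y \ge 0 \\ 0, & y < 0 \end{cases} \tag{5.15}$$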
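Since the sum of three squared standard normal variables follows a chi-squared distribution with 3 degrees of freedom, Eq. (5.15) can be checked against base R `pchisq`. In this sketch the function name `P57` is illustrative and the error function is written in terms of `pnorm`.

```r
P57 <- function(y) {                      # Eq. (5.15), for y >= 0
  erf <- function(z) 2 * pnorm(z * sqrt(2)) - 1
  erf(y / sqrt(2)) - sqrt(2 / pi) * y * exp(-y^2 / 2)
}
y       <- c(0.5, 1, 2, 3)
maxdiff <- max(abs(P57(y) - pchisq(y^2, df = 3)))   # squared norm is chi-squared, 3 d.o.f.
```

The maximum difference should be at the level of floating-point rounding.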
The numerical solution to this problem can be obtained as follows:

yPDF=EvalCVT("sqrt(x[1,]^2+x[2,]^2+x[3,]^2)",
PDF=list(dnorm,dnorm,dnorm),ylim=c(0,9),dy=0.1,smooth=TRUE,nmax=500000)
Ex57<-function(x){
y=sqrt(2/pi)*x^2*exp(-0.5*x^2)
y[which(x<0)]=0
return(y)
}
similitude(PDF1=yPDF,PDF2=Ex57)
Similitude (%)
98.91535

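Figure 7. Graphical comparison of probability density and cumulative probability functions
between the numerical and exact solutions of the change of variable theorem for
$Y = \sqrt{X_1^2 + X_2^2 + X_3^2}$, where $X_1$, $X_2$ and $X_3$ are standard normal variables.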
The performance of the numerical solution can be observed in Figure 7. This problem required
increasing the maximum number of evaluations in order to obtain a satisfactory performance
(over 98% similitude using 500,000 evaluations).
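5.8. $Y = X_1 / \sqrt{X_1^2 + X_2^2 + X_3^2}$, $X_1, X_2, X_3 \sim N(0,1)$
The last comparative example represents the contribution of a single velocity component to
the overall velocity of a particle, assuming a standard normal distribution of each velocity
component. The analytical solution to this problem is [3]:
$$\rho_Y(y) = \begin{cases} \dfrac{1}{2}, & |y| \le 1 \\ 0, & \text{otherwise} \end{cases} \tag{5.16}$$
$$P(Y \le y) = \begin{cases} 0, & y < -1 \\ \dfrac{y + 1}{2}, & -1 \le y \le 1 \\ 1, & y > 1 \end{cases} \tag{5.17}$$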
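The uniform result of Eq. (5.17) can be checked independently with a direct Monte Carlo simulation using base R only. The seed and the sample size in this sketch are arbitrary choices.

```r
set.seed(1)                                # arbitrary seed
n    <- 1e5
x    <- matrix(rnorm(3 * n), nrow = 3)     # rows: the three velocity components
y    <- x[1, ] / sqrt(colSums(x^2))        # first component over the overall velocity
Fhat <- ecdf(y)                            # empirical cumulative probability
q    <- c(-0.5, 0, 0.5)
err  <- max(abs(Fhat(q) - (q + 1) / 2))    # deviation from Eq. (5.17)
```

With this sample size, the deviation from the exact cumulative probability is expected to be of the order of a few thousandths.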
Even though the resulting distribution is relatively simple (a uniform distribution), the analytical
solution of the integrals from the change of variable theorem is not so easily obtained. In fact,
modified Bessel functions of the second kind must be integrated during the solution process.
The numerical solution to this problem can be obtained using the following code in R:
yPDF=EvalCVT("x[1,]/sqrt(x[1,]^2+x[2,]^2+x[3,]^2)",
PDF=list(dnorm,dnorm,dnorm),ylim=c(-1,1),dy=0.01,smooth=TRUE,nmax=200000)
Ex58<-function(x){
y=1/2+0*x
y[which(abs(x)>1)]=0
return(y)
}
similitude(PDF1=yPDF,PDF2=Ex58)
Similitude (%)
95.1479

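Figure 8. Graphical comparison of probability density and cumulative probability functions
between the numerical and exact solutions of the change of variable theorem for
$Y = X_1 / \sqrt{X_1^2 + X_2^2 + X_3^2}$, where $X_1$, $X_2$ and $X_3$ are standard normal variables.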
Figure 8 summarizes the numerical results obtained. In this case, the probability density
function obtained is noticeably noisy, despite using the smoothing option.
Such noise leads to a decrease in the similitude percentage with respect to previous examples.
However, the similitude value of 95.1% obtained is still satisfactory. This is clearly illustrated by
the cumulative probability function, whose exact solution is closely represented by the
numerical result.
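The smoothing option in the EvalCVT listing replaces each density value by a weighted geometric mean of itself and its two neighbours. The sketch below isolates that formula in a hypothetical helper `geo_smooth` and applies it to an artificial noisy sequence (the numbers are made up for illustration).

```r
# Weighted geometric-mean smoothing, as in the smooth==TRUE branch of the listing
geo_smooth <- function(rhoy) {
  rhoy0 <- c(0, rhoy[-length(rhoy)])     # left neighbour (0 beyond the edge)
  rhoy2 <- c(rhoy[-1], 0)                # right neighbour (0 beyond the edge)
  (rhoy0 * rhoy^2 * rhoy2)^(1/4)
}
noisy    <- c(0.5, 0.4, 0.6, 0.5, 0.45, 0.55, 0.5)   # artificial noisy density values
smoothed <- geo_smooth(noisy)
```

The interior values fluctuate less after smoothing, but a single pass only damps the noise rather than removing it. Note also that, being a geometric mean, the formula returns zero for any value that has an empty neighbour.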

5.9. $Y = X + \ln(1 + X)$, $X \sim U(0,1)$
This problem was already proposed in Section 2 as an example of a function without explicit
inverse. Thus, the analytical evaluation of the change of variable theorem is unfeasible.
However, the numerical evaluation is possible. The probability density of the function can be
obtained using:
yPDF=EvalCVT("x+log(1+x)",PDF=PDvector(dunif,xlim=c(0,1),dx=0.00001
),ylim=c(0,1.7),dy=0.01,smooth=TRUE)
Figure 9. Results obtained for the probability density and cumulative probability functions using
the numerical solution of the change of variable theorem for $Y = X + \ln(1 + X)$, where $X$ is a
type III standard uniform variable.
These results are graphically presented in Figure 9. The probability density and cumulative
probability of the resulting transformation can be calculated using the functions dfun and pfun.
Similarly, quantile values and random numbers can be extracted from the distribution using
qfun and rfun, respectively. In addition, the main properties of this distribution are summarized
as follows:
OUT=PDSummary(yPDF)
Mean: 0.8847004
Median: 0.9036036
Mode: 1.67
Q1: 0.4751766
Q3: 1.304247
IQ Range: 0.82907
Variance: 0.2319373
St.Dev.: 0.4815987
Coeff.Var.: 54.43636 %

Other Measures:
Skewness: -0.09633249
Kurtosis: -1.179385
Moment Raw Central
1 1 0.8847004 4.884981e-17
2 2 1.0146322 2.319373e-01
3 3 1.2972752 -1.076041e-02
4 4 1.7616890 9.793984e-02
5 5 2.4862137 -1.083074e-02
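Although the function has no explicit inverse, it is monotonic on $(0,1)$, so the summary above can be cross-checked with base R by inverting it numerically. This sketch uses `uniroot` for the inversion and `integrate` for the moments; the names `f`, `finv` and `rhoY` are illustrative and not part of the author's code.

```r
f    <- function(x) x + log(1 + x)
finv <- function(y) uniroot(function(x) f(x) - y, c(0, 1), tol = 1e-12)$root
rhoY <- function(y) {                    # rho_X(x)/|f'(x)| with rho_X = 1 on (0,1)
  x <- sapply(y, finv)
  1 / (1 + 1 / (1 + x))
}
mean_exact <- 2 * log(2) - 1/2                   # E[X] + E[log(1+X)] in closed form
mean_num   <- integrate(f, 0, 1)$value           # same expectation, numerically
area       <- integrate(rhoY, f(0), f(1))$value  # total probability of the result
```

The expected value is about 0.886, close to the mean of 0.885 reported above. The density grows from 1/2 at the lower end to 2/3 at the upper end of the range, which is consistent with a mode near the upper limit.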
5.10. $Y = X_1 \cos(X_2) + X_2 \sin(X_1)$, $X_1, X_2 \sim N(0,1)$
As an example of a multivariate function without an explicit inverse function on any variable, we
have the following situation:
$$Y = X_1 \cos(X_2) + X_2 \sin(X_1) \tag{5.18}$$
Particularly, let us consider that the independent variables are standard normal random
variables in the range $[-2\pi, 2\pi]$. The numerical solution is in this case:
yPDF=EvalCVT("x[1,]*cos(x[2,])+x[2,]*sin(x[1,])",PDF=list(PDvector(
dnorm,xlim=c(-2*pi,2*pi),dx=0.001),PDvector(dnorm,xlim=c(-2*pi,
2*pi),dx=0.001)),ylim=c(-4*pi,4*pi),dy=0.05,smooth=TRUE)
Figure 10. Results obtained for the probability density and cumulative probability functions
using the numerical solution of the change of variable theorem for $Y = X_1 \cos(X_2) + X_2 \sin(X_1)$,
where $X_1$ and $X_2$ are standard normal variables.

Figure 10 summarizes the results obtained for the probability density and cumulative probability.
Again, these functions can be approximated using dfun and pfun, respectively. The main
properties of this distribution are summarized as follows:
OUT=PDSummary(yPDF)
Mean: 0.005943125
Median: -0.003117462
Mode: 0
Q1: -0.590475
Q3: 0.6110587
IQ Range: 1.201534
Variance: 0.9934216
St.Dev.: 0.9967054
Coeff.Var.: 16770.73 %
Other Measures:
Skewness: -0.003262576
Kurtosis: 0.4976511
Moment Raw Central
1 1 0.005943125 1.308963e-17
2 2 0.993456879 9.934216e-01
3 3 0.014481859 -3.230435e-03
4 4 3.451918044 3.451784e+00
5 5 -0.007062769 -1.096356e-01
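Even without an exact density, the first two moments reported above can be cross-checked by plain Monte Carlo simulation. For untruncated standard normal variables the mean of this function is exactly 0 and its variance exactly 1, because $E[\cos^2 X] + E[\sin^2 X] = 1$ and the cross term vanishes by symmetry. The seed and sample size in this sketch are arbitrary.

```r
set.seed(1)                                    # arbitrary seed
n    <- 2e5
x1   <- rnorm(n)
x2   <- rnorm(n)
keep <- abs(x1) <= 2 * pi & abs(x2) <= 2 * pi  # same range as the PDvector limits
y    <- (x1 * cos(x2) + x2 * sin(x1))[keep]
m    <- mean(y)                                # expected to be near 0
v    <- var(y)                                 # expected to be near 1
```

These values are consistent with the mean of 0.006 and the variance of 0.993 obtained from the numerical change of variable theorem.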
5.11. $Y = 1 + \cos(XY/3)$, $X \sim N(0,1)$
The last example involves an implicit function. An approximate explicit function must first
be created by numerically solving the implicit function. Here, the solution is
obtained by optimization, using the OAToptim function introduced in an
earlier report [9]. The approximate function is then generated using cubic spline regression (rspline)
[6]. The code used for creating such an approximate function is the following:
nlfun<-function(y){
load(file="temp.dat")
f=(y-1-cos(x*y/3))^2
return(f)
}
X=-2*pi+0.02*pi*(0:200)
y=c()
for (i in 1:length(X)){
x=X[i]
save(x,file="temp.dat")

OUT=OAToptim(nlfun,x0=1,display=FALSE,optmode='min')
y=c(y,OUT[[1]])
}
plot(X,y)
appfun=rspline(y,X)[[1]]
xPDF=PDvector(dnorm,xlim=c(-2*pi,2*pi),dx=0.0001,tol=0.0001)
yPDF=EvalCVT("appfun(x)",PDF=xPDF,ylim=c(0,2),dy=0.01,smooth=TRUE)
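For readers without access to OAToptim, the same implicit relation can also be solved with the base R root finder `uniroot`. This is an alternative sketch, not the author's procedure. It relies on the residual being negative at $y = 0$ and positive at $y = 2.5$ for every value of $x$, so a root is always bracketed; if several roots existed in the bracket, `uniroot` would return only one of them.

```r
g <- function(y, x) y - 1 - cos(x * y / 3)       # residual of the implicit relation
solve_y <- function(x) {
  uniroot(function(y) g(y, x), lower = 0, upper = 2.5, tol = 1e-10)$root
}
X   <- -2 * pi + 0.02 * pi * (0:200)             # same grid of x values as in the code above
y   <- sapply(X, solve_y)
res <- max(abs(g(y, X)))                         # largest residual over the grid
```

The resulting pairs `(X, y)` could then be passed to the spline regression step in place of the values found by optimization.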
Figure 11 shows the resulting numerical solution found by optimization, and the cubic spline
approximate explicit function of $Y$ in terms of $X$. Figure 12 shows the distribution functions
obtained for $Y$.
Figure 11. Optimal cubic spline regression model describing the explicit dependence of variable
$Y$ on variable $X$. $R^2_{adj} = 0.9999954$. Number of segments = 10.
Figure 12. Results obtained for the probability density and cumulative probability functions
using the numerical solution of the change of variable theorem for the implicit function
$Y = 1 + \cos(XY/3)$, where $X$ is a standard normal variable in the range $[-2\pi, 2\pi]$.

The main properties of this distribution are:

OUT=PDSummary(yPDF)
Mean: 1.836634
Median: 1.885314
Mode: 1.99
Q1: 1.754617
Q3: 1.958811
IQ Range: 0.2041937
Variance: 0.02320143
St.Dev.: 0.1523201
Coeff.Var.: 8.293442 %
Other Measures:
Skewness: -1.246654
Kurtosis: 1.173768
Moment Raw Central
1 1 1.836634 2.897552e-16
2 2 3.396425 2.320143e-02
3 3 6.318808 -4.405730e-03
4 4 11.818099 2.246765e-03
5 5 22.206903 -9.201951e-04
6. Conclusion
In the present work, a numerical approximation of the change of variable theorem for
determining the probability density of a function of one or more randomistic variables has been
proposed. The method was implemented in R language, and it was exemplified considering
different illustrative functions. Most functions considered were compared with the exact
analytical solution of the change of variable theorem, resulting in similitude values higher than
95%. It was also shown that functions for which the analytical solution to the change of variable
theorem cannot be obtained were successfully solved by the numerical approach. The
resulting probability density function can be used for determining any property of the function,
including: Expected value, raw and central moments, variance, standard deviation, skewness,
kurtosis, quartiles, etc.

Acknowledgment and Disclaimer
This report provides data, information and conclusions obtained by the author(s) as a result of original
scientific research, based on the best scientific knowledge available to the author(s). The main purpose
of this publication is the open sharing of scientific knowledge. Any mistake, omission, error or inaccuracy
published, if any, is completely unintentional.
This research did not receive any specific grant from funding agencies in the public, commercial, or not-
for-profit sectors.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC
4.0). Anyone is free to share (copy and redistribute the material in any medium or format) or adapt
(remix, transform, and build upon the material) this work under the following terms:
• Attribution: Appropriate credit must be given, providing a link to the license, and indicating if
changes were made. This can be done in any reasonable manner, but not in any way that
suggests endorsement by the licensor.
• NonCommercial: This material may not be used for commercial purposes.
References
[1] Hernandez, H. (2022). Standard Deterministic, Standard Random, and Randomistic Variables.
ForsChem Research Reports, 7, 2022-06, 1-18. doi: 10.13140/RG.2.2.36316.87688.
[2] Hernandez, H. (2020). On the Discreteness of Measured Variables and the Continuous
Approximation. ForsChem Research Reports, 5, 2020-20, 1-18. doi: 10.13140/RG.2.2.27740.00646.
[3] Hernandez, H. (2017). Multivariate Probability Theory: Determination of Probability Density
Functions. ForsChem Research Reports, 2, 2017-13, 1-13. doi: 10.13140/RG.2.2.28214.60481.
[4] Hernandez, H. (2020). Approximate Function Inversion by Series Expansions. ForsChem Research
Reports, 5, 2020-04, 1-11. doi: 10.13140/RG.2.2.36280.70406.
[5] Hernandez, H. (2020). Reconstructing Probability Distributions using Quantile-based Splines.
ForsChem Research Reports, 5, 2020-21, 1-23. doi: 10.13140/RG.2.2.14827.36645.
[6] Hernandez, H. (2022). Cubic Spline Regression using OAT Optimization. ForsChem Research
Reports, 7, 2022-13, 1-34. doi: 10.13140/RG.2.2.12703.02722.
[7] Hernandez, H. (2018). Comparison of Methods for the Reconstruction of Probability Density
Functions from Data Samples. ForsChem Research Reports, 3, 2018-12, 1-52. doi:
10.13140/RG.2.2.30177.35686.
[8] Hernandez, H. (2022). Standard Deterministic, Standard Random, and Randomistic Variables.
[9] Hernandez, H. and Ochoa, S. (2022). Adaptive Step-size One-at-a-time (OAT) Optimization.


03/10/2022 ForsChem Research Reports Vol. 7, 2022-15
