Modeling: Introduction

Douglas A. Plaza Guingla, Ph.D.
Doctoral Program in Electrical Engineering

Faculty of Electrical and Computing Engineering (FIEC)
Escuela Superior Politécnica del Litoral (ESPOL)
June 15, 2021
douplaza@espol.edu.ec (ESPOL) Modelado Matemático 1 / 52

Outline
1 Probability Fundamentals
2 Random Variables
3 Systems of Random Variables
4 Random Processes

Concepts of probability
If a geologist is quoted as saying that “there is a 60 percent chance of oil in a
certain region,” we all probably have some intuitive idea as to what is being
said. Indeed, most of us would probably interpret this statement in one of two
possible ways:
Frequency Interpretation
The geologist feels that, over the long run, in 60 percent of the regions whose
outward environmental conditions are very similar to the conditions that prevail
in the region under consideration, there will be oil.
Subjective Interpretation
The geologist believes that it is more likely that the region will contain oil than it
is that it will not; in fact, 0.6 is a measure of the geologist’s belief in the
hypothesis that the region will contain oil.

Frequency interpretation
The probability of a given outcome of an experiment is considered as being a
“property” of that outcome. It is imagined that this property can be operationally
determined by continual repetition of the experiment. The probability of the
outcome will then be observable as being the proportion of the experiments
that result in the outcome. This is the interpretation of probability that is most
prevalent among scientists.
Subjective Interpretation
The probability of an outcome is considered a statement about the beliefs of
the person who is quoting the probability, concerning the chance that the
outcome will occur. Thus, in this interpretation, probability becomes a
subjective or personal concept and has no meaning outside of expressing
one’s degree of belief. This interpretation of probability is often favored by
philosophers and certain economic decision makers.

Regardless of which interpretation one gives to probability, however, there is a
general consensus that the mathematics of probability is the same in either
case.
For instance, if you think that the probability that it will rain tomorrow is .3 and
you feel that the probability that it will be cloudy but without any rain is .2, then
you should feel that the probability that it will either be cloudy or rainy is .5
independently of your individual interpretation of the concept of probability.

Sample Space and Events
The set of all possible outcomes of an experiment is known as the sample
space of the experiment and is denoted by S.
Example 1:
Experiment: determination of the sex of a newborn child.
Sample Space S = {g = girl, b = boy}.
Boy and girl are the only possible outcomes, so together they make up S.
Example 2:
Experiment: Running of a race among the seven horses having post
positions 1, 2,3,4,5,6,7.
S = {all orderings of (1, 2, 3, 4, 5, 6, 7)}.
Outcome (2, 3, 1, 4, 5, 6, 7) means that horse number 2 arrived in the first
position, then horse number 3, then horse number 1 and so on.
Example 3:
Experiment: Determine the amount of dosage that must be given to a
patient until he/she reacts positively.
S = (0, ∞)
Events
Any subset E of the sample space is known as an event. That is, an event is a
set consisting of possible outcomes of the experiment. If the outcome of the
experiment is contained in E, then we say that E has occurred.
In Example 1:
if E = {g}, then E is the event that the child is a girl. Similarly, if F = {b}, then F is
the event that the child is a boy.
In Example 2:
if E = {all outcomes in S starting with a 3}
then E is the event that the number 3 horse wins the race.

Union of Events
For any two events E and F of a sample space S, we define the new event E ∪
F , called the union of the events E and F, to consist of all outcomes that are
either in E or in F or in both E and F.
That is, the event E ∪ F will occur if either E or F occurs.
For instance, in Example 1 if E = {g} and F = {b}, then E ∪ F = { g , b}. That is,
E ∪ F would be the whole sample space S.
In Example 2 if E = {all outcomes starting with 6} is the event that the number 6
horse wins and F = {all outcomes having 6 in the second position} is the event
that the number 6 horse comes in second, then E ∪ F is the event that the
number 6 horse comes in either first or second.

Intersection of Events
For any two events E and F, we may also define the new event EF, called the
intersection of E and F, to consist of all outcomes that are in both E and F. That
is, the event EF will occur only if both E and F occur.
For instance, in Example 3 if E = (0, 5) is the event that the required dosage is
less than 5 and F = (2, 10) is the event that it is between 2 and 10, then EF =
(2, 5) is the event that the required dosage is between 2 and 5.
In Example 2 if E = {all outcomes ending in 5} is the event that horse number 5
comes in last and F = {all outcomes starting with 5} is the event that horse
number 5 comes in first, then the event EF does not contain any outcomes and
hence cannot occur. To give such an event a name, we shall refer to it as the
null event and denote it by ∅. Thus ∅ refers to the event consisting of no
outcomes.
If EF = ∅, implying that E and F cannot both occur, then E and F are said to be
mutually exclusive.

Complement and Contained
For any event E, we define the event E^c, referred to as the complement of E,
to consist of all outcomes in the sample space S that are not in E. That is, E^c
will occur if and only if E does not occur.
In Example 1 if E = {b} is the event that the child is a boy, then E^c = {g} is the
event that it is a girl. Also note that since the experiment must result in some
outcome, it follows that S^c = ∅.
For any two events E and F, if all of the outcomes in E are also in F, then we say
that E is contained in F and write E ⊂ F (or equivalently, F ⊃ E ). Thus if E ⊂ F
, then the occurrence of E necessarily implies the occurrence of F.
If E ⊂ F and F ⊂ E , then we say that E and F are equal (or identical) and we
write E = F.
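The event operations above map directly onto set operations. The following is a minimal sketch for the horse race of Example 2, assuming outcomes are encoded as tuples of finishing positions; the variable names are illustrative and not part of the original slides.

```python
from itertools import permutations

# Sample space of Example 2: every finishing order of horses 1..7.
S = set(permutations(range(1, 8)))

# Events are subsets of S.
first_6 = {s for s in S if s[0] == 6}    # horse 6 wins
second_6 = {s for s in S if s[1] == 6}   # horse 6 comes in second
first_5 = {s for s in S if s[0] == 5}    # horse 5 wins
last_5 = {s for s in S if s[-1] == 5}    # horse 5 comes in last

union = first_6 | second_6               # E ∪ F: horse 6 is first or second
intersection = first_5 & last_5          # EF: horse 5 both first and last
complement = S - first_6                 # E^c: horse 6 does not win

print(len(S))                            # size of the sample space (7!)
print(intersection == set())             # True: the events are mutually exclusive
print(first_6 <= union)                  # True: E is contained in E ∪ F
```

Enumerating all outcomes is only practical for small sample spaces such as this one.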

Union and Intersection of more than two Events
The union of the events E1, E2, ..., En, denoted either by E1 ∪ E2 ∪ ... ∪ En or
by $\bigcup_{i=1}^{n} E_i$, is defined to be the event consisting of all outcomes that are in Ei for
at least one i = 1, 2, ..., n.
Similarly, the intersection of the events Ei , i = 1, 2, . . . , n, denoted by
E1 E2 . . . En , is defined to be the event consisting of those outcomes that are in
all of the events Ei , i = 1, 2, . . . , n.
In other words, the union of the Ei occurs when at least one of the events Ei
occurs; the intersection occurs when all of the events Ei occur.

Axioms of Probability
AXIOM 1
0 ≤ P(E) ≤ 1
The probability that the outcome of the experiment is contained in E is some
number between 0 and 1.
AXIOM 2
P(S) = 1
With probability 1, the outcome will be a member of the sample space S.
AXIOM 3
$P\left(\bigcup_{i=1}^{n} E_i\right) = \sum_{i=1}^{n} P(E_i), \quad n = 1, 2, \ldots, \infty$
For any set of mutually exclusive events, the probability that at least one of
these events occurs is equal to the sum of their respective probabilities.
Probability Propositions
Proposition 1: P (E c ) = 1 − P (E )
Proposition 1 states that the probability that an event does not occur is 1 minus
the probability that it does occur. For instance, if the probability of obtaining a
head on the toss of a coin is 3/8, the probability of obtaining a tail must be 5/8.
Proposition 2: P (E ∪ F ) = P (E ) + P (F ) − P (EF )
A total of 28 percent of American males smoke cigarettes, 7 percent smoke
cigars, and 5 percent smoke both cigars and cigarettes. What percentage of
males smoke neither cigars nor cigarettes?
Solution: Let E be the event that a randomly chosen male smokes cigars and F the
event that he smokes cigarettes. Then
P(E ∪ F) = P(E) + P(F) − P(EF) = 0.07 + 0.28 − 0.05 = 0.3
Thus the probability that the person is not a smoker is 0.7, implying that 70
percent of American males smoke neither cigarettes nor cigars.
Odds of event A: $\frac{P(A)}{P(A^c)} = \frac{P(A)}{1 - P(A)}$
If P(A) = 3/4, then the odds are 3; consequently, it is 3 times as likely that A occurs as
it is that it does not.
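The two propositions and the odds formula can be checked with exact fractions. This is a small sketch; the function names are chosen here for illustration only.

```python
from fractions import Fraction

def complement(p):
    """Proposition 1: P(E^c) = 1 - P(E)."""
    return 1 - p

def union(p_e, p_f, p_ef):
    """Proposition 2: P(E ∪ F) = P(E) + P(F) - P(EF)."""
    return p_e + p_f - p_ef

def odds(p):
    """Odds of an event: P(A) / (1 - P(A)); assumes P(A) < 1."""
    return p / (1 - p)

# Smoking example: 7% cigars, 28% cigarettes, 5% both.
p_smoker = union(Fraction(7, 100), Fraction(28, 100), Fraction(5, 100))
p_non_smoker = complement(p_smoker)

print(complement(Fraction(3, 8)))  # 5/8
print(p_non_smoker)                # 7/10
print(odds(Fraction(3, 4)))        # 3
```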

Conditional Probability
In this section, we introduce one of the most important concepts in all of
probability theory: that of conditional probability. Its importance is twofold. In
the first place, we are often interested in calculating probabilities when some
partial information concerning the result of the experiment is available, or in
recalculating them in light of additional information. In such situations, the
desired probabilities are conditional ones. Second, as a kind of a bonus, it
often turns out that the easiest way to compute the probability of an event is to
first “condition” on the occurrence or nonoccurrence of a secondary event. As
an illustration of a conditional probability, suppose that one rolls a pair of dice.
The sample space S of this experiment can be taken to be the following set of
36 outcomes:
S = {(i , j ), i = 1, 2, 3, 4, 5, 6, j = 1, 2, 3, 4, 5, 6}

where we say that the outcome is (i, j) if the first die lands on side i and the
second on side j. Suppose now that each of the 36 possible outcomes is
equally likely to occur and thus has probability 1/36. (In such a situation we say
that the dice are fair.) Suppose further that we observe that the first die lands
on side 3. Then, given this information, what is the probability that the sum of
the two dice equals 8? To calculate this probability, we reason as follows:
Given that the initial die is a 3, there can be at most 6 possible outcomes of our
experiment, namely, (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), and (3, 6). In addition,
because each of these outcomes originally had the same probability of
occurring, they should still have equal probabilities. That is, given that the first
die is a 3, the (conditional) probability of each of the outcomes
(3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6) is 1/6, whereas the (conditional)
probability of the other 30 points in the sample space is 0. Since (3, 5) is the only
one of these outcomes whose sum is 8, the desired probability will be 1/6.

If we let E and F denote, respectively, the event that the sum of the dice is 8
and the event that the first die is a 3, then the probability just obtained is called
the conditional probability of E given that F has occurred, and is denoted by:
P (E |F )
A general formula for P (E |F ) that is valid for all events E and F is derived in the
same manner as just described. Namely, if the event F occurs, then in order for
E to occur it is necessary that the actual occurrence be a point in both E and F;
that is, it must be in EF.
Now, since we know that F has occurred, it follows that F becomes our new
(reduced) sample space and hence the probability that the event EF occurs will
equal the probability of EF relative to the probability of F. That is,
$P(E|F) = \frac{P(EF)}{P(F)}$

Conditional Probability: Examples
Let's solve the experiment of the roll of the two dice:
Event E: The sum of the two dice equals 8.
Event F: The first die lands on side 3.
$P(F) = \frac{\text{outcomes in event } F}{\text{all outcomes}} = \frac{6}{36}$
$P(EF) = \frac{\text{outcomes in } E \text{ and } F}{\text{all outcomes}} = \frac{1}{36}$
$P(E|F) = \frac{P(EF)}{P(F)} = \frac{1/36}{6/36} = \frac{1}{6}$
The probability that the sum of the two dice equals 8, given that the first die
landed on side 3, is 1/6.

A bin contains 5 defective (that immediately fail when put in use), 10 partially
defective (that fail after a couple of hours of use), and 25 acceptable
transistors. A transistor is chosen at random from the bin and put into use. If it
does not immediately fail, what is the probability it is acceptable?
Solution
All possible outcomes = 5+10+25 = 40
Event E: The chosen transistor from the bin is acceptable.
Event F: The chosen transistor does not fail immediately.
Outcomes in F: 10 + 25 = 35
Outcomes in EF: 25
$P(E|F) = \frac{P(EF)}{P(F)} = \frac{25/40}{35/40} = \frac{5}{7}$

The organization that Jones works for is running a father–son dinner for those
employees having at least one son. Each of these employees is invited to
attend along with his youngest son. If Jones is known to have two children,
what is the conditional probability that they are both boys given that he is
invited to the dinner?
Assume that the sample space S is given by S = {(b, b), (b, g ), (g , b), (g , g )}
and all outcomes are equally likely [(b, g ) means, for instance, that the younger
child is a boy and the older child is a girl].
Experiment: Father-son dinner for employees with at least one son.
All possible outcomes: for Jones, who has two children, size(S) = 4.
Event E: Jones has two boys.
Event F: Jones is invited to the dinner, which means that Jones has at least one boy.
Outcomes in F are (b, b), (b, g), (g, b); thus there are 3 outcomes.
The only outcome in EF is (b, b).
P(E|F) = (1/4)/(3/4) = 1/3
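The three conditional-probability examples above (two dice, transistors, Jones's children) can be verified by enumeration, since each uses a finite sample space of equally likely outcomes. The sketch below is one way to do this; the helper names `prob` and `cond_prob` and the outcome encodings are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def prob(space, event):
    """P(event) for a finite sample space of equally likely outcomes."""
    favorable = sum(1 for outcome in space if event(outcome))
    return Fraction(favorable, len(space))

def cond_prob(space, e, f):
    """P(E|F) = P(EF) / P(F); assumes P(F) > 0."""
    p_ef = prob(space, lambda o: e(o) and f(o))
    return p_ef / prob(space, f)

# Two dice: sum equals 8, given that the first die shows 3.
dice = list(product(range(1, 7), repeat=2))
p_dice = cond_prob(dice, lambda o: sum(o) == 8, lambda o: o[0] == 3)

# Transistors: acceptable, given that it does not fail immediately.
transistors = ["defective"] * 5 + ["partial"] * 10 + ["acceptable"] * 25
p_transistor = cond_prob(transistors,
                         lambda t: t == "acceptable",
                         lambda t: t != "defective")

# Jones: both children are boys, given at least one boy.
children = list(product("bg", repeat=2))
p_jones = cond_prob(children, lambda c: c == ("b", "b"), lambda c: "b" in c)

print(p_dice, p_transistor, p_jones)  # 1/6 5/7 1/3
```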

The concept of conditional probability may be used to compute the probability
of the intersection of events when the conditional probability is known, as
follows:
P(EF) = P(F)P(E|F)
Example: Ms. Perez figures that there is a 30 percent chance that her company
will set up a branch office in Phoenix. If it does, she is 60 percent certain that
she will be made manager of this new operation. What is the probability that
Perez will be a Phoenix branch office manager?
Event E: Ms. Perez is made manager of the branch office.
Event F: The company sets up a branch office in Phoenix.
P(E|F) = 0.6
P(F) = 0.3
The probability that both events occur is P(EF) = 0.3 × 0.6 = 0.18

Bayes’ Formula
Let E and F be events. We may express E as:
E = EF ∪ EF^c
for, in order for an outcome to be in E, it must either be in both E and F or be
in E but not in F.
As EF and EF^c are clearly mutually exclusive, we have by Axiom 3 that:
P(E) = P(EF) + P(EF^c)
= P(E|F)P(F) + P(E|F^c)P(F^c)
= P(E|F)P(F) + P(E|F^c)(1 − P(F))

The equation states that the probability of the event E is a weighted average of
the conditional probability of E given that F has occurred and the conditional
probability of E given that F has not occurred: Each conditional probability is
given as much weight as the event it is conditioned on has of occurring.
It is an extremely useful formula, for its use often enables us to determine the
probability of an event by first “conditioning” on whether or not some second
event has occurred. That is, there are many instances where it is difficult to
compute the probability of an event directly, but it is straightforward to compute
it once we know whether or not some second event has occurred.

Bayes Formula: Examples
An insurance company believes that people can be divided into two classes:
those that are accident prone and those that are not. Their statistics show that
an accident-prone person will have an accident at some time within a fixed
1-year period with probability .4, whereas this probability decreases to .2 for a
non-accident-prone person. If we assume that 30 percent of the population is
accident prone, what is the probability that a new policy holder will have an
accident within a year of purchasing a policy?
Event E: The new policy holder will have an accident within a year of purchase.
Event F: The policy holder is accident prone.
Probability of being accident prone: P(F) = 0.3
Probability of being non-accident prone: P(F^c) = 0.7
Probability that a new policy holder will have an accident within one year given
that he/she is accident prone: P(E|F) = 0.4
Probability that a new policy holder will have an accident within one year given
that he/she is non-accident prone: P(E|F^c) = 0.2
Thus: P(E) = 0.4 × 0.3 + 0.2 × 0.7 = 0.26.

The probability P(E) = 0.26 is new information. Taking P(F) = 0.3 as the initial
guess that a policy holder is accident prone, let us update this probability given
that event E has occurred. In other words, we will calculate the probability P(F|E).
Since P(FE) = P(EF),
$P(F|E) = \frac{P(FE)}{P(E)} = \frac{P(F)P(E|F)}{P(E)}$
In our example, P(F|E) = (0.3 × 0.4)/0.26 = 0.46
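The insurance example combines two steps: first P(E) is obtained by conditioning on F, then P(F|E) follows from the definition of conditional probability. A minimal sketch of both steps, with function names chosen here for illustration:

```python
def total_probability(p_e_given_f, p_f, p_e_given_not_f):
    """P(E) = P(E|F)P(F) + P(E|F^c)(1 - P(F))."""
    return p_e_given_f * p_f + p_e_given_not_f * (1.0 - p_f)

def posterior(p_e_given_f, p_f, p_e_given_not_f):
    """P(F|E) = P(E|F)P(F) / P(E); assumes P(E) > 0."""
    p_e = total_probability(p_e_given_f, p_f, p_e_given_not_f)
    return p_e_given_f * p_f / p_e

# Insurance example: P(E|F) = 0.4, P(F) = 0.3, P(E|F^c) = 0.2.
p_e = total_probability(0.4, 0.3, 0.2)
p_f_given_e = posterior(0.4, 0.3, 0.2)

print(round(p_e, 2))          # 0.26
print(round(p_f_given_e, 2))  # 0.46
```

Observing an accident raises the probability that the policy holder is accident prone from the initial 0.3 to about 0.46.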

Bayes formula
P (E ) = P (E |F )P (F ) + P (E |F c )P (F c )
Suppose F1, F2, F3, ..., Fn are mutually exclusive events such that:
$\bigcup_{i=1}^{n} F_i = S$
In other words, exactly one of the events Fi must occur. By writing:
$E = \bigcup_{i=1}^{n} EF_i$
and using the fact that the events EFi, i = 1, ..., n are mutually exclusive, we obtain
that:
$P(E) = \sum_{i=1}^{n} P(EF_i) = \sum_{i=1}^{n} P(E|F_i)P(F_i)$

Thus, the Equation shows how, for given events F1 , F2 , . . . , Fn of which one and
only one must occur, we can compute P (E ) by first “conditioning” on which one
of the Fi occurs. That is, it states that P (E ) is equal to a weighted average of
P (E |Fi ), each term being weighted by the probability of the event on which it is
conditioned.
Suppose now that E has occurred and we are interested in determining which
one of the Fj also occurred:
$P(F_j|E) = \frac{P(EF_j)}{P(E)} = \frac{P(E|F_j)P(F_j)}{\sum_{i=1}^{n} P(E|F_i)P(F_i)}$
The Equation is known as Bayes’ formula, after the English philosopher
Thomas Bayes. If we think of the events Fj as being possible “hypotheses”
about some subject matter, then Bayes’ formula may be interpreted as showing
us how opinions about these hypotheses held before the experiment [that is,
the P (Fj ) ] should be modified by the evidence of the experiment.
Bayes formula: Example
A plane is missing and it is presumed that it was equally likely to have gone
down in any of three possible regions. Let 1 − αi denote the probability the
plane will be found upon a search of the ith region when the plane is, in fact, in
that region, i = 1, 2, 3. (The constants αi are called overlook probabilities
because they represent the probability of overlooking the plane; they are
generally attributable to the geographical and environmental conditions of the
regions.) What is the conditional probability that the plane is in the ith region,
given that a search of region 1 is unsuccessful, i = 1, 2, 3?

Bayes Formula:Example
Solution: Let Ri, i = 1, 2, 3, be the event that the plane is in region i, and let E
be the event that a search of region 1 is unsuccessful.
R1 ∪ R2 ∪ R3 = S: the plane is in region 1, 2, or 3.
P(R1) = P(R2) = P(R3) = 1/3: the plane is equally likely to be in region 1, 2, or 3.
P(E|R1) = α1: probability of overlooking the plane given that it is in region 1.
P(E|R2) = 1: a search of region 1 is certain to be unsuccessful when the plane is
in region 2. The same holds for P(E|R3).
$P(R_1|E) = \frac{P(ER_1)}{P(E)} = \frac{P(E|R_1)P(R_1)}{\sum_{i=1}^{3} P(E|R_i)P(R_i)}$
$= \frac{\alpha_1 \cdot \frac{1}{3}}{\alpha_1 \cdot \frac{1}{3} + 1 \cdot \frac{1}{3} + 1 \cdot \frac{1}{3}} = \frac{\alpha_1}{\alpha_1 + 2}$

Bayes Formula: Example
P(R2|E): probability that the plane is in region 2 given that the search of
region 1 is unsuccessful.
P(R3|E): probability that the plane is in region 3 given that the search of
region 1 is unsuccessful.
For j = 2, 3,
$P(R_j|E) = \frac{P(E|R_j)P(R_j)}{P(E)} = \frac{(1)\left(\frac{1}{3}\right)}{\alpha_1 \cdot \frac{1}{3} + \frac{1}{3} + \frac{1}{3}} = \frac{1}{\alpha_1 + 2}, \quad j = 2, 3$
Thus, for instance, if α1 = 0.4, then the conditional probability that the plane is
in region 1 given that a search of that region did not uncover it is 1/6.
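Bayes' formula for n mutually exclusive hypotheses reduces to normalizing the products P(E|Fi)P(Fi). The sketch below applies this to the missing-plane example with α1 = 0.4; the function name `bayes_posterior` is an illustrative choice, and exact fractions are used so the results can be compared with the closed-form answers.

```python
from fractions import Fraction

def bayes_posterior(priors, likelihoods):
    """Return [P(F_j|E)] given priors P(F_j) and likelihoods P(E|F_j).

    Assumes the F_j are mutually exclusive, their priors sum to 1,
    and P(E) > 0.
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(E|F_j)P(F_j)
    evidence = sum(joint)                                 # P(E)
    return [j / evidence for j in joint]

alpha1 = Fraction(2, 5)                 # overlook probability for region 1
priors = [Fraction(1, 3)] * 3           # plane equally likely in each region
likelihoods = [alpha1, 1, 1]            # P(E|R1), P(E|R2), P(E|R3)

post = bayes_posterior(priors, likelihoods)
print(post[0], post[1], post[2])        # 1/6 5/12 5/12
```

The values agree with α1/(α1 + 2) for region 1 and 1/(α1 + 2) for regions 2 and 3.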

Random Variables
A random variable can have any numerical value, x(k), at each trial;
k = 1, 2, ... is the trial/realization index.
General properties are extracted from an array x(k), k = 1, 2, ..., N as follows:
Detect the min Xmin and max Xmax values of the array.
Divide the interval [Xmin, Xmax] into M divisions.
Define Dx = (Xmax − Xmin)/M.
Compute the number of realizations within each interval, i.e., the number of
realizations nj such that
Xmin + (j − 1)Dx < x(k) < Xmin + jDx, j = 1, 2, ..., M
Compute the frequencies fj = nj/N, j = 1, 2, ..., M
Build a histogram showing the frequencies fj vs. the x(k) values. A histogram
relates values of a random variable to the frequency of occurrence of
these values.
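The histogram procedure above can be sketched in a few lines. One assumption differs from the text: the intervals here are half-open, with the last interval closed on the right, so that values falling exactly on a boundary (including Xmin and Xmax) are counted once. The function name is illustrative.

```python
def histogram_frequencies(x, m):
    """Return (frequencies, x_min, dx) for an array x split into m intervals.

    Assumes x contains at least two distinct values, so that dx > 0.
    """
    x_min, x_max = min(x), max(x)
    dx = (x_max - x_min) / m
    counts = [0] * m
    for value in x:
        # Interval index; x_max itself is assigned to the last interval.
        j = min(int((value - x_min) / dx), m - 1)
        counts[j] += 1
    n = len(x)
    return [c / n for c in counts], x_min, dx

freqs, x_min, dx = histogram_frequencies([0, 1, 2, 3, 4], 2)
print(freqs)  # [0.4, 0.6]
```

By construction the frequencies sum to 1, which is the discrete counterpart of the total area under a probability density function.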

Random Variables
A histogram looks like the following figure:
Assuming N → ∞, M → ∞, Dx → 0, the histogram becomes a continuous
line that represents the distribution law of the random variable x(k); this is
called the probability density function P(x).

Probability density function
The probability of x(k) with the condition x1 < x(k) < x2 is:
$P[x_1 < x(k) < x_2] = \int_{x_1}^{x_2} P(x)\,dx = \int_{0}^{x_2} P(x)\,dx - \int_{0}^{x_1} P(x)\,dx.$
For the complete interval, we have:
$P[-\infty < x(k) < \infty] = \int_{0}^{\infty} P(x)\,dx - \int_{0}^{-\infty} P(x)\,dx = \int_{0}^{\infty} P(x)\,dx + \int_{-\infty}^{0} P(x)\,dx = 1.$
A common density function is the normal distribution.

https://www.mathsisfun.com/data/standard-normal-distribution-table.html

Normal Distribution
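Probability of a normally distributed random variable x(k) with condition
x1 < x(k) < x2:
$P[x_1 < x(k) < x_2] = \int_{x_1}^{x_2} P(x, \mu, \sigma)\,dx$
$= \int_{0}^{x_2} P(x, \mu, \sigma)\,dx - \int_{0}^{x_1} P(x, \mu, \sigma)\,dx$
$= \Phi(z_2) - \Phi(z_1)$
where:
$z = \frac{x - \mu}{\sigma}, \quad z_1 = \frac{x_1 - \mu}{\sigma}, \quad z_2 = \frac{x_2 - \mu}{\sigma}$
$\Phi(z_1) = \frac{1}{\sqrt{2\pi}} \int_{0}^{z_1} \exp\left\{-\frac{z^2}{2}\right\} dz$
$\Phi(z_2) = \frac{1}{\sqrt{2\pi}} \int_{0}^{z_2} \exp\left\{-\frac{z^2}{2}\right\} dz$
The function Φ(z) is an odd function, i.e., Φ(−z) = −Φ(z).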
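Instead of a printed table, Φ(z) as defined above (the area under the standard normal density between 0 and z) can be evaluated with the error function, since Φ(z) = ½ erf(z/√2). This is a short sketch using Python's standard `math.erf`; the function names `phi` and `prob_between` are illustrative.

```python
import math

def phi(z):
    """Area under the standard normal density between 0 and z."""
    return 0.5 * math.erf(z / math.sqrt(2.0))

def prob_between(x1, x2, mu, sigma):
    """P[x1 < x < x2] for a normal variable with mean mu and std sigma."""
    z1 = (x1 - mu) / sigma
    z2 = (x2 - mu) / sigma
    return phi(z2) - phi(z1)

print(round(phi(1.67), 4))                 # table value for z = 1.67
print(phi(-1.0) == -phi(1.0))              # odd symmetry
print(round(prob_between(-1, 1, 0, 1), 4)) # within one standard deviation
```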

Example
It is required to manufacture 300,000 special bolts. The factory’s cost is $0.43
per unit. The allowable length of a bolt is between .194 and .204 inches. When
the existing equipment is used, the length of manufactured bolts has a
standard deviation of .003 inches. The introduction of an advanced control
results in the reduction of this standard deviation to .00133 inches. Modification
of the controls costs $5000. Determine if this modification would pay for itself.
Solution
Since the length of bolts varies and only the “good” bolts will be accepted, the
total number of manufactured bolts, N, will be greater than 300,000. Let us
determine this number. It is expected that the equipment will be adjusted to
assure the mean value of the length,
µ = (x1 + x2)/2 = (.194 + .204)/2 = .199 (inches). Now, when µ = .199 and
σ = .003 one can determine the probability of the length of a bolt to be within
the allowable limits, P[x1 < x < x2], assuming the normal distribution law:
z1 = (x1 − µ)/σ = (.194 − .199)/.003 = −1.67
z2 = (x2 − µ)/σ = (.204 − .199)/.003 = 1.67
Therefore, P[x1 < x < x2] = Φ(1.67) − Φ(−1.67) = 2 ∗ Φ(1.67). According to the table below in Fig.
1.2, P[x1 < x < x2] = 2 ∗ .4525 = .905. This result indicates that in order to
manufacture 300,000 “good” bolts, a total of 300,000/.905 = 331,492 units
must be produced. Now let us repeat this calculation assuming the improved
accuracy of the equipment (or reduced standard deviation, σ = .00133):
z1 = (.194 − .199)/.00133 = −3.76 and z2 = 3.76, and Φ(3.76) ≈ .4998,
therefore, P[x1 < x < x2] = .9996. This indicates that a total of
300,000/.9996 = 300,121 units must be produced, thus the effective savings
of automation is $.43 ∗ (331,492 − 300,121) = $13,489.53, which exceeds the
$5000 cost of the modification.
The conclusion is obvious: the modification of controls is well justified.
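The bolt calculation can be repeated numerically. The sketch below evaluates Φ with `math.erf` rather than a rounded table, so the unit counts and savings differ slightly from the figures in the solution; the conclusion is unchanged. All function names are illustrative.

```python
import math

def phi(z):
    """Area under the standard normal density between 0 and z."""
    return 0.5 * math.erf(z / math.sqrt(2.0))

def yield_fraction(x1, x2, sigma):
    """Fraction of bolts within [x1, x2], with the mean set at the midpoint."""
    mu = (x1 + x2) / 2.0
    return phi((x2 - mu) / sigma) - phi((x1 - mu) / sigma)

def units_needed(good_units, p_good):
    """Total units to produce so that good_units are expected to be acceptable."""
    return math.ceil(good_units / p_good)

GOOD_UNITS = 300_000
UNIT_COST = 0.43          # dollars per bolt
MODIFICATION_COST = 5000  # dollars

p_old = yield_fraction(0.194, 0.204, 0.003)
p_new = yield_fraction(0.194, 0.204, 0.00133)
n_old = units_needed(GOOD_UNITS, p_old)
n_new = units_needed(GOOD_UNITS, p_new)
savings = UNIT_COST * (n_old - n_new)

print(round(p_old, 3), round(p_new, 4))
print(n_old, n_new, round(savings, 2))
print(savings > MODIFICATION_COST)  # True: the modification pays for itself
```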

Estimation of mean and variance
Mean
The mean value of a random variable is estimated as $M_x = \frac{1}{N} \sum_{k=1}^{N} x(k)$. One
often writes Mx = Mx(N) to emphasize that Mx is
dependent on the number of realizations N of the random variable that were
used for the estimation. It is known that as N → ∞ then Mx(N) → µ where µ is
the appropriate parameter of the distribution law.
Variance
The variance and standard deviation of a random variable are estimated as
$V_x = \frac{1}{N-1} \sum_{k=1}^{N} [x(k) - M_x]^2$, and $S_x = \sqrt{V_x}$
Again, one often writes Vx = Vx(N) and Sx = Sx(N) to emphasize that
these estimates are dependent on the number of realizations of the random
variable used for the estimation. It is known that as N → ∞ then Sx(N) → σ,
where σ is the appropriate parameter of the distribution law.

Recursive Estimation
It is used when characteristics of a random variable are calculated on-line. It is
done to incorporate as many realizations, x(k), as possible in the estimation
without the penalty of storing an ever-increasing data array. The following
formulae are applied:
$M_x[N] = M_x[N-1] + \frac{1}{N}\left[x(N) - M_x[N-1]\right]$
$V_x[N] = V_x[N-1] + \frac{1}{N-1}\left[[x(N) - M_x[N]]^2 - V_x[N-1]\right]$
$S_x = \sqrt{V_x[N]}$
The above expressions could be easily derived. Indeed,
$M_x[N] = \frac{1}{N} \sum_{k=1}^{N} x(k) = \frac{1}{N}\left[x(N) + \sum_{k=1}^{N-1} x(k)\right]$
$= \frac{1}{N} x(N) + \frac{N-1}{N(N-1)} \sum_{k=1}^{N-1} x(k) = \frac{1}{N} x(N) + \frac{N-1}{N} M_x[N-1]$
$M_x[N] = M_x[N-1] + \frac{1}{N}\left[x(N) - M_x[N-1]\right]$
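The recursive formulae can be sketched as a single pass over the data that stores only the current estimates. The code below implements the updates as written above, starting from Vx[1] = 0 (an assumption, since the text does not state the initial value). The mean recursion reproduces the batch estimate exactly; the variance recursion, as written, is not algebraically identical to the batch estimator, but the two agree closely for large N, which the comparison at the end illustrates.

```python
import random
import statistics

def recursive_estimates(samples):
    """Return (Mx, Vx, Sx) computed one realization at a time."""
    m = 0.0
    v = 0.0  # assumed starting value Vx[1] = 0
    for n, x in enumerate(samples, start=1):
        m = m + (x - m) / n                       # Mx[N] update
        if n > 1:
            v = v + ((x - m) ** 2 - v) / (n - 1)  # Vx[N] update, uses Mx[N]
    return m, v, v ** 0.5

random.seed(0)
data = [random.gauss(10.0, 2.0) for _ in range(5000)]  # synthetic test data

m, v, s = recursive_estimates(data)
print(m, statistics.mean(data))      # recursive vs. batch mean
print(v, statistics.variance(data))  # recursive vs. batch variance
```

In an on-line setting the loop body would run once per new measurement, so no data array needs to be kept.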
Properties of random variables
Consider 3 random variables, x(i), y(i), z(i), i = 1, 2, ..., N, where i is the
realization index and N is the number of realizations. The procedure for the
extraction of the general properties is as follows:
Find their Min and Max values: [Xmin, Xmax], [Ymin, Ymax], [Zmin, Zmax].
Divide the above intervals into L subintervals, thus resulting in three steps
∆X = [Xmax − Xmin]/L, ∆Y = [Ymax − Ymin]/L, ∆Z = [Zmax − Zmin]/L.
Compute numbers NKJM, equal to the number of realizations
x(i), y(i), z(i) such that:
Xmin + (K − 1)∆X ≤ x(i) < Xmin + K∆X,
Ymin + (J − 1)∆Y ≤ y(i) < Ymin + J∆Y,
Zmin + (M − 1)∆Z ≤ z(i) < Zmin + M∆Z
for every K, J, M = 1, 2, 3, ..., L
Compute frequencies FKJM = NKJM/N, K, J, M = 1, 2, 3, ..., L of the
multidimensional histogram.
Properties of random variables
Assume N → ∞, NKJM → ∞, ∆X → 0, ∆Y → 0, ∆Z → 0; then the
histogram turns into a 3-dimensional probability density function,
F(x, y, z, µx, µy, µz, σx, σy, σz, rXY, rXZ, rYZ), representing the distribution
law, where µx, µy, µz and σx, σy, σz are the mean values and standard
deviations of the respective variables representing their individual distribution
laws, and rXY, rXZ, rYZ are the correlation coefficients representing the interrelation
between individual variables. In most practical applications we are
dealing with the normal distribution law.

Correlation Coefficients
rXY, rXZ, rYZ are estimated as normalized correlation coefficients as follows:
$r_{XY} = r_{XY}(N) = \frac{1}{N S_X S_Y} \sum_{i=1}^{N} [x(i) - M_X][y(i) - M_Y]$
where MX, MY, SX, SY are estimates of the mean values and standard deviations of the
particular variables. The absolute value of the normalized correlation coefficient
does not exceed 1, i.e., −1 ≤ rXY ≤ 1. It represents the extent of the linear
relationship between random variables x and y: not a functional relationship,
but a tendency, i.e., a relationship that may or may not manifest itself in any
particular test but can be observed over a large number of tests.
Note that:
$R_{XY} = R_{XY}(N) = \frac{1}{N} \sum_{i=1}^{N} [x(i) - M_X][y(i) - M_Y]$
is just a correlation coefficient (not normalized).
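Both coefficients above can be computed directly from their definitions. The sketch below follows the formulas as written, with the 1/N factor in the sum and the N − 1 divisor in the standard deviation estimates; a consequence is that perfectly linear data gives (N − 1)/N rather than exactly 1. The function names are illustrative.

```python
def mean(values):
    return sum(values) / len(values)

def std(values):
    """Standard deviation estimate Sx with the N - 1 divisor."""
    m = mean(values)
    return (sum((v - m) ** 2 for v in values) / (len(values) - 1)) ** 0.5

def correlation(x, y, normalized=True):
    """R_XY (not normalized) or r_XY (normalized), as defined in the text."""
    n = len(x)
    mx, my = mean(x), mean(y)
    r = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    if normalized:
        r /= std(x) * std(y)
    return r

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]                       # y = 2x, perfectly linear
print(correlation(x, y))                   # (N - 1)/N = 0.8 for N = 5
print(correlation(x, y, normalized=False)) # 4.0
```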

Confidence Interval for Correlation Coefficients
The following expression defines the interval that, with a particular probability,
contains the "true" value of the correlation coefficient:
$P\left[r_{XY} - \Delta_R(\alpha, N) \le r_{XY}^{TRUE} \le r_{XY} + \Delta_R(\alpha, N)\right] = 1 - 2\alpha$
where
$\Delta_R(\alpha, N) = t(\alpha, N-1) \frac{1 - r_{XY}^2}{\sqrt{N}}$
and t(α, N − 1) is the t-distribution value defined for significance level α and number
of degrees of freedom N − 1. It is said that the estimate of the correlation coefficient
is statistically significant if |rXY| ≥ ∆R(α, N). Indeed, the correlation coefficient
can be only positive or only negative; therefore the confidence interval of a
statistically significant normalized correlation coefficient cannot include both
positive and negative values.

Example of correlation coefficient
The estimated correlation coefficient between the MPG value of a Nissan Pathfinder
and the outdoor temperature was obtained using 20 available records: r = 0.12.
Does this indicate that a correlation between these two random variables
exists?
Solution
Assume significance level α = 0.025; then t(0.025, 19) = 2.093, and
∆R = 2.093 × [1 − 0.0144]/4.47 = 0.46
0.46 > 0.12; therefore, with 95% confidence, the r value is statistically
insignificant.
Read the problem carefully and explain the solution
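The significance check in this example is a one-line computation once the t-distribution value is known. The sketch below takes that value as an input (2.093 for α = 0.025 and 19 degrees of freedom, as quoted in the solution) rather than computing it; the function names are illustrative.

```python
import math

def delta_r(r, n, t_value):
    """Half-width of the confidence interval: t * (1 - r^2) / sqrt(N)."""
    return t_value * (1.0 - r ** 2) / math.sqrt(n)

def is_significant(r, n, t_value):
    """True when |r| >= delta_r, i.e. the interval excludes zero."""
    return abs(r) >= delta_r(r, n, t_value)

# MPG vs. outdoor temperature: r = 0.12 from N = 20 records.
d = delta_r(0.12, 20, 2.093)
print(round(d, 2))                      # 0.46
print(is_significant(0.12, 20, 2.093))  # False
```

The interval [0.12 − 0.46, 0.12 + 0.46] contains both negative and positive values, which is why the estimate is not statistically significant.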

Conditional Distributions
Consider a group of 2 random variables, x(i) and y(i), where
i = 1, 2, ..., N is the realization index. Let us investigate if there is a trend-type
dependence of random variable y(i) on random variable x(i).
Find the Min and Max values of these variables: [Xmin, Xmax], [Ymin, Ymax].
Divide the above intervals into L subintervals, thus resulting in the steps
∆X = [Xmax − Xmin]/L and ∆Y = [Ymax − Ymin]/L
From the original array x(i), y(i), i = 1, 2, ..., N, select only the
realizations [x(i), y(i)] such that:
Xmin + (K − 1)∆X ≤ x(i) < Xmin + K∆X.
Assume that the total number of such realizations is NK.
Obtain a histogram for random variable y(i) from the above array of NK
observations by:
- Computing the number of realizations, NKM, such that
Ymin + (M − 1)∆Y ≤ y(i) < Ymin + M∆Y
for every M = 1, 2, 3, ..., L
- Computing the frequencies FKM = NKM/NK, M = 1, 2, 3, ..., L of the
multidimensional histogram. (Note that the histograms in the figure for
variable y(i) are built only for those values y(i) whose corresponding x(i)
values fall within the interval [Xmin + (K − 1)∆X, Xmin + K∆X].)
Assume N → ∞, NK → ∞, NKM → ∞, ∆X → 0, ∆Y → 0; then the histogram
turns into a 1-dimensional pdf, P(y, µy, σy), representing the distribution
law of variable y(i) obtained under the assumption that the corresponding
values of x(i) satisfy some particular conditions, i.e.,
P(y, µy, σy) = P((y, µy, σy)|x) is a pdf representing the distribution law
of random variable y subject to variable x.

Conditional distribution
Note that the mean value and variance of variable y change subject to
numerical values of the related variable x. The conditional distribution
P(y, µy, σy|x) is expected to be a normal distribution. Its parameters depend
on x: µy(x) and σy(x). These relationships are known as the "conditional mean"
and "conditional standard deviation".
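The conditional mean and conditional standard deviation can be estimated by grouping the realizations of y according to the x subinterval they fall in. The sketch below does this for synthetic data in which y = 2x plus Gaussian noise; the data, the function name, and the choice of 5 subintervals are assumptions made for illustration.

```python
import random
import statistics

def conditional_stats(x, y, bins):
    """Return [(x_center, mean_y, std_y)] for each x subinterval.

    Assumes x contains at least two distinct values. Subintervals with
    fewer than two realizations are skipped.
    """
    x_min, x_max = min(x), max(x)
    dx = (x_max - x_min) / bins
    groups = [[] for _ in range(bins)]
    for xi, yi in zip(x, y):
        k = min(int((xi - x_min) / dx), bins - 1)  # subinterval index of x(i)
        groups[k].append(yi)
    result = []
    for k, group in enumerate(groups):
        if len(group) >= 2:
            center = x_min + (k + 0.5) * dx
            result.append((center, statistics.mean(group), statistics.stdev(group)))
    return result

random.seed(1)
x = [random.uniform(0.0, 10.0) for _ in range(5000)]
y = [2.0 * xi + random.gauss(0.0, 1.0) for xi in x]  # synthetic trend plus noise

for center, mean_y, std_y in conditional_stats(x, y, 5):
    print(round(center, 2), round(mean_y, 2), round(std_y, 2))
```

For this data the conditional mean increases with x, which is the trend-type dependence the procedure is meant to reveal.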

Correlation Analysis
Correlation analysis is the analysis of the stochastic (trend-type) linear relationship
between random variables x1, x2, x3, ..., xn. It includes:
Computation of the correlation coefficient for every combination of two
variables:
$r_{ij} = \frac{1}{N S_i S_j} \sum_{k=1}^{N} [x_i(k) - M_i][x_j(k) - M_j].$
Computation of the correlation matrix,
$\begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ r_{21} & r_{22} & \cdots & r_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ r_{n1} & r_{n2} & \cdots & r_{nn} \end{bmatrix}$

Random Processes
A random process could be viewed as a continuous function of time y = y (t )
that at any particular moment of time, t ∗, has a random value, i.e. y ∗ = y (t ∗)
is a random variable characterized by its specific distribution law. Recording a
random process over some period of time would result in a graph as the one
shown below.
It is said that the graph features a realization of the random process y(t).
However the same random process repeatedly initiated under the same
conditions would result in many different realizations that in combination
constitute an ensemble, see figure below.
Random Processes
The broken line in the figure, representing time t = t∗, is known as the
cross-section of the random process y(t). It could be seen that in the
cross-section multiple realizations form a combination of numerical values of
random variables, i.e. when the time argument is fixed, y (t ∗) is a random
variable with all previously described properties and characteristics.

Random Processes
Due to the proliferation of computers, we should expect to deal with discretized
random processes represented by a sequence of random variables attached to
the time axis, y (∆t ), y (2∆t ), . . . , y (i ∆t ), . . . , y (N ∆t ), or just
y (1), y (2), . . . , y (i ), . . . , y (N ), where i = 1, 2, . . . , N is the discrete-time index,
∆t is the time step, and N ∆t is the entire period of observation. It should be
emphasized that “attached to the time axis” is the key to distinguishing between
a random variable and a random process. While sequencing of the
observations of a random variable is not important, the “natural” sequencing of
numbers representing a random process is crucial for the analysis of the
phenomenon represented by this random process.

Random Processes
A discretized ensemble of the realizations of a random process could be
represented by a rectangular table, where rows represent particular realizations
and columns represent particular cross-sections (discrete-time values).
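The table of realizations and cross-sections can be sketched in Python as a list of rows. This is only an illustration: the helper names (simulate_ensemble, realization, cross_section), the choice of independent Gaussian samples, and the reading of M as the number of realizations and N as the number of time samples are all assumptions made for the example.

```python
import random

def simulate_ensemble(M, N, mu=0.0, sigma=1.0, seed=0):
    """Build an M x N table: M realizations (rows), N time samples (columns).

    Each entry is an independent Gaussian draw, a placeholder process
    chosen only to fill the table.
    """
    rng = random.Random(seed)
    return [[rng.gauss(mu, sigma) for _ in range(N)] for _ in range(M)]

def realization(table, j):
    """Row j: one realization y_j(1), ..., y_j(N) (0-based index j)."""
    return table[j]

def cross_section(table, i):
    """Column i: the values of all realizations at discrete time i (0-based)."""
    return [row[i] for row in table]

table = simulate_ensemble(M=5, N=8)
print(len(realization(table, 0)))    # number of time samples in one row
print(len(cross_section(table, 3)))  # number of realizations in one column
```

Keeping realizations as rows means that a cross-section is obtained by reading the same position in every row, which mirrors fixing the time argument in y(t).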
The approach to analyzing the statistical properties of a random process is
similar to the one suggested for a random variable: first a histogram is built,
provided that N ≫ 1 and M ≫ 1, and eventually a distribution law is established.
However, immediate questions arise:
1. Should this distribution law be established for a realization of the process
(one of the rows of the table) or for a cross-section (one of the columns of the
table)?
2. Is it necessary to establish an individual distribution law for every
cross-section (column) of the process?
Answers to these questions reflect fundamental properties of the random process:
1. A random process is called ergodic if a distribution law established for one
of its realizations is identical to the one established for its cross-section.
Otherwise the process is said to be non-ergodic.
2. A random process is called stationary if a distribution law established for a
cross-section is independent of the cross-section. This implies that statistical
characteristics of the process are time-invariant. Otherwise it is said that the
process is non-stationary or has a parameter drift.
3. Any ergodic process is stationary, but not every stationary process is
ergodic.
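The difference between an ergodic process and a stationary but non-ergodic one can be illustrated numerically. The sketch below is an assumption-laden example, not a formal test: it compares only the mean and standard deviation of one realization with those of one cross-section instead of the full distribution law, and the two synthetic processes (iid_process and random_offset_process) are hypothetical examples chosen for this purpose.

```python
import random
import statistics

def iid_process(M, N, rng):
    """Ergodic example: every sample is an independent N(0, 1) draw."""
    return [[rng.gauss(0.0, 1.0) for _ in range(N)] for _ in range(M)]

def random_offset_process(M, N, rng, offset_sigma=5.0):
    """Stationary but non-ergodic example: each realization carries its own
    constant random offset plus N(0, 1) noise."""
    table = []
    for _ in range(M):
        offset = rng.gauss(0.0, offset_sigma)
        table.append([offset + rng.gauss(0.0, 1.0) for _ in range(N)])
    return table

def realization_stats(table, j):
    """Mean and standard deviation along row j (one realization)."""
    row = table[j]
    return statistics.fmean(row), statistics.pstdev(row)

def cross_section_stats(table, i):
    """Mean and standard deviation along column i (one cross-section)."""
    col = [row[i] for row in table]
    return statistics.fmean(col), statistics.pstdev(col)

rng = random.Random(1)
ergodic = iid_process(400, 400, rng)
non_ergodic = random_offset_process(400, 400, rng)
print("i.i.d. process:", realization_stats(ergodic, 0), cross_section_stats(ergodic, 0))
print("random offset: ", realization_stats(non_ergodic, 0), cross_section_stats(non_ergodic, 0))
```

For the independent samples, the row and column statistics should roughly agree. For the random-offset process every cross-section has the same distribution (so the process is stationary), but a single realization only sees its own offset, so its spread is much smaller than that of a cross-section.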
Most realistic random processes are ergodic, and the normal distribution law is
suitable for their description. It is also known that most non-stationary random
processes are non-stationary only in terms of the mean value, i.e. µ = µ(t),
but their standard deviation σ is constant.
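A process that is non-stationary only in its mean can be illustrated with a short Python sketch. The linear drift µ(i) = slope · i, the Gaussian noise, and the helper names below are assumptions made for this example. The mean of each cross-section estimates µ(i), and subtracting it leaves a residual whose standard deviation is roughly the same in every cross-section.

```python
import random
import statistics

def drifting_ensemble(M, N, slope=0.1, sigma=1.0, seed=0):
    """M realizations of y(i) = slope * i + noise, with constant sigma.

    The time index i is 0-based here; the linear drift is a hypothetical choice.
    """
    rng = random.Random(seed)
    return [[slope * i + rng.gauss(0.0, sigma) for i in range(N)]
            for _ in range(M)]

def ensemble_mean(table):
    """Estimate mu(i) as the average of each cross-section (column)."""
    N = len(table[0])
    return [statistics.fmean([row[i] for row in table]) for i in range(N)]

def remove_mean_drift(table):
    """Subtract the estimated mu(i) from every realization."""
    mu = ensemble_mean(table)
    return [[y - m for y, m in zip(row, mu)] for row in table]

def cross_section_std(table, i):
    """Standard deviation of cross-section i."""
    return statistics.pstdev([row[i] for row in table])

table = drifting_ensemble(M=500, N=50)
mu_hat = ensemble_mean(table)
centered = remove_mean_drift(table)
print("estimated mean at first and last sample:", mu_hat[0], mu_hat[-1])
print("std at first and last sample:",
      cross_section_std(centered, 0), cross_section_std(centered, 49))
```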

Activity: Session of June 15, 2021
Analyze the problems Example 1.1, Example 1.2, and Example 1.3 from the book by
Victor A. Skormin.
Select only one of the exercises.
Prepare a few slides with which you will explain the selected problem to the
whole audience.
Between 11:00 and 12:00 you will have 5 minutes to explain
the problem.
The slide material and the presentation may be prepared
in either Spanish or English.
