But What Is A Random Variable - Towards Data Science

11/11/2019 But what is a Random Variable? - Towards Data Science
But what is a Random Variable?

Kapil Sachdeva
Oct 31 · 10 min read
https://towardsdatascience.com/but-what-is-a-random-variable-4265d84cb7e5
Coin toss from https://www.pexels.com
Introduction
Often simple concepts become difficult to grasp because of the terminology and the
context in which they are applied. The Random Variable was one such confusing (even
though simple in hindsight) concept for me.
For a software engineer, a variable is of two types: local or global. Often there is a
range and bound on the variables when you are writing software, but if you add the
adjective Random to them it means that they can take any value (although you can still
control the randomness using the notion of a seed).
For a cryptographer, randomness is one of the most important properties of his/her
algorithms. He/She strives to generate random numbers that should not repeat at all.
From elementary algebra’s perspective, the notion of a variable is straightforward, i.e. it
represents an input of your equation, written on the right side, and also represents
an output of your equation, written on the left side. The random aspect generally does
not appear and is not discussed at that level.
Now if you read its definition on Wikipedia, or google “What is a Random Variable in
Statistics?”, you will see a statement like this:
A Random Variable is a function that maps outcomes to real values.
A variable that is actually a function?
For statisticians, this may be fine but for people from other disciplines such as software
engineering, this very statement starts to seriously break down the semantics as it
perturbs their understanding of 3 very fundamental things they deal with every day —
Variables, Functions and Randomness!
Clearly Random Variables of statistics are different creatures and understanding them is
fundamental to many other concepts (Gaussian Processes, Bayesian Statistics etc) that
either use them and/or build upon them.
. . .
Outcomes, Sample space and event space

Random Variables find their application in probability theory, so let’s first review
some of the terminology in that area.
When you observe a phenomenon or conduct an experiment, you end up
with outcomes or results. The canonical examples generally given are:
Tossing a coin has exactly two possible outcomes: a head or a tail
Rolling a die has exactly 6 possible outcomes: 1, 2, 3, 4, 5 or 6
A list (formally called a set, i.e. items in the list do not repeat) of all the possible outcomes is
called a Sample space.
This implies that the Sample space for a coin toss is {HEAD, TAIL} and for the roll of a die is
{1,2,3,4,5,6}.
The second important term here is Event. What an event is
depends on how you define an experiment on your sample space.
Consider the sample space that corresponds to tossing a coin twice. The possible outcomes
are {HH,HT,TH,TT}. Now if I am interested in outcomes in which HEAD
appears first, I would describe my Event as the set {HH, HT}. From this example
you should start to see that an event is actually a subset of the sample space.
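As a small illustration, the two-toss sample space and the "HEAD first" event can be written as Python sets. The names sample_space and head_first are mine, chosen only for this sketch.

```python
from itertools import product

# Sample space for tossing a coin twice: every ordered pair of H/T.
sample_space = {"".join(tosses) for tosses in product("HT", repeat=2)}

# Event: outcomes in which HEAD appears first.
head_first = {outcome for outcome in sample_space if outcome[0] == "H"}

print(sorted(sample_space))        # ['HH', 'HT', 'TH', 'TT']
print(sorted(head_first))          # ['HH', 'HT']
print(head_first <= sample_space)  # True: an event is a subset of the sample space
```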
There is a symbol that is used to represent the sample space (i.e. all possible outcomes) and
it is Ω.
Similarly, there is a symbol for the event space (i.e. the set of all possible events) as well and it is Σ.
Another example for the sake of more clarity. Let’s say you are rolling a die (Ω =
{1,2,3,4,5,6}) and your experiment is about observing the even outcomes. The
event of interest is then E = {2,4,6}, which is a subset of Ω and just one of the many
events contained in the event space Σ.
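To make the difference between a single event and the event space concrete, here is a sketch that assumes the event space of a finite sample space is simply the set of all its subsets (the power set, which is the usual choice in finite cases). The helper name all_events is hypothetical.

```python
from itertools import chain, combinations

omega = {1, 2, 3, 4, 5, 6}  # sample space for one roll of a die

def all_events(sample_space):
    """Return every subset of the sample space as a frozenset (the power set)."""
    items = sorted(sample_space)
    subsets = chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )
    return {frozenset(subset) for subset in subsets}

event_space = all_events(omega)
even = frozenset({2, 4, 6})  # the event "the outcome is even"

print(len(event_space))     # 64, i.e. 2 ** 6 possible events
print(even in event_space)  # True
```

Under this assumption, {2,4,6} is one event among 64, and the event space is the collection that holds all of them.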
. . .
Randomness and Variability

Now we are finally ready to see where the Random Variable is in all this and, more
importantly, the parts that correspond to Randomness & Variability.
Randomness
You see, the events corresponding to your experiment have inherent uncertainty
(randomness) associated with them, i.e. the two coin tosses in the experiment above could
result in HH, HT, TH or TT. You then use probability theory to quantify the uncertainty
corresponding to these events.
I appreciate that at the end of the day it is simply semantics, but I really like the word
uncertainty as it helps me not bring in my understanding of randomness from other
disciplines. This also means that Random Variables in statistics could have been called
Uncertain Variables. But they are not called so :( The literature consistently calls
them Random Variables, so if it helps, you could (as I often do) translate the term in your
mind to Uncertain Variables.
I explain more on quantifying the uncertainty of Random Variables in a later part of the
article.
Variability
In the coin toss experiment, we used the words HEAD & TAIL in our sample space.
Instead, we could use numbers to represent them; say I use 1 for HEAD and 0 for
TAIL. In other words, I am mapping HEAD to 1 and TAIL to 0. Mapping
implies a function that does this transformation. Recall the earlier definition you
found on Wikipedia or in your Google search: “A Random Variable is a function that
maps outcomes to real values”.
You may wonder why I chose to assign 1 to HEAD and 0 to TAIL. Could it have been
the other way round, i.e. 0 for HEAD and 1 for TAIL? In many ways you are free in your
assignment of (or rather, I should say, mapping to) numerical values, but as you will see
with more complex examples there is a certain meaning & consistency to these mappings,
and it mostly depends on your definition of the experiment. For example, even for a simple single
coin toss, I could justify the assignment of 1 to HEAD by posing my experiment
as “I am interested in observing a HEAD”. I would use 1 (= TRUE) as it reflects the
boolean logic of this very experiment definition.
To not be limited to boolean experiments, let’s go through another example. Here I
have the two coin toss example, i.e. Ω = {HH,HT,TH,TT}, and my experiment is
about the number of heads observed. So, I would define my (Random) Variable to generate
(remember it is a mapper/function) 0 for TT, 1 for both TH & HT, and 2 for HH.
Random Variables are generally represented using an uppercase letter. For example, for the
above experiment I would call it H, and the set of values it can take is {0,1,2}.
This being said, the mapping of outcomes of a sample space (e.g. Ω = {HH,HT,TH,TT}) to (real) numbers
is entirely deterministic, i.e. there is no randomness in the mapping (function) itself.
This means that it is not the ‘Variable’ part of ‘Random Variable’ that is random; rather, the
word Random signals that we are working with a sample space that has uncertainty (randomness)
associated with its outcomes.
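One way to see this in code: the Random Variable is an ordinary function, and the only random step is drawing the outcome. This is a sketch rather than any standard API; the function name H mirrors the text, while the seed value and the use of random.choice to simulate a fair two-coin toss are my assumptions.

```python
import random

sample_space = ["HH", "HT", "TH", "TT"]

def H(outcome):
    """Random Variable: map an outcome to the number of heads in it."""
    return outcome.count("H")

# The mapping is deterministic: the same outcome always gives the same number.
print({outcome: H(outcome) for outcome in sample_space})
# {'HH': 2, 'HT': 1, 'TH': 1, 'TT': 0}

# The uncertainty lives in which outcome occurs, simulated here with a seeded RNG.
rng = random.Random(42)
outcome = rng.choice(sample_space)
print(outcome, H(outcome))
```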
If you are wondering whether it is simply about assigning a number to an outcome in the sample
space or whether there is more to the story, then you are on the right path. There is indeed more
to it!
You see, the outcomes of a given sample space could be used to define many different
experiments. For example, this time around you roll two dice. If they are
indistinguishable (i.e. you do not care about which die produces which number) then
the sample space is Ω = {(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(2,2),(2,3),
(2,4),(2,5),(2,6),(3,3),(3,4),(3,5),(3,6),(4,4),(4,5),(4,6),(5,5),(5,6),(6,6)}. There
are 21 possible outcomes here. From here on I could pose a few different experiments
that would lead to their respective events of interest. One experiment could be: the sum of the
two dice is greater than 8. Another could be: observe when the product of the two dice is even.
Yet another one could be: observe when the result of subtracting 3 from the product is
even. And there are many more.
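The three experiments above can be sketched as three different functions on the same 21-outcome sample space. The function names are made up for this illustration, and returning 1/0 for "condition holds / does not hold" is my choice of mapping.

```python
from itertools import combinations_with_replacement

# Unordered pairs of faces: the dice are treated as indistinguishable.
sample_space = list(combinations_with_replacement(range(1, 7), 2))
print(len(sample_space))  # 21

# Three different Random Variables defined on the same sample space.
# Each maps an outcome to 1 (condition holds) or 0 (it does not).
def sum_greater_than_8(outcome):
    a, b = outcome
    return int(a + b > 8)

def product_is_even(outcome):
    a, b = outcome
    return int((a * b) % 2 == 0)

def product_minus_3_is_even(outcome):
    a, b = outcome
    return int((a * b - 3) % 2 == 0)

# The event picked out by the first experiment: the outcomes mapped to 1.
sum_event = [o for o in sample_space if sum_greater_than_8(o)]
print(sum_event)
# [(3, 6), (4, 5), (4, 6), (5, 5), (5, 6), (6, 6)]
```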
This creation of experiments using a sample space is where the Random Variable starts
to use its ‘functional’ powers and map the outcomes to real numbers depending on how
you have posed your experiment definition.
But why call it a Variable? I wondered about this for a long time, and the best
(or rather, I should say, only) explanation I could come up with is that often your task is not
about a single sample space. The final result of your task may depend on many
phenomena which have their respective sample spaces. You would be applying algebra
on these sample spaces and therefore you would end up adding, subtracting and multiplying
the outcomes. Since you typically add/subtract/multiply variables and not functions,
they ended up being called variables.
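A sketch of what "doing algebra" on Random Variables can look like: two Random Variables defined on the same sample space are combined into a new one by adding their outputs. The ordered two-dice sample space and the names first, second and total are assumptions made for this illustration.

```python
from itertools import product

# Ordered pairs this time: (face of die 1, face of die 2).
sample_space = list(product(range(1, 7), repeat=2))

def first(outcome):
    """Random Variable: value shown by the first die."""
    return outcome[0]

def second(outcome):
    """Random Variable: value shown by the second die."""
    return outcome[1]

def total(outcome):
    """A new Random Variable built by adding two others, outcome by outcome."""
    return first(outcome) + second(outcome)

print(total((2, 5)))                             # 7
print(sorted({total(o) for o in sample_space}))  # [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
```

Written this way, total behaves like a variable in an equation (total = first + second) even though each piece is a function underneath.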
. . .
Random Variables, Event Space & Probability

The definition of Random Variables seems to imply a strong connection with the
sample space; after all, it is a function that maps outcomes to real numbers. In reality, though,
the application of Random Variables is more connected to events.
The more interesting aspect is knowing which events, out of all
possible ones, have occurred, and hence when you are doing your algebra using
Random Variables you are in fact working with events.
Let’s see an example here for more clarity. In our example of observing the number of
HEADs in two coin tosses, we defined the Random Variable H, which takes values in {0,1,2}. Every value in this set
corresponds to an event; for example, the value 1 corresponds to the event {HT, TH}. Since events have uncertainty (randomness) associated with
them, we make use of probability theory. We generally say the probability of observing
0 heads is p1
1 head is p2
2 heads is p3
and we know that p1 + p2 + p3 = 1. Another way to interpret this is that p1, p2 & p3
quantify the uncertainty associated with their respective events.
Now, when we are using Random Variables, we would simply write the above as
P(H=0) = p1, P(H=1) = p2, P(H=2) = p3
As should be clear, there is a direct correspondence between events and the output range (all
possible values) of the Random Variable.
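If we additionally assume a fair coin (the text above does not fix the values of p1, p2 and p3), each of the four outcomes is equally likely and the three probabilities can be computed by counting. The variable names below are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

sample_space = ["".join(t) for t in product("HT", repeat=2)]

def H(outcome):
    """Random Variable: number of heads in the outcome."""
    return outcome.count("H")

# Assumption: fair coin, so each outcome has probability 1/4.
counts = Counter(H(outcome) for outcome in sample_space)
distribution = {h: Fraction(n, len(sample_space)) for h, n in sorted(counts.items())}

for h, p in distribution.items():
    print(f"P(H={h}) = {p}")
# P(H=0) = 1/4
# P(H=1) = 1/2
# P(H=2) = 1/4
print(sum(distribution.values()))  # 1
```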
This also means that what you know about probability calculus, i.e.
independent events, joint events, conditional events, the sum rule, the product rule, etc.,
applies to Random Variables as well. In other words, you could apply probability
calculus to independent or dependent Random Variables and/or condition one on
another.
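As one hedged example of probability calculus on Random Variables, the sketch below conditions the number of heads H on a second Random Variable F (1 if the first toss is a HEAD). F, the helper prob, and the fair-coin assumption are mine and only serve the illustration.

```python
from fractions import Fraction

sample_space = ["HH", "HT", "TH", "TT"]  # assumed equally likely (fair coin)

def H(outcome):
    return outcome.count("H")            # number of heads

def F(outcome):
    return int(outcome[0] == "H")        # 1 if the first toss is a HEAD

def prob(condition):
    """Probability of the event {outcomes for which condition is true}."""
    favourable = [o for o in sample_space if condition(o)]
    return Fraction(len(favourable), len(sample_space))

# Product rule rearranged: P(H=2 | F=1) = P(H=2, F=1) / P(F=1)
joint = prob(lambda o: H(o) == 2 and F(o) == 1)
conditional = joint / prob(lambda o: F(o) == 1)

print(joint)        # 1/4
print(conditional)  # 1/2
```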
. . .
Types of Random Variables

So far all the examples that we have discussed are of only one type of Random
Variable, called Discrete Random Variables. These variables (as the name implies)
represent outcomes that can be counted. You could count the number of heads, the
number of times the product was 8, etc.
There exists another type of uncertain phenomenon whose outcomes cannot be counted. For example, if
we were dealing with experiments and observations around the temperature of the day,
we would be dealing with a measure that could take any value in a continuous range and
hence is not countable. These types of random/uncertain phenomena are
represented using Continuous Random Variables.
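For a continuous Random Variable, probabilities are attached to ranges of values rather than to single values. The sketch below assumes, purely for illustration, that the day's temperature follows a normal distribution with mean 20 and standard deviation 5 (these numbers are invented), and uses NormalDist from Python's standard library.

```python
from statistics import NormalDist

# Hypothetical model: temperature in degrees Celsius ~ Normal(mean=20, sd=5).
temperature = NormalDist(mu=20, sigma=5)

# P(18 <= T <= 22) is a difference of the cumulative distribution function.
p_range = temperature.cdf(22) - temperature.cdf(18)
print(round(p_range, 3))

# A single exact value has zero width, hence zero probability.
p_point = temperature.cdf(20) - temperature.cdf(20)
print(p_point)  # 0.0
```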
Random Variables and Probability Distributions

We have already touched on the relationship with probability, but this deserves a
small section of its own as you rarely talk about Random Variables without probability
distributions.
Again, what is a probability distribution?
If you take the probabilities associated with each value in the output range of a Random Variable X, you
obtain the probability distribution of X.
Conveniently, many probability distributions have been defined that reflect
typical data generation processes. For example, our Random Variable that takes the value 1
when a HEAD is observed and 0 when a TAIL is observed is described using the Bernoulli
distribution.
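A minimal sketch of the Bernoulli distribution written out by hand: it assigns probability p to the value 1 and 1 - p to the value 0. The function name bernoulli_pmf and the value p = 0.5 (a fair coin) are assumptions for illustration.

```python
import random

def bernoulli_pmf(k, p):
    """P(X = k) for a Bernoulli Random Variable with P(X = 1) = p."""
    if k not in (0, 1):
        return 0.0
    return p if k == 1 else 1.0 - p

p = 0.5  # assumed probability of HEAD (fair coin)
print(bernoulli_pmf(1, p), bernoulli_pmf(0, p))  # 0.5 0.5

# Drawing samples: 1 stands for HEAD, 0 for TAIL.
rng = random.Random(0)
samples = [int(rng.random() < p) for _ in range(10_000)]
print(sum(samples) / len(samples))  # close to p
```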
This writeup is not about the explanations of various distributions or even an in-depth
treatment of probability distributions, but if you are interested then I would strongly
suggest reading the ones written by https://medium.com/@aerinykim. She does a
phenomenal job of explaining the purpose, origin and math behind many important
distributions.
What I would mention though is that often in practice probability distributions end up
being more interesting and important than the Random Variables, because it is the
distribution that characterizes the properties of the random phenomenon. In other
words, once you have determined the appropriate probability distribution (with the required
parameters) for your experiment, the original sample space could be discarded and
you could simply use the probability distribution to make predictions on unobserved
data.
. . .
Random Variables and various toolkits/libraries

The purpose of this section is to illustrate what I mentioned in the previous one, i.e. in
practice you end up dealing with distributions only. Here I am providing some examples
from different libraries that implement Random Variables (read
Probability Distributions) and show how they use these two terms interchangeably.
scipy.stats
scipy has a submodule called stats that implements various distributions. It defines the
base classes rv_continuous & rv_discrete, from which an impressive list of
distributions inherit.
Note how the scipy documentation mixes up the notions of Random Variables and
distributions. It says “An alpha continuous random variable” but the entry is
listed under the title “Continuous distributions”. The documentation even says that the
module contains a large number of probability distributions.
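A short sketch of how this looks in code, assuming scipy is installed. The objects stats.norm and stats.bernoulli are instances of subclasses of rv_continuous and rv_discrete, and calling them with parameters returns what scipy calls a frozen distribution. The parameter values here are arbitrary.

```python
from scipy import stats

# The same object is both "a random variable" and "a distribution" in scipy's terms.
print(isinstance(stats.norm, stats.rv_continuous))     # True
print(isinstance(stats.bernoulli, stats.rv_discrete))  # True

# Fix the parameter to get a frozen distribution for a fair coin.
coin = stats.bernoulli(0.5)
print(coin.pmf(1), coin.pmf(0))  # 0.5 0.5
print(coin.mean(), coin.var())   # 0.5 0.25

# Draw reproducible samples (1 = HEAD, 0 = TAIL).
tosses = coin.rvs(size=10, random_state=0)
print(tosses)
```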
pyMC3
pyMC3 is probably (no pun intended) the most used toolkit when it comes to
probabilistic programming.
You can try their notebook here, but as I show in the screenshot above, in pyMC3 Random
Variables and Distributions are one and the same thing.
tensorflow_probability & edward2
TensorFlow Probability is a package that is part of the TensorFlow ecosystem, and it defines
various distributions and neural network layers that make use of TensorFlow core
primitives and acceleration goodies. edward2 is described as a probabilistic programming
language that is built using tensorflow_probability. Phew!! Talk about adding more
confusion by calling a software framework or library a programming language, but this seems
to be a trend in this domain of probabilistic modeling.
At least edward2 seems to have a Python class called RandomVariable that encapsulates the
properties of a random variable, namely its distribution, sample shape and, optionally, its
value. I like this one as it seems to better represent the theory in software design.
Concluding Remarks
Think of Random Variables as Uncertain Variables
A Random Variable is a function that maps outcomes to real numbers
The Random Variable itself is deterministic; the randomness (or rather uncertainty) is
present in the sample space
You can apply probability calculus to Random Variables
In practice, Random Variables and Probability Distributions are often used
interchangeably even though they are different things (albeit one without the other is
useless).
References
[1] https://en.wikipedia.org/wiki/Random_variable
[2] A nice list of examples for Sample space and Events: https://faculty.math.illinois.edu/~kkirkpat/SampleSpace.pdf
[3] An answer on math.stackexchange.com that helped me regarding the usage of ‘Variable’ in the term: https://math.stackexchange.com/q/864839
[4] Aerin Kim (https://medium.com/@aerinykim) has many well written articles explaining important distributions.