
Lecture 4: Random Variables

MA2224 Data Analysis

2024-01-31

Introduction

Where is the Random?

Consider a population of 500 students in a school, N = 500. They are the full school, and information about them is
all that you care about.
You plan to sample 20 of the students from this population without replacement, n = 20, and observe if they have
green eyes or not.
Your population here is the 500 students: P = {y1, y2, . . . , y499, y500}.
These student eye colors are NOT RANDOM. Each student has a definite eye color that you just happen to not know,
and of those 500 students there is some number of them, x, with green eyes and 500 − x without green eyes.
In this situation the only thing that is random is how we choose those n = 20 students for our sample. We know that
there is a total of $\binom{500}{20} \approx 2.7 \times 10^{35}$ possible samples of students, and we would want to take those samples in a way
that most "captures" the population.
This situation is the Design-Based Approach to statistics where the method you use for taking your sample is
what introduces randomness into your analysis. The population is fixed and if it were possible to study the entire
population (a Census) then there would be no uncertainty about how many of the students have green eyes.
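
To make the distinction concrete, here is a minimal Python sketch of the design-based view. The 60-green / 440-not-green split is invented purely for illustration; the point is that the population is a fixed list and only the draw of the sample is random.

```python
import random

# Design-based view: the population is FIXED; only the sample is random.
# The 60 / 440 split is a made-up number used only for this illustration.
population = ["green"] * 60 + ["not green"] * 440  # N = 500 students

sample = random.sample(population, 20)  # n = 20, drawn without replacement
print(sum(color == "green" for color in sample), "green-eyed students in the sample")
```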
The design-based approach seems natural and follows logically with what we have done so far.
But this is not the approach we take in this course.
Instead think of it this way: There is an unknown probability of having green eyes called p = p(green eyes).
Each student Yi has the probability p of having green eyes and 1 − p of not having green eyes (notice the switch to
capital Y). We instead label these as 1 for Green and 0 for Not Green (any other numbers would work also), which
leads us to the following function:

$$p(y) = \begin{cases} 1 - p & \text{Not Green} \\ p & \text{Green} \end{cases} \quad\longrightarrow\quad p(y) = \begin{cases} 1 - p & y = 0 \\ p & y = 1 \end{cases}$$

This function p(y) becomes the Model for how the data we see is generated. In this situation there is not really a
population to speak of (though you do hear talk of the population itself being a sample from a multiverse of populations);
all there is here is the value p that controls how many green-eyed students we have.
This Model Based Approach is what we do in this stats class (and most stats classes).
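
For contrast, here is the matching sketch of the model-based view, where each observation is generated directly from the model p(y). The value p = 0.12 is hypothetical; in practice p is the unknown quantity being studied.

```python
import random

# Model-based view: each student's eye color is a draw from the model,
# 1 (Green) with probability p and 0 (Not Green) with probability 1 - p.
p = 0.12  # hypothetical value, chosen only for this sketch
y = [1 if random.random() < p else 0 for _ in range(20)]
print(y, "->", sum(y), "green out of 20")
```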
Two Approaches To Statistics

Design Based Statistics.


In this approach to statistics, your population is fixed and non-random. The randomness and uncertainty are
introduced because you take a sample from the population. The data you see depends on how "good" or "bad"
your method of sampling is. If you were able to see the entire population (a Census) then there would be no
randomness. Defining the population is extremely important here. These methods are used in surveying and other
situations where you can get a good sense of who or what is in your population and where they are.

Model Based Statistics.


In this approach, models for your data are used; populations do not need to be well-defined and are instead
assumed as part of the model. This is the more common approach taken in most courses.

Random Variables and Models

Random Variables

When coming up with the simple model for green eyes, the first thing we did was switch from using the words "Green"
and "Not Green" to using the numbers 1 and 0. This helped make the function p(y) look like a function we know from
other math classes.
This switch from strings and sequences in the sample space to numbers creates a Random Variable.

Random Variables.
A function from the sample space to the real numbers. In other words, it turns all the sequences of words and
letters in the sample spaces that we have been dealing with into real numbers.
Random Variable Example

We have several times now created the grid of outcomes for rolling two dice:

(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)

But we had also talked about the possibility of using SS = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12} as the sample space.
To make this situation a random variable we require numbers, not the pairs in the grid, so we can make a random
variable

Y = (sum of the two dice) for y = 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
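
As a quick sketch (assuming, as the sample space {2, . . . , 12} suggests, that Y is the sum of the two dice), the random variable is just a function applied to each outcome in the grid:

```python
from collections import Counter

# The random variable Y maps each outcome (d1, d2) in the grid to a number:
# here Y((d1, d2)) = d1 + d2, giving the values y = 2, 3, ..., 12.
outcomes = [(d1, d2) for d1 in range(1, 7) for d2 in range(1, 7)]
counts = Counter(d1 + d2 for d1, d2 in outcomes)
pmf = {y: counts[y] / 36 for y in sorted(counts)}
print(pmf)  # e.g. P(Y = 7) = 6/36
```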

Random Variable Types

Like we saw with sample spaces, there are 2 types of random variables

Discrete Random Variables (Addition).


A random variable Y that has a finite number of values or is "countable/listable infinite" (like 1, 2, 3, . . . as an
infinite list).

Continuous Random Variables (Calculus).


A random variable Y that has the same number of values as an interval of real numbers.
General Models

Probability Mass Function.


Is any formula, list, graph, etc. that takes a DISCRETE RANDOM VARIABLE Y and provides BOTH (1)
all possible values of Y, denoted by a lower case y, and (2) the corresponding probability P(Y = y).

Probability Density Function.


Is a function, f(y), for a CONTINUOUS RANDOM VARIABLE Y, such that the probability that Y is in
the interval [a, b] is given by

$$P(a \le Y \le b) = \int_a^b f(y)\,dy$$

Probability Mass Functions (pmf)

Discrete Axioms

Probability mass functions are the same exact things we have been doing, just from a tweaked perspective (the next
lecture will give more details on that).
With that in mind, it makes sense that once you have your random variable Y and want to give probabilities to
each possible y, you have to return to the same rules as before, the axioms:

Discrete Random Variable Axioms.


For a discrete random variable Y with y = y1, y2, y3, . . . , yN and N possibly infinite, the probability of each y,
written as P(Y = y) or just p(y), must satisfy the following:

1. $$0 \le p(y) \le 1 \qquad y = y_1, y_2, y_3, \ldots, y_N$$

2. $$\sum_y p(y) = 1$$

3. $$p(y_i \text{ or } y_j) = p(y_i) + p(y_j) \qquad i \ne j$$


Example 1

Consider the discrete random variable T with the following mass function

$$p(t) = \frac{t^2}{A} \qquad t = -4, -3, -2, -1, 1, 2, 3, 4$$

1. Determine the value of A.

By axiom 2, $\sum_y p(y) = 1$, so

$$\frac{(-4)^2}{A} + \frac{(-3)^2}{A} + \frac{(-2)^2}{A} + \frac{(-1)^2}{A} + \frac{(1)^2}{A} + \frac{(2)^2}{A} + \frac{(3)^2}{A} + \frac{(4)^2}{A} = 1$$

$$16 + 9 + 4 + 1 + 1 + 4 + 9 + 16 = A \quad\to\quad A = 60$$

2. Determine the probability that T is less than 2.

$$P(T < 2) = \frac{16}{60} + \frac{9}{60} + \frac{4}{60} + \frac{1}{60} + \frac{1}{60} = \frac{31}{60} = 0.517$$

3. Determine P(|T| ≤ 2).

$$P(|T| \le 2) = P(-2 \le T \le 2) = (4 + 1 + 1 + 4)/60 = 0.167$$

4. Determine P(T Positive | |T| Even).

$$P(T\ \text{Positive} \mid |T|\ \text{Even}) = \frac{P(T\ \text{Positive And}\ |T|\ \text{Even})}{P(|T|\ \text{Even})} = \frac{\frac{4}{60} + \frac{16}{60}}{\frac{16}{60} + \frac{4}{60} + \frac{4}{60} + \frac{16}{60}} = 0.5$$
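
All four parts can be checked in a few lines; a sketch using Python's exact fractions:

```python
from fractions import Fraction

# pmf p(t) = t^2 / A on t = -4, ..., -1, 1, ..., 4 (Example 1).
support = [-4, -3, -2, -1, 1, 2, 3, 4]
A = sum(t**2 for t in support)                    # axiom 2 forces A = 60
p = {t: Fraction(t**2, A) for t in support}

print(A)                                          # 60
print(sum(p[t] for t in support if t < 2))        # P(T < 2) = 31/60
print(sum(p[t] for t in support if abs(t) <= 2))  # P(|T| <= 2) = 1/6
num = sum(p[t] for t in support if t > 0 and t % 2 == 0)
den = sum(p[t] for t in support if t % 2 == 0)
print(num / den)                                  # P(T positive | |T| even) = 1/2
```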
Probability Density Functions (pdf)

Continuous Axioms

Continuous Random Variable Axioms.


For a continuous random variable Y, the probability density function, f(y), must satisfy the following:

1. $$f(y) \ge 0 \qquad -\infty < y < \infty$$

2. $$\int_{-\infty}^{\infty} f(y)\,dy = 1$$

Notice that the axioms are over the domain (−∞, ∞), but in many cases the function f(y) will be set to 0 on large
sections of the domain. The regions where the function is non-zero, f(y) > 0, will usually be called The Support.

Example 2

Consider the continuous random variable W with pdf

$$f(w) = \begin{cases} A w^2 & 1 \le w \le 3 \\ 0 & \text{elsewhere} \end{cases}$$

1. Determine the value of A.

$$\int_{-\infty}^{\infty} f(w)\,dw = \int_1^3 A w^2\,dw = 1 \quad\to\quad A\left(\frac{3^3}{3} - \frac{1^3}{3}\right) = 1 \quad\to\quad A = \frac{3}{26}$$

2. Find P(W < 2).

$$P(W < 2) = \frac{3}{26}\int_1^2 w^2\,dw = \frac{3}{26}\left(\frac{8}{3} - \frac{1}{3}\right) = \frac{7}{26} = 0.2692$$
3. Find P(2 < W < 2.5).

$$P(2 < W < 2.5) = \frac{3}{26}\int_2^{2.5} w^2\,dw = \frac{3}{26}\left(\frac{2.5^3}{3} - \frac{8}{3}\right) = \frac{7.625}{26} = 0.2933$$
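
These integrals are also easy to sanity-check numerically; a sketch assuming scipy is available:

```python
from scipy.integrate import quad  # numeric integration for a sanity check

# pdf f(w) = (3/26) w^2 on [1, 3] from Example 2.
A = 3 / 26
f = lambda w: A * w**2

print(quad(f, 1, 3)[0])    # ~1.0000 : total probability (axiom 2)
print(quad(f, 1, 2)[0])    # ~0.2692 : P(W < 2) = 7/26
print(quad(f, 2, 2.5)[0])  # ~0.2933 : P(2 < W < 2.5) = 7.625/26
```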
Example 3

Consider the continuous random variable T with pdf

$$g(t) = \begin{cases} 0.1\, e^{-0.1 t} & t > 0 \\ 0 & \text{elsewhere} \end{cases}$$

1. Determine P(T > 2).

$$P(T > 2) = \int_2^{\infty} 0.1 e^{-0.1 t}\,dt = \Big[-e^{-0.1 t}\Big]_2^{\infty} = 0 - (-e^{-0.2}) = e^{-0.2} = 0.8187$$

Or, using the complement:

$$P(T > 2) = 1 - P(T \le 2) = 1 - \int_0^2 0.1 e^{-0.1 t}\,dt = 1 - \Big[-e^{-0.1 t}\Big]_0^2 = 1 - (1 - e^{-0.2}) = e^{-0.2} = 0.8187$$

2. Determine P(T = 10).

$$P(T = 10) = \int_{10}^{10} 0.1 e^{-0.1 t}\,dt = 0$$

3. Determine P(|T − 3| > 1).

$$P(|T - 3| > 1) = P(T < 2) + P(T > 4) = \int_0^2 0.1 e^{-0.1 t}\,dt + \int_4^{\infty} 0.1 e^{-0.1 t}\,dt$$

$$P(|T - 3| > 1) = (1 - 0.8187) + \Big[-e^{-0.1 t}\Big]_4^{\infty} = 0.1813 + \big(0 - (-e^{-0.4})\big) = 0.1813 + 0.6703 = 0.8516$$
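
A closed-form check of these parts, using the exponential tail P(T > a) = e^(−0.1a) that the integrals above produce:

```python
import math

# For g(t) = 0.1 e^{-0.1 t} on t > 0, the tail is P(T > a) = e^{-0.1 a}.
tail = lambda a: math.exp(-0.1 * a)

print(tail(2))                  # P(T > 2) = e^{-0.2} ~ 0.8187
print((1 - tail(2)) + tail(4))  # P(|T - 3| > 1)      ~ 0.8516
# P(T = 10) needs no code: a single point of a continuous pdf has probability 0.
```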

Note: Absolute value is basically distance from some center: |X − center| > distance asks for the x-values that are
more than that distance away from the center.
So |T − 3| > 1 means that you need to find the T-values that are more than 1 unit away from 3, which means above 4
or below 2.
In a similar way, |Y − 10| < 3 means that you want the Y-values that are less than 3 units away from 10, which are
the values from 7 to 13.
You just have to watch for inclusive vs. exclusive based on ≤ / ≥ vs < / >, and note that |X + 3| puts the center at −3.
The CDF

Cumulative Distribution Function

The process of constantly integrating a pdf starts to get a little repetitive and more boring than usual. The cumulative
distribution function or cdf will give you a function that has already handled the integration for you.

Cumulative Distribution Function (cdf).


The function, F(y), such that for any Random Variable Y:

$$F(y) = P(Y \le y) \qquad \text{for all } y \in (-\infty, \infty)$$

- If Y is continuous (the usual case):

$$F(y) = \int_{-\infty}^{y} f(t)\,dt$$

- If Y is discrete:

$$F(y) = \sum_{y_j \le y} p(y_j)$$
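
For the discrete case, a small sketch stacking the dice-sum pmf from earlier into a cdf:

```python
from itertools import accumulate

# Discrete cdf: F(y) = sum of p(y_j) over all y_j <= y.
# pmf of the dice sum (counts out of 36) from the earlier example.
ys = list(range(2, 13))
pmf = [min(y - 1, 13 - y) / 36 for y in ys]
cdf = dict(zip(ys, accumulate(pmf)))
print(cdf[7])   # P(Y <= 7) = 21/36
print(cdf[12])  # P(Y <= 12) -> 1.0 (up to float rounding)
```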

Cumulative Distribution Function

Note: a capital function letter is the cdf (F(y)), a lowercase function letter is the pdf (f(y)), a capital variable letter
is the random quantity (Y), and a lowercase variable letter is a specific value (y).
Note also, cdfs for discrete random variables are mostly useless in this class, though they have come up once or twice in 10 years.

CDF to PDF Relationship (Continuous Random Variables).


$$F(y) = P(Y \le y) = \int_{-\infty}^{y} f(t)\,dt = \int_{-\infty}^{y} (\text{pdf})\,dt$$

So if you have the cdf and want the pdf, you need to take its derivative:

$$\frac{dF(y)}{dy} = \frac{d(\text{cdf})}{dy} = F'(y) = f(y) = \text{pdf}$$
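
A sketch of this derivative relationship with sympy; the cdf used here, 1 − 4/v, is the one worked out in Example 4 below:

```python
import sympy as sp

# cdf -> pdf: differentiate F(v). Here F(v) = 1 - 4/v on its support v > 4.
v = sp.symbols("v", positive=True)
F = 1 - 4 / v
f = sp.diff(F, v)
print(f)  # 4/v**2, the pdf on the support
```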
CDF Rules (IMPORTANT)

Using a CDF (IMPORTANT).


Note that there are only three ways to use a cdf, F(y):
- For at most (or less than) you just plug into the cdf:

P (Y ≤ a) = F (a)

- For at least (or more than) you just plug into the cdf then subtract from 1:

P (Y ≥ a) = 1 − F (a)

- For between two values you just plug each into the cdf and subtract the right value from the left value:

P (a ≤ Y ≤ b) = F (b) − F (a)
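
These three rules are mechanical enough to write down directly. A minimal sketch, where F is any cdf passed in as a function (the helper names are my own):

```python
# The only three ways to use a cdf F (continuous case).
def at_most(F, a):     # P(Y <= a)
    return F(a)

def at_least(F, a):    # P(Y >= a)
    return 1 - F(a)

def between(F, a, b):  # P(a <= Y <= b)
    return F(b) - F(a)
```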

Example 4

Consider the continuous random variable V with pdf



$$f(v) = \begin{cases} \dfrac{c}{v^2} & 4 < v \\[4pt] 0 & \text{elsewhere} \end{cases}$$

1. Determine the cdf of V.

First find c:

$$\int_4^{\infty} \frac{c}{v^2}\,dv = 1 \quad\to\quad \Big[-\frac{c}{v}\Big]_4^{\infty} = \frac{c}{4} = 1 \quad\to\quad c = 4$$

Then integrate up to v:

$$F(v) = P(V \le v) = \int_4^{v} \frac{4}{t^2}\,dt = \Big[-\frac{4}{t}\Big]_4^{v} = \frac{4}{4} - \frac{4}{v} = 1 - \frac{4}{v}$$

$$F(v) = \begin{cases} 0 & v \le 4 \\[4pt] 1 - \dfrac{4}{v} & 4 < v \end{cases}$$
2. Use the cdf to find P (V ≤ 5).

P (V ≤ 5) = F (5) = 1 − 4/5 = 0.2



3. Use the cdf to find P (8 < V ≤ 9)

P (8 < V ≤ 9) = F (9) − F (8) = (1 − 4/9) − (1 − 4/8) = 0.0556

4. Use the cdf to find P(V ≥ 10 | V > 7).

$$P(V \ge 10 \mid V > 7) = \frac{P(V \ge 10 \text{ And } V > 7)}{P(V > 7)} = \frac{P(V \ge 10)}{P(V > 7)} = \frac{1 - F(10)}{1 - F(7)} = \frac{1 - (1 - 4/10)}{1 - (1 - 4/7)} = \frac{4/10}{4/7} = 0.7$$
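
Parts 2 through 4 can be confirmed by coding the cdf straight from the piecewise formula; a short Python sketch:

```python
# cdf from Example 4: F(v) = 0 for v <= 4, and 1 - 4/v for v > 4.
def F(v):
    return 0.0 if v <= 4 else 1 - 4 / v

print(F(5))                      # P(V <= 5)          = 0.2
print(F(9) - F(8))               # P(8 < V <= 9)      ~ 0.0556
print((1 - F(10)) / (1 - F(7)))  # P(V >= 10 | V > 7) = 0.7
```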
