
Distribution and Loss Functions

Alex Robinson

March 9, 2018
Probability Distributions

• A probability distribution links the outcomes of some random event to the probability that each outcome occurs.
• They come in two flavors: discrete and continuous.
• Discrete distributions are used when there are a finite number of outcomes. For example, rolling a six-sided die.
• When we roll a six-sided die, there are six outcomes: rolling a 1, 2, 3, 4, 5, or 6.
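
As a quick sketch (in Python; the slides themselves contain no code), a discrete distribution can be represented as a map from each outcome to its probability:

```python
# A fair six-sided die: each of the six outcomes gets probability 1/6.
die_pmf = {outcome: 1 / 6 for outcome in range(1, 7)}

# Sanity check: the probabilities across all outcomes must sum to 1.
assert abs(sum(die_pmf.values()) - 1.0) < 1e-9

print(die_pmf[3])  # probability of rolling a 3 -> 0.1666...
```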
Probability Distributions cont.

• Continuous distributions are used when we have an infinite number of outcomes.
• For example, imagine randomly choosing any real number (whole numbers, fractions, irrational numbers included) between 1 and 10.
• There is an equal chance that any given number will be chosen. However, there are billions upon billions of possible choices!
• So, the probability that any one number (say, 8.47563) will be chosen is very, very small.
Discrete Distributions: Example
• Suppose we randomly chose 20 integers between 1 and 10, with the frequency of each integer shown in the table below.
Outcome Frequency
1 1
2 2
3 2
4 1
5 4
6 1
7 3
8 2
9 1
10 3
Discrete Distributions: Example
• Weight each of the 20 chosen numbers equally. Since we chose 20 numbers, this means that if a given number was chosen n times, the probability assigned to that number is n/20. Adding to our table, we get:
Outcome Frequency Probability
1 1 0.05
2 2 0.1
3 2 0.1
4 1 0.05
5 4 0.2
6 1 0.05
7 3 0.15
8 2 0.1
9 1 0.05
10 3 0.15
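
A small sketch of how this table's probability column is derived from the frequencies (illustrative Python, not from the slides):

```python
# Frequencies from the table above: 20 draws of integers from 1 to 10.
frequencies = {1: 1, 2: 2, 3: 2, 4: 1, 5: 4, 6: 1, 7: 3, 8: 2, 9: 1, 10: 3}

# Each outcome's probability is its frequency divided by the 20 draws.
total = sum(frequencies.values())  # 20
pmf = {outcome: count / total for outcome, count in frequencies.items()}

print(pmf[5])             # 0.2
print(sum(pmf.values()))  # 1.0
```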
Discrete Distributions: Example

• Suppose we place the outcomes from our example into 5 equal bins: 1 and 2, 3 and 4, 5 and 6, and so on. How do we find the probability that an outcome is in one of these bins?
• For bin one, it is Pr(1) + Pr(2) = 0.05 + 0.1 = 0.15. More generally, if X is our random event, and we want to know the probability that it will fall into some set of outcomes, say {1, 2, . . . , n}, then the formula is:

\Pr(X \in \{1, 2, \dots, n\}) = \sum_{i=1}^{n} \Pr(X = i)

This is the probability mass of the distribution for the particular set.
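
A minimal sketch of this sum over a set of outcomes, using the example distribution (the helper name prob_of_set is ours, purely illustrative):

```python
# PMF of the example distribution from the table above.
pmf = {1: 0.05, 2: 0.1, 3: 0.1, 4: 0.05, 5: 0.2,
       6: 0.05, 7: 0.15, 8: 0.1, 9: 0.05, 10: 0.15}

def prob_of_set(pmf, outcomes):
    """Probability mass of a set: Pr(X in S) = sum of Pr(X = i) for i in S."""
    return sum(pmf[i] for i in outcomes)

print(prob_of_set(pmf, {1, 2}))  # bin one: 0.15 (up to float rounding)
print(prob_of_set(pmf, {5, 6}))  # bin three: 0.25
```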
The Probability Mass Function
• If we graph the probability masses of all possible outcomes, we get the following:

[Figure: bar chart of the PMF for the example distribution]

This is the graph of the Probability Mass Function (PMF) for our example distribution.
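
One way to reproduce such a graph (a sketch assuming matplotlib is available; the slides do not say how their figure was made):

```python
import matplotlib.pyplot as plt

# PMF of the example distribution.
pmf = {1: 0.05, 2: 0.1, 3: 0.1, 4: 0.05, 5: 0.2,
       6: 0.05, 7: 0.15, 8: 0.1, 9: 0.05, 10: 0.15}

# A bar chart: one bar per outcome, with height equal to its probability.
plt.bar(list(pmf.keys()), list(pmf.values()))
plt.xlabel("Outcome")
plt.ylabel("Probability")
plt.title("PMF of the example distribution")
plt.show()
```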
The Cumulative Distribution Function

• Related to the PMF is the cumulative distribution function (CDF).
• The CDF measures the cumulative probability of the distribution, starting at 0 and ending at a total of 1.
• If we want to know Pr(X < 7), we use the CDF:

\Pr(X < 7) = \Pr(X \in \{1, 2\}) + \Pr(X \in \{3, 4\}) + \Pr(X \in \{5, 6\})
           = 0.15 + 0.15 + 0.25 = 0.55

More generally:

\Pr(X < x) = \sum_{i < x} \Pr(X = i)
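
A sketch of the discrete CDF as a running sum of the PMF (illustrative Python; the helper name cdf_strict is ours):

```python
# PMF of the example distribution.
pmf = {1: 0.05, 2: 0.1, 3: 0.1, 4: 0.05, 5: 0.2,
       6: 0.05, 7: 0.15, 8: 0.1, 9: 0.05, 10: 0.15}

def cdf_strict(pmf, x):
    """Pr(X < x): sum of Pr(X = i) over outcomes i strictly below x."""
    return sum(p for i, p in pmf.items() if i < x)

print(cdf_strict(pmf, 7))   # 0.55, matching the slide's calculation
print(cdf_strict(pmf, 11))  # 1.0 -- the CDF ends at a total of 1
```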
Graphing the CDF

If we graph the CDF of our example distribution we get:

[Figure: CDF of the example distribution]

Notice that the CDF ranges from 0 to 1 and never decreases.
Expected Value
• The expected value of a random event X is the value that we "expect" X to be.
• For discrete distributions, the expected value is a weighted average of the outcomes, with the associated probabilities as the weights.
• The expected value of our example distribution is

1(0.05) + 2(0.1) + 3(0.1) + 4(0.05) + 5(0.2) + 6(0.05) + 7(0.15) + 8(0.1) + 9(0.05) + 10(0.15) = 5.85

• For many continuous distributions, the expected value of X is given as the mean parameter.
• For example, if X is normally distributed with a mean (µ) of 1, then E(X) = 1.
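
The weighted average above is one line of code (a sketch, using the example PMF):

```python
# PMF of the example distribution.
pmf = {1: 0.05, 2: 0.1, 3: 0.1, 4: 0.05, 5: 0.2,
       6: 0.05, 7: 0.15, 8: 0.1, 9: 0.05, 10: 0.15}

# Expected value: outcomes weighted by their probabilities.
expected_value = sum(outcome * p for outcome, p in pmf.items())
print(expected_value)  # 5.85
```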
Loss Functions
• Suppose we have a parameter, Q, that we do not want whatever our example distribution (X) is measuring to exceed. If we were measuring sales, Q could be our capacity.
• It would be useful to know by how much we expect X to exceed Q.
• If X has outcomes 1, . . . , n (imagine an n-sided die), with p_i denoting the probability of outcome i, then the general formula is as follows:

E(\max(X - Q, 0)) = \sum_{i=1}^{n} p_i \max(i - Q, 0) = \sum_{i=Q}^{n} p_i (i - Q)

We need the max(X − Q, 0) since we only want to consider situations where X exceeds Q.
Loss Functions - Example
• Recall our example discrete distribution from earlier.
• Suppose our Q is 7. We want to know by how much we expect X to exceed 7.
• Using our formula from the previous slide:

E(\max(X - 7, 0)) = \sum_{i=1}^{10} p_i \max(i - 7, 0)
                  = \sum_{i=7}^{10} p_i (i - 7)
                  = p_7 (7 - 7) + p_8 (8 - 7) + p_9 (9 - 7) + p_{10} (10 - 7)
                  = 1(0.1) + 2(0.05) + 3(0.15)
                  = 0.65
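
The same calculation as a sketch in Python (the helper name expected_overage is ours):

```python
# PMF of the example distribution.
pmf = {1: 0.05, 2: 0.1, 3: 0.1, 4: 0.05, 5: 0.2,
       6: 0.05, 7: 0.15, 8: 0.1, 9: 0.05, 10: 0.15}

def expected_overage(pmf, Q):
    """Discrete loss function: E(max(X - Q, 0)) = sum of p_i * max(i - Q, 0)."""
    return sum(p * max(i - Q, 0) for i, p in pmf.items())

print(expected_overage(pmf, 7))  # 0.65, matching the slide's calculation
```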
Continuous Distributions

• Recall that continuous distributions have an infinite number of outcomes.
• The simplest continuous distribution is the Uniform distribution.
• The uniform distribution assigns equal probability to every outcome in a given interval. So a uniform distribution between 0 and 10 assigns the same probability to every number between 0 and 10, and 0 probability elsewhere.
• Another common continuous distribution is the Normal distribution. This distribution assigns higher probabilities to outcomes near the mean, given as a parameter, and lower probabilities the further away from the mean you get.
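
Both distributions are available off the shelf; for instance, a sketch with scipy.stats (an assumption on our part; the slides name no library):

```python
from scipy.stats import norm, uniform

u = uniform(loc=0, scale=10)  # uniform distribution between 0 and 10
n = norm(loc=1, scale=1)      # normal distribution with mean 1

print(u.pdf(5.0))   # 0.1 -- the same density everywhere in [0, 10]
print(u.pdf(12.0))  # 0.0 -- zero probability outside the interval
print(n.mean())     # 1.0 -- the mean parameter is the expected value
```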
Continuous Distributions cont.

• In a uniform distribution, if we assign equal probability to every number between 0 and 10 being chosen, then the probability of any one number must be tiny! After all, there are too many numbers between 0 and 10 to count!
• In fact, if X is a continuous random event, and x is an outcome, then Pr(X = x) = 0.
• Instead, for continuous distributions (not just uniform), we think in terms of intervals. So, we might ask what the probability is that X falls between a and b.
The Probability Density Function

• For discrete distributions, we were able to construct the PMF by dividing outcomes into bins and summing up the probabilities.
• Now, we divide the outcomes into infinitely many, infinitely small bins.
• The result is a smoothed version of the PMF, called a probability density function (PDF).
• Each point on the PDF represents the likelihood of that outcome occurring relative to all other outcomes.
Example PDFs
The PDF for a uniform distribution between a and b is:

[Figure: PDF of the uniform distribution on [a, b]]

For the normal distribution with a mean of 0 and standard deviation of 1, it is:

[Figure: PDF of the standard normal distribution]
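
For reference, the standard formulas behind these two graphs (well-known results, written here in LaTeX since the slide images are not reproduced):

```latex
% PDF of the uniform distribution on [a, b]
f_X(x) = \begin{cases} \frac{1}{b - a} & a \le x \le b \\ 0 & \text{otherwise} \end{cases}

% PDF of the standard normal distribution (mean 0, standard deviation 1)
f_X(x) = \frac{1}{\sqrt{2\pi}} \, e^{-x^2/2}
```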
More PDFs

• The real use of PDFs is to figure out Pr(a ≤ X ≤ b), the probability that a random event falls within a certain interval.
• Graphically, this is the area under the PDF between the ends of the interval, a and b (on the x axis).
• Mathematically, this is an integral:

\Pr(a \le X \le b) = \int_{a}^{b} f_X(x) \, dx

where f_X(x) gives the probability density at x.
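
A sketch of this integral for the standard normal distribution, checked against the CDF (assumes scipy is available):

```python
from scipy.integrate import quad
from scipy.stats import norm

a, b = -1.0, 1.0

# Numerically integrate the PDF over [a, b] ...
area, _ = quad(norm.pdf, a, b)
print(area)  # ~0.6827

# ... which matches the difference of CDF values at the endpoints.
print(norm.cdf(b) - norm.cdf(a))  # ~0.6827
```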


Continuous CDFs
• For discrete distributions we were able to use the PMF to find the cumulative probability of the distribution in the form of the CDF.
• The same is possible for continuous distributions: we can use the PDF to find a "running" total of probability for the distribution (also called the CDF).
• Just as for PDFs we thought of adding up an infinite number of infinitely small bins, we do the same for the CDF.
• Now, instead of just adding up each bin, we take a cumulative sum over the whole distribution.
• Mathematically, this is once again an integral:

\Pr(X \le x) = F_X(x) = \int_{-\infty}^{x} f_X(t) \, dt
Examples of Continuous CDFs
The CDF for a uniform distribution between a and b is:

[Figure: CDF of the uniform distribution on [a, b]]

The CDF for the normal distribution with a mean of 0 and standard deviation of 1 is:

[Figure: CDF of the standard normal distribution]
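
Again for reference, the standard formulas behind these graphs (well-known results):

```latex
% CDF of the uniform distribution on [a, b]
F_X(x) = \begin{cases} 0 & x < a \\ \frac{x - a}{b - a} & a \le x \le b \\ 1 & x > b \end{cases}

% CDF of the standard normal distribution; it has no closed form and is
% usually written in terms of the error function erf
F_X(x) = \Phi(x) = \frac{1}{2} \left( 1 + \operatorname{erf}\!\left( \frac{x}{\sqrt{2}} \right) \right)
```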
Continuous Loss Functions

• We can also define loss functions for continuous distributions.
• Recall that a loss function tells us E(max(X − Q, 0)), here for a continuous random event X and some parameter Q.
• Just like the PDF and CDF, the continuous loss function formula has an integral rather than a sum:

E(\max(X - Q, 0)) = \int_{Q}^{\infty} f_X(x)(x - Q) \, dx

• Rather than calculating this by hand, common continuous loss functions are often evaluated using tables.
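
A sketch of evaluating this integral numerically for a standard normal X, checked against the standard normal loss function L(Q) = f(Q) − Q(1 − F(Q)), the identity such tables are built from (assumes scipy is available):

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

Q = 1.0

# Numerically integrate f(x) * (x - Q) from Q to infinity.
loss, _ = quad(lambda x: norm.pdf(x) * (x - Q), Q, np.inf)
print(loss)  # ~0.0833

# Closed form for the standard normal: pdf(Q) - Q * (1 - cdf(Q)).
print(norm.pdf(Q) - Q * (1 - norm.cdf(Q)))  # ~0.0833
```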
