Continuous Distributions


V. Bardis

Theory: Consider the case where the variable of interest, X, is continuous, that is, it can take on any
value in an interval (or union of intervals).

Example: X = family wealth, X = rhino height, X = running speed, X = guitar weight, etc.

[Figure: two guitars, a Les Paul and a Stratocaster.] Which guitar is heavier?

Theory: Further suppose that there exists an interval (or intervals) of values of X such that every
subinterval of positive length, however small, contains a positive portion of the population. Also
assume that the portion for every interval of size zero is equal to zero, i.e., the relative frequency
(or probability) of any single given value is zero. Then X is a continuous variable.

Example: The following is an example that fits the theoretical model of a continuous distribution quite
well.

Suppose we have 1 kg of gold dust (shown in purple below) that we spread evenly inside a fish tank of
length 2.
The fish tank is put on a graph where the horizontal axis is used to identify different parts of the
fish tank corresponding to intervals of its base (location) and the vertical axis is used to record
the height (or density) of the mass of gold at every point inside the tank.

Since the gold is evenly spread inside the tank, the height (or density) is the same at every
point.

The total mass of gold inside the tank is equal to " Height x Base " = (1/2) (2) = 1, i.e., the
purple mass we see is 100% of the gold.

Suppose we insert a straw of width α and extract the gold inside the straw. What portion of
gold will we extract?

Answer: (1/2) α = α /2

Observe that as α increases, the portion of the gold we extract increases.
Likewise, as α decreases, the portion of the gold we extract decreases, so that as the straw
becomes very thin, i.e., α → 0, we end up with a near-zero portion of the gold.

Suppose α = 0.10 and we apply the straw starting at location 1 which means it will cover up
to location 1.10, an interval equal to the width of the straw. Since the density is 1/2 from 1 to
1.10, we will extract (1/2)(.10) = 0.05 or 5% of the gold. Thus we say that between 1 and 1.10
we find 5% of the gold. More formally,

P(1 ≤ X ≤ 1.10) = 0.05

Because the gold is evenly spread inside the tank, the portion of gold that can be extracted
with a straw of width α is the same at any location (as long as the straw lies entirely inside
the tank). This is the case of a uniform continuous distribution.
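The straw calculation can be sketched in a few lines of Python. This is only an illustration of "density times width" for the fish tank example; the function name `straw_portion` and the clipping of the straw at the tank walls are assumptions made for the sketch, not part of the original notes.

```python
def straw_portion(start, width, a=0.0, b=2.0):
    """Portion of the gold inside [start, start + width] for a tank spanning [a, b].

    The density is the constant 1 / (b - a), so the portion is
    density * (length of the straw that actually overlaps the tank).
    """
    density = 1.0 / (b - a)
    left = max(start, a)             # clip the straw to the tank walls
    right = min(start + width, b)
    overlap = max(right - left, 0.0)
    return density * overlap


# The straw from the text: width 0.10, applied at location 1.
print(round(straw_portion(1.0, 0.10), 4))   # 0.05
# Same width at a different location gives the same portion.
print(round(straw_portion(0.4, 0.10), 4))   # 0.05
# A much thinner straw extracts a much smaller portion.
print(round(straw_portion(1.0, 0.001), 6))  # 0.0005
```

The results are rounded before printing because the subtraction is done in floating point.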

As shown below, in the case of a uniform continuous distribution we have a "density curve"
(the top boundary of the purple mass) that is flat (i.e., parallel to the x axis).
For every continuous distribution, the area under the density curve over a given interval is the
relative frequency (or probability) of that interval.

Formally, the curve is the graph of the frequency (or probability) density function of the distribution,
denoted by f. In our example, f is given by

$$f(x) = \begin{cases} 1/2, & 0 \le x \le 2 \\ 0, & \text{otherwise} \end{cases}$$

Note that f(x) is NOT a relative frequency (or probability) in the case of a continuous distribution.
It is simply the "height of the density curve at x".
As with discrete distributions, we can trace cumulative frequencies (or probabilities) from the smallest to
the largest values of X using what is known as the cumulative frequency (or distribution) function F(x).
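Think of the "integral" sign in what follows as "the area under the curve".

$$F(x) = P(X \le x)$$

(the portion of the population with value less than or equal to x)

The two functions are related as follows:

$$F(x) = \int_{-\infty}^{x} f(t)\,dt \quad \text{and} \quad f(x) = \frac{dF(x)}{dx}$$
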
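Basic Properties of a Continuous Distribution

$$f(x) \ge 0 \ \text{for all } x$$
(the height of the density curve cannot be negative)

$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$
(the total area under the density curve must be equal to 1)

$$0 \le F(x) \le 1 \ \text{for all } x$$
(cumulative frequencies/probabilities cannot be negative or greater than 1)

$$F(x_1) \le F(x_2) \ \text{whenever } x_1 \le x_2$$
(F(x) is non-decreasing in x)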
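As a rough numerical check, the four properties can be verified for the fish tank density. The sketch below approximates "the area under the curve" with a midpoint Riemann sum; the helper names, the grid of test points and the tolerance `tol` are illustrative assumptions, and the results are only accurate up to the error of the sum.

```python
def f(x):
    """Density of the fish tank example: 1/2 on [0, 2], 0 elsewhere."""
    return 0.5 if 0.0 <= x <= 2.0 else 0.0


def F(x, lower=-1.0, steps=20_000):
    """Approximate F(x) as the area under f from `lower` up to x.

    `lower` only needs to sit below the tank (f is 0 there), so it stands
    in for minus infinity. The area is a midpoint Riemann sum.
    """
    if x <= lower:
        return 0.0
    h = (x - lower) / steps
    return h * sum(f(lower + (i + 0.5) * h) for i in range(steps))


tol = 1e-3  # allowance for the numerical error of the Riemann sum
grid = [-0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
values = [F(x) for x in grid]

print(all(f(x) >= 0 for x in grid))                           # density is never negative
print(abs(F(3.0) - 1.0) < tol)                                # total area is 1
print(all(-tol <= v <= 1.0 + tol for v in values))            # F stays within [0, 1]
print(all(u <= v + tol for u, v in zip(values, values[1:])))  # F is non-decreasing
```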

Calculation of some Population Parameters

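Population Mean (the "center of mass")

$$\mu = \int_{-\infty}^{\infty} x\, f(x)\,dx$$

Population Variance

$$\sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$$

Population Median

m such that F(m) = 0.5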


(We are not very concerned with the mathematical details of calculating mean and variance of
continuous variables at this stage.)

In the example, it is easy to find the median: for 0 ≤ x ≤ 2 we have F(x) = (1/2) x, so 0.5 = F(m) => 0.5 = (1/2) m => m = 1.
Because the distribution is symmetric, the mean must equal the median, so the mean is also 1.
The variance is equal to 1/3 and so the standard deviation is equal to sqrt(1/3) ≈ 0.577.
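For readers who want to see where these numbers come from, the sketch below recovers them numerically for the fish tank example. It assumes the standard definitions (the mean as the area under x·f(x), the variance as the area under (x − mean)²·f(x), the median as the point where F reaches 0.5); the helper names `area` and `median` are made up for this illustration.

```python
import math


def f(x):
    """Density of the fish tank example: 1/2 on [0, 2], 0 elsewhere."""
    return 0.5 if 0.0 <= x <= 2.0 else 0.0


def F(x):
    """Closed-form cumulative function for the example: x/2 on [0, 2]."""
    return min(max(x / 2.0, 0.0), 1.0)


def area(g, lo, hi, steps=100_000):
    """Midpoint Riemann sum for the area under g between lo and hi."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))


def median(cdf, lo, hi, tol=1e-9):
    """Bisection search for m with cdf(m) = 0.5 (cdf must be non-decreasing)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cdf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


mean = area(lambda x: x * f(x), 0.0, 2.0)
variance = area(lambda x: (x - mean) ** 2 * f(x), 0.0, 2.0)

print(round(mean, 4))                 # 1.0
print(round(variance, 4))             # 0.3333
print(round(math.sqrt(variance), 4))  # 0.5774
print(round(median(F, 0.0, 2.0), 4))  # 1.0
```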

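In general, if X is uniformly distributed on the interval [a,b], i.e., X~U(a,b), then

$$f(x) = \begin{cases} \dfrac{1}{b-a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$$
(density function)

$$F(x) = \begin{cases} 0, & x < a \\ \dfrac{x-a}{b-a}, & a \le x \le b \\ 1, & x > b \end{cases}$$
(cumulative distribution function)

$$\mu = \frac{a+b}{2}$$
(population mean)

$$m = \frac{a+b}{2}$$
(population median)

$$\sigma^2 = \frac{(b-a)^2}{12}$$
(population variance)

$$\sigma = \frac{b-a}{\sqrt{12}}$$
(population standard deviation)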
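The standard closed-form results for U(a,b) translate directly into code. The following is a minimal sketch; the function names `uniform_cdf` and `uniform_summary` are chosen for this illustration only. Setting a = 0 and b = 2 should reproduce the fish tank numbers.

```python
import math


def uniform_cdf(x, a, b):
    """F(x) for X ~ U(a, b): 0 below a, (x - a)/(b - a) on [a, b], 1 above b."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)


def uniform_summary(a, b):
    """Mean, median, variance and standard deviation of U(a, b)."""
    if not a < b:
        raise ValueError("need a < b")
    return {
        "mean": (a + b) / 2.0,
        "median": (a + b) / 2.0,
        "variance": (b - a) ** 2 / 12.0,
        "std_dev": (b - a) / math.sqrt(12.0),
    }


tank = uniform_summary(0.0, 2.0)
print(tank["mean"], tank["median"])  # 1.0 1.0
print(round(tank["variance"], 4))    # 0.3333
print(round(tank["std_dev"], 4))     # 0.5774
print(uniform_cdf(1.0, 0.0, 2.0))    # 0.5
```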


Another Example of a Continuous Distribution: The Triangular Distribution

Suppose the gold inside the fish tank is distributed as follows

The gold is no longer uniformly distributed inside the tank. The height or density varies with
location. It follows that a straw of width α will extract different amounts at different locations.

But note, as before, the area of the triangle is equal to (1/2) x Height x Base = (1/2) (1) (2) = 1.
The area under the density must always be equal to 1.
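To see how the same straw extracts different amounts at different locations, here is a small sketch. The exact shape of the triangle is not recoverable from the text, so the code ASSUMES a symmetric triangle on [0, 2] that peaks at height 1 over location 1 (this matches the stated base 2 and height 1, but the position of the peak is a guess). The function names are hypothetical.

```python
def tri_density(x):
    """ASSUMED shape: rises from 0 at x=0 to 1 at x=1, then falls back to 0 at x=2."""
    if 0.0 <= x <= 1.0:
        return x
    if 1.0 < x <= 2.0:
        return 2.0 - x
    return 0.0


def tri_cdf(x):
    """Area under the assumed triangle to the left of x."""
    if x <= 0.0:
        return 0.0
    if x <= 1.0:
        return x * x / 2.0
    if x <= 2.0:
        return 1.0 - (2.0 - x) ** 2 / 2.0
    return 1.0


def tri_straw_portion(start, width):
    """Portion of gold between start and start + width: a difference of areas."""
    return tri_cdf(start + width) - tri_cdf(start)


# Same straw width, different locations, different portions.
print(round(tri_straw_portion(0.00, 0.10), 4))  # near the left edge: 0.005
print(round(tri_straw_portion(0.95, 0.10), 4))  # centred on the peak: 0.0975
print(round(tri_cdf(2.0), 4))                   # total area: 1.0
```

Under this assumed shape, a straw of width 0.10 centred on the peak extracts almost twenty times as much gold as the same straw at the edge.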

More formally, we can characterize this triangular distribution as follows:

As before, we are not concerned with the exact calculations of the population parameters here. Instead we wish to focus
on how to identify "frequencies" or "portions" (and eventually "probabilities") as areas under the density curve and how to
identify what are known as quantiles using the cumulative distribution function.
The Normal Distribution (the "Bell Curve")

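The "Standard Normal Distribution" is the Normal distribution with zero mean and variance equal to 1.
The Density Function and the Cumulative Distribution Function of the Normal Distribution with mean μ and standard deviation σ,

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

and

$$F(x) = \int_{-\infty}^{x} f(t)\,dt$$

are not easy to remember... so we use the Z-table instead.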

The Z-Table: The Table for the Cumulative Frequencies of the Standard Normal Distribution

F(z) = P(Z ≤ z) (cumulative frequency or probability)
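A Z-table entry can also be reproduced in code. The sketch below uses the well-known identity F(z) = (1/2)(1 + erf(z/√2)) together with `math.erf` from the Python standard library; the function name `std_normal_cdf` is chosen for this illustration and is not something the notes define.

```python
import math


def std_normal_cdf(z):
    """F(z) = P(Z <= z) for the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# A few rows of a Z-table.
for z in (-1.0, 0.0, 1.0, 1.96):
    print(f"z = {z:5.2f}   F(z) = {std_normal_cdf(z):.4f}")
```

The printed values should agree with a printed Z-table to four decimals (for example, F(0) = 0.5000 and F(1.96) ≈ 0.9750).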


The Normal Distribution...

- is symmetric and unimodal (μ = m = mode)

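- satisfies the "68-95-99.7 rule": about 68%, 95% and 99.7% of the population lies within 1, 2 and 3 standard deviations of the mean, respectively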

- fits many empirical distributions in nature quite well
  (such as the height or length of animals, plants, etc.)
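The percentages behind the rule for 1, 2 and 3 standard deviations (about 68%, 95% and 99.7%) can be checked with a short sketch. It relies on the identity P(−k ≤ Z ≤ k) = erf(k/√2) for a standard normal Z; the function name `within_k_sd` is an illustrative choice.

```python
import math


def within_k_sd(k):
    """Portion of a normal population within k standard deviations of the mean.

    Uses P(-k <= Z <= k) = F(k) - F(-k) = erf(k / sqrt(2)).
    """
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(f"within {k} sd: {100 * within_k_sd(k):.1f}%")
# within 1 sd: 68.3%
# within 2 sd: 95.4%
# within 3 sd: 99.7%
```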

It has many useful applications:

- it is used often as the "distribution of errors" in measurement (Gauss 1801/1809) with the
properties that

* Errors of equal magnitude are equally likely and symmetric about 0 and so
Expected Error = 0
* Large Errors are less likely than Small Errors
* In the presence of several measurements, the most likely value is their average

- it is used to carry out hypothesis testing.

- it is used often to construct models of risk and uncertainty due to its simplicity.

- it is a good approximation for other important distributions, such as the Binomial Distribution
and the distribution of the "sum" of independently and identically distributed random
variables, and therefore for the "distribution of the sample mean".
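As a small illustration of the Binomial approximation, the sketch below compares an exact Binomial cumulative probability with its Normal counterpart. The numbers n = 100, p = 0.5, k = 55 are arbitrary choices for the example, and the +0.5 "continuity correction" is a common refinement that these notes do not discuss. `math.comb` requires Python 3.8 or later.

```python
import math


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p), summing the individual probabilities."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_cdf(x, mu, sigma):
    """P(X <= x) for a Normal distribution with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


n, p, k = 100, 0.5, 55             # arbitrary example values
mu = n * p                          # Binomial mean
sigma = math.sqrt(n * p * (1 - p))  # Binomial standard deviation

exact = binomial_cdf(k, n, p)
approx = normal_cdf(k + 0.5, mu, sigma)  # +0.5 is the continuity correction

print(f"exact Binomial:       {exact:.4f}")
print(f"Normal approximation: {approx:.4f}")
```

For these values the two results should agree to roughly two or three decimal places; the agreement generally gets worse when n is small or p is close to 0 or 1.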
