Continuous Random Variables I
• use a probability density function to solve problems involving probabilities, and to calculate the
mean and variance of a distribution. (Including location of the median or other percentiles of a
distribution by direct consideration of an area using the density function.)
Continuous random variables
• Unlike a discrete random variable, a continuous random variable does not take a list of
separate, specific values. Instead, it can take any value in an interval.
• The very nature of continuous quantities means that they cannot be measured precisely and, no matter how
hard we try, inaccuracy is also likely because our tools lack perfect calibration and we, as human beings, add
in a certain amount of unreliability.
• Many common natural phenomena, e.g. the duration of natural sleep or the speed of cars, produce continuous data
that can be modelled by a normal distribution. However, not all continuous data can be modelled this way,
e.g. the decay of a radioactive element, because the data are strongly skewed.
Ways to model a continuous random variable
Consider an example of patients in a clinic.
In a survey, 425 patients waited to see a doctor. 132 patients waited for 30 minutes
to 60 minutes… . This continuous data can be described in class intervals using a
histogram.
The same data could be collected in narrower class intervals and then
we could plot a new histogram.
We can now see that the shape of the histogram begins to approximate a curve.
• Using very narrow class intervals to draw a histogram, and using probability
density (frequency density/total frequency) for the heights of the bars, the graph
increasingly approximates a curve. The area under the curve equals 1, which is
the sum of all the probabilities.
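The effect of narrowing the class intervals can be checked with a short sketch. The waiting times below are hypothetical values chosen for illustration (they are not the clinic survey data), and the function names are assumptions of this example.

```python
# Hypothetical waiting times in minutes (illustrative only, not the survey data).
waiting_times = [5, 12, 18, 22, 25, 31, 34, 38, 41, 47, 52, 58, 63, 71, 85]

def density_histogram(data, edges):
    """Return bar heights: frequency density divided by total frequency."""
    total = len(data)
    heights = []
    for lo, hi in zip(edges, edges[1:]):
        # Each class is [lo, hi); the final class also includes its upper edge.
        is_last = hi == edges[-1]
        freq = sum(1 for x in data if lo <= x < hi or (is_last and x == hi))
        heights.append(freq / (hi - lo) / total)
    return heights

def total_area(heights, edges):
    """Sum of bar areas (height times class width)."""
    widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
    return sum(h * w for h, w in zip(heights, widths))

wide = [0, 30, 60, 90]
narrow = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
for edges in (wide, narrow):
    heights = density_histogram(waiting_times, edges)
    print(len(heights), "classes, total area =", round(total_area(heights, edges), 10))
```

Whatever class widths are chosen, the bar areas add up to 1, because each bar's area is the proportion of the data in that class.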
The Probability Density Function
• In the case of a continuous random variable, such a curve represents a
function, $f(x)$, called the probability density function (PDF). The area under a
PDF equals 1.
Note:
The word function should only be used when referring to a random variable. For
data, we should rather use curve and/or graph.
Properties of a PDF
• A PDF cannot be negative, since you cannot have a negative probability: $f(x) \ge 0$ for all $x$.
• In many situations, the data are defined across a specified interval or across specified
intervals, outside of which $f(x) = 0$.
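The two conditions for a PDF (never negative, total area equal to 1) can be checked numerically. The sketch below assumes a hypothetical PDF, $f(x) = \frac{3}{8}x^2$ for $0 \le x \le 2$ and $0$ elsewhere, which is not taken from these notes; the helper names are also assumptions.

```python
def f(x):
    # Hypothetical PDF (an assumption for illustration): 3x^2/8 on [0, 2], else 0.
    return 3 * x**2 / 8 if 0 <= x <= 2 else 0.0

def integrate(g, a, b, n=100_000):
    """Approximate the integral of g from a to b with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def looks_like_pdf(g, a, b, samples=1000, tol=1e-6):
    """Check g >= 0 at sample points on [a, b] and that its area is close to 1."""
    non_negative = all(g(a + (b - a) * i / samples) >= 0 for i in range(samples + 1))
    return non_negative and abs(integrate(g, a, b) - 1) < tol

print(looks_like_pdf(f, 0, 2))            # the hypothetical PDF passes both checks
print(looks_like_pdf(lambda x: x, 0, 2))  # area is 2, so this is not a PDF
```

Sampling can only suggest that a function is non-negative; in an exam the property is shown algebraically or from a sketch of the graph.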
Show that .
, [shown]
Probabilities of continuous random variables
• We can calculate probabilities for an interval, e.g. the probability that a taxi arrives within a given
time interval. But we cannot calculate the probability of an exact arrival time for the taxi; that is,
we cannot calculate the probability of an exact outcome.
Consider, for example, the time taken to run a marathon, recorded as 30 minutes and 37.26 seconds. This
is not an exact time: it has been rounded to the nearest hundredth of a second. The true value could be
37.256918… seconds, or any of infinitely many other values.
The recorded time therefore represents a range from 37.255 to 37.265 seconds.
• So for continuous random variables, we can only calculate the probability of a range of values.
Key facts
• With continuous random variables, each individual value has zero probability of occurring.
• Because the probability of any exact value is zero, when finding the probability in a given
interval it does not matter whether you use $<$ or $\le$: $P(a < X < b) = P(a \le X \le b)$.
Note that this does not imply that X cannot take the value $a$; it just means the probability of the exact
value is zero.
• Area under the graph represents probability. We can use integration to find
probability.
• The probability of X lying in the interval $a \le X \le b$ is given by the area under the graph
between a and b. That is:
$P(a \le X \le b) = \int_a^b f(x)\,dx$
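The rule $P(a \le X \le b) = \int_a^b f(x)\,dx$ can be tried out on a small example. This sketch assumes a hypothetical PDF, $f(x) = \frac{3}{8}x^2$ on $0 \le x \le 2$ (not one from these notes), whose integral from 0 to $x$ is $\frac{x^3}{8}$; the function names are assumptions.

```python
def cdf(x):
    """P(X <= x) for the hypothetical PDF f(x) = 3x^2/8 on [0, 2]."""
    if x <= 0:
        return 0.0
    if x >= 2:
        return 1.0
    return x**3 / 8  # integral of 3t^2/8 from 0 to x

def prob_between(a, b):
    """Area under the PDF between a and b."""
    return cdf(b) - cdf(a)

print(prob_between(1, 2))  # 0.875
print(prob_between(1, 1))  # 0.0, a single value has zero probability

# Shrinking the interval around x = 1 drives the probability towards zero.
for width in (0.1, 0.01, 0.001):
    print(width, prob_between(1, 1 + width))
```

The second line of output illustrates why $<$ and $\le$ give the same answer: the endpoint itself contributes no area.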
Example
The height reached by water erupting from a broken pipe, meters, is modelled by the following
PDF.
a) Show that .
b) Find the probability that the water reaches a height of at least 6.
Solution
c) (shown)
Median and other percentiles of CRV(s)
• The median of a random variable X is the value that splits the total probability exactly in half.
• So, for example, if we are required to find the value $m$ such that $P(X \le m) = P(X \ge m)$, then since the total probability equals 1, $P(X \le m) = \int_{-\infty}^{m} f(x)\,dx = 0.5$. Here
$m$ denotes the median, as the probability on each side equals exactly 0.5.
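When the area equation cannot be solved by hand, a percentile can be located numerically. The sketch below assumes a hypothetical PDF, $f(x) = \frac{3}{8}x^2$ on $0 \le x \le 2$ (not one from these notes), and uses bisection, a technique chosen here for illustration, to find the value at which the area to the left reaches the target.

```python
def cdf(x):
    """P(X <= x) for the hypothetical PDF f(x) = 3x^2/8 on [0, 2]."""
    if x <= 0:
        return 0.0
    if x >= 2:
        return 1.0
    return x**3 / 8

def percentile(k, lo=0.0, hi=2.0, tol=1e-10):
    """Find p with P(X <= p) = k/100 by repeatedly halving the interval [lo, hi]."""
    target = k / 100
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < target:
            lo = mid  # too little area to the left: move right
        else:
            hi = mid  # enough area to the left: move left
    return (lo + hi) / 2

median = percentile(50)
upper_quartile = percentile(75)
print(round(median, 4), round(upper_quartile, 4))
```

For this particular PDF the answers can be confirmed exactly: $\frac{m^3}{8} = 0.5$ gives $m = \sqrt[3]{4}$, and $\frac{p^3}{8} = 0.75$ gives $p = \sqrt[3]{6}$.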
Find the median and 75th percentile of the arrival time of the bus.
Solution:
Let be the median, , , ,
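• However, with CRVs, $P(X = x) = 0$ for all specific values of $x$, so we cannot use the discrete formula $E(X) = \sum x\,P(X = x)$ directly.
• If we want to find the probability of a very small interval, on the graph this would be a very small
strip, approximately a rectangle with width $\delta x$ and height $f(x)$.
So $P(x \le X \le x + \delta x) \approx f(x)\,\delta x$
Hence, $E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx$
And variance: $\operatorname{Var}(X) = \int_{-\infty}^{\infty} x^2 f(x)\,dx - [E(X)]^2$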
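The strip argument can be imitated directly: split the interval into many thin strips, treat each strip's probability as $f(x)\,\delta x$, and add up $x \cdot f(x)\,\delta x$ and $x^2 \cdot f(x)\,\delta x$. The sketch assumes a hypothetical PDF, $f(x) = \frac{3}{8}x^2$ on $0 \le x \le 2$ (not one from these notes); the function names are assumptions.

```python
def f(x):
    # Hypothetical PDF (an assumption for illustration): 3x^2/8 on [0, 2], else 0.
    return 3 * x**2 / 8 if 0 <= x <= 2 else 0.0

def strip_moments(pdf, a, b, strips=100_000):
    """Approximate E(X) and Var(X) by summing over thin strips of width dx."""
    dx = (b - a) / strips
    mean = 0.0
    mean_of_squares = 0.0
    for i in range(strips):
        x = a + (i + 0.5) * dx   # midpoint of the strip
        p = pdf(x) * dx          # approximate probability of the strip
        mean += x * p
        mean_of_squares += x * x * p
    return mean, mean_of_squares - mean**2

mean, variance = strip_moments(f, 0, 2)
print(round(mean, 6), round(variance, 6))
```

Integrating by hand for this PDF gives $E(X) = \frac{3}{2}$ and $\operatorname{Var}(X) = \frac{12}{5} - \frac{9}{4} = \frac{3}{20}$, which the strip sums approach as the strips get thinner.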
Example
The arrival time of an aircraft is modelled by the CRV , whose PDF is given by:
Find the mean and variance of the waiting time for people arriving at the airport.
Summary
• $f(x) \ge 0$ for all $x$
• Any percentile: the $k$th percentile $p$ satisfies $\int_{-\infty}^{p} f(x)\,dx = \frac{k}{100}$
Exam questions
The continuous random variable has the PDF
(shown)
Thanos models the length of time, in minutes, by which his ship is late on any day by the random variable with PDF
given by
a) Find the probability that the ship is more than 10 minutes late on each of two randomly chosen days. (4 marks)
b) Find . (6 marks)
c) The median of is denoted by .
Show that satisfies the equation , and hence find correct to 3 significant figures.
(4 marks)
d) State one way in which Thanos’ model may be unrealistic. (1 mark)
on two randomly chosen days
c) (shown)
d) This model does not allow times for ships which are early
The length, meters, of Mr. Stark’s armors of a certain type is modelled by the PDF
c) Find the probability that more than 3 of them have length less than 6m. (5 marks)
a) Since is a quadratic curve and it is symmetrical,
The random variable takes values in the range , where is a constant. The graph of the PDF of is
shown in the diagram.
a) Show that . (2 marks)
b) Find . (5 marks)
a) Equates area under the graph to 1.
(shown)