
Chapter 1, Probability

ECEG-6209: Mathematical Foundations in Engineering

PG Program
School of Electrical and Computer Engineering
1.1 Introduction
• Probability is the core mathematical tool for communication theory.
• The stochastic model is widely used in the study of communication systems.
• If we consider a radio communication system,
• the received message,
• interference,
• noise, and
• many more (delay, phase, fading, ...) are all random processes.

1.2 Random Experiments, Outcomes and Events
• A random experiment or random observation (briefly, an experiment or
observation) is a process that can be repeated a number of times, essentially under
the same conditions, and whose result cannot be predicted beforehand.
• A single performance of an experiment is called a trial.
• Each possible result is called an outcome of the trial.
• We call the set of all possible outcomes of an experiment the Sample Space S and
each outcome an element or a point of S.
• Example 1.1:
• In flipping a coin, the possible outcomes are H (head) or T (tail), so that
S = {H, T}
• In rolling a die,
S = {1, 2, 3, 4, 5, 6}

• In most practical problems we are not interested in the individual outcomes but in
whether an outcome belongs (or does not belong) to a certain set of outcomes.
• Such a set A is a subset of the sample space S and it is called an event.

1.3 Venn Diagram
• A sample space S and the events of an experiment can be represented graphically
by a Venn diagram.

• The complement of an event A with respect to S is the set of all elements
(outcomes) of S that are not in A.
• It is denoted by A′ or Ā.
• An event containing no element is called the impossible event or empty event
and is denoted by ∅.
• The intersection of two events A and B, denoted by A ∩ B, is the event containing
all elements that are common to A and B.

• Two events A and B are mutually exclusive or disjoint if A ∩ B = ∅.
• The union of the two events A and B, denoted by A ∪ B, is the event containing all
the elements that belong to A or B or both.
• The difference of A and B, denoted by A − B, is the set consisting of all elements of
A which do not belong to B.
• Example 1.3: A die is rolled; consider the events:
• A = Number greater than 3
• B = Number less than 6
• C = Even number
• Use a Venn diagram to represent and determine the events A ∩ B, C ∩ A, A ∩ B ∩ C,
A ∪ B, A′ ∩ C, C − B.

1.4 Counting Techniques
• The basic rules of counting can help us in solving a tremendous variety of
problems.
• The Sum Rule
• If a first task can be done in n1 ways and a second task in n2 ways, and if these tasks cannot be
done at the same time, then there are n1 + n2 ways to do either task.
• Example 1.4: A student can choose a computer project from one of three lists. The three lists
contain 10, 15 and 29 possible projects, respectively. How many possible projects are there to
choose from?
• The Product Rule
• If a task can be done in n1 ways, and if for each of these a second task can be done in n2 ways, then
the two tasks can be done together in n1*n2 ways.
• Example 1.5: How many sample points are in a sample space when a pair of dice is thrown
once?

• Tree Diagram
• Counting problems can be solved using tree diagrams.
• A tree consists of a root, a number of branches leaving the root, and possibly additional
branches leaving the endpoints of other branches.
• Example 1.8: An experiment consists of flipping a coin and then flipping it a second time if a
head occurs. If a tail occurs on the first flip, then a die is tossed once. Construct a tree diagram to
list the sample space.
• Permutations
• Frequently, we are interested in a sample space that contains as elements possible orders or
arrangements of a group of objects.
• The different arrangements are called permutations.
• Theorem 1. The number of permutations of n distinct objects is n!
• Exercise 1.4: List the possible permutations of the letters a, b and c.

• Theorem 2. The number of permutations of n distinct objects taken r at a time is
nPr = n!/(n − r)!
• Example 1.10: Two lottery tickets are drawn from 20 for first and second prizes. Find the number of
sample points in the space S.
• Theorem 3. The number of permutations of n distinct objects arranged in a circle is (n − 1)!
• Theorem 4. The number of distinct permutations of n objects of which n1 are of one kind, n2 of a
second kind,· · · , nk of a kth kind is
n!/(n1!n2! · · · nk!)
• Example 1.11: How many different ways can 3 red, 4 yellow, and 2 blue balls be arranged in a row?
• Combination
• In many problems we are interested in the number of ways of selecting r objects from n without
regard to order. These selections are called combinations. Combinations can occur without
repetitions or with repetitions.

• The number of combinations of n distinct objects taken r at a time, without
repetition, is
nCr = n!/(r!(n − r)!)
• The number of those combinations with repetitions is
(n + r − 1)Cr = (n + r − 1)!/(r!(n − 1)!)
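• The counting formulas above can be checked directly in Python (a minimal sketch, not part of the original notes; math.perm and math.comb require Python 3.8+):

```python
from math import comb, factorial, perm

# Example 1.10: two lottery tickets drawn from 20 for first and second prizes
print(perm(20, 2))  # 20P2 = 20!/18! = 380 sample points

# Example 1.11: arrangements of 3 red, 4 yellow and 2 blue balls in a row
print(factorial(9) // (factorial(3) * factorial(4) * factorial(2)))  # 1260

# Combinations of n = 5 objects taken r = 3 at a time, without and with repetition
n, r = 5, 3
print(comb(n, r))          # n!/(r!(n - r)!) = 10
print(comb(n + r - 1, r))  # (n + r - 1)!/(r!(n - 1)!) = 35
```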
1.5 Probability of an Event
• Probability of an event is a measure of the chance with which we can expect the
event to occur in a given random experiment.
• It is assigned a number between 0 (the event will not occur) and 1 (the event is sure to
occur).
• Classical Approach
• 1. A priori definition: If an event can occur in m different ways out of a total number of n possible
ways, all of which are equally likely, then the probability of the event is m/n.
• 2. Frequency or a posteriori definition: If, after n repetitions of an experiment, where n is very
large, an event is observed to occur in m of these, then the probability of the event is m/n.

• Axiomatic Approach
• Axiom 1. If A is any event in a sample space S, then 0 ≤ P(A) ≤ 1
where P(A) is the probability of the event A.
• Axiom 2. The entire sample space S has the probability P(S) = 1
• Axiom 3. If A1, A2, · · · are mutually exclusive events, then
P(A1 ∪ A2 ∪ · · · ) = P(A1) + P(A2) + · · ·
• From the above axioms, we can obtain a number of useful theorems.
• Theorem 6. If A1 ⊂ A2 then P(A1) ≤ P(A2) and P(A2 − A1) = P(A2) − P(A1)
• Theorem 7. P(∅) = 0
• Theorem 8. P(A’) = 1 − P(A)
• Theorem 9. If A and B are any two events then
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

• Theorem 10. If A = A1 ∪ A2 ∪ · · · ∪ An, where A1, A2, · · · , An are mutually exclusive
events, then
P(A) = P(A1) + P(A2) + · · · + P(An)
In particular, if A = S, then P(A1) + P(A2) + · · · + P(An) = 1
• Theorem 11. For any events A and B, P(A) = P(A ∩ B) + P(A ∩ B′)
• Theorem 12. If an event A must result in one of the mutually exclusive events A1,
A2, · · · , An, then
P(A) = P(A ∩ A1) + P(A ∩ A2) + · · · + P(A ∩ An)

1.6 Assignment of Probabilities
• If a sample space S consists only of elementary events A1, · · · , An, then
P(A1) + · · · + P(An) = 1
• If we assume equal probabilities for all simple events, then
P(Ak) = 1/n for k = 1, 2, · · · , n
• If A is any event made up of m such simple events we have
P(A) = m/n
• Example 1.12: A biased coin is flipped such that a head is twice as likely to appear as a tail.
Find P(H) and P(T).
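• Since P(H) = 2P(T) and P(H) + P(T) = 1, we get P(H) = 2/3 and P(T) = 1/3. A short simulation illustrating the frequency definition (a sketch, not part of the original notes):

```python
import random

# Simulate a coin with P(H) = 2/3 and compare relative frequencies with theory
random.seed(1)
n = 100_000
heads = sum(random.random() < 2 / 3 for _ in range(n))
print(f"P(H) ~ {heads / n:.4f}  (theory: {2/3:.4f})")
print(f"P(T) ~ {(n - heads) / n:.4f}  (theory: {1/3:.4f})")
```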

1.7 Conditional Probability
• Often it is required to find the probability of an event B under the condition that
event A occurs.
• This probability is called the conditional probability of B given A and is denoted by
P(B/A). It is defined as
P(B/A) = P(A ∩ B)/P(A), provided P(A) ≠ 0
• Theorem 13. If A and B are events in a sample space S, then


P(A ∩ B) = P(A)P(B/A) = P(B)P(A/B)
• Example 1.14: The probability that a regularly scheduled flight departs on time is
P(D) = 0.83, the probability that it arrives on time is P(A) = 0.82, and the
probability that it departs and arrives on time is P(D ∩ A) = 0.78. Find the
probability that a plane
• 1. arrives on time given that it departed on time.
• 2. departed on time given that it has arrived on time.
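• Both answers follow directly from Theorem 13; a quick check in Python (a sketch, not part of the original notes):

```python
# Example 1.14: P(B/A) = P(A ∩ B)/P(A)
p_D = 0.83   # departs on time
p_A = 0.82   # arrives on time
p_DA = 0.78  # departs and arrives on time

print(f"P(A/D) = {p_DA / p_D:.4f}")  # 1. arrives on time given it departed ≈ 0.9398
print(f"P(D/A) = {p_DA / p_A:.4f}")  # 2. departed on time given it arrived ≈ 0.9512
```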

• Bayes Theorem
• Theorem 14 (Total Probability). Let {A1, · · · , An} be a disjoint partition of the sample space S
and let P(Ai) ≠ 0. Then for any event A
P(A) = P(A1)P(A/A1) + P(A2)P(A/A2) + · · · + P(An)P(A/An)
• Theorem 15 (Bayes). Let {A1, · · · , An} be a partition of S and let P(Ai) ≠ 0. Then
P(Ai/A) = P(Ai)P(A/Ai) / [P(A1)P(A/A1) + · · · + P(An)P(A/An)]
• Example 1.15: Three boxes contain lamp bulbs, some of which are defective. The proportions of
defectives in boxes A1, A2 and A3 are 1/2, 1/8 and 3/4, respectively. A box is selected at random and
a bulb is drawn from it. If the selected bulb is found to be defective, what is the probability that
box A1 was selected?
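• Since the box is selected at random, each prior is 1/3, and Bayes theorem gives the answer; a minimal check in Python (a sketch, not part of the original notes):

```python
# Example 1.15: P(A1/D) by total probability and Bayes theorem
priors = [1/3, 1/3, 1/3]  # P(A1), P(A2), P(A3): box chosen at random
p_def = [1/2, 1/8, 3/4]   # P(D/Ai): proportion of defectives in each box

p_d = sum(p * d for p, d in zip(priors, p_def))       # P(D) = Σ P(Ai) P(D/Ai) = 11/24
print(f"P(A1/D) = {priors[0] * p_def[0] / p_d:.4f}")  # (1/6)/(11/24) = 4/11 ≈ 0.3636
```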

1.8 Random Variables
• The axiomatic definition of probability is difficult to manipulate mathematically.
• Definition 1. A random variable is a function that associates a real number with
each element in the sample space.
• A random variable is called discrete random variable if its set of possible outcomes is countable.
• And it is called a continuous random variable when the random variable takes on a continuous
scale.

1.9 Discrete Probability Distributions
• A discrete random variable assumes each of its values with a certain probability.
• Frequently, it is convenient to represent all the probabilities of a random variable X
by a formula. Such a formula would necessarily be a function of the numerical
values x that we shall denote by f(x), g(x), r(x), and so forth.
• Therefore, we write f(x) = P(X = x); for example, f(3) = P(X = 3).
• The set of ordered pairs (x, f(x)) is called the probability function or probability
distribution of the discrete random variable X.
• Definition 2. The set of ordered pairs (x, f(x)) is a probability function, probability
mass function, or probability distribution of the discrete random variable X if, for
each possible outcome x,
1. f(x) ≥ 0
2. Σ_x f(x) = 1
3. P(X = x) = f(x)
• Exercise 1.13: If 50% of automobiles sold by an agency for a certain foreign car are
equipped with diesel engines, find a formula for the probability distribution of the
number of diesel models among the next 4 cars sold by this agency.
• Answer: f(x) = P(X = x) = 4Cx/16 for x = 0, 1, 2, 3, 4.
• Definition 3. The cumulative distribution F(x) of a discrete random variable X with
probability distribution f(x) is given by
F(x) = P(X ≤ x) = Σ_{t ≤ x} f(t), for −∞ < x < ∞
• Example 1.19: Find the cumulative distribution for the random variable X in the
above example.
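• The pmf and CDF of this example are easy to tabulate in Python (a sketch, not part of the original notes):

```python
from math import comb

# Exercise 1.13 / Example 1.19: number of diesel models among the next 4 cars
def f(x: int) -> float:
    """pmf: f(x) = 4Cx / 16 for x = 0, 1, 2, 3, 4."""
    return comb(4, x) / 16

def F(x: int) -> float:
    """CDF: F(x) = sum of f(t) for all t <= x."""
    return sum(f(t) for t in range(x + 1))

for x in range(5):
    print(x, f(x), F(x))  # F(4) = 1.0, as required of a CDF
```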

1.10 Continuous Probability Distribution
• If X is a continuous random variable the probability that X takes on any particular
value is generally zero.
• Definition 4. The function f(x) is a probability density function for the continuous
random variable X, defined over the set of real numbers R, if
1. f(x) ≥ 0 for all x ∈ R
2. ∫_{−∞}^{∞} f(x) dx = 1
3. P(a < X < b) = ∫_{a}^{b} f(x) dx
• Example 1.20: Suppose that the error in the reaction temperature, in °C, for a
controlled laboratory experiment is a continuous random variable X having the
probability density function

• a. Find the constant c


• b. Find P (0 ≤ X ≤ 1)
• Definition 5. The cumulative distribution F(x) of a continuous random variable X
with density function f(x) is given by
F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt, for −∞ < x < ∞
1.11 Joint Probability Distribution
• If X and Y are two discrete random variables, the probability distribution for their
simultaneous occurrence can be represented by a function with values f(x, y).
• This function is referred to as the joint probability distribution of X and Y.
• Definition 6. The function f(x, y) is a joint probability distribution or probability
mass function of the discrete random variables X and Y if
1. f(x, y) ≥ 0 for all (x, y)
2. Σ_x Σ_y f(x, y) = 1
3. P(X = x, Y = y) = f(x, y)
• Definition 7. The function f(x, y) is a joint density function of the continuous
random variables X and Y if
1. f(x, y) ≥ 0 for all (x, y)
2. ∫∫ f(x, y) dx dy = 1, integrating over the whole xy-plane
3. P[(X, Y) ∈ A] = ∫∫_A f(x, y) dx dy for any region A in the xy-plane
• Example: Given the following joint pdf for two random variables X and Y
• Compute C and find P((x, y) : 0 < x < 0.5, 0.25 < y < 0.5)

• The joint CDF for the two random variables X1 and X2 is defined as
F(x1, x2) = P(X1 ≤ x1, X2 ≤ x2) = ∫_{−∞}^{x1} ∫_{−∞}^{x2} f(u, v) dv du
• The joint pdf may be expressed from the joint CDF as
f(x1, x2) = ∂²F(x1, x2)/(∂x1 ∂x2)
• Definition 8. The marginal distributions of X alone and of Y alone are given by
g(x) = Σ_y f(x, y) and h(y) = Σ_x f(x, y) in the discrete case, and
g(x) = ∫_{−∞}^{∞} f(x, y) dy and h(y) = ∫_{−∞}^{∞} f(x, y) dx in the continuous case
• Definition 9. The conditional distribution of the random variable Y, given X = x, is
given by
f(y/x) = f(x, y)/g(x), where g(x) > 0
• If one wishes to find the probability that the discrete random variable X falls
between a and b when it is known that the discrete variable Y = y, we evaluate
P(a < X < b / Y = y) = Σ_{a<x<b} f(x/y)
• Exercise 1.15: Given the joint density function

• Find g(x), h(y), and P(0.25 < X < 0.5 / Y = 1/3)

1.12 Independent Random Variables
• If f(x/y) does not depend on y, then the outcome of the random variable Y has no
impact on the outcome of the random variable X.
• Definition 10. The random variables X and Y are said to be independent random
variables if and only if
f(x, y) = g(x)h(y) for all (x, y) within their range

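• The definition can be checked mechanically for a discrete pair; in the sketch below the joint table is hypothetical (not from the notes), chosen so that X and Y come out independent:

```python
# Hypothetical joint pmf of (X, Y)
joint = {(0, 0): 0.12, (0, 1): 0.28,
         (1, 0): 0.18, (1, 1): 0.42}

xs = {x for x, _ in joint}
ys = {y for _, y in joint}
g = {x: sum(joint[x, y] for y in ys) for x in xs}  # marginal of X
h = {y: sum(joint[x, y] for x in xs) for y in ys}  # marginal of Y

# Independent iff f(x, y) = g(x) h(y) at every point of the range
print(all(abs(joint[x, y] - g[x] * h[y]) < 1e-12 for x in xs for y in ys))  # True
```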
1.13 Functions of Random Variables
• Given a random variable X, which is characterized by its pdf fX(x), determine the
pdf of the random variable Y = g(X), where g(X) is some given function of X.
• When the mapping g from X to Y is one-to-one, the determination of fY (y) is
relatively straightforward.
• However, when the mapping is not one-to-one, as is the case, for example, when Y =
X2, we must be very careful in our derivation of fY (y).
• Let X be a continuous r.v. with pdf fX(x). If the transformation y = g(x) is one-to-one
and has the inverse transformation
x = g⁻¹(y)
• then the pdf of Y is given by
fY(y) = fX(g⁻¹(y)) |d g⁻¹(y)/dy|
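• A Monte Carlo check of the one-to-one formula (a sketch, not part of the original notes; the map y = 2x + 3 with X standard normal is an assumed illustration):

```python
import math
import random

# For y = g(x) = 2x + 3: g^{-1}(y) = (y - 3)/2 and |d g^{-1}/dy| = 1/2
def f_X(x: float) -> float:
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)  # standard normal pdf

def f_Y(y: float) -> float:
    return f_X((y - 3) / 2) * 0.5

random.seed(0)
samples = [2 * random.gauss(0, 1) + 3 for _ in range(200_000)]

# Histogram estimate of the density of Y near y0 versus the formula
y0, h = 4.0, 0.1
est = sum(y0 - h < s < y0 + h for s in samples) / (len(samples) * 2 * h)
print(f"formula: {f_Y(y0):.4f}, Monte Carlo: {est:.4f}")  # both ≈ 0.176
```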
• If the transformation y = g(x) is not one-to-one, fY(y) is obtained as follows:
• Denoting the real roots of y = g(x) by xk, that is, y = g(x1) = · · · = g(xk) = · · · , then
fY(y) = Σ_k fX(xk) / |g′(xk)|
• For the case of a function of multiple random variables, let
Yi = gi(X1, · · · , Xn), i = 1, 2, · · · , n
• be a one-to-one transformation with inverse xi = gi⁻¹(y1, · · · , yn); then
fY1···Yn(y1, · · · , yn) = fX1···Xn(x1, · · · , xn) / |J(x1, · · · , xn)|
• where J(x1, · · · , xn) is the Jacobian of the transformation, the determinant of the
matrix of partial derivatives ∂yi/∂xj.
1.14 Statistical Averages of Random Variables
• Averages (or expectations) play an important role in the characterization of the
outcomes of experiments and the random variables defined on the sample space of
the experiments.
• Mathematical Expectation
• The mathematical expectation or mean of a random variable X gives a single value which acts as
a representative or average value of X.
• It is a measure of central tendency.
• Definition 11. Let X be a random variable with probability distribution f(x). The expected
value or mean of X is
µ = E[X] = Σ_x x f(x) if X is discrete, and
µ = E[X] = ∫_{−∞}^{∞} x f(x) dx if X is continuous
• Example 1.28: Let X be the random variable that denotes the life in hours of a certain
electronic device. The probability distribution function is given by

• Find the expected life of this type of device.

• Theorem 16. Let X be a random variable with probability distribution f(x). The mean or
expected value of the random variable g(X) is
µ_g(X) = E[g(X)] = Σ_x g(x) f(x) in the discrete case, and
µ_g(X) = E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx in the continuous case
• Variance and Standard Deviation
• Definition 13. Let X be a random variable with probability distribution f(x) and mean µ. The
variance of X is
σ² = E[(X − µ)²] = Σ_x (x − µ)² f(x) if X is discrete, and
σ² = E[(X − µ)²] = ∫_{−∞}^{∞} (x − µ)² f(x) dx if X is continuous
• The positive square root of the variance, σ, is called the standard deviation of X.
• The variance (or the standard deviation) is a measure of the dispersion or scatter of the values
of the random variable about the mean µ.
• If the values tend to be concentrated near the mean, the variance is small; while if the values
tend to be distributed far from the mean, the variance is large.

• Some theorems on variance:
• σ² = E[X²] − µ²
• Var(aX + b) = a² Var(X) for any constants a and b
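• The shortcut form σ² = E[X²] − µ² is easy to verify numerically (a sketch, not part of the original notes; the pmf reused here is the diesel-cars distribution from Exercise 1.13):

```python
from math import comb

pmf = {x: comb(4, x) / 16 for x in range(5)}  # f(x) = 4Cx/16

mu = sum(x * p for x, p in pmf.items())                   # µ = E[X]
var_def = sum((x - mu) ** 2 * p for x, p in pmf.items())  # E[(X − µ)²]
var_alt = sum(x * x * p for x, p in pmf.items()) - mu**2  # E[X²] − µ²

print(mu, var_def, var_alt)  # 2.0 1.0 1.0 — both variance forms agree
```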
• Covariance
• The concept of variance given for one variable can be extended to two or more variables. Thus,
for example, if X and Y are two continuous random variables having joint density f(x, y), then
the means and variances are
µX = ∫∫ x f(x, y) dx dy, µY = ∫∫ y f(x, y) dx dy
σX² = ∫∫ (x − µX)² f(x, y) dx dy, σY² = ∫∫ (y − µY)² f(x, y) dx dy
• Definition 14. Let X and Y be random variables with joint probability distribution f(x, y). The
covariance of X and Y is
σXY = E[(X − µX)(Y − µY)]
• Some theorems on covariance:
• σXY = E[XY] − µX µY
• If X and Y are independent random variables, then σXY = 0
• Var(aX + bY) = a² Var(X) + b² Var(Y) + 2ab σXY
1.15 Some Special Probability Distributions
• We will consider special discrete and continuous distributions that are particularly
important in probability theory and statistics.
• In fact, one needs only a handful of important probability distributions to describe
many of the random variables encountered in practice.

1.15.1 Binomial Distribution
• The binomial distribution is obtained if we are interested in the number of times an
event A occurs in n independent performances of an experiment, assuming that
• the event A (called a success) has probability P(A) = p in a single trial, and
• q = 1 − p is the probability that the event does not occur in a single trial (called a failure).
• In a binomial distribution, each trial is called a Bernoulli trial and the experiment a
Bernoulli process. The Bernoulli process must possess the following
properties:
• 1. The experiment consists of n repeated trials
• 2. Each trial results in an outcome that may be classified as a success or a failure
• 3. The probability of success, p, remains constant from trial to trial
• 4. The repeated trials are independent.

• Definition 15. The probability that the event will happen exactly x times in n trials
(i.e., x successes and n − x failures) is given by the probability function
f(x) = b(x; n, p) = nCx p^x q^(n−x), x = 0, 1, 2, · · · , n
• Example 1.33: The probability that a patient recovers from a rare blood disease is
0.4. If 15 people are known to have contracted this disease, what is the probability
that
• 1. at least 10 survive
• 2. from 3 to 8 survive
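• Both parts reduce to sums of binomial probabilities; a direct computation in Python (a sketch, not part of the original notes):

```python
from math import comb

n, p = 15, 0.4  # Example 1.33

def b(x: int) -> float:
    """Binomial pmf: b(x; n, p) = nCx p^x (1 − p)^(n − x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

print(sum(b(x) for x in range(10, n + 1)))  # 1. P(X >= 10) ≈ 0.0338
print(sum(b(x) for x in range(3, 9)))       # 2. P(3 <= X <= 8) ≈ 0.8779
```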

1.15.2 Multinomial Distribution
• Suppose that events A1, A2, · · · , Ak can occur with probabilities p1, p2, · · · , pk where
p1 + p2 + · · · + pk = 1. If X1, X2, · · · , Xk are the random variables respectively giving
the number of times A1, A2, · · · , Ak occur in a total of n independent trials, so that
x1 + x2 + · · · + xk = n, then
P(X1 = x1, X2 = x2, · · · , Xk = xk) = [n!/(x1! x2! · · · xk!)] p1^x1 p2^x2 · · · pk^xk
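• The multinomial probability is a direct product of a counting factor and the trial probabilities; a small sketch (the fair-die example below is hypothetical, not from the notes):

```python
from math import factorial

def multinomial(n: int, xs, ps) -> float:
    """P(X1 = x1, ..., Xk = xk) = n!/(x1!...xk!) * p1^x1 * ... * pk^xk."""
    coef = factorial(n)
    for x in xs:
        coef //= factorial(x)
    prob = float(coef)
    for x, p in zip(xs, ps):
        prob *= p**x
    return prob

# In 6 rolls of a fair die: probability of exactly two 1s, three 2s and one 3
print(multinomial(6, [2, 3, 1, 0, 0, 0], [1/6] * 6))  # 60/6^6 ≈ 0.00129
```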
1.15.3 Hypergeometric Distribution
• The hypergeometric distribution describes the probability of x successes in n draws,
made without replacement, from a finite population of N items of which k are
classified as successes.
• The probability distribution of the hypergeometric random variable X is
h(x; N, n, k) = kCx · (N−k)C(n−x) / NCn
1.15.4 Poisson Distribution
• Experiments yielding numerical values of a random variable X which are the
number of outcomes occurring during a given time interval or in a specific region,
are called Poisson experiments.
• For example, X may represent the number of telephone calls per hour received by
an office, or the number of games postponed due to rain during a rainy season.
• More examples of events that may be modeled as a Poisson distribution include:
• The number of cars that pass through a certain point on a road (sufficiently distant from traffic
lights) during a given period of time.
• The number of spelling mistakes one makes while typing a single page
• The number of phone calls at a call center per minute.
• The number of times a web server is accessed per minute.
• The number of roadkill (animals killed) found per unit length of road.
• The number of mutations in a given stretch of DNA after a certain amount of radiation.

• A Poisson process possesses the following properties:
• 1. The number of outcomes occurring in one time interval or specified region is independent of
the number that occur in any other disjoint time interval or region of space.
• In this way, the Poisson process is said to possess no memory.
• 2. The probability that a single outcome will occur during a very short time interval or in a small
region is proportional to the length of the time interval or the size of the region and does not
depend on the number of outcomes occurring outside this time interval or region.
• 3. The probability that more than one outcome will occur in such a short time interval or fall in
such a small region is negligible.
• Definition 16. The probability distribution of the Poisson random variable X,
representing the number of outcomes occurring in a given time interval or specific
region denoted by t, is given by
p(x; λt) = e^(−λt) (λt)^x / x!, x = 0, 1, 2, · · ·
• where λ is the average number of outcomes per unit time or region.
• Theorem 19. The mean and variance of the Poisson distribution both have the
value λt, i.e.,
µ = σ² = λt
• Example 1.37: The average number of radioactive particles passing through a
counter during 1 millisecond in a laboratory experiment is 4. What is the
probability that 6 particles enter the counter in a given millisecond?
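• With λt = 4 the answer is e^(−4) 4⁶/6!; computed in Python (a sketch, not part of the original notes):

```python
from math import exp, factorial

def poisson(x: int, lam_t: float) -> float:
    """Poisson pmf: p(x; λt) = e^(−λt) (λt)^x / x!"""
    return exp(-lam_t) * lam_t**x / factorial(x)

# Example 1.37: λt = 4 particles per millisecond on average
print(f"P(X = 6) = {poisson(6, 4):.4f}")  # ≈ 0.1042
```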

1.15.5 The Normal Distribution
• The normal (or Gaussian) distribution is the most important continuous
probability distribution in the entire field of probability and statistics.
• The normal distribution approximately describes many phenomena that occur in
nature, industry, and research.
• Physical measurements in areas such as meteorological experiments, rainfall
studies, and measurements on manufactured parts are often more than adequately
described with a normal distribution.
• Definition 17. The probability density function of the normal random variable X,
with mean µ and variance σ², is
n(x; µ, σ) = (1/(σ√(2π))) e^(−(x − µ)²/(2σ²)), −∞ < x < ∞
• The normal distribution has the cumulative distribution function
F(x) = P(X ≤ x) = ∫_{−∞}^{x} n(t; µ, σ) dt
• The probability between a and b is then given as
P(a < X < b) = F(b) − F(a) = ∫_{a}^{b} n(t; µ, σ) dt
• The above integral cannot be evaluated by elementary methods, but it can be
represented in terms of the integral
Φ(z) = (1/√(2π)) ∫_{−∞}^{z} e^(−u²/2) du
• which is the shaded area under the standard normal curve in Figure 1.2.

• Φ(z) is the distribution function of the standard normal distribution, i.e., the
normal distribution with mean 0 and variance 1.
• It has been extensively tabulated.
• F(x) can be expressed in terms of Φ(z) by setting
z = (x − µ)/σ
• Then
F(x) = Φ((x − µ)/σ)
• Example 1.39: Given a normal distribution with µ = 50 and σ = 10, find the
probability that X assumes a value between 45 and 62.
• Example 1.40: Let X be standard normal. Determine c such that
• 1. P(X ≥ c) = 10%
• 2. P(X ≤ c) = 5%
• 3. P(0 ≤ X ≤ c) = 45%
• 4. P(−c ≤ X ≤ c) = 99%
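• Both examples can be checked with the standard-library NormalDist class (Python 3.8+; a sketch, not part of the original notes):

```python
from statistics import NormalDist

# Example 1.39: µ = 50, σ = 10
X = NormalDist(mu=50, sigma=10)
print(X.cdf(62) - X.cdf(45))  # P(45 < X < 62) ≈ 0.5764

# Example 1.40: invert Φ for the standard normal
Z = NormalDist()              # mean 0, variance 1
print(Z.inv_cdf(0.90))        # 1. P(X >= c) = 10%       -> c ≈ 1.2816
print(Z.inv_cdf(0.05))        # 2. P(X <= c) = 5%        -> c ≈ -1.6449
print(Z.inv_cdf(0.95))        # 3. P(0 <= X <= c) = 45%  -> Φ(c) = 0.95, c ≈ 1.6449
print(Z.inv_cdf(0.995))       # 4. P(-c <= X <= c) = 99% -> Φ(c) = 0.995, c ≈ 2.5758
```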

1.15.6 Uniform Distribution
• A r.v. X is called a uniform random variable over (a, b) if its pdf is given by
f(x) = 1/(b − a) for a < x < b, and f(x) = 0 otherwise
• The corresponding cdf of X is
F(x) = 0 for x ≤ a, F(x) = (x − a)/(b − a) for a < x < b, and F(x) = 1 for x ≥ b
1.15.7 Rayleigh distribution
• The Rayleigh distribution, named for John William Strutt, Lord Rayleigh, is the
distribution of the magnitude of a two-dimensional random vector whose
coordinates are independent, identically distributed, mean-0, variance-σ² normal
variables.
• Suppose that Y1 and Y2 are independent random variables with standard normal
distributions. The magnitude R = √(Y1² + Y2²) of the vector (Y1, Y2) has the standard
Rayleigh distribution.
• The Rayleigh distribution is frequently used to model the statistics of signals
transmitted through radio channels such as cellular radio.
• Rayleigh fading is the specialized model for stochastic fading when there is no line
of sight signal, and is sometimes considered as a special case of the more
generalized concept of Rician fading.

• The probability density function of the Rayleigh distribution is
f(r) = (r/σ²) e^(−r²/(2σ²)), r ≥ 0
• where σ is the scale parameter of the distribution.
• The cumulative distribution function is
F(r) = 1 − e^(−r²/(2σ²)), r ≥ 0
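• The defining property (magnitude of two i.i.d. zero-mean normals) is easy to verify by simulation (a sketch, not part of the original notes):

```python
import math
import random

random.seed(0)
sigma, n = 1.0, 200_000
r = [math.hypot(random.gauss(0, sigma), random.gauss(0, sigma)) for _ in range(n)]

# Empirical CDF at r0 versus the closed form F(r0) = 1 − exp(−r0²/(2σ²))
r0 = 1.5
empirical = sum(v <= r0 for v in r) / n
theory = 1 - math.exp(-r0**2 / (2 * sigma**2))
print(f"empirical: {empirical:.4f}, theory: {theory:.4f}")  # both ≈ 0.675
```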
1.15.8 Rice distribution
• The Rice distribution is the probability distribution of the magnitude of a circular
bivariate normal random variable with potentially non-zero mean.
• R ∼ Rice(s, σ) has a Rice distribution if R = √(X1² + X2²), where X1 ∼ N(s cos θ, σ²)
and X2 ∼ N(s sin θ, σ²) are statistically independent normal random variables and
θ is any real number.
• The probability density function is
f(x) = (x/σ²) exp(−(x² + s²)/(2σ²)) I₀(xs/σ²), x ≥ 0
• where I₀ is the modified Bessel function of the first kind with order zero.
• In wireless communication, Rician fading or Ricean fading is a stochastic model for
radio propagation anomaly caused by partial cancellation of a radio signal by itself.

1.15.9 Nakagami distribution
• The Nakagami distribution or the Nakagami-m distribution is a probability
distribution related to the gamma distribution.
• It has two parameters: a shape parameter m and a second parameter controlling
spread, Ω.
• Its probability density function is
f(x; m, Ω) = [2m^m / (Γ(m) Ω^m)] x^(2m−1) e^(−m x²/Ω), x ≥ 0
• where Γ(·) is the gamma function.
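• The density is straightforward to evaluate with the standard library; the sketch below (not part of the original notes; m = 2, Ω = 1 are assumed values) also checks numerically that it integrates to 1:

```python
import math

def nakagami_pdf(x: float, m: float = 2.0, omega: float = 1.0) -> float:
    """f(x; m, Ω) = [2 m^m / (Γ(m) Ω^m)] x^(2m−1) exp(−m x²/Ω) for x ≥ 0."""
    return (2 * m**m) / (math.gamma(m) * omega**m) \
        * x ** (2 * m - 1) * math.exp(-m * x * x / omega)

# Crude Riemann sum over [0, 5]; the tail beyond 5 is negligible for m = 2, Ω = 1
dx = 0.001
total = sum(nakagami_pdf(i * dx) for i in range(1, 5001)) * dx
print(f"integral ≈ {total:.4f}")  # ≈ 1.0
```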
