COMM5005 Lecture 6


Quantitative Methods for Business
COMM5005
Lecture 6
Statistics Flow Chart
[Figure: flow chart of the statistics topics covered in Lectures 5 to 8. Descriptive statistics: moments (first moment: mean; second moment: variance; third moment: skewness). Probability: distributions of variables (bi-variate, binomial, uniform and normal distributions) and sampling distributions. Inferential statistics: confidence interval estimation, hypothesis testing, simple linear regression and multiple linear regression.]


Lecture 6 topics

In this lecture we will cover:


• Some important discrete probability distributions
• The normal distribution and other continuous
distributions
• Sampling distributions
Objectives

• Recognise and apply the properties of a probability distribution
• Calculate the average return and measure the risk associated with various investment proposals
• Apply the binomial distribution
• Calculate probabilities from some continuous distributions
• Interpret the concept of the sampling distribution
• Recognise the importance of the Central Limit Theorem
Readings

Sections of Berenson, M. et al. 5th ed. will help you to understand this week’s topics more clearly.

Chapter    Name                                                          Pages
5.1–5.4    Some important discrete probability distributions             180–199
6.1–6.5    The normal distribution and other continuous distributions    212–237
7.1–7.2    Sampling distributions                                        248–258
1. Discrete probability distributions

• A probability distribution for a discrete random variable is a mutually exclusive list of all possible numerical outcomes of the random variable with the probability of occurrence associated with each outcome.
[Table: probability distribution of the number of home mortgages approved per week]
Expected Value of a Discrete Random Variable

• The expected value of a discrete random variable is a measure of central tendency: the mean of the discrete random variable.

$$\mu = E(X) = \sum_{i=1}^{N} X_i \, P(X_i)$$

• Toss 2 coins, X = number of heads
• Calculate the expected value of X:

E(X) = (0 × 0.25) + (1 × 0.50) + (2 × 0.25) = 1.0
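As a minimal sketch, the expected-value formula can be checked in Python for the two-coin example. The function name `expected_value` and the dictionary layout are illustrative assumptions, not part of the lecture.

```python
def expected_value(distribution):
    """Return E(X) = sum of x * P(x) for a discrete distribution.

    `distribution` maps each outcome x to its probability P(x).
    """
    return sum(x * p for x, p in distribution.items())


# Two fair coins: X = number of heads (probabilities from the slide)
two_coins = {0: 0.25, 1: 0.50, 2: 0.25}
mean_heads = expected_value(two_coins)
print(mean_heads)  # 1.0
```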
Variance and Standard Deviation of a Discrete Random Variable
Variance of a discrete random variable – definition formula

$$\sigma^2 = \sum_{i=1}^{N} [X_i - E(X)]^2 \, P(X_i)$$

Variance of a discrete random variable – calculation formula

$$\sigma^2 = \sum_{i=1}^{N} X_i^2 \, P(X_i) - [E(X)]^2$$

where E(X) = expected value of the discrete random variable X
$X_i$ = the ith outcome of the discrete random variable X
$P(X_i)$ = probability of the ith occurrence of X

Standard deviation of a discrete random variable: $\sigma = \sqrt{\sigma^2}$



Variance and Standard Deviation of a Discrete
Random Variable
Example: Toss two coins, X = number of heads; calculate the variance and standard deviation.
• Recall from slide 6 that E(X) = 1

$$\sigma^2 = \sum_{i=1}^{N} X_i^2 \, P(X_i) - [E(X)]^2$$

$$\sigma^2 = (0^2 \times 0.25 + 1^2 \times 0.5 + 2^2 \times 0.25) - 1^2 = 0.5$$

Standard deviation: $\sigma = \sqrt{\sigma^2} = \sqrt{0.5} = 0.707$
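The calculation formula for the variance can be sketched in Python for the same two-coin example. The helper name `variance` is an illustrative assumption; the probabilities are those given on the slide.

```python
import math


def variance(distribution):
    """Variance via the calculation formula: sum of x^2 * P(x) minus E(X)^2."""
    mean = sum(x * p for x, p in distribution.items())
    mean_of_squares = sum(x ** 2 * p for x, p in distribution.items())
    return mean_of_squares - mean ** 2


# Two fair coins: X = number of heads
two_coins = {0: 0.25, 1: 0.50, 2: 0.25}
var_heads = variance(two_coins)
sd_heads = math.sqrt(var_heads)
print(var_heads, round(sd_heads, 3))  # 0.5 0.707
```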


Covariance
• The covariance measures the direction of a linear relationship
between two variables.
• Covariance – definition formula:

$$\sigma_{XY} = \sum_{i=1}^{N} [X_i - E(X)][Y_i - E(Y)] \, P(X_i, Y_i)$$

• Covariance – calculation formula:

$$\sigma_{XY} = \sum_{i=1}^{N} X_i Y_i \, P(X_i, Y_i) - E(X)E(Y)$$

where $X_i$, $Y_i$ = the ith outcomes of the discrete random variables X and Y, respectively
$P(X_i, Y_i)$ = probability of the ith joint occurrence of X and Y
Computing the Mean for Investment Returns
• Return per $1,000 for two types of investments

                                     Investment
P(Xi, Yi)    Economic condition      Passive Fund X    Aggressive Fund Y
0.2          Recession               −$25              −$200
0.5          Stable economy          +$50              +$60
0.3          Expanding economy       +$100             +$350

• E(X) = µX = (−25)(0.2) + (50)(0.5) + (100)(0.3) = 50
• E(Y) = µY = (−200)(0.2) + (60)(0.5) + (350)(0.3) = 95
Computing the Standard Deviation for
Investment Returns
Using the returns table above:

$$\sigma_X = \sqrt{(-25)^2(0.2) + (50)^2(0.5) + (100)^2(0.3) - (50)^2} = 43.30$$

$$\sigma_Y = \sqrt{(-200)^2(0.2) + (60)^2(0.5) + (350)^2(0.3) - (95)^2} = 193.71$$



Computing the Covariance for Investment
Returns
Using the returns table above:

$$\sigma_{XY} = \sum_{i=1}^{N} X_i Y_i \, P(X_i, Y_i) - E(X)E(Y)$$

$$\sigma_{XY} = [(-25)(-200)(0.2) + (50)(60)(0.5) + (100)(350)(0.3)] - (50)(95) = 8250$$
Interpreting the Results for Mean Returns

• The aggressive fund has a higher expected return (mean), but much more risk.
µY = 95 > µX = 50
σY = 193.71 > σX = 43.30
σXY = 8250

• The covariance of 8,250 indicates that the two investments are positively related and will vary in the same direction.
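The mean, standard deviation and covariance for the two funds can be reproduced with a short Python sketch. The variable names and the tuple layout are illustrative assumptions; the probabilities and returns are those in the investment table.

```python
import math

# (probability, return on Passive Fund X, return on Aggressive Fund Y)
scenarios = [
    (0.2, -25, -200),   # recession
    (0.5, 50, 60),      # stable economy
    (0.3, 100, 350),    # expanding economy
]

mean_x = sum(p * x for p, x, _ in scenarios)
mean_y = sum(p * y for p, _, y in scenarios)

# Calculation formulas: E(X^2) - E(X)^2 and E(XY) - E(X)E(Y)
var_x = sum(p * x ** 2 for p, x, _ in scenarios) - mean_x ** 2
var_y = sum(p * y ** 2 for p, _, y in scenarios) - mean_y ** 2
cov_xy = sum(p * x * y for p, x, y in scenarios) - mean_x * mean_y

sd_x = math.sqrt(var_x)
sd_y = math.sqrt(var_y)

print(round(mean_x, 2), round(mean_y, 2))  # means: about 50 and 95
print(round(sd_x, 2), round(sd_y, 2))      # risks: about 43.3 and 193.71
print(round(cov_xy, 2))                    # covariance: about 8250
```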
Expected Value, Variance and Standard
Deviation of the Sum of Two Random Variables

• Expected value of the sum of two random variables
  • Measure of central tendency; mean of the sum of two random variables

$$E(X + Y) = E(X) + E(Y)$$

• Variance of the sum of two random variables
  • Measure of variation; directly related to the standard deviation

$$\mathrm{Var}(X + Y) = \sigma^2_{X+Y} = \sigma^2_X + \sigma^2_Y + 2\sigma_{XY}$$

• Standard deviation of the sum of two random variables
  • Measure of variation; directly related to the variance

$$\sigma_{X+Y} = \sqrt{\sigma^2_{X+Y}}$$
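As a worked illustration (not on the original slides), these formulas can be applied to the two funds from the investment example, where E(X) = 50, E(Y) = 95 and the covariance is 8250. The variances are the squares of the standard deviations, 43.30² ≈ 1875 and 193.71² ≈ 37525:

$$E(X + Y) = 50 + 95 = 145$$

$$\sigma^2_{X+Y} = 1875 + 37525 + 2(8250) = 55900$$

$$\sigma_{X+Y} = \sqrt{55900} \approx 236.43$$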
Portfolio Expected Return and Portfolio Risk

• Portfolio: a combined investment in two or more assets
• Portfolio expected return: measure of central tendency; mean return on investment

$$E(P) = wE(X) + (1 - w)E(Y)$$

• Portfolio risk: measure of the variation of investment returns

$$\sigma_P = \sqrt{w^2\sigma^2_X + (1 - w)^2\sigma^2_Y + 2w(1 - w)\sigma_{XY}}$$

where E(P) = portfolio expected return
w = portion of portfolio value in asset X
(1 − w) = portion of portfolio value in asset Y
Example

• Suppose a portfolio consists of two assets, X and Y. The share of X in the portfolio is 70% and the share of Y is 30%.
• The expected return of X is 15 thousand dollars and its risk (standard deviation) is 5 thousand dollars; the expected return of Y is 20 thousand dollars and its risk (standard deviation) is 50 thousand dollars. The covariance of the two assets is −100.
• What are the expected return and risk of the portfolio?
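One way to work this example is to plug the numbers into the portfolio return and risk formulas. The sketch below does so in Python; the function names are illustrative assumptions, and it assumes the covariance of −100 is expressed in squared thousands of dollars so that the units match the standard deviations.

```python
import math


def portfolio_return(w, mean_x, mean_y):
    """E(P) = w E(X) + (1 - w) E(Y)."""
    return w * mean_x + (1 - w) * mean_y


def portfolio_risk(w, sd_x, sd_y, cov_xy):
    """Portfolio standard deviation from the two-asset risk formula."""
    variance = (w ** 2 * sd_x ** 2
                + (1 - w) ** 2 * sd_y ** 2
                + 2 * w * (1 - w) * cov_xy)
    return math.sqrt(variance)


# Values from the example, in thousands of dollars (w = 0.7 is the share of X)
expected = portfolio_return(0.7, 15, 20)
risk = portfolio_risk(0.7, 5, 50, -100)
print(round(expected, 2), round(risk, 2))  # 16.5 13.97
```

Under these assumptions the portfolio has an expected return of about 16.5 thousand dollars and a risk of about 13.97 thousand dollars, well below the risk of asset Y alone because most of the weight is on the low-risk asset and the covariance is negative.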
Binomial Distribution
Binomial distribution: a discrete probability distribution in which the random variable is the number of successes in a sample of n observations, drawn either from an infinite population or from a finite population with replacement.

Four essential properties of the binomial distribution:


1. A fixed number of observations, or trials, n
2. Two mutually exclusive and collectively exhaustive categories
3. Constant probability for each observation
4. Observations are independent
Rule of Combinations
The number of combinations of selecting X objects out of n objects is:

$$\binom{n}{X} = {}_nC_X = \frac{n!}{X!(n - X)!}$$

where
n! =(n)(n − 1)(n − 2) . . . (2)(1)
X! = (X)(X − 1)(X − 2) . . . (2)(1)
0! = 1 (by definition)
The Binomial Distribution Formula

$$P(X) = \frac{n!}{X!(n - X)!} \, p^X (1 - p)^{n - X}$$

where P(X) = probability of X successes in n trials, with probability of success p on each trial
X = number of ‘successes’ in sample (X = 0, 1, 2, ..., n)
n = sample size (number of trials or observations)
p = probability of ‘success’
1−p = probability of failure
Example
Suppose the conversion rate of online hotel booking enquiries to actual bookings is 10%. What is the probability of 3 bookings out of 4 online enquiries?

Computation in Microsoft Excel:
Probability(3 bookings out of 4 online enquiries) = BINOM.DIST(3, 4, 0.1, FALSE)
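The same probability can be computed directly from the binomial formula. The following Python sketch uses the standard-library function `math.comb` for the number of combinations; the function name `binomial_pmf` is an illustrative assumption.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials with success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


# 3 bookings out of 4 enquiries, each with a 10% chance of becoming a booking
prob_three = binomial_pmf(3, 4, 0.1)
print(round(prob_three, 4))  # 0.0036
```

By hand, this is 4 × 0.1³ × 0.9 = 0.0036, so three bookings from four enquiries is a rare outcome.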
Characteristics of the Binomial Distribution
• Mean

$$\mu = E(X) = np$$

• Variance and standard deviation

$$\sigma^2 = np(1 - p)$$

$$\sigma = \sqrt{np(1 - p)}$$

where n = sample size
p = probability of success
(1 − p) = probability of failure
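As a worked illustration using the hotel booking example (n = 4 enquiries, p = 0.1), the formulas give:

$$\mu = np = 4(0.1) = 0.4$$

$$\sigma^2 = np(1 - p) = 4(0.1)(0.9) = 0.36$$

$$\sigma = \sqrt{0.36} = 0.6$$

So, on average, 0.4 bookings are expected per 4 enquiries, with a standard deviation of 0.6 bookings.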
2. Continuous Probability Distributions
• A continuous random variable is a variable that can assume any value
on a continuum (can assume an infinite number of values).

• These can potentially take on any value, depending only on the ability to
measure accurately:
• thickness of an item
• time required to complete a task
• weight, in grams
• height, in centimetres
Three continuous distributions
The Normal Distribution
• Bell-shaped
• Symmetrical
• Mean, median and mode are equal.
• Central location is determined by the mean, µ.
• Spread is determined by the standard deviation, σ.
• The random variable X has an infinite theoretical range: −∞ to +∞.
Relative frequency histogram and polygon of the thickness of 10,000 brass washers
The Normal Probability Density Function
• The formula for the normal probability density function is:

$$f(X) = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-(1/2)[(X - \mu)/\sigma]^2}$$

where e = the mathematical constant approximated by 2.71828
π = the mathematical constant approximated by 3.14159
µ = the population mean
σ = the population standard deviation
X = any value of the continuous variable
Translation to the Standardised Normal
Distribution
• Any normal distribution (with any mean and standard deviation combination) can be transformed into the standardised normal distribution (Z).
• Translate any X to the standardised normal (the Z distribution) by subtracting the population mean from the X value and dividing by the population standard deviation.

$$Z = \frac{X - \mu}{\sigma}$$
The Standardised Normal Probability
Density Function (Pdf)
• The formula for the standardised normal probability density function is:

$$f(Z) = \frac{1}{\sqrt{2\pi}} \, e^{-(1/2)Z^2}$$

where e = the mathematical constant approximated by 2.71828
π = the mathematical constant approximated by 3.14159
Z = any value of the standardised normal distribution
The Standardised Normal Distribution
• It is also known as the ‘Z distribution’.
• Mean is 0.
• Standard deviation is 1.

• Values above the mean have positive Z-values.


• Values below the mean have negative Z-values.
Finding Normal Probabilities
• Probability is measured by the area under the curve.
• Note that the probability of any individual value is zero, since a single point has no area under the curve of a continuous distribution. The X axis has an infinite theoretical range: −∞ to +∞.
General Procedure for Finding Probabilities

To find P(a < X < b) when X is distributed normally:


1. Draw the normal curve for the problem in terms of X.

2. Translate X-values to Z-values and put Z values on your diagram.

3. Use the Standardised Normal Table.


Example
What is P(X < 7.88) when X is distributed normally with mean 2 and
standard deviation of 3?

Solution: translate X to Z: Z = (X − 2)/3, so that Z follows a standard normal distribution.

Thus, P(X < 7.88) = P((X − 2)/3 < (7.88 − 2)/3) = P(Z < 1.96) = 0.9750 (from the Standardised Normal Table)
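As a check on the table lookup, the probability can be computed with Python's standard-library `statistics.NormalDist` class, both directly and through the Z transformation. This is a sketch for verification only; the variable names are illustrative assumptions.

```python
from statistics import NormalDist

# X is normal with mean 2 and standard deviation 3 (values from the example)
x_dist = NormalDist(mu=2, sigma=3)

# Direct calculation of P(X < 7.88)
p_direct = x_dist.cdf(7.88)

# Same probability via the Z transformation and the standard normal distribution
z = (7.88 - 2) / 3
p_via_z = NormalDist().cdf(z)

print(round(z, 2), round(p_direct, 4))  # 1.96 0.975
```

Both routes give about 0.975, which illustrates why the standardised normal table is sufficient for any normal distribution.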
Note: t - distribution (Student) (1/2)
• As we will see in the next lecture on hypothesis testing, in certain
cases, we will use a t-distribution instead of a standard normal
distribution.
• The pdf of the t-distribution has a similar shape to that of the standard normal distribution, but it is more spread out and has fatter tails.
• As n → ∞, it approaches the standard normal distribution.
t - distribution (Student) (2/2)
The Uniform Distribution (1 of 2)
• The uniform distribution is a probability
distribution that has equal probabilities for all
possible outcomes of the random variable.
• It is also called a rectangular distribution.
The Uniform Distribution (2 of 2)
• The continuous uniform distribution:

$$f(X) = \begin{cases} \dfrac{1}{b - a} & \text{if } a \le X \le b \\ 0 & \text{otherwise} \end{cases}$$

• Where
• f(X) = value of the density function at any X value
• a = minimum value of X
• b = maximum value of X
Characteristics of the Uniform Distribution

• The mean of the uniform distribution is:

$$\mu = \frac{a + b}{2}$$

• The standard deviation is:

$$\sigma = \sqrt{\frac{(b - a)^2}{12}}$$

Finding P(0.1 < X < 0.3) for a uniform distribution with a = 0 and b = 1
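For a uniform distribution, the probability of an interval is its length divided by (b − a). A minimal Python sketch for the case a = 0, b = 1 follows; the function name `uniform_prob` is an illustrative assumption.

```python
import math


def uniform_prob(lower, upper, a, b):
    """P(lower < X < upper) for X uniform on [a, b]: interval length / (b - a)."""
    lo = max(lower, a)   # clip the interval to the support [a, b]
    hi = min(upper, b)
    return max(hi - lo, 0.0) / (b - a)


a, b = 0, 1
prob = uniform_prob(0.1, 0.3, a, b)
mean = (a + b) / 2
sd = math.sqrt((b - a) ** 2 / 12)
print(round(prob, 4), mean, round(sd, 4))  # 0.2 0.5 0.2887
```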
The Exponential Distribution (1 of 3)
• Exponential distribution is a continuous distribution which is
right skewed and ranges from 0 to + ∞.
• Often used to model the length of time between two
occurrences of random or independent events (the time
between events).
For example:
• time between trucks arriving at an unloading dock
• time between transactions at an ATM machine
• time between phone calls to the main operator
The Exponential Distribution (2 of 3)
Defined by a single parameter, λ (lambda), the expected number of events per interval. The mean time between events is 1/λ.

$$P(X < A) = 1 - e^{-\lambda A}$$

where e = mathematical constant approximated by 2.71828
λ = the expected number of events per interval
X = an exponential random variable, where 0 ≤ X < ∞
The Exponential Distribution (3 of 3)
Example: Customers arrive at the service counter at the rate of 15
per hour. What is the probability that the arrival time between
consecutive customers is less than three minutes?

The mean number of arrivals per hour is 15, so λ = 15.
Three minutes is 0.05 hours, so A = 0.05.

$$P(X < 0.05) = 1 - e^{-\lambda A} = 1 - e^{-(15)(0.05)} = 0.5276$$

So there is a 52.76% probability that the arrival time between
successive customers is less than three minutes.
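The calculation can be reproduced in a few lines of Python. The function name `exponential_cdf` is an illustrative assumption; the rate and time values are those of the example.

```python
import math


def exponential_cdf(a, lam):
    """P(X < a) = 1 - e^(-lam * a) for an exponential variable with rate lam."""
    return 1 - math.exp(-lam * a)


# 15 arrivals per hour; three minutes = 0.05 hours
prob = exponential_cdf(0.05, 15)
print(round(prob, 4))  # 0.5276
```

Note that the time A must be expressed in the same unit as the rate (hours here), which is why three minutes is converted to 0.05 hours.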
3. Sampling Distributions

• A sampling distribution is a distribution of all the possible values of a statistic for a given sample size selected from a population.
• The sampling distribution of the mean is the distribution of all possible sample means if you select all possible samples of a certain size.
• If the average of all possible sample means equals the population mean, then the sample mean is unbiased.
Developing a Sampling Distribution (1 of 5)
Assume there is a population …

Population size N=4

Random variable, X, is age of individuals

Values of X: 18, 20, 22, 24 (years)


Developing a Sampling Distribution (2 of 5)

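$$\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$$

$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$$

[Figure: population distribution of X, labelled "Uniform Distribution"]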
Developing a Sampling Distribution (3 of 5)
Now consider all possible samples of size n = 2

1st Obs    2nd Observation
           18        20        22        24
18         18, 18    18, 20    18, 22    18, 24
20         20, 18    20, 20    20, 22    20, 24
22         22, 18    22, 20    22, 22    22, 24
24         24, 18    24, 20    24, 22    24, 24

16 possible samples (sampling with replacement)

Resulting in 16 sample means:

1st Obs    2nd Observation
           18    20    22    24
18         18    19    20    21
20         19    20    21    22
22         20    21    22    23
24         21    22    23    24
Developing a Sampling Distribution (4 of 5)

Sampling distribution of all sample means

16 Sample Means

1st Obs    2nd Observation
           18    20    22    24
18         18    19    20    21
20         19    20    21    22
22         20    21    22    23
24         21    22    23    24

[Figure: histogram titled "Sample Means Distribution"]
Developing a Sampling Distribution (5 of 5)

Summary measures of sampling distribution

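$$\mu_{\bar{X}} = \frac{\sum \bar{X}_i}{N} = \frac{18 + 19 + 19 + \cdots + 24}{16} = 21$$

$$\sigma_{\bar{X}} = \sqrt{\frac{\sum (\bar{X}_i - \mu_{\bar{X}})^2}{N}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58$$

Here N = 16 is the number of sample means.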
Comparing the Population with Its Sampling Distribution

                              Size     Mean       Standard deviation
Population                    N = 4    μ = 21     σ = 2.236
Sample means distribution     n = 2    μX̄ = 21    σX̄ = 1.58
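The enumeration on these slides can be reproduced with a short Python sketch that lists every sample of size 2 drawn with replacement and summarises the resulting sample means. The variable names are illustrative assumptions; `pstdev` is the standard-library population standard deviation (dividing by the number of values, as the slides do).

```python
from itertools import product
from statistics import mean, pstdev

population = [18, 20, 22, 24]   # ages from the slides
n = 2                           # sample size

# All ordered samples of size n drawn with replacement (4^2 = 16 samples)
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

print(len(samples))                                        # 16
print(mean(population), round(pstdev(population), 3))      # mean 21, sd about 2.236
print(mean(sample_means), round(pstdev(sample_means), 2))  # mean 21, sd about 1.58
```

The sample means are centred on the population mean (the sample mean is unbiased) but are less spread out than the individual values.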
Standard Error of the Mean

Different samples of the same size from the same population will
yield different sample means.
A measure of the variability in the mean from sample to sample is
given by the Standard Error of the Mean.

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

(This assumes that sampling is done with replacement, or without replacement from a large or infinite population.)
Note that the standard error of the mean decreases as the sample
size increases.
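As a quick check using the four-person age population from the earlier slides (σ = 2.236, n = 2), the formula reproduces the standard deviation of the 16 sample means found by enumeration:

$$\sigma_{\bar{X}} = \frac{2.236}{\sqrt{2}} \approx 1.58$$

For a hypothetical larger sample of n = 8 from the same population (an illustration, not a case from the slides), the standard error falls to:

$$\sigma_{\bar{X}} = \frac{2.236}{\sqrt{8}} \approx 0.79$$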
Sampling from Non-normally Distributed
Populations – The Central Limit Theorem
If a population is not normal, we can apply the Central Limit Theorem, which states that, regardless of the shape of the population distribution, as long as the sample size is large enough (generally n ≥ 30), the sampling distribution of the sample mean X̄ will be approximately normal with:

$$\mu_{\bar{X}} = \mu \qquad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
Sampling distribution of the mean for different populations for samples of n = 2, 5 and 30
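The Central Limit Theorem can be illustrated with a small simulation. The sketch below is an assumption-laden example rather than part of the lecture: it picks a right-skewed exponential population with rate 1 (so µ = 1 and σ = 1), draws 5,000 samples of size n = 30, and compares the mean and standard deviation of the sample means with µ and σ/√n. The seed and the number of replications are arbitrary choices.

```python
import random
from statistics import fmean, pstdev

random.seed(0)          # fixed seed so the run is repeatable

n = 30                  # sample size
replications = 5000     # number of samples drawn (an arbitrary choice)

# Right-skewed population: exponential with rate 1, so mu = 1 and sigma = 1
sample_means = [
    fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(replications)
]

print(round(fmean(sample_means), 2))    # should be close to mu = 1
print(round(pstdev(sample_means), 2))   # should be close to 1 / sqrt(30), about 0.18
```

Plotting a histogram of `sample_means` would show a roughly bell-shaped distribution even though the underlying population is strongly skewed.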
