2022 LI QuantMethods
MarkMeldrum.com
Probability Concepts
Hypothesis Testing
Reviews
This document should be used in conjunction with the corresponding readings in the 2022 Level I CFA® Program curriculum.
Some of the graphs, charts, tables, examples, and figures are copyright 2022, CFA Institute. Reproduced and republished with
permission from CFA Institute. All rights reserved.
Required disclaimer: CFA Institute does not endorse, promote, or warrant accuracy or quality of the products or services
offered by MarkMeldrum.com. CFA Institute, CFA®, and Chartered Financial Analyst® are trademarks owned by CFA
Institute.
b. explain an interest rate as the sum of a real risk-free rate and premiums that
compensate investors for bearing distinct types of risk;
c. calculate and interpret the effective annual rate, given the stated annual interest
rate and the frequency of compounding;
d. calculate the solution for time value of money problems with different
frequencies of compounding;
e. calculate and interpret the future value (FV) and present value (PV) of a single
sum of money, an ordinary annuity, an annuity due, a perpetuity (PV only), and
a series of unequal cash flows;
f. demonstrate the use of a time line in modeling and solving time value of money
problems.
Last Revised: 08/03/2021
Suppose I lend you $1000 for one year. I will want:      LOS b - explain      Pg-2
   rf: real risk-free rate (single period rate)
   + inflation premium: compensates for expected inflation
        rf + expected inflation = nominal risk-free rate
        exactly: \( r_{nominal} = [(1 + r_f)(1 + \pi^e)] - 1 \)
   + Maturity premium: greater interest rate risk (i.e. price risk) with longer maturities
        - will also have a premium for inflation uncertainty: the longer the time period, the more uncertain we are about the level of expected inflation
Time line: t = 0 ... t = N      r = interest rate, N = # of periods (r must be in the same periodicity as N)
One period (r = 5%):    t = 0: PV = 100   ->   t = 1: FV = 100(1.05) = 105
Two periods (r = 5%):   t = 0: 100   ->   t = 1: 100(1.05)   ->   t = 2: 100(1.05)(1.05) = 100(1.05)^2 = 110.25
\( FV_N = PV(1 + r)^N \)      - the power of compounding/
e.g./ $100, 5 yrs., 6%
   annual         100(1.06)^5
   semi-annual    100(1.03)^10
   quarterly      100(1.015)^20
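e.g./ 10 M received at t = 5, invested until t = 15
   N = 10, I/Y = 9, PMT = 0, PV = -10,000,000   CPT FV = 23,673,636.75
   value at t = 0: \( PV_0 = \frac{10{,}000{,}000}{(1.09)^5} = 6{,}499{,}313.86 \)
e.g./ 10,000 for 2 yrs. at 8%, quarterly compounding
or/ N = 2 × 4 = 8, I/Y = 8/4 = 2, PMT = 0, PV = -10,000, CPT FV = 11,716.59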
LOS c - calculate - interpret      Pg-6
EAR: effective annual rate
   $100 at 8%      value after 1 yr.                  EAR
   annual          100(1.08) = 108                    8%
   semi-annual     100(1.04)^2 = 108.16               8.16%
   quarterly       100(1.02)^4 = 108.2432             8.2432%
   monthly         100(1.00666...)^12 = 108.30        8.30%
   daily           100(1.000219)^365 = 108.3278       8.3278%
   continuous      100e^.08 = 108.3287                8.3287%
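The EAR table above can be reproduced with a short sketch. The function names are illustrative (not from the curriculum); the only inputs are the 8% stated rate and the compounding frequencies listed in the notes.

```python
import math


def effective_annual_rate(stated_rate: float, periods_per_year: int) -> float:
    """EAR when the stated annual rate is compounded periods_per_year times."""
    periodic_rate = stated_rate / periods_per_year
    return (1 + periodic_rate) ** periods_per_year - 1


def effective_annual_rate_continuous(stated_rate: float) -> float:
    """EAR under continuous compounding: e^r - 1."""
    return math.exp(stated_rate) - 1


frequencies = [("annual", 1), ("semi-annual", 2), ("quarterly", 4),
               ("monthly", 12), ("daily", 365)]
for label, m in frequencies:
    print(f"{label:12s} {effective_annual_rate(0.08, m):.4%}")
print(f"{'continuous':12s} {effective_annual_rate_continuous(0.08):.4%}")
```

The EAR rises with the compounding frequency but is capped by the continuous case.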
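FV of an ordinary annuity: 1000 at the end of each yr. for 5 yrs., r = 5%
t = 0      1        2        3        4        5
         -1000    -1000    -1000    -1000    -1000
FV at t = 5 = 1000(1.05)^4 + 1000(1.05)^3 + 1000(1.05)^2 + 1000(1.05)^1 + 1000
\( FV_N = A\left[\frac{(1 + r)^N - 1}{r}\right] = 1000\left[\frac{(1.05)^5 - 1}{.05}\right] = 5{,}525.63 \)
or/ N = 5, PMT = -1000, I/Y = 5, PV = 0   CPT FV = 5,525.63
e.g./ € 20,000/yr., N = 30, r = 9%, end of yr. CF - ordinary annuity/
\( FV_N = 20{,}000\left[\frac{(1.09)^{30} - 1}{.09}\right] = 2{,}726{,}150.77 \)
or/ N = 30, PMT = -20,000, PV = 0, I/Y = 9   CPT FV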
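A minimal sketch of the two routes shown above: the closed-form annuity factor and compounding each payment separately. Function names are illustrative.

```python
def fv_ordinary_annuity(payment: float, rate: float, n_periods: int) -> float:
    """Closed form: A * [((1 + r)^N - 1) / r], payments at the end of each period."""
    return payment * ((1 + rate) ** n_periods - 1) / rate


def fv_by_compounding_each_payment(payment: float, rate: float, n_periods: int) -> float:
    """The payment made at time t compounds for (N - t) periods."""
    return sum(payment * (1 + rate) ** (n_periods - t)
               for t in range(1, n_periods + 1))


print(round(fv_ordinary_annuity(1000, 0.05, 5), 2))
print(round(fv_by_compounding_each_payment(1000, 0.05, 5), 2))
print(round(fv_ordinary_annuity(20_000, 0.09, 30), 2))
```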
FV of a series of unequal CFs (r = 5%)      Pg-8
t = 0      1        2        3        4        5
         -1000    -2000    -4000    -5000    -6000
FV at t = 5 = 1000(1.05)^4 + 2000(1.05)^3 + 4000(1.05)^2 + 5000(1.05)^1 + 6000(1.05)^0 = 19,190.76
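PV of a single sum: \( FV = PV(1 + r)^N \Rightarrow PV = \frac{FV}{(1 + r)^N} = FV(1 + r)^{-N} \)
e.g./ 100,000 at t = 6, r = 8%, PV at t = 0 = ?
   \( PV = \frac{100{,}000}{(1.08)^6} = 63{,}016.96 \)      (N = 6, PMT = 0, I/Y = 8, FV = 100,000   CPT PV)
e.g./ 100,000 at t = 10, r = 8%: value at t = 4 and at t = 0
   \( PV_4 = \frac{100{,}000}{(1.08)^6} = 63{,}016.96 \)
   \( PV_0 = \frac{63{,}016.96}{(1.08)^4} = 46{,}319.35 \)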
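PV of an ordinary annuity: 1000 at the end of each yr., N = 5, r = 12%      Pg-10
\( PV = A\left[\frac{1 - (1 + r)^{-N}}{r}\right] = 1000\left[\frac{1 - (1.12)^{-5}}{.12}\right] = 3{,}604.78 \)
or/ N = 5, PMT = 1000, I/Y = 12, FV = 0   CPT PV
· Annuity Due: first CF at t = 0
t = 0      1       2       3       4       5
 1000    1000    1000    1000    1000
\( PV = 1000 + 1000\left[\frac{1 - (1.12)^{-4}}{.12}\right] = 1000 + 3{,}037.35 = 4{,}037.35 \)
or/ N = 4, PMT = 1000, I/Y = 12, FV = 0   CPT PV, then +1000
e.g./ 200k per yr., first CF today (annuity due)
N = 19, PMT = 200,000, I/Y = 7, FV = 0   CPT PV, then add the 200k received at t = 0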
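The ordinary-annuity and annuity-due present values above can be checked with a small sketch. The helper names are illustrative; the annuity due is valued as "first payment today plus an ordinary annuity of the remaining payments", which mirrors the N = 4 and N = 19 calculator setups in the notes.

```python
def pv_ordinary_annuity(payment: float, rate: float, n_periods: int) -> float:
    """A * [(1 - (1 + r)^-N) / r], payments at the end of each period."""
    return payment * (1 - (1 + rate) ** -n_periods) / rate


def pv_annuity_due(payment: float, rate: float, n_payments: int) -> float:
    """First payment at t = 0, then an ordinary annuity of the other n - 1 payments."""
    return payment + pv_ordinary_annuity(payment, rate, n_payments - 1)


print(round(pv_ordinary_annuity(1000, 0.12, 5), 2))   # 5 end-of-year payments
print(round(pv_annuity_due(1000, 0.12, 5), 2))        # 5 payments, first one today
print(round(pv_annuity_due(200_000, 0.07, 20), 2))    # 200k today plus 19 more
```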
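e.g./ 1,000,000 per yr. for 30 yrs., first CF at t = 10 (last at t = 39), r = 5%, PV0 = ?      Pg-11
   PV9: treat the CFs as an ordinary annuity (END mode)
      N = 30, PMT = 1,000,000, I/Y = 5, FV = 0   CPT PV9
      \( PV_9 = 1{,}000{,}000\left[\frac{1 - (1.05)^{-30}}{.05}\right] = 15{,}372{,}451.03 \)
   PV10: treat the CFs as an annuity due (BGN mode)
      N = 30, PMT = 1,000,000, I/Y = 5, FV = 0   CPT PV10
      \( PV_{10} = 1{,}000{,}000 + 1{,}000{,}000\left[\frac{1 - (1.05)^{-29}}{.05}\right] = 16{,}141{,}073.58 \)
   \( PV_0 = \frac{PV_9}{(1.05)^9} = 9{,}909{,}219 \)
   \( PV_0 = \frac{PV_{10}}{(1.05)^{10}} = 9{,}909{,}219 \)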
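Perpetuity: \( PV = \frac{A}{r} \)
e.g./ 100 per yr. forever, first CF at t = 5 (r = 5%)
   \( PV_4 = \frac{100}{.05} = 2{,}000 \),   \( PV_0 = \frac{2{,}000}{(1.05)^4} = 1{,}645.40 \)
e.g./ 100 at t = 1, 2, 3, 4 (r = 5%) as the difference between 2 perpetuities
   long perpetuity at t = 0: \( \frac{100}{.05} = 2{,}000 \)
   short perpetuity at t = 4: \( \frac{100}{.05} = 2{,}000 \), worth \( \frac{2{,}000}{(1.05)^4} = 1{,}645.40 \) at t = 0
   difference = 4 yr. ord. annuity: \( 2{,}000 - 1{,}645.40 = 100\left[\frac{1 - (1.05)^{-4}}{.05}\right] = 354.60 \)
   or/ N = 4, PMT = 100, I/Y = 5, FV = 0   CPT PV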
LOS e - calculate - interpret      Pg-13
Present value of a series of unequal CFs (r = 5%):
t = 0      1      2      3      4      5
          1k     2k     4k     5k     6k
PV = 1000/(1.05) + 2000/(1.05)^2 + 4000/(1.05)^3 + 5000/(1.05)^4 + 6000/(1.05)^5 = 15,036.46
Calculator:
   2nd CF 2nd CE/C
   CF0 ↓
   C01 1000 ENTER ↓ ↓
   C02 2000 ENTER ↓ ↓
   C03 4000 ENTER ↓ ↓
   C04 5000 ENTER ↓ ↓
   C05 6000 ENTER ↓
   NPV   I 5 ENTER
   NPV CPT = 15,036.46
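As a cross-check on the calculator keystrokes, the same unequal cash flows can be discounted (and compounded) in a few lines. The function names are illustrative; the cash flows and the 5% rate are the ones in the notes.

```python
def present_value(cash_flows, rate):
    """cash_flows[0] arrives at t = 1, cash_flows[1] at t = 2, and so on."""
    return sum(cf / (1 + rate) ** t
               for t, cf in enumerate(cash_flows, start=1))


def future_value(cash_flows, rate):
    """Value at the date of the last cash flow."""
    n = len(cash_flows)
    return sum(cf * (1 + rate) ** (n - t)
               for t, cf in enumerate(cash_flows, start=1))


flows = [1000, 2000, 4000, 5000, 6000]
print(round(present_value(flows, 0.05), 2))
print(round(future_value(flows, 0.05), 2))
```

The two results are linked by FV = PV(1.05)^5, which is a quick consistency check on either number.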
= . − =. ( . %)
= . − = −. (−. %)
.
e.g./ 2007 - 8.52M units sold
      2012 - 7.35M units sold      find g
   \( g = \left(\frac{FV_N}{PV}\right)^{1/N} - 1 = \left(\frac{7.35}{8.52}\right)^{1/5} - 1 = .9709 - 1 = -.0291 \)   (-2.91%)
Note: g is called the compound annual growth rate
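A one-function sketch of the compound annual growth rate, applied to the units-sold example (8.52M in 2007, 7.35M in 2012, so N = 5). The function name is illustrative.

```python
def compound_annual_growth_rate(begin_value: float, end_value: float, n_periods: int) -> float:
    """g = (end / begin)^(1/N) - 1."""
    return (end_value / begin_value) ** (1 / n_periods) - 1


g = compound_annual_growth_rate(8.52, 7.35, 5)
print(f"{g:.4%}")
```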
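e.g./ 100,000 mortgage, 360 monthly payments, 8% compounded monthly, PMT = ?
\( PV = A\left[\frac{1 - (1 + r_s/m)^{-mN}}{r_s/m}\right] \Rightarrow A = \frac{PV}{\text{PV annuity factor}} \)
\( \text{PV annuity factor} = \left[\frac{1 - (1 + .08/12)^{-12 \times 30}}{.08/12}\right] = 136.2835 \),   \( A = \frac{100{,}000}{136.2835} = 733.76 \)
or/ N = 360, I/Y = 8/12 = .6667, PV = -100,000, FV = 0   CPT PMT
interest in the first payment: 100,000 × .08/12 = 666.67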
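The level payment can be sketched as principal divided by the PV annuity factor. The function name and its parameters are illustrative; the inputs (100,000, 8%, 360 monthly payments) are the calculator inputs in the notes.

```python
def level_payment(principal: float, annual_rate: float, years: int,
                  payments_per_year: int = 12) -> float:
    """Payment that amortizes the principal: PV / PV annuity factor."""
    periodic_rate = annual_rate / payments_per_year
    n_payments = years * payments_per_year
    annuity_factor = (1 - (1 + periodic_rate) ** -n_payments) / periodic_rate
    return principal / annuity_factor


payment = level_payment(100_000, 0.08, 30)
first_month_interest = 100_000 * 0.08 / 12
print(round(payment, 2), round(first_month_interest, 2))
```

The gap between the payment and the first month's interest is the principal repaid in month one.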
e.g./ age 22 at t = 0, age 63 at t = 41; Assume r = 8%
   saves 2k per yr. at t = 1 to 15; wants 100k per yr. at t = 41 to 60
   how much to save each yr. at t = 16 to 40? (PMT = ?)
1. FV15 of savings:        N = 15, PMT = 2000, I/Y = 8, PV = 0   CPT FV = 54,304.23
2. PV40 of withdrawals:    N = 20, PMT = 100,000, I/Y = 8, FV = 0   CPT PV = 981,814.74
3. PV15 of withdrawals = 981,814.74/(1.08)^25 = 143,362.53
   shortfall at t = 15: 143,362.53 - 54,304.23 = 89,058.30
   N = 25, FV = 0, I/Y = 8, PV = -89,058.30   CPT PMT = 8,342.87
or/ FV40 of savings = FV15(1.08)^25 = 371,901.17
   shortfall at t = 40: 981,814.74 - 371,901.17 = 609,913.56
   N = 25, FV = 609,913.56, I/Y = 8, PV = 0   CPT PMT = 8,342.87
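The three steps above chain together naturally in code. This is a sketch under the notes' assumptions (8% throughout, savings at t = 1 to 15, withdrawals at t = 41 to 60); the helper names are illustrative.

```python
def fv_annuity(payment, rate, n):
    return payment * ((1 + rate) ** n - 1) / rate


def pv_annuity(payment, rate, n):
    return payment * (1 - (1 + rate) ** -n) / rate


def required_annual_saving(rate: float = 0.08) -> float:
    fv15_savings = fv_annuity(2_000, rate, 15)         # step 1
    pv40_withdrawals = pv_annuity(100_000, rate, 20)   # step 2
    pv15_withdrawals = pv40_withdrawals / (1 + rate) ** 25
    shortfall_at_15 = pv15_withdrawals - fv15_savings  # step 3
    # Level payment over t = 16..40 whose PV at t = 15 equals the shortfall.
    return shortfall_at_15 / pv_annuity(1, rate, 25)


print(round(required_annual_saving(), 2))
```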
e. describe ways that data may be visualized and evaluate uses of specific
visualizations;
l. interpret skewness;
m. interpret kurtosis;
Sections/
Pg-3
LOS a
III/ Structured vs. Unstructured Data - identify
- compare
a) Structured highly organized in a pre-defined
manner (e.g. stock prices, returns, EPS)
b) Unstructured no organized form (news, social media posts,
company filings, audio/video)
Ex #2
Pg-4
One-dimensional array (1 variable) LOS b
- describe
- e.g. a column of a spreadsheet (CS or TS)
Two-dimensional rectangular array (two or more variables)
- data table (CS or panel)
Comp m -
LOS c
- interpret
• Frequency distribution (one-way table)
- the number of observations of a specific value
or group of a variable
- sorted in ascending or descending order Exhibit #8
Pg-5
LOS c - interpret
• Absolute frequency - actual count of observations per value of the variable (∑ = N)
• Relative frequency - %’age of observations per value of the variable (abs. freq./total N) (∑ = 100%)
N = 12 2.35
2.38 4 width = range/k = 16/4 = 4
4.28
4.42 5 Intervals:
4.68 [-4.57, -0.57), [-0.57, 3.43), [3.43, 7.43), [7.43, 11.43]
7.16 6 3 4 4 1
11.43
Exhibit #11
Pg-7
• Contingency table - summarizes data for 2 or more LOS d
categorical variables - interpret
Rows = 5
Columns = 3
R x C table
(5 x 3)
• applications/ Pg-8
LOS d
1/ confusion matrix - interpret
( × ) = 5.38 df = (2-1)(2-1) = 1
Pg-9
Visualization: presentation of data in pictorial or LOS e
- describe
graphical format
- evaluate
[Figure: frequency polygon. y-axis: frequency (can be absolute or relative); x-axis: intervals/values]
Pg-10
2/ Bar Chart - represent the frequency distribution of categorical data      LOS e - describe - evaluate
- 1 variable: bars show freq.; can be vertical or horizontal; Pareto chart (%)
- 2 variables: grouped bar chart (aka. clustered bar chart)
- 2 variables: stacked bar chart (100%)
4/ Word Clouds (aka. tag cloud)
- depicts frequency of unstructured data (e.g. text)
- size of each word proportional to its frequency in the text
- colour can be used to display different sentiment
Pg-12
5/ Line Chart      LOS e - describe - evaluate
- used to visualize ordered observations
- typically used for time series data
- facilitates showing changes and underlying trends (aids in forecasting)
- can show more than one time series
Pg-13
6/ Scatter Plot      LOS e - describe - evaluate
- used to visualize the joint variation in 2 numerical values
- identify outliers
- may be no relationship, a linear or non-linear relationship
- scatter plot matrix - assess for pairwise association among many variables (Exhibit 32)
7/ Heat Maps
Pg-14
LOS f
- describe
Pitfalls/
1 selecting an improper chart type - hinders accurate interpretation of data
2 selecting data that favours a conclusion
3 truncating the range
Pg-15
• measures of central tendency - specifies where data are centered      LOS g - calculate - interpret
  (arithmetic mean, median, mode, weighted mean, geometric mean, harmonic mean)
• measures of location - deciles, quantiles, quintiles
• measures of dispersion
descriptive statistics: population → parameters (µ, σ); sample → sample statistics (X̄, s)
1/ Arithmetic Mean: \( \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \)   (X-bar)
   cross-sectional mean: average sales of 50 companies
   time-series mean: average sales for last 10 yrs. for GM
Pg-17
1/ Arithmetic Mean LOS g
Options - calculate
- interpret
3/ replace with another value
95% winsorized mean top 2.5% of values replaced by
the value at which all others lie
e.g./ 100 4 obs - all 4
88 above (opposite for the bottom)
(2.5%) assigned 88
(25 - 75)
majority 12 4 obs - all 3 assigned
0 (2.5%) 12
even # of obs.: median = mean of the obs. in positions n/2 and (n + 2)/2, e.g. n = 10: median = mean of the 5th and 6th obs.
Pg-19
4/ Weighted Mean - weights can be probabilities LOS g
- calculate
- interpret
\( R_{SP500} = P_A R_A + P_B R_B + P_C R_C \) where \( \sum P_i = 1 \)
( )/
( … ) = ( ) = .
or ( )=
geometric mean return: \( R_G = \left[\prod_{t=1}^{T}(1 + R_t)\right]^{1/T} - 1 \)   - also referred to as compounded returns (example #10)
Property (* critical to know): \( \bar{X}_G \le \bar{X} \)   - difference between them grows as variability increases
. %
- but (1.042)3 - 1 = 13.137% = = . %
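$12M(1.09775)^6 = $21M
\( g = [(1 + g_1)(1 + g_2)(1 + g_3)(1 + g_4)(1 + g_5)]^{1/5} - 1 = \left(\frac{\text{End}}{\text{Beg}}\right)^{1/5} - 1 \)
[Figure: value over time from Beg to End, with period growth rates g1 to g5]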
. or/ = =
. +.
=
= +
.
.˙. .
= /
Pg-23
LOS h
\( \bar{X}_A \times \bar{X}_H = \bar{X}_G^2 \) and \( \bar{X}_A > \bar{X}_G > \bar{X}_H \)      - select
Pg-24
LOS i
- calculate
- interpret
Box & whisker plot: upper fence = upper bound + (1.5 x IQR)      (more at L3)
Pg-26
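3/ Variance and standard deviation      LOS j - calculate - interpret
\( s^2 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n - 1} \)   - variance of sample; measured in units squared (e.g. %²)
\( s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n - 1}} \)   - sd of sample; expressed in the same units of measurement as the mean (√%² = %)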
= √ − −
=
√
(Level 2 lookahead)
Pg-27
Target Downside Deviation (Target Semideviation) - only concerned with downside risk; a measure of dispersion below the target      LOS k - calculate - interpret
\( s_{Target} = \sqrt{\frac{\sum_{\forall X_i \le B}(X_i - B)^2}{n - 1}} \)      (full n, not just n of X < B)
e.g. obs.: 10, 8, 6, 4, 2, 0      Let B = 5
   numerator = (5 - 5)^2 + (5 - 5)^2 + (5 - 5)^2 + (4 - 5)^2 + (2 - 5)^2 + (0 - 5)^2      (obs. above B contribute 0)
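A short sketch of the target semideviation for the six observations above with B = 5. The function name is illustrative; note that the divisor uses the full n, as the notes stress.

```python
import math


def target_semideviation(values, target):
    """Dispersion below the target; only observations at or below it contribute."""
    n = len(values)  # full sample size, not just the count below the target
    downside = sum((x - target) ** 2 for x in values if x <= target)
    return math.sqrt(downside / (n - 1))


observations = [10, 8, 6, 4, 2, 0]
print(round(target_semideviation(observations, 5), 4))
```

Here the numerator is 1 + 9 + 25 = 35 and the divisor is 6 - 1 = 5, so the result is the square root of 7.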
Pg-28
Coefficient of Variation      LOS k - calculate - interpret
\( CV = \frac{s}{\bar{X}} \)   - allows for direct comparisons of dispersion across different data sets
Skew      LOS l - interpret
   Skew = 0: mean = median = mode
   Skew > 0: mean > median > mode
   Skew < 0: mean < median < mode
Pg-29
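\( \text{Skew} \approx \left(\frac{1}{n}\right)\frac{\sum_{i=1}^{n}(X_i - \bar{X})^3}{s^3} \) for large n      LOS l - interpret
Kurtosis      LOS m - interpret
\( K_E \approx \left[\left(\frac{1}{n}\right)\frac{\sum_{i=1}^{n}(X_i - \bar{X})^4}{s^4}\right] - 3 \)   (excess kurtosis)
   leptokurtic (k > 3): more weight in the tails; excess kurtosis > 0
   mesokurtic (k = 3): excess kurtosis = 0
   platykurtic (k < 3): less weight in the tails; excess kurtosis < 0
(Exhibit 50 + Example 21)   good exam Q.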
Pg-30
LOS n - interpret
Covariance - the joint variability of 2 random variables; expressed in the same units as the variables
\( s_{XY} = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{n - 1} \)
   > 0 when they covary together: \( (X - \bar{X}) > 0 \) when \( (Y - \bar{Y}) > 0 \), and \( (X - \bar{X}) < 0 \) when \( (Y - \bar{Y}) < 0 \)
Correlation: \( r_{XY} = \frac{s_{XY}}{s_X s_Y} \)   - covariance determines the sign of rXY
Properties:
1/ -1 ≤ r ≤ 1
2/ r = 0 implies no linear relationship
3/ r = 1 perfect positive correlation (perfect replication)
4/ r = -1 perfect negative correlation (perfect hedge, maximum diversification)
Example #22
Pg-31
LOS n
Limitations/
- interpret
Probability Concepts
c. describe the probability of an event in terms of odds for and against the event;
h. calculate and interpret the expected value, variance, and standard deviation of
random variables;
l. calculate and interpret covariance of portfolio returns using the joint probability
function;
Probability Concepts
LOS a-c Probability Concepts and Odds Ratios (define, identify, describe)
(5p)
LOS d-g Conditional and Joint Probability (calculate/interpret, demonstrate,
(12p) compare, contrast)
LOS h-j Expected Value, Variance and Conditional (calculate/interpret,
(6.5p) measures of Expected Value and Variance explain)
LOS k Expected Value, Variance, Standard Deviation,
(6p) Covariance and Correlation of Portfolio Returns
LOS L Covariance of a Joint Probability Function calculate
(2.5p) interpret
LOS m Bayes’ Formula
(6p)
LOS n Principles of Counting (identify)
(5p)
Page 1
Random variable - a quantity whose future outcomes are uncertain (e.g. returns)      LOS a - define
Event - a specified set of outcomes e.g.: A = (rP < 10%), B = (rP ≥ 10%)
   P(A) + P(B) = 100%
Page 2
Property 1: \( 0 \le P(E) \le 1 \)      LOS b - identify - compare - contrast
Property 2: \( \sum P(E_i) = 1 \) where the \( E_i \) are
   1/ mutually exclusive (if one happens, another can’t)
   2/ exhaustive (covers all possible outcomes)
How are probabilities estimated?
1/ empirical probabilities - based on historical observation
   - past is assumed to be representative of the future
   - historical period must include occurrences of the event
( )= =
Page 3
LOS b
How are probabilities estimated? - identify
3/ a priori probabilities - arriving at a conclusion - compare
based on deductive reasoning - contrast
e.g. P(1) = 1/6 (roll a die, get a 1)
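Odds - expressed as:      LOS c - describe
   Odds for: a to b       \( \text{odds for} = \frac{P(E)}{1 - P(E)} \),   probability = \( \frac{a}{a + b} \)
   Odds against: b to a   \( \text{odds against} = \frac{1 - P(E)}{P(E)} \),   probability = \( \frac{a}{a + b} \)
e.g./
   P(E) = 1/8: odds for = 1 to 7 - for each occurrence of E, we expect 7 non-occurrences
   P(A) = 3/17: odds for = 3 to 14 - for every 3 occurrences of A, we expect 14 non-occurrences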
odds for
Page 4
from odds to probability:      LOS c - describe
   odds for: 1 to 4         \( P(E) = \frac{1}{1 + 4} = .20 \)
   odds against: 4 to 1     \( P(E) = \frac{1}{4 + 1} = .20 \)
e.g./ Wager: A = win, B = loss (mutually exclusive, exhaustive)
   P(A) = 1/16, P(B) = 15/16:   0 ≤ P(E) ≤ 1,   ∑P(E) = 1
   Odds for = 1 to 15; Odds against = 15 to 1
   win: $1 bet pays $16, profit = $15
   loss: lose $1
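The odds conversions above are easy to get backwards, so a small sketch with exact fractions may help. The function names are illustrative; the inputs are the examples from the notes.

```python
from fractions import Fraction


def odds_for(probability: Fraction):
    """Return (a, b) such that the odds for the event are 'a to b'."""
    ratio = probability / (1 - probability)
    return ratio.numerator, ratio.denominator


def probability_from_odds_for(a: int, b: int) -> Fraction:
    """Odds for of 'a to b' imply P(E) = a / (a + b)."""
    return Fraction(a, a + b)


def probability_from_odds_against(b: int, a: int) -> Fraction:
    """Odds against of 'b to a' imply P(E) = a / (a + b)."""
    return Fraction(a, a + b)


print(odds_for(Fraction(1, 8)))              # P(E) = 1/8
print(odds_for(Fraction(3, 17)))             # P(A) = 3/17
print(probability_from_odds_for(1, 4))
print(probability_from_odds_against(15, 1))  # the wager: odds against 15 to 1
```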
Page 5
LOS d - calculate - interpret
P(A): unconditional probability
P(A|B): conditional probability (prob. A given B)      \( P(A|B) = \frac{P(AB)}{P(B)} \)
e.g./ P(B) = .5, P(AB) = .1   ∴   P(A|B) = .1/.5 = 20%      [Venn diagram: A and B overlap in AB (A ∧ B)]
ex. #3: A = YR1 Winner, AC = YR1 Loser, B = YR2 Winner, BC = YR2 Loser
Page 6
Tree/      LOS e - demonstrate
1/ YR1 Winner = A, YR1 Loser = AC, YR2 Winner = B, YR2 Loser = BC
   P(A) = .50:     P(B|A) = .66,     P(BC|A) = .34
   P(AC) = .50:    P(B|AC) = .34,    P(BC|AC) = .66
calculate: P(AB) = P(B|A) P(A) = .66 x .5 = 0.33
e.g./ find: P(A or B) = P(A) + P(B) - P(AB)      LOS e - demonstrate      Page 7
   [Venn diagram: adding P(A) and P(B) double counts the overlap AB]
Independent event: 2 events are independent iff      LOS f - compare - contrast
   P(A|B) = P(A)   or   P(B|A) = P(B)      (knowing B tells us nothing about A)
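The multiplication, total probability and addition rules, plus the independence check, can be strung together using the tree probabilities above (P(A) = .50, P(B|A) = .66, P(B|AC) = .34). The variable names are illustrative.

```python
p_a = 0.50              # P(A): YR1 winner
p_b_given_a = 0.66      # P(B|A)
p_b_given_not_a = 0.34  # P(B|AC)

p_not_a = 1 - p_a
p_ab = p_b_given_a * p_a                  # multiplication rule: P(AB)
p_b = p_ab + p_b_given_not_a * p_not_a    # total probability rule: P(B)
p_a_or_b = p_a + p_b - p_ab               # addition rule (removes the double count)

# Independent only if conditioning on A leaves the probability of B unchanged.
independent = abs(p_b_given_a - p_b) < 1e-12

print(round(p_ab, 4), round(p_b, 4), round(p_a_or_b, 4), independent)
```

With these inputs P(B|A) differs from P(B), so the two events are not independent.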
Page 9
ex. #9      LOS g - calculate - interpret
   P(A) = 0.55, P(AC) = 0.45;   P(B) = 0.55, P(B|AC) = 0.40;   find P(B|A) = X
   total probability rule: P(B) = P(B|A) P(A) + P(B|AC) P(AC)
   .55 = .55X + .40(.45) = .55X + .18
   .37 = .55X
   X = .37/.55 = .6727
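LOS h - calculate - interpret
Recall: \( \bar{X} = \frac{\sum X_i}{n} = \frac{1}{n}X_1 + \frac{1}{n}X_2 + \dots + \frac{1}{n}X_n \)      (weights = 1/n)
        \( \bar{X}_w = \sum w_i X_i \)      (weights could be probabilities)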
Page 10
Expected Value of a random variable is the LOS h
probability-weighted average of the - calculate
possible outcomes - interpret
(expected - what we expect the true value to be or
what we expect the future value to be)
Page 11
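LOS h - calculate - interpret
Recall: \( s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1} = \frac{(X_1 - \bar{X})^2}{n - 1} + \frac{(X_2 - \bar{X})^2}{n - 1} + \dots + \frac{(X_n - \bar{X})^2}{n - 1} \)      (weights = 1/(n - 1))
With probabilities as weights: \( \sigma^2(X) = \sum P(X_i)[X_i - E(X)]^2 = E(X^2) - [E(X)]^2 \)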
previous example:
=. ( . − . ) +. ( . − . ) +. ( . − . )
+. ( . − . )
=.
( ) = √. = .
Page 12
conditional expected value      LOS i - explain
\( E(X|S) = \sum_i P(X_i|S)\,X_i \)      (probability × value of X)
\( E(X) = E(X|S_1)P(S_1) + E(X|S_2)P(S_2) + \dots + E(X|S_n)P(S_n) \)      total probability rule for expected value
LOS j - interpret - demonstrate      Page 13
Tree: E(EPS) = 2.34
   r↓     (.60):   EPS = 2.60 (.25)   → .6 x .25 = .15
                   EPS = 2.45 (.75)   → .6 x .75 = .45
   r-unch (.40):   EPS = 2.20 (.60)   → .4 x .6 = .24
                   EPS = 2.00 (.40)   → .4 x .4 = .16
But: what is:
1/ E(EPS | r↓) = .25(2.60) + .75(2.45) = 2.4875
2/ E(EPS | r-unch) = .60(2.20) + .40(2.00) = 2.12
Conditional variances:
   σ²(EPS | r↓) = .25(2.60 - 2.4875)^2 + .75(2.45 - 2.4875)^2 = .004219
   σ²(EPS | r-unch) = .60(2.20 - 2.12)^2 + .40(2.00 - 2.12)^2 = .0096
Example #12
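The EPS tree above maps directly onto a small data structure. This sketch computes the conditional means and variances and recovers E(EPS) through the total probability rule for expected value; the dictionary layout and function names are illustrative.

```python
# scenario -> (P(scenario), [(P(EPS | scenario), EPS), ...])
tree = {
    "rates fall": (0.60, [(0.25, 2.60), (0.75, 2.45)]),
    "rates unchanged": (0.40, [(0.60, 2.20), (0.40, 2.00)]),
}


def conditional_mean(outcomes):
    return sum(p * x for p, x in outcomes)


def conditional_variance(outcomes):
    m = conditional_mean(outcomes)
    return sum(p * (x - m) ** 2 for p, x in outcomes)


def unconditional_mean(scenarios):
    """E(X) = sum over scenarios of E(X | S) * P(S)."""
    return sum(p_s * conditional_mean(outcomes)
               for p_s, outcomes in scenarios.values())


for name, (p_s, outcomes) in tree.items():
    print(name, round(conditional_mean(outcomes), 4),
          round(conditional_variance(outcomes), 6))
print(round(unconditional_mean(tree), 4))
```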
LOS k - calculate - interpret
⟹ Portfolio Returns
1/ E(RP) = E(w1R1 + w2R2 + … + wnRn) = w1E(R1) + w2E(R2) + … + wnE(Rn)
   each Ri is also a random variable:
   E(R1) = P(R11)R11 + P(R12)R12 + … + P(R1n)R1n      (probability × possible value of R1)
Page 14
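e.g./                  W       E(Ri)      LOS k - calculate - interpret
   SnP500             .50      13%
   Corp. bonds        .25       6%
   MSCI EAFE          .25      15%
   E(RP) = .5(13%) + .25(6%) + .25(15%) = 11.75%
2/ \( \sigma^2(R_P) = E\{[R_P - E(R_P)]^2\} = \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j\,Cov(R_i, R_j) \)
   \( Cov(R_i, R_j) = E\{[R_i - E(R_i)][R_j - E(R_j)]\} \)
to calculate portfolio variance, need: 1/ all E(Ri)   2/ all Cov(Ri,Rj)
assume 3 assets R1, R2, R3 (Exhibit #11):
   \( \sigma^2(R_P) = w_1^2\sigma^2(R_1) + w_2^2\sigma^2(R_2) + w_3^2\sigma^2(R_3) + 2w_1w_2Cov(R_1,R_2) + 2w_1w_3Cov(R_1,R_3) + 2w_2w_3Cov(R_2,R_3) \)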
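A sketch of the portfolio calculations for the three-asset example. The weights and expected returns are the ones in the notes; the covariance matrix is HYPOTHETICAL (the Exhibit #11 values are not reproduced in these notes) and is only there to show the double sum at work.

```python
weights = [0.50, 0.25, 0.25]            # SnP500, Corp. bonds, MSCI EAFE
expected_returns = [0.13, 0.06, 0.15]

# HYPOTHETICAL covariance matrix (variances on the diagonal), for illustration only.
cov = [
    [0.0400, 0.0018, 0.0340],
    [0.0018, 0.0081, 0.0020],
    [0.0340, 0.0020, 0.0441],
]


def portfolio_expected_return(w, er):
    return sum(wi * ri for wi, ri in zip(w, er))


def portfolio_variance(w, cov_matrix):
    """Double sum of w_i * w_j * Cov(R_i, R_j) over all i and j."""
    n = len(w)
    return sum(w[i] * w[j] * cov_matrix[i][j]
               for i in range(n) for j in range(n))


print(round(portfolio_expected_return(weights, expected_returns), 4))
print(round(portfolio_variance(weights, cov), 7))
```

Because the matrix is symmetric, each off-diagonal covariance is counted twice in the double sum, which is where the 2w_i w_j terms in the expanded formula come from.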
Page 15
σ²(RP) = f(variances, covariances)      LOS k - calculate - interpret
   variances: always > 0
   covariances: can be < 0 or > 0
. Ex #13
= = .
.
Page 16
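LOS L - calculate - interpret
Recall: \( Cov(X, Y) = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1} = \frac{(X_1 - \bar{X})(Y_1 - \bar{Y})}{n - 1} + \frac{(X_2 - \bar{X})(Y_2 - \bar{Y})}{n - 1} + \dots + \frac{(X_n - \bar{X})(Y_n - \bar{Y})}{n - 1} \)      (weights = 1/(n - 1))
weights = probabilities? - the concept of joint probability
\( Cov(R_A, R_B) = \sum_i \sum_j P(R_{A,i}, R_{B,j})\,[R_{A,i} - E(R_A)][R_{B,j} - E(R_B)] \)      where i & j = 1 to n are scenarios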
LOS n
Counting/ e.g./ - identify
1/ Multiplication Portfolio subdivided by
- analyze
Domestic/Foreign
then by 4 industries
then by 3 size categories
- how many sub-portfolios?
2 x 4 x 3 = 24
Page 19
Counting/      LOS n - identify - analyze
2/ Factorial (n!)   e.g./ 3 analysts to cover 3 industries: 3 x 2 x 1 = 3!
high risk above-avg. risk avg. risk below-avg. risk low risk
4 4 3 4 3
Counting/ Page 20
LOS n
if k = 1 factorial
- identify
if k = 2 combination or permutation - analyze
5/ Permutation - if k = 2 an      \( {}_nP_r = \frac{n!}{(n - r)!} \)      Ex 17, 17, 19
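The counting rules can be checked with Python's standard library. Two labeled assumptions: the "4 4 3 4 3" risk-category counts are read here as a labeling (multinomial) problem with 18 items in total, and the n = 5, r = 2 inputs for the permutation and combination are hypothetical.

```python
import math


def multinomial(counts):
    """Ways to assign n = sum(counts) items to labels: n! / (n1! n2! ... nk!)."""
    ways = math.factorial(sum(counts))
    for c in counts:
        ways //= math.factorial(c)  # exact at every step
    return ways


sub_portfolios = 2 * 4 * 3              # multiplication rule
analyst_assignments = math.factorial(3)  # 3 analysts, 3 industries
risk_labelings = multinomial([4, 4, 3, 4, 3])  # assumed reading of the risk table
ordered_pairs = math.perm(5, 2)          # hypothetical n = 5, r = 2, order matters
unordered_pairs = math.comb(5, 2)        # same inputs, order does not matter

print(sub_portfolios, analyst_assignments, risk_labelings,
      ordered_pairs, unordered_pairs)
```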
a. define a probability distribution and compare and contrast discrete and continuous random
variables and their probability functions;
b. calculate and interpret probabilities for a random variable, given its cumulative distribution
function;
c. describe the properties of a discrete uniform random variable, and calculate and interpret
probabilities given the discrete uniform distribution function;
d. describe the properties of the continuous uniform distribution and calculate and interpret
probabilities, given a continuous uniform distribution;
e. describe the properties of a Bernoulli random variable and a binomial random variable, and
calculate and interpret probabilities given the binomial distribution function;
g. contrast between a multivariate distribution and a univariate distribution, and explain the role
of correlation in the multivariate normal distribution;
h. calculate the probability that a normal distributed random variable lies inside a given interval;
k. define shortfall risk, calculate the safety-first ratio, and identify an optimal portfolio using
Roy’s safety-first criterion;
l. explain the relationship between normal and lognormal distributions and why the lognormal
distribution is used to model asset prices;
m. calculate and interpret a continuously compounded rate of return given a specific holding
period return;
n. describe the properties of the Student’s t-distribution, and calculate and interpret its degree of
freedom;
o. describe the properties of the chi-square distribution and the F-distribution, and calculate and
interpret their degrees of freedom;
Page 1
Probability distribution specifies the probabilities LOS a
associated with the possible outcomes - define
of a random variable - compare
(uniform, binomial, normal, lognormal, Student’s t, - contrast
chi-square, F-distribution)
Random variable a quantity whose future outcomes are uncertain
discrete - take on at most a countable number of
possible values (possibly infinite)
continuous - cannot count the possible values
- every random variable is associated with a probability distribution
that describes the variable completely
Probability function specifies the probabilities that a random
variable can take i.e. P(X = x)
discrete variables: p(x)
continuous variables: f(x) the probability density function (pdf)
Page 3
Discrete uniform distribution
LOS c
- describe
- calculate
( ≤ )= . - interpret
example #2/
a b
LOS e
Bernoulli random variable: the outcome of a trial that - describe
produces one of two outcomes (1 or 0) - calculate
where p = success - interpret
( )=
Page 5
Binomial Random Variable - # of successes in LOS e
- describe
Bernoulli trials.
- calculate
- interpret
assumptions: 1/ p is constant for all trials
2/ trials are independent
Page 6
\( p(x) = \frac{n!}{(n - x)!\,x!}\,p^x(1 - p)^{n - x} \)      LOS e - describe - calculate - interpret
   (# of ways) × (probability)
symmetrical
.59
! !
. =. (. ) = .
( − )! ! ! ( − )! !
(. ) = .
( − )! !
example #3.
p = .5, n = 12:   P(X ≤ 3) = p(0) + p(1) + p(2) + p(3) = .072998 or 7.3%
p = .5, n = 8:    P(X ≤ 3) = p(0) + p(1) + p(2) + p(3) = .363281 or 36.3%
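A sketch of the binomial probability function and its cumulative sum. The two totals in the notes (.072998 and .363281) are reproduced here under the assumption that they are P(X ≤ 3) with p = .5 for n = 12 and n = 8 respectively; that reading fits the numbers exactly, but the original inputs are not legible in the notes.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """(# of ways) x (probability of each way)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def binomial_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x) as the sum of p(0) ... p(x)."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))


print(round(binomial_cdf(3, 12, 0.5), 6))
print(round(binomial_cdf(3, 8, 0.5), 6))
```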
Page 8
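                Mean     Variance      LOS e - describe - calculate - interpret
   Bernoulli     p       p(1 - p)
   Binomial      np      np(1 - p)
Bernoulli: Y = 1 with prob. p, Y = 0 with prob. (1 - p)
   E(Y) = p(1) + (1 - p)(0) = p
   σ² = p(1 - p)^2 + (1 - p)(0 - p)^2 = p(1 - p)(1 - p) + (1 - p)p^2 = p - 2p^2 + p^3 + p^2 - p^3
      = p - p^2 = p(1 - p)
Normal pdf: \( f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left(\frac{-(x - \mu)^2}{2\sigma^2}\right) \) for −∞ < x < ∞
   if µ = 0, σ = 1: standard normal dist.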
Page 9
[Figure: standard normal pdf (50% of the area on each side of 0) and cdf (rising from 0 to 1.00, equal to .50 at 0)]      LOS f - explain
- the normal dist. will be used to model asset returns (not asset prices)
   - more kurtotic than normal
   - options add skew
Characteristics:
1/ described by 2 parameters µ and σ²: X ~ N(µ, σ²)
2/ skew = 0 and kurtosis = 3 (excess kurtosis = 0)   ∴ mean = median = mode
3/ a linear combination of 2 or more normal random variables is also normally distributed
   RP = w1R1 + w2R2 + w3R3: R1, R2, R3 are univariate random vars. but together multivariate; RP is normally dist.
1 = cdf
0 = pdf
+/− = %
+/− . = %
+/− . = %
n = 30 returns      LOS i - explain      Page 11
for each xi: \( z = \frac{x_i - \mu}{\sigma} \)      e.g./ x = 7.2%, µ = 4.7%, σ = 3%: \( z = \frac{7.2 - 4.7}{3} = \frac{2.5}{3} = .8333 \)
   (after standardizing: µ = 0, σ = 1)
LOS j - calculate - interpret
= NORM.S.DIST(z,1)   (z in, prob. out)      or      = NORM.S.INV(probability)   (prob. in, z out)
   e.g. NORM.S.INV(.90) = 1.28155, NORM.S.INV(.05) = -1.64485, NORM.S.INV(.95) = 1.64485
Page 12
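Example #6/ µ = 12%, σ = 22%      LOS j - calculate - interpret
1/ P(R ≥ 20%)
   \( z = \frac{20 - 12}{22} = .3636 \)
   = NORM.S.DIST(0.3636,1) = 0.64194   ∴   P(R ≥ 20%) = 1 - .64194 = .35806      (64.194% below, 35.806% above)
2/ P(12% ≤ R ≤ 20%)
   z = 0 to z = .3636:   .64194 - .50 = .14194
3/ P(R ≤ 5.5%)
   \( z = \frac{5.5 - 12}{22} = -.2955 \)   →   38.38%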
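The three probabilities in Example #6 can be reproduced with the standard library's normal distribution, which plays the role of NORM.S.DIST here. The mean of 12% and standard deviation of 22% are inferred from the z-values shown in the notes (.3636 and -.2955), so treat them as an assumption.

```python
from statistics import NormalDist

returns = NormalDist(mu=12, sigma=22)  # inferred inputs, in percent

z_20 = (20 - 12) / 22
p_at_least_20 = 1 - returns.cdf(20)
p_between_12_and_20 = returns.cdf(20) - returns.cdf(12)
p_at_most_5_5 = returns.cdf(5.5)

print(round(z_20, 4))
print(round(p_at_least_20, 4), round(p_between_12_and_20, 4), round(p_at_most_5_5, 4))
```

`NormalDist().cdf(z)` with the standardized z gives the same answers as calling `cdf` on the unstandardized distribution.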
Page 13
Safety first rules focus on shortfall risk - the risk a portfolio value (or return) will fall below some minimum acceptable level over some time horizon      LOS k - define - identify
\( SFRatio = \frac{E(R_P) - R_L}{\sigma_P} \);   shortfall risk = P(RP < RL) = N(-SFRatio)
e.g./ RL = 2%
   Portfolio    E(RP)    σP      SFRatio                   shortfall risk
   1            12%      15%     (12 - 2)/15 = .6667       = NORM.S.DIST(-.667,1) = 0.2525
   2            14%      16%     (14 - 2)/16 = .75         = NORM.S.DIST(-.75,1) = 0.227
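A sketch of Roy's safety-first criterion for the two portfolios above. The threshold of 2% is implied by the ratios in the notes (.6667 and .75); the function names are illustrative and normality of returns is assumed.

```python
from statistics import NormalDist


def safety_first_ratio(expected_return: float, threshold: float, std_dev: float) -> float:
    return (expected_return - threshold) / std_dev


def shortfall_probability(expected_return: float, threshold: float, std_dev: float) -> float:
    """P(R < threshold) for normally distributed returns: N(-SFRatio)."""
    return NormalDist().cdf(-safety_first_ratio(expected_return, threshold, std_dev))


threshold = 2.0
portfolios = {"1": (12.0, 15.0), "2": (14.0, 16.0)}  # name -> (E(R), sd), in percent

for name, (mean, sd) in portfolios.items():
    print(name, round(safety_first_ratio(mean, threshold, sd), 4),
          round(shortfall_probability(mean, threshold, sd), 4))

# The optimal portfolio has the highest SFRatio (lowest shortfall risk).
best = max(portfolios, key=lambda k: safety_first_ratio(portfolios[k][0], threshold, portfolios[k][1]))
print("optimal:", best)
```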
Page 14
.
= - commonly used to model LOS L
the probability distribution - explain
of asset prices
right
skewed
- a variable Y follows a lognormal
distribution if LN(Y) is normally
0 ∞ distributed
bounded below by 0
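\( \frac{S_1 - S_0}{S_0} = \frac{S_1}{S_0} - \frac{S_0}{S_0} = \frac{S_1}{S_0} - 1 = r \Rightarrow \frac{S_1}{S_0} = 1 + r \)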
Page 15
e.g. = . =
LOS m
/ = . / = . = + .˙. = % - calculate
- interpret
\( \ln(S_1/S_0) = r_{cc} \) where \( r_{cc} \) = continuously compounded return
= ( . / )= ( . )= . and ~ ( , )
.˙. . = .
- so, while \( S_1 = S_0(1 + r) \),
with cont. comp: \( S_1 = S_0 e^{r_{cc}} \)
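A small sketch of the link between a holding period return and its continuously compounded equivalent. The prices (100 and 110) are HYPOTHETICAL, since the example values in the notes are not legible; the function names are illustrative.

```python
import math


def holding_period_return(s0: float, s1: float) -> float:
    return s1 / s0 - 1


def continuously_compounded_return(s0: float, s1: float) -> float:
    return math.log(s1 / s0)


s0, s1 = 100.0, 110.0  # hypothetical prices
r = holding_period_return(s0, s1)
r_cc = continuously_compounded_return(s0, s1)

print(round(r, 6), round(r_cc, 6))
print(round(s0 * (1 + r), 6), round(s0 * math.exp(r_cc), 6))  # both recover S1
```

For a positive return the continuously compounded rate is always the smaller of the two, since ln(1 + r) < r.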
LOS n - describe - calculate - interpret
1/ Student’s t-distribution - defined by a single parameter known as degrees of freedom (df = n − 1)
   - vs. normal: the t-dist. has less weight at the centre and more weight in the tails
   - as df ↑, the t-distribution approaches the z-distribution
- for > , ≃
Page 17
\( z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \)      \( t = \frac{\bar{X} - \mu}{s/\sqrt{n}} \)      (denominator = standard error)      LOS n - describe - calculate - interpret
   z: where µ and σ are population parameters (only 1 estimate used)
   t: where X̄ and s are sample statistics (∴ 2 estimates used)
t-tests are used for hypothesis testing since they are more conservative, a more stringent test, and they produce wider confidence intervals
LOS o - describe - calculate - interpret
2/ Chi-square distribution
   - distribution of the sum of squares (deviations) of k independent standard normally distributed random variables (dist. of variances)
   - each distribution has its own df (df = n − 1)
   - as df ↑, dist. becomes more bell-shaped
   - bounded below by zero
Page 18
F-distribution - the ratio of 2 variables LOS o
the larger - describe
− - calculate
= value is in the
- interpret
numerator
−
used in regression to test the
significance of the whole regression
(explained var/unexplained var)
Example 9 Excel cdfs:
   NORM.S.DIST(z,1)                  NORM.S.INV(p)
   T.DIST(t-value,df,1)              T.INV(p,df)
   CHISQ.DIST(x2-value,df,1)         CHISQ.INV(p,df)
   F.DIST(F-value,df1,df2,1)         F.INV(p,df1,df2)
Page 19
Example 9/ LOS o
- describe
- calculate
- interpret
Step 4: Draw standard normal random numbers for each key risk factor over each of the sub-periods.
- random number generator produces a distribution of
random numbers from 0 to 1, all equally likely
Page 21
Monte Carlo Simulation/ LOS p
Step #4 distribution of random #’s - describe
#1 0.32
#2 0.64 1000 runs
#20
0 Step #6:
#1 -0.48 z-value
( )= . %− . ( %) = .
= ( . )
#2 0.73 z-value
( ) = . %+. ( )= . %
= ( . )
#20
Page 22
Monte Carlo Simulation/ LOS p
- describe
= ( + ( )
distribution of possible
/ + = Beginning Capital
10 yrs.
j. describe the issues regarding selection of the appropriate sample size, data-
mining bias, sample selection bias, survivorship bias, look-ahead bias, and
time-period bias.
Page 1
sample - a method of obtaining information about a population’s parameters (µ & σ) through sample statistics (X̄ & s)      LOS a-c - compare - contrast - explain
Page 3
- sample statistics are estimates of population parameters LOS a-c
- not exact, subject to error - compare
- contrast
sampling error difference between observed values of - explain
sampling distribution of
the sample means
Page 5
B/ Non-probability sampling - depends on factors such LOS a-c
- compare
as judgment or convenience (in terms of
- contrast
access to data)
- explain
example 2, 3, 4
Page 6
LOS d, e - explain - calculate - interpret
   population with any distribution, with mean µ and finite variance σ²
   → for a large sample size n, the sampling distribution of the sample means is approx. normal with mean µ and variance σ²/n
Standard Error = σ/√n, or s/√n if we don't know σ
   - as n ↑, sampling error decreases
   - X̄ is the best estimate of µ
Note: sd ≠ SE
   sd = dispersion from the mean (data description)
   SE = sampling error (data inference)
Page 7
Point Estimators/ Desirable properties
LOS f
- identify
1/ Unbiasedness an unbiased estimator is one whose - describe
expected value (the mean of its sampling
distribution) equals the parameter it is intended to estimate
e.g. \( \bar{X} = \frac{\sum X_i}{n} \) is unbiased while \( \frac{\sum X_i}{n - 1} \) is biased upwards
2/ Efficiency an unbiased estimator is efficient if no other
unbiased estimator has a sampling distribution with
smaller variance
[Figure: sampling distributions from estimator A and from estimator B; both are unbiased; estimator A is more efficient since it produces a smaller variance]
Page 8
Point Estimators/ Desirable properties
LOS f
- identify
3/ Consistency - a consistent estimator is one for which - describe
the probability of estimates close to the value
of the population parameter increases as sample size increases
LOS g, h - contrast - calculate - interpret
Confidence Interval - a range for which one can assert with a given probability (1 − α), called the degree of confidence, that it will contain the parameter it is intended to estimate
Page 9
Confidence Intervals/      LOS g, h - contrast - calculate - interpret
Interpretation: Probabilistic - in repeated sampling, 95% of such CIs will, in the long run, include or bracket the population mean
\( \bar{X} \pm z_{\alpha/2}\,\sigma/\sqrt{n} \)      e.g. α = 5%: \( \bar{X} \pm 1.96\,\sigma/\sqrt{n} \)   →   21.08 to 28.92
   = NORM.S.INV(.975) = 1.96   or   = NORM.S.INV(.025) = -1.96
Page 10
1/ CI for µ (Normally Distributed population, known variance)      LOS g, h - contrast - calculate - interpret
common reliability factors:
   90%: z = 1.65      (5% in each tail, 90% between -1.65 and +1.65)
   95%: z = 1.96
   99%: z = 2.58
\( \bar{X} \pm z_{\alpha/2}\,\sigma/\sqrt{n} \)      e.g. X̄ = .45, σ = .30, α = 10%, n = 100
   .45 +/− 1.65 × .30/√100   →   lower bound .4005, upper bound .4995
   = NORM.S.INV(.95) = 1.65   or   = NORM.S.INV(.05) = -1.65
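A sketch of the z-based confidence interval, using the standard library in place of NORM.S.INV. The inputs (.45, .30, n = 100, 90% confidence) are the values that reproduce the bounds in the notes; the function name is illustrative. The exact reliability factor is 1.645 rather than the rounded 1.65, so the bounds differ slightly in the fourth decimal.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean: float, sigma: float, n: int, confidence: float):
    """X-bar +/- z_(alpha/2) * sigma / sqrt(n), for a known population sd."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # reliability factor
    half_width = z * sigma / sqrt(n)          # z times the standard error
    return sample_mean - half_width, sample_mean + half_width


lower, upper = z_confidence_interval(0.45, 0.30, 100, 0.90)
print(round(lower, 4), round(upper, 4))
```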
Page 11
3/ CI for µ (σ Unknown)      LOS g, h - contrast - calculate - interpret
\( \bar{X} \pm t_{\alpha/2}\,s/\sqrt{n} \)      when: sample is large regardless of distribution, or/ sample is small but population is normally distributed
   = T.INV(p,df) = t-val.   or   = T.INV(p,df) = -t-val.
                                          small n     large n
   normal dist., known variance             z           z
   normal dist., unknown variance           t           t or z   (practice uses t)
   non-normal dist., known variance         N/A         z
   non-normal dist., unknown variance       N/A         t or z
× note: width = 2E
= √ = =
√
Page 12
Resampling - repeatedly draw samples from an original data sample in order to estimate population parameters      LOS i - describe
   Population - unknown distribution - all we have is a sample
   - draw 1 observation, record, replace
   - draw another obs., record, replace
   - continue until the resample is complete; repeat many times
Page 13
LOS i
2/ Jackknife method/ - omit one observation from
- describe
a sample, one at a time
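Both resampling ideas can be sketched with the standard library. Drawing with replacement ("draw, record, replace") is the scheme usually called the bootstrap; that name, the sample values and the function names are assumptions added here, not taken from the notes.

```python
import random
from statistics import mean, stdev


def bootstrap_means(sample, n_resamples, seed=0):
    """Resample with replacement, same size as the original, and record each mean."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    n = len(sample)
    return [mean(rng.choices(sample, k=n)) for _ in range(n_resamples)]


def jackknife_means(sample):
    """Omit one observation at a time and record the mean of the rest."""
    return [mean(sample[:i] + sample[i + 1:]) for i in range(len(sample))]


data = [0.02, -0.01, 0.04, 0.03, -0.02, 0.05]  # hypothetical returns

boot = bootstrap_means(data, n_resamples=1000)
print("bootstrap estimate of the standard error:", round(stdev(boot), 5))
print("jackknife means:", [round(m, 4) for m in jackknife_means(data)])
```

The bootstrap gives as many resamples as requested, while the jackknife always produces exactly n leave-one-out samples.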
LOS j - describe      Page 14
1/ Data snooping bias (data mining) - searching a data set for statistically significant patterns/relationships
   - to minimize/avoid: out-of-sample test
        training data set: build and fit a model
        validation data set: evaluate model fit and tune the model
        test data set: out-of-sample test to evaluate model fit
   - if data snooping is present, there will be insignificant model fit
2/ Sample selection bias/ excluding some observations or time periods
- basically choosing non-random samples
e.g./ survivorship bias historical data may only include data
for companies that survived
- would overstate performance
- using hedge fund indexes: since they are self-reported, only well-performing funds may opt to report
Page 15
3/ Look ahead bias/ using information that was not LOS j
available on the observation date - describe
e.g.: models that use price and accounting data from the
historical record when the actg. data may not have been
available on the same date
(price on Dec 31, actg. data as of Dec 31, but the actg. data may not have been reported until mid-February)
Hypothesis Testing
a. define a hypothesis, describe the steps of the hypothesis testing, and describe and
interpret the choice of the null and alternative hypotheses;
c. explain a test statistic, Type I and Type II errors, a significance level, how significance
levels are used in hypothesis testing and the power of a test;
d. explain a decision rule, the power of a test, and the relation between confidence intervals
and hypothesis tests, and determine whether a statistically significant result is also
economically meaningful;
f. describe how to interpret the significance of a test in the context of multiple tests;
g. identify the appropriate test statistic and interpret the results for a hypothesis test
concerning the population mean of both large and small samples when the population is
normally or approximately normally distributed and the variance is 1) known or 2)
unknown;
h. identify the appropriate test statistic and interpret the results for a hypothesis test
concerning the equality of the population means of two at least approximately normally
distributed populations, based on independent random samples with 1) equal or 2)
unequal assumed variances;
i. identify the appropriate test statistic and interpret the results for a hypothesis test
concerning the mean difference of two normally distributed populations;
j. identify the appropriate test statistic and interpret the results for a hypothesis test
concerning 1) the variance of a normally distributed population, and 2) the equality of the
variances of two normally distributed populations based on two independent random
samples;
k. compare and contrast parametric and non parametric tests and describe situations where
each is the more appropriate type of test;
l. explain parametric and non parametric tests of the hypothesis that the population
correlation coefficient equals zero and determine whether the hypothesis is rejected at a
given level of significance;
Hypothesis Testing
LOS a, b (4p) The process of hypothesis testing - define, describe, interpret, compare
1.5p Identifying the appropriate test statistic contrast
LOS c explain
2p Specify the level of significance
3p State the Decision Rule
LOS d explain, determine
1p Make a decision
LOS e (3p) The role of p-values - explain, interpret
LOS f (2.5p) Multiple Tests and Interpreting Significance - describe
LOS g (4.5p) Tests concerning a single mean identify
LOS h (2.5p) Tests concerning differences between means (Ind. Samples) interpret
LOS i (4p) Tests concerning differences between means (Dep. samples)
LOS j (8.5p) Tests concerning tests of Variance
LOS k (2p) Parametric vs. Non Parametric Tests - compare, contrast, describe
LOS L (5.5p) Tests Concerning Correlation - explain, determine
LOS m (5p) Tests of Independence - explain
Page 1
Statistical Inference - the process of making judgments about a larger group (pop.) based on a smaller group (sample)
LOS a - define, describe, interpret
e.g./ hypothesis testing - test to see whether a sample statistic is likely to come from a population with the hypothesized value of the population parameter
i.e. Does $\bar{X} = \mu_0$?
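LOS b - compare, contrast
Two-sided (two-tailed) test
e.g. $H_0: \mu = \mu_0$ vs. $H_a: \mu \neq \mu_0$, where $\mu_0$ is the hypothesized value (a stated %)
- the true mean could be $< \mu_0$: reject $H_0$
- could be $> \mu_0$: reject $H_0$
- could be $= \mu_0$: fail to reject $H_0$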
Page 3
- the null ($H_0$) always contains the equality sign  LOS b - compare, contrast
$H_0: \mu = \mu_0$    $H_0: \mu \leq \mu_0$    $H_0: \mu \geq \mu_0$
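Test Statistic: (Step #2)  LOS c - explain
pop. $\sigma$ is known:
  $z = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$  (normally distributed)
pop. $\sigma$ is unknown:
  $t = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}}$  (t-distributed, df = $n - 1$)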
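The two test statistics above can be sketched in a few lines of Python. This is only an illustration: the function names and the input values (7% sample mean, 6% hypothesized mean, 4% standard deviation, 16 observations) are hypothetical and do not come from the curriculum examples.

```python
from math import sqrt

def z_statistic(sample_mean, mu0, sigma, n):
    """z = (X-bar - mu0) / (sigma / sqrt(n)); population sigma is known."""
    return (sample_mean - mu0) / (sigma / sqrt(n))

def t_statistic(sample_mean, mu0, s, n):
    """t = (X-bar - mu0) / (s / sqrt(n)); sigma unknown, df = n - 1."""
    return (sample_mean - mu0) / (s / sqrt(n)), n - 1

# Hypothetical inputs (assumptions for illustration only)
z = z_statistic(0.07, 0.06, 0.04, 16)
t, df = t_statistic(0.07, 0.06, 0.04, 16)
print(round(z, 4), round(t, 4), df)
```

The formulas are identical in form; what changes is whether the denominator uses the population or the sample standard deviation, and therefore which distribution supplies the critical value.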
Page 4
Step 3: Specify the Level of Significance  LOS c - explain
- level of sig. depends on the seriousness of making a mistake

                        $H_0$ = true                 $H_0$ = false
fail to reject $H_0$    Correct ($1 - \alpha$)       Type II error ($\beta$)
                        confidence level
reject $H_0$            Type I error ($\alpha$)      Correct ($1 - \beta$)
                        level of sig.                Power of a test

- as $\alpha \downarrow$, $\beta \uparrow$
- only way to decrease both is to increase $n$:
  $t = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}}$: as $n \uparrow$, denom. $\downarrow$, t-stat $\uparrow$
e.g.: $H_0$: not pregnant
      $H_a$: pregnant
∝= % ∝= %
or T.INV(p,df)
lower limit: $\bar{X} - t_c \cdot s/\sqrt{n}$    upper limit: $\bar{X} + t_c \cdot s/\sqrt{n}$    ($t_c$ = critical value)
Page 7
P-value - the area in the probability distribution outside the calculated test-statistic  LOS e - explain, interpret
- for a two-sided test, combine the probabilities under the curve in both tails
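As a rough sketch of the p-value idea for a z-test, the tail areas can be computed with the standard normal cdf from Python's `statistics.NormalDist`. The function name `p_value_z` and the `tail` argument are assumptions made for this illustration.

```python
from statistics import NormalDist

def p_value_z(z, tail="two-sided"):
    """Area outside the calculated z test statistic."""
    cdf = NormalDist().cdf  # standard normal: mean 0, s.d. 1
    if tail == "two-sided":
        # combine the probabilities in both tails
        return 2 * (1 - cdf(abs(z)))
    if tail == "right":
        return 1 - cdf(z)
    if tail == "left":
        return cdf(z)
    raise ValueError("tail must be 'two-sided', 'right' or 'left'")

print(round(p_value_z(1.96), 4))           # two-sided
print(round(p_value_z(1.96, "right"), 4))  # right tail only
```

For a given test statistic, the two-sided p-value is twice the one-sided p-value, which is why a result can be significant in a one-sided test but not in a two-sided test at the same level of significance.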
Page 8
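if p-value < $\alpha$, reject $H_0$  LOS e - explain, interpret
[Figure: two-tailed distribution with $\alpha/2$ in each tail and $1 - \alpha$ in the center; the area beyond the test-statistic in each tail is p/2 < $\alpha/2$, so p-value < $\alpha$]
example #5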
LOS f - describe
Rejecting $H_0$ is a positive event (support for $H_a$)
Rejecting a true $H_0$ is thus a false positive
Page 9
rank all p-values from low to high LOS f
starting at the lowest: - describe
LOS g - identify, interpret
Tests concerning a single mean/
- sample size: $n < 30$ vs. $n \geq 30$; population approx. normal; $\sigma$ known or unknown
- $\sigma$ unknown: this is typically the case
- recall CLT - sampling distribution of means will be approx. normal with large $n$ regardless of pop. dist.
∴ test statistic = $t = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}}$
Page 10
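test-stat., $\sigma$ known:  $z = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$   LOS g - identify, interpret
alternative, $\sigma$ unknown but large $n$:  $z = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}}$
- theoretically correct to use $t = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}}$ when $\sigma$ is unknown
Decision Rule: if |test-stat| > critical value, reject $H_0$ (2-sided)
= NORM.S.DIST(p)
= T.DIST(p,df)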
Page 11
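Differences between means - independent samples  LOS h - identify, interpret
pop. 1: $X_1 \sim \mu_1$ and $\sigma_1^2$
pop. 2: $X_2 \sim \mu_2$ and $\sigma_2^2$   (independent)
Q: Are $\bar{X}_1$ and $\bar{X}_2$ from the same population (i.e. $\mu_1 = \mu_2$) or from different populations (i.e. $\mu_1 \neq \mu_2$)?
2-sided:         $H_0: \mu_1 - \mu_2 = 0$      or/  $H_0: \mu_1 = \mu_2$
                 $H_a: \mu_1 - \mu_2 \neq 0$        $H_a: \mu_1 \neq \mu_2$
1-sided - right: $H_0: \mu_1 - \mu_2 \leq 0$   or/  $H_0: \mu_1 \leq \mu_2$
                 $H_a: \mu_1 - \mu_2 > 0$           $H_a: \mu_1 > \mu_2$
1-sided - left:  $H_0: \mu_1 - \mu_2 \geq 0$   or/  $H_0: \mu_1 \geq \mu_2$
                 $H_a: \mu_1 - \mu_2 < 0$           $H_a: \mu_1 < \mu_2$
Assumption: $\sigma_1^2 = \sigma_2^2$
test statistic: $t = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}$, df = $n_1 + n_2 - 2$
$s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$   Ex #9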
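A minimal sketch of the pooled-variance t-test for independent samples follows. It assumes equal population variances, as the notes do; the function name and the sample figures are hypothetical.

```python
from math import sqrt

def pooled_t(mean1, s1, n1, mean2, s2, n2, hypothesized_diff=0.0):
    """t-test for a difference in means, assuming equal variances."""
    # pooled estimator of the common variance
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    std_err = sqrt(sp2 / n1 + sp2 / n2)
    t = ((mean1 - mean2) - hypothesized_diff) / std_err
    return t, n1 + n2 - 2, sp2

# Hypothetical samples: means 5 and 3, both s.d. 2, both n = 8
t, df, sp2 = pooled_t(5.0, 2.0, 8, 3.0, 2.0, 8)
print(t, df, sp2)
```

With equal sample standard deviations the pooled variance simply equals that common variance, which makes the example easy to check by hand.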
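$\bar{d} = \dfrac{1}{n}\sum_{i=1}^{n} d_i$ and $s_{\bar{d}} = s_d/\sqrt{n}$, where $n$ = # of paired observations
$H_0: \mu_d = \mu_{d0}$      $H_0: \mu_d \leq \mu_{d0}$   $H_0: \mu_d \geq \mu_{d0}$
$H_a: \mu_d \neq \mu_{d0}$   $H_a: \mu_d > \mu_{d0}$      $H_a: \mu_d < \mu_{d0}$
test-statistic: $t = \dfrac{\bar{d} - \mu_{d0}}{s_d/\sqrt{n}}$, df = $n - 1$   Example 10, 11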
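The paired (dependent samples) test can be sketched as below: form the differences, then run a single-mean t-test on them. The function name and the two short data series are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(sample_a, sample_b, mu_d0=0.0):
    """t = (d-bar - mu_d0) / (s_d / sqrt(n)), df = n - 1."""
    d = [a - b for a, b in zip(sample_a, sample_b)]  # paired differences
    n = len(d)
    d_bar = mean(d)
    s_d = stdev(d)  # sample s.d. of the differences (n - 1 divisor)
    return (d_bar - mu_d0) / (s_d / sqrt(n)), n - 1

# Hypothetical paired observations
t, df = paired_t([3, 5, 7, 9], [1, 2, 3, 4])
print(round(t, 4), df)
```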
Page 13
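LOS j - identify, interpret
Tests of Variances
1/ Single Variance - independent observations from a normally distributed pop.
- chi-square tests sensitive to violations of these assumptions
- $\chi^2$ distribution is not symmetrical, ∴ critical values are also not symmetrical
2 sided:         $H_0: \sigma^2 = \sigma_0^2$      $H_a: \sigma^2 \neq \sigma_0^2$
1 sided: left    $H_0: \sigma^2 \geq \sigma_0^2$   $H_a: \sigma^2 < \sigma_0^2$
         right   $H_0: \sigma^2 \leq \sigma_0^2$   $H_a: \sigma^2 > \sigma_0^2$
critical values: = CHISQ.INV(lower p,df) lower
                 = CHISQ.INV(upper p,df) upper
test-statistic = $\chi^2 = \dfrac{(n - 1)s^2}{\sigma_0^2}$, df = $n - 1$   example #12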
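A short sketch of the single-variance test statistic is given below. The function name and inputs are hypothetical, and the critical values would still come from a chi-square table or a spreadsheet function, since the Python standard library has no inverse chi-square.

```python
def chi_square_variance_stat(sample_var, hypothesized_var, n):
    """chi-square = (n - 1) * s^2 / sigma_0^2, with df = n - 1."""
    return (n - 1) * sample_var / hypothesized_var, n - 1

# Hypothetical inputs: s^2 = 16, sigma_0^2 = 12, n = 25
chi2, df = chi_square_variance_stat(16.0, 12.0, 25)
print(chi2, df)
```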
Page 14
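Tests of Variances (e.g. compare volatility of 2 funds)  LOS j - identify, interpret
2/ Equality of 2 Variances
2-sided:       $H_0: \sigma_1^2 = \sigma_2^2$      $H_a: \sigma_1^2 \neq \sigma_2^2$
right-tailed:  $H_0: \sigma_1^2 \leq \sigma_2^2$   $H_a: \sigma_1^2 > \sigma_2^2$
left-tailed:   $H_0: \sigma_1^2 \geq \sigma_2^2$   $H_a: \sigma_1^2 < \sigma_2^2$
or/
2-sided:       $H_0: \sigma_1^2/\sigma_2^2 = 1$      $H_a: \sigma_1^2/\sigma_2^2 \neq 1$
right-tailed:  $H_0: \sigma_1^2/\sigma_2^2 \leq 1$   $H_a: \sigma_1^2/\sigma_2^2 > 1$
left-tailed:   $H_0: \sigma_1^2/\sigma_2^2 \geq 1$   $H_a: \sigma_1^2/\sigma_2^2 < 1$
test-statistic: $F = s_1^2/s_2^2$, df$_1$ = $n_1 - 1$, df$_2$ = $n_2 - 1$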
Page 15
Parametric Testing vs. Non-parametric testing  LOS k - compare, contrast, describe
Parametric: sample stats used to test pop. parameters; distributional assumptions
Non-parametric: no parameters tested / no distributional assumptions
( or for or )
Non-parametric tests are used:
1/ when data do not meet distributional assumptions, i.e. $n < 30$, pop. is non-normal
2/ when there are outliers - test of median instead of mean
3/ when data are given in ranks or use an ordinal scale (NOIR: nominal, ordinal, interval, ratio; nominal/ordinal = ordered categorical)
4/ when the hypotheses do not concern a parameter, e.g. Is a sample random?
Page 16
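Tests of Correlation  LOS L - explain, determine
1/ Parametric test
2-sided:            $H_0: \rho = 0$      $H_a: \rho \neq 0$
one-sided: left     $H_0: \rho \geq 0$   $H_a: \rho < 0$
           right    $H_0: \rho \leq 0$   $H_a: \rho > 0$
test-statistic: $t = \dfrac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$, df = $n - 2$
- in testing $H_0$, as $n \uparrow$, $H_0$ is rejected for even small correlations
- with big data sets, almost any $r$ will be significant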
e.g./ =. . √ ,
, = ~ vs. critical = .
= , √ −.
ex #16
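The point that a large $n$ makes even small correlations significant can be seen with a short sketch of the parametric test statistic. The function name and the chosen values of $r$ and $n$ are hypothetical.

```python
from math import sqrt

def correlation_t(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with df = n - 2."""
    return r * sqrt(n - 2) / sqrt(1 - r**2), n - 2

# Same small correlation, increasing sample size (hypothetical values)
for n in (10, 100, 1000):
    t, df = correlation_t(0.10, n)
    print(n, df, round(t, 3))
```

Holding $r$ fixed, the t-statistic grows roughly with $\sqrt{n}$, so it eventually exceeds any fixed critical value.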
Page 17
Tests of Correlation LOS L
2/ Non-parametric test - explain
- if normality assumption for or violated, or - determine
Page 18
Tests of Correlation/ LOS L
2/ Non-parametric test - explain
2/ On original data set (pre-ranked): - determine
LOS m
Tests of Independence/
- explain
- test if classification types are independent
e.g./
Are growth stocks equally likely to be any size or
are they more likely to be large-cap stocks?
Page 19
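Tests of Independence/ Contingency Table (2-way)  LOS m - explain
- non-parametric test of indep. (right-tailed)
$\chi^2 = \sum_{i=1}^{m} \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}}$, df = (r-1)(c-1)
m = # of cells (3 x 3 = 9)
$O_{ij}$ = observed value in each cell
$E_{ij}$ = expected value in each cell
$E_{ij} = \dfrac{(\text{total row } i) \times (\text{total column } j)}{\text{overall total}}$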
=( × )/ = .
=( × )/ = .
.˙.
− ( − . )
= =.
.
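The expected-value and chi-square calculations for a contingency table can be sketched as follows. The 2 x 2 table of counts is hypothetical and is not the table from the curriculum example; the function name is also an assumption.

```python
def chi_square_independence(observed):
    """Chi-square statistic and df for a 2-way contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o_ij in enumerate(row):
            # expected count if the two classifications are independent
            e_ij = row_totals[i] * col_totals[j] / total
            chi2 += (o_ij - e_ij) ** 2 / e_ij
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df

# Hypothetical observed counts
chi2, df = chi_square_independence([[10, 20], [30, 40]])
print(round(chi2, 4), df)
```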
Page 20
Tests of Independence/  LOS m - explain
$H_0$: size and type are independent
$H_a$: size and type are not independent
−
= .
e.g./ − .
= = .
√ .
more than
− . example #18
= = . expected if
√ . independent
a. describe a simple linear regression model and the roles of the dependent and
independent variables in the model;
c. explain the assumptions underlying the simple linear regression model, and
describe how residuals and residual plots indicate if these assumptions may
have been violated;
g. calculate and interpret the predicted value for the dependent variable, and a
prediction interval for it, given an estimated linear regression model and a
value for the independent variable;
Page 1
- Simple Linear Regression (LR) - one IV  LOS a - describe
DV - dependent variable - $Y$ - the variable we are seeking to explain
IV - independent variable - $X$ - the explanatory variable
LR assumes a linear relationship between the DV and the IV
Page 2
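regression - compute a line of best fit that minimizes the sum of the squared deviations (residuals) between the observed values of $Y$ and the predicted values (the regression line)  LOS b - describe
i.e. min $\sum_{i=1}^{n} (Y_i - \hat{b}_0 - \hat{b}_1 X_i)^2$ = SSE - sum of squares error (a.k.a. residual sum of squares)
$\hat{Y}_i$ = predicted values of DV ($Y$)
$e_i = Y_i - \hat{Y}_i$
- Note: $e = Y - \hat{Y}$ implies the residual is in the same units of measurement as the DV ($Y$)
- $(\bar{X}, \bar{Y})$ lies on the regression line, so $\hat{b}_0 = \bar{Y} - \hat{b}_1 \bar{X}$
$\hat{b}_1 = \dfrac{Cov(X,Y)}{Var(X)} = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})/(n-1)}{\sum_{i=1}^{n} (X_i - \bar{X})^2/(n-1)} = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$
- the $(n - 1)$ cancels out
- denominator can never be negative, ∴ sign of $\hat{b}_1$ is determined solely by $Cov(X,Y)$
- if $Cov(X,Y) > 0$, $\hat{b}_1 > 0$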
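The slope and intercept formulas can be sketched directly from the sums of cross products and squared deviations. The function name and the data points are hypothetical; the data were chosen to lie exactly on a line so the result is easy to verify.

```python
from statistics import mean

def ols_fit(x, y):
    """Least-squares intercept and slope for a simple linear regression."""
    x_bar, y_bar = mean(x), mean(y)
    # numerator and denominator of Cov(X, Y) / Var(X); the (n - 1) cancels
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar  # forces the line through (x_bar, y_bar)
    return b0, b1

# Hypothetical data lying exactly on Y = 1 + 2X
b0, b1 = ols_fit([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1)
```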
Page 3
Interpreting $\hat{b}_0$ and $\hat{b}_1$  LOS b - describe
$\hat{b}_0$ = $Y$ if $X = 0$ - only makes sense if the IV has meaning at $X = 0$
$\hat{b}_1$ = the change in $Y$ for a one unit change in $X$
Data/
cross-sectional - many observations on $X$ & $Y$ for the same time period
time-series - many observations on (and sometimes ) from
different time periods
example #2/3
Page 4
Assumptions/  LOS c - explain, describe
1/ Linearity - the relationship between $X$ & $Y$ is linear in the parameters, and neither $b_0$ nor $b_1$ is multiplied or divided by another regression parameter
- implies the IV must not be random - if so, there would be no linear relation between $X$ & $Y$
Page 5
Assumptions/ LOS c
4/ Normality - $\varepsilon$ is normally distributed - explain
- required to conduct valid tests of the - describe
LOS d - calculate, interpret
Analysis of Variance/
Total sum of squares (SST) - total variation in $Y$: $\sum_{i=1}^{n} (Y_i - \bar{Y})^2$
Page 6
Analysis of Variance/  LOS d - calculate, interpret
Coefficient of Determination ($R^2$) - measures the fraction of the total variation in the DV that is explained by the IV (goodness of fit measure)
- if only 1 IV, square the correlation between IV and DV
$R^2$ = explained variation in $Y$ / total variation in $Y$ = $\sum (\hat{Y}_i - \bar{Y})^2 / \sum (Y_i - \bar{Y})^2$
Page 7
Analysis of Variance/  LOS d - calculate, interpret
$F = \dfrac{MSR}{MSE} = \dfrac{SSR/k}{SSE/(n - (k + 1))}$   (MSR, MSE = mean squares)
df1 = $k$ = # of slope coefficients
df2 = $n - k - 1$
$k + 1$ = # of regression coefficients
LOS e
ANOVA table/ - describe
- calculate
- interpret
=
√ =
Page 8
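Standard Error of the Estimate (SEE) - a measure of the s.d. of the residual term ($\varepsilon$)  LOS e - describe, calculate, interpret
SEE = $\sqrt{MSE} = \sqrt{\dfrac{SSE}{n - k - 1}} = \sqrt{\dfrac{\sum (Y_i - \hat{Y}_i)^2}{n - 2}}$   (SLR: $k = 1$)
critical F = F.INV(.95,1,4)
= 7.71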
=√ .
Example #5
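The ANOVA quantities for a simple linear regression (SST, SSR, SSE, $R^2$, F and SEE) can be sketched together as below. The function name and the five data points are hypothetical and are not taken from Example #5.

```python
from math import sqrt
from statistics import mean

def anova_slr(x, y):
    """ANOVA decomposition for a simple linear regression (k = 1)."""
    n, k = len(x), 1
    x_bar, y_bar = mean(x), mean(y)
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained
    msr, mse = ssr / k, sse / (n - k - 1)
    return {"SST": sst, "SSR": ssr, "SSE": sse,
            "R2": ssr / sst, "F": msr / mse, "SEE": sqrt(mse)}

# Hypothetical data
result = anova_slr([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print({name: round(value, 4) for name, value in result.items()})
```

The F-statistic from this sketch would be compared with the critical F for (1, n - 2) degrees of freedom.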
Page 9
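1/ Hypothesis Tests of $b_1$:  LOS f - formulate, determine
test statistic: $t = \dfrac{\hat{b}_1 - B_1}{s_{\hat{b}_1}}$, where $B_1$ = hypothesized value
df = $n - (k + 1)$
standard error of $\hat{b}_1$: $s_{\hat{b}_1} = \dfrac{SEE}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$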
= . ( − ) = .
e.g./
: =
at ∝ = %
: =∅
critical t = T.INV.2T(.05,4) = 2.776
−( + )
.
= √ = . =
. −
= . Reject
√ . .
( = , . = . )
Note: : =
√ − . √ SLR
: ≠ = = . Reject
√ − √ −. only
df = −
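As a rough illustration of the slope test, the sketch below computes the t-statistic for $H_0: b_1 = 0$ and also shows that, for a simple linear regression, it matches the correlation-based t-statistic. The function name and data are hypothetical.

```python
from math import sqrt
from statistics import mean

def slope_t_test(x, y, hypothesized_slope=0.0):
    """t-statistic for the slope in a simple linear regression."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    see = sqrt(sse / (n - 2))
    se_b1 = see / sqrt(s_xx)              # standard error of the slope
    t_slope = (b1 - hypothesized_slope) / se_b1
    r = s_xy / sqrt(s_xx * s_yy)          # sample correlation
    t_corr = r * sqrt(n - 2) / sqrt(1 - r**2)
    return t_slope, t_corr, n - 2

# Hypothetical data
t_slope, t_corr, df = slope_t_test([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(t_slope, 4), round(t_corr, 4), df)
```

The two statistics agree only when the hypothesized slope is zero and there is a single independent variable.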
Page 11
Level of Significance and p-values/ LOS f
- most software output ∝ = % , : parameter = 0 - formulate
recall: p-value = (1 - T.DIST(|t|,df,1)) x 2   example #6 - determine
LOS g
Prediction interval (or CI) for : - calculate
= + - interpret
estimated with error 2 sources of
− = estimated with error error
Page 12
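Prediction interval (or CI) for $Y$:  LOS g - calculate, interpret
$s_f^2 = SEE^2 \left[ 1 + \dfrac{1}{n} + \dfrac{(X_f - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right]$   (1 = $SEE^2$, 2 = $1/n$ term, 3 = $(X_f - \bar{X})^2$ term)
1. the better the fit of the regression model: lower SEE, lower $s_f$
2. larger $n$ = smaller $s_f$
3. the closer $X_f$ is to $\bar{X}$, the smaller $s_f$
Steps/ Determine $\hat{Y}_f$
Select $\alpha$
Determine $t_c$
Determine $s_f$
Determine $\hat{Y}_f$ +/− $t_c \cdot s_f$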
example #7
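The prediction-interval steps can be sketched as below. The critical t-value is passed in as an argument because the Python standard library has no inverse t-distribution; the value 3.182 (two-sided 5%, df = 3) is taken from a standard t-table. The function name and data are hypothetical.

```python
from math import sqrt
from statistics import mean

def prediction_interval(x, y, x_f, t_crit):
    """Predicted Y and prediction interval for a given X_f (SLR)."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    see_sq = sse / (n - 2)
    y_f = b0 + b1 * x_f
    # standard error of the forecast: SEE^2 * [1 + 1/n + (X_f - X-bar)^2 / Sxx]
    s_f = sqrt(see_sq * (1 + 1 / n + (x_f - x_bar) ** 2 / s_xx))
    return y_f, s_f, (y_f - t_crit * s_f, y_f + t_crit * s_f)

# Hypothetical data; t_crit = 3.182 assumes alpha = 5%, two-sided, df = 3
y_f, s_f, interval = prediction_interval(
    [1, 2, 3, 4, 5], [2, 4, 5, 4, 5], x_f=6, t_crit=3.182)
print(round(y_f, 4), round(s_f, 4), tuple(round(v, 4) for v in interval))
```

Moving `x_f` closer to the mean of `x` shrinks the third term and narrows the interval.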
Page 13
1/ Log - lin model  LOS h - describe
$Y = e^{b_0 + b_1 X}$ (e.g. Revenues) - take the log of both sides
$\ln Y = b_0 + b_1 X$   ($b_1$ = growth rate of revenue)
relative change in $Y$ for absolute change in $X$
Page 14
3/ Log - Log model  $\ln Y = b_0 + b_1 \ln(X)$  LOS h - describe
the relative change in $Y$ for a relative change in $X$
- exh. #37/38
F-stat
SEE ( )
= =
( + ) moving backwards ( + ) ( + ⁄ )
Review - 2
e or e
rt rn
- continuous compounding EAR = −
EAR = +
Review - 3
PV of an annuity/
− −
( + ) ( + )
= = +
= - PV of a perpetuity
Review - 1
LOS a - identify/compare
Review - 2
LOS b - describe/
one dimensional array - column or row of a spreadsheet
two dimensional rectangular array - two or more variables
(data table) rows x columns
LOS c - interpret/
frequency distribution one way table
- # of obs./variable
absolute frequency - actual count max - min
relative frequency - %’age of obs./variable
- for numerical data place each obs. in an interval (range/k)
non-overlapping
absolute
cumulative adds up frequencies
relative
Review - 3
LOS d - interpret/
Contingency table - for categorical data
joint frequencies (r x c entry)
marginal frequencies (r or c totals)
Applications/
Confusion matrix - assess precision of a classification model
T T
- prediction vs. actual
F F
test potential association between 2 variables
( )
- chi-square test of independence =∑
LOS e - describe/evaluate/
Histogram and frequency polygon
- represents distribution of numerical data
Bar chart represents distribution of categorical data
(pareto chart, grouped/clustered bar chart, stacked bar chart)
Review - 4
LOS e - describe/evaluate/
Tree Map - set of coloured rectangles to represent groups
- nested rectangles = more categories
Word Cloud - frequency of unstructured data (text)
Line Chart - typically for time series data (trend analysis)
Scatter Plot - visualize joint variation in 2 numerical variables
Heat Map - contingency table with colour-coded cells
LOS f - describe/
Relationship Comparison
over time
Scatter Heat among categories line chart (2 vars)
Plot Map bar chart bubble line
(2 vars) multiple tree map chart (3 vars)
vars
Scatter Plot Heat Map
Matrix
Review - 5
LOS f - describe/ numerical data histogram
Distribution frequency polygon
cumulative distribution chart
unstructured data categorical data
word cloud bar chart, tree map, heat map
Do not: select an improper chart type
select data that favours a conclusion
truncate or extend the range of an axis
LOS g - calculate/interpret/
measures of central tendency
1/ Arithmetic mean: $\bar{X} = \dfrac{\sum_{i=1}^{n} X_i}{n}$ ⇒ $\sum (X_i - \bar{X}) = 0$
sensitive to outliers - do nothing
- delete - trimmed mean (5% trimmed - delete top
& bottom 2.5%)
- replace - winsorized mean
95% winsorized replace all top/bottom 2.5% of obs.
≤ and = −
forecasting next period returns use
" over multiple periods use
LOS h - evaluate/
× ≈ and
> >
one-period minimize
return compounding outliers
Visualization
Box and whisker plot
upper fence (upper bound) = Q3 + (1.5 x IQR)
Review - 9
LOS j - calculate/interpret/
measures of absolute dispersion
1/ Range max value - min value
- only uses 2 observations
- no information about the distribution
$s_{Target} = \sqrt{\sum_{\forall X_i \leq B} \dfrac{(X_i - B)^2}{n - 1}}$ - measure of dispersion below the target $B$ ($n$ = full dataset N)
LOS m - interpret
Kurtosis:
∑( − )
= −
Review - 11
LOS n - interpret
Covariance - joint variability of 2 random variables
$s_{XY} = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$
Correlation: $-1 \leq r \leq +1$
r = 0 no linear association - max diversification
r = -1 perfect negative correlation - perfect hedge
r = +1 perfect positive correlation - perfect replication
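A small sketch of sample covariance and correlation, written out from the sums rather than relying on a library routine, is shown below. The function name and data are hypothetical; the second series is an exact multiple of the first, so the correlation should be +1.

```python
from math import sqrt
from statistics import mean

def sample_cov_corr(x, y):
    """Sample covariance (n - 1 divisor) and correlation of two series."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    cov = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    s_x = sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
    s_y = sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))
    return cov, cov / (s_x * s_y)

# Hypothetical series: Y is exactly 2 x X
cov, r = sample_cov_corr([1, 2, 3], [2, 4, 6])
print(cov, round(r, 6))
```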
Probability Concepts
Review - 1
LOS a - define/
Random variable - a quantity whose future outcomes are uncertain
Outcome - a possible value of a random variable (RP)
(4.3%)
Event - a specified set of outcomes (> %, ≤ %)
LOS b - identify/compare/contrast/
Property 1: $0 \leq P(E_i) \leq 1$
Property 2: $\sum P(E_i) = 1$, where the events $E_i$ are i) mutually exclusive and ii) exhaustive
LOS d - calculate/interpret/
$P(A)$ - unconditional probability
$P(A|B)$ - conditional probability
$P(AB)$ - joint probability (A ∧ B)
$P(A|B) = \dfrac{P(AB)}{P(B)}$
LOS e - demonstrate/
Multiplication Rule: $P(AB) = P(A|B) \times P(B)$
Addition Rule: $P(A \text{ or } B) = P(A) + P(B) - P(AB)$
Review - 3
LOS f - compare/contrast/
Independent event - 2 events are independent iff:
$P(A|B) = P(A)$ ∴ $P(AB) = P(A)P(B)$
Dependent event - $P(A)$ is related to $P(B)$
LOS g - calculate/interpret/
Total Probability Rule: $P(A) = P(A|S_1)P(S_1) + … + P(A|S_n)P(S_n)$
( )= ( )=
unconditional ( | )
( | ) ( )
probability
( ) multiplication
( ) ( )
( ) rule
+
( )
+
( ) ( )
( )
( ) ( ) ( )
conditional
total probability rule
probability
( )= ( ) − ( ) squared deviation
probability
LOS i - explain/
value of Xi
Conditional expected value: ( | )= ( | )
probability
LOS j - interpret/ ( | ) X1
= ( | ) + ( )
demonstrate/
B ( | )
X2 = ( ) + ( )
A ( | ) X3
C ( | ) = ( ) ( | )+ ( ) ( )
X4
1/ ( )= ( ) and =
2/ ( )= cross product of
the deviations
∑ − −
=
−
LOS m - calculate/interpret
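Bayes’ Formula - a method for updating prior probabilities based on new information
$P(A|B) = \dfrac{P(B|A)}{P(B)} \, P(A)$
i.e. updated probability of the event given the new information = [probability of the new information given the event / unconditional probability of the new information] x prior probability of the event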
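Bayes' formula combined with the total probability rule can be sketched as follows. The three scenarios, their prior probabilities, and the likelihoods of the new information are hypothetical values chosen only for illustration.

```python
def bayes_update(priors, likelihoods):
    """Posterior P(S_i | info) from priors P(S_i) and P(info | S_i)."""
    # total probability rule: P(info) = sum of P(info | S_i) * P(S_i)
    p_info = sum(p * l for p, l in zip(priors, likelihoods))
    posteriors = [p * l / p_info for p, l in zip(priors, likelihoods)]
    return p_info, posteriors

# Hypothetical scenarios (mutually exclusive and exhaustive)
priors = [0.45, 0.30, 0.25]       # P(S_i)
likelihoods = [0.75, 0.20, 0.05]  # P(info | S_i)
p_info, posteriors = bayes_update(priors, likelihoods)
print(round(p_info, 4), [round(p, 4) for p in posteriors])
```

The posteriors sum to 1 because the scenarios are assumed to be mutually exclusive and exhaustive.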
LOS n - identify/analyze/ P
Review - 7
1/ Multiplication (2 x 4 x 3)
D F
industries
sizes
3/ Multinomial: $\dfrac{n!}{n_1! \, n_2! \dots n_k!}$ - the number of ways $n$ objects can be labelled w/ $k$ labels ($k \geq 3$), each of which has $n_i$ objects
4/ Combination: $_nC_r = \dfrac{n!}{(n - r)! \, r!}$ - the number of ways of selecting $r$ objects from $n$ where order does not matter ($k = 2$)
5/ Permutation: $_nP_r = \dfrac{n!}{(n - r)!}$
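The counting formulas map directly onto functions in Python's `math` module (`math.comb` and `math.perm` require Python 3.8+). The `multinomial` helper and the example counts are hypothetical.

```python
from math import comb, factorial, perm

def multinomial(n, group_sizes):
    """n! / (n_1! n_2! ... n_k!); the group sizes must sum to n."""
    if sum(group_sizes) != n:
        raise ValueError("group sizes must sum to n")
    ways = factorial(n)
    for size in group_sizes:
        ways //= factorial(size)
    return ways

print(2 * 4 * 3)                   # multiplication rule: 2 x 4 x 3 choices
print(multinomial(6, [3, 2, 1]))   # 6 objects, 3 labels
print(comb(5, 2))                  # order does not matter
print(perm(5, 2))                  # order matters
```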
LOS b - calculate/interpret/
cumulative distribution function (cdf) - gives the probability that a random variable $X$ is less than or equal to a particular value $x$
$F(x) = P(X \leq x)$
LOS d - describe/calculate/interpret
Continuous uniform distribution: $f(x) = \dfrac{1}{b - a}$ for all $a \leq x \leq b$
$F(x) = \dfrac{x - a}{b - a}$
LOS e - describe/calculate/interpret
Bernoulli random variable - the outcome of any binary trial
$p$ = success, $(1 - p)$ = failure
- in $n$ trials, the number of successes is a binomial random variable (# of successes in $n$ Bernoulli trials)
assumptions: $p$ is constant and trials are independent
$X \sim B(n, p)$ - described by 2 parameters
Review - 3
LOS e - describe/calculate/interpret
$p(x) = \dfrac{n!}{(n - x)! \, x!} \, p^x (1 - p)^{n - x}$ - if p = .50, the distribution will be symmetrical
- for the cdf: $F(x) = P(X \leq x) = p(x) + p(x - 1) + \dots + p(0)$
             mean     variance
Bernoulli    $p$      $p(1 - p)$
Binomial     $np$     $np(1 - p)$
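A brief sketch of the binomial probability function, its cdf, and its mean and variance is given below. The function names and the parameters ($n = 4$, $p = 0.5$) are hypothetical.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_cdf(x, n, p):
    """P(X <= x) = p(0) + p(1) + ... + p(x)."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))

n, p = 4, 0.5  # hypothetical parameters
print(binomial_pmf(2, n, p))   # probability of exactly 2 successes
print(binomial_cdf(2, n, p))   # probability of 2 or fewer successes
print(n * p, n * p * (1 - p))  # mean and variance
```

With $p = 0.5$ the probabilities are symmetrical around the mean, so $p(1) = p(3)$ and $p(0) = p(4)$.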
Review - 4
LOS g - explain/contrast/
- multivariate normal distribution completely described by
3 lists of variables
1/ all the means ( )
2/ all the variances ( )
3/ all pairwise correlations - $n(n - 1)/2$ unique corrs.
− =1
= for all in a sample/data set
0
LOS L - explain/
- commonly used to model prob. dist. of asset prices
right
- a variable Y follows a lognormal dist. if
skewed
LN(Y) is normally dist.
0 ∞
- described by the and of its associated normal distribution
LOS m - calculate/interpret/
( / )= + but ( / )= continuously compounded return
normally distributed
Review - 6
LOS m - calculate/interpret/
$r_{0,T} \sim N(\mu T, \sigma^2 T)$ - both $\mu$ & $\sigma^2$ scale linearly with time
- assumptions: returns are 1/ independent
2/ identically distributed ($\mu$ & $\sigma^2$ do not change)
Volatility the annualized sd of the continuously
compounded daily returns of the underlying asset
LOS n - describe/calculate/interpret/
Review - 7
F-distribution ratio of / −
= df =
2 variables / − − , −
Review - 8
LOS p - describe/
B/ Non-probability sampling
5/ Convenience sampling - observations that are easy to
obtain or are accessible
6/ Judgmental sampling - select observations based on
experience and knowledge
samples may not be representative
LOS g, h contrast/calculate/interpret/
Confidence Intervals: Point estimate +/- Reliability Factor x Standard
Error
Review - 4
LOS g, h contrast/calculate/interpret/
                                        small sample ($n < 30$)    large sample ($n \geq 30$)
normal dist., $\sigma^2$ known          z                          z
normal dist., $\sigma^2$ unknown        t                          t or z (practice uses t)
non-normal dist., $\sigma^2$ known      N/A                        z
non-normal dist., $\sigma^2$ unknown    N/A                        t or z
LOS i - describe/
Resampling - repeatedly drawing samples from a given sample
Review - 5
LOS i - describe/
1/ Bootstrap method - can also find the standard error of an estimator even when no analytical formula is available
LOS j describe/
1/ Data snooping bias (aka data mining) - searching a data set
for statistically significant patterns
- typically will not be theory-driven
- will lack an economic rationale
Review - 6
LOS j describe/
minimize: training data validation test out-of-sample
set data set data test to evaluate
evaluate fit
model fit
build and fit
a model and fine tune
3/ Look ahead bias - using information that was not available on the
observation date
4/ Time period bias - results of one time period may be specific to that
time period
Hypothesis Testing
Review - 2
LOS c - explain/
test statistic a calculated value of a distribution to compare
to a critical value of the distribution in order
to test
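                  $H_0$ true                                $H_0$ false
negative (DNR)    Correct ($1 - \alpha$)                    Type II ($\beta$) (false negative)
positive (R)      Type I ($\alpha$) (false positive)        Correct ($1 - \beta$) Power of a test
- as $\alpha \downarrow$, $\beta \uparrow$
- to decrease both, increase $n$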
LOS d - explain/determine/
Reject $H_0$: 2 sided      |test-stat| > critical value
              left-side    test-stat < critical value
              right-side   test-stat > critical value
Review - 3
LOS d - explain/determine/
Review - 4
LOS f - describe/
- multiple testing problem - in repeated tests with a level of significance $\alpha$, expect a proportion $\alpha$ of the tests to be false positives
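$z = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$    $t = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}}$   or   $z = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}}$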
Review - 5
LOS h - identify/interpret/
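- Differences between means - independent samples
$\sigma_1^2 = \sigma_2^2$ assumed
$t = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}$    $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$    df = $n_1 + n_2 - 2$
$H_0: \mu_1 = \mu_2$      $H_0: \mu_1 \leq \mu_2$   $H_0: \mu_1 \geq \mu_2$
$H_a: \mu_1 \neq \mu_2$   $H_a: \mu_1 > \mu_2$      $H_a: \mu_1 < \mu_2$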
LOS i - identify/interpret/
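- Differences between means - dependent samples
- arrange data in pairs, calculate $d_i = (X_{Ai} - X_{Bi})$, $\bar{d} = \dfrac{\sum d_i}{n}$
$t = \dfrac{\bar{d} - \mu_{d0}}{s_d/\sqrt{n}}$, df = $n - 1$
                              left                          right
$H_0: \mu_d = \mu_{d0}$       $\mu_d \geq \mu_{d0}$         $\mu_d \leq \mu_{d0}$
$H_a: \mu_d \neq \mu_{d0}$    $\mu_d < \mu_{d0}$            $\mu_d > \mu_{d0}$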
Review - 6
LOS j - identify/interpret/
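                                                            left                              right
Single Variance          $H_0: \sigma^2 = \sigma_0^2$       $\sigma^2 \geq \sigma_0^2$        $\sigma^2 \leq \sigma_0^2$
                         $H_a: \sigma^2 \neq \sigma_0^2$    $\sigma^2 < \sigma_0^2$           $\sigma^2 > \sigma_0^2$
$\chi^2 = \dfrac{(n - 1)s^2}{\sigma_0^2}$, df = $n - 1$; note: must determine each critical value
                                                            left                              right
Equality of 2 Variances  $H_0: \sigma_1^2 = \sigma_2^2$     $\sigma_1^2 \geq \sigma_2^2$      $\sigma_1^2 \leq \sigma_2^2$
                         $H_a: \sigma_1^2 \neq \sigma_2^2$  $\sigma_1^2 < \sigma_2^2$         $\sigma_1^2 > \sigma_2^2$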
Review - 7
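LOS L - explain/determine
                                                        left              right
Test of correlation - parametric test  $H_0: \rho = 0$      $\rho \geq 0$     $\rho \leq 0$
                                       $H_a: \rho \neq 0$   $\rho < 0$        $\rho > 0$
$t = \dfrac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$, df = $n - 2$ (Pearson or bivariate correlation)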
Review - 8
LOS m - explain/ Tests of Independence
non-parametric test: $\chi^2 = \sum \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}}$, df = $(r - 1)(c - 1)$
intercept slope
regression coefficients
Review - 2
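LOS b - describe/ $\hat{b}_0 = \bar{Y} - \hat{b}_1 \bar{X}$
$\hat{b}_1 = \dfrac{Cov(X,Y)}{Var(X)}$ - denominator can never be neg. ∴ sign of $\hat{b}_1$ determined by $Cov(X,Y)$: if $Cov(X,Y) > 0$, $\hat{b}_1 > 0$
$\hat{b}_1$ = the change in $Y$ for a 1 unit change in $X$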
LOS c - explain/describe/
1/ Linearity - relationship between $X$ and $Y$ is linear in the parameters
- IV must not be random, i.e. ( | )
2/ Homoskedasticity - $Var(\varepsilon)$ is constant for all observations
3/ Independence - $(X, Y)$ pairs are independent of each other
- $\varepsilon$ is uncorrelated across observations
Review - 3
LOS c - explain/describe/
4/ Normality - $\varepsilon$ is normally distributed
LOS d - calculate/interpret/
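SST - Total sum of squares: $\sum (Y_i - \bar{Y})^2 = \sum (Y_i - \hat{Y}_i)^2 + \sum (\hat{Y}_i - \bar{Y})^2$
SSE - sum of squared errors: $\sum (Y_i - \hat{Y}_i)^2$
SSR - regression sum of squares: $\sum (\hat{Y}_i - \bar{Y})^2$
Coefficient of Determination = $R^2 = \dfrac{SSR}{SST}$ - measures fit
F-test: $F = \dfrac{MSR}{MSE} = \dfrac{SSR/k}{SSE/(n - (k + 1))}$   $H_0: b_1 = 0$, $H_a: b_1 \neq 0$
($k$ = # of slope coeff.; $k + 1$ = # of reg. coeff.)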
LOS e - describe/calculate/interpret/
Review - 4
LOS e - describe/calculate/interpret/
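SEE = $\sqrt{\dfrac{\sum (Y_i - \hat{Y}_i)^2}{n - k - 1}}$ - the smaller the SEE, the more accurate the regression
LOS f - formulate/determine/
$t = \dfrac{\hat{b}_1 - B_1}{s_{\hat{b}_1}}$, $s_{\hat{b}_1} = \dfrac{SEE}{\sqrt{\sum (X_i - \bar{X})^2}}$    $H_0: b_1 = B_1$, $H_a: b_1 \neq B_1$
$t = \dfrac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$ - produces the same result.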
− = +
= ∑ ( − )
Review - 5
LOS g - calculate/interpret/
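$\hat{Y}_f = \hat{b}_0 + \hat{b}_1 X_f$ - 2 sources of error
$\hat{Y}_f$ +⁄− $t_c \cdot s_f$, where $s_f^2 = SEE^2 \left[ 1 + \dfrac{1}{n} + \dfrac{(X_f - \bar{X})^2}{\sum (X_i - \bar{X})^2} \right]$   (terms 1, 2, 3)
more narrow CI: lower SEE / higher $n$ / closer $X_f$ is to $\bar{X}$
LOS h - describe
1/ Log - lin model: $\ln Y = b_0 + b_1 X$
relative change in $Y$ for absolute change in $X$
2/ Lin - log model: $Y = b_0 + b_1 \ln(X)$
absolute change in $Y$ for relative change in $X$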