
Review of Statistics

Topics
  Descriptive Statistics: mean, variance
  Probability: union events, joint events
  Random Variables: discrete and continuous; distributions, moments
  Two Random Variables: covariance and correlation
  Central Limit Theorem
  Hypothesis Testing: z-test, p-value
  Simple Linear Regression

Statistical Methods
  Descriptive Statistics
  Inferential Statistics

Descriptive Statistics
  Involves
    Collecting data
    Presenting data
    Characterizing data
  Purpose
    Describe data

[Example: bar chart of quarterly values (1st-4th Qtr) for East, West, and North]

Inferential Statistics
  Involves
    Estimation
    Hypothesis testing
  Purpose
    Make decisions about population characteristics

Descriptive Statistics

Mean
  Measure of central tendency
  Acts as the balance point of the data
  Affected by extreme values (outliers)
  Formula:

    X̄ = (1/n) Σ_{i=1}^{n} X_i = (X_1 + X_2 + … + X_n) / n

Median
  Measure of central tendency
  Middle value in the ordered sequence:
    if n is odd, the middle value of the sequence;
    if n is even, the average of the 2 middle values
  The value that splits the distribution into two halves
  Not affected by extreme values

Median (Example)
  Raw data: 17 16 21 18 13 16 12 11
  Ordered:  11 12 13 16 16 17 18 21
  Position:  1  2  3  4  5  6  7  8

  Median = (16 + 16) / 2 = 16

Mode
  Measure of central tendency
  The value that occurs most often
  Not affected by extreme values
  There may be several modes

  Raw data: 17 16 21 18 13 16 12 11
  Ordered:  11 12 13 16 16 17 18 21
  Mode = 16
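As a quick check, a minimal sketch using Python's standard library on the slide's example data:

```python
import statistics

data = [17, 16, 21, 18, 13, 16, 12, 11]

print(statistics.mean(data))    # 15.5
print(statistics.median(data))  # 16.0, the average of the two middle values
print(statistics.mode(data))    # 16, the value that occurs most often
```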

Sample Variance

  S² = Σ_{i=1}^{n} (X_i − X̄)² / (n − 1)
     = [ (X_1 − X̄)² + (X_2 − X̄)² + … + (X_n − X̄)² ] / (n − 1)

  Note the n − 1 in the denominator! (Use n for the population variance.)

Sample Standard Deviation

  S = √S² = √[ Σ_{i=1}^{n} (X_i − X̄)² / (n − 1) ]
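The same quantities in code; statistics.variance and statistics.stdev use the n − 1 denominator, matching the sample formulas above:

```python
import statistics

data = [17, 16, 21, 18, 13, 16, 12, 11]

print(statistics.variance(data))   # ≈ 11.14, divides by n - 1
print(statistics.stdev(data))      # ≈ 3.34, its square root
# statistics.pvariance(data) divides by n (population variance)
```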

Probability

Event, Sample Space
  Event: one possible outcome
  Sample space: the collection of all possible outcomes

  Probability of an outcome: the proportion of times the outcome occurs
  in the long run.
  The complement of event A contains all the outcomes that are not part
  of event A. Symbol: Ā.

Properties of Events
  1. Mutually exclusive: two outcomes that cannot occur at the same time.
  2. Collectively exhaustive: one outcome in the sample space must occur.

  Example experiment: observe the gender of one person.

Joint Events
  Joint event: an event that has two or more characteristics.
  A ∩ B means the intersection of event (set) A and event (set) B.
  Example: A and B (A ∩ B): female AND under age 20.

Compound Events
  Union of event A and event B (A ∪ B): the total area of the two circles.
  A ∪ B contains all the outcomes that are part of event (set) A, part
  of event (set) B, or part of both A and B.

Compound Probability: Addition Rule
  Used to get compound probabilities for unions of events:

    P(A or B) = P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

  For mutually exclusive events:

    P(A or B) = P(A ∪ B) = P(A) + P(B)

  [Venn diagram: two disjoint circles A and B]

Random Variables
  Random variable:
    a numerical summary of a random outcome;
    a function that assigns a numerical value to each simple event in a
    sample space.
  Discrete or continuous random variables:
    Discrete: only a discrete set of possible values
      => summarized by a probability distribution: a list of all possible
         values of the variable and the probability that each value occurs.
    Continuous: a continuum of possible values
      => summarized by the probability density function (pdf).

Discrete Probability Distribution
  1. A list of pairs [X_i, P(X_i)]:
       X_i = value of the random variable (outcome)
       P(X_i) = probability associated with that value
  2. Mutually exclusive (no overlap)
  3. Collectively exhaustive (nothing left out)
  4. 0 ≤ P(X_i) ≤ 1
  5. Σ P(X_i) = 1

Joint Probability Using a Contingency Table

              Event B1      Event B2      Total
  Event A1    P(A1 ∩ B1)    P(A1 ∩ B2)    P(A1)
  Event A2    P(A2 ∩ B1)    P(A2 ∩ B2)    P(A2)
  Total       P(B1)         P(B2)         1

  The cell entries are joint probabilities; the row and column totals
  are marginal probabilities.

  Joint distribution: P(A_i ∩ B_j)
  Marginal distributions: P(A_i), P(B_j)
  Conditional distribution, e.g. given B1:
    P(A1 | B1) = P(A1 ∩ B1) / P(B1),   P(A2 | B1) = P(A2 ∩ B1) / P(B1)

Contingency Table Example
  Joint event: draw 1 card; note its kind and color.

   Type      Red      Black    Total
   Ace       2/52     2/52      4/52
   Non-Ace  24/52    24/52     48/52
   Total    26/52    26/52     52/52

  P(Ace) = 4/52 and P(Red) = 26/52 are marginal probabilities;
  P(Ace AND Red) = 2/52 is a joint probability.
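A small sketch checking the addition rule and a conditional probability on this table with exact fractions:

```python
from fractions import Fraction

p_ace     = Fraction(4, 52)    # marginal
p_red     = Fraction(26, 52)   # marginal
p_ace_red = Fraction(2, 52)    # joint

# Addition rule: P(Ace or Red) = P(Ace) + P(Red) - P(Ace and Red)
print(p_ace + p_red - p_ace_red)   # 7/13  (i.e. 28/52)

# Conditional probability: P(Ace | Red) = P(Ace and Red) / P(Red)
print(p_ace_red / p_red)           # 1/13  (i.e. 2/26)
```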

Moments: Discrete Case
  Moment: a summary of a certain aspect of a distribution.
  Mean (expected value):
    the mean of the probability distribution, a weighted average of all
    possible values:
      μ = E(X) = Σ X_i P(X_i)
  Variance:
    the weighted average squared deviation about the mean:
      σ² = E[(X − μ)²] = Σ (X_i − μ)² P(X_i)
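A minimal sketch computing both moments straight from these definitions, using a fair die as an assumed example distribution:

```python
values = [1, 2, 3, 4, 5, 6]
probs  = [1 / 6] * 6

mu  = sum(x * p for x, p in zip(values, probs))              # E(X) = 3.5
var = sum((x - mu) ** 2 * p for x, p in zip(values, probs))  # ≈ 2.92

print(mu, var)
```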

Statistical Independence
When the outcome of one event (B) does not affect the
probability of occurrence of another event (A), the events A
and B are said to be statistically independent.
Example: toss a coin twice; the outcome of the first toss does not affect the second.
Condition for independence:
Two events A and B are statistically independent if and
only if (iff)
P(A | B) = P(A)

Bayes' Theorem and the Multiplication Rule

  Bayes' theorem:

    P(A | B) = P(A ∩ B) / P(B)

  The difficult part is P(A ∩ B). Rearranging the equation above gives
  the multiplication rule:

    P(A and B) = P(A ∩ B) = P(A) P(B | A) = P(B) P(A | B)

  For independent events:

    P(A and B) = P(A ∩ B) = P(A) P(B)
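A quick numeric check of Bayes' theorem using the card-table numbers from the earlier example:

```python
from fractions import Fraction

p_ace           = Fraction(4, 52)
p_red           = Fraction(26, 52)
p_ace_given_red = Fraction(2, 26)

# Bayes' theorem: P(Red | Ace) = P(Ace | Red) P(Red) / P(Ace)
p_red_given_ace = p_ace_given_red * p_red / p_ace
print(p_red_given_ace)   # 1/2: half of all aces are red
```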

Covariance
  Measures the joint variability of two random variables:

    σ_XY = Σ_{i=1}^{N} (X_i − μ_X)(Y_i − μ_Y) P(X_i, Y_i)

  Can take any value in the real numbers.
  Depends on the units of measurement (e.g., dollars, cents, billions
  of dollars).
  Example: a positive covariance means Y and X are positively related:
  when Y is above its mean, X tends to be above its mean; when Y is
  below its mean, X tends to be below its mean.

Correlation
  Standardized covariance; takes values in [−1, 1].
  Does not depend on the unit of measurement.
  Correlation coefficient (ρ):

    ρ = cov(X, Y) / (σ_X σ_Y) = σ_XY / (σ_X σ_Y)

  Covariance and correlation measure only linear dependence!
  Example: Cov(X, Y) = 0 does not necessarily imply that Y and X are
  independent; they may be non-linearly related.
  But if X and Y are jointly normally distributed and uncorrelated,
  then they are independent.

Sum of Two Random Variables
  Expected value of the sum of two random variables:

    E(X + Y) = E(X) + E(Y)

  Variance of the sum of two random variables:

    Var(X + Y) = σ²_{X+Y} = σ²_X + σ²_Y + 2σ_XY
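A minimal sketch, assuming NumPy is available: sample covariance and correlation for two simulated series, plus a check of the variance-of-a-sum identity:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 0.5 * x + rng.normal(size=1000)   # built to be positively related to x

cov_xy = np.cov(x, y)[0, 1]           # sample covariance (n - 1 denominator)
rho    = np.corrcoef(x, y)[0, 1]      # correlation, always in [-1, 1]

# Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) holds exactly in-sample
lhs = np.var(x + y, ddof=1)
rhs = np.var(x, ddof=1) + np.var(y, ddof=1) + 2 * cov_xy
print(cov_xy, rho, np.isclose(lhs, rhs))
```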

Continuous Probability Distributions: the Normal Distribution
  Bell-shaped, symmetrical.
  Mean, median, and mode are equal (all at the center of the curve).
  Infinite range.
  About 68% of the data are within 1 standard deviation of the mean;
  about 95% of the data are within 2 standard deviations of the mean.
  In the early 1800s the German mathematician and physicist Carl
  Friedrich Gauss used it to analyze astronomical data; it is therefore
  also known as the Gaussian distribution.

Normal Distribution: Probability Density Function

  f(X) = [1 / (σ√(2π))] e^( −(1/2)((X − μ)/σ)² )

  f(X) = density of the random variable X
  π ≈ 3.14159;  e ≈ 2.71828
  σ = population standard deviation
  X = value of the random variable (−∞ < X < ∞)
  μ = population mean
Effect of Varying Parameters (μ and σ)

  [Figure: three normal curves A, B, C with different means and
   standard deviations]

Normal Distribution: Probability
  Probability is the area under the curve!

    P(c ≤ X ≤ d) = ∫_c^d f(x) dx

Infinite Number of Normal Distribution Tables
  Normal distributions differ by mean and standard deviation.
  Each distribution would require its own table.
  That's an infinite number!

Standardize the Normal Distribution

  Z = (X − μ) / σ

  This maps any normal distribution to the standardized normal
  distribution with μ_Z = 0 and σ_Z = 1. One table!

Standardizing Example

  Z = (X − μ) / σ = (6.2 − 5) / 10 = 0.12

  Normal distribution: μ = 5, σ = 10, X = 6.2
  Standardized normal distribution: μ_Z = 0, σ_Z = 1, Z = 0.12

Moments: Mean, Variance (Continuous Case)
  Mean (expected value):
    the mean of the probability distribution, a weighted average of all
    possible values:
      μ = E(X) = ∫_{−∞}^{∞} X f(X) dX
  Variance:
    the weighted average squared deviation about the mean:
      σ² = E[(X − μ)²] = ∫_{−∞}^{∞} (X − μ)² f(X) dX

Moments: Skewness, Kurtosis

  Skewness:  S = E[(X − μ)³] / σ³
    Measures asymmetry in a distribution.
    The larger the absolute size of the skewness, the more asymmetric
    the distribution.
    A large positive value indicates a long right tail; a large negative
    value indicates a long left tail. A zero value indicates symmetry
    around the mean.

  Kurtosis:  K = E[(X − μ)⁴] / σ⁴
    Measures the thickness of the tails of a distribution.
    A kurtosis above three indicates fat tails (leptokurtosis) relative
    to the normal, i.e. extreme events are more likely to occur.
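A minimal sketch, assuming SciPy is available. Note that scipy.stats.kurtosis returns excess kurtosis (K − 3) by default, so fisher=False is passed to match the definition above:

```python
import numpy as np
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(1)
x = rng.normal(size=100_000)

print(skew(x))                     # ≈ 0 for a normal sample
print(kurtosis(x, fisher=False))   # ≈ 3 for a normal sample
```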

Central Limit Theorem: Basic Idea
  As the sample size gets large (n ≥ 30), the sample mean will have an
  approximately normal distribution, whatever the shape of the
  population distribution.
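A small simulation sketch (NumPy and SciPy assumed): means of n = 30 draws from a decidedly non-normal uniform distribution already look close to normal:

```python
import numpy as np
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(2)
# 10,000 samples of size n = 30 from Uniform(0, 1), reduced to their means
sample_means = rng.uniform(0, 1, size=(10_000, 30)).mean(axis=1)

print(sample_means.mean())                   # ≈ 0.5, the population mean
print(skew(sample_means))                    # ≈ 0
print(kurtosis(sample_means, fisher=False))  # ≈ 3, i.e. near-normal
```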

Important Continuous Distributions
  All derived from the normal distribution:
    χ² distribution: arises from sums of squared normal random variables
    t distribution: arises from ratios of normal and χ² variables
    F distribution: arises from ratios of χ² variables

  [Figures: χ² distribution; t distribution (red) vs. normal
   distribution (blue); F distribution]

Fundamentals of Hypothesis Testing

Identifying Hypotheses
  1. Start from a question, e.g. test whether the population mean
     equals 3.
  2. State the question statistically (H0: μ = 3).
  3. State its opposite statistically (H1: μ ≠ 3).
     Hypotheses are mutually exclusive & exhaustive.
     Sometimes it is easier to form the alternative hypothesis first.
  4. Choose the level of significance α.
     Typical values are 0.01, 0.05, 0.10.
     Rejection region of the sampling distribution: the values of the
     sample statistic that are unlikely if the null hypothesis is true.

Identifying Hypotheses: Examples
  1. Is the population average amount of TV viewing 12 hours?
       H0: μ = 12    H1: μ ≠ 12
  2. Is the population average amount of TV viewing different from
     12 hours?
       H0: μ = 12    H1: μ ≠ 12

Hypothesis Testing: Basic Idea
  It is unlikely that we would get a sample mean of this value if in
  fact this were the population mean; therefore, we reject the null
  hypothesis that μ = 50.

  [Figure: sampling distribution centered at μ = 50 under H0, with a
   sample mean of 20 far out in the tail]

Example: Z-Test Statistic (σ known)
  1. Convert the sample statistic (e.g., X̄) to a standardized Z
     variable:

       Z = (X̄ − μ_X̄) / σ_X̄,   where σ_X̄ = σ / √n

  2. Compare to the critical Z values:
     if the Z-test statistic falls in the critical region, reject H0;
     otherwise, do not reject H0.
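A minimal sketch of a two-sided z-test with known σ, reusing the TV-viewing hypotheses (H0: μ = 12) with assumed illustrative sample values; SciPy is used only for the normal tail probability:

```python
import math
from scipy.stats import norm

mu0, sigma, n, x_bar = 12.0, 3.0, 36, 13.2   # assumed sample values

z = (x_bar - mu0) / (sigma / math.sqrt(n))   # standardized test statistic
p_value = 2 * norm.sf(abs(z))                # two-sided p-value

print(z)        # 2.4
print(p_value)  # ≈ 0.016 < 0.05 => reject H0 at the 5% level
```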

p-value
  The probability of obtaining a test statistic at least as extreme as
  (≤ or ≥) the actual sample value, given that H0 is true.
  The smallest value of α for which H0 can be rejected.
  Used to make the rejection decision:
    if p-value ≥ α, do not reject H0;
    if p-value < α, reject H0.

One-Tailed Test: Rejection Region

  H0: μ ≥ μ0, H1: μ < μ0:  reject H0 in the left tail; the sample mean
  must be significantly below μ0.

  H0: μ ≤ μ0, H1: μ > μ0:  reject H0 in the right tail; here small
  values don't contradict H0.

One-Tailed Z Test: Finding Critical Z Values
  What is Z given α = 0.025?
  The body of the table gives the area between 0 and Z:
  .500 − .025 = .475.

  Standardized normal probability table (portion):

         .05     .06     .07
   1.6  .4505   .4515   .4525
   1.7  .4599   .4608   .4616
   1.8  .4678   .4686   .4693
   1.9  .4744   .4750   .4756

  The area .4750 corresponds to Z = 1.96, the critical value.

Two-Tailed Test: Rejection Regions

  H0: μ = μ0    H1: μ ≠ μ0

  [Figure: sampling distribution under H0 with a rejection region of
   area α/2 in each tail, a nonrejection region of area 1 − α (the
   level of confidence) in the middle, and critical values at the
   boundaries]

t-test, F-test
  The test statistic may not be normally distributed, so the z-test is
  not applicable. Examples:
    The variance is unknown but estimated, e.g. the hypothesis that the
    slope of a regression line differs significantly from zero.
    => t-test
    The hypothesis that the standard deviations of two normally
    distributed populations are equal.
    => F-test

Jarque-Bera Test
  Assesses whether a given sample of data is normally distributed.
  Aggregates the information in the data about both skewness and
  kurtosis: a test of the hypothesis that S = 0 and K = 3, based on the
  sample estimates Ŝ and K̂.

  Test statistic:

    JB = (T/6) [ Ŝ² + (K̂ − 3)² / 4 ]

  (here T is the number of observations)

  Under the null hypothesis of independent normally distributed
  observations, the Jarque-Bera statistic is distributed in large
  samples as a χ² random variable with 2 degrees of freedom.
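A minimal sketch, assuming SciPy is available, applying the test to a simulated normal sample:

```python
import numpy as np
from scipy.stats import jarque_bera

rng = np.random.default_rng(3)
x = rng.normal(size=5000)

stat, p = jarque_bera(x)
print(stat, p)   # small statistic, large p-value => do not reject normality
```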

Simple Linear Regression

Simple Linear Regression Model

  y_t = β0 + β1 x_t + ε_t

  β0 = y-intercept
  β1 = slope
  ε_t = random iid error, ε_t ~ (0, σ²)
  y_t = dependent (response) variable
  x_t = independent (explanatory) variable

Linear Regression Assumptions
  1. x is exogenously determined.
  2. ε_t are iid(0, σ²)
     (iid = independently and identically distributed):
       zero mean;
       independence of errors (no autocorrelation);
       constant variance (homoscedasticity).
  More things to think about:
    normality of ε_t (if not satisfied, inference procedures are only
    asymptotically valid);
    model specification (e.g. linearity; is β1 constant over time?).

Simple Linear Regression Model

  y_t = β0 + β1 x_t + ε_t,    E[y | x*] = β0 + β1 x*

  [Figure: population regression line with one observed value (x, y);
   the disturbance ε_t is its vertical distance from the line]

Sample Linear Regression Model

  y_i = b0 + b1 x_i + e_i,    ŷ_i = b0 + b1 x_i

  e_i = random error (residual)

  [Figure: fitted sample regression line through the observed values,
   with an unsampled observation off the line]

Ordinary Least Squares

  OLS minimizes the sum of squared residuals (y_t − ŷ_t):

    min_{β0, β1} Σ_{t=1}^{T} (y_t − β0 − β1 x_t)² = Σ_{t=1}^{T} e_t²

  [Figure: observations y_t = β0 + β1 x_t + ε_t scattered around the
   fitted line ŷ_t = b0 + b1 x_t, with residuals e_1 … e_4 drawn as
   vertical distances; ŷ_t is the fitted value (in-sample forecast)]
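A minimal OLS sketch from first principles on simulated data (NumPy assumed), using the textbook formulas b1 = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and b0 = ȳ − b1 x̄, and previewing the r² defined below:

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.uniform(0, 10, size=100)
y = 2.0 + 0.7 * x + rng.normal(size=100)   # assumed true line plus noise

x_bar, y_bar = x.mean(), y.mean()
b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
b0 = y_bar - b1 * x_bar

y_hat = b0 + b1 * x                 # fitted values
sse = np.sum((y - y_hat) ** 2)      # unexplained variation
sst = np.sum((y - y_bar) ** 2)      # total variation
r2  = 1 - sse / sst                 # coefficient of determination

print(b0, b1, r2)   # estimates near 2.0 and 0.7; r² close to 1
```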

On Thursday: Evaluating the Model  (ŷ_t = b0 + b1 x_t)
  1. Examine variation measures:
     coefficient of determination (goodness of fit);
     standard error of the estimate.
  2. Analyze the residuals e:
     serial correlation.
  3. Test the coefficients for significance.

Random Error Variation
  1. The variation of the actual Y from the predicted Ŷ.
  2. Measured by the standard error of the estimate:
     the sample standard deviation of e, denoted S_YX.
  3. Affects several factors:
     parameter significance and prediction accuracy.

Measures of Variation in Regression
  1. Total Sum of Squares (SST):
     measures the variation of the observed Y_i around the mean Ȳ.
  2. Explained variation (SSR):
     variation due to the relationship between X and Y.
  3. Unexplained variation (SSE):
     variation due to other factors.

Variation Measures

  Total sum of squares:        SST = Σ (Y_i − Ȳ)²
  Unexplained sum of squares:  SSE = Σ (Y_i − Ŷ_i)²
  Explained sum of squares:    SSR = Σ (Ŷ_i − Ȳ)²

  [Figure: the three deviations shown at one point (X_i, Y_i) around
   the fitted line Ŷ_i = b0 + b1 X_i]

Coefficient of Determination
  The proportion of variation explained by the relationship between
  X and Y:  0 ≤ r² ≤ 1

    r² = Explained Variation / Total Variation = SSR / SST

  Computational form:

    r² = [ b0 Σ Y_i + b1 Σ X_i Y_i − n Ȳ² ] / [ Σ Y_i² − n Ȳ² ]

Coefficients of Determination (r²) and Correlation (r)

  [Figures: scatter plots with fitted lines Ŷ_i = b0 + b1 X_i for four
   cases: r² = 1, r = +1;  r² = 1, r = −1;  r² = .8, r = +0.9;
   r² = 0, r = 0]

Standard Error of the Estimate

  S_YX = √[ Σ_{i=1}^{n} (Y_i − Ŷ_i)² / (n − 2) ]

  Computational form:

    S_YX = √[ ( Σ Y_i² − b0 Σ Y_i − b1 Σ X_i Y_i ) / (n − 2) ]

Residual Analysis
  1. Graphical analysis of residuals:
     plot the residuals vs. the X_i values.
     Residuals are the errors: the difference between the actual Y_i
     and the predicted Ŷ_i.
  2. Purposes:
     examine the functional form (linear vs. non-linear model);
     evaluate violations of assumptions.

Test of Slope Coefficient for Significance
  1. Tests whether there is a linear relationship between X and Y.
  2. Hypotheses:
       H0: β1 = 0 (no linear relationship)
       H1: β1 ≠ 0 (linear relationship)
  3. Test statistic (n − 2 degrees of freedom):

       t = (b1 − β1) / S_b1,
       where S_b1 = S_YX / √( Σ_{i=1}^{n} X_i² − n X̄² )
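A minimal sketch, assuming SciPy is available: scipy.stats.linregress reports the slope, its standard error, and the two-sided p-value for H0: β1 = 0 directly:

```python
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(5)
x = rng.uniform(0, 10, size=100)
y = 2.0 + 0.7 * x + rng.normal(size=100)   # assumed true line plus noise

res = linregress(x, y)
print(res.slope, res.stderr)   # b1 and S_b1
print(res.pvalue)              # tiny p-value => reject H0: beta1 = 0
```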
