
24

26
27 26, 30, 24, 32, 32, 31, 27 and 29?
29
30
31 4.5 4
32 2
32 5
37 4
5
2
6
2.333333

0.4 0.2
0.5 0.2375
CHAPTER 6: Discrete Probability Distributions
A discrete random variable has a countable number of distinct values.
PDF
CDF + ax0.99=25
E(X) = µ = Σ x·P(x)

UNIFORM DISTRIBUTION is one of the simplest discrete models.


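It describes a random variable with a finite number of consecutive integer values from a to b.
a = lower limit, b = upper limit
PDF: P(x) = 1/(b - a + 1)
CDF: P(X <= x) = (x - a + 1)/(b - a + 1)
mean = µ = (a + b)/2
std = sqrt(((b - a + 1)^2 - 1)/12)
Example: a = 20, b = 60 -> mean = 40, std = 11.83216
Outputs: PDF, CDF (fewer), CDF (more)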
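The uniform formulas above can be checked with a short sketch. The query point x = 29 is an assumption (the sheet does not show which x it used); it was picked because it reproduces the sheet's CDF value of 0.2439024.

```python
import math

def discrete_uniform(a: int, b: int, x: int):
    """Return (pmf, cdf, mean, std) for a discrete uniform on the integers a..b."""
    k = b - a + 1                          # number of equally likely values
    pmf = 1 / k if a <= x <= b else 0.0    # P(X = x)
    cdf = min(max(x - a + 1, 0), k) / k    # P(X <= x)
    mean = (a + b) / 2
    std = math.sqrt((k**2 - 1) / 12)
    return pmf, cdf, mean, std

# a = 20, b = 60 come from the sheet; x = 29 is an assumed query point
pmf, cdf, mean, std = discrete_uniform(20, 60, 29)
print(pmf, cdf, 1 - cdf, mean, std)
```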

BINOMIAL DISTRIBUTION describes the number of successes in a fixed number of independent trials,
where each trial has only two possible outcomes: success or failure.
P(0)+P(1) = (1-p)+p=1

p = probability of success; n = number of trials
mean = µ = n·p
std = σ = sqrt(n·p·(1 - p))
Example: p = 0.9, n = 200, x = 6 -> mean = 180, std = 4.242641
Outputs: PDF, CDF (fewer), CDF (more)

Skewed right if π < 0.5
Skewed left if π > 0.5
Symmetric if π = 0.5
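As a rough check of the binomial example (n = 200, p = 0.9, x = 6), the sketch below computes the PDF, the cumulative probability, the mean, and the standard deviation directly from the definition. The function names are illustrative, not part of the sheet.

```python
import math

def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for a binomial with n trials and success probability p."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x), summed term by term."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

n, p, x = 200, 0.9, 6
mean = n * p
std = math.sqrt(n * p * (1 - p))
pmf = binom_pmf(x, n, p)
cdf_fewer = binom_cdf(x, n, p)
skew = "right" if p < 0.5 else "left" if p > 0.5 else "symmetric"
print(mean, std, pmf, cdf_fewer, 1 - cdf_fewer, skew)
```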
APPROXIMATION "BINOMIAL & POISSON": applies if n >= 20 and π <= 0.05
n = 500, p = 0.003, λ = n·p = 1.5
PDF = 0.2231, CDF (fewer) = 0.2231, CDF (more) = 0.7769
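To see how close the approximation is for n = 500 and p = 0.003, the following sketch compares the exact binomial P(X = 0) with the Poisson value at λ = n·p. Using x = 0 is an assumption: it is the value that reproduces the sheet's PDF of 0.2231.

```python
import math

n, p = 500, 0.003
lam = n * p                            # Poisson rate used for the approximation

rule_ok = n >= 20 and p <= 0.05        # rule of thumb quoted in the notes

binom_p0 = (1 - p) ** n                # exact binomial P(X = 0)
pois_p0 = math.exp(-lam)               # Poisson P(X = 0)

print(rule_ok, binom_p0, pois_p0, abs(binom_p0 - pois_p0))
```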

APPROXIMATION "BINOMIAL & HYPERGEOMETRIC": use the binomial with π = s/N if n/N < 0.05
Inputs: N, s, n, x; p = s/N; check n/N


Uniform example (a = 20, b = 60): PDF = 0.0243902, CDF (fewer) = 0.2439024, CDF (more) = 0.7560976


=𝒔/𝑵

𝒔𝒕𝒅=√(𝒏.𝒑𝒊(𝟏−𝒑𝒊))
Binomial example (n = 200, p = 0.9, x = 6): PDF = 4.38E-184, CDF (fewer) = 4.395E-184, CDF (more) = 1
POISSON DISTRIBUTION describes the number of times an event occurs in a fixed interval of time
or space, given the average rate of occurrence and assuming that the events occur independently of each other.
λ = 1 / (time needed for one event)
x = number of events; mean = λ; std = σ = sqrt(λ)
Example: x = 2, λ = 1.8 -> std = 1.341641
PDF = 0.267784, CDF (fewer) = 0.730621, CDF (more) = 0.269379
Always right-skewed; the larger λ is, the less steep the curve.
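A minimal sketch of the Poisson calculations for the example above (λ = 1.8, x = 2). The helper names are illustrative.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) = e^(-lam) * lam^x / x!"""
    return math.exp(-lam) * lam**x / math.factorial(x)

def poisson_cdf(x: int, lam: float) -> float:
    """P(X <= x), summed term by term."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

lam, x = 1.8, 2
pmf = poisson_pmf(x, lam)
cdf_fewer = poisson_cdf(x, lam)
cdf_more = 1 - cdf_fewer
std = math.sqrt(lam)
print(pmf, cdf_fewer, cdf_more, std)
```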

HYPERGEOMETRIC DISTRIBUTION (items are drawn one after another)
describes the probability of obtaining a certain number of successes when the events being studied are non-independent.
N: population size
n: sample size
s: number of successes in the population
x: number of successes in the sample
π = s/N
std = sqrt(n·π·(1 - π)) · sqrt((N - n)/(N - 1))
Example: N = 9, n = 5, s = 4, x = 4 -> π = 0.444, std = 0.786
PDF = 0.040, CDF (fewer) = 1.000, CDF (more) = 0.000
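The hypergeometric example (N = 9, n = 5, s = 4, x = 4) can be reproduced with the sketch below, which uses the standard counting formula for the PDF and the standard deviation formula with the finite population factor. Function and variable names are illustrative.

```python
import math

def hypergeom_pmf(x: int, N: int, n: int, s: int) -> float:
    """P(X = x): x successes in a sample of n drawn without replacement
    from a population of N items containing s successes."""
    return math.comb(s, x) * math.comb(N - s, n - x) / math.comb(N, n)

N, n, s, x = 9, 5, 4, 4
pi = s / N
pmf = hypergeom_pmf(x, N, n, s)
cdf_fewer = sum(hypergeom_pmf(k, N, n, s) for k in range(x + 1))
std = math.sqrt(n * pi * (1 - pi)) * math.sqrt((N - n) / (N - 1))
print(round(pi, 3), round(pmf, 3), round(cdf_fewer, 3), round(std, 3))
```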

GEOMETRIC DISTRIBUTION models the number of trials required to achieve the first success in a sequence of independent trials,
where the probability of success is constant across all trials.
p: probability of success
mean = µ = 1/p
std = σ = sqrt((1 - p)/p^2)
Example: p = 0.5, x = 10 -> mean = 2, std = 1.4142136
PDF = 0.00098, CDF (fewer) = 0.99902, CDF (more) = 0.000977
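A small sketch for the geometric example (p = 0.5, x = 10), where x counts the trial on which the first success occurs. Names are illustrative.

```python
import math

def geom_pmf(x: int, p: float) -> float:
    """P(first success occurs on trial x)."""
    return p * (1 - p) ** (x - 1)

def geom_cdf(x: int, p: float) -> float:
    """P(first success occurs on or before trial x)."""
    return 1 - (1 - p) ** x

p, x = 0.5, 10
pmf = geom_pmf(x, p)
cdf_fewer = geom_cdf(x, p)
mean = 1 / p
std = math.sqrt((1 - p) / p**2)
print(pmf, cdf_fewer, 1 - cdf_fewer, mean, std)
```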




Columns: x, P(x), x·P(x), E(X) = µ, x - µ, (x - µ)^2, (x - µ)^2·P(x), var, std
CHAPTER 7: CONTINUOUS PROBABILITY DISTRIBUTIONS

APPROXIMATION Normal ≈ Binomial (continuity correction)

Binomial P(X >= 18) -> Normal P(X > 17.5)
Binomial P(X <= 18) -> Normal P(X < 18.5)

APPROXIMATION Normal ≈ Poisson

If λ >= 10, use a normal distribution with µ = λ and σ = sqrt(λ).
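UNIFORM CONTINUOUS DISTRIBUTION describes a continuous random variable
that is equally likely to take on any value within a specified interval.
The PDF has constant height; the CDF increases linearly to 1.
P(X >= x) and P(X > x) are the same.
U(a, b): a = lower limit, b = upper limit
PDF: f(x) = 1/(b - a)
CDF: P(X <= x) = (x - a)/(b - a)
mean = µ = (a + b)/2
std = sqrt((b - a)^2/12)
Example: a = 10, b = 16, x = 13 -> mean = 13, std = 1.732050808

P(c < X < d) = (d - c)/(b - a)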
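The continuous uniform formulas can be sketched as follows for U(10, 16) with x = 13. The interval endpoints c = 11 and d = 14 are hypothetical values chosen only to illustrate P(c < X < d).

```python
import math

def uniform_stats(a: float, b: float, x: float):
    """Return (pdf, cdf, mean, std) for a continuous uniform U(a, b)."""
    pdf = 1 / (b - a) if a <= x <= b else 0.0
    cdf = min(max((x - a) / (b - a), 0.0), 1.0)
    mean = (a + b) / 2
    std = math.sqrt((b - a) ** 2 / 12)
    return pdf, cdf, mean, std

def uniform_between(a: float, b: float, c: float, d: float) -> float:
    """P(c < X < d) for a <= c <= d <= b."""
    return (d - c) / (b - a)

pdf, cdf, mean, std = uniform_stats(10, 16, 13)
p_between = uniform_between(10, 16, 11, 14)   # c, d are hypothetical
print(pdf, cdf, mean, std, p_between)
```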

NORMAL DISTRIBUTION
Bell-shaped / Symmetric / Mesokurtic
Mean, median, and mode are all equal and are located at the center of the curve.
z = (x - µ)/σ
PDF (Casio)
µ = 14, σ = 3, x = 25 -> z = 3.667

STANDARD NORMAL DISTRIBUTION


z = (x - µ)/σ
µ = 7000, σ = 420, x = 6000
PDF
z = -1.72
CDF: NORM.S.DIST(z, 1)
"abc was 2.7 std above the mean" -> z = 2.7
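The z transformation and the standard normal CDF can be sketched with Python's `statistics.NormalDist`, which plays the role of NORM.S.DIST(z, 1) here. The inputs are the ones listed in the sheet (z = -1.72, and µ = 14, σ = 3, x = 25 from the normal example); the helper name `z_score` is illustrative.

```python
from statistics import NormalDist

def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

std_normal = NormalDist()              # mean 0, standard deviation 1

cdf_fewer = std_normal.cdf(-1.72)      # P(Z <= -1.72)
cdf_more = 1 - cdf_fewer               # P(Z > -1.72)

z = z_score(25, 14, 3)                 # normal example: mu = 14, sigma = 3, x = 25
cdf_x = std_normal.cdf(z)              # P(X <= 25)
print(cdf_fewer, cdf_more, z, cdf_x)
```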

EXPONENTIAL DISTRIBUTION
median = ln(2)/λ
1/λ = time needed for one event
mean = 1/λ; std = σ = 1/λ
PDF: f(x) = λ·e^(-λx)
CDF: P(X <= x) = 1 - e^(-λx)
Example: mean rate λ = 3.60, x = 0.5 -> std = 0.277778

Note: 7.61 / 305
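A short sketch of the exponential calculations with rate λ = 3.60 and x = 0.5, plus the interval probability for c = 2 and d = 4 that appears in the sheet's results. Function names are illustrative.

```python
import math

def expon_cdf(x: float, lam: float) -> float:
    """P(X <= x) for an exponential with rate lam."""
    return 1 - math.exp(-lam * x)

lam, x = 3.60, 0.5
cdf_fewer = expon_cdf(x, lam)                        # P(X <= 0.5)
cdf_more = 1 - cdf_fewer                             # P(X > 0.5)
between = expon_cdf(4, lam) - expon_cdf(2, lam)      # P(2 < X < 4)
mean = 1 / lam                                       # mean waiting time
std = 1 / lam                                        # equals the mean
median = math.log(2) / lam
print(cdf_fewer, cdf_more, between, mean, std, median)
```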



PDF -9.9375
CDF(few) 0.5
CDF 0.5
P(c<x<d) 166.6667 c 3000 d 4000


CDF(few) 0.999877 inv 25,4,3


CDF(more) 25

Standard normal (z = -1.72): CDF (fewer) = 0.042716, CDF (more) = 0.957284
between cd 3.921E-55 c 225 d 450

Exponential (λ = 3.60, x = 0.5): CDF (fewer) = 0.834701, CDF (more) = 0.165299
P(c < X < d) = 0.000746 for c = 2, d = 4
CHAPTER 8: SAMPLING DISTRIBUTION

CI
90%
95%
99%
alpha
0.1

alpha
0.01

Margin error
75

alpha
0.1

Margin error
0.5

margin of error E

std propor
CHECK NORMAL?
CHECK NORMAL?
THIS TABLE IS FOR Z ONLY; FOR T, LOOK UP APPENDIX D
CENTRAL LIMIT THEOREM
The distribution of the sample mean x̄ approaches a normal distribution with mean µ and standard deviation σ/√n as the sample size increases.
Expected range of Sample Means

alpha z_(α/2) (note: for Z, use α/2)


0.1 -1.644854 0.024998
0.05 -1.959964
0.01 -2.575829
µ    σ     n   z          Std error  lower     upper
25   1.25  16  -1.644854  0.3125     24.48598  25.51402
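The expected range of sample means follows from µ ± z·σ/√n. The sketch below reproduces the row above (µ = 25, σ = 1.25, n = 16, 90% range); the function name is illustrative, and `NormalDist().inv_cdf` stands in for a z-table lookup.

```python
import math
from statistics import NormalDist

def expected_range(mu: float, sigma: float, n: int, confidence: float):
    """Range that should contain the sample mean with the given probability."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # positive z for alpha/2 in each tail
    std_error = sigma / math.sqrt(n)
    return mu - z * std_error, mu + z * std_error, std_error

lower, upper, std_error = expected_range(25, 1.25, 16, 0.90)
print(std_error, lower, upper)
```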

x- s n z_a/2 Std error upper lower alpha x-


36.4 14.5 40 2.576 2.292 42.30419 30.49581 0.05 45.66
width width Margin error t
alpha s z_a/2 n 5.904192 -5.904192 4
0.01 300 -2.575829 106.1583

METHODS TO ESTIMATE SIGMA (for mean)


M2: Assume uniform population
M3: Assume normal population
M4: Poisson arrivals

METHODS TO ESTIMATE SIGMA (for p)
M1: Assume that π = 0.5
M2: if π is far from 0.5, use p in place of π
M3
x   n   p     z_(α/2)    std error  MOE        lower     upper
12  25  0.48  -1.644854  0.09992    -0.164354  0.315646  0.644354
width: -0.164354 / 0.164354
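The proportion interval in the row above (x = 12, n = 25, 90% confidence) can be sketched as p ± z·sqrt(p(1 - p)/n). The function name is illustrative.

```python
import math
from statistics import NormalDist

def proportion_ci(x: int, n: int, confidence: float):
    """z-based confidence interval for a population proportion."""
    p = x / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    std_error = math.sqrt(p * (1 - p) / n)
    moe = z * std_error                       # margin of error (half-width)
    return p, std_error, moe, p - moe, p + moe

p, std_error, moe, lower, upper = proportion_ci(12, 25, 0.90)
print(p, std_error, moe, lower, upper)
```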
Sample size determination for a proportion
alpha  p         z_(α/2)    margin of error E  n
0.05   0.789474  -1.959964  0.5                2.553878
The width of the confidence interval for π depends on: sample size, confidence level, sample proportion p.
margin of error E: 0.09992
E varies inversely with n: to reduce E, increase the sample size.
n = 250, π or p = 0.06 -> std of proportion = 0.01502
CHECK NORMAL? Yes (TRUE)
Finite population correction: use it when n/N is greater than 5%
N = 1000, n = 90 -> FPCF = 0.954417


x̄      s      n   t         df  estimated std error  lower      upper
45.66  27.79  21  2.085963  20  6.06427516965823     33.010144  58.30986
width: -12.649856 / 12.64986

Sigma estimates (for mean):
M2 (uniform population): σ = √[(b - a)^2/12]
M3 (normal population): σ = (b - a)/6
M4 (Poisson arrivals): σ = √λ
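The three planning estimates of σ can be written as one-line helpers. The range a = 10, b = 16 and the rate λ = 1.8 are hypothetical inputs reused from earlier examples purely for illustration.

```python
import math

def sigma_uniform(a: float, b: float) -> float:
    """Assume a uniform population on [a, b]."""
    return math.sqrt((b - a) ** 2 / 12)

def sigma_normal(a: float, b: float) -> float:
    """Assume a normal population whose range b - a spans about six sigma."""
    return (b - a) / 6

def sigma_poisson(lam: float) -> float:
    """Assume Poisson arrivals with mean rate lam."""
    return math.sqrt(lam)

# Hypothetical inputs, for illustration only
s_uniform = sigma_uniform(10, 16)
s_normal = sigma_normal(10, 16)
s_poisson = sigma_poisson(1.8)
print(s_uniform, s_normal, s_poisson)
```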




CHAPTER 9: ONE SAMPLE HYPOTHESES TEST
Tests whether the sample mean differs significantly from the hypothesized population mean.

TYPE I AND TYPE II ERROR

Type I error (also called a false positive).


Type II error (also called a false negative).

DECISION RULES AND CRITICAL VALUES

Find Critical value of Z

alpha  Right-tailed  Left-tailed  Two-tailed (lower)  Two-tailed (upper)
0.05   1.645         -1.645       -1.960              1.960
0.1    1.282         -1.282       -1.645              1.645
0.01   2.326         -2.326       -2.576              2.576
0.025  1.960         -1.960       -2.241              2.241
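The critical values in the table can be regenerated from the inverse standard normal CDF. This is a minimal sketch; `critical_z` is an illustrative name, not something defined in the sheet.

```python
from statistics import NormalDist

def critical_z(alpha: float) -> dict:
    """Critical z values for right-, left-, and two-tailed tests at level alpha."""
    std_normal = NormalDist()
    one_tail = std_normal.inv_cdf(1 - alpha)       # all of alpha in one tail
    two_tail = std_normal.inv_cdf(1 - alpha / 2)   # alpha/2 in each tail
    return {
        "right": one_tail,
        "left": -one_tail,
        "two_tailed": (-two_tail, two_tail),
    }

for alpha in (0.05, 0.1, 0.01, 0.025):
    cv = critical_z(alpha)
    print(alpha, round(cv["right"], 3), round(cv["left"], 3),
          tuple(round(v, 3) for v in cv["two_tailed"]))
```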
TESTING A MEAN: KNOWN POPULATION VARIANCE
The critical value is the boundary between the rejection region and the non-rejection region.

Z-Test: if the two-tailed P-value > alpha, we cannot reject H0

alpha  x̄      µ0  s     n
0.05   55.82  56  0.77  49
Left-tailed:  Z_crit = -1.645, Z_calc = -1.636, P-value = 0.0509
Right-tailed: Z_crit = 1.645, Z_calc = -1.636, P-value = 0.9491
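The test statistic and the one-tailed P-values above come from z = (x̄ - µ0)/(σ/√n). A sketch with the sheet's inputs follows; the function name is illustrative, and the two-tailed value is computed as twice the smaller tail, which is the usual convention.

```python
import math
from statistics import NormalDist

def z_test(x_bar: float, mu0: float, sigma: float, n: int):
    """One-sample z test with known sigma: returns z and the three P-values."""
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    p_left = NormalDist().cdf(z)          # H1: mu < mu0
    p_right = 1 - p_left                  # H1: mu > mu0
    p_two = 2 * min(p_left, p_right)      # H1: mu != mu0
    return z, p_left, p_right, p_two

z_calc, p_left, p_right, p_two = z_test(55.82, 56, 0.77, 49)
alpha = 0.05
print(z_calc, p_left, p_right, p_two, p_left < alpha)
```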
depends on ...
Two-tailed: Z_(α/2) = ±1.960, Z_calc = -1.636, P-value = 0.1018
CI: lower = 55.604, upper = 56.036 (half-width ±0.216)

alpha  x̄    µ0  s   n   df = n - 1
0.05   209  60  13  20  19
Left-tailed:  T_crit = -1.729, T_calc = 51.258, P-value = 1.0000
Right-tailed: T_crit = 1.7291, T_calc = 51.258, P-value = 0.0000
Two-tailed:   T_crit = 2.0930, T_calc = 51.258, P-value = 0.0000
CI: lower = 202.916, upper = 215.084 (width: -6.084 / 6.084)
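The t statistic and the confidence interval above can be reproduced as follows. Python's standard library has no Student's t quantile function, so this sketch takes the two-tailed critical value 2.0930 from the sheet as a given input rather than computing it.

```python
import math

def t_statistic(x_bar: float, mu0: float, s: float, n: int) -> float:
    """One-sample t statistic with unknown population variance."""
    return (x_bar - mu0) / (s / math.sqrt(n))

x_bar, mu0, s, n = 209, 60, 13, 20
t_crit_two_tail = 2.0930          # taken from the sheet (df = 19, alpha = 0.05)

t_calc = t_statistic(x_bar, mu0, s, n)
half_width = t_crit_two_tail * s / math.sqrt(n)
lower, upper = x_bar - half_width, x_bar + half_width
reject_two_tail = abs(t_calc) > t_crit_two_tail
print(t_calc, half_width, lower, upper, reject_two_tail)
```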

alpha  x   n    p     π0
0.05   39  150  0.26  0.02
Left-tailed:  z_crit = -1.644854, z_calc = 20.995626, p_value = 1
Right-tailed: z_crit = 1.644854, z_calc = 20.995626, p_value = 0
Two-tailed:   z_crit = ±1.959964, z_calc = 20.995626, p_value = 0
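For the proportion test, the statistic is z = (p - π0)/sqrt(π0(1 - π0)/n). The sketch below reproduces z_calc for x = 39, n = 150, π0 = 0.02. The normality check n·π0 >= 10 and n·(1 - π0) >= 10 is an assumed rule of thumb; the sheet only records a "Check normal" result without stating its threshold.

```python
import math
from statistics import NormalDist

def proportion_z_test(x: int, n: int, pi0: float):
    """One-sample z test for a proportion against the hypothesized value pi0."""
    p = x / n
    z = (p - pi0) / math.sqrt(pi0 * (1 - pi0) / n)
    p_left = NormalDist().cdf(z)
    p_right = 1 - p_left
    p_two = 2 * min(p_left, p_right)
    return p, z, p_left, p_right, p_two

p, z_calc, p_left, p_right, p_two = proportion_z_test(39, 150, 0.02)

# Assumed rule of thumb for using the normal approximation
normal_ok = 150 * 0.02 >= 10 and 150 * (1 - 0.02) >= 10
print(p, z_calc, p_left, p_right, p_two, normal_ok)
```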
x 30
n 150
pi 0.02
Check normal no
Check normal Yes
INDEPENDENT SAMPLE
Z_Test KNOWN VARIANCE
alpha 0.01
s1 3      s2 3
x̄1 13.4   x̄2 15.2
n1 18     n2 18
z_crit -2.575829
z_calc -1.8000
p_value

INDEPENDENT SAMPLE
T_Test UNKNOWN VARIANCE
alpha 0.05
x̄1 240.000   x̄2 252.000
S1 20.000    S2 15.000
n1 10        n2 14
n1-1 9       n2-1 13
s1^2/n1 = 40     s2^2/n2 = 16.07143

LEFT
Equal variance:   d.f = 22, T_crit = -1.717, T_calc = -1.603, p_value = 0.061648, Sp^2 = 296.5909
Unequal variance: d.f = 16, T_crit = -1.753, T_calc = -1.603, p_value = 0.06494

RIGHT
Equal variance:   T_crit = 1.717, T_calc = -1.603, p_value = 0.938352
Unequal variance: T_crit = 1.753, T_calc = -1.603, p_value = 0.93506

TWO TAIL
Equal variance:   T_crit = 2.074, T_calc = -1.603, p_value = 0.123297
Unequal variance: T_crit = 2.1314, T_calc = -1.603, p_value = 0.129880
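The quantities behind the two-sample tests can be sketched as below: the known-variance z statistic, the pooled variance Sp^2, and the unequal-variance (Welch) t statistic with its approximate degrees of freedom. The Welch-Satterthwaite formula for the degrees of freedom is the standard one and is assumed here; the sheet shows only the rounded result.

```python
import math

# Z test, known variances (sheet inputs)
s1, s2, n1, n2 = 3, 3, 18, 18
z_calc = (13.4 - 15.2) / math.sqrt(s1**2 / n1 + s2**2 / n2)

# T test, unknown variances (sheet inputs)
x1, x2 = 240.0, 252.0
sd1, sd2 = 20.0, 15.0
m1, m2 = 10, 14

v1, v2 = sd1**2 / m1, sd2**2 / m2                    # s^2/n for each sample
sp2 = ((m1 - 1) * sd1**2 + (m2 - 1) * sd2**2) / (m1 + m2 - 2)   # pooled variance
df_pooled = m1 + m2 - 2

t_welch = (x1 - x2) / math.sqrt(v1 + v2)             # unequal-variance statistic
# Welch-Satterthwaite degrees of freedom (assumed standard formula)
df_welch = (v1 + v2) ** 2 / (v1**2 / (m1 - 1) + v2**2 / (m2 - 1))

print(z_calc, v1, v2, sp2, df_pooled, t_welch, df_welch)
```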

PAIRED
H0: µ_d = 0
H1: µ_d ≠ 0
alpha  d̄       S_d    n  d.f = n - 1
0.05   0.8286  1.755  7  6

COMPARE PROPORTION

alpha 0.05
x1 70    x2 104
n1 140   n2 260
p1 0.5   p2 0.4
Outputs: pc, z_crit, z_calc, p_value

F- TEST : COMPARE TWO VARIANCE


If the test statistic F is much less than 1 or much greater than 1, we would reject the hypothesis of equal population variances.
LEFT a s1 s2
0.05 n1 3 n2
df1 2 df2
RIGHT a s1 s2
0.05 n1 n2
df1 df2

TWO TAIL a s1 s2
0.05 n1 n2
df1 df2

FOOLED F TEST
H0: µ1 - µ2 = 0
H1: µ1 - µ2 ≠ 0

Sp LEFT
17.22181

INDEPENDENT SAMPLE

TWO TAIL
sample 1:  5.1  3    12.1  6.2   11.5  7.8
sample 2:  3.2  2.2  8.7   7.7   9.4   7.8
diff:      1.9  0.8  3.4   -1.5  2.1   0

mean 0.82857143
Std 1.75472152

T_crit   T_calc  p_value
-1.943   1.249   0.1290
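The paired statistics can be recomputed from the raw pairs. The sketch below uses all seven pairs; the seventh pair (2.2, 3.1) is listed separately further down in the sheet, and including it is what reproduces n = 7, the mean of 0.82857143, and the standard deviation of 1.75472152. The list names are illustrative.

```python
import math
from statistics import mean, stdev

sample_1 = [5.1, 3, 12.1, 6.2, 11.5, 7.8, 2.2]
sample_2 = [3.2, 2.2, 8.7, 7.7, 9.4, 7.8, 3.1]

diffs = [a - b for a, b in zip(sample_1, sample_2)]
n = len(diffs)
d_bar = mean(diffs)                      # mean difference
s_d = stdev(diffs)                       # sample standard deviation of differences
t_calc = d_bar / (s_d / math.sqrt(n))    # paired t statistic, df = n - 1
print(n, d_bar, s_d, t_calc)
```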

pc = 0.435; CI for p1 - p2: -0.002008 to 0.2020084
         Z-LEFT    Z-RIGHT   Z-TWO TAIL
z_crit   -1.6449   1.644854  -1.95996398
z_calc   1.924207  1.924207  1.92420724
p_value  0.9728    0.027164  0.054328
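The two-proportion test pools the samples to get pc and then computes z = (p1 - p2)/sqrt(pc(1 - pc)(1/n1 + 1/n2)). A sketch with the sheet's inputs (x1 = 70, n1 = 140, x2 = 104, n2 = 260) follows; the function name is illustrative.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1: int, n1: int, x2: int, n2: int):
    """Pooled two-sample z test for the difference of two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pc = (x1 + x2) / (n1 + n2)                        # pooled proportion
    se = math.sqrt(pc * (1 - pc) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_left = NormalDist().cdf(z)
    p_right = 1 - p_left
    p_two = 2 * min(p_left, p_right)
    return pc, z, p_left, p_right, p_two

pc, z_calc, p_left, p_right, p_two = two_proportion_z(70, 140, 104, 260)
print(pc, z_calc, p_left, p_right, p_two)
```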

F_crit 0.05218
4 F_calc #DIV/0!
3 p-value
F_crit
F_calc
p-value

F_crit
F_calc
p-value
Paired data, 7th pair: 2.2, 3.1 (diff -0.9)
CHAPTER 11 Analysis of Variance
ANOVA Assumptions
• Observations on Y are independent.
• Populations being sampled are normal.
• Populations being sampled have equal variances

group mean   overall mean   group size
22.7         489.1169
20.5         489.1169
#DIV/0!      489.1169

c (number of groups) = 5      df1 = c - 1 = 4
n (overall size) = 22         df2 = n - c = 15
alpha = 0.05

H0: µ1 = µ2 = µ3 = µ4
H1: Not all the means are equal
Decision rule: F_calc > F_crit -> reject H0

SSB = 744.000   MSB = 186
SSE = 751.500   MSE = 50.1
SS_Total = 1495.5
F_calc = 3.713   F_crit = 3.056   P_value = 0.027136
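The F statistic follows from the sums of squares and their degrees of freedom: MSB = SSB/df1, MSE = SSE/df2, F = MSB/MSE. The sketch below uses the sheet's values and takes F_crit = 3.056 from the sheet, since the standard library has no F quantile function.

```python
def anova_f(ssb: float, sse: float, df_between: int, df_within: int):
    """Return (MSB, MSE, F) for a one-factor ANOVA table."""
    msb = ssb / df_between
    mse = sse / df_within
    return msb, mse, msb / mse

ssb, sse = 744.0, 751.5
msb, mse, f_calc = anova_f(ssb, sse, df_between=4, df_within=15)
ss_total = ssb + sse

f_crit = 3.056                    # taken from the sheet (alpha = 0.05, df 4 and 15)
reject_h0 = f_calc > f_crit
print(msb, mse, f_calc, ss_total, reject_h0)
```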

630.83300
CHAPTER 12: SIMPLE REGRESSION

The correlation coefficient is denoted r. Its value falls in the interval [-1, 1].
It measures the degree of linearity in the relationship between two random variables X and Y.

4601

r = 0.803546
array1 (xi)  array2 (yi)


1 0.1
3 0.4
5 0.3
2 0.5
6 0.8
7 0.87
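The value r = 0.803546 can be recomputed from the two arrays with the usual sums-of-products formula. This is a minimal sketch; the function name `pearson_r` is illustrative.

```python
import math

def pearson_r(xs, ys) -> float:
    """Sample correlation coefficient between two equal-length sequences."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

array1 = [1, 3, 5, 2, 6, 7]
array2 = [0.1, 0.4, 0.3, 0.5, 0.8, 0.87]
r = pearson_r(array1, array2)
print(r)
```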

Tests for Significant Correlation Using Student's t

The sample correlation is used to estimate the population correlation.
H0: ρ = 0
H1: ρ ≠ 0

alpha = 0.05, r = 0.6, n = 25, df = n - 2 = 23
t_crit = T.INV.2T(alpha, deg_freedom) = 2.068658
t_calc = 3.596874
p_value = T.DIST.2T(t_calc, df) = 0.000761
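The statistic for this test is t = r·sqrt((n - 2)/(1 - r^2)). The sketch below reproduces t_calc for r = 0.6 and n = 25 and compares it with the critical value 2.068658, which is taken from the sheet because the standard library has no counterpart to T.INV.2T.

```python
import math

def correlation_t(r: float, n: int) -> float:
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r**2))

r, n = 0.6, 25
df = n - 2
t_calc = correlation_t(r, n)

t_crit = 2.068658                 # taken from the sheet (alpha = 0.05, df = 23)
reject_h0 = abs(t_calc) > t_crit
print(df, t_calc, reject_h0)
```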

SLOPE & INTERCEPT

alpha
n
df = n - 2

t_crit = T.INV.2T(alpha, df)

It ranges from -1.0 to +1.0 inclusive


It measures the strength of the relationship between two variables
A value of 0.00 indicates no linear relationship between the two variables
Whether population correlation is zero
alpha = 0.05, n = 25, df = n - 2 = 23
Slope: S_b1 = 0.86095, b1 = 1.9641, t_slope = b1/S_b1 = 2.281317, T-crit = 1.713872, p = 0.016056
Intercept: S_b0, b0, and t_intercept are blank in the sheet; T-crit = 2.068658
