
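Statistics formula sheet

Summarising data

Sample mean:
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i. \]

Sample variance:
\[ s_x^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{n-1} \left\{ \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 \right\}. \]

Sample covariance:
\[ g = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \frac{1}{n-1} \left\{ \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y} \right\}. \]

Sample correlation:
\[ r = \frac{g}{s_x s_y}. \]

Probability

Addition law:
\[ P(A \cup B) = P(A) + P(B) - P(A \cap B). \]

Multiplication law:
\[ P(A \cap B) = P(A)P(B|A) = P(B)P(A|B). \]

Partition law: For a partition $B_1, B_2, \ldots, B_k$
\[ P(A) = \sum_{i=1}^{k} P(A \cap B_i) = \sum_{i=1}^{k} P(A|B_i)P(B_i). \]

Bayes' formula:
\[ P(B_i|A) = \frac{P(A|B_i)P(B_i)}{P(A)} = \frac{P(A|B_i)P(B_i)}{\sum_{j=1}^{k} P(A|B_j)P(B_j)}. \]

Discrete distributions

Mean value:
\[ E(X) = \mu = \sum_{x_i \in S} x_i \, p(x_i). \]

Variance:
\[ \mathrm{Var}(X) = \sum_{x_i \in S} (x_i - \mu)^2 p(x_i) = \sum_{x_i \in S} x_i^2 \, p(x_i) - \mu^2. \]

The binomial distribution:
\[ p(x) = \binom{n}{x} \theta^x (1 - \theta)^{n-x} \quad \text{for } x = 0, 1, \ldots, n. \]
This has mean $n\theta$ and variance $n\theta(1 - \theta)$.

The Poisson distribution:
\[ p(x) = \frac{\lambda^x \exp(-\lambda)}{x!} \quad \text{for } x = 0, 1, 2, \ldots. \]
This has mean $\lambda$ and variance $\lambda$.

Continuous distributions

Distribution function:
\[ F(y) = P(X \le y) = \int_{-\infty}^{y} f(x) \, dx. \]

Density function:
\[ f(x) = \frac{d}{dx} F(x). \]

Evaluating probabilities:
\[ P(a < X \le b) = \int_{a}^{b} f(x) \, dx = F(b) - F(a). \]

Expected value:
\[ E(X) = \mu = \int_{-\infty}^{\infty} x f(x) \, dx. \]

Variance:
\[ \mathrm{Var}(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x) \, dx = \int_{-\infty}^{\infty} x^2 f(x) \, dx - \mu^2. \]

Hazard function:
\[ h(t) = \frac{f(t)}{1 - F(t)}. \]

Normal density with mean $\mu$ and variance $\sigma^2$:
\[ f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{1}{2\sigma^2}(x - \mu)^2 \right\} \quad \text{for } x \in (-\infty, \infty). \]

Weibull density:
\[ f(t) = \lambda \kappa t^{\kappa - 1} \exp(-\lambda t^{\kappa}) \quad \text{for } t \ge 0. \]

Exponential density:
\[ f(t) = \lambda \exp(-\lambda t) \quad \text{for } t \ge 0. \]
This has mean $\lambda^{-1}$ and variance $\lambda^{-2}$.

Test for population mean

Data: Single sample of measurements $x_1, \ldots, x_n$.
Hypothesis: H : $\mu = \mu_0$.
Calculate $\bar{x}$, $s^2$, and $t = |\bar{x} - \mu_0| \sqrt{n} / s$.
Obtain critical value from t-tables, df = $n - 1$.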
Reject H at the 100p% level of significance if $|t| > c$, where $c$ is the tabulated value corresponding to column $p$.

Paired sample t-test

Data: Single sample of $n$ measurements $x_1, \ldots, x_n$ which are the pairwise differences between the two original sets of measurements.
Hypothesis: H : $\mu = 0$.
Calculate $\bar{x}$, $s^2$ and $t = \bar{x} \sqrt{n} / s$.
Obtain critical value from t-tables, df = $n - 1$.
Reject H at the 100p% level of significance if $|t| > c$, where $c$ is the tabulated value corresponding to column $p$.

Two sample t-test

Data: Two separate samples of measurements $x_1, \ldots, x_n$ and $y_1, \ldots, y_m$.
Hypothesis: H : $\mu_x = \mu_y$.
Calculate $\bar{x}$, $s_x^2$, $\bar{y}$, and $s_y^2$.
Calculate
\[ s^2 = \left\{ (n - 1)s_x^2 + (m - 1)s_y^2 \right\} / (n + m - 2). \]
Calculate
\[ t = \frac{\bar{x} - \bar{y}}{\sqrt{s^2 \left( \frac{1}{n} + \frac{1}{m} \right)}}. \]
Obtain critical value from t-tables, df = $n + m - 2$.
Reject H at the 100p% level of significance if $|t| > c$, where $c$ is the tabulated value corresponding to column $p$.

CI for population mean

Data: Sample of measurements $x_1, \ldots, x_n$.
Method:
Calculate $\bar{x}$, $s_x^2$.
Look in t-tables, df = $n - 1$, column $p$. Let the tabulated value be $c$ say.
100(1 − p)% confidence interval for $\mu$ is $\bar{x} \pm c \, s_x / \sqrt{n}$.

CI for difference in population means

Data: Separate samples $x_1, \ldots, x_n$ and $y_1, \ldots, y_m$.
Calculate $\bar{x}$, $s_x^2$, $\bar{y}$, $s_y^2$.
Calculate
\[ s^2 = \left\{ (n - 1)s_x^2 + (m - 1)s_y^2 \right\} / (n + m - 2). \]
Look in t-tables, df = $n + m - 2$, column $p$. Let the tabulated value be $c$ say.
100(1 − p)% confidence interval for the difference in population means, i.e. $\mu_x - \mu_y$, is
\[ (\bar{x} - \bar{y}) \pm c \sqrt{s^2 \left( \frac{1}{n} + \frac{1}{m} \right)}. \]

Regression and correlation

The linear regression model:
\[ y_i = \alpha + \beta x_i + z_i. \]

Least squares estimates of $\alpha$ and $\beta$:
\[ \hat{\beta} = \frac{\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}}{(n - 1)s_x^2}, \qquad \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}. \]

Confidence interval for $\beta$

Calculate $\hat{\beta}$ as given previously.
Calculate $s_\varepsilon^2 = s_y^2 - \hat{\beta}^2 s_x^2$.
Calculate
\[ \mathrm{SE}(\hat{\beta}) = \sqrt{\frac{s_\varepsilon^2}{(n - 2)s_x^2}}. \]
Look in t-tables, df = $n - 2$, column $p$. Let the tabulated value be $c$.
100(1 − p)% confidence interval for $\beta$ is $\hat{\beta} \pm c \, \mathrm{SE}(\hat{\beta})$.

Test for $\rho = 0$

Hypothesis: H : $\rho = 0$.
Calculate
\[ t = r \left( \frac{n - 2}{1 - r^2} \right)^{1/2}. \]
Obtain critical value from t-tables, df = $n - 2$.
Reject H at the 100p% level of significance if $|t| > c$, where $c$ is the tabulated value corresponding to column $p$.

Approximate CI for proportion

\[ p \pm 1.96 \sqrt{\frac{p(1 - p)}{n}} \]
where $p$ is the observed proportion in the sample.

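As a rough illustration of the approximate interval for a proportion, the sketch below computes p ± 1.96 × (standard error) in Python. It assumes the usual standard error sqrt(p(1 − p)/n), with n the sample size; the function name `proportion_ci` and the example counts are hypothetical and not part of the sheet.

```python
import math


def proportion_ci(successes, n, z=1.96):
    """Approximate CI for a proportion: p +/- z * sqrt(p(1 - p) / n).

    Assumption: the standard error uses n in the denominator, and
    z = 1.96 corresponds to an approximate 95% interval.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n  # observed proportion in the sample
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Hypothetical data: 40 successes observed in a sample of 100.
low, high = proportion_ci(40, 100)
print(f"approximate 95% CI: ({low:.3f}, {high:.3f})")
```

The approximation relies on the normal distribution, so it is likely to be poor for small samples or for proportions close to 0 or 1.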
Test for a proportion

Hypothesis: H : $\theta = \theta_0$.
Test statistic
\[ z = \frac{p - \theta_0}{\sqrt{\theta_0 (1 - \theta_0) / n}}. \]
Obtain critical value from normal tables.
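The test for a proportion can be sketched in Python as follows. This is a minimal example under stated assumptions: the test is taken to be two-sided, the standard error is sqrt(θ0(1 − θ0)/n), and the critical value comes from the standard library's `statistics.NormalDist` rather than printed tables. The function name `proportion_z_test` and the data are hypothetical.

```python
import math
from statistics import NormalDist


def proportion_z_test(successes, n, theta0, level=0.05):
    """Return (z, critical value, reject?) for H: theta = theta0.

    Assumption: two-sided test, so H is rejected when |z| exceeds the
    upper level/2 point of the standard normal distribution.
    """
    p = successes / n  # observed proportion
    z = (p - theta0) / math.sqrt(theta0 * (1 - theta0) / n)
    c = NormalDist().inv_cdf(1 - level / 2)  # stands in for normal tables
    return z, c, abs(z) > c


# Hypothetical data: 60 successes in 100 trials, testing theta0 = 0.5.
z, c, reject = proportion_z_test(60, 100, 0.5)
print(f"z = {z:.3f}, critical value = {c:.3f}, reject H: {reject}")
```

For a one-sided alternative the critical value would instead be taken at `1 - level`.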

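Comparison of proportions

Hypothesis: H : $\theta_1 = \theta_2$.
Pooled proportion:
\[ p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}. \]
Test statistic:
\[ z = \frac{p_1 - p_2}{\sqrt{p(1 - p) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}. \]
Obtain appropriate critical value from normal tables.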
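A short Python sketch of the comparison of two proportions, following the pooled proportion and z statistic given above. The function name `compare_proportions` and the counts are hypothetical; the critical value still has to be read from normal tables (or computed separately).

```python
import math


def compare_proportions(x1, n1, x2, n2):
    """Return the pooled proportion p and the z statistic for H: theta1 = theta2.

    x1, x2 are the numbers of successes; n1, n2 are the sample sizes.
    """
    p1 = x1 / n1
    p2 = x2 / n2
    # Pooled proportion: (n1*p1 + n2*p2) / (n1 + n2)
    p = (n1 * p1 + n2 * p2) / (n1 + n2)
    z = (p1 - p2) / math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return p, z


# Hypothetical data: 45 of 100 in the first sample, 30 of 100 in the second.
p, z = compare_proportions(45, 100, 30, 100)
print(f"pooled p = {p:.3f}, z = {z:.3f}")
```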

Goodness of fit

Test statistic
\[ \chi^2 = \sum_{i=1}^{m} \frac{(o_i - e_i)^2}{e_i} \]
where $o_i$ and $e_i$ are the observed and expected frequencies in category $i$ and $m$ is the number of categories.

Hypothesis H : $F = F_0$.

Calculate the expected class frequencies under $F_0$.
Calculate the $\chi^2$ test statistic given above.
Determine the degrees of freedom, $\nu$ say.
Obtain critical value from $\chi^2$ tables, df = $\nu$.
Reject H : $F = F_0$ at the 100p% level of significance if $\chi^2 > c$ where $c$ is the tabulated critical value.
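The goodness-of-fit steps can be sketched in Python as below. This is an illustration only: the function name `chi_square_statistic` and the die-roll counts are hypothetical, the degrees of freedom are taken as m − 1 on the assumption that F0 is fully specified (no parameters estimated from the data), and the critical value 11.07 is the tabulated 5% point for 5 degrees of freedom, entered by hand since the standard library has no chi-squared tables.

```python
def chi_square_statistic(observed, expected):
    """Sum of (o_i - e_i)^2 / e_i over the m categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 rolls of a die, F0 = all faces equally likely.
observed = [8, 12, 9, 11, 10, 10]
expected = [10.0] * 6  # expected class frequencies under F0

stat = chi_square_statistic(observed, expected)

# Assumption: df = m - 1 = 5 because F0 has no estimated parameters.
critical_value = 11.07  # tabulated 5% value for df = 5
reject = stat > critical_value
print(f"chi-squared = {stat:.2f}, reject H at the 5% level: {reject}")
```

If parameters of F0 are estimated from the data, the degrees of freedom are reduced accordingly and a different tabulated value applies.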
