3401 Formulas & All Tables04!29!06

Tables and Formulas
for Sullivan, Statistics: Informed Decisions Using Data
2004 Pearson Education, Inc.
CHAPTER 4 Describing the Relation between Two Variables
Correlation Coefficient: $r = \frac{\sum \left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)}{n - 1}$
The equation of the least-squares regression line is $\hat{y} = b_1 x + b_0$, where $\hat{y}$ is the predicted value, $b_1 = r \cdot \frac{s_y}{s_x}$ is the slope, and $b_0 = \bar{y} - b_1 \bar{x}$ is the intercept.
Residual = observed $y$ - predicted $y$ = $y - \hat{y}$
Coefficient of Determination: $R^2$ = the percent of total variation in the response variable that is explained by the least-squares regression line.
$R^2 = r^2$ for the least-squares regression model $\hat{y} = b_0 + b_1 x$
Power Equation of Best Fit: $y = a x^b$
Linear: $\log y = \log a + b \log x$
Exponential Equation of Best Fit: $y = a b^x$
Linear: $\log y = \log a + x \log b$
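
The Chapter 4 formulas can be checked numerically. The sketch below is one possible way to do that using only the Python standard library; the function names and the five sample points are made up for illustration and do not come from the formula sheet.

```python
import math


def correlation(xs, ys):
    """Return r plus the summary statistics needed for the regression line."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    # Sum of products of z-scores, divided by n - 1.
    z_products = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                     for x, y in zip(xs, ys))
    return z_products / (n - 1), x_bar, y_bar, s_x, s_y


def least_squares_line(xs, ys):
    """Return (b0, b1, r) with b1 = r * s_y / s_x and b0 = y_bar - b1 * x_bar."""
    r, x_bar, y_bar, s_x, s_y = correlation(xs, ys)
    b1 = r * s_y / s_x
    b0 = y_bar - b1 * x_bar
    return b0, b1, r


# Illustrative data only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1, r = least_squares_line(xs, ys)
# Residual = observed y - predicted y.
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
print(f"y_hat = {b1:.3f} x + {b0:.3f}, r = {r:.4f}, R^2 = {r * r:.3f}")
```

For a least-squares line the residuals sum to zero (up to rounding), which makes a quick sanity check on a hand calculation.
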
CHAPTER 5 Probability
Classical Probability: $P(E) = \frac{\text{number of ways that } E \text{ can occur}}{\text{number of possible outcomes}} = \frac{N(E)}{N(S)}$
Empirical Probability: $P(E) \approx \frac{\text{frequency of } E}{\text{number of trials of experiment}}$
Addition Rule: $P(E \text{ or } F) = P(E) + P(F) - P(E \text{ and } F)$
Addition Rule for Mutually Exclusive Events: $P(E \text{ or } F) = P(E) + P(F)$
Addition Rule for n Mutually Exclusive Events: $P(E \text{ or } F \text{ or } G \text{ or } \ldots) = P(E) + P(F) + P(G) + \cdots$
Complement Rule: $P(\bar{E}) = 1 - P(E)$
Multiplication Rule: $P(E \text{ and } F) = P(E) \cdot P(F \mid E)$
Multiplication Rule for Independent Events: $P(E \text{ and } F) = P(E) \cdot P(F)$
Multiplication Rule for n Independent Events: $P(E \text{ and } F \text{ and } G \ldots) = P(E) \cdot P(F) \cdot P(G) \cdot \cdots$
Conditional Probability Rule: $P(F \mid E) = \frac{P(E \text{ and } F)}{P(E)} = \frac{N(E \text{ and } F)}{N(E)}$
Factorial: $n! = n \cdot (n - 1) \cdot (n - 2) \cdots 3 \cdot 2 \cdot 1$
Permutation of n objects taken r at a time: ${}_nP_r = \frac{n!}{(n - r)!}$
Combination of n objects taken r at a time: ${}_nC_r = \frac{n!}{r!(n - r)!}$
Permutations with Repetition: $\frac{n!}{n_1! \cdot n_2! \cdots n_k!}$, with $n_1$ of one type, $n_2$ of a second type, ..., and $n = n_1 + n_2 + \cdots + n_k$
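
As a rough illustration of the counting and conditional probability rules, the following sketch writes each formula out with `math.factorial`. The function names are hypothetical, and the examples (the letters of MISSISSIPPI, hearts in a standard deck) are chosen here for illustration only.

```python
import math


def n_p_r(n, r):
    """Permutations of n objects taken r at a time: n! / (n - r)!."""
    return math.factorial(n) // math.factorial(n - r)


def n_c_r(n, r):
    """Combinations of n objects taken r at a time: n! / (r! (n - r)!)."""
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))


def permutations_with_repetition(counts):
    """n! / (n_1! n_2! ... n_k!) where n is the sum of the counts."""
    denominator = math.prod(math.factorial(c) for c in counts)
    return math.factorial(sum(counts)) // denominator


def conditional_probability(n_e_and_f, n_e):
    """P(F | E) = N(E and F) / N(E) for equally likely outcomes."""
    return n_e_and_f / n_e


print(n_p_r(5, 2), n_c_r(5, 2))
# Arrangements of MISSISSIPPI: 1 M, 4 I, 4 S, 2 P.
print(permutations_with_repetition([1, 4, 4, 2]))
# P(face card | heart): 3 heart face cards among 13 hearts.
print(conditional_probability(3, 13))
```

Python 3.8 and later also provide `math.perm` and `math.comb`, which return the same values as the first two functions.
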
CHAPTER 2 Organizing and Summarizing Data
Relative frequency = $\frac{\text{frequency}}{\text{sum of all frequencies}}$
Class midpoint = $\frac{\text{Lower class limit} + \text{Upper class limit}}{2}$
CHAPTER 3 Numerically Summarizing Data
Population Mean: $\mu = \frac{\sum x_i}{N}$
Sample Mean: $\bar{x} = \frac{\sum x_i}{n}$
Range = Largest Data Value - Smallest Data Value
Population Variance: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$
Sample Variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
Population Standard Deviation: $\sigma = \sqrt{\sigma^2}$
Sample Standard Deviation: $s = \sqrt{s^2}$
Empirical Rule: If the shape of the distribution is bell-shaped, then
Approximately 68% of the data lie within one standard deviation of the mean
Approximately 95% of the data lie within two standard deviations of the mean
Approximately 99.7% of the data lie within three standard deviations of the mean
Chebyshev's Inequality: For any data set, regardless of the shape of the distribution, at least $\left(1 - \frac{1}{k^2}\right) \cdot 100\%$ of the observations will lie within k standard deviations of the mean, where k is any number greater than 1.
Population Mean from Grouped Data: $\mu = \frac{\sum x_i f_i}{\sum f_i}$
Sample Mean from Grouped Data: $\bar{x} = \frac{\sum x_i f_i}{\sum f_i}$
Weighted Mean: $\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i}$
Population Variance from Grouped Data: $\sigma^2 = \frac{\sum (x_i - \mu)^2 f_i}{\sum f_i}$
Sample Variance from Grouped Data: $s^2 = \frac{\sum (x_i - \bar{x})^2 f_i}{\left(\sum f_i\right) - 1}$
Population Z-score: $z = \frac{x - \mu}{\sigma}$
Sample Z-score: $z = \frac{x - \bar{x}}{s}$
Percentile of $x = \frac{\text{Number of data values less than } x}{n} \cdot 100$
Determining the kth percentile: with the data arranged in ascending order, compute $i = \left(\frac{k}{100}\right) n$. If $i$ is not an integer, round up to the next integer and use the $i$th data value. If $i$ is an integer, find the mean of the $i$th and $(i + 1)$st data values.
Interquartile Range: $IQR = Q_3 - Q_1$
Lower and Upper Fences:
Lower Fence = $Q_1 - 1.5(IQR)$
Upper Fence = $Q_3 + 1.5(IQR)$
Five-Number Summary: Minimum, $Q_1$, $M$, $Q_3$, Maximum
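
The percentile procedure and the fences are easy to get wrong by one position, so a small sketch may help. It assumes integer k, uses hypothetical function names and made-up data, and treats $Q_1$ and $Q_3$ as the 25th and 75th percentiles, which is an assumption here since the sheet does not say how the quartiles are computed.

```python
def sample_variance(data):
    """s^2 = sum((x_i - x_bar)^2) / (n - 1)."""
    n = len(data)
    x_bar = sum(data) / n
    return sum((x - x_bar) ** 2 for x in data) / (n - 1)


def kth_percentile(data, k):
    """kth percentile via the index i = (k / 100) * n (k an integer, 0 < k < 100)."""
    if not 0 < k < 100:
        raise ValueError("k must be strictly between 0 and 100")
    ordered = sorted(data)
    # Integer arithmetic avoids floating-point trouble when testing
    # whether i = k * n / 100 is a whole number.
    q, remainder = divmod(k * len(ordered), 100)
    if remainder == 0:
        # i is an integer: mean of the ith and (i + 1)st values (1-based).
        return (ordered[q - 1] + ordered[q]) / 2
    # i is not an integer: round up, so the 1-based index is q + 1.
    return ordered[q]


def fences(data):
    """Lower and upper fences, Q1 - 1.5 IQR and Q3 + 1.5 IQR."""
    q1 = kth_percentile(data, 25)  # assumption: Q1 taken as 25th percentile
    q3 = kth_percentile(data, 75)  # assumption: Q3 taken as 75th percentile
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


data = [1, 3, 5, 7, 9, 11, 13, 15]  # illustrative values only
print(sample_variance(data), kth_percentile(data, 40), fences(data))
```

Any data value below the lower fence or above the upper fence would be flagged as a potential outlier.
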
CHAPTER 6 Discrete Probability Distributions
Mean of a Discrete Random Variable: $\mu_X = \sum x \cdot P(X = x)$
Variance of a Discrete Random Variable: $\sigma_X^2 = \sum (x - \mu_X)^2 \cdot P(X = x)$
Expected Value of a Random Variable X: $E(X) = \sum x \cdot P(X = x)$
Binomial Probability Distribution Function: $P(X = x) = {}_nC_x \, p^x (1 - p)^{n - x}$
Mean of a Binomial Random Variable: $\mu_X = np$
Standard Deviation of a Binomial Random Variable: $\sigma_X = \sqrt{np(1 - p)}$
Poisson Probability Distribution Function: $P(X = x) = \frac{(\lambda t)^x}{x!} e^{-\lambda t}$, $x = 0, 1, 2, \ldots$
Mean and Standard Deviation of a Poisson Random Variable: $\mu_X = \lambda t$, $\sigma_X = \sqrt{\lambda t}$
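
A short sketch of the two probability functions, assuming plain Python with the standard library. The function names and the parameter values are illustrative only; the loop confirms that the mean and variance definitions reproduce $np$ and $np(1 - p)$ for a binomial variable.

```python
import math


def binomial_pmf(x, n, p):
    """P(X = x) = nCx * p^x * (1 - p)^(n - x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


def poisson_pmf(x, lam, t=1.0):
    """P(X = x) = (lam * t)^x / x! * e^(-lam * t)."""
    mu = lam * t
    return mu ** x / math.factorial(x) * math.exp(-mu)


n, p = 10, 0.3  # illustrative parameters
probabilities = [binomial_pmf(x, n, p) for x in range(n + 1)]
# Mean and variance from the discrete random variable definitions.
mean = sum(x * prob for x, prob in enumerate(probabilities))
variance = sum((x - mean) ** 2 * prob for x, prob in enumerate(probabilities))
print(mean, n * p)
print(variance, n * p * (1 - p))
print(poisson_pmf(0, lam=2.0, t=1.0))
```
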
CHAPTER 7 The Normal Distribution
Standardizing a Normal Random Variable: $Z = \frac{X - \mu}{\sigma}$ or $Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
Finding the Score: $X = \mu + Z\sigma$
Mean of Sampling Distribution of $\bar{x}$: $\mu_{\bar{x}} = \mu$
Standard Deviation of Sampling Distribution of $\bar{x}$: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
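
The standardizing formulas translate directly into code. In this sketch the function names and the numbers (mean 100, standard deviation 15, sample size 25) are assumptions chosen for illustration, and `statistics.NormalDist` from the standard library stands in for a printed normal table.

```python
import math
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Z = (X - mu) / sigma for a single observation."""
    return (x - mu) / sigma


def z_score_of_sample_mean(x_bar, mu, sigma, n):
    """Z = (x_bar - mu) / (sigma / sqrt(n)) for a sample mean."""
    return (x_bar - mu) / (sigma / math.sqrt(n))


def score_from_z(z, mu, sigma):
    """X = mu + Z * sigma."""
    return mu + z * sigma


z_single = z_score(130, mu=100, sigma=15)
z_mean = z_score_of_sample_mean(104, mu=100, sigma=15, n=25)
# Area to the left of z_mean under the standard normal curve.
prob_below = NormalDist().cdf(z_mean)
print(z_single, z_mean, prob_below, score_from_z(2.0, 100, 15))
```
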
CHAPTER 8 Confidence Intervals
Confidence Intervals
A $(1 - \alpha) \cdot 100\%$ confidence interval about $\mu$ with $\sigma$ known is $\bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$, provided the population from which the sample was drawn is normal or the sample size is large ($n \ge 30$).
A $(1 - \alpha) \cdot 100\%$ confidence interval about $\mu$ with $\sigma$ unknown is $\bar{x} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$, provided the population from which the sample was drawn is normal or the sample size is large ($n \ge 30$). Note: $t_{\alpha/2}$ is computed using $n - 1$ degrees of freedom.
A $(1 - \alpha) \cdot 100\%$ confidence interval about p is $\hat{p} \pm z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$, provided $n\hat{p}(1 - \hat{p}) \ge 10$.
A $(1 - \alpha) \cdot 100\%$ confidence interval about $\sigma^2$ is $\frac{(n - 1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_{1 - \alpha/2}}$, provided the population from which the sample was drawn is normal.
Sample Size
To estimate the population mean with a margin of error E at a $(1 - \alpha) \cdot 100\%$ level of confidence requires a sample of size $n = \left(\frac{z_{\alpha/2} \cdot \sigma}{E}\right)^2$ rounded up to the next integer.
To estimate the population proportion with a margin of error E at a $(1 - \alpha) \cdot 100\%$ level of confidence requires a sample of size $n = \hat{p}(1 - \hat{p})\left(\frac{z_{\alpha/2}}{E}\right)^2$ rounded up to the next integer, where $\hat{p}$ is a prior estimate of the population proportion.
To estimate the population proportion with a margin of error E at a $(1 - \alpha) \cdot 100\%$ level of confidence requires a sample of size $n = 0.25\left(\frac{z_{\alpha/2}}{E}\right)^2$ rounded up to the next integer when no prior estimate of p is available.
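
The z-based intervals and the sample size formulas can be sketched as follows. This is only an illustration: the function names are hypothetical, the inputs are made up, and the critical value $z_{\alpha/2}$ is taken from `statistics.NormalDist().inv_cdf` instead of a table, so results can differ slightly from hand calculations that use a rounded value such as 1.96.

```python
import math
from statistics import NormalDist


def z_critical(alpha):
    """z_{alpha/2}: the value with area alpha / 2 to its right."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def mean_ci_sigma_known(x_bar, sigma, n, alpha):
    """x_bar +/- z_{alpha/2} * sigma / sqrt(n)."""
    margin = z_critical(alpha) * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin


def proportion_ci(x, n, alpha):
    """p_hat +/- z_{alpha/2} * sqrt(p_hat (1 - p_hat) / n)."""
    p_hat = x / n
    if n * p_hat * (1 - p_hat) < 10:
        raise ValueError("requires n * p_hat * (1 - p_hat) >= 10")
    margin = z_critical(alpha) * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def sample_size_for_mean(sigma, margin_of_error, alpha):
    """n = (z_{alpha/2} * sigma / E)^2, rounded up."""
    return math.ceil((z_critical(alpha) * sigma / margin_of_error) ** 2)


def sample_size_for_proportion(margin_of_error, alpha, p_hat=None):
    """Uses p_hat (1 - p_hat) with a prior estimate, else 0.25."""
    factor = 0.25 if p_hat is None else p_hat * (1 - p_hat)
    return math.ceil(factor * (z_critical(alpha) / margin_of_error) ** 2)


print(mean_ci_sigma_known(x_bar=50, sigma=10, n=25, alpha=0.05))
print(proportion_ci(x=60, n=100, alpha=0.05))
print(sample_size_for_mean(sigma=15, margin_of_error=2, alpha=0.05))
print(sample_size_for_proportion(margin_of_error=0.03, alpha=0.05))
```

The t-based interval follows the same pattern with $s$ in place of $\sigma$, but the standard library has no t-distribution, so $t_{\alpha/2}$ would have to come from a table or another package.
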
CHAPTER 9 Hypothesis Testing
Test Statistics
$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$, provided that the population from which the sample was drawn is normal or the sample size is large ($n \ge 30$).
$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$ follows Student's t-distribution with $n - 1$ degrees of freedom, provided that the population from which the sample was drawn is normal or the sample size is large ($n \ge 30$).
$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}}$, provided that $np_0(1 - p_0) \ge 10$ and the sample size is less than 5% of the population size ($n < 0.05N$).
$\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2}$ follows the $\chi^2$-distribution with $n - 1$ degrees of freedom, provided that the population from which the sample was drawn is normal.
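
The four test statistics can be sketched in a few lines of Python. The function names, the optional population-size argument, and all inputs are assumptions made for illustration; the sketch only computes the statistic and leaves the comparison with a critical value or P-value to the reader.

```python
import math


def z_stat_mean(x_bar, mu_0, sigma, n):
    """z = (x_bar - mu_0) / (sigma / sqrt(n)), sigma known."""
    return (x_bar - mu_0) / (sigma / math.sqrt(n))


def t_stat_mean(x_bar, mu_0, s, n):
    """t = (x_bar - mu_0) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    return (x_bar - mu_0) / (s / math.sqrt(n))


def z_stat_proportion(p_hat, p_0, n, population_size=None):
    """z = (p_hat - p_0) / sqrt(p_0 (1 - p_0) / n), with the sheet's conditions."""
    if n * p_0 * (1 - p_0) < 10:
        raise ValueError("requires n * p_0 * (1 - p_0) >= 10")
    if population_size is not None and n >= 0.05 * population_size:
        raise ValueError("requires n < 0.05 * N")
    return (p_hat - p_0) / math.sqrt(p_0 * (1 - p_0) / n)


def chi_square_stat_variance(s, sigma_0, n):
    """chi^2 = (n - 1) s^2 / sigma_0^2, with n - 1 degrees of freedom."""
    return (n - 1) * s ** 2 / sigma_0 ** 2


print(z_stat_mean(52, 50, 10, 25))
print(t_stat_mean(52, 50, 8, 16))
print(z_stat_proportion(0.56, 0.5, 100, population_size=10000))
print(chi_square_stat_variance(s=3, sigma_0=2, n=11))
```
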
CHAPTER 10 Inferences on Two Samples
Test statistic for matched pairs data: $t = \frac{\bar{d}}{s_d / \sqrt{n}}$, where $\bar{d}$ is the mean and $s_d$ is the standard deviation of the differenced data.
Confidence interval for matched pairs data:
Lower Bound: $\bar{d} - t_{\alpha/2} \cdot \frac{s_d}{\sqrt{n}}$
Upper Bound: $\bar{d} + t_{\alpha/2} \cdot \frac{s_d}{\sqrt{n}}$
Note: $t_{\alpha/2}$ is found using $n - 1$ degrees of freedom.
Test statistic comparing two means (independent sampling): $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
Confidence interval for the difference of two means (independent samples):
Lower Bound: $(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
Upper Bound: $(\bar{x}_1 - \bar{x}_2) + t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
Note: $t_{\alpha/2}$ is found using the smaller of $n_1 - 1$ or $n_2 - 1$ degrees of freedom.
Test statistic comparing two population proportions: $Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$, where $\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$.
Confidence interval for the difference of two proportions:
Lower Bound: $(\hat{p}_1 - \hat{p}_2) - z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$
Upper Bound: $(\hat{p}_1 - \hat{p}_2) + z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$
Test statistic for comparing two population standard deviations: $F = \frac{s_1^2}{s_2^2}$
Finding a critical F for the left tail: $F_{1 - \alpha, n_1 - 1, n_2 - 1} = \frac{1}{F_{\alpha, n_2 - 1, n_1 - 1}}$
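
A minimal sketch of the two-sample test statistics, using only the standard library. Function names and sample values are hypothetical; `statistics.stdev` is used because it computes the sample standard deviation (divisor $n - 1$), which is what $s_d$ requires.

```python
import math
from statistics import mean, stdev


def matched_pairs_t(first, second):
    """t = d_bar / (s_d / sqrt(n)) on the differences first - second."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    return mean(differences) / (stdev(differences) / math.sqrt(n))


def two_means_t(x_bar_1, s_1, n_1, x_bar_2, s_2, n_2, mu_difference=0.0):
    """t for independent samples; mu_difference is mu_1 - mu_2 under H0."""
    standard_error = math.sqrt(s_1 ** 2 / n_1 + s_2 ** 2 / n_2)
    return ((x_bar_1 - x_bar_2) - mu_difference) / standard_error


def conservative_df(n_1, n_2):
    """Smaller of n_1 - 1 and n_2 - 1, as in the sheet's note."""
    return min(n_1, n_2) - 1


def two_proportions_z(x_1, n_1, x_2, n_2):
    """Z using the pooled estimate p_hat = (x_1 + x_2) / (n_1 + n_2)."""
    p_hat_1, p_hat_2 = x_1 / n_1, x_2 / n_2
    p_hat = (x_1 + x_2) / (n_1 + n_2)
    denominator = math.sqrt(p_hat * (1 - p_hat)) * math.sqrt(1 / n_1 + 1 / n_2)
    return (p_hat_1 - p_hat_2) / denominator


print(matched_pairs_t([5, 7, 9], [4, 5, 6]))
print(two_means_t(10, 2, 16, 8, 2, 16), conservative_df(16, 16))
print(two_proportions_z(60, 100, 40, 100))
```
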
AAEC 3401: Fundamentals of Statistics - Formula Sheet
CHAPTER 4 Describing the Relation between Two Variables
for the least-squares regression model
Exponential Equation of Best Fit
Power Equation of Best Fit
y = ax
b
Linear: log y = log a + b log x
y = ab
x
Linear: log y = log a + x log b
y
N
= b
0
+ b
1
x
R
2
= r
2
Tables and Formulas
for Sullivan, Statistics: Informed Decisions Using Data
2004 Pearson Education, Inc.
Correlation Coefficient:
The equation of the least-squares regression line is
where is the predicted value,
is the slope, and is the intercept.
observed
Coefficient of Determination: the percent of total
variation in the response variable that is explained by the
least-squares regression line.
R
2
=
y - predicted y = y - y
N
Residual =
b
0
= y - b
1
x
b
1
= r
#
s
y
s
x
y
N
y
N
= b
1
x + b
0
,
r =
a
a
x
i
- x
s
x
b a
y
i
- y
s
y
b
n - 1
CHAPTER 5 Probability
Classical Probability
Empirical Probability
Addition Rule
Addition Rule for Mutually Exclusive Events
Addition Rule for n Mutually Exclusive Events
Complement Rule
Multiplication Rule
P1E and F2 = P1E2
#
P1F E2
P1E2 = 1 - P1E2
P1E or F or G or

2 = P1E2 + P1F2 + P1G2 +

P1E or F2 = P1E2 + P1F2
P1E or F2 = P1E2 + P1F2 - P1E and F2
P1E2 L
frequency of E
number of trials of experiment
P1E2 =
number of ways that E can occur
number of possible outcomes
=
N1E2
N1S2
Multiplication Rule for Independent Events
Multiplication Rule for n Independent Events
Conditional Probability Rule
Factorial
Permutation of n objects taken r at a time:
Combination of n objects taken r at a time:
Permutations with Repetition: of one type, of a
second type, with
n!
n
1
!
#
n
2
!
# # # # #
n
k
!
n
1
+ n
2
+

+ n
k
= n ,
n
2
n
1
n
C
r
=
n!
r!1n - r2!
n
P
r
=
n!
1n - r2!
n! = n
#
1n - 12
#
1n - 22
# # # # #
3
#
2
#
1
P1F E2 =
P1E and F2
P1E2
=
N1E and F2
N1E2
P1E and F and G

2 = P1E2
#
P1F2
#
P1G2
#

P1E and F2 = P1E2
#
P1F2
CHAPTER 2 Organizing and Summarizing Data
Class midpoint =
Lower class limit + Upper class limit
2
Tables and Formulas
for Sullivan, Statistics: Informed Decisions Using Data
2004 Pearson Education, Inc.
Relative frequency =
frequency
sum of all frequencies
CHAPTER 3 Numerically Summarizing Data
Population Mean:
Sample Mean:

Population Variance:
Sample Variance:
Population Standard Deviation
Sample Standard Deviation:
Empirical Rule: If the shape of the distribution is bell-
shaped, then
Approximately 68% of the data lie within one stan-
dard deviation of the mean
Approximately 95% of the data lie within two stan-
dard deviations of the mean
Approximately 99.7% of the data lie within three
standard deviations of the mean
Chebyshevs Inequality: For any data set, regardless of
the shape of the distribution, at least of
the observations will lie within k standard deviations of
the mean where k is any number greater than 1.
Population Mean from Grouped Data: m =
gx
i
f
i
gf
i
a1 -
1
k
2
b100%
s = 2s
2
s = 2s
2
s
2
=
g1x
i
- x2
2
n - 1
s
2
=
g1x
i
- m2
2
N
Range = Largest Data Value - Smallest Data Value
x =
gx
i
n
m =
gx
i
N
Sample Mean from Grouped Data:
Weighted Mean:
Population Variance from Grouped Data:
Sample Variance from Grouped Data:
Population Z-score:
Sample Z-score:
Percentile of
Determining the kth percentile: rounded up
to the next integer. If i is an integer, find the mean of the
ith and st data value.
Interquartile Range:
Lower and Upper Fences:
Five-Number Summary:
Minimum, Q
1
, M, Q
3
, Maximum
Lower Fence = Q
1
- 1.51IQR2
Upper Fence = Q
3
+ 1.51IQR2
IQR = Q
3
- Q
1
1i + 12
i = a
k
100
bn
x =
Number of data values less than x
n
#
100
z =
x - x
s
z =
x - m
s
s
2
=
g1x
i
- m2
2
f
i
A gf
i
B - 1
s
2
=
g1x
i
- m2
2
f
i
gf
i
x
w
=
gw
i
x
i
gw
i
x =
gx
i
f
i
gf
i
CHAPTER 6 Discrete Probability Distributions
Mean of a Binomial Random Variable
Standard Deviation of a Binomial Random Variable
Poisson Probability Distribution Function
Mean and Standard Deviation of a Poisson Random Variable
m
X
= lt s
X
= 2lt
P1X = x2 =
1lt2
x
x!
e
-lt
x = 0, 1, 2,
s
X
=
4
np11 - p2
m
X
= np
Tables and Formulas
for Sullivan, Statistics: Informed Decisions Using Data
2004 Pearson Education, Inc.
Mean of a Discrete Random Variable
Variance of a Discrete Random Variable
Expected Value of a Random Variable X
Binomial Probability Distribution Function
P1X = x2 =
n
C
x
p
x
11 - p2
n-x
E1X2 = gx
#
P1X = x2
s
2
X
= g1x - m2
2
#
P1X = x2
m
X
= gx
#
P1X = x2
CHAPTER 7 The Normal Distribution
Standardizing a Normal Random Variable
Finding the Score: X = m + Zs
Z =
X - m
s
or Z =
x - m
s
^
1n
Mean of Sampling Distribution of
Standard Deviation of Sampling Distribution of
s
x
=
s
1n
x:
x : m
x
= m
CHAPTER 8 Confidence Intervals
Confidence Intervals
A confidence interval about with
known is provided the population from
which the sample was drawn is normal or the sample size
is large
A confidence interval about with
unknown is provided the population from
which the sample was drawn is normal or the sample size
is large Note: is computed using
degrees of freedom.
A confidence interval about p is
provided
A confidence interval about is
provided the population
from which the sample was drawn is normal.
1n - 12s
2
x
a/2
2
6 s
2
6
1n - 12s
2
x
1-a/2
2
s
2
11 - a2
#
100%
np
N
11 - p
N
2 10. p
N
; z
a/2
#
C
p
N
11 - p
N
2
n
11 - a2
#
100%
n - 1 t
a/2
1n 302.
x ; t
a/2
#
s
1n
s m 11 - a2
#
100%
1n 302.
x ; z
a/2
#
s
1n
s m 11 - a2
#
100%
Sample Size
To estimate the population mean with a margin of error E
at a level of confidence requires a sample
of size rounded up to the next integer.
To estimate the population proportion with a margin of
error E at a level of confidence requires a
sample of size rounded up to the
next integer, where is a prior estimate of the population
proportion.
To estimate the population proportion with a margin of
error E at a level of confidence requires a
sample of size rounded up to the next
integer when no prior estimate of p is available.
n = 0.25a
z
a/2
E
b
2
11 - a2
#
100%
p
N
n = p
N
11 - p
N
2a
z
a/2
E
b
2
11 - a2
#
100%
n = a
z
a/2
#
s
E
b
2
11 - a2
#
100%
CHAPTER 9 Hypothesis Testing
provided that and
the sample size is less than 5% of the population size
follows the distribution with
degrees of freedom provided that the population from
which the sample was drawn is normal.
n - 1 x
2
- x
2
=
1n - 12s
2
s
0
2
1n 6 0.05N2.
np
0
11 - p
0
2 10 z =
p
N
- p
0
C
p
0
11 - p
0
2
n
Tables and Formulas
for Sullivan, Statistics: Informed Decisions Using Data
2004 Pearson Education, Inc.
Test Statistics
provided that the population from which the
sample was drawn is normal or the sample size is large
follows Students t-distribution with
degrees of freedom provided that the population from
which the sample was drawn is normal or the sample size
is large 1n 302.
n - 1 t =
x - m
0
s
^
1n
1n 302.
z =
x - m
0
s
^
1n
CHAPTER 10 Inferences on Two Samples
Test statistic for matched pairs data:
where is the mean and is the standard deviation of
the differenced data.
Confidence interval for matched pairs data:
Note: is found using degrees of freedom.
Test statistic comparing two means (independent
sampling):
Confidence interval for the difference of two means
(independent samples):
Note: is found using the smaller of or
degrees of freedom.
n
2
- 1 n
1
- 1 t
a/2
Upper Bound: 1x
1
- x
2
2 + t
a/2
C
s
1
2
n
1
+
s
2
2
n
2

Lower Bound: 1x
1
- x
2
2 - t
a/2
C
s
1
2
n
1
+
s
2
2
n
2
t =
1x
1
- x
2
2 - 1m
1
- m
2
2
C
s
1
2
n
1
+
s
2
2
n
2
n - 1 t
a/2
Upper Bound: d + t
a/2
#
s
d
1n

Lower Bound: d - t
a/2
#
s
d
1n
s
d
d
t =
d
s
d
^
1n
Test statistic comparing two population proportions:
\[ z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})}\,\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \qquad \text{where } \hat{p} = \frac{x_1 + x_2}{n_1 + n_2}. \]
Confidence interval for the difference of two proportions:
\[ \text{Lower Bound: } (\hat{p}_1 - \hat{p}_2) - z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} \]
\[ \text{Upper Bound: } (\hat{p}_1 - \hat{p}_2) + z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} \]
Test statistic for comparing two population standard deviations:
\[ F = \frac{s_1^2}{s_2^2} \]
Finding a critical F for the left tail:
\[ F_{1-\alpha,\,n_1-1,\,n_2-1} = \frac{1}{F_{\alpha,\,n_2-1,\,n_1-1}} \]
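A small Python sketch of the two-proportion test statistic and confidence interval follows. The function names are hypothetical. Note that the test statistic uses the pooled estimate $\hat{p}$, while the confidence interval uses the two sample proportions separately, exactly as in the formulas above.

```python
import math
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for comparing two proportions, using the pooled estimate."""
    p1, p2 = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pooled * (1 - p_pooled)) * math.sqrt(1 / n1 + 1 / n2)
    return (p1 - p2) / se


def two_proportion_ci(x1, n1, x2, n2, alpha):
    """(lower, upper) bounds of the (1 - alpha) interval for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - margin, (p1 - p2) + margin


if __name__ == "__main__":
    print(two_proportion_z(60, 100, 40, 100))
    print(two_proportion_ci(60, 100, 40, 100, 0.05))
```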


Chapter 10 Inferences on Two Samples (Continued)

To estimate the difference in population proportions with a margin of error $E$ at a $(1 - \alpha) \cdot 100\%$ level of confidence requires a sample of size:

\[ n = n_1 = n_2 = \left[\hat{p}_1(1 - \hat{p}_1) + \hat{p}_2(1 - \hat{p}_2)\right]\left(\frac{z_{\alpha/2}}{E}\right)^2 \]

rounded up to the next integer, where $\hat{p}_1$ and $\hat{p}_2$ are prior estimates of the population proportions $p_1$ and $p_2$, respectively.

If prior estimates of $p_1$ and $p_2$ are unavailable, the sample size is:

\[ n = n_1 = n_2 = 0.5\left(\frac{z_{\alpha/2}}{E}\right)^2 \]

rounded up to the next integer.
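The sample-size rule for a difference of two proportions can be sketched in Python as below. The function name is hypothetical, and $z_{\alpha/2}$ is computed with `statistics.NormalDist` instead of being read from a table.

```python
import math
from statistics import NormalDist


def n_for_two_proportions(E, alpha, p1_hat=None, p2_hat=None):
    """Common sample size n = n1 = n2 for estimating p1 - p2 within E.

    If either prior estimate is missing, the 0.5 * (z/E)^2 form is used.
    The result is rounded up to the next integer.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    if p1_hat is None or p2_hat is None:
        factor = 0.5
    else:
        factor = p1_hat * (1 - p1_hat) + p2_hat * (1 - p2_hat)
    return math.ceil(factor * (z / E) ** 2)


if __name__ == "__main__":
    print(n_for_two_proportions(0.03, 0.05))
    print(n_for_two_proportions(0.03, 0.05, p1_hat=0.3, p2_hat=0.4))
```

The 0.5 in the no-prior case is simply $0.25 + 0.25$, the largest possible value of $\hat{p}_1(1 - \hat{p}_1) + \hat{p}_2(1 - \hat{p}_2)$.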

CHAPTER 11 Chi-square Procedures
Chi-Square Test Statistic
\[ \chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}} = \sum \frac{(O_i - E_i)^2}{E_i}, \qquad i = 1, 2, \ldots, k \]
(1) All expected frequencies are greater than or equal to 1 and (2) no more than 20% of the expected frequencies are less than 5.
Use $k - 1$ degrees of freedom for goodness of fit.
Use $(r - 1)(c - 1)$ degrees of freedom when testing for independence or homogeneity of proportions (r is the number of rows, c is the number of columns).
Expected Counts (when testing for goodness of fit)
\[ E_i = \mu_i = np_i \quad \text{for } i = 1, 2, \ldots, k \]
Expected Frequencies (when testing for independence or homogeneity of proportions)
\[ \text{Expected frequency} = \frac{(\text{row total})(\text{column total})}{\text{table total}} \]
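The chi-square computations above can be illustrated with a short Python sketch. The function names and the example counts are made up for this illustration.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all cells (flat lists)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def expected_frequencies(table):
    """Expected counts for a contingency table given as a list of rows.

    Each cell is (row total)(column total) / (table total).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def conditions_met(expected):
    """Check: all expected counts >= 1 and at most 20% of them below 5."""
    below_five = sum(1 for e in expected if e < 5)
    return all(e >= 1 for e in expected) and below_five <= 0.2 * len(expected)


if __name__ == "__main__":
    table = [[10, 20], [30, 40]]
    exp = expected_frequencies(table)
    flat_obs = [o for row in table for o in row]
    flat_exp = [e for row in exp for e in row]
    print(exp, conditions_met(flat_exp))
    # Degrees of freedom here would be (2 - 1)(2 - 1) = 1.
    print(chi_square_statistic(flat_obs, flat_exp))
```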
CHAPTER 12 Inference on the Least-squares Regression Model; ANOVA
Standard error of the estimate:
\[ s_e = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - 2}} = \sqrt{\frac{\sum \text{residuals}^2}{n - 2}} \]
Standard deviation of $b_1$:
\[ s_{b_1} = \frac{s_e}{\sqrt{\sum (x_i - \bar{x})^2}} \]
Test statistic for the slope of the least-squares regression line:
\[ t = \frac{b_1 - \beta_1}{s_e \Big/ \sqrt{\sum (x_i - \bar{x})^2}} = \frac{b_1 - \beta_1}{s_{b_1}} \]
Confidence Interval for the Slope of the Regression Line:
A $(1 - \alpha) \cdot 100\%$ confidence interval for the slope of the true regression line, $\beta_1$, is given by:
\[ \text{Lower Bound: } b_1 - t_{\alpha/2} \cdot \frac{s_e}{\sqrt{\sum (x_i - \bar{x})^2}} \qquad \text{Upper Bound: } b_1 + t_{\alpha/2} \cdot \frac{s_e}{\sqrt{\sum (x_i - \bar{x})^2}} \]
where $t_{\alpha/2}$ is computed with $n - 2$ degrees of freedom.
Confidence Interval about the Mean Response of $y$, $\hat{y}$:
A $(1 - \alpha) \cdot 100\%$ confidence interval for the mean response of $y$, $\hat{y}$, is given by
\[ \text{Lower Bound: } \hat{y} - t_{\alpha/2} \cdot s_e \sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}} \qquad \text{Upper Bound: } \hat{y} + t_{\alpha/2} \cdot s_e \sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}} \]
where $x^*$ is the given value of the predictor variable and $t_{\alpha/2}$ is the critical value with $n - 2$ degrees of freedom.
Prediction Interval about $\hat{y}$:
A $(1 - \alpha) \cdot 100\%$ prediction interval for the individual response of $y$, $\hat{y}$, is given by
\[ \text{Lower Bound: } \hat{y} - t_{\alpha/2} \cdot s_e \sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}} \qquad \text{Upper Bound: } \hat{y} + t_{\alpha/2} \cdot s_e \sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}} \]
where $x^*$ is the given value of the predictor variable and $t_{\alpha/2}$ is the critical value with $n - 2$ degrees of freedom.
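The regression inference formulas above can be sketched in Python as follows. This is an illustrative example under a few assumptions: the function names are hypothetical, the slope is computed as $\sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2$ (algebraically the same as $b_1 = r \cdot s_y / s_x$ from Chapter 4), the t statistic tests $\beta_1 = 0$, and the critical value $t_{\alpha/2}$ is passed in by the caller, for example from a t table.

```python
import math


def slope_inference(x, y):
    """Least-squares fit plus s_e, s_b1 and the t statistic for H0: beta1 = 0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # Sum of squared residuals, y_i minus predicted y_i.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))
    s_b1 = s_e / math.sqrt(sxx)
    return {"n": n, "x_bar": x_bar, "sxx": sxx, "b0": b0, "b1": b1,
            "s_e": s_e, "s_b1": s_b1, "t": b1 / s_b1}


def interval_margins(fit, x_star, t_crit):
    """Half-widths of the mean-response and prediction intervals at x_star."""
    core = 1 / fit["n"] + (x_star - fit["x_bar"]) ** 2 / fit["sxx"]
    mean_margin = t_crit * fit["s_e"] * math.sqrt(core)
    prediction_margin = t_crit * fit["s_e"] * math.sqrt(1 + core)
    return mean_margin, prediction_margin


if __name__ == "__main__":
    fit = slope_inference([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
    print(fit)
    # 3.182 is the tabled t value for 0.025 in the tail with 3 degrees of freedom.
    print(interval_margins(fit, x_star=3, t_crit=3.182))
```

The prediction margin is always wider than the mean-response margin because of the extra 1 under the square root.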
Test Statistic for One-Way ANOVA:
\[ F = \frac{\text{Between estimate of } \sigma^2}{\text{Within estimate of } \sigma^2} = \frac{MSB}{MSW} \]
where
\[ MSB = \frac{n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2 + \cdots + n_k(\bar{x}_k - \bar{x})^2}{k - 1} \]
\[ MSW = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2}{n - k} \]
Confidence Interval about any Population Mean, $\mu_i$:
For any population i, the confidence interval for $\mu_i$ is given by
\[ \text{Lower Bound: } \bar{x}_i - t_{\alpha/2} \cdot \frac{s_p}{\sqrt{n_i}} \qquad \text{Upper Bound: } \bar{x}_i + t_{\alpha/2} \cdot \frac{s_p}{\sqrt{n_i}} \]
where $t_{\alpha/2}$ is the critical t value with $n - k$ degrees of freedom and $s_p = \sqrt{MSW}$.
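A minimal Python sketch of the one-way ANOVA F statistic follows, built directly from the MSB and MSW formulas. The function name and the sample data are hypothetical, and $n$ is taken to be the total number of observations across all $k$ samples.

```python
from statistics import mean, variance


def one_way_anova(samples):
    """Return (MSB, MSW, F) for a list of samples (each a list of numbers)."""
    k = len(samples)
    n = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n
    # Between-sample variability, weighted by each sample size.
    msb = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples) / (k - 1)
    # Within-sample variability, a weighted average of the sample variances.
    msw = sum((len(s) - 1) * variance(s) for s in samples) / (n - k)
    return msb, msw, msb / msw


if __name__ == "__main__":
    print(one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```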
CHAPTER 13 Nonparametric Statistics
Test Statistic for a Runs Test for Randomness
Let n represent the sample size of which there are two mutually exclusive types
Let $n_1$ represent the number of observations of the first type
Let $n_2$ represent the number of observations of the second type
Let r represent the number of runs
Small-Sample Case
If $n_1 \le 20$ and $n_2 \le 20$, the test statistic in the runs test for randomness is r, the number of runs.
Large-Sample Case
If $n_1 > 20$ or $n_2 > 20$, the test statistic in the runs test for randomness is
\[ z = \frac{r - \mu_r}{\sigma_r} \]
where
\[ \mu_r = \frac{2n_1n_2}{n} + 1 \quad \text{and} \quad \sigma_r = \sqrt{\frac{2n_1n_2(2n_1n_2 - n)}{n^2(n - 1)}} \]
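The runs test statistic can be sketched in Python as below. This is an illustration only: the function name is hypothetical, and the sequence is assumed to contain exactly two distinct labels.

```python
import math


def runs_test_statistic(sequence):
    """Return ("r", r) in the small-sample case or ("z", z) in the large-sample case."""
    labels = sorted(set(sequence))
    if len(labels) != 2:
        raise ValueError("sequence must contain exactly two types")
    n1 = sum(1 for item in sequence if item == labels[0])
    n2 = sum(1 for item in sequence if item == labels[1])
    n = n1 + n2
    # A new run starts at every change between neighbouring observations.
    r = 1 + sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)
    if n1 <= 20 and n2 <= 20:
        return "r", r
    mu_r = 2 * n1 * n2 / n + 1
    sigma_r = math.sqrt(2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1)))
    return "z", (r - mu_r) / sigma_r


if __name__ == "__main__":
    print(runs_test_statistic("AABBBABB"))   # small-sample case
    print(runs_test_statistic("AB" * 25))    # large-sample case
```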
Test Statistic for a One-Sample Sign Test
The test statistic will depend upon the structure of the hypothesis test and the sample size.
Small-Sample Case ($n \le 25$)

Two-Tailed                  Left-Tailed                 Right-Tailed
$H_0: M = M_0$              $H_0: M = M_0$              $H_0: M = M_0$
$H_1: M \ne M_0$            $H_1: M < M_0$              $H_1: M > M_0$
The test statistic, k,      The test statistic, k,      The test statistic, k,
will be the smaller of      will be the number of       will be the number of
the number of minus         plus signs                  minus signs
signs or plus signs

Large-Sample Case ($n > 25$)
The test statistic, z, is
\[ z = \frac{(k + 0.5) - \dfrac{n}{2}}{\dfrac{\sqrt{n}}{2}} \]
where n is the number of minus and plus signs and k is obtained as described in the small sample case.
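A short Python sketch of the one-sample sign test statistic follows. The function name and the `tail` argument values ("two", "left", "right") are hypothetical conventions for this example. Observations equal to the hypothesized median $M_0$ receive no sign and are excluded from n.

```python
import math


def sign_test_statistic(data, m0, tail):
    """Return ("k", k) when n <= 25, otherwise ("z", z)."""
    plus = sum(1 for x in data if x > m0)
    minus = sum(1 for x in data if x < m0)
    n = plus + minus  # values equal to m0 carry no sign
    if tail == "two":
        k = min(plus, minus)
    elif tail == "left":
        k = plus
    elif tail == "right":
        k = minus
    else:
        raise ValueError("tail must be 'two', 'left' or 'right'")
    if n <= 25:
        return "k", k
    z = ((k + 0.5) - n / 2) / (math.sqrt(n) / 2)
    return "z", z


if __name__ == "__main__":
    print(sign_test_statistic([3, 5, 7, 9, 11, 4], m0=6, tail="two"))
    print(sign_test_statistic([1] * 10 + [9] * 20, m0=5, tail="two"))
```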
Test Statistic for the Wilcoxon Matched-Pairs Signed-Ranks Test
The test statistic will depend upon the size of the sample and the alternative hypothesis. Let n represent the sample size.
Small-Sample Case ($n \le 30$)

Two-Tailed                  Left-Tailed                 Right-Tailed
$H_0: M_D = 0$              $H_0: M_D = 0$              $H_0: M_D = 0$
$H_1: M_D \ne 0$            $H_1: M_D < 0$              $H_1: M_D > 0$
Test Statistic: T is the    Test Statistic:             Test Statistic:
smaller of $T_+$ or         $T = T_+$                   $T = |T_-|$
$|T_-|$

Large-Sample Case ($n > 30$)
Based upon the Central Limit Theorem, the test statistic is given by
\[ z = \frac{T - \dfrac{n(n + 1)}{4}}{\sqrt{\dfrac{n(n + 1)(2n + 1)}{24}}} \]
where T is the test statistic from the small sample case.
Test Statistic for the Mann-Whitney Test
The test statistic will depend upon the size of the samples from each population. Let $n_1$ represent the sample size for population X and $n_2$ represent the sample size for population Y.
Small-Sample Case ($n_1 \le 20$ and $n_2 \le 20$)
If S is the sum of the ranks corresponding to the sample from population X, then the test statistic, T, is given by
\[ T = S - \frac{n_1(n_1 + 1)}{2} \]
Note: The value of S is always obtained by summing the ranks of the sample data that correspond to $M_x$ in the hypothesis.
Large-Sample Case ($n_1 > 20$ or $n_2 > 20$)
Based upon the Central Limit Theorem, the test statistic is given by
\[ z = \frac{T - \dfrac{n_1n_2}{2}}{\sqrt{\dfrac{n_1n_2(n_1 + n_2 + 1)}{12}}} \]
Test Statistic for Spearman's Rank Correlation Test
The test statistic will depend upon the size of the sample, n, and the sum of the squared differences.
\[ r_s = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)} \]
where $d_i$ = the difference in the ranks of the two observations in the $i$th ordered pair.
Test Statistic for the Kruskal-Wallis Test
The test statistic for the Kruskal-Wallis Test is
\[ H = \frac{12}{N(N + 1)} \sum \frac{1}{n_i}\left[R_i - \frac{n_i(N + 1)}{2}\right]^2 \]
A computational formula for the test statistic is
\[ H = \frac{12}{N(N + 1)}\left[\frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \cdots + \frac{R_k^2}{n_k}\right] - 3(N + 1) \]
where
$R_1^2$ is the sum of the ranks, squared, for the first sample,
$R_2^2$ is the sum of the ranks, squared, for the second sample, and so on.
$n_1$ is the number of observations in the first sample,
$n_2$ is the number of observations in the second sample, and so on.
N is the total number of observations ($N = n_1 + n_2 + \cdots + n_k$).
k is the number of populations being compared.
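The rank-based statistics for the Mann-Whitney, Spearman, and Kruskal-Wallis tests can be sketched in Python as follows. This is a simplified illustration: the function names are hypothetical, and the ranking helper assumes there are no tied values (ties would need averaged ranks, which this sketch does not implement).

```python
def rank_map(values):
    """Map each value to its rank (1 = smallest). Assumes no ties."""
    if len(set(values)) != len(values):
        raise ValueError("this sketch does not handle tied values")
    return {v: i + 1 for i, v in enumerate(sorted(values))}


def mann_whitney_t(x, y):
    """Small-sample Mann-Whitney statistic T = S - n1(n1 + 1)/2."""
    ranks = rank_map(list(x) + list(y))
    s = sum(ranks[v] for v in x)  # sum of ranks for the sample from X
    n1 = len(x)
    return s - n1 * (n1 + 1) / 2


def spearman_rs(x, y):
    """Spearman rank correlation r_s = 1 - 6*sum(d_i^2) / (n(n^2 - 1))."""
    rx, ry = rank_map(x), rank_map(y)
    n = len(x)
    d2 = sum((rx[a] - ry[b]) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


def kruskal_wallis_h(samples):
    """Kruskal-Wallis H using the computational formula."""
    pooled = [v for s in samples for v in s]
    ranks = rank_map(pooled)
    big_n = len(pooled)
    total = sum(sum(ranks[v] for v in s) ** 2 / len(s) for s in samples)
    return 12 / (big_n * (big_n + 1)) * total - 3 * (big_n + 1)


if __name__ == "__main__":
    print(mann_whitney_t([1, 3, 5], [2, 4, 6]))
    print(spearman_rs([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))
    print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```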


Test Statistic for One-way ANOVA:

\[ F = \frac{\text{Between estimate of } \sigma^2}{\text{Within estimate of } \sigma^2} = \frac{MSB}{MSW} \]

where

\[ MSB = \frac{n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2 + \cdots + n_k(\bar{x}_k - \bar{x})^2}{k - 1} \]

\[ MSW = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2}{(n_1 + n_2 + \cdots + n_k) - k} \]

Confidence Interval about any Population Mean, $\mu_i$:
For any population i, the confidence interval for $\mu_i$ is given by

\[ \text{Lower Bound: } \bar{x}_i - t_{\alpha/2}\frac{s_p}{\sqrt{n_i}} \qquad \text{Upper Bound: } \bar{x}_i + t_{\alpha/2}\frac{s_p}{\sqrt{n_i}} \]

where $t_{\alpha/2}$ is the critical t value with $(n_1 + n_2 + \cdots + n_k) - k$ degrees of freedom and $s_p = \sqrt{MSW}$.




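Least Significant Difference (LSD):

\[ LSD = t_{\alpha/2} \cdot s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \]

For $n_1 = n_2 = n$,

\[ LSD = t_{\alpha/2} \cdot s_p\sqrt{\frac{2}{n}} \]

where t has df = $(n_1 + n_2 + \cdots + n_k - k)$ and $s_p = \sqrt{MSW}$, with MSW taken from the within-samples line of the ANOVA table.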
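The LSD formula can be illustrated with a brief Python sketch. The function name is hypothetical, and the critical value $t_{\alpha/2}$ is assumed to be looked up by the caller (for example from a t table) rather than computed here.

```python
import math


def least_significant_difference(t_crit, msw, n1, n2):
    """LSD = t_{alpha/2} * s_p * sqrt(1/n1 + 1/n2), with s_p = sqrt(MSW)."""
    s_p = math.sqrt(msw)
    return t_crit * s_p * math.sqrt(1 / n1 + 1 / n2)


if __name__ == "__main__":
    # 2.447 is the tabled t value for 0.025 in the tail with 6 degrees of freedom,
    # matching k = 3 samples of size 3 (9 - 3 = 6).
    lsd = least_significant_difference(2.447, msw=1.0, n1=3, n2=3)
    print(lsd)
    # Two sample means would be flagged as different if they differ by more than lsd.
    print(abs(6 - 2) > lsd)
```

With equal sample sizes the expression reduces to $t_{\alpha/2} \cdot s_p\sqrt{2/n}$, the second form given above.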
Source of     Sum of     Degrees of                 Mean Square                           F-Statistic
Variation     Squares    Freedom
Between       SSB        k - 1                      MSB = SSB/(k - 1)                     F = MSB/MSW
Within        SSW        n1 + n2 + ... + nk - k     MSW = SSW/(n1 + n2 + ... + nk - k)
Total         SS         n1 + n2 + ... + nk - 1
