
statistics notes

1. Estimation: the process of using sample data to calculate a number
that can be used to represent an unknown population parameter
a. Point estimate: a single-value estimate of a population
parameter, computed from a sample. Using a point estimate alone
to judge the value of a population parameter carries considerable
uncertainty, because we do not expect the estimate to be exactly
equal to the parameter.

The sample mean $\bar{x}$ is an unbiased estimate of the population mean $\mu$.

The sample proportion is $\hat{p} = \frac{x}{n}$, where x is the number of successes in a sample of size n.

i. The estimator must be unbiased: the expected value (the mean of
the estimates obtained from samples of a given size) is equal
to the parameter being estimated
ii. Must be consistent: as the sample size increases, the value of
the estimator approaches the value of the parameter
being estimated
iii. Must be relatively efficient: of the available estimators, it
has the smallest variance
iv. Steps:
1. Have a definite target population and variable of
interest
2. Collect a sample from the population using a
statistically valid sampling procedure
3. Compute point estimate
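
The last step can be sketched in a few lines of Python. This is only an illustration: the data values and counts below are made up, and the variable names are not from the notes.

```python
from statistics import mean

# Hypothetical sample of 8 measurements (illustrative data only).
sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7]

# Point estimate of the population mean: the sample mean x-bar.
x_bar = mean(sample)

# Point estimate of a population proportion: p-hat = x / n,
# where x is the number of successes among n sampled items.
x, n = 36, 120  # hypothetical counts
p_hat = x / n

print(f"x_bar = {x_bar:.3f}, p_hat = {p_hat:.2f}")
```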
b. Interval estimate: a range of values used to estimate a
population parameter; this range may or may not contain the
actual value of the parameter being estimated
c. Confidence level (c): the probability $1 - \alpha$ that the
parameter is contained in the interval estimate, assuming that a
large number of samples is selected and that the estimation
process for the same parameter is repeated
i. Equal to the area under the standard normal curve between the
two critical values $-z_{\alpha/2}$ and $+z_{\alpha/2}$

d. Confidence interval: an interval, or range of values,
associated with a confidence level and used to estimate the true
value of the unknown population parameter; confidence means
degree of belief
i. For the population mean $\mu$: $\bar{x} - E < \mu < \bar{x} + E$
e. Sampling error: the difference between the point estimate and the
actual value of the parameter
i. For the mean, the sampling error is $\bar{x} - \mu$. Error occurs
in every sample; the aim is to calculate the maximum value of
the error, E.

f. Margin of error: the maximum error of estimate, given by

$E = z_{\alpha/2}\,\sigma_{\bar{x}} = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$

where $z_{\alpha/2}$ is the z-score that corresponds to the confidence level c
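
As a rough sketch of how the margin of error turns into a confidence interval for the mean, the snippet below uses Python's standard-library `NormalDist` to look up $z_{\alpha/2}$. The function name `z_interval` and the input numbers are assumptions made for illustration, and the sketch assumes the population standard deviation is known.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, c=0.95):
    """Confidence interval for a mean when sigma is known (illustrative helper)."""
    alpha = 1 - c
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    E = z * sigma / sqrt(n)                  # margin of error
    return x_bar - E, x_bar + E, E

# Hypothetical inputs: sample mean 50, sigma 8, n = 64, 95% confidence.
low, high, E = z_interval(x_bar=50.0, sigma=8.0, n=64, c=0.95)
print(f"E = {E:.3f}, interval = ({low:.2f}, {high:.2f})")
```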
g. t-distribution: used when the sample size is less than 30 and the population standard deviation $\sigma$ is unknown.
i. Bell-shaped and symmetric about the line through the
mean
ii. Family of curves determined by a (parameter) value of
df called degrees of freedom. The number df gives the
number of free choices left after a sample statistic
like the mean is calculated.
iii. The total area under a t-curve is equal to 1 (100%)
iv. The mean, median, and mode are all equal to 0.
v. As the degrees of freedom df increase, the t-distribution
approaches the normal distribution. After 30 degrees of
freedom, the t-distribution is very close to the standard
normal distribution.

$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$

$E = t_{\alpha/2}\,\frac{s}{\sqrt{n}}$

$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
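
A minimal sketch of a t-based interval follows, assuming the critical value $t_{\alpha/2}$ is read from a t-table (the standard library has no t-distribution). The sample values are invented; the table value 2.262 corresponds to c = 0.95 with df = n - 1 = 9.

```python
from math import sqrt
from statistics import mean, stdev

def t_interval(sample, t_crit):
    """Confidence interval for a mean using s and a table value t_crit."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)          # sample standard deviation (n - 1 divisor)
    E = t_crit * s / sqrt(n)   # margin of error
    return x_bar - E, x_bar + E

# Hypothetical sample of n = 10 observations.
sample = [4.2, 3.9, 4.5, 4.1, 4.0, 4.4, 3.8, 4.3, 4.1, 4.2]
T_CRIT_95_DF9 = 2.262  # t-table value for c = 0.95, df = 9

low, high = t_interval(sample, T_CRIT_95_DF9)
print(f"({low:.4f}, {high:.4f})")
```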

2. Confidence interval for the Population Proportions


a. Proportion statistic: $\hat{p}$ is the average of the responses

$\hat{p} = \frac{\sum b_i}{n}$

where each $b_i$ is either 1 or 0. Letting $x = \sum b_i$ be the
count of "Yes" responses recovers the previous point estimate
$\hat{p} = x/n$. The probability of failure is
$\hat{q} = 1 - \hat{p}$
b. The binomial distribution can be approximated by a normal
distribution when $n\hat{q} \ge 5$ and $n\hat{p} \ge 5$

c. When $n\hat{q} \ge 5$ and $n\hat{p} \ge 5$, the sampling distribution of $\hat{p}$
is approximately normal with a mean of $\mu_{\hat{p}} = p$ and a
standard error of $\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}$

d. The confidence interval for a population proportion p is

$\hat{p} - E < p < \hat{p} + E$, where $E = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$
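
The interval for a proportion can be sketched as below. The function name and the counts are hypothetical; the check on $n\hat{p}$ and $n\hat{q}$ mirrors the normal-approximation condition stated above.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(x, n, c=0.95):
    """Confidence interval for a population proportion (illustrative helper)."""
    p_hat = x / n
    q_hat = 1 - p_hat
    # Normal approximation requires n*p_hat >= 5 and n*q_hat >= 5.
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("need n*p_hat >= 5 and n*q_hat >= 5")
    z = NormalDist().inv_cdf(1 - (1 - c) / 2)  # z_{alpha/2}
    E = z * sqrt(p_hat * q_hat / n)            # margin of error
    return p_hat - E, p_hat + E

# Hypothetical survey: 120 "Yes" responses out of 400.
low, high = proportion_interval(x=120, n=400, c=0.95)
print(f"({low:.4f}, {high:.4f})")
```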

3. Determining the sample size


a. As the level of confidence increases, the confidence interval widens,
which decreases the precision of our estimates.
i. Improve precision without decreasing level of confidence
by increasing the sample size n.
ii. What we want: n is large enough so as to guarantee a
certain level of confidence for a given margin of error E
b. Minimum sample size for the estimation of the population
mean $\mu$

$n = \left( \frac{z_{\alpha/2}\,\sigma}{E} \right)^2$
c. Minimum sample size for the estimation of the population
proportion p

$n = \hat{p}\,\hat{q} \left( \frac{z_{\alpha/2}}{E} \right)^2$
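
Both sample-size formulas can be tried out with a short sketch. The helper names are made up, the result is rounded up so the margin of error is not exceeded, and the default `p_hat=0.5` is an assumption (the most conservative choice when no preliminary estimate is available), not something stated in these notes.

```python
from math import ceil
from statistics import NormalDist

def z_crit(c):
    """Critical value z_{alpha/2} for confidence level c."""
    return NormalDist().inv_cdf(1 - (1 - c) / 2)

def n_for_mean(sigma, E, c=0.95):
    """Minimum n to estimate a mean within margin of error E."""
    return ceil((z_crit(c) * sigma / E) ** 2)

def n_for_proportion(E, c=0.95, p_hat=0.5):
    """Minimum n to estimate a proportion within E (p_hat=0.5 is an assumption)."""
    return ceil(p_hat * (1 - p_hat) * (z_crit(c) / E) ** 2)

# Hypothetical targets: sigma = 15, E = 3; and E = 0.05 for a proportion.
print(n_for_mean(sigma=15, E=3), n_for_proportion(E=0.05))
```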

4. Hypothesis testing
a. Statistical hypothesis: conjecture or supposition about a
population parameter
i. Null hypothesis $H_0$: there is no difference between the
parameter and a particular value
ii. Alternative hypothesis $H_1$: there is a difference between the
parameter and a particular value

Two-tailed           Right-tailed         Left-tailed
$H_0: \mu = K$       $H_0: \mu = K$       $H_0: \mu = K$
$H_1: \mu \ne K$     $H_1: \mu > K$       $H_1: \mu < K$

b. A statistical test uses data from a sample to decide whether
the null hypothesis should be rejected.
c. Test statistic: the numerical value obtained from a statistical test.

                       $H_0$ is true        $H_0$ is false
Reject $H_0$           Type I error         Correct decision
Do not reject $H_0$    Correct decision     Type II error

d. Reject the null hypothesis when the sample statistic you derive
from the sampling distribution is unusual (if the probability of
occurrence is very small).
e. Level of significance $\alpha$: the maximum allowable probability
of committing a Type I error; decrease this value to reduce the
probability of committing a Type I error. The probability of a
Type II error is denoted $\beta$.
i. Common values of alpha: 0.05, 0.10, and 0.01
ii. Researcher decides depending on the nature of study.
1. If life and death, use 0.01 or less
2. Social research, can be 5% or 10%
3. Setting the level of significance at a small value means
that the probability of rejecting a true null
hypothesis is small.
a. When $\alpha$ increases, $\beta$ decreases
4. After specifying level of significance, critical value is
selected from the table. This determines critical and
non-critical regions
f. Critical value: separates the critical region from the
non-critical region
i. Critical region (rejection region): the range of values of the
test value that indicates a significant difference between
the actual value of the parameter and its hypothesized
value. The null hypothesis must be rejected.
ii. Non-critical region (non-rejection region or acceptance
region): the range of values of the test value that indicates
that the difference was probably due to chance and that the
null hypothesis should not be rejected.
iii. The position of the critical value depends on the inequality
sign of the alternative hypothesis
1. If $H_1: \mu > \mu_0$, the critical value is on the right side of the mean.
2. The null hypothesis is rejected only if the sample mean
is greater than $\mu_0$.
iv. One-tailed test: indicates rejection of the null hypothesis when
the test value is in the critical region on one side of the mean
1. Right-tailed if the alternative hypothesis has the inequality
sign >
2. Left-tailed if it has the inequality sign <
g. Steps
i. Identify the null and alternative hypotheses
ii. Decide on level of significance
iii. Find critical value on the table
iv. Compute test statistic
v. Make decision
vi. Interpret results
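
The critical-value steps can be sketched as follows, assuming the test value is a z-score so the critical value comes from the standard normal distribution instead of a printed table. The function name and the tail labels "two", "right", and "left" are hypothetical.

```python
from statistics import NormalDist

def reject_null(test_value, alpha, tail):
    """Return True if a z test value falls in the critical region.

    tail: "two", "right", or "left" (labels chosen for this sketch).
    """
    std_normal = NormalDist()
    if tail == "two":
        cv = std_normal.inv_cdf(1 - alpha / 2)  # critical values are -cv and +cv
        return abs(test_value) > cv
    if tail == "right":
        cv = std_normal.inv_cdf(1 - alpha)      # critical value right of the mean
        return test_value > cv
    if tail == "left":
        cv = std_normal.inv_cdf(alpha)          # critical value left of the mean
        return test_value < cv
    raise ValueError("tail must be 'two', 'right', or 'left'")

# Hypothetical test values at alpha = 0.05.
print(reject_null(2.10, 0.05, "two"), reject_null(1.50, 0.05, "right"))
```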

5. Test for mean: z-test and t-test


a. Computation for test statistic

$\text{test statistic} = \frac{\text{observed value} - \text{expected value}}{\text{standard error}}$

Observed value = sample mean; expected value = population
mean when the null hypothesis is assumed to be true; the standard
error of the mean is computed as $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.

b. Using the z-test: $z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}}$
c. You may use the probability value (P-value) to make decisions
i. The P-value is the probability of getting a sample statistic, or
a more extreme sample statistic, in the direction of the
alternative hypothesis when the null hypothesis is true
ii. Reject the null hypothesis if P-value $\le \alpha$
iii. Do not reject the null hypothesis if P-value $> \alpha$
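
A short sketch of the P-value decision for a z-test is given below. It assumes $\sigma$ is known; the function name, tail labels, and input numbers are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

def z_test_p_value(x_bar, mu0, sigma, n, tail="two"):
    """Return (z, P-value) for a z-test on a mean (illustrative helper)."""
    z = (x_bar - mu0) / (sigma / sqrt(n))  # test statistic
    cdf = NormalDist().cdf
    if tail == "right":
        p = 1 - cdf(z)                     # area to the right of z
    elif tail == "left":
        p = cdf(z)                         # area to the left of z
    else:
        p = 2 * (1 - cdf(abs(z)))          # both tails
    return z, p

# Hypothetical right-tailed test: H0: mu = 50 versus H1: mu > 50.
alpha = 0.05
z, p = z_test_p_value(x_bar=52.0, mu0=50.0, sigma=8.0, n=64, tail="right")
decision = "reject H0" if p <= alpha else "do not reject H0"
print(f"z = {z:.2f}, P-value = {p:.4f}: {decision}")
```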

d. t-test for a mean where n < 30
i. The t-test for a mean is a statistical test for a population mean,
used when the population is normal or approximately normal,
$\sigma$ is unknown, and n < 30

$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$
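
To illustrate the t statistic, the sketch below computes t and df = n - 1 from a made-up sample and compares t with a critical value read from a t-table (1.833 is the one-tailed table value for $\alpha$ = 0.05, df = 9). The function name and data are assumptions for the example.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(sample, mu0):
    """Return (t, df) for a one-sample t-test on the mean."""
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
    return t, n - 1

# Hypothetical sample of n = 10; H0: mu = 4.0 versus H1: mu > 4.0.
sample = [4.2, 3.9, 4.5, 4.1, 4.0, 4.4, 3.8, 4.3, 4.1, 4.2]
t, df = t_statistic(sample, mu0=4.0)

T_CRIT = 1.833  # t-table value: right-tailed, alpha = 0.05, df = 9
reject = t > T_CRIT
print(f"t = {t:.3f}, df = {df}, reject H0: {reject}")
```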
