
STATISTICS TERMS AND NOTES YAYAYAYAY

(almost breaaak)
SAMPLING AND SAMPLING DISTRIBUTIONS
- we use sample data to estimate the characteristics (mean, sd, variance, etc.) of a population
- inferential statistics
  - using info from a sample to reach conclusions about the pop.
- examines how sample means and proportions from a pop. will tend to be distributed

SAMPLING DISTRIBUTIONS
- the probability distribution of a sample statistic (e.g. the sample mean) over all possible samples of the same size
STANDARD ERROR
- standard deviation of the sampling distribution of the sample statistic

- any mean based on a sample drawn from a population is expected to take different values from sample to sample, meaning that the sample mean is a random variable
- sampling distribution of the sample means is the probability distribution of the sample means
- mean of the sampling distribution of the sample means = population mean (μ)
- standard error < population standard deviation (σ) whenever n > 1, since standard error = σ/√n
- finite population correction factor (fpc)
  - used when the population is finite
  - when the pop is infinite or very large, the fpc is approximately 1 (so you no longer multiply it to the standard error in that case)
- why is the standard error smaller than σ? because each sample mean is the average of several values from the population, so values that are too small or too large get averaged in with values that are closer to the middle
- if the orig pop is normal, the sampling distribution of the sample mean will also be normal
- if it is not normal, the sampling distribution will be approx. normal as long as the sample size is at least 30 (why? because according to the Central Limit Theorem, the sampling distribution of the mean approaches normality as the sample size increases)

STANDARDIZATION
- like normal standardization (z-score) but use the standard error instead of the standard deviation: z = (x̄ - μ) / (σ/√n) (don't forget the fpc if the pop is finite)

SAMPLING DISTRIBUTION OF PROPORTION
- sample proportion is the proportion of successes in a sample consisting of n trials
  - so that's number of successes over number of trials
- p is the sample proportion, π is the population proportion
- reminder: don't forget to use the correction factor even for proportions as long as the pop is finite, but also remember that the sample should be at least 5% of the pop when using this
- before using the z-score to find a probability, make sure that both nπ and n(1-π) are ≥ 5

ESTIMATION OF PARAMETERS
- OKAY THIS IS WHERE THREE DIFFERENT FORMULAS COME IN
- why do we do this? because it is impractical to collect data from the whole population

STATISTICAL INFERENCE
- process by which conclusions about parameters (pop. characteristics) are made based on sample data
- two areas: estimation and hypothesis testing

POINT AND CONFIDENCE INTERVAL ESTIMATION
- estimate
  - value or a range of values that approximate a parameter
  - based on sample statistics computed from sample data
- point estimate
  - single number that estimates the exact value of the pop parameter of interest
- interval estimate
  - range of possible values that are likely to include the actual pop parameter
- confidence interval
  - when the interval estimate is associated with a degree of confidence that it actually includes the pop parameter
- unbiased estimate
  - the average of the estimates from many samples is equal to the true pop parameter
- biased estimate
  - the sample statistic systematically departs from the true pop value

POINT ESTIMATES

population parameter    unbiased estimator
mean, μ                 x̄
variance, σ²            s²
proportion, π           p

INTERVAL ESTIMATES
- interval estimates
  - interval limits
    - lower and upper values of the interval estimate
  - confidence interval
  - confidence coefficient
    - the proportion of such intervals that would include the pop parameter if the process leading to the interval were repeated a great many times
  - confidence level
    - expresses the degree of certainty that an interval will include the actual value of the pop parameter
  - accuracy
    - difference between the observed sample statistic and the actual value
    - a.k.a. estimation or sampling error
- if sigma is known, use the z-score and the pop standard dev. if not, use the t-value and the sample standard deviation. if it's a pop proportion, use the proportion formula

CONFIDENCE INTERVAL ESTIMATES
- for the mean: σ known
  - assumes that the pop is normal
  - either that or the sample size is at least 30
  - range of the confidence interval is defined by the sample statistic ± margin of error
- for the mean: σ unknown
  - n should still be at least 30
  - if n is less than 30, it also assumes that the sample is random and that the population is normally distributed
  - uses Student's t distribution
    - degrees of freedom = n - 1
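
For the mean with σ known, the "sample statistic ± margin of error" rule can be sketched in Python as below. This is only an illustration under assumptions: the function name `z_interval` and the numbers in the usage lines are made up, and the margin of error is taken as z times the standard error σ/√n. For the σ-unknown case, the same shape would apply with s in place of σ and a t critical value with n - 1 degrees of freedom in place of z; Python's standard library has no t distribution, so that case is not shown.

```python
import math
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for the mean when the population sd (sigma) is known.

    Hypothetical helper for illustration: interval = sample mean +/- z * sigma / sqrt(n).
    """
    if n <= 0 or sigma < 0 or not 0 < confidence < 1:
        raise ValueError("need n > 0, sigma >= 0 and 0 < confidence < 1")
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = sigma / math.sqrt(n)
    margin_of_error = z * standard_error
    return sample_mean - margin_of_error, sample_mean + margin_of_error


# Made-up numbers: sample mean 50, known sigma 10, sample size 100
lower, upper = z_interval(sample_mean=50.0, sigma=10.0, n=100, confidence=0.95)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")  # 95% CI: (48.04, 51.96)
```

The midpoint of the interval is the sample mean, and a larger n shrinks the standard error and therefore the interval.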
- for the population proportion
  - uses the normal distribution as an approximation to the binomial distribution whenever np and n(1-p) are both ≥ 5
  - the approximation is better for large values of n and whenever p is close to 0.5
  - the midpoint of the confidence interval is the sample proportion

SAMPLE SIZE ESTIMATION
- margin of error
  - maximum difference between the observed sample mean and the true value
- when the value of σ is unknown, it can be estimated using the standard deviation s from a prior sample
- ALWAYS ROUND UP THE VALUE OF N, OKAY?

ESTIMATING THE POP PROPORTION
- if π is known (or a prior estimate of it is available), you can use it in the formula
- if no approximation is known, just use p = 0.5 (p(1-p) is largest at 0.5, so this gives a sample size that is large enough whatever the actual proportion is)
- if the proportion is within a range, look at the upper and lower limits and choose the one closest to 0.5

CORRELATION AND REGRESSION ANALYSIS
- THIS IS THE LAST ONE
CORRELATION ANALYSIS
- measure of association between two continuous variables
- estimates a sample correlation coefficient, more specifically the Pearson product-moment correlation coefficient, denoted as r
- value varies from -1 to +1, quantifying the direction and strength
- the correlation coefficient measures the strength of the linear relationship between variables

REGRESSION ANALYSIS
- simple regression is used to examine the relationship between one dependent and one independent variable
- criterion variable = dependent variable
- predictor variable = independent variable
- can be used to predict the dependent variable when the independent variable is known
- regression line
  - a.k.a. the least squares line
  - plot of the expected value of the dependent variable for all values of the independent variable
  - line that minimizes the sum of squared residuals, best fitting the data in the scatterplot

SIMPLE LINEAR REGRESSION
- one of the most widely used statistical techniques, describing relationships with a straight line
- y = a + bx
- coefficient of determination, r²
  - measures how well the estimated reg. eq. fits the data

CORRELATION ≠ CAUSATION
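
To make the least squares line, r, and r² from the regression notes concrete, here is a minimal Python sketch. It is an illustration only: the function name `fit_line` and the small data set (hours studied vs. quiz score) are made up, and the formulas used are the usual least-squares ones (slope b = Sxy/Sxx, intercept a = ȳ - b·x̄, r = Sxy/√(Sxx·Syy)).

```python
import math


def fit_line(xs, ys):
    """Return (a, b, r) for the least squares line y = a + b*x and Pearson's r.

    Hypothetical helper for illustration, not taken from the notes.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length, at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    b = sxy / sxx                    # slope
    a = mean_y - b * mean_x          # intercept
    r = sxy / math.sqrt(sxx * syy)   # Pearson correlation coefficient
    return a, b, r


# Made-up data: predictor (independent) and criterion (dependent) variables
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

a, b, r = fit_line(hours, scores)
print(f"y = {a:.2f} + {b:.2f}x")
print(f"r = {r:.3f}, r^2 = {r * r:.2f}")
print(f"predicted score at x = 6: {a + b * 6:.2f}")
```

In simple linear regression r² is just the square of r, so here about 60% of the variation in the made-up scores is accounted for by the line. A high r² still says nothing about whether x causes y.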
