
STATISTICS AND PROBABILITY

Lesson 1: Random Variables
A random variable is a type of variable whose value depends upon the numerical outcomes of a certain random phenomenon.
❖ A random variable has no specific value
❖ It can take on many values

Difference between random and algebraic variables:
• Random Variables
➢ Have a set of values that could be the resulting outcome of a random experiment
➢ Denoted by capital letters
• Algebraic Variables
➢ Represent the value of an unknown quantity in an algebraic equation that can be calculated

Properties of the probabilities of a random variable:
• Each probability must be between 0 and 1
➢ 0 ≤ P(x) ≤ 1
• The sum of all probabilities is equal to 1
➢ Σ P(x) = 1

A random variable can be either discrete or continuous.
• Discrete
➢ Takes a countable number of distinct values
➢ Example: The number of students in Grade-11 STEM
➢ These random variables are whole numbers in quantity
• Continuous
➢ Can take on any value in a given interval
➢ Example: The amount of water consumed by an athlete
➢ These random variables are measurable and can be expressed not only by whole numbers

Lesson 2: Random Sampling
➢ Population is defined as all subjects under study
➢ Sample is a subset of the population

Why do we use a sample?
1. It saves the researcher's time and money
2. It enables the researcher to get information that they might not be able to obtain otherwise
3. It enables the researcher to get more detailed information about a particular subject

In random sampling, the basic requirement is that all possible samples of a given size have an equal chance of being selected from the population.

Simple Random Sampling
➢ Takes a small, random portion of the entire population to represent the entire data set
➢ Each member has an equal probability of being chosen
➢ Examples:
1. Random lottery
2. Tossing a coin
3. Rolling a die
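
Simple random sampling can be sketched in a few lines of Python. This is only an illustration: the population of 100 numbered students, the sample size of 10, and the helper name simple_random_sample are assumptions, not part of the lesson.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every member is equally likely to be chosen."""
    rng = random.Random(seed)  # the seed is used only to make the draw repeatable
    return rng.sample(list(population), n)


# Hypothetical population: students numbered 1 to 100
students = range(1, 101)
sample = simple_random_sample(students, 10, seed=1)
print(sample)
```

Numbering the members first, as done here with range, is the step that becomes time-consuming when the population is very large.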
➢ Limitation of simple random sampling: if the population is extremely large, it is time-consuming to number the members and select the sample.

Systematic Random Sampling
➢ A sample is obtained by numbering each element in the population, selecting some random starting point, and then selecting every kth element (third, or fifth, or tenth, etc.)
➢ The advantage of systematic sampling is the ease of selecting the sample elements. Also, in many cases, a numbered list of the population units may already exist

Stratified Random Sampling
➢ A sample is obtained by dividing the population into subgroups, called strata (plural of stratum), based on shared characteristics (e.g., race, gender, identity, location, etc.); random samples are then selected from each stratum.
➢ Example: Number of students at CNHS based on their strand
➢ Advantage: It ensures representation of all population subgroups that are important to the study
➢ Disadvantage: It cannot be used when researchers can't confidently classify every member of the population into a subgroup

2 types of Stratified Sampling
a. Proportionate
➢ In this approach, the sample size taken from each stratum is proportional to the size of that stratum in the population
b. Disproportionate
➢ In this approach, the sample size taken from each stratum is not proportional to the size of that stratum in the population

Cluster Random Sampling
➢ A sample is obtained by selecting a pre-existing or natural group, called a cluster, and using the members in the cluster for the sample
• 3 Advantages of using Cluster Sampling
1. Can reduce costs,
2. Can simplify fieldwork, and
3. It is convenient
➢ The major disadvantage of cluster sampling is that the elements may not have the same variations in characteristics as elements selected individually from a population

Non-Probability Sampling
• Convenience
• Purposive

Lesson 3: Mean and Variance
➢ Mean = average
➢ To get a reliable mean by experiment, we would have to repeat the experiment (for example, toss the coins) infinitely many times, BUT that is impossible. So, we use this formula instead: μ = Σ [x • P(x)]
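
The mean formula μ = Σ [x • P(x)] can be checked with a short Python sketch. The fair six-sided die used here is an assumed example chosen for illustration, and the function name mean_of_distribution is hypothetical.

```python
from fractions import Fraction


def mean_of_distribution(dist):
    """Return mu = sum of x * P(x) for a dict mapping each value x to P(x)."""
    if sum(dist.values()) != 1:
        raise ValueError("the probabilities must sum to 1")
    return sum(x * p for x, p in dist.items())


# Assumed example: one roll of a fair six-sided die, P(x) = 1/6 for each face
die = {face: Fraction(1, 6) for face in range(1, 7)}
mu = mean_of_distribution(die)
print(mu)  # 7/2
```

Using Fraction keeps the probabilities exact, so the check that they sum to 1 does not fail because of rounding.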
For example, x = number of heads when tossing 2 coins

x    P(x)    x • P(x)
0    1/4     0
1    2/4     2/4
2    1/4     2/4

μ = 4/4 = 1

➢ Variance = measure of how data points differ from the mean (how data varies)
❖ If the variance is high, data points are very spread out
❖ If the variance is low, data points are close to the mean

σ² = Σ [x² • P(x)] − μ²

Example:

x    P(x)    x • P(x)    x² • P(x)
0    1/4     0           0
1    2/4     2/4         2/4
2    1/4     2/4         4/4

Σ [x² • P(x)] = 6/4 = 1.5, so σ² = 1.5 − 1² = 0.5

Lesson 4: Normal Distribution
If a random variable has a probability distribution whose graph is continuous, bell-shaped, and symmetric, it is called a 'normal distribution'.
• The graph is called a normal distribution curve

Properties of Normal Distribution
1. A normal distribution curve is bell-shaped.
2. The mean, median, and mode are equal and are located at the center of the distribution.
3. A normal distribution curve is unimodal (it has only one mode).
4. The curve is symmetric about the mean.
5. The curve is continuous; that is, there are no gaps or holes. For each value of x, there is a corresponding value of y.
6. The curve never touches the x-axis.

The total area under a normal distribution curve is equal to 1.00 or 100%. This fact may seem unusual, since the curve never touches the x-axis, but one can prove it mathematically by calculus.

The Standard Normal Distribution
The standard normal distribution is a normal distribution with a mean μ of 0 and a standard deviation σ of 1.

Finding the area under the standard normal curve:
❖ To the left of any value
➢ Look up the z value in the table and use the area given.
❖ To the right of any value
➢ Look up the z value in the table and subtract the given area from 1.
➢ Or, look up the z value with the opposite sign and use the area given. If z is negative, look at the positive side.

Applications of the Standard Normal Curve
Formula: z = (X − μ) / σ
❖ If two X values are given, use the formula on each one. Then subtract the smaller area from the larger area of the two z values.
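
The table look-ups described above can be reproduced with Python's standard library, which gives the same left-tail areas as a z-table. This is a sketch only: the values μ = 100, σ = 15 and the X values 85 and 115 are assumed for illustration, and the helper names are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_score(x, mu, sigma):
    """z = (X - mu) / sigma"""
    return (x - mu) / sigma


def area_left(z):
    """Area under the curve to the left of z (what the z-table gives)."""
    return std_normal.cdf(z)


def area_right(z):
    """Area to the right of z: subtract the left area from 1."""
    return 1 - std_normal.cdf(z)


def area_between(x1, x2, mu, sigma):
    """Area between two X values: larger left area minus smaller left area."""
    z_low, z_high = sorted((z_score(x1, mu, sigma), z_score(x2, mu, sigma)))
    return area_left(z_high) - area_left(z_low)


# Assumed example: mu = 100, sigma = 15, area between X = 85 and X = 115
print(round(area_between(85, 115, 100, 15), 4))  # 0.6827
```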
Lesson 5: Sampling Distribution of the Sample Mean
❖ A parameter is a measure computed from the Population
Symbols:
• population mean: mu (μ)
• population s.d.: sigma (σ)
• population variance: (σ²)

❖ A statistic is a measure computed from the Sample
Symbols:
• sample mean: xbar (x̄)
• sample s.d.: (s)
• sample variance: (s²)

Combination Formula:
➢ NCn = N! / [n! (N − n)!]
wherein,
• N = population size (total number of elements in the population)
• n = sample size
• NCn = number of possible samples of size n

Formula for Mean of the Sampling Distribution of the Sample Mean:
➢ μ_x̄ = Σ [x̄ • P(x̄)]

Formula for Variance of the Sampling Distribution of the Sample Mean:
➢ σ²_x̄ = Σ (x̄ − μ_x̄)² / NCn
The sum runs over all possible samples, so the denominator is the number of possible samples (NCn), not the population size.

Lesson 6: Estimation
➢ Descriptive statistics
❖ to describe
➢ Inferential statistics
❖ to infer

Inferential Statistics
• Point Estimation
➢ Single value that estimates the parameter value
• Interval Estimation
➢ Range of values that is likely to contain the parameter value

Confidence Interval
Confidence Level
➢ Refers to the probability that the confidence interval contains the true population parameter

Level of Significance
➢ Probability that the confidence interval does not contain the true population parameter
➢ Opposite of confidence level
❖ Symbol: alpha (α)

Confidence Level
The three most commonly used confidence levels are 90%, 95%, and 99%
➢ If you desire to be more confident, such as 99% or 99.5% confident, then you must make the interval larger.

Confidence Interval of the Mean for a Specific α when σ is known:
➢ X̄ ± Z_CL • (σ / √n)
wherein,
➢ X̄ = sample mean
➢ Z_CL = z-value corresponding to a confidence level
➢ σ = population standard deviation
➢ n = sample size

❖ The z-values corresponding to a confidence level are also called the critical values.
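
The confidence interval X̄ ± Z_CL • (σ / √n) can be sketched in Python as follows. The critical value is computed from the standard normal distribution instead of being read from a table. The data (x̄ = 50, σ = 10, n = 25) and the function names are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence_level):
    """Two-tailed critical value: the z that leaves alpha / 2 in each tail."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)


def z_confidence_interval(x_bar, sigma, n, confidence_level=0.95):
    """Return (lower, upper) = x_bar -/+ z * sigma / sqrt(n), sigma known."""
    margin = z_critical(confidence_level) * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


# Assumed data: sample mean 50, population s.d. 10, sample size 25, 95% level
low, high = z_confidence_interval(50, 10, 25, 0.95)
print(round(low, 2), round(high, 2))  # 46.08 53.92
```

Raising the confidence level to 0.99 in this sketch widens the interval, which matches the note that more confidence requires a larger interval.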
Critical Values & Confidence Level

CL = confidence level, SL = significance level (α), CV = critical value

CL     SL      CV
90%    0.10    1.65 (more precisely 1.645)
95%    0.05    1.96
99%    0.01    2.58

Margin of Error
Also called the maximum error of the estimate, it is the maximum likely difference between the point estimate of a parameter and the actual value of the parameter.
❖ Formula: E = Z_CL • (σ / √n)
❖ Upper limit:
➢ X̄ + E
❖ Lower limit:
➢ X̄ − E

Is the σ known?
• If YES, use z-test
• If NO, use t-test

Formula of Sample Size derived from the formula of Margin of Error:
➢ n = (Z • σ / E)²

Lesson 7: T-Distribution
The z-statistic is used when the population standard deviation is known and the sample size is equal to or more than 30.
❖ The z-distribution is narrower and more peaked than t
❖ while the t-distribution is more spread out than z

The d.f. (degrees of freedom) can be obtained by subtracting 1 from the sample size: d.f. = n − 1.

If n < 30 and σ is unknown, the confidence interval for the population mean is
➢ X̄ − t_(α/2) • (s / √n) < μ < X̄ + t_(α/2) • (s / √n)

Margin of Error
➢ E = t_(α/2) • (s / √n)

If looking for:
❖ One tail = either positive or negative value
❖ Two tails = both positive and negative value

❖ However, if the tail is not stated, assume that it is looking for two tails.
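
The t-based interval and the sample-size formula can be sketched in Python as below. The t critical value is passed in as read from a t-table, because the standard library has no t-distribution. The data (n = 10, x̄ = 20, s = 4), the function names, and the rounding up of n to a whole number are assumptions for illustration. The value 2.262 is the usual table entry for d.f. = 9 at 95% confidence, two tails.

```python
from math import ceil, sqrt


def t_confidence_interval(x_bar, s, n, t_crit):
    """Return (lower, upper) for the mean when sigma is unknown.

    t_crit is t_(alpha/2) read from a t-table with d.f. = n - 1.
    """
    margin = t_crit * s / sqrt(n)  # margin of error E
    return x_bar - margin, x_bar + margin


def required_sample_size(z, sigma, e):
    """n = (z * sigma / E)^2, rounded up to the next whole number."""
    return ceil((z * sigma / e) ** 2)


# Assumed data: n = 10 (d.f. = 9), x_bar = 20, s = 4, 95% two-tailed
low, high = t_confidence_interval(20, 4, 10, 2.262)
print(round(low, 2), round(high, 2))  # 17.14 22.86

# Assumed target: z = 1.96, sigma = 10, desired margin of error E = 2
print(required_sample_size(1.96, 10, 2))  # 97
```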
