
WEEK 3, QUARTER 3 of STATISTICS 1

Introduction
to Inferential
Statistics &
Estimation
Prepared for B2026 by
MJAyaay, VCChua, LDEstonilo
LESSON OUTLINE
Inferential statistics
Estimation
Confidence Intervals
Margin of Error
Can we trust surveys?
Let’s talk about how surveys work…

Trustworthy sources publish survey results that can be a source of accurate information.
However, surveys have also been used for propaganda and manipulation, especially during the recent election.
Some have conducted poorly done surveys, which cast doubt on the validity of surveys in general.
Can we trust surveys?
Are conclusions from surveys accurate?
How can we distinguish properly done surveys from the problematic ones?
Let’s study one
survey.
According to the Philippine
Statistics Authority (PSA),
around 18.1% of Filipino
families are considered poor.
Poverty incidence is defined as the proportion
of Filipinos whose per capita income cannot
sufficiently meet the individual basic food and
non-food needs.

The PSA also reports that the 90% confidence interval is given by (17.8%, 18.5%).
What do you think this interval means?
Here’s
another one.
How can a survey with just
1200 respondents be
trusted to represent the
entire Philippine
population?
Why do credible survey firms declare a margin of error and a confidence level?
How are these computed?
SO, can we trust surveys?
When it is done properly, YES.
But what determines what is “proper”?

Credible surveys are trustworthy because they are based on the principles of
Inferential Statistics
Inferential Statistics
[Diagram: POPULATION, RANDOM SAMPLE, SAMPLE DATA, PROBABILITY, INFERENCE]
Inferential Statistics
The goal is to INFER: to make conclusions.
Draws conclusions about a population or process from sample data. It also provides a statement of how much confidence we can place in our conclusions (MOORE, ET AL., 2017).
Consists of those methods by which one makes inferences or generalizations about a population (WALPOLE, ET AL., 1998).
Inferential Statistics
How do we ensure that we can make appropriate inferences about the population using data from a sample?

The sample must be representative of the population.


RANDOMIZATION!
So how do we know which
surveys have proper methods?

Check if they randomly selected respondents.
Check if the group where respondents are from is representative of the entire country, e.g. online surveys are not as good because there is a significant section of the country with no internet access.
Check who funded the survey. Typically, the funder has an inherent agenda for commissioning the survey.
Check who posted the survey.
How, exactly, do we employ
inferential statistics?

ESTIMATION
The process of estimating the value of a parameter from information obtained from a sample (BLUMAN, 2014).

TEST OF SIGNIFICANCE
The process of assessing the evidence provided by the data in favor of some claim about the population parameters (MOORE, ET AL., 2017).
How much time each day on
average did you sleep the past
month? Include nap times.
You probably answered in two different ways…
Estimation

POINT ESTIMATE
a specific numerical value estimate of a parameter
Based on the data from our randomly selected sample, the proportion of the population who likes the color pink is 0.35

INTERVAL ESTIMATE
a range of values used to estimate the parameter. This estimate may or may not contain the value of the parameter being estimated.
Based on the data from our randomly selected sample, we are 95 percent sure that the interval from 0.32 to 0.38 contains the true proportion of the population who likes the color pink.
Estimation
There are two essential qualities that we
consider when making an estimate.

ACCURACY refers to how close an estimate is to the true value.
How close do you hit the target?

PRECISION refers to the reproducibility of an estimate.
How well can you replicate your results?
Estimation
[Figure: four targets comparing accuracy and precision. Panels: LOW ACCURACY, HIGH PRECISION; HIGH ACCURACY, HIGH PRECISION; LOW ACCURACY, LOW PRECISION; HIGH ACCURACY, LOW PRECISION. Axis labels: PRECISION (How well are we hitting the same spot?) and ACCURACY (How well are we hitting the target (true value)?)]
Estimation

POINT ESTIMATE
a specific numerical value estimate of a parameter
Based on the data from our randomly selected sample, the proportion of the population who likes the color pink is 0.35

ADVANTAGE
Precise value that is easy to interpret

DISADVANTAGE
Lack of knowledge on how accurate the estimate is
How sure are you that the number you got is close or not to the actual value?
Estimation
A point estimate is a single value used to estimate the parameter value.

Can multiple statistics be used as a point estimate for a population parameter?
YES. For example, any measure of center may be used to estimate the population mean.

Is there a “best” point estimate for a parameter?
YES. A statistic can be determined as a best point estimate for a parameter.
Estimation
Is there a “best” point estimate for a parameter?
YES. A statistic can be determined as a best point estimate for a parameter.
The best point estimate of a parameter is its corresponding statistic.
Properties of an Estimator (BLUMAN, 2014)

UNBIASED
The expected value or the mean of the estimates obtained from samples of a given size is equal to the parameter being estimated.

CONSISTENT
As sample size increases, the value of the estimator approaches the value of the parameter estimated.

RELATIVELY EFFICIENT
Of all the statistics that can be used to estimate a parameter, the relatively efficient estimator has the smallest variance.
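The three properties above can be explored with a small simulation. The sketch below is only an illustration, not part of the original lesson: it assumes a normal population with a hypothetical mean of 50 and standard deviation of 10, and compares the sample mean and the sample median as estimators of the population mean. All function and variable names are made up for this example.

```python
import random
import statistics


def simulate_estimators(true_mean=50.0, sd=10.0, sample_size=30,
                        repetitions=2000, seed=1):
    """Draw many samples from an assumed normal population and record
    two different estimates of its mean for each sample."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(repetitions):
        sample = [rng.gauss(true_mean, sd) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return means, medians


means, medians = simulate_estimators()

# Unbiased: the average of the estimates should sit near the true mean (50).
print("average of sample means:  ", round(statistics.fmean(means), 2))
print("average of sample medians:", round(statistics.fmean(medians), 2))

# Relatively efficient: compare how much each estimator varies.
var_of_means = statistics.variance(means)
var_of_medians = statistics.variance(medians)
print("variance of sample means:  ", round(var_of_means, 3))
print("variance of sample medians:", round(var_of_medians, 3))

# Consistent: estimates from larger samples cluster more tightly.
small_means, _ = simulate_estimators(sample_size=10)
large_means, _ = simulate_estimators(sample_size=1000, repetitions=300)
var_small = statistics.variance(small_means)
var_large = statistics.variance(large_means)
print("variance with n=10:  ", round(var_small, 3))
print("variance with n=1000:", round(var_large, 3))
```

Under these assumptions both estimators average out near 50, while the sample mean shows the smaller spread, which is what "relatively efficient" describes for this population.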
Estimation

POINT ESTIMATE
a specific numerical value estimate of a parameter
Based on the data from our randomly selected sample, the proportion of the population who likes the color pink is 0.35

INTERVAL ESTIMATE
a range of values used to estimate the parameter. This estimate may or may not contain the value of the parameter being estimated.
Based on the data from our randomly selected sample, we are 95 percent sure that the proportion of the population who likes the color pink is somewhere between 0.32 and 0.38.
Estimation
INTERVAL ESTIMATE
a range of values used to estimate the parameter. This estimate may or may not contain the value of the parameter being estimated.
Based on the data from our randomly selected sample, we are 95 percent sure that the interval from 0.32 to 0.38 contains the true proportion of the population who likes the color pink.

It is usually centered at the point estimate.
The boundaries of the interval are constructed by adding and subtracting a specified value.
It comes with some percentage of how certain we are that the parameter being estimated is contained within it.
Estimation

INTERVAL ESTIMATE
a range of values used to estimate the parameter. This estimate may or may not contain the value of the parameter being estimated.
Based on the data from our randomly selected sample, we are 95 percent sure that the proportion of the population who likes the color pink is somewhere between 0.32 and 0.38.

ADVANTAGE
More likely to contain the actual parameter value
You can estimate the probability it contains the parameter.

DISADVANTAGE
Not as precise as point estimates
Estimation
Based on the data from our randomly selected sample, we are 95 percent sure that the proportion of the population who likes the color pink is somewhere between 0.32 and 0.38.

The confidence level of an interval estimate of a parameter is the probability that the interval estimate will contain the parameter, assuming that a large number of samples are selected and that the estimation process on the same parameter is repeated.

A confidence interval is a specific interval estimate of a parameter determined by using the data obtained from a sample and by using the specific confidence level of the estimate.
Estimation
Based on the data from our randomly selected sample, we are 95 percent sure that the proportion of the population who likes the color pink is somewhere between 0.32 and 0.38.
[Figure labels: interval 0.32 to 0.38; distance from 0.35 to 0.38 is 0.03]

A confidence interval is a specific interval estimate of a parameter determined by using the data obtained from a sample and by using the specific confidence level of the estimate.
The point estimate is the number in the middle of the interval.
The difference between each boundary of the interval and the point estimate is called the
Margin of Error
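As a quick check of the pink-color example, the point estimate and the margin of error can be recovered from the two boundaries of the interval. The helper below is a hypothetical illustration (the name `summarize_interval` is not from the lesson) and assumes the interval is symmetric around the point estimate.

```python
def summarize_interval(lower, upper):
    """Return (point estimate, margin of error) for a symmetric interval.

    The point estimate is the midpoint of the interval and the margin
    of error is half of the interval's width.
    """
    point_estimate = (lower + upper) / 2
    margin_of_error = (upper - lower) / 2
    return point_estimate, margin_of_error


# Interval from the slide: 0.32 to 0.38
point, margin = summarize_interval(0.32, 0.38)
print("point estimate: ", round(point, 4))
print("margin of error:", round(margin, 4))
```

Going the other way, the interval is rebuilt as the point estimate minus and plus the margin of error.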
Margin of Error
Also called the maximum
error of the estimate, the
margin of error is the
maximum likely difference
between the point estimate
of a parameter and the
actual value of the
parameter.
Let’s study the
simulators
CONFIDENCE INTERVAL FOR
A PROPORTION
By setting a sample size, the
population proportion, and
confidence level, the
simulator will generate
intervals representing the
sample proportion and the
interval estimate around that
sample proportion.

Link: https://www.statcrunch.com/applets/type3&ciprop
Set sample size
Set population proportion
Set the Confidence Level
Generate intervals and observe them
Green intervals contain the population proportion, the red ones do not.
You can click on a particular sample to check the details
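For those who want to reproduce the idea of the simulator offline, here is a rough sketch in Python. It is not the StatCrunch applet's own code, and the formula the applet uses is not stated in the slides; this sketch assumes the common normal-approximation interval, sample proportion plus or minus z times the estimated standard error. The default settings (population proportion 0.30, sample size 100, 95% confidence level) and all names are chosen only for illustration.

```python
import random
from statistics import NormalDist


def simulate_proportion_intervals(p=0.30, n=100, confidence=0.95,
                                  num_intervals=1000, seed=7):
    """Return the fraction of simulated intervals that contain p.

    Each interval is p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n),
    an assumed normal-approximation formula.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0  # intervals that contain p ("green" in the applet)
    for _ in range(num_intervals):
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        margin = z * (p_hat * (1 - p_hat) / n) ** 0.5
        if p_hat - margin <= p <= p_hat + margin:
            hits += 1
    return hits / num_intervals


coverage = simulate_proportion_intervals()
print(f"Share of intervals containing the population proportion: {coverage:.1%}")
```

The printed share should land close to the chosen confidence level, though not exactly on it, because the simulation is random and the formula is an approximation.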
Recall from the previous lesson

Describe how the sample proportion, $\hat{p}$, is distributed if the population proportion is 30%, $p = 0.30$, and the sample size is 100, $n = 100$.
Recall from the previous lesson

Describe how the sample proportion, $\hat{p}$, is distributed if the population proportion is 30%, $p = 0.30$, and the sample size is 100, $n = 100$.

The standard deviation of all sample proportions, $\sigma_{\hat{p}}$, is given by
$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.3(1-0.3)}{100}}$$
So if $p = 0.30$ and $n = 100$, then $\sigma_{\hat{p}} \approx 0.04583$.

What is the interval that contains the middle 95% of sample proportions?
0.2102 to 0.3898

How far are the boundaries from the mean, 0.30?
0.0898
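The numbers above can be reproduced with a few lines of Python. This sketch assumes the sampling distribution of the sample proportion is approximately normal, so that the middle 95% lies within about 1.96 standard deviations of the mean; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.30, 100

# Standard deviation of the sample proportion
sigma_p_hat = sqrt(p * (1 - p) / n)

# z-value that leaves 2.5% in each tail of a standard normal curve
z = NormalDist().inv_cdf(0.975)

# Distance from the mean to each boundary of the middle 95%
margin = z * sigma_p_hat
lower, upper = p - margin, p + margin

print("sigma_p_hat:", round(sigma_p_hat, 5))
print("z:", round(z, 2))
print("distance from mean:", round(margin, 4))
print("middle 95%:", round(lower, 4), "to", round(upper, 4))
```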
Recall from the previous lesson

Describe how the sample proportion, $\hat{p}$, is distributed if the population proportion is 30%, $p = 0.30$, and the sample size is 100, $n = 100$.

Suppose I took a random sample from a population with $p = 0.30$, and the sample proportion is given by $\hat{p}$. Suppose I create the interval:
$(\hat{p} - 0.0898,\ \hat{p} + 0.0898)$
What percentage of these intervals contain $p = 0.30$?
95%
Why?
Recall from the previous lesson

Describe how the sample proportion, $\hat{p}$, is distributed if the population proportion is 30%, $p = 0.30$, and the sample size is 100, $n = 100$.

So we can confidently say that if we get a random sample from a population with $p = 0.30$ and if we get the corresponding sample proportion $\hat{p}$, then the interval $(\hat{p} - 0.0898,\ \hat{p} + 0.0898)$ will contain the population proportion 95% of the time.
Another way of interpreting it is that if you repeatedly sample from this population, then the interval $(\hat{p} - 0.0898,\ \hat{p} + 0.0898)$ will contain the population proportion in 95% of such samples.
We then say, the interval you compute is the 95% Confidence Interval for the population proportion.
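One way to see why the figure is about 95%: the interval around the sample proportion contains 0.30 exactly when the sample proportion falls within 0.0898 of 0.30. The sketch below adds up the exact binomial probabilities of all such samples. It is an illustration under the assumption that each of the 100 respondents is an independent draw with success probability 0.30; the function name is hypothetical.

```python
from math import comb


def exact_coverage(p=0.30, n=100, margin=0.0898):
    """Probability that p_hat lands within `margin` of p, which is the same
    as the probability that (p_hat - margin, p_hat + margin) contains p."""
    total = 0.0
    for k in range(n + 1):
        p_hat = k / n
        if abs(p_hat - p) <= margin:
            # Binomial probability of observing exactly k successes
            total += comb(n, k) * p**k * (1 - p) ** (n - k)
    return total


coverage = exact_coverage()
print("exact coverage:", round(coverage, 3))
```

The result is close to, but not exactly, 0.95. The 95% figure comes from the normal approximation, while a sample of 100 can only produce proportions in steps of 0.01.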
Where did 0.0898 come from?
Let’s study the
simulators
CONFIDENCE INTERVAL FOR
A MEAN
By setting a sample size, the
population mean, and
confidence level, the
simulator will generate
intervals representing the
sample mean and the
interval estimate around that
sample mean.

Link: https://www.statcrunch.com/applets/type3&cimean
Set population distribution and sample size
Set population mean and SD
Set the Confidence Level
Generate intervals and observe them
Green intervals contain the population mean, the red ones do not.
You can click on a particular sample to check the details
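The same experiment can be sketched for a mean. The code below is not the applet's implementation. It assumes a normal population with a known standard deviation and builds each interval as the sample mean plus or minus z times sigma divided by the square root of n. The population values (mean 50, SD 10, sample size 25) and all names are hypothetical.

```python
import random
from statistics import NormalDist, fmean


def simulate_mean_intervals(mu=50.0, sigma=10.0, n=25, confidence=0.95,
                            num_intervals=1000, seed=11):
    """Return the fraction of simulated intervals that contain mu.

    Assumes a normal population with known sigma, so every interval
    has the same margin: z * sigma / sqrt(n).
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / n ** 0.5
    hits = 0  # intervals that contain mu ("green" in the applet)
    for _ in range(num_intervals):
        x_bar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / num_intervals


# Same simulated samples (same seed), three different confidence levels
cov_90 = simulate_mean_intervals(confidence=0.90)
cov_95 = simulate_mean_intervals(confidence=0.95)
cov_99 = simulate_mean_intervals(confidence=0.99)
print(f"90% level: {cov_90:.1%} of intervals contain the mean")
print(f"95% level: {cov_95:.1%} of intervals contain the mean")
print(f"99% level: {cov_99:.1%} of intervals contain the mean")
```

Raising the confidence level widens every interval, so more of them capture the population mean, at the cost of a less precise estimate.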
Prepare for
Laboratory Activity 1
Sampling distribution of the proportion | Sampling distribution
of the mean | Confidence interval simulations
