
Statistics and Probability

ESTIMATION OF PARAMETERS (PART 2)

“A statistical estimate may be good or bad, accurate or the reverse, but in
almost all cases it is likely to be more accurate than a casual observer’s
impression, and the nature of things can only be disproved by statistical
methods.”

1st Semester AY 2021-2022


Statistics and Probability: Confidence Intervals

❖ Illustrate and distinguish point and interval estimation
❖ Identify the point estimators of the population mean and proportion
❖ Identify the appropriate form of confidence interval for the
population mean and proportion
❖ Compute the confidence interval estimate of the
population mean and proportion
❖ Draw conclusions about the population mean and proportion
based on their confidence interval estimates
❖ Identify and compute the length of a confidence interval
❖ Solve problems involving sample size determination


A recent survey by Social Weather Stations found that 84% of Filipinos
believe that strict lockdown measures imposed due to the pandemic “are
worth it to protect people and limit the spread” of the virus. The report
stated that 4,010 working-age Filipinos (15 years old and above) were
selected and that the poll has a margin of error of ±2%.

Question: Would you believe these survey results?

In this lesson, you will learn how to make a true estimate of a parameter,
what is meant by the margin of error, and whether or not the sample size is
large enough to represent all Filipinos.


Statistical Inference
It refers to methods by which one uses sample information to
make inferences or generalizations about a population.

Two Areas of Statistical Inference:

1. Estimation

2. Hypothesis Testing

Estimation
One aspect of inferential statistics is estimation, which is the
process of estimating the value of a parameter from information
obtained from a sample.

An estimator is any statistic whose value is used to estimate an
unknown parameter. A realized value of an estimator is called an
estimate.

Examples:
• the sample mean $\bar{X}$ is an estimator of the population mean $\mu$.
• the sample standard deviation $s$ is an estimator of the
population standard deviation $\sigma$.


Estimation
Properties of a Good Estimator:

1. The estimator should be an unbiased estimator; that is, the
expected value or the mean of the estimates obtained from
samples of a given size is equal to the parameter being
estimated.
2. The estimator should be consistent. For a consistent
estimator, as sample size increases, the value of the
estimator approaches the value of the parameter estimated.
3. The estimator should be a relatively efficient estimator;
that is, it is the estimator with the smallest variance among
all the statistics that can be used to estimate the parameter.
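
The first two properties can be checked informally by simulation. The sketch below is only an illustration: the normal population with mean 50 and standard deviation 10, the sample sizes, the seed, and the helper name `sample_means` are all assumptions made for the demonstration and do not come from this lesson.

```python
import random
import statistics

def sample_means(n, reps, mu=50.0, sigma=10.0, seed=1):
    """Draw `reps` samples of size n from N(mu, sigma) and return their means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
            for _ in range(reps)]

small = sample_means(n=25, reps=2000)   # many small samples
large = sample_means(n=400, reps=2000)  # many larger samples

# Unbiasedness: the average of many sample means sits near mu = 50.
print(round(statistics.fmean(small), 2))

# Consistency: sample means cluster more tightly around mu as n grows.
print(round(statistics.stdev(small), 2), round(statistics.stdev(large), 2))
```

The exact printed values depend on the seed; what matters is that the first number is close to 50 and the spread for n = 400 is clearly smaller than for n = 25.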


Types of Estimate
A point estimate is a specific numerical value estimate of a
parameter.

• The best point estimate of the population mean $\mu$ is the
sample mean $\bar{X}$.

An interval estimate of a parameter is an interval or a range of
values used to estimate the parameter.

• It is denoted by (a, b), and the parameter is expected to lie in
this interval.


Confidence Level and Confidence Intervals


The confidence level of an interval estimate of a parameter is the
probability that the interval estimate will contain the parameter.

A confidence interval is a specific interval estimate of a parameter
determined by using data obtained from a sample and by using the
specific confidence level of the estimate.

AGAIN, we want to emphasize that the confidence level or confidence
coefficient is the PROBABILITY THAT THE INTERVAL ESTIMATOR
ENCLOSES OR CAPTURES THE TRUE VALUE OF THE PARAMETER.
CONFIDENCE LEVEL IS NOT THE PROBABILITY THAT THE TRUE
VALUE OF THE PARAMETER WILL FALL IN THE INTERVAL ESTIMATE.

Why?


Confidence Intervals
For example, say the interval estimate from a survey has a 90% confidence
level. This means that, if the survey were repeated many times, about 90% of
the resulting interval estimates (roughly 90 out of every 100) would enclose or
capture the true value of the population parameter, and not the other way around.

We must NOT say that there is a 90% chance or probability that the true
value of the parameter falls within the interval estimate, because that implies
that the parameter may be within this interval, or it may be somewhere else.

This phrasing makes it seem as if the POPULATION PARAMETER IS A
VARIABLE, when in fact it is not. The population parameter is fixed. An interval
estimate either captures the parameter or it does not. We know that intervals
change from sample to sample, but the POPULATION PARAMETER we’re trying
to capture does not.

IT IS SAFER TO SAY WE’RE 90% CONFIDENT THAT THE INTERVAL
ESTIMATE CAPTURED THE TRUE VALUE OF THE POPULATION
PARAMETER.
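
The repeated-sampling meaning of a 90% confidence level can be sketched with a small simulation. Everything specific here is an assumption made for illustration: the normal population with mean 100 and known standard deviation 15, the sample size of 36, the number of repetitions, the seed, and the function name `coverage`.

```python
import math
import random
import statistics

def coverage(z=1.645, mu=100.0, sigma=15.0, n=36, reps=2000, seed=7):
    """Fraction of simulated z-intervals (sigma known) that contain mu."""
    rng = random.Random(seed)
    margin = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        xbar = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        # The interval moves from sample to sample; mu stays fixed.
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / reps

# Expected to land near 0.90; the exact fraction depends on the seed.
print(coverage())
```

Each simulated interval either captures the fixed mean or misses it; the 90% describes how often the procedure succeeds over many samples.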

Confidence Intervals
There are three common confidence intervals that are used: the
90%, the 95%, and the 99% confidence intervals.

NOTE: If there is no confidence level explicitly stated, use 95%
confidence level.

Remarks:
• Increasing the confidence level will increase the margin of
error, resulting in a wider interval.
• Decreasing the confidence level will decrease the margin of
error, resulting in a narrower interval.
• If the confidence level is higher, the probability that an interval
estimate CAPTURES or ENCLOSES the true value of the
population parameter also increases.
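
For a concrete feel for the first two remarks, suppose $\sigma/\sqrt{n} = 1$ (an arbitrary value assumed only for illustration) and apply the usual critical values 1.645, 1.96, and 2.576 for the 90%, 95%, and 99% levels to a margin of error of the form $E = z_{\alpha/2}\,\sigma/\sqrt{n}$:

$$E_{90\%} = 1.645, \qquad E_{95\%} = 1.96, \qquad E_{99\%} = 2.576$$

The interval length $2E$ therefore grows from 3.29 to 3.92 to 5.152 as the confidence level rises.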

Estimating the Mean

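Is the population standard deviation known?

• YES: use
$$\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$$
• NO: Is the sample size greater than or equal to 30?
  • YES: use
  $$\left(\bar{X} - z_{\alpha/2}\frac{s}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{s}{\sqrt{n}}\right)$$
  • NO: use
  $$\left(\bar{X} - t_{\alpha/2}\frac{s}{\sqrt{n}},\ \bar{X} + t_{\alpha/2}\frac{s}{\sqrt{n}}\right)$$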
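
The decision flow for choosing the form of the interval can be written as a short function. This is a minimal sketch; the function name `interval_form` and the returned labels are hypothetical and exist only to mirror the two questions asked on this slide.

```python
def interval_form(sigma_known: bool, n: int) -> str:
    """Return which critical value and spread measure the decision flow selects."""
    if sigma_known:
        return "z with sigma"   # population standard deviation known
    if n >= 30:
        return "z with s"       # sigma unknown, large sample
    return "t with s"           # sigma unknown, small sample

print(interval_form(True, 25))    # sigma known
print(interval_form(False, 20))   # sigma unknown, n < 30
print(interval_form(False, 100))  # sigma unknown, n >= 30
```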

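Estimating the Mean

The quantities
$$z_{\alpha/2}\frac{\sigma}{\sqrt{n}}, \qquad z_{\alpha/2}\frac{s}{\sqrt{n}}, \qquad t_{\alpha/2}\frac{s}{\sqrt{n}}$$
are called the maximum error of the estimate (also called margin of error).
Common values of $z_{\alpha/2}$:
• For 90% confidence level, $z_{\alpha/2} = 1.645$
• For 95% confidence level, $z_{\alpha/2} = 1.96$
• For 99% confidence level, $z_{\alpha/2} = 2.576$
• For other confidence levels, see a standard normal (z) table; many Student’s t-tables also list these values in their last row (infinite degrees of freedom)
For $t_{\alpha/2}$, Student’s t-table is needed. The value depends on two things:
• Degrees of freedom, $v = n - 1$
• Confidence level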
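
For confidence levels other than the three listed, $z_{\alpha/2}$ can also be computed instead of read from a table. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `z_critical` and the 98% level are assumptions added for illustration. The standard library has no Student's t quantile function, so $t_{\alpha/2}$ still comes from the t-table here.

```python
from statistics import NormalDist

def z_critical(confidence: float) -> float:
    """Two-sided critical value z_{alpha/2} for a confidence level in (0, 1)."""
    alpha = 1.0 - confidence
    # z_{alpha/2} leaves an area of alpha/2 in the upper tail.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.90, 0.95, 0.98, 0.99):
    print(level, round(z_critical(level), 3))
```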


Estimating the Mean


Example 1:
An electrical firm manufactures light bulbs that have life spans
that are normally distributed, with a standard deviation of 40
hours. If a random sample of 25 bulbs has a mean life span of 780
hours, find a 95% confidence interval for the population mean of
all bulbs produced by this firm.

Questions to Answer:
• Is the population standard deviation (σ) known?
YES, σ = 40

Then, we use the form…

$$\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$$


Estimating the Mean


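Example 1 (cont.):
Given:
$\bar{X} = 780$, $\sigma = 40$, $n = 25$
For 95% confidence level, use $z_{\alpha/2} = 1.96$

$$\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$$
$$\left(780 - (1.96)\frac{40}{\sqrt{25}},\ 780 + (1.96)\frac{40}{\sqrt{25}}\right)$$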

Answer: (764.32, 795.68)


Interpretation: The researcher is 95% confident that the interval (764.32,
795.68) encloses the true value of the population mean life span of all light
bulbs produced by this firm.
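
The arithmetic in Example 1 can be double-checked with a few lines of code. This is a minimal sketch; the function name `z_interval` is hypothetical, and the inputs are the values given in the example.

```python
import math

def z_interval(xbar, sigma, n, z):
    """Confidence interval for the mean when sigma is known."""
    margin = z * sigma / math.sqrt(n)  # maximum error of the estimate
    return (xbar - margin, xbar + margin)

low, high = z_interval(xbar=780, sigma=40, n=25, z=1.96)
print(round(low, 2), round(high, 2))
```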

Estimating the Mean


Example 2:
Regular consumption of presweetened cereals contributes to tooth decay, heart
disease, and other degenerative diseases, according to a study. In a random
sample of 20 similar servings of Alpha-Bits, the mean sugar content was 11.3
grams with a standard deviation of 2.45 grams. Assuming that sugar content in
Alpha-Bits servings is normally distributed, construct a 95% confidence
interval for the mean sugar content of single servings of Alpha-Bits.

Questions to Answer:
• Is the population standard deviation (σ) known?
NO, σ is unknown.
• Is the sample size greater than or equal to 30?
NO, n = 20 which is less than 30.

Then, we use the form…

$$\left(\bar{X} - t_{\alpha/2}\frac{s}{\sqrt{n}},\ \bar{X} + t_{\alpha/2}\frac{s}{\sqrt{n}}\right)$$


Estimating the Mean


Example 2 (cont.):
Given:
$\bar{X} = 11.3$, $s = 2.45$, $n = 20$
For 95% confidence level and $v = n - 1 = 20 - 1 = 19$,
$t_{\alpha/2} = 2.093$

$$\left(\bar{X} - t_{\alpha/2}\frac{s}{\sqrt{n}},\ \bar{X} + t_{\alpha/2}\frac{s}{\sqrt{n}}\right)$$
$$\left(11.3 - (2.093)\frac{2.45}{\sqrt{20}},\ 11.3 + (2.093)\frac{2.45}{\sqrt{20}}\right)$$

Answer: (10.153, 12.447)


Interpretation: The researcher is 95% confident that the interval (10.153,
12.447) encloses the true value of the population mean sugar content of single
servings of Alpha-Bits.
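
The same check works for the t-based interval in Example 2. In this sketch the critical value 2.093 is copied from the example (95% confidence, 19 degrees of freedom) rather than computed, and the names `t_interval` and `T_CRITICAL_95_DF19` are hypothetical.

```python
import math

# Critical value as given in Example 2 (95% confidence, v = 19), from the t-table.
T_CRITICAL_95_DF19 = 2.093

def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown and n < 30."""
    margin = t_crit * s / math.sqrt(n)
    return (xbar - margin, xbar + margin)

low, high = t_interval(xbar=11.3, s=2.45, n=20, t_crit=T_CRITICAL_95_DF19)
print(round(low, 3), round(high, 3))
```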

Estimating the Mean


Example 3:
A random sample of 100 automobile owners shows that an automobile is
driven, on the average, 23,500 kilometers per year in the state of Virginia, with
a standard deviation of 3900 kilometers. Construct a 99% confidence interval
for the average number of kilometers an automobile is driven annually in
Virginia.

Questions to Answer:
• Is the population standard deviation (σ) known?
NO, σ is unknown.
• Is the sample size greater than or equal to 30?
YES, n = 100 which is greater than 30.

Then, we use the form…

$$\left(\bar{X} - z_{\alpha/2}\frac{s}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{s}{\sqrt{n}}\right)$$


Estimating the Mean


Example 3 (cont.):
Given:
$\bar{X} = 23500$, $s = 3900$, $n = 100$
For 99% confidence level, use $z_{\alpha/2} = 2.576$

$$\left(\bar{X} - z_{\alpha/2}\frac{s}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{s}{\sqrt{n}}\right)$$
$$\left(23500 - (2.576)\frac{3900}{\sqrt{100}},\ 23500 + (2.576)\frac{3900}{\sqrt{100}}\right)$$

Answer: (22495.36, 24504.64)


Conclusion: The researcher is 99% confident that the interval (22495.36,
24504.64) encloses the true value of the population mean number of
kilometers an automobile is driven annually in Virginia.

Estimating Proportions
DEFINITION:

The Sample Proportion, denoted by $\hat{p}$, is given by the formula

$$\hat{p} = \frac{x}{n},$$
where
x is the number of sample units that possess the characteristic of interest
n is the sample size

The proportion of the sample that does not possess the characteristic of
interest is given by:

$$\hat{q} = \frac{n - x}{n} \quad \text{or} \quad \hat{q} = 1 - \hat{p}$$

Note that $\hat{p}$ (sample proportion) is an estimator of the population
proportion, p.

Estimating Proportions
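MAXIMUM ERROR OF THE ESTIMATE (Margin of Error)

$$E = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
where $z_{\alpha/2}$ depends on the confidence level of the interval.

FORM OF CONFIDENCE INTERVAL FOR PROPORTIONS

$$\left(\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}},\ \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}\right)$$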


Estimating Proportions
Example 1:
In a random sample of 200 students who enrolled in Math 17, there were 138 students
who passed on their first take. Construct a 95% confidence interval for the population
proportion of students who passed Math 17 on their first take.
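Given:
n = 200
$\hat{p} = \frac{x}{n} = \frac{138}{200} = 0.69$
$\hat{q} = 1 - \hat{p} = 1 - 0.69 = 0.31$
For 95% confidence level, $z_{\alpha/2} = 1.96$

$$\left(\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}},\ \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}\right)$$
$$\left(0.69 - 1.96\sqrt{\frac{(0.69)(0.31)}{200}},\ 0.69 + 1.96\sqrt{\frac{(0.69)(0.31)}{200}}\right)$$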

Answer: (0.626, 0.754)


Conclusion: The researcher is 95% confident that the interval (0.626, 0.754) encloses
the true value of the population proportion of students who passed Math 17 on their
first take.

Estimating Proportions
Example 2:
A sample of 500 nursing applications included 60 applications from men. Find
the 90% confidence interval of the true proportion of men who applied to the
nursing program.
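Given:
n = 500
$\hat{p} = \frac{x}{n} = \frac{60}{500} = 0.12$
$\hat{q} = 1 - \hat{p} = 1 - 0.12 = 0.88$
For 90% confidence level, $z_{\alpha/2} = 1.645$

$$\left(\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}},\ \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}\right)$$
$$\left(0.12 - 1.645\sqrt{\frac{(0.12)(0.88)}{500}},\ 0.12 + 1.645\sqrt{\frac{(0.12)(0.88)}{500}}\right)$$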

Answer: (0.096, 0.144)


Conclusion: The researcher is 90% confident that the interval (0.096, 0.144) encloses
the true value of the population proportion of men who applied to the nursing program.
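
Both proportion examples follow the same steps, so they can be verified with one small function. This is a sketch only; the name `proportion_interval` is hypothetical, and the inputs are the counts, sample sizes, and critical values given in Examples 1 and 2.

```python
import math

def proportion_interval(x, n, z):
    """Confidence interval for a population proportion."""
    p_hat = x / n        # sample proportion
    q_hat = 1 - p_hat    # proportion without the characteristic
    margin = z * math.sqrt(p_hat * q_hat / n)
    return (p_hat - margin, p_hat + margin)

math17 = proportion_interval(x=138, n=200, z=1.96)    # Example 1
nursing = proportion_interval(x=60, n=500, z=1.645)   # Example 2
print([round(v, 3) for v in math17])
print([round(v, 3) for v in nursing])
```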

Determining Sample Size


• In random sampling, if the sample mean will be used to estimate µ, we can
be (1-α)100% confident that the error will not exceed a specific amount, e,
when the sample size is
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{e}\right)^2$$

• If the sample proportion will be used to estimate p, then we can be
(1-α)100% confident that the error will not exceed a specific amount, E, when
the sample size is
$$n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{E^2}$$

• When the value of $\hat{p}$ is unknown or cannot be approximated, then using
$\hat{p} = 0.5$ produces the maximum value of $\hat{p}\hat{q} = 0.25$. Hence a conservative
formula for the sample size is
$$n = \frac{z_{\alpha/2}^2}{4E^2}$$

Determining Sample Size


Example 1:
Previous surveys of household internet usage showed that σ = 9.45 minutes.
We would like to start an Internet Service Provider (ISP) and need to estimate
the average Internet usage of households in one week for our business plan.
How many households must we randomly select to be 90% sure that the
sample mean is within 3 minutes of the population mean?
Given:
σ = 9.45
e = 3
For 90% confidence level, $z_{\alpha/2} = 1.645$
We use the formula…
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{e}\right)^2$$
$$n = \left(\frac{(1.645)(9.45)}{3}\right)^2$$
Answer: n = 26.851 ≈ 27 (always ROUND UP)


Determining Sample Size


Example 2:
A clinical trial is planned to determine the effect of a new treatment for a common
illness. The plan is to show that the new treatment is, clinically, at least as good as
the existing treatment. The parameter of interest is the true percentage of patients
responding to the new treatment. From previous experience, one knows that the
percentage of patients who respond to the existing treatment is approximately
60%. How large a sample must be obtained to be 99% confident that the estimated
proportion is in error by no more than 5%?
Given:
$\hat{p}$ = 60% or 0.60
$\hat{q} = 1 - \hat{p} = 1 - 0.60 = 0.40$
E = 5% or 0.05
For 99% confidence level, $z_{\alpha/2} = 2.576$
We use the formula…
$$n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{E^2}$$
$$n = \frac{(2.576)^2 (0.60)(0.40)}{(0.05)^2}$$
Answer: n = 637.034 ≈ 638 (always ROUND UP)
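
The two sample-size examples, including the "always round up" rule, can be sketched as follows. The function names are hypothetical; `math.ceil` does the rounding up, and leaving `p_hat` at its default of 0.5 gives the conservative formula.

```python
import math

def sample_size_mean(z, sigma, e):
    """Smallest n so that the error in estimating the mean does not exceed e."""
    return math.ceil((z * sigma / e) ** 2)

def sample_size_proportion(z, E, p_hat=0.5):
    """Smallest n so that the error in estimating a proportion does not exceed E."""
    return math.ceil(z ** 2 * p_hat * (1 - p_hat) / E ** 2)

print(sample_size_mean(z=1.645, sigma=9.45, e=3))            # Example 1
print(sample_size_proportion(z=2.576, E=0.05, p_hat=0.60))   # Example 2
print(sample_size_proportion(z=2.576, E=0.05))               # conservative, p_hat = 0.5
```

The conservative version always asks for at least as many observations as a version with a known $\hat{p}$, because $\hat{p}\hat{q}$ is largest at 0.25.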


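Length of Interval
$$e = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}, \qquad e = z_{\alpha/2}\frac{s}{\sqrt{n}}, \qquad e = t_{\alpha/2}\frac{s}{\sqrt{n}}$$
are the maximum error of the estimate (margin of error) for the mean.

$$E = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
is the maximum error of the estimate (margin of error) for proportion.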

Length of Interval
If asked to calculate the length of an interval, you can do either of two things:
• Subtract the lower limit of the interval from the upper limit, OR
• Calculate:
• Length of Interval (Mean) = 2e
• Length of Interval (Proportion) = 2E
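
As a quick check that the two approaches agree, take the light bulb interval (764.32, 795.68) from Example 1 of Estimating the Mean, where σ = 40, n = 25, and $z_{\alpha/2} = 1.96$:

$$e = (1.96)\frac{40}{\sqrt{25}} = 15.68, \qquad 2e = 31.36, \qquad 795.68 - 764.32 = 31.36$$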


Let’s go back to the SWS Survey in the introduction. The estimates given in
the survey were point estimates; however, since the sample proportion is
84% or 0.84 with a margin of error of 2% or 0.02, a confidence interval can
be constructed. The confidence interval would be (82%, 86%) or (0.82, 0.86).
We don’t know whether this is a 90%, 95%, 99%, or some other confidence
level because this was not specified in the report.

We use a 95% confidence level (use 95% CL if not explicitly stated). Using the
sample size formula for proportions from Determining Sample Size, a minimum
sample size can be calculated. We can take
$\hat{p} = 0.84$ and $\hat{q} = 0.16$. For 95% confidence level, $z_{\alpha/2} = 1.96$
$$n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{E^2} = \frac{(1.96)^2 (0.84)(0.16)}{(0.02)^2} = 1290.7776 \approx 1291 \ \text{(round up)}$$


For a 95% confidence interval, then, the minimum sample size would be
1291. Since the survey used 4010 respondents, it can be said that we are at
least 95% confident that the interval (0.82, 0.86) encloses the true
proportion of Filipinos who believe that strict lockdown measures imposed
due to the pandemic “are worth it to protect people and limit the spread” of
the virus.
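
The comparison between the minimum sample size and the 4,010 respondents can be reproduced numerically. The sketch below is an illustration under two assumptions that the report itself does not confirm: a 95% confidence level and simple random sampling. The variable names are hypothetical.

```python
import math

z, p_hat, E, n_used = 1.96, 0.84, 0.02, 4010

# Minimum sample size for a margin of error of 0.02 (rounded up).
n_min = math.ceil(z ** 2 * p_hat * (1 - p_hat) / E ** 2)

# Margin of error that n = 4010 would give under the same assumptions.
margin_at_n_used = z * math.sqrt(p_hat * (1 - p_hat) / n_used)

print(n_min)
print(round(margin_at_n_used, 4))
```

Under these assumptions, the sample of 4,010 exceeds the minimum of 1,291, and the margin of error it implies is smaller than the reported 0.02.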

Note: Even if the minimum sample size was satisfied, this does not
automatically mean that the result is true. Many factors may
affect the results of the survey, such as the respondents’ educational
attainment, political views, socio-economic status, location, gender, and age, to
name a few. It is important to investigate whether the study was done
in an unbiased, accurate, and reliable way before you believe it.


Thank You!
