
QUANTITATIVE METHODS IN

MANAGEMENT
Course content
Chapter   Pages           Content
1         11-32           Introduction, variables, levels of measurement, types of statistics
2         33-98           Organizing and visualizing variables
3         99-148          Numerical descriptive measures – categorical / numerical – measures of central tendency, measures of dispersion, skewness, kurtosis, measures of relations – covariance and correlation
12        430-446         Simple linear regression, estimating b0 and b1, measures of variation (SST, SSR and SSE), coefficient of determination, coefficient of correlation
4         149-182         Basic probability
5, 6      183-232         Discrete probability distributions – binomial and Poisson; continuous probability distribution – normal
7         234-257         Sampling distribution
8         258-293         Confidence interval – mean, proportion and determining sample size
9         294-304         Fundamentals of testing of hypothesis
11        402-415         Chi square test
15        15-1 to 15-17   Decision analysis
RECAP
• Introduction – definition, types of statistics, levels of measurement
• Collection / compilation/ classification / tabulation
• Presentation – graphical and diagrammatic
• Measures of central tendencies
• Measures of dispersion
• Measures of skewness
• Exploratory data analysis
• Association between variables – covariance and correlation
• Regression analysis – simple, measures of variations ( SSE, SSR, SST,
coefficient of determination and coefficient of correlation)
• Introduction to Probability – Addition and multiplication theorems,
Bayes' theorem
• Theoretical distribution – RV, Binomial, Poisson and Normal distribution
Business Statistics:
A First Course
5th Edition

Chapter 7
Chapter 7 : 234-257
Sampling and Sampling
Distributions
Learning Objectives
In this chapter, you learn:
• To distinguish between different
sampling methods
• The concept of the sampling distribution
• To compute probabilities related to the
sample mean and the sample proportion
• The importance of the Central Limit
Theorem
Unit 3: Sampling Theory
Introduction to Sampling

We taste two to three grapes before purchasing a whole bunch.

A chemist takes a sample of alcohol to determine whether it is proof or not.
Does the selected sample represent the characteristics of the
whole bunch?
Statisticians, however, recommend a scientific approach to sampling, which
helps to obtain a sample that in most cases represents the
characteristics of the whole bunch. Learn more about sampling, as
recommended by statisticians, in this unit.
Population Vs. Sample
• Population
– Set of all elements of interest in a study

– Complete enumeration – Census

• Sample
– Subset of the population

– Sampling

• The purpose of statistical inference is to develop
estimates & test hypotheses about the
characteristics of a population using the
information obtained in the sample
Unit 3: Sampling Theory

Finite Population:
If the population consists of a finite number of
individuals, then it is called a Finite Population.

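Infinite Population:
If the number of individuals in the population is not finite,
then it is called an Infinite Population.

Example of a population:
In a statistical survey aimed at
determining average per capita
income of the people in a city, all
earning individuals in the city form
the population.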
Why Sample?
• Selecting a sample is less time-consuming than selecting every item in
the population (census).

• Selecting a sample is less costly than
selecting every item in the population.

• An analysis of a sample is less
cumbersome and more practical than
an analysis of the entire population.
A Sampling Process Begins With A Sampling
Frame
• The sampling frame is a listing of items that
make up the population
• Frames are data sources such as population
lists, directories, or maps
• Inaccurate or biased results can result if a frame
excludes certain portions of the population
• Using different frames to generate data can
lead to dissimilar conclusions
Sampling techniques
Random / Probability sampling
• Simple random sample
– With replacement
– Without replacement
• Stratified sample
– Proportionate
– Optimal
• Systematic sample
• Cluster sample
• Multi stage sample
Non probability / Non random sampling
• Purposive
• Convenience
• Quota
• Judgment
• Snowball
Types of Samples:
Nonprobability Sample

• In a nonprobability sample, items
included are chosen without regard
to their probability of occurrence.
– In convenience sampling, items are selected
based only on the fact that they are easy,
inexpensive, or convenient to sample.
– In a judgment sample, you get the opinions of
pre-selected experts in the subject matter.
Types of Samples:
Probability Sample
• In a probability sample, items in the sample
are chosen on the basis of known probabilities.

Probability Samples:
– Simple Random
– Systematic
– Stratified
– Cluster
Probability Sample:
Simple Random Sample
• Every individual or item from the frame
has an equal chance of being selected

• Selection may be with replacement
(selected individual is returned to frame
for possible reselection) or without
replacement (selected individual isn’t
returned to the frame).

• Samples are obtained from a table of random
numbers or computer random number
generators.
Selecting a Simple Random Sample Using A
Random Number Table

Sampling Frame For Population With 850 Items

Item Name   Item #
Bev R.      001
Ulan X.     002
.           .
.           .
Joann P.    849
Paul F.     850

Portion Of A Random Number Table
49280 88924 35779 00283 81163 07275
11100 02340 12860 74697 96644 89439
09893 23997 20048 49420 88872 08401

The First 5 Items in a simple random sample
Item # 492
Item # 808
Item # 892 -- does not exist so ignore
Item # 435
Item # 779
Item # 002
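The selection on this slide can be sketched in code. This is a minimal sketch, assuming the table is read left to right in consecutive three-digit groups and that repeated item numbers are skipped (sampling without replacement); the function name `select_items` is hypothetical.

```python
def select_items(table_rows, population_size, sample_size, width=3):
    """Read consecutive `width`-digit groups from random number table rows
    and keep the groups that are valid item numbers in the frame."""
    digits = "".join(table_rows).replace(" ", "")
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        item = int(digits[start:start + width])
        # Ignore numbers outside the frame and (assumption) repeated numbers.
        if 1 <= item <= population_size and item not in chosen:
            chosen.append(item)
        if len(chosen) == sample_size:
            break
    return chosen


rows = ["49280 88924 35779 00283 81163 07275"]
sample = select_items(rows, population_size=850, sample_size=5)
print(sample)  # [492, 808, 435, 779, 2]
```

Group 892 is dropped because the frame has only 850 items, which matches the slide.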
Simple Random Sampling…
• Lottery Method
• Random Number Tables
– Tippett's Table (digits taken from
the census report and combined
by 4s to give 10400 four-figure
numbers)
– Fisher & Yates Table
– Kendall & Babington Smith’s
Tables
Problem 1
• Assume a finite population of 350. Using
the last 3 digits of the following 5 digit
numbers (601, 022….) determine the first
4 elements that will be selected for the
simple random sample
98601 73022 83448 02147 34229
84147 93289 14209
Ans
# 022 147 229 289
(601 and 448 exceed 350 and are skipped; the second 147 is a repeat and is ignored)
Simple Random Sampling…
Random Number Table

99437 87961 45737 37552 97969 39094 34475 31618


50656 00127 68367 66882 08156 80016 78224 58326
80880 63171 42877 66835 60515 70296 50026 45587
86420 40853 53798 89454 68130 91253 88104 74319
60097 86436 01869 47758 89535 99400 48268 30606
52587 71965 85453 46834 00991 99729 76948 15941
89155 90553 90689 48637 07955 47062 71182 64493
Probability Sample:
Systematic Sample
• Decide on sample size: n
• Divide frame of N individuals into groups
of k individuals: k=N/n
• Randomly select one individual from the
1st group
• Select every kth individual thereafter
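Example: N = 40, n = 4, so k = N/n = 10
[Figure: frame of 40 individuals split into 4 groups of 10; one individual is picked at random from the first group]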
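The four steps above can be sketched as follows. This is an illustrative sketch only: the function name `systematic_sample` is hypothetical, and it assumes N is an exact multiple of n so that k = N/n is a whole number.

```python
import random


def systematic_sample(frame, n, rng=random):
    """Pick a random start in the first group of k items, then every k-th item."""
    k = len(frame) // n          # group size k = N / n (assumes N divisible by n)
    start = rng.randrange(k)     # random position inside the first group
    return [frame[start + i * k] for i in range(n)]


frame = list(range(1, 41))       # N = 40 individuals numbered 1..40
sample = systematic_sample(frame, n=4, rng=random.Random(0))
print(sample)                    # four item numbers spaced k = 10 apart
```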
Probability Sample:
Stratified Sample
• Divide population into two or more subgroups (called
strata) according to some common characteristic
• A simple random sample is selected from each subgroup,
with sample sizes proportional to strata sizes
• Samples from subgroups are combined into one
• This is a common technique when sampling population of
voters, stratifying across racial or socio-economic lines.

[Figure: population divided into 4 strata]
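Proportionate stratified sampling, as described in the bullets above, can be sketched as below. The strata names and sizes are made up for illustration, the function name is hypothetical, and simple rounding is assumed for the per-stratum sample sizes (so in general the total may differ slightly from n).

```python
import random


def proportional_stratified_sample(strata, n, seed=None):
    """Draw a simple random sample from each stratum, sized in proportion
    to the stratum's share of the population."""
    rng = random.Random(seed)
    total = sum(len(items) for items in strata.values())
    sample = {}
    for name, items in strata.items():
        size = round(n * len(items) / total)   # proportional allocation
        sample[name] = rng.sample(items, size) # SRS without replacement
    return sample


# Hypothetical population of 100 split into three strata.
strata = {
    "A": [f"A{i}" for i in range(50)],
    "B": [f"B{i}" for i in range(30)],
    "C": [f"C{i}" for i in range(20)],
}
picked = proportional_stratified_sample(strata, n=10, seed=1)
print({name: len(items) for name, items in picked.items()})  # {'A': 5, 'B': 3, 'C': 2}
```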
Probability Sample
Cluster Sample
• Population is divided into several “clusters,” each
representative of the population
• A simple random sample of clusters is selected
• All items in the selected clusters can be used, or items can
be chosen from a cluster using another probability
sampling technique
• A common application of cluster sampling involves election
exit polls, where certain election districts are selected and
sampled.

[Figure: population divided into 16 clusters; randomly selected clusters form the sample]
Multi Stage Sampling
• Sampling carried out in stages
• Material regarded as being made up
of a number of first-stage sampling
units, each of which is made up of a
number of second-stage units and so on
• Example – sample of 5000
households in Karnataka
– I Stage – State – divided into districts
– II Stage – Districts – divided into villages
– III Stage – Villages – divided into
households
Judgement Sampling
• Choice of a sample depends
exclusively on the judgement of
the investigator
• Quality of the sample depends
exclusively on the judgement of
the person selecting the sample
Convenience Sampling
• Elements are included in the
sample without pre specified or
known probability of being
selected – convenience of
researcher
• A convenient chunk or slice of
the population is taken
• Example
– From telephone directories
– Professor conducting research
may use student volunteers
Quota Sampling
• Quotas are set based on a
given criteria, but within the
quotas, the sample is
judgmental
• Example
– Out of 100 people to be
interviewed, 60 are to be
housewives, 25 farmers and 15
children less than 15 years old;
whom to pick within each quota is
left to the interviewer
Biased Sampling
• Picking a sample by choosing
people who would have very
strong feelings on the issue
Snowball Sampling
• Survey subjects are selected
based on referral from other
survey respondents
Probability Sample:
Comparing Sampling Methods

• Simple random sample and Systematic sample
– Simple to use
– May not be a good representation of the
population’s underlying characteristics
• Stratified sample
– Ensures representation of individuals
across the entire population
• Cluster sample
– More cost effective
– Less efficient (need larger sample to
acquire the same level of precision)
Evaluating Survey Worthiness
• What is the purpose of the survey?
• Is the survey based on a probability sample?
• Coverage error – appropriate frame?
• Nonresponse error – follow up
• Measurement error – good questions elicit
good responses
• Sampling error – always exists
Sampling Vs. Non Sampling Error

• Sampling Error
– As sample results are based on
partial or incomplete analysis of the
population features, any statistical
inference based on the sample may
not always be correct
• Non sampling error
– Incorrect enumeration of population
– Non random selection of samples
– Use of faulty questionnaire
– Wrong editing, coding or analysis
Types of Survey Errors
• Coverage error or selection bias
– Exists if some groups are excluded from the frame
and have no chance of being selected

• Non response error or bias


– People who do not respond may be different from
those who do respond

• Sampling error
– Variation from sample to sample will always exist

• Measurement error
– Due to weaknesses in question design, respondent
error, and interviewer’s effects on the respondent
(“Hawthorne effect”)
Types of Survey Errors
(continued)

• Coverage error: Excluded from frame
• Non response error: Follow up on nonresponses
• Sampling error: Random differences from sample to sample
• Measurement error: Bad or leading question
SAMPLING DISTRIBUTION
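N = population size, n = sample size
Population = {1, 2, 3, 4, 5} (N = 5), n = 3, number of possible samples = NCn = 10
For each sample a statistic can be computed: Mean, Median, SD, Proportion, ...

No   Sample   Mean
1    1,2,3    6/3
2    1,2,4    7/3
3    1,2,5    8/3
4    2,3,4    9/3
5    2,3,5    10/3
6    3,4,5    12/3
7    1,3,4    8/3
8    1,3,5    9/3
9    1,4,5    10/3
10   2,4,5    11/3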
Sampling Distributions
• A sampling distribution is a distribution of all of the possible
values of a sample statistic for a given size sample selected
from a population.

• For example, suppose you sample 50 students from your
college regarding their mean GPA. If you obtained many
different samples of 50, you would compute a different mean for
each sample. We are interested in the distribution of all
potential mean GPAs we might calculate for any given sample
of 50 students.
µ = 3; mean of the 10 sample means = 30/10 = 3; the SD of the sample means is the SE

• Population = {1, 2, 3, 4, 5} (N = 5), n = 3, number of possible samples = NCn = 5C3 = 10

(Proportion, Mode, SD, Variance, … can be tabulated for each sample in the same way.)

Sample No   Sample   Mean
1           1,2,3    6/3
2           1,2,4    7/3
3           1,2,5    8/3
4           2,3,4    9/3
5           2,4,5    11/3
6           1,3,4    8/3
7           1,3,5    9/3
8           1,4,5    10/3
9           2,3,5    10/3
10          3,4,5    12/3
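The table above can be reproduced by enumerating every sample. This is a small sketch, assuming sampling without replacement where order does not matter (which is what NCn counts); exact fractions are used so the means appear as thirds, as on the slide.

```python
from fractions import Fraction
from itertools import combinations

population = [1, 2, 3, 4, 5]
n = 3

# All 5C3 = 10 possible samples and their means.
samples = list(combinations(population, n))
means = [Fraction(sum(s), n) for s in samples]

for number, (s, m) in enumerate(zip(samples, means), start=1):
    print(number, s, m)

# The mean of all sample means equals the population mean (here 3).
mean_of_means = sum(means) / len(means)
print(mean_of_means)  # 3
```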
Developing a
Sampling Distribution
• Assume there is a population …
• Population size N=4 (individuals A, B, C, D)
• Random variable, X,
is age of individuals
• Values of X: 18, 20,
22, 24 (years)
Developing a
Sampling Distribution
Summary Measures for the Population
Distribution:

$$\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$$

$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$$

[Figure: Uniform Distribution. P(x) = .25 for each of x = 18 (A), 20 (B), 22 (C), 24 (D)]
Developing a
Sampling Distribution
(continued)
Now consider all possible samples of size n=2

16 possible samples (sampling with replacement):

1st Obs   2nd Observation
          18      20      22      24
18        18,18   18,20   18,22   18,24
20        20,18   20,20   20,22   20,24
22        22,18   22,20   22,22   22,24
24        24,18   24,20   24,22   24,24

16 Sample Means:

1st Obs   2nd Observation
          18   20   22   24
18        18   19   20   21
20        19   20   21   22
22        20   21   22   23
24        21   22   23   24
Developing a
Sampling Distribution
(continued)
Sampling Distribution of All Sample Means

16 Sample Means:

1st Obs   2nd Observation
          18   20   22   24
18        18   19   20   21
20        19   20   21   22
22        20   21   22   23
24        21   22   23   24

Sample Means Distribution:

X̄      18     19     20     21     22     23     24
P(X̄)   1/16   2/16   3/16   4/16   3/16   2/16   1/16

(no longer uniform)
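Developing a
Sampling Distribution
(continued)
Summary Measures of this Sampling Distribution:

$$\mu_{\bar{X}} = \frac{\sum \bar{X}_i}{N} = \frac{18 + 19 + 19 + \cdots + 24}{16} = 21$$

$$\sigma_{\bar{X}} = \sqrt{\frac{\sum (\bar{X}_i - \mu_{\bar{X}})^2}{N}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58$$

(here N = 16 is the number of possible samples)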
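These summary measures can be checked by brute force. The sketch below enumerates all 16 ordered samples of size 2 (sampling with replacement) from the ages 18, 20, 22, 24 and uses the population-style standard deviation (divisor N), as the slides do.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

ages = [18, 20, 22, 24]

# Population summary measures.
mu = mean(ages)        # 21
sigma = pstdev(ages)   # about 2.236

# All 16 ordered samples of size n = 2 (with replacement) and their means.
n = 2
sample_means = [mean(pair) for pair in product(ages, repeat=n)]

mu_xbar = mean(sample_means)       # 21, same as the population mean
sigma_xbar = pstdev(sample_means)  # about 1.58

print(mu, round(sigma, 3), mu_xbar, round(sigma_xbar, 2))
# The standard error formula sigma / sqrt(n) gives the same value.
print(round(sigma / sqrt(n), 2))
```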
Comparing the Population Distribution
to the Sample Means Distribution

Population (N = 4): $\mu = 21$, $\sigma = 2.236$
Sample Means Distribution (n = 2): $\mu_{\bar{X}} = 21$, $\sigma_{\bar{X}} = 1.58$

[Figure: P(X) over X = 18, 20, 22, 24 (A, B, C, D) beside P(X̄) over X̄ = 18, 19, 20, 21, 22, 23, 24]
Sampling Distribution…
• The function used – Mean or SD – is
the Sample Statistic
• The SD of the distribution of the
sample statistic is the Standard
Error of the Statistic
• The expected value is regarded as the
true value and any deviation is
regarded as error of estimation due
to sampling effects
Sample Mean Sampling Distribution:
Standard Error of the Mean
• Different samples of the same size from the same
population will yield different sample means
• A measure of the variability in the mean from sample to
sample is given by the Standard Error of the Mean:
(This assumes that sampling is with replacement or
sampling is without replacement from an infinite population)

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
• Note that the standard error of the mean decreases as the
sample size increases
Sample Mean Sampling Distribution:
If the Population is Normal
• If a population is normally distributed
with mean μ and standard deviation σ,
the sampling distribution of $\bar{X}$ is also
normally distributed with

$$\mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
Z-value for Sampling Distribution
of the Mean

• Z-value for the sampling distribution of $\bar{X}$:

$$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

where: $\bar{X}$ = sample mean
μ = population mean
σ = population standard deviation
n = sample size
Sampling Distribution Properties

• $\mu_{\bar{X}} = \mu$ (i.e. $\bar{X}$ is unbiased)

[Figure: Normal Population Distribution centered at μ, above a Normal Sampling Distribution centered at $\mu_{\bar{X}}$ (has the same mean)]
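Sampling Distribution Properties

As n increases, $\sigma_{\bar{X}}$ decreases

[Figure: two sampling distributions centered at μ; the larger sample size gives the narrower curve, the smaller sample size the wider curve]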
Sample Mean Sampling Distribution:
If the Population is not Normal

• We can apply the Central Limit Theorem:
– Even if the population is not normal,
– …sample means from the population will be
approximately normal as long as the sample size
is large enough.

Properties of the sampling distribution:

$$\mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
Central Limit Theorem

As the sample size gets large enough (n↑), the sampling distribution
becomes almost normal regardless of shape of population.
Sample Mean Sampling Distribution:
If the Population is not Normal

Sampling distribution properties:
Central Tendency: $\mu_{\bar{X}} = \mu$
Variation: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$

[Figure: a non-normal Population Distribution with mean μ, above the Sampling Distribution (becomes normal as n increases), drawn for a smaller and a larger sample size]
How Large is Large Enough?

• For most distributions, n > 30 will give a
sampling distribution that is nearly normal
• For fairly symmetric distributions, n > 15 will
usually give a sampling distribution that is almost
normal
• For normal population distributions, the
sampling distribution of the mean is always
normally distributed
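A quick simulation can illustrate the n > 30 rule of thumb. This sketch is not from the slides: it assumes a strongly skewed Exponential(1) population (mean 1, SD 1), draws many samples of size 36, and checks how often the sample mean falls within 1.96 standard errors of μ. If the normal approximation is adequate, that share should be close to 0.95.

```python
import random
from math import sqrt


def share_within_normal_band(n=36, trials=5000, seed=7):
    """Fraction of simulated sample means within mu +/- 1.96 * sigma / sqrt(n)."""
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0                  # mean and SD of an Exponential(1) population
    half_width = 1.96 * sigma / sqrt(n)   # 1.96 standard errors
    inside = 0
    for _ in range(trials):
        x_bar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        if abs(x_bar - mu) <= half_width:
            inside += 1
    return inside / trials


share = share_within_normal_band()
print(round(share, 3))  # expected to be near 0.95
```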
Example

• Suppose a population has mean μ
= 8 and standard deviation σ = 3.
Suppose a random sample of size
n = 36 is selected.

• What is the probability that the
sample mean is between 7.8 and
8.2?
Example
Solution:
• Even if the population is not normally
distributed, the central limit theorem
can be used (n > 30)
• … so the sampling distribution of $\bar{X}$ is
approximately normal
• … with mean $\mu_{\bar{X}} = 8$
• … and standard deviation $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{36}} = 0.5$
Example
Solution (continued):

$$P(7.8 < \bar{X} < 8.2) = P\left(\frac{7.8 - 8}{3/\sqrt{36}} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{8.2 - 8}{3/\sqrt{36}}\right) = P(-0.4 < Z < 0.4) = 0.3108$$

[Figure: Population Distribution (μ = 8), Sampling Distribution (7.8 to 8.2 around $\mu_{\bar{X}} = 8$) and Standardized Normal Distribution (−0.4 to 0.4 around $\mu_Z = 0$, area .1554 + .1554)]
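The same probability can be computed without a Z table. This is a minimal sketch using the standard library's `statistics.NormalDist` (Python 3.8+); the helper name `prob_sample_mean_between` is hypothetical, and an infinite population (no finite correction) is assumed.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_mean_between(lo, hi, mu, sigma, n):
    """P(lo < sample mean < hi) when the sample mean is (approximately) normal."""
    standard_error = sigma / sqrt(n)
    sampling_dist = NormalDist(mu, standard_error)
    return sampling_dist.cdf(hi) - sampling_dist.cdf(lo)


p = prob_sample_mean_between(7.8, 8.2, mu=8, sigma=3, n=36)
print(round(p, 4))  # 0.3108
```

The same helper applies to the bank and credit card problems that follow; small differences from the slide answers come from rounding Z to two decimals for the table lookup.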
Problem
• A bank calculates that its individual
savings accounts are normally
distributed with a mean of 2000 and
SD 600. If a random sample of 100
accounts is taken, what is the
probability that the sample mean
will lie between 1900 and 2050?
Standard error = 600/√100 = 60
SNV(1900) = (1900 − 2000)/60 = −1.67
SNV(2050) = (2050 − 2000)/60 = 0.83
P = .4525 + .2967 = .7492
Finite Population Multiplier
• When the population is finite,
a finite population multiplier
is used
• √[(N – n) / (N – 1)]
• Find the Sampling Fraction
n/N; if it is < .05, the finite
multiplier need NOT be used
• Also called the Finite Population
Correction Factor
Standard Error of Mean
• Infinite Population

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

• Finite Population

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$$
Problem 4
• In a sample of 25 observations
from a normal distribution with
mean 98.6 and SD 17.2, what is
P(92 < x̄ < 102)?
• n = 25, σ = 17.2, μ = 98.6
• σ/√n = 3.44
• P(92 < x̄ < 102)
= P(-1.92 < z < 0.99)
= .4726 + .3389
= .8115
Problem 5
• The auditor of a credit card
company knows that on an
average, the daily balance of
any given customer is 112
and the SD 56. From 50
randomly selected accounts
what is the probability that
the sample average daily
balance is
• < 100 ( .0643)
• Between 100 and 130 ( .9241)
Problem 6
• From a population of 125
items, with mean of 105 and
SD 17, 64 items are chosen,
what is the standard error of
the mean?
N = 125 n = 64
μ = 105 σ = 17
SE = (σ/√n) × √[(N – n)/(N – 1)] = 2.125 × 0.701 = 1.4896
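The standard error with the finite population multiplier can be sketched as a small helper. The function name is hypothetical; it follows the rule stated earlier that the multiplier is skipped when the sampling fraction n/N is below .05. Computed without intermediate rounding, Problem 6 gives about 1.4904; the slide's 1.4896 results from rounding the multiplier to 0.701.

```python
from math import sqrt


def standard_error_mean(sigma, n, N=None):
    """Standard error of the mean, with the finite population multiplier
    applied when N is given and the sampling fraction n/N is at least .05."""
    se = sigma / sqrt(n)
    if N is not None and n / N >= 0.05:
        se *= sqrt((N - n) / (N - 1))
    return se


se = standard_error_mean(sigma=17, n=64, N=125)
print(round(se, 4))  # 1.4904
```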
Population Proportions
π = the proportion of the population having
some characteristic
• Sample proportion ( p ) provides an estimate
of π:

$$p = \frac{X}{n} = \frac{\text{number of items in the sample having the characteristic of interest}}{\text{sample size}}$$

• 0 ≤ p ≤ 1
• p is approximately distributed as a normal distribution
when n is large
(assuming sampling with replacement from a finite population or
without replacement from an infinite population)
Sampling Distribution of p

• Approximated by a normal distribution if:
– $n\pi \geq 5$ and
– $n(1 - \pi) \geq 5$

where

$$\mu_p = \pi \quad \text{and} \quad \sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}}$$

(where π = population proportion)

[Figure: Sampling Distribution, P(p) plotted against p from 0 to 1]
Z-Value for Proportions
Standardize p to a Z value with the formula:

$$Z = \frac{p - \pi}{\sigma_p} = \frac{p - \pi}{\sqrt{\frac{\pi(1 - \pi)}{n}}}$$
Example

• If the true proportion of voters who support
Proposition A is π = 0.4, what is the
probability that a sample of size 200 yields a
sample proportion between 0.40 and 0.45?

i.e.: if π = 0.4 and n = 200,
what is P(0.40 ≤ p ≤ 0.45) ?
Example
• if π = 0.4 and n = 200, what is
P(0.40 ≤ p ≤ 0.45) ?

Find $\sigma_p$:

$$\sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}} = \sqrt{\frac{0.4(1 - 0.4)}{200}} = 0.03464$$

Convert to standardized normal:

$$P(0.40 \leq p \leq 0.45) = P\left(\frac{0.40 - 0.40}{0.03464} \leq Z \leq \frac{0.45 - 0.40}{0.03464}\right) = P(0 \leq Z \leq 1.44)$$
Example
• if π = 0.4 and n = 200, what is
P(0.40 ≤ p ≤ 0.45) ?

Use standardized normal table: P(0 ≤ Z ≤ 1.44) = 0.4251

[Figure: Sampling Distribution (p from 0.40 to 0.45) standardized to the Standardized Normal Distribution (Z from 0 to 1.44, area 0.4251)]
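The proportion example can also be computed directly. This is a sketch with a hypothetical helper name, using `statistics.NormalDist` from the standard library; it first checks the nπ ≥ 5 and n(1 − π) ≥ 5 conditions. Because Z is not rounded to 1.44 here, the result is about 0.4255 rather than the table value 0.4251.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_proportion_between(lo, hi, pi, n):
    """P(lo <= p <= hi) using the normal approximation for a sample proportion."""
    if n * pi < 5 or n * (1 - pi) < 5:
        raise ValueError("normal approximation not appropriate for this n and pi")
    sigma_p = sqrt(pi * (1 - pi) / n)          # standard error of the proportion
    z_lo, z_hi = (lo - pi) / sigma_p, (hi - pi) / sigma_p
    std_normal = NormalDist()
    return std_normal.cdf(z_hi) - std_normal.cdf(z_lo)


p = prob_sample_proportion_between(0.40, 0.45, pi=0.4, n=200)
print(round(p, 3))  # roughly 0.425
```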
Standard Error of Proportion

• Infinite Population

$$\sigma_p = \sqrt{\frac{pq}{n}}$$

• Finite Population

$$\sigma_p = \sqrt{\frac{pq}{n}} \sqrt{\frac{N - n}{N - 1}}$$

(where q = 1 − p)
Problem 7
• The President of a company
believes that 30% of the firm’s
orders come from first time
customers. A simple random
sample of 100 orders is used
to estimate the proportion of
first time users
• What is the standard error of
proportion (.0458)
Problem 7…
• What is the probability that
the sample proportion will be
• Between .20 & .40
= P(.20 < p < .40)
= P[(.2 − .3)/.0458 < z < (.4 − .3)/.0458]
= P(-2.18 < z < +2.18)
= .4854 × 2 = .9708
• Between .25 & .35
= P(-1.09 < z < +1.09)
= .3621 × 2 = .7242
Chapter Summary
• Introduced sampling distributions
• Described the sampling distribution of the
mean
– For normal populations
– Using the Central Limit Theorem
• Described the sampling distribution of a
proportion
• Calculated probabilities using sampling
distributions
Business Statistics:
A First Course
5th Edition

Chapter 8

Confidence interval estimation


Learning Objectives

In this chapter, you learn:


• To construct and interpret confidence
interval estimates for the mean and the
proportion
• How to determine the sample size
necessary to develop a confidence interval
for the mean or proportion
Statistics
- Descriptive
- Inference
- Estimation mean and proportion
- Point
- Interval estimation/confidence interval
- Determination of sample size
- Testing of hypothesis
- Parametric ( Z, t, F)
- Non parametric ( Chi square....)
Chapter Outline

Content of this chapter


• Confidence Intervals for the Population
Mean, μ
– when Population Standard Deviation σ is
Known
– when Population Standard Deviation σ is
Unknown
• Confidence Intervals for the Population
Proportion, π
• Determining the Required Sample Size
Process of Statistical Inference

Population with mean μ = ?
→ A simple random sample of n elements is selected from the population.
→ The sample data provide a value for the sample mean x̄.
→ The value of x̄ is used to make inferences about the value of μ.
Statistics

Descriptive Inference

Estimation TOH Regression


- Point
-mean
-proportion
- interval
-mean – when sigma known
when sigma unknown
-proportion
- sample size
-mean
-proportion
Types of Inference
1) Estimation: We estimate the value
of a population parameter.

2) Testing: We formulate a decision
about a population parameter.

3) Regression: We make predictions
about the value of a statistical
variable.
• To evaluate the reliability of our
inference, we need to know
about the probability distribution
of the statistic we are using.

• Typically, we are interested in the
sampling distributions for sample
means and sample proportions.
[Figure: samples A, B, C, D and E, each of size n, drawn from a population with mean µ and standard deviation σ; each sample has its own mean X̄ and standard deviation s]
In reality, the sample mean is just one of many possible
sample means drawn from the population, and is rarely
equal to µ.
Estimation
Point Estimation
In point estimation we use the data from the sample
to compute a value of a sample statistic that serves
as an estimate of a population parameter.

We refer to x̄ as the point estimator of the population
mean μ.

s is the point estimator of the population standard
deviation σ.

p̄ is the point estimator of the population proportion p.
Terms, Statistics & Parameters
Introduction…
• Use of sample statistic to
estimate population parameter

Estimator: Sample Statistic used
to estimate the
population parameter
Estimate: Specific Observed
value of the Statistic
Estimator
An estimator of a population
parameter is a sample statistic
used to estimate the parameter

Any systematic deviation of the
estimator from the population
parameter of interest is called a
bias
Point and Interval
Estimates

• A point estimate is a single number
• A confidence interval provides additional
information about the variability of the estimate

[Figure: Lower Confidence Limit, Point Estimate, Upper Confidence Limit; the distance between the limits is the width of the confidence interval]
Point Estimates

We can estimate a Population Parameter with a Sample Statistic (a Point Estimate):

Parameter    Population   Sample Statistic
Mean         μ            X̄
Proportion   π            p
Estimator
• The sample mean, X̄, is the most
common estimator of the
population mean
• The sample variance, s², is the most
common estimator of the
population variance
• The sample standard deviation, s,
is the most common estimator of
the population standard deviation
• The sample proportion, p, is the most
common estimator of the
population proportion
Types
• Point Estimate – a single number
(single-valued estimate) used to estimate
a population parameter.
A single element chosen from a
sampling distribution.
Conveys little information about the
actual value of the population
parameter or about the accuracy of the
estimate
Types
• Interval Estimate – Range of
values
An interval or range of values
believed to include the unknown
population parameter.
Associated with the interval is a
measure of the confidence we have
that the interval does indeed contain
the parameter of interest.
Properties of a good estimator
• Property of unbiasedness
– Expected value of the estimator is equal to the
parameter being estimated.
• Property of efficiency
– Smallest variance
• Property of sufficiency
– Use as much information as possible from the
sample
• Property of consistency
– Sample size increases, estimate tends to be
parameter value
Point Estimate
• The sample mean is the best
estimator of the population
mean
• The sample SD is the best
estimator of the population SD
• Sample proportion is the best
estimator of the population
proportion
Problem 1
From the following data find the
point estimates of the
population mean and the
population SD

5 8 10 7 10 14
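One way to work Problem 1 is with the standard library's `statistics` module. This sketch assumes the point estimate of the population SD is the sample standard deviation with divisor n − 1 (which is what `statistics.stdev` computes).

```python
from statistics import mean, stdev

data = [5, 8, 10, 7, 10, 14]

x_bar = mean(data)   # point estimate of the population mean: 54 / 6 = 9
s = stdev(data)      # point estimate of the population SD: sqrt(48 / 5)

print(x_bar, round(s, 3))  # 9 3.098
```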
Problem 2
A survey question for a sample of
150 individuals yielded 75 YES
responses, 55 NO responses
and 20 NO OPINIONS
What is the point estimate of the
proportion in the population
who respond
(i) Yes (ii) No (iii) No Opinion
Problem 3
A bank wants to determine the
number of tellers during lunch
rush on Fridays. Data on the
number of people who entered the
bank between 11 am and 1 pm on
Friday over the last 3 months is:
242 275 289 306
342 385 279 245
269 305 394 328
Find point estimates of mean & SD of
population from which the sample
was drawn
Problem 4
In a sample of 400 textile workers, 184
expressed extreme dissatisfaction
regarding a prospective plan to modify
working conditions. Because this
dissatisfaction was strong enough to
allow management to interpret plan
reaction as being highly negative, they
were curious about the proportion of
total workers harboring this sentiment.
Give a point estimate of this proportion
Confidence Intervals

• How much uncertainty is associated
with a point estimate of a population
parameter?

• An interval estimate provides more
information about a population
characteristic than does a point estimate

• Such interval estimates are called
confidence intervals
Confidence Interval Estimate
• An interval gives a range of values:
– Takes into consideration variation in
sample statistics from sample to sample
– Based on observations from 1 sample
– Gives information about closeness to
unknown population parameters
– Stated in terms of level of confidence
• e.g. 95% confident, 99% confident
• Can never be 100% confident
Confidence Interval Example

Cereal fill example

• Population has µ = 368 and σ = 15.
• If you take a sample of size n = 25 you know
– 368 ± 1.96 × 15/√25 = (362.12, 373.88) contains
95% of the sample means
– When you don’t know µ, you use X̄ to estimate µ
• If X̄ = 362.3 the interval is 362.3 ± 1.96 × 15/√25 =
(356.42, 368.18)
• Since 356.42 ≤ µ ≤ 368.18, the interval based on this
sample makes a correct statement about µ.

But what about the intervals from other possible
samples of size 25?
Confidence Interval Example
(continued)
Sample #   X̄        Lower Limit   Upper Limit   Contain µ?
1          362.30   356.42        368.18        Yes
2          369.50   363.62        375.38        Yes
3          360.00   354.12        365.88        No
4          362.12   356.24        368.00        Yes
5          373.88   368.00        379.76        Yes

Confidence Interval Example
(continued)
• In practice you only take one sample of size n
• In practice you do not know µ so you do not
know if the interval actually contains µ
• However you do know that 95% of the intervals
formed in this manner will contain µ
• Thus, based on the one sample you actually
selected, you can be 95% confident your interval
will contain µ (this is a 95% confidence interval)
Note: 95% confidence is based on the fact that we used Z = 1.96.
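The claim that 95% of intervals formed this way contain µ can be illustrated by simulation. This sketch reuses the cereal fill numbers (µ = 368, σ = 15, n = 25) and assumes, purely for illustration, that individual fills are normally distributed; the function name and the seed are arbitrary.

```python
import random
from math import sqrt


def coverage(mu=368, sigma=15, n=25, z=1.96, trials=2000, seed=1):
    """Share of simulated intervals x_bar +/- z * sigma / sqrt(n) that contain mu."""
    rng = random.Random(seed)
    margin = z * sigma / sqrt(n)   # 1.96 * 15 / 5 = 5.88
    hits = 0
    for _ in range(trials):
        x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials


rate = coverage()
print(round(rate, 3))  # expected to be near 0.95
```

Any single interval either contains µ or does not; the 95% describes the long-run share over repeated samples.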
Estimation Process

Population (mean, μ, is unknown)
→ Random Sample (Mean X̄ = 50)
→ "I am 95% confident that μ is between 40 & 60."
General Formula
• The general formula for all
confidence intervals is:

Point Estimate ± (Critical Value)(Standard Error)


Where:
• Point Estimate is the sample statistic estimating
the population parameter of interest

• Critical Value is a table value based on the
sampling distribution of the point estimate and
the desired confidence level

• Standard Error is the standard deviation of the
point estimate
Confidence Level
• Confidence Level
– The confidence that the interval
will contain the unknown
population parameter
– A percentage (less than 100%)
Confidence Level, (1-α)
• Suppose confidence level = 95%
• Also written (1 - α) = 0.95, (so α = 0.05)
• A relative frequency interpretation:
– 95% of all the confidence intervals that can be
constructed will contain the unknown true
parameter
• A specific interval either will contain or will
not contain the true parameter
– No probability involved in a specific interval
Confidence Intervals
Confidence
Intervals

Population Population
Mean Proportion

σ Known σ Unknown
Confidence Interval for μ
(σ Known)
• Assumptions
– Population standard deviation σ is known
– Population is normally distributed
– If population is not normal, use large sample

• Confidence interval estimate:

$$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$

where $\bar{X}$ is the point estimate,
$Z_{\alpha/2}$ is the normal distribution critical value for a probability of α/2 in
each tail, and
$\sigma/\sqrt{n}$ is the standard error
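Finding the Critical Value, Zα/2
• Consider a 95% confidence interval:

$$1 - \alpha = 0.95 \text{ so } \alpha = 0.05, \quad \frac{\alpha}{2} = 0.025, \quad Z_{\alpha/2} = \pm 1.96$$

[Figure: standard normal curve with area α/2 = 0.025 in each tail. Z units: −1.96, 0, 1.96. X units: Lower Confidence Limit, Point Estimate, Upper Confidence Limit]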
Common Levels of Confidence

• Commonly used confidence
levels are 90%, 95%, and 99%

Confidence Level   Confidence Coefficient, 1 − α   Zα/2 value
80%                0.80                            1.28
90%                0.90                            1.645
95%                0.95                            1.96
98%                0.98                            2.33
99%                0.99                            2.58
99.8%              0.998                           3.09
99.9%              0.999                           3.29
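Critical values like those in the table can be computed rather than looked up. A minimal sketch using the inverse CDF of the standard normal from `statistics.NormalDist`; the function name `z_critical` is hypothetical.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value Z_{alpha/2} for a given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}: {z_critical(level):.3f}")
```

Printed tables usually round these values, so small differences in the last digit are expected.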
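Intervals and Level of Confidence

Sampling Distribution of the Mean

Intervals extend from $\bar{X} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ to $\bar{X} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

(1 − α) × 100% of intervals constructed contain μ; (α) × 100% do not.

[Figure: sampling distribution of the mean centered at $\mu_{\bar{X}} = \mu$, with area 1 − α in the middle and α/2 in each tail; intervals drawn around sample means x̄1, x̄2, ...]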
Confidence
Intervals
Example
• A sample of 11 circuits from a large
normal population has a mean
resistance of 2.20 ohms. We know
from past testing that the population
standard deviation is 0.35 ohms.

• Determine a 95% confidence interval
for the true mean resistance of the
population.
Example
• A sample of 11 circuits from a large
normal population has a mean
resistance of 2.20 ohms. We know
from past testing that the population
standard deviation is 0.35 ohms.
• Solution:

$$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96 \left(0.35 / \sqrt{11}\right) = 2.20 \pm 0.2068$$

$$1.9932 \leq \mu \leq 2.4068$$
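The circuit example can be reproduced with a few lines. This is a sketch with a hypothetical helper name; it takes the critical value as a parameter (default 1.96 for 95% confidence) and assumes σ is known.

```python
from math import sqrt


def z_interval(x_bar, sigma, n, z=1.96):
    """Confidence interval for the mean when sigma is known:
    x_bar +/- z * sigma / sqrt(n)."""
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


lower, upper = z_interval(x_bar=2.20, sigma=0.35, n=11)
print(round(lower, 4), round(upper, 4))  # 1.9932 2.4068
```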
Interpretation

• We are 95% confident that the
true mean resistance is between
1.9932 and 2.4068 ohms
• Although the true mean may or
may not be in this interval, 95% of
intervals formed in this manner
will contain the true mean
Do You Ever Truly Know σ?
• Probably not!

• In virtually all real world business situations, σ is not known.

• If there is a situation where σ is known then µ is also known
(since to calculate σ you need to know µ.)

• If you truly know µ there would be no need to gather a
sample to estimate it.
Confidence Interval for μ
(σ Unknown)
• If the population standard deviation
σ is unknown, we can substitute
the sample standard deviation, S
• This introduces extra uncertainty,
since S is variable from sample to
sample
• So we use the t distribution instead
of the normal distribution
Confidence Interval for μ
(σ Unknown)
• Assumptions
– Population standard deviation is unknown
– Population is normally distributed
– If population is not normal, use large
sample
• Use Student’s t Distribution
• Confidence Interval Estimate:

$$\bar{X} \pm t_{\alpha/2} \frac{S}{\sqrt{n}}$$

(where $t_{\alpha/2}$ is the critical value of the t distribution with n - 1 degrees of
freedom and an area of α/2 in each tail)
Student’s t Distribution
• The t is a family of distributions
• The tα/2 value depends on degrees of
freedom (d.f.)
– Number of observations that are free to vary
after sample mean has been calculated

d.f. = n - 1
Degrees of Freedom (df)
Idea: Number of observations that are free to vary after
sample mean has been calculated

Example: Suppose the mean of 3 numbers is 8.0

Let X1 = 7
Let X2 = 8
What is X3?

If the mean of these three values is 8.0, then X3 must be 9
(i.e., X3 is not free to vary)
Here, n = 3, so degrees of freedom = n – 1 = 3 – 1 = 2
(2 values can be any numbers, but the third is not free to vary for
a given mean)
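The same idea can be written as a one-line calculation. This is a small sketch with a hypothetical helper name, `constrained_value`: once the mean and n - 1 values are fixed, the last value follows from n x mean minus the sum of the others.

```python
def constrained_value(mean, free_values):
    """Return the one remaining value implied by a fixed mean and the freely chosen values."""
    n = len(free_values) + 1          # total number of observations
    return n * mean - sum(free_values)

x3 = constrained_value(8.0, [7, 8])
print(x3)  # 9.0
```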
Student’s t Distribution
Note: t → Z as n increases

t-distributions are bell-shaped and symmetric, but have 'fatter' tails than the normal

(Figure: density curves for the standard normal (t with df = ∞), t (df = 13), and t (df = 5), all centered at 0 on the t axis.)
Student’s t Table
Upper Tail Area
df    .25    .10    .05
1   1.000  3.078  6.314
2   0.817  1.886  2.920
3   0.765  1.638  2.353

Let: n = 3, so df = n - 1 = 2; α = 0.10, so α/2 = 0.05. The table gives t = 2.920.

The body of the table contains t values, not probabilities.

(Figure: t curve with upper-tail area α/2 = 0.05 to the right of t = 2.920.)
Example of t distribution confidence interval

A random sample of n = 25 has $\bar{X} = 50$ and
S = 8. Form a 95% confidence interval for μ

– d.f. = n – 1 = 24, so $t_{\alpha/2} = t_{0.025} = 2.0639$

The confidence interval is

$\bar{X} \pm t_{\alpha/2}\frac{S}{\sqrt{n}} = 50 \pm (2.0639)\frac{8}{\sqrt{25}}$

46.698 ≤ μ ≤ 53.302
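A short sketch of the same calculation in Python follows. The helper name `t_interval` is hypothetical, and the critical value is passed in by hand (2.0639, read from a t table for 24 degrees of freedom) because Python's standard library does not provide t quantiles.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """CI for the mean with sigma unknown; t_crit comes from a t table with n - 1 d.f."""
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin

lower, upper = t_interval(xbar=50, s=8, n=25, t_crit=2.0639)
print(f"{lower:.3f} <= mu <= {upper:.3f}")
```

Passing the table value explicitly keeps the example dependency-free; a statistics package could supply the quantile instead.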
Problem
• From a population with SD 1.65, a
sample of 32 items resulted in
34.8 as an estimate of the
mean. Find the SE of the mean.
Compute an interval estimate
that should include the
population mean 99.7% of the
time
$\sigma_{\bar{x}} = 1.65/\sqrt{32} = .292$
Interval estimate: 34.8 ± .867
(33.93 to 35.67)
Problem
Estimate the mean life of
windshield wiper blades
under typical driving
conditions for CL of 95%
SD 6 months
Sample size = 100
Mean = 21 months

(19.824- 22.176 months)


Problem
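Estimate the mean annual
earnings of 700 families at
90% CL. n = 50, $\bar{x} = 11800$,
s = 950
$\hat{\sigma}_{\bar{x}} = 129.57$ (using the finite population multiplier, FPM); interval: $\bar{x} \pm 1.64\,\hat{\sigma}_{\bar{x}}$

11587.5 to 12012.5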
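The standard error in this problem can be sketched as below. The slide only says "FPM"; the code assumes the usual finite population multiplier, sqrt((N - n)/(N - 1)), and the helper name `standard_error_fpm` is hypothetical. Computed without intermediate rounding, the standard error comes out near 129.56, which differs from the slide's 129.57 only by rounding.

```python
from math import sqrt

def standard_error_fpm(s, n, population_size):
    """Estimated standard error of the mean with the finite population multiplier."""
    # Assumed form of the multiplier: sqrt((N - n) / (N - 1)).
    fpm = sqrt((population_size - n) / (population_size - 1))
    return s / sqrt(n) * fpm

se = standard_error_fpm(s=950, n=50, population_size=700)
z = 1.64  # rounded 90% critical value used on the slide
lower, upper = 11800 - z * se, 11800 + z * se
print(f"SE = {se:.2f}; interval = {lower:.1f} to {upper:.1f}")
```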
Problem
Nielsen Media Research reports
that household mean TV
viewing time during 8 pm to 11
pm is 7.75 hrs per week.
Assuming a sample size of 180
households and a sample SD
of 3.45 hrs, what is the 95%
Confidence Interval estimate of
the mean TV viewing time per
week during the 8pm to 11 pm
time period? (7.25 to 8.25)
Problem
Average tyre pressure in a
sample of 62 tyres was 24 psi
and SD 2.1 psi.

What is the Standard Error of the Mean?
(.267 psi)

Estimate a 95% Confidence Interval
(23.48 to 24.52)
Confidence Intervals for the
Population Proportion, π

• An interval estimate for the
population proportion ( π ) can
be calculated by adding an
allowance for uncertainty to
the sample proportion ( p )
Confidence Intervals for the
Population Proportion, π

• Recall that the distribution of the
sample proportion is approximately
normal if the sample size is large,
with standard deviation

$\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}}$

• We will estimate this with sample
data:
$\sqrt{\frac{p(1-p)}{n}}$
Confidence Interval Endpoints

• Upper and lower confidence limits for the
population proportion are calculated with
the formula

$p \pm Z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$
• where
– Zα/2 is the standard normal value for the level of confidence desired
– p is the sample proportion
– n is the sample size
• Note: must have np > 5 and n(1-p) > 5
Example

• A random sample of 100 people
shows that 25 are left-handed.
• Form a 95% confidence interval
for the true proportion of left-handers

$p \pm Z_{\alpha/2}\sqrt{p(1-p)/n}$
$= 25/100 \pm 1.96\sqrt{0.25(0.75)/100}$
$= 0.25 \pm 1.96\,(0.0433)$
$0.1651 \le \pi \le 0.3349$
Interpretation
• We are 95% confident that the true
percentage of left-handers in the
population is between
16.51% and 33.49%.

• Although the interval from 0.1651 to
0.3349 may or may not contain the true
proportion, 95% of intervals formed
from samples of size 100 in this manner
will contain the true proportion.
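The left-hander example can be sketched in Python as follows. The helper name `proportion_interval` is hypothetical; the check on np and n(1 - p) mirrors the rule of thumb stated with the formula, and the critical value comes from `statistics.NormalDist` rather than the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Normal-approximation CI for a population proportion (illustrative helper)."""
    p = successes / n
    # Rule of thumb from the slides: both np and n(1 - p) should exceed 5.
    if n * p <= 5 or n * (1 - p) <= 5:
        raise ValueError("need np > 5 and n(1 - p) > 5 for the normal approximation")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin

lower, upper = proportion_interval(successes=25, n=100)
print(f"{lower:.4f} <= pi <= {upper:.4f}")
```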
Determining Sample Size

(Diagram: determining sample size, either for the mean or for the proportion.)
Sampling Error
• The required sample size can be found to
reach a desired margin of error (e) with a
specified level of confidence (1 - α)

• The margin of error is also called sampling
error
– the amount of imprecision in the estimate of the
population parameter
– the amount added and subtracted to the point
estimate to form the confidence interval
Determining Sample Size
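For the mean, the sampling error (margin of error) is the amount added to and subtracted from the sample mean:

$\bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$, so $e = Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$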
Determining Sample Size
For the mean:

$e = Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$. Now solve for n to get $n = \frac{Z_{\alpha/2}^2\,\sigma^2}{e^2}$
Determining Sample Size
• To determine the required sample size for
the mean, you must know:

– The desired level of confidence (1 - α), which
determines the critical value, Zα/2
– The acceptable sampling error, e
– The standard deviation, σ
Required Sample Size Example

If σ = 45, what sample size is
needed to estimate the mean
within ± 5 with 90% confidence?

$n = \frac{Z^2\sigma^2}{e^2} = \frac{(1.645)^2(45)^2}{5^2} = 219.19$

So the required sample size
is n = 220 (Always
round up)
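A minimal sketch of this sample-size calculation, using a hypothetical helper `sample_size_for_mean`. It computes the critical value with `statistics.NormalDist` (about 1.6449 instead of the rounded 1.645) and rounds up with `math.ceil`, as the slide instructs.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, e, confidence):
    """Smallest n that keeps the margin of error for the mean within e."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / e) ** 2)  # always round up

n = sample_size_for_mean(sigma=45, e=5, confidence=0.90)
print(n)  # 220
```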
If σ is unknown

• If unknown, σ can be estimated
when using the required sample
size formula
– Use a value for σ that is expected to
be at least as large as the true σ
– Select a pilot sample and estimate σ
with the sample standard deviation, S
Determining Sample Size
(continued)

For the proportion:

$e = Z\sqrt{\frac{\pi(1-\pi)}{n}}$. Now solve for n to get $n = \frac{Z^2\,\pi(1-\pi)}{e^2}$
Determining Sample Size
• To determine the required sample size for
the proportion, you must know:

– The desired level of confidence (1 - α), which
determines the critical value, Zα/2
– The acceptable sampling error, e
– The true proportion of events of interest, π
• π can be estimated with a pilot sample
if necessary (or conservatively use 0.5
as an estimate of π)
Required Sample Size Example

How large a sample would be necessary
to estimate the true proportion
defective in a large population within
±3%, with 95% confidence?
(Assume a pilot sample yields p = 0.12)
Required Sample Size Example

Solution:
For 95% confidence, use Zα/2 = 1.96
e = 0.03
p = 0.12, so use this to estimate π

$n = \frac{Z_{\alpha/2}^2\,\pi(1-\pi)}{e^2} = \frac{(1.96)^2(0.12)(1-0.12)}{(0.03)^2} = 450.74$
So use n = 451
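The calculation can be sketched in Python with a hypothetical helper, `sample_size_for_proportion`. The second call is an added illustration of the conservative choice π = 0.5 mentioned earlier; it is not a figure from the slides.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(pi, e, confidence):
    """Smallest n that keeps the margin of error for a proportion within e."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * pi * (1 - pi) / e ** 2)  # always round up

n_pilot = sample_size_for_proportion(pi=0.12, e=0.03, confidence=0.95)
n_conservative = sample_size_for_proportion(pi=0.5, e=0.03, confidence=0.95)
print(n_pilot, n_conservative)  # 451 1068
```

Using π = 0.5 maximizes π(1 - π), so it gives the largest (safest) sample size when no pilot estimate is available.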
Ethical Issues
• A confidence interval estimate (reflecting
sampling error) should always be included
when reporting a point estimate
• The level of confidence should always be
reported
• The sample size should be reported
• An interpretation of the confidence interval
estimate should also be provided
Chapter Summary
• Introduced the concept of confidence
intervals
• Discussed point estimates
• Developed confidence interval estimates
• Created confidence interval estimates for
the mean (σ known)
• Determined confidence interval estimates
for the mean (σ unknown)
• Created confidence interval estimates for
the proportion
• Determined required sample size for
mean and proportion settings
• Addressed confidence interval estimation
and ethical issues
Determining An Interval Including A Fixed
Proportion of the Sample Means
Find a symmetrically distributed interval
around µ that will include 95% of the sample
means when µ = 368, σ = 15, and n = 25.
– Since the interval contains 95% of the sample
means 5% of the sample means will be outside
the interval
– Since the interval is symmetric 2.5% will be
above the upper limit and 2.5% will be below
the lower limit.
– From the standardized normal table, the Z
score with 2.5% (0.0250) below it is -1.96 and
the Z score with 2.5% (0.0250) above it is 1.96.
Determining An Interval Including A Fixed
Proportion of the Sample Means
• Calculating the lower limit of the interval
$\bar{X}_L = \mu + Z\frac{\sigma}{\sqrt{n}} = 368 + (-1.96)\frac{15}{\sqrt{25}} = 362.12$
• Calculating the upper limit of the interval
$\bar{X}_U = \mu + Z\frac{\sigma}{\sqrt{n}} = 368 + (1.96)\frac{15}{\sqrt{25}} = 373.88$
• 95% of all sample means of sample size 25 are
between 362.12 and 373.88
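As a brief sketch, the two limits can be computed together. The helper name `sample_mean_interval` is hypothetical; it centers the interval on the known population mean μ rather than on a sample mean, which is the difference between this calculation and a confidence interval.

```python
from math import sqrt
from statistics import NormalDist

def sample_mean_interval(mu, sigma, n, proportion):
    """Symmetric interval around mu that contains the given share of sample means."""
    z = NormalDist().inv_cdf(1 - (1 - proportion) / 2)
    half_width = z * sigma / sqrt(n)  # Z times the standard error of the mean
    return mu - half_width, mu + half_width

lower, upper = sample_mean_interval(mu=368, sigma=15, n=25, proportion=0.95)
print(f"{lower:.2f} to {upper:.2f}")
```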
