
MIS171 Business Analytics

Week 6b: Confidence intervals


Unit Schedule

Descriptive
Week 1   Introduction and Data Visualisation
Week 2   Univariate Analyses
Week 3   Bivariate Analyses
Week 4   Data visualisation

Inference
Week 5   Probability
Week 6   Confidence Intervals
Week 7   Hypothesis Testing

Predictive
Week 8   Simple Linear Regression and Prediction
Week 9   Multiple Linear Regression
Week 10  Multiple Linear Regression in action

Revision
Week 11  Exam Revision 1
Week 12  Exam Revision 2
Week 13  Exam

Unit Information
Assessment task 1 (10%) – Participation in Pre lectorial and Lectorial workbook activities and continuous class attendance from week 2 to week 12
Assessment task 2 (10%) – Project related Online quiz
Assessment task 3 (20%) – Assignment on analysis and reporting
Assessment task 4 (10%) – Project related Online quiz
Exam (50%)
The t distribution
The t-distribution is a family of probability
distributions with a shape similar to the
standard normal distribution. Different
t-distributions are distinguished by an
additional parameter, the degrees of freedom:
df = sample size − 1 = n − 1.

As the number of degrees of freedom
increases, the t-distribution converges to the
standard normal distribution.

Confidence Interval for the Mean
When Population Standard Deviation is unknown

Use the Student-t distribution with n − 1 degrees of freedom (df). A 100(1 − α)% confidence interval for the mean is

$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$

where t_{α/2} is the value of the t-distribution with df = n − 1 for an upper tail area of α/2.

t values are found in available tables or with the Excel function T.INV(1 – α/2, n – 1).

The Excel function =CONFIDENCE.T(alpha, standard deviation, size) can be used to compute the margin of error.

For example, if the sample size is 116, the degrees of freedom are 115, so the t value for 95% confidence lies between the table entries 1.9818 (df = 110) and 1.9799 (df = 120).
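
For readers working outside Excel, the following is a minimal sketch of how a t value like the one returned by T.INV(1 – α/2, n – 1) could be approximated using only the Python standard library. The helper names (t_pdf, t_cdf, t_critical) are hypothetical, and the numerical integration and bisection are illustrative choices, not how Excel computes the value.

```python
import math

def t_pdf(x, df):
    """Density of the Student-t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x] (steps is even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(alpha, df):
    """t value with upper-tail area alpha/2 (hypothetical helper)."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:  # widen the bracket until it contains t
        hi *= 2
    for _ in range(50):            # bisection on the cumulative probability
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# 95% confidence with a sample of 116, so df = 115
t_115 = t_critical(0.05, 115)
print(round(t_115, 4))
```

The printed value should fall between the two table entries quoted above for df = 110 and df = 120.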
Example: Estimating average spending of all festival goers
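Calculate a 95% confidence interval for μ

$$
\begin{aligned}
&\bar{x} \pm t_{n-1}\,\frac{s}{\sqrt{n}} \\
&565.43 \pm 1.98 \times \frac{234.16}{\sqrt{116}} \\
&565.43 \pm 43.06 \\
&522.36 \le \mu \le 608.50
\end{aligned}
$$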
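
The arithmetic in this example can be checked with a short Python sketch. The function name mean_confidence_interval is hypothetical, and the t value 1.98 is taken as given from the slide. Because the slide's inputs are rounded, the result may differ from the slide's figures by a cent or two.

```python
import math

def mean_confidence_interval(x_bar, s, n, t_value):
    """Return (lower, upper, margin) for x_bar ± t * s / sqrt(n)."""
    margin = t_value * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin, margin

# Festival spending example: x_bar = 565.43, s = 234.16, n = 116, t = 1.98
lower, upper, margin = mean_confidence_interval(565.43, 234.16, 116, 1.98)
print(f"margin of error: {margin:.2f}")
print(f"interval: {lower:.2f} to {upper:.2f}")
```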
Interpretation

We are 95% confident that the average
spending of all festival attendees is somewhere
between $522.36 and $608.50.

Different Confidence levels
Calculate 90%, 95% and 99% confidence
interval estimates for the true mean
festival spend of all festival attendees.

Comparing these intervals, we can see
that there is a trade-off between
confidence and margin of error.

The higher the confidence, the less
precise (or wider) the interval.
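
The trade-off can be illustrated with a rough Python sketch using the festival figures (x̄ = 565.43, s = 234.16, n = 116). As a simplifying assumption, it uses standard normal z values from the standard library in place of t values. With df = 115 the t-based margins would be slightly larger, so the numbers are indicative only.

```python
import math
from statistics import NormalDist

x_bar, s, n = 565.43, 234.16, 116
standard_error = s / math.sqrt(n)

margins = {}
for confidence in (0.90, 0.95, 0.99):
    # z value with upper-tail area alpha/2 (normal approximation to t)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margins[confidence] = z * standard_error
    lower, upper = x_bar - margins[confidence], x_bar + margins[confidence]
    print(f"{confidence:.0%}: {lower:.2f} to {upper:.2f} "
          f"(margin {margins[confidence]:.2f})")
```

Each increase in the confidence level produces a larger margin of error and therefore a wider interval.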

Workbook Exercise 5.4
An investor is trying to estimate the return on investment in companies that won
quality awards last year.
A random sample of 30 such companies is selected, and the return on investment is
calculated.
Sample mean and standard deviation:
x̅ = 14.75%, s = 8.18%.
Obtain a 95% confidence interval for the mean return.
Workbook Exercise 5.5
In a recent advertising campaign, a bank stated that they had improved customer
service and waiting times at their branches. In particular, they claim that the average
waiting time is now only 4.5 minutes.

A consumer group conducted a survey of a random sample of 49 customers and
timed their length of wait in a queue. From the sample they found that the mean
was 6 minutes and the standard deviation was 5.6 minutes.

The consumer group released a media statement that ‘...the bank has provided
misleading advertising as our research shows the average waiting time is actually 6
minutes...’
(a) What is wrong with the claim made by the consumer group?
Continue to do parts (b) to (d)
Confidence interval for a proportion
An unbiased estimator of a population proportion π (this is not the number pi =
3.14159 …) is the statistic p = x / n (the sample proportion), where x is the number
in the sample having the desired characteristic and n is the sample size.

A 100(1 – α)% confidence interval for the proportion is

$$p \pm z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$$

Estimating true proportion – Interstate visitors
Calculate a 95% confidence interval for the true proportion of interstate visitors

$$
\begin{aligned}
&p \pm z\sqrt{\frac{p(1-p)}{n}} \\
&0.4483 \pm 1.96\sqrt{\frac{0.4483(1-0.4483)}{116}} \\
&0.4483 \pm 0.0905 \\
&0.3578 \le \pi \le 0.5388
\end{aligned}
$$
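
A minimal Python sketch of the same calculation, using only the standard library. The function name proportion_confidence_interval is hypothetical, and the z value comes from statistics.NormalDist instead of the rounded 1.96.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(p, n, confidence=0.95):
    """Return (lower, upper) for p ± z * sqrt(p(1 - p) / n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Interstate visitors: sample proportion 0.4483 from n = 116 attendees
lower, upper = proportion_confidence_interval(0.4483, 116)
print(f"{lower:.4f} to {upper:.4f}")
```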
Interpretation

We are 95% confident that approximately
between 36% and 54% of all festival
attendees are interstate visitors.

Workbook exercise 5.5
As well as timing how long customers spent in the queue, the consumer group also
asked if the customer was satisfied with the level of service given by the bank teller.

a. Of the 49 people surveyed, 32 were satisfied with the service they received. Construct
a 90% confidence interval for the true proportion of customers who are satisfied with
the service.

b. Based on your answer in part (a), is it possible to conclude (at 90% confidence) that
more than half of customers are satisfied with the level of service?

c. What effect would changing the confidence level to 95% or 99% have on your
answer in part (a)? Would you necessarily come to the same conclusion in part (b) if
you had a higher confidence level?
Difference between groups

Calculate a 95% confidence interval for μ for each group of attendees (i.e., local,
interstate, international).

Group           Lower Limit   Upper Limit
International   578.71        908.79
Interstate      527.40        635.98
Local           439.85        576.19

Difference between groups
• At 95% confidence, we cannot conclude that Interstate attendees spend more
than Locals (since the CIs overlap)

• At 95% confidence, we can conclude that International attendees spend more
than Locals (since the CIs do not overlap)
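
The overlap comparison can be sketched in Python using the limits from the table of group intervals. The helper intervals_overlap is a hypothetical name, and the check simply tests whether two ranges share any values.

```python
from itertools import combinations

# 95% confidence limits for mean spend, taken from the slide's table
intervals = {
    "International": (578.71, 908.79),
    "Interstate": (527.40, 635.98),
    "Local": (439.85, 576.19),
}

def intervals_overlap(a, b):
    """True if the two (lower, upper) ranges share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]

overlaps = {}
for g1, g2 in combinations(intervals, 2):
    overlaps[(g1, g2)] = intervals_overlap(intervals[g1], intervals[g2])
    verdict = "overlap" if overlaps[(g1, g2)] else "do not overlap"
    print(f"{g1} vs {g2}: intervals {verdict}")
```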

Using Confidence intervals to predict election returns
An exit poll of 1,300 voters found that 692 voted for a particular candidate in a
two-person race. This represents a proportion of 53.23% of the sample. Could we
conclude that the candidate will likely win the election?

A 95% confidence interval for the proportion is [0.505, 0.559]. This suggests that
the population proportion of voters who favour this candidate is highly likely to
exceed 50%, so it is safe to predict the winner.

If the sample proportion is 0.515, the confidence interval for the population
proportion is [0.488, 0.543]. Even though the sample proportion is larger than 50%,
the sampling error is large, and the confidence interval suggests that it is
reasonably likely that the true population proportion could be less than 50%, so
you cannot predict the winner.
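
The decision rule in this example can be written as a short Python sketch: predict the winner only if the whole interval lies above 50%. The function name poll_interval is hypothetical. The second scenario uses the rounded sample proportion 0.515, so its limits may differ from the slide's in the third decimal place.

```python
import math
from statistics import NormalDist

def poll_interval(p, n, confidence=0.95):
    """Confidence interval for a population proportion from a poll."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

n = 1300
results = {}
for label, p in (("692 of 1300", 692 / n), ("proportion 0.515", 0.515)):
    lower, upper = poll_interval(p, n)
    # Predict a win only when even the lower limit exceeds 50%
    results[label] = (lower, upper, lower > 0.5)
    print(f"{label}: [{lower:.3f}, {upper:.3f}] predict winner: {lower > 0.5}")
```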

Different sample sizes
Logic tells us that the bigger the sample, the more
accuracy we would expect.

From the formula below, if n is larger, the SE must
be smaller.

$$SE = \frac{s}{\sqrt{n}}$$

The sampling distribution would be narrower.

Our confidence interval must therefore be
narrower.
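
A small Python sketch of this effect, assuming the standard error of the mean is s/√n and reusing the festival standard deviation s = 234.16. The sample sizes other than 116 are hypothetical values chosen for illustration.

```python
import math

def standard_error(s, n):
    """Standard error of the sample mean, s / sqrt(n)."""
    return s / math.sqrt(n)

s = 234.16
for n in (29, 116, 464):  # each sample is four times larger than the last
    print(f"n = {n:>3}: SE = {standard_error(s, n):.2f}")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.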

Calculating Sample Size
Before we take our sample to estimate μ or π, we first estimate what size sample is
required.
To do this, we specify:
• The margin of error, ME, we can work with.
• The degree of confidence we require, e.g., 95%
• An estimate of σ or π

Example – Festival goers and spending
Suppose we want to know the average spend of all festival attendees to within $50
(the margin of error), with 95% confidence.

Use the previous study’s s = $235 to approximate σ.

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{ME}\right)^2 = \left(\frac{1.96 \times 235}{50}\right)^2 \approx 84.86$$

Thus, n = 85 is the minimum sample size needed for our specifications.

ALWAYS ROUND UP!!
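
A minimal Python sketch of this calculation, assuming the usual sample size formula n = (z·σ/ME)² with rounding up. The function name sample_size_for_mean is hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, margin_of_error, confidence=0.95):
    """Smallest n such that z * sigma / sqrt(n) <= margin_of_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin_of_error) ** 2)  # always round up

# sigma approximated by the previous study's s = 235, ME = $50
n_required = sample_size_for_mean(235, 50)
print(n_required)
```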

Example – Festival goers and spending
Suppose we want to know the true proportion of repeat attendees to within 5%,
with 95% confidence.

Use the previous study’s p = 0.275 to approximate π.

$$n = \frac{z_{\alpha/2}^2\,p(1-p)}{ME^2} = \frac{1.96^2 \times 0.275 \times 0.725}{0.05^2} \approx 306.4$$

Thus, n = 307 is the minimum sample size needed for our specifications.
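
The proportion case can be sketched the same way, assuming the usual formula n = z²·p(1 − p)/ME² with rounding up. The function name sample_size_for_proportion is hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(p, margin_of_error, confidence=0.95):
    """Smallest n such that z * sqrt(p(1 - p) / n) <= margin_of_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)

# pi approximated by the previous study's p = 0.275, ME = 0.05
n_required = sample_size_for_proportion(0.275, 0.05)
print(n_required)
```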

Workbook exercise 5.6
With reference to Exercise 5.5, if the consumer group were to conduct another
survey in the future, what sample size would be required if they wish to have
a margin of error of at most 5% and maintain a 90% confidence level?
Confidence Interval estimation and ethical issues
• A confidence interval estimate (reflecting sampling error) should always be
included when reporting a point estimate

• The level of confidence should always be reported

• The sample size should be disclosed

• An interpretation of the confidence interval estimate should also be
provided

Weekly Excel task

Open week 5 Excel task and continue

Summary
In this session, we started by introducing sampling and different sampling
techniques. Then, we discussed the importance of sampling distribution.

Afterwards, we practiced how to construct and interpret confidence interval
estimates for the mean and the proportion as well as to calculate sample size.

END
