
OPMT 1130 Myra Andrews

Business Statistics Winter 2023

Lecture 12: Distribution of the Sample Means


You want to estimate the average age of first-year BCIT students. The distribution of
ages is skewed to the right (there are some older students who pull up the mean age).
You randomly select 100 students and ask them how old they are. You get a sample
mean age of 21.3 years. Your friend does the same but gets a mean age of 22.4 years.

→ There are many different samples of size 100 that can be taken from this population.

The sample mean is an estimate of the population mean. The sample mean is our
"best guess" or "best estimate" of the unknown population mean.

\(\bar{x}\) = sample mean    \(\mu\) = population mean

The mean of the sample means equals the population mean: \(\mu_{\bar{x}} = \mu\)

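Standard Error of the Mean (the standard deviation of the sample means)
It is the population standard deviation divided by the square root of the sample size:

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]

where \(\sigma\) = population standard deviation and \(n\) = sample size.

For example, if \(n = 1\), \(\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{1}} = \sigma\); if \(n = 25\), \(\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{25}} = \frac{\sigma}{5} = \frac{1}{5}\sigma\).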

The standard deviation of the sample means (the standard error) is smaller than the population standard deviation by a factor of \(\sqrt{n}\).
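A minimal Python sketch of how the standard error shrinks as the sample size grows. The population standard deviation of 10 is a made-up value for illustration (it is not from the lecture), and the helper name `standard_error` is hypothetical.

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of the sample means: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 10.0  # assumed population standard deviation, for illustration only
for n in (1, 4, 25, 100):
    # Quadrupling n halves the standard error
    print(n, standard_error(sigma, n))
```

With n = 25 the standard error is one fifth of the population standard deviation, matching the example above.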

Shape
The Central Limit Theorem states that if the sample size is large (30 or more), the
distribution of the sample means will be approximately normal even when the
underlying population (from which the sample is taken) is not normal.

• The distribution of the sample means is approximately normal as long as the
sample size is large enough. A sample size of at least 30 is large enough.

• If the population is symmetrical about the mean (e.g. uniform population) a sample
size of at least 15 is considered large enough.

• If the underlying population is normal, then the distribution of the sample means will
be normal for any sample size.
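The shape rules above can be explored with a small simulation. This is only a sketch under stated assumptions: the population is taken to be exponential with mean 1 and standard deviation 1 (a right-skewed population chosen for illustration, not one from the lecture), and the seed and number of repetitions are arbitrary.

```python
import random
import statistics

random.seed(1130)  # fixed seed so the run is repeatable

def sample_mean(n: int) -> float:
    """Mean of n draws from an assumed right-skewed (exponential) population."""
    return statistics.fmean([random.expovariate(1.0) for _ in range(n)])

n = 30
means = [sample_mean(n) for _ in range(5000)]

mean_of_means = statistics.fmean(means)  # should be close to the population mean, 1
sd_of_means = statistics.stdev(means)    # should be close to 1 / sqrt(30), about 0.18
print(round(mean_of_means, 3), round(sd_of_means, 3))
```

Plotting a histogram of `means` (not shown here) would give a roughly bell-shaped picture even though the individual draws are strongly skewed.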


Central Limit Theorem:


An urn contains 10,000 balls numbered from 5 to 8. The numbers are uniformly
distributed (there’s an equal percentage of each number).

[Figure: Probability Distribution of X. Each of the values 5, 6, 7 and 8 has probability 0.25.]

You randomly select two balls from the urn, calculate the mean of the two numbers and
do this many times. Below is a list of the 16 possible samples of size 2 you could get:
(5, 5) (5, 6) (5, 7) (5, 8) (6, 5) (6, 6) (6, 7) (6, 8)

(7, 5) (7, 6) (7, 7) (7, 8) (8, 5) (8, 6) (8, 7) (8, 8)

The table below shows all possible sample means and their corresponding probabilities.

\(\bar{x}\)    Frequency    \(p(\bar{x})\)
5.0      1            0.0625
5.5      2            0.1250
6.0      3            0.1875
6.5      4            0.2500
7.0      3            0.1875
7.5      2            0.1250
8.0      1            0.0625
Total    16           1.00

The distribution of the sample means is starting to look normal

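[Figure: Distribution of the sample means \(\bar{x}\). The probabilities rise from \(\bar{x} = 5\) to a peak of 0.25 at \(\bar{x} = 6.5\), then fall symmetrically to \(\bar{x} = 8\).]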
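The urn example can be checked by brute force. The sketch below enumerates the 16 ordered samples of size 2, tallies the sample means, and compares the standard deviation of those means with \(\sigma/\sqrt{n}\). The variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
import math

values = [5, 6, 7, 8]
samples = list(product(values, repeat=2))   # the 16 ordered samples of size 2
freq = Counter(Fraction(a + b, 2) for a, b in samples)

for mean in sorted(freq):
    print(float(mean), freq[mean], freq[mean] / len(samples))

# Population mean and standard deviation of the uniform values 5 to 8
mu = sum(values) / len(values)
sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / len(values))

# Mean and standard deviation of the 16 sample means
mean_of_means = sum(float(m) * c for m, c in freq.items()) / len(samples)
sd_of_means = math.sqrt(
    sum(c * (float(m) - mean_of_means) ** 2 for m, c in freq.items()) / len(samples)
)
print(mu, mean_of_means, sigma / math.sqrt(2), sd_of_means)
```

The mean of the sample means equals the population mean (6.5), and the standard deviation of the sample means equals the population standard deviation divided by \(\sqrt{2}\).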

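\[ Z = \frac{\text{Random Variable} - \text{Mean of the distribution}}{\text{Standard Deviation of the distribution}} \]

\[ Z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \qquad \text{since } \mu_{\bar{x}} = \mu \text{ and } \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]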

Population: \( Z = \frac{x - \mu}{\sigma} \)        Sample Means: \( Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \)

\(x\) = value from the population    \(\bar{x}\) = sample mean    \(\mu\) = population mean
\(\sigma\) = population standard deviation    \(n\) = sample size

Cumulative Normal Probability Distribution


NORM.DIST (X, mean, standard_dev, true) finds the probability below X (or \(\bar{x}\))

NORM.INV (probability, mean, standard_dev) is the inverse of NORM.DIST: it finds the
value of X (or \(\bar{x}\)) for any given probability

For \(n = 1\) (a randomly selected person) or “what percentage of the population is …”

o X = value of \(x\)
o probability (in NORM.INV) is the area below X (at most)
o mean = population mean \(\mu\)
o standard_dev = population standard deviation \(\sigma\)
o true returns the probability of at most (the area below X) (true = 1)

For sample means (when \(n\) is greater than 1)

o X = sample mean \(\bar{x}\)
o probability (in NORM.INV) is the area below \(\bar{x}\) (at most)
o mean = population mean \(\mu\)
o standard_dev = standard deviation of the sample means (standard error) = \(\frac{\sigma}{\sqrt{n}}\)
o true returns the probability of at most (the area below \(\bar{x}\)) (true = 1)

False - we won’t use this

When set to False, Excel returns the height of the normal curve at X (the probability
density) instead of the area below X. When the standard deviation is not small compared
with 1, this height is roughly the probability that X is between (X − 0.5) and (X + 0.5).
For example, P(X = 10) is approximately equal to P(X is between 9.5 and 10.5).
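For readers working outside Excel, the two functions can be approximated with Python's standard-library `statistics.NormalDist`. This is a rough sketch, not a full reimplementation of Excel's behaviour: the wrapper names `norm_dist` and `norm_inv` are hypothetical, and the numbers in the usage lines (mean 100, standard deviation 15) are illustrative rather than taken from the lecture.

```python
from statistics import NormalDist

def norm_dist(x: float, mean: float, standard_dev: float, cumulative: bool = True) -> float:
    """Stand-in for NORM.DIST: area below x if cumulative, else the curve height at x."""
    dist = NormalDist(mean, standard_dev)
    return dist.cdf(x) if cumulative else dist.pdf(x)

def norm_inv(probability: float, mean: float, standard_dev: float) -> float:
    """Stand-in for NORM.INV: the x value that has the given area below it."""
    return NormalDist(mean, standard_dev).inv_cdf(probability)

# Illustrative numbers only (not from the lecture)
p_below = norm_dist(115, 100, 15)     # area below 115, one standard deviation above the mean
x_back = norm_inv(p_below, 100, 15)   # recovers approximately 115
print(round(p_below, 4), round(x_back, 2))
```

For sample means, pass the standard error \(\sigma/\sqrt{n}\) as `standard_dev`, exactly as in the Excel instructions.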

Ex 1: The distribution of the ages of first-year students at BCIT is highly skewed to the
right with a mean age of 22 years and a standard deviation of 2.8 years.
(a) Find the probability that a randomly selected student is less than 20.4 years.

(b) 49 first-year BCIT students are randomly selected and asked how old they are.
What are the mean and standard deviation of the distribution of sample means?
Describe the shape of the sampling distribution.

\(\mu_{\bar{x}} = \mu =\)            \(\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} =\)

\(n = 49 > 30\), so the sample means are approximately normal


(c) You got a sample mean of only 20.4 years from your random sample of 49 students.
What is the probability of getting a sample mean age of 20.4 years or less?

In Excel:
X = sample mean = 20.4    Mean = population mean = 22    Standard_dev = \(\sigma_{\bar{x}}\) = 0.4

P(\(\bar{x}\) at most 20.4) = NORM.DIST(X = 20.4, mean = 22, standard_dev = 0.4, true)
or = NORM.DIST(X = 20.4, mean = 22, standard_dev: Enter 2.8/SQRT(49), true)

(d) What does your answer from (c) tell you? Calculate the Z-score.

P(sample mean ≤ 20.4 years when the population mean = 22 years) = 0.000032
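The calculation in parts (c) and (d) can be reproduced in Python as a cross-check. This sketch uses the standard-library `NormalDist` in place of Excel's NORM.DIST; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 22, 2.8, 49
standard_error = sigma / sqrt(n)                 # 2.8 / 7 = 0.4

z = (20.4 - mu) / standard_error                 # Z-score of the observed sample mean
p = NormalDist(mu, standard_error).cdf(20.4)     # area below a sample mean of 20.4
print(round(z, 2), round(p, 6))
```

A Z-score of about −4 means the observed sample mean lies roughly four standard errors below the assumed population mean.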


Ex 2: You recently graduated from the BCIT Accounting program and just started
working at a CPA firm. Your boss claims that the average starting salary for an
accountant who just graduated is $50,000/year so he offers to pay you $50,000. The
standard deviation is $5,000/year and salaries are approximately normally distributed.
(a) What percentage of accountants earn at least $56,000/year just after graduating?

P(salary of at least $56,000) = 1 − P(at most $56,000)


= 1 − NORM.DIST (X = 56000, mean = 50000, standard_dev = 5000, true)
= 1 − 0.8849 = 0.1151 = 11.51%

(b) Find the probability that a randomly selected accountant earns at least $56,000.

(c) You randomly select 14 accountants (recently graduated) and ask them how much
they earn. Find the probability of obtaining an average salary of at least $56,000.
\(\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{5000}{\sqrt{14}} = 1336.306\) (= standard_dev, or enter 5000/SQRT(14))

P(\(\bar{x}\) is at least $56,000) = 1 − P(at most $56,000)
= 1 − NORM.DIST (X = 56000, mean = 50000, standard_dev = 1336.306, true)
= 1 − 0.9999964 = 0.0000036 = 0.00036%

(d) You got a sample mean of $56,000 from your random sample. Do you think the
high mean salary you observed was due to random fluctuations or do you think
your boss was lying when he said the average starting salary is $50,000?
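The contrast between one accountant and the mean of 14 accountants can be checked with a short Python sketch. It uses the standard-library `NormalDist` as a stand-in for NORM.DIST; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 50_000, 5_000

# (a)/(b): one accountant, so use the population standard deviation
p_one = 1 - NormalDist(mu, sigma).cdf(56_000)

# (c): mean of n = 14 accountants, so use the standard error
n = 14
standard_error = sigma / sqrt(n)
p_mean = 1 - NormalDist(mu, standard_error).cdf(56_000)

print(round(p_one, 4), round(standard_error, 3), f"{p_mean:.7f}")
```

The only change between the two calculations is the standard deviation passed in, which is what makes a $56,000 sample mean so much rarer than a $56,000 individual salary.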


Ex 3: The weight of adults in Blaine is approximately normally distributed with a mean of
170 pounds and a standard deviation of 24 pounds. On a raft that takes people across
the river, a sign states, “Maximum capacity 3,200 pounds or 16 adults”. What is the
probability that a random sample of 16 adults will exceed the 3,200 pound weight limit?

                  \(\bar{x}\)                      Total
Mean              \(\mu\)                          \(n\mu\)
Standard Error    \(\frac{\sigma}{\sqrt{n}}\)      \(\sqrt{n}\,\sigma\)

P(\(\bar{x}\) > 200) = 1 − NORM.DIST(X = 200, mean = 170, standard_dev = 6, true)
= 1 − 0.9999997 = 0.0000003 (almost never happens)

P(Total > 3200) = 1 − NORM.DIST(X = 3200, mean = 2720, standard_dev = 96, true)
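Both routes, through the sample mean and through the total, should give the same probability. The Python sketch below checks this with the standard-library `NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 170, 24, 16
limit_total = 3200

# Approach 1: sample mean, with a per-person limit of 3200 / 16 = 200
p_mean = 1 - NormalDist(mu, sigma / sqrt(n)).cdf(limit_total / n)

# Approach 2: total weight, with mean n*mu = 2720 and standard deviation sqrt(n)*sigma = 96
p_total = 1 - NormalDist(n * mu, sqrt(n) * sigma).cdf(limit_total)

print(f"{p_mean:.7f}", f"{p_total:.7f}")
```

The two answers agree because both limits sit exactly 5 standard deviations above their respective means.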

1. (a) \(n = 1\) and the population is not normally distributed → cannot answer
(b) \(\mu_{\bar{x}} = \mu = 22\) years    \(\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2.8}{\sqrt{49}} = 0.4\) years    \(n > 30\), so approximately normal
(c) 0.000032
(d) It is very unlikely you will get a sample mean age of 20.4 (or less) due to random
fluctuations. It is more likely that students are getting younger, so the mean age is below 22.
2. (a) 11.51% (b) 0.1151 (c) 0.0000036 (almost 0)
(d) The boss is likely lying: it is unlikely to get a sample mean of $56,000 (or higher) from a
population with a mean of $50,000, so the result is unlikely to be due to random fluctuations.
3. 0.0000003 (almost 0)

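\( z = \frac{x - \mu}{\sigma} \)    \( z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \)    \( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \)    \( \sigma_{Total} = \sqrt{n}\,\sigma \)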

NORM.DIST (X, mean, standard_dev, true) finds the probability below X (or \(\bar{x}\))

NORM.INV (probability, mean, standard_dev) returns X (or \(\bar{x}\)) for any given probability

Lab Exercises
1. A Brewers Association study shows that the average Canadian adult drinks 90 litres
of beer per year. The number of litres of beer drunk is highly skewed to the right with
a standard deviation of 48 litres. A random sample of 36 Canadian adults of legal
drinking age is taken.

(a) What are the mean and standard deviation of the distribution of sample means
(standard error)? Describe the shape.

(b) What is the probability of our sample having a mean greater than 100 litres?

(c) What is the probability of our sample having a mean less than 70 litres?

(d) Find the probability of our sample mean being within ten litres of the population
mean.

(e) Find the minimum sample mean consumption needed to be in the top 1% of all
samples.

2. WestJet is concerned about the amount of baggage its passengers are bringing on
the flight. A study last year of passenger bags revealed that the bags had a mean
weight of 18.2 kg and a standard deviation of 3.75 kg and that the weights are
approximately normally distributed.
(a) 1 out of every 40 bags should weigh more than what weight?

(b) Maximum bag weight limit is 23 kg. What percentage of bags exceed this weight?

(c) The baggage supervisor has his staff randomly select 9 bags and weigh them.
(i) What is the probability the sample mean weight will be less than 15 kg?
(ii) Find the probability the sample mean will be within 2 kg of the population mean.

(d) A small plane with 80 seats can handle up to 1,600 kg of checked baggage.
What is the probability that 80 passengers will check too much baggage for the
plane to handle?


3. An elevator company has conducted a study and determined the average adult
weight is 78 kg with a standard deviation of 14 kg. Weights are approximately
normally distributed.

(a) Find the probability of an individual weighing:


(i) less than 50 kg. (ii) more than 115 kg.

(b) They decide the elevator will have a capacity of 5 people and a maximum load of
575 kg. What is the probability that load will be exceeded if there are 5 people in
the elevator?

(c) The company also makes another elevator that can hold 10 people and has a
maximum capacity of 1,150 kg. Find the probability that 10 people will exceed
the maximum load.

Solutions
1. (a) \(\mu_{\bar{x}} = 90\) litres, \(\sigma_{\bar{x}} = 8\) litres; the distribution of the sample means is approximately normal
(b) 0.10565 (c) 0.00621 (d) 0.7887 (e) 108.61 litres

2. (a) 25.55 kg (b) 10.03% (c) (i) 0.0052 (ii) 0.8904 (d) 0.0000088 (almost never)

3. (a) (i) 0.0228 (ii) 0.0041 (b) Z = 5.91 → ~ 0% (c) Z = 8.36 → ~ 0%
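As a cross-check on the solutions to Lab Exercise 1, the same steps can be run in Python. This is a sketch using the standard-library `NormalDist` in place of NORM.DIST and NORM.INV; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Distribution of sample means: mean 90 litres, standard error 48 / sqrt(36) = 8 litres
means = NormalDist(90, 48 / sqrt(36))

p_above_100 = 1 - means.cdf(100)               # (b) sample mean greater than 100
p_below_70 = means.cdf(70)                     # (c) sample mean less than 70
p_within_10 = means.cdf(100) - means.cdf(80)   # (d) sample mean between 80 and 100
top_1_cutoff = means.inv_cdf(0.99)             # (e) value with 99% of sample means below it

print(round(p_above_100, 5), round(p_below_70, 5),
      round(p_within_10, 4), round(top_1_cutoff, 2))
```

The totals questions (Exercises 2(d), 3(b) and 3(c)) follow the same pattern with mean \(n\mu\) and standard deviation \(\sqrt{n}\,\sigma\).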
