
Central Limit Theorem

TD 604, 2022
• The sample mean X̄ is to be calculated from a random sample of size 2
taken from a population consisting of 10 values (2, 3, 4, 5, 6, 7, 8, 9,
10, 11). Find the sampling distribution of X̄ based on a random
sample of size 2.
• There exists a population of 500 coins (mean age = 13.48 years, std. dev. =
11.164 years). We would like to estimate the mean age of the coins. What
sample size should be selected?
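
The first exercise above can be worked by brute force. The sketch below lists every possible sample of size 2 from the 10 values and tabulates the resulting sample means. It assumes sampling without replacement and that every pair is equally likely; the slide does not say which scheme is intended, and the function name is only illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

def sampling_distribution(values, n):
    """Return {sample mean: probability} over all equally likely samples of size n.

    Assumption: sampling WITHOUT replacement (the slide does not specify).
    """
    samples = list(combinations(values, n))
    counts = Counter(Fraction(sum(s), n) for s in samples)
    return {mean: Fraction(count, len(samples))
            for mean, count in sorted(counts.items())}

dist = sampling_distribution(population, 2)
for mean, prob in dist.items():
    print(f"x-bar = {float(mean):4.1f}   P = {prob}")

# The mean of the sampling distribution should equal the population mean.
expected = sum(mean * prob for mean, prob in dist.items())
print("E[x-bar] =", float(expected))
```

Exact fractions are used so that the probabilities sum to exactly 1 and can be compared by hand with a table worked out on paper.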

Choice of sample size is critical. What about n = 2, 5, 10, 50, 100?
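
One way to compare these candidate sample sizes is through the standard deviation of the sample mean, σ/√n, the relation stated later in this deck. The sketch below is a rough guide only: it uses the coin population's standard deviation from the slide and ignores the finite-population correction that a population of only 500 coins would strictly call for.

```python
import math

sigma = 11.164  # population standard deviation of coin ages (years), from the slide

def standard_error(sigma, n):
    """Standard deviation of the sample mean, sigma / sqrt(n).

    Assumption: independent draws; no finite-population correction applied.
    """
    return sigma / math.sqrt(n)

for n in (2, 5, 10, 50, 100):
    print(f"n = {n:3d}   standard error = {standard_error(sigma, n):.3f} years")
```

The spread of the sample mean shrinks only with the square root of n, so going from n = 50 to n = 100 buys much less precision than going from n = 2 to n = 5.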


Central Limit Theorem
Example
• We want to find the average height, μ, of all rural children of 3 years
of age in a district
• We randomly select 100 children in this age bracket from various
villages across the district, and find the average height of this sample
set of 100, X̄₁
• We then pick another sample set of 100 and find X̄₂
• …find X̄₃, X̄₄, … and so on
• We plot the distribution of these X̄s. What kind of a distribution do we get?
What is the mean of this distribution? What is its standard deviation?
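[Figure: From the population of all children 3 years of age in the district (mean μ, standard deviation σ), draw a random set of 100 and find X̄₁, draw another random set of 100 and find X̄₂, and so on. The distribution of the X̄s has mean μX̄ and standard deviation σX̄.]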

The Central Limit Theorem says that this distribution of X̄s will tend towards a Normal distribution as the sample size grows, no matter what
the underlying distribution. The more symmetrical the underlying distribution, the closer this will be to a Normal
distribution.
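
The repeated-sampling experiment described above can be imitated with a short simulation. This is only a sketch: the population is a hypothetical skewed (exponential) one generated in the code, not the children's heights, and the number of sample sets is an arbitrary choice.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical skewed population: exponential values with mean 10 (not from the slides).
population = [random.expovariate(1 / 10) for _ in range(50_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 100              # size of each sample set, as in the example
num_samples = 2_000  # number of sample sets drawn (arbitrary choice)

# Draw many random sample sets and record the mean of each one.
sample_means = [
    statistics.fmean(random.sample(population, n)) for _ in range(num_samples)
]

print("population mean      :", round(mu, 3))
print("mean of sample means :", round(statistics.fmean(sample_means), 3))
print("sigma / sqrt(n)      :", round(sigma / n ** 0.5, 3))
print("SD of sample means   :", round(statistics.stdev(sample_means), 3))
```

The mean of the sample means should land close to the population mean, and their standard deviation close to σ/√n, even though the population itself is far from Normal.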
Central Limit Theorem

A sample of size n is taken from a population with mean μ and standard deviation σ.
If the sample has a mean of X̄:
The expected value of X̄ is μ and its standard deviation is σX̄ = σ/√n
When n is large, the sampling distribution of X̄ tends towards a Normal distribution
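
For reference, the expected value and standard deviation quoted above follow in a few lines, assuming the n observations X_1, …, X_n are drawn independently from the population (an assumption the slide leaves implicit):

```latex
\begin{aligned}
\bar{X} &= \frac{1}{n}\sum_{i=1}^{n} X_i, \\
E[\bar{X}] &= \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \mu, \\
\operatorname{Var}(\bar{X}) &= \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i) = \frac{\sigma^{2}}{n}
\quad\Longrightarrow\quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}.
\end{aligned}
```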

A sample size of 30 is commonly good enough, but this depends on how far the
population distribution is skewed from a symmetrical distribution.
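
To see how skewness in the population carries over into the sample mean, the sketch below measures the skewness of simulated sample means for a few sample sizes. The exponential population and the helper names are assumptions made for illustration, not material from the slides.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

def skewness(values):
    """Population skewness: the mean of the cubed standardised values."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

def sample_mean_skewness(n, num_samples=10_000):
    """Skewness of the means of num_samples samples of size n.

    Assumption: the population is exponential with rate 1 (strongly right-skewed).
    """
    means = [
        statistics.fmean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]
    return skewness(means)

skew_by_n = {n: sample_mean_skewness(n) for n in (2, 5, 30)}
for n, skew in skew_by_n.items():
    print(f"n = {n:2d}   skewness of sample means = {skew:.2f}")
```

A skewness of 0 corresponds to a symmetrical distribution. For this population the value falls as n grows, but it is not yet zero at n = 30, which is why the rule of thumb depends on how skewed the population is.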
Example problem
At IIT Bombay, the mean age of the MTech students is 22.5 years, and the standard
deviation is 2.5 years. A random sample of 64 students is drawn. What is the probability
that the average age of these students is greater than 23 years?

Sample size n = 64
Population mean µ = 22.5
Population standard deviation σ = 2.5
The sample average, X̄, is approximately Normal with mean µ and standard deviation σ/√n = 2.5/8 = 0.3125
Hence, the probability that X̄ > 23 can be determined from the Standard Normal tables by converting
to the z score:

z = (X̄ − µ) / (σ/√n) = (23 − 22.5) / 0.3125 = 1.6
..continued
z = 1.6

From the standard normal tables, find the probability of z being less
than 1.6; this is given by the pink area in the figure.

From the table, look up the value of z to the first decimal place (1.6) in the first
column. This gives us the row.

Then look up the second decimal place (0.00) in the top row. This gives us the
column.

The probability of z < 1.60 = 0.9452

But we want to find the probability of X̄ > 23, or of z > 1.60, which is 1 − 0.9452 = 0.0548
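
The table lookup can be cross-checked numerically. The sketch below uses the figures in the problem statement (µ = 22.5, σ = 2.5, n = 64, threshold 23 years) and Python's standard `statistics.NormalDist`; the function name is only illustrative, and it relies on the Normal approximation for X̄.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean_exceeds(mu, sigma, n, threshold):
    """Return (z, P(sample mean > threshold)) under the Normal approximation."""
    standard_error = sigma / sqrt(n)        # standard deviation of the sample mean
    z = (threshold - mu) / standard_error   # z score of the threshold
    return z, 1 - NormalDist().cdf(z)       # upper-tail probability

z, p = prob_sample_mean_exceeds(mu=22.5, sigma=2.5, n=64, threshold=23)
print(f"z = {z:.2f}, P(sample mean > 23) = {p:.4f}")
```

`NormalDist().cdf(z)` plays the role of the printed table: it returns the area to the left of z, so the upper tail is one minus that value.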
Central Limit Theorem
• Very important in statistics
• Forms the basis of everything we do in this course henceforth
[Figure: three plots. Leftmost plot: population distribution. Centre plot: distribution of X̄ (n = 5). Rightmost plot: distribution of X̄ (n = 30).]




The sample set must be a random selection

[Figure: Yields from 100 runs of a chemical process plotted against time]

The data shows a trend. This means that each run is in some way
affected by the last run, so the runs are not random samples, and
hence this dataset cannot be used to find a confidence interval.
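
One simple numerical check for this kind of dependence is the lag-1 autocorrelation, the correlation between each run and the run before it. The sketch below applies it to two synthetic series generated in the code (one independent, one with a built-in upward trend); these are illustrative stand-ins, not the yields plotted on the slide.

```python
import random
import statistics

def lag1_autocorrelation(series):
    """Correlation between consecutive values; near 0 for independent data."""
    mean = statistics.fmean(series)
    numerator = sum((a - mean) * (b - mean) for a, b in zip(series, series[1:]))
    denominator = sum((x - mean) ** 2 for x in series)
    return numerator / denominator

random.seed(2)  # fixed seed so the sketch is repeatable

# Synthetic illustrative data (assumed values, not the yields from the slide).
independent_runs = [random.gauss(80, 2) for _ in range(100)]
trending_runs = [70 + 0.2 * t + random.gauss(0, 2) for t in range(100)]

print("independent runs:", round(lag1_autocorrelation(independent_runs), 2))
print("trending runs   :", round(lag1_autocorrelation(trending_runs), 2))
```

A value close to 1 signals that consecutive runs move together, as in the trending series, which is a warning that the runs should not be treated as a random sample.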
End of slide deck
