Central Limit Theorem and Confidence Interval
27)
Intro CLT: The central limit theorem (CLT) states that the distribution of sample means approximates a
normal distribution as the sample size becomes larger, whatever the shape of the population distribution.
Sample sizes of 30 or more are often considered sufficient for the CLT to hold.
A sufficiently large random sample can therefore be used to estimate the characteristics of a population accurately.
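The statement above can be checked with a small simulation. This is only a sketch using the Python standard library: the exponential population, the 2,000 trials, and the helper names `sample_means` and `skewness` are assumptions made for illustration, not part of the original notes. Skewness is used as a rough measure of how far a distribution is from the symmetric normal shape.

```python
import random
import statistics


def sample_means(draw, n, trials=2000, seed=0):
    """Return `trials` sample means, each computed from `n` independent draws."""
    rng = random.Random(seed)
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(trials)]


def skewness(values):
    """Population skewness: 0 for a symmetric distribution such as the normal."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - mean) / sd) ** 3 for v in values)


def draw_exponential(rng):
    """One draw from an exponential population with mean 1 (strongly right-skewed)."""
    return rng.expovariate(1.0)


# n=1 reproduces the skewed population itself; n=30 gives means of 30 observations.
means_n1 = sample_means(draw_exponential, n=1)
means_n30 = sample_means(draw_exponential, n=30)

print("skewness of single observations:", round(skewness(means_n1), 2))
print("skewness of means of 30 observations:", round(skewness(means_n30), 2))
print("average of the n=30 sample means:", round(statistics.fmean(means_n30), 3))
```

The skewness of the sample means should come out much closer to zero than the skewness of the raw observations, while their average stays close to the population mean of 1.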
Understanding:
As the sample size increases, the mean of a sample tends to be closer to the mean of the overall population,
regardless of the actual distribution of the data.
24)
Why useful:
The central limit theorem is vital in statistics for two main reasons: the normality assumption and the
precision of the estimates.
Sampling distributions cluster more tightly around the population mean as the sample size increases.
This property of the CLT becomes more relevant when we use samples to estimate the
population mean. With a larger sample size, the sample mean approximates the population mean more closely, so
the estimate becomes more precise.
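The precision claim can be illustrated numerically: the spread of the sample means should shrink roughly like σ/√n. The sketch below assumes a Uniform(0, 1) population (standard deviation √(1/12)) and a hypothetical helper `spread_of_sample_means`. Neither comes from the notes.

```python
import math
import random
import statistics

rng = random.Random(42)
POP_SD = math.sqrt(1 / 12)  # standard deviation of a Uniform(0, 1) population


def spread_of_sample_means(n, trials=1000):
    """Standard deviation of `trials` sample means, each from `n` uniform draws."""
    means = [statistics.fmean(rng.random() for _ in range(n)) for _ in range(trials)]
    return statistics.stdev(means)


# Observed spread of the sample means next to the theoretical value sigma / sqrt(n).
results = {n: spread_of_sample_means(n) for n in (10, 100, 1000)}
for n, observed in results.items():
    print(n, round(observed, 4), round(POP_SD / math.sqrt(n), 4))
```

Each tenfold increase in sample size should cut the spread by a factor of about √10, which is why large gains in precision need much larger samples.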
32)
Where CLT is useful:
Investors of all types rely on the CLT to analyze stock returns, construct portfolios, and manage risk.
Say, for example, an investor wishes to analyze the overall return for a stock index that comprises
1,000 equities. In this scenario, the investor may simply study a random sample of those stocks to
estimate the returns of the total index.
30)
Cons
The central limit theorem applies to almost all types of probability distributions, but there are
exceptions. For example, the population must have a finite variance. That restriction rules out
the Cauchy distribution because it has no finite variance.
The observations must be independent: the value of one observation should not depend on the value of another observation.
The observations must be identically distributed: the distribution of the variable must remain constant across all measurements.
Typically, a sample size of 30 is sufficient for most distributions, but strongly skewed
distributions can require larger sample sizes.
Solution:
If the population distribution is extremely skewed, a substantially larger sample may be needed for the
central limit theorem to apply and produce sampling distributions that approximate a normal
distribution.
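The finite-variance restriction can be seen in a simulation. The sketch below compares sample means from a standard normal population with sample means from a standard Cauchy population, using the interquartile range (IQR) as the measure of spread because the Cauchy has no finite variance. The helper names, seed, and trial counts are illustrative assumptions.

```python
import math
import random
import statistics

rng = random.Random(7)


def draw_cauchy():
    """Standard Cauchy draw via the inverse CDF: tan(pi * (U - 0.5))."""
    return math.tan(math.pi * (rng.random() - 0.5))


def draw_normal():
    """Standard normal draw (finite variance, so the CLT applies)."""
    return rng.gauss(0.0, 1.0)


def iqr_of_sample_means(draw, n, trials=500):
    """Interquartile range of `trials` sample means of size `n`."""
    means = [statistics.fmean(draw() for _ in range(n)) for _ in range(trials)]
    q1, _, q3 = statistics.quantiles(means, n=4)
    return q3 - q1


normal_iqr = {n: iqr_of_sample_means(draw_normal, n) for n in (10, 1000)}
cauchy_iqr = {n: iqr_of_sample_means(draw_cauchy, n) for n in (10, 1000)}

print("normal:", normal_iqr)  # spread shrinks as n grows
print("cauchy:", cauchy_iqr)  # spread does not shrink with n
```

For the normal population the spread of the sample means narrows as n grows. For the Cauchy population it stays about as wide at n = 1,000 as at n = 10, so a larger sample does not help.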
33)
Intro Confidence Interval: used to gauge the accuracy of a sample mean
A confidence interval is an interval around the estimated mean (μₑ) that is likely to include the
unknown population mean (μ).
A 95% confidence level means that we would expect 95% of the interval estimates, computed from repeated samples, to include the
population mean.
Restrictions:
Usually, we work with only one random sample containing a large number of data points, and in that
case we have only one confidence interval estimate, which can be computed as follows for a 95%
confidence level:
μₑ ± 1.96 × SE, where SE = s / √n is the standard error (s is the sample standard deviation and n is the sample size)
A higher standard error leads to a wider confidence interval, which indicates that the mean of our random
sample is a less precise approximation of the population mean.
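A minimal sketch of this computation for a single large sample, assuming the usual normal-approximation interval (sample mean ± z × standard error). The function name `mean_confidence_interval` and the simulated data (200 draws from a normal population with mean 50 and standard deviation 10) are hypothetical and only there to make the example runnable.

```python
import math
import random
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the mean of a large sample."""
    n = len(sample)
    mean = statistics.fmean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value, about 1.96 for a 95% confidence level.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return mean - margin, mean + margin


# Hypothetical data standing in for the one random sample we actually observe.
rng = random.Random(1)
sample = [rng.gauss(50, 10) for _ in range(200)]

low, high = mean_confidence_interval(sample)
print(f"95% confidence interval for the mean: {low:.2f} to {high:.2f}")
```

A larger standard error (more variable data or a smaller sample) widens the interval returned by this function.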
31)
MARGIN OF ERROR
A margin of error tells you by how many percentage points your results may differ from the real
population value. For example, a 95% confidence interval with a 4 percent margin of error means that
your statistic will be within 4 percentage points of the real population value 95% of the time.
37)
Counter point:
The idea behind confidence levels and margins of error is that any survey or poll will differ from the
true population value by some amount. Confidence intervals and margins of error quantify
that room for error, so although 95% or 98% confidence with a 2 percent margin of error
might sound like a very good statistic, room for error is built in, which means the statistics are sometimes
wrong.
34)
Difference between the confidence interval and margin of error?
The margin of error is how far from the estimate we think the true value might be (in either direction).
The confidence interval is the estimate ± the margin of error.
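As a worked illustration of the relationship, the sketch below computes the margin of error for a survey proportion and then builds the confidence interval as estimate ± margin. It assumes the standard normal-approximation formula z × √(p(1 − p)/n). The poll figures (50% support among 600 respondents) are hypothetical.

```python
import math
import statistics


def proportion_margin_of_error(p_hat, n, confidence=0.95):
    """Normal-approximation margin of error for a sample proportion."""
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


# Hypothetical poll: 50% of 600 respondents answer "yes".
p_hat = 0.50
n = 600

moe = proportion_margin_of_error(p_hat, n)
interval = (p_hat - moe, p_hat + moe)  # confidence interval = estimate +/- margin

print(f"margin of error: {moe:.3f}")
print(f"95% confidence interval: {interval[0]:.3f} to {interval[1]:.3f}")
```

With these assumed numbers the margin of error is about 0.04, that is, roughly 4 percentage points on either side of the estimate.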
38)
SIGNIFICANCE LEVEL α
The value α = 0.05 is a common choice; it corresponds to a 95% confidence level (1 − α) and means there is only a 5% chance that our confidence interval will
not capture the true value. Using α = 0.01 would mean there is only a 1% chance.
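One way to see what α means is to repeat the sampling many times and count how often the interval captures the true mean. The sketch below does this for α = 0.05 and α = 0.01. The population (normal, mean 100, standard deviation 15), the sample size of 50, and the 2,000 repetitions are all illustrative assumptions.

```python
import math
import random
import statistics

rng = random.Random(123)
TRUE_MEAN, TRUE_SD, N, TRIALS = 100.0, 15.0, 50, 2000


def z_critical(alpha):
    """Two-sided critical value for significance level alpha."""
    return statistics.NormalDist().inv_cdf(1 - alpha / 2)


def coverage(alphas):
    """Fraction of simulated intervals that contain the true mean, per alpha."""
    hits = {a: 0 for a in alphas}
    for _ in range(TRIALS):
        sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
        mean = statistics.fmean(sample)
        se = statistics.stdev(sample) / math.sqrt(N)
        for a in alphas:
            # The interval mean +/- z * se captures the true mean when this holds.
            if abs(mean - TRUE_MEAN) <= z_critical(a) * se:
                hits[a] += 1
    return {a: hits[a] / TRIALS for a in alphas}


result = coverage([0.05, 0.01])
print(result)  # capture rates should be near 1 - alpha
```

The capture rates should land close to 95% and 99%. They can fall slightly short because the z-based interval is only an approximation at this sample size.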
35)
THINGS TO KNOW ABOUT CONFIDENCE INTERVALS
1. They tell you the most likely range of the unknown population average or percentage.
2. They provide both the location and precision of a measure.
3. Three things impact the width of a confidence interval:
a. Confidence level: 90%, 95%, or 99%; higher levels give wider intervals
b. Variability: as measured by the standard deviation; more variability gives wider intervals
c. Sample size: smaller sample sizes generate wider intervals
4. Confidence intervals can be computed on sample sizes as small as two. The intervals will
be very wide, but there’s nothing in the math preventing you from computing them. With
small sample sizes you can show that an interface is unusable, but it’s harder to show it’s
usable. For example, if 0 out of 2 people can complete a task, there’s only about a 5% chance
more than half of all users will.
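The three factors in item 3 can be made concrete with a short calculation. This sketch assumes the normal-approximation width 2 × z × s / √n. The function name `ci_width` and the baseline values (standard deviation 10, sample size 100) are hypothetical. For very small samples, such as the n = 2 case in item 4, a t-based or other small-sample method would normally be used instead of this approximation.

```python
import math
import statistics


def ci_width(sd, n, confidence=0.95):
    """Full width of a normal-approximation confidence interval for a mean."""
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return 2 * z * sd / math.sqrt(n)


baseline = ci_width(sd=10, n=100, confidence=0.95)

# Change one factor at a time and compare against the baseline width.
higher_confidence = ci_width(sd=10, n=100, confidence=0.99)
more_variability = ci_width(sd=20, n=100, confidence=0.95)
larger_sample = ci_width(sd=10, n=400, confidence=0.95)

print(f"baseline:          {baseline:.2f}")
print(f"99% confidence:    {higher_confidence:.2f}")
print(f"double the sd:     {more_variability:.2f}")
print(f"quadruple the n:   {larger_sample:.2f}")
```

Under this formula, doubling the standard deviation doubles the width, while quadrupling the sample size only halves it.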