Week 5 - PROG 8510 Week 5

PROG 8510 : Programming Statistics for Business

Data and Sampling Distributions II 

Week 5
This class
• Discuss the use of Confidence Interval in statistical analysis.
• Explain various kinds of distributions, such as Binomial Distribution, Normal
Distribution and Chi-Squared distribution.
Interval Estimation
Interval Estimation of the Population Mean
Interval Estimation of the Population Proportion
Confidence Intervals
Confidence intervals are the typical
way to present estimates as an
interval range. The more data you
have, the less variable a sample
estimate will be. The lower the level of
confidence you can tolerate, the
narrower the confidence interval will
be.
Interval Estimation (Slide 1 of 15)
• Because a point estimator cannot be expected to provide the exact value
of a population parameter, interval estimation is frequently used to
generate an estimate of the value of a population parameter.
• An interval estimate is often computed by adding and subtracting a
value, called the margin of error, to the point estimate.
• The general form of an interval estimate is:

$\text{Point estimate} \pm \text{Margin of error}$


Interval Estimation (Slide 2 of 15)
Interval Estimation of the Population Mean:
• An interval estimate provides information about how close the point
estimate is to the value of the population parameter.
• General form of an interval estimate of a population mean is:
$\bar{x} \pm \text{Margin of error}$
• General form of an interval estimate of a population proportion is:
$\bar{p} \pm \text{Margin of error}$
Interval Estimation (Slide 3 of 15)
Interval Estimation of the Population Mean (cont.):

For any normally distributed random variable:


• 90% of the values lie within 1.645 standard deviations of the
mean.
• 95% of the values lie within 1.960 standard deviations of the
mean.
• 99% of the values lie within 2.576 standard deviations of the
mean.
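The three multipliers above can be reproduced from the standard normal distribution. The following is a minimal sketch using only Python's standard library; the helper name z_multiplier is an illustrative choice, not something from the slides.

```python
from statistics import NormalDist


def z_multiplier(confidence: float) -> float:
    """Standard deviations that bracket the central `confidence` share of a normal distribution."""
    alpha = 1.0 - confidence
    # Leave alpha/2 in each tail, so look up the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} of values lie within {z_multiplier(level):.3f} standard deviations")
```

Splitting the excluded share equally between the two tails is what turns a 95% interval into a lookup of the 97.5th percentile.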
Interval Estimation (Slide 4 of 15)
Figure 6.8: Sampling Distribution of
the Sample Mean
Interval Estimation (Slide 5 of 15)
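• When the population standard deviation is unknown and must be estimated by
the sample standard deviation, the interval estimate carries an additional
source of uncertainty. If the sampling distribution follows a normal
distribution, address this additional source of uncertainty by using a
probability distribution known as the t distribution: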
o A family of similar probability distributions.
o The shape of each specific one depends on a parameter referred
to as the degrees of freedom.
o Similar in shape to the standard normal distribution, but wider.
• As the degrees of freedom increase, the t distribution narrows, its peak
becomes higher, and it becomes more similar to the standard normal
distribution.
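The behaviour described above (a wider shape that narrows toward the standard normal as the degrees of freedom grow) can be checked numerically. This sketch evaluates the textbook density formula for the t distribution with the standard library; the function name t_pdf and the chosen degrees of freedom are illustrative assumptions.

```python
import math
from statistics import NormalDist


def t_pdf(x: float, df: int) -> float:
    """Density of the t distribution with `df` degrees of freedom."""
    # Work with log-gamma to avoid overflow for large df.
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))


normal = NormalDist()
for df in (10, 20, 100):
    # Peak height at 0 rises with df; tail height at 3 falls with df.
    print(df, round(t_pdf(0.0, df), 4), round(t_pdf(3.0, df), 4))
print("normal", round(normal.pdf(0.0), 4), round(normal.pdf(3.0), 4))
```

A lower peak together with more density in the tails is what makes t-based intervals wider than normal-based ones for small samples.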
Interval Estimation (Slide 6 of 15)
Figure 6.9: Comparison of the Standard Normal Distribution with t
Distributions with 10 and 20 Degrees of Freedom
Interval Estimation (Slide 7 of 15)
Figure 6.10: t Distribution with 29 Degrees of Freedom

Use Excel's T.INV.2T function to find the value t from a t distribution such that a given
percentage of the distribution is included in the interval from $-t$ to $+t$ for any degrees of freedom.
Interval Estimation (Slide 8 of 15)
Figure 6.11: Intervals Formed Around
Sample Means from 10 Independent
Random Samples
Interval Estimation (Slide 9 of 15)
• Because approximately 90% of all the intervals constructed will
contain the population mean, we say that we are approximately
90% confident that the interval will include the population mean:
o Say that the interval has been established at the 90% confidence level.

o The value of 0.90 is referred to as the confidence coefficient.

o The interval is called the 90% confidence interval.

• The level of significance is the probability that the interval
estimation procedure will generate an interval that does not contain
the population mean:
$\alpha = \text{level of significance} = 1 - \text{confidence coefficient}$
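The "approximately 90% of all intervals" interpretation can be illustrated with a small simulation. This is a sketch under stated assumptions: the population mean, standard deviation, sample size and seed are hypothetical, and the population standard deviation is treated as known so that a normal multiplier can be used instead of a t value.

```python
import math
import random
from statistics import NormalDist, mean


def coverage(confidence=0.90, mu=50.0, sigma=10.0, n=30, trials=2000, seed=1):
    """Share of simulated intervals that contain the true mean `mu`."""
    rng = random.Random(seed)
    alpha = 1.0 - confidence                 # level of significance
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    margin = z * sigma / math.sqrt(n)        # sigma assumed known here
    hits = 0
    for _ in range(trials):
        x_bar = mean([rng.gauss(mu, sigma) for _ in range(n)])
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials


print(coverage())  # expected to land close to the confidence coefficient
```

The share of intervals that miss the mean estimates the level of significance, 1 minus the confidence coefficient.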
Interval Estimation (Slide 10 of 15)
Figure 6.12: t Distribution with $\alpha/2$ Area or Probability in the Upper Tail
Interval Estimation (Slide 11 of 15)
Table 6.5: Credit Card Balances for a Sample of 70 Households
9,430 14,661 7,159 9,071 9,691 11,032
7,535 12,195 8,137 3,603 11,448 6,525
4,078 10,544 9,467 16,804 8,279 5,239
5,604 13,659 12,595 13,479 5,649 6,195
5,179 7,061 7,917 14,044 11,298 12,584
4,416 6,245 11,346 6,817 4,353 15,415
10,676 13,021 12,806 6,845 3,467 15,917
1,627 9,719 4,972 10,493 6,191 12,591
10,112 2,200 11,356 615 12,851 9,743
6,567 10,746 7,117 13,627 5,337 10,324
13,627 12,744 9,465 12,557 8,372
18,719 5,742 19,263 6,232 7,445
Interval Estimation (Slide 12 of 15)
Figure 6.13: 95% Confidence Interval for Credit Card Balances
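A 95% interval for the Table 6.5 balances can be sketched in Python as follows. The 70 values are copied from the table. The critical value 1.995 is an assumption: it is the rounded t value for 69 degrees of freedom that a call such as Excel's T.INV.2T(0.05, 69) would be expected to return, hard-coded here because the standard library has no t distribution.

```python
import math
from statistics import mean, stdev

balances = [
    9430, 14661, 7159, 9071, 9691, 11032,
    7535, 12195, 8137, 3603, 11448, 6525,
    4078, 10544, 9467, 16804, 8279, 5239,
    5604, 13659, 12595, 13479, 5649, 6195,
    5179, 7061, 7917, 14044, 11298, 12584,
    4416, 6245, 11346, 6817, 4353, 15415,
    10676, 13021, 12806, 6845, 3467, 15917,
    1627, 9719, 4972, 10493, 6191, 12591,
    10112, 2200, 11356, 615, 12851, 9743,
    6567, 10746, 7117, 13627, 5337, 10324,
    13627, 12744, 9465, 12557, 8372,
    18719, 5742, 19263, 6232, 7445,
]

T_CRITICAL = 1.995  # assumed t value for 95% confidence, 69 degrees of freedom

n = len(balances)
x_bar = mean(balances)                 # point estimate of the population mean
s = stdev(balances)                    # sample standard deviation
margin = T_CRITICAL * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"n = {n}, sample mean = {x_bar:.0f}, s = {s:.0f}")
print(f"95% CI: {lower:.0f} to {upper:.0f}")
```

The interval has the general form of point estimate plus or minus margin of error, with the margin built from the t value and the estimated standard error.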
Interval Estimation (Slide 13 of 15)
Interval Estimation of the Population Proportion:

The sampling distribution of $\bar{p}$ plays a key role in computing the margin of
error in the interval estimate.
Interval Estimation (Slide 14 of 15)
Figure 6.14: Normal Approximation of the Sampling Distribution of $\bar{p}$
Interval Estimation (Slide 15 of 15)
Figure 6.15: 95% Confidence Interval for Survey of Women Golfers
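An interval estimate of a population proportion based on the normal approximation can be sketched as below. The survey counts (900 respondents, 396 answering "yes") are hypothetical inputs chosen for illustration and are not taken from the figure; the function name is also an assumption.

```python
import math
from statistics import NormalDist


def proportion_interval(successes: int, n: int, confidence: float = 0.95):
    """Normal-approximation interval for a population proportion."""
    p_bar = successes / n                          # sample proportion
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    std_error = math.sqrt(p_bar * (1.0 - p_bar) / n)
    margin = z * std_error
    return p_bar, margin


p_bar, margin = proportion_interval(successes=396, n=900)  # hypothetical survey
print(f"{p_bar:.4f} +/- {margin:.4f}")
```

The approximation is usually considered reasonable when both n times the proportion and n times one minus the proportion are at least 5.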
Random Variables
Discrete Random Variables
Continuous Random Variables
Random Variables (Slide 1 of 6)
• In probability terms, a random variable is a numerical description of the
outcome of a random experiment.
• Random variables are quantities whose values are not known with certainty.
• A random variable can be classified as being either:
o Discrete.
o Continuous.
Random Variables (Slide 2 of 6)
Discrete Random Variables:
• A random variable that can take on only specified discrete values is referred
to as a discrete random variable.
• Table 4.7 provides examples of discrete random variables.
• Table 4.8 repeats the joint probability table for the Lancaster Savings and
Loan data, but with the values labeled as random variables.
Random Variables (Slide 3 of 6)
Table 4.7: Examples of Discrete Random Variables
Random Experiment                           | Random Variable (x)                     | Possible Values for the Random Variable
Flip a coin                                 | Face of a coin showing                  | 1 if heads; 0 if tails
Roll a die                                  | Number of dots showing on top of die    | 1, 2, 3, 4, 5, 6
Contact five customers                      | Number of customers who place an order  | 0, 1, 2, 3, 4, 5
Operate a health care clinic for one day    | Number of patients who arrive           | 0, 1, 2, 3, …
Offer a customer the choice of two products | Product chosen by customer              | 0 if none; 1 if choose product A; 2 if choose product B
Random Variables (Slide 4 of 6)
Table 4.8: Joint Probability Table for Customer Mortgage Prepayments
Random Variables (Slide 5 of 6)
Continuous Random Variables:
• A random variable that may assume any numerical value in an interval or
collection of intervals is called a continuous random variable.
• Technically, relatively few random variables are truly continuous; examples
are values related to time, weight, distance, and temperature.
• Many discrete random variables have a large number of potential outcomes
and so can be effectively modeled as continuous random variables.
Random Variables (Slide 6 of 6)
Table 4.9: Examples of Continuous Random Variables
Discrete Probability
Distributions
Custom Discrete Probability Distribution
Expected Value and Variance
Discrete Uniform Probability Distribution
Binomial Probability Distribution
Poisson Probability Distribution
Discrete Probability Distributions (Slide 1 of 21)
• The probability distribution for a random variable describes the range
and relative likelihood of possible values for a random variable.
• For a discrete random variable x, the probability distribution is defined by
the probability mass function, denoted by f (x).
• The probability mass function provides the probability for each value of
the random variable.
• We can present probability distributions graphically.
Discrete Probability Distributions (Slide 2 of 21)
Figure 4.9: Graphical
Representation of the
Probability Distribution for
Whether a Customer Defaults
on a Mortgage
Discrete Probability Distributions (Slide 3 of 21)
Custom Discrete Probability Distribution:
• A probability distribution that is generated from observations is called an
empirical probability distribution.
• An empirical probability distribution is considered a custom discrete
probability distribution if it is discrete and the possible values of the
random variable have different values:
o Useful for describing different possible scenarios that have different
probabilities.
o Probabilities generated using either the subjective method or the
relative frequency method.
Discrete Probability Distributions (Slide 4 of 21)

Table 4.10: Summary Table of Number of Payments Made per Year

• Example: The random variable describing the number of mortgage
payments made per year by randomly chosen customers.
Discrete Probability Distributions (Slide 5 of 21)
Figure 4.10: Excel
PivotTable for Number of
Payments Made per Year
Discrete Probability Distributions (Slide 6 of 21)
Expected Value and Variance:
• The expected value, or mean, of a random variable is a measure of the
central location for the random variable.
Discrete Probability Distributions (Slide 7 of 21)
Table 4.11: Calculation of the Expected Value for Number of Payments
Made per Year by a Lancaster Savings and Loan Mortgage Customer

If Lancaster Savings and Loan signs a new mortgage customer, the
expected number of payments per year for this customer is 13.8.
Discrete Probability Distributions (Slide 8 of 21)
Figure 4.11: Using Excel
SUMPRODUCT Function to
Calculate the Expected Value
for Number of Payments
Made per Year by a
Lancaster Savings and Loan
Mortgage Customer
Discrete Probability Distributions (Slide 9 of 21)
Figure 4.12: Excel
Calculation of the
Expected Value for
Number of Payments
Made per Year by a
Lancaster Savings and
Loan Mortgage Customer
Discrete Probability Distributions (Slide 10 of 21)
• Variance is a measure of variability in the values of a random variable:
$\mathrm{Var}(x) = \sigma^2 = \sum_x (x - \mu)^2 f(x)$

An essential part of the variance formula is the deviation, $x - \mu$,
which measures how far a particular value of the random variable is from
the expected value, or mean, $\mu$.
Discrete Probability Distributions (Slide 11 of 21)
Table 4.12: Calculation of the Variance for Number of Payments Made per
Year by a Lancaster Savings and Loan Mortgage Customer

The standard deviation, $\sigma$, is defined as the positive square root of the
variance.
The standard deviation for the payments made per year by a mortgage
customer is $\sqrt{42.360} = 6.508$.
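The expected value, variance and standard deviation calculations can be sketched in Python in the same spirit as the SUMPRODUCT approach. The probability distribution below is a hypothetical example, since the Lancaster Savings and Loan table values are not reproduced in these slides; only the final square-root check uses a number from the text.

```python
import math

# Hypothetical custom discrete distribution: value -> probability f(x).
pmf = {0: 0.1, 1: 0.2, 2: 0.4, 3: 0.3}


def expected_value(pmf: dict) -> float:
    """Sum of each value weighted by its probability."""
    return sum(x * f for x, f in pmf.items())


def variance(pmf: dict) -> float:
    """Probability-weighted sum of squared deviations from the mean."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * f for x, f in pmf.items())


mu = expected_value(pmf)
var = variance(pmf)
sigma = math.sqrt(var)
print(f"mean = {mu:.2f}, variance = {var:.2f}, std dev = {sigma:.3f}")

# Check of the figure quoted in the slides: square root of 42.360.
print(round(math.sqrt(42.360), 3))
```

Each function is a weighted sum, which is why a single SUMPRODUCT call over the value and probability columns gives the same result in Excel.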
Discrete Probability Distributions (Slide 12 of 21)
Figure 4.13: Excel Calculation of the Variance for Number of Payments
Made per Year by a Lancaster Savings and Loan Mortgage Customer
Discrete Probability Distributions (Slide 13 of 21)
Discrete Uniform Probability Distribution:
• When the possible values of the probability mass function are all equal,
then the probability distribution is a discrete uniform probability
distribution:
$f(x) = 1/n$
• Where n = the number of unique values that may be assumed by the
random variable.
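As a small illustration of a discrete uniform distribution, the sketch below uses the die roll from Table 4.7, where each of the n = 6 faces has probability 1/n. Exact fractions are used so the probabilities sum to exactly 1; the variable names are illustrative.

```python
from fractions import Fraction

faces = [1, 2, 3, 4, 5, 6]
n = len(faces)

# Every possible value gets the same probability 1/n.
pmf = {x: Fraction(1, n) for x in faces}

mean_roll = sum(x * f for x, f in pmf.items())
var_roll = sum((x - mean_roll) ** 2 * f for x, f in pmf.items())

print(pmf[3], float(mean_roll), float(var_roll))
```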
Discrete Probability Distributions (Slide 14 of 21)
Binomial Probability Distribution:
• A binomial probability distribution is a discrete probability distribution that
can be used to describe many situations in which a fixed number (n) of
repeated identical and independent trials has two, and only two, possible
outcomes:
o Success.
o Failure.
Discrete Probability Distributions (Slide 15 of 21)
The probability mass function for a binomial random variable calculates
the probability of x successes in n independent trials:
$f(x) = \binom{n}{x} p^x (1 - p)^{n - x}$
where p is the probability of success on any one trial.
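The binomial probability mass function can be sketched in a few lines of Python. The values n = 3 and p = 0.30 are hypothetical inputs chosen for illustration, not figures taken from the Martin's example.

```python
import math


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    # Number of ways to place x successes among n trials, times the
    # probability of any one such sequence.
    return math.comb(n, x) * p ** x * (1.0 - p) ** (n - x)


n, p = 3, 0.30  # hypothetical: 3 trials, 30% chance of success on each
distribution = {x: binomial_pmf(x, n, p) for x in range(n + 1)}
for x, prob in distribution.items():
    print(x, round(prob, 3))
```

In Excel the same probabilities come from BINOM.DIST with the cumulative argument set to FALSE.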
Discrete Probability Distributions (Slide 16 of 21)
Table 4.13: Probability Distribution for the Number of Customers Who Click
on the Link in the Martin’s Targeted E-Mail

• Example: Martin’s, an online specialty clothing store, sends out targeted
e-mails to its best customers notifying them about special discounts
available only to the recipients.
Discrete Probability Distributions (Slide 17 of 21)

Figure 4.14: Graphical Representation of the Probability Distribution for
the Number of Customers Who Click on the Link in the Martin’s
Targeted E-Mail
Discrete Probability Distributions (Slide 18 of 21)
Figure 4.15: Excel Worksheet for Computing Binomial Probabilities of the
Number of Customers Who Make a Purchase at Martin’s
Discrete Probability Distributions (Slide 19 of 21)
Poisson Probability Distribution:
• The Poisson probability distribution describes a discrete random variable
that is often useful in estimating the number of occurrences of an event
over a specified interval of time or space.
• Examples:
o Number of patients who arrive at a health care clinic in one hour.
o Number of computer-server failures in a month.
o Number of repairs needed in 10 miles of highway.
o Number of leaks in 100 miles of pipeline.
Discrete Probability Distributions (Slide 20 of 21)
• If the following two properties are satisfied, the number of occurrences is
a random variable described by the Poisson probability distribution:
o The probability of an occurrence is the same for any two intervals (of time or
space) of equal length.
o The occurrence or nonoccurrence in any interval (of time or space) is
independent of the occurrence or nonoccurrence in any other interval.
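When the two properties hold, the probability of x occurrences in an interval is given by the standard Poisson formula, the mean raised to the power x, times e to the minus mean, divided by x factorial. The sketch below assumes a hypothetical mean of 3 arrivals per interval; that rate is not taken from the slides.

```python
import math


def poisson_pmf(x: int, mu: float) -> float:
    """Probability of exactly x occurrences when the mean number per interval is mu."""
    return mu ** x * math.exp(-mu) / math.factorial(x)


mu = 3.0  # hypothetical mean number of arrivals per interval
for x in range(6):
    print(x, round(poisson_pmf(x, mu), 4))

# Probability of at most 2 arrivals: add the individual probabilities.
p_at_most_2 = sum(poisson_pmf(x, mu) for x in range(3))
print(round(p_at_most_2, 4))
```

Excel's POISSON.DIST function returns the same values, with its cumulative argument selecting between a single probability and the running total.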
Discrete Probability Distributions (Slide 21 of 21)
Figure 4.16: Excel
Worksheet for
Computing Poisson
Probabilities of the
Number of Patients
Arriving at the
Emergency Room
Continuous Probability
Distributions
Uniform Probability Distribution
Normal Probability Distribution
Continuous Probability Distributions (Slide 1 of 30)
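A fundamental difference separates discrete and continuous random
variables in terms of how probabilities are computed: for a discrete random
variable, the probability mass function f (x) gives the probability of each
particular value, whereas for a continuous random variable, probabilities are
given by the area under the graph of the probability density function f (x)
over an interval.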
Continuous Probability Distributions (Slide 2 of 30)
Uniform Probability Distribution:
• Example: Random variable x representing the flight time of an
airplane traveling from Chicago to New York.
• With every interval of a given length being equally likely, the random
variable x is said to have a uniform probability distribution.
Continuous Probability Distributions (Slide 3 of 30)
Figure 4.17: Uniform Probability Distribution for Flight Time
Continuous Probability Distributions (Slide 4 of 30)
Figure 4.18: The Area Under the Graph Provides the Probability of a Flight
Time Between 120 and 130 Minutes
Continuous Probability Distributions (Slide 5 of 30)

• The calculation of the expected value and variance for a continuous
random variable is analogous to that for a discrete random variable.
• For a uniform continuous probability distribution, the formulas for the
expected value and variance are:
$E(x) = \frac{a + b}{2}$, $\mathrm{Var}(x) = \frac{(b - a)^2}{12}$
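The uniform calculations can be sketched as follows, where a and b denote the smallest and largest values the random variable can take. The endpoints a = 120 and b = 140 minutes are assumed for illustration; the slides only mention the interval from 120 to 130 minutes, so the full range is a guess.

```python
def uniform_probability(lo: float, hi: float, a: float, b: float) -> float:
    """P(lo <= x <= hi) for x uniform on [a, b]: area of a rectangle of height 1/(b - a)."""
    overlap = max(0.0, min(hi, b) - max(lo, a))
    return overlap / (b - a)


a, b = 120.0, 140.0  # assumed flight-time range in minutes

expected = (a + b) / 2            # midpoint of the range
variance = (b - a) ** 2 / 12

print(uniform_probability(120, 130, a, b))
print(expected, round(variance, 2))
```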
Continuous Probability Distributions (Slide 11 of 30)
Normal Probability Distribution:
• One of the most useful probability distributions for describing a continuous random variable
is the normal probability distribution.
• Wide variety of practical and business applications:

o Heights and weights of people.

o Test scores.

o Scientific measurements.

o Uncertain quantities such as demand for products.

o Rate of return for stocks and bonds.

o Time it takes to manufacture a part or complete an activity.


Continuous Probability Distributions (Slide 12 of 30)
Figure 4.21: Bell-Shaped Curve for the Normal Distribution
Continuous Probability Distributions (Slide 13 of 30)
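• The probability density function that defines the bell-shaped curve of the
normal distribution is:
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x - \mu)^2 / (2\sigma^2)}$
where $\mu$ is the mean and $\sigma$ is the standard deviation.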
Continuous Probability Distributions (Slide 14 of 30)
Characteristics of the normal distribution:

1. The entire family of normal distributions is differentiated by two
parameters: the mean $\mu$ and the standard deviation $\sigma$.
2. The highest point on the normal curve is at the mean, which is also
the median and mode of the distribution.
3. The mean of the distribution can be any numerical value: negative,
zero, or positive (see Figure 4.22).
Continuous Probability Distributions (Slide 15 of 30)
Figure 4.22: Three Normal Distributions with the Same Standard Deviation
but Different Means ($\mu = -10$, $\mu = 0$, $\mu = 20$)
Continuous Probability Distributions (Slide 16 of 30)
Characteristics of the normal distribution (continued):

4. The normal distribution is symmetric, with the shape of the normal
curve to the left of the mean a mirror image of the shape of the
normal curve to the right of the mean.
5. The tails of the curve extend to infinity in both directions and
theoretically never touch the horizontal axis.
6. The standard deviation determines how flat and wide the normal
curve is; larger values of the standard deviation result in wider,
flatter curves, showing more variability in the data (see Figure
4.23).
Continuous Probability Distributions (Slide 17 of 30)
Figure 4.23: Two Normal Distributions with the Same Mean but Different
Standard Deviations ($\sigma = 5$, $\sigma = 10$)
Continuous Probability Distributions (Slide 18 of 30)
Characteristics of the normal distribution (continued):

7. Probabilities for the normal random variable are given by areas under the normal
curve. The total area under the curve for the normal distribution is 1. Because the
distribution is symmetric, the area under the curve to left of the mean is 0.50 and
the area to the right is 0.50.
8. The percentages of values in some commonly used intervals are:
a. 68.3% of the values of a normal random variable are within plus or minus one
standard deviation of its mean.
b. 95.4% of the values of a normal random variable are within plus or minus two
standard deviations of its mean.
c. 99.7% of the values of a normal random variable are within plus or minus three
standard deviations of its mean.
Empirical Rule for The Normal Distribution

The 68-95-99.7 Rule (the Empirical Rule)


In bell-shaped distributions, about 68% of the values fall
within one standard deviation of the mean, about 95% of
the values fall within two standard deviations of the mean,
and about 99.7% of the values fall within three standard
deviations of the mean.
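The percentages in the empirical rule follow directly from areas under the standard normal curve. A minimal check with the standard library (the helper name share_within is illustrative):

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def share_within(k: float) -> float:
    """Area under the normal curve within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```

Because any normal variable can be standardized, the same shares apply whatever the mean and standard deviation are.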
Continuous Probability Distributions (Slide 19 of 30)
Figure 4.24: Areas Under the Curve for Any Normal Distribution
Example: Continuous Probability Distributions (Slide 20 of 30)
• Application of the normal probability distribution:
• Grear Aircraft Engines sells aircraft engines to commercial airlines.
Grear offers a performance-based sales contract guaranteeing that
engines will provide a certain amount of lifetime flight hours, subject to
the airline purchasing a preventive-maintenance service plan. Based on
extensive flight testing and computer simulations, Grear believes
lifetime flight hours are normally distributed with a mean
$\mu$ = 36,500 hours and standard deviation $\sigma$ = 5,000 hours.
• What is the probability that an engine will last more than 40,000
hours?
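One way to answer this question outside Excel is sketched below with Python's standard library, using the mean of 36,500 hours and standard deviation of 5,000 hours given above. The variable names are illustrative.

```python
from statistics import NormalDist

# Lifetime flight hours: normal with mean 36,500 and standard deviation 5,000.
lifetime = NormalDist(mu=36_500, sigma=5_000)

# P(x > 40,000) is the area to the right of 40,000: one minus the cumulative area.
p_over_40k = 1.0 - lifetime.cdf(40_000)
print(f"P(x > 40,000) = {p_over_40k:.4f}")  # prints 0.2420
```

The cdf call plays the same role as Excel's NORM.DIST with the cumulative argument set to TRUE.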
Continuous Probability Distributions (Slide 21 of 30)
Figure 4.25: Grear Aircraft Engines Lifetime Flight Hours Distribution
Continuous Probability Distributions (Slide 22 of 30)
Figure 4.26: Excel Calculations for Grear Aircraft Engines Example
Continuous Probability Distributions (Slide 23 of 30)
• Grear is considering a guarantee that will provide a discount on a
replacement aircraft engine if the original engine does not meet the lifetime-
flight-hour guarantee.
• How many lifetime flight hours should Grear guarantee if Grear wants no
more than 10% of aircraft engines to be eligible for the discount guarantee?
(See Figure 4.27.)
• How do we calculate the probability that an engine will have a lifetime of flight
hours greater than 30,000 but less than 40,000 hours? (See Figures 4.28
and 4.29.)
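Both questions can be sketched with the same normal model (mean 36,500 hours, standard deviation 5,000 hours). The first is an inverse lookup: find the flight-hour value with 10% of the area to its left. The second is a difference of two cumulative areas. Variable names are illustrative.

```python
from statistics import NormalDist

lifetime = NormalDist(mu=36_500, sigma=5_000)

# Guarantee level such that only 10% of engines fall below it.
guarantee_hours = lifetime.inv_cdf(0.10)

# P(30,000 <= x <= 40,000): area left of 40,000 minus area left of 30,000.
p_between = lifetime.cdf(40_000) - lifetime.cdf(30_000)

print(f"guarantee = {guarantee_hours:,.0f} hours")
print(f"P(30,000 <= x <= 40,000) = {p_between:.4f}")
```

In Excel the inverse lookup corresponds to NORM.INV and the area difference to two NORM.DIST calls.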
Continuous Probability Distributions (Slide 24 of 30)
Figure 4.27: Grear’s Discount Guarantee
Continuous Probability Distributions (Slide 25 of 30)
Figure 4.28: Graph Showing the Area Under the Curve Corresponding to
$P(30{,}000 \le x \le 40{,}000)$ in the Grear Aircraft Engines Example
Continuous Probability Distributions (Slide 26 of 30)
Figure 4.29: Using Excel to Find $P(30{,}000 \le x \le 40{,}000)$ in the Grear
Aircraft Engines Example
