
Introduction to Hypothesis Testing

Module Objectives
To introduce statistical hypothesis testing

To define null and alternative hypotheses

To demonstrate a method for constructing a hypothesis test

To provide a roadmap for selecting the appropriate hypothesis test for a given problem

DMAIC Roadmap
[Figure: DMAIC roadmap listing the deliverables of each phase]
Define: Project Scope & Problem Validation; Problem Statement; Project Metrics; Objective Statement(s); Team Members
Measure: Refine the Project; Process Maps & Simplification; C&E for Variable Reduction; Measurement Capability; Data Collection Systems; Process Capability
Analyze: Failure Modes & Effects Analysis; ID Variation: Graphical Analysis; ID Variation: Statistical Analysis; Plan for DOE
Improve: Design & Execute An Experiment; Define Y=f(x); Recommended Changes
Control: Optimize & Refine Solutions; Control Xs & Monitor Ys; Close & HandOff Project

Process Analysis
[Figure: Analyze phase roadmap]
Analyze: Failure Modes & Effects Analysis; ID Variation: Graphical Analysis; ID Variation: Statistical Analysis; Plan for DOE
Session topics: Project Recap & Remaining Session 1 Deliverables; ID Variation: Exploratory Analysis; Hypothesis Testing and Sample Size; ID Variation: ANOVA; Planning for DOE; Complete Phase Summary; Conclusions, Issues, & Next Steps
Support Tools: FMEA and Data Manipulation using Excel; Graphical Techniques; Correlation & Regression; Confidence Intervals, Central Limit Theorem; Testing for Variation and Central Tendency; Attribute Testing

Questions asked repeatedly during a Six Sigma Project:


Does this X cause a change in Y? That is, what is Y = f(X)?
Is my understanding of process location, spread, shape, and consistency correct?

Using data to improve our understanding

These questions are answered using:


Assumptions made based on knowledge of the process:
  Design intent
  Historical performance

Real-time data

Current assumptions might be modified because of new data.

Mixing the Past and Present


The trouble with data is that it is noisy.
Many outcomes are possible.
The data may suggest that something has changed when it has not.
The data may not indicate a change when one has actually occurred.

[Figure: Likelihood of the number of heads observed in ten flips of a fair coin]

The original assumptions are usually applied unless the real-time data is convincing in suggesting a change.
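
The coin-flip likelihoods described above can be reproduced with a short calculation. This is a minimal sketch that assumes a binomial model for the number of heads; the function name `heads_pmf` is made up for illustration.

```python
from math import comb

def heads_pmf(n_flips=10, p_heads=0.5):
    """Return P(k heads) for k = 0..n_flips under a binomial model."""
    return [
        comb(n_flips, k) * p_heads**k * (1 - p_heads) ** (n_flips - k)
        for k in range(n_flips + 1)
    ]

pmf = heads_pmf()
for k, prob in enumerate(pmf):
    # Print one row per possible outcome, e.g. " 5 heads: 0.2461"
    print(f"{k:2d} heads: {prob:.4f}")
```

Even with a fair coin, exactly 5 heads turns up only about a quarter of the time, and 7 or more heads occurs in roughly 17% of samples. Noise alone can make a fair coin look biased.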

Using data to test our understanding: an Example


Consider the following: during the Measure Phase, data was collected on cycle time as an output and time of day as an input. Based on history and the process owner's impression, cycle time might differ at different times of the day. The data looks like this:

[Figure: Cycle Time by Time of Day. Cycle time (axis from 0 to 150) plotted for Early AM, Late AM, Early PM, and Late PM]

Could this data have come from a process where time of day is not significant?
Is this enough evidence to conclude that cycle time depends on time of day?
How much evidence is enough?

A strategy for combining past and present


Be clear on the assumptions made about the process.
  Process mean cycle time = 90 minutes
  Process output is normally distributed
  Cycle time is independent of time of day

Considering the assumptions to be true, determine what the real-time data should look like.
  Averages of 10 cycle times should range between 84 and 96 minutes
  Frequency distribution should be symmetrical
  Average cycle time for each time of day should be close to 90 minutes

Collect data and organize it into appropriate statistics.
If the real-time data is different than expected, modify the assumptions.

Example:
Assume that a process cycle time averages 90 minutes. Averages of 10 cycle times should then range between 84 and 96 minutes. Suppose a real-time sample of ten cycle times averages:
  94.3 minutes: within the expected range. Keep the assumption.
  97.6 minutes: outside the expected range. Conclude that mean cycle time has increased.
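
The keep-or-modify decision in this example can be sketched in a few lines. The source gives the 84 to 96 minute range but not how it was derived, so the process standard deviation `SIGMA` and the multiplier `Z` below are assumptions chosen only so that the band reproduces 84 to 96; the function names are likewise hypothetical.

```python
from math import sqrt

def expected_band(mean, sigma, n, z):
    """Range in which averages of n observations should fall if the assumptions hold."""
    half_width = z * sigma / sqrt(n)
    return mean - half_width, mean + half_width

def keep_assumption(sample_mean, band):
    """True when the observed average is consistent with the assumed process."""
    low, high = band
    return low <= sample_mean <= high

# ASSUMPTION: sigma and z are not stated in the source. These values are
# picked so that the band for averages of 10 comes out as 84..96 minutes.
SIGMA = 2 * sqrt(10)
Z = 3

band = expected_band(mean=90, sigma=SIGMA, n=10, z=Z)
for average in (94.3, 97.6):
    verdict = "keep the assumption" if keep_assumption(average, band) else "mean has changed"
    print(f"{average} minutes: {verdict}")
```

A wider band (larger `Z`) makes false alarms rarer but also makes real shifts harder to detect, which is the trade-off the rest of this module formalizes.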

What Is a Hypothesis Test?


A hypothesis test is simply comparing reality to an assumption and asking, "Did things change?"

Or

A hypothesis test is testing whether real data fits a model.

Or

A hypothesis test is comparing a statistic to a hypothesis.

The Essence of a Hypothesis Test


Compute from the data a relevant test statistic in order to compare a particular hypothesis of interest (the null hypothesis) against some alternative hypothesis.
Refer the statistic to a reference distribution, which shows how the statistic would be distributed if the null hypothesis were true.
Calculate the probability that a test statistic value at least as extreme as the one observed would occur by chance if the null hypothesis were true.
This probability is called the p-value (the observed significance level). If the p-value is small, typically below the chosen significance level α, the null hypothesis is discredited and we assert that a statistically significant difference has been observed.
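
The three steps can be illustrated with the fair-coin setting used earlier in this module. In this sketch the test statistic is the number of heads, the reference distribution is binomial with 10 flips and p = 0.5, and the observation of 9 heads is a made-up value used only for illustration.

```python
from math import comb

def upper_tail_p_value(observed_heads, n_flips=10, p_null=0.5):
    """P(heads >= observed_heads) if the null hypothesis (fair coin) were true."""
    return sum(
        comb(n_flips, k) * p_null**k * (1 - p_null) ** (n_flips - k)
        for k in range(observed_heads, n_flips + 1)
    )

observed = 9   # hypothetical observation: the test statistic
alpha = 0.05   # chosen significance level

p_value = upper_tail_p_value(observed)
print(f"p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Here the p-value is 11/1024, so 9 heads in 10 flips would be declared statistically significant at α = 0.05 for the one-sided alternative that the coin favors heads.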

Review: Populations vs. Samples


Sample: the group of objects from which one actually gathers data

Population: the entire group of objects about which one wishes to draw an inference

Population Parameters
  Mean: μ
  Standard Deviation: σ

Sample Statistics
  Sample Mean: x̄
  Estimate for Standard Deviation: s

Population Values Vs. Sample Statistics


                           Population Parameters    Sample Statistics
                           (what God knows)         (what we measure)
Mean                       μ                        x̄
Standard Deviation         σ                        s
Proportion (Percentage)    p                        p̂
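
As a small illustration of the sample side of this comparison, the sketch below computes x̄, s, and p̂ with Python's standard library. The five cycle times are invented for the example; the 6 defaults in 150 loans are the figures quoted in Example 3 later in this module.

```python
from statistics import mean, stdev

# Hypothetical sample of cycle times in minutes (not from the source)
cycle_times = [88, 92, 95, 85, 90]

x_bar = mean(cycle_times)   # sample mean, estimates the population mean
s = stdev(cycle_times)      # sample standard deviation (n - 1 divisor), estimates sigma

# Counts taken from Example 3 in this module: 6 defaults in 150 loans
defaults, loans = 6, 150
p_hat = defaults / loans    # sample proportion, estimates the population proportion

print(f"x-bar = {x_bar}, s = {s:.2f}, p-hat = {p_hat:.2%}")
```

A different sample would give different values of x̄, s, and p̂, while the population parameters they estimate stay fixed.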

The Purpose of Inferential Statistics


The purpose:

To draw conclusions (inferences) about populations from measurements of samples

The truth about populations:

Populations have parameters which are fixed, and known only to God. Statistics is the science of guessing, from sample data, what God knows about population parameters.

Our initial guesses, before we collect data, are called hypotheses.

Hypotheses
Hypotheses:
Hypotheses are statements about population parameters, not statements about samples.

The Null Hypothesis:


Abbreviated as H0
Usually H0 is a statement of no effect or no difference.
We reject or fail to reject H0 based on statistical evidence.

The Alternative Hypothesis:


Abbreviated as Ha
A statement about a population parameter that is suspected of being true and is accepted if H0 is rejected.

Hypothesis Example
To meet an established performance standard the average daily balance of non-interest generating funds must be less than $10K.

What is H0 in common language? The average daily balance of non-interest generating funds is equal to $10K.
What is H0 in statistical terms? μ_actual = μ_Std = $10K

What is Ha in common language? The average daily balance of non-interest generating funds is less than $10K.
What is Ha in statistical terms? μ_actual < μ_Std = $10K

Hypotheses for Order Processing


The Order processing department needs to improve throughput time due to the increase in demand and an organizational goal to hold head count steady. A change in process flow and form design was implemented to improve order cycle time. The sample mean processing time prior to the change was 125 minutes, and the sample mean processing time after the change was 118 minutes.

What is H0 in common language? There is no difference in average processing time between the old and the new processes.
What is H0 in statistical terms? μ_New = μ_Old

What is Ha in common language? The new process completes orders faster, on average, than the old process.
What is Ha in statistical terms? μ_New < μ_Old

Yet Another Hypothesis Example


Two wholesale distributors are currently under contract. One will be selected in a consolidation move. How does the variability in order cycle time compare between two wholesale distributors?

What is H0 in common language? There is no difference in order time variability between distributors 1 and 2.
What is H0 in statistical terms? σ₁² = σ₂²

What is Ha in common language? The order time variation for distributor 1 is different from that of distributor 2.
What is Ha in statistical terms? σ₁² ≠ σ₂²

Finally, the Last Hypothesis Example


Historically, 42% of all orders to a stockbroker were for Buys. An analysis of 100 stock orders placed on Mondays showed that 71% of them were Buys. Is the rate of Buys on Mondays different than the historical population?

What is H0 in common language? There is no difference in the proportion of buy orders between the historical population and Mondays.
What is H0 in statistical terms? P_Mondays = P_population

What is Ha in common language? The proportion of buy orders on Mondays is greater than the historical population.
What is Ha in statistical terms? P_Mondays > P_population
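
The figures in this example (42% historical, 71% of 100 Monday orders) are enough to run the test. The sketch below uses a normal-approximation z-test for one proportion, which is an assumption on my part since the source does not name the method here, and it is one-sided to match Ha as stated.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z(p_hat, p0, n):
    """z statistic and upper-tail p-value for H0: P = p0 vs Ha: P > p0."""
    standard_error = sqrt(p0 * (1 - p0) / n)   # computed under H0
    z = (p_hat - p0) / standard_error
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value

z, p_value = one_proportion_z(p_hat=0.71, p0=0.42, n=100)
print(f"z = {z:.2f}, one-sided p-value = {p_value:.2e}")
```

The statistic is nearly six standard errors above the historical rate, so the p-value is far below 0.05 and H0 would be rejected. For a two-sided alternative the p-value would be doubled, which does not change the conclusion here.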

Order in the Court


Hypothesis testing is like the American legal system, where a defendant is assumed innocent until proven guilty.

H0: Defendant is innocent (assumed)
Ha: Defendant is guilty (must be proved)

                        True State of Nature
Verdict                 Did Not Commit Crime         Committed a Crime
Did Not Commit Crime    Correct Verdict              Guilty Goes Free
Committed a Crime       Innocent Person Convicted    Correct Verdict

Hypothesis Testing Errors


Type I error
  Rejecting the null hypothesis when it is, in fact, true
  Also known as producer risk
  The probability of a Type I error is denoted by α (0 < α < 1)

Type II error
  Failing to reject the null hypothesis when it is, in fact, false
  Also known as consumer risk
  The probability of a Type II error is denoted by β (0 < β < 1)

                    True State of Nature
Conclusion Drawn    H0                  Ha
H0                  Correct Decision    Type II Error
Ha                  Type I Error        Correct Decision
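
To make α and β concrete, the sketch below computes both exactly for a coin-flip test. The decision rule (reject H0 "the coin is fair" when 10 flips give 0, 1, 9, or 10 heads) and the biased coin with P(heads) = 0.7 are hypothetical choices made only for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def rejection_probability(p_true, n=10):
    """Probability that the rule 'reject if heads <= 1 or heads >= 9' fires."""
    return sum(binom_pmf(k, n, p_true) for k in range(n + 1) if k <= 1 or k >= 9)

# Type I error: H0 is true (fair coin) but the rule rejects it
alpha = rejection_probability(0.5)

# Type II error: H0 is false (hypothetical coin with P(heads) = 0.7)
# but the rule fails to reject it
beta = 1 - rejection_probability(0.7)

print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
```

With only 10 flips, α is about 2% but β is about 85%: the rule rarely raises a false alarm, yet it usually misses a coin that really is biased toward heads.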

A Hypothesis Testing Flow Chart


State Null and Alternative Hypotheses
  Null Hypothesis H0: No difference
  Alternative Ha: Difference exists (<, ≠, >)
Set Confidence level (Pick α)
  Usually 95% (α = 0.05)
Select Appropriate Hypothesis Test
  What type of data? Variable or Attribute? Means, sigmas, counts? How many sets?
Set β and calculate sample size
  Or if the sample size is fixed, calculate the power (1-β) from the fixed sample size.
Collect Data and Analyze in Minitab
  Run the test. Collect the data. Analyze in Minitab. Interpret p-value against α.
Translate to Practical Solution and Verify
  What does it mean to accept H0? To accept Ha? Verify the results before final implementation
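
The sample-size step can be sketched with the standard formula for a one-sided, one-sample z-test with known standard deviation, n = ((z_α + z_β)·σ/δ)². This is one common textbook case, not necessarily what Minitab computes for a given test, and the values of `sigma` and `delta` below are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_one_sided_z(alpha, beta, sigma, delta):
    """Smallest n that detects a mean shift of delta with risks alpha and beta."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(1 - beta)
    return ceil(((z_alpha + z_beta) * sigma / delta) ** 2)

# ASSUMPTION: sigma = 10 and delta = 5 are illustrative values, not from the source
n = sample_size_one_sided_z(alpha=0.05, beta=0.10, sigma=10, delta=5)
print(f"required sample size: {n}")
```

Halving the difference to detect roughly quadruples the required sample size, which is why the size of the difference that matters should be agreed before data collection.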

Criteria for Selecting the Appropriate Test


Types of data
  Variable data (continuous data)
  Attribute data (discrete data)
Number of levels for a given input
  One
  Two
  More than two
Types of distributions
  Normal
  Non-normal
Types of tests
  Means
  Medians
  Variances
  Counts
  Proportions

Example 1 Hypothesis Test Setup


Historically, the average design time for a marketing document was 38 hrs. A Six Sigma project was conducted to reduce this time. After the project was completed, 50 documents were selected at random, and their design times were analyzed.

State Null and Alternative Hypotheses
  H0: There is no difference between the new document design time and the historical. μ_new = μ_std
  Ha: New documents are faster to design, on average, than the historical average. μ_new < μ_std

Set Confidence level (Pick α)
  Usually 95% (α = 0.05)

Select Appropriate Hypothesis Test
  1-sample t-test

Example 2 Hypothesis Test Setup


A study was carried out to determine if there is a difference in the average time to process an invoice at two different locations.

State Null and Alternative Hypotheses
  H0: There is no difference between the 2 locations. μ₁ = μ₂
  Ha: There is a difference between the 2 locations. μ₁ ≠ μ₂

Set Confidence level (Pick α)
  Usually 95% (α = 0.05)

Select Appropriate Hypothesis Test
  2-sample t-test

Example 3 Hypothesis Test Setup


A bank believes that new employees are more likely to approve loans that end up in default. Historically, 3% of all customers default. In a sample of 150 loans approved by employees with less than 1 year of experience, 6 ended up in default.

State Null and Alternative Hypotheses
  H0: There is no difference between new employee and historic default rates. P_new = P_historic
  Ha: New employees have a higher proportion of loans in default than historic. P_new > P_historic

Set Confidence level (Pick α)
  Usually 95% (α = 0.05)

Select Appropriate Hypothesis Test
  1 proportion test
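
Because this example supplies all of its numbers, the test can be carried through. The sketch below computes an exact binomial upper-tail p-value; choosing the exact method rather than a normal approximation is my assumption, and statistical software may report a slightly different value depending on the method selected.

```python
from math import comb

def binomial_upper_tail(successes, n, p0):
    """P(X >= successes) when X ~ Binomial(n, p0), i.e. when H0 is true."""
    return sum(
        comb(n, k) * p0**k * (1 - p0) ** (n - k)
        for k in range(successes, n + 1)
    )

alpha = 0.05
# 6 defaults in 150 loans, tested against the historic default rate of 3%
p_value = binomial_upper_tail(successes=6, n=150, p0=0.03)
reject_h0 = p_value < alpha

print(f"p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

The sample default rate is 4%, but the p-value is roughly 0.3, so at α = 0.05 this sample does not provide enough evidence that new employees have a higher default rate.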

Example 4 Hypothesis Test Setup


A study was conducted to compare the variabilities in the actual vs. forecasted revenues for 2 sales managers. Ten months of data for each were collected.

State Null and Alternative Hypotheses
  H0: There is no difference in variability between Manager 1 and Manager 2. σ²_mgr1 = σ²_mgr2
  Ha: The variabilities of Managers 1 & 2 are different. σ²_mgr1 ≠ σ²_mgr2

Set Confidence level (Pick α)
  Usually 95% (α = 0.05)

Select Appropriate Hypothesis Test
  Test for Equal Variances
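
The four setups above can be summarized as a small lookup from what is being compared and how many samples are involved to the test that was chosen. This sketch covers only the four cases shown in Examples 1 to 4; the function name and the key strings are made up for illustration, and it is not a complete selection roadmap.

```python
def select_test(statistic, n_samples):
    """Return the test used in Examples 1-4 for a given comparison."""
    examples = {
        ("mean", 1): "1-sample t-test",                 # Example 1
        ("mean", 2): "2-sample t-test",                 # Example 2
        ("proportion", 1): "1 proportion test",         # Example 3
        ("variance", 2): "Test for Equal Variances",    # Example 4
    }
    return examples.get((statistic, n_samples), "not covered by Examples 1-4")

print(select_test("mean", 2))
print(select_test("median", 1))
```

Cases outside the four examples, such as medians, counts, or more than two levels, deliberately return a placeholder rather than a guess.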

Definitions
Null Hypothesis (H0): Statement of no change or difference; assumed to be true until sufficient evidence is presented to reject it.
Alternative Hypothesis (Ha): Statement of a change or difference; accepted if the null hypothesis is rejected.
Type I Error: The error that occurs when the null hypothesis is rejected when, in fact, it is true.
Type II Error: The error that occurs when the null hypothesis is not rejected when it is, in fact, false.
Alpha Risk (α): The maximum probability of making a Type I error. This probability is established by the experimenter and is often set at 5%.
Beta Risk (β): The risk or probability of making a Type II error.
Significant Difference: The term used to describe the results of a statistical hypothesis test where a difference is too large to be reasonably attributed to chance.
Power (1-β): The ability of a statistical test to detect a real difference when there is one; the probability of correctly rejecting the null hypothesis. Determined by alpha, sample size, the size of the real difference, and the process variation.
Test Statistic: A standardized value (Z, t, F, etc.) computed from the sample that measures how far the data depart from what H0 predicts. It is distributed in a known manner when H0 is true, so the probability of observing this value can be determined.
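
The relationship between power, alpha, and sample size can be sketched for a one-sided, one-sample z-test with known standard deviation, where power = 1 - Φ(z_α - δ·√n/σ). The test type and the values of `sigma` and `delta` are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided_z(alpha, sigma, delta, n):
    """Probability of rejecting H0 when the true mean is shifted by delta."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    return 1 - NormalDist().cdf(z_alpha - delta * sqrt(n) / sigma)

# ASSUMPTION: sigma = 10 and delta = 5 are illustrative values, not from the source
for n in (10, 20, 35):
    power = power_one_sided_z(alpha=0.05, sigma=10, delta=5, n=n)
    print(f"n = {n:2d}: power = {power:.3f}, beta = {1 - power:.3f}")
```

With these inputs, power rises from under 50% at n = 10 to just over 90% at n = 35, which shows why a fixed, small sample can leave a real difference undetected.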
