Applied Statistics For The Six Sigma Green Belt - Gupta


Marketing_Gupta.qxd

1/28/05

1:36 PM

Page i

Applied Statistics for the Six Sigma Green Belt


Also Available from ASQ Quality Press:

Design of Experiments with MINITAB
Paul Mathews

Six Sigma for the Shop Floor: A Pocket Guide
Roderick A. Munro

Six Sigma for the Office: A Pocket Guide
Roderick A. Munro

Defining and Analyzing a Business Process: A Six Sigma Pocket Guide
Jeffrey N. Lowenthal

Six Sigma Project Management: A Pocket Guide
Jeffrey N. Lowenthal

The Six Sigma Journey from Art to Science
Larry Walters

The Six Sigma Path to Leadership: Observations from the Trenches
David H. Treichler

Failure Mode and Effect Analysis: FMEA From Theory to Execution, Second Edition
D. H. Stamatis

Customer Centered Six Sigma: Linking Customers, Process Improvement, and Financial Results
Earl Naumann and Steven Hoisington

Design for Six Sigma as Strategic Experimentation: Planning, Designing, and Building World-Class Products and Services
H.E. Cook

To request a complimentary catalog of ASQ Quality Press publications, call 800-248-1946, or visit our Web site at http://qualitypress.asq.org.


Applied Statistics for the Six Sigma Green Belt

Bhisham C. Gupta
H. Fred Walker

ASQ Quality Press Milwaukee, Wisconsin


American Society for Quality, Quality Press, Milwaukee 53203
© 2005 by American Society for Quality
All rights reserved. Published 2005
Printed in the United States of America
12 11 10 09 08 07 06 05    5 4 3 2 1

Library of Congress Cataloging-in-Publication Data
Gupta, Bhisham C., 1942
Applied statistics for the Six Sigma Green Belt / Bhisham C. Gupta, H. Fred Walker. 1st ed.
p. cm.
Includes bibliographical references and index.
ISBN 0-87389-642-4 (hardcover : alk. paper)
1. Six sigma (Quality control standard) 2. Production management. 3. Quality control. I. Walker, H. Fred, 1963 II. Title.
TS156.G8673 2005
658.4'013 dc22    2004029760

ISBN 0-87389-642-4

No part of this book may be reproduced in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher.

Publisher: William A. Tony
Acquisitions Editor: Annemieke Hytinen
Project Editor: Paul O'Mara
Production Administrator: Randall Benson

ASQ Mission: The American Society for Quality advances individual, organizational, and community excellence worldwide through learning, quality improvement, and knowledge exchange.

Attention Bookstores, Wholesalers, Schools, and Corporations: ASQ Quality Press books, videotapes, audiotapes, and software are available at quantity discounts with bulk purchases for business, educational, or instructional use. For information, please contact ASQ Quality Press at 800-248-1946, or write to ASQ Quality Press, P.O. Box 3005, Milwaukee, WI 53201-3005.

To place orders or to request a free copy of the ASQ Quality Press Publications Catalog, including ASQ membership information, call 800-248-1946. Visit our Web site at www.asq.org or http://qualitypress.asq.org.

Printed on acid-free paper


In loving memory of my parents, Roshan Lal and Sodhan Devi.
Bhisham

In loving memory of my father, Carl Ellsworth Walker.
Fred


THE NORMAL LAW OF ERROR STANDS OUT IN THE EXPERIENCE OF MANKIND AS ONE OF THE BROADEST GENERALIZATIONS OF NATURAL PHILOSOPHY. IT SERVES AS THE GUIDING INSTRUMENT IN RESEARCHES IN THE PHYSICAL AND SOCIAL SCIENCES AND IN MEDICINE, AGRICULTURE, AND ENGINEERING. IT IS AN INDISPENSABLE TOOL FOR THE ANALYSIS AND THE INTERPRETATION OF THE BASIC DATA OBTAINED BY OBSERVATION AND EXPERIMENT.

W. J. Youden


Contents

List of Figures
List of Tables
Preface
Acknowledgments
Introduction

Chapter 1 Setting the Context for Six Sigma
  1.1 Six Sigma Defined as a Statistical Concept
  1.2 Now, Six Sigma Explained as a Statistical Concept
  1.3 Six Sigma as a Comprehensive Approach and Methodology for Problem Solving and Process Improvement
  1.4 Understanding the Role of the Six Sigma Green Belt as Part of the Bigger Picture
  1.5 Converting Data into Useful Information

Chapter 2 Getting Started with Statistics
  2.1 What Is Statistics?
  2.2 Populations and Samples
  2.3 Classification of Various Types of Data
    2.3.1 Nominal Data
    2.3.2 Ordinal Data
    2.3.3 Interval Data
    2.3.4 Ratio Data

Chapter 3 Describing Data Graphically
  3.1 Frequency Distribution Table
    3.1.1 Qualitative Data
    3.1.2 Quantitative Data
  3.2 Graphical Representation of a Data Set
    3.2.1 Dot Plot
    3.2.2 Pie Chart
    3.2.3 Bar Chart
    3.2.4 Histograms and Related Graphs


    3.2.5 Line Graph
    3.2.6 Stem and Leaf Diagram
    3.2.7 Measure of Association

Chapter 4 Describing Data Numerically
  4.1 Numerical Measures
  4.2 Measures of Centrality
    4.2.1 Mean
    4.2.2 Median
    4.2.3 Mode
  4.3 Measures of Dispersion
    4.3.1 Range
    4.3.2 Variance
    4.3.3 Standard Deviation
    4.3.4 Coefficient of Variation
  4.4 Measures of Central Tendency and Dispersion for Grouped Data
    4.4.1 Mean
    4.4.2 Median
    4.4.3 Mode
    4.4.4 Variance
  4.5 Empirical Rule (Normal Distribution)
  4.6 Certain Other Measures of Location and Dispersion
    4.6.1 Percentiles
    4.6.2 Quartiles
    4.6.3 Interquartile Range
  4.7 Box Whisker Plot
    4.7.1 Construction of a Box Plot
    4.7.2 How to Use the Box Plot

Chapter 5 Probability
  5.1 Probability and Applied Statistics
  5.2 The Random Experiment
  5.3 Sample Space, Simple Events, and Events of Random Experiments
  5.4 Representation of Sample Space and Events Using Diagrams
    5.4.1 Tree Diagram
    5.4.2 Permutation and Combination
  5.5 Defining Probability Using Relative Frequency
  5.6 Axioms of Probability
  5.7 Conditional Probability

Chapter 6 Discrete Random Variables and Their Probability Distributions
  6.1 Discrete Random Variables
  6.2 Mean and Standard Deviation of a Discrete Random Variable
    6.2.1 Interpretation of the Mean and the Standard Deviation


  6.3 The Bernoulli Trials and the Binomial Distribution
    6.3.1 Mean and Standard Deviation of a Bernoulli Distribution
    6.3.2 The Binomial Distribution
    6.3.3 Binomial Probability Tables
  6.4 The Hypergeometric Distribution
    6.4.1 Mean and Standard Deviation of a Hypergeometric Distribution
  6.5 The Poisson Distribution

Chapter 7 Continuous Random Variables and Their Probability Distributions
  7.1 Continuous Random Variables
  7.2 The Uniform Distribution
    7.2.1 Mean and Standard Deviation of the Uniform Distribution
  7.3 The Normal Distribution
    7.3.1 Standard Normal Distribution Table
  7.4 The Exponential Distribution
    7.4.1 Mean and Standard Deviation of an Exponential Distribution
    7.4.2 Distribution Function F(x) of the Exponential Distribution
  7.5 The Weibull Distribution
    7.5.1 Mean and Variance of the Weibull Distribution
    7.5.2 Distribution Function F(t) of Weibull

Chapter 8 Sampling Distributions
  8.1 Sampling Distribution of Sample Mean
  8.2 The Central Limit Theorem
    8.2.1 Sampling Distribution of Sample Proportion
  8.3 Chi-Square Distribution
  8.4 The Student's t-Distribution
  8.5 Snedecor's F-Distribution
  8.6 The Poisson Approximation to the Binomial Distribution
  8.7 The Normal Approximation to the Binomial Distribution

Chapter 9 Point and Interval Estimation
  9.1 Point Estimation
    9.1.1 Properties of Point Estimators
  9.2 Interval Estimation
    9.2.1 Interpretation of a Confidence Interval
  9.3 Confidence Intervals
    9.3.1 Confidence Interval for Population Mean When the Sample Size Is Large
    9.3.2 Confidence Interval for Population Mean When the Sample Size Is Small
  9.4 Confidence Interval for the Difference between Two Population Means


    9.4.1 Large Sample Confidence Interval for the Difference between Two Population Means
    9.4.2 Small Sample Confidence Interval for the Difference between Two Population Means
  9.5 Confidence Intervals for Population Proportions When Sample Sizes Are Large
    9.5.1 Confidence Interval for p the Population Proportion
    9.5.2 Confidence Interval for the Difference of Two Population Proportions
  9.6 Determination of Sample Size
  9.7 Confidence Interval for Population Variances
    9.7.1 Confidence Interval for a Population Variance

Chapter 10 Hypothesis Testing
  10.1 Basic Concepts of Testing Statistical Hypotheses
  10.2 Testing Statistical Hypotheses about One Population Mean When Sample Size Is Large
    10.2.1 Population Variance Is Known
    10.2.2 Population Variance Is Unknown
  10.3 Testing Statistical Hypotheses about the Difference Between Two Population Means When the Sample Sizes Are Large
    10.3.1 Population Variances Are Known
    10.3.2 Population Variances Are Unknown
  10.4 Testing Statistical Hypotheses about One Population Mean When Sample Size Is Small
    10.4.1 Population Variance Is Known
    10.4.2 Population Variance Is Unknown
  10.5 Testing Statistical Hypotheses about the Difference Between Two Population Means When Sample Sizes Are Small
    10.5.1 Population Variances σ₁² and σ₂² Are Known
    10.5.2 Population Variances σ₁² and σ₂² Are Unknown But σ₁² = σ₂² = σ²
    10.5.3 Population Variances σ₁² and σ₂² Are Unknown and σ₁² ≠ σ₂²
  10.6 Paired t-Test
  10.7 Testing Statistical Hypotheses about Population Proportions
    10.7.1 Testing of Statistical Hypotheses about One Population Proportion When Sample Size Is Large
    10.7.2 Testing of Statistical Hypotheses about the Difference Between Two Population Proportions When Sample Sizes Are Large
  10.8 Testing Statistical Hypotheses about Population Variances
    10.8.1 Testing Statistical Hypotheses about One Population Variance
    10.8.2 Testing Statistical Hypotheses about the Two Population Variances
  10.9 An Alternative Technique for Testing of Statistical Hypotheses Using Confidence Intervals


Chapter 11 Computing Resources to Support Applied Statistics
  11.1 Using MINITAB, Version 14
    11.1.1 Getting Started
    11.1.2 Calculating Descriptive Statistics
    11.1.3 Probability Distributions
    11.1.4 Estimation and Testing of Hypotheses about Population Mean and Proportion
    11.1.5 Estimation and Testing of Hypotheses about Two Population Means and Proportions
    11.1.6 Estimation and Testing of Hypotheses about Two Population Variances
    11.1.7 Testing Normality
  11.2 Using JMP, Version 5.1
    11.2.1 Getting Started with JMP
    11.2.2 Calculating Descriptive Statistics
    11.2.3 Estimation and Testing of Hypotheses about One Population Mean
    11.2.4 Estimation and Testing of Hypotheses about Two Population Variances
    11.2.5 Normality Test
  11.3 Web-based Computing Resources

Glossary

Appendix
  Table I Binomial probabilities
  Table II Poisson probabilities
  Table III Standard normal distribution
  Table IV Critical values of χ² with ν degrees of freedom
  Table V Critical values of t with ν degrees of freedom
  Table VI Critical values of F with numerator and denominator degrees of freedom ν₁, ν₂ respectively (α = 0.10)

Bibliography
About the Authors
Index


Figures

Figure 1.1 The normal distribution
Figure 1.2 Six Sigma (Motorola definition)
Figure 1.3 Current Six Sigma implementation flow chart
Figure 1.4 Six Sigma support personnel
Figure 1.5 Histogram
Figure 2.1 Classifications of statistical data
Figure 3.1 Dot plot for the data on defective motors that are received in 20 shipments
Figure 3.2 Pie chart for defects associated with manufacturing process steps
Figure 3.3 Bar chart for annual revenues of a company over the period of five years
Figure 3.4 Bar graph for the data in Example 3.7
Figure 3.5 Bar charts for types of defects in auto parts manufactured in Plant I (P1) and Plant II (P2)
Figure 3.6 Frequency histogram for survival time of parts under extreme operating conditions
Figure 3.7 Relative frequency histogram for survival time of parts under extreme operating conditions
Figure 3.8 Frequency polygon for the data in Example 3.9
Figure 3.9 Relative frequency polygon for the data in Example 3.9
Figure 3.10 A typical frequency distribution curve
Figure 3.11 Three types of frequency distribution curves
Figure 3.12 Cumulative frequency histogram for the data in Example 3.9
Figure 3.13 Ogive curve for the survival data in Example 3.9
Figure 3.14 Line graph for the data on lawn mowers given in Example 3.10
Figure 3.15 Ordinary and ordered stem and leaf diagram for the data on survival time for parts in extreme operating conditions in Example 3.9
Figure 3.16 Ordered stem and leaf diagram for the data in Table 3.10
Figure 3.17 Ordered two-stem and leaf diagram for the data in Table 3.12
Figure 3.18 Ordered five-stem and leaf diagram


Figure 3.19 MINITAB display depicting eight degrees of correlation: (a) represents strong positive correlation, (b) represents strong negative correlation, (c) represents positive perfect correlation, (d) represents negative perfect correlation, (e) represents positive moderate correlation, (f) represents negative moderate correlation, (g) represents a positive weak correlation, and (h) represents a negative weak correlation
Figure 4.1 Frequency distributions showing the shape and location of measures of centrality
Figure 4.2 Two frequency distribution curves with equal mean, median and mode values
Figure 4.3 Application of the empirical rule
Figure 4.4 Amount of soft drink contained in a bottle
Figure 4.5 Dollar value of units of bad production
Figure 4.6 Salary data
Figure 4.7 Quartiles and percentiles
Figure 4.8 Box-whisker plot
Figure 4.9 Example box plot
Figure 4.10 Box plot
Figure 5.1 Tree diagram for an experiment of testing a chip, randomly selecting a part, and testing another chip
Figure 5.2 Venn diagram representing the sample space S and the event A in S
Figure 5.3 Venn diagram representing the union of events A and B (shaded area)
Figure 5.4 Venn diagram representing the intersection of events A and B (shaded area)
Figure 5.5 Venn diagram representing the complement of an event A (shaded area)
Figure 5.6 Venn diagram representing A ∪ B = {1, 4, 5, 6, 7, 8, 9, 10}, A ∩ B = {7}, Ā = {2, 3, 5, 9, 10}, B̄ = {1, 2, 3, 4, 6, 8}
Figure 5.7 Two mutually exclusive events, A and B
Figure 5.8 Venn diagram showing the phenomenon of P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Figure 6.1 Graphical representation of probability function in Table 6.2
Figure 6.2 Graphical representation of probability function f(x) in Table 6.3
Figure 6.3 Graphical representation of the distribution function F(x) in Example 6.2
Figure 6.4 Location of mean μ and the end points of interval (μ − 2σ, μ + 2σ)
Figure 6.5 Binomial probability distribution with n = 10, p = 0.80
Figure 7.1 An illustration of a density function of a continuous random variable X
Figure 7.2 Graphical representation of F(x) = P(X ≤ x)
Figure 7.3 Uniform distribution over the interval (a, b)
Figure 7.4 Probability P(x₁ ≤ X ≤ x₂)
Figure 7.5 The normal density function curve with mean μ and standard deviation σ
Figure 7.6 Curves representing the normal density function with different means, but with the same standard deviation
Figure 7.7 Curves representing the normal density function with different standard deviations, but with the same mean


Figure 7.8 The standard normal density function curve
Figure 7.9 Probability (a ≤ Z ≤ b) under the standard normal curve
Figure 7.10 Shaded area equal to P(1 ≤ Z ≤ 2)
Figure 7.11 Two shaded areas showing P(−1.50 ≤ Z ≤ 0) = P(0 ≤ Z ≤ 1.50)
Figure 7.12 Two shaded areas showing P(−2.2 ≤ Z ≤ −1.0) = P(1.0 ≤ Z ≤ 2.2)
Figure 7.13 Showing P(−1.50 ≤ Z ≤ 0.80) = P(−1.50 ≤ Z ≤ 0) + P(0 ≤ Z ≤ 0.80)
Figure 7.14 Shaded area showing P(Z 0.70)
Figure 7.15 Shaded area showing P(Z 1.0)
Figure 7.16 Shaded area showing P(Z 2.15)
Figure 7.17 Shaded area showing P(Z 2.15)
Figure 7.18 Converting normal N(6,4) to standard normal N(0,1)
Figure 7.19 Shaded area showing P(0.5 ≤ Z ≤ 2.0)
Figure 7.20 Shaded area showing P(−1.0 ≤ Z ≤ 1.0)
Figure 7.21 Shaded area showing P(−1.50 ≤ Z ≤ 0.50)
Figure 7.22 Graphs of exponential density function for 0.1, 0.5, 1.0, and 2.0
Figure 7.23 Curves of three hazard rate functions
Figure 7.24 Hazard function h(t) with 1; 0.5, 1, 2
Figure 7.25 Weibull density function (a) 1, 0.5 (b) 1, 1 (c) 1, 2
Figure 8.1 Shaded area showing P(−2 ≤ Z ≤ 2)
Figure 8.2 Shaded area showing P(Z 1)
Figure 8.3 Shaded area showing P(−2.28 ≤ Z ≤ 2.28)
Figure 8.4 Shaded area showing P(Z 1.14)
Figure 8.5 Shaded area showing P(−1.5 ≤ Z ≤ 1.5)
Figure 8.6 Shaded area showing P(−1.6 ≤ Z ≤ 1.6)
Figure 8.7 Shaded area showing P(−2 ≤ Z ≤ 2)
Figure 8.8 Shaded area showing P(−1.71 ≤ Z ≤ 1.71)
Figure 8.9 Shaded area showing P(−2.23 ≤ Z ≤ 2.23)
Figure 8.10 Chi-square distribution with different degrees of freedom
Figure 8.11 Chi-square distribution with upper-tail area α
Figure 8.12 Chi-square distribution with upper-tail area α = 0.05
Figure 8.13 Chi-square distribution with lower-tail area α
Figure 8.14 Chi-square distribution with lower-tail area α = 0.10
Figure 8.15 Frequency distribution function of t-distribution with, say, n = 15 degrees of freedom and standard normal distribution
Figure 8.16 t-distribution with shaded area under the two tails equal to P(T tn,) P(T tn,)
Figure 8.17 A typical probability density function curve of F with ν₁, ν₂ degrees of freedom
Figure 8.18 Probability density function curve of F with ν₁, ν₂ degrees of freedom with upper-tail area α
Figure 8.19 Probability density function curve of F with ν₁, ν₂ degrees of freedom with lower-tail area α
Figure 8.20 Comparison of histograms for various binomial distributions (n = 15, p = 0.2, 0.3, 0.4, 0.5)
Figure 8.21 (a) Showing the normal approximation to the binomial. (b) Replacing the shaded area contained in the rectangles by the shaded area under the normal curve
Figure 9.1 An interpretation of a confidence interval
Figure 9.2 Standard normal-curve with tail areas equal to α/2


Figure 9.3 (a) Standard normal curve with lower-tail area equal to α (b) Standard normal curve with upper-tail area equal to α
Figure 9.4 Student's t-distribution with tail areas equal to α/2
Figure 9.5 Chi-square distribution with two tail areas each equal to 0.025
Figure 9.6 F-distribution curve (a) shaded area under two tails each equal to 0.025 (b) shaded area under left tail equal to 0.05 (c) shaded area under the right tail equal to 0.05
Figure 10.1 Critical points dividing the sample space in two regions, the rejection region and the acceptance region
Figure 10.2 OC-curves for different alternative hypotheses
Figure 10.3 Power curves for different hypotheses
Figure 10.4 Rejection regions for hypotheses (i), (ii), and (iii)
Figure 10.5 Lower-tail rejection region with α = 0.01
Figure 10.6 Two-tail rejection region with α = 0.01
Figure 10.7 Power curve for the test in Example 10.2
Figure 10.8 Rejection regions for hypotheses (i), (ii), and (iii)
Figure 10.9 Rejection region under the lower tail with α = 0.05
Figure 10.10 Rejection regions for testing hypotheses (i), (ii), and (iii) at the α = 0.05 level of significance
Figure 10.11 Rejection region under the upper tail with α = 0.05
Figure 10.12 Rejection regions under the two tails with α = 0.05
Figure 10.13 Rejection regions for testing hypotheses (i), (ii), and (iii) at the α = 0.05 level of significance
Figure 10.14 Rejection regions for a two-tail test with α = 0.05
Figure 10.15 Rejection regions for testing hypotheses (i), (ii), and (iii) at the α = 0.05 level of significance
Figure 10.16 Rejection region under the lower tail with α = 0.05
Figure 10.17 Rejection regions under the two tails with α = 0.05
Figure 10.18 Rejection regions for testing hypotheses (i), (ii), and (iii) at the given α
Figure 10.19 Rejection region under the upper tail with α = 0.05
Figure 10.20 Rejection region under the upper tail with α = 0.01
Figure 10.21 Rejection regions for testing hypotheses (i), (ii), and (iii) at the α level of significance
Figure 10.22 Rejection region under the upper tail with α = 0.025
Figure 10.23 Rejection regions for testing the hypotheses (i), (ii), and (iii) at the α level of significance
Figure 10.24 The rejection region under the two tails with α = 0.01
Figure 10.25 Rejection region under the lower tail with α = 0.05
Figure 10.26 Rejection regions under the two tails with α = 0.05
Figure 10.27 Rejection regions for testing hypotheses (i), (ii), and (iii) at the α = 0.05 level of significance
Figure 10.28 Rejection region under the chi-square distribution curve for testing hypotheses (i), (ii), and (iii) at the α level of significance
Figure 10.29 Rejection region under the lower tail with α = 0.05
Figure 10.30 Rejection region under the F-distribution curve for testing hypotheses (i), (ii), and (iii) at the α level of significance
Figure 10.31 Rejection region under the two tails with α = 0.05
Figure 10.32 Rejection region under the right tail with α = 0.05
Figure 11.1 The screen that appears first in the MINITAB environment
Figure 11.2 MINITAB window showing the menu command options
Figure 11.3 MINITAB window showing input and output for Column Statistics


Figure 11.4 MINITAB window showing various options available under Stat command
Figure 11.5 MINITAB display of histogram for the data given in Example 11.3
Figure 11.6 MINITAB window showing Edit Bars dialog box
Figure 11.7 MINITAB display of histogram with 5 classes for the data in Example 11.3
Figure 11.8 MINITAB output of Dotplot for the data in Example 11.4
Figure 11.9 MINITAB output of Scatterplot for the data given in Example 11.5
Figure 11.10 MINITAB display of box plot for the data in Example 11.6
Figure 11.11 MINITAB display of graphical summary for the data in Example 11.7
Figure 11.12 MINITAB display of bar graph for the data in Example 11.8
Figure 11.13 MINITAB display of pie chart for the data in Example 11.9
Figure 11.14 MINITAB printout of 95% Bonferroni confidence interval for standard deviations
Figure 11.15 MINITAB display of normal probability graph for the data in Example 11.19
Figure 11.16 The screen that appears first in the JMP environment
Figure 11.17 JMP menu command options
Figure 11.18 JMP window showing input and output for Column Statistics
Figure 11.19 JMP Distribution dialog box
Figure 11.20 JMP display of histogram for the data given in Example 11.21
Figure 11.21 JMP printout of stem and leaf for the data given in Example 11.21
Figure 11.22 JMP display of box plot with summary statistics for Example 11.21
Figure 11.23 JMP display of graphical summary for the data in Example 11.22
Figure 11.24 JMP display of bar graph for the data in Example 11.23
Figure 11.25 JMP printout of pie chart for the data in Example 11.24
Figure 11.26 JMP printout of 1 sample t-test for the data in Example 11.25
Figure 11.27 JMP printout of 1 sample z-test for the data in Example 11.26
Figure 11.28 JMP printout of 2-sample t-test for the data in Example 11.27
Figure 11.29 JMP printout of paired t-test for the data in Example 11.28
Figure 11.30 JMP printout of test of equal variances in Example 11.29
Figure 11.31 JMP display of normal quantile graph for the data in Example 11.30


Tables

Table 1.1 Process step completion times
Table 1.2 Descriptive statistics
Table 3.1 Annual revenues of 110 small to midsize companies in the midwestern United States
Table 3.2 Frequency distribution table for 110 small to midsize companies in the midwestern United States
Table 3.3 Complete frequency distribution table for the 110 small to midsize companies in the midwestern United States
Table 3.4 Complete frequency distribution table for the data in Example 3.2
Table 3.5 Frequency table for the data on rod lengths
Table 3.6 Understanding defect rates as a function of various process steps
Table 3.7 Frequency distribution table for the data in Example 3.7
Table 3.8 Frequency distribution table for the survival time of parts
Table 3.9 Data on survival time (in hours) in Example 3.9
Table 3.10 Number of parts produced by each worker per week
Table 3.11 Cholesterol levels and systolic BP of 30 randomly selected U.S. men
Table 4.1 Age distribution of group of 40 people watching a basketball game
Table 5.1 Classification of technicians by qualification and gender
Table 5.2 Classification of technicians by qualification and gender
Table 6.1 Probability distribution of a random variable X
Table 6.2 Probability distribution of random variable X defined in Example 6.1
Table 6.3 Probability function of X
Table 6.4 Portion of Table I of the appendix for n = 5
Table 6.5 Portion of Table II of the appendix
Table 7.1 A portion of standard normal distribution Table III of the appendix
Table 8.1 Population with its distribution for the experiment of rolling a fair die
Table 8.2 All possible samples of size 2 with their respective means
Table 8.3 Different sample means with their respective probabilities


Table 8.4 A portion of the t-table giving the value of tn,α for certain values of n and α
Table 8.5 Comparison of approximate probabilities to the exact probabilities (n = 5, p = 0.4, 0.5)
Table 8.6 Showing the use of continuity correction factor under different scenarios
Table 10.1 Presenting the view of type I and type II errors
Table 10.2 Confidence intervals for testing various hypotheses
Table I Binomial probabilities
Table II Poisson probabilities
Table III Standard Normal Distribution
Table IV Critical values of χ² with ν degrees of freedom
Table V Critical values of t with ν degrees of freedom
Table VI Critical values of F with numerator and denominator degrees of freedom ν1, ν2 respectively (α = 0.10)


Preface

Applied Statistics for the Six Sigma Green Belt was written as a desk reference and instructional aid for individuals involved with Six Sigma project teams. As Six Sigma team members, Green Belts will help select appropriate statistical tools, collect data for those tools, and assist with data interpretation within the context of the Six Sigma methodology. Composed of steps or phases titled Define, Measure, Analyze, Improve, and Control (DMAIC), the Six Sigma methodology calls for the use of many more statistical tools than is reasonable to address in one large book. Accordingly, the intent of this book is to provide Green Belts with the benefit of a thorough discussion relating to the underlying concepts of basic statistics. More advanced topics of a statistical nature will be discussed in three other books that, together with this book, will comprise a four-book series. The other books in the series will discuss statistical quality control, introductory design of experiments and regression analysis, and advanced design of experiments.

While it is beyond the scope of this book and series to cover the DMAIC methodology specifically, we do focus this book and series on concepts, applications, and interpretations of the statistical tools used during, and as part of, the DMAIC methodology. Of particular interest in this book, and indeed the other books in this series, is an applied approach to the topics covered while providing a detailed discussion of the underlying concepts. This level of detail in providing the underlying concepts is particularly important for individuals lacking a recent study of applied statistics as well as for individuals who may never have had any formal education or training in statistics.

In fact, one very controversial aspect of Six Sigma training is that, in many cases, this training is targeted at the Six Sigma Black Belt and is all too commonly delivered to large groups of people with the assumption that all trainees have a fluent command of the underlying statistical concepts and theory. In practice this assumption commonly leads to a good deal of concern and discomfort for trainees who quickly find it difficult to keep up with and successfully complete Black Belt-level training. This concern and discomfort become even more serious when individuals involved with Six Sigma training are expected to pass a written and/or computer-based examination that so commonly accompanies this type of training.

So if you are beginning to learn about Six Sigma and are either preparing for training or are supporting a Six Sigma team, the question is: How do I get up to speed with applied statistics as quickly as possible so I can get the most from training or add the most value to my Six Sigma team? The answer to this question is simple and straightforward: get access to a book that provides a thorough and systematic discussion of applied statistics, a book that uses the plain language of application rather than abstract theory, and a book that emphasizes learning by examples. Applied Statistics for the Six Sigma Green Belt has been designed to be just that book.

This book was organized so as to expose readers to applied statistics in a thorough and systematic manner. We begin by discussing concepts that are the easiest to understand and that will provide you with a solid foundation upon which to build further knowledge. As we proceed with our discussion, and as the complexity of the statistical tools increases, we fully intend that our readers will be able to follow the discussion by understanding that the use of any given statistical tool, in many cases, enables us to use additional and more powerful statistical tools. The order of presentation of these tools in our discussion then will help you understand how these tools relate to, mutually support, and interact with one another. We will continue this logic of the order in which we present topics in the remaining books in this series.

Getting the most benefit from this book, and in fact from the complete series of books, is consistent with how many of us learn most effectively: start at the beginning with less complex topics, proceed with our discussion of new and more powerful statistical tools once we learn the basics, be sure to cover all the statistical tools needed to support Six Sigma, and emphasize examples and applications throughout the discussion.

So let us take a look together at Applied Statistics for the Six Sigma Green Belt. What you will learn is that statistics aren't mysterious, they aren't scary, and they aren't overly difficult to understand. As in learning any topic, once you learn the basics it is easy to build on that knowledge; trying to start without a knowledge of the basics, however, is generally the beginning of a difficult situation!


Acknowledgments

We would like to thank Professors John Brunette and Cheng Peng of the University of Southern Maine, and Ramesh Gupta and Pushpa Gupta of the University of Maine, Orono, for reading the final draft line-by-line. Their comments and suggestions have proven to be invaluable. We would like to thank Professor Joel Irish of the University of Southern Maine for help in writing a computer program in Mathematica that was used to prepare all the figures in this book. We thank graduate students Mohamad Ibourk, Seetha Shetty, and Melanie Thompson for help preparing the chapter on computer resources, as well as Mary Ellen Costello and Stacie Santomango for general manuscript preparation. Also, we thank Laurie McDermott, administrative assistant of the Department of Mathematics and Statistics of the University of Southern Maine, for help in typing the various drafts of the manuscript.

We would like to thank the several anonymous reviewers whose constructive suggestions greatly improved the presentations. We also want to thank Annemieke Hytinen, acquisition editor, and Paul O'Mara, project editor, of ASQ Quality Press for their patience and cooperation throughout the preparation of this project.

We thank Minitab Inc. for permitting us to print MINITAB screen shots in this book. MINITAB and the MINITAB logo are registered trademarks of Minitab Inc. We also thank SAS Institute Inc., of Cary, North Carolina, for permitting us to reprint screen shots of JMP v. 5.1 (© 2004 SAS Institute Inc. SAS, JMP and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration).

Most of all, the authors would like to thank their families. Bhisham is grateful to his wife, Swarn, daughters Anita and Anjali, and son, Shiva, for their deep love and support. He is grateful to his son-in-law, Mark, for his expressed curiosity. Last but not least, he is grateful to his first grandchild, Priya, for reminding him that there is always time for play. Fred would like to sincerely thank his wife, Julie, and sons, Carl and George, for their love, support, and patience as he worked on this and two previous books. Without their encouragement, such projects would not be possible or meaningful.

Bhisham C. Gupta
H. Fred Walker


Introduction

Whenever a process is not producing products or services at a desired level of quality, an investigation is launched to better understand and improve the process. In many instances such investigations are launched to rapidly identify and correct underlying problems as part of a problem-solving methodology; one such methodology is commonly known as root cause analysis. Many problem-solving methodologies, such as root cause analysis, rely on the study of numerical (quantitative) or non-numerical (qualitative) data as a means of discovering the true cause of one or more problems negatively impacting product or service quality. The problem-solving methodologies, however, are all too commonly used to investigate problems that need a quick solution and thus are not afforded the time or resources needed for a particularly detailed or in-depth analysis. Further, problem-solving methodologies are also all too commonly used to investigate problems without sufficient analysis of a series of costs associated with a given problem as they relate to lost profit or opportunity, human resources needed to investigate the problem, and so forth.

Let us not have the wrong impression of problem-solving methodologies such as root cause analysis! Each of these methodologies has a proper place in quality and process improvement; however, the scope or size of the problem needs also to be considered. In this context, when problems are smaller and are easier to understand, we can effectively use less rigorous, complicated, and thorough problem-solving methodologies. When problems become large, complex, and expensive, a more detailed and robust problem-solving methodology is needed, and that problem-solving methodology is Six Sigma.

While it is beyond the intended scope of this book to discuss, in detail, the Six Sigma methodology as an approach to problem solving, it is the explicit intent of this book to describe the concepts and application of tools and techniques used to support the Six Sigma methodology. Next we give a brief description of the topics discussed in the book, followed by where in the Six Sigma methodology you can expect to use these tools and techniques.

In Chapter 1 we introduce the concept of Six Sigma from both statistical and quality perspectives. We briefly describe what we need for converting data into information.

In statistical applications we come across various types of data that require specific analyses that depend upon the types of data we are working with. It is therefore important to distinguish between different types of data. In Chapter 2 we discuss and provide examples for different types of data. In addition, terminology such as population and sample is introduced.

In Chapter 3 we introduce several graphical methods found in descriptive statistics. These graphical methods are some of the basic tools of statistical quality control (SQC). These methods are also very helpful in understanding the pertinent information contained in very large and complex datasets.

In Chapter 4 we learn about the numerical methods of descriptive statistics. Numerical methods that are applicable to both sample and population data provide us with quantitative or numerical measures. Such measures further enlighten us about the information contained in the data.

In Chapter 5 we proceed to study the basic concepts of probability theory and see how probability theory relates to applied statistics. We also introduce the random experiment and define sample space and events. In addition, we study certain rules of probability and conditional probability.

In Chapter 6 we introduce the concept of a random variable, which is a vehicle used to assign some numerical values to all the possible outcomes of a random experiment. We also study probability distributions and define mean and standard deviation of random variables. Specifically, we study some special probability distributions of discrete random variables such as Bernoulli, binomial, hypergeometric, and Poisson distributions, which are encountered frequently in many statistical applications. Finally, we discuss under what conditions (e.g., the Poisson process) these probability models are applicable.

In Chapter 7 we continue studying probability distributions of random variables. We introduce the continuous random variable and study its probability distribution. We specifically examine uniform, normal, exponential, and Weibull continuous probability distributions. The normal distribution is the backbone of statistics and is extensively used in achieving Six Sigma quality characteristics. The exponential and Weibull distributions form an important part of reliability theory. The hazard or failure rate function is also introduced.

Having discussed probability distributions of data as they apply to discrete and continuous random variables in Chapters 6 and 7, in Chapter 8 we expand our study to the probability distributions of sample statistics. In particular, we study the probability distribution of the sample mean and sample proportion. We then study Student's t, chi-square, and F distributions. These distributions are an essential part of inferential statistics and, therefore, of applied statistics.

Estimation is an important component of inferential statistics. In Chapter 9 we discuss point estimation and interval estimation of population mean and of difference between two population means, both when sample size is large and when it is small. Then we study point estimation and interval estimation of population proportion and of difference between two population proportions when the sample size is large. Finally, we study the estimation of a population variance, standard deviation, ratio of two population variances, and ratio of two population standard deviations.


Table 1 Applied statistics and the Six Sigma methodology.

Six Sigma phases: Define, Measure, Analyze, Improve, Control

Tool or Technique                      Where in this book?
Descriptive Statistics                 Chapter 2
Graphical Methods                      Chapter 3
Numerical Descriptions                 Chapter 4
Sampling                               Chapter 8
Point & Interval Estimation            Chapter 9
Probability                            Chapter 5
Discrete & Continuous Distributions    Chapters 6 & 7
Hypothesis Testing                     Chapter 10

In Chapter 10 we study another component of inferential statistics, which is the testing of statistical hypotheses. The primary aim of testing statistical hypotheses is to either refute or support an existing theory (in other words, what is believed to be true) based upon the information contained in sample data. This further enhances good procedures. In this chapter we discuss the techniques of testing statistical hypotheses for one population mean and for differences between two population means, both when sample sizes are large and when they are small. We also discuss techniques of testing hypotheses for one population proportion and for differences between two population proportions when sample sizes are large. Finally, we discuss testing of statistical hypotheses for one population variance and for ratio of two population variances under the assumption that the populations are normal. The results of Chapter 9 and this chapter are frequently used in statistical quality control (SQC) and design of experiments (DOE).

In Chapter 11 we consider computer-based tools for applied statistical support. Computing resources were purposefully included at the end of the book so as to encourage readers not to rely on computers until after they have gained a mastery of the statistical content presented in the preceding chapters.

But where in the Six Sigma methodology do we use these tools and techniques? The answer is throughout the methodology! Let's take a closer look. The information contained in Table 1 will help us better relate specific tools and techniques to phases of the Six Sigma methodology as they relate to the intended scope and purpose of this book: a basic level of applied statistics. Additional topics will be discussed in later books in this series. As topics are discussed in later books, these topics will be added to the content of Table 1, and readers can use the table to help associate specific tools with the Six Sigma methodology.

The array of topics as they relate to the Six Sigma methodology is helpful in understanding where you may use these tools and techniques. It is important to note, however, that any of these tools and techniques may come into play in more than one phase of the Six Sigma methodology and, in fact, should be expected to do so. What is presented in Table 1 is the first point in the methodology at which you may expect to use these tools and techniques. From here it's time to get started! Enjoy!


1
Setting the Context for Six Sigma

It is important to begin our discussion of applied statistics by recognizing that Six Sigma (6σ) has come to refer simultaneously to two related but different ideas. The first idea is that of the technical definition as a statistical concept; this technical definition will be provided and explained in sections 1.1 and 1.2, respectively. The second idea is that of a comprehensive approach and methodology for problem solving and process improvement. This comprehensive approach and methodology will be briefly outlined in section 1.3; however, a thorough discussion of the 6σ approach and methodology is beyond the scope of this book. The remainder of the chapter will be devoted to describing how the Green Belt contributes to 6σ efforts (section 1.4) and how the Green Belt goes about the task of converting data into useful information (section 1.5).

1.1 Six Sigma Defined as a Statistical Concept


Six Sigma is a measure of process quality wherein the distance between a target value and the upper or lower specification limit is at least six standard deviations. The most widely publicized consequence of a 6σ process is that there are 3.4 defects per million opportunities (DPMO). DPMO is defined not as a count of defects alone, but rather as a ratio of defects to the number of opportunities for defects to occur. Since most operators, service providers, technicians, engineers, and managers are trained to think in terms of counting total defects, the concept of comparing defects to opportunities for defects to occur is counterintuitive. In fact, determining what constitutes an opportunity for a defect to occur has, in some circles, become controversial. Combining these ideas (the 3.4 DPMO figure, the comparison of defects to opportunities for those defects to occur, and the lack of a universally agreed-upon definition of an opportunity) means we have a statistical concept (i.e., 6σ) that is difficult for a great many people to understand, even for professionals with advanced levels of statistical training and education!
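To make the ratio concrete, here is a minimal sketch in Python. The function name `dpmo` and its parameters are illustrative assumptions rather than notation from this book, and the sketch assumes the common convention that the total number of opportunities equals the number of units inspected times the opportunities per unit. The input numbers are hypothetical.

```python
def dpmo(defects: int, units: int, opportunities_per_unit: int) -> float:
    """Defects per million opportunities (illustrative helper, not from the book)."""
    # Assumption: total opportunities = units inspected x opportunities per unit.
    total_opportunities = units * opportunities_per_unit
    if total_opportunities <= 0:
        raise ValueError("need at least one opportunity for a defect")
    # Scale the defect-to-opportunity ratio to a per-million basis.
    return defects * 1_000_000 / total_opportunities


# Hypothetical data: 17 defects found in 1,000 units, 5 opportunities per unit.
print(dpmo(defects=17, units=1000, opportunities_per_unit=5))  # 3400.0
```

Note that the result depends directly on how an opportunity is counted: doubling the assumed opportunities per unit halves the DPMO for the same defect count, which is one reason the definition of an opportunity attracts debate.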


Not to worry! We can readily understand the meaning of this 6σ concept if we avoid the unnecessary rigor of a theoretical discussion and focus on its application.

1.2 Now, Six Sigma Explained as a Statistical Concept


In its purest statistical form, 6 refers to six standard deviations and describes the variability of a process in what is commonly referred to as a measure of dispersion. In this case, three standard deviations would be located above some measure of location such as a mean or average, and three standard deviations would be located below the same measure of location, as illustrated in Figure 1.1. As you can see from Figure 1.1, the standard deviations are combined to form the boundaries of what is referred to as a normal distributionthis normal distribution is also commonly referred to as the bell-shaped curve. It is important to note that a much more detailed discussion of the topics identied above, and related topics, will be provided where they are appropriate later in this book. For now, let us continue with our explanation of 6. As was stated above, 6 refers to a defect rate equivalent to 3.4 DPMOthis is where understanding the term and concept of 6 can become unnecessarily difcult. And while some people take great satisfaction in being able to explain 6 at an excruciating level of technical detail, such detail is not necessary to grasp a general understanding of the concept. To avoid an unnecessary level of complexity, while still being able to understand the concept, let us think of 6 as illustrated in Figure 1.2. In Figure 1.2 we can readily see there is a normal or bell-shaped distribution. What makes the distribution interesting is that the width of the distribution that describes variability is quite narrow compared to some limits, for example, specication limits. These specication limits are generally provided

Figure 1.1 The normal distribution.


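Figure 1.2 Six Sigma (Motorola definition). [Labels recoverable from the figure: LSL and USL each lie 6σ from the target; the horizontal axis is marked +1σ through +6σ; the mean may shift ±1.5σ; Cp = Cpk = 2 when the process is centered; Cp = 2, Cpk = 1.5, and 3.4 DPMO in the tail when the process is shifted.]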

by customers in the form of tolerances and describe the values to which products or services must conform to be considered good or acceptable. There is more to this explanation, however. Again, looking at Figure 1.2, we see that because the width of the distribution is so much smaller than the width of the limits, it is possible for the location of the distribution to move around, or vary, within the limits. This movement or natural variation is inherent in any process, and so anticipating the movement is exactly what we want to be able to do! In fact, we want as much room as possible for the distribution to move within the limits so we do not risk the distribution moving outside these limits.

Now someone may ask, "Why would the distribution move around within the limits?" and "How much movement would we expect?" Both are interesting questions, and both help us better understand this concept called 6σ as it refers to quality. When a process is operating, whether that process involves manufacturing operations or service delivery, variation within that process is to be expected. Variation occurs both in terms of the measures of dispersion (i.e., the width of a process) and measures of location (i.e., where the center of that process lies). During normal operation we would expect the location of a process (described numerically by the measure of location) to vary or move ±1.5 standard deviations. Herein lies the explanation of 6σ. Our goal is to reduce the variability of any process as compared to the process limits to a point where there is room for a ±1.5 standard deviation move, accounting for the natural variability of the process while containing all that variability within the limits. Such a case is referred to as a 6σ level of quality, wherein no more than 3.4 DPMO would be expected to fall outside the limits.
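The link between the ±1.5 standard deviation move and the 3.4 DPMO figure can be checked numerically. The sketch below assumes a normally distributed process with limits six standard deviations from the target, and it uses the standard capability definitions Cp = (USL − LSL)/(6σ) and Cpk = min(USL − μ, μ − LSL)/(3σ); those definitions and all function names are assumptions added for illustration, since this section gives only the labels shown in Figure 1.2.

```python
import math


def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def dpmo_outside_limits(limit_sigmas: float, shift_sigmas: float) -> float:
    """Expected defects per million when both limits sit limit_sigmas from the
    target and the process mean has moved shift_sigmas toward one of them."""
    near_tail = upper_tail(limit_sigmas - shift_sigmas)  # limit the mean moved toward
    far_tail = upper_tail(limit_sigmas + shift_sigmas)   # limit the mean moved away from
    return (near_tail + far_tail) * 1_000_000


shifted = dpmo_outside_limits(6.0, 1.5)   # mean moved 1.5 standard deviations
centered = dpmo_outside_limits(6.0, 0.0)  # mean exactly on target

# Capability indices in units of sigma (sigma = 1), with the mean shifted by 1.5.
cp = (6.0 + 6.0) / 6.0
cpk = min(6.0 - 1.5, 6.0 + 1.5) / 3.0

print(f"shifted:  {shifted:.2f} DPMO")
print(f"centered: {centered:.4f} DPMO")
print(f"Cp = {cp}, Cpk = {cpk}")
```

Under these assumptions the shifted process gives roughly 3.4 DPMO, almost all of it from the tail nearest the shifted mean, while a perfectly centered process gives far fewer defects.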

1.3 Six Sigma as a Comprehensive Approach and Methodology for Problem Solving and Process Improvement
Having been mystified or confused about the technical definition of 6σ, many people never fully develop an understanding that 6σ is really referring


Figure 1.3 Current Six Sigma implementation flow chart. [Flow recovered from the figure: commitment made to implement Six Sigma; champion team formed; potential projects identified and evaluated; do projects meet criteria? If no, discontinue consideration of project. If yes, begin/charter projects with DMAIC methodology and work through the Define, Measure, Analyze, Improve, and Control phases, each gated by the question "Is phase review successfully completed?"; complete projects with DMAIC methodology; verify financial payback criteria have been met. If the criteria have not been met, reconsider project selection criteria; if they have, complete project involvement and documentation.]


more to a comprehensive approach and methodology for problem solving and process improvement than to a statistical concept. Developing such an understanding is necessary sooner, rather than later, because implementation of 6σ is based on the use of a wide variety of tools and techniques, some statistical in nature and some not, applied where they are appropriate to support each of several phases in the methodology. Originally developed as a phased approach to problem solving and process improvement, 6σ started as a sequential progression of phases titled Measure, Analyze, Improve, and Control (MAIC). Six Sigma was later expanded to include a Define phase, as it became apparent more attention was needed to identify, understand, and adequately describe problems or opportunities. In what is now known as the DMAIC approach and methodology, 6σ continues to be improved upon, and the addition of new phases as formal components of the methodology is being discussed in various venues. In its current form of implementation, however, Six Sigma is practiced as identified in Figure 1.3. As 6σ evolves, it is clear that several levels of stakeholders, participants, and team members will be needed to apply the tools and techniques as they are called for within the methodology. And as a percentage of the total number of people involved with 6σ efforts, green belts will continue to represent one of the largest groups of stakeholders, participants, and team members.

1.4 Understanding the Role of the Six Sigma Green Belt as Part of the Bigger Picture
Green belts constitute one of the largest groups of contributors to 6σ efforts, as highlighted in Figure 1.4. As seen in Figure 1.4, green belts are close to process operations and work directly with shop floor operators and service delivery personnel. Green belts most commonly collect data, make initial interpretations, and begin to formulate recommendations that are fed to black belts. Black belts then perform more thorough analyses, generally with additional data and input from other sources, and make recommendations to master black belts and project champions. The flow of involvement and responsibilities described above is the essence of how 6σ has been implemented to date. What is interesting, though, is not how 6σ has been implemented to date, but how the implementation of 6σ is changing. A current trend consistent with administration of quality and certain management functions is to push responsibility to lower levels within organizations. For the implementation of 6σ, this means that greater responsibility for problem or opportunity identification, data collection, analysis, and corrective action is being levied on green belts. To support that trend, many consultants providing 6σ training now include green belts and black belts in the same classes. This means that, in many cases, green belts receive training on all the tools and techniques, as do black belts, and the expectation is that green belts will assume more


Figure 1.4 Six Sigma support personnel. [Levels shown, top to bottom: 6σ master black belts; 6σ black belts; 6σ green belts; process operators and service delivery personnel.]

responsibility for day-to-day operation of 6σ efforts. So we see a redefinition of responsibilities wherein the green belts no longer simply collect data as prescribed by black belts, but rather green belts are rapidly being tasked with collecting data and, more importantly, converting that data into useful information.

1.5 Converting Data into Useful Information


What does this mean, converting data into useful information? It implies that data and information are somehow different things, and they are! Data represent raw facts. Raw facts by themselves do not convey much meaning to us. Consider Table 1.1.
Table 1.1 Process step completion times.
24  21  28  30  20
22  26  25  29  23
29  20  27  24  27
27  28  24  26  25
29  24  31  23  26

Table 1.1 has several rows and columns of numbers. These numbers correspond to measurements of the average time to complete a process step. As a collection of numbers, the data in Table 1.1 do not help us understand much about the process. To really understand the process, we need to convert the data into information, and to convert the data we use appropriate tools and


Table 1.2 Descriptive statistics.
Mean             25.52
Std Dev          3.0430248
Std Err Mean     0.608605
upper 95% Mean   26.776099
lower 95% Mean   24.263901
N                25

Figure 1.5 Histogram. [Horizontal axis ticks run from 17.5 to 32.5 in steps of 2.5.]

techniques. In this case we can use simple descriptive statistics to help us quantify certain parameters and we can use graphics to help us visually convert the data into information, as shown in Table 1.2 and Figure 1.5, respectively. Table 1.2 indicates the mean (or average) is 25.52 and the standard deviation is 3.0430248. Now that the data have been processed to give us a pair of quantitative values, we can better understand the process. Figure 1.5 indicates that the data appear to be distributed in a manner that looks like the normal distribution, a bell-shaped curve. And while we do gain some understanding of any given process by converting data into information such as the mean and standard deviation, we generally also gain very useful information by presenting the same data graphically. And so begins the job of the 6σ green belt: converting data into information.

As a final thought in this chapter, it is worth noting that not all information is useful information. You will read about many tools and techniques in the following chapters. It is important to note that these tools and techniques are what we call "blind" to mistakes and misinterpretations. This is to say that the tools and techniques will not tell you whether the information you create is good or bad. Nor will the tools and techniques give you guidance on how to interpret the information. For that you will have to learn the lessons contained in this book and be careful what information to use, how to use that information, and when.
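Returning to Tables 1.1 and 1.2, the conversion of raw data into information described in this section can be sketched in a few lines of Python. This is only an illustration, not the software used to produce the book's output. It assumes the 25 values of Table 1.1 in the order listed, a t critical value of 2.0639 for 24 degrees of freedom taken from a t table, and histogram bins that include their lower edge, which may differ from the binning rule behind Figure 1.5.

```python
import math
import statistics

# The 25 completion times from Table 1.1, in the order listed.
times = [24, 21, 28, 30, 20, 22, 26, 25, 29, 23, 29, 20, 27,
         24, 27, 27, 28, 24, 26, 25, 29, 24, 31, 23, 26]

n = len(times)
mean = statistics.mean(times)
std_dev = statistics.stdev(times)   # sample standard deviation (n - 1 divisor)
std_err = std_dev / math.sqrt(n)    # standard error of the mean

# Assumption: two-sided 95% t critical value for n - 1 = 24 degrees of freedom,
# hardcoded from a t table rather than computed.
T_CRIT_24_DF = 2.0639
upper_95 = mean + T_CRIT_24_DF * std_err
lower_95 = mean - T_CRIT_24_DF * std_err

print(f"Mean {mean:.2f}  Std Dev {std_dev:.7f}  Std Err Mean {std_err:.6f}")
print(f"upper 95% Mean {upper_95:.4f}  lower 95% Mean {lower_95:.4f}  N {n}")

# Text histogram on the tick marks of Figure 1.5: six bins of width 2.5 from 17.5.
# Assumption: each bin includes its lower edge and excludes its upper edge.
first_edge, width, bin_count = 17.5, 2.5, 6
counts = [0] * bin_count
for t in times:
    counts[int((t - first_edge) // width)] += 1

for i, count in enumerate(counts):
    low = first_edge + i * width
    print(f"{low:4.1f} to {low + width:4.1f} | {'*' * count}")
```

The numbers agree with Table 1.2, and the row of asterisks rises to a single peak near the mean and falls away on both sides, which is the roughly bell-shaped pattern the histogram is meant to reveal.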


Index

A
absolute probability, 89–90
acceptance regions, 204
aging factors, 129, 131–132
Alpha, defined, 305
alternative hypotheses, 202, 203, 305
alternatives, two-tail, 203
Analyze phase, tools/techniques associated, xxi
arithmetic means, 305. See also means
association, measures of, 39–44
associations, perfect, 41
axiomatic approach to probability theory, 86–88

B
bar charts, 23–27, 267–269, 292–294, 305
before and after data, hypothesis testing with, 237–240
bell-shaped curves, 2
Bernoulli distributions, 102
Bernoulli populations, 147
Bernoulli random variables, 102–103
Bernoulli trials, 101–102, 305
Beta, defined, 305
beta function, 153
bias in point estimators, 167–170, 310
bimodal data, 305
bimodal distributions, 305
binomial distributions
    calculating in MINITAB, 270, 271–272
    criteria for applying, 102
    defined, 305
    means, 106–107
    normal approximation to, 159–163
    point, 102
    Poisson approximation to, 111–112, 158–159
    sampling with replacement and, 107
    standard deviation, 106–107
binomial probabilities, tables of, 105–106, 314–316
bivariate data, 39, 41–44
black belts, responsibilities of, 5–6
BMDP software, 255
bound on error of estimation, 168, 192, 305, 307
box-whisker plots, 66–70, 264–266, 290–291, 305

C
categorical data, graphical representations, 22–27
cdf (cumulative distribution function), 97–98, 117
central limit theorem, 121, 141–148, 305
central tendency, measures of. See measures of centrality
chance, 71
charts. See also JMP; MINITAB
    bar, 23–27, 267–269, 292–294, 305
    box-whisker plots, 66–70, 264–266, 290–291, 305
    categorical data, 22–27
    control, 33
    dot plots, 20–21, 39, 262–264
    frequency distribution tables, 15–20
    histograms, 27–34, 35–37, 260–262, 288–289, 307


    JMP, 291–292
    line graphs, 33–34, 39
    Pareto, 24
    pie, 22–23, 268–270, 294–295, 309
    probability function, 97
    scatter plots, 21, 39–44
    Six Sigma implementation flow, 4
    stem and leaf diagrams, 27, 34–39, 310
    summary information, 266–267, 291–292
    time series, 33–34
    tree, 75–77
    Venn diagrams, 310
chi-square critical values table, 320–321
chi-square distributions, 148–152, 270, 305, 320–321
chi-square goodness-of-fit test, 305
chi-square test of independence, 305
classes, 20, 306
coefficients, confidence, 171–172
coefficients, correlation, 40, 306
coefficients of variation (CV), 52, 57, 306
combinations of objects, 77, 79
complement operations, 80–83, 306
composite hypotheses, 203
conditional probability, 88–91, 306
confidence coefficients, 171–172
confidence intervals. See also interval estimation
    defined, 165, 171
    differences between two population means, 180–187
    hypothesis testing, 250–254, 275–276
    for large sample sizes, 173–177, 180–183, 187–192
    one-sided, 174, 176
    pivotal quantities and, 172–173
    for population proportions, 187–192
    for population variances, 195–198
    for ratio of two population variances, 198–200
    for small sample sizes, 177–180, 183–187
    Student's t-distribution and, 180
    two-sided, 173–174, 176
confidence limits, 171–172
contingency tables, 306
continuity correction factor, 160, 306
continuous distribution, 306
continuous random variables, 94, 115, 117–120, 306
control charts, 33
Control phase, tools/techniques associated, xxi
correction factors, 140, 160, 168, 306
correlation coefficients, 40, 306
critical points, 204, 306
critical regions, 204
cumulative distribution function (cdf), 97–98, 117
cumulative frequencies, 16–17, 306
cumulative frequency histograms, 32–33, 307
cumulative probabilities, 96
curves
    bell-shaped, 2
    frequency distribution, 31–32
    Ogive, 32–33, 308
    operating characteristic, 206, 308
    power, 309
CV (coefficients of variation), 52, 57, 306

D
data
    before and after, 237–240
    bimodal, 305
    bivariate, 39, 41–44
    categorical, 22–27
    converting to information, 6–7
    defined, 6
    grouped, 20, 57–60, 307
    hypothesis testing, 237–240
    interval, 12–13, 307
    nominal, 12, 308
    numerical. See numerical data
    ordinal, 12–13, 308
    paired, 237–240, 309
    qualitative, 12–13, 15–18, 22–27, 309
    quantitative. See quantitative data
    ratio, 12–13
    sets of, 15, 306
    skewed, 51, 52, 310
    symmetric, 51, 310
    types of, 11–13
    ungrouped, 20
defects per million opportunities (DPMO), 1


Define, Measure, Analyze, Improve, and Control (DMAIC), xvii, 5
Define phase, tools/techniques associated, xxi
degrees of freedom, 148, 154, 306
density functions. See probability functions
dependent events, 91, 306
descriptive statistics, 10, 15, 306
design of experiments (DOE), 306
deterministic experiments, 72
diagrams. See charts
dichotomized populations, 107
discrete distributions, 306
discrete random variables, 93, 94, 97, 99–101, 306
discrete sample spaces, 74
dispersion, measures of, 2–3, 45, 52–57, 60, 64–65
distribution functions
    continuous random variables, 117
    cumulative, 97–98, 117
    frequency, 153
distributions
    Bernoulli, 102
    bimodal, 305
    binomial. See binomial distributions
    calculating in MINITAB, 269–272
    chi-square, 148–152, 270, 305, 320–321
    continuous, 306
    discrete, 306
    exponential, 129–132, 270, 307
    F-, 270, 307
    hypergeometric, 107–110, 307
    normal. See normal distributions
    Poisson, 110–114, 270, 309
    probability, 95–96
    rectangular distributions, 118
    of sample mean, 140
    sampling. See sampling distributions
    shapes of, 51–52, 67
    skewed/symmetric, 67
    Snedecor's F-, 155–158
    Student's t-, 230
    tables, 15–20, 34–37, 39, 307
    uniform, 118–120
    Weibull, 132–135
    Z, 311
DMAIC (Define, Measure, Analyze, Improve, and Control), xvii, 5
DOE (design of experiments), 306
dot plots, 20–21, 39, 262–264
DPMO (defects per million opportunities), 1

E
empirical rule, 60–63, 66, 70, 307
equally likely events, 307
errors
    of estimation, 168, 192, 305, 307
    in hypothesis testing, 204–205, 212
    margin of, 168, 192
    mean square, 308
    of point estimation, 168, 192, 305, 307
    standard, 140, 310
    type I, 205, 212, 310
    type II, 205, 212, 310
estimators, 137, 307. See also point estimation
events
    defined, 74, 307
    dependent, 91, 306
    equally likely, 307
    independent, 89–90, 307
    mutually exclusive, 83, 90, 308
    null, 75, 79
    of random experiments, 73–75
    rare, 110
    representations of, 75–77
    simple, 73, 75, 309
    sure, 75, 80, 310
expected frequencies, 307
expected values, 99, 307
experiments
    defined, 307
    deterministic, 72
    random. See random experiments
exponential distributions, 129–132, 270, 307
exponential models, 131–132
extreme values, 48, 66, 67, 308

F
F critical values table, 323–330
F-distributions, 270, 307


failure rate function, 132–133
finite correction factors, 168
finite populations, 11, 140
first quartile, 307
flow chart, Six Sigma implementation, 4
freedom, degrees of, 148, 154, 306
frequencies, class, 306
frequencies, cumulative, 16–17, 306
frequencies, expected, 307
frequencies, relative, 16–17, 83–86, 309
frequency distribution curves, 31–32
frequency distribution functions, 153
frequency distributions. See distributions
frequency histograms, 27–30, 32–33, 34, 307
frequency polygons, 27, 30–31, 33, 307

G
glossary, 305–311
Gosset, W. S., 153
graphical representations. See charts
graphs. See charts
green belts, responsibilities of, xvii, 5–6
grouped data, 20, 57–60, 307
Gupta, Bhisham C., 333

H
hazard rate function, 132–133
histograms, 27–34, 35–37, 260–262, 288–289, 307
homogeneity, test of, 310
hypergeometric distributions, 107–110, 307
hypotheses, types of, 202–203, 209, 305, 308. See also hypothesis testing
hypothesis testing
    before and after data, 237–240
    confidence intervals, 250–254, 275–276
    errors in, 204–205, 212
    general concepts, 203–208
    in JMP, 295–298, 300–301
    large samples, 208–222, 240–244, 252, 273–274, 295–297
    in MINITAB, 273–282
    normal population, 238–240, 253, 254
    one population mean, 208–216, 223–229, 238–240, 250–252, 295–298
    one population proportion, 240–242
    one population variance, 244–247
    paired t-test, 237–240
    probability model for, 201–202
    purpose, 201–202
    small samples, 223–237, 250–254, 274–275, 296–298
    steps in, 207–208
    two population means, 216–222, 229–237, 276–280
    two population proportions, 242–244, 276–280
    two population variances, 247–249, 280–282, 300–301

I
Improve phase, tools/techniques associated, xxi
independence, test of, 310
independent events, 89–90, 307
independent samples, 307
inertia, moments of, 101
inferential statistics, 10, 307
infinite populations, 11
information, 6, 7
inter-quartile range (IQR), 52, 64–65, 308
intersection operations, 80–83, 307
interval data, 12–13, 307
interval estimation, 165, 171–172, 192–195, 307. See also hypothesis testing; point estimation
IQR (inter-quartile range), 52, 64–65, 308

J
JMP
    basic functions, 284–286
    calculating statistics, 286–287
    displaying bar charts, 292–294
    displaying box-whisker plots, 290–291
    displaying graphical summaries, 291–292
    displaying histograms, 288–289
    displaying pie charts, 294–295
    displaying stem and leaf diagrams, 289–290
    hypothesis testing, 295–298, 300–301
    normality testing, 301–302
    paired t-test, 298–300


L
LCL (lower confidence limits), 171–172
left skewed data, 51
left skewed distributions, 67
level of significance, 205, 308
limits, class, 306
limits, confidence, 171–172
limits, specification, 2–3
line graphs, 33–34, 39
location, measures of, 2, 3, 63–64
lower confidence limits (LCL), 171–172
lower fences, 308
lower-tail hypotheses, 209

M
MAIC (Measure, Analyze, Improve, and Control), 5
margin of error, 168, 192
marginal probability, 308
marks, class, 20
mean square error (MSE), 308
means
    arithmetic, 305
    Bernoulli distributions, 102
    binomial distributions, 106–107
    continuous random variables, 120
    defined, 308
    discrete random variables, 99–101
    exponential distributions, 130
    generally, 46–48, 51
    for grouped data, 58
    hypergeometric distributions, 110
    Poisson distributions, 114
    population, 138–141
    sample, 138–141
    uniform distributions, 120
    Weibull distributions, 133
    weighted, 311
Measure, Analyze, Improve, and Control (MAIC), 5
Measure phase, tools/techniques associated, xxi
measures of association, 39–44
measures of centrality
    defined, 45–46, 308
    for grouped data, 57–59
    limitations of, 52
    means. See means
    medians, 48–50, 51, 58–59, 308
    modes, 50–51, 59, 308
measures of dispersion, 2–3, 45, 52–57, 60, 64–65
measures of location, 2, 3, 63–64
measures of variability, 308
medians, 48–50, 51, 58–59, 308
memory-less properties, 130–131
midpoints, class, 20, 306
MINITAB
    calculating distributions, 269–272
    calculating statistics, 258–260
    displaying bar charts, 267–269
    displaying box-whisker plots, 264–266
    displaying dot plots, 262–263, 264
    displaying graphical summaries, 266–267
    displaying graphs, generally, 260
    displaying histograms, 260–262
    displaying pie charts, 268–270
    displaying scatter plots, 263–264, 265
    general use, 255–258
    hypothesis testing about population mean and proportion, 273–276
    hypothesis testing about two population means and proportions, 276–280
    hypothesis testing about two population variances, 280–282
    normality testing, 282–283
    paired t-test, 278–279
modes, 50–51, 59, 308
moments of inertia, 101
Motorola definition of Six Sigma, 3
MSE (mean square error), 308
multiplication rule, 77, 90
mutually exclusive events, 83, 90, 308

N
nominal data, 12, 308
nonconditional probability, 89–90
nonparametric statistics, 308
normal distributions
    calculating in MINITAB, 270–271
    chi-square distributions and, 148
    defined, 121, 123, 308, 310
    empirical rule, 60–63


    examples, 124–129
    generally, 121–124
    standard deviation and, 2
    Student's t-distribution and, 153
    tables, 123–124, 319
normality testing
    JMP, 301–302
    MINITAB, 282–283
null events, 75, 79
null hypotheses, 202, 308
numerical data
    graphical representations, 20–21, 27–44, 66–70
    interval estimation and, 171
    measures of, 52
    point estimation and, 166
numerical measures, 45. See also measures of centrality; measures of dispersion

O
observations, 308
observed level of significance, 205, 308
OC (operating characteristic) curves, 206, 308
Ogive curves, 32–33, 308
one-tail alternatives, 203
one-tail tests, 308
operating characteristic (OC) curves, 206, 308
opportunities for defects, 1
ordered stem and leaf diagrams, 36. See also stem and leaf diagrams
ordinal data, 12–13, 308
outliers, 48, 66, 67, 308

P
p-values, 210–211, 309
paired data, 237–240, 309
paired t-test, 237–240, 278–279, 298–300
parameters, 45, 137, 165, 309
Pareto charts, 24
Pearson correlation coefficients, 40
Pearson, Karl, 40
percentiles, 63–64, 309
perfect associations, 41
permutations, 77–78
pie charts, 22–23, 268–270, 294–295, 309
pivotal quantities, 172–173
point binomial distributions, 102
point estimation. See also hypothesis testing; interval estimation
    bias in, 167–170, 310
    defined, 165
    description, 166–169
    errors of, 168, 192, 305, 307
    examples, 169–171
    variances of, 167, 169–170
point values, 309
Poisson approximation to binomial distribution, 111–112, 158–159
Poisson distributions, 110–114, 270, 309
Poisson probability tables, 114, 317–318
Poisson process, 111, 131–132
population means
    confidence intervals for large samples, 180–183
    confidence intervals for small samples, 183–187
    differences between, 216–222
    sample mean and, 138–141
population proportions
    confidence intervals, 187–192
    difference between two, 242–244
    estimating unknown, 195
    hypothesis testing and, 240–244, 273–280
population variances
    confidence intervals, 195–200
    formula for, 54
    for grouped data, 60
    hypothesis testing with known, 208–213, 216–219, 223–226, 229–232, 250–252
    hypothesis testing with one, 244–247
    hypothesis testing with two, 247–249, 280–282, 300–301
    hypothesis testing with unknown, 213–216, 219–222, 226–230, 232–237
    unknown, 193–194, 195
populations
    defined, 10–11, 309
    types of, 11, 107, 140, 147
power curve, defined, 309
power, defined, 309
power of the test, 205
probability
    absolute, 89–90
    axiomatic approach, 86–88


    conditional, 88–91, 306
    defined, 71, 72
    defining by relative frequency, 83–86
    marginal, 308
    nonconditional, 89–90
    random experiments, 72–73
    statistics and, 72
    theoretical, 85
probability distributions. See distributions
probability functions
    continuous random variables, 115
    exponential distributions, 129–131
    formula for, 95–96
    graphical representations, 97
    Poisson distributions, 111
    Snedecor's F-distributions, 156–157
probability tables, Poisson, 114, 317–318
problem-solving methodologies, xix
process, defined, 309
process improvement using Six Sigma, 3–5

Q
qualitative data
    defined, 12–13, 309
    frequency distribution tables, 15–18
    graphical representations, 22–27
quality control, defined, 309
quantitative data
    defined, 12–13, 309
    frequency distribution tables, 18–20
    graphical representations, 20–21, 27–33, 34–44, 66–70
    interval estimation and, 171
    measures of, 52
    point estimation and, 166
quartiles, 64–65, 307, 309–310

R
random experiments
    defined, 307
    events of, 73–75
    probability and, 72–73
random samples, 11, 309
random variables
    Bernoulli, 102–103
    continuous, 94, 115, 117–120, 306
    defined, 93, 309
    discrete, 93, 94, 97, 99–101, 306
    standard normal, 122–123
    types, 93, 115
range spaces, 95
ranges, 52, 53, 309
ranges, interquartile, 52, 64–65
rare events, 110
ratio data, 12–13
rectangular distributions, 118
rejection regions, 204, 205, 309
relative frequencies, 16–17, 83–86, 309
relative frequency approach, 85
relative frequency histograms, 27–30, 307
relative frequency polygons, 30–31
replacement, sampling and, 107
research hypotheses, 202
right skewed data, 52
right skewed distributions, 67
root cause analysis, xix

S
sample mean, probability distributions of, 140
sample points, 73, 77
sample sizes, determining, 192–195
sample spaces, 73–77, 79–83, 309
sample statistics, 309
sample surveys, 309
sample variances, 54, 60
sampled populations, 11
samples
    defined, 11, 309
    independent, 307
    replacement and, 107
sampling distributions. See also central limit theorem
    defined, 309
    generally, 137
    of sample mean, 138–141
    of sample proportion, 147–148
    Student's t-distribution, 153–155
SAS software, 255
scatter plots, 21, 39–44, 263–264, 265
second quartile, 309
Set Theory, 80–83
significance, level of, 205, 308
simple events, 73, 75, 309


simple hypotheses, 203
single-valued frequency distribution tables, 18
Six Sigma
    defined, 1, 310
    implementation flow chart, 4
    methodology, xix, 3–5
    Motorola definition, 3
    statistical concept, 1–3
    steps in, xvii
    tools/techniques, xxi
skewed data, 310
Snedecor's F-distribution, 155–158
software for statistical analysis, 255, 303. See also JMP; MINITAB
specification limits, 2–3
SPSS software, 255
standard deviations
    Bernoulli distributions, 102
    binomial distributions, 106–107
    continuous random variables, 120
    defined, 310
    discrete random variables, 99–101
    exponential distributions, 130
    generally, 55–56
    for grouped data, 60
    hypergeometric distributions, 110
    Poisson distributions, 114
    uniform distributions, 120
standard error, 140, 310
standard normal distributions. See normal distributions
standard normal random variables, 122–123
statistical tools, 255, 303. See also JMP; MINITAB
statistics
    calculating in JMP, 286–287
    calculating in MINITAB, 258–260
    defined, 9, 45, 137
    descriptive, 10, 15, 306
    goals of, 165
    inferential, 10, 307
    nonparametric, 308
    probability and, 72
    sample, 309
Statpages.net, 303
Statsoftinc.com, 303
stem and leaf diagrams, 27, 34–39, 289–290, 310
Student's t-distribution, 153–155, 180, 226, 230, 310
Sturges' formula, 19
sure events, 75, 80, 310
surveys, sample, 309
symmetric data, 51, 310
symmetric distributions, 67
SYSTAT software, 255

T
t critical values table, 322
t-distributions, 153–155, 180, 226, 230
t-test, paired, 237–240, 278–279, 298–300
tables
    binomial probability, 105–106, 159, 314–316
    chi-square distribution, 149–150, 320–321
    contingency, 306
    F critical value, 323–330
    frequency distribution, 15–20, 34–37, 39, 307
    normal distribution, 123–124, 319
    Poisson probability, 114, 317–318
    Snedecor's F-distribution, 157
    Student's t-distribution, 154–155
    t critical values, 322
target populations, 11
test statistic, 310
testing statistical hypotheses, 202
tests, types of, 310
theoretical probability, 85
third quartile, 310
time series graphs, 33–34
tree diagrams, 75–77
two-tail alternatives, 203
two-tail hypotheses, 209
two-tail tests, 310
type I error, 205, 212, 310
type II error, 205, 212, 310

U
UCL (upper confidence limits), 171–172
unbiased estimators, 167–170, 310
ungrouped data, 20
uniform distributions, 118–120, 270, 310
union operations, 80–83


upper confidence limits (UCL), 171–172
upper fences, 310
upper-tail hypotheses, 209

V
values
    chi-square, 320–321
    expected, 99, 307
    extreme, 48, 66, 67, 308
    F critical, 323–330
    p-, 210–211, 309
    point, 309
    t critical, 322
variability, measures of, 308
variables
    defined, 310
    in frequency distribution tables, 16
variances
    defined, 310
    generally, 52, 53–55, 56, 60
    of point estimators, 167, 169–170
    population. See population variances
    sample, 54, 60
    Weibull distributions, 133
variation within a process, 3
Venn diagrams, 79–83, 310

W
Walker, H. Fred, 333–334
web-based statistical tools, 303
Weibull distributions, 132–135
weighted means, 311
width, class, 306

Z
Z distributions, 311
z-scores, 123, 311
