
Probability and Statistics in Engineering
Fourth Edition

William W. Hines
Professor Emeritus
School of Industrial and Systems Engineering
Georgia Institute of Technology

Douglas C. Montgomery
Professor of Engineering and Statistics
Department of Industrial Engineering
Arizona State University

David M. Goldsman
Professor
School of Industrial and Systems Engineering
Georgia Institute of Technology

Connie M. Borror
Senior Lecturer
Department of Industrial Engineering
Arizona State University

WILEY
John Wiley & Sons, Inc.
Acquisitions Editor Wayne Anderson
Associate Editor Jenny Welter
Marketing Manager Katherine Hepburn
Senior Production Editor Valerie A. Vargas
Senior Designer Dawn Stamey
Cover Image Alfredo Pasieka/Photo Researchers
Production Management Services Argosy Publishing

This book was set in 10/12 Times Roman by Argosy Publishing and printed and bound by Hamilton
Printing. The cover was printed by Phoenix Color.

This book is printed on acid-free paper.

Copyright © 2003 John Wiley & Sons, Inc. All rights reserved.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any
means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Sec-
tions 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Pub-
lisher or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222
Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470. Requests to the Publisher for per-
mission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken,
NJ 07030, (201) 748-6011, fax (201) 748-6008, E-mail: PERMREQ@WILEY.COM.
To order books or for customer service please call 1-800-CALL-WILEY (225-5945).

Library of Congress Cataloging in Publication Data:

Probability and statistics in engineering / William W. Hines ... [et al.]. - 4th ed.
p. cm.
Includes bibliographical references.
1. Engineering--Statistical methods. I. Hines, William W.
TA340 .H55 2002   519.2-dc21   2002026703
ISBN 0-471-24087-7 (cloth : acid-free paper)

Printed in the United States of America

10 9 8 7 6 5 4 3 2
PREFACE to the 4th Edition

This book is written for a first course in applied probability and statistics for undergraduate students in engineering, physical sciences, and management science curricula. We have found that the text can be used effectively as a two-semester sophomore- or junior-level course sequence, as well as a one-semester refresher course in probability and statistics for first-year graduate students.

The text has undergone a major overhaul for the fourth edition, especially with regard to many of the statistics chapters. The idea has been to make the book more accessible to a wide audience by including more motivational examples, real-world applications, and useful computer exercises. With the aim of making the course material easier to learn and easier to teach, we have also provided a convenient set of course notes, available on the Web site www.wiley.com/college/hines. For instructors adopting the text, the complete solutions are also available on a password-protected portion of this Web site.

Structurally speaking, we start the book off with probability theory (Chapter 1) and progress through random variables (Chapter 2), functions of random variables (Chapter 3), joint random variables (Chapter 4), discrete and continuous distributions (Chapters 5 and 6), and the normal distribution (Chapter 7). Then we introduce statistics and data description techniques (Chapter 8). The statistics chapters follow the same rough outline as in the previous edition, namely, sampling distributions (Chapter 9), parameter estimation (Chapter 10), hypothesis testing (Chapter 11), single- and multifactor design of experiments (Chapters 12 and 13), and simple and multiple regression (Chapters 14 and 15). Subsequent special-topics chapters include nonparametric statistics (Chapter 16), quality control and reliability engineering (Chapter 17), and stochastic processes and queueing theory (Chapter 18). Finally, there is an entirely new chapter on statistical techniques for computer simulation (Chapter 19), perhaps the first of its kind in this type of statistics text.

The chapters that have seen the most substantial evolution are Chapters 8-14. The discussion in Chapter 8 on descriptive data analysis is greatly enhanced over that of the previous edition. We also expanded the discussion on different types of interval estimation in Chapter 10. In addition, an emphasis has been placed on real-life computer data analysis examples. Throughout the book, we incorporated other structural changes. In all chapters, we included new examples and exercises, including numerous computer-based exercises.

A few words on Chapters 18 and 19. Stochastic processes and queueing theory arise naturally out of probability, and we feel that Chapter 18 serves as a good introduction to the subject, normally taught in operations research, management science, and certain engineering disciplines. Queueing theory has garnered a great deal of use in such diverse fields as telecommunications, manufacturing, and production planning. Computer simulation, the topic of Chapter 19, is perhaps the most widely used tool in operations research and management science, as well as in a number of physical sciences. Simulation marries all the tools of probability and statistics and is used in everything from financial analysis to factory control and planning. Our text provides what amounts to a simulation minicourse, covering the areas of Monte Carlo experimentation, random number and variate generation, and simulation output data analysis.


We are grateful to the following individuals for their help during the process of completing the current revision of the text. Christos Alexopoulos (Georgia Institute of Technology), Michael Caramanis (Boston University), David R. Clark (Kettering University), J. N. Hool (Auburn University), John S. Ramberg (University of Arizona), and Edward J. Williams (University of Michigan - Dearborn) served as reviewers and provided a great deal of valuable feedback. Beatriz Valdes (Argosy Publishing) did a wonderful job supervising the typesetting and page proofing of the text, and Jennifer Welter at Wiley provided great leadership at every turn. Everyone was certainly a pleasure to work with. Of course, we thank our families for their infinite patience and support throughout the endeavor.

Hines, Montgomery, Goldsman, and Borror


Contents

1. An Introduction to Probability 1
1-1 Introduction 1
1-2 A Review of Sets 2
1-3 Experiments and Sample Spaces 5
1-4 Events 8
1-5 Probability Definition and Assignment 8
1-6 Finite Sample Spaces and Enumeration 14
1-6.1 Tree Diagram 14
1-6.2 Multiplication Principle 14
1-6.3 Permutations 15
1-6.4 Combinations 16
1-6.5 Permutations of Like Objects 19
1-7 Conditional Probability 20
1-8 Partitions, Total Probability, and Bayes' Theorem 25
1-9 Summary 28
1-10 Exercises 28

2. One-Dimensional Random Variables 33
2-1 Introduction 33
2-2 The Distribution Function 36
2-3 Discrete Random Variables 38
2-4 Continuous Random Variables 41
2-5 Some Characteristics of Distributions 44
2-6 Chebyshev's Inequality 48
2-7 Summary 49
2-8 Exercises 50

3. Functions of One Random Variable and Expectation 52
3-1 Introduction 52
3-2 Equivalent Events 52
3-3 Functions of a Discrete Random Variable 54
3-4 Continuous Functions of a Continuous Random Variable 55
3-5 Expectation 58
3-6 Approximations to E[H(X)] and V[H(X)] 62
3-7 The Moment-Generating Function 65
3-8 Summary 67
3-9 Exercises 68

4. Joint Probability Distributions 71
4-1 Introduction 71
4-2 Joint Distribution for Two-Dimensional Random Variables 72
4-3 Marginal Distributions 75
4-4 Conditional Distributions 79
4-5 Conditional Expectation 82
4-6 Regression of the Mean 85
4-7 Independence of Random Variables 86
4-8 Covariance and Correlation 87
4-9 The Distribution Function for Two-Dimensional Random Variables 91
4-10 Functions of Two Random Variables 92
4-11 Joint Distributions of Dimension n > 2 94
4-12 Linear Combinations 96
4-13 Moment-Generating Functions and Linear Combinations 99
4-14 The Law of Large Numbers 99
4-15 Summary 101
4-16 Exercises 101

5. Some Important Discrete Distributions 106
5-1 Introduction 106
5-2 Bernoulli Trials and the Bernoulli Distribution 106
5-3 The Binomial Distribution 108
5-3.1 Mean and Variance of the Binomial Distribution 109
5-3.2 The Cumulative Binomial Distribution 110
5-3.3 An Application of the Binomial Distribution 111
5-4 The Geometric Distribution 112
5-4.1 Mean and Variance of the Geometric Distribution 113
5-5 The Pascal Distribution 115
5-5.1 Mean and Variance of the Pascal Distribution 115
5-6 The Multinomial Distribution 116
5-7 The Hypergeometric Distribution 117
5-7.1 Mean and Variance of the Hypergeometric Distribution 118
5-8 The Poisson Distribution 118
5-8.1 Development from a Poisson Process 118
5-8.2 Development of the Poisson Distribution from the Binomial 120
5-8.3 Mean and Variance of the Poisson Distribution 120
5-9 Some Approximations 122
5-10 Generation of Realizations 123
5-11 Summary 123
5-12 Exercises 123

6. Some Important Continuous Distributions 128
6-1 Introduction 128
6-2 The Uniform Distribution 128
6-2.1 Mean and Variance of the Uniform Distribution 129
6-3 The Exponential Distribution 130
6-3.1 The Relationship of the Exponential Distribution to the Poisson Distribution 131
6-3.2 Mean and Variance of the Exponential Distribution 131
6-3.3 Memoryless Property of the Exponential Distribution 133
6-4 The Gamma Distribution 134
6-4.1 The Gamma Function 134
6-4.2 Definition of the Gamma Distribution 134
6-4.3 Relationship Between the Gamma Distribution and the Exponential Distribution 135
6-4.4 Mean and Variance of the Gamma Distribution 135
6-5 The Weibull Distribution 137
6-5.1 Mean and Variance of the Weibull Distribution 137
6-6 Generation of Realizations 138
6-7 Summary 139
6-8 Exercises 139

7. The Normal Distribution 143
7-1 Introduction 143
7-2 The Normal Distribution 143
7-2.1 Properties of the Normal Distribution 143
7-2.2 Mean and Variance of the Normal Distribution 144
7-2.3 The Normal Cumulative Distribution 145
7-2.4 The Standard Normal Distribution 145
7-2.5 Problem-Solving Procedure 146
7-3 The Reproductive Property of the Normal Distribution 150
7-4 The Central Limit Theorem 152
7-5 The Normal Approximation to the Binomial Distribution 155
7-6 The Lognormal Distribution 157
7-6.1 Density Function 158
7-6.2 Mean and Variance of the Lognormal Distribution 158
7-6.3 Properties of the Lognormal Distribution 159
7-7 The Bivariate Normal Distribution 160
7-8 Generation of Normal Realizations 164
7-9 Summary 165
7-10 Exercises 165

8. Introduction to Statistics and Data Description 169
8-1 The Field of Statistics 169
8-2 Data 173
8-3 Graphical Presentation of Data 173
8-3.1 Numerical Data: Dot Plots and Scatter Plots 173
8-3.2 Numerical Data: The Frequency Distribution and Histogram 175
8-3.3 The Stem-and-Leaf Plot 178
8-3.4 The Box Plot 179
8-3.5 The Pareto Chart 181
8-3.6 Time Plots 183
8-4 Numerical Description of Data 183
8-4.1 Measures of Central Tendency 183
8-4.2 Measures of Dispersion 186
8-4.3 Other Measures for One Variable 189
8-4.4 Measuring Association 190
8-4.5 Grouped Data 191
8-5 Summary 192
8-6 Exercises 193

9. Random Samples and Sampling Distributions 198
9-1 Random Samples 198
9-1.1 Simple Random Sampling from a Finite Universe 199
9-1.2 Stratified Random Sampling of a Finite Universe 200
9-2 Statistics and Sampling Distributions 201
9-2.1 Sampling Distributions 202
9-2.2 Finite Populations and Enumerative Studies 204
9-3 The Chi-Square Distribution 205
9-4 The t Distribution 208
9-5 The F Distribution 211
9-6 Summary 214
9-7 Exercises 214

10. Parameter Estimation 216
10-1 Point Estimation 216
10-1.1 Properties of Estimators 217
10-1.2 The Method of Maximum Likelihood 221
10-1.3 The Method of Moments 224
10-1.4 Bayesian Inference 226
10-1.5 Applications to Estimation 227
10-1.6 Precision of Estimation: The Standard Error 230
10-2 Single-Sample Confidence Interval Estimation 232
10-2.1 Confidence Interval on the Mean of a Normal Distribution, Variance Known 233
10-2.2 Confidence Interval on the Mean of a Normal Distribution, Variance Unknown 236
10-2.3 Confidence Interval on the Variance of a Normal Distribution 238
10-2.4 Confidence Interval on a Proportion 239
10-3 Two-Sample Confidence Interval Estimation 242
10-3.1 Confidence Interval on the Difference between Means of Two Normal Distributions, Variances Known 242
10-3.2 Confidence Interval on the Difference between Means of Two Normal Distributions, Variances Unknown 244
10-3.3 Confidence Interval on μ1 − μ2 for Paired Observations 247
10-3.4 Confidence Interval on the Ratio of Variances of Two Normal Distributions 248
10-3.5 Confidence Interval on the Difference between Two Proportions 250
10-4 Approximate Confidence Intervals in Maximum Likelihood Estimation 251
10-5 Simultaneous Confidence Intervals 252
10-6 Bayesian Confidence Intervals 252
10-7 Bootstrap Confidence Intervals 253
10-8 Other Interval Estimation Procedures 255
10-8.1 Prediction Intervals 255
10-8.2 Tolerance Intervals 257
10-9 Summary 258
10-10 Exercises 260

11. Tests of Hypotheses 266
11-1 Introduction 266
11-1.1 Statistical Hypotheses 266
11-1.2 Type I and Type II Errors 267
11-1.3 One-Sided and Two-Sided Hypotheses 269
11-2 Tests of Hypotheses on a Single Sample 271
11-2.1 Tests of Hypotheses on the Mean of a Normal Distribution, Variance Known 271
11-2.2 Tests of Hypotheses on the Mean of a Normal Distribution, Variance Unknown 278
11-2.3 Tests of Hypotheses on the Variance of a Normal Distribution 281
11-2.4 Tests of Hypotheses on a Proportion 283
11-3 Tests of Hypotheses on Two Samples 286
11-3.1 Tests of Hypotheses on the Means of Two Normal Distributions, Variances Known 286
11-3.2 Tests of Hypotheses on the Means of Two Normal Distributions, Variances Unknown 288
11-3.3 The Paired t-Test 292
11-3.4 Tests for the Equality of Two Variances 294
11-3.5 Tests of Hypotheses on Two Proportions 296
11-4 Testing for Goodness of Fit 300
11-5 Contingency Table Tests 307
11-6 Sample Computer Output 309
11-7 Summary 312
11-8 Exercises 312

12. Design and Analysis of Single-Factor Experiments: The Analysis of Variance 321
12-1 The Completely Randomized Single-Factor Experiment 321
12-1.1 An Example 321
12-1.2 The Analysis of Variance 323
12-1.3 Estimation of the Model Parameters 327
12-1.4 Residual Analysis and Model Checking 330
12-1.5 An Unbalanced Design 331
12-2 Tests on Individual Treatment Means 332
12-2.1 Orthogonal Contrasts 332
12-2.2 Tukey's Test 335
12-3 The Random-Effects Model 337
12-4 The Randomized Block Design 341
12-4.1 Design and Statistical Analysis 341
12-4.2 Tests on Individual Treatment Means 344
12-4.3 Residual Analysis and Model Checking 345
12-5 Determining Sample Size in Single-Factor Experiments 347
12-6 Sample Computer Output 348
12-7 Summary 350
12-8 Exercises 350

13. Design of Experiments with Several Factors 353
13-1 Examples of Experimental Design Applications 353
13-2 Factorial Experiments 355
13-3 Two-Factor Factorial Experiments 359
13-3.1 Statistical Analysis of the Fixed-Effects Model 359
13-3.2 Model Adequacy Checking 364
13-3.3 One Observation per Cell 364
13-3.4 The Random-Effects Model 366
13-3.5 The Mixed Model 367
13-4 General Factorial Experiments 369
13-5 The 2^k Factorial Design 373
13-5.1 The 2^2 Design 373
13-5.2 The 2^k Design for k ≥ 3 Factors 379
13-5.3 A Single Replicate of the 2^k Design 386
13-6 Confounding in the 2^k Design 390
13-7 Fractional Replication of the 2^k Design 394
13-7.1 The One-Half Fraction of the 2^k Design 394
13-7.2 Smaller Fractions: The 2^(k-p) Fractional Factorial 400
13-8 Sample Computer Output 403
13-9 Summary 404
13-10 Exercises 404

14. Simple Linear Regression and Correlation 409
14-1 Simple Linear Regression 409
14-2 Hypothesis Testing in Simple Linear Regression 414
14-3 Interval Estimation in Simple Linear Regression 417
14-4 Prediction of New Observations 420
14-5 Measuring the Adequacy of the Regression Model 421
14-5.1 Residual Analysis 421
14-5.2 The Lack-of-Fit Test 422
14-5.3 The Coefficient of Determination 426
14-6 Transformations to a Straight Line 426
14-7 Correlation 427
14-8 Sample Computer Output 431
14-9 Summary 431
14-10 Exercises 432

15. Multiple Regression 437
15-1 Multiple Regression Models 437
15-2 Estimation of the Parameters 438
15-3 Confidence Intervals in Multiple Linear Regression 444
15-4 Prediction of New Observations 446
15-5 Hypothesis Testing in Multiple Linear Regression 447
15-5.1 Test for Significance of Regression 447
15-5.2 Tests on Individual Regression Coefficients 450
15-6 Measures of Model Adequacy 452
15-6.1 The Coefficient of Multiple Determination 453
15-6.2 Residual Analysis 454
15-7 Polynomial Regression 456
15-8 Indicator Variables 458
15-9 The Correlation Matrix 461
15-10 Problems in Multiple Regression 464
15-10.1 Multicollinearity 464
15-10.2 Influential Observations in Regression 470
15-10.3 Autocorrelation 472
15-11 Selection of Variables in Multiple Regression 474
15-11.1 The Model-Building Problem 474
15-11.2 Computational Procedures for Variable Selection 474
15-12 Summary 486
15-13 Exercises 486

16. Nonparametric Statistics 491
16-1 Introduction 491
16-2 The Sign Test 491
16-2.1 A Description of the Sign Test 491
16-2.2 The Sign Test for Paired Samples 494
16-2.3 Type II Error (β) for the Sign Test 494
16-2.4 Comparison of the Sign Test and the t-Test 496
16-3 The Wilcoxon Signed Rank Test 496
16-3.1 A Description of the Test 496
16-3.2 A Large-Sample Approximation 497
16-3.3 Paired Observations 498
16-3.4 Comparison with the t-Test 499
16-4 The Wilcoxon Rank Sum Test 499
16-4.1 A Description of the Test 499
16-4.2 A Large-Sample Approximation 501
16-4.3 Comparison with the t-Test 501
16-5 Nonparametric Methods in the Analysis of Variance 501
16-5.1 The Kruskal-Wallis Test 501
16-5.2 The Rank Transformation 504
16-6 Summary 504
16-7 Exercises 505

17. Statistical Quality Control and Reliability Engineering 507
17-1 Quality Improvement and Statistics 507
17-2 Statistical Quality Control 508
17-3 Statistical Process Control 508
17-3.1 Introduction to Control Charts 509
17-3.2 Control Charts for Measurements 510
17-3.3 Control Charts for Individual Measurements 518
17-3.4 Control Charts for Attributes 520
17-3.5 CUSUM and EWMA Control Charts 525
17-3.6 Average Run Length 532
17-3.7 Other SPC Problem-Solving Tools 535
17-4 Reliability Engineering 537
17-4.1 Basic Reliability Definitions 538
17-4.2 The Exponential Time-to-Failure Model 541
17-4.3 Simple Serial Systems 542
17-4.4 Simple Active Redundancy 544
17-4.5 Standby Redundancy 545
17-4.6 Life Testing 547
17-4.7 Reliability Estimation with a Known Time-to-Failure Distribution 548
17-4.8 Estimation with the Exponential Time-to-Failure Distribution 548
17-4.9 Demonstration and Acceptance Testing 551
17-5 Summary 551
17-6 Exercises 551

18. Stochastic Processes and Queueing 555
18-1 Introduction 555
18-2 Discrete-Time Markov Chains 555
18-3 Classification of States and Chains 557
18-4 Continuous-Time Markov Chains 561
18-5 The Birth-Death Process in Queueing 564
18-6 Considerations in Queueing Models 567
18-7 Basic Single-Server Model with Constant Rates 568
18-8 Single Server with Limited Queue Length 570
18-9 Multiple Servers with an Unlimited Queue 572
18-10 Other Queueing Models 573
18-11 Summary 573
18-12 Exercises 573

19. Computer Simulation 576
19-1 Motivational Examples 576
19-2 Generation of Random Variables 580
19-2.1 Generating Uniform (0,1) Random Variables 581
19-2.2 Generating Nonuniform Random Variables 582
19-3 Output Analysis 586
19-3.1 Terminating Simulation Analysis 587
19-3.2 Initialization Problems 589
19-3.3 Steady State Simulation Analysis 590
19-4 Comparison of Systems 591
19-4.1 Classical Confidence Intervals 592
19-4.2 Common Random Numbers 593
19-4.3 Antithetic Random Numbers 593
19-4.4 Selecting the Best System 593
19-5 Summary 593
19-6 Exercises 594

Appendix 597
Table I Cumulative Poisson Distribution 598
Table II Cumulative Standard Normal Distribution 601
Table III Percentage Points of the χ² Distribution 603
Table IV Percentage Points of the t Distribution 604
Table V Percentage Points of the F Distribution 605
Chart VI Operating Characteristic Curves 610
Chart VII Operating Characteristic Curves for the Fixed-Effects Model Analysis of Variance 619
Chart VIII Operating Characteristic Curves for the Random-Effects Model Analysis of Variance 623
Table IX Critical Values for the Wilcoxon Two-Sample Test 627
Table X Critical Values for the Sign Test 629
Table XI Critical Values for the Wilcoxon Signed Rank Test 630
Table XII Percentage Points of the Studentized Range Statistic 631
Table XIII Factors for Quality-Control Charts 633
Table XIV k Values for One-Sided and Two-Sided Tolerance Intervals 634
Table XV Random Numbers 636

References 637

Answers to Selected Exercises 639

Index 649
Chapter 1

An Introduction to Probability
1-1 INTRODUCTION
Since professionals working in engineering and applied science are often engaged in both the analysis and the design of systems where system component characteristics are nondeterministic, the understanding and utilization of probability is essential to the description, design, and analysis of such systems. Examples reflecting probabilistic behavior are abundant, and in fact, true deterministic behavior is rare. To illustrate, consider the description of a variety of product quality or performance measurements: the operational lifespan of mechanical and/or electronic systems; the pattern of equipment failures; the occurrence of natural phenomena such as sun spots or tornados; particle counts from a radioactive source; travel times in delivery operations; vehicle accident counts during a given day on a section of freeway; or customer waiting times in line at a branch bank.

The term probability has come to be widely used in everyday life to quantify the degree of belief in an event of interest. There are abundant examples, such as the statements that "there is a 0.2 probability of rain showers" and "the probability that brand X personal computer will survive 10,000 hours of operation without repair is 0.75." In this chapter we introduce the basic structure, elementary concepts, and methods to support precise and unambiguous statements like those above.

The formal study of probability theory apparently originated in the seventeenth and eighteenth centuries in France and was motivated by the study of games of chance. With little formal mathematical understructure, people viewed the field with some skepticism; however, this view began to change in the nineteenth century, when a probabilistic model (description) was developed for the behavior of molecules in a liquid. This became known as Brownian motion, since it was Robert Brown, an English botanist, who first observed the phenomenon in 1827. In 1905, Albert Einstein explained Brownian motion under the hypothesis that particles are subject to the continual bombardment of molecules of the surrounding medium. These results greatly stimulated interest in probability, as did the emergence of the telephone system in the latter part of the nineteenth and early twentieth centuries. Since a physical connecting system was necessary to allow for the interconnection of individual telephones, with call lengths and interdemand intervals displaying large variation, a strong motivation emerged for developing probabilistic models to describe this system's behavior.

Although applications like these were rapidly expanding in the early twentieth century, it is generally thought that it was not until the 1930s that a rigorous mathematical structure for probability emerged. This chapter presents basic concepts leading to and including a definition of probability as well as some results and methods useful for problem solution. The emphasis throughout Chapters 1-7 is to encourage an understanding and appreciation of the subject, with applications to a variety of problems in engineering and science. The reader should recognize that there is a large, rich field of mathematics related to probability that is beyond the scope of this book.


Indeed, our objectives in presenting the basic probability topics considered in the cur-
rent chapter are threefold. First, these concepts enhance and enrich our basic understanding
of the world in which we live. Second, many of the examples and exercises deal with the
use of probability concepts to model the behavior of real-world systems. Finally, the prob-
ability topics developed in Chapters 1-7 provide a foundation for the statistical methods
presented in Chapters 8-16 and beyond. These statistical methods deal with the analysis
and interpretation of data, drawing inference about populations based on a sample of units
selected from them, and with the design and analysis of experiments and experimental
data. A sound understanding of such methods will greatly enhance the professional capa-
bility of individuals working in the data-intensive areas commonly encountered in this
twenty-first century.

1-2 A REVIEW OF SETS


To present the basic concepts of probability theory, we will use some ideas from the theory of sets. A set is an aggregate or collection of objects. Sets are usually designated by capital letters, A, B, C, and so on. The members of the set A are called the elements of A. In general, when x is an element of A we write x ∈ A, and if x is not an element of A we write x ∉ A. In specifying membership we may resort either to enumeration or to a defining property. These ideas are illustrated in the following examples. Braces are used to denote a set, and the colon within the braces is shorthand for "such that."

Example 1-1
The set whose elements are the integers 5, 6, 7, 8 is a finite set with four elements. We could denote this by

A = {5, 6, 7, 8}.

Note that 5 ∈ A and 9 ∉ A are both true.

Example 1-2
If we write V = {a, e, i, o, u}, we have defined the set of vowels in the English alphabet. We may use a defining property and write this using a symbol as

V = {*: * is a vowel in the English alphabet}.

Example 1-3
If we say that A is the set of all real numbers between 0 and 1 inclusive, we might also denote A by a defining property as

A = {x: x ∈ R, 0 ≤ x ≤ 1},

where R is the set of all real numbers.

Example 1-4
The set B = {-3, +3} is the same set as

B = {x: x ∈ R, x^2 = 9},

where R is again the set of real numbers.

Example 1-5
In the real plane we can consider points (x, y) that lie on a given line A. Thus, the condition for inclusion for A requires (x, y) to satisfy ax + by = c, so that

A = {(x, y): x ∈ R, y ∈ R, ax + by = c},

where R is the set of real numbers.

The universal set is the set of all objects under consideration, and it is generally denoted by U. Another special set is the null set or empty set, usually denoted by ∅. To illustrate this concept, consider a set

A = {x: x ∈ R, x^2 = -1}.

The universal set here is R, the set of real numbers. Obviously, set A is empty, since there are no real numbers having the defining property x^2 = -1. We should point out that the set {∅} ≠ ∅.
If two sets are considered, say A and B, we call A a subset of B, denoted A ⊂ B, if each element in A is also an element of B. The sets A and B are said to be equal (A = B) if and only if A ⊂ B and B ⊂ A. As direct consequences of this we may show the following:

1. For any set A, ∅ ⊂ A.
2. For a given U, A considered in the context of U satisfies the relation A ⊂ U.
3. For a given set A, A ⊂ A (a reflexive relation).
4. If A ⊂ B and B ⊂ C, then A ⊂ C (a transitive relation).

An interesting consequence of set equality is that the order of element listing is immaterial. To illustrate, let A = {a, b, c} and B = {c, a, b}. Obviously A = B by our definition. Furthermore, when defining properties are used, the sets may be equal although the defining properties are outwardly different. As an example of the second consequence, we let A = {x: x ∈ R, where x is an even, prime number} and B = {x: x + 3 = 5}. Since the integer 2 is the only even prime, A = B.
We now consider some operations on sets. Let A and B be any subsets of the universal set U. Then the following hold:

1. The complement of A (with respect to U) is the set made up of the elements of U that do not belong to A. We denote this complementary set as Ā. That is,

Ā = {x: x ∈ U, x ∉ A}.

2. The intersection of A and B is the set of elements that belong to both A and B. We denote the intersection as A ∩ B. In other words,

A ∩ B = {x: x ∈ A and x ∈ B}.

We should also note that A ∩ B is a set, and we could give this set some designator, such as C.

3. The union of A and B is the set of elements that belong to at least one of the sets A and B. If D represents the union, then

D = A ∪ B = {x: x ∈ A or x ∈ B (or both)}.

These operations are illustrated in the following examples.



Example 1-6
Let U be the set of letters in the alphabet, that is, U = {*: * is a letter of the English alphabet}; and let A = {*: * is a vowel} and B = {*: * is one of the letters a, b, c}. As a consequence of the definitions,

Ā = the set of consonants,
B̄ = {d, e, f, g, ..., x, y, z},
A ∪ B = {a, b, c, e, i, o, u},
A ∩ B = {a}.

Example 1-7
If the universal set is defined as U = {1, 2, 3, 4, 5, 6, 7}, and three subsets, A = {1, 2, 3}, B = {2, 4, 6}, C = {1, 3, 5, 7}, are defined, then we see immediately from the definitions that

Ā = {4, 5, 6, 7},    B̄ = {1, 3, 5, 7} = C,
A ∪ B = {1, 2, 3, 4, 6},    A ∪ C = {1, 2, 3, 5, 7},    B ∪ C = U,
A ∩ B = {2},    A ∩ C = {1, 3},    B ∩ C = ∅.
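Set operations like these map directly onto the built-in set type of a language such as Python; the following minimal sketch (all names chosen here for illustration) reproduces the computations of Example 1-7 and checks De Morgan's laws, introduced below, for these particular sets.

    # Universal set and the three subsets of Example 1-7
    U = {1, 2, 3, 4, 5, 6, 7}
    A = {1, 2, 3}
    B = {2, 4, 6}
    C = {1, 3, 5, 7}

    print(U - A)   # complement of A: {4, 5, 6, 7}
    print(A | B)   # union A ∪ B: {1, 2, 3, 4, 6}
    print(A & C)   # intersection A ∩ C: {1, 3}
    print(B & C)   # B ∩ C is empty: set()

    # De Morgan's laws hold for this pair of subsets of U
    assert U - (A | B) == (U - A) & (U - B)
    assert U - (A & B) == (U - A) | (U - B)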

The Venn diagram can be used to illustrate certain set operations. A rectangle is drawn to represent the universal set U. A subset A of U is represented by the region within a circle drawn inside the rectangle. Then Ā will be represented by the area of the rectangle outside of the circle, as illustrated in Fig. 1-1. Using this notation, the intersection and union are illustrated in Fig. 1-2.

The operations of intersection and union may be extended in a straightforward manner to accommodate any finite number of sets. In the case of three sets, say A, B, and C, A ∪ B ∪ C has the property that A ∪ (B ∪ C) = (A ∪ B) ∪ C, which obviously holds since both sides have identical members. Similarly, we see that A ∩ B ∩ C = (A ∩ B) ∩ C = A ∩ (B ∩ C). Some important laws obeyed by sets relative to the operations previously defined are listed below.

Identity laws: A ∪ ∅ = A, A ∩ U = A, A ∪ U = U, A ∩ ∅ = ∅.
De Morgan's laws: the complement of A ∪ B is Ā ∩ B̄; the complement of A ∩ B is Ā ∪ B̄.
Associative laws: A ∪ (B ∪ C) = (A ∪ B) ∪ C, A ∩ (B ∩ C) = (A ∩ B) ∩ C.
Distributive laws: A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C), A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).

The reader is asked in Exercise 1-2 to illustrate some of these statements with Venn diagrams. Formal proofs are usually more lengthy.

Figure 1-1 A set in a Venn diagram.



Figure 1-2 The intersection and union of two sets in a Venn diagram. (a) The intersection shaded. (b) The union shaded.

In the case of more than three sets, we use a subscript to generalize. Thus, if n is a positive integer, and E1, E2, ..., En are given sets, then E1 ∩ E2 ∩ ... ∩ En is the set of elements belonging to all of the sets, and E1 ∪ E2 ∪ ... ∪ En is the set of elements that belong to at least one of the given sets.

If A and B are sets, then the set of all ordered pairs (a, b) such that a ∈ A and b ∈ B is called the Cartesian product set of A and B. The usual notation is A × B. We thus have

A × B = {(a, b): a ∈ A and b ∈ B}.

Let r be a positive integer greater than 1, and let A1, ..., Ar represent sets. Then the Cartesian product set is given by

A1 × A2 × ... × Ar = {(a1, a2, ..., ar): aj ∈ Aj for j = 1, 2, ..., r}.
Frequently, the number of elements in a set is of some importance, and we denote by n(A) the number of elements in set A. If the number is finite, we say we have a finite set. Should the set be infinite, such that the elements can be put into a one-to-one correspondence with the natural numbers, then the set is called a denumerably infinite set. The nondenumerable set contains an infinite number of elements that cannot be enumerated. For example, if a < b, then the set A = {x: x ∈ R, a ≤ x ≤ b} is a nondenumerable set.

A set of particular interest is called the power set. The elements of this set are the subsets of a set A, and a common notation is {0, 1}^A. For example, if A = {1, 2, 3}, then the power set is

{∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}}.
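For small finite sets, both the power set and the Cartesian product can be enumerated mechanically; a brief Python sketch, using the set A = {1, 2, 3} above and an illustrative second set B not drawn from the text:

    from itertools import chain, combinations, product

    A = [1, 2, 3]

    # Power set: subsets of every size r = 0, 1, ..., n
    subsets = list(chain.from_iterable(combinations(A, r) for r in range(len(A) + 1)))
    print(subsets)       # (), (1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3)
    print(len(subsets))  # 2**3 = 8 subsets

    # Cartesian product A x B: all ordered pairs (a, b) with a in A, b in B
    B = ['u', 'v']       # illustrative set, not from the text
    print(list(product(A, B)))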

1-3 EXPERIMENTS AND SAMPLE SPACES


Probability theory has been motivated by real-life situations where an experiment is performed and the experimenter observes an outcome. Furthermore, the outcome may not be predicted with certainty. Such experiments are called random experiments. The concept of a random experiment is considered mathematically to be a primitive notion and is thus not otherwise defined; however, we can note that random experiments have some common characteristics. First, while we cannot predict a particular outcome with certainty, we can describe the set of possible outcomes. Second, from a conceptual point of view, the experiment is one that could be repeated under conditions that remain unchanged, with the outcomes appearing in a haphazard manner; however, as the number of repetitions increases, certain patterns in the frequency of outcome occurrence emerge.

We will often consider idealized experiments. For example, we may rule out the outcome of a coin toss when the coin lands on edge. This is more for convenience than out of

necessity. The set of possible outcomes is called the sample space, and these outcomes define the particular idealized experiment. The symbols ℰ and 𝒮 are used to represent the random experiment and the associated sample space.

Following the terminology employed in the review of sets and set operations, we will classify sample spaces (and thus random experiments). A discrete sample space is one in which there is a finite number of outcomes or a countably (denumerably) infinite number of outcomes. Likewise, a continuous sample space has nondenumerable (uncountable) outcomes. These might be real numbers on an interval or real pairs contained in the product of intervals, where measurements are made on two variables following an experiment.

To illustrate random experiments with an associated sample space, we consider several examples.

Example 1-8
ℰ1: Toss a true coin and observe the "up" face.
𝒮1: {H, T}.
Note that this set is finite.

Example 1-9
ℰ2: Toss a true coin three times and observe the sequence of heads and tails.
𝒮2: {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.

Example 1-10
ℰ3: Toss a true coin three times and observe the total number of heads.
𝒮3: {0, 1, 2, 3}.

Example 1-11
ℰ4: Toss a pair of dice and observe the up faces.
𝒮4: {(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6),
     (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6),
     (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6),
     (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6),
     (5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6),
     (6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)}.

Example 1-12
ℰ5: An automobile door is assembled with a large number of spot welds. After assembly, each weld is inspected, and the total number of defectives is counted.
𝒮5: {0, 1, 2, ..., X}, where X = the total number of welds in the door.

Example 1-13
ℰ6: A cathode ray tube is manufactured, put on life test, and aged to failure. The elapsed time (in hours) at failure is recorded.
𝒮6: {t: t ∈ R, t ≥ 0}.
This set is uncountable.

Example 1-14
ℰ7: A monitor records the emission count from a radioactive source in one minute.
𝒮7: {0, 1, 2, ...}.
This set is countably infinite.

Example 1-15
ℰ8: Two key solder joints on a printed circuit board are inspected with a probe as well as visually, and each joint is classified as good, G, or defective, D, requiring rework or scrap.
𝒮8: {GG, GD, DG, DD}.

Example 1-16
ℰ9: In a particular chemical plant the volume produced per day for a particular product ranges between a minimum value, b, and a maximum value, c, which corresponds to capacity. A day is randomly selected and the amount produced is observed.
𝒮9: {x: x ∈ R, b ≤ x ≤ c}.

Example 1-17
ℰ10: An extrusion plant is engaged in making up an order for pieces 20 feet long. Inasmuch as the trim operation creates scrap at both ends, the extruded bar must exceed 20 feet. Because of costs involved, the amount of scrap is critical. A bar is extruded, trimmed, and finished, and the total length of scrap is measured.
𝒮10: {x: x ∈ R, x > 0}.

Example 1-18
ℰ11: In a missile launch, the three components of velocity are measured from the ground as a function of time. At 1 minute after launch these are printed for a control unit.
𝒮11: {(vx, vy, vz): vx, vy, vz are real numbers}.

Example 1-19
ℰ12: In the preceding example, the velocity components are continuously recorded for 5 minutes.
𝒮12: The space is complicated here, as we have all possible realizations of the functions vx(t), vy(t), and vz(t) for 0 ≤ t ≤ 5 minutes to consider.

All these examples have the characteristics required of random experiments. With the exception of Example 1-19, the description of the sample space is straightforward, and although repetition is not considered, ideally we could repeat the experiments. To illustrate the phenomena of random occurrence, consider Example 1-8. Obviously, if ℰ1 is repeated indefinitely, we obtain a sequence of heads and tails. A pattern emerges as we continue the experiment. Notice that since the coin is true, we should obtain heads approximately one-half of the time. In recognizing the idealization in the model, we simply agree on a theoretical possible set of outcomes. In ℰ1, we ruled out the possibility of having the coin land on edge, and in ℰ6, where we recorded the elapsed time to failure, the idealized sample space consisted of all nonnegative real numbers.

1-4 EVENTS
An event, say A, is associated with the sample space of the experiment. The sample space is considered to be the universal set so that event A is simply a subset of 𝒮. Note that both ∅ and 𝒮 are subsets of 𝒮. As a general rule, a capital letter will be used to denote an event. For a finite sample space, we note that the set of all subsets is the power set, and more generally, we require that if A ⊂ 𝒮, then Ā ⊂ 𝒮, and if A1, A2, ... is a sequence of mutually exclusive events in 𝒮, as defined below, then A1 ∪ A2 ∪ ... ⊂ 𝒮. The following events relate to experiments ℰ1, ℰ2, ..., ℰ10, described in the preceding section. These are provided for illustration only; many other events could have been described for each case.

ℰ1. A: The coin toss yields a head, {H}.
ℰ2. A: All the coin tosses give the same face, {HHH, TTT}.
ℰ3. A: The total number of heads is two, {2}.
ℰ4. A: The sum of the "up" faces is seven, {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}.
ℰ5. A: The number of defective welds does not exceed 5, {0, 1, 2, 3, 4, 5}.
ℰ6. A: The time to failure is greater than 1000 hours, {t: t > 1000}.
ℰ7. A: The count is exactly two, {2}.
ℰ8. A: Neither weld is bad, {GG}.
ℰ9. A: The volume produced is between a (> b) and c, {x: x ∈ R, b < a < x < c}.
ℰ10. A: The scrap does not exceed one foot, {x: x ∈ R, 0 < x ≤ 1}.

Since an event is a set, the set operations defined for events and the laws and properties of Section 1-2 hold. If the intersections for all combinations of two or more events among the k events considered are empty, then the k events are said to be mutually exclusive (or disjoint). If there are two events, A and B, then they are mutually exclusive if A ∩ B = ∅. With k = 3, we would require A1 ∩ A2 = ∅, A1 ∩ A3 = ∅, A2 ∩ A3 = ∅, and A1 ∩ A2 ∩ A3 = ∅, and this case is illustrated in Fig. 1-3. We emphasize that these multiple events are associated with one experiment.
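Because mutual exclusivity is just pairwise empty intersection, it is easy to check mechanically; a minimal Python sketch (the events below are illustrative, defined on the sample space of one die toss):

    from itertools import combinations

    def mutually_exclusive(events):
        """True if every pair of events has an empty intersection."""
        return all(not (E1 & E2) for E1, E2 in combinations(events, 2))

    # Three disjoint events on the sample space {1, 2, 3, 4, 5, 6}
    print(mutually_exclusive([{1, 2}, {3, 4}, {5, 6}]))  # True
    print(mutually_exclusive([{1, 2}, {2, 3}]))          # False: outcome 2 is shared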

1-5 PROBABILITY DEFINITION AND ASSIGNMENT


An axiomatic approach is taken to define probability as a set function where the elements of the domain are sets and the elements of the range are real numbers between 0 and 1. If event A is an element in the domain of this function, we use customary functional notation, P(A), to designate the corresponding element in the range.

Figure 1-3 Three mutually exclusive events.

Definition
If an experiment ℰ has sample space 𝒮 and an event A is defined on 𝒮, then P(A) is a real number called the probability of event A or the probability of A, and the function P(·) has the following properties:

1. 0 ≤ P(A) ≤ 1 for each event A of 𝒮.
2. P(𝒮) = 1.
3. For any finite number k of mutually exclusive events defined on 𝒮,

P(A1 ∪ A2 ∪ ... ∪ Ak) = P(A1) + P(A2) + ... + P(Ak).

4. If A1, A2, A3, ... is a denumerable sequence of mutually exclusive events defined on 𝒮, then

P(A1 ∪ A2 ∪ A3 ∪ ...) = P(A1) + P(A2) + P(A3) + ....
Note that the properties of the definition given do not tell the experimenter how to assign probabilities; however, they do restrict the way in which the assignment may be accomplished. In practice, probability is assigned on the basis of (1) estimates obtained from previous experience or prior observations, (2) an analytical consideration of experimental conditions, or (3) assumption.

To illustrate the assignment of probability based on experience, we consider the repetition of the experiment and the relative frequency of the occurrence of the event of interest.

This notion of relative frequency has intuitive appeal, and it involves the conceptual repetition of an experiment and a counting of both the number of repetitions and the number of times the event in question occurs. More precisely, ℰ is repeated m times and two events are denoted A and B. We let m_A and m_B be the number of times A and B occur in the m repetitions.

Definition
The value f_A = m_A/m is called the relative frequency of event A. It has the following properties:

1. 0 ≤ f_A ≤ 1.
2. f_A = 0 if and only if A never occurs, and f_A = 1 if and only if A occurs on every repetition.
3. If A and B are mutually exclusive events, then f_(A∪B) = f_A + f_B.

As m becomes large, f_A tends to stabilize. That is, as the number of repetitions of the experiment increases, the relative frequency of event A will vary less and less (from repetition to repetition). The concept of relative frequency and the tendency toward stability lead to one method for assigning probability. If an experiment ℰ has sample space 𝒮 and an event A is defined, and if the relative frequency f_A approaches some number p_A as the number of repetitions increases, then the number p_A is ascribed to A as its probability; that is, as m → ∞,

f_A → p_A = P(A). (1-1)

In practice, something less than infinite replication must obviously be accepted. As an example, consider a simple coin-tossing experiment ℰ in which we sequentially toss a fair coin and observe the outcome, either heads (H) or tails (T), arising from each trial. If the observational process is considered as a random experiment such that for a particular repetition the sample space is 𝒮 = {H, T}, we define the event A = {H}, where this event is defined before the observations are made. Suppose that after m = 100 repetitions of ℰ, we observe m_A = 43, resulting in a relative frequency of f_A = 0.43, a seemingly low value. Now suppose that we instead conduct m = 10,000 repetitions of ℰ and this time we observe m_A = 4,924, so that the relative frequency is in this case f_A = 0.4924. Since we now have at our disposal 10,000 observations instead of 100, everyone should be more comfortable in assigning the updated relative frequency of 0.4924 to P(A). The stability of f_A as m gets large is only an intuitive notion at this point; we will be able to be more precise later.
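The tendency of f_A to stabilize can be observed in a small simulation; the sketch below (repetition counts chosen for illustration) tosses a simulated fair coin m times for several values of m.

    import random

    random.seed(1)  # fixed seed so the illustration is reproducible

    def relative_frequency(m):
        """Toss a simulated fair coin m times; return the relative frequency of heads."""
        heads = sum(random.random() < 0.5 for _ in range(m))
        return heads / m

    for m in (100, 10_000, 1_000_000):
        print(m, relative_frequency(m))
    # The printed relative frequencies settle toward 0.5 as m grows.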
A method for computing the probability of an event A is as follows. Suppose the sample space has a finite number, n, of elements, e_i, and the probability assigned to an outcome is p_i = P(E_i), where E_i = {e_i} and

p_i ≥ 0, i = 1, 2, ..., n,

while

p_1 + p_2 + ... + p_n = 1.

Then

P(A) = Σ p_i, summed over all i such that e_i ∈ A. (1-2)

This is a statement that the probability of event A is the sum of the probabilities associated with the outcomes making up event A, and this result is simply a consequence of the definition of probability. The practitioner is still faced with assigning probabilities to the outcomes, e_i. It is noted that the sample space will not be finite if, for example, the elements e_i of 𝒮 are countably infinite in number. In this case, we note that

p_i ≥ 0, i = 1, 2, ...,

and p_1 + p_2 + ... = 1. However, equation 1-2 may be used without modification.

If the sample space is finite and has n equally likely outcomes, so that p_1 = p_2 = ... = p_n = 1/n, then

P(A) = n(A)/n, (1-3)

where n(A) outcomes are contained in A. Counting methods useful in determining n and n(A) will be presented in Section 1-6.

Example 1-20
Suppose the coin in Example 1-9 is biased so that the outcomes of the sample space 𝒮 = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT} have probabilities p_1 = 8/27, p_2 = 4/27, p_3 = 4/27, p_4 = 2/27, p_5 = 4/27, p_6 = 2/27, p_7 = 2/27, p_8 = 1/27, where e_1 = HHH, e_2 = HHT, etc. If we let event A be the event that all tosses yield the same face, then P(A) = 8/27 + 1/27 = 1/3.

Example 1-21
Suppose that in Example 1-14 we have prior knowledge that

p_i = e^(-2) 2^(i-1) / (i - 1)!, i = 1, 2, ...,
    = 0, otherwise,

where p_i is the probability that the monitor will record a count outcome of i - 1 during a 1-minute interval. If we consider event A as the event containing the outcomes 0 and 1, then A = {0, 1}, and P(A) = p_1 + p_2 = e^(-2) + 2e^(-2) = 3e^(-2) ≈ 0.406.
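Equation 1-2 amounts to summing the p_i over the outcomes in A, which is easy to verify numerically; a short Python sketch reproducing the two examples above:

    from math import exp, factorial

    # Example 1-20: biased coin tossed three times; outcomes ordered HHH, HHT, ..., TTT
    p = [8/27, 4/27, 4/27, 2/27, 4/27, 2/27, 2/27, 1/27]
    A = [0, 7]                   # indices of HHH and TTT
    print(sum(p[i] for i in A))  # 1/3

    # Example 1-21: p_i = e^(-2) 2^(i-1) / (i-1)!
    def p_i(i):
        return exp(-2) * 2 ** (i - 1) / factorial(i - 1)

    print(p_i(1) + p_i(2))       # 3e^(-2), approximately 0.406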

Example 1-22
Consider Example 1-9, where a true coin is tossed three times, and consider event A, where all coins show the same face. By equation 1-3,

P(A) = 2/8 = 1/4,

since there are eight total outcomes and two are favorable to event A. The coin was assumed to be true, so all eight possible outcomes are equally likely.

Example 1-23
Assume that the dice in Example 1-11 are true, and consider an event A where the sum of the up faces is 7. Using the results of equation 1-3, we note that there are 36 outcomes, of which six are favorable to the event in question, so that P(A) = 6/36 = 1/6.
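With equally likely outcomes, equation 1-3 reduces probability to counting, and the counting itself can be automated; a minimal sketch for Example 1-23:

    from itertools import product

    # All 36 equally likely pairs of up faces for two dice
    outcomes = list(product(range(1, 7), repeat=2))
    favorable = [pair for pair in outcomes if sum(pair) == 7]
    print(len(favorable), len(outcomes))   # 6 36
    print(len(favorable) / len(outcomes))  # 0.1666..., i.e., 1/6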

Note that Examples 1-22 and 1-23 are extremely simple in two respects: the sample space is of a highly restricted type, and the counting process is easy. Combinatorial methods frequently become necessary as the counting becomes more involved. Basic counting methods are reviewed in Section 1-6. Some important theorems regarding probability follow.

Theorem 1-1
If ∅ is the empty set, then P(∅) = 0.

Proof Note that 𝒮 = 𝒮 ∪ ∅, and 𝒮 and ∅ are mutually exclusive. Then P(𝒮) = P(𝒮) + P(∅) from property 4; therefore P(∅) = 0.

Theorem 1-2
P(Ā) = 1 - P(A).

Proof Note that 𝒮 = A ∪ Ā, and A and Ā are mutually exclusive. Then P(𝒮) = P(A) + P(Ā) from property 4; but from property 2, P(𝒮) = 1; therefore P(Ā) = 1 - P(A).

Theorem 1-3
P(A ∪ B) = P(A) + P(B) - P(A ∩ B).

Proof Since A ∪ B = A ∪ (B ∩ Ā), where A and (B ∩ Ā) are mutually exclusive, and B = (A ∩ B) ∪ (B ∩ Ā), where (A ∩ B) and (B ∩ Ā) are mutually exclusive, then P(A ∪ B) = P(A) + P(B ∩ Ā) and P(B) = P(A ∩ B) + P(B ∩ Ā). Subtracting, P(A ∪ B) - P(B) = P(A) - P(A ∩ B), and thus P(A ∪ B) = P(A) + P(B) - P(A ∩ B).

The Venn diagram shown in Fig. 1-4 is helpful in following the argument of the proof for Theorem 1-3. We see that the "double counting" of the hatched region in the expression P(A) + P(B) is corrected by subtraction of P(A ∩ B).

Theorem 1-4
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) - P(A ∩ B) - P(A ∩ C) - P(B ∩ C) + P(A ∩ B ∩ C).

Proof We may write A ∪ B ∪ C = (A ∪ B) ∪ C and use Theorem 1-3, since A ∪ B is an event. The reader is asked to provide the details in Exercise 1-32.

Theorem 1-5
P(A1 ∪ A2 ∪ ... ∪ Ak) = Σ_i P(Ai) - Σ_(i<j) P(Ai ∩ Aj) + Σ_(i<j<r) P(Ai ∩ Aj ∩ Ar) - ... + (-1)^(k-1) P(A1 ∩ A2 ∩ ... ∩ Ak).

Proof Refer to Exercise 1-33.

Theorem 1-6
If A ⊂ B, then P(A) ≤ P(B).

Proof If A ⊂ B, then B = A ∪ (Ā ∩ B), and P(B) = P(A) + P(Ā ∩ B) ≥ P(A), since P(Ā ∩ B) ≥ 0.

Figure 1-4 Venn diagram for two events.

Example 1-24
If A and B are mutually exclusive events, and if it is known that P(A) = 0.20 while P(B) = 0.30, we can evaluate several probabilities:

1. P(Ā) = 1 - P(A) = 0.80.
2. P(B̄) = 1 - P(B) = 0.70.
3. P(A ∪ B) = P(A) + P(B) = 0.2 + 0.3 = 0.5.
4. P(A ∩ B) = 0.
5. P(Ā ∩ B̄) = 1 - P(A ∪ B), by De Morgan's law,
   = 1 - [P(A) + P(B)] = 0.5.

Example 1-25
Suppose events A and B are not mutually exclusive and we know that P(A) = 0.20, P(B) = 0.30, and P(A ∩ B) = 0.10. Then evaluating the same probabilities as before, we obtain

1. P(Ā) = 1 - P(A) = 0.80.
2. P(B̄) = 1 - P(B) = 0.70.
3. P(A ∪ B) = P(A) + P(B) - P(A ∩ B) = 0.2 + 0.3 - 0.1 = 0.4.
4. P(A ∩ B) = 0.1.
5. P(Ā ∩ B̄) = 1 - P(A ∪ B) = 1 - [P(A) + P(B) - P(A ∩ B)] = 0.6.

Example 1-26
Suppose that in a certain city 75% of the residents jog (J), 20% like ice cream (I), and 40% enjoy music (M). Further, suppose that 15% jog and like ice cream, 30% jog and enjoy music, 10% like ice cream and music, and 5% do all three types of activities. We can consolidate all of this information in the simple Venn diagram in Fig. 1-5 by starting from the last piece of data, P(J ∩ I ∩ M) = 0.05, and "working our way out" of the center.

1. Find the probability that a random resident will engage in at least one of the three activities. By Theorem 1-4,

P(J ∪ I ∪ M) = P(J) + P(I) + P(M) - P(J ∩ I) - P(J ∩ M) - P(I ∩ M) + P(J ∩ I ∩ M)
= 0.75 + 0.20 + 0.40 - 0.15 - 0.30 - 0.10 + 0.05 = 0.85.

This answer is also immediate by adding up the components of the Venn diagram.

2. Find the probability that a resident engages in precisely one type of activity. By the Venn diagram, we see that the desired probability is

P(J ∩ Ī ∩ M̄) + P(J̄ ∩ I ∩ M̄) + P(J̄ ∩ Ī ∩ M) = 0.35 + 0 + 0.05 = 0.40.

Figure 1-5 Venn diagram for Example 1-26.
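The bookkeeping of Example 1-26 can be checked directly against Theorem 1-4; the sketch below hard-codes the probabilities given in the example.

    # Given probabilities from Example 1-26
    PJ, PI, PM = 0.75, 0.20, 0.40       # single activities
    PJI, PJM, PIM = 0.15, 0.30, 0.10    # pairwise intersections
    PJIM = 0.05                         # all three activities

    # Theorem 1-4: inclusion-exclusion for three events
    P_union = PJ + PI + PM - PJI - PJM - PIM + PJIM
    print(round(P_union, 2))            # 0.85

    # Exactly one activity: peel both pairwise overlaps off each event,
    # adding the triple intersection back once for each event
    only_J = PJ - PJI - PJM + PJIM
    only_I = PI - PJI - PIM + PJIM
    only_M = PM - PJM - PIM + PJIM
    print(round(only_J + only_I + only_M, 2))  # 0.40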


1-6 FINITE SAMPLE SPACES AND ENUMERATION


Experiments that give rise to a finite sample space have already been discussed, and the methods for assigning probabilities to events associated with such experiments have been presented. We can use equations 1-1, 1-2, and 1-3 and deal either with "equally likely" or with "not equally likely" outcomes. In some situations we will have to resort to the relative frequency concept and successive trials (experimentation) to estimate probabilities, as indicated in equation 1-1, with some finite m. In this section, however, we deal with equally likely outcomes and equation 1-3. Note that this equation represents a special case of equation 1-2, where p_1 = p_2 = ... = p_n = 1/n.

In order to assign probabilities, P(A) = n(A)/n, we must be able to determine both n, the number of outcomes, and n(A), the number of outcomes favorable to event A. If there are n outcomes in 𝒮, then there are 2^n possible subsets that are the elements of the power set, {0, 1}^𝒮.

The requirement for the n outcomes to be equally likely is an important one, and there will be numerous applications where the experiment will specify that one (or more) item(s) is (are) selected at random from a population group of N items without replacement.

If n represents the sample size (n ≤ N) and the selection is random, then each possible selection (sample) is equally likely. It will soon be seen that there are N!/[n!(N - n)!] such samples, so the probability of getting a particular sample must be n!(N - n)!/N!. It should be carefully noted that one sample differs from another if one (or more) item appears in one sample and not the other. The population items must thus be identifiable. In order to illustrate, suppose a population has four chips (N = 4) labeled a, b, c, and d. The sample size is to be two (n = 2). The possible results of the selection, disregarding order, are elements of 𝒮 = {ab, ac, ad, bc, bd, cd}. If the sampling process is random, the probability of obtaining each possible sample is 1/6. The mechanics of selecting random samples vary a great deal, and devices such as pseudorandom number generators, random number tables, and icosahedron dice are frequently used, as will be discussed at a later point.

It becomes obvious that we need enumeration methods for evaluating n and n(A) for experiments yielding equally likely outcomes; the following sections, 1-6.1 through 1-6.5, review basic enumeration techniques and results useful for this purpose.

1-6.1 Tree Diagram


In simple experiments, a tree diagram may be useful in the enumeration of the sample space. Consider Example 1-9, where a true coin is tossed three times. The set of possible outcomes could be found by taking all the paths in the tree diagram shown in Fig. 1-6. It should be noted that there are 2 outcomes to each trial, 3 trials, and 2^3 = 8 outcomes {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.

1-6.2 Multiplication Principle


If sets A1, A2, ..., Ak have, respectively, n1, n2, ..., nk elements, then there are n1 · n2 · ... · nk ways to select an element first from A1, then from A2, ..., and finally from Ak.

In the special case where n1 = n2 = ... = nk = n, there are n^k possible selections. This was the situation encountered in the coin-tossing experiment of Example 1-9.

Suppose we consider some compound experiment ℰ consisting of k experiments, ℰ1, ℰ2, ..., ℰk. If the sample spaces 𝒮1, 𝒮2, ..., 𝒮k contain n1, n2, ..., nk outcomes, respectively, then there are n1 · n2 · ... · nk outcomes to ℰ. In addition, if the nj outcomes of 𝒮j are equally likely for j = 1, 2, ..., k, then the n1 · n2 · ... · nk outcomes of ℰ are equally likely.

Figure 1-6 A tree diagram for tossing a true coin three times.
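Each path through the tree is one element of the product {H, T} × {H, T} × {H, T}, so the sample space can be generated mechanically; a minimal sketch:

    from itertools import product

    sample_space = [''.join(path) for path in product('HT', repeat=3)]
    print(sample_space)       # ['HHH', 'HHT', 'HTH', 'HTT', 'THH', 'THT', 'TTH', 'TTT']
    print(len(sample_space))  # 2**3 = 8 outcomes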

Example 1-27
Suppose we toss a true coin and cast a true die. Since the coin and the die are true, the two outcomes to ℰ1, 𝒮1 = {H, T}, are equally likely, and the six outcomes to ℰ2, 𝒮2 = {1, 2, 3, 4, 5, 6}, are equally likely. Since n1 = 2 and n2 = 6, there are 12 outcomes to the total experiment and all the outcomes are equally likely. Because of the simplicity of the experiment in this case, a tree diagram permits an easy and complete enumeration. Refer to Fig. 1-7.

Example 1-28
A manufacturing process is operated with very little "in-process inspection." When items are completed they are transported to an inspection area, and four characteristics are inspected, each by a different inspector. The first inspector rates a characteristic according to one of four ratings. The second inspector uses three ratings, and the third and fourth inspectors use two ratings each. Each inspector marks the rating on the item identification tag. There would be a total of 4 · 3 · 2 · 2 = 48 ways in which the item may be marked.
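The multiplication principle translates directly into a nested enumeration; the sketch below counts the markings of Example 1-28 (the numeric rating labels are placeholders).

    from itertools import product

    # Four inspectors using 4, 3, 2, and 2 ratings, respectively
    ratings = [range(4), range(3), range(2), range(2)]
    markings = list(product(*ratings))
    print(len(markings))  # 4 * 3 * 2 * 2 = 48 possible markings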

1-6.3 Permutations
A permutation is an ordered arrangement of distinct objects. One permutation differs from another if the order of arrangement differs or if the content differs. To illustrate, suppose we
𝒮 = {(H, 1), (H, 2), (H, 3), (H, 4), (H, 5), (H, 6),
     (T, 1), (T, 2), (T, 3), (T, 4), (T, 5), (T, 6)}

Figure 1-7 The tree diagram for Example 1-27.

again consider four distinct chips labeled a, b, c, and d. Suppose we wish to consider all permutations of these chips taken one at a time. These would be

a    b    c    d

If we are to consider all permutations taken two at a time, these would be

ab    bc
ba    cb
ac    bd
ca    db
ad    cd
da    dc

Note that permutations ab and ba differ because of a difference in order of the objects, while permutations ac and ab differ because of content differences. In order to generalize, we consider the case where there are n distinct objects from which we plan to select permutations of r objects (r ≤ n). The number of such permutations, P_r^n, is given by

P_r^n = n(n - 1)(n - 2)(n - 3) · ... · (n - r + 1)
      = n!/(n - r)!.

This is a result of the fact that there are n ways to select the first object, (n - 1) ways to select the second, ..., [n - (r - 1)] ways to select the rth, and of the application of the multiplication principle. Note that P_n^n = n! and 0! = 1.

Example 1-29
A major league baseball team typically has 25 players. A line-up consists of nine of these players in a particular order. Thus, there are P_9^25 = 25!/16! ≈ 7.41 × 10^11 possible lineups.
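As a computational check, the permutation count can be evaluated directly; the sketch below assumes Python 3.8+ for math.perm.

```python
# P(n, r) = n!/(n - r)!, the number of ordered arrangements of r of n objects.
import math

def permutations_count(n: int, r: int) -> int:
    return math.factorial(n) // math.factorial(n - r)

print(permutations_count(25, 9))  # 741354768000, about 7.41 x 10^11
print(math.perm(25, 9))           # the same count from the standard library
```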

1-6.4 Combinations
A combination is an arrangement of distinct objects where one combination differs from
another only if the content of the arrangement differs. Here order does not matter. In the
case of the four lettered chips a, b, c, d, the combinations of the chips, taken two at a time,
are

ab  ac  ad  bc  bd  cd
We are interested in determining the number of combinations when there are n distinct
objects to be selected r at a time. Since the number of permutations was the number of ways
to select r objects from the n and then permute the r objects, we note that

P_r^n = \binom{n}{r} \cdot r!,   (1-4)

where \binom{n}{r} represents the number of combinations. It follows that

\binom{n}{r} = \frac{P_r^n}{r!} = \frac{n!}{r!(n-r)!}.   (1-5)

In the illustration with the four chips where r = 2, the reader may readily verify that P_2^4 = 12 and \binom{4}{2} = 6, as we found by complete enumeration.
For present purposes, \binom{n}{r} is defined where n and r are integers such that 0 ≤ r ≤ n; however, the terms \binom{n}{r} may be generally defined for real n and any nonnegative integer r. In this case we write

\binom{n}{r} = \frac{n(n-1)(n-2) \cdots (n-r+1)}{r!}.
The reader will recall the binomial theorem:

(a + b)^n = \sum_{r=0}^{n} \binom{n}{r} a^r b^{n-r}.   (1-6)

The numbers \binom{n}{r} are thus called binomial coefficients.


Returning briefly to the definition of random sampling from a finite population without replacement, there were N objects with n to be selected. There are thus \binom{N}{n} different samples. If the sampling process is random, each possible outcome has probability 1/\binom{N}{n} of being the one selected.
Two identities that are often helpful in problem solutions are

\binom{n}{r} = \binom{n}{n-r}   (1-7)

and

\binom{n}{r} = \binom{n-1}{r-1} + \binom{n-1}{r}.   (1-8)

To develop the result shown in equation 1-7, we note that

\binom{n}{n-r} = \frac{n!}{(n-r)![n-(n-r)]!} = \frac{n!}{r!(n-r)!} = \binom{n}{r},

and to develop the result shown in equation 1-8, we expand the right-hand side and collect terms.
To verify that a finite collection of n elements has 2^n subsets, as indicated earlier, we see that

2^n = (1+1)^n = \sum_{r=0}^{n} \binom{n}{r} = \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{n}

from equation 1-6. The right side of this relationship gives the total number of subsets, since \binom{n}{0} is the number of subsets with 0 elements, \binom{n}{1} is the number with one element, ..., and \binom{n}{n} is the number with n elements.

Example 1-30
A production lot of size 100 is known to be 5% defective. A random sample of 10 items is selected without replacement. In order to determine the probability that there will be no defectives in the sample, we resort to counting both the number of possible samples and the number of samples favorable to event A, where event A is taken to mean that there are no defectives. The number of possible samples is \binom{100}{10} = \frac{100!}{10!90!}. The number "favorable to A" is \binom{5}{0}\binom{95}{10}, so that

P(A) = \frac{\binom{5}{0}\binom{95}{10}}{\binom{100}{10}} = \frac{\frac{5!}{0!5!} \cdot \frac{95!}{10!85!}}{\frac{100!}{10!90!}} = 0.58375.
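The arithmetic above is easy to verify with integer binomial coefficients (math.comb, available in Python 3.8+):

```python
# P(no defectives) = C(5,0) * C(95,10) / C(100,10) for the lot-sampling example.
from math import comb

p_a = comb(5, 0) * comb(95, 10) / comb(100, 10)
print(round(p_a, 5))  # 0.58375
```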

To generalize the preceding example, we consider the case where the population has N items of which D belong to some class of interest (such as defective). A random sample of size n is selected without replacement, and if A denotes the event of obtaining exactly r items from the class of interest in the sample, then

P(A) = \frac{\binom{D}{r}\binom{N-D}{n-r}}{\binom{N}{n}},   r = 0, 1, 2, ..., min(n, D).

Problems of this type are often referred to as hypergeometric sampling problems.

Example 1-31
An NBA basketball team typically has 12 players. A starting team consists of five of these players in no particular order. Thus, there are \binom{12}{5} = 792 possible starting teams.

Example 1-32
One obvious application of counting methods lies in calculating probabilities for poker hands. Before proceeding, we remind the reader of some standard terminology. The rank of a particular card drawn from a standard 52-card deck can be 2, 3, ..., Q, K, A, while the possible suits are ♣, ♦, ♥, ♠. In poker, we draw five cards at random from a deck. The number of possible hands is \binom{52}{5} = 2,598,960.

1. We first calculate the probability of obtaining two pairs, for example A♥, A♠, 3♥, 3♦, 10♠. We proceed as follows.
(a) Select two ranks (e.g., A, 3). We can do this \binom{13}{2} ways.
(b) Select two suits for the first pair (e.g., ♥, ♠). There are \binom{4}{2} ways.
(c) Select two suits for the second pair (e.g., ♥, ♦). There are \binom{4}{2} ways.
(d) Select the remaining card to complete the hand. There are 44 ways.
Thus, the number of ways to select two pairs is

n(two pairs) = \binom{13}{2}\binom{4}{2}\binom{4}{2}(44) = 123,552,

and so

P(two pairs) = 123,552/2,598,960 ≈ 0.0475.

2. Here we calculate the probability of obtaining a full house (one pair, one three-of-a-kind), for example A♥, A♣, 3♥, 3♦, 3♠.
(a) Select two ordered ranks (e.g., A, 3). There are P_2^13 ways. Indeed, the ranks must be ordered since "three A's, two 3's" differs from "two A's, three 3's."
(b) Select two suits for the pair (e.g., ♥, ♣). There are \binom{4}{2} ways.
(c) Select three suits for the three-of-a-kind (e.g., ♥, ♦, ♠). There are \binom{4}{3} ways.
Thus, the number of ways to select a full house is

n(full house) = P_2^13 \binom{4}{2}\binom{4}{3} = 3744,

and so

P(full house) = \frac{3744}{2,598,960} ≈ 0.00144.

3. Finally, we calculate the probability of a flush (all five cards from the same suit). How many ways can we obtain this event?
(a) Select a suit. There are \binom{4}{1} ways.
(b) Select five cards from that suit. There are \binom{13}{5} ways.
Then we have

P(flush) = \frac{\binom{4}{1}\binom{13}{5}}{\binom{52}{5}} = \frac{5148}{2,598,960} ≈ 0.00198.
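All three poker probabilities can be reproduced with a few lines of Python; the sketch below is a direct transcription of the counting arguments above, not an independent derivation.

```python
# Verifying the two-pairs, full-house, and flush counts with math.comb / math.perm.
from math import comb, perm

total_hands = comb(52, 5)                                 # 2,598,960
two_pairs = comb(13, 2) * comb(4, 2) * comb(4, 2) * 44    # 123,552
full_house = perm(13, 2) * comb(4, 2) * comb(4, 3)        # 3,744
flush = comb(4, 1) * comb(13, 5)                          # 5,148

for name, count in (("two pairs", two_pairs),
                    ("full house", full_house),
                    ("flush", flush)):
    print(f"P({name}) = {count}/{total_hands} = {count / total_hands:.5f}")
```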

1-6.5 Permutations of Like Objects

In the event that there are k distinct classes of objects, and the objects within the classes are not distinct, the following result is obtained, where n1 is the number in the first class, n2 is the number in the second class, ..., nk is the number in the kth class, and n1 + n2 + ... + nk = n:

\frac{n!}{n_1! n_2! \cdots n_k!}.   (1-10)

Example 1-33
Consider the word "TENNESSEE." The number of ways to arrange the letters in this word is

\frac{n!}{n_T! n_E! n_N! n_S!} = \frac{9!}{1!4!2!2!} = 3780,

where we use the obvious notation. Problems of this type are often referred to as multinomial sampling problems.
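A short sketch confirms the count; Counter tallies the letter frequencies and math.prod (Python 3.8+) forms the denominator.

```python
# Distinct arrangements of "TENNESSEE": n!/(n_T! n_E! n_N! n_S!).
from collections import Counter
from math import factorial, prod

word = "TENNESSEE"
counts = Counter(word)  # {'E': 4, 'N': 2, 'S': 2, 'T': 1}
arrangements = factorial(len(word)) // prod(factorial(c) for c in counts.values())
print(arrangements)     # 3780
```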

The counting methods presented in this section are primarily to support probability
assignment where there is a finite number of equally likely outcomes. It is important to
remember that this is a special case of the more general types of problems encountered in
probability applications.

1-7 CONDITIONAL PROBABILITY


As noted in Section 1-4, an event is associated with a sample space, and the event is represented by a subset of 𝒮. The probabilities discussed in Section 1-5 all relate to the entire sample space. We have used the symbol P(A) to denote the probability of these events; however, we could have used the symbol P(A|𝒮), read as "the probability of A, given the sample space 𝒮." In this section we consider the probability of events where the event is conditioned on some subset of the sample space.
Some illustrations of this idea should be helpful. Consider a group of 100 persons of whom 40 are college graduates, 20 are self-employed, and 10 are both college graduates and self-employed. Let B represent the set of college graduates and A represent the set of self-employed, so that A ∩ B is the set of college graduates who are self-employed. From the group of 100, one person is to be randomly selected. (Each person is given a number from 1 to 100, and 100 chips with the same numbers are agitated, with one being selected by a blindfolded outsider.) Then P(A) = 0.2, P(B) = 0.4, and P(A ∩ B) = 0.1 if the entire sample space is considered. As noted, it may be more instructive to write P(A|𝒮), P(B|𝒮), and P(A ∩ B|𝒮) in such a case. Now suppose the following event is considered: self-employed given that the person is a college graduate (A|B). Obviously the sample space is reduced in that only college graduates are considered (Fig. 1-8). The probability P(A|B) is thus given by

P(A|B) = \frac{n(A ∩ B)}{n(B)} = \frac{n(A ∩ B)/n}{n(B)/n} = \frac{P(A ∩ B)}{P(B)} = \frac{0.1}{0.4} = 0.25.

The reduced sample space consists of the set of all subsets of 𝒮 that belong to B. Of course, A ∩ B satisfies the condition.
As a second illustration, consider the case where a sample of size 2 is randomly selected from a lot of size 10. It is known that the lot has seven good and three bad items. Let A be the event that the first item selected is good, and B be the event that the second item selected is good. If the items are selected without replacement, that is, the first item is not replaced before the second item is selected, then

P(A) = 7/10

and

P(B|A) = 6/9.

If the first item is replaced before the second item is selected, the conditional probability P(B|A) = P(B) = 7/10, and the events A and B resulting from the two selection experiments comprising ℰ are said to be independent. A formal definition of conditional probability

Figure 1-8 Conditional probability. (a) Initial sample space. (b) Reduced sample space.

P(A|B) will be given later, and independence will be discussed in detail. The following examples will help to develop some intuitive feeling for conditional probability.

Example 1-34
Recall Example 1-11 where two dice are tossed, and assume that each die is true. The 36 possible outcomes were enumerated. If we consider two events,

A = {(d1, d2): d1 + d2 = 10},   B = {(d1, d2): d2 ≥ d1},

where d1 is the value of the up face of the first die and d2 is the value of the up face of the second die, then P(A) = 3/36, P(B) = 21/36, P(A ∩ B) = 2/36, P(B|A) = 2/3, and P(A|B) = 2/21. The probabilities were obtained from a direct consideration of the sample space and the counting of outcomes. Note that

P(A|B) = \frac{P(A ∩ B)}{P(B)}  and  P(B|A) = \frac{P(A ∩ B)}{P(A)}.

Example 1-35
In World War II an early operations research effort in England was directed at establishing search patterns for U-boats by patrol flights or sorties. For a time, there was a tendency to concentrate the flights on in-shore areas, as it had been believed that more sightings took place in-shore. The research group studied 1000 sortie records with the following result (the data are fictitious):

               In-shore   Off-shore   Total
Sighting           80         20        100
No sighting       820         80        900
Total sorties     900        100       1000

Let S1: There was a sighting.
S2: There was no sighting.
B1: In-shore sortie.
B2: Off-shore sortie.

We see immediately that

P(S1|B1) = 80/900 ≈ 0.0889,
P(S1|B2) = 20/100 = 0.20,

which indicates a search strategy counter to prior practice.
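The two conditional probabilities come straight from the row and column totals of the table; a minimal sketch:

```python
# Conditional probabilities of a sighting, computed from the 1000-sortie table.
counts = {
    ("sighting", "in-shore"): 80,     ("sighting", "off-shore"): 20,
    ("no sighting", "in-shore"): 820, ("no sighting", "off-shore"): 80,
}

def p_sighting_given(area):
    total = sum(v for (outcome, a), v in counts.items() if a == area)
    return counts[("sighting", area)] / total

print(round(p_sighting_given("in-shore"), 4))   # 0.0889
print(round(p_sighting_given("off-shore"), 4))  # 0.2
```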

Definition
We may define the conditional probability of event A given event B as

P(A|B) = \frac{P(A ∩ B)}{P(B)}  if P(B) > 0.   (1-11)

This definition results from the intuitive notion presented in the preceding discussion. The conditional probability P(·|·) satisfies the properties required of probabilities. That is,

1. 0" P(AIB) ,a
2. NflB) = 1.

3. P[~AiIB) = ~P(A,IB) if A, n Aj = 0 for i * j.


4. P[QAiI B = J ~P(A,IB)
for Ab A2 ,A3 , •.• a denumerable sequence of disjoint events.
In practice we may solve problems by using equation 1-11 and calculating P(A ∩ B) and P(B) with respect to the original sample space (as was illustrated in Example 1-35) or by considering the probability of A with respect to the reduced sample space B (as was illustrated in Example 1-34).
A restatement of equation 1-11 leads to what is often called the multiplication rule, that is,

P(A ∩ B) = P(B) · P(A|B),  P(B) > 0,

and

P(A ∩ B) = P(A) · P(B|A),  P(A) > 0.   (1-12)

The second statement is an obvious consequence of equation 1-11 with the conditioning on event A rather than event B.
It should be noted that if A and B are mutually exclusive as indicated in Fig. 1-9, then A ∩ B = ∅, so that P(A|B) = 0 and P(B|A) = 0.
In the other extreme, if B ⊂ A, as shown in Fig. 1-10, then P(A|B) = 1. In the first case, A and B cannot occur simultaneously, so knowledge of the occurrence of B tells us that A does not occur. In the second case, if B occurs, A must occur. On the other hand, there are many cases where the events are totally unrelated, and knowledge of the occurrence of one has no bearing on and yields no information about the other. Consider, for example, the experiment where a true coin is tossed twice. Event A is the event that the first toss results in a "heads," and event B is the event that the second toss results in a "heads." Note that P(A) = 1/2, since the coin is true, and P(B|A) = 1/2, since the coin is true and it has no memory. The occurrence of event A did not in any way affect the occurrence of B, and if we wanted to find the probability of A and B occurring, that is, P(A ∩ B), we find that

P(A ∩ B) = P(A) · P(B|A) = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}.

We may observe that if we had no knowledge about the occurrence or nonoccurrence of A, we have P(B) = P(B|A), as in this example.

Figure 1-9 Mutually exclusive events.  Figure 1-10 Event B as a subset of A.

Informally speaking, two events are considered to be independent if the probability of the occurrence of one is not affected by the occurrence or nonoccurrence of the other. This leads to the following definition.

Definition
A and B are independent if and only if

P(A ∩ B) = P(A) · P(B).   (1-13)

An immediate consequence of this definition is that if A and B are independent events, then from equation 1-12,

P(A|B) = P(A)  and  P(B|A) = P(B).   (1-14)
The following theorem is sometimes useful. The proof is given here only for the first part.

Theorem 1-7
If A and B are independent events, then the following holds:
1. A and B̄ are independent events.
2. Ā and B are independent events.
3. Ā and B̄ are independent events.

Proof (part 1)
P(A ∩ B̄) = P(A) · P(B̄|A)
= P(A) · [1 − P(B|A)]
= P(A) · [1 − P(B)]
= P(A) · P(B̄).
In practice, there are many situations where it may not be easy to determine whether two events are independent; however, there are numerous other cases where the requirements may be either justified or approximated from a physical consideration of the experiment. A sampling experiment will serve to illustrate.

Example 1-36
Suppose a random sample of size 2 is to be selected from a lot of size 100, and it is known that 98 of the 100 items are good. The sample is taken in such a manner that the first item is observed and replaced before the second item is selected. If we let

A: First item observed is good,
B: Second item observed is good,

and if we want to determine the probability that both items are good, then

P(A ∩ B) = P(A) · P(B) = \frac{98}{100} \cdot \frac{98}{100} = 0.9604.

If the sample is taken "without replacement" so that the first item is not replaced before the second item is selected, then

P(A ∩ B) = P(A) · P(B|A) = \frac{98}{100} \cdot \frac{97}{99} ≈ 0.9602.

The results are obviously very close, and one common practice is to assume the events independent when the sampling fraction (sample size/population size) is small, say less than 0.1.

Example 1-37
The field of reliability engineering has developed rapidly since the early 1960s. One type of problem encountered is that of estimating system reliability given subsystem reliabilities. Reliability is defined here as the probability of proper functioning for a stated period of time. Consider the structure of a simple serial system, shown in Fig. 1-11. The system functions if and only if both subsystems function. If the subsystems survive independently, then

R_s = R1 · R2,

where R1 and R2 are the reliabilities for subsystems 1 and 2, respectively. For example, if R1 = 0.90 and R2 = 0.80, then R_s = 0.72.
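Under the independence assumption, system reliability is just a product; a one-line helper makes the point (this anticipates equation 1-15 below).

```python
# Serial-system reliability under mutually independent subsystems.
from math import prod

def series_reliability(reliabilities):
    return prod(reliabilities)

print(round(series_reliability([0.90, 0.80]), 2))  # 0.72, as in Example 1-37
```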

Example 1-37 illustrates the need to generalize the concept of independence to more than two events. Suppose the system consisted of three subsystems or perhaps 20 subsystems. What conditions would be required in order to allow the analyst to obtain an estimate of system reliability by obtaining the product of the subsystem reliabilities?

Definition
The k events A1, A2, ..., Ak are mutually independent if and only if the probability of the intersection of any 2, 3, ..., k of these sets is the product of their respective probabilities. Stated more precisely, we require that for r = 2, 3, ..., k,

P(A_{i1} ∩ A_{i2} ∩ ... ∩ A_{ir}) = P(A_{i1}) · P(A_{i2}) ··· P(A_{ir}) = \prod_{j=1}^{r} P(A_{ij}).

In the case of serial system reliability calculations where mutual independence may reasonably be assumed, the system reliability is a product of subsystem reliabilities:

R_s = R1 · R2 ··· Rk.   (1-15)

In the foregoing definition there are 2^k − k − 1 conditions to be satisfied. Consider three events, A, B, and C. These are independent if and only if P(A ∩ B) = P(A) · P(B), P(A ∩ C) = P(A) · P(C), P(B ∩ C) = P(B) · P(C), and P(A ∩ B ∩ C) = P(A) · P(B) · P(C). The following example illustrates a case where events are pairwise independent but not mutually independent.

Figure 1-11 A simple serial system.



Example 1-38
Suppose the sample space, with equally likely outcomes, for a particular experiment is as follows:

𝒮 = {(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)}.

Let A0: First digit is zero.  A1: First digit is one.
B0: Second digit is zero.  B1: Second digit is one.
C0: Third digit is zero.  C1: Third digit is one.

It follows that

P(Ai) = P(Bi) = P(Ci) = 1/2,  i = 0, 1,

and it is easily seen that

P(Ai ∩ Bj) = 1/4 = P(Ai) · P(Bj),  i = 0, 1, j = 0, 1,
P(Ai ∩ Cj) = 1/4 = P(Ai) · P(Cj),
P(Bi ∩ Cj) = 1/4 = P(Bi) · P(Cj).

However, we note that

P(A0 ∩ B0 ∩ C0) = 1/4 ≠ P(A0) · P(B0) · P(C0),
P(A0 ∩ B0 ∩ C1) = 0 ≠ P(A0) · P(B0) · P(C1),

and there are other triplets to which this violation could be extended.

The concept of independent experiments is introduced to complete this section. If we consider two experiments denoted ℰ1 and ℰ2 and let A1 and A2 be arbitrary events defined on the respective sample spaces 𝒮1 and 𝒮2 of the two experiments, then the following definition can be given.

Definition
If P(A1 ∩ A2) = P(A1) · P(A2), then ℰ1 and ℰ2 are said to be independent experiments.

1-8 PARTITIONS, TOTAL PROBABILITY, AND BAYES' THEOREM


A partition of the sample space may be defined as follows.

Definition
The events B1, B2, ..., Bk form a partition of the sample space 𝒮 if Bi ∩ Bj = ∅ for i ≠ j and B1 ∪ B2 ∪ ... ∪ Bk = 𝒮.

When the experiment is performed, one and only one of the events Bi occurs if we have a partition of 𝒮.

Example 1-39
A particular binary "word" consists of five "bits," b1, b2, b3, b4, b5, where bi = 0, 1 for i = 1, 2, 3, 4, 5. An experiment consists of transmitting a "word," and it follows that there are 32 possible words. If the events are

B1 = {(0, 0, 0, 0, 0), (0, 0, 0, 0, 1)},
B2 = {(0, 0, 0, 1, 0), (0, 0, 0, 1, 1), (0, 0, 1, 0, 0), (0, 0, 1, 0, 1), (0, 0, 1, 1, 0), (0, 0, 1, 1, 1)},
B3 = {(0, 1, 0, 0, 0), (0, 1, 0, 0, 1), (0, 1, 0, 1, 0), (0, 1, 0, 1, 1), (0, 1, 1, 0, 0), (0, 1, 1, 0, 1), (0, 1, 1, 1, 0), (0, 1, 1, 1, 1)},
B4 = {(1, 0, 0, 0, 0), (1, 0, 0, 0, 1), (1, 0, 0, 1, 0), (1, 0, 0, 1, 1), (1, 0, 1, 0, 0), (1, 0, 1, 0, 1), (1, 0, 1, 1, 0), (1, 0, 1, 1, 1)},
B5 = {(1, 1, 0, 0, 0), (1, 1, 0, 0, 1), (1, 1, 0, 1, 0), (1, 1, 0, 1, 1), (1, 1, 1, 0, 0), (1, 1, 1, 0, 1), (1, 1, 1, 1, 0)},
B6 = {(1, 1, 1, 1, 1)},

then B1, B2, ..., B6 form a partition of 𝒮, accounting for 2 + 6 + 8 + 8 + 7 + 1 = 32 outcomes in all.

In general, if k events Bi (i = 1, 2, ..., k) form a partition and A is an arbitrary event with respect to 𝒮, then we may write

A = (A ∩ B1) ∪ (A ∩ B2) ∪ ... ∪ (A ∩ Bk),

so that

P(A) = P(A ∩ B1) + P(A ∩ B2) + ... + P(A ∩ Bk),

since the events (A ∩ Bi) are pairwise mutually exclusive. (See Fig. 1-12 for k = 4.) It does not matter that A ∩ Bi = ∅ for some or all of the i, since P(∅) = 0.
Using the results of equation 1-12 we can state the following theorem.

Theorem 1-8
If B1, B2, ..., Bk represents a partition of 𝒮 and A is an arbitrary event on 𝒮, then the total probability of A is given by

P(A) = P(B1) · P(A|B1) + P(B2) · P(A|B2) + ... + P(Bk) · P(A|Bk) = \sum_{i=1}^{k} P(Bi) · P(A|Bi).

The result of Theorem 1-8, also known as the law of total probability, is very useful, as there are numerous practical situations in which P(A) cannot be computed directly. However,

with the information that Bi has occurred, it is possible to evaluate P(A|Bi) and thus determine P(A) when the values P(Bi) are obtained.

Theorem 1-9
If B1, B2, ..., Bk constitute a partition of the sample space 𝒮 and A is an arbitrary event on 𝒮, then for r = 1, 2, ..., k,

P(Br|A) = \frac{P(Br) · P(A|Br)}{\sum_{i=1}^{k} P(Bi) · P(A|Bi)}.   (1-16)

Proof

P(Br|A) = \frac{P(Br ∩ A)}{P(A)} = \frac{P(Br) · P(A|Br)}{\sum_{i=1}^{k} P(Bi) · P(A|Bi)}.

The numerator is a result of equation 1-12 and the denominator is a result of Theorem 1-8.

Example 1-40
Three facilities supply microprocessors to a manufacturer of telemetry equipment. All are supposedly made to the same specifications. However, the manufacturer has for several years tested the microprocessors, and records indicate the following information:

Supplying Facility   Fraction Defective   Fraction Supplied
1                          0.02                 0.15
2                          0.01                 0.80
3                          0.03                 0.05

The manufacturer has stopped testing because of the costs involved, and it may be reasonably assumed that the fractions that are defective and the inventory mix are the same as during the period of record keeping. The director of manufacturing randomly selects a microprocessor, takes it to the test department, and finds that it is defective. If we let A be the event that an item is defective, and Bi be the event that the item came from facility i (i = 1, 2, 3), then we can evaluate P(Bi|A). Suppose, for instance, that we are interested in determining P(B3|A). Then

P(B3|A) = \frac{P(B3) · P(A|B3)}{P(B1) · P(A|B1) + P(B2) · P(A|B2) + P(B3) · P(A|B3)} = \frac{(0.05)(0.03)}{(0.15)(0.02) + (0.80)(0.01) + (0.05)(0.03)} = \frac{3}{25}.
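The posterior computation in Example 1-40 follows the Bayes' theorem template directly; a minimal sketch:

```python
# Bayes' theorem for the microprocessor example: P(B3 | defective).
supplied = {1: 0.15, 2: 0.80, 3: 0.05}   # P(Bi), fraction supplied
defective = {1: 0.02, 2: 0.01, 3: 0.03}  # P(A | Bi), fraction defective

p_a = sum(supplied[i] * defective[i] for i in supplied)  # total probability, 0.0125
posterior = supplied[3] * defective[3] / p_a
print(round(posterior, 2))  # 0.12 = 3/25
```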

1-9 SUMMARY
This chapter has introduced the concept of random experiments, the sample space and events, and has presented a formal definition of probability. This was followed by methods to assign probability to events. Theorems 1-1 to 1-6 provide results for dealing with the probability of special events. Finite sample spaces with their special properties were discussed, and enumeration methods were reviewed for use in assigning probability to events in the case of equally likely experimental outcomes. Conditional probability was defined and illustrated, along with the concept of independent events. In addition, we considered partitions of the sample space, total probability, and Bayes' theorem. The concepts presented in this chapter form an important background for the rest of the book.

1-10 EXERCISES
1-1. Television sets are given a final inspection following assembly. Three types of defects are identified, critical, major, and minor defects, and are coded A, B, and C, respectively, by a mail-order house. Data are analyzed with the following results.

Sets having only critical defects: 2%
Sets having only major defects: 5%
Sets having only minor defects: 7%
Sets having only critical and major defects: 3%
Sets having only critical and minor defects: 4%
Sets having only major and minor defects: 3%
Sets having all three types of defects: 1%

(a) What fraction of the sets has no defects?
(b) Sets with either critical defects or major defects (or both) get a complete rework. What fraction falls in this category?

1-2. Illustrate the following properties by shadings or colors on Venn diagrams.
(a) Associative laws: A ∪ (B ∪ C) = (A ∪ B) ∪ C, A ∩ (B ∩ C) = (A ∩ B) ∩ C.
(b) Distributive laws: A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C), A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
(c) If A ⊂ B, then A ∩ B = A.
(d) If A ⊂ B, then A ∪ B = B.
(e) If A ∩ B = ∅, then A ⊂ B̄.
(f) If A ⊂ B and B ⊂ C, then A ⊂ C.

1-3. Consider a universal set consisting of the integers 1 through 10, or U = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. Let A = {2, 3, 4}, B = {3, 4, 5}, and C = {5, 6, 7}. By enumeration, list the membership of the following sets.
(a) A ∩ B.
(b) A ∪ B.
(c) Ā ∩ B.
(d) A ∩ (B ∩ C).
(e) A ∩ (B ∪ C).

1-4. A flexible circuit is selected at random from a production run of 1000 circuits. Manufacturing defects are classified into three different types, labeled A, B, and C. Type A defects occur 2% of the time, type B defects occur 1% of the time, and type C defects occur 1.5% of the time. Furthermore, it is known that 0.5% have both type A and B defects, 0.6% have both A and C defects, and 0.4% have B and C defects, while 0.2% have all three defects. What is the probability that the flexible circuit selected has at least one of the three types of defects?

1-5. In a human-factors laboratory, the reaction times of human subjects are measured as the elapsed time from the instant a position number is displayed on a digital display until the subject presses a button located at the position indicated. Two subjects are involved, and times are measured in seconds for each subject (t1, t2). What is the sample space for this experiment? Present the following events as subsets and mark them on a diagram: (t1 + t2)/2 ≤ 0.15, max(t1, t2) ≤ 0.15, |t1 − t2| ≤ 0.06.

1-6. During a 24-hour period, a computer is to be accessed at time X, used for some processing, and exited at time Y ≥ X. Take X and Y to be measured in hours on the time line with the beginning of the 24-hour period as the origin. The experiment is to observe X and Y.
(a) Describe the sample space 𝒮.
(b) Sketch the following events in the X, Y plane.
(i) The time of use is 1 hour or less.
(ii) The access is before t1 and the exit is after t2, where 0 ≤ t1 < t2 ≤ 24.
(iii) The time of use is less than 20% of the period.

1-7. Diodes from a batch are tested one at a time and marked either defective or nondefective. This is continued until either two defective items are found or five items have been tested. Describe the sample space for this experiment.

1-8. A set has four elements, A = {a, b, c, d}. Describe the power set {0, 1}^A.

1-9. Describe the sample space for each of the following experiments.
(a) A lot of 120 battery lids for pacemaker cells is known to contain a number of defectives because of a problem with the barrier material applied to the glassed-in feed-through. Three lids are randomly selected (without replacement) and are carefully inspected following a cut down.
(b) A pallet of 10 castings is known to contain one defective and nine good units. Four castings are randomly selected (without replacement) and inspected.

1-10. The production manager of a certain company is interested in testing a finished product, which is available in lots of size 50. She would like to rework a lot if she can be reasonably sure that 10% of the items are defective. She decides to select a random sample of 10 items without replacement and rework the lot if it contains one or more defective items. Does this procedure seem reasonable?

1-11. A trucking firm has a contract to ship a load of goods from city W to city Z. There are no direct routes connecting W to Z, but there are six roads from W to X and five roads from X to Z. How many total routes are there to be considered?

1-12. A state has one million registered vehicles and is considering using license plates with six symbols where the first three are letters and the last three are digits. Is this scheme feasible?

1-13. The manager of a small plant wishes to determine the number of ways he can assign workers to the first shift. He has 15 workers who can serve as operators of the production equipment, eight who can serve as maintenance personnel, and four who can be supervisors. If the shift requires six operators, two maintenance personnel, and one supervisor, how many ways can the first shift be manned?

1-14. A production lot has 100 units of which 20 are known to be defective. A random sample of four units is selected without replacement. What is the probability that the sample will contain no more than two defective units?

1-15. In inspecting incoming lots of merchandise, the following inspection rule is used where the lots contain 300 units. A random sample of 10 items is selected. If there is no more than one defective item in the sample, the lot is accepted. Otherwise it is returned to the vendor. If the fraction defective in the original lot is p′, determine the probability of accepting the lot as a function of p′.

1-16. In a plastics plant 12 pipes empty different chemicals into a mixing vat. Each pipe has a five-position gauge that measures the rate of flow into the vat. One day, while experimenting with various mixtures, a solution is obtained that emits a poisonous gas. The settings on the gauges were not recorded. What is the probability of obtaining this same solution when randomly experimenting again?

1-17. Eight equally skilled men and women are applying for two jobs. Because the two new employees must work closely together, their personalities should be compatible. To achieve this, the personnel manager has administered a test and must compare the scores for each possibility. How many comparisons must the manager make?

1-18. By accident, a chemist combined two laboratory substances that yielded a desirable product. Unfortunately, her assistant did not record the names of the ingredients. There are forty substances available in the lab. If the two in question must be located by successive trial-and-error experiments, what is the maximum number of tests that might be made?

1-19. Suppose, in the previous problem, a known catalyst was used in the first accidental reaction. Because of this, the order in which the ingredients are mixed is important. What is the maximum number of tests that might be made?

1-20. A company plans to build five additional warehouses at new locations. Ten locations are under consideration. How many total possible choices are there for the set of five locations?

1-21. Washing machines can have five kinds of major and five kinds of minor defects. In how many ways can one major and one minor defect occur? In how many ways can two major and two minor defects occur?

1-22. Consider the diagram at the top of the next page of an electronic system, which shows the probabilities of the system components operating properly. The entire system operates if assembly III and at least one of the components in each of assemblies I and II operates. Assume that the components of each assembly operate independently and that the assemblies operate independently. What is the probability that the entire system operates?

(Figure: electronic system with assemblies I, II, and III.)

1-23. How is the probability of system operation affected if, in the foregoing problem, the probability of successful operation for the component in assembly III changes from 0.99 to 0.9?

1-24. Consider the series-parallel assembly shown below. The values Ri (i = 1, 2, 3, 4, 5) are the reliabilities for the five components shown, that is, Ri = probability that unit i will function properly. The components operate (and fail) in a mutually independent manner and the assembly fails only when the path from A to B is broken. Express the assembly reliability as a function of R1, R2, R3, R4, and R5.

(Figure: series-parallel assembly from A to B.)

1-25. A political prisoner is to be exiled to either Siberia or the Urals. The probabilities of being sent to these places are 0.6 and 0.4, respectively. It is also known that if a resident of Siberia is selected at random the probability is 0.5 that he will be wearing a fur coat, whereas the probability is 0.7 that a resident of the Urals will be wearing one. Upon arriving in exile, the first person the prisoner sees is not wearing a fur coat. What is the probability he is in Siberia?

1-26. A braking device designed to prevent automobile skids may be broken down into three series subsystems that operate independently: an electronics system, a hydraulic system, and a mechanical activator. On a particular braking, the reliabilities of these units are approximately 0.995, 0.993, and 0.994, respectively. Estimate the system reliability.

1-27. Two balls are drawn from an urn containing m balls numbered from 1 to m. The first ball is kept if it is numbered 1 and returned to the urn otherwise. What is the probability that the second ball drawn is numbered 2?

1-28. Two digits are selected at random from the digits 1 through 9 and the selection is without replacement (the same digit cannot be picked on both selections). If the sum of the two digits is even, find the probability that both digits are odd.

1-29. At a certain university, 20% of the men and 1% of the women are over 6 feet tall. Furthermore, 40% of the students are women. If a student is randomly picked and is observed to be over 6 feet tall, what is the probability that the student is a woman?

1-30. At a machine center there are four automatic screw machines. An analysis of past inspection records yields the following data.

Machine   Percent Production   Percent Defectives Produced
1               15                        4
2               30                        3
3               20                        5
4               35                        2

Machines 2 and 4 are newer and more production has been assigned to them than to machines 1 and 3. Assume that the current inventory mix reflects the production percentages indicated.
(a) If a screw is randomly picked from inventory, what is the probability that it will be defective?
(b) If a screw is picked and found to be defective, what is the probability that it was produced by machine 3?

1-31. A point is selected at random inside a circle. What is the probability that the point is closer to the center than to the circumference?

1-32. Complete the details of the proof for Theorem 1-4 in the text.

1-33. Prove Theorem 1-5.

1-34. Prove the second and third parts of Theorem 1-7.

1-35. Suppose there are n people in a room. If a list is made of all their birthdays (the specific month and day of the month), what is the probability that two or more persons have the same birthday? Assume there are 365 days in the year and that each day is equally likely to occur for any person's birthday. Let B be the event that two or more persons have the same birthday. Find P(B) and P(B̄) for n = 10, 20, 21, 22, 23, 24, 25, 30, 40, 50, and 60.

1-36. In a certain dice game, players continue to throw two dice until they either win or lose. The player wins on the first throw if the sum of the two upturned faces is either 7 or 11 and loses if the sum is 2, 3, or 12. Otherwise, the sum of the faces becomes the player's "point." The player continues to throw until the first succeeding throw on which he makes his point (in which case he wins), or until he throws a 7 (in which case he loses). What is the probability that the player with the dice will eventually win the game?

1-37. The industrial engineering department of the XYZ Company is performing a work sampling study on eight technicians. The engineer wishes to randomize the order in which he visits the technicians' work areas. In how many ways may he arrange these visits?

1-38. A hiker leaves point A shown in the figure below, choosing at random one path from AB, AC, AD, and AE. At each subsequent junction she chooses another path at random. What is the probability that she arrives at point X?

(Figure: network of paths from A, through intermediate junctions B, C, D, and E, to terminal points including X.)

1-39. Three printers do work for the publications office of Georgia Tech. The publications office does not negotiate a contract penalty for late work, and the data below reflect a large amount of experience with these printers.

Printer i   Fraction of Contracts Held   Fraction of Deliveries More than One Month Late
1                    0.2                               0.2
2                    0.3                               0.5
3                    0.5                               0.3

A department observes that its recruiting booklet is more than a month late. What is the probability that the contract is held by printer 3?

1-40. Following aircraft accidents, a detailed investigation is conducted. The probability that an accident due to structural failure is correctly identified is 0.9, and the probability that an accident that is not due to structural failure is incorrectly identified as being due to structural failure is 0.2. If 25% of all aircraft accidents are due to structural failures, find the probability that an aircraft accident is due to structural failure given that it has been diagnosed as due to structural failure.
Chapter 2

One-Dimensional Random Variables

2-1 INTRODUCTION

The objectives of this chapter are to introduce the concept of random variables, to define and illustrate probability distributions and cumulative distribution functions, and to present useful characterizations for random variables.
When describing the sample space of a random experiment, it is not necessary to specify that an individual outcome be a number. In several examples we observe this, such as in Example 1-9, where a true coin is tossed three times and the sample space is 𝒮 = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}, or Example 1-15, where probes of solder joints yield a sample space 𝒮 = {GG, GD, DG, DD}.
In most experimental situations, however, we are interested in numerical outcomes. For example, in the illustration involving coin tossing we might assign some real number x to every element of the sample space. In general, we want to assign a real number x to every outcome e of the sample space 𝒮. A functional notation will be used initially, so that x = X(e), where X is the function. The domain of X is 𝒮, and the numbers in the range are real numbers. The function X is called a random variable. Figure 2-1 illustrates the nature of this function.

Definition
If ℰ is an experiment having sample space 𝒮, and X is a function that assigns a real number X(e) to every outcome e ∈ 𝒮, then X(e) is called a random variable.

Example 2-1
Consider the coin-tossing experiment discussed in the preceding paragraphs. If X is the number of heads showing, then X(HHH) = 3, X(HHT) = 2, X(HTH) = 2, X(HTT) = 1, X(THH) = 2, X(THT) = 1, X(TTH) = 1, and X(TTT) = 0. The range space is R_X = {x: x = 0, 1, 2, 3} in this example (see Fig. 2-2).

Figure 2-1 The concept of a random variable. (a) 𝒮: The sample space of ℰ. (b) R_X: The range space of X.

Figure 2-2 The number of heads in three coin tosses.

The reader should recall that for all functions and for every element in the domain, there is exactly one value in the range. In the case of the random variable, for every outcome e ∈ 𝒮 there corresponds exactly one value X(e). It should be noted that different values of e may lead to the same x, as was the case where X(TTH) = 1, X(THT) = 1, and X(HTT) = 1 in the preceding example.
Where the outcome in 𝒮 is already the numerical characteristic desired, then X(e) = e, the identity function. Example 1-13, in which a cathode ray tube was aged to failure, is a good example. Recall that 𝒮 = {t: t ≥ 0}. If X is the time to failure, then X(t) = t. Some authors call this type of sample space a numerical-valued phenomenon.
The range space, R_X, is made up of all possible values of X, and in subsequent work it will not be necessary to indicate the functional nature of X. Here we are concerned with events that are associated with R_X, and the random variable X will induce probabilities onto these events. If we return again to the coin-tossing experiment for illustration and assume the coin to be true, there are eight equally likely outcomes: HHH, HHT, HTH, HTT, THH, THT, TTH, and TTT, each having probability 1/8. Now suppose A is the event "exactly two heads" and, as previously, we let X represent the number of heads (see Fig. 2-2). The event (X = 2) relates to R_X, not 𝒮; however, P_X(X = 2) = P(A) = 3/8, since A = {HHT, HTH, THH} is the equivalent event in 𝒮, and probability was defined on events in the sample space. The random variable X induced the probability of 3/8 to the event (X = 2). Note that parentheses will be used to denote an event in the range of the random variable, and in general we will write P_X(X = x).
In order to generalize this notion, consider the following definition.

Definition
If 𝒮 is the sample space of an experiment ℰ and a random variable X with range space R_X is defined on 𝒮, and furthermore if event A is an event in 𝒮 while event B is an event in R_X, then A and B are equivalent events if

A = {e ∈ 𝒮: X(e) ∈ B}.

Figure 2-3 illustrates this concept.
More simply, if event A in 𝒮 consists of all outcomes in 𝒮 for which X(e) ∈ B, then A and B are equivalent events. Whenever A occurs, B occurs; and whenever B occurs, A occurs. Note that A and B are associated with different spaces.

Figure 2-3 Equivalent events.

Definition
If A is an event in the sample space and B is an event in the range space R_X of the random variable X, then we define the probability of B as

P_X(B) = P(A),  where A = {e ∈ 𝒮: X(e) ∈ B}.

With this definition, we may assign probabilities to events in R_X in terms of probabilities defined on events in 𝒮, and we will suppress the function X, so that P_X(X = 2) = 3/8 in the familiar coin-tossing example means that there is an event A = {HHT, HTH, THH} = {e: X(e) = 2} in the sample space with probability 3/8. In subsequent work, we will not deal with the nature of the function X, since we are interested in the values of the range space and their associated probabilities. While outcomes in the sample space may not be real numbers, it is noted again that all elements of the range of X are real numbers.
An alternative but similar approach makes use of the inverse of the function X. We would simply define X⁻¹(B) as

X⁻¹(B) = {e ∈ 𝒮: X(e) ∈ B}

so that

P_X(B) = P(X⁻¹(B)) = P(A).

Table 2-1 Equivalent Events

Some Events in R_Y   Equivalent Events in 𝒮                              Probability
Y = 2                {(1, 1)}                                            1/36
Y = 3                {(1, 2), (2, 1)}                                    2/36
Y = 4                {(1, 3), (2, 2), (3, 1)}                            3/36
Y = 5                {(1, 4), (2, 3), (3, 2), (4, 1)}                    4/36
Y = 6                {(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)}            5/36
Y = 7                {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}    6/36
Y = 8                {(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)}            5/36
Y = 9                {(3, 6), (4, 5), (5, 4), (6, 3)}                    4/36
Y = 10               {(4, 6), (5, 5), (6, 4)}                            3/36
Y = 11               {(5, 6), (6, 5)}                                    2/36
Y = 12               {(6, 6)}                                            1/36

The following examples illustrate the sample space-range space relationship, and the
concern with the range space rather than the sample space is evident, since numerical
results are of interest.

Example 2-2
Consider the tossing of two true dice as described in Example 1-11. (The sample space was described in Chapter 1.) Suppose we define a random variable Y as the sum of the "up" faces. Then R_Y = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}, and the probabilities are (1/36, 2/36, 3/36, 4/36, 5/36, 6/36, 5/36, 4/36, 3/36, 2/36, 1/36), respectively. Table 2-1 shows equivalent events. The reader will recall that there are 36 outcomes, which, since the dice are true, are equally likely.
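Because the 36 outcomes are equally likely, the distribution of Y in Table 2-1 can be generated by enumeration; a minimal sketch:

```python
# Distribution of Y = sum of two true dice, built from the 36 equally likely outcomes.
from collections import Counter
from itertools import product

sums = Counter(d1 + d2 for d1, d2 in product(range(1, 7), repeat=2))
for y in sorted(sums):
    print(f"P(Y = {y:2d}) = {sums[y]}/36")
```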

Example 2-3
One hundred cardiac pacemakers were placed on life test in a saline solution held as close to body temperature as possible. The test is functional, with pacemaker output monitored by a system providing for output signal conversion to digital form for comparison against a design standard. The test was initiated on July 1, 1997. When a pacer output varies from the standard by as much as 10%, this is considered a failure and the computer records the date and the time of day (d, t). If X is the random variable "time to failure," then 𝒮 = {(d, t): d = date, t = time} and R_X = {x: x ≥ 0}. The random variable X is the total number of elapsed time units since the module went on test. We will deal directly with X and its probability law. This concept will be discussed in the following sections.

2-2 THE DISTRIBUTION FUNCTION

As a convention, we will use a lowercase letter of the same name to denote a particular value of a random variable. Thus (X = x), (X ≥ x), (X ≤ x) are events in the range space of the random variable X, where x is a real number. The probability of the event (X ≤ x) may be expressed as a function of x as

F_X(x) = P_X(X ≤ x).   (2-1)

This function F_X is called the distribution function or cumulative function or cumulative distribution function (CDF) of the random variable X.

Example 2-4
In the case of the coin-tossing experiment, the random variable X assumed four values, 0, 1, 2, 3, with probabilities 1/8, 3/8, 3/8, 1/8. We can state F_X(x) as follows:

F_X(x) = 0,    x < 0,
       = 1/8,  0 ≤ x < 1,
       = 4/8,  1 ≤ x < 2,
       = 7/8,  2 ≤ x < 3,
       = 1,    x ≥ 3.

A graphical representation is as shown in Fig. 2-4.


Figure 2-4 A distribution function for the number of heads in tossing three true coins.
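The step nature of F_X is easy to see by evaluating it directly from the probability distribution; a minimal sketch of Example 2-4:

```python
# F_X(x) = P(X <= x) for X = number of heads in three tosses of a true coin.
def cdf_heads(x):
    pmf = {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}
    return sum(p for k, p in pmf.items() if k <= x)

for x in (-0.5, 0, 0.7, 1, 2.5, 3, 10):
    print(f"F({x}) = {cdf_heads(x)}")  # 0, 1/8, 1/8, 4/8, 7/8, 1, 1
```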

Example 2-5
Again recall Example 1-13, where a cathode ray tube is aged to failure. Now 𝒮 = {t: t ≥ 0}, and if we let X represent the elapsed time in hours to failure, then the event (X ≤ x) is in the range space of X. A mathematical model that assigns probability to (X ≤ x) is

F_X(x) = 0,             x ≤ 0,
       = 1 − e^{−λx},   x > 0,

where λ is a positive number called the failure rate (failures/hour). The use of this "exponential" model in practice depends on certain assumptions about the failure process. These assumptions will be presented in more detail later. A graphical representation of the cumulative distribution for the time to failure for the CRT is shown in Fig. 2-5.

Example 2-6
A customer enters a bank where there is a common waiting line for all tellers, with the individual at the head of the line going to the first teller that becomes available. Thus, as the customer enters, the waiting time prior to moving to the teller is assigned to the random variable X. If there is no one in line at the time of arrival and a teller is free, the waiting time is zero, but if others are waiting or all tellers are busy, then the waiting time will assume some positive value. Although the mathematical form of F_X depends on assumptions about this service system, a general graphical representation is as illustrated in Fig. 2-6.

Cumulative distribution functions have the following properties, which follow directly from the definition:

1. 0 ≤ F_X(x) ≤ 1,  −∞ < x < ∞.

Figure 2-5 Distribution function of time to failure for CRT.

Figure 2-6 Waiting-time distribution function. (The jump at the origin is the probability of no wait.)

2. lim_{x→∞} F_X(x) = 1, lim_{x→−∞} F_X(x) = 0.
3. The function is nondecreasing. That is, if x1 ≤ x2, then F_X(x1) ≤ F_X(x2).
4. The function is continuous from the right. That is, for all x and δ > 0,

lim_{δ→0} [F_X(x + δ) − F_X(x)] = 0.

In reviewing the last three examples, we note that in Example 2-4, the values of x for which there is an increase in F_X(x) were integers, and where x is not an integer, F_X(x) has the value that it had at the nearest integer x to the left. In this case, F_X(x) has a saltus or jump at the values 0, 1, 2, and 3 and proceeds from 0 to 1 in a series of such jumps. Example 2-5 illustrates a different situation, where F_X(x) proceeds smoothly from 0 to 1 and is continuous everywhere but not differentiable at x = 0. Finally, Example 2-6 illustrates a situation where there is a saltus at x = 0, and for x > 0, F_X(x) is continuous.
Using a simplified form of results from the Lebesgue decomposition theorem, it is noted that we can represent F_X(x) as the sum of two component functions, say G_X(x) and H_X(x), or

F_X(x) = G_X(x) + H_X(x),   (2-2)

where G_X(x) is continuous and H_X(x) is a right-hand continuous step function with jumps coinciding with those of F_X(x), and H_X(−∞) = 0. If G_X(x) = 0 for all x, then X is called a discrete random variable, and if H_X(x) ≡ 0, then X is called a continuous random variable. Where neither situation holds, X is called a mixed type random variable, and although this was illustrated in Example 2-6, this text will concentrate on purely discrete and continuous random variables, since most of the engineering and management applications of statistics and probability in this book relate either to counting or to simple measurement.

2-3 DISCRETE RANDOM VARIABLES

Although discrete random variables may result from a variety of experimental situations, in engineering and applied science they are often associated with counting. If X is a discrete random variable, then F_X(x) will have at most a countably infinite number of jumps and R_X = {x1, x2, ..., xk, ...}.

Example 2-7
Suppose that the number of working days in a particular year is 250 and that the records of employees are marked for each day they are absent from work. An experiment consists of randomly selecting a record to observe the days marked absent. The random variable X is defined as the number of days absent, so that R_X = {0, 1, 2, ..., 250}. This is an example of a discrete random variable with a finite number of possible values.

Example 2-8
A Geiger counter is connected to a gas tube in such a way that it will record the background radiation count for a selected time interval [0, t]. The random variable of interest is the count. If X denotes the random variable, then R_X = {0, 1, 2, ..., k, ...}, and we have, at least conceptually, a countably infinite range space (outcomes can be placed in a one-to-one correspondence with the natural numbers), so that the random variable is discrete.

Definition
If X is a discrete random variable, we associate a number p_X(xi) = P_X(X = xi) with each outcome xi in R_X for i = 1, 2, ..., n, ..., where the numbers p_X(xi) satisfy the following:

1. p_X(xi) ≥ 0 for all i.
2. \sum_{i=1}^{∞} p_X(xi) = 1.

We note immediately that

p_X(xi) = F_X(xi) − F_X(x_{i−1})   (2-3)

and

F_X(xi) = P_X(X ≤ xi) = \sum_{x ≤ xi} p_X(x).   (2-4)

The function p_X is called the probability function or probability mass function or probability law of the random variable, and the collection of pairs [(xi, p_X(xi)), i = 1, 2, ...] is called the probability distribution of X. The function p_X is usually presented in either tabular, graphical, or mathematical form, as illustrated in the following examples.

Example 2-9
For the coin-tossing experiment of Example 1-9, where X = the number of heads, the probability distribution is given in both tabular and graphical form in Fig. 2-7. It will be recalled that R_X = {0, 1, 2, 3}.

x    p_X(x)
0    1/8
1    3/8
2    3/8
3    1/8

Figure 2-7 Probability distribution for coin-tossing experiment.

Example 2-10
Suppose we have a random variable X with a probability distribution given by the relationship

p_X(x) = \binom{n}{x} p^x (1−p)^{n−x},  x = 0, 1, ..., n,
       = 0,  otherwise,   (2-5)

where n is a positive integer and 0 < p < 1. This relationship is known as the binomial distribution, and it will be studied in more detail later. Although it would be possible to display this model in graphical or tabular form for particular n and p by evaluating p_X(x) for x = 0, 1, 2, ..., n, this is seldom done in practice.

Example 2-11
Recall the earlier discussion of random sampling from a finite population without replacement. Suppose there are N objects of which D are defective. A random sample of size n is selected without replacement, and if we let X represent the number of defectives in the sample, then

p_X(x) = \frac{\binom{D}{x}\binom{N-D}{n-x}}{\binom{N}{n}},  x = 0, 1, 2, ..., min(n, D),
       = 0,  otherwise.   (2-6)

This distribution is known as the hypergeometric distribution. In a particular case, suppose N = 100 items, D = 5 items, and n = 4; then

p_X(x) = \frac{\binom{5}{x}\binom{95}{4-x}}{\binom{100}{4}},  x = 0, 1, 2, 3, 4,
       = 0,  otherwise.

In the event that either tabular or graphical presentation is desired, this would be as shown in Fig. 2-8; however, unless there is some special reason to use these forms, we will use the mathematical relationship.

x    p_X(x)
0    \binom{5}{0}\binom{95}{4} / \binom{100}{4} ≈ 0.812
1    \binom{5}{1}\binom{95}{3} / \binom{100}{4} ≈ 0.176
2    \binom{5}{2}\binom{95}{2} / \binom{100}{4} ≈ 0.011
3    \binom{5}{3}\binom{95}{1} / \binom{100}{4} ≈ 0.000
4    \binom{5}{4}\binom{95}{0} / \binom{100}{4} ≈ 0.000

Figure 2-8 Some hypergeometric probabilities, N = 100, D = 5, n = 4.
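The figure's entries follow from equation 2-6; the sketch below recomputes them exactly with integer binomial coefficients.

```python
# Hypergeometric probabilities for N = 100, D = 5, n = 4 (Fig. 2-8).
from math import comb

N, D, n = 100, 5, 4
for x in range(n + 1):
    p = comb(D, x) * comb(N - D, n - x) / comb(N, n)
    print(f"p_X({x}) = {p:.3f}")  # 0.812, 0.176, 0.011, 0.000, 0.000
```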



Example 2-12
In Example 2-8, where the Geiger counter was prepared for detecting the background radiation count, we might use the following relationship, which has been experimentally shown to be appropriate:

p_X(x) = e^{−λt}(λt)^x / x!,  x = 0, 1, 2, ...,  λ > 0,
       = 0,  otherwise.   (2-7)

This is called the Poisson distribution, and at a later point it will be derived analytically. The parameter λ is the mean rate of "hits" per unit time, and x is the number of these "hits."

These examples have illustrated some discrete probability distributions and alternate means of presenting the pairs [(xi, p_X(xi)), i = 1, 2, ...]. In later sections a number of probability distributions will be developed, each from a set of postulates motivated from considerations of real-world phenomena.
A general graphical presentation of the discrete distribution from Example 2-12 is given in Fig. 2-9. This geometric interpretation is often useful in developing an intuitive feeling for discrete distributions. There is a close analogy to mechanics if we consider the probability distribution as a mass of one unit distributed over the real line in amounts p_X(xi) at points xi, i = 1, 2, ..., n. Also, utilizing equation 2-4, we note the following useful result where b ≥ a:

P_X(a < X ≤ b) = F_X(b) − F_X(a).   (2-8)

2-4 CONTINUOUS RANDOM VARIABLES

Recall from Section 2-2 that where H_X(x) ≡ 0, X is called continuous. Then F_X(x) is continuous, F_X(x) has derivative f_X(x) = (d/dx)F_X(x) for all x (with the exception of possibly a countable number of values), and f_X(x) is piecewise continuous. Under these conditions, the range space R_X will consist of one or more intervals.
An interesting difference from the case of discrete random variables is that for δ > 0,

P_X(X = x0) = lim_{δ→0} [F_X(x0) − F_X(x0 − δ)] = 0.   (2-9)

We define the probability density function f_X(x) as

f_X(x) = \frac{d}{dx} F_X(x).   (2-10)

Figure 2-9 Geometric interpretation of a probability distribution.

and it follows that

F_X(x) = \int_{-∞}^{x} f_X(t) dt.   (2-11)

We also note the close correspondence of this form with equation 2-4, with an integral replacing a summation symbol, and the following properties of f_X(x):

1. f_X(x) ≥ 0 for all x ∈ R_X.
2. \int_{R_X} f_X(x) dx = 1.
3. f_X(x) is piecewise continuous.
4. f_X(x) = 0 if x is not in the range R_X.

These concepts are illustrated in Fig. 2-10. This definition of a density function stipulates a function f_X defined on R_X such that

P_X(B) = P({e ∈ 𝒮: X(e) ∈ B}) = \int_{B} f_X(x) dx,   (2-12)

where e is an outcome in the sample space. We are concerned only with R_X and f_X. It is important to realize that f_X(x) does not represent the probability of anything, and that only when the function is integrated between two points does it yield a probability.
Some comments about equation 2-9 may be useful, as this result may be counterintuitive. If we allow X to assume all values in some interval, then P_X(X = x0) = 0 is not equivalent to saying that the event (X = x0) in R_X is impossible. Recall that if A = ∅, then P_X(A) = 0; however, although P_X(X = x0) = 0, the fact that the set A = {x: X = x0} is not empty clearly indicates that the converse is not true.
An immediate result of this is that P_X(a ≤ X ≤ b) = P_X(a < X ≤ b) = P_X(a < X < b) = P_X(a ≤ X < b), where X is continuous, all given by F_X(b) − F_X(a).

Example 2-13
The time to failure of the cathode ray tube described in Example 1-13 has the following probability density function:

f_T(t) = λe^{−λt},  t ≥ 0,
       = 0,  otherwise,

where λ > 0 is a constant known as the failure rate. This probability density function is called the exponential density, and experimental evidence has indicated that it is appropriate to describe the time

Figure 2-10 Hypothetical probability density function.

to failure (a real-world occurrence) for some types of components. In this example suppose we want to find P_T(T ≥ 100 hours). This is equivalent to stating P_T(100 ≤ T < ∞), and

P_T(T ≥ 100) = \int_{100}^{∞} λe^{−λt} dt = e^{−100λ}.

We might again employ the concept of conditional probability and determine P_T(T ≥ 100 | T > 99), the probability that the tube lives at least 100 hours given that it has lived beyond 99 hours. From our earlier work,

P_T(T ≥ 100 | T > 99) = \frac{P_T(T ≥ 100 and T > 99)}{P_T(T > 99)} = \frac{e^{−100λ}}{e^{−99λ}} = e^{−λ}.
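The final ratio e^{−λ} does not depend on the 99-hour history, which is the memoryless property of the exponential model. A quick numerical check, with an illustrative (hypothetical) failure rate λ = 0.01 per hour:

```python
# Memoryless property of the exponential model: P(T >= 100 | T > 99) = e^{-lambda}.
import math

lam = 0.01                       # hypothetical failure rate, per hour
p_ge_100 = math.exp(-lam * 100)  # P(T >= 100)
p_gt_99 = math.exp(-lam * 99)    # P(T > 99)
print(p_ge_100 / p_gt_99)        # 0.99004...
print(math.exp(-lam))            # the same value, e^{-0.01}
```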

Example 2-14
A random variable X has the triangular probability density function given below and shown graphically in Fig. 2-11:

f_X(x) = x,      0 ≤ x < 1,
       = 2 − x,  1 ≤ x < 2,
       = 0,      otherwise.

The following are calculated for illustration:

1. P_X(−1 < X ≤ 1/2) = \int_{-1}^{0} 0 dx + \int_{0}^{1/2} x dx = 1/8.

2. P_X(X ≤ 3/2) = \int_{0}^{1} x dx + \int_{1}^{3/2} (2 − x) dx = \frac{1}{2} + \left[2x − \frac{x^2}{2}\right]_{1}^{3/2} = \frac{7}{8}.

3. P_X(X ≤ 3) = 1.

4. P_X(X ≥ 2.5) = 0.

5. P_X(1/4 < X < 3/2) = \int_{1/4}^{1} x dx + \int_{1}^{3/2} (2 − x) dx = \frac{15}{32} + \frac{3}{8} = \frac{27}{32}.

Figure 2-11 Triangular density function.
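Each of the probabilities above is an area under the triangle, so a crude midpoint-rule quadrature (pure standard library) is enough to confirm them numerically:

```python
# Numerical check of the probabilities in Example 2-14 via a midpoint rule.
def f(x):
    if 0 <= x < 1:
        return x
    if 1 <= x < 2:
        return 2 - x
    return 0.0

def prob(a, b, steps=100_000):
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

for a, b, exact in ((-1, 0.5, 1/8), (-2, 1.5, 7/8), (-2, 3, 1.0),
                    (2.5, 4, 0.0), (0.25, 1.5, 27/32)):
    print(f"P({a} < X < {b}) = {prob(a, b):.5f}  (exact {exact:.5f})")
```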



In describing probability density functions, a mathematical model is usually employed. A graphical or geometric presentation may also be useful. The area under the density function corresponds to probability, and the total area is one. Again, the student familiar with mechanics might consider the probability of one to be distributed over the real line according to f_X. In Fig. 2-12, the intervals (a, b) and (b, c) are of the same length; however, the probability associated with (a, b) is greater.

2-5 SOM:E CHARACTERISTICS OF DISTRIBUTIONS


While a discrete distribution is completely specified by the pairs [(Xl' Pxex»;
i = I, 2, "', n,
... ], and a probability density function is likewise specified by [ex./xex»; x E Rx], itis often
convenient to work with some descriptive characteristics of the random variable. In this sec-
tion we introduce two widely used descriptive measures, as well as a general expression for
other similar measures. The first of these is the first moment about the origin. TIris is called
the mean of the random variable and is denoted by the Greek letter J1, where
for discrete X,

for continuous X. (2-13)

This measure provides an indication of central tendency in the random variable.

Returning to the coin·tossing experiment where X represents the number of heads and the probabil·
ity distribution is as sho'NIl in Fig. 2·7, the calculation of J1 yields
f
1'= ±X;PX(XI)=0·m+l-(%J+2HJ+3(~J=%,
,=1
as indicated in Fig. 2-13. In this particular example, because of symmetry, the value J1. eould have been
easily detenninedfrom inspection. Note that the mean value in this example cannot be obtained as the
output from a single trial,

;E#¥~r;,.1'!~:1
In Example 2-14, a density Ix was defined as

AN = X. O';;X';; I,
=2->; I,;;x ';;2,
=0, otherwise.

a b c x
Figure 2~12 A density function.
2-5 Some Characteristics QfDistribunons 45

316 3/8

118 1/8
o 2 3 x

~ 3/2
Figure 2·13 Calculation of the mean.

The mean is determined by

JJ.= f~x.Xdt>-fX.{2-X)dx

+ f·Odx+ft·Odx=4
another result that we could have dete:rmined via symmetry,

Another measure describes the spread or dispersion of the prebability associated with
elements in RJ(: 'This measure is called the variance, denoted by cr,
--.:..t and is defined as
~ ____ ~ ___ ~_~~ _ _ -i_

follows:

for discrete X,

for continuous X. (2-14)

This is the second moment about the mean, and it corresponds to the moment of .inertia.in
mechanics. Consider Fig. 2-14, where two hypothetical discrete distributions are shown in
graphical fonn. ~ote that the mean is one in both cases. The variance for the discrete ran-
dom variable shov.n in Fig. 2-14a is

0" = (0-1)' {±)+(1-1)' (i)+(2-1)'(±) =i,


and the variance of the discrete random variable shown in Fig. 2~14b is

0" =(_1_1)' . .1:.+(0_1)' 1 +(1_1)'.


5 5
,I ),1
+ (2-1) '5+(3-1 5 2,

WIDch is four times as great as the variance of the random variable shown in Fig. 2·14a.
If the units on the random variable are, say feet, fuen the units of the mean are the same,
but the units on the variance would be feet squared. Another measure of dispersion, called the
standard deviation, is defined as the positive square root of the variance and denoted o;where

(j={;;2. (2-15)
It is noted that the units of (jare the same as those of the random variable, and a small value
for (jindicates little 'dispersion whereas a large value indicates greater dispersion.
46 Cbllpter 2 One-Di:mensiorull Random Variables

PxIA
1;2

114 114
o 2 x -1 o 2 3 x
(al,u~' and 0'0 0,5
Figure 2-14 Some hypothetical distributions.

An alternate form of equation 2-14 is obtained by algebraic manipulation as

for discrete X,

1-
= _x 2 f x (x)dx-/1 , for continuous X. (2-16)

This simply indicates. that the second moment about the mean is equal to the second
moment about the origin less the square of the mean. The reader familiar with engineering
mechanics will recognize that the development leading to equation 2-16 is of the same
nature as that leading to the theorem of moments in mechanics.

Using the alternate form,

(f
2 :- 2 1 2 3 2
=lO '8+ 1 '8+ 2 8+ 3 '8 - \2 ='4'
3 2 1J (3)' 3
which is only slightly easier.
2. Binomial distribution-Example 'J.alO. From equation 2-13 we may show that jJ.= "P, and

or

,,' Jtx"(
I
n11'"(1_ p)n-z1-(np)2,
,:x
r=<l I

wbich (after some algebta) simplifies '"


,,'=np(l-p).
3. JExpo_.en1iaJ dism'bution-Euvnpie 2·13. Consider the density functionftx), where
N.x)=2e-", x;,; 0,
=:: 0 otherwise.
2-5 Some Characteristics of Distributions 47

T'nen, using integration by parts,

o •x

and

4. Another density is exCx}, where

. g.,(x) = 16.re-4': xl:O,


;;; 0, otherwise.

9x(X)

o x

rnen

and

Note that the mean is the same foe the densities in parts 3 and 4. with part 3 having a variance
twice that of part 4,

In the development of the mean and variance, we used the terminology "mean of the
random variable" and ~'variance of the random variable." Some authors use the terminology
"mean of the distribution" and "variance of the distribution:' Either terminology is accept-
able, Also, where several random variables are being considered, it is often convenient to
use • subscript on If. and (Y, for example, If.x and (Yx'
In addition to the mean and variance. other mOments are also frequently used to
describe distributions, That is, the moments of a distribution describe that distribution,
measure its properties. and, in certain circumstances? specify it. ~1oments about the origin
are called on'gin moments and are denoted J.4. for the kth origin moment, where .
48 Chapter 2 One-Dimensional Random Va.'iables

for discrete X,

= [ , / fx(x}dx for continuous X.


k = 0,1,2,,,,, (2.17)
Moments about the mean are called central moments and are ~noted Ilk> where

P.= I,(X,_p)kpX(x,) for discrete X,

for continuous X.
(2-18)
2
Note that the mean j.J. == j.J.~ and the \'ariance is 0 -:= i12. Central moments may be expressed
ill terms of origin moments by fue relationship /'"
, ik\
ILk=I,H)'l,jpJp;-j, k=O,I,2",,, (2·19)
j"'O J.1

2·6 CHEBYSHEV'S Th'EQUALITY


In earlier sections of this chapter, it was pointed out that a small variance, 0 2, indicates that
large deviations from the mean, J.ll are improbable. Chebyshev's inequality gives us a way
of understanding how fue variance measures the probability of deviations about IL'

Thwrem 2·1
Let X be a random variable (discrete or continuous), and let k be some positive number,
Then

(1,..20)

Since

it follows that

(f
1 ~ JP'./K (x- IL)1 'fx(x)dx+ JW ~(x- IL)1 . fx(x)dx.
-- 'J.t+..JK

Now, (x - ILl' ~ K if and only if b: - JLi ~ VK; fuerefore,


,r "J--
Jl-fi ""
Kfx(x}dx+J mKfX<x)dx
f.tTvK

= K[Px( X,;; IL -.fK) + px(x;;: p ~.fK)]


and
2-7 Summary 49

so that if k = YK/cr, then

The proof for discrete X is quite similar.


An alternate form of this inequality,

Px (lX-IlI<kcr) ~ 1- :' (2-21)

or
1
Px(ll-kcr<X<Il+kcr) ~ I-I?'
is often useful.
The usefulness of Chebyshev's inequality stems from the fact that so little knowledge
about the distribution of X is required. Only J1 and 0- 2 must be knO\Vll. However, Cheby-
shev's inequality is a weak. statement, and this detracts from its usefulness. For example,
px(ix - 111 ~ cr)" 1, which we knew before we started! lithe precise form oflx(x) or Px(x)
is known, then a more powerful statement can be made.

From an analysis of company records, a materials control manager estimates that the mean and stan-
dard deviation of the "lead time" required in ordering a small valve are 8 days and 1.5 days, respee-
tively. He does not know the distribution of lead time, but he is willing to assume the estimates of the
mean and standard deviation to be absolutely correct. The manager would like to determine a time
interval such that the probability is at least ~ that the order will be received during that time. That is,

1 8
1--=-
k' 9

so that k = 3 and j1 ± ka gives 8 ± 3(1.5) or [3.5 days to 12.5 days]. It is noted that this interval may
very well be too large to be of any value to the manager, in which case he may elect to learn more
about the distribution of lead times.

2·7 SUMMARY
TItis chapter has introduced the idea of random variables. In most engineering and man-
agement applications, these are either discrete or continuous; however, Section 2-2 illus-
trates a more general case. A vast majority of the discrete variables to be considered in this
book result from counting processes, whereas the continuous variables are employed to
model a variety of measurements. The mean and variance as measures of central tendency
and dispersion, and as characterizations of random variables, were presented along with
more general moments of higher order. The Chebyshev inequality is presented as a bound-
ing probability that a random variable lies between Ji- ka and Ji + ka.
50 Ch2.pter 2 One-Dimensional Rando:::a Variables

2-8 EXERCISES·
2-1. A five-a.rd poker hand may contain from zero to
f~~ aces, If X is the random variable denoting the x=0.1.2 .....
number of aces, enumerate the nmge space of X Vlhat
are the probabilities associated with each possible
=0, otherwise.
vitue of Xl Find the probability that he will have suits left over at
2~2. A car rental agency has either 0. 1, 2. 3, 4, or 5 the season's end. £: '3 ,
'2 J
", ·th
ears returned each U4y, Wi pro
bahil"lUes 6'
I I , J
6' J' 12' z"lO. A ra.:1dom variable X has a CDF of the form
t. -h.
and respectively. Fmd the mean and the vari~
(1 V+1
ance of the number of cars returned..
Fx(x) = 1-(2:) • x=0,1,2, ....
" .~~ random variable X has the probability density
function ce-x, Find the proper value of c, assuming 0 $ =0. x<{}.
X < Z<:I. Find the mean and the variance of X (a) Find the probability function for X.
2.4t The cumulative distribution function that a tele- (h) Find PodO < X ~ 8).
vision tube will fail in! hours is 1 - e-.:t, where c is a
2Ml1. Consider the following probability density
parameter dependent on the manufacturer and t 2::: 0. f';![Lction:
Find the probability density function of T. the life of
fxf.x)=/a; O~x<2,
the tuhe.
=k{4-x). 2';x';4.
~Consider the three functions given below. Deter- := 0, o1heJ1lf1.se.
mine which functions are distribution functions
(a) Fmd the value of k for which f is a prob.bility
(CDFs). 0-7J...
demity function.
>" (a) Fodx) 1 O<X<<><>.
(b) Find the mean and variance of X
_~. (h) GJ.x) ~ e~. O.s:;x<oo,
(c) Find the C1J.l):J,ulative distributionfuncdon,
=0, x<O.
2--12. Rework the above problem, except let the prob-
" (e) HJ.x)=e'. -_<xS;O.
ability density function be defined as
= 1. x>O.
f/..x)=lo:. O::;'x<a"
2,6. Refer to Exercise 2-5. FInd the probability den- = k{2a -x). a";x'; 2a.
sity function corresponding to the functions given, if .' ;,:. ; ~:) '/ 0, otherwise
they are distritaltion functions. ~:
.a.:l~~The manager of a job shop does not "know the
(l!J)Wbich of the f'ollowillg functions are ~screte pr05ability distribution of the time required to com~
probability distributioDs? " plete an order. Howe'ler. from past pcdo:rmance she
( 1 has been able to estimate the mean and variance as :!.4
va) Px () x =-3' x=O. "
deys and 2 (deys) • respectively. Fmd an interval such
, 2 that the probability is at least 0.15 that an. order is fin-
x=l,
, ··0
-'1l
=0, othenvise.
ished during that time.
2~14. The continuous random variable Thas the prob-
(
(5\1 2" ability density funcnonj(') = ki'- for -I ,; ,:5 O. Find
J,3 j l::d .
/1,5-,1:'
,/(h) Px(x) = x x = 0-1.2,3,4,5. 1hc following:
'i (a) The appropriate value of k.
=0. otherwise.
(h) The mean and variancc of T.
2..8. The demandfof a product is-l. 0, +1) +2 per day (c) The cumulative distribution function.
W1'th prob a biliti' . 1
'es SJ 'IO'5'lD,respectlvey.
J 2 3 Ad emand
of -1 implies a Ul1it is returned. Find the expected 2--15. A discrete random variable X has probability
demand and the variax:.ce, Sketch the distribution functionpX<x), where
~n(CDF). pl.x) = k(1/2t. x=I.2.3,
•~.
:r f"
"~ manager 0 a men s clothing store 1$ con~
. =0, otherwise.
\~'"Iled over the inventory of suits, which is currently (a) Fmd k;
30 (all sizes). The number of suits sold from nOw to (h) Fmd the mean and variance of X.
t.~e end of the seasor. is distributed as (c) Fmd the cumulative distribt.>tion ~ctionFxf.x).
2-8 Exercises 51

2-16. The discrete random variable N (N =. 0. 1, ... ) 2-.21. A continuous random 'fanabIe X has a density
has probabilities of OCC1i.'TCnce of kf1 (0 < r < 1). Find function
the appropriate value of k.
/-~ 2x
. ,,2--11:I'he postf. service requires, on the average, 2 f.(x)~9' o<x<3,
days to deliver' a letter across town. The variance is
~O otherwise,
estimated to be 0.4 (dayyz. If a business executive
wants at least 99% of his letters deL'vered on time, (a) Deve!op the CDF for X. V
how early should he mail them?
I,l1) Find the ~ of X and the VlIIWnce of X . /
2-18. Two different real estate developers, A and B, :!lc)~ Find 14. ':.- jr,...
O""'n pa..--cels of land being offered for sale, The proba-
bility distributions of selling prices per parcel are
'rd) F1nda value m such that PI-X l!:m) =Px(X 5m),
This is called the median of X 5f,f-2.,
sho'W"TI in the following ~ab1e,
2-22. Suppose X takes on the values 5 and -5 with
Price t.
probabilities Plot the quantity P[JX - 1'1'" k<r] as a
ftmction of k (for k > 0). On the same set of axes, plot
Sl000 $1050 Sl100 $1150 $1200 $1350 the same probability determined by Chebyshev's
A 0.2 0.3 0.1 0.3 0.05 0.05 inequality.
B 0.1 0.1 0.3 0.3 0.1 0.1 ~:;l;:mLnd the cumulative distribc.tion function asso-
dated with
Assuming that A and B are operating independently,
X If 2
x \
co:rrqJute the fo:Cowi."lg: fx(x)='::r exp - - 2 :" t>0,;;:2:0,
• \ 2t I
(a) The expected selling price of A and of B.
O. otherwise.
(b) The expeeted selling price of A given that the B
selling price is $1150.
2~24. Find the cumulative distribution function asso-
(c) The probability marA and B both have the same ciated with
selling price.
2-19. Show that the probability function for the sum of
values obtained L"l tossing two dice may be written as
x-I
Px(X) =36' x=2.3, ... ,6,
2-25. Coasider t.i.e probability density :unctioa
13-x f,{y) ~ k sin y. 0 ;; y S 7If2. What is the appropriate
x~7.8 .... ,12.
~36' ruue of k? Find the mean of the distribution,
2~2(}. Fbd the mean and variance of
the rando.!1! vari· 2-26. Show tba~ central momen!S can be expressed in
able whose probability fur.ction is defined in the pre- of origin moments by eq:;:aUon 2-19, Hint See
ter!'!1s
vious problem. Chapter 3 of Kendall and Stuart (1963).
Chapter 3

Functions of One Random


Variable and Expectation

3-1 INTRODUCTION
Engineers and management scientists are frequently interested in the behavior of some
function. say H, of a random variable X. For example, suppose the circular cross-sectional
area of a copper wire is of interest. The relationship Y = 1tX2/4, where X is the diameter,
gives the cross-sectional area. Since X is a random variable, Y also is a random variable, and
we would expect to be able to determine the probability distribution of Y =H(X) if the dis-
tribution of X is known. The first portion of this chapter will be concerned 'With problems
of this type. This is followed by the concept of expectation, a notion employed extensively
throughout the remaining chapters of tlris book. Approximations are developed for the
mean and variance of functions of random variables, and the moment-generating function,
a mathematical device for producing moments and describing diytributions, is presented
with some example illustrations. (

3-2 EQUIVALENT EVENTS


Before presenting some specific methods used in determining the probability distribution of
a function of a random variable, the concepts involved should be more precisely formulated.
Consider an experiment ~ with sample space g. The random variable X is defined On
g, assigning values to the outcomes e in g, X(e) = x, where the values x are in the range
= =
space Rx of X. Now if Y H(X) is defined so that the values y H(;c) in Ry, the range space
of Y, are real, then Y is a random variable, since for every outcome e E g, a value y of the
random variable Yis determined; that is, y = HlX(e)]. This notion is iliustrated in Fig. 3-1.
If C is an event associated 'With Ry and B is an event in Rx. then B and C are equivalent
eventsiftbey occur together, that is, ifB= (XE Rx:H(x) E C). In addition, if A is an event asso-
ciated 'With g and, furthermore, A and B are equivalent, thenA and C are equivalent events.

s RX Ry

x +----"'----f-+ • y = H(x)

Figure 3~1 A function of a random variable.

52
3-2 Equivalent Events 53

Definition
If X is a random variable (defined on ff) having range space Rx , and if H is a real-valued
function, so that Y = H(X) is a random \lariable with range space R y. then for any event
C c R", we define
(3-1)
It is noted that these probabilities relate to probabilities in the sample space. We could
write
P,.(C)=P«(ee 9': H[X(e)) E C)).
However, equation 3-1 indicates the method to be used in problem solutiou. We find the
event Bin Rx that is equivalent to event C in Ry; then we find the probability of event B.

:!:~~~~fl@~
In the case of the cross-sectional area Yof a ..vire. suppose we know t.'mt the diamete: of a Vvire has
density function

LOOO ,; x " L0Il5,


other.vise.
Let Y~ (1t14)X' be the cross-sectional area of the wire, and suppose we want to find p,.(y,; (1.01)r.i4).
The equivalent event is deteIpJ.ined Pr(Y" (1.0J)r.i4) = Px[(r.i4)X' :;; (1.01)r.i4j = px<iX1 ".JiJiI).
The event {x E Rx:lxl" .JJ:ij1} is in the range space Rx,:md sinceJx<x) = 0, for all x < l.0.
we calculate

, ~)'
PxliXi';~1.01
" '
r;-;-) ~
=PxILO:;;X';vLOI
,
W·200dx
1.000
OI

0,9975,

rj;:1¥#Yl~_'~~J
In the case of the Geigercouater experiment of Example 2-12. we used the distribution given in equa-
tion 2-7: ~

p,!.x)= '-"(MY/X!, x ~ 0, 1,2, "'>

= O. or.."lerwise.
Recall that .A.. where A > 0, represents the Clean "hit" rate and l is the tir!J.e interval for which the
counteI: is operated. Now suppose we wish to find P~Y:5 5), where
Y~2X+2,

Proceeding as in the previous example,

= [Px(O) +Px(l)j= [e-" (M)O 1O!] +le-" (M)' !1!]


=e-~[I+MJ.

The event {XE Rx: xst} ism the rnngespace ofX. and we have thefuoctlonpx to work with in that
-=---..- . .- . .- - - - - - - - - - - -.. . .
space,
~
54 Chapter 3 Functions of One Random Variable and Expectation

3-3 FIDlCTIONS OF A DISCRETE RANDOM VARIABLE


, .
Suppose that both X and Yare discrete random variables, and letxi "x/21 ••• , xi.!:""j represent _
=
the values of X such that H(x,) Yi for some set of index values, D., = U: j = 1, 2•...• s,}.
The probability distribution for Y is denoted !'!EiY~ given by
I '~i:
!PY(Yi)= py(Y = y,) L,pxtxd'I' (3-2)
\
L-.-- ____jfJ.Q;
- ~

For example, in Fig, 3·A where s, = 4, the probability of y,ls phD = Px(x\) + px<xJ + Pl.x,,)
+ p,!.xJ. '
In the special case where H is such that for each y there is exactly one x, then
ph) = Px(x). where y, = H(x;). To illustrate these concepts, consider the following examples.

In the coin-tossing experiment where X represented the number of heads, recall that X assmned four
values, 0, 1,2, 3. ""'ith probabilities t. t. t, t. If Y:::: 2X ~ I, then the possible values of Y a.::e - 1, 1.
3, 5. and p,(- 1) =t. py(l) ~ 1. py(3) =1, py(S) =}. In this case. H is such that for each y there is
exactly one x.

i¥}~21~,~i1 V
Xis as in the previollS example; however. suppose now that Y:::: ~ - 21, so that the possible values of
Ya:e 0, 1, 2, as indiCllted in Fig. 3·3. In this case

py(O) = Px(2)=t,
4
py(l) = Px(l)+ px(3)=S'
1
pA2)= Px(O)=g'

Rx

H Ry
Xi; •
I
) \!
I Xi::!_

i
I Xl::;_
I
Xil\_

Figure 3·2 Probabilities in R y•


3-4 Conr:I::.uous F-;uerioas of a Coatinuous Random Variable S5

Ax y=H{x)=:x-21 Ay

(~+=============rrl!~"'101
8 ~
\111'

Figure 3-3 An example r.mccon H.

In the event that X is continuous but Y is discrete, the formulation for p.f.y) is

(3-3)

where the event B is the event that is equivalent to tlre eveot (Y = y) in Ryo

··iX;;~~I~~-.~l \//
Suppose X has the exponential probability density function given by
N?:) = kr", X" 0,
=. 0 otherwise.
Furthermore, if
y=o forXS l/)"
for X > 1/}...

and

3·4 CONTI/lilJOUS FUNCTIONS OF A CONTINUOUS


R~OMVARIABLE

lfX is a continuous'random variable with probability density functionf", and His also con-
tinuous, tben Y =H(X) is a continuous random variable, The probability density function for
the random variable Ywill be denotedfyand it may be found by performing three steps.
1. Obtain the CDFof Y. F,(y) = P,(Y ';'y). by finding the eventBinRx• which isequiv-
alent to the ovem (Y;; y) in Ry.
2. Differentiate F,(y) with respect to y to obtain the probability density functionf,ty).
3. Find the range space of the new random variable.
56 Chapter 3 Functions of One Random Variable and Expectation

~x~~ie.§
Suppose that "fue random variable X has the following density function:

fx{x) = xJg, 0';; x;!; A,


:;: : 0, ome::wise.

If y= H(X) is therendom variable for wbich the density ilis desired, and H(>:) =0:+ 8, as shov;min
Fig. 3-4, then we proceed according to the steps given above.

•,... f,h(y'
}
= R'(Y')
r = 32 41
3. If .x-=O.),= 8, and ifx=4,)':=. 16, so then we have
y 1
tr(Y) = 32 -4' g<'y<,16,
• = 0, otb.:TNise.

Y= H(x) 1
I
I Hex) 2x+8

16~
i

FiJlUl'" 3-4 The function H(x) = 2x + S.

y= H(x}

H(X)::; (x- 2)2

Figure 3-5 The function Hex) = (x - 2)'.


3-4 Continuous Functions of a Continuous Random Variable 57

.:§~~l~;~:i;
Consider the random variable X defined in Example 3-6, and suppose Y=H(X) = (X - 2?, as sbown
in Fig. 3-5. Proceeding as in Example 3-6, we find the following:
1. Fh) ~ P!.Y~y) ~ Px«X- 21' ~y) ~px(-.JY ~ X -2 ~.JY)
~Px(2-.JY ~H2+.JY)

2+-5 x x 2 [+-!Y
~t-!Y8dx~16 c
2-'0'

~ I~ [(4 +4.JY i+(4-4.JY + Y)]

~!...JY.
2

2. fy(y) ~ Fy(y) ~ Ir::-


4~y

3. If x = 2, Y = O. and if x = 0 or x = 4, then y = 4. However,fy is not defined for y = 0; therefore,


I
fy(Y)~4JY' O<y~4,

=0 othenvise.

In Example 3-6, the event in Rx equivalent to (y" y) in Ry was [X'; (y - 8)12]; and in
Example 3-7, the event in Rx equivalent to (Y'; y) in Ry was (2 X ,; 2 -.JY ,;
In the +.JY).
first example, the funetion H is a strictly increasing function of x, while in the second
example this is not the case.

Theorem 3-1
If X is a continuous random variable with probability density function/x that satisfies/X<:<)
> 0 for a < x < b, and if y = H(x) is a continuous strictly increasing or strictly decreasing
function of x, then the random variable Y ~ H(X) has density function

. l~fX~~·r£I,I (3-4)

with X~H-l(y) expressed in terms ofy. IfHitiiicreasmg, then/,,(y) > 0 if H(a) <y <H(b);
and if H is decreasing, then/,,(y) > 0 if H(b) < y <H(a).

Proof (Given only for H increasing. A similar argument holds for H decreasing.)
F,,(y) ~ p,(Y'; y) ~ P,(H(X)'; y)
~pX[X';Wl(y)]
=Fx[W1(y)].
'() dFx{x) dx
/y ()
y = Fy y =~. dy' by the chain rule,

=/x{x)- dx, wherex=W'(y).


dy
58 Chapter 3 Functions of One Ramlom V_ble aru:\ Expectation

In Example 3·6, we hall


Ux) =xl8, O';x'; 4,
; ; ;: O. otherwise,
andFJ(x) =: 2x + 8, which is a strictly increasing function. Using equation 3-4,

ty(y)= tX(X)'idxi= y-8 . .!.


dy 16 Z'
since x = (y - 8)/2. R(O) 8 and R(4) = 16; therefore,
/r(y) =-"-_!c 8';y';16,
32 4'
:::;; 0, otherwise,

3-5 EXPECTATION
If X is a random variable and Y = H(X) is a function of X, !bell the expected value of H(X)
is defined as follows:
\
(3-5)

. E[H(X)] =
----~-,.~---
rH(x)· fx(x)dx for X conti~':..
.. _.--- -~ ~-" - ---~....--~
(3-6)

In the case where X is continuous, we restrict H so that Y;;:;:. H(X) is a continuous random
variable, In reality. we ought to regard equations 3-5 and 3·6 as theorems (not definitions),
and these results have come to be known as the law of the unconscious statistician.
The mean and variance, presented earlier, are special applications of equations 3-5 and
3·6. If H(X) = X, we see that
E[H(X») =E(X) =/1. (3·7)
Therefore, the expected value of !be random variable X is just lhe mean, p..
If H(X) = (X - J1.?, then
E[H(X») = E«X - /11') = d'. (3-8)
Thus, lhe variance of the random variable X may be defmed in terms of expectation. Since
the variance is utilized extensively. it is customary to introduce ~a variance operator V that
is defined in terms of the expected value operator E;
V[H(X)] = E([H(X) - E(H(X»)]~. (3-9)
Again, in the case where H(X) = X,

VeX) ~ E[(X - E(X)il


=E(X') - [E(X)]'. (HO)
which is the van"ante of X, denoted dl.
The origin moments and central moments discussed in the previous chapter may also
be expressed using !be expected value operator as

J1.' = E(X'l (3·11)


\ 3-5 Expectation 59

and
ilk = E[(X - E(X))''j.
There is a special linear function H that should be considered at this point. Suppose that
H(X) = aX + b, where a and b are constants. Then for discrete X, we have
-._____c
'I E(aX + bl = I'(ax, + b )Px(x,)
- ,
= a 2,xi Px(x')+b 2,Px(x,)
i, .. ~

faE(X)+b'J (3-12)

and the same result is obtained for continuous X; namely,

~= [(ax+b)fx(x)dx

=a[ xfx(x)dx+b[fx(x)dx

=aE(X)+b, (3-13)

Using equation 3-9,


V(aX + b) = E[(aX + b - E(aX + b))']
E[(aX + b - aE(X) - bJ']
= aE[ (X - E(X) )2]
ti'v(X).] (3-14)
In the further special case where H X) = b, a constant, the reader may readily verify that
E(b) = b (3-15)

and
V(b) = O. (3-16)
These results show that a linear shift of size b only affects the expected value, not the variance.

Suppose X is a random variable such that E(X) = 3 and VeX) = 5. In addition, let H(X) = 2X -7. Then
EIH(X)] =E(2X -7) = 2E(X) -7 =-1

and
V[H(X)] = V(2X -7) = 4V(X) = 20.

The next examples give additional illustrations of calculations involving expectation


and variance.

Suppose a contractor is about to bid on ajob requiring X days to complete, where X is a random vari-
able denoting the number of days for job completion. Her profit, P, depends on X; that is, P = H(X).
The probability distribution of X, (x, p/..x)), is as follows:
60 Chapter 3 Functions of One Random Variable and Expectation

x Px(xl
[
3
8
5
4
ii
5 ~
8
otherwise, O.

Using the notion of expected value, we calculate the mean and variance of X as

1 5 2 33
E(X) = 3·-+4·-+5·-=-
8 8 8 8

and

V(X)=[3,.2.+4,.~+52.~J_(33)2 = 23
8 8 8 8 64
If the function HUG is given as

x H(x)
3 $10,000
4 2,500
5 -7,000

then the expected value of H(X) is

E[ H(X)] = 10,000· m + 2500· (%)-7000 . (i) = $1062.50

and the contractor would view this as the average profit that she would obtain if she bid this job many,
many times (actually an infinite number of times), where H remained the same and the random vari~
able X behaved according to the probability function Px. The variance of P = H(X) can readily be
calculated as

V[ H(X}] = [(IO,OOO)' .2.+ (2500}2 . ~+ (-7000)' '~J- (1062.5}2


888
= $27.53 ·10'.

A well-kn~ simple inventory problem is "the newsboy problem," described as follows. A newsboy
buys papers for 15 cents each and sells them for 25 cents each, and he cannot return unsold papers.
Daily demand has the following distribution and each day's demand is independent of the previous
day's demand.

Number of customers x 23 24 25 26 27 28 29 30

Probability, pIx) 0.01 0.04 0.10 0.10 0.25 0.25 0.15 0.10

If the newsboy stocks too many papers, he suffers a loss attributable to the excess supply. If he stocks
too few papers, he loses profit because of the exeess demand. It seems reasonable for the newsboy to
3·5 Ex.pec:ation 61

stock some number'bf ;tapers so as to ;ninirnize the expected loss. If we let s represent the number of
papers stocKed, Xtbe daLy demand, andL(X, s) the newsboy's loss for a pa..>1:icular stock level $, then
the loss is simply

L(X, s) = O.lO(X -s) ifX>s,


= 0.15(s-X) ifX,;s,
and for a given stock level, s, the expected loss is
, 3ll
E[L{X.s)] = 2:,0.15(s-x) pxCx)+ 2:,O.IO(x-s)' Px(x)
.:o~ .r--,:+l

and the E[L(X, s)I is evaluated for some different values of s.

For 3=26.

E[L(X. 26)] = 0.15[(26 - 23)(0.01) + (26 - 24)(0.04).,. (26 - 25)(0.10)


+ (26 - 26)(0.10)1 + 0.10((27 - 26)(0.25) + (28 - 26)(0.25)
... (29 - 26)(0.15) - (30 - 26)(0.10):
=$0.1915.

Fors=27.
E[L(X, 27)] = 0.15i(27 - 23)(0.01) + (27 - 24)(0.04) + 07 - 25)(0.10)
+ (27 - 26)(0.10) ... (27 - 27)(0.25): ... 0.10[(28 - 27)(0.25)
+ (29 - 27)(0.15) + (30 - 27)(0.10),
= $0. 154().
For $ 28,
BIL(X, 28)J =0,15[(28 - 23)(0.01) + (28 - 24)(0.04) + (28 - 25)(0.10)
(28 - 26)(0.10) + (28 - 27)(0.25) + (28 - 28)(0.25)1
+ 0.10[(29 - 28)(0.15) + (30 - 28)(0.10))
$0.1790

Thus, the newsboy's policy should be to stock 27 papers if he desires to minimize his expected loss.

Consider the redundant system sbam in the diagram below.

At least one of the units must function, the redundancy is standby (meaning that the second unit does
not operate until the first falls), sMtching is perfect, and the system is nonmaintained. It can be
shown that ~det' certain conditions, when the time to failure for each of the units of this system has
an exponential distribution, then the time ttl failure for the system has the following probability den·
sity function.
62 Chapter 3 Functions of One Random Variable and Expectation

f,!x) = J.'xe-'-', x> 0, J. > 0,


= 0, otherwise,
where A is the "failure rate" parameter of the component exponential models. The mean time to fail~
ure (MTTF) for this system is

Hence, the redundancy doubles the expected life. The terms "mean time to failure" and "expected
life" are synonymous.

3-6 APPROXIMATIONS TO E[H(X)] AND V[H(X)]


In cases where HeX) is very complicated, the evaluation of the expectation and variance
may be difficult. Often, approximations to E[H(X)] and V[H(X)] may be obtained by uti-
lizing a Taylor series expansion. This technique is sometimes called the delta method. To
estimate the mean, we expand the function H to three terms, where the expansion is about
x =IL If Y =H(X), then

Y = H(Il) + (X - Il)H'(Il) + (X - Il)' . W(Il) + R,


2
where R is the remainder. We use equations 3-12 through 3-16 to perform

E(Y) = E[H(Il)] + E[H'(Il)(X - Il)]


+ E[~W(Il)(X - Il)' ] + E(R)

= H(Il) + !W(Il)V(X) + E(R)


2

'" H(Il) +!2 W(1l )a'. (3-17)

Using only the first two terms and grouping the third into the remainder so that
Y =HC!l) + (X - Il) . H'C!l) + R I,
where

then an approximation for the variance of Y is determined as


V(y) = V[HC!l)] + V[(X - Il) . H'C!l)] + V(RI)
= 0 + veX) . [H'C!l)]'
= [H'C!l)]'. a'. (3-18)
If the variance of X, 02, is large and the mean, J.1, is small, there may be a rather large error
in this approximation.

The surface tension of a liquid is represented by T (dyne/centimeter), and under certain eonditions,
Too 2(1- 0.005X)1.2, where X is the liquid temperature in degrees centigrade. If X has probability den~
sity functionjx, where
3-6 A;>proximations to E,H(X)] and V(H(X)] 63

fx(x) ~ 30oox"\ x;" 10,


;;: 0 otherwise,
then

E(T)= [2(1-0.oo5x)1.2 ·3OO0x" dx


_to

and

V(T)
.r~, 4(I-D.005x)'·' .3000x-4dx-(E(T»)' .
In order to determine these values, it is necessary to evaluate

r(1-0-OO5x)" ax.
J:o X4

Since the evaluation is difficult., we use the approximations given by equations 3-17 and 3-13. Note
that

Since

H(X) = 2(1 - O.OOSX)'·',


then
H'(X) =- 0.012(1 0,005X)"z

H"(X) = 0.000012(1- O.005X)""·'.


Thus,
H(lS) = 2[I-O"005(15)lt~= 1.82,

H'(lS) =- 0.012,

li"(15) = O.
Using equations 3-17 and 3-18

E{T) =H(15)+tW(15).cr' = 1.82

and
V(1) = [H'(I5)]" d'= [- 0.012]'·75 = 0.Q108.

An alternative approach to this sort of approximation utilizes digital simulation and


statistical methods to estimate E(y) and V(l:'). Simulation will be discussed in genem! in
Chapter 19, but the essence of this approacb is as follows.
64 Chapter 3 Functions of One Random Variable and Expeetation

1. Produce n independent realizations of the random variable X, where X has proba.


bility distribution Px orfx. Call iliese realizations x,. x" '''' x,.. (The notion of inde·
pendence will be discussed further in Chapter 4.)
2. Use Xi to compute independent realizations of Y; namely, y, =H(x,), Y2 =H(x-J, ... ,
y, =H(;<,).
3. Estimate ErY) and V(l:') from ilie y"Yz, ... ,y, values. For example, the natural esti·
matorior E(y) is the sample mean y '" 2:~., y,/n. (Such statistical estimation prob-
lems will be rreated in detail in Chapters 9 and 10.)
As a preview of this approach. we give a taste of some of the details. FIrst, how do we
generate the n realizations of X that are subsequently used to obtain Y1' h, ,,+, Y11? The roOst
important technique, called the imerse transform method, relie..<;. on a remarkable result

Theorem 3·2
Suppuse that X is a random variable with CDF F ,{x). Then the random variable F,{X) has
a uoiforrn distribution on [0, 1]; that is, it has probability density function
Jul.u) = 1, OS uS I,
=0 otherwise. (3·19)

Proof Suppose that X is a continuous random variable. (The discrete case is similar.)
Since X is continuous, its CDF Fx(x) has a unique inverse. Thus, the CDF of the random
variable F x(X) is
P(FiX) S u) = P(X S F!i(u) = F,{F!i(u» = u.
Taking ilie derivative wiili respect to u, we obtain ilie probability density function of F ,{X),
d
duP(Fx(X),;;u)~I,

which matches equation 3·19, and completes ilie proof.


We are now in a position to describe the inverse transform method for generating random
variables. According to Theorem 3·2, ilie random variable F,!X) has a uoiforrn distribution
on [0,1]. Suppose iliatwe can somehow generate such a uoiforrn [O,IJ random variate U, The
inverse transform meiliod proceeds by setting F,{X) ~ U and solving for X via the equaJion
X =F!i(U). (3·20)
We can ilien generare independent realizations of Y by =
Y, = H(X i) H(Fi, (Ui »,
i =1,2, "', n, where U,. U2 , .... U, are independent realizations of the uoiforrn [0,1] dis·
tribution. We defer the question of generating U until Chapter 19. when we discuss com-
puter simulation techniques. Suffice it for now to say iliat iliere are a variety of methods
available for this task, the most widely used acmally being a pseudo·uniform generation
meiliod-one iliat generates numbers that appear to be indepen.dent and uniform on [0, I]
but that are actually calculated from a derern:tinistic algorithm,

Suppose that U is unifO,ITn on [0,1]. We will show how to use the inverse transfonn method
to generate aD exponential random variate with parameter ).,. that is, thc continuous distribution
3~ 7 l.c:.e Moment~Genera.ting Funetion 65

having probability density functionfx.Cx) :;; A,e-.tr. for x 2: 0. lne CDF of the exponential random
va..-iable is

lnerefore, by the inverse rransform theorem. we can Set

and solve for X,

X =F;' (U) =··Iln(l-U).

where we obtain the inverse after a bit of algebra. In other words, -(1/A.) In(1 - [1) yields au expo-
nential random va..-iable with parameter A.

We remark that it is not always possible to find Pi in closed form, so the usefulr.ess
of equation 3-20 is sometimes limited. Luckily, many other random variate generation
schemes are available, and taken together, they span all of the co=only used probability
distributions.

3·7 THE MO?vIENT-GEJIo'ERATING FTNCTION


It is often convenient to utilize a special function in finding the moments of a probability dis-
tribution. TIlls special function, called the moment~ger.eratingfo.nction) is de:finedas follows.

Definition
Given a random variable X, the moment-generating funetionMJt) of its probability distri-
bution is the expected value of ex. Expressed mathematically,
M,ft) = E(ex) (3-21)

= Le'" Px{x,) discreteX. (3-22)

= r e" Ix (x)dx continuous X. (3.23)

For certain probability distributions, the moment-generating funetion may not exist for
all real values of t. However, for the probability distributions treated in this book, the
moment-generating function always exists.
Expanding etX as a power series in t we obtain

On taking expectations we see that


2 ,
( ) Ber~ IX] =hE
Mx t . () '2',I·-+,,·+EIX
X ·t+ElX
, 2
t
\
' ') ._.,.
t . ...
r!
so that

(3·24)
66 Chapter 3 Functim:.s of One Random Variable and Expectation

Thus, we see that when ~41yj..t) is written as a power series in r~ the coefficient of fir! in the
expansion is the 7th moment about the origin. One procedure, then, for using the moment~
generating function would be the following:
1. Find Mxit) analytically for the particular distribution.
2. Expand M,(t) as a power series in t and obtain the coefficient of fir! as the 7th origin
moment.
'The main difficulty in using this procedure is the expansion of Mit) as a power series in t.
If we are only interested in the first few momenTS of the dislribution, then the process
of derenninlng these moments is usually made easier by noting that the rth derivative of
Mxil), with respect to t, evaluated at t= 0, is just
d'
_.Mx(t)1
dt T t=
0 = EflX'e"'] t=O = J.l;, (3-25)

assuming we can interchange the operations of differentiation and expectation. So, a sec-
ond procedure for using the moment-generating function is the following:
1. Determine M/..,) analytically for fue particular distribution.
2. Find J.l; = !', Mx(tll,.o·
.Moment-generating functions have many interesting and useful properties. Perhaps fue
most important of these properties is that the moment·generating function is unique when
it exists, so that if we know the moment-generating function. We may be able to determine
the form of the distribution.
In cases where the moment-generating funetion does not exist, we may utilize tbe
characteristic function, C/"t), which is defined to be the expectation of ,fiX, where i = H.
There are several advantages to using the characteristic function rather than the moment-
generating function. but the principal One is that Cx{t) always exists for all t. However. for
simplicitYj we will use only the moment·generating function.

!i[(~mRf~~tI~~:j
Suppose that X has a binomial distrfbutif:m., that is.
(It~ .. _ ..
Px(x) =l X'
/
p (I-p) , x:= 0,1,2,. '".!t,.

=0, otherwise.
where 0 < p ~·l and n is a pOSitive integer, The moment-generating fimction MJ,t) is

This last summation is re<:ognize<! as the binomial expansion of[pe' + (I-p))". so that

M,(tl = [pe' + (1- pl]".


Taking the derivatives. we obtain
3-8 Summary 67

and
M~(t) = npe'(l- p +npe')(: - p(e' -l)l~'.

Thus

and
14 =M~(t)I",,= np(l- p'" npl.
The second cenrral moment may be oblained using cr;;;;;; J.4 - [il;;;;;; np(1- p).

;i!~p!~~~~E'
Assume X to have the following gamma distribution:

b-l-~
b
a
fxt")=·····-x e O:5x<-,a>O,b>O,
reb) ,
== O. otherwise.

wherer(b)~ r e-'y"-ldy isrhegammajunction.


The mOll'lellt~gen.cratiJlg fmctlon is

which. if v.-e let y ;;:;:. x(a - t), becomes

Since the integral on the right Is just r(b), we obtain

,
Mx(t)=_a_;;;;;;J( 1-!.. )-' fort < a.
(a-t)' \ a

Now using the power series expansion for

(1--'- r.
aJ

we find

t b(b.,.l) ")' + ... ,


Mx(I)=hb-"'--l-
a 21 a
which gives the moments

u' b ,_0(0+1)
. 1 and !1i ---,-.
a a

3-8 SUMl\HRY
This chapter first introduced methods for determining the probability distribution of a ran-
dom variable that arises as a function of another random variable with known distribution.
That is, where Y=H(X), and either Xis discrete with knowndistributionp.,{x) or X is COIl-
68 Chapter 3 Functions of One Random Variable and Expectation

tinuous with known density !x(x), methods were presented for obtaiIDng the probability dis-
tribution of Y.
The expected value operator was introduced in general terms for E[H(Xl], and it was
shown that E(Xl = J1, the mean, and E[(X - J1l'] = ri', the variance. The variance operator V
was given as V(Xl = E(X') - [E(Xl]'. Approximations were developed for E[H(Xl] and
VlH(K)] that are useful when exacr methods prove difficult. We also showed how to use the
inverse transform method to generate real.izations of random variables and to estimate their
means.
The moment~generating function was presented and illustrated for the moments f.l~ of
a probability distribution. It was noted that E(X') = 1';.

3-9 EXERCISES
3~1. A robot positions 10 units .t.... a chuck for machinv
(b) If the profit per sale is $200 and thc replacement
iog as the chuck is indexed. If the robot positions the of a picture tube eosts $200, tind the expected
unit improperly, the unit falls away and the chuek profit of the business.
position rCl'.l:lains open, thus resulting in a cyele that 34~ A contractor is going to bid a project, and the
produces fewer than 10 units. A study of the robot's number of days, X, required for completion fonows
past performance indicates that if X =number of open the probability distribution given as
positions,
pix) =0.1. x= 10.
px(x)=O.6, x=o, =0.3, x= 11,
=0.3, x= 1, =0.4, x=12,
0.1, x 2, =0.1, x=13,
=0.0, otherwise. =0.1, x=14.
'= 0, otherwise.
If the loss due to empty positions is given by Y= 20;(-!,
The contractor's profit is Y"", 2000(12 - X).
find the following:
(a) Find the probability distribution of Y.
(a) p,.iy).
(b) Find E(X). V(X). E(y). and V(Y).
(b) E(Y) and VCy).
3-5. Assume thaI: a continuous random variable X has
3-2. The content of magnesium in an alloy i.s a random probability density function
variable given by the following probability density
function: fx(x) =2xer r , X~O,

=0, otherwise.

Find the probability distribution of Z = X'.


=: 0, otherwise. 3-.6. In developing a random digit generator, an impor~
tant property sought is rhat each digit D; follows the
The profit obtained from this alloy is P = 10 + 2X,
follov.ing discrete unifonn distribution.,
Ca) Find the probability distribution of P.
1
(b) \That is thc expected profit? pJ),(d) = 10' d=0.L2,3..... 9,
3~3. A manufactu.rerof color television sets offers a 1- = O. otherwise.
year warracty of free replacement if the pictu.."'e tube
fails. He estimates the time to failure, T, to be a ran- (a) Find E(D,) and V(D,).
dom variable with the following probability dist:ribu~ (b) If y =l J,
D, - 4.5 where lJ
is the greatest integer
non (in units of years): ("round down") function, find Pr0'}, £(Y), V(l').
3-7. The percentage of a certain additive in gasoline
t> 0,
(·\_.!.e-/!4
f T,·,- 4 determines the wholesale price. If A is a random. vari-
able representing the percentage, then 0 $A ~ 1. If the
= O. otherwise.
percentage of A is less than 0,70 the gasoline is low-
(a) Vlhat percentage of the sets will he have to test and sells for 92 ee.tl.tS per gallon. If the percentage
service? of A is greater tha:n or equal to 0.70 the gasoline is
3·9 Exexcises 69

bighwtest and sells for 98 cents per gallon, Find the Evaluate E(A) :and V(A) by using the approximations
expected re'>'enue per gallon wberefA(a) == 1. O!::a S 1; derived in this chap~.
otherwise.k(a) O. 3~13~ Suppose that X bas the uniform probability
3..8. The probability function of the random variable X. density function

- I -(ve)(x-p) 1 :5'x $; 2,
f x (x) -Be ,
otherwise.
=0, othenvise
(aJ Flnd the probability density funetion of Y = H(X)
is knO'tVtl as the two~parameter exponential distrfbu, wbere H(x) = 4 - x'.
non. Find the moment-generating function of X Eval~ (b) Find the probability density function of Y= H(X)
uate E(X) and V(X) using the moment-generating where H(;<) = e,
function.
3-14. Suppose that X has the exponer.'tial probability
3--9~ A random variable X has the following probabil~ density function
ity density function:
x:::: 0,
fl.x) = a-', X:>O. otherwise.
O. othel;'Wise,
Find the probability density function of r = FI(X; where
(al Develop the density function for Y= 2 X'.
(b) Develop the density for V = X·n. 3
H(X)=~),.
(cJ Develop the density for U = In X tl+x
I •
3--10. A two-sided rotating antenna receives signals. ~ used-car salesman finds that he sells either 1.
The rotanonal position (angle) of the antenna is denoted 2,3,4,5, or 6 ca.."'S per week with equal probabilIty.
X. and It may be assumed that this position at the tlme ~ind the moment-generallng function of X.
a IUgnal is received is a ra.'1dom variable with the den~ (b/usin~ the moment~ger.e:ating function fincC.E('X':
sity below. Actually, the randomness lies in the signal, ~ and VgJ/
'.,..>' ' I

I 3~16. 1.<:.t X be a random variable with probability


fx(x)=-,
2" density function
=0, otherwise.
flx) = ai'e-b~, x> 0,
The signal can be received if Y:> Yo> where Y:;:;; tan X =O. othet'.Vise.
t
For instance, Yo ::;;; 1 corresponds to < X < ~ and
(a) Evaluate the constant a..
~ < X < 3;' Find the density function for Y.
(b) Suppose a new function Y = lax:! is of interest.
3--11. The demand for antifreeze m.a season is consid- /'/ Find an approximate value for EeY) and for V(Y).
ered to be a uniform random variable X, with density
: 3-17. Assume that Y has the exponential probability
fx(x) = lO'' " 10' ,; x S; 2 x 10', ":denS'ity function
.e: 0. otherwise,
y>o,
wbere X is measured in liters. If the manufacturer otherwise.
t:lakes a 50 eent profit an eacb liter shc sells in the fall
of the year, and if she must carry any excess aver to Find the approximate values of E(X) and 'V(X) where
the nex.t year at a cost of 25 OOnt'> per liter, find the
"optimum" stock level for a particular full season,
3--12. The acidity of a certain product. measured on an 3~1S. The concentration of reactant in a chemical
arbitrary scale, is given by the relation.ship process is a random variable having probability
distribution
A = (3 .,. 0.050)',
fk) = 6r(1- r), 05r51,
where G is the amount of one of the constituentS bav-
ing probability distribution =0, otherwise.

The profit associated with the final product is P =


fc(g)=t, O';g';4, $1,00 +$3.00R. Find the expected value of P. 'Vtbat is
= 0, otherwise. the probability &stribution of P?
70 Chapter 3 Functions of One Random Variable and Expectation
1"')\,
,/,' 3~19~ The repair time (in hours) for a certain e1ectron~ 1:(!}l If Y = (X - 2)', find the CDFfor 1:
'-..-iafuy controlled milling machine follows the density 3~24. The third moment about the mean is related to
function
the asymmetry, or skewness, of the distribution :and is
f:!--') ~ wr2I, 7:>0. defined as
=0, othenvise.
!J., =E(X - J1;)',
Determine the moment~generating function for X and
:= p.; - 3,u2jl~ + 2(P~)J. Show that for a
Show that fJ.J
use this fu.'1crion to evaluate E(X) and V{X).
symmetric distribution, fJ.J = 0,
3..20. The .ctQS$*secrional diameter of a piece of bar
stock is circular with diameter X. It is known that 3--25. LetJbe a probability density function for which
E{X'J = 2 em and V(X) = 25 x 1~ cm2• A cutoff tOOl the rth order moment)1~ exists. Prove that all moments
cuts wafers that are e.x:actly 1 Cm thick.. and this is con~ of order less than r also exist
sta(l\Fmd the expected volume of a wafer. 3--26. A set of COllStaIlts k,.. called cumuIants, may be
. ~2V.)I a random variable X has moment-generating used instead of moments to characterize a probability
1/" Tunction Mx(t), prove that the random variable Y"" aX distribution. If .'1(J..r) is the moment-generating func~
+ b has moment-generating function e ib Mx (at). rion of a random variable X, then the cumulants are
defined by the generating function
3-22. Consider the beta distribution probability
density vJJ1Ction vW) log MI'),
Ix (x);;;;;; k(l- X),,-I x b-l. O~xsl,a>O,b>O, Tuus, the rth cumulant is given by
=0, othCI".'tlse,
k = d''If xCl)1 ,
{a) Evaluate the constant k. r dt' Ir-O
(b) Find the me~.
Find the cumulants of the f1Orrr.a1 distributl:on whose
f")fmd the variance,
dens£ty function is
f3~~) The probability distribution of a random vari-
able X is. given by 1 { li'x-u"l:.q
pIx) = 112, x = 0, f(')=-= exp
a-v2r.
--1--'
2 \. (J /
i J'-~<~<~'
1/4, .t;;;;;; 1,

=118, x~2, 3·27~ Using the inverse transform method. produce 20


I ()~i1 =118, 7:=3. realizations of the variable X described by Px(x) in
Exercise 3~23,
'. / -. '- 0, otherwise.
\ fJ) Determine the mean and vadan~e of X from the 3--28. Using the inverse transform method, produce 10
\ / ' moment-generating function. realizations of the random variable Tin Exercise 3-3.
Chapter 4

Joint Probability Distributions


~.

r--'
1,4·1 ) INTRODUCTIO:--;
In many situations we must deal with two or more random variables simultaneously. For
example, we might select fabricated
~--'--- .....
sheet steel specimens
-.----~- - - and
-_ measure
.•.. shear
.._ strength __
- 'and
..
2!]'ld.diameter 01 sP9.!.lilelds. Thus, both weld !'lieru:.str!'llW..?,ll..Q weld di.'!!lleter:."!.e the ran-
~jlbl"'u;>.f mt",re5t. Or we may select people from a certain population and
their height and weigbL '. . . . .. -- - - - - . ' , . . ...'
mcasure .
-The objective "filii, chapter is to formulate joint probability distributions for two or
more random variables and to present methods for obta1.Ung both marginal and coruItiwnal
distributions. Conditional ~~"~~(tas w.tl.ll~!pe regre~~91tpfm[~~!1' We
also present a definition of irul.ependence for random variables, and covariance and corre-
lation are defined. Functions of two or more random variables are p~:;;Sentecf~~d a specIal
~f linear combinations is presented with its corresponding moment~generating func-
tion. Finally the law of large numbers is discussed.

Definition
If if is the sample space associated with an experiment ~, and Xl' X1;> •• '. Xk are functions.
each assigning a rea! number X,(e), .'I:,(e) .... Xie) to every outcome e, we call [Xl' X" ....
XJ a k-dimensional rtJJ]dom vector (see Fig. 4-1).
The range space of the random vector [Xl' X" "', X,l is the set of all possible values of
the random vector. This may be represented as Rx, XX1Y.". xXe where
Rx,xx~x ... xx.. = ([Xl'Xz., ... , x,J:XI E Rx,>.t2 E Rx~. ""Xk E Rx)
This is the Cartesian product of the range space sets for the componentS, In the case where
k =2, that is, where we have a two-dimensional random vector, as in the earlier illustrations,
RXI xX. is a subset of the Euclidean plane.

e.~~__~__________~~__~~R~

r-~./~_.,Rx,

~
71
72 Chapter 4 Joint Probability Distributions

4·2 JOINT DISTRIBUTION FOR TWO·DIMENSIONAL


J RAt'IDOM VARL.ffiLES

In most of our considerations here, we will be concerned with two-dimensional random


vectors. Sometimes the equivalent term. two-dimensional random variables will be used.
If the possible values of [X" x"J are either finite or countably infinite in number, then
[XI' Xz] will be a two-dimensional discrete random vector. The possible values of [XI' X z]
are {x!;. xljJ. i == I, 2, ... ,i =. 11 2, ....
If the possible values of [XI' X,l are some uncountable set in the Euclidean plane, then
{Xl' X:J ~-ill be a nvo-dimensional continuous random vector. For example, if a S b and
c::;:~ s d, we would have RXI xXz == {[xl> xJ: a 5. Xl:':;: b, c 5. Xi. ~ d}.
It is also possible for one component to be discrete and the other continuous; however,
here we consider only the case where both are discrete or both are continuous.

Consider the case where weld shear s::rength a.'1d weld diameter are meas:lred. If we let Xl represent
diameter in inches andX2 represen.t strength In pounds, and if we know 0 ~x! <.0.25 inch while 0 ~
x, S 2000 poUlJds, then me ra:nge space for [X" X,] is the ser {[x" x,]: 0 ~ XI < 0.25, 0 "-"z ;; 2000].
This space is shoWD graphically in Fig. 4~2.

1'E,~~~~~1:'
A small pump is inspe..-red for four quality-co;ltrol characteristics. Each charaCteristic is classified as
good, mino::: defect (not affecting operation) or IIJ.ajor defect (affecting operation). A pump is to be
selected and the defects counted. IfXt :::::: the number of minor defects andX2 = the number of major
defects, we know that Xl =. 0, 1, 2. 3. 4 and Xz =. O. I, ,.,' 4 - X I because only four characteristics are
inspected. The range space for [X" X,] is thus ([O, 0], [0, I), (0, 2], [0, 3), (0,4). (1, 0], [I, IJ, (1,2],
[1,3], [2, 0], [2, 1], [2, 2], [3,0], [3, 1], [4, 0]). These possible outcomes are shown in Fig. 4-3.

X2 (pounds)

2,000

Figure 4--2 The range space of (Xl' XJ. where Xl is


o weld diareeter andX2 is shear strength.

2~'--~--~--+---+---

Figure 4-<3 The range space of [Xl' X:i~, where Xl is


the nUlllOc::: of minor defects andX2 is the number of
major defects, The range space is indicated by heavy
o 2 4 X" dots.
4-2 10im Distribution for Two-Dimensional Random Variables 73

In presenting the joint distributions in the definition that follows and througbout the
remaining sections of this c!tapter, where no ambiguity is introduced. we shall simplify the
notation by omitting the subscript on the symbols used to specify these joint distributions.
Thus. if X = [X,. X,J.Px(x,. x,) = p(x" x.J and/x(x,. x.J =/(x;, x,).

Definition
Bivariate probability functions are as follows.
o
~

!?iscrete c"'!i!; To each outcome [x;, Xzl of [X,. X,l. we a'sociate a number.
p(x,.x.;J =P(X, =x, and X, =x,),

where

and
(4-1)

The values ([x,. x,l. p(X,. Xz») for all i, j make up the probability distribution of
[X,. X,J.
(3) Euclidean
Contim{OHS ~I! [Xl' Xi] is a continuous random vector with range space R in the
plane, tlienf, the joint density fwtction, has the follO\Ving properties:
for all (x" x.,) E R

and

A probability statement is then of the form

P(al ~X,"I1.a, ~X, $b,)= t'J"/(x\,Xz)dxjdx,


0z <1,

(see Fig. 4-4).

'"
Figore4-4 A bivariate den..~ty·funct:ion. w!:.ere peal ::;;X:::; oJ' az S:Xz ~07) is given by the shaded
volume.
74 Chapter 4 Joint Probability Distributions

It should again be noted thatj(x" x,) does not represent the probability of anything. and
the convention thatj(x\, x,) ~ 0 for (Xl'>;') '" R will be employed so that the second prop"
erty may be ",'ritten

In the case where [X,. X,J is discrete, we migbt present the probability distribution of
[Xl' X 2] in tabular, graphical, or mathematical fonn. in the case where [Xl' X,J is continu-
ous, we usually employ a mathematical relationship to present the probability distribution;
however, a graphical presentation may occasionally be helpful.

;ll~~~f~'
A hypothetical probability distribution is show::t in both tabular and graphical form in Fig. 4--5 for the
random variables defined In Example 4-2.

~i>l;;;~:;(
In the casc of the weld diameters represented by Xl and tensile strength represcnted by Xl' we might
have a uniform distribution as by

!~ o I 1 2 3
L=-J
I 0 1/30 ! 1/30 2/;!{) I 3/30 I 1/30
, 1 1/30 i 1/30 3130 i 4/30 I \

:
2 1/30 , 2/30 .'lI3Oi I !

: 3 1/30 I 3/30 i
,

i
i 4 ,3/30 ,
I I I

(a)

pix;, xz}J.

3/30 X,

----- -
1/30
.'lI3O
----- -----
3/30

-----
4/30
-

~30-- ~.
0
2 3 4 x,
(b)
Figure 4~5 Tabular and graphical presentation of a bivariate probability distribution. (a) Tabulated
values are p(x1• xJ. (b) Graphical presentatiQIl of discrete bivariate distribution.
Uf
4-3 Ma..--ginal Distributions 75

"
O$X, <0,25,0$'2 ~2000,

;:; 0, other"<Y1Se,
The range spare was shO'Vi'D. in Fig. +2, and if we add another dimension to display graphically
y;. fix l ,;0, then the distribution would appear as in Fig, 4-6.ln the univariate case, area corresponded
to probability~ in the bivariate case. volume under the surface represents the probability.
For example, suppose we wish to find P(O, 1 ,,0,2, 100 ~X2:;; 200), Tnis probability would
befound by integratingj(x" x,) over Ihe region OJ "X,
,,0,2, 100:5 X, :> 200, That is,

rlOO r"~ 1 1
J100 JO•1 500
/-
,G MARGDlAL DISTRIBUTIOKS
,~

Having defined the bivariate probability distribution, sometimes called the joint probabil,
ity distribution (or in the continuous case the joint density), a natural question arises as to
the distribution of Xl Of X 2 alone. These distributions are called marginal distributions. In
the discrete case, the marginal distribution of Xl is

r
I
p,h)= LP(X',x,) 1for all
,r, -J
x, (4-2)

(4-3)

Jf~~R!~1f~;
In Example 4-2 we considered the joint discrete distribution shov.~ in Fig, 4-5. The marginal distri-
butions are shown in Fig. 4-7. We see that [Xl' Pl(X;)] is a univariate distribution a11d it is the distribu~
tiQn of XI (the number of mL.1.OC defects) alone. Likewise [.l'.1. p:.(~)] is a unl\<1.irlate distribution and it
is the distrib>J.lion (the number of major defects) alone.

If [X" X,J is a continuous randnm vector, the marginal distribution of X, is

(4-4)

= 2.000

F.gnrc 4-6 A bivariate uniform deosity,


76 Chapter 4 Joint Probability Distributions

Ni 0 1 2 3 4 p,(x.) I
i
1
o I 1130 liS(] 2130 8130 1/30 : 8130 I

1 illS(] 1/30 3/30 i 4/30 SIS(] I

2 11/30 2/S(] : 3130 6/30 I


1

3 ill30 3/30 : 41S(] I


i 4 I 3130 31S0 I

lP,(X,) [ 7/S(] 1713~ 8/30. 7/30 1130 ~(')"1[


(a)

P,(X,»)

La
P,('I) +
ai30 9130

! /7130
o 1 2
(b)
3
,1/30
4 X,
lCL
o 1
1
.
L
6130

2 --3:--- 4
(0)
4130
1 j3/30,
'2

Figure 4~7 Marginal distributions for discrete [XI' X:.J. (a) Marginal distributions-tabular fann, (b) Mar~
ginal dislribution (x,.p,(x,», (el Marginal distribution (x,. p,(x,),

and the marginal distribution of X, is

(4-5)
The function/, is the probability density function for Xl alone, and the fUllctionh is the den-
sity function for X2 alone.

'~§l?!~*,6)
In Example 4-4, the joint donalty 01: LX" X,] was given by

t(x"",) = 5~' O~XI <0.25,OSx, $2000,


;;;;; 0, Qilierwise.

The marginal dislributions of X, and X, are

otherwise.
and

otherwise.
These are shown graphically in Fig, 4-8,
4-3 Marginal Disu;butions 77

fl(X:~ .

LL_ o
(a)
0.25 Xl
1/2000

o
(b)
2,000 "z

Figure 4~8 Marginal distributions for bivariate uniform vector [Xl' Xzl. (a) M..arginal>:iistribution of
X" (b) Marginal distribution of X"

Sometimes the marginals do not come out so nicely. This is the case, for example, when the range space of [X1, X2] is not rectangular.

Example 4-7
Suppose that the joint density of [X1, X2] is given by

f(x1, x2) = 6x1,  0 < x1 < x2 < 1,
          = 0,    otherwise.

Then the marginal of X1 is

f1(x1) = ∫_{x1}^{1} 6x1 dx2 = 6x1(1 - x1)  for 0 < x1 < 1,

and the marginal of X2 is

f2(x2) = ∫_0^{x2} 6x1 dx1 = 3x2²  for 0 < x2 < 1.

The expected values and variances of X1 and X2 are determined from the marginal distributions exactly as in the univariate case. Where [X1, X2] is discrete, we have

E(X1) = μ1 = Σ_{x1} x1 p1(x1) = Σ_{x1} Σ_{x2} x1 p(x1, x2)   (4-6)

and

V(X1) = σ1² = Σ_{x1} (x1 - μ1)² p1(x1) = Σ_{x1} Σ_{x2} x1² p(x1, x2) - μ1²,   (4-7)

and, similarly,

E(X2) = μ2 = Σ_{x2} x2 p2(x2) = Σ_{x1} Σ_{x2} x2 p(x1, x2)   (4-8)

and

V(X2) = σ2² = Σ_{x2} (x2 - μ2)² p2(x2) = Σ_{x1} Σ_{x2} x2² p(x1, x2) - μ2².   (4-9)

Example 4-8
In Example 4-5 and Fig. 4-7, marginal distributions for X1 and X2 were given. Working with the marginal distribution of X1 shown in Fig. 4-7b, we may calculate

E(X1) = μ1 = 0·(7/30) + 1·(7/30) + 2·(8/30) + 3·(7/30) + 4·(1/30) = 8/5

and

V(X1) = σ1² = 0²·(7/30) + 1²·(7/30) + 2²·(8/30) + 3²·(7/30) + 4²·(1/30) - (8/5)² = 118/30 - 64/25 = 103/75.

The mean and variance of X2 could also be determined using the marginal distribution of X2:

E(X2) = μ2 = 0·(8/30) + 1·(9/30) + 2·(6/30) + 3·(4/30) + 4·(3/30) = 3/2

and

V(X2) = σ2² = 117/30 - (3/2)² = 33/20.
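The same verification can be pushed one step further to the moments. The sketch below (again Python with NumPy, not from the text) evaluates equations 4-6 through 4-9 from the marginals just obtained.

import numpy as np

x = np.arange(5)
p1 = np.array([7, 7, 8, 7, 1]) / 30.0   # marginal of X1 (Fig. 4-7)
p2 = np.array([8, 9, 6, 4, 3]) / 30.0   # marginal of X2

m1 = (x * p1).sum()                     # E(X1) = 8/5
v1 = (x**2 * p1).sum() - m1**2          # V(X1) = 103/75
m2 = (x * p2).sum()                     # E(X2) = 3/2
v2 = (x**2 * p2).sum() - m2**2          # V(X2) = 33/20
print(m1, v1, m2, v2)                   # 1.6  1.3733...  1.5  1.65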

Equations 4-6 through 4-9 show that the mean and variance of X1 and X2, respectively, may be determined from the marginal distributions or directly from the joint distribution. In practice, if the marginal distribution has already been determined, it is usually easier to make use of it.
In the case where [X1, X2] is continuous, then

E(X1) = μ1 = ∫_{-∞}^{+∞} x1 f1(x1) dx1 = ∫_{-∞}^{+∞} ∫_{-∞}^{+∞} x1 f(x1, x2) dx1 dx2,   (4-10)

V(X1) = σ1² = ∫_{-∞}^{+∞} (x1 - μ1)² f1(x1) dx1 = ∫_{-∞}^{+∞} x1² f1(x1) dx1 - μ1²,   (4-11)

and

E(X2) = μ2 = ∫_{-∞}^{+∞} x2 f2(x2) dx2 = ∫_{-∞}^{+∞} ∫_{-∞}^{+∞} x2 f(x1, x2) dx1 dx2,   (4-12)

V(X2) = σ2² = ∫_{-∞}^{+∞} (x2 - μ2)² f2(x2) dx2 = ∫_{-∞}^{+∞} x2² f2(x2) dx2 - μ2².   (4-13)

Again, in equations 4-10 through 4-13, observe that we may use either the marginal densities or the joint density in the calculations.

Example 4-9
In Example 4-4, the joint density of weld diameters, X1, and tensile strength, X2, was given as

f(x1, x2) = 1/500,  0 ≤ x1 < 0.25, 0 ≤ x2 ≤ 2000,
          = 0,      otherwise,

and the marginal densities for X1 and X2 were given in Example 4-6 as

f1(x1) = 4,  0 ≤ x1 < 0.25,
       = 0,  otherwise,

and

f2(x2) = 1/2000,  0 ≤ x2 ≤ 2000,
       = 0,       otherwise.

Working with the marginal densities, the mean and variance of X1 are thus

E(X1) = ∫_0^0.25 4x1 dx1 = 1/8

and

V(X1) = ∫_0^0.25 4x1² dx1 - (1/8)² = 1/48 - 1/64 = 1/192 ≈ 0.0052.
4-4 CONDITIONAL DISTRIBUTIONS

When dealing with two jointly distributed random variables, it may be of interest to find the distribution of one of these variables, given a particular value of the other. That is, we may seek the distribution of X1 given that X2 = x2. For example, what is the distribution of a person's weight given that he is a particular height? This probability distribution would be called the conditional distribution of X1 given that X2 = x2.
Suppose that the random vector [X1, X2] is discrete. From the definition of conditional probability, it is easily seen that the conditional probability distributions are

p_{X1|x2}(x1) = p(x1, x2) / p2(x2)   (4-14)

and

p_{X2|x1}(x2) = p(x1, x2) / p1(x1),   (4-15)

where p1(x1) > 0 and p2(x2) > 0.


It should be noted that there are as many conditional distributions of X2 for given x1 as there are values x1 with p1(x1) > 0, and there are as many conditional distributions of X1 for given x2 as there are values x2 with p2(x2) > 0.
Example 4-10
Consider the counting of minor and major defects of the small pumps in Example 4-2 and Fig. 4-7. There will be five conditional distributions of X2, one for each value of x1. They are shown in Fig. 4-9. The distribution p_{X2|0}(x2), for x1 = 0, is shown in Fig. 4-9a. Figure 4-9b shows the distribution p_{X2|1}(x2). Other conditional distributions could likewise be determined for x1 = 2, 3, and 4, respectively. The distribution of X2 given that X1 = 3 is

p_{X2|3}(x2) = 3/7,  x2 = 0,
             = 4/7,  x2 = 1,
             = 0,    otherwise.

If [X1, X2] is a continuous random vector, the conditional densities are

f_{X1|x2}(x1) = f(x1, x2) / f2(x2),  f2(x2) > 0,   (4-16)

and

f_{X2|x1}(x2) = f(x1, x2) / f1(x1),  f1(x1) > 0.   (4-17)

x2:                              0     1     2     3     4
p_{X2|0}(x2) = p(0, x2)/p1(0):  1/7   1/7   1/7   1/7   3/7

(a)

x2:                              0     1     2     3     4
p_{X2|1}(x2) = p(1, x2)/p1(1):  1/7   1/7   2/7   3/7    0

(b)

Figure 4-9 Some examples of conditional distributions.

Example 4-11
Suppose the joint density of [X1, X2] is the function f presented here and shown in Fig. 4-10:

f(x1, x2) = x1² + x1x2/3,  0 ≤ x1 ≤ 1, 0 ≤ x2 ≤ 2,
          = 0,             otherwise.

Figure 4-10 A bivariate density function.

The marginal densities are f1(x1) and f2(x2). These are determined as

f1(x1) = ∫_0^2 (x1² + x1x2/3) dx2 = 2x1² + 2x1/3,  0 ≤ x1 ≤ 1,
       = 0,  otherwise,

and

f2(x2) = ∫_0^1 (x1² + x1x2/3) dx1 = 1/3 + x2/6,  0 ≤ x2 ≤ 2,
       = 0,  otherwise.

The marginal densities are shown in Fig. 4-11.


The conditional densities may be determined using equations 4-16 and 4-17 as

f_{X2|x1}(x2) = (x1² + x1x2/3) / (2x1² + 2x1/3) = (3x1 + x2) / (6x1 + 2),  0 ≤ x2 ≤ 2,
              = 0,  otherwise,

and

f_{X1|x2}(x1) = (x1² + x1x2/3) / (1/3 + x2/6) = x1(3x1 + x2) / (1 + x2/2),  0 ≤ x1 ≤ 1,
              = 0,  otherwise.

Note that for f_{X2|x1}(x2), there are an infinite number of these conditional densities, one for each value 0 < x1 ≤ 1. Two of these, f_{X2|1/2}(x2) and f_{X2|1}(x2), are shown in Fig. 4-12. Also, for f_{X1|x2}(x1), there are an infinite number of these conditional densities, one for each value 0 ≤ x2 ≤ 2. Three of these are shown in Fig. 4-13.

Figure 4-11 The marginal densities for Example 4-11.
Figure 4-12 Two conditional densities, f_{X2|1/2} and f_{X2|1}, from Example 4-11.

4-5 CONDITIONAL EXPECTATION

In this section we are interested in such questions as determining a person's expected weight given that he is a particular height. More generally, we want to find the expected value of X1 given information about X2. If [X1, X2] is a discrete random vector, the conditional expectations are

E(X"x,) = }:;XtPx,lx, (xt) (4-18)


x,
and
E(X,IX1J = }:;x2PX,lz, (X2)' (4-19)
x,
Note that there will be an E(X,f<,) for each value of x,. The value of each E(X,f<,) will
depend on the value x" which is in turn govemed by the probability function. Similarly,
there will be as many values of E(X,Ix,) as there are values X" and the value of will E(X,Ix,)
X,
depend on the value determined by the probability function.
L5 Conditional Expectation 83

Example 4-12
Consider the probability distribution of the discrete random vector [X1, X2], where X1 represents the number of orders for a large turbine in July and X2 represents the number of orders in August. The joint distribution as well as the marginal distributions are given in Fig. 4-14. We consider the three conditional distributions p_{X2|0}, p_{X2|1}, and p_{X2|2}, and the conditional expected values of each:

x2\x1  |  0     1     2   | p2(x2)
  0    | 0.05  0.05  0.10 |  0.2
  1    | 0.10  0.25  0.05 |  0.4
  2    | 0.10  0.15  0.05 |  0.3
  3    | 0.05  0.05  0.00 |  0.1
p1(x1) | 0.3   0.5   0.2  |

Figure 4-14 Joint and marginal distributions of [X1, X2]. Values in the body of the table are p(x1, x2).

p_{X2|0}(x2) = 1/6, 1/3, 1/3, 1/6 for x2 = 0, 1, 2, 3, and 0 otherwise;
p_{X2|1}(x2) = 1/10, 1/2, 3/10, 1/10 for x2 = 0, 1, 2, 3, and 0 otherwise;
p_{X2|2}(x2) = 1/2, 1/4, 1/4, 0 for x2 = 0, 1, 2, 3, and 0 otherwise.

The conditional expected values are

E(X2|0) = 0·(1/6) + 1·(1/3) + 2·(1/3) + 3·(1/6) = 1.5,
E(X2|1) = 0·(1/10) + 1·(1/2) + 2·(3/10) + 3·(1/10) = 1.4,
E(X2|2) = 0·(1/2) + 1·(1/4) + 2·(1/4) + 3·0 = 0.75.
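A short Python check of these conditional calculations (a sketch, not part of the original text) divides each column of Fig. 4-14 by its marginal, per equation 4-15, and evaluates the three conditional means.

import numpy as np

# p[x2, x1] from Fig. 4-14; columns are x1 = 0, 1, 2 and rows are x2 = 0..3.
p = np.array([[0.05, 0.05, 0.10],
              [0.10, 0.25, 0.05],
              [0.10, 0.15, 0.05],
              [0.05, 0.05, 0.00]])
x2 = np.arange(4)
p1 = p.sum(axis=0)                       # marginal of X1: [0.3, 0.5, 0.2]
for x1 in range(3):
    cond = p[:, x1] / p1[x1]             # p_{X2|x1}(x2), equation 4-15
    print(x1, cond, (x2 * cond).sum())   # E(X2|x1): 1.5, 1.4, 0.75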

If [X1, X2] is a continuous random vector, the conditional expectations are

E(X1|x2) = ∫_{-∞}^{+∞} x1 f_{X1|x2}(x1) dx1   (4-20)

and

E(X2|x1) = ∫_{-∞}^{+∞} x2 f_{X2|x1}(x2) dx2,   (4-21)

and in each case there will be an infinite number of values that the expected value may take. In equation 4-20, there will be one value of E(X1|x2) for each value x2, and in equation 4-21, there will be one value of E(X2|x1) for each value x1.

Example 4-13
In Example 4-11 we considered the joint density f, where

f(x1, x2) = x1² + x1x2/3,  0 ≤ x1 ≤ 1, 0 ≤ x2 ≤ 2,
          = 0,             otherwise.

The conditional densities were

f_{X2|x1}(x2) = (3x1 + x2) / (6x1 + 2),  0 ≤ x2 ≤ 2,

and

f_{X1|x2}(x1) = x1(3x1 + x2) / (1 + x2/2),  0 ≤ x1 ≤ 1.

Then, using equation 4-21, E(X2|x1) is determined as

E(X2|x1) = ∫_0^2 x2 (3x1 + x2)/(6x1 + 2) dx2 = (6x1 + 8/3)/(6x1 + 2) = (9x1 + 4)/(9x1 + 3).

It should be noted that this is a function of x1. For the two conditional densities shown in Fig. 4-12, where x1 = 1/2 and x1 = 1, the corresponding expected values are E(X2|1/2) = 17/15 and E(X2|1) = 13/12.
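For readers who want a numerical cross-check, the sketch below (Python with SciPy assumed; not part of the text) integrates the joint density numerically and compares with the closed form (9x1 + 4)/(9x1 + 3).

from scipy.integrate import quad

f = lambda x1, x2: x1**2 + x1 * x2 / 3.0          # joint density, Example 4-11

def cond_mean_x2(x1):
    f1 = quad(lambda x2: f(x1, x2), 0, 2)[0]      # marginal f1(x1)
    num = quad(lambda x2: x2 * f(x1, x2), 0, 2)[0]
    return num / f1                               # equation 4-21

for x1 in (0.5, 1.0):
    print(cond_mean_x2(x1), (9 * x1 + 4) / (9 * x1 + 3))   # 17/15 and 13/12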

Since E(X2|x1) is a function of x1, and x1 is a realization of the random variable X1, E(X2|X1) is a random variable, and we may consider the expected value of E(X2|X1), that is, E[E(X2|X1)]. The inner operator is the expectation of X2 given X1 = x1, and the outer expectation is with respect to the marginal density of X1. This suggests the following "double expectation" result.

Theorem 4-1

E[E(X2|X1)] = E(X2) = μ2   (4-22)

and

E[E(X1|X2)] = E(X1) = μ1.   (4-23)

Proof  Suppose that X1 and X2 are continuous random variables. (The discrete case is similar.) Since the random variable E(X2|X1) is a function of X1, the law of the unconscious statistician (see Section 3-5) says that

E[E(X2|X1)] = ∫_{-∞}^{+∞} E(X2|x1) f1(x1) dx1
            = ∫_{-∞}^{+∞} [∫_{-∞}^{+∞} x2 f_{X2|x1}(x2) dx2] f1(x1) dx1
            = ∫_{-∞}^{+∞} ∫_{-∞}^{+∞} x2 f(x1, x2) dx1 dx2
            = ∫_{-∞}^{+∞} x2 f2(x2) dx2
            = E(X2),

which is equation 4-22. Equation 4-23 is derived similarly, and the proof is complete.
Example 4-14
Consider yet again the joint probability density function from Example 4-11,

f(x1, x2) = x1² + x1x2/3,  0 ≤ x1 ≤ 1, 0 ≤ x2 ≤ 2,
          = 0,             otherwise.

In that example, we derived expressions for the marginal densities f1(x1) and f2(x2). We also derived E(X2|x1) = (9x1 + 4)/(9x1 + 3) in Example 4-13. Thus,

E[E(X2|X1)] = ∫_0^1 [(9x1 + 4)/(9x1 + 3)](2x1² + 2x1/3) dx1 = ∫_0^1 (2x1² + 8x1/9) dx1 = 10/9.

Note that this is also E(X2), since

E(X2) = ∫_0^2 x2 f2(x2) dx2 = ∫_0^2 x2 (1/3 + x2/6) dx2 = 10/9.

The variance operator may be applied to conditional distributions exactly as in the univariate case.

4-6 REGRESSION OF THE MEAN


It has been observed previously that E(X2|x1) is a value of the random variable E(X2|X1) for a particular X1 = x1, and that it is a function of x1. The graph of this function is called the regression of X2 on X1. Alternatively, the function E(X1|x2) would be called the regression of X1 on X2. This is demonstrated in Fig. 4-15.

Example 4-15
In Example 4-13 we found E(X2|x1) for the bivariate density of Example 4-11, that is,

f(x1, x2) = x1² + x1x2/3,  0 ≤ x1 ≤ 1, 0 ≤ x2 ≤ 2,
          = 0,             otherwise.

The result was

E(X2|x1) = (9x1 + 4)/(9x1 + 3).

In a like manner, we may find

E(X1|x2) = ∫_0^1 x1 · x1(3x1 + x2)/(1 + x2/2) dx1 = (3/4 + x2/3)/(1 + x2/2) = (9 + 4x2)/(12 + 6x2).

Figure 4-15 Some regression curves. (a) Regression of X2 on X1. (b) Regression of X1 on X2.

Regression will be discussed further in Chapters 14 and 15.
4-7 INDEPENDENCE OF RANDOM VARIABLES

The notions of independence and independent random variables are very useful and important statistical concepts. In Chapter 1 the idea of independent events was introduced and a formal definition of this concept was presented. We are now concerned with defining independent random variables. When the outcome of one variable, say X1, does not influence the outcome of X2, and vice versa, we say the random variables X1 and X2 are independent.

Definition
1. If [X1, X2] is a discrete random vector, then we say that X1 and X2 are independent if and only if

p(x1, x2) = p1(x1) · p2(x2)   (4-24)

for all x1 and x2.

2. If [X1, X2] is a continuous random vector, then we say that X1 and X2 are independent if and only if

f(x1, x2) = f1(x1) · f2(x2)   (4-25)

for all x1 and x2.


Utilizing this definition and the properties of conditional probability distributions, we may extend the concept of independence to a theorem.

Theorem 4-2
1. Let [X1, X2] be a discrete random vector. Then

p_{X1|x2}(x1) = p1(x1)

and

p_{X2|x1}(x2) = p2(x2)

for all x1 and x2 if and only if X1 and X2 are independent.

2. Let [X1, X2] be a continuous random vector. Then

f_{X1|x2}(x1) = f1(x1)

and

f_{X2|x1}(x2) = f2(x2)

for all x1 and x2 if and only if X1 and X2 are independent.

Proof  We consider here only the continuous case. We see that

f_{X1|x2}(x1) = f1(x1)  for all x1 and x2

if and only if

f(x1, x2)/f2(x2) = f1(x1),

if and only if

f(x1, x2) = f1(x1) f2(x2)  for all x1 and x2,

if and only if X1 and X2 are independent, and we are done.

Note that the requirement for the joint distribution to be factorable into the respective marginal distributions is somewhat similar to the requirement that, for independent events, the probability of the intersection of events equals the product of the event probabilities.

Example 4-16
A city transit service receives calls from broken-down buses, and a wrecker crew must haul the buses in for service. The joint distribution of the number of calls received on Mondays and Tuesdays is given in Fig. 4-16, along with the marginal distributions. The variable X1 represents the number of calls on Mondays and X2 represents the number of calls on Tuesdays. A quick inspection will show that X1 and X2 are independent, since the joint probabilities are the products of the appropriate marginal probabilities.
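The inspection can be automated: the following small sketch (Python/NumPy, not from the text) verifies that the joint table of Fig. 4-16 equals the outer product of its marginals, which is exactly condition 4-24.

import numpy as np

# p[x2, x1] from Fig. 4-16; rows are x2 = 0..4, columns are x1 = 0..4.
p = np.array([[0.02, 0.04, 0.06, 0.04, 0.04],
              [0.02, 0.04, 0.06, 0.04, 0.04],
              [0.01, 0.02, 0.03, 0.02, 0.02],
              [0.04, 0.08, 0.12, 0.08, 0.08],
              [0.01, 0.02, 0.03, 0.02, 0.02]])
p1 = p.sum(axis=0)                        # [0.1, 0.2, 0.3, 0.2, 0.2]
p2 = p.sum(axis=1)                        # [0.2, 0.2, 0.1, 0.4, 0.1]
print(np.allclose(p, np.outer(p2, p1)))   # True -> X1 and X2 independent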

4-8 COVARIANCE AND CORRELATION

We have noted that E(X1) = μ1 and V(X1) = σ1² are the mean and variance of X1. They may be determined from the marginal distribution of X1. In a similar manner, μ2 and σ2² are the mean and variance of X2. Two measures used in describing the degree of association between X1 and X2 are the covariance of [X1, X2] and the correlation coefficient.

x2\x1  |  0     1     2     3     4   | p2(x2)
  0    | 0.02  0.04  0.06  0.04  0.04 |  0.2
  1    | 0.02  0.04  0.06  0.04  0.04 |  0.2
  2    | 0.01  0.02  0.03  0.02  0.02 |  0.1
  3    | 0.04  0.08  0.12  0.08  0.08 |  0.4
  4    | 0.01  0.02  0.03  0.02  0.02 |  0.1
p1(x1) | 0.1   0.2   0.3   0.2   0.2  |

Figure 4-16 Joint probabilities for wrecker calls.


Definition
If [X1, X2] is a two-dimensional random variable, the covariance, denoted σ12, is

Cov(X1, X2) = σ12 = E[(X1 - E(X1))(X2 - E(X2))]   (4-26)

and the correlation coefficient, denoted ρ, is

ρ = Cov(X1, X2) / [√V(X1) · √V(X2)] = σ12 / (σ1 σ2).   (4-27)

The covariance is measured in the units of X1 times the units of X2. The correlation coefficient is a dimensionless quantity that measures the linear association between two random variables. By performing the multiplication operations in equation 4-26 before distributing the outside expected value operator across the resulting quantities, we obtain an alternative form for the covariance as follows:

Cov(X1, X2) = E(X1X2) - E(X1) · E(X2).   (4-28)

Theorem 4-3
If X1 and X2 are independent, then ρ = 0.

Proof  We again prove the result for the continuous case. If X1 and X2 are independent,

E(X1 · X2) = ∫_{-∞}^{+∞} ∫_{-∞}^{+∞} x1 x2 f(x1, x2) dx1 dx2
           = ∫_{-∞}^{+∞} ∫_{-∞}^{+∞} x1 f1(x1) · x2 f2(x2) dx1 dx2
           = [∫_{-∞}^{+∞} x1 f1(x1) dx1] · [∫_{-∞}^{+∞} x2 f2(x2) dx2]
           = E(X1) · E(X2).

Thus Cov(X1, X2) = 0 from equation 4-28, and ρ = 0 from equation 4-27. A similar argument would be used for [X1, X2] discrete.

The converse of the theorem is not necessarily true, and we may have ρ = 0 without the variables being independent. If ρ = 0, the random variables are said to be uncorrelated.

Theorem 4-4
The value of ρ will be on the interval [-1, +1], that is,

-1 ≤ ρ ≤ +1.

Proof  Consider the function Q defined below and illustrated in Fig. 4-17:

Q(t) = E[(X1 - E(X1)) + t(X2 - E(X2))]²
     = E[X1 - E(X1)]² + 2tE[(X1 - E(X1))(X2 - E(X2))] + t²E[X2 - E(X2)]².

Since Q(t) ≥ 0, the discriminant of Q(t) must be ≤ 0, so that

{2E[(X1 - E(X1))(X2 - E(X2))]}² - 4E[X1 - E(X1)]² E[X2 - E(X2)]² ≤ 0.

It follows that

4[Cov(X1, X2)]² - 4V(X1) · V(X2) ≤ 0,

so

[Cov(X1, X2)]² / [V(X1)V(X2)] ≤ 1

and

-1 ≤ ρ ≤ +1.   (4-29)

Figure 4-17 The quadratic Q(t).
A correlation of ρ ≈ +1 indicates a "high positive correlation," such as one might find between the stock prices of Ford and General Motors (two related companies). On the other hand, a "high negative correlation," ρ ≈ -1, might exist between snowfall and temperature.

Example 4-17
Recall Example 4-7, which examined a continuous random vector [X1, X2] with the joint probability density function

f(x1, x2) = 6x1,  0 < x1 < x2 < 1,
          = 0,    otherwise.

The marginal of X1 is

f1(x1) = 6x1(1 - x1),  for 0 < x1 < 1,
       = 0,            otherwise,

and the marginal of X2 is

f2(x2) = 3x2²,  for 0 < x2 < 1,
       = 0,     otherwise.

These facts yield the following results:

E(X1) = ∫_0^1 6x1²(1 - x1) dx1 = 1/2,
E(X1²) = ∫_0^1 6x1³(1 - x1) dx1 = 3/10,
V(X1) = E(X1²) - [E(X1)]² = 1/20,
E(X2) = ∫_0^1 3x2³ dx2 = 3/4,
E(X2²) = ∫_0^1 3x2⁴ dx2 = 3/5,
V(X2) = E(X2²) - [E(X2)]² = 3/80.

Further, we have

E(X1X2) = ∫_0^1 ∫_0^{x2} 6x1² x2 dx1 dx2 = ∫_0^1 2x2⁴ dx2 = 2/5.

This implies that

Cov(X1, X2) = E(X1X2) - E(X1)E(X2) = 2/5 - (1/2)(3/4) = 1/40,

and then

ρ = Cov(X1, X2) / √(V(X1) V(X2)) = (1/40) / √((1/20)(3/80)) = 1/√3 ≈ 0.577.
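A Monte Carlo check (a sketch, not the book's method) can confirm this value of ρ: sample X2 from f2(x2) = 3x2² by the inverse transform u^(1/3), then X1 given x2 from the conditional density 2x1/x2², and compute the sample correlation.

import numpy as np

rng = np.random.default_rng(1)
n = 200_000
u, v = rng.random(n), rng.random(n)
x2 = u ** (1.0 / 3.0)      # inverse transform for f2(x2) = 3*x2**2
x1 = x2 * np.sqrt(v)       # given x2, f(x1|x2) = 2*x1/x2**2 on (0, x2)
print(np.corrcoef(x1, x2)[0, 1])   # close to 1/sqrt(3), about 0.577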

Example 4-18
A continuous random vector [X1, X2] has density function f as given below:

f(x1, x2) = 1,  -x2 < x1 < +x2, 0 < x2 < 1,
          = 0,  otherwise.

This function is shown in Fig. 4-18.
The marginal densities are

f1(x1) = 1 - x1,  for 0 ≤ x1 < 1,
       = 1 + x1,  for -1 < x1 < 0,
       = 0,       otherwise,

and

f2(x2) = 2x2,  for 0 < x2 < 1,
       = 0,    otherwise.

Figure 4-18 A joint density from Example 4-18.


Since f(x1, x2) ≠ f1(x1) · f2(x2), the variables are not independent. If we calculate the covariance, we obtain

Cov(X1, X2) = E(X1X2) - E(X1)E(X2) = 0 - (0)·(2/3) = 0,

and thus ρ = 0, so the variables are uncorrelated although they are not independent.

Finally, it is noted that if X2 is related to X1 linearly, that is, X2 = A + BX1, then ρ² = 1. If B > 0, then ρ = +1; and if B < 0, ρ = -1. Thus, as we observed earlier, the correlation coefficient is a measure of linear association between two random variables.

4-9 THE DISTRIBUTION FUNCTION FOR TWO-DIMENSIONAL RANDOM VARIABLES

The distribution function of the random vector [X1, X2] is F, where

F(x1, x2) = P(X1 ≤ x1, X2 ≤ x2).   (4-30)

This is the probability over the shaded region in Fig. 4-19.
If [X1, X2] is discrete, then

F(x1, x2) = Σ_{t1 ≤ x1} Σ_{t2 ≤ x2} p(t1, t2),   (4-31)

and if [X1, X2] is continuous, then

F(x1, x2) = ∫_{-∞}^{x2} ∫_{-∞}^{x1} f(t1, t2) dt1 dt2.   (4-32)

Example 4-19
Suppose X1 and X2 have the following density:

f(x1, x2) = 24x1x2,  x1 > 0, x2 > 0, x1 + x2 < 1,
          = 0,       otherwise.

Looking at the Euclidean plane shown in Fig. 4-20, we see several cases that must be considered.

1. x1 ≤ 0: F(x1, x2) = 0.
2. x2 ≤ 0: F(x1, x2) = 0.

Figure 4-19 Domain of integration or summation for F(x1, x2).
Figure 4-20 The domain of F, Example 4-19.

3. x1 > 0, x2 > 0, x1 + x2 < 1:

F(x1, x2) = ∫_0^{x2} ∫_0^{x1} 24 t1 t2 dt1 dt2 = 6x1²x2².

4. 0 < x1 < 1, 0 < x2 < 1, x1 + x2 ≥ 1:

F(x1, x2) = (6x1² - 8x1³ + 3x1⁴) + (6x2² - 8x2³ + 3x2⁴) - 1.

5. 0 < x1 < 1 and x2 ≥ 1:

F(x1, x2) = ∫_0^{x1} ∫_0^{1-t1} 24 t1 t2 dt2 dt1 = 6x1² - 8x1³ + 3x1⁴.

6. 0 < x2 < 1 and x1 ≥ 1: by symmetry, F(x1, x2) = 6x2² - 8x2³ + 3x2⁴.
7. x1 ≥ 1 and x2 ≥ 1: F(x1, x2) = 1.

The function F has properties analogous to those discussed in the one-dimensional case. We note that when X1 and X2 are continuous,

∂²F(x1, x2)/∂x1 ∂x2 = f(x1, x2)

if the derivatives exist.

4-10 FUNCTIONS OF TWO RANDOM VARIABLES

Often we will be interested in functions of several random variables; however, at present, this section will concentrate on functions of two random variables, say Y = H(X1, X2). Since X1 = X1(e) and X2 = X2(e), we see that Y = H[X1(e), X2(e)] clearly depends on the outcome of the original experiment, and thus Y is a random variable with range space RY.
The problem of finding the distribution of Y is somewhat more involved than in the case of functions of one variable; however, if [X1, X2] is discrete, the procedure is straightforward if X1 and X2 take on a relatively small number of values.

Example 4-20
If X1 represents the number of defective units produced by machine No. 1 in 1 hour and X2 represents the number of defective units produced by machine No. 2 in the same hour, then the joint distribution might be presented as in Fig. 4-21. Furthermore, suppose the random variable Y = H(X1, X2), where H(x1, x2) = 2x1 + x2. It follows that RY = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. In order to determine, say, P(Y = 0) = pY(0), we note that Y = 0 if and only if X1 = 0 and X2 = 0; therefore, pY(0) = 0.02.
We note that Y = 1 if and only if X1 = 0 and X2 = 1; therefore pY(1) = 0.06. We also note that Y = 2 if and only if either X1 = 0, X2 = 2 or X1 = 1, X2 = 0; so pY(2) = 0.10 + 0.03 = 0.13. Using similar logic, we obtain the rest of the distribution, as follows:

y         | pY(y)
0         | 0.02
1         | 0.06
2         | 0.13
3         | 0.11
4         | 0.19
5         | 0.15
6         | 0.21
7         | 0.07
8         | 0.05
9         | 0.01
otherwise | 0
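The same bookkeeping is easy to mechanize; the following Python sketch (not from the text) folds the joint table of Fig. 4-21 into the distribution of Y = 2X1 + X2.

from collections import defaultdict

# p(x1, x2) from Fig. 4-21; keys are (x1, x2).
p = {(0,0):0.02,(1,0):0.03,(2,0):0.04,(3,0):0.01,
     (0,1):0.06,(1,1):0.09,(2,1):0.12,(3,1):0.03,
     (0,2):0.10,(1,2):0.15,(2,2):0.20,(3,2):0.05,
     (0,3):0.02,(1,3):0.03,(2,3):0.04,(3,3):0.01}

py = defaultdict(float)
for (x1, x2), prob in p.items():
    py[2 * x1 + x2] += prob        # Y = H(X1, X2) = 2*X1 + X2
print(sorted(py.items()))          # matches the table, e.g., pY(4) = 0.19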
In the case where the random vector is continuous with joint density function f(x1, x2), and H(x1, x2) is continuous, then Y = H(X1, X2) is a continuous, one-dimensional random variable. The general procedure for the determination of the density function of Y is outlined below.
1. We are given y = H1(x1, x2).
2. Introduce a second random variable Z = H2(X1, X2). The function H2 is selected for convenience, but we want to be able to solve y = H1(x1, x2) and z = H2(x1, x2) for x1 and x2 in terms of y and z.
3. Find x1 = G1(y, z) and x2 = G2(y, z).
4. Find the following partial derivatives (we assume they exist and are continuous):

∂x1/∂y, ∂x1/∂z, ∂x2/∂y, ∂x2/∂z.

x2\x1  |  0     1     2     3   | p2(x2)
  0    | 0.02  0.03  0.04  0.01 |  0.1
  1    | 0.06  0.09  0.12  0.03 |  0.3
  2    | 0.10  0.15  0.20  0.05 |  0.5
  3    | 0.02  0.03  0.04  0.01 |  0.1
p1(x1) | 0.2   0.3   0.4   0.1  |

Figure 4-21 Joint distribution of defectives produced on two machines, p(x1, x2).
5. The joint density of [Y, Z], denoted t(y, z), is found as follows:

t(y, z) = f(G1(y, z), G2(y, z)) · |J(y, z)|,   (4-33)

where J(y, z), called the Jacobian of the transformation, is given by the determinant

J(y, z) = | ∂x1/∂y  ∂x1/∂z |
          | ∂x2/∂y  ∂x2/∂z |.   (4-34)

6. The density of Y, say gY, is then found as

gY(y) = ∫_{-∞}^{+∞} t(y, z) dz.   (4-35)

Example 4-21
Consider the continuous random vector [X1, X2] with the following density:

f(x1, x2) = 4e^{-2(x1 + x2)},  x1 > 0, x2 > 0,
          = 0,                 otherwise.

Suppose we are interested in the distribution of Y = X1/X2. We will let y = x1/x2 and choose z = x1 + x2, so that x1 = yz/(1 + y) and x2 = z/(1 + y). It follows that

∂x1/∂y = z/(1 + y)²   and   ∂x1/∂z = y/(1 + y),
∂x2/∂y = -z/(1 + y)²  and   ∂x2/∂z = 1/(1 + y).

Therefore,

J(y, z) = [z/(1 + y)²]·[1/(1 + y)] + [y/(1 + y)]·[z/(1 + y)²] = z/(1 + y)²,

and

f[G1(y, z), G2(y, z)] = 4e^{-2[yz/(1+y) + z/(1+y)]} = 4e^{-2z}.

Thus,

t(y, z) = 4e^{-2z} · z/(1 + y)²,

and

gY(y) = [1/(1 + y)²] ∫_0^∞ 4z e^{-2z} dz = 1/(1 + y)²,  y > 0,
      = 0,  otherwise.
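As a plausibility check (not part of the original example), note that X1 and X2 here are independent exponential variables with mean 1/2, so the result gY(y) = (1 + y)^{-2} can be tested by simulation against the cdf GY(y0) = y0/(1 + y0).

import numpy as np

rng = np.random.default_rng(7)
n = 500_000
x1 = rng.exponential(scale=0.5, size=n)   # density 2*exp(-2x), mean 1/2
x2 = rng.exponential(scale=0.5, size=n)
y = x1 / x2
for y0 in (0.5, 1.0, 3.0):
    # empirical P(Y <= y0) versus the closed form y0/(1 + y0)
    print(np.mean(y <= y0), y0 / (1 + y0))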

4-11 JOINT DISTRIBUTIONS OF DIMENSION n > 2

Should we have three or more random variables, the random vector will be denoted [X1, X2, ..., Xn], and extensions will follow from the two-dimensional case. We will assume that the variables are continuous; however, the results may readily be extended to the discrete case by substituting the appropriate summation operations for integrals. We assume the existence of a joint density f such that

f(x1, x2, ..., xn) ≥ 0 for all (x1, x2, ..., xn)   (4-36)

and

∫_{-∞}^{+∞} ··· ∫_{-∞}^{+∞} f(x1, x2, ..., xn) dx1 dx2 ··· dxn = 1.

Thus,

F(x1, x2, ..., xn) = P(X1 ≤ x1, X2 ≤ x2, ..., Xn ≤ xn) = ∫_{-∞}^{xn} ··· ∫_{-∞}^{x1} f(t1, t2, ..., tn) dt1 ··· dtn.   (4-37)

The marginal densities are determined as follows:

f1(x1) = ∫_{-∞}^{+∞} ··· ∫_{-∞}^{+∞} f(x1, x2, ..., xn) dx2 dx3 ··· dxn,
f2(x2) = ∫_{-∞}^{+∞} ··· ∫_{-∞}^{+∞} f(x1, x2, ..., xn) dx1 dx3 ··· dxn,
...
fn(xn) = ∫_{-∞}^{+∞} ··· ∫_{-∞}^{+∞} f(x1, x2, ..., xn) dx1 dx2 ··· dx_{n-1}.

The integration is over all variables having a subscript different from the one for which the marginal density is required.

Definition
The variables [X1, X2, ..., Xn] are independent random variables if and only if, for all [x1, x2, ..., xn],

f(x1, x2, ..., xn) = f1(x1) · f2(x2) · ... · fn(xn).   (4-38)

The expected value of, say, X1 is

μ1 = E(X1) = ∫_{-∞}^{+∞} ··· ∫_{-∞}^{+∞} x1 · f(x1, x2, ..., xn) dx1 dx2 ··· dxn,   (4-39)

and the variance is

σ1² = V(X1) = ∫_{-∞}^{+∞} ··· ∫_{-∞}^{+∞} (x1 - μ1)² · f(x1, x2, ..., xn) dx1 dx2 ··· dxn.   (4-40)

We recognize these as the mean and variance, respectively, of the marginal distribution of X1.
In the two-dimensional case considered earlier, geometric interpretations were instructive; however, in dealing with n-dimensional random vectors, the range space is the Euclidean n-space, and graphical presentations are thus not possible. The marginal distributions are, however, in one dimension, and the conditional distribution for one variable given values for the other variables is in one dimension. The conditional distribution of X1 given values (x2, x3, ..., xn) is denoted

f_{X1|x2, ..., xn}(x1) = f(x1, x2, ..., xn) / f_{X2, ..., Xn}(x2, ..., xn),   (4-41)

where f_{X2, ..., Xn} is the joint marginal density of [X2, ..., Xn], and the expected value of X1 for given (x2, x3, ..., xn) is

E(X1|x2, x3, ..., xn) = ∫_{-∞}^{+∞} x1 · f_{X1|x2, ..., xn}(x1) dx1.   (4-42)
The hypothetical graph of E(X1|x2, x3, ..., xn) as a function of the vector [x2, x3, ..., xn] is called the regression of X1 on (X2, X3, ..., Xn).

4-12 LINEAR COMBINATIONS

The consideration of general functions of random variables, say X1, X2, ..., Xn, is beyond the scope of this text. However, there is one particular function of the form Y = H(X1, ..., Xn), where

Y = a0 + a1X1 + a2X2 + ··· + anXn,   (4-43)

that is of interest. The ai are real constants for i = 0, 1, 2, ..., n. This is called a linear combination of the variables X1, X2, ..., Xn. A special situation occurs when a0 = 0 and a1 = a2 = ··· = an = 1, in which case we have a sum Y = X1 + X2 + ··· + Xn.

Example 4-22
Four resistors are connected in series as shown in Fig. 4-22. Each resistor has a resistance that is a random variable. The resistance of the assembly may be denoted Y, where Y = X1 + X2 + X3 + X4.

Example 4-23
Two parts are to be assembled as shown in Fig. 4-23. The clearance can be expressed as Y = X1 - X2 or Y = (1)X1 + (-1)X2; of course, a negative clearance would mean interference. This is a linear combination with a0 = 0, a1 = 1, and a2 = -1.

Example 4-24
A sample of 10 items is randomly selected from the output of a process that manufactures a small shaft used in electric fan motors, and the diameters are to be measured, with a value called the sample mean calculated as

X̄ = (1/10)(X1 + X2 + ··· + X10).

The value X̄ = (1/10)X1 + (1/10)X2 + ··· + (1/10)X10 is a linear combination with a0 = 0 and a1 = a2 = ··· = a10 = 1/10.

Figure 4-22 Resistors in series.
Figure 4-23 A simple assembly.



Let us next consider how to determine the mean and variance of linear combinations. Consider the sum of two random variables,

Y = X1 + X2.   (4-44)

The mean of Y, μY = E(Y), is given as

E(Y) = E(X1) + E(X2) = μ1 + μ2.   (4-45)

However, the variance calculation is not so obvious:

V(Y) = E[Y - E(Y)]² = E(Y²) - [E(Y)]²
     = E[(X1 + X2)²] - [E(X1 + X2)]²
     = E[X1² + 2X1X2 + X2²] - [E(X1) + E(X2)]²
     = E(X1²) + 2E(X1X2) + E(X2²) - [E(X1)]² - 2E(X1) · E(X2) - [E(X2)]²
     = {E(X1²) - [E(X1)]²} + {E(X2²) - [E(X2)]²} + 2[E(X1X2) - E(X1) · E(X2)]
     = V(X1) + V(X2) + 2Cov(X1, X2),

or

σY² = σ1² + σ2² + 2σ12.   (4-46)

These results generalize to any linear combination

Y = a0 + a1X1 + a2X2 + ··· + anXn   (4-47)

as follows:

E(Y) = a0 + Σ_{i=1}^n ai E(Xi) = a0 + Σ_{i=1}^n ai μi,   (4-48)

where E(Xi) = μi, and

V(Y) = Σ_{i=1}^n ai² V(Xi) + Σ_{i=1}^n Σ_{j=1, j≠i}^n ai aj Cov(Xi, Xj),   (4-49)

or

σY² = Σ_{i=1}^n ai² σi² + Σ_{i=1}^n Σ_{j=1, j≠i}^n ai aj σij.

If the variables are independent, the expression for the variance of Y is greatly simplified, as all the covariance terms are zero. In this situation the variance of Y is simply

V(Y) = Σ_{i=1}^n ai² V(Xi),   (4-50)

or

σY² = Σ_{i=1}^n ai² σi².
Example 4-25
In Example 4-22, four resistors were connected in series so that Y = X1 + X2 + X3 + X4 was the resistance of the assembly, where X1 was the resistance of the first resistor, and so on. The mean and variance of Y in terms of the means and variances of the components may be easily calculated. If the resistors are selected randomly for the assembly, it is reasonable to assume that X1, X2, X3, and X4 are independent.
We have said nothing yet about the distribution of Y; however, given the mean and variance of X1, X2, X3, and X4, we may readily calculate the mean and the variance of Y, since the variables are independent:

E(Y) = μ1 + μ2 + μ3 + μ4   and   V(Y) = σ1² + σ2² + σ3² + σ4².

Example 4-26
In Example 4-23, where two components were to be assembled, suppose the joint distribution of [X1, X2] is

f(x1, x2) = 8e^{-(2x1 + 4x2)},  x1 ≥ 0, x2 ≥ 0,
          = 0,                  otherwise.

Since f(x1, x2) can be easily factored as

f(x1, x2) = [2e^{-2x1}] · [4e^{-4x2}] = f1(x1) · f2(x2),

X1 and X2 are independent. Furthermore, E(X1) = μ1 = 1/2 and E(X2) = μ2 = 1/4. We may calculate the variances

V(X1) = ∫_0^∞ x1² · 2e^{-2x1} dx1 - (1/2)² = 1/4

and

V(X2) = ∫_0^∞ x2² · 4e^{-4x2} dx2 - (1/4)² = 1/16.

Therefore, the mean and variance of the clearance Y = X1 - X2 are

E(Y) = μ1 - μ2 = 1/2 - 1/4 = 1/4

and

V(Y) = (1)²V(X1) + (-1)²V(X2) = 1/4 + 1/16 = 5/16.
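A short simulation sketch (Python/NumPy assumed; not the book's) reproduces these two numbers.

import numpy as np

rng = np.random.default_rng(3)
n = 500_000
x1 = rng.exponential(scale=1/2, size=n)   # f1(x1) = 2*exp(-2*x1)
x2 = rng.exponential(scale=1/4, size=n)   # f2(x2) = 4*exp(-4*x2)
y = x1 - x2                               # clearance Y = X1 - X2
print(y.mean(), y.var())                  # about 0.25 and 0.3125 (1/4, 5/16)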

Example 4-27
In Example 4-24, we might expect the random variables X1, X2, ..., X10 to be independent because of the random sampling process. Furthermore, the distribution for each variable Xi is identical. This is shown in Fig. 4-24. In the earlier example, the linear combination of interest was the sample mean

X̄ = (1/10)(X1 + X2 + ··· + X10).

It follows that

E(X̄) = μ_X̄ = (1/10)·E(X1) + (1/10)·E(X2) + ··· + (1/10)·E(X10)
      = (1/10)μ + (1/10)μ + ··· + (1/10)μ
      = μ.

Furthermore, since the variables are independent,

V(X̄) = (1/10)²·V(X1) + (1/10)²·V(X2) + ··· + (1/10)²·V(X10) = 10·(σ²/100) = σ²/10.

Figure 4-24 Some identical distributions.
4-13 MOMENT-GENERATING FUNCTIONS AND LINEAR COMBINATIONS

In the case where Y = aX, it is easy to show that

MY(t) = MX(at).   (4-51)

For sums of independent random variables Y = X1 + X2 + ··· + Xn,

MY(t) = MX1(t) · MX2(t) · ... · MXn(t).   (4-52)

This property has considerable use in statistics. If the linear combination is of the general form Y = a0 + a1X1 + ··· + anXn and the variables X1, ..., Xn are independent, then

MY(t) = e^{a0·t}[MX1(a1t) · MX2(a2t) · ... · MXn(ant)].

Linear combinations are to be of particular significance in later chapters, and we will discuss them again at greater length.

4-14 THE LAW OF LARGE NUMBERS

A special case arises in dealing with sums of independent random variables where each variable may take only two values, 0 and 1. Consider the following formulation. An experiment ℰ consists of n independent experiments (trials) ℰj, j = 1, 2, ..., n. There are only two outcomes, success (S) and failure (F), on each trial, so that the sample space is 𝒮j = {S, F}. The probabilities

P(S) = p

and

P(F) = 1 - p = q

remain constant for j = 1, 2, ..., n. We let

Xj = 0,  if the jth trial results in failure,
   = 1,  if the jth trial results in success,

and

Y = X1 + X2 + ··· + Xn.

Thus Y represents the number of successes in n trials, and Y/n is an approximation (or estimator) for the unknown probability p. For convenience, we will let p̂ = Y/n. Note that this value corresponds to the quantity used in the relative frequency definition of Chapter 1.
The law of large numbers states that

P(|p̂ - p| ≤ ε) ≥ 1 - p(1 - p)/(nε²),   (4-53)

or, equivalently,

P(|p̂ - p| > ε) ≤ p(1 - p)/(nε²).   (4-54)

To indicate the proof, we note that

E(Y) = n·E(Xj) = n[(0·q) + (1·p)] = np

and

V(Y) = nV(Xj) = n[(0²·q) + (1²·p) - p²] = np(1 - p).

Since p̂ = Y/n, we have

E(p̂) = (1/n)·E(Y) = p   (4-55)

and

V(p̂) = (1/n²)·V(Y) = p(1 - p)/n.

Using Chebyshev's inequality,

P(|p̂ - p| ≤ k√(p(1 - p)/n)) ≥ 1 - 1/k²,   (4-56)

so if

k = ε√(n/(p(1 - p))),

then we obtain equation 4-53.

Thus, for arbitrary ε > 0, as n → ∞,

P(|p̂ - p| ≤ ε) → 1.

Equation 4-53 may be rewritten, with an obvious notation, as

P(|p̂ - p| < ε) ≥ 1 - α.   (4-57)

We may now fix both ε and α in equation 4-57 and determine the value of n required to satisfy the probability statement as

n ≥ p(1 - p)/(ε²α).   (4-58)
Example 4-28
A manufacturing process operates so that there is a probability p that each item produced is defective, and p is unknown. A random sample of n items is to be selected to estimate p. The estimator to be used is p̂ = Y/n, where

Xj = 0,  if the jth item is good,
   = 1,  if the jth item is defective,

and

Y = X1 + X2 + ··· + Xn.

It is desired that the probability be at least 0.95 that the error, |p̂ - p|, not exceed 0.01. In order to determine the required value of n, we note that ε = 0.01 and α = 0.05; however, p is unknown. Equation 4-58 indicates that

n ≥ p(1 - p)/[(0.01)²·(0.05)].

Since p is unknown, the worst possible case must be assumed [note that p(1 - p) is maximum when p = 1/2]. This yields

n ≥ (0.5)(0.5)/[(0.01)²(0.05)] = 50,000,

a very large number indeed.
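Equation 4-58 reduces to a one-line computation; a minimal Python sketch (not from the text, assuming the worst case p = 1/2 by default) is given below.

import math

def chebyshev_n(eps, alpha, p=0.5):
    """Smallest n satisfying equation 4-58; p = 0.5 is the worst case."""
    return math.ceil(p * (1 - p) / (eps**2 * alpha))

print(chebyshev_n(0.01, 0.05))   # 50000, as in Example 4-28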

Example 4-28 demonstrates why the law of large numbers sometimes requires large sample sizes. The requirements of ε = 0.01 and α = 0.05 to give a probability of 0.95 of the departure |p̂ - p| being less than 0.01 seem reasonable; however, the resulting sample size is very large. In order to resolve problems of this nature we must know the distribution of the random variables involved (p̂ in this case). The next three chapters will consider in detail a number of the more frequently encountered distributions.

4-15 SUMMARY

This chapter has presented a number of topics related to jointly distributed random variables and functions of jointly distributed variables. The examples presented illustrated these topics, and the exercises that follow will allow the student to reinforce these concepts.
A great many situations encountered in engineering, science, and management involve situations where several related random variables simultaneously bear on the response being observed. The approach presented in this chapter provides the structure for dealing with several aspects of such problems.

4-16 EXERCISES

4-1. A refrigerator manufacturer subjects his finished products to a final inspection. Of interest are two categories of defects: scratches or flaws in the porcelain finish, and mechanical defects. The number of each type of defect is a random variable. The results of inspecting 50 refrigerators are shown in the following table, where X represents the occurrence of finish defects and Y represents the occurrence of mechanical defects.

Y\X |   0      1      2      3      4      5
 0  | 11/50   4/50   2/50   1/50   1/50   1/50
 1  |  6/50   5/50   2/50   1/50   1/50
 2  |  4/50   3/50   2/50   1/50
 3  |  3/50   1/50
 4  |  1/50

(a) Find the marginal distributions of X and Y.
(b) Find the probability distribution of mechanical defects given that there are no finish defects.
(c) Find the probability distribution of finish defects given that there are no mechanical defects.

4-2. An inventory manager has accumulated records of demand for her company's product over the last 100 days. The random variable X represents the number of orders received per day and the random variable Y represents the number of units per order. Her data are shown in the following table.

X\Y |    1       2      3      4      5      6      7      8      9
 1  | 10/100   5/100  3/100  2/100  1/100  1/100  1/100  1/100  1/100
 2  | 18/100   5/100  3/100  2/100  1/100  1/100  1/100
 3  | 18/100   5/100  2/100  1/100  1/100

(a) Find the marginal distributions of X and Y.
(b) Find all conditional distributions for Y given X.

4-3. Let X1 and X2 be the scores on a general intelligence test and an occupational preference test, respectively. The probability density function of the random variables [X1, X2] is given by

f(x1, x2) = k/1000,  0 ≤ x1 ≤ 100, 0 ≤ x2 ≤ 10,
          = 0,       otherwise.

(a) Find the appropriate value of k.
(b) Find the marginal densities of X1 and X2.
(c) Find an expression for the cumulative distribution function F(x1, x2).

4-4. Consider a situation in which the surface tension and acidity of a chemical product are measured. These variables are coded such that surface tension is measured on a scale 0 ≤ x1 ≤ 2, and acidity is measured on a scale 2 ≤ x2 ≤ 4. The probability density function of [X1, X2] is

f(x1, x2) = k(6 - x1 - x2),  0 ≤ x1 ≤ 2, 2 ≤ x2 ≤ 4,
          = 0,               otherwise.

(a) Find the appropriate value of k.
(b) Calculate the probability that X1 < 1, X2 < 3.
(c) Calculate the probability that X1 + X2 ≤ 4.
(d) Find the probability that X1 < 1.5.
(e) Find the marginal densities of both X1 and X2.

4-5. Consider the density function

f(w, x, y, z) = 16wxyz,  0 ≤ w, x, y, z ≤ 1,
              = 0,       otherwise.

(a) Compute the probability that W ≤ 1/2 and Y ≥ 1/4.
(b) Compute the probability that X ≤ 1/2 and Z ≤ 1/2.
(c) Find the marginal density of W.

4-6. Suppose the joint density of [X, Y] is

f(x, y) = (1/8)(6 - x - y),  0 ≤ x ≤ 2, 2 ≤ y ≤ 4,
        = 0,                 otherwise.

Find the conditional densities f_{X|y}(x) and f_{Y|x}(y).

4-7. For the data in Exercise 4-2, find the expected number of units per order given that there are three orders per day.

4-8. Consider the probability distribution of the discrete random vector [X1, X2], where X1 represents the number of orders for aspirin in August at the neighborhood drugstore and X2 represents the number of orders in September. The joint distribution is shown in the following table.

X2\X1 |  51    52    53    54    55
  51  | 0.06  0.05  0.05  0.01  0.01
  52  | 0.01  0.05  0.01  0.01  0.01
  53  | 0.05  0.10  0.10  0.05  0.05
  54  | 0.05  0.02  0.03  0.01  0.01
  55  | 0.05  0.06  0.05  0.01  0.03

(a) Find the marginal distributions.
(b) Find the expected sales in September given that sales in August were either 51, 52, 53, 54, or 55.

4-9. Assume that X1 and X2 are coded scores on two intelligence tests, and the probability density function of [X1, X2] is given by

f(x1, x2) = 6x1x2²,  0 ≤ x1 ≤ 1, 0 ≤ x2 ≤ 1,
          = 0,       otherwise.

Find the expected value of the score on test No. 2 given the score on test No. 1. Also, find the expected value of the score on test No. 1 given the score on test No. 2.
4-10. Let

f(x1, x2) = 4x1x2 e^{-(x1² + x2²)},  x1 > 0, x2 > 0,
          = 0,                       otherwise.

(a) Find the marginal distributions of X1 and X2.
(b) Find the conditional probability distributions of X1 and X2.
(c) Find expressions for the conditional expectations of X1 and X2.

4-11. Assume that [X, Y] is a continuous random vector and that X and Y are independent, such that f(x, y) = g(x)h(y). Define a new random variable Z = XY. Show that the probability density function of Z, tZ(z), is given by

tZ(z) = ∫_{-∞}^{+∞} g(t) h(z/t) (1/|t|) dt.

Hint: Let Z = XY and T = X, and find the Jacobian for the transformation to the joint probability density function of Z and T, say r(z, t). Then integrate r(z, t) with respect to t.

4-12. Use the result of the previous problem to find the probability density function of the area of a rectangle A = S1S2, where the sides are of random length. Specifically, the sides are independent random variables such that

g_{S1}(s1) = …,  0 ≤ s1 ≤ 1,
           = 0,  otherwise,

and

h_{S2}(s2) = s2/8,  0 ≤ s2 ≤ 4,
           = 0,     otherwise.

Some care must be taken in determining the limits of integration because the variable of integration cannot assume negative values.

4-13. Assume that [X, Y] is a continuous random vector and that X and Y are independent, such that f(x, y) = g(x)h(y). Define a new random variable Z = X/Y. Show that the probability density function of Z, tZ(z), is given by

tZ(z) = ∫_{-∞}^{+∞} g(uz) h(u) |u| du.

Hint: Let Z = X/Y and U = Y, and find the Jacobian for the transformation to the joint probability density function of Z and U, say r(z, u). Then integrate r(z, u) with respect to u.

4-14. Suppose we have a simple electrical circuit in which Ohm's law V = IR holds. We wish to find the probability distribution of resistance given that the probability distributions of voltage (V) and current (I) are known to be

g_V(v) = e^{-v},  v ≥ 0,
       = 0,       otherwise,

and

h_I(i) = …,  i ≥ 0,
       = 0,  otherwise.

Use the results of the previous problem, and assume that V and I are independent random variables.

4-15. Demand for a certain product is a random variable having a mean of 20 units per day and a variance of 9. We define the lead time to be the time that elapses between the placement of an order and its arrival. The lead time for the product is fixed at 4 days. Find the expected value and the variance of lead time demand, assuming demands to be independently distributed.

4-16. Prove the discrete case of Theorem 4-2.

4-17. Let X1 and X2 be random variables such that X2 = A + BX1. Show that ρ² = 1 and that ρ = -1 if B < 0 while ρ = +1 if B > 0.

4-18. Let X1 and X2 be random variables such that X2 = A + BX1. Show that the moment-generating function for X2 is

M_{X2}(t) = e^{At} M_{X1}(Bt).

4-19. Let X1 and X2 be distributed according to

f(x1, x2) = 2,  0 ≤ x1 ≤ x2 ≤ 1,
          = 0,  otherwise.

Find the correlation coefficient between X1 and X2.

4-20. Let X1 and X2 be random variables with correlation coefficient ρ_{X1,X2}. Suppose we define two new random variables U = A + BX1 and V = C + DX2, where A, B, C, and D are constants. Show that ρ_{U,V} = (BD/|BD|)ρ_{X1,X2}.

4-21. Consider the data shown in Exercise 4-1. Are X and Y independent? Calculate the correlation coefficient.

4-22. A couple wishes to sell their house. The minimum price that they are willing to accept is a random variable, say X, where S1 ≤ X ≤ S2. A population of buyers is interested in the house. Let Y, where P1 ≤ Y ≤ P2, denote the maximum price they are willing to
pay. Y is also a random variable. Assume that the joint distribution of [X, Y] is f(x, y).
(a) Under what circumstances will a sale take place?
(b) Write an expression for the probability of a sale taking place.
(c) Write an expression for the expected price of the transaction.

4-23. Let [X, Y] be uniformly distributed over the semicircle of radius 1 shown in the original diagram, so that f(x, y) = 2/π if (x, y) is in the semicircle.
(a) Find the marginal distributions of X and Y.
(b) Find the conditional probability distributions.
(c) Find the conditional expectations.

4-24. Let X and Y be independent random variables. Prove that E(X|y) = E(X) and that E(Y|x) = E(Y).

4-25. Show that, in the discrete case,

E[E(X|Y)] = E(X),
E[E(Y|X)] = E(Y).

4-26. Consider the two independent random variables S and D, whose probability densities are

f_S(s) = 1/30,  10 ≤ s ≤ 40,
       = 0,     otherwise,

and

g_D(d) = 1/20,  10 ≤ d ≤ 30,
       = 0,     otherwise.

Find the probability distribution of the new random variable W = S + D.

4-27. If

f(x, y) = x + y,  0 < x < 1, 0 < y < 1,
        = 0,      otherwise,

find the following:
(a) E(X|y).
(b) E(X²|y).
(c) E(Y|x).

4-28. For the bivariate distribution

f(x, y) = k(1 + x + y)/[(1 + x)⁴(1 + y)⁴],  0 ≤ x < ∞, 0 ≤ y < ∞,
        = 0,  otherwise,

(a) evaluate the constant k;
(b) find the marginal distribution of X.

4-29. For the bivariate distribution

f(x, y) = k/(1 + x + y)ⁿ,  x ≥ 0, y ≥ 0, n > 2,
        = 0,               otherwise,

(a) evaluate the constant k;
(b) find F(x, y).

4-30. The manager of a small bank wishes to determine the proportion of the time a particular teller is busy. He decides to observe the teller at n randomly spaced intervals. The estimator of the degree of gainful employment is to be Y/n, where

Xi = 0,  if on the ith observation the teller is idle,
   = 1,  if on the ith observation the teller is busy,

and Y = Σ_{i=1}^n Xi. It is desired to estimate p = P(Xi = 1) so that the error of the estimate does not exceed 0.05 with probability 0.95. Determine the necessary value of n.

4-31. Given the following joint distributions, determine whether X and Y are independent:
(a) g(x, y) = 4xye^{-(x² + y²)},  x ≥ 0, y ≥ 0.
(b) f(x, y) = 3x²y^{-3},  0 ≤ x ≤ y ≤ 1.
(c) f(x, y) = 6(1 + x + y)^{-4},  x ≥ 0, y ≥ 0.

4-32. Let f(x, y, z) = h(x)h(y)h(z), x ≥ 0, y ≥ 0, z ≥ 0. Determine the probability that a point drawn at random will have a coordinate (x, y, z) that does not satisfy either x > y > z or x < y < z.

4-33. Suppose that X and Y are random variables denoting the fraction of a day that a request for merchandise occurs and the receipt of a shipment occurs, respectively. The joint probability density function is

f(x, y) = 1,  0 ≤ x ≤ 1, 0 ≤ y ≤ 1,
        = 0,  otherwise.

(a) What is the probability that both the request for merchandise and the receipt of an order occur during the first half of the day?
(b) What is the probability that a request for merchandise occurs after its receipt? Before its receipt?

4-34. Suppose that in Problem 4-33 the merchandise is highly perishable and must be requested during the 1/2-day interval after it arrives. What is the probability that the merchandise will not spoil?
4-35. Let X be a continuous random variable with probability density function f(x). Find a general expression for the probability density function of the new random variable Z, where
(a) Z = a + bX.
(b) Z = 1/X.
(c) Z = ln X.
(d) Z = e^X.
Chapter 5

Some Important Discrete Distributions

5-1 INTRODUCTION

In this chapter we present several discrete probability distributions, developing their analytical form from certain basic assumptions about real-world phenomena. We also present some examples of their application. The distributions presented have found extensive application in engineering, operations research, and management science. Four of the distributions, the binomial, the geometric, the Pascal, and the negative binomial, stem from a random process made up of sequential Bernoulli trials. The hypergeometric distribution, the multinomial distribution, and the Poisson distribution will also be presented in this chapter.
When we are dealing with one random variable and no ambiguity is introduced, the symbol for the random variable will once again be omitted in the specification of the probability distributions and cumulative distribution function; thus, pX(x) = p(x) and FX(x) = F(x). This practice will be continued throughout the text.

5-2 BERNOULLI TRIALS AND THE BERNOULLI DISTRIBUTION

There are many problems in which the experiment consists of n trials or subexperiments. Here we are concerned with an individual trial that has as its two possible outcomes success, S, or failure, F. For each trial we thus have the following:

ℰj: Perform an experiment (the jth) and observe the outcome.
𝒮j: {S, F}.

For convenience, we will define a random variable Xj = 1 if ℰj results in S and Xj = 0 if ℰj results in F (see Fig. 5-1).
The n Bernoulli trials ℰ1, ℰ2, ..., ℰn are called a Bernoulli process if the trials are independent, each trial has only two possible outcomes, say S or F, and the probability of success remains constant from trial to trial. That is,

p(xj) = p,          xj = 1, j = 1, 2, ..., n,
      = 1 - p = q,  xj = 0, j = 1, 2, ..., n,
      = 0,          otherwise.   (5-1)

Figure 5-1 A Bernoulli trial.

For one trial, the distribution given in equation 5-1 and Fig. 5-2 is called the Bernoulli distribution.
The mean and variance are

E(X) = (0 · q) + (1 · p) = p

and

V(X) = [(0² · q) + (1² · p)] - p² = p(1 - p) = pq.   (5-2)

The moment-generating function may be shown to be

MX(t) = q + pe^t.   (5-3)

Example 5-1
Suppose we consider a manufacturing process in which a small steel part is produced by an automatic machine. Furthermore, each part in a production run of 1000 parts may be classified as defective or good when inspected. We can think of the production of a part as a single trial that results in success (say a defective) or failure (a good item). If we have reason to believe that the machine is just as likely to produce a defective on one run as on another, and if the production of a defective on one run is neither more nor less likely because of the results on the previous runs, then it would be quite reasonable to assume that the production run is a Bernoulli process with 1000 trials. The probability, p, of a defective being produced on one trial is called the process average fraction defective.

Note that in the preceding example the assumption of a Bernoulli process is a mathematical idealization of the actual real-world situation. Effects of tool wear, machine adjustment, and instrumentation difficulties were ignored. The real world was approximated by a model that did not consider all factors, but nevertheless, the approximation is good enough for useful results to be obtained.

Figure 5-2 The Bernoulli distribution.

We are going to be primarily concerned with a series of Bernoulli trials. In this case the experiment ℰ is denoted {(ℰ1, ℰ2, ..., ℰn): the ℰj are independent Bernoulli trials, j = 1, 2, ..., n}. The sample space is

𝒮 = {(x1, ..., xn): xi = S or F, i = 1, ..., n}.

Example 5-2
Suppose an experiment consists of three Bernoulli trials and the probability of success is p on each trial (see Fig. 5-3). The random variable X is given by X = Σ_{j=1}^3 Xj. The distribution of X can be determined as follows:

x | p(x)
0 | P{FFF} = q·q·q = q³
1 | P{FFS} + P{FSF} + P{SFF} = 3pq²
2 | P{FSS} + P{SFS} + P{SSF} = 3p²q
3 | P{SSS} = p³

5-3 THE BINOMIAL DISTRIBUTION

The random variable X that denotes the number of successes in n Bernoulli trials has a binomial distribution given by p(x), where

p(x) = C(n, x) p^x (1 - p)^{n-x},  x = 0, 1, 2, ..., n,
     = 0,  otherwise,   (5-4)

where C(n, x) denotes the binomial coefficient. Example 5-2 illustrates a binomial distribution with n = 3. The parameters of the binomial distribution are n and p, where n is a positive integer and 0 ≤ p ≤ 1. A simple derivation is outlined below. Let

p(x) = P["x successes in n trials"].

Figure 5-3 Three Bernoulli trials.
The probability of the particular outcome in 𝒮 with Ss for the first x trials and Fs for the last n - x trials is

P[SS···S FF···F] = p^x q^{n-x}

(where q = 1 - p), due to the independence of the trials. There are

C(n, x) = n!/[x!(n - x)!]

outcomes having exactly x Ss and (n - x) Fs; therefore,

p(x) = C(n, x) p^x q^{n-x},  x = 0, 1, 2, ..., n,
     = 0,  otherwise.

Since q = 1 - p, this last expression is the binomial distribution.

5-3.1 Mean and Variance of the Binomial Distribution

The mean of the binomial distribution may be determined as

E(X) = Σ_{x=0}^n x · [n!/(x!(n - x)!)] p^x q^{n-x}
     = np Σ_{x=1}^n [(n - 1)!/((x - 1)!(n - x)!)] p^{x-1} q^{n-x},

and letting y = x - 1,

E(X) = np Σ_{y=0}^{n-1} [(n - 1)!/(y!(n - 1 - y)!)] p^y q^{n-1-y} = np(p + q)^{n-1},

so that

E(X) = np.   (5-5)

Using a similar approach, we can find

E[X(X - 1)] = Σ_{x=0}^n x(x - 1) · [n!/(x!(n - x)!)] p^x q^{n-x}
            = n(n - 1)p² Σ_{x=2}^n [(n - 2)!/((x - 2)!(n - x)!)] p^{x-2} q^{n-x}
            = n(n - 1)p² Σ_{y=0}^{n-2} [(n - 2)!/(y!(n - 2 - y)!)] p^y q^{n-2-y}
            = n(n - 1)p²,

so that

V(X) = E(X²) - [E(X)]²
     = E[X(X - 1)] + E(X) - [E(X)]²
     = n(n - 1)p² + np - (np)²
     = npq.   (5-6)
An easier approach to finding the mean and variance is to consider X as a sum of n independent Bernoulli random variables, each with mean p and variance pq, so that X = X1 + X2 + ··· + Xn. Then

E(X) = p + p + ··· + p = np

and

V(X) = pq + pq + ··· + pq = npq.

The moment-generating function for the binomial distribution is

MX(t) = (pe^t + q)^n.   (5-7)
Example 5-3
A production process represented schematically by Fig. 5-4 produces thousands of parts per day. On the average, 1% of the parts are defective, and this average does not vary with time. Every hour, a random sample of 100 parts is selected from a conveyor and several characteristics are observed and measured on each part; however, the inspector classifies the part as either good or defective. If we consider the sampling as n = 100 Bernoulli trials with p = 0.01, the total number of defectives in the sample, X, would have a binomial distribution

p(x) = C(100, x)(0.01)^x (0.99)^{100-x},  x = 0, 1, 2, ..., 100,
     = 0,  otherwise.

Suppose the inspector has instructions to stop the process if the sample has more than two defectives. Then, P(X > 2) = 1 - P(X ≤ 2), and we may calculate

P(X ≤ 2) = Σ_{x=0}^2 C(100, x)(0.01)^x (0.99)^{100-x}
         = (0.99)^100 + 100(0.01)(0.99)^99 + 4950(0.01)²(0.99)^98
         ≈ 0.92.

Thus, the probability of the inspector stopping the process is approximately 1 - 0.92 = 0.08. The mean number of defectives that would be found is E(X) = np = 100(0.01) = 1, and the variance is V(X) = npq = 0.99.

5-3.2 The Cumulative Binomial Distribution

The cumulative binomial distribution, or the distribution function, F, is

F(x) = P(X ≤ x) = Σ_{k=0}^{⌊x⌋} C(n, k) p^k q^{n-k}.   (5-8)

Figure 5-4 A sampling situation with attribute measurement (hourly samples of n = 100 from a conveyor to a warehouse).

The function is readily calculated by such packages as Excel and Minitab. For example, suppose that n = 10, p = 0.6, and we are interested in calculating F(6) = P(X ≤ 6). Then the Excel function call BINOMDIST(6,10,0.6,TRUE) gives the result F(6) = 0.6177. In Minitab, all we need do is go to Calc/Probability Distributions/Binomial, and click on "Cumulative probability" to obtain the same result.

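The same cumulative probabilities are available in Python through SciPy; a short sketch (assuming scipy.stats is installed, not part of the text) reproduces F(6) above and the P(X > 2) ≈ 0.08 of Example 5-3.

from scipy.stats import binom

print(binom.cdf(6, 10, 0.6))        # F(6) = 0.6177..., as computed above
print(1 - binom.cdf(2, 100, 0.01))  # P(X > 2), about 0.08, from Example 5-3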
5-3.3 An Application of the Binomial Distribution

Another random variable, first noted in the law of large numbers, is frequently of interest. It is the proportion of successes and is denoted by

p̂ = X/n,   (5-9)

where X has a binomial distribution with parameters n and p. The mean, variance, and moment-generating function are

E(p̂) = (1/n)E(X) = (1/n)np = p,   (5-10)

V(p̂) = (1/n)² V(X) = (1/n)² npq = pq/n,   (5-11)

M_{p̂}(t) = MX(t/n) = (pe^{t/n} + q)^n.   (5-12)

In order to evaluate, say, P(p̂ ≤ p0), where p0 is some number between 0 and 1, we note that P(p̂ ≤ p0) = P(X ≤ np0). Since np0 is possibly not an integer,

P(p̂ ≤ p0) = P(X ≤ np0) = Σ_{k=0}^{⌊np0⌋} C(n, k) p^k q^{n-k},   (5-13)

where ⌊ ⌋ indicates the "greatest integer contained in" function.

Example 5-4
From a flow of product on a conveyor belt between production operations J and J + 1, a random sample of 200 units is taken every 2 hours (see Fig. 5-5). Past experience has indicated that if the unit is not properly degreased, the painting operation will not be successful, and, furthermore, on the average 5% of the units are not properly degreased. The manufacturing manager has grown accustomed to accepting the 5%, but he strongly feels that 6% is bad performance and 7% is totally unacceptable. He decides to plot the fraction defective in the samples, that is, p̂. If the process average stays at 5%, he would know that E(p̂) = 0.05. Knowing enough about probability to understand that p̂ will vary, he asks the quality-control department to determine P(p̂ > 0.07 | p = 0.05). This is done as follows:

P(p̂ > 0.07 | p = 0.05) = 1 - P(p̂ ≤ 0.07 | p = 0.05)
                        = 1 - P(X ≤ 200(0.07) | p = 0.05)
                        = 1 - Σ_{k=0}^{14} C(200, k)(0.05)^k (0.95)^{200-k}
                        = 1 - 0.922 = 0.078.

Figure 5-5 Sequential production operations: operation J (degreasing), operation J + 1 (painting), operation J + 2 (packaging); n = 200 every two hours.

Example 5-5
An industrial engineer is concerned about the excessive "avoidable delay" time that one machine operator seems to have. The engineer considers two activities, "avoidable delay time" and "not avoidable delay time," and identifies a time-dependent variable as follows:

X(t) = 1,  avoidable delay,
     = 0,  otherwise.

A particular realization of X(t) for 2 days (960 minutes) is shown in Fig. 5-6.
Rather than have a time study technician continuously analyze this operation, the engineer elects to use "work sampling," randomly selects n points on the 960-minute span, and estimates the fraction of time the "avoidable delay" category exists. She lets Xi = 1 if X(t) = 1 at the time of the ith observation and Xi = 0 if X(t) = 0 at the time of the ith observation. The statistic

p̂ = Σ_{i=1}^n Xi / n

is to be evaluated. Of course, p̂ is a random variable having a mean equal to p, a variance equal to pq/n, and a standard deviation equal to √(pq/n). The procedure outlined is not necessarily the best way to go about such a study, but it does illustrate one utilization of the random variable p̂.

In summary, analysts must be sure that the phenomenon they are studying may be reasonably considered to be a series of Bernoulli trials in order to use the binomial distribution to describe X, the number of successes in n trials. It is often useful to visualize the graphical presentation of the binomial distribution, as shown in Fig. 5-7. The values p(x) increase to a point and then decrease. More precisely, p(x) > p(x - 1) for x < (n + 1)p, and p(x) < p(x - 1) for x > (n + 1)p. If (n + 1)p is an integer, say m, then p(m) = p(m - 1).

Figure 5-6 A realization of X(t), Example 5-5.
Figure 5-7 The binomial distribution.

5-4 THE GEOMETRIC DISTRIBUTION

The geometric distribution is also related to a sequence of Bernoulli trials, except that the number of trials is not fixed, and, in fact, the random variable of interest, denoted X, is defined to be the number of trials required to achieve the first success. The sample space and range space for X are illustrated in Fig. 5-8. The range space for X is RX = {1, 2, 3, ...}, and the distribution of X is given by

p(x) = pq^{x-1},  x = 1, 2, ...,
     = 0,         otherwise.   (5-14)
It is easy to verify that this is a probability distribution, since

Σ_{x=1}^∞ pq^{x-1} = p Σ_{k=0}^∞ q^k = p·[1/(1 - q)] = 1

and

p(x) ≥ 0 for all x.

5-4.1 Mean and Variance of the Geometric Distribution

The mean and variance of the geometric distribution are easily found as follows:

μ = Σ_{x=1}^∞ x p q^{x-1} = p Σ_{x=1}^∞ x q^{x-1},

or

μ = p (d/dq)[q/(1 - q)] = 1/p.   (5-15)

Figure 5-8 Sample space and range space for X.

Similarly, starting from

σ² = Σ_{x=1}^∞ x² p q^{x-1} - (1/p)²,

we find, after some algebra,

σ² = q/p² = (1 - p)/p².   (5-16)

The moment-generating function is

MX(t) = pe^t / (1 - qe^t).   (5-17)

Example 5-6
A certain experiment is to be performed until a successful result is obtained. The trials are independent and the cost of performing the experiment is $25,000; however, if a failure results, it costs $5000 to "set up" for the next trial. The experimenter would like to determine the expected cost of the project. If X is the number of trials required to obtain a successful experiment, then the cost function would be

C(X) = $25,000X + $5000(X - 1) = $30,000X - $5000.

Then

E[C(X)] = $30,000·E(X) - $5000 = $30,000·(1/p) - $5000.

If the probability of success on a single trial is, say, 0.25, then E[C(X)] = $30,000/0.25 - $5000 = $115,000. This may or may not be acceptable to the experimenter. It should also be recognized that it is possible to continue indefinitely without having a successful experiment. Suppose that the experimenter has a maximum of $500,000. He may wish to find the probability that the experimental work would cost more than this amount, that is,

P(C(X) > $500,000) = P($30,000X - $5000 > $500,000)
                   = P(X > 505,000/30,000)
                   = P(X > 16.833)
                   = 1 - P(X ≤ 16)
                   = 1 - Σ_{x=1}^{16} 0.25(0.75)^{x-1}
                   = (0.75)^{16} ≈ 0.01.

The experimenter may not be at all willing to run the risk (probability 0.01) of spending the available $500,000 without getting a successful run.
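SciPy's geometric distribution (a sketch, not part of the original example) confirms both figures.

from scipy.stats import geom

p = 0.25
mean_cost = 30_000 / p - 5_000       # E[C(X)] = $115,000
# P(C(X) > 500,000) = P(X > 16) = 1 - F(16) for the geometric distribution
print(mean_cost, geom.sf(16, p))     # 115000.0 and about 0.0100 (= 0.75**16)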

The geometric distribution decreases; that is, p(x) < p(x - 1) for x = 2, 3, .... This is shown graphically in Fig. 5-9.
An interesting and useful property of the geometric distribution is that it has no memory; that is,

P(X > x + s | X > s) = P(X > x).   (5-18)

The geometric distribution is the only discrete distribution having this memoryless property.
Figure 5-9 The geometric distribution.

Example 5-7
Let X denote the number of tosses of a fair die until we observe a 6. Suppose we have already tossed the die five times without seeing a 6. The probability that more than two additional tosses will be required is

P(X > 7 | X > 5) = P(X > 2)
                 = 1 - P(X ≤ 2)
                 = 1 - Σ_{x=1}^2 p(x)
                 = 1 - [1/6 + (5/6)(1/6)]
                 = 25/36.

5-5 THE PASCAL DISTRIBUTION


The Pascal distribution also has its basis in Bernoulli trials. It is a logical extension of the
geometric distribution. In this case, the random variable X denotes the trial on which the rth
success occurs, where r is an integer. The probability mass function of X is

p(x) = \binom{x-1}{r-1} p^r q^{x-r},    x = r, r + 1, r + 2, ...,
     = 0,                               otherwise.                      (5-19)

The term p^r q^{x-r} arises from the probability associated with exactly one outcome in the sample
space that has (x - r) Fs (failures) and r Ss (successes). In order for this outcome to occur, there must
be r - 1 successes in the x - 1 repetitions before the last outcome, which is always a success.
There are thus \binom{x-1}{r-1} arrangements satisfying this condition, and therefore the distribution is
as shown in equation 5-19.

The development thus far has been for integer values of r. If we have arbitrary r > 0 and
0 < p < 1, the distribution of equation 5-19 is known as the negative binomial distribution.

5-5.1 Mean and Variance of the Pascal Distribution


If X has a Pascal distribution, as illustrated in Fig. 5-10, the mean, variance, and moment-generating
function are

μ = r/p,                                                                (5-20)

Figure 5-10 An example of the Pascal distribution.

σ² = rq/p²,                                                             (5-21)

and

M_X(t) = [pe^t / (1 - qe^t)]^r.                                         (5-22)

Example 5-8
The president of a large corporation makes decisions by throwing darts at a board. The center section
is marked "yes" and represents a success. The probability of his hitting a "yes" is 0.6, and this probability
remains constant from throw to throw. The president continues to throw until he has three
"hits." We denote X as the number of the trial on which he experiences the third hit. The mean is 3/0.6
= 5, meaning that on the average it will take five throws. The president's decision rule is simple. If he
gets three hits on or before the fifth throw, he decides in favor of the question. The probability that he
will decide in favor is therefore

P(X ≤ 5) = p(3) + p(4) + p(5)
         = \binom{2}{2}(0.6)³(0.4)⁰ + \binom{3}{2}(0.6)³(0.4)¹ + \binom{4}{2}(0.6)³(0.4)²
         = 0.6826.

5-6 THE MULTINOMIAL DISTRIBUTION


An important and useful higher-dimensional random variable has a distribution known as
the multinomial distribution. Assume an experiment ℰ with sample space 𝒮 is partitioned
into k mutually exclusive events, say B₁, B₂, ..., B_k. We consider n independent repetitions
of ℰ and let p_i = P(B_i) be constant from trial to trial, for i = 1, 2, ..., k. If k = 2, we have
Bernoulli trials, as described earlier. The random vector [X₁, X₂, ..., X_k] has the following
distribution, where X_i is the number of times B_i occurs in the n repetitions of ℰ, i = 1,
2, ..., k:

p(x₁, x₂, ..., x_k) = [n! / (x₁!x₂!···x_k!)] p₁^{x₁} p₂^{x₂} ··· p_k^{x_k}          (5-23)

for x_i = 0, 1, 2, ..., n, i = 1, 2, ..., k, and where Σ_{i=1}^{k} x_i = n.

It should be noted that X₁, X₂, ..., X_k are not independent random variables, since
Σ_{i=1}^{k} X_i = n for any n repetitions. It turns out that the mean and variance of X_i, a
particular component, are

E(X_i) = np_i                                                           (5-24)

and

V(X_i) = np_i(1 - p_i).                                                 (5-25)

Example 5-9
Mechanical pencils are manufactured by a process involving a large amount of labor in the assembly
operations. This is highly repetitive work, and incentive pay is involved. Final inspection has revealed
that 85% of the product is good, 10% is defective but may be reworked, and 5% is defective and must
be scrapped. These percentages remain constant over time. A random sample of 20 items is selected,
and if we let

X₁ = number of good items,
X₂ = number of defective but reworkable items,
X₃ = number of items to be scrapped,

then

p(x₁, x₂, x₃) = [20! / (x₁!x₂!x₃!)] (0.85)^{x₁} (0.10)^{x₂} (0.05)^{x₃}.

Suppose we want to evaluate the probability function for x₁ = 18, x₂ = 2, and x₃ = 0 (we must have x₁
+ x₂ + x₃ = 20); then

p(18, 2, 0) = [20! / (18!2!0!)] (0.85)^{18} (0.10)² (0.05)⁰ ≈ 0.102.

5-7 THE HYPERGEOMETRIC DISTRIBUTION


In an earlier section an example presented the hypergeometric distribution. We will now
formally develop this distribution and further illustrate its application. Suppose there is
some finite population with N items. Some number D (D ≤ N) of the items fall into a class
of interest. The particular class will, of course, depend on the situation under consideration.
It might be defectives (vs. nondefectives) in the case of a production lot, or persons with
blue eyes (vs. not blue eyed) in a classroom with N students. A random sample of size n is
selected without replacement, and the random variable of interest, X, is the number of items
in the sample that belong to the class of interest. The distribution of X is

p(x) = \binom{D}{x}\binom{N-D}{n-x} / \binom{N}{n},    x = 0, 1, 2, ..., min(n, D),
     = 0,                                               otherwise.      (5-26)
The hypergeometric probability mass function is available in many popular software packages.
For instance, suppose that N = 20, D = 8, n = 4, and x = 1. Then the Excel function

HYPGEOMDIST(x, n, D, N) = HYPGEOMDIST(1, 4, 8, 20) = 0.3633.


5-7.1 Mean and Variance of the Hypergeometric Distribution


The mean and variance of the hypergeometric distribution are

E(X) = n·(D/N)                                                          (5-27)

and

V(X) = n·(D/N)·(1 - D/N)·[(N - n)/(N - 1)].                             (5-28)

Example 5-10
In a receiving inspection department, lots of a pump shaft are periodically received. The lots contain
100 units, and the following acceptance sampling plan is used. A random sample of 10 units is
selected without replacement. The lot is accepted if the sample has no more than one defective. Suppose
a lot is received that is p′(100) percent defective. What is the probability that it will be accepted?

P(accept lot) = P(X ≤ 1) = Σ_{x=0}^{1} \binom{100p′}{x}\binom{100(1-p′)}{10-x} / \binom{100}{10}.

Obviously the probability of accepting the lot is a function of the lot quality, p′.
If p′ = 0.05, then

P(accept lot) = 0.923.
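As a hedged cross-check of this example, the acceptance probability can be computed with SciPy (assumed available; not part of the text's presentation):

from scipy.stats import hypergeom

def p_accept(p_prime, N=100, n=10):
    # D defectives in the lot; accept if the sample has at most one defective
    D = int(round(N * p_prime))
    return hypergeom.cdf(1, N, D, n)

print(round(p_accept(0.05), 3))   # 0.923, as stated above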

5-8 THE POISSON DISTRIBUTION


One of the most useful discrete distributions is the Poisson distribution. The Poisson distribution
may be developed in two ways, and both are instructive insofar as they indicate the
circumstances where this random variable may be expected to apply in practice. The first
development involves the definition of a Poisson process. The second development shows
the Poisson distribution to be a limiting form of the binomial distribution.

5-8.1 Development from a Poisson Process


In defining the Poisson process, we initially consider a collection of arbitrary, time-oriented
occurrences, often called "arrivals" or "births" (see Fig. 5-11). The random variable of
interest, say X_t, is the number of arrivals that occur on the interval (0, t]. The range space
is R_{X_t} = {0, 1, 2, ...}. In developing the distribution of X_t, it is necessary to make some
assumptions, the plausibility of which is supported by considerable empirical evidence.

Figure 5-11 The time axis.

The first assumption is that the numbers of arrivals during nonoverlapping time intervals
are independent random variables. Second, we make the assumption that there exists a
positive quantity λ such that for any small time interval Δt, the following postulates are
satisfied.

1. The probability that exactly one arrival will occur in an interval of width Δt is
   approximately λ·Δt. The approximation is in the sense that the probability is
   (λ·Δt) + o₁(Δt), where [o₁(Δt)/Δt] → 0 as Δt → 0.
2. The probability that exactly zero arrivals will occur in the interval is approximately
   1 - (λ·Δt). Again this is in the sense that it is equal to 1 - (λ·Δt) + o₂(Δt) and
   [o₂(Δt)/Δt] → 0 as Δt → 0.
3. The probability that two or more arrivals occur in the interval is equal to a quantity
   o₃(Δt), where [o₃(Δt)/Δt] → 0 as Δt → 0.

The parameter λ is sometimes called the mean arrival rate or mean occurrence rate. In
the development to follow, we let

p(x) = P(X_t = x) = p_x(t),    x = 0, 1, 2, ....                        (5-29)

We fix time at t and obtain

p₀(t + Δt) = [1 - λ·Δt]·p₀(t),

so that

[p₀(t + Δt) - p₀(t)] / Δt = -λ·p₀(t),

and

lim_{Δt→0} [p₀(t + Δt) - p₀(t)] / Δt = p₀′(t) = -λ·p₀(t).               (5-30)

For x > 0,

p_x(t + Δt) = λ·Δt·p_{x-1}(t) + [1 - λ·Δt]·p_x(t),

so that

[p_x(t + Δt) - p_x(t)] / Δt = λ·p_{x-1}(t) - λ·p_x(t),

and

lim_{Δt→0} [p_x(t + Δt) - p_x(t)] / Δt = p_x′(t) = λ·p_{x-1}(t) - λ·p_x(t).   (5-31)

Summarizing, we have a system of differential equations:

p₀′(t) = -λ·p₀(t)                                                       (5-32a)

and

p_x′(t) = λ·p_{x-1}(t) - λ·p_x(t),    x = 1, 2, ....                    (5-32b)

The solution to these equations is

p_x(t) = (λt)^x e^{-λt} / x!,    x = 0, 1, 2, ....                      (5-33)

Thus, for fixed t, we let c = λt and obtain the Poisson distribution as

p(x) = e^{-c} c^x / x!,    x = 0, 1, 2, ...,
     = 0,                  otherwise.                                   (5-34)

Note that this distribution was developed as a consequence of certain assumptions; thus,
when the assumptions hold or approximately hold, the Poisson distribution is an appropriate
model. There are many real-world phenomena for which the Poisson model is appropriate.

5-8.2 Development of the Poisson Distribution from the Binomial


To show how the Poisson distribution may also be developed as a limiting form of the binomial
distribution with c = np, we return to the binomial distribution

p(x) = [n! / (x!(n - x)!)] p^x (1 - p)^{n-x},    x = 0, 1, 2, ..., n.

If we let np = c, so that p = c/n and 1 - p = 1 - c/n = (n - c)/n, and if we then replace terms
involving p with the corresponding terms involving c, we obtain

p(x) = [n(n-1)(n-2)···(n-x+1) / x!]·(c/n)^x·(1 - c/n)^{n-x}
     = (c^x/x!)·[(1)(1 - 1/n)(1 - 2/n)···(1 - (x-1)/n)]·(1 - c/n)^n·(1 - c/n)^{-x}.   (5-35)

In letting n → ∞ and p → 0 in such a way that np = c remains fixed, the terms (1 - 1/n),
(1 - 2/n), ..., (1 - (x-1)/n) all approach 1, as does (1 - c/n)^{-x}. We know that (1 - c/n)^n → e^{-c}
as n → ∞.

Thus, the limiting form of equation 5-35 is p(x) = (c^x/x!)·e^{-c}, which is the Poisson distribution.

5-8.3 Mean and Variance of the Poisson Distribution


The mean of the Poisson distribution is c, and the variance is also c, as seen below:

E(X) = Σ_{x=0}^{∞} x·e^{-c}c^x/x! = c·e^{-c}·Σ_{x=1}^{∞} c^{x-1}/(x - 1)! = c·e^{-c}·e^c = c.   (5-36)

Similarly, E(X²) = c² + c, so that

V(X) = E(X²) - [E(X)]² = (c² + c) - c² = c.                             (5-37)

The moment-generating function is

M_X(t) = e^{c(e^t - 1)}.                                                (5-38)

The utility of this generating function is illustrated in the proof of the following theorem.

Theorem 5-1
If X₁, X₂, ..., X_k are independently distributed random variables, each having a Poisson distribution
with parameter c_i, i = 1, 2, ..., k, and Y = X₁ + X₂ + ··· + X_k, then Y has a Poisson
distribution with parameter c = c₁ + c₂ + ··· + c_k.

Proof  The moment-generating function of X_i is

M_{X_i}(t) = e^{c_i(e^t - 1)},

and since M_Y(t) = M_{X₁}(t)·M_{X₂}(t)·····M_{X_k}(t), then

M_Y(t) = e^{(c₁+c₂+···+c_k)(e^t - 1)},

which is recognized as the moment-generating function of a Poisson random variable with
parameter c = c₁ + c₂ + ··· + c_k.
This reproductive property of the Poisson distribution is highly useful. Simply, it states
that sums of independent Poisson random variables are distributed according to the Poisson
distribution.

A brief tabulation of the Poisson distribution is given in Table I of the Appendix. Most
statistical software packages automatically calculate Poisson probabilities.
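Theorem 5-1 is also easy to check numerically. The sketch below, which assumes NumPy (not mentioned in the text), sums three independent Poisson samples and confirms that the sample mean and variance of the sum are both close to c₁ + c₂ + c₃:

import numpy as np

rng = np.random.default_rng(1)
c = [1.0, 2.5, 0.5]                            # c1, c2, c3
y = sum(rng.poisson(ci, 100_000) for ci in c)  # realizations of Y = X1 + X2 + X3
print(y.mean(), y.var())                       # both approximately 4.0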

Example 5-11
Suppose a retailer determines that the number of orders for a certain home appliance in a particular
period has a Poisson distribution with parameter c. She would like to determine the stock level K for
the beginning of the period so that there will be a probability of at least 0.95 of supplying all customers
who order the appliance during the period. She does not wish to back-order merchandise or
resupply the warehouse during the period. If X represents the number of orders, the dealer wishes to
determine K such that

P(X ≤ K) ≥ 0.95,

or

P(X > K) ≤ 0.05,

so that

Σ_{x=K+1}^{∞} e^{-c} c^x / x! ≤ 0.05.

The solution may be determined directly from tables of the Poisson distribution and is obviously a
function of c.
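For a concrete illustration (the text leaves c unspecified), the smallest such K can be read off the Poisson CDF; the sketch below assumes SciPy and an illustrative value c = 10:

from scipy.stats import poisson

c = 10                           # assumed demand rate, for illustration only
K = int(poisson.ppf(0.95, c))    # smallest K with P(X <= K) >= 0.95
print(K, poisson.cdf(K, c))      # K = 15; the CDF there is about 0.951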

Example 5-12
The attainable sensitivity for electronic amplifiers and apparatus is limited by noise or spontaneous
current fluctuations. In vacuum tubes, one noise source is shot noise due to the random emission of
electrons from the heated cathode. Assume that the potential difference between the anode and cathode
is large enough to ensure that all electrons emitted by the cathode have high velocity, high
enough to preclude space charge (accumulation of electrons between the cathode and anode). Under
these conditions, and defining an arrival to be an emission of an electron from the cathode, Davenport
and Root (1958) showed that the number of electrons, X, emitted from the cathode in time t has
a Poisson distribution given by

p(x) = (λt)^x e^{-λt} / x!,    x = 0, 1, 2, ...,
     = 0,                      otherwise.

The parameter λ is the mean rate of emission of electrons from the cathode.

5-9 SOME APPROXIMATIONS


It is often useful to approximate one distribution using another, particularly when the
approximation is easier to manipulate. The two approximations considered in this section
are as follows:

1. The binomial approximation to the hypergeometric distribution.
2. The Poisson approximation to the binomial distribution.

For the hypergeometric distribution, if the sampling fraction n/N is small, say less than
0.1, then the binomial distribution with parameters p = D/N and n provides a good approximation.
The smaller the ratio n/N, the better the approximation.

Example 5-13
A production lot of 200 units has eight defectives. A random sample of 10 units is selected, and we
want to find the probability that the sample will contain exactly one defective. The true probability is

P(X = 1) = \binom{8}{1}\binom{192}{9} / \binom{200}{10} = 0.288.

Since n/N = 10/200 = 0.05 is small, we let p = 8/200 = 0.04 and use the binomial approximation

p(1) = \binom{10}{1}(0.04)¹(0.96)⁹ = 0.277.
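The two probabilities can be checked side by side in software; this sketch assumes SciPy, which the text does not use:

from scipy.stats import hypergeom, binom

exact = hypergeom.pmf(1, 200, 8, 10)   # N = 200, D = 8, n = 10
approx = binom.pmf(1, 10, 8 / 200)     # binomial with p = D/N = 0.04
print(round(exact, 3), round(approx, 3))   # 0.288 versus 0.277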

In the case of the Poisson approximation to the binomial, we indicated earlier that for large
n and small p, the approximation is satisfactory. In utilizing this approximation we let c =
np. In general, p should be less than 0.1 in order to apply the approximation. The smaller p
and the larger n, the better the approximation.

Example 5-14
The probability that a particular rivet in the wing surface of a new aircraft is defective is 0.001. There
are 4000 rivets in the wing. What is the probability that not more than six defective rivets will be
installed?

P(X ≤ 6) = Σ_{x=0}^{6} \binom{4000}{x}(0.001)^x(0.999)^{4000-x}.

Using the Poisson approximation,

c = 4000(0.001) = 4

and

P(X ≤ 6) ≈ Σ_{x=0}^{6} e^{-4} 4^x / x! = 0.889.
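Again the exact and approximate values are easy to compare in software (SciPy assumed, as before):

from scipy.stats import binom, poisson

exact = binom.cdf(6, 4000, 0.001)   # P(X <= 6), n = 4000, p = 0.001
approx = poisson.cdf(6, 4.0)        # Poisson with c = np = 4
print(round(exact, 3), round(approx, 3))   # both about 0.889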

5-10 GENERATION OF REALIZATIONS


Schemes exist for using random numbers, as described in Section 3-6, to generate realizations
of most common random variables.

With Bernoulli trials, we might first generate a value u_i as the ith realization of a uniform
[0, 1] random variable U, where

f(u) = 1,    0 ≤ u ≤ 1,
     = 0,    otherwise,

and independence among the sequence u_i is maintained. Then if u_i ≤ p, we let x_i = 1, and
if u_i > p, x_i = 0. Thus if Y = Σ_{i=1}^{n} X_i, Y will follow a binomial distribution with parameters
n and p, and this entire process might be repeated to produce a series of values of Y, that is,
realizations from the binomial distribution with parameters n and p.

Similarly, we could produce geometric variates by sequentially generating values u_i
and counting the number of trials until u_i ≤ p. At the point this condition is met, the trial
number is assigned to the random variable X, and the entire process is repeated to produce
a series of realizations of a geometric random variable.

Also, a similar scheme may be used for Pascal random variable realizations, where we
proceed testing u_i until this condition has been satisfied r times, at which point the trial
number is assigned to X, and once again, the entire process is repeated to obtain subsequent
realizations.
Realizations from a Poisson distribution with parameter λt = c may be obtained by
employing a technique based on the so-called acceptance-rejection method. The approach is
to sequentially generate values u_i as described above until the product u₁·u₂···u_{k+1} < e^{-c} is
obtained, at which point we assign X ← k, and once again, this process is repeated to obtain a
sequence of realizations.
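A compact sketch of these schemes, using NumPy uniforms as the source of the u_i (an assumption; any uniform [0, 1] generator would do), is given below:

import numpy as np

rng = np.random.default_rng(7)

def binomial_realization(n, p):
    # count the Bernoulli successes u_i <= p among n uniforms
    return sum(1 for u in rng.random(n) if u <= p)

def geometric_realization(p):
    trials = 1
    while rng.random() > p:          # count trials until u_i <= p
        trials += 1
    return trials

def poisson_realization(c):
    k, prod = 0, rng.random()
    while prod >= np.exp(-c):        # multiply u's until the product falls below e^{-c}
        prod *= rng.random()
        k += 1
    return k

print(binomial_realization(10, 0.3),
      geometric_realization(0.25),
      poisson_realization(4.0))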
See Chapter 19 for more details on the generation of discrete random variables.

5-11 SUMMARY
The distributions presented in this chapter have wide use in engineering, scientific, and
management applications. The selection of a specific discrete distribution will depend on
how well the assumptions underlying the distribution are met by the phenomenon to be
modeled. The distributions presented here were selected because of their wide applicability.
A summary of these distributions is presented in Table 5-1.

5-12 EXERCISES

5-1. An experiment consists of four independent Bernoulli trials with probability of success p on each trial. The random variable X is the number of successes. Enumerate the probability distribution of X.

5-2. Six independent space missions to the moon are planned. The estimated probability of success on each mission is 0.95. What is the probability that at least five of the planned missions will be successful?
Table 5-1 Summary of Discrete Distributions

Bernoulli
  Parameters: 0 < p < 1
  p(x) = p^x q^{1-x}, x = 0, 1; = 0 otherwise
  Mean: p    Variance: pq    Moment-generating function: pe^t + q

Binomial
  Parameters: n = 1, 2, ...; 0 < p < 1
  p(x) = \binom{n}{x} p^x q^{n-x}, x = 0, 1, 2, ..., n; = 0 otherwise
  Mean: np    Variance: npq    Moment-generating function: (pe^t + q)^n

Geometric
  Parameters: 0 < p < 1
  p(x) = pq^{x-1}, x = 1, 2, ...; = 0 otherwise
  Mean: 1/p    Variance: q/p²    Moment-generating function: pe^t/(1 - qe^t)

Pascal (Neg. binomial)
  Parameters: 0 < p < 1; r = 1, 2, ... (r > 0)
  p(x) = \binom{x-1}{r-1} p^r q^{x-r}, x = r, r + 1, r + 2, ...; = 0 otherwise
  Mean: r/p    Variance: rq/p²    Moment-generating function: [pe^t/(1 - qe^t)]^r

Hypergeometric
  Parameters: N = 1, 2, ...; n = 1, 2, ..., N; D = 1, 2, ..., N
  p(x) = \binom{D}{x}\binom{N-D}{n-x} / \binom{N}{n}, x = 0, 1, 2, ..., min(n, D); = 0 otherwise
  Mean: n·(D/N)    Variance: n·(D/N)·(1 - D/N)·[(N - n)/(N - 1)]
  Moment-generating function: see Kendall and Stuart (1963)

Poisson
  Parameters: c > 0
  p(x) = e^{-c} c^x/x!, x = 0, 1, 2, ...; = 0 otherwise
  Mean: c    Variance: c    Moment-generating function: e^{c(e^t - 1)}

5-3. The XYZ Company has planned sales presentations to a dozen important customers. The probability of receiving an order as a result of such a presentation is estimated to be 0.5. What is the probability of receiving four or more orders as the result of the meetings?

5-4. A stockbroker calls her 20 most important customers every morning. If the probability is one in three of making a transaction as the result of such a call, what are the chances of her handling 10 or more transactions?

5-5. A production process that manufactures transistors operates, on the average, at 2% fraction defective. Every 2 hours a random sample of size 50 is taken from the process. If the sample contains more than two defectives the process must be stopped. Determine the probability that the process will be stopped by the sampling scheme.

5-6. Find the mean and variance of the binomial distribution using the moment-generating function (see equation 5-7).

5-7. A production process manufacturing turn-indicator dash lights is known to produce lights that are 1% defective. Assume this value remains unchanged and assume a sample of 100 such lights is randomly selected. Find P(p̂ ≤ 0.03), where p̂ is the sample fraction defective.

5-8. Suppose a random sample of size 200 is taken from a process that is 0.07 fraction defective. What is the probability that p̂ will exceed the true fraction defective by one standard deviation? By two standard deviations? By three standard deviations?

5-9. Five cruise missiles have been built by an aerospace company. The probability of a successful firing is, on any one test, 0.95. Assuming independent firings, what is the probability that the first failure occurs on the fifth firing?

5-10. A real estate agent estimates his probability of selling a house to be 0.10. He has to see four clients today. If he is successful on the first three calls, what is the probability that his fourth call is unsuccessful?

5-11. Suppose five independent identical laboratory experiments are to be undertaken. Each experiment is extremely sensitive to environmental conditions, and there is only a probability p that it will be completed successfully. Plot, as a function of p, the probability that the fifth experiment is the first failure. Find mathematically the value of p that maximizes the probability of the fifth trial being the first unsuccessful experiment.

5-12. The XYZ Company plans to visit potential customers until a substantial sale is made. Each sales presentation costs $1000. It costs $4000 to travel to the next customer and set up a new presentation.
(a) What is the expected cost of making a sale if the probability of making a sale after any presentation is 0.10?
(b) If the expected profit from each sale is $15,000, should the trips be undertaken?
(c) If the budget for advertising is only $100,000, what is the probability that this sum will be spent without getting an order?

5-13. Find the mean and variance of the geometric distribution using the moment-generating function.

5-14. A submarine's probability of sinking an enemy ship with any one of its torpedoes is 0.8. If the firings are independent, determine the probability of a sinking within the first two firings. Within the first three.

5-15. In Atlanta the probability that a thunderstorm will occur on any day during the spring is 0.05. Assuming independence, what is the probability that the first thunderstorm occurs on April 25? Assume spring begins on March 21.

5-16. A potential customer enters an automobile dealership every hour. The probability of a salesperson concluding a transaction is 0.10. She is determined to keep working until she has sold three cars. What is the probability that she will have to work exactly 8 hours? More than 8 hours?

5-17. A personnel manager is interviewing potential employees in order to fill two jobs. The probability of an interviewee having the necessary qualifications and accepting an offer is 0.8. What is the probability that exactly four people must be interviewed? What is the probability that fewer than four people must be interviewed?

5-18. Show that the moment-generating function of the Pascal random variable is as given by equation 5-22. Use it to determine the mean and variance of the Pascal distribution.

5-19. The probability that an experiment has a successful outcome is 0.80. The experiment is to be repeated until five successful outcomes have occurred. What is the expected number of repetitions required? What is the variance?

5-20. A military commander wishes to destroy an enemy bridge. Each flight of planes he sends out has a probability of 0.8 of scoring a direct hit on the bridge. It takes four direct hits to completely destroy the bridge. If he can mount seven assaults before the

bridge becomes tactically unimportant, what is the probability that the bridge will be destroyed?

5-21. Three companies, X, Y, and Z, have probabilities of obtaining an order for a particular type of merchandise of 0.4, 0.3, and 0.3, respectively. Three orders are to be awarded independently. What is the probability that one company receives all the orders?

5-22. Four companies are interviewing five college students for positions after graduation. Assuming all five receive offers from each company and assuming the probabilities of the companies hiring a new employee are equal, what is the probability that one company gets all of the new employees? None of them?

5-23. We are interested in the weight of bags of feed. Specifically, we need to know if any of the four events below has occurred:

T₁ = (X ≤ 10),           P(T₁) = 0.2,
T₂ = (10 < X ≤ 11),      P(T₂) = 0.2,
T₃ = (11 < X ≤ 11.5),    P(T₃) = 0.2,
T₄ = (11.5 < X),         P(T₄) = 0.4.

If 10 bags are selected at random, what is the probability of four being less than or equal to 10 pounds, one being greater than 10 but less than or equal to 11 pounds, and two being greater than 11.5 pounds?

5-24. In Problem 5-23 what is the probability that all 10 bags weigh more than 11.5 pounds? What is the probability that five bags weigh more than 11.5 pounds and the remaining five weigh less than 10 pounds?

5-25. A lot of 25 color television tubes is subjected to an acceptance testing procedure. The procedure consists of drawing five tubes at random, without replacement, and testing them. If two or fewer tubes fail, the remaining ones are accepted. Otherwise the lot is rejected. Assume the lot contains four defective tubes.
(a) What is the exact probability of lot acceptance?
(b) What is the probability of lot acceptance computed from the binomial distribution with p = 4/25?

5-26. Suppose that in Exercise 5-25 the lot size had been 100. Would the binomial approximation be satisfactory in this case?

5-27. A purchaser receives small lots (N = 25) of a high-precision device. She wishes to reject the lot 95% of the time if it contains as many as seven defectives. Suppose she decides that the presence of one defective in the sample is sufficient to cause rejection. How large should her sample size be?

5-28. Show that the moment-generating function of the Poisson random variable is as given by equation 5-38.

5-29. The number of automobiles passing through a particular intersection per hour is estimated to be 25. Find the probability that fewer than 10 vehicles pass through during any 1-hour interval. Assume that the number of vehicles follows a Poisson distribution.

5-30. Calls arrive at a telephone switchboard such that the number of calls per hour follows a Poisson distribution with a mean of 10. The current equipment can handle up to 20 calls without becoming overloaded. What is the probability of such an overload occurring?

5-31. The number of red blood cells per square unit visible under a microscope follows a Poisson distribution with a mean of 4. Find the probability that more than five such blood cells are visible to the observer.

5-32. Let X_t be the number of vehicles passing through an intersection during a length of time t. The random variable X_t is Poisson distributed with a parameter λt. Suppose an automatic counter has been installed to count the number of passing vehicles. However, this counter is not functioning properly, and each passing vehicle has a probability p of not being counted. Let Y_t be the number of vehicles counted during t. Find the probability distribution of Y_t.

5-33. A large insurance company has discovered that 0.2% of the U.S. population is injured as a result of a particular type of accident. This company has 15,000 policyholders carrying coverage against such an accident. What is the probability that three or fewer claims will be filed against those policies next year? Five or more claims?

5-34. Maintenance crews arrive at a tool crib requesting a particular spare part according to a Poisson distribution with parameter λ = 2. Three of these spare parts are normally kept on hand. If more than three orders occur, the crews must journey a considerable distance to central stores.
(a) On a given day, what is the probability that such a journey must be made?
(b) What is the expected demand per day for spare parts?
(c) How many spare parts must be carried if the tool crib is to service all incoming crews 90% of the time?
(d) What is the expected number of crews serviced daily at the tool crib?
(e) What is the expected number of crews making the journey to central stores?

5-35. A loom experiences one yarn breakage approximately every 10 hours. A particular style of cloth is being produced that will take 25 hours on this loom. If three or more breaks are required to render the

product unsatisfactory, find the probability that this style of cloth is finished with acceptable quality.

5-36. The number of people boarding a bus at each stop follows a Poisson distribution with parameter λ. The bus company is surveying its usages for scheduling purposes and has installed an automatic counter on each bus. However, if more than 10 people board at any one stop, the counter cannot record the excess and merely registers 10. If X is the number of riders recorded, find the probability distribution of X.

5-37. A mathematics textbook has 200 pages on which typographical errors in the equations could occur. If there are in fact five errors randomly dispersed among these 200 pages, what is the probability that a random sample of 50 pages will contain at least one error? How large must the random sample be to assure that at least three errors will be found with 90% probability?

5-38. The probability of a vehicle having an accident at a particular intersection is 0.0001. Suppose that 10,000 vehicles per day travel through this intersection. What is the probability of no accidents occurring? What is the probability of two or more accidents?

5-39. If the probability of being involved in an auto accident is 0.01 during any year, what is the probability of having two or more accidents during any 10-year driving period?

5-40. Suppose that the number of accidents to employees working on high-explosive shells over a period of time (say 5 weeks) is taken to follow a Poisson distribution with parameter λ = 2.
(a) Find the probabilities of 1, 2, 3, 4, or 5 accidents.
(b) The Poisson distribution has been freely applied in the area of industrial accidents. However, it frequently provides a poor "fit" to actual historical data. Why might this be true? Hint: See Kendall and Stuart (1963), pp. 128-130.

5-41. Use either your favorite computer language or the random integers in Table XV in the Appendix (and scale them by multiplying by 10⁻⁵ to get uniform [0, 1] random number realizations) to do the following:
(a) Produce five realizations of a binomial random variable with n = 5, p = 0.5.
(b) Produce ten realizations of a geometric distribution with p = 0.4.
(c) Produce five realizations of a Poisson random variable with c = 0.15.

5-42. If Y = X^{1.3} and X follows a geometric distribution with a mean of 6, use uniform [0, 1] random number realizations, and produce five realizations of Y.

5-43. With Exercise 5-42 above, use your computer to do the following:
(a) Produce 500 realizations of Y.
(b) Calculate ȳ = (1/500)(y₁ + y₂ + ··· + y₅₀₀), the mean from this sample.

5-44. Prove the memoryless property of the geometric distribution.

Chapter 6

Some Important Continuous Distributions
6-1 INTRODUCTION
We will now study several important continuous probability distributions. They are the uniform,
exponential, gamma, and Weibull distributions. In Chapter 7 the normal distribution,
and several other probability distributions closely related to it, will be presented. The normal
distribution is perhaps the most important of all continuous distributions. The reason
for postponing its study is that the normal distribution is important enough to warrant a separate
chapter.

It has been noted that the range space for a continuous random variable X consists of
an interval or a set of intervals. This was illustrated in an earlier chapter, and it was observed
that an idealization is involved. For example, if we are measuring the time to failure for an
electronic component or the time to process an order through an information system, the
measurement devices used are such that there are only a finite number of possible outcomes;
however, we will idealize and assume that time may take any value on some interval.
Once again we will simplify the notation where no ambiguity is introduced, and we let
f(x) = f_X(x) and F(x) = F_X(x).

6-2 THE UNIFORM DISTRIBUTION


The uniform density function is defined as

f(x) = 1/(β - α),    α ≤ x ≤ β,
     = 0,            otherwise,                                         (6-1)

where α and β are real constants with α < β. The density function is shown in Fig. 6-1.
Since a uniformly distributed random variable has a probability density function that is

Figure 6-1 A uniform density.


constant over some interval of definition, the constant must be the reciprocal of the length
of the interval in order to satisfy the requirement that

∫_{-∞}^{∞} f(x) dx = 1.

A uniformly distributed random variable represents the continuous analog to equally
likely outcomes in the sense that for any subinterval [a, b], where α ≤ a < b ≤ β, the probability
P(a ≤ X ≤ b) depends only on the length b - a:

P(a ≤ X ≤ b) = ∫_a^b dx/(β - α) = (b - a)/(β - α).

The statement that we choose a point at random on [α, β] simply means that the value chosen,
say Y, is uniformly distributed on [α, β].

6-2.1 Mean and Variance of the Uniform Distribution

The mean and variance of the uniform distribution are

E(X) = ∫_α^β x dx/(β - α) = (α + β)/2                                   (6-2)

(which is obvious by symmetry) and

V(X) = ∫_α^β x² dx/(β - α) - [(β + α)/2]² = (β - α)²/12.                (6-3)

The moment-generating function M_X(t) is found as follows:

M_X(t) = ∫_α^β e^{tx} dx/(β - α) = (e^{tβ} - e^{tα})/[t(β - α)],    t ≠ 0.    (6-4)

For a uniformly distributed random variable, the distribution function F(x) = P(X ≤ x)
is given by equation 6-5, and its graph is shown in Fig. 6-2:

F(x) = 0,                    x < α,
     = (x - α)/(β - α),      α ≤ x < β,
     = 1,                    x ≥ β.                                     (6-5)

Figure 6-2 Distribution function for the uniform random variable.

Example 6-1
A point is chosen at random on the interval [0, 10]. Suppose we wish to find the probability that the
point lies between 1/3 and 1/2. The density of the random variable X is f(x) = 1/10, 0 ≤ x ≤ 10, and
f(x) = 0 otherwise. Hence, P(1/3 ≤ X ≤ 1/2) = 1/60.
Example 6-2
Numbers of the form NN.N are "rounded off" to the nearest integer. The round-off procedure is such
that if the decimal part is less than 0.5, the round is "down" by simply dropping the decimal part;
however, if the decimal part is greater than 0.5, the round is "up," that is, the new number is ⌊NN.N⌋ + 1,
where ⌊·⌋ is the "greatest integer contained in" function. If the decimal part is exactly 0.5, a coin is
tossed to determine which way to round. The round-off error, X, is defined as the difference between
the number before rounding and the number after rounding. These errors are commonly distributed
according to the uniform distribution on the interval [-0.5, +0.5]. That is,

f(x) = 1,    -0.5 ≤ x ≤ +0.5,
     = 0,    otherwise.

Example 6-3
One of the special features of many simulation languages is a simple automatic procedure for using the
uniform distribution. The user declares a mean and modifier (e.g., 500, 100). The compiler immediately
creates a routine to produce realizations of a random variable X uniformly distributed on [400, 600].

In the special case where α = 0, β = 1, the uniform variable is said to be uniform on
[0, 1], and the symbol U is often used to describe this special variable. Using the results from
equations 6-2 and 6-3, we note that E(U) = 1/2 and V(U) = 1/12. If U₁, U₂, ..., U_k is a sequence
of such variables, where the variables are mutually independent, the values U₁, U₂, ..., U_k
are called random numbers, and a realization u₁, u₂, ..., u_k is properly called a random number
realization; however, in common usage, the term "random numbers" is often given to
the realizations.

6-3 THE EXPONENTIAL DISTRIBUTION


The exponential distribution has density function

f(x) = λe^{-λx},    x ≥ 0,
     = 0,           otherwise,                                          (6-6)

where the parameter λ is a real, positive constant. A graph of the exponential density is
shown in Fig. 6-3.

Figure 6-3 The exponential density function.

6-3.1 The Relationship of the Exponential Distribution to the Poisson Distribution

The exponential distribution is closely related to the Poisson distribution, and an explanation
of this relationship should help the reader develop an understanding of the kinds of situations
for which the exponential density is appropriate.

In developing the Poisson distribution from the Poisson postulates and the Poisson
process, we fixed time at some value t, and we developed the distribution of the number of
occurrences in the interval (0, t]. We denoted this random variable X_t, and the distribution
was

p(x) = e^{-λt}(λt)^x/x!,    x = 0, 1, 2, ...,
     = 0,                   otherwise.                                  (6-7)

Now consider p(0), which is the probability of no occurrences on (0, t]. This is given by

p(0) = e^{-λt}.                                                         (6-8)

Recall that we originally fixed time at t. Another interpretation of p(0) = e^{-λt} is that this is
the probability that the time to the first occurrence is greater than t. Considering this time
as a random variable T, we note that

p(0) = P(T > t) = e^{-λt},    t ≥ 0.                                    (6-9)

If we now let time vary and consider the random variable T as the time to occurrence, then

F(t) = P(T ≤ t) = 1 - e^{-λt},    t ≥ 0.                                (6-10)

And since f(t) = F′(t), we see that the density is

f(t) = λe^{-λt},    t ≥ 0,
     = 0,           otherwise.                                          (6-11)
This is the exponential density of equation 6-6. Thus, the relationship between the
exponential and Poisson distributions may be stated as follows: if the number of occurrences
has a Poisson distribution as shown in equation 6-7, then the time between successive
occurrences has an exponential distribution as shown in equation 6-11. For example, if
the number of orders for a certain item received per week has a Poisson distribution, then
the time between orders would have an exponential distribution. One variable is discrete
(the count) and the other (time) is continuous.

In order to verify that f is a density function, we note that f(x) ≥ 0 for all x and

∫_0^∞ λe^{-λx} dx = -e^{-λx} |_0^∞ = 1.
6-3.2 Mean and Variance of the Exponential Distribution

The mean and variance of the exponential distribution are

E(X) = ∫_0^∞ xλe^{-λx} dx = 1/λ                                         (6-12)

and

V(X) = ∫_0^∞ x²λe^{-λx} dx - (1/λ)²
     = [-x²e^{-λx} |_0^∞ + 2∫_0^∞ xe^{-λx} dx] - (1/λ)² = 1/λ².          (6-13)

Figure 6-4 Distribution function for the exponential.

The standard deviation is 1/λ, and thus the mean and the standard deviation are equal.
The moment-generating function is

M_X(t) = (1 - t/λ)^{-1},                                                (6-14)

provided t < λ.

The cumulative distribution function F can be obtained by integrating equation 6-6 as
follows:

F(x) = 0,                                x < 0,
     = ∫_0^x λe^{-λr} dr = 1 - e^{-λx},   x ≥ 0.                         (6-15)

Figure 6-4 depicts the distribution function of equation 6-15.

Example 6-4
An electronic component is known to have a useful life represented by an exponential density with
failure rate of 10⁻⁵ failures per hour (i.e., λ = 10⁻⁵). The mean time to failure, E(X), is thus 10⁵ hours.
Suppose we want to determine the fraction of such components that would fail before the mean life
or expected life:

P(X ≤ 1/λ) = ∫_0^{1/λ} λe^{-λx} dx = 1 - e^{-1} ≈ 0.63212.

This result holds for any value of λ greater than zero. In our example, 63.212% of the items
would fail before 10⁵ hours (see Fig. 6-5).

Figure 6-5 The mean of an exponential distribution.

Example 6-5
Suppose a designer is to make a decision between two manufacturing processes for the manufacture
of a certain component. Process A costs C dollars per unit to manufacture a component. Process B
costs k·C dollars per unit to manufacture a component, where k > 1. Components have an exponential
time-to-failure density with a failure rate of 200⁻¹ failures per hour for process A, while components
from process B have a failure rate of 300⁻¹ failures per hour. The mean lives are thus 200 hours
and 300 hours, respectively, for the two processes. Because of a warranty clause, if a component lasts
for fewer than 400 hours, the manufacturer must pay a penalty of K dollars. Let X be the time to failure
of each component. Thus, the component costs are

C_A = C,          if X ≥ 400,
    = C + K,      if X < 400,

and

C_B = kC,         if X ≥ 400,
    = kC + K,     if X < 400.

The expected costs are

E(C_A) = (C + K)∫_0^{400} (1/200)e^{-x/200} dx + C∫_{400}^∞ (1/200)e^{-x/200} dx
       = (C + K)[1 - e^{-2}] + C[e^{-2}]
       = C + K(1 - e^{-2})

and

E(C_B) = (kC + K)∫_0^{400} (1/300)e^{-x/300} dx + kC∫_{400}^∞ (1/300)e^{-x/300} dx
       = (kC + K)[1 - e^{-4/3}] + kC[e^{-4/3}]
       = kC + K(1 - e^{-4/3}).

Therefore, if k < 1 + (K/C)(e^{-4/3} - e^{-2}), then E(C_A) > E(C_B), and it is likely that the designer
would select process B.
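The expected-cost comparison is easy to evaluate numerically. The sketch below assumes illustrative values C = 100, K = 50, and k = 1.2; none of these numbers comes from the text:

import math

def expected_cost(unit_cost, K, mean_life, warranty=400.0):
    p_fail = 1 - math.exp(-warranty / mean_life)   # P(X < 400)
    return unit_cost + K * p_fail

C, K, k = 100.0, 50.0, 1.2
EA = expected_cost(C, K, 200)       # E(C_A) = C + K(1 - e^{-2})
EB = expected_cost(k * C, K, 300)   # E(C_B) = kC + K(1 - e^{-4/3})
print(round(EA, 2), round(EB, 2))   # 143.23 versus 156.82

With these particular numbers k exceeds the threshold 1 + (K/C)(e^{-4/3} - e^{-2}) ≈ 1.064, so E(C_A) < E(C_B) and process A would be preferred.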

6-3.3 Memoryless Property of the Exponential Distribution

The exponential distribution has an interesting and unique memoryless property for continuous
variables; that is,

P(X > x + s | X > x) = P(X > x + s)/P(X > x) = e^{-λ(x+s)}/e^{-λx} = e^{-λs},

so that

P(X > x + s | X > x) = P(X > s).                                        (6-16)

For example, if a cathode ray tube has an exponential time-to-failure distribution and at time
x it is observed to be still functioning, then the remaining life has the same exponential failure
distribution as the tube had at time zero.

6-4 THE GAMMA DISTRIBUTION

6-4.1 The Gamma Function

A function used in the definition of a gamma distribution is the gamma function, defined by

Γ(n) = ∫_0^∞ x^{n-1}e^{-x} dx    for n > 0.                             (6-17)

An important recursive relationship that may easily be shown on integrating equation 6-17
by parts is

Γ(n) = (n - 1)Γ(n - 1).                                                 (6-18)

If n is a positive integer, then

Γ(n) = (n - 1)!,                                                        (6-19)

since Γ(1) = ∫_0^∞ e^{-x} dx = 1. Thus, the gamma function is a generalization of the factorial.
The reader is asked in Exercise 6-17 to verify that

Γ(1/2) = ∫_0^∞ x^{-1/2}e^{-x} dx = √π.                                   (6-20)

6-4.2 Definition of the Gamma Distribution

With the use of the gamma function, we are now able to introduce the gamma probability
density function as

f(x) = [λ/Γ(r)]·(λx)^{r-1}e^{-λx},    x > 0,
     = 0,                             otherwise.                        (6-21)

The parameters are r > 0 and λ > 0. The parameter r is usually called the shape parameter,
and λ is called the scale parameter. Figure 6-6 shows several gamma distributions, for λ = 1
and various r. It should be noted that f(x) ≥ 0 for all x, and

∫_{-∞}^∞ f(x) dx = ∫_0^∞ [λ/Γ(r)]·(λx)^{r-1}e^{-λx} dx = [1/Γ(r)]∫_0^∞ y^{r-1}e^{-y} dy = Γ(r)/Γ(r) = 1.
Figure 6-6 Gamma distributions for λ = 1.

The cumulative distribution function (CDF) of the gamma distribution is analytically
intractable but is readily obtained from software packages such as Excel and Minitab. In particular,
the Excel function GAMMADIST(x, r, 1/λ, TRUE) gives the CDF F(x). For example,
GAMMADIST(15, 5.5, 4.2, TRUE) returns a value of F(15) = 0.2126 for the case r =
5.5, λ = 1/4.2 = 0.238. The Excel function GAMMAINV gives the inverse of the CDF.
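The same value is available in open-source tools as well. This sketch assumes SciPy, whose gamma object takes the shape r and the scale 1/λ:

from scipy.stats import gamma

r, lam = 5.5, 1 / 4.2
print(round(gamma.cdf(15, r, scale=1 / lam), 4))   # 0.2126, matching GAMMADIST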

6-4.3 Relationship Between the Gamma Distribution and the Exponential Distribution

There is a close relationship between the exponential distribution and the gamma distribution.
Namely, if r = 1 the gamma distribution reduces to the exponential distribution. This follows
from the general definition that if the random variable X is the sum of r independent,
exponentially distributed random variables, each with parameter λ, then X has a gamma
density with parameters r and λ. That is to say, if

X = X₁ + X₂ + ··· + X_r,                                                (6-22)

where X_j has probability density function

g(x) = λe^{-λx},    x ≥ 0,
     = 0,           otherwise,

and where the X_j are mutually independent, then X has the density given in equation 6-21.
In many applications of the gamma distribution that we will consider, r will be a positive
integer, and we may use this knowledge to good advantage in developing the distribution
function. Some authors refer to the special case in which r is a positive integer as the
Erlang distribution.

6-4.4 Mean and Variance of the Gamma Distribution


We may show that the mean and variance of the gamma distribution are

E(X) = r/λ                                                              (6-23)

and

V(X) = r/λ².                                                            (6-24)

Equations 6-23 and 6-24 represent the mean and variance regardless of whether or not r is
an integer; however, when r is an integer and the interpretation given in equation 6-22 is
made, it is obvious that

E(X) = Σ_{j=1}^{r} E(X_j) = r·(1/λ) = r/λ

and

V(X) = Σ_{j=1}^{r} V(X_j) = r·(1/λ²) = r/λ²

from a direct application of the expected value and variance operators to the sum of independent
random variables.

The moment-generating function for the gamma distribution is

M_X(t) = (1 - t/λ)^{-r}.                                                (6-25)
Recalling that the moment-generating function for the exponential distribution was
(1 - t/λ)^{-1}, this result is expected, since

M_{(X₁+X₂+···+X_r)}(t) = Π_{j=1}^{r} M_{X_j}(t) = [(1 - t/λ)^{-1}]^r.   (6-26)

The distribution function, F, is

F(x) = 1 - ∫_x^∞ [λ/Γ(r)]·(λt)^{r-1}e^{-λt} dt,    x > 0,
     = 0,                                           x ≤ 0.              (6-27)

If r is a positive integer, then equation 6-27 may be integrated by parts, giving

F(x) = 1 - Σ_{k=0}^{r-1} e^{-λx}(λx)^k/k!,    x > 0,                    (6-28)

which is the sum of Poisson terms with mean λx. Thus, tables of the cumulative Poisson
may be used to evaluate the distribution function of the gamma.

Example 6-6
A redundant system operates as shown in Fig. 6-7. Initially unit 1 is on line, while unit 2 and unit 3 are
on standby. When unit 1 fails, the decision switch (DS) switches unit 2 on until it fails, and then unit 3 is
switched on. The decision switch is assumed to be perfect, so that the system life X may be represented
as the sum of the subsystem lives X = X₁ + X₂ + X₃. If the subsystem lives are independent of one
another, and if the subsystems each have a life X_j, j = 1, 2, 3, having density g(x) = (1/100)e^{-x/100}, x ≥ 0,
then X will have a gamma density with r = 3 and λ = 0.01. That is,

f(x) = (0.01/2!)·(0.01x)²e^{-0.01x},    x > 0,
     = 0,                               otherwise.

The probability that the system will operate at least x hours is denoted R(x) and is called the reliability
function. Here,

R(x) = 1 - F(x) = Σ_{k=0}^{2} e^{-0.01x}(0.01x)^k/k!
     = e^{-0.01x}[1 + (0.01x) + (0.01x)²/2].

Figure 6-7 A standby redundant system.
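The reliability function of this example can be evaluated either from the closed form above or from a gamma CDF; the sketch below (SciPy assumed) checks the two against each other at x = 100 hours:

import math
from scipy.stats import gamma

def R(x, r=3, lam=0.01):
    return 1 - gamma.cdf(x, r, scale=1 / lam)

x = 100.0
closed_form = math.exp(-0.01 * x) * (1 + 0.01 * x + (0.01 * x) ** 2 / 2)
print(round(R(x), 4), round(closed_form, 4))   # both 0.9197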



Example 6-7
For a gamma distribution with λ = 1/2 and r = v/2, where v is a positive integer, the chi-square distribution
with v degrees of freedom results:

f(x) = [1/(2^{v/2}Γ(v/2))]·x^{(v/2)-1}e^{-x/2},    x > 0,
     = 0,                                          otherwise.

This distribution will be discussed further in Chapter 8.

6-5 THE WEIBULL DISTRIBUTION

The Weibull distribution has been widely applied to many random phenomena. The principal
utility of the Weibull distribution is that it affords an excellent approximation to the
probability law of many random variables. One important area of application has been as a
model for time to failure in electrical and mechanical components and systems. This is discussed
in Chapter 17. The density function is

f(x) = (β/δ)·[(x - γ)/δ]^{β-1}·exp[-((x - γ)/δ)^β],    x ≥ γ,
     = 0,                                              otherwise.       (6-29)

Its parameters are γ (-∞ < γ < ∞), the location parameter; δ > 0, the scale parameter; and
β > 0, the shape parameter. By appropriate selection of these parameters, this density function
will closely approximate many observational phenomena.

Figure 6-8 shows some Weibull densities for γ = 0, δ = 1, and β = 1, 2, 3, 4. Note that
when γ = 0 and β = 1, the Weibull distribution reduces to an exponential density with λ =
1/δ. Although the exponential distribution is a special case of both the gamma and Weibull
distributions, the gamma and Weibull in general are noninterchangeable.

6-5.1 Mean and Variance of the Weibull Distribution

The mean and variance of the Weibull distribution can be shown to be

E(X) = γ + δ·Γ(1 + 1/β)                                                 (6-30)

Figure 6-8 Weibull densities for γ = 0, δ = 1, and β = 1, 2, 3, 4.

and

V(X) = δ²[Γ(1 + 2/β) - (Γ(1 + 1/β))²].                                  (6-31)

The distribution function has the relatively simple form

F(x) = 1 - exp[-((x - γ)/δ)^β],    x ≥ γ.                               (6-32)

The Weibull CDF F(x) is conveniently provided by software packages such as Excel and
Minitab. For the case γ = 0, the Excel function WEIBULL(x, β, δ, TRUE) returns F(x).

Example 6-8
The time-to-failure distribution for electronic subassemblies is known to have a Weibull density with
γ = 0, β = 1/2, and δ = 100. The fraction expected to survive to, say, 400 hours is thus

1 - F(400) = e^{-(400/100)^{1/2}} = e^{-2} = 0.1353.

The same result could have been obtained in Excel via the function call 1 - WEIBULL(400, 1/2, 100,
TRUE). The mean time to failure is

E(X) = 0 + 100·Γ(3) = 200 hours.
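The same two quantities can be reproduced with SciPy's Weibull distribution (an assumed alternative to the Excel call; weibull_min takes the shape β and the scale δ, with γ = 0 here):

from scipy.stats import weibull_min

beta, delta = 0.5, 100.0
print(round(weibull_min.sf(400, beta, scale=delta), 4))   # survival: 0.1353
print(weibull_min.mean(beta, scale=delta))                # mean life: 200.0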

Example 6-9
Berrettoni (1964) presented a number of applications of the Weibull distribution. The following are
examples of natural processes having a probability law closely approximated by the Weibull distribution.
The random variable is denoted X in the examples.

1. Corrosion resistance of magnesium alloy plates.
   X: Corrosion weight loss of 10² mg/(cm²)(day) when magnesium alloy plates are immersed
   in an inhibited aqueous 20% solution of MgBr₂.
2. Return goods classified according to number of weeks after shipment.
   X: Length of period (10⁻¹ weeks) until a customer returns the defective product after
   shipment.
3. Number of downtimes per shift.
   X: Number of downtimes per shift (times 10⁻¹) occurring in a continuous automatic and complicated
   assembly line.
4. Leakage failure in dry-cell batteries.
   X: Age (years) when leakage starts.
5. Reliability of capacitors.
   X: Life (hours) of 3.3-μF, 50-V, solid tantalum capacitors operating at an ambient temperature
   of 125°C, where the rated catalogue voltage is 33 V.

6-6 GENERATION OF REALIZATIONS

Suppose for now that U₁, U₂, ... are independent uniform [0, 1] random variables. We will
show how to use these uniforms to generate other random variables.

If we desire to produce realizations of a uniform random variable on [α, β], this is simply
accomplished using

x_i = α + u_i(β - α),    i = 1, 2, ....                                 (6-33)

If we seek realizations of an exponential random variable with parameter λ, the inverse
transform method yields

x_i = (-1/λ)·ln(u_i),    i = 1, 2, ....                                 (6-34)

Similarly, using the same method, realizations of a Weibull random variable with parameters
γ, β, δ are obtained using

x_i = γ + δ·[-ln(u_i)]^{1/β},    i = 1, 2, ....                         (6-35)

The generation of gamma variable realizations usually employs a technique known as
the acceptance-rejection method, and a variety of these methods have been used. If we wish
to produce realizations from a gamma variable with parameters r > 1 and λ > 0, one
approach, suggested by Cheng (1977), is as follows:

Step 1. Let a = (2r - 1)^{1/2} and b = 2r - ln 4 + 1/a.
Step 2. Generate u₁, u₂ as uniform [0, 1] random number realizations.
Step 3. Let y = r[u₁/(1 - u₁)]^a.
Step 4a. If y > b - ln(u₁²u₂), reject y and return to Step 2.
Step 4b. If y ≤ b - ln(u₁²u₂), assign x ← y/λ.

For more details on these and other random-variate generation techniques, see
Chapter 19.
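A minimal sketch of the inverse-transform recipes in equations 6-33 through 6-35, with NumPy assumed as the source of the uniform realizations u_i, follows:

import numpy as np

rng = np.random.default_rng(42)
u = rng.random(5)                     # five uniform [0, 1] realizations

alpha, beta_u = 10.0, 20.0
x_uniform = alpha + u * (beta_u - alpha)            # equation 6-33

lam = 2.0
x_exponential = -np.log(u) / lam                    # equation 6-34

gamma_loc, delta, beta_shape = 0.0, 100.0, 0.5
x_weibull = gamma_loc + delta * (-np.log(u)) ** (1 / beta_shape)   # equation 6-35

print(x_uniform, x_exponential, x_weibull, sep="\n")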

6-7 SUMMARY
This chapter has presented four widely used density functions for continuous random variables.
The uniform, exponential, gamma, and Weibull distributions were presented along
with underlying assumptions and example applications. Table 6-1 presents a summary of
these distributions.

6-8 EXERCISES
6-1. A point is chosen at random on the line segment [0, d]. What is the probability that it lies between d/4 and 3d/4? Between d/2 and 3d/4?

6-2. The opening price of a particular stock is uniformly distributed on the interval [35½, 44¼]. What is the probability that, on any given day, the opening price is less than 40? Between 40 and 42?

6-3. The random variable X is uniformly distributed on the interval [0, 2]. Find the distribution of the random variable Y = 5 + 2X.

6-4. A real estate broker charges a fixed fee of $50 plus a 6% commission on the landowner's profit. If this profit is uniformly distributed between $0 and $2000, find the probability distribution of the broker's total fees.

6-5. Use the moment-generating function for the uniform density (as given by equation 6-4) to generate the mean and variance.

6-6. Let X be uniformly distributed and symmetric about zero with variance 1. Find the appropriate values for α and β.

6-7. Show how the uniform density function can be used to generate variates from the empirical probability distribution described below:
Table 6-1 Summary of Continuous Distributions

Uniform
  Parameters: α, β with β > α
  f(x) = 1/(β - α), α ≤ x ≤ β; = 0 otherwise
  Mean: (α + β)/2    Variance: (β - α)²/12
  Moment-generating function: (e^{tβ} - e^{tα})/[t(β - α)]

Exponential
  Parameters: λ > 0
  f(x) = λe^{-λx}, x ≥ 0; = 0 otherwise
  Mean: 1/λ    Variance: 1/λ²
  Moment-generating function: (1 - t/λ)^{-1}

Gamma
  Parameters: λ > 0, r > 0
  f(x) = [λ/Γ(r)]·(λx)^{r-1}e^{-λx}, x > 0; = 0 otherwise
  Mean: r/λ    Variance: r/λ²
  Moment-generating function: (1 - t/λ)^{-r}

Weibull
  Parameters: -∞ < γ < ∞, δ > 0, β > 0
  f(x) = (β/δ)·[(x - γ)/δ]^{β-1}·exp[-((x - γ)/δ)^β], x ≥ γ; = 0 otherwise
  Mean: γ + δ·Γ(1/β + 1)
  Variance: δ²[Γ(2/β + 1) - (Γ(1/β + 1))²]

y      p(y)
1      0.3
2      0.2
3      0.4
4      0.1

Hint: Apply the inverse transform method.

6-8. The random variable X is uniformly distributed over the interval [0, 4]. What is the probability that the roots of y² + 4Xy + X + 1 = 0 are real?

6-9. Verify that the moment-generating function of the exponential distribution is as given by equation 6-14. Use it to generate the mean and variance.

6-10. The engine and drive train of a new car is guaranteed for 1 year. The mean life of an engine and drive train is estimated to be 3 years, and the time to failure has an exponential density. The realized profit on a new car is $1000. Including costs of parts and labor, the dealer must pay $250 to repair each failure. What is the expected profit per car?

6-11. For the data in Exercise 6-10, what percentage of cars will experience failure in the engine and drive train during the first 6 months of use?

6-12. Let the length of time a machine will operate be an exponentially distributed random variable with probability density function f(t) = θe^{-θt}, t ≥ 0. Suppose an operator for this machine must be hired for a predetermined and fixed length of time, say Y. She is paid d dollars per time period during this interval. The net profit from operating this machine, exclusive of labor costs, is r dollars per time period that it is operating. Find the value of Y that maximizes the expected total profit obtained.

6-13. The time to failure of a television tube is estimated to be exponentially distributed with a mean of 3 years. A company offers insurance on these tubes for the first year of usage. On what percentage of policies will they have to pay a claim?

6-14. Is there an exponential density that satisfies the following condition?

P(X ≤ 2) = (2/3)·P(X ≤ 3)

If so, find the value of λ.

6-15. Two manufacturing processes are under consideration. The per-unit cost for process I is C, while for process II it is 3C. Products from both processes have exponential time-to-failure densities with mean rates of 25⁻¹ failures per hour and 35⁻¹ failures per hour from I and II, respectively. If a product fails before 15 hours it must be replaced at a cost of Z dollars. Which process would you recommend?

6-16. A transistor has an exponential time-to-failure distribution with a mean time to failure of 20,000 hours. The transistor has already lasted 20,000 hours in a particular application. What is the probability that the transistor fails by 30,000 hours?

6-17. Show that Γ(1/2) = √π.

6-18. Prove the gamma function properties given by equations 6-18 and 6-19.

6-19. A ferry boat will take its customers across a river when 10 cars are aboard. Experience shows that cars arrive at the ferry boat independently and at a mean rate of seven per hour. Find the probability that the time between consecutive trips will be at least 1 hour.

6-20. A box of candy contains 24 bars. The time between demands for these candy bars is exponentially distributed with a mean of 10 minutes. What is the probability that a box of candy bars opened at 8:00 A.M. will be empty by noon?

6-21. Use the moment-generating function of the gamma distribution (as given by equation 6-25) to find the mean and variance.

6-22. The life of an electronic system is Y = X₁ + X₂ + X₃ + X₄, the sum of the subsystem component lives. The subsystems are independent, each having exponential failure densities with a mean time between failures of 4 hours. What is the probability that the system will operate for at least 24 hours?

6-23. The replenishment time for a certain product is known to be gamma distributed with a mean of 40 and a variance of 400. Find the probability that an order is received within the first 20 days after it is ordered. Within the first 60 days.

6-24. Suppose a gamma distributed random variable is defined over the interval u ≤ x < ∞ with density function

f(x) = [λ^r/Γ(r)]·(x - u)^{r-1}e^{-λ(x-u)},    x ≥ u, λ > 0, r > 0,
     = 0,                                       otherwise.

Find the mean of this three-parameter gamma distribution.

6-25. The beta probability distribution is defined by

f(x) = [Γ(λ + r)/(Γ(λ)Γ(r))]·x^{λ-1}(1 - x)^{r-1},    0 ≤ x ≤ 1, λ > 0, r > 0,
     = 0,                                              otherwise.

(a) Graph the distribution for λ > 1, r > 1.
(b) Graph the distribution for λ < 1, r < 1.
(c) Graph the distribution for λ < 1, r ≥ 1.
(d) Graph the distribution for λ ≥ 1, r < 1.
(e) Graph the distribution for λ = r.

6-26. Show that when λ = r = 1 the beta distribution reduces to the uniform distribution.

6-27. Show that when λ = 2, r = 1 or λ = 1, r = 2 the beta distribution reduces to a triangular probability distribution. Graph the density function.

6-28. Show that if λ = r = 2 the beta distribution reduces to a parabolic probability distribution. Graph the density function.

6-29. Find the mean and variance of the beta distribution.

6-30. Find the mean and variance of the Weibull distribution.

6-31. The diameter of steel shafts is Weibull distributed with parameters γ = 1.0 inch, β = 2, and δ = 0.5. Find the probability that a randomly selected shaft will not exceed 1.5 inches in diameter.

6-32. The time to failure of a certain transistor is known to be Weibull distributed with parameters γ = 0, β = ½, and δ = 400. Find the fraction expected to survive 600 hours.

6-33. The time to leakage failure in a certain type of dry-cell battery is expected to have a Weibull distribution with parameters γ = 0, β = ½, and δ = 400. What is the probability that a battery will survive beyond 800 hours of use?

6-34. Graph the Weibull distribution with γ = 0, δ = 1, and β = 1, 2, 3, and 4.

6-35. The time to failure density for a small computer system has a Weibull density with γ = 0, β = …, and δ = 200.
(a) What fraction of these units will survive to 1000 hours?
(b) What is the mean time to failure?

6-36. A manufacturer of a commercial television monitor guarantees the picture tube for 1 year (8760 hours). The monitors are used in airport terminals for flight schedules, and they are in continuous use with power on. The mean life of the tubes is 20,000 hours, and they follow an exponential time-to-failure density. It costs the manufacturer $300 to make, sell, and deliver a monitor that will be sold for $400. It costs $150 to replace a failed tube, including materials and labor. The manufacturer has no replacement obligation beyond the first replacement. What is the manufacturer's expected profit?

6-37. The lead time for orders of diodes from a certain manufacturer is known to have a gamma distribution with a mean of 20 days and a standard deviation of 10 days. Determine the probability of receiving an order within 15 days of the placement date.

6-38. Use random numbers generated from your favorite computer language, or from scaling the random integers in Table XV of the Appendix by multiplying by 10⁻⁵, and do the following:
(a) Produce 10 realizations of a variable that is uniform on [10, 20].
(b) Produce five realizations of an exponential random variable with a parameter of λ = 2 × 10⁻⁵.
(c) Produce five realizations of a gamma variable with r = 2 and λ = 4.
(d) Produce 10 realizations of a Weibull variable with γ = 0, β = ½, δ = 100.

6-39. Use the random number generation schemes suggested in Exercise 6-38, and do the following:
(a) Produce 10 realizations of Y = 2X…, where X follows an exponential distribution with a mean of 10.
(b) Produce 10 realizations of Y = …, where Xᵢ is gamma with r = 2, λ = 4 and Uᵢ is uniform on [0, 1].
Chapter 7

The Normal Distribution

7-1 INTRODUCTION

In this chapter we consider the normal distribution. This distribution is very important in both the theory and application of statistics. We also discuss the lognormal and bivariate normal distributions.
The normal distribution was first studied in the eighteenth century, when the patterns in errors of measurement were observed to follow a symmetrical bell-shaped distribution. It was first presented in mathematical form in 1733 by DeMoivre, who derived it as a limiting form of the binomial distribution. The distribution was also known to Laplace no later than 1775. Through historical error, it has been attributed to Gauss, whose first published reference to it appeared in 1809, and the term Gaussian distribution is frequently employed. Various attempts were made during the eighteenth and nineteenth centuries to establish this distribution as the underlying probability law for all continuous random variables; thus, the name normal came to be applied.

7-2 THE NORMAL DISTRIBUTION

The normal distribution is in many respects the cornerstone of statistics. A random variable X is said to have a normal distribution with mean μ (−∞ < μ < ∞) and variance σ² > 0 if it has the density function

    f(x) = (1/(σ√(2π))) e^{−(1/2)[(x−μ)/σ]²},  −∞ < x < ∞.    (7-1)

The distribution is illustrated graphically in Fig. 7-1. The normal distribution is used so extensively that the shorthand notation X ~ N(μ, σ²) is often employed to indicate that the random variable X is normally distributed with mean μ and variance σ².

7-2.1 Properties of the Normal Distribution

The normal distribution has several important properties.

1. ∫_{−∞}^{∞} f(x) dx = 1, as required of all density functions.
2. f(x) ≥ 0 for all x.    (7-2)
3. lim_{x→∞} f(x) = 0 and lim_{x→−∞} f(x) = 0.
4. f(μ + x) = f(μ − x). The density is symmetric about μ.
5. The maximum value of f occurs at x = μ.
6. The points of inflection of f are at x = μ ± σ.
Figure 7-1 The normal distribution.

Property 1 may be demonstrated as follows. Let y = (x − μ)/σ in equation 7-1 and denote the integral I. That is,

    I = (1/√(2π)) ∫_{−∞}^{∞} e^{−(1/2)y²} dy.

Our proof that ∫_{−∞}^{∞} f(x) dx = 1 will consist of showing that I² = 1 and then inferring that I = 1, since f must be everywhere positive. Defining a second normally distributed variable, Z, we have

    I² = (1/√(2π)) ∫_{−∞}^{∞} e^{−(1/2)y²} dy · (1/√(2π)) ∫_{−∞}^{∞} e^{−(1/2)z²} dz
       = (1/(2π)) ∫_{−∞}^{∞} ∫_{−∞}^{∞} e^{−(1/2)(y²+z²)} dy dz.

On changing to polar coordinates with the transformation of variables y = r sin θ and z = r cos θ, the integral becomes

    I² = (1/(2π)) ∫_0^{2π} ∫_0^{∞} r e^{−(1/2)r²} dr dθ = (1/(2π)) ∫_0^{2π} dθ = 1,

completing the proof.

7-2.2 Mean and Variance of the Normal Distribution

The mean of the normal distribution may be determined easily. Since

    E(X) = ∫_{−∞}^{∞} (x/(σ√(2π))) e^{−(1/2)[(x−μ)/σ]²} dx,

and if we let z = (x − μ)/σ, we obtain

    E(X) = (1/√(2π)) ∫_{−∞}^{∞} (μ + σz) e^{−z²/2} dz
         = μ ∫_{−∞}^{∞} (1/√(2π)) e^{−z²/2} dz + σ ∫_{−∞}^{∞} (1/√(2π)) z e^{−z²/2} dz.

Since the integrand of the first integral is that of a normal density with μ = 0 and σ² = 1, the value of the first integral is one. The second integral has value zero, since z e^{−z²/2} is an odd function; that is,

    (1/√(2π)) ∫_{−∞}^{∞} z e^{−z²/2} dz = 0,

and thus

    E(X) = μ[1] + σ[0] = μ.    (7-3)

In retrospect, this result makes sense via a symmetry argument.
To find the variance we must evaluate

    V(X) = ∫_{−∞}^{∞} ((x − μ)²/(σ√(2π))) e^{−(1/2)[(x−μ)/σ]²} dx,

and letting z = (x − μ)/σ, we obtain

    V(X) = σ² (1/√(2π)) ∫_{−∞}^{∞} z² e^{−z²/2} dz,

so that

    V(X) = σ².    (7-4)

In summary, the mean and variance of the normal density given in equation 7-1 are μ and σ², respectively.
The moment-generating function for the normal distribution can be shown to be

    M_X(t) = exp(μt + σ²t²/2).    (7-5)

For the development of equation 7-5, see Exercise 7-10.

7-2.3 The Normal Cumulative Distribution Function

The distribution function F is

    F(x) = P(X ≤ x) = ∫_{−∞}^{x} (1/(σ√(2π))) e^{−(1/2)[(u−μ)/σ]²} du.    (7-6)

It is impossible to evaluate this integral without resorting to numerical methods, and even then the evaluation would have to be accomplished for each pair (μ, σ²). However, a simple transformation of variables, z = (x − μ)/σ, allows the evaluation to be independent of μ and σ. That is,

    F(x) = P(X ≤ x) = P(Z ≤ (x − μ)/σ) = ∫_{−∞}^{(x−μ)/σ} (1/√(2π)) e^{−z²/2} dz.    (7-7)

7-2.4 The Standard Normal Distribution

The probability density function in equation 7-7 above,

    φ(z) = (1/√(2π)) e^{−z²/2},  −∞ < z < ∞,

is that of a normal distribution with mean 0 and variance 1; that is, Z ~ N(0, 1), and we say that Z has a standard normal distribution. A graph of the probability density function is shown in Fig. 7-2. The corresponding distribution function is Φ, where

    Φ(z) = ∫_{−∞}^{z} (1/√(2π)) e^{−u²/2} du,    (7-8)

and this function has been well tabulated. A table of the integral in equation 7-8 has been provided in Table II of the Appendix. In fact, many software packages such as Excel and Minitab provide functions to evaluate Φ(z). For instance, the Excel call NORMSDIST(z) does just this task. As an example, we find that NORMSDIST(1.96) = 0.9750. The Excel function NORMSINV returns the inverse CDF. For example, NORMSINV(0.975) = 1.960. The functions NORMDIST(x, μ, σ, TRUE) and NORMINV give the CDF and inverse CDF of the N(μ, σ²) distribution.
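The same evaluations are available outside of Excel. The short sketch below assumes Python with SciPy installed; scipy.stats.norm's cdf and ppf methods play the roles of NORMSDIST and NORMSINV, and the loc and scale arguments handle the general N(μ, σ²) case.

    # A minimal sketch using SciPy's normal distribution object.
    # norm.cdf evaluates Phi; norm.ppf evaluates the inverse CDF.
    from scipy.stats import norm

    print(norm.cdf(1.96))    # P(Z <= 1.96), about 0.9750
    print(norm.ppf(0.975))   # inverse CDF, about 1.960

    # For X ~ N(mu, sigma^2), pass loc=mu and scale=sigma (the standard
    # deviation, not the variance):
    print(norm.cdf(104, loc=100, scale=2))   # P(X <= 104), about 0.9772

Note the design point: SciPy parameterizes by the standard deviation, while the shorthand N(μ, σ²) used in this chapter lists the variance, so a square root is needed when moving between the two.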

7-2.5 Problem-Solving Procedure

The procedure for solving practical problems involving the evaluation of cumulative normal probabilities is actually very simple. For example, suppose that X ~ N(100, 4) and we wish to find the probability that X is less than or equal to 104; that is, P(X ≤ 104) = F(104). Since the standard normal random variable is

    Z = (X − μ)/σ,

we can standardize the point of interest x = 104 to obtain

    z = (x − μ)/σ = (104 − 100)/2 = 2.

Now the probability that the standard normal random variable Z is less than or equal to 2 is equal to the probability that the original normal random variable X is less than or equal to 104. Expressed mathematically,

    P(X ≤ 104) = P(Z ≤ 2),

or

    F(104) = Φ(2).

Figure 7-2 The standard normal distribution.

Appendix Table II contains cumulative standard normal probabilities for various values of z. From this table, we can read

    Φ(2) = 0.9772.

Note that in the relationship z = (x − μ)/σ, the variable z measures the departure of x from the mean μ in standard deviation (σ) units. For instance, in the case just considered, F(104) = Φ(2), which indicates that 104 is two standard deviations (σ = 2) above the mean. In general, x = μ + σz. In solving problems, we sometimes need to use the symmetry property of φ in addition to the tables. It is helpful to make a sketch if there is any confusion in determining exactly which probabilities are required, since the area under the curve and over the interval of interest is the probability that the random variable will lie on the interval.

Example 7-1
The breaking strength (in newtons) of a synthetic fabric is denoted X, and it is distributed as N(800, 144). The purchaser of the fabric requires the fabric to have a strength of at least 772 N. A fabric sample is randomly selected and tested. To find P(X ≥ 772), we first calculate

    P(X < 772) = P((X − μ)/σ < (772 − 800)/12)
               = P(Z < −2.33)
               = Φ(−2.33) = 0.01.

Hence the desired probability, P(X ≥ 772), equals 0.99. Figure 7-3 shows the calculated probability relative to both X and Z. We have chosen to work with the random variable Z because its distribution function is tabulated.

Example 7-2
The time required to repair an automatic loading machine in a complex food-packaging operation of a production process is X minutes. Studies have shown that the approximation X ~ N(120, 16) is quite good. A sketch is shown in Fig. 7-4. If the process is down for more than 125 minutes, all equipment must be cleaned, with the loss of all product in process. The total cost of product loss and cleaning associated with the long downtime is $10,000. In order to determine the probability of this occurring, we proceed as follows:

    P(X > 125) = P(Z > (125 − 120)/4)
               = P(Z > 1.25)
               = 1 − Φ(1.25)
               = 1 − 0.8944
               = 0.1056.

Thus, given a breakdown of the packaging machine, the expected cost is E(C) = 0.1056(10,000 + C_R1) + 0.8944(C_R1), where C is the total cost and C_R1 is the repair cost. Simplified, E(C) = C_R1 + 1056. Suppose management can reduce the mean of the service time distribution to 115 minutes by adding more maintenance personnel. The new cost for repair will be C_R2 > C_R1; however,

    P(X > 125) = P(Z > (125 − 115)/4)
               = P(Z > 2.5)
               = 1 − Φ(2.5)
               = 1 − 0.9938
               = 0.0062,

so that the new expected cost would be C_R2 + 62, and one would logically make the decision to add to the maintenance crew if

    C_R2 + 62 < C_R1 + 1056,

or

    C_R2 − C_R1 < $994.

It is assumed that the frequency of breakdowns remains unchanged.

Figure 7-3 P(X < 772), where X ~ N(800, 144).
Figure 7-4 P(X > 125), where X ~ N(120, 16).

Example 7-3
The pitch diameter of the thread on a fitting is normally distributed with a mean of 0.4008 cm and a standard deviation of 0.0004 cm. The design specifications are 0.4000 ± 0.0010 cm. This is illustrated in Fig. 7-5. Notice that the process is operating with the mean not equal to the nominal specification. We desire to determine what fraction of product is within tolerance. Using the approach employed previously,

    P(0.3990 ≤ X ≤ 0.4010) = P((0.3990 − 0.4008)/0.0004 ≤ Z ≤ (0.4010 − 0.4008)/0.0004)
                           = P(−4.5 ≤ Z ≤ 0.5)
                           = Φ(0.5) − Φ(−4.5)
                           = 0.6915 − 0.0000
                           = 0.6915.

Figure 7-5 Distribution of thread pitch diameters.
As process engineers study the results of such calculations, they decide to replace a worn cutting tool and adjust the machine producing the fittings so that the new mean falls directly at the nominal value of 0.4000. Then,

    P(0.3990 ≤ X ≤ 0.4010) = P((0.3990 − 0.4000)/0.0004 ≤ Z ≤ (0.4010 − 0.4000)/0.0004)
                           = P(−2.5 ≤ Z ≤ +2.5)
                           = Φ(2.5) − Φ(−2.5)
                           = 0.9938 − 0.0062
                           = 0.9876.

We see that with the adjustment, 98.76% of the fittings will be within tolerance. The distribution of adjusted machine pitch diameters is shown in Fig. 7-6.

The previous example illustrates a concept important in quality engineering: operating a process at the nominal level is generally superior to operating the process at some other level, if there are two-sided specification limits.

Example 7-4
Another type of problem involving the use of tables of the normal distribution sometimes arises. Suppose, for example, that X ~ N(50, 4). Furthermore, suppose we want to determine a value of X, say x, such that P(X > x) = 0.025. Then

    P(X > x) = P(Z > (x − 50)/2) = 0.025,

or

    Φ((x − 50)/2) = 0.975,

so that, reading the normal table "backward," we obtain

    (x − 50)/2 = 1.96 = Φ⁻¹(0.975)

and

    x = 50 + 2(1.96) = 53.92.

There are several symmetric intervals that arise frequently. Their probabilities are

    P(μ − 1.00σ ≤ X ≤ μ + 1.00σ) = 0.6826,
    P(μ − 1.645σ ≤ X ≤ μ + 1.645σ) = 0.90,
    P(μ − 1.96σ ≤ X ≤ μ + 1.96σ) = 0.95,
    P(μ − 2.57σ ≤ X ≤ μ + 2.57σ) = 0.99,
    P(μ − 3.00σ ≤ X ≤ μ + 3.00σ) = 0.9973.    (7-9)

Figure 7-6 Distribution of adjusted machine pitch diameters.

7-3 THE REPRODUCTIVE PROPERTY OF THE NORMAL DISTRIBUTION

Suppose we have n independent, normal random variables X₁, X₂, ..., Xₙ, where Xᵢ ~ N(μᵢ, σᵢ²) for i = 1, 2, ..., n. It was shown earlier that if

    Y = X₁ + X₂ + ··· + Xₙ,

then

    E(Y) = μ_Y = Σ_{i=1}^{n} μᵢ    (7-10)

and

    V(Y) = σ_Y² = Σ_{i=1}^{n} σᵢ².    (7-11)

Using moment-generating functions, we see that

    M_Y(t) = M_{X₁}(t) · M_{X₂}(t) ··· M_{Xₙ}(t).    (7-12)

Therefore,

    M_Y(t) = exp[t Σ_{i=1}^{n} μᵢ + (t²/2) Σ_{i=1}^{n} σᵢ²],    (7-13)

which is the moment-generating function of a normally distributed random variable with mean μ₁ + μ₂ + ··· + μₙ and variance σ₁² + σ₂² + ··· + σₙ². Therefore, by the uniqueness property of the moment-generating function, we see that Y is normal with mean μ_Y and variance σ_Y².

Example 7-5
An assembly consists of three linkage components, as shown in Fig. 7-7. The properties of X₁, X₂, and X₃ are given below, with means in centimeters and variances in square centimeters:

    X₁ ~ N(12, 0.02),
    X₂ ~ N(24, 0.03),
    X₃ ~ N(18, 0.04).

Links are produced by different machines and operators, so we have reason to assume that X₁, X₂, and X₃ are independent. Suppose we want to determine P(53.8 ≤ Y ≤ 54.2). Since Y = X₁ + X₂ + X₃, Y is distributed normally with mean μ_Y = 12 + 24 + 18 = 54 and variance σ_Y² = σ₁² + σ₂² + σ₃² = 0.02 + 0.03 + 0.04 = 0.09. Thus,

    P(53.8 ≤ Y ≤ 54.2) = P((53.8 − 54)/0.3 ≤ Z ≤ (54.2 − 54)/0.3)
                       = Φ(0.667) − Φ(−0.667)
                       = 0.748 − 0.252
                       = 0.496.
Figure 7-7 A linkage assembly.

These results can be generalized to linear combinations of independent normal variables. Linear combinations of the form

    Y = a₀ + a₁X₁ + a₂X₂ + ··· + aₙXₙ    (7-14)

were presented earlier, and we found that μ_Y = a₀ + Σ_{i=1}^{n} aᵢμᵢ. Again, if X₁, X₂, ..., Xₙ are independent, then σ_Y² = Σ_{i=1}^{n} aᵢ²σᵢ², and by the reproductive property Y is normally distributed with mean μ_Y and variance σ_Y².
Example 7-6
A shaft is to be assembled into a bearing, as shown in Fig. 7-8. The clearance is Y = X₁ − X₂. Suppose

    X₁ ~ N(1.500, 0.0016)

and

    X₂ ~ N(1.480, 0.0009).

Then

    μ_Y = a₁μ₁ + a₂μ₂ = (1)(1.500) + (−1)(1.480) = 0.02

and

    σ_Y² = a₁²σ₁² + a₂²σ₂² = (1)²(0.0016) + (−1)²(0.0009) = 0.0025,

so that σ_Y = 0.05. When the parts are assembled, there will be interference if Y < 0, so

    P(interference) = P(Y < 0) = P(Z < (0 − 0.02)/0.05) = Φ(−0.4) = 0.3446.

This indicates that 34.46% of all assemblies attempted would meet with failure. If the designer feels that the nominal clearance μ_Y = 0.02 is as large as it can be made for the assembly, then the only way to reduce the 34.46% figure is to reduce the variance of the distributions. In many cases, this can be accomplished by the overhaul of production equipment, better training of production operators, and so on.

Figure 7-8 An assembly.
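As a check on Example 7-6, the interference probability can also be estimated by simulation. The following minimal sketch assumes NumPy; the sample size and the seed are arbitrary choices.

    # Monte Carlo check of Example 7-6 (numbers taken from the example).
    import numpy as np

    rng = np.random.default_rng(1)
    n = 1_000_000
    x1 = rng.normal(1.500, np.sqrt(0.0016), n)   # first dimension, sd = 0.04
    x2 = rng.normal(1.480, np.sqrt(0.0009), n)   # second dimension, sd = 0.03
    y = x1 - x2                                  # clearance
    print((y < 0).mean())                        # should be near Phi(-0.4) = 0.3446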


7-4 THE CENTRAL LIMIT THEOREM

If a random variable Y is the sum of n independent random variables that satisfy certain general conditions, then for sufficiently large n, Y is approximately normally distributed. We state this as a theorem, the most important theorem in all of probability and statistics.

Theorem 7-1 Central Limit Theorem

If X₁, X₂, ..., Xₙ is a sequence of n independent random variables with E(Xᵢ) = μᵢ and V(Xᵢ) = σᵢ² (both finite) and Y = X₁ + X₂ + ··· + Xₙ, then under some general conditions

    Zₙ = (Y − Σ_{i=1}^{n} μᵢ) / √(Σ_{i=1}^{n} σᵢ²)    (7-15)

has an approximate N(0, 1) distribution as n approaches infinity. If Fₙ is the distribution function of Zₙ, then

    lim_{n→∞} Fₙ(z) = Φ(z).    (7-16)

The "general conditions" mentioned in the theorem are informally summarized as follows. The terms Xᵢ, taken individually, contribute a negligible amount to the variance of the sum, and it is not likely that a single term makes a large contribution to the sum.
The proof of this theorem, as well as a rigorous discussion of the necessary assumptions, is beyond the scope of this presentation. There are, however, several observations that should be made. The fact that Y is approximately normally distributed when the Xᵢ terms may have essentially any distribution is the basic underlying reason for the importance of the normal distribution. In numerous applications, the random variable being considered may be represented as the sum of n independent random variables, some of which may be due to measurement error, some due to physical considerations, and so on, and thus the normal distribution provides a good approximation.
A special case of the central limit theorem arises when each of the components has the same distribution.

Theorem 7-2

If X₁, X₂, ..., Xₙ is a sequence of n independent, identically distributed random variables with E(Xᵢ) = μ and V(Xᵢ) = σ², and Y = X₁ + X₂ + ··· + Xₙ, then

    Zₙ = (Y − nμ)/(σ√n)    (7-17)

has an approximate N(0, 1) distribution in the same sense as equation 7-16.

Under the restriction that M_X(t) exists for real t, a straightforward proof may be presented for this form of the central limit theorem. Many mathematical statistics texts present such a proof.
The question immediately encountered in practice is the following: How large must n be to get reasonable results using the normal distribution to approximate the distribution of Y? This is not an easy question to answer, since the answer depends on the characteristics of the distribution of the Xᵢ terms as well as the meaning of "reasonable results." From a practical standpoint, some very crude rules of thumb can be given where the distribution of the Xᵢ terms falls into one of three arbitrarily selected groups, as follows:

1. Well behaved: The distribution of Xᵢ does not radically depart from the normal distribution. There is a bell-shaped density that is nearly symmetric. For this case, practitioners in quality control and other areas of application have found that n should be at least 4; that is, n ≥ 4.
2. Reasonably behaved: The distribution of Xᵢ has no prominent mode, and it appears much as a uniform density. In this case, n ≥ 12 is a commonly used rule.
3. Ill behaved: The distribution has most of its measure in the tails, as in Fig. 7-9. In this case it is most difficult to say; however, in many practical applications, n ≥ 100 should be satisfactory.
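The "reasonably behaved" rule of thumb is easy to see in a simulation. The sketch below, which assumes NumPy, standardizes sums of n = 12 uniform [0, 1] variables as in equation 7-17 and compares an empirical probability with the corresponding Φ value; it is illustrative only.

    # Illustrating Theorem 7-2 with uniform summands (mu = 1/2, sigma^2 = 1/12).
    import numpy as np

    rng = np.random.default_rng(42)
    n, reps = 12, 100_000
    y = rng.uniform(0, 1, size=(reps, n)).sum(axis=1)
    z = (y - n * 0.5) / np.sqrt(n / 12.0)     # standardize as in equation 7-17

    # Compare the empirical P(Z_n <= 1) with Phi(1) = 0.8413.
    print((z <= 1).mean())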

Example 7-7
Small parts are packaged 250 to the crate. Part weights are independent random variables with a mean of 0.5 pound and a standard deviation of 0.10 pound. Twenty crates are loaded to a pallet. Suppose we wish to find the probability that the parts on a pallet will exceed 2510 pounds in weight. (Neglect both pallet and crate weight.) Let

    Y = X₁ + X₂ + ··· + X₅₀₀₀

represent the total weight of the parts, so that

    μ_Y = 5000(0.5) = 2500,
    σ_Y² = 5000(0.01) = 50,

and

    σ_Y = √50 = 7.071.

Then

    P(Y > 2510) = P(Z > (2510 − 2500)/7.071) = 1 − Φ(1.41) = 0.08.

Note that we did not know the distribution of the individual part weights.

Figure 7-9 Ill-behaved distributions.

Table 7-1 Activity Mean Times and Variances (in Weeks and Weeks²)

    Activity   Mean   Variance     Activity   Mean   Variance
    1          2.7    1.0          9          3.1    1.2
    2          3.2    1.3          10         4.2    0.8
    3          4.6    1.0          11         3.6    1.6
    4          2.1    1.2          12         0.5    0.2
    5          3.6    0.8          13         2.1    0.6
    6          5.2    2.1          14         1.5    0.7
    7          7.1    1.9          15         1.2    0.4
    8          1.5    0.5          16         2.8    0.7

Example 7-8
In a construction project, a network of major activities has been constructed to serve as the basis for planning and scheduling. On a critical path there are 16 activities. The means and variances are given in Table 7-1.
The activity times may be considered independent, and the project time is the sum of the activity times on the critical path; that is, Y = X₁ + X₂ + ··· + X₁₆, where Y is the project time and Xᵢ is the time for the ith activity. Although the distributions of the Xᵢ are unknown, the distributions are fairly well behaved. The contractor would like to know (a) the expected completion time, and (b) a project time corresponding to a probability of 0.90 of having the project completed. Calculating μ_Y and σ_Y², we obtain

    μ_Y = 49 weeks,
    σ_Y² = 16 weeks².

The expected completion time for the project is thus 49 weeks. In determining the time y₀ such that the probability is 0.9 of having the project completed by that time, Fig. 7-10 may be helpful. We may calculate

    P(Y ≤ y₀) = 0.90

or

    P(Z ≤ (y₀ − 49)/4) = 0.90,

so that

    (y₀ − 49)/4 = Φ⁻¹(0.90) = 1.282

and

    y₀ = 49 + 1.282(4) = 54.128 weeks.

Figure 7-10 Distribution of project times.
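The 0.90 point found in Example 7-8 is a single inverse-CDF evaluation, so it can be checked in one line; this sketch assumes SciPy.

    # The 0.90 point of N(49, 16), as in Example 7-8.
    from scipy.stats import norm
    print(norm.ppf(0.90, loc=49, scale=4))   # about 54.13; the text's 54.128 uses z = 1.282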
7-5 THE NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION

In Chapter 5, the binomial approximation to the hypergeometric distribution was presented, as was the Poisson approximation to the binomial distribution. In this section we consider the normal approximation to the binomial distribution. Since the binomial is a discrete probability distribution, this may seem to go against intuition; however, a limiting process is involved, keeping p of the binomial distribution fixed and letting n → ∞. The approximation is known as the DeMoivre-Laplace approximation.
We recall the binomial distribution as

    p(x) = [n!/(x!(n − x)!)] pˣ qⁿ⁻ˣ,  x = 0, 1, 2, ..., n,
         = 0, otherwise.

Stirling's approximation to n! is

    n! ≅ (2πn)^{1/2} nⁿ e^{−n}.    (7-18)

The error

    [n! − (2πn)^{1/2} nⁿ e^{−n}] / n! → 0    (7-19)

as n → ∞. Using Stirling's formula to approximate the terms involving n! in the binomial model, we eventually find that, for large n,

    p(x) ≅ (1/√(2πnpq)) e^{−(1/2)[(x−np)²/(npq)]},    (7-20)

so that

    P(a ≤ X ≤ b) ≅ Φ((b − np)/√(npq)) − Φ((a − np)/√(npq)).    (7-21)

This result makes sense in light of the central limit theorem and the fact that X is the sum of independent Bernoulli trials [so that E(X) = np and V(X) = npq]. Thus, the quantity (X − np)/√(npq) approximately has a N(0, 1) distribution. If p is close to ½ and n > 10, the approximation is fairly good; however, for other values of p, the value of n must be larger. In general, experience indicates that the approximation is fairly good as long as np > 5 for p ≤ ½, or when nq > 5 when p > ½.

Example 7-9
In sampling from a production process that produces items of which 20% are defective, a random sample of 100 items is selected each hour of each production shift. The number of defectives in a sample is denoted X. To find, say, P(X ≤ 15) we may use the normal approximation as follows:

    P(X ≤ 15) = P(Z ≤ (15 − 20)/√(100(0.2)(0.8)))
              = P(Z ≤ −5/4)
              = P(Z ≤ −1.25) = Φ(−1.25) = 0.1056.

Since the binomial distribution is discrete and the normal distribution is continuous, it is common practice to use a half-interval correction or continuity correction. In fact, this is a necessity in calculating P(X = x). The usual procedure is to go a half-unit on either side of the integer, depending on the interval of interest. Several cases are shown in Table 7-2.

Table 7-2 Continuity Corrections

    Quantity Desired from         With Continuity               In Terms of the
    the Binomial Distribution     Correction                    Distribution Function Φ

    P(X = x)                      P(x − ½ ≤ X ≤ x + ½)          Φ((x + ½ − np)/√(npq)) − Φ((x − ½ − np)/√(npq))
    P(X ≤ x)                      P(X ≤ x + ½)                  Φ((x + ½ − np)/√(npq))
    P(X < x) = P(X ≤ x − 1)       P(X ≤ x − ½)                  Φ((x − ½ − np)/√(npq))
    P(X ≥ x)                      P(X ≥ x − ½)                  1 − Φ((x − ½ − np)/√(npq))
    P(a ≤ X ≤ b)                  P(a − ½ ≤ X ≤ b + ½)          Φ((b + ½ − np)/√(npq)) − Φ((a − ½ − np)/√(npq))

Example 7-10
Using the data from Example 7-9, where we had n = 100 and p = 0.2, we evaluate P(X = 15), P(X ≤ 15), P(X < 18), P(X ≥ 22), and P(18 < X < 21):

    P(X = 15) = P(14.5 ≤ X ≤ 15.5) = Φ((15.5 − 20)/4) − Φ((14.5 − 20)/4)
              = Φ(−1.125) − Φ(−1.375) = 0.046.

    P(X ≤ 15) = Φ((15.5 − 20)/4) = 0.130.

    P(X < 18) = P(X ≤ 17) = Φ((17.5 − 20)/4) = 0.266.

    P(X ≥ 22) = 1 − Φ((21.5 − 20)/4) = 0.354.

    P(18 < X < 21) = P(19 ≤ X ≤ 20) = Φ((20.5 − 20)/4) − Φ((18.5 − 20)/4)
                   = 0.550 − 0.354 = 0.196.
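A useful check on Table 7-2 and Example 7-10 is to compare the continuity-corrected normal values with the exact binomial probabilities. The sketch below assumes SciPy's binom and norm objects.

    # Exact binomial versus the continuity-corrected normal approximation,
    # for n = 100, p = 0.2 (np = 20, sqrt(npq) = 4), as in Example 7-10.
    from scipy.stats import binom, norm

    n, p = 100, 0.2
    mu, sd = n * p, (n * p * (1 - p)) ** 0.5

    print(binom.pmf(15, n, p))                                       # exact P(X = 15)
    print(norm.cdf((15.5 - mu) / sd) - norm.cdf((14.5 - mu) / sd))   # approx, about 0.046

    print(binom.cdf(15, n, p))                                       # exact P(X <= 15)
    print(norm.cdf((15.5 - mu) / sd))                                # approx, about 0.130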
As discussed in Chapter 5, the random variable p̂ = X/n, where X has a binomial distribution with parameters p and n, is often of interest. Interest in this quantity stems primarily from sampling applications, where a random sample of n observations is made, with each observation classified as success or failure, and where X is the number of successes in the sample. The quantity p̂ is simply the sample fraction of successes. Recall that we showed that

    E(p̂) = p

and

    V(p̂) = pq/n.

In addition to the DeMoivre-Laplace approximation, note that the quantity

    Z = (p̂ − p)/√(pq/n)    (7-23)

has an approximate N(0, 1) distribution. This result has proved useful in many applications, including those in the areas of quality control, work measurement, reliability engineering, and economics. The results are much more useful than those from the law of large numbers.

Example 7-11
Instead of timing the activity of a maintenance mechanic over the period of a week to determine the fraction of his time spent in an activity classification called "secondary but necessary," a technician elects to use a work sampling study, randomly picking 400 time points over the week, taking a flash observation at each, and classifying the activity of the maintenance mechanic. The value X will represent the number of times the mechanic was involved in a "secondary but necessary" activity, and p̂ = X/400. If the true fraction of time that he is involved in this activity is 0.2, we determine the probability that p̂, the estimated fraction, falls between 0.15 and 0.25. That is,

    P(0.15 ≤ p̂ ≤ 0.25) = Φ((0.25 − 0.2)/√(0.16/400)) − Φ((0.15 − 0.2)/√(0.16/400))
                        = Φ(2.5) − Φ(−2.5)
                        = 0.9876.

7-6 THE LOGNORMAL DISTRIBUTION

The lognormal distribution is the distribution of a random variable whose logarithm follows the normal distribution. Some practitioners hold that the lognormal distribution is as fundamental as the normal distribution. It arises from the combination of random terms by a multiplicative process.
The lognormal distribution has been applied in a wide variety of fields, including the physical sciences, life sciences, social sciences, and engineering. In engineering applications, the lognormal distribution has been used to describe "time to failure" in reliability engineering and "time to repair" in maintainability engineering.
7-6.1 Density Function

We consider a random variable X with range space R_X = {x: 0 < x < ∞}, where Y = ln X is normally distributed with mean μ_Y and variance σ_Y²; that is,

    E(Y) = μ_Y and V(Y) = σ_Y².

The density function of X is

    f(x) = (1/(xσ_Y√(2π))) e^{−(1/2)[(ln x − μ_Y)/σ_Y]²},  x > 0,
         = 0, otherwise.    (7-24)

The lognormal distribution is shown in Fig. 7-11. Notice that, in general, the distribution is skewed, with a long tail to the right. The Excel functions LOGNORMDIST and LOGINV provide the CDF and inverse CDF, respectively, of the lognormal distribution.

7-6.2 Mean and Variance of the Lognormal Distribution

The mean and variance of the lognormal distribution are

    E(X) = μ_X = e^{μ_Y + (1/2)σ_Y²}    (7-25)

and

    V(X) = σ_X² = e^{2μ_Y + σ_Y²}(e^{σ_Y²} − 1).    (7-26)

In some applications of the lognormal distribution, it is important to know the values of the median and the mode. The median, which is the value x̃ such that P(X ≤ x̃) = 0.5, is

    x̃ = e^{μ_Y}.    (7-27)

The mode is the value of x for which f(x) is maximum, and for the lognormal distribution, the mode is

    mode = e^{μ_Y − σ_Y²}.    (7-28)

Figure 7-11 shows the relative location of the mean, median, and mode for the lognormal distribution. Since the distribution has a right skew, generally we will find that mode < median < mean.

Figure 7-11 The lognormal distribution.
7-6.3 Properties of the Lognormal Distribution

While the normal distribution has additive reproductive properties, the lognormal distribution has multiplicative reproductive properties. Some of the more important properties are as follows:

1. If X has a lognormal distribution with parameters μ_Y and σ_Y², and if a, b, and d are constants such that b = e^d, then W = bX^a has a lognormal distribution with parameters (d + aμ_Y) and (aσ_Y)².

2. If X₁ and X₂ are independent lognormal variables, with parameters (μ_{Y1}, σ_{Y1}²) and (μ_{Y2}, σ_{Y2}²), respectively, then W = X₁ · X₂ has a lognormal distribution with parameters [(μ_{Y1} + μ_{Y2}), (σ_{Y1}² + σ_{Y2}²)].

3. If X₁, X₂, ..., Xₙ is a sequence of n independent lognormal variates, with parameters (μ_{Yj}, σ_{Yj}²), j = 1, 2, ..., n, respectively, and {a_j} is a sequence of constants while b = e^d is a single constant, then the product

    W = b ∏_{j=1}^{n} X_j^{a_j}

has a lognormal distribution with parameters

    (d + Σ_{j=1}^{n} a_j μ_{Yj}) and (Σ_{j=1}^{n} a_j² σ_{Yj}²).

4. If X₁, X₂, ..., Xₙ are independent lognormal variates, each with the same parameters (μ_Y, σ_Y²), then the geometric mean

    (∏_{j=1}^{n} X_j)^{1/n}

has a lognormal distribution with parameters μ_Y and σ_Y²/n.

Example 7-12
The random variable Y = ln X has a N(10, 4) distribution, so X has a lognormal distribution with a mean and variance of

    E(X) = e^{10 + 4/2} = e^{12} ≅ 162,754

and

    V(X) = e^{2(10)+4}(e⁴ − 1) = e^{24}(e⁴ − 1) ≅ 53.598 e^{24},

respectively. The mode and median are

    mode = e^{10−4} = e⁶ ≅ 403.43

and

    median = e^{10} ≅ 22,026.

In order to determine a specific probability, say P(X ≤ 1000), we use the transform P(ln X ≤ ln 1000) = P(Y ≤ ln 1000):

    P(Y ≤ ln 1000) = P(Z ≤ (ln 1000 − 10)/2)
                   = Φ(−1.55) = 0.0611.
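The computations of Example 7-12 can be verified with SciPy's lognormal object; the only assumption here is SciPy's parameterization, in which s is σ_Y and scale is e^{μ_Y}.

    # Checking Example 7-12 with scipy.stats.lognorm (s = sigma_Y, scale = exp(mu_Y)).
    import numpy as np
    from scipy.stats import lognorm

    mu_y, sigma_y = 10.0, 2.0            # Y = ln X ~ N(10, 4)
    X = lognorm(s=sigma_y, scale=np.exp(mu_y))

    print(X.mean())       # e^{12}, about 162,755
    print(X.median())     # e^{10}, about 22,026
    print(X.cdf(1000.0))  # about 0.061, matching Phi(-1.55)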
Example 7-13
Suppose

    Y₁ = ln X₁ ~ N(4, 1),
    Y₂ = ln X₂ ~ N(3, 0.5),
    Y₃ = ln X₃ ~ N(2, 0.4),
    Y₄ = ln X₄ ~ N(1, 0.01),

and furthermore suppose X₁, X₂, X₃, and X₄ are independent random variables. The random variable W defined as follows represents a critical performance variable on a telemetry system:

    W = e^{1.5}[X₁^{2.5} X₂^{0.2} X₃^{0.7} X₄^{3.1}].

By reproductive property 3, W will have a lognormal distribution with parameters

    1.5 + (2.5·4 + 0.2·3 + 0.7·2 + 3.1·1) = 16.6

and

    (2.5)²·1 + (0.2)²·0.5 + (0.7)²·0.4 + (3.1)²·0.01 = 6.562,

respectively. That is to say, ln W ~ N(16.6, 6.562). If the specifications on W are, say, 20,000 to 600,000, we could determine the probability that W would fall within specifications as follows:

    P(20,000 ≤ W ≤ 600,000) = P[ln(20,000) ≤ ln W ≤ ln(600,000)]
        = Φ((ln 600,000 − 16.6)/√6.562) − Φ((ln 20,000 − 16.6)/√6.562)
        = Φ(−1.290) − Φ(−2.621) = 0.0985 − 0.0044
        = 0.0941.

7-7 THE BIVARIATE NORMAL DISTRIBUTION

Up to this point, all of the continuous random variables have been of one dimension. A very important two-dimensional probability law that is a generalization of the one-dimensional normal probability law is called the bivariate normal distribution. If [X₁, X₂] is a bivariate normal random vector, then the joint density function of [X₁, X₂] is

    f(x₁, x₂) = (1/(2πσ₁σ₂√(1 − ρ²)))
        · exp{−[1/(2(1 − ρ²))][((x₁ − μ₁)/σ₁)² − 2ρ((x₁ − μ₁)/σ₁)((x₂ − μ₂)/σ₂) + ((x₂ − μ₂)/σ₂)²]}    (7-29)

for −∞ < x₁ < ∞ and −∞ < x₂ < ∞.
The joint probability P(a₁ ≤ X₁ ≤ b₁, a₂ ≤ X₂ ≤ b₂) is defined as

    ∫_{a₂}^{b₂} ∫_{a₁}^{b₁} f(x₁, x₂) dx₁ dx₂    (7-30)

and is represented by the volume under the surface and over the region {(x₁, x₂): a₁ ≤ x₁ ≤ b₁, a₂ ≤ x₂ ≤ b₂}, as shown in Fig. 7-12. Owen (1962) has provided a table of probabilities.

Figure 7-12 The bivariate normal density.

The bivariate normal density has five parameters. These are μ₁, μ₂, σ₁, σ₂, and ρ, the correlation coefficient between X₁ and X₂, such that −∞ < μ₁ < ∞, −∞ < μ₂ < ∞, σ₁ > 0, σ₂ > 0, and −1 < ρ < 1.
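In practice, rectangle probabilities of the form of equation 7-30 are evaluated numerically rather than from tables. The sketch below assumes SciPy and uses the joint CDF of multivariate_normal together with inclusion-exclusion; the parameter values are arbitrary illustrations, not taken from the text.

    # P(a1 <= X1 <= b1, a2 <= X2 <= b2) for a bivariate normal, via the joint CDF.
    import numpy as np
    from scipy.stats import multivariate_normal

    mu = np.array([0.0, 0.0])
    s1, s2, rho = 1.0, 2.0, 0.5                      # illustrative parameters
    cov = np.array([[s1**2, rho * s1 * s2],
                    [rho * s1 * s2, s2**2]])
    F = multivariate_normal(mean=mu, cov=cov).cdf    # joint CDF

    a1, b1, a2, b2 = -1.0, 1.0, -2.0, 2.0
    prob = F([b1, b2]) - F([a1, b2]) - F([b1, a2]) + F([a1, a2])
    print(prob)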
The marginal densities f₁ and f₂ are given, respectively, as

    f₁(x₁) = ∫_{−∞}^{∞} f(x₁, x₂) dx₂ = (1/(σ₁√(2π))) e^{−(1/2)[(x₁−μ₁)/σ₁]²}    (7-31)

for −∞ < x₁ < ∞, and

    f₂(x₂) = ∫_{−∞}^{∞} f(x₁, x₂) dx₁ = (1/(σ₂√(2π))) e^{−(1/2)[(x₂−μ₂)/σ₂]²}    (7-32)

for −∞ < x₂ < ∞.
We note that these marginal densities are normal; that is,

    X₁ ~ N(μ₁, σ₁²)    (7-33)

and

    X₂ ~ N(μ₂, σ₂²),

so that

    E(X₁) = μ₁,  E(X₂) = μ₂,  V(X₁) = σ₁²,  V(X₂) = σ₂².    (7-34)

The correlation coefficient ρ is the ratio of the covariance to [σ₁σ₂]. The covariance is

    σ₁₂ = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (x₁ − μ₁)(x₂ − μ₂) f(x₁, x₂) dx₁ dx₂.

Thus

    ρ = σ₁₂/(σ₁σ₂).    (7-35)
The conditional distributions f_{X₂|x₁}(x₂) and f_{X₁|x₂}(x₁) are also important. These conditional densities are normal, as shown here:

    f_{X₂|x₁}(x₂) = (1/(σ₂√(2π(1 − ρ²)))) exp{−(1/2)[(x₂ − μ₂ − ρ(σ₂/σ₁)(x₁ − μ₁))/(σ₂√(1 − ρ²))]²}    (7-36)

for −∞ < x₂ < ∞, and

    f_{X₁|x₂}(x₁) = (1/(σ₁√(2π(1 − ρ²)))) exp{−(1/2)[(x₁ − μ₁ − ρ(σ₁/σ₂)(x₂ − μ₂))/(σ₁√(1 − ρ²))]²}    (7-37)

for −∞ < x₁ < ∞. Figure 7-13 illustrates some of these conditional densities.

Figure 7-13 Some typical conditional distributions. (a) Some example conditional distributions of X₂ for a few values of x₁. (b) Some example conditional distributions of X₁ for a few values of x₂.

We first consider the distribution f_{X₂|x₁}. The mean and variance are

    E(X₂|x₁) = μ₂ + ρ(σ₂/σ₁)(x₁ − μ₁)    (7-38)

and

    V(X₂|x₁) = σ₂²(1 − ρ²).    (7-39)

Furthermore, X₂|x₁ is normal; that is,

    X₂|x₁ ~ N[μ₂ + ρ(σ₂/σ₁)(x₁ − μ₁), σ₂²(1 − ρ²)].    (7-40)

The locus of expected values of X₂ for x₁, as shown in equation 7-38, is called the regression of X₂ on X₁, and it is linear. Also, the variance in the conditional distributions is constant for all x₁.
In the case of the distribution f_{X₁|x₂}, the results are similar. That is,

    E(X₁|x₂) = μ₁ + ρ(σ₁/σ₂)(x₂ − μ₂),    (7-41)
    V(X₁|x₂) = σ₁²(1 − ρ²),    (7-42)

and

    X₁|x₂ ~ N[μ₁ + ρ(σ₁/σ₂)(x₂ − μ₂), σ₁²(1 − ρ²)].    (7-43)

In the bivariate normal distribution we observe that if ρ = 0, the joint density may be factored into the product of the marginal densities, and so X₁ and X₂ are independent. Thus, for a bivariate normal density, zero correlation and independence are equivalent. If planes parallel to the x₁, x₂ plane are passed through the surface shown in Fig. 7-12, the contours cut from the bivariate normal surface are ellipses. The student may wish to show this property.

Example 7-14
In an attempt to substitute a nondestructive testing procedure for a destructive test, an extensive study was made of shear strength, X₂, and weld diameter, X₁, of spot welds, with the following findings:

1. [X₁, X₂] ~ bivariate normal.
2. μ₁ = 0.20 inch, μ₂ = 1100 pounds, σ₁² = 0.02 inch², σ₂² = 525 pounds², and ρ = 0.9.

The regression of X₂ on X₁ is thus

    E(X₂|x₁) = μ₂ + ρ(σ₂/σ₁)(x₁ − μ₁)
             = 1100 + 0.9√(525/0.02)(x₁ − 0.2)
             = 145.8x₁ + 1070.80,

and the variance is

    V(X₂|x₁) = σ₂²(1 − ρ²) = 525(0.19) = 99.75.

In studying these results, the manager of manufacturing notes that since ρ = 0.9, that is, close to 1, weld diameter is highly correlated with shear strength. The specification on shear strength calls for a value greater than 1080. If a weld has a diameter of 0.18, he asks: "What is the probability that the strength specification will be met?" The process engineer notes that E(X₂|0.18) = 1097.05; therefore,

    P(X₂ > 1080) = P(Z > (1080 − 1097.05)/√99.75)
                 = 1 − Φ(−1.71) = 0.9564,

and he recommends a policy such that if the weld diameter is not less than 0.18, the weld will be classified as satisfactory.
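The conditional calculation of Example 7-14 is easily reproduced from equations 7-38 and 7-39; the following sketch assumes SciPy and NumPy.

    # Example 7-14 revisited: P(X2 > 1080 | x1 = 0.18) for the spot-weld data.
    import numpy as np
    from scipy.stats import norm

    mu1, mu2 = 0.20, 1100.0
    var1, var2, rho = 0.02, 525.0, 0.9
    x1 = 0.18

    cond_mean = mu2 + rho * np.sqrt(var2 / var1) * (x1 - mu1)   # equation 7-38
    cond_var = var2 * (1 - rho**2)                              # equation 7-39
    print(cond_mean)                                            # about 1097.1
    print(1 - norm.cdf(1080.0, loc=cond_mean, scale=np.sqrt(cond_var)))  # about 0.956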
Example 7-15
In developing an admissions policy for a large university, the office of student testing and evaluation has noted that X₁, the combined score on the college board examinations, and X₂, the student grade point average at the end of the freshman year, have a bivariate normal distribution. A grade point of 4.0 corresponds to A. A study indicates that

    μ₁ = 1300,
    μ₂ = 2.3,
    σ₁² = 6400,
    σ₂² = 0.25,
    ρ = 0.6.

Any student with a grade point average less than 1.5 is automatically dropped at the end of the freshman year; however, an average of 2.0 is considered to be satisfactory.
An applicant takes the college board exams, receives a combined score of 900, and is not accepted. An irate parent argues that the student will do satisfactory work and, specifically, will have better than a 2.0 grade point average at the end of the freshman year. Considering only the probabilistic aspects of the problem, the director of admissions wants to determine P(X₂ ≥ 2.0|x₁ = 900). Noting that

    E(X₂|900) = 2.3 + (0.6)(0.5/80)(900 − 1300) = 0.8

and

    V(X₂|900) = 0.25(1 − 0.36) = 0.16,

the director calculates

    P(X₂ ≥ 2.0|900) = 1 − Φ((2.0 − 0.8)/0.4) = 1 − Φ(3.0) = 0.0013,

which predicts only a very slim chance of the parent's claim being valid.

7-8 GENERATION OF NORMAL REALIZATIONS

We will consider both direct and approximate methods for generating realizations of a standard normal variable Z, where Z ~ N(0, 1). Recall that X = μ + σZ, so realizations of X ~ N(μ, σ²) are easily obtained as x = μ + σz.
The direct method calls for generating uniform [0, 1] random number realizations in pairs: u₁ and u₂. Then, using the methods of Chapter 4, it turns out that

    z₁ = (−2 ln u₁)^{1/2} cos(2πu₂),
    z₂ = (−2 ln u₁)^{1/2} sin(2πu₂)    (7-44)

are realizations of independent N(0, 1) variables. The values x = μ + σz follow directly, and the process is repeated until the desired number of realizations of X are obtained.
An approximate method that makes use of the Central Limit Theorem is as follows:

    z = Σ_{i=1}^{12} uᵢ − 6.    (7-45)

With this procedure, we would begin by generating 12 uniform [0, 1] random number realizations, adding them, and subtracting 6. This entire process is repeated until the desired number of realizations is obtained.
Although the direct method is exact and usually preferable, approximate values are sometimes acceptable.
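Both schemes translate directly into code. The sketch below, which assumes NumPy, implements the direct method of equation 7-44 (often called the Box-Muller transform) and the approximate method of equation 7-45.

    # Direct (equation 7-44) and approximate (equation 7-45) N(0, 1) generators.
    import numpy as np

    rng = np.random.default_rng(7)

    def z_direct():
        # One pair of independent N(0, 1) realizations from two uniforms.
        u1, u2 = rng.uniform(), rng.uniform()
        r = np.sqrt(-2.0 * np.log(u1))
        return r * np.cos(2 * np.pi * u2), r * np.sin(2 * np.pi * u2)

    def z_approx():
        # One approximate N(0, 1) realization: sum of 12 uniforms minus 6.
        return rng.uniform(size=12).sum() - 6.0

    # Realizations of X ~ N(mu, sigma^2) follow as x = mu + sigma * z.
    mu, sigma = 100.0, 2.0
    z1, z2 = z_direct()
    print(mu + sigma * z1, mu + sigma * z2)
    print(mu + sigma * z_approx())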

7-9 SUMMARY

This chapter has presented the normal distribution with a number of example applications. The normal distribution, the related standard normal, and the lognormal distributions are univariate, while the bivariate normal gives the joint density of two related normal random variables.
The normal distribution forms the basis on which a great deal of the work in statistical inference rests. The wide application of the normal distribution makes it particularly important.

7-10 EXERCISES

7-1. Let Z be a standard normal random variable and calculate the following probabilities, using sketches where appropriate:
(a) P(0 ≤ Z ≤ 2).
(b) P(−1 ≤ Z ≤ +1).
(c) P(Z ≤ 1.65).
(d) P(Z ≥ −1.96).
(e) P(|Z| > 1.5).
(f) P(−1.9 ≤ Z ≤ 2).
(g) P(Z ≤ 1.37).
(h) P(|Z| ≤ 2.57).

7-2. Let X ~ N(10, 9). Find P(X ≥ 8), P(X ≤ 12), and P(2 ≤ X ≤ 10).

7-3. In each part below, find the value of c that makes the probability statement true.
(a) Φ(c) = 0.94062.
(b) P(|Z| ≤ c) = 0.95.
(c) P(|Z| ≤ c) = 0.99.
(d) P(Z ≥ c) = 0.05.

7-4. If P(Z ≥ z_α) = α, determine z_α for α = 0.0225, α = 0.025, α = 0.05, and α = 0.0014.

7-5. If X ~ N(80, 10²), compute the following:
(a) P(X ≤ 100).
(b) P(X ≤ 80).
(c) P(75 ≤ X ≤ 100).
(d) P(75 ≤ X).
(e) P(|X − 80| ≤ 19.6).

7-6. The life of a particular type of dry-cell battery is normally distributed with a mean of 600 days and a standard deviation of 60 days. What fraction of these batteries would be expected to survive beyond 680 days? What fraction would be expected to fail before 560 days?

7-7. The personnel manager of a large company requires job applicants to take a certain test and achieve a score of 500. If the test scores are normally distributed with a mean of 485 and a standard deviation of 30, what percentage of the applicants pass the test?

7-8. Experience indicates that the development time X for a photographic printing paper is N(30 seconds, 1.21 seconds²). Find the following: (a) The probability that X is at least … seconds. (b) The probability that X is at most 31 seconds. (c) The probability that X differs from its expected value by more than 2 seconds.

7-9. A certain type of light bulb has an output known to be normally distributed with a mean of 2500 footcandles and a standard deviation of 75 footcandles. Determine a lower specification limit such that only 5% of the manufactured bulbs will be defective.

7-10. Show that the moment-generating function for the normal distribution is as given by equation 7-5. Use it to generate the mean and variance.

7-11. If X ~ N(μ, σ²), show that Y = aX + b, where a and b are real constants, is also normally distributed. Use the methods outlined in Chapter 3.

7-12. The inside diameter of a piston ring is normally distributed with a mean of 12 cm and a standard deviation of 0.02 cm.
(a) What fraction of the piston rings will have diameters exceeding 12.05 cm?
(b) What inside diameter value c has a probability of 0.90 of being exceeded?
(c) What is the probability that the inside diameter falls between 11.95 and 12.05?

7-13. A plant manager orders a process shutdown and setting readjustment whenever the pH of the final product falls above 7.20 or below 6.80. The sample pH is normally distributed with unknown μ and a standard deviation of σ = 0.10. Determine the following probabilities:
(a) Of readjusting when the process is operating as intended with μ = 7.0.
(b) Of readjusting when the process is slightly off target with the mean pH 7.05.
(c) Of failing to readjust when the process is too alkaline and the mean pH is μ = 7.25.
(d) Of failing to readjust when the process is too acidic and the mean pH is μ = 6.75.

7-14. The price being asked for a certain security is distributed normally with a mean of $50.00 and a standard deviation of $5.00. Buyers are willing to pay an amount that is also normally distributed with a mean of $45.00 and a standard deviation of $2.50. What is the probability that a transaction will take place?

7-15. The specifications for a capacitor are that its life must be between 1000 and 5000 hours. The life is known to be normally distributed with a mean of 3000 hours. The revenue realized from each capacitor is $9.00; however, a failed unit must be replaced at a cost of $3.00 to the company. Two manufacturing processes can produce capacitors having satisfactory mean lives. The standard deviation for process A is 1000 hours and for process B it is 500 hours. However, process A manufacturing costs are only half those for B. What value of process manufacturing cost is critical, so far as dictating the use of process A or B?

7-16. The diameter of a ball bearing is a normally distributed random variable with mean μ and a standard deviation of 1. Specifications for the diameter are 6 ≤ X ≤ 8, and a ball bearing within these limits yields a profit of C dollars. However, if X < 6, then the profit is −R₁ dollars, or if X > 8, the profit is −R₂ dollars. Find the value of μ that maximizes the expected profit.

7-17. In the preceding exercise, find the optimum value of μ if R₁ = R₂ = R.

7-18. Use the results of Exercise 7-16 with C = $8.00, R₁ = $2.00, and R₂ = $4.00. What is the value of μ that maximizes the expected profit?

7-19. The Rockwell hardness of a particular alloy is normally distributed with a mean of 70 and a standard deviation of 4.
(a) If a specimen is acceptable only if its hardness is between 62 and 72, what is the probability that a randomly chosen specimen has an acceptable hardness?
(b) If the acceptable range of hardness was (70 − c, 70 + c), for what value of c would 95% of all specimens have acceptable hardness?
(c) If the acceptable range is as in (a) and the hardness of each of nine randomly selected specimens is independently determined, what is the expected number of acceptable specimens among the nine specimens?

7-20. Prove that E(Zₙ) = 0 and V(Zₙ) = 1, where Zₙ is as defined in Theorem 7-2.

7-21. Let Xᵢ (i = 1, 2, ..., n) be independent and identically distributed random variables with mean μ and variance σ². Consider the sample mean

    X̄ = (1/n) Σ_{i=1}^{n} Xᵢ.

Show that E(X̄) = μ and V(X̄) = σ²/n.

7-22. A shaft with an outside diameter (O.D.) ~ N(1.20, 0.0016) is inserted into a sleeve bearing having an inside diameter (I.D.) that is N(1.25, 0.0009). Determine the probability of interference.

7-23. An assembly consists of three components placed side by side. The length of each component is normally distributed with a mean of 2 inches and a standard deviation of 0.2 inch. Specifications require that all assemblies be between 5.7 and 6.3 inches long. How many assemblies will pass these requirements?

7-24. Find the mean and variance of the linear combination

    Y = X₁ + 2X₂ + X₃ + X₄,

where X₁ ~ N(4, 3), X₂ ~ N(4, 4), X₃ ~ N(2, 4), and X₄ ~ N(3, 2). What is the probability that 15 ≤ Y ≤ 20?

7-25. A round-off error has a uniform distribution on [−0.5, +0.5], and round-off errors are independent. A sum of 50 numbers is calculated, where each is rounded before adding. What is the probability that the total round-off error exceeds 5?

7-26. One hundred small bolts are packed in a box. Each bolt weighs 1 ounce, with a standard deviation of 0.01 ounce. Find the probability that a box weighs more than 102 ounces.

7-27. An automatic machine is used to fill boxes with soap powder. Specifications require that the boxes weigh between 11.8 and 12.2 ounces. The only data available about machine performance concern the average content of groups of nine boxes. It is known that the average content is 11.9 ounces with a standard deviation of 0.05 ounce. What fraction of the boxes produced is defective? Where should the mean be located in order to minimize this fraction defective? Assume the weight is normally distributed.

7-28. A bus travels between two cities, but visits six intermediate cities on the route. The means and standard deviations of the travel times are as follows:

    City Pair    Mean Time (hours)    Standard Deviation (hours)
    1-2          3                    0.4
    2-3          4                    0.6
    3-4          3                    0.3
    4-5          5                    1.2
    5-6          7                    0.9
    6-7          5                    0.4
    7-8          3                    0.4

What is the probability that the bus completes its journey within 32 hours?

7-29. A production process produces items, of which …% are defective. A random sample of … items is selected every day, and the number of defective items, say X, is counted. Using the normal approximation to the binomial, find the following:
(a) P(X ≥ 16).
(b) P(X = 15).
(c) P(X ≤ 20).
(d) P(X = 14).

7-30. In a work-sampling study it is often desired to find the necessary number of observations. Given that p = 0.1, find the necessary n such that P(0.05 ≤ p̂ ≤ 0.15) = 0.95.

7-31. Use random numbers generated from your favorite computer package, or from scaling the random integers in Table XV of the Appendix by multiplying by 10⁻⁵, to generate six realizations of a N(100, 4) variable, using the following:
(a) The direct method.
(b) The approximate method.

7-32. Consider a linear combination Y = 3X₁ − 2X₂, where X₁ is N(10, 3) and X₂ is uniformly distributed on [0, 10]. Generate six realizations of the random variable Y, where X₁ and X₂ are independent.

7-33. If Z ~ N(0, 1), generate five realizations of Z².

7-34. If Y = ln X and Y ~ N(μ_Y, σ_Y²), develop a procedure for generating realizations of X.

7-35. If Y = X₁/X₂, where X₁ ~ N(μ₁, σ₁²) and X₂ ~ N(μ₂, σ₂²) and where X₁ and X₂ are independent, develop a generator for producing realizations of Y.

7-36. The brightness of light bulbs is normally distributed with a mean of 2500 footcandles and a standard deviation of 50 footcandles. The bulbs are tested and all those brighter than 2600 footcandles are placed in a special high-quality lot. What is the probability distribution of the remaining bulbs? What is their expected brightness?

7-37. The random variable Y = ln X has a N(50, 25) distribution. Find the mean, variance, mode, and median of X.

7-38. Suppose independent random variables Y₁, Y₂, Y₃ are such that

    Y₁ = ln X₁ ~ N(4, 1),
    Y₂ = ln X₂ ~ N(3, 1),
    Y₃ = ln X₃ ~ N(2, 0.5).

Find the mean and variance of W = e²X₁^(…)X₂^(…)X₃^(…). Determine a set of specifications L and R such that P(L ≤ W ≤ R) = 0.90.

7-39. Show that the density function for a lognormally distributed random variable X is given by equation 7-24.

7-40. Consider the bivariate normal density

    f(x₁, x₂) = Δ exp{…},  −∞ < x₁ < ∞, −∞ < x₂ < ∞,

where Δ is chosen so that f is a probability distribution. Are the random variables X₁ and X₂ independent? Define two new random variables:

    Y₁ = (1/(1 − ρ²)^{1/2}) (X₁/σ₁ − ρX₂/σ₂),
    Y₂ = X₂/σ₂.

Show that the two new random variables are independent.

7-41. The life of a tube (X₁) and the filament diameter (X₂) are distributed as a bivariate normal random variable with the parameters μ₁ = 2000 hours, μ₂ = 0.10 inch, σ₁² = 2500 hours², σ₂² = 0.01 inch², and ρ = 0.87. The quality-control manager wishes to determine the life of each tube by measuring the filament diameter. If a filament diameter is 0.098, what is the probability that the tube will last 1950 hours?

7-42. A college professor has noticed that grades on each of two quizzes have a bivariate normal distribution with the parameters μ₁ = 75, μ₂ = 83, σ₁² = 25, σ₂² = 16, and ρ = 0.8. If a student receives a grade of 80 on the first quiz, what is the probability that she will do better on the second one? How is the answer affected by making ρ = −0.8?

7-43. Consider the surface y = f(x₁, x₂), where f is the bivariate normal density function.
(a) Prove that y = constant cuts the surface in an ellipse.
(b) Prove that y = constant with ρ = 0 and σ₁ = σ₂ cuts the surface as a circle.

7-44. Let X₁ and X₂ be independent random variables, each following a normal density with a mean of zero and variance σ². Find the distribution of

    R = √(X₁² + X₂²).

The resulting distribution is known as the Rayleigh distribution and is frequently used to model the distribution of radial error in a plane. Hint: Let X₁ = R cos θ and X₂ = R sin θ. Obtain the joint probability distribution of R and θ; then integrate out θ.

7-45. Using a method similar to that in Exercise 7-44, obtain the distribution of …, where Xᵢ ~ N(0, σ²) and the Xᵢ are independent.

7-46. Let the independent random variables Xᵢ ~ N(0, σ²) for i = 1, 2. Find the probability distribution of

    C = X₁/X₂.

We say that C follows the Cauchy distribution. Try to compute E(C).

7-47. Let X ~ N(0, 1). Find the probability distribution of Y = X². Y is said to follow the chi-square distribution with one degree of freedom. It is an important distribution in statistical methodology.

7-48. Let the independent random variables Xᵢ ~ N(0, 1) for i = 1, 2, ..., n. Show that the probability distribution of

    Y = Σ_{i=1}^{n} Xᵢ²

follows a chi-square distribution with n degrees of freedom.

7-49. Let X ~ N(0, 1). Define a new random variable Y = |X|. Then find the probability distribution of Y. This is often called the half-normal distribution.
Chapter 8

Introduction to Statistics
and Data Description

8-1 THE FIELD OF STATISTICS

Statistics deals with the collection, presentation, analysis, and use of data to solve problems, make decisions, develop estimates, and design and develop both products and procedures. An understanding of basic statistics and statistical methods would be useful to anyone in this information age; however, since engineers, scientists, and those working in management science are routinely engaged with data, a knowledge of statistics and basic statistical methods is particularly vital. In this intensely competitive, high-tech world economy of the first decade of the twenty-first century, the expectations of consumers regarding product quality, performance, and reliability have increased significantly over expectations of the recent past. Furthermore, we have come to expect high levels of performance from logistical systems at all levels, and the operation as well as the refinement of these systems is largely dependent on the collection and use of data. While those who are employed in various "service industries" deal with somewhat different problems, at a basic level they, too, are collecting data to be used in solving problems and improving the "service" so as to become more competitive in attracting market share.
Statistical methods are used to present, describe, and understand variability. In observing a variable value or several variable values repeatedly, where these values are assigned to units by a process, we note that these repeated observations tend to yield different results. While this chapter will deal with data presentation and description issues, the following chapters will utilize the probability concepts developed in prior chapters to model and develop an understanding of variability, and to utilize this understanding in presenting inferential statistics topics and methods.
Virtually all real-world processes exhibit variability. For example, consider situations where we select several castings from a manufacturing process and measure a critical dimension (such as a vane opening) on each part. If the measuring instrument has sufficient resolution, the vane openings will be different (there will be variability in the dimension). Alternatively, if we count the number of defects on printed circuit boards, we will find variability in the counts, as some boards will have few defects and others will have many defects. This variability extends to all environments. There is variability in the thickness of oxide coatings on silicon wafers, the hourly yield of a chemical process, the number of errors on purchase orders, the flow time required to assemble an aircraft engine, and the therms of natural gas billed to the residential customers of a distributing utility in a given month.
Almost all experimental activity reflects similar variability in the data observed, and in
Chapter 12 we will employ statistical methods not only to analyze experimental data but
also to construct effective experimental designs for the study of processes.
Why does variability occur? Generally, variability is the result uf changes ill the con-
ditions under which observations are made. In a manufacturing context,. these changes may
be differences in specimens of material, differences in the way people do the work. differ-
enceS in process variables~ such as temperature, pressure, or holding tice, and differences
in environmentalJactors, such as relative humidity. Variability also occurs because of the
measurement system. For example. the measurement obtained from a scale may depend on
where the rest item is placed on the pan. The process of selecting units for observation may
also cause variability. For example, suppose that a lot uf 1000 integrated circuit chips has
exactly 100 defective chips. If we inspected all 1000 chips, and if our inspection process
was perfect (no inspection or measurement error)t we would find aU 100 defective chips.
However, suppose that we select 50 chips. Now some of the chips will likely be defective;
and we would expect the sample to be about 10% defective, but it could be 0% or 2% or
12% defective, depending on the specific chips selected.
The field of statistics consists of methods for describing and modeling variability, and for making decisions when variability is present. In inferential statistics, we usually want to make a decision about some population. The term population refers to the collection of measurements on all elements of a universe about which we wish to draw conclusions or make decisions. In this text, we make a distinction between the universe and the populations, in that the universe is composed of the set of elementary units or simply units, while a population is the set of numerical or categorical variable values for one variable associated with each of the universe units. Obviously, there may be several populations associated with a given universe. An example is the universe consisting of the residential class customers of an electric power company whose accounts are active during part of the month of August 2003. Example populations might be the set of energy consumption (kilowatt-hour) values billed to these customers in the August 2003 bill, the set of customer demands (kilowatts) at the instant the company experiences the August peak demand, and the set made up of dwelling category, such as single-family unattached, apartment, mobile home, etc.
Another example is a universe which consists of all the power supplies for a personal computer manufactured by an electronics company during a given period. Suppose that the manufacturer is interested specifically in the output voltage of each power supply. We may think of the output voltage levels in the power supplies as such a population. In this case, each population value is a numerical measurement, such as 5.10 or 5.24. The data in this case would be referred to as measurement data. On the other hand, the manufacturer may be interested in whether or not each power supply produced an output voltage that conforms to the requirements. We may then visualize the population as consisting of attribute data, in which each power supply is assigned a value of one if the unit is nonconforming and a value of zero if it conforms to requirements. Both measurement data and attribute data are called numerical data. Furthermore, it is convenient to consider measurement data as being either continuous data or discrete data depending on the nature of the process assigning values to unit variables. Yet another type of data is called categorical data. Examples are gender, day of the week when the observation is taken, make of automobile, etc. Finally, we have unit identifying data, which are alphanumeric and used to identify the universe and sample units. These data might neither exist nor have statistical interpretation; however, in some situations they are essential identifiers for universe and sample units. Examples would be social security numbers, account numbers in a bank, VIN numbers for automobiles, and serial numbers of cardiac pacemakers. In this book we will present techniques for dealing with both measurement and attribute data; however, categorical data are also considered.

In most applications of statistics, the available data result from a sample of units selected from a universe of interest, and these data reflect measurement or classification of one or more variables associated with the sampled units. The sample is thus a subset of the units, and the measurement or classification values of these units are subsets of the respective universe populations.
Figure 8-1 presents an overview of the data acquisition activity. It is convenient to think of a process that produces units and assigns values to the variables associated with the units. An example would be the manufacturing process for the power supplies. The power supplies in this case are the units, and the output voltage values (perhaps along with other variable values) may be thought of as being assigned by the process. Oftentimes, a probabilistic model or a model with some probabilistic components is assumed to represent the value assignment process. As indicated earlier, the set of units is referred to as the universe, and this set may have a finite or an infinite membership of units. Furthermore, in some cases this set exists only in concept; however, we can describe the elements of the set without enumerating them, as was the case with the sample space, 𝒮, associated with a random experiment presented in Chapter 1. The set of values assigned or that may be assigned to a specific variable becomes the population for that variable.
Now, we illustrate with a few additional examples that reflect several universe structures and different aspects of observing processes and population data. First, continuing with the power supplies, consider the production from one specific shift that consists of 300 units, each having some voltage output. To begin, consider a sample of 10 units selected from these 300 units and tested with voltage measurements, as previously described for sample units. The universe here is finite, and the set of voltage values for these universe units (the population) is thus finite. If our interest is simply to describe the sample results, we have done that by enumeration of the results, and both graphical and quantitative methods for further description are given in the following sections of this chapter.

Figure 8-1 From process to data.

On the other hand, if we wish to employ the sample results to draw statistical inference about the population consisting of 300 voltage values, this is called an enumerative study, and careful attention must be given to the method of sample selection if valid inference is to be made about the population. The key to much of what may be accomplished in such a study involves either simple or more complex forms of random sampling, or at least probability sampling. While these concepts will be developed further in the next chapter, it is noted at this point that the application of simple random sampling from a finite universe results in an equal inclusion probability for each unit of the universe. In the case of a finite universe, where the sampling is done without replacement and the sample size, usually denoted n, is the same as the universe size, N, this sample is called a census.
Suppose our interest lies not in the voltage values for units produced in this specific shift but rather in the process or process variable assignment model. Not only must great care be given to sampling methods, we must also make assumptions about the stability of the process and the structure of the process model during the period of sampling. In essence, the universe unit variable values may be thought of as a realization of the process, and a random sample from such a universe, measuring voltage values, is equivalent to a random sample on the process or process model voltage variable or population. With our example, we might assume that unit voltage, E, is distributed as N(μ, σ²) during sample selection. This is often called an analytic study, and our objective might be, for example, to estimate μ and/or σ. Once again, random sampling is to be rigorously defined in the following chapter.
Even though a universe sometimes exists only in concept, defined by a verbal description of the membership, it is generally useful to think carefully about the entire process of observation activity and the conceptual universe. Consider an example where the ingredients of concrete specimens have been specified to achieve high strength with early cure time. A batch is formulated, five specimens are poured into cylindrical molds, and these are cured according to test specifications. Following the cure, these test cylinders are subjected to longitudinal load until rupture, at which point the rupture strength is recorded. These test units, with the resulting data, are considered sample data from the strength measurements associated with the universe units (each with an assigned population value) that might have been, but were never actually, produced. Again, as in the prior example, inference often relates to the process or process variable assignment model, and model assumptions may become crucial to attaining meaningful inference in this analytic study.
Finally, consider measurements taken on a fluid in order to measure some variable or characteristic. Specimens are drawn from well-mixed (we hope) fluid, with each specimen placed in a specimen container. Some difficulty arises in universe identification. A convention here is to again view the universe as made up of all possible specimens that might have been selected. Similarly, an alternate view may be taken that the sample values represent a realization of the process value assignment model. Getting meaningful results from such an analytic study once again requires close attention to sampling methods and to process or process model assumptions.
Descriptive statistics is the branch of statistics that deals with organization, summarization, and presentation of data. Many of the techniques of descriptive statistics have been in use for over 200 years, with origins in surveys and census activity. Modern computer technology, particularly computer graphics, has greatly expanded the field of descriptive statistics in recent years. The techniques of descriptive statistics can be applied either to entire finite populations or to samples, and these methods and techniques are illustrated in the following sections of this chapter. A wide selection of software is available, ranging in focus, sophistication, and generality from simple spreadsheet functions such as those found in Microsoft Excel®, to the more-comprehensive but "user friendly" Minitab®, to large, comprehensive, flexible systems such as SAS. Many other options are also available.

In enumerative and analytic studies, the objective is to make a conclusion or draw inference about a finite population or about a process or variable assignment model. This activity is called inferential statistics, and most of the techniques and methods employed have been developed within the past 90 years. Subsequent chapters will focus on these topics.

8-2 DATA
Data are collected and stored electronically as well as by human observation using traditional records or files. Formats differ depending on the observer or observational process and reflect individual preference and ease of recording. In large-scale studies, where unit identification exists, data must often be obtained by stripping the required information from several files and merging it into a file format suitable for statistical analysis.
A table or spreadsheet-type format is often convenient, and it is also compatible with most software analysis systems. Rows are typically assigned to units observed, and columns present numerical or categorical data on one or more variables. Furthermore, a column may be assigned for a sequence or order index as well as for other unit identifying data. In the case of the power supply voltage measurements, no order or sequence was intended, so any permutation of the 10 data elements is the same. In other contexts, for example where the index relates to the time of observation, the position of the element in the sequence may be quite important. A common practice in both situations described is to employ an index, say i = 1, 2, ..., n, to serve as a unit identifier where n units are observed. Also, an alphabetical character is usually employed to represent a given variable value; thus if eᵢ equals the value of the ith voltage measurement in the example, e₁ = 5.10, e₂ = 5.24, e₃ = 5.14, ..., e₁₀ = 5.11. In a table format for these data there would be 10 rows (11 if a headings row is employed) and one or two columns, depending on whether the index is included and presented in a column.
Where several variables and/or categories are to be associated with each unit, each of these may be assigned a column, as shown in Table 8-1 (from Montgomery and Runger, 2003), which presents measurements of pull strength, wire length, and die height made on each of 25 sampled units in a semiconductor manufacturing facility. In situations where categorical classification is involved, a column is provided for each categorical variable and a category code or identifier is recorded for the unit. An example is shift of manufacture with identifiers D, N, G for day, night, and graveyard shifts, respectively.

8-3 GRAPHICAL PRESENTATION OF DATA


In this section we will present a few of the many graphical and tabular methods for summarizing and displaying data. In recent years, the availability of computer graphics has resulted in a rapid expansion of visual displays for observational data.

8-3.1 Numerical Data: Dot Plots and Scatter Plots


When we are concerned with one of the variables associated with the observed units, the data are often called univariate data, and dot plots provide a simple, attractive display that reflects spread, extremes, centering, and voids or gaps in the data. A horizontal line is scaled so that the range of data values is accommodated. Each observation is then plotted as a dot directly above this scaled line, and where multiple observations have the same value, the dots are simply stacked vertically at that scale point. Where the number of units is relatively small, say n < 30, or where there are relatively few distinct values represented
Table 8-1 Wire Bond Pull Strength Data

Observation    Pull Strength    Wire Length    Die Height
Number         (y)              (x1)           (x2)
 1              9.95             2               50
 2             24.45             8              110
 3             31.75            11              120
 4             35.00            10              550
 5             25.02             8              295
 6             16.86             4              200
 7             14.38             2              375
 8              9.60             2               52
 9             24.35             9              100
10             27.50             8              300
11             17.08             4              412
12             37.00            11              400
13             41.95            12              500
14             11.66             2              360
15             21.65             4              205
16             17.89             4              400
17             69.00            20              600
18             10.30             1              585
19             34.93            10              540
20             46.59            15              250
21             44.88            15              290
22             54.12            16              510
23             56.63            17              590
24             22.13             6              100
25             21.15             5              400

in the data set, dot plots are effective displays. Figure 8-2 shows the univariate dot plots or marginal dot plots for each of the three variables with data presented in Table 8-1. These plots were produced using Minitab®.

Where we wish to jointly display results for two variables, the bivariate equivalent of the dot plot is called a scatter plot. We construct a simple rectangular coordinate graph, assigning the horizontal axis to one of the variables and the vertical axis to the other. Each observation is then plotted as a point in this plane. Figure 8-3 presents scatter plots for pull strength vs. wire length and for pull strength vs. die height for the data in Table 8-1. In order to accommodate data pairs that are identical, and thus that fall at the same point on the plane, one convention is to employ alphabet characters as plot symbols; so A is the displayed plot point where one data point falls at a specific point on the plane, B is the displayed plot point if two fall at a specific point of the plane, etc. Another approach, which is useful where values are close but not identical, is to assign a randomly generated, small, positive or negative quantity sometimes called jitter to one or both variables in order to make the plots better displays. While scatter plots show the region of the plane where the data points fall, as well as the data density associated with this region, they also suggest possible association between the variables. Finally, we note that the usefulness of these plots is not limited to small data sets.
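Displays of this kind are also easy to script in general-purpose software. The following minimal sketch assumes Python with the matplotlib library (neither is referenced by the text, which uses Minitab®); it builds a dot plot for wire length and a scatter plot of pull strength vs. wire length from the Table 8-1 data.

```python
# Sketch: dot plot and scatter plot for the wire bond data of Table 8-1
# (assumes matplotlib is installed; variable names are ours, not the text's).
import matplotlib.pyplot as plt
from collections import Counter

wire_length = [2, 8, 11, 10, 8, 4, 2, 2, 9, 8, 4, 11, 12,
               2, 4, 4, 20, 1, 10, 15, 15, 16, 17, 6, 5]
pull_strength = [9.95, 24.45, 31.75, 35.00, 25.02, 16.86, 14.38, 9.60,
                 24.35, 27.50, 17.08, 37.00, 41.95, 11.66, 21.65, 17.89,
                 69.00, 10.30, 34.93, 46.59, 44.88, 54.12, 56.63, 22.13, 21.15]

# Dot plot: stack repeated values vertically above a scaled line.
counts = Counter(wire_length)
xs, ys = [], []
for value, freq in counts.items():
    for level in range(1, freq + 1):
        xs.append(value)
        ys.append(level)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.scatter(xs, ys)
ax1.set_yticks([])
ax1.set_xlabel("Wire length")
ax1.set_title("Dot plot")

# Scatter plot of pull strength vs. wire length (cf. Fig. 8-3).
ax2.scatter(wire_length, pull_strength)
ax2.set_xlabel("Wire length")
ax2.set_ylabel("Pull strength")
ax2.set_title("Scatter plot")
plt.show()
```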

Figure 8-2 Dot plots for pull strength, wire length, and die height.

Figure 8-3 Scatter plots for pull strength vs. wire length and for pull strength vs. die height (from Minitab®).

To extend the dimensionality and graphically display the joint data pattern for three variables, a three-dimensional scatter plot may be employed, as illustrated in Figure 8-4 for the data in Table 8-1. Another option for a display, not illustrated here, is the bubble plot, which is presented in two dimensions, with the third variable reflected in the dot (now called bubble) diameter that is assigned to be proportional to the magnitude of the third variable. As was the case with scatter plots, these plots also suggest possible associations between the variables involved.

8-3.2 Numerical Data: The Frequency Distribution and Histogram


Consider the data in Table 8-2. These data are the strengths in pounds per square inch (psi) of 100 glass, nonreturnable 1-liter soft drink bottles. These observations were obtained by testing each bottle until failure occurred. The data were recorded in the order in which the bottles were tested, and in this format they do not convey very much information about bursting strength of the bottles. Questions such as "what is the average bursting strength?"

Figure 8-4 Three-dimensional scatter plot for pull strength, wire length, and die height.

Table 8-2 Bursting Strength in Pounds per Square Inch for 100 Glass, 1-Liter, Nonreturnable Soft Drink Bottles

265 197 346 280 265 200 221 265 261 278
205 286 317 242 254 235 176 262 248 250
263 274 242 260 281 246 248 271 260 265
307 243 258 321 294 328 263 245 274 270
220 231 276 228 223 296 231 301 337 298
268 267 300 250 260 276 334 280 250 257
260 281 208 299 300 264 230 274 278 210
234 265 187 258 235 269 265 253 254 280
299 214 264 267 283 235 272 287 274 269
215 318 271 293 277 290 283 258 275 251

Or "what percentage of the bottles burst below 230 psi?" are not easy to answer when the
data are presented in this form,
A frequeney distribution is a more useful summary of data than the simple enumera-
tion given in 1able 8-2. To construct a frequency distribution, we must divide the range of
the data into intervals, which are usually called class intervals. If possible, the class inter-
vals should be of equal width, to eohance the visual infonnation in the frequency distribu-
tion, Some judgment must be used in selecting the number of class intervals in order ro give
a reasonable display. The number of class intervals used depends on the number of obser-
vations and the amount of scatter or dispersion in the data. A frequency distribution that
uses either too few or too many class intervals will not be very informative. \Ve generally
find that between 5 and 20' intervals is satisfactory in most cases, llIld that the number of
class intervals should increase with n. Choosing a number of class intervals approximately
equal to the square root of the nwnber of observations often works well in practice.
A frequency distribution for the bursting s!l'ength data in Table 8-2 is abown in Table
8-3. Since the data set contains 100 observations, we suspect that about ,fWo = 10 class
intervals will give a satisfactory frequency distribution. The largest and smallest data val~
ues are 346 and 176, respectively, so the class intervals must cover at least 346 - 176 = 170
psi units on the scale. If we want the lower limit for the first interval to begin slightly below
tile smallest data value and the upper limit for the last cell to be slightly abeve the largest
data value, then we might start the frequency distribution at 170 and end it at 350. This is
an interval of 180 psi units. Xine class intervals, each of width 20 psi, gives a reasonable
frequency distribution, ani! the frequency distribution in Table 8·3 is thus based on nine
class intervals.

Table 8-3 Frequency Distribution for the Bursting Strength Data in Table 8-2

Class Interval                                                       Relative    Cumulative
(psi)            Tally                                   Frequency   Frequency   Relative Frequency
170 ≤ x < 190    ||                                           2        0.02        0.02
190 ≤ x < 210    ||||                                         4        0.04        0.06
210 ≤ x < 230    ||||| ||                                     7        0.07        0.13
230 ≤ x < 250    ||||| ||||| |||                             13        0.13        0.26
250 ≤ x < 270    ||||| ||||| ||||| ||||| ||||| ||||| ||      32        0.32        0.58
270 ≤ x < 290    ||||| ||||| ||||| ||||| ||||                24        0.24        0.82
290 ≤ x < 310    ||||| ||||| |                               11        0.11        0.93
310 ≤ x < 330    ||||                                         4        0.04        0.97
330 ≤ x < 350    |||                                          3        0.03        1.00
                                                             100        1.00

The fourth column in Table 8-3 contains the relative frequency distribution. The relative frequencies are found by dividing the observed frequency in each class interval by the total number of observations. The last column in Table 8-3 expresses the relative frequencies on a cumulative basis. Frequency distributions are often easier to interpret than tables of data. For example, from Table 8-3 it is very easy to see that most of the bottles burst between 230 and 290 psi, and that 13% of the bottles burst below 230 psi.
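The tabulation itself is mechanical. As a sketch (our own illustration, not part of the text), the following Python fragment groups the 100 bursting strengths of Table 8-2 into the nine 20-psi class intervals and prints the frequency, relative frequency, and cumulative relative frequency columns of Table 8-3.

```python
# Sketch: build the Table 8-3 frequency distribution from the Table 8-2 data.
strengths = [
    265, 197, 346, 280, 265, 200, 221, 265, 261, 278,
    205, 286, 317, 242, 254, 235, 176, 262, 248, 250,
    263, 274, 242, 260, 281, 246, 248, 271, 260, 265,
    307, 243, 258, 321, 294, 328, 263, 245, 274, 270,
    220, 231, 276, 228, 223, 296, 231, 301, 337, 298,
    268, 267, 300, 250, 260, 276, 334, 280, 250, 257,
    260, 281, 208, 299, 300, 264, 230, 274, 278, 210,
    234, 265, 187, 258, 235, 269, 265, 253, 254, 280,
    299, 214, 264, 267, 283, 235, 272, 287, 274, 269,
    215, 318, 271, 293, 277, 290, 283, 258, 275, 251,
]
n = len(strengths)
lower, width, classes = 170, 20, 9      # intervals 170-190, ..., 330-350
freq = [0] * classes
for x in strengths:
    freq[(x - lower) // width] += 1     # index of the class containing x

cum = 0.0
print("interval        freq  rel.freq  cum.rel.freq")
for j, f in enumerate(freq):
    lo = lower + j * width
    cum += f / n
    print(f"{lo:3d} <= x < {lo + width:3d}  {f:4d}   {f / n:.2f}      {cum:.2f}")
```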
It is also helpful to present the frequency distribution in graphical form, as shown in Fig. 8-5. Such a display is called a histogram. To draw a histogram, use the horizontal axis to represent the measurement scale and draw the boundaries of the class intervals. The vertical axis represents the frequency (or relative frequency) scale. If the class intervals are of equal width, then the heights of the rectangles drawn on the histogram are proportional to the frequencies. If the class intervals are of unequal width, then it is customary to draw rectangles whose areas are proportional to the frequencies. In this case, the result is called a density histogram.

Figure 8-5 Histogram of bursting strength for 100 glass, 1-liter, nonreturnable soft drink bottles.

For a histogram displaying relative frequency on the vertical axis, the rectangle heights are calculated as

rectangle height = class relative frequency / class width.
When we find there are multiple empty class intervals after grouping data into equal-width intervals, one option is to merge the empty intervals with contiguous intervals, thus creating some wider intervals. The density histogram resulting from this may produce a more attractive display. However, histograms are easier to interpret when the class intervals are of equal width. The histogram provides a visual impression of the shape of the distribution of the measurements, as well as information about centering and the scatter or dispersion of the data.

In passing from the original data to either a frequency distribution or a histogram, a certain amount of information has been lost in that we no longer have the individual observations. On the other hand, this information loss is small compared to the ease of interpretation gained in using the frequency distribution and histogram. In cases where the data assume only a few distinct values, a dot plot is perhaps a better graphical display.

Where observed data are of a discrete nature, such as is found in counting processes, then two choices are available for constructing a histogram. One option is to center the rectangles on the integers reflected in the count data, and the other is to collapse the rectangle into a vertical line placed directly over these integers. In both cases, the height of the rectangle or the length of the line is either the frequency or relative frequency of the occurrence of the value in question.

In summary, the histogram is a very useful graphic display. A histogram can give the decision maker a good understanding of the data and is very useful in displaying the shape, location, and variability of the data. However, the histogram does not allow individual data points to be identified, because all observations falling in a cell are indistinguishable.

8-3.3 The Stem-and-Leaf Plot


Suppose that the data are represented by x₁, x₂, ..., xₙ, and that each number xᵢ consists of at least two digits. To construct a stem-and-leaf plot, we divide each number xᵢ into two parts: a stem, consisting of one or more of the leading digits, and a leaf, consisting of the remaining digits. For example, if the data consist of the percentage defective, between 0 and 100, on lots of semiconductor wafers, then we could divide the value 76 into the stem 7 and the leaf 6. In general, we should choose relatively few stems in comparison with the number of observations. It is usually best to choose between 5 and 20 stems. Once a set of stems has been chosen, they are listed along the left-hand margin of the display, and beside each stem all leaves corresponding to the observed data values are listed in the order in which they are encountered in the data set.

To illustrate the construction of a stem-and-leaf plot, consider the bottle-bursting-strength data in Table 8-2. To construct a stem-and-leaf plot, we select as stem values the numbers 17, 18, 19, ..., 34. The resulting stem-and-leaf plot is presented in Fig. 8-6. Inspection of this display immediately reveals that most of the bursting strengths lie between 220 and 330 psi, and that the central value is somewhere between 260 and 270 psi. Furthermore, the bursting strengths are distributed approximately symmetrically about the central value. Therefore, the stem-and-leaf plot, like the histogram, allows us to determine quickly some important features of the data that were not immediately obvious in the original display, Table 8-2. Note that here the original numbers are not lost, as occurs in a histogram.

Stem   Leaf
17     6                                                1
18     7                                                1
19     7                                                1
20     0,5,8                                            3
21     0,4,5                                            3
22     1,0,8,3                                          4
23     5,1,1,4,5,5                                      6
24     2,8,2,6,8,3,5                                    7
25     4,0,8,0,0,7,8,3,4,8,1                           11
26     5,5,5,1,2,3,0,0,5,3,8,7,0,0,4,5,9,5,4,7,9       21
27     8,4,1,4,0,6,6,4,8,2,4,1,7,5                     14
28     0,6,1,0,1,0,0,3,7,3                             10
29     4,6,8,9,9,3,0                                    7
30     7,1,0,8                                          4
31     7,8                                              2
32     1,8                                              2
33     7,4                                              2
34     6                                                1
                                                       100

Figure 8-6 Stem-and-leaf plot for the bottle-bursting-strength data in Table 8-2.

Sometimes, in order to assist in finding percentiles, we order the leaves by magnitude, producing an ordered stem-and-leaf plot, as in Fig. 8-7. For instance, since n = 100 is an even number, the median, or "middle" observation (see Section 8-4.1), is the average of the two observations with ranks 50 and 51, or

x̃ = (265 + 265)/2 = 265.

The tenth percentile is the observation with rank (0.1)(100) + 0.5 = 10.5 (halfway between the 10th and 11th observations), or (220 + 221)/2 = 220.5. The first quartile is the observation with rank (0.25)(100) + 0.5 = 25.5 (halfway between the 25th and 26th observations), or (248 + 248)/2 = 248, and the third quartile is the observation with rank (0.75)(100) + 0.5 = 75.5 (halfway between the 75th and 76th observations), or (280 + 280)/2 = 280. The first and third quartiles are occasionally denoted by the symbols Q1 and Q3, respectively, and the interquartile range IQR = Q3 − Q1 may be used as a measure of variability. For the bottle-bursting-strength data, the interquartile range is IQR = Q3 − Q1 = 280 − 248 = 32. The stem-and-leaf displays in Figs. 8-6 and 8-7 are equivalent to a histogram with 18 class intervals. In some situations, it may be desirable to provide more classes or stems. One way to do this would be to modify the original stems as follows: divide stem 5 (say) into two new stems, 5* and 5·. Stem 5* has leaves 0, 1, 2, 3, and 4, and stem 5· has leaves 5, 6, 7, 8, and 9. This will double the number of original stems. We could increase the number of original stems by a factor of five by defining five new stems: 5z with leaves 0 and 1, 5t (for twos and threes) with leaves 2 and 3, 5f (for fours and fives) with leaves 4 and 5, 5s (for sixes and sevens) with leaves 6 and 7, and 5e with leaves 8 and 9.

8-3.4 The Box Plot


A box plot displays the three quartiles, the minimum, and the maximum of the data on a rectangular box, aligned either horizontally or vertically. The box encloses the interquartile range with the left (or lower) line at the first quartile Q1 and the right (or upper) line at the third quartile Q3. A line is drawn through the box at the second quartile (which is the 50th percentile or the median), Q2 = x̃.

Stem   Leaf
17     6                                                1
18     7                                                1
19     7                                                1
20     0,5,8                                            3
21     0,4,5                                            3
22     0,1,3,8                                          4
23     1,1,4,5,5,5                                      6
24     2,2,3,5,6,8,8                                    7
25     0,0,0,1,3,4,4,7,8,8,8                           11
26     0,0,0,0,1,2,3,3,4,4,5,5,5,5,5,5,7,7,8,9,9       21
27     0,1,1,2,4,4,4,4,5,6,6,7,8,8                     14
28     0,0,0,0,1,1,3,3,6,7                             10
29     0,3,4,6,8,9,9                                    7
30     0,1,7,8                                          4
31     7,8                                              2
32     1,8                                              2
33     4,7                                              2
34     6                                                1
                                                       100

Figure 8-7 Ordered stem-and-leaf plot for the bottle-bursting-strength data.

A line at either end extends to the extreme values. These lines, sometimes called whiskers, may extend only to the 10th and 90th percentiles or the 5th and 95th percentiles in large data sets. Some authors refer to the box plot as the box-and-whisker plot. Figure 8-8 presents the box plot for the bottle-bursting-strength data. This box plot indicates that the distribution of bursting strengths is fairly symmetric around the central value, because the left and right whiskers and the lengths of the left and right boxes around the median are about the same.
The box plot is useful in comparing two or more samples. To illustrate, consider the data in Table 8-4. The data, taken from Messina (1987), represent viscosity readings on three different mixtures of raw material used on a manufacturing line. One of the objectives of the study that Messina discusses is to compare the three mixtures. Figure 8-9 presents the box plots for the viscosity data. This display permits easy interpretation of the data. Mixture 1 has higher viscosity than mixture 2, and mixture 2 has higher viscosity than mixture 3. The distribution of viscosity is not symmetric, and the maximum viscosity reading from mixture 3 seems unusually large in comparison to the other readings. This observation may be an outlier, and it possibly warrants further examination and analysis.
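Comparative box plots like Fig. 8-9 can be produced with most statistical software; the sketch below assumes Python with matplotlib (not used by the text), whose default quartile and whisker conventions may differ slightly from the hand-drawn figure.

```python
# Sketch: side-by-side box plots for the Table 8-4 viscosity data.
import matplotlib.pyplot as plt

mixture1 = [22.02, 23.83, 26.67, 25.38, 25.49, 23.50, 25.90, 24.98]
mixture2 = [21.49, 22.67, 24.62, 24.18, 22.78, 22.56, 24.46, 23.79]
mixture3 = [20.33, 21.67, 24.67, 22.45, 22.28, 21.95, 20.49, 21.81]

plt.boxplot([mixture1, mixture2, mixture3])
plt.xticks([1, 2, 3], ["1", "2", "3"])
plt.xlabel("Mixture")
plt.ylabel("Viscosity")
plt.show()
```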

Figure 8-8 Box plot for the bottle-bursting-strength data (minimum 176, Q1 = 248, median 265, Q3 = 280, maximum 346).

Table 8-4 Viscosity Measurements for Three Mixtures

Mixture 1    Mixture 2    Mixture 3
22.02        21.49        20.33
23.83        22.67        21.67
26.67        24.62        24.67
25.38        24.18        22.45
25.49        22.78        22.28
23.50        22.56        21.95
25.90        24.46        20.49
24.98        23.79        21.81

Figure 8-9 Box plots for the mixture-viscosity data in Table 8-4.

8-3.5 The Pareto Chart


A Pareto diagram is a bar graph for count data. It displays the frequency of each count on the vertical axis and the category of classification on the horizontal axis. We always arrange the categories in descending order of frequency of occurrence; that is, the most frequently occurring is on the left, followed by the next most frequently occurring type, and so on.

Figure 8-10 presents a Pareto diagram for the production of transport aircraft by the Boeing Commercial Airplane Company in the year 2000. Notice that the 737 was the most popular model, followed by the 777, the 757, the 767, the 717, the 747, the MD-11, and the MD-90. The line on the Pareto chart connects the cumulative percentages of the k most frequently produced models (k = 1, 2, 3, 4, 5). In this example, the two most frequently produced models account for approximately 69% of the total airplanes manufactured in 2000. One feature of these charts is that the horizontal scale is not necessarily numeric. Usually, categorical classifications are employed as in the airplane production example.

The Pareto chart is named for an Italian economist who theorized that in certain economies the majority of the wealth is held by a minority of the people. In count data, the "Pareto principle" frequently occurs, hence the name for the chart.
Pareto charts are very useful in the analysis of defect data in manufacturing systems. Figure 8-11 presents a Pareto chart showing the frequency with which various types of defects

Model     737    777    757    767    717    747   MD-11   MD-90
Count     281     55     45     44     32     25       4       3
Percent  57.5   11.2    9.2    9.0    6.5    5.1     0.8     0.6
Cum %    57.5   68.7   77.9   86.9   93.5   98.6    99.4   100.0

Figure 8-10 Airplane production in 2000. (Source: Boeing Commercial Airplane Company.)

occur on metal parts used in a structural component of an automobile door frame. Notice how the Pareto chart highlights the relatively few types of defects that are responsible for most of the observed defects in the part. The Pareto chart is an important part of a quality-improvement program because it allows management and engineering to focus attention on the most critical defects in a product or process. Once these critical defects are identified, corrective actions to reduce or eliminate these defects must be developed and implemented. This is easier to do, however, when we are sure that we are attacking a legitimate problem: it is much easier to reduce or eliminate frequently occurring defects than rare ones.
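Because a Pareto chart is just a sorted bar chart with a cumulative-percentage overlay, the bookkeeping takes only a few lines. The sketch below is our own illustration, using the counts recovered from Fig. 8-10 (the count of 32 for the 717 is inferred from its 6.5% share); it sorts the categories by frequency and accumulates percentages.

```python
# Sketch: Pareto ordering and cumulative percentages for the Figure 8-10
# airplane production counts.
production = {"737": 281, "777": 55, "757": 45, "767": 44,
              "717": 32, "747": 25, "MD-11": 4, "MD-90": 3}

total = sum(production.values())
ordered = sorted(production.items(), key=lambda kv: kv[1], reverse=True)

cum = 0.0
print("model   count  percent  cum. percent")
for model, count in ordered:
    pct = 100.0 * count / total
    cum += pct
    print(f"{model:6s} {count:6d}  {pct:6.1f}   {cum:6.1f}")
```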

Figure 8-11 Pareto chart of defects in door structural elements. (The horizontal axis lists defect types, including parts out of contour, missing holes/slots, parts under-trimmed, and parts out of sequence; the vertical axes show the number of defects and the cumulative percentage.)



8-3.6 Time Plots


Virtually everyone should be familiar with time plots, since we view them daily in media presentations. Examples are historical temperature profiles for a given city, the closing Dow Jones Industrials Index for each trading day, each month, each quarter, etc., and the plot of the yearly Consumer Price Index for all urban consumers published by the Bureau of Labor Statistics. Many other time-oriented data are routinely gathered to support inferential statistical activity. Consider the electric power demand, measured in kilowatts, for a given office building and presented as hourly data for each of the 24 hours of the day in which the supplying utility experiences a summer peak demand from all customers. Demand data such as these are gathered using a time of use or "load research" meter. In concept, a kilowatt is a continuous variable. The meter sampling interval is very short, however, and the hourly data that are obtained are truly averages over the meter sampling intervals contained within each hour.
Usually with time plots, time is represented on the horizontal axis, and the vertical scale is calibrated to accommodate the range of values represented in the observational results. The kilowatt hourly demand data are displayed in Fig. 8-12. Ordinarily, when series like these display averages over a time interval, the variation displayed in the data is a function of the length of the averaging interval, with shorter intervals producing more variability. For example, in the case of the kilowatt data, if we used a 15-minute interval, there would be 96 points to plot, and the variation in data would appear much greater.

8-4 NUMERICAL DESCRIPTION OF DATA


Just as graphs can improve the display of data, numerical descriptions are also of value. In this section, we present several important numerical measures for describing the characteristics of data.

8-4.1 Measures of Central Tendency


The most common measure of central tendency, or location, of the data is the ordinary arithmetic mean. Because we usually think of the data as being obtained from a sample of units, we will refer to the arithmetic mean as the sample mean.

Figure 8-12 Summer peak day hourly kilowatt (kW) demand data for an office building.

If the observations in a sample of size n are x₁, x₂, ..., xₙ, then the sample mean is

x̄ = (x₁ + x₂ + ··· + xₙ)/n = (Σᵢ₌₁ⁿ xᵢ)/n.    (8-1)

For the bottle-bursting-strength data in Table 8-2, the sample mean is

x̄ = (Σᵢ₌₁¹⁰⁰ xᵢ)/100 = 26,406/100 = 264.06.
From examination of Fig. 8-5, it seems that the sample mean 264.06 psi is a "typical" value of bursting strength, since it occurs near the middle of the data, where the observations are concentrated. However, this impression can be misleading. Suppose that the histogram looked like Fig. 8-13. The mean of these data is still a measure of central tendency, but it does not necessarily imply that most of the observations are concentrated around it. In general, if we think of the observations as having unit mass, the sample mean is just the center of mass of the data. This implies that the histogram will just exactly balance if it is supported at the sample mean.
The sample mean x̄ represents the average value of all the observations in the sample. We can also think of calculating the average value of all the observations in a finite population. This average is called the population mean, and as we have seen in previous chapters, it is denoted by the Greek letter μ. When there are a finite number of possible observations (say N) in the population, then the population mean is

μ = Tₓ/N,    (8-2)

where Tₓ = Σᵢ₌₁ᴺ xᵢ is the finite population total for the population.
In the following chapters dealing with statistical inference, we will present methods for making inferences about the population mean that are based on the sample mean. For example, we will use the sample mean as a point estimate of μ.
Another measure of central tendency is the median, or the point at which the sample is divided into two equal halves. Let x(1), x(2), ..., x(n) denote a sample arranged in increasing order of magnitude; that is, x(1) denotes the smallest observation, x(2) denotes the second
Figure 8-13 A histogram.



smallest observation, ..., and x(n) denotes the largest observation. Then the median is defined mathematically as

x̃ = x((n+1)/2),                  n odd,
x̃ = [x(n/2) + x(n/2+1)]/2,       n even.    (8-3)
The median has the advantage that it is not influenced very much by extreme values. For example, suppose that the sample observations are

1, 3, 4, 2, 7, 6, and 8.

The sample mean is 4.43, and the sample median is 4. Both quantities give a reasonable measure of the central tendency of the data. Now suppose that the next-to-last observation is changed, so that the data are

1, 3, 4, 2, 7, 2519, and 8.

For these data, the sample mean is 363.43. Clearly, in this case the sample mean does not tell us very much about the central tendency of most of the data. The median, however, is still 4, and this is probably a much more meaningful measure of central tendency for the majority of the observations.
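A two-line check (our own sketch, using Python's standard statistics module) makes the point concrete:

```python
# Sketch: the sample mean is sensitive to an extreme value; the median is not.
import statistics

data1 = [1, 3, 4, 2, 7, 6, 8]
data2 = [1, 3, 4, 2, 7, 2519, 8]   # next-to-last observation changed

print(statistics.mean(data1), statistics.median(data1))   # about 4.43, 4
print(statistics.mean(data2), statistics.median(data2))   # about 363.43, 4
```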
Just as x̃ is the middle value in a sample, there is a middle value in the population. We define μ̃ as the median of the population; that is, μ̃ is a value of the associated random variable such that half the population lies below μ̃ and half lies above.
The mode is the observation that occurs most frequently in the sample. For example, the mode of the sample data

2, 4, 6, 2, 5, 6, 2, 9, 4, 5, 2, and 1

is 2, since it occurs four times, and no other value occurs as often. There may be more than one mode.
If the data are symmetric, then the mean and median coincide. If, in addition, the data have only one mode (we say the data are unimodal), then the mean, median, and mode may all coincide. If the data are skewed (asymmetric, with a long tail to one side), then the mean, median, and mode will not coincide. Usually we find that mode < median < mean if the distribution is skewed to the right, while mode > median > mean if the distribution is skewed to the left (see Fig. 8-14).

The distribution of the sample mean is well known and relatively easy to work with. Furthermore, the sample mean is usually more stable than the sample median, in the sense that it does not vary as much from sample to sample. Consequently, many analytical statistical techniques use the sample mean. However, the median and mode may also be helpful descriptive measures.

Figure 8-14 The mean and median for symmetric and skewed distributions: (a) negative or left skew, (b) symmetric, (c) positive or right skew.

8-4.2 Measures of Dispersion


Central tendency does not necessarily provide enough information to describe data adequately. For example, consider the bursting strengths obtained from two samples of six bottles each:

Sample 1: 230 250 245 258 265 240
Sample 2: 190 228 305 240 265 260

The mean of both samples is 248 psi. However, note that the scatter or dispersion of Sample 2 is much greater than that of Sample 1 (see Fig. 8-15). In this section, we define several widely used measures of dispersion.
The most important measure of dispersion is the sample variance. If x₁, x₂, ..., xₙ is a sample of n observations, then the sample variance is

s² = Σᵢ₌₁ⁿ (xᵢ − x̄)² / (n − 1) = S_xx/(n − 1).    (8-4)

Note that computation of s² requires calculation of x̄, n subtractions, and n squaring and adding operations. The deviations xᵢ − x̄ may be rather tedious to work with, and several decimals may have to be carried to ensure numerical accuracy. A more efficient (yet equivalent) computational formula for calculating S_xx is

S_xx = Σᵢ₌₁ⁿ xᵢ² − (Σᵢ₌₁ⁿ xᵢ)²/n.    (8-5)

The formula for S_xx presented in equation 8-5 requires only one computational pass, but care must again be taken to keep enough decimals to prevent round-off error.
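As a quick sketch (ours), the following fragment evaluates both equation 8-4 and equation 8-5 on the second sample and confirms that the two formulations agree:

```python
# Sketch: the definitional (8-4) and one-pass (8-5) variance formulas agree.
sample2 = [190, 228, 305, 240, 265, 260]
n = len(sample2)
xbar = sum(sample2) / n

s2_definitional = sum((x - xbar) ** 2 for x in sample2) / (n - 1)
sxx = sum(x * x for x in sample2) - sum(sample2) ** 2 / n
s2_one_pass = sxx / (n - 1)

print(s2_definitional, s2_one_pass)   # both 1502.0 (psi^2)
```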
To see how the sample variance measures dispersion or variability, refer to Fig. 8-16, which shows the deviations xᵢ − x̄ for the second sample of six bottle-bursting strengths. The greater the amount of variability in the bursting-strength data, the larger in absolute magnitude some of the deviations xᵢ − x̄. Since the deviations xᵢ − x̄ will always sum to zero, we must use a measure of variability that changes the negative deviations to nonnegative quantities. Squaring the deviations is the approach used in the sample variance. Consequently, if s² is small, then there is relatively little variability in the data, but if s² is large, the variability is relatively large.

The units of measurement for the sample variance are the square of the original units of the variable. Thus, if x is measured in pounds per square inch (psi), the units for the sample variance are (psi)².

Example 8-2
We will calculate the sample variance of the bottle-bursting strengths for the second sample in Fig. 8-15. The deviations xᵢ − x̄ for this sample are shown in Fig. 8-16.

Figure 8-15 Bursting-strength data (• = Sample 1, ○ = Sample 2; sample mean = 248).

Figure 8-16 How the sample variance measures variability through the deviations xᵢ − x̄.

Observations    xᵢ − x̄    (xᵢ − x̄)²
x₁ = 190         −58        3364
x₂ = 228         −20         400
x₃ = 305          57        3249
x₄ = 240          −8          64
x₅ = 265          17         289
x₆ = 260          12         144
x̄ = 248
From equation 8-4,

s² = Σᵢ₌₁ⁿ (xᵢ − x̄)²/(n − 1) = S_xx/(n − 1) = 7510/5 = 1502 (psi)².

We may also calculate S_xx from the formulation given in equation 8-5, so that

s² = [Σᵢ₌₁ⁿ xᵢ² − (Σᵢ₌₁ⁿ xᵢ)²/n]/(n − 1) = [376,534 − (1488)²/6]/5 = 1502 (psi)².
If we calculate the sample variance of the bursting strength for the Sample 1 values, we find that s² = 158 (psi)². This is considerably smaller than the sample variance of Sample 2, confirming our initial impression that Sample 1 has less variability than Sample 2.

Because s² is expressed in the square of the original units, it is not easy to interpret. Furthermore, variability is a more difficult and unfamiliar concept than location or central tendency. However, we can solve the "curse of dimensionality" by working with the (positive) square root of the variance, s, called the sample standard deviation. This gives a measure of dispersion expressed in the same units as the original variable.

The sample standard deviation of the bottle-bursting strengths for the Sample 2 bottles in Example 8-2 and Fig. 8-15 is

s = √1502 = 38.76 psi.

For the Sample 1 bottles, the standard deviation of bursting strength is

s = √158 = 12.57 psi.

Compute the sample variance and sample standard deviation of the bottle-bursting-strength data in Table 8-2. Note that

Σᵢ₌₁¹⁰⁰ xᵢ² = 7,074,258.00 and Σᵢ₌₁¹⁰⁰ xᵢ = 26,406.

Consequently,

S_xx = 7,074,258.00 − (26,406)²/100 = 101,489.64 and s² = 101,489.64/99 = 1025.15 (psi)²,

so that the sample standard deviation is

s = √1025.15 = 32.02 psi.



When the population is finite and consists of N values, we may define the population variance as

σ² = Σᵢ₌₁ᴺ (xᵢ − μ)²/N,    (8-6)

which is simply the average of the squared departures of the data values from the population mean. A closely related quantity, σ̃², is also sometimes called the population variance and is defined as

σ̃² = N σ²/(N − 1).    (8-7)

Obviously, as N gets large, σ̃² → σ², and oftentimes the use of σ̃² simplifies some of the algebraic formulation presented in Chapters 9 and 10. Where several populations are to be observed, a subscript may be employed to identify the population characteristics and descriptive measures, e.g., μ_x, σ_x², s_x², x̄, etc., if the x variable is being described.
etc., if the variable is being described,
We noted that the sample mean may be used to make inferences about the population mean. Similarly, the sample variance may be used to make inferences about the population variance. We observe that the divisor for the sample variance, s², is the sample size minus 1, (n − 1). If we actually knew the true value of the population mean μ, then we could define the sample variance as the average squared deviation of the sample observations about μ. In practice, the value of μ is almost never known, and so the sum of the squared deviations about the sample average x̄ must be used instead. However, the observations xᵢ tend to be closer to their average x̄ than to the population mean μ, so to compensate for this we use as a divisor n − 1 rather than n.

Another way to think about this is to consider the sample variance s² as being based on n − 1 degrees of freedom. The term degrees of freedom results from the fact that the n deviations x₁ − x̄, x₂ − x̄, ..., xₙ − x̄ always sum to zero, so specifying the values of any n − 1 of these quantities automatically determines the remaining one. Thus, only n − 1 of the n deviations xᵢ − x̄ are independent.
Another useful measure of dispersion is the sample range

R = max(xᵢ) − min(xᵢ).    (8-8)

The sample range is very simple to compute, but it ignores all the information in the sample between the smallest and largest observations. For small sample sizes, say n ≤ 10, this information loss is not too serious in some situations.

The range traditionally has had widespread application in statistical quality control, where sample sizes of 4 or 5 are common and computational simplicity is a major consideration; however, that advantage has been largely diminished by the widespread use of electronic measurement and data storage and analysis systems, and as we will later see, the sample variance (or standard deviation) provides a "better" measure of variability. We will briefly discuss the use of the range in statistical quality-control problems in Chapter 17.

Calculate the ranges of the two samples of bottle-bursting-strength data from Section 8-4.2, shown in Fig. 8-15. For the first sample, we find that

R₁ = 265 − 230 = 35,

whereas for the second sample

R₂ = 305 − 190 = 115.

Note that the range of the second sample is much larger than the range of the first, implying that the second sample has greater variability than the first.

Occasionally, it is desirable to express variation as a fraction of the mean. A measure of relative variation called the sample coefficient of variation is defined as

CV = s/x̄.    (8-9)

The coefficient of variation is useful when comparing the variability of two or more data sets that differ considerably in the magnitude of the observations. For example, the coefficient of variation might be useful in comparing the variability of daily electricity usage within samples of single-family residences in Atlanta, Georgia, and Butte, Montana, during July.
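A small sketch (ours, using the standard statistics module) computes the range (equation 8-8) and the coefficient of variation (equation 8-9) for the two samples of Section 8-4.2:

```python
# Sketch: sample range and coefficient of variation for the two samples.
import statistics

sample1 = [230, 250, 245, 258, 265, 240]
sample2 = [190, 228, 305, 240, 265, 260]

for name, s in [("Sample 1", sample1), ("Sample 2", sample2)]:
    r = max(s) - min(s)                              # equation 8-8
    cv = statistics.stdev(s) / statistics.mean(s)    # equation 8-9
    print(f"{name}: R = {r}, CV = {cv:.3f}")
```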

8-4.3 Other Measures for One Variable


Two other measures, both dimensionless, are provided by spreadsheet or statistical software systems and are called skewness and kurtosis estimates. The notion of skewness was graphically illustrated in Fig. 8-14. These characteristics are population or population model characteristics, and they are defined in terms of moments, μₖ, as described in Chapter 2, Section 2-5:

skewness β₃ = μ₃/σ³  and  kurtosis β₄ = μ₄/σ⁴ − 3,    (8-10)

where, in the case of finite populations, the kth central moment is defined as

μₖ = Σᵢ₌₁ᴺ (xᵢ − μ)ᵏ/N.    (8-11)
As discussed earlier, skewness reflects the degree of symmetry about the mean: negative skew results from an asymmetric tail toward smaller values of the variable, while positive skew results from an asymmetric tail extending toward the larger values of the variable. Symmetric variables, such as those described by the normal and uniform distributions, have skewness equal to zero. The exponential distribution, for example, has skewness β₃ = 2.

Kurtosis describes the relative peakedness of a distribution as compared to a normal distribution, where a negative value is associated with a relatively flat distribution and a positive value is associated with relatively peaked distributions. For example, the kurtosis measure for a uniform distribution is −1.2, while for a normal variable, the kurtosis is zero.
If the data being analyzed represent measurement of a variable made on sample units, the sample estimates of β₃ and β₄ are as shown in equations 8-12 and 8-13. These values may be calculated from Excel® worksheet functions using the SKEW and KURT functions:

skewness β̂₃ = n Σᵢ₌₁ⁿ (xᵢ − x̄)³ / [(n − 1)(n − 2)s³],    n > 2,    (8-12)

kurtosis β̂₄ = n(n + 1) Σᵢ₌₁ⁿ (xᵢ − x̄)⁴ / [(n − 1)(n − 2)(n − 3)s⁴] − 3(n − 1)²/[(n − 2)(n − 3)],    n > 3.    (8-13)

If the data represent measurements on all the units of a finite population or a census, then equations 8-10 and 8-11 should be utilized directly to determine these measures.

For the pull strength data shown in Table 8-1, the worksheet functions return values of 0.865 and 0.161 for the skewness and kurtosis measures, respectively.
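Outside a spreadsheet, equations 8-12 and 8-13 are straightforward to code directly. The sketch below is ours (the function names are our own), and it should return approximately 0.865 and 0.161 when applied to the 25 pull strength values of Table 8-1.

```python
# Sketch: sample skewness (eq. 8-12) and kurtosis (eq. 8-13); these are the
# same quantities computed by the Excel SKEW and KURT worksheet functions.
import statistics

def skewness(x):
    n, xbar, s = len(x), statistics.mean(x), statistics.stdev(x)
    return n * sum((xi - xbar) ** 3 for xi in x) / ((n - 1) * (n - 2) * s ** 3)

def kurtosis(x):
    n, xbar, s = len(x), statistics.mean(x), statistics.stdev(x)
    m4 = sum((xi - xbar) ** 4 for xi in x)
    return (n * (n + 1) * m4 / ((n - 1) * (n - 2) * (n - 3) * s ** 4)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

pull_strength = [9.95, 24.45, 31.75, 35.00, 25.02, 16.86, 14.38, 9.60,
                 24.35, 27.50, 17.08, 37.00, 41.95, 11.66, 21.65, 17.89,
                 69.00, 10.30, 34.93, 46.59, 44.88, 54.12, 56.63, 22.13, 21.15]
print(skewness(pull_strength), kurtosis(pull_strength))  # about 0.865, 0.161
```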

8-4.4 Measuring Association


One measure of association between two numerical variables in sample data is called the Pearson or simple correlation coefficient, and it is usually denoted r. Where data sets contain a number of variables and the correlation coefficient for only one pair of variables, designated x and y, is to be presented, a subscript notation such as r_xy may be used to designate the simple correlation between variable x and variable y. The correlation coefficient is

r_xy = S_xy / √(S_xx S_yy),    (8-14)

where S_xx is as shown in equation 8-5 and S_yy is similarly defined for the y variable, replacing x with y in equation 8-5, while

S_xy = Σᵢ₌₁ⁿ (xᵢ − x̄)(yᵢ − ȳ) = Σᵢ₌₁ⁿ xᵢyᵢ − (Σᵢ₌₁ⁿ xᵢ)(Σᵢ₌₁ⁿ yᵢ)/n.    (8-15)
Now we will return to the wire bond pull strength data as presented in Table 8-1, with the scatter plots shown in Fig. 8-3 for both pull strength (variable y) vs. wire length (variable x₁) and for pull strength vs. die height (variable x₂). For the sake of distinguishing between the two correlation coefficients, we let r₁ be used for y vs. x₁ and r₂ for y vs. x₂. The correlation coefficient is a dimensionless measure which lies on the interval [−1, +1]. It is a measure of linear association between the two variables of the pair. As the strength of linear association increases, |r| → 1. A positive association means that larger x₁ values have larger y values associated with them, and the same is true for x₂ and y. In other sets of data, where the larger x values have smaller y values associated with them, the correlation coefficient is negative. When the data reflect no linear association, the correlation is zero. In the case of the pull strength data, r₁ = 0.982 and r₂ = 0.493. The calculations were made using Minitab®, selecting Stat > Basic Statistics > Correlation. Both are obviously positive.

It is important to note, however, that the above result does not imply causality. We cannot claim that increasing x₁ or x₂ causes an increase in y. This important point is often missed, with the potential for major misinterpretation, as it may well be that a fourth, unobserved variable is the causative variable influencing all the observed variables.

In the case where the data represent measures on variables associated with an entire finite universe, equations 8-14 and 8-15 may be employed after replacing x̄ by μ_x, the finite population mean for the x population, and ȳ by μ_y, the finite population mean for the y population, while replacing the sample size n in the formulation by N, the universe size. In this case, it is customary to use the symbol ρ_xy to represent this finite population correlation coefficient between variables x and y.
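The same values can be reproduced directly from equations 8-14 and 8-15. A sketch (ours; the helper name corr is our own):

```python
# Sketch: simple correlation (eqs. 8-14 and 8-15) for the Table 8-1 data.
def corr(x, y):
    n = len(x)
    sxx = sum(xi * xi for xi in x) - sum(x) ** 2 / n
    syy = sum(yi * yi for yi in y) - sum(y) ** 2 / n
    sxy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    return sxy / (sxx * syy) ** 0.5

pull_strength = [9.95, 24.45, 31.75, 35.00, 25.02, 16.86, 14.38, 9.60,
                 24.35, 27.50, 17.08, 37.00, 41.95, 11.66, 21.65, 17.89,
                 69.00, 10.30, 34.93, 46.59, 44.88, 54.12, 56.63, 22.13, 21.15]
wire_length = [2, 8, 11, 10, 8, 4, 2, 2, 9, 8, 4, 11, 12,
               2, 4, 4, 20, 1, 10, 15, 15, 16, 17, 6, 5]
die_height = [50, 110, 120, 550, 295, 200, 375, 52, 100, 300, 412, 400, 500,
              360, 205, 400, 600, 585, 540, 250, 290, 510, 590, 100, 400]

print(corr(wire_length, pull_strength))  # r1, about 0.982
print(corr(die_height, pull_strength))   # r2, about 0.493
```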

8-4.5 Grouped Data


If the data are in a frequency distribution, it is necessary to modify the computing formulas for the measures of central tendency and dispersion given in Sections 8-4.1 and 8-4.2. Suppose that for each of p distinct values of x, say x₁, x₂, ..., xₚ, the observed frequency is fⱼ. Then the sample mean and sample variance may be computed as

x̄ = Σⱼ₌₁ᵖ fⱼxⱼ / n    (8-16)

and

s² = [Σⱼ₌₁ᵖ fⱼxⱼ² − (Σⱼ₌₁ᵖ fⱼxⱼ)²/n] / (n − 1),    (8-17)

respectively.
A similar situation arises where original data have been either lost or destroyed but the grouped data have been preserved. In such cases, we can approximate the important data moments by using a convention which assigns each data value the value representing the midpoint of the class interval into which the observations were classified, so that all sample values falling in a particular interval are assigned the same value. This may be done easily, and the resulting approximate mean and variance values are as follows in equations 8-18 and 8-19. If mⱼ denotes the midpoint of the jth class interval and there are c class intervals, then the sample mean and sample variance are approximately

x̄ ≈ Σⱼ₌₁ᶜ fⱼmⱼ / n    (8-18)

and

s² ≈ [Σⱼ₌₁ᶜ fⱼmⱼ² − (Σⱼ₌₁ᶜ fⱼmⱼ)²/n] / (n − 1).    (8-19)

To illustrate the use of equations 8-18 and 8-19, we compute the mean and variance of bursting strength for the data in the frequency distribution of Table 8-3. Note that there are c = 9 class intervals, and that m₁ = 180, f₁ = 2; m₂ = 200, f₂ = 4; m₃ = 220, f₃ = 7; m₄ = 240, f₄ = 13; m₅ = 260, f₅ = 32; m₆ = 280, f₆ = 24; m₇ = 300, f₇ = 11; m₈ = 320, f₈ = 4; and m₉ = 340, f₉ = 3. Thus,

x̄ ≈ Σⱼ₌₁⁹ fⱼmⱼ / n = 26,460/100 = 264.60 psi

and

s² ≈ [7,103,600 − (26,460)²/100]/99 = 1033.17 (psi)².

Notice that these are very close to the values obtained from the ungrouped data.

When the data are grouped in class intervals, it is also possible to approximate the median and mode. The median is approximately

x̃ ≈ L_M + [(n/2 − T)/f_M] Δ,    (8-20)

where L_M is the lower limit of the class interval containing the median (called the median class), f_M is the frequency in the median class, T is the total of all frequencies in the class intervals preceding the median class, and Δ is the width of the median class. The mode, say MO, is approximately

MO ≈ L_MO + [a/(a + b)] C,    (8-21)

where L_MO is the lower limit of the modal class (the class interval with the greatest frequency), a is the absolute value of the difference in frequency between the modal class and the preceding class, b is the absolute value of the difference in frequency between the modal class and the following class, and C is the width of the modal class.
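Equations 8-18 through 8-20 can be checked against the Table 8-3 distribution with a short sketch (ours):

```python
# Sketch: grouped-data approximations for the Table 8-3 distribution.
midpoints = [180, 200, 220, 240, 260, 280, 300, 320, 340]
freqs     = [  2,   4,   7,  13,  32,  24,  11,   4,   3]
n = sum(freqs)                                            # 100

sfm  = sum(f * m for f, m in zip(freqs, midpoints))       # 26,460
sfm2 = sum(f * m * m for f, m in zip(freqs, midpoints))   # 7,103,600
xbar = sfm / n                                            # 264.60 (eq. 8-18)
s2 = (sfm2 - sfm ** 2 / n) / (n - 1)                      # about 1033.2 (eq. 8-19)

# Median via eq. 8-20: the median class is 250 <= x < 270, since the
# cumulative frequency first reaches n/2 = 50 there; L_M = 250, f_M = 32,
# T = 2 + 4 + 7 + 13 = 26, and the class width is 20.
median = 250 + (n / 2 - 26) / 32 * 20                     # 265.0
print(xbar, s2, median)
```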

8-5 SUMMARY

This chapter has provided an introduction to the field of statistics, including the notions of process, universe, population, sampling, and sample results called data. Furthermore, a variety of commonly used displays have been described and illustrated. These were the dot plot, frequency distribution, histogram, stem-and-leaf plot, Pareto chart, box plot, scatter plot, and time plot.

We have also introduced quantitative measures for summarizing data. The mean, median, and mode describe central tendency, or location, while the variance, standard deviation, range, and interquartile range describe dispersion or spread in the data. Furthermore, measures of skew and kurtosis were presented to describe asymmetry and peakedness, respectively. We also presented the correlation coefficient to describe the strength of linear association between two variables. Subsequent chapters will focus on utilizing sample results to draw inferences about the process or process model or about the universe.

8-6 EXERCISES
8-1. The shelf life of a high-speed photographic film is being investigated by the manufacturer. The following data are available.

Life (days):
126 129 134 141
131 132 136 145
116 128 130 162
125 126 134 129
138 127 120 127
120 122 129 133
125 111 147 129
150 148 126 140
130 120 117 131
149 117 143 133

Construct a histogram and comment on the properties of the data.

8-2. The percentage of cotton in a material used to manufacture men's shirts is given below. Construct a histogram for the data. Comment on the properties of the data.

34.2 33.6 33.8 34.7 37.8 32.6 35.8 34.6
33.1 34.7 34.2 33.6 36.6 33.1 37.6 33.6
34.5 35.0 33.4 32.5 35.4 34.6 37.3 34.1
35.6 35.4 34.7 34.1 34.6 35.9 34.6 34.7
34.3 36.2 34.6 35.1 33.8 34.7 35.5 35.7
35.1 36.8 35.2 36.8 37.1 33.6 32.8 36.8
34.7 35.1 35.0 37.9 34.0 32.9 32.1 34.3
33.6 35.3 34.9 36.4 34.1 33.5 34.5 32.7

8-3. The following data represent the yield on 90 consecutive batches of ceramic substrate to which a metal coating has been applied by a vapor-deposition process. Construct a histogram for these data and comment on the properties of the data.

94.1 87.3 94.1 92.4 84.6 85.4
93.2 84.1 92.1 90.6 83.6 86.6
90.6 90.1 96.4 89.1 85.4 91.7
91.4 95.2 88.2 88.8 89.7 87.5
88.2 86.1 86.4 86.4 87.6 84.2
86.1 94.3 85.0 85.1 85.1 85.1
95.1 93.2 84.9 84.0 89.6 90.5
90.0 86.7 87.3 93.7 90.0 95.6
92.4 83.0 89.6 87.7 90.1 88.3
87.3 95.3 90.3 90.6 94.3 84.1
86.6 94.1 93.1 89.4 97.3 83.7
91.2 97.8 94.6 88.6 96.8 82.9
86.1 93.1 96.3 84.1 94.4 87.3
90.4 86.4 94.7 82.6 96.1 86.4
89.1 87.6 91.1 83.1 98.0 84.5

8-4. An electronics company manufactures power supplies for a personal computer. They produce several hundred power supplies each shift, and each unit is subjected to a 12-hour burn-in test. The number of units failing during this 12-hour test each shift is shown below.
(a) Construct a frequency distribution and histogram.
(b) Find the sample mean, sample variance, and sample standard deviation.

3 6 4 7 6 7
4 7 8 2 4
2 9 4 6 4 8
5 10 10 9 13 7
6 14 10 12 3
10 13 8 7 10 6
5 10 12 9 2 7
4 9 4 16 5 8
3 5 11 7 4
11 10 14 13 10 12
9 3 2 3 4 6
2 2 8 13 2 17
7 4 6 3 2 5
8 6 10 7 6 10
4 4 8 3 8
10 2 10 6 2 9
6 8 4 9 5 11
5 7 6 4 14 7
4 14 15 3 6 2
3 13 4 3 4 8
2 12 7 6 4 10
8 5 5 5 8 7
10 4 3 10 7 4
9 6 2 6 9 3
11 5 6 7 2 6

8-5. Consider the shelf-life data in Exercise 8-1. Compute the sample mean, sample variance, and sample standard deviation.

8-6. Consider the cotton percentage data in Exercise 8-2. Find the sample mean, sample variance, sample standard deviation, sample median, and sample mode.


8-7. Consider the yield data in Exercise 8-3. Calculate the sample mean, sample variance, and sample standard deviation.

8-8. An article in Computers and Industrial Engineering (2001, p. 51) describes time-to-failure data (in hours) for jet engines. Some of the data are reproduced below.

Engine #   Failure Time     Engine #   Failure Time
1          150              14         171
2          291              15         197
3          93               16         200
4          53               17         262
5          2                18         255
6          65               19         286
7          183              20         206
8          144              21         179
9          223              22         232
10         197              23         165
11         187              24         155
12         197              25         203
13         213

(a) Construct a frequency distribution and histogram for these data.
(b) Calculate the sample mean, sample median, sample variance, and sample standard deviation.

8-9. For the time-to-failure data in Exercise 8-8, suppose the fifth observation (2 hours) is discarded. Construct a frequency distribution and a histogram for the remaining data, and calculate the sample mean, sample median, sample variance, and sample standard deviation. Compare the results with those obtained in Exercise 8-8. What impact has removal of this observation had on the summary statistics?

8-10. An article in Technometrics (Vol. 19, 1977, p. 425) presents the following data on motor fuel octane ratings of several blends of gasoline:

88.5, 87.7, 83.4, 86.7, 87.5, 91.5, 88.6, 100.3,
95.6, 93.3, 94.7, 91.1, 91.0, 94.2, 87.8, 89.9,
88.3, 87.6, 84.3, 86.7, 88.2, 90.8, 88.3, 98.8,
94.2, 92.7, 93.2, 91.0, 90.3, 93.4, 88.5, 90.1,
89.2, 83.3, 85.3, 87.9, 88.6, 90.9, 89.0, 96.1,
93.3, 91.8, 92.3, 90.4, 90.1, 93.0, 88.7, 89.9,
89.8, 89.6, 87.4, 88.4, 83.9, 91.2, 89.3, 94.4,
92.7, 91.8, 91.6, 90.4, 91.1, 92.6, 89.8, 90.6,
91.1, 90.4, 89.3, 89.7, 90.3, 91.6, 90.5, 93.7,
92.7, 92.2, 92.2, 91.2, 91.0, 92.2, 90.0, 90.7.

(a) Construct a stem-and-leaf plot.
(b) Construct a frequency distribution and histogram.
(c) Calculate the sample mean, sample variance, and sample standard deviation.
(d) Find the sample median and sample mode.
(e) Determine the skewness and kurtosis measures.

8-11. Consider the shelf-life data in Exercise 8-1. Construct a stem-and-leaf plot for these data. Construct an ordered stem-and-leaf plot. Use this plot to find the 65th and 95th percentiles.

8-12. Consider the cotton percentage data in Exercise 8-2.
(a) Construct a stem-and-leaf plot.
(b) Calculate the sample mean, sample variance, and sample standard deviation.
(c) Construct an ordered stem-and-leaf plot.
(d) Find the median and the first and third quartiles.
(e) Find the interquartile range.
(f) Determine the skewness and kurtosis measures.

8-13. Consider the yield data in Exercise 8-3.
(a) Construct an ordered stem-and-leaf plot.
(b) Find the median and the first and third quartiles.
(c) Calculate the interquartile range.

8-14. Construct a box plot for the shelf-life data in Exercise 8-1. Interpret the data using this plot.

8-15. Construct a box plot for the cotton percentage data in Exercise 8-2. Interpret the data using this plot.

8-16. Construct a box plot for the yield data in Exercise 8-3. Compare it to the histogram (Exercise 8-3) and the stem-and-leaf plot (Exercise 8-13). Interpret the data.

8-17. An article in the Electrical Manufacturing & Coil Winding Conference Proceedings (1995, p. 829) presents the results for the number of returned shipments for a record-of-the-month club. The company is interested in the reason for a returned shipment. The results are shown below. Construct a Pareto chart and interpret the data.

Reason             Number of Customers
Refused            195,000
Wrong selection    50,000
Wrong answer       68,000
Canceled           5,000
Other              15,000
8-18. The following table contains the frequency of occurrence of final letters in an article in the Atlanta Journal. Construct a histogram from these data. Do any of the numerical descriptors in this chapter have any meaning for these data?

a 12    n 19
b 11    o 13
c 11    p 1
d 20    q 0
e 25    r 15
f 13    s 18
g 12    t 20
h 12    u 0
i 8     v 0
j 0     w 41
k 2     x 0
l 11    y 15
m 12    z 0

8-19. Show the following:
(a) That $\sum_{i=1}^{n}(x_i - \bar{x}) = 0$.
(b) That $\sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2$.

8-20. The weight of bearings produced by a forging process is being investigated. A sample of six bearings provided the weights 1.18, 1.21, 1.19, 1.17, 1.20, and 1.21 pounds. Find the sample mean, sample variance, sample standard deviation, and sample median.

8-21. The diameter of eight automotive piston rings is shown below. Calculate the sample mean, sample variance, and sample standard deviation.

74.001 mm    73.998 mm
74.005       74.000
74.003       74.006
74.001       74.002

8-22. The thickness of printed circuit boards is a very important characteristic. A sample of eight boards had the following thicknesses (in thousandths of an inch): 63, 61, 65, 62, 61, 64, 60, and 66. Calculate the sample mean, sample variance, and sample standard deviation. What are the units of measurement for each statistic?

8-23. Coding the Data. Consider the printed circuit board thickness data in Exercise 8-22.
(a) Suppose that we subtract a constant 63 from each number. How are the sample mean, sample variance, and sample standard deviation affected?
(b) Suppose that we multiply each number by 100. How are the sample mean, sample variance, and sample standard deviation affected?

8-24. Coding the Data. Let y_i = a + bx_i, i = 1, 2, ..., n, where a and b are nonzero constants. Find the relationship between x̄ and ȳ, and between s_x and s_y.

8-25. Consider the quantity $\sum_{i=1}^{n}(x_i - a)^2$. For what value of a is this quantity minimized?

8-26. The Trimmed Mean. Suppose that the data are arranged in increasing order, L% of the observations removed from each end, and the sample mean of the remaining numbers calculated. The resulting quantity is called a trimmed mean. The trimmed mean generally lies between the sample mean x̄ and the sample median x̃ (why?).
(a) Calculate the 10% trimmed mean for the yield data in Exercise 8-3.
(b) Calculate the 20% trimmed mean for the yield data in Exercise 8-3 and compare it with the quantity found in part (a).

8-27. The Trimmed Mean. Suppose that LN is not an integer. Develop a procedure for obtaining a trimmed mean.

8-28. Consider the shelf-life data in Exercise 8-1. Construct a frequency distribution and histogram using a class interval width of 2. Compute the approximate mean and standard deviation from the frequency distribution and compare them with the exact values found in Exercise 8-5.

8-29. Consider the following frequency distribution.
(a) Calculate the sample mean, variance, and standard deviation.
(b) Calculate the median and mode.

x_j | 1~1161D IUIN UOl21 UZU31M
f_j | 4  6  9  13  15  19  20  18  15  10

8-30. Consider the following frequency distribution.
(a) Calculate the sample mean, variance, and standard deviation.
(b) Calculate the median and mode.

x_j | -4  -3  -2  -1  0  1  2  3  4
f_j | 60  120  180  200  240  190  160  90  30

8-31. For the two sets of data in Exercises 8-29 and 8-30, compute the sample coefficients of variation.

8-32. Compute the approximate sample mean, sample variance, sample median, and sample mode from the data in the following frequency distribution:
Class Interval    Frequency
10 ≤ x < 20       121
20 ≤ x < 30       165
30 ≤ x < 40       184
40 ≤ x < 50       173
50 ≤ x < 60       142
60 ≤ x < 70       120
70 ≤ x < 80       118
80 ≤ x < 90       110
90 ≤ x < 100      90

8-33. Compute the approximate sample mean, sample variance, median, and mode from the data in the following frequency distribution:

Class Interval    Frequency
-10 ≤ x < 0       3
0 ≤ x < 10        8
10 ≤ x < 20       12
20 ≤ x < 30       16
30 ≤ x < 40       9
40 ≤ x < 50       4
50 ≤ x < 60       2

8-34. Compute the approximate sample mean, sample standard deviation, sample variance, median, and mode for the data in the following frequency distribution:

Class Interval    Frequency
600 ≤ x < 650     41
650 ≤ x < 700     46
700 ≤ x < 750     50
750 ≤ x < 800     52
800 ≤ x < 850     60
850 ≤ x < 900     64
900 ≤ x < 950     65
950 ≤ x < 1000    70
1000 ≤ x < 1050   72

8-35. An article in the International Journal of Industrial Ergonomics (1999, p. 483) describes a study conducted to determine the relationship between exhaustion time and distance covered until exhaustion for several wheelchair exercises performed on a 400-m outdoor track. The times and distances for ten participants are as follows:

Time until Exhaustion    Distance Covered until Exhaustion
(in seconds)             (in meters)
610                      1373
310                      698
720                      1440
990                      2228
1820                     4550
475                      713
890                      2003
390                      488
745                      1118
885                      1991

Determine the simple (Pearson) correlation between time and distance. Interpret your results.

8-36. An electric utility which serves 850,332 residential customers on May 1, 2002, selects a sample of 120 customers randomly and installs a time-of-use meter or "load research meter" at each selected residence. At the time of installation, the technician also records the residence size (sq. ft.). During the summer peak demand period, the company experienced peak demand at 5:31 P.M. on July 30, 2002. July bills for usage (kwh) were sent to all customers, including those in the sample group. Due to cyclical billing, the July bills do not reflect the same time-of-use period for all customers; however, each customer is assigned a usage for July billing. The time-of-use meters have memory and they record time-specific demand (kw) by time interval; so for the sampled customers, average 15-minute demand is available for the time interval 5:00-5:45 P.M. on July 30, 2002. The data file Loaddata contains the kwh, kw, and sq. ft. data for each of the 120 sampled residences. This file is available at www.wiley.com/college/hines. Using Minitab® or other software, do the following:
(a) Construct scatter plots of
1. kw vs. sq. ft. and
2. kw vs. kwh,
and comment on the observed displays, specifically in regard to the nature of the association in parts 1 and 2 above, and for each part, the general pattern of observed variation in kw across the range of sq. ft. and the range of kwh.
(b) Construct the three-dimensional plot of kw vs. kwh and sq. ft.
(c) Construct histograms for the kwh, kw, and sq. ft. data, and comment on the patterns observed.
(d) Construct a stem-and-leaf plot for the kwh data and compare it to the histogram for the kwh data.
(e) Determine the sample mean, median, and mode for kwh, kw, and sq. ft.
(f) Determine the sample standard deviation for kwh, kw, and sq. ft.
(g) Determine the first and third quartiles for kwh, kw, and sq. ft.
(h) Determine the skewness and kurtosis measures for kwh, kw, and sq. ft., and compare to the measures for a normal distribution.
(i) Determine the simple (Pearson) correlation between kw and sq. ft. and between kw and kwh. Interpret.

Chapter 9

Random Samples and Sampling Distributions
In this chapter, we begin our study of statistical inference. Recall that statistics is the science
of drawing conclusions about a population based on an analysis of sample data from
that population. There are many different ways to take a sample from a population. Furthermore,
the conclusions that we can draw about the population often depend on how the
sample is selected. Generally, we want the sample to be representative of the population.
One important method of selecting a sample is random sampling. Most of the statistical
techniques that we present in the book assume that the sample is a random sample. In this
chapter we will define a random sample and introduce several probability distributions useful
in analyzing the information in sample data.

9-1 RANDOM SAMPLES


To define a random sample, let X be a random variable with probability distribution f(x).
Then the set of n observations X₁, X₂, ..., Xₙ, taken on the random variable X, and having
numerical outcomes x₁, x₂, ..., xₙ, is called a random sample if the observations are obtained
by observing X independently under unchanging conditions for n times. Note that the observations
X₁, X₂, ..., Xₙ in a random sample are independent random variables with the same
probability distribution f(x). That is, the marginal distributions of X₁, X₂, ..., Xₙ are f(x₁),
f(x₂), ..., f(xₙ), respectively, and by independence the joint probability distribution of the
random sample is

$$g(x_1, x_2, \ldots, x_n) = f(x_1)\,f(x_2)\cdots f(x_n). \qquad (9\text{-}1)$$

Definition
X₁, X₂, ..., Xₙ is a random sample of size n if (a) the Xᵢ are independent random variables,
and (b) every observation Xᵢ has the same probability distribution.

To illustrate this definition, suppose that we are investigating the bursting strength of
glass, 1-liter soft drink bottles, and that bursting strength in the population of bottles is normally
distributed. Then we would expect each of the observations on bursting strength X₁,
X₂, ..., Xₙ in a random sample of n bottles to be independent random variables with exactly
the same normal distribution.

It is not always easy to obtain a random sample. Sometimes we may use tables of uniform
random numbers. At other times, the engineer or scientist cannot easily use formal
procedures to help ensure randomness and must rely on other selection methods. A judgment
sample is one chosen from the population by the objective judgment of an individual.

Since the accuracy and statistical behavior of judgment samples cannot be described, they
should be avoided.

Example 9-1
Suppose we wish to take a random sample of five batches of raw material out of 25 available batches.
We may number the batches with the integers 1 to 25. Now, using Table XV of the Appendix, arbitrarily
choose a row and column as a starting point. Read down the chosen column, obtaining two digits
at a time, until five acceptable numbers are found (an acceptable number lies between 1 and 25).
To illustrate, suppose the above process gives us a sequence of numbers that reads 37, 48, 55, 02, 17,
61, 70, 43, 21, 82, 13, 13, 60, 25. The numbers lying between 1 and 25, namely 02, 17, 21, 13 (its
duplicate is discarded), and 25, specify which batches of raw material are to be chosen as the random sample.

We first present some specifics on sampling from finite universes. Subsequent sections will
describe various sampling distributions. Sections marked by an asterisk may be omitted
without loss of continuity.

*9-1.1 Simple Random Sampling from a Finite Universe


When sampling n items without replacement from a universe of size N, there are $\binom{N}{n}$ possible
samples, and if the selection probabilities are $\pi_k = 1/\binom{N}{n}$, for $k = 1, 2, \ldots, \binom{N}{n}$, then this is
simple random sampling. Note that each universe unit appears in exactly $\binom{N-1}{n-1}$
of the possible samples, so each unit has inclusion probability of $\binom{N-1}{n-1}/\binom{N}{n} = n/N$.

As will be seen later, sampling without replacement is more "efficient" than sampling
with replacement for estimating the finite population mean or total; however, we briefly discuss
simple random sampling with replacement for a basis of comparison. In this case, there
are $N^n$ possible samples, and we select each with probability $\pi_k = 1/N^n$, for $k = 1, 2, \ldots, N^n$.
In this situation, a universe unit may appear in no samples or in as many as n, so the notion
of inclusion probability is less meaningful; however, if we consider the probability that a
specific unit will be selected at least once, that is obviously $1 - (1 - 1/N)^n$, since for each unit
the probability of selection on a given observation is 1/N, a constant, and the n selections
are independent, so that these observations may be considered Bernoulli trials.

Example 9-2
Consider a universe consisting of five units numbered 1, 2, 3, 4, 5. In sampling without replacement, we
will employ a sample of size two and enumerate the possible samples as

(1,2), (1,3), (1,4), (1,5), (2,3), (2,4), (2,5), (3,4), (3,5), (4,5).

Note that there are $\binom{5}{2} = 10$ possible samples. If we select one from these, where each has an equal
selection probability of 0.1 assigned, that is simple random sampling. Consider these possible samples
to be numbered 1, 2, ..., 0, where 0 represents number 10. Now, go to Table XV (in the Appendix),
showing random integers, and with eyes closed place a finger down. Read the first digit from the five-digit
integer presented. Suppose we pick the integer which is in row 7 of column 4. The first digit is
6, so the sample consists of units 2 and 4. An alternative to using the table is to cast a single icosahedron
die and pick the sample corresponding to the outcome. Notice also that each unit appears in
exactly four of the possible samples, and thus the inclusion probability for each unit is 4/10, which is
simply n/N = 2/5.
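The enumeration in this example is easy to verify by brute force. The following sketch (Python, standard library only) lists the ten equally likely samples and confirms that each unit has inclusion probability 4/10.

    from itertools import combinations

    units = [1, 2, 3, 4, 5]
    samples = list(combinations(units, 2))   # the 10 possible samples of size two
    print(samples)

    # Each sample has selection probability 1/10; count appearances per unit.
    for u in units:
        count = sum(1 for s in samples if u in s)
        print(u, count / len(samples))       # inclusion probability 4/10 = 0.4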

*May be omitted on first reading.



It is usually not feasible to enumerate the set of possible samples. For example, if
N = 100 and n = 25, there would be more than 2.43 × 10²³ possible samples, so other selection
procedures which maintain the properties described in the definition must be
employed. The most commonly used procedure is to first number the universe units from 1
to N, then use realizations from a random number process to sequentially pick numbers
from 1 to N, discarding duplicates until n units have been picked if we are sampling without
replacement, keeping duplicates if sampling is with replacement. Recall from Chapter
6 that the term random numbers is used to describe a sequence of mutually independent
variables U₁, U₂, ..., which are identically distributed as uniform on [0, 1]. We employ a
realization u₁, u₂, ..., which in sequentially selecting units as members of the sample is
roughly outlined as

$$(\text{Unit Number})_j = \lfloor N \cdot u_j \rfloor + 1, \qquad j = 1, 2, \ldots, J,$$

where J is the trial number on which the nth, or final, unit is selected. In sampling without
replacement, J ≥ n, and ⌊·⌋ is the greatest integer contained function. When sampling with
replacement, J = n.
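A direct implementation of this selection rule is sketched below, assuming Python's random module as the source of the uniform realizations u₁, u₂, ...; duplicates are discarded, so the procedure samples without replacement. The seed value is arbitrary.

    import random
    from math import floor

    def simple_random_sample(N, n, seed=None):
        """Select n distinct unit numbers from 1..N using
        (unit number) = floor(N * u) + 1, discarding duplicates."""
        rng = random.Random(seed)
        chosen = []
        while len(chosen) < n:
            unit = floor(N * rng.random()) + 1   # u is uniform on [0, 1)
            if unit not in chosen:               # discard duplicates
                chosen.append(unit)
        return chosen

    print(simple_random_sample(25, 5))   # e.g., five of the 25 batches of Example 9-1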

*9-1.2 Stratified Random Sampling of a Finite Universe


In finite-universe sampling, sometimes an explanatory or auxiliary variable(s) is available
that has known value(s) for each unit of the universe. These may be either numerical or categorical
variables or both. If we are able to use the auxiliary variables to establish criteria
for assigning each universe unit to exactly one of the resulting strata before sampling
begins, then simple random samples may be selected within each stratum, and the sampling
is independent from stratum to stratum, which allows us to later combine, with appropriate
weights, stratum "statistics" such as the means and variances obtained from the various strata.
In general, this is an "efficient" scheme in the sense of estimation of population means and
totals if, following the classification, the variance in the variable we seek to measure is
small within strata while the differences in the stratum mean values are large. If L strata are
formed, then the stratum sizes are N₁, N₂, ..., N_L, and N₁ + N₂ + ··· + N_L = N, while the sample
sizes are n₁, n₂, ..., n_L, and n₁ + n₂ + ··· + n_L = n. It is noted that the inclusion probabilities
are constant within strata, as nₕ/Nₕ for stratum h, but they may differ greatly across
strata. Two commonly used methods for allocating the overall sample to strata are proportional
allocation and Neyman optimal allocation, where proportional allocation is

$$n_h = n(N_h/N), \qquad h = 1, 2, \ldots, L, \qquad (9\text{-}2)$$

and Neyman allocation is

$$n_h = n\,\frac{N_h\,\sigma_h}{\sum_{k=1}^{L} N_k\,\sigma_k}, \qquad h = 1, 2, \ldots, L. \qquad (9\text{-}3)$$

The values σₕ are standard deviation values within strata, and when designing the sampling
study, they are usually unknown for the variables to be observed in the study; however, they
may often be calculated for an explanatory or auxiliary variable where at least one of these
is numeric, and if there is a "reasonable" correlation, |ρ| > 0.6, between the variable to be
measured and such an auxiliary variable, then these surrogate standard deviation values,
denoted σ′ₕ, will produce an allocation which will be reasonably close to optimal.

Example 9-3
In seeking to address growing consumer concerns regarding the quality of claim processing, a
national managed health care/health insurance company has identified several characteristics to be
monitored on a monthly basis. The most important of these to the company are, first, the size of the
"financial error," which is the absolute value of the error (overpay or underpay) for a claim, and,
second, the fraction of correctly filed claims paid to providers within 14 days. The three processing centers
are eastern, mid-America, and western. Together, these centers process about 450,000 claims per
month, and it has been observed that they differ in accuracy and timeliness. Furthermore, the correlation
between the total dollar amount of the claim and the financial error overall is historically about
0.65-0.80. If a sample of size 1000 claims is to be drawn monthly, strata might be formed using centers
E, MA, and W and total claim sizes as $0-$200, $201-$900, and $901-$max claim. Therefore there
would be nine strata. And if for a given month there were 441,357 claims, these may be easily assigned
to strata, since they are naturally grouped by center, and each center uses the same processing system
to record data by claim amount, thus allowing ease of classification by claim size. At this point, the
1000-unit planned sample is allocated to the nine strata as shown in Table 9-1. The allocation shown
first in each cell employs proportional allocation, while the second one uses "optimal" allocation. The
σ values represent the standard deviation in the claim-amount metric within the stratum, since the standard
deviation in "financial error" is unknown. After deciding on the allocation to use, nine independent,
simple random samples, without replacement, would be selected, yielding 1000 claim forms
to be inspected. Stratum identifying subscripts are not shown in Table 9-1.

It is noted that proportional allocation results in an equal inclusion probability for all
units, not just within strata, while an optimal allocation draws larger samples from strata in
which the product of the internal variability as measured by standard deviation (often in an
auxiliary variable) and the number of units in the stratum is large. Thus, inclusion probabilities
are the same for all units assigned to a stratum but may differ greatly across strata.
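The allocations in Table 9-1 can be checked directly from equations 9-2 and 9-3. The sketch below (Python, standard library only) uses the nine stratum sizes and surrogate standard deviations from the table; rounding the printed values to integers, with a one-unit adjustment so each scheme totals 1000, reproduces the tabled sample sizes (e.g., 300/29 for the eastern, $0-$200 stratum).

    # Proportional (eq. 9-2) and Neyman (eq. 9-3) allocation of n = 1000 claims
    # to the nine strata, using the N_h and surrogate sigmas of Table 9-1.
    N = [132365, 41321, 10635, 96422, 31869, 6163, 82332, 33793, 6457]
    sigma = [42, 255, 6781, 31, 210, 5128, 57, 310, 7674]
    n = 1000

    total_N = sum(N)
    proportional = [n * Nh / total_N for Nh in N]

    products = [Nh * sh for Nh, sh in zip(N, sigma)]
    neyman = [n * p / sum(products) for p in products]

    for prop, ney in zip(proportional, neyman):
        print(round(prop, 1), round(ney, 1))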

9-2 STATISTICS AND SAMPLING DISTRIBUTIONS

A statistic is any function of the observations in a random sample that does not depend on
unknown parameters. The process of drawing conclusions about populations based on

Table 9-1 Data for Health Insurance Example 9-3 (sample sizes n are shown as proportional allocation/optimal allocation)

                     Claim Amount
Center   $0-$200          $201-$900        $901-$Max Claim    All
E        N = 132,365      N = 41,321       N = 10,635         184,321
         σ = $42          σ = $255         σ = $6781
         n = 300/29       n = 94/54        n = 24/371         418/454
MA       N = 96,422       N = 31,869       N = 6,163          134,454
         σ = $31          σ = $210         σ = $5128
         n = 218/15       n = 72/35        n = 14/163         304/213
W        N = 82,332       N = 33,793       N = 6,457          122,582
         σ = $57          σ = $310         σ = $7674
         n = 187/24       n = 76/54        n = 15/255         278/333
All      N = 311,119      N = 106,983      N = 23,255         441,357
         n = 705/68       n = 242/143      n = 53/789         1000/1000

sample data makes considerable use of statistics. The procedures require that we understand
the probabilistic behavior of certain statistics. In general, we call the probability distribution
of a statistic a sampling distribution. There are several important sampling distributions
that will be used extensively in subsequent chapters. In this section, we define and briefly
illustrate these sampling distributions. First, we give some relevant definitions and additional
motivation.
A statistic is now defined as a value determined by a function of the values observed
in a sample. For example, if X₁, X₂, ..., Xₙ represent values to be observed in a probability
sample of size n on a single random variable X, then X̄ and S², as described in Chapter 8,
are statistics. Furthermore, the same is true for the median, the
mode, the sample range, the sample skewness measure, and the sample kurtosis. Note that
capital letters are used here as this reference is to random variables, not specific numerical
results, as was the case in Chapter 8.

9-2.1 Sampling Distributions


Definition
The sampling distribution of a statistic is the density function or probability function that
describes the probabilistic behavior of the statistic in repeated sampling from the same universe
or on the same process variable assignment model.

Examples have been presented earlier, in Chapters 5-7. Recall that random sampling
with sample size n on a process variable X provides results X₁, X₂, ..., Xₙ, which are mutually
independent random variables, all with a common distribution function. Thus, the sample
mean, X̄, is a linear combination of n independent variables. If E(X) = μ and V(X) = σ²,
then recall that E(X̄) = μ and V(X̄) = σ²/n. And if X is a measurement variable, the density
function of X̄ is the sampling distribution of this statistic. There are several important
sampling distributions that will be used extensively in subsequent chapters. In this section
we will describe and briefly illustrate these sampling distributions.
we wili describe and briefly illustrate these sampling distributions.
The form of a sampling distribution depends on the stability assumption as well as on
the form of the process variable model. In Chapter 7, we observed that if X ~ N(μ, σ²), then
X̄ ~ N(μ, σ²/n), and this is the sampling distribution of X̄ for n ≥ 1. Now, if the process variable
assignment model takes some form other than the normal model illustrated here (e.g.,
the exponential model), and if all stationarity assumptions hold, then mathematical analysis
may yield a closed form for the sampling distribution of X̄. In this example case, if we
employ an exponential process model for X, the resulting sampling distribution for X̄ is a
form of the gamma distribution. In Chapter 7, the important Central Limit Theorem was presented.
Recall that if the moment-generating function M_X(t) exists for all t, and if

$$Z_n = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}, \quad \text{then} \quad \lim_{n\to\infty} F_n(z) = \Phi(z), \qquad (9\text{-}4)$$

where F_n(z) is the cumulative distribution function of Z_n and Φ(z) is the CDF for the standard
normal variable, Z. Simply stated, as n → ∞, Z_n tends to a N(0, 1) random variable, and this
result has enormous utility in applied statistics. However, in applied work, a question arises
regarding how large the sample size n must be to employ the N(0, 1) model as the sampling
distribution of Z_n, or equivalently stated, to describe the sampling distribution of X̄ as
N(μ, σ²/n). This is an important question, as the exact form of the process variable assignment
model is usually unknown. Furthermore, any response must be conditioned even when
it is based on simulation evidence, experience, or "accepted practice." Assuming process stability,
a general suggestion is that if the skewness measure is close to zero (implying that the
variable assignment model is symmetric or very nearly so), then X̄ approaches normality

quickly, say for n ≥ 10, but this depends also on the standardized kurtosis of X. For example,
if β₃ ≈ 0 and |β₄| < 0.75, then a sample size of n = 5 may be quite adequate for many
applications. However, it should be noted that the tail behavior of X̄ may deviate somewhat
from that predicted by a normal model.
Where there is considerable skew present, application of the Central Limit Theorem in
describing the sampling distribution must be interpreted with care. A "rule of thumb" that
has been successfully used in survey sampling, where such variable behavior is common
(β₃ > 0), is that

$$n > 25(\beta_3)^2. \qquad (9\text{-}5)$$

For instance, returning to an exponential process variable assignment model, this rule suggests
that a sample size of n > 100 is required for employing a normal distribution to
describe the behavior of X̄, since β₃ = 2.

Definition
The standard error of a statistic is the standard deviation of its sampling distribution. If the
standard error involves unknown parameters whose values can be estimated, substitution of
these estimates into the standard error results in an estimated standard error.

To illustrate this definition, suppose we are sampling from a normal distribution with
mean μ and variance σ². Now the distribution of X̄ is normal with mean μ and variance
σ²/n, and so the standard error of X̄ is

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}.$$

If we did not know σ but substituted the sample standard deviation S into the above, then
the estimated standard error of X̄ is

$$\hat{\sigma}_{\bar{X}} = \frac{S}{\sqrt{n}}.$$
Example 9-4
Suppose we collect data on the tension bond strength of a modified portland cement mortar. The ten
observations are

16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57,

where tension bond strength is measured in units of kgf/cm². We assume that the tension bond
strength is well described by a normal distribution. The sample average is

x̄ = 16.76 kgf/cm².

First, suppose we know (or are willing to assume) that the standard deviation of tension bond strength
is σ = 0.25 kgf/cm². Then the standard error of the sample average is

$$\sigma/\sqrt{n} = 0.25/\sqrt{10} = 0.079 \text{ kgf/cm}^2.$$

If we are unwilling to assume that σ = 0.25 kgf/cm², we could use the sample standard deviation s =
0.316 kgf/cm² to obtain the estimated standard error as follows:

$$s/\sqrt{n} = 0.316/\sqrt{10} = 0.0999 \text{ kgf/cm}^2.$$
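As a quick check, the sample average and estimated standard error for these ten observations can be computed as follows (a minimal sketch using Python's statistics module):

    from math import sqrt
    from statistics import mean, stdev

    y = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]

    n = len(y)
    ybar = mean(y)              # 16.76 kgf/cm^2 (rounded)
    s = stdev(y)                # sample standard deviation, about 0.316
    print(ybar, s / sqrt(n))    # estimated standard error, about 0.10 kgf/cm^2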



*9-2.2 Finite Populations and Enumerative Studies


In the ca')e where sampling may be conceptually repeated on the same finite universe of
units, the sampling distribution of Xis interpreted in a manner similar to that of sampling a
process, except the issue of process stability is not a COncern, Any inference to be drawn is
to be about the specific collection of unit population values. Ordinarily, in such situations,
sampling is without replacement. as this is more efficient. The general notion of expecta-
tion is different in such studies in that the expected value of a statisticBis defined as

G)
EO(8) = L1!,6k , (9·6)
k=l

where fI, is the value of the Statistic if possible sample k is selected. In simple random sam-
pling, recall that nk ::: 1J('~),
Now in the case of the sample mean statistic, X. we have EC(X) :::.; fl.;::. where I1r is the
finite population mean of the random variable X. Also, under simple random sampling
without replacement.

(9·7)

where a~ is as defined in ~uation 8-7. The ratio nhV is called the sampling fraction, and it
represents the fraction of the population measures to be included in the sample. Concise
proofs of the results shov,'ll in equations 9·6 and 9-7 are given by Cochran (1977, p. 22).
Oftentimes) in studies of this sort, the objective is to estimate the population total (see equa~
tion 8-2) as well as the mean. The "mean per unit,l1 Or mpu. estimate of the total is simply
Tx=N· X, and the variance of chis statistic is obviously W)=lv"'· V(X). Wlrile necessary and
sufficient conditions for the distribution of X to approach normality have been developed,
these are of little practical utility, and the rule given by equation 9-5 has been widely
employed.

In sampling with replacement, the corresponding quantities are E^O(X̄) = μ_N and V^O(X̄) = σ²_N/n.
Where stratification has been employed in a finite population enumeration study, there
are two statistics of common interest. These are the aggregate sample mean and the estimate
of the population total. The mean is given by

$$\bar{X}_{st} = \sum_{h=1}^{L} W_h\,\bar{X}_h, \qquad (9\text{-}8)$$

and the estimate of the total is

$$\hat{T}_X = N \cdot \bar{X}_{st} = \sum_{h=1}^{L} N_h\,\bar{X}_h. \qquad (9\text{-}9)$$

In these formulations, X̄ₕ is the sample mean for stratum h, and Wₕ, given by Nₕ/N, is
called the stratum weight for stratum h. Note that both of these statistics are expressed as
simple linear combinations of the independent stratum statistics X̄ₕ, and both are mpu estimators.
The variance of these statistics is given by

$$V(\bar{X}_{st}) = \sum_{h=1}^{L} W_h^2\,\frac{\sigma_{N_h}^2}{n_h}\cdot\frac{N_h - n_h}{N_h - 1} \quad \text{and} \quad V(\hat{T}_X) = N^2\,V(\bar{X}_{st}). \qquad (9\text{-}10)$$

The within-stratum variance terms, σ²_{N_h}, may be estimated by the sample variance terms,
S²ₕ, for the respective strata. The sampling distributions of the aggregate mean and total
estimate statistics for such stratified, enumeration studies are stated only for situations
where sample sizes are large enough to employ the limiting normality indicated by the Central
Limit Theorem. Note that these statistics are linear combinations across observations
within strata and across strata. The result is that in such cases we take

$$\bar{X}_{st} \sim N\big(\mu_N,\, V(\bar{X}_{st})\big) \qquad (9\text{-}11)$$

and

$$\hat{T}_X \sim N\big(T_X,\, V(\hat{T}_X)\big).$$
Example 9-5
Suppose the sampling plan in Example 9-3 utilizes the "optimal" allocation as shown in Table 9-1.
The stratum sample means and sample standard deviations on the financial error variable are calculated,
with the results shown in Table 9-2, where the units of measurement are error $/claim.
Then, utilizing the results presented in equations 9-8 and 9-9, the aggregate statistics when evaluated
are x̄_st = $18.36 and T̂_X = $8,103,532. Utilizing equation 9-10 and employing the within-stratum
sample variance values s²ₕ to estimate the within-stratum variances, the estimates for the variance
in the sampling distribution of the sample mean and the estimate of the total are V̂(X̄_st) ≈ 0.574 and
V̂(T̂_X) ≈ 1.118 × 10¹¹, and the estimates of the respective standard errors are thus $0.785 and $334,408
for the mean and total estimator distributions.
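The aggregate mean and total follow directly from equations 9-8 and 9-9; the sketch below (Python, standard library only) reproduces x̄_st = $18.36 and T̂_X ≈ $8.1 million from the stratum sizes and sample means of Table 9-2.

    # Aggregate (stratified) mean and mpu total, equations 9-8 and 9-9.
    N = [132365, 41321, 10635, 96422, 31869, 6163, 82332, 33793, 6457]
    xbar = [6.25, 34.10, 91.65, 5.30, 22.00, 72.00, 10.52, 46.28, 124.91]

    N_total = sum(N)                                   # 441,357 claims
    xbar_st = sum(Nh * xh for Nh, xh in zip(N, xbar)) / N_total
    T_hat = N_total * xbar_st                          # mpu estimate of the total

    print(round(xbar_st, 2), round(T_hat))             # 18.36 and about 8,103,532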

9-3 THE CHI-SQUARE DISTRIBUTION


Many other useful sampling distributions can be defined in terms of normal random variables.
The chi-square distribution is defined below.

Table 9-2 Sample Statistics for Health Insurance Example

                     Claim Amount
Center   $0-$200          $201-$900        $901-$Max Claim    All
E        N = 132,365      N = 41,321       N = 10,635         184,321
         n = 29           n = 54           n = 371            454
         x̄ = 6.25         x̄ = 34.10        x̄ = 91.65
MA       N = 96,422       N = 31,869       N = 6,163          134,454
         n = 15           n = 35           n = 163            213
         x̄ = 5.30         x̄ = 22.00        x̄ = 72.00
                          s = 16.39        s = 56.67
W        N = 82,332       N = 33,793       N = 6,457          122,582
         n = 24           n = 54           n = 255            333
         x̄ = 10.52        x̄ = 46.28        x̄ = 124.91
         s = 9.86         s = 31.23        s = 109.42
All      N = 311,119      N = 106,983      N = 23,255         441,357
         n = 68           n = 143          n = 789            n = 1000

Theorem 9-1
Let Z₁, Z₂, ..., Z_k be normally and independently distributed random variables, with mean
μ = 0 and variance σ² = 1. Then the random variable

$$\chi^2 = Z_1^2 + Z_2^2 + \cdots + Z_k^2$$

has the probability density function

$$f(u) = \frac{1}{2^{k/2}\,\Gamma(k/2)}\,u^{(k/2)-1}e^{-u/2}, \quad u > 0; \qquad f(u) = 0 \text{ otherwise}, \qquad (9\text{-}12)$$

and is said to follow the chi-square distribution with k degrees of freedom, abbreviated χ²_k.
For the proof of Theorem 9-1, see Exercises 7-47 and 7-48.

The mean and variance of the χ²_k distribution are

$$\mu = k \qquad (9\text{-}13)$$

and

$$\sigma^2 = 2k. \qquad (9\text{-}14)$$
Several chi-square distributions are shown in Fig. 9-1. Note that the chi-square random variable
is nonnegative, and that the probability distribution is skewed to the right. However, as
k increases, the distribution becomes more symmetric. As k → ∞, the limiting form of the
chi-square distribution is the normal distribution.
The percentage points of the χ²_k distribution are given in Table III of the Appendix.
Define χ²_{α,k} as the percentage point or value of the chi-square random variable with k
degrees of freedom such that the probability that χ²_k exceeds this value is α. That is,

$$P\{\chi_k^2 \geq \chi_{\alpha,k}^2\} = \int_{\chi_{\alpha,k}^2}^{\infty} f(u)\,du = \alpha.$$

Figure 9-1 Several χ² distributions.



This probability is shown as the shaded area in Fig. 9-2. To illustrate the use of Table III,
note that

$$P\{\chi_{10}^2 \geq \chi_{0.05,10}^2\} = P\{\chi_{10}^2 \geq 18.31\} = 0.05.$$

That is, the 5% point of the chi-square distribution with 10 degrees of freedom is χ²_{0.05,10} = 18.31.

Figure 9-2 Percentage point χ²_{α,k} of the chi-square distribution.
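Where the Appendix tables are not at hand, these percentage points can be computed numerically; for instance, a minimal sketch assuming SciPy is available:

    from scipy.stats import chi2

    print(chi2.isf(0.05, df=10))   # upper 5% point: 18.307..., tabled as 18.31
    print(chi2.sf(18.31, df=10))   # tail probability of 18.31: approximately 0.05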
Like the normal distribution, the chi-square distribution has an important reproductive
property.

Theorem 9-2 Additivity Theorem of Chi-Square

Let χ²₁, χ²₂, ..., χ²_p be independent chi-square random variables with k₁, k₂, ..., k_p degrees of
freedom, respectively. Then the quantity

$$Y = \chi_1^2 + \chi_2^2 + \cdots + \chi_p^2$$

follows the chi-square distribution with degrees of freedom equal to

$$k = \sum_{i=1}^{p} k_i.$$

Proof Note that each chi-square random variable χ²ᵢ can be written as the sum of the
squares of kᵢ standard normal random variables, say

$$\chi_i^2 = \sum_{j=1}^{k_i} Z_{ij}^2, \qquad i = 1, 2, \ldots, p.$$

Therefore,

$$Y = \sum_{i=1}^{p} \chi_i^2 = \sum_{i=1}^{p}\sum_{j=1}^{k_i} Z_{ij}^2,$$

and since all the random variables Z_{ij} are independent because the χ²ᵢ are independent, Y is
just the sum of the squares of $k = \sum_{i=1}^{p} k_i$ independent standard normal random variables.
From Theorem 9-1, it follows that Y is a chi-square random variable with k degrees of
freedom.

Example 9-6
As an example of a statistic that follows the chi-square distribution, suppose that X₁, X₂, ..., Xₙ is a
random sample from a normal population with mean μ and variance σ². The function of the sample
variance

$$\frac{(n-1)S^2}{\sigma^2}$$

is distributed as χ²_{n-1}. We will use this random variable extensively in Chapters 10 and 11. We will see
in those chapters that because the distribution of this random variable is chi-square, we can construct
confidence interval estimates and test statistical hypotheses about the variance of a normal population.
To illustrate heuristically why the distribution of the random variable in Example 9-6,
(n − 1)S²/σ², is chi-square, note that

$$\frac{(n-1)S^2}{\sigma^2} = \frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{\sigma^2}. \qquad (9\text{-}15)$$

If X̄ in equation 9-15 were replaced by μ, then the distribution of

$$\frac{\sum_{i=1}^{n}(X_i - \mu)^2}{\sigma^2}$$

is χ²_n, because each term (Xᵢ − μ)/σ is an independent standard normal random variable. Now consider
the following:

$$\sum_{i=1}^{n}(X_i - \mu)^2 = \sum_{i=1}^{n}\left[(X_i - \bar{X}) + (\bar{X} - \mu)\right]^2$$
$$= \sum_{i=1}^{n}(X_i - \bar{X})^2 + \sum_{i=1}^{n}(\bar{X} - \mu)^2 + 2(\bar{X} - \mu)\sum_{i=1}^{n}(X_i - \bar{X})$$
$$= \sum_{i=1}^{n}(X_i - \bar{X})^2 + n(\bar{X} - \mu)^2,$$

since $\sum_{i=1}^{n}(X_i - \bar{X}) = 0$. Therefore,

$$\frac{\sum_{i=1}^{n}(X_i - \mu)^2}{\sigma^2} = \frac{(n-1)S^2}{\sigma^2} + \frac{(\bar{X} - \mu)^2}{\sigma^2/n}. \qquad (9\text{-}16)$$

Since X̄ is normally distributed with mean μ and variance σ²/n, the quantity (X̄ − μ)²/(σ²/n) is distributed
as χ²₁. Furthermore, it can be shown that the random variables X̄ and S² are independent.
Therefore, since $\sum_{i=1}^{n}(X_i - \mu)^2/\sigma^2$ is distributed as χ²_n, it seems logical to use the additivity property
of the chi-square distribution (Theorem 9-2) and conclude that the distribution of (n − 1)S²/σ² is
χ²_{n-1}.
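This conclusion is easy to check by simulation; the sketch below compares the simulated mean and variance of (n − 1)S²/σ² with the theoretical chi-square values n − 1 and 2(n − 1) from equations 9-13 and 9-14 (NumPy assumed available; the parameter values are arbitrary).

    import numpy as np

    rng = np.random.default_rng(1)
    n, reps, mu, sigma = 8, 100_000, 10.0, 2.0

    x = rng.normal(mu, sigma, size=(reps, n))
    stat = (n - 1) * x.var(axis=1, ddof=1) / sigma**2   # (n-1)S^2 / sigma^2

    # Should be close to the chi-square(n-1) mean (7) and variance (14).
    print(stat.mean(), stat.var())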

9-4 THE t DISTRIBUTION

Another important sampling distribution is the t distribution, sometimes called the Student
t distribution.

Theorem 9-3
Let Z ~ N(0, 1) and V be a chi-square random variable with k degrees of freedom. If Z and
V are independent, then the random variable

$$T = \frac{Z}{\sqrt{V/k}}$$

has the probability density function

$$f(t) = \frac{\Gamma[(k+1)/2]}{\sqrt{\pi k}\,\Gamma(k/2)}\cdot\frac{1}{\left[(t^2/k)+1\right]^{(k+1)/2}}, \qquad -\infty < t < \infty, \qquad (9\text{-}17)$$

and is said to follow the t distribution with k degrees of freedom, abbreviated t_k.

Proof Since Z and V are independent, their joint density function is

$$f(z,v) = \frac{1}{\sqrt{2\pi}\,2^{k/2}\Gamma(k/2)}\,v^{(k/2)-1}e^{-(z^2+v)/2}, \qquad -\infty < z < \infty,\; 0 < v < \infty.$$

Using the method of Section 4-10 we define a new random variable U = V. Thus, the
inverse solutions of

$$t = \frac{z}{\sqrt{v/k}} \quad \text{and} \quad u = v$$

are

$$z = t\sqrt{u/k} \quad \text{and} \quad v = u.$$

The Jacobian is

$$J = \begin{vmatrix} \sqrt{u/k} & t/(2\sqrt{uk}) \\ 0 & 1 \end{vmatrix} = \sqrt{\frac{u}{k}}.$$

Thus

$$|J| = \sqrt{\frac{u}{k}},$$

and so the joint probability density function of T and U is

$$g(t,u) = \frac{\sqrt{u}}{\sqrt{2\pi k}\,2^{k/2}\Gamma(k/2)}\,u^{(k/2)-1}e^{-[(t^2/k)+1]u/2}, \qquad -\infty < t < \infty,\; 0 < u < \infty. \qquad (9\text{-}18)$$

Now, since v > 0 we must require that u > 0, and since −∞ < z < ∞, then −∞ < t < ∞. On
rearranging equation 9-18, and since $f(t) = \int_0^\infty g(t,u)\,du$, we obtain

$$f(t) = \frac{\Gamma[(k+1)/2]}{\sqrt{\pi k}\,\Gamma(k/2)}\cdot\frac{1}{\left[(t^2/k)+1\right]^{(k+1)/2}}, \qquad -\infty < t < \infty,$$

which completes the proof.

Primarily because of historical usage, many authors make no distinction between the
random variable T and the symbol t. The mean and variance of the t distribution are μ = 0
and σ² = k/(k − 2) for k > 2, respectively. Several t distributions are shown in Fig. 9-3. The
general appearance of the t distribution is similar to the standard normal distribution, in that
both distributions are symmetric and unimodal, and the maximum ordinate value is reached
at the mean μ = 0. However, the t distribution has heavier tails than the normal; that is, it
has more probability further out. As the number of degrees of freedom k → ∞, the limiting
form of the t distribution is the standard normal distribution. In visualizing the t distribution
it is sometimes useful to know that the ordinate of the density at the mean μ = 0 is approximately
four to five times larger than the ordinate at the 5th and 95th percentiles. For example,
with 10 degrees of freedom for t this ratio is 4.8, with 20 degrees of freedom this factor
is 4.3, and with 30 degrees of freedom this factor is 4.1. By comparison, for the normal distribution,
this factor is 3.9.

Figure 9-3 Several t distributions.
The percentage points of the t distribution are given in Table IV of the Appendix. Let t_{α,k}
be the percentage point or value of the t random variable with k degrees of freedom such that

$$P\{T \geq t_{\alpha,k}\} = \int_{t_{\alpha,k}}^{\infty} f(t)\,dt = \alpha.$$

This percentage point is illustrated in Fig. 9-4. Note that since the t distribution is symmetric
about zero, we find t_{1-α,k} = −t_{α,k}. This relationship is useful, since Table IV gives only upper-tail
percentage points, that is, values of t_{α,k} for α ≤ 0.50. To illustrate the use of the table, note that

$$P\{T \geq t_{0.05,10}\} = P\{T \geq 1.812\} = 0.05.$$

Thus, the upper 5% point of the t distribution with 10 degrees of freedom is t_{0.05,10} = 1.812.
Similarly, the lower-tail point t_{0.95,10} = −t_{0.05,10} = −1.812.

Figure 9-4 Percentage points of the t distribution.
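Again these table lookups can be verified numerically (a minimal sketch, SciPy assumed):

    from scipy.stats import t

    print(t.isf(0.05, df=10))   # upper 5% point t_{0.05,10}: 1.812
    print(t.ppf(0.05, df=10))   # lower 5% point t_{0.95,10}: -1.812, by symmetry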

Example 9-7
As an example of a random variable that follows the t distribution, suppose that X₁, X₂, ..., Xₙ is a
random sample from a normal distribution with mean μ and variance σ², and let X̄ and S² denote the sample
mean and variance. Consider the statistic

$$\frac{\bar{X} - \mu}{S/\sqrt{n}}. \qquad (9\text{-}19)$$

Dividing both the numerator and denominator of equation 9-19 by σ, we obtain

$$\frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}}{\sqrt{S^2/\sigma^2}}.$$

Since (X̄ − μ)/(σ/√n) ~ N(0, 1) and S²/σ² ~ χ²_{n-1}/(n − 1), and since X̄ and S² are independent, we see
from Theorem 9-3 that

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \qquad (9\text{-}20)$$

follows a t distribution with ν = n − 1 degrees of freedom. In Chapters 10 and 11 we will use the random
variable in equation 9-20 to construct confidence intervals and test hypotheses about the mean
of a normal distribution.

9-5 THE F DISTRIBUTION

A very useful sampling distribution is the F distribution.

Theorem 9-4
Let W and Y be independent chi-square random variables with u and v degrees of freedom,
respectively. Then the ratio

$$F = \frac{W/u}{Y/v}$$

has the probability density function

$$h(f) = \frac{\Gamma\!\left(\frac{u+v}{2}\right)\left(\frac{u}{v}\right)^{u/2} f^{(u/2)-1}}{\Gamma\!\left(\frac{u}{2}\right)\Gamma\!\left(\frac{v}{2}\right)\left[\left(\frac{u}{v}\right)f + 1\right]^{(u+v)/2}}, \qquad 0 < f < \infty, \qquad (9\text{-}21)$$

and is said to follow the F distribution with u degrees of freedom in the numerator and v
degrees of freedom in the denominator. It is usually abbreviated F_{u,v}.

Proof Since W and Y are independent, their joint probability density function is

$$f(w,y) = \frac{w^{(u/2)-1}\,y^{(v/2)-1}}{2^{u/2}\Gamma(u/2)\,2^{v/2}\Gamma(v/2)}\,e^{-(w+y)/2}, \qquad 0 < w, y < \infty.$$

Proceeding as in Section 4-10, define the new random variable M = Y. The inverse solutions
of f = (w/u)/(y/v) and m = y are

$$w = \frac{u}{v}\,mf \quad \text{and} \quad y = m.$$

Therefore, the Jacobian is

$$J = \begin{vmatrix} \frac{u}{v}m & \frac{u}{v}f \\ 0 & 1 \end{vmatrix} = \frac{u}{v}\,m.$$

Thus, the joint probability density function is given by

$$g(f,m) = \frac{\left(\frac{umf}{v}\right)^{(u/2)-1} m^{(v/2)-1}}{2^{(u+v)/2}\,\Gamma\!\left(\frac{u}{2}\right)\Gamma\!\left(\frac{v}{2}\right)}\cdot\frac{um}{v}\,e^{-\frac{m}{2}\left(\frac{uf}{v}+1\right)}, \qquad 0 < f, m < \infty,$$

and since $h(f) = \int_0^\infty g(f,m)\,dm$, we obtain equation 9-21, completing the proof.
The mean and variance of the F distribution are μ = v/(v − 2) for v > 2, and

$$\sigma^2 = \frac{2v^2(u+v-2)}{u(v-2)^2(v-4)}, \qquad v > 4.$$
Several F distributions are shown in Fig. 9-5. The F random variable is nonnegative and the
distribution is skewed to the right. The F distribution looks very similar to the chi-square
distribution in Fig. 9-1; however, the parameters u and v provide extra flexibility regarding
shape.

Figure 9-5 The F distribution.



The percentage points of the F distribution are given in Table V of the Appendix. Let
F_{α,u,v} be the percentage point of the F distribution with u and v degrees of freedom, such that
the probability that the random variable F exceeds this value is

$$P\{F \geq F_{\alpha,u,v}\} = \int_{F_{\alpha,u,v}}^{\infty} h(f)\,df = \alpha.$$

This is illustrated in Fig. 9-6. For example, if u = 5 and v = 10, we find from Table V of the
Appendix that

$$P\{F \geq F_{0.05,5,10}\} = P\{F \geq 3.33\} = 0.05.$$

That is, the upper 5% point of F_{5,10} is F_{0.05,5,10} = 3.33. Table V contains only upper-tail percentage
points (values of F_{α,u,v} for α ≤ 0.50). The lower-tail percentage points F_{1-α,u,v} can
be found as follows:

$$F_{1-\alpha,u,v} = \frac{1}{F_{\alpha,v,u}}. \qquad (9\text{-}22)$$

For example, to find the lower-tail percentage point F_{0.95,5,10}, note that

$$F_{0.95,5,10} = \frac{1}{F_{0.05,10,5}} = \frac{1}{4.74} = 0.211.$$
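Equation 9-22 and the two worked percentage points can be confirmed the same way (a minimal sketch, SciPy assumed):

    from scipy.stats import f

    print(f.isf(0.05, dfn=5, dfd=10))    # F_{0.05,5,10} = 3.33
    upper = f.isf(0.05, dfn=10, dfd=5)   # F_{0.05,10,5} = 4.74
    print(1.0 / upper)                   # F_{0.95,5,10} = 0.211, per eq. 9-22
    print(f.ppf(0.05, dfn=5, dfd=10))    # same lower-tail point, computed directly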

Example 9-8
As an example of a statistic that follows the F distribution, suppose we have two normal populations
with variances σ₁² and σ₂². Let independent random samples of sizes n₁ and n₂ be taken from populations
1 and 2, respectively, and let S₁² and S₂² be the sample variances. Then the ratio

$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \qquad (9\text{-}23)$$

has an F distribution with n₁ − 1 numerator degrees of freedom and n₂ − 1 denominator degrees of
freedom. This follows directly from the facts that (n₁ − 1)S₁²/σ₁² ~ χ²_{n₁-1} and (n₂ − 1)S₂²/σ₂² ~ χ²_{n₂-1} and
from Theorem 9-4. The random variable in equation 9-23 plays a key role in Chapters 10 and 11,
where we address the problems of confidence interval estimation and hypothesis testing about the
variances of two independent normal populations.

Figure 9-6 Upper and lower percentage points of the F distribution.

9-6 SUMMARY
This chapter has presented the concept of random sampling and introduced sampling distributions.
In repeated sampling from a population, sample statistics of the sort discussed in
Chapter 8 vary from sample to sample, and the probability distribution of such statistics (or
functions of the statistics) is called the sampling distribution. The normal, chi-square, Student
t, and F distributions have been presented in this chapter and will be employed extensively
in later chapters to describe sampling variation.

9-7 EXERCISES
9-1. Suppose that a random variable is normally distributed with mean μ and variance σ². Draw a random sample of five observations. What is the joint density function of the sample?

9-2. Transistors have a life that is exponentially distributed with parameter λ. A random sample of n transistors is taken. What is the joint density function of the sample?

9-3. Suppose that X is uniformly distributed on the interval from 0 to 1. Consider a random sample of size 4 from X. What is the joint density function of the sample?

9-4. A lot consists of N transistors, and of these M (M ≤ N) are defective. We randomly select two transistors without replacement from this lot and determine whether they are defective or nondefective. The random variable

$$X_i = \begin{cases} 1, & \text{if the } i\text{th transistor is nondefective}, \\ 0, & \text{if the } i\text{th transistor is defective}, \end{cases} \qquad i = 1, 2.$$

Determine the joint probability function for X₁ and X₂. What are the marginal probability functions for X₁ and X₂? Are X₁ and X₂ independent random variables?

9-5. A population of power supplies for a personal computer has an output voltage that is normally distributed with a mean of 5.00 V and a standard deviation of 0.10 V. A random sample of eight power supplies is selected. Specify the sampling distribution of X̄.

9-6. Consider the power supply problem described in Exercise 9-5. What is the standard error of X̄?

9-7. Consider the power supply problem described in Exercise 9-5. Suppose that the population standard deviation is unknown. How would you obtain the estimated standard error?

9-8. A procurement specialist has purchased 25 resistors from vendor 1 and 30 resistors from vendor 2. Let X₁,₁, X₁,₂, ..., X₁,₂₅ represent the vendor 1 observed resistances, assumed normally and independently distributed with a mean of 100 Ω and a standard deviation of 1.5 Ω. Similarly, let X₂,₁, X₂,₂, ..., X₂,₃₀ represent the vendor 2 observed resistances, assumed normally and independently distributed with a mean of 105 Ω and a standard deviation of 2.0 Ω. What is the sampling distribution of X̄₁ − X̄₂?

9-9. Consider the resistor problem in Exercise 9-8. Find the standard error of X̄₁ − X̄₂.

9-10. Consider the resistor problem in Exercise 9-8. If we could not assume that resistance is normally distributed, what could be said about the sampling distribution of X̄₁ − X̄₂?

9-11. Suppose that independent random samples of sizes n₁ and n₂ are taken from two normal populations with means μ₁ and μ₂ and variances σ₁² and σ₂², respectively. If X̄₁ and X̄₂ are the sample means, find the sampling distribution of the statistic

$$\frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}.$$

9-12. A manufacturer of semiconductor devices takes a random sample of 100 chips and tests them, classifying each chip as defective or nondefective. Let Xᵢ = 0 (1) if the ith chip is nondefective (defective). The sample fraction defective is

$$\hat{p} = \frac{X_1 + X_2 + \cdots + X_{100}}{100}.$$

What is the sampling distribution of p̂?

9-13. For the semiconductor problem in Exercise 9-12, find the standard error of p̂. Also find the estimated standard error of p̂.

9-14. Develop the moment-generating function of the chi-square distribution.

9-15. Derive the mean and variance of the chi-square random variable with u degrees of freedom.

9-16. Derive the mean and variance of the t distribution.

9-17. Derive the mean and variance of the F distribution.

9-18. Order Statistics. Let X₁, X₂, ..., Xₙ be a random sample of size n from X, a random variable having distribution function F(x). Rank the elements in order of increasing numerical magnitude, resulting in X₍₁₎, X₍₂₎, ..., X₍ₙ₎, where X₍₁₎ is the smallest sample element (X₍₁₎ = min{X₁, X₂, ..., Xₙ}) and X₍ₙ₎ is the largest sample element (X₍ₙ₎ = max{X₁, X₂, ..., Xₙ}). X₍ᵢ₎ is called the ith order statistic. Often, the distribution of some of the order statistics is of interest, particularly the minimum and maximum sample values, X₍₁₎ and X₍ₙ₎, respectively. Prove that the distribution functions of X₍₁₎ and X₍ₙ₎, denoted respectively by F_{X₍₁₎}(t) and F_{X₍ₙ₎}(t), are

$$F_{X_{(1)}}(t) = 1 - [1 - F(t)]^n,$$
$$F_{X_{(n)}}(t) = [F(t)]^n.$$

Prove that if X is continuous with probability distribution f(x), then the probability distributions of X₍₁₎ and X₍ₙ₎ are

$$f_{X_{(1)}}(t) = n[1 - F(t)]^{n-1} f(t),$$
$$f_{X_{(n)}}(t) = n[F(t)]^{n-1} f(t).$$

9-19. Continuation of Exercise 9-18. Let X₁, X₂, ..., Xₙ be a random sample of a Bernoulli random variable with parameter p. Show that

P(X₍ₙ₎ = 1) = 1 − (1 − p)ⁿ,
P(X₍₁₎ = 0) = 1 − pⁿ.

Use the results of Exercise 9-18.

9-20. Continuation of Exercise 9-18. Let X₁, X₂, ..., Xₙ be a random sample of a normal random variable with mean μ and variance σ². Using the results of Exercise 9-18, derive the density functions of X₍₁₎ and X₍ₙ₎.

9-21. Continuation of Exercise 9-18. Let X₁, X₂, ..., Xₙ be a random sample of an exponential random variable with parameter λ. Derive the distribution functions and probability distributions for X₍₁₎ and X₍ₙ₎. Use the results of Exercise 9-18.

9-22. Let X₁, X₂, ..., Xₙ be a random sample of a continuous random variable. Find

E[F(X₍ₙ₎)] and E[F(X₍₁₎)].

9-23. Using Table III of the Appendix, find the following values:
(a) .~X:.;".
(b) (~X~,~.ll·
(c) X~.cr.5,m'
(d) χ²_{α,15} such that P{χ²₁₅ ≥ χ²_{α,15}} = 0.975.

9-24. Using Table IV of the Appendix, find the following values:
(a) t_{0.25,10}
(b) t;,."w
(c) t_{α,10} such that P{t₁₀ ≤ t_{α,10}} = 0.95.

9-25. Using Table V of the Appendix, find the following values:
(a) F_{0.25,4,9}
(b) F01)'j,)s
(c) F_{0.95,6,8}
(d) F_{0.90,24,24}

9-26. Let F_{1-α,u,v} denote a lower-tail point (α ≤ 0.50) of the F_{u,v} distribution. Prove that F_{1-α,u,v} = 1/F_{α,v,u}.
Chapter 10

Parameter Estimation
Statistical inference is the process by which information from sample data is used to draw
conclusions about the population from which the sample was selected. The techniques of
statistical inference can be divided into two major areas: parameter estimation and hypothesis
testing. This chapter treats parameter estimation, and hypothesis testing is presented in
Chapter 11.
As an example of a parameter estimation problem, suppose that civil engineers are analyzing
the compressive strength of concrete. There is a natural variability in the strength of
each individual concrete specimen. Consequently, the engineers are interested in estimating
the average strength for the population consisting of this type of concrete. They may also
be interested in estimating the variability of compressive strength in this population. We
present methods for obtaining point estimates of parameters such as the population mean
and variance, and we also discuss methods for obtaining certain kinds of interval estimates
of parameters called confidence intervals.

10-1 POINT ESTIMATION


A point estimate of a population parameter is a single numerical value of a statistic that corresponds
to that parameter. That is, the point estimate is a unique selection for the value of
an unknown parameter. More precisely, if X is a random variable with probability distribution
f(x), characterized by the unknown parameter θ, and if X₁, X₂, ..., Xₙ is a random sample
of size n from X, then the statistic θ̂ = h(X₁, X₂, ..., Xₙ) corresponding to θ is called the
estimator of θ. Note that the estimator θ̂ is a random variable, because it is a function of sample
data. After the sample has been selected, θ̂ takes on a particular numerical value called
the point estimate of θ.
As an example, suppose that the random variable X is normally distributed with
unknown mean μ and known variance σ². The sample mean X̄ is a point estimator of the
unknown population mean μ. That is, μ̂ = X̄. After the sample has been selected, the numerical
value x̄ is the point estimate of μ. Thus, if x₁ = 2.5, x₂ = 3.1, x₃ = 2.8, and x₄ = 3.0, then
the point estimate of μ is

$$\bar{x} = \frac{2.5 + 3.1 + 2.8 + 3.0}{4} = 2.85.$$

Similarly, if the population variance σ² is also unknown, a point estimator for σ² is the sample
variance S², and the numerical value s² = 0.07 calculated from the sample data is the
point estimate of σ².
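The two point estimates quoted above are reproduced by the following minimal sketch (Python standard library):

    from statistics import mean, variance

    x = [2.5, 3.1, 2.8, 3.0]
    print(mean(x))        # point estimate of mu: 2.85
    print(variance(x))    # point estimate of sigma^2: 0.07 (n - 1 divisor)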
Estimation problems occur frequently in engineering. We often need to estimate the
following parameters:
• The mean μ of a single population
• The variance σ² (or standard deviation σ) of a single population
• The proportion p of items in a population that belong to a class of interest
• The difference between the means of two populations, μ₁ − μ₂
• The difference between two population proportions, p₁ − p₂

Reasonable point estimates of these parameters are as follows:
• For μ, the estimate is μ̂ = x̄, the sample mean.
• For σ², the estimate is σ̂² = s², the sample variance.
• For p, the estimate is p̂ = x/n, the sample proportion, where x is the number of items
in a random sample of size n that belong to the class of interest.
• For μ₁ − μ₂, the estimate is μ̂₁ − μ̂₂ = x̄₁ − x̄₂, the difference between the sample
means of two independent random samples.
• For p₁ − p₂, the estimate is p̂₁ − p̂₂, the difference between two sample proportions
computed from two independent random samples.
There may be several different potential point estimators for a parameter. For example, if we wish to estimate the mean of a random variable, we might consider the sample mean, the sample median, or perhaps the average of the smallest and largest observations in the sample as point estimators. In order to decide which point estimator of a particular parameter is the best one to use, we need to examine their statistical properties and develop some criteria for comparing estimators.

10-1.1 Properties of Estimators


A desirable property of an estimator is that it should be "close" in some sense to the true value of the unknown parameter. Formally, we say that θ̂ is an unbiased estimator of the parameter θ if

E(θ̂) = θ.    (10-1)

That is, θ̂ is an unbiased estimator of θ if, "on the average," its values are equal to θ. Note that this is equivalent to requiring that the mean of the sampling distribution of θ̂ be equal to θ.

Example 10-1
Suppose that X is a random variable with mean μ and variance σ². Let X₁, X₂, ..., Xₙ be a random sample of size n from X. Show that the sample mean X̄ and sample variance S² are unbiased estimators of μ and σ², respectively. Consider

E(X̄) = E[(1/n) Σᵢ₌₁ⁿ Xᵢ] = (1/n) Σᵢ₌₁ⁿ E(Xᵢ),

and since E(Xᵢ) = μ for all i = 1, 2, ..., n,

E(X̄) = (1/n) Σᵢ₌₁ⁿ μ = μ.

Therefore, the sample mean X̄ is an unbiased estimator of the population mean μ. Now consider

E(S²) = E[(1/(n − 1)) Σᵢ₌₁ⁿ (Xᵢ − X̄)²]
      = (1/(n − 1)) E[Σᵢ₌₁ⁿ (Xᵢ² + X̄² − 2X̄Xᵢ)]
      = (1/(n − 1)) E(Σᵢ₌₁ⁿ Xᵢ² − nX̄²)
      = (1/(n − 1)) [Σᵢ₌₁ⁿ E(Xᵢ²) − nE(X̄²)].

However, since E(Xᵢ²) = μ² + σ² and E(X̄²) = μ² + σ²/n, we have

E(S²) = (1/(n − 1)) [Σᵢ₌₁ⁿ (μ² + σ²) − n(μ² + σ²/n)]
      = (1/(n − 1)) (nμ² + nσ² − nμ² − σ²)
      = σ².

Therefore, the sample variance S² is an unbiased estimator of the population variance σ². However, the sample standard deviation S is a biased estimator of the population standard deviation σ. For large samples this bias is negligible.
----------------------------
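
The unbiasedness of S², and the slight downward bias of the divide-by-n variance estimator, are easy to check by simulation. The following sketch is not part of the original text; it assumes only that NumPy is available.

import numpy as np

rng = np.random.default_rng(1)
mu, sigma2, n, reps = 5.0, 4.0, 10, 20000

# Draw many samples of size n and average the two variance estimators.
samples = rng.normal(mu, np.sqrt(sigma2), size=(reps, n))
s2_unbiased = samples.var(axis=1, ddof=1)   # divides by n - 1
s2_mle = samples.var(axis=1, ddof=0)        # divides by n (the MLE of Example 10-6)

print(s2_unbiased.mean())   # close to sigma2 = 4, illustrating E(S^2) = sigma^2
print(s2_mle.mean())        # close to ((n-1)/n) * sigma2 = 3.6, a biased estimator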
The mean square error of an estimator θ̂ is defined as

MSE(θ̂) = E(θ̂ − θ)².    (10-2)

The mean square error can be rewritten as follows:

MSE(θ̂) = E[θ̂ − E(θ̂)]² + [θ − E(θ̂)]²
        = V(θ̂) + (bias)².    (10-3)

That is, the mean square error of θ̂ is equal to the variance of the estimator plus the squared bias. If θ̂ is an unbiased estimator of θ, the mean square error of θ̂ is equal to the variance of θ̂.
The mean square error is an important criterion for comparing two estimators. Let θ̂₁ and θ̂₂ be two estimators of the parameter θ, and let MSE(θ̂₁) and MSE(θ̂₂) be the mean square errors of θ̂₁ and θ̂₂. Then the relative efficiency of θ̂₂ to θ̂₁ is defined as

MSE(θ̂₁) / MSE(θ̂₂).

If this relative efficiency is less than one, we would conclude that θ̂₁ is a more efficient estimator of θ than is θ̂₂, in the sense that it has smaller mean square error. For example, suppose that we wish to estimate the mean μ of a population. We have a random sample of

n observations X₁, X₂, ..., Xₙ, and we wish to compare two possible estimators for μ: the sample mean X̄ and a single observation from the sample, say Xᵢ. Note that both X̄ and Xᵢ are unbiased estimators of μ; consequently, the mean square error of both estimators is simply the variance. For the sample mean, we have MSE(X̄) = V(X̄) = σ²/n, where σ² is the population variance; for an individual observation, we have MSE(Xᵢ) = V(Xᵢ) = σ². Therefore, the relative efficiency of Xᵢ to X̄ is

MSE(X̄) / MSE(Xᵢ) = (σ²/n)/σ² = 1/n.

Since (1/n) < 1 for sample sizes n ≥ 2, we would conclude that the sample mean is a better estimator of μ than a single observation Xᵢ.
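
When exact variances are not available, relative efficiency can also be estimated by Monte Carlo simulation; a common comparison is the sample mean against the sample median for normal data. A minimal sketch (not from the text), assuming NumPy:

import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n, reps = 0.0, 1.0, 25, 20000

samples = rng.normal(mu, sigma, size=(reps, n))
mse_mean = np.mean((samples.mean(axis=1) - mu) ** 2)
mse_median = np.mean((np.median(samples, axis=1) - mu) ** 2)

# For normal data this ratio approaches 2/pi (about 0.64) as n grows,
# so the sample mean is the more efficient estimator here.
print(mse_mean / mse_median)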
Within the class of unbiased estimators, we would like to find the estimator that has the smallest variance. Such an estimator is called a minimum variance unbiased estimator. Figure 10-1 shows the probability distribution of two unbiased estimators θ̂₁ and θ̂₂, with θ̂₁ having smaller variance than θ̂₂. The estimator θ̂₁ is more likely than θ̂₂ to produce an estimate that is close to the true value of the unknown parameter θ.
It is possible to obtain a lower bound on the variance of all unbiased estimators of θ. Let θ̂ be an unbiased estimator of the parameter θ, based on a random sample of n observations, and let f(x, θ) denote the probability distribution of the random variable X. Then a lower bound on the variance of θ̂ is¹

V(θ̂) ≥ 1 / { n E[(∂ ln f(X, θ)/∂θ)²] }.    (10-4)

This inequality is called the Cramér-Rao lower bound. If an unbiased estimator θ̂ satisfies equation 10-4 as an equality, it is the minimum variance unbiased estimator of θ.

Example 10-2
We will show that the sample mean X̄ is the minimum variance unbiased estimator of the mean of a normal distribution with known variance.

Figure 10-1 The probability distribution of two unbiased estimators, θ̂₁ and θ̂₂.

¹Certain conditions on the function f(X, θ) are required for obtaining the Cramér-Rao inequality (for example, see Tucker 1962). These conditions are satisfied by most of the standard probability distributions.

From Example 10-1 we observe that X̄ is an unbiased estimator of μ. Note that

ln f(X, μ) = −ln(σ√(2π)) − (1/2)[(X − μ)/σ]²,

so that

∂ ln f(X, μ)/∂μ = (X − μ)/σ².

Substituting into equation 10-4 we obtain

V(X̄) ≥ 1 / { n E[(X − μ)/σ²]² } = σ⁴ / [n E(X − μ)²] = σ⁴/(nσ²) = σ²/n.

Since we know that, in general, the variance of the sample mean is V(X̄) = σ²/n, we see that V(X̄) satisfies the Cramér-Rao lower bound as an equality. Therefore, X̄ is the minimum variance unbiased estimator of μ for the normal distribution where σ² is known.

Sometimes we find that biased estimators are preferable to unbiased estimators because they have smaller mean square error. That is, we can reduce the variance of the estimator considerably by introducing a relatively small amount of bias. So long as the reduction in variance is greater than the squared bias, an improved estimator in the mean square error sense will result. For example, Fig. 10-2 shows the probability distribution of a biased estimator θ̂₁ with smaller variance than the unbiased estimator θ̂₂. An estimate based on θ̂₁ would more likely be close to the true value of θ than would an estimate based on θ̂₂. We will see an application of biased estimation in Chapter 15.
An estimator θ̂* that has a mean square error that is less than or equal to the mean square error of any other estimator θ̂, for all values of the parameter θ, is called an optimal estimator of θ.
Another way to define the closeness of an estimator θ̂ to the parameter θ is in terms of consistency. If θ̂ₙ is an estimator of θ based on a random sample of size n, we say that θ̂ₙ is consistent for θ if, for ε > 0,

lim_{n→∞} P(|θ̂ₙ − θ| ≤ ε) = 1.    (10-5)

Consistency is a large-sample property, since it describes the limiting behavior of the estimator θ̂ₙ as the sample size tends to infinity. It is usually difficult to prove that an estimator

Figure 10-2 A biased estimator, θ̂₁, that has smaller variance than the unbiased estimator, θ̂₂.

is consistent using the definition of equation 10-5. However, estimators whose mean square error (or variance, if the estimator is unbiased) tends to zero as the sample size approaches infinity are consistent. For example, X̄ is a consistent estimator of the mean of a normal distribution, since X̄ is unbiased and lim_{n→∞} V(X̄) = lim_{n→∞} (σ²/n) = 0.

10-1.2 The Method of Maximum Likelihood


One of the best methods for obtaining a point estimator is the method of maximum likelihood. Suppose that X is a random variable with probability distribution f(x, θ), where θ is a single unknown parameter. Let x₁, x₂, ..., xₙ be the observed values in a random sample of size n. Then the likelihood function of the sample is

L(θ) = f(x₁, θ) · f(x₂, θ) ··· f(xₙ, θ).    (10-6)

Note that the likelihood function is now a function of only the unknown parameter θ. The maximum likelihood estimator (MLE) of θ is the value of θ that maximizes the likelihood function L(θ). Essentially, the maximum likelihood estimator is the value of θ that maximizes the probability of occurrence of the sample results.

Example 10-3
Let X be a Bernoulli random variable. The probability mass function is

p(x) = pˣ(1 − p)¹⁻ˣ,  x = 0, 1,
     = 0,  otherwise,

where p is the parameter to be estimated. The likelihood function of a sample of size n would be

L(p) = p^{x₁}(1 − p)^{1−x₁} · p^{x₂}(1 − p)^{1−x₂} ··· p^{xₙ}(1 − p)^{1−xₙ} = p^{Σxᵢ}(1 − p)^{n−Σxᵢ}.

We observe that if p̂ maximizes L(p), then p̂ also maximizes ln L(p), since the logarithm is a monotonically increasing function. Therefore,

ln L(p) = (Σᵢ₌₁ⁿ xᵢ) ln p + (n − Σᵢ₌₁ⁿ xᵢ) ln(1 − p).

Now

d ln L(p)/dp = (Σᵢ₌₁ⁿ xᵢ)/p − (n − Σᵢ₌₁ⁿ xᵢ)/(1 − p).

Equating this to zero and solving for p yields the MLE

p̂ = (1/n) Σᵢ₌₁ⁿ xᵢ = x̄,

an intuitively pleasing answer. Of course, one should also perform a second derivative test, but we have foregone that here.
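
As a quick check of this result, one can evaluate the Bernoulli log-likelihood on a grid and confirm that it peaks at the sample mean. A minimal sketch (not from the text), with hypothetical data, assuming NumPy:

import numpy as np

x = np.array([1, 0, 1, 1, 0, 1, 0, 1, 1, 1])  # hypothetical Bernoulli observations
p_grid = np.linspace(0.01, 0.99, 981)

# log L(p) = (sum x) log p + (n - sum x) log(1 - p)
log_lik = x.sum() * np.log(p_grid) + (len(x) - x.sum()) * np.log(1 - p_grid)

print(p_grid[np.argmax(log_lik)])  # ~0.70, where the grid peaks
print(x.mean())                    # the closed-form MLE, also 0.70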

"~~I~~!,ii.~
Let X be normally distributed wi;n unknown mean f.1 and knO'Wtl variance cr, The likelihood function
of a sample of size n is

Now

lnLC") = -(n/2)ln(2""'l -(2a'r' L(X1 - Il)'


1..1

dlna;ll) (a'r'L(x,-Il).
J"':
Equating tbls last reswt to zero and solving for f.1 yields

.. - -..- - - - - - - - - - - - - - - - -
It may not always be possible to use calculus methods to determine the maximum of L(θ). This is illustrated in the following example.

Example 10-5
Let X be uniformly distributed on the interval 0 to a. The likelihood function of a random sample X₁, X₂, ..., Xₙ of size n is

L(a) = ∏ᵢ₌₁ⁿ (1/a) = 1/aⁿ,  0 ≤ xᵢ ≤ a.

Note that the slope of this function is not zero anywhere, so we cannot use calculus methods to find the maximum likelihood estimator â. However, notice that the likelihood function increases as a decreases. Therefore, we would maximize L(a) by setting â to the smallest value that it could reasonably assume. Clearly, a can be no smaller than the largest sample value, so we would use the largest observation as â. Thus, â = maxᵢ Xᵢ is the MLE for a.
------

The method of maximum likelihood can be used in situations where there are several unknown parameters, say θ₁, θ₂, ..., θ_k, to estimate. In such cases, the likelihood function is a function of the k unknown parameters θ₁, θ₂, ..., θ_k, and the maximum likelihood estimators {θ̂ᵢ} would be found by equating the k first partial derivatives ∂L(θ₁, θ₂, ..., θ_k)/∂θᵢ, i = 1, 2, ..., k, to zero and solving the resulting system of equations.

Example 10-6
Let X be normally distributed with mean μ and variance σ², where both μ and σ² are unknown. Find the maximum likelihood estimators of μ and σ². The likelihood function for a random sample of size n is

L(μ, σ²) = ∏ᵢ₌₁ⁿ (1/(σ√(2π))) e^{−(xᵢ−μ)²/(2σ²)} = (2πσ²)^{−n/2} e^{−(1/(2σ²)) Σᵢ₌₁ⁿ (xᵢ−μ)²}

and

ln L(μ, σ²) = −(n/2) ln(2πσ²) − (1/(2σ²)) Σᵢ₌₁ⁿ (xᵢ − μ)².

Now

∂ ln L(μ, σ²)/∂μ = (1/σ²) Σᵢ₌₁ⁿ (xᵢ − μ) = 0,
∂ ln L(μ, σ²)/∂(σ²) = −n/(2σ²) + (1/(2σ⁴)) Σᵢ₌₁ⁿ (xᵢ − μ)² = 0.

The solutions to the above equations yield the maximum likelihood estimators

μ̂ = X̄

and

σ̂² = (1/n) Σᵢ₌₁ⁿ (Xᵢ − X̄)²,

which is closely related to the unbiased sample variance S²; namely, σ̂² = [(n − 1)/n]S².

Maximum likelihood estimators are not necessarily unbiased (see the maximum likelihood estimator of σ² in Example 10-6), but they usually may be easily modified to make them unbiased. Further, the bias approaches zero for large samples. In general, maximum likelihood estimators have good large-sample or asymptotic properties. Specifically, they are asymptotically normally distributed, unbiased, and have a variance that approaches the Cramér-Rao lower bound for large n. More precisely, if θ̂ is the maximum likelihood estimator for θ, then √n(θ̂ − θ) is normally distributed with mean zero and variance

lim_{n→∞} V[√n(θ̂ − θ)] = 1 / E[(∂ ln f(X, θ)/∂θ)²]

for large n. Maximum likelihood estimators are also consistent. In addition, they possess the invariance property; that is, if θ̂ is the maximum likelihood estimator of θ and u(θ) is a function of θ that has a single-valued inverse, then the maximum likelihood estimator of u(θ) is u(θ̂).
It can be shown graphically that the maximum of the likelihood will occur at the value of the maximum likelihood estimator. Consider a sample of size n = 10 from a normal distribution:

14.15, 32.07, 32.30, 25.01, 21.86, 23.70, 25.92, 25.19, 22.59, 26.47.

Assume that the population variance is known to be 4. The MLE for the mean, μ, of a normal distribution has been shown to be X̄. For this set of data, x̄ ≈ 25. Figure 10-3 displays the log-likelihood for various values of the mean. Notice that the maximum value of the log-likelihood function occurs at approximately x̄ = 25. Sometimes, the likelihood function is relatively flat in the region around the maximum. This may be due to the size of the sample taken from the population. A small sample size can lead to a fairly flat log-likelihood, implying less precision in the estimate of the parameter of interest.
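
A curve like the one in Fig. 10-3 can be reproduced directly from the definition of the log-likelihood. A minimal sketch (not from the text), assuming NumPy, using the ten observations above with σ² = 4:

import numpy as np

x = np.array([14.15, 32.07, 32.30, 25.01, 21.86,
              23.70, 25.92, 25.19, 22.59, 26.47])
sigma2 = 4.0
mu_grid = np.linspace(20, 30, 1001)

# log L(mu) = -(n/2) log(2 pi sigma^2) - (1/(2 sigma^2)) sum (x_i - mu)^2
n = len(x)
log_lik = (-(n / 2) * np.log(2 * np.pi * sigma2)
           - ((x[:, None] - mu_grid) ** 2).sum(axis=0) / (2 * sigma2))

print(mu_grid[np.argmax(log_lik)])  # ~24.93, i.e., the sample mean
print(x.mean())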

10-1.3 The Method of Moments


Suppose that X is either a continuous random variable with probability density f(x; θ₁, θ₂, ..., θ_k) or a discrete random variable with distribution p(x; θ₁, θ₂, ..., θ_k) characterized by k unknown parameters. Let X₁, X₂, ..., Xₙ be a random sample of size n from X, and define the first k sample moments about the origin as

m′_t = (1/n) Σᵢ₌₁ⁿ Xᵢᵗ,  t = 1, 2, ..., k.    (10-7)

The first k population moments about the origin are

μ′_t = E(Xᵗ) = ∫_{−∞}^{∞} xᵗ f(x; θ₁, θ₂, ..., θ_k) dx,  X continuous,
     = Σ_x xᵗ p(x; θ₁, θ₂, ..., θ_k),  t = 1, 2, ..., k,  X discrete.    (10-8)

Figure 10-3 Log likelihood for various means.

The population moments {μ′_t} will, in general, be functions of the k unknown parameters {θᵢ}. Equating sample moments and population moments will yield k simultaneous equations in k unknowns (the θᵢ); that is,

μ′_t = m′_t,  t = 1, 2, ..., k.    (10-9)

The solution to equation 10-9, denoted θ̂₁, θ̂₂, ..., θ̂_k, yields the moment estimators of θ₁, θ₂, ..., θ_k.

Example 10-7
Let X ~ N(μ, σ²), where μ and σ² are unknown. To derive estimators for μ and σ² by the method of moments, recall that for the normal distribution

μ′₁ = μ,
μ′₂ = σ² + μ².

The sample moments are m′₁ = (1/n) Σᵢ₌₁ⁿ Xᵢ and m′₂ = (1/n) Σᵢ₌₁ⁿ Xᵢ². From equation 10-9 we obtain

μ = m′₁,  σ² + μ² = m′₂,

which have the solution

μ̂ = X̄,  σ̂² = (1/n) Σᵢ₌₁ⁿ Xᵢ² − X̄² = (1/n) Σᵢ₌₁ⁿ (Xᵢ − X̄)².

Example 10-8
Let X be uniformly distributed on the interval (0, a). To find an estimator of a by the method of moments, we note that the first population moment about zero is

μ′₁ = ∫₀^a x (1/a) dx = a/2.

The first sample moment is just X̄. Therefore,

â = 2X̄,

or the moment estimator of a is just twice the sample mean.

The method of moments often yields estimators that are reasonably good. In Example 10-7, for instance, the moment estimators are identical to the maximum likelihood estimators. In general, moment estimators are asymptotically normally distributed (approximately) and consistent. However, their variance may be larger than the variance of estimators derived by other methods, such as the method of maximum likelihood. Occasionally, the method of moments yields estimators that are very poor, as in Example 10-8. The estimator in that example does not always generate an estimate that is compatible with our knowledge of the situation. For example, if our sample observations were x₁ = 60, x₂ = 10, and x₃ = 5, then â = 50, which is unreasonable, since we know that a ≥ 60.
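
The shortcoming just described is easy to demonstrate numerically: the moment estimator â = 2x̄ can fall below the largest observation, while the MLE max(xᵢ) cannot. A minimal sketch (not from the text), assuming NumPy:

import numpy as np

x = np.array([60.0, 10.0, 5.0])  # the three observations cited in the text

a_mom = 2 * x.mean()   # method-of-moments estimator: 2 * 25 = 50
a_mle = x.max()        # maximum likelihood estimator: 60

# a_mom = 50 is impossible as a value of a, since an observation of 60
# was seen; the MLE never has this defect.
print(a_mom, a_mle)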

10-1.4 Bayesian Inference


In the preceding chapters we made an extensive study of the use of probability. Until now, we have interpreted these probabilities in the frequency sense; that is, they refer to an experiment that can be repeated an indefinite number of times, and if the probability of occurrence of an event A is 0.6, then we would expect A to occur in about 60% of the experimental trials. This frequency interpretation of probability is often called the objectivist or classical viewpoint.
Bayesian inference requires a different interpretation of probability, called the subjective viewpoint. We often encounter subjective probabilistic statements, such as "There is a 30% chance of rain today." Subjective statements measure a person's "degree of belief" concerning some event, rather than a frequency interpretation. Bayesian inference requires us to make use of subjective probability to measure our degree of belief about a state of nature. That is, we must specify a probability distribution to describe our degree of belief about an unknown parameter. This procedure is totally unlike anything we have discussed previously. Until now, parameters have been treated as unknown constants. Bayesian inference requires us to think of parameters as random variables.
Suppose we let f(θ) be the probability distribution of the parameter or state of nature θ. The distribution f(θ) summarizes our objective information about θ prior to obtaining sample information. Obviously, if we are reasonably certain about the value of θ, we will choose f(θ) with a small variance, while if we are less certain about θ, f(θ) will be chosen with a larger variance. We call f(θ) the prior distribution of θ.
Now consider the distribution of the random variable X. The distribution of X we denote f(x|θ) to indicate that the distribution depends on the unknown parameter θ. Suppose we take a random sample from X, say X₁, X₂, ..., Xₙ. The joint density, or likelihood, of the sample is

f(x₁, x₂, ..., xₙ|θ) = f(x₁|θ) f(x₂|θ) ··· f(xₙ|θ).

We define the posterior distribution of θ as the conditional distribution of θ, given the sample results. This is just

f(θ|x₁, x₂, ..., xₙ) = f(x₁, x₂, ..., xₙ; θ) / f(x₁, x₂, ..., xₙ).    (10-10)

The joint distribution of the sample and θ in the numerator of equation 10-10 is the product of the prior distribution of θ and the likelihood, or

f(x₁, x₂, ..., xₙ; θ) = f(θ) · f(x₁, x₂, ..., xₙ|θ).

The denominator of equation 10-10, which is the marginal distribution of the sample, is just a normalizing constant obtained by

f(x₁, x₂, ..., xₙ) = ∫ f(x₁, x₂, ..., xₙ; θ) dθ.    (10-11)
Consequently, we may write the posterior distribution of θ as

f(θ|x₁, x₂, ..., xₙ) = f(θ) f(x₁, x₂, ..., xₙ|θ) / f(x₁, x₂, ..., xₙ).    (10-12)

We note that Bayes' theorem has been used to transform or update the prior distribution to the posterior distribution. The posterior distribution reflects our degree of belief about θ given the sample information. Furthermore, the posterior distribution is proportional to the product of the prior distribution and the likelihood, the constant of proportionality being the normalizing constant f(x₁, x₂, ..., xₙ).

Thus, the posterior density for θ expresses our degree of belief about the value of θ given the result of the sample.

Example 10-9
The time to failure of a transistor is known to be exponentially distributed with parameter λ. For a random sample of n transistors, the joint density of the sample elements, given λ, is

f(x₁, x₂, ..., xₙ|λ) = λⁿ e^{−λ Σᵢ₌₁ⁿ xᵢ}.

Suppose we feel that the prior distribution for λ is also exponential,

f(λ) = k e^{−kλ},  λ > 0,
     = 0,  otherwise,

where k would be chosen depending on the exact knowledge or degree of belief we have about the value of λ. The joint density of the sample and λ is

f(x₁, x₂, ..., xₙ; λ) = k λⁿ e^{−λ(Σxᵢ + k)},

and the marginal density of the sample is

f(x₁, x₂, ..., xₙ) = ∫₀^∞ k λⁿ e^{−λ(Σxᵢ + k)} dλ = k Γ(n + 1) / (Σxᵢ + k)^{n+1}.

Therefore, the posterior density for λ, by equation 10-12, is

f(λ|x₁, x₂, ..., xₙ) = (Σxᵢ + k)^{n+1} λⁿ e^{−λ(Σxᵢ + k)} / Γ(n + 1),

and we see that the posterior density for λ is a gamma distribution with parameters n + 1 and Σxᵢ + k.

10-1.5 Applications to Estimation


In this section, we discuss the application of Bayesian inference to the problem of estimating an unknown parameter of a probability distribution. Let X₁, X₂, ..., Xₙ be a random sample of the random variable X having density f(x|θ). We want to obtain a point estimate of θ. Let f(θ) be the prior distribution for θ, and let ℓ(θ̂; θ) be the loss function. The loss function is a penalty function reflecting the "payment" we must make for misidentifying θ by a realization of its point estimator θ̂. Common choices for ℓ(θ̂; θ) are (θ̂ − θ)² and |θ̂ − θ|. Generally, the less accurate a realization of θ̂ is, the more we must pay. In conjunction with a particular loss function, the risk is defined as the expected value of the loss function with respect to the random variables X₁, X₂, ..., Xₙ comprising θ̂. In other words, the risk is

R(d; θ) = E[ℓ(θ̂; θ)]
        = ∫_{−∞}^{∞} ··· ∫_{−∞}^{∞} ℓ{d(x₁, x₂, ..., xₙ); θ} f(x₁, x₂, ..., xₙ|θ) dx₁ dx₂ ··· dxₙ,

where the function d(x₁, x₂, ..., xₙ), an alternate notation for the estimator θ̂, is simply a function of the observations. Since θ is considered to be a random variable, the risk is itself a random variable. We would like to find the function d that minimizes the expected risk. We write the expected risk as

B(d) = E[R(d; θ)] = ∫ R(d; θ) f(θ) dθ
     = ∫ { ∫ ··· ∫ ℓ{d(x₁, x₂, ..., xₙ); θ} f(x₁, x₂, ..., xₙ|θ) dx₁ dx₂ ··· dxₙ } f(θ) dθ.    (10-13)

We define the Bayes estimator of the parameter θ to be the function d of the sample X₁, X₂, ..., Xₙ that minimizes the expected risk. On interchanging the order of integration in equation 10-13 we obtain

B(d) = ∫ ··· ∫ { ∫ ℓ{d(x₁, x₂, ..., xₙ); θ} f(x₁, x₂, ..., xₙ|θ) f(θ) dθ } dx₁ dx₂ ··· dxₙ.    (10-14)

The function B will be minimized if we can find a function d that minimizes the quantity within the large braces in equation 10-14 for every set of the x values. That is, the Bayes estimator of θ is a function d of the xᵢ that minimizes

∫ ℓ{d(x₁, x₂, ..., xₙ); θ} f(x₁, x₂, ..., xₙ|θ) f(θ) dθ
   = ∫ ℓ(θ̂; θ) f(x₁, x₂, ..., xₙ; θ) dθ    (10-15)
   = f(x₁, x₂, ..., xₙ) ∫ ℓ(θ̂; θ) f(θ|x₁, x₂, ..., xₙ) dθ.

Thus, the Bayes estimator of θ is the value θ̂ that minimizes

∫ ℓ(θ̂; θ) f(θ|x₁, x₂, ..., xₙ) dθ.    (10-16)

If the loss function ℓ(θ̂; θ) is the squared-error loss (θ̂ − θ)², then we may show that the Bayes estimator of θ, say θ̂*, is the mean of the posterior density for θ (refer to Exercise 10-78).

Example 10-10
Consider the situation in Example 10-9, where it was shown that if the random variable X is exponentially distributed with parameter λ, and if the prior distribution for λ is exponential with parameter k, then the posterior distribution for λ is a gamma distribution with parameters n + 1 and Σᵢ₌₁ⁿ xᵢ + k. Therefore, if a squared-error loss function is assumed, the Bayes estimator for λ is the mean of this gamma distribution,

λ* = (n + 1) / (Σᵢ₌₁ⁿ xᵢ + k).

Suppose that in the time-to-failure problem in Example 10-9, a reasonable exponential prior distribution for λ has parameter k = 140. This is equivalent to saying that the prior estimate for λ is λ₀ = 1/140 = 0.00714. A random sample of size n = 10 yields Σᵢ₌₁¹⁰ xᵢ = 1500. The Bayes estimate of λ is

λ* = (n + 1) / (Σᵢ₌₁¹⁰ xᵢ + k) = (10 + 1) / (1500 + 140) = 0.00671.

We may compare this with the results that would have been obtained by classical methods. The maximum likelihood estimator of the parameter λ in an exponential distribution is

λ̂ = n / Σᵢ₌₁ⁿ Xᵢ.

Consequently, the maximum likelihood estimate of λ, based on the foregoing sample data, is

λ̂ = 10/1500 = 0.00667.

Note that the results produced by the two methods differ somewhat. The Bayes estimate is slightly closer to the prior estimate than is the maximum likelihood estimate.
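
The posterior in this example is available in closed form, so the Bayes and maximum likelihood estimates can be compared with a few lines of arithmetic. A minimal sketch (not from the text); the numbers reproduce Example 10-10:

# Exponential likelihood with an exponential(k) prior gives a
# gamma(n + 1, sum_x + k) posterior for lambda (conjugacy).
n = 10
sum_x = 1500.0
k = 140.0

prior_estimate = 1 / k                  # 0.00714
bayes_estimate = (n + 1) / (sum_x + k)  # posterior mean: 0.00671
mle = n / sum_x                         # 0.00667

print(prior_estimate, bayes_estimate, mle)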

Example 10-11
Let X₁, X₂, ..., Xₙ be a random sample from the normal density with mean μ and variance 1, where μ is unknown. Assume that the prior density for μ is normal with mean 0 and variance 1; that is,

f(μ) = (1/√(2π)) e^{−(1/2)μ²},  −∞ < μ < ∞.

The joint conditional density of the sample given μ is

f(x₁, x₂, ..., xₙ|μ) = (2π)^{−n/2} e^{−(1/2) Σ(xᵢ−μ)²}
                     = (2π)^{−n/2} e^{−(1/2)[Σxᵢ² − 2μΣxᵢ + nμ²]}.

Thus, the joint density of the sample and μ is

f(x₁, x₂, ..., xₙ; μ) = (2π)^{−(n+1)/2} e^{−(1/2)Σxᵢ²} e^{−(1/2)[μ²(n+1) − 2μΣxᵢ]}.

The marginal density of the sample is

f(x₁, x₂, ..., xₙ) = (2π)^{−(n+1)/2} exp(−(1/2)Σxᵢ²) ∫_{−∞}^{∞} exp{−(1/2)[μ²(n + 1) − 2μΣxᵢ]} dμ
                   = [1/((n + 1)^{1/2}(2π)^{n/2})] exp{−(1/2)[Σxᵢ² − n²x̄²/(n + 1)]},

by completing the square in the exponent under the integral and using the fact that the integral is (2π)^{1/2}/(n + 1)^{1/2} (since a normal density has to integrate to 1). Now the posterior density for μ is

f(μ|x₁, x₂, ..., xₙ) = [(n + 1)/(2π)]^{1/2} exp{−[(n + 1)/2][μ − Σxᵢ/(n + 1)]²};

that is, the posterior density for μ is normal with mean Σxᵢ/(n + 1) = nx̄/(n + 1) and variance 1/(n + 1).

There is a relationship between the Bayes estimator for a parameter and the maximum likelihood estimator of the same parameter. For large sample sizes the two are nearly equivalent. In general, the difference between the two estimators is small compared to 1/√n. In practical problems, a moderate sample size will produce approximately the same estimate by either the Bayes or the maximum likelihood method, if the sample results are consistent with the assumed prior information. If the sample results are inconsistent with the prior assumptions, then the Bayes estimate may differ considerably from the maximum likelihood estimate. In these circumstances, if the sample results are accepted as being correct, the prior information must be incorrect. The maximum likelihood estimate would then be the better estimate to use.
If the sample results do not agree with the prior information, the Bayes estimator will tend to produce an estimate that is between the maximum likelihood estimate and the prior assumptions. If there is more inconsistency between the prior information and the sample, there will be a greater difference between the two estimates. For an illustration of this, refer to Example 10-10.

10-1.6 Precision of Estimation: The Standard Error


When we report the value of a point estimate, it is usually necessary to give some idea of its precision. The standard error is the usual measure of precision employed. If θ̂ is an estimator of θ, then the standard error of θ̂ is just the standard deviation of θ̂, or

σ_θ̂ = √V(θ̂).    (10-17)

If σ_θ̂ involves any unknown parameters, then if we substitute estimates of these parameters into equation 10-17, we obtain the estimated standard error of θ̂, say σ̂_θ̂. A small standard error implies that a relatively precise estimate has been reported.

Example 10-12
An article in the Journal of Heat Transfer (Trans. ASME, Ser. C, 96, 1974, p. 59) describes a method of measuring the thermal conductivity of Armco iron. Using a temperature of 100°F and a power input of 550 W, the following 10 measurements of thermal conductivity (in Btu/hr-ft-°F) were obtained:

41.60, 41.48, 42.34, 41.95, 41.86,
42.18, 41.72, 42.26, 41.81, 42.04.

A point estimate of mean thermal conductivity at 100°F and 550 W is the sample mean, or

x̄ = 41.924 Btu/hr-ft-°F.

The standard error of the sample mean is σ_x̄ = σ/√n, and since σ is unknown, we may replace it with the sample standard deviation to obtain the estimated standard error of x̄:

σ̂_x̄ = s/√n = 0.284/√10 = 0.0898.

Notice that the standard error is about 0.2% of the sample mean, implying that we have obtained a relatively precise point estimate of thermal conductivity.
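
These two summary numbers are easy to reproduce. A minimal sketch (not from the text), assuming NumPy:

import numpy as np

k = np.array([41.60, 41.48, 42.34, 41.95, 41.86,
              42.18, 41.72, 42.26, 41.81, 42.04])

xbar = k.mean()               # 41.924
s = k.std(ddof=1)             # 0.284 (sample standard deviation)
se = s / np.sqrt(len(k))      # 0.0898, the estimated standard error

print(xbar, s, se)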

When the distribution of θ̂ is unknown or complicated, the standard error of θ̂ may be difficult to estimate using standard statistical theory. In this case, a computer-intensive technique called the bootstrap can be used. Efron and Tibshirani (1993) provide an excellent introduction to the bootstrap technique.
Suppose that the standard error of θ̂ is denoted σ_θ̂. Further, assume the population probability density function is given by f(x; θ). A bootstrap estimate of σ_θ̂ can be easily constructed.
1. Given a random sample X₁, X₂, ..., Xₙ from f(x; θ), estimate θ; the estimate is denoted by θ̂.
2. Using the estimate θ̂, generate a sample of size n from the distribution f(x; θ̂). This is the bootstrap sample.
3. Using the bootstrap sample, estimate θ. This estimate we denote θ̂*ᵢ.
4. Generate B bootstrap samples to obtain bootstrap estimates θ̂*ᵢ, for i = 1, 2, ..., B (B = 100 or 200 is often used).
5. Let θ̄* = Σᵢ₌₁ᴮ θ̂*ᵢ / B represent the sample mean of the bootstrap estimates.
6. The bootstrap standard error of θ̂ is found with the usual standard deviation formula:

s_θ̂ = √[ Σᵢ₌₁ᴮ (θ̂*ᵢ − θ̄*)² / (B − 1) ].

In the literature, B − 1 is often replaced by B; for large values of B, however, there is little practical difference in the estimate obtained.

Example 10-13
The failure times X of an electronic component are known to follow an exponential distribution with unknown parameter λ. A random sample of ten components resulted in the following failure times (in hours):

195.2, 201.4, 183.0, 175.1, 205.1, 191.7, 188.6, 173.5, 200.8, 210.0.

The mean of the exponential distribution is given by E(X) = 1/λ. It is also known that E(X̄) = 1/λ. A reasonable estimate for λ then is λ̂ = 1/X̄. From the sample data, we find x̄ = 192.44, resulting in λ̂ = 1/192.44 = 0.00520. B = 100 bootstrap samples of size n = 10 were generated using Minitab® with f(x; 0.00520) = 0.00520e^{−0.00520x}. Some of the bootstrap estimates are shown in Table 10-1.
The average of the bootstrap estimates is found to be λ̄* = Σᵢ₌₁¹⁰⁰ λ̂*ᵢ/100 = 0.00551. The standard error of the estimate is

s_λ̂ = 0.00169.

Table 10-1 Bootstrap Estimates for Example 10-13

Sample    Sample Mean, x̄*ᵢ    λ̂*ᵢ
1         243.407             0.00411
2         153.821             0.00650
3         126.554             0.00790
⋮         ⋮                   ⋮
100       204.390             0.00489
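
The six-step parametric bootstrap above is short to code. The following sketch (not from the text) assumes NumPy and uses the failure-time data of Example 10-13; results will differ slightly from Minitab's because of random number generation.

import numpy as np

rng = np.random.default_rng(3)
x = np.array([195.2, 201.4, 183.0, 175.1, 205.1,
              191.7, 188.6, 173.5, 200.8, 210.0])

lam_hat = 1 / x.mean()   # step 1: ~0.00520
B, n = 100, len(x)

# steps 2-4: draw B parametric bootstrap samples from f(x; lam_hat)
boot = rng.exponential(scale=1 / lam_hat, size=(B, n))
lam_star = 1 / boot.mean(axis=1)

# steps 5-6: bootstrap mean and standard error
lam_bar = lam_star.mean()
se_boot = np.sqrt(((lam_star - lam_bar) ** 2).sum() / (B - 1))

print(lam_hat, lam_bar, se_boot)   # compare se_boot with the 0.00169 in the text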


1()"2 SINGLE-SAMPLE CONFIDENCE :Ii'<"TERVAL ESTIMATION
In many situations, a point estimate does not provide enough information about the param-
eter of interest For example, if we are interested in estimating the mean compression
strength of concrete, a single number may not be very meaningfuL An interval estimate of
the form L ,; J1 ,; U might be more useful. The end points of this interval will be random
variables) since they are functions of sample data.
In general, to construct an interval estimator of the unkno\\ll parameter e. we must find
two statistics, L and U, such that
P(L $ (1$ UJ = 1- c< (10-18)
The resulting interval
(10-19)
is called a lOO( 1 - a)% confidence ir.te!Valfor the unknovtn parameter e. Land U are called
the lower- and upper-confidence limits, respectively. and I - a is called the confidence coef-
ficiem. The interpretation ill a confidence interval is that if many random samples are col-
lected and a 100(1 - a)% cenfidence interval on e is compured from each sample, then
100(1- a)% oithese intervals will contain the true value of e. The siruationis illustrated in
Fig. 104 which shows several 100(1- a)% confidence intervals for the mean J1 of a distri-
bution. The dol, at the center of each interval indicate the point estimate of J1 (in this case Xi,
Notice that one ill the 15 intervals fuils to contain the true value of J.L If this were a 95% con-
fidence level, in the long run, only 5% of the intervals would fail to contain J.L.
Now in practice, we obtain only one random sample and calculate one confidence
intervaL Since this interval either will or '\\ill not contain the true value of 8t it is not rea-
sonable to attach a probability level to this specific event. The appropriate statement would
be that elies in the observed interval [L, ("1 with confidence 100(1 - a). This statement has

Figure 10-4 Repeated construction of a confidence interval for μ.

a frequency interpretation; that is, we do not know if the statement is true for this specific sample, but the method used to obtain the interval [L, U] yields correct statements 100(1 − α)% of the time.
The confidence interval in equation 10-19 might be more properly called a two-sided confidence interval, as it specifies both a lower and an upper limit on θ. Occasionally, a one-sided confidence interval might be more appropriate. A one-sided 100(1 − α)% lower-confidence interval on θ is given by the interval

L ≤ θ,    (10-20)

where the lower-confidence limit L is chosen so that

P(L ≤ θ) = 1 − α.    (10-21)

Similarly, a one-sided 100(1 − α)% upper-confidence interval on θ is given by the interval

θ ≤ U,    (10-22)

where the upper-confidence limit U is chosen so that

P(θ ≤ U) = 1 − α.    (10-23)

The length of the observed two-sided confidence interval is an important measure of the quality of the information obtained from the sample. The half-interval length θ̂ − L or U − θ̂ is called the accuracy of the estimator. The longer the confidence interval, the more confident we are that the interval actually contains the true value of θ. On the other hand, the longer the interval, the less information we have about the true value of θ. In an ideal situation, we obtain a relatively short interval with high confidence.

10-2.1 Confidence Interval on the Mean of a Normal Distribution, Variance Known

Let X be a normal random variable with unknown mean μ and known variance σ², and suppose that a random sample of size n, X₁, X₂, ..., Xₙ, is taken. A 100(1 − α)% confidence interval on μ can be obtained by considering the sampling distribution of the sample mean X̄. In Section 9-3 we noted that the sampling distribution of X̄ is normal if X is normal and approximately normal if the conditions of the Central Limit Theorem are met. The mean of X̄ is μ, and the variance is σ²/n. Therefore, the distribution of the statistic

Z = (X̄ − μ) / (σ/√n)

is taken to be a standard normal distribution.
The distribution of Z = (X̄ − μ)/(σ/√n) is shown in Fig. 10-5. From examination of this figure we see that

P{−Z_{α/2} ≤ Z ≤ Z_{α/2}} = 1 − α

Figure 10-5 The distribution of Z.



or

P{−Z_{α/2} ≤ (X̄ − μ)/(σ/√n) ≤ Z_{α/2}} = 1 − α.

This can be rearranged as

P{X̄ − Z_{α/2} σ/√n ≤ μ ≤ X̄ + Z_{α/2} σ/√n} = 1 − α.    (10-24)

Comparing equations 10-24 and 10-18, we see that the 100(1 − α)% two-sided confidence interval on μ is

X̄ − Z_{α/2} σ/√n ≤ μ ≤ X̄ + Z_{α/2} σ/√n.    (10-25)

Example 10-14
Consider the thermal conductivity data in Example 10-12. Suppose that we want to find a 95% confidence interval on the mean thermal conductivity of Armco iron. Suppose we know that the standard deviation of thermal conductivity at 100°F and 550 W is σ = 0.10 Btu/hr-ft-°F. If we assume that thermal conductivity is normally distributed (or that the conditions of the Central Limit Theorem are met), then we can use equation 10-25 to construct the confidence interval. A 95% interval implies that 1 − α = 0.95, so α = 0.05, and from Table II in the Appendix Z_{α/2} = Z_{0.05/2} = Z_{0.025} = 1.96. The lower confidence limit is

L = x̄ − Z_{α/2} σ/√n
  = 41.924 − 1.96(0.10)/√10
  = 41.924 − 0.062
  = 41.862,

and the upper confidence limit is

U = x̄ + Z_{α/2} σ/√n
  = 41.924 + 1.96(0.10)/√10
  = 41.924 + 0.062
  = 41.986.

Thus the 95% two-sided confidence interval is

41.862 ≤ μ ≤ 41.986.

This is our interval of reasonable values for mean thermal conductivity at 95% confidence.
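
The same interval can be computed programmatically. A minimal sketch (not from the text), assuming NumPy and SciPy:

import numpy as np
from scipy import stats

xbar, sigma, n, alpha = 41.924, 0.10, 10, 0.05

z = stats.norm.ppf(1 - alpha / 2)        # 1.96
half_width = z * sigma / np.sqrt(n)      # 0.062

print(xbar - half_width, xbar + half_width)   # (41.862, 41.986)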

Confidence Level and Precision of Estimation

Notice that in the previous example our choice of the 95% level of confidence was essentially arbitrary. What would have happened if we had chosen a higher level of confidence, say 99%? In fact, doesn't it seem reasonable that we would want the higher level of confidence? At α = 0.01, we find Z_{α/2} = Z_{0.01/2} = Z_{0.005} = 2.58, while for α = 0.05, Z_{0.025} = 1.96. Thus, the length of the 95% confidence interval is

2(1.96σ/√n) = 3.92σ/√n,

whereas the length of the 99% confidence interval is

2(2.58σ/√n) = 5.15σ/√n.

The 99% confidence interval is longer than the 95% confidence interval. This is why we have a higher level of confidence in the 99% confidence interval. Generally, for a fixed sample size n and standard deviation σ, the higher the confidence level, the longer the resulting confidence interval.
Since the length of the confidence interval measures the precision of estimation, we see that precision is inversely related to the confidence level. As noted earlier, it is highly desirable to obtain a confidence interval that is short enough for decision-making purposes and that also has adequate confidence. One way to achieve this is by choosing the sample size n to be large enough to give a confidence interval of specified length with prescribed confidence.

Choice of Sample Size

The accuracy of the confidence interval in equation 10-25 is Z_{α/2} σ/√n. This means that in using x̄ to estimate μ, the error E = |x̄ − μ| is less than Z_{α/2} σ/√n with confidence 100(1 − α). This is shown graphically in Fig. 10-6. In situations where the sample size can be controlled, we can choose n to be 100(1 − α)% confident that the error in estimating μ is less than a specified error E. The appropriate sample size is

n = (Z_{α/2} σ / E)².    (10-26)

If the right-hand side of equation 10-26 is not an integer, it must be rounded up. Notice that 2E is the length of the resulting confidence interval.
To illustrate the use of this procedure, suppose that we wanted the error in estimating the mean thermal conductivity of Armco iron in Example 10-14 to be less than 0.05 Btu/hr-ft-°F, with 95% confidence. Since σ = 0.10 and Z_{0.025} = 1.96, we may find the required sample size from equation 10-26 to be

n = (Z_{α/2} σ / E)² = [(1.96)(0.10)/0.05]² = 15.37 ≅ 16.
Notice how, in general, the sample size behaves as a function of the length of the confidence interval 2E, the confidence level 100(1 − α)%, and the standard deviation σ, as follows (a short numeric check follows this list):
• As the desired length of the interval 2E decreases, the required sample size n increases for a fixed value of σ and specified confidence.
• As σ increases, the required sample size n increases for a fixed length 2E and specified confidence.
• As the level of confidence increases, the required sample size n increases for fixed length 2E and standard deviation σ.
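
A minimal numeric check of equation 10-26 and the rounding rule (not from the text), assuming SciPy:

import math
from scipy import stats

def sample_size(sigma, E, alpha):
    # n = (z_{alpha/2} * sigma / E)^2, rounded up to the next integer
    z = stats.norm.ppf(1 - alpha / 2)
    return math.ceil((z * sigma / E) ** 2)

print(sample_size(sigma=0.10, E=0.05, alpha=0.05))  # 16, as in the text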

Figure 10-6 Error in estimating μ with x̄.


One-Sided Confidence Intervals

It is also possible to obtain one-sided confidence intervals for μ by setting either L = −∞ or U = ∞ and replacing Z_{α/2} by Z_α. The 100(1 − α)% upper-confidence interval for μ is

μ ≤ X̄ + Z_α σ/√n,    (10-27)

and the 100(1 − α)% lower-confidence interval for μ is

X̄ − Z_α σ/√n ≤ μ.    (10-28)

10-2.2 Confidence Interval on the Mean of a Normal Distribution, Variance Unknown

Suppose that we wish to find a confidence interval on the mean of a distribution but the variance is unknown. Specifically, a random sample of size n, X₁, X₂, ..., Xₙ, is available, and X̄ and S² are the sample mean and sample variance, respectively. One possibility would be to replace σ in the confidence interval formulas for μ with known variance (equations 10-25, 10-27, and 10-28) with the sample standard deviation S. If the sample size, n, is relatively large, say n > 30, then this is an acceptable procedure. Consequently, we often call the confidence intervals in Sections 10-2.1 and 10-2.2 large-sample confidence intervals, because they are approximately valid even if the unknown population variances are replaced by the corresponding sample variances.
When sample sizes are small, this approach will not work, and we must use another procedure. To produce a valid confidence interval, we must make a stronger assumption about the underlying population. The usual assumption is that the underlying population is normally distributed. This leads to confidence intervals based on the t distribution. Specifically, let X₁, X₂, ..., Xₙ be a random sample from a normal distribution with unknown mean μ and unknown variance σ². In Section 9-4 we noted that the sampling distribution of the statistic

t = (X̄ − μ) / (S/√n)

is the t distribution with n − 1 degrees of freedom. We now show how the confidence interval on μ is obtained.
The distribution of t = (X̄ − μ)/(S/√n) is shown in Fig. 10-7. Letting t_{α/2,n−1} be the upper α/2 percentage point of the t distribution with n − 1 degrees of freedom, we observe from Fig. 10-7 that

P{−t_{α/2,n−1} ≤ t ≤ t_{α/2,n−1}} = 1 − α
Figure 10-7 The t distribution.



or

P{−t_{α/2,n−1} ≤ (X̄ − μ)/(S/√n) ≤ t_{α/2,n−1}} = 1 − α.

Rearranging this last equation yields

P{X̄ − t_{α/2,n−1} S/√n ≤ μ ≤ X̄ + t_{α/2,n−1} S/√n} = 1 − α.    (10-29)

Comparing equations 10-29 and 10-18, we see that a 100(1 − α)% two-sided confidence interval on μ is

X̄ − t_{α/2,n−1} S/√n ≤ μ ≤ X̄ + t_{α/2,n−1} S/√n.    (10-30)

A 100(1 − α)% lower-confidence interval on μ is given by

X̄ − t_{α,n−1} S/√n ≤ μ,    (10-31)

and a 100(1 − α)% upper-confidence interval on μ is

μ ≤ X̄ + t_{α,n−1} S/√n.    (10-32)

Remember that these procedures assume that we are sampling from a normal population. This assumption is important for small samples. Fortunately, the normality assumption holds in many practical situations. When it does not, we must use distribution-free or nonparametric confidence intervals. Nonparametric methods are discussed in Chapter 16. However, when the population is normal, the t-distribution intervals are the shortest possible 100(1 − α)% confidence intervals, and are therefore superior to the nonparametric methods.
Selecting the sample size n required to give a confidence interval of required length is not as easy as in the known σ case, because the length of the interval depends on the value of S (unknown before the data is collected) and on n. Furthermore, n enters the confidence interval through both 1/√n and t_{α/2,n−1}. Consequently, the required n must be determined through trial and error.

Example 10-15
An article in the Journal of Testing and Evaluation (Vol. 10, No. 4, 1982, p. 133) presents the following 20 measurements on residual flame time (in seconds) of treated specimens of children's nightwear:

9.85, 9.93, 9.75, 9.77, 9.67,
9.87, 9.67, 9.94, 9.85, 9.75,
9.83, 9.92, 9.74, 9.99, 9.88,
9.95, 9.95, 9.93, 9.92, 9.89.

We wish to find a 95% confidence interval on the mean residual flame time. The sample mean and standard deviation are

x̄ = 9.8475,
s = 0.0954.

From Table IV of the Appendix we find t_{0.025,19} = 2.093. The lower and upper 95% confidence limits are

L = x̄ − t_{α/2,n−1} s/√n
  = 9.8475 − 2.093(0.0954)/√20
  = 9.8029 seconds

and

U = x̄ + t_{α/2,n−1} s/√n
  = 9.8475 + 2.093(0.0954)/√20
  = 9.8921 seconds.

Therefore the 95% confidence interval is

9.8029 sec ≤ μ ≤ 9.8921 sec.

We are 95% confident that the mean residual flame time is between 9.8029 and 9.8921 seconds.
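
A minimal sketch of the same t-interval computation (not from the text), assuming NumPy and SciPy:

import numpy as np
from scipy import stats

flame = np.array([9.85, 9.93, 9.75, 9.77, 9.67,
                  9.87, 9.67, 9.94, 9.85, 9.75,
                  9.83, 9.92, 9.74, 9.99, 9.88,
                  9.95, 9.95, 9.93, 9.92, 9.89])

n = len(flame)
t = stats.t.ppf(0.975, df=n - 1)                 # 2.093
half = t * flame.std(ddof=1) / np.sqrt(n)

print(flame.mean() - half, flame.mean() + half)  # (9.8029, 9.8921)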

10-2.3 Confidence Interval on the Variance of a Normal Distribution

Suppose that X is normally distributed with unknown mean μ and unknown variance σ². Let X₁, X₂, ..., Xₙ be a random sample of size n, and let S² be the sample variance. It was shown in Section 9-3 that the sampling distribution of

χ² = (n − 1)S² / σ²

is chi-square with n − 1 degrees of freedom. This distribution is shown in Fig. 10-8.
To develop the confidence interval, we note from Fig. 10-8 that

P(χ²_{1−α/2,n−1} ≤ χ² ≤ χ²_{α/2,n−1}) = 1 − α

or

P{χ²_{1−α/2,n−1} ≤ (n − 1)S²/σ² ≤ χ²_{α/2,n−1}} = 1 − α.

This last equation can be rearranged to yield

P{(n − 1)S²/χ²_{α/2,n−1} ≤ σ² ≤ (n − 1)S²/χ²_{1−α/2,n−1}} = 1 − α.    (10-33)

Comparing equations 10-33 and 10-18, we see that a 100(1 − α)% two-sided confidence interval for σ² is

(n − 1)S²/χ²_{α/2,n−1} ≤ σ² ≤ (n − 1)S²/χ²_{1−α/2,n−1}.    (10-34)
Figure 10-8 The χ² distribution.

To find a 100(1 − α)% lower-confidence interval on σ², set U = ∞ and replace χ²_{α/2,n−1} with χ²_{α,n−1}, giving

(n − 1)S²/χ²_{α,n−1} ≤ σ².    (10-35)

The 100(1 − α)% upper-confidence interval is found by setting L = 0 and replacing χ²_{1−α/2,n−1} with χ²_{1−α,n−1}, resulting in

σ² ≤ (n − 1)S²/χ²_{1−α,n−1}.    (10-36)

Example 10-16
A manufacturer of soft drink beverages is interested in the uniformity of the machine used to fill cans. Specifically, it is desirable that the standard deviation σ of the filling process be less than 0.2 fluid ounces; otherwise there will be a higher than allowable percentage of cans that are underfilled. We will assume that fill volume is approximately normally distributed. A random sample of 20 cans results in a sample variance of s² = 0.0225 (fluid ounces)². A 95% upper-confidence interval is found from equation 10-36 as follows:

σ² ≤ (n − 1)s²/χ²_{0.95,19}

or

σ² ≤ (19)(0.0225)/10.117 = 0.0423 (fluid ounces)².

This last statement may be converted into a confidence interval on the standard deviation σ by taking the square root of both sides, resulting in

σ ≤ 0.21 fluid ounces.

Therefore, at the 95% level of confidence, the data do not support the claim that the process standard deviation is less than 0.20 fluid ounces.
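
A minimal sketch of this upper confidence bound (not from the text), assuming SciPy:

import math
from scipy import stats

n, s2 = 20, 0.0225

# chi-square_{0.95,19} in the book's upper-tail notation is the 5th
# percentile of the chi-square distribution with 19 degrees of freedom.
chi2_lower = stats.chi2.ppf(0.05, df=n - 1)   # 10.117
var_upper = (n - 1) * s2 / chi2_lower         # 0.0423

print(var_upper, math.sqrt(var_upper))        # sd bound ~0.21 fluid ounces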

10-2.4 Confidence Interval on a Proportion


It is often necessary to construct a 100(1 - a)% confidence interval on a proportion. For
example, suppose that a rundom sample of size n has been talren from a large (possibly infi-
nite) population. and XC'; n) observations in this sample belong to a class of interest. Then
f; ~ X!n is the point estimator of the proportion of the population that belongs to this class.
Note that n and p are the parameters of a binomial distribution. Furthennore. in Section 7-
5 we saw that the samplin,g distribution of p is approximately normal with mean p and vari-
ance p(l - p)/n, if p is not. too close to either 0 or 1, and jf n is relatively large. Thus, the
distribution of

is "l'Proximately standard nonnal.


To construct the confidence inteI'\'al on p, note that
P(-Z"" ;;Z;;Z"") = 1 a

or

P{−Z_{α/2} ≤ (p̂ − p)/√(p(1 − p)/n) ≤ Z_{α/2}} = 1 − α.

This may be rearranged as

P{p̂ − Z_{α/2}√(p(1 − p)/n) ≤ p ≤ p̂ + Z_{α/2}√(p(1 − p)/n)} = 1 − α.    (10-37)

We recognize the quantity √(p(1 − p)/n) as the standard error of the point estimator p̂. Unfortunately, the upper and lower limits of the confidence interval obtained from equation 10-37 would contain the unknown parameter p. However, a satisfactory solution is to replace p by p̂ in the standard error, giving an estimated standard error. Therefore,

P{p̂ − Z_{α/2}√(p̂(1 − p̂)/n) ≤ p ≤ p̂ + Z_{α/2}√(p̂(1 − p̂)/n)} ≅ 1 − α,    (10-38)

and the approximate 100(1 − α)% two-sided confidence interval on p is

p̂ − Z_{α/2}√(p̂(1 − p̂)/n) ≤ p ≤ p̂ + Z_{α/2}√(p̂(1 − p̂)/n).    (10-39)

An approximate 100(1 − α)% lower-confidence interval is

p̂ − Z_α√(p̂(1 − p̂)/n) ≤ p,    (10-40)

and an approximate 100(1 − α)% upper-confidence interval is

p ≤ p̂ + Z_α√(p̂(1 − p̂)/n).    (10-41)
Example 10-17
In a random sample of 75 axle shafts, 12 have a surface finish that is rougher than the specifications will allow. Therefore, a point estimate of the proportion p of shafts in the population that exceed the roughness specifications is p̂ = x/n = 12/75 = 0.16. A 95% two-sided confidence interval for p is computed from equation 10-39 as

p̂ − Z_{0.025}√(p̂(1 − p̂)/n) ≤ p ≤ p̂ + Z_{0.025}√(p̂(1 − p̂)/n)

or

0.16 − 1.96√(0.16(0.84)/75) ≤ p ≤ 0.16 + 1.96√(0.16(0.84)/75),

which simplifies to

0.08 ≤ p ≤ 0.24.

Define the error in estimating p by p̂ as E = |p̂ − p|. Note that we are approximately 100(1 − α)% confident that this error is less than Z_{α/2}√(p(1 − p)/n). Therefore, in situations where the sample size can be selected, we may choose n to be 100(1 − α)% confident that the error is less than some specified value E. The appropriate sample size is

n = (Z_{α/2}/E)² p(1 − p).    (10-42)

The function p(1 − p) is relatively flat from p = 0.3 to p = 0.7. An estimate of p is required to use equation 10-42. If an estimate p̂ from a previous sample is available, it could be substituted for p in equation 10-42, or perhaps a subjective estimate could be made. If these alternatives are unsatisfactory, a preliminary sample could be taken, p̂ computed, and then equation 10-42 used to determine how many additional observations are required to estimate p with the desired accuracy. The sample size from equation 10-42 will always be a maximum for p = 0.5 [that is, p(1 − p) ≤ 0.25], and this can be used to obtain an upper bound on n. In other words, we are at least 100(1 − α)% confident that the error in estimating p using p̂ is less than E if the sample size is

n = (Z_{α/2}/E)²(0.25).

In order to maintain at least a 100(1 − α)% level of confidence, the value for n is always rounded up to the next integer.

Example 10-18
Consider the data in Example 10-17. How large a sample is required if we want to be 95% confident that the error in using p̂ to estimate p is less than 0.05? Using p̂ = 0.16 as an initial estimate of p, we find from equation 10-42 that the required sample size is

n = (Z_{0.025}/E)² p̂(1 − p̂) = (1.96/0.05)² (0.16)(0.84) ≅ 207.

We note that the procedures developed in this section depend on the normal approximation to the binomial. In situations where this approximation is inappropriate, particularly cases where n is small, other methods must be used. Tables of the binomial distribution could be used to obtain a confidence interval for p. If n is large but p is small, then the Poisson approximation to the binomial could be used to construct confidence intervals. These procedures are illustrated by Duncan (1986).
Agresti and Coull (1998) present an alternative form of a confidence interval on the population proportion, p, based on a large-sample hypothesis test on p (see Chapter 11 of this text). Agresti and Coull show that the upper and lower limits of an approximate 100(1 − α)% confidence interval on p are

[ p̂ + Z²_{α/2}/(2n) ± Z_{α/2}√( (p̂(1 − p̂) + Z²_{α/2}/(4n)) / n ) ] / (1 + Z²_{α/2}/n).

The authors refer to this as the score confidence interval. One-sided confidence intervals can be constructed simply by replacing Z_{α/2} with Z_α.

To illustrate this confidence interval, reconsider Example 10-17, which discusses the surface finish of a shaft with n = 75 and p̂ = 0.16. The lower and upper limits of a 95% confidence interval using the approach of Agresti and Coull are

[ 0.16 + (1.96)²/(2(75)) ± 1.96√( (0.16(0.84) + (1.96)²/(4(75))) / 75 ) ] / (1 + (1.96)²/75)
   = (0.186 ± 0.087)/1.051
   = 0.177 ± 0.083.

The resulting lower- and upper-confidence limits are 0.094 and 0.260, respectively.
Agresti and Coull argue that the more complicated confidence interval has several advantages over the standard large-sample interval (given in equation 10-39). One advantage is that their confidence interval tends to maintain the stated level of confidence better than the standard large-sample interval. Another advantage is that the lower-confidence limit will always be non-negative. The large-sample confidence interval can result in negative lower-confidence limits, which the practitioner will generally then set to 0. A method which can report a negative lower limit on a parameter that is inherently non-negative (such as a proportion, p) is often considered an inferior method. Lastly, the requirements that p not be close to 0 or 1 and n be relatively large are not requirements for the approach suggested by Agresti and Coull. In other words, their approach results in an appropriate confidence interval for any combination of n and p.
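
The standard interval of equation 10-39 and the Agresti-Coull score interval are both short computations. A minimal sketch (not from the text), assuming NumPy and SciPy, reproducing the n = 75, p̂ = 0.16 results:

import numpy as np
from scipy import stats

n, x, alpha = 75, 12, 0.05
p_hat = x / n
z = stats.norm.ppf(1 - alpha / 2)

# Standard large-sample interval (equation 10-39)
half = z * np.sqrt(p_hat * (1 - p_hat) / n)
print(p_hat - half, p_hat + half)            # ~(0.08, 0.24)

# Agresti-Coull score interval
center = (p_hat + z**2 / (2 * n)) / (1 + z**2 / n)
half_ac = (z * np.sqrt((p_hat * (1 - p_hat) + z**2 / (4 * n)) / n)
           / (1 + z**2 / n))
print(center - half_ac, center + half_ac)    # ~(0.094, 0.260)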

10-3 TWO-SAMPLE CONFIDENCE INTERVAL ESTIMATION

10-3.1 Confidence Interval on the Difference between Means of Two Normal Distributions, Variances Known

Consider two independent random variables X₁ with unknown mean μ₁ and known variance σ₁², and X₂ with unknown mean μ₂ and known variance σ₂². We wish to find a 100(1 − α)% confidence interval on the difference in means μ₁ − μ₂. Let X₁₁, X₁₂, ..., X₁ₙ₁ be a random sample of n₁ observations from X₁, and X₂₁, X₂₂, ..., X₂ₙ₂ be a random sample of n₂ observations from X₂. If X̄₁ and X̄₂ are the sample means, the statistic

Z = [X̄₁ − X̄₂ − (μ₁ − μ₂)] / √(σ₁²/n₁ + σ₂²/n₂)

is standard normal if X₁ and X₂ are normal, or approximately standard normal if the conditions of the Central Limit Theorem apply, respectively. From Fig. 10-5, this implies that

P{−Z_{α/2} ≤ Z ≤ Z_{α/2}} = 1 − α

or

This can be rearranged as

P{X̄₁ − X̄₂ − Z_{α/2}√(σ₁²/n₁ + σ₂²/n₂) ≤ μ₁ − μ₂ ≤ X̄₁ − X̄₂ + Z_{α/2}√(σ₁²/n₁ + σ₂²/n₂)} = 1 − α.    (10-43)

Comparing equations 10-43 and 10-18, we note that the 100(1 − α)% confidence interval for μ₁ − μ₂ is

X̄₁ − X̄₂ − Z_{α/2}√(σ₁²/n₁ + σ₂²/n₂) ≤ μ₁ − μ₂ ≤ X̄₁ − X̄₂ + Z_{α/2}√(σ₁²/n₁ + σ₂²/n₂).    (10-44)

One-sided confidence intervals on μ₁ − μ₂ may also be obtained. A 100(1 − α)% upper-confidence interval on μ₁ − μ₂ is

μ₁ − μ₂ ≤ X̄₁ − X̄₂ + Z_α√(σ₁²/n₁ + σ₂²/n₂),    (10-45)

and a 100(1 − α)% lower-confidence interval is

X̄₁ − X̄₂ − Z_α√(σ₁²/n₁ + σ₂²/n₂) ≤ μ₁ − μ₂.    (10-46)

Example 10-19
Tensile strength tests were performed on two different grades of aluminum spars used in manufacturing the wing of a commercial transport aircraft. From past experience with the spar manufacturing process and the testing procedure, the standard deviations of tensile strengths are assumed to be known. The data obtained are shown in Table 10-2.
If μ₁ and μ₂ denote the true mean tensile strengths for the two grades of spars, then we may find a 90% confidence interval on the difference in mean strength μ₁ − μ₂ as follows:

L = x̄₁ − x̄₂ − Z_{α/2}√(σ₁²/n₁ + σ₂²/n₂)
  = 87.6 − 74.5 − 1.645√((1.0)²/10 + (1.5)²/12)
  = 13.1 − 0.88
  = 12.22 kg/mm²

Table 10-2 Tensile Strength Test Results for Aluminum Spars

Spar     Sample      Sample Mean Tensile    Standard Deviation
Grade    Size        Strength (kg/mm²)      (kg/mm²)
1        n₁ = 10     x̄₁ = 87.6             σ₁ = 1.0
2        n₂ = 12     x̄₂ = 74.5             σ₂ = 1.5

and

U = x̄₁ − x̄₂ + Z_{α/2}√(σ₁²/n₁ + σ₂²/n₂)
  = 87.6 − 74.5 + 1.645√((1.0)²/10 + (1.5)²/12)
  = 13.1 + 0.88
  = 13.98 kg/mm².

Therefore the 90% confidence interval on the difference in mean tensile strength is

12.22 kg/mm² ≤ μ₁ − μ₂ ≤ 13.98 kg/mm².

We are 90% confident that the mean tensile strength of grade 1 aluminum exceeds that of grade 2 aluminum by between 12.22 and 13.98 kg/mm².
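
A minimal sketch of this two-sample interval with known variances (not from the text), assuming NumPy and SciPy:

import numpy as np
from scipy import stats

x1bar, sigma1, n1 = 87.6, 1.0, 10
x2bar, sigma2, n2 = 74.5, 1.5, 12

z = stats.norm.ppf(0.95)                       # 1.645 for a 90% interval
half = z * np.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
diff = x1bar - x2bar

print(diff - half, diff + half)                # (12.22, 13.98)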

If the standard deviations σ₁ and σ₂ are known (at least approximately), and if the sample sizes n₁ and n₂ are equal (n₁ = n₂ = n, say), then we can determine the sample size required so that the error in estimating μ₁ − μ₂ using X̄₁ − X̄₂ will be less than E at 100(1 − α)% confidence. The required sample size from each population is

n = (Z_{α/2}/E)²(σ₁² + σ₂²).    (10-47)

Remember to round up if n is not an integer.

10-3.2 Confidence Interval on the Difference between Means of Two Normal Distributions, Variances Unknown

We now extend the results of Section 10-2.2 to the case of two populations with unknown means and variances, and we wish to find confidence intervals on the difference in means μ₁ − μ₂. If the sample sizes n₁ and n₂ both exceed 30, then the known-variances normal-distribution intervals in Section 10-3.1 can be used. However, when small samples are taken, we must assume that the underlying populations are normally distributed with unknown variances and base the confidence intervals on the t distribution.

Case I. σ₁² = σ₂² = σ²

Consider two independent normal random variables, say X₁ with mean μ₁ and variance σ₁², and X₂ with mean μ₂ and variance σ₂². Both the means μ₁ and μ₂ and the variances σ₁² and σ₂² are unknown. However, suppose it is reasonable to assume that both variances are equal; that is, σ₁² = σ₂² = σ². We wish to find a 100(1 − α)% confidence interval on the difference in means μ₁ − μ₂.
Random samples of size n₁ and n₂ are taken on X₁ and X₂, respectively. Let the sample means be denoted X̄₁ and X̄₂ and the sample variances be denoted S₁² and S₂². Since both S₁² and S₂² are estimates of the common variance σ², we may obtain a combined (or "pooled") estimator of σ²:

S_p² = [(n₁ − 1)S₁² + (n₂ − 1)S₂²] / (n₁ + n₂ − 2).    (10-48)

To develop the confidence interval for μ₁ − μ₂, note that the distribution of the statistic

t = [X̄₁ − X̄₂ − (μ₁ − μ₂)] / [S_p √(1/n₁ + 1/n₂)]

is the t distribution with n₁ + n₂ − 2 degrees of freedom. Therefore,

P{−t_{α/2,n₁+n₂−2} ≤ t ≤ t_{α/2,n₁+n₂−2}} = 1 − α.

This may be rearranged as

P{X̄₁ − X̄₂ − t_{α/2,n₁+n₂−2} S_p √(1/n₁ + 1/n₂) ≤ μ₁ − μ₂ ≤ X̄₁ − X̄₂ + t_{α/2,n₁+n₂−2} S_p √(1/n₁ + 1/n₂)} = 1 − α.    (10-49)

Therefore, a 100(1 − α)% two-sided confidence interval for the difference in means μ₁ − μ₂ is

X̄₁ − X̄₂ − t_{α/2,n₁+n₂−2} S_p √(1/n₁ + 1/n₂) ≤ μ₁ − μ₂ ≤ X̄₁ − X̄₂ + t_{α/2,n₁+n₂−2} S_p √(1/n₁ + 1/n₂).    (10-50)

A one-sided 100(1 − α)% lower-confidence interval on μ₁ − μ₂ is

X̄₁ − X̄₂ − t_{α,n₁+n₂−2} S_p √(1/n₁ + 1/n₂) ≤ μ₁ − μ₂,    (10-51)

and a one-sided 100(1 − α)% upper-confidence interval on μ₁ − μ₂ is

μ₁ − μ₂ ≤ X̄₁ − X̄₂ + t_{α,n₁+n₂−2} S_p √(1/n₁ + 1/n₂).    (10-52)

In a batch chemical process used for etching printed circuit boards, two different catalysts are being
compared to de~r.mine whether they require differer:.t emersio!l times fo: re...l1oval of identical quan-
tities of photoresist material, Twelve batches were rn."l with catalyst 1, resti:ung in a sample mean
emersion time of x, ~ 24.6 minutes and a sample standard de..iation of s! = 0,85 minutes. Fifteen
batches were ron -with catalyst 2, resulting in a rce.a... emersioD time of ~ ;;: 22.1 minutes and a stan w

da.."; deviation of S2 "" 0.98 minutes. We will find a 95% confidence interval on the difference in means
246 Olapter 10 Parameter R~tim.ation !
μ₁ − μ₂, assuming that the standard deviations (or variances) of the two populations are equal. The pooled estimate of the common variance is found using equation 10-48 as follows:

\[ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{11(0.85)^2 + 14(0.98)^2}{12 + 15 - 2} = 0.8557. \]

The pooled standard deviation is s_p = √0.8557 = 0.925. Since t_{α/2,n₁+n₂−2} = t_{0.025,25} = 2.060, we may calculate the 95% lower- and upper-confidence limits as

\[ 24.6 - 22.1 - 2.060(0.925)\sqrt{\frac{1}{12} + \frac{1}{15}} = 1.76 \text{ minutes} \]

and

\[ 24.6 - 22.1 + 2.060(0.925)\sqrt{\frac{1}{12} + \frac{1}{15}} = 3.24 \text{ minutes}. \]

That is, the 95% confidence interval on the difference in mean immersion times is

1.76 minutes ≤ μ₁ − μ₂ ≤ 3.24 minutes.

We are 95% confident that catalyst 1 requires an immersion time that is between 1.76 minutes and 3.24 minutes longer than that required by catalyst 2.
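The same calculation is easy to verify numerically; the following sketch (not from the text) reproduces Example 10-20 from its summary statistics using scipy's t quantile.

```python
# Sketch reproducing the Example 10-20 pooled-t interval.
from scipy.stats import t
import math

n1, x1bar, s1 = 12, 24.6, 0.85
n2, x2bar, s2 = 15, 22.1, 0.98

sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)  # pooled variance
sp = math.sqrt(sp2)
tval = t.ppf(0.975, n1 + n2 - 2)              # t_{0.025,25} = 2.060
half = tval * sp * math.sqrt(1/n1 + 1/n2)     # half-width of the interval

print(f"{x1bar - x2bar - half:.2f} <= mu1 - mu2 <= {x1bar - x2bar + half:.2f}")
# -> 1.76 <= mu1 - mu2 <= 3.24
```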

Case II. σ₁² ≠ σ₂²

In many situations it is not reasonable to assume that σ₁² = σ₂². When this assumption is unwarranted, one may still find a 100(1 − α)% confidence interval for μ₁ − μ₂ using the fact that the statistic

\[ t^* = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{S_1^2/n_1 + S_2^2/n_2}} \]

is distributed approximately as t with degrees of freedom given by

\[ \nu = \frac{\left( S_1^2/n_1 + S_2^2/n_2 \right)^2}{\dfrac{(S_1^2/n_1)^2}{n_1 + 1} + \dfrac{(S_2^2/n_2)^2}{n_2 + 1}} - 2. \]  (10-53)

Consequently, an approximate 100(1 − α)% two-sided confidence interval for μ₁ − μ₂, when σ₁² ≠ σ₂², is

\[ \bar{X}_1 - \bar{X}_2 - t_{\alpha/2,\,\nu} \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} \le \mu_1 - \mu_2 \le \bar{X}_1 - \bar{X}_2 + t_{\alpha/2,\,\nu} \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}. \]  (10-54)

Upper (lower) one-sided confidence limits may be found by replacing the lower (upper) confidence limit with −∞ (∞) and changing α/2 to α.
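A sketch (not from the text) of the Case II computation follows; it reuses the Example 10-20 summary statistics purely for illustration, since the text does not work this case numerically.

```python
# Sketch of the unequal-variance interval, equations 10-53 and 10-54.
from scipy.stats import t
import math

n1, x1bar, s1 = 12, 24.6, 0.85
n2, x2bar, s2 = 15, 22.1, 0.98

v1, v2 = s1**2 / n1, s2**2 / n2
# Degrees of freedom from equation 10-53 (note the "- 2" in this form)
nu = (v1 + v2) ** 2 / (v1**2 / (n1 + 1) + v2**2 / (n2 + 1)) - 2
half = t.ppf(0.975, nu) * math.sqrt(v1 + v2)
print(f"nu = {nu:.1f}")
print(f"{x1bar - x2bar - half:.2f} <= mu1 - mu2 <= {x1bar - x2bar + half:.2f}")
```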

10-3.3 Confidence Interval on μ₁ − μ₂ for Paired Observations


In Sections 10-3.1 and 10-3.2 we developed confidence intervals for the difference in means where two independent random samples were selected from the two populations of interest. That is, n₁ observations were selected at random from the first population and a completely independent sample of n₂ observations was selected at random from the second population. There are also a number of experimental situations where there are only n different experimental units and the data are collected in pairs; that is, two observations are made on each unit.

For example, the journal Human Factors (1962, p. 375) reports a study in which 14 subjects were asked to park two cars having substantially different wheelbases and turning radii. The time in seconds was recorded for each car and subject, and the resulting data are shown in Table 10-3. Notice that each subject is the "experimental unit" referred to earlier. We wish to obtain a confidence interval on the difference in mean time to park the two cars, say μ₁ − μ₂.

In general, suppose that the data consist of n pairs (X₁₁, X₂₁), (X₁₂, X₂₂), ..., (X₁ₙ, X₂ₙ). Both X₁ and X₂ are assumed to be normally distributed with means μ₁ and μ₂, respectively. The random variables within different pairs are independent. However, because there are two measurements on the same experimental unit, the two measurements within the same pair may not be independent. Consider the n differences D₁ = X₁₁ − X₂₁, D₂ = X₁₂ − X₂₂, ..., Dₙ = X₁ₙ − X₂ₙ. Now the mean of the differences D, say μ_D, is

\[ \mu_D = E(D) = E(X_1 - X_2) = E(X_1) - E(X_2) = \mu_1 - \mu_2, \]

because the expected value of X₁ − X₂ is the difference in expected values regardless of whether X₁ and X₂ are independent. Consequently, we can construct a confidence interval for μ₁ − μ₂ just by finding a confidence interval on μ_D. Since the differences Dⱼ are normally and independently distributed, we can use the t-distribution procedure described in Section 10-2.2 to find the confidence interval on μ_D.

Table 10-3 Time in Seconds to Parallel Park Two Automobiles

Subject   Automobile 1   Automobile 2   Difference
 1        37.0           17.8            19.2
 2        25.8           20.2             5.6
 3        16.2           16.8            −0.6
 4        24.2           41.4           −17.2
 5        22.0           21.4             0.6
 6        33.4           38.4            −5.0
 7        23.8           16.8             7.0
 8        58.2           32.2            26.0
 9        33.6           27.8             5.8
10        24.4           23.2             1.2
11        23.4           29.6            −6.2
12        21.2           20.6             0.6
13        36.2           32.2             4.0
14        29.8           53.8           −24.0

By analogy with equation 10-30, the 100(1 − α)% confidence interval on μ_D = μ₁ − μ₂ is

\[ \bar{D} - t_{\alpha/2,\,n-1} \frac{S_D}{\sqrt{n}} \le \mu_D \le \bar{D} + t_{\alpha/2,\,n-1} \frac{S_D}{\sqrt{n}}, \]  (10-55)

where D̄ and S_D are the sample mean and sample standard deviation of the differences Dⱼ, respectively. This confidence interval is valid for the case where σ₁² ≠ σ₂², because S_D² estimates σ_D² = V(X₁ − X₂). Also, for large samples (say n ≥ 30 pairs), the assumption of normality is unnecessary.

Example 10-21
We now return to the data in Table 10-3 concerning the time for n = 14 subjects to parallel park two cars. From the column of observed differences dⱼ we calculate d̄ = 1.21 and s_d = 12.68. The 90% confidence interval for μ_D = μ₁ − μ₂ is found from equation 10-55 as follows:

\[ \bar{d} - t_{0.05,13}\, s_d/\sqrt{n} \le \mu_D \le \bar{d} + t_{0.05,13}\, s_d/\sqrt{n}, \]
\[ 1.21 - 1.771(12.68)/\sqrt{14} \le \mu_D \le 1.21 + 1.771(12.68)/\sqrt{14}, \]
\[ -4.79 \le \mu_D \le 7.21. \]

Notice that the confidence interval on μ_D includes zero. This implies that, at the 90% level of confidence, the data do not support the claim that the two cars have different mean parking times μ₁ and μ₂. That is, the value μ_D = μ₁ − μ₂ = 0 is not inconsistent with the observed data.

Note that when pairing data, degrees of freedom are lost in comparison to the two-sample confidence intervals, but typically a gain in precision of estimation is achieved because S_D is smaller than S_p.
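As a numerical check, the following sketch (not from the text) applies equation 10-55 to the Table 10-3 differences.

```python
# Sketch of the paired-observation interval (equation 10-55).
import numpy as np
from scipy.stats import t

d = np.array([19.2, 5.6, -0.6, -17.2, 0.6, -5.0, 7.0,
              26.0, 5.8, 1.2, -6.2, 0.6, 4.0, -24.0])
n = len(d)
dbar, sd = d.mean(), d.std(ddof=1)           # about 1.21 and 12.68
half = t.ppf(0.95, n - 1) * sd / np.sqrt(n)  # 90% interval uses t_{0.05,13}
print(f"{dbar - half:.2f} <= mu_D <= {dbar + half:.2f}")
# Close to the text's -4.79 <= mu_D <= 7.21 (which used the rounded d-bar = 1.21)
```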

10-3.4 Confidence Interval on the Ratio of Variances of Two Normal Distributions

Suppose that X₁ and X₂ are independent normal random variables with unknown means μ₁ and μ₂ and unknown variances σ₁² and σ₂², respectively. We wish to find a 100(1 − α)% confidence interval on the ratio σ₁²/σ₂². Let two random samples of sizes n₁ and n₂ be taken on X₁ and X₂, and let S₁² and S₂² denote the sample variances. To find the confidence interval, we note that the sampling distribution of

\[ F = \frac{S_2^2/\sigma_2^2}{S_1^2/\sigma_1^2} \]

is F with n₂ − 1 and n₁ − 1 degrees of freedom. This distribution is shown in Fig. 10-9. From Fig. 10-9, we see that

\[ P\{ F_{1-\alpha/2,\,n_2-1,\,n_1-1} \le F \le F_{\alpha/2,\,n_2-1,\,n_1-1} \} = 1 - \alpha \]

or

\[ P\left\{ F_{1-\alpha/2,\,n_2-1,\,n_1-1} \le \frac{S_2^2/\sigma_2^2}{S_1^2/\sigma_1^2} \le F_{\alpha/2,\,n_2-1,\,n_1-1} \right\} = 1 - \alpha. \]

Hence

\[ P\left\{ \frac{S_1^2}{S_2^2} F_{1-\alpha/2,\,n_2-1,\,n_1-1} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{S_1^2}{S_2^2} F_{\alpha/2,\,n_2-1,\,n_1-1} \right\} = 1 - \alpha. \]  (10-56)

Figure 10-9 The distribution of F with n₂ − 1 and n₁ − 1 degrees of freedom.

Comparing equations 10-56 and 10-18, we see that a 100(1 − α)% two-sided confidence interval for σ₁²/σ₂² is

\[ \frac{S_1^2}{S_2^2} F_{1-\alpha/2,\,n_2-1,\,n_1-1} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{S_1^2}{S_2^2} F_{\alpha/2,\,n_2-1,\,n_1-1}, \]  (10-57)

where the lower 1 − α/2 tail point of the F distribution with n₂ − 1 and n₁ − 1 degrees of freedom is given by (see equation 9-22)

\[ F_{1-\alpha/2,\,n_2-1,\,n_1-1} = \frac{1}{F_{\alpha/2,\,n_1-1,\,n_2-1}}. \]  (10-58)

We may also construct one-sided confidence intervals. A 100(1 − α)% lower-confidence limit on σ₁²/σ₂² is

\[ \frac{S_1^2}{S_2^2} F_{1-\alpha,\,n_2-1,\,n_1-1} \le \frac{\sigma_1^2}{\sigma_2^2}, \]  (10-59)

while a 100(1 − α)% upper-confidence interval on σ₁²/σ₂² is

\[ \frac{\sigma_1^2}{\sigma_2^2} \le \frac{S_1^2}{S_2^2} F_{\alpha,\,n_2-1,\,n_1-1}. \]  (10-60)

Example 10-22
Consider the batch chemical etching process described in Example 10-20. Recall that two catalysts are being compared to measure their effectiveness in reducing immersion times for printed circuit boards. n₁ = 12 batches were run with catalyst 1 and n₂ = 15 batches were run with catalyst 2, yielding s₁ = 0.85 minutes and s₂ = 0.98 minutes. We will find a 90% confidence interval on the ratio of variances σ₁²/σ₂². From equation 10-57, we find that

\[ \frac{s_1^2}{s_2^2} F_{0.95,14,11} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{s_1^2}{s_2^2} F_{0.05,14,11}, \]

\[ \frac{(0.85)^2}{(0.98)^2}(0.39) \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{(0.85)^2}{(0.98)^2}(2.74), \]

or

\[ 0.29 \le \frac{\sigma_1^2}{\sigma_2^2} \le 2.06, \]

using the fact that F₀.₉₅,₁₄,₁₁ = 1/F₀.₀₅,₁₁,₁₄ = 1/2.58 = 0.39. Since this confidence interval includes unity, we could not claim that the standard deviations of the immersion times for the two catalysts are different at the 90% level of confidence.
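The following sketch (not from the text) reproduces Example 10-22 using scipy's F quantiles in place of tabulated F values.

```python
# Sketch of the variance-ratio interval (equations 10-57 and 10-58).
from scipy.stats import f

n1, s1 = 12, 0.85
n2, s2 = 15, 0.98
ratio = s1**2 / s2**2

f_lo = f.ppf(0.05, n2 - 1, n1 - 1)  # F_{0.95,14,11} = 1/F_{0.05,11,14}
f_hi = f.ppf(0.95, n2 - 1, n1 - 1)  # F_{0.05,14,11}
print(f"{ratio * f_lo:.2f} <= var1/var2 <= {ratio * f_hi:.2f}")
# -> approximately 0.29 <= var1/var2 <= 2.06
```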

10-3.5 Confidence Interval on the Difference between Two Proportions


If there are two proportions of interest, say p₁ and p₂, it is possible to obtain a 100(1 − α)% confidence interval on their difference, p₁ − p₂. If two independent samples of sizes n₁ and n₂ are taken from infinite populations so that X₁ and X₂ are independent binomial random variables with parameters (n₁, p₁) and (n₂, p₂), respectively, where X₁ represents the number of sample observations from the first population that belong to a class of interest and X₂ represents the number of sample observations from the second population that belong to the class of interest, then p̂₁ = X₁/n₁ and p̂₂ = X₂/n₂ are independent estimators of p₁ and p₂, respectively. Furthermore, under the assumption that the normal approximation to the binomial applies, the statistic

\[ Z = \frac{\hat{p}_1 - \hat{p}_2 - (p_1 - p_2)}{\sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}} \]

is distributed approximately as standard normal. Using an approach analogous to that of the previous section, it follows that an approximate 100(1 − α)% two-sided confidence interval for p₁ − p₂ is

\[ \hat{p}_1 - \hat{p}_2 - z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} \le p_1 - p_2 \le \hat{p}_1 - \hat{p}_2 + z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}. \]  (10-61)
An approximate 100(1 − α)% lower-confidence interval for p₁ − p₂ is

\[ \hat{p}_1 - \hat{p}_2 - z_{\alpha} \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} \le p_1 - p_2, \]  (10-62)

and an approximate 100(1 − α)% upper-confidence interval for p₁ − p₂ is

\[ p_1 - p_2 \le \hat{p}_1 - \hat{p}_2 + z_{\alpha} \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}. \]  (10-63)

Example 10-23
Consider the data in Example 10-17. Suppose that a modification is made in the surface finishing process and subsequently a second random sample of 85 axle shafts is obtained. The number of defective shafts in this second sample is 10. Therefore, since n₁ = 75, p̂₁ = 0.16, n₂ = 85, and p̂₂ = 10/85 = 0.12, we can obtain an approximate 95% confidence interval on the difference in the proportions of defectives produced under the two processes from equation 10-61 as

\[ 0.16 - 0.12 - 1.96\sqrt{\frac{0.16(0.84)}{75} + \frac{0.12(0.88)}{85}} \le p_1 - p_2 \le 0.16 - 0.12 + 1.96\sqrt{\frac{0.16(0.84)}{75} + \frac{0.12(0.88)}{85}}. \]

This simplifies to

−0.07 ≤ p₁ − p₂ ≤ 0.15.

This interval includes zero, so, based on the sample data, it seems unlikely that the changes made in the surface finish process have reduced the proportion of defective axle shafts being produced.
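A sketch (not from the text) of the Example 10-23 computation follows; the defective count of 12 for the first sample is inferred from p̂₁ = 12/75 = 0.16 and is otherwise an assumption.

```python
# Sketch of the two-proportion interval (equation 10-61).
from scipy.stats import norm
import math

n1, x1 = 75, 12    # 12/75 = 0.16 (count inferred from p1-hat = 0.16)
n2, x2 = 85, 10
p1, p2 = x1 / n1, x2 / n2
se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z = norm.ppf(0.975)
print(f"{p1 - p2 - z * se:.2f} <= p1 - p2 <= {p1 - p2 + z * se:.2f}")
# -> -0.07 <= p1 - p2 <= 0.15
```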

10-4 APPROXIMATE CONFIDENCE INTERVALS IN MAXIMUM LIKELIHOOD ESTIMATION

If the method of maximum likelihood is used for parameter estimation, the asymptotic properties of these estimators may be used to obtain approximate confidence intervals. Let θ̂ be the maximum likelihood estimator of θ. For large samples, θ̂ is approximately normally distributed with mean θ and variance V(θ̂) given by the Cramér-Rao lower bound (equation 10-4). Therefore, an approximate 100(1 − α)% confidence interval for θ is

\[ \hat{\theta} - z_{\alpha/2}\sqrt{V(\hat{\theta})} \le \theta \le \hat{\theta} + z_{\alpha/2}\sqrt{V(\hat{\theta})}. \]  (10-64)

Usually, V(θ̂) is a function of the unknown parameter θ. In these cases, we replace θ with θ̂.

Example 10-24
Recall Example 10-3, where it was shown that the maximum likelihood estimator of the parameter p of a Bernoulli distribution is p̂ = (1/n) Σᵢ₌₁ⁿ Xᵢ = X̄. Using the Cramér-Rao lower bound, we may verify that the lower bound for the variance of p̂ is

\[ V(\hat{p}) \ge \frac{1}{n E\left[ \dfrac{\partial}{\partial p} \ln\left( p^X (1-p)^{1-X} \right) \right]^2} = \frac{1}{n E\left[ \dfrac{X}{p} - \dfrac{1-X}{1-p} \right]^2} = \frac{1}{n E\left[ \dfrac{X^2}{p^2} + \dfrac{(1-X)^2}{(1-p)^2} - \dfrac{2X(1-X)}{p(1-p)} \right]}. \]

For the Bernoulli distribution, we observe that E(X) = p and E(X²) = p. Therefore, this last expression simplifies to

\[ V(\hat{p}) \ge \frac{1}{n\left[ \dfrac{1}{p} + \dfrac{1}{1-p} \right]} = \frac{p(1-p)}{n}. \]

This result should not be surprising, since we know directly that for the Bernoulli distribution, V(X̄) = V(X)/n = p(1 − p)/n. In any case, replacing p in V(p̂) by p̂, the approximate 100(1 − α)% confidence interval for p is found from equation 10-64 to be

\[ \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}. \]
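As an illustration of equation 10-64 for the Bernoulli case, here is a minimal sketch (not from the text); the sample counts are hypothetical.

```python
# Sketch of the approximate MLE-based interval for a Bernoulli parameter p.
from scipy.stats import norm
import math

n, successes = 50, 18                      # hypothetical sample
p_hat = successes / n
se = math.sqrt(p_hat * (1 - p_hat) / n)    # sqrt of V(p-hat) with p replaced by p-hat
z = norm.ppf(0.975)
print(f"{p_hat - z * se:.3f} <= p <= {p_hat + z * se:.3f}")
```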

10-5 SIMULTANEOUS CONFIDENCE INTERVALS

Occasionally it is necessary to construct several confidence intervals on more than one parameter, and we wish the probability to be (1 − α) that all such confidence intervals simultaneously produce correct statements. For example, suppose that we are sampling from a normal population with unknown mean and variance, and we wish to construct confidence intervals for μ and σ² such that the probability is (1 − α) that both intervals simultaneously yield correct conclusions. Since X̄ and S² are independent, we could ensure this result by constructing 100(1 − α)^{1/2}% confidence intervals for each parameter separately, and both intervals would simultaneously produce correct conclusions with probability (1 − α)^{1/2}(1 − α)^{1/2} = (1 − α).
If the sample statistics on which the confidence intervals are based are not independent random variables, then the confidence intervals are not independent, and other methods must be used. In general, suppose that m confidence intervals are required. The Bonferroni inequality states that

\[ P\{\text{all } m \text{ statements are simultaneously correct}\} \ge 1 - \alpha \ge 1 - \sum_{i=1}^{m} \alpha_i, \]  (10-65)

where 1 − αᵢ is the confidence level used in the ith confidence interval. In practice, we select a value for the simultaneous confidence level 1 − α and then choose the individual αᵢ such that Σᵢ₌₁ᵐ αᵢ = α. Usually, we set αᵢ = α/m.

As an illustration, suppose we wished to construct two confidence intervals on the means of two normal distributions such that we are at least 90% confident that both statements are simultaneously correct. Therefore, since 1 − α = 0.90, we have α = 0.10, and since two confidence intervals are required, each of these should be constructed with αᵢ = α/2 = 0.10/2 = 0.05, i = 1, 2. That is, two individual 95% confidence intervals on μ₁ and μ₂ will simultaneously lead to correct statements with probability at least 0.90.
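A minimal sketch (not from the text) of the Bonferroni allocation rule αᵢ = α/m:

```python
# Sketch of the Bonferroni allocation of equation 10-65: m intervals,
# each at level alpha/m, guarantee joint confidence of at least 1 - alpha.
def individual_level(alpha_joint, m):
    """Confidence level to use for each of m simultaneous intervals."""
    return 1 - alpha_joint / m

# Two intervals with at least 90% joint confidence, as in the text:
print(individual_level(0.10, 2))  # -> 0.95, i.e., two individual 95% intervals
```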

10-6 BAYESIAN CONFIDENCE INTERVALS

Previously, we presented Bayesian techniques for point estimation. In this section, we will present the Bayesian approach to constructing confidence intervals.

We may use Bayesian methods to construct interval estimates of parameters that are similar to confidence intervals. If the posterior density for θ has been obtained, we can construct an interval, usually centered at the posterior mean, that contains 100(1 − α)% of the posterior probability. Such an interval is called the 100(1 − α)% Bayes interval for the unknown parameter θ.

While in many cases the Bayes interval estimate for θ will be quite similar to a classical confidence interval with the same confidence coefficient, the interpretation of the two is very different. A confidence interval is an interval that, before the sample is taken, will include the unknown θ with probability 1 − α. That is, the classical confidence interval relates to the relative frequency of an interval including θ. On the other hand, a Bayes interval is an interval that contains 100(1 − α)% of the posterior probability for θ. Since the posterior probability density measures a degree of belief about θ given the sample results, the Bayes interval provides a subjective degree of belief about θ rather than a frequency interpretation. The Bayes interval estimate of θ is affected by the sample results but is not completely determined by them.

Example 10-25
Suppose that the random variable X is normally distributed with mean μ and variance 4. The value of μ is unknown, but a reasonable prior density would be normal with mean 2 and variance 1. That is,

\[ f(\mu) = \frac{1}{\sqrt{2\pi}} \exp\left\{ -\frac{(\mu - 2)^2}{2} \right\}, \quad -\infty < \mu < \infty. \]

Using the methods of Section 10-1.4, we can show that the posterior density for μ is

\[ f(\mu \mid x_1, x_2, \ldots, x_n) \propto \exp\left\{ -\frac{1}{2}\left( \frac{n}{4} + 1 \right)\left( \mu - \frac{n\bar{x} + 8}{n + 4} \right)^2 \right\}. \]

Thus, the posterior distribution for μ is normal with mean (nx̄ + 8)/(n + 4) and variance 4/(n + 4). A 95% Bayes interval for μ, which is symmetric about the posterior mean, would be

\[ \frac{n\bar{x} + 8}{n + 4} - z_{0.025}\sqrt{\frac{4}{n + 4}} \le \mu \le \frac{n\bar{x} + 8}{n + 4} + z_{0.025}\sqrt{\frac{4}{n + 4}}. \]  (10-66)

If a random sample of size 16 is taken and we find that x̄ = 2.5, equation 10-66 reduces to

1.52 ≤ μ ≤ 3.28.

If we ignore the prior information, the classical confidence interval for μ is

\[ 2.5 - 1.96\sqrt{4/16} \le \mu \le 2.5 + 1.96\sqrt{4/16}, \quad \text{or} \quad 1.52 \le \mu \le 3.48. \]

We see that the Bayes interval is slightly shorter than the classical confidence interval, because the prior information acts like a slight increase in the effective sample size relative to the case in which no prior knowledge is assumed.
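The posterior computation in Example 10-25 can be verified with a short sketch (not from the text):

```python
# Sketch of Example 10-25: X ~ N(mu, 4), prior mu ~ N(2, 1), n = 16, xbar = 2.5.
from scipy.stats import norm
import math

n, xbar = 16, 2.5
post_mean = (n * xbar + 8) / (n + 4)   # posterior mean = 2.4
post_var = 4 / (n + 4)                 # posterior variance = 0.2
z = norm.ppf(0.975)
lo = post_mean - z * math.sqrt(post_var)
hi = post_mean + z * math.sqrt(post_var)
print(f"{lo:.2f} <= mu <= {hi:.2f}")   # -> 1.52 <= mu <= 3.28
```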

10-7 BOOTSTRAP CONFIDENCE INTERVALS

In Section 10-1.6 we introduced the bootstrap technique for estimating the standard error of an estimator θ̂. The bootstrap technique can also be used to construct a confidence interval on θ. For an arbitrary parameter θ, general 100(1 − α)% lower and upper limits are, respectively,

L = θ̂ − [100(1 − α/2) percentile of (θ̂* − θ̂)],
U = θ̂ − [100(α/2) percentile of (θ̂* − θ̂)].

Bootstrap samples can be generated to estimate the values of L and U.

Suppose B bootstrap samples are generated and θ̂₁*, θ̂₂*, ..., θ̂_B* and their average θ̄* are calculated. From these estimates, we then compute the differences θ̂₁* − θ̄*, θ̂₂* − θ̄*, ..., θ̂_B* − θ̄*, arrange the differences in increasing order, and find the necessary percentiles 100(1 − α/2) and 100(α/2) for L and U. For example, if B = 200 and a 90% confidence interval is desired, then the 100(1 − 0.10/2) = 95th percentile and the 100(0.10/2) = 5th percentile would be the 190th difference and the 10th difference, respectively.

Example 10-26
An electronic device consists of four components. The time to failure for each component follows an exponential distribution, and the components are identical to and independent of one another. The electronic device will fail only after all four components have failed. The times to failure for the electronic components have been collected for 15 such devices. The total times to failure are

78.7778, 13.5260, 6.8291, 47.3746, 16.2033, 27.5387, 28.2515, 38.5826,
35.4363, 80.2757, 50.3861, 81.3155, 42.2532, 33.9970, 57.4312.

It is of interest to construct a 90% confidence interval on the exponential parameter λ. By definition, the sum of r independent and identically distributed exponential random variables follows a gamma distribution and is defined as gamma(r, λ). Therefore, r = 4, but λ needs to be estimated. A bootstrap estimate for λ can be found using the technique given in Section 10-1.6. Using the time-to-failure data above, we find the average time to failure to be x̄ = 42.545. The mean of a gamma distribution is E(X) = r/λ, and λ̂ is calculated for each bootstrap sample. Running Minitab for B = 100 bootstraps, we found that the bootstrap estimate is λ̄* = 0.0949. Using the bootstrap estimates for each sample, the differences can be calculated; some of the calculations are shown in Table 10-4.

When the 100 differences are arranged in increasing order, the 5th percentile and the 95th percentile turn out to be −0.0205 and 0.0232, respectively. Therefore, the resulting confidence limits are

L = 0.0949 − 0.0232 = 0.0717,
U = 0.0949 − (−0.0205) = 0.1154.

We are approximately 90% confident that the true value of λ lies between 0.0717 and 0.1154. Figure 10-10 displays the histogram of the bootstrap estimates λ̂ᵢ*, while Fig. 10-11 depicts the differences λ̂ᵢ* − λ̄*. The bootstrap estimates are reasonable when the estimator is unbiased and the standard error is approximately constant.

Table 10-4 Bootstrap Estimates for Example 10-26

Sample     λ̂ᵢ*          λ̂ᵢ* − λ̄*
  1        0.087316     −0.0075392
  2        0.090689     −0.0041660
  3        0.096664      0.0018094
 ...          ...           ...
100        0.090193     −0.0046623
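The following sketch (not from the text) implements the same percentile-difference bootstrap in Python; since the text's B = 100 resamples were generated in Minitab, a rerun will give slightly different limits.

```python
# Sketch of the Section 10-7 bootstrap applied to the Example 10-26 data.
import numpy as np

rng = np.random.default_rng(1)
times = np.array([78.7778, 13.5260, 6.8291, 47.3746, 16.2033,
                  27.5387, 28.2515, 38.5826, 35.4363, 80.2757,
                  50.3861, 81.3155, 42.2532, 33.9970, 57.4312])
r, B = 4, 100
# lambda-hat = r / xbar for each resample (the gamma(r, lambda) mean is r/lambda)
lam_star = np.array([r / rng.choice(times, size=times.size, replace=True).mean()
                     for _ in range(B)])
lam_bar = lam_star.mean()
diffs = lam_star - lam_bar
L = lam_bar - np.percentile(diffs, 95)
U = lam_bar - np.percentile(diffs, 5)
print(f"approximate 90% interval: {L:.4f} <= lambda <= {U:.4f}")
```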



Figure 10-10 Histogram of the bootstrap estimates λ̂ᵢ*.

Figure 10-11 Histogram of the differences λ̂ᵢ* − λ̄*.

10-8 OTHER INTERVAL ESTIMATION PROBLEMS

10-8.1 Prediction Intervals

So far in this chapter we have presented interval estimators on population parameters, such as the mean μ. There are many situations where the practitioner would like to predict a single future observation of the random variable of interest instead of predicting or estimating the average of this random variable. A prediction interval can be constructed for any single observation at some future time.

Consider a given random sample of size n, X₁, X₂, ..., Xₙ, from a normal population with mean μ and variance σ². Let the sample average be denoted by X̄. Suppose we wish to predict the future observation X_{n+1}. Since X̄ is the point predictor for this observation, the prediction error is given by X_{n+1} − X̄. The expected value and variance of the prediction error are

\[ E(X_{n+1} - \bar{X}) = \mu - \mu = 0 \]

and

\[ V(X_{n+1} - \bar{X}) = \sigma^2 + \frac{\sigma^2}{n} = \sigma^2\left(1 + \frac{1}{n}\right). \]

Since X_{n+1} and X̄ are independent, normally distributed random variables, the prediction error is also normally distributed, and

\[ Z = \frac{X_{n+1} - \bar{X} - 0}{\sqrt{\sigma^2\left(1 + \dfrac{1}{n}\right)}} \]

is standard normal. If σ² is unknown, it can be estimated by the sample variance S², and then

\[ T = \frac{X_{n+1} - \bar{X}}{\sqrt{S^2\left(1 + \dfrac{1}{n}\right)}} \]

follows the t distribution with n − 1 degrees of freedom.


Following the usual procedure for constructing confidence intervals, we have

\[ P\left\{ -t_{\alpha/2,\,n-1} \le T \le t_{\alpha/2,\,n-1} \right\} = 1 - \alpha. \]

By rearranging the inequality we obtain the final form for the two-sided 100(1 − α)% prediction interval:

\[ \bar{X} - t_{\alpha/2,\,n-1}\sqrt{S^2\left(1 + \frac{1}{n}\right)} \le X_{n+1} \le \bar{X} + t_{\alpha/2,\,n-1}\sqrt{S^2\left(1 + \frac{1}{n}\right)}. \]  (10-67)

The lower one-sided 100(1 − α)% prediction interval on X_{n+1} is given by

\[ \bar{X} - t_{\alpha,\,n-1}\sqrt{S^2\left(1 + \frac{1}{n}\right)} \le X_{n+1}. \]  (10-68)

The upper one-sided 100(1 − α)% prediction interval on X_{n+1} is given by

\[ X_{n+1} \le \bar{X} + t_{\alpha,\,n-1}\sqrt{S^2\left(1 + \frac{1}{n}\right)}. \]  (10-69)

Example 10-27
Maximum forces experienced by a transport aircraft for an airline on a particular route for 10 flights are (in units of gravity, g)

1.15, 1.23, 1.56, 1.69, 1.71, 1.83, 1.83, 1.85, 1.90, 1.91.

The sample average and sample standard deviation are calculated to be x̄ = 1.666 and s = 0.273, respectively. It may be of importance to predict the next maximum force experienced by the aircraft. Since t₀.₀₂₅,₉ = 2.262, the 95% prediction interval on X₁₁ is

\[ 1.666 - 2.262\sqrt{(0.273)^2\left(1 + \frac{1}{10}\right)} \le X_{11} \le 1.666 + 2.262\sqrt{(0.273)^2\left(1 + \frac{1}{10}\right)}, \]

\[ 1.018 \le X_{11} \le 2.314. \]
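A sketch (not from the text) reproducing Example 10-27 from the raw data:

```python
# Sketch of the two-sided prediction interval (equation 10-67).
import numpy as np
from scipy.stats import t

g = np.array([1.15, 1.23, 1.56, 1.69, 1.71, 1.83, 1.83, 1.85, 1.90, 1.91])
n, xbar, s = g.size, g.mean(), g.std(ddof=1)      # about 1.666 and 0.273
half = t.ppf(0.975, n - 1) * s * np.sqrt(1 + 1/n)
print(f"{xbar - half:.3f} <= X11 <= {xbar + half:.3f}")
# -> 1.018 <= X11 <= 2.314
```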

10-8.2 Tolerance Intervals

As presented earlier in this chapter, confidence intervals are intervals in which we expect the true population parameter, such as μ, to lie. In contrast, tolerance intervals are intervals in which we expect a percentage of the population values to lie.

Suppose that X is a normally distributed random variable with mean μ and variance σ². We would expect approximately 95% of all values of X to be contained within the interval μ ± 1.96σ. But what if μ and σ are unknown and must be estimated? Using the point estimates x̄ and s for a sample of size n, we can construct the interval x̄ ± 1.96s. Unfortunately, due to the variability in estimating μ and σ, the resulting interval may contain less than 95% of the values. In this particular instance, a value larger than 1.96 will be needed to guarantee 95% coverage when using point estimates for the population parameters. We can construct an interval that will contain the stated percentage of population values and be relatively confident in the result. For example, we may want to be 90% confident that the resulting interval covers at least 95% of the population values. This type of interval is referred to as a tolerance interval and can be constructed easily for various confidence levels.

In general, for 0 < q < 100, the two-sided tolerance interval for covering at least q% of the values from a normal population with 100(1 − α)% confidence is x̄ ± ks. The value k is a constant tabulated for various combinations of q and 100(1 − α). Values of k are given in Table XIV of the Appendix for q = 90, 95, and 99 and for 100(1 − α) = 90, 95, and 99.

The lower one-sided tolerance interval for covering at least q% of the values from a normal population with 100(1 − α)% confidence is x̄ − ks. The upper one-sided tolerance interval for covering at least q% of the values from a normal population with 100(1 − α)% confidence is x̄ + ks. Various values of k for one-sided tolerance intervals were calculated using the technique given in Odeh and Owens (1980) and are provided in Table XIV of the Appendix.

Example 10-28
Reconsider the maximum forces for the transport aircraft in Example 10-27. A two-sided tolerance interval is desired that would cover 99% of all maximum forces with 95% confidence. From Table XIV (Appendix), with 1 − α = 0.95, q = 0.99, and n = 10, we find that k = 4.433. The sample average and sample standard deviation were calculated as x̄ = 1.666 and s = 0.273, respectively. The resulting tolerance interval is then

1.666 ± 4.433(0.273)

or

(0.456, 2.876).

Therefore, we conclude that we are 95% confident that at least 99% of all maximum forces would lie between 0.456 g and 2.876 g.

It is possible to construct nonparametric tolerance intervals that are based on the extreme values in a random sample of size n from any continuous population. If P is the minimum proportion of the population contained between the largest and smallest observations with confidence 1 − α, then it can be shown that

\[ nP^{n-1} - (n - 1)P^{n} = \alpha, \]

and the required n is approximately

\[ n = \frac{1}{2} + \frac{1 + P}{1 - P} \cdot \frac{\chi^2_{\alpha,4}}{4}. \]  (10-70)

Thus, in order to be 95% certain that at least 90% of the population will be included between the extreme values of the sample, we require a sample of size

\[ n = \frac{1}{2} + \frac{1.9}{0.1} \cdot \frac{9.488}{4} = 46. \]
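The sample-size approximation in equation 10-70 is easy to evaluate with scipy's chi-square quantile; a minimal sketch (not from the text):

```python
# Sketch of the nonparametric tolerance sample size (equation 10-70).
from scipy.stats import chi2
import math

def nonparametric_n(P, alpha):
    """Approximate n so the sample extremes cover at least proportion P
    of the population with confidence 1 - alpha (equation 10-70)."""
    return math.ceil(0.5 + (1 + P) / (1 - P) * chi2.ppf(1 - alpha, 4) / 4)

print(nonparametric_n(0.90, 0.05))  # -> 46, as computed in the text
```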

Note that there is a fundamental difference between confidence limits and tolerance limits. Confidence limits (and thus confidence intervals) are used to estimate a parameter of a population, while tolerance limits (and tolerance intervals) are used to indicate the limits between which we can expect to find a proportion of a population. As n approaches infinity, the length of a confidence interval approaches zero, while tolerance limits approach the corresponding quantiles for the population.

10-9 SUMMARY

This chapter has introduced the point and interval estimation of unknown parameters. A number of methods of obtaining point estimators were discussed, including the method of maximum likelihood and the method of moments. The method of maximum likelihood usually leads to estimators that have good statistical properties. Confidence intervals were derived for a variety of parameter estimation problems. These intervals have a frequency interpretation. The two-sided confidence intervals developed in Sections 10-2 and 10-3 are summarized in Table 10-5. In some instances, one-sided confidence intervals may be appropriate. These may be obtained by setting one confidence limit in the two-sided confidence interval equal to the lower (or upper) limit of a feasible region for the parameter, and using α instead of α/2 as the probability level on the remaining upper (or lower) confidence limit. Confidence intervals using a bootstrapping technique were introduced. Tolerance intervals were also presented. Approximate confidence intervals in maximum likelihood estimation and simultaneous confidence intervals were also briefly introduced.
Table 10-5 Summary of Confidence Interval Procedures

1. Mean μ of a normal distribution, variance σ² known.
   Point estimator: X̄.
   Two-sided 100(1 − α)% interval: X̄ − z_{α/2} σ/√n ≤ μ ≤ X̄ + z_{α/2} σ/√n.

2. Difference in means of two normal distributions, μ₁ and μ₂, variances σ₁² and σ₂² known.
   Point estimator: X̄₁ − X̄₂.
   Two-sided 100(1 − α)% interval: X̄₁ − X̄₂ − z_{α/2} √(σ₁²/n₁ + σ₂²/n₂) ≤ μ₁ − μ₂ ≤ X̄₁ − X̄₂ + z_{α/2} √(σ₁²/n₁ + σ₂²/n₂).

3. Mean μ of a normal distribution, variance σ² unknown.
   Point estimator: X̄.
   Two-sided 100(1 − α)% interval: X̄ − t_{α/2,n−1} S/√n ≤ μ ≤ X̄ + t_{α/2,n−1} S/√n.

4. Difference in means of two normal distributions, μ₁ − μ₂, variances σ₁² = σ₂² unknown.
   Point estimator: X̄₁ − X̄₂.
   Two-sided 100(1 − α)% interval: X̄₁ − X̄₂ − t_{α/2,n₁+n₂−2} S_p √(1/n₁ + 1/n₂) ≤ μ₁ − μ₂ ≤ X̄₁ − X̄₂ + t_{α/2,n₁+n₂−2} S_p √(1/n₁ + 1/n₂),
   where S_p = √[((n₁ − 1)S₁² + (n₂ − 1)S₂²)/(n₁ + n₂ − 2)].

5. Difference in means of two normal distributions for paired samples, μ_D = μ₁ − μ₂.
   Point estimator: D̄.
   Two-sided 100(1 − α)% interval: D̄ − t_{α/2,n−1} S_D/√n ≤ μ_D ≤ D̄ + t_{α/2,n−1} S_D/√n.

6. Variance σ² of a normal distribution.
   Point estimator: S².
   Two-sided 100(1 − α)% interval: (n − 1)S²/χ²_{α/2,n−1} ≤ σ² ≤ (n − 1)S²/χ²_{1−α/2,n−1}.

7. Ratio of the variances σ₁²/σ₂² of two normal distributions.
   Point estimator: S₁²/S₂².
   Two-sided 100(1 − α)% interval: (S₁²/S₂²) F_{1−α/2,n₂−1,n₁−1} ≤ σ₁²/σ₂² ≤ (S₁²/S₂²) F_{α/2,n₂−1,n₁−1}.

8. Proportion or parameter of a binomial distribution, p.
   Point estimator: p̂.
   Two-sided 100(1 − α)% interval: p̂ − z_{α/2} √(p̂(1 − p̂)/n) ≤ p ≤ p̂ + z_{α/2} √(p̂(1 − p̂)/n).

9. Difference in two proportions or two binomial parameters, p₁ − p₂.
   Point estimator: p̂₁ − p̂₂.
   Two-sided 100(1 − α)% interval: p̂₁ − p̂₂ − z_{α/2} √(p̂₁(1 − p̂₁)/n₁ + p̂₂(1 − p̂₂)/n₂) ≤ p₁ − p₂ ≤ p̂₁ − p̂₂ + z_{α/2} √(p̂₁(1 − p̂₁)/n₁ + p̂₂(1 − p̂₂)/n₂).

10-10 EXERCISES
10-1. Suppose we have a random sample of size 2n from a population denoted X, with E(X) = μ and V(X) = σ². Let

X̄₁ = (1/2n) Σᵢ₌₁²ⁿ Xᵢ  and  X̄₂ = (1/n) Σᵢ₌₁ⁿ Xᵢ

be two estimators of μ. Which is the better estimator of μ? Explain your choice.

10-2. Let X₁, X₂, ..., X₇ denote a random sample from a population having mean μ and variance σ². Consider the following estimators of μ:

θ̂₁ = (X₁ + X₂ + ··· + X₇)/7,
θ̂₂ = (2X₁ − X₆ + X₄)/2.

Is either estimator unbiased? Which estimator is "better"? In what sense is it better?

10-3. Suppose that θ̂₁ and θ̂₂ are estimators of the parameter θ. We know that E(θ̂₁) = θ, E(θ̂₂) = θ/2, V(θ̂₁) = 10, and V(θ̂₂) = 4. Which estimator is "better"? In what sense is it better?

10-4. Suppose that θ̂₁, θ̂₂, and θ̂₃ are estimators of θ. We know that E(θ̂₁) = E(θ̂₂) = θ, E(θ̂₃) ≠ θ, V(θ̂₁) = 12, V(θ̂₂) = 10, and E(θ̂₃ − θ)² = 6. Compare these three estimators. Which do you prefer? Why?

10-5. Let three random samples of sizes n₁ = 10, n₂ = 8, and n₃ = 6 be taken from a population with mean μ and variance σ². Let S₁², S₂², and S₃² be the sample variances. Show that

S² = (10S₁² + 8S₂² + 6S₃²)/24

is an unbiased estimator of σ².

10-6. Best Linear Unbiased Estimators. An estimator θ̂ is called a linear estimator if it is a linear combination of the observations in the sample. θ̂ is called a best linear unbiased estimator if, of all linear functions of the observations, it both is unbiased and has minimum variance. Show that the sample mean X̄ is the best linear unbiased estimator of the population mean μ.

10-7. Find the maximum likelihood estimator of the parameter c of the Poisson distribution, based on a random sample of size n.

10-8. Find the estimator of c in the Poisson distribution by the method of moments, based on a random sample of size n.

10-9. Find the maximum likelihood estimator of the parameter λ in the exponential distribution, based on a random sample of size n.

10-10. Find the estimator of λ in the exponential distribution by the method of moments, based on a random sample of size n.

10-11. Find moment estimators of the parameters r and λ of the gamma distribution, based on a random sample of size n.

10-12. Let X be a geometric random variable with parameter p. Find an estimator of p by the method of moments, based on a random sample of size n.

10-13. Let X be a geometric random variable with parameter p. Find the maximum likelihood estimator of p, based on a random sample of size n.

10-14. Let X be a Bernoulli random variable with parameter p. Find an estimator of p by the method of moments, based on a random sample of size n.

10-15. Let X be a binomial random variable with parameters n (known) and p. Find an estimator of p by the method of moments, based on a random sample of size N.

10-16. Let X be a binomial random variable with parameters n and p, both unknown. Find estimators of n and p by the method of moments, based on a random sample of size N.

10-17. Let X be a binomial random variable with parameters n (unknown) and p. Find the maximum likelihood estimator of p, based on a random sample of size N.

10-18. Set up the likelihood function for a random sample of size n from a Weibull distribution. What difficulties would be encountered in obtaining the maximum likelihood estimators of the three parameters of the Weibull distribution?

10-19. Prove that if θ̂ is an unbiased estimator of θ, and if lim_{n→∞} V(θ̂) = 0, then θ̂ is a consistent estimator of θ.

10-20. Let X be a random variable with mean μ and variance σ². Given two random samples of sizes n₁ and n₂ with sample means X̄₁ and X̄₂, respectively, show that

X̄ = aX̄₁ + (1 − a)X̄₂,  0 < a < 1,

is an unbiased estimator of μ. Assuming X̄₁ and X̄₂ to be independent, find the value of a that minimizes the variance of X̄.

10-21. Suppose that the random variable X has the probability distribution

f(x) = (γ + 1)x^γ, 0 < x < 1,
     = 0, otherwise.

Let X₁, X₂, ..., Xₙ be a random sample of size n. Find the maximum likelihood estimator of γ.

10-22. Let X have the truncated (on the left at x₀) exponential distribution

f(x) = λ exp[−λ(x − x₀)], x > x₀ > 0,
     = 0, otherwise.

Let X₁, X₂, ..., Xₙ be a random sample of size n. Find the maximum likelihood estimator of λ.

10-23. Assume that λ in the previous exercise is known but x₀ is unknown. Obtain the maximum likelihood estimator of x₀.

10-24. Let X be a random variable with mean μ and variance σ², and let X₁, X₂, ..., Xₙ be a random sample of size n from X. Show that the estimator G = K Σᵢ₌₁ⁿ⁻¹ (Xᵢ₊₁ − Xᵢ)² is unbiased for σ² for an appropriate choice of K. Find the appropriate value for K.

10-25. Let X be a normally distributed random variable with mean μ and variance σ². Assume that σ² is known and μ unknown. The prior density for μ is assumed to be normal with mean μ₀ and variance σ₀². Determine the posterior density for μ, given a random sample of size n from X.

10-26. Let X be normally distributed with known mean μ and unknown variance σ². Assume that the prior density for 1/σ² is a gamma distribution with parameters m + 1 and mσ₀². Determine the posterior density for 1/σ², given a random sample of size n from X.

10-27. Let X be a geometric random variable with parameter p. Suppose we assume a beta distribution with parameters a and b as the prior density for p. Determine the posterior density for p, given a random sample of size n from X.

10-28. Let X be a Bernoulli random variable with parameter p. If the prior density for p is a beta distribution with parameters a and b, determine the posterior density for p, given a random sample of size n from X.

10-29. Let X be a Poisson random variable with parameter λ. The prior density for λ is a gamma distribution with parameters m + 1 and (m + 1)/λ₀. Determine the posterior density for λ, given a random sample of size n from X.

10-30. Suppose that X ~ N(μ, 40), and let the prior density for μ be N(4, 8). For a random sample of size 25, the value x̄ = 4.85 is obtained. What is the Bayes estimate of μ, assuming a squared-error loss?

10-31. A process manufactures printed circuit boards. A locating notch is drilled a distance X from a component hole on the board. The distance is a random variable X ~ N(μ, 0.01). The prior density for μ is uniform between 0.98 and 1.20 inches. A random sample of size 4 produces the value x̄ = 1.05. Assuming a squared-error loss, determine the Bayes estimate of μ.

10-32. The time between failures of a milling machine is exponentially distributed with parameter λ. Suppose we assume an exponential prior on λ with a mean of 3000 hours. Two machines are observed and the average time between failures is x̄ = 3135 hours. Assuming a squared-error loss, determine the Bayes estimate of λ.

10-33. The weight of boxes of candy is normally distributed with mean μ and known variance σ². It is reasonable to assume a prior density for μ that is normal with a mean of 10 pounds and a variance of σ₀². Determine the Bayes estimate of μ given that a sample of size 25 produces x̄ = 10.05 pounds. If boxes that weigh less than 9.95 pounds are defective, what is the probability that defective boxes will be produced?

10-34. The number of defects that occur on a silicon wafer used in integrated circuit manufacturing is known to be a Poisson random variable with parameter λ. Assume that the prior density for λ is exponential with a parameter of 0.25. A total of 45 defects were observed on 10 wafers. Set up an integral that defines a 95% Bayes interval for λ. What difficulties would you encounter in evaluating this integral?

10-35. The random variable X has the density function

f(x|θ) = 2x/θ², 0 < x < θ,

and the prior density for θ is

f(θ) = 1, 0 < θ < 1.

(a) Find the posterior density for θ, assuming n = 1.
(b) Find the Bayes estimator for θ, assuming the loss function ℓ(θ̂; θ) = θ̂²(θ̂ − θ)² and n = 1.

10-36. Let X follow the Bernoulli distribution with parameter p. Assume a reasonable prior density for p to be

f(p) = 6p(1 − p), 0 ≤ p ≤ 1,
     = 0, otherwise.

If the loss function is squared error, find the Bayes estimator of p if one observation is available. If the loss function is ℓ(p̂; p) = 2(p̂ − p)², find the Bayes estimator of p for n = 1.

10-37. Consider the confidence interval for μ with known standard deviation σ:

X̄ − z_{α₁} σ/√n ≤ μ ≤ X̄ + z_{α₂} σ/√n,

where α₁ + α₂ = α. Let α = 0.05 and find the interval for α₁ = α₂ = α/2 = 0.025. Now find the interval for the case α₁ = 0.01 and α₂ = 0.04. Which interval is shorter? Is there any advantage to a "symmetric" confidence interval?

10-38. When X₁, X₂, ..., Xₙ are independent Poisson random variables, each with parameter λ, and when n is relatively large, the sample mean X̄ is approximately normal with mean λ and variance λ/n.
(a) What is the distribution of the statistic (X̄ − λ)/√(λ/n)?
(b) Use the results of (a) to find a 100(1 − α)% confidence interval for λ.

10-39. A manufacturer produces piston rings for an automobile engine. It is known that ring diameter is approximately normally distributed and has standard deviation σ = 0.001 mm. A random sample of 15 rings has a mean diameter of x̄ = 74.036 mm.
(a) Construct a 99% two-sided confidence interval on the mean piston ring diameter.
(b) Construct a 95% lower-confidence limit on the mean piston ring diameter.

10-40. The life in hours of a 75-W light bulb is known to be approximately normally distributed, with a standard deviation σ = 25 hours. A random sample of 20 bulbs has a mean life of x̄ = 1014 hours.
(a) Construct a 95% two-sided confidence interval on the mean life.
(b) Construct a 95% lower-confidence interval on the mean life.

10-41. A civil engineer is analyzing the compressive strength of concrete. Compressive strength is approximately normally distributed with a variance σ² = 1000 (psi)². A random sample of 12 specimens has a mean compressive strength of x̄ = 3250 psi.
(a) Construct a 95% two-sided confidence interval on mean compressive strength.
(b) Construct a 99% two-sided confidence interval on mean compressive strength. Compare the width of this confidence interval with the width of the one found in part (a).

10-42. Suppose that in Exercise 10-40 we wanted to be 95% confident that the error in estimating the mean life is less than 5 hours. What sample size should be used?

10-43. Suppose that in Exercise 10-40 we wanted the total width of the confidence interval on mean life to be 8 hours. What sample size should be used?

10-44. Suppose that in Exercise 10-41 it is desired to estimate the compressive strength with an error that is less than 15 psi. What sample size is required?

10-45. Two machines are used to fill plastic bottles with dishwashing detergent. The standard deviations of fill volume are known to be σ₁ = 0.15 fluid ounces and σ₂ = 0.18 fluid ounces for the two machines, respectively. Two random samples of n₁ = 12 bottles from machine 1 and n₂ = 10 bottles from machine 2 are selected, and the sample mean fill volumes are x̄₁ = 30.87 fluid ounces and x̄₂ = 30.68 fluid ounces.
(a) Construct a 90% two-sided confidence interval on the mean difference in fill volume.
(b) Construct a 95% two-sided confidence interval on the mean difference in fill volume. Compare the width of this interval to the width of the interval in part (a).
(c) Construct a 95% upper-confidence interval on the mean difference in fill volume.

10-46. The burning rates of two different solid-fuel rocket propellants are being studied. It is known that both propellants have approximately the same standard deviation of burning rate; that is, σ₁ = σ₂ = 3 cm/s. Two random samples of n₁ = 20 and n₂ = 20 specimens are tested, and the sample mean burning rates are x̄₁ = 18 cm/s and x̄₂ = 24 cm/s. Construct a 99% confidence interval on the mean difference in burning rate.

10-47. Two different formulations of a lead-free gasoline are being tested to study their road octane numbers. The variance of road octane number for formulation 1 is σ₁² = 1.5 and for formulation 2 it is σ₂² = 1.2. Two random samples of size n₁ = 15 and n₂ = 20 are tested, and the mean road octane numbers observed are x̄₁ = 89.6 and x̄₂ = 92.5. Construct a 95% two-sided confidence interval on the difference in mean road octane number.

10-48. The compressive strength of concrete is being tested by a civil engineer. He tests 16 specimens and obtains the following data:

2216  2237  2249  2204
2225  2301  2281  2263
2318  2255  2275  2295
2250  2238  2300  2217

(a) Construct a 95% two-sided confidence interval on the mean strength.
(b) Construct a 95% lower-confidence interval on the mean strength.
(c) Construct a 95% two-sided confidence interval on the mean strength assuming that σ = 36. Compare this interval with the one from part (a).
(d) Construct a 95% two-sided prediction interval for a single compressive strength.
(e) Construct a two-sided tolerance interval that would cover 99% of all compressive strengths with 95% confidence.

10-49. An article in Annual Reviews Material Research (2001, p. 291) presents bond strengths for various energetic materials (explosives, propellants,

and pyrotechnics). Bond strengths for 15 such materials are shown below. Construct a two-sided 95% confidence interval on the mean bond strength.

323, 312, 300, 284, 283, 261, 207, 183, 180, 179, 174, 167, 167, 157, 120.

10-50. The wall thickness of 25 glass 2-liter bottles was measured by a quality-control engineer. The sample mean was x̄ = 4.05 mm and the sample standard deviation was s = 0.08 mm. Find a 90% lower-confidence interval on the mean wall thickness.

10-51. An industrial engineer is interested in estimating the mean time required to assemble a printed circuit board. How large a sample is required if the engineer wishes to be 95% confident that the error in estimating the mean is less than 0.25 minutes? The standard deviation of assembly time is 0.45 minutes.

10-52. A random sample of size 15 from a normal population has mean x̄ = 550 and variance s² = 49. Find the following:
(a) A 95% two-sided confidence interval on μ.
(b) A 95% lower-confidence interval on μ.
(c) A 95% upper-confidence interval on μ.
(d) A 95% two-sided prediction interval for a single observation.
(e) A two-sided tolerance interval that would cover 90% of all observations with 99% confidence.

10-53. An article in Computers in Cardiology (1993, p. 317) presents the results of a heart stress test in which the stress is induced by a particular drug. The heart rates (in beats per minute) of nine male patients after the drug is administered are recorded. The average heart rate was found to be x̄ = 102.9 (bpm) with a sample standard deviation of s = 13.9 (bpm). Find a 90% confidence interval on the mean heart rate after the drug is administered.

10-54. Two independent random samples of sizes n₁ = 18 and n₂ = 20 are taken from two normal populations. The sample means are x̄₁ = 200 and x̄₂ = 190. We know that the variances are σ₁² = 15 and σ₂² = 12. Find the following:
(a) A 95% two-sided confidence interval on μ₁ − μ₂.
(b) A 95% lower-confidence interval on μ₁ − μ₂.
(c) A 95% upper-confidence interval on μ₁ − μ₂.

10-55. The output voltage from two different types of transformers is being investigated. Ten transformers of each type are selected at random and the voltage measured. The sample means are 12.13 volts and 12.05 volts. We know that the variances of output voltage for the two types of transformers are σ₁² = 0.7 and σ₂² = 0.8, respectively. Construct a 95% two-sided confidence interval on the difference in mean voltage.

10-56. Random samples of size 20 were drawn from two independent normal populations. The sample means and standard deviations were x̄₁ = 22.0, s₁ = 1.8, x̄₂ = 21.5, and s₂ = 1.5. Assuming that σ₁² = σ₂², find the following:
(a) A 95% two-sided confidence interval on μ₁ − μ₂.
(b) A 95% upper-confidence interval on μ₁ − μ₂.
(c) A 95% lower-confidence interval on μ₁ − μ₂.

10-57. The diameter of steel rods manufactured on two different extrusion machines is being investigated. Two random samples of sizes n₁ = 15 and n₂ = 18 are selected, and the sample means and sample variances are x̄₁ = 8.73, s₁² = 0.30, x̄₂ = 8.68, and s₂² = 0.34, respectively. Assuming that σ₁² = σ₂², construct a 95% two-sided confidence interval on the difference in mean rod diameter.

10-58. Random samples of sizes n₁ = 15 and n₂ = 10 are drawn from two independent normal populations. The sample means and variances are x̄₁ = 300, s₁² = 16, x̄₂ = 325, s₂² = 49. Assuming that σ₁² ≠ σ₂², construct a 95% two-sided confidence interval on μ₁ − μ₂.

10-59. Consider the data in Exercise 10-48. Construct the following:
(a) A 95% two-sided confidence interval on σ².
(b) A 95% lower-confidence interval on σ².
(c) A 95% upper-confidence interval on σ².

10-60. Consider the data in Exercise 10-49. Construct the following:
(a) A 99% two-sided confidence interval on σ².
(b) A 99% lower-confidence interval on σ².
(c) A 99% upper-confidence interval on σ².

10-61. Construct a 95% two-sided confidence interval on the variance of the wall thickness data in Exercise 10-50.

10-62. In a random sample of 100 light bulbs, the sample standard deviation of bulb life was found to be 12.6 hours. Compute a 90% upper-confidence interval on the variance of bulb life.

10-63. Consider the data in Exercise 10-56. Construct a 95% two-sided confidence interval on the ratio of the population variances σ₁²/σ₂².

10-64. Consider the data in Exercise 10-57. Construct the following:
(a) A 90% two-sided confidence interval on σ₁²/σ₂².
(b) A 95% two-sided confidence interval on σ₁²/σ₂². Compare the width of this interval with the width of the interval in part (a).
(c) A 90% lower-confidence interval on σ₁²/σ₂².
(d) A 90% upper-confidence interval on σ₁²/σ₂².

10-65. Construct a 95% two-sided confidence interval on the ratio of the variances σ₁²/σ₂² using the data in Exercise 10-58.

10-66. Of 400 randomly selected motorists, 48 were found to be uninsured. Construct a 95% two-sided confidence interval on the uninsured rate for motorists.

10-67. How large a sample would be required in Exercise 10-66 to be 95% confident that the error in estimating the uninsured rate for motorists is less than 0.03?

10-68. A manufacturer of electronic calculators is interested in estimating the fraction of defective units produced. A random sample of 8000 calculators contains 18 defectives. Compute a 99% upper-confidence interval on the fraction defective.

10-69. A study is to be conducted of the percentage of homeowners who own at least two television sets. How large a sample is required if we wish to be 99% confident that the error in estimating this quantity is less than 0.01?

10-70. A study is conducted to determine whether there is a significant difference in union membership based on sex of the person. A random sample of 5000 factory-employed men were polled, and of this group, 785 were members of a union. A random sample of 3000 factory-employed women were also polled, and of this group, 327 were members of a union. Construct a 99% confidence interval on the difference in proportions p₁ − p₂.

10-71. The fraction of defective product produced by two production lines is being analyzed. A random sample of 1000 units from line 1 has 10 defectives, while a random sample of 1200 units from line 2 has 25 defectives. Find a 99% confidence interval on the difference in fraction defective produced by the two lines.

10-72. The results of a study on powered wheelchair driving performance were presented in the Proceedings of the IEEE 24th Annual Northeast Bioengineering Conference (1998, p. 130). In this study, the effects of two types of joysticks, force sensing (FSJ) and position sensing (PSJ), on power wheelchair control were investigated. Each of 10 subjects was asked to test both joysticks. One response of interest is the time (in seconds) to complete a predetermined course. Data typical of this type of experiment are as follows:

Subject   PSJ    FSJ
 1        25.9   33.4
 2        30.2   37.4
 3        33.7   48.0
 4        27.6   30.5
 5        33.3   27.8
 6        34.6   27.5
 7        33.1   36.9
 8        30.6   31.1
 9        30.5   27.1
10        25.4   38.0

Find a 95% confidence interval on the difference in mean completion times. Is there any indication that one joystick is preferable?

10-73. The manager of a fleet of automobiles is testing two brands of radial tires. He assigns one tire of each brand at random to the two rear wheels of eight cars and runs the cars until the tires wear out. The data (in kilometers) are shown below:

Car   Brand 1   Brand 2
1     36,925    34,318
2     45,300    42,280
3     36,240    35,500
4     32,100    31,950
5     37,210    38,015
6     48,360    47,800
7     38,200    37,810
8     33,500    33,215

Find a 95% confidence interval on the difference in mean mileage. Which brand do you prefer?

10-74. Consider the data in Exercise 10-50. Find confidence intervals on μ and σ² such that we are at least 90% confident that both intervals simultaneously lead to correct conclusions.

10-75. Consider the data in Exercise 10-56. Suppose that a random sample of size n₃ = 15 is obtained from a third normal population, with x̄₃ = 20.5 and s₃ = 1.2. Find two-sided confidence intervals on μ₁ − μ₂, μ₁ − μ₃, and μ₂ − μ₃ such that the probability is at least 0.95 that all three intervals simultaneously lead to correct conclusions.

10-76. A random variable X is normally distributed with mean μ and variance σ² = 10. The prior density for μ is uniform between 6 and 12. A random sample
of size 16 yields x̄ = 8. Construct a 90% Bayes interval for μ. Could you reasonably accept the hypothesis that μ = 9?

10-77. Let X be a normally distributed random variable with mean μ = 5 and unknown variance σ². The prior density for 1/σ² is a gamma distribution with parameters r = 3 and λ = 1.0. Determine the posterior density for 1/σ². If a random sample of size 10 yields Σ(xᵢ − 5)² = 4.92, determine the Bayes estimate of 1/σ², assuming a squared-error loss. Set up an integral that defines a 90% Bayes interval for 1/σ².

10-78. Prove that if a squared-error loss function is used, the Bayes estimator of θ is the mean of the posterior distribution for θ.
Chapter 11

Tests of Hypotheses
Many problems require that we decide whether to accept or reject a statement about some parameter. The statement is usually called a hypothesis, and the decision-making procedure about the hypothesis is called hypothesis testing. This is one of the most useful aspects of statistical inference, since many types of decision problems can be formulated as hypothesis-testing problems. This chapter will develop hypothesis-testing procedures for several important situations.

11-1 INTRODUCTION
11-1.1 Statistical Hypotheses
A statistical hypothesis is a statement about the probability distribution of a random variable. Statistical hypotheses often involve one or more parameters of this distribution. For example, suppose that we are interested in the mean compressive strength of a particular type of concrete. Specifically, we are interested in deciding whether or not the mean compressive strength (say μ) is 2500 psi. We may express this formally as

H₀: μ = 2500 psi,  (11-1)
H₁: μ ≠ 2500 psi.

The statement H₀: μ = 2500 psi in equation 11-1 is called the null hypothesis, and the statement H₁: μ ≠ 2500 psi is called the alternative hypothesis. Since the alternative hypothesis specifies values of μ that could be either greater than 2500 psi or less than 2500 psi, it is called a two-sided alternative hypothesis. In some situations, we may wish to formulate a one-sided alternative hypothesis, as in

H₀: μ = 2500 psi,  (11-2)
H₁: μ > 2500 psi.
It is important to remember that hypotheses are always statements about the population or distribution under study, not statements about the sample. The value of the population parameter specified in the null hypothesis (2500 psi in the above example) is usually determined in one of three ways. First, it may result from past experience or knowledge of the process, or even from prior experimentation. The objective of hypothesis testing then is usually to determine whether the experimental situation has changed. Second, this value may be determined from some theory or model regarding the process under study. Here the objective of hypothesis testing is to verify the theory or model. A third situation arises when the value of the population parameter results from external considerations, such as design or engineering specifications, or from contractual obligations. In this situation, the usual objective of hypothesis testing is conformance testing.

We are interested in making a decision about the truth or falsity of a hypothesis. A procedure leading to such a decision is called a test of a hypothesis. Hypothesis-testing procedures rely on using the information in a random sample from the population of interest. If this information is consistent with the hypothesis, then we would conclude that the hypothesis is true; however, if this information is inconsistent with the hypothesis, we would conclude that the hypothesis is false.

To test a hypothesis, we must take a random sample, compute an appropriate test statistic from the sample data, and then use the information contained in this test statistic to make a decision. For example, in testing the null hypothesis concerning the mean compressive strength of concrete in equation 11-1, suppose that a random sample of 10 concrete specimens is tested and the sample mean x̄ is used as a test statistic. If x̄ > 2550 psi or if x̄ < 2450 psi, we will consider the mean compressive strength of this particular type of concrete to be different from 2500 psi. That is, we would reject the null hypothesis H₀: μ = 2500. Rejecting H₀ implies that the alternative hypothesis, H₁, is true. The set of all possible values of x̄ that are either greater than 2550 psi or less than 2450 psi is called the critical region or rejection region for the test. Alternatively, if 2450 psi ≤ x̄ ≤ 2550 psi, then we would accept the null hypothesis H₀: μ = 2500. Thus, the interval [2450 psi, 2550 psi] is called the acceptance region for the test. Note that the boundaries of the critical region, 2450 psi and 2550 psi (often called the critical values of the test statistic), have been determined somewhat arbitrarily. In subsequent sections we will show how to construct an appropriate test statistic to determine the critical region for several hypothesis-testing situations.

11-1.2 Type I and Type II Errors


The decision to accept or reject the null h}-pomesis is based on a test statistic computed
from the data in a random sample. ¥/nen a decision is made using the information.in a ran-
dom sample, this decision is subject to error, Two kinds of errors may be made when test~
ing hypotheses. If the null hypothesis is rejected when it is true, then a type I ettor has been
made. If the null hypothesis is accepted when it is false, then a type It error has been made.
The situation is described in Table ll-!.
The probabilities of occurrence of type I and type II errors are given special symbols:

'(t = P (type I error): P {reject HOIHo is true}, (11-3)


f3 =P (type II error) : P (accept HolHo is false). (11-4)

Sometimes it is more convenient to work with the power of the test. where

Power: 1- f3 =P (rejectHolHo is false). (11-5)

Note that the power of the test is the probability that a false null hypothesis is correctly
rejected. Because the results of a test of a hypothesis a..""e subject to error, we cannot '1Jrove"
or "disprove» a Statistical hypothesis. However, it is possible to design test procedures that
control the error probabilities (t and f3 to s"itably small values.

Ta.ble 11-1 Decisions in Hypothesis Testing

is True is False
l=eptH, NQ error Type II etro::
RejectH, lYre I error No error

The probability of type I error α is often called the significance level or size of the test. In the concrete-testing example, a type I error would occur if the sample mean x̄ > 2550 psi or if x̄ < 2450 psi when in fact the true mean compressive strength μ = 2500 psi. Generally, the type I error probability is controlled by the location of the critical region. Thus, it is usually easy in practice for the analyst to set the type I error probability at (or near) any desired value. Since the probability of wrongly rejecting H0 is directly controlled by the decision maker, rejection of H0 is always a strong conclusion. Now suppose that the null hypothesis H0: μ = 2500 psi is false. That is, the true mean compressive strength μ is some value other than 2500 psi. The probability of type II error is not a constant but depends on the true mean compressive strength of the concrete. If μ denotes the true mean compressive strength, then β(μ) denotes the type II error probability corresponding to μ. The function β(μ) is evaluated by finding the probability that the test statistic (in this case X̄) falls in the acceptance region given a particular value of μ. We define the operating characteristic curve (or OC curve) of a test as the plot of β(μ) against μ. An example of an operating characteristic curve for the concrete-testing problem is shown in Fig. 11-1. From this curve, we see that the type II error probability depends on the extent to which H0: μ = 2500 psi is false. For example, note that β(2700) < β(2600). Thus we can think of the type II error probability as a measure of the ability of the test procedure to detect a particular deviation from the null hypothesis H0. Small deviations are harder to detect than large ones. We also observe that since this is a two-sided alternative hypothesis, the operating characteristic curve is symmetric; that is, β(2400) = β(2600). Furthermore, when μ = 2500 the probability of type II error β = 1 - α.

The probability of type II error is also a function of sample size, as illustrated in Fig. 11-2. From this figure, we see that for a given value of the type I error probability α and a given value of mean compressive strength, the type II error probability decreases as the sample size n increases. That is, a specified deviation of the true mean from the value specified in the null hypothesis is easier to detect for larger sample sizes than for smaller ones. The effect of the type I error probability α on the type II error probability β for a given sample size n is illustrated in Fig. 11-3. Decreasing α causes β to increase, and increasing α causes β to decrease.
Because the type II error probability β is a function of both the sample size and the extent to which the null hypothesis H0 is false, it is customary to think of the decision to accept H0 as a weak conclusion, unless we know that β is acceptably small. Therefore, rather than saying we "accept H0," we prefer the terminology "fail to reject H0." Failing to reject H0 implies that we have not found sufficient evidence to reject H0, that is, to make a strong statement. Thus failing to reject H0 does not necessarily mean that there is a high

Figure 11-1 Operating characteristic curve for the concrete-testing example.

Figure 11-2 The effect of sample size on the operating characteristic curve.

Figure 11-3 The effect of type I error on the operating characteristic curve.

probability that H0 is true. It may imply that more data are required to reach a strong conclusion. This can have important implications for the formulation of hypotheses.

11-1.3 One-Sided and Two-Sided Hypotheses


Because rejecting H0 is always a strong conclusion while failing to reject H0 can be a weak conclusion unless β is known to be small, we usually prefer to construct hypotheses such that the statement about which a strong conclusion is desired is in the alternative hypothesis, H1. Problems for which a two-sided alternative hypothesis is appropriate do not really present the analyst with a choice of formulation. That is, if we wish to test the hypothesis that the mean of a distribution μ equals some arbitrary value, say μ0, and if it is important to detect values of the true mean μ that could be either greater than μ0 or less than μ0, then one must use the two-sided alternative in

H0: μ = μ0,
H1: μ ≠ μ0.

Many hypothesis-testing problems naturally involve a one-sided alternative hypothesis. For example, suppose that we want to reject H0 only when the true value of the mean exceeds μ0. The hypotheses would be

H0: μ = μ0,   (11-6)
H1: μ > μ0.

This would imply that the critical region is located in the upper tail of the distribution of the test statistic. That is, if the decision is to be based on the value of the sample mean X̄, then we would reject H0 in equation 11-6 if x̄ is too large. The operating characteristic curve for the test for this hypothesis is shown in Fig. 11-4, along with the operating characteristic curve for a two-sided test. We observe that when the true mean μ exceeds μ0 (i.e., when the alternative hypothesis H1: μ > μ0 is true), the one-sided test is superior to the two-sided test in the sense that it has a steeper operating characteristic curve. When the true mean μ = μ0, both the one-sided and two-sided tests are equivalent. However, when the true mean μ is less than μ0, the two operating characteristic curves differ. If μ < μ0, the two-sided test has a higher probability of detecting this departure from μ0 than the one-sided test. This is intuitively appealing, as the one-sided test is designed assuming either that μ cannot be less than μ0 or, if μ is less than μ0, that it is desirable to accept the null hypothesis.
In effect there are two different models that can be used for the one-sided alternative hypothesis. For the case where the alternative hypothesis is H1: μ > μ0, these two models are

H0: μ = μ0,   (11-7)
H1: μ > μ0,

and

H0: μ ≤ μ0,   (11-8)
H1: μ > μ0.

In equation 11-7, we are assuming that μ cannot be less than μ0, and the operating characteristic curve is undefined for values of μ < μ0. In equation 11-8, we are assuming that μ can be less than μ0 and that in such a situation it would be desirable to accept H0. Thus for equation 11-8 the operating characteristic curve is defined for all values of μ ≤ μ0. Specifically, if μ ≤ μ0, we have β(μ) = 1 - α(μ), where α(μ) is the significance level as a function of μ. For situations in which the model of equation 11-8 is appropriate, we define the significance level of the test as the maximum value of the type I error probability α; that is, the value of α at μ = μ0. In situations where one-sided alternative hypotheses are appropriate, we will usually write the null hypothesis with an equality; for example, H0: μ = μ0. This will be interpreted as including the cases H0: μ ≤ μ0 or H0: μ ≥ μ0, as appropriate.
In problems where one-sided test procedures are indicated, analysts occasionally experience difficulty in choosing an appropriate formulation of the alternative hypothesis. For example, suppose that a soft drink beverage bottler purchases 10-ounce nonreturnable bottles from a glass company. The bottler wants to be sure that the bottles exceed the specification on mean internal pressure or bursting strength, which for 10-ounce bottles is 200 psi.

Figure 11-4 Operating characteristic curves for two-sided and one-sided tests.

The bottler has decided to formulate the decision procedure for a specific lot of bottles as a hypothesis problem. There are two possible formulations for this problem, either

H0: μ ≤ 200 psi,   (11-9)
H1: μ > 200 psi,

or

H0: μ ≥ 200 psi,   (11-10)
H1: μ < 200 psi.

Consider the formulation in equation 11-9. If the null hypothesis is rejected, the bottles will be judged satisfactory, while if H0 is not rejected, the implication is that the bottles do not conform to specifications and should not be used. Because rejecting H0 is a strong conclusion, this formulation forces the bottle manufacturer to "demonstrate" that the mean bursting strength of the bottles exceeds the specification. Now consider the formulation in equation 11-10. In this situation, the bottles will be judged satisfactory unless H0 is rejected. That is, we would conclude that the bottles are satisfactory unless there is strong evidence to the contrary.

Which formulation is correct, equation 11-9 or equation 11-10? The answer is "it depends." For equation 11-9, there is some probability that H0 will be accepted (i.e., we would decide that the bottles are not satisfactory) even though the true mean is slightly greater than 200 psi. This formulation implies that we want the bottle manufacturer to demonstrate that the product meets or exceeds our specifications. Such a formulation could be appropriate if the manufacturer has experienced difficulty in meeting specifications in the past, or if product safety considerations force us to hold tightly to the 200 psi specification. On the other hand, for the formulation of equation 11-10 there is some probability that H0 will be accepted and the bottles judged satisfactory even though the true mean is slightly less than 200 psi. We would conclude that the bottles are unsatisfactory only when there is strong evidence that the mean does not exceed 200 psi; that is, when H0: μ ≥ 200 psi is rejected. This formulation assumes that we are relatively happy with the bottle manufacturer's past performance and that small deviations from the specification of μ ≥ 200 psi are not harmful.

In formulating one-sided alternative hypotheses, we should remember that rejecting H0 is always a strong conclusion, and consequently, we should put the statement about which it is important to make a strong conclusion in the alternative hypothesis. Often this will depend on our point of view and experience with the situation.

11-2 TESTS OF HYPOTHESES ON A SINGLE SAMPLE

11-2.1 Tests of Hypotheses on the Mean of a Normal Distribution, Variance Known

Statistical Analysis

Suppose that the random variable X represents some process or population of interest. We assume that the distribution of X is either normal or that, if it is nonnormal, the conditions of the Central Limit Theorem hold. In addition, we assume that the mean μ of X is unknown but that the variance σ² is known. We are interested in testing the hypothesis

H0: μ = μ0,   (11-11)
H1: μ ≠ μ0,

where μ0 is a specified constant.

A random sample of size n, X1, X2, ..., Xn, is available. Each observation in this sample has unknown mean μ and known variance σ². The test procedure for H0: μ = μ0 uses the test statistic

Z0 = (X̄ - μ0) / (σ/√n).   (11-12)

If the null hypothesis H0: μ = μ0 is true, then E(X̄) = μ0, and it follows that the distribution of Z0 is N(0, 1). Consequently, if H0: μ = μ0 is true, the probability is 1 - α that a value of the test statistic Z0 falls between -Z_{α/2} and Z_{α/2}, where Z_{α/2} is the percentage point of the standard normal distribution such that P{Z ≥ Z_{α/2}} = α/2 (i.e., Z_{α/2} is the 100(1 - α/2) percentage point of the standard normal distribution). The situation is illustrated in Fig. 11-5. Note that the probability is α that a value of the test statistic Z0 would fall in the region Z0 > Z_{α/2} or Z0 < -Z_{α/2} when H0: μ = μ0 is true. Clearly, a sample producing a value of the test statistic that falls in the tails of the distribution of Z0 would be unusual if H0: μ = μ0 is true; it is also an indication that H0 is false. Thus, we should reject H0 if either

Z0 > Z_{α/2}   (11-13a)

or

Z0 < -Z_{α/2},   (11-13b)

and fail to reject H0 if

-Z_{α/2} ≤ Z0 ≤ Z_{α/2}.   (11-14)

Equation 11-14 defines the acceptance region for H0 and equation 11-13 defines the critical region or rejection region. The type I error probability for this test procedure is α.

Example 11-1

The burning rate of a rocket propellant is being studied. Specifications require that the mean burning rate must be 40 cm/s. Furthermore, suppose that we know that the standard deviation of the burning rate is approximately 2 cm/s. The experimenter decides to specify a type I error probability α = 0.05, and he will base the test on a random sample of size n = 25. The hypotheses we wish to test are

H0: μ = 40 cm/s,
H1: μ ≠ 40 cm/s.

Twenty-five specimens are tested, and the sample mean burning rate obtained is x̄ = 41.25 cm/s. The value of the test statistic in equation 11-12 is

Z0 = (x̄ - μ0)/(σ/√n) = (41.25 - 40)/(2/√25) = 3.125.

Figure 11-5 The distribution of Z0 when H0: μ = μ0 is true.



Since α = 0.05, the boundaries of the critical region are Z0.025 = 1.96 and -Z0.025 = -1.96, and we note that Z0 falls in the critical region. Therefore, H0 is rejected, and we conclude that the mean burning rate is not equal to 40 cm/s.

Now suppose that we wish to test the one-sided alternative, say

H0: μ = μ0,   (11-15)
H1: μ > μ0.

(Note that we could also write H0: μ ≤ μ0.) In defining the critical region for this test, we observe that a negative value of the test statistic Z0 would never lead us to conclude that H0: μ = μ0 is false. Therefore, we would place the critical region in the upper tail of the N(0, 1) distribution and reject H0 on values of Z0 that are too large. That is, we would reject H0 if

Z0 > Z_α.   (11-16)

Similarly, to test

H0: μ = μ0,   (11-17)
H1: μ < μ0,

we would calculate the test statistic Z0 and reject H0 on values of Z0 that are too small. That is, the critical region is in the lower tail of the N(0, 1) distribution, and we reject H0 if

Z0 < -Z_α.   (11-18)

Choice of Sample Size


In testing the hypotheses of equations 11-11, 11-15, and 11-17, the type I error probability α is directly selected by the analyst. However, the probability of type II error β depends on the choice of sample size. In this section, we will show how to select the sample size in order to arrive at a specified value of β.

Consider the two-sided hypothesis

H0: μ = μ0,
H1: μ ≠ μ0.

Suppose that the null hypothesis is false and that the true value of the mean is μ = μ0 + δ, say, where δ > 0. Now since H1 is true, the distribution of the test statistic Z0 is

Z0 ~ N(δ√n/σ, 1).   (11-19)

The distribution of the test statistic Z0 under both the null hypothesis H0 and the alternative hypothesis H1 is shown in Fig. 11-6. From examining this figure, we note that if H1 is true, a type II error will be made only if -Z_{α/2} ≤ Z0 ≤ Z_{α/2}, where Z0 ~ N(δ√n/σ, 1). That is, the probability of the type II error β is the probability that Z0 falls between -Z_{α/2} and Z_{α/2} given that H1 is true. This probability is shown as the shaded portion of Fig. 11-6. Expressed mathematically, this probability is

β = Φ(Z_{α/2} - δ√n/σ) - Φ(-Z_{α/2} - δ√n/σ),   (11-20)

where Φ(z) denotes the probability to the left of z on the standard normal distribution. Note that equation 11-20 was obtained by evaluating the probability that Z0 falls in the interval [-Z_{α/2}, Z_{α/2}] on the distribution of Z0 when H1 is true. These two points were standardized to produce equation 11-20. Furthermore, note that equation 11-20 also holds if δ < 0, due to the symmetry of the normal distribution.

Figure 11-6 The distribution of Z0 under H0 and H1.

While equation 11-20 could be used to evaluate the type II error, it is more convenient to use the operating characteristic curves in Charts VIa and VIb of the Appendix. These curves plot β as calculated from equation 11-20 against a parameter d for various sample sizes n. Curves are provided for both α = 0.05 and α = 0.01. The parameter d is defined as

d = |μ - μ0|/σ = |δ|/σ.   (11-21)

We have chosen d so that one set of operating characteristic curves can be used for all problems regardless of the values of μ0 and σ. From examining the operating characteristic curves or equation 11-20 and Fig. 11-6 we note the following:

1. The further the true value of the mean μ is from μ0, the smaller the probability of type II error β for a given n and α. That is, we see that for a specified sample size and α, large differences in the mean are easier to detect than small ones.

2. For a given δ and α, the probability of type II error β decreases as n increases. That is, to detect a specified difference δ in the mean, we may make the test more powerful by increasing the sample size.

Example 11-2

Consider the rocket propellant problem in Example 11-1. Suppose that the analyst is concerned about the probability of a type II error if the true mean burning rate is μ = 41 cm/s. We may use the operating characteristic curves to find β. Note that δ = 41 - 40 = 1, n = 25, σ = 2, and α = 0.05. Then

d = |δ|/σ = 1/2 = 0.5,

and from Chart VIa (Appendix), with n = 25, we find that β = 0.30. That is, if the true mean burning rate is μ = 41 cm/s, then there is approximately a 30% chance that this will not be detected by the test with n = 25.
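Equation 11-20 can also be evaluated directly rather than read from Chart VIa. A small sketch with the Example 11-2 quantities (names ours; scipy assumed):

from scipy.stats import norm

# Type II error of the two-sided z-test, equation 11-20
delta, sigma, n, alpha = 1.0, 2.0, 25, 0.05

shift = delta * n ** 0.5 / sigma    # delta * sqrt(n) / sigma
z_half = norm.ppf(1 - alpha / 2)    # Z_{alpha/2}
beta = norm.cdf(z_half - shift) - norm.cdf(-z_half - shift)
print(f"beta = {beta:.4f}")         # about 0.295, matching the chart value 0.30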

Example 11-3

Once again, consider the rocket propellant problem in Example 11-1. Suppose that the analyst would like to design the test so that if the true mean burning rate differs from 40 cm/s by as much as 1 cm/s, the test will detect this (i.e., reject H0: μ = 40) with a high probability, say 0.90. The operating characteristic curves can be used to find the sample size that will give such a test. Since d = |μ - μ0|/σ = 1/2, α = 0.05, and β = 0.10, we find from Chart VIa (Appendix) that the required sample size is n = 40, approximately.

In general, the operating characteristic curves involve three parameters: β, δ, and n. Given any two of these parameters, the value of the third can be determined. There are two typical applications of these curves.

1. For a given n and δ, find β. This was illustrated in Example 11-2. This kind of problem is often encountered when the analyst is concerned about the sensitivity of an experiment already performed, or when sample size is restricted by economic or other factors.

2. For a given β and δ, find n. This was illustrated in Example 11-3. This kind of problem is usually encountered when the analyst has the opportunity to select the sample size at the outset of the experiment.

Operating characteristic curves are given in Charts VIc and VId (Appendix) for the one-sided alternatives. If the alternative hypothesis is H1: μ > μ0, then the abscissa scale on these charts is

d = (μ - μ0)/σ.   (11-22)

When the alternative hypothesis is H1: μ < μ0, the corresponding abscissa scale is

d = (μ0 - μ)/σ.   (11-23)

It is also possible to derive formulas to determine the appropriate sample size to use to obtain a particular value of β for a given δ and α. These formulas are alternatives to using the operating characteristic curves. For the two-sided alternative hypothesis, we know from equation 11-20 that

β = Φ(Z_{α/2} - δ√n/σ) - Φ(-Z_{α/2} - δ√n/σ),

or, if δ > 0,

β ≈ Φ(Z_{α/2} - δ√n/σ),   (11-24)

since Φ(-Z_{α/2} - δ√n/σ) ≈ 0 when δ is positive. From equation 11-24, we take normal inverses to obtain

-Z_β ≈ Z_{α/2} - δ√n/σ,

or

n ≈ (Z_{α/2} + Z_β)² σ² / δ².   (11-25)

This approximation is good when Φ(-Z_{α/2} - δ√n/σ) is small compared to β. For either of the one-sided alternative hypotheses in equation 11-15 or equation 11-17, the sample size required to produce a specified type II error probability β given δ and α is

n ≈ (Z_α + Z_β)² σ² / δ².   (11-26)

Example 11-4

Returning to the rocket propellant problem of Example 11-3, we note that σ = 2, δ = 41 - 40 = 1, α = 0.05, and β = 0.10. Since Z_{α/2} = Z0.025 = 1.96 and Z_β = Z0.10 = 1.28, the sample size required to detect this departure from H0: μ = 40 is found in equation 11-25 to be

n ≈ (Z_{α/2} + Z_β)² σ² / δ² = (1.96 + 1.28)² 2² / 1² ≈ 42,

which is in close agreement with the value determined from the operating characteristic curve. Note that the approximation is good, since Φ(-Z_{α/2} - δ√n/σ) = Φ(-1.96 - (1)√42/2) = Φ(-5.20) ≈ 0, which is small relative to β.
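Equations 11-25 and 11-26 are straightforward to evaluate in code as well. A sketch of equation 11-25 under the Example 11-4 settings (names ours):

from scipy.stats import norm

# Sample size for the two-sided z-test, equation 11-25
delta, sigma, alpha, beta = 1.0, 2.0, 0.05, 0.10

z_half = norm.ppf(1 - alpha / 2)   # Z_{alpha/2}
z_beta = norm.ppf(1 - beta)        # Z_beta
n = (z_half + z_beta) ** 2 * sigma ** 2 / delta ** 2
print(f"required n = {n:.1f}")     # about 42, as in Example 11-4; round up in practice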

The Relationship Between Tests of Hypotheses and Confidence Intervals


There is a close relationship between the test of a hypothesis about a parameter θ and the confidence interval for θ. If [L, U] is a 100(1 - α)% confidence interval for the parameter θ, then the test of size α of the hypothesis

H0: θ = θ0,
H1: θ ≠ θ0

will lead to rejection of H0 if and only if θ0 is not in the interval [L, U]. As an illustration, consider the rocket propellant problem in Example 11-1. The null hypothesis H0: μ = 40 was rejected using α = 0.05. The 95% two-sided confidence interval on μ for these data may be computed from equation 10-25 as 40.47 ≤ μ ≤ 42.03. That is, the interval [L, U] is [40.47, 42.03], and since μ0 = 40 is not included in this interval, the null hypothesis H0: μ = 40 is rejected.

Large Sample Test with Unknown Variance


Although we have developed the test procedure for the null hypothesis H0: μ = μ0 assuming that σ² is known, in many practical situations σ² will be unknown. In general, if n ≥ 30, then the sample variance S² can be substituted for σ² in the test procedures with little harmful effect. Thus, while we have given a test for known σ², it can be converted easily into a large-sample test procedure for unknown σ². Exact treatment of the case where σ² is unknown and n is small involves the use of the t distribution and is deferred until Section 11-2.2.

P-Values
Computer software packages are frequently used for statistical hypothesis testing. Most of
these programs calculate and report the probability that the test statistic will take on a value


at least as extreme as the observed value of the statistic when H0 is true. This probability is usually called a P-value. It represents the smallest level of significance that would lead to rejection of H0. Thus, if P = 0.04 is reported in the computer output, the null hypothesis H0 would be rejected at the level α = 0.05 but not at the level α = 0.01. Generally, if P is less than or equal to α, we would reject H0, whereas if P exceeds α we would fail to reject H0.

It is customary to call the test statistic (and the data) significant when the null hypothesis H0 is rejected, so we may think of the P-value as the smallest level α at which the data are significant. Once the P-value is known, the decision maker can determine for himself or herself how significant the data are without the data analyst formally imposing a preselected level of significance.
It is not always easy to compute the exact P-value of a test. However, for the foregoing normal distribution tests it is relatively easy. If Z0 is the computed value of the test statistic, then the P-value is

P = 2[1 - Φ(|Z0|)] for a two-tailed test,
P = 1 - Φ(Z0) for an upper-tail test,
P = Φ(Z0) for a lower-tail test.

To illustrate, consider the rocket propellant problem in Example 11-1. The computed value of the test statistic is Z0 = 3.125, and since the alternative hypothesis is two-tailed, the P-value is

P = 2[1 - Φ(3.125)] = 0.0018.

Thus, H0: μ = 40 would be rejected at any level of significance where α ≥ P = 0.0018. For example, H0 would be rejected if α = 0.01, but it would not be rejected if α = 0.001.
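In code, these P-value formulas are one-liners. A sketch using the Example 11-1 statistic (names ours):

from scipy.stats import norm

z0 = 3.125                                   # computed test statistic from Example 11-1
p_two_tailed = 2 * (1 - norm.cdf(abs(z0)))   # two-tailed test
p_upper_tail = 1 - norm.cdf(z0)              # upper-tail test
p_lower_tail = norm.cdf(z0)                  # lower-tail test
print(f"two-tailed P = {p_two_tailed:.4f}")  # 0.0018, as in the text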

Practical Versus Statistical Significance


In Chapters 10 and 11, we present confidence intervals and tests of hypotheses for both single-sample and two-sample problems. In hypothesis testing, we have discussed the statistical significance when the null hypothesis is rejected. What has not been discussed is the practical significance of rejecting the null hypothesis. In hypothesis testing, the goal is to make a decision about a claim or belief. The decision as to whether or not the null hypothesis is rejected in favor of the alternative is based on a sample taken from the population of interest. If the null hypothesis is rejected, we say there is statistically significant evidence against the null hypothesis in favor of the alternative. Results that are statistically significant (by rejection of the null hypothesis) do not necessarily imply practically significant results.
To illustrate, suppose that the average temperature on a single day throughout a particular state is hypothesized to be 63 degrees. Suppose that n = 50 locations within the state had an average temperature of x̄ = 62 degrees and a standard deviation of 0.5 degrees. If we were to test the hypothesis H0: μ = 63 against H1: μ ≠ 63, we would get a resulting P-value of approximately 0 and we would reject the null hypothesis. Our conclusion would be that the true average temperature is not 63 degrees. In other words, we have illustrated a statistically significant difference between the hypothesized value and the sample average obtained from the data. But is this a practical difference? That is, is 63 degrees different from 62 degrees? Very few investigators would actually conclude that this difference is practical. In other words, statistical significance does not imply practical significance.
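To see the temperature illustration numerically, the sketch below treats the standard deviation 0.5 as known (a simplification of ours; with n = 50 the large-sample normal theory applies in any case). The P-value is essentially zero even though the difference is only one degree:

from scipy.stats import norm

xbar, mu0, s, n = 62.0, 63.0, 0.5, 50
z0 = (xbar - mu0) / (s / n ** 0.5)
p_value = 2 * (1 - norm.cdf(abs(z0)))
print(f"Z0 = {z0:.1f}, P-value = {p_value:.2e}")   # Z0 = -14.1; P is essentially 0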
The size of the sample under investigation has a direct influence on the power of the test and the practical significance. As the sample size increases, even the smallest differences between the hypothesized value and the sample value may be detected by the hypothesis test. Therefore, care must be taken when interpreting the results of a hypothesis test when the sample sizes are large.

11-2.2 Tests of Hypotheses on the Mean of a Normal Distribution, Variance Unknown

When testing hypotheses about the mean μ of a population when σ² is unknown, we can use the test procedures discussed in Section 11-2.1 provided that the sample size is large (n ≥ 30, say). These procedures are approximately valid regardless of whether or not the underlying population is normal. However, when the sample size is small and σ² is unknown, we must make an assumption about the form of the underlying distribution in order to obtain a test procedure. A reasonable assumption in many cases is that the underlying distribution is normal.

Many populations encountered in practice are quite well approximated by the normal distribution, so this assumption will lead to a test procedure of wide applicability. In fact, moderate departure from normality will have little effect on the test validity. When the assumption is unreasonable, we can either specify another distribution (exponential, Weibull, etc.) and use some general method of test construction to obtain a valid procedure, or we could use one of the nonparametric tests that are valid for any underlying distribution (see Chapter 16).

Statistical Analysis
Suppose that X is a normally distributed random variable with unknown mean μ and variance σ². We wish to test the hypothesis that μ equals a constant μ0. Note that this situation is similar to that treated in Section 11-2.1, except that now both μ and σ² are unknown. Assume that a random sample of size n, say X1, X2, ..., Xn, is available, and let X̄ and S² be the sample mean and variance, respectively.

Suppose that we wish to test the two-sided alternative

H0: μ = μ0,   (11-27)
H1: μ ≠ μ0.

The test procedure is based on the statistic

t0 = (X̄ - μ0) / (S/√n),   (11-28)

which follows the t distribution with n - 1 degrees of freedom if the null hypothesis H0: μ = μ0 is true. To test H0: μ = μ0 in equation 11-27, the test statistic t0 in equation 11-28 is calculated, and H0 is rejected if either

t0 > t_{α/2, n-1}   (11-29a)

or

t0 < -t_{α/2, n-1},   (11-29b)

where t_{α/2, n-1} and -t_{α/2, n-1} are the upper and lower α/2 percentage points of the t distribution with n - 1 degrees of freedom.
For the one-sided alternative hypothesis

H0: μ = μ0,   (11-30)
H1: μ > μ0,

we calculate the test statistic t0 from equation 11-28 and reject H0 if

t0 > t_{α, n-1}.   (11-31)

For the other one-sided alternative,

H0: μ = μ0,   (11-32)
H1: μ < μ0,

we would reject H0 if

t0 < -t_{α, n-1}.   (11-33)

Example 11-5

The breaking strength of a textile fiber is a normally distributed random variable. Specifications require that the mean breaking strength should equal 150 psi. The manufacturer would like to detect any significant departure from this value. Thus, he wishes to test

H0: μ = 150 psi,
H1: μ ≠ 150 psi.

A random sample of 15 fiber specimens is selected and their breaking strengths determined. The sample mean and variance are computed from the sample data as x̄ = 152.18 and s² = 16.63. Therefore, the test statistic is

t0 = (x̄ - μ0)/(s/√n) = (152.18 - 150)/√(16.63/15) = 2.07.

The type I error is specified as α = 0.05. Therefore t_{0.025,14} = 2.145 and -t_{0.025,14} = -2.145, and we would conclude that there is not sufficient evidence to reject the hypothesis that μ = 150 psi.
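When raw observations are available, scipy.stats.ttest_1samp performs this test directly; with only summary statistics, as in Example 11-5, the statistic and critical value can be computed by hand. A sketch (names ours):

from math import sqrt
from scipy.stats import t

# One-sample t-test from summary statistics (Example 11-5)
xbar, mu0, s2, n, alpha = 152.18, 150.0, 16.63, 15, 0.05

t0 = (xbar - mu0) / sqrt(s2 / n)          # equation 11-28
t_crit = t.ppf(1 - alpha / 2, df=n - 1)   # t_{0.025,14} = 2.145
print(f"t0 = {t0:.2f}, critical values = ±{t_crit:.3f}")
# |t0| = 2.07 < 2.145, so H0 is not rejected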

Choice of Sample Size


The type II error probability for tests on the mean of a normal distribution with unknown variance depends on the distribution of the test statistic in equation 11-28 when the null hypothesis H0: μ = μ0 is false. When the true value of the mean is μ = μ0 + δ, note that the test statistic can be written as

t0 = (X̄ - μ0)/(S/√n) = (Z + δ√n/σ) / W.   (11-34)

The distributions of Z and W in equation 11-34 are N(0, 1) and √(χ²_{n-1}/(n - 1)), respectively, and Z and W are independent random variables. However, δ√n/σ is a nonzero constant, so that the numerator of equation 11-34 is a N(δ√n/σ, 1) random variable. The resulting distribution is called the noncentral t distribution with n - 1 degrees of freedom and

noncentrality parameter δ√n/σ. Note that if δ = 0, then the noncentral t distribution reduces to the usual or central t distribution. In any case, the type II error of the two-sided alternative (for example) would be

β = P{-t_{α/2, n-1} ≤ t0 ≤ t_{α/2, n-1} | δ ≠ 0} = P{-t_{α/2, n-1} ≤ t′0 ≤ t_{α/2, n-1}},

where t′0 denotes the noncentral t random variable. Finding the type II error for the t-test involves finding the probability contained between two points on the noncentral t distribution.
The operating characteristic curves in Charts VIe, VIf, VIg, and VIh (Appendix) plot β against a parameter d for various sample sizes n. Curves are provided for both the two-sided and one-sided alternatives and for α = 0.05 or α = 0.01. For the two-sided alternative in equation 11-27, the abscissa scale factor d on Charts VIe and VIf is defined as

d = |δ|/σ.   (11-35)

For the one-sided alternatives, if rejection is desired when μ > μ0, as in equation 11-30, we use Charts VIg and VIh with

d = (μ - μ0)/σ = δ/σ,   (11-36)

while if rejection is desired when μ < μ0, as in equation 11-32,

d = (μ0 - μ)/σ = δ/σ.   (11-37)

We note that d depends on the unknown parameter σ². There are several ways to avoid this difficulty. In some cases, we may use the results of a previous experiment or prior information to make a rough initial estimate of σ². If we are interested in examining the operating characteristic after the data have been collected, we could use the sample variance s² to estimate σ². If analysts do not have any previous experience on which to draw in estimating σ², they can define the difference in the mean δ that they wish to detect relative to σ. For example, if one wishes to detect a small difference in the mean, one might use a value of d = |δ|/σ ≤ 1 (say), whereas if one is interested in detecting only moderately large differences in the mean, one might select d = |δ|/σ = 2 (say). That is, it is the value of the ratio |δ|/σ that is important in determining sample size, and if it is possible to specify the relative size of the difference in means that we are interested in detecting, then a proper value of d can usually be selected.

Example 11-6

Consider the fiber-testing problem in Example 11-5. If the breaking strength of this fiber differs from 150 psi by as much as 2.5 psi, the analyst would like to reject the null hypothesis H0: μ = 150 psi with a probability of at least 0.90. Is the sample size n = 15 adequate to ensure that the test is this sensitive? If we use the sample standard deviation s = √16.63 = 4.08 to estimate σ, then d = |δ|/σ = 2.5/4.08 = 0.61. By referring to the operating characteristic curves in Chart VIe, with d = 0.61 and n = 15, we find β = 0.45. Thus, the probability of rejecting H0: μ = 150 psi if the true mean differs from this value by ±2.5 psi is 1 - β = 1 - 0.45 = 0.55, approximately, and we would conclude that a sample size of n = 15 is not adequate. To find the sample size required to give the desired degree of protection, enter the operating characteristic curves in Chart VIe with d = 0.61 and β = 0.10, and read the corresponding sample size as n = 35, approximately.

11-2.3 Tests of Hypotheses on the Variance of a Normal Distribution


There are occasions when tests assessing the variance or standard deviation of a population are needed. In this section we present two procedures, one based on the assumption of normality and the other a large-sample test.

Test Procedures for a Normal Population

Suppose that we wish to test the hypothesis that the variance σ² of a normal distribution equals a specified value, say σ0². Let X ~ N(μ, σ²), where μ and σ² are unknown, and let X1, X2, ..., Xn be a random sample of n observations from this population. To test

H0: σ² = σ0²,   (11-38)
H1: σ² ≠ σ0²,

we use the test statistic

χ0² = (n - 1)S² / σ0²,   (11-39)

where S² is the sample variance. Now if H0: σ² = σ0² is true, then the test statistic χ0² follows the chi-square distribution with n - 1 degrees of freedom. Therefore, H0: σ² = σ0² would be rejected if

χ0² > χ²_{α/2, n-1}   (11-40a)

or if

χ0² < χ²_{1-α/2, n-1},   (11-40b)

where χ²_{α/2, n-1} and χ²_{1-α/2, n-1} are the upper and lower α/2 percentage points of the chi-square distribution with n - 1 degrees of freedom.

The same test statistic is used for the one-sided alternatives. For the one-sided hypothesis

H0: σ² = σ0²,   (11-41)
H1: σ² > σ0²,

we would reject H0 if

χ0² > χ²_{α, n-1}.   (11-42)

For the other one-sided hypothesis,

H0: σ² = σ0²,   (11-43)
H1: σ² < σ0²,

we would reject H0 if

χ0² < χ²_{1-α, n-1}.   (11-44)

Example 11-7

Consider the machine described in Example 10-16, which is used to fill cans with a soft drink beverage. If the variance of the fill volume exceeds 0.02 (fluid ounces)², then an unacceptably large percentage of the cans will be underfilled. The bottler is interested in testing the hypothesis

H0: σ² = 0.02,
H1: σ² > 0.02.

A random sample of n = 20 cans yields a sample variance of s² = 0.0225. Thus, the test statistic is

χ0² = (19)(0.0225)/0.02 = 21.38.

If we choose α = 0.05, we find that χ²_{0.05,19} = 30.14, and we would conclude that there is no strong evidence that the variance of fill volume exceeds 0.02 (fluid ounces)².
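A sketch of the same calculation in Python (names ours; scipy assumed):

from scipy.stats import chi2

# Upper-tail chi-square test on a variance (Example 11-7)
s2, sigma0_sq, n, alpha = 0.0225, 0.02, 20, 0.05

chi2_0 = (n - 1) * s2 / sigma0_sq           # equation 11-39
chi2_crit = chi2.ppf(1 - alpha, df=n - 1)   # chi-square_{0.05,19} = 30.14
print(f"chi2_0 = {chi2_0:.2f}, critical value = {chi2_crit:.2f}")
# 21.38 < 30.14, so H0 is not rejected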

Choice of Sample Size


Operating characteristic curves for the χ² tests are provided in Charts VIi through VIn (Appendix) for α = 0.05 and α = 0.01. For the two-sided alternative hypothesis of equation 11-38, Charts VIi and VIj plot β against an abscissa parameter,

λ = σ/σ0,   (11-45)

for various sample sizes n, where σ denotes the true value of the standard deviation. Charts VIk and VIl are for the one-sided alternative H1: σ² > σ0², while Charts VIm and VIn are for the other one-sided alternative H1: σ² < σ0². In using these charts, we think of σ as the value of the standard deviation that we want to detect.

Example 11-8

In Example 11-7, find the probability of rejecting H0: σ² = 0.02 if the true variance is as large as σ² = 0.03. Since σ = √0.03 = 0.1732 and σ0 = √0.02 = 0.1414, the abscissa parameter is

λ = σ/σ0 = 0.1732/0.1414 = 1.23.

From Chart VIk, with λ = 1.23 and n = 20, we find that β ≈ 0.60. That is, there is only about a 40% chance that H0: σ² = 0.02 will be rejected if the variance is really as large as σ² = 0.03. To reduce β, a larger sample size must be used. From the operating characteristic curve, we note that to reduce β to 0.20 a sample size of 75 is necessary.

A Large-Sample Test Procedure


The chi-square test procedure prescribed above is rather sensitive to the normality assumption. Consequently, it would be desirable to develop a procedure that does not require this assumption. When the underlying population is not necessarily normal but n is large (say n ≥ 35 or 40), then we can use the following result: if X1, X2, ..., Xn is a random sample from a population with variance σ², the sample standard deviation S is approximately normal with mean E(S) ≈ σ and variance V(S) ≈ σ²/(2n), if n is large.

Then the distribution of

Z = (S - σ) / (σ/√(2n))   (11-46)

is approximately standard normal.


To 7est
Ho: cf=~ (11-47)
Hl:cf",~,
substitute 0'0 for v in equation llA6. Thus, the test statistic is

(11-48)

and we would reject Ho if Zo > Zan or if Zo < - Zan' The same test statistic would be used
for the one-sided alternatives. If we are testing
Ho: cf (l'~ (11-49)
HI: cf > 07"
we would reject Ho if Zo > Za> while if we are testing

Ho: cf = d,;,
( 11-50)
HI: cf < d,;,
we would reject Ho if Zo < -Za-

Example 11-9

An injection-molded plastic part is used in a graphics printer. Before agreeing to a long-term contract, the printer manufacturer wants to be sure, using α = 0.01, that the supplier can produce parts with a standard deviation of length of at most 0.025 mm. The hypotheses to be tested are

H0: σ² = 6.25 × 10⁻⁴,
H1: σ² < 6.25 × 10⁻⁴,

since (0.025)² = 0.000625. A random sample of n = 50 parts is obtained, and the sample standard deviation is s = 0.021 mm. The test statistic is

Z0 = (s - σ0)/(σ0/√(2n)) = (0.021 - 0.025)/(0.025/√100) = -1.60.

Since -Z0.01 = -2.33 and the observed value of Z0 is not smaller than this critical value, H0 is not rejected. That is, the evidence from the supplier's process is not strong enough to justify a long-term contract.
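A sketch of this large-sample test in Python (names ours):

from math import sqrt
from scipy.stats import norm

# Large-sample lower-tail test on a standard deviation (Example 11-9)
s, sigma0, n, alpha = 0.021, 0.025, 50, 0.01

z0 = (s - sigma0) / (sigma0 / sqrt(2 * n))   # equation 11-48
z_crit = -norm.ppf(1 - alpha)                # -Z_{0.01} = -2.33
print(f"Z0 = {z0:.2f}, critical value = {z_crit:.2f}")
# Z0 = -1.60 is not below -2.33, so H0 is not rejected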

11-2.4 Tests of Hypotheses on a Proportion


Statistical Analysis
In many engineering and management problems, we are concerned with a random variable
that follows the binomial distribution. For example, consider a production process that
manufactures items that are classified as either acceptable or defective. It is usually
reasonable to model the occurrence of defectives with the binomial distribution, where the binomial parameter p represents the proportion of defective items produced.

We will consider testing

H0: p = p0,   (11-51)
H1: p ≠ p0.

An approximate test based on the normal approximation to the binomial will be given. This approximate procedure will be valid as long as p is not extremely close to zero or 1, and if the sample size is relatively large. Let X be the number of observations in a random sample of size n that belongs to the class associated with p. Then, if the null hypothesis H0: p = p0 is true, we have X ~ N(np0, np0(1 - p0)), approximately. To test H0: p = p0, calculate the test statistic

Z0 = (X - np0) / √(np0(1 - p0))   (11-52)

and reject H0: p = p0 if

Z0 > Z_{α/2} or Z0 < -Z_{α/2}.   (11-53)

Critical regions for the one-sided alternative hypotheses would be located in the usual manner.

Example 11-10

A semiconductor firm produces logic devices. The contract with their customer calls for a fraction defective of no more than 0.05. They wish to test

H0: p = 0.05,
H1: p > 0.05.

A random sample of 200 devices yields six defectives. The test statistic is

Z0 = (6 - 200(0.05)) / √(200(0.05)(0.95)) = -1.30.

Using α = 0.05, we find that Z0.05 = 1.645, and so we cannot reject the null hypothesis that p = 0.05.
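A sketch of the proportion test in Python (names ours). An exact alternative based on the binomial distribution (e.g., scipy.stats.binomtest) could also be used, but the text's large-sample procedure is shown here:

from math import sqrt
from scipy.stats import norm

# Upper-tail test on a binomial proportion (Example 11-10)
x, n, p0, alpha = 6, 200, 0.05, 0.05

z0 = (x - n * p0) / sqrt(n * p0 * (1 - p0))   # equation 11-52
z_crit = norm.ppf(1 - alpha)                  # Z_{0.05} = 1.645
print(f"Z0 = {z0:.2f}, critical value = {z_crit:.3f}")
# Z0 = -1.30 < 1.645, so H0 cannot be rejected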

Choice of Sample Size


It is possible to obtain closed-form equations for the β error for the tests in this section. The β error for the two-sided alternative H1: p ≠ p0 is approximately

β = Φ[(p0 - p + Z_{α/2}√(p0(1 - p0)/n)) / √(p(1 - p)/n)] - Φ[(p0 - p - Z_{α/2}√(p0(1 - p0)/n)) / √(p(1 - p)/n)].   (11-54)

If the alternative is H1: p < p0, then

β = 1 - Φ[(p0 - p - Z_α√(p0(1 - p0)/n)) / √(p(1 - p)/n)],   (11-55)

whereas if the alternative is H1: p > p0, then

β = Φ[(p0 - p + Z_α√(p0(1 - p0)/n)) / √(p(1 - p)/n)].   (11-56)

These equations can be solved to find the sample size n that gives a test of level α that has a specified β risk. The sample size equations are

n = [(Z_{α/2}√(p0(1 - p0)) + Z_β√(p(1 - p))) / (p - p0)]²   (11-57)

for the two-sided alternative and

n = [(Z_α√(p0(1 - p0)) + Z_β√(p(1 - p))) / (p - p0)]²   (11-58)

for the one-sided alternatives.

Example 11-11

For the situation described in Example 11-10, suppose that we wish to find the β error of the test if p = 0.07. Using equation 11-56, the β error is

β = Φ[(0.05 - 0.07 + 1.645√((0.05)(0.95)/200)) / √((0.07)(0.93)/200)]
  = Φ(0.30)
  = 0.6179.

This type II error probability is not as small as one might like, but n = 200 is not particularly large and 0.07 is not very far from the null value p0 = 0.05. Suppose that we want the β error to be no larger than 0.10 if the true value of the fraction defective is as large as p = 0.07. The required sample size would be found from equation 11-58 as

n = [(1.645√((0.05)(0.95)) + 1.28√((0.07)(0.93))) / (0.07 - 0.05)]² ≈ 1174,

which is a very large sample size. However, notice that we are trying to detect a very small deviation from the null value p0 = 0.05.

11-3 TESTS OF HYPOTHESES ON TWO SAMPLES

11-3.1 Tests of Hypotheses on the Means of Two Normal Distributions, Variances Known

Statistical Analysis

Suppose that there are two populations of interest, say X1 and X2. We assume that X1 has unknown mean μ1 and known variance σ1², and that X2 has unknown mean μ2 and known variance σ2². We will be concerned with testing the hypothesis that the means μ1 and μ2 are equal. It is assumed either that the random variables X1 and X2 are normally distributed or, if they are nonnormal, that the conditions of the Central Limit Theorem apply.

Consider first the two-sided alternative hypothesis

H0: μ1 = μ2,   (11-59)
H1: μ1 ≠ μ2.

Suppose that a random sample of size n1 is drawn from X1, say X11, X12, ..., X1n1, and that a second random sample of size n2 is drawn from X2, say X21, X22, ..., X2n2. It is assumed that the {X1j} are independently distributed with mean μ1 and variance σ1², that the {X2j} are independently distributed with mean μ2 and variance σ2², and that the two samples {X1j} and {X2j} are independent. The test procedure is based on the distribution of the difference in sample means, say X̄1 - X̄2. In general, we know that

X̄1 - X̄2 ~ N(μ1 - μ2, σ1²/n1 + σ2²/n2).
Thus, if the null hypothesis H0: μ1 = μ2 is true, the test statistic

Z0 = (X̄1 - X̄2) / √(σ1²/n1 + σ2²/n2)   (11-60)

follows the N(0, 1) distribution. Therefore, the procedure for testing H0: μ1 = μ2 is to calculate the test statistic Z0 in equation 11-60 and reject the null hypothesis if

Z0 > Z_{α/2}   (11-61a)

or

Z0 < -Z_{α/2}.   (11-61b)

The one-sided alternative hypotheses are analyzed similarly. To test

H0: μ1 = μ2,   (11-62)
H1: μ1 > μ2,

the test statistic Z0 in equation 11-60 is calculated, and H0: μ1 = μ2 is rejected if

Z0 > Z_α.   (11-63)

To test the other one-sided alternative hypothesis,

H0: μ1 = μ2,   (11-64)
H1: μ1 < μ2,

use the test statistic Z0 in equation 11-60 and reject H0: μ1 = μ2 if

Z0 < -Z_α.   (11-65)

Example 11-12

The plant manager of an orange juice canning facility is interested in comparing the performance of two different production lines in her plant. As line number 1 is relatively new, she suspects that its output in number of cases per day is greater than the number of cases produced by the older line 2. Ten days of data are selected at random for each line, for which it is found that x̄1 = 824.9 cases per day and x̄2 = 818.6 cases per day. From experience with operating this type of equipment it is known that σ1² = 40 and σ2² = 50. We wish to test

H0: μ1 = μ2,
H1: μ1 > μ2.

The value of the test statistic is

Z0 = (x̄1 - x̄2) / √(σ1²/n1 + σ2²/n2) = (824.9 - 818.6) / √(40/10 + 50/10) = 2.10.

Using α = 0.05 we find that Z0.05 = 1.645, and since Z0 > Z0.05, we would reject H0 and conclude that the mean number of cases per day produced by the new production line is greater than the mean number of cases per day produced by the old line.

Choice of Sample Size


The operating characteristic curves in Charts VIa, VIb, VIc, and VId (Appendix) may be used to evaluate the type II error probability for the hypotheses in equations 11-59, 11-62, and 11-64. These curves are also useful in sample size determination. Curves are provided for α = 0.05 and α = 0.01. For the two-sided alternative hypothesis in equation 11-59, the abscissa scale of the operating characteristic curves in Charts VIa and VIb is d, where

d = |μ1 - μ2| / √(σ1² + σ2²) = |δ| / √(σ1² + σ2²),   (11-66)

and one must choose equal sample sizes, say n = n1 = n2. The one-sided alternative hypotheses require the use of Charts VIc and VId. For the one-sided alternative H1: μ1 > μ2 in equation 11-62, the abscissa scale is

d = (μ1 - μ2) / √(σ1² + σ2²),   (11-67)

where n = n1 = n2. The other one-sided alternative hypothesis, H1: μ1 < μ2, requires that d be defined as

d = (μ2 - μ1) / √(σ1² + σ2²),   (11-68)

and n = n1 = n2.

It is not unusual to encounter problems where the costs of collecting data differ substantially between the two populations, or where one population variance is much greater than the other. In those cases, one often uses unequal sample sizes. If n1 ≠ n2, the operating characteristic curves may be entered with an equivalent value of n computed from

n = (σ1² + σ2²) / (σ1²/n1 + σ2²/n2).   (11-69)

If n1 ≠ n2 and their values are fixed in advance, then equation 11-69 is used directly to calculate n, and the operating characteristic curves are entered with a specified d to obtain β. If we are given d and it is necessary to determine n1 and n2 to obtain a specified β, say β*, then one guesses at trial values of n1 and n2, calculates n in equation 11-69, enters the curves with the specified value of d, and finds β. If β ≤ β*, then the trial values of n1 and n2 are satisfactory. If β > β*, then adjustments to n1 and n2 are made and the process is repeated.

Example 11-13

Consider the orange juice production line problem in Example 11-12. If the true difference in mean production rates were 10 cases per day, find the sample sizes required to detect this difference with a probability of 0.90. The appropriate value of the abscissa parameter is

d = (μ1 - μ2) / √(σ1² + σ2²) = 10/√(40 + 50) = 1.05,

and since α = 0.05, we find from Chart VIc that n = n1 = n2 = 8.

It is also possible to derive formulas for the sample size required to obtain a specified β for a given δ and α. These formulas occasionally are useful supplements to the operating characteristic curves. For the two-sided alternative hypothesis, the sample size n1 = n2 = n is

n ≈ (Z_{α/2} + Z_β)² (σ1² + σ2²) / δ².   (11-70)

This approximation is valid when Φ(-Z_{α/2} - δ√n/√(σ1² + σ2²)) is small compared to β. For a one-sided alternative, we have n1 = n2 = n, where

n ≈ (Z_α + Z_β)² (σ1² + σ2²) / δ².   (11-71)

The derivations of equations 11-70 and 11-71 closely follow the single-sample case in Section 11-2. To illustrate the use of these equations, consider the situation in Example 11-13. We have a one-sided alternative with α = 0.05, δ = 10, σ1² = 40, σ2² = 50, and β = 0.10. Thus Z_α = Z0.05 = 1.645, Z_β = Z0.10 = 1.28, and the required sample size is found from equation 11-71 to be

n ≈ (1.645 + 1.28)² (40 + 50) / 10² ≈ 8,

which agrees with the results obtained in Example 11-13.

11-3.2 Tests of Hypotheses on the Means of Two Normal Distributions, Variances Unknown

We now consider tests of hypotheses on the equality of the means μ1 and μ2 of two normal distributions where the variances σ1² and σ2² are unknown. A t statistic will be used to test these hypotheses. As noted in Section 11-2.2, the normality assumption is required to

develop the test procedure, but moderate departures from normality do not adversely affect the procedure. There are two different situations that must be treated. In the first case, we assume that the variances of the two normal distributions are unknown but equal; that is, σ1² = σ2² = σ². In the second, we assume that σ1² and σ2² are unknown and not necessarily equal.

Case 1: σ1² = σ2² = σ².  Let X1 and X2 be two independent normal populations with unknown means μ1 and μ2, and unknown but equal variances σ1² = σ2² = σ². We wish to test

H0: μ1 = μ2,   (11-72)
H1: μ1 ≠ μ2.
Suppose that X11, X12, ..., X1n1 is a random sample of n1 observations from X1, and X21, X22, ..., X2n2 is a random sample of n2 observations from X2. Let X̄1, X̄2, S1², and S2² be the sample means and sample variances, respectively. Since both S1² and S2² estimate the common variance σ², we may combine them to yield a single estimate, say

Sp² = [(n1 - 1)S1² + (n2 - 1)S2²] / (n1 + n2 - 2).   (11-73)

This combined or "pooled" estimator was introduced in Section 10-3.2. To test H0: μ1 = μ2 in equation 11-72, compute the test statistic

t0 = (X̄1 - X̄2) / (Sp √(1/n1 + 1/n2)).   (11-74)

If H0: μ1 = μ2 is true, t0 is distributed as t_{n1+n2-2}. Therefore, if

t0 > t_{α/2, n1+n2-2}   (11-75a)

or if

t0 < -t_{α/2, n1+n2-2},   (11-75b)

we reject H0: μ1 = μ2.
The one-sided alternatives are treated similarly. To test

H0: μ1 = μ2,   (11-76)
H1: μ1 > μ2,

compute the test statistic t0 in equation 11-74 and reject H0: μ1 = μ2 if

t0 > t_{α, n1+n2-2}.   (11-77)

For the other one-sided alternative,

H0: μ1 = μ2,   (11-78)
H1: μ1 < μ2,

calculate the test statistic t0 and reject H0: μ1 = μ2 if

t0 < -t_{α, n1+n2-2}.   (11-79)
The two-sample t-test given in this section is often called the pooled t-test, because the sample variances are combined or pooled to estimate the common variance. It is also known as the independent t-test, because the two normal populations are assumed to be independent.

Example 11-14

Two catalysts are being analyzed to determine how they affect the mean yield of a chemical process. Specifically, catalyst 1 is currently in use, but catalyst 2 is acceptable. Since catalyst 2 is cheaper, if it does not change the process yield, it should be adopted. Suppose we wish to test the hypotheses

H0: μ1 = μ2,
H1: μ1 ≠ μ2.

Pilot plant data yields n1 = 8, x̄1 = 91.73, s1² = 3.89, n2 = 8, x̄2 = 93.75, and s2² = 4.02. From equation 11-73, we find

sp² = [(7)3.89 + 7(4.02)] / (8 + 8 - 2) = 3.96.

The test statistic is

t0 = (91.73 - 93.75) / (1.99√(1/8 + 1/8)) = -2.03.

Using α = 0.05 we find that t_{0.025,14} = 2.145 and -t_{0.025,14} = -2.145, and, consequently, H0: μ1 = μ2 cannot be rejected. That is, we do not have strong evidence to conclude that catalyst 2 results in a mean yield that differs from the mean yield when catalyst 1 is used.
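With raw data, scipy.stats.ttest_ind (with its default equal_var=True) performs the pooled test; from summary statistics, ttest_ind_from_stats does the same. A sketch reproducing Example 11-14 (names ours):

from math import sqrt
from scipy.stats import ttest_ind_from_stats

# Pooled two-sample t-test from summary statistics (Example 11-14)
result = ttest_ind_from_stats(mean1=91.73, std1=sqrt(3.89), nobs1=8,
                              mean2=93.75, std2=sqrt(4.02), nobs2=8,
                              equal_var=True)   # pooled-variance test
print(f"t0 = {result.statistic:.2f}, two-sided P = {result.pvalue:.3f}")
# t0 = -2.03 and P is about 0.06 > 0.05, so H0 is not rejected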

Case 2: σ1² ≠ σ2².  In some situations, we cannot reasonably assume that the unknown variances σ1² and σ2² are equal. There is not an exact t statistic available for testing H0: μ1 = μ2 in this case. However, the statistic

t0* = (X̄1 - X̄2) / √(S1²/n1 + S2²/n2)   (11-80)

is distributed approximately as t with degrees of freedom given by

ν = (S1²/n1 + S2²/n2)² / [(S1²/n1)²/(n1 + 1) + (S2²/n2)²/(n2 + 1)] - 2   (11-81)

if the null hypothesis H0: μ1 = μ2 is true. Therefore, if σ1² ≠ σ2², the hypotheses of equations 11-72, 11-76, and 11-78 are tested as before, except that t0* is used as the test statistic and n1 + n2 - 2 is replaced by ν in determining the degrees of freedom for the test. This general problem is often called the Behrens-Fisher problem.

Example 11-15

A manufacturer of video display units is testing two microcircuit designs to determine whether they produce equivalent current flow. Development engineering has obtained the following data:

Design 1:  n1 = 15,  x̄1 = 24.2,  s1² = 10,
Design 2:  n2 = 10,  x̄2 = 23.9,  s2² = 20.

We wish to test

H0: μ1 = μ2,
H1: μ1 ≠ μ2,

where both populations are assumed to be normal, but we are unwilling to assume that the unknown variances σ1² and σ2² are equal. The test statistic is

t0* = (x̄1 - x̄2) / √(s1²/n1 + s2²/n2) = (24.2 - 23.9) / √(10/15 + 20/10) = 0.184.

The degrees of freedom on t0* are found from equation 11-81 to be

ν = (s1²/n1 + s2²/n2)² / [(s1²/n1)²/(n1 + 1) + (s2²/n2)²/(n2 + 1)] - 2 = (10/15 + 20/10)² / [(10/15)²/16 + (20/10)²/11] - 2 ≈ 16.

Using α = 0.10, we find that t_{α/2,ν} = t_{0.05,16} = 1.746. Since |t0*| < t_{0.05,16}, we cannot reject H0: μ1 = μ2.
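Note that most software (e.g., scipy's Welch option) uses the Welch-Satterthwaite degrees-of-freedom formula, which lacks the +1 and -2 corrections of equation 11-81 and so gives a slightly different ν. The sketch below evaluates the text's formula directly (names ours):

from math import sqrt
from scipy.stats import t

# Approximate t-test with unequal variances (Example 11-15, eqs. 11-80 and 11-81)
x1, s1_sq, n1 = 24.2, 10.0, 15
x2, s2_sq, n2 = 23.9, 20.0, 10

v1, v2 = s1_sq / n1, s2_sq / n2
t0_star = (x1 - x2) / sqrt(v1 + v2)                                  # equation 11-80
nu = (v1 + v2) ** 2 / (v1 ** 2 / (n1 + 1) + v2 ** 2 / (n2 + 1)) - 2  # equation 11-81
t_crit = t.ppf(1 - 0.10 / 2, df=round(nu))                           # alpha = 0.10
print(f"t0* = {t0_star:.3f}, nu = {nu:.1f}, critical value = {t_crit:.3f}")
# |t0*| = 0.184 < 1.746, so H0 cannot be rejected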

Choice of Sample Size


The operating characteristic curves in Charts VIe, VIf, VIg, and VIh (Appendix) are used to evaluate the type II error for the case where σ1² = σ2² = σ². Unfortunately, when σ1² ≠ σ2², the distribution of t0* is unknown if the null hypothesis is false, and no operating characteristic curves are available for this case.

For the two-sided alternative in equation 11-72, when σ1² = σ2² = σ² and n1 = n2 = n, Charts VIe and VIf are used with

d = |μ1 - μ2| / (2σ) = |δ| / (2σ).   (11-82)

To use these curves, they must be entered with the sample size n* = 2n - 1. For the one-sided alternative hypothesis of equation 11-76, we use Charts VIg and VIh and define

d = (μ1 - μ2) / (2σ) = δ / (2σ),   (11-83)

whereas for the other one-sided alternative hypothesis of equation 11-78, we use

d = (μ2 - μ1) / (2σ) = δ / (2σ).   (11-84)

It is noted that the parameter d is a function of σ, which is unknown. As in the single-sample t-test (Section 11-2.2), we may have to rely on a prior estimate of σ, or use a subjective estimate. Alternatively, we could define the differences in the mean that we wish to detect relative to σ.

Example 11-16

Consider the catalyst experiment in Example 11-14. Suppose that if catalyst 2 produces a yield that differs from the yield of catalyst 1 by 3.0%, we would like to reject the null hypothesis with a probability of at least 0.85. What sample size is required? Using sp = 1.99 as a rough estimate of the common standard deviation σ, we have d = |δ|/(2σ) = |3.00|/[(2)(1.99)] = 0.75. From Chart VIe (Appendix) with d = 0.75 and β = 0.15, we find n* = 20, approximately. Therefore, since n* = 2n - 1,

n = (n* + 1)/2 = (20 + 1)/2 = 10.5 ≈ 11 (say),

and we would use sample sizes of n1 = n2 = n = 11.

11-3.3 The Paired t-Test

A special case of the two-sample t-tests occurs when the observations on the two populations of interest are collected in pairs. Each pair of observations, say (X1j, X2j), is taken under homogeneous conditions, but these conditions may change from one pair to another. For example, suppose that we are interested in comparing two different types of tips for a hardness-testing machine. This machine presses the tip into a metal specimen with a known force. By measuring the depth of the depression caused by the tip, the hardness of the specimen can be determined. If several specimens were selected at random, half tested with tip 1, half tested with tip 2, and the pooled or independent t-test in Section 11-3.2 applied, the results of the test could be invalid. That is, the metal specimens could have been cut from bar stock that was produced in different heats, or they may not be homogeneous, which is another way hardness might be affected; then the observed differences between mean hardness readings for the two tip types also include hardness differences between specimens.

The correct experimental procedure is to collect the data in pairs; that is, to take two hardness readings of each specimen, one with each tip. The test procedure would then consist of analyzing the differences between hardness readings of each specimen. If there is no difference between tips, then the mean of the differences should be zero. This test procedure is called the paired t-test.
Let (X11, X21), (X12, X22), ..., (X1n, X2n) be a set of n paired observations, where we assume that X1 ~ N(μ1, σ1²) and X2 ~ N(μ2, σ2²). Define the differences between each pair of observations as Dj = X1j - X2j, j = 1, 2, ..., n. The Dj are normally distributed with mean

μD = μ1 - μ2,

so testing hypotheses about the equality of μ1 and μ2 can be accomplished by performing a one-sample t-test on μD. Specifically, testing H0: μ1 = μ2 against H1: μ1 ≠ μ2 is equivalent to testing

H0: μD = 0,   (11-85)
H1: μD ≠ 0.

The appropriate test statistic for equation 11-85 is

t0 = D̄ / (SD/√n),   (11-86)

where

D̄ = (1/n) Σ_{j=1}^{n} Dj   (11-87)

and

SD² = Σ_{j=1}^{n} (Dj - D̄)² / (n - 1)   (11-88)

are the sample mean and variance of the differences. We would reject H0: μD = 0 (implying that μ1 ≠ μ2) if t0 > t_{α/2, n-1} or if t0 < -t_{α/2, n-1}. One-sided alternatives would be treated similarly.

Example 11-17

An article in the Journal of Strain Analysis (Vol. 18, No. 2, 1983) compares several methods for predicting the shear strength for steel plate girders. Data for two of these methods, the Karlsruhe and Lehigh procedures, when applied to nine specific girders, are shown in Table 11-2. We wish to determine if there is any difference (on the average) between the two methods.

The sample average and standard deviation of the differences dj are d̄ = 0.2739 and sd = 0.1351, so the test statistic is

t0 = d̄ / (sd/√n) = 0.2739 / (0.1351/√9) = 6.08.

For the two-sided alternative H1: μD ≠ 0 and α = 0.1, we would fail to reject only if |t0| < t_{0.05,8} = 1.86. Since t0 > t_{0.05,8}, we conclude that the two strength prediction methods yield different results. Specifically, the Karlsruhe method produces, on average, higher strength predictions than does the Lehigh method.

Paired Versus Unpaired Comparisons  Sometimes in performing a comparative experiment, the investigator can choose between the paired analysis and the two-sample (or unpaired) t-test. If n measurements are to be made on each population, the two-sample t statistic is

t0 = (X̄1 - X̄2) / (Sp √(1/n + 1/n)),

Table 11-2 Strength Predictions for Nine Steel Plate Girders (Predicted Load/Observed Load)

Girder    Karlsruhe Method    Lehigh Method    Difference
S1/1      1.186               1.061            0.125
S2/1      1.151               0.992            0.159
S3/1      1.322               1.063            0.259
S4/1      1.339               1.062            0.277
S5/1      1.200               1.065            0.135
S2/1      1.402               1.178            0.224
S2/2      1.365               1.037            0.328
S2/3      1.537               1.086            0.451
S2/4      1.559               1.052            0.507

which is compared to t_{α/2,2n-2}, and of course, the paired t statistic is

$$t_0 = \frac{\bar{D}}{S_D/\sqrt{n}},$$

which is compared to t_{α/2,n-1}. Notice that since

$$\bar{D} = \frac{1}{n}\sum_{j=1}^{n} D_j = \frac{1}{n}\sum_{j=1}^{n}\left(X_{1j} - X_{2j}\right) = \bar{X}_1 - \bar{X}_2,$$

the numerators of both statistics are identical. However, the denominator of the two-sample t-test is based on the assumption that X̄1 and X̄2 are independent. In many paired experiments, there is a strong positive correlation between X1 and X2. That is,

$$V(\bar{D}) = V(\bar{X}_1 - \bar{X}_2) = V(\bar{X}_1) + V(\bar{X}_2) - 2\,\mathrm{Cov}(\bar{X}_1, \bar{X}_2) = \frac{2\sigma^2(1-\rho)}{n},$$

assuming that both populations X1 and X2 have identical variances. Furthermore, S_D²/n estimates the variance of D̄. Now, whenever there is positive correlation within the pairs, the denominator for the paired t-test will be smaller than the denominator of the two-sample t-test. This can cause the two-sample t-test to considerably understate the significance of the data if it is incorrectly applied to paired samples.
Although pairing will often lead to a smaller value of the variance of X̄1 - X̄2, it does have a disadvantage. Namely, the paired t-test leads to a loss of n - 1 degrees of freedom in comparison to the two-sample t-test. Generally, we know that increasing the degrees of freedom of a test increases the power against any fixed alternative values of the parameter.
So how do we decide to conduct the experiment: should we pair the observations or not? Although there is no general answer to this question, we can give some guidelines based on the above discussion. They are as follows:
1. If the experimental units are relatively homogeneous (small σ) and the correlation between pairs is small, the gain in precision due to pairing will be offset by the loss of degrees of freedom, so an independent-samples experiment should be used.
2. If the experimental units are relatively heterogeneous (large σ) and there is large positive correlation between pairs, the paired experiment should be used.
These rules still require judgment in their implementation, because σ and ρ are usually not known precisely. Furthermore, if the number of degrees of freedom is large (say 40 or 50), then the loss of n - 1 of them for pairing may not be serious. However, if the number of degrees of freedom is small (say 10 or 20), then losing half of them is potentially serious if not compensated for by an increased precision from pairing.

11-3.4 Tests for the Equality of Two Variances

We now present tests for comparing two variances. Following the approach in Section 11-2.3, we present tests for normal populations and large-sample tests that may be applied to nonnormal populations.
Test Procedure for Normal Populations

Suppose that two independent populations are of interest, say X1 ~ N(μ1, σ1²) and X2 ~ N(μ2, σ2²), where μ1, σ1², μ2, and σ2² are unknown. We wish to test hypotheses about the equality of the two variances, say H0: σ1² = σ2². Assume that two random samples of size n1 from population 1 and of size n2 from population 2 are available, and let S1² and S2² be the sample variances. To test the two-sided alternative

$$H_0: \sigma_1^2 = \sigma_2^2, \qquad H_1: \sigma_1^2 \neq \sigma_2^2, \tag{11-89}$$

we use the fact that the statistic

$$F_0 = \frac{S_1^2}{S_2^2} \tag{11-90}$$

is distributed as F with n1 - 1 and n2 - 1 degrees of freedom if the null hypothesis H0: σ1² = σ2² is true. Therefore, we would reject H0 if

$$F_0 > F_{\alpha/2,\,n_1-1,\,n_2-1} \tag{11-91a}$$

or if

$$F_0 < F_{1-\alpha/2,\,n_1-1,\,n_2-1}, \tag{11-91b}$$

where F_{α/2,n1-1,n2-1} and F_{1-α/2,n1-1,n2-1} are the upper and lower α/2 percentage points of the F distribution with n1 - 1 and n2 - 1 degrees of freedom. Table V (Appendix) gives only the upper tail points of F, so to find F_{1-α/2,n1-1,n2-1} we must use

$$F_{1-\alpha/2,\,n_1-1,\,n_2-1} = \frac{1}{F_{\alpha/2,\,n_2-1,\,n_1-1}}. \tag{11-92}$$

The same test statistic can be used to test one-sided alternative hypotheses. Since the notation X1 and X2 is arbitrary, let X1 denote the population that may have the larger variance. Therefore, the one-sided alternative hypotheses are

$$H_0: \sigma_1^2 = \sigma_2^2, \qquad H_1: \sigma_1^2 > \sigma_2^2. \tag{11-93}$$

If

$$F_0 > F_{\alpha,\,n_1-1,\,n_2-1}, \tag{11-94}$$

we would reject H0: σ1² = σ2².

Example 11-18
Chemical etching is used to remove copper from printed circuit boards. X1 and X2 represent process yields when two different concentrations are used. Suppose that we wish to test

H0: σ1² = σ2²,
H1: σ1² ≠ σ2².

Two samples of sizes n1 = n2 = 8 yield s1² = 3.89 and s2² = .... If α = 0.05, we find that F_{0.025,7,7} = 4.99 and F_{0.975,7,7} = (F_{0.025,7,7})⁻¹ = (4.99)⁻¹ = 0.20. Therefore, we cannot reject H0: σ1² = σ2², and we conclude that there is no strong evidence that the variance of the yield is affected by the concentration.
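The F percentage points used in this example can be computed rather than read from Table V. A minimal Python sketch (scipy assumed, as before):

    from scipy import stats

    alpha, n1, n2 = 0.05, 8, 8
    f_upper = stats.f.ppf(1 - alpha / 2, n1 - 1, n2 - 1)  # F_{0.025,7,7} = 4.99
    f_lower = 1.0 / f_upper                               # equation 11-92 gives 0.20
    # the lower point can also be computed directly; with n1 = n2 the two agree
    print(f_upper, f_lower, stats.f.ppf(alpha / 2, n1 - 1, n2 - 1))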
Choice of Sample Size

Charts VIo, VIp, VIq, and VIr (Appendix) provide operating characteristic curves for the F-test for α = 0.05 and α = 0.01, assuming that n1 = n2 = n. Charts VIo and VIp are used with the two-sided alternative of equation 11-89. They plot β against the abscissa parameter

$$\lambda = \frac{\sigma_1}{\sigma_2} \tag{11-95}$$

for various n1 = n2 = n. Charts VIq and VIr are used for the one-sided alternative of equation 11-93.
Example 11-19
For the chemical process yield analysis problem in Example 11-18, suppose that one of the concentrations affected the variance of the yield, so that one of the variances was four times the other, and we wished to detect this with probability at least 0.80. What sample size should be used? Note that if one variance is four times the other, then

$$\lambda = \frac{\sigma_1}{\sigma_2} = 2.$$

By referring to Chart VIo, with β = 0.20 and λ = 2, we find that a sample size of n1 = n2 = 20, approximately, is necessary.

A Large-Sample Test Procedure

When both sample sizes n1 and n2 are large, a test procedure that does not require the normality assumption can be developed. The test is based on the result that the sample standard deviations S1 and S2 have approximate normal distributions with means σ1 and σ2, respectively, and variances σ1²/2n1 and σ2²/2n2, respectively. To test

$$H_0: \sigma_1 = \sigma_2, \qquad H_1: \sigma_1 \neq \sigma_2, \tag{11-96}$$

we would use the test statistic

$$Z_0 = \frac{S_1 - S_2}{S_p\sqrt{\dfrac{1}{2n_1} + \dfrac{1}{2n_2}}}, \tag{11-97}$$

where S_p is the pooled estimator of the common standard deviation σ. This statistic has an approximate standard normal distribution when σ1 = σ2. We would reject H0 if Z0 > Z_{α/2} or if Z0 < -Z_{α/2}. Rejection regions for the one-sided alternatives have the same form as in other two-sample normal tests.
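As a quick illustration of this large-sample procedure, the following Python sketch computes Z0 from summary statistics; the sample sizes and standard deviations below are hypothetical values chosen only for the illustration.

    import numpy as np
    from scipy import stats

    n1, s1 = 60, 2.4    # hypothetical sample 1: size and standard deviation
    n2, s2 = 75, 2.9    # hypothetical sample 2

    # pooled estimate of the common standard deviation sigma
    sp = np.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

    # large-sample test statistic for H0: sigma1 = sigma2 (equation 11-97)
    z0 = (s1 - s2) / (sp * np.sqrt(1 / (2 * n1) + 1 / (2 * n2)))
    p_value = 2 * stats.norm.sf(abs(z0))   # two-sided P-value
    print(z0, p_value)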

11-3.5 Tests of Hypotheses on Two Proportions

The tests of Section 11-2.4 can be extended to the case where there are two binomial parameters of interest, say p1 and p2, and we wish to test that they are equal. That is, we wish to test

$$H_0: p_1 = p_2, \qquad H_1: p_1 \neq p_2. \tag{11-98}$$
We will present a large-sample procedure based on the normal approximation to the binomial and then outline one possible approach for small sample sizes.

Large-Sample Test for H0: p1 = p2

Suppose that two random samples of sizes n1 and n2 are taken from two populations, and let X1 and X2 represent the number of observations that belong to the class of interest in samples 1 and 2, respectively. Furthermore, suppose that the normal approximation to the binomial applies to each population, so that the estimators of the population proportions p̂1 = X1/n1 and p̂2 = X2/n2 have approximate normal distributions. Now, if the null hypothesis H0: p1 = p2 is true, then using the fact that p1 = p2 = p, the random variable

$$Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{p(1-p)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

is distributed approximately N(0, 1). An estimate of the common parameter p is

$$\hat{p} = \frac{X_1 + X_2}{n_1 + n_2}.$$

The test statistic for H0: p1 = p2 is then

$$Z_0 = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}. \tag{11-99}$$

If

$$|Z_0| > Z_{\alpha/2}, \tag{11-100}$$

the null hypothesis is rejected.

Example 11-20
Two different types of fire control computers are being considered for use by the U.S. Army in six-gun 105-mm batteries. The two computer systems are subjected to an operational test in which the total number of hits of the target is counted. Computer system 1 gave 250 hits out of 300 rounds, while computer system 2 gave 178 hits out of 260 rounds. Is there reason to believe that the two computer systems differ? To answer this question, we test

H0: p1 = p2,
H1: p1 ≠ p2.

Note that p̂1 = 250/300 = 0.8333, p̂2 = 178/260 = 0.6846, and

$$\hat{p} = \frac{250 + 178}{300 + 260} = 0.7643.$$

The value of the test statistic is

$$Z_0 = \frac{0.8333 - 0.6846}{\sqrt{0.7643(0.2357)\left(\dfrac{1}{300} + \dfrac{1}{260}\right)}} = 4.13.$$

If we use α = 0.05, then Z_{0.025} = 1.96 and -Z_{0.025} = -1.96, and we would reject H0, concluding that there is a significant difference in the two computer systems.
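The same calculation in Python (a sketch using scipy; the data are those of Example 11-20):

    import numpy as np
    from scipy import stats

    x1, n1 = 250, 300   # system 1: hits out of rounds fired
    x2, n2 = 178, 260   # system 2

    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_hat = (x1 + x2) / (n1 + n2)   # pooled estimate of the common p

    # test statistic, equation 11-99
    z0 = (p1_hat - p2_hat) / np.sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))
    print(z0)                                      # 4.13
    print(abs(z0) > stats.norm.ppf(1 - 0.05 / 2))  # True: reject H0 at alpha = 0.05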
Choice of Sample Size

The computation of the β error for the foregoing test is somewhat more involved than in the single-sample case. The problem is that the denominator of Z0 is an estimate of the standard deviation of p̂1 - p̂2 under the assumption that p1 = p2 = p. When H0: p1 = p2 is false, the standard deviation of p̂1 - p̂2 is

$$\sigma_{\hat{p}_1-\hat{p}_2} = \sqrt{\frac{p_1q_1}{n_1} + \frac{p_2q_2}{n_2}}. \tag{11-101}$$

If the alternative hypothesis is two-sided, the β risk turns out to be approximately

$$\beta = \Phi\left(\frac{Z_{\alpha/2}\sqrt{\bar{p}\bar{q}\left(1/n_1 + 1/n_2\right)} - (p_1 - p_2)}{\sigma_{\hat{p}_1-\hat{p}_2}}\right) - \Phi\left(\frac{-Z_{\alpha/2}\sqrt{\bar{p}\bar{q}\left(1/n_1 + 1/n_2\right)} - (p_1 - p_2)}{\sigma_{\hat{p}_1-\hat{p}_2}}\right), \tag{11-102}$$

where

$$\bar{p} = \frac{n_1p_1 + n_2p_2}{n_1 + n_2}, \qquad \bar{q} = \frac{n_1q_1 + n_2q_2}{n_1 + n_2},$$

and σ_{p̂1-p̂2} is given by equation 11-101. If the alternative hypothesis is H1: p1 > p2, then

$$\beta = \Phi\left(\frac{Z_{\alpha}\sqrt{\bar{p}\bar{q}\left(1/n_1 + 1/n_2\right)} - (p_1 - p_2)}{\sigma_{\hat{p}_1-\hat{p}_2}}\right), \tag{11-103}$$

and if the alternative hypothesis is H1: p1 < p2, then

$$\beta = 1 - \Phi\left(\frac{-Z_{\alpha}\sqrt{\bar{p}\bar{q}\left(1/n_1 + 1/n_2\right)} - (p_1 - p_2)}{\sigma_{\hat{p}_1-\hat{p}_2}}\right). \tag{11-104}$$

For a specified pair of values p1 and p2 we can find the sample sizes n1 = n2 = n required to give a test of size α that has specified type II error β. For the two-sided alternative the common sample size is approximately

$$n = \frac{\left(Z_{\alpha/2}\sqrt{(p_1+p_2)(q_1+q_2)/2} + Z_{\beta}\sqrt{p_1q_1 + p_2q_2}\right)^2}{(p_1 - p_2)^2}, \tag{11-105}$$

where q1 = 1 - p1 and q2 = 1 - p2. For the one-sided alternatives, replace Z_{α/2} in equation 11-105 with Z_α.
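Equation 11-105 is straightforward to evaluate directly. In the Python sketch below, the values of p1, p2, α, and β are arbitrary inputs chosen for illustration only.

    import math
    from scipy import stats

    def n_two_proportions(p1, p2, alpha, beta):
        # approximate common sample size, equation 11-105 (two-sided alternative)
        q1, q2 = 1 - p1, 1 - p2
        z_a = stats.norm.ppf(1 - alpha / 2)
        z_b = stats.norm.ppf(1 - beta)
        num = (z_a * math.sqrt((p1 + p2) * (q1 + q2) / 2)
               + z_b * math.sqrt(p1 * q1 + p2 * q2)) ** 2
        return math.ceil(num / (p1 - p2) ** 2)

    print(n_two_proportions(0.05, 0.10, alpha=0.05, beta=0.10))  # about 582 per sample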
Small-Sample Test for H0: p1 = p2

Most problems involving the comparison of proportions p1 and p2 have relatively large sample sizes, so the procedure based on the normal approximation to the binomial is widely used in practice. However, occasionally a small-sample-size problem is encountered. In such cases, the Z-tests are inappropriate and an alternative procedure is required. In this section we describe a procedure based on the hypergeometric distribution.
Suppose that X1 and X2 are the number of successes in two random samples of sizes n1 and n2, respectively. The test procedure requires that we view the total number of successes as fixed at the value X1 + X2 = Y. Now consider the hypotheses

H0: p1 = p2,
H1: p1 > p2.

Given that X1 + X2 = Y, large values of X1 support H1, whereas small or moderate values of X1 support H0. Therefore, we will reject H0 whenever X1 is sufficiently large.
Since the combined sample of n1 + n2 observations contains X1 + X2 = Y total successes, if H0: p1 = p2, the successes are no more likely to be concentrated in the first sample than in the second. That is, all the ways in which the n1 + n2 responses can be divided into one sample of n1 responses and a second sample of n2 responses are equally likely. The number of ways of selecting X1 successes for the first sample, leaving Y - X1 successes for the second, is

$$\binom{Y}{X_1}\binom{n_1 + n_2 - Y}{n_1 - X_1}.$$

Because outcomes are equally likely, the probability of there being exactly X1 successes in sample 1 is determined by the ratio of the number of sample 1 outcomes having X1 successes to the total number of outcomes, or

$$P\!\left(X_1 = x_1 \mid Y \text{ successes in } n_1 + n_2 \text{ responses}\right) = \frac{\dbinom{Y}{x_1}\dbinom{n_1+n_2-Y}{n_1-x_1}}{\dbinom{n_1+n_2}{n_1}}, \tag{11-106}$$

given that H0: p1 = p2 is true. We recognize equation 11-106 as a hypergeometric distribution.
To use equation 11-106 for hypothesis testing, we would compute the probability of finding a value of X1 at least as extreme as the observed value of x1. Note that this probability is a P-value. If this P-value is sufficiently small, then the null hypothesis is rejected. This approach could also be applied to lower-tailed and two-tailed alternatives.

Example 11-21
Insulating cloth used in printed circuit boards is manufactured in large rolls. The manufacturer is trying to improve the process yield, that is, the number of defect-free rolls produced. A sample of 10 rolls contains exactly four defect-free rolls. From analysis of the defect types, manufacturing engineering suggests several changes in the process. Following implementation of these changes, another sample of 10 rolls yields 8 defect-free rolls. Do the data support the claim that the new process is better than the old one, using α = 0.10?
To answer this question, we compute the P-value. In our example, n1 = n2 = 10, Y = 8 + 4 = 12, and the observed value of x1 = 8. The values of X1 that are more extreme than 8 are 9 and 10. Therefore

$$P(X_1 = 8 \mid 12 \text{ successes}) = \frac{\dbinom{12}{8}\dbinom{8}{2}}{\dbinom{20}{10}} = 0.0750,$$

$$P(X_1 = 9 \mid 12 \text{ successes}) = \frac{\dbinom{12}{9}\dbinom{8}{1}}{\dbinom{20}{10}} = 0.0095,$$

$$P(X_1 = 10 \mid 12 \text{ successes}) = \frac{\dbinom{12}{10}\dbinom{8}{0}}{\dbinom{20}{10}} = 0.0003.$$

The P-value is P = 0.0750 + 0.0095 + 0.0003 = 0.0848. Thus, at the level α = 0.10, the null hypothesis is rejected and we conclude that the engineering changes have improved the process yield.

This test procedure is sometimes called the Fisher-Irwin test. Because the test depends on the assumption that X1 + X2 is fixed at some value, some statisticians have argued against use of the test when X1 + X2 is not actually fixed. Clearly X1 + X2 is not fixed by the sampling procedure in our example. However, because there are no other better competing procedures, the Fisher-Irwin test is often used whether or not X1 + X2 is actually fixed in advance.
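Because equation 11-106 is an ordinary hypergeometric distribution, the P-value of Example 11-21 can be obtained from any hypergeometric tail routine. A Python sketch (scipy assumed):

    from scipy import stats

    n1, n2 = 10, 10   # rolls sampled from the new and old processes
    y = 8 + 4         # total number of defect-free rolls, Y = X1 + X2
    x1 = 8            # defect-free rolls observed in sample 1

    # P(X1 >= 8 | Y = 12): upper tail of a hypergeometric distribution with
    # population size n1 + n2 = 20, y "successes," and n1 draws
    p_value = stats.hypergeom.sf(x1 - 1, n1 + n2, y, n1)
    print(p_value)    # 0.0848, so H0 is rejected at alpha = 0.10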

11-4 TESTING FOR GOODNESS OF FIT

The hypothesis-testing procedures that we have discussed in previous sections are for problems in which the form of the density function of the random variable is known, and the hypotheses involve the parameters of the distribution. Another kind of hypothesis is often encountered: we do not know the probability distribution of the random variable under study, say X, and we wish to test the hypothesis that X follows a particular probability distribution. For example, we might wish to test the hypothesis that X follows the normal distribution.
In this section, we describe a formal goodness-of-fit test procedure based on the chi-square distribution. We also describe a very useful graphical technique called probability plotting. Finally, we give some guidelines useful in selecting the form of the population distribution.

The Chi-Square Goodness-of-Fit Test

The test procedure requires a random sample of size n of the random variable X, whose probability density function is unknown. These n observations are arrayed in a frequency histogram having k class intervals. Let O_i be the observed frequency in the ith class interval. From the hypothesized probability distribution we compute the expected frequency in the ith class interval, denoted E_i. The test statistic is

$$\chi_0^2 = \sum_{i=1}^{k}\frac{(O_i - E_i)^2}{E_i}. \tag{11-107}$$

It can be shown that χ0² approximately follows the chi-square distribution with k - p - 1 degrees of freedom, where p represents the number of parameters of the hypothesized distribution estimated by sample statistics. This approximation improves as n increases. We would reject the hypothesis that X conforms to the hypothesized distribution if χ0² > χ²_{α,k-p-1}.
One point to be noted in the application of this test procedure concerns the magnitude of the expected frequencies. If these expected frequencies are too small, then χ0² will not reflect the departure of observed from expected, but only the smallness of the expected frequencies. There is no general agreement regarding the minimum value of expected frequencies, but values of 3, 4, and 5 are widely used as minimal. Should an expected frequency be too small, it can be combined with the expected frequency in an adjacent class interval. The corresponding observed frequencies would then be combined also, and k would be reduced by 1. Class intervals are not required to be of equal width.
We now give three examples of the test procedure.

Example 11-22
A Completely Specified Distribution  A computer scientist has developed an algorithm for generating pseudorandom integers over the interval 0-9. He codes the algorithm and generates 1000 pseudorandom digits. The data are shown in Table 11-3. Is there evidence that the random number generator is working correctly?
If the random number generator is working correctly, then the values 0-9 should follow the discrete uniform distribution, which implies that each of the integers should occur about 100 times. That is, the expected frequencies Ei = 100, for i = 0, 1, ..., 9. Since these expected frequencies can be determined without estimating any parameters from the sample data, the resulting chi-square goodness-of-fit test will have k - p - 1 = 10 - 0 - 1 = 9 degrees of freedom.
The observed value of the test statistic is

$$\chi_0^2 = \sum_{i=0}^{9}\frac{(O_i - E_i)^2}{E_i} = \frac{(94-100)^2}{100} + \frac{(93-100)^2}{100} + \cdots + \frac{(94-100)^2}{100} = 3.72.$$

Since χ²_{0.05,9} = 16.92, we are unable to reject the hypothesis that the data come from a discrete uniform distribution. Therefore, the random number generator seems to be working satisfactorily.
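This computation is a one-line call in most statistical software. A Python sketch using the Table 11-3 frequencies (scipy assumed):

    from scipy import stats

    observed = [94, 93, 112, 101, 104, 95, 100, 99, 108, 94]   # Table 11-3
    # with no expected frequencies supplied, chisquare assumes equal E_i = 100
    chi2_0, p_value = stats.chisquare(observed)
    print(chi2_0, p_value)               # 3.72 and a large P-value
    print(stats.chi2.ppf(0.95, df=9))    # critical value chi^2_{0.05,9} = 16.92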

Example 11-23
A Discrete Distribution  The number of defects in printed circuit boards is hypothesized to follow a Poisson distribution. A random sample of n = 60 printed boards has been collected, and the number of defects observed. The following data result:

Table 11-3 Data for Example 11-22

                            0    1    2    3    4    5    6    7    8    9   Total
Observed frequencies, Oi   94   93  112  101  104   95  100   99  108   94   1000
Expected frequencies, Ei  100  100  100  100  100  100  100  100  100  100   1000
Number of Defects    Observed Frequency
0                            32
1                            15
2                             9
3                             4
i
The mean of the assumed Poisson distribution in this example is unknown and must be estimated from the sample data. The estimate of the mean number of defects per board is the sample average; that is, (32·0 + 15·1 + 9·2 + 4·3)/60 = 0.75. From the cumulative Poisson distribution with parameter 0.75 we may compute the expected frequencies as Ei = npi, where pi is the theoretical, hypothesized probability associated with the ith class interval and n is the total number of observations. The appropriate hypotheses are

$$H_0: p(x) = \frac{e^{-0.75}(0.75)^x}{x!}, \quad x = 0, 1, 2, \ldots,$$
$$H_1: p(x) \text{ is not Poisson with } \lambda = 0.75.$$

We may compute the expected frequencies as follows:

Number of Failures    Probability    Expected Frequency
0                       0.472              28.32
1                       0.354              21.24
2                       0.133               7.98
≥ 3                     0.041               2.46

The expected frequencies are obtained by multiplying the sample size times the respective probabilities. Since the expected frequency in the last cell is less than 3, we combine the last two cells:

Number of Failures    Observed Frequency    Expected Frequency
0                            32                   28.32
1                            15                   21.24
≥ 2                          13                   10.44

The test statistic (which will have k - p - 1 = 3 - 1 - 1 = 1 degree of freedom) becomes

$$\chi_0^2 = \frac{(32-28.32)^2}{28.32} + \frac{(15-21.24)^2}{21.24} + \frac{(13-10.44)^2}{10.44} = 2.94,$$

and since χ²_{0.05,1} = 3.84, we cannot reject the hypothesis that the occurrence of defects follows a Poisson distribution with mean 0.75 defects per board.
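A sketch of the same Poisson goodness-of-fit computation in Python (scipy assumed; small differences from the text's 2.94 are due to rounding of the expected frequencies):

    import numpy as np
    from scipy import stats

    observed = np.array([32, 15, 13])   # cells: 0, 1, and >= 2 defects (combined)
    n, lam = 60, 0.75                   # sample size and estimated Poisson mean

    p = [stats.poisson.pmf(0, lam), stats.poisson.pmf(1, lam)]
    p.append(1 - sum(p))                # P(X >= 2), the combined cell
    expected = n * np.array(p)          # approximately 28.3, 21.3, 10.4

    chi2_0 = ((observed - expected) ** 2 / expected).sum()
    # one parameter was estimated, so df = k - p - 1 = 3 - 1 - 1 = 1
    print(chi2_0, stats.chi2.ppf(0.95, df=1))   # about 2.9 versus 3.84: do not reject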

Example 11-24
A Continuous Distribution  A manufacturing engineer is testing a power supply used in a word-processing workstation. He wishes to determine whether output voltage is adequately described by a normal distribution. From a random sample of n = 100 units he obtains sample estimates of the mean and standard deviation, x̄ = 12.04 V and s = 0.08 V.
A common practice in constructing the class intervals for the frequency distribution used in the chi-square goodness-of-fit test is to choose the cell boundaries so that the expected frequencies Ei = npi are equal for all cells. To use this method, we want to choose the cell boundaries a0, a1, ..., ak for the k cells so that all the probabilities

$$p_i = P(a_{i-1} \le X \le a_i) = \int_{a_{i-1}}^{a_i} f(x)\,dx$$

are equal. Suppose we decide to use k = 8 cells. For the standard normal distribution, the intervals that divide the scale into eight equally likely segments are [0, 0.32), [0.32, 0.675), [0.675, 1.15), [1.15, ∞), and their four "mirror image" intervals on the other side of zero. Denoting these standard normal endpoints by a0, a1, ..., a8, it is a simple matter to calculate the endpoints that are necessary for the general normal problem at hand; namely, we define the new class interval endpoints by the transformation a_i' = x̄ + s·a_i, i = 0, 1, ..., 8. For example, the sixth interval's right endpoint is

12.04 + (0.08)(0.675) = 12.094.

For each interval, pi = 1/8 = 0.125, so the expected cell frequencies are Ei = npi = 100(0.125) = 12.5. The complete table of observed and expected frequencies is given in Table 11-4.
The computed value of the chi-square statistic is

$$\chi_0^2 = \frac{(10-12.5)^2}{12.5} + \frac{(14-12.5)^2}{12.5} + \cdots + \frac{(14-12.5)^2}{12.5} = 1.28.$$

Since two parameters in the normal distribution have been estimated, we would compare χ0² = 1.28 to a chi-square distribution with k - p - 1 = 8 - 2 - 1 = 5 degrees of freedom. Using α = 0.10, we see that χ²_{0.10,5} = 9.24, and so we conclude that there is no reason to believe that output voltage is not normally distributed.
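The equal-probability cells of this example are easy to construct with an inverse normal routine. A Python sketch (scipy assumed) that reproduces the boundaries and the chi-square statistic:

    import numpy as np
    from scipy import stats

    xbar, s, n, k = 12.04, 0.08, 100, 8
    observed = np.array([10, 14, 12, 13, 11, 12, 14, 14])    # Table 11-4

    # interior boundaries making the k cells equally likely under N(xbar, s^2)
    bounds = xbar + s * stats.norm.ppf(np.arange(1, k) / k)  # 11.948, ..., 12.132

    expected = np.full(k, n / k)                             # 12.5 in every cell
    chi2_0 = ((observed - expected) ** 2 / expected).sum()
    print(bounds.round(3))
    print(chi2_0, stats.chi2.ppf(0.90, df=k - 2 - 1))        # 1.28 versus 9.24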

Probability Plotting

Graphical methods are also useful when selecting a probability distribution to describe data. Probability plotting is a graphical method for determining whether the data conform to a hypothesized distribution based on a subjective visual examination of the data. The general procedure is very simple and can be performed quickly. Probability plotting requires special graph paper, known as probability paper, that has been designed for the hypothesized distribution. Probability paper is widely available for the normal, lognormal, Weibull, and various chi-square and gamma distributions. To construct a probability plot, the observations in the sample are first ranked from smallest to largest.
Table 11-4 Observed and Expected Frequencies

Class Interval            Observed Frequency, Oi    Expected Frequency, Ei
x < 11.948                         10                       12.5
11.948 ≤ x < 11.986                14                       12.5
11.986 ≤ x < 12.014                12                       12.5
12.014 ≤ x < 12.040                13                       12.5
12.040 ≤ x < 12.066                11                       12.5
12.066 ≤ x < 12.094                12                       12.5
12.094 ≤ x < 12.132                14                       12.5
12.132 ≤ x                         14                       12.5
Totals                            100                      100
That is, the sample X1, X2, ..., Xn is arranged as X(1), X(2), ..., X(n), where X(j) ≤ X(j+1). The ordered observations X(j) are then plotted against their observed cumulative frequency (j - 0.5)/n on the appropriate probability paper. If the hypothesized distribution adequately describes the data, the plotted points will fall approximately along a straight line; if the plotted points deviate significantly from a straight line, then the hypothesized model is not appropriate. Usually, the determination of whether or not the data plot as a straight line is subjective.

Example 11-25
To illustrate probability plotting, consider the following data:

-0.314, 1.080, 0.863, -0.179, -1.390, -0.563, 1.436, 1.153, 0.504, -0.801.

We hypothesize that these data are adequately modeled by a normal distribution. The observations are arranged in ascending order and their cumulative frequencies (j - 0.5)/n calculated as follows:

j       X(j)      (j - 0.5)/n
1      -1.390        0.05
2      -0.801        0.15
3      -0.563        0.25
4      -0.314        0.35
5      -0.179        0.45
6       0.504        0.55
7       0.863        0.65
8       1.080        0.75
9       1.153        0.85
10      1.436        0.95

The pairs of values X(j) and (j - 0.5)/n are now plotted on normal probability paper. This plot is shown in Fig. 11-7. Most normal probability paper plots 100(j - 0.5)/n on the right vertical scale and 100[1 - (j - 0.5)/n] on the left vertical scale, with the variable value plotted on the horizontal scale. We have chosen to plot X(j) versus 100(j - 0.5)/n on the right vertical in Fig. 11-7. A straight line,

Figure 11-7 Normal probability plot.


chosen subjectively, has been drawn through the plotted points. In drawing the straight line, one
should be influenced more by the points near the middle than the extreme points. Since the points fall
generally near the line, we conclude that a normal distribution describes the data.

We can obtain an estimate of the mean and standard deviation directly from the normal probability plot. We see from the straight line in Fig. 11-7 that the mean is estimated as the 50th percentile of the sample, or μ̂ = 0.10, approximately, and the standard deviation is estimated as the difference between the 84th and 50th percentiles, or σ̂ = 0.95 - 0.10 = 0.85, approximately.
A normal probability plot can also be constructed on ordinary graph paper by plotting the standardized normal scores Zj against X(j), where the standardized normal scores satisfy

$$\frac{j - 0.5}{n} = P(Z \le Z_j) = \Phi(Z_j).$$

For example, if (j - 0.5)/n = 0.05, then Φ(Zj) = 0.05 implies that Zj = -1.64. To illustrate, consider the data from Example 11-25. In the table below we have shown the standardized normal scores in the last column:

j XCJ) (j-O.5)/n Zj
1 -1.390 0.05 -1.64
2 -0.801 0.15 -1.04
3 -0.563 0.25 -0.67
4 -0.314 0.35 -0.39
5 -0.179 0.45 -0.13
6 0.504 0.55 0.13
7 0.863 0.65 0.39
8 1.080 0.75 0.67
9 1.153 0.85 1.04
10 1.436 0.95 1.64

Figure 11-8 presents the plot of Zj versus X(j). This normal probability plot is equivalent to the one in Fig. 11-7. Many software packages will construct probability plots for various distributions. For a Minitab® example, see Section 11-6.

Figure 11-8 Normal probability plot.
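The standardized scores are simple to generate for any sample. A Python sketch (numpy and scipy assumed) using the data of Example 11-25:

    import numpy as np
    from scipy import stats

    x = np.array([-0.314, 1.080, 0.863, -0.179, -1.390,
                  -0.563, 1.436, 1.153, 0.504, -0.801])

    x_sorted = np.sort(x)                    # order statistics X_(j)
    n = len(x)
    freq = (np.arange(1, n + 1) - 0.5) / n   # cumulative frequencies (j - 0.5)/n
    z = stats.norm.ppf(freq)                 # standardized normal scores Z_j

    for xj, zj in zip(x_sorted, z):
        print(f"{xj:7.3f}  {zj:6.2f}")       # reproduces the table above
    # plotting z against x_sorted (with any plotting package) gives Fig. 11-8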


Selecting the Form of a Distribution

The choice of the distribution hypothesized to fit the data is important. Sometimes analysts can use their knowledge of the physical phenomena to choose a distribution to model the data. For example, in studying the circuit board defect data in Example 11-23, a Poisson distribution was hypothesized to describe the data, because failures are an "event per unit" phenomenon, and such phenomena are often well modeled by a Poisson distribution. Sometimes previous experience can suggest the choice of distribution.
In situations where there is no previous experience or theory to suggest a distribution that describes the data, analysts must rely on other methods. Inspection of a frequency histogram can often suggest an appropriate distribution. One may also use the display in Fig. 11-9 to assist in selecting a distribution that describes the data. When using Fig. 11-9, note that the β2 axis increases downward. This figure shows the regions in the β1, β2 plane for several standard probability distributions, where

$$\beta_1 = \frac{E(X-\mu)^3}{\left[E(X-\mu)^2\right]^{3/2}}$$

Figure 11-9 Regions in the β1, β2 plane for various standard distributions. (Adapted from G. J. Hahn and S. S. Shapiro, Statistical Models in Engineering, John Wiley & Sons, New York, 1967; used with permission of the publisher and Professor E. S. Pearson, University of London.)
is a standardized measure of skewness and

$$\beta_2 = \frac{E(X-\mu)^4}{\sigma^4}$$

is a standardized measure of kurtosis (or peakedness). To use Fig. 11-9, calculate the sample estimates of β1 and β2, say

$$\hat{\beta}_1 = \frac{M_3}{M_2^{3/2}} \qquad \text{and} \qquad \hat{\beta}_2 = \frac{M_4}{M_2^2},$$

where

$$M_j = \frac{1}{n}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^j, \qquad j = 1, 2, 3, 4,$$
and plot the point β̂1, β̂2. If this plotted point falls reasonably close to a point, line, or area that corresponds to one of the distributions given in the figure, then this distribution is a logical candidate to model the data.
From inspecting Fig. 11-9 we note that all normal distributions are represented by the point β1 = 0 and β2 = 3. This is reasonable, since all normal distributions have the same shape. Similarly, the exponential and uniform distributions are represented by a single point in the β1, β2 plane. The gamma and lognormal distributions are represented by lines, because their shapes depend on their parameter values. Note that these lines are close together, which may explain why some data sets are modeled equally well by either distribution. We also observe that there are regions of the β1, β2 plane for which none of the distributions in Fig. 11-9 is appropriate. Other, more general distributions, such as the Johnson or Pearson families of distributions, may be required in these cases. Procedures for fitting these families of distributions and figures similar to Fig. 11-9 are given in Hahn and Shapiro (1967).
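The sample estimates β̂1 and β̂2 are easily computed from the moment definitions above. A Python sketch (numpy assumed):

    import numpy as np

    def beta_estimates(x):
        # sample skewness and kurtosis measures for locating data in Fig. 11-9
        x = np.asarray(x, dtype=float)
        d = x - x.mean()
        m2, m3, m4 = (d**2).mean(), (d**3).mean(), (d**4).mean()   # moments M_j
        return m3 / m2**1.5, m4 / m2**2      # beta1_hat, beta2_hat

    rng = np.random.default_rng(1)
    print(beta_estimates(rng.normal(size=5000)))        # near (0, 3) for normal data
    print(beta_estimates(rng.exponential(size=5000)))   # near (2, 9) for exponential data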

11-5 CONTINGENCY TABLE TESTS


Many times, the n elements of a sample from a population may be classified according to two different criteria. It is then of interest to know whether the two methods of classification are statistically independent; for example, we may consider the population of graduating engineers, and we may wish to determine whether starting salary is independent of academic discipline. Assume that the first method of classification has r levels and that the second method of classification has c levels. We will let Oij be the observed frequency for level i of the first classification method and for level j of the second classification method. The data would, in general, appear as in Table 11-5. Such a table is commonly called an r × c contingency table.
We are interested in testing the hypothesis that the row and column methods of classification are independent. If we reject this hypothesis, we conclude there is some interaction between the two criteria of classification. The exact test procedures are difficult to obtain, but an approximate test statistic is valid for large n. Assume the Oij to be multinomial random variables and pij to be the probability that a randomly selected element falls in the ijth
Table 11-5 An r × c Contingency Table

           Column
Row      1      2     ...     c
1       O11    O12    ...    O1c
2       O21    O22    ...    O2c
⋮        ⋮      ⋮             ⋮
r       Or1    Or2    ...    Orc

cell, given that the two classifications are independent. Then pij = ui vj, where ui is the probability that a randomly selected element falls in row class i and vj is the probability that a randomly selected element falls in column class j. Now, assuming independence, the maximum likelihood estimators of ui and vj are

$$\hat{u}_i = \frac{1}{n}\sum_{j=1}^{c} O_{ij}, \qquad \hat{v}_j = \frac{1}{n}\sum_{i=1}^{r} O_{ij}. \tag{11-108}$$

Therefore, assuming independence, the expected frequency of each cell is

$$E_{ij} = n\hat{u}_i\hat{v}_j = \frac{1}{n}\left(\sum_{j=1}^{c} O_{ij}\right)\left(\sum_{i=1}^{r} O_{ij}\right). \tag{11-109}$$

Then, for large n, the statistic

$$\chi_0^2 = \sum_{i=1}^{r}\sum_{j=1}^{c}\frac{\left(O_{ij} - E_{ij}\right)^2}{E_{ij}} \tag{11-110}$$

has an approximate chi-square distribution with (r - 1)(c - 1) degrees of freedom if the null hypothesis is true, and we would reject the hypothesis of independence if χ0² > χ²_{α,(r-1)(c-1)}.

Example 11-26
A company has to choose among three pension plans. Management wishes to know whether the preference for plans is independent of job classification. The opinions of a random sample of 500 employees are shown in Table 11-6. We may compute û1 = 340/500 = 0.68, û2 = 160/500 = 0.32, v̂1 = 200/500 = 0.40, v̂2 = 200/500 = 0.40, and v̂3 = 100/500 = 0.20. The expected frequencies may be computed from equation 11-109. For example, the expected number of salaried workers favoring pension plan 1 is

$$E_{11} = n\hat{u}_1\hat{v}_1 = 500(0.68)(0.40) = 136.$$

Table 11-7 Expected Frequencies for Example 11-26

                     Pension Plan
                  1      2      3    Total
Salaried
workers         136    136     68     340
Hourly
workers          64     64     32     160
Totals          200    200    100     500

The expected frequencies are shown in Table 11-7. The test statistic χ0² is computed from equation 11-110. Since the computed value exceeds χ²_{0.05,2} = 5.99, we reject the hypothesis of independence and conclude that the preference for pension plans is not independent of job classification.
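The whole analysis can be reproduced with a contingency-table routine. In the Python sketch below, the observed counts are hypothetical (Table 11-6 is not reproduced in this excerpt); they were chosen only to be consistent with the marginal totals of Table 11-7, so the expected frequencies agree with the text.

    import numpy as np
    from scipy import stats

    # hypothetical observed counts: rows are salaried/hourly, columns are plans 1-3
    observed = np.array([[160, 140, 40],
                         [ 40,  60, 60]])

    chi2_0, p_value, dof, expected = stats.chi2_contingency(observed)
    print(expected)                     # 136, 136, 68 and 64, 64, 32, as in Table 11-7
    print(chi2_0, dof, p_value)
    print(stats.chi2.ppf(0.95, dof))    # critical value chi^2_{0.05,2} = 5.99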

Using the two-way contingency table to test independence between two variables of classification in a sample from a single population of interest is only one application of contingency table methods. Another common situation occurs when there are r populations of interest and each population is divided into the same c categories. A sample is then taken from the ith population and the counts entered in the appropriate columns of the ith row. In this situation we want to investigate whether or not the proportions in the c categories are the same for all populations. The null hypothesis in this problem states that the populations are homogeneous with respect to the categories. For example, when there are only two categories, such as success and failure, defective and nondefective, and so on, the test for homogeneity is really a test of the equality of r binomial parameters. Calculation of expected frequencies, determination of degrees of freedom, and computation of the chi-square statistic for the test for homogeneity are identical to the test for independence.
11-6 SAMPLE COMPUTER OUTPUT

There are many statistical packages available that can be used to construct confidence intervals, carry out tests of hypotheses, and determine sample size. In this section we present results for several problems using Minitab®.

Example 11-27
A study was conducted on the tensile strength of a particular fiber under various temperatures. The results of the study (given in MPa) are

226, 237, 272, 245, 428, 298, 345, 201, 327, 301, 317, 395, 332, 238, 367.

Suppose it is of interest to determine if the mean tensile strength is greater than 250 MPa. That is, test

H0: μ = 250,
H1: μ > 250.

A normal probability plot was constructed for the tensile strength and is given in Fig. 11-10. The normality assumption appears to be satisfied. The population variance for tensile strength is assumed to be unknown, and as a result, a single-sample t-test will be used for this problem.
The results from Minitab® for hypothesis testing and confidence interval on the mean are

    Test of mu = 250 vs mu > 250

    Variable    N    Mean   StDev   SE Mean
    TS         15   301.9    65.9      17.0

    Variable   95.0% Lower Bound      T       P
    TS                     272.0   3.05   0.004

The P-value is reported as 0.004, leading us to reject the null hypothesis and conclude that the mean tensile strength is greater than 250 MPa. The lower one-sided 95% confidence interval is given as 272 < μ.

Example 11-28
Reconsider Example 11-17, comparing two methods for predicting the shear strength for steel plate girders. The Minitab® output for the paired t-test using α = 0.10 is

    Paired T for Karlsruhe - Lehigh

                   N      Mean    StDev   SE Mean
    Karlsruhe      9    1.3401   0.1460    0.0487
    Lehigh         9    1.0662   0.0494    0.0165
    Difference     9    0.2739   0.1351    0.0450

    90% CI for mean difference: (0.1901, 0.3576)
    T-Test of mean difference = 0 (vs not = 0): T-Value = 6.08  P-Value = 0.000

Figure 11-10 Normal probability plot for Example 11-27 (tensile strength, MPa).

The results of the Minitab® output are in agreement with the results found in Example 11-17. Minitab® also provides the appropriate confidence interval for the problem. Using α = 0.10, the level of confidence is 0.90; the 90% confidence interval on the difference between the two methods is (0.1901, 0.3576). Since the confidence interval does not contain zero, we also conclude that there is a significant difference between the two methods.

Example 11-29
The number of airline flights canceled is recorded for all airlines for each day of service. The number of flights recorded and the number of these flights that were canceled on a single day in March 2001 are provided below for two major airlines.

Airline                   # of Flights    # of Canceled Flights    Proportion
American Airlines             2128                 115               0.0540
America West Airlines          635                  49               0.0772

Is there a significant difference in the proportion of canceled flights for the two airlines? The hypotheses of interest are H0: p1 = p2 versus H1: p1 ≠ p2. The Minitab® results for a two-sample test on proportions and a two-sided confidence interval on the difference in proportions are

    Sample      X      N   Sample p
    1         115   2128   0.054041
    2          49    635   0.077165

    Estimate for p(1) - p(2): -0.0231240
    95% CI for p(1) - p(2): (-0.0459949, -0.000253139)
    Test for p(1) - p(2) = 0 (vs not = 0): Z = -1.98  P-Value = 0.048

The P-value is given as 0.048, indicating that there is a significant difference between the proportions of flights canceled for American and America West airlines at a 5% level of significance. The 95% confidence interval of (-0.0460, -0.0003) indicates that America West Airlines had a statistically significantly higher proportion of canceled flights than American Airlines for the single day.

Example 11-30
The mean compressive strength for a particular high-strength concrete is hypothesized to be μ = 20 MPa. It is known that the standard deviation of compressive strength is σ = 1.3 MPa. A group of engineers wants to determine the number of concrete specimens that will be needed in the study to detect a decrease in the mean compressive strength of two standard deviations. If the average compressive strength is actually less than μ - 2σ, they want to be confident of correctly detecting this significant difference. In other words, the test of interest is H0: μ = 20 versus H1: μ < 20. For this study, the significance level is set at α = 0.05 and the power of the test is 1 - β = 0.99. What is the minimum number of concrete specimens that should be used in this study? For a difference of 2σ or 2.6 MPa, α = 0.05, and 1 - β = 0.99, the minimum sample size can be found using Minitab®. The resulting output is

    Testing mean = null (versus < null)
    Calculating power for mean = null + difference
    Alpha = 0.05  Sigma = 1.3

                  Sample   Target   Actual
    Difference      Size    Power    Power
          -2.6         6   0.9900   0.9936

The minimum number of specimens to be used in the study should be n = 6 in order to attain the desired power and level of significance.

Example 11-31
A manufacturer of rubber belts wishes to inspect and control the number of nonconforming belts produced on line. The proportion of nonconforming belts that is acceptable is p = 0.01. For practical purposes, if the proportion increases to p = 0.035 or greater, the manufacturer wants to detect this change. That is, the test of interest is H0: p = 0.01 versus H1: p > 0.01. If the acceptable level of significance is α = 0.05 and the power is 1 - β = 0.95, how many rubber belts should be selected for inspection? For α = 0.05 and 1 - β = 0.95, the appropriate sample size can be determined using Minitab®. The output is

    Testing proportion = 0.01 (versus > 0.01)
    Alpha = 0.05

    Alternative   Sample   Target   Actual
     Proportion     Size    Power    Power
       3.50E-02      348   0.9500   0.9502

Therefore, to adequately detect a significant change in the proportion of nonconforming rubber belts, random samples of at least n = 348 would be needed.
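The sample size reported by Minitab® can be checked against the usual normal-approximation formula for a one-sided test on a proportion, n = [Z_α√(p0 q0) + Z_β√(p1 q1)]²/(p1 - p0)² (our check, not a formula quoted from the Minitab® documentation). A Python sketch:

    import math
    from scipy import stats

    p0, p1 = 0.01, 0.035      # null and alternative proportions
    alpha, beta = 0.05, 0.05  # power 1 - beta = 0.95

    z_a, z_b = stats.norm.ppf(1 - alpha), stats.norm.ppf(1 - beta)
    n = ((z_a * math.sqrt(p0 * (1 - p0)) + z_b * math.sqrt(p1 * (1 - p1)))
         / (p1 - p0)) ** 2
    print(math.ceil(n))       # 348, in agreement with the Minitab output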

11-7 SUMMARY

This chapter has introduced hypothesis testing. Procedures for testing hypotheses on means and variances are summarized in Table 11-8. The chi-square goodness-of-fit test was introduced to test the hypothesis that an empirical distribution follows a particular probability law. Graphical methods are also useful in goodness-of-fit testing, particularly when sample sizes are small. Two-way contingency tables for testing the hypothesis that two methods of classification of a sample are independent were also introduced. Several computer examples were also presented.

11-8 EXERCISES
11-1. The breaking strength of a fiber used in manufacturing cloth is required to be at least 160 psi. Past experience has indicated that the standard deviation of breaking strength is 3 psi. A random sample of four specimens is tested and the average breaking strength is found to be 153 psi.
(a) Should the fiber be judged acceptable with α = 0.05?
(b) What is the probability of accepting H0: μ ≥ 160 if the fiber has a true breaking strength of 165 psi?

11-2. The yield of a chemical process is being studied. The variance of yield is known from previous experience with this process to be 5 (units of σ² = percentage²). The past five days of plant operation have resulted in the following yields (in percentages): 91.6, 88.75, 90.8, 89.95, 91.3.
(a) Is there reason to believe the yield is less than 90%?
(b) What sample size would be required to detect a true mean yield of 85% with probability 0.95?

11-3. The diameters of bolts are known to have a standard deviation of 0.0001 inch. A random sample of 10 bolts yields an average diameter of 0.2546 inch.
(a) Test the hypothesis that the true mean diameter of bolts equals 0.255 inch, using α = 0.05.
(b) What size sample would be necessary to detect a true mean bolt diameter of 0.2551 inch with a probability of at least 0.90?

11-4. Consider the data in Exercise 10-39.
(a) Test the hypothesis that the mean piston ring diameter is 74.035 mm. Use α = 0.01.
(b) What sample size is required to detect a true mean diameter of 74.030 with a probability of at least 0.95?

Table 11-8 Summary of Hypothesis Testing Procedures on Means and Variances
(columns: null hypothesis and test statistic; alternative hypothesis; rejection criteria; OC curve parameter)

H0: μ = μ0, σ² known. Test statistic:

$$Z_0 = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$

  H1: μ ≠ μ0    reject if |Z0| > Z_{α/2}    d = |μ - μ0|/σ
  H1: μ > μ0    reject if Z0 > Z_α          d = (μ - μ0)/σ
  H1: μ < μ0    reject if Z0 < -Z_α         d = (μ0 - μ)/σ

H0: μ = μ0, σ² unknown. Test statistic:

$$t_0 = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$$

  H1: μ ≠ μ0    reject if |t0| > t_{α/2,n-1}    d = |μ - μ0|/σ
  H1: μ > μ0    reject if t0 > t_{α,n-1}        d = (μ - μ0)/σ
  H1: μ < μ0    reject if t0 < -t_{α,n-1}       d = (μ0 - μ)/σ

H0: μ1 = μ2, σ1² and σ2² known. Test statistic:

$$Z_0 = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

  H1: μ1 ≠ μ2    reject if |Z0| > Z_{α/2}    d = |μ1 - μ2|/√(σ1² + σ2²)
  H1: μ1 > μ2    reject if Z0 > Z_α          d = (μ1 - μ2)/√(σ1² + σ2²)
  H1: μ1 < μ2    reject if Z0 < -Z_α         d = (μ2 - μ1)/√(σ1² + σ2²)

H0: μ1 = μ2, σ1² = σ2² = σ² unknown. Test statistic:

$$t_0 = \frac{\bar{X}_1 - \bar{X}_2}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

  H1: μ1 ≠ μ2    reject if |t0| > t_{α/2,n1+n2-2}    d = |μ1 - μ2|/2σ
  H1: μ1 > μ2    reject if t0 > t_{α,n1+n2-2}        d = (μ1 - μ2)/2σ
  H1: μ1 < μ2    reject if t0 < -t_{α,n1+n2-2}       d = (μ2 - μ1)/2σ

H0: μ1 = μ2, σ1² ≠ σ2² unknown. Test statistic:

$$t_0 = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}, \qquad
\nu = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{\left(S_1^2/n_1\right)^2}{n_1+1} + \dfrac{\left(S_2^2/n_2\right)^2}{n_2+1}} - 2$$

  H1: μ1 ≠ μ2    reject if |t0| > t_{α/2,ν}
  H1: μ1 > μ2    reject if t0 > t_{α,ν}
  H1: μ1 < μ2    reject if t0 < -t_{α,ν}

H0: σ² = σ0². Test statistic:

$$\chi_0^2 = \frac{(n-1)S^2}{\sigma_0^2}$$

  H1: σ² ≠ σ0²    reject if χ0² > χ²_{α/2,n-1} or χ0² < χ²_{1-α/2,n-1}    λ = σ/σ0
  H1: σ² > σ0²    reject if χ0² > χ²_{α,n-1}                              λ = σ/σ0
  H1: σ² < σ0²    reject if χ0² < χ²_{1-α,n-1}                            λ = σ/σ0

H0: σ1² = σ2². Test statistic:

$$F_0 = \frac{S_1^2}{S_2^2}$$

  H1: σ1² ≠ σ2²    reject if F0 > F_{α/2,n1-1,n2-1} or F0 < F_{1-α/2,n1-1,n2-1}    λ = σ1/σ2
  H1: σ1² > σ2²    reject if F0 > F_{α,n1-1,n2-1}                                  λ = σ1/σ2
11-5. Consider the data in Exercise 10-40. Test the hypothesis that the mean life of the light bulbs is 1000 hours. Use α = 0.05.

11-6. Consider the data in Exercise 10-41. Test the hypothesis that mean compressive strength equals 3500 psi. Use α = 0.01.

11-7. Two machines are used for filling plastic bottles with a net volume of 16.0 ounces. The filling processes can be assumed normal, with standard deviations σ1 = 0.015 and σ2 = 0.018. Quality engineering suspects that both machines fill to the same net volume, whether or not this volume is 16.0 ounces. A random sample is taken from the output of each machine.

Machine 1: 16.03  16.01  16.04  15.96  16.05  15.98  16.05  16.02  16.02  15.99
Machine 2: 16.02  16.03  15.97  16.04  15.96  16.02  16.01  16.01  15.99  16.00
(a) Do you think that quality engineering is correct? Use α = 0.05.
(b) Assuming equal sample sizes, what sample size should be used to assure that β = 0.05 if the true difference in means is 0.075? Assume that α = 0.05.
(c) What is the power of the test in (a) for a true difference in means of 0.075?

11-8. The film development department of a local store is considering the replacement of its current film-processing machine. The time it takes the machine to completely process a roll of film is important. A random sample of 12 rolls of 24-exposure color film is selected for processing by the current machine. The average processing time is 8.1 minutes, with a sample standard deviation of 1.4 minutes. A random sample of 10 rolls of the same type of film is selected for testing in the new machine. The average processing time is 7.3 minutes, with a sample standard deviation of 0.9 minutes. The local store will not purchase the new machine unless the processing time is more than 2 minutes shorter than that of the current machine. Based on this information, should they purchase the new machine?

11-9. Consider the data in Exercise 10-45. Test the hypothesis that both machines fill to the same volume. Use α = 0.10.

11-10. Consider the data in Exercise 10-46. Test H0: μ1 = μ2 against H1: μ1 > μ2, using α = 0.05.

11-11. Consider the gasoline road octane number data in Exercise 10-47. If formulation 2 produces a higher road octane number than formulation 1, the manufacturer would like to detect this. Formulate and test an appropriate hypothesis, using α = 0.05.

11-12. The lateral deviation in yards of a certain type of mortar shell is being investigated by the propellant manufacturer. The following data have been observed.

Round    Deviation    Round    Deviation
1          11.28        6        -9.48
2         -10.42        7         6.25
3          -0.51        8        10.11
4           1.95        9        -0.65
5           6.47       10        -0.68

Test the hypothesis that the mean lateral deviation of these mortar shells is zero. Assume that lateral deviation is normally distributed.

11-13. The shelf life of a photographic film is of interest to the manufacturer. The manufacturer observes the following shelf life for eight units chosen at random from the current production. Assume that shelf life is normally distributed.

108 days    128 days
134         163
124         159
116         134

(a) Is there any evidence that the mean shelf life is greater than or equal to 125 days?
(b) If it is important to detect a ratio of δ/σ of 1.0 with a probability 0.90, is the sample size sufficient?

11-14. The titanium content of an alloy is being studied in the hope of ultimately increasing the tensile strength. An analysis of six recent heats chosen at random produces the following titanium contents.

8.0%    7.7%
9.9     11.6
9.9     14.6

Is there any evidence that the mean titanium content is greater than 9.5%?

11-15. An article in the Journal of Construction Engineering and Management (1999, p. 39) presents some data on the number of work hours lost per day on a construction project due to weather-related incidents. Over 11 workdays, the following lost work hours were recorded.

8.8      8.8
12.5    12.2
5.4     13.3
12.8     6.9
9.1      2.2
14.7

Assuming work hours are normally distributed, is there any evidence to conclude that the mean number of work hours lost per day is greater than 8 hours?

11-16. The percentage of scrap produced in a new finishing operation is hypothesized to be less than 7.5%. Several days were chosen at random and the percentages of scrap were calculated.

5.51%    7.32%
6.49     8.81
6.46     8.56
5.37     7.46

(a) In your opinion, is the true scrap rate less than 7.5%?
(b) If it is important to detect a ratio of δ/σ = 1.5 with a probability of at least 0.90, what is the minimum sample size that can be used?
(c) For δ/σ = 2.0, what is the power of the above test?
11-17. Suppose that we must test the hypotheses

H0: μ ≥ 15,
H1: μ < 15,

where it is known that σ² = 2.5. If α = 0.05 and the true mean is 12, what sample size is necessary to assure a type II error of 5%?

11-18. An engineer desires to test the hypothesis that the melting point of an alloy is 1000°C. If the true melting point differs from this by more than 20°C, he must change the alloy's composition. If we assume that the melting point is a normally distributed random variable, α = 0.05, β = 0.10, and σ = 10°C, how many observations should be taken?

11-19. Two methods for producing gasoline from crude oil are being investigated. The yields of both processes are assumed to be normally distributed. The following yield data have been obtained from the pilot plant.

Process    Yields (%)
1          24.2   26.6   25.7
2          21.0   22.1   21.8   20.9   22.4   22.0

(a) Is there reason to believe that process 1 has a greater mean yield? Use α = 0.01. Assume that both variances are equal.
(b) Assuming that in order to adopt process 1 it must produce a mean yield that is at least 5% greater than that of process 2, what are your recommendations?
(c) Find the power of the test in part (a) if the mean yield of process 1 is 5% greater than that of process 2.
(d) What sample size is required for the test in part (a) to ensure that the null hypothesis will be rejected with a probability of 0.90 if the mean yield of process 1 exceeds the mean yield of process 2 by 5%?

11-20. An article that appeared in the Proceedings of the 1998 Winter Simulation Conference (1998, p. 1079) discusses the concept of validation for traffic simulation models. The stated purpose of the study is to design and modify the facilities (roadways and control devices) to optimize efficiency and safety of traffic flow. Part of the study compares speeds observed at various intersections and speeds simulated by a model being tested. The goal is to determine whether the simulation model is representative of the actual observed speed. Field data are collected at a particular location and then the simulation model is implemented. Fourteen speeds (ft/sec) are measured at a particular location. Fourteen observations are simulated using the proposed model. The data are:

Field                    Model
53.33   57.14            47.40   58.20
53.33   57.14            49.80   59.00
53.33   61.54            51.90   60.10
55.17   61.54            52.20   63.40
55.17   61.54            54.50   65.80
55.17   69.57            55.70   71.30
57.14   69.57            56.70   75.40

Assuming the variances are equal, conduct a hypothesis test to determine whether there is a significant difference between the field data and the model simulated data. Use α = 0.05.

11-21. The following are the burning times (in minutes) of flares of two different types.

Type 1           Type 2
63    82         64    56
81    68         72    63
57    59         83    74
66    75         59    82
82    73         65    82

(a) Test the hypothesis that the two variances are equal. Use α = 0.05.
(b) Using the results of (a), test the hypothesis that the mean burning times are equal.

11-22. A new filtering device is installed in a chemical unit. Before its installation, a random sample yielded the following information about the percentage of impurity: x̄1 = 12.5, s1² = 101.17, and n1 = 8. After installation, a random sample yielded x̄2 = 10.2, s2² = 94.73, n2 = 9.
(a) Can you conclude that the two variances are equal?
(b) Has the filtering device reduced the percentage of impurity significantly?

11-23. Suppose that two random samples were drawn from normal populations with equal variances. The sample data yield x̄1 = 20.0, n1 = 10, Σ(x1j - x̄1)² = 1480, x̄2 = 15.8, n2 = 10, and Σ(x2j - x̄2)² = 1425.
(a) Test the hypothesis that the two means are equal. Use α = 0.01.
(b) Find the probability that the null hypothesis in (a) will be rejected if the true difference in means is 10.
(c) What sample size is required to detect a true difference in means of 5 with probability at least 0.80 if it is known at the start of the experiment that a rough estimate of the common variance is 150?
11-24. Consider the data in Exercise 10-56.
(a) Test the hypothesis that the means of the two normal distributions are equal. Use α = 0.05 and assume that σ1² = σ2².
(b) What sample size is required to detect a difference in means of 2.0 with a probability of at least 0.85?
(c) Test the hypothesis that the variances of the two distributions are equal. Use α = 0.05.
(d) Find the power of the test in (c) if the variance of one population is four times the other.

11-25. Consider the data in Exercise 10-57. Assuming that σ1² = σ2², test the hypothesis that the mean rod diameters do not differ. Use α = 0.05.

11-26. A chemical company produces a certain drug whose weight has a standard deviation of 4 mg. A new method of producing this drug has been proposed, although some additional cost is involved. Management will authorize a change in production technique only if the standard deviation of the weight in the new process is less than 4 mg. If the standard deviation of weight in the new process is as small as 3 mg, the company would like to switch production methods with a probability of at least 0.90. Assuming weight to be normally distributed and α = 0.05, how many observations should be taken? Suppose the researchers choose n = 10 and obtain the data below. Is this a good choice for n? What should be their decision?

16.628 grams    16.630 grams
16.622          16.631
16.627          16.624
16.623          16.622
16.618          16.626

11-27. A manufacturer of precision measuring instruments claims that the standard deviation in the use of the instrument is 0.00002 inch. An analyst, who is unaware of the claim, uses the instrument eight times and obtains a sample standard deviation of 0.00005 inch.
(a) Using α = 0.01, is the claim justified?
(b) Compute a 99% confidence interval for the true variance.
(c) What is the power of the test if the true standard deviation equals 0.00004?
(d) What is the smallest sample size that can be used to detect a true standard deviation of 0.00004 with a probability at least of 0.95? Use α = 0.01.

11-28. The standard deviation of measurements made by a special thermocouple is supposed to be 0.005 degree. If the standard deviation is as great as 0.010, we wish to detect it with a probability of at least 0.90. Use α = 0.01. What sample size should be used? If this sample size is used and the sample standard deviation s = 0.007, what is your conclusion, using α = 0.01? Construct a 95% upper-confidence interval for the true variance.

11-29. The manufacturer of a power supply is interested in the variability of output voltage. He has tested 12 units, chosen at random, with the following results:

5.34    5.65    4.76
5.00    5.55    5.54
5.07    5.35    5.44
5.25    5.35    4.61

(a) Test the hypothesis that σ = 0.5. Use α = 0.05.
(b) If the true value of σ = 1.0, what is the probability that the hypothesis in (a) will be rejected?

11-30. For the data in Exercise 11-7, test the hypothesis that the two variances are equal, using α = 0.01. Does the result of this test influence the manner in which a test on means would be conducted? What sample size is necessary to detect σ1²/σ2² = 2.5, with a probability of at least 0.90?

11-31. Consider the following two samples, drawn from two normal populations.

Sample 1    Sample 2
4.34        1.87
5.00        2.00
4.97        2.00
4.25        1.85
5.55        2.11
6.55        2.31
6.37        2.28
5.55        2.07
3.76        1.76
            1.91
            2.00

Is there evidence to conclude that the variance of population 1 is greater than the variance of population 2? Use α = 0.01. Find the probability of detecting σ1²/σ2² = 4.0.

11-32. Two machines produce metal parts. The variance of the weight of these parts is of interest. The following data have been collected.

Machine 1        Machine 2
n1 = 25          n2 = 30
x̄1 = 0.984      x̄2 = 0.907
s1² = 13.46      s2² = 9.65
(a) Test the hypothesis that the variances of the two machines are equal. Use α = 0.05.
(b) Test the hypothesis that the two machines produce parts having the same mean weight. Use α = 0.05.

11-33. In a hardness test, a steel ball is pressed into the material being tested at a standard load. The diameter of the indentation is measured, which is related to the hardness. Two types of steel balls are available, and their performance is compared on 10 specimens. Each specimen is tested twice, once with each ball. The results are given below:

Ball x:  75  46  57  43  58  32  61  56  34  65
Ball y:  52  41  43  47  32  49  52  44  57  60

Test the hypothesis that the two steel balls give the same expected hardness measurement. Use α = 0.05.

11-34. Two types of exercise equipment, A and B, for handicapped individuals are often used to determine the effect of the particular exercise on heart rate (in beats per minute). Seven subjects participated in a study to determine whether the two types of equipment have the same effect on heart rate. The results are given in the table below.

Subject     A      B
1          162    161
2          163    187
3          140    199
4          191    206
5          160    161
6          158    160
7          155    162

Conduct an appropriate test of hypothesis to determine whether there is a significant difference in heart rate due to the type of equipment used.

11-35. An aircraft designer has theoretical evidence that painting the airplane reduces its speed at a specified power and flap setting. He tests six consecutive airplanes from the assembly line before and after painting. The results are shown below.

                Top Speed (mph)
Airplane    Painted    Not Painted
1             286          289
2             285          286
3             279          283
4             283          288
5             281          283
6             286          289

Do the data support the designer's theory? Use α = 0.05.

11-36. An article in the International Journal of Fatigue (1998, p. 537) discusses the bending fatigue resistance of gear teeth when using a particular prestressing or presetting process. Presetting of a gear tooth is obtained by applying and then removing a single overload to the machine element. To determine significant differences in fatigue resistance due to presetting, fatigue data were collected. A "preset" tooth and a "nonpreset" tooth were paired if they were present on the same gear. Eleven pairs were formed and the fatigue life measured for each. (The final response of interest is ln[(fatigue life) × 10⁻⁶].)

Pair    Preset Tooth    Nonpreset Tooth
1          3.813            2.706
2          4.025            2.364
3          3.042            2.773
4          3.831            2.558
5          3.320            2.430
6          3.080            2.616
7          2.498            2.765
8          2.417            2.486
9          2.462            2.688
10         2.236            2.700
11         3.932            2.810

Conduct a test of hypothesis to determine whether presetting significantly increases the fatigue life of gear teeth. Use α = 0.10.

11-37. Consider the data in Exercise 10-66. Test the hypothesis that the uninsured rate is 10%. Use α = 0.05.

11-38. Consider the data in Exercise 10-68. Test the hypothesis that the fraction of defective calculators produced is 2.5%.

11-39. Suppose that we wish to test the hypothesis H0: μ1 = μ2 against the alternative H1: μ1 ≠ μ2, where both variances σ1² and σ2² are known. A total of n1 + n2 = N observations can be taken. How should these observations be allocated to the two populations to maximize the probability that H0 will be rejected if H1 is true and μ1 - μ2 = δ ≠ 0?

11-40. Consider the union membership study described in Exercise 10-70. Test the hypothesis that the proportion of men who belong to a union does not differ from the proportion of women who belong to a union. Use α = 0.05.

11-41. Using the data in Exercise 10-71, determine whether it is reasonable to conclude that production line 2 produced a higher fraction of defective product than line 1. Use α = 0.01.
11-42. Two different types of injection-molding machines are used to form plastic parts. A part is considered defective if it has excessive shrinkage or is discolored. Two random samples, each of size 500, are selected, and 32 defective parts are found in the sample from machine 1, while 21 defective parts are found in the sample from machine 2. Is it reasonable to conclude that both machines produce the same fraction of defective parts?

11-43. Suppose that we wish to test H₀: μ₁ = μ₂ against H₁: μ₁ ≠ μ₂, where σ₁² and σ₂² are known. The total sample size N is fixed, but the allocation of observations to the two populations such that n₁ + n₂ = N is to be made on the basis of cost. If the costs of sampling for populations 1 and 2 are C₁ and C₂, respectively, find the minimum cost sample sizes that provide a specified variance for the difference in sample means.

11-44. A manufacturer of a new pain relief tablet would like to demonstrate that her product works twice as fast as her competitor's product. Specifically, she would like to test

H₀: μ₂ = 2μ₁,
H₁: μ₂ > 2μ₁,

where μ₁ is the mean absorption time of the competitive product and μ₂ is the mean absorption time of the new product. Assuming that the variances σ₁² and σ₂² are known, suggest a procedure for testing this hypothesis.

11-45. Derive an expression similar to equation 11-20 for the β error for the test on the variance of a normal distribution. Assume that the two-sided alternative is specified.

11-46. Derive an expression similar to equation 11-20 for the β error for the test of the equality of the variances of two normal distributions. Assume that the two-sided alternative is specified.

11-47. The number of defective units found each day by an in-circuit functional tester in a printed circuit board assembly process is shown below.

Number of Defectives    Times Observed
0-10                    6
11-15                   11
16-20                   16
21-25                   28
26-30                   22
31-35                   19
36-40                   11
41-45                   4

(a) Is it reasonable to conclude that these data come from a normal distribution? Use a chi-square goodness-of-fit test.
(b) Plot the data on normal probability paper. Does an assumption of normality seem justified?

11-48. Defects on wafer surfaces in integrated circuit fabrication are unavoidable. In a particular process the following data were collected.

Number of Defects i    Number of Wafers with i Defects
0                      4
1                      13
2                      34
3                      56
4                      70
5                      70
6                      58
7                      42
8                      25
9                      15
10                     9
11                     3
12                     1

Does the assumption of a Poisson distribution seem appropriate as a probability model for this process?

11-49. A pseudorandom number generator is designed so that integers 0 through 9 have an equal probability of occurrence. The first 10,000 numbers are as follows:

Digit        0    1     2    3     4     5    6     7    8     9
Frequency    967  1008  975  1022  1003  989  1001  981  1043  1011

Does this generator seem to be working properly?

11-50. The cycle time of an automatic machine has been observed and recorded.
(a) Does the normal distribution seem to be a reasonable probability model for the cycle time? Use the chi-square goodness-of-fit test.
(b) Plot the data on normal probability paper. Does the assumption of normality seem reasonable?

11-51. A soft drink bottler is studying the internal pressure strength of 1-liter glass nonreturnable bottles. A random sample of 16 bottles is tested and the pressure strengths obtained. The data are shown below. Plot these data on normal probability paper.
Does it seem reasonable to conclude that pressure strength is normally distributed?

226.16 psi    211.14 psi
202.20        203.62
219.54        188.12
193.73        224.39
208.15        221.31
195.45        204.55
193.71        202.21
200.81        201.63

11-52. A company operates four machines for three shifts each day. From production records, the following data on the number of breakdowns are collected.

             Machines
Shift    A     B     C     D
1        41    20    12    16
2        31    11    9     14
3        15    17    16    10

Test the hypothesis that breakdowns are independent of the shift.

11-53. Patients in a hospital are classified as surgical or medical. A record is kept of the number of times patients require nursing service during the night and whether these patients are on Medicare or not. The data are as follows:

            Patient Category
Medicare    Surgical    Medical
Yes         46          52
No          36          43

Test the hypothesis that calls by surgical-medical patients are independent of whether the patients are receiving Medicare.

11-54. Grades in a statistics course and an operations research course taken simultaneously were as follows for a group of students.

Statistics    Operations Research Grade
Grade         A     B     C     Other
A             25    6     17    13
B             17    16    15    6
C             18    4     18    10
Other         10    8     11    20

Are the grades in statistics and operations research related?

11-55. An experiment with artillery shells yields the following data on the characteristics of lateral deflections and ranges. Would you conclude that deflection and range are independent?

                  Lateral Deflection
Range (yards)     Left    Normal    Right
0-1,999           6       14        8
2,000-5,999       9       11        4
6,000-11,999      8       17        6

11-56. A study is being made of the failures of an electronic component. There are four types of failures possible and two mounting positions for the device. The following data have been taken.

                     Failure Type
Mounting Position    A     B     C     D
1                    22    46    18    9
2                    4     17    6     12

Would you conclude that the type of failure is independent of the mounting position?

11-57. An article in Research in Nursing and Health (1999, p. 263) summarizes data collected from a previous study (Research in Nursing and Health, 1998, p. 285) on the relationship between physical activity and socio-economic status of 1507 Caucasian women. The data are given in the table below.

                         Physical Activity
Socio-economic Status    Inactive    Active
Low                      216         245
Medium                   226         409
High                     114         297

Test the hypothesis that physical activity is independent of socio-economic status.

11-58. Fabric is graded into three classifications: A, B, and C. The results below were obtained from five looms. Is fabric classification independent of the loom?

        Number of Pieces of Fabric
        in Fabric Classification
Loom    A      B     C
1       185    16    12
2       190    24    21
3       170    35    16
4       158    22    7
5       185    22    15
11-59. An article in the Journal of Marketing Research (1970, p. 36) reports a study of the relationship between facility conditions at gasoline stations and the aggressiveness of their gasoline marketing policy. A sample of 441 gasoline stations was investigated with the results shown below obtained. Is there evidence that gasoline pricing strategy and facility conditions are independent?

                 Condition
Policy           Substandard    Standard    Modern
Aggressive       24             52          58
Neutral          15             73          86
Nonaggressive    17             80          36

11-60. Consider the injection molding process described in Exercise 11-42.
(a) Set up this problem as a 2 × 2 contingency table and perform the indicated statistical analysis.
(b) State clearly the hypothesis being tested. Are you testing homogeneity or independence?
(c) Is this procedure equivalent to the test procedure used in Exercise 11-42?
Chapter 12

Design and Analysis of Single-Factor Experiments: The Analysis of Variance

Experiments are a natural part of the engineering and management decision-making process. For example, suppose that a civil engineer is investigating the effect of curing methods on the mean compressive strength of concrete. The experiment would consist of making up several test specimens of concrete using each of the proposed curing methods and then testing the compressive strength of each specimen. The data from this experiment could be used to determine which curing method should be used to provide maximum compressive strength.

If there are only two curing methods of interest, the experiment could be designed and analyzed using the methods discussed in Chapter 11. That is, the experimenter has a single factor of interest (curing methods) and there are only two levels of the factor. If the experimenter is interested in determining which curing method produces the maximum compressive strength, then the number of specimens to test can be determined using the operating characteristic curves in Chart VI (Appendix), and the t-test can be used to determine whether the two means differ.

Many single-factor experiments require more than two levels of the factor to be considered. For example, the civil engineer may have five different curing methods to investigate. In this chapter we introduce the analysis of variance for dealing with more than two levels of a single factor. In Chapter 13, we show how to design and analyze experiments with several factors.

12-1 THE COMPLETELY RANDOMIZED SINGLE-FACTOR EXPERIMENT
12-1.1 An Example
A manufacturer of paper used for making grocery bags is interested in improving the tensile strength of the product. Product engineering thinks that tensile strength is a function of the hardwood concentration in the pulp, and that the range of hardwood concentrations of practical interest is between 5% and 20%. One of the engineers responsible for the study decides to investigate four levels of hardwood concentration: 5%, 10%, 15%, and 20%. She also decides to make up six test specimens at each concentration level, using a pilot plant. All 24 specimens are tested on a laboratory tensile tester in random order. The data from this experiment are shown in Table 12-1.
This is an example of a completely randomized single-factor experiment with four levels of the factor. The levels of the factor are sometimes called treatments. Each treatment

Table 12-1 Tensile Strength of Paper (psi)

Hardwood                       Observations
Concentration (%)    1    2    3    4    5    6     Totals    Averages
5                    7    8    15   11   9    10    60        10.00
10                   12   17   13   18   19   15    94        15.67
15                   14   18   19   17   16   18    102       17.00
20                   19   25   22   23   18   20    127       21.17
                                                    383       15.96

has six observations, or replicates. The role of randomization in this experiment is extremely important. By randomizing the order of the 24 runs, the effect of any nuisance variable that may affect the observed tensile strength is approximately balanced out. For example, suppose that there is a warm-up effect on the tensile tester; that is, the longer the machine is on, the greater the observed tensile strength. If the 24 runs are made in order of increasing hardwood concentration (i.e., all six 5% concentration specimens are tested first, followed by all six 10% concentration specimens, etc.), then any observed differences due to hardwood concentration could also be due to the warm-up effect.
It is important to graphically analyze the data from a designed experiment. Figure 12-1 presents box plots of tensile strength at the four hardwood concentration levels. This plot indicates that changing the hardwood concentration has an effect on tensile strength; specifically, higher hardwood concentrations produce higher observed tensile strength. Furthermore, the distribution of tensile strength at a particular hardwood level is reasonably symmetric, and the variability in tensile strength does not change dramatically as the hardwood concentration changes.

Figure 12-1 Box plots of hardwood concentration data (tensile strength, psi, at each hardwood concentration, %).
Graphical interpretation of the data is always a good idea. Box plots show the vari-
ability of the observations within a treatment (factor level) and the variability between treat-
ments. We now show how the data from a single-factor randomized experiment can be
analyzed statistically.

12-1.2 The Analysis of Variance


Suppose we have a different levels of a single factor (treatments) that we wish to compare. The observed response for each of the a treatments is a random variable. The data would appear as in Table 12-2. An entry in Table 12-2, say y_ij, represents the jth observation taken under treatment i. We initially consider the case where there is an equal number of observations n on each treatment.
We may describe the observations in Table 12-2 by the linear statistical model,
y_ij = μ + τ_i + ε_ij,    i = 1, 2, ..., a,  j = 1, 2, ..., n,        (12-1)

where y_ij is the (ij)th observation, μ is a parameter common to all treatments, called the overall mean, τ_i is a parameter associated with the ith treatment, called the ith treatment effect, and ε_ij is a random error component. Note that y_ij represents both the random variable and its realization. We would like to test certain hypotheses about the treatment effects and to estimate them. For hypothesis testing, the model errors are assumed to be normally and independently distributed random variables with mean zero and variance σ² [abbreviated NID(0, σ²)]. The variance σ² is assumed constant for all levels of the factor.
The model of equation 12-1 is called the one-way-classification analysis of variance, because only one factor is investigated. Furthermore, we will require that the observations be taken in random order so that the environment in which the treatments are used (often called the experimental units) is as uniform as possible. This is called a completely randomized experimental design. There are two different ways that the a factor levels in the experiment could have been chosen. First, the a treatments could have been specifically chosen by the experimenter. In this situation we wish to test hypotheses about the τ_i, and conclusions will apply only to the factor levels considered in the analysis. The conclusions cannot be extended to similar treatments that were not considered. Also, we may wish to estimate the τ_i. This is called the fixed effects model. Alternatively, the a treatments could be a random sample from a larger population of treatments. In this situation we would like to be able to extend the conclusions (which are based on the sample of treatments) to all treatments in the population, whether they were explicitly considered in the analysis or not. Here the τ_i are random variables, and knowledge about the particular ones investigated is relatively useless. Instead, we test hypotheses about the variability of the τ_i and try to estimate this variability. This is called the random effects, or components of variance, model.

Table 12-2 Typical Data for One-Way-Classification Analysis of Variance

Treatment    Observations                Totals    Averages
1            y_11  y_12  ...  y_1n       y_1.      ȳ_1.
2            y_21  y_22  ...  y_2n       y_2.      ȳ_2.
...          ...                         ...       ...
a            y_a1  y_a2  ...  y_an       y_a.      ȳ_a.


In this section we will develop the analysis of variance for the fixed-effects model, one-way classification. In the fixed-effects model, the treatment effects τ_i are usually defined as deviations from the overall mean, so that

Σ_{i=1}^{a} τ_i = 0.        (12-2)

Let y_i. represent the total of the observations under the ith treatment and ȳ_i. represent the average of the observations under the ith treatment. Similarly, let y.. represent the grand total of all observations and ȳ.. represent the grand mean of all observations. Expressed mathematically,

y_i. = Σ_{j=1}^{n} y_ij,    ȳ_i. = y_i./n,    i = 1, 2, ..., a,
y.. = Σ_{i=1}^{a} Σ_{j=1}^{n} y_ij,    ȳ.. = y../N,        (12-3)

where N = an is the total number of observations. Thus the "dot" subscript notation implies summation over the subscript that it replaces.
We are interested in testing the equality of the a treatment effects. Using equation 12-2, the appropriate hypotheses are

H₀: τ₁ = τ₂ = ... = τ_a = 0,
H₁: τ_i ≠ 0 for at least one i.        (12-4)

That is, if the null hypothesis is true, then each observation is made up of the overall mean μ plus a realization of the random error ε_ij.
The test procedure for the hypotheses in equation 12-4 is called the analysis of variance. The name "analysis of variance" results from partitioning total variability in the data into its component parts. The total corrected sum of squares, which is a measure of total variability in the data, may be written as

SS_T = Σ_{i=1}^{a} Σ_{j=1}^{n} (y_ij - ȳ..)²        (12-5)

or

Σ_{i=1}^{a} Σ_{j=1}^{n} (y_ij - ȳ..)² = n Σ_{i=1}^{a} (ȳ_i. - ȳ..)² + Σ_{i=1}^{a} Σ_{j=1}^{n} (y_ij - ȳ_i.)²
    + 2 Σ_{i=1}^{a} Σ_{j=1}^{n} (ȳ_i. - ȳ..)(y_ij - ȳ_i.).        (12-6)

Note that the cross-product term in equation 12-6 is zero, since

Σ_{j=1}^{n} (y_ij - ȳ_i.) = y_i. - nȳ_i. = y_i. - n(y_i./n) = 0.

Therefore we have

Σ_{i=1}^{a} Σ_{j=1}^{n} (y_ij - ȳ..)² = n Σ_{i=1}^{a} (ȳ_i. - ȳ..)² + Σ_{i=1}^{a} Σ_{j=1}^{n} (y_ij - ȳ_i.)².        (12-7)

Equation 12-7 shows that the total variability in the data, measured by the total corrected sum of squares, can be partitioned into a sum of squares of differences between treatment means and the grand mean and a sum of squares of differences of observations within treatments and the treatment mean. Differences between observed treatment means and the grand mean measure the differences between treatments, while differences of observations within a treatment from the treatment mean can be due only to random error. Therefore, we write equation 12-7 symbolically as

SS_T = SS_treatments + SS_E,

where SS_T is the total sum of squares, SS_treatments is the sum of squares due to treatments (i.e., between treatments), and SS_E is the sum of squares due to error (i.e., within treatments). There are an = N total observations; thus SS_T has N - 1 degrees of freedom. There are a levels of the factor, so SS_treatments has a - 1 degrees of freedom. Finally, within any treatment there are n replicates providing n - 1 degrees of freedom with which to estimate the experimental error. Since there are a treatments, we have a(n - 1) = an - a = N - a degrees of freedom for error.
Now consider the distributional properties of these sums of squares. Since we have assumed that the errors ε_ij are NID(0, σ²), the observations y_ij are NID(μ + τ_i, σ²). Thus SS_T/σ² is distributed as chi-square with N - 1 degrees of freedom, since SS_T is a sum of squares in normal random variables. We may also show that SS_treatments/σ² is chi-square with a - 1 degrees of freedom, if H₀ is true, and SS_E/σ² is chi-square with N - a degrees of freedom. However, all three sums of squares are not independent, since SS_treatments and SS_E add up to SS_T. The following theorem, which is a special form of one due to Cochran, is useful in developing the test procedure.

Theorem 12-1 (Cochran)

Let Z_i be NID(0, 1) for i = 1, 2, ..., v, and let

Σ_{i=1}^{v} Z_i² = Q₁ + Q₂ + ... + Q_s,

where s ≤ v and Q_i is chi-square with v_i degrees of freedom (i = 1, 2, ..., s). Then Q₁, Q₂, ..., Q_s are independent chi-square random variables with v₁, v₂, ..., v_s degrees of freedom, respectively, if and only if

v = v₁ + v₂ + ... + v_s.
Using this theorem, we note that the degrees of freedom for SS_treatments and SS_E add up to N - 1, so that SS_treatments/σ² and SS_E/σ² are independently distributed chi-square random variables. Therefore, under the null hypothesis, the statistic

F₀ = [SS_treatments/(a - 1)] / [SS_E/(N - a)] = MS_treatments/MS_E        (12-8)

follows the F_{a-1, N-a} distribution. The quantities MS_treatments and MS_E are mean squares.
The expected values of the mean squares are used to show that F₀ in equation 12-8 is an appropriate test statistic for H₀: τ_i = 0 and to determine the criterion for rejecting this null hypothesis. Consider
E(MS_E) = E[SS_E/(N - a)] = (1/(N - a)) E[ Σ_{i=1}^{a} Σ_{j=1}^{n} y_ij² - (1/n) Σ_{i=1}^{a} y_i.² ].

Substituting the model, equation 12-1, into this equation we obtain

E(MS_E) = (1/(N - a)) E[ Σ_{i=1}^{a} Σ_{j=1}^{n} (μ + τ_i + ε_ij)² - (1/n) Σ_{i=1}^{a} ( Σ_{j=1}^{n} (μ + τ_i + ε_ij) )² ].

Now on squaring and taking the expectation of the quantities within brackets, we see that terms involving ε_ij² and (Σ_{j=1}^{n} ε_ij)² are replaced by σ² and nσ², respectively, because E(ε_ij) = 0. Furthermore, all cross products involving ε_ij have zero expectation. Therefore, after squaring, taking expectation, and noting that Σ_{i=1}^{a} τ_i = 0, we have

E(MS_E) = σ².

Using a similar approach, we may show that

E(MS_treatments) = σ² + n Σ_{i=1}^{a} τ_i² / (a - 1).
From the expected mean squares we see that MS_E is an unbiased estimator of σ². Also, under the null hypothesis, MS_treatments is an unbiased estimator of σ². However, if the null hypothesis is false, then the expected value of MS_treatments is greater than σ². Therefore, under the alternative hypothesis the expected value of the numerator of the test statistic (equation 12-8) is greater than the expected value of the denominator. Consequently, we should reject H₀ if the test statistic is large. This implies an upper-tail, one-tail critical region. Thus, we would reject H₀ if

F₀ > F_{α, a-1, N-a},

where F₀ is computed from equation 12-8.


Efficient computational formulas for the sums of squares may be obtained by expanding and simplifying the definitions of SS_treatments and SS_T in equation 12-7. This yields

SS_T = Σ_{i=1}^{a} Σ_{j=1}^{n} y_ij² - y..²/N        (12-9)

and

SS_treatments = Σ_{i=1}^{a} y_i.²/n - y..²/N.        (12-10)

The error sum of squares is obtained by subtraction:

SS_E = SS_T - SS_treatments.        (12-11)

The test procedure is summarized in Table 12-3. This is called an analysis-of-variance table.

Example 12-1
Consider the hardwood concentration experiment described in Section 12-1.1. We can use the analysis of variance to test the hypothesis that different hardwood concentrations do not affect the mean tensile strength of the paper. The sums of squares for the analysis of variance are computed from equations 12-9, 12-10, and 12-11 as follows:

SS_T = (7)² + (8)² + ... + (20)² - (383)²/24 = 512.96,

SS_treatments = [(60)² + (94)² + (102)² + (127)²]/6 - (383)²/24 = 382.79,

SS_E = SS_T - SS_treatments = 512.96 - 382.79 = 130.17.

The analysis of variance is summarized in Table 12-4. Since F_{0.01, 3, 20} = 4.94, we reject H₀ and conclude that hardwood concentration in the pulp significantly affects the strength of the paper.
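The arithmetic of Example 12-1 is easy to check by computer. The sketch below is an addition to the text, not part of the original; it assumes Python with SciPy is available, applies the computational formulas of equations 12-9 through 12-11 to the Table 12-1 data, and cross-checks the F statistic with scipy.stats.f_oneway.

```python
# A minimal sketch (assumed environment: Python + SciPy), verifying Example 12-1.
from scipy import stats

strength = {
    5:  [7, 8, 15, 11, 9, 10],
    10: [12, 17, 13, 18, 19, 15],
    15: [14, 18, 19, 17, 16, 18],
    20: [19, 25, 22, 23, 18, 20],
}

# Sums of squares via the computational formulas (equations 12-9 to 12-11).
all_obs = [y for sample in strength.values() for y in sample]
N = len(all_obs)
grand_total = sum(all_obs)
ss_total = sum(y**2 for y in all_obs) - grand_total**2 / N
ss_treat = sum(sum(s)**2 / len(s) for s in strength.values()) - grand_total**2 / N
ss_error = ss_total - ss_treat

a = len(strength)
f0 = (ss_treat / (a - 1)) / (ss_error / (N - a))
print(ss_total, ss_treat, ss_error, f0)   # about 512.96, 382.79, 130.17, 19.6

# The same F statistic (and a P-value) from SciPy's one-way ANOVA routine:
print(stats.f_oneway(*strength.values()))
```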

12-1.3 Estimation of the Model Parameters


It is possible to derive estimators for the parameters in the one-way analysis-of-variance model

y_ij = μ + τ_i + ε_ij.
Table 12-3 Analysis of Variance for the One-Way-Classification Fixed-Effects Model

Source of Variation          Sum of Squares    Degrees of Freedom    Mean Square      F₀
Between treatments           SS_treatments     a - 1                 MS_treatments    MS_treatments/MS_E
Error (within treatments)    SS_E              N - a                 MS_E
Total                        SS_T              N - 1

Table 12-4 Analysis of Variance for the Tensile Strength Data

Source of Variation       Sum of Squares    Degrees of Freedom    Mean Square    F₀
Hardwood concentration    382.79            3                     127.60         19.61
Error                     130.17            20                    6.51
Total                     512.96            23

An appropriate estimation criterion is to estimate μ and τ_i such that the sum of the squares of the errors or deviations ε_ij is a minimum. This method of parameter estimation is called the method of least squares. In estimating μ and τ_i by least squares, the normality assumption on the errors ε_ij is not needed. To find the least-squares estimators of μ and τ_i, we form the sum of squares of the errors

L = Σ_{i=1}^{a} Σ_{j=1}^{n} ε_ij² = Σ_{i=1}^{a} Σ_{j=1}^{n} (y_ij - μ - τ_i)²        (12-12)

and find values of μ and τ_i, say μ̂ and τ̂_i, that minimize L. The values μ̂ and τ̂_i are the solutions to the a + 1 simultaneous equations

∂L/∂μ evaluated at μ̂, τ̂_i = 0,
∂L/∂τ_i evaluated at μ̂, τ̂_i = 0,    i = 1, 2, ..., a.

Differentiating equation 12-12 with respect to μ and τ_i and equating to zero, we obtain

-2 Σ_{i=1}^{a} Σ_{j=1}^{n} (y_ij - μ̂ - τ̂_i) = 0

and

-2 Σ_{j=1}^{n} (y_ij - μ̂ - τ̂_i) = 0,    i = 1, 2, ..., a.

After simplification these equations become

Nμ̂ + nτ̂₁ + nτ̂₂ + ... + nτ̂_a = y..,
nμ̂ + nτ̂₁ = y₁.,
nμ̂ + nτ̂₂ = y₂.,
    ...
nμ̂ + nτ̂_a = y_a..        (12-13)

Equations 12-13 are called the least-squares normal equations. Notice that if we add the last a normal equations we obtain the first normal equation. Therefore, the normal equations are not linearly independent, and there are no unique estimates for μ, τ₁, τ₂, ..., τ_a. One way to overcome this difficulty is to impose a constraint on the solution to the normal equations. There are many ways to choose this constraint. Since we have defined the treatment effects as deviations from the overall mean, it seems reasonable to apply the constraint

Σ_{i=1}^{a} τ̂_i = 0.        (12-14)

Using this constraint, we obtain as the solution to the normal equations

μ̂ = ȳ..,
τ̂_i = ȳ_i. - ȳ..,    i = 1, 2, ..., a.        (12-15)



This solution has considerable intuitive appeal, since the overall mean is estimated by the grand average of the observations and the estimate of any treatment effect is just the difference between the treatment average and the grand average.
This solution is obviously not unique because it depends on the constraint (equation 12-14) that we have chosen. At first this may seem unfortunate, because two different experimenters could analyze the same data and obtain different results if they apply different constraints. However, certain functions of the model parameters are estimated uniquely, regardless of the constraint. Some examples are τ_i - τ_j, which would be estimated by τ̂_i - τ̂_j = ȳ_i. - ȳ_j., and μ + τ_i, which would be estimated by μ̂ + τ̂_i = ȳ_i.. Since we are usually interested in differences in the treatment effects rather than their actual values, it causes no concern that the τ_i cannot be estimated uniquely. In general, any function of the model parameters that is a linear combination of the left-hand side of the normal equations can be estimated uniquely. Functions that are uniquely estimated, regardless of which constraint is used, are called estimable functions.
Frequently, we would like to construct a confidence interval for the ith treatment mean. The mean of the ith treatment is

μ_i = μ + τ_i,    i = 1, 2, ..., a.

A point estimator of μ_i would be μ̂_i = μ̂ + τ̂_i = ȳ_i.. Now, if we assume that the errors are normally distributed, each ȳ_i. is NID(μ_i, σ²/n). Thus, if σ² were known, we could use the normal distribution to construct a confidence interval for μ_i. Using MS_E as an estimator of σ², we can base the confidence interval on the t distribution. Therefore, a 100(1 - α)% confidence interval on the ith treatment mean μ_i is

ȳ_i. - t_{α/2, N-a} √(MS_E/n) ≤ μ_i ≤ ȳ_i. + t_{α/2, N-a} √(MS_E/n).        (12-16)

A 100(1 - α)% confidence interval on the difference between any two treatment means, say μ_i - μ_j, is

ȳ_i. - ȳ_j. - t_{α/2, N-a} √(2MS_E/n) ≤ μ_i - μ_j ≤ ȳ_i. - ȳ_j. + t_{α/2, N-a} √(2MS_E/n).        (12-17)

Example 12-2
We can use the results given previously to estimate the mean tensile strengths at different levels of hardwood concentration for the experiment in Section 12-1.1. The mean tensile strength estimates are

μ̂_5% = ȳ₁. = 10.00 psi,
μ̂_10% = ȳ₂. = 15.67 psi,
μ̂_15% = ȳ₃. = 17.00 psi,
μ̂_20% = ȳ₄. = 21.17 psi.

A 95% confidence interval on the mean tensile strength at 20% hardwood is found from equation 12-16 as follows:

[21.17 ± (2.086)√(6.51/6)],
[21.17 ± 2.17].

The desired confidence interval is

19.00 psi ≤ μ_20% ≤ 23.34 psi.

Visual examination of the data suggests that mean tensile strength at 10% and 15% hardwood is similar. A confidence interval on the difference in means μ_15% - μ_10% is

[ȳ₃. - ȳ₂. ± t_{0.025, 20} √(2MS_E/n)],
[17.00 - 15.67 ± (2.086)√(2(6.51)/6)],
[1.33 ± 3.07].

Thus, the confidence interval on μ_15% - μ_10% is

-1.74 ≤ μ_15% - μ_10% ≤ 4.40.

Since the confidence interval includes zero, we would conclude that there is no difference in mean tensile strength at these two particular hardwood levels.
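As a supplement (not part of the original text), the same intervals can be computed with scipy.stats, assuming a Python environment; the only inputs are MS_E = 6.51, the error degrees of freedom N - a = 20, and n = 6 observations per treatment.

```python
# A short sketch reproducing the Example 12-2 intervals (assumes SciPy).
from math import sqrt
from scipy import stats

ms_e, df_e, n = 6.51, 20, 6
t = stats.t.ppf(0.975, df_e)          # t_{0.025, 20} = 2.086

# 95% CI on a single treatment mean (equation 12-16):
ybar_20pct = 21.17
half = t * sqrt(ms_e / n)
print(ybar_20pct - half, ybar_20pct + half)   # about 19.00 to 23.34

# 95% CI on mu_15% - mu_10% (equation 12-17):
diff = 17.00 - 15.67
half = t * sqrt(2 * ms_e / n)
print(diff - half, diff + half)               # about -1.74 to 4.40
```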

12-1.4 Residual Analysis and Model Checking


The one-way model analysis of variance assumes that the observations are normally and independently distributed, with the same variance in each treatment or factor level. These assumptions should be checked by examining the residuals. We define a residual as e_ij = y_ij - ȳ_i., that is, the difference between an observation and the corresponding treatment mean. The residuals for the hardwood percentage experiment are shown in Table 12-5.
The normality assumption can be checked by plotting the residuals on normal probability paper. To check the assumption of equal variances at each factor level, plot the residuals against the factor levels and compare the spread in the residuals. It is also useful to plot the residuals against ȳ_i. (sometimes called the fitted value); the variability in the residuals should not depend in any way on the value of ȳ_i.. When a pattern appears in these plots, it usually suggests the need for a transformation, that is, analyzing the data in a different metric. For example, if the variability in the residuals increases with ȳ_i., then a transformation such as log y or √y should be considered. In some problems the dependency of residual scatter on ȳ_i. is very important information. It may be desirable to select the factor level that results in maximum y; however, this level may also cause more variation in y from run to run.
The independence assumption can be checked by plotting the residuals against the time or run order in which the experiment was performed. A pattern in this plot, such as sequences of positive and negative residuals, may indicate that the observations are not independent. This suggests that time or run order is important, or that variables that change over time are important and have not been included in the experimental design.
A normal probability plot of the residuals from the hardwood concentration experiment is shown in Fig. 12-2. Figures 12-3 and 12-4 present the residuals plotted against the treatment number and the fitted value ȳ_i..

Table 12-5 Residuals for the Tensile Strength Experiment

Hardwood
Concentration    Residuals
5%               -3.00    -2.00    5.00     1.00     -1.00    0.00
10%              -3.67    1.33     -2.67    2.33     3.33     -0.67
15%              -3.00    1.00     2.00     0.00     -1.00    1.00
20%              -2.17    3.83     0.83     1.83     -3.17    -1.17

These plots do not reveal any model inadequacy or unusual problems with the assumptions.
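A brief sketch, added here and assuming Python with NumPy, shows how the Table 12-5 residuals are obtained; the trailing comment indicates the usual SciPy/matplotlib calls for a normal probability plot like Fig. 12-2.

```python
# A sketch (not from the text) computing the residuals e_ij = y_ij - ybar_i.
import numpy as np

data = np.array([[7, 8, 15, 11, 9, 10],
                 [12, 17, 13, 18, 19, 15],
                 [14, 18, 19, 17, 16, 18],
                 [19, 25, 22, 23, 18, 20]], dtype=float)

fitted = data.mean(axis=1, keepdims=True)   # treatment means (fitted values)
residuals = data - fitted
print(np.round(residuals, 2))               # matches Table 12-5

# A normal probability plot as in Fig. 12-2 (requires SciPy and matplotlib):
# from scipy import stats; import matplotlib.pyplot as plt
# stats.probplot(residuals.ravel(), plot=plt); plt.show()
```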

12-1.5 An Unbalanced Design

In some single-factor experiments the number of observations taken under each treatment may be different. We then say that the design is unbalanced. The analysis of variance described earlier is still valid, but slight modifications must be made in the sums of squares formulas. Let n_i observations be taken under treatment i (i = 1, 2, ..., a), and let the total

Figure 12-2 Normal probability plot of residuals from the hardwood concentration experiment.

Figure 12-3 Plot of residuals vs. treatment.

Figure 12-4 Plot of residuals vs. ȳ_i..

number of observations N = Σ_{i=1}^{a} n_i. The computational formulas for SS_T and SS_treatments become

SS_T = Σ_{i=1}^{a} Σ_{j=1}^{n_i} y_ij² - y..²/N

and

SS_treatments = Σ_{i=1}^{a} y_i.²/n_i - y..²/N.

In solving the normal equations, the constraint Σ_{i=1}^{a} n_i τ̂_i = 0 is used. No other changes are required in the analysis of variance.
There are two important advantages in choosing a balanced design. First, the test statistic is relatively insensitive to small departures from the assumption of equality of variances if the sample sizes are equal. This is not the case for unequal sample sizes. Second, the power of the test is maximized if the samples are of equal size.

12-2 TESTS ON INDIVIDUAL TREATMENT MEANS

12-2.1 Orthogonal Contrasts

Rejecting the null hypothesis in the fixed-effects-model analysis of variance implies that there are differences between the a treatment means, but the exact nature of the differences is not specified. In this situation, further comparisons between groups of treatment means may be useful. The ith treatment mean is defined as μ_i = μ + τ_i, and μ_i is estimated by ȳ_i.. Comparisons between treatment means are usually made in terms of the treatment totals {y_i.}.
Consider the hardwood concentration experiment presented in Section 12-1.1. Since the hypothesis H₀: τ_i = 0 was rejected, we know that some hardwood concentrations produce tensile strengths different from others, but which ones actually cause this difference?

We might suspect at the outset of the experiment that hardwood concentrations 3 and 4 produce the same tensile strength, implying that we would like to test the hypothesis

H₀: μ₃ = μ₄,
H₁: μ₃ ≠ μ₄.

This hypothesis could be tested by using a linear combination of treatment totals, say

y₃. - y₄. = 0.

If we had suspected that the average of hardwood concentrations 1 and 3 did not differ from the average of hardwood concentrations 2 and 4, then the hypothesis would have been

H₀: μ₁ + μ₃ = μ₂ + μ₄,
H₁: μ₁ + μ₃ ≠ μ₂ + μ₄,

which implies that the linear combination of treatment totals

y₁. + y₃. - y₂. - y₄. = 0.

In general, the comparison of treatment means of interest will imply a linear combination of treatment totals such as

C = Σ_{i=1}^{a} c_i y_i.,

with the restriction that Σ_{i=1}^{a} c_i = 0. These linear combinations are called contrasts. The sum of squares for any contrast is

SS_C = ( Σ_{i=1}^{a} c_i y_i. )² / ( n Σ_{i=1}^{a} c_i² )        (12-18)

and has a single degree of freedom. If the design is unbalanced, then the comparison of treatment means requires that Σ_{i=1}^{a} n_i c_i = 0, and equation 12-18 becomes

SS_C = ( Σ_{i=1}^{a} c_i y_i. )² / ( Σ_{i=1}^{a} n_i c_i² ).        (12-19)

A contrast is tested by comparing its sum of squares to the mean square error. The resulting statistic would be distributed as F, with 1 and N - a degrees of freedom.
A very important special case of the above procedure is that of orthogonal contrasts. Two contrasts with coefficients {c_i} and {d_i} are orthogonal if

Σ_{i=1}^{a} c_i d_i = 0,

or, for an unbalanced design, if

Σ_{i=1}^{a} n_i c_i d_i = 0.

For a treatments a set of a - 1 orthogonal contrasts will partition the sum of squares due to treatments into a - 1 independent single-degree-of-freedom components. Thus, tests performed on orthogonal contrasts are independent.
There are many ways to choose the orthogonal contrast coefficients for a set of treatments. Usually, something in the nature of the experiment should suggest which comparisons will be of interest. For example, if there are a = 3 treatments, with treatment 1 a "control" and treatments 2 and 3 actual levels of the factor of interest to the experimenter, then appropriate orthogonal contrasts might be as follows:

               Orthogonal Contrasts
Treatment      Contrast 1    Contrast 2
1 (control)    -2            0
2 (level 1)    1             -1
3 (level 2)    1             1

Note that contrast 1 with c_i = -2, 1, 1 compares the average effect of the factor with the control, while contrast 2 with d_i = 0, -1, 1 compares the two levels of the factor of interest.
Contrast coefficients must be chosen prior to running the experiment, for if these comparisons are selected after examining the data, most experimenters would construct tests that compare large observed differences in means. These large differences could be due to the presence of real effects or they could be due to random error. If experimenters always pick the largest differences to compare, they will inflate the type I error of the test, since it is likely that in an unusually high percentage of the comparisons selected the observed differences will be due to error.

Example 12-3
Consider the hardwood concentration experiment. There are four levels of hardwood concentration, and the possible sets of comparisons between these means and the associated orthogonal comparisons are

H₀: μ₁ + μ₄ = μ₂ + μ₃,      C₁ = y₁. - y₂. - y₃. + y₄.,
H₀: 3μ₁ + μ₂ = μ₃ + 3μ₄,    C₂ = -3y₁. - y₂. + y₃. + 3y₄.,
H₀: μ₁ + 3μ₃ = 3μ₂ + μ₄,    C₃ = -y₁. + 3y₂. - 3y₃. + y₄..

Notice that the contrast constants are orthogonal. Using the data from Table 12-1, we find the numerical values of the contrasts and the sums of squares as follows:

C₁ = 60 - 94 - 102 + 127 = -9,              SS_C₁ = (-9)²/[6(4)] = 3.38,
C₂ = -3(60) - 94 + 102 + 3(127) = 209,      SS_C₂ = (209)²/[6(20)] = 364.00,
C₃ = -60 + 3(94) - 3(102) + 127 = 43,       SS_C₃ = (43)²/[6(20)] = 15.41.

These contrast sums of squares completely partition the treatment sum of squares; that is, SS_treatments = SS_C₁ + SS_C₂ + SS_C₃ = 382.79. These tests on the contrasts are usually incorporated into the analysis of variance, such as shown in Table 12-6. From this analysis, we conclude that there are significant differences between hardwood concentrations 1, 2 vs. 3, 4, but that the average of 1 and 4 does not differ from the average of 2 and 3, nor does the average of 1 and 3 differ from the average of 2 and 4.

Table 12-6 Analysis of Variance for the Tensile Strength Data

Source of Variation       Sum of Squares    Degrees of Freedom    Mean Square    F₀
Hardwood concentration    382.79            3                     127.60         19.61
  C₁ (1, 4 vs. 2, 3)      (3.38)            (1)                   3.38           0.52
  C₂ (1, 2 vs. 3, 4)      (364.00)          (1)                   364.00         55.91
  C₃ (1, 3 vs. 2, 4)      (15.41)           (1)                   15.41          2.37
Error                     130.17            20                    6.51
Total                     512.96            23
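The contrast arithmetic of Example 12-3 can be checked with a few lines of code. This sketch is an addition to the text (assuming Python with NumPy); it applies equation 12-18 to the treatment totals of Table 12-1.

```python
# A sketch (not from the text) of the Example 12-3 contrast computations.
import numpy as np

totals = np.array([60, 94, 102, 127])   # y_i. for 5%, 10%, 15%, 20%
n = 6                                   # observations per treatment

contrasts = {
    "C1 (1,4 vs. 2,3)": np.array([1, -1, -1, 1]),
    "C2 (1,2 vs. 3,4)": np.array([-3, -1, 1, 3]),
    "C3 (1,3 vs. 2,4)": np.array([-1, 3, -3, 1]),
}

for name, c in contrasts.items():
    value = c @ totals                   # contrast value C
    ss = value**2 / (n * (c**2).sum())   # equation 12-18
    print(name, value, round(ss, 2))     # -9/3.38, 209/364.00, 43/15.41
```

Each contrast sum of squares could then be divided by MS_E = 6.51 to reproduce the F₀ column of Table 12-6.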

12-2.2 Tukey's Test


Frequently, analysts do not know in advance how to construct appropriate orthogonal contrasts, or they may wish to test more than a - 1 comparisons using the same data. For example, analysts may want to test all possible pairs of means. The null hypotheses would then be H₀: μ_i = μ_j for all i ≠ j. If we test all possible pairs of means using t-tests, the probability of committing a type I error for the entire set of comparisons can be greatly increased. There are several procedures available that avoid this problem. Among the more popular of these procedures are the Newman-Keuls test [Newman (1939); Keuls (1952)], Duncan's multiple range test [Duncan (1955)], and Tukey's test [Tukey (1953)]. Here we describe Tukey's test.
Tukey's procedure makes use of another distribution, called the Studentized range distribution. The Studentized range statistic is

q = (ȳ_max - ȳ_min) / √(MS_E/n),

where ȳ_max is the largest sample mean and ȳ_min is the smallest sample mean out of p sample means. Let q_α(a, f) represent the upper α percentage point of q, where a is the number of treatments and f is the number of degrees of freedom for error. Two means, ȳ_i. and ȳ_j. (i ≠ j), are considered significantly different if

|ȳ_i. - ȳ_j.| > T_α,

where

T_α = q_α(a, f) √(MS_E/n).        (12-20)

Table XII (Appendix) contains values of q_α(a, f) for α = 0.05 and 0.01 and a selection of values for a and f. Tukey's procedure has the property that the overall significance level is exactly α for equal sample sizes and at most α for unequal sample sizes.

Example 12-4
We will apply Tukey's test to the hardwood concentration experiment. Recall that there are a = 4 means, n = 6, and MS_E = 6.51. The treatment means are

ȳ₁. = 10.00 psi,    ȳ₂. = 15.67 psi,    ȳ₃. = 17.00 psi,    ȳ₄. = 21.17 psi.

From Table XII (Appendix), with α = 0.05, a = 4, and f = 20, we find q₀.₀₅(4, 20) = 3.96.

Using equation 12-20,

T₀.₀₅ = q₀.₀₅(4, 20) √(MS_E/n) = 3.96 √(6.51/6) = 4.12.

Therefore, we would conclude that two means are significantly different if

|ȳ_i. - ȳ_j.| > 4.12.

The differences in treatment averages are

|ȳ₁. - ȳ₂.| = |10.00 - 15.67| = 5.67,
|ȳ₁. - ȳ₃.| = |10.00 - 17.00| = 7.00,
|ȳ₁. - ȳ₄.| = |10.00 - 21.17| = 11.17,
|ȳ₂. - ȳ₃.| = |15.67 - 17.00| = 1.33,
|ȳ₂. - ȳ₄.| = |15.67 - 21.17| = 5.50,
|ȳ₃. - ȳ₄.| = |17.00 - 21.17| = 4.17.

From this analysis, we see significant differences between all pairs of means except 2 and 3. It may be of use to draw a graph of the treatment means, such as Fig. 12-5, with the means that are not different underlined.
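For routine work, Tukey's procedure is available in statistical software. The sketch below is an addition to the text and assumes the statsmodels Python package; its pairwise_tukeyhsd function reproduces the comparisons of Example 12-4.

```python
# A sketch (not from the text) of Tukey's test via statsmodels.
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

strength = np.array([7, 8, 15, 11, 9, 10,
                     12, 17, 13, 18, 19, 15,
                     14, 18, 19, 17, 16, 18,
                     19, 25, 22, 23, 18, 20])
concentration = np.repeat([5, 10, 15, 20], 6)

result = pairwise_tukeyhsd(strength, concentration, alpha=0.05)
print(result)   # all pairs differ except the 10% and 15% means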

Simultaneous confidence intervals can also be constructed on the differences in pairs of means using the Tukey approach. It can be shown that

ȳ_i. - ȳ_j. - q_α(a, f) √(MS_E/n) ≤ μ_i - μ_j ≤ ȳ_i. - ȳ_j. + q_α(a, f) √(MS_E/n)

when sample sizes are equal. This expression represents a 100(1 - α)% simultaneous confidence interval on all pairs of means μ_i - μ_j.
If the sample sizes are unequal, the 100(1 - α)% simultaneous confidence interval on all pairs of means μ_i - μ_j is given by

ȳ_i. - ȳ_j. - (q_α(a, f)/√2) √(MS_E (1/n_i + 1/n_j)) ≤ μ_i - μ_j ≤ ȳ_i. - ȳ_j. + (q_α(a, f)/√2) √(MS_E (1/n_i + 1/n_j)).

Interpretation of the confidence intervals is straightforward. If zero is contained in an interval, then there is no significant difference between the two means at the α significance level.
It should be noted that the significance level, α, in Tukey's multiple comparison procedure represents an experimentwise error rate. With respect to confidence intervals, α represents the probability that one or more of the confidence intervals on the pairwise differences will not contain the true difference for equal sample sizes (when sample sizes are unequal, this probability becomes at most α).

Figure 12-5 Results of Tukey's test (treatment means ȳ₁., ȳ₂., ȳ₃., ȳ₄. plotted on a scale from 10.0 to 20.0, with nonsignificant pairs underlined).



12-3 THE RANDOM-EFFECTS MODEL

In many situations, the factor of interest has a large number of possible levels. The analyst is interested in drawing conclusions about the entire population of factor levels. If the experimenter randomly selects a of these levels from the population of factor levels, then we say that the factor is a random factor. Because the levels of the factor actually used in the experiment were chosen randomly, the conclusions reached will be valid about the entire population of factor levels. We will assume that the population of factor levels is either of infinite size or is large enough to be considered infinite.
The linear statistical model is

y_ij = μ + τ_i + ε_ij,    i = 1, 2, ..., a,  j = 1, 2, ..., n,        (12-21)

where τ_i and ε_ij are independent random variables. Note that the model is identical in structure to the fixed-effects case, but the parameters have a different interpretation. If the variance of τ_i is σ_τ², then the variance of any observation is

V(y_ij) = σ_τ² + σ².

The variances σ_τ² and σ² are called variance components, and the model, equation 12-21, is called the components-of-variance or the random-effects model. To test hypotheses using this model, we require that the {ε_ij} are NID(0, σ²), that the {τ_i} are NID(0, σ_τ²), and that τ_i and ε_ij are independent. The assumption that the {τ_i} are independent random variables implies that the usual assumption Σ_{i=1}^{a} τ_i = 0 from the fixed-effects model does not apply to the random-effects model.
The sum of squares identity

SS_T = SS_treatments + SS_E        (12-22)

still holds. That is, we partition the total variability in the observations into a component that measures variation between treatments (SS_treatments) and a component that measures variation within treatments (SS_E). However, instead of testing hypotheses about individual treatment effects, we test the hypotheses

H₀: σ_τ² = 0,
H₁: σ_τ² > 0.

If σ_τ² = 0, all treatments are identical, but if σ_τ² > 0, then there is variability between treatments. The quantity SS_E/σ² is distributed as chi-square with N - a degrees of freedom, and under the null hypothesis, SS_treatments/σ² is distributed as chi-square with a - 1 degrees of freedom. Further, the random variables are independent of each other. Thus, under the null hypothesis, the ratio

F₀ = [SS_treatments/(a - 1)] / [SS_E/(N - a)]        (12-23)

is distributed as F with a - 1 and N - a degrees of freedom. By examining the expected mean squares we can determine the critical region for this statistic.
Consider
E(MS_treatments) = E[SS_treatments/(a - 1)] = (1/(a - 1)) E[ (1/n) Σ_{i=1}^{a} y_i.² - y..²/N ].

If we square and take the expectation of the quantities in brackets, we see that terms involving τ_i² are replaced by σ_τ², as E(τ_i) = 0. Also, terms involving ε_i.², Σ_{i=1}^{a} Σ_{j=1}^{n} ε_ij², and n Σ_{i=1}^{a} τ_i² are replaced by nσ², anσ², and anσ_τ², respectively. Finally, all cross-product terms involving τ_i and ε_ij have zero expectation. This leads to

E(MS_treatments) = σ² + nσ_τ².        (12-24)

A similar approach will show that

E(MS_E) = σ².        (12-25)

From the expected mean squares, we see that if H₀ is true, both the numerator and the denominator of the test statistic, equation 12-23, are unbiased estimators of σ², whereas if H₁ is true, the expected value of the numerator is greater than the expected value of the denominator. Therefore, we should reject H₀ for values of F₀ that are too large. This implies an upper-tail, one-tail critical region, so we reject H₀ if F₀ > F_{α, a-1, N-a}.
The computational procedure and analysis-of-variance table for the random-effects model are identical to the fixed-effects case. The conclusions, however, are quite different because they apply to the entire population of treatments.
We usually need to estimate the variance components (σ² and σ_τ²) in the model. The procedure used to estimate σ² and σ_τ² is called the "analysis-of-variance method," because it uses the lines in the analysis-of-variance table. It does not require the normality assumption on the observations. The procedure consists of equating the expected mean squares to their observed values in the analysis-of-variance table and solving for the variance components. When equating observed and expected mean squares in the one-way-classification random-effects model, we obtain

MS_treatments = σ̂² + nσ̂_τ²

and

MS_E = σ̂².

Therefore, the estimators of the variance components are

σ̂² = MS_E        (12-26)

and

σ̂_τ² = (MS_treatments - MS_E)/n.        (12-27)

For unequal sample sizes, replace n in equation 12-27 with

n₀ = (1/(a - 1)) [ Σ_{i=1}^{a} n_i - ( Σ_{i=1}^{a} n_i² ) / ( Σ_{i=1}^{a} n_i ) ].
Sometimes the analysis-of-variance method produces a negative estimate of a variance component. Since variance components are by definition nonnegative, a negative estimate of a variance component is unsettling. One course of action is to accept the estimate and use it as evidence that the true value of the variance component is zero, assuming that sampling variation led to the negative estimate. While this has intuitive appeal, it will disturb the statistical properties of other estimates. Another alternative is to reestimate the negative variance component with a method that always yields nonnegative estimates. Still another possibility is to consider the negative estimate as evidence that the assumed linear model is incorrect, requiring that a study of the model and its assumptions be made to find a more appropriate model.

Example 12-5
In his book Design and Analysis of Experiments (2001), D. C. Montgomery describes a single-factor experiment involving the random-effects model. A textile manufacturing company weaves a fabric on a large number of looms. The company is interested in loom-to-loom variability in tensile strength. To investigate this, a manufacturing engineer selects four looms at random and makes four strength determinations on fabric samples chosen at random for each loom. The data are shown in Table 12-7, and the analysis of variance is summarized in Table 12-8.
From the analysis of variance, we conclude that the looms in the plant differ significantly in their ability to produce fabric of uniform strength. The variance components are estimated by σ̂² = 1.90 and

σ̂_τ² = (29.73 - 1.90)/4 = 6.96.

Therefore, the variance of strength in the manufacturing process is estimated by

V̂(y_ij) = σ̂_τ² + σ̂² = 6.96 + 1.90 = 8.86.

Most of this variability is attributable to differences between looms.

Table 12-7 Strength Data for Example 12-5

        Observations
Loom    1     2     3     4     Totals    Averages
1       98    97    99    96    390       97.5
2       91    90    93    92    366       91.5
3       96    95    97    95    383       95.8
4       95    96    99    98    388       97.0
                                1527      95.4


Table 12-8 Analysis of Variance for the Strength Data

Source of Variation    Sum of Squares    Degrees of Freedom    Mean Square    F₀
Looms                  89.19             3                     29.73          15.68
Error                  22.75             12                    1.90
Total                  111.94            15
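A sketch of the analysis-of-variance method for the loom data, added here under the assumption that Python with NumPy is available; it computes the mean squares directly and then applies equations 12-26 and 12-27.

```python
# A sketch (not from the text) estimating the variance components of Example 12-5.
import numpy as np

looms = np.array([[98, 97, 99, 96],
                  [91, 90, 93, 92],
                  [96, 95, 97, 95],
                  [95, 96, 99, 98]], dtype=float)
a, n = looms.shape
N = a * n

ss_total = ((looms - looms.mean())**2).sum()
ss_treat = n * ((looms.mean(axis=1) - looms.mean())**2).sum()
ss_error = ss_total - ss_treat

ms_treat = ss_treat / (a - 1)                 # estimates sigma^2 + n*sigma_tau^2
ms_error = ss_error / (N - a)                 # estimates sigma^2
sigma2_hat = ms_error                         # equation 12-26
sigma_tau2_hat = (ms_treat - ms_error) / n    # equation 12-27
print(ms_treat, ms_error, sigma2_hat, sigma_tau2_hat)  # about 29.73, 1.90, 1.90, 6.96
```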

This example illustrates an important application of analysis of variance: the isolation of different sources of variability in a manufacturing process. Problems of excessive variability in critical functional parameters or properties frequently arise in quality-improvement programs. For example, in the previous fabric-strength example, the process mean is estimated by ȳ.. = 95.45 psi and the process standard deviation is estimated by σ̂_y = √V̂(y_ij) = √8.86 = 2.98 psi. If strength is approximately normally distributed, this would imply a distribution of strength in the outgoing product that looks like the normal distribution shown in Fig. 12-6a. If the lower specification limit (LSL) on strength is at 90 psi, then a substantial proportion of the process output is fallout; that is, scrap or defective material that must be sold as second quality, and so on. This fallout is directly related to the excess variability resulting from differences between looms. Variability in loom performance could be caused by faulty setup, poor maintenance, inadequate supervision, poorly trained operators, and so forth. The engineer or manager responsible for quality improvement must identify and remove these sources of variability from the process. If he can do this, then strength variability will be greatly reduced, perhaps as low as σ̂_y = √1.90 = 1.38 psi, as shown in Fig. 12-6b. In this improved process, reducing the variability in strength has greatly reduced the fallout. This will result in lower cost, higher quality, a more satisfied customer, and enhanced competitive position for the company.

Figure 12-6 The distribution of fabric strength. (a) Current process. (b) Improved process.

12-4 THE RANDOMIZED BLOCK DESIGN

12-4.1 Design and Statistical Analysis

In many experimental problems it is necessary to design the experiment so that variability arising from nuisance variables can be controlled. As an example, recall the situation in Example 11-17, where two different procedures were used to predict the shear strength of steel plate girders. Because each girder has potentially different strength, and because this variability in strength was not of direct interest, we designed the experiment using the two methods on each girder and compared the difference in average strength readings to zero using the paired t-test. The paired t-test is a procedure for comparing two means when all experimental runs cannot be made under homogeneous conditions. Thus, the paired t-test reduces the noise in the experiment by blocking out a nuisance variable effect. The randomized block design is an extension of the paired t-test that is used in situations where the factor of interest has more than two levels.
As an example, suppose that we wish to compare the effect of four different chemicals on the strength of a particular fabric. It is known that the effect of these chemicals varies considerably from one fabric specimen to another. In this example, we have only one factor: chemical type. Therefore, we could select several pieces of fabric and compare all four chemicals within the relatively homogeneous conditions provided by each piece of fabric. This would remove any variation due to the fabric.
The general procedure for a randomized complete block design consists of selecting b blocks and running a complete replicate of the experiment in each block. A randomized complete block design for investigating a single factor with a levels would appear as in Fig. 12-7. There will be a observations (one per factor level) in each block, and the order in which these observations are run is randomly assigned within the block.
We will now describe the statistical analysis for a randomized block design. Suppose that a single factor with a levels is of interest, and the experiment is run in b blocks, as shown in Fig. 12-7. The observations may be represented by the linear statistical model,

y_ij = μ + τ_i + β_j + ε_ij,    i = 1, 2, ..., a,  j = 1, 2, ..., b,        (12-28)

where μ is an overall mean, τ_i is the effect of the ith treatment, β_j is the effect of the jth block, and ε_ij is the usual NID(0, σ²) random error term. Treatments and blocks will be considered initially as fixed factors. Furthermore, the treatment and block effects are defined as deviations from the overall mean, so that Σ_{i=1}^{a} τ_i = 0 and Σ_{j=1}^{b} β_j = 0. We are interested in testing the equality of the treatment effects. That is,

H₀: τ₁ = τ₂ = ... = τ_a = 0,
H₁: τ_i ≠ 0 for at least one i.

Figure 12-7 The randomized complete block design.


Let y_i. be the total of all observations taken under treatment i, let y.j be the total of all observations in block j, let y.. be the grand total of all observations, and let N = ab be the total number of observations. Similarly, ȳ_i. is the average of the observations taken under treatment i, ȳ.j is the average of the observations in block j, and ȳ.. is the grand average of all observations. The total corrected sum of squares is

Σ_{i=1}^{a} Σ_{j=1}^{b} (y_ij - ȳ..)².        (12-29)

Expanding the right-hand side of equation 12-29 and applying algebraic elbow grease yields

Σ_{i=1}^{a} Σ_{j=1}^{b} (y_ij - ȳ..)² = b Σ_{i=1}^{a} (ȳ_i. - ȳ..)² + a Σ_{j=1}^{b} (ȳ.j - ȳ..)²
    + Σ_{i=1}^{a} Σ_{j=1}^{b} (y_ij - ȳ_i. - ȳ.j + ȳ..)²        (12-30)

or, symbolically,

SS_T = SS_treatments + SS_blocks + SS_E.        (12-31)

The degrees-of-freedom breakdown corresponding to equation 12-31 is

ab - 1 = (a - 1) + (b - 1) + (a - 1)(b - 1).        (12-32)

The null hypothesis of no treatment effects (H₀: τ_i = 0) is tested by the F ratio MS_treatments/MS_E. The analysis of variance is summarized in Table 12-9. Computing formulas for the sums of squares are also shown in this table. The same test procedure is used in cases where treatments and/or blocks are random.

Example 12-6
An experiment was performed to determine the effect of four different chemicals on the strength of a fabric. These chemicals are used as part of the permanent-press finishing process. Five fabric samples were selected, and a randomized block design was run by testing each chemical type once in random order on each fabric sample. The data are shown in Table 12-10.

Table 12-9 Analysis of Variance for the Randomized Complete Block Design

Source of Variation    Sum of Squares                                       Degrees of Freedom    Mean Square
Treatments             SS_treatments = Σ_{i=1}^{a} y_i.²/b - y..²/(ab)      a - 1                 SS_treatments/(a - 1)
Blocks                 SS_blocks = Σ_{j=1}^{b} y.j²/a - y..²/(ab)           b - 1                 SS_blocks/(b - 1)
Error                  SS_E (by subtraction)                                (a - 1)(b - 1)        SS_E/[(a - 1)(b - 1)]
Total                  SS_T = Σ_{i=1}^{a} Σ_{j=1}^{b} y_ij² - y..²/(ab)     ab - 1

Table 12-10 Fabric Strength Data, Randomized Block Design

                                                            Row Totals    Row Averages
Chemical Type          Fabric Sample                        y_i.          ȳ_i.
                       1       2       3       4      5
1                      1.3     1.6     0.5     1.2    1.1   5.7           1.14
2                      2.2     2.4     0.4     2.0    1.8   8.8           1.76
3                      1.8     1.7     0.6     1.5    1.3   6.9           1.38
4                      3.9     4.4     2.0     4.1    3.4   17.8          3.56
Column totals, y.j     9.2     10.1    3.5     8.8    7.6   39.2 (y..)    1.96 (ȳ..)
Column averages, ȳ.j   2.30    2.53    0.88    2.20   1.90

The sums of squares for the analysis of variance are computed as follows:

SS_T = Σ_{i=1}^{a} Σ_{j=1}^{b} y_ij² - y..²/(ab) = (1.3)² + (2.2)² + ... + (3.4)² - (39.2)²/20 = 25.69,

SS_treatments = Σ_{i=1}^{a} y_i.²/b - y..²/(ab)
    = [(5.7)² + (8.8)² + (6.9)² + (17.8)²]/5 - (39.2)²/20 = 18.04,

SS_blocks = Σ_{j=1}^{b} y.j²/a - y..²/(ab)
    = [(9.2)² + (10.1)² + (3.5)² + (8.8)² + (7.6)²]/4 - (39.2)²/20 = 6.69,

SS_E = SS_T - SS_blocks - SS_treatments = 25.69 - 6.69 - 18.04 = 0.96.

The analysis of variance is summarized in Table 12-11. We would conclude that there is a significant difference in the chemical types as far as their effect on fabric strength is concerned.

Table 12-11 Analysis of Variance for the Randomized Block Experiment

Source of Variation           Sum of Squares    Degrees of Freedom    Mean Square    F₀
Chemical type (treatments)    18.04             3                     6.01           75.13
Fabric sample (blocks)        6.69              4                     1.67
Error                         0.96              12                    0.08
Total                         25.69             19
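The randomized block computations of Example 12-6 can be reproduced with the short sketch below (an addition to the text, assuming Python with NumPy and SciPy); it uses the computing formulas from Table 12-9.

```python
# A sketch (not from the text) of the randomized block analysis of Table 12-10.
import numpy as np
from scipy import stats

y = np.array([[1.3, 1.6, 0.5, 1.2, 1.1],
              [2.2, 2.4, 0.4, 2.0, 1.8],
              [1.8, 1.7, 0.6, 1.5, 1.3],
              [3.9, 4.4, 2.0, 4.1, 3.4]])
a, b = y.shape

correction = y.sum()**2 / (a * b)
ss_total = (y**2).sum() - correction
ss_treat = (y.sum(axis=1)**2).sum() / b - correction
ss_block = (y.sum(axis=0)**2).sum() / a - correction
ss_error = ss_total - ss_treat - ss_block

f0 = (ss_treat / (a - 1)) / (ss_error / ((a - 1) * (b - 1)))
p = stats.f.sf(f0, a - 1, (a - 1) * (b - 1))
print(round(ss_treat, 2), round(ss_block, 2), round(ss_error, 2), round(f0, 2), p)
# 18.04 6.69 0.95 75.89; Table 12-11 reports F0 = 75.13 because it uses the
# rounded mean squares 6.01 and 0.08
```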

Suppose an experiment is conducted as a randomized block design, and blocking was not really necessary. There are ab observations and (a - 1)(b - 1) degrees of freedom for error. If the experiment had been run as a completely randomized single-factor design with b replicates, we would have had a(b - 1) degrees of freedom for error. So, blocking has cost a(b - 1) - (a - 1)(b - 1) = b - 1 degrees of freedom for error. Thus, since the loss in error degrees of freedom is usually small, if there is a reasonable chance that block effects may be important, the experimenter should use the randomized block design.
For example, consider the experiment described in Example 12-6 as a one-way-classification analysis of variance. We would have 16 degrees of freedom for error. In the randomized block design there are 12 degrees of freedom for error. Therefore, blocking has cost only 4 degrees of freedom, a very small loss considering the possible gain in information that would be achieved if block effects are really important. As a general rule, when in doubt as to the importance of block effects, the experimenter should block and gamble that the block effect does exist. If the experimenter is wrong, the slight loss in the degrees of freedom for error will have a negligible effect, unless the number of degrees of freedom is very small. The reader should compare this discussion to the one at the end of Section 11-3.3.

12-4.2 Tests on Individual Treatment Means

When the analysis of variance indicates that a difference exists between treatment means,
we usually need to perform some follow-up tests to isolate the specific differences. Any
multiple comparison method, such as Tukey's test, could be used to do this.
Tukey's test presented in Section 12-2.2 can be used to determine differences between
treatment means when blocking is involved simply by replacing n with the number of
blocks b in equation 12-20. Keep in mind that the degrees of freedom for error have now
changed. For the randomized block design, f = (a - 1)(b - 1).
To illustrate this procedure, recall that the four chemical type means from Example
12-6 are

y-bar_1. = 1.14,   y-bar_2. = 1.76,   y-bar_3. = 1.38,   y-bar_4. = 3.56,

and that T_0.05 = q_0.05(4, 12) (MS_E/b)^(1/2) = 4.20 (0.08/5)^(1/2) = 0.53. Therefore, we would
conclude that two means are significantly different if

|y-bar_i. - y-bar_j.| > 0.53.

The absolute values of the differences in treatment averages are

|y-bar_1. - y-bar_2.| = |1.14 - 1.76| = 0.62,
|y-bar_1. - y-bar_3.| = |1.14 - 1.38| = 0.24,
|y-bar_1. - y-bar_4.| = |1.14 - 3.56| = 2.42,
|y-bar_2. - y-bar_3.| = |1.76 - 1.38| = 0.38,
|y-bar_2. - y-bar_4.| = |1.76 - 3.56| = 1.80,
|y-bar_3. - y-bar_4.| = |1.38 - 3.56| = 2.18.
The results indicate that chemical types 1 and 3 do not differ and that types 2 and 3 do not differ;
all other pairs of means differ significantly. Figure 12-8 represents the results graphically, where
the underlined pairs do not differ.

Figure 12-8 Results of Tukey's test.
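The cutoff 0.53 can be reproduced directly from the studentized range distribution. The sketch below is illustrative only (not part of the text) and assumes SciPy 1.7 or later, which provides scipy.stats.studentized_range.

# Tukey cutoff for the randomized block design: q_{0.05}(a, f) with
# a = 4 treatments and f = (a-1)(b-1) = 12 error degrees of freedom.
from scipy.stats import studentized_range

a, b, ms_e = 4, 5, 0.08
q = studentized_range.ppf(0.95, a, 12)     # about 4.20
t_005 = q * (ms_e / b) ** 0.5              # about 0.53
print(round(t_005, 2))

# Any pair of treatment averages differing by more than t_005 is declared
# significantly different, e.g. |1.14 - 1.76| = 0.62 > 0.53.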

12-4.3 Residual Analysis and Model Checking

In any designed experiment it is always important to examine the residuals and check for
violations of basic assumptions that could invalidate the results. The residuals for the ran-
domized block design are just the differences between the observed and fitted values,

e_ij = y_ij - yhat_ij,

where the fitted values are

yhat_ij = y-bar_i. + y-bar_.j - y-bar_.. .    (12-33)

The fitted value represents the estimate of the mean response when the ith treatment is run in
the jth block. The residuals from the experiment in Example 12-6 are shown in Table 12-12.
Figures 12-9, 12-10, 12-11, and 12-12 present the important residual plots for the
experiment. There is some indication that fabric sample (block) 3 has greater variability in
strength when treated with the four chemicals than the other samples. Also, chemical type

Table 12-12 Residuals from the Randomized Block Design

Chemical                Fabric Sample
Type           1        2        3        4        5
1             -0.18    -0.11     0.44    -0.18     0.02
2              0.10     0.07    -0.27     0.00     0.10
3              0.08    -0.24     0.30    -0.12    -0.02
4              0.00     0.27    -0.48     0.30    -0.10
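The fitted values and residuals are easy to verify numerically. A minimal sketch, assuming NumPy, that reproduces Table 12-12:

# Fitted values (equation 12-33) and residuals for the fabric strength data.
import numpy as np

y = np.array([[1.3, 1.6, 0.5, 1.2, 1.1],
              [2.2, 2.4, 0.4, 2.0, 1.8],
              [1.8, 1.7, 0.6, 1.5, 1.3],
              [3.9, 4.4, 2.0, 4.1, 3.4]])
row_avg = y.mean(axis=1, keepdims=True)    # treatment averages y-bar_i.
col_avg = y.mean(axis=0, keepdims=True)    # block averages y-bar_.j
fitted = row_avg + col_avg - y.mean()      # yhat_ij = y-bar_i. + y-bar_.j - y-bar_..
residuals = y - fitted
print(np.round(residuals, 2))              # matches Table 12-12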

Figure 12-9 Normal probability plot of residuals from the randomized block design.

4, which provides the greatest strength, also has somewhat more variability in strength. Fol-
low-up experiments may be necessary to confirm these findings if they are potentially
important.


Figure 12-10 Residuals by treatment.

Figure 12-11 Residuals by block.

Figure 12-12 Residuals versus fitted values yhat_ij.

12-5 DETERMINING SAMPLE SIZE IN SINGLE-FACTOR EXPERIMENTS
In any experimental design problem the choice of the sample size or number of replicates
to use is important. Operating characteristic curves can be used to provide guidance in mak-
ing this selection. Recall that the operating characteristic curve is a plot of the type II (beta)
error for various sample sizes against a measure of the difference in means that it is impor-
tant to detect. Thus, if the experimenter knows how large a difference in means is of poten-
tial importance, the operating characteristic curves can be used to determine how many
replicates are required to give adequate sensitivity.
We first consider sample size determination in a fixed-effects model for the case of
equal sample size in each treatment. The power (1 - beta) of the test is

1 - beta = P{Reject H0 | H0 is false}
         = P{F0 > F_{alpha, a-1, N-a} | H0 is false}.    (12-34)

To evaluate this probability statement, we need to know the distribution of the test statistic
F0 if the null hypothesis is false. It can be shown that if H0 is false, the statistic F0 =
MS_treatments/MS_E is distributed as a noncentral F random variable, with a - 1 and N - a
degrees of freedom and a noncentrality parameter delta. If delta = 0, then the noncentral F distri-
bution becomes the usual central F distribution.
The operating characteristic curves in Chart VII of the Appendix are used to calculate
the power of the test for the fixed-effects model. These curves plot the probability of type
II error (beta) against Phi, where

Phi^2 = n Sum_{i=1}^{a} tau_i^2 / (a sigma^2).    (12-35)

The parameter Phi^2 is related to the noncentrality parameter delta. Curves are available for alpha =
0.05 and alpha = 0.01 and for several values of degrees of freedom for the numerator and
denominator. In a completely randomized design, the symbol n in equation 12-35 is the
number of replicates. In a randomized block design, replace n by the number of blocks.
In using the operating characteristic curves, we must define the difference in means
that we wish to detect in terms of Sum_{i=1}^{a} tau_i^2. Also, the error variance sigma^2 is usually unknown.
In such cases, we must choose the ratios of Sum_{i=1}^{a} tau_i^2/sigma^2 that we wish to detect. Alternatively, if
an estimate of sigma^2 is available, one may replace sigma^2 with this estimate. For example, if we
were interested in the sensitivity of an experiment that has already been performed, we
might use MS_E as the estimate of sigma^2.
Example 12-7
Suppose that five means are being compared in a completely randomized experiment with alpha = 0.01.
The experimenter would like to know how many replicates to run if it is important to reject H0 with a
probability of at least 0.90 if Sum_{i=1}^{5} tau_i^2/sigma^2 = 5.0. The parameter Phi^2 is, in this case,

Phi^2 = n Sum_{i=1}^{5} tau_i^2 / (a sigma^2) = n(5)/5 = n,

and the operating characteristic curve for a - 1 = 5 - 1 = 4 and N - a = a(n - 1) = 5(n - 1) error degrees
of freedom is shown in Chart VII (Appendix). As a first guess, try n = 4 replicates. This yields Phi^2 = 4,
Phi = 2, and 5(3) = 15 error degrees of freedom. Consequently, from Chart VII, we find that beta ~ 0.38.

Therefore, the power of the test is approximately 1 - beta = 1 - 0.38 = 0.62, which is less than the
required 0.90, and so we conclude that n = 4 replicates are not sufficient. Proceeding in a similar man-
ner, we can construct the following display.

n    Phi^2    Phi     a(n - 1)    beta    Power (1 - beta)
4    4        2.00    15          0.38    0.62
5    5        2.24    20          0.18    0.82
6    6        2.45    25          0.06    0.94
Thus, at least n = 6 replicates must be run in order to obtain a test with the required power.
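Where the operating characteristic charts are unavailable, the same power calculation can be carried out with the noncentral F distribution. The sketch below is an illustration, not part of the text; it assumes SciPy's ncf distribution, whose noncentrality parameter is delta = n Sum tau_i^2/sigma^2 = a Phi^2.

# Power of the fixed-effects test in Example 12-7 via the noncentral F.
from scipy import stats

a, alpha = 5, 0.01
for n in (4, 5, 6):
    df1, df2 = a - 1, a * (n - 1)
    delta = n * 5.0                       # sum(tau^2)/sigma^2 = 5.0, so delta = 5n
    f_crit = stats.f.ppf(1 - alpha, df1, df2)
    power = stats.ncf.sf(f_crit, df1, df2, delta)
    print(n, round(power, 2))             # compare with the chart values 0.62, 0.82, 0.94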

The power of the test for the random-effects model is

1 - beta = P{Reject H0 | H0 is false}
         = P{F0 > F_{alpha, a-1, N-a} | sigma_tau^2 > 0}.    (12-36)

Once again the distribution of the test statistic F0 under the alternative hypothesis is needed.
It can be shown that if H1 is true (sigma_tau^2 > 0), the distribution of F0 is central F, with a - 1 and
N - a degrees of freedom.
Since the power of the random-effects model is based on the central F distribution, we
could use the tables of the F distribution in the Appendix to evaluate equation 12-36. How-
ever, it is much easier to evaluate the power of the test by using the operating characteris-
tic curves in Chart VIII of the Appendix. These curves plot the probability of the type II
error against lambda, where

lambda = (1 + n sigma_tau^2/sigma^2)^(1/2).    (12-37)

In the randomized block design, replace n with b, the number of blocks. Since sigma^2 is usually
unknown, we may either use a prior estimate or define the value of sigma_tau^2 that we are interested
in detecting in terms of the ratio sigma_tau^2/sigma^2.

Example 12-8
Consider a completely randomized design with five treatments selected at random, with six observa-
tions per treatment and alpha = 0.05. We wish to determine the power of the test if sigma_tau^2 is equal to sigma^2. Since
a = 5, n = 6, and sigma_tau^2 = sigma^2, we may compute

lambda = (1 + (6)(1))^(1/2) = 2.646.

From the operating characteristic curve with a - 1 = 4, N - a = 25 degrees of freedom and alpha = 0.05,
we find that

beta ~ 0.20.

Therefore, the power is approximately 0.80.
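Because the random-effects power rests on the central F distribution (F0/(1 + n sigma_tau^2/sigma^2) is distributed as F with a - 1 and N - a degrees of freedom), the OC-chart lookup can be checked directly. A minimal sketch, assuming SciPy:

# Power of the random-effects test in this example without the OC chart.
from scipy import stats

a, n, alpha = 5, 6, 0.05
ratio = 1.0                               # sigma_tau^2 / sigma^2
lam_sq = 1 + n * ratio                    # lambda^2 = 7, lambda = 2.646
f_crit = stats.f.ppf(1 - alpha, a - 1, a * (n - 1))
power = stats.f.sf(f_crit / lam_sq, a - 1, a * (n - 1))
print(round(power, 2))                    # approximately 0.80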
12-6 SAMPLE COMPUTER OUTPUT

Many computer packages can be used to carry out the analysis of variance for the
situations presented in this chapter. In this section, computer output from Minitab® is
presented.


Computer Output for Hardwood Concentration Example

Reconsider Example 12-1, which investigates the effect of hardwood concentration on ten-
sile strength. Using ANOVA in Minitab® provides the following output.

Analysis of Variance for TS

Source     DF        SS        MS        F        P
Concen      3    382.79    127.60    19.61    0.000
Error      20    130.17      6.51
Total      23    512.96
                                  Individual 95% CIs For Mean
                                  Based on Pooled StDev
Level   N     Mean    StDev   ----+---------+---------+---------+--
 5      6   10.000    2.828   (---*---)
10      6   15.667    2.805              (---*---)
15      6   17.000    1.789                  (---*---)
20      6   21.167    2.639                            (---*---)
                              ----+---------+---------+---------+--
Pooled StDev = 2.551            10.0      15.0      20.0      25.0

The analysis of variance results are identical to those presented in Section 12-1.2. Minitab®
also provides 95% confidence intervals for the means of each level of hardwood concen-
tration using a pooled estimate of the standard deviation. Interpretation of the confidence
intervals is straightforward. Factor levels with confidence intervals that do not overlap are
said to be significantly different.
A better indicator of significant differences is provided by confidence intervals based
on Tukey's test on pairwise differences, an option in Minitab®. The output provided is

Tukey's pairwise comparisons

    Family error rate = 0.0500
Individual error rate = 0.0111

Critical value = 3.96

Intervals for (column level mean) - (row level mean)

            5          10          15
10      -9.791
        -1.542
15     -11.124      -5.458
        -2.876       2.791
20     -15.291      -9.624      -8.291
        -7.042      -1.376      -0.042

The (simultaneous) confidence intervals are easily interpreted. For example, the 95% con-
fidence interval for the difference in mean tensile strength between 5% hardwood concen-
tration and 10% hardwood concentration is (-9.791, -1.542). Since this confidence interval
does not contain the value 0, we conclude there is a significant difference between 5% and
10% hardwood concentrations. The remaining confidence intervals are interpreted simi-
larly. The results provided by Minitab® are identical to those found in Section 12-2.2.
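The Minitab® results can be cross-checked in other software. The sketch below is illustrative and assumes SciPy and statsmodels; the four samples shown are the Example 12-1 hardwood data implied by the means and standard deviations in the output above, so treat them as an assumption of this sketch.

# One-way ANOVA and Tukey pairwise comparisons for the hardwood data.
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

data = {5:  [7, 8, 15, 11, 9, 10],
        10: [12, 17, 13, 18, 19, 15],
        15: [14, 18, 19, 17, 16, 18],
        20: [19, 25, 22, 23, 18, 20]}

f0, p = stats.f_oneway(*data.values())
print(f"F0 = {f0:.2f}, p = {p:.4f}")       # F0 = 19.61, p = 0.0000

obs = [y for ys in data.values() for y in ys]
grp = [g for g, ys in data.items() for _ in ys]
print(pairwise_tukeyhsd(obs, grp, alpha=0.05))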

12-7 SUMMARY

This chapter has introduced design and analysis methods for experiments with a single fac-
tor. The importance of randomization in single-factor experiments was emphasized. In a
completely randomized experiment, all runs are made in random order to balance out the
effects of unknown nuisance variables. If a known nuisance variable can be controlled,
blocking can be used as a design alternative. The fixed-effects and random-effects models
of analysis of variance were presented. The primary difference between the two models is
the inference space. In the fixed-effects model inferences are valid only about the factor lev-
els specifically considered in the analysis, while in the random-effects model the conclu-
sions may be extended to the population of factor levels. Orthogonal contrasts and Tukey's
test were suggested for making comparisons between factor level means in the fixed-effects
experiment. A procedure was also given for estimating the variance components in a ran-
dom-effects model. Residual analysis was introduced for checking the underlying assump-
tions of the analysis of variance.

12-8 EXERCISES
12-1. A study is conducted to determine the effect of cutting speed on the life (in hours) of a particular
machine tool. Four levels of cutting speed are selected for the study, with the following results:

Cutting Speed              Tool Life
1               41    43    33    39    36    40
2               42    36    34    45    40    39
3               34    38    34    34    36    33
4               36    37    36    38    35    35

(a) Does cutting speed affect tool life? Draw comparative box plots and perform an analysis of variance.
(b) Plot average tool life against cutting speed and interpret the results.
(c) Use Tukey's test to investigate differences between the individual levels of cutting speed. Interpret the results.
(d) Find the residuals and examine them for model inadequacy.

12-2. In "Orthogonal Design for Process Optimization and Its Application to Plasma Etching" (Solid
State Technology, May 1987), G. Z. Yin and D. W. Jillie describe an experiment to determine the effect of
C2F6 flow rate on the uniformity of the etch on a silicon wafer used in integrated circuit manufacturing.
Three flow rates are used in the experiment, and the resulting uniformity (in percent) for six replicates is as
follows:

C2F6 Flow              Observations
                1      2      3      4      5      6
125            2.7    4.6    2.6    3.0    3.2    3.8
160            4.9    4.6    5.0    4.2    3.6    4.2
200            4.6    3.4    2.9    3.5    4.1    5.1

(a) Does C2F6 flow rate affect etch uniformity? Construct box plots to compare the factor levels and perform the analysis of variance.
(b) Do the residuals indicate any problems with the underlying assumptions?

12-3. The compressive strength of concrete is being studied. Four different mixing techniques are being
investigated. The following data have been collected:

Mixing Technique        Compressive Strength (psi)
1                  3129    3000    2865    2890
2                  3200    3300    2975    3150
3                  2800    2900    2985    3050
4                  2600    2700    2600    2765

(a) Test the hypothesis that mixing techniques affect the strength of the concrete. Use alpha = 0.05.
(b) Use Tukey's test to make comparisons between pairs of means. Estimate the treatment effects.

12-4. A textile mill has a large number of looms. Each loom is supposed to provide the same output of cloth
per minute. To investigate this assumption, five looms are chosen at random and their output measured at dif-
ferent times. The following data are obtained:

Loom            Output (lb/min)
1          4.0    4.1    4.2    4.0    4.1
2          3.9    3.8    3.9    4.0    4.0
3          4.1    4.2    4.1    4.0    3.9
4          3.6    3.8    4.0    3.9    3.7
5          3.8    3.6    3.9    3.8    4.0

(a) Is this a fixed- or random-effects experiment? Are the looms similar in output?
(b) Estimate the variability between looms.
(c) Estimate the experimental error variance.
(d) What is the probability of accepting H0 if sigma_tau^2 is four times the experimental error variance?
(e) Analyze the residuals from this experiment and check for model inadequacy.

12-5. An experiment was run to determine whether four specific firing temperatures affect the density of a
certain type of brick. The experiment led to the following data:

Temperature (°F)        Density
100        21.8    21.9    21.7    21.6    21.7    21.5    21.8
125        21.7    21.4    21.5    21.5
150        21.9    21.8    21.8    21.6    21.5
175        21.7    21.8    21.7    21.6    21.8

(a) Does the firing temperature affect the density of the bricks?
(b) Estimate the components in the model.
(c) Analyze the residuals from the experiment.

12-6. An electronics engineer is interested in the effect on tube conductivity of five different types of coating
for cathode ray tubes used in a telecommunications system display device. The following conductivity
data are obtained:

Coating Type        Conductivity
1              143    141    150    146
2              152    149    137    143
3              134    133    132    127
4              129    127    132    129
5              147    148    144    142

(a) Is there any difference in conductivity due to coating type? Use alpha = 0.05.
(b) Estimate the overall mean and the treatment effects.
(c) Compute a 95% interval estimate of the mean for coating type 1. Compute a 99% interval estimate of the mean difference between coating types 1 and 4.
(d) Test all pairs of means using Tukey's test, with alpha = 0.05.
(e) Assuming that coating type 4 is currently in use, what are your recommendations to the manufacturer? We wish to minimize conductivity.

12-7. The response time in milliseconds was determined for three different types of circuits used in an
electronic calculator. The results are recorded here:

Circuit Type        Response Time
1              19    22    20    18    25
2              20    21    33    27    40
3              16    15    18    26    17

(a) Test the hypothesis that the three circuit types have the same response time.
(b) Use Tukey's test to compare pairs of treatment means.
(c) Construct a set of orthogonal contrasts, assuming that at the outset of the experiment you suspected the response time of circuit type 2 to be different from the other two.
(d) What is the power of this test for detecting Sum_{i=1}^{3} tau_i^2/sigma^2 = 3.0?
(e) Analyze the residuals from this experiment.

12-8. In "The Effect of Nozzle Design on the Stability and Performance of Turbulent Water Jets" (Fire Safety
Journal, Vol. 4, August 1981), C. Theobald describes an experiment in which a shape factor was determined
for several different nozzle designs at different levels of jet efflux velocity. Interest in this experiment
focuses primarily on nozzle design, and velocity is a nuisance factor. The data are shown below:

Nozzle Design        Jet Efflux Velocity (six levels)
1          0.78    0.80    0.81    0.75    0.77    0.78
2          0.85    0.85    0.92    0.86    0.81    0.83
3          0.93    0.92    0.95    0.89    0.89    0.83
4          1.14    0.97    0.98    0.88    0.86    0.83
5          0.97    0.86    0.78    0.76    0.76    0.75

(a) Does nozzle type affect shape factor? Compare the nozzles using box plots and the analysis of variance.
(b) Use Tukey's test to determine specific differences between the nozzles. Does a graph of the average (or standard deviation) of shape factor versus nozzle type assist with the conclusions?
(c) Analyze the residuals from this experiment.

12-9. In his book Design and Analysis of Experiments (2001), D. C. Montgomery describes an experiment to
determine the effect of four chemical agents on the strength of a particular type of cloth. Due to possible
variability from cloth to cloth, bolts of cloth are considered blocks. Five bolts are selected and all four
chemicals in random order are applied to each bolt. The resulting tensile strengths are

                    Bolt
Chemical      1     2     3     4     5
1            73    68    74    71    67
2            73    67    75    72    70
3            75    68    78    73    68
4            73    71    75    75    69

(a) Is there any difference in tensile strength between the chemicals?
(b) Use Tukey's test to investigate specific differences between the chemicals.
(c) Analyze the residuals from this experiment.

12-10. Suppose that four normal populations have common variance sigma^2 = 25 and means mu_1 = 50, mu_2 = 60,
mu_3 = 50, and mu_4 = 60. How many observations should be taken on each population so that the probability of
rejecting the hypothesis of equality of means is at least 0.90? Use alpha = 0.05.

12-11. Suppose that five normal populations have common variance sigma^2 = 100 and means mu_1 = 175, mu_2 =
190, mu_3 = 160, mu_4 = 200, and mu_5 = 215. How many observations per population must be taken so that the
probability of rejecting the hypothesis of equality of means is at least 0.95? Use alpha = 0.01.

12-12. Consider testing the equality of the means of two normal populations where the variances are
unknown but assumed equal. The appropriate test procedure is the two-sample t-test. Show that the
two-sample t-test is equivalent to the one-way-classification analysis of variance.

12-13. Show that the variance of the linear combination Sum_{i=1}^{a} c_i y_i. is sigma^2 Sum_{i=1}^{a} n_i c_i^2.

12-14. In a fixed-effects model, suppose that there are n observations for each of four treatments. Let Q_1^2, Q_2^2,
and Q_3^2 be single-degree-of-freedom components for the orthogonal contrasts. Prove that SS_treatments = Q_1^2 +
Q_2^2 + Q_3^2.

12-15. Consider the data shown in Exercise 12-7.
(a) Write out the least squares normal equations for this problem, and solve them for mu-hat and tau-hat_i, making
the usual constraint (Sum_{i=1}^{3} tau-hat_i = 0). Estimate tau_1 - tau_2.
(b) Solve the equations in (a) using the constraint tau-hat_3 = 0. Are the estimators tau-hat_i and mu-hat the same as you
found in (a)? Why? Now estimate tau_1 - tau_2 and compare your answer with (a). What statement can
you make about estimating contrasts in the tau_i?
(c) Estimate mu + tau_1, 2 tau_1 - tau_2 - tau_3, and mu + tau_1 + tau_2 using the two solutions to the normal equations. Com-
pare the results obtained in each case.
Chapter 13

Design of Experiments
with Several Factors

An experiment is just a test or a series of tests. Experiments are performed in all scientific
and engineering disciplines and are a major part of the discovery and learning process. The
conclusions that can be drawn from an experiment will depend, in part, on how the exper-
iment was conducted, and so the design of the experiment plays a major role in problem
solution. This chapter introduces experimental design techniques useful when several fac-
tors are involved.

13-1 EXAMPLES OF EXPERIMENTAL DESIGN APPLICATIONS

Example 13-1
A Characterization Experiment A development engineer is working on a new process for sol-
dering electronic components to printed circuit boards. Specifically, he is working with a new type of
flow solder machine that he hopes will reduce the number of defective solder joints. (A flow solder
machine preheats printed circuit boards and then moves them into contact with a wave of liquid sol-
der. This machine makes all the electrical and most of the mechanical connections of the components
to the printed circuit board. Solder defects require touchup or rework, which adds cost and often dam-
ages the boards.) The flow solder machine has several variables that the engineer can control. They
are as follows:
1. Solder temperature
2. Preheat temperature
3. Conveyor speed
4. Flux type
5. Flux specific gravity
6. Solder wave depth
7. Conveyor angle
In addition to these controllable factors, there are several factors that cannot be easily controlled
once the machine enters routine manufacturing, including the following:
1. Thickness of the printed circuit board
2. Types of components used on the board
3. Layout of the components on the board
4. Operator
5. Environmental factors
6. Production rate
Sometimes we call the uncontrollable factors noise factors. A schematic representation of the
process is shown in Fig. 13-1.


Figure 13-1 The flow solder experiment: controllable factors enter the process (flow solder machine)
from above and uncontrollable (noise) factors z1, z2, ..., zq from below; the input is printed circuit
boards and the output is defects, y.

In this situation the engineer is interested in characterizing the flow solder machine; that is, he is inter-
ested in determining which factors (both controllable and uncontrollable) affect the occurrence of
defects on the printed circuit boards. To accomplish this he can design an experiment that will enable
him to estimate the magnitude and direction of the factor effects. Sometimes we call an experiment
such as this a screening experiment. The information from this characterization study or screening
experiment can be used to identify the critical factors, to determine the direction of adjustment for
these factors to reduce the number of defects, and to assist in determining which factors should be care-
fully controlled during manufacturing to prevent high defect levels and erratic process performance.

Example 13-2
An Optimization Experiment In a characterization experiment, we are interested in determining
which factors affect the response. A logical next step is to determine the region in the important fac-
tors that leads to an optimum response. For example, if the response is yield, we would look for a
region of maximum yield, and if the response is cost, we would look for a region of minimum cost.
As an illustration, suppose that the yield of a chemical process is influenced by the operating
temperature and the reaction time. We are currently operating the process at 155°F and 1.7 hours of
reaction time and experiencing yields around 75%. Figure 13-2 shows a view of the time-tempera-
ture space from above. In this graph we have connected points of constant yield with lines. These lines
are called contours, and we have shown the contours at 60%, 70%, 80%, 90%, and 95% yield. To
locate the optimum, it is necessary to design an experiment that varies reaction time and temperature
together. This design is illustrated in Fig. 13-2. The responses observed at the four points in the exper-
iment (145°F, 1.2 hr), (145°F, 2.2 hr), (165°F, 1.2 hr), and (165°F, 2.2 hr) indicate that we should
move in the general direction of increased temperature and lower reaction time to increase yield. A
few additional runs could be performed in this direction to locate the region of maximum yield.

These examples illustrate only two potential applications of experimental design meth-
ods. In the engineering environment, experimental design applications are numerous. Some
potential areas of use are as follows:
1. Process troubleshooting
2. Process development and optimization
3. Evaluation of material alternatives
4. Reliability and life testing
5. Performance testing
Figure 13-2 Contour plot of yield as a function of reaction time and reaction temperature, illustrat-
ing an optimization experiment.

6. Product design configuration
7. Component tolerance determination

Experimental design methods allow these problems to be solved efficiently during the
early stages of the product cycle. This has the potential to dramatically lower overall prod-
uct cost and reduce development lead time.

13-2 FACTORIAL EXPERIMENTS

When there are several factors of interest in an experiment, a factorial design should be
used. These are designs in which factors are varied together. Specifically, by a factorial
experiment we mean that in each complete trial or replicate of the experiment all possible
combinations of the levels of the factors are investigated. Thus, if there are two factors, A
and B, with a levels of factor A and b levels of factor B, then each replicate contains all ab
treatment combinations.
The effect of a factor is defined as the change in response produced by a change in the
level of the factor. This is called a main effect because it refers to the primary factors in the
study. For example, consider the data in Table 13-1. The main effect of factor A is the dif-
ference between the average response at the first level of A and the average response at the
second level of A, or

A = (30 + 40)/2 - (10 + 20)/2 = 20.
356 Chapter 13 Desig."l of Experirne;;.ts v,1.th Several Factors

Table 13-1 A Factorial Expcrimc:1.t with Two Factors

Factor B
Factor .4 B,
10 20
30 4()

That is, changing factor A from level 1 to level 2 causes an average response increase of 20
units. Similarly, the main effect of B is

B = (20 + 40)/2 - (10 + 30)/2 = 10.
In some experiments, the difference in response between the levels of one factor is not
the same at all levels of the other factors. When this occurs, there is an interaction between
the factors. For example, consider the data in Table 13-2. At the first level of factor B, the
A effect is

A = 30 - 10 = 20,

and at the second level of factor B, the A effect is

A = 0 - 20 = -20.

Since the effect of A depends on the level chosen for factor B, there is interaction between
A and B.
When an interaction is large, the corresponding main effects have little meaning. For
example, by using the data in Table 13-2, we find the main effect of A to be

A = (30 + 0)/2 - (10 + 20)/2 = 0,

and we would be tempted to conclude that there is no A effect. However, when we exam-
ined the effects of A at different levels of factor B, we saw that this was not the case. The
effect of factor A depends on the levels of factor B. Thus, knowledge of the AB interaction
is more useful than knowledge of the main effects. A significant interaction can mask the
significance of main effects.
The concept of interaction can be illustrated graphically. Figure 13-3 plots the data in
Table 13-1 against the levels of A for both levels of B. Note that the B1 and B2 lines are
roughly parallel, indicating that factors A and B do not interact significantly. Figure 13-4
plots the data in Table 13-2. In this graph, the B1 and B2 lines are not parallel, indicating the
interaction between factors A and B. Such graphical displays are often useful in presenting
the results of experiments.
An alternative to the factorial design that is (unfortunately) used in practice is to change
the factors one at a time rather than to vary them simultaneously. To illustrate this one-factor-
at-a-time procedure, consider the optimization experiment described in Example 13-2. The

Table 13-2 A Factorial Experiment with Interaction

              Factor B
Factor A      B1      B2
A1            10      20
A2            30       0


Figure 13-3 Factorial experiment, no interaction.

Figure 13-4 Factorial experiment, with interaction.

engineer is interested in finding the values of temperature and reaction time that maximize
yield. Suppose that we fix temperature at 155°F (the current operating level) and perform
five runs at different levels of time, say 0.5 hour, 1.0 hour, 1.5 hours, 2.0 hours, and 2.5
hours. The results of this series of runs are shown in Fig. 13-5. This figure indicates that
maximum yield is achieved at about 1.7 hours of reaction time. To optimize temperature, the
engineer fixes time at 1.7 hours (the apparent optimum) and performs five runs at different
temperatures, say 140°F, 150°F, 160°F, 170°F, and 180°F. The results of this set of runs are
plotted in Fig. 13-6. Maximum yield occurs at about 155°F. Therefore, we would conclude
that running the process at 155°F and 1.7 hours is the best set of operating conditions, result-
ing in yields around 75%.

Figure 13-5 Yield versus reaction time with temperature constant at 155°F.

Figure 13-6 Yield versus temperature with reaction time constant at 1.7 hr.

Figure 13-7 displays the contour plot of yield as a function of temperature and time
with the one-factor-at-a-time experiment shown on the contours. Clearly the one-factor-at-
a-time design has failed dramatically here, as the true optimum is at least 20 yield points
higher and occurs at much lower reaction times and higher temperatures. The failure to dis-
cover the shorter reaction times is particularly important as this could have significant
impact on production volume or capacity, production planning, manufacturing cost, and
total productivity.
The one-factor-at-a-time method has failed here because it fails to detect the interac-
tion between temperature and time. Factorial experiments are the only way to detect interactions.

Figure 13-7 Optimization experiment using the one-factor-at-a-time method.

Furthermore, the one-factor-at-a-time method is inefficient; it will require more
experimentation than a factorial, and as we have just seen, there is no assurance that it will
produce the correct results. The experiment shown in Fig. 13-2 that produced the informa-
tion pointing to the region of the optimum is a simple example of a factorial experiment.

13-3 TWO-FACTOR FACTORIAL EXPERIMENTS

The simplest type of factorial experiment involves only two factors, say A and B. There are
a levels of factor A and b levels of factor B. The two-factor factorial is shown in Table 13-3.
Note that there are n replicates of the experiment and that each replicate contains all ab
treatment combinations. The observation in the ijth cell in the kth replicate is denoted y_ijk.
In collecting the data, the abn observations would be run in random order. Thus, like the
single-factor experiment studied in Chapter 12, the two-factor factorial is a completely ran-
domized design.
The observations may be described by the linear statistical model

y_ijk = mu + tau_i + beta_j + (tau beta)_ij + epsilon_ijk,
i = 1, 2, ..., a;  j = 1, 2, ..., b;  k = 1, 2, ..., n,    (13-1)

where mu is the overall mean effect, tau_i is the effect of the ith level of factor A, beta_j is the effect
of the jth level of factor B, (tau beta)_ij is the effect of the interaction between A and B, and epsilon_ijk is
a NID(0, sigma^2) (normally and independently distributed) random error component. We are
interested in testing the hypotheses of no significant factor A effect, no significant factor B
effect, and no significant AB interaction. As with the single-factor experiments of Chapter
12, the analysis of variance will be used to test these hypotheses. Since there are two fac-
tors under study, the procedure used is called the two-way analysis of variance.

13-3.1 Statistical Analysis of the Fixed-Effects Model

Suppose that factors A and B are fixed. That is, the a levels of factor A and the b levels of
factor B are specifically chosen by the experimenter, and inferences are confined to these
levels only. In this model, it is customary to define the effects tau_i, beta_j, and (tau beta)_ij as deviations
from the mean, so that Sum_{i=1}^{a} tau_i = 0, Sum_{j=1}^{b} beta_j = 0, Sum_{i=1}^{a} (tau beta)_ij = 0, and Sum_{j=1}^{b} (tau beta)_ij = 0.
Let y_i.. denote the total of the observations under the ith level of factor A, let y_.j. denote
the total of the observations under the jth level of factor B, let y_ij. denote the total of the
observations in the ijth cell of Table 13-3, and let y_... denote the grand total of all the

Table 13-3 Data Arrangement for a Two-Factor Factorial Design

                                   Factor B
Factor A        1                      2                     ...    b
1        y111, y112, ..., y11n   y121, y122, ..., y12n       ...    y1b1, y1b2, ..., y1bn
2        y211, y212, ..., y21n   y221, y222, ..., y22n       ...    y2b1, y2b2, ..., y2bn
...
a        ya11, ya12, ..., ya1n   ya21, ya22, ..., ya2n       ...    yab1, yab2, ..., yabn

observations. Define y-bar_i.., y-bar_.j., y-bar_ij., and y-bar_... as the corresponding row, column, cell, and
grand averages. That is,

y_i.. = Sum_{j=1}^{b} Sum_{k=1}^{n} y_ijk,   y-bar_i.. = y_i../(bn),   i = 1, 2, ..., a,
y_.j. = Sum_{i=1}^{a} Sum_{k=1}^{n} y_ijk,   y-bar_.j. = y_.j./(an),   j = 1, 2, ..., b,    (13-2)
y_ij. = Sum_{k=1}^{n} y_ijk,                 y-bar_ij. = y_ij./n,
y_... = Sum_{i=1}^{a} Sum_{j=1}^{b} Sum_{k=1}^{n} y_ijk,   y-bar_... = y_.../(abn).

The total corrected sum of squares may be written

Sum_i Sum_j Sum_k (y_ijk - y-bar_...)^2
  = Sum_i Sum_j Sum_k [(y-bar_i.. - y-bar_...) + (y-bar_.j. - y-bar_...)
      + (y-bar_ij. - y-bar_i.. - y-bar_.j. + y-bar_...) + (y_ijk - y-bar_ij.)]^2
  = bn Sum_i (y-bar_i.. - y-bar_...)^2 + an Sum_j (y-bar_.j. - y-bar_...)^2    (13-3)
      + n Sum_i Sum_j (y-bar_ij. - y-bar_i.. - y-bar_.j. + y-bar_...)^2
      + Sum_i Sum_j Sum_k (y_ijk - y-bar_ij.)^2.

Thus, the total sum of squares is partitioned into a sum of squares due to "rows," or factor
A (SS_A), a sum of squares due to "columns," or factor B (SS_B), a sum of squares due to the
interaction between A and B (SS_AB), and a sum of squares due to error (SS_E). Notice that
there must be at least two replicates to obtain a nonzero error sum of squares.
The sum of squares identity in equation 13-3 may be written symbolically as

SS_T = SS_A + SS_B + SS_AB + SS_E.    (13-4)

There are abn - 1 total degrees of freedom. The main effects A and B have a - 1 and
b - 1 degrees of freedom, while the interaction effect AB has (a - 1)(b - 1) degrees of free-
dom. Within each of the ab cells in Table 13-3, there are n - 1 degrees of freedom between
the n replicates, and observations in the same cell can differ only due to random error.
Therefore, there are ab(n - 1) degrees of freedom for error. The ratio of each sum of squares
on the right-hand side of equation 13-4 to its degrees of freedom is a mean square.
Assuming that factors A and B are fixed, the expected values of the mean squares are

E(MS_A) = sigma^2 + bn Sum_{i=1}^{a} tau_i^2 / (a - 1),
E(MS_B) = sigma^2 + an Sum_{j=1}^{b} beta_j^2 / (b - 1),
E(MS_AB) = sigma^2 + n Sum_{i=1}^{a} Sum_{j=1}^{b} (tau beta)_ij^2 / [(a - 1)(b - 1)],

and

E(MS_E) = sigma^2.
Therefore, to test H0: tau_i = 0 (no row factor effects), H0: beta_j = 0 (no column factor effects), and
H0: (tau beta)_ij = 0 (no interaction effects), we would divide the corresponding mean square by
the mean square error. Each of these ratios will follow an F distribution with numerator
degrees of freedom equal to the number of degrees of freedom for the numerator mean
square and ab(n - 1) denominator degrees of freedom, and the critical region will be located
in the upper tail. The test procedure is arranged in an analysis-of-variance table, such as is
shown in Table 13-4.
Computational formulas for the sums of squares in equation 13-4 are obtained easily.
The total sum of squares is computed from

SS_T = Sum_i Sum_j Sum_k y_ijk^2 - y_...^2/(abn).    (13-5)

The sums of squares for the main effects are

SS_A = Sum_{i=1}^{a} y_i..^2/(bn) - y_...^2/(abn)    (13-6)

and

SS_B = Sum_{j=1}^{b} y_.j.^2/(an) - y_...^2/(abn).    (13-7)

We usually calculate SS_AB in two steps. First, we compute the sum of squares between
the ab cell totals, called the sum of squares due to "subtotals":

SS_subtotals = Sum_{i=1}^{a} Sum_{j=1}^{b} y_ij.^2/n - y_...^2/(abn).

Table 13-4 The Analysis-of-Variance Table for the Two-Way-Classification Fixed-Effects Model

Source of       Sum of     Degrees of        Mean
Variation       Squares    Freedom           Square                            F0
A treatments    SS_A       a - 1             MS_A = SS_A/(a - 1)               MS_A/MS_E
B treatments    SS_B       b - 1             MS_B = SS_B/(b - 1)               MS_B/MS_E
Interaction     SS_AB      (a - 1)(b - 1)    MS_AB = SS_AB/[(a - 1)(b - 1)]    MS_AB/MS_E
Error           SS_E       ab(n - 1)         MS_E = SS_E/[ab(n - 1)]
Total           SS_T       abn - 1

This sum of squares also contains SS_A and SS_B. Therefore, the second step is to compute
SS_AB as

SS_AB = SS_subtotals - SS_A - SS_B.    (13-8)

The error sum of squares is found by subtraction as either

SS_E = SS_T - SS_A - SS_B - SS_AB    (13-9a)

or

SS_E = SS_T - SS_subtotals.    (13-9b)

Example 13-3
Aircraft primer paints are applied to aluminum surfaces by two methods: dipping and spraying. The
purpose of the primer is to improve paint adhesion. Some parts can be primed using either applica-
tion method, and engineering is interested in learning whether three different primers differ in their
adhesion properties. A factorial experiment is performed to investigate the effect of paint primer type
and application method on paint adhesion. Three specimens are painted with each primer using each
application method, a finish paint applied, and the adhesion force measured. The data from the
experiment are shown in Table 13-5. The circled numbers in the cells are the cell totals y_ij.. The sums
of squares required to perform the analysis of variance are computed as follows:

SS_T = Sum_i Sum_j Sum_k y_ijk^2 - y_...^2/(abn)
     = (4.0)^2 + (4.5)^2 + ... + (5.0)^2 - (89.8)^2/18 = 10.72,

SS_types = Sum_i y_i..^2/(bn) - y_...^2/(abn)
     = [(28.7)^2 + (34.1)^2 + (27.0)^2]/6 - (89.8)^2/18 = 4.58,

SS_methods = Sum_j y_.j.^2/(an) - y_...^2/(abn)
     = [(40.2)^2 + (49.6)^2]/9 - (89.8)^2/18 = 4.91,

SS_interaction = Sum_i Sum_j y_ij.^2/n - y_...^2/(abn) - SS_types - SS_methods
     = [(12.8)^2 + (15.9)^2 + (15.9)^2 + (18.2)^2 + (11.5)^2 + (15.5)^2]/3 - (89.8)^2/18 - 4.58 - 4.91 = 0.24,

and

SS_E = SS_T - SS_types - SS_methods - SS_interaction
     = 10.72 - 4.58 - 4.91 - 0.24 = 0.99.

The analysis of variance is summarized in Table 13-6. Since F_{0.05,2,12} = 3.89 and F_{0.05,1,12} = 4.75, we
conclude that the main effects of primer type and application method affect adhesion force. Further-
more, since 1.47 < F_{0.05,2,12}, there is no indication of interaction between these factors.

Table 13-5 Adhesion Force Data for Example 13-3 (cell totals y_ij. in parentheses)

                         Application Method
Primer Type    Dipping                 Spraying                y_i..
1              4.0, 4.5, 4.3 (12.8)    5.4, 4.9, 5.6 (15.9)    28.7
2              5.6, 4.9, 5.4 (15.9)    5.8, 6.1, 6.3 (18.2)    34.1
3              3.8, 3.7, 4.0 (11.5)    5.5, 5.0, 5.0 (15.5)    27.0
y_.j.          40.2                    49.6                    89.8 = y_...

Table 13-6 Analysis of Variance for Example 13-3

Source of Variation      Sum of      Degrees of    Mean
                         Squares     Freedom       Square      F0
Primer types             4.581       2             2.291       27.86
Application methods      4.909       1             4.909       59.70
Interaction              0.241       2             0.121       1.47
Error                    0.987       12            0.082
Total                    10.718      17

A graph of the cell adhesion force averages y-bar_ij. versus the levels of primer type for each application
method is shown in Fig. 13-8. The absence of interaction is evident by the parallelism of the two lines.
Furthermore, since a large response indicates greater adhesion force, we conclude that spraying is a
superior application method and that primer type 2 is most effective.
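The computing formulas in equations 13-5 through 13-9 translate directly into array operations. The following sketch (an illustration, not part of the text; it assumes NumPy) reproduces the sums of squares and F ratios of Table 13-6.

# Two-way fixed-effects ANOVA for the adhesion data of Table 13-5.
import numpy as np

# y[i, j, k]: primer type i, application method j (0 = dip, 1 = spray), replicate k.
y = np.array([[[4.0, 4.5, 4.3], [5.4, 4.9, 5.6]],
              [[5.6, 4.9, 5.4], [5.8, 6.1, 6.3]],
              [[3.8, 3.7, 4.0], [5.5, 5.0, 5.0]]])
a, b, n = y.shape
c = y.sum()**2 / (a * b * n)                       # correction term

ss_t  = (y**2).sum() - c
ss_a  = (y.sum(axis=(1, 2))**2).sum() / (b * n) - c
ss_b  = (y.sum(axis=(0, 2))**2).sum() / (a * n) - c
ss_ab = (y.sum(axis=2)**2).sum() / n - c - ss_a - ss_b
ss_e  = ss_t - ss_a - ss_b - ss_ab

ms_e = ss_e / (a * b * (n - 1))
print(ss_a / (a - 1) / ms_e,                       # F0 for primer types, about 27.9
      ss_b / (b - 1) / ms_e,                       # F0 for methods, about 59.7
      ss_ab / ((a - 1) * (b - 1)) / ms_e)          # F0 for interaction, about 1.5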

Tests on Individual Means When both factors are fixed, comparisons between the indi-
vidual means of either factor may be made using Tukey's test. When there is no interaction,
these comparisons may be made using either the row averages y-bar_i.. or the column averages
y-bar_.j.. However, when interaction is significant, comparisons between the means of one factor
(say A) may be obscured by the AB interaction. In this case, we may apply Tukey's test to
the means of factor A, with factor B set at a particular level.

Figure 13-8 Graph of average adhesion force versus primer types for Example 13-3.

13-3.2 Model Adequacy Checking

Just as in the single-factor experiments discussed in Chapter 12, the residuals from a facto-
rial experiment play an important role in assessing model adequacy. The residuals from a
two-factor factorial experiment are

e_ijk = y_ijk - y-bar_ij..

That is, the residuals are just the difference between the observations and the corresponding
cell averages.
Table 13-7 presents the residuals for the aircraft primer paint data in Example 13-3.
The normal probability plot of these residuals is shown in Fig. 13-9. This plot has tails that
do not fall exactly along a straight line passing through the center of the plot, indicating
some potential problems with the normality assumption, but the deviation from normality
does not appear severe. Figures 13-10 and 13-11 plot the residuals versus the levels of
primer types and application methods, respectively. There is some indication that primer
type 3 results in slightly lower variability in adhesion force than the other two primers. The
graph of residuals versus fitted values yhat_ijk = y-bar_ij. in Fig. 13-12 reveals no unusual or diag-
nostic pattern.

13-3.3 One Observation per Cell

In some cases involving a two-factor factorial experiment, we may have only one replicate,
that is, only one observation per cell. In this situation there are exactly as many parameters

Table 13-7 Residuals for the Aircraft Primer Paint Experiment in Example 13-3

                      Application Method
Primer Type    Dipping                  Spraying
1              -0.27, 0.23, 0.03        0.10, -0.40, 0.30
2              0.30, -0.40, 0.10        -0.27, 0.03, 0.23
3              -0.03, -0.13, 0.17       0.33, -0.17, -0.17

Figure 13-9 Normal probability plot of the residuals from Example 13-3.

Figure 13-10 Plot of residuals versus primer type.

Figure 13-11 Plot of residuals versus application method.

Figure 13-12 Plot of residuals versus fitted values yhat_ijk.

in the analysis-of-variance model as there are observations, and the error degrees of free-
dom is zero. Thus, it is not possible to test hypotheses about the main effects and interac-
tions unless some additional assumptions are made. The usual assumption is to ignore the
interaction effect and use the interaction mean square as an error mean square. Thus the
analysis is equivalent to the analysis used in the randomized block design. This
no-interaction assumption can be dangerous, and the experimenter should carefully exam-
ine the data and the residuals for indications that there really is interaction present. For more
details, see Montgomery (2001).

13-3.4 The Random-Effects Model

So far we have considered the case where A and B are fixed factors. We now consider the
situation in which the levels of both factors are selected at random from larger populations
of factor levels, and we wish to extend our conclusions to the sampled population of factor
levels. The observations are represented by the model

y_ijk = mu + tau_i + beta_j + (tau beta)_ij + epsilon_ijk,
i = 1, 2, ..., a;  j = 1, 2, ..., b;  k = 1, 2, ..., n,    (13-10)

where the parameters tau_i, beta_j, (tau beta)_ij, and epsilon_ijk are random variables. Specifically, we assume
that tau_i is NID(0, sigma_tau^2), beta_j is NID(0, sigma_beta^2), (tau beta)_ij is NID(0, sigma_taubeta^2), and epsilon_ijk is NID(0, sigma^2). The vari-
ance of any observation is

V(y_ijk) = sigma_tau^2 + sigma_beta^2 + sigma_taubeta^2 + sigma^2,

and sigma_tau^2, sigma_beta^2, sigma_taubeta^2, and sigma^2 are called variance components. The hypotheses that we are inter-
ested in testing are H0: sigma_tau^2 = 0, H0: sigma_beta^2 = 0, and H0: sigma_taubeta^2 = 0. Notice the similarity to the one-
way classification random-effects model.
The basic analysis of variance remains unchanged; that is, SS_A, SS_B, SS_AB, SS_T, and SS_E
are all calculated as in the fixed-effects case. To construct the test statistics, we must exam-
ine the expected mean squares. They are

E(MS_A) = sigma^2 + n sigma_taubeta^2 + bn sigma_tau^2,
E(MS_B) = sigma^2 + n sigma_taubeta^2 + an sigma_beta^2,
E(MS_AB) = sigma^2 + n sigma_taubeta^2,    (13-11)

and

E(MS_E) = sigma^2.

Note from the expected mean squares that the appropriate statistic for testing H0:
sigma_taubeta^2 = 0 is

F0 = MS_AB/MS_E,    (13-12)

since under H0 both the numerator and denominator of F0 have expectation sigma^2, and only if
H0 is false is E(MS_AB) greater than E(MS_E). The ratio F0 is distributed as F_{(a-1)(b-1), ab(n-1)}.
Similarly, for testing H0: sigma_tau^2 = 0, we would use

F0 = MS_A/MS_AB,    (13-13)

which is distributed as F_{a-1, (a-1)(b-1)}, and for testing H0: sigma_beta^2 = 0, the statistic is

F0 = MS_B/MS_AB,    (13-14)

which is distributed as F_{b-1, (a-1)(b-1)}. These are all upper-tail, one-tail tests. Notice that these
test statistics are not the same as those used if both factors A and B are fixed. The expected
mean squares are always used as a guide to test statistic construction.

The variance components may be estimated by equating the observed mean squares to
their expected values and solving for the variance components. This yields

sigma-hat^2 = MS_E,
sigma-hat_taubeta^2 = (MS_AB - MS_E)/n,
sigma-hat_beta^2 = (MS_B - MS_AB)/(an),
sigma-hat_tau^2 = (MS_A - MS_AB)/(bn).    (13-15)
Example 13-4
Suppose that in Example 13-3, a large number of primers and several application methods could be
used. Three primers, say 1, 2, and 3, were selected at random, as were the two application methods.
The analysis of variance assuming the random-effects model is shown in Table 13-8.
Notice that the first four columns in the analysis of variance table are exactly as in Example 13-3.
Now, however, the F ratios are computed according to equations 13-12 through 13-14. Since
F_{0.05,2,12} = 3.89, we conclude that interaction is not significant. Also, since F_{0.05,2,2} = 19.0 and F_{0.05,1,2} =
18.5, we conclude that both primer types and application methods significantly affect adhesion force,
although primer type is just barely significant at alpha = 0.05. The variance components may be estimated
using equation 13-15 as follows:

sigma-hat^2 = 0.08,
sigma-hat_taubeta^2 = (0.12 - 0.08)/3 = 0.0133,
sigma-hat_tau^2 = (2.29 - 0.12)/6 = 0.36,
sigma-hat_beta^2 = (4.91 - 0.12)/9 = 0.53.

Clearly, the two largest variance components are for primer types (sigma-hat_tau^2 = 0.36) and application meth-
ods (sigma-hat_beta^2 = 0.53).

13-3.5 The Mixed Model

Now suppose that one of the factors, A, is fixed and the other, B, is random. This is called
the mixed model analysis of variance. The linear model is

y_ijk = mu + tau_i + beta_j + (tau beta)_ij + epsilon_ijk,
i = 1, 2, ..., a;  j = 1, 2, ..., b;  k = 1, 2, ..., n.    (13-16)

Table 13-8 Analysis of Variance for Example 13-4

Source of Variation      Sum of      Degrees of    Mean
                         Squares     Freedom       Square      F0
Primer types             4.58        2             2.29        19.08
Application methods      4.91        1             4.91        40.92
Interaction              0.24        2             0.12        1.5
Error                    0.99        12            0.08
Total                    10.72       17

In this model, tau_i is a fixed effect defined such that Sum_{i=1}^{a} tau_i = 0, beta_j is a random effect, the inter-
action term (tau beta)_ij is a random effect, and epsilon_ijk is a NID(0, sigma^2) random error. It is also cus-
tomary to assume that beta_j is NID(0, sigma_beta^2) and that the interaction elements (tau beta)_ij are normal
random variables with mean zero and variance [(a - 1)/a] sigma_taubeta^2. The interaction elements are
not all independent.
The expected mean squares in this case are

E(MS_A) = sigma^2 + n sigma_taubeta^2 + bn Sum_{i=1}^{a} tau_i^2 / (a - 1),
E(MS_B) = sigma^2 + an sigma_beta^2,
E(MS_AB) = sigma^2 + n sigma_taubeta^2,    (13-17)

and

E(MS_E) = sigma^2.

Therefore, the appropriate test statistic for testing H0: tau_i = 0 is

F0 = MS_A/MS_AB,    (13-18)

which is distributed as F_{a-1, (a-1)(b-1)}. For testing H0: sigma_beta^2 = 0, the test statistic is

F0 = MS_B/MS_E,    (13-19)

which is distributed as F_{b-1, ab(n-1)}. Finally, for testing H0: sigma_taubeta^2 = 0, we would use

F0 = MS_AB/MS_E,    (13-20)

which is distributed as F_{(a-1)(b-1), ab(n-1)}.
The variance components sigma_beta^2, sigma_taubeta^2, and sigma^2 may be estimated by eliminating the first
equation from equation 13-17, leaving three equations in three unknowns, the solutions of
which are

sigma-hat_beta^2 = (MS_B - MS_E)/(an),
sigma-hat_taubeta^2 = (MS_AB - MS_E)/n,

and

sigma-hat^2 = MS_E.    (13-21)

This general approach can be used to estimate the variance components in any mixed
model. After eliminating the mean squares containing fixed factors, there will always be a
set of equations remaining that can be solved for the variance components. Table 13-9
summarizes the analysis of variance for the two-factor mixed model.

Table 13-9 Analysis of Variance for the Two-Factor Mixed Model

Source of     Sum of     Degrees of        Mean      Expected
Variation     Squares    Freedom           Square    Mean Square                                     F0
Rows (A)      SS_A       a - 1             MS_A      sigma^2 + n sigma_taubeta^2
                                                       + bn Sum tau_i^2/(a - 1)                      MS_A/MS_AB
Columns (B)   SS_B       b - 1             MS_B      sigma^2 + an sigma_beta^2                       MS_B/MS_E
Interaction   SS_AB      (a - 1)(b - 1)    MS_AB     sigma^2 + n sigma_taubeta^2                     MS_AB/MS_E
Error         SS_E       ab(n - 1)         MS_E      sigma^2
Total         SS_T       abn - 1

13-4 GENERAL FACTORIAL EXPERIMENTS

Many experiments involve more than two factors. In this section we introduce the case
where there are a levels of factor A, b levels of factor B, c levels of factor C, and so on,
arranged in a factorial experiment. In general, there will be abc ... n total observations if
there are n replicates of the complete experiment.
For example, consider the three-factor experiment with underlying model

y_ijkl = mu + tau_i + beta_j + gamma_k + (tau beta)_ij + (tau gamma)_ik + (beta gamma)_jk
         + (tau beta gamma)_ijk + epsilon_ijkl,
i = 1, 2, ..., a;  j = 1, 2, ..., b;  k = 1, 2, ..., c;  l = 1, 2, ..., n.    (13-22)

Assuming that A, B, and C are fixed, the analysis of variance is shown in Table 13-10. Note
that there must be at least two replicates (n >= 2) to compute an error sum of squares. The
F-tests on main effects and interactions follow directly from the expected mean squares.
Computing formulas for the sums of squares in Table 13-10 are easily obtained. The
total sum of squares is, using the obvious "dot" notation,

SS_T = Sum_i Sum_j Sum_k Sum_l y_ijkl^2 - y_....^2/(abcn).    (13-23)

The sums of squares for the main effects are computed from the totals for factors A (y_i...),
B (y_.j..), and C (y_..k.) as follows:

SS_A = Sum_{i=1}^{a} y_i...^2/(bcn) - y_....^2/(abcn),    (13-24)

SS_B = Sum_{j=1}^{b} y_.j..^2/(acn) - y_....^2/(abcn),    (13-25)

SS_C = Sum_{k=1}^{c} y_..k.^2/(abn) - y_....^2/(abcn).    (13-26)

To compute the two-factor interaction sums of squares, the totals for the A x B, A x C, and
B x C cells are needed. It may be helpful to collapse the original data table into three two-
way tables in order to compute these totals. The sums of squares are

SS_AB = Sum_i Sum_j y_ij..^2/(cn) - y_....^2/(abcn) - SS_A - SS_B
      = SS_subtotals(AB) - SS_A - SS_B,    (13-27)

SS_AC = Sum_i Sum_k y_i.k.^2/(bn) - y_....^2/(abcn) - SS_A - SS_C
      = SS_subtotals(AC) - SS_A - SS_C,    (13-28)

and

SS_BC = Sum_j Sum_k y_.jk.^2/(an) - y_....^2/(abcn) - SS_B - SS_C
      = SS_subtotals(BC) - SS_B - SS_C.    (13-29)

The three-factor interaction sum of squares is computed from the three-way cell totals y_ijk.
as

Table 13-10 The Analysis-of-Variance Table for the Three-Factor Fixed-Effects Model

Source of    Sum of     Degrees of              Mean      Expected
Variation    Squares    Freedom                 Square    Mean Square                                          F0
A            SS_A       a - 1                   MS_A      sigma^2 + bcn Sum tau_i^2/(a - 1)                    MS_A/MS_E
B            SS_B       b - 1                   MS_B      sigma^2 + acn Sum beta_j^2/(b - 1)                   MS_B/MS_E
C            SS_C       c - 1                   MS_C      sigma^2 + abn Sum gamma_k^2/(c - 1)                  MS_C/MS_E
AB           SS_AB      (a - 1)(b - 1)          MS_AB     sigma^2 + cn Sum Sum (tau beta)_ij^2
                                                            /[(a - 1)(b - 1)]                                  MS_AB/MS_E
AC           SS_AC      (a - 1)(c - 1)          MS_AC     sigma^2 + bn Sum Sum (tau gamma)_ik^2
                                                            /[(a - 1)(c - 1)]                                  MS_AC/MS_E
BC           SS_BC      (b - 1)(c - 1)          MS_BC     sigma^2 + an Sum Sum (beta gamma)_jk^2
                                                            /[(b - 1)(c - 1)]                                  MS_BC/MS_E
ABC          SS_ABC     (a - 1)(b - 1)(c - 1)   MS_ABC    sigma^2 + n Sum Sum Sum (tau beta gamma)_ijk^2
                                                            /[(a - 1)(b - 1)(c - 1)]                           MS_ABC/MS_E
Error        SS_E       abc(n - 1)              MS_E      sigma^2
Total        SS_T       abcn - 1

SS_ABC = Sum_i Sum_j Sum_k y_ijk.^2/n - y_....^2/(abcn)
         - SS_A - SS_B - SS_C - SS_AB - SS_AC - SS_BC    (13-30a)
       = SS_subtotals(ABC) - SS_A - SS_B - SS_C - SS_AB - SS_AC - SS_BC.    (13-30b)

The error sum of squares may be found by subtracting the sum of squares for each main
effect and interaction from the total sum of squares, or by

SS_E = SS_T - SS_subtotals(ABC).    (13-31)

Example 13-5
A mechanical engineer is studying the surface roughness of a part produced in a metal-cutting oper-
ation. Three factors, feed rate (A), depth of cut (B), and tool angle (C), are of interest. All three fac-
tors have been assigned two levels, and two replicates of a factorial design are run. The coded data are
shown in Table 13-11. The three-way cell totals y_ijk. are circled in this table.
The sums of squares are calculated as follows, using equations 13-23 to 13-31:

a ,,2 y'1
SSA=I:!i.:.:..:...-~
[ .. I ben abcn

~ (75)' +(;02)' (177)' ~ 45.5625


8 16 '
_ ~ )~ .. ~l
SSs- £ - , - - - -
..
) .. 1 am cben
(82)' -(95)' (177)'
10.5625,
g 16

a b;( 2
SSAii ::::: L: 2: Yuen" - Men
;"'! j .. j
y:". -SSA - SSs

(37)' +(38)' +(45)' .,(57)' (177)'


45.5625 -10.5625
= 4 16
~7.5625,

=0.0625,
b c 2 2
SSlJC -_""Y.jk. Y.... SSe- 5Sc
..t......t....------
j .. :.t»o: an abcn

+148\' (J77')'
='138"I +(44)' _ (4")'
" I . 10.5625-3.0625
4 16

L
372 Chapter 13 Design of Experiments \'lith Several Factors

SSE := SST -SS:n.blotlh(ABC}


= 92.9375-73.4375 = 19.5000.

The analysis of variance is summarized in Table 13-12. Feed rate has a significant effect on surface
finish (alpha < 0.01), as does the depth of cut (0.05 < alpha < 0.10). There is some indication of a mild inter-
action between these factors, as the F-test for the AB interaction is just less than the 10% critical value.

Obviously factorial experiments with three or more factors are complicated and require
many runs, particularly if some of the factors have several (more than two) levels. This
leads us to consider a class of factorial designs with all factors at two levels. These designs
are extremely easy to set up and analyze, and as we will see, it is possible to greatly reduce
the number of experimental runs through the technique of fractional replication.

Table 13-11 Coded Surface Roughness Data for Example 13-5 (cell totals y_ijk. in parentheses)

                                Depth of Cut (B)
                      0.025 inch                0.040 inch
Feed Rate (A)     Tool Angle (C)            Tool Angle (C)
                  15°         25°           15°         25°         y_i...
20 in./min        9, 7 (16)   11, 10 (21)   9, 11 (20)  10, 8 (18)  75
30 in./min        10, 12 (22) 10, 13 (23)   12, 15 (27) 16, 14 (30) 102
B x C totals      38          44            47          48          177 = y_....

A x B Totals y_ij..                 A x C Totals y_i.k.
A \ B     0.025    0.040            A \ C     15      25
20        37       38               20        36      39
30        45       57               30        49      53
y_.j..    82       95               y_..k.    85      92

Table 13-12 Analysis of Variance for Example 13-5

Source of          Sum of      Degrees of    Mean
Variation          Squares     Freedom       Square      F0
Feed rate (A)      45.5625     1             45.5625     18.69 (a)
Depth of cut (B)   10.5625     1             10.5625     4.33 (b)
Tool angle (C)     3.0625      1             3.0625      1.26
AB                 7.5625      1             7.5625      3.10
AC                 0.0625      1             0.0625      0.03
BC                 1.5625      1             1.5625      0.64
ABC                5.0625      1             5.0625      2.08
Error              19.5000     8             2.4375
Total              92.9375     15
(a) Significant at 1%.
(b) Significant at 10%.
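The dot-notation formulas generalize naturally to array computations. The sketch below (an illustration, not from the text; it assumes NumPy) verifies the sums of squares in Table 13-12 directly from the coded data of Table 13-11.

# Three-factor ANOVA sums of squares for Example 13-5.
import numpy as np

# y[i, j, k, l]: feed rate i, depth of cut j, tool angle k, replicate l.
y = np.array([[[[9, 7], [11, 10]], [[9, 11], [10, 8]]],
              [[[10, 12], [10, 13]], [[12, 15], [16, 14]]]])
corr = y.sum()**2 / y.size

def ss(axes_kept):
    # Sum of squares of the totals over the kept axes, minus the correction.
    tot = y.sum(axis=tuple(ax for ax in range(4) if ax not in axes_kept))
    return (tot**2).sum() / (y.size / tot.size) - corr

ss_a, ss_b, ss_c = ss((0,)), ss((1,)), ss((2,))
ss_ab = ss((0, 1)) - ss_a - ss_b
ss_ac = ss((0, 2)) - ss_a - ss_c
ss_bc = ss((1, 2)) - ss_b - ss_c
ss_abc = ss((0, 1, 2)) - ss_a - ss_b - ss_c - ss_ab - ss_ac - ss_bc
ss_e = (y**2).sum() - corr - ss((0, 1, 2))
print(ss_a, ss_b, ss_c, ss_ab, ss_ac, ss_bc, ss_abc, ss_e)
# 45.5625 10.5625 3.0625 7.5625 0.0625 1.5625 5.0625 19.5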

13-5 THE 2^k FACTORIAL DESIGN

There are certain special types of factorial designs that are very useful. One of these is a fac-
torial design with k factors, each at two levels. Because each complete replicate of the
design has 2^k runs or treatment combinations, the arrangement is called a 2^k factorial design.
These designs have a greatly simplified statistical analysis, and they also form the basis of
many other useful designs.

13-5.1 The 2^2 Design

The simplest type of 2^k design is the 2^2, that is, two factors, A and B, each at two levels. We
usually think of these levels as the "low" and "high" levels of the factor. The 2^2 design is
shown in Fig. 13-13. Note that the design can be represented geometrically as a square, with
the 2^2 = 4 runs forming the corners of the square. A special notation is used to represent the
treatment combinations. In general a treatment combination is represented by a series of
lowercase letters. If a letter is present, then the corresponding factor is run at the high level

Figure 13-13 The 2^2 factorial design: the four corners of the square are (1) (A low, B low),
a (A high, B low), b (A low, B high), and ab (A high, B high).

in that treatment combination; if it is absent, the factor is run at its low level. For example,
treatment combination a indicates that factor A is at the high level and factor B is at the low
level. The treatment combination with both factors at the low level is denoted (1). This nota-
tion is used throughout the 2^k design series. For example, the treatment combination in a 2^4
design with A and C at the high level and B and D at the low level is denoted ac.
The effects of interest in the 2^2 design are the main effects A and B and the two-factor
interaction AB. Let (1), a, b, and ab also represent the totals of all n observations taken at
these design points. It is easy to estimate the effects of these factors. To estimate the main
effect of A we would average the observations on the right side of the square, where A is at
the high level, and subtract from this the average of the observations on the left side of the
square, where A is at the low level, or

A = (a + ab)/(2n) - (b + (1))/(2n)
  = (1/2n)[a + ab - b - (1)].    (13-32)

Similarly, the main effect of B is found by averaging the observations on the top of the
square, where B is at the high level, and subtracting the average of the observations on the
bottom of the square, where B is at the low level:

B = (b + ab)/(2n) - (a + (1))/(2n)
  = (1/2n)[b + ab - a - (1)].    (13-33)

Finally, the AB interaction is estimated by taking the difference in the diagonal averages in
Fig. 13-13, or

AB = (ab + (1))/(2n) - (a + b)/(2n)
   = (1/2n)[ab + (1) - a - b].    (13-34)

The quantities in brackets in equations 13-32, 13-33, and 13-34 are called contrasts.
For example, the A contrast is

Contrast_A = a + ab - b - (1).

In these equations, the contrast coefficients are always either +1 or -1. A table of plus and
minus signs, such as Table 13-13, can be used to determine the sign of each treatment com-
bination for a particular contrast. The column headings for Table 13-13 are the main effects
A and B, the AB interaction, and I, which represents the total. The row headings are the treat-
ment combinations. Note that the signs in the AB column are the products of signs from
Table 13-13 Signs for Effects in the 2^2 Design

Treatment              Factorial Effect
Combinations     I      A      B      AB
(1)              +      -      -      +
a                +      +      -      -
b                +      -      +      -
ab               +      +      +      +

columns A and B. To generate a contrast from this table, multiply the signs in the appropriate column of Table 13-13 by the treatment combinations listed in the rows and add.
To obtain the sums of squares for A, B, and AB, we can use equation 12-18, which expresses the relationship between a single-degree-of-freedom contrast and its sum of squares:

SS = \frac{(\text{Contrast})^2}{n\sum(\text{contrast coefficients})^2}.          (13-35)

Therefore, the sums of squares for A, B, and AB are

SS_A = \frac{[a + ab - b - (1)]^2}{4n},
SS_B = \frac{[b + ab - a - (1)]^2}{4n},
SS_{AB} = \frac{[ab + (1) - a - b]^2}{4n}.

The analysis of variance is completed by computing the total sum of squares SS_T (with 4n - 1 degrees of freedom) as usual, and obtaining the error sum of squares SS_E [with 4(n - 1) degrees of freedom] by subtraction.

Example 13-6
An article in the AT&T Technical Journal (March/April, 1986, Vol. 65, p. 39) describes the application of two-level experimental designs to integrated circuit manufacturing. A basic processing step in this industry is to grow an epitaxial layer on polished silicon wafers. The wafers are mounted on a susceptor and positioned inside a bell jar. Chemical vapors are introduced through nozzles near the top of the jar. The susceptor is rotated and heat is applied. These conditions are maintained until the epitaxial layer is thick enough.
Table 13-14 presents the results of a 2^2 factorial design with n = 4 replicates using the factors A = deposition time and B = arsenic flow rate. The two levels of deposition time are - = short and + = long, and the two levels of arsenic flow rate are - = 55% and + = 59%. The response variable is epitaxial layer thickness (μm). We may find the estimates of the effects using equations 13-32, 13-33, and 13-34 as follows:

A = \frac{1}{2n}[a + ab - b - (1)]
  = \frac{1}{2(4)}[59.299 + 59.156 - 55.686 - 56.081] = 0.836,

Table 13-14 The 2^2 Design for the Epitaxial Process Experiment

Treatment
Combinations   A    B    AB    Thickness (μm)                       Total     Average
(1)            -    -    +     14.037, 14.165, 13.972, 13.907       56.081    14.020
a              +    -    -     14.821, 14.757, 14.843, 14.878       59.299    14.825
b              -    +    -     13.880, 13.860, 14.032, 13.914       55.686    13.922
ab             +    +    +     14.888, 14.921, 14.415, 14.932       59.156    14.789


B = \frac{1}{2n}[b + ab - a - (1)]
  = \frac{1}{2(4)}[55.686 + 59.156 - 59.299 - 56.081] = -0.067,

AB = \frac{1}{2n}[ab + (1) - a - b]
   = \frac{1}{2(4)}[59.156 + 56.081 - 59.299 - 55.686] = 0.032.

The numerical estimates of the effects indicate that the effect of deposition time is large and has a positive direction (increasing deposition time increases thickness), since changing deposition time from low to high changes the mean epitaxial layer thickness by 0.836 μm. The effects of arsenic flow rate (B) and the AB interaction appear small.
The magnitude of these effects may be confirmed with the analysis of variance. The sums of squares for A, B, and AB are computed using equation 13-35:
SS = \frac{(\text{Contrast})^2}{n \cdot 4},

SS_A = \frac{[a + ab - b - (1)]^2}{16}
     = \frac{[6.688]^2}{16}
     = 2.7956,

SS_B = \frac{[b + ab - a - (1)]^2}{16}
     = \frac{[-0.538]^2}{16}
     = 0.0181,

SS_{AB} = \frac{[ab + (1) - a - b]^2}{16}
        = \frac{[0.252]^2}{16}
        = 0.0040.

The analysis of variance is summarized in Table 13-15. This confirms our conclusions obtained by examining the magnitude and direction of the effects; deposition time affects epitaxial layer thickness, and from the direction of the effect estimates we know that longer deposition times lead to thicker epitaxial layers.

Table 13-15 Analysis of Variance for the Epitaxial Process Experiment

Source of              Sum of     Degrees of    Mean
Variation              Squares    Freedom       Square     F_0
A (deposition time)    2.7956      1            2.7956     134.50
B (arsenic flow)       0.0181      1            0.0181       0.87
AB                     0.0040      1            0.0040       0.19
Error                  0.2495     12            0.0208
Total                  3.0672     15
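
These calculations are easy to check by machine. The following minimal Python sketch (an illustration, not part of the original analysis) assumes only the four cell totals from Table 13-14 and n = 4 replicates, and it reproduces the effect estimates and sums of squares above:

# cell totals from Table 13-14 (n = 4 replicates per treatment combination)
one, a, b, ab, n = 56.081, 59.299, 55.686, 59.156, 4

contrasts = {
    "A":  a + ab - b - one,      # equation 13-32
    "B":  b + ab - a - one,      # equation 13-33
    "AB": ab + one - a - b,      # equation 13-34
}
for name, c in contrasts.items():
    effect = c / (2 * n)         # effect estimate
    ss = c ** 2 / (4 * n)        # sum of squares, equation 13-35
    print(f"{name:>2}: effect = {effect:8.4f}, SS = {ss:7.4f}")
# agrees with the hand calculations (A = 0.836, B = -0.067, AB = 0.032)
# and with the sums of squares in Table 13-15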

Residual Analysis  It is easy to obtain the residuals from a 2^k design by fitting a regression model to the data. For the epitaxial process experiment, the regression model is

y = \beta_0 + \beta_1 x_1 + \epsilon,

since the only active variable is deposition time, which is represented by x_1. The low and high levels of deposition time are assigned the values x_1 = -1 and x_1 = +1, respectively. The fitted model is

\hat{y} = 14.389 + \left(\frac{0.836}{2}\right)x_1,

where the intercept \hat{\beta}_0 is the grand average of all 16 observations (\bar{y}) and the slope \hat{\beta}_1 is one-half the effect estimate for deposition time. The reason the regression coefficient is one-half the effect estimate is because regression coefficients measure the effect of a unit change in x_1 on the mean of y, and the effect estimate is based on a two-unit change (from -1 to +1).
This model can be used to obtain the predicted values at the four points in the design. For example, consider the point with low deposition time (x_1 = -1) and low arsenic flow rate. The predicted value is

\hat{y} = 14.389 + \left(\frac{0.836}{2}\right)(-1) = 13.971 \ \mu m,

and the residuals would be

e_1 = 14.037 - 13.971 = 0.066,
e_2 = 14.165 - 13.971 = 0.194,
e_3 = 13.972 - 13.971 = 0.001,
e_4 = 13.907 - 13.971 = -0.064.

It is easy to verify that for low deposition time (x_1 = -1) and high arsenic flow rate, \hat{y} = 14.389 + (0.836/2)(-1) = 13.971 μm, and the remaining predicted values and residuals are

e_5 = 13.880 - 13.971 = -0.091,
e_6 = 13.860 - 13.971 = -0.111,
e_7 = 14.032 - 13.971 = 0.061,
e_8 = 13.914 - 13.971 = -0.057,
that for high deposition time (x_1 = +1) and low arsenic flow rate, \hat{y} = 14.389 + (0.836/2)(+1) = 14.807 μm, they are

e_9 = 14.821 - 14.807 = 0.014,
e_10 = 14.757 - 14.807 = -0.050,
e_11 = 14.843 - 14.807 = 0.036,
e_12 = 14.878 - 14.807 = 0.071,

and that for high deposition time (x_1 = +1) and high arsenic flow rate, \hat{y} = 14.389 + (0.836/2)(+1) = 14.807 μm, they are

e_13 = 14.888 - 14.807 = 0.081,
e_14 = 14.921 - 14.807 = 0.114,

e_15 = 14.415 - 14.807 = -0.392,
e_16 = 14.932 - 14.807 = 0.125.
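
The fitted model makes these residual calculations mechanical. The minimal Python sketch below, assuming the 16 observations of Table 13-14 and the fitted intercept and slope given above, generates all 16 residuals:

# observations from Table 13-14, grouped by the level x1 of deposition time
runs = [("(1)", -1, [14.037, 14.165, 13.972, 13.907]),
        ("a",  +1, [14.821, 14.757, 14.843, 14.878]),
        ("b",  -1, [13.880, 13.860, 14.032, 13.914]),
        ("ab", +1, [14.888, 14.921, 14.415, 14.932])]

b0, b1 = 14.389, 0.836 / 2           # grand average and one-half the A effect
for label, x1, ys in runs:
    y_hat = b0 + b1 * x1             # fitted value at this design point
    res = [round(y - y_hat, 3) for y in ys]
    print(f"{label:>3}: y_hat = {y_hat:.3f}, residuals = {res}")
# run ab gives [0.081, 0.114, -0.392, 0.125], flagging the outlier -0.392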
A normal probability plot of these residuals is shown in Fig. 13-14. This plot indicates that one residual, e_15 = -0.392, is an outlier. Examining the four runs with high deposition time and high arsenic flow rate reveals that observation y_15 = 14.415 is considerably smaller than the other three observations at that treatment combination. This adds some additional evidence to the tentative conclusion that observation 15 is an outlier. Another possibility is that there are some process variables that affect the variability in epitaxial layer thickness, and if we could discover which variables produce this effect, then it might be possible to adjust these variables to levels that would minimize the variability in epitaxial layer thickness. This would have important implications in subsequent manufacturing stages. Figures 13-15 and 13-16 are plots of residuals versus deposition time and arsenic flow rate, respectively. Apart from the unusually large residual associated with y_15, there is no strong evidence that either deposition time or arsenic flow rate influences the variability in epitaxial layer thickness.
Figure 13-17 shows the estimated standard deviation of epitaxial layer thickness at all four runs in the 2^2 design. These standard deviations were calculated using the data in Table 13-14. Notice that the standard deviation of the four observations with A and B at the high level is considerably larger than the standard deviations at any of the other three design

Figure 13-14 Normal probability plot of residuals for the epitaxial process experiment.

Figure 13-15 Plot of residuals versus deposition time.


Figure 13-16 Plot of residuals versus arsenic flow rate.

Figure 13-17 The estimated standard deviations of epitaxial layer thickness at the four runs in the 2^2 design: s = 0.110 at (1), s = 0.051 at a, s = 0.077 at b, and s = 0.250 at ab (the last value recomputed from the data in Table 13-14).

points. Most of this difference is attributable to the unusually low thickness measurement associated with y_15. The standard deviation of the four observations with A and B at the low levels is also somewhat larger than the standard deviations at the remaining two runs. This could be an indication that there are other process variables not included in this experiment that affect the variability in epitaxial layer thickness. Another experiment to study this possibility, involving other process variables, could be designed and conducted (indeed, the original paper shows that there are two additional factors, unconsidered in this example, that affect process variability).

13-5.2 The 2^k Design for k ≥ 3 Factors

The methods presented in the previous section for factorial designs with k = 2 factors each at two levels can be easily extended to more than two factors. For example, consider k = 3 factors, each at two levels. This design is a 2^3 factorial design, and it has eight treatment combinations. Geometrically, the design is a cube as shown in Fig. 13-18, with the eight runs forming the corners of the cube. This design allows three main effects to be estimated (A, B, and C) along with three two-factor interactions (AB, AC, and BC) and a three-factor interaction (ABC).
The main effects can be estimated easily. Remember that (1), a, b, ab, c, ac, bc, and abc represent the total of all n replicates at each of the eight treatment combinations in the design. Referring to the cube in Fig. 13-18, we would estimate the main effect of A by averaging the four treatment combinations on the right side of the cube, where A is at the high
Figure 13-18 The 2^3 design.

level, and subtracting from that quantity the average of the four treatment combinations on the left side of the cube, where A is at the low level. This gives

A = \frac{1}{4n}[a + ab + ac + abc - b - c - bc - (1)].          (13-36)

In a similar manner the effect of B is the average difference of the four treatment combinations in the back face of the cube and the four in the front, or

B = \frac{1}{4n}[b + ab + bc + abc - a - c - ac - (1)],          (13-37)

and the effect of C is the average difference between the four treatment combinations in the top face of the cube and the four in the bottom, or

C = \frac{1}{4n}[c + ac + bc + abc - a - b - ab - (1)].          (13-38)
Now consider the two-factor interaction AB. When C is at the low level, AB is just the average difference in the A effect at the two levels of B, or

AB(C low) = \frac{1}{2n}[ab - b] - \frac{1}{2n}[a - (1)].

Similarly, when C is at the high level, the AB interaction is

AB(C high) = \frac{1}{2n}[abc - bc] - \frac{1}{2n}[ac - c].

The AB interaction is just the average of these two components, or

AB = \frac{1}{4n}[ab + (1) + abc + c - b - a - bc - ac].         (13-39)

Using a similar approach, we can show that the AC and BC interaction effect estimates are as follows:

AC = \frac{1}{4n}[ac + (1) + abc + b - a - c - ab - bc],         (13-40)

BC = \frac{1}{4n}[bc + (1) + abc + a - b - c - ab - ac].         (13-41)

The ABC interaction effect is the average difference between the AB interaction at the two levels of C. Thus

ABC = \frac{1}{4n}\{[abc - bc] - [ac - c] - [ab - b] + [a - (1)]\}
    = \frac{1}{4n}[abc - bc - ac + c - ab + b + a - (1)].        (13-42)
The quantities in brackets in equations 13-36 through 13-42 are contrasts in the eight treatment combinations. These contrasts can be obtained from a table of plus and minus signs for the 2^3 design, shown in Table 13-16. Signs for the main effects (columns A, B, and C) are obtained by associating a plus with the high level of the factor and a minus with the low level. Once the signs for the main effects have been established, the signs for the remaining columns are found by multiplying the appropriate preceding columns row by row. For example, the signs in column AB are the product of the signs in columns A and B.
Table 13-16 has several interesting properties.
1. Except for the identity column I, each column has an equal number of plus and minus signs.
2. The sum of products of signs in any two columns is zero; that is, the columns in the table are orthogonal.
3. Multiplying any column by column I leaves the column unchanged; that is, I is an identity element.
4. The product of any two columns yields a column in the table; for example, A x B = AB and AB x ABC = A^2B^2C = C, since any column multiplied by itself is the identity column.
The estimate of any main effect or interaction is determined by multiplying the treatment combinations in the first column of the table by the signs in the corresponding main effect or interaction column, adding the result to produce a contrast, and then dividing the contrast by one-half the total number of runs in the experiment. Expressed mathematically,

Effect = \frac{\text{Contrast}}{n2^{k-1}}.                       (13-43)

The sum of squares for any effect is

SS = \frac{(\text{Contrast})^2}{n2^k}.                           (13-44)

Table 13-16 Signs for Effects in the 2^3 Design

Treatment                         Factorial Effect
Combinations    I    A    B    AB    C    AC    BC    ABC
(1)             +    -    -    +     -    +     +     -
a               +    +    -    -     -    -     +     +
b               +    -    +    -     -    +     -     +
ab              +    +    +    +     -    -     -     -
c               +    -    -    +     +    -     -     +
ac              +    +    -    -     +    +     -     -
bc              +    -    +    -     +    -     +     -
abc             +    +    +    +     +    +     +     +
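These properties are easy to verify programmatically. The following Python sketch (an illustration under the standard-order convention, not from the original text) generates the sign columns of Table 13-16 and checks the orthogonality and product properties:

from itertools import combinations

def design_rows(k):
    # treatment combinations of the 2^k design in standard order;
    # entry j of a row is the -1/+1 level of factor j
    return [[1 if (i >> j) & 1 else -1 for j in range(k)] for i in range(2 ** k)]

def column(rows, subset):
    # sign column of the effect built from the factor indices in `subset`
    out = []
    for r in rows:
        s = 1
        for j in subset:
            s *= r[j]
        out.append(s)
    return out

rows = design_rows(3)
names = "ABC"
cols = {"".join(names[j] for j in sub): column(rows, sub)
        for size in (1, 2, 3) for sub in combinations(range(3), size)}

# property 1: every column except I sums to zero
assert all(sum(c) == 0 for c in cols.values())
# property 2: any two distinct columns are orthogonal
assert all(sum(x * y for x, y in zip(cols[u], cols[v])) == 0
           for u in cols for v in cols if u != v)
# property 4: the elementwise product of columns A and B is the AB column
assert [x * y for x, y in zip(cols["A"], cols["B"])] == cols["AB"]
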
382 Chapter 13 Design of Experiments with Several Factors

Example 13-7
Consider the surface-roughness experiment described originally in Example 13-5. This is a 2^3 factorial design in the factors feed rate (A), depth of cut (B), and tool angle (C), with n = 2 replicates. Table 13-17 presents the observed surface-roughness data.
The main effects may be estimated using equations 13-36 through 13-42. The effect of A is, for example,

A = \frac{1}{4n}[a + ab + ac + abc - b - c - bc - (1)]
  = \frac{1}{4(2)}[22 + 27 + 23 + 30 - 20 - 21 - 18 - 16]
  = \frac{1}{8}[27] = 3.375,

and the sum of squares for A is found using equation 13-44:

SS_A = \frac{(\text{Contrast}_A)^2}{n2^k}
     = \frac{(27)^2}{2(8)} = 45.5625.

It is easy to verify that the other effects are

B = 1.625,
C = 0.875,
AB = 1.375,
AC = 0.125,
BC = -0.625,
ABC = 1.125.

From examining the magnitude of the effects, clearly feed rate (factor A) is dominant, followed by depth of cut (B) and the AB interaction, although the interaction effect is relatively small. The analysis of variance is summarized in Table 13-18, and it confirms our interpretation of the effect estimates.

Table 13-17 Surface-Roughness Data for Example 13-7

Treatment        Design Factors        Surface
Combinations     A      B      C       Roughness     Totals
(1)             -1     -1     -1       9, 7          16
a                1     -1     -1       10, 12        22
b               -1      1     -1       9, 11         20
ab               1      1     -1       12, 15        27
c               -1     -1      1       11, 10        21
ac               1     -1      1       10, 13        23
bc              -1      1      1       10, 8         18
abc              1      1      1       16, 14        30

Table 13-18 Analysis of Variance for the Surface-Finish Experiment

Source of    Sum of     Degrees of    Mean
Variation    Squares    Freedom       Square      F_0
A            45.5625     1            45.5625     18.69
B            10.5625     1            10.5625      4.33
C             3.0625     1             3.0625      1.26
AB            7.5625     1             7.5625      3.10
AC            0.0625     1             0.0625      0.03
BC            1.5625     1             1.5625      0.64
ABC           5.0625     1             5.0625      2.08
Error        19.5000     8             2.4375
Total        92.9375    15

Other Methods for Judging Significance of Effects  The analysis of variance is a formal way to determine which effects are nonzero. There are two other methods that are useful. In the first method, we can calculate the standard errors of the effects and compare the magnitudes of the effects to their standard errors. The second method uses normal probability plots to assess the importance of the effects.
The standard error of an effect is easy to find. If we assume that there are n replicates at each of the 2^k runs in the design, and if y_{i1}, y_{i2}, ..., y_{in} are the observations at the ith run (design point), then

S_i^2 = \frac{1}{n-1}\sum_{j=1}^{n}(y_{ij} - \bar{y}_i)^2,    i = 1, 2, ..., 2^k,

is an estimate of the variance at the ith run, where \bar{y}_i = \sum_{j=1}^{n} y_{ij}/n is the sample mean of the n observations. The 2^k variance estimates can be pooled to give an overall variance estimate

S^2 = \frac{1}{2^k(n-1)}\sum_{i=1}^{2^k}\sum_{j=1}^{n}(y_{ij} - \bar{y}_i)^2,        (13-45)

where we have obviously assumed equal variances for each design point. This is also the variance estimate given by the mean square error from the analysis of variance procedure. Each effect estimate has variance given by

V(\text{Effect}) = V\left(\frac{\text{Contrast}}{n2^{k-1}}\right)
                 = \frac{1}{(n2^{k-1})^2}V(\text{Contrast}).

Each contrast is a linear combination of 2^k treatment totals, and each total consists of n observations. Therefore,

V(\text{Contrast}) = n2^k\sigma^2,

and the variance of an effect is

V(\text{Effect}) = \frac{1}{(n2^{k-1})^2}\,n2^k\sigma^2
                 = \frac{1}{n2^{k-2}}\sigma^2.                   (13-46)

The estimated standard error of an effect would be found by replacing \sigma^2 with its estimate S^2 and taking the square root of equation 13-46.
To illustrate for the surface-roughness experiment, we find that S^2 = 2.4375 and the standard error of each estimated effect is

s.e.(\text{Effect}) = \sqrt{\frac{S^2}{n2^{k-2}}}
                    = \sqrt{\frac{2.4375}{2 \cdot 2^{3-2}}}
                    = 0.78.
Therefore two standard deviation limits on the effect estimates are

A: 3.375 ± 1.56,
B: 1.625 ± 1.56,
C: 0.875 ± 1.56,
AB: 1.375 ± 1.56,
AC: 0.125 ± 1.56,
BC: -0.625 ± 1.56,
ABC: 1.125 ± 1.56.
These intervals are approximate 95% confidence intervals. They indicate that the two main effects, A and B, are important, but that the other effects are not, since the intervals for all effects except A and B include zero.
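
The computation of the standard error and the two standard deviation limits is simple enough to script. A minimal Python sketch, assuming the mean square error and effect estimates above, is

import math

s2, n, k = 2.4375, 2, 3                      # MSE, replicates, factors
se = math.sqrt(s2 / (n * 2 ** (k - 2)))      # equation 13-46 with sigma^2 -> S^2
print(f"standard error of an effect = {se:.2f}")   # 0.78

effects = {"A": 3.375, "B": 1.625, "C": 0.875, "AB": 1.375,
           "AC": 0.125, "BC": -0.625, "ABC": 1.125}
for name, est in effects.items():
    flag = "  <-- significant" if abs(est) > 2 * se else ""
    print(f"{name:>3}: {est:6.3f} +/- {2 * se:.2f}{flag}")
# only A and B are flagged, matching the discussion above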
Normal probability plots can also be used to judge the significance of effects. We will illustrate that method in the next section.

Projection of 2^k Designs  Any 2^k design will collapse or project into another 2^k design in fewer variables if one or more of the original factors are dropped. Sometimes this can provide additional insight into the remaining factors. For example, consider the surface-roughness experiment. Since factor C and all its interactions are negligible, we could eliminate factor C from the design. The result is to collapse the cube in Fig. 13-18 into a square in the A-B plane; however, each of the four runs in the new design has four replicates. In general, if we delete h factors so that r = k - h factors remain, the original 2^k design with n replicates will project into a 2^r design with n2^h replicates.

Residual Analysis  We may obtain the residuals from a 2^3 design by using the method demonstrated earlier for the 2^2 design. As an example, consider the surface-roughness experiment. The three largest effects are A, B, and the AB interaction. The regression model used to obtain the predicted values is

y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \epsilon,

where x_1 represents factor A, x_2 represents factor B, and x_1 x_2 represents the AB interaction. The regression coefficients \beta_1, \beta_2, and \beta_{12} are estimated by one-half the corresponding effect estimates, and \beta_0 is the grand average. Thus

\hat{y} = 11.0625 + \left(\frac{3.375}{2}\right)x_1 + \left(\frac{1.625}{2}\right)x_2 + \left(\frac{1.375}{2}\right)x_1 x_2,

and the predicted values would be obtained by substituting the low and high levels of A and B into this equation. To illustrate, at the treatment combination where A, B, and C are all at the low level, the predicted value is

\hat{y} = 11.0625 + \left(\frac{3.375}{2}\right)(-1) + \left(\frac{1.625}{2}\right)(-1) + \left(\frac{1.375}{2}\right)(-1)(-1)
        = 9.25.

The observed values at this run are 9 and 7, so the residuals are 9 - 9.25 = -0.25 and 7 - 9.25 = -2.25. Residuals for the other seven runs are obtained similarly.
A normal probability plot of the residuals is shown in Fig. 13-19. Since the residuals lie approximately along a straight line, we do not suspect any severe nonnormality in the data. There are no indications of severe outliers. It would also be helpful to plot the residuals versus the predicted values and against each of the factors A, B, and C.

Yates' Algorithm for the 2^k  Instead of using the table of plus and minus signs to obtain the contrasts for the effect estimates and the sums of squares, a simple tabular algorithm devised by Yates can be employed. To use Yates' algorithm, construct a table with the treatment combinations and the corresponding treatment totals recorded in standard order. By standard order, we mean that each factor is introduced one at a time by combining it with all factor levels above it. Thus for a 2^2, the standard order is (1), a, b, ab, while for a 2^3 it is (1), a, b, ab, c, ac, bc, abc, and for a 2^4 it is (1), a, b, ab, c, ac, bc, abc, d, ad, bd, abd, cd, acd, bcd, abcd. Then follow this four-step procedure:
1. Label the adjacent column [1]. Compute the entries in the top half of this column by adding the observations in adjacent pairs. Compute the entries in the bottom half of this column by changing the sign of the first entry in each pair of the original observations and adding the adjacent pairs.
2. Label the adjacent column [2]. Construct column [2] using the entries in column [1]. Follow the same procedure employed to generate column [1]. Continue this process until k columns have been constructed. Column [k] contains the contrasts designated in the rows.
3. Calculate the sums of squares for the effects by squaring the entries in column [k] and dividing by n2^k.
4. Calculate the effect estimates by dividing the entries in column [k] by n2^{k-1}.

Figure 13-19 Normal probability plot of residuals from the surface-roughness experiment.

Table 13-19 Yates' Algorithm for the Surface-Roughness Experiment

Treatment                                               Sum of Squares    Effect Estimate
Combinations   Response   [1]   [2]   [3]    Effect    [3]^2/n2^3        [3]/n2^2
(1)            16          38    85   177    Total
a              22          47    92    27    A         45.5625            3.375
b              20          44    13    13    B         10.5625            1.625
ab             27          48    14    11    AB         7.5625            1.375
c              21           6     9     7    C          3.0625            0.875
ac             23           7     4     1    AC         0.0625            0.125
bc             18           2     1    -5    BC         1.5625           -0.625
abc            30          12    10     9    ABC        5.0625            1.125

Example 13-8
Consider the surface-roughness experiment in Example 13-7. This is a 2^3 design with n = 2 replicates. The analysis of this data using Yates' algorithm is illustrated in Table 13-19. Note that the sums of squares computed from Yates' algorithm agree with the results obtained in Example 13-7.
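
Yates' algorithm is also straightforward to program. The sketch below is a minimal Python version, assuming the treatment totals are supplied in standard order; applied to the surface-roughness totals it reproduces Table 13-19:

def yates(totals, n):
    # one pass per factor: top half = adjacent sums, bottom half = adjacent
    # differences (second entry of each pair minus the first)
    m = len(totals)
    k = m.bit_length() - 1                    # m = 2**k
    col = list(totals)
    for _ in range(k):
        col = ([col[i] + col[i + 1] for i in range(0, m, 2)] +
               [col[i + 1] - col[i] for i in range(0, m, 2)])
    effects = [c / (n * 2 ** (k - 1)) for c in col]     # col[0] is the grand total
    ss = [c ** 2 / (n * 2 ** k) for c in col]
    return col, effects, ss

# surface-roughness totals in standard order (Example 13-8), n = 2 replicates
labels = ["Total", "A", "B", "AB", "C", "AC", "BC", "ABC"]
col, effects, ss = yates([16, 22, 20, 27, 21, 23, 18, 30], n=2)
for lab, c, e, s in zip(labels[1:], col[1:], effects[1:], ss[1:]):
    print(f"{lab:>5}: contrast = {c:4d}, effect = {e:7.3f}, SS = {s:8.4f}")
# A: 3.375 and 45.5625, B: 1.625 and 10.5625, ... as in Table 13-19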

13-5.3 A Single Replicate of the 2^k Design


As the number of factors in a factorial experiment grows, the number of effects that can be estimated grows also. For example, a 2^4 experiment has 4 main effects, 6 two-factor interactions, 4 three-factor interactions, and 1 four-factor interaction, while a 2^6 experiment has 6 main effects, 15 two-factor interactions, 20 three-factor interactions, 15 four-factor interactions, 6 five-factor interactions, and 1 six-factor interaction. In most situations the sparsity of effects principle applies; that is, the system is usually dominated by the main effects and low-order interactions. Three-factor and higher interactions are usually negligible. Therefore, when the number of factors is moderately large, say k ≥ 4 or 5, a common practice is to run only a single replicate of the 2^k design and then pool or combine the higher-order interactions as an estimate of error.

Example 13-9
An article in Solid State Technology ("Orthogonal Design for Process Optimization and its Application in Plasma Etching," May 1987, p. 127) describes the application of factorial designs in developing a nitride etch process on a single-wafer plasma etcher. The process uses C2F6 as the reactant gas. It is possible to vary the gas flow, the power applied to the cathode, the pressure in the reactor chamber, and the spacing between the anode and the cathode (gap). Several response variables would usually be of interest in this process, but in this example we will concentrate on etch rate for silicon nitride.
We will use a single replicate of a 2^4 design to investigate this process. Since it is unlikely that the three-factor and four-factor interactions are significant, we will tentatively plan to combine them as an estimate of error. The factor levels used in the design are shown here:

             Design Factor
             A         B           C            D
             Gap       Pressure    C2F6 Flow    Power
Level        (cm)      (mTorr)     (SCCM)       (W)
Low (-)      0.80      450         125          275
High (+)     1.20      550         200          325


Table 13-20 presents the data from the 16 runs of the 2^4 design. Table 13-21 is the table of plus and minus signs for the 2^4 design. The signs in the columns of this table can be used to estimate the factor effects. To illustrate, the estimate of factor A is

A = \frac{1}{8}[a + ab + ac + abc + ad + abd + acd + abcd - (1) - b - c - d - bc - bd - cd - bcd]
  = \frac{1}{8}[669 + 650 + 642 + 635 + 749 + 868 + 860 + 729 - 550 - 604 - 633 - 601 - 1037 - 1052 - 1075 - 1063]
  = -101.625.
Thus the effect of increasing the gap between the anode and the cathode from 0.80 cm to 1.20 cm is to decrease the etch rate by 101.625 Å/min.
It is easy to verify that the complete set of effect estimates is

A = -101.625,       D = 306.125,
B = -1.625,         AD = -153.625,
AB = -7.875,        BD = -0.625,
C = 7.375,          ABD = 4.125,
AC = -24.875,       CD = -2.125,
BC = -43.875,       ACD = 5.625,
ABC = -15.625,      BCD = -25.375,
                    ABCD = -40.125.
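
With a single replicate, every effect estimate is just a contrast divided by 8, so the entire list can be generated mechanically. A minimal Python sketch, assuming the 16 etch rates of Table 13-20 in standard order (bit j of run i giving the level of factor j), is

# etch rates in standard order from Table 13-20
etch = [550, 669, 604, 650, 633, 642, 601, 635,
        1037, 749, 1052, 868, 1075, 860, 1063, 729]
names = "ABCD"

def effect(mask):
    # run i enters with sign +1 when the factors in `mask` have an even
    # number of low (-) levels at run i, and with sign -1 otherwise
    contrast = sum(y if (bin(i & mask).count("1") - bin(mask).count("1")) % 2 == 0
                   else -y
                   for i, y in enumerate(etch))
    return contrast / 8              # contrast / (n * 2**(k-1)) with n = 1, k = 4

for mask in range(1, 16):
    label = "".join(names[j] for j in range(4) if (mask >> j) & 1)
    print(f"{label:>4}: {effect(mask):9.3f}")
# A = -101.625, D = 306.125, and AD = -153.625 dominate, as in the text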

A very helpful method in judging the significance of factors in a 2^k experiment is to construct a normal probability plot of the effect estimates. If none of the effects is

Table 13-20 The 2^4 Design for the Plasma Etch Experiment

A        B             C             D          Etch Rate
(Gap)    (Pressure)    (C2F6 Flow)   (Power)    (Å/min)
-1       -1            -1            -1          550
 1       -1            -1            -1          669
-1        1            -1            -1          604
 1        1            -1            -1          650
-1       -1             1            -1          633
 1       -1             1            -1          642
-1        1             1            -1          601
 1        1             1            -1          635
-1       -1            -1             1         1037
 1       -1            -1             1          749
-1        1            -1             1         1052
 1        1            -1             1          868
-1       -1             1             1         1075
 1       -1             1             1          860
-1        1             1             1         1063
 1        1             1             1          729

Table 13-21 Contrast Constants for the 2^4 Design

        A   B   AB  C   AC  BC  ABC  D   AD  BD  ABD  CD  ACD  BCD  ABCD
(1)     -   -   +   -   +   +   -    -   +   +   -    +   -    -    +
a       +   -   -   -   -   +   +    -   -   +   +    +   +    -    -
b       -   +   -   -   +   -   +    -   +   -   +    +   -    +    -
ab      +   +   +   -   -   -   -    -   -   -   -    +   +    +    +
c       -   -   +   +   -   -   +    -   +   +   -    -   +    +    -
ac      +   -   -   +   +   -   -    -   -   +   +    -   -    +    +
bc      -   +   -   +   -   +   -    -   +   -   +    -   +    -    +
abc     +   +   +   +   +   +   +    -   -   -   -    -   -    -    -
d       -   -   +   -   +   +   -    +   -   -   +    -   +    +    -
ad      +   -   -   -   -   +   +    +   +   -   -    -   -    +    +
bd      -   +   -   -   +   -   +    +   -   +   -    -   +    -    +
abd     +   +   +   -   -   -   -    +   +   +   +    -   -    -    -
cd      -   -   +   +   -   -   +    +   -   -   +    +   -    -    +
acd     +   -   -   +   +   -   -    +   +   -   -    +   +    -    -
bcd     -   +   -   +   -   +   -    +   -   +   -    +   -    +    -
abcd    +   +   +   +   +   +   +    +   +   +   +    +   +    +    +

significant, then the estimates will behave like a random sample drawn from a normal distribution with zero mean, and the plotted effects will lie approximately along a straight line. Those effects that do not plot on the line are significant factors.
The normal probability plot of effect estimates from the plasma etch experiment is shown in Fig. 13-20. Clearly the main effects of A and D and the AD interaction are significant, as they fall far from the line passing through the other points. The analysis of variance summarized in Table 13-22 confirms these findings. Notice that in the analysis of variance we have pooled the three- and four-factor interactions to form the error mean square. If the normal probability plot had indicated that any of these interactions were important, they then should not be included in the error term.
Since A = -101.625, the effect of increasing the gap between the cathode and anode is to decrease the etch rate. However, D = 306.125, so applying higher power levels will increase the etch rate. Figure 13-21 is a plot of the AD interaction. This plot indicates that the effect of changing the gap width at low power settings is small, but that increasing the
Figure 13-20 Normal probability plot of effects from the plasma etch experiment.

Table 13-22 Analysis of Variance for the Plasma Etch Experiment

Source of    Sum of         Degrees of    Mean
Variation    Squares        Freedom       Square         F_0
A             41,310.563     1             41,310.563     20.28
B                 10.563     1                 10.563     <1
C                217.563     1                217.563     <1
D            374,850.063     1            374,850.063    183.99
AB               248.063     1                248.063     <1
AC             2,475.063     1              2,475.063      1.21
AD            94,402.563     1             94,402.563     46.34
BC             7,700.063     1              7,700.063      3.78
BD                 1.563     1                  1.563     <1
CD                18.063     1                 18.063     <1
Error         10,186.815     5              2,037.363
Total        531,420.938    15

gap at high power settings dramatically reduces the etch rate. High etch rates are obtained at high power settings and narrow gap widths.
The residuals from the experiment can be obtained from the regression model

\hat{y} = 776.0625 - \left(\frac{101.625}{2}\right)x_1 + \left(\frac{306.125}{2}\right)x_4 - \left(\frac{153.625}{2}\right)x_1 x_4.
For example, when A and D are both at the low level the predicted value is

\hat{y} = 776.0625 - \left(\frac{101.625}{2}\right)(-1) + \left(\frac{306.125}{2}\right)(-1) - \left(\frac{153.625}{2}\right)(-1)(-1)
        = 597,

and the four residuals at this treatment combination are

e_1 = 550 - 597 = -47,
e_2 = 604 - 597 = 7,

Figure 13-21 The AD interaction from the plasma etch experiment: etch rate versus A (gap) from low (0.80 cm) to high (1.20 cm), plotted at D high = 325 W and D low = 275 W.



e_3 = 633 - 597 = 36,
e_4 = 601 - 597 = 4.

Figure 13-22 Normal probability plot of residuals from the plasma etch experiment.
The residuals at the other three treatment combinations (A high, D low), (A low, D high), and (A high, D high) are obtained similarly. A normal probability plot of the residuals is shown in Fig. 13-22. The plot is satisfactory.

13-6 CONFOUNDING IN THE 2^k DESIGN

It is often impossible to run a complete replicate of a factorial design under homogeneous experimental conditions. Confounding is a design technique for running a factorial experiment in blocks, where the block size is smaller than the number of treatment combinations in one complete replicate. The technique causes certain interaction effects to be indistinguishable from, or confounded with, blocks. We will illustrate confounding in the 2^k factorial design in 2^p blocks, where p < k.
Consider a 2^2 design. Suppose that each of the 2^2 = 4 treatment combinations requires four hours of laboratory analysis. Thus, two days are required to perform the experiment. If days are considered as blocks, then we must assign two of the four treatment combinations to each day.
Consider the design shown in Fig. 13-23. Notice that block 1 contains the treatment combinations (1) and ab, and that block 2 contains a and b. The contrasts for estimating the main effects A and B are

Contrast_A = ab + a - b - (1),
Contrast_B = ab + b - a - (1).

Note that these contrasts are unaffected by blocking since in each contrast there is one plus and one minus treatment combination from each block. That is, any difference between block 1 and block 2 will cancel out. The contrast for the AB interaction is

Contrast_{AB} = ab + (1) - a - b.

Since the two treatment combinations with the plus sign, ab and (1), are in block 1 and the two with the minus sign, a and b, are in block 2, the block effect and the AB interaction are identical. That is, AB is confounded with blocks.

Block 1: (1), ab        Block 2: a, b
Figure 13-23 The 2^2 design in two blocks, AB confounded.

Block 1: (1), ab, ac, bc        Block 2: a, b, c, abc
Figure 13-24 The 2^3 design in two blocks, ABC confounded.
The reason for this is apparent from the table of plus and minus signs for the 2^2 design (Table 13-13). From this table, we see that all treatment combinations that have a plus on AB are assigned to block 1, while all treatment combinations that have a minus sign on AB are assigned to block 2.
This scheme can be used to confound any 2^k design in two blocks. As a second example, consider a 2^3 design run in two blocks. Suppose we wish to confound the three-factor interaction ABC with blocks. From the table of plus and minus signs for the 2^3 design (Table 13-16), we assign the treatment combinations that are minus on ABC to block 1 and those that are plus on ABC to block 2. The resulting design is shown in Fig. 13-24.
There is a more general method of constructing the blocks. The method employs a defining contrast, say

L = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_k x_k,         (13-47)

where x_i is the level of the ith factor appearing in a treatment combination and \alpha_i is the exponent appearing on the ith factor in the effect to be confounded. For the 2^k system, we have either \alpha_i = 0 or 1, and either x_i = 0 (low level) or x_i = 1 (high level). Treatment combinations that produce the same value of L (modulus 2) will be placed in the same block. Since the only possible values of L (mod 2) are 0 and 1, this will assign the 2^k treatment combinations to exactly two blocks.
As an example consider a 2^3 design with ABC confounded with blocks. Here x_1 corresponds to A, x_2 to B, x_3 to C, and \alpha_1 = \alpha_2 = \alpha_3 = 1. Thus, the defining contrast for ABC is

L = x_1 + x_2 + x_3.

To assign the treatment combinations to the two blocks, we substitute the treatment combinations into the defining contrast as follows:

(1):  L = 1(0) + 1(0) + 1(0) = 0 = 0 (mod 2),
a:    L = 1(1) + 1(0) + 1(0) = 1 = 1 (mod 2),
b:    L = 1(0) + 1(1) + 1(0) = 1 = 1 (mod 2),
ab:   L = 1(1) + 1(1) + 1(0) = 2 = 0 (mod 2),
c:    L = 1(0) + 1(0) + 1(1) = 1 = 1 (mod 2),
ac:   L = 1(1) + 1(0) + 1(1) = 2 = 0 (mod 2),
bc:   L = 1(0) + 1(1) + 1(1) = 2 = 0 (mod 2),
abc:  L = 1(1) + 1(1) + 1(1) = 3 = 1 (mod 2).

Therefore, (1), ab, ac, and bc are run in block 1, and a, b, c, and abc are run in block 2. This is the same design shown in Fig. 13-24.

A shortcut method is useful in constructing these designs. The block containing the treatment combination (1) is called the principal block. Any element [except (1)] in the principal block may be generated by multiplying two other elements in the principal block modulus 2. For example, consider the principal block of the 2^3 design with ABC confounded, shown in Fig. 13-24. Note that

ab · ac = a^2bc = bc,
ab · bc = ab^2c = ac,
ac · bc = abc^2 = ab.

Treatment combinations in the other block (or blocks) may be generated by multiplying one element in the new block by each element in the principal block modulus 2. For the 2^3 with ABC confounded, since the principal block is (1), ab, ac, and bc, we know that b is in the other block. Thus, the elements of this second block are

b · (1) = b,
b · ab = ab^2 = a,
b · ac = abc,
b · bc = b^2c = c.

Example 13-10
An experiment is performed to investigate the effects of four factors on the terminal miss distance of a shoulder-fired ground-to-air missile.
The four factors are target type (A), seeker type (B), target altitude (C), and target range (D). Each factor may be conveniently run at two levels, and the optical tracking system will allow terminal miss distance to be measured to the nearest foot. Two different gunners are used in the flight test, and since there may be differences between individuals, it was decided to conduct the 2^4 design in two blocks with ABCD confounded. Thus, the defining contrast is

L = x_1 + x_2 + x_3 + x_4.

The experimental design and the resulting data are

Block 1          Block 2
(1) = 3          a = 7
ab = 7           b = 5
ac = 6           c = 6
bc = 8           d = 4
ad = 10          abc = 6
bd = 4           bcd = 7
cd = 8           acd = 9
abcd = 9         abd = 12

The analysis of the design by Yates' algorithm is shown in Table 13-23. A normal probability plot of the effects would reveal A (target type), D (target range), and AD to have large effects. A confirming analysis of variance, using three-factor interactions as error, is shown in Table 13-24.

It is possible to confound the 2^k design in four blocks of 2^{k-2} observations each. To construct the design, two effects are chosen to confound with blocks and their defining contrasts obtained. A third effect, the generalized interaction of the two initially chosen, is also

Table 13-23 Yates' Algorithm for the 2^4 Design in Example 13-10

Treatment                                                        Sum of     Effect
Combinations   Response   [1]   [2]   [3]   [4]    Effect       Squares    Estimate
(1)             3          10    22    48   111    Total
a               7          12    26    63    21    A            27.5625     2.625
b               5          12    30     4     5    B             1.5625     0.625
ab              7          14    33    17    -1    AB            0.0625    -0.125
c               6          14     6     4     7    C             3.0625     0.875
ac              6          16    -2     1   -19    AC           22.5625    -2.375
bc              8          17    14    -4    -3    BC            0.5625    -0.375
abc             6          16     3     3    -1    ABC           0.0625    -0.125
d               4           4     2     4    15    D            14.0625     1.875
ad             10           2     2     3    13    AD           10.5625     1.625
bd              4           0     2    -8    -3    BD            0.5625    -0.375
abd            12          -2    -1   -11     7    ABD           3.0625     0.875
cd              8           6    -2     0    -1    CD            0.0625    -0.125
acd             9           8    -2    -3    -3    ACD           0.5625    -0.375
bcd             7           1     2     0    -3    BCD           0.5625    -0.375
abcd            9           2     1    -1    -1    ABCD          0.0625    -0.125

Table 13-24 Analysis of Variance for Example 13-10

Source of                         Sum of     Degrees of    Mean
Variation                         Squares    Freedom       Square     F_0
Blocks (ABCD)                      0.0625     1             0.0625     0.06
A                                 27.5625     1            27.5625    25.94
B                                  1.5625     1             1.5625     1.47
C                                  3.0625     1             3.0625     2.88
D                                 14.0625     1            14.0625    13.24
AB                                 0.0625     1             0.0625     0.06
AC                                22.5625     1            22.5625    21.24
AD                                10.5625     1            10.5625     9.94
BC                                 0.5625     1             0.5625     0.53
BD                                 0.5625     1             0.5625     0.53
CD                                 0.0625     1             0.0625     0.06
Error (ABC + ABD + ACD + BCD)      4.2500     4             1.0625
Total                             84.9375    15

confounded with blocks. The generalized interaction of two effects is found by multiplying their respective columns.
For example, consider the 2^4 design in four blocks. If AC and BD are confounded with blocks, their generalized interaction is (AC)(BD) = ABCD. The design is constructed by using the defining contrasts for AC and BD:

L_1 = x_1 + x_3,
L_2 = x_2 + x_4.

It is easy to verify that the four blocks are

Block 1             Block 2             Block 3             Block 4
L_1 = 0, L_2 = 0    L_1 = 1, L_2 = 0    L_1 = 0, L_2 = 1    L_1 = 1, L_2 = 1
(1)                 a                   b                   ab
ac                  c                   abc                 bc
bd                  abd                 d                   ad
abcd                bcd                 acd                 cd
This general procedure can be extended to confounding the 2^k design in 2^p blocks, where p < k. Select p effects to be confounded, such that no effect chosen is a generalized interaction of the others. The blocks can be constructed from the p defining contrasts L_1, L_2, ..., L_p associated with these effects. In addition, exactly 2^p - p - 1 other effects are confounded with blocks, these being the generalized interactions of the original p effects chosen. Care should be taken so as not to confound effects of potential interest.
For more information on confounding refer to Montgomery (2001, Chapter 7). That book contains guidelines for selecting factors to confound with blocks so that main effects and low-order interactions are not confounded. In particular, the book contains a table of suggested confounding schemes for designs with up to seven factors and a range of block sizes, some as small as two runs.

13-7 FRACTIONAL REPLICATION OF THE 2^k DESIGN

As the number of factors in a 2^k increases, the number of runs required increases rapidly. For example, a 2^5 requires 32 runs. In this design, only 5 degrees of freedom correspond to main effects and 10 degrees of freedom correspond to two-factor interactions. If we can assume that certain high-order interactions are negligible, then a fractional factorial design involving fewer than the complete set of 2^k runs can be used to obtain information on the main effects and low-order interactions. In this section, we will introduce fractional replication of the 2^k design. For a more complete treatment, see Montgomery (2001, Chapter 8).

13-7.1 The One-Half Fraction of the 2^k Design

A one-half fraction of the 2^k design contains 2^{k-1} runs and is often called a 2^{k-1} fractional factorial design. As an example, consider the 2^{3-1} design; that is, a one-half fraction of the 2^3. The table of plus and minus signs for the 2^3 design is shown in Table 13-25. Suppose we select the four treatment combinations a, b, c, and abc as our one-half fraction. These
Table 13-25 Plus and Minus Signs for the 2^3 Factorial Design

Treatment                       Factorial Effect
Combinations    I    A    B    C    AB    AC    BC    ABC
a               +    +    -    -    -     -     +     +
b               +    -    +    -    -     +     -     +
c               +    -    -    +    +     -     -     +
abc             +    +    +    +    +     +     +     +
ab              +    +    +    -    +     -     -     -
ac              +    +    -    +    -     +     -     -
bc              +    -    +    +    -     -     +     -
(1)             +    -    -    -    +     +     +     -

treatment combinations are shown in the top half of Table 13-25. We will use both the conventional notation (a, b, c, ...) and the plus and minus notation for the treatment combinations. The equivalence between the two notations is as follows:

Notation 1     Notation 2
a              + - -
b              - + -
c              - - +
abc            + + +

Notice that the 2^{3-1} design is formed by selecting only those treatment combinations that yield a plus on the ABC effect. Thus ABC is called the generator of this particular fraction. Furthermore, the identity element I is also plus for the four runs, so we call

I = ABC

the defining relation for the design.
The treatment combinations in the 2^{3-1} design yield three degrees of freedom associated with the main effects. From Table 13-25, we obtain the estimates of the main effects as

A = \frac{1}{2}[a - b - c + abc],
B = \frac{1}{2}[-a + b - c + abc],
C = \frac{1}{2}[-a - b + c + abc].

It is also easy to verify that the estimates of the two-factor interactions are

BC = \frac{1}{2}[a - b - c + abc],
AC = \frac{1}{2}[-a + b - c + abc],
AB = \frac{1}{2}[-a - b + c + abc].
Thus, the linear combination of observations in column A, say \ell_A, estimates A + BC. Similarly, \ell_B estimates B + AC, and \ell_C estimates C + AB. Two or more effects that have this property are called aliases. In our 2^{3-1} design, A and BC are aliases, B and AC are aliases, and C and AB are aliases. Aliasing is the direct result of fractional replication. In many practical situations, it will be possible to select the fraction so that the main effects and low-order interactions of interest will be aliased with high-order interactions (which are probably negligible).
The alias structure for this design is found by using the defining relation I = ABC. Multiplying any effect by the defining relation yields the aliases for that effect. In our example, the alias of A is

A = A · ABC = A^2BC = BC,

since A · I = A and A^2 = I. The aliases of B and C are

B = B · ABC = AB^2C = AC

and

C = C · ABC = ABC^2 = AB.

Now suppose that we had chosen the other one-half fraction, that is, the treatment combinations in Table 13-25 associated with minus on ABC. The defining relation for this design is I = -ABC. The aliases are A = -BC, B = -AC, and C = -AB. Thus the estimates of A, B, and C with this fraction really estimate A - BC, B - AC, and C - AB. In practice, it usually does not matter which one-half fraction we select. The fraction with the plus sign in the defining relation is usually called the principal fraction, and the other fraction is usually called the alternate fraction.
Sometimes we use sequences of fractional factorial designs to estimate effects. For example, suppose we had run the principal fraction of the 2^{3-1} design. From this design we have the following effect estimates:

\ell_A = A + BC,
\ell_B = B + AC,
\ell_C = C + AB.

Suppose that we are willing to assume at this point that the two-factor interactions are negligible. If they are, then the 2^{3-1} design has produced estimates of the three main effects, A, B, and C. However, if after running the principal fraction we are uncertain about the interactions, it is possible to estimate them by running the alternate fraction. The alternate fraction produces the following effect estimates:

\ell'_A = A - BC,
\ell'_B = B - AC,
\ell'_C = C - AB.
If we combine the estimates from the two fractions, we obtain the following:

Effect i    From \frac{1}{2}(\ell_i + \ell'_i)           From \frac{1}{2}(\ell_i - \ell'_i)
i = A       \frac{1}{2}(A + BC + A - BC) = A             \frac{1}{2}[A + BC - (A - BC)] = BC
i = B       \frac{1}{2}(B + AC + B - AC) = B             \frac{1}{2}[B + AC - (B - AC)] = AC
i = C       \frac{1}{2}(C + AB + C - AB) = C             \frac{1}{2}[C + AB - (C - AB)] = AB
Thus by combining a sequence of two fractional factorial designs we can isolate both the main effects and the two-factor interactions. This property makes the fractional factorial design highly useful in experimental problems, as we can run sequences of small, efficient experiments, combine information across several experiments, and take advantage of learning about the process we are experimenting with as we go along.
A 2^{k-1} design may be constructed by writing down the treatment combinations for a full factorial with k - 1 factors and then adding the kth factor by identifying its plus and minus levels with the plus and minus signs of the highest-order interaction ±ABC···(K - 1). Therefore, a 2^{3-1} fractional factorial is obtained by writing down the full 2^2 factorial and then equating factor C to the ±AB interaction. Thus, to obtain the principal fraction, we would use C = +AB as follows:

Full 2^2          2^{3-1}, I = ABC
A      B          A      B      C = AB
-      -          -      -      +
+      -          +      -      -
-      +          -      +      -
+      +          +      +      +
To obtain the alternate fraction we would equate the last column to C = -AB.

Example 13-11
To illustrate the use of a one-half fraction, consider the plasma etch experiment described in Example 13-9. Suppose that we decide to use a 2^{4-1} design with I = ABCD to investigate the four factors gap (A), pressure (B), C2F6 flow rate (C), and power setting (D). This design would be constructed by writing down a 2^3 in the factors A, B, and C and then setting D = ABC. The design and the resulting etch rates are shown in Table 13-26.
In this design, the main effects are aliased with the three-factor interactions; note that the alias of A is

A · I = A · ABCD
      = A^2BCD
      = BCD,

and similarly

B = ACD,
C = ABD,
D = ABC.

The two-factor interactions are aliased with each other. For example, the alias of AB is CD:

AB · I = AB · ABCD
       = A^2B^2CD
       = CD.

The other aliases are

Table 13-26 The 2^{4-1} Design with Defining Relation I = ABCD

                               Treatment        Etch
A     B     C     D = ABC      Combinations     Rate
-     -     -     -            (1)               550
+     -     -     +            ad                749
-     +     -     +            bd               1052
+     +     -     -            ab                650
-     -     +     +            cd               1075
+     -     +     -            ac                642
-     +     +     -            bc                601
+     +     +     +            abcd              729

AC = BD,
AD = BC.

The estimates of the main effects and their aliases are found using the four columns of signs in Table 13-26. For example, from column A we obtain

\ell_A = A + BCD = \frac{1}{4}(-550 + 749 - 1052 + 650 - 1075 + 642 - 601 + 729)
       = -127.00.

The other columns produce

\ell_B = B + ACD = 4.00,
\ell_C = C + ABD = 11.50,

and

\ell_D = D + ABC = 290.50.

Clearly \ell_A and \ell_D are large, and if we believe that the three-factor interactions are negligible, then the main effects A (gap) and D (power setting) significantly affect etch rate.
The interactions are estimated by forming the AB, AC, and AD columns and adding them to the table. The signs in the AB column are +, -, -, +, +, -, -, +, and this column produces the estimate

\ell_{AB} = AB + CD = \frac{1}{4}(550 - 749 - 1052 + 650 + 1075 - 642 - 601 + 729)
          = -10.00.

From the AC and AD columns we find

\ell_{AC} = AC + BD = -25.50,
\ell_{AD} = AD + BC = -197.50.

The \ell_{AD} estimate is large; the most straightforward interpretation of the results is that this is the AD interaction. Thus, the results obtained from the 2^{4-1} design agree with the full factorial results in Example 13-9.
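
Both the construction and the aliased estimates are easy to verify numerically. A minimal Python sketch, assuming the etch rates of Table 13-26, builds the principal fraction by setting D = ABC and recomputes the four main-effect estimates:

# build the principal 2^(4-1) fraction: a full 2^3 in A, B, C with D = ABC
runs = []
etch_rate = [550, 749, 1052, 650, 1075, 642, 601, 729]   # Table 13-26 order
for i in range(8):
    a, b, c = (i >> 0) & 1, (i >> 1) & 1, (i >> 2) & 1
    runs.append((a, b, c, (a + b + c) % 2))              # x4 = x1 + x2 + x3 (mod 2)

def l_hat(col):
    # aliased estimate from one sign column: (sum at high - sum at low) / 4
    return sum((1 if r[col] else -1) * y for r, y in zip(runs, etch_rate)) / 4

print([l_hat(j) for j in range(4)])                      # [-127.0, 4.0, 11.5, 290.5]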

Normal Probability Plots and Residuals  The normal probability plot is very useful in assessing the significance of effects from a fractional factorial. This is particularly true when there are many effects to be estimated. Residuals can be obtained from a fractional factorial by the regression model method shown previously. These residuals should be plotted against the predicted values, against the levels of the factors, and on normal probability paper, as we have discussed before, both to assess the validity of the underlying model assumptions and to gain additional insight into the experimental situation.

Projection of the 2^{k-1} Design  If one or more factors from a one-half fraction of a 2^k can be dropped, the design will project into a full factorial design. For example, Fig. 13-25 presents a 2^{3-1} design. Notice that this design will project into a full factorial in any two of the three original factors. Thus, if we think that at most two of the three factors are important, the 2^{3-1} design is an excellent design for identifying the significant factors. Sometimes we call experiments to identify a relatively few significant factors from a larger number of factors screening experiments. This projection property is highly useful in factor screening, as it allows negligible factors to be eliminated, resulting in a stronger experiment in the active factors that remain.


Figure 13-25 Projection of a 2^{3-1} design into three 2^2 designs.

In the 2^{4-1} design used in the plasma etch experiment in Example 13-11, we found that two of the four factors (B and C) could be dropped. If we eliminate these two factors, the remaining columns in Table 13-26 form a 2^2 design in the factors A and D, with two replicates. This design is shown in Fig. 13-26.

Figure 13-26 The 2^2 design in A (gap) and D (power) obtained by dropping factors B and C from the plasma etch experiment; the replicated observations are (550, 601) and (650, 642) at low D, and (1052, 1075) and (749, 729) at high D.

Design Resolution  The concept of design resolution is a useful way to catalog fractional factorial designs according to the alias patterns they produce. Designs of resolution III, IV, and V are particularly important. The definitions of these terms and an example of each follow:
1. Resolution III Designs. These are designs in which no main effect is aliased with any other main effect, but main effects are aliased with two-factor interactions, and two-factor interactions may be aliased with each other. The 2^{3-1} design with I = ABC is of resolution III. We usually employ a subscript Roman numeral to indicate design resolution; thus this one-half fraction is a 2_{III}^{3-1} design.
2. Resolution IV Designs. These are designs in which no main effect is aliased with any other main effect or two-factor interaction, but two-factor interactions are aliased with each other. The 2^{4-1} design with I = ABCD used in Example 13-11 is of resolution IV (2_{IV}^{4-1}).
3. Resolution V Designs. These are designs in which no main effect or two-factor interaction is aliased with any other main effect or two-factor interaction, but two-factor interactions are aliased with three-factor interactions. A 2^{5-1} design with I = ABCDE is of resolution V (2_V^{5-1}).
Resolution III and IV designs are particularly useful in factor screening experiments. A resolution IV design provides very good information about main effects and will provide some information about two-factor interactions.

13-7.2 Smaller Fractions: The 2^{k-p} Fractional Factorial

Although the 2^{k-1} design is valuable in reducing the number of runs required for an experiment, we frequently find that smaller fractions will provide almost as much useful information at even greater economy. In general, a 2^k design may be run in a 1/2^p fraction called a 2^{k-p} fractional factorial design. Thus, a 1/4 fraction is called a 2^{k-2} fractional factorial design, a 1/8 fraction is called a 2^{k-3} design, and so on.
To illustrate a 1/4 fraction, consider an experiment with six factors and suppose that the engineer is interested primarily in main effects but would also like to get some information about the two-factor interactions. A 2^{6-1} design would require 32 runs and would have 31 degrees of freedom for estimation of effects. Since there are only six main effects and 15 two-factor interactions, the one-half fraction is inefficient; it requires too many runs. Suppose we consider a 1/4 fraction, or a 2^{6-2} design. This design contains 16 runs and, with 15 degrees of freedom, will allow estimation of all six main effects with some capability for examination of the two-factor interactions. To generate this design we would write down a 2^4 design in the factors A, B, C, and D, and then add two columns for E and F. To find the new columns we would select the two design generators I = ABCE and I = ACDF. Thus column E would be found from E = ABC and column F would be F = ACD, and also columns ABCE and ACDF are equal to the identity column. However, we know that the product of any two columns in the table of plus and minus signs for a 2^k is just another column in the table; therefore, the product of ABCE and ACDF, or ABCE(ACDF) = A^2BC^2DEF = BDEF, is also an identity column. Consequently, the complete defining relation for the 2^{6-2} design is

I = ABCE = ACDF = BDEF.

To find the alias of any effect, simply multiply the effect by each word in the foregoing defining relation. The complete alias structure is

A = BCE = CDF = ABDEF,
B = ACE = DEF = ABCDF,
C = ABE = ADF = BCDEF,
D = ACF = BEF = ABCDE,
E = ABC = BDF = ACDEF,
F = ACD = BDE = ABCEF,
AB = CE = BCDF = ADEF,
AC = BE = DF = ABCDEF,
AD = CF = BCDE = ABEF,
AE = BC = CDEF = ABDF,
AF = CD = BCEF = ABDE,
BD = EF = ACDE = ABCF,
BF = DE = ABCD = ACEF,
ABF = CEF = BCD = ADE,
CDE = ABD = AEF = BCF.
Notice that this is a resolution IV design; main effects are aliased with three-factor and higher interactions, and two-factor interactions are aliased with each other. This design would provide very good information on the main effects and give some idea about the strength of the two-factor interactions. For example, if the AD interaction appears significant, either AD and/or CF are significant. If A and/or D are significant main effects, but C and F are not, the experimenter may reasonably and tentatively attribute the significance to the AD interaction. The construction of the design is shown in Table 13-27.
The same principles can be applied to obtain even smaller fractions. Suppose we wish to investigate seven factors in 16 runs. This is a 2^{7-3} design (a 1/8 fraction). This design is constructed by writing down a 2^4 design in the factors A, B, C, and D and then adding three new columns. Reasonable choices for the three generators required are I = ABCE, I = BCDF, and I = ACDG. Therefore, the new columns are formed by setting E = ABC, F = BCD, and G = ACD. The complete defining relation is found by multiplying the generators together two at a time and then three at a time, resulting in

I = ABCE = BCDF = ACDG = ADEF = BDEG = ABFG = CEFG.
Notice that every main effect in this design will be aliased with three-factor and higher interactions and that two-factor interactions will be aliased with each other. Thus this is a resolution IV design.
Table 13-27 Construction of the 2^{6-2} Design with Generators I = ABCE and I = ACDF

A    B    C    D    E = ABC    F = ACD
-    -    -    -    -          -          (1)
+    -    -    -    +          +          aef
-    +    -    -    +          -          be
+    +    -    -    -          +          abf
-    -    +    -    +          +          cef
+    -    +    -    -          -          ac
-    +    +    -    -          +          bcf
+    +    +    -    +          -          abce
-    -    -    +    -          +          df
+    -    -    +    +          -          ade
-    +    -    +    +          +          bdef
+    +    -    +    -          -          abd
-    -    +    +    +          -          cde
+    -    +    +    -          +          acdf
-    +    +    +    -          -          bcd
+    +    +    +    +          +          abcdef
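
Alias chains follow from modulo-2 multiplication of effect words, which can be automated. The minimal Python sketch below (an illustration using the 2^{6-2} generators above) rebuilds the defining relation and the aliases of a few representative effects:

def mul(u, v):
    # product of two effect words mod 2: symmetric difference of their letters
    return "".join(sorted(set(u) ^ set(v)))

gens = ["ABCE", "ACDF"]
defining = gens + [mul(*gens)]                 # I = ABCE = ACDF = BDEF
print(defining)                                # ['ABCE', 'ACDF', 'BDEF']

for eff in ["A", "AB", "AD"]:
    print(eff, "=", " = ".join(sorted(mul(eff, w) for w in defining)))
# A = ABDEF = BCE = CDF;  AB = ADEF = BCDF = CE;  AD = ABEF = BCDE = CF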

For seven factors, we can reduce the number of runs even further. The 2^{7-4} design is an eight-run experiment accommodating seven variables. This is a 1/16 fraction and is obtained by first writing down a 2^3 design in the factors A, B, and C, and then forming the four new columns from I = ABD, I = ACE, I = BCF, and I = ABCG. The design is shown in Table 13-28.
The complete defining relation is found by multiplying the generators together two, three, and finally four at a time, producing

I = ABD = ACE = BCF = ABCG = BCDE = ACDF = CDG = ABEF = BEG = AFG = DEF = ADEG = CEFG = BDFG = ABCDEFG.

The alias of any main effect is found by multiplying that effect through each term in the defining relation. For example, the alias of A is

A = BD = CE = ABCF = BCG = ABCDE = CDF = ACDG = BEF = ABEG = FG = ADEF = DEG = ACEFG = ABDFG = BCDEFG.
This design is of resolution III, since main effects are aliased with two-factor interactions. If we assume that all three-factor and higher interactions are negligible, the aliases of the seven main effects are

\ell_A = A + BD + CE + FG,
\ell_B = B + AD + CF + EG,
\ell_C = C + AE + BF + DG,
\ell_D = D + AB + CG + EF,
\ell_E = E + AC + BG + DF,
\ell_F = F + BC + AG + DE,
\ell_G = G + CD + BE + AF.
This 2_{III}^{7-4} design is called a saturated fractional factorial, because all of the available degrees of freedom are used to estimate main effects. It is possible to combine sequences of these resolution III fractional factorials to separate the main effects from the two-factor interactions. The procedure is illustrated in Montgomery (2001, Chapter 8).
In constructing a fractional factorial design it is important to select the best set of design generators. Montgomery (2001) presents a table of optimum design generators for designs with up to 10 factors. The generators in this table will produce designs of maximum resolution for any specified combination of k and p. For more than 10 factors, a
Table 13-28 The 2^{7-4} Design with Generators I = ABD, I = ACE, I = BCF, and I = ABCG

A    B    C    D = AB    E = AC    F = BC    G = ABC
-    -    -    +         +         +         -
+    -    -    -         -         +         +
-    +    -    -         +         -         +
+    +    -    +         -         -         -
-    -    +    +         -         -         +
+    -    +    -         +         -         -
-    +    +    -         -         +         -
+    +    +    +         +         +         +

resolution III design is recommended. These designs may be constructed by using the same method illustrated earlier for the 2^{7-4} design. For example, to investigate up to 15 factors in 16 runs, write down a 2^4 design in the factors A, B, C, and D, and then generate 11 new columns by taking the products of the original four columns two at a time, three at a time, and four at a time. The resulting design is a 2^{15-11} fractional factorial. These designs, along with other useful fractional factorials, are discussed by Montgomery (2001, Chapter 8).

13-8 SAMPLE COMPUTER OUTPUT

We provide Minitab® output for some of the examples presented in this chapter.

Sample Computer Output for Example 13-3

Reconsider Example 13-3, dealing with aircraft primer paints. The Minitab® results of the 3 x 2 factorial design with three replicates are

Analysis of Variance for Force

Source           DF    SS         MS        F        P
Type              2     4.5811    2.2906    27.86    0.000
Applicat          1     4.9089    4.9089    59.70    0.000
Type*Applicat     2     0.2411    0.1206     1.47    0.269
Error            12     0.9867    0.0822
Total            17    10.7178

The Minitab® results are in agreement with the results given in Table 13-6.

Sample Output for Example 13-7


Reconsider Example 13-7, dealing with surface roughness. The Minitab® results for the 2^3 design with two replicates are

Term       Effect     Coef       SE Coef    T        P
Constant              11.0625    0.3903     28.34    0.000
A          3.3750     1.6875     0.3903     4.32     0.003
B          1.6250     0.8125     0.3903     2.08     0.071
C          0.8750     0.4375     0.3903     1.12     0.295
A*B        1.3750     0.6875     0.3903     1.76     0.116
A*C        0.1250     0.0625     0.3903     0.16     0.877
B*C        -0.6250    -0.3125    0.3903     -0.80    0.446
A*B*C      1.1250     0.5625     0.3903     1.44     0.188

Analysis of Variance

Source               DF    Seq SS    Adj SS    Adj MS    F       P
Main Effects         3     59.187    59.187    19.729    8.09    0.008
2-Way Interactions   3     9.187     9.187     3.062     1.26    0.352
3-Way Interactions   1     5.062     5.062     5.062     2.08    0.188
Residual Error       8     19.500    19.500    2.437
  Pure Error         8     19.500    19.500    2.438
Total                15    92.937

The output from Minitab® is slightly different from the results given in Example 13-7. t-tests on the main effects and interactions are provided in addition to the analysis of variance on the significance of main effects, two-factor interactions, and three-factor interactions. The

ANOVA results indicate that at least one of the main effects is significant, whereas no two-factor or three-factor interaction is significant.

13-9 SUMMARY

This chapter has introduced the design and analysis of experiments with several factors, concentrating on factorial designs and fractional factorials. Fixed, random, and mixed models were considered. The F-tests for main effects and interactions in these designs depend on whether the factors are fixed or random.

The 2^k factorial designs were also introduced. These are very useful designs in which all k factors appear at two levels. They have a greatly simplified method of statistical analysis. In situations where the design cannot be run under homogeneous conditions, the 2^k design can be confounded easily in 2^p blocks. This requires that certain interactions be confounded with blocks. The 2^k design also lends itself to fractional replication, in which only a particular subset of the 2^k treatment combinations are run. In fractional replication, each effect is aliased with one or more other effects. The general idea is to alias main effects and low-order interactions with higher-order interactions. This chapter discussed methods for construction of the 2^{k-p} fractional factorial designs, that is, a 1/2^p fraction of the 2^k design. These designs are particularly useful in industrial experimentation.

13-10 EXERCISES
13-1. An article in the Journal of Materials Processing Technology (2000, p. 113) presents results from an experiment involving tool wear estimation in milling. The objective is to minimize tool wear. Two factors of interest in the study were cutting speed (m/min) and depth of cut (mm). One response of interest is tool flank wear (mm). Three levels of each factor were selected and a factorial experiment with three replicates is run. Analyze the data and draw conclusions.

                    Depth of Cut
Cutting Speed    1        2        3
12               0.170    0.198    0.217
                 0.185    0.210    0.241
                 0.110    0.232    0.223
15               0.178    0.215    0.260
                 0.210    0.243    0.289
                 0.250    0.292    0.320
18.75            0.212    0.250    0.285
                 0.238    0.282    0.325
                 0.267    0.321    0.354

13-2. An engineer suspects that the surface finish of a metal part is influenced by the type of paint used and the drying time. He selects three drying times (20, 25, and 30 minutes) and randomly chooses two types of paint from several that are available. He conducts an experiment and obtains the data shown here. Analyze the data and draw conclusions. Estimate the variance components.

          Drying Time (min)
Paint     20     25     30
1         74     73     78
          64     61     85
          50     44     92
2         92     98     66
          86     73     45
          68     88     85

13-3. Suppose that in Exercise 13-2 paint types were fixed effects. Compute a 95% interval estimate of the mean difference between the responses for paint type 1 and paint type 2.

13-4. The factors that influence the breaking strength of cloth are being studied. Four machines and three operators are chosen at random and an experiment is run using cloth from the same one-yard segment. The results are as follows:

            Machine
Operator    1      2      3      4
A           109    110    108    110
            110    115    109    116
B           111    110    111    114
            112    111    109    112
C           109    112    114    111
            111    115    109    112

Test for interaction and main effects at the 5% level. Estimate the components of variance.

13-5. Suppose that in Exercise 13-4 the operators were chosen at random, but only four machines were available for the test. Does this influence the analysis or your conclusions?

13-6. A company employs two time-study engineers. Their supervisor wishes to determine whether the standards set by them are influenced by any interaction between engineers and operators. She selects three operators at random and conducts an experiment in which the engineers set standard times for the same job. She obtains the data shown here. Analyze the data and draw conclusions.

            Operator
Engineer    1       2       3
1           2.59    2.38    2.40
            2.78    2.49    2.72
2           2.15    2.85    2.66
            2.86    2.72    2.87

13-7. An article in Industrial Quality Control (1956, p. 5) describes an experiment to investigate the effect of two factors (glass type and phosphor type) on the brightness of a television tube. The response variable measured is the current necessary (in microamps) to obtain a specified brightness level. The data are shown here. Analyze the data and draw conclusions, assuming that both factors are fixed.

              Phosphor Type
Glass Type    1      2      3
1             280    300    290
              290    310    285
              285    295    290
2             230    260    220
              235    240    225
              240    235    230

13-8. Consider the tool wear data in Exercise 13-1. Plot the residuals from this experiment against the levels of cutting speed and against the depth of cut. Comment on the graphs obtained. What are the possible consequences of the information conveyed by the residual plots?

13-9. The percentage of hardwood concentration in raw pulp and the freeness and cooking time of pulp are being investigated for their effects on the strength of paper. Analyze the data shown in the following table, assuming that all three factors are fixed.

                  Cooking Time 1.5 hours      Cooking Time 2.0 hours
Percentage of     Freeness                    Freeness
Hardwood
Concentration     350     500     650         350     500     650
10                96.6    97.7    99.4        98.4    99.6    100.6
                  96.0    96.0    99.8        98.6    100.4   100.9
15                98.5    96.0    98.4        97.5    98.7    99.6
                  97.2    96.9    97.6        98.1    98.0    99.0
20                97.5    95.6    97.4        97.6    97.0    98.5
                  96.6    96.2    98.1        98.4    97.8    99.8

13-10. An article in Quality Engineering (1999, p. 357) presents the results of an experiment conducted to determine the effects of three factors on warpage in an injection-molding process. Warpage is defined as the nonflatness property in the product manufactured. This particular company manufactures plastic molded components for use in television sets, washing machines, and automobiles. The three factors of interest (each at two levels) are A = melt temperature, B = injection speed, and C = injection process. A complete 2^3 factorial design was carried out with replication. Two replicates are provided in the table below. Analyze the data from this experiment.

A     B     C     Warpage
-1    -1    -1    1.35, 1.40
 1    -1    -1    2.15, 2.20
-1     1    -1    1.50, 1.50
 1     1    -1    1.10, 1.20
-1    -1     1    0.70, 0.70
 1    -1     1    1.40, 1.35
-1     1     1    1.20, 1.35
 1     1     1    1.10, 1.00

13-11. For the warpage experiment in Exercise 13-10, obtain the residuals and plot them on normal probability paper. Also plot the residuals versus the predicted values. Comment on these plots.

13-12. Four factors are thought to possibly influence the taste of a soft-drink beverage: type of sweetener (A), ratio of syrup to water (B), carbonation level (C), and temperature (D). Each factor can be run at two levels, producing a 2^4 design. At each run in the design, samples of the beverage are given to a test panel consisting of 20 people. Each tester assigns a point score from 1 to 10 to the beverage. Total score is the response variable, and the objective is to find a formulation that maximizes total score. Two replicates of this design are run, and the results are shown here. Analyze the data and draw conclusions.

Treatment        Replicate        Treatment        Replicate
Combination      I       II      Combination      I       II
(1)              190     193     d                198     195
a                174     178     ad               172     176
b                181     185     bd               187     183
ab               183     180     abd              185     186
c                177     178     cd               199     190
ac               181     180     acd              179     175
bc               188     182     bcd              187     184
abc              173     170     abcd             180     180

13-13. Consider the experiment in Exercise 13-12. Plot the residuals against the levels of factors A, B, C, and D. Also construct a normal probability plot of the residuals. Comment on these plots.

13-14. Find the standard error of the effects for the experiment in Exercise 13-12. Using the standard errors as a guide, what factors appear significant?

13-15. The data shown here represent a single replicate of a 2^5 design that is used in an experiment to study the compressive strength of concrete. The factors are mix (A), time (B), laboratory (C), temperature (D), and drying time (E). Analyze the data, assuming that three-factor and higher interactions are negligible. Use a normal probability plot to assess the effects.

(1) = 700     d = 1000      e = 800       de = 1900
a = 900       ad = 1100     ae = 1200     ade = 1500
b = 3400      bd = 3000     be = 3500     bde = 4000
ab = 5500     abd = 6100    abe = 6200    abde = 6500
c = 600       cd = 800      ce = 600      cde = 1500
ac = 1000     acd = 1100    ace = 1200    acde = 2000
bc = 3000     bcd = 3300    bce = 3000    bcde = 3400
abc = 5300    abcd = 6000   abce = 5500   abcde = 6300

13-16. An experiment described by M. G. Natrella in the National Bureau of Standards Handbook of Experimental Statistics (No. 91, 1963) involves flame-testing fabrics after applying fire-retardant treatments. There are four factors: type of fabric (A), type of fire-retardant treatment (B), laundering condition (C; the low level is no laundering, the high level is after one laundering), and the method of conducting the flame test (D). All factors are run at two levels, and the response variable is the inches of fabric burned on a standard size test sample. The data are

(1) = 42     d = 40
a = 31       ad = 30
b = 45       bd = 50
ab = 29      abd = 25
c = 39       cd = 40
ac = 28      acd = 25
bc = 46      bcd = 50
abc = 32     abcd = 23

(a) Estimate the effects and prepare a normal probability plot of the effects.
(b) Construct a normal probability plot of the residuals and comment on the results.
(c) Construct an analysis of variance table assuming that three- and four-factor interactions are negligible.

13-17. Consider the data from the first replicate of Exercise 13-10. Suppose that these observations could not all be run under the same conditions. Set up a design to run these observations in two blocks of four observations, each with ABC confounded. Analyze the data.

13-18. Consider the data from the first replicate of Exercise 13-12. Construct a design with two blocks of eight observations each, with ABCD confounded. Analyze the data.

13-19. Repeat Exercise 13-18 assuming that four blocks are required. Confound ABD and ABC (and consequently CD) with blocks.

13-20. Construct a 2^5 design in four blocks. Select the effects to be confounded so that we confound the highest possible interactions with blocks.

13-21. An article in Industrial and Engineering Chemistry ("Factorial Experiments in Pilot Plant Studies," 1951, p. 1300) reports on an experiment to investigate the effects of temperature (A), gas throughput (B), and concentration (C) on the strength of product solution in a recirculation unit. Two blocks were used with ABC confounded, and the experiment was replicated twice. The data are as follows:

             Replicate 1                    Replicate 2
Block 1         Block 2         Block 1         Block 2
(1) = 46        a = 51          ...             ...
ab = 47         b = 108         ...             ...
ac = 22         c = ...         ...             ...
bc = 67         abc = 35        ...             ...

(a) Analyze the data from this experiment.
(b) Plot the residuals on normal probability paper and against the predicted values. Comment on the plots obtained.
(c) Comment on the efficiency of this design. Note that we have replicated the experiment twice, yet we have no information on the ABC interaction.
(d) Suggest a better design; specifically, one that would provide some information on all interactions.

13-22. R. D. Snee ("Experimenting with a Large Number of Variables," in Experiments in Industry: Design, Analysis and Interpretation of Results, by R. D. Snee, L. B. Hare, and J. B. Trout, Editors, ASQC, 1985) describes an experiment in which a 2^{5-1} design with I = ABCDE was used to investigate the effects of five factors on the color of a chemical product. The factors are A = solvent/reactant, B = catalyst/reactant, C = temperature, D = reactant purity, and E = reactant pH. The results obtained are as follows:

e = -0.63      d = 6.79
a = 2.51       ade = 6.47
b = -2.68      bde = 3.45
abe = 1.66     abd = 5.68
c = 2.06       cde = 5.22
ace = 1.22     acd = 4.38
bce = -2.09    bcd = 4.30
abc = 1.93     abcde = 4.05

(a) Prepare a normal probability plot of the effects. Which factors are active?
(b) Calculate the residuals. Construct a normal probability plot of the residuals and plot the residuals versus the fitted values. Comment on the plots.
(c) If any factors are negligible, collapse the 2^{5-1} design into a full factorial in the active factors. Comment on the resulting design, and interpret the results.

13-23. An article in the Journal of Quality Technology (Vol. 17, 1985, p. 198) describes the use of a replicated fractional factorial to investigate the effects of five factors on the free height of leaf springs used in an automotive application. The factors are A = furnace temperature, B = heating time, C = transfer time, D = hold down time, and E = quench oil temperature. The data are shown below.

A    B    C    D    E    Free Height
-    -    -    -    -    7.78, 7.78, 7.81
+    -    -    +    -    8.15, 8.18, 7.88
-    +    -    +    -    7.50, 7.56, 7.50
+    +    -    -    -    7.59, 7.56, 7.75
-    -    +    +    -    7.54, 8.00, 7.88
+    -    +    -    -    7.69, 8.09, 8.06
-    +    +    -    -    7.56, 7.52, 7.44
+    +    +    +    -    7.56, 7.81, 7.69
-    -    -    -    +    7.50, 7.25, 7.12
+    -    -    +    +    7.88, 7.88, 7.44
-    +    -    +    +    7.50, 7.56, 7.50
+    +    -    -    +    7.63, 7.75, 7.56
-    -    +    +    +    7.32, 7.44, 7.44
+    -    +    -    +    7.56, 7.69, 7.62
-    +    +    -    +    7.18, 7.18, 7.25
+    +    +    +    +    7.81, 7.50, 7.59

(a) What is the generator for this fraction? Write out the alias structure.
(b) Analyze the data. What factors influence mean free height?
(c) Calculate the range of free height for each run. Is there any indication that any of these factors affects variability in free height?
(d) Analyze the residuals from this experiment and comment on your findings.

13-24. An article in Industrial and Engineering Chemistry ("More on Planning Experiments to Increase Research Efficiency," 1970, p. 60) uses a 2^{5-2} design to investigate the effects of A = condensation temperature, B = amount of material 1, C = solvent volume, D = condensation time, and E = amount of material 2 on yield. The results obtained are as follows:

e = 23.2,    ad = 16.9,    cd = 23.8,    bde = 16.3,
ab = 15.5,   bc = 16.2,    ace = 23.4,   abcde = 18.1.

(a) Verify that the design generators used were I = ACE and I = BDE.
(b) Write down the complete defining relation and the aliases from this design.
(c) Estimate the main effects.
(d) Prepare an analysis of variance table. Verify that the AB and AD interactions are available to use as error.
(e) Plot the residuals versus the fitted values. Also construct a normal probability plot of the residuals. Comment on the results.

13-25. An article in Cement and Concrete Research (2001, p. 1213) describes an experiment to investigate the effects of four metal oxides on several cement properties. The four factors are all run at two levels and one response of interest is the mean bulk density (g/cm³). The four factors and their levels are

              Low Level    High Level
Factor        (-1)         (+1)
A: % Fe₂O₃    0            30
B: % ZnO      0            15
C: % PbO      0            2.5
D: % Cr₂O₃    0            2.5

Typical results from this type of experiment are given in the following table:

Run    A     B     C     D     Density
1      -1    -1    -1     1    2.001
2       1    -1    -1    -1    2.062
3      -1     1    -1    -1    2.019
4       1     1    -1     1    2.059
5      -1    -1     1    -1    1.990
6       1    -1     1     1    2.076
7      -1     1     1     1    2.038
8       1     1     1    -1    2.118

(a) What is the generator for this fraction?
(b) Analyze the data. What factors influence mean bulk density?
(c) Analyze the residuals from this experiment and comment on your findings.

13-26. Consider the 2^{6-2} design in Table 13-27. Suppose that after analyzing the original data, we find that factors C and E can be dropped. What type of 2^k design is left in the remaining variables?

13-27. Consider the 2^{6-2} design in Table 13-27. Suppose that after the original data analysis, we find that factors D and F can be dropped. What type of 2^k design is left in the remaining variables? Compare the results with Exercise 13-26. Can you explain why the answers are different?

13-28. Suppose that in Exercise 13-12 it was possible to run only a one-half fraction of the 2^4 design. Construct the design and perform the statistical analysis, using the data from replicate I.

13-29. Suppose that in Exercise 13-15 only a one-half fraction of the 2^5 design could be run. Construct the design and perform the analysis.

13-30. Consider the data in Exercise 13-15. Suppose that only a one-quarter fraction of the 2^5 design could be run. Construct the design and analyze the data.

13-31. Construct a 2^{8-4}_IV fractional factorial design. Write down the aliases, assuming that only main effects and two-factor interactions are of interest.

Chapter 14

Simple Linear Regression and Correlation
In many problems there are two or more variables that are inherently related, and it is necessary to explore the nature of this relationship. Regression analysis is a statistical technique for modeling and investigating the relationship between two or more variables. For example, in a chemical process, suppose that the yield of product is related to the process operating temperature. Regression analysis can be used to build a model that expresses yield as a function of temperature. This model can then be used to predict yield at a given temperature level. It could also be used for process optimization or process control purposes.

In general, suppose that there is a single dependent variable, or response, y, that is related to k independent, or regressor, variables, say x₁, x₂, ..., x_k. The response variable y is a random variable, while the regressor variables x₁, x₂, ..., x_k are measured with negligible error. The x_i are called mathematical variables and are frequently controlled by the experimenter. Regression analysis can also be used in situations where y, x₁, x₂, ..., x_k are jointly distributed random variables, such as when the data are collected as different measurements on a common experimental unit. The relationship between these variables is characterized by a mathematical model called a regression equation. More precisely, we speak of the regression of y on x₁, x₂, ..., x_k. This regression model is fitted to a set of data. In some instances, the experimenter will know the exact form of the true functional relationship between y and x₁, x₂, ..., x_k, say y = φ(x₁, x₂, ..., x_k). However, in most cases, the true functional relationship is unknown, and the experimenter will choose an appropriate function to approximate φ. A polynomial model is usually employed as the approximating function.

In this chapter, we discuss the case where only a single regressor variable, x, is of interest. Chapter 15 will present the case involving more than one regressor variable.

14-1 SIMPLE LINEAR REGRESSION

We wish to determine the relationship between a single regressor variable x and a response variable y. The regressor variable x is assumed to be a continuous mathematical variable, controllable by the experimenter. Suppose that the true relationship between y and x is a straight line, and that the observation y at each level of x is a random variable. Now, the expected value of y for each value of x is

E(y|x) = β₀ + β₁x,                                        (14-1)

where the intercept β₀ and the slope β₁ are unknown constants. We assume that each observation, y, can be described by the model

y = β₀ + β₁x + ε,                                          (14-2)


where ε is a random error with mean zero and variance σ². The {ε} are also assumed to be uncorrelated random variables. The regression model of equation 14-2 involving only a single regressor variable x is often called the simple linear regression model.

Suppose that we have n pairs of observations, say (y₁, x₁), (y₂, x₂), ..., (y_n, x_n). These data may be used to estimate the unknown parameters β₀ and β₁ in equation 14-2. Our estimation procedure will be the method of least squares. That is, we will estimate β₀ and β₁ so that the sum of squares of the deviations between the observations and the regression line is a minimum. Now using equation 14-2, we may write

y_i = β₀ + β₁x_i + ε_i,   i = 1, 2, ..., n,                (14-3)

and the sum of squares of the deviations of the observations from the true regression line is

L = Σ_{i=1}^n ε_i² = Σ_{i=1}^n (y_i − β₀ − β₁x_i)².        (14-4)
The least-squares estimators of β₀ and β₁, say β̂₀ and β̂₁, must satisfy

∂L/∂β₀ = −2 Σ_{i=1}^n (y_i − β̂₀ − β̂₁x_i) = 0,
∂L/∂β₁ = −2 Σ_{i=1}^n (y_i − β̂₀ − β̂₁x_i)x_i = 0.          (14-5)

Simplifying these two equations yields

nβ̂₀ + β̂₁ Σ_{i=1}^n x_i = Σ_{i=1}^n y_i,
β̂₀ Σ_{i=1}^n x_i + β̂₁ Σ_{i=1}^n x_i² = Σ_{i=1}^n x_i y_i. (14-6)

Equations 14-6 are called the least-squares normal equations. The solution to the normal equations is

β̂₀ = ȳ − β̂₁x̄,                                            (14-7)

β̂₁ = [Σ_{i=1}^n y_i x_i − (Σ_{i=1}^n y_i)(Σ_{i=1}^n x_i)/n] / [Σ_{i=1}^n x_i² − (Σ_{i=1}^n x_i)²/n],   (14-8)

where ȳ = (1/n) Σ_{i=1}^n y_i and x̄ = (1/n) Σ_{i=1}^n x_i. Therefore, equations 14-7 and 14-8 are the least-squares estimators of the intercept and slope, respectively. The fitted simple linear regression model is

ŷ = β̂₀ + β̂₁x.                                             (14-9)

Notationally, it is convenient to give special symbols to the numerator and denominator of equation 14-8. That is, let

S_xx = Σ_{i=1}^n (x_i − x̄)² = Σ_{i=1}^n x_i² − (Σ_{i=1}^n x_i)²/n   (14-10)

and

S_xy = Σ_{i=1}^n y_i(x_i − x̄) = Σ_{i=1}^n x_i y_i − (Σ_{i=1}^n x_i)(Σ_{i=1}^n y_i)/n.   (14-11)

We call S_xx the corrected sum of squares of x and S_xy the corrected sum of cross products of x and y. The extreme right-hand sides of equations 14-10 and 14-11 are the usual computational formulas. Using this new notation, the least-squares estimator of the slope is

β̂₁ = S_xy / S_xx.                                          (14-12)

Example 14-1
A chemical engineer is investigating the effect of process operating temperature on product yield. The study results in the following data:

Temperature, °C (x)    100   110   120   130   140   150   160   170   180   190
Yield, % (y)            45    51    54    61    66    70    74    78    85    89

These pairs of points are plotted in Fig. 14-1. Such a display is called a scatter diagram. Examination of this scatter diagram indicates that there is a strong relationship between yield and temperature, and the tentative assumption of the straight-line model y = β₀ + β₁x + ε appears to be reasonable. The following quantities may be computed:

n = 10,   Σ_{i=1}^{10} x_i = 1450,   Σ_{i=1}^{10} y_i = 673,   x̄ = 145,   ȳ = 67.3,
Σ_{i=1}^{10} x_i² = 218,500,   Σ_{i=1}^{10} y_i² = 47,225,   Σ_{i=1}^{10} x_i y_i = 101,570.

From equations 14-10 and 14-11, we find

S_xx = Σ_{i=1}^{10} x_i² − (Σ_{i=1}^{10} x_i)²/10 = 218,500 − (1450)²/10 = 8250

and

S_xy = Σ_{i=1}^{10} x_i y_i − (Σ_{i=1}^{10} x_i)(Σ_{i=1}^{10} y_i)/10 = 101,570 − (1450)(673)/10 = 3985.

Therefore, the least-squares estimates of the slope and intercept are

β̂₁ = S_xy/S_xx = 3985/8250 = 0.483
Figure 14-1 Scatter diagram of yield versus temperature.

and

β̂₀ = ȳ − β̂₁x̄ = 67.3 − (0.483)145 = −2.739.

The fitted simple linear regression model is

ŷ = −2.739 + 0.483x.
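As a quick check of these hand calculations, the following sketch (illustrative Python, not part of the text; numpy is assumed) evaluates equations 14-7 through 14-12 for the yield data:

```python
import numpy as np

x = np.array([100, 110, 120, 130, 140, 150, 160, 170, 180, 190])
y = np.array([45, 51, 54, 61, 66, 70, 74, 78, 85, 89])

n = len(x)
Sxx = np.sum(x**2) - np.sum(x)**2 / n        # corrected sum of squares of x
Sxy = np.sum(x*y) - np.sum(x)*np.sum(y) / n  # corrected sum of cross products
b1 = Sxy / Sxx                               # slope estimate, eq. 14-12
b0 = y.mean() - b1 * x.mean()                # intercept estimate, eq. 14-7
print(Sxx, Sxy, round(b1, 3), round(b0, 3))  # 8250.0 3985.0 0.483 -2.739
```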

Since we have only tentatively assumed the straight-line model to be appropriate, we will want to investigate the adequacy of the model. The statistical properties of the least-squares estimators β̂₀ and β̂₁ are useful in assessing model adequacy. The estimators β̂₀ and β̂₁ are random variables, since they are just linear combinations of the y_i, and the y_i are random variables. We will investigate the bias and variance properties of these estimators. Consider first β̂₁. The expected value of β̂₁ is

E(β̂₁) = E(S_xy/S_xx) = (1/S_xx) Σ_{i=1}^n (x_i − x̄)E(y_i) = (1/S_xx) Σ_{i=1}^n (x_i − x̄)(β₀ + β₁x_i) = β₁,

since Σ_{i=1}^n (x_i − x̄) = 0, Σ_{i=1}^n x_i(x_i − x̄) = S_xx, and by assumption E(ε_i) = 0. Thus, β̂₁ is an unbiased estimator of the true slope β₁.

Now consider the variance of β̂₁. Since we have assumed that V(ε_i) = σ², it follows that V(y_i) = σ², and

V(β̂₁) = V(S_xy/S_xx) = (1/S_xx²) V[Σ_{i=1}^n y_i(x_i − x̄)].   (14-13)

The random variables {y_i} are uncorrelated because the {ε_i} are uncorrelated. Therefore, the variance of the sum in equation 14-13 is just the sum of the variances, and the variance of each term in the sum, say V[y_i(x_i − x̄)], is σ²(x_i − x̄)². Thus,

V(β̂₁) = (1/S_xx²) σ² Σ_{i=1}^n (x_i − x̄)² = σ²/S_xx.         (14-14)
Using a similar approach, we can show that

V(β̂₀) = σ²(1/n + x̄²/S_xx)   and   E(β̂₀) = β₀.              (14-15)

Note that β̂₀ is an unbiased estimator of β₀. The covariance of β̂₀ and β̂₁ is not zero; in fact, Cov(β̂₀, β̂₁) = −σ²x̄/S_xx.
It is usually necessary to obtain an estimate of σ². The difference between the observation y_i and the corresponding predicted value ŷ_i, say e_i = y_i − ŷ_i, is called a residual. The sum of the squares of the residuals, or the error sum of squares, is

SS_E = Σ_{i=1}^n e_i² = Σ_{i=1}^n (y_i − ŷ_i)².               (14-16)

A more convenient computing formula for SS_E may be found by substituting the fitted model ŷ_i = β̂₀ + β̂₁x_i into equation 14-16 and simplifying. The result is

SS_E = Σ_{i=1}^n y_i² − nȳ² − β̂₁S_xy,

and if we let Σ_{i=1}^n y_i² − nȳ² = Σ_{i=1}^n (y_i − ȳ)² = S_yy, then we may write SS_E as

SS_E = S_yy − β̂₁S_xy.                                        (14-17)

The expected value of the error sum of squares SS_E is E(SS_E) = (n − 2)σ². Therefore,

σ̂² = SS_E/(n − 2)                                            (14-18)

is an unbiased estimator of σ².
Regression analysis is widely used and frequently misused. There are several common abuses of regression that should be briefly mentioned. Care should be taken in selecting variables with which to construct regression models and in determining the form of the approximating function. It is quite possible to develop statistical relationships among variables that are completely unrelated in a practical sense. For example, one might attempt to relate the shear strength of spot welds with the number of boxes of computer paper used by the data processing department. A straight line may even appear to provide a good fit to the data, but the relationship is an unreasonable one on which to rely. A strong observed association between variables does not necessarily imply that a causal relationship exists between those variables. Designed experiments are the only way to determine causal relationships.

Regression relationships are valid only for values of the independent variable within the range of the original data. The linear relationship that we have tentatively assumed may be valid over the original range of x, but it may be unlikely to remain so as we encounter x values beyond that range. In other words, as we move beyond the range of values of x for which data were collected, we become less certain about the validity of the assumed model. Regression models are not necessarily valid for extrapolation purposes.

Finally, one occasionally feels that the model y = βx + ε is appropriate. The omission of the intercept from this model implies, of course, that y = 0 when x = 0. This is a very strong assumption that often is unjustified. Even when two variables, such as the height and weight of men, would seem to qualify for the use of this model, we would usually obtain a better fit by including the intercept, because of the limited range of data on the independent variable.

14-2 HYPOTHESIS TESTING IN SIMPLE LINEAR REGRESSION

An important part of assessing the adequacy of the simple linear regression model is testing statistical hypotheses about the model parameters and constructing certain confidence intervals. Hypothesis testing is discussed in this section, and Section 14-3 presents methods for constructing confidence intervals. To test hypotheses about the slope and intercept of the regression model, we must make the additional assumption that the error component ε is normally distributed. Thus, the complete assumptions are that the errors are NID(0, σ²) (normally and independently distributed). Later we will discuss how these assumptions can be checked through residual analysis.

Suppose we wish to test the hypothesis that the slope equals a constant, say β₁,₀. The appropriate hypotheses are

H₀: β₁ = β₁,₀,
H₁: β₁ ≠ β₁,₀,                                               (14-19)

where we have assumed a two-sided alternative. Now since the ε_i are NID(0, σ²), it follows directly that the observations y_i are NID(β₀ + β₁x_i, σ²). From equation 14-8 we observe that β̂₁ is a linear combination of the observations y_i. Thus, β̂₁ is a linear combination of independent normal random variables and, consequently, is N(β₁, σ²/S_xx), using the bias and variance properties of β̂₁ from Section 14-1. Furthermore, β̂₁ is independent of MS_E. Then, as a result of the normality assumption, the statistic

t₀ = (β̂₁ − β₁,₀) / √(MS_E/S_xx)                              (14-20)

follows the t distribution with n − 2 degrees of freedom under H₀: β₁ = β₁,₀. We would reject H₀: β₁ = β₁,₀ if

|t₀| > t_{α/2,n−2},                                           (14-21)

where t₀ is computed from equation 14-20. A similar procedure can be used to test hypotheses about the intercept. To test

H₀: β₀ = β₀,₀,
H₁: β₀ ≠ β₀,₀,                                               (14-22)

we would use the statistic

t₀ = (β̂₀ − β₀,₀) / √(MS_E[1/n + x̄²/S_xx])                   (14-23)

and reject the null hypothesis if |t₀| > t_{α/2,n−2}.

A very important special case of the hypothesis of equation 14-19 is

H₀: β₁ = 0,
H₁: β₁ ≠ 0.                                                  (14-24)

This hypothesis relates to the significance of regression. Failing to reject H₀: β₁ = 0 is equivalent to concluding that there is no linear relationship between x and y. This situation is illustrated in Fig. 14-2. Note that this may imply either that x is of little value in explaining the variation in y and that the best estimator of y for any x is ŷ = ȳ (Fig. 14-2a) or that the true relationship between x and y is not linear (Fig. 14-2b). Alternatively, if H₀: β₁ = 0 is rejected, this implies that x is of value in explaining the variability in y. This is illustrated in Fig. 14-3. However, rejecting H₀: β₁ = 0 could mean either that the straight-line model is adequate (Fig. 14-3a) or that even though there is a linear effect of x, better results could be obtained with the addition of higher-order polynomial terms in x (Fig. 14-3b).

The test procedure for H₀: β₁ = 0 may be developed from two approaches. The first approach starts with the following partitioning of the total corrected sum of squares for y:

S_yy = Σ_{i=1}^n (y_i − ȳ)² = Σ_{i=1}^n (ŷ_i − ȳ)² + Σ_{i=1}^n (y_i − ŷ_i)².   (14-25)

Figure 14-2 The hypothesis H₀: β₁ = 0 is not rejected.


Figure 14-3 The hypothesis H₀: β₁ = 0 is rejected.

The two components of S_yy measure, respectively, the amount of variability in the y_i accounted for by the regression line and the residual variation left unexplained by the regression line. We usually call SS_E = Σ_{i=1}^n (y_i − ŷ_i)² the error sum of squares and SS_R = Σ_{i=1}^n (ŷ_i − ȳ)² the regression sum of squares. Thus, equation 14-25 may be written

S_yy = SS_R + SS_E.                                           (14-26)

Comparing equation 14-26 with equation 14-17, we note that the regression sum of squares SS_R is

SS_R = β̂₁S_xy.                                               (14-27)

S_yy has n − 1 degrees of freedom, and SS_R and SS_E have 1 and n − 2 degrees of freedom, respectively.

We may show that E[SS_E/(n − 2)] = σ² and E(SS_R) = σ² + β₁²S_xx, and that SS_E and SS_R are independent. Thus, if H₀: β₁ = 0 is true, the statistic

F₀ = (SS_R/1) / [SS_E/(n − 2)] = MS_R/MS_E                    (14-28)

follows the F_{1,n−2} distribution, and we would reject H₀ if F₀ > F_{α,1,n−2}. The test procedure is usually arranged in an analysis of variance table (or ANOVA), such as Table 14-1.

The test for significance of regression may also be developed from equation 14-20 with β₁,₀ = 0, say

t₀ = β̂₁ / √(MS_E/S_xx).                                      (14-29)

Squaring both sides of equation 14-29, we obtain

t₀² = β̂₁²S_xx/MS_E = β̂₁S_xy/MS_E = MS_R/MS_E.               (14-30)

Table 14-1 Analysis of Variance for Testing Significance of Regression

Source of            Sum of                   Degrees of    Mean
Variation            Squares                  Freedom       Square    F₀
Regression           SS_R = β̂₁S_xy           1             MS_R      MS_R/MS_E
Error or residual    SS_E = S_yy − β̂₁S_xy    n − 2         MS_E
Total                S_yy                     n − 1

Table 14-2 Testing for Significance of Regression, Example 14-2

Source of     Sum of     Degrees of    Mean
Variation     Squares    Freedom       Square     F₀
Regression    1924.87    1             1924.87    2138.74
Error         7.23       8             0.90
Total         1932.10    9

Note that t₀² in equation 14-30 is identical to F₀ in equation 14-28. It is true, in general, that the square of a t random variable with f degrees of freedom is an F random variable with one and f degrees of freedom in the numerator and denominator, respectively. Thus, the test using t₀ is equivalent to the test based on F₀.

Example 14-2
We will test the model developed in Example 14-1 for significance of regression. The fitted model is ŷ = −2.739 + 0.483x, and S_yy is computed as

S_yy = Σ_{i=1}^{10} y_i² − (Σ_{i=1}^{10} y_i)²/10 = 47,225 − (673)²/10 = 1932.10.

The regression sum of squares is

SS_R = β̂₁S_xy = (0.483)(3985) = 1924.87,

and the error sum of squares is

SS_E = S_yy − SS_R = 1932.10 − 1924.87 = 7.23.

The analysis of variance for testing H₀: β₁ = 0 is summarized in Table 14-2. Noting that F₀ = 2138.74 > F₀.₀₁,₁,₈ = 11.26, we reject H₀ and conclude that β₁ ≠ 0.
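The same analysis of variance can be reproduced numerically. The sketch below (illustrative Python; numpy and scipy assumed) computes S_yy, SS_R, SS_E, and F₀ with its P-value for the yield data; carrying full precision gives an F₀ near 2131.6, which matches the Minitab output in Section 14-8 and differs from the hand calculation above only because of rounding.

```python
import numpy as np
from scipy import stats

x = np.array([100, 110, 120, 130, 140, 150, 160, 170, 180, 190])
y = np.array([45, 51, 54, 61, 66, 70, 74, 78, 85, 89])
n = len(x)
Sxx = np.sum(x**2) - np.sum(x)**2 / n
Sxy = np.sum(x*y) - np.sum(x)*np.sum(y) / n
Syy = np.sum(y**2) - np.sum(y)**2 / n
b1 = Sxy / Sxx
SSR = b1 * Sxy                       # regression sum of squares, eq. 14-27
SSE = Syy - SSR                      # error sum of squares, eq. 14-17
F0 = (SSR / 1) / (SSE / (n - 2))     # eq. 14-28
p = stats.f.sf(F0, 1, n - 2)         # P-value of the observed F0
print(round(SSR, 2), round(SSE, 2), round(F0, 1), p)
```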

14-3 INTERVAL ESTIMATION IN SIMPLE LINEAR REGRESSION

In addition to point estimates of the slope and intercept, it is possible to obtain confidence interval estimates of these parameters. The width of these confidence intervals is a measure of the overall quality of the regression line. If the ε_i are normally and independently distributed, then

(β̂₁ − β₁)/√(MS_E/S_xx)   and   (β̂₀ − β₀)/√(MS_E[1/n + x̄²/S_xx])

are both distributed as t with n − 2 degrees of freedom. Therefore, a 100(1 − α)% confidence interval on the slope β₁ is given by

β̂₁ − t_{α/2,n−2} √(MS_E/S_xx) ≤ β₁ ≤ β̂₁ + t_{α/2,n−2} √(MS_E/S_xx).   (14-31)

Similarly, a 100(1 − α)% confidence interval on the intercept β₀ is

β̂₀ − t_{α/2,n−2} √(MS_E[1/n + x̄²/S_xx]) ≤ β₀ ≤ β̂₀ + t_{α/2,n−2} √(MS_E[1/n + x̄²/S_xx]).   (14-32)

Example 14-3
We will find a 95% confidence interval on the slope of the regression line using the data in Example 14-1. Recall that β̂₁ = 0.483, S_xx = 8250, and MS_E = 0.90 (see Table 14-2). Then, from equation 14-31 we find

β̂₁ − t₀.₀₂₅,₈ √(MS_E/S_xx) ≤ β₁ ≤ β̂₁ + t₀.₀₂₅,₈ √(MS_E/S_xx),

or

0.483 − 2.306 √(0.90/8250) ≤ β₁ ≤ 0.483 + 2.306 √(0.90/8250).

This simplifies to

0.459 ≤ β₁ ≤ 0.507.
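A sketch of this interval computation (illustrative Python; scipy assumed), using MS_E = 0.90 and S_xx = 8250 from Example 14-2:

```python
from scipy import stats

n, b1, Sxx, MSE = 10, 0.483, 8250.0, 0.90
t = stats.t.ppf(1 - 0.05/2, n - 2)           # t_{0.025,8} = 2.306
hw = t * (MSE / Sxx) ** 0.5                  # half-width of the interval
print(round(b1 - hw, 3), round(b1 + hw, 3))  # 0.459 and 0.507
```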

A confidence interval may be constructed for the mean response at a specified x, say x₀. This is a confidence interval about E(y|x₀) and is often called a confidence interval about the regression line. Since E(y|x₀) = β₀ + β₁x₀, we may obtain a point estimate of E(y|x₀) from the fitted model as

Ê(y|x₀) = ŷ₀ = β̂₀ + β̂₁x₀.

Now ŷ₀ is an unbiased point estimator of E(y|x₀), since β̂₀ and β̂₁ are unbiased estimators of β₀ and β₁. The variance of ŷ₀ is

V(ŷ₀) = σ²[1/n + (x₀ − x̄)²/S_xx],

and ŷ₀ is normally distributed, as β̂₀ and β̂₁ are normally distributed. Therefore, a 100(1 − α)% confidence interval about the true regression line at x = x₀ may be computed from

ŷ₀ − t_{α/2,n−2} √(MS_E[1/n + (x₀ − x̄)²/S_xx]) ≤ E(y|x₀) ≤ ŷ₀ + t_{α/2,n−2} √(MS_E[1/n + (x₀ − x̄)²/S_xx]).   (14-33)

The width of the confidence interval for E(y|x₀) is a function of x₀. The interval width is a minimum for x₀ = x̄ and widens as |x₀ − x̄| increases. This widening is one reason why using regression to extrapolate is ill-advised.

Example 14-4
We will construct a 95% confidence interval about the regression line for the data in Example 14-1. The fitted model is ŷ₀ = −2.739 + 0.483x₀, and the 95% confidence interval on E(y|x₀) is found from equation 14-33 as

ŷ₀ ± 2.306 √(0.90[1/10 + (x₀ − 145)²/8250]).

The fitted values ŷ₀ and the corresponding 95% confidence limits for the points x₀ = x_i, i = 1, 2, ..., 10, are displayed in Table 14-3. To illustrate the use of this table, we may find the confidence interval on the true mean process yield at x₀ = 140°C (say) as

64.88 − 0.71 ≤ E(y|x₀ = 140) ≤ 64.88 + 0.71

or

64.17 ≤ E(y|x₀ = 140) ≤ 65.59.

The fitted model and the 95% confidence interval about the regression line are shown in Fig. 14-4.

Table 14-3 Confidence Interval about the Regression Line, Example 14-4

x₀               100     110     120     130     140     150     160     170     180     190
ŷ₀               45.56   50.39   55.22   60.05   64.88   69.72   74.55   79.38   84.21   89.04
95% confidence
limits           ±1.30   ±1.10   ±0.93   ±0.79   ±0.71   ±0.71   ±0.79   ±0.93   ±1.10   ±1.30

Figure 14-4 A 95% confidence interval about the regression line for Example 14-4.

14-4 PREDICTION OF NEW OBSERVATIONS

An important application of regression analysis is predicting new or future observations y corresponding to a specified level of the regressor variable x. If x₀ is the value of the regressor variable of interest, then

ŷ₀ = β̂₀ + β̂₁x₀                                              (14-34)

is the point estimate of the new or future value of the response y₀.

Now consider obtaining an interval estimate of this future observation y₀. This new observation is independent of the observations used to develop the regression model. Therefore, the confidence interval about the regression line, equation 14-33, is inappropriate, since it is based only on the data used to fit the regression model. The confidence interval about the regression line refers to the true mean response at x = x₀ (that is, a population parameter), not to future observations.

Let y₀ be the future observation at x = x₀, and let ŷ₀ given by equation 14-34 be the estimator of y₀. Note that the random variable

ψ = y₀ − ŷ₀

is normally distributed with mean zero and variance

V(ψ) = σ²[1 + 1/n + (x₀ − x̄)²/S_xx],

because y₀ is independent of ŷ₀. Thus, the 100(1 − α)% prediction interval on a future observation at x₀ is

ŷ₀ − t_{α/2,n−2} √(MS_E[1 + 1/n + (x₀ − x̄)²/S_xx]) ≤ y₀ ≤ ŷ₀ + t_{α/2,n−2} √(MS_E[1 + 1/n + (x₀ − x̄)²/S_xx]).   (14-35)

Notice that the prediction interval is of minimum width at x₀ = x̄ and widens as |x₀ − x̄| increases. By comparing equation 14-35 with equation 14-33, we observe that the prediction interval at x₀ is always wider than the confidence interval at x₀. This results because the prediction interval depends on both the error from the estimated model and the error associated with future observations (σ²).

We may also find a 100(1 − α)% prediction interval on the mean of k future observations on the response at x = x₀. Let ȳ₀ be the mean of k future observations at x = x₀. The 100(1 − α)% prediction interval on ȳ₀ is

ŷ₀ − t_{α/2,n−2} √(MS_E[1/k + 1/n + (x₀ − x̄)²/S_xx]) ≤ ȳ₀ ≤ ŷ₀ + t_{α/2,n−2} √(MS_E[1/k + 1/n + (x₀ − x̄)²/S_xx]).   (14-36)

To illustrate the construction of a prediction interval, suppose we use the data in Example 14-1 and find a 95% prediction interval on the next observation on the process yield at x₀ = 160°C. Using equation 14-35, we find that the prediction interval is

74.55 − 2.306 √(0.90[1 + 1/10 + (160 − 145)²/8250]) ≤ y₀ ≤ 74.55 + 2.306 √(0.90[1 + 1/10 + (160 − 145)²/8250]),

which simplifies to

72.23 ≤ y₀ ≤ 76.87.
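The prediction interval calculation may be verified as follows (an illustrative Python sketch; numpy and scipy assumed):

```python
import numpy as np
from scipy import stats

n, xbar, Sxx, MSE = 10, 145.0, 8250.0, 0.90
x0 = 160
y0_hat = -2.739 + 0.483 * x0                         # point prediction
t = stats.t.ppf(0.975, n - 2)
hw = t * np.sqrt(MSE * (1 + 1/n + (x0 - xbar)**2 / Sxx))  # eq. 14-35
print(round(y0_hat - hw, 2), round(y0_hat + hw, 2))  # about 72.2 and 76.9
```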

14-5 MEASURING THE ADEQUACY OF THE REGRESSION MODEL

Fitting a regression model requires several assumptions. Estimation of the model parameters requires the assumption that the errors are uncorrelated random variables with mean zero and constant variance. Tests of hypotheses and interval estimation require that the errors be normally distributed. In addition, we assume that the order of the model is correct; that is, if we fit a first-order polynomial, then we are assuming that the phenomenon actually behaves in a first-order manner.

The analyst should always consider the validity of these assumptions to be doubtful and conduct analyses to examine the adequacy of the model that has been tentatively entertained. In this section we discuss methods useful in this respect.

14-5.1 Residual Analysis


We define the residuals as e_i = y_i − ŷ_i, i = 1, 2, ..., n, where y_i is an observation and ŷ_i is the corresponding estimated value from the regression model. Analysis of the residuals is frequently helpful in checking the assumption that the errors are NID(0, σ²) and in determining whether additional terms in the model would be useful.

As an approximate check of normality, the experimenter can construct a frequency histogram of the residuals or plot them on normal probability paper. It requires judgment to assess the nonnormality of such plots. One may also standardize the residuals by computing d_i = e_i/√MS_E, i = 1, 2, ..., n. If the errors are NID(0, σ²), then approximately 95% of the standardized residuals should fall in the interval (−2, +2). Residuals far outside this interval may indicate the presence of an outlier, that is, an observation that is atypical of the rest of the data. Various rules have been proposed for discarding outliers. However, sometimes outliers provide important information about unusual circumstances of interest to the experimenter and should not be discarded. Therefore a detected outlier should be investigated first, then discarded if warranted. For further discussion of outliers, see Montgomery, Peck, and Vining (2001).

It is frequently helpful to plot the residuals (1) in time sequence (if known), (2) against the ŷ_i, and (3) against the independent variable x. These graphs will usually look like one of the four general patterns shown in Fig. 14-5. The pattern in Fig. 14-5a represents normality, while those in Figs. 14-5b, c, and d represent anomalies. If the residuals appear as in Fig. 14-5b, then the variance of the observations may be increasing with time or with the magnitude of the y_i or x_i. If a plot of the residuals against time has the appearance of Fig. 14-5b, then the variance of the observations is increasing with time. Plots against ŷ_i and x_i that look like Fig. 14-5c also indicate inequality of variance. Residual plots that look like Fig. 14-5d indicate model inadequacy; that is, higher-order terms should be added to the model.

Figure 14-5 Patterns for residual plots. (a) Satisfactory, (b) funnel, (c) double bow, (d) nonlinear. [Adapted from Montgomery, Peck, and Vining (2001).]

Example 14-5
The residuals for the regression model in Example 14-1 are computed as follows:

e₁ = 45.00 − 45.56 = −0.56,    e₆ = 70.00 − 69.72 = 0.28,
e₂ = 51.00 − 50.39 = 0.61,     e₇ = 74.00 − 74.55 = −0.55,
e₃ = 54.00 − 55.22 = −1.22,    e₈ = 78.00 − 79.38 = −1.38,
e₄ = 61.00 − 60.05 = 0.95,     e₉ = 85.00 − 84.21 = 0.79,
e₅ = 66.00 − 64.88 = 1.12,     e₁₀ = 89.00 − 89.04 = −0.04.

These residuals are plotted on normal probability paper in Fig. 14-6. Since the residuals fall approximately along a straight line in Fig. 14-6, we conclude that there is no severe departure from normality. The residuals are also plotted against ŷ_i in Fig. 14-7a and against x_i in Fig. 14-7b. These plots indicate no serious model inadequacies.
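The residual computations of this section are easily automated. The sketch below (illustrative Python; numpy and scipy assumed) computes the residuals and standardized residuals for Example 14-5 and obtains the ordered pairs needed for a normal probability plot:

```python
import numpy as np
from scipy import stats

x = np.array([100, 110, 120, 130, 140, 150, 160, 170, 180, 190])
y = np.array([45, 51, 54, 61, 66, 70, 74, 78, 85, 89])
y_hat = -2.739 + 0.483 * x
e = y - y_hat                  # residuals e_i = y_i - y_hat_i
d = e / np.sqrt(0.90)          # standardized residuals, using MSE = 0.90
print(np.round(e, 2))          # compare with Example 14-5 (small rounding differences)
print(np.all(np.abs(d) < 2))   # roughly 95% should fall in (-2, +2)
# Ordered residuals versus normal scores give the normal probability plot:
osm, osr = stats.probplot(e, dist="norm", fit=False)
```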

14-5.2 The Lack-of-Fit Test

Regression models are often fit to data when the true functional relationship is unknown. Naturally, we would like to know whether the order of the model tentatively assumed is correct. This section will describe a test for the validity of this assumption.

Figure 14-6 Normal probability plot of residuals.

Figure 14-7 Residual plots for Example 14-5. (a) Plot against ŷ_i, (b) plot against x_i.

The danger of using a regression model that is a poor approximation of the true functional relationship is illustrated in Fig. 14-8. Obviously, a polynomial of degree two or greater should have been used in this situation.

Figure 14-8 A regression model displaying lack of fit.

We present a test for the "goodness of fit" of the regression model. Specifically, the hypotheses we wish to test are

H₀: The model adequately fits the data,
H₁: The model does not fit the data.

The test involves partitioning the error or residual sum of squares into the two components

SS_E = SS_PE + SS_LOF,

where SS_PE is the sum of squares attributable to "pure" error and SS_LOF is the sum of squares attributable to the lack of fit of the model. To compute SS_PE we must have repeated observations on y for at least one level of x. Suppose that we have n total observations such that

y_{11}, y_{12}, ..., y_{1n_1},   repeated observations at x₁,
y_{21}, y_{22}, ..., y_{2n_2},   repeated observations at x₂,
...
y_{m1}, y_{m2}, ..., y_{mn_m},   repeated observations at x_m.

Note that there are m distinct levels of x. The contribution to the pure-error sum of squares at x₁ (say) would be

Σ_{u=1}^{n_1} (y_{1u} − ȳ₁)².                                 (14-37)

The total sum of squares for pure error would be obtained by summing equation 14-37 over all levels of x as

SS_PE = Σ_{i=1}^m Σ_{u=1}^{n_i} (y_{iu} − ȳ_i)².              (14-38)

There are n_e = Σ_{i=1}^m (n_i − 1) = n − m degrees of freedom associated with the pure-error sum of squares. The sum of squares for lack of fit is simply

SS_LOF = SS_E − SS_PE,                                        (14-39)

with n − 2 − n_e = m − 2 degrees of freedom. The test statistic for lack of fit would then be

F₀ = [SS_LOF/(m − 2)] / [SS_PE/(n − m)],                      (14-40)

and we would reject H₀ if F₀ > F_{α,m−2,n−m}.


This test procedure may be easily introduced into the analysis of variance conducted for the significance of regression. If the null hypothesis of model adequacy is rejected, then the model must be abandoned and attempts made to find a more appropriate model. If H₀ is not rejected, then there is no apparent reason to doubt the adequacy of the model, and MS_PE and MS_LOF are often combined to estimate σ².

Example 14-6
Suppose we have the following data:

x    1.0   1.0   2.0   3.3   3.3   4.0   4.0   4.0   4.7   5.0
y    2.3   1.8   2.8   1.8   3.7   2.6   2.6   2.2   3.2   2.0

x    5.6   5.6   5.6   6.0   6.0   6.5   6.9
y    3.5   2.8   2.1   3.4   3.2   3.4   5.0

We may compute S_yy = 10.97, S_xy = 13.62, S_xx = 52.32, ȳ = 2.847, and x̄ = 4.382. The regression model is ŷ = 1.703 + 0.260x, and the regression sum of squares is SS_R = β̂₁S_xy = (0.260)(13.62) = 3.541. The pure-error sum of squares is computed as follows:

Level of x    Σ(y_u − ȳ_i)²    Degrees of Freedom
1.0           0.1250            1
3.3           1.8050            1
4.0           0.1066            2
5.6           0.9800            2
6.0           0.0200            1
Total:        3.0366            7

The analysis of variance is summarized in Table 14-4. Since F₀ = 1.27 < F₀.₂₅,₈,₇ = 1.70, we cannot reject the hypothesis that the tentative model adequately describes the data. We will pool the lack-of-fit and pure-error mean squares to form the denominator mean square in the test for significance of regression. Also, since F₀ = 7.15 > F₀.₀₅,₁,₁₅ = 4.54, we conclude that β₁ ≠ 0.
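The pure-error/lack-of-fit partition can be computed directly from the data. The following sketch (illustrative Python; numpy assumed) reproduces the quantities of Table 14-4 via equations 14-38 through 14-40, agreeing with the hand calculation up to rounding:

```python
import numpy as np

x = np.array([1.0, 1.0, 2.0, 3.3, 3.3, 4.0, 4.0, 4.0, 4.7, 5.0,
              5.6, 5.6, 5.6, 6.0, 6.0, 6.5, 6.9])
y = np.array([2.3, 1.8, 2.8, 1.8, 3.7, 2.6, 2.6, 2.2, 3.2, 2.0,
              3.5, 2.8, 2.1, 3.4, 3.2, 3.4, 5.0])

n, m = len(x), len(np.unique(x))
# Pure-error sum of squares, eq. 14-38: pooled within-level variation.
SS_PE = sum(((y[x == xi] - y[x == xi].mean())**2).sum() for xi in np.unique(x))
# Least-squares fit and residual sum of squares.
b1 = (np.sum(x*y) - x.sum()*y.sum()/n) / (np.sum(x**2) - x.sum()**2/n)
b0 = y.mean() - b1 * x.mean()
SSE = ((y - (b0 + b1*x))**2).sum()
SS_LOF = SSE - SS_PE                           # eq. 14-39
F0 = (SS_LOF / (m - 2)) / (SS_PE / (n - m))    # eq. 14-40
print(round(SS_PE, 4), round(F0, 2))           # 3.0366 and about 1.27
```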

In fitting a regression model to experimental data, a good practice is to use the lowest degree model that adequately describes the data. The lack-of-fit test may be useful in this

Table 14-4 Analysis of Variance for Example 14-6

Source of        Sum of     Degrees of    Mean
Variation        Squares    Freedom       Square    F₀
Regression       3.541      1             3.541     7.15
Residual         7.429      15            0.495
(Lack of fit)    4.392      8             0.549     1.27
(Pure error)     3.037      7             0.434
Total            10.970     16

respect. However, it is always possible to fit a polynomial of degree n − 1 to n data points, and the experimenter should not consider using a model that is "saturated," that is, that has very nearly as many independent variables as observations on y.

14-5.3 The Coefficient of Determination

The quantity

R² = SS_R/S_yy = 1 − SS_E/S_yy                                (14-41)

is called the coefficient of determination and is often used to judge the adequacy of a regression model. (We will see subsequently that in the case where x and y are jointly distributed random variables, R² is the square of the correlation coefficient between x and y.) Clearly 0 ≤ R² ≤ 1. We often refer loosely to R² as the amount of variability in the data explained or accounted for by the regression model. For the data in Example 14-1, we have R² = SS_R/S_yy = 1924.87/1932.10 = 0.9963; that is, 99.63% of the variability in the data is accounted for by the model.

The statistic R² should be used with caution, since it is always possible to make R² unity simply by adding enough terms to the model. For example, we can obtain a "perfect" fit to n data points with a polynomial of degree n − 1. Also, R² will always increase if we add a variable to the model, but this does not necessarily mean the new model is superior to the old one. Unless the error sum of squares in the new model is reduced by an amount equal to the original error mean square, the new model will have a larger error mean square than the old one, because of the loss of one degree of freedom. Thus the new model will actually be worse than the old one.

There are several misconceptions about R². In general, R² does not measure the magnitude of the slope of the regression line. A large value of R² does not imply a steep slope. Furthermore, R² does not measure the appropriateness of the model, since it can be artificially inflated by adding higher-order polynomial terms. Even if y and x are related in a nonlinear fashion, R² will often be large. For example, R² for the regression equation in Fig. 14-3b will be relatively large, even though the linear approximation is poor. Finally, even though R² is large, this does not necessarily imply that the regression model will provide accurate predictions of future observations.

14-6 TRANSFORMATIONS TO A STRAIGHT LINE

We occasionally find that the straight-line regression model y = β₀ + β₁x + ε is inappropriate because the true regression function is nonlinear. Sometimes this is visually determined from the scatter diagram, and sometimes we know in advance that the model is nonlinear because of prior experience or underlying theory. In some situations a nonlinear function can be expressed as a straight line by using a suitable transformation. Such nonlinear models are called intrinsically linear.

As an example of a nonlinear model that is intrinsically linear, consider the exponential function

y = β₀e^{β₁x}ε.

This function is intrinsically linear, since it can be transformed to a straight line by a logarithmic transformation

ln y = ln β₀ + β₁x + ln ε.

This transformation requires that the transformed error terms ln ε be normally and independently distributed with mean 0 and variance σ².

Another intrinsically linear function is

y = β₀ + β₁(1/x) + ε.

By using the reciprocal transformation z = 1/x, the model is linearized to

y = β₀ + β₁z + ε.

Sometimes several transformations can be employed jointly to linearize a function. For example, consider the function

y = 1/exp(β₀ + β₁x + ε).

Letting y* = 1/y, we have the linearized form

ln y* = β₀ + β₁x + ε.

Several other examples of nonlinear models that are intrinsically linear are given by Daniel and Wood (1980).
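As an illustration of fitting an intrinsically linear model (a Python sketch with made-up data, not from the text; numpy assumed), the exponential function above can be fit by ordinary least squares on ln y:

```python
import numpy as np

# Illustrative data only (roughly y = e^x), not from the text.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.7, 7.4, 20.1, 54.6, 148.4])

z = np.log(y)                    # linearizing transformation
b1, ln_b0 = np.polyfit(x, z, 1)  # straight-line fit to (x, ln y)
print(np.exp(ln_b0), b1)         # estimates of beta_0 and beta_1
```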

14-7 CORRELATION
Our development of regression analysis thus far has assumed that x is a mathematical variable, measured with negligible error, and that y is a random variable. Many applications of regression analysis involve situations where both x and y are random variables. In these situations, it is usually assumed that the observations (y_i, x_i), i = 1, 2, ..., n, are jointly distributed random variables obtained from the distribution f(y, x). For example, suppose we wish to develop a regression model relating the shear strength of spot welds to the weld diameter. In this example, weld diameter cannot be controlled. We would randomly select n spot welds and observe a diameter (x_i) and a shear strength (y_i) for each. Therefore, (y_i, x_i) are jointly distributed random variables.

We usually assume that the joint distribution of y_i and x_i is the bivariate normal distribution. That is,

f(y, x) = [1/(2πσ₁σ₂√(1 − ρ²))] exp{−[1/(2(1 − ρ²))][((y − μ₁)/σ₁)² + ((x − μ₂)/σ₂)² − 2ρ(y − μ₁)(x − μ₂)/(σ₁σ₂)]},   (14-42)

where μ₁ and σ₁² are the mean and variance of y, μ₂ and σ₂² are the mean and variance of x, and ρ is the correlation coefficient between y and x. Recall from Chapter 4 that the correlation coefficient is defined as

ρ = σ₁₂/(σ₁σ₂),

where σ₁₂ is the covariance between y and x.

The conditional distribution of y for a given value of x is (see Chapter 7)

f(y|x) = [1/(√(2π) σ_{y·x})] exp{−(1/2)[(y − β₀ − β₁x)/σ_{y·x}]²},   (14-43)

where

β₀ = μ₁ − μ₂ρ(σ₁/σ₂),                                        (14-44a)

β₁ = ρ(σ₁/σ₂),                                               (14-44b)

and

σ²_{y·x} = σ₁²(1 − ρ²).                                      (14-44c)

That is, the conditional distribution of y given x is normal with mean

E(y|x) = β₀ + β₁x                                            (14-45)

and variance σ²_{y·x}. Note that the mean of the conditional distribution of y given x is a straight-line regression model. Furthermore, there is a relationship between the correlation coefficient ρ and the slope β₁. From equation 14-44b we see that if ρ = 0, then β₁ = 0, which implies that there is no regression of y on x. That is, knowledge of x does not assist us in predicting y.

The method of maximum likelihood may be used to estimate the parameters β₀ and β₁. It may be shown that the maximum likelihood estimators of these parameters are

β̂₀ = ȳ − β̂₁x̄                                               (14-46a)

and

β̂₁ = Σ_{i=1}^n y_i(x_i − x̄) / Σ_{i=1}^n (x_i − x̄)² = S_xy/S_xx.   (14-46b)

We note that the estimators of the intercept and slope in equation 14-46 are identical to those given by the method of least squares in the case where x was assumed to be a mathematical variable. That is, the regression model with y and x jointly normally distributed is equivalent to the model with x considered as a mathematical variable. This follows because the random variables y given x are independently and normally distributed with mean β₀ + β₁x and constant variance σ²_{y·x}. These results will also hold for any joint distribution of y and x such that the conditional distribution of y given x is normal.

It is possible to draw inferences about the correlation coefficient ρ in this model. The estimator of ρ is the sample correlation coefficient

r = Σ_{i=1}^n y_i(x_i − x̄) / [S_xx S_yy]^{1/2} = S_xy/(S_xx S_yy)^{1/2}.   (14-47)

Note that

β̂₁ = (S_yy/S_xx)^{1/2} r,                                    (14-48)

so the slope β̂₁ is just the sample correlation coefficient r multiplied by a scale factor that is the square root of the "spread" of the y values divided by the "spread" of the x values. Thus β̂₁ and r are closely related, although they provide somewhat different information. The sample correlation coefficient r measures the linear association between y and x, while β̂₁ measures the predicted change in the mean of y for a unit change in x. In the case of a mathematical variable x, r has no meaning because the magnitude of r depends on the choice of spacing for x. We may also write, from equation 14-48,

r² = β̂₁²(S_xx/S_yy) = β̂₁S_xy/S_yy = SS_R/S_yy,

which we recognize from equation 14-41 as the coefficient of determination. That is, the coefficient of determination R² is just the square of the sample correlation coefficient between y and x.

It is often useful to test the hypothesis

H₀: ρ = 0,
H₁: ρ ≠ 0.                                                   (14-49)

The appropriate test statistic for this hypothesis is

t₀ = r√(n − 2)/√(1 − r²),                                     (14-50)

which follows the t distribution with n − 2 degrees of freedom if H₀: ρ = 0 is true. Therefore, we would reject the null hypothesis if |t₀| > t_{α/2,n−2}. This test is equivalent to the test of the hypothesis H₀: β₁ = 0 given in Section 14-2. This equivalence follows directly from equation 14-48.
The test procedure for the hypothesis

H₀: ρ = ρ₀,
H₁: ρ ≠ ρ₀,                                                  (14-51)

where ρ₀ ≠ 0, is somewhat more complicated. For moderately large samples (say n ≥ 25) the statistic

Z = arctanh r = (1/2) ln[(1 + r)/(1 − r)]                     (14-52)

is approximately normally distributed with mean

μ_Z = arctanh ρ = (1/2) ln[(1 + ρ)/(1 − ρ)]

and variance

σ_Z² = (n − 3)^{−1}.

Therefore, to test the hypothesis H₀: ρ = ρ₀, we may compute the statistic

Z₀ = (arctanh r − arctanh ρ₀)(n − 3)^{1/2}                    (14-53)

and reject H₀: ρ = ρ₀ if |Z₀| > Z_{α/2}.

It is also possible to construct a 100(1 − α)% confidence interval for ρ using the transformation in equation 14-52. The 100(1 − α)% confidence interval is

tanh(arctanh r − Z_{α/2}/√(n − 3)) ≤ ρ ≤ tanh(arctanh r + Z_{α/2}/√(n − 3)),   (14-54)

where tanh u = (e^u − e^{−u})/(e^u + e^{−u}).

Example 14-7
Montgomery, Peck, and Vining (2001) describe an application of regression analysis in which an engineer at a soft-drink bottler is investigating the product distribution and route service operations for vending machines. She suspects that the time required to load and service a machine is related to the number of cases of product delivered. A random sample of 25 retail outlets having vending machines is selected, and the in-outlet delivery time (in minutes) and volume of product delivered (in cases) is observed for each outlet. The data are shown in Table 14-5. We assume that delivery time and volume of product delivered are jointly normally distributed.

Using the data in Table 14-5, we may calculate

S_yy = 6105.9447,   S_xx = 698.5600,   S_xy = 2027.7132.

The regression model is

ŷ = 5.1145 + 2.9027x.

The sample correlation coefficient between x and y is computed from equation 14-47 as

r = S_xy/[S_xx S_yy]^{1/2} = 2027.7132/[(698.5600)(6105.9447)]^{1/2} = 0.9818.

Table 14-5 Data for Example 14-7

              Delivery    Number of                  Delivery    Number of
Observation   Time (y)    Cases (x)    Observation   Time (y)    Cases (x)
1             9.95        2            14            11.66       2
2             24.45       8            15            21.65       4
3             31.75       11           16            17.89       4
4             35.00       10           17            69.00       20
5             25.02       8            18            10.30       1
6             16.86       4            19            34.93       10
7             14.38       2            20            46.59       15
8             9.60        2            21            44.88       15
9             24.35       9            22            54.12       16
10            27.50       8            23            56.63       17
11            17.08       4            24            22.13       6
12            37.00       11           25            21.15       5
13            41.95       12


Note that R² = (0.9818)² = 0.9640; that is, approximately 96.40% of the variability in delivery time is explained by the linear relationship with delivery volume. To test the hypotheses

H₀: ρ = 0,
H₁: ρ ≠ 0,

we can compute the test statistic of equation 14-50 as follows:

t_0 = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} = \frac{0.9818\sqrt{23}}{\sqrt{1-0.9640}} = 24.80.

Since t₀.₀₂₅,₂₃ = 2.069, we reject H₀ and conclude that the correlation coefficient ρ ≠ 0. Finally, we may construct an approximate 95% confidence interval on ρ from equation 14-54. Since arctanh r = arctanh 0.9818 = 2.3452, equation 14-54 becomes

\tanh\left( 2.3452 - \frac{1.96}{\sqrt{22}} \right) \le \rho \le \tanh\left( 2.3452 + \frac{1.96}{\sqrt{22}} \right),

which reduces to

0.9585 ≤ ρ ≤ 0.9921.
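The computations in Example 14-7 are easy to script. The following Python sketch (our addition, not part of the text) reproduces the test statistic of equation 14-50 and the Fisher z-transform interval of equation 14-54 from the summary values r = 0.9818 and n = 25; it assumes SciPy is available for the t and normal quantiles.

  import math
  from scipy import stats  # assumed available for the t and normal quantiles

  r, n = 0.9818, 25

  # Test H0: rho = 0 using equation 14-50
  t0 = r * math.sqrt(n - 2) / math.sqrt(1 - r**2)
  t_crit = stats.t.ppf(0.975, n - 2)               # t_{0.025,23} = 2.069
  print(t0, abs(t0) > t_crit)                      # approx. 24.8, True

  # Approximate 95% confidence interval on rho using equation 14-54
  z = math.atanh(r)                                # arctanh 0.9818 = 2.3452
  half = stats.norm.ppf(0.975) / math.sqrt(n - 3)
  print(math.tanh(z - half), math.tanh(z + half))  # approx. 0.9585, 0.9921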

14-8 SAMPLE COMPUTER OUTPUT


Many of the procedures presented in this chapter can be implemented using statistical software. In this section, we present the Minitab® output for the data in Example 14-1. Recall that Example 14-1 provides data on the effect of process operating temperature on product yield. The Minitab® output is

  The regression equation is
  Yield = -2.74 + 0.483 Temp

  Predictor      Coef      SE Coef       T        P
  Constant     -2.739       1.546      -1.77    0.114
  Temp          0.48303     0.01046    46.17    0.000

  S = 0.9503   R-Sq = 99.6%   R-Sq(adj) = 99.6%

  Analysis of Variance
  Source           DF       SS        MS         F        P
  Regression        1     1924.9    1924.9    2131.57   0.000
  Residual Error    8        7.2       0.9
  Total             9     1932.1

The regression equation is provided along with the results from the t-tests on the individual coefficients. The P-values indicate that the intercept does not appear to be significant (P-value = 0.114), while the regressor variable, temperature, is statistically significant (P-value = 0.000). The analysis of variance also tests the hypothesis H₀: β₁ = 0, which can be rejected (P-value = 0.000). Note also that t = 46.17 for temperature, and t² = (46.17)² = 2131.67 ≈ F. Aside from rounding, the computer results are in agreement with those found earlier in the chapter.

14-9 SUMMARY
This chapter has introduced the simple linear regression model and shown how least-squares estimates of the model parameters may be obtained. Hypothesis-testing procedures and confidence interval estimates of the model parameters have also been developed. Tests of hypotheses and confidence intervals require the assumption that the observations y are normally and independently distributed random variables. Procedures for testing model adequacy, including a lack-of-fit test and residual analysis, were presented. The correlation model was also introduced to deal with the case where x and y are jointly normally distributed. The equivalence of the regression model parameter estimation problem for the case where x and y are jointly normal to the case where x is a mathematical variable was also discussed. Procedures for obtaining point and interval estimates of the correlation coefficient and for testing hypotheses about the correlation coefficient were developed.

14-10 EXERCISES

14-1. Montgomery, Peck, and Vining (2001) present data concerning the performance of the 28 National Football League teams in 1976. It is suspected that the number of games won (y) is related to the number of yards gained rushing by an opponent (x). The data are shown below.
(a) Fit a linear regression model relating games won to yards gained by an opponent.
(b) Test for significance of regression.
(c) Find a 95% confidence interval for the slope.
(d) What percentage of total variability is explained by the model?
(e) Find the residuals and prepare appropriate residual plots.

Teams               Games Won (y)   Yards Rushing by Opponent (x)
Washington               10               2205
Minnesota                11               2096
New England              11               1847
Oakland                  13               1903
Pittsburgh               10               1457
Baltimore                11               1848
Los Angeles              10               1564
Dallas                   11               1821
Atlanta                   4               2577
Buffalo                   2               2476
Chicago                   7               1984
Cincinnati               10               1917
Cleveland                 9               1761
Denver                    9               1709
Detroit                   6               1901
Green Bay                 5               2288
Houston                   5               2072
Kansas City               5               2861
Miami                     6               2411
New Orleans               4               2289
New York Giants           3               2203
New York Jets             3               2592
Philadelphia              4               2053
St. Louis                10               1979
San Diego                 6               2048
San Francisco             8               1786
Seattle                   2               2876
Tampa Bay                 0               2560

14-2. Suppose we would like to use the model developed in Exercise 14-1 to predict the number of games a team will win if it can limit the opponents to 1800 yards rushing. Find a point estimate of the number of games won if the opponents gain only 1800 yards rushing. Find a 95% prediction interval on the number of games won.

14-3. Motor Trend magazine frequently presents performance data for automobiles. The table below presents data from the 1975 volume of Motor Trend concerning the gasoline mileage performance and the engine displacement for 15 automobiles.

Automobile      Miles/Gallon (y)   Displacement (Cubic in.) (x)
Apollo               18.90              350
Omega                17.00              350
Nova                 20.00              250
Monarch              18.25              351
Duster               20.01              225
Jensen Conv.         11.20              440
Skyhawk              22.12              231
Monza                21.47              262
Corolla SR-5         30.40               96.9
Camaro               16.50              350
Eldorado             14.39              500
Trans Am             16.59              400
Charger SE           19.73              318
Cougar               13.90              351
Corvette             16.50              350


(a) Fit a regression model relating mileage performance to engine displacement.
(b) Test for significance of regression.
(c) What percentage of total variability in mileage is explained by the model?
(d) Find a 90% confidence interval on the mean mileage if the engine displacement is 275 cubic inches.

14-4. Suppose that we wish to predict the gasoline mileage from a car with a 275 cubic inch displacement engine. Find a point estimate, using the model developed in Exercise 14-3, and an appropriate 90% interval estimate. Compare this interval to the one obtained in Exercise 14-3d. Which one is wider, and why?

14-5. Find the residuals from the model in Exercise 14-3. Prepare appropriate residual plots and comment on model adequacy.

14-6. An article in Technometrics by S. C. Narula and J. F. Wellington ("Prediction, Linear Regression, and a Minimum Sum of Relative Errors," Vol. 19, 1977) presents data on the selling price and annual taxes for 27 houses. The data are shown below.
(a) Fit a regression model relating sales price to taxes paid.
(b) Test for significance of regression.
(c) What percentage of the variability in selling price is explained by the taxes paid?
(d) Find the residuals for this model. Construct a normal probability plot for the residuals. Plot the residuals versus ŷ and versus x. Does the model seem satisfactory?

Sale Price / 1000    Taxes (Local, School, County) / 1000
     25.9                 4.9176
     29.5                 5.0208
     27.9                 4.5429
     25.9                 4.5573
     29.9                 5.0597
     29.9                 3.8910
     30.9                 5.8980
     28.9                 5.6039
     35.9                 5.8282
     31.5                 5.3003
     31.0                 6.2712
     30.9                 5.9592
     30.0                 5.0500
     36.9                 8.2464
     41.9                 6.6969
     40.5                 7.7841
     43.9                 9.0384
     37.5                 5.9894
     37.9                 7.5422
     44.5                 8.7951
     37.9                 6.0831
     38.9                 8.3607
     36.9                 8.1400
     45.8                 9.1416

14-7. The strength of paper used in the manufacture of cardboard boxes (y) is related to the percentage of hardwood concentration in the original pulp (x). Under controlled conditions, a pilot plant manufactures 16 samples, each from a different batch of pulp, and measures the tensile strength. The data are shown here.

y   101.4  117.4  117.1  106.2  131.9  146.9  146.8  133.9
x     1.0    1.5    1.5    1.5    2.0    2.0    2.2    2.4

y   111.3  123.0  125.1  145.2  134.3  144.5  143.7  146.9
x     2.5    2.5    2.8    2.8    3.0    3.0    3.2    3.3

(a) Fit a simple linear regression model to the data.
(b) Test for lack of fit and significance of regression.
(c) Construct a 90% confidence interval on the slope β₁.
(d) Construct a 90% confidence interval on the intercept β₀.
(e) Construct a 95% confidence interval on the true regression line at x = 2.5.

14-8. Compute the residuals for the regression model in Exercise 14-7. Prepare appropriate residual plots and comment on model adequacy.

14-9. The number of pounds of steam used per month by a chemical plant is thought to be related to the average ambient temperature for that month. The past year's usage and temperatures are shown in the following table.

Month    Temp.    Usage / 1000
Jan.       21        185.79
Feb.       24        214.47
Mar.       32        288.03
Apr.       47        424.84
May        50        454.58
June       59        539.03
July       68        621.55
Aug.       74        675.06
Sept.      62        562.03
Oct.       50        452.93
Nov.       41        369.95
Dec.       30        273.98


(a) Fit a simple linear regression model to the data.
(b) Test for significance of regression.
(c) Test the hypothesis that the slope β₁ = 10.
(d) Construct a 99% confidence interval about the true regression line at x = 58.
(e) Construct a 99% prediction interval on the steam usage in the next month having a mean ambient temperature of 58°.

14-10. Compute the residuals for the regression model in Exercise 14-9. Prepare appropriate residual plots and comment on model adequacy.

14-11. The percentage of impurity in oxygen gas produced by a distilling process is thought to be related to the percentage of hydrocarbon in the main condenser of the processor. One month's operating data are available, as shown in the following table.

Purity (%)        86.91  89.85  90.28  86.34  92.58  87.33  86.29  91.86  95.61  89.86
Hydrocarbon (%)    1.02   1.11   1.43   1.11   1.01   0.95   1.11   0.87   1.43   1.02

Purity (%)        96.73  99.42  98.66  96.07  93.65  87.31  95.00  96.85  85.20  90.56
Hydrocarbon (%)    1.46   1.55   1.55   1.55   1.40   1.15   1.01   0.99   0.95   0.98

(a) Fit a simple linear regression model to the data.
(b) Test for lack of fit and significance of regression.
(c) Calculate R² for this model.
(d) Calculate a 95% confidence interval for the slope β₁.

14-12. Compute the residuals for the data in Exercise 14-11.
(a) Plot the residuals on normal probability paper and draw appropriate conclusions.
(b) Plot the residuals against ŷ and x. Interpret these displays.

14-13. An article in Transportation Research (1999, p. 183) presents a study on world maritime employment. The purpose of the study was to determine a relationship between average manning level and the average size of the fleet. Manning level refers to the ratio of number of posts that must be manned by a seaman per ship (posts/ship). Data collected for ships of the United Kingdom over a 16-year period are shown in the following table.

Average Size    Manning Level
    9154            20.27
    9277            19.98
    9221            20.28
    9198            19.65
    8705            18.81
    8530            18.20
    8544            18.05
    7964            16.81
    7440            15.56
    6432            13.98
    6032            14.51
    5125            10.99
    4418            12.83
    4327            11.85
    4133            11.33
    3765            10.25

(a) Fit a linear regression model relating average manning level to average ship size.
(b) Test for significance of regression.
(c) Find a 95% confidence interval on the slope.
(d) What percentage of total variability is explained by the model?
(e) Find the residuals and construct appropriate residual plots.

14-14. The final averages for 20 randomly selected students taking a course in engineering statistics and a course in operations research at Georgia Tech are shown here. Assume that the final averages are jointly normally distributed.

[Table of paired Statistics and OR final averages for the 20 students.]

(a) Find the regression line relating the statistics final average to the OR final average.
(b) Estimate the correlation coefficient.
(c) Test the hypothesis that ρ = 0.
(d) Test the hypothesis that ρ = 0.5.
(e) Construct a 95% confidence interval estimate of the correlation coefficient.

14-15. The weight and systolic blood pressure of 26 randomly selected males in the age group 25-30 are

shown in the following table. Assume that weight and blood pressure are jointly normally distributed.

Subject   Weight   Systolic BP
   1        165        130
   2        167        133
   3        180        150
   4        155        128
   5        212        151
   6        175        146
   7        190        150
   8        210        140
   9        200        148
  10        149        125
  11        158        133
  12        169        135
  13        170        150
  14        172        153
  15        159        128
  16        168        132
  17        174        149
  18        183        158
  19        215        150
  20        195        163
  21        180        156
  22        143        124
  23        240        170
  24        235        165
  25        192        160
  26        187        159

(a) Find a regression line relating systolic blood pressure to weight.
(b) Estimate the correlation coefficient.
(c) Test the hypothesis that ρ = 0.
(d) Test the hypothesis that ρ = 0.6.
(e) Construct a 95% confidence interval estimate of the correlation coefficient.

14-16. Consider the simple linear regression model y = β₀ + β₁x + ε. Show that E(MS_R) = σ² + β₁²S_xx.

14-17. Suppose that we have assumed the straight-line regression model

y = β₀ + β₁x₁ + ε,

but that the response is affected by a second variable, x₂, such that the true regression function is

E(y) = β₀ + β₁x₁ + β₂x₂.

Is the estimator of the slope in the simple linear regression model unbiased?

14-18. Suppose that we are fitting a straight line and we wish to make the variance of the slope β̂₁ as small as possible. Where should the observations xᵢ, i = 1, 2, ..., n, be taken so as to minimize V(β̂₁)? Discuss the practical implications of this allocation of the xᵢ.

14-19. Weighted Least Squares. Suppose that we are fitting the straight line y = β₀ + β₁x + ε but that the variance of the y values now depends on the level of x; that is,

\sigma_i^2 = \frac{\sigma^2}{w_i}, \quad i = 1, 2, \ldots, n,

where the wᵢ are unknown constants, often called weights. Show that the resulting least-squares normal equations are

\hat{\beta}_0 \sum_{i=1}^{n} w_i + \hat{\beta}_1 \sum_{i=1}^{n} w_i x_i = \sum_{i=1}^{n} w_i y_i,

\hat{\beta}_0 \sum_{i=1}^{n} w_i x_i + \hat{\beta}_1 \sum_{i=1}^{n} w_i x_i^2 = \sum_{i=1}^{n} w_i x_i y_i.

14-20. Consider the data shown below. Suppose that the relationship between y and x is hypothesized to be y = (β₀ + β₁x + ε)⁻¹. Fit an appropriate model to the data. Does the assumed model form seem appropriate?

x   ...    6
y   ...   0.24

14-21. Consider the weight and blood pressure data in Exercise 14-15. Fit a no-intercept model to the data, and compare it to the model obtained in Exercise 14-15. Which model is superior?

14-22. The following data, adapted from Montgomery, Peck, and Vining (2001), present the number of certified mental defectives per 10,000 of estimated population in the United Kingdom (y) and the number of radio receiver licenses issued (x) by the BBC (in millions) for the years 1924-1937. Fit a regression model relating y to x. Comment on the model. Specifically, does the existence of a strong correlation imply a cause-and-effect relationship?

Year    Number of Certified Mental Defectives per 10,000 of Estimated U.K. Population (y)    Number of Radio Receiver Licenses Issued (Millions) in the U.K. (x)
1924          8          1.350
1925          8          1.960
1926          9          2.270
1927         10          2.483
1928         11          2.730
1929         11          3.091
1930         12          3.674
1931         16          4.620
1932         18          5.497
1933         19          6.260
1934         20          7.012
1935         21          7.618
1936         22          8.131
1937         23          8.593

Chapter 15

Multiple Regression
Many regression problems involve more than one regressor variable. Such models are called multiple regression models. Multiple regression is one of the most widely used statistical techniques. This chapter presents the basic techniques of parameter estimation, confidence interval estimation, and model adequacy checking for multiple regression. We also introduce some of the special problems often encountered in the practical use of multiple regression, including model building and variable selection, autocorrelation in the errors, and multicollinearity or near-linear dependence among the regressors.

15-1 MULTIPLE REGRESSION MODELS


A regression model that involves more than one regressor variable is called a multiple regression model. As an example, suppose that the effective life of a cutting tool depends on the cutting speed and the tool angle. A multiple regression model that might describe this relationship is

y = β₀ + β₁x₁ + β₂x₂ + ε,    (15-1)

where y represents the tool life, x₁ represents the cutting speed, and x₂ represents the tool angle. This is a multiple linear regression model with two regressors. The term "linear" is used because equation 15-1 is a linear function of the unknown parameters β₀, β₁, and β₂. Note that the model describes a plane in the two-dimensional x₁, x₂ space. The parameter β₀ defines the intercept of the plane. We sometimes call β₁ and β₂ partial regression coefficients, because β₁ measures the expected change in y per unit change in x₁ when x₂ is held constant, and β₂ measures the expected change in y per unit change in x₂ when x₁ is held constant.

In general, the dependent variable or response y may be related to k independent variables. The model

y = β₀ + β₁x₁ + β₂x₂ + ... + βₖxₖ + ε    (15-2)

is called a multiple linear regression model with k independent variables. The parameters βⱼ, j = 0, 1, ..., k, are called the regression coefficients. This model describes a hyperplane in the k-dimensional space of the regressor variables {xⱼ}. The parameter βⱼ represents the expected change in response y per unit change in xⱼ when all the remaining independent variables xᵢ (i ≠ j) are held constant. The parameters βⱼ, j = 1, 2, ..., k, are often called partial regression coefficients, because they describe the partial effect of one independent variable when the other independent variables in the model are held constant.

Multiple linear regression models are often used as approximating functions. That is, the true functional relationship between y and x₁, x₂, ..., xₖ is unknown, but over certain ranges of the independent variables the linear regression model is an adequate approximation.


Models that are more complex in appearance than equation 15-2 may often still be analyzed by multiple linear regression techniques. For example, consider the cubic polynomial model in one independent variable,

y = β₀ + β₁x + β₂x² + β₃x³ + ε.    (15-3)

If we let x₁ = x, x₂ = x², and x₃ = x³, then equation 15-3 can be written

y = β₀ + β₁x₁ + β₂x₂ + β₃x₃ + ε,    (15-4)

which is a multiple linear regression model with three regressor variables. Models that include interaction effects may also be analyzed by multiple linear regression methods. For example, suppose that the model is

y = β₀ + β₁x₁ + β₂x₂ + β₁₂x₁x₂ + ε.    (15-5)

If we let x₃ = x₁x₂ and β₃ = β₁₂, then equation 15-5 can be written

y = β₀ + β₁x₁ + β₂x₂ + β₃x₃ + ε,    (15-6)

which is a linear regression model. In general, any regression model that is linear in the parameters (the β's) is a linear regression model, regardless of the shape of the surface that it generates.

15-2 ESTIMATION OF THE PARAMETERS


The method of least squares may be used to estimate the regression coefficients in equation 15-2. Suppose that n > k observations are available, and let x_ij denote the ith observation or level of variable x_j. The data will appear as in Table 15-1. We assume that the error term ε in the model has E(ε) = 0, V(ε) = σ², and that the {εᵢ} are uncorrelated random variables.

We may write the model, equation 15-2, in terms of the observations,

y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + \varepsilon_i = \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij} + \varepsilon_i, \quad i = 1, 2, \ldots, n.    (15-7)

The least-squares function is

L = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} \left( y_i - \beta_0 - \sum_{j=1}^{k} \beta_j x_{ij} \right)^2.    (15-8)

Table 15-1 Data for Multiple Linear Regression

y     x₁     x₂    ...    xₖ
y₁    x₁₁    x₁₂   ...    x₁ₖ
y₂    x₂₁    x₂₂   ...    x₂ₖ
⋮     ⋮      ⋮            ⋮
yₙ    xₙ₁    xₙ₂   ...    xₙₖ

The function L is to be minimized with respect to β₀, β₁, ..., βₖ. The least-squares estimators of β₀, β₁, ..., βₖ must satisfy

\frac{\partial L}{\partial \beta_0} = -2 \sum_{i=1}^{n} \left( y_i - \hat{\beta}_0 - \sum_{j=1}^{k} \hat{\beta}_j x_{ij} \right) = 0    (15-9a)

and

\frac{\partial L}{\partial \beta_j} = -2 \sum_{i=1}^{n} \left( y_i - \hat{\beta}_0 - \sum_{j=1}^{k} \hat{\beta}_j x_{ij} \right) x_{ij} = 0, \quad j = 1, 2, \ldots, k.    (15-9b)

Simplifying equation 15-9, we obtain the least-squares normal equations

n\hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^{n} x_{i1} + \hat{\beta}_2 \sum_{i=1}^{n} x_{i2} + \cdots + \hat{\beta}_k \sum_{i=1}^{n} x_{ik} = \sum_{i=1}^{n} y_i,

\hat{\beta}_0 \sum_{i=1}^{n} x_{i1} + \hat{\beta}_1 \sum_{i=1}^{n} x_{i1}^2 + \hat{\beta}_2 \sum_{i=1}^{n} x_{i1}x_{i2} + \cdots + \hat{\beta}_k \sum_{i=1}^{n} x_{i1}x_{ik} = \sum_{i=1}^{n} x_{i1}y_i,

⋮

\hat{\beta}_0 \sum_{i=1}^{n} x_{ik} + \hat{\beta}_1 \sum_{i=1}^{n} x_{ik}x_{i1} + \hat{\beta}_2 \sum_{i=1}^{n} x_{ik}x_{i2} + \cdots + \hat{\beta}_k \sum_{i=1}^{n} x_{ik}^2 = \sum_{i=1}^{n} x_{ik}y_i.    (15-10)

Note that there are p = k + 1 normal equations, one for each of the unknown regression coefficients. The solution to the normal equations will be the least-squares estimators of the regression coefficients β̂₀, β̂₁, ..., β̂ₖ.

It is simpler to solve the normal equations if they are expressed in matrix notation. We now give a matrix development of the normal equations that parallels the development of equation 15-10. The model in terms of the observations, equation 15-7, may be written in matrix notation,

y = Xβ + ε,

where

y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad
X = \begin{bmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1k} \\ 1 & x_{21} & x_{22} & \cdots & x_{2k} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{nk} \end{bmatrix}, \quad
\beta = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix}, \quad
\varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}.

In general, y is an (n × 1) vector of the observations, X is an (n × p) matrix of the levels of the independent variables, β is a (p × 1) vector of the regression coefficients, and ε is an (n × 1) vector of random errors.

We wish to find the vector of least-squares estimators, β̂, that minimizes

L = \sum_{i=1}^{n} \varepsilon_i^2 = \varepsilon'\varepsilon = (y - X\beta)'(y - X\beta).

Note that L may be expressed as

L = y'y - \beta'X'y - y'X\beta + \beta'X'X\beta = y'y - 2\beta'X'y + \beta'X'X\beta,    (15-11)

since β'X'y is a (1 × 1) matrix, hence a scalar, and its transpose (β'X'y)' = y'Xβ is the same scalar. The least-squares estimators must satisfy

\frac{\partial L}{\partial \beta}\bigg|_{\hat{\beta}} = -2X'y + 2X'X\hat{\beta} = 0,

which simplifies to

X'X\hat{\beta} = X'y.    (15-12)

Equations 15-12 are the least-squares normal equations. They are identical to equations 15-10. To solve the normal equations, multiply both sides of equation 15-12 by the inverse of X'X. Thus, the least-squares estimator of β is

\hat{\beta} = (X'X)^{-1}X'y.    (15-13)
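As a computational aside (ours, not the book's), equation 15-13 is rarely evaluated by forming (X'X)⁻¹ explicitly; solving the linear system of equation 15-12 directly is numerically safer. A minimal NumPy sketch:

  import numpy as np

  def least_squares(X: np.ndarray, y: np.ndarray) -> np.ndarray:
      """Solve the normal equations X'X beta = X'y (equation 15-12)."""
      return np.linalg.solve(X.T @ X, X.T @ y)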
It is easy to see that the matrix form of the normal equations is identical to the scalar form. Writing out equation 15-12 in detail, we obtain

\begin{bmatrix}
n & \sum x_{i1} & \sum x_{i2} & \cdots & \sum x_{ik} \\
\sum x_{i1} & \sum x_{i1}^2 & \sum x_{i1}x_{i2} & \cdots & \sum x_{i1}x_{ik} \\
\vdots & \vdots & \vdots & & \vdots \\
\sum x_{ik} & \sum x_{ik}x_{i1} & \sum x_{ik}x_{i2} & \cdots & \sum x_{ik}^2
\end{bmatrix}
\begin{bmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \\ \vdots \\ \hat{\beta}_k \end{bmatrix}
=
\begin{bmatrix} \sum y_i \\ \sum x_{i1}y_i \\ \vdots \\ \sum x_{ik}y_i \end{bmatrix},

where all sums run from i = 1 to n. If the indicated matrix multiplication is performed, the scalar form of the normal equations (that is, equation 15-10) will result. In this form it is easy to see that X'X is a (p × p) symmetric matrix and X'y is a (p × 1) column vector. Note the special structure of the X'X matrix. The diagonal elements of X'X are the sums of squares of the elements in the columns of X, and the off-diagonal elements are the sums of cross products of the elements in the columns of X. Furthermore, note that the elements of X'y are the sums of cross products of the columns of X and the observations {yᵢ}.
The fitted regression model is

\hat{y} = X\hat{\beta}.    (15-14)

In scalar notation, the fitted model is

\hat{y}_i = \hat{\beta}_0 + \sum_{j=1}^{k} \hat{\beta}_j x_{ij}, \quad i = 1, 2, \ldots, n.

The difference between the observation yᵢ and the fitted value ŷᵢ is a residual, say eᵢ = yᵢ - ŷᵢ. The (n × 1) vector of residuals is denoted

e = y - \hat{y}.    (15-15)

Example 15-1
An article in the Journal of Agricultural Engineering and Research (2001, p. 275) describes the use of a regression model to relate the damage susceptibility of peaches to the height at which they are dropped (drop height, measured in mm) and the density of the peach (measured in g/cm³). One goal of the analysis is to provide a predictive model for peach damage to serve as a guideline for harvesting and postharvesting operations. Data typical of this type of experiment are given in Table 15-2.

We will fit the multiple linear regression model

y = β₀ + β₁x₁ + β₂x₂ + ε

to these data. The X matrix and y vector for this model, with one row per observation in Table 15-2, are

X = \begin{bmatrix} 1 & 303.7 & 0.90 \\ 1 & 366.7 & 1.04 \\ 1 & 336.8 & 1.01 \\ \vdots & \vdots & \vdots \\ 1 & 311.4 & 0.91 \\ 1 & 351.4 & 0.96 \end{bmatrix}, \quad
y = \begin{bmatrix} 3.62 \\ 7.27 \\ 2.66 \\ \vdots \\ 0.15 \\ 5.23 \end{bmatrix}.
The X'X matrix is

X'X = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 303.7 & 366.7 & \cdots & 351.4 \\ 0.90 & 1.04 & \cdots & 0.96 \end{bmatrix}
\begin{bmatrix} 1 & 303.7 & 0.90 \\ 1 & 366.7 & 1.04 \\ \vdots & \vdots & \vdots \\ 1 & 351.4 & 0.96 \end{bmatrix}
= \begin{bmatrix} 20 & 7767.8 & 19.93 \\ 7767.8 & 3201646 & 7791.878 \\ 19.93 & 7791.878 & 19.9077 \end{bmatrix},

and the X'y vector is

X'y = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 303.7 & 366.7 & \cdots & 351.4 \\ 0.90 & 1.04 & \cdots & 0.96 \end{bmatrix}
\begin{bmatrix} 3.62 \\ 7.27 \\ \vdots \\ 5.23 \end{bmatrix}
= \begin{bmatrix} 120.79 \\ 51129.17 \\ 122.70 \end{bmatrix}.

Table 15-2 Peach Damage Data for Example 15-1

Observation Number   Damage (mm), y   Drop Height (mm), x₁   Fruit Density (g/cm³), x₂
        1                3.62              303.7                    0.90
        2                7.27              366.7                    1.04
        3                2.66              336.8                    1.01
        4                1.53              304.5                    0.95
        5                4.91              346.8                    0.98
        6               10.36              600.0                    1.04
        7                5.26              369.0                    0.96
        8                6.09              418.0                    1.00
        9                6.57              269.0                    1.01
       10                4.24              323.0                    0.94
       11                8.04              562.2                    1.01
       12                3.46              284.2                    0.97
       13                8.50              558.6                    1.03
       14                9.34              415.0                    1.01
       15                5.55              349.5                    1.04
       16                8.11              462.8                    1.02
       17                7.32              333.1                    1.05
       18               12.58              502.1                    1.10
       19                0.15              311.4                    0.91
       20                5.23              351.4                    0.96

The least-squares estimators are found from equation 15-13 to be

\hat{\beta} = (X'X)^{-1}X'y,

or

\begin{bmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \\ \hat{\beta}_2 \end{bmatrix}
= \begin{bmatrix} 20 & 7767.8 & 19.93 \\ 7767.8 & 3201646 & 7791.878 \\ 19.93 & 7791.878 & 19.9077 \end{bmatrix}^{-1}
\begin{bmatrix} 120.79 \\ 51129.17 \\ 122.70 \end{bmatrix}
= \begin{bmatrix} 24.63666 & 0.005321 & -26.74679 \\ 0.005321 & 0.0000077 & -0.008353 \\ -26.74679 & -0.008353 & 30.096389 \end{bmatrix}
\begin{bmatrix} 120.79 \\ 51129.17 \\ 122.70 \end{bmatrix}
= \begin{bmatrix} -33.831 \\ 0.01314 \\ 34.890 \end{bmatrix}.

Therefore, the fitted regression model is

\hat{y} = -33.831 + 0.01314x_1 + 34.890x_2.

Table 15-3 shows the fitted values of y and the residuals. The fitted values and residuals are calculated to the same accuracy as the original data.


Table 15-3 Observations, Fitted Values, and Residuals for Example 15-1

Observation Number     yᵢ       ŷᵢ      eᵢ = yᵢ - ŷᵢ
        1             3.62     1.56        2.06
        2             7.27     7.27        0.00
        3             2.66     5.83       -3.17
        4             1.53     3.31       -1.78
        5             4.91     4.92       -0.01
        6            10.36    10.34        0.02
        7             5.26     4.51        0.75
        8             6.09     6.55       -0.46
        9             6.57     4.94        1.63
       10             4.24     3.21        1.03
       11             8.04     8.79       -0.75
       12             3.46     3.75       -0.29
       13             8.50     9.44       -0.94
       14             9.34     6.86        2.48
       15             5.55     7.05       -1.50
       16             8.11     7.84        0.27
       17             7.32     7.18        0.14
       18            12.58    11.14        1.44
       19             0.15     2.01       -1.86
       20             5.23     4.28        0.95

The statistical properties of the least-squares estimator β̂ may be easily demonstrated. Consider first bias:

E(\hat{\beta}) = E[(X'X)^{-1}X'y] = E[(X'X)^{-1}X'(X\beta + \varepsilon)] = E[(X'X)^{-1}X'X\beta + (X'X)^{-1}X'\varepsilon] = \beta,

since E(ε) = 0 and (X'X)⁻¹X'X = I. Thus β̂ is an unbiased estimator of β. The variance property of β̂ is expressed by the covariance matrix

Cov(\hat{\beta}) = E\{[\hat{\beta} - E(\hat{\beta})][\hat{\beta} - E(\hat{\beta})]'\}.

The covariance matrix of β̂ is a (p × p) symmetric matrix whose jjth element is the variance of β̂ⱼ and whose (i, j)th element is the covariance between β̂ᵢ and β̂ⱼ. The covariance matrix of β̂ is

Cov(\hat{\beta}) = \sigma^2 (X'X)^{-1}.

It is usually necessary to estimate σ². To develop this estimator, consider the sum of squares of the residuals, say

SS_E = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} e_i^2 = e'e.


Substituting e = y - ŷ = y - Xβ̂, we have

SS_E = (y - X\hat{\beta})'(y - X\hat{\beta}) = y'y - \hat{\beta}'X'y - y'X\hat{\beta} + \hat{\beta}'X'X\hat{\beta} = y'y - 2\hat{\beta}'X'y + \hat{\beta}'X'X\hat{\beta}.

Since X'Xβ̂ = X'y, this last equation becomes

SS_E = y'y - \hat{\beta}'X'y.    (15-16)

Equation 15-16 is called the error or residual sum of squares, and it has n - p degrees of freedom associated with it. The mean square for error is

MS_E = \frac{SS_E}{n - p}.    (15-17)

It can be shown that the expected value of MS_E is σ²; thus an unbiased estimator of σ² is given by

\hat{\sigma}^2 = MS_E.    (15-18)

t"".,:pi?f5~
We \Vill estim..a~ t.'le error va:ciance <:f for the multiple regression problem in Exan:.p1e 15-1, Using the
dara in Table 15~2, we find
20
y'y= I,0 =904.60
1",1

and

120.79 1
~'X"Y=[-33.831 0.01314 34.890J 51129.17J
[
122,70
= 866.39.
Iberefore, the e:ctor sum of squares is
SSe = y'y - ~'X'Y

904.60 - 866.39
=38.2[.
The estimate of IT is
'2 _ SSe _ 38.21 _ 2 04-
" .. - - - - - - ._ i_
n- p 20-3
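Examples 15-1 and 15-2 can be reproduced with a few lines of NumPy. The sketch below (our addition, not part of the text) builds X and y from Table 15-2 and evaluates equations 15-13, 15-16, and 15-18; small differences from the hand calculations are due to rounding in the text.

  import numpy as np

  damage = np.array([3.62, 7.27, 2.66, 1.53, 4.91, 10.36, 5.26, 6.09, 6.57, 4.24,
                     8.04, 3.46, 8.50, 9.34, 5.55, 8.11, 7.32, 12.58, 0.15, 5.23])
  height = np.array([303.7, 366.7, 336.8, 304.5, 346.8, 600.0, 369.0, 418.0,
                     269.0, 323.0, 562.2, 284.2, 558.6, 415.0, 349.5, 462.8,
                     333.1, 502.1, 311.4, 351.4])
  density = np.array([0.90, 1.04, 1.01, 0.95, 0.98, 1.04, 0.96, 1.00, 1.01, 0.94,
                      1.01, 0.97, 1.03, 1.01, 1.04, 1.02, 1.05, 1.10, 0.91, 0.96])

  X = np.column_stack([np.ones_like(damage), height, density])  # n x p model matrix
  beta = np.linalg.solve(X.T @ X, X.T @ damage)                 # equation 15-13
  print(beta)                     # approx. [-33.831, 0.01314, 34.890]

  n, p = X.shape
  SSE = damage @ damage - beta @ (X.T @ damage)                 # equation 15-16
  print(SSE, SSE / (n - p))       # approx. 38.21 and 2.247 (equation 15-18)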

15-3 CONFIDENCE INTERVALS IN MULTIPLE LINEAR REGRESSION

It is often necessary to construct confidence interval estimates for the regression coefficients {βⱼ}. The development of a procedure for obtaining these confidence intervals requires that we assume the errors {εᵢ} to be normally and independently distributed with mean zero and variance σ². Therefore, the observations {yᵢ} are normally and independently distributed

with mean β₀ + Σⱼ βⱼxᵢⱼ and variance σ². Since the least-squares estimator β̂ is a linear combination of the observations, it follows that β̂ is normally distributed with mean vector β and covariance matrix σ²(X'X)⁻¹. Then each of the quantities

\frac{\hat{\beta}_j - \beta_j}{\sqrt{\hat{\sigma}^2 C_{jj}}}, \quad j = 0, 1, \ldots, k,    (15-19)

is distributed as t with n - p degrees of freedom, where Cⱼⱼ is the jjth element of the (X'X)⁻¹ matrix and σ̂² is the estimate of the error variance, obtained from equation 15-18. Therefore, a 100(1 - α)% confidence interval for the regression coefficient βⱼ, j = 0, 1, ..., k, is

\hat{\beta}_j - t_{\alpha/2,n-p}\sqrt{\hat{\sigma}^2 C_{jj}} \le \beta_j \le \hat{\beta}_j + t_{\alpha/2,n-p}\sqrt{\hat{\sigma}^2 C_{jj}}.    (15-20)

Example 15-3
We will construct a 95% confidence interval on the parameter β₁ in Example 15-1. Note that the point estimate of β₁ is β̂₁ = 0.01314, and the diagonal element of (X'X)⁻¹ corresponding to β₁ is C₁₁ = 0.0000077. The estimate of σ² was obtained in Example 15-2 as 2.247, and t₀.₀₂₅,₁₇ = 2.110. Therefore, the 95% confidence interval on β₁ is computed from equation 15-20 as

0.01314 - (2.110)\sqrt{(2.247)(0.0000077)} \le \beta_1 \le 0.01314 + (2.110)\sqrt{(2.247)(0.0000077)},

which reduces to

0.00436 ≤ β₁ ≤ 0.0219.

We may also obtain a confidence interval on the mean response at a particular point, say (x₀₁, x₀₂, ..., x₀ₖ). To estimate the mean response at this point, define the vector

x_0 = \begin{bmatrix} 1 \\ x_{01} \\ x_{02} \\ \vdots \\ x_{0k} \end{bmatrix}.

The estimated mean response at this point is

\hat{y}_0 = x_0'\hat{\beta}.    (15-21)

This estimator is unbiased, since E(ŷ₀) = E(x₀'β̂) = x₀'β = E(y₀), and the variance of ŷ₀ is

V(\hat{y}_0) = \sigma^2 x_0'(X'X)^{-1}x_0.    (15-22)

Therefore, a 100(1 - α)% confidence interval on the mean response at the point (x₀₁, x₀₂, ..., x₀ₖ) is

\hat{y}_0 - t_{\alpha/2,n-p}\sqrt{\hat{\sigma}^2 x_0'(X'X)^{-1}x_0} \le E(y_0) \le \hat{y}_0 + t_{\alpha/2,n-p}\sqrt{\hat{\sigma}^2 x_0'(X'X)^{-1}x_0}.    (15-23)

Equation 15-23 is a confidence interval about the regression hyperplane. It is the multiple regression generalization of equation 14-33.

Example 15-4
The scientists conducting the experiment on damaged peaches in Example 15-1 would like to construct a 95% confidence interval on the mean damage for a peach dropped from a height of x₁ = 325 mm if its density is x₂ = 0.98 g/cm³. Therefore,

x_0 = \begin{bmatrix} 1 \\ 325 \\ 0.98 \end{bmatrix}.

The estimated mean response at this point is found from equation 15-21 to be

\hat{y}_0 = x_0'\hat{\beta} = [1 \quad 325 \quad 0.98] \begin{bmatrix} -33.831 \\ 0.01314 \\ 34.890 \end{bmatrix} = 4.63.

The variance of ŷ₀ is estimated by

\hat{\sigma}^2 x_0'(X'X)^{-1}x_0 = 2.247 [1 \quad 325 \quad 0.98] \begin{bmatrix} 24.63666 & 0.005321 & -26.74679 \\ 0.005321 & 0.0000077 & -0.008353 \\ -26.74679 & -0.008353 & 30.096389 \end{bmatrix} \begin{bmatrix} 1 \\ 325 \\ 0.98 \end{bmatrix} = 2.247(0.0718) = 0.1613.

Therefore, a 95% confidence interval on the mean damage at this point is found from equation 15-23 to be

4.63 - 2.110\sqrt{0.1613} \le E(y_0) \le 4.63 + 2.110\sqrt{0.1613},

which reduces to

3.78 ≤ E(y₀) ≤ 5.48.

15-4 PREDICTION OF NEW OBSERVATIONS

The regression model can be used to predict future observations on y corresponding to particular values of the independent variables, say x₀₁, x₀₂, ..., x₀ₖ. If x₀' = [1, x₀₁, x₀₂, ..., x₀ₖ], then a point estimate of the future observation y₀ at the point (x₀₁, x₀₂, ..., x₀ₖ) is

\hat{y}_0 = x_0'\hat{\beta}.    (15-24)

A 100(1 - α)% prediction interval for this future observation is

\hat{y}_0 - t_{\alpha/2,n-p}\sqrt{\hat{\sigma}^2 \left(1 + x_0'(X'X)^{-1}x_0\right)} \le y_0 \le \hat{y}_0 + t_{\alpha/2,n-p}\sqrt{\hat{\sigma}^2 \left(1 + x_0'(X'X)^{-1}x_0\right)}.    (15-25)

This prediction interval is a generalization of the prediction interval for a future observation in simple linear regression, equation 14-35.

In predicting new observations and in estimating the mean response at a given point (x₀₁, x₀₂, ..., x₀ₖ), one must be careful about extrapolating beyond the region containing the original observations. It is very possible that a model that fits well in the region of the original data will no longer fit well outside that region. In multiple regression it is often easy to inadvertently extrapolate, since the levels of the variables (xᵢ₁, xᵢ₂, ..., xᵢₖ), i = 1, 2, ..., n, jointly define the region containing the data. As an example, consider Fig. 15-1, which illustrates the region containing the observations for a two-variable regression model. Note that the point (x₀₁, x₀₂) lies within the ranges of both independent variables x₁ and x₂, but it is outside the region of the original observations. Thus, either predicting the value of a new observation or estimating the mean response at this point is an extrapolation of the original regression model.

Figure 15-1 An example of extrapolation in multiple regression.

Example 15-5
Suppose that the scientists in Example 15-1 wish to construct a 95% prediction interval on the damage of a peach that is dropped from a height of x₁ = 325 mm and has a density of x₂ = 0.98 g/cm³. Note that x₀' = [1  325  0.98], and the point estimate of the damage is ŷ₀ = x₀'β̂ = 4.63 mm. Also, in Example 15-4 we calculated x₀'(X'X)⁻¹x₀ = 0.0718. Therefore, from equation 15-25 we have

4.63 - 2.110\sqrt{2.247(1 + 0.0718)} \le y_0 \le 4.63 + 2.110\sqrt{2.247(1 + 0.0718)},

and the 95% prediction interval is

1.36 ≤ y₀ ≤ 7.90.
15-5 HYPOTHESIS TESTING IN MULTIPLE LINEAR REGRESSION

In multiple linear regression problems, certain tests of hypotheses about the model parameters are useful in measuring model adequacy. In this section, we describe several important hypothesis-testing procedures. We continue to require the normality assumption on the errors, which was introduced in the previous section.

15-5.1 Test for Significance of Regression

The test for significance of regression is a test to determine whether there is a linear relationship between the dependent variable y and a subset of the independent variables x₁, x₂, ..., xₖ. The appropriate hypotheses are

H₀: β₁ = β₂ = ... = βₖ = 0,
H₁: βⱼ ≠ 0 for at least one j.    (15-26)

Rejection of H₀: βⱼ = 0 implies that at least one of the independent variables x₁, x₂, ..., xₖ contributes significantly to the model. The test procedure is a generalization of the procedure used in simple linear regression. The total sum of squares S_yy is partitioned into a sum of squares due to regression and a sum of squares due to error, say

S_{yy} = SS_R + SS_E,

and if H₀: βⱼ = 0 is true, then SS_R/σ² ~ χ²ₖ, where the number of degrees of freedom for the χ² is equal to the number of regressor variables in the model. Also, we can show that SS_E/σ² ~ χ²_{n-k-1} and that SS_E and SS_R are independent. The test procedure for H₀: βⱼ = 0 is to compute

F_0 = \frac{SS_R/k}{SS_E/(n-k-1)} = \frac{MS_R}{MS_E}    (15-27)

and to reject H₀ if F₀ > F_{α,k,n-k-1}. The procedure is usually summarized in an analysis of variance table such as Table 15-4.

A computational formula for SS_R may be found easily. We have derived a computational formula for SS_E in equation 15-16, that is,

SS_E = y'y - \hat{\beta}'X'y.

Now since S_{yy} = \sum_{i=1}^{n} y_i^2 - \left(\sum_{i=1}^{n} y_i\right)^2/n = y'y - \left(\sum_{i=1}^{n} y_i\right)^2/n, we may rewrite the foregoing equation as

SS_E = y'y - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n} - \left[ \hat{\beta}'X'y - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n} \right],

or

SS_E = S_{yy} - SS_R.

Therefore, the regression sum of squares is

SS_R = \hat{\beta}'X'y - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n},    (15-28)

the error sum of squares is

SS_E = y'y - \hat{\beta}'X'y,    (15-29)

and the total sum of squares is

S_{yy} = y'y - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}.    (15-30)

Table 15-4 Analysis of Variance for Significance of Regression in Multiple Regression

Source of Variation    Sum of Squares    Degrees of Freedom    Mean Square       F₀
Regression                 SS_R                 k                  MS_R       MS_R/MS_E
Error or residual          SS_E              n - k - 1             MS_E
Total                      S_yy               n - 1

Example 15-6
We will test for significance of regression using the damaged peaches data from Example 15-1. Some of the numerical quantities required are calculated in Example 15-2. Note that

S_{yy} = y'y - \frac{1}{n}\left(\sum_{i=1}^{n} y_i\right)^2 = 904.60 - \frac{(120.79)^2}{20} = 175.089,

SS_R = \hat{\beta}'X'y - \frac{1}{n}\left(\sum_{i=1}^{n} y_i\right)^2 = 866.39 - \frac{(120.79)^2}{20} = 136.88,

SS_E = S_{yy} - SS_R = y'y - \hat{\beta}'X'y = 38.21.

The analysis of variance is shown in Table 15-5. To test H₀: β₁ = β₂ = 0, we calculate the statistic

F_0 = \frac{MS_R}{MS_E} = \frac{68.44}{2.247} = 30.46.

Since F₀ > F₀.₀₅,₂,₁₇ = 3.59, peach damage is related to drop height, fruit density, or both. However, we note that this does not necessarily imply that the relationship found is an appropriate one for predicting damage as a function of drop height or fruit density. Further tests of model adequacy are required.

Table 15-5 Test for Significance of Regression for Example 15-6

Source of Variation    Sum of Squares    Degrees of Freedom    Mean Square      F₀
Regression                136.88                2                 68.44       30.46
Error                      38.21               17                  2.247
Total                     175.09               19
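The entries of Table 15-5 can be verified directly from the sums of squares; a quick check of ours:

  SSR, SSE, k, n = 136.88, 38.21, 2, 20
  MSR, MSE = SSR / k, SSE / (n - k - 1)
  print(MSR / MSE)   # F0, approx. 30.46 (equation 15-27)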

15-5.2 Tests on Individual Regression Coefficients

We are frequently interested in testing hypotheses on the individual regression coefficients. Such tests would be useful in determining the value of each of the independent variables in the regression model. For example, the model might be more effective with the inclusion of additional variables, or perhaps with the deletion of one or more of the variables already in the model.

Adding a variable to a regression model always causes the sum of squares for regression to increase and the error sum of squares to decrease. We must decide whether the increase in the regression sum of squares is sufficient to warrant using the additional variable in the model. Furthermore, adding an unimportant variable to the model can actually increase the mean square error, thereby decreasing the usefulness of the model.

The hypotheses for testing the significance of any individual regression coefficient, say βⱼ, are

H₀: βⱼ = 0,
H₁: βⱼ ≠ 0.    (15-31)

If H₀: βⱼ = 0 is not rejected, then this indicates that xⱼ can possibly be deleted from the model. The test statistic for this hypothesis is

t_0 = \frac{\hat{\beta}_j}{\sqrt{\hat{\sigma}^2 C_{jj}}},    (15-32)

where Cⱼⱼ is the diagonal element of (X'X)⁻¹ corresponding to β̂ⱼ. The null hypothesis H₀: βⱼ = 0 is rejected if |t₀| > t_{α/2,n-k-1}. Note that this is really a partial or marginal test, because the regression coefficient β̂ⱼ depends on all the other regressor variables xᵢ (i ≠ j) that are in the model. To illustrate the use of this test, consider the data in Example 15-1, and suppose that we want to test

H₀: β₂ = 0,
H₁: β₂ ≠ 0.

The main diagonal element of (X'X)⁻¹ corresponding to β̂₂ is C₂₂ = 30.096, so the t statistic in equation 15-32 is

t_0 = \frac{\hat{\beta}_2}{\sqrt{\hat{\sigma}^2 C_{22}}} = \frac{34.89}{\sqrt{(2.247)(30.096)}} = 4.24.

Since t₀.₀₂₅,₁₇ = 2.110, we reject H₀: β₂ = 0 and conclude that the variable x₂ (density) contributes significantly to the model. Note that this test measures the marginal or partial contribution of x₂ given that x₁ is in the model.

We may also examine the contribution to the regression sum of squares of a variable, say xⱼ, given that other variables xᵢ (i ≠ j) are included in the model. The procedure used to do this is called the general regression significance test, or the "extra sum of squares" method. This procedure can also be used to investigate the contribution of a subset of the regressor variables to the model. Consider the regression model with k regressor variables

y = Xβ + ε,

where y is (n × 1), X is (n × p), β is (p × 1), ε is (n × 1), and p = k + 1. We would like to determine whether the subset of regressor variables x₁, x₂, ..., xᵣ (r < k) contributes

significantly to the regression model. Let the vector of regression coefficients be partitioned as follows:

\beta = \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix},

where β₁ is (r × 1) and β₂ is [(p - r) × 1]. We wish to test the hypotheses

H₀: β₁ = 0,
H₁: β₁ ≠ 0.    (15-33)

The model may be written

y = Xβ + ε = X₁β₁ + X₂β₂ + ε,    (15-34)

where X₁ represents the columns of X associated with β₁ and X₂ represents the columns of X associated with β₂.

For the full model (including both β₁ and β₂), we know that β̂ = (X'X)⁻¹X'y. Also, the regression sum of squares for all variables including the intercept is

SS_R(β) = β̂'X'y    (p degrees of freedom)

and

MS_E = \frac{y'y - \hat{\beta}'X'y}{n - p}.

SS_R(β) is called the regression sum of squares due to β. To find the contribution of the terms in β₁ to the regression, fit the model assuming the null hypothesis H₀: β₁ = 0 to be true. The reduced model is found from equation 15-34 to be

y = X₂β₂ + ε.    (15-35)

The least-squares estimator of β₂ is β̂₂ = (X₂'X₂)⁻¹X₂'y, and

SS_R(β₂) = β̂₂'X₂'y    (p - r degrees of freedom).    (15-36)

The regression sum of squares due to β₁ given that β₂ is already in the model is

SS_R(β₁|β₂) = SS_R(β) - SS_R(β₂).    (15-37)

This sum of squares has r degrees of freedom. It is sometimes called the "extra sum of squares" due to β₁. Note that SS_R(β₁|β₂) is the increase in the regression sum of squares due to including the variables x₁, x₂, ..., xᵣ in the model. Now SS_R(β₁|β₂) is independent of MS_E, and the null hypothesis β₁ = 0 may be tested by the statistic

F_0 = \frac{SS_R(\beta_1|\beta_2)/r}{MS_E}.    (15-38)

If F₀ > F_{α,r,n-p}, we reject H₀, concluding that at least one of the parameters in β₁ is not zero and, consequently, at least one of the variables x₁, x₂, ..., xᵣ in X₁ contributes significantly to the regression model. Some authors call the test in equation 15-38 a partial F-test.

The partial F-test is very useful. We can use it to measure the contribution of xⱼ as if it were the last variable added to the model by computing

SS_R(βⱼ | β₀, β₁, ..., βⱼ₋₁, βⱼ₊₁, ..., βₖ).

This is the increase in the regression sum of squares due to adding xⱼ to a model that already includes x₁, ..., xⱼ₋₁, xⱼ₊₁, ..., xₖ. Note that the partial F-test on a single variable xⱼ is equivalent to the t-test in equation 15-32. However, the partial F-test is a more general procedure in that we can measure the effect of sets of variables. In Section 15-11 we will show how the partial F-test plays a major role in model building, that is, in searching for the best set of regressor variables to use in the model.

Example 15-7
Consider the damaged peaches data in Example 15-1. We will investigate the contribution of the variable x₂ (density) to the model. That is, we wish to test

H₀: β₂ = 0,
H₁: β₂ ≠ 0.

To test this hypothesis, we need the extra sum of squares due to β₂, or

SS_R(β₂ | β₁, β₀) = SS_R(β₁, β₂, β₀) - SS_R(β₁, β₀) = SS_R(β₁, β₂ | β₀) - SS_R(β₁ | β₀).

In Example 15-6 we calculated

SS_R(β₁, β₂ | β₀) = 136.88    (2 degrees of freedom),

and if the model y = β₀ + β₁x₁ + ε is fit, we have

SS_R(β₁ | β₀) = β̂₁S_xy = 96.21    (1 degree of freedom).

Therefore, we have

SS_R(β₂ | β₁, β₀) = 136.88 - 96.21 = 40.67    (1 degree of freedom).

This is the increase in the regression sum of squares attributable to adding x₂ to a model already containing x₁. To test H₀: β₂ = 0, form the test statistic

F_0 = \frac{SS_R(\beta_2 | \beta_1, \beta_0)/1}{MS_E} = \frac{40.67}{2.247} = 18.10.

Note that the MS_E from the full model, using both x₁ and x₂, is used in the denominator of the test statistic. Since F₀.₀₅,₁,₁₇ = 4.45, we reject H₀: β₂ = 0 and conclude that density (x₂) contributes significantly to the model.

Since this partial F-test involves a single variable, it is equivalent to the t-test. To see this, recall that the t-test on H₀: β₂ = 0 resulted in the test statistic t₀ = 4.24. Furthermore, recall that the square of a t random variable with ν degrees of freedom is an F random variable with one and ν degrees of freedom, and we note that (4.24)² = 17.98 ≈ F₀.
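The extra-sum-of-squares arithmetic of Example 15-7 is a one-liner; this check (ours, not the book's) also confirms the t²/F relationship noted above:

  SSR_full, SSR_x1, MSE = 136.88, 96.21, 2.247   # from Examples 15-6 and 15-7
  F0 = (SSR_full - SSR_x1) / 1 / MSE             # equation 15-38 with r = 1
  print(F0, 4.24**2)                             # approx. 18.10 and 17.98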

15-6 MEASURES OF MODEL ADEQUACY

A number of techniques can be used to measure the adequacy of a multiple regression model. This section will present several of these techniques. Model validation is an important part of the multiple regression model building process. A good paper on this subject is Snee (1977) (see also Montgomery, Peck, and Vining, 2001).

15-6.1 The Coefficient of Multiple Determination

The coefficient of multiple determination R² is defined as

R^2 = \frac{SS_R}{S_{yy}} = 1 - \frac{SS_E}{S_{yy}}.    (15-39)

R² is a measure of the amount of reduction in the variability of y obtained by using the regressor variables x₁, x₂, ..., xₖ. As in the simple linear regression case, we must have 0 ≤ R² ≤ 1. However, as before, a large value of R² does not necessarily imply that the regression model is a good one. Adding a variable to the model will always increase R², regardless of whether the additional variable is statistically significant or not. Thus it is possible for models that have large values of R² to yield poor predictions of new observations or estimates of the mean response.

The positive square root of R² is the multiple correlation coefficient between y and the set of regressor variables x₁, x₂, ..., xₖ. That is, R is a measure of the linear association between y and x₁, x₂, ..., xₖ. When k = 1, this becomes the simple correlation between y and x.

Example 15-8
The coefficient of multiple determination for the regression model estimated in Example 15-1 is

R^2 = \frac{SS_R}{S_{yy}} = \frac{136.88}{175.09} = 0.782.

That is, about 78.2% of the variability in damage y is explained when the two regressor variables, drop height (x₁) and fruit density (x₂), are used. A model relating damage to drop height (x₁) only was also fit; the value of R² for this model turns out to be R² = 0.549. Therefore, adding the variable x₂ to the model has increased R² from 0.549 to 0.782.

Adjusted R²
Some practitioners prefer to use the adjusted coefficient of multiple determination, adjusted R², defined as

R^2_{adj} = 1 - \frac{SS_E/(n-p)}{S_{yy}/(n-1)}.    (15-40)

The value S_yy/(n - 1) will be constant regardless of the number of variables in the model. SS_E/(n - p) is the mean square for error, which will change with the addition or removal of terms (new regressor variables, interaction terms, higher-order terms) from the model. Therefore, R²_adj will increase only if the addition of a new term significantly reduces the mean square for error. In other words, R²_adj will penalize adding terms to the model that are not significant in modeling the response. Interpretation of the adjusted coefficient of multiple determination is identical to that of R².

Example 15-9
We can calculate R²_adj for the model fit in Example 15-1. From Example 15-6, we found that SS_E = 38.21 and S_yy = 175.09. The estimate R²_adj is then

R^2_{adj} = 1 - \frac{38.21/(20-3)}{175.09/(20-1)} = 1 - \frac{2.247}{9.215} = 0.756.

The adjusted R² will play a significant role in variable selection and model building later in this chapter.
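Both measures are simple functions of SS_E and S_yy; a brief check of Examples 15-8 and 15-9 (our sketch, not the book's):

  SSE, Syy, n, p = 38.21, 175.09, 20, 3
  R2 = 1 - SSE / Syy                              # equation 15-39
  R2_adj = 1 - (SSE / (n - p)) / (Syy / (n - 1))  # equation 15-40
  print(R2, R2_adj)                               # approx. 0.782, 0.756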

15-6.2 Residual Analysis

The residuals from the estimated multiple regression model, defined by eᵢ = yᵢ - ŷᵢ, play an important role in judging model adequacy, just as they do in simple linear regression. As noted in Section 14-5.1, there are several residual plots that are often useful. These are illustrated in Example 15-10. It is also helpful to plot the residuals against variables not presently in the model that are possible candidates for inclusion. Patterns in these plots, similar to those in Fig. 14-5, indicate that the model may be improved by adding the candidate variable.

Example 15-10
The residuals for the model estimated in Example 15-1 are shown in Table 15-3. These residuals are plotted on a normal probability plot in Fig. 15-2. No severe deviations from normality are obvious, although the smallest residual (e₃ = -3.17) does not fall near the remaining residuals. The standardized residual, -3.17/√2.247 = -2.11, appears to be large and could indicate an unusual observation. The residuals are plotted against ŷ in Fig. 15-3, and against x₁ and x₂ in Figs. 15-4 and 15-5, respectively. In Fig. 15-4, there is some indication that the assumption of constant variance may not be satisfied. Removal of the unusual observation may improve the model fit, but there is no indication of error in data collection. Therefore, the point will be retained. We will see subsequently (Example 15-16) that two other regressor variables are required to adequately model these data.

Figure 15-2 Normal probability plot of residuals for Example 15-10.

Figure 15-3 Plot of residuals against ŷ for Example 15-10.

Figure 15-4 Plot of residuals against x₁ for Example 15-10.

Figure 15-5 Plot of residuals against x₂ for Example 15-10.

15-7 POLYNOMIAL REGRESSION

The linear model y = Xβ + ε is a general model that can be used to fit any relationship that is linear in the unknown parameters β. This includes the important class of polynomial regression models. For example, the second-degree polynomial in one variable,

y = β₀ + β₁x + β₁₁x² + ε,    (15-41)

and the second-degree polynomial in two variables,

y = β₀ + β₁x₁ + β₂x₂ + β₁₁x₁² + β₂₂x₂² + β₁₂x₁x₂ + ε,    (15-42)

are linear regression models.

Polynomial regression models are widely used in cases where the response is curvilinear, because the general principles of multiple regression can be applied. The following example illustrates some of the types of analyses that can be performed.

Example 15-11
Sidewall panels for the interior of an airplane are formed in a 1500-ton press. The unit manufacturing cost varies with the production lot size. The data shown below give the average cost per unit (in hundreds of dollars) for this product (y) and the production lot size (x). The scatter diagram, shown in Fig. 15-6, indicates that a second-order polynomial may be appropriate.

y   1.81  1.70  1.65  1.55  1.48  1.40  1.30  1.26  1.24  1.21  1.20  1.18
x    20    25    30    35    40    50    60    65    70    75    80    90

We will fit the model

y = β₀ + β₁x + β₁₁x² + ε.

Figure 15-6 Data for Example 15-11.

The y vector, X matrix, and β vector are as follows:

y = \begin{bmatrix} 1.81 \\ 1.70 \\ 1.65 \\ 1.55 \\ 1.48 \\ 1.40 \\ 1.30 \\ 1.26 \\ 1.24 \\ 1.21 \\ 1.20 \\ 1.18 \end{bmatrix}, \quad
X = \begin{bmatrix} 1 & 20 & 400 \\ 1 & 25 & 625 \\ 1 & 30 & 900 \\ 1 & 35 & 1225 \\ 1 & 40 & 1600 \\ 1 & 50 & 2500 \\ 1 & 60 & 3600 \\ 1 & 65 & 4225 \\ 1 & 70 & 4900 \\ 1 & 75 & 5625 \\ 1 & 80 & 6400 \\ 1 & 90 & 8100 \end{bmatrix}, \quad
\beta = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_{11} \end{bmatrix}.

Solving the normal equations X'Xβ̂ = X'y gives the fitted model

\hat{y} = 2.1983 - 0.0225x + 0.000125x^2.

The test for significance of regression is shown in Table 15-6. Since F₀ = 2171.07 is significant at 1%, we conclude that at least one of the parameters β₁ and β₁₁ is not zero. Furthermore, the standard tests for model adequacy reveal no unusual behavior.
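Because the quadratic model is linear in its parameters, the same normal-equation machinery applies; the following sketch (ours, not the book's) reproduces the fit of Example 15-11:

  import numpy as np

  x = np.array([20, 25, 30, 35, 40, 50, 60, 65, 70, 75, 80, 90], dtype=float)
  y = np.array([1.81, 1.70, 1.65, 1.55, 1.48, 1.40,
                1.30, 1.26, 1.24, 1.21, 1.20, 1.18])

  X = np.column_stack([np.ones_like(x), x, x**2])   # columns 1, x, x^2
  beta = np.linalg.solve(X.T @ X, X.T @ y)
  print(beta)   # approx. [2.1983, -0.0225, 0.000125]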

In fitting polynomials, we generally like to use the lowest-degree model consistent with the data. In this example, it would seem logical to investigate dropping the quadratic term from the model. That is, we would like to test

H₀: β₁₁ = 0,
H₁: β₁₁ ≠ 0.

The general regression significance test can be used to test this hypothesis. We need to determine the "extra sum of squares" due to β₁₁, or

SS_R(β₁₁ | β₁, β₀) = SS_R(β₁, β₁₁ | β₀) - SS_R(β₁ | β₀).

The sum of squares SS_R(β₁, β₁₁ | β₀) = 0.5254, from Table 15-6. To find SS_R(β₁ | β₀), we fit a simple linear regression model to the original data, yielding

\hat{y} = 1.9004 - 0.0091x.

It can be easily verified that the regression sum of squares for this model is

SS_R(β₁ | β₀) = 0.4942.

Table 15-6 Test for Significance of Regression for the Second-Order Model in Example 15-11

Source of Variation    Sum of Squares    Degrees of Freedom    Mean Square       F₀
Regression                 0.5254               2                0.2627       2171.07
Error                      0.0011               9                0.000121
Total                      0.5265              11

Table 15-7 Analysis of Variance of Example 15-11, Showing the Test for H₀: β₁₁ = 0

Source of Variation    Sum of Squares                       Degrees of Freedom    Mean Square       F₀
Regression             SS_R(β₁, β₁₁ | β₀) = 0.5254                  2               0.2627       2171.07
  Linear               SS_R(β₁ | β₀) = 0.4942                       1               0.4942       4084.30
  Quadratic            SS_R(β₁₁ | β₀, β₁) = 0.0312                  1               0.0312        257.85
Error                  0.0011                                       9               0.000121
Total                  0.5265                                      11

Therefore, the extra sum of squares due to β₁₁, given that β₁ and β₀ are in the model, is

SS_R(β₁₁ | β₁, β₀) = SS_R(β₁, β₁₁ | β₀) - SS_R(β₁ | β₀) = 0.5254 - 0.4942 = 0.0312.

The analysis of variance, with the test of H₀: β₁₁ = 0 incorporated into the procedure, is displayed in Table 15-7. Note that the quadratic term contributes significantly to the model.

15-8 INDICATOR VARIABLES

The regression models presented in previous sections have been based on quantitative variables, that is, variables that are measured on a numerical scale. For example, variables such as temperature, pressure, distance, and age are quantitative variables. Occasionally, we need to incorporate qualitative variables in a regression model. For example, suppose that one of the variables in a regression model is the operator who is associated with each observation yᵢ. Assume that only two operators are involved. We may wish to assign different levels to the two operators to account for the possibility that each operator may have a different effect on the response.

The usual method of accounting for the different levels of a qualitative variable is by using indicator variables. For instance, to introduce the effect of two different operators into a regression model, we could define an indicator variable as follows:

x = 0 if the observation is from operator 1,
x = 1 if the observation is from operator 2.

In general, a qualitative variable with t levels is represented by t - 1 indicator variables, which are assigned values of either 0 or 1. Thus, if there were three operators, the different levels would be accounted for by two indicator variables defined as follows:

x₁    x₂
 0     0    if the observation is from operator 1,
 1     0    if the observation is from operator 2,
 0     1    if the observation is from operator 3.

Indicator variables are also referred to as dummy variables. The following example illustrates some of the uses of indicator variables. For other applications, see Montgomery, Peck, and Vining (2001).

Example 15-12
(Adapted from Montgomery, Peck, and Vining, 2001.) A mechanical engineer is investigating the surface finish of metal parts produced on a lathe and its relationship to the speed (in RPM) of the lathe. The data are shown in Table 15-8. Note that the data have been collected using two different types of cutting tools. Since it is likely that the type of cutting tool affects the surface finish, we will fit the model

y = β₀ + β₁x₁ + β₂x₂ + ε,

where y is the surface finish, x₁ is the lathe speed in RPM, and x₂ is an indicator variable denoting the type of cutting tool used; that is,

x₂ = 0 for tool type 302,
x₂ = 1 for tool type 416.

The parameters in this model may be easily interpreted. If x₂ = 0, then the model becomes

y = β₀ + β₁x₁ + ε,

which is a straight-line model with slope β₁ and intercept β₀. However, if x₂ = 1, then the model becomes

y = β₀ + β₁x₁ + β₂(1) + ε = (β₀ + β₂) + β₁x₁ + ε,

which is a straight-line model with slope β₁ and intercept β₀ + β₂. Thus, the model y = β₀ + β₁x₁ + β₂x₂ + ε implies that surface finish is linearly related to lathe speed and that the slope β₁ does not depend on the type of cutting tool used. However, the type of cutting tool does affect the intercept, and β₂ indicates the change in the intercept associated with a change in tool type from 302 to 416.

Table 15-8 Surface Finish Data for Example 15-12

Observation Number, i   Surface Finish, yᵢ    RPM    Type of Cutting Tool
         1                   45.44            225            302
         2                   42.03            200            302
         3                   50.10            250            302
         4                   48.75            245            302
         5                   47.92            235            302
         6                   47.79            237            302
         7                   52.26            265            302
         8                   50.52            259            302
         9                   45.58            221            302
        10                   44.78            218            302
        11                   33.50            224            416
        12                   31.23            212            416
        13                   37.52            248            416
        14                   37.13            260            416
        15                   34.70            243            416
        16                   33.92            238            416
        17                   32.13            224            416
        18                   35.47            251            416
        19                   33.49            232            416
        20                   32.29            216            416


The X matrix and y vector for this problem, with one row per observation in Table 15-8, are as follows:

X = \begin{bmatrix} 1 & 225 & 0 \\ 1 & 200 & 0 \\ 1 & 250 & 0 \\ \vdots & \vdots & \vdots \\ 1 & 232 & 1 \\ 1 & 216 & 1 \end{bmatrix}, \quad
y = \begin{bmatrix} 45.44 \\ 42.03 \\ 50.10 \\ \vdots \\ 33.49 \\ 32.29 \end{bmatrix}.

The fitted model is

\hat{y} = 14.2762 + 0.1411x_1 - 13.2802x_2.

The analysis of variance for this model is shown in Table 15-9. Note that the hypothesis H₀: β₁ = β₂ = 0 (significance of regression) is rejected. This table also contains the sum of squares

SS_R = SS_R(β₁, β₂ | β₀) = SS_R(β₁ | β₀) + SS_R(β₂ | β₁, β₀),

so a test of the hypothesis H₀: β₂ = 0 can be made. This hypothesis is also rejected, so we conclude that tool type has an effect on surface finish.

It is also possible to use indicator variables to investigate whether tool type affects both slope and intercept. Let the model be

y = β₀ + β₁x₁ + β₂x₂ + β₃x₁x₂ + ε,

Table 15-9 Analysis of Variance of Example 15-12

Source of Variation        Sum of Squares    Degrees of Freedom    Mean Square        F₀
Regression                   1012.0595              2               506.0297      1103.69ᵃ
  SS_R(β₁ | β₀)              (130.6091)            (1)              130.6091       284.87ᵃ
  SS_R(β₂ | β₁, β₀)          (881.4504)            (1)              881.4504      1922.52ᵃ
Error                           7.7943             17                 0.4508
Total                        1019.8538             19

ᵃ Significant at 1%.

where x₂ is the indicator variable. Now if tool type 302 is used, x₂ = 0, and the model is

y = β₀ + β₁x₁ + ε.

If tool type 416 is used, x₂ = 1, and the model becomes

y = β₀ + β₁x₁ + β₂ + β₃x₁ + ε = (β₀ + β₂) + (β₁ + β₃)x₁ + ε.

Note that β₂ is the change in the intercept, and β₃ is the change in slope produced by a change in tool type.

Another method of analyzing this data set is to fit separate regression models to the data for each tool type. However, the indicator variable approach has several advantages. First, only one regression model must be estimated. Second, by pooling the data on both tool types, more degrees of freedom for error are obtained. Third, tests of both hypotheses on the parameters β₂ and β₃ are just special cases of the general regression significance test.
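Fitting the indicator-variable model of Example 15-12 requires nothing beyond ordinary least squares; a sketch (ours, not the book's) using the Table 15-8 data:

  import numpy as np

  finish = np.array([45.44, 42.03, 50.10, 48.75, 47.92, 47.79, 52.26, 50.52,
                     45.58, 44.78, 33.50, 31.23, 37.52, 37.13, 34.70, 33.92,
                     32.13, 35.47, 33.49, 32.29])
  rpm = np.array([225, 200, 250, 245, 235, 237, 265, 259, 221, 218,
                  224, 212, 248, 260, 243, 238, 224, 251, 232, 216], dtype=float)
  tool = np.array([0.0] * 10 + [1.0] * 10)   # x2 = 1 for tool type 416

  X = np.column_stack([np.ones_like(rpm), rpm, tool])
  beta = np.linalg.solve(X.T @ X, X.T @ finish)
  print(beta)   # approx. [14.2762, 0.1411, -13.2802]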

15-9 THE CORRELATION MATRIX

Suppose we wish to estimate the parameters in the model

y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \varepsilon_i, \quad i = 1, 2, \ldots, n.    (15-43)

We may rewrite this model with a transformed intercept β₀′ as

y_i = \beta_0' + \beta_1 (x_{i1} - \bar{x}_1) + \beta_2 (x_{i2} - \bar{x}_2) + \varepsilon_i,    (15-44)

or, since β̂₀′ = ȳ,

y_i - \bar{y} = \beta_1 (x_{i1} - \bar{x}_1) + \beta_2 (x_{i2} - \bar{x}_2) + \varepsilon_i.    (15-45)

The X'X matrix for this model is

X'X = \begin{bmatrix} S_{11} & S_{12} \\ S_{12} & S_{22} \end{bmatrix},    (15-46)

where

S_{kj} = \sum_{i=1}^{n} (x_{ik} - \bar{x}_k)(x_{ij} - \bar{x}_j), \quad k, j = 1, 2.    (15-47)

It is possible to express this X'X matrix in correlation form. Let

r_{kj} = \frac{S_{kj}}{(S_{kk} S_{jj})^{1/2}}, \quad k, j = 1, 2,    (15-48)

and note that r₁₁ = r₂₂ = 1. Then the correlation form of the X'X matrix, equation 15-46, is

R = \begin{bmatrix} 1 & r_{12} \\ r_{12} & 1 \end{bmatrix}.    (15-49)

The quantity r₁₂ is the sample correlation between x₁ and x₂. We may also define the sample correlation between xⱼ and y as

r_{jy} = \frac{S_{jy}}{(S_{jj} S_{yy})^{1/2}}, \quad j = 1, 2,    (15-50)
l
462 Chapter 15 Multiple Regression

where

Sj, i(x'J XI)(Y'-y), j=I,2, (15-51)


1<"":
is the corrected sum of cross products between Xj and y, and S'I"J is the usual total corrected
sum of squares of y.
These transformations result in a new regression model,

$$ y_i^0 = b_1 z_{i1} + b_2 z_{i2} + \epsilon_i', \tag{15-52} $$

where

$$ y_i^0 = \frac{y_i - \bar{y}}{S_{yy}^{1/2}}, \qquad z_{ij} = \frac{x_{ij} - \bar{x}_j}{S_{jj}^{1/2}}, \qquad j = 1, 2. $$

The relationship between the parameters $b_1$ and $b_2$ in the new model, equation 15-52, and the parameters $\beta_0$, $\beta_1$, and $\beta_2$ in the original model, equation 15-43, is as follows:

$$ b_1 = \beta_1\left(\frac{S_{11}}{S_{yy}}\right)^{1/2}, \tag{15-53} $$

$$ b_2 = \beta_2\left(\frac{S_{22}}{S_{yy}}\right)^{1/2}, \tag{15-54} $$

$$ \beta_0 = \bar{y} - \beta_1\bar{x}_1 - \beta_2\bar{x}_2. \tag{15-55} $$
The least-squares normal equations for the transformed model, equation 15-52, are

$$ \begin{bmatrix} 1 & r_{12} \\ r_{12} & 1 \end{bmatrix} \begin{bmatrix} \hat{b}_1 \\ \hat{b}_2 \end{bmatrix} = \begin{bmatrix} r_{1y} \\ r_{2y} \end{bmatrix}. \tag{15-56} $$

The solution to equation 15-56 is

$$ \begin{bmatrix} \hat{b}_1 \\ \hat{b}_2 \end{bmatrix} = \begin{bmatrix} 1 & r_{12} \\ r_{12} & 1 \end{bmatrix}^{-1} \begin{bmatrix} r_{1y} \\ r_{2y} \end{bmatrix}, $$

or

$$ \hat{b}_1 = \frac{r_{1y} - r_{12}r_{2y}}{1 - r_{12}^2}, \tag{15-57a} $$

$$ \hat{b}_2 = \frac{r_{2y} - r_{12}r_{1y}}{1 - r_{12}^2}. \tag{15-57b} $$
The regression coefficients, equations 15-57, are usually called standardized regression coefficients. Many multiple regression computer programs use this transformation to reduce round-off errors in the $(\mathbf{X}'\mathbf{X})^{-1}$ matrix. These round-off errors may be very serious if the original variables differ considerably in magnitude. Some of these computer programs also display both the original regression coefficients and the standardized coefficients. The standardized regression coefficients are dimensionless, and this may make it easier to compare regression coefficients in situations where the original variables $x_j$ differ considerably in their units of measurement. In interpreting these standardized regression coefficients, however, we must remember that they are still partial regression coefficients (i.e., $b_j$ shows the effect of $z_j$ given that other $z_i$, $i \neq j$, are in the model). Furthermore, the $b_j$ are affected by the spacing of the levels of the $x_j$. Consequently, we should not use the magnitude of the $b_j$ as a measure of the importance of the regressor variables.
While we have explicitly treated only the case of two regressor variables, the results generalize. If there are $k$ regressor variables $x_1, x_2, \ldots, x_k$, one may write the $\mathbf{X}'\mathbf{X}$ matrix in correlation form,

$$ \mathbf{R} = \begin{bmatrix} 1 & r_{12} & r_{13} & \cdots & r_{1k} \\ r_{12} & 1 & r_{23} & \cdots & r_{2k} \\ r_{13} & r_{23} & 1 & \cdots & r_{3k} \\ \vdots & \vdots & \vdots & & \vdots \\ r_{1k} & r_{2k} & r_{3k} & \cdots & 1 \end{bmatrix}, \tag{15-58} $$

where $r_{ij} = S_{ij}/(S_{ii}S_{jj})^{1/2}$ is the sample correlation between $x_i$ and $x_j$, and $S_{ij} = \sum_{u=1}^{n}(x_{ui} - \bar{x}_i)(x_{uj} - \bar{x}_j)$. The correlations between $x_j$ and $y$ are

$$ \mathbf{g} = \begin{bmatrix} r_{1y} \\ r_{2y} \\ \vdots \\ r_{ky} \end{bmatrix}, \tag{15-59} $$

where $S_{jy} = \sum_{u=1}^{n}(x_{uj} - \bar{x}_j)(y_u - \bar{y})$. The vector of standardized regression coefficients $\hat{\mathbf{b}}' = [\hat{b}_1, \hat{b}_2, \ldots, \hat{b}_k]$ is

$$ \hat{\mathbf{b}} = \mathbf{R}^{-1}\mathbf{g}. \tag{15-60} $$

The relationship between the standardized regression coefficients and the original regression coefficients is

$$ \hat{\beta}_j = \hat{b}_j\left(\frac{S_{yy}}{S_{jj}}\right)^{1/2}, \qquad j = 1, 2, \ldots, k. \tag{15-61} $$

Example 15-13
For the data in Example 15-1, we find

S_yy = 175.089,      S_11 = 184710.16,
S_1y = 4215.372,     S_22 = 0.047755,
S_2y = 2.33,         S_12 = 51.2873.

Therefore,

$$ r_{12} = \frac{51.2873}{\sqrt{(184710.16)(0.047755)}} = 0.5460, $$

$$ r_{1y} = \frac{4215.372}{\sqrt{(184710.16)(175.089)}} = 0.7412, $$

$$ r_{2y} = \frac{S_{2y}}{(S_{22}S_{yy})^{1/2}} = \frac{2.33}{\sqrt{(0.047755)(175.089)}} = 0.8060, $$

and the correlation matrix for this problem is

$$ \mathbf{R} = \begin{bmatrix} 1 & 0.5460 \\ 0.5460 & 1 \end{bmatrix}. $$

From equation 15-56, the normal equations in terms of the standardized regression coefficients are

$$ \begin{bmatrix} 1 & 0.5460 \\ 0.5460 & 1 \end{bmatrix} \begin{bmatrix} \hat{b}_1 \\ \hat{b}_2 \end{bmatrix} = \begin{bmatrix} 0.7412 \\ 0.8060 \end{bmatrix}. $$

Consequently, the standardized regression coefficients are

$$ \begin{bmatrix} \hat{b}_1 \\ \hat{b}_2 \end{bmatrix} = \begin{bmatrix} 1 & 0.5460 \\ 0.5460 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 0.7412 \\ 0.8060 \end{bmatrix} = \begin{bmatrix} 1.424737 & -0.777910 \\ -0.777910 & 1.424737 \end{bmatrix} \begin{bmatrix} 0.7412 \\ 0.8060 \end{bmatrix} = \begin{bmatrix} 0.429022 \\ 0.571754 \end{bmatrix}. $$

These standardized regression coefficients could also have been computed directly from either equation 15-57 or equation 15-61. Note that although $\hat{b}_2 > \hat{b}_1$, we should be cautious about concluding that the fruit density ($x_2$) is more important than drop height ($x_1$), since $\hat{b}_1$ and $\hat{b}_2$ are still partial regression coefficients.
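The calculation in Example 15-13 is easy to reproduce numerically. The sketch below is an illustration only (it assumes numpy and is not part of the text); it solves the normal equations 15-56 and then converts back to the original units with equation 15-61:

import numpy as np

R = np.array([[1.0, 0.5460],
              [0.5460, 1.0]])       # correlation form of X'X (equation 15-49)
g = np.array([0.7412, 0.8060])      # correlations of x1 and x2 with y

b = np.linalg.solve(R, g)           # standardized coefficients, equation 15-60
print(b)                            # approximately [0.4290, 0.5718]

# Back-transform to the original units (equation 15-61).
S11, S22, Syy = 184710.16, 0.047755, 175.089
print(b * np.sqrt(Syy / np.array([S11, S22])))   # estimates of beta1 and beta2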

15-10 PROBLEMS IN MULTIPLE REGRESSION


There are a number of problems often encountered in the use of multiple regression. In this section, we briefly discuss three of these problem areas: the effect of multicollinearity on the regression model, the effect of outlying points in the $x$-space on the regression coefficients, and autocorrelation in the errors.

15-10.1 Multicollinearity

In most multiple regression problems, the independent or regressor variables $x_j$ are intercorrelated. In situations in which this intercorrelation is very large, we say that multicollinearity exists. Multicollinearity can have serious effects on the estimates of the regression coefficients and on the general applicability of the estimated model.

The effects of multicollinearity may be easily demonstrated. Consider a regression model with two regressor variables $x_1$ and $x_2$, and suppose that $x_1$ and $x_2$ have been "standardized," as in Section 15-9, so that the $\mathbf{X}'\mathbf{X}$ matrix is in correlation form, as in equation 15-49. The model is

$$ y = \beta_1 x_1 + \beta_2 x_2 + \epsilon. $$

The $(\mathbf{X}'\mathbf{X})^{-1}$ matrix for this model is

$$ \mathbf{C} = (\mathbf{X}'\mathbf{X})^{-1} = \begin{bmatrix} \dfrac{1}{1-r_{12}^2} & \dfrac{-r_{12}}{1-r_{12}^2} \\[1ex] \dfrac{-r_{12}}{1-r_{12}^2} & \dfrac{1}{1-r_{12}^2} \end{bmatrix}, $$
and the estimators of the parameters are

$$ \hat{\beta}_1 = \frac{\mathbf{x}_1'\mathbf{y} - r_{12}\,\mathbf{x}_2'\mathbf{y}}{1 - r_{12}^2}, \qquad \hat{\beta}_2 = \frac{\mathbf{x}_2'\mathbf{y} - r_{12}\,\mathbf{x}_1'\mathbf{y}}{1 - r_{12}^2}, $$

where $r_{12}$ is the sample correlation between $x_1$ and $x_2$, and $\mathbf{x}_1'\mathbf{y}$ and $\mathbf{x}_2'\mathbf{y}$ are the elements of the $\mathbf{X}'\mathbf{y}$ vector.
Now, if multicollinearity is present, $x_1$ and $x_2$ are highly correlated, and $|r_{12}| \to 1$. In such a situation, the variances and covariances of the regression coefficients become very large, since $V(\hat{\beta}_j) = C_{jj}\sigma^2 \to \infty$ as $|r_{12}| \to 1$, and $\mathrm{Cov}(\hat{\beta}_1, \hat{\beta}_2) = C_{12}\sigma^2 \to \pm\infty$ depending on whether $r_{12} \to \pm 1$. The large variances for $\hat{\beta}_j$ imply that the regression coefficients are very poorly estimated. Note that the effect of multicollinearity is to introduce a "near" linear dependency in the columns of the $\mathbf{X}$ matrix. As $r_{12} \to \pm 1$, this linear dependency becomes exact. Furthermore, if we assume that $\mathbf{x}_1'\mathbf{y} \to \mathbf{x}_2'\mathbf{y}$ as $|r_{12}| \to 1$, then the estimates of the regression coefficients become equal in magnitude but opposite in sign; that is, $\hat{\beta}_1 = -\hat{\beta}_2$, regardless of the true values of $\beta_1$ and $\beta_2$.
Similar problems occur when multicollinearity is present and there are more than two regressor variables. In general, the diagonal elements of the matrix $\mathbf{C} = (\mathbf{X}'\mathbf{X})^{-1}$ can be written

$$ C_{jj} = \frac{1}{1 - R_j^2}, \qquad j = 1, 2, \ldots, k, \tag{15-62} $$

where $R_j^2$ is the coefficient of multiple determination resulting from regressing $x_j$ on the other $k - 1$ regressor variables. Clearly, the stronger the linear dependency of $x_j$ on the remaining regressor variables (and hence the stronger the multicollinearity), the larger the value of $R_j^2$ will be. We say that the variance of $\hat{\beta}_j$ is "inflated" by the quantity $(1 - R_j^2)^{-1}$. Consequently, we usually call

$$ \mathrm{VIF}_j = \frac{1}{1 - R_j^2}, \qquad j = 1, 2, \ldots, k, \tag{15-63} $$

the variance inflation factor for $\hat{\beta}_j$. Note that these factors are the main diagonal elements of the inverse of the correlation matrix. They are an important measure of the extent to which multicollinearity is present.
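Because the VIFs are just the diagonal of the inverse correlation matrix, they are simple to compute directly. The following sketch is an illustration only (it assumes numpy and is not from the text); it scales each regressor column to unit length and inverts the resulting correlation matrix:

import numpy as np

def vif(X):
    """Variance inflation factors (equation 15-63) for the columns of X.
    X is an n x k array of regressor values, without the intercept column."""
    Z = X - X.mean(axis=0)
    Z = Z / np.sqrt((Z**2).sum(axis=0))     # unit-length scaling
    R = Z.T @ Z                             # X'X in correlation form
    return np.diag(np.linalg.inv(R))

# Two nearly collinear regressors produce very large VIFs.
rng = np.random.default_rng(1)
x1 = rng.normal(size=50)
x2 = x1 + 0.05 * rng.normal(size=50)
print(vif(np.column_stack([x1, x2])))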
Although the estimates of the regression coefficients are very imprecise when multicollinearity is present, the estimated equation may still be useful. For example, suppose we wish to predict new observations. If these predictions are required in the region of the $x$-space where the multicollinearity is in effect, then often satisfactory results will be obtained, because while individual $\beta_j$ may be poorly estimated, the function $\sum_{j=1}^{k}\beta_j x_j$ may be estimated quite well. On the other hand, if the prediction of new observations requires extrapolation, then generally we would expect to obtain poor results. Extrapolation usually requires good estimates of the individual model parameters.

Multicollinearity arises for several reasons. It will occur when the analyst collects the data such that a constraint of the form $\sum_{j=1}^{k}a_j x_j = 0$ holds among the columns of the $\mathbf{X}$ matrix (the $a_j$ are constants, not all zero). For example, if four regressor variables are the components of a mixture, then such a constraint will always exist because the sum of the components is always constant. Usually, these constraints do not hold exactly, and the analyst does not know that they exist.

There are several ways to detect the presence of multicollinearity. Some of the more important of these are briefly discussed.

1. The variance inflation factors, defined in equation 15-63, are very useful measures of multicollinearity. The larger the variance inflation factor, the more severe the multicollinearity. Some authors have suggested that if any variance inflation factor exceeds 10, then multicollinearity is a problem. Other authors consider this value too liberal and suggest that the variance inflation factors should not exceed 4 or 5.
2. The determinant of the correlation matrix may also be used as a measure of multicollinearity. The value of this determinant can range between 0 and 1. When the value of the determinant is 1, the columns of the X matrix are orthogonal (i.e., there is no intercorrelation between the regression variables), and when the value is 0, there is an exact linear dependency among the columns of X. The smaller the value of the determinant, the greater the degree of multicollinearity.
3. The eigenvalues, or characteristic roots, of the correlation matrix provide a measure of multicollinearity. If $\mathbf{X}'\mathbf{X}$ is in correlation form, then the eigenvalues of $\mathbf{X}'\mathbf{X}$ are the roots of the equation

$$ |\mathbf{X}'\mathbf{X} - \lambda\mathbf{I}| = 0. $$

One or more eigenvalues near zero implies that multicollinearity is present. If $\lambda_{\max}$ and $\lambda_{\min}$ denote the largest and smallest eigenvalues of $\mathbf{X}'\mathbf{X}$, then the ratio $\lambda_{\max}/\lambda_{\min}$ can also be used as a measure of multicollinearity. The larger the value of this ratio, the greater the degree of multicollinearity. Generally, if the ratio $\lambda_{\max}/\lambda_{\min}$ is less than 10, there is little problem with multicollinearity.
4. Sometimes inspection of the individual elements of the correlation matrix can be helpful in detecting multicollinearity. If an element $|r_{ij}|$ is close to 1, then $x_i$ and $x_j$ may be strongly multicollinear. However, when more than two regressor variables are involved in a multicollinear fashion, the individual $r_{ij}$ are not necessarily large. Thus, this method will not always enable us to detect the presence of multicollinearity.
5. If the F-test for significance of regression is significant but tests on the individual regression coefficients are not significant, then multicollinearity may be present.

Several remedial measures have been proposed for resolving the problem of multicollinearity. Augmenting the data with new observations specifically designed to break up the approximate linear dependencies that currently exist is often suggested. However, sometimes this is impossible for economic reasons, or because of the physical constraints that relate the $x_j$. Another possibility is to delete certain variables from the model. This suffers from the disadvantage that one must discard the information contained in the deleted variables.

Since multicollinearity primarily affects the stability of the regression coefficients, it would seem that estimating these parameters by some method that is less sensitive to multicollinearity than ordinary least squares would be helpful. Several methods have been suggested for this. Hoerl and Kennard (1970a, b) have proposed ridge regression as an

alternative to ordinary least squares. In ridge regression, the parameter estimates are obtained by solving

$$ \hat{\boldsymbol{\beta}}^*(l) = (\mathbf{X}'\mathbf{X} + l\mathbf{I})^{-1}\mathbf{X}'\mathbf{y}, \tag{15-64} $$

where $l > 0$ is a constant. Generally, values of $l$ in the interval $0 \le l \le 1$ are appropriate. The ridge estimator $\hat{\boldsymbol{\beta}}^*(l)$ is not an unbiased estimator of $\boldsymbol{\beta}$, as is the ordinary least-squares estimator $\hat{\boldsymbol{\beta}}$, but the mean square error of $\hat{\boldsymbol{\beta}}^*(l)$ will be smaller than the mean square error of $\hat{\boldsymbol{\beta}}$. Thus ridge regression seeks to find a set of regression coefficients that is more "stable," in the sense of having a small mean square error. Since multicollinearity usually results in ordinary least-squares estimators that may have extremely large variances, ridge regression is suitable for situations where the multicollinearity problem exists.

To obtain the ridge regression estimator from equation 15-64, one must specify a value for the constant $l$. Of course, there is an "optimum" $l$ for any problem, but the simplest approach is to solve equation 15-64 for several values of $l$ in the interval $0 \le l \le 1$. Then a plot of the values of $\hat{\boldsymbol{\beta}}^*(l)$ against $l$ is constructed. This display is called the ridge trace. The appropriate value of $l$ is chosen subjectively by inspection of the ridge trace. Typically, a value for $l$ is chosen such that relatively stable parameter estimates are obtained. In general, the variance of $\hat{\boldsymbol{\beta}}^*(l)$ is a decreasing function of $l$, while the squared bias $[\boldsymbol{\beta} - \hat{\boldsymbol{\beta}}^*(l)]^2$ is an increasing function of $l$. Choosing the value of $l$ involves trading off these two properties of $\hat{\boldsymbol{\beta}}^*(l)$.

A good discussion of the practical use of ridge regression is in Marquardt and Snee (1975). Also, there are several other biased estimation techniques that have been proposed for dealing with multicollinearity. Several of these are discussed in Montgomery, Peck, and Vining (2001).
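A ridge trace is straightforward to generate by solving equation 15-64 over a grid of $l$ values. The sketch below is an illustration only (it assumes numpy and uses simulated, nearly collinear data rather than the cement data of Example 15-14):

import numpy as np

def ridge_path(X, y, l_values):
    """Solve equation 15-64 for each value of l and return the estimates."""
    XtX, Xty = X.T @ X, X.T @ y
    p = XtX.shape[0]
    return {l: np.linalg.solve(XtX + l * np.eye(p), Xty) for l in l_values}

rng = np.random.default_rng(7)
x1 = rng.normal(size=30)
x2 = x1 + 0.01 * rng.normal(size=30)          # nearly collinear with x1
X = np.column_stack([x1, x2])
y = x1 + x2 + 0.1 * rng.normal(size=30)

for l, b in ridge_path(X, y, [0.0, 0.01, 0.05, 0.1, 0.5]).items():
    print(f"l = {l:4.2f}   beta*(l) = {b}")   # l = 0 is ordinary least squares

Plotting these coefficient paths against l gives the ridge trace; a value of l is then picked where the curves flatten out, as in Fig. 15-7.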

Example 15-14
(Based on an example in Hald, 1952.) The heat generated in calories per gram for a particular type of cement, as a function of the quantities of four additives ($z_1$, $z_2$, $z_3$, and $z_4$), is shown in Table 15-10. We wish to fit a multiple linear regression model to these data.

Table 15-10 Data for Example 15-14

Observation Number      y       z1     z2     z3     z4
 1                     28.25    10     31      5     45
 2                     24.80    12     35      5     52
 3                     11.86     5     15      3     24
 4                     36.60    17     42      9     65
 5                     15.80     8      6      5     19
 6                     16.23     6     17      3     25
 7                     29.50    12     36      6     55
 8                     28.75    10     34      5     50
 9                     43.20    18     40     10     70
10                     38.47    23     50     10     80
11                     10.14    16     37      5     61
12                     38.92    20     40     11     70
13                     36.70    15     45      8     68
14                     15.31     7     22      2     30
15                      8.40     9     12      3     24

The data will be coded by defining a new set of regressor variables as

$$ x_{ij} = \frac{z_{ij} - \bar{z}_j}{\sqrt{S_{jj}}}, \qquad i = 1, 2, \ldots, 15, \quad j = 1, 2, 3, 4, $$

where $S_{jj} = \sum_{i=1}^{15}(z_{ij} - \bar{z}_j)^2$ is the corrected sum of squares of the levels of $z_j$. The coded data are shown in Table 15-11. This transformation makes the intercept orthogonal to the other regression coefficients, since the first column of the $\mathbf{X}$ matrix consists of ones. Therefore, the intercept in this model will always be estimated by $\bar{y}$. The $(4 \times 4)$ $\mathbf{X}'\mathbf{X}$ matrix for the four coded variables is the correlation matrix

$$ \mathbf{X}'\mathbf{X} = \begin{bmatrix} 1.00000 & 0.84894 & 0.91412 & 0.93367 \\ 0.84894 & 1.00000 & 0.76899 & 0.97567 \\ 0.91412 & 0.76899 & 1.00000 & 0.86784 \\ 0.93367 & 0.97567 & 0.86784 & 1.00000 \end{bmatrix}. $$
This matrix contains several large correlation coefficients, and this may indicate significant multicollinearity. The inverse of $\mathbf{X}'\mathbf{X}$ is

$$ (\mathbf{X}'\mathbf{X})^{-1} = \begin{bmatrix} 20.769 & 25.813 & -0.608 & -44.042 \\ 25.813 & 74.486 & 12.597 & -107.710 \\ -0.608 & 12.597 & 8.274 & -18.903 \\ -44.042 & -107.710 & -18.903 & 163.620 \end{bmatrix}. $$

The variance inflation factors are the main diagonal elements of this matrix. Note that three of the variance inflation factors exceed 10, a good indication that multicollinearity is present. The eigenvalues of $\mathbf{X}'\mathbf{X}$ are $\lambda_1 = 3.657$, $\lambda_2 = 0.2679$, $\lambda_3 = 0.07127$, and $\lambda_4 = 0.004014$. Two of the eigenvalues, $\lambda_3$ and $\lambda_4$, are relatively close to zero. Also, the ratio of the largest to the smallest eigenvalue is

$$ \frac{\lambda_{\max}}{\lambda_{\min}} = \frac{3.657}{0.004014} = 911.06, $$

which is considerably larger than 10. Therefore, since examination of the variance inflation factors and the eigenvalues indicates potential problems with multicollinearity, we will use ridge regression to estimate the model parameters.

Table 15-11 Coded Data for Example 15-14

Observation Number      y          x1          x2          x3          x4
 1                     28.25    -0.12515     0.00405    -0.09206    -0.05538
 2                     24.80    -0.02635     0.08495    -0.09206     0.03692
 3                     11.86    -0.37217    -0.31957    -0.27617    -0.33226
 4                     36.60     0.22066     0.22653     0.27617     0.20832
 5                     15.80    -0.22396    -0.50161    -0.09206    -0.39819
 6                     16.23    -0.32276    -0.27912    -0.27617    -0.31907
 7                     29.50    -0.02635     0.10518     0.00000     0.07641
 8                     28.75    -0.12515     0.06472    -0.09206     0.01055
 9                     43.20     0.27007     0.18608     0.36823     0.27425
10                     38.47     0.51709     0.38834     0.36823     0.40609
11                     10.14     0.17126     0.12540    -0.09206     0.15558
12                     38.92     0.36887     0.18608     0.46029     0.27425
13                     36.70     0.12186     0.28721     0.18411     0.24788
14                     15.31    -0.27336    -0.17799    -0.36823    -0.25315
15                      8.40    -0.17456    -0.38025    -0.27617    -0.33226

We solved equation 15-64 for various values of $l$, and the results are summarized in Table 15-12. The ridge trace is shown in Fig. 15-7. The instability of the least-squares estimates $\hat{\boldsymbol{\beta}}^*(l = 0)$ is evident from inspection of the ridge trace. It is often difficult to choose a value of $l$ from the ridge trace that simultaneously stabilizes the estimates of all regression coefficients. We will choose $l = 0.064$, which implies that the regression model is

$$ \hat{y} = 25.53 - 18.0566x_1 + 17.2202x_2 + 36.0743x_3 + 4.7242x_4, $$

using $\hat{\beta}_0^* = \bar{y} = 25.53$. Converting the model to the original variables $z_j$, we have

$$ \hat{y} = 2.9913 - 0.8920z_1 + 0.3483z_2 + 3.3209z_3 - 0.0623z_4. $$

Table 15-12 Ridge Regression Estimates for Example 15-14

  l         β1*(l)       β2*(l)       β3*(l)       β4*(l)
0.000     -28.3318      65.9996      64.0479     -57.2491
0.001     -31.0360      57.0244      61.9645     -44.0901
0.002     -32.6441      50.9649      60.3899     -35.3088
0.004     -34.1071      43.2358      58.0266     -24.3241
0.008     -34.3195      35.1426      54.7018     -13.3348
0.016     -31.9710      27.9534      50.0949      -4.5489
0.032     -26.3451      22.0347      43.8309       1.2950
0.064     -18.0566      17.2202      36.0743       4.7242
0.128      -9.1786      13.4944      27.9363       6.5914
0.256      -1.9896      10.9160      20.8028       7.5076
0.512       2.4922       9.2014      15.3197       7.7224

Figure 15-7 Ridge trace for Example 15-14.

15-10.2 Influential Observations in Regression

When using multiple regression we occasionally find that some small subset of the observations is unusually influential. Sometimes these influential observations are relatively far away from the vicinity where the rest of the data were collected. A hypothetical situation for two variables is depicted in Fig. 15-8, where one observation in $x$-space is remote from the rest of the data. The disposition of points in the $x$-space is important in determining the properties of the model. For example, the point $(x_{i1}, x_{i2})$ in Fig. 15-8 may be very influential in determining the estimates of the regression coefficients, the value of $R^2$, and the value of $MS_E$.

We would like to examine the data points used to build a regression model to determine if they control many model properties. If these influential points are "bad" points, or are erroneous in any way, then they should be eliminated. On the other hand, there may be nothing wrong with these points, but at least we would like to determine whether or not they produce results consistent with the rest of the data. In any event, even if an influential point is a valid one, if it controls important model properties, we would like to know this, since it could have an impact on the use of the model.
Montgomery, Peck, and Vining (2001) describe several methods for detecting influential observations. An excellent diagnostic is the Cook (1977, 1979) distance measure. This is a measure of the squared distance between the least-squares estimate of $\boldsymbol{\beta}$ based on all $n$ observations and the estimate $\hat{\boldsymbol{\beta}}_{(i)}$ based on removal of the $i$th point. The Cook distance measure is

$$ D_i = \frac{(\hat{\boldsymbol{\beta}}_{(i)} - \hat{\boldsymbol{\beta}})'\mathbf{X}'\mathbf{X}(\hat{\boldsymbol{\beta}}_{(i)} - \hat{\boldsymbol{\beta}})}{p\,MS_E}, \qquad i = 1, 2, \ldots, n. $$

Clearly, if the $i$th point is influential, its removal will result in $\hat{\boldsymbol{\beta}}_{(i)}$ changing considerably from the value $\hat{\boldsymbol{\beta}}$. Thus a large value of $D_i$ implies that the $i$th point is influential. The statistic $D_i$ is actually computed using

$$ D_i = \frac{r_i^2}{p}\cdot\frac{h_{ii}}{1 - h_{ii}}, \tag{15-65} $$

where $r_i = e_i/\sqrt{MS_E(1 - h_{ii})}$ and $h_{ii}$ is the $i$th diagonal element of the matrix

$$ \mathbf{H} = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'. $$

Figure 15-8 A point that is remote in $x$-space.



The $\mathbf{H}$ matrix is sometimes called the "hat" matrix, since

$$ \hat{\mathbf{y}} = \mathbf{X}\hat{\boldsymbol{\beta}} = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y} = \mathbf{H}\mathbf{y}. $$

Thus $\mathbf{H}$ is a projection matrix that transforms the observed values of $\mathbf{y}$ into a set of fitted values $\hat{\mathbf{y}}$.

From equation 15-65 we note that $D_i$ is made up of a component that reflects how well the model fits the $i$th observation $y_i$ [the quantity $e_i/\sqrt{MS_E(1-h_{ii})}$ is called a Studentized residual, and it is a method of scaling residuals so that they have unit variance] and a component that measures how far that point is from the rest of the data [$h_{ii}/(1-h_{ii})$ is the distance of the $i$th point from the centroid of the remaining $n-1$ points]. A value of $D_i > 1$ would indicate that the point is influential. Either component of $D_i$ (or both) may contribute to a large value.
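Computationally, all the ingredients of equation 15-65 come from a single least-squares fit. The sketch below is an illustration only (it assumes numpy and is not part of the text); it computes $D_i$ for every observation from the hat matrix:

import numpy as np

def cooks_distance(X, y):
    """Cook's distance (equation 15-65) for each observation.
    X must already contain the column of ones for the intercept."""
    n, p = X.shape
    H = X @ np.linalg.inv(X.T @ X) @ X.T       # the "hat" matrix
    h = np.diag(H)
    e = y - H @ y                              # ordinary residuals
    mse = (e @ e) / (n - p)
    r = e / np.sqrt(mse * (1.0 - h))           # Studentized residuals
    return (r**2 / p) * h / (1.0 - h)

# Usage: flag observations with D_i > 1 as potentially influential, e.g.
# D = cooks_distance(X, y); print(np.where(D > 1.0)[0])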

Example 15-15
Table 15-13 lists the values of $D_i$ for the damaged peaches data in Example 15-1. To illustrate the calculations, consider the first observation:

Table 15-13 Influence Diagnostics for the Damaged Peaches Data in Example 15-15

Observation                   Cook's Distance
(i)              h_ii         Measure D_i
 1               0.249        0.277
 2               0.126        0.000
 3               0.088        0.156
 4               0.104        0.061
 5               0.060        0.000
 6               0.299        0.000
 7               0.081        0.008
 8               0.055        0.002
 9               0.193        0.116
10               0.117        0.024
11               0.250        0.037
12               0.109        0.002
13               0.213        0.045
14               0.055        0.056
15               0.147        0.067
16               0.080        0.001
17               0.209        0.001
18               0.276        0.160
19               0.210        0.171
20               0.078        0.012

$$ D_1 = \frac{r_1^2}{p}\cdot\frac{h_{11}}{1-h_{11}} = \frac{\left[2.06\big/\sqrt{2.247(1-0.249)}\right]^2}{3}\cdot\frac{0.249}{1-0.249} = 0.277. $$

The values in Table 15-13 were calculated using Minitab®. The Cook distance measure $D_i$ does not identify any potentially influential observations in the data, as no value of $D_i$ exceeds unity.

15-10.3 Autocorrelation

The regression models developed thus far have assumed that the model error components $\epsilon_i$ are uncorrelated random variables. Many applications of regression analysis involve data for which this assumption may be inappropriate. In regression problems where the dependent and independent variables are time oriented or are time-series data, the assumption of uncorrelated errors is often untenable. For example, suppose we regressed the quarterly sales of a product against the quarterly point-of-sale advertising expenditures. Both variables are time series, and if they are positively correlated with other factors, such as disposable income and population, which are not included in the model, then it is likely that the error terms in the regression model are positively correlated over time. Variables that exhibit correlation over time are referred to as autocorrelated variables. Many regression problems in economics, business, and agriculture involve autocorrelated errors.

The occurrence of positively autocorrelated errors has several potentially serious consequences. The ordinary least-squares estimators of the parameters are affected in that they are no longer minimum variance estimators, although they are still unbiased. Furthermore, the mean square error $MS_E$ may underestimate the error variance $\sigma^2$. Also, confidence intervals and tests of hypotheses, which are developed assuming uncorrelated errors, are not valid if autocorrelation is present.

There are several statistical procedures that can be used to determine whether the error terms in the model are uncorrelated. We will describe one of these, the Durbin-Watson test. This test assumes that the data are generated by the first-order autoregressive model

$$ y_t = \beta_0 + \beta_1 x_t + \epsilon_t, \qquad t = 1, 2, \ldots, n, \tag{15-66} $$

where $t$ is the index of time and the error terms are generated according to the process

$$ \epsilon_t = \rho\epsilon_{t-1} + \delta_t, \tag{15-67} $$

where $|\rho| < 1$ is an unknown parameter and $\delta_t$ is a NID(0, $\sigma^2$) random variable. Equation 15-66 is a simple linear regression model, except for the errors, which are generated from equation 15-67. The parameter $\rho$ in equation 15-67 is the autocorrelation coefficient. The Durbin-Watson test can be applied to the hypotheses

$$ H_0\colon \rho = 0, \qquad H_1\colon \rho > 0. \tag{15-68} $$

Note that if $H_0\colon \rho = 0$ is not rejected, we are implying that there is no autocorrelation in the errors, and the ordinary linear regression model is appropriate.

To test $H_0\colon \rho = 0$, first fit the regression model by ordinary least squares. Then calculate the Durbin-Watson test statistic

$$ D = \frac{\sum_{t=2}^{n}(e_t - e_{t-1})^2}{\sum_{t=1}^{n}e_t^2}, \tag{15-69} $$

where $e_t$ is the $t$th residual. For a suitable value of $\alpha$, obtain the critical values $D_{\alpha,U}$ and $D_{\alpha,L}$ from Table 15-14. If $D > D_{\alpha,U}$, do not reject $H_0\colon \rho = 0$; but if $D < D_{\alpha,L}$, reject $H_0\colon \rho = 0$ and conclude that the errors are positively autocorrelated. If $D_{\alpha,L} \le D \le D_{\alpha,U}$, the test is
Table 15-14 Critical Values of the Durbin-Watson Statistic

                Probability in          k = Number of Regressors (Excluding the Intercept)
Sample          Lower Tail                  1            2            3            4            5
Size            (Significance           D_L  D_U     D_L  D_U     D_L  D_U     D_L  D_U     D_L  D_U
                Level = α)
 15             0.01                    0.81 1.07    0.70 1.25    0.59 1.46    0.49 1.70    0.39 1.96
                0.025                   0.95 1.23    0.83 1.40    0.71 1.61    0.59 1.84    0.48 2.09
                0.05                    1.08 1.36    0.95 1.54    0.82 1.75    0.69 1.97    0.56 2.21
 20             0.01                    0.95 1.15    0.86 1.27    0.77 1.41    0.68 1.57    0.60 1.74
                0.025                   1.08 1.28    0.99 1.41    0.89 1.55    0.79 1.70    0.70 1.87
                0.05                    1.20 1.41    1.10 1.54    1.00 1.68    0.90 1.83    0.79 1.99
 25             0.01                    1.05 1.21    0.98 1.30    0.90 1.41    0.83 1.52    0.75 1.65
                0.025                   1.18 1.34    1.10 1.43    1.02 1.54    0.94 1.65    0.86 1.77
                0.05                    1.29 1.45    1.21 1.55    1.12 1.66    1.04 1.77    0.95 1.89
 30             0.01                    1.13 1.26    1.07 1.34    1.01 1.42    0.94 1.51    0.88 1.61
                0.025                   1.25 1.38    1.18 1.46    1.12 1.54    1.05 1.63    0.98 1.73
                0.05                    1.35 1.49    1.28 1.57    1.21 1.65    1.14 1.74    1.07 1.83
 40             0.01                    1.25 1.34    1.20 1.40    1.15 1.46    1.10 1.52    1.05 1.58
                0.025                   1.35 1.45    1.30 1.51    1.25 1.57    1.20 1.63    1.15 1.69
                0.05                    1.44 1.54    1.39 1.60    1.34 1.66    1.29 1.72    1.23 1.79
 50             0.01                    1.32 1.40    1.28 1.45    1.24 1.49    1.20 1.54    1.16 1.59
                0.025                   1.42 1.50    1.38 1.54    1.34 1.59    1.30 1.64    1.26 1.69
                0.05                    1.50 1.59    1.46 1.63    1.42 1.67    1.38 1.72    1.34 1.77
 60             0.01                    1.38 1.45    1.35 1.48    1.32 1.52    1.28 1.56    1.25 1.60
                0.025                   1.47 1.54    1.44 1.57    1.40 1.61    1.37 1.65    1.33 1.69
                0.05                    1.55 1.62    1.51 1.65    1.48 1.69    1.44 1.73    1.41 1.77
 80             0.01                    1.47 1.52    1.44 1.54    1.42 1.57    1.39 1.60    1.36 1.62
                0.025                   1.54 1.59    1.52 1.62    1.49 1.65    1.47 1.67    1.44 1.70
                0.05                    1.61 1.66    1.59 1.69    1.56 1.72    1.53 1.74    1.51 1.77
100             0.01                    1.54 1.56    1.50 1.58    1.48 1.60    1.45 1.63    1.44 1.65
                0.025                   1.59 1.63    1.57 1.65    1.55 1.67    1.53 1.70    1.51 1.72
                0.05                    1.65 1.69    1.63 1.72    1.61 1.74    1.59 1.76    1.57 1.78

Source: Adapted from Econometrics, by R. J. Wonnacott and T. H. Wonnacott, John Wiley & Sons, New York, 1970, with permission of the publisher.

inconclusive. When the test is inconclusive, the implication is that more data must be collected. In many problems this is difficult to do.

To test for negative autocorrelation, that is, if the alternative hypothesis in equation 15-68 is $H_1\colon \rho < 0$, then use $D' = 4 - D$ as the test statistic, where $D$ is defined in equation 15-69. If a two-sided alternative is specified, then use both of the one-sided procedures, noting that the type I error for the two-sided test is $2\alpha$, where $\alpha$ is the type I error for the one-sided tests.
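The statistic itself is a one-line computation once the residuals are available. The sketch below is an illustration only (it assumes numpy and is not from the text); it evaluates equation 15-69 on simulated first-order autoregressive errors:

import numpy as np

def durbin_watson(e):
    """Durbin-Watson statistic (equation 15-69); e holds the residuals in time order."""
    return np.sum(np.diff(e)**2) / np.sum(e**2)

# Simulate errors from the process of equation 15-67 with rho = 0.8.
rng = np.random.default_rng(3)
e = np.zeros(50)
for t in range(1, 50):
    e[t] = 0.8 * e[t-1] + rng.normal()

print(durbin_watson(e))   # well below 2, signaling positive autocorrelation

Values of D near 2 are consistent with uncorrelated errors; the computed value is compared against the critical values in Table 15-14.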
The only effective remedial measure when autocorrelation is present is to build a model that accounts explicitly for the autocorrelative structure of the errors. For an introductory treatment of these methods, refer to Montgomery, Peck, and Vining (2001).

15-11 SELECTION OF VARIABLES IN MULTIPLE REGRESSION

15-11.1 The Model-Building Problem

An important problem in many applications of regression analysis is the selection of the set of independent or regressor variables to be used in the model. Sometimes previous experience or underlying theoretical considerations can help the analyst specify the set of independent variables. Usually, however, the problem consists of selecting an appropriate set of regressors from a set that quite likely includes all the important variables, but we are sure that not all these candidate variables are necessary to adequately model the response $y$.

In such a situation, we are interested in screening the candidate variables to obtain a regression model that contains the "best" subset of regressor variables. We would like the final model to contain enough regressor variables so that in the intended use of the model (prediction, for example) it will perform satisfactorily. On the other hand, to keep model maintenance costs to a minimum, we would like the model to use as few regressor variables as possible. The compromise between these conflicting objectives is often called finding the "best" regression equation. However, in most problems, there is no single regression model that is "best" in terms of the various evaluation criteria that have been proposed. A great deal of judgment and experience with the system being modeled is usually necessary to select an appropriate set of independent variables for a regression equation.

No algorithm will always produce a good solution to the variable selection problem. Most currently available procedures are search techniques. To perform satisfactorily, they require interaction with and judgment by the analyst. We now briefly discuss some of the more popular variable selection techniques.

15-11.2 Computational Procedures for Variable Selection

We assume that there are $k$ candidate variables, $x_1, x_2, \ldots, x_k$, and a single dependent variable $y$. All models will include an intercept term $\beta_0$, so that the model with all variables included would have $k + 1$ terms. Furthermore, the functional form of each candidate variable (for example, $x_1 = 1/x$, $x_2 = \ln x$, etc.) is correct.

All Possible Regressions  This approach requires that the analyst fit all the regression equations involving one candidate variable, all regression equations involving two candidate variables, and so on. Then these equations are evaluated according to some suitable criteria to select the "best" regression model. If there are $k$ candidate variables, there are $2^k$ total equations to be examined. For example, if $k = 4$, there are $2^4 = 16$ possible regression equations, while if $k = 10$, there are $2^{10} = 1024$ possible regression equations. Hence, the number of equations to be examined increases rapidly as the number of candidate variables increases.
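For moderate k the enumeration is easy to carry out directly. The sketch below is an illustration only (it assumes numpy and is not from the text); it fits every non-empty subset by least squares and records its error sum of squares:

import numpy as np
from itertools import combinations

def all_possible_regressions(X, y):
    """Fit all 2^k - 1 non-empty subsets of the columns of X (X has no
    intercept column); return a dict mapping each subset to its SS_E."""
    n, k = X.shape
    results = {}
    for size in range(1, k + 1):
        for subset in combinations(range(k), size):
            Xs = np.column_stack([np.ones(n), X[:, subset]])
            beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
            e = y - Xs @ beta
            results[subset] = e @ e
    return results

# With k = 10 candidate variables this already fits 1023 separate models.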

There are a number of criteria that may be used for evaluating and comparing the different regression models obtained. Perhaps the most commonly used criterion is based on the coefficient of multiple determination. Let $R_p^2$ denote the coefficient of determination for a regression model with $p$ terms, that is, $p - 1$ candidate variables and an intercept term (note that $p \le k + 1$). Computationally, we have

$$ R_p^2 = \frac{SS_R(p)}{S_{yy}} = 1 - \frac{SS_E(p)}{S_{yy}}, \tag{15-70} $$

where $SS_R(p)$ and $SS_E(p)$ denote the regression sum of squares and the error sum of squares, respectively, for the $p$-variable equation. Now $R_p^2$ increases as $p$ increases and is a maximum when $p = k + 1$. Therefore, the analyst uses this criterion by adding variables to the model up to the point where an additional variable is not useful in that it gives only a small increase in $R_p^2$. The general approach is illustrated in Fig. 15-9, which gives a hypothetical plot of $R_p^2$ against $p$. Typically, one examines a display such as this and chooses the number of variables in the model as the point at which the "knee" in the curve becomes apparent. Clearly, this requires judgment on the part of the analyst.

A second criterion is to consider the mean square error for the $p$-variable equation, say $MS_E(p) = SS_E(p)/(n - p)$. Generally, $MS_E(p)$ decreases as $p$ increases, but this is not necessarily so. If the addition of a variable to the model with $p - 1$ terms does not reduce the error sum of squares in the new $p$-term model by an amount equal to the error mean square in the old $p - 1$ term model, $MS_E(p)$ will increase, because of the loss of one degree of freedom for error. Therefore, a logical criterion is to select $p$ as the value that minimizes $MS_E(p)$; or, since $MS_E(p)$ is usually relatively flat in the vicinity of the minimum, we could choose $p$ such that adding more variables to the model produces only very small reductions in $MS_E(p)$. The general procedure is illustrated in Fig. 15-10.
A third criterion is the $C_p$ statistic, which is a measure of the total mean square error for the regression model. We define the total standardized mean square error as

$$ \Gamma_p = \frac{1}{\sigma^2}\sum_{i=1}^{n}E\left[\hat{y}_i - E(y_i)\right]^2 = \frac{1}{\sigma^2}\left[\sum_{i=1}^{n}\left\{E(y_i) - E(\hat{y}_i)\right\}^2 + \sum_{i=1}^{n}V(\hat{y}_i)\right] = \frac{1}{\sigma^2}\left[(\text{bias})^2 + \text{variance}\right]. $$
Figure 15-9 Plot of $R_p^2$ against $p$.

Figure 15-10 Plot of $MS_E(p)$ against $p$.

We use the mean square error from the full $k + 1$ term model as an estimate of $\sigma^2$; that is, $\hat{\sigma}^2 = MS_E(k + 1)$. An estimator of $\Gamma_p$ is

$$ C_p = \frac{SS_E(p)}{\hat{\sigma}^2} - n + 2p. \tag{15-71} $$

If the $p$-term model has negligible bias, then it can be shown that

$$ E(C_p \mid \text{zero bias}) = p. $$

Therefore, the values of $C_p$ for each regression model under consideration should be plotted against $p$. The regression equations that have negligible bias will have values of $C_p$ that fall near the line $C_p = p$, while those with significant bias will have values of $C_p$ that plot above this line. One then chooses as the "best" regression equation either a model with minimum $C_p$ or a model with a slightly larger $C_p$ that contains less bias than the minimum.
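Given the SS_E of each candidate model and the full-model MS_E, equation 15-71 is a one-line computation. The small sketch below is an illustration only, not from the text; its numerical check uses the four-variable model reported in Example 15-16 later in this section:

def cp_statistic(sse_p, sigma2_hat, n, p):
    """C_p of equation 15-71: sse_p is SS_E of a p-term model,
    sigma2_hat is MS_E of the full (k + 1)-term model."""
    return sse_p / sigma2_hat - n + 2 * p

# Four-variable model (p = 5 terms) of Example 15-16:
print(cp_statistic(sse_p=18.29712, sigma2_hat=1.13132, n=20, p=5))  # about 6.17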
Another criterion is based on a modification of $R_p^2$ that accounts for the number of variables in the model. We presented this statistic in Section 15-6.1, the adjusted $R^2$ for the model fit in Example 15-1. This statistic is called the adjusted $R_p^2$, defined as

$$ R_{\text{adj}}^2(p) = 1 - \frac{n-1}{n-p}\left(1 - R_p^2\right). \tag{15-72} $$

Note that $R_{\text{adj}}^2(p)$ may decrease as $p$ increases if the decrease in $(n-1)(1 - R_p^2)$ is not compensated for by the loss of one degree of freedom in $n - p$. The experimenter would usually select the regression model that has the maximum value of $R_{\text{adj}}^2(p)$. However, note that this is equivalent to the model that minimizes $MS_E(p)$, since

$$ R_{\text{adj}}^2(p) = 1 - \frac{(n-1)MS_E(p)}{S_{yy}}. $$

Example 15-16
The data in Table 15-15 are an expanded set of data for the damaged peach data in Example 15-1. There are now five candidate variables: drop height ($x_1$), fruit density ($x_2$), fruit height at impact point ($x_3$), fruit pulp thickness ($x_4$), and potential energy of the fruit before the impact ($x_5$).
Table 15-16 presents the results of running all possible regressions (except the trivial model with only an intercept) on these data. The values of $R_p^2$, $R_{\text{adj}}^2(p)$, $MS_E(p)$, and $C_p$ are given in the table. A plot of the maximum $R_p^2$ for each subset of size $p$ is shown in Fig. 15-11. Based on this plot, there does not appear to be much gain in adding the fifth variable. The value of $R_p^2$ does not seem to increase significantly with the addition of $x_5$ over the four-variable model with the highest $R_p^2$ value. A plot of the minimum $MS_E(p)$ for each subset of size $p$ is shown in Fig. 15-12. The best two-variable model is either ($x_1$, $x_2$) or ($x_2$, $x_3$); the best three-variable model is ($x_1$, $x_2$, $x_3$); the best four-variable model is either ($x_1$, $x_2$, $x_3$, $x_4$) or ($x_1$, $x_2$, $x_3$, $x_5$). There are several models with relatively small values of $MS_E(p)$, but either the three-variable model ($x_1$, $x_2$, $x_3$) or the four-variable model ($x_1$, $x_2$, $x_3$, $x_4$) would be superior to the other models based on the $MS_E(p)$ criterion. Further investigation will be necessary.

A $C_p$ plot is shown in Fig. 15-13. Only the five-variable model has $C_p \le p$ (specifically $C_p = 6.0$), but the $C_p$ value for the four-variable model ($x_1$, $x_2$, $x_3$, $x_4$) is $C_p = 6.1732$. There appears to be insufficient gain in the $C_p$ value to justify including $x_5$. To illustrate the calculations, for this equation [the model including ($x_1$, $x_2$, $x_3$, $x_4$)] we would find

$$ C_p = \frac{SS_E(p)}{\hat{\sigma}^2} - n + 2p = \frac{18.29712}{1.13132} - (20 - 2(5)) = 6.1732, $$

Table 15-15 Damaged Peach Data for Example 15-16

                        Drop       Fruit       Fruit      Fruit Pulp    Potential
Observation     y       Height,    Density,    Height,    Thickness,    Energy,
                        x1         x2          x3         x4            x5
 1            3.62      303.7      0.90        26.1       22.3          184.5
 2            7.27      366.7      1.04        18.0       21.5          185.2
 3            2.68      336.8      1.01        39.0       22.9          128.4
 4            1.53      304.5      0.95        48.5       20.4          173.0
 5            4.91      346.8      0.98        43.1       18.7          139.6
 6           10.36      600.0      1.04        21.0       17.0          146.5
 7            5.26      369.0      0.96        12.7       20.1          155.5
 8            6.09      418.0      1.00        46.0       18.1          129.2
 9            6.57      269.0      1.01         2.6       21.5          154.6
10            4.24      323.0      0.94         6.9       24.4          152.8
11            8.04      562.2      1.01        27.3       19.5          199.6
12            3.46      284.2      0.97        30.6       20.2          177.5
13            8.50      558.6      1.03        37.7       22.6          210.0
14            9.34      415.0      1.01        26.1       17.1          165.1
15            5.55      349.5      1.04        48.0       21.5          195.3
16            8.11      462.8      1.02        32.8       23.7          171.0
17            7.32      333.1      1.05        29.2       21.9          163.9
18           12.58      502.1      1.10         4.0       16.9          140.8
19            0.15      311.4      0.91        39.2       26.0          154.1
20            5.23      351.4      0.96        36.3       23.5          194.6


Figure 15-11 The maximum $R_p^2$ plot for Example 15-16.

Figure 15-12 The $MS_E(p)$ plot for Example 15-16.

Figure 15-13 The $C_p$ plot for Example 15-16.

noting that $\hat{\sigma}^2 = 1.13132$ is obtained from the full equation ($x_1$, $x_2$, $x_3$, $x_4$, $x_5$). Since all other models (with the exclusion of the five-variable model) contain substantial bias, we would conclude on the basis of the $C_p$ criterion that the best subset of the regressor variables is ($x_1$, $x_2$, $x_3$, $x_4$). Since this model also results in a relatively small $MS_E(p)$ and a relatively high $R_p^2$, we would select it as the "best" regression equation. The final model is

$$ \hat{y} = -19.9 + 0.0123x_1 + 27.3x_2 - 0.0655x_3 - 0.196x_4. $$

Keep in mind, though, that further analysis should be conducted on this model as well as other possible candidate models. With additional investigation, it is possible to discover an even better-fitting model. We will discuss this in more detail later in this chapter.

The all-possible-regressions approach requires considerable computational effort, even when $k$ is moderately small. However, if the analyst is willing to look at something less than the estimated model and all its associated statistics, it is possible to devise algorithms for all possible regressions that produce less information about each model but which are more efficient computationally. For example, suppose that we could efficiently calculate only the $MS_E$ for each model. Since models with large $MS_E$ are not likely to be selected as the best regression equations, we would then have only to examine in detail the models with small values of $MS_E$. There are several approaches to developing a computationally efficient algorithm for all possible regressions (for example, see Furnival and Wilson, 1974). Both Minitab® and SAS computer packages provide the Furnival and Wilson (1974) algorithm as an option. The SAS output is provided in Table 15-16.

Stepwise Regression This is probably the most widely used variable selection technique.
The procedure iteratively constructs a sequence of regression models by adding or

Table 15-16 All Possible Regressions for the Data in Example 15-16

Number in                                                         Variables in
Model, p-1     R2_p       R2_adj(p)     C_p          MS_E(p)      Model
1              0.6530     0.6337         37.7047     3.37540      x2
1              0.5495     0.5245         53.7211     4.38205      x1
1              0.3553     0.3194         83.7824     6.27144      x3
1              0.1980     0.1535        108.1144     7.80074      x4
1              0.0021    -0.0534        138.4424     9.70689      x5
2              0.7805     0.7547         19.9697     2.26063      x1, x2
2              0.7393     0.7086         26.3536     2.68500      x2, x3
2              0.7086     0.6744         31.0914     3.00076      x2, x4
2              0.7030     0.6681         31.9641     3.05883      x1, x3
2              0.6601     0.6201         38.6031     3.50064      x2, x5
2              0.6412     0.5990         41.5311     3.69550      x1, x4
2              0.5528     0.5002         55.2140     4.60607      x1, x5
2              0.4940     0.4345         64.3102     5.21141      x3, x4
2              0.4020     0.3316         78.5531     6.15925      x3, x5
2              0.2125     0.1199        107.8731     8.11045      x4, x5
3              0.8756     0.8523          7.2532     1.36135      x1, x2, x3
3              0.8049     0.7683         18.1949     2.13501      x1, x2, x4
3              0.7898     0.7503         20.5368     2.30060      x2, x3, x4
3              0.7807     0.7396         21.9371     2.39961      x1, x2, x5
3              0.7721     0.7294         23.2681     2.49372      x1, x3, x4
3              0.7568     0.7112         25.6336     2.66098      x2, x3, x5
3              0.7337     0.6837         29.2199     2.91456      x2, x4, x5
3              0.7032     0.6475         33.9410     3.24838      x1, x3, x5
3              0.6448     0.5782         42.9705     3.88683      x1, x4, x5
3              0.5666     0.4853         55.0797     4.74304      x3, x4, x5
4              0.8955     0.8676          6.1732     1.21981      x1, x2, x3, x4
4              0.8795     0.8474          8.6459     1.40630      x1, x2, x3, x5
4              0.8316     0.7866         16.0687     1.96614      x2, x3, x4, x5
4              0.8103     0.7597         19.3611     2.21446      x1, x2, x4, x5
4              0.7854     0.7282         23.2090     2.50467      x1, x3, x4, x5
5              0.9095     0.8772          6.0000     1.13132      x1, x2, x3, x4, x5

removing variables at each step. The criterion for adding or removing a variable at any step is usually expressed in terms of a partial $F$-test. Let $F_{\text{in}}$ be the value of the $F$ statistic for adding a variable to the model, and let $F_{\text{out}}$ be the value of the $F$ statistic for removing a variable from the model. We must have $F_{\text{in}} \ge F_{\text{out}}$, and usually $F_{\text{in}} = F_{\text{out}}$.

Stepwise regression begins by forming a one-variable model using the regressor variable that has the highest correlation with the response variable $y$. This will also be the variable producing the largest $F$ statistic. If no $F$ statistic exceeds $F_{\text{in}}$, the procedure terminates. For example, suppose that at this step $x_1$ is selected. At the second step the remaining $k - 1$ candidate variables are examined, and the variable for which the statistic

$$ F_j = \frac{SS_R(\beta_j \mid \beta_1, \beta_0)}{MS_E(x_j, x_1)} \tag{15-73} $$

is a maximum is added to the equation, provided that $F_j > F_{\text{in}}$. In equation 15-73, $MS_E(x_j, x_1)$ denotes the mean square for error for the model containing both $x_1$ and $x_j$. Suppose that this procedure now indicates that $x_2$ should be added to the model. Now the stepwise regression algorithm determines whether the variable $x_1$ added at the first step should be removed. This is done by calculating the $F$ statistic

$$ F_1 = \frac{SS_R(\beta_1 \mid \beta_2, \beta_0)}{MS_E(x_1, x_2)}. \tag{15-74} $$

If $F_1 < F_{\text{out}}$, the variable $x_1$ is removed.

In general, at each step the set of remaining candidate variables is examined and the variable with the largest partial $F$ statistic is entered, provided that the observed value of $F$ exceeds $F_{\text{in}}$. Then the partial $F$ statistic for each variable in the model is calculated, and the variable with the smallest observed value of $F$ is deleted if the observed $F < F_{\text{out}}$. The procedure continues until no other variables can be added to or removed from the model.

Stepwise regression is usually performed using a computer program. The analyst exercises control over the procedure by the choice of $F_{\text{in}}$ and $F_{\text{out}}$. Some stepwise regression computer programs require that numerical values be specified for $F_{\text{in}}$ and $F_{\text{out}}$. Since the number of degrees of freedom on $MS_E$ depends on the number of variables in the model, which changes from step to step, a fixed value of $F_{\text{in}}$ and $F_{\text{out}}$ causes the type I and type II error rates to vary. Some computer programs allow the analyst to specify the type I error levels for $F_{\text{in}}$ and $F_{\text{out}}$. However, the "advertised" significance level is not the true level, because the variable selected is the one that maximizes the partial $F$ statistic at that stage. Sometimes it is useful to experiment with different values of $F_{\text{in}}$ and $F_{\text{out}}$ (or different advertised type I error rates) in several runs to see if this substantially affects the choice of the final model.
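A bare-bones version of this procedure is easy to code once partial F statistics are available. The sketch below is an illustration only (it assumes numpy, uses fixed thresholds f_in and f_out rather than percentage points of the F distribution, and relies on a simple iteration cap to guard against cycling):

import numpy as np

def _sse(X, y, cols):
    """SS_E of the least-squares fit using the columns in cols plus an intercept."""
    Xs = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    r = y - Xs @ beta
    return r @ r

def stepwise(X, y, f_in=4.0, f_out=4.0):
    n, k = X.shape
    model = []
    for _ in range(20 * k):              # iteration cap guards against cycling
        changed = False
        # Entry: add the outside variable with the largest partial F, if > f_in.
        outside = [j for j in range(k) if j not in model]
        if outside:
            stats = []
            for j in outside:
                trial = model + [j]
                mse = _sse(X, y, trial) / (n - len(trial) - 1)
                stats.append(((_sse(X, y, model) - _sse(X, y, trial)) / mse, j))
            f, j = max(stats)
            if f > f_in:
                model.append(j)
                changed = True
        # Removal: drop the inside variable with the smallest partial F, if < f_out.
        if len(model) > 1:
            full = _sse(X, y, model)
            mse = full / (n - len(model) - 1)
            stats = [((_sse(X, y, [i for i in model if i != j]) - full) / mse, j)
                     for j in model]
            f, j = min(stats)
            if f < f_out:
                model.remove(j)
                changed = True
        if not changed:
            break
    return model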

Example 15-17
We will apply stepwise regression to the damaged peaches data in Table 15-15. Minitab® output is provided in Fig. 15-14. From this figure, we see that variables $x_1$, $x_2$, and $x_3$ are significant; this is because the last column contains entries for only $x_1$, $x_2$, and $x_3$. Figure 15-15 provides the SAS computer output that will support the computations to be calculated next. Instead of specifying numerical values of $F_{\text{in}}$ and $F_{\text{out}}$, we use an advertised type I error of $\alpha = 0.10$. The first step consists of building a simple linear regression model using the variable that gives the largest $F$ statistic. This is $x_2$, and since

$$ F_2 = \frac{SS_R(\beta_2 \mid \beta_0)}{MS_E(x_2)} = \frac{114.32885}{3.37540} = 33.87 > F_{\text{in}} = F_{0.10,1,18} = 3.01, $$

Alpha-to-Enter: 0.1    Alpha-to-Remove: 0.1

Response is y on 5 predictors, with N = 20

Step            1          2          3
Constant     -42.87     -33.83     -27.89

x2             49.1       34.9       30.7
T-Value        5.82       4.23       4.71
P-Value       0.000      0.001      0.000

x1                      0.0131     0.0136
T-Value                   3.14       4.19
P-Value                  0.006      0.001

x3                                 -0.067
T-Value                             -3.50
P-Value                             0.003

S              1.84       1.50       1.17
R-Sq          65.30      78.05      87.56
R-Sq(adj)     63.37      75.47      85.23
C-p            37.7       20.0        7.3

Figure 15-14 Minitab® output for stepwise regression in Example 15-17.

$x_2$ is entered into the model.

The second step begins by finding the variable $x_j$ that has the largest partial $F$ statistic, given that $x_2$ is in the model. This is $x_1$, and since

$$ F_1 = \frac{SS_R(\beta_1 \mid \beta_2, \beta_0)}{MS_E(x_1, x_2)} = \frac{22.32656}{2.26063} = 9.88 > F_{\text{in}} = F_{0.10,1,17} = 3.03, $$

$x_1$ is added to the model. Now the procedure evaluates whether $x_2$ should be retained, given that $x_1$ is in the model. This involves calculating

$$ F_2 = \frac{SS_R(\beta_2 \mid \beta_1, \beta_0)}{MS_E(x_1, x_2)} = \frac{40.44627}{2.26063} = 17.89 > F_{\text{out}} = F_{0.10,1,17} = 3.03. $$

Therefore, $x_2$ should be retained. Step 2 terminates with both $x_1$ and $x_2$ in the model.

The third step finds the next variable for entry as $x_3$. Since

$$ F_3 = \frac{SS_R(\beta_3 \mid \beta_1, \beta_2, \beta_0)}{MS_E(x_1, x_2, x_3)} = \frac{16.64910}{1.36135} = 12.23 > F_{\text{in}} = F_{0.10,1,16} = 3.05, $$

$x_3$ is added to the model. Partial $F$-tests on $x_2$ (given $x_1$ and $x_3$) and $x_1$ (given $x_2$ and $x_3$) indicate that these variables should be retained. Therefore, the third step concludes with the variables $x_1$, $x_2$, and $x_3$ in the model.

At the fourth step, neither of the remaining terms, $x_4$ or $x_5$, is significant enough to be included in the model. Therefore, the stepwise procedure terminates.

The stepwise regression procedure would conclude that the best model includes $x_1$, $x_2$, and $x_3$. The usual checks of model adequacy, such as residual analysis and $C_p$ plots, should be applied to the equation. These results are similar to those found by all possible regressions, with the exception that $x_4$ was also considered a possible significant variable with all possible regressions.

Forward Selection  This variable selection procedure is based on the principle that variables should be added to the model one at a time until no remaining candidate variables produce a significant increase in the regression sum of squares. That is, variables are added one at a time as long as $F > F_{\text{in}}$. Forward selection is a simplification of stepwise regression that omits the partial $F$-test for deleting variables from the model that have been added at previous steps. This is a potential weakness of forward selection; the procedure does not explore the effect that adding a variable at the current step has on variables added at earlier steps.

The REG Procedure

Dependent Variable: y

Forward Selection: Step 1   Variable x2 Entered: R-Square = 0.6530 and C(p) = 37.7047

Source            DF    Sum of Squares    Mean Square    F Value    Pr > F
Model              1         114.32885      114.32885      33.87    <.0001
Error             18          60.75725        3.37540
Corrected Total   19         175.08609

Variable     Parameter Estimate    Standard Error    Type II SS    F Value    Pr > F
Intercept            -42.87237           8.41429       87.62858      25.96    <.0001
x2                    49.08366           8.43377      114.32885      33.87    <.0001

Bounds on condition number: 1, 1
------------------------------------------------------------------------------
Forward Selection: Step 2   Variable x1 Entered: R-Square = 0.7805 and C(p) = 19.9697

Source            DF    Sum of Squares    Mean Square    F Value    Pr > F
Model              2         136.65541       68.32771      30.23    <.0001
Error             17          38.43068        2.26063
Corrected Total   19         175.08609

Variable     Parameter Estimate    Standard Error    Type II SS    F Value    Pr > F
Intercept            -33.83110           7.46286       46.45691      20.55    0.0003
x1                     0.01314           0.00418       22.32656       9.88    0.0059
x2                    34.88963           8.24844       40.44627      17.89    0.0006

Bounds on condition number: 1.4282, 5.7129
------------------------------------------------------------------------------
Forward Selection: Step 3   Variable x3 Entered: R-Square = 0.8756 and C(p) = 7.2532

Source            DF    Sum of Squares    Mean Square    F Value    Pr > F
Model              3         153.30451       51.10150      37.54    <.0001
Error             16          21.78159        1.36135
Corrected Total   19         175.08609

Variable     Parameter Estimate    Standard Error    Type II SS    F Value    Pr > F
Intercept            -27.89190           6.03518       29.07675      21.36    0.0003
x1                     0.01360           0.00325       23.87130      17.54    0.0007
x2                    30.68486           6.51286       30.21859      22.20    0.0002
x3                    -0.06701           0.01916       16.64910      12.23    0.0030

Bounds on condition number: 1.4786, 11.85

No other variable met the 0.1000 significance level for entry into the model.

Summary of Forward Selection

        Variable    Number     Partial     Model
Step    Entered     Vars In    R-Square    R-Square    C(p)       F Value    Pr > F
1       x2          1          0.6530      0.6530      37.7047      33.87    <.0001
2       x1          2          0.1275      0.7805      19.9697       9.88    0.0059
3       x3          3          0.0951      0.8756       7.2532      12.23    0.0030

Figure 15-16 SAS output for forward selection in Example 15-18.

Example 15-18
Application of the forward selection algorithm to the damaged peach data in Table 15-15 would begin by adding $x_2$ to the model. Then the variable that induces the largest partial $F$-test, given that $x_2$ is in the model, is added; this is variable $x_1$. The third step enters $x_3$, which produces the largest partial $F$ statistic given that $x_1$ and $x_2$ are in the model. Since the partial $F$ statistics for $x_4$ and $x_5$ are not significant, the procedure terminates. The SAS output for forward selection is given in Fig. 15-16. Note that forward selection leads to the same final model as stepwise regression. This is not always the case.

Backward Elimination  This algorithm begins with all $k$ candidate variables in the model. Then the variable with the smallest partial $F$ statistic is deleted if this $F$ statistic is insignificant, that is, if $F < F_{\text{out}}$. Next, the model with $k - 1$ variables is estimated, and the next variable for potential elimination is found. The algorithm terminates when no further variables can be deleted.
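The same partial-F machinery drives a backward pass. The following sketch is an illustration only (it assumes numpy and uses a fixed threshold f_out in place of tabulated F percentage points):

import numpy as np

def backward_elimination(X, y, f_out=4.0):
    def sse(cols):
        Xs = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
        beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        r = y - Xs @ beta
        return r @ r

    model = list(range(X.shape[1]))        # start with all k candidates
    while len(model) > 1:
        full = sse(model)
        mse = full / (len(y) - len(model) - 1)
        # Partial F statistic for each variable currently in the model.
        stats = [((sse([i for i in model if i != j]) - full) / mse, j)
                 for j in model]
        f, j = min(stats)
        if f >= f_out:
            break                          # every remaining variable is significant
        model.remove(j)                    # delete the weakest variable
    return model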

Example 15-19
To apply backward elimination to the data in Table 15-15, we begin by estimating the full model in all five variables. This model is

$$ \hat{y} = -20.89732 + 0.01102x_1 + 27.37046x_2 - 0.06929x_3 - 0.25695x_4 + 0.01688x_5. $$

The SAS computer output is given in Fig. 15-17. The partial $F$-tests for each variable are as follows:

$$ F_1 = \frac{SS_R(\beta_1 \mid \beta_2, \beta_3, \beta_4, \beta_5, \beta_0)}{MS_E} = \frac{13.65360}{1.13132} = 12.07, $$

$$ F_2 = \frac{SS_R(\beta_2 \mid \beta_1, \beta_3, \beta_4, \beta_5, \beta_0)}{MS_E} = \frac{21.73153}{1.13132} = 19.21, $$

$$ F_3 = \frac{SS_R(\beta_3 \mid \beta_1, \beta_2, \beta_4, \beta_5, \beta_0)}{MS_E} = \frac{17.37834}{1.13132} = 15.36, $$

$$ F_4 = \frac{SS_R(\beta_4 \mid \beta_1, \beta_2, \beta_3, \beta_5, \beta_0)}{MS_E} = \frac{5.25602}{1.13132} = 4.65, $$

$$ F_5 = \frac{SS_R(\beta_5 \mid \beta_1, \beta_2, \beta_3, \beta_4, \beta_0)}{MS_E} = \frac{2.45862}{1.13132} = 2.17. $$

The variable $x_5$ has the smallest $F$ statistic, $F_5 = 2.17 < F_{\text{out}} = F_{0.10,1,14} = 3.10$; therefore, $x_5$ is removed from the model at step 1. The model is now fit with only the four remaining variables. At step 2, the $F$ statistic for $x_4$ ($F_4 = 2.86$) is less than $F_{\text{out}} = F_{0.10,1,15} = 3.07$; therefore, $x_4$ is removed from the model. No remaining variables have $F$ statistics less than the appropriate $F_{\text{out}}$ values, and the procedure is terminated. The three-variable model ($x_1$, $x_2$, $x_3$) has all variables significant according to the partial $F$-test criterion. Note that backward elimination has resulted in the same model that was found by forward selection and stepwise regression. This may not always happen.

Some Comments on Final Model Selection  We have illustrated several different approaches to the selection of variables in multiple linear regression. The final model obtained from any model-building procedure should be subjected to the usual adequacy checks, such as residual analysis and examination of the effects of outlying points. The analyst may also consider augmenting the original set of candidate variables with cross products, polynomial terms, or other transformations of the original variables that might improve the model.

The REG Procedure

Dependent Variable: y

Backward Elimination: Step 0   All Variables Entered: R-Square = 0.9095 and C(p) = 6.0000

Source            DF    Sum of Squares    Mean Square    F Value    Pr > F
Model              5         159.24760       31.84952      28.15    <.0001
Error             14          15.83850        1.13132
Corrected Total   19         175.08609

Variable     Parameter Estimate    Standard Error    Type II SS    F Value    Pr > F
Intercept            -20.89732           7.16035        9.63604       8.52    0.0112
x1                     0.01102           0.00317       13.65360      12.07    0.0037
x2                    27.37046           6.24496       21.73153      19.21    0.0006
x3                    -0.06929           0.01768       17.37834      15.36    0.0015
x4                    -0.25695           0.11921        5.25602       4.65    0.0490
x5                     0.01688           0.01132        2.45862       2.17    0.1626

Bounds on condition number: 1.6438, 35.628
------------------------------------------------------------------------------
Backward Elimination: Step 1   Variable x5 Removed: R-Square = 0.8955 and C(p) = 6.1732

Source            DF    Sum of Squares    Mean Square    F Value    Pr > F
Model              4         156.78898       39.19724      32.13    <.0001
Error             15          18.29712        1.21981
Corrected Total   19         175.08609

Variable     Parameter Estimate    Standard Error    Type II SS    F Value    Pr > F
Intercept            -19.93175           7.40393        8.84010       7.25    0.0167
x1                     0.01233           0.00316       18.51245      15.18    0.0014
x2                    27.28797           6.48433       21.60246      17.71    0.0008
x3                    -0.06549           0.01816       15.86299      13.00    0.0026
x4                    -0.19641           0.11621        3.48447       2.86    0.1117

Bounds on condition number: 1.6358, 22.31
------------------------------------------------------------------------------
Backward Elimination: Step 2   Variable x4 Removed: R-Square = 0.8756 and C(p) = 7.2532

Source            DF    Sum of Squares    Mean Square    F Value    Pr > F
Model              3         153.30451       51.10150      37.54    <.0001
Error             16          21.78159        1.36135
Corrected Total   19         175.08609

Variable     Parameter Estimate    Standard Error    Type II SS    F Value    Pr > F
Intercept            -27.89190           6.03518       29.07675      21.36    0.0003
x1                     0.01360           0.00325       23.87130      17.54    0.0007
x2                    30.68486           6.51286       30.21859      22.20    0.0002
x3                    -0.06701           0.01916       16.64910      12.23    0.0030

Bounds on condition number: 1.4786, 11.85

All variables left in the model are significant at the 0.1000 level.

Summary of Backward Elimination

        Variable    Number     Partial     Model
Step    Removed     Vars In    R-Square    R-Square    C(p)      F Value    Pr > F
1       x5          4          0.0140      0.8955      6.1732       2.17    0.1626
2       x4          3          0.0199      0.8756      7.2532       2.86    0.1117

Figure 15-17 SAS output for backward elimination in Example 15-19.

The REG Procedure

Dependent Variable: y

Stepwise Selection: Step 1   Variable x2 Entered: R-Square = 0.6530 and C(p) = 37.7047

Source            DF    Sum of Squares    Mean Square    F Value    Pr > F
Model              1         114.32885      114.32885      33.87    <.0001
Error             18          60.75725        3.37540
Corrected Total   19         175.08609

Variable     Parameter Estimate    Standard Error    Type II SS    F Value    Pr > F
Intercept            -42.87237           8.41429       87.62858      25.96    <.0001
x2                    49.08366           8.43377      114.32885      33.87    <.0001

Bounds on condition number: 1, 1
------------------------------------------------------------------------------
Stepwise Selection: Step 2   Variable x1 Entered: R-Square = 0.7805 and C(p) = 19.9697

Source            DF    Sum of Squares    Mean Square    F Value    Pr > F
Model              2         136.65541       68.32771      30.23    <.0001
Error             17          38.43068        2.26063
Corrected Total   19         175.08609

Variable     Parameter Estimate    Standard Error    Type II SS    F Value    Pr > F
Intercept            -33.83110           7.46286       46.45691      20.55    0.0003
x1                     0.01314           0.00418       22.32656       9.88    0.0059
x2                    34.88963           8.24844       40.44627      17.89    0.0006

Bounds on condition number: 1.4282, 5.7129
------------------------------------------------------------------------------
Stepwise Selection: Step 3   Variable x3 Entered: R-Square = 0.8756 and C(p) = 7.2532

Source            DF    Sum of Squares    Mean Square    F Value    Pr > F
Model              3         153.30451       51.10150      37.54    <.0001
Error             16          21.78159        1.36135
Corrected Total   19         175.08609

Variable     Parameter Estimate    Standard Error    Type II SS    F Value    Pr > F
Intercept            -27.89190           6.03518       29.07675      21.36    0.0003
x1                     0.01360           0.00325       23.87130      17.54    0.0007
x2                    30.68486           6.51286       30.21859      22.20    0.0002
x3                    -0.06701           0.01916       16.64910      12.23    0.0030

Bounds on condition number: 1.4786, 11.85

All variables left in the model are significant at the 0.1000 level.
No other variable met the 0.1000 significance level for entry into the model.

Summary of Stepwise Selection

        Variable    Variable    Number     Partial     Model
Step    Entered     Removed     Vars In    R-Square    R-Square    C(p)       F Value    Pr > F
1       x2                      1          0.6530      0.6530      37.7047      33.87    <.0001
2       x1                      2          0.1275      0.7805      19.9697       9.88    0.0059
3       x3                      3          0.0951      0.8756       7.2532      12.23    0.0030

Figure 15-15 SAS output for stepwise regression in Example 15-17.

A major criticism of variable selection methods, such as stepwise regression, is that the analyst may conclude that there is one "best" regression equation. This generally is not the case, because there are often several equally good regression models that can be used. One way to avoid this problem is to use several different model-building techniques and see if different models result. For example, we have found the same model for the damaged peach data by using stepwise regression, forward selection, and backward elimination. This is a good indication that the three-variable model is the best regression equation. Furthermore, there are variable selection techniques that are designed to find the best one-variable model, the best two-variable model, and so forth. For a discussion of these methods, and the variable selection problem in general, see Montgomery, Peck, and Vining (2001).

If the number of candidate regressors is not too large, the all-possible-regressions method is recommended. It is not distorted by multicollinearity among the regressors, as stepwise-type methods are.

15-12 SUMMARY

This chapter has introduced multiple linear regression, including least-squares estimation of the parameters, interval estimation, prediction of new observations, and methods for hypothesis testing. Various tests of model adequacy, including residual plots, have been discussed. It was shown that polynomial regression models can be handled by the usual multiple linear regression methods. Indicator variables were introduced for dealing with qualitative variables. It also was observed that the problem of multicollinearity, or intercorrelation between the regressor variables, can greatly complicate the regression problem and often leads to a regression model that may not predict new observations well. Several causes and remedial measures of this problem, including biased estimation techniques, were discussed. Finally, the variable selection problem in multiple regression was introduced. A number of model-building procedures, including all possible regressions, stepwise regression, forward selection, and backward elimination, were illustrated.

15·13 EXERCISES
lS~l, Consider th.e damaged peach data in Table lS~4. Using the results of Exercise 15~2, find a 95%
15-15, confidence interval on j3~.
(a) Fit a regression model using XI (drop height) and 15~5. The data in the table at the top of page 487 are
x.;. (fruit pulp thickness) to these data. the 1976 team performance statistics for the teams in
(h) Test for significance of regression. the National Football League (Source: The Sparring
(c) Compute the residuals from this modeL Analyze ly'ews),
these residuals using the methods discussed in (a) Fit a multiple regression :nodel relating the num-
this chapter. ber of games won to the teams' passing yardage
Cd) How does this twO-\Mable r:1ode~ compare Veith {x..), the percentage of rushing plays (x,), and the
the two-variable model using x: and .:tz from opponentS' yards rosblng (x,),
Example 15- j ? ' (b) Construct the appropriate residual plots a:.d com-
15-2. Consider the damaged pea.en data in Table ment on model adequacy,
15-15, (e) Test the significance of each variable to the
(a) Fit a regression ::nodel using XI {.:top height),.:tz model, u:sing either the Nest Or the p;trtial F~test
(fruit density), and ~ (fruit height at impact point) 1S-6. The table at the top of page 488 presents gaso-
to cese data. line rr.ileage perforrr,.a.::ce for 25 at.tomobiles (Source:
(b) Test for sigrlficance of regression. Motor Trend, 1975),
(c) Compute the residcals from this model. Analyze (a) Fit a multiple regression :::::lodel relating gasoline
these residuals using the methods discussed in mileage to engine displacement (Xl) and number
"'Us chapter. of carburetor ba:c-els (xJ..
15-3. Using the results of Exercise 15-1, find a 95% (b) Analyze the residuals and comment on mood
confidenee interval on A. adequacy.

National Football League 1976 Team Performance

Team              y    x1    x2    x3    x4    x5    x6    x7    x8    x9
Washington       10  2113  1985  38.9  64.7   +4   868  59.7  2205  1917
Minnesota        11  2003  2855  38.8  61.3   +3   615  55.0  2096  1575
New England      11  2957  1737  40.1  60.0  +14   914  65.6  1847  2175
Oakland          13  2285  2905  41.6  45.3   -4   957  61.4  1903  2476
Pittsburgh       10  2971  1666  39.2  53.8  +15   836  66.1  1457  1866
Baltimore        11  2309  2927  39.7  74.1   +8   786  61.0  1848  2339
Los Angeles      10  2528  2341  38.1  65.4  +12   754  66.1  1564  2092
Dallas           11  2147  2737  37.0  78.3   -1   761  58.0  1821  1909
Atlanta           4  1689  1414  42.1  47.6   -3   714  57.0  2577  2001
Buffalo           2  2566  1838  42.3  54.2   -1   797  58.9  2476  2254
Chicago           7  2363  1480  37.3  48.0  +19   984  67.5  1984  2217
Cincinnati       10  2109  2191  39.5  51.9   +6   700  57.2  1917  1758
Cleveland         9  2295  2229  37.4  53.6   -5  1037  58.8  1761  2032
Denver            9  1932  2204  35.1  71.4   +3   986  58.6  1709  2025
Detroit           6  2213  2140  38.8  58.3   +6   819  59.2  1901  1686
Green Bay         5  1722  1730  36.6  52.6  -19   791  54.4  2288  1835
Houston           5  1498  2072  35.3  59.3   -5   776  49.6  2072  1914
Kansas City       5  1873  2929  41.1  55.3  +10   789  54.3  2861  2496
Miami             6  2118  2268  38.2  69.6   +6   582  58.7  2411  2670
New Orleans       4  1775  1983  39.3  78.3   +7   901  51.7  2289  2202
New York Giants   3  1904  1792  39.7  38.1   -9   734  61.9  2203  1988
New York Jets     3  1929  1606  39.7  68.8  -21   627  52.7  2592  2324
Philadelphia      4  2080  1492  35.5  68.8   -8   722  57.8  2053  2550
St. Louis        10  2301  2835  35.3  74.1   +2   683  59.7  1979  2110
San Diego         6  2040  2416  38.7  50.0    0   576  54.9  2048  2628
San Francisco     8  2447  1638  39.9  57.1  -11   848  65.3  1786  1776
Seattle           2  1416  2649  37.4  56.3  -22   684  43.8  2876  2524
Tampa Bay         0  1503  1503  39.3  47.0   -9   875  53.5  2560  2241

y:  Games won (per 14-game season).
x1: Rushing yards (season).
x2: Passing yards (season).
x3: Punting average (yds/punt).
x4: Field goal percentage (fgs made/fgs attempted).
x5: Turnover differential (turnovers acquired - turnovers lost).
x6: Penalty yards (season).
x7: Percent rushing (rushing plays/total plays).
x8: Opponents' rushing yards (season).
x9: Opponents' passing yards (season).

(c) What is the value of adding x6 to a model that already contains x1?

15-7. The electric power consumed each month by a chemical plant is thought to be related to the average ambient temperature (x1), the number of days in the month (x2), the average product purity (x3), and the tons of product produced (x4). The past year's historical data are available and are presented in the table below.

Automobile        y      x1     x2   x3    x4      x5      x6  x7   x8     x9    x10   x11
Apollo           18.90   350    165  260   8.0:1   2.56:1   4   3  200.3  69.9  3910   A
Nova             20.00   250    105  185   8.25:1  2.73:1   1   3  196.7  72.2  3510   A
Monarch          18.25   351    143  255   8.0:1   3.00:1   2   3  199.9  74.0  3890   A
Duster           20.07   225     95  170   8.4:1   2.76:1   1   3  194.1  71.8  3365   M
Jenson Conv.     11.20   440    215  330   8.2:1   2.88:1   4   3  184.5  69.0  4215   A
Skyhawk          22.12   231    110  175   8.0:1   2.56:1   2   3  179.3  65.4  3020   A
Scirocco         34.70    89.7   70   81   8.2:1   3.90:1   2   4  155.7  64.0  1905   M
Corolla SR-5     30.40    96.9   75   83   9.0:1   4.30:1   2   5  165.2  65.0  2320   M
Camaro           16.50   350    155  250   8.5:1   3.08:1   4   3  195.4  74.4  3885   A
Datsun B210      36.50    85.3   80   83   8.5:1   3.89:1   2   4  160.6  62.2  2009   M
Capri II         21.50   171    109  146   8.2:1   3.22:1   2   4  170.4  66.9  2655   M
Pacer            19.70   258    110  195   8.0:1   3.08:1   1   3  171.5  77.0  3375   A
Granada          17.80   302    129  220   8.0:1   3.00:1   2   3  199.9  74.0  3890   A
Eldorado         14.39   500    190  360   8.5:1   2.73:1   4   3  224.1  79.8  5290   A
Imperial         14.89   440    215  330   8.2:1   2.71:1   4   3  231.0  79.7  5185   A
Nova LN          17.80   350    155  250   8.5:1   3.08:1   4   3  196.7  72.2  3910   A
Starfire         23.54   231    110  175   8.0:1   2.56:1   2   3  179.3  65.4  3050   A
Cordoba          21.47   360    180  290   8.4:1   2.45:1   2   3  214.2  76.3  4250   A
Trans Am         16.59   400    185   NA   7.6:1   3.08:1   4   3  196.0  73.0  3850   A
Corolla E-5      31.90    96.9   75   83   9.0:1   4.30:1   2   5  165.2  61.8  2275   M
Mark IV          13.27   460    223  366   8.0:1   3.00:1   4   3  228.0  79.8  5430   A
Celica GT        23.90   133.6   96  120   8.4:1   3.91:1   2   5  171.5  63.4  2535   M
Charger SE       19.73   318    140  255   8.5:1   2.71:1   2   3  215.3  76.3  4370   A
Cougar           13.90   351    148  243   8.0:1   3.25:1   2   3  215.5  78.5  4540   A
Corvette         16.50   350    165  255   8.5:1   2.73:1   4   3  185.2  69.0  3660   A

y:   Miles/gallon.
x1:  Displacement (cubic in.).
x2:  Horsepower (ft-lb).
x3:  Torque (ft-lb).
x4:  Compression ratio.
x5:  Rear axle ratio.
x6:  Carburetor (barrels).
x7:  No. of transmission speeds.
x8:  Overall length (in.).
x9:  Width (in.).
x10: Weight (lbs).
x11: Type of transmission (A = automatic, M = manual).

  y   x1  x2  x3   x4        y   x1  x2  x3   x4
240   25  24  91  100      300   80  25  87   97
236   31  21  90   95      296   84  25  86   96
290   45  24  88  110      267   75  24  88  110
274   60  25  87   88      276   60  25  91  105
301   65  25  91   94      288   50  25  90  100
316   72  26  94   99      261   38  23  89   98
(a) Fit a multiple regression model to these data.
(b) Test for significance of regression.
(c) Use partial F statistics to test H0: β3 = 0 and H0: β4 = 0.
(d) Compute the residuals from this model. Analyze the residuals using the methods discussed in this chapter.

15-8. Hald (1952) reports data on the heat evolved in calories per gram of cement (y) for various amounts of four ingredients (x1, x2, x3, x4):

Observation
Number        y     x1   x2   x3   x4
  1         78.5     7   26    6   60
  2         74.3     1   29   15   52
  3        104.3    11   56    8   20
  4         87.6    11   31    8   47
  5         95.9     7   52    6   33
  6        109.2    11   55    9   22
  7        102.7     3   71   17    6
  8         72.5     1   31   22   44
  9         93.1     2   54   18   22
 10        115.9    21   47    4   26
 11         83.8     1   40   23   34
 12        113.3    11   66    9   12
 13        109.4    10   68    8   12

(a) Fit a multiple regression model to these data.
(b) Test for significance of regression.
(c) Test the hypothesis β4 = 0 using the partial F-test.
(d) Compute the t statistics for each independent variable. What conclusions can you draw?
(e) Test the hypothesis β2 = β3 = β4 = 0 using the partial F-test.
(f) Construct a 95% confidence interval estimate for β1.

15-9. An article entitled "A Method for Improving the Accuracy of Polynomial Regression Analysis" in the Journal of Quality Technology (1971, p. 149) reported the following data on y = ultimate shear strength of a rubber compound (psi) and x = cure temperature (°F):

y   770  800  840  810  735  640  590  560
x   280  284  292  295  298  305  308  315

(a) Fit a second-order polynomial to this data.
(b) Test for significance of regression.
(c) Test the hypothesis that β11 = 0.
(d) Compute the residuals and test for model adequacy.

15-10. Consider the following data, which result from an experiment to determine the effect of x = test time in hours at a particular temperature on y = change in oil viscosity:

y   -4.42  -1.39  -1.55  -1.89  -2.43  -3.15  -4.05  -5.15  -6.43  -7.89
x    0.25   0.50   0.75   1.00   1.25   1.50   1.75   2.00   2.25   2.50

(a) Fit a second-order polynomial to the data.
(b) Test for significance of regression.
(c) Test the hypothesis that β11 = 0.
(d) Compute the residuals and check for model adequacy.

15-11. For many polynomial regression models we subtract x̄ from each x value to produce a "centered" regressor x' = x − x̄. Using the data from Exercise 15-9, fit the model y = β0' + β1'x' + β11'(x')² + ε. Use the results to estimate the coefficients in the uncentered model y = β0 + β1x + β11x² + ε.

15-12. Suppose that we use a standardized variable x' = (x − x̄)/sx, where sx is the standard deviation of x, in constructing a polynomial regression model. Using the data in Exercise 15-9 and the standardized-variable approach, fit the model y = β0' + β1'x' + β11'(x')² + ε.
(a) What value of y do you predict when x = 285°F?
(b) Estimate the regression coefficients in the unstandardized model y = β0 + β1x + β11x² + ε.
(c) What can you say about the relationship between SSE and R² for the standardized and unstandardized models?
(d) Suppose that y' = (y − ȳ)/sy is used in the model along with x'. Fit the model and comment on the relationship between SSE and R² in the standardized model and the unstandardized model.

15-13. The data shown below were collected during an experiment to determine the change in thrust efficiency (%) (y) as the divergence angle of a rocket nozzle (x) changes.
(a) Fit a second-order model to the data.
(b) Test for significance of regression and lack of fit.
(c) Test the hypothesis that β11 = 0.

x   4.0  5.0  5.0  6.5  6.5  6.75  7.0  7.3

15-14. Discuss the hazards inherent in fitting polynomial models.

15-15. Consider the data in Example 15-12. Test the hypothesis that two different regression models (with different slopes and intercepts) are required to adequately model the data.

15-16. Piecewise Linear Regression (I). Suppose that y is piecewise linearly related to x. That is, different linear relationships are appropriate over the intervals -∞ < x ≤ x' and x' < x < ∞. Show how indicator variables can be used to fit such a piecewise linear regression model, assuming that the point x' is known.

15-17. Piecewise Linear Regression (II). Consider the piecewise linear regression model described in Exercise 15-16. Suppose that at point x' a discontinuity occurs in the regression function. Show how indicator variables can be used to incorporate the discontinuity into the model.

15-18. Piecewise Linear Regression (III). Consider the piecewise linear regression model described in Exercise 15-16. Suppose that the point x' is not known with certainty and must be estimated. Develop an approach that could be used to fit the piecewise linear regression model.

15-19. Calculate the standardized regression coefficients for the regression model developed in Exercise 15-1.

15-20. Calculate the standardized regression coefficients for the regression model developed in Exercise 15-2.

15-21. Find the variance inflation factors for the regression model developed in Example 15-1. Do they indicate that multicollinearity is a problem in this model?

15-22. Use the National Football League Team Performance data in Exercise 15-5 to build regression models using the following techniques:
(a) All possible regressions.
(b) Stepwise regression.
(c) Forward selection.
(d) Backward elimination.
(e) Comment on the various models obtained.

15-23. Use the gasoline mileage data in Exercise 15-6 to build regression models using the following techniques:
(a) All possible regressions.
(b) Stepwise regression.
(c) Forward selection.
(d) Backward elimination.
(e) Comment on the various models obtained.

15-24. Consider the Hald cement data in Exercise 15-8. Build regression models for the data using the following techniques:
(a) All possible regressions.
(b) Stepwise regression.
(c) Forward selection.
(d) Backward elimination.

15-25. Consider the Hald cement data in Exercise 15-8. Fit a regression model involving all four regressors and find the variance inflation factors. Is multicollinearity a problem in this model? Use ridge regression to estimate the coefficients in this model. Compare the ridge model to the models obtained in Exercise 15-24 using variable selection methods.
Chapter 16

Nonparametric Statistics
16-1 INTRODUCTION

Most of the hypothesis-testing and confidence interval procedures in previous chapters are based on the assumption that we are working with random samples from normal populations. Fortunately, most of these procedures are relatively insensitive to slight departures from normality. In general, the t- and F-tests and t confidence intervals will have actual levels of significance or confidence levels that differ from the nominal or advertised levels chosen by the experimenter, although the difference between the actual and advertised levels is usually fairly small when the underlying population is not too different from the normal distribution. Traditionally, we have called these procedures parametric methods because they are based on a particular parametric family of distributions, in this case the normal. Alternatively, sometimes we say that these procedures are not distribution free because they depend on the assumption of normality.

In this chapter we describe procedures called nonparametric or distribution-free methods, which usually make no assumptions about the distribution of the underlying population other than that it is continuous. These procedures have actual levels of significance α or confidence levels 100(1 − α)% for many different types of distributions. These procedures also have considerable appeal. One of their advantages is that the data need not be quantitative; they could be categorical (such as yes or no, defective or nondefective, etc.) or rank data. Another advantage is that nonparametric procedures are usually very quick and easy to perform.

The procedures described in this chapter are competitors of the parametric t- and F-procedures described earlier. Consequently, it is important to compare the performance of both parametric and nonparametric methods under the assumptions of both normal and nonnormal populations. In general, nonparametric procedures do not utilize all the information provided by the sample, and as a result a nonparametric procedure will be less efficient than the corresponding parametric procedure when the underlying population is normal. This loss of efficiency usually is reflected by a requirement for a larger sample size for the nonparametric procedure than would be required by the parametric procedure in order to achieve the same probability of type II error. On the other hand, this loss of efficiency is usually not large, and often the difference in sample size is very small. When the underlying distributions are not normal, nonparametric methods have much to offer. They often provide considerable improvement over the normal-theory parametric methods.

16-2 THE SIGN TEST

16-2.1 A Description of the Sign Test

The sign test is used to test hypotheses about the median μ̃ of a continuous distribution. Recall that the median of a distribution is a value of the random variable such that the probability is 0.5 that an observed value of X is less than or equal to the median, and the probability is 0.5 that an observed value of X is greater than or equal to the median. That is, P(X ≤ μ̃) = P(X ≥ μ̃) = 0.5.
Since the normal distribution is symmetric, the mean of a normal distribution equals the median. Therefore the sign test can be used to test hypotheses about the mean of a normal distribution. This is the same problem for which we used the t-test in Chapter 11. We will discuss the relative merits of the two procedures in Section 16-2.4. Note that while the t-test was designed for samples from a normal distribution, the sign test is appropriate for samples from any continuous distribution. Thus, the sign test is a nonparametric procedure.

Suppose that the hypotheses are

H0: μ̃ = μ̃0,
H1: μ̃ ≠ μ̃0.     (16-1)

The test procedure is as follows. Suppose that X1, X2, ..., Xn is a random sample of n observations from the population of interest. Form the differences (Xi − μ̃0), i = 1, 2, ..., n. Now if H0: μ̃ = μ̃0 is true, any difference Xi − μ̃0 is equally likely to be positive or negative. Therefore let R+ denote the number of these differences (Xi − μ̃0) that are positive and let R− denote the number that are negative, where R = min(R+, R−).

When the null hypothesis is true, R has a binomial distribution with parameters n and p = 0.5. Therefore, we would find a critical value, say R*α, from the binomial distribution that ensures that P(type I error) = P(reject H0 when H0 is true) = α. A table of these critical values R*α is given in the Appendix, Table X. If the test statistic R ≤ R*α, then the null hypothesis H0: μ̃ = μ̃0 should be rejected.

Example 16-1
Montgomery, Peck, and Vining (2001) report on a study in which a rocket motor is formed by binding an igniter propellant and a sustainer propellant together inside a metal housing. The shear strength of the bond between the two propellant types is an important characteristic. Results of testing 20 randomly selected motors are shown in Table 16-1. We would like to test the hypothesis that the median shear strength is 2000 psi.

The formal statement of the hypotheses of interest is

H0: μ̃ = 2000,
H1: μ̃ ≠ 2000.

The last two columns of Table 16-1 show the differences (Xi − 2000) for i = 1, 2, ..., 20 and the corresponding signs. Note that R+ = 14 and R− = 6. Therefore R = min(R+, R−) = min(14, 6) = 6. From the Appendix, Table X, with n = 20, we find that the critical value for α = 0.05 is R*0.05 = 5. Therefore, since R = 6 is not less than or equal to the critical value R*0.05 = 5, we cannot reject the null hypothesis that the median shear strength is 2000 psi.

We note that since R is a binomial random variable, we could test the hypothesis of interest by directly calculating a P-value from the binomial distribution. When H0: μ̃ = 2000 is true, R has a binomial distribution with parameters n = 20 and p = 0.5. Thus the probability of observing six or fewer negative signs in a sample of 20 observations is

P(R ≤ 6) = Σ_{r=0}^{6} (20 choose r)(0.5)^r (0.5)^{20−r} = 0.058.

Since the P-value is not less than the desired level of significance, we cannot reject the null hypothesis of μ̃ = 2000 psi.
Table 16-1 Propellant Shear Strength Data

Observation (i)   Shear Strength (Xi)   Difference (Xi − 2000)   Sign
  1                    2158.70                +158.70             +
  2                    1678.15                −321.85             −
  3                    2316.00                +316.00             +
  4                    2061.30                 +61.30             +
  5                    2207.50                +207.50             +
  6                    1708.30                −291.70             −
  7                    1784.70                −215.30             −
  8                    2575.00                +575.00             +
  9                    2357.90                +357.90             +
 10                    2256.70                +256.70             +
 11                    2165.20                +165.20             +
 12                    2399.55                +399.55             +
 13                    1779.80                −220.20             −
 14                    2336.75                +336.75             +
 15                    1765.30                −234.70             −
 16                    2053.50                 +53.50             +
 17                    2414.40                +414.40             +
 18                    2200.50                +200.50             +
 19                    2654.20                +654.20             +
 20                    1753.70                −246.30             −

Exact Significance Levels   When a test statistic has a discrete distribution, such as R does in the sign test, it may be impossible to choose a critical value R*α that has a level of significance exactly equal to α. The usual approach is to choose R*α to yield an α as close to the advertised level as possible.

Ties in the Sign Test   Since the underlying population is assumed to be continuous, it is theoretically impossible to find a tie, that is, a value of Xi exactly equal to μ̃0. However, ties may sometimes happen in practice because of the way the data are collected. When ties occur, they should be set aside and the sign test applied to the remaining data.

One-Sided Alternative Hypotheses   We can also use the sign test when a one-sided alternative hypothesis is appropriate. If the alternative is H1: μ̃ > μ̃0, then reject H0: μ̃ = μ̃0 if R− < R*α; if the alternative is H1: μ̃ < μ̃0, then reject H0: μ̃ = μ̃0 if R+ < R*α. The level of significance of a one-sided test is one-half the value shown in the Appendix, Table X. It is also possible to calculate a P-value from the binomial distribution for the one-sided case.

The Normal Approximation   When p = 0.5, the binomial distribution is well approximated by a normal distribution when n is at least 10. Thus, since the mean of the binomial is np and the variance is np(1 − p), the distribution of R is approximately normal, with mean 0.5n and variance 0.25n, whenever n is moderately large. Therefore, in these cases the null hypothesis can be tested with the statistic

Z0 = (R − 0.5n) / (0.5√n).     (16-2)

The two-sided alternative would be rejected if |Z0| > Zα/2, and the critical regions of the one-sided alternatives would be chosen to reflect the sense of the alternative (if the alternative is H1: μ̃ > μ̃0, reject H0 if Z0 > Zα, for example).
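Because R has an exact binomial null distribution, the whole calculation takes only a few lines. The following is a brief Python sketch (not part of the original text) that reproduces the Example 16-1 numbers and the approximation of equation 16-2; scipy's binom and norm are used only to evaluate the probabilities.

from scipy.stats import binom, norm

n, r = 20, 6                            # sample size and R = min(R+, R-)
p_exact = binom.cdf(r, n, 0.5)          # P(R <= 6) = 0.058, as in Example 16-1
z0 = (r - 0.5 * n) / (0.5 * n ** 0.5)   # equation 16-2
p_approx = norm.cdf(z0)                 # lower-tail normal approximation of P(R <= 6)
print(p_exact, z0, p_approx)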

16-2.2 The Sign Test for Paired Samples

The sign test can also be applied to paired observations drawn from continuous populations. Let (X1j, X2j), j = 1, 2, ..., n, be a collection of paired observations from two continuous populations, and let

Dj = X1j − X2j,   j = 1, 2, ..., n,

be the paired differences. We wish to test the hypothesis that the two populations have a common median, that is, that μ̃1 = μ̃2. This is equivalent to testing that the median of the differences μ̃D = 0. This can be done by applying the sign test to the n differences Dj, as illustrated in the following example.

Example 16-2
An automotive engineer is investigating two different types of metering devices for an electronic fuel injection system to determine whether they differ in their fuel-mileage performance. The system is installed on 12 different cars, and a test is run with each metering system on each car. The observed fuel mileage performance data, corresponding differences, and their signs are shown in Table 16-2. Note that R+ = 8 and R− = 4. Therefore R = min(R+, R−) = min(8, 4) = 4. From the Appendix, Table X, with n = 12, we find the critical value for α = 0.05 is R*0.05 = 2. Since R is not less than the critical value R*0.05, we cannot reject the null hypothesis that the two metering devices produce the same fuel mileage performance.
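A two-line numerical check (not from the text): since R is binomial(12, 0.5) under H0, the exact two-sided P-value for R = 4 follows directly from the binomial CDF.

from scipy.stats import binom

n, r = 12, 4                          # pairs and R = min(R+, R-)
p_value = 2 * binom.cdf(r, n, 0.5)    # two-sided, approx. 0.388
print(p_value)                        # well above 0.05: do not reject H0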

16-2.3 Type II Error (β) for the Sign Test

The sign test will control the probability of type I error at an advertised level α for testing the null hypothesis H0: μ̃ = μ̃0 for any continuous distribution. As with any hypothesis-testing procedure, it is important to investigate the type II error, β. The test should be able to effectively detect departures from the null hypothesis, and a good measure of this effectiveness is the value of β for departures that are important. A small value of β implies an effective test procedure.
to effectively detect departures from the null hypothesis, and a good measure of this effec,.

Tablel6-Z Pcrfurr:.ance of Flow Metering Devices


Metering Device
Car 2 Difference,
17.6 16.8 0.8 +
2 19.4 20.0 -0.6
3 19.5 18.2 1.3 +
4 17.1 16.4 0.7 +
5 15.3 16.0 -0.7
6 15.9 15,~ 0.5 +
7 16.3 16,5 -0.2
8 18.4 18.0 0.4 ~

9 17.3 16.4 0.9 +


10 19.1 20,1 -l.0
1l 17.8 16.7 J.l +
12 18.2 17.9 0.3 ~

In determining β, it is important to realize that not only must a particular value of μ̃, say μ̃0 + Δ, be used; the form of the underlying distribution will also affect the calculations. To illustrate, suppose that the underlying distribution is normal with σ = 1 and we are testing the hypothesis that μ̃ = 2 (since μ̃ = μ in the normal distribution, this is equivalent to testing that the mean equals 2). It is important to detect a departure from μ̃ = 2 to μ̃ = 3. The situation is illustrated graphically in Fig. 16-1a. When the alternative hypothesis is true (H1: μ̃ = 3), the probability that the random variable X exceeds the value 2 is

p = P(X > 2) = P(Z > −1) = 1 − Φ(−1) = 0.8413.

Suppose we have taken a sample of size 12. At the α = 0.05 level, Appendix Table X indicates that we would reject H0: μ̃ = 2 if R ≤ R*0.05 = 2. Therefore, the β error is the probability that we do not reject H0: μ̃ = 2 when in fact μ̃ = 3, or

β = 1 − Σ_{x=0}^{2} (12 choose x)(0.1587)^x (0.8413)^{12−x} = 0.2944.

If the distribution of X had been exponential rather than normal, then the situation would be as shown in Fig. 16-1b, and the probability that the random variable X exceeds the value x = 2 when μ̃ = 3 (note that when the median of an exponential distribution is 3, the mean is 4.33) is

p = P(X > 2) = ∫_2^∞ (1/4.33) e^{−x/4.33} dx = e^{−2/4.33} = 0.6301.

Figure 16-1 Calculation of β for the sign test: (a) normal distributions, (b) exponential distributions.

The β error in this case is

β = 1 − Σ_{x=0}^{2} (12 choose x)(0.3699)^x (0.6301)^{12−x} = 0.8794.

Thus, the β error for the sign test depends not only on the alternative value of μ̃ but also on the area to the right of the value specified in the null hypothesis under the population probability distribution. This area is highly dependent on the shape of that particular probability distribution.
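These two β calculations are easy to verify numerically. The sketch below (not from the text) recomputes both values with scipy; the only inputs are the alternative distribution and the critical value R*0.05 = 2 from the discussion above.

from math import log
from scipy.stats import binom, norm, expon

# Normal alternative X ~ N(3, 1): p = P(X > 2) = 0.8413
p_normal = norm.sf(2, loc=3, scale=1)
# R counts negative signs, each occurring with probability 1 - p,
# so beta = 1 - P(R <= 2) with R ~ binomial(12, 1 - p)
beta_normal = 1 - binom.cdf(2, 12, 1 - p_normal)   # approx. 0.2944

# Exponential alternative with median 3: mean = 3/ln 2 = 4.33, p = 0.6301
p_exp = expon.sf(2, scale=3 / log(2))
beta_exp = 1 - binom.cdf(2, 12, 1 - p_exp)         # approx. 0.8794
print(beta_normal, beta_exp)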

16-2.4 Comparison of the Sign Test and the t-Test


If the underlying population is normal, then either the sign test or the t-test could be used to test H0: μ̃ = μ̃0. The t-test is known to have the smallest value of β possible among all tests that have significance level α, so it is superior to the sign test in the normal distribution case. When the population distribution is symmetric and nonnormal (but with finite mean μ = μ̃), the t-test will have a β error that is smaller than β for the sign test, unless the distribution has very heavy tails compared with the normal. Thus, the sign test is usually considered a test procedure for the median rather than a serious competitor for the t-test. The Wilcoxon signed rank test in the next section is preferable to the sign test and compares well with the t-test for symmetric distributions.

16-3 THE WILCOXON SIGNED RANK TEST

Suppose that we are willing to assume that the population of interest is continuous and symmetric. As in the previous section, our interest focuses on the median μ̃ (or, equivalently, the mean μ, since μ̃ = μ for symmetric distributions). A disadvantage of the sign test in this situation is that it considers only the signs of the deviations Xi − μ̃0 and not their magnitudes. The Wilcoxon signed rank test is designed to overcome that disadvantage.

16-3.1 A Description of the Test

We are interested in testing H0: μ = μ0 against the usual alternatives. Assume that X1, X2, ..., Xn is a random sample from a continuous and symmetric distribution with mean (and median) μ. Compute the differences Xi − μ0, i = 1, 2, ..., n. Rank the absolute differences |Xi − μ0|, i = 1, 2, ..., n, in ascending order, and then give the ranks the signs of their corresponding differences. Let R+ be the sum of the positive ranks and R− be the absolute value of the sum of the negative ranks, and let R = min(R+, R−). Appendix Table XI contains critical values of R, say R*α. If the alternative hypothesis is H1: μ ≠ μ0, then if R ≤ R*α the null hypothesis H0: μ = μ0 is rejected.

For one-sided tests, if the alternative is H1: μ > μ0, reject H0: μ = μ0 if R− < R*α; and if the alternative is H1: μ < μ0, reject H0: μ = μ0 if R+ < R*α. The significance level for one-sided tests is one-half the advertised level in Appendix Table XI.

Example 16-3
To illustrate the Wilcoxon signed rank test, consider the propellant shear strength data presented in Table 16-1. The signed ranks are

Observation   Difference (Xi − 2000)   Signed Rank
16                  +53.50                 +1
 4                  +61.30                 +2
 1                 +158.70                 +3
11                 +165.20                 +4
18                 +200.50                 +5
 5                 +207.50                 +6
 7                 −215.30                 −7
13                 −220.20                 −8
15                 −234.70                 −9
20                 −246.30                −10
10                 +256.70                +11
 6                 −291.70                −12
 3                 +316.00                +13
 2                 −321.85                −14
14                 +336.75                +15
 9                 +357.90                +16
12                 +399.55                +17
17                 +414.40                +18
 8                 +575.00                +19
19                 +654.20                +20

The sum of the positive ranks is R+ = (1 + 2 + 3 + 4 + 5 + 6 + 11 + 13 + 15 + 16 + 17 + 18 + 19 + 20) = 150, and the sum of the negative ranks is R− = (7 + 8 + 9 + 10 + 12 + 14) = 60. Therefore R = min(R+, R−) = min(150, 60) = 60. From Appendix Table XI, with n = 20 and α = 0.05, we find the critical value R*0.05 = 52. Since R exceeds R*0.05, we cannot reject the null hypothesis that the mean (or median, since the populations are assumed to be symmetric) shear strength is 2000 psi.

Ties in the Wilcoxon Signed Rank Test


Because the underlying population is continuous, ties are theoretically impossible, although they will sometimes occur in practice. If several observations have the same absolute magnitude, they are assigned the average of the ranks that they would receive if they differed slightly from one another.

16-3.2 A Large-Sample Approximation


If the sample size is moderately large, say n > 20, then it can be shown that R has approximately a normal distribution with mean

μR = n(n + 1)/4

and variance

σR² = n(n + 1)(2n + 1)/24.

Therefore, a test of H0: μ = μ0 can be based on the statistic

Z0 = [R − n(n + 1)/4] / √[n(n + 1)(2n + 1)/24].     (16-3)

An appropriate critical region can be chosen from a table of the standard normal distribution.

16-3.3 Paired Observations

The Wilcoxon signed rank test can be applied to paired data. Let (X1j, X2j), j = 1, 2, ..., n, be a collection of paired observations from continuous distributions that differ only with respect to their means (it is not necessary that the distributions of X1 and X2 be symmetric). This assures that the distribution of the differences Dj = X1j − X2j is continuous and symmetric.

To use the Wilcoxon signed rank test, the differences are first ranked in ascending order of their absolute values, and then the ranks are given the signs of the differences. Ties are assigned average ranks. Let R+ be the sum of the positive ranks and R− be the absolute value of the sum of the negative ranks, and let R = min(R+, R−). We reject the hypothesis of equality of means if R ≤ R*α, where R*α is chosen from Appendix Table XI.

For one-sided tests, if the alternative is H1: μ1 > μ2 (or H1: μD > 0), reject H0 if R− < R*α; and if H1: μ1 < μ2 (or H1: μD < 0), reject H0 if R+ < R*α. Note that the significance level for one-sided tests is one-half the value given in Table XI.

Example 16-4
Consider the fuel metering device data examined in Example 16-2. The signed ranks are shown below.

Car    Difference   Signed Rank
 7       −0.2          −1
12        0.3           2
 8        0.4           3
 6        0.5           4
 2       −0.6          −5
 4        0.7           6.5
 5       −0.7          −6.5
 1        0.8           8
 9        0.9           9
10       −1.0         −10
11        1.1          11
 3        1.3          12

Note that R+ = 55.5 and R− = 22.5; therefore, R = min(R+, R−) = min(55.5, 22.5) = 22.5. From Appendix Table XI, with n = 12 and α = 0.05, we find the critical value R*0.05 = 13. Since R exceeds R*0.05, we cannot reject the null hypothesis that the two metering devices produce the same mileage performance.
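As with the sign test, the paired form is available directly in scipy. The brief sketch below (not from the text) passes both samples to wilcoxon, which ranks the differences itself (using average ranks for the tied pair) and reports min(R+, R−) = 22.5.

from scipy.stats import wilcoxon

device1 = [17.6, 19.4, 19.5, 17.1, 15.3, 15.9, 16.3, 18.4, 17.3, 19.1, 17.8, 18.2]
device2 = [16.8, 20.0, 18.2, 16.4, 16.0, 15.4, 16.5, 18.0, 16.4, 20.1, 16.7, 17.9]
stat, p = wilcoxon(device1, device2)
print(stat, p)   # statistic 22.5; P-value above 0.05, so do not reject H0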

16-3.4 Comparison with the t-Test

When the underlying population is normal, either the t-test or the Wilcoxon signed rank test can be used to test hypotheses about μ. The t-test is the best test in such situations in the sense that it produces a minimum value of β for all tests with significance level α. However, since it is not always clear that the normal distribution is appropriate, and since there are many situations in which we know it to be inappropriate, it is of interest to compare the two procedures for both normal and nonnormal populations.

Unfortunately, such a comparison is not easy. The problem is that β for the Wilcoxon signed rank test is very difficult to obtain, and β for the t-test is difficult to obtain for nonnormal distributions. Because type II error comparisons are difficult, other measures of comparison have been developed. One widely used measure is asymptotic relative efficiency (ARE). The ARE of one test relative to another is the limiting ratio of the sample sizes necessary to obtain identical error probabilities for the two procedures. For example, if the ARE of one test relative to a competitor is 0.5, then when sample sizes are large, the first test will require a sample twice as large as the second one to obtain similar error performance. While this does not tell us anything for small sample sizes, we can say the following:

1. For normal populations the ARE of the Wilcoxon signed rank test relative to the t-test is approximately 0.95.
2. For nonnormal populations, the ARE is at least 0.86, and in many cases it will exceed unity. When it exceeds unity, the Wilcoxon signed rank test requires a smaller sample size than does the t-test.

Although these are large-sample results, we generally conclude that the Wilcoxon signed rank test will never be much worse than the t-test, and in many cases where the population is nonnormal it may be superior. Thus the Wilcoxon signed rank test is a useful alternative to the t-test.

16-4 THE WILCOXON RANK-SUM TEST

Suppose that we have two independent continuous populations X1 and X2 with means μ1 and μ2. The distributions of X1 and X2 have the same shape and spread and differ only (possibly) in their means. The Wilcoxon rank-sum test can be used to test the hypothesis H0: μ1 = μ2. Sometimes this procedure is called the Mann-Whitney test, although the Mann-Whitney test statistic is usually expressed in a different form.

16-4.1 A Description of the Test

Let X11, X12, ..., X1n1 and X21, X22, ..., X2n2 be two independent random samples from the continuous populations X1 and X2 described earlier. We assume that n1 ≤ n2. Arrange all n1 + n2 observations in ascending order of magnitude and assign ranks to them. If two or more observations are tied (identical), then use the mean of the ranks that would have been assigned if the observations differed. Let R1 be the sum of the ranks in the smaller X1 sample, and define

R2 = n1(n1 + n2 + 1) − R1.     (16-4)

Now if the two means do not differ, we would expect the sums of the ranks to be nearly equal for both samples. Consequently, if the sums of the ranks differ greatly, we would conclude that the means are not equal.

Appendix Table IX contains the critical values R*α of the rank sums for α = 0.05 and α = 0.01. Refer to Appendix Table IX with the appropriate sample sizes n1 and n2. The null hypothesis H0: μ1 = μ2 is rejected in favor of H1: μ1 ≠ μ2 if either R1 or R2 is less than or equal to the tabulated critical value R*α.

The procedure can also be used for one-sided alternatives. If the alternative is H1: μ1 < μ2, then reject H0 if R1 ≤ R*α, while for H1: μ1 > μ2, reject H0 if R2 ≤ R*α. For these one-sided tests the tabulated critical values R*α correspond to levels of significance of α = 0.025 and α = 0.005.

Example 16-5
The mean axial stress in tensile members used in an aircraft structure is being studied. Two alloys are being investigated. Alloy 1 is a traditional material and alloy 2 is a new aluminum-lithium alloy that is much lighter than the standard material. Ten specimens of each alloy type are tested, and the axial stress measured. The sample data are assembled in the following table:

         Alloy 1                    Alloy 2
3238 psi    3254 psi       3261 psi    3248 psi
3195        3229           3187        3215
3246        3225           3209        3226
3190        3217           3212        3240
3204        3241           3258        3234

The data fI.""e arranged in ascending o.rdet and ranked as follows:

Alloy Nutnber Axial Stress Rank

2 3187 psi
3190 2
3195 3
3204 4
2 3209 5
2 3212 6
2 3215 7
3217 g
! 3225 9
2 3226 10
3229 11
2 3234 :2
1 3238 13
2 3240 14
3241 15
3246 :6
2 3248 17
3254 18
2 3258 19
2 3261 20

The sum of the ranks for alloy 1 is

R1 = 2 + 3 + 4 + 8 + 9 + 11 + 13 + 15 + 16 + 18 = 99,

and for alloy 2,

R2 = n1(n1 + n2 + 1) − R1 = 10(10 + 10 + 1) − 99 = 111.

From Appendix Table IX, with n1 = n2 = 10 and α = 0.05, we find that R*0.05 = 78. Since neither R1 nor R2 is less than R*0.05, we cannot reject the hypothesis that both alloys exhibit the same mean axial stress.

16-4.2 A Large-Sample Approximation

When both n1 and n2 are moderately large, say greater than 8, the distribution of R1 can be well approximated by the normal distribution with mean

μR1 = n1(n1 + n2 + 1)/2

and variance

σ²R1 = n1 n2 (n1 + n2 + 1)/12.

Therefore, for n1 and n2 > 8 we could use

Z0 = (R1 − μR1)/σR1     (16-5)

as a test statistic, with the appropriate critical region |Z0| > Zα/2, Z0 > Zα, or Z0 < −Zα, depending on whether the test is a two-tailed, upper-tail, or lower-tail test.

16-4.3 Comparison with the t-Test

In Section 16-3.4 we discussed the comparison of the t-test with the Wilcoxon signed rank test. The results for the two-sample problem are identical to the one-sample case; that is, when the normality assumption is correct, the Wilcoxon rank-sum test is approximately 95% as efficient as the t-test in large samples. On the other hand, regardless of the form of the distributions, the Wilcoxon rank-sum test will always be at least 86% as efficient, even if the underlying distributions are very nonnormal. The efficiency of the Wilcoxon test relative to the t-test is usually high if the underlying distribution has heavier tails than the normal, because the behavior of the t-test is very dependent on the sample mean, which is quite unstable in heavy-tailed distributions.

16·5 NO"'PARA..'-1ETRIC METHODS IN THE ANALYSIS OF VARIAi'iCE


16-5.1 The Kruskal-Wallis Test
The si::tgle·factor analysis of variance model developed in Chapter 12 for comparing a
population means is
'i=t2,,,.,a,
Yij = J1.+ 1:, +£ij
t.
. ] = 1, 2, ..., nl' (16-6)

In this model the error terms εij are assumed to be normally and independently distributed with mean zero and variance σ². The assumption of normality led directly to the F-test described in Chapter 12. The Kruskal-Wallis test is a nonparametric alternative to the F-test; it requires only that the εij have the same continuous distribution for all treatments i = 1, 2, ..., a.

Suppose that N = Σ_{i=1}^{a} ni is the total number of observations. Rank all N observations from smallest to largest and assign the smallest observation rank 1, the next smallest rank 2, ..., and the largest observation rank N. If the null hypothesis

H0: μ1 = μ2 = ... = μa

is true, the N observations come from the same distribution, and all possible assignments of the N ranks to the a samples are equally likely; we would then expect the ranks 1, 2, ..., N to be mixed throughout the a samples. If, however, the null hypothesis H0 is false, then some samples will consist of observations having predominantly small ranks while other samples will consist of observations having predominantly large ranks. Let Rij be the rank of observation Yij, and let Ri. and R̄i. denote the total and average of the ni ranks in the ith treatment. When the null hypothesis is true, then

E(Rij) = (N + 1)/2

and

E(R̄i.) = (1/ni) Σ_{j=1}^{ni} E(Rij) = (N + 1)/2.

The Kruskal-Wallis test statistic measures the degree to which the actual observed average ranks R̄i. differ from their expected value (N + 1)/2. If this difference is large, then the null hypothesis H0 is rejected. The test statistic is

K = [12/(N(N + 1))] Σ_{i=1}^{a} ni (R̄i. − (N + 1)/2)².     (16-7)

An alternative computing formula is

K = [12/(N(N + 1))] Σ_{i=1}^{a} R²i./ni − 3(N + 1).     (16-8)

We would usually prefer equation 16-8 to equation 16-7, as it involves the rank totals rather than the averages.

The null hypothesis H0 should be rejected if the sample data generate a large value for K. The null distribution for K has been obtained by using the fact that under H0 each possible assignment of ranks to the a treatments is equally likely. Thus we could enumerate all possible assignments and count the number of times each value of K occurs. This has led to tables of the critical values of K, although most tables are restricted to small sample sizes ni. In practice, we usually employ the following large-sample approximation: Whenever H0 is true and either

a = 3 and ni ≥ 6 for i = 1, 2, 3

or

a > 3 and ni ≥ 5 for i = 1, 2, ..., a,

then K has approximately a chi-square distribution with a − 1 degrees of freedom. Since large values of K imply that H0 is false, we would reject H0 if

K ≥ χ²α,a−1.

The test has approximate significance level α.

Ties in the Kruskal-Wallis Test   When observations are tied, assign an average rank to each of the tied observations. When there are ties, we should replace the test statistic in equation 16-8 with

K = (1/S²) [ Σ_{i=1}^{a} R²i./ni − N(N + 1)²/4 ],     (16-9)

where ni is the number of observations in the ith treatment, N is the total number of observations, and

S² = [1/(N − 1)] [ Σ_{i=1}^{a} Σ_{j=1}^{ni} R²ij − N(N + 1)²/4 ].     (16-10)

Note that S² is just the variance of the ranks. When the number of ties is moderate, there will be little difference between equations 16-8 and 16-9, and the simpler form (equation 16-8) may be used.

Example 16-6
In Design and Analysis of Experiments, 5th Edition (John Wiley & Sons, 2001), D. C. Montgomery presents data from an experiment in which five different levels of cotton content in a synthetic fiber were tested to determine whether cotton content has any effect on fiber tensile strength. The sample data and ranks from this experiment are shown in Table 16-3. Since there is a fairly large number of ties, we use equation 16-9 as the test statistic. From equation 16-10 we find

S² = [1/(N − 1)] [ Σ_{i=1}^{a} Σ_{j=1}^{ni} R²ij − N(N + 1)²/4 ]
   = (1/24)[5497.79 − 25(26)²/4]
   = 53.03.

Table 16-3 Data and Ranks for the Tensile Testing Experiment

                        Percentage of Cotton
    15          20          25          30          35
 7   2.0     12   9.5    14  11.0    19  20.5     7   2.0
 7   2.0     17  14.0    18  16.5    25  25.0    10   5.0
15  12.5     12   9.5    18  16.5    22  23.0    11   7.0
11   7.0     18  16.5    19  20.5    19  20.5    15  12.5
 9   4.0     18  16.5    19  20.5    23  24.0    11   7.0
Ri.  27.5         66.0        85.0       113.0        33.5

and the test statistic is

K = (1/S²) [ Σ_{i=1}^{a} R²i./ni − N(N + 1)²/4 ]
  = (1/53.03)[5245.7 − 25(26)²/4]
  = 19.25.

Since K > χ²0.01,4 = 13.28, we would reject the null hypothesis and conclude that treatments differ. This is the same conclusion given by the usual analysis of variance F-test.
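As a numerical check, here is a compact sketch (not from the text) of the same analysis with scipy; its kruskal function applies the tie correction automatically, so it should reproduce K ≈ 19.25 for the data of Table 16-3.

from scipy.stats import kruskal

pct15 = [7, 7, 15, 11, 9]
pct20 = [12, 17, 12, 18, 18]
pct25 = [14, 18, 18, 19, 19]
pct30 = [19, 25, 22, 19, 23]
pct35 = [7, 10, 11, 15, 11]
k_stat, p = kruskal(pct15, pct20, pct25, pct30, pct35)
print(k_stat, p)   # K approx. 19.25; P-value well below 0.01 (chi-square, 4 df)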

16-5.2 The Rank Transformation

The procedure used in the previous section of replacing the observations by their ranks is called the rank transformation. It is a very powerful and widely useful technique. If we were to apply the ordinary F-test to the ranks rather than to the original data, we would obtain

F0 = [K/(a − 1)] / [(N − 1 − K)/(N − a)]

as the test statistic. Note that as the Kruskal-Wallis statistic K increases or decreases, F0 also increases or decreases, so the Kruskal-Wallis test is nearly equivalent to applying the usual analysis of variance to the ranks.

The rank transformation has wide applicability in experimental design problems for which no nonparametric alternative to the analysis of variance exists. If the data are ranked and the ordinary F-test applied, an approximate procedure results, but one that has good statistical properties. When we are concerned about the normality assumption or the effect of outliers or "wild" values, we recommend that the usual analysis of variance be performed on both the original data and the ranks. When both procedures give similar results, the analysis of variance assumptions are probably satisfied reasonably well, and the standard analysis is satisfactory. When the two procedures differ, the rank transformation should be preferred since it is less likely to be distorted by nonnormality and unusual observations. In such cases, the experimenter may want to investigate the use of transformations for nonnormality and examine the data and the experimental procedure to determine whether outliers are present and, if so, why they have occurred.
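The rank transformation itself is a two-step recipe: pool and rank all observations, then run the ordinary one-way ANOVA on the ranks. A minimal Python sketch (not from the text), reusing the cotton-content data of Table 16-3:

import numpy as np
from scipy.stats import rankdata, f_oneway

groups = [[7, 7, 15, 11, 9], [12, 17, 12, 18, 18], [14, 18, 18, 19, 19],
          [19, 25, 22, 19, 23], [7, 10, 11, 15, 11]]
ranks = rankdata(np.concatenate(groups))        # ties receive average ranks
cuts = np.cumsum([len(g) for g in groups])[:-1]
f_stat, p = f_oneway(*np.split(ranks, cuts))    # ordinary F-test on the ranks
print(f_stat, p)   # approx. 20.3, matching [K/(a-1)] / [(N-1-K)/(N-a)] with K = 19.25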

16-6 SUMMARY

This chapter has introduced nonparametric, or distribution-free, statistical methods. These procedures are alternatives to the usual parametric t- and F-tests when the normality assumption for the underlying population is not satisfied. The sign test can be used to test hypotheses about the median of a continuous distribution. It can also be applied to paired observations. The Wilcoxon signed rank test can be used to test hypotheses about the mean of a symmetric continuous distribution. It can also be applied to paired observations. The Wilcoxon signed rank test is a good alternative to the t-test. The two-sample hypothesis-testing problem on means of continuous symmetric distributions is approached using the Wilcoxon rank-sum test. This procedure compares very favorably with the two-sample t-test. The Kruskal-Wallis test is a useful alternative to the F-test in the analysis of variance.

16-7 EXERCISES
16-1. Ten samples were taken from a plating bath used in an electronics manufacturing process, and the bath pH determined. The sample pH values are given below:

7.91, 7.85, 6.82, 8.01, 7.46, 6.95, 7.05, 7.35, 7.25, 7.42.

Manufacturing engineering believes that pH has a median value of 7.0. Do the sample data indicate that this statement is correct? Use the sign test to investigate this hypothesis.

16-2. The titanium content in an aircraft-grade alloy is an important determinant of strength. A sample of 20 test coupons reveals the following titanium contents (in percent):

8.32, 8.05, 8.93, 8.65, 8.25, 8.46, 8.52, 8.35, 8.36, 8.41, 8.42, 8.30, 8.71, 8.75, 8.60, 8.83, 8.50, 8.38, 8.29, 8.46.

The median titanium content should be 8.5%. Use the sign test to investigate this hypothesis.

16-3. The distribution of the time between arrivals in a telecommunication system is exponential, and the system manager wishes to test the hypothesis H0: μ̃ = 3.5 min versus H1: μ̃ > 3.5 min.
(a) What is the value of the mean of the exponential distribution under H0: μ̃ = 3.5?
(b) Suppose that we have taken a sample of n = 10 observations and we observe R− = 3. Would the sign test reject H0 at α = 0.05?
(c) What is the type II error of this test if μ̃ = 4.5?

16-4. Suppose that we take a sample of n = 10 measurements from a normal distribution with σ = 1. We wish to test H0: μ = 0 against H1: μ > 0. The normal test statistic is Z0 = (X̄ − μ0)/(σ/√n), and we decide to use a critical region of 1.96 (that is, reject H0 if Z0 ≥ 1.96).
(a) What is α for this test?
(b) What is β for this test if μ = 1?
(c) If a sign test is used, specify the critical region that gives an α value consistent with α for the normal test.
(d) What is the β value for the sign test if μ = 1? Compare this with the result obtained in part (b).

16-5. Two different types of tips can be used in a Rockwell hardness tester. Eight coupons from test ingots of a nickel-based alloy are selected, and each coupon is tested twice, once with each tip. The Rockwell C-scale hardness readings are shown next. Use the sign test to determine whether or not the two tips produce equivalent hardness readings.

Coupon   Tip 1   Tip 2
1         63      60
2         52      51
3         58      56
4         60      59
5         55      58
6         57      54
7         53      52
8         59      61

16-6. Testing for Trends. A turbocharger wheel is manufactured using an investment casting process. The shaft fits into the wheel opening, and this wheel opening is a critical dimension. As wheel wax patterns are formed, the hard tool producing the wax patterns wears. This may cause growth in the wheel-opening dimension. Ten wheel-opening measurements, in time order of production, are shown below:

4.00 (mm), 4.02, 4.03, 4.01, 4.00, 4.03, 4.04, 4.02, 4.03, 4.03.

(a) Suppose that p is the probability that observation Xi+5 exceeds observation Xi. If there is no upward or downward trend, then Xi+5 is no more or less likely to exceed Xi or lie below Xi. What is the value of p?
(b) Let V be the number of values of i for which Xi+5 > Xi. If there is no upward or downward trend in the measurements, what is the probability distribution of V?
(c) Use the data above and the results of parts (a) and (b) to test H0: there is no trend versus H1: there is upward trend. Use α = 0.05.
Note that this test is a modification of the sign test. It was developed by Cox and Stuart.

16-7. Consider the Wilcoxon signed rank test, and suppose that n = 5. Assume that H0: μ = μ0 is true.
(a) How many different sequences of signed ranks are possible? Enumerate these sequences.
(b) How many different values of R+ are there? Find the probability associated with each value of R+.
(c) Suppose that we define the critical region of the test to be R*α such that we would reject if R+ > R*α, and R*α = 13. What is the approximate α level of this test?
(d) Can you see from this exercise how the critical values for the Wilcoxon signed rank test were developed? Explain.

16-8. Consider the data in Exercise 16-1, and assume that the distribution of pH is symmetric and continuous. Use the Wilcoxon signed rank test to test the hypothesis H0: μ = 7 against H1: μ ≠ 7.

16-9. Consider the data in Exercise 16-2. Suppose that the distribution of titanium content is symmetric and continuous. Use the Wilcoxon signed rank test to test the hypotheses H0: μ = 8.5 versus H1: μ ≠ 8.5.

16-10. Consider the data in Exercise 16-2. Use the large-sample approximation for the Wilcoxon signed rank test to test the hypotheses H0: μ = 8.5 versus H1: μ ≠ 8.5. Assume that the distribution of titanium content is continuous and symmetric.

16-11. For the large-sample approximation to the Wilcoxon signed rank test, derive the mean and standard deviation of the test statistic used in the procedure.

16-12. Consider the Rockwell hardness test data in Exercise 16-5. Assume that both distributions are continuous and use the Wilcoxon signed rank test to test that the mean difference in hardness readings between the two tips is zero.

16-13. An electrical engineer must design a circuit to deliver the maximum amount of current to a display tube to achieve sufficient image brightness. Within his allowable design constraints, he has developed two candidate circuits and tests prototypes of each. The resulting data (in microamperes) are shown below:

Circuit 1: 251, 255, 258, 257, 250, 251, 254, 250, 248
Circuit 2: 250, 253, 249, 256, 259, 252, 250, 251

Use the Wilcoxon rank-sum test to test H0: μ1 = μ2 against the alternative H1: μ1 > μ2.

16-14. A consultant frequently travels from Phoenix, Arizona, to Los Angeles, California. He will use one of two airlines, United or Southwest. The number of minutes that his flight arrived late for the last six trips on each airline is shown below. Is there evidence that either airline has superior on-time arrival performance?

United: 2, 19, −2, 8, 0 (minutes late)
Southwest: 20, 4, 8, 8, −3, 5 (minutes late)

16-15. The manufacturer of a hot tub is interested in testing two different heating elements for his product. The element that produces the maximum heat gain after 15 minutes would be preferable. He obtains 10 samples of each heating unit and tests each one. The heat gain after 15 minutes (in °F) is shown below. Is there any reason to suspect that one unit is superior to the other?

Unit 1: 25, 27, 29, 31, 30, 26, 24, 32, 33, 38
Unit 2: 31, 33, 32, 35, 34, 29, 38, 35, 37, 30

16-16. In Design and Analysis of Experiments, 5th Edition (John Wiley & Sons, 2001), D. C. Montgomery presents the results of an experiment to compare four different mixing techniques on the tensile strength of portland cement. The results are shown below. Is there any indication that mixing technique affects the strength?

Mixing Technique   Tensile Strength (lb/in²)
1                  3129  3000  2865  2890
2                  3200  3300  2975  3150
3                  2800  2900  2985  3050
4                  2600  2700  2600  2765

16-17. An article in the Quality Control Handbook, 3rd Edition (McGraw-Hill, 1962) presents the results of an experiment performed to investigate the effect of three different conditioning methods on the breaking strength of cement briquettes. The data are shown below. Is there any indication that conditioning method affects breaking strength?

Conditioning Method   Breaking Strength (lb/in²)
1                     553  550  568  541  537
2                     553  599  579  545  540
3                     492  530  528  510  571

16-18. In Statistics for Research (John Wiley & Sons, 1983), S. Dowdy and S. Wearden present the results of an experiment to measure stress resulting from operating hand-held chain saws. The experimenters measured the kickback angle through which the saw is deflected when it begins to cut a 3-inch stock synthetic board. Shown below are deflection angles for five saws chosen at random from each of four different manufacturers. Is there any evidence that the manufacturers' products differ with respect to kickback angle?

Manufacturer   Kickback Angle
A              42  17  24  39  43
B              28  50  44  32  61
C              57  45  48  41  54
D              29  40  22  34  30
Chapter 17

Statistical Quality Control and Reliability Engineering

The quality of the products and services used by our society has become a major consumer decision factor in many, if not most, businesses today. Regardless of whether the consumer is an individual, a corporation, a military defense program, or a retail store, the consumer is likely to consider quality of equal importance to cost and schedule. Consequently, quality improvement has become a major concern of many U.S. corporations. This chapter is about statistical quality control and reliability engineering methods, two sets of tools that are essential in quality-improvement activities.

17-1 QUALITY IMPROVEMENT AND STATISTICS

Quality means fitness for use. For example, we may purchase automobiles that we expect to be free of manufacturing defects and that should provide reliable and economical transportation, a retailer buys finished goods with the expectation that they are properly packaged and arranged for easy storage and display, or a manufacturer buys raw material and expects to process it with minimal rework or scrap. In other words, all consumers expect that the products and services they buy will meet their requirements, and those requirements define fitness for use.

Quality, or fitness for use, is determined through the interaction of quality of design and quality of conformance. By quality of design we mean the different grades or levels of performance, reliability, serviceability, and function that are the result of deliberate engineering and management decisions. By quality of conformance, we mean the systematic reduction of variability and elimination of defects until every unit produced is identical and defect free.

There is some confusion in our society about quality improvement; some people still think that it means gold plating a product or spending more money to develop a product or process. This thinking is wrong. Quality improvement means the systematic elimination of waste. Examples of waste include scrap and rework in manufacturing, inspection and test, errors on documents (such as engineering drawings, checks, purchase orders, and plans), customer complaint hotlines, warranty costs, and the time required to do things over again that could have been done right the first time. A successful quality-improvement effort can eliminate much of this waste and lead to lower costs, higher productivity, increased customer satisfaction, increased business reputation, higher market share, and ultimately higher profits for the company.


Statistical methods play a vital role in quality improvement. Some applications include the following:
1. In product design and development, statistical methods, including designed experiments, can be used to compare different materials and different components or ingredients, and to help in both system and component tolerance determination. This can significantly lower development costs and reduce development time.
2. Statistical methods can be used to determine the capability of a manufacturing process. Statistical process control can be used to systematically improve a process by reduction of variability.
3. Experiment design methods can be used to investigate improvements in the process. These improvements can lead to higher yields and lower manufacturing costs.
4. Life testing provides reliability and other performance data about the product. This can lead to new and improved designs and products that have longer useful lives and lower operating and maintenance costs.
Some of these applications have been illustrated in earlier chapters of this book. It is essential that engineers and managers have an in-depth understanding of these statistical tools in any industry or business that wants to be the high-quality, low-cost producer. In this chapter we give an introduction to the basic methods of statistical quality control and reliability engineering that, along with experimental design, form the basis of a successful quality-improvement effort.

17-2 STATISTICAL QUALITY CONTROL


The field of statistical quality control can be broadly defined as consisting of those statistical and engineering methods useful in the measurement, monitoring, control, and improvement of quality. In this chapter, a somewhat more narrow definition is employed. We will define statistical quality control as the statistical and engineering methods for process control.
Statistical quality control is a relatively new field, dating back to the 1920s. Dr. Walter A. Shewhart of the Bell Telephone Laboratories was one of the early pioneers of the field. In 1924, he wrote a memorandum showing a modern control chart, one of the basic tools of statistical process control. Harold F. Dodge and Harry G. Romig, two other Bell System employees, provided much of the leadership in the development of statistically based sampling and inspection methods. The work of these three men forms the basis of the modern field of statistical quality control. World War II saw the widespread introduction of these methods to U.S. industry. Dr. W. Edwards Deming and Dr. Joseph M. Juran have been instrumental in spreading statistical quality-control methods since World War II.
The Japanese have been particularly successful in deploying statistical quality-control methods and have used statistical methods to gain significant advantage relative to their competitors. In the 1970s American industry suffered extensively from Japanese (and other foreign) competition, and that has led, in turn, to renewed interest in statistical quality-control methods in the United States. Much of this interest focuses on statistical process control and experimental design. Many U.S. companies have begun extensive programs to implement these methods into their manufacturing, engineering, and other business organizations.

17-3 STATISTICAL PROCESS CONTROL


It is impossible to inspect quality into a product; the product must be built right the first time. This implies that the manufacturing process must be stable or repeatable and capable of operating with little variability around the target or nominal dimension. Online statistical process controls are powerful tools useful in achieving process stability and improving capability through the reduction of variability.
It is customary to think of statistical process control (SPC) as a set of problem-solving tools that may be applied to any process. The major tools of SPC are the following:
1. Histogram
2. Pareto chart
3. Cause-and-effect diagram
4. Defect-concentration diagram
5. Control chart
6. Scatter diagram
7. Check sheet
While these tools are an important part of SPC, they really constitute only the technical aspect of the subject. SPC is an attitude: a desire of all individuals in the organization for continuous improvement in quality and productivity by the systematic reduction of variability. The control chart is the most powerful of the SPC tools. We now give an introduction to several basic types of control charts.

17-3.1 Introduction to Control Charts


The basic theory of the control chart was developed by Walter Shewhart in the 1920s. To understand how a control chart works, we must first understand Shewhart's theory of variation. Shewhart theorized that all processes, however good, are characterized by a certain amount of variation if we measure with an instrument of sufficient resolution. When this variability is confined to random or chance variation only, the process is said to be in a state of statistical control. However, another situation may exist in which the process variability is also affected by some assignable cause, such as a faulty machine setting, operator error, unsatisfactory raw material, worn machine components, and so on.¹ These assignable causes of variation usually have an adverse effect on product quality, so it is important to have some systematic technique for detecting serious departures from a state of statistical control as soon after they occur as possible. Control charts are principally used for this purpose.
The power of the control chart lies in its ability to distinguish assignable causes from random variation. It is the job of the individual using the control chart to identify the underlying root cause responsible for the out-of-control condition, develop and implement an appropriate corrective action, and then follow up to ensure that the assignable cause has been eliminated from the process. There are three points to remember.
1. A state of statistical control is not a natural state for most processes.
2. The attentive use of control charts will result in the elimination of assignable causes, yielding an in-control process and reduced process variability.
3. The control chart is ineffective without the system to develop and implement corrective actions that attack the root causes of problems. Management and engineering involvement is usually necessary to accomplish this.

¹Sometimes common cause is used instead of "random" or "chance" cause, and special cause is used instead of "assignable" cause.

We distinguish between control charts for measurements and control charts for attributes, depending on whether the observations on the quality characteristic are measurements or enumeration data. For example, we may choose to measure the diameter of a shaft, say with a micrometer, and utilize these data in conjunction with a control chart for measurements. On the other hand, we may judge each unit of product as either defective or nondefective and use the fraction of defective units found or the total number of defects in conjunction with a control chart for attributes. Obviously, certain products and quality characteristics lend themselves to analysis by either method, and a clear-cut choice between the two methods may be difficult.
A control chart, whether for measurements or attributes, consists of a centerline, corresponding to the average quality at which the process should perform when statistical control is exhibited, and two control limits, called the upper and lower control limits (UCL and LCL). A typical control chart is shown in Fig. 17-1. The control limits are chosen so that values falling between them can be attributed to chance variation, while values falling beyond them can be taken to indicate a lack of statistical control. The general approach consists of periodically taking a random sample from the process, computing some appropriate quantity, and plotting that quantity on the control chart. When a sample value falls outside the control limits, we search for some assignable cause of variation. However, even if a sample value falls between the control limits, a trend or some other systematic pattern may indicate that some action is necessary, usually to avoid more serious trouble. The samples should be selected in such a way that each sample is as homogeneous as possible and at the same time maximizes the opportunity for variation due to an assignable cause to be present. This is usually called the rational subgroup concept. Order of production and source (if more than one source exists) are commonly used bases for obtaining rational subgroups.
The ability to interpret control charts accurately is usually acquired with experience. It is necessary that the user be thoroughly familiar with both the statistical foundation of control charts and the nature of the production process itself.

17-3.2 Control Charts for Measurements


When dealing with a quality characteristic that can be expressed as a measurement, it is customary to exercise control over both the average value of the quality characteristic and its variability. Control over the average quality is exercised by the control chart for means, usually called the $\bar{X}$ chart. Process variability can be controlled by either a range ($R$) chart or a standard deviation chart, depending on how the population standard deviation is estimated. We will discuss only the $R$ chart.

[Figure 17-1 A typical control chart: a sample statistic plotted against sample number, with the upper control limit (UCL), centerline, and lower control limit (LCL).]
Suppose that the process mean and standard deviation, say $\mu$ and $\sigma$, are known, and, furthermore, that we can assume that the quality characteristic follows the normal distribution. Let $\bar{X}$ be the sample mean based on a random sample of size $n$ from this process. Then the probability is $1 - \alpha$ that the mean of such random samples will fall between $\mu + Z_{\alpha/2}(\sigma/\sqrt{n})$ and $\mu - Z_{\alpha/2}(\sigma/\sqrt{n})$. Therefore, we could use these two values as the upper and lower control limits, respectively. However, we usually do not know $\mu$ and $\sigma$ and they must be estimated. In addition, we may not be able to make the normality assumption. For these reasons, the probability limit $1 - \alpha$ is seldom used in practice. Usually $Z_{\alpha/2}$ is replaced by 3, and "3-sigma" control limits are used.
When $\mu$ and $\sigma$ are unknown, we often estimate them on the basis of preliminary samples, taken when the process is thought to be in control. We recommend the use of at least 20-25 preliminary samples. Suppose $k$ preliminary samples are available, each of size $n$. Typically, $n$ will be 4, 5, or 6; these relatively small sample sizes are widely used and often arise from the construction of rational subgroups. Let the sample mean for the $i$th sample be $\bar{X}_i$. Then we estimate the mean of the population, $\mu$, by the grand mean

$$\bar{\bar{X}} = \frac{1}{k}\sum_{i=1}^{k}\bar{X}_i. \qquad (17\text{-}1)$$

Thus, we may take $\bar{\bar{X}}$ as the centerline on the $\bar{X}$ control chart.


We may estimate $\sigma$ from either the standard deviations or the ranges of the $k$ samples. Since it is more frequently used in practice, we confine our discussion to the range method. The sample size is relatively small, so there is little loss in efficiency in estimating $\sigma$ from the sample ranges. The relationship between the range, $R$, of a sample from a normal population with known parameters and the standard deviation of that population is needed. Since $R$ is a random variable, the quantity $W = R/\sigma$, called the relative range, is also a random variable. The parameters of the distribution of $W$ have been determined for any sample size $n$. The mean of the distribution of $W$ is called $d_2$, and a table of $d_2$ for various $n$ is given in Table XIII of the Appendix. Let $R_i$ be the range of the $i$th sample, and let

$$\bar{R} = \frac{1}{k}\sum_{i=1}^{k}R_i \qquad (17\text{-}2)$$

be the average range. Then an estimate of $\sigma$ would be

$$\hat{\sigma} = \frac{\bar{R}}{d_2}. \qquad (17\text{-}3)$$
Therefore, we may use as our upper and lower control limits for the $\bar{X}$ chart

$$\mathrm{UCL} = \bar{\bar{X}} + \frac{3}{d_2\sqrt{n}}\bar{R}, \qquad \mathrm{LCL} = \bar{\bar{X}} - \frac{3}{d_2\sqrt{n}}\bar{R}. \qquad (17\text{-}4)$$

We note that the quantity $A_2 = 3/(d_2\sqrt{n})$ is a constant depending on the sample size, so it is possible to rewrite equations 17-4 as


$$\mathrm{UCL} = \bar{\bar{X}} + A_2\bar{R}, \qquad \mathrm{LCL} = \bar{\bar{X}} - A_2\bar{R}. \qquad (17\text{-}5)$$

The constant $A_2$ is tabulated for various sample sizes in Table XIII of the Appendix.
The parameters of the $R$ chart may also be easily determined. The centerline will obviously be $\bar{R}$. To determine the control limits, we need an estimate of $\sigma_R$, the standard deviation of $R$. Once again, assuming the process is in control, the distribution of the relative range, $W$, will be useful. The standard deviation of $W$, say $\sigma_W$, is a function of $n$, which has been determined. Thus, since

$$R = W\sigma,$$

we may obtain the standard deviation of $R$ as

$$\sigma_R = \sigma_W\sigma.$$

As $\sigma$ is unknown, we may estimate $\sigma_R$ as

$$\hat{\sigma}_R = \sigma_W\frac{\bar{R}}{d_2},$$

and we would use as the upper and lower control limits on the $R$ chart

$$\mathrm{UCL} = \bar{R} + \frac{3\sigma_W}{d_2}\bar{R}, \qquad \mathrm{LCL} = \bar{R} - \frac{3\sigma_W}{d_2}\bar{R}. \qquad (17\text{-}6)$$

Setting $D_3 = 1 - 3\sigma_W/d_2$ and $D_4 = 1 + 3\sigma_W/d_2$, we may rewrite equation 17-6 as

$$\mathrm{UCL} = D_4\bar{R}, \qquad \mathrm{LCL} = D_3\bar{R}, \qquad (17\text{-}7)$$

where $D_3$ and $D_4$ are tabulated in Table XIII of the Appendix.
When preliminary samples are used to construct limits for control charts, it is customary to treat these limits as trial values. Therefore, the $k$ sample means and ranges should be plotted on the appropriate charts, and any points that exceed the control limits should be investigated. If assignable causes for these points are discovered, they should be eliminated and new limits for the control charts determined. In this way, the process may eventually be brought into statistical control and its inherent capabilities assessed. Other changes in process centering and dispersion may then be contemplated.

Example 17-1
A component part for a jet aircraft engine is manufactured by an investment casting process. The vane opening on this casting is an important functional parameter of the part. We will illustrate the use of $\bar{X}$ and $R$ control charts to assess the statistical stability of this process. Table 17-1 presents 20 samples of five parts each. The values given in the table have been coded by using the last three digits of the dimension; that is, 31.6 should be 0.50316 inch.
The quantities $\bar{\bar{X}} = 33.33$ and $\bar{R} = 5.85$ are shown at the foot of Table 17-1. Notice that even though $\bar{X}$, $\bar{\bar{X}}$, $R$, and $\bar{R}$ are now realizations of random variables, we have still written them as

Table 17-1 Vane Opening Measurements

Sample Number   x1   x2   x3   x4   x5    $\bar{X}$    R
 1              33   29   31   32   33    31.6     4
 2              35   33   31   37   31    33.4     6
 3              35   37   33   34   36    35.0     4
 4              30   31   33   34   33    32.2     4
 5              33   34   35   33   34    33.8     2
 6              38   37   39   40   38    38.4     3
 7              30   31   32   34   31    31.6     4
 8              29   39   38   39   39    36.8    10
 9              28   34   35   36   43    35.2    15
10              39   33   32   34   32    34.0     7
11              28   30   28   32   31    29.8     4
12              31   35   35   35   34    34.0     4
13              27   32   34   35   37    33.0    10
14              33   33   35   37   36    34.8     4
15              35   37   32   35   39    35.6     7
16              33   33   27   31   30    30.8     6
17              35   34   34   30   32    33.0     5
18              32   33   30   30   33    31.6     3
19              25   27   34   27   28    28.2     9
20              35   35   36   33   30    33.8     6

$\bar{\bar{X}} = 33.33$        $\bar{R} = 5.85$

uppercase letters. This is the usual convention in quality control, and it will always be clear from the context what the notation implies. The trial control limits are, for the $\bar{X}$ chart,

$$\bar{\bar{X}} \pm A_2\bar{R} = 33.33 \pm (0.577)(5.85) = 33.33 \pm 3.37,$$

or

$$\mathrm{UCL} = 36.70, \qquad \mathrm{LCL} = 29.96.$$

For the $R$ chart, the trial control limits are

$$\mathrm{UCL} = D_4\bar{R} = (2.115)(5.85) = 12.37, \qquad \mathrm{LCL} = D_3\bar{R} = (0)(5.85) = 0.$$

The $\bar{X}$ and $R$ control charts with these trial control limits are shown in Fig. 17-2. Notice that samples 6, 8, 11, and 19 are out of control on the $\bar{X}$ chart, and that sample 9 is out of control on the $R$ chart. Suppose that all of these assignable causes can be traced to a defective tool in the wax-molding area. We should discard these five samples and recompute the limits for the $\bar{X}$ and $R$ charts. These new revised limits are, for the $\bar{X}$ chart,

$$\mathrm{UCL} = \bar{\bar{X}} + A_2\bar{R} = 32.90 + (0.577)(5.313) = 35.96,$$
$$\mathrm{LCL} = \bar{\bar{X}} - A_2\bar{R} = 32.90 - (0.577)(5.313) = 29.84,$$

and, for the $R$ chart, they are

$$\mathrm{UCL} = (2.115)(5.067) = 10.71,$$
$$\mathrm{LCL} = (0)(5.067) = 0.$$

[Figure 17-2 The $\bar{X}$ and $R$ control charts for vane opening. The $\bar{X}$ chart shows UCL = 36.70, mean = 33.33, and LCL = 29.96; the $R$ chart shows UCL = 12.37 and LCL = 0.]

The revised control charts are shown in Fig. 17-3. Notice that we have treated the first 20 preliminary samples as estimation data with which to establish control limits. These limits can now be used to judge the statistical control of future production. As each new sample becomes available, the values of $\bar{X}$ and $R$ should be computed and plotted on the control charts. It may be desirable to revise the limits periodically, even if the process remains stable. The limits should always be revised when process improvements are made.

Estimating Process Capability

It is usually necessary to obtain some information about the capability of the process, that is, about the performance of the process when it is operating in control. Two graphical tools, the tolerance chart (or tier chart) and the histogram, are helpful in assessing process capability. The tolerance chart for all 20 samples from the vane manufacturing process is shown in Fig. 17-4. The specifications on vane opening, 0.5030 ± 0.001 inch, are also shown on the chart. In terms of the coded data, the upper specification limit is USL = 40 and the lower specification limit is LSL = 20. The tolerance chart is useful in revealing patterns over time in the individual measurements, or it may show that a particular value of $\bar{X}$ or $R$ was produced by one or two unusual observations in the sample. For example, note the two unusual observations in sample 9 and the single unusual observation in sample 8. Note also that it

[Figure 17-3 $\bar{X}$ and $R$ control charts for vane opening, revised limits. The $\bar{X}$ chart shows UCL = 35.96 and LCL = 29.84; the $R$ chart shows UCL = 10.71, $\bar{R}$ = 5.067, and LCL = 0. Points not used in computing the control limits are flagged.]

[Figure 17-4 Tolerance diagram of vane openings, with USL = 40, nominal dimension = 30, and LSL = 20 marked.]

is appropriate to plot the specification limits on the tolerance chart, since it is a chart of individual measurements. It is never appropriate to plot specification limits on a control chart, or to use the specifications in determining the control limits. Specification limits and control limits are unrelated. Finally, note from Fig. 17-4 that the process is running off center from the nominal dimension of 0.5030 inch.
The histogram for the vane opening measurements is shown in Fig. 17-5. The observations from samples 6, 8, 9, 11, and 19 have been deleted from this histogram. The general impression from examining this histogram is that the process is capable of meeting the specifications, but that it is running off center.
Another way to express process capability is in terms of the process capability ratio (PCR), defined as

$$\mathrm{PCR} = \frac{\mathrm{USL} - \mathrm{LSL}}{6\sigma}. \qquad (17\text{-}8)$$

Notice that the $6\sigma$ spread ($3\sigma$ on either side of the mean) is sometimes called the basic capability of the process. The limits $3\sigma$ on either side of the process mean are sometimes called natural tolerance limits, as these represent limits that an in-control process should meet with most of the units produced. For the vane opening, we could estimate $\sigma$ as

$$\hat{\sigma} = \frac{\bar{R}}{d_2} = \frac{\bar{R}}{2.326} = 2.06.$$

Therefore, an estimator for the PCR is

$$\widehat{\mathrm{PCR}} = \frac{\mathrm{USL} - \mathrm{LSL}}{6\hat{\sigma}} = \frac{40 - 20}{6(2.06)} = 1.62.$$

[Figure 17-5 Histogram for the vane opening data, with LSL, nominal dimension, and USL marked.]

The PCR has a natural interpretation: (1/PCR)100 is just the percentage of the tolerance band used by the process. Thus, the vane opening process uses approximately (1/1.62)100 = 61.7% of the tolerance band.
Figure 17-6a shows a process for which the PCR exceeds unity. Since the process natural tolerance limits lie inside the specifications, very few defective or nonconforming units will be produced. If PCR = 1, as shown in Fig. 17-6b, more nonconforming units result. In fact, for a normally distributed process, if PCR = 1, the fraction nonconforming is 0.27%, or 2700 parts per million. Finally, when the PCR is less than unity, as in Fig. 17-6c, the process is very yield sensitive and a large number of nonconforming units will be produced.
The definition of the PCR given in equation 17-8 implicitly assumes that the process is centered at the nominal dimension. If the process is running off center, its actual capability will be less than that indicated by the PCR. It is convenient to think of PCR as a measure of potential capability, that is, capability with a centered process. If the process is not centered, then a measure of actual capability is given by

$$\mathrm{PCR}_k = \min\!\left[\frac{\mathrm{USL} - \bar{\bar{X}}}{3\sigma},\; \frac{\bar{\bar{X}} - \mathrm{LSL}}{3\sigma}\right]. \qquad (17\text{-}9)$$

[Figure 17-6 Process fallout and the process capability ratio (PCR): (a) PCR > 1, natural tolerance limits inside the specifications; (b) PCR = 1, nonconforming units appear in both tails; (c) PCR < 1, many nonconforming units produced.]
In effect, PCR$_k$ is a one-sided process capability ratio that is calculated relative to the specification limit nearest to the process mean. For the vane opening process, we find that

$$\mathrm{PCR}_k = \min\!\left[\frac{\mathrm{USL} - \bar{\bar{X}}}{3\hat{\sigma}},\; \frac{\bar{\bar{X}} - \mathrm{LSL}}{3\hat{\sigma}}\right] = \min\!\left[\frac{40 - 33.19}{3(2.06)} = 1.10,\; \frac{33.19 - 20}{3(2.06)} = 2.13\right] = 1.10.$$

Note that if PCR = PCR$_k$, the process is centered at the nominal dimension. Since PCR$_k$ = 1.10 for the vane opening process, and PCR = 1.62, the process is obviously running off center, as was first noted in Figs. 17-4 and 17-5. This off-center operation was ultimately traced to an oversized wax tool. Changing the tooling resulted in a substantial improvement in the process.
Montgomery (2001) provides guidelines on appropriate values of the PCR and a table relating fallout for a normally distributed process in statistical control as a function of PCR. Many U.S. companies use PCR = 1.33 as a minimum acceptable target and PCR = 1.66 as a minimum target for strength, safety, or critical characteristics. Also, some U.S. companies, particularly in the automobile industry, have adopted the Japanese terminology $C_p$ = PCR and $C_{pk}$ = PCR$_k$. As $C_p$ has another meaning in statistics (in multiple regression; see Chapter 15), we prefer the traditional notation PCR and PCR$_k$.
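Both capability ratios reduce to a couple of lines of code. The sketch below (function names ours) evaluates equations 17-8 and 17-9 for the vane opening figures quoted above:

```python
def pcr(usl, lsl, sigma):
    # Equation 17-8: potential capability, assuming a centered process
    return (usl - lsl) / (6 * sigma)

def pcr_k(usl, lsl, mean, sigma):
    # Equation 17-9: actual capability, relative to the nearer spec limit
    return min((usl - mean) / (3 * sigma), (mean - lsl) / (3 * sigma))

# Vane opening process in coded units: USL = 40, LSL = 20
print(pcr(40, 20, 2.06))             # approximately 1.62
print(pcr_k(40, 20, 33.19, 2.06))    # approximately 1.10
```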

17-3.3 Control Charts for Individual Measurements


Many situations exist in which the sample consists of a single observation; that is, $n = 1$. These situations occur when production is very slow or costly and it is impractical to allow the sample size to be greater than one. Other cases include processes where every observation can be measured due to automated inspection, for example. The Shewhart control chart for individual measurements is appropriate for this type of situation. We will see later in this chapter that the exponentially weighted moving average control chart and the cumulative sum control chart may be more informative than the individuals chart.
The Shewhart control chart uses the moving range, MR, of two successive observations for estimating the process variability. The moving range is defined as

$$MR_i = |x_i - x_{i-1}|.$$

For example, for $m$ observations, $m - 1$ moving ranges are calculated as $MR_2 = |x_2 - x_1|$, $MR_3 = |x_3 - x_2|, \ldots, MR_m = |x_m - x_{m-1}|$. Simultaneous control charts can be established on the individual observations and on the moving range.
The control limits for the individuals control chart are calculated as

$$\mathrm{UCL} = \bar{x} + 3\frac{\overline{MR}}{d_2}, \qquad \text{Centerline} = \bar{x}, \qquad \mathrm{LCL} = \bar{x} - 3\frac{\overline{MR}}{d_2}, \qquad (17\text{-}10)$$

where $\overline{MR}$ is the sample mean of the $MR_i$.

If a moving range of size $n = 2$ is used, then $d_2 = 1.128$ from Table XIII of the Appendix. The control limits for the moving range control chart are

$$\mathrm{UCL} = D_4\overline{MR}, \qquad \text{Centerline} = \overline{MR}, \qquad \mathrm{LCL} = D_3\overline{MR}. \qquad (17\text{-}11)$$
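A minimal Python sketch of these individuals and moving range calculations is given below, assuming moving ranges of size 2 (so $d_2 = 1.128$, $D_3 = 0$, and $D_4 = 3.267$, the constants used in Example 17-2); the function name is ours.

```python
import numpy as np

def individuals_limits(x, d2=1.128, D3=0.0, D4=3.267):
    """Limits for individuals (equation 17-10) and moving range
    (equation 17-11) charts, with moving ranges of size 2."""
    x = np.asarray(x, dtype=float)
    mr = np.abs(np.diff(x))                    # MR_i = |x_i - x_{i-1}|
    xbar, mrbar = x.mean(), mr.mean()
    x_chart = (xbar - 3 * mrbar / d2, xbar, xbar + 3 * mrbar / d2)
    mr_chart = (D3 * mrbar, mrbar, D4 * mrbar)
    return x_chart, mr_chart

# Purity data of Table 17-2; the limits match Example 17-2:
purity = [0.77, 0.76, 0.77, 0.72, 0.73, 0.73, 0.85, 0.70,
          0.75, 0.74, 0.75, 0.84, 0.79, 0.72, 0.74]
print(individuals_limits(purity))
```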

Example 17-2
Batches of a particular chemical product are selected from a process and the purity measured on each. Data for 15 successive batches have been collected and are given in Table 17-2. The moving ranges of size $n = 2$ are also displayed in Table 17-2.
To set up the control chart for individuals, we first need the sample average of the 15 purity measurements. This average is found to be $\bar{x} = 0.757$. The average of the moving ranges of two observations is $\overline{MR} = 0.046$. The control limits for the individuals chart with moving ranges of size 2, using the limits in equation 17-10, are

$$\mathrm{UCL} = 0.757 + 3\frac{0.046}{1.128} = 0.879,$$
$$\text{Centerline} = 0.757,$$
$$\mathrm{LCL} = 0.757 - 3\frac{0.046}{1.128} = 0.635.$$

The control limits for the moving range chart are found using the limits given in equation 17-11:

$$\mathrm{UCL} = 3.267(0.046) = 0.150,$$
$$\text{Centerline} = 0.046,$$
$$\mathrm{LCL} = 0(0.046) = 0.$$

Table 17-2 Purity of Chemical Product

Batch   Purity, x    Moving Range, MR
 1      0.77         —
 2      0.76         0.01
 3      0.77         0.01
 4      0.72         0.05
 5      0.73         0.01
 6      0.73         0.00
 7      0.85         0.12
 8      0.70         0.15
 9      0.75         0.05
10      0.74         0.01
11      0.75         0.01
12      0.84         0.09
13      0.79         0.05
14      0.72         0.07
15      0.74         0.02

$\bar{x} = 0.757$        $\overline{MR} = 0.046$

[Figure 17-7 Control charts for (a) the individual observations (UCL = 0.879, mean = 0.757, LCL = 0.635) and (b) the moving range (UCL = 0.150, $\overline{MR}$ = 0.046, LCL = 0) of purity.]

The control charts for individual observations and for the moving range are provided in Fig. 17-7. Since there are no points beyond the control limits, the process appears to be in statistical control.
The individuals chart can be interpreted much like the $\bar{X}$ control chart. An out-of-control situation would be indicated by either a point (or points) plotting beyond the control limits or a pattern such as a run on one side of the centerline.
The moving range chart cannot be interpreted in the same way. Although a point (or points) plotting beyond the control limits would likely indicate an out-of-control situation, a pattern or run on one side of the centerline is not necessarily an indication that the process is out of control. This is due to the fact that the moving ranges are correlated, and this correlation may naturally cause patterns or trends on the chart.

17-3.4 Control Charts for Attributes


The p Chart (Fraction Defective or Nonconforming)
Often it is desirable to classify a product as either defective or nondefective on the basis of comparison with a standard. This is usually done to achieve economy and simplicity in the inspection operation. For example, the diameter of a ball bearing may be checked by determining whether it will pass through a gauge consisting of circular holes cut in a template. This would be much simpler than measuring the diameter with a micrometer. Control charts for attributes are used in these situations. However, attribute control charts require a considerably larger sample size than do their measurements counterparts. We will discuss the fraction-defective chart, or p chart, and two charts for defects, the c and u charts. Note that it is possible for a unit to have many defects and be either defective or nondefective. In some applications a unit can have several defects, yet be classified as nondefective.

Suppose $D$ is the number of defective units in a random sample of size $n$. We assume that $D$ is a binomial random variable with unknown parameter $p$. Now the sample fraction defective is an estimator of $p$, that is,

$$\hat{p} = \frac{D}{n}. \qquad (17\text{-}12)$$

Furthermore, the variance of the statistic $\hat{p}$ is

$$\sigma_{\hat{p}}^2 = \frac{p(1-p)}{n},$$

so we may estimate $\sigma_{\hat{p}}^2$ as

$$\hat{\sigma}_{\hat{p}}^2 = \frac{\hat{p}(1-\hat{p})}{n}. \qquad (17\text{-}13)$$

The centerline and control limits for the fraction-defective control chart may now be easily determined. Suppose $k$ preliminary samples are available, each of size $n$, and $D_i$ is the number of defectives in the $i$th sample. Then we may take

$$\bar{p} = \frac{1}{k}\sum_{i=1}^{k}\frac{D_i}{n} \qquad (17\text{-}14)$$

as the centerline and

$$\mathrm{UCL} = \bar{p} + 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}, \qquad \mathrm{LCL} = \bar{p} - 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} \qquad (17\text{-}15)$$

as the upper and lower control limits, respectively. These control limits are based on the normal approximation to the binomial distribution. When $p$ is small, the normal approximation may not always be adequate. In such cases, it is best to use control limits obtained directly from a table of binomial probabilities or, perhaps, from the Poisson approximation to the binomial distribution. If $p$ is small, the lower control limit may be a negative number. If this should occur, it is customary to consider zero as the lower control limit.
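A brief Python sketch of equations 17-14 and 17-15, including the zero-truncation convention just described, follows (the function name and the short list of counts are ours):

```python
import math

def p_chart_limits(defectives, n):
    """Centerline and 3-sigma limits for a fraction-defective (p) chart.
    defectives: defective counts, one per preliminary sample of size n."""
    k = len(defectives)
    pbar = sum(defectives) / (k * n)                # equation 17-14
    half_width = 3 * math.sqrt(pbar * (1 - pbar) / n)
    lcl = max(0.0, pbar - half_width)               # truncate at zero
    return lcl, pbar, pbar + half_width             # equation 17-15

# Five illustrative samples of size 100:
print(p_chart_limits([44, 48, 32, 50, 29], n=100))
```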

Example 17-3
Suppose we wish to construct a fraction-defective control chart for a ceramic substrate production line. We have 20 preliminary samples, each of size 100; the numbers of defectives in each sample are shown in Table 17-3. Assume that the samples are numbered in the sequence of production. Note that $\bar{p} = 790/2000 = 0.395$, and therefore the trial parameters for the control chart are

$$\text{Centerline} = 0.395,$$

$$\mathrm{UCL} = 0.395 + 3\sqrt{\frac{(0.395)(0.605)}{100}} = 0.5417,$$

$$\mathrm{LCL} = 0.395 - 3\sqrt{\frac{(0.395)(0.605)}{100}} = 0.2483.$$

Table 17-3 Number of Defectives in Samples of 100 Ceramic Substrates

Sample   No. of Defectives     Sample   No. of Defectives
 1       44                    11       36
 2       48                    12       52
 3       32                    13       35
 4       50                    14       41
 5       29                    15       42
 6       31                    16       30
 7       46                    17       46
 8       52                    18       38
 9       44                    19       26
10       38                    20       30

The control chart is shown in Fig. 17-8. All samples are in control. If they were not, we would search for assignable causes of variation and revise the limits accordingly.
Although this process exhibits statistical control, its capability ($\bar{p}$ = 0.395) is very poor. We should take appropriate steps to investigate the process to determine why such a large number of defective units are being produced. Defective units should be analyzed to determine the specific types of defects present. Once the defect types are known, process changes should be investigated to determine their impact on defect levels. Designed experiments may be useful in this regard.

Example 17-4
Attributes Versus Measurements Control Charts
The advantage of measurement control charts relative to the p chart with respect to size of sample may be easily illustrated. Suppose that a normally distributed quality characteristic has a standard deviation of 4 and specification limits of 52 and 68. The process is centered at 60, which results in a fraction defective of 0.0454. Let the process mean shift to 56. Now the fraction defective is 0.1601. If the

[Figure 17-8 The p chart for a ceramic substrate, with UCL = 0.5417 and LCL = 0.2483.]

probability of detecting the shift on the first sample following the shift is to be 0.50, then the sample size must be such that the lower 3-sigma limit will be at 56. This implies

$$60 - \frac{3(4)}{\sqrt{n}} = 56,$$

whose solution is $n = 9$. For a p chart, using the normal approximation to the binomial, we must have

$$0.0454 + 3\sqrt{\frac{(0.0454)(0.9546)}{n}} = 0.1601,$$

whose solution is $n = 30$. Thus, unless the cost of measurement inspection is more than three times as costly as the attributes inspection, the measurement control chart is cheaper to operate.

The c Chart (Defects)

In some situations it may be necessary to control the number of defects in a unit of product rather than the fraction defective. In these situations we may use the control chart for defects, or the c chart. Suppose that in the production of cloth it is necessary to control the number of defects per yard, or that in assembling an aircraft wing the number of missing rivets must be controlled. Many defects-per-unit situations can be modeled by the Poisson distribution.
Let $c$ be the number of defects in a unit, where $c$ is a Poisson random variable with parameter $\alpha$. Now the mean and variance of this distribution are both $\alpha$. Therefore, if $k$ units are available and $c_i$ is the number of defects in unit $i$, the centerline of the control chart is

$$\bar{c} = \frac{1}{k}\sum_{i=1}^{k}c_i, \qquad (17\text{-}16)$$

and

$$\mathrm{UCL} = \bar{c} + 3\sqrt{\bar{c}}, \qquad \mathrm{LCL} = \bar{c} - 3\sqrt{\bar{c}} \qquad (17\text{-}17)$$

are the upper and lower control limits, respectively.
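In code, equations 17-16 and 17-17 amount to a mean and a square root. A minimal sketch (function name ours, with the set-to-zero convention used in Example 17-5) is:

```python
import math

def c_chart_limits(defect_counts):
    """Centerline and 3-sigma limits for a c chart (equations 17-16, 17-17)."""
    cbar = sum(defect_counts) / len(defect_counts)
    lcl = max(0.0, cbar - 3 * math.sqrt(cbar))   # set to 0 if negative
    return lcl, cbar, cbar + 3 * math.sqrt(cbar)

# Defect counts of Table 17-4 (cbar = 8):
counts = [6, 4, 8, 10, 9, 12, 16, 2, 3, 10, 9, 15, 8, 10, 8, 2, 7, 1, 7, 13]
print(c_chart_limits(counts))   # (0.0, 8.0, about 16.49)
```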

Example 17-5
Printed circuit boards are assembled by a combination of manual assembly and automation. A flow solder machine is used to make the mechanical and electrical connections of the leaded components to the board. The boards are run through the flow solder process almost continuously, and every hour five boards are selected and inspected for process-control purposes. The number of defects in each sample of five boards is noted. Results for 20 samples are shown in Table 17-4. Now $\bar{c} = 160/20 = 8$, and therefore

$$\mathrm{UCL} = 8 + 3\sqrt{8} = 16.485,$$
$$\mathrm{LCL} = 8 - 3\sqrt{8} < 0, \text{ set to } 0.$$

From the control chart in Fig. 17-9, we see that the process is in control. However, eight defects per group of five printed circuit boards is too many (about 8/5 = 1.6 defects/board), and the process needs improvement. An investigation needs to be made of the specific types of defects found on the printed circuit boards. This will usually suggest potential avenues for process improvement.

Table 17-4 Number of Defects in Samples of Five Printed Circuit Boards

Sample   No. of Defects     Sample   No. of Defects
 1       6                  11       9
 2       4                  12       15
 3       8                  13       8
 4       10                 14       10
 5       9                  15       8
 6       12                 16       2
 7       16                 17       7
 8       2                  18       1
 9       3                  19       7
10       10                 20       13

[Figure 17-9 The c chart for defects in samples of five printed circuit boards, with UCL = 16.485, centerline = 8, and LCL = 0.]

The u Chart (Defects per Unit)

In some processes it may be preferable to work with the number of defects per unit rather than the total number of defects. Thus, if the sample consists of $n$ units and there are $c$ total defects in the sample, then

$$u = \frac{c}{n}$$

is the average number of defects per unit. A $u$ chart may be constructed for such data. If there are $k$ preliminary samples, each with $u_1, u_2, \ldots, u_k$ defects per unit, then the centerline on the $u$ chart is

$$\bar{u} = \frac{1}{k}\sum_{i=1}^{k}u_i, \qquad (17\text{-}18)$$

and the control limits are given by

$$\mathrm{UCL} = \bar{u} + 3\sqrt{\frac{\bar{u}}{n}}, \qquad \mathrm{LCL} = \bar{u} - 3\sqrt{\frac{\bar{u}}{n}}. \qquad (17\text{-}19)$$
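The same pattern extends directly to the u chart. A short sketch under the constant-sample-size assumption of equations 17-18 and 17-19 (function name ours) follows:

```python
import math

def u_chart_limits(defects, n):
    """Centerline and 3-sigma limits for a u chart with constant sample size n.
    defects: total defect counts c, one per sample of n inspection units."""
    u = [c / n for c in defects]                 # defects per unit
    ubar = sum(u) / len(u)                       # equation 17-18
    half_width = 3 * math.sqrt(ubar / n)         # equation 17-19
    return max(0.0, ubar - half_width), ubar, ubar + half_width

# Printed circuit board data, five boards per sample:
counts = [6, 4, 8, 10, 9, 12, 16, 2, 3, 10, 9, 15, 8, 10, 8, 2, 7, 1, 7, 13]
print(u_chart_limits(counts, n=5))   # (0.0, 1.6, about 3.3)
```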
Example 17-6
A u chart may be constructed for the printed circuit board defect data in Example 17-5. Since each sample contains $n = 5$ printed circuit boards, the values of $u$ for each sample may be calculated as shown in the following display:

Sample   Sample Size, n   Number of Defects, c   Defects per Unit, u
 1       5                6                      1.2
 2       5                4                      0.8
 3       5                8                      1.6
 4       5                10                     2.0
 5       5                9                      1.8
 6       5                12                     2.4
 7       5                16                     3.2
 8       5                2                      0.4
 9       5                3                      0.6
10       5                10                     2.0
11       5                9                      1.8
12       5                15                     3.0
13       5                8                      1.6
14       5                10                     2.0
15       5                8                      1.6
16       5                2                      0.4
17       5                7                      1.4
18       5                1                      0.2
19       5                7                      1.4
20       5                13                     2.6

The centerline for the u chart is

$$\bar{u} = \frac{1}{20}\sum_{i=1}^{20}u_i = \frac{32}{20} = 1.6,$$

and the upper and lower control limits are

$$\mathrm{UCL} = \bar{u} + 3\sqrt{\frac{\bar{u}}{n}} = 1.6 + 3\sqrt{\frac{1.6}{5}} = 3.3,$$

$$\mathrm{LCL} = \bar{u} - 3\sqrt{\frac{\bar{u}}{n}} = 1.6 - 3\sqrt{\frac{1.6}{5}} < 0, \text{ set to } 0.$$

The control chart is plotted in Fig. 17-10. Notice that the u chart in this example is equivalent to the c chart in Fig. 17-9. In some cases, particularly when the sample size is not constant, the u chart will be preferable to the c chart. For a discussion of variable sample sizes on control charts, see Montgomery (2001).

17-3.5 CUSUM and EWMA Control Charts

Up to this point in Chapter 17 we have presented the most basic of control charts, the Shewhart control charts. A major disadvantage of these control charts is their insensitivity to small shifts in the process (shifts often less than $1.5\sigma$). This disadvantage is due to the fact that the Shewhart charts use information only from the current observation.

[Figure 17-10 The u chart of defects per unit on printed circuit boards, Example 17-6.]

Alternatives to Shewhart control charts include the cumulative sum control chart and the exponentially weighted moving average control chart. These control charts are more sensitive to small shifts in the process because they incorporate information from current and recent past observations.

Tabular CUSUM Control Charts for the Process Mean

The cumulative sum (CUSUM) control chart was first introduced by Page (1954) and incorporates information from a sequence of sample observations. The chart plots the cumulative sums of deviations of the observations from a target value. To illustrate, let $\bar{x}_j$ represent the $j$th sample mean, let $\mu_0$ represent the target value for the process mean, and say the sample size is $n \geq 1$. The CUSUM control chart plots the quantity

$$C_i = \sum_{j=1}^{i}(\bar{x}_j - \mu_0) \qquad (17\text{-}20)$$

against the sample $i$. The quantity $C_i$ is the cumulative sum up to and including the $i$th sample. As long as the process is in control at the target value $\mu_0$, then $C_i$ in equation 17-20 represents a random walk with a mean of zero. On the other hand, if the process shifts away from the target mean, then either an upward or downward drift in $C_i$ will be evident. By incorporating information from a sequence of observations, the CUSUM chart is able to detect a small shift in the process more quickly than a standard Shewhart chart. The CUSUM charts can be easily implemented for both subgroup data and individual observations. We will present the tabular CUSUM for individual observations.
The tabular CUSUM involves two statistics, $C_i^+$ and $C_i^-$, which are the accumulation of deviations above and below the target mean, respectively. $C_i^+$ is called the one-sided upper CUSUM and $C_i^-$ is called the one-sided lower CUSUM. The statistics are computed as follows:

$$C_i^+ = \max\big[0,\; x_i - (\mu_0 + K) + C_{i-1}^+\big], \qquad (17\text{-}21)$$

$$C_i^- = \max\big[0,\; (\mu_0 - K) - x_i + C_{i-1}^-\big], \qquad (17\text{-}22)$$

with initial values of $C_0^+ = C_0^- = 0$. The constant $K$ is referred to as the reference value and is often chosen approximately halfway between the target mean, $\mu_0$, and the out-of-control

mean that we are interested in detecting, denoted Jl.l' In other words. K is half oftbe mag-
nitude of the shift from 110 10 )1.1' or

The Sllltistics given in equations 17-21 and 17·22 accumulate the deviations from target that
are larger than K and reset to zero when either qm.u:.tity becomes negative. The Cr..:SUM
control chart plots the values of C; and C; for each sample. If either statistic plots beyond
the decision interval. H, the process is considered out of controL We will discuss the choice
of H later in this ch,,!,!er, but a good rule of thumb is often II = 5a.
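A compact Python sketch of the tabular CUSUM of equations 17-21 and 17-22 is given below, assuming individual observations; the function name is ours, and the data values are the first dry-matter batches of Example 17-7.

```python
def tabular_cusum(x, mu0, k, h):
    """One-sided upper and lower CUSUMs (equations 17-21 and 17-22).
    Returns the C+ and C- sequences and the first out-of-control sample."""
    c_plus, c_minus, signal = [], [], None
    cp = cm = 0.0                       # C0+ = C0- = 0
    for i, xi in enumerate(x, start=1):
        cp = max(0.0, xi - (mu0 + k) + cp)
        cm = max(0.0, (mu0 - k) - xi + cm)
        c_plus.append(cp)
        c_minus.append(cm)
        if signal is None and (cp > h or cm > h):
            signal = i
    return c_plus, c_minus, signal

# Target 45%, sigma = 0.84, so K = 0.42 and H = 5(0.84) = 4.2:
batches = [46.21, 45.73, 44.37, 44.19, 43.73, 45.66, 44.24, 44.48]
print(tabular_cusum(batches, mu0=45.0, k=0.42, h=4.2))
```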

Example 17-7
A study presented in Food Control (2001, p. 119) gives the results of measuring the dry-matter content in buttercream from a batch process. One goal of the study is to monitor the amount of dry matter from batch to batch. Table 17-5 displays some data that may be typical of this type of process. The reported values, $x_i$, are percentage of dry-matter content examined after mixing. The target amount of dry-matter content is 45%, and assume that $\sigma = 0.84\%$. Let us also assume that we are interested in detecting a shift in the process mean of at least $1\sigma$; that is, $\mu_1 = \mu_0 + 1\sigma = 45 + 1(0.84) = 45.84\%$. We will use the
Table 17-5 CUSUM Calculations for Example 17-7

Batch, i   x_i     x_i − 45.42   C+_i    44.58 − x_i   C−_i
 1         46.21    0.79          0.79   −1.63         0
 2         45.73    0.31          1.10   −1.15         0
 3         44.37   −1.05          0.05    0.21         0.21
 4         44.19   −1.23          0       0.39         0.60
 5         43.73   −1.69          0       0.85         1.45
 6         45.66    0.24          0.24   −1.08         0.37
 7         44.24   −1.18          0       0.34         0.71
 8         44.48   −0.94          0       0.10         0.81
 9         46.04    0.62          0.62   −1.46         0
10         44.04   −1.38          0       0.54         0.54
11         42.96   −2.46          0       1.62         2.16
12         46.02    0.60          0.60   −1.44         0.72
13         44.82   −0.60          0      −0.24         0.48
14         45.02   −0.40          0      −0.44         0.04
15         45.77    0.35          0.35   −1.19         0
16         47.40    1.98          2.33   −2.82         0
17         47.55    2.13          4.46   −2.97         0
18         46.64    1.22          5.68   −2.06         0
19         46.31    0.89          6.57   −1.73         0
20         44.82   −0.60          5.97   −0.24         0
21         45.39   −0.03          5.94   −0.81         0
22         47.80    2.38          8.32   −3.22         0
23         46.69    1.27          9.59   −2.11         0
24         46.99    1.57         11.16   −2.41         0
25         44.53   −0.89         10.27    0.05         0.05

recommended decision interval of $H = 5\sigma = 5(0.84) = 4.2$. The reference value, $K$, is found to be

$$K = \frac{|\mu_1 - \mu_0|}{2} = \frac{45.84 - 45}{2} = 0.42.$$

The values of $C_i^+$ and $C_i^-$ are given in Table 17-5. To illustrate the calculations, consider the first two sample batches. Recall that $C_0^+ = C_0^- = 0$, and using equations 17-21 and 17-22 with $K = 0.42$ and $\mu_0 = 45$, we have

$$C_i^+ = \max[0,\; x_i - (45 + 0.42) + C_{i-1}^+] = \max[0,\; x_i - 45.42 + C_{i-1}^+]$$

and

$$C_i^- = \max[0,\; (45 - 0.42) - x_i + C_{i-1}^-] = \max[0,\; 44.58 - x_i + C_{i-1}^-].$$
For batch 1, $x_1 = 46.21$,

$$C_1^+ = \max[0,\; 46.21 - 45.42 + 0] = \max[0,\; 0.79] = 0.79,$$

and

$$C_1^- = \max[0,\; 44.58 - 46.21 + 0] = \max[0,\; -1.63] = 0.$$

For batch 2, $x_2 = 45.73$,

$$C_2^+ = \max[0,\; 45.73 - 45.42 + 0.79] = \max[0,\; 1.10] = 1.10,$$

and

$$C_2^- = \max[0,\; 44.58 - 45.73 + 0] = \max[0,\; -1.15] = 0.$$

The CUSUM calculations given in Table 17-5 indicate that the upper-sided CUSUM for batch 17 is $C_{17}^+ = 4.46$, which exceeds the decision value of $H = 4.2$. Therefore, the process appears to have shifted out of control. The CUSUM status chart created using Minitab® with $H = 4.2$ is given in Fig. 17-11. The out-of-control situation is also evident on this chart at batch 17.

The CUSUM control chart is a powerful quality tool for detecting a process that has shifted from the target process mean. The correct choices of $H$ and $K$ can greatly improve the sensitivity of the control chart while protecting against the occurrence of false alarms (the process is actually in control, but the control chart signals out of control). Design recommendations for the CUSUM will be provided later in this chapter when the concept of average run length is introduced.

[Figure 17-11 CUSUM status chart for the dry-matter content data of Example 17-7, plotting the upper and lower CUSUMs against subgroup number with decision interval H = 4.2.]

We have presented the upper and lower CUSUM control charts for situations in which a shift in either direction away from the process target is of interest. There are many instances when we may be interested in a shift in only one direction, either upward or downward. One-sided CUSUM charts can be constructed for these situations. For a thorough development of these charts and more details, see Montgomery (2001).

EWMA Control Charts

The exponentially weighted moving average (EWMA) control chart is also a good alternative to the Shewhart control chart when detecting a small shift in the process mean is of interest. We will present the EWMA for individual measurements, although the procedure can also be modified for subgroups of size $n > 1$.
The EWMA control chart was first introduced by Roberts (1959). The EWMA is defined as

$$z_i = \lambda x_i + (1 - \lambda)z_{i-1}, \qquad (17\text{-}23)$$

where $\lambda$ is a weight, $0 < \lambda \leq 1$. The procedure to be presented is initialized with $z_0 = \mu_0$, the process target mean. If a target mean is unknown, then the average of preliminary data, $\bar{x}$, is used as the initial value of the EWMA. The definition given in equation 17-23 demonstrates that information from past observations is incorporated into the current value of $z_i$. The value $z_i$ is a weighted average of all previous sample means. To illustrate, we can replace $z_{i-1}$ on the right-hand side of equation 17-23 to obtain

$$z_i = \lambda x_i + (1 - \lambda)[\lambda x_{i-1} + (1 - \lambda)z_{i-2}]$$
$$= \lambda x_i + \lambda(1 - \lambda)x_{i-1} + (1 - \lambda)^2 z_{i-2}.$$

By recursively replacing $z_{i-j}$, $j = 1, 2, \ldots$, we find

$$z_i = \lambda\sum_{j=0}^{i-1}(1 - \lambda)^j x_{i-j} + (1 - \lambda)^i z_0.$$

The EWMA can be thought of as a weighted average of all past and current observations. Note that the weights decrease geometrically with the age of the observation, giving less weight to observations that occurred early in the process. The EWMA is often used in forecasting, but the EWMA control chart has been used extensively for monitoring many types of processes.
If the observations are independent random variables with variance $\sigma^2$, the variance of the EWMA, $z_i$, is

$$\sigma_{z_i}^2 = \sigma^2\left(\frac{\lambda}{2 - \lambda}\right)\left[1 - (1 - \lambda)^{2i}\right].$$

Given a target mean, $\mu_0$, and the variance of the EWMA, the upper control limit, centerline, and lower control limit for the EWMA control chart are

$$\mathrm{UCL} = \mu_0 + L\sigma\sqrt{\left(\frac{\lambda}{2 - \lambda}\right)\left[1 - (1 - \lambda)^{2i}\right]},$$
$$\text{Centerline} = \mu_0,$$
$$\mathrm{LCL} = \mu_0 - L\sigma\sqrt{\left(\frac{\lambda}{2 - \lambda}\right)\left[1 - (1 - \lambda)^{2i}\right]},$$

where $L$ is the width of the control limits. Note that the term $1 - (1 - \lambda)^{2i}$ approaches 1 as $i$ increases. Therefore, as the process continues running, the control limits for the EWMA approach the steady-state values

$$\mathrm{UCL} = \mu_0 + L\sigma\sqrt{\frac{\lambda}{2 - \lambda}}, \qquad \mathrm{LCL} = \mu_0 - L\sigma\sqrt{\frac{\lambda}{2 - \lambda}}. \qquad (17\text{-}24)$$

Although the control limits given in equation 17-24 provide good approximations, it is recommended that the exact limits be used for small values of $i$.
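The EWMA recursion and its exact limits are equally simple to program. The following sketch (function name ours) uses the $\lambda = 0.2$, $L = 2.7$ values of Example 17-8:

```python
import math

def ewma_chart(x, mu0, sigma, lam=0.2, L=2.7):
    """EWMA values z_i (equation 17-23) with exact control limits."""
    rows, z = [], mu0                                   # z_0 = mu0
    for i, xi in enumerate(x, start=1):
        z = lam * xi + (1 - lam) * z
        width = L * sigma * math.sqrt(
            lam / (2 - lam) * (1 - (1 - lam) ** (2 * i)))
        rows.append((round(z, 2), round(mu0 - width, 2), round(mu0 + width, 2)))
    return rows

# First dry-matter batches, target 45%, sigma = 0.84:
print(ewma_chart([46.21, 45.73, 44.37], mu0=45.0, sigma=0.84))
# [(45.24, 44.55, 45.45), (45.34, 44.42, 45.58), (45.15, 44.35, 45.65)]
```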

Example 17-8
We will now implement the EWMA control chart with $\lambda = 0.2$ and $L = 2.7$ for the dry-matter content data provided in Table 17-5. Recall that the target mean is $\mu_0 = 45\%$ and the process standard deviation is assumed to be $\sigma = 0.84\%$. The EWMA calculations are provided in Table 17-6. To demonstrate some of the calculations, consider the first observation, with $x_1 = 46.21$. We find

$$z_1 = \lambda x_1 + (1 - \lambda)z_0 = (0.2)(46.21) + (0.80)(45) = 45.24.$$

The second EWMA value is then

$$z_2 = \lambda x_2 + (1 - \lambda)z_1 = (0.2)(45.73) + (0.80)(45.24) = 45.34.$$
Table 17-6 EWMA Calculations for Example 17-8

Batch, i   x_i     z_i     UCL     LCL
 1         46.21   45.24   45.45   44.55
 2         45.73   45.34   45.58   44.42
 3         44.37   45.15   45.65   44.35
 4         44.19   44.95   45.69   44.31
 5         43.73   44.71   45.71   44.29
 6         45.66   44.90   45.73   44.27
 7         44.24   44.77   45.74   44.26
 8         44.48   44.71   45.75   44.25
 9         46.04   44.98   45.75   44.25
10         44.04   44.79   45.75   44.25
11         42.96   44.42   45.75   44.25
12         46.02   44.74   45.75   44.25
13         44.82   44.76   45.75   44.25
14         45.02   44.81   45.76   44.24
15         45.77   45.00   45.76   44.24
16         47.40   45.48   45.76   44.24
17         47.55   45.90   45.76   44.24
18         46.64   46.04   45.76   44.24
19         46.31   46.10   45.76   44.24
20         44.82   45.84   45.76   44.24
21         45.39   45.75   45.76   44.24
22         47.80   46.16   45.76   44.24
23         46.69   46.27   45.76   44.24
24         46.99   46.41   45.76   44.24
25         44.53   46.03   45.76   44.24

The EWMA values are plotted on a control chart along with the upper and lower control limits given by

$$\mathrm{UCL} = \mu_0 + L\sigma\sqrt{\left(\frac{\lambda}{2 - \lambda}\right)\left[1 - (1 - \lambda)^{2i}\right]} = 45 + 2.7(0.84)\sqrt{\left(\frac{0.2}{2 - 0.2}\right)\left[1 - (1 - 0.2)^{2i}\right]},$$

$$\mathrm{LCL} = \mu_0 - L\sigma\sqrt{\left(\frac{\lambda}{2 - \lambda}\right)\left[1 - (1 - \lambda)^{2i}\right]} = 45 - 2.7(0.84)\sqrt{\left(\frac{0.2}{2 - 0.2}\right)\left[1 - (1 - 0.2)^{2i}\right]}.$$

Therefore, for $i = 1$,

$$\mathrm{UCL} = 45 + 2.7(0.84)\sqrt{\left(\frac{0.2}{1.8}\right)\left[1 - (0.8)^2\right]} = 45.45$$

and

$$\mathrm{LCL} = 45 - 2.7(0.84)\sqrt{\left(\frac{0.2}{1.8}\right)\left[1 - (0.8)^2\right]} = 44.55.$$

[Figure 17-12 EWMA chart for Example 17-8, with centerline 45 and steady-state limits UCL = 45.76 and LCL = 44.24.]

The remaining control limits are calculated similarly and plotted on the control chart given in Fig. 17-12. The control limits tend to increase as $i$ increases, but then tend to the steady-state values given by the equations in 17-24:

$$\mathrm{UCL} = \mu_0 + L\sigma\sqrt{\frac{\lambda}{2 - \lambda}} = 45 + 2.7(0.84)\sqrt{\frac{0.2}{2 - 0.2}} = 45.76,$$

$$\mathrm{LCL} = \mu_0 - L\sigma\sqrt{\frac{\lambda}{2 - \lambda}} = 45 - 2.7(0.84)\sqrt{\frac{0.2}{2 - 0.2}} = 44.24.$$

The EWMA control chart signals at observation 17, indicating that the process is out of control.

The sensitivity of the EWMA control chart for a particular process will depend on the choices of $L$ and $\lambda$. Various choices of these parameters will be presented later in this chapter, when the concept of the average run length is introduced. For more details and developments regarding the EWMA, see Crowder (1987), Lucas and Saccucci (1990), and Montgomery (2001).

17-3.6 Average Run Length


In this chapter we have presented control-charting techniques for a variety of situations and made some recommendations about the design of the control charts. In this section, we will present the average run length (ARL) of a control chart. The ARL can be used to assess the performance of the control chart or to determine the appropriate values of various parameters for the control charts presented in this chapter.

The ARL is the expected number of samples taken before a control chart signals out of control. In general, the ARL is

$$\mathrm{ARL} = \frac{1}{p},$$

where $p$ is the probability of any point exceeding the control limits. If the process is in control and the control chart signals out of control, then we say that a false alarm has occurred. To illustrate, consider the $\bar{X}$ control chart with the standard $3\sigma$ limits. For this situation, $p = 0.0027$ is the probability that a single point falls outside the limits when the process is in control. The in-control ARL for the $\bar{X}$ control chart is

$$\mathrm{ARL} = \frac{1}{p} = \frac{1}{0.0027} = 370.$$

In other words, even if the process remains in control we should expect, on the average, an out-of-control signal (or false alarm) every 370 samples. In general, if the process is actually in control, then we desire a large value of the ARL. More formally, we can define the in-control ARL as

$$\mathrm{ARL}_0 = \frac{1}{\alpha},$$

where $\alpha$ is the probability that a sample point plots beyond the control limits.
If, on the other hand, the process is out of control, then a small ARL value is desirable. A small value of the ARL indicates that the control chart will signal out of control soon after the process has shifted. The out-of-control ARL is

$$\mathrm{ARL}_1 = \frac{1}{1 - \beta},$$

where $\beta$ is the probability of not detecting a shift on the first sample after a shift has occurred. To illustrate, consider the $\bar{X}$ control chart with $3\sigma$ limits. Assume the target or in-control mean is $\mu_0$ and that the process has shifted to an out-of-control mean given by $\mu_1 = \mu_0 + k\sigma$. The probability of not detecting this shift is given by

$$\beta = P[\mathrm{LCL} \le \bar{X} \le \mathrm{UCL} \mid \mu = \mu_1].$$

That is, $\beta$ is the probability that the next sample mean plots in control, when in fact the process has shifted out of control. Since $\bar{X} \sim N(\mu, \sigma^2/n)$, $\mathrm{LCL} = \mu_0 - L\sigma/\sqrt{n}$, and $\mathrm{UCL} = \mu_0 + L\sigma/\sqrt{n}$, we can rewrite $\beta$ as

$$\beta = P\big[\mu_0 - L\sigma/\sqrt{n} \le \bar{X} \le \mu_0 + L\sigma/\sqrt{n} \mid \mu = \mu_1\big]$$
$$= P\!\left[\frac{(\mu_0 - L\sigma/\sqrt{n}) - (\mu_0 + k\sigma)}{\sigma/\sqrt{n}} \le Z \le \frac{(\mu_0 + L\sigma/\sqrt{n}) - (\mu_0 + k\sigma)}{\sigma/\sqrt{n}}\right]$$
$$= P\big[-L - k\sqrt{n} \le Z \le L - k\sqrt{n}\big],$$

where $Z$ is a standard normal random variable. If we let $\Phi$ denote the standard normal cumulative distribution function, then

$$\beta = \Phi(L - k\sqrt{n}) - \Phi(-L - k\sqrt{n}).$$

From this, $1 - \beta$ is the probability that a shift in the process is detected on the first sample after the shift has occurred. That is, the process has shifted and a point exceeds the control limits, signaling that the process is out of control. Therefore, $\mathrm{ARL}_1$ is the expected number of samples observed before a shift is detected.
The ARLs have been used to evaluate and design control charts for variables and for attributes. For more discussion on the use of ARLs for these charts, see Montgomery (2001).
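These formulas are easy to evaluate numerically. The sketch below (standard-library Python, function name ours) computes $\mathrm{ARL}$ for an $\bar{X}$ chart with $L$-sigma limits after a shift of $k\sigma$, with sample size $n$:

```python
from statistics import NormalDist

def arl_after_shift(k, n, L=3.0):
    """ARL for an X-bar chart with L-sigma limits after a k*sigma mean shift
    (k = 0 gives the in-control ARL0)."""
    phi = NormalDist().cdf
    beta = phi(L - k * n ** 0.5) - phi(-L - k * n ** 0.5)
    return 1.0 / (1.0 - beta)

print(arl_after_shift(k=0.0, n=1))   # in-control: about 370
print(arl_after_shift(k=1.0, n=4))   # 1-sigma shift, subgroups of 4: about 6.3
```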

ARLs for the CUSUM and EWMA Control Charts


Earlier in this chapter, we presented the CUSUM and EWMA control charts. The ARL can be used to specify some of the parameter values needed to design these control charts.
To implement the tabular CUSUM control chart, values of the decision interval, $H$, and the reference value, $K$, must be chosen. Recall that $H$ and $K$ are multiples of the process standard deviation, specifically $H = h\sigma$ and $K = k\sigma$, where $k = 1/2$ is often used as a standard. The proper selection of these values is important. The ARL is one criterion that can be used to determine the values of $H$ and $K$. As stated previously, a large value of the ARL when the process is in control is desirable. Therefore, we can set $\mathrm{ARL}_0$ to an acceptable level and determine $h$ and $k$ accordingly. In addition, we would want the control chart to quickly detect a shift in the process mean. This would require values of $h$ and $k$ such that the values of $\mathrm{ARL}_1$ are quite small. To illustrate, Montgomery (2001) provides the ARLs for a CUSUM control chart with $h = 5$ and $k = 1/2$. These values are given in Table 17-7. The in-control average run length, $\mathrm{ARL}_0$, is 465. If a small shift, say $0.50\sigma$, is important to detect, then with $h = 5$ and $k = 1/2$ we would expect to detect this shift within 38 samples (on the average) after the shift has occurred. Hawkins (1993) presents a table of $h$ and $k$ values that will result in an in-control average run length of $\mathrm{ARL}_0 = 370$. The values are reproduced in Table 17-8.
Design of the EWMA control chart can also be based on the ARLs. Recall that the design parameters of the EWMA control chart are the multiple of the standard deviation, $L$, and the value of the weighting factor, $\lambda$. The values of these design parameters can be chosen so that the ARL performance of the control charts is satisfactory.
Several authors discuss the ARL performance of the EWMA control chart, including Crowder (1987) and Lucas and Saccucci (1990). Lucas and Saccucci (1990) provide the ARL performance for several combinations of $L$ and $\lambda$. The results are reproduced in Table 17-9. Again, it is desirable to have a large value of the in-control ARL and small values of out-of-control ARLs. To illustrate, if $L = 2.8$ and $\lambda = 0.10$ are used, we would expect $\mathrm{ARL}_0 \approx 500$, while the ARL for detecting a shift of $0.5\sigma$ is $\mathrm{ARL}_1 = 31.3$. To detect smaller shifts

Table 17-7 Tabular CUSUM Performance with h = 5 and k = 1/2

Shift in Mean
(multiple of σ):   0     0.25   0.50   0.75   1.00   1.50   2.00   2.50   3.00   4.00
ARL:               465   139    38.0   17.0   10.4   5.75   4.01   3.11   2.57   2.01

Table 17-8 Values of h and k Resulting in ARL0 = 370 (Hawkins 1993)

k:   0.25   0.50   0.75   1.0    1.25   1.5
h:   8.01   4.77   3.34   2.52   1.99   1.61

Table 17-9 ARLs for Various EWMA Control Schemes (Lucas and Saccucci 1990)

Shift in Mean      L = 3.054   L = 2.998   L = 2.962   L = 2.814   L = 2.615
(multiple of σ)    λ = 0.40    λ = 0.25    λ = 0.20    λ = 0.10    λ = 0.05
0                  500         500         500         500         500
0.25               224         170         150         106         84.1
0.50               71.2        48.2        41.8        31.3        28.8
0.75               28.4        20.1        18.2        15.9        16.4
1.00               14.3        11.1        10.5        10.3        11.4
1.50               5.9         5.5         5.5         6.1         7.1
2.00               3.5         3.6         3.7         4.4         5.2
2.50               2.5         2.7         2.9         3.4         4.2
3.00               2.0         2.3         2.4         2.9         3.5
4.00               1.4         1.7         1.9         2.2         2.7

in the process mean, it is found that small values of $\lambda$ should be used. Note that for $L = 3.0$ and $\lambda = 1.0$, the EWMA reduces to the standard Shewhart control chart with 3-sigma limits.

Cautions in the Use of ARLs

Although the ARL provides valuable information for designing and evaluating control schemes, there are drawbacks to relying on the ARL as a design criterion. It should be noted that run length follows a geometric distribution, since it represents the number of samples before a "success" occurs (a success being a point falling beyond the control limits). One drawback is that the standard deviation of the run length is quite large. Second, because the distribution of the run length follows a geometric distribution, the mean of the distribution (the ARL) may not be a reliable estimate of the true run length.

17-3.7 Other SPC Problem-Solving Tools


While the control chart is a very powerful tool for investigating the causes of variation in a process, it is most effective when used with other SPC problem-solving tools. In this section we illustrate some of these tools, using the printed circuit board defect data in Example 17-5.
Figure 17-9 shows a c chart for the number of defects in samples of five printed circuit boards. The chart exhibits statistical control, but the number of defects must be reduced, as the average number of defects per board is 8/5 = 1.6, and this level of defects would require extensive rework.
The first step in solving this problem is to construct a Pareto diagram of the individual defect types. The Pareto diagram, shown in Fig. 17-13, indicates that insufficient solder and solder balls are the most frequently occurring defects, accounting for (109/160)100 = 68% of the observed defects. Furthermore, the first five defect categories on the Pareto chart are all solder-related defects. This points to the flow solder process as a potential opportunity for improvement.
To improve the flow solder process, a team consisting of the flow solder operator, the shop supervisor, the manufacturing engineer responsible for the process, and a quality engineer meets to study potential causes of solder defects. They conduct a brainstorming session and produce the cause-and-effect diagram shown in Fig. 17-14. The cause-and-effect
[Figure 17-13 Pareto diagram for printed circuit board defects.]

diagram is widely used to clearly display the various potential causes of defects in products and their interrelationships. It is useful in summarizing knowledge about the process.
As a result of the brainstorming session, the team tentatively identifies the following variables as potentially influential in creating solder defects:
1. Flux specific gravity
2. Solder temperature
3. Conveyor speed
4. Conveyor angle
5. Solder wave height
6. Preheat temperature
7. Pallet loading method
A statistically designed experiment could be used to investigate the effect of these seven variables on solder defects. Also, the team constructed a defect concentration diagram for the product. A defect concentration diagram is just a sketch or drawing of the product, with the most frequently occurring defects shown on the part. This diagram is used to determine whether defects occur in the same location on the part. The defect concentration diagram for the printed circuit board is shown in Fig. 17-15. This diagram indicates that most of the insufficient solder defects are near the front edge of the board, where it makes initial contact with the solder wave. Further investigation showed that one of the pallets used to carry the boards across the wave was bent, causing the front edge of the board to make poor contact with the solder wave.

Figure 17-14 Cause-and-effect diagram for the printed circuit board flow solder process.

Figure 17-15 Defect concentration diagram for a printed circuit board.

When the defective pallet was replaced, a designed experiment was used to investigate the seven variables discussed earlier. The results of this experiment indicated that several of these factors were influential and could be adjusted to reduce solder defects. After the results of the experiment were implemented, the percentage of solder joints requiring rework was reduced from 1% to under 100 parts per million (0.01%).

17-4 RELIABILITY ENGINEERING


One of the challenging endeavors of the past three decades has been the design and development of large-scale systems for space exploration, new generations of commercial and military aircraft, and complex electromechanical products such as office copiers and computers. The performance of these systems, and the consequences of their failure, is of vital concern. For example, the military community has historically placed strong emphasis on equipment reliability. This emphasis stems largely from increasing ratios of maintenance cost to procurement cost and the strategic and tactical implications of system failure. In the

area of consumer product manufacture, high reliability has come to be expected as much as conformance to other important quality characteristics.
Reliability engineering encompasses several activities, one of which is reliability modeling. Essentially, the system survival probability is expressed as a function of subsystem or component reliabilities (survival probabilities). Usually, these models are time dependent, but there are some situations where this is not the case. A second important activity is that of life testing and reliability estimation.

17-4.1 Basic Reliability Definitions


Let us consider a component that has just been manufactured. It is to be operated at a stated "stress level" or within some range of stress such as temperature, shock, and so on. The random variable T will be defined as time to failure, and the reliability of the component (or subsystem or system) at time t is R(t) = P[T > t]. R is called the reliability function. The failure process is usually complex, consisting of at least three types of failures: initial failures, wear-out failures, and those that fail between these. A hypothetical composite distribution of time to failure is shown in Fig. 17-16. This is a mixed distribution, and

F(t) = P[T ≤ t] = p(0) + ∫₀ᵗ g(x)dx, t ≥ 0, (17-25)

where p(0) is the probability of a time-zero failure and g is the continuous portion of the density.

Since for many components (or systems) the initial failures or time-zero failures are removed during testing, the random variable T is conditioned on the event that T > 0, so that the failure density is

f(t) = g(t)/[1 − p(0)], t > 0, (17-26)
     = 0, otherwise.

Thus, in terms of f, the reliability function, R, is

R(t) = 1 − F(t) = ∫ₜ^∞ f(x)dx. (17-27)

The term interval failure rate denotes the rate of failure on a particular interval of time [t₁, t₂], and the terms failure rate, instantaneous failure rate, and hazard will be used synonymously as a limiting form of the interval failure rate as t₂ → t₁. The interval failure rate FR(t₁, t₂) is as follows:

FR(t₁, t₂) = {[R(t₁) − R(t₂)]/R(t₁)} · [1/(t₂ − t₁)]. (17-28)

Figure 17-16 A composite failure distribution.
17-4 Reliability EngiLeering 539

The first bracketed term is simply

P{failure during [t₁, t₂] | survival to time t₁}. (17-29)

The second term supplies the dimensional characteristic, so that we may express the conditional probability of equation 17-29 on a per-unit-time basis.
We will develop the instantaneous failure rate (as a function of t). Let h(t) be the hazard function. Then

h(t) = lim_{Δt→0} [R(t) − R(t + Δt)]/R(t) · 1/Δt
     = −lim_{Δt→0} [R(t + Δt) − R(t)]/Δt · 1/R(t),

or

h(t) = −R′(t)/R(t) = f(t)/R(t), (17-30)

since R(t) = 1 − F(t) and −R′(t) = f(t). A typical hazard function is shown in Fig. 17-17. Note that h(t) · dt might be thought of as the instantaneous probability of failure at t, given survival to t.
A useful result is that the reliability function R may be easily expressed in terms of h as

R(t) = e^{−∫₀ᵗ h(x)dx} = e^{−H(t)}, (17-31)

where

H(t) = ∫₀ᵗ h(x)dx.

Equation 17-31 results from the definition given in equation 17-30,

Figure 17-17 A typical hazard function, showing early-age failures plus random failures, random failures alone, and wear-out plus random failures.



and the integration of both sides,

∫₀ᵗ h(x)dx = −∫₀ᵗ [R′(x)/R(x)]dx = −ln R(x)|₀ᵗ,

so that

∫₀ᵗ h(x)dx = −ln R(t) + ln R(0).

Since F(0) = 0, we see that ln R(0) = 0 and R(t) = e^{−∫₀ᵗ h(x)dx}.
The mean time to failure (MTTF) is

MTTF = E[T] = ∫₀^∞ t · f(t)dt.

A useful alternate form is

MTTF = ∫₀^∞ R(t)dt. (17-32)

Most complex system modeling assumes that only random component failures need be considered. This is equivalent to stating that the time-to-failure distribution is exponential, that is,

f(t) = λe^{−λt}, t ≥ 0,
     = 0, otherwise,

so that

h(t) = f(t)/R(t) = λe^{−λt}/e^{−λt} = λ

is a constant. When all early-age failures have been removed by burn-in, and the time to occurrence of wear-out failures is very great (as with electronic parts), then this assumption is reasonable.
The normal distribution is most generally used to model wear-out failure or stress failure (where the random variable under study is stress level). In situations where most failures are due to wear, the normal distribution may very well be appropriate.
The lognormal distribution has been found to be applicable in describing time to failure for some types of components, and the literature seems to indicate an increased utilization of this density for this purpose.
The Weibull distribution has been extensively used to represent time to failure, and its nature is such that it may be made to approximate closely the observed phenomena. When a system is composed of a number of components and failure is due to the most serious of a large number of defects or possible defects, the Weibull distribution seems to do particularly well as a model.
The gamma distribution frequently results from modeling standby redundancy where components have an exponential time-to-failure distribution. We will investigate standby redundancy in Section 17-4.5.

17-4.2 The Exponential Time-to-Failure Model


In this section we assume that the time-to-failure distribution is exponential; that is, only "random failures" are considered. The density, reliability function, and hazard function are given in equations 17-33 through 17-35 and are shown in Fig. 17-18:

f(t) = λe^{−λt}, t ≥ 0,
     = 0, otherwise, (17-33)

R(t) = e^{−λt}, t ≥ 0,
     = 0, otherwise, (17-34)

h(t) = f(t)/R(t) = λ, t ≥ 0,
     = 0, otherwise. (17-35)

The constant hazard function is interpreted to mean that the failure process has no memory; that is,

P{t ≤ T ≤ t + Δt | T ≥ t} = [e^{−λt} − e^{−λ(t+Δt)}]/e^{−λt} = 1 − e^{−λΔt}, (17-36)

Figure 17-18 Density function (a), reliability function (b), and hazard function (c) for the exponential failure model.

a quantity that is independent of t. Thus, if a component is functioning at time t, it is as good as new. The remaining life has the same density as f.

Example 17-9
A diode used on a printed circuit board has a rated failure rate of 2.3 × 10⁻⁶ failures per hour. However, under an increased temperature stress, it is felt that the rate is about 1.5 × 10⁻⁵ failures per hour. The time to failure is exponentially distributed, so that we have

f(t) = (1.5 × 10⁻⁵)e^{−(1.5×10⁻⁵)t}, t ≥ 0,
     = 0, otherwise,
R(t) = e^{−(1.5×10⁻⁵)t}, t ≥ 0,
     = 0, otherwise,

and

h(t) = 1.5 × 10⁻⁵, t ≥ 0,
     = 0, otherwise.

To determine the reliability at t = 10⁴ and t = 10⁵, we evaluate R(10⁴) = e^{−0.15} ≈ 0.861 and R(10⁵) = e^{−1.5} ≈ 0.223.

17-4.3 Simple Serial Systems

A simple serial system is shown in Fig. 17-19. In order for the system to function, all components must function, and it is assumed that the components function independently. We let T_j be the time to failure for component C_j for j = 1, 2, ..., n, and let T be the system time to failure. The reliability model is thus

R(t) = P[T > t] = P(T₁ > t) · P(T₂ > t) ··· P(Tₙ > t),

or

R(t) = R₁(t) · R₂(t) ··· Rₙ(t), (17-37)

where R_j(t) = P(T_j > t).

Example 17-10
Three components must all function for a simple system to function. The random variables T₁, T₂, and T₃, representing time to failure for the components, are independent with the following distributions:

T₁ ~ N(2 × 10³, 4 × 10⁴),
T₂ ~ Weibull(γ = 0, δ = 1, β = 1/7),
T₃ ~ lognormal(μ = 10, σ² = 4).

Figure 17-19 A simple serial system.



It follows that

R₁(t) = 1 − Φ((t − 2000)/200),
R₂(t) = e^{−t^{1/7}},
R₃(t) = 1 − Φ((ln t − 10)/2),

so that

R(t) = R₁(t) · R₂(t) · R₃(t).

For example, if t = 2187 hours, then

R(2187) = [1 − Φ(0.935)][e^{−3}][1 − Φ(−1.154)]
        = [0.175][0.0498][0.876]
        = 0.0076.

For the simple serial system, system reliability may be calculated using the product of the component reliability functions as demonstrated; however, when all components have an exponential distribution, the calculations are greatly simplified, since

R(t) = e^{−λ₁t} · e^{−λ₂t} ··· e^{−λₙt} = e^{−(λ₁+λ₂+···+λₙ)t},

or

R(t) = e^{−λₛt}, (17-38)

where λₛ = Σ_{j=1}^n λ_j represents the system failure rate. We also note that the system reliability function is of the same form as the component reliability functions. The system failure rate is simply the sum of the component failure rates, and this makes application very easy.

Example 17-11
Consider an electronic circuit with three integrated circuit devices, 12 silicon diodes, 8 ceramic capacitors, and 15 composition resistors. Suppose, under given stress levels of temperature, shock, and so on, that each component has the failure rate shown in the following table, and that the component failures are independent.

                      Failures per Hour
Integrated circuits   0.013 × 10⁻⁷
Diodes                1.7 × 10⁻⁷
Capacitors            1.2 × 10⁻⁷
Resistors             0.61 × 10⁻⁷

Therefore,

λₛ = 3(0.013 × 10⁻⁷) + 12(1.7 × 10⁻⁷) + 8(1.2 × 10⁻⁷) + 15(0.61 × 10⁻⁷)
   = 3.9189 × 10⁻⁶.

The circuit mean time to failure is

MTTF = E[T] = 1/λₛ = (1/3.9189) × 10⁶ ≈ 2.55 × 10⁵ hours.

If we wish to determine, say, R(10⁴), we get R(10⁴) = e^{−0.039189} ≈ 0.96.

17-4.4 Simple Active Redundancy


An active redundant configuration is shown in Fig. 17-20. The assembly functions if k or more of the components function (k ≤ n). All components begin operation at time zero; thus the term "active" is used to describe the redundancy. Again, independence is assumed.
A general formulation is not convenient to work with, and in most cases it is unnecessary. When all components have the same reliability function, as is the case when the components are of the same type, we let R_j(t) = r(t) for j = 1, 2, ..., n, so that

R(t) = Σ_{j=k}^n C(n, j)[r(t)]^j[1 − r(t)]^{n−j}, (17-39)

where C(n, j) denotes the binomial coefficient. Equation 17-39 is derived from the definition of reliability.

Example 17-12
Three identical components are arranged in an active redundancy, operating independently. In order for the assembly to function, at least two of the components must function (k = 2). The reliability function for the system is thus

R(t) = Σ_{j=2}^3 C(3, j)[r(t)]^j[1 − r(t)]^{3−j}
     = 3[r(t)]²[1 − r(t)] + [r(t)]³
     = [r(t)]²[3 − 2r(t)].

It is noted that R is a function of time, t.

Figure 17-20 An active redundant configuration.



When only one of the n components is required, as is often the case, and the components are not identical, we obtain

R(t) = 1 − Π_{j=1}^n [1 − R_j(t)]. (17-40)

The product is the probability that all components fail and, obviously, if they do not all fail, the system survives. When the components are identical and only one is required, equation 17-40 reduces to

R(t) = 1 − [1 − r(t)]ⁿ, (17-41)

where r(t) = R_j(t), j = 1, 2, ..., n.
When the components have exponential failure laws, we will consider two cases. First, when the components are identical with failure rate λ and at least k components are required for the assembly to operate, equation 17-39 becomes

R(t) = Σ_{j=k}^n C(n, j)e^{−jλt}[1 − e^{−λt}]^{n−j}. (17-42)

The second case is considered for the situation where the components have identical exponential failure densities and where only one component must function for the assembly to function. Using equation 17-41, we get

R(t) = 1 − [1 − e^{−λt}]ⁿ. (17-43)

Example 17-13
In Example 17-12, where three identical components were arranged in an active redundancy and at least two were required for system operation, we found

R(t) = [r(t)]²[3 − 2r(t)].

If the component reliability functions are

r(t) = e^{−λt},

then

R(t) = e^{−2λt}[3 − 2e^{−λt}] = 3e^{−2λt} − 2e^{−3λt}.

If two components are arranged in an active redundancy as described, and only one must function for the assembly to function, and, furthermore, if the time-to-failure densities are exponential with failure rate λ, then from equation 17-43 we obtain

R(t) = 1 − [1 − e^{−λt}]² = 2e^{−λt} − e^{−2λt}.

17-4.5 Standby Redundancy


A common form of redundancy, called standby redundancy, is shown in Fig. 17-21. The unit labeled DS is a decision switch that we will assume has reliability 1 for all t. The operating rules are as follows. Component 1 is initially "online," and when this component fails, the decision switch switches in component 2, which remains online until it fails.

Figure 17-21 Standby redundancy.

Standby units are not subject to failure until activated. The time to failure for the assembly is

T = T₁ + T₂ + ··· + Tₙ,

where T_i is the time to failure for the ith component and T₁, T₂, ..., Tₙ are independent random variables. The most common value for n in practice is two, so the Central Limit Theorem is of little value. However, we know from the properties of linear combinations that

E[T] = Σ_{i=1}^n E[T_i]

and

V[T] = Σ_{i=1}^n V[T_i].

We must know the distributions of the random variables T_i in order to find the distribution of T. The most common case occurs when the components are identical and the time-to-failure distributions are assumed to be exponential. In this case, T has a gamma distribution

f(t) = λ(λt)^{n−1}e^{−λt}/(n − 1)!, t > 0,
     = 0, otherwise,

so that the reliability function is

R(t) = Σ_{k=0}^{n−1} e^{−λt}(λt)^k/k!, t > 0. (17-44)

The parameter λ is the component failure rate; that is, E[T_i] = 1/λ. The mean time to failure and variance are

MTTF = E[T] = n/λ (17-45)

and

V[T] = n/λ², (17-46)

respectively.

Example 17-14
Two identical components are assembled in a standby redundant configuration with perfect switching. The component lives are identically distributed, independent random variables having an exponential distribution with failure rate 10⁻². The mean time to failure is

MTTF = E[T] = 2/10⁻² = 200 hours,

and the variance is

V[T] = 2/(10⁻²)² = 20,000.

The reliability function R is

R(t) = Σ_{k=0}^1 e^{−t/100}(t/100)^k/k!,

or

R(t) = e^{−t/100}[1 + t/100].

17-4.6 Life Testing


Life tests are conducted for different purposes. Sometimes, n units are placed on test and aged until all or most units have failed; the purpose is to test a hypothesis about the form of the time-to-failure density with certain parameters. Both formal statistical tests and probability plotting are widely used in life testing.
A second objective in life testing is to estimate reliability. Suppose, for example, that a manufacturer is interested in estimating R(1000) for a particular component or system. One approach to this problem would be to place n units on test and count the number of failures, r, occurring before 1000 hours of operation. Failed units are not to be replaced in this example. An estimate of unreliability is p̂ = r/n, and an estimate of reliability is

R̂(1000) = 1 − r/n. (17-47)

A 100(1 − α)% lower-confidence limit on R(1000) is given by [1 − upper limit on p], where p is the unreliability. This upper limit on p may be determined using a table of the binomial distribution. In the case where n is large, an estimate of the upper limit on p is

p̂ + z_α √(p̂(1 − p̂)/n). (17-48)

.:E"".tipi~17,~~
One hundred units are placed on life test, and the test is run for 1000 Mum. There are two failures dur-
ing test. soft:;;;;; 0.02, ar.d ~~(1000) =0.98. Using a table oft.'e binomial Gistribution, a 95% uppc('con-
fidence limit on pis 0.06, so that a lower limit on R(lOOO) is given by 0.94.

In recent years, there has been much work on the analysis of failure-time data, including plotting methods for identification of appropriate failure-time models and parameter estimation. For a good summary of this work, refer to Elsayed (1996).

17-4.7 Reliability Estimation with a Known Time-to-Failure Distribution


In the case where the form of the reliability function is assumed known and there is only one parameter, the maximum likelihood estimator for R(t) is R̂(t), which is formed by substituting θ̂ for the parameter θ in the expression for R(t), where θ̂ is the maximum likelihood estimator of θ. For more details and results for specific time-to-failure distributions, refer to Elsayed (1996).

17-4.8 Estimation with the Exponential Time-to-Failure Distribution


The most common case for the one-parameter situation is where the time-to-failure distribution is exponential, R(t) = e^{−t/θ}. The parameter θ = E[T] is called the mean time to failure, and the estimator for R is R̂(t), where

R̂(t) = e^{−t/θ̂}

and θ̂ is the maximum likelihood estimator of θ.
Epstein (1960) developed the maximum likelihood estimators for θ under a number of different conditions and, furthermore, showed that a 100(1 − α)% confidence interval on R(t) is given by

[e^{−t/θ_L}, e^{−t/θ_U}] (17-49)

for the two-sided case, or

[e^{−t/θ_L}, 1] (17-50)

for the lower, one-sided interval. In these cases, the values θ_L and θ_U are the lower- and upper-confidence limits on θ.
The following symbols will be used:

n = number of units placed on test at t;::::; O.


Q = total test time in unit hours.
t~ =. time at wruch the test is terminated.
r = number of failures accumulated at time t.
r· : : : preassigned number of failures.
1 - a ;: confidence level.
X! ~ := the upper a percentage pomt of the chiwsquare distribution with k degrees of
freedom.

There are four situations to consider, according to whether the test is stopped after a preassigned time or after a preassigned number of failures, and whether failed items are replaced or not replaced during the test.
For the replacement test, the total test time in unit hours is Q = nt*, and for the nonreplacement test

Q = Σ_{i=1}^r t_i + (n − r)t*, (17-51)

where t_i is the time of the ith failure. If items are censored (withdrawn items that have not failed), and if failures are replaced while censored items are not replaced, then

Q = Σ_{j=1}^c t_j + (n − c)t*, (17-52)
)"'1

where c represents the number of censored items and t_j is the time of the jth censorship. If neither censored items nor failed items are replaced, then

Q = Σ_{i=1}^r t_i + Σ_{j=1}^c t_j + (n − r − c)t*. (17-53)

The development of the maximum likelihood estimator for θ is rather straightforward. In the case where the test is nonreplacement, and the test is discontinued after a fixed number of items have failed, the likelihood function is

L = Π_{i=1}^r f(t_i) · [R(t*)]^{n−r} = Π_{i=1}^r (1/θ)e^{−t_i/θ} · [e^{−t*/θ}]^{n−r}. (17-54)

Then

l = ln L = −r ln θ − (1/θ)Σ_{i=1}^r t_i − (n − r)t*/θ,

and solving ∂l/∂θ = 0 yields the estimator

θ̂ = [Σ_{i=1}^r t_i + (n − r)t*]/r = Q/r. (17-55)

It turns out that

θ̂ = Q/r (17-56)

is the maximum likelihood estimator of θ for all cases considered for the test design and operation.
The quantity 2rθ̂/θ has a chi-square distribution with 2r degrees of freedom in the case where the test is terminated after a fixed number of failures. For a fixed termination time t*, the degrees of freedom becomes 2r + 2.
Since 2rθ̂/θ = 2Q/θ, confidence limits on θ may be expressed as indicated in Table 17-10. The results presented in the table may be used directly with equations 17-49 and 17-50 to establish confidence limits on R(t). It should be noted that this testing procedure does not require that the test be run for the time at which a reliability estimate is required. For example, 100 units may be placed on a nonreplacement test for 200 hours, the parameter θ estimated, and R(1000) calculated. In the case of the binomial testing mentioned earlier, it would have been necessary to run the test for 1000 hours.

Table 17-10 Confidence Limits on θ

Nature of Limit          Fixed Number of Failures, r*              Fixed Termination Time, t*
Two-sided limits         [2Q/χ²_{α/2,2r}, 2Q/χ²_{1−α/2,2r}]       [2Q/χ²_{α/2,2r+2}, 2Q/χ²_{1−α/2,2r+2}]
Lower, one-sided limit   2Q/χ²_{α,2r}                              2Q/χ²_{α,2r+2}


The results are, however, dependent on the assumption that the distribution is exponential.
It is sometimes necessary to estimate the time t_R for which the reliability will be R. For the exponential model, this estimate is

t̂_R = θ̂ ln(1/R), (17-57)

and confidence limits on t_R are given in Table 17-11.

Example 17-16
Twenty items are placed on a replacement test that is to be operated until 10 failures occur. The tenth failure occurs at 80 hours, and the reliability engineer wishes to estimate the mean time to failure, 95% two-sided limits on θ, R̂(100), and 95% two-sided limits on R(100). Finally, she wishes to estimate the time for which the reliability will be 0.8, with point and 95% two-sided confidence interval estimates.
According to equation 17-56 and the results presented in Tables 17-10 and 17-11,

θ̂ = nt*/r = 20(80)/10 = 160 hours,
Q = nt* = 1600 unit hours,
[2Q/χ²_{0.025,20}, 2Q/χ²_{0.975,20}] = [3200/34.17, 3200/9.591] = [93.65, 333.65],
R̂(100) = e^{−100/θ̂} = e^{−100/160} = 0.535.

According to equation 17-49, the confidence interval on R(100) is

[e^{−100/93.65}, e^{−100/333.65}] = [0.344, 0.741].

Also,

t̂₀.₈ = θ̂ ln(1/0.8) = 160(0.223) = 35.7 hours.

The two-sided 95% confidence limits are determined from Table 17-11 as

[2(1600)(0.223)/34.17, 2(1600)(0.223)/9.591] = [20.9, 74.45] hours.

Table 17-11 Confidence Limits on t_R

Nature of Limit          Fixed Number of Failures, r*                                  Fixed Termination Time, t*
Two-sided limits         [2Q ln(1/R)/χ²_{α/2,2r}, 2Q ln(1/R)/χ²_{1−α/2,2r}]           [2Q ln(1/R)/χ²_{α/2,2r+2}, 2Q ln(1/R)/χ²_{1−α/2,2r+2}]
Lower, one-sided limit   2Q ln(1/R)/χ²_{α,2r}                                          2Q ln(1/R)/χ²_{α,2r+2}

17-4.9 Demonstration and Acceptance Testing

It is not uncommon for a purchaser to test incoming products to assure that the vendor is conforming to reliability specifications. These tests are destructive tests and, in the case of attribute measurement, the test design follows that of the acceptance sampling discussed earlier in this chapter.
A special set of sampling plans that assumes an exponential time-to-failure distribution has been presented in a Department of Defense handbook (DOD H-108), and these plans are in wide use.

17-5 SUMMARY

This chapter has presented several widely used methods for statistical quality control. Control charts were introduced and their use as process surveillance devices discussed. The X̄ and R control charts are used for measurement data. When the quality characteristic is an attribute, either the p chart for fraction defective or the c or u chart for defects may be used.
The use of probability as a modeling technique in reliability analysis was also discussed. The exponential distribution is widely used as the distribution of time to failure, although other plausible models include the normal, lognormal, Weibull, and gamma distributions. System reliability analysis methods were presented for serial systems, as well as for systems having active or standby redundancy. Life testing and reliability estimation were also briefly introduced.

17-6 EXERCISES
17-1. An extrusion die is used to produce aluminum rods. The diameter of the rods is a critical quality characteristic. Below are shown X̄ and R values for 20 samples of five rods each. Specifications on the rods are 0.5035 ± 0.0010 inch. The values given are the last three digits of the measurements; that is, 34.2 is read as 0.50342.

Sample   X̄      R     Sample   X̄      R
1        34.2    3     11       35.4    8
2        31.6    4     12       34.0    6
3        31.8    4     13       36.0    4
4        33.4    5     14       37.2    7
5        35.0    4     15       35.2    3
6        32.1    2     16       33.4    10
7        32.6    7     17       35.0    4
8        33.8    9     18       34.4    7
9        34.8    10    19       33.9    8
10       38.6    4     20       34.0    4

(a) Set up the X̄ and R charts, revising the trial control limits if necessary, assuming assignable causes can be found.
(b) Calculate PCR and PCR_k. Interpret these ratios.
(c) What percentage of defectives is being produced by this process?

17-2. Suppose a process is in control, and 3-sigma control limits are in use on the X̄ chart. Let the mean shift by 1.5σ. What is the probability that this shift will remain undetected for three consecutive samples? What would this probability be if 2-sigma control limits are used? The sample size is 4.

17-3. Suppose that an X̄ chart is used to control a normally distributed process, and that samples of size n are taken every h hours and plotted on the chart, which has k sigma limits.
(a) Find the expected number of samples that will be taken until a false action signal is generated. This is called the in-control average run length (ARL).
(b) Suppose that the process shifts to an out-of-control state. Find the expected number of samples that will be taken until an action signal is generated. This is the out-of-control ARL.
(c) Evaluate the in-control ARL for k = 3. How does this change if k = 2? What do you think about the use of 2-sigma limits in practice?
(d) Evaluate the out-of-control ARL for a shift of one sigma, given that n = 5.

17-4. Twenty-five samples of size 5 are drawn from a process at regular intervals, and the following data are obtained:

Σ_{i=1}^{25} X̄_i = 362.75.

(a) Compute the control limits for the X̄ and R charts.
(b) Assuming the process is in control and specification limits are 14.50 ± 0.50, what conclusions can you draw about the ability of the process to operate within these limits? Estimate the percentage of defective items that will be produced.
(c) Calculate PCR and PCR_k. Interpret these ratios.

17-5. Suppose an X̄ chart for a process is in control with 3-sigma limits. Samples of size 5 are drawn every 15 minutes, on the quarter hour. Now suppose the process mean shifts out of control by 1.5σ 10 minutes after the hour. If D is the expected number of defectives produced per quarter hour in this out-of-control state, find the expected loss (in terms of defective units) that results from this control procedure.

17-6. The overall length of a cigar lighter body used in an automobile application is controlled using X̄ and R charts. The following table gives lengths for 20 samples of size 4 (measurements are coded from 5.00 mm; that is, 15 is 5.15 mm).

         Observation
Sample   1    2    3    4
1        15   10   8    9
2        14   14   10   6
3        9    10   9    11
4        8    6    9    13
5        14   8    9    12
6        9    10   7    13
7        15   10   12   12
8        14   16   11   10
9        11   7    16   10
10       11   14   11   12
11       13   9    5    8
12       10   15   8    10
13       8    12   14   9
14       15   12   14   6
15       13   16   9    5
16       14   8    8    12
17       8    10   16   9
18       8    14   10   9
19       13   15   10   8
20       9    7    15   8

(a) Set up the X̄ and R charts. Is the process in statistical control?
(b) Specifications are 5.10 ± 0.05 mm. What can you say about process capability?

17-7. Montgomery (2001) presents 30 observations of oxide thickness of individual silicon wafers. The data are

Wafer   Oxide Thickness   Wafer   Oxide Thickness
1       45.4              16      58.4
2       48.6              17      51.0
3       49.5              18      41.2
4       44.0              19      47.1
5       50.9              20      45.7
6       55.2              21      60.6
7       45.5              22      51.0
8       52.8              23      53.0
9       45.3              24      56.0
10      46.3              25      47.2
11      53.9              26      48.0
12      49.8              27      55.9
13      46.9              28      50.0
14      49.8              29      47.9
15      45.1              30      53.4

(a) Construct a normal probability plot of the data. Does the normality assumption seem reasonable?
(b) Set up an individuals control chart for oxide thickness. Interpret the chart.

17-8. A machine is used to fill bottles with a particular brand of vegetable oil. A single bottle is randomly selected every half hour and the weight of the bottle recorded. Experience with the process indicates that the variability is quite stable, with σ = 0.07 oz. The process target is 32 oz. Twenty-four samples have been recorded in a 12-hour time period, with the results given below.

Sample Number   x̄       Sample Number   x̄
1               32.03    13              31.97
2               31.98    14              32.01
3               32.02    15              31.93
4               31.85    16              32.09
5               31.91    17              31.96
6               32.09    18              31.88
7               31.98    19              31.82
8               32.03    20              31.92
9               31.98    21              31.81
10              31.91    22              31.95
11              32.01    23              31.97
12              32.12    24              31.94
(a) Construct a normal probability plot of the data. Does the normality assumption appear to be satisfied?
(b) Set up an individuals control chart for the weights. Interpret the results.

17-9. The following are the numbers of defective solder joints found during successive samples of 500 solder joints:

Day   No. of Defectives   Day   No. of Defectives
1     106                 11    42
2     116                 12    37
3     164                 13    25
4     89                  14    88
5     99                  15    101
6     40                  16    64
7     112                 17    51
8     36                  18    74
9     69                  19    71
10    74                  20    43
                          21    80

Construct a fraction-defective control chart. Is the process in control?

17-10. A process is controlled by a p chart using samples of size 100. The centerline on the chart is 0.05. What is the probability that the control chart detects a shift to 0.08 on the first sample following the shift? What is the probability that the shift is detected by at least the third sample following the shift?

17-11. Suppose a p chart with centerline at p̄ with k sigma units is used to control a process. There is a critical fraction defective p_c that must be detected with probability 0.50 on the first sample following the shift to this state. Derive a general formula for the sample size that should be used on this chart.

17-12. A normally distributed process uses 66.7% of the specification band. It is centered at the nominal dimension, located halfway between the upper and lower specification limits.
(a) What is the process capability ratio PCR?
(b) What fallout level (fraction defective) is produced?
(c) Suppose the mean shifts to a distance exactly 3 standard deviations below the upper specification limit. What is the value of PCR_k? How has PCR changed?
(d) What is the actual fallout experienced after the shift in the mean?

17-13. Consider a process where specifications on a quality characteristic are 180 ± 15. We know that the standard deviation of this quality characteristic is 5. Where should we center the process to minimize the fraction defective produced? Now suppose the mean shifts to 185 and we are using a sample size of 4 on an X̄ chart. What is the probability that such a shift will be detected on the first sample following the shift? What sample size would be needed on a p chart to obtain a similar degree of protection?

17-14. Suppose the following fraction defectives had been found in successive samples of size 100 (read down):

0.09   0.03   0.12
0.10   0.05   0.14
0.13   0.13   0.06
0.08   0.10   0.05
0.14   0.14   0.14
0.09   0.07   0.11
0.10   0.06   0.09
0.15   0.09   0.13
0.13   0.08   0.12
0.06   0.11   0.09

Is the process in control with respect to its fraction defective?

17-15. The following represent the number of solder defects observed on 24 samples of five printed circuit boards: 7, 6, 8, 10, 24, 6, 5, 4, 8, 11, 15, 8, 4, 16, 11, 12, 8, 6, 5, 9, 7, 14, 8, 21. Can we conclude that the process is in control using a c chart? If not, assume assignable causes can be found and revise the control limits.

17-16. The following represent the number of defects per 1000 feet in rubber-covered wire: 1, 1, 3, 7, 8, 10, 5, 13, 0, 19, 24, 6, 9, 11, 15, 8, 3, 6, 7, 4, 9, 20, 11, 7, 18, 10, 6, 4, 0, 9, 7, 3, 1, 8, 12. Do the data come from a controlled process?

17-17. Suppose the number of defects in a unit is known to be 8. If the number of defects in a unit shifts to 16, what is the probability that it will be detected by the c chart on the first sample following the shift?

17-18. Suppose we are inspecting disk drives for defects per unit, and it is known that there is an average of two defects per unit. We decide to make our inspection unit for the c chart five disk drives, and we control the total number of defects per inspection unit. Describe the new control chart.

17-19. Consider the data in Exercise 17-15. Set up a u chart for this process. Compare it to the c chart in Exercise 17-15.

17-20. Consider the oxide thickness data given in Exercise 17-7. Set up an EWMA control chart with λ = 0.20 and L = 2.962. Interpret the chart.

17-21. Consider the oxide thickness data given in Exercise 17-7. Construct a CUSUM control chart with k = 0.75 and h = 3.34 if the target thickness is 50. Interpret the chart.

17-22. Consider the weights provided in Exercise 17-8. Set up an EWMA control chart with λ = 0.10 and L = 2.7. Interpret the chart.

17-23. Consider the weights provided in Exercise 17-8. Set up a CUSUM control chart with k = 0.50 and h = 4.0. Interpret the chart.

17-24. A time-to-failure distribution is given by a uniform distribution:

f(t) = 1/(β − α), α ≤ t ≤ β,
     = 0, otherwise.

(a) Determine the reliability function.
(b) Show that ∫₀^∞ R(t)dt = ∫₀^∞ t f(t)dt.
(c) Determine the hazard function.
(d) Show that R(t) = e^{−H(t)}, where H is defined as in equation 17-31.

17-25. Three units that operate and fail independently form a series configuration, as shown in the figure at the bottom of this page. The time-to-failure distribution for each unit is exponential, with the failure rates indicated.
(a) Find R(60) for the system.
(b) What is the mean time to failure (MTTF) for this system?

17-26. Five identical units are arranged in an active redundancy to form a subsystem. Unit failure is independent, and at least two of the units must survive 1000 hours for the subsystem to perform its mission.
(a) If the units have exponential time-to-failure distributions with failure rate 0.002, what is the subsystem reliability?
(b) What is the reliability if only one unit is required?

17-27. If the units described in the previous exercise are operated in a standby redundancy with a perfect decision switch and only one unit is required for subsystem survival, determine the subsystem reliability.

17-28. One hundred units are placed on test and aged until all units have failed. The following results are obtained, and a mean life of t̄ = 160 hours is calculated from the sample data.

Time Interval      Number of Failures
0-100              50
100-200            18
200-300            17
300-400            8
400-500            4
After 500 hours    3

Use the chi-square goodness-of-fit test to determine whether you consider the exponential distribution to represent a reasonable time-to-failure model for these data.

17-29. Fifty units are placed on a life test for 1000 hours. Eight units fail during the period. Estimate R(1000) for these units. Determine a lower 95% confidence interval on R(1000).

17-30. In Section 17-4.7 it was noted that for one-parameter reliability functions R(t; θ), R̂(t) = R(t; θ̂), where θ̂ and R̂ are the maximum likelihood estimators. Prove this statement for the case

R(t; θ) = e^{−θt}, t ≥ 0,
        = 0, otherwise.

Hint: Express the density function f in terms of R.

17-31. For a nonreplacement test that is terminated after 200 hours of operation, it is noted that failures occur at the following times: 9, 21, 40, 55, and 85 hours. The units are assumed to have an exponential time-to-failure distribution, and 100 units were on test initially.
(a) Estimate the mean time to failure.
(b) Construct a 95% lower-confidence limit on the mean time to failure.

17-32. Using the data in Exercise 17-31:
(a) Estimate R(300) and construct a 95% lower-confidence limit on R(300).
(b) Estimate the time for which the reliability will be 0.9, and construct a 95% lower limit on t₀.₉.

λ₁ = 3 × 10⁻²   λ₂ = 6 × 10⁻³   λ₃ = 4 × 10⁻²

Figure for Exercise 17-25.
Chapter 18

Stochastic Processes and Queueing


18-1 INTRODUCTION
The term stochastic process is frequently used in connection with observations from a time-oriented physical process that is controlled by a random mechanism. More precisely, a stochastic process is a sequence of random variables {X_t}, where t ∈ T is a time or sequence index. The range space for X_t may be discrete or continuous; however, in this chapter we will consider only the case where at a particular time t the process is in exactly one of m + 1 mutually exclusive and exhaustive states. The states are labeled 0, 1, 2, 3, ..., m.
The variables X₁, X₂, ... might represent the number of customers awaiting service at a ticket booth at times 1 minute, 2 minutes, and so on, after the booth opens. Another example would be daily demands for a certain product on successive days. X₀ represents the initial state of the process.
The chapter will introduce a special type of stochastic process called a Markov process. We will also discuss the Chapman-Kolmogorov equations, various special properties of Markov chains, the birth-death equations, and some applications to waiting-line, or queueing, and interference problems.
In the study of stochastic processes, certain assumptions are required about the joint probability distribution of the random variables X₁, X₂, .... In the case of Bernoulli trials, presented in Chapter 5, recall that these variables were defined to be independent and that the range space (state space) consisted of two values (0, 1). Here we will first consider discrete-time Markov chains, the case where time is discrete and the independence assumption is relaxed to allow for a one-stage dependence.

18-2 DISCRETE-TIME MARKOV CHAINS

A stochastic process exhibits the Markovian property if

P{X_{t+1} = j | X_t = i} = P{X_{t+1} = j | X_t = i, X_{t−1} = i₁, X_{t−2} = i₂, ..., X₀ = i_t} (18-1)

for t = 0, 1, 2, ..., and every sequence j, i, i₁, ..., i_t. This is equivalent to stating that the probability of an event at time t + 1 given only the outcome at time t is equal to the probability of the event at time t + 1 given the entire state history of the system. In other words, the probability of the event at t + 1 is not dependent on the state history prior to time t.
The conditional probabilities

P{X_{t+1} = j | X_t = i} = p_ij (18-2)

are called one-step transition probabilities, and they are said to be stationary if

P{X_{t+1} = j | X_t = i} = P{X₁ = j | X₀ = i} for t = 0, 1, 2, ..., (18-3)


so that the transition probabilities remain unchanged through time. These values may be displayed in a matrix P = [p_ij], called the one-step transition matrix. The matrix P has m + 1 rows and m + 1 columns, and

0 ≤ p_ij ≤ 1, i, j = 0, 1, 2, ..., m,

while

Σ_{j=0}^m p_ij = 1, i = 0, 1, 2, ..., m.

That is, each element of the P matrix is a probability, and each row of the matrix sums to one.
The existence of the one-step, stationary transition probabilities implies that

p_ij^(n) = P{X_{t+n} = j | X_t = i} = P{X_n = j | X₀ = i} (18-4)

for all t = 0, 1, 2, .... The values p_ij^(n) are called n-step transition probabilities, and they may be displayed in an n-step transition matrix

P^(n) = [p_ij^(n)],

where

0 ≤ p_ij^(n) ≤ 1, n = 0, 1, 2, ..., i = 0, 1, 2, ..., m, j = 0, 1, 2, ..., m,

and

Σ_{j=0}^m p_ij^(n) = 1, n = 0, 1, 2, ..., i = 0, 1, 2, ..., m.

The 0-step transition matrix is the identity matrix.


A finite-state Markov chain is defined as a stochastic process having a finite number of states, the Markovian property, stationary transition probabilities, and an initial set of probabilities A^(0) = [a₀^(0), a₁^(0), ..., a_m^(0)], where a_i^(0) = P{X₀ = i}.
The Chapman-Kolmogorov equations are useful in computing n-step transition probabilities. These equations are

p_ij^(n) = Σ_{l=0}^m p_il^(v) · p_lj^(n−v),  i = 0, 1, 2, ..., m,  j = 0, 1, 2, ..., m,  0 ≤ v ≤ n, (18-5)

and they indicate that in passing from state i to state j in n steps the process will be in some state, say l, after exactly v steps (v ≤ n). Therefore, p_il^(v) · p_lj^(n−v) is the conditional probability that, given state i as the starting state, the process goes to state l in v steps and from l to j in (n − v) steps. When summed over l, the sum of the products yields p_ij^(n).
By setting v = 1 or v = n − 1, we obtain

p_ij^(n) = Σ_{l=0}^m p_il · p_lj^(n−1) = Σ_{l=0}^m p_il^(n−1) · p_lj,  i, j = 0, 1, 2, ..., m,  n = 1, 2, ....

It follows that the n-step transition probabilities, p_ij^(n), may be obtained from the one-step probabilities, and

P^(n) = Pⁿ. (18-6)

The unconditional probability of being in state j at time t = n is given by the vector

A^(n) = [a₀^(n), a₁^(n), ..., a_m^(n)],

where

a_j^(n) = Σ_{i=0}^m a_i^(0) · p_ij^(n),  j = 0, 1, 2, ..., m,  n = 1, 2, .... (18-7)

Thus, A^(n) = A · P^(n). Further, we note that the rule for matrix multiplication solves the total probability law of Theorem 1-8, so that A^(n) = A^(n−1) · P.

Example 18-1
In a computing system, the probability of an error on each cycle depends on whether or not it was preceded by an error. We will define 0 as the error state and 1 as the nonerror state. Suppose the probability of an error if preceded by an error is 0.75, the probability of an error if preceded by a nonerror is 0.50, the probability of a nonerror if preceded by an error is 0.25, and the probability of a nonerror if preceded by a nonerror is 0.50. Thus,

P = [0.75  0.25]
    [0.50  0.50].

The two-step, three-step, ..., seven-step transition matrices are shown below:

P² = [0.688  0.312]    P³ = [0.672  0.328]
     [0.625  0.375]         [0.656  0.344]

P⁴ = [0.668  0.332]    P⁵ = [0.667  0.333]
     [0.664  0.336]         [0.666  0.334]

P⁶ = [0.667  0.333]    P⁷ = [0.667  0.333]
     [0.667  0.333]         [0.667  0.333]

If we know that initially the system is in the nonerror state, then a₁^(0) = 1, a₀^(0) = 0, and A^(n) = [a_j^(n)] = A · P^(n). Thus, for example, A^(7) = [0.667, 0.333].

18-3 CLASSIFICATION OF STATES AND CHAINS


We will first consider the notion of first passage times. The length of time (number of steps in discrete-time systems) for the process to go from state i to state j for the first time is called the first passage time. If i = j, then this is the number of steps needed for the process to return to state i for the first time, and this is termed the first return time or recurrence time for state i.
First passage times under certain conditions are random variables with an associated probability distribution. We let f_ij^(n) denote the probability that the first passage time from state i to j is equal to n, where it can be shown directly from Theorem 1-5 that

f_ij^(1) = p_ij^(1) = p_ij,
f_ij^(n) = p_ij^(n) − f_ij^(1) · p_jj^(n−1) − f_ij^(2) · p_jj^(n−2) − ··· − f_ij^(n−1) · p_jj^(1). (18-8)

Thus, recursive computation from the one-step transition probabilities yields the probability that the first passage time is n for given i, j.

Example 18-2
Using the one-step transition probabilities presented in Example 18-1, the distribution of the passage time index n for i = 0, j = 1 is determined as

f₀₁^(1) = p₀₁ = 0.250,
f₀₁^(2) = (0.312) − (0.25)(0.5) = 0.187,
f₀₁^(3) = (0.328) − (0.25)(0.375) − (0.187)(0.5) = 0.141,
f₀₁^(4) = (0.332) − (0.25)(0.344) − (0.187)(0.375) − (0.141)(0.5) = 0.105.

There are four such distributions, corresponding to the (i, j) values (0, 0), (0, 1), (1, 0), (1, 1).

If i and j are fixed, then Σ_{n=1}^∞ f_ij^(n) ≤ 1. When the sum is equal to one, the values f_ij^(n), n = 1, 2, ..., represent the probability distribution of first passage time for the specific i, j. In the case where a process in state i may never reach state j, Σ_{n=1}^∞ f_ij^(n) < 1.
Where i = j and Σ_{n=1}^∞ f_ii^(n) = 1, the state i is termed a recurrent state, since given that the process is in state i, it will always eventually return to i.
If p_ii = 1 for some state i, then that state is called an absorbing state, and the process will never leave it after it is entered.
The state i is called a transient state if

Σ_{n=1}^∞ f_ii^(n) < 1,

since there is a positive probability that given the process is in state i, it will never return to this state. It is not always easy to classify a state as transient or recurrent, since it is sometimes difficult to calculate first passage time probabilities f_ii^(n) for all n ≥ 1, as was the case in Example 18-2. Nevertheless, the expected first passage time is

μ_ij = ∞ if Σ_{n=1}^∞ f_ij^(n) < 1,
μ_ij = Σ_{n=1}^∞ n · f_ij^(n) if Σ_{n=1}^∞ f_ij^(n) = 1. (18-9)

If Σ_{n=1}^∞ f_ij^(n) = 1, a simple conditioning argument shows that

μ_ij = 1 + Σ_{l≠j} p_il · μ_lj. (18-10)

If we take i = j, the expected first passage time is called the expected recurrence time. If μ_ii = ∞ for a recurrent state, it is called null; if μ_ii < ∞, it is called nonnull or positive recurrent.
There are no null recurrent states in a finite-state Markov chain. All of the states in such chains are either positive recurrent or transient.

A state is called periodic with period τ > 1 if a return is possible only in τ, 2τ, 3τ, ... steps, so that p_ii^(n) = 0 for all values of n that are not divisible by τ > 1, and τ is the smallest integer having this property.
A state j is termed accessible from state i if p_ij^(n) > 0 for some n = 1, 2, .... In our example of the computing system, each state, 0 and 1, is accessible from the other, since p_ij^(n) > 0 for all i, j and all n. If state j is accessible from i and state i is accessible from j, then the states are said to communicate. This is the case in Example 18-1. We note that any state communicates with itself. If state i communicates with j, then j also communicates with i. Also, if i communicates with l and l communicates with j, then i also communicates with j.
If the state space is partitioned into disjoint sets (called equivalence classes) of states, where communicating states belong to the same class, then the Markov chain may consist of one or more classes. If there is only one class, so that all states communicate, the Markov chain is said to be irreducible. The chain represented by Example 18-1 is thus also irreducible. For finite-state Markov chains, the states of a class are either all positive recurrent or all transient. In many applications, the states will all communicate. This is the case if there is a value of n for which p_ij^(n) > 0 for all values of i and j.
If state i in a class is aperiodic (not periodic), and if the state is also positive recurrent, then the state is said to be ergodic. An irreducible Markov chain is ergodic if all of its states are ergodic. In the case of such Markov chains, the distribution

A^(n) = A · Pⁿ

converges as n → ∞, and the limiting distribution is independent of the initial probabilities, A. In Example 18-1, this was clearly observed to be the case, and after five steps (n > 5), P{X_n = 0} = 0.667 and P{X_n = 1} = 0.333 when three significant figures are used.
In general, for irreducible, ergodic Markov chains,

lim_{n→∞} p_ij^(n) = lim_{n→∞} a_j^(n) = p_j,

and, furthermore, these values p_j are independent of i. These "steady state" probabilities, p_j, satisfy the following state equations:

p_j > 0, j = 0, 1, 2, ..., m, (18-11a)
Σ_{j=0}^m p_j = 1, (18-11b)
p_j = Σ_{i=0}^m p_i · p_ij, j = 0, 1, 2, ..., m. (18-11c)

Since there are m + 2 equations in 18-11b and 18-11c, and since there are m + 1 unknowns, one of the equations is redundant. Therefore, we will use m of the m + 1 equations in equation 18-11c with equation 18-11b.

Example 18-3
In the case of the computing system presented in Example 18-1, we have, from equations 18-11b and 18-11c,

1 = p₀ + p₁,
p₀ = p₀(0.75) + p₁(0.50),

or

p₀ = 2/3 and p₁ = 1/3,

which agrees with the emerging result as n > 5 in Example 18-1.

The steady state probabilities and the mean recurrence time for irreducible, ergodic Markov chains have a reciprocal relationship,

μ_jj = 1/p_j, j = 0, 1, 2, ..., m. (18-12)

In Example 18-3, note that μ₀₀ = 1/p₀ = 1.5 and μ₁₁ = 1/p₁ = 3.

Example 18-4
The mood of a corporate president is observed over a period of time by a psychologist in the operations research department. Being inclined toward mathematical modeling, the psychologist classifies mood into three states as follows:
0: Good (cheerful)
1: Fair (so-so)
2: Poor (glum and depressed)
The psychologist observes that mood changes occur only overnight; thus, the data allow estimation of the transition probabilities

P = [0.6  0.2  0.2]
    [0.3  0.4  0.3]
    [0.0  0.3  0.7].

The equations

p₀ = 0.6p₀ + 0.3p₁ + 0p₂,
p₁ = 0.2p₀ + 0.4p₁ + 0.3p₂,
1 = p₀ + p₁ + p₂

are solved simultaneously for the steady state probabilities

p₀ = 3/13, p₁ = 4/13, p₂ = 6/13.

Given that the president is in a bad mood, that is, state 2, the mean time required to return to that state is μ₂₂, where

μ₂₂ = 1/p₂ = 13/6 days.

As noted earlier, if p_kk = 1, state k is called an absorbing state, and the process remains in state k once that state is reached. In this case, b_ik is called the absorption probability, which is the conditional probability of absorption into state k given state i. Mathematically, we have

b_ik = Σ_{j=0}^m p_ij · b_jk, i = 0, 1, 2, ..., m, (18-13)

where

b_kk = 1,

and

b_ik = 0 for i recurrent, i ≠ k.
18-4 CONTINUOUS-TIME MARKOV CHAINS

If the time parameter is continuous rather than a discrete index, as assumed in the previous sections, the Markov chain is called a continuous-parameter chain. It is customary to use a slightly different notation for continuous-parameter Markov chains, namely X(t) = X_t, where {X(t)}, t ≥ 0, will be considered to have states 0, 1, ..., m. The discrete nature of the state space [range space for X(t)] is thus maintained, and

p_ij(t) = P[X(t + s) = j | X(s) = i], i = 0, 1, 2, ..., m, j = 0, 1, 2, ..., m, s ≥ 0, t ≥ 0,

is the stationary transition probability function. It is noted that these probabilities are not dependent on s but only on t for a specified i, j pair of states. Furthermore, at time t = 0, the function is continuous, with

lim_{t→0⁺} p_ij(t) = 1, i = j,
                   = 0, i ≠ j.

There is a direct correspondence between the discrete-time and continuous-time models. The Chapman-Kolmogorov equations become

p_ij(t) = Σ_{l=0}^m p_il(v) · p_lj(t − v) (18-14)

for 0 ≤ v ≤ t, and for the specified state pair i, j and time t. If there are times t₁ and t₂ such that p_ij(t₁) > 0 and p_ji(t₂) > 0, then states i and j are said to communicate. Once again, states that communicate form an equivalence class, and where the chain is irreducible (all states form a single class),

p_ij(t) > 0, for t > 0,

for each state pair i, j.
We also have the property that

lim_{t→∞} p_ij(t) = p_j,

where p_j exists and is independent of the initial state probability vector A. The values p_j are again called the steady state probabilities, and they satisfy

p_j > 0, j = 0, 1, 2, ..., m,
Σ_{j=0}^m p_j = 1,
p_j = Σ_{i=0}^m p_i · p_ij(t), j = 0, 1, 2, ..., m, t ≥ 0.

The intensity of transition, given that the state is j, is defined as

u_j = lim_{Δt→0} [1 − p_jj(Δt)]/Δt, (18-15)

where the limit exists and is finite. Likewise, the intensity of passage from state i to state j, given that the system is in state i, is

u_ij = lim_{Δt→0} p_ij(Δt)/Δt, i ≠ j, (18-16)

again where the limit exists and is finite. The interpretation of the intensities is that they represent an instantaneous rate of transition from state i to j. For a small Δt, p_ij(Δt) ≈ u_ij · Δt + o(Δt), where o(Δt)/Δt → 0 as Δt → 0, so that u_ij is a proportionality constant by which p_ij(Δt) is proportional to Δt as Δt → 0. The transition intensities also satisfy the balance equations

p_j · u_j = Σ_{i≠j} p_i · u_ij, j = 0, 1, 2, ..., m. (18-17)

These equations indicate that in steady state, the rate of transition out of state j is equal to the rate of transition into j.

Example 18-5
An electronic control mechanism for a chemical process is constructed with two identical modules, operating as a parallel, active redundant pair. The function of at least one module is necessary for the mechanism to operate. The maintenance shop has two identical repair stations for these modules and, furthermore, when a module fails and enters the shop, other work is moved aside and repair work is immediately initiated. The "system" here consists of the mechanism and repair facility, and the states are as follows:
0: Both modules operating
1: One unit operating and one unit in repair
2: Two units in repair (mechanism down)
The random variable representing time to failure for a module has an exponential density, say

f_T(t) = λe^{−λt}, t ≥ 0,
       = 0, t < 0,

and the random variable describing repair time at a repair station also has an exponential density, say

f_R(t) = μe^{−μt}, t ≥ 0,
       = 0, t < 0.

Interfailure and interrepair times are independent, and {X(t)} can be shown to be a continuous-parameter, irreducible Markov chain with transitions only from a state to its neighbor states: 0 → 1, 1 → 0, 1 → 2, 2 → 1. Of course, there may be no state change.
The transition intensities are

u₀ = 2λ, u₁ = λ + μ, u₂ = 2μ,

and

u₀₁ = 2λ, u₁₂ = λ, u₀₂ = 0, u₂₀ = 0, u₁₀ = μ, u₂₁ = 2μ.

Using equation 18-17,

2λp₀ = μp₁,
(λ + μ)p₁ = 2λp₀ + 2μp₂,

and since p₀ + p₁ + p₂ = 1, some algebra gives

p₀ = μ²/(λ + μ)², p₁ = 2λμ/(λ + μ)², p₂ = λ²/(λ + μ)².

The system availability (probability that the mechanism is up) in the steady state condition is thus

Availability = 1 − p₂ = 1 − λ²/(λ + μ)².

The matrix of transition probabilities for the time increment Δt may be expressed as

P = [p_ij(Δt)]
  = [ 1 − u₀Δt   u₀₁Δt      ...   u₀jΔt     ...   u₀mΔt
      u₁₀Δt      1 − u₁Δt   ...   u₁jΔt     ...   u₁mΔt
      ...
      u_i0Δt     u_i1Δt     ...   u_ijΔt    ...   u_imΔt
      ...
      u_m0Δt     u_m1Δt     ...   u_mjΔt    ...   1 − u_mΔt ], (18-18)

and

p_j(t + Δt) = Σ_{i=0}^m p_i(t) · p_ij(Δt), j = 0, 1, 2, ..., m, (18-19)

where

p_j(t) = P{X(t) = j}.

From the jth equation in the m + 1 equations of equation 18-19,

p_j(t + Δt) = p₀(t) · u₀jΔt + ··· + p_i(t) · u_ijΔt + ··· + p_j(t)[1 − u_jΔt] + ··· + p_m(t) · u_mjΔt,

which may be rewritten as

d/dt p_j(t) = lim_{Δt→0} [p_j(t + Δt) − p_j(t)]/Δt = −u_j · p_j(t) + Σ_{i≠j} u_ij · p_i(t). (18-20)

The resulting system of differential equations is

p_j′(t) = −u_j · p_j(t) + Σ_{i≠j} u_ij · p_i(t), j = 0, 1, 2, ..., m, (18-21)

which may be solved when m is finite, given initial conditions (probabilities) A and using the result that Σ_{j=0}^m p_j(t) = 1. The solution

[p₀(t), p₁(t), ..., p_m(t)] = P(t) (18-22)

presents the state probabilities as a function of time in the same manner that P^(n) presented state probabilities as a function of the number of transitions, n, given an initial condition vector A in the discrete-time model. The solution to equations 18-21 may be somewhat difficult to obtain, and in general practice, transformation techniques are employed.

18-5 THE BIRTH-DEATH PROCESS IN QUEUEING


The maior application of lhe so-called birth-dealh process that we will study is in queue-
ing or waiting-line theory, Here birth will refer to an arrival and death to a departure from
a physical system. as shown in Fig. 18~ L
Queueing theory is the mathematical study of queues or waiting lines, These waiting
lines occur in a variety of problem environments. There is an input process or "calling pop'"
ulation," from which arrivals are drawn. and a queueing system. which in Fig. 18-1 consists
of the queue and service facility. The calling population may be finite or infirjte. Arrivals
occur in a probabilistic manner. A common assumption is that the interarrival times are
exponentially dis1..ributed. The queue is generally classified according to whether its capac~
ity is infinite or finite. and the service discipline refers to the order in which the customers
in the queue are served. The service mechanism consists of one or more servers, and the
elapsed service time is commonly called the holding time.
The following notation will be employed:
Xl,) = Number of customers in system at time t
States::::: 0, 1,2• ... ,j,j-r 1, ...
s = Number of servers
pit) P{X(,) = jlAl
p.; limp(t)
; (4""')

An = Arrival rate given that n customers are in the system

JJ..'t ::: Service rate given that n customers are in the system
The birth-death process can be used to describe how X(I) cbanges through time. It will
be assumed bere that when X(t) lhe probability distribution of the time to the next birth
(arrival) is exponential \Vith parameter Aj'! j :: O. 1. 2; .... Furthermore, given X(r) the
remaining rime to the next service completion is taken to be exponential with parameter j1.i'
j 1, 2, .... Poisson-type postulates are assumed to hold, so tha, the probability of mo{~
than one birth or death at me same .instant is zero.

Figure 18-1 A simple queueing system (input process, arrivals, queue, service facility, departures).

A transition diagram is shown in Fig. 18-2. The transition matrix corresponding to equation 18-18 is

P = [ 1 - λ_0 Δt     λ_0 Δt              0                   0          ...
      μ_1 Δt         1 - (λ_1 + μ_1) Δt  λ_1 Δt              0          ...
      0              μ_2 Δt              1 - (λ_2 + μ_2) Δt  λ_2 Δt     ...
      ...
      0   ...   μ_j Δt   1 - (λ_j + μ_j) Δt   λ_j Δt   ...
      0   ...   0        μ_{j+1} Δt           1 - (λ_{j+1} + μ_{j+1}) Δt   ...
      ... ]

We note that p_ij(Δt) = 0 for j < i - 1 or j > i + 1. Furthermore, the transition intensities and intensities of passage shown in equation 18-17 are

u_0 = λ_0,
u_j = λ_j + μ_j   for j = 1, 2, ...,

and

u_ij = λ_i   for j = i + 1,
     = μ_i   for j = i - 1,
     = 0     for j < i - 1, j > i + 1.

The fact that the transition intensities and intensities of passage are constant with time is important in the development of this model. The nature of the transition can be viewed as specified by assumption, or it may be considered a result of the prior assumption about the distribution of time between occurrences (births and deaths).

Figure 18-2 Transition diagram for the birth-death process.



The assumptions of independent, exponentially distributed service times and independent, exponentially distributed interarrival times yield transition intensities that are constant in time. This was also observed in the development of the Poisson and exponential distributions in Chapters 5 and 6.
The methods used in equations 18-19 through 18-21 may be used to formulate an infinite set of differential state equations from the transition matrix given above. Thus, the time-dependent behavior is described in the following equations:
p′_0(t) = -λ_0 p_0(t) + μ_1 p_1(t),    (18-23)

p′_j(t) = -(λ_j + μ_j) p_j(t) + λ_{j-1} p_{j-1}(t) + μ_{j+1} p_{j+1}(t),   j = 1, 2, ...,    (18-24)

where Σ_{j=0}^{∞} p_j(t) = 1 and A = [a_0(0), a_1(0), ..., a_j(0), ...].
In the steady state (t → ∞), we have p′_j(t) = 0, so the steady state equations are obtained from equations 18-23 and 18-24:

μ_1 p_1 = λ_0 p_0,
λ_0 p_0 + μ_2 p_2 = (λ_1 + μ_1) p_1,
λ_1 p_1 + μ_3 p_3 = (λ_2 + μ_2) p_2,
...                                              (18-25)
λ_{j-2} p_{j-2} + μ_j p_j = (λ_{j-1} + μ_{j-1}) p_{j-1},
λ_{j-1} p_{j-1} + μ_{j+1} p_{j+1} = (λ_j + μ_j) p_j,
...

and Σ_{j=0}^{∞} p_j = 1.

Equations 18-25 could also have been determined by the direct application of equation 18-17, which provides a "rate balance" or "intensity balance." Solving equations 18-25, we obtain

p_j = [(λ_{j-1} λ_{j-2} ··· λ_0)/(μ_j μ_{j-1} ··· μ_1)] · p_0,   j = 1, 2, ....

If we let

C_j = (λ_{j-1} λ_{j-2} ··· λ_0)/(μ_j μ_{j-1} ··· μ_1),    (18-26)

then p_j = C_j · p_0, and since Σ_{j=0}^{∞} p_j = 1, or

p_0 + Σ_{j=1}^{∞} C_j p_0 = 1,    (18-27)

we obtain

p_0 = 1/(1 + Σ_{j=1}^{∞} C_j).

These steady state results assume that the λ_j, μ_j values are such that a steady state can be reached. This will be true if λ_j = 0 for j > k, so that there are a finite number of states. It is also true if ρ = λ/(sμ) < 1, where λ and μ are constant and s denotes the number of servers. The steady state will not be reached if Σ_{j=1}^{∞} C_j = ∞.
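Computationally, the C_j of equation 18-26 make these steady state probabilities easy to evaluate once the λ_j and μ_j are specified. The following Python sketch is our own illustration, not part of the development above; the function name, the truncation point jmax used to approximate an infinite state space, and the example rates are all assumptions.

import numpy as np

def birth_death_steady_state(lam, mu, jmax):
    # C_j = (lam_{j-1} ... lam_0) / (mu_j ... mu_1), per equation 18-26,
    # with the state space truncated at jmax when it is infinite.
    C = [1.0]
    for j in range(1, jmax + 1):
        C.append(C[-1] * lam(j - 1) / mu(j))
    C = np.array(C)
    return C / C.sum()          # p_j = C_j p_0, normalized so the sum is 1

# Example with constant rates lambda = 2, mu = 3 (so rho = 2/3):
p = birth_death_steady_state(lambda j: 2.0, lambda j: 3.0, jmax=200)
print(p[0])                     # about 1 - rho = 1/3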

18-6 CONSIDERATIONS IN QUEUEING MODELS

When the arrival rate λ_j is constant for all j, the constant is denoted λ. Similarly, when the service rate per busy server is constant, it will be denoted μ, so that μ_j = sμ if j ≥ s and μ_j = jμ if j < s. The exponential densities

f(t) = λe^{-λt},   t ≥ 0,
     = 0,          t < 0,

g(t) = μe^{-μt},   t ≥ 0,
     = 0,          t < 0,

for interarrival times and service times in a busy channel produce rates λ and μ, which are constant. The mean interarrival time is 1/λ, and the mean time for a busy channel to complete service is 1/μ.


A special set of notation bas been widely employed in the steady S!A.te analysis of
queueing systems, This notation is given in the following list:

L = L~.J· Pi = Expected number of customers in !be queueing system


L, = L~,,(j-S)' Pj = Expected queue length
W -;::;; Expected time in the system (including sel'\'lce time)
Wq = Expected waiting time in the queue (excluding service time)

If λ_j is constant for all j, then it has been shown that

L = λW    (18-28)

and

L_q = λW_q.

(These results are special cases of what is known as Little's law.) If the λ_j are not all equal, λ̄ replaces λ, where

λ̄ = Σ_{j=0}^{∞} λ_j·p_j.    (18-29)

The system utilization coefficient ρ = λ/(sμ) is the fraction of time that the servers are busy. In the case where the mean service time is 1/μ for all j ≥ 1,

W = W_q + 1/μ.    (18-30)

The birth-death process rates λ_0, λ_1, ..., λ_j, ... and μ_1, μ_2, ..., μ_j, ... may be assigned any positive values as long as the assignment leads to a steady state solution. This allows considerable flexibility in using the results given in equation 18-27. The specific models subsequently presented will differ in the manner in which λ_j and μ_j vary as a function of j.

18-7 BASIC SINGLE-SERVER MODEL WITH CONSTANT RATES

We will now consider the case where s = 1, that is, a single server. We will also assume an unlimited potential queue length with exponential interarrivals having a constant parameter λ, so that λ_0 = λ_1 = ··· = λ. Furthermore, service times will be assumed to be independent and exponentially distributed with μ_1 = μ_2 = ··· = μ. We will assume λ < μ. As a result of equation 18-26, we have

C_j = (λ/μ)^j = ρ^j,   j = 1, 2, 3, ...,    (18-31)

and from equation 18-27

p_j = ρ^j p_0,   j = 1, 2, 3, ...,

p_0 = 1/(1 + Σ_{j=1}^{∞} ρ^j) = 1 - ρ.    (18-32)

Thus, the steady state equations are

p_j = (1 - ρ)ρ^j,   j = 0, 1, 2, ....    (18-33)

Note that the probability that there are j customers in the system, p_j, is given by a geometric distribution with parameter ρ. The mean number of customers in the system, L, is determined as

L = Σ_{j=0}^{∞} j·(1 - ρ)ρ^j = ρ/(1 - ρ).    (18-34)

And the expected queue length is

L_q = Σ_{j=1}^{∞} (j - 1)·p_j = L - (1 - p_0) = ρ²/(1 - ρ).    (18-35)

Using equations 18-28 and 18-34, we find that the expected waiting time in the system is

W = L/λ = 1/(μ - λ),    (18-36)

and the expected waiting time in the queue is

W_q = L_q/λ = λ/(μ(μ - λ)).    (18-37)

These results could have been developed directly from the distributions of time in the system and time in the queue, respectively. Since the exponential distribution reflects a memoryless process, an arrival finding j units in the system will wait through j + 1 services, including its own, and thus its waiting time T_{j+1} is the sum of j + 1 independent, exponentially distributed random variables. This random variable was shown in Chapter 6 to have a gamma distribution. This is a conditional density given that the arrival finds j units in the system. Thus, if S represents time in the system,

P(S > w) = Σ_{j=0}^{∞} p_j · P(T_{j+1} > w)

         = Σ_{j=0}^{∞} (1 - ρ)ρ^j ∫_w^∞ [μ^{j+1} t^j e^{-μt}/Γ(j + 1)] dt

         = ∫_w^∞ (1 - ρ)μ e^{-μt} Σ_{j=0}^{∞} (ρμt)^j/j! dt

         = e^{-μ(1-ρ)w},   w ≥ 0,    (18-38)
         = 1,              w < 0,

which is seen to be the complement of the distribution function for an exponential random variable with parameter μ(1 - ρ). The mean value W = 1/[μ(1 - ρ)] = 1/(μ - λ) follows directly.
If we let S_q represent time in the queue, excluding service time, then

P(S_q = 0) = p_0 = 1 - ρ.

If we take T_j as the sum of j service times, T_j will again have a gamma distribution. Then, as in the previous manipulations,

P(S_q > w_q) = Σ_{j=1}^{∞} p_j · P(T_j > w_q)

             = Σ_{j=1}^{∞} (1 - ρ)ρ^j · P(T_j > w_q)    (18-39)

             = ρ e^{-μ(1-ρ)w_q},   w_q ≥ 0,
             = 0,                   otherwise,

and we find the density of time in the queue, g(w_q), for w_q > 0, to be

g(w_q) = λ(1 - ρ)e^{-μ(1-ρ)w_q},   w_q > 0.

Thus, the probability distribution is

g(w_q) = 1 - ρ,                     w_q = 0,
       = λ(1 - ρ)e^{-μ(1-ρ)w_q},    w_q > 0,    (18-40)

which was noted in Section 2-2 as being for a mixed-type random variable (in equation 2-2, G ≢ 0 and H ≢ 0). The expected waiting time in the queue W_q could be determined directly from this distribution as

W_q = (1 - ρ)·0 + ∫_0^∞ w_q·λ(1 - ρ)e^{-μ(1-ρ)w_q} dw_q = λ/(μ(μ - λ)).    (18-41)

When λ ≥ μ, the summation of the terms ρ^j in equation 18-32 diverges. In this case, there is no steady state solution, since the steady state is never reached; the queue would grow without bound.
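The results of this section are easy to collect into a small calculator. The Python sketch below is our own illustration; the function names and the example rates λ = 2, μ = 3 are assumptions chosen for the demonstration.

import math

def mm1_measures(lam, mu):
    # Steady state measures for the single-server model of Section 18-7.
    assert lam < mu, "a steady state requires lambda < mu"
    rho = lam / mu
    L = rho / (1 - rho)              # equation 18-34
    Lq = rho**2 / (1 - rho)          # equation 18-35
    W = 1 / (mu - lam)               # equation 18-36
    Wq = lam / (mu * (mu - lam))     # equation 18-37
    return L, Lq, W, Wq

def prob_system_time_exceeds(w, lam, mu):
    # P(S > w) = exp(-mu (1 - rho) w), per equation 18-38.
    return math.exp(-mu * (1 - lam / mu) * w)

print(mm1_measures(2.0, 3.0))                    # (2.0, 1.333..., 1.0, 0.666...)
print(prob_system_time_exceeds(1.0, 2.0, 3.0))   # e^{-1}, about 0.368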

18-8 SINGLE SERVER WITH LIMITED QUEUE LENGTH

If the queue is limited so that at most N units can be in the system, and if the exponential service times and exponential interarrival times are retained from the prior model, we have

λ_0 = λ_1 = ··· = λ_{N-1} = λ,
λ_j = 0,   j ≥ N,

and

μ_1 = μ_2 = ··· = μ_N = μ.

It follows from equation 18-26 that

C_j = ρ^j,   j ≤ N,    (18-42)
    = 0,     j > N.

Then

p_j = ρ^j p_0,   j = 0, 1, 2, ..., N,

so that

p_0 Σ_{j=0}^{N} ρ^j = 1

and

p_0 = (1 - ρ)/(1 - ρ^{N+1}).    (18-43)

As a result, the steady state equations are given by

p_j = [(1 - ρ)/(1 - ρ^{N+1})]·ρ^j,   j = 0, 1, 2, ..., N.    (18-44)

The mean number of customers in the system in this case is

L = ρ[1 - (N + 1)ρ^N + Nρ^{N+1}]/[(1 - ρ)(1 - ρ^{N+1})].    (18-45)

The mean number of customers in the queue is

L_q = Σ_{j=1}^{N} (j - 1)·p_j = Σ_{j=1}^{N} j p_j - Σ_{j=1}^{N} p_j = L - (1 - p_0).    (18-46)

The mean time in the system is found as

W = L/λ̄,    (18-47)

and the mean time in the queue is

W_q = L_q/λ̄ = W - 1/μ,    (18-48)

where λ̄ = λ(1 - p_N) as in equation 18-29 and L is given by equation 18-45.
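A small computational sketch of the limited-queue results follows; it is our own illustration (the function name and example inputs are assumed), using λ̄ = λ(1 - p_N) from equation 18-29.

def mm1N_measures(lam, mu, N):
    # Steady state measures for the single-server, capacity-N model.
    rho = lam / mu
    p0 = (1 - rho) / (1 - rho**(N + 1)) if rho != 1 else 1.0 / (N + 1)
    p = [p0 * rho**j for j in range(N + 1)]      # equation 18-44
    L = sum(j * pj for j, pj in enumerate(p))    # equation 18-45
    Lq = L - (1 - p[0])                          # equation 18-46
    lam_bar = lam * (1 - p[N])                   # equation 18-29
    return p, L, Lq, L / lam_bar, Lq / lam_bar   # ..., W (18-47), Wq (18-48)

p, L, Lq, W, Wq = mm1N_measures(lam=15.0, mu=27.0, N=3)
print(1 - p[3])   # fraction of arriving customers that are not turned away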



18-9 MULTIPLE SERVERS WITH AN UNLIMITED QUEUE

We now consider the case where there are multiple servers. We also assume that the queue is unlimited and that exponential assumptions hold for interarrival times and service times. In this case, we have

λ_j = λ,   j = 0, 1, 2, ...,    (18-49)

and

μ_j = jμ   for j ≤ s,
    = sμ   for j > s.

Thus, defining φ = λ/μ, we have

C_j = φ^j/j!,             j ≤ s,    (18-50)
    = φ^j/(s! s^{j-s}),   j > s.

It follows from equation 18-27 that the state equations are developed as

p_j = (φ^j/j!)·p_0,             j ≤ s,
    = [φ^j/(s! s^{j-s})]·p_0,   j > s,

where

p_0 = [Σ_{j=0}^{s-1} φ^j/j! + φ^s/(s!(1 - ρ))]^{-1},    (18-51)

and ρ = λ/(sμ) = φ/s is the utilization coefficient, assuming ρ < 1.


The value L_q, representing the mean number of units in the queue, is developed as follows:

L_q = Σ_{j=s}^{∞} (j - s)·p_j = p_0 φ^s ρ/(s!(1 - ρ)²).    (18-52)

Then

W_q = L_q/λ,    (18-53)

and

W = W_q + 1/μ,    (18-54)

so that

L = λW = L_q + φ.    (18-55)
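Again the formulas are straightforward to program. The Python sketch below is our own illustration (the function name and the example values λ = 2, μ = 1.5, s = 2 are assumptions) of equations 18-51 through 18-55.

from math import factorial

def mms_measures(lam, mu, s):
    phi = lam / mu
    rho = phi / s
    assert rho < 1, "a steady state requires rho < 1"
    p0 = 1.0 / (sum(phi**j / factorial(j) for j in range(s))
                + phi**s / (factorial(s) * (1 - rho)))       # equation 18-51
    Lq = p0 * phi**s * rho / (factorial(s) * (1 - rho)**2)   # equation 18-52
    Wq = Lq / lam                                            # equation 18-53
    W = Wq + 1 / mu                                          # equation 18-54
    L = lam * W                                              # equation 18-55
    return p0, L, Lq, W, Wq

print(mms_measures(lam=2.0, mu=1.5, s=2))   # p0 = 0.2 for this example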

18-10 OTHER QUEUEING MODELS

There are numerous other queueing models that can be developed from the birth-death process. In addition, it is also possible to develop queueing models for situations involving nonexponential distributions. One useful result, given without proof, is for a single-server system having exponential interarrivals and an arbitrary service time distribution with mean 1/μ and variance σ². If ρ = λ/μ < 1, then steady state measures are given by equations 18-56:

p_0 = 1 - ρ,
L_q = (λ²σ² + ρ²)/(2(1 - ρ)),
L = ρ + L_q.    (18-56)

In the case where service times are constant at 1/μ, the foregoing relationships yield the measures of system performance by taking the variance σ² = 0.
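As a quick illustration of equations 18-56, the following Python sketch (ours; the function name and example values are assumptions) also applies Little's law, equation 18-28, to recover W_q and W. With exponential service, σ² = 1/μ², the M/M/1 results of Section 18-7 are recovered.

def mg1_measures(lam, mu, sigma2):
    # Single server, exponential interarrivals, general service: equations 18-56.
    rho = lam / mu
    assert rho < 1
    p0 = 1 - rho
    Lq = (lam**2 * sigma2 + rho**2) / (2 * (1 - rho))
    L = rho + Lq
    return p0, L, Lq, Lq / lam, Lq / lam + 1 / mu   # ..., Wq and W via Little's law

print(mg1_measures(2.0, 3.0, sigma2=1.0 / 9.0))  # exponential service: Lq = 4/3
print(mg1_measures(2.0, 3.0, sigma2=0.0))        # constant service times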

18-11 SUMMARY

This chapter introduced the notion of discrete-state space stochastic processes for discrete-time and continuous-time orientations. The Markov process was developed along with the presentation of state properties and characteristics. This was followed by a presentation of the birth-death process and several important applications to queueing models for the description of waiting-time phenomena.

18-12 EXERCISES

18-1. A shoe repair shop in a suburban mall has one shoesmith. Shoes are brought in for repair and arrive according to a Poisson process with a constant arrival rate of two pairs per hour. The repair time distribution is exponential with a mean of 20 minutes, and there is independence between the repair and arrival processes. Consider a pair of shoes to be the unit to be served, and do the following:
(a) In the steady state, find the probability that the number of pairs of shoes in the system exceeds 5.
(b) Find the mean number of pairs in the shop and the mean number of pairs waiting for service.
(c) Find the mean turnaround time for a pair of shoes (time in the shop waiting plus repair, but excluding time waiting to be picked up).

18-2. Weather data are analyzed for a particular locality, and a Markov chain is employed as a model for weather change as follows. The conditional probability of change from rain to clear weather in one day is 0.3. Likewise, the conditional probability of transition

from clear to rain in one day is 0.1. The model is to be a discrete-time model, with transitions occurring only between days.
(a) Determine the matrix P of one-step transition probabilities.
(b) Find the steady state probabilities.
(c) If today is clear, find the probability that it will be clear exactly 3 days hence.
(d) Find the probability that the first passage from a clear day to a rainy day occurs in exactly 2 days, given a clear day is the initial state.
(e) What is the mean recurrence time for the rainy day state?

18-3. A communication link transmits binary characters, {0, 1}. There is a probability p that a transmitted character will be received correctly by a receiver, which then transmits to another link, etc. If X_0 is the initial character and X_1 is the character received after the first transmission, X_2 after the second, etc., then with independence {X_n} is a Markov chain. Find the one-step and steady state transition matrices.

18-4. Consider a two-component active redundancy where the components are identical and the time-to-failure distributions are exponential. When both units are operating, each carries load L_1 and each has failure rate λ. However, when one unit fails, the load carried by the other component is L_2, and its failure rate under this load is (1.5)λ. There is only one repair facility available, and repair time is exponentially distributed with mean 1/μ. The system is considered failed when both components are in the failed state. Both components are initially operating. Assume that μ > (1.5)λ. Let the states be as follows:
0: No components are failed.
1: One component is failed and is in repair.
2: Two components are failed, one is in repair, one is waiting, and the system is in the failed condition.
(a) Determine the matrix P of transition probabilities associated with interval Δt.
(b) Determine the steady state probabilities.
(c) Write the system of differential equations that presents the transient or time-dependent relationships for transition.

18-5. A communication satellite is launched via a booster system that has a discrete-time guidance control system. Course correction signals form a sequence {X_n} where the state space for X is as follows:
0: No correction required.
1: Minor correction required.
2: Major correction required.
3: Abort and system destruct.
If {X_n} can be modeled as a Markov chain with one-step transition matrix

P = [ ···    0    ···
      ···   1/6   ···
      ···   2/3   ···
      ···    0    ··· ]

do the following:
(a) Show that states 0 and 1 are absorbing states.
(b) If the initial state is state 1, compute the steady state probability that the system is in state 0.
(c) If the initial probabilities are (0, 1/2, 1/2, 0), compute the steady state probability p_0.
(d) Repeat (c) with A = (1/4, 1/4, 1/4, 1/4).

18-6. A gambler bets $1 on each hand of blackjack. The probability of winning on any hand is p, and the probability of losing is 1 - p = q. The gambler will continue to play until either $Y has been accumulated or he has no money left. Let X_t denote the accumulated winnings on hand t. Note that X_{t+1} = X_t + 1 with probability p, that X_{t+1} = X_t - 1 with probability q, and X_{t+1} = X_t if X_t = 0 or X_t = Y. The stochastic process X_t is a Markov chain.
(a) Find the one-step transition matrix P.
(b) For Y = 4 and p = 0.3, find the absorption probabilities b_{10}, b_{14}, b_{30}, and b_{34}.

18-7. An object moves between four points on a circle, which are labeled 1, 2, 3, and 4. The probability of moving one unit to the right is p, and the probability of moving one unit to the left is 1 - p = q. Assume that the object starts at 1, and let X_n denote the location on the circle after n steps.
(a) Find the one-step transition matrix P.
(b) Find an expression for the steady state probabilities p_j.
(c) Evaluate the probabilities p_j for p = 0.5 and p = 0.8.

18-8. For the single-server queueing model presented in Section 18-7, sketch the graphs of the following quantities as a function of ρ = λ/μ, for 0 < ρ < 1:
(a) Probability of no units in the system.
(b) Mean time in the system.
(c) Mean time in the queue.

18-9. Interarrival times at a telephone booth are exponential, with an average time of 10 minutes. The length of a phone call is assumed to be exponentially distributed with a mean of 3 minutes.
(a) What is the probability that a person arriving at the booth will have to wait?

(b) What is the average queue length?
(c) The telephone company will install a second booth when an arrival would expect to have to wait 3 minutes or more for the phone. By how much must the rate of arrivals be increased in order to justify a second booth?
(d) What is the probability that an arrival will have to wait more than 10 minutes for the phone?
(e) What is the probability that it will take a person more than 10 minutes altogether, for the phone and to complete the call?
(f) Estimate the fraction of a day that the phone will be in use.

18-10. Automobiles arrive at a service station in a random manner at a mean rate of 15 per hour. This station has only one service position, with a mean servicing rate of 27 customers per hour. Service times are exponentially distributed. There is space for only the automobile being served and two waiting. If all three spaces are filled, an arriving automobile will go on to another station.
(a) What is the average number of units in the station?
(b) What fraction of customers will be lost?
(c) Why is L_q > L - 1?

18-11. An engineering school has three secretaries in its general office. Professors with jobs for the secretaries arrive at random, at an average rate of 20 per 8-hour day. The amount of time that a secretary spends on a job has an exponential distribution with a mean of 40 minutes.
(a) What fraction of the time are the secretaries busy?
(b) How much time does it take, on average, for a professor to get his or her jobs completed?
(c) If an economy drive reduced the secretarial force to two secretaries, what will be the new answers to (a) and (b)?

18-12. The mean frequency of arrivals at an airport is 18 planes per hour, and the mean time that a runway is tied up with an arrival is 2 minutes. How many runways will have to be provided so that the probability of a plane having to wait is 0.20? Ignore finite population effects and make the assumption of exponential interarrival and service times.

18-13. A hotel reservations facility uses inward WATS lines to service customer requests. The mean number of calls that arrive per hour is 50, and the mean service time for a call is 3 minutes. Assume that interarrival and service times are exponentially distributed. Calls that arrive when all lines are busy obtain a busy signal and are lost from the system.
(a) Find the steady state equations for this system.
(b) How many WATS lines must be provided to ensure that the probability of a customer obtaining a busy signal is 0.05?
(c) What fraction of the time are all WATS lines busy?
(d) Suppose that during the evening hours call arrivals occur at a mean rate of 10 per hour. How does this affect the WATS line utilization?
(e) Suppose the estimated mean service time (3 minutes) is in error, and the true mean service time is really 5 minutes. What effect will this have on the probability of a customer finding all lines busy if the number of lines in (b) is used?
Chapter 19

Computer Simulation
One of the most widespread applications of probability and statistics lies in the use of computer simulation methods. A simulation is simply an imitation of the operation of a real-world system for purposes of evaluating that system. Over the past 20 years, computer simulation has enjoyed a great deal of popularity in the manufacturing, production, logistics, service, and financial industries, to name just a few areas of application. Simulations are often used to analyze systems that are too complicated to attack via analytic methods such as queueing theory. We are primarily interested in simulations that are:

1. Dynamic-that is, the system state changes over time.
2. Discrete-that is, the system state changes as the result of discrete events such as customer arrivals or departures.
3. Stochastic (as opposed to deterministic).

The stochastic nature of simulation prompts the ensuing discussion in the text.
This chapter is organized as follows. It begins in Section 19-1 with some simple motivational examples designed to show how one can apply simulation to answer interesting questions about stochastic systems. These examples invariably involve the generation of random variables to drive the simulation, for example customer interarrival times and service times. The subject of Section 19-2 is the development of techniques to generate random variables. Some of these techniques have already been alluded to in previous chapters, but we will give a more complete and self-contained presentation here. After a simulation run is completed, one must conduct a rigorous analysis of the resulting output, a task made difficult because simulation output, for example customer waiting times, is almost never independent or identically distributed. The problem of output analysis is studied in Section 19-3. A particularly attractive feature of computer simulation is its ability to allow the experimenter to analyze and compare certain scenarios quickly and efficiently. Section 19-4 discusses methods for reducing the variance of estimators arising from a single scenario, thus resulting in more-precise statements about system performance, at no additional cost in simulation run time. We also extend this work by mentioning methods for selecting the best of a number of competing scenarios. We point out here that excellent general references for the topic of stochastic simulation are Banks, Carson, Nelson, and Nicol (2001) and Law and Kelton (2000).

19-1 MOTIVATIONAL EXAMPLES

This section illustrates the use of simulation through a series of simple, motivational examples. The goal is to show how one uses random variables within a simulation to answer questions about the underlying stochastic system.


Example 19-1 Coin Flipping

We are interested in simulating independent flips of a fair coin. Of course, this is a trivial sequence of Bernoulli trials with success probability p = 1/2, but this example serves to show how one can use simulation to analyze such a system. First of all, we need to generate realizations of heads (H) and tails (T), each with probability 1/2. Assuming that the simulation can somehow produce a sequence of independent uniform (0,1) random numbers, U_1, U_2, ..., we will arbitrarily designate flip i as H if we observe U_i < 0.5, and as T if we observe U_i ≥ 0.5. How one generates independent uniforms is the subject of Section 19-2. In any case, suppose that the following uniforms are observed:

0.32 0.41 0.06 0.93 0.82 0.19 0.21 0.77 0.71 0.08.

This sequence of uniforms corresponds to the outcomes HHHTTHHTTH. The reader is asked to study this example in various ways in Exercise 19-1. This type of "static" simulation, in which we simply repeat the same type of trials over and over, has come to be known as Monte Carlo simulation, in honor of the European city-state where gambling is a popular recreational activity.
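A Python sketch of this experiment, assuming the language's built-in generator as the source of uniform (0,1) numbers, might look as follows; the seed is arbitrary.

import random

random.seed(12345)   # fix the seed so the run is repeatable
flips = ["H" if random.random() < 0.5 else "T" for _ in range(10)]
print("".join(flips))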

Example 19-2 Estimate π

In this example, we will estimate π using Monte Carlo simulation in conjunction with a simple geometric relation. Referring to Fig. 19-1, consider a unit square with an inscribed circle, both centered at (1/2, 1/2). If one were to throw darts randomly at the square, the probability that a particular dart will land in the circle is π/4, the ratio of the circle's area to that of the square. How can we use this simple fact to estimate π? We shall use Monte Carlo simulation to throw many darts at the square. Specifically, generate independent pairs of independent uniform (0,1) random variables, (U_{1,1}, U_{1,2}), (U_{2,1}, U_{2,2}), .... These pairs will fall randomly on the square. If, for pair i, it happens that

(U_{i,1} - 1/2)² + (U_{i,2} - 1/2)² ≤ 1/4,    (19-1)

then that pair will also fall within the circle. Suppose we run the experiment for n pairs (darts). Let X_i = 1 if pair i satisfies inequality 19-1, that is, if the ith dart falls in the circle; otherwise, let X_i = 0. Now count up the number of darts X = Σ_{i=1}^n X_i falling in the circle. Clearly, X has the binomial distribution with parameters n and p = π/4. Then the proportion p̂ = X/n is the maximum likelihood estimate for p = π/4, and so the maximum likelihood estimator for π is just π̂ = 4p̂.

Figure 19-1 Throwing darts to estimate π.

If, for instance,


we conducted n = 1000 trials and observed X = 753 darts in the circle, our estimate would be π̂ = 3.012. We will encounter this estimation technique again in Exercise 19-2.
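A Python sketch of the dart experiment follows; the sample size and seed are arbitrary choices.

import random

random.seed(12345)
n, hits = 1000, 0
for _ in range(n):
    u1, u2 = random.random(), random.random()
    if (u1 - 0.5)**2 + (u2 - 0.5)**2 <= 0.25:   # inequality 19-1
        hits += 1
print(4 * hits / n)   # estimate of pi; improves as n grows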

Example 19-3 Monte Carlo Integration

Another interesting use of computer simulation involves Monte Carlo integration. Usually, the method becomes efficacious only for high-dimensional integrals, but we will fall back to the basic one-dimensional case for ease of exposition. To this end, consider the integral

I = ∫_a^b f(x) dx.    (19-2)

As described in Fig. 19-2, we shall estimate the value of this integral by summing up n rectangles, each of width 1/n, centered randomly at point U_i on [0,1], and of height f(a + (b - a)U_i). Then an estimate for I is

Î_n = [(b - a)/n] Σ_{i=1}^n f(a + (b - a)U_i).    (19-3)

One can show (see Exercise 19-3) that Î_n is an unbiased estimator for I, that is, E[Î_n] = I for all n. This makes Î_n an intuitive and attractive estimator.
To illustrate, suppose that we want to estimate the integral

I = ∫_0^1 [1 + cos(πx)] dx,

and the following n = 4 numbers are a uniform (0,1) sample:

0.419 0.109 0.732 0.893.

Figure 19-2 Monte Carlo integration.

Plugging into equation 19-3, we obtain

Î_4 = [(1 - 0)/4] Σ_{i=1}^4 [1 + cos(π(0 + (1 - 0)U_i))] = 0.896,

which is close to the actual answer of 1. See Exercise 19-4 for additional Monte Carlo integration examples.
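The estimator of equation 19-3 takes only a few lines of Python; the sketch below (with our own function name and an arbitrary seed) applies it to the example integrand.

import math
import random

def mc_integrate(f, a, b, n):
    # Monte Carlo estimate of the integral of f over [a, b], per equation 19-3.
    return (b - a) / n * sum(f(a + (b - a) * random.random()) for _ in range(n))

random.seed(12345)
print(mc_integrate(lambda x: 1 + math.cos(math.pi * x), 0.0, 1.0, n=10000))
# close to the exact value of 1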

Example 19-4 A Single-Server Queue

Now the goal is to simulate the behavior of a single-server queueing system. Suppose that six customers arrive at a bank at the following times, which have been generated from some appropriate probability distribution:

3 4 6 10 15 20.

Upon arrival, customers queue up in front of a single teller and are processed sequentially, in a first-come-first-served manner. The service times corresponding to the arriving customers are

7 6 4 6 1 2.

For this example, we assume that the bank opens at time 0 and closes its doors at time 20 (just after customer 6 arrives), serving any remaining customers.
Table 19-1 and Fig. 19-3 trace the evolution of the system as time progresses. The table keeps track of the times at which customers arrive, begin service, and leave. Figure 19-3 graphs the status of the queue as a function of time; in particular, it graphs L(t), the number of customers in the system (queue + service) at time t.
Note that customer i can begin service only at time B_i = max(A_i, D_{i-1}), that is, the maximum of his arrival time and the previous customer's departure time. The table and figure are quite easy to interpret. For instance, the system is empty until time 3, when customer 1 arrives. At time 4, customer 2 arrives but must wait in line until customer 1 finishes service at time 10. We see from the figure that between times 20 and 26, customer 4 is in service, while customers 5 and 6 wait in the queue. From the table, the average waiting time for the six customers is Σ_{i=1}^6 W_i/6 = 44/6. Further, the average number of customers in the system is ∫_0^29 L(t) dt/29 = 70/29, where we have computed the integral by adding up the rectangles in Fig. 19-3. Exercise 19-5 looks at extensions of the single-server queue. Many simulation software packages provide simple ways to model and analyze more-complicated queueing networks.

Table 19-1 Bank Customers in Single-Server Queueing System

i, customer   A_i, arrival time   B_i, begin service   S_i, service time   D_i, depart time   W_i, wait
1             3                   3                    7                   10                 0
2             4                   10                   6                   16                 6
3             6                   16                   4                   20                 10
4             10                  20                   6                   26                 10
5             15                  26                   1                   27                 11
6             20                  27                   2                   29                 7

Figure 19-3 Number of customers L(t) in the single-server queueing system.
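The bookkeeping in Table 19-1 amounts to the recursion B_i = max(A_i, D_{i-1}) noted above; the following Python sketch (our own) reproduces the table.

arrivals = [3, 4, 6, 10, 15, 20]
services = [7, 6, 4, 6, 1, 2]

prev_depart = 0
for i, (a, s) in enumerate(zip(arrivals, services), start=1):
    begin = max(a, prev_depart)     # wait for the previous customer to finish
    depart = begin + s
    print(i, a, begin, s, depart, begin - a)   # i, A_i, B_i, S_i, D_i, W_i
    prev_depart = depart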

Example 19-5 (s, S) Inventory Policy

Customer orders for a particular good arrive at a store every day. During a certain one-week period, the quantities ordered are

10 6 11 3 20 6 8.

The store starts the week off with an initial stock of 20. If the stock falls to 5 or below, the owner orders enough from a central warehouse to replenish the stock to 20. Such replenishment orders are placed only at the end of the day and are received before the store opens the next day. There are no customer back orders, so any customer orders that are not filled immediately are lost. This is called an (s, S) inventory system, where the inventory is replenished to S = 20 whenever it hits level s = 5.
The following is a history for this system:

Day   Initial Stock   Customer Order   End Stock   Reorder?   Lost Orders
1     20              10               10          No         0
2     10              6                4           Yes        0
3     20              11               9           No         0
4     9               3                6           No         0
5     6               20               0           Yes        14
6     20              6                14          No         0
7     14              8                6           No         0

We see that at the end of days 2 and 5, replenishment orders were made. In particular, on day 5, the store ran out of stock and lost 14 orders as a result. See Exercise 19-6.
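The history above can be replayed mechanically; the short Python sketch below (ours) does so and recovers the 14 lost orders.

s, S = 5, 20
stock, lost_total = S, 0
for day, order in enumerate([10, 6, 11, 3, 20, 6, 8], start=1):
    filled = min(order, stock)
    lost = order - filled          # unfilled demand is lost (no back orders)
    stock -= filled
    reorder = stock <= s
    print(day, order, stock, reorder, lost)
    if reorder:
        stock = S                  # replenished before the next day opens
    lost_total += lost
print("total lost orders:", lost_total)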

19-2 GENERATION OF RANDOM VARIABLES

All the examples described in Section 19-1 required random variables to drive the simulation. In Examples 19-1 through 19-3, we needed uniform (0,1) random variables; Examples 19-4 and 19-5 used more-complicated random variables to model customer arrivals, service times, and order quantities. This section discusses methods to generate such random variables automatically. The generation of uniform (0,1) random variables is a good place to start, especially since it turns out that uniform (0,1) generation forms the basis for the generation of all other random variables.

19-2.1 Generating Uniform (0,1) Random Variables

There are a variety of methods for generating uniform (0,1) random variables, among them the following:

1. Sampling from certain physical devices, such as an atomic clock.
2. Looking up predetermined random numbers from a table.
3. Generating pseudorandom numbers (PRNs) from a deterministic algorithm.

The most widely used techniques in practice all employ the latter strategy of generating PRNs from a deterministic algorithm. Although, by definition, PRNs are not truly random, there are many algorithms available that produce PRNs that appear to be perfectly random. Further, these algorithms have the advantages of being computationally fast and repeatable-speed is a good property to have for the obvious reasons, while repeatability is desirable for experimenters who want to be able to replicate their simulation results when the runs are conducted under identical conditions.
Perhaps the most popular method for obtaining PRNs is the linear congruential generator (LCG). Here, we start with a nonnegative "seed" integer, X_0, use the seed to generate a sequence of nonnegative integers, X_1, X_2, ..., and then convert the X_i to PRNs, U_1, U_2, .... The algorithm is simple:

1. Specify a nonnegative seed integer, X_0.
2. For i = 1, 2, ..., let X_i = (aX_{i-1} + c) mod (m), where a, c, and m are appropriately chosen integer constants, and where "mod" denotes the modulus function, for example, 17 mod (5) = 2 and -1 mod (5) = 4.
3. For i = 1, 2, ..., let U_i = X_i/m.
Example 19-6

Consider the "toy" generator X_i = (5X_{i-1} + 1) mod (8), with seed X_0 = 0. This produces the integer sequence X_1 = 1, X_2 = 6, X_3 = 7, X_4 = 4, X_5 = 5, X_6 = 2, X_7 = 3, X_8 = 0, whereupon things start repeating, or "cycling." The PRNs corresponding to the sequence starting with seed X_0 = 0 are therefore U_1 = 1/8, U_2 = 6/8, U_3 = 7/8, U_4 = 4/8, U_5 = 5/8, U_6 = 2/8, U_7 = 3/8, U_8 = 0. Since any seed eventually produces all integers 0, 1, ..., 7, we say that this is a full-cycle (or full-period) generator.
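In Python, the LCG recipe above can be written as a generator; the sketch below uses the toy constants of Example 19-6 and reproduces its full cycle.

def lcg(seed, a=5, c=1, m=8):
    # Linear congruential generator: X_i = (a X_{i-1} + c) mod m, U_i = X_i / m.
    x = seed
    while True:
        x = (a * x + c) % m
        yield x / m

gen = lcg(seed=0)
print([next(gen) for _ in range(8)])
# [0.125, 0.75, 0.875, 0.5, 0.625, 0.25, 0.375, 0.0] -- the full cycle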

Example 19-7

Not all generators are full period. Consider another "toy" generator X_i = (3X_{i-1} + 1) mod (7), with seed X_0 = 0. This produces the integer sequence X_1 = 1, X_2 = 4, X_3 = 6, X_4 = 5, X_5 = 2, X_6 = 0, whereupon cycling ensues. Further, notice that for this generator, a seed of X_0 = 3 produces the sequence X_1 = 3 = X_2 = X_3 = ···, not very random looking!
The cycle length of the generator from Example 19-7 obviously depends on the seed chosen, which is a disadvantage. Full-period generators, such as that studied in Example 19-6, obviously avoid this problem. A full-period generator with a long cycle length is given in the following example.



Example 19-8

The generator X_i = 16807 X_{i-1} mod (2^31 - 1) is full period. Since c = 0, this generator is termed a multiplicative LCG and must be used with a seed X_0 ≠ 0. This generator is used in many real-world applications and passes most statistical tests for uniformity and randomness. In order to avoid integer overflow and real-arithmetic round-off problems, Bratley, Fox, and Schrage (1987) offer the following Fortran implementation scheme for this algorithm.

      FUNCTION UNIF(IX)
C     One step of the multiplicative LCG, using Schrage's decomposition
C     of the modulus: 2147483647 = 16807*127773 + 2836.
      K1 = IX/127773
      IX = 16807*(IX - K1*127773) - K1*2836
      IF (IX.LT.0) IX = IX + 2147483647
      UNIF = IX * 4.656612875E-10
      RETURN
      END

In the above program, we input an integer seed IX and receive a PRN UNIF. The seed IX is automatically updated for the next call. Note that in Fortran, integer division results in truncation, for example 15/4 = 3; thus K1 is an integer.

19-2.2 Generating Nonuniform Random Variables

The goal now is to generate random variables from distributions other than the uniform. The methods we will use to do so always start with a PRN and then apply an appropriate transformation to the PRN that gives the desired nonuniform random variable. Such nonuniform random variables are important in simulation for a number of reasons: for example, customer arrivals to a service facility often follow a Poisson process, service times may be normal, and routing decisions are usually characterized by Bernoulli random variables.

Inverse Transform Method for Random Variate Generation

The most basic technique for generating random variables from a uniform PRN relies on the remarkable Inverse Transform Theorem.

Theorem 19-1
If X is a random variable with cumulative distribution function (CDF) F(x), then the random variable Y = F(X) has the uniform (0,1) distribution.

Proof
For ease of exposition, suppose that X is a continuous random variable. Then the CDF of Y is

G(y) = P(Y ≤ y)
     = P(F(X) ≤ y)
     = P(X ≤ F^{-1}(y))   (the inverse exists since F(x) is continuous)
     = F(F^{-1}(y))
     = y.

Since G(y) = y is the CDF of the uniform (0,1) distribution, we are done.

With Theorem 19-1 in hand, it is easy to generate certain random variables. All one has to do is the following:
1. Find the CDF of X, say F(x).

2. Set F(X) = U, where U is a uniform (0,1) PRN.
3. Solve for X = F^{-1}(U).
We illustrate this technique with a series of examples, for both continuous and discrete distributions.

Example 19-9

Here we generate an exponential random variable with rate λ, following the recipe outlined above.
1. The CDF is F(x) = 1 - e^{-λx}, x ≥ 0.
2. Set F(X) = 1 - e^{-λX} = U.
3. Solving for X, we obtain X = F^{-1}(U) = -[ln(1 - U)]/λ.

Thus, if one supplies a uniform (0,1) PRN U, we see that X = -[ln(1 - U)]/λ is an exponential random variable with parameter λ.
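In code, the recipe of Example 19-9 is one line; the Python sketch below (with an assumed rate and seed) checks it against the known mean 1/λ.

import math
import random

def exponential(lam):
    # Inverse transform: X = -ln(1 - U) / lambda, per Example 19-9.
    return -math.log(1 - random.random()) / lam

random.seed(12345)
sample = [exponential(2.0) for _ in range(100000)]
print(sum(sample) / len(sample))   # near 1 / lambda = 0.5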

Example 19-10

Now we try to generate a standard normal random variable, call it Z. Using the special notation Φ(·) for the standard normal (0,1) CDF, we set Φ(Z) = U, so that Z = Φ^{-1}(U). Unfortunately, the inverse CDF does not exist in closed form, so one must resort to the use of standard normal tables (or other approximations). For instance, if we have U = 0.72, then Table II (Appendix) yields Z = Φ^{-1}(0.72) ≈ 0.583.

Example 19-11

We can extend the previous example to generate any normal random variable, that is, one with arbitrary mean and variance. This follows easily, since if Z is standard normal, then X = μ + σZ is normal with mean μ and variance σ². For instance, suppose we are interested in generating a normal variate X with mean μ = 3 and variance σ² = 4. Then if, as in the previous example, U = 0.72, we obtain Z ≈ 0.583 and, as a consequence, X ≈ 3 + 2(0.583) = 4.166.

Example 19-12

We can also use the ideas from Theorem 19-1 to generate realizations from discrete random variables. Suppose that the discrete random variable X has probability function

p(x) = 0.3   if x = -1,
       0.6   if x = 2.3,
       0.1   if x = 7,
       0     otherwise.

To generate variates from this distribution, we set up the following table, where F(x) is the associated CDF and U denotes the set of uniform (0,1) PRNs corresponding to each x-value:

x      p(x)    F(x)    U
-1     0.3     0.3     [0, 0.3)
2.3    0.6     0.9     [0.3, 0.9)
7      0.1     1.0     [0.9, 1.0)

To generate a realization of X, we first generate a PRN U and then read the corresponding x-value from the table. For instance, if U = 0.43, then X = 2.3.

Other Random Variate Generation Methods

Although the inverse transform method is intuitively pleasing, it may sometimes be difficult to apply in practice. For instance, closed-form expressions for the inverse CDF, F^{-1}(U), might not exist, as is the case for the normal distribution, or application of the method might be unnecessarily tedious. We now present a small potpourri of interesting methods to generate a variety of random variables.

Box-Muller Method

The Box-Muller (1958) method is an exact technique for generating independent and identically distributed (IID) standard normal (0,1) random variables. The appropriate theorem, stated without proof, is

Theorem 19-2
Suppose that U_1 and U_2 are IID uniform (0,1) random variables. Then

Z_1 = √(-2 ln(U_1)) cos(2πU_2)

and

Z_2 = √(-2 ln(U_1)) sin(2πU_2)

are IID standard normal random variates.

Note that the sine and cosine evaluations must be carried out in radians.

Example 19-13

Suppose that U_1 = 0.35 and U_2 = 0.65 are two IID PRNs. Using the Box-Muller method to generate two normal (0,1) random variates, we obtain

Z_1 = √(-2 ln(0.35)) cos(2π(0.65)) = -0.8517,
Z_2 = √(-2 ln(0.35)) sin(2π(0.65)) = -1.172.
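A direct Python transcription of Theorem 19-2 reproduces these numbers:

import math

def box_muller(u1, u2):
    # Two IID standard normals from two IID uniforms (Theorem 19-2).
    r = math.sqrt(-2 * math.log(u1))
    return r * math.cos(2 * math.pi * u2), r * math.sin(2 * math.pi * u2)

print(box_muller(0.35, 0.65))   # about (-0.8517, -1.172), as in the example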

Central Limit Theorem

One can also use the Central Limit Theorem (CLT) to generate "quick-and-dirty" random variables that are approximately normal. Suppose that U_1, U_2, ..., U_n are IID PRNs. Then for large enough n, the CLT says that

Z = [Σ_{i=1}^n U_i - E(Σ_{i=1}^n U_i)]/√(Var(Σ_{i=1}^n U_i)) = [Σ_{i=1}^n U_i - n/2]/√(n/12) ≈ N(0,1).

In particular, the choice n = 12 (which turns out to be "large enough") yields the convenient approximation

Σ_{i=1}^{12} U_i - 6 ≈ N(0,1).

Example 19-14

Suppose we have the following PRNs:

0.28 0.87 0.44 0.49 0.10 0.76 0.65 0.98 0.24 0.29 0.77 0.90.

Then

Σ_{i=1}^{12} U_i - 6 = 0.77

is a realization from a distribution that is approximately standard normal.
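In Python, the n = 12 approximation is a one-liner; the seed is arbitrary.

import random

random.seed(12345)
z = sum(random.random() for _ in range(12)) - 6   # approximately N(0, 1)
print(z)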

Convolution

Another popular trick involves the generation of random variables via convolution, the name indicating that some sort of sum is involved.

Example 19-15

Suppose that X_1, X_2, ..., X_n are IID exponential random variables with rate λ. Then Y = Σ_{i=1}^n X_i is said to have an Erlang distribution with parameters n and λ. It turns out that this distribution has probability density function

f(y) = λ^n y^{n-1} e^{-λy}/(n - 1)!,   y > 0,    (19-4)

which readers may recognize as a special case of the gamma distribution (see Exercise 19-16).
This distribution's CDF is too difficult to invert directly. One way that comes to mind to generate a realization from the Erlang is simply to generate and then add up n IID exponential(λ) random variables. The following scheme is an efficient way to do precisely that. Suppose that U_1, U_2, ..., U_n are IID PRNs. From Example 19-9, we know that X_i = -(1/λ)ln(1 - U_i), i = 1, 2, ..., n, are IID exponential(λ) random variables. Therefore, we can write

Y = Σ_{i=1}^n X_i = -(1/λ) Σ_{i=1}^n ln(1 - U_i) = -(1/λ) ln Π_{i=1}^n (1 - U_i).

This implementation is quite efficient, since it requires only one execution of a natural log operation. In fact, we can even do slightly better from an efficiency point of view-simply note that both U_i and (1 - U_i) are uniform (0,1). Then

Y = -(1/λ) ln Π_{i=1}^n U_i

is also Erlang.
To illustrate, suppose that we have three IID PRNs at our disposal, U_1 = 0.23, U_2 = 0.97, and U_3 = 0.48. To generate an Erlang realization with parameters n = 3 and λ = 2, we simply take

Y = -(1/2) ln(U_1 U_2 U_3) = -(1/2) ln((0.23)(0.97)(0.48)) = 1.117.
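A Python sketch of the one-logarithm Erlang generator follows (the function name is our own); the last line reproduces the numerical illustration above.

import math
import random

def erlang(n, lam):
    # Y = -(1/lambda) ln(U_1 U_2 ... U_n), as derived above.
    prod = 1.0
    for _ in range(n):
        prod *= random.random()
    return -math.log(prod) / lam

print(-math.log(0.23 * 0.97 * 0.48) / 2)   # about 1.117, as in Example 19-15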

Acceptance-Rejection

One of the most popular classes of random variate generation procedures proceeds by sampling PRNs until some appropriate "acceptance" criterion is met.

Example 19-16

An easy example of the acceptance-rejection technique involves the generation of a geometric random variable with success probability p. To this end, consider a sequence of PRNs U_1, U_2, .... Our aim is to generate a geometric realization X, that is, one that has probability function

p(x) = (1 - p)^{x-1} p   if x = 1, 2, ...,
     = 0                 otherwise.

In words, X represents the number of Bernoulli trials until the first success is observed. This English characterization immediately suggests an elementary acceptance-rejection algorithm.

1. Initialize i ← 0.
2. Let i ← i + 1.
3. Take a Bernoulli(p) observation,

Y_i = 1   if U_i < p,
    = 0   otherwise.

4. If Y_i = 1, then we have our first success and we stop, in which case we accept X = i. Otherwise, if Y_i = 0, then we reject and go back to step 2.

To illustrate, let us generate a geometric variate having success probability p = 0.3. Suppose we have at our disposal the following PRNs:

0.38 0.67 0.24 0.89 0.10 0.71.

Since U_1 = 0.38 ≥ p, we have Y_1 = 0, and so we reject X = 1. Since U_2 = 0.67 ≥ p, we have Y_2 = 0, and so we reject X = 2. Since U_3 = 0.24 < p, we have Y_3 = 1, and so we accept X = 3.
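A Python sketch of this acceptance-rejection scheme follows (ours; the sanity check at the end uses the known geometric mean 1/p).

import random

def geometric(p):
    # Count Bernoulli(p) trials until the first success and accept that count.
    i = 0
    while True:
        i += 1
        if random.random() < p:   # success: accept X = i
            return i

random.seed(12345)
draws = [geometric(0.3) for _ in range(100000)]
print(sum(draws) / len(draws))    # near 1 / p = 3.33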

19-3 OUTPUT ANALYSIS

Simulation output analysis is one of the most important aspects of any proper and complete simulation study. Since the input processes driving a simulation are usually random variables (e.g., interarrival times, service times, and breakdown times), we must also regard the output from the simulation as random. Thus, runs of the simulation only yield estimates of measures of system performance (e.g., the mean customer waiting time). These estimators are themselves random variables and are therefore subject to sampling error, and sampling error must be taken into account to make valid inferences concerning system performance.

The problem is that simulations almost never produce convenient raw output that is IID normal data. For example, consecutive customer waiting times from a queueing system

• are not independent-typically, they are serially correlated; if one customer at the post office waits in line a long time, then the next customer is also likely to wait a long time.
• are not identically distributed; customers showing up early in the morning might have a much shorter wait than those who show up just before closing time.
• are not normally distributed-they are usually skewed to the right (and are certainly never less than zero).

The point is that it is difficult to apply "classical" statistical techniques to the analysis of simulation output. Our purpose here is to give methods to perform statistical analysis of output from discrete-event computer simulations. To facilitate the presentation, we identify two types of simulations with respect to output analysis: terminating and steady state simulations.
1. Terminating (or transient) simulations. Here, the nature of the problem explicitly defines the length of the simulation run. For instance, we might be interested in simulating a bank that closes at a specific time each day.
2. Nonterminating (steady state) simulations. Here, the long-run behavior of the system is studied. Presumably this "steady state" behavior is independent of the simulation's initial conditions. An example is that of a continuously running production line for which the experimenter is interested in some long-run performance measure.

Techniques to analyze output from terminating simulations are based on the method of independent replications, discussed in Section 19-3.1. Additional problems arise for steady state simulations. For instance, we must now worry about the problem of starting the simulation: how should it be initialized at time zero, and how long must it be run before data representative of steady state can be collected? Initialization problems are considered in Section 19-3.2. Finally, Section 19-3.3 deals with point and confidence interval estimation for steady state simulation performance parameters.

19-3.1 Terminating Simulation Analysis

Here we are interested in simulating some system of interest over a finite time horizon. For now assume we obtain discrete simulation output Y_1, Y_2, ..., Y_m, where the number of observations m can be a constant or a random variable. For example, the experimenter can specify the number m of customer waiting times Y_1, Y_2, ..., Y_m to be taken from a queueing simulation, or m could denote the random number of customers observed during a specified time period [0, T].
Alternatively, we might observe continuous simulation output {Y(t) | 0 ≤ t ≤ T} over a specified interval [0, T]. For instance, if we are interested in estimating the time-averaged number of customers waiting in a queue during [0, T], the quantity Y(t) would be the number of customers in the queue at time t.
The easiest goal is to estimate the expected value of the sample mean of the observations,

θ ≡ E[Ȳ_m],

where the sample mean in the discrete case is

Ȳ_m = (1/m) Σ_{i=1}^m Y_i

(with a similar expression for the continuous case). For instance, we might be interested in estimating the expected average waiting time of all customers at a shopping center during the period 10 a.m. to 2 p.m.
Although Ȳ_m is an unbiased estimator for θ, a proper statistical analysis requires that we also provide an estimate of Var(Ȳ_m). Since the Y_i are not necessarily IID random variables, it may be that Var(Ȳ_m) ≠ Var(Y_i)/m for any i, a case not covered in elementary statistics courses.
For this reason, the familiar sample variance

S² = [1/(m - 1)] Σ_{i=1}^m (Y_i - Ȳ_m)²

is likely to be highly biased as an estimator of m·Var(Ȳ_m). Thus, one should not use S²/m to estimate Var(Ȳ_m).
The way around the problem is via the method of independent replications (IR). IR estimates Var(Ȳ_m) by conducting b independent simulation runs (replications) of the system under study, where each replication consists of m observations. It is easy to make the replications independent: simply reinitialize each replication with a different pseudorandom number seed.
To proceed, denote the sample mean from replication i by

Z_i = (1/m) Σ_{j=1}^m Y_{i,j},

where Y_{i,j} is observation j from replication i, for i = 1, 2, ..., b and j = 1, 2, ..., m.
If each run is started under the same operating conditions (e.g., all queues empty and idle), then the replication sample means Z_1, Z_2, ..., Z_b are IID random variables. Then the obvious point estimator for Var(Ȳ_m) = Var(Z_i) is

V̂_R ≡ [1/(b - 1)] Σ_{i=1}^b (Z_i - Z̄_b)²,

where the grand mean is defined as

Z̄_b = (1/b) Σ_{i=1}^b Z_i.

Notice how closely the forms of V̂_R and S²/m resemble each other. But since the replicate sample means are IID, V̂_R is usually much less biased for Var(Ȳ_m) than is S²/m.
In light of the above discussion, we see that V̂_R/b is a reasonable estimator for Var(Z̄_b). Further, if the number of observations per replication, m, is large enough, the Central Limit Theorem tells us that the replicate sample means are approximately IID normal.
Then basic statistics (Chapter 10) yields an approximate 100(1 - α)% two-sided confidence interval (CI) for θ,

θ ∈ Z̄_b ± t_{α/2,b-1} √(V̂_R/b),    (19-5)

where t_{α/2,b-1} is the 1 - α/2 percentage point of the t distribution with b - 1 degrees of freedom.

Suppose we want to estimate the expected average waiting time for the first 5000 customers in a certain queueing system. We will make five independent replications of the system, with each run initialized empty and idle and consisting of 5000 waiting times. The resulting replicate means are

i      1     2     3     4     5
Z_i    3.2   4.3   5.1   4.2   4.6

Then Z̄_5 = 4.28 and V̂_R = 0.487. For level α = 0.05, we have t_{0.025,4} = 2.78, and equation 19-5 gives [3.41, 5.15] as a 95% CI for the expected average waiting time for the first 5000 customers.
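The arithmetic of this example is easily verified; in the Python sketch below (ours), 2.776 is the t_{0.025,4} quantile, entered by hand from tables.

import math

Z = [3.2, 4.3, 5.1, 4.2, 4.6]             # the five replicate means
b = len(Z)
zbar = sum(Z) / b
VR = sum((z - zbar)**2 for z in Z) / (b - 1)
half = 2.776 * math.sqrt(VR / b)           # t-quantile times the standard error
print(zbar, VR, (zbar - half, zbar + half))   # 4.28, 0.487, about (3.41, 5.15)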

Independent replications can be used to calculate variance estimates for statistics other than sample means. The method can then be used to obtain CIs for quantities other than E[Ȳ_m], for example quantiles. See any of the standard simulation texts for additional uses of independent replications.

19-3.2 Initialization Problems

Before a simulation can be run, one must provide initial values for all of the simulation's state variables. Since the experimenter may not know what initial values are appropriate for the state variables, these values might be chosen somewhat arbitrarily. For instance, we might decide that it is "most convenient" to initialize a queue as empty and idle. Such a choice of initial conditions can have a significant but unrecognized impact on the simulation run's outcome. Thus, the initialization bias problem can lead to errors, particularly in steady state output analysis.
Some examples of problems concerning simulation initialization are as follows.

1. Visual detection of initialization effects is sometimes difficult, especially in the case of stochastic processes having high intrinsic variance, such as queueing systems.
2. How should the simulation be initialized? Suppose that a machine shop closes at a certain time each day, even if there are jobs waiting to be served. One must therefore be careful to start each day with a demand that depends on the number of jobs remaining from the previous day.
3. Initialization bias can lead to point estimators for steady state parameters having high mean squared error, as well as to CIs having poor coverage.

Since initialization bias raises important concerns, how do we detect and deal with it? We first list methods to detect it.

1. Attempt to detect the bias visually by scanning a realization of the simulated process. This might not be easy, since visual analysis can miss bias. Further, a visual scan can be tedious. To make the visual analysis more efficient, one might transform the data (e.g., take logs or square roots), smooth it, average it across several independent replications, or construct moving average plots.

2. Conduct statistical tests for initialization bias. Kelton and Law (1983) give an intuitively appealing sequential procedure to detect bias. Various other tests check to see whether the initial portion of the simulation output contains more variation than latter portions.

If initialization bias is detected, one may want to do something about it. There are two simple methods for dealing with bias. One is to truncate the output by allowing the simulation to "warm up" before data are retained for analysis. The experimenter would then hope that the remaining data are representative of the steady state system. Output truncation is probably the most popular method for dealing with initialization bias, and all of the major simulation languages have built-in truncation functions. But how can one find a good truncation point? If the output is truncated "too early," significant bias might still exist in the remaining data. If it is truncated "too late," then good observations might be wasted. Unfortunately, simple rules to determine truncation points do not perform well in general. A common practice is to average observations across several replications and then visually choose a truncation point based on the averaged run. See Welch (1983) for a good visual/graphical approach.
The second method is to make a very long run to overwhelm the effects of initialization bias. This method of bias control is conceptually simple to carry out and may yield point estimators having lower mean squared errors than the analogous estimators from truncated data (see, e.g., Fishman 1978). However, a problem with this approach is that it can be wasteful with observations; for some systems, an excessive run length might be required before the initialization effects are rendered negligible.

19-3.3 Steady State Simulation Analysis

Now assume that we have on hand stationary (steady state) simulation output, Y_1, Y_2, ..., Y_n. Our goal is to estimate some parameter of interest, possibly the mean customer waiting time or the expected profit produced by a certain factory configuration. As in the case of terminating simulations, a good statistical analysis must accompany the value of any point estimator with a measure of its variance.
A number of methodologies have been proposed in the literature for conducting steady state output analysis: batch means, independent replications, standardized time series, spectral analysis, regeneration, time series modeling, as well as a host of others. We will examine the two most popular: batch means and independent replications. (Recall: As discussed earlier, confidence intervals for terminating simulations usually use independent replications.)

Batch Means
The method of batch means is often used to estimate Var(Ȳ_n) or to calculate CIs for the steady state process mean μ. The idea is to divide one long simulation run into a number of contiguous batches, and then to appeal to a Central Limit Theorem to assume that the resulting batch sample means are approximately IID normal.
In particular, suppose that we partition Y_1, Y_2, ..., Y_n into b nonoverlapping, contiguous batches, each consisting of m observations (assume that n = bm). Thus, the ith batch, i = 1, 2, ..., b, consists of the random variables

Y_{(i-1)m+1}, Y_{(i-1)m+2}, ..., Y_{im}.

The ith batch mean is simply the sample mean of the m observations from batch i, i = 1, 2, ..., b,

Z_i = (1/m) Σ_{j=1}^m Y_{(i-1)m+j}.

Similar to independent replications, we define the batch means estimator for Var(Z_i) as

V̂_B ≡ [1/(b - 1)] Σ_{i=1}^b (Z_i - Z̄_b)²,

where

Ȳ_n = Z̄_b = (1/b) Σ_{i=1}^b Z_i

is the grand sample mean. If m is large, then the batch means are approximately IID normal, and (as for IR) we obtain an approximate 100(1 - α)% CI for μ,

μ ∈ Z̄_b ± t_{α/2,b-1} √(V̂_B/b).

This equation is very similar to equation 19-5. Of course, the difference here is that batch means divides one long run into a number of batches, whereas independent replications uses a number of independent shorter runs. Indeed, consider the old IR example from Section 19-3.1 with the understanding that the Z_i must now be regarded as batch means (instead of replicate means); then the same numbers carry through the example.
The technique of batch means is intuitively appealing and easy to understand. But problems can come up if the Y_j are not stationary (e.g., if significant initialization bias is present), if the batch means are not normal, or if the batch means are not independent. If any of these assumption violations exist, poor confidence interval coverage may result, unbeknownst to the analyst.
To ameliorate the initialization bias problem, the user can truncate some of the data or make a long run, as discussed in Section 19-3.2. In addition, the lack of independence or normality of the batch means can be countered by increasing the batch size m.
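A Python sketch of the batch means procedure follows; it is our own illustration, with the t quantile supplied by the user (e.g., from tables) and made-up "stationary" output standing in for real simulation data.

import math
import random

def batch_means_ci(y, b, t_quantile):
    # Split one long run into b batches of size m and form the CI for the mean.
    m = len(y) // b
    Z = [sum(y[i * m:(i + 1) * m]) / m for i in range(b)]   # batch means
    zbar = sum(Z) / b
    VB = sum((z - zbar)**2 for z in Z) / (b - 1)
    half = t_quantile * math.sqrt(VB / b)
    return zbar - half, zbar + half

random.seed(12345)
y = [random.gauss(10, 2) for _ in range(5000)]   # illustrative stationary data
print(batch_means_ci(y, b=5, t_quantile=2.776))  # t_{0.025,4} from tables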

Independent Replications

Of the difficulties encountered when using batch means, the possibility of correlation among the batch means might be the most troublesome. This problem is explicitly avoided by the method of IR, described in the context of terminating simulations in Section 19-3.1. In fact, the replicate means are independent by their construction. Unfortunately, since each of the b replications has to be started properly, initialization bias presents more trouble when using IR than when using batch means. The usual recommendation, in the context of steady state analysis, is to use batch means over IR because of the possible initialization bias in each of the replications.

19-4 COMPARISON OF SYSTEMS

One of the most important uses of simulation output analysis regards the comparison of competing systems or alternative system configurations. For example, suppose we wish to evaluate two different "restart" strategies that an airline can invoke following a major traffic disruption, such as a snowstorm in the Northeast. Which policy minimizes a certain cost function associated with the restart? Simulation is uniquely equipped to help the experimenter conduct this type of comparison analysis.
There are many techniques available for comparing systems, among them (i) classical statistical CIs, (ii) common random numbers, (iii) antithetic variates, and (iv) ranking and selection procedures.

19-4.1 Classical Confidence Intervals


With our airline example in mind, let Z_{i,j} be the cost from the jth simulation replication of strategy i, i = 1, 2, j = 1, 2, ..., b_i. Assume that Z_{i,1}, Z_{i,2}, ..., Z_{i,b_i} are IID normal with unknown mean μ_i and unknown variance, i = 1, 2, an assumption that can be justified by arguing that we can do the following:
1. Get independent data by controlling the random numbers between replications.
2. Get identically distributed costs between replications by performing the replications under identical conditions.
3. Get approximately normal data by adding up (or averaging) many subcosts to obtain overall costs for both strategies.
The goal here is to calculate a 100(1 - α)% CI for the difference μ_1 - μ_2. To this end, suppose that the Z_{1,j} are independent of the Z_{2,j} and define

$\bar{Z}_i = \frac{1}{b_i}\sum_{j=1}^{b_i} Z_{i,j}, \qquad i = 1, 2,$

and

$S_i^2 = \frac{1}{b_i - 1}\sum_{j=1}^{b_i}\left(Z_{i,j} - \bar{Z}_i\right)^2, \qquad i = 1, 2.$

An approximate 100(1 - α)% CI is

$\mu_1 - \mu_2 \in \bar{Z}_1 - \bar{Z}_2 \pm t_{\alpha/2,\,\nu}\sqrt{S_1^2/b_1 + S_2^2/b_2},$

where the approximate degrees of freedom ν (a function of the sample variances) is given in Chapter 10.
Suppose (as in the airline example) that small cost is good. If the interval lies entirely to the left [right] of zero, then system 1 [2] is better; if the interval contains zero, then the two systems must be regarded, in a statistical sense, as about the same.
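The interval is easy to compute in practice. The sketch below, again in illustrative Python, uses Welch's approximation for the degrees of freedom ν; the replicate costs are hypothetical numbers invented for this example.

```python
import numpy as np
from scipy import stats

# Hypothetical replicate costs for the two restart strategies.
z1 = np.array([102.1,  98.4, 110.3, 105.6,  99.8])
z2 = np.array([111.7, 108.2, 115.0, 109.4, 113.6])

v1 = z1.var(ddof=1) / len(z1)      # S_1^2 / b_1
v2 = z2.var(ddof=1) / len(z2)      # S_2^2 / b_2
diff = z1.mean() - z2.mean()
se = np.sqrt(v1 + v2)

# Welch's approximate degrees of freedom (a function of the variances).
nu = (v1 + v2) ** 2 / (v1**2 / (len(z1) - 1) + v2**2 / (len(z2) - 1))
hw = stats.t.ppf(0.975, nu) * se
print(f"95% CI for mu1 - mu2: [{diff - hw:.2f}, {diff + hw:.2f}]")
```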
An alternative classical strategy is to use a CI that is analogous to a paired t-test. Here we take b replications from both strategies and set the differences D_j = Z_{1,j} - Z_{2,j} for j = 1, 2, ..., b. Then we calculate the sample mean and variance of the differences:

$\bar{D}_b = \frac{1}{b}\sum_{j=1}^{b} D_j \quad\text{and}\quad S_D^2 = \frac{1}{b-1}\sum_{j=1}^{b}\left(D_j - \bar{D}_b\right)^2.$

The resulting 100(1 - α)% CI is

$\mu_1 - \mu_2 \in \bar{D}_b \pm t_{\alpha/2,\,b-1}\sqrt{S_D^2/b}.$
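A corresponding sketch for the paired-t interval follows (the paired costs are again hypothetical; the pairing assumes replication j of each strategy is run under comparable conditions).

```python
import numpy as np
from scipy import stats

# Hypothetical costs; entry j of z1 is paired with entry j of z2.
z1 = np.array([102.1,  98.4, 110.3, 105.6,  99.8])
z2 = np.array([106.9, 103.0, 116.2, 110.8, 104.1])

d = z1 - z2                         # differences D_j
b = len(d)
hw = stats.t.ppf(0.975, b - 1) * np.sqrt(d.var(ddof=1) / b)
print(f"95% CI for mu1 - mu2: {d.mean():.2f} +/- {hw:.2f}")
```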

These paired t intervals are very efficient if Corr(Z_{1,j}, Z_{2,j}) > 0, j = 1, 2, ..., b (where we still assume that Z_{1,1}, Z_{1,2}, ..., Z_{1,b} are IID and Z_{2,1}, Z_{2,2}, ..., Z_{2,b} are IID). In that case, it turns out that

$V(\bar{D}_b) < \frac{V(Z_{1,j})}{b} + \frac{V(Z_{2,j})}{b}.$

If Z_{1,j} and Z_{2,j} had been simulated independently, then we would have equality in the above expression. Thus, the trick may result in a relatively small S_D^2 and, hence, a small CI length. So how do we invoke the trick?

19-4.2 Common Random Numbers


The idea behind the above trick is to use common random numbers, that is, to use the same pseudorandom numbers in exactly the same ways for corresponding runs of each of the competing systems. For example, we might use precisely the same customer arrival times when simulating different proposed configurations of a job shop. By subjecting the alternative systems to identical experimental conditions, we hope to make it easy to distinguish which systems are best even though the respective estimators are subject to sampling error.
Consider the case in which we compare two queueing systems, A and B, on the basis of their expected customer transit times, θ_A and θ_B, where the smaller θ-value corresponds to the better system. Suppose we have estimators θ̂_A and θ̂_B for θ_A and θ_B, respectively. We will declare A as the better system if θ̂_A < θ̂_B. If θ̂_A and θ̂_B are simulated independently, then the variance of their difference,

$V(\hat{\theta}_A - \hat{\theta}_B) = V(\hat{\theta}_A) + V(\hat{\theta}_B),$

could be very large, in which case our declaration might lack conviction. If we could reduce V(θ̂_A - θ̂_B), then we could be much more confident about our declaration.
CRN sometimes induces a high positive correlation between the point estimators θ̂_A and θ̂_B. Then we have

$V(\hat{\theta}_A - \hat{\theta}_B) = V(\hat{\theta}_A) + V(\hat{\theta}_B) - 2\,\mathrm{Cov}(\hat{\theta}_A, \hat{\theta}_B) < V(\hat{\theta}_A) + V(\hat{\theta}_B),$

and we obtain a savings in variance.
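The following toy experiment illustrates the variance savings. Two hypothetical systems with exponential transit times (means 1.0 and 1.1) are simulated by the inverse transform method, once with separate uniform streams and once with a common stream; everything here is an illustrative sketch, not a prescription.

```python
import numpy as np

rng = np.random.default_rng(42)
reps, m = 1000, 100     # number of difference estimates; obs per estimate

diffs_indep, diffs_crn = [], []
for _ in range(reps):
    u_a = rng.random(m)
    u_b = rng.random(m)
    # Inverse transform: -theta * ln(U) is exponential with mean theta.
    # Independent sampling: each system gets its own uniforms.
    diffs_indep.append(np.mean(-1.0 * np.log(u_a))
                       - np.mean(-1.1 * np.log(u_b)))
    # Common random numbers: both systems reuse the same uniforms,
    # so most of the sampling noise cancels in the difference.
    diffs_crn.append(np.mean(-1.0 * np.log(u_a))
                     - np.mean(-1.1 * np.log(u_a)))

print("Var of difference, independent:", np.var(diffs_indep))
print("Var of difference, CRN:        ", np.var(diffs_crn))
```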

19-4.3 Antithetic Random Numbers


Alternatively, if we can induce negative correlation between two unbiased estimators θ̂_1 and θ̂_2 for some parameter θ, then the unbiased estimator (θ̂_1 + θ̂_2)/2 might have low variance.
Most simulation texts give advice on how to run the simulations of the competing systems so as to induce positive or negative correlation between them. The consensus is that if conducted properly, common and antithetic random numbers can lead to tremendous variance reductions.
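As a small illustration, consider estimating E[e^U] for U uniform on (0,1). Since e^u is monotone in u, pairing U with 1 - U induces the required negative correlation; the sketch below compares the antithetic estimator with plain sampling at the same budget of uniforms. (The example is ours, not a prescribed method.)

```python
import numpy as np

rng = np.random.default_rng(7)
n = 10000

u = rng.random(n)
theta1 = np.exp(u)               # estimator based on U
theta2 = np.exp(1.0 - u)         # estimator based on the antithetic 1 - U
pairs = (theta1 + theta2) / 2.0  # unbiased pair averages

plain = np.exp(rng.random(2 * n))  # plain sampling, same budget of uniforms

# Variance of the overall mean under each scheme.
print("antithetic:", pairs.var() / n)
print("plain:     ", plain.var() / (2 * n))
```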

19-4.4 Selecting the Best System


Ranking, selection, and multiple comparisons methods form another class of statistical techniques used to compare alternative systems. Here, the experimenter is interested in selecting the best of a number of competing processes. Typically, one specifies the desired probability of correctly selecting the best process, especially if the best process is significantly better than its competitors. These methods are simple to use, fairly general, and intuitively appealing. See Bechhofer, Santner, and Goldsman (1995) for a synopsis of the most popular procedures.

19-5 SUMMARY
This chapter began with some simple motivational examples illustrating various simulation concepts. After this, the discussion turned to the generation of pseudorandom numbers, that is, numbers that appear to be IID uniform (0,1). PRNs are important because they drive the generation of a number of other important random variables, for example, normal, exponential, and Erlang. We also spent a great deal of discussion on simulation output analysis; simulation output is almost never IID, so special care must be taken if we are to make statistically valid conclusions about the simulation's results. We concentrated on output analysis for both terminating and steady state simulations.

19-6 EXERCISES
19-1. Extension of Example 19-1.
(a) Flip a coin 100 times. How many heads do you observe?
(b) How many times do you observe two heads in a row? Three in a row? Four? Five?
(c) Find 10 friends and repeat (a) and (b) based on a total of 1000 flips.
(d) Now simulate coin flips via a spreadsheet program. Flip the simulated coin 10,000 times and answer (a) and (b).

19-2. Extension of Example 19-2. Throw n darts randomly at a unit square containing an inscribed circle. Use the results of your tosses to estimate π. Let n = 2^k for k = 1, 2, ..., 15, and graph your estimates as a function of k.

19-3. Extension of Example 19-3. Show that Î, defined in equation 19-3, is unbiased for the integral I, defined in equation 19-2.

19-4. Other extensions of Example 19-3.
(a) Use Monte Carlo integration with n = 10 observations to estimate $\int_0^2 \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx$. Now use n = 1000. Compare to the answer that you can obtain via normal tables.
(b) What would you do if you had to estimate $\int_0^{10} \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx$?
(c) Use Monte Carlo integration with n = 10 observations to estimate $\int_0^1 \cos(2\pi x)\,dx$. Now use n = 1000. Compare to the actual answer.

19-5. Extension of Example 19-4. Suppose that 10 customers arrive at a post office at the following times:
3 4 6 7 13 14 20 25 28 30
Upon arrival, customers queue up in front of a single clerk and are processed in a first-come-first-served manner. The service times corresponding to the arriving customers are as follows:
6.0 5.5 4.0 1.0 2.5 2.0 2.0 2.5 4.0 2.5
Assume that the post office opens at time 0 and closes its doors at time 30 (just after customer 10 arrives), serving any remaining customers.
(a) When does the last customer finally leave the system?
(b) What is the average waiting time for the 10 customers?
(c) What is the maximum number of customers in the system? When is this maximum achieved?
(d) What is the average number of customers in line during the first 30 minutes?
(e) Now repeat parts (a)-(d) assuming that the services are performed last-in-first-out.

19-6. Repeat Example 19-5, which deals with an (s, S) inventory policy, except now use order level s = 6.

19-7. Consider the pseudorandom number generator X_i = (5 X_{i-1} + 1) mod(16), with seed X_0 = 0.
(a) Calculate X_1 and X_2, along with the corresponding PRNs U_1 and U_2.
(b) Is this a full-period generator?
(c) What is X_150?

19-8. Consider the "recommended" pseudorandom number generator X_i = 16807 X_{i-1} mod(2^31 - 1), with seed X_0 = 1234567.
(a) Calculate X_1 and X_2, along with the corresponding PRNs U_1 and U_2.
(b) What is X_100,000?

19-9. Show how to use the inverse transform method to generate an exponential random variable with rate λ = 2. Demonstrate your technique using the PRN U = 0.75.

19-10. Consider the inverse transform method to generate a standard normal (0,1) random variable.
(a) Demonstrate your technique using the PRN U = 0.25.
(b) Using your answer in (a), generate a N(0,9) random variable.

19-11. Suppose that X has probability density function f(x) = |x|/4, -2 < x < 2.
(a) Develop an inverse transform technique to generate a realization of X.
(b) Demonstrate your technique using U = 0.6.
(c) Sketch out f(x) and see if you can come up with another method to generate X.

19-12. Suppose that the discrete random variable X has probability function

p(x) = 0.35 if x = -2.5,
       0.25 if x = 1.0,
       0.40 if x = 10.5,
       0    otherwise.

As in Example 19-12, set up a table to generate realizations from this distribution. Illustrate your technique with the PRN U = 0.86.

19-13. The Weibull (α, β) distribution, popular in reliability theory and other applied statistics disciplines, has CDF

$F(x) = 1 - e^{-(x/\beta)^{\alpha}}$ if x > 0, and F(x) = 0 otherwise.

(a) Show how to use the inverse transform method to generate a realization from the Weibull distribution.
(b) Demonstrate your technique for a Weibull (1.5, 2.0) random variable using the PRN U = 0.66.

19-14. Suppose that U_1 = 0.45 and U_2 = 0.12 are two IID PRNs. Use the Box-Muller method to generate two N(0,1) variates.

19-15. Consider the following PRNs:
0.88 0.87 0.33 0.69 0.20 0.79 0.21 0.96 0.11 0.42 0.91 0.70
Use the Central Limit Theorem method to generate a realization that is approximately standard normal.

19-16. Prove equation 19-4 from the text. This shows that the sum of n IID exponential random variables is Erlang. Hint: Find the moment-generating function of Y, and compare it to that of the gamma distribution.

19-17. Using two PRNs, U_1 = 0.73 and U_2 = 0.11, generate a realization from an Erlang distribution with n = 2 and λ = 3.

19-18. Suppose that U_1, U_2, ..., U_n are PRNs.
(a) Suggest an easy inverse transform method to generate a sequence of IID Bernoulli random variables, each with success parameter p.
(b) Show how to use your answer to (a) to generate a binomial random variate with parameters n and p.

19-19. Use the acceptance-rejection technique to generate a geometric random variable with success probability 0.25. Use as many of the PRNs from Exercise 19-15 as necessary.

19-20. Suppose that Z̄_1 = 3, Z̄_2 = 5, and Z̄_3 = 4 are three batch means resulting from a long simulation run. Find a 90% two-sided confidence interval for the mean.

19-21. Suppose that μ ∈ [-2.5, 3.5] is a 90% confidence interval for the mean cost incurred by a certain inventory policy. Further suppose that this interval was based on five independent replications of the underlying inventory system. Unfortunately, the boss has decided that she wants a 95% confidence interval. Can you supply it?

19-22. The yearly unemployment rates for Andorra during the past 15 years are as follows:
6.9 8.3 8.8 11.4 11.8 12.1 10.6 11.0 9.9 9.2 12.3 13.9 9.2 8.2 8.9
Use the method of batch means on the above data to obtain a two-sided 95% confidence interval for the mean unemployment. Use five batches, each consisting of three years' data.

19-23. Suppose that we are interested in steady state confidence intervals for the mean of simulation output X_1, X_2, ..., X_10,000. (You can pretend that these are waiting times.) We have conveniently divided the run up into five batches, each of size 2000; suppose that the resulting batch means are as follows:
100 80 90 110 120
Use the method of batch means on the above data to obtain a two-sided 95% confidence interval for the mean.

19-24. The yearly total snowfall figures for Siberacuse, NY, during the past 15 years are as follows:
100 103 88 72 98 121 106 110 99 162 123 139 92 142 169
(a) Use the method of batch means on the above data to obtain a two-sided 95% confidence interval for the mean yearly snowfall. Use five batches, each consisting of three years' data.
(b) The corresponding yearly total snowfall figures for Buffoonalo, NY (which is down the road from Siberacuse), are as follows:
90 95 72 68 95 110 112 90 75 144 110 123 81 130 145
How does Buffoonalo's snowfall compare to Siberacuse's? Just give an eyeball answer.
(c) Now find a 95% confidence interval for the difference in means between the two cities. Hint: Think common random numbers.

19-25. Antithetic variates. Suppose that X_1, X_2, ..., X_n are IID with mean μ and variance σ². Further suppose that Y_1, Y_2, ..., Y_n are also IID with mean μ and variance σ². The trick here is that we will also assume that Cov(X_i, Y_i) < 0 for all i. So, in other words, the observations within each of the two sequences are IID, but they are negatively correlated between sequences.
(a) Here is an example showing how we can end up with the above scenario using simulation. Let X_i = -ln(U_i) and Y_i = -ln(1 - U_i), where the U_i are the usual IID uniform (0,1) random variables.
i. What is the distribution of X_i? Of Y_i?
ii. What is Cov(U_i, 1 - U_i)?

iii. Would you expect that Cov(X_i, Y_i) < 0? Answer: Yes.
(b) Let X̄_n and Ȳ_n denote the sample means of the X_i and Y_i, respectively, each based on n observations. Without actually calculating Cov(X_i, Y_i), state how V((X̄_n + Ȳ_n)/2) compares to V(X̄_2n). In other words, should we do two negatively correlated runs, each consisting of n observations, or just one run consisting of 2n observations?
(c) What if you tried to use this trick when using Monte Carlo simulation to estimate $\int_0^1 \sin(\pi x)\,dx$?

19-26. Another variance reduction technique. Suppose that our goal is to estimate the mean μ of some steady state simulation output process, X_1, X_2, ..., X_n. Suppose we somehow know the expected value of some other RV Y, and we also know that Cov(X̄, Y) > 0, where X̄ is the sample mean. Obviously, X̄ is the "usual" estimator for μ. Let us look at another estimator for μ, namely, the control-variate estimator,

$C = \bar{X} - k(Y - E[Y]),$

where k is some constant.
(a) Show that C is unbiased for μ.
(b) Find an expression for V(C). Comments?
(c) Minimize V(C) with respect to k.

19-27. A miscellaneous computer exercise. Make a histogram of X_i = -ln(U_i), for i = 1, 2, ..., 20,000, where the U_i are IID uniform (0,1). What kind of distribution does it look like?

19-28. Another miscellaneous computer exercise. Let us see if the Central Limit Theorem works. In Exercise 19-27, you generated 20,000 exponential(1) observations. Now form 1000 averages of 20 observations each from the original 20,000. More precisely, let

$\bar{Y}_j = \frac{1}{20}\sum_{i=1}^{20} X_{20(j-1)+i}$

for j = 1, 2, ..., 1000. Make a histogram of the Ȳ_j. Do they look approximately normal?

19-29. Yet another miscellaneous computer exercise. Let us generate some normal observations via the Box-Muller method. To do so, first generate 1000 pairs of IID uniform (0,1) random numbers, (U_{1,1}, U_{2,1}), (U_{1,2}, U_{2,2}), ..., (U_{1,1000}, U_{2,1000}). Set

$X_i = \sqrt{-2\ln(U_{1,i})}\cos(2\pi U_{2,i})$

and

$Y_i = \sqrt{-2\ln(U_{1,i})}\sin(2\pi U_{2,i})$

for i = 1, 2, ..., 1000. Make a histogram of the resulting X_i. [The X_i's are N(0,1).] Now graph X_i vs. Y_i. Any comments?
Appendix

Table I Cumulative Poisson Distribution

Table II Cumulative Standard Normal Distribution

Table III Percentage Points of the χ² Distribution

Table IV Percentage Points of the t Distribution

Table V Percentage Points of the F Distribution

Chart VI Operating Characteristic Curves

Chart VII Operating Characteristic Curves for the Fixed-Effects


Model Analysis of Variance

Chart VIII Operating Characteristic Curves for the Random-Effects


Model Analysis of Variance

Table IX Critical Values for the Wilcoxon Two-Sample Test

Table X Critical Values for the Sign Test

Table XI Critical Values for the Wilcoxon Signed Rank Test

Table XII Percentage Points of the Studentized Range Statistic

Table XIII Factors for Quality-Control Charts

Table XIV k Values for One-Sided and Two-Sided Tolerance Intervals

Table XV Random Numbers


Table I Cumulative Poisson Distributionᵃ

c = λt
x     0.01  0.05  0.10  0.20  0.30  0.40  0.50  0.60
0    0.990 0.951 0.904 0.818 0.740 0.670 0.606 0.548
1    0.999 0.998 0.995 0.982 0.963 0.938 0.909 0.878
2          0.999 0.999 0.998 0.996 0.992 0.985 0.976
3                      0.999 0.999 0.999 0.998 0.996
4                            0.999 0.999 0.999 0.999
5                                        0.999 0.999

c = λt
x     0.70  0.80  0.90  1.00  1.10  1.20  1.30  1.40
0    0.496 0.449 0.406 0.367 0.332 0.301 0.272 0.246
1    0.844 0.808 0.772 0.735 0.699 0.662 0.626 0.591
2    0.965 0.952 0.937 0.919 0.900 0.879 0.857 0.833
3    0.991 0.990 0.986 0.981 0.974 0.966 0.956 0.946
4    0.999 0.998 0.997 0.996 0.994 0.992 0.989 0.985
5    0.999 0.999 0.999 0.999 0.999 0.998 0.997 0.996
6          0.999 0.999 0.999 0.999 0.999 0.999 0.999
7                      0.999 0.999 0.999 0.999 0.999
8                                        0.999 0.999

c = λt
x     1.50  1.60  1.70  1.80  1.90  2.00  2.10  2.20
0    0.223 0.201 0.182 0.165 0.149 0.135 0.122 0.110
1    0.557 0.524 0.493 0.462 0.433 0.406 0.379 0.354
2    0.808 0.783 0.757 0.730 0.703 0.676 0.649 0.622
3    0.934 0.921 0.906 0.891 0.874 0.857 0.838 0.819
4    0.981 0.976 0.970 0.963 0.955 0.947 0.937 0.927
5    0.995 0.993 0.992 0.989 0.986 0.983 0.979 0.975
6    0.999 0.998 0.998 0.997 0.996 0.995 0.994 0.992
7    0.999 0.999 0.999 0.999 0.999 0.998 0.998 0.998
8    0.999 0.999 0.999 0.999 0.999 0.999 0.999 0.999
9                0.999 0.999 0.999 0.999 0.999 0.999
10                                       0.999 0.999

Table I Cumulative Poisson Distributionᵃ (continued)

c = λt
x     2.30  2.40  2.50  2.60  2.70  2.80  2.90  3.00
0    0.100 0.090 0.082 0.074 0.067 0.060 0.055 0.049
1    0.330 0.308 0.287 0.267 0.248 0.231 0.214 0.199
2    0.596 0.569 0.543 0.518 0.493 0.469 0.445 0.423
3    0.799 0.778 0.757 0.736 0.714 0.691 0.669 0.647
4    0.916 0.904 0.891 0.877 0.862 0.847 0.831 0.815
5    0.970 0.964 0.957 0.950 0.943 0.934 0.925 0.916
6    0.990 0.988 0.985 0.982 0.979 0.975 0.971 0.966
7    0.997 0.996 0.995 0.994 0.993 0.991 0.990 0.988
8    0.999 0.999 0.998 0.998 0.998 0.997 0.996 0.996
9    0.999 0.999 0.999 0.999 0.999 0.999 0.999 0.998
10   0.999 0.999 0.999 0.999 0.999 0.999 0.999 0.999
11               0.999 0.999 0.999 0.999 0.999 0.999
12                                       0.999 0.999

c = λt
x     3.50  4.00  4.50  5.00  5.50  6.00  6.50  7.00
0    0.030 0.018 0.011 0.006 0.004 0.002 0.001 0.000
1    0.135 0.091 0.061 0.040 0.026 0.017 0.011 0.007
2    0.320 0.238 0.173 0.124 0.088 0.061 0.043 0.029
3    0.536 0.433 0.342 0.265 0.201 0.151 0.111 0.081
4    0.725 0.628 0.532 0.440 0.357 0.285 0.223 0.172
5    0.857 0.785 0.702 0.615 0.528 0.445 0.369 0.300
6    0.934 0.889 0.831 0.762 0.686 0.606 0.526 0.449
7    0.973 0.948 0.913 0.866 0.809 0.743 0.672 0.598
8    0.990 0.978 0.959 0.931 0.894 0.847 0.791 0.729
9    0.996 0.991 0.982 0.968 0.946 0.916 0.877 0.830
10   0.998 0.997 0.993 0.986 0.971 0.957 0.933 0.901
11   0.999 0.999 0.997 0.994 0.989 0.979 0.966 0.946
12   0.999 0.999 0.999 0.997 0.995 0.991 0.983 0.973
13   0.999 0.999 0.999 0.999 0.998 0.996 0.992 0.987
14         0.999 0.999 0.999 0.999 0.998 0.997 0.994
15               0.999 0.999 0.999 0.999 0.998 0.997
16                     0.999 0.999 0.999 0.999 0.999
17                           0.999 0.999 0.999 0.999
18                                 0.999 0.999 0.999
19                                       0.999 0.999
20                                             0.999

(continues)

Table I Cumulative Poisson Distributionᵃ (continued)

c = λt
x     7.50  8.00  8.50  9.00  9.50  10.0  15.0  20.0
0    0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
1    0.004 0.003 0.001 0.001 0.000 0.000 0.000 0.000
2    0.020 0.013 0.009 0.006 0.004 0.002 0.000 0.000
3    0.059 0.042 0.030 0.021 0.014 0.010 0.000 0.000
4    0.132 0.099 0.074 0.054 0.040 0.029 0.000 0.000
5    0.241 0.191 0.149 0.115 0.088 0.067 0.002 0.000
6    0.378 0.313 0.256 0.206 0.164 0.130 0.007 0.000
7    0.524 0.452 0.385 0.323 0.268 0.220 0.018 0.000
8    0.661 0.592 0.523 0.455 0.391 0.332 0.037 0.002
9    0.776 0.716 0.652 0.587 0.521 0.457 0.069 0.005
10   0.862 0.815 0.763 0.705 0.645 0.583 0.118 0.010
11   0.920 0.888 0.848 0.803 0.751 0.696 0.184 0.021
12   0.957 0.936 0.909 0.875 0.836 0.791 0.267 0.039
13   0.978 0.965 0.948 0.926 0.898 0.864 0.363 0.066
14   0.989 0.982 0.972 0.958 0.940 0.916 0.465 0.104
15   0.995 0.991 0.986 0.977 0.966 0.951 0.568 0.156
16   0.998 0.996 0.993 0.988 0.982 0.972 0.664 0.221
17   0.999 0.998 0.997 0.994 0.991 0.985 0.748 0.297
18   0.999 0.999 0.998 0.997 0.995 0.992 0.819 0.381
19   0.999 0.999 0.999 0.998 0.998 0.996 0.875 0.470
20   0.999 0.999 0.999 0.999 0.999 0.998 0.917 0.559
21   0.999 0.999 0.999 0.999 0.999 0.999 0.946 0.643
22         0.999 0.999 0.999 0.999 0.999 0.967 0.720
23               0.999 0.999 0.999 0.999 0.980 0.787
24                           0.999 0.999 0.988 0.843
25                                 0.999 0.993 0.887
26                                       0.996 0.922
27                                       0.998 0.947
28                                       0.999 0.965
29                                       0.999 0.978
30                                       0.999 0.986
31                                       0.999 0.991
32                                       0.999 0.995
33                                       0.999 0.997
34                                             0.998

ᵃEntries in the table are values of $F(x) = P(X \le x) = \sum_{i=0}^{x} e^{-c} c^i / i!$. Blank spaces below the last entry in any column may be read as 1.0.
Table II Cumulative Standard Normal Distribution

$\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2}\,du$

z     0.00    0.01    0.02    0.03    0.04    z


0.0  0.50000 0.50399 0.50798 0.51197 0.51595  0.0
0.1  0.53983 0.54379 0.54776 0.55172 0.55567  0.1
0.2  0.57926 0.58317 0.58706 0.59095 0.59483  0.2
0.3  0.61791 0.62172 0.62551 0.62930 0.63307  0.3
0.4  0.65542 0.65910 0.66276 0.66640 0.67003  0.4
0.5  0.69146 0.69497 0.69847 0.70194 0.70540  0.5
0.6  0.72575 0.72907 0.73237 0.73565 0.73891  0.6
0.7  0.75803 0.76115 0.76424 0.76730 0.77035  0.7
0.8  0.78814 0.79103 0.79389 0.79673 0.79954  0.8
0.9  0.81594 0.81859 0.82121 0.82381 0.82639  0.9
1.0  0.84134 0.84375 0.84613 0.84849 0.85083  1.0
1.1  0.86433 0.86650 0.86864 0.87076 0.87285  1.1
1.2  0.88493 0.88686 0.88877 0.89065 0.89251  1.2
1.3  0.90320 0.90490 0.90658 0.90824 0.90988  1.3
1.4  0.91924 0.92073 0.92219 0.92364 0.92506  1.4
1.5  0.93319 0.93448 0.93574 0.93699 0.93822  1.5
1.6  0.94520 0.94630 0.94738 0.94845 0.94950  1.6
1.7  0.95543 0.95637 0.95728 0.95818 0.95907  1.7
1.8  0.96407 0.96485 0.96562 0.96637 0.96711  1.8
1.9  0.97128 0.97193 0.97257 0.97320 0.97381  1.9
2.0  0.97725 0.97778 0.97831 0.97882 0.97932  2.0
2.1  0.98214 0.98257 0.98300 0.98341 0.98382  2.1
2.2  0.98610 0.98645 0.98679 0.98713 0.98745  2.2
2.3  0.98928 0.98956 0.98983 0.99010 0.99036  2.3
2.4  0.99180 0.99202 0.99224 0.99245 0.99266  2.4
2.5  0.99379 0.99396 0.99413 0.99430 0.99446  2.5
2.6  0.99534 0.99547 0.99560 0.99573 0.99585  2.6
2.7  0.99653 0.99664 0.99674 0.99683 0.99693  2.7
2.8  0.99744 0.99752 0.99760 0.99767 0.99774  2.8
2.9  0.99813 0.99819 0.99825 0.99831 0.99836  2.9
3.0  0.99865 0.99869 0.99874 0.99878 0.99882  3.0
3.1  0.99903 0.99906 0.99910 0.99913 0.99916  3.1
3.2  0.99931 0.99934 0.99936 0.99938 0.99940  3.2
3.3  0.99952 0.99953 0.99955 0.99957 0.99958  3.3
3.4  0.99966 0.99968 0.99969 0.99970 0.99971  3.4
3.5  0.99977 0.99978 0.99978 0.99979 0.99980  3.5
3.6  0.99984 0.99985 0.99985 0.99986 0.99986  3.6
3.7  0.99989 0.99990 0.99990 0.99990 0.99991  3.7
3.8  0.99993 0.99993 0.99993 0.99994 0.99994  3.8
3.9  0.99995 0.99995 0.99996 0.99996 0.99996  3.9
(continues)

Table II Cumulative Standard Normal Distribution (continued)

$\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2}\,du$

z     0.05    0.06    0.07    0.08    0.09    z


0.0  0.51994 0.52392 0.52790 0.53188 0.53586  0.0
0.1  0.55962 0.56356 0.56749 0.57142 0.57534  0.1
0.2  0.59871 0.60257 0.60642 0.61026 0.61409  0.2
0.3  0.63683 0.64058 0.64431 0.64803 0.65173  0.3
0.4  0.67364 0.67724 0.68082 0.68438 0.68793  0.4
0.5  0.70884 0.71226 0.71566 0.71904 0.72240  0.5
0.6  0.74215 0.74537 0.74857 0.75175 0.75490  0.6
0.7  0.77337 0.77637 0.77935 0.78230 0.78523  0.7
0.8  0.80234 0.80510 0.80785 0.81057 0.81327  0.8
0.9  0.82894 0.83147 0.83397 0.83646 0.83891  0.9
1.0  0.85314 0.85543 0.85769 0.85993 0.86214  1.0
1.1  0.87493 0.87698 0.87900 0.88100 0.88297  1.1
1.2  0.89435 0.89616 0.89796 0.89973 0.90147  1.2
1.3  0.91149 0.91308 0.91465 0.91621 0.91773  1.3
1.4  0.92647 0.92785 0.92922 0.93056 0.93189  1.4
1.5  0.93943 0.94062 0.94179 0.94295 0.94408  1.5
1.6  0.95053 0.95154 0.95254 0.95352 0.95448  1.6
1.7  0.95994 0.96080 0.96164 0.96246 0.96327  1.7
1.8  0.96784 0.96856 0.96926 0.96995 0.97062  1.8
1.9  0.97441 0.97500 0.97558 0.97615 0.97670  1.9
2.0  0.97982 0.98030 0.98077 0.98124 0.98169  2.0
2.1  0.98422 0.98461 0.98500 0.98537 0.98574  2.1
2.2  0.98778 0.98809 0.98840 0.98870 0.98899  2.2
2.3  0.99061 0.99086 0.99111 0.99134 0.99158  2.3
2.4  0.99286 0.99305 0.99324 0.99343 0.99361  2.4
2.5  0.99461 0.99477 0.99492 0.99506 0.99520  2.5
2.6  0.99598 0.99609 0.99621 0.99632 0.99643  2.6
2.7  0.99702 0.99711 0.99720 0.99728 0.99736  2.7
2.8  0.99781 0.99788 0.99795 0.99801 0.99807  2.8
2.9  0.99841 0.99846 0.99851 0.99856 0.99861  2.9
3.0  0.99886 0.99889 0.99893 0.99897 0.99900  3.0
3.1  0.99918 0.99921 0.99924 0.99926 0.99929  3.1
3.2  0.99942 0.99944 0.99946 0.99948 0.99950  3.2
3.3  0.99960 0.99961 0.99962 0.99964 0.99965  3.3
3.4  0.99972 0.99973 0.99974 0.99975 0.99976  3.4
3.5  0.99981 0.99981 0.99982 0.99983 0.99983  3.5
3.6  0.99987 0.99987 0.99988 0.99988 0.99989  3.6
3.7  0.99991 0.99992 0.99992 0.99992 0.99992  3.7
3.8  0.99994 0.99994 0.99995 0.99995 0.99995  3.8
3.9  0.99996 0.99996 0.99996 0.99997 0.99997  3.9

Table III Percentage Points of the χ² Distributionᵃ

ν\α  0.995 0.990 0.975 0.950 0.900 0.500 0.100 0.050 0.025 0.010 0.005
1    0.00+ 0.00+ 0.00+ 0.00+ 0.02  0.45  2.71  3.84  5.02  6.63  7.88
2    0.01  0.02  0.05  0.10  0.21  1.39  4.61  5.99  7.38  9.21  10.60
3    0.07  0.11  0.22  0.35  0.58  2.37  6.25  7.81  9.35  11.34 12.84
4    0.21  0.30  0.48  0.71  1.06  3.36  7.78  9.49  11.14 13.28 14.86
5    0.41  0.55  0.83  1.15  1.61  4.35  9.24  11.07 12.83 15.09 16.75
6    0.68  0.87  1.24  1.64  2.20  5.35  10.65 12.59 14.45 16.81 18.55
7    0.99  1.24  1.69  2.17  2.83  6.35  12.02 14.07 16.01 18.48 20.28
8    1.34  1.65  2.18  2.73  3.49  7.34  13.36 15.51 17.53 20.09 21.96
9    1.73  2.09  2.70  3.33  4.17  8.34  14.68 16.92 19.02 21.67 23.59
10   2.16  2.56  3.25  3.94  4.87  9.34  15.99 18.31 20.48 23.21 25.19
11   2.60  3.05  3.82  4.57  5.58  10.34 17.28 19.68 21.92 24.72 26.76
12   3.07  3.57  4.40  5.23  6.30  11.34 18.55 21.03 23.34 26.22 28.30
13   3.57  4.11  5.01  5.89  7.04  12.34 19.81 22.36 24.74 27.69 29.82
14   4.07  4.66  5.63  6.57  7.79  13.34 21.06 23.68 26.12 29.14 31.32
15   4.60  5.23  6.27  7.26  8.55  14.34 22.31 25.00 27.49 30.58 32.80
16   5.14  5.81  6.91  7.96  9.31  15.34 23.54 26.30 28.85 32.00 34.27
17   5.70  6.41  7.56  8.67  10.09 16.34 24.77 27.59 30.19 33.41 35.72
18   6.26  7.01  8.23  9.39  10.87 17.34 25.99 28.87 31.53 34.81 37.16
19   6.84  7.63  8.91  10.12 11.65 18.34 27.20 30.14 32.85 36.19 38.58
20   7.43  8.26  9.59  10.85 12.44 19.34 28.41 31.41 34.17 37.57 40.00
21   8.03  8.90  10.28 11.59 13.24 20.34 29.62 32.67 35.48 38.93 41.40
22   8.64  9.54  10.98 12.34 14.04 21.34 30.81 33.92 36.78 40.29 42.80
23   9.26  10.20 11.69 13.09 14.85 22.34 32.01 35.17 38.08 41.64 44.18
24   9.89  10.86 12.40 13.85 15.66 23.34 33.20 36.42 39.36 42.98 45.56
25   10.52 11.52 13.12 14.61 16.47 24.34 34.28 37.65 40.65 44.31 46.93
26   11.16 12.20 13.84 15.38 17.29 25.34 35.56 38.89 41.92 45.64 48.29
27   11.81 12.88 14.57 16.15 18.11 26.34 36.74 40.11 43.19 46.96 49.65
28   12.46 13.57 15.31 16.93 18.94 27.34 37.92 41.34 44.46 48.28 50.99
29   13.12 14.26 16.05 17.71 19.77 28.34 39.09 42.56 45.72 49.59 52.34
30   13.79 14.95 16.79 18.49 20.60 29.34 40.26 43.77 46.98 50.89 53.67
40   20.71 22.16 24.43 26.51 29.05 39.34 51.81 55.76 59.34 63.69 66.77
50   27.99 29.71 32.36 34.76 37.69 49.33 63.17 67.50 71.42 76.15 79.49
60   35.53 37.48 40.48 43.19 46.46 59.33 74.40 79.08 83.30 88.38 91.95
70   43.28 45.44 48.76 51.74 55.33 69.33 85.53 90.53 95.02 100.42 104.22
80   51.17 53.54 57.15 60.39 64.28 79.33 96.58 101.88 106.63 112.33 116.32
90   59.20 61.75 65.65 69.13 73.29 89.33 107.57 113.14 118.14 124.12 128.30
100  67.33 70.06 74.22 77.93 82.36 99.33 118.50 124.34 129.56 135.81 140.17

ᵃν = degrees of freedom.

Table IV Percentage Points of the t Distribution

ν\α  0.40  0.25  0.10  0.05  0.025  0.01  0.005  0.0025  0.001  0.0005
1    0.325 1.000 3.078 6.314 12.706 31.821 63.657 127.32 318.31 636.62
2    0.289 0.816 1.886 2.920 4.303  6.965  9.925  14.089 23.326 31.598
3    0.277 0.765 1.638 2.353 3.182  4.541  5.841  7.453  10.213 12.924
4    0.271 0.741 1.533 2.132 2.776  3.747  4.604  5.598  7.173  8.610
5    0.267 0.727 1.476 2.015 2.571  3.365  4.032  4.773  5.893  6.869
6    0.265 0.718 1.440 1.943 2.447  3.143  3.707  4.317  5.208  5.959
7    0.263 0.711 1.415 1.895 2.365  2.998  3.499  4.029  4.785  5.408
8    0.262 0.706 1.397 1.860 2.306  2.896  3.355  3.833  4.501  5.041
9    0.261 0.703 1.383 1.833 2.262  2.821  3.250  3.690  4.297  4.781
10   0.260 0.700 1.372 1.812 2.228  2.764  3.169  3.581  4.144  4.587
11   0.260 0.697 1.363 1.796 2.201  2.718  3.106  3.497  4.025  4.437
12   0.259 0.695 1.356 1.782 2.179  2.681  3.055  3.428  3.930  4.318
13   0.259 0.694 1.350 1.771 2.160  2.650  3.012  3.372  3.852  4.221
14   0.258 0.692 1.345 1.761 2.145  2.624  2.977  3.326  3.787  4.140
15   0.258 0.691 1.341 1.753 2.131  2.602  2.947  3.286  3.733  4.073
16   0.258 0.690 1.337 1.746 2.120  2.583  2.921  3.252  3.686  4.015
17   0.257 0.689 1.333 1.740 2.110  2.567  2.898  3.222  3.646  3.965
18   0.257 0.688 1.330 1.734 2.101  2.552  2.878  3.197  3.610  3.922
19   0.257 0.688 1.328 1.729 2.093  2.539  2.861  3.174  3.579  3.883
20   0.257 0.687 1.325 1.725 2.086  2.528  2.845  3.153  3.552  3.850
21   0.257 0.686 1.323 1.721 2.080  2.518  2.831  3.135  3.527  3.819
22   0.256 0.686 1.321 1.717 2.074  2.508  2.819  3.119  3.505  3.792
23   0.256 0.685 1.319 1.714 2.069  2.500  2.807  3.104  3.485  3.767
24   0.256 0.685 1.318 1.711 2.064  2.492  2.797  3.091  3.467  3.745
25   0.256 0.684 1.316 1.708 2.060  2.485  2.787  3.078  3.450  3.725
26   0.256 0.684 1.315 1.706 2.056  2.479  2.779  3.067  3.435  3.707
27   0.256 0.684 1.314 1.703 2.052  2.473  2.771  3.057  3.421  3.690
28   0.256 0.683 1.313 1.701 2.048  2.467  2.763  3.047  3.408  3.674
29   0.256 0.683 1.311 1.699 2.045  2.462  2.756  3.038  3.396  3.659
30   0.256 0.683 1.310 1.697 2.042  2.457  2.750  3.030  3.385  3.646
40   0.255 0.681 1.303 1.684 2.021  2.423  2.704  2.971  3.307  3.551
60   0.254 0.679 1.296 1.671 2.000  2.390  2.660  2.915  3.232  3.460
120  0.254 0.677 1.289 1.658 1.980  2.358  2.617  2.860  3.160  3.373
∞    0.253 0.674 1.282 1.645 1.960  2.326  2.576  2.807  3.090  3.291

Source: This table is adapted from Biometrika Tables for Statisticians, Vol. 1, 3rd edition, 1966, by permission of the Biometrika Trustees.
Table V Percentage Points of the F Distribution

$F_{0.25,\nu_1,\nu_2}$

[Table body: values of $F_{0.25,\nu_1,\nu_2}$. Columns give the numerator degrees of freedom, ν1 = 1, 2, ..., 10, 12, 15, 20, 24, 30, 40, 60, 120, ∞; rows give the denominator degrees of freedom, ν2 = 1, 2, ..., 30, 40, 60, 120, ∞.]

Source: Adapted with permission from Biometrika Tables for Statisticians, Vol. 1, 3rd edition, by E. S. Pearson and H. O. Hartley, Cambridge University Press, Cambridge, 1966.

(continues)
Table V Percentage Points of the F Distribution (continued)

$F_{0.10,\nu_1,\nu_2}$

[Table body: values of $F_{0.10,\nu_1,\nu_2}$. Columns give the numerator degrees of freedom, ν1 = 1, 2, ..., 10, 12, 15, 20, 24, 30, 40, 60, 120, ∞; rows give the denominator degrees of freedom, ν2 = 1, 2, ..., 30, 40, 60, 120, ∞.]

(continues)
Table V Percentage Points of the F Distribution (continued)

$F_{0.05,\nu_1,\nu_2}$

[Table body: values of $F_{0.05,\nu_1,\nu_2}$. Columns give the numerator degrees of freedom, ν1 = 1, 2, ..., 10, 12, 15, 20, 24, 30, 40, 60, 120, ∞; rows give the denominator degrees of freedom, ν2 = 1, 2, ..., 30, 40, 60, 120, ∞.]

(continues)
Table V Percentage Points of the F Distribution (continued)

$F_{0.025,\nu_1,\nu_2}$

[Table body: values of $F_{0.025,\nu_1,\nu_2}$. Columns give the numerator degrees of freedom, ν1 = 1, 2, ..., 10, 12, 15, 20, 24, 30, 40, 60, 120, ∞; rows give the denominator degrees of freedom, ν2 = 1, 2, ..., 30, 40, 60, 120, ∞.]

(continues)
Table V Percentage Points of the F Distribution (continued)

$F_{0.01,\nu_1,\nu_2}$

[Table body: values of $F_{0.01,\nu_1,\nu_2}$. Columns give the numerator degrees of freedom, ν1 = 1, 2, ..., 10, 12, 15, 20, 24, 30, 40, 60, 120, ∞; rows give the denominator degrees of freedom, ν2 = 1, 2, ..., 30, 40, 60, 120, ∞.]

Chart VI Operating Characteristic Curves

[Graph panels omitted; each panel plots the probability of accepting H0 against a standardized measure of departure from H0 (d), with one curve for each sample size n.]

(a) OC curves for different values of n for the two-sided normal test for a level of significance α = 0.05.

(b) OC curves for different values of n for the two-sided normal test for a level of significance α = 0.01.

Source: Charts VIa, e, f, k, m, and q are reproduced with permission from "Operating Characteristics for the Common Statistical Tests of Significance," by C. L. Ferris, F. E. Grubbs, and C. L. Weaver, Annals of Mathematical Statistics, June 1946.
Charts VIb, c, d, g, h, i, j, l, n, o, p, and r are reproduced with permission from Engineering Statistics, 2nd edition, by A. H. Bowker and G. J. Lieberman, Prentice-Hall, Englewood Cliffs, NJ, 1972.

Chart VI Operating Characteristic Curves (continued)

[Graph panels omitted.]

(c) OC curves for different values of n for the one-sided normal test for a level of significance α = 0.05.

(d) OC curves for different values of n for the one-sided normal test for a level of significance α = 0.01.

(continues)

Chart VI Operating Characteristic Curves (continued)

[Graph panels omitted.]

(e) OC curves for different values of n for the two-sided t test for a level of significance α = 0.05.

(f) OC curves for different values of n for the two-sided t test for a level of significance α = 0.01.

Chart VI Operating Characteristic Curves (continued)

[Graph panels omitted.]

(g) OC curves for different values of n for the one-sided t test for a level of significance α = 0.05.

(h) OC curves for different values of n for the one-sided t test for a level of significance α = 0.01.

Chart VI Operating Characteristic Curves (continued)

[Graph panels omitted.]

(i) OC curves for different values of n for the two-sided chi-square test for a level of significance α = 0.05.

(j) OC curves for different values of n for the two-sided chi-square test for a level of significance α = 0.01.

Chart VI Operating Characteristic Curves (continued)

[Graph panels omitted.]

(k) OC curves for different values of n for the one-sided (upper tail) chi-square test for a level of significance α = 0.05.

(l) OC curves for different values of n for the one-sided (upper tail) chi-square test for a level of significance α = 0.01.

(continues)

Chart VI Operating Characteristic Curves (continued)

[Graph panels omitted.]

(m) OC curves for different values of n for the one-sided (lower tail) chi-square test for a level of significance α = 0.05.

(n) OC curves for different values of n for the one-sided (lower tail) chi-square test for a level of significance α = 0.01.

Chart VI Operating Characteristic Curves (continued)

[Graph panels omitted.]

(o) OC curves for different values of n for the two-sided F-test for a level of significance α = 0.05.

(p) OC curves for different values of n for the two-sided F-test for a level of significance α = 0.01.

(continues)

Chart VI Operating Characteristic Curves (continued)

[Graph panels omitted.]

(q) OC curves for different values of n for the one-sided F-test for a level of significance α = 0.05.

(r) OC curves for different values of n for the one-sided F-test for a level of significance α = 0.01.

Chart VII Operating Characteristic Curves for the Fixed-Effects Model Analysis of Variance

[Graph panels omitted; each plots the probability of accepting the hypothesis against the noncentrality parameter φ, with scales for α = 0.05 and α = 0.01 and one curve for each pair of numerator and denominator degrees of freedom.]

ν1 = numerator degrees of freedom; ν2 = denominator degrees of freedom.

Source: Chart VII is adapted with permission from Biometrika Tables for Statisticians, Vol. 2, by E. S. Pearson and H. O. Hartley, Cambridge University Press, Cambridge, 1972.

(continues)

Chart VII Operating Characteristic Curves for the Fixed-Effects Model Analysis of Variance (continued)

[Graph panels omitted.]

Chart VII Operating Characteristic Curves for the Fixed-Effects Model Analysis of Variance (continued)

[Graph panels omitted.]

(continues)

Chart VII Operating Characteristic Curves for the Fixed-Effects Model Analysis of Variance (continued)

[Graph panels omitted.]

Chart VIII Operating Characteristic Curves for the Random-Effects Model Analysis of Variance

[Graph panels omitted; each plots the probability of accepting the hypothesis against the parameter λ, with scales for α = 0.05 and α = 0.01 and one curve for each pair of numerator and denominator degrees of freedom.]

Source: Reproduced with permission from Engineering Statistics, 2nd edition, by A. H. Bowker and G. J. Lieberman, Prentice-Hall, Englewood Cliffs, NJ, 1972.

(continues)

Chart VIII Operating Characteristic Curves for the Random-Effects Model Analysis of Variance (continued)

[Graph panels omitted.]

Chart VIII Operating Characteristic Curves for the Random-Effects Model Analysis of Variance (continued)

[Graph panels omitted.]

(continues)

Chart VIII Operating Characteristic Curves for the Random-Effects Model Analysis of Variance (continued)

[Graph panels omitted.]

Table IX Critical Values for the Wilcoxon Two-Sample Testᵃ

$R_{0.05}$
n2\n1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
4           10
5            6  11  17
6            7  12  18  26
7            7  13  20  27  36
8       3    8  14  21  29  38  49
9       3    8  15  22  31  40  51  63
10      3    9  15  23  32  42  53  65  78
11      4    9  16  24  34  44  55  68  81  96
12      4   10  17  26  35  46  58  71  85  99 115
13      4   10  18  27  37  48  60  73  88 103 119 137
14      4   11  19  28  38  50  63  76  91 106 123 141 160
15      4   11  20  29  40  52  65  79  94 110 127 145 164 185
16      4   12  21  31  42  54  67  82  97 114 131 150 169
17      5   12  21  32  43  56  70  84 100 117 135 154
18      5   13  22  33  45  58  72  87 103 121 139
19      5   13  23  34  46  60  74  90 107 124
20      5   14  24  35  48  62  77  93 110
21      6   14  25  37  50  64  79  95
22      6   15  26  38  51  66  82
23      6   15  27  39  53  68
24      6   16  28  40  55
25      6   16  28  42
26      7   17  29
27      7   17
28      7

Source: Reproduced with permission from "The Use of Ranks in a Test of Significance for Comparing Two Treatments," by C. White, Biometrics, 1952, Vol. 8, p. 37.
ᵃFor large n1 and n2, R is approximately normally distributed, with mean n1(n1 + n2 + 1)/2 and variance n1n2(n1 + n2 + 1)/12.
(continues)
628 Appendix

'ubi. IX Critical Values for the Wilcoxon Two--Sample Test" (continued)

1':: nl 2 3 4 5 6 7 8 9 10 II 12 13 14 15
,~--

5 15
6 10 16 23
7 10 17 24 32
8 II 17 25 34 43
9 6 11 18 26 35 45 56
10 6 12 19 27 37 47 58 71
11 6 12 20 28 38 49 61 74 87
12 7 13 21 30 40 51 63 76 90 106
13 7 14 22 31 41 53 65 79 93 109 125
14 7 14 22 32 43 54 67 81 96 112 129 147
15 8 15 23 33 44 56 70 84 99 115 133 151 171
16 8 15 24 34 46 58 72 86 102 119 137 155
17 8 16 25 36 47 60 74 89 105 122 140
18 8 16 26 37 49 62 76 92 108 125
19 3 9 17 27 38 50 64 78 94 111
20 3 9 18 28 39 52 66 81 97
21 3 9 18 29 40 53 68 83
22 3 10 19 29 42 55 70
23 3 10 19 30 43 57
24 3 10 20 31 44
25 3 il 20 32
26 3 11 21
27 4 11
28 4
Appendix 629

Table X Critical Values for the


R'«
;;Z.!
n 0,10 0.05 0.01 n
IX
0.10 0.05 om
5 0 23 7 6 4
6 0 0 24 7 6 5
7 0 0 25 7 7 5
8 0 0 26 8 7 6
9 0 27 8 7 6
10 0 28 9 8 6
11 2 0 29 9 8 7
12 2 2 30 10 9 7
13 3 2 31 10 9 7
14 3 2 32 10 9 8
15 3 3 2 33 II 10 8
16 4 3 2 34 II 10 9
17 4 4 2 35 12 II 9
18 5 4 3 36 12 11 9
19 5 4 3 37 13 12 ;0
20 5 5 3 38 13 12 10
21 6 5 4 39 13 12 II
22 6 5 4 40 14 13 II

"For n:> 40, R is approximately normally distributed. with mean nI2 and variance nl4.
630 Appendix

Table XI Critical Values for the Wilcoxon Signed Rank Test'

EI 0,10 0.05
~-~--,----.-~--
0.02 O,O[
" "I
28
0,[0 0.05 0,02
.~-~--.~-~---.~-----~"~~
0,01

9[
4
5 I
I 0 29
130
140
116
126
101
110 100
6 2 0 30 151 137 120 109
7 3 2 0 31 163 147 130 118
8 5 3 0 32 175 159 140 128
9 8 5 3 1 33 187 170 151 138
10 10 8 5 3 34 200 182 162 148
11 13 10 7 5 35 213 195 173 159
12 17 13 9 7 36 227 208 185 171
13 21 17 12 9 37 241 221 198 182
14 25 21 15 12 38 256 235 211 194
15 30 25 19 15 39 271 249 224 207
16 35 29 23 19 40 286 264 238 220
17 41 34 27 23 41 302 279 252 233
18 47 40 32 27 42 319 294 266 247
19 53 46 37 32 43 336 310 281 261
20 60 52 43 37 44 353 327 296 276
21 67 58 49 42 45 371 343 312 291
22 75 65 ,5 48 46 389 361 328 307
23 83 73 62 54 47 407 378 345 322
24 91 81 69 61 48 426 396 362 339
25 100 89 76 68 49 446 415 379 355
26 110 98 84 75 50 466 434 3'17 373
27 119 107 92 83
--~

Source: Adapted with peo:nission from "Extended Table.'> of the Wilcoxon Matched Pair Signed Rank Statistic"
by Robert L. McComnck. Joumw. of the American. Statistical Assoc".atiOI'l, Vol. 60, September, 1965.
"If n > 50, R is approximately normally distributed, with mean n(n 1" 1)/4 and varianoe rt{r. + lX2Jt 1" 1)124.
Thh]e XII Percentage Point~ of the Studentized Range St'Jlisli(;'

QO_01(P.f)
----------~~~~~~~~

p
------------------------
---------~~~~~~~~

f 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
1 90~0 135 164 186 202 ll!> 227 237 246 253 260 266 272 272 282 286 290 294 298
2 14~0 19~0 22.3 24.7 26.6 28.2 29.5 30.7 31.7 32.6 33.4 31.4 34.8 35.4 3M 365 37.0 37.5 37.9
3 8.26 10.6 12.2 13.3 14.2 15.0 15.6 16.2 16.7 17.1 17.5 17.9 18.2 18.5 18.8 19.1 19.3 19.5 19.8
4 6.51 8.12 9.17 9.96 10.6 ILl ll.5 11.9 12.3 12.6 12.8 13.1 ll.3 13.5 13.7 J3.9 14.1 14.2 14.4
5 5.70 6.97 7.80 8.42 8.91 9.32 9.67 9.97 lO.24 WAS 10.70 10.89 11.08 11.2,1 11.10 11.55 lL68 lLKl lL93
6 5.24 6.33 7.03 7.56 7.97 8.32' 8.61 8.87 9.10 9.30 9,49 9.65 9.81 9.95 lO.08 10.21 JO.32 10.13 10.54
7 4.95 5.92 6.54 7.01 7.37 7.68 7.94 S.17 8.37 8.55 B.7l 8.86 9.00 9.12 9.24 9.35 9.16 9.55 9.65
8 4,74 5.63 6.20 6,63 6.96 7.24 7.47 7.68 7.87 8.03 8.18 8.31 8.44 8.55 8.66 8.76 8.85 8.94 9.03
9 4.60 5.43 5.96 6.35 6.66 6.91 7.13 7.32 7.49 7.65 7.78 7.91 8.03 8.13 8.23 8.32 8.41 8.49 857
10 4A8 5.27 5.77 6.14 6.43 6.67 6.87 7.05 7.21 7.36 7,48 7.60 7.71 7.81 7.91 7.99 8.07 8.15 8.22
11 4.39 5.14 5.62 5.97 6.25 6.48 6.67 6.84 6.99 7.13 7.25 7.36 7.46 7.56 7.65 7.73 7.81 7.R8 7.95
12 4.32 5.04 5.50 5.84 6.10 6.32 6.51 6.67 6.81 6})4 7.0(, 7.17 7.26 7.36 7.44 7.52 7.59 7.66 7.73
13 4.26 4.96 5.40 5.73 5.98 6.19 6.37 6.53 6.67 6.79 6.90 7.01 7.10 7.19 7.27 7.34 7.42 7.48 7.55
14 4.21 4.89 532 5.63 5.88 6.08 (,.26 6.41 6.54 6.66 6.77 6.87 6.96 7.05 7.12 7.20 7.27 7.33 7.39
15 4.17 4.83 5.25 5.56 5.80 5,99 6,16 6.31 6,44 6.55 6.66 6.76 6.84 6.93 7.0{) U)7 7.14 7.20 7.26
16 4.13 4.78 5.19 5.49 5.72 5.92 6.08 6.22 6.35 6.46 6.56 6.66 6.74 6.82 6.90 6.97 7.03 7.09 7.15
17 4.10 4.74 5.14 5.• 3 5.66 5.85 6.01 6.15 6.27 6.38 6.18 6.57 6.66 6.73 6.80 6.87 6.94 7.00 7.05
18 4.07 4.70 5.09 5.38 5.60 5.79 5.94 6.08 6.20 6.31 6.11 6.50 6.58 6.65 6.72 6.79 6.85 6.91 6.96
19 4.05 4.67 5.05 5.33 5.55 5.71 5.89 6.02 6.14 6.25 6.34 6.43 6.51 6.58 6.65 6.72 6.78 6.84 6.89
20 4.02 4.64 5.02 5.29 5.51 5.69 5.84 5.97 6.09 6.19 6.29 6.37 6,45 6.52 6.59 6.65 6.71 6.76 6.82
24 3.96 4.54 4.91 5.17 5.37 554 5.69 5.81 5.92 6.02 6.11 6.19 6.26 6.33 6.39 6.45 6.51 6.56 6.61
30 3.89 4.45 4.80 5.05 5,24 5.40 5.54 5.65 5.76 5.85 5.93 6.01 6.0S 6.14 6.20 6.26 6.31 6.36 6.41
40 3.82 4.37 4.70 4.93 5.1l 5.27 5.39 550 5.60 5.69 5.77 >.84 5.90 5.96 6.02 6.07 6.12 6.17 6.21
60 3.76 4.28 4.60 4.82 4.99 5.13 5.25 5.36 5,45 5.53 5.60 5.67 5.73 5.79 5.84 5.89 5.93 5.98 6.02
120 3.70
3.64
4.20
4.12
4.50
4.40
4.71
4.60
4.87
4.76
5.01
4.88
5.12
4.91)
5.21
5.08
5.30
5.16
5.38
5.23
5A4
5.29
551
5.35
5.56
5.40
5.61
5,45
5.66
5.49
5.71
5.54
5.75
5.57
5.79
5.61
5.83
5.65 i
~
f"' degress of freedom.
}l'rom J. M. May, ''li'{leuded and Corrected Tables of tbe Upper Pe~enl'lge Points of tbe S[lldel'l!ized Rang~," JJiomelrika. Vbl. 39, pp. 192-193. 1952. Reproduced by permission of the
trus{eM of Biowetrlka. ,..el
(contllIues)
Table ~'1I Percentage Points of the Studcnlized Range Statistica(cQII/ilwed)
!li
q",(p,f)

f 2 3 4 5 6 7 8 9 10
p
II 12 13 14 15 16 11 18 19 20
f!t-
I 18.1 26.7 32.8 37.2 40.5 43.1 45.4 47.3 49.1 50.6 51.9 53.2 54.3 55.4 56.3 57.2 58.0 58.8 59.6
2 6.09 8.28 9.80 10.89 11.73 12.43 13.03 13.54 13.99 14.39 14.75 15.08 15.38 15.65 15.91 16.14 16.36 16.57 16.77
3 4.50 5.88 6.83 7.51 8.04 8.47 8.85 9.18 9.46 9.72 9.95 10.16 10.35 10.52 JO.69 10.84 10.98 11.12 11.24
4 3.93 5.00 5.76 6.31 6.73 7.06 7.35 7.60 7.83 8.03 8.21 8.37 8.52 8.67 8.80 8.92 9.03 9.14 9.24
5 3.64 4.60 5.22 5.67 6.03 6.33 6.58 6.80 6.99 7.17 7.32 7.47 7.60 7.72 7.83 7.93 8.03 8.12 8.21
6 3.46 4.34 4.90 5.31 5.63 5.89 6.12 6.32 6.49 6.65 6.79 6.92 7.04 7.14 7.24 7.34 1.43 7.51 7.59
7 3.34 4.16 4.68 5.06 5.35 5.59 5.80 5.99 6.15 6.29 6.42 6.54 6.65 <\.75 6.84 6.93 7.01 7.08 7.16
8 3.26 4.04 4.53 4.89 5.17 5.4Q 5.60 5.71 5.92 6.05 6.18 6.29 6.39 6A8 6.51 6.65 6.73 6.80 6.87
9 3.20 3.95 4.42 4.76 5.02 5.24 5.43 5.60 5.74 5.87 5.98 6.09 6.19 6.28 6.36 6.44 6.51 6.58 6.65
10 3.15 3.88 4.33 4.66 4.91 5.12 5.30 5.46 5.60 5.72 5.83 5.93 6.03 6.12 6.20 6.27 6.34 6.41 6.47
11 3.11 3.82 4.26 4.58 4.82 5.03 5.20 5.35 5.49 5.61 5.71 5.81 5.90 5.98 6.06 6.14 6.20 6.27 6.33
12 3.08 3.77 4.20 4.51 4.75 4.95 5.12 5.27 SAO 5.51 5.61 5.71 5.80 5.88 5.95 6.02 6.09 6.15 6.21
13 3.06 3.73 4.15 4.46 4.69 4.88 5.05 5.19 5.32 5.43 5.53 5.63 5.71 5.79 5.86 5.93 6.00 6.06 6.11
14 3.03 3.70 4.11 4.41 4.64 4.83 4.99 5.13 5.25 5.36 5.46 5.56 5.64 5.72 5.79 5.86 5.92 5.98 6.03
15 3.01 3.67 4.08 4.37 4.59 4.78 4.94 5.08 5.20 5.31 5.40 5,49 5.57 5.65 5.72 5.79 5.85 5.91 5.96
16 3.00 3.65 4.05 4.34 4.56 4.74 4.90 5.03 5.15 5.26 5.35 5.44 5.52 5.59 5.66 5.73 5.79 5.84 5.90
17 2.98 3.62 4.02 4.31 4.52 4.70 4.86 4.99 5.11 5.21 5.31 5.39 5.47 5.55 5.61 5.68 5.74 5.79 5.84
18 2.97 3.61 4.00 4.28 4,49 4.67 4.83 4.96 5.07 5.17 5.27 5.35 5,43 5.50 5.57 5.63 5.69 5.74 5.79
19 2.96 3.59 3.98 4.26 4.47 4.64 4.79 4.n 5.04 5.14 5.23 5.32 5.39 5.46 5.53 5.59 5.65 5.70 5.75
20 2.95 3.58 3.96 4.24 4.45 4.62 4.77 4.90 5.01 5.11 5.20 5.28 5.36 5.43 5.50 5.56 5.61 5.66 5.71
24 2.92 3.53 3.90 4.17 4.37 4.54 4.68 4.81 4.92 5.01 5.10 5.18 5.25 5.32 5.38 5.44 5.50 5.55 5.59
30 2.89 3.48 3.84 4.11 4.30 4.46 4.60 4.72 4.B3 4.92 5.00 5.08 5.15 5.21 5.21 5.33 5.38 5.43 5.48
4Q 2.86 3.44 3.79 4.04 4.23 4.39 4.52 4.63 4.74 4.B2 4.90 4.98 5.05 5.11 5.17 5.22 5.27 5.32 5.36
60 2.83 3.40 3.74 3.98 4.16 4.31 4.44 4.55 4.65 4.73 4.81 4.88 4.94 5.00 5.06 5.1 I 5.15 5.20 5.24
120 2.80 3.36 3.69 3.92 4.10 4.24 4.36 4.47 4.56 4.64 4.71 4.78 4.84 4.90 4.95 5.00 5.04 5.09 5.13
2.77 3.32 3.63 3.86 4.03 4.11 4.29 4.39 4.47 4.55 4.62 4.68 4.74 4.80 4.84 4.98 4.93 4.97 5.01
Appendix 633

Tabl.xm Factors for Quality~Con1rol o.arts


x Chart R Chart
Factors for Factors for Factorsfo!
Control Limits Central Line Control Limits
n'

2 3.760 1.880 1.128 0 3.267


3 2.394 1.023 1.693 0 2.575
4 1.880 0.729 2.059 0 2.282
5 1.596 0.517 2.326 0 2.115
6 1.410 0.483 2.534 0 2.004
7 1.277 0.419 2.704 0.076 1.924
8 1.175 0.373 2.847 0.136 1.864
9 1.094 0.337 2.970 0.184 :.816
10 1.028 0.308 3.078 0.223 1.717
11 0.973 0..285 3.173 0.256 1.744
12 0.925 0.266 3.258 0.284 1.716
13 0.884 0.249 3.336 0.308 1.692
14 0.848 0.235 3.407 0.329 1.671
15 0.816 0.223 3.472 0.348 1.652
16 0.788 0.212 3.532 0.364 1.636
17 0.762 0.203 3.588 0.379 1.621
18 0.738 0.194 3.640 0.392 1.608
19 0.717 0.187 3.689 0.404 1596
20 0.697 0.180 3.735 0.414 1586
21 0.679 0.173 3.778 0.425 1.575
22 0.662 0.167 3.819 0.434 1.566
23 0.647 0.162 3.858 0.443 1.557
24 0.632 0.157 3.895 0.452 1.548
25 0.619 0.153 3.931 0.'159 1.541
7t> 25; AI :;::: 3f'>h:. •n ::::: nl1IT.ber of obserY.lticns in sample.
634 Appendix

Table XIV k Values for OT,le-Sidcd and Two-Sided Tolerance Intervals


One-Sided mlerance Intervals
Confidence
Level 0,90 0,95 0,99
Percent
Coverage 0,90 0,95 0.99 0.90 0.95 0.99 0,90 0.95 0.99

2 10.253 13.090 18.500 20.581 26,260 37.094 103.029 131.426 185.617


3 4.258 5.311 7,340 6.155 7.656 10.553 13.995 17.370 23.896
4 3.188 3.957 5.438 4.162 5.144 7.Q42 7.380 9,083 12.387
5 2.742 3.400 4.666 3.407 4.203 5.741 5.362 6.578 8.939
6 2.494 3.092 4.243 3.006 3.708 5.062 4.411 5.406 7.335
7 2.333 2,894 3.972 2.755 3.399 4.642 3.859 4.728 6.412
8 2.219 2.754 3,783 2.582 3.187 4.354 3.497 4.285 5.812
9 2,133 2.650 3.641 2,454 3,031 4.143 3.240 3.972 5.389
10 2.066 2,568 3.532 2.355 2.911 3.981 3.()48 3.738 5,074
11 2,011 2.503 3.443 2.275 2.815 3,852 2.898 3.556 4.829
12 1.966 2.448 3.371 2.210 2.736 3.747 2.m 3.410 4.633
13 1.928 2.402 3.309 2.155 2.671 3.659 2,677 3.290 4,472
14 1.895 2.363 3257 2.109 2.614 3.585 2.593 3.189 4,337
15 1.867 2.329 3.212 2.068 2.566 3.520 2.521 3.102 4.222
16 1.842 2.299 3.l72 2.033 2,524 3.464 2.459 3.028 4.123
17 L819 2.272 3.137 2.002 2.486 3,414 2.405 2.963 4,037
18 1.800 2.249 3.105 1.974 2.453 3.370 2.357 2,905 3.960
19 1.782 2.227 3.077 1.949 2.423 3.331 2.314 2.854 3.892
20 1.765 2.028 3.052 1.926 2.396 3.295 2.276 2,808 3.832
21 1.750 2,190 3.028 1.905 2.371 3.263 2.241 2.766 3.777
22 1.737 2.174 3,007 1.886 2.349 3.233 2.209 2.729 3.727
23 1.724 2,159 2.987 1.869 2.328 3,206 2.180 2.694 3.681
24 1.712 2.145 2.969 1.853 2.309 3.181 2.154 2.662 3.640
25 1.702 2.132 2.952 1.838 2.292 .3,158 2.129 2.633 3.601
30 1.657 2.080 2.884 I.m 2.220 3,064 2.030 2.515 3.447
40 1.598 2.010 2.793 1.697 2.125 2.941 1.902 2.364 3.249
50 1.559 1.965 2.735 1.646 2.065 2.862 1.821 2.269 3.125
60 1.532 1.933 2.694 1.609 2,022 2,807 1.764 2.202 3.038
70 1.511 ' 1.909 2.662 1.581 1.990 2.765 1.722 2.153 2.974
80 1.495 1,890 2.638 1.559 1.964 2.733 1.688 2.114 2.924
90 1.481 1.874 2.618 1.542 1.944 2.706 1.661 2.082 2,883
100 1.470 1.861 2.601 1.527 1.927 2.684 1.639 2.056 2.850
Appendix 635
ThbleXIV k Values for One-Sided and Two-Sided Tolerance Intervals (continued)

Two-Sided Tolerance Intervals


Confidence
Level 0.90 0.95 0.99
"".--~---- .. ~~ ..

Percent
Coverage 0.90 0.95 0.99 0.90 0.95 0.99 0.90 0..95 0.99
2 15.978 18.800 24.167 32.019 37.674 48.430 160.193 188.49: 242.300
3 5.847 6.919 8.974 8.380 9.916 12.861 18.930 22.401 29.055
4 4.166 4.943 6.440 5.369 6.370 8.299 9.398 ILl50 14.527
5 3.949 4.152 5.423 4.275 5.079 6.634 6.612 7.855 10.260
6 3.131 3.723 4.870 3.712 4.414 5.775 5.337 6.345 8.301
7 2.902 3.452 4.521 3.369 4.007 5.248 4.613 5.488 7.187
8 2.743 3.264 4.278 3.136 3.732 4.891 4,147 4.936 6.488
9 2.626 3.125 4.098 2.967 3.532 4.631 3.822 4.550 5.966
10 2.535 3.018 3.959 2.839 3.379 4.433 3.582 4.265 5.594
11 2.463 2.933 3.849 2.737 3.259 4277 3.397 4.045 5.308
12 2.404 2.863 3.758 2.655 3.162 4.150 3.250 3.870 5.079
13 2.355 2.805 3.682 2.587 3.081 4.044 3.130 3.727 4.893
14 2.314 2.756 3.618 2.529 3.012 3.955 3.029 3.608 4.737
15 2.278 2.113 3.562 2.480 2.954 3.878 2.945 3.507 4.605
16 2.246 2.676 3.514 2.437 2.903 3.812 2,'072 3.421 4.492
17 2.219 2.643 3.471 2.400 2.858 3.154 2.808 3.345 4.393
18 2.194 2,614 3.433 2.366 2.819 3.702 2.753 3.279 4.307
19 2.112 2.588 3.399 2.331 2.7&4 3.656 2,703 3.221 4.230
20 2.152 2.564 3,368 2.310 2.752 3.615 2.659 3.168 4.161
21 2.135 2.543 3,340 2.286 2.723 3.577 2.620 3.121 4.100
22 2.118 2.524 3.315 2.264 2.697 3.543 2.584 3.078 4.044
23 2.103 2.506 3.292 2.244 2.673 3.512 2.551 3,040 3.993
24 2.089 2.489 3,270 2.225 2.651 3.483 2,522 3.004 3.947
25 2.071 2.474 3.251 2.208 2.631 3.457 2.494 2.972 3.904
30 2.025 2.413 3.170 2.140 2.529 3.350 2.385 2.841 3.733
40 1.959 2.334 3.066 2.052 2.445 3.213 2.247 2.677 3.518
50 1.916 2.284 3.001 1.996 2.379 3.126 2.162 2.576 3.385
60 1.887 2.248 2.955 1.958 2.333 3.066 2.103 2.506 3.293
70 :.865 2.222 2.920 1.929 2.299 3.021 2.060 2.454 3,225
80 1.848 2.202 2.894 1.907 2.272 2.986 2,026 2.414 3.173
90 1.834 2.185 2.872 l.889 2.251 2.958 1.999 2,382 3.130
100 1.822 2.172 2.854 1.874 2.233 2.934 1.977 2.355 3.096
636 Appendix

ThbleXV Random Numbers


1G480 15011 01536 02011 81647 91646 69179 14194 62590
22368 46573 25595 85393 30995 89198 27982 53402 93965
24130 48360 22527 97265 76393 64809 15179 241130 49340
42167 93093 06243 61680 07856 16376 39440 53537 71341
37570 39975 81837 16656 06121 91782 6G468 81305 49684
77921 06907 11008 42751 27756 53498 18602 70659 90655
99562 72905 56420 69994 98872 31016 71194 18738 44013
96301 91977 054<i3 (Y1972 18876 20922 94595 56869 69014
89579 14342 63661 10281 17453 18103 57740 84378 25331
85475 36857 53342 53988 53060 59533 38867 62300 08158
28918 69578 88231 33276 70997 79936 56865 05859 90106
63553 40961 48235 03427 49626 69445 18663 72695 52180
09429 93969 52636 92737 88974 33488 36320 17617 30015
10365 61129 87529 85689 48237 52267 67689 93394 01511
07119 97336 7lG48 08178 77233 13916 47564 81056 97735
51085 12765 51821 51259 77452 16308 60756 92144 49442
02368 21382 52404 60268 89368 19885 55322 44819 01188
own 54092 33362 94904 31273 G4146 18594 29852 71585
52162 53916 46369 58586 23216 14513 83149 98736 23495
07056 97628 33787 09998 42698 06691 76988 13602 51851
48663 91245 85828 14346 09172 30168 90229 04734 59193
54164 58492 22421 74103 47070 25306 764<i8 26384 58151
32639 32363 05597 24200 13363 38005 94342 28728 35806
29334 27001 87637 87308 58731 00256 45834 15398 46557
02488 33062 28834 07351 19731 92420 60952 61280 50001
81525 72295 04839 96423 24878 82651 66566 14778 76797
29676 2()591 68086 26432 46901 20849 89768 81536 86645
00742 57392 39064 66432 34673 40027 32832 61362 98947
05366 G4213 25669 26422 44107 44048 37937 639G4 45766
91921 26418 64117 94305 26766 25940 39972 222()9 71500
00582 04711 87917 77341 422()6 35126 74087 99547 81817
00725 69884 62797 56170 86324 88072 76222 36086 84637
69011 65795 95876 55293 18988 27354 26575 08625 40801
25976 57948 29888 886G4 67917 48708 18912 82271 65424
09763 83473 73577 12908 30883 18317 28290 35797 05998
91567 42595 27958 30134 04024 86385 29880 99730 55536
17955 56349 90999 49127 20044 59931 06115 20542 18059
4<i503 18584 18845 49618 023G4 51038 20655 58727 28168
92157 89634 94824 78171 34610 82834 09922 25417 44137
14577 62765 35605 81263 3%67 47358 56873 56307 61607
98427 07523 33362 64270 01638 W,477 66%9 98420 04880
34914 63976 88720 82765 34476 17032 87589 40836 32427
70060 2'0277 39475 46473 23219 53416 94970 25832 69975
53976 54914 06990 67245 68350 82948 11398 42878 80287
76072 29515 40980 07391 58745 25774 22987 80059 39911
90725 52210 83974 29992 65831 38857 50490 83765 55657
64364 67412 33339 31926 148S3 24413 59744 92351 97473
08962 00358 31662 . 25388 61642 34072 81249 35648 56891
95012 68379 93526 70765 10592 G4542 76463 54328 02349
15664 10493 2G492 38391 91132 21999 59516 81652 27195
References

Agresti, A. and B. Coull (1998), "Approximate is Cook, R. D. (1977). "Detection of1Dlluential Ooser-
Better than "Exact' for Jr.terval Estimation of vations in Linear Regression," Technoff'..etrics, Vot
B:nomiai Proportions." The Aw..erican Statistician, 19, pp, 15-18,
52(2). Crowder, S, (1987), "A Simple Method for Studying
.t\nderson, V. L., ar.dR A. McLean (1974), Design Run~LeIlc,trth Distributions of Exponentially
of Experiments: A Realistic Approach, ~1arcel Weighted Moving Average Olarts," Technometn'cs,
Del:ker. New York. VoL 29. pp, 401-407,
Banks.]., J, S, Carsoo, B,L, Nelson., andD. M.Nicol Daniel. C" and F. S, Wood (1980), Fitting Equations to
(2001).Discrete~E)len.t System Simulation, 3rd edi- Daw.. 2nd edition, John Wiley & Sons, New York.
tion, Prentice~Hall. Upper Saddle River, NJ. Davenport, W. B" and W, 1" Root (1958), An Intro-
Bartlett, M. S. (1947), "The Use ofTransforma· duction to the Theory ofRarulom Signals and
tions," Biometrics, Vol. 3, pp. 39-52. Noise, McGraw~Hill. ~ewYork.
Bechhofer, R. E .• T. 1, Santner, and D. Goldsman Draper, N. R., and W. G. Hanter (1969), "Transfor-
(1995), Design. and Analysis of Experiments for mations: Some Examples Revisited,'" Technomet-
Statistical Selection" Screen.ing and Multiple Com~ rics, Vol. 11, pp, 23-4{),
parnaM, lohn Wiley and Sor.s, New York. Draper, N. R., and 'H. Smith (1998), Applied Regres·
BelsleY, D, A., E, Kuh, and R. E. Welsch (1980). sion Atudysts, 3rd edition., John Wiley & Sons,
Regression Diagnostics, John Wiley & Sons, New NewYotk.
York. D-.mcan, A, I, (1986), Quality Control and Industrial
Berrettoni, J. M. (1964), "Practical Applications of Statistics, 5th edition, Richard D. Irwin, Home-
the Weibull Distribution,"" Industrial Quality Con- wood, IL,
""I, Vol, 21,No, 2,pp. 71-79. Duncan, D, B. (1955). "Multiple Range and Multip!e
Box, G, E, P" and D, R. Cox (1964). "AllA:1alysis of FTests," Biometrics, Vol. III PP, 1-42,
Ttansfon:::lations." Journal of the Royal Statistical Efron, B. andR. TIbshi.'1mi(1993),Anln.rroti"wctionlO
Society, B, VoL 26, pp. 211-252, the Bootstrap, ~an ar.d Hall. ~ew lOCk.
Box, G, E, p" and M, F. Mdller (1958), "A Note on Elsayed, E. (1996), Reliability Engineering. Addison
the Generation of Normal RandgID Deviates," Wesley Longman, Reading. MA.
A.n1tals of Mathematical Statistics, Vol, 29, pp. Epstein, B, (1960). "'Estimation from Ufe Test Data,"
610-011, IRE Tran.sactions on Reliability, Vol. RQC"9.
Bradey, P,. B, L, Fox. and 1" E. Schrage (1987),A
Feller, W. (1968). An Introduction to Probability
Guide to Simulation, 2nd edition, Sp.ringer~Verlag,
Theory and Its Application.~, 3rd edition, John
New York.
Wiley & Sans, New York.
Cheng, R. C. (1977), "The Generation of Gamma
Fishman, G, S. (1978). PrinCiples of Discrete Event
Variables with NoninJ:cgra1 Shape Parameters;'
Simulation, John Wtiey &: Sons, ~C\\' York.
Applied Statistics, VoL 26, No, I. pp, 71-75,
Cochran, W, G, (1947), "Some Co!)Seq"e""es "''ben FumiYll!, G, :>1" and R. W, Wilson, Jr, (1974),
the .A.ssumptions for the Analysis of Variance Are "Regression by Leaps and Bounds," Tecrmomet-
Not Satisfied," Biometrics, Vol. ;., pp, 22-38. nus. Vol. 16, pp, 499-512,
CochrllJ1, W. G, (1977). Sampling Techniques, 3rd Hahn, G., and S. Shapiro (1967), Statistical Models
edition, Joha Wl1ey & Sons, New York, in Engineering, John Wliey &: Sons, New York.
Cochr:m, W. G,. and G, M, Cox (1957), Experimen- Hald,A. (1952), Statistical 'Dutory wiJI: Engineering
tal Designs. John Wiley & Sons, Ne\VYork. Application.s, John Wiley & Sons, New York.
Cook, R. D. (1979), "1Dlluential Obsel'V1loons in Lin- Hawkins, S. (1993), "Cumulative Sum Control
ear Regression," Jou.mal oftheAmencan: Sta;isti- Chartillg: All underntilized SPC Tool;' Quality
cal Association.. Vol. 169-174. Engineering. Vol, 5, pp, 463-477,

637
638 References

Hocking, R. R, (1976), "The Analysis and Selection Mood, A. M., F. A. Graybill, and D. C. Bees (1974),
of Variables in Linear Regression," Biometrics, IrtlroducMn to (roe Theory 0/ Statistics. 3rd edi~
Vo!' 32, pp. 1.-49. !ion, McGraw..Hill, New York.
Hocking, R. R .. F. M. Speed, and M. J. Lynn (1976). Neter.l. M. Kutner, C. Nachtsheim, and W. Wasser~
"A Class of Biased Estimators in Linear Regres~ man (1996), Applied Lir.ear Statistical Mrxiels.
SiOD," TecntU>metrics, Vol. 18, pp. 425-437. 4th edition, Irwin Press, Homewood.,.IL.
HOed, A. E., and R W. Kennard (1970a), "Ridge >lewman, D, (1939), 'The Distribution of the Range
Regression: Biased Estimation for Non-Orthogonal in Samples from a NoID13l Population Expressed
Problems,'" Technometdcs, Vol. 12, pp, 55-67. in Tern:s of an Iodependent Esti.JruIte of Standard
Hoed. A. E., andR. W. Kennard (197Gb), "Ridge Deviation," Biometrika, Vol. 31, p. 20.
Regression: Application to Non-Orthogonal Prob~ Odeh. R., and D. Owens (1980), Tables/or Nonr.al
leInS," TechtU>metrics, Vol. 12, pp. 69~82. Tolerance Limits, Sampling Plans, ami Screening,
Kelton, W. D .. and A. M. Law (1983), "A New Marcel Dekker, ~ew York.
Approach for Dealblg with the Startup Problem in Owen. D. B. (l962), Handbook of Statistical Tables,
Discrete Event Simulation:' Naval Research Addison·Wesley Publishing CompllIlY. Reading,
Logist;cs Q-,tarterly, \O\. 30, pp. 641-{558. Mass,
Kendall, M. G., and A. S:uart (1963), The Advanced Page, E. S. (1954). "Continuous Inspection
Theory of Statistics, Hafner Publishing Co:::npany, Schemes." Biometrika, Vol. 14. pp. 100-115.
>lew York, Roberts, S. (1959), "Control C'ha.ttTests Based on
Keuls, M. (1952). 'The Use of me S:Udentized Range Geometric Moving Averages," Technometrics. Vol
in Connection with aD Analysis of Variance," I, pp. 239-250.
Euphytics, Vol. 1, p. 112. Scbeff6, H. (1953), "A Memod for Judging All Con·
Law, A. M., and W. D. Kelton (2000). Simulation trasts in the Analysis of Variance," Biometrika,
Modeling andAnalysis, 3rd edition, McG:raw-Hill, VoL 4(], pp.87-104.
New York, Snee, R. D. (1977), "Validation of Regression Mod~
Lloyd.D. K, aDdM. Lipow (1972), Reliability: eIs; Methods and Examples," Technometrics, VoL i
Mana.gerr.ent, Methods, arul Mathenulrics. P1entice~
Hall, Englewood ClilIs, N.J.
19, No.4, pp. 415-428.
Tucker, Ii. G. (1962), An Introduction to Probability
~
Lucas, 1., and M, Saccucd (1990), "Exponentizlly and Mathematical Statistics, Academic Press,
Weighted Moving Average Control Schemes: ~ewYork.
Properties and Enhancements." Technometrics, Tukey, J. W. (1953), 'The Problem of Multiple
Vol, 32, pp. 1-12. Comparisons," unpublished notes, Princeton
Marquardt, D. W.. and R. D. Snee (1975), "Ridge University.
Regression in Practice:' The American Tukey, J, W. (1977), Exploratory Data Analysis.
Statistician, Vol. 29, pp. 3-20. Addison-Wesley, Reading, MA.
Montgomery, D. C. (2001), Design and Analysis of United States Department of Defense (1957). Mili-
Experimeras, 5th edition, 10hn Wiley & Sons. tary Sttmdard Sampling Procedures and Tablesfor
New York. Inspection by Variables for Percent Defective
~fontgomery, D. C (2001),lntroduction to Statisti~ (MlL-STD-414), Government Printing Office,
cal Quality Control, 4th edition, John W:t1ey & Washing'.on, DC.
SOIlS, New York. Welch, P. D. (1983), 'The Statistical Analysis of
Montgomery, D. C, E. A. Peck, and O. G. vining Simulation Results." in The Corr.puter Perfor·
(2001), Introduction to Linear Regressicrn. mance M.odeling Handbook (ed. S. Lavenberg),
Analysis. 3rd edition, John Wiley & Sons, New . Academic Press, Orlando, FL.
York.
Montgomery, D.C., and G.C. Runger (2003),
Applied Statistics and Probability for Engineers,
3rd edtion, John Wiley & SODS, New York.
Answers to Selected Exercises

Chapter 1
1·1. (a) 0.75. (h) 0.18,
1·3. (a) it (' B= [5), (h) it c; B= {I, 3,4, 5, 6, 7, 8, 9.1O}, (e) A0!i = (2, 3, 4, 5).
(d) U=(l,2,3,4,5,6,7,8,9,lOj, (e) A('(B,-,C)= (1,2,5,6,7,8,9,IO).
1.5 9' = W, r,): t; ~ 0, I," 0),
't
A = (t t,):t, " 0, ,,0, ('l - :,)12 S; 0,15).
"
B = [('l' '2): " ~ 0,', ~ 0, max (t" t,)'; 0,15),
e = {(t t,l: 'l ,,0, t, ~ 0, It, - r,jt2 ,; 0,06),
1·7, "
9' = (NNNNN, NNN,ND, NNNDN, NNNDD. NND"IN, l'-CIDND, NNDD, ':'IDl'--:W, NDt-.'NIJ,
NDND, NDD, DNNNN, DNNND, DNND, miD, DD),
1~9. (a) N:::: not defective, D := defective.
9' = (NNN, ~'NIJ, NDN, NDD, DNN, DND, DDN, DDD) ,
(h) 9' = (NNNN, NNND, NNDN, NDi'.'N, DNNN),
1·11. 30 tomes. 1-13. 560.560 ways.
(3OOP "1(300(I- p'Yi
1·15. P(Accept Ip') =L \ ,
t li x IO-x
)I

,-0 (31~)
1-17. 28 comparisons, 1·19. (40)(39) = 1560 tests,
1-21. (a) (;)(;) = 25 ways, (h) @@= 100 ways,
1-23. R, =[1- (0,2)(0,1)(0.1)] [1 - (0,2)(0,1)] (0,9) =0,880,
1-25. S =Siberia. U = Ural, P(S) =0,6, P(U) =0.4, peElS) =P(FlS) =0,5,
P(F~U)=0,3, P(SIF) '( 0,6),0,))4-
/:~X~'\ 0.4 (0,3
0,714. )'
1 1 1 m-l m 2 -m-l
1-27, -_._+-,--= 2 1·29. P(womenI6,) '" 0,03226, 1·31. !l4,
m-1 m m m m (m-l)'

1.35, p(li) = (365)(364)"'(365-~ + I) .


. 365/1
n 21 22 30 40 60
PCB) 0.444 j 0.476 0.706 iJ:891 0,994
1·37. 8: = MmO, 1-39. 0,441,

Chapter 2

2·1. Px(X=x) x=o, L 2, 3, 4,

639
640 Answers to Selected Questions

3:;'~S~l-Yes, (b) No, (eJ Yes. 2-7. a, b. 2-9. P(Xs 29) ~ 0.978.
~ 2.11 •.(.) k ~~, I (b) 11 ~ 2. ",' = l.
"-~/"'~ "> /"

(e) 7;(x) = 0, ,<0,


=,'/8. O>x<2.
x'
=-l+x-
S' 2Sx<4,

x:::'4.
2-13. b 2. [14 - 2'12. 14 + 2£J.
2~15, (a) k=;. (b) .u::::~, cr"=i;,
(e) Fx(x) = 0" < l.
=-k.l S;x<2,
=r.f." 2 :5:x<3,
=1.x~3.

2·17. k = 10 and 2 + kW '" 8.3 days.


2-21. (.)F,(x)=O,x<O,
=-<'I9,O';x<3,
=1.x2:3.
(b) 11 = 2. cr " ,. .
(e) 113 (d) m=cc'
3
,2
2·23. F,(,)=l-e'v"",,,O, 2·25. k=I.I1=1.
=O,x<O.

Chapter 3
3·1. (a) y p,fy) (b) E(l') = 14, Vel') = 564.
-~---
0 0.6
20 0.3
80 0,1
Qther.vise 0.0
3·3. (al 0.221. (b) SI55.80.
3~5. liz) :::: e-r., Z 2: 0,
= 0, other.vise.
3-7. 93.S c!gal.
3·9. {a} f,J;;) =:;; kfr 1f1 , trOfl)ill, y > D,
::::: 0, otherwise~
(b) Mv)= 2"'~, v > 0,
:::: 0, otherwise.
(c) f,)u)=e-('f"-u!,u>O,
::::; 0, other"Nise.
3·11. s = (513) x 10'.
3·13. (al !,(Y)=1<4-yr ",O';yS3,
"
"'" Q, otherwise.
(b) Jy(y):::' J. e:5: y.$ C,
:::;: 0. othGI"Nisc.
Answers [Q Selected Questions 641

6
3·15. Mx(t) = I,(t)e",

E(X)=M~(O)=t,

V(X) =M;(0)-[M~(0)J2 =-/!.


3.17. ECl') = I, VCl') = I,
ECX) = 6.16, VCX) = 0.027.
3·19. Mj.t) = (1- tnt', ECX) =M~CO) =I, VCX) = M~CO) - [M~CO)J' =.;..
3~21. M.J..t):::: E(e IY) :::: E(er(aX+ /J)) :::: e1b E(e(al)X) :::: e rb Mx(at).
3-23. (a) MJ,.t) :::: ~ + ie + ~e2t + t
ic l
, E(X):::: Mx(O) ::::;, V(X):::: ~(O) - (;f = ~.
Cb) Fy(Y)=O,y<O,
=~,O~Y<1.
=~.1~y<4,
=1,y>4.

Chapter 4
4·1. Ca) x 0 I 2 3 4 5
pj.x) 27/50 II/50 6/50 3/50 'li50 1/50

Y 0 I 2 3 4
py(y) 20/50 15/50 10/50 4/50 1/50
(b) Y 0 I 2 3 4
Pno Cy ) llf27 8/27 4f27 3/27 In7
Ce) x 0 I 2 3 4 5
P.oCx) llnO 4/20 2120 1/20 Ino 1/20

4-3. Ca) k = 1.
,
Cb) Ix,Cx,) =100, 0 ';x,'; 100,
= 0, otherwise.
,
Ix,Cx,) = TI>, 0 ,; x,'; 10,
= 0, otherwise.

(c) IXI.x/xI'~):::: 0, if xl or Xz < 0,


=~, 0<x l <100, 0<Xz<10,
=-frm. a <XI < 100, ~ ;;::'10,
=ti, XI;::: 100, O<~< 10,
= I, x, ;, 100, x,;' 10.
4·5. Ca);. (b) ior.
Ce) J..Cw)'= 2w, 0,; w'; I,
= 0, otherwise.
4·7. ~ 4-9. ECX,ix,) =l. ECX,i:,,) =;. 4-15. ECl') = 80, VCl') =36.
4-19. P =~. 4-21. X and Y are not independent. p = -0.135.

4·23. Ca) Ix(x)=3c~I-x', -1<x<I,

= 0," otherwise;
fr(Y)=~JI-y', O<y<I,
"
:::: 0, otherwise.
642 Answers to Selected Questions

4-23. (h) f..,,(x)= ~,. -~I-y' <X<~I-y',


2,I-y
= 0, otherwise;
C""1
1
iJ'1.x(Y)= r~-=' 0<y<'VI-x ,
'i1_X2
;;; 0, otherwise.
(e) E(Xiy)=O.
r~~--

E(Ylx)=t~l-x' .
2+3y
4-27. (a) E{Xly)= 3(h2Y)' O<y<l.

4-29. (a) k= (n - 1)(n ·2). (h) F(x. y) = I · (1 + x),". (1 + y)'" • (1 + x+ Y)"'.;P 0, y;> O.
4-31. (a) Independent. (b) Not independent (c) Not independent
4-33. (a) i (b) i
4-35. (a)F,(z)=Fx!(,.a)lb). (b) F,!.z)=I-Filh). (e) F,!.z) = Fx(i'). (d) F,!.z) = FJ)n z).

Chapter 5
5.1. P(X=x) =(~ p' (1- p)4.', x = 0, 1,2.3,4.

5·3. Assuming independence, P(W" 4) = 1-(0.5)" ~(~) =0.927.


5-5. p(X>2)=I- L, (50)
x (0.Q2)"(0.98)>>-' =0,078.
.r"'-l )
5·7. PCu S 0.03) '" 0.98. 5·9. p(X= 5) "0.0407. 5·11. p '" 0.8.

5·13. E(X) = ,\Ix (0) = 1., E(X') =M;(O) = ~,<1~ =E(X')-(E(X))' =3,.
p p' P
5·15. P(X= 36) '" 0.0083.
5·17. P(X = 4) =0.077. P(X <4) =0.896. 5·19. E(X) = 6.25, VeX) =1.5625.
5·21. p(3, 0, 0) + p(O, 3, 0) + P(O. 0, 3) = D.llS. 5-23. p(4, I, 3. 2) '" 0.005.
=
5.25. P(X" 2) 0.98. Binomial approx. P(X S 2) '" 0.97.
5·27, P(X" I) '" 0.95. Binomial'ppro.". "" n = 9.

5·29, P(x<IO)=11.388SXIO·"yf,(25Y 5-31. P(X>5) = 0.215.


\ } x-.d) xl

5--33. Poisson model, c:::::: 30.

P(XS;3)=e'''± 30',
..:..0 xl

P(X~5)=I-e'''± 3~ .
.r< X_I

5-35. Poisson model, c = 2.5. P(X" 2) '" 0.544.


5~37. X = number of errors on n pages - Bin (5, nJ200)_ (a) n -= 50. P(x ~ 1);:;; 0_763.
(h) P(X" 3)" 0,90, ifn = 15!'
5-39. P(X <: 2) = 0.0047.
Answers to Selected Questions 643

Chapter 6
,,
6-1. ~,n·

6-3. f,,(y) =l, 5 <y <9,


= 0, otherwise.
6-5. E(X) = M~(O) = (f3 + a)l2, VeX) = M~(O) - [M>O)J' = (f3 - 0.)'112.
6-7, y Fyiy)
y<1 0
l~y<2 0.3
2~y<3 0.5
3~y<4 0.9
y>4 1,0
Generate realizations u. i - uniform [0, 1] as random numbers, as is described in Section 6-6; use these
in the inverse as Y. =Fy-1(u.). i = 1, 2, ....
6-9, E(X) =M~(O) = 1/.1., VeX) =M~(O) - [M~(O)J' = 1/-'.'.
6-11. 1- e-1I6 == 0.154. 6-13. 1- e- == 0.283.
1f3

6-15. C1 = C, X > IS, ClI = 3C, x > 15,


=C+Z,x:515; = 3C+Z, x.::; 15;
E(CI) = Ce-'!> + (C + 2)[1- e- 31'J '" C + (0.4512)2;
E(C,) = 3Ce-3n + (3C + 2)[1- e-J"J '" 3C + (0.3486)2;
=> Process J, if C > (0.0513)2-
6-19. 0.8305. 6-23. 0.8488.

6-27. For-'.=I, r=2, fxtx) = (~(3)( ).xO(I-X)=2(I-X),0<x<1.


r 1 ·r 2
= 0, otherwise.
6-31. 1- e- I '" 0.63. 6-33. = 0.24. 6-35. (a) = 0.22. (b) 4800. 6-37. = 0.3528.
Chapter 7
7-1. (a) 0.4772. (b) 0.6827. (e) 0.9505. (d) 0.9750.
(e) 0.1336. (I) 0.9485. ,(g) 0.9147. (h) 0.9898.
7-3. (a) c = 1,56. (b) c = 1,96. (e) c = 2.57. (d) c = -1,645.
7-5. (a) 0.9772. (b) 0.50. (e) 0.6687. (d) 0.6915. (e) 0.95.
7-7. 30.85%. 7-9. 2376.63 fc.
7-13. (a) 0.0455. (b) 0.0730, (e) 0.3085. (d) 0.3085.
7-15. B, if cost of A < 0.1368. 7-17. !' = 7.
7-19. (a) .0.6687. (b) ~&4. (e) 6.018.
7-23. 0.616. 7'25. 0.00714,
7-27, (a) 0.276. (b) At!, = 12.0,
7-29. (a) 0.552. (b) 0.100. (e) 0.758. (d) 0.09.
7-30. n = 139. 7-36. 2497.24.
7-37. E(X) = e623 • VeX) = e 125 (o"_I), mdn = eso , mode = 0". 7-41. 0.9788. 7-42. 0.4681,

Chapter 8
8-1. x = 131.30,? = 113.85, s = 10.67. 8-21. x = 74.002,? = 6.875 x 10", s = 0.0026.
8-23. (a) The sample average will be reduced by 63.
644 Answers to Selected Questions

(b) The sample mean and standard deviation will be 100 units larger. The sample vali.ance v.iJl be
10,000 units larger,
8·25. a = x. 8·29. (a) x= 120.22, s' 5.66. s =2.38. (b) x = 120. mode = 121.
8-31. For 8·29. cv = 0.0198; for 8·30. CY 9.72, 8·33. = x= 22.41, s' =208.25, x = 22.81, mode = 23.64.
Chapter 9 ,
9·1. j(x" x,. "., x,); (1I(2nO"»5!le·,n"' L (x,·· Il)'.
9·3. j(x" x" x" xJ = 1. 9·5. N(5.00,O.00125).
9·7. Use S I·r;;.

9·9.
- - :~-;;:z
ThestandatderrorofX,-X,is i'....L+:2.='/-·-·-+-=--·=0.47.
I(T 5)' (" 0)'
9·11. N(O,I).
- in! r-;. ,:25 30

9·13. se(p)=.j;{I=IN~, se(p)=~p{H)/n. 9·15. J.l u,O"=2u.


~ " 2n2(m+ft~2)
9-17. ForF",Il.wehave,u=n!(n-2)rorn>2anda"'- .'~ forn>4.
. . rn(n-2)"(.-4)
, (f';:;:
9·21- JX;1l I
1- e-t'.N. F.:Iii,,;'(t) =: 11_ e.J.Al't,
\ )
9·23. (al 2.73. (b) 11.34. (e) 34.17. (d) 20,48.
9·25. (a) 1.63. (b) 2.85. (el 0.241. (d) 0.588.

Chapter 10
10-1. Both esti:nators are unbiased. ~ow,. V(X1):= (J'llln whereas V(X2) ::= a 211'1. Since V(X\) < V{X:J, XI is
a more efficient estimator ':han )(2'
10-3. $'l' because: it would have a smalier MSE.

10-7.
A fox, -
IX;£.. -=X. 10·9.
.
lit.
i=l n

2
10·25, , . r 6f
f(.uJx),xZ,,,,x,,)::::CJ, 2(2~rtl12Jexp~ __ J,l--11'--Z+-f
nX JJ.c ,\'i1 ]l,
I( L2. c\ a 0'0"
/j j

n 1
wMre C=-.,+-,.
a" °0
10·21- The posterior density for p is a beta distribution with para.n:.eters a + n and b + L. Xi - n,
10·29, The posterior density fer ?.is gamma with parameters r =: m + Lx; + 1 and 0;:; n + (m +l)/l,:.
10·31. 0.967. 10·33. 0.3783.
j(x"e) 2x
f )=-':-(~)' =-'(2-2" (b)
10·35. (al j.elx, 112. 10-37. at:::;: a;::::: cd2 is shorter,
J x·, e , - xl,
10·39, (a) 74.03533 ~ 1">74.03666. (b) 74.0356" .u.
10..41, (al 3232.11 ;; i" ~ 3267.89. (b) 1004.80:$ J.l. 1043. 150 Or 151.
10-45. (a) 0.0723:$ J.l, -Ilz:$ 3267.89. (b) 0.049% Il: -Ilz:$ 0.33. (e) i", -Ilz;; 0.3076.
Answers to Selected Questions 645

1041. -3.68 S 1', - iLl S -2.12. 1049. 183.0 S I'S 256.6. 10·51. 13.
10·53. 94.282 S it ~ 111.518. 10-55. -D.839 ~ .u, - iLl ~ -D.679. 10·57. 0.355 $1', - iLl'; 0.455.
10-59. <a) 649.60'; a' $ 2853.69. (0) 714.56 s a'. (e) a' $ 2460.62. 10-61. 0.0039$ a' $ 0.0124.
10-63. 0.514" ,,',; 3.614. 10·65. O.ll S 0';1,,; ~ 0.86. 10-67. 0.088 $p S 0.152. 10·69. 165i7,
10-11. -D.0244 "p, -p,,,
0.0024. 10·13. -2038 $1', - iLl'; 3714.8.
10-75. -3.1529';!1, - iLl " 0.1529; -1.9015 "!1, - iLl S 0.9015; -D.1775 S!1, - iLl S 2.1775.

Chapter 11
11·1. <aJ:z;, -:.333, do not reject H,. (0) 0.05. U·3, (a) z;, = -12.65. reject lID. (b) 3.
11·5. :z;, = 2.50, reject II,. 11·7, <al:z;, = 1.349, do not r<;ject H" (b) 2. (e) L
11·9, Z, = 2.656, reject H~ n·11. z;, = -7.25, reject H,. 11·13. 10 = 1.842, do not reject Iio•
11-15. t o ;;d.47, do not rejectHo at a=: 0.05. 11-17. 3.
11·19. (a) '0 ~ 8.49, reject Ii", (b) '0 -2.35, do not reject II,. (e) 1. (d) 5.
11·21. Fe = 0.8832, do not reject He' 11·23. <a) Fo = 1.07. do not reject He' (b) 0.15. (e) 15.
11-25. to"" 0,56, do not reject Hey
11·27, (a) ~~43.75,rejectH,. (b) 0.3078 x 10"'. (e) 0.30. (d) 17.
11·29. (aj X; = 2.28. ,eject H,. (b) 0.58. 11·31. Fo = 30.69, rejeet H,; P'"' 0.65.
11..33. :;;;;: 2.4465, do not r<Bect Hr;> 1l<~5. to =: 5.21, reject HD, 11..37. Zo;;: 1.333, do not reject HQ•
1141. Zr;=-2.023,donotrejectHo. 11-47. ~=:2.9!5,donotrej:;ctHrr
1149. ~ = 4,724. do not reject Ho. 11",53. ~ = 0,0331, do not rejectHc~
11~55. ~ =: 2A65. do not reject Hrr 11s!. ~ =34.896. rejectHD• 11~59. ~ =: 22.06, reject He'

Chapter 12
12-1. (aj Fe =3.17. 12-3. (a) Fo =12.13. (bj );.fixing technique 4 is different from I, 2, and 3.
12-5. (a) Fe ~ 2.62. (b) jl = 21.70, '" 0.023. T, ~ -D. 166, i:, 0.029, "4 ~ 0.059,
12·7. (a) Fe=4.0L (b) Mean 3 differs from 2. (e) S5"=246.33. (d) 0.88.
12-9. (aJ Fe ~ 2.38. (b) None. 12-11. n = 3.
12-15. (aJ /l = 20.47, !; = 0.33, T, = 1.73. i:, ~ 2.07. (b) T, - i:, = -lAO.

Chapter 13
13-1. Source DF S5 MS J? P
CS 2 0.0317805 0.01.58903 1.5.94 0.000
DC 2 0.0271854 0.01.35927 13.64 0.000
CS'DC 4 0.0006873 0.0001.71.8 0.1.7 0.9S0
Error 1.8 0.017941.3 0.0009967
Total 26 0.0775945

Main efiects are significant; interaction is not significant.


13-3. -23.93 $ i', - f.L, ~ 5.15. 13·5. No change in conclusions.
13·7. Source DJ? S5 MS F P
glass 1 14450.0 1.4450.0 273.79 0.000
phos 2 933.3 466.7 8.84 0.004
glass*phos 2 133.3 66.7 1.26 0.318
Error 12 633.3 52.8
Total 17 16150.0

Significant main effects.


646 Answers to Selected Questions

13-9_ Source
- -.. ~~.~
DP SS MS F P
Cone 2 7_7639 3_8819 10_62 0.001
Freeness 2 19_3739 9.6869 26.50 0.000
Time 1 20.2500 20_2500 55.40 0.000
Conc'llFreeness 4 6.0911 1.5228 4.17 0.015
Conc*Time 2 2.0817 1. 0408 2_85 0.084
:E'reeness*Time 2 2.1950 1. 0975 3.00 0.075
Conc~Freeness~Tirne 4 1.9733 0.4933 1.35 0.290
Error 18 6.5800 0.3656
Total 35 66.3089
Concentration.lIDle, Freeness. and the interaction Tlllle·Freeness axe significant at 0.05.
13--15. Main effects A. B, D. E and the interaction AB are signi:ficant
13-11. Block 1: (1). abo ac, be. Block 2: a, b. c. abe.
13-19. Block 1: (ll, abo bed. !lCd, Block 2: a. b. cd, abed, Block 3: c, abc. bd. ad, Block 4: d, abd, be. ac.
13-21. A and C are significant. 13-25. (a) D; ABC. (b) A is significant.
13M27. 2>-\ wit.i two replicates. 13..29. 25-2 design.. Estimates for A, E, and AB are large.

Chapter 14
14-1. (a) y = 10.4391 - 0.00156.<. (b) F,; 2.052. (e) --0.0038'; fl, ,; 0.00068. (d) 7.316%.
14-3. (aJY;J 1.656 - 0.041x. (b) Fo ~ 57.639. (c) 81.59%. (d) (19.374.21.388).
14-7. (a) y= 93.3399 + 15.6485.. (b) Lack of fit not significant, regression significant.
(e) 7.997'; .8, s: 23.299. (d) 74.828 $ flo s 111.852. (e) (126.012.138.910).
14-9. (a) y; --6.3378 + 9.20836x. (b) Regression is significant. (e) to ~ -23.41. reject He.
(d) (525.58,529.91). (e) (521.22, 534.28).
14-11. (a) y; 77.7895 + 11.8634x. (b) Lack of fjt not signifi=~ regression is significru:;L
(e) 0.3933. (d) (4.5661. 19.1607).
14-13. Ca) y 3.96 + 0.00169x. (b) Rcgrcsiion is significant.
(e) (0.0015.0.0019). (d) 95.2%.
14-15. Ca) y~ 69.1044 + 0.4194.<. (b) 77.35%. (e) t, ~ 5.85, reject lit.
Cd) ZO~ 1.61. (e) (05513.0.8932).

Chapter 15
15-1. Ca) Y;7.30~0.0183xl-0.399x.,. (b) Fo~ 15.19.
15-3. 15-5. y; -1.808372 + 0.00359&x, + O.1939360x, - 0.0()4;l15x,.
(-0.8024.0.0044).
IS-7. Ca) ,;; -102.713 + 0.605x, + 8.924:<" + 1.437", + 0.014.<,.
(b) Fe; 5.106. (c) fJ,. Fa ~ 0.361; fJ,.Fo O.OOO<l.
15-9. Cal y = -13729 + 105.0""" - 0.18954>'.
15·13. (a) y ~ -4.459 + 1.384" + 1.467x'. (b) Significant lack of fit. (e) F, = 16.68.
15-15. to= 1.7898. 15-21. v'IF, ~VIF,; 1.4.

Chapter 16
1/i-l. R = 2. 16-5_ R = 2. 16-9. R ~ 88.5. 16-13. R, = 75.
16-15. Z, -2.117, 16-17. K = 4.835.

Chapter 17
17-1. (a) ~ ~ 34.32, R ~ 5.65. (b) PeR, ~ 1.228. (e) 0.205%.
17-5. DI2. 17·7. LCL; 34.55. CL; 49.85. UCL ~ 65.14. 17-9. Process is not in control.
A.nswers to Selected Questions 647

17·13. 0.1587, n ~ 6 or 7. 17·15. R....1sed cOlltrollimits: LCL~ O. UCL = 17,32.


17-17, UCL=16.485: 0.434. 17·19, LCL= om. UCL =4378. 17-25. (alR(60l=O.0105. (b) 13.16.
17·27. 0.98104. 17-29. 0.84,0.85. 17-31. Cal 3842. (b) (913.63, OOJ.

Chapter 18
18-1. (al ~ 0.088. (b) L = 2, L, = 1.33. (el W = 1 h.

18-3. l~PJ.
2
1/ 1-
1/2;

o
18-7. (al p= 0
[
I~ P I-p~
P 0 1- P
p
o

A=[lOOol·
, 9 ,
18·9, (a) ,0' (b)?ii' (e) 3.
(d) 0.03. (e) 0.10. (I) 10'
18-11. (a) 0.555. (b) 56.378 min. (e) 244.18 min.

18-13, (a) Pj=[(.:llili /f!j-Po,f=0,1,2, .. "s, (b) s=6,p=G.417.

""" 0, otherwise;
I
Po i (;.!!l)i .
j~ j!
(e) p, = 0.354. (d) From4L6 to 8.33%. (el ¢= 4.17. p, 0.377.

Chapter 19

19-3. E(l"l= b:" ~ t,f(a+(b-a)V11J


= (b-a)E(J(a+(b-a)V,))

t
=(b··o) f(a+{b-a)u)du
=I.
19-5. (a) 35. (b) 4.75. (e) 5 (at time 14).
19·7, Ca) X,I, V, 1116, Xi =6, V, =6/16. (b) Yes. (e) X"o 2.
19·9. (al X=-(II.1.) tn(l- U). (b) 0.693.

19·11. (a) X=-2,,}1-2V, ifO<V<1/2, (b) X=O.894.


=2~, if 112 <V <1.
19-13. (a) X = a[-tn(l- v)]"p. (b) 1.558.
19-15, :E;:: V,-6=1.07. 1.9·17. X=-(l/.1.)m(V,V,)=O.84L
19·19, X = 5 trials. 19-21. [-3.41,4.41].
19-23. (80.4,119.6]. 19·25. E<Jl"llelltial with parameter 1; -V (V,) = -1/12.
Index

22 factorial design. 373 nonnal approximation to Bootstrap confidence intervals,


2 3 factorial design. 379 binomial, 155 253
2); factorial designs+ 373, 379 Assignable cause of variation, Bootstrap standard etror, 231
2k- l .fractional factorial design, 509 Box plot, 179
394 Asymptotic prOperties of the BOA-Mill.1ermethod of
2k."? fractional factorial design. maximum likelihood generating normal random
400 estimator, 223 numbers, 5&4
3-sigma control limits, 511 Asymptotic relative effiCiency. Burn-in, 540
499,501
A Attribute data. 170 c
Absorbi.:lg sta~e for a Markov Attocorrelarion in regression, c chart, 523
cbain, 558, 560 472 Car..esian product, 5, 71
Acceptance-rejection method of Average ron length (ARL), 532, Cauchy distribution, 168
random number generation, 534,535 Cause-and-effect diagram, 509,
586 535
Active redunda:Lcy, 544 B Censored:ife teSt, 548, 549
Actual process capabilit)'. 517 Bacbvard elimination of Census, 172
Additivity theorem of chl~ variables in regression, Centerline on control chart,
square. 207 483 510
, Adjusted R', 453, 476 Batcb means, 590 Central limit theorem, 152, 202,
Aliasing of effects in a fractional Bayes' estimator. 228. 230 233,546,584,588
fact,:)rlal,395,400 Bayes' theorem, 27, 226 Central moments, 48, 58, 189
All possible regressions, 474 Bayesian confidence intervals, Chance cause of variation, 509
Alternate fraction, 396 252 Chapman-Kolmogorov
Alternative hypothesis, 266 Bayesian inference, 226,227, equations, 555, 556, 561
Analysis of variance, 321. 323, 252 Characteristic function, 66
324,331,342 Bernoulli distribution, 106, 124 Chebysbev's incq:Hhlty, 48
Analysis of variance model, 323, Bernoulli process, 106 Check sheet, 509
328,337,341,359,366,367 Bernoulli rlIDdom variable, S82 Chi-square distribution, 137,
Analysis of variance tests in Bernoulli tr',aJs, 106,555 168,206,238,300,549
regression, 416, 448 Best linear unbiased estimator, mean and variance of, 206
Analytic study, 172 260 additivity theorem of, 207
A'iOVA, see analysis of Beta. distribution, 141 percentage points of. 603
variance Biased estimator, 220, 467, 588, Chi~square goodness-of-fit test,
A:.ltithetic variates, 591, 593, 589 300
595 Binomial approximation to Class intervals for histogram,
Approximate coniidence hypergeometric, 122 176
intervals in. ma.--umum Binomial distribution, 40, 46, 66, Cochran's Theorem, 325
likelihood estimation. 251 106, !O8, Ill, 124,492 Coefficient of deternrination,
Approximation to the mean and melID and variance of, 109 426
variance, 62 cumulative tlmomial Coefficient of multipie
Approxin:tations to Cistributions. distribution, 110 determ.ination, 453
122, Birth-a""", equations, 555, 564 Combinations, 16
binomial approximation to Bivariate normal distribution. Common random numbers i.rl
hypergeometric, 122 160,427 simulation, 59:. 593
Poisson approximation to Blocking Pl'_,c;p]e, 341, 390 Comparison of sign ~t a:J.d
binom:ial. 122 Bootstrap, 231 I-test, 496

649
650 Index

Comparison of\,\'ilooxon rank~ Confidence limitS, 232 Cycle length of a random


sum test and t-test, 501 Confounding, 390, 394 DllIl:ber generator, 581
Comparison ofW::.1ooxon signed Consistent estimator, 220, 223
rank test and I-test. 499 Contingency table, 307 D
Complement (set complement), Continuity corrections, 155 Data, 173
3 Continuous data, 170 Decision int-.'TIal for the
Completely randomized design, Continuous function of a CUSUM,527
359 contirtuous random Defect concentration diagram
Components of variance modei. variable, 55 536
323 Continuous random ...a...-iable. 41, Defects, 523
Conditional distributions. 71. 79, 55 Defining contTIlSts, 391
80,128,162 Continuous sample space, 6 Defining relation for a desig:1:,
Conditional expectation. 71, 82, Continuous sim.I.l.,iation output, 395,400
84 587 Degrees offreedom. 188
Conditional probzbLity, 19.21 Continuous uniform distribution, Delta method, 62
Confidence coefficient. 232 128,140 Demonstration and acceptance
Confidence inter>al. 232. 233 mean ar.d \-'Rriance of. 129 testing. 551
Co:iidence interval on the Continuous-time :Markov chain. Descriptive statistics, 172
difference in means for 561 Design generator, 395,. 400
paired observations, 247 Contour plot of response, 355, Design of experimentS, 353, 354
Confidence inter.'ai on the 358 Design resolution, 399,402
difference in means of two Contrast, 374 Designed experiment, 321, 322.
:1ormal distnDutions, Control chart. 509 536
variances known, 242 ContrOl charts for attributes, 520, Determ!.nistic versus
Confidence interval on the 522,523,524 nondererrr.inistic systems, 1
difference in means of two Control chart for individuals, Discrete data, 170
:.ormal distributions, 518 Discrete distributions, 106
variances unknown, 244, Control charts for measurements. Discrete random variable, 38, 54
246 510,518,522 Discrete sample space, 6
Confidence interval on the Control limits, 510, 511 Discrete simulation output, 587
difference in two Convolution, 585 Discrete-event simulation. 576
proportions? 250 CODYoh:tion method of random Distribution free statistical
Confidence interval on mean number generation. 585 methods, see nonpararnetric
response in regression., 418 Cook's distlltlce, 470 :nethods
Confidence interval on the mean Correlation coefficient, 190 Distribution function 36,37,38,
of a nor!:1al distribution, Correlation matrix of :egressor 91,110
variance lcnOv.n, 233, 236 variables, 461 Dot plots, 173, 175
Confidence interval on the mean Correlation, 71, 87, 88, 101,.409, Durbin-Watson test, 472
of a normal distribution, 427 Dynamic simulation, 576
variance unknown., 236 Covariance, 71, 87, 88,161
Confidence interval on a Covariance mat:ri.x of regression E
proportion, 239, 241 coefficients. ~3 Enumerative study, 172.204
Confidence interval On me ratio Cp Statistic in :regression, 475 Equally likely outcomes, 10
ofvariancC$ oftwn normal Cramer-Rao lower bound, 219, Equivalent events, 34, 52
distributior.s, 248 223,251 Ergodic property of a Markov
Confidence i::1tervaLs on Critical region, 267 chain,559
regression coefficients. 4:7, Critical values of a test statistic, Erlang distributiou, 5&5
444 267 Error sum of squares, 325, 413
Confider.ce interVal on Cumulative distributiQn function Estiro.able functions; 329
sim:llation output. 588, 591, (CDF), see distribution Estin::.ated standard error, 203,
592 function 230,240
Confidence i:.terval OIl. thc CUll':.U.lao.vc nor:::nal distribution. Estirr:.at:i.on of r:? in regression,-
variance of a normal 145 414,444
distribution. 238 Curr:.ulative sun: (Cl!SL"M) EStiInationofvariance
Confidence level, 234, 235 control chart, 525, 526, 534 components, 338,. 367, 368
Index 651

Events. 8 Function of a random variable, Hypothesis tests on the mean of


Expectation, 58 52, 53 a norrn.a1 distribution.
E.~pected JJie, 62 Functions of two random varianee known, 271
Expeeted mean squares, 326, variables, 92 Hypothesis teStS on the mean of
338,360,361,366,368 a normal distribution•
..E.xpected :ecurre:nce time in a G variance ur.known, 278
Markov chaw, 558 Gamma distribution, 67,134, Hypothesis tests on the means of
Expected value of a random 140.540,546 tvlo normal distributions.,
variable, 58, 77 relationship of gamma and variances knmvn. 286
Experimental design, 508 exponential, 135 Hypothesis tests on the means of
Exponential distribution, 42, 46, mean and variance of. 135 two normal distributions,
130,140.540,541.548. relationship to chi~square variances unknmvn, 288.
546,564,583 distribution, 137 290
relationship of exponeIltia.! three-parameter gamma, Hypothesis tests on the vatianee
, and Poisson, 131 141 0; a nol1ll.al distribution.
mean and variance of 131 Gamma funetion, 134 281
memoryless property. 133 General regression signifieance Hypothesis tests on tvlO
..E.xponentially weigh:ed !eslS,450 propoctions, 297,299
mOving average (EViM,A.) Generalized ir.teraction. 393
control ehart, 525. 526. Generation of random variables. I
529.534 580,582 Idealized experimentS, 5
EXtra sum of squares method. Generation of realizations of ldenti:y element. 381
450 random variables, 123, 138, In-control process, 509
Extrapolation, 447 164 Independent events, 20, 23
Geometric distribution, ] 06, 112, Independent expet'.ments, 25
F 124.535,586 Independent ra.1rlom variables,
Factor effects, 356 mean and variance of. 113 71,86,88.95
Factorial experiments. 355. 359, memoryless property, 114 Indiearor variables, 458
369 Inferential statistics, 170
Failure rate, 62, also see bazard H In.fi:lential observations in
function Half-interval corrections, 155 regression, 470
F-distribution, 211 Half-nonnal dlstribution, 168 Initialization bias in simulation,
mean and va.."ianec of, 212 Hat matrix in regression, 471 589
percentage points of, 605, Hazard function, 538, 539, Instantaneous failure rate. see
606,607,608.609 541 hazard function
Finite population, 172, 204 Histogram. 175, 177, 509. 514 Intcns,ty of passage, 562
Finite sample spaces, 14 Hypergeometric distribution, 40, Intensity of transition, 562
Finite~state Markov ch.a.in, 556 106,117.124 Interaction, 356, 374.438
First passage tiJ:Ge, 557 mean and variance of, lIS Interaction term in a regression
Fin;t~order autoregressive model, Hypothesis testing, 216, 266, model, 438
472 267 Intersection (set intersection), 3
Fixed effects ANOVA, 323 Hypothesis tests in the Interval failure rate, 538
Fixed effects model, 359 correlation model, 429 Intri.:J:\sieally linear ;egression
Forward selection of variables in Hypothesis tests in multiple model,426
regression, 482 linear regression, 447 InvarianCc property of the
Fraction <!efective or Hypothesis tests in simple linear ma."<.imum likelihood
nonconforming, 520 regression. 414 estimator, 223
Fractional factorial design, Hypothesis tests on a proportion, Inventory system, 580
394 283 Inverse transfonn method for
Frequency distn'bution,175, Hypothesis testS on individual generating ::an.dom
176 coefficients in multiple nll1tlbers. 582
FuU-cyele :random number regression, 450 Inverse transform theorem, 582
generator. 581 Hypothesis tests on the equality Inverse transformation method,
Function of a discrete tandom of two variances, 295, 64
variable. 54 296 In'educ-;o1e Markov chai:.!. 559
652 Index.

J Mean squares, 325, 342, 360 Nonparametric ANOVA, see


Jitter, 174 Mean time to failure (MTI'F), Kruska1~ Wallis test
Joint probability distributions. 62,540 Nonparametric confidence
71, 72,73,94 Measurement data. 170 interval, 237
Judgmentsample, 198 Median of the population, 185 Nonparametric methods, 237,
Median, 158,492 491
K Memoryless property of the Nonterminating (steady state)
KrJ.Skal-Wallis tesr, SOl, 503, exponential distribution~ sUtxilations,587,590
504 133,541 NoIJ.D.al apprOximation for the
Kurtosis, 189, 307 Memory!ess property of the sign test, 493
geometric distributioo, Normal approximation for the
L 114 Wilcoxon rank-sum test,
Lack of fit test in regression, 422 Method of least squares, see 501
Large·sample confidence least squares estimation Normal approximation for the
interval, 236 Method of maximum likelihood. Wilcoxon signed rank test.
Law oflarger.umbers, 71, 99, see maximum li.\:elihood 497
101 estimator Nonnal approximation to the
Law of the ur.conscious 1fini.mum varianc¢ unbiased binomial, 155,241
statistician. 58 estimator. 219 Normal distribution, 143,583,
Least squa.--es estimation, 328, Mixed mudd, 367 mean and variance of.
410,438 Mude, 158 144
Least squa.'"Cs nonnal equations, Model adequacy checking, 330, cumulative distribution, 145
328,410,439,440 345,364,384,414,421, reproductive property of,
Life testing, 547 452, 454 150
Likelihaud function, 221, 226 Moment estimator. 224 Normal probability plot, 304,
Linear combinations of random Moment generating function, 65. 305
variables, 96, 99,151 99,107,110,114, 116,121, Normal probability plot of
Linear congruential random 129, 132, 135, 145, ISO, effects, 387, 398
number generator (LeG). 202 Normal probability plot of
581,582 Moment', 44, 47, 48, 58, 65 residuals,330
Linear regression DJJXlcl, 409, Monte-Carlo integration, 578 Null hypothesis, 266
437 Monte-Carlo sin:tulation. 577
Little's law, 568 Moving range, 518 o
LognormaJdistribution, 157, 159 mp'.l, see mean per unit estimator One factor at a time expericent.
mean and variance of. 158 Multicollinearity, 464 356
Loss function, 227 Multicollinearity diagnostics, One-half fraction, 394
Lower control1imit, 51 0 466 One-sided alternative hypothesis,
Multinomial distribution, 106, 266,269-271
M 116 One-sided confidence interval,
Main effects. 35.5, 374 Multiple comparison procedures, 233,236
Mann-Vlhitney test, see 593 One-step transition matrix for a
VlUcoXQO rank-sum test Multiple regression model, 437 Murkov chain, 556, 563
Margi:na.l distribution. 71, 75, 76, Multiplication principle, 14" 22 One-way classification ANOVA,
161 Multiplicative linear 323
Markov chain, 555, 558, 559, congruential random Operating charactutistic (OC)
560,561 oumber generator. 582 curve. 274. 275,280. 347
Markov process, 555, 573 Mutually exclusive, 22, 24 charts for, 610-626
MlL-kov property, 555 Optimal estimator, 220
Maximum likelihood estimator, N Order statistics, 214
221,223,230,251,428 Natural tolerance limits of a Origin moments, 47,58,224
Mean of a random variable- 44. process, 516 Orthogonal contrasts, 332
58 Negative binomial distribution, Orthogonal design, 381
Mean per unit estimate. 204 106,124 Outlier, 180,421
Mean square error of an Noncentral F distribution, 347 Output analysis of simulation
estiuJator, 218 Nonccntral t-distribution, 279 models, 586, 587, 591
Index 653

p Probability plotting, 303 Regression of thc mean, 71, 85,


p cnart, 520 Probability sampling, 172 163
l'.urcd data, 292, 494, 498 Probability, 1, 8, 9, 10, 11.12, Regression sum of squares. 416
Paired r~test. 292 13 Regressor va."'iable, 409
Parameter estimation. 216 ProCess capability, 514, 516 Rejection region, see critical
Pareto char. 181,509,535 Process capability ratio, 516, 5:8 region
Partial regression coefficients. Projection of 2k designs, 384 Relationship between hypothesis
437 Projection of2K<1 designs, 398 teSts and confidence
Partition of j)e sample space. 25 Properties of estimated intervaJs,276
Pascal tUStribUtiOD, 106, 115, :regression coefficients, 412. Relationship of exponential and
124 413,443 Poisson random variables.,
mean and variance of. 115 Properties of probability, 9 131
Pearson correlation coefficient, Pseudorandom numbers (PRNs), Relationship of gamma and
see oorrelatlon coefficient 581 exponential ratldom
P~utatioDS) 15, 19 variables, 135
Piecewise linear regression. 490 Q Relative efficiency of an
Point estimate, 216 QuaLitative regressor variables, estimator, 218
Point estimator, 216 458 Relative frequency, 9, 10
Poisson approximation to Quality improvement., 507 Relative range, 511
binorr.ial, 122 Quality of conformance, 507 Reliability, 507, 538
Poisson distribution., 41. 106, Quality of design, 507 Reliability engineering, 24, 507,
118, 120, 124,523 Quetting, 555, 564, 568, 570, 537
mean and variance of, 120 572,573,579 Reliability estimation. 548
cumu1ative probabilities for, Reliability funcdon, 538, 539
598,599,600 R Reliability of serial systems. 542
Poisson process. 119, 582 Robart, 510, 512 Replication, 322
Polynomial regression model. R',426,429,453,475 Reproductive property of the
438,456 R1adit see adjusted R2 normal distribution. 150,
Pooled estimator of variance, Random effects A:.'10VA, 323, 151
289 337 Residual analysis, 330. 345, 364,
Pooled r-test, 289 Random eff<>.-'is model, 366 384,414,421,454
PopUlation. 170, 171 Random experiments. 5 R.. :dlWs, 330, 364, 377, 421
Popclarion mean, 184 Ra.'1.dom number. see Resolution m design. 399
Population mode. 185 pseudorandom number Resolution N design, 399
Positive reC':lII'ent Markov chain., Random sample, 198, 199 Resolution V design. 400
558 Random sampling, 172 Response variable. 409
Posterior distribution, 226 Random variable. 33, 38 Ridge regression, 466, 4f>7
Potential process capability, 517 Random vector, 71 Risk, 227
Power of a statistical test, 267. Randomization, 322.359
347 Randomized block ANOVA, S
Practical versus statistical 342 Sample correlation coefficient,
significance iI:. hypothesis Randomized block design, 341 428
tests. Tl7 Rank transformation in ANOVA. S<mlple mean, 184
Precision of estimation, 234, 235 504 Sample median, 184, 185
Prediction in regression, 420, Ranking and selection Sample mode, 185
446 procedures, 591, 593 Sample range, 188
Prediction interval. 255 Ranks, 496, 498, 499, 502, 504 Sample size fur ANOVA, 347
Prediction interval in regression, Rational subgroup, 510 Sample size for confulence
420,446 Rayleigh distnlludon, 168 intervals, 235, 237, 241,
Principal block, 392 ReCUI1'e.nce state for a Markov 2"4
Principal medon, 396 chain, 558 Sample size for hypothesis tests,
Prior distribution. 226 Recurrence time. 557 273,279,282,284,287,
Probability density function, 42 Redundant systems, 544, 545 291, 296, 298
Probability distribution, 39. 42 Regression analysis. 409 Sample spaces., 5, 6, 8
Probability mass function. 39 Regression model 437 Sample standard deviation, 181
654 Inde,

Sample variance, 186 S:andardizing, 146 Total !'robability Law, 26


Sampling distribution, 201, 202 Standby redundancy. 545 Transfonna~on in regression,
Sampling fraction, 204 S~ate equations for a Markov 426
Sampling with replacement, 199, chain, 559 Transformation of the response,
204 Statistic, 201, 216 330
Sampling without replacement, Statistical control) 509 Transient state for a Markov
172. 199,204 Statistical hypotheses. 266 chain, 558
Satu.rated fractional facrorial. Statistical inference, 216 Thmsition probabilities, 555,
402 Statistical process control (SPC), 556,563
Scatter diagram. see scatter plot 508 Treatment. 321
Scatter plot, 173, 174, 17.5. 41 1. Statistical quality control. 507 Treatment sum of squares. 325
412,509 Statistically based =pling Tree diagram., 14
Seed for a random nU0100r plans, 508 'Ilial control limits, 512
generator. 581 Statistics. field of, 169 Triangular distribution. 43
Selecting the form of a Stem and leaf plot, 178 TriInIned mean, 195
distribution, 306 Stepwise regression. 479 t-testS on individual regression
Set operations, 4 Stirling', fo<mula, 145 coefficients, 450
Sets. 2 Stochastic process. 555 Tukey's test, 335, 336, 344, 363
Shewhart control charts, 509, Stochastic simulation, 576 Two-facto! factorial, 359
525 Strata, 200, 204 Two-factorrando~ effects
Sign rest, 491, 492, 493, 494, Stratified ra.'1dom sample, 200. ANOYA,366
496 204 Two-factor mixed model
critical values for, 629. Strong versus weak conclusions k'iOVA.367
Sign test for pai."'ed samples, 494 in hypothesis testing, 268 Tw<'rsided alternative
Significance level of a statistical Studentized residual, 471 hypothesis, 266, 269,271
test, 267 Sum ofPoiSS{}n random Two-sided confidence interval,
Significance levels in the sign variables, 121 233
test, 493 Summa."j' statistics for grouped Typc 1 error, 267
Significance of regression. 415, dam, 191 Type U error, 267, 494
447 System failure rate, 543 'l.Ype II error for the sign test.,
Simple correlation coefficient,. 494
see correlation oocfficient T
Simple linear regression model. Tabular form of the CUSW'M, U
4Q9 526 524
u chart,
Simulation, 576 Taylor series, 62 Vnbalanced design, 331
Simultaneous confidence t<listcibution,208,236 Unbiased estimator, 217,219.
intervals, 252 percentage points at 604 220
Single replicate of a factorial Terminating (transient) Uncorrelated random variables,
experiment, 364, 386 simulations, 587 88
Skewness, 189,203,306,307 Three factor analysi.<t of variance, Uniform (0, 1) !1lIldom numb.",
Sparsity of effects principle, 386 366 581,582
Specification limits, 514 Three factor factori.a1. 369 Union of sets, 3
Standard deviation, 45 1'h.fee...dimensional scatter plot, Universal set, 3
Standard error, 203 174,176 Universe, 170
Standard error of a point TIer chart, see tolerance chart Upper controlliroit, 510
estimator, 230 Tws in the Kruskal-Wallis test.
Standard error of factor effect, 503 Y
383 Ties in the sign test. 493 Variability, 169
Standard Dormal distribution. Ties in the Wilcoxon signed t3.Ilk Variable selection in regression.
145,152,233 test, 497 474.479,482,483
cumulative probabilitie.", for. 'lime plots, 183 Variance components, 337, 366,
60),602 Titrlb-ro-failure distribution, 53'S, 368
Standardized regression 540,541 Variance inflation factors, 464-
coefficients, 462 Tolerance chan, 514 Variance of a random variable,
Staudardized residual, 421 TQlerance interval, 257 45,58
Index 655

Variar.ce of an estimator. 218 Weibull distribution, 137, 140, Wllcoxon signed rank test. 496
Va.-iance reduction JlleU:.ods in 540 critical values for, 630
simulation. 591, 593. 596 mean and variance of, 137
Venn diag:am... 4 \Veighted least squares, 435 X
\Vllcoxon ran..t:-sum test. 499 X chart, 510, 511
W critical values for, 627. 628
Waiting-line tb.coiy, see \V1lcoxon s.:gned rank test for y
queuing paired s3JLples, 498 . Yates' algorithm fot the2k, 385

You might also like