This document provides an overview and table of contents for a textbook on multivariate statistics. It introduces multivariate statistics and discusses why it is useful, provides definitions of key terms, and outlines the organization of the book. The book covers a wide range of statistical techniques for analyzing relationships between multiple variables, predicting outcomes, and exploring data structures. It also addresses important assumptions, limitations, and issues to consider for each technique.
Barbara G. Tabachnick, California State University, Northridge
Linda S. Fidell, California State University, Northridge

Allyn and Bacon
Boston  London  Toronto  Sydney  Tokyo  Singapore

CONTENTS

Preface  xxv

1  Introduction  1
    1.1 Multivariate Statistics: Why?  1
        1.1.1 The Domain of Multivariate Statistics: Numbers of IVs and DVs  1
        1.1.2 Experimental and Nonexperimental Research  2
            1.1.2.1 Multivariate Statistics in Nonexperimental Research  2
            1.1.2.2 Multivariate Statistics in Experimental Research  3
        1.1.3 Computers and Multivariate Statistics  4
            1.1.3.1 Program Updates  4
            1.1.3.2 Garbage In, Roses Out?  5
        1.1.4 Why Not?  5
    1.2 Some Useful Definitions  5
        1.2.1 Continuous, Discrete, and Dichotomous Data  5
        1.2.2 Samples and Populations  7
        1.2.3 Descriptive and Inferential Statistics  7
        1.2.4 Orthogonality  8
        1.2.5 Standard and Sequential Analyses  9
    1.3 Combining Variables  10
    1.4 Number and Nature of Variables to Include  11
    1.5 Statistical Power  11
    1.6 Data Appropriate for Multivariate Statistics  12
        1.6.1 The Data Matrix  12
        1.6.2 The Correlation Matrix  13
        1.6.3 The Variance-Covariance Matrix  14
        1.6.4 The Sum-of-Squares and Cross-Products Matrix  14
        1.6.5 Residuals  16
    1.7 Organization of the Book  16

2  A Guide to Statistical Techniques: Using the Book  17
    2.1 Research Questions and Associated Techniques  17
        2.1.1 Degree of Relationship among Variables  17
            2.1.1.1 Bivariate r  17
            2.1.1.2 Multiple R  18
            2.1.1.3 Sequential R  18
            2.1.1.4 Canonical R  18
            2.1.1.5 Multiway Frequency Analysis  19
        2.1.2 Significance of Group Differences  19
            2.1.2.1 One-Way ANOVA and t Test  19
            2.1.2.2 One-Way ANCOVA  19
            2.1.2.3 Factorial ANOVA  20
            2.1.2.4 Factorial ANCOVA  20
            2.1.2.5 Hotelling's T²  20
            2.1.2.6 One-Way MANOVA  21
            2.1.2.7 One-Way MANCOVA  21
            2.1.2.8 Factorial MANOVA  21
            2.1.2.9 Factorial MANCOVA  22
            2.1.2.10 Profile Analysis  22
        2.1.3 Prediction of Group Membership  23
            2.1.3.1 One-Way Discriminant Function  23
            2.1.3.2 Sequential One-Way Discriminant Function  23
            2.1.3.3 Multiway Frequency Analysis (Logit)  24
            2.1.3.4 Logistic Regression  24
            2.1.3.5 Sequential Logistic Regression  24
            2.1.3.6 Factorial Discriminant Function  24
            2.1.3.7 Sequential Factorial Discriminant Function  25
        2.1.4 Structure  25
            2.1.4.1 Principal Components  25
            2.1.4.2 Factor Analysis  25
            2.1.4.3 Structural Equation Modeling  26
        2.1.5 Time Course of Events  26
            2.1.5.1 Survival/Failure Analysis  26
            2.1.5.2 Time-Series Analysis  26
    2.2 A Decision Tree  26
    2.3 Technique Chapters  29
    2.4 Preliminary Check of the Data  30

3  Review of Univariate and Bivariate Statistics  31
    3.1 Hypothesis Testing  31
        3.1.1 One-Sample z Test as Prototype  31
        3.1.2 Power  34
        3.1.3 Extensions of the Model  35
    3.2 Analysis of Variance  35
        3.2.1 One-Way Between-Subjects ANOVA  36
        3.2.2 Factorial Between-Subjects ANOVA  40
        3.2.3 Within-Subjects ANOVA  41
        3.2.4 Mixed Between-Within-Subjects ANOVA  44
        3.2.5 Design Complexity  45
            3.2.5.1 Nesting  45
            3.2.5.2 Latin-Square Designs  46
            3.2.5.3 Unequal n and Nonorthogonality  46
            3.2.5.4 Fixed and Random Effects  47
        3.2.6 Specific Comparisons  47
            3.2.6.1 Weighting Coefficients for Comparisons  48
            3.2.6.2 Orthogonality of Weighting Coefficients  48
            3.2.6.3 Obtained F for Comparisons  49
            3.2.6.4 Critical F for Planned Comparisons  50
            3.2.6.5 Critical F for Post Hoc Comparisons  50
    3.3 Parameter Estimation  51
    3.4 Strength of Association  52
    3.5 Bivariate Statistics: Correlation and Regression  53
        3.5.1 Correlation  53
        3.5.2 Regression  54
    3.6 Chi-Square Analysis  55

4  Cleaning Up Your Act: Screening Data Prior to Analysis  56
    4.1 Important Issues in Data Screening  57
        4.1.1 Accuracy of Data File  57
        4.1.2 Honest Correlations  57
            4.1.2.1 Inflated Correlation  57
            4.1.2.2 Deflated Correlation  57
        4.1.3 Missing Data  58
            4.1.3.1 Deleting Cases or Variables  59
            4.1.3.2 Estimating Missing Data  60
            4.1.3.3 Using a Missing Data Correlation Matrix  64
            4.1.3.4 Treating Missing Data as Data  65
            4.1.3.5 Repeating Analyses with and without Missing Data  65
            4.1.3.6 Choosing among Methods for Dealing with Missing Data  65
        4.1.4 Outliers  66
            4.1.4.1 Detecting Univariate and Multivariate Outliers  67
            4.1.4.2 Describing Outliers  70
            4.1.4.3 Reducing the Influence of Outliers  71
            4.1.4.4 Outliers in a Solution  71
        4.1.5 Normality, Linearity, and Homoscedasticity  72
            4.1.5.1 Normality  73
            4.1.5.2 Linearity  77
            4.1.5.3 Homoscedasticity, Homogeneity of Variance, and Homogeneity of Variance-Covariance Matrices  79
        4.1.6 Common Data Transformations  80
        4.1.7 Multicollinearity and Singularity  82
        4.1.8 A Checklist and Some Practical Recommendations  85
    4.2 Complete Examples of Data Screening  86
        4.2.1 Screening Ungrouped Data  86
            4.2.1.1 Accuracy of Input, Missing Data, Distributions, and Univariate Outliers  87
            4.2.1.2 Linearity and Homoscedasticity  90
            4.2.1.3 Transformation  92
            4.2.1.4 Detecting Multivariate Outliers  92
            4.2.1.5 Variables Causing Cases to be Outliers  94
            4.2.1.6 Multicollinearity  98
        4.2.2 Screening Grouped Data  99
            4.2.2.1 Accuracy of Input, Missing Data, Distributions, Homogeneity of Variance, and Univariate Outliers  99
            4.2.2.2 Linearity  102
            4.2.2.3 Multivariate Outliers  104
            4.2.2.4 Variables Causing Cases to be Outliers  107
            4.2.2.5 Multicollinearity  108

5  Multiple Regression  111
    5.1 General Purpose and Description  111
    5.2 Kinds of Research Questions  112
        5.2.1 Degree of Relationship  113
        5.2.2 Importance of IVs  113
        5.2.3 Adding IVs  113
        5.2.4 Changing IVs  113
        5.2.5 Contingencies among IVs  114
        5.2.6 Comparing Sets of IVs  114
        5.2.7 Predicting DV Scores for Members of a New Sample  114
        5.2.8 Parameter Estimates  115
    5.3 Limitations to Regression Analyses  115
        5.3.1 Theoretical Issues  115
        5.3.2 Practical Issues  116
            5.3.2.1 Ratio of Cases to IVs  117
            5.3.2.2 Absence of Outliers among the IVs and on the DV  117
            5.3.2.3 Absence of Multicollinearity and Singularity  118
            5.3.2.4 Normality, Linearity, Homoscedasticity of Residuals  119
            5.3.2.5 Independence of Errors  121
            5.3.2.6 Outliers in the Solution  122
    5.4 Fundamental Equations for Multiple Regression  122
        5.4.1 General Linear Equations  123
        5.4.2 Matrix Equations  124
        5.4.3 Computer Analyses of Small-Sample Example  128
    5.5 Major Types of Multiple Regression  131
        5.5.1 Standard Multiple Regression  131
        5.5.2 Sequential Multiple Regression  131
        5.5.3 Statistical (Stepwise) Regression  133
        5.5.4 Choosing among Regression Strategies  138
    5.6 Some Important Issues  139
        5.6.1 Importance of IVs  139
            5.6.1.1 Standard Multiple Regression  140
            5.6.1.2 Sequential or Statistical Regression  142
        5.6.2 Statistical Inference  142
            5.6.2.1 Test for Multiple R  142
            5.6.2.2 Test of Regression Components  143
            5.6.2.3 Test of Added Subset of IVs  144
            5.6.2.4 Confidence Limits around B  145
            5.6.2.5 Comparing Two Sets of Predictors  145
        5.6.3 Adjustment of R²  147
        5.6.4 Suppressor Variables  148
        5.6.5 Regression Approach to ANOVA  149
        5.6.6 Centering when Interactions and Powers of IVs Are Included  151
    5.7 Complete Examples of Regression Analysis  153
        5.7.1 Evaluation of Assumptions  154
            5.7.1.1 Ratio of Cases to IVs  154
            5.7.1.2 Normality, Linearity, Homoscedasticity, and Independence of Residuals  154
            5.7.1.3 Outliers  157
            5.7.1.4 Multicollinearity and Singularity  157
        5.7.2 Standard Multiple Regression  159
        5.7.3 Sequential Regression  165
    5.8 Comparison of Programs  170
        5.8.1 SPSS Package  170
        5.8.2 SAS System  175
        5.8.3 SYSTAT System  176

6  Canonical Correlation  177
    6.1 General Purpose and Description  177
    6.2 Kinds of Research Questions  178
        6.2.1 Number of Canonical Variate Pairs  178
        6.2.2 Interpretation of Canonical Variates  178
        6.2.3 Importance of Canonical Variates  178
        6.2.4 Canonical Variate Scores  178
    6.3 Limitations  178
        6.3.1 Theoretical Limitations  178
        6.3.2 Practical Issues  180
            6.3.2.1 Ratio of Cases to IVs  180
            6.3.2.2 Normality, Linearity, and Homoscedasticity  180
            6.3.2.3 Missing Data  181
            6.3.2.4 Absence of Outliers  181
            6.3.2.5 Absence of Multicollinearity and Singularity  181
    6.4 Fundamental Equations for Canonical Correlation  182
        6.4.1 Eigenvalues and Eigenvectors  183
        6.4.2 Matrix Equations  185
        6.4.3 Proportions of Variance Extracted  189
        6.4.4 Computer Analyses of Small-Sample Example  190
    6.5 Some Important Issues  198
        6.5.1 Importance of Canonical Variates  198
        6.5.2 Interpretation of Canonical Variates  199
    6.6 Complete Example of Canonical Correlation  199
        6.6.1 Evaluation of Assumptions  200
            6.6.1.1 Missing Data  200
            6.6.1.2 Normality, Linearity, and Homoscedasticity  200
            6.6.1.3 Outliers  203
            6.6.1.4 Multicollinearity and Singularity  207
        6.6.2 Canonical Correlation  216
    6.7 Comparison of Programs  216
        6.7.1 SAS System  216
        6.7.2 SPSS Package  216
        6.7.3 SYSTAT System  218

7  Multiway Frequency Analysis  219
    7.1 General Purpose and Description  219
    7.2 Kinds of Research Questions  220
        7.2.1 Associations among Variables  220
        7.2.2 Effect on a Dependent Variable  221
        7.2.3 Parameter Estimates  221
        7.2.4 Importance of Effects  221
        7.2.5 Strength of Association  221
        7.2.6 Specific Comparisons and Trend Analysis  222
    7.3 Limitations to Multiway Frequency Analysis  222
        7.3.1 Theoretical Issues  222
        7.3.2 Practical Issues  222
            7.3.2.1 Independence  222
            7.3.2.2 Ratio of Cases to Variables  223
            7.3.2.3 Adequacy of Expected Frequencies  223
            7.3.2.4 Outliers in the Solution  224
    7.4 Fundamental Equations for Multiway Frequency Analysis  224
        7.4.1 Screening for Effects  225
            7.4.1.1 Total Effect  226
            7.4.1.2 First-Order Effects  227
            7.4.1.3 Second-Order Effects  228
            7.4.1.4 Third-Order Effect  232
        7.4.2 Modeling  233
        7.4.3 Evaluation and Interpretation  235
            7.4.3.1 Residuals  235
            7.4.3.2 Parameter Estimates  236
        7.4.4 Computer Analyses of Small-Sample Example  241
    7.5 Some Important Issues  250
        7.5.1 Hierarchical and Nonhierarchical Models  250
        7.5.2 Statistical Criteria  251
            7.5.2.1 Tests of Models  251
            7.5.2.2 Tests of Individual Effects  251
        7.5.3 Strategies for Choosing a Model  252
            7.5.3.1 SPSS HILOGLINEAR (Hierarchical)  252
            7.5.3.2 SPSS GENLOG (General Log-linear)  253
            7.5.3.3 SAS CATMOD, SYSTAT LOGLINEAR, and SYSTAT LOGLIN (General Log-linear)  253
    7.6 Complete Example of Multiway Frequency Analysis  253
        7.6.1 Evaluation of Assumptions: Adequacy of Expected Frequencies  253
        7.6.2 Hierarchical Log-linear Analysis  254
            7.6.2.1 Preliminary Model Screening  254
            7.6.2.2 Stepwise Model Selection  256
            7.6.2.3 Adequacy of Fit  258
            7.6.2.4 Interpretation of the Selected Model  264
    7.7 Comparison of Programs  270
        7.7.1 SPSS Package  273
        7.7.2 SAS System  274
        7.7.3 SYSTAT System  274

8  Analysis of Covariance  275
    8.1 General Purpose and Description  275
    8.2 Kinds of Research Questions  277
        8.2.1 Main Effects of IVs  278
        8.2.2 Interactions among IVs  278
        8.2.3 Specific Comparisons and Trend Analysis  278
        8.2.4 Effects of Covariates  278
        8.2.5 Strength of Association  279
        8.2.6 Parameter Estimates  279
    8.3 Limitations to Analysis of Covariance  279
        8.3.1 Theoretical Issues  279
        8.3.2 Practical Issues  280
            8.3.2.1 Unequal Sample Sizes, Missing Data, and Ratio of Cases to IVs  280
            8.3.2.2 Absence of Outliers  281
            8.3.2.3 Absence of Multicollinearity and Singularity  281
            8.3.2.4 Normality of Sampling Distributions  281
            8.3.2.5 Homogeneity of Variance  281
            8.3.2.6 Linearity  282
            8.3.2.7 Homogeneity of Regression  282
            8.3.2.8 Reliability of Covariates  283
    8.4 Fundamental Equations for Analysis of Covariance  283
        8.4.1 Sums of Squares and Cross-Products  284
        8.4.2 Significance Test and Strength of Association  288
        8.4.3 Computer Analyses of Small-Sample Example  289
    8.5 Some Important Issues  291
        8.5.1 Test for Homogeneity of Regression  291
        8.5.2 Design Complexity  293
            8.5.2.1 Within-Subjects and Mixed Within-Between Designs  293
            8.5.2.2 Unequal Sample Sizes  296
            8.5.2.3 Specific Comparisons and Trend Analysis  298
            8.5.2.4 Strength of Association  301
        8.5.3 Evaluation of Covariates  302
        8.5.4 Choosing Covariates  302
        8.5.5 Alternatives to ANCOVA  303
    8.6 Complete Example of Analysis of Covariance  304
        8.6.1 Evaluation of Assumptions  305
            8.6.1.1 Unequal n and Missing Data  305
            8.6.1.2 Normality  305
            8.6.1.3 Linearity  305
            8.6.1.4 Outliers  305
            8.6.1.5 Multicollinearity and Singularity  309
            8.6.1.6 Homogeneity of Variance  309
            8.6.1.7 Homogeneity of Regression  310
            8.6.1.8 Reliability of Covariates  310
        8.6.2 Analysis of Covariance  310
            8.6.2.1 Main Analysis  310
            8.6.2.2 Evaluation of Covariates  313
            8.6.2.3 Homogeneity of Regression Run  315
    8.7 Comparison of Programs  319
        8.7.1 SPSS Package  319
        8.7.2 SYSTAT System  319
        8.7.3 SAS System  321

9  Multivariate Analysis of Variance and Covariance  322
    9.1 General Purpose and Description  322
    9.2 Kinds of Research Questions  325
        9.2.1 Main Effects of IVs  325
        9.2.2 Interactions among IVs  326
        9.2.3 Importance of DVs  326
        9.2.4 Parameter Estimates  326
        9.2.5 Specific Comparisons and Trend Analysis  327
        9.2.6 Strength of Association  327
        9.2.7 Effects of Covariates  327
        9.2.8 Repeated-Measures Analysis of Variance  327
    9.3 Limitations to Multivariate Analysis of Variance and Covariance  328
        9.3.1 Theoretical Issues  328
        9.3.2 Practical Issues  328
            9.3.2.1 Unequal Sample Sizes, Missing Data, and Power  329
            9.3.2.2 Multivariate Normality  329
            9.3.2.3 Absence of Outliers  330
            9.3.2.4 Homogeneity of Variance-Covariance Matrices  330
            9.3.2.5 Linearity  330
            9.3.2.6 Homogeneity of Regression  331
            9.3.2.7 Reliability of Covariates  331
            9.3.2.8 Absence of Multicollinearity and Singularity  331
    9.4 Fundamental Equations for Multivariate Analysis of Variance and Covariance  332
        9.4.1 Multivariate Analysis of Variance  332
        9.4.2 Computer Analyses of Small-Sample Example  339
        9.4.3 Multivariate Analysis of Covariance  340
    9.5 Some Important Issues  347
        9.5.1 Criteria for Statistical Inference  347
        9.5.2 Assessing DVs  348
            9.5.2.1 Univariate F  348
            9.5.2.2 Roy-Bargmann Stepdown Analysis  350
            9.5.2.3 Using Discriminant Function Analysis  351
            9.5.2.4 Choosing among Strategies for Assessing DVs  351
        9.5.3 Specific Comparisons and Trend Analysis  352
        9.5.4 Design Complexity  356
            9.5.4.1 Within-Subjects and Between-Within Designs  356
            9.5.4.2 Unequal Sample Sizes  356
        9.5.5 MANOVA vs. ANOVAs  357
    9.6 Complete Examples of Multivariate Analysis of Variance and Covariance  357
        9.6.1 Evaluation of Assumptions  358
            9.6.1.1 Unequal Sample Sizes and Missing Data  358
            9.6.1.2 Multivariate Normality  360
            9.6.1.3 Linearity  360
            9.6.1.4 Outliers  360
            9.6.1.5 Homogeneity of Variance-Covariance Matrices  361
            9.6.1.6 Homogeneity of Regression  362
            9.6.1.7 Reliability of Covariates  365
            9.6.1.8 Multicollinearity and Singularity  365
        9.6.2 Multivariate Analysis of Variance  365
        9.6.3 Multivariate Analysis of Covariance  376
            9.6.3.1 Assessing Covariates  377
            9.6.3.2 Assessing DVs  377
    9.7 Comparison of Programs  386
        9.7.1 SPSS Package  389
        9.7.2 SYSTAT System  389
        9.7.3 SAS System  390

10  Profile Analysis: The Multivariate Approach to Repeated Measures  391
    10.1 General Purpose and Description  391
    10.2 Kinds of Research Questions  392
        10.2.1 Parallelism of Profiles  392
        10.2.2 Overall Difference among Groups  393
        10.2.3 Flatness of Profiles  393
        10.2.4 Contrasts Following Profile Analysis  393
        10.2.5 Parameter Estimates  393
        10.2.6 Strength of Association  394
    10.3 Limitations to Profile Analysis  394
        10.3.1 Theoretical Issues  394
        10.3.2 Practical Issues  394
            10.3.2.1 Sample Size, Missing Data, and Power  394
            10.3.2.2 Multivariate Normality  395
            10.3.2.3 Absence of Outliers  395
            10.3.2.4 Homogeneity of Variance-Covariance Matrices  395
            10.3.2.5 Linearity  395
            10.3.2.6 Absence of Multicollinearity and Singularity  396
    10.4 Fundamental Equations for Profile Analysis  396
        10.4.1 Differences in Levels  396
        10.4.2 Parallelism  398
        10.4.3 Flatness  401
        10.4.4 Computer Analyses of Small-Sample Example  403
    10.5 Some Important Issues  410
        10.5.1 Contrasts in Profile Analysis  410
            10.5.1.1 Parallelism and Flatness Significant, Levels Not Significant (Simple-Effects Analysis)  413
            10.5.1.2 Parallelism and Levels Significant, Flatness Not Significant (Simple-Effects Analysis)  414
            10.5.1.3 Parallelism, Levels, and Flatness Significant (Interaction Contrasts)  416
            10.5.1.4 Only Parallelism Significant  421
        10.5.2 Univariate vs. Multivariate Approach to Repeated Measures  421
        10.5.3 Doubly-Multivariate Designs  423
        10.5.4 Classifying Profiles  429
        10.5.5 Imputation of Missing Values  429
    10.6 Complete Examples of Profile Analysis  430
        10.6.1 Profile Analysis of Subscales of the WISC  430
            10.6.1.1 Evaluation of Assumptions  431
            10.6.1.2 Profile Analysis  435
        10.6.2 Doubly-Multivariate Analysis of Reaction Time  442
            10.6.2.1 Evaluation of Assumptions  442
            10.6.2.2 Doubly-Multivariate Analysis of Slope and Intercept  446
    10.7 Comparison of Programs  453
        10.7.1 SPSS Package  453
        10.7.2 SAS System  455
        10.7.3 SYSTAT System  455

11  Discriminant Function Analysis  456
    11.1 General Purpose and Description  456
    11.2 Kinds of Research Questions  458
        11.2.1 Significance of Prediction  458
        11.2.2 Number of Significant Discriminant Functions  458
        11.2.3 Dimensions of Discrimination  459
        11.2.4 Classification Functions  459
        11.2.5 Adequacy of Classification  459
        11.2.6 Strength of Association  460
        11.2.7 Importance of Predictor Variables  460
        11.2.8 Significance of Prediction with Covariates  460
        11.2.9 Estimation of Group Means  460
    11.3 Limits to Discriminant Function Analysis  461
        11.3.1 Theoretical Issues  461
        11.3.2 Practical Issues  461
            11.3.2.1 Unequal Sample Sizes, Missing Data, and Power  461
            11.3.2.2 Multivariate Normality  462
            11.3.2.3 Absence of Outliers  462
            11.3.2.4 Homogeneity of Variance-Covariance Matrices  462
            11.3.2.5 Linearity  463
            11.3.2.6 Absence of Multicollinearity and Singularity  463
    11.4 Fundamental Equations for Discriminant Function Analysis  463
        11.4.1 Derivation and Test of Discriminant Functions  464
        11.4.2 Classification  467
        11.4.3 Computer Analyses of Small-Sample Example  469
    11.5 Types of Discriminant Function Analysis  477
        11.5.1 Direct Discriminant Function Analysis  478
        11.5.2 Sequential Discriminant Function Analysis  478
        11.5.3 Stepwise (Statistical) Discriminant Function Analysis  481
    11.6 Some Important Issues  481
        11.6.1 Statistical Inference  481
            11.6.1.1 Criteria for Overall Statistical Significance  481
            11.6.1.2 Stepping Methods  482
        11.6.2 Number of Discriminant Functions  482
        11.6.3 Interpreting Discriminant Functions  483
            11.6.3.1 Discriminant Function Plots  483
            11.6.3.2 Loading Matrices  484
        11.6.4 Evaluating Predictor Variables  485
        11.6.5 Design Complexity: Factorial Designs  488
        11.6.6 Use of Classification Procedures  489
            11.6.6.1 Cross-Validation and New Cases  489
            11.6.6.2 Jackknifed Classification  490
            11.6.6.3 Evaluating Improvement in Classification  490
    11.7 Complete Example of Discriminant Function Analysis  492
        11.7.1 Evaluation of Assumptions  492
            11.7.1.1 Unequal Sample Sizes and Missing Data  492
            11.7.1.2 Multivariate Normality  492
            11.7.1.3 Linearity  493
            11.7.1.4 Outliers  493
            11.7.1.5 Homogeneity of Variance-Covariance Matrices  493
            11.7.1.6 Multicollinearity and Singularity  493
        11.7.2 Direct Discriminant Function Analysis  497
    11.8 Comparison of Programs  509
        11.8.1 SPSS Package  515
        11.8.2 SYSTAT System  516
        11.8.3 SAS System  516

12  Logistic Regression  517
    12.1 General Purpose and Description  517
    12.2 Kinds of Research Questions  518
        12.2.1 Prediction of Group Membership or Outcome  518
        12.2.2 Importance of Predictors  518
        12.2.3 Interactions among Predictors  518
        12.2.4 Parameter Estimates  520
        12.2.5 Classification of Cases  520
        12.2.6 Significance of Prediction with Covariates  520
        12.2.7 Strength of Association  520
    12.3 Limitations to Logistic Regression Analysis  521
        12.3.1 Theoretical Issues  521
        12.3.2 Practical Issues  521
            12.3.2.1 Ratio of Cases to Variables  521
            12.3.2.2 Adequacy of Expected Frequencies and Power  522
            12.3.2.3 Linearity in the Logit  522
            12.3.2.4 Absence of Multicollinearity  522
            12.3.2.5 Absence of Outliers in the Solution  523
            12.3.2.6 Independence of Errors  523
    12.4 Fundamental Equations for Logistic Regression  523
        12.4.1 Testing and Interpreting Coefficients  524
        12.4.2 Goodness-of-Fit  525
        12.4.3 Comparing Models  527
        12.4.4 Interpretation and Analysis of Residuals  527
        12.4.5 Computer Analyses of Small-Sample Example  527
    12.5 Types of Logistic Regression  533
        12.5.1 Direct Logistic Regression  533
        12.5.2 Sequential Logistic Regression  533
        12.5.3 Stepwise (Statistical) Logistic Regression  535
        12.5.4 Probit and Other Analyses  535
    12.6 Some Important Issues  536
        12.6.1 Statistical Inference  536
            12.6.1.1 Assessing Goodness-of-Fit of Models  537
            12.6.1.2 Tests of Individual Variables  539
        12.6.2 Number and Type of Outcome Categories  539
            12.6.2.1 Unordered Response Categories with SYSTAT LOGIT  540
            12.6.2.2 Ordered Response Categories with SAS LOGISTIC  542
        12.6.3 Strength of Association for a Model  545
        12.6.4 Coding Outcome and Predictor Categories  546
        12.6.5 Classification of Cases  547
        12.6.6 Hierarchical and Nonhierarchical Analysis  548
        12.6.7 Interpretation of Coefficients using Odds  548
        12.6.8 Importance of Predictors  549
        12.6.9 Logistic Regression for Matched Groups  550
    12.7 Complete Examples of Logistic Regression  550
        12.7.1 Evaluation of Limitations  551
            12.7.1.1 Ratio of Cases to Variables and Missing Data  551
            12.7.1.2 Adequacy of Expected Frequencies  554
            12.7.1.3 Linearity in the Logit  558
            12.7.1.4 Multicollinearity  558
            12.7.1.5 Outliers in the Solution  559
        12.7.2 Direct Logistic Regression with Two-Category Outcome  559
        12.7.3 Sequential Logistic Regression with Three Categories of Outcome  563
    12.8 Comparisons of Programs  575
        12.8.1 SPSS Package  575
        12.8.2 SAS System  580
        12.8.3 SYSTAT System  581

13  Principal Components and Factor Analysis  582
    13.1 General Purpose and Description  582
    13.2 Kinds of Research Questions  585
        13.2.1 Number of Factors  585
        13.2.2 Nature of Factors  586
        13.2.3 Importance of Solutions and Factors  586
        13.2.4 Testing Theory in FA  586
        13.2.5 Estimating Scores on Factors  586
    13.3 Limitations  586
        13.3.1 Theoretical Issues  586
        13.3.2 Practical Issues  587
            13.3.2.1 Sample Size and Missing Data  588
            13.3.2.2 Normality  588
            13.3.2.3 Linearity  588
            13.3.2.4 Absence of Outliers among Cases  588
            13.3.2.5 Absence of Multicollinearity and Singularity  589
            13.3.2.6 Factorability of R  589
            13.3.2.7 Absence of Outliers among Variables  589
    13.4 Fundamental Equations for Factor Analysis  590
        13.4.1 Extraction  591
        13.4.2 Orthogonal Rotation  595
        13.4.3 Communalities, Variance, and Covariance  596
        13.4.4 Factor Scores  597
        13.4.5 Oblique Rotation  600
        13.4.6 Computer Analyses of Small-Sample Example  603
    13.5 Major Types of Factor Analysis  609
        13.5.1 Factor Extraction Techniques  609
            13.5.1.1 PCA vs. FA  610
            13.5.1.2 Principal Components  612
            13.5.1.3 Principal Factors  612
            13.5.1.4 Image Factor Extraction  612
            13.5.1.5 Maximum Likelihood Factor Extraction  613
            13.5.1.6 Unweighted Least Squares Factoring  613
            13.5.1.7 Generalized (Weighted) Least Squares Factoring  613
            13.5.1.8 Alpha Factoring  613
        13.5.2 Rotation  614
            13.5.2.1 Orthogonal Rotation  614
            13.5.2.2 Oblique Rotation  616
            13.5.2.3 Geometric Interpretation  616
        13.5.3 Some Practical Recommendations  618
    13.6 Some Important Issues  619
        13.6.1 Estimates of Communalities  619
        13.6.2 Adequacy of Extraction and Number of Factors  620
        13.6.3 Adequacy of Rotation and Simple Structure  622
        13.6.4 Importance and Internal Consistency of Factors  623
        13.6.5 Interpretation of Factors  625
        13.6.6 Factor Scores  626
        13.6.7 Comparisons among Solutions and Groups  627
    13.7 Complete Example of FA  627
        13.7.1 Evaluation of Limitations  628
            13.7.1.1 Sample Size and Missing Data  628
            13.7.1.2 Normality  628
            13.7.1.3 Linearity  628
            13.7.1.4 Outliers  628
            13.7.1.5 Multicollinearity and Singularity  633
            13.7.1.6 Factorability of R  633
            13.7.1.7 Outliers among Variables  633
        13.7.2 Principal Factors Extraction with Varimax Rotation  633
    13.8 Comparison of Programs  648
        13.8.1 SPSS Package  648
        13.8.2 SAS System  652
        13.8.3 SYSTAT System  652

14  Structural Equation Modeling (by Jodie B. Ullman)  653
    14.1 General Purpose and Description  653
    14.2 Kinds of Research Questions  657
        14.2.1 Adequacy of the Model  657
        14.2.2 Testing Theory  657
        14.2.3 Amount of Variance in the Variables Accounted for by the Factors  657
        14.2.4 Reliability of the Indicators  657
        14.2.5 Parameter Estimates  657
        14.2.6 Mediation  658
        14.2.7 Group Differences  658
        14.2.8 Longitudinal Differences  658
        14.2.9 Multilevel Modeling  658
    14.3 Limitations to Structural Equation Modeling  659
        14.3.1 Theoretical Issues  659
        14.3.2 Practical Issues  659
            14.3.2.1 Sample Size and Missing Data  659
            14.3.2.2 Multivariate Normality and Absence of Outliers  660
            14.3.2.3 Linearity  660
            14.3.2.4 Absence of Multicollinearity and Singularity  660
            14.3.2.5 Residuals  661
    14.4 Fundamental Equations for Structural Equations Modeling  661
        14.4.1 Covariance Algebra  661
        14.4.2 Model Hypotheses  663
        14.4.3 Model Specification  665
        14.4.4 Model Estimation  667
        14.4.5 Model Evaluation  672
        14.4.6 Computer Analysis of Small-Sample Example  674
    14.5 Some Important Issues  691
        14.5.1 Model Identification  691
        14.5.2 Estimation Techniques  694
            14.5.2.1 Estimation Methods and Sample Size  696
            14.5.2.2 Estimation Methods and Nonnormality  697
            14.5.2.3 Estimation Methods and Dependence  697
            14.5.2.4 Some Recommendations for Choice of Estimation Method  697
        14.5.3 Assessing the Fit of the Model  697
            14.5.3.1 Comparative Fit Indices  698
            14.5.3.2 Absolute Fit Index  700
            14.5.3.3 Indices of Proportion of Variance Accounted For  700
            14.5.3.4 Degree of Parsimony Fit Indices  701
            14.5.3.5 Residual-Based Fit Indices  702
            14.5.3.6 Choosing among Fit Indices  702
        14.5.4 Model Modification  703
            14.5.4.1 Chi-Square Difference Test  703
            14.5.4.2 Lagrange Multiplier Test (LM)  703
            14.5.4.3 Wald Test  713
            14.5.4.4 Some Caveats and Hints on Model Modification  715
        14.5.5 Reliability and Proportion of Variance  715
        14.5.6 Discrete and Ordinal Data  716
        14.5.7 Multiple Group Models  717
        14.5.8 Mean and Covariance Structure Models  718
    14.6 Complete Examples of Structural Equation Modeling Analysis  719
        14.6.1 Confirmatory Factor Analysis of the WISC  719
            14.6.1.1 Model Specification for CFA  719
            14.6.1.2 Evaluation of Assumptions for CFA  719
            14.6.1.3 CFA Model Estimation and Preliminary Evaluation  721
            14.6.1.4 Model Modification  730
        14.6.2 SEM of Health Data  737
            14.6.2.1 SEM Model Specification  737
            14.6.2.2 Evaluation of Assumptions for SEM  738
            14.6.2.3 Model Estimation and Preliminary Evaluation  742
            14.6.2.4 Model Modification  745
    14.7 Comparison of Programs  764
        14.7.1 EQS  764
        14.7.2 LISREL  764
        14.7.3 SAS  771
        14.7.4 AMOS  771

15  Survival/Failure Analysis  772
    15.1 General Purpose and Description  772
    15.2 Kinds of Research Questions  773
        15.2.1 Proportions Surviving at Various Times  773
        15.2.2 Group Differences in Survival  774
        15.2.3 Survival Time with Covariates  774
            15.2.3.1 Treatment Effects  774
            15.2.3.2 Importance of Covariates  774
            15.2.3.3 Parameter Estimates  774
            15.2.3.4 Contingencies among Covariates  774
            15.2.3.5 Strength of Association and Power  774
    15.3 Limitations to Survival Analysis  775
        15.3.1 Theoretical Issues  775
        15.3.2 Practical Issues  775
            15.3.2.1 Sample Size and Missing Data  775
            15.3.2.2 Normality of Sampling Distributions, Linearity, and Homoscedasticity  775
            15.3.2.3 Absence of Outliers  775
            15.3.2.4 Differences between Withdrawn and Remaining Cases  776
            15.3.2.5 Change in Survival Conditions over Time  776
            15.3.2.6 Proportionality of Hazards  776
            15.3.2.7 Absence of Multicollinearity  776
    15.4 Fundamental Equations for Survival Analysis  776
        15.4.1 Life Tables  777
        15.4.2 Standard Error of Cumulative Proportion Surviving  778
        15.4.3 Hazard and Density Functions  779
        15.4.4 Plot of Life Tables  780
        15.4.5 Test for Group Differences  781
        15.4.6 Computer Analyses of Small-Sample Example  783
    15.5 Types of Survival Analysis  791
        15.5.1 Actuarial and Product-Limit Life Tables and Survivor Functions  791
        15.5.2 Prediction of Group Survival Times from Covariates  796
            15.5.2.1 Direct, Sequential, and Statistical Analysis  796
            15.5.2.2 Cox Proportional-Hazards Model  797
            15.5.2.3 Accelerated Failure-Time Model  797
            15.5.2.4 Choosing a Method  804
    15.6 Some Important Issues  805
        15.6.1 Proportionality of Hazards  805
        15.6.2 Censored Data  807
            15.6.2.1 Right-Censored Data  807
            15.6.2.2 Other Forms of Censoring  808
        15.6.3 Strength of Association and Power  808
        15.6.4 Statistical Criteria  809
            15.6.4.1 Test Statistics for Group Differences in Survival Functions  809
            15.6.4.2 Test Statistics for Prediction from Covariates  809
        15.6.5 Odds Ratios  811
    15.7 Complete Example of Survival Analysis  813
        15.7.1 Evaluation of Assumptions  814
            15.7.1.1 Accuracy of Input, Adequacy of Sample Size, Missing Data, and Distributions  814
            15.7.1.2 Outliers  814
            15.7.1.3 Differences between Withdrawn and Remaining Cases  816
            15.7.1.4 Change in Survival Experience over Time  819
            15.7.1.5 Proportionality of Hazards  820
            15.7.1.6 Multicollinearity  821
        15.7.2 Cox Regression Survival Analysis  822
            15.7.2.1 Effect of Drug Treatment  822
            15.7.2.2 Evaluation of Other Covariates  825
    15.8 Comparison of Programs  829
        15.8.1 SAS System  829
        15.8.2 SYSTAT System  829
        15.8.3 SPSS Package  836

16  Time-Series Analysis  837
    16.1 General Purpose and Description  837
    16.2 Kinds of Research Questions  839
        16.2.1 Pattern of Autocorrelation  841
        16.2.2 Seasonal Cycles and Trends  841
        16.2.3 Forecasting  841
        16.2.4 Effect of an Intervention  841
        16.2.5 Comparing Time Series  841
        16.2.6 Time Series with Covariates  842
        16.2.7 Strength of Association and Power  842
    16.3 Assumptions of Time-Series Analysis  842
        16.3.1 Theoretical Issues  842
        16.3.2 Practical Issues  842
            16.3.2.1 Normality of Distributions of Residuals  842
            16.3.2.2 Homogeneity of Variance and Zero Mean of Residuals  843
            16.3.2.3 Independence of Residuals  843
            16.3.2.4 Absence of Outliers  843
    16.4 Fundamental Equations for Time-Series ARIMA Models  843
        16.4.1 Identification of ARIMA (p, d, q) Models  844
            16.4.1.1 Trend Components, d: Making the Process Stationary  844
            16.4.1.2 Auto-Regressive Components  847
            16.4.1.3 Moving Average Components  848
            16.4.1.4 Mixed Models  848
            16.4.1.5 ACFs and PACFs  849
        16.4.2 Estimating Model Parameters  854
        16.4.3 Diagnosing a Model  855
        16.4.4 Computer Analysis of Small-Sample Time-Series Example  855
    16.5 Types of Time-Series Analysis  865
        16.5.1 Models with Seasonal Components  865
        16.5.2 Models with Interventions  869
            16.5.2.1 Abrupt, Permanent Effects  870
            16.5.2.2 Abrupt, Temporary Effects  870
            16.5.2.3 Gradual, Permanent Effects  872
            16.5.2.4 Models with Multiple Interventions  877
        16.5.3 Adding Continuous Variables  877
    16.6 Some Important Issues  878
        16.6.1 Patterns of ACFs and PACFs  878
        16.6.2 Strength of Association  881
        16.6.3 Forecasting  882
        16.6.4 Statistical Methods for Comparing Two Models  884
    16.7 Complete Example of a Time-Series Analysis  884
        16.7.1 Evaluation of Assumptions  884
            16.7.1.1 Normality of Sampling Distributions  884
            16.7.1.2 Homogeneity of Variance  885
            16.7.1.3 Outliers  885
        16.7.2 Baseline Model Identification and Estimation  885
        16.7.3 Baseline Model Diagnosis  892
        16.7.4 Intervention Analysis  893
            16.7.4.1 Model Diagnosis  893
            16.7.4.2 Model Interpretation  893
    16.8 Comparison of Programs  897
        16.8.1 SPSS Package  897
        16.8.2 SAS System  897
        16.8.3 SYSTAT System  900

17  An Overview of the General Linear Model  901
    17.1 Linearity and the General Linear Model  901
    17.2 Bivariate to Multivariate Statistics and Overview of Techniques  901
        17.2.1 Bivariate Form  901
        17.2.2 Simple Multivariate Form  902
        17.2.3 Full Multivariate Form  904
    17.3 Alternative Research Strategies  907

Appendix A  A Skimpy Introduction to Matrix Algebra  908
    A.1 The Trace of a Matrix  909
    A.2 Addition or Subtraction of a Constant to a Matrix  909
    A.3 Multiplication or Division of a Matrix by a Constant  909
    A.4 Addition and Subtraction of Two Matrices  910
    A.5 Multiplication, Transposes, and Square Roots of Matrices  911
    A.6 Matrix "Division" (Inverses and Determinants)  913
    A.7 Eigenvalues and Eigenvectors: Procedures for Consolidating Variance from a Matrix  914

Appendix B  Research Designs for Complete Examples  918
    B.1 Women's Health and Drug Study  918
    B.2 Sexual Attraction Study  919
    B.3 Learning Disabilities Data Bank  922
    B.4 Reaction Time to Identify Figures  923
    B.5 Clinical Trial for Primary Biliary Cirrhosis  923
    B.6 Impact of Seat Belt Law  924

Appendix C  Statistical Tables  925
    C.1 Normal Curve Areas  926
    C.2 Critical Values of the t Distribution for α = .05 and .01, Two-Tailed Test  927
    C.3 Critical Values of the F Distribution  928
    C.4 Critical Values of Chi Square (χ²)  933
    C.5 Critical Values for Squared Multiple Correlation (R²) in Forward Stepwise Selection, α = .05  934
    C.6 Critical Values for Fmax (S²max/S²min) Distribution for α = .05 and .01  936

References  937
Index  945