Introduction to the 2013 Edition ...................................................................................................... 1
You, Analytics, and Excel.............................................................................................................................................................2 Excel as a Platform .....................................................................................................................................................................4 What’s in This Book ....................................................................................................................................................................4 Introduction to this Edition .............................................................................................................. 7 Inside the Black Box ...................................................................................................................................................................8 Helping Out Your Colleagues ......................................................................................................................................................8 1 Building a Collector ....................................................................................................................11 Planning an Approach .............................................................................................................................................................12 A Meaningful Variable .......................................................................................................................................................12 Identifying Sales ................................................................................................................................................................13 Planning the Workbook Structure ............................................................................................................................................13 Query Sheets .....................................................................................................................................................................13 Summary Sheets ...............................................................................................................................................................18 Snapshot Formulas ............................................................................................................................................................20 Customizing Your Formulas................................................................................................................................................21 The VBA Code ...........................................................................................................................................................................23 The DoItAgain Subroutine ...........................................................................................................................................24 The DontRepeat Subroutine ........................................................................................................................................25 The PrepForAgain Subroutine ...................................................................................................................................25 The GetNewData Subroutine ........................................................................................................................................26 The GetRank Function....................................................................................................................................................30 The RefreshSheets Subroutine ...............................................................................................................................32 The Analysis Sheets..................................................................................................................................................................33 Defining a Dynamic Range Name ......................................................................................................................................34 Using the Dynamic Range Name .......................................................................................................................................36 2 Linear Regression ......................................................................................................................39 Correlation and Regression .....................................................................................................................................................39 Charting the Relationship .................................................................................................................................................40 Calculating Pearson’s Correlation Coefficient .....................................................................................................................43 Correlation Is Not Causation .....................................................................................................................................................45 Simple Regression ...................................................................................................................................................................46 Array-Entering Formulas ...................................................................................................................................................48 Array-Entering LINEST( ) ..................................................................................................................................................49 Multiple Regression ................................................................................................................................................................49 Creating the Composite Variable ......................................................................................................................................50 Entering LINEST( ) with Multiple Predictors ......................................................................................................................51 Merging the Predictors .....................................................................................................................................................51 Analyzing the Composite Variable ....................................................................................................................................53 vi Predictive Analytics: Microsoft® Excel
Assumptions Made in Regression Analysis ..............................................................................................................................54
Variability .........................................................................................................................................................................55 Measures of Variability: Bartlett’s Test of Homogeneity of Variance ..................................................................................57 Means of Residuals Are Zero .............................................................................................................................................58 Normally Distributed Forecasts .........................................................................................................................................59 Using Excel’s Regression Tool ...................................................................................................................................................59 Accessing the Data Analysis Add-ln ..................................................................................................................................59 Accessing an Installed Add-ln ...........................................................................................................................................60 Running the Regression Tool .............................................................................................................................................61 Understanding the Regression Tool’s Dialog Box ...............................................................................................................62 Understanding the Regression Tool’s Output ....................................................................................................................64 3 Forecasting with Moving Averages .............................................................................................71 About Moving Averages ..........................................................................................................................................................71 Signal and Noise ...............................................................................................................................................................72 Smoothing Out the Noise .................................................................................................................................................73 Lost Periods ......................................................................................................................................................................74 Smoothing Versus Tracking ...............................................................................................................................................74 Weighted and Unweighted Moving Averages ...................................................................................................................76 Total of Weights ................................................................................................................................................................77 Relative Size of Weights ....................................................................................................................................................78 More Recent Weights Are Larger .......................................................................................................................................78 Criteria for Judging Moving Averages .....................................................................................................................................80 Mean Absolute Deviation ..................................................................................................................................................80 Least Squares ....................................................................................................................................................................80 Using Least Squares to Compare Moving Averages ............................................................................................................81 Getting Moving Averages Automatically .................................................................................................................................82 Using the Moving Average Tool .........................................................................................................................................83 Labels ...............................................................................................................................................................................85 Output Range ...................................................................................................................................................................85 Actuals and Forecasts ........................................................................................................................................................85 Interpreting the Standard Errors—Or Failing to Do So......................................................................................................87 4 Forecasting a Time Series: Smoothing .........................................................................................89 Exponential Smoothing: The Basic Idea....................................................................................................................................90 Why “Exponential” Smoothing? ...............................................................................................................................................92 Using Excel’s Exponential Smoothing Tool................................................................................................................................95 Understanding the Exponential Smoothing Dialog Box .....................................................................................................96 Choosing the Smoothing Constant ........................................................................................................................................102 Setting Up the Analysis ...................................................................................................................................................103 Using Solver to Find the Best Smoothing Constant ..........................................................................................................105 Understanding Solver’s Requirements .............................................................................................................................110 The Point .........................................................................................................................................................................113 Handling Linear Baselines with Trend ....................................................................................................................................114 Characteristics of Trend ....................................................................................................................................................114 First Differencing .............................................................................................................................................................117 Contents vii
5 More Advanced Smoothing Models ...........................................................................................123
Holt’s Linear Exponential Smoothing .....................................................................................................................................123 About Terminology and Symbols in Handling Trended Series ..........................................................................................124 Using Holt’s Linear Smoothing.........................................................................................................................................124 Holt’s Method and First Differences .................................................................................................................................130 Seasonal Models ....................................................................................................................................................................133 Estimating Seasonal Indexes ...........................................................................................................................................134 Estimating the Series Level and First Forecast .................................................................................................................135 Extending the Forecasts to Future Periods .......................................................................................................................136 Finishing the One-Step-Ahead Forecasts .........................................................................................................................137 Extending the Forecast Horizon .......................................................................................................................................138 Using Additive Holt-Winters Models ......................................................................................................................................140 Level ................................................................................................................................................................................143 Trend ...............................................................................................................................................................................143 Season .............................................................................................................................................................................144 Formulas for the Holt-Winters Additive and Multiplicative Models........................................................................................145 Formulas for the Additive Model .....................................................................................................................................146 Formulas for the Multiplicative Model .............................................................................................................................148 The Models Compared ...........................................................................................................................................................149 Damped Trend Forecasts ........................................................................................................................................................151 6 Forecasting a Time Series: Regression .......................................................................................153 Forecasting with Regression ..................................................................................................................................................153 Linear Regression: An Example ........................................................................................................................................155 Using the LINEST( ) Function ...........................................................................................................................................158 Forecasting with Autoregression............................................................................................................................................164 Problems with Trends ......................................................................................................................................................164 Correlating at Increasing Lags..........................................................................................................................................165 A Review: Linear Regression and Autoregression.............................................................................................................168 Adjusting the Autocorrelation Formula............................................................................................................................169 Using ACFs .......................................................................................................................................................................171 Understanding PACFs.......................................................................................................................................................172 Using the ARIMA Workbook .............................................................................................................................................178 7 Logistic Regression: The Basics..................................................................................................181 Traditional Approaches to the Analysis ..................................................................................................................................181 Z-tests and the Central Limit Theorem.............................................................................................................................181 Sample Size and Observed Rate.......................................................................................................................................183 Binomial Distribution ......................................................................................................................................................183 Only One Comparison ......................................................................................................................................................184 Using Chi-Square .............................................................................................................................................................185 Preferring Chi-Square to a Z-test .....................................................................................................................................187 Regression Analysis on Dichotomies .....................................................................................................................................191 Homoscedasticity ............................................................................................................................................................191 Residuals Are Normally Distributed ................................................................................................................................194 Restriction of Predicted Range ........................................................................................................................................194 viii Predictive Analytics: Microsoft® Excel
Ah, But You Can Get Odds Forever .........................................................................................................................................195
Probabilities and Odds .....................................................................................................................................................195 How the Probabilities Shift .............................................................................................................................................197 Moving On to the Log Odds ............................................................................................................................................200 8 Logistic Regression: Further Issues ............................................................................................203 An Example: Predicting Purchase Behavior ............................................................................................................................204 Using Logistic Regression ................................................................................................................................................205 Calculation of Logit or Log Odds ......................................................................................................................................213 Comparing Excel with R: A Demonstration .............................................................................................................................228 Getting R .........................................................................................................................................................................229 Running a Logistic Analysis in R ......................................................................................................................................229 Importing a csv File into R ...............................................................................................................................................230 Importing From an Open Workbook Into R ......................................................................................................................233 Understanding the Long Versus Wide Shape ...................................................................................................................234 Running Logistic Regression Using glm ...........................................................................................................................235 Statistical Tests in Logistic Regression ....................................................................................................................................240 Models Comparison in Multiple Regression .....................................................................................................................240 Calculating the Results of Different Models .....................................................................................................................241 Testing the Difference Between the Models ....................................................................................................................242 Models Comparison in Logistic Regression ......................................................................................................................243 9 Multinomial Logistic Regression ...............................................................................................253 The Multinomial Problem ......................................................................................................................................................253 Three Alternatives and Three Predictors .................................................................................................................................254 Three Intercepts and Three Sets of Coefficients ................................................................................................................256 Dummy Coding to Represent the Outcome Value ............................................................................................................256 Calculating the Logits ......................................................................................................................................................256 Converting the Logits to Probabilities ..............................................................................................................................257 Calculating the Log Likelihoods .......................................................................................................................................258 Understanding the Differences Between the Binomial and Multinomial Equations ........................................................258 Optimizing the Equations ................................................................................................................................................260 Benchmarking the Excel Results Against R ............................................................................................................................261 Converting the Raw Data Frame with mlogit.data..................................................................................................262 Calling the mlogit Function .........................................................................................................................................264 Completing the mlogit Arguments..............................................................................................................................266 Four Outcomes and One Predictor..........................................................................................................................................267 Multinomial Analysis with an Individual-Specific Predictor .............................................................................................269 Multinomial Analysis with an Alternative-Specific Predictor ...........................................................................................272 10 Principal Components Analysis .................................................................................................275 The Notion of a Principal Component ....................................................................................................................................275 Reducing Complexity .......................................................................................................................................................276 Understanding Relationships Among Measurable Variables ............................................................................................277 Maximizing Variance........................................................................................................................................................278 Components Are Mutually Orthogonal ............................................................................................................................280 Contents ix
Using the Principal Components Add-In ................................................................................................................................281
The R Matrix ....................................................................................................................................................................284 The Inverse of the R Matrix ..............................................................................................................................................284 Matrices, Matrix Inverses, and Identity Matrices ..............................................................................................................287 Features of the Correlation Matrix’s Inverse .....................................................................................................................288 Matrix Inverses and Beta Coefficients ..............................................................................................................................290 Singular Matrices .............................................................................................................................................................293 Testing for Uncorrelated Variables ...................................................................................................................................293 Using Eigenvalues ............................................................................................................................................................295 Using Component Eigenvectors .......................................................................................................................................296 Factor Loadings ...............................................................................................................................................................299 Factor Score Coefficients ..................................................................................................................................................299 Principal Components Distinguished from Factor Analysis.....................................................................................................303 Distinguishing the Purposes ............................................................................................................................................303 Distinguishing Unique from Shared Variance...................................................................................................................303 Rotating Axes ..................................................................................................................................................................305 11 Box-Jenkins ARIMA Models .......................................................................................................307 The Rationale for ARIMA ........................................................................................................................................................307 Deciding to Use ARIMA ....................................................................................................................................................308 ARIMA Notation ...............................................................................................................................................................308 Stages in ARIMA Analysis.......................................................................................................................................................310 The Identification Stage .........................................................................................................................................................310 Identifying an AR Process ................................................................................................................................................310 Identifying an MA Process ...............................................................................................................................................313 Differencing in ARIMA Analysis........................................................................................................................................315 Using the ARIMA Workbook .............................................................................................................................................320 Standard Errors in Correlograms ......................................................................................................................................321 White Noise and Diagnostic Checking..............................................................................................................................322 Identifying Seasonal Models............................................................................................................................................323 The Estimation Stage .............................................................................................................................................................324 Estimating the Parameters for ARIMA(1,0,0) ...................................................................................................................324 Comparing Excel’s Results to R’s .......................................................................................................................................326 Exponential Smoothing and ARIMA(0,0,1) ......................................................................................................................329 Using ARIMA(0,1,1) in Place of ARIMA(0,0,1) ..................................................................................................................332 The Diagnostic and Forecasting Stages ..................................................................................................................................333 12 Varimax Factor Rotation in Excel ...............................................................................................335 Getting to a Simple Structure ...............................................................................................................................................335 Rotating Factors: The Rationale .......................................................................................................................................336 Extraction and Rotation: An Example ..............................................................................................................................339 Structure of Principal Components and Factors .....................................................................................................................344 Rotating Factors: The Results ..........................................................................................................................................345 Charting Records on Rotated Factors ..............................................................................................................................348 Using the Factor Workbook to Rotate Components .........................................................................................................350 Index ...........................................................................................................................................353