QC Sep Sol


Subject: Solving Quasi-complete Separation Problem

Solution 1. Deleting the problem variable

Implications:
• Not recommended [1]
• The variable deleted is likely to have a strong effect on the outcome

Remarks:
• Aspect: F
  - Flat ground will not slide
  - In GIS terms, Flat is not one of the orientations of slope aspect [source]
  Therefore, there are no implications for the other variables when F is removed; this can be demonstrated using the likelihood ratio test (see [findings])

Solution 2. Merging categories of categorical variables

Remarks:
• This is not logical in the context of Aspect
  - In GIS terms, Flat is not one of the orientations of slope aspect [source]
  - It is illogical to combine F with the other categories of Aspect

Solution 3. Use exact logistic regression

Implications:
• Reported to be computationally unfeasible for large sample sizes [1]

Remarks:
• Minitab does not provide this function

Solution 4. Use penalized maximum likelihood estimation

Implications:
• Recommended because
  i. The penalized maximum likelihood estimators were found to have relatively less bias than the median unbiased estimates generated by exact logistic regression [1]
  ii. The penalized maximum likelihood estimation is found to be computationally feasible even with large samples [1]

Remarks:
• Minitab does not provide this function
• The reason why the Newton-Raphson algorithm converges after the gradient is replaced by [...] was not explored due to the time-consuming process of verification

Solution 5. Leave the problem variable in the model and report the likelihood-ratio chi-squares

Implications:
• The maximum likelihood estimates do not exist for the problem variables; however, the maximum likelihood estimates for the other variables are valid estimates [1]

Remarks:
• Special care needs to be taken in interpreting the estimates if the problem variable is a dummy variable

Reference(s)
[1] M. Altman, J. Gill, M. P. McDonald, Convergence problems in logistic regression: solutions for quasicomplete separation, in Numerical Issues in Statistical Computation for the Social Scientist, Hoboken, New Jersey: John Wiley & Sons, Inc., 2004, pp. 248-251
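Penalized maximum likelihood estimation (Solution 4) can be illustrated with a small sketch. The source does not say which penalized estimator is meant; the code below assumes a Firth-type penalty, in which the score function is adjusted by the leverages before each Newton-type step. The function name firth_logit_one_dummy and the counts are hypothetical, chosen so that one category has zero events, as F does in this data set.

```python
import math

def firth_logit_one_dummy(groups, max_iter=50, tol=1e-10):
    """Fit logit(p) = b0 + b1*x for grouped data with a Firth-type penalty.

    groups: list of (x, n, y) with dummy x in {0, 1}, n trials, y events.
    Hypothetical helper for illustration; not a Minitab function.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        # Fisher information for (b0, b1), accumulated over groups.
        a = b = d = 0.0
        rows = []
        for x, n, y in groups:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            a += n * w
            b += n * w * x
            d += n * w * x * x
            rows.append((x, n, y, p, w))
        det = a * d - b * b
        inv = ((d / det, -b / det), (-b / det, a / det))

        # Penalized score: the usual residual plus a leverage-based correction.
        u0 = u1 = 0.0
        for x, n, y, p, w in rows:
            h = w * (inv[0][0] + 2.0 * inv[0][1] * x + inv[1][1] * x * x)
            r = (y - n * p) + n * h * (0.5 - p)
            u0 += r
            u1 += r * x

        # Scoring step: inverse information times penalized score.
        s0 = inv[0][0] * u0 + inv[0][1] * u1
        s1 = inv[1][0] * u0 + inv[1][1] * u1
        b0 += s0
        b1 += s1
        if max(abs(s0), abs(s1)) < tol:
            break
    return b0, b1

# Hypothetical counts: the category with x = 1 has no events at all.
b0, b1 = firth_logit_one_dummy([(0, 20, 5), (1, 20, 0)])
print(b0, b1)
```

With a single dummy variable the penalized fit gives each group the fitted probability (y + 0.5)/(n + 1), so the coefficient of the zero-event category stays finite, whereas ordinary maximum likelihood would drive it towards minus infinity.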

Binary Logistic Regression: Landslide versus F, N, NE, NW, S, SE, SW, W

* WARNING * Algorithm has not converged after 20 iterations.
* WARNING * Convergence has not been reached for the parameter estimates criterion.
* WARNING * The results may not be reliable.
* WARNING * Try increasing the maximum number of iterations.

Link Function: Logit

Response Information

Variable   Value     Count
Landslide  Yes         327  (Event)
           No     81792526
           Total  81792853

Frequency: Pixels

* NOTE * 17 cases were used
* NOTE * 1 cases contained missing values or was a case with zero frequency.

Logistic Regression Table

Predictor        Coef    SE Coef        Z      P  Odds Ratio  95% CI Lower  95% CI Upper
Constant     -12.3243   0.192451   -64.04  0.000
F            -20.1054    174.750    -0.12  0.908        0.00          0.00  1.04420E+140
N            0.803909   0.234988     3.42  0.001        2.23          1.41          3.54
NE           0.416537   0.243433     1.71  0.087        1.52          0.94          2.44
NW           0.815262   0.230578     3.54  0.000        2.26          1.44          3.55
S            0.159663   0.267432     0.60  0.550        1.17          0.69          1.98
SE          0.0193891   0.269725     0.07  0.943        1.02          0.60          1.73
SW           0.158881   0.256142     0.62  0.535        1.17          0.71          1.94
W            0.528285   0.242439     2.18  0.029        1.70          1.05          2.73

Log-Likelihood = -4208.506
Test that all slopes are zero: G = 366.037, DF = 8, P-Value = 0.000

* NOTE * 9 time(s) the standardized Pearson residuals, delta chi-square, delta deviance, delta beta (standardized) and delta beta could not be computed because leverage (Hi) is equal to 1.

Figure 1.1: Logistic regression table of the model with F
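The G statistic in the output above can be checked from the reported counts. The sketch below assumes that G is the usual likelihood-ratio statistic, i.e. twice the difference between the fitted log-likelihood and the log-likelihood of an intercept-only model; the variable names are illustrative and not part of the Minitab output.

```python
import math

# Counts and log-likelihood read from the Minitab output (model with F).
events = 327
total = 81792853
loglik_fitted = -4208.506

# Intercept-only model: every pixel gets the overall event rate.
p_null = events / total
loglik_null = events * math.log(p_null) + (total - events) * math.log1p(-p_null)

# Likelihood-ratio statistic for the test that all slopes are zero.
g_stat = 2.0 * (loglik_fitted - loglik_null)
print(round(g_stat, 2))
```

Under this assumption the computed value agrees closely with the reported G = 366.037, which indicates that G depends on the total pixel count through the intercept-only model as well as on the fitted log-likelihood.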

Binary Logistic Regression: Landslide versus N, NE, NW, S, SE, SW, W

Link Function: Logit

Response Information

Variable   Value     Count
Landslide  Yes         327  (Event)
           No     49046086
           Total  49046413

Frequency: Pixels

Logistic Regression Table

Predictor        Coef    SE Coef        Z      P  Odds Ratio  95% CI Lower  95% CI Upper
Constant     -12.3243   0.192450   -64.04  0.000
N            0.803909   0.234987     3.42  0.001        2.23          1.41          3.54
NE           0.416537   0.243433     1.71  0.087        1.52          0.94          2.44
NW           0.815262   0.230578     3.54  0.000        2.26          1.44          3.55
S            0.159663   0.267432     0.60  0.550        1.17          0.69          1.98
SE          0.0193891   0.269725     0.07  0.943        1.02          0.60          1.73
SW           0.158881   0.256142     0.62  0.535        1.17          0.71          1.94
W            0.528285   0.242438     2.18  0.029        1.70          1.05          2.73

Log-Likelihood = -4208.506
Test that all slopes are zero: G = 31.566, DF = 7, P-Value = 0.000

* NOTE * 1 time(s) the standardized Pearson residuals, delta chi-square, delta deviance, delta beta (standardized) and delta beta could not be computed because leverage (Hi) is equal to 1.

Figure 1.2: Logistic regression table of the model without F

• To solve the convergence problem, F is deleted from the model and Binary Logistic Regression is run again.
• Figure 1.1 shows the logistic regression table of the model with F, while Figure 1.2 shows that of the model without F. Comparing the two figures, the following are observed:
  i. The log-likelihood is the same for both models, i.e. -4208.506; this shows that F does not contribute to the occurrence of landslide.
  ii. The warning that the algorithm has not converged disappears.
  iii. The coefficients of the remaining dummy variables remain the same.
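The identical log-likelihoods can be checked with a short calculation. The sketch below assumes that the difference between the two total counts is the number of pixels in category F, that all of those pixels are non-events (the Yes count is 327 in both models), and that the fitted logit for a flat pixel is the constant plus the F coefficient from Figure 1.1. The variable names are illustrative.

```python
import math

# Totals read from the Response Information of Figures 1.1 and 1.2.
total_with_f = 81792853
total_without_f = 49046413
n_flat = total_with_f - total_without_f  # assumed: pixels in category F

# Coefficients read from Figure 1.1 (non-converged model with F).
constant = -12.3243
coef_f = -20.1054

# Fitted landslide probability for a flat pixel.
p_flat = 1.0 / (1.0 + math.exp(-(constant + coef_f)))

# Each flat pixel is a non-event, so it contributes log(1 - p_flat).
loglik_flat = n_flat * math.log1p(-p_flat)
print(n_flat, p_flat, loglik_flat)
```

Under these assumptions the flat pixels together contribute a log-likelihood that is negative but far smaller in magnitude than 0.001, which is consistent with both models reporting -4208.506 to three decimal places.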
