QC Sep Sol
Solution / Implications / Remarks

1. Deleting the problem variable
• Not recommended [1]. The deleted variable is likely to have a strong effect on the outcome.
• Aspect: F. Flat ground will not slide. In GIS terms, Flat is not one of the orientations of slope aspect [source]. Therefore, removing F has no implications for the other variables; this can be demonstrated using the likelihood ratio test (see [findings]).

• This is not logical in the context of Aspect. In GIS terms, Flat is not one of the orientations of slope aspect [source]. It is illogical to combine F with the other categories of Aspect.

• Reported to be computationally infeasible for large sample sizes [1].
• Minitab does not provide this function.

• Minitab does not provide this function.

Reference(s)
[1] M. Altman, J. Gill, M. P. McDonald, "Convergence problems in logistic regression: solutions for quasicomplete separation," in Numerical Issues in Statistical Computation for the Social Scientist, Hoboken, New Jersey: John Wiley & Sons, Inc., 2004, pp. 248-251.
• Recommended because:
  i. The penalized maximum likelihood estimators were found to have relatively less bias than the median unbiased estimates generated by exact logistic regression [1].
  ii. Penalized maximum likelihood estimation was found to be computationally feasible even with large samples [1].
• The reason why the Newton-Raphson algorithm converges after the gradient is replaced by [...] was not explored, because verifying it is time-consuming.

5. Leave the problem variable in the model and report the likelihood-ratio chi-squares
• The maximum likelihood estimates do not exist for the problem variable; however, the maximum likelihood estimates for the other variables are valid [1].
• Special care needs to be taken in interpreting the estimates if the problem variable is a dummy variable.
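The behaviour described in the solutions above can be sketched on a toy example. The counts below (20 events in 100 reference cases, 0 events in 50 cases of the problem level) are hypothetical and are not the landslide data; the function names are also illustrative. The sketch assumes a model with an intercept and a single dummy variable, and it assumes that "penalized maximum likelihood" means a Firth-type (Jeffreys prior) penalty, which the source does not state explicitly. Under that assumption, and only for a saturated model such as this one, the penalized estimate reduces to adding 0.5 to the event and non-event counts of each group.

```python
import math


def expit(eta):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-eta))


def logit(p):
    return math.log(p / (1.0 - p))


def xlogy(x, y):
    """x * log(y), defined as 0 when x == 0."""
    return 0.0 if x == 0 else x * math.log(y)


def newton_trace(y0, n0, y1, n1, iterations=20):
    """Newton-Raphson for logit(p) = b0 + b1 * F on grouped counts.

    Group 0 is the reference level (F = 0); group 1 is the problem level
    (F = 1). Returns one (b0, b1, se_b1, log_likelihood) tuple per iteration.
    """
    b0 = b1 = 0.0
    trace = []
    for _ in range(iterations):
        p0, p1 = expit(b0), expit(b0 + b1)
        w0, w1 = n0 * p0 * (1.0 - p0), n1 * p1 * (1.0 - p1)
        r0, r1 = y0 - n0 * p0, y1 - n1 * p1  # per-group score contributions
        # Newton step: solve [[w0 + w1, w1], [w1, w1]] @ delta = [r0 + r1, r1]
        step0 = r0 / w0
        b0 += step0
        b1 += r1 / w1 - step0
        # Summaries at the updated estimates
        p0, p1 = expit(b0), expit(b0 + b1)
        w0, w1 = n0 * p0 * (1.0 - p0), n1 * p1 * (1.0 - p1)
        se_b1 = math.sqrt(1.0 / w0 + 1.0 / w1)
        loglik = (xlogy(y0, p0) + (n0 - y0) * math.log1p(-p0)
                  + xlogy(y1, p1) + (n1 - y1) * math.log1p(-p1))
        trace.append((b0, b1, se_b1, loglik))
    return trace


def penalized_estimates(y0, n0, y1, n1):
    """Jeffreys-penalized estimates for the saturated two-group model.

    Assumption: Firth-type penalty, which here is equivalent to adding 0.5
    to the event and non-event counts of each group.
    """
    q0 = (y0 + 0.5) / (n0 + 1.0)
    q1 = (y1 + 0.5) / (n1 + 1.0)
    return logit(q0), logit(q1) - logit(q0)


if __name__ == "__main__":
    # Hypothetical counts: the problem level has no events at all.
    for i, (b0, b1, se, ll) in enumerate(newton_trace(20, 100, 0, 50), 1):
        print(f"iter {i:2d}  b0={b0:8.4f}  b1={b1:9.4f}  SE(b1)={se:12.2f}  LL={ll:.6f}")
    print("penalized:", penalized_estimates(20, 100, 0, 50))
```

In this sketch the intercept settles at the logit of 20/100 within a few iterations, while the coefficient of the problem level keeps decreasing by roughly one unit per iteration and its standard error grows without bound; the log-likelihood nevertheless flattens out. The penalized estimate of the same coefficient stays finite.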
Response Information

Variable   Value  Count
Landslide  Yes    327       (Event)
           No     81792526
           Total  81792853

Frequency: Pixels

* NOTE * 17 cases were used
* NOTE * 1 cases contained missing values or was a case with zero frequency.

Logistic Regression Table

Predictor  Coef  SE Coef   Z       P      Odds Ratio  95% CI Lower  95% CI Upper
Constant         0.192451  -64.04  0.000
F                174.750   -0.12   0.908              0.00          1.04420E+140
N                0.234988  3.42    0.001              1.41          3.54
NE               0.243433  1.71    0.087              0.94          2.44
NW               0.230578  3.54    0.000              1.44          3.55
S                0.267432  0.60    0.550              0.69          1.98
SE               0.269725  0.07    0.943              0.60          1.73
SW               0.256142  0.62    0.535              0.71          1.94
W                0.242439  2.18    0.029              1.05          2.73

Log-Likelihood = -4208.506
Test that all slopes are zero: G = 366.037, DF = 8, P-Value = 0.000

* NOTE * 9 time(s) the standardized Pearson residuals, delta chi-square, delta deviance, delta beta (standardized) and delta beta could not be computed because leverage (Hi) is equal to 1.
Response Information

Variable   Value  Count
Landslide  Yes    327       (Event)
           No     49046086
           Total  49046413

Frequency: Pixels

Logistic Regression Table

Predictor  Coef  SE Coef  Z     P      Odds Ratio  95% CI Lower  95% CI Upper
Constant
N                                      2.23        1.41          3.54
NE                        1.71  0.087
NW                        3.54  0.000
S                         0.60  0.550
SE                        0.07  0.943
SW                        0.62  0.535
W                         2.18  0.029

Log-Likelihood = -4208.506
Test that all slopes are zero: G = 31.566, DF = 7, P-Value = 0.000

* NOTE * 1 time(s) the standardized Pearson residuals, delta chi-square, delta deviance, delta beta (standardized) and delta beta could not be computed because leverage (Hi) is equal to 1.
Figure 1.2: Logistic regression table of the model without F

• To solve the convergence problem, F is deleted from the model and Binary Logistic Regression is run.
• Figure 1.1 shows the logistic regression table of the model with F, while Figure 1.2 shows the logistic regression table of the model without F. Comparing Figure 1.1 and Figure 1.2, the following are observed:
  i. The log-likelihood is the same for both models, i.e. -4208.506; this shows that F does not contribute to the occurrence of landslides.
  ii. The warning that the algorithm does not converge disappears.
  iii. The coefficients of the remaining dummy variables remain the same.
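The figures reported in the two outputs can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the outputs (event count, pixel totals, log-likelihood, the odds ratio and SE Coef of N); the function names are illustrative. It assumes that G is twice the difference between the model log-likelihood and the intercept-only log-likelihood, that the confidence limits are Wald limits with a critical value of 1.96, and that the comparison of the two models is a likelihood ratio test with 1 degree of freedom, taking the source's comparison of the two log-likelihoods at face value.

```python
import math


def null_log_likelihood(events, total):
    """Log-likelihood of the intercept-only model for grouped binary data."""
    p = events / total
    return events * math.log(p) + (total - events) * math.log1p(-p)


def g_statistic(model_ll, events, total):
    """G = 2 * (model log-likelihood - intercept-only log-likelihood)."""
    return 2.0 * (model_ll - null_log_likelihood(events, total))


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def wald_interval(odds_ratio, se_coef, z_crit=1.96):
    """Wald confidence limits for an odds ratio, given the SE of its coefficient."""
    coef = math.log(odds_ratio)
    return math.exp(coef - z_crit * se_coef), math.exp(coef + z_crit * se_coef)


LOG_LIKELIHOOD = -4208.506  # reported for both models

# G depends on the total pixel count, so it differs between the two models
# even though the model log-likelihood is identical.
g_with_f = g_statistic(LOG_LIKELIHOOD, 327, 81792853)
g_without_f = g_statistic(LOG_LIKELIHOOD, 327, 49046413)

# Likelihood ratio statistic for dropping F (one parameter removed).
lr_statistic = 2.0 * (LOG_LIKELIHOOD - LOG_LIKELIHOOD)
lr_p_value = chi2_sf_df1(lr_statistic)

# Wald limits for N from its reported odds ratio and SE Coef.
n_lower, n_upper = wald_interval(2.23, 0.234988)

if __name__ == "__main__":
    print(f"G with F     = {g_with_f:.3f}")
    print(f"G without F  = {g_without_f:.3f}")
    print(f"LR statistic = {lr_statistic:.3f}, p-value = {lr_p_value:.3f}")
    print(f"95% CI for N = ({n_lower:.2f}, {n_upper:.2f})")
```

Because the reported log-likelihood and odds ratio are rounded, the recomputed values agree with the outputs only to within rounding. The difference between the two G values comes entirely from the intercept-only model, since removing F removes the flat pixels from the total.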