
Summary - Binary Logistic Regression Page 1 of 37

Binary Logistic Regression


Summary

Binary logistic regression examines the relationship between one or more predictor variables and a binary response. A binary
response variable has two possible outcomes, such as the presence or absence of a disease.
Factors and covariates can be used as predictors in a binary logistic model. The fitted binary logistic model is sometimes used
to classify observations into one of two categories.

Data Description

A cereal company is investigating the effectiveness of a TV advertisement for a new product called Cocoa Crunch. After
showing the advertisement in a particular community for one week, they randomly sampled seventy-one adults exiting a local
supermarket.
They were asked:
·    whether or not they bought Cocoa Crunch (Bought)
·    their household income (in thousands of dollars) (Income)
·    whether or not they had children (Children)
·    whether or not they viewed the advertisement (ViewAd)
Data: CerealAd.MTW (available in the Sample Data folder).

Binary Logistic Regression


Link Function

Minitab provides three link functions, allowing you to fit a broad class of binary response models. These are the inverse of the
cumulative logistic distribution function (logit), the inverse of the cumulative standard normal distribution function (normit),
and the inverse of the Gompertz distribution function (gompit).
You want to choose a link function that results in a good fit to your data. You can use goodness-of-fit statistics to compare fits
using different link functions. Certain link functions may be used for historical reasons or because they have a special meaning
in a discipline.
One advantage of the logit link function is that it provides an estimate of the odds ratio for each predictor in the model. For
the logit link function, the odds ratios from a retrospective sample estimate the odds ratios from a prospective sample.
Retrospective samples are faster to collect than prospective samples.
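
As a rough illustration of the three link functions named above, the following Python sketch maps an event probability to the linear-predictor scale. The function names are chosen here for illustration and are not Minitab commands; the gompit is written in its usual complementary log-log form.

```python
import math
from statistics import NormalDist

def logit(p):
    """Inverse of the cumulative logistic distribution function."""
    return math.log(p / (1.0 - p))

def normit(p):
    """Inverse of the cumulative standard normal distribution function."""
    return NormalDist().inv_cdf(p)

def gompit(p):
    """Inverse of the Gompertz distribution function (complementary log-log)."""
    return math.log(-math.log(1.0 - p))

# Compare the three links at a few illustrative probabilities.
for p in (0.1, 0.5, 0.9):
    print(p, round(logit(p), 3), round(normit(p), 3), round(gompit(p), 3))
```

The logit and normit are symmetric around p = 0.5, while the gompit is not, which is one reason the links can fit the same data differently.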

Example Output
Method

Link function                 Logit
Categorical predictor coding  (1, 0)

Interpretation

For the cereal data, investigators chose to use the Logit link function.

Binary Logistic Regression


Response Information

Minitab displays the following information about the response:


· Variable: the name of the response variable.
· Value: the two levels of the binary response.
· Count: the number of observations at each level of the response.
· Event: the reference event.

file:///C:/Users/SEPTIANA/AppData/Local/Temp/~hhDDF5.htm 2/7/2018

· Total: the number of nonmissing observations.

Example Output
Response Information

Variable  Value  Count
Bought    1         22  (Event)
          0         49
          Total     71

Interpretation

For the cereal data, the response name is Bought, the values of the binary response are 1 (bought the cereal) and 0 (did not
buy the cereal), 22 adults bought the cereal (1) and 49 adults did not buy the cereal (0), buying the cereal (1) is considered the
reference event, and there are 71 observations.

Binary Logistic Regression : Topics


Summary
Link function
Response information
Deviance table
P-value
Model Summary
Summary of model
Coefficients
Logistic model
Odds ratios
Odds ratio
Equation
Equation
Goodness-of-fit tests
Pearson and deviance tests
Hosmer-Lemeshow test
Measures of association
Fits and Diagnostics
Unusual observations
Residual plots
Histogram of residuals
Normal probability plot of residuals
Residuals versus fits
Residuals versus order
Residuals versus variables
Three-in-one plot

Binary Logistic Regression


Deviance Table - P-Value

The p-values test whether or not an observed relationship is statistically significant. The p-values in the deviance table are for
the likelihood ratio tests. The likelihood ratio tests are more accurate for small samples than Wald approximation tests. You
need to:
1    Identify the p-value at the top of the deviance table. This p-value tells you if there is a significant association between at
least one predictor and the response by testing whether all slopes are equal to zero.
2    Compare this p-value to your α-level. A commonly used α-level is 0.05.
·      If the p-value is less than or equal to the α-level, then the association is significant, and you can conclude that at least
one predictor is significantly associated with the response.
·      If the p-value is greater than the α-level, then you can conclude that there is no significant association and the
interpretation ends.
3    If you concluded in step 2 that there is at least one significant predictor, identify the p-value for each term in the model.
These p-values tell you whether or not there is a statistically significant association between a particular predictor variable
and the response.
4    Compare the individual p-values to your α-level. If a p-value is less than or equal to the α-level you have selected, the
association is significant.

Example Output
Deviance Table

Source      DF  Adj Dev  Adj Mean  Chi-Square  P-Value
Regression   3  11.1298    3.7099       11.13    0.011
  Income     1   0.4985    0.4985        0.50    0.480
  Children   1   3.3886    3.3886        3.39    0.066
  ViewAd     1   3.3764    3.3764        3.38    0.066
Error       67  76.7665    1.1458
Total       70  87.8963

Interpretation

For the cereal data, the p-value for testing that all slopes are zero is 0.011. Assume an α-level of 0.05. Because 0.011 is less
than 0.05, you conclude that there is a significant relationship between the response and at least one of the predictor variables.
Now look at the p-values for each predictor. None of the individual p-values is below 0.05. If you relax the α-level to 0.10,
ViewAd (P = 0.066) and Children (P = 0.066) are both significant. Income (P = 0.480) is not significant at either level, so you
would conclude that there is no significant association between household income and purchase of the cereal.
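
The p-values in the example deviance table can be reproduced from the chi-square statistics and their degrees of freedom. The sketch below is only an illustration: `chi2_sf` is a helper written here (not a Minitab function) that evaluates the chi-square upper-tail probability for integer degrees of freedom with the standard library.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    half = x / 2.0
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1   # closed form for df = 1
    else:
        q, k = math.exp(-half), 2              # closed form for df = 2
    while k < df:
        # Recurrence that steps from k to k + 2 degrees of freedom.
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q

# (Adj Dev, DF) taken from the example deviance table.
deviance_table = {
    "Regression": (11.1298, 3),
    "Income": (0.4985, 1),
    "Children": (3.3886, 1),
    "ViewAd": (3.3764, 1),
}

alpha = 0.05
for source, (adj_dev, df) in deviance_table.items():
    p_value = chi2_sf(adj_dev, df)
    print(source, round(p_value, 3), "significant" if p_value <= alpha else "not significant")
```

Changing `alpha` to 0.10 shows how the conclusions for Children and ViewAd depend on the chosen level.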

Binary Logistic Regression


Summary of model

The model summary table contains statistics that you can use to select a model. The table includes three statistics:
· Deviance R-Sq (deviance R²) is typically thought of as the proportion of the deviance in the data that the model explains. The
larger the deviance R², the better the model fits the data.
· Deviance R-Sq(adj) is a modified deviance R² that has been adjusted for the number of terms in the model. The typical
interpretation is the same as for deviance R². Use the adjusted statistic to compare models with different numbers of
predictors.
·    Akaike Information Criterion (AIC) is the most useful of the three statistics for comparing models. However, AIC has
no typical interpretation by itself like the R² statistics do. The smaller the AIC, the better the model fits the data.
Use these statistics to compare different models. High R² values and low AIC values do not guarantee that a model fits the
data well. Use the goodness-of-fit tests in addition to the model summary to assess how well a model fits the data.

Example Output

Model Summary

Deviance   Deviance
    R-Sq  R-Sq(adj)    AIC
  12.66%      9.25%  84.77

Interpretation

For the cereal data, deviance R² is 12.66%, adjusted deviance R² is 9.25%, and AIC is 84.77. Higher values of the R² statistics or
a lower value of AIC for another model suggests that a different set of predictors does a better job.
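
The three summary statistics can be recovered, at least approximately, from the example deviance table. The formulas below are assumptions rather than documented Minitab formulas: they were chosen because they reproduce the reported values for this example. The AIC line relies on the fact that, for ungrouped 0/1 responses, the error deviance equals -2 times the log-likelihood.

```python
# Values from the example deviance table.
deviance_error = 76.7665   # Error row, Adj Dev
deviance_total = 87.8963   # Total row, Adj Dev
df_regression = 3          # Regression row, DF
n_parameters = df_regression + 1   # three slopes plus the constant

# Proportion of the total deviance that the model explains.
r_sq = 1.0 - deviance_error / deviance_total

# Assumed adjustment: penalize the error deviance by the regression DF.
r_sq_adj = 1.0 - (deviance_error + df_regression) / deviance_total

# Assumed AIC: -2 * log-likelihood (the error deviance here) + 2 * parameters.
aic = deviance_error + 2 * n_parameters

print(f"Deviance R-Sq      {100 * r_sq:.2f}%")
print(f"Deviance R-Sq(adj) {100 * r_sq_adj:.2f}%")
print(f"AIC                {aic:.2f}")
```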

Binary Logistic Regression


Coefficients - Logistic Model

Binary logistic regression examines the relationship between one or more predictor variables and a binary response. The
logistic equation can be used to examine how the probability of an event changes as the predictor variables change.
The interpretation of the estimated coefficients for categorical predictors is relative to the reference level of the predictor.
A positive coefficient indicates that the event is more likely at that level of the predictor than at the reference level.
A negative coefficient indicates that the event is less likely at that level of the predictor than at the reference level.
For a continuous predictor, a positive coefficient indicates that the event becomes more likely as the predictor increases.
Coefficients close to zero indicate that the association between the predictor and the binary response may not be important.

Example Output
Coefficients

Term        Coef  SE Coef   VIF
Constant  -3.016    0.939
Income    0.0137   0.0195  1.15
Children
  Yes      1.433    0.856  1.12
ViewAd
  Yes      1.034    0.572  1.03

Interpretation

For the cereal data,


·    The positive coefficient for ViewAd (1.034) implies that an adult that has viewed the advertisement is more likely to
purchase the cereal than an adult that has not viewed the advertisement. Note that the reference level for ViewAd is No.
·    Similarly, the positive coefficient for Children (1.433) implies that an adult with children is more likely to purchase the cereal
than an adult without children. Note that the reference level for Children is No.
·    The positive coefficient for Income (0.0137) implies that the greater the household income, the more likely a subject is to
purchase the cereal. This statement only applies to the range of household incomes in the sample, that is, incomes less than
$75,000. (The relatively large p-value for Income in the deviance table, 0.480, suggests that this association is not important.
You would probably exclude this predictor and refit the model.)

Binary Logistic Regression


Regression Table - Odds Ratio

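One advantage of the logit link function is that it provides an estimate of the odds ratio for each predictor in the model. For a
categorical predictor, the odds ratio compares the odds of the event at a level of the predictor with the odds at the predictor's
reference level. For a continuous predictor, it compares the odds after a one-unit increase with the odds before the increase.
An odds ratio greater than 1 indicates that the event is more likely, and an odds ratio less than 1 indicates that it is less likely.
An odds ratio of 1 indicates no association between the predictor and response.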

Example Output

Odds Ratios for Continuous Predictors

           Unit of   Odds
Predictor   Change  Ratio     95% CI
Income           1  1.014  (0.98, 1.05)

Odds Ratios for Categorical Predictors

                       Odds
Predictor  Reference  Ratio      95% CI
Children
  Yes             No  4.190  (0.78, 22.45)
ViewAd
  Yes             No  2.813  (0.92,  8.63)

Interpretation

For the cereal data, the logit link was used, so the odds ratios can be interpreted as follows:
·    An adult who has viewed the advertisement has odds of purchasing Cocoa Crunch that are 2.813 times the odds for an adult
who has not viewed the advertisement (assuming common values for the other variables). Note that the reference level for
ViewAd is No.
·    An adult who has children has odds of purchasing Cocoa Crunch that are 4.190 times the odds for an adult who does not have
children (assuming common values for the other variables). Note that the reference level for Children is No.
·    An adult with a household income one thousand dollars (one unit) greater than another adult's has odds of purchasing
Cocoa Crunch that are 1.014 times as large (assuming common values for the other variables). However, the relatively large
p-value suggests that this association is not important. You would probably exclude this predictor and refit the model.
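
With the logit link, each odds ratio is the exponential of the corresponding coefficient. The sketch below recomputes the example odds ratios from the rounded values in the Coefficients table. The 95% interval uses a normal-approximation (Wald) form with a critical value of 1.96, which is an assumption made here; it agrees with the table up to rounding of the inputs.

```python
import math

Z_95 = 1.96  # assumed normal critical value for a 95% Wald interval

def odds_ratio(coef, se_coef, unit=1.0):
    """Return (odds ratio, lower limit, upper limit) for a change of `unit`."""
    point = math.exp(coef * unit)
    lower = math.exp((coef - Z_95 * se_coef) * unit)
    upper = math.exp((coef + Z_95 * se_coef) * unit)
    return point, lower, upper

# (Coef, SE Coef) from the example Coefficients table.
coefficients = {
    "Income": (0.0137, 0.0195),
    "Children (Yes vs No)": (1.433, 0.856),
    "ViewAd (Yes vs No)": (1.034, 0.572),
}

for term, (coef, se_coef) in coefficients.items():
    point, lower, upper = odds_ratio(coef, se_coef)
    print(f"{term:22s} {point:6.3f}  ({lower:.2f}, {upper:.2f})")
```

All three intervals contain 1, which matches the individual p-values above 0.05 in the deviance table.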

Binary Logistic Regression


Regression equation

The regression equation is an algebraic representation of the regression line and is used to describe the relationship between
the response and predictor variables. The form of the regression equation with respect to the probability of an event depends
on the link function. Use the equation to predict the probability of an event.
Minitab provides a separate regression equation for each combination of levels of the categorical predictors in the model. When
Minitab calculates probabilities, the result of the linear equation (Y') is the input into the equation for the probability.

Example Output
Regression Equation

P(1)  =  exp(Y')/(1 + exp(Y'))

Children  ViewAd
No        No      Y' = -3.016 + 0.01374 Income

No        Yes     Y' = -1.982 + 0.01374 Income

Yes       No      Y' = -1.583 + 0.01374 Income

Yes       Yes     Y' = -0.5490 + 0.01374 Income

Interpretation

For the cereal data, there are four equations because there are 2 categorical predictors with 2 levels each. Because there is no
interaction in the model, the coefficient for income is the same in all four equations.
In the absence of interactions, you can evaluate the relative probabilities of the groups with a comparison of the constant
values in the equation. For example, customers who did not have children and did not view the ad have the most negative
constant (-3.016). This group is least likely to buy the cereal.
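
To show how the two-step calculation works, the sketch below builds Y' from the example coefficients and then applies P(1) = exp(Y')/(1 + exp(Y')). The income value of 40 (thousand dollars) is a hypothetical input chosen for illustration, and the function names are not part of Minitab.

```python
import math

CONSTANT = -3.016
COEF_INCOME = 0.01374
COEF_CHILDREN_YES = 1.433
COEF_VIEWAD_YES = 1.034

def linear_predictor(income, children, view_ad):
    """Y' for a given income (in thousands) and the two Yes/No predictors."""
    y = CONSTANT + COEF_INCOME * income
    if children == "Yes":
        y += COEF_CHILDREN_YES
    if view_ad == "Yes":
        y += COEF_VIEWAD_YES
    return y

def event_probability(income, children, view_ad):
    """P(1) = exp(Y') / (1 + exp(Y')) for the logit link."""
    y = linear_predictor(income, children, view_ad)
    return math.exp(y) / (1.0 + math.exp(y))

income = 40  # hypothetical household income, in thousands of dollars
for children in ("No", "Yes"):
    for view_ad in ("No", "Yes"):
        p = event_probability(income, children, view_ad)
        print(f"Children={children:3s} ViewAd={view_ad:3s} P(1)={p:.3f}")
```

At any fixed income the four probabilities keep the same ordering as the four constants, which is why the constants alone can be compared when there are no interactions.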

Binary Logistic Regression

Goodness-of-Fit Tests - Pearson and Deviance Tests

When fitting a logistic model, you want to choose a model (link function and predictors) that results in a good fit to your data.
You can use goodness-of-fit statistics to compare the fits of different models. A low p-value indicates that the predicted
probabilities deviate from the observed probabilities in a way that the binomial distribution does not predict.
By default, Minitab provides three goodness-of-fit tests: Pearson, Deviance, and Hosmer-Lemeshow.
Pearson and Deviance are both types of residuals for logistic models. They are useful measures for evaluating how well the
selected model fits the data. The higher the p-value, the better the model fits the data. You may want to check other models
and select the one that produces the largest goodness-of-fit p-values (unless one model has special meaning in your
discipline).

Example Output
Goodness-of-Fit Tests

Test             DF  Chi-Square  P-Value
Deviance         67       76.77    0.194
Pearson          67       76.11    0.209
Hosmer-Lemeshow   8        5.58    0.694

Interpretation

For the cereal data, both the Pearson and Deviance tests have p-values that are greater than 0.10 (0.209 and 0.194). At any
α-level less than or equal to 0.10, there is insufficient evidence to conclude that the model does not fit the data adequately.

Binary Logistic Regression


Goodness-of-Fit Tests - Hosmer-Lemeshow Test

When fitting a logistic model, you want to choose a model (link function and predictors) that results in a good fit to your data.
Goodness-of-fit statistics can be used to compare the fits of different models. A low p-value indicates that the predicted
probabilities deviate from the observed probabilities in a way that the binomial distribution does not predict.
By default, Minitab provides three goodness-of-fit tests: Pearson, Deviance, and Hosmer-Lemeshow.
The Hosmer-Lemeshow test assesses the model fit by comparing the observed and expected frequencies. The test groups the
data by their estimated probabilities from lowest to highest, then performs a Chi-square test to determine if the observed and
expected frequencies are significantly different.

Example Output
Goodness-of-Fit Tests

Test             DF  Chi-Square  P-Value
Deviance         67       76.77    0.194
Pearson          67       76.11    0.209
Hosmer-Lemeshow   8        5.58    0.694

Observed and Expected Frequencies for Hosmer-Lemeshow Test

            Event
         Probability       Bought = 1          Bought = 0
Group       Range      Observed  Expected  Observed  Expected
    1  (0.000, 0.065)         1       0.4         6       6.6
    2  (0.065, 0.137)         1       0.7         6       6.3
    3  (0.137, 0.193)         1       1.1         6       5.9
    4  (0.193, 0.232)         0       1.5         7       5.5
    5  (0.232, 0.252)         2       1.7         5       5.3
    6  (0.252, 0.304)         1       2.0         6       5.0
    7  (0.304, 0.466)         4       2.8         3       4.2
    8  (0.466, 0.514)         4       3.5         3       3.5
    9  (0.514, 0.552)         5       4.3         3       3.7
   10  (0.552, 0.568)         3       4.0         4       3.0

Interpretation

For the cereal data, the relatively large p-value (0.694) for the test indicates that there is consistency between the observed
and expected frequencies.
The largest difference between these values is found in Group 4:
·    For Bought = 1, the observed frequency is 0, but 1.5 observations were expected.
·    For Bought = 0, the observed frequency is 7, but only 5.5 observations were expected.
If you scan through the table of observed and expected frequencies you can see that the observed and expected values are
generally quite close.
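
The comparison of observed and expected frequencies can be sketched as a Pearson-type sum over the ten groups. The code below uses the rounded expected values printed in the table, so its result is close to, but not exactly, the reported chi-square of 5.58. The degrees of freedom are taken as the number of groups minus 2, which matches the 8 shown in the output.

```python
# (observed events, expected events, observed non-events, expected non-events)
hl_table = [
    (1, 0.4, 6, 6.6),
    (1, 0.7, 6, 6.3),
    (1, 1.1, 6, 5.9),
    (0, 1.5, 7, 5.5),
    (2, 1.7, 5, 5.3),
    (1, 2.0, 6, 5.0),
    (4, 2.8, 3, 4.2),
    (4, 3.5, 3, 3.5),
    (5, 4.3, 3, 3.7),
    (3, 4.0, 4, 3.0),
]

def hosmer_lemeshow_statistic(table):
    """Sum of (observed - expected)^2 / expected over both outcomes in every group."""
    statistic = 0.0
    for obs_event, exp_event, obs_non, exp_non in table:
        statistic += (obs_event - exp_event) ** 2 / exp_event
        statistic += (obs_non - exp_non) ** 2 / exp_non
    return statistic

statistic = hosmer_lemeshow_statistic(hl_table)
degrees_of_freedom = len(hl_table) - 2
print(round(statistic, 2), degrees_of_freedom)
```

Group 4 contributes the largest share of the total, consistent with the interpretation above.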

Binary Logistic Regression


Measures of Association

The Measures of Association table contains the following:


·    Pairs information, which contains the number and percent of pairs of observations with different response values that are
concordant pairs , discordant pairs , and tied pairs .
·    Somers' D, which shows how many more concordant than discordant pairs exist divided by the total number of pairs.
·    Goodman-Kruskal Gamma, which shows how many more concordant than discordant pairs exist divided by the total
number of pairs excluding ties.
·    Kendall's Tau-a, which shows how many more concordant than discordant pairs exist divided by the total number of pairs
of observations including pairs with the same response value.
To create the pairs used in these statistics, each observed "success" is paired with every "failure." It is then noted whether the
probability of success predicted from the model is higher for the actual "success."
·    If the predicted probability of success is higher for the observation corresponding to a "success," the pair is considered
concordant.
·    If the predicted probability of success is higher for the observation corresponding to a "failure," the pair is considered
discordant.
·    If the predicted probability of success is the same for both the observed "success" and the observed "failure," the pair is
considered tied.
Larger values for Somers' D, Goodman-Kruskal Gamma, and Kendall's Tau-a indicate that the model has better predictive
ability.

Example Output
Measures of Association

                                                     Summary
Pairs       Number  Percent  Summary Measures       Measures
Concordant     786     72.9  Somers' D                  0.47
Discordant     283     26.3  Goodman-Kruskal Gamma      0.47
Ties             9      0.8  Kendall's Tau-a            0.20
Total         1078    100.0

Association is between the response variable and predicted probabilities

Interpretation

For the cereal data, 72.9% of the pairs were concordant, while 26.3% of the pairs were discordant. The share of concordant
pairs therefore exceeds the share of discordant pairs by 46.6 percentage points, which is what Somers' D reports as a proportion.
Somers' D (0.47) and Goodman-Kruskal Gamma (0.47) are very close to one another because there are very few tied pairs.
Both express the excess of concordant over discordant pairs as a proportion of the number of pairs. Somers' D includes tied pairs
in the denominator, Goodman-Kruskal Gamma does not.
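
The pairing rule and the three summary measures can be sketched as follows. `count_pairs` applies the concordant/discordant/tied definitions to lists of predicted probabilities (the short lists in the usage line are hypothetical), and `association_measures` applies the definitions above to the pair counts from the example output.

```python
def count_pairs(success_probs, failure_probs):
    """Pair every observed success with every observed failure and classify the pair."""
    concordant = discordant = tied = 0
    for p_success in success_probs:
        for p_failure in failure_probs:
            if p_success > p_failure:
                concordant += 1
            elif p_success < p_failure:
                discordant += 1
            else:
                tied += 1
    return concordant, discordant, tied

def association_measures(concordant, discordant, tied, n_observations):
    """Somers' D, Goodman-Kruskal Gamma, and Kendall's Tau-a from pair counts."""
    excess = concordant - discordant
    somers_d = excess / (concordant + discordant + tied)
    gamma = excess / (concordant + discordant)
    all_pairs = n_observations * (n_observations - 1) / 2
    tau_a = excess / all_pairs
    return somers_d, gamma, tau_a

# Hypothetical predicted probabilities, only to show the pairing rule.
print(count_pairs([0.8, 0.4], [0.3, 0.4, 0.9]))

# Pair counts and sample size from the example output.
somers_d, gamma, tau_a = association_measures(786, 283, 9, 71)
print(round(somers_d, 2), round(gamma, 2), round(tau_a, 2))
```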

Binary Logistic Regression


Unusual Observations Table - Standardized Residuals

The unusual observation table displays cases that meet one of two criteria:

·    Standardized residuals with absolute values greater than 2. These cases do not follow the proposed regression
equation well.
·    Leverages greater than the lesser of 3p/n or 0.99, where p is the number of terms in the model including the constant
and n is the number of observations in the data set. These cases could have undue influence on the proposed
regression equation.

For unusual observations, you should investigate whether the data were recorded correctly, and whether the data collection
process was affected by any other factors.

Example Output
Fits and Diagnostics for Unusual Observations

        Observed                  Std
Obs  Probability    Fit  Resid  Resid
 50        1.000  0.062  2.357   2.40  R
 68        1.000  0.091  2.189   2.28  R

R  Large residual

Interpretation

For the cereal data, two observations, 50 and 68, have standardized residuals with absolute values greater than 2 (2.40 and
2.28). These two observations do not follow the proposed regression equation well.
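
The two criteria can be illustrated with the example numbers. The sketch below assumes that the Resid column holds deviance residuals for a 0/1 response, and it takes p = 4 (the constant plus Income, Children, and ViewAd). The fits are the rounded values from the table, so the recomputed residuals match only to about two decimals. Leverages are not shown in the output, so only the cutoff is computed.

```python
import math

def deviance_residual(observed, fit):
    """Deviance residual for a single 0/1 observation with fitted probability `fit`."""
    log_likelihood = observed * math.log(fit) + (1 - observed) * math.log(1.0 - fit)
    sign = 1.0 if observed > fit else -1.0
    return sign * math.sqrt(-2.0 * log_likelihood)

def leverage_cutoff(n_terms, n_observations):
    """Lesser of 3p/n and 0.99."""
    return min(3.0 * n_terms / n_observations, 0.99)

# Observed response and fitted probability for the two flagged observations.
for obs, (observed, fit) in {50: (1, 0.062), 68: (1, 0.091)}.items():
    print(obs, round(deviance_residual(observed, fit), 3))

print(round(leverage_cutoff(4, 71), 3))
```

Both flagged customers bought the cereal even though the model gave them a fitted probability below 0.10, which is what produces the large positive residuals.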

Note: Residual plots can also help you examine the assumptions about the regression model.

Binary Logistic Regression


Residual Plots - Histogram of the Residuals

A histogram of the residuals shows the distribution of the residuals for all observations. Use the histogram as an exploratory
tool to learn about the following characteristics of the data:
·    Typical values, spread or variation, and shape
·    Unusual values in the data
The histogram of the residuals should be bell-shaped. Use this plot to look for the following:

This pattern...                       Indicates...

Long tails                            Skewness
A bar far away from the other bars    An outlier
Because the appearance of the histogram can change depending on the number of intervals used to group the data, use the
normal probability plot and goodness-of-fit tests to assess whether the residuals are normal.

Example Output

Interpretation

For the cereal data, one group of residuals is most frequent around -0.75 and the second group is most frequent around 1.25.
This pattern often means not enough data is in the set for normal approximation theory to apply. Confidence intervals for
predictions are probably inaccurate.
Two observations have standardized deviance residuals that are greater in absolute value than 2.

Binary Logistic Regression


Residual Plots - Normal plot of the Residuals

This graph plots the residuals versus their expected values when the distribution is normal. The residuals from the analysis
should be approximately normally distributed. In practice, for data with a large number of observations, moderate departures
from normality do not seriously affect the results.
When the data are in event/trial format, the normal probability plot of the residuals should roughly follow a straight line. Use
this plot to look for the following:

This pattern...                   Indicates...

Not a straight line               Nonnormality
Curve in the tails                Skewness
A point far away from the line    An outlier
Changing slope                    An unidentified variable
If your data have fewer than 50 observations, the plot may display curvature in the tails even if the residuals are normally
distributed. As the number of observations decreases, the probability plot may show even greater variation and nonlinearity.
Use the normal probability plot and goodness-of-fit tests to assess the normality of residuals in small data sets.

Example Output

Interpretation

For the cereal data, one group of residuals is most frequent around -0.75 and the second group is most frequent around 1.25.
The lower group does not follow the line. This pattern often means not enough data is in the set for normal approximation
theory to apply. Confidence intervals for predictions are probably inaccurate.
Two observations have standardized deviance residuals that are greater in absolute value than 2.

Binary Logistic Regression


Residual Plots - Residuals versus fits

This graph plots the residuals versus the fitted values. When the data are in event/trial format and have repeated observations
at the predictor values, the interpretation is similar to least squares regression. The residuals should be scattered randomly
about zero. Use this plot to look for the following:

This pattern...                                                  Indicates...

Fanning or uneven spreading of residuals across fitted values    Nonconstant variance or an inappropriate link function
Curvilinear                                                      A missing higher-order term or an inappropriate link function
A point far away from zero                                       An outlier
A point far away from the other points in the x-direction        An influential point

If the data do not have repeated observations at the predictor values, the plot shows two groups of points that approximate
lines. This pattern is not informative.

Example Output

Interpretation

For the cereal data, the residuals have two groups of points that approximate lines. Because few customers have the same
income, only a few predictor values repeat. The pattern that appears is not informative.
The data in the sample folder is not in event/trial format. See Change from binary response format to event/trial format to
learn how to produce this plot.

Binary Logistic Regression


Residual Plots - Residuals versus order

This graph plots the residuals in the order of the corresponding observations. The plot is useful when the order of the
observations may influence the results, which can occur when data are collected in a time sequence or in some other sequence,
such as geographic area. This plot can be particularly helpful in a designed experiment in which the runs are not randomized.
The residuals in the plot should fluctuate in a random pattern around the center line. Examine the plot to see if any correlation
exists between error terms that are near each other. Correlation among residuals may be signified by:
·    An ascending or descending trend in the residuals
·    Rapid changes in signs of adjacent residuals

Example Output

Interpretation

For the cereal data, the residuals alternate relatively quickly. From 36 to 43, 8 customers in a row did not buy the cereal. A
series of 8 is not unusual when about 70% of the customers did not buy the cereal, but the cereal company could review the
data to see if the customers in the series have common traits.
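
One way to check the claim that a series of 8 is not unusual is to compute the chance of seeing at least one such series somewhere in 71 customers. The sketch below makes a simplifying assumption that is not in the original analysis: every customer independently fails to buy with the same probability, 49/71. It tracks the length of the current streak of non-buyers one customer at a time.

```python
def prob_run_at_least(n_customers, p_no_buy, run_length):
    """P(at least one run of `run_length` consecutive non-buyers among n customers)."""
    # state[k] = probability that the current streak has length k and no full run has occurred.
    state = [0.0] * run_length
    state[0] = 1.0
    reached = 0.0
    for _ in range(n_customers):
        new_state = [0.0] * run_length
        for k, prob in enumerate(state):
            new_state[0] += prob * (1.0 - p_no_buy)   # a buyer resets the streak
            if k + 1 == run_length:
                reached += prob * p_no_buy            # streak reaches the target length
            else:
                new_state[k + 1] += prob * p_no_buy   # streak grows by one
        state = new_state
    return reached

probability = prob_run_at_least(71, 49 / 71, 8)
print(round(probability, 3))
```

Under these assumptions such a series is more likely than not to appear by chance, so the run by itself is weak evidence of an order effect.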

Binary Logistic Regression


Residual Plots - Residuals versus variables

This graph plots the residuals versus another variable. The residuals should fluctuate in a random pattern around the center
line. If the variable is already included in the model, use the plot to determine if you should add a higher-order term of the
variable. If the variable is not already included in the model, use the plot to determine if the variable is influencing the
response in a systematic way.
Use this plot to look for the following:

This pattern...            Indicates...

Pattern in residuals       The variable is influencing the response in a systematic way
Curvature in the points    The model needs a higher-order term of the variable or a different link function

Example Output

Interpretation

For the cereal data, no patterns are apparent in the plot against the Income predictor. The two observations with large
standardized residuals are visible again. The potential outliers come from different income levels, so income does not explain
why the outliers occur.

Binary Logistic Regression


Residual Plots - Three-in-One Residual Plot

The three-in-one residual plot displays three different residual plots together in one graph window. This layout can be useful
for comparing the plots to determine whether your model meets the assumptions of the analysis. The residual plots in the
graph include:
·    Histogram - indicates whether outliers exist in the data
·    Normal probability plot - indicates whether outliers exist in the data
·    Residuals versus order of the data - indicates whether there are systematic effects in the data due to time or data collection
order

Example Output

Interpretation

To view the interpretation of each residual plot in the three-in-one plot, refer to the individual topic for each residual plot
preceding this topic.

What are complete separation and quasi-complete separation?


Two conditions exist that prevent the convergence of the maximum likelihood estimates for the coefficients: complete
separation and quasi-complete separation.
Complete separation
Complete separation occurs when a linear combination of the predictors yields a perfect prediction of the response variable. For
example, in the following data set if X ≤ 4 then Y = 0. If X > 4 then Y = 1.
Y  0  0  0  0  0  0  1  1  1  1
X  1  2  3  4  4  4  5  6  7  8

Quasi-complete separation
Quasi-complete separation is similar to complete separation. The predictors yield a perfect prediction of the response variable
for most values of the predictors, but not all. For example, in the previous data set, for one of the values where X = 4, let Y = 1
instead of 0. Now, if X < 4 then Y = 0, if X > 4 then Y = 1, but if X = 4 then Y could be 0 or 1. This overlap in the middle range
of the data makes the separation quasi-complete.
Y  0  0  0  0  0  1  1  1  1  1
X  1  2  3  4  4  4  5  6  7  8
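
For a single numeric predictor, the two conditions can be checked by comparing the X values observed for Y = 0 with those observed for Y = 1. The following sketch is a simplified check written for illustration: it handles one predictor only, whereas separation in a real model can involve a linear combination of several predictors.

```python
def separation_type(x, y):
    """Classify a single predictor as 'complete', 'quasi-complete', or 'none'."""
    zeros = [xi for xi, yi in zip(x, y) if yi == 0]
    ones = [xi for xi, yi in zip(x, y) if yi == 1]
    if not zeros or not ones:
        raise ValueError("both response values must be present")
    # Gap between the groups when Y = 1 lies above Y = 0, or when it lies below.
    gap = max(min(ones) - max(zeros), min(zeros) - max(ones))
    if gap > 0:
        return "complete"        # a threshold splits the responses perfectly
    if gap == 0:
        return "quasi-complete"  # the groups overlap only at one boundary value
    return "none"

x = [1, 2, 3, 4, 4, 4, 5, 6, 7, 8]
print(separation_type(x, [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]))
print(separation_type(x, [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]))
```

The two calls use the two data sets shown above.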

Causes and remediation


Often, separation occurs when the data set is too small to observe events with low probabilities. The more predictors are in the
model, the more likely separation is to occur because the individual groups in the data have smaller sample sizes. In Minitab,
the model can also fail to converge for very large or very small probabilities that are not strictly 0 or 1, such as less than 1 out
of 1 trillion.
Although Minitab prints a warning when it detects separation, the more predictors are in the model the more difficult the
identification of the cause of the separation is. The inclusion of interaction terms in the model makes the difficulty even
greater.
When the maximum likelihood estimates fail to converge because of separation, consider the following 5 strategies:
1. Increase the amount of data. Separation often occurs when there is a category or range of a predictor with only one value of
the response. A larger sample size increases the probability of different values for the response.
2. Consider what the separation means. While complete separation and quasi-complete separation can indicate that the
sample size is too small, they can also indicate important relationships. If the true probability of an event at a particular
level or combination of levels is close to 0 or 1, this information is important.
3. Consider an alternative model. The more terms are in the model, the more likely that separation occurs for at least one
variable. When you select terms for the model, you can check whether the exclusion of a term allows the maximum
likelihood estimates to converge. If a useful model exists that does not use the term, you can continue the analysis with the
new model.
4. Check to see whether you can combine categories in problematic variables. If there are categories that are sensible to
combine, the separation can disappear from the data set. For example, suppose “Fruit” is a variable in the model.
“Grapefruit” has no events because of the small number of trials. Combining “Grapefruit” and “Oranges” into the category
“Citrus” eliminates the separation.
Data with complete separation

Fruit       Events  Trials
Grapefruit       0      10
Oranges          5     100
Apples          25     100
Bananas         40     100

Data with overlap

Fruit       Events  Trials
Citrus           5     110
Apples          25     100
Bananas         40     100
5. Check to see whether a problematic categorical variable is an aggregated variable. If the relationship of the unaggregated
variable to the response does not show complete separation, the substitution of the numeric data can eliminate the separation.
For example, suppose “Length of employment” is an aggregated variable in the model. When the data are grouped into 90-day
increments, the lowest level has all events and the highest level has no events, which creates complete separation. The
substitution of the exact number of days into the model eliminates the separation.
Data with complete separation

Categories of length  Events  Trials
1–90                       2       2
91–180                     1       2
181–270                    1       2
271–360                    0       2

Data with overlap

Exact length  Events  Trials
45                 1       1
60                 1       1
95                 1       1
176                0       1
185                0       1
241                1       1
280                0       1
299                0       1
For more information about separation, see A. Albert and J. A. Anderson (1984), “On the existence of maximum
likelihood estimates in logistic regression models,” Biometrika 71(1), 1–10.
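
The fruit example in strategy 4 can be sketched in a few lines of Python. This is only a screening heuristic for a single categorical predictor, not the method Minitab uses to detect separation; the function names flag_separated_levels and combine_levels are hypothetical. A level is flagged when all of its trials are events or none of them are.

```python
def flag_separated_levels(table):
    """Return the levels whose trials are all events or all non-events."""
    flagged = []
    for level, (events, trials) in table.items():
        if trials > 0 and (events == 0 or events == trials):
            flagged.append(level)
    return flagged


def combine_levels(table, levels, new_name):
    """Merge the listed levels into one level, summing events and trials."""
    merged = {k: v for k, v in table.items() if k not in levels}
    merged[new_name] = (
        sum(table[k][0] for k in levels),
        sum(table[k][1] for k in levels),
    )
    return merged


# Events and trials from the "Fruit" tables above.
fruit = {
    "Grapefruit": (0, 10),
    "Oranges": (5, 100),
    "Apples": (25, 100),
    "Bananas": (40, 100),
}

before = flag_separated_levels(fruit)
citrus = combine_levels(fruit, ["Grapefruit", "Oranges"], "Citrus")
after = flag_separated_levels(citrus)
print("Flagged before combining:", before)
print("Flagged after combining:", after)
```

A flagged level is only a candidate cause. Whether combining levels is sensible depends on the subject matter, as strategy 4 notes.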

Change from binary response format to event/trial format


Data in binary response format can give the most detail about the order of data collection. When each row contains the result
of an individual trial, recorded in order, the data retain the most detail.
However, data in binary response format cannot show an informative pattern on the residuals versus fits plot. If the data have
multiple trials at the combinations of the predictor variables in the model, you can change the format of the data to create a
meaningful residuals versus fits plot. The residuals versus fits plot can help you assess the model fit and the appropriateness of
the link function. If the data do not have multiple trials at the combinations of predictor variables, the residuals versus fits plot
is still uninformative.

If the data are in binary response format without a frequency column

1. If the response column does not contain 0 and 1, use Data > Code to create a new response column of zeroes and ones.
Use 1 for the event.
2. Choose Stat > Basic Statistics > Store Descriptive Statistics.
3. In Variables, enter the response column.
4. In By variables, enter the predictor variables in the model.
5. Click Statistics.
6. Check Sum and N nonmissing.
7. Click OK.
8. Click Options.
9. Uncheck Include empty cells.
10. Click OK in each dialog box.
The sum column contains the number of events and the N column contains the number of trials. The by variables include the
predictor variables.

If the data are in binary response format with a frequency column

1. Choose Stat > Basic Statistics > Store Descriptive Statistics.
2. In Variables, enter the frequency column.
3. In By variables, enter the predictor variables in the model and the response variable.
4. Click Statistics.
5. Check Sum.
6. Click OK.
7. Click Options.
8. Uncheck Include empty cells.
9. Click OK in each dialog box.
10. Press CTRL + E to reopen Store Descriptive Statistics.
11. In Variables, enter the column that contains the sum from the frequency columns, for example SUM1.
12. In By Variables, enter the stored by variable column that contains the response values, for example ByVar1.
13. Click OK.
The first sum column contains the number of events and the second sum column contains the number of trials. The by
variables include the predictor variables.
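
Outside Minitab, the same conversion amounts to counting events and trials within each combination of predictor values. The following is a minimal sketch under that reading, not a reproduction of Store Descriptive Statistics; the function to_event_trial and the sample rows are hypothetical. Each row carries a frequency, so rows without a frequency column simply use a frequency of 1.

```python
from collections import defaultdict


def to_event_trial(rows, event=1):
    """Collapse binary-response rows into (events, trials) per predictor combination.

    rows: iterable of (predictor_tuple, response, frequency).
    """
    counts = defaultdict(lambda: [0, 0])  # [events, trials]
    for predictors, response, freq in rows:
        counts[predictors][1] += freq
        if response == event:
            counts[predictors][0] += freq
    return {key: tuple(value) for key, value in counts.items()}


# Hypothetical rows: (Children, ViewAd), Bought, frequency.
rows = [
    (("Yes", "Yes"), 1, 1),
    (("Yes", "Yes"), 0, 1),
    (("Yes", "No"), 0, 1),
    (("Yes", "Yes"), 1, 1),
]

summary = to_event_trial(rows)
for combination, (events, trials) in summary.items():
    print(combination, events, trials)
```

Combinations that never occur in the data do not appear in the result, which corresponds to unchecking Include empty cells.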

Predict
Binary Logistic Regression
Summary

Use a model of the relationship between a binary response (Y) and variables (X) to:
· predict the probability of an event for the combinations of variable values that you request
· create a confidence interval for the event probability at these combinations

Data Description

A cereal company is investigating the effectiveness of a TV advertisement for a new product called Cocoa Crunch. After
showing the advertisement in a particular community for one week, they randomly sampled seventy-one adults exiting a local
supermarket.
They were asked:
·    whether or not they bought Cocoa Crunch (Bought)
·    their household income (in thousands of dollars) (Income)
·    whether or not they had children (Children)
·    whether or not they viewed the advertisement (ViewAd)
Data: CerealAd.MTW (available in the Sample Data folder).

Predict
Binary Logistic Regression
Equation and Variable Settings

Minitab displays the equation and the variable settings that it uses to calculate the predicted response value (fit).

Example Output

Prediction for Bought 

Regression Equation

P(1)  =  exp(Y')/(1 + exp(Y'))

Y' = -2.700 + 0.000000 Children_No + 1.641 Children_Yes + 0.000000 ViewAd_No
     + 1.106 ViewAd_Yes

Variable  Setting
Children      Yes
ViewAd        Yes

Interpretation

For the cereal data, the model uses two variables: Children and ViewAd. Minitab predicted the mean probability that a
customer buys the cereal (response) at one combination of variable settings: the customer has children and saw the
advertisement. Minitab uses the variable settings as inputs for the displayed equation.
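
As a check on the displayed equation, the short Python sketch below evaluates Y' and P(1) by hand at the settings shown. The coefficients are copied from the example output; the function names are hypothetical. Because the displayed coefficients are rounded, the result agrees with Minitab's fitted probability only to about three or four decimal places.

```python
import math

# Coefficients copied from the regression equation in the example output.
INTERCEPT = -2.700
CHILDREN = {"No": 0.0, "Yes": 1.641}
VIEW_AD = {"No": 0.0, "Yes": 1.106}


def linear_predictor(children, view_ad):
    """Y' for one combination of variable settings."""
    return INTERCEPT + CHILDREN[children] + VIEW_AD[view_ad]


def event_probability(children, view_ad):
    """P(1) = exp(Y') / (1 + exp(Y'))."""
    y = linear_predictor(children, view_ad)
    return math.exp(y) / (1 + math.exp(y))


p = event_probability("Yes", "Yes")
print(f"Y' = {linear_predictor('Yes', 'Yes'):.3f}, P(1) = {p:.4f}")
```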

Predict
Binary Logistic Regression
Predicted Values - Fitted Probability

The fit is the predicted probability of the event at these variable settings. The interval calculations use the standard error of the
fit to help you understand the precision.
Predict does not use the data in the worksheet. Instead, Minitab estimates the predictions based on a stored model. You must
fit a model before you can predict the response value for new observations. Predictions are accurate only if the model
represents the true relationships.

Example Output

Variable  Setting
Children      Yes
ViewAd        Yes

     Fitted
Probability     SE Fit         95% CI
   0.511797  0.0935084  (0.334856, 0.685831)

Interpretation

For the cereal data, Minitab predicted the mean probability that a customer buys the cereal when the customer has children
and saw the advertisement. Using the estimated regression equation, Minitab predicts that the fitted probability is 0.511797
with a standard error of 0.0935084.

Predict
Binary Logistic Regression
Predicted Values - Confidence Interval

For the predicted responses, the confidence interval (CI) is a range of values that is likely to contain the event probability for a
selected combination of variable settings.

Example Output

Variable  Setting
Children      Yes
ViewAd        Yes

     Fitted
Probability     SE Fit         95% CI
   0.511797  0.0935084  (0.334856, 0.685831)

Interpretation

For the cereal data, you can estimate with 95% confidence that the probability that a customer buys the cereal is between
0.334856 and 0.685831 for a customer who has children and saw the advertisement.
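
The help text does not state how the interval is computed. One common approach, assumed here, is to build a normal-theory interval on the logit scale and transform the endpoints back to probabilities, with the standard error of the linear predictor recovered from SE Fit by the delta method. Under that assumption the sketch below comes close to the displayed limits; the function name logit_scale_ci is hypothetical.

```python
import math
from statistics import NormalDist


def logit_scale_ci(p, se_p, level=0.95):
    """Confidence interval for a probability, built on the logit scale.

    Assumes se_p is a delta-method standard error of the fitted probability,
    so the standard error of the linear predictor is se_p / (p * (1 - p)).
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    eta = math.log(p / (1 - p))
    se_eta = se_p / (p * (1 - p))

    def inverse_logit(t):
        return 1 / (1 + math.exp(-t))

    return inverse_logit(eta - z * se_eta), inverse_logit(eta + z * se_eta)


# Fitted probability and SE Fit from the example output.
lower, upper = logit_scale_ci(0.511797, 0.0935084)
print(f"({lower:.4f}, {upper:.4f})")
```

Working on the logit scale keeps both limits between 0 and 1, which explains why the interval is not exactly symmetric about the fitted probability.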

Factorial Plots
Binary Logistic Regression
Summary

Use factorial plots in conjunction with binary logistic regression. Factorial plots include the Main Effects Plot and Interaction
Plot.
A main effect is the change in the mean response from the low level to the high level of a variable. Use this plot for these
purposes:
·    examine the level means for each variable
·    compare the level means for several variables
·    compare the relative strengths of the effects across variables
An interaction occurs when the effect of one variable depends on the value of another variable. Each plot displays the
interaction between two variables. Use interaction plots to compare the relative strength of the effects across variables.

Data Description

A cereal company is investigating the effectiveness of a TV advertisement for a new product called Cocoa Crunch. After
showing the advertisement in a particular community for one week, they randomly sampled seventy-one adults exiting a local
supermarket.
They were asked:
·    whether or not they bought Cocoa Crunch (Bought)
·    their household income (in thousands of dollars) (Income)
·    whether or not they had children (Children)
·    whether or not they viewed the advertisement (ViewAd)
Data: CerealAd.MTW (available in the Sample Data folder).

Factorial Plots
Binary Logistic Regression
Main Effects Plot - Graph of Probabilities

The main effects plot is most useful when you have several categorical variables. You can then compare the changes in the
level event probabilities to see which categorical variable influences the response the most. A main effect is present when the
event probability of the response changes at the different levels of the variable. For a variable with two levels, the event
probability is higher at one level of the variable than at another level. This difference is a main effect. Main effects are only
interpretable if the interaction effects are not significant.
Minitab creates the main effects plot by plotting the fitted event probabilities for each variable in the model. Minitab can plot
event probabilities from the data for variables that are not in the model. A line connects the points for each variable. Look at
the line to determine whether or not a main effect is present for a variable.
·    When the line is horizontal (parallel to the x-axis), then there is no main effect present. Each level of the variable affects the
response in the same way, and the response event probability is the same across all levels.
·    When the line is not horizontal (not parallel to the x-axis), then there is a main effect present. Different levels of the categorical
variable affect the response differently. The greater the difference in the vertical position of the plotted points (the further the
line is from parallel to the x-axis), the greater the magnitude of the main effect. To determine if a difference is statistically
significant, check the p-value of the term in the analysis of deviance table.
By comparing the slopes of the lines, you can compare the relative magnitude of the effects.
Factorial plots do not use the data in the worksheet for the fitted event probabilities. Instead, Minitab estimates the fitted
event probabilities based on a stored model. You must fit a model before you can generate a factorial plot. To produce an
interaction plot, you must include two or more variables in the plots. Factorial plots are accurate only if the model represents
the true relationships.

Example Output

Interpretation

For the cereal data, the plots indicate the following:


· Children: Customers with children are more likely to buy the cereal.
· ViewAd: Customers who saw the advertisement are more likely to buy the cereal.
The magnitude of the main effect for Children appears to be larger than the magnitude for ViewAd. The main effects are only
interpretable if the interaction effects are not significant.
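
The comparison of magnitudes can be illustrated from the regression equation shown in the Predict output for the same model. The sketch below computes the fitted probability at every combination of Children and ViewAd, then averages over the other variable to get one value per level. The unweighted average is an assumption made for illustration and may differ from the way Minitab computes the plotted points; the function names are hypothetical.

```python
import math
from itertools import product

# Coefficients copied from the regression equation in the Predict example output.
INTERCEPT = -2.700
EFFECTS = {
    "Children": {"No": 0.0, "Yes": 1.641},
    "ViewAd": {"No": 0.0, "Yes": 1.106},
}


def fitted_probability(settings):
    """Event probability for one combination of variable settings."""
    y = INTERCEPT + sum(EFFECTS[var][level] for var, level in settings.items())
    return math.exp(y) / (1 + math.exp(y))


def level_means(variable):
    """Average fitted probability at each level, taken over the other variables."""
    others = [var for var in EFFECTS if var != variable]
    means = {}
    for level in EFFECTS[variable]:
        probs = []
        for combo in product(*(EFFECTS[other] for other in others)):
            settings = dict(zip(others, combo))
            settings[variable] = level
            probs.append(fitted_probability(settings))
        means[level] = sum(probs) / len(probs)
    return means


children = level_means("Children")
view_ad = level_means("ViewAd")
children_effect = children["Yes"] - children["No"]
view_ad_effect = view_ad["Yes"] - view_ad["No"]
print(f"Children effect: {children_effect:.3f}, ViewAd effect: {view_ad_effect:.3f}")
```

Under these assumptions the difference between levels is larger for Children than for ViewAd, which agrees with the interpretation above.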

Factorial Plots
Binary Logistic Regression
Interaction Plot

Use interaction plots to assess two-way interactions. Evaluate the lines to understand how interactions affect the response.
·    If the lines are parallel, there is no interaction.
·    The more the lines depart from being parallel, the stronger the interaction.
As always, plots indicate patterns. To determine if a pattern is statistically significant, check the p-value of the interaction term
in the analysis of deviance table.

Note An interaction plot for two variables can be displayed in two ways (variable 1 by variable 2 or variable 2 by variable
1). How you plot your data can make some patterns easier to see. However, if the two variables interact, you can see
this in both plots.
If you have three or more variables, Minitab displays a matrix plot. The default plot has one panel for each pair of variables in
your data set. Or, you can display two panels for each pair of variables, one for each order of the variables. You can use either
or both plots to assess how the variables interact. Although an interaction should show up in both panels, it may be easier to
see the interaction in one panel than the other panel.
Factorial plots do not use the data in the worksheet for the fitted means. Instead, Minitab estimates the fitted means based on
a stored model. You must fit a model before you can generate a factorial plot. To produce an interaction plot, you must
include two or more variables in the plots. Factorial plots are accurate only if the model represents the true relationships.

Example Output

Interpretation

The cereal data set has two categorical variables in the model: Children and ViewAd. The interaction term is not in the model.
Because the model does not include the interaction, the interaction plot shows the probabilities in the data.
In the interaction plot, the effect of whether a customer has children looks greater if a customer saw the advertisement than if
the customer did not see the advertisement. To determine if a pattern is statistically significant, check the p-value of the
interaction term in the analysis of deviance table.


For the cereal data, the sample size was so small that no customers who did not have children and did view the advertisement bought the
cereal. The small sample size causes separation for the model with the interaction term. For more information, see What are
complete separation and quasi-complete separation.

Contour Plot
Binary Logistic Regression
Summary

Use Minitab's contour plot to help you visualize the effects of continuous variables. This plot shows how a response variable
relates to two continuous variables based on a model equation while any additional variables are held constant.
These plots are useful for establishing desirable response values and operating conditions.

Data Description

These data are the results of a survey given to an MBA class at the beginning of the semester. The data contain some
characteristics about the students' education, finances, and the credit cards they have.

Column  Name      Count  Missing  Description
C1-T    Gender       28        0  Gender; Female or Male
C2-T    HDegree      28        2  Highest degree earned; Bachelors or Masters
C3      GMAT         28        4  Score on the GMAT test
C4      Cash         28        1  Cash on their person, in dollars
C5      AIncome      28        1  Annual income, in dollars
C6      AmEx         28        0  American Express credit card; 1 = yes or 0 = no
C7      Discover     28        0  Discover credit card; 1 = yes or 0 = no
C8      MC           28        0  MasterCard credit card; 1 = yes or 0 = no
C9      Visa         28        0  Visa credit card; 1 = yes or 0 = no
C10     Other        28        0  A credit card other than those above; 1 = yes or 0 = no

Contour Plot
Binary Logistic Regression

Use a contour plot to help you visualize the response surface. Contour plots are useful for determining desirable response
values and operating conditions.
A contour plot shows how a response variable relates to two continuous variables based on a model equation. That is, the
contour plot represents, in two dimensions, the functional relationship between the response and the variables. Points that
have the same response are connected to produce contour lines of constant responses.
Because a contour plot shows only two variables at a time, any extra variables are held at a constant level. Thus, the contour
plots are for fixed levels of the extra variables. If you change the hold levels, the response surface changes as well, sometimes
drastically.
Contour plot does not use the data in the worksheet. Instead, Minitab estimates the contours based on a stored model. You
must fit a model with two or more continuous variables before you can generate a contour plot. Contour plots are accurate
only if the model represents the true relationships.
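
The idea of drawing contours from a stored model rather than from the worksheet can be sketched as follows: evaluate the model equation on a grid of the two plotted variables while a third variable stays at its hold level. The coefficients below are HYPOTHETICAL values in coded units, chosen only for illustration; they are not fitted to the survey data, and the function names are likewise hypothetical.

```python
import math

# HYPOTHETICAL coefficients in coded units, for illustration only.
COEF = {
    "const": -1.0,
    "x": 0.8,
    "y": 0.5,
    "xy": -0.6,  # interaction between the two plotted variables
    "hold": {"Bachelors": 0.0, "Masters": 0.7},
}


def probability(x, y, hold_level):
    """Event probability from the (hypothetical) model equation."""
    eta = (
        COEF["const"]
        + COEF["x"] * x
        + COEF["y"] * y
        + COEF["xy"] * x * y
        + COEF["hold"][hold_level]
    )
    return 1 / (1 + math.exp(-eta))


def probability_grid(xs, ys, hold_level):
    """Rows follow ys, columns follow xs; a plotting tool would contour this grid."""
    return [[probability(x, y, hold_level) for x in xs] for y in ys]


xs = [i * 0.5 for i in range(5)]
ys = [j * 0.5 for j in range(4)]
grid_bachelors = probability_grid(xs, ys, "Bachelors")
grid_masters = probability_grid(xs, ys, "Masters")
print(f"{grid_bachelors[0][0]:.3f} versus {grid_masters[0][0]:.3f} at x = 0, y = 0")
```

Contour lines connect grid points with equal probability. Evaluating the grid again with a different hold level gives a different set of contours, which is why the plots are compared at each level of the held variable.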

Example Output

Interpretation

For the survey data, the response is the probability a student carries an American Express card. The annual income and the
amount of cash a student carries are continuous variables. The highest degree that a student has is a categorical variable.
Therefore, it makes sense to set the highest degree a student has at its different levels and compare the plots. The
interpretation of the plots is as follows:
· AIncome versus Cash (HDegree = Masters): This plot shows how income and cash are related to the probability that a
student carries an American Express card when the student has a master's degree. The darkest green area indicates the
contour where the probability is the highest. The highest probabilities occur when students have high incomes and carry
little cash or when students have moderate to low incomes and carry moderate to high amounts of cash.
· AIncome versus Cash (HDegree = Bachelors): This plot shows how income and cash are related to the probability that a
student carries an American Express card when the student has a bachelor's degree. The darkest green area indicates the
contour where the probability is the highest. The pattern of the probabilities is similar to the probabilities for students with
master's degrees, but the area of low probabilities is larger.
The contours correspond to a minimax response surface.

Functional basis for the plots


Minitab creates the plot from the most recent model. Keep in mind that the plotted response values are only accurate if the
regression model is an adequate representation of the true relationship between the factors and the response.

Simple maximum: surface and contour plots


The following surface and contour plots represent a response surface with a simple maximum. As the color gets darker, the
response increases. Note the relationship between the shape of the surface and the shape of the contours. Both the surface
and contour plots are based on a regression model.

Simple maximum surface plot Simple maximum contour plot

Minimax: surface and contour plots


The following surface and contour plots represent a minimax response surface. As the color gets darker, the response
increases. Note the relationship between the shape of the surface and the shape of the contours. From the stationary point
(saddle point) near the center of the design, simultaneously increasing or decreasing both factors leads to a decrease in the
response. But from the stationary point (saddle point), increasing either factor while decreasing the other leads to an increase
in the response. Both the surface and contour plots are based on a regression model.

Minimax surface plot Minimax contour plot

Stationary ridge: surface and contour plots


The following surface and contour plots represent a stationary ridge surface. As the color gets darker, the response increases. A
stationary ridge is shaped like an arch. In these graphs, observe that there are many possible factor settings that maximize the
response. Note the relationship between the shape of the surface and the shape of the contours. Both the surface and contour
plots are based on a regression model.

Stationary ridge surface plot Stationary ridge contour plot

Rising ridge: surface and contour plots


The following surface and contour plots represent a rising ridge surface. As the color gets darker, the response increases. The
response increases as you simultaneously decrease Time and increase Temperature. Note the relationship between the shape
of the surface and the shape of the contours. Both the surface and contour plots are based on a regression model.

Rising ridge surface plot Rising ridge contour plot

Surface Plot
Binary Logistic Regression
Summary

Use Minitab's surface plot to help you visualize the effects of continuous variables. This plot shows how a response variable
relates to two continuous variables based on a model equation while any additional variables are held constant.
These plots are useful for establishing desirable response values and operating conditions.

Data Description

These data are the results of a survey given to an MBA class at the beginning of the semester. The data contain some
characteristics about the students' education, finances, and the credit cards they have.

Column  Name      Count  Missing  Description
C1-T    Gender       28        0  Gender; Female or Male
C2-T    HDegree      28        2  Highest degree earned; Bachelors or Masters
C3      GMAT         28        4  Score on the GMAT test
C4      Cash         28        1  Cash on their person, in dollars
C5      AIncome      28        1  Annual income, in dollars
C6      AmEx         28        0  American Express credit card; 1 = yes or 0 = no
C7      Discover     28        0  Discover credit card; 1 = yes or 0 = no
C8      MC           28        0  MasterCard credit card; 1 = yes or 0 = no
C9      Visa         28        0  Visa credit card; 1 = yes or 0 = no
C10     Other        28        0  A credit card other than those above; 1 = yes or 0 = no

Surface Plot
Binary Logistic Regression

Use a surface plot to help you visualize the response surface. Surface plots are useful for establishing desirable response values
and operating conditions.
The surface plot shows how a response variable relates to two continuous variables based on a model equation. The surface
plot, a three-dimensional wireframe graph, represents the functional relationship between the response and the continuous
variables. The response surface helps you to visualize how the response reacts to changes in the variables.
Because a surface plot shows only two continuous variables at a time, any extra variables are held at a constant level. Thus, the
surface plots are only valid for fixed levels of the extra variables. If you change the holding levels, the response surface changes
as well, sometimes drastically.
Surface plot does not use the data in the worksheet. Instead, Minitab estimates the response surface based on a stored model.
You must fit a model with two or more continuous variables before you can generate a surface plot. Surface plots are accurate
only if the model represents the true relationships.

Example Output

Interpretation

For the survey data, the annual income and the amount of cash a student carries are continuous variables. The highest degree
that a student has is a categorical variable. Therefore, it makes sense to set the highest degree a student has at its different
levels and compare the plots. The response is the probability a student carries an American Express card. The interpretation of
the plots is as follows:
· AIncome versus Cash (HDegree = Masters): This plot shows how income and cash are related to the probability that a
student carries an American Express card when the student has a master's degree. The highest probabilities occur when
students have high incomes and carry little cash or when students have moderate to low incomes and carry moderate to
high amounts of cash.
· AIncome versus Cash (HDegree = Bachelors): This plot shows how income and cash are related to the probability that a
student carries an American Express card when the student has a bachelor's degree. The pattern of the probabilities is
similar to the probabilities for students with master's degrees, but the area of low probabilities is larger.
The contours correspond to a minimax response surface.

Choosing hold levels


When choosing hold levels, keep in mind the following:
·    If hold factors do not interact with either of the factors in the plot, then the shape of the response surface will be the same
no matter what hold level you choose. Only the level of the response surface will change.
·    In addition, if hold factors do not interact with either of the factors in the plot and there is no squared effect in the hold
factor, the change in response level will be proportional to the change in the level of the hold factor.
·    If hold factors do interact with either of the factors in the plot, then changing their levels will affect not only the level but also the
shape of the response surface.
It is often wise to create plots of all pairs of factors that interact, trying to find the best level combination. If there are many
interactions or strong squared terms, this may require several iterations with different hold levels.

Overlaid Contour Plot


Binary Logistic Regression
Summary

Use overlaid contour plots to jointly evaluate multiple binary responses. Minitab draws contours for each response and
overlays them in a single graph. Overlaid contour plots can help you identify variable settings that optimize a single response
or set of responses.
Contour plots for regression show how response variables relate to two continuous variables while holding the rest of the
variables in the model at fixed settings.

Data Description

These data are the results of a survey given to an MBA class at the beginning of the semester. The data contain some
characteristics about the students' education, finances, and the credit cards they have.

Column  Name      Count  Missing  Description
C1-T    Gender       28        0  Gender; Female or Male
C2-T    HDegree      28        2  Highest degree earned; Bachelors or Masters
C3      GMAT         28        4  Score on the GMAT test
C4      Cash         28        1  Cash on their person, in dollars
C5      AIncome      28        1  Annual income, in dollars
C6      AmEx         28        0  American Express credit card; 1 = yes or 0 = no
C7      Discover     28        0  Discover credit card; 1 = yes or 0 = no
C8      MC           28        0  MasterCard credit card; 1 = yes or 0 = no
C9      Visa         28        0  Visa credit card; 1 = yes or 0 = no
C10     Other        28        0  A credit card other than those above; 1 = yes or 0 = no

Overlaid Contour Plot


Binary Logistic Regression
Graphs - Parameters
Each overlaid contour plot consists of a pair of variables (one for X-axis, one for Y-axis). If there are more than two continuous
variables, the additional variables are held at a fixed level.
On an overlaid contour plot, the contours for each response are overlaid in a single graph. Each set of contours defines the
boundaries of acceptable values of the event probability. The solid contour is the lower bound and the dotted contour is the
upper bound. The contours of each response are displayed in a different color.
Overlaid contour plot does not use the data in the worksheet. Instead, Minitab estimates the contours based on stored models.
You must fit a model with two or more continuous variables before you can generate an overlaid contour plot. If you want to
include multiple responses, you must fit a model for each response separately. Overlaid contour plots are accurate only if all
models represent the true relationships.

Example Output

Interpretation

For the survey data, annual income and cash are plotted on the X- and Y-axes, respectively. The third continuous variable,
GMAT score, is held at its mean. The two categorical variables are held at single levels. The plot shows probabilities for females
with bachelor's degrees. The contours for the survey data are:
· American Express card (red): the contours show probabilities between 0.9 and 1.
· MasterCard (blue): the contours show probabilities between 0.9 and 1.

Overlaid Contour Plot


Binary Logistic Regression
Graphs - Overlaid Contour Plot
Look at the overlaid contour plot and find the white area, which is the feasible region. The feasible region is the area formed
by the two continuous variables, given the hold values of any other variables, such that the event probabilities for each
response are between their respective contours.
Because only a pair of continuous variables can be displayed on each overlaid contour plot, and any additional variables are
held at a fixed level, a single contour plot may not provide a complete picture of the feasible region. You might consider
creating overlaid contour plots for all possible pairs of continuous variables, and changing the hold values of the additional
variables.

Example Output

Interpretation

For the survey data, the white area is the region formed by annual income and cash, given GMAT score is about 615, such that
the probabilities for the two responses are between their respective contours. Therefore, any variable settings for annual
income and cash that fall in this region should produce probabilities over 0.9 for both responses. For example, possible
combinations of variables for a female student with a bachelor's degree include:
·    Annual Income = 10,000, Cash = 300, GMAT = 615
·    Annual Income = 60,000, Cash = 250, GMAT = 615
The grid pattern shows where the model predictions are flat against one of the contours. For the survey data, the red grid
shows where the probability that a student carries a MasterCard is nearly 1.

contours of a response
The contours for each response consist of a lower and upper bound, which specify the interval of acceptable mean response
values. Two contour lines (lower and upper) for each response are drawn on the overlaid contour plot.
A contour line is a curve that connects plot points such that the fitted response values are equal. For example, given a lower
bound of 0.3, the contour for the lower bound would be a curve connecting the points on the plot with fitted response values
equal to 0.3.

feasible region
The area formed by the two variables displayed on the X- and Y-axes such that the response values fall between their lower
and upper bounds. Any settings for the two plotted variables in the feasible region, while keeping all other variables at their
hold levels, should produce a product with acceptable mean responses.
For designs with more than two continuous variables, Minitab displays the hold values at the bottom right of the plot.

lower bound
The lower bound is the lower end of the interval in which mean response values should fall.
For example, a lower bound of 0.3 indicates that acceptable mean response values are over 0.3 but below the upper bound.

upper bound
The upper bound is the upper end of the interval in which the mean response values should fall.
For example, an upper bound of 0.6 means that acceptable mean response values are below 0.6 but above the lower bound.
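
In terms of these definitions, a point belongs to the feasible region when the fitted probability of every response lies between that response's lower and upper bounds. A minimal sketch of that check follows. The bounds of 0.9 and 1 match the contours described for the survey data; the fitted probabilities and the function name in_feasible_region are hypothetical.

```python
def in_feasible_region(fitted, bounds):
    """True when every fitted probability lies within its response's bounds.

    fitted: {response: fitted probability at one set of variable settings}
    bounds: {response: (lower, upper)}
    """
    return all(
        bounds[name][0] <= probability <= bounds[name][1]
        for name, probability in fitted.items()
    )


# Bounds as described for the survey data contours.
bounds = {"AmEx": (0.9, 1.0), "MC": (0.9, 1.0)}

# HYPOTHETICAL fitted probabilities at two candidate settings.
inside = in_feasible_region({"AmEx": 0.95, "MC": 0.97}, bounds)
outside = in_feasible_region({"AmEx": 0.85, "MC": 0.97}, bounds)
print(inside, outside)
```

One response outside its bounds is enough to exclude a point, which is why the feasible region shrinks as more responses are overlaid.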

Response Optimizer

Binary Logistic Regression


Summary

Use Minitab's Response Optimizer to help identify the variable settings that optimize a single response or a set of responses.
For multiple responses, the requirements for all the responses in the set must be satisfied.
Response optimization is often useful in product development when you need to determine operating conditions that will
result in a product with desirable properties. For example, you may need to determine settings that optimize several properties
of a product, such as its elasticity and puncture resistance.

Data Description

A cereal company is investigating the effectiveness of a TV advertisement for a new product called Cocoa Crunch. After
showing the advertisement in a particular community for one week, they randomly sampled seventy-one adults exiting a local
supermarket.
They were asked:
·    whether or not they bought Cocoa Crunch (Bought)
·    their household income (in thousands of dollars) (Income)
·    whether or not they had children (Children)
·    whether or not they viewed the advertisement (ViewAd)
Data: CerealAd.MTW (available in the Sample Data folder).

Response Optimizer
Binary Logistic Regression
Optimization Parameters - Parameters

Minitab displays the design parameters for each response in the Session window. You should check these results and verify
that the displayed design parameters are correct.
Your choices of goal, lower, target, upper, and weight define the desirability function for each individual response. The
importance parameters determine how the desirability functions are combined into a single composite desirability.
Response optimizer does not use the data in the worksheet. Instead, Minitab estimates the optimal variable values based on
stored models. You must fit a model before you can use the response optimizer. If you want to optimize multiple responses,
you must fit a model for each response separately. The optimal values are accurate only if all models represent the true
relationships.

Example Output

Parameters

Response  Goal        Lower  Target  Upper  Weight  Importance
Bought    Maximum  0.309859       1              1           1

Interpretation

For the cereal data, the response variable is whether a customer bought the cereal. The design parameters are as follows:
·    The goal for Bought is to maximize it. A value of 1 is perfect, while values below 0.309859 are unacceptable in the
desirability calculation.

Response Optimizer
Binary Logistic Regression
Optimization Solution - Solution

The optimization procedure picks several starting points from which to begin searching for the optimal variable settings. There
are two types of solutions for the search:
· Local solution: For each starting point, there is a local solution. These solutions are the "best" combination of variable
settings found beginning from a particular starting point.
· Global solution: There is only one global solution, which is the best of all the local solutions. The global solution is the
"best" combination of variable settings for achieving the desired responses.
By default, Minitab only displays the global solution.
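
The relationship between local and global solutions can be illustrated with a minimal multi-start search. This is only a sketch under stated assumptions: the objective function, the starting points, and the simple hill-climbing step are hypothetical and are not Minitab's actual optimization algorithm.

```python
import math


def objective(x):
    # Hypothetical objective with two peaks, near x = 2 and x = 7.
    return math.exp(-(x - 2.0) ** 2) + 1.5 * math.exp(-(x - 7.0) ** 2)


def local_search(f, x, step=0.1, lo=0.0, hi=10.0):
    """Climb from x in fixed steps until no neighbouring point improves f."""
    while True:
        candidates = [c for c in (x - step, x + step) if lo <= c <= hi]
        best = max(candidates, key=f)
        if f(best) <= f(x):
            return x  # no neighbour is better: a local solution
        x = best


# One local solution per starting point (starting points are assumptions).
starts = [0.5, 3.0, 5.5, 9.0]
local_solutions = [local_search(objective, s) for s in starts]

# The global solution is the best of the local solutions.
global_solution = max(local_solutions, key=objective)
print(local_solutions, global_solution)
```

Starting points on the left end near the lower peak, and starting points on the right end near the higher peak. Only the best of these local solutions is reported as the global solution.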
Minitab calculates the individual desirability for each predicted response. The individual desirability values are then combined
into the composite desirability. These desirability values can help you understand how close the predicted responses are to
your target requirements. Desirability is measured on a 0 to 1 scale.
Individual desirability: The closer the predicted responses are to your target requirements, the closer the desirability will be
to 1. The individual desirability for each response is displayed on the Optimization Plot.
Composite desirability: The composite desirability combines the individual desirabilities into an overall value, and reflects the
relative importance of the responses. The closer the predicted responses are, taken together, to their requirements, the closer
the composite desirability will be to 1.
By default, Minitab places equal importance on the responses and assigns each an importance value of one. You can change
the importance to allow some responses to have more influence on the composite desirability than other responses.
·    If you want more emphasis on a response, increase its importance relative to the other responses.
·    If you want less emphasis on a response, decrease its importance relative to the other responses.
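
One common way to combine individual desirabilities is a weighted geometric mean, with the importance values as the weights. The sketch below assumes that form; the function name and the sample values are hypothetical and are used only for illustration.

```python
import math


def composite_desirability(desirabilities, importances):
    """Weighted geometric mean of individual desirabilities (assumed form)."""
    if any(d == 0 for d in desirabilities):
        return 0.0  # any unacceptable response makes the composite zero
    total = sum(importances)
    log_sum = sum(w * math.log(d) for d, w in zip(desirabilities, importances))
    return math.exp(log_sum / total)


# With a single response, the composite equals the individual desirability.
single = composite_desirability([0.356009], [1])

# Two hypothetical responses with equal importance, then with more
# importance placed on the response that has the lower desirability.
equal = composite_desirability([0.9, 0.4], [1, 1])
emphasised = composite_desirability([0.9, 0.4], [1, 2])
print(single, equal, emphasised)
```

Increasing the importance of the second response pulls the composite value toward that response's desirability.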

Example Output

Solution

                               Bought
                               Fitted     Composite
Solution  Children  ViewAd  Probability  Desirability
1         Yes       Yes        0.555556      0.356009

Interpretation

For the cereal data, customers who are most likely to buy the cereal have these characteristics:
·    The customer has children.
·    The customer saw the advertisement.
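
The composite desirability in the example output can be checked by hand. The sketch below assumes that, for a maximize goal with a weight of 1, desirability rises linearly from 0 at the lower bound to 1 at the target; the function name is hypothetical.

```python
def desirability_maximize(y, lower, target, weight=1.0):
    """Desirability for a maximize goal, assuming a ramp from lower to target."""
    if y <= lower:
        return 0.0
    if y >= target:
        return 1.0
    return ((y - lower) / (target - lower)) ** weight


# Values taken from the Parameters and Solution tables for Bought.
d = desirability_maximize(0.555556, lower=0.309859, target=1.0)
print(d)
```

Under these assumptions the result agrees with the reported composite desirability of 0.356009 to about six decimal places. Because Bought is the only response, the individual and composite desirabilities are the same.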

Response Optimizer
Binary Logistic Regression
Optimization Solution - Predicted Responses

Minitab calculates the predicted responses using the global solution variable settings. The predicted responses are the
responses that you can expect if the global solution variable settings are used.
For the predicted responses, the confidence interval (CI) is a range of values that is likely to contain the event probability for a
selected combination of variable settings.

Example Output

Multiple Response Prediction

Variable  Setting
Children  Yes
ViewAd    Yes

               Fitted
Response  Probability  SE Fit       95% CI
Bought         0.5556  0.0956  (0.3691, 0.7276)

Interpretation

For the cereal data, the global solution variable settings are Children = Yes and ViewAd = Yes.
According to the fitted model, the predicted probability that a customer with these characteristics buys the cereal is 0.5556.
The confidence interval indicates that the range of likely values for the probability is from 0.3691 to 0.7276. This interval is
wide, so the estimate is imprecise. If the company needs more precision, it should conduct additional experimentation and/or
use a larger sample size.
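
The reported interval is not symmetric around 0.5556, which is what you would expect if the interval is formed on the logit scale and then converted back to a probability. The sketch below assumes exactly that, and also assumes that SE Fit is a delta-method standard error on the probability scale; both are assumptions for illustration, not a statement of Minitab's documented method.

```python
import math


def logit_scale_ci(p, se_fit, z=1.959964):
    """Approximate 95% CI for a probability, built on the logit scale."""
    se_logit = se_fit / (p * (1.0 - p))    # delta method: SE of logit(p)
    centre = math.log(p / (1.0 - p))       # logit of the fitted probability
    lower = 1.0 / (1.0 + math.exp(-(centre - z * se_logit)))
    upper = 1.0 / (1.0 + math.exp(-(centre + z * se_logit)))
    return lower, upper


# Fitted probability and SE Fit from the example output.
low, high = logit_scale_ci(0.5556, 0.0956)
print(low, high)
```

Under these assumptions the limits come out within rounding of the reported (0.3691, 0.7276).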

Response Optimizer
Binary Logistic Regression
Graphs - Optimization Plot Layout

The optimization plot shows how the variables affect the predicted responses and allows you to modify the variable settings
interactively.
·    Each column of the graph corresponds to a variable.
·    The top row of the graph corresponds to the composite desirability, if shown. Each remaining row corresponds to a
response variable.
·    Each cell of the graph shows how the corresponding response variable or composite desirability changes as a function of
one of the variables, while all other variables remain fixed.
·    The numbers displayed at the top of a column show the current variable settings (in red) and the high and low variable
settings in the data.
·    The Predict link in the top left of the graph calculates the prediction for the current variable settings.
·    At the left of each response row, Minitab shows the goal for the response, the predicted response, y, at the current variable
settings, and the individual desirability score.
·    The composite desirability, D, is displayed in the top row and the upper left corner of the graph.
·    The label above the composite desirability refers to the current setting and changes if you move the variable settings
interactively. When the optimization plot is created, the label is Optimal. If you change the settings, the label changes to
New. If you find a new optimal setting, the label changes to Optimal. If you save the current setting, the label changes to a
number to indicate the position in the list of saved settings.
·    The vertical red lines on the graph represent the current settings.
·    The horizontal blue lines represent the current response values.
·    The gray regions indicate where the corresponding response has zero desirability.

Example Output

Interpretation

For the cereal data, the current settings are Children = Yes and ViewAd = Yes. The goal was to Maximize the probability that a
customer buys the cereal. The predicted value is 0.5118, and its individual desirability is 0.51180.

Response Optimizer
Binary Logistic Regression
Graphs - Optimization Plot Interpretation

Look at the plot to see the variable settings that optimize the responses.
Response optimizer does not use the data in the worksheet. Instead, Minitab estimates the optimal variable values based on
stored models. You must fit a model before you can use the response optimizer. If you want to optimize multiple responses,
you must fit a model for each response separately. The optimal values are accurate only if all models represent the true
relationships.

Example Output

Interpretation

For the cereal data, customers who have children and who view the advertisement are more likely to buy the cereal. Because
there is only one response, the individual desirability and the composite desirability are equal, 0.5118. The desirability is low
because the greatest probability that the model predicts is far from the target probability of 1.

Changing the factor settings interactively


You might want to change the factor levels on the optimization plot for many reasons, including:
·    to search for settings with a higher composite desirability
·    to search for lower-cost factor settings with near optimal properties
·    to explore the sensitivity of response variables to changes in the factor settings
·    to "calculate" the predicted responses for factor settings of interest
·    to explore factor settings in the neighborhood of a local solution
·    to explore factor settings that are optimal for other values of a covariate
You can change the factor settings by dragging the vertical red lines with your mouse or by clicking on the current level shown
in brackets and typing a new value. When you change a factor to a new level, the graphs are re-drawn and the predicted
responses and desirabilities are automatically re-calculated.
If you find a combination of settings with a composite desirability greater than the initial setting, Minitab automatically saves
the new optimal setting. You may also:
·    return to the initial settings by clicking the corresponding button on the Toolbar.
·    reset the graph to the optimal settings by clicking the corresponding button on the Toolbar.
·    save new factor settings by clicking the corresponding button on the Toolbar.

Examples of desirability functions


The desirability function translates each response scale to a zero-to-one desirability scale. The shape of the desirability function
depends on the design parameters.
The graphs below show how the combination of parameters for the tire data responses elasticity and puncture resistance
create the individual desirability function.
Elasticity

Goal      Lower  Target  Upper  Weight  Importance
Target       80     100    150       1           1

[Graph: desirability versus Elasticity for the goal Target]

Puncture Resistance

Goal      Lower  Target  Upper  Weight  Importance
Maximize    900    1000   1000       1           2

[Graph: desirability versus Puncture Resistance for the goal Maximize]

composite desirability
Often, if you have multiple responses, there is no factor setting that simultaneously maximizes the desirability of all of them.
For this reason, we maximize a composite desirability.
The composite desirability combines the individual desirability of all the response variables into a single measure. Greater
emphasis is placed on the response variables with the greatest importance.

desirability function
A desirability function translates each response scale to a zero-to-one desirability scale. The most desirable values of the
response have desirability one. The least desirable values have desirability zero.

goal

For each response, you select a goal. The goal is interpreted in terms of the target parameter for the response. If the goal is to:
·    minimize the response, then the desirability is one for all response values less than or equal to the target, and decreases
for response values greater than the target.
·    maximize the response, then the desirability increases as response values increase from the lower bound to the target, and
the desirability is one for all values at or above the target.
·    target the response, then the desirability is one at the target and decreases the more the response deviates from the target
in either direction.

importance
Determines the relative importance of multiple response variables.
Often, there is no factor setting that simultaneously maximizes the desirability of the individual responses. That is why we
maximize the composite desirability. The importance determines how much influence each response has on the composite
desirability.

lower
The smallest acceptable response value.
·    If the goal is maximize or target, the desirability is zero for any response value at or below this lower bound. The closer the
lower bound is to the target, the faster the desirability falls off as the response deviates from the target.
Suppose the target is 100. If the lower bound is 0 and the weight is 1, then the desirability of a response of 90 is 0.90. If the
lower bound is 50 and the weight is 1, then the desirability of 90 is 0.80.
·    If the goal is minimize, the lower bound is ignored.

target
The most desirable response value.
·    If a response is equal to the target, its desirability is one.
·    If the goal is minimize, all responses less than the target also have a desirability of one.
·    If the goal is maximize, all responses greater than the target also have a desirability of one.

upper
The highest acceptable response value.
·    If the goal is minimize or target, the desirability is zero for any response value at or above this upper bound. The closer the
upper bound is to the target, the faster the desirability falls off as the response deviates from the target.
Suppose the target is 100. If the upper bound is 200 and the weight is 1, then the desirability of a response of 110 is 0.90. If
the upper bound is 150 and the weight is 1, then the desirability of 110 is 0.80.
·    If the goal is maximize the response, the upper bound is ignored.
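
The definitions of goal, lower, target, and upper can be combined into one function. The sketch below assumes the desirability changes linearly between a bound and the target when the weight is 1, which matches the worked examples given for the lower and upper bounds. The function name and signature are hypothetical, and using the weight as an exponent is an assumption.

```python
def desirability(y, goal, target, lower=None, upper=None, weight=1.0):
    """Individual desirability of response y on a zero-to-one scale."""
    if goal == "maximize":                     # upper bound is ignored
        if y <= lower:
            return 0.0
        if y >= target:
            return 1.0
        return ((y - lower) / (target - lower)) ** weight
    if goal == "minimize":                     # lower bound is ignored
        if y >= upper:
            return 0.0
        if y <= target:
            return 1.0
        return ((upper - y) / (upper - target)) ** weight
    if goal == "target":
        if y <= lower or y >= upper:
            return 0.0
        if y <= target:
            return ((y - lower) / (target - lower)) ** weight
        return ((upper - y) / (upper - target)) ** weight
    raise ValueError("goal must be 'minimize', 'maximize', or 'target'")


# Worked examples from the lower and upper definitions (target = 100).
below_wide = desirability(90, "target", 100, lower=0, upper=200)
below_narrow = desirability(90, "target", 100, lower=50, upper=200)
above_wide = desirability(110, "target", 100, lower=0, upper=200)
above_narrow = desirability(110, "target", 100, lower=0, upper=150)
print(below_wide, below_narrow, above_wide, above_narrow)
```

Moving a bound closer to the target lowers the desirability of the same response from about 0.90 to about 0.80, as described above.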

weight
The weight determines how the desirability is distributed over the interval between the lower (or upper) bound and the target.
It determines the shape of the desirability function that is used to translate the response scale to the zero-to-one desirability
scale to determine the individual desirability of a response. You can think of a weight of one as a neutral setting. Increasing the
weight requires the response to move closer to the target to achieve a given desirability. Decreasing the weight has the
opposite effect. The illustrations below show how various weights affect the shape of the desirability function.
[Graphs: desirability versus response for Weight = 1, Weight = 0.5, and Weight = 2]
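
The effect of the weight can also be seen numerically. The sketch below assumes the weight acts as an exponent on the fraction of the distance covered from the bound (0) to the target (1); the function name is hypothetical.

```python
def weighted_desirability(fraction, weight):
    """Desirability at a given fraction of the way from the bound to the target."""
    return fraction ** weight


# Compare the three weights shown in the graphs at the same response positions.
for weight in (0.5, 1.0, 2.0):
    values = [round(weighted_desirability(f, weight), 3) for f in (0.25, 0.5, 0.75)]
    print(weight, values)
```

For a response halfway between the bound and the target, a weight of 0.5 gives a desirability above 0.5, a weight of 1 gives exactly 0.5, and a weight of 2 gives a desirability below 0.5.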
