

This is an Accepted Manuscript of an article published by Taylor & Francis in Justice System
Journal in 2016, available online:

http://www.tandfonline.com/10.1080/0098261X.2015.1084249

On the Measurement of Judicial Ideology

Christopher D. Johnston
Department of Political Science
Duke University
cdj19@duke.edu

Maxwell Mak
Department of Political Science
John Jay College of Criminal Justice
mmak@jjay.cuny.edu

&

Andrew H. Sidman
Department of Political Science
John Jay College of Criminal Justice
asidman@jjay.cuny.edu
Abstract

Researchers cannot assess the importance of ideology to judicial behavior without good

measures of ideology, and great effort has been spent developing measures that are valid and

precise. A few of these have become commonly used in studies of judicial behavior. An

emphasis has naturally been placed on developing continuous measures of ideology, like those

that exist for other institutions. There are, however, concerns with using continuous measures

because they rest on two assumptions that may be untenable when examining judicial

decision-making: that the level of precision assumed by these measures captures true

ideological distinctions between judges, and that the effects of ideology are uniform across its

levels. We examine these assumptions using different specifications of ideology, finding

that categorical measures are more valid and better depict the impact of ideology on judicial

decision-making at the U.S. Courts of Appeals, but not the Supreme Court.

Key Words

Appellate courts; Interaction with ideology; Judicial behavior; Judicial decision making; Political

methodology; Supreme Court


Theory testing requires that we measure important concepts as precisely as is appropriate.

In the area of judicial politics, the importance of ideology to decision-making cannot be

overstated (Segal and Spaeth 2002), and much work has been devoted to the measurement of

ideology with an emphasis on the development of continuous measures. For Supreme Court

justices, Segal-Cover scores (Segal and Cover 1989) are a frequently used measure.1 For Courts

of Appeals judges, the measurement of ideology is equally important. To test hypotheses at this

level, researchers have commonly used three measures of judge ideology: (1) the party of the

appointing president, (2) the scores derived from the Giles, Hettinger, and Peppers (2001) coding

strategy (GHP scores), and (3) the scores developed by David Nixon presented as part of the

Political Ideology Measurement Project2 (PIMP scores) and discussed in Howard and Nixon

(2003). The first measure has been utilized in many examinations of Courts of Appeals behavior

(e.g., Songer 1987; Songer and Sheehan 1990), but it has been criticized as, “a poor proxy for

judicial preferences. It fails to account for the reality that presidents of the same party vary in

their preferences and the clear evidence in the selection literature that senators of the president’s

party can constrain the president's choice” (Giles 2008, 53). The latter two measures, especially

GHP scores, have become frequently used measures in more recent statistical examinations of

circuit judge decision-making.3

Discussing GHP scores, Epstein, Martin, Segal, and Westerland (2005) note that the

measure does have, “face, convergent, and construct validity and outperforms other common

1. Martin and Quinn scores (Martin and Quinn 2002) are another commonly used measure of Supreme Court
ideology. We do not include them in this discussion, nor do we analyze them in the manner we analyze Segal-Cover
scores, because of our preference for measures of ideology that are not based on the voting behavior of judges and
justices we are attempting to explain and predict. We do, however, use Martin and Quinn scores as a control
variable in analyses of Court of Appeals judges.
2. The Political Ideology Measurement Project is housed at <http://www2.hawaii.edu/~dnixon/PIMP>.
3. Google Scholar indicates that Giles, Hettinger, and Peppers (2001) has been cited in 327 works at the time of our
writing this manuscript. Giles (2008) notes that 40% of the citations he examined used the scores as measures of
judicial preferences. Davis (2006) and Steigerwalt, Vining, and Stricko (2013) are examples of published works
employing Nixon’s measure.


measures, such as the party of the appointing President, or the ideology of the state from which

the judge is selected” (2006, 4). We agree that GHP, as well as PIMP and Segal-Cover scores,

have face validity. GHP and PIMP scores, which utilize common space scores (Poole 1998), are

based on a commonly used measure of ideology. Moreover, we do not empirically question the

content and convergent validity of these measures, questions that are beyond the scope of this

examination. We do, however, question the degree to which these scores have construct and

predictive validity.4

As scholarship on the U.S. Courts of Appeals grows and, therefore, inclusion of

specifications of judicial ideology becomes more common, we believe it is important to

reevaluate the role and nature of ideology in the decision-calculus of circuit court judges.

Specifically, we argue that continuous measures, and specifically the scores created by Giles,

Hettinger, and Peppers (2001), may serve as a more appropriate basis for categorizing the

ideology of a particular lower court jurist, but treating ideology as a continuous measure in

models of circuit judge voting behavior is inappropriate. We are not arguing that all measures of

judicial ideology are invalid. To the contrary, we find that Segal-Cover scores provide the most

appropriate measure of justice ideology. We do, however, argue that, in models of circuit judge

voting behavior, the continuous measures of ideology we examine here make an assumption that

is not met, namely that the effect of ideology is uniform across its range, for example, that a shift

from -0.9 to -0.8 produces the same change in behavior as a shift from 0.6 to 0.7. Also, if

continuous ideological measures are interacted with legal contexts (e.g., Bartels 2009) or other

4. The level of content validity is difficult to ascertain for any “measure” of ideology. Theoretically, ideology can be
expressed on a number of meaningful dimensions (e.g., social issues and economic/spending issues, Best 1999).
Even the common space scores on which GHP and PIMP are based have two dimensions. Given that most studies
employing these or similar measures use only the first dimension scores, we do not question whether these measures
are capturing all of the relevant aspects of ideology. We do assume that these measures reflect a high degree of
convergent validity. Looking within levels of the judiciary, there are high, significant correlations between each
measure (r = 0.77 for GHP and PIMP; r = -.6 for Segal-Cover and PIMP) supporting the argument that each pair of
scores is measuring the same concept.


constraints, the assumption requires that those constraints have the same impact on ideology

across levels of the continuous measure. In other words, a contextual factor will influence a

liberal judge to the same degree as a conservative one. If this assumption is not

met, instead of providing additional information, as is typically the case when one moves from

the ordinal to the interval level of measurement, these measures are providing additional noise,

masking what could be non-monotonic effects of ideology.

To support this argument, we first analyze the measurement properties of two continuous

measures of ideology for both the Courts of Appeals and Supreme Court using alternating least

squares optimal scaling. Second, we estimate models of vote choice using each of the

continuous measures and a categorical measure of ideology, comparing various model

performance statistics across specifications. Third, we estimate models interacting ideology with

the broad issue area addressed in the case to ascertain the extent to which different measurement

strategies lead to different conclusions regarding the conditioning effects of ideology on those

factors and vice versa. Briefly, we find that (1) all of the continuous measures of ideology we

examine evince some degree of clustering around particular values, violating the assumption

made by interval-level measures, (2) using a nominal measure of ideology produces better

performing models of Courts of Appeals judge behavior, but not Supreme Court justice behavior,

and (3) level of measurement significantly affects the conclusions drawn regarding the effects of

ideology in the Courts of Appeals models when ideology is interacted with issue areas.

Measures of Judicial Ideology

Before delving into the properties of interval-level measures and the assumptions

researchers make when operationalizing concepts at this level, it is worth taking some time to

present the measures that are examined in this study. One measure for which we do not present


analyses is the party of the appointing president.5 In addition to the criticism of the measure

noted in the introduction, the party of the appointing president is, put simply, not a measure of

ideology. If one is explicitly testing hypotheses regarding the differences between Democratic

and Republican appointees, this variable is completely appropriate. As a measure of ideology, as

noted earlier, it fails to address nuances in ideological differences by lumping all judges

appointed by presidents of the same party together (Giles 2008). Presidential partisanship does

not capture the qualitative or quantitative differences among conservatives, liberals, and moderates;

this is especially concerning when we test whether and to what degree case-level factors

accentuate or attenuate the role of ideology. While parsimonious in specification, the party

of the appointing president is inadequate for testing conditionality and variation in the role of

judicial ideology. Later in this work, we present a three-category measure of ideology based on

GHP scores. One might argue that the three-category measure we employ is little better than the

dichotomous party of the appointing president. We disagree. The measure we present here is

based on a measure of ideology. It, therefore, does not count all appointees from a given party

the same. While our measure adds only one more category, it allows us to distinguish

meaningful behavioral differences between so-identified liberal, moderate, and conservative

judges, labels that comport with our basic understanding of ideology itself.

Of the continuous measures examined here, the scores developed by Segal and Cover (1989)

came first. These scores measure the preferences of Supreme Court nominees,


5. While we do not present analyses using the party of the appointing president as a proxy for ideology, we did
conduct those analyses. As a dichotomous measure, it is unsuitable for the optimal scaling analysis. When included
in the Courts of Appeals and Supreme Court models, we reach the following conclusions. First, the models
employing presidential party generally perform worse than models including the other specifications of ideology
according to the model performance statistics we discuss later. Second, in the models with and without interactions
between ideology and issue area, we can generally conclude that Democratic appointees are more liberal in their
voting behavior and Republican appointees are more conservative, but we observe none of the complexities of
ideology. Thus in civil rights and liberties cases and economic cases, we would draw the same flawed conclusions
about the role of ideology as we do when using continuous measures.


and by extension justices, based on the coding of editorials published in four nationally-focused

newspapers, two of which tend to provide liberal commentary and two of which tend to provide

conservative commentary.6 The scores range from zero to one and increase with perceived

liberalism. Given their relatively small numbers and relatively high salience, particularly at

confirmation, there is more readily available information about Supreme Court justices that one

could use to construct a measure of ideology. Thus, an important distinguishing characteristic of

Segal-Cover scores is that they are based directly on information about the perceived policy

and/or legal preferences of the nominee.

There is typically far less information about lower court nominees and, therefore, far less

information on which one could base a measure of ideology. Giles, Hettinger, and Peppers

(2001) work around this problem by leveraging aspects of the appointment process for lower

court judges and making use of common space scores (Poole 1998), which are widely recognized

as valid measures of ideology for presidents and members of Congress. Giles, Hettinger and

Peppers (2001) place assumptions—arguably heavy ones—on the nature of selection of Article

III jurists to lower federal courts. The GHP score for a lower court judge takes on the value of

the nominating president’s common space score if senatorial courtesy is inactive. If senatorial

courtesy is operative, a given judge’s ideology takes on the value of the home-state senator of the

president’s party; if both home-state senators share the same party affiliation as the president, the

judge’s ideology is measured as the average of the senators’ common space scores. The

underlying assumption is that either the president is able to select a nominee exactly at his ideal

point, a single senator can exercise the same influence, or that two home-state senators will

select a nominee equidistant between their ideal points. Giles (2008) discusses some of the


6. Segal and Cover (1989) use the New York Times and Washington Post as their liberal sources and the Chicago
Tribune and The Los Angeles Times as their conservative sources.


criticisms of the measure, particularly the assignment of a home state senator’s common space

score to a judge when senatorial courtesy is operative. Nixon (2004) critiques the measure from

the other direction, noting that senatorial courtesy is inoperative in most appointments. Giles,

Hettinger, and Peppers (2001), therefore, assume that in this context presidents are, “completely

unrestrained in their appointment choices” (Nixon 2004, 1). Both critiques raise concerns for us

regarding the construct validity of the measure. It is safe to assume that the relevant players will

look for nominees proximate to their own positions. That judges selected by presidents (or

senators) with first dimension common space scores of 0.2 and 0.25, for example, evince the

same difference in ideological position is far less tenable. The implications of this are addressed

in the next section.
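The assignment rule described above can be sketched as a short function. This is our own illustration of the Giles, Hettinger, and Peppers (2001) coding strategy, not the authors' code; the names and inputs are ours:

```python
def ghp_score(pres_score, senator_scores, pres_party, senator_parties):
    """Illustrative sketch of the GHP assignment rule for a circuit-court
    nominee. All scores are first-dimension common space scores."""
    # home-state senators of the president's party activate senatorial courtesy
    same_party = [s for s, p in zip(senator_scores, senator_parties)
                  if p == pres_party]
    if not same_party:
        # courtesy inactive: the judge takes the president's score
        return pres_score
    # courtesy operative: one senator's score, or the average of both
    return sum(same_party) / len(same_party)
```

With both home-state senators in the president's party, the judge is scored at the midpoint of the two senators, which embodies the equidistance assumption discussed in the text.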

The scores developed by David Nixon (Howard and Nixon 2003; Nixon 2004) are also

based on common space scores. Rather than rely on assumptions about how presidential and

senatorial preferences impact appointment, Nixon’s method makes use of “bridging” judges:

appointees to the federal bench that have served in Congress and, therefore, have a common

space score based on their roll-call voting behavior. Nixon identifies 63 such individuals, using

them as observations in a regression of common space scores on several independent variables.

The resulting parameter estimates are then used to impute the common space scores of

appointees that have not served in Congress. A benefit of this method is that it can be used to

derive scores for appointees at all levels of the federal judiciary, including the Supreme Court,

thus allowing for direct comparisons between judges at different levels, as well as between

judges and legislators or presidents, and across time. These last two characteristics are shared

with GHP scores. Compared to GHP scores, PIMP scores appear to have an “informational”

advantage in that judicial ideology is a function of information about the judges themselves.


This information is, however, limited to the judge’s party identification and whether the judge is

a Southern Democrat or Northeastern Republican. There is no direct information on the policy

preferences of the judges.

We choose to focus our attention on these three measurement strategies for three reasons.

First, all three strategies produce widely used judicial ideology scores. Second, all three

strategies produce scores that are assumed to be and treated as interval-level measures. Third, as

none of these measures are based on judges’ voting behavior on the bench, they are all

potentially ideal for use as independent variables in explanatory models using votes as the

dependent variable.

Measurement as Theory Testing

As Jacoby notes, “all measurement is theory testing. Therefore, measurement always

constitutes a tentative statement about the nature of reality” (1999, 271). When we claim that a

given variable is operationalized at some level of measurement, whether it be nominal, ordinal,

interval, or ratio, we are not making a statement about some immutable reality, but rather are

choosing a theoretical model for how units are mapped from categories to numerical values. In

other words, the level of measurement for a variable is a choice made by the researcher, either

explicitly on the basis of theory and/or empirical evidence, or implicitly through the

unconsidered assigning of values to categories. It is important to recognize that this theoretical

aspect to levels of measurement applies to all variables, even those which we may feel are

intrinsically of a certain level. For example, religious identification may seem inherently

nominal, but within the context of a given model we may wish to operationalize this variable as

ordinal vis-à-vis degree of orthodoxy (Jacoby 1999).


In the present context, models of judicial decision-making assume that ideology is an

interval-level variable. While a general understanding of what this means is widespread, it is

worthwhile to consider it in a bit more detail. It is useful to consider levels of measurement as

functions (representing underlying models) which assign values to categories of a given variable,

and thus values to the units in the population of interest (Jacoby 1999). The implicit model

underlying the interval-level measurement of ideology in the present context is as follows:

(1) M(s_i) = β_0 + β_1 s_i ,

where M(s_i) is the function mapping the set of observations S into the set of values of the

independent variable denoted M. As seen, the current operationalization assumes a linear

relationship. In other words, it is assumed, a priori, that the difference between values assigned

to units (M_i − M_j) is proportional (β_1) to the difference between them in the underlying

characteristic (s_i − s_j), in this case, ideological liberalism. Practically speaking, this implies that a

unit change at any point on the measured ideology scale corresponds with a substantively

identical increase in the underlying trait.
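The proportionality built into equation (1) can be made concrete with the numbers used earlier: a 0.1 shift anywhere on the scale is forced to represent an identical ideological difference (β_1 here is an arbitrary illustrative value):

```python
def mapped(s, b0=0.0, b1=2.0):
    # linear measurement model from equation (1): M(s_i) = b0 + b1 * s_i
    return b0 + b1 * s

# under the interval-level model, the shift from -0.9 to -0.8 and the shift
# from 0.6 to 0.7 must map to the same difference in measured ideology
low_shift = mapped(-0.8) - mapped(-0.9)
high_shift = mapped(0.7) - mapped(0.6)
```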

While it is possible that this assumption is approximately true, for the reasons discussed

above we believe that some degree of skepticism is warranted, and thus empirical testing of the

underlying measurement model would be fruitful. Before we discuss our empirical approach,

however, we note again that this is more than a simple empirical exercise. There are real

consequences for ignoring measurement issues with respect to the conclusions we draw from our

analyses. We consider two important potential consequences for the present field here.

First, the appropriate specification of ideology can have serious implications for the

interpretation of the role of ideology. If ideological distinctions are in fact coarser, using an

interval-level measure of ideology would overstate the influence of ideology,


suggesting significant differences across levels of ideology where no such distinctions actually

exist. In other words, we would treat judges at different levels as substantively different when

they are not, misconstruing the nature of ideology and its relationship to voting. It is also

possible that the interval level mischaracterizes the size of differences between clusters of judges

at different points on the scale. For example, judges near the midpoint are assumed under the

linear model to be equidistant in terms of the underlying trait from both extremes. There may,

however, be nonlinear relationships between ideological groups and voting such that

conservatives and moderates, or moderates and liberals are more similar in their behavior under

particular conditions.

Second, if a lower level of measurement is warranted, this implies the possibility of

substantively interesting, qualitative distinctions between categories of ideologues. In other

words, ideological distinctions may be more complex conceptually than implied by a

unidimensional continuum. If this is the case, we may expect that moderating factors (e.g., case-

level factors; see Bartels 2009) differentially affect different groups of judges. Just as a lower

level of measurement may imply non-linear main effects on the dependent variable, it may also

imply non-linear conditional effects. To the extent that such qualitative differences do indeed

exist between ideological groups, this constitutes a fruitful avenue for future research in the field.

To date, work at all levels of judicial decision-making has more often than not treated ideology

as a unidimensional, interval-level continuum. This assumption has proved too simple in other

subfields of political science (e.g., Feldman and Johnston 2014), and it is one that deserves

examination in the present field.


Data and Variables

In order to test these propositions, we employ the Original U.S. Courts of Appeals

Database7, compiled by Donald Songer, the update to this database, compiled by Ashlyn

Kuersten and Susan Haire, and the U.S. Supreme Court Database, originally compiled by Harold

Spaeth.8 To keep the time periods consistent, we analyze judicial behavior from 1947, the

earliest year available for Supreme Court data, through 2002, the latest year available for Courts

of Appeals data. These data are an ideal setting for examining the role and nature of ideology at the circuit

courts; not only do they constitute a large collection of cases spanning several decades, but they are also the data

employed in many previous examinations of circuit court decision-making (e.g., Calvin, Collins

and Eshbaugh-Soha 2011; Hettinger, Lindquist and Martinek 2006; Kaheny, Haire and Benesh

2008) and the data employed in the Giles, Hettinger, and Peppers (2001) examination. For all of

the analyses, the dependent variable is whether a judge or justice voted in a liberal direction on

the merits of a given case. We only include votes coded as liberal (1) or conservative (0),

excluding unclassified or mixed votes. As the dependent variable is dichotomous, we estimate

all models, except for the alternating least squares analyses, as logit models.9

Ideology

As described previously, we include two continuous measures of ideology for the Courts

of Appeals judges: GHP scores and PIMP scores. We also include two continuous measures of

ideology for Supreme Court justices: Segal-Cover scores and PIMP scores. At both levels of the

judiciary, performance of the continuous measures is compared to a nominal, categorical


7. Both the Original U.S. Appeals Courts Database (1925-1996) and the Update to the Appeals Court Database
(1997-2002) were obtained from the website of the Judicial Research Initiative at the University of South Carolina
<http://artsandsciences.sc.edu/poli/juri/appct.htm>.
8. Our analyses use the 2013 Release 1 justice-centered version of the Supreme Court Database
<http://supremecourtdatabase.org/index.php>.
9. Data and replication files for all of the analyses we present can be found at www.andrewsidman.com.


measure of ideology. For the Courts of Appeals, we employ a three-category measure of

ideology using a tertile split of GHP scores. The analyses include two dummy variables,

Moderate, which is scored 1 for the middle-third of judges, and Conservative, which is scored 1

for the upper-third of judges. Liberal judges comprise the excluded category. For the Supreme

Court, we follow the same coding strategy using Segal-Cover scores: Moderate is scored 1 for

the middle-third of justices, Conservative is scored 1 for the lower-third10, and liberal justices are

the excluded category.
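The tertile coding can be sketched as follows; the quantile cutpoints and tie handling are our assumptions, not necessarily the exact procedure used in the analyses:

```python
import numpy as np

def tertile_dummies(scores):
    """Split continuous ideology scores into thirds and return the Moderate
    and Conservative dummies (the liberal third is the excluded category).
    Assumes higher scores are more conservative, as with GHP scores."""
    lo, hi = np.quantile(scores, [1 / 3, 2 / 3])
    moderate = ((scores > lo) & (scores <= hi)).astype(int)
    conservative = (scores > hi).astype(int)
    return moderate, conservative
```

For Segal-Cover scores, which increase with perceived liberalism, the Conservative dummy would instead flag the lower third.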

Control Variables

The analyses that follow include several control variables thought to explain judicial

voting behavior. The following variables are included in all models. First, we include dummy

variables for three different issue areas: civil rights and liberties, economic, and criminal; other

issue areas comprise the excluded category.11 Second, we use a dummy variable to control for

the direction of the decision of the lower court being reviewed (liberal decisions are coded as 1,

conservative as 0). Given the tendency of appellate courts to affirm district court decisions, we

expect this variable to have a positive effect on the likelihood of a liberal vote by circuit judges.

We expect the opposite effect on the behavior of Supreme Court justices, as the Court tends

to follow a reversal strategy when reviewing decisions. Third, all models include a dummy

variable coded 1 if the respondent is the federal government and fourth, the interaction between

the lower court decision and whether the respondent is the federal government. Given the


10. Segal-Cover scores increase with perceived liberalism; both GHP and PIMP scores, which are based on
common space scores, increase with perceived conservatism.
11. We generate the issue area dummies using GENISS in the Courts of Appeals data (1=criminal, 2 through 5=civil
rights and liberties, and 7=economic) and issueArea in the Supreme Court database (1=criminal, 2 through 5=civil
rights and liberties, and 8=economic).


observed advantages the federal government has in court12, we expect the interaction to lower the

magnitude of the effects of the lower court decision.

Two variables are included only in the Courts of Appeals models. We control for panel

effects at the U.S. Courts of Appeals by including the number of other judges on the panel

appointed by a Democratic president. We expect that an increase in the

number of Democratic appointees on the panel will increase the likelihood of a liberal vote.

Additionally, we include the one-year lagged median ideology score of the Court as measured by

Martin and Quinn (2002) scores to account for circuit court responsiveness to the Supreme

Court.13 Given that these scores increase with perceived liberalism, we expect a positive

relationship with the likelihood of a liberal vote. The lagged Supreme Court median is,

naturally, only included in the Courts of Appeals models. For a final set of control variables, the

models all include fixed effects. Courts of Appeals models include fixed effects for year and for

circuit. Supreme Court models include fixed effects for terms of the Court.14 Table 1 presents

summary statistics for all of these variables.

[Table 1 about here]

Analysis of the Measurement Properties of Ideology


12. At the Supreme Court, the Solicitor General’s Office and the federal government as a whole have more experience
appearing before the Supreme Court and achieve greater degrees of success (McGuire 1998; Segal 1990; Sheehan,
Mishler and Songer 1992). Songer and Sheehan (1992) note the same federal government advantage in the United
States Court of Appeals. Much like Sheehan, et al. (1992), Songer and Haire (1992) in looking at obscenity cases at
the Court of Appeals, find varying degrees of success for litigants, but also a distinct advantage for the federal
government. They find that the predicted probability of a vote supporting a defendant when opposed by the federal
government decreases to 6.7%, which is daunting when compared to the 25.3% likelihood when a defendant faces a
local government opponent (977-978). The explanation for federal government success is deference, on the part of
the justices, to a coordinate branch of government. With neither the power of the purse nor the sword, the Court is
reliant on the other branches and overall should be more willing to support a majoritarian, popularly-elected branch.
Thus, this effect should trickle down to the lower federal courts.
13. While we generally prefer non-vote based measures of ideology, Martin and Quinn scores are not being used here
to predict the votes of justices and have the advantage of being dynamic. Given their construction, the use of Martin
and Quinn scores in this context allow the circuit court models to account for judges changing their behavior in
response to the shifting voting behavior of the median justice (or changing median justices).
14. Judge and justice fixed effects cannot be included because the measures of ideology employed here are constant
within each judge.


In a typical regression analysis, the values of the variables are taken to be fixed with the

respective regression coefficients as parameters to be estimated from the data. A generalization

of this framework allows the values of the variables, and in turn, the level of measurement on

which the variables are operationalized, to be estimated as additional model parameters. In other

words, the level of measurement becomes another aspect of the model to be determined from the

empirical observations. Observations which are initially assigned to the same category remain in

the same category, but the values assigned to those categories are estimated.15 More specifically,

values are assigned which, along with the estimated regression coefficients, jointly maximize the

value of R². Thus, the “best-fitting” model is found where fit varies along both structural and

measurement dimensions as opposed to only the former in traditional regression analyses

(Jacoby 1999).

Within the linear regression context, alternating least squares (ALS) can be utilized to

obtain these two sets of parameter estimates. The ALS procedure begins with the original

category values, and estimates the vector of regression coefficients via ordinary least squares

(OLS).16 The regression coefficients are then treated as fixed, and values of the variables are

found which minimize the sum of squared errors. These new values are then treated as fixed,

and OLS is utilized to derive updated regression coefficients. This procedure is repeated until


15. As a basic example, consider party identification measured using a seven-point scale ranging from 0 (strong
Democrat) to 6 (strong Republican). In explaining preferences for defense spending, it may be that the preferences
of strong Democrats and weak Democrats do not differ. An optimal scaling procedure would suggest that party
identification is best measured using six categories, not seven, giving strong and weak Democrats the same category
value.
16. Judge votes are obviously dichotomous. Our utilization of ALS is thus equivalent to the linear probability model for binary dependent variables. As we are presently interested in the measurement characteristics of the ideology variable and not significance tests on the structural parameters of the model, and given that the linear probability model provides unbiased estimates of such parameters (Long 1997), this is unproblematic.


improvement in model fit halts (hence the name alternating least squares).17 We estimate the OLS

regression using judge or justice votes as the dependent variable and ideology18, issue areas, the

direction of the lower court decision, whether the federal government is the respondent, and, for

the Courts of Appeals only, panel effects and the lagged Supreme Court median as independent

variables. Ideology, measured using GHP scores, PIMP scores, and Segal-Cover scores, is the

only variable to which we apply the optimal scaling procedure.19
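The alternating steps just described can be illustrated with a short sketch. The paper's own analyses use the lm and optiscale packages in R; the Python function below is our illustrative reimplementation for a single optimally scaled predictor, applied to hypothetical data, and should not be read as the authors' code:

```python
import numpy as np

def optimal_scale_nominal(y, x_cat, X_other, tol=1e-3, max_iter=50):
    """Alternating least squares with nominal-level optimal scaling of one
    predictor: fit OLS with the current category values, then, holding the
    coefficients fixed, re-estimate the category values; repeat until the
    improvement in R-squared falls below tol.

    y       : outcome vector
    x_cat   : integer category codes for the variable being scaled
    X_other : 2-D matrix of the other predictors (values held fixed)
    """
    cats = np.unique(x_cat)
    vals = {c: float(c) for c in cats}   # start from the original category values
    r2_old = -np.inf
    for _ in range(max_iter):
        x = np.array([vals[c] for c in x_cat])
        X = np.column_stack([np.ones(len(y)), x, X_other])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        r2 = 1 - ((y - X @ beta) ** 2).sum() / ((y - y.mean()) ** 2).sum()
        if r2 - r2_old < tol:            # fit has stopped improving
            break
        r2_old = r2
        # Optimal scaling step: with beta fixed, the SSE-minimizing value for
        # each category is the category mean of y net of all other model terms,
        # rescaled by the slope (assumes the slope is not numerically zero).
        partial = y - X @ beta + beta[1] * x
        for c in cats:
            vals[c] = partial[x_cat == c].mean() / beta[1]
    return vals, beta, r2
```

Observations sharing a category keep a common value throughout, as the procedure requires; only the value assigned to each category changes. Because judge votes are dichotomous, running such a routine on vote data amounts to the linear probability model discussed in footnote 16.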

Prior to the ALS procedure, the researcher specifies constraints upon the value

transformations which are allowed for each variable in the model. In other words, the researcher

may opt to restrict value transformations in line with theory regarding level of measurement, or

allow values to be estimated independent of any such restrictions (other than category integrity).

For example, the researcher may choose to place an ordinal restriction on a given variable during

the ALS estimation. This would imply that any value transformations occurring during the

estimation must maintain the original category ordering from the untransformed variable. If the

measurement level is assumed to be nominal, which is a far weaker assumption, the values of the

transformed variable need not preserve the order of the original values.
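One standard way to impose the ordinal restriction, pooling any adjacent category values that would violate the original ordering, is the pool-adjacent-violators algorithm. The sketch below is our own illustration; the internal routine used by the optiscale package may differ:

```python
def pool_adjacent_violators(values, weights):
    """Return the non-decreasing sequence closest (in weighted least squares)
    to `values`: adjacent categories whose estimated values violate the
    original ordering are merged and assigned their weighted mean."""
    blocks = []   # each block: [pooled value, total weight, number of categories]
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # merge backwards while the ordering is violated
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2, n2 = blocks.pop()
            v1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(v1 * w1 + v2 * w2) / wt, wt, n1 + n2])
    out = []
    for v, w, n in blocks:
        out.extend([v] * n)   # pooled categories share one value
    return out
```

For example, pool_adjacent_violators([0.1, 0.5, 0.3, 0.9], [10, 10, 10, 10]) merges the out-of-order middle pair into a common value of 0.4, so the transformed values respect the original category ordering.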

Beyond the theoretical justifications for imposing (or not) an ordinal level of

measurement, there are practical benefits and drawbacks associated with assuming a nominal or

an ordinal level of measurement. Restricting transformations to an ordinal level can uncover

whether the effects of the original variable are proportional to the change in the underlying units,

the key assumption of interval-level measures. Allowing the transformed variable to be



17. An example of an optimal scaling routine from Jacoby (N.d.) suggests alternating the OLS and optimal scaling until improvement in R2 is less than 0.001. In our analyses, we only had to scale ideology once per model. R2 improved between the first and second OLS estimations, but not thereafter.
18. We round the ideology variables to two decimal places to reduce the number of “categories” of the original variable. For each measure, we retain the following number of unique values: GHP, 95 out of 191 values; PIMP (Courts of Appeals), 57 out of 182 values; Segal-Cover, 21 out of 23 values; PIMP (Supreme Court), 17 out of 24 values.
19. We use the lm and optiscale packages in R to conduct the ALS analyses.


measured as nominal enables the researcher to observe whether the effects of the original

variable are proportional to the change in the underlying units and whether the effects are

actually monotonic. Unrestricting the ALS procedure in this way, however, can be impractical

with a large number of original values. For GHP scores, for example, the resulting graph of

original against optimally scaled values includes ninety-five points, one for each unique, original

value of ideology, arrayed such that it is difficult to discern patterns in the results. We present the

unrestricted (nominal) transformations first, and then results for the Courts of Appeals including

the ordinal restriction.

If ideology as measured by GHP, PIMP, or Segal-Cover scores meets all of the

assumptions applied to interval-level variables, the original values and optimally scaled values

should be the same. That is, the values of ideology that maximize R2, and best explain variation

in judicial voting behavior, would be those of the untransformed variable. As Figure 1

demonstrates, this is not the case. For all three measures, we observe a general positive

relationship; increases in the transformed values correspond to increases in the original values.

Yet, practically every transformed value (looking horizontally across the graphs) includes a

range of values of the original measure. We do not observe the point-for-point relationship we

should if these scores were truly interval-level measures.

[Figure 1 about here]

Each panel of Figure 1 presents the optimal scaling results from one model of judicial

voting behavior. Panels A and B present models for Courts of Appeals judges; C and D present

models for Supreme Court justices. In all panels, values of the untransformed variable are

presented along the horizontal axis. Optimally scaled values, which are the values of ideology

that maximize R2, are presented along the vertical axis. If a given variable were best measured at


the interval level, the points of the scatter plot would fall on the 45-degree line, presented in each

panel as a dashed, gray line. Looking across the panels, for none of the measures does it appear

safe to assume that a one-unit increase in ideology translates into a corresponding, meaningful

increase or decrease in the likelihood of voting liberal. Furthermore, the optimally scaled

category values are not evenly spaced, which is a central assumption of interval-level measures.

As noted earlier, the unrestricted transformations can produce difficult-to-read graphs, especially

for the Courts of Appeals models given the larger number of unique original values. Figure 2

presents the same optimal scaling for the Courts of Appeals, but with the imposition of an

ordinal restriction on the transformed values. Here, one can more clearly see the clustering of

original values around particular transformed values. For example, in models of voting behavior,

GHP scores ranging from roughly -0.35 to -0.1, a range of 0.25, are best coded as a single value,

-0.2.

[Figure 2 about here]

To further support our argument, we conduct additional ALS analyses of the measures of

circuit judge ideology, both of which had a much larger number of unique, original values as

compared to the measures of Supreme Court ideology. In these analyses, we reduce GHP and

PIMP scores to ten, evenly spaced categories. For GHP scores, we begin with the minimum

value of -0.7 and set category ranges equal to 0.131. For PIMP scores, we begin with -0.34 and

set category ranges of 0.075. The ALS analyses presented in Figure 3 allow for unrestricted, that is, nominal, transformation of the original values. Again, if the measures should truly be treated

as interval-level, one would observe a roughly unit-for-unit increase between the original and

optimally scaled values. Figure 3 casts doubt on whether these variables should even be treated

as ordinal. For GHP scores (Panel A), the most liberal voting behavior is observed for judges in


the penultimate liberal category (-0.569 to -0.438).20 Likewise, the most conservative voting

behavior is observed for judges in the penultimate conservative category (0.348 to 0.479). We

also observe further clustering in three of the “moderate” categories (original scores ranging

from -0.307 to 0.086). PIMP scores (Panel B) appear even less ordinal than GHP scores.

Original and optimally scaled values generally increase from the first through the fifth

categories. For subsequent categories, the relationship between the two sets of scores is

approximately parabolic with conservative behavior reaching a local maximum in the fifth

category, decreasing through the eighth category, and increasing again through the tenth.

[Figure 3 about here]
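The ten-category reduction described above is simple fixed-width binning. A sketch (ours, not the authors' code), using the start values and widths given in the text:

```python
import numpy as np

def bin_scores(scores, start, width, n_bins=10):
    """Assign each score to one of n_bins evenly spaced categories.
    Per the text: GHP scores start at -0.7 with width 0.131; PIMP scores
    start at -0.34 with width 0.075. Scores at the top endpoint are
    clipped into the last bin."""
    idx = np.floor((np.asarray(scores, dtype=float) - start) / width).astype(int)
    return np.clip(idx, 0, n_bins - 1)
```

For GHP scores, for example, bin_scores([-0.7, -0.5, 0.0, 0.6], -0.7, 0.131) places those judges in categories 0, 1, 5, and 9.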

The evidence from the optimal scaling analysis suggests a categorical representation of

ideology may be more appropriate for explaining judicial voting behavior. In and of themselves,

the results do not provide sufficient evidence of the inadequacy of these measures in models of

judicial voting behavior. For that, we turn to more appropriate models of behavior and,

importantly, statistical evaluation of model performance using various operationalizations of

ideology. We recognize that it may be impractical for researchers to optimally scale their

preferred continuous measure of ideology before every examination. Furthermore, measures of

ideology using different values for and definitions of categories, as could result from using different subsets of the datasets used here or new datasets altogether, reduce the ability of

researchers to compare their findings. As a practical solution, we compare the performance of

these continuous measures to an easily created categorical measure based on GHP scores for

Courts of Appeals judges and Segal-Cover scores for Supreme Court justices, as described in the

Data and Variables section.
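Constructing such a categorical measure is a one-line recode. In the sketch below the cutpoints are placeholders of our own; the paper's actual cutoffs are defined in its Data and Variables section and are not reproduced here:

```python
import numpy as np

def three_category(scores, lo, hi):
    """Collapse a continuous ideology score into liberal / moderate /
    conservative. The cutpoints lo and hi are hypothetical placeholders,
    not the cutoffs used in the paper."""
    scores = np.asarray(scores, dtype=float)
    return np.where(scores < lo, "liberal",
                    np.where(scores > hi, "conservative", "moderate"))
```

With illustrative cutpoints of -0.2 and 0.2, three_category([-0.5, 0.0, 0.5], -0.2, 0.2) returns liberal, moderate, and conservative, respectively.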


20. Since GHP (and PIMP) scores increase with conservatism, and the structural model will estimate one parameter for the effect of ideology, lower values will be associated with a greater propensity to vote in a liberal direction.


Comparison of Explanatory Models

[Table 2 about here]

We estimate three logit specifications, one for each operationalization of ideology, using all of the independent variables described in the Data and Variables section. Again, all of the specifications include year fixed effects and Courts of

Appeals models also include circuit fixed effects.21 We provide a brief discussion of the effects of

the independent variables, but our present emphasis is on the model performance statistics

included at the bottom of Tables 2 and 3. We present five different statistics: (1) the Bayesian

information criterion (BIC’), using a version less sensitive to the number of parameters in the

model22, (2) Akaike’s information criterion (AIC), (3) the area under the receiver operating

characteristic (AUROC) curve, (4) McFadden’s R2, which is the pseudo-R2 reported by Stata,

and (5) the percent correctly predicted. Models are assumed to be “better” when (1) BIC’ and

(2) AIC are lower and when (3) AUROC, (4) McFadden’s R2, and (5) percent correctly predicted

are larger.
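For reference, the textbook forms of the first two criteria, together with Long and Freese's (2006) rough guide for interpreting BIC differences, can be sketched as follows. Note that the BIC' reported in the tables is a variant less sensitive to the parameter count, and the "weak" label for gaps under 2 points is our own placeholder:

```python
import math

def aic(loglik, k):
    """Akaike information criterion: -2*logL + 2k (lower is better)."""
    return -2.0 * loglik + 2.0 * k

def bic(loglik, k, n):
    """Standard Bayesian information criterion: -2*logL + k*ln(n)."""
    return -2.0 * loglik + k * math.log(n)

def bic_support(delta):
    """Long and Freese's (2006) guide for an absolute BIC difference:
    the model with the lower BIC is preferred, with the strength of that
    preference graded by the size of the gap (2-6 positive, 6-10 strong,
    over 10 very strong; the label for gaps under 2 is our addition)."""
    d = abs(delta)
    if d < 2:
        return "weak"
    if d < 6:
        return "positive"
    if d < 10:
        return "strong"
    return "very strong"
```

By this guide, a gap such as the 13.599 points discussed for Table 2 counts as very strong support for the lower-BIC' model.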

[Table 3 about here]

As Table 2 demonstrates, the model using a categorical specification of ideology

performs better than the nearest competitor, the model using GHP scores, on four of the five

statistics. The categorical specification results in a BIC’ that is 13.599 points lower, which

indicates a very strong preference for that model. The categorical specification boasts a larger

area under the ROC curve, a larger McFadden’s R2, and a slightly larger percent correctly


21. Year and circuit fixed effects are not presented in the tables, but are available upon request.
22. We opt for this calculation of the BIC because the categorical specification would be penalized for the extra parameters. Long and Freese (2006) state there are no important differences between this and the more standard calculation of BIC. As a means of comparing BIC for any set of models, Long and Freese (2006) note that differences of between 2 and 6 points represent positive support for selecting the model with the lowest BIC. Differences between 6 and 10 points represent strong support and differences greater than 10, very strong support.


predicted. The categorical specification and the model using GHP scores have an equal AIC. In

general, the model using PIMP scores is a very close third with respect to performance statistics.

The differences in model performance are not striking across the three specifications. Generally,

however, it is argued that continuous measures are preferable to categorical measures because

the latter “throw away” information. In the context of the voting behavior of circuit court judges,

far from losing information, the categorical specification of ideology produces a model whose explanations, and therefore predictions, of circuit judge voting behavior are at least as good, if not better. For the Supreme Court, however, a continuous measure, Segal-Cover scores,

clearly produces the best performing model. The model in Table 3 using Segal-Cover scores

excels in all five measures, including a BIC’ difference of 958.717 points between itself and the

model using a categorical specification of ideology.

[Figure 4 about here]

In addition to presenting the improved performance of a model using a categorical

specification of circuit judge ideology, we demonstrate the potential empirical pitfalls of

continuous measures through Figure 4. Figure 4 plots the predicted probabilities of voting

liberal for circuit judges against the actual behavior of judges at those values of ideology for the

ten categories used in the third ALS analysis. For example, the first category for GHP scores

ranged from -0.7 to -0.569. The first curve in each panel of Figure 4 plots the predicted

probability of a liberal vote for GHP scores from -0.7 to 0.61, increasing in increments of 0.131

(Panel A) and for PIMP scores ranging from -0.34 to 0.41, increasing in increments of 0.075

(Panel B).23 The second curve plots the proportion of votes that are liberal for all judges whose


23. Except for the measures allowed to vary as noted in the text, predicted probabilities are calculated holding GHP and Segal-Cover scores at their respective means (-0.020 for GHP, 0.547 for Segal-Cover). Dichotomous variables are kept at their base categories (“other” issue area, conservative lower court decision, federal government is not the respondent), panel effects at their median (one Democratic appointee on the circuit court panel), and the lagged


ideology scores are within the category ranges from Figure 3. While we do not expect a perfect

relationship, we should observe a monotonically decreasing proportion of liberal votes as

categories (conservatism) increase, if the interval level were appropriate. That, clearly, is not the

case. The curves representing actual behavior are essentially the mirror images of the curves

presented in Figure 3, the third ALS analysis. We add the predicted probabilities curves to make

clear that the assumption of behavioral changes that are proportional to unit increases is simply

not met for either GHP scores or PIMP scores.

Control Variables

The control variables have the same substantive effects within each level of the judiciary

across specifications.24 The interpretations that follow use the results from the GHP scores model

and the Segal-Cover scores model. Looking first at the effects of different issue areas, relative to

other issue areas, the probability of a liberal vote by a judge decreases by 0.044 in civil rights

and liberties cases, by 0.028 in economic cases, and by 0.177 in criminal cases. For Supreme

Court justices, relative to other issue areas, the probability of a liberal vote increases by 0.113 in

civil rights and liberties cases, 0.111 in economic cases, and 0.039 in criminal cases. The

direction of the lower court decision, combined with the federal government as respondent, has

significant effects on judicial behavior. In general, circuit judges are most likely to vote liberal

when the lower court decision is liberal and the federal government is the respondent (predicted

probability, p, equals 0.763). Circuit judges are least likely to vote liberal when the lower court decision is conservative and the federal government is the respondent (p equals 0.362). Taken

together, the results suggest a high rate of affirmance and deference to the federal government.

23. (continued) Supreme Court median is held at its mean (0.386). For the Courts of Appeals, marginal effects are calculated for the 7th Circuit, holding the year at 1980. For the Supreme Court, term is held at 1993. None of those effects were significant and all three were very close to zero.
24. Not only do the control variables have the same effects in the three basic models presented in Tables 2 and 3, but they also have the same effects in the models that interact ideology with issue areas (Tables 4 and 5).


We observe a similar dynamic for the Supreme Court with respect to deference for the federal

government, but we also see preferences for reversing as opposed to affirming lower court

rulings; when the federal government is not the respondent, there is a significantly higher

probability of a justice voting liberal when the lower court decision was in a conservative

direction as opposed to liberal one. We observe significant “panel” effects at the Courts of

Appeals, with a greater number of Democratic appointees increasing the likelihood that a

judge casts a liberal vote. An increase from zero to two Democrat-appointed judges on the panel

increases the probability of a liberal vote by 0.059. Lastly, circuit court judges exhibit

responsiveness to the Supreme Court. An increase of one standard deviation (0.55) from the

mean of the lagged Supreme Court median, which represents a liberal change in Court

preferences, causes the probability of a liberal vote to increase by 0.083.
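The predicted probabilities and changes reported above are obtained by passing the fitted logit linear predictor through the inverse-logit transform; schematically (the linear-predictor values in any example are hypothetical, not estimates from the paper):

```python
import math

def logit_prob(xb):
    """Predicted probability from a logit model's linear predictor xb."""
    return 1.0 / (1.0 + math.exp(-xb))

def first_difference(xb_base, xb_new):
    """Change in predicted probability between two covariate scenarios,
    e.g. moving one predictor from its mean to one standard deviation
    above it while holding all others fixed."""
    return logit_prob(xb_new) - logit_prob(xb_base)
```

Quantities like the 0.083 increase reported above are first differences of this kind, computed at the covariate values described in footnote 23.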

Ideology and Interactions

[Table 4 about here]

As a final illustration of the caution that needs to be observed when measuring judicial

ideology, we present Tables 4 and 5, which estimate the same models for the Courts of Appeals

and Supreme Court as presented in Tables 2 and 3, adding interactions between ideology and

issue areas. Looking first at model performance, Tables 4 and 5 tell a story similar to Tables 2

and 3. The model using the categorical specification of circuit judge ideology performs at least

as well as models using continuous measures. The categorical specification has the lowest BIC’

(by 11.303 points, indicating very strong support for this specification), the highest area under the

ROC curve, the highest McFadden’s R2, the lowest AIC, and the highest percent correctly

predicted. Again, the differences in model performance are not spectacular, but all five statistics

suggest that the categorical specification produces models comparable to those using a


continuous specification of ideology. Turning to the Supreme Court, again, the model using

Segal-Cover scores outperforms the other two models on all five statistics.

[Table 5 about here]

More important than relative model performance, the conditional effects of ideology on

voting behavior provide, in our view, strong support for the argument that researchers need to consider

the advantages of categorical measures over continuous ones in this context. We begin this time

with the straightforward presentation of conditional effects in models of Supreme Court

decision-making. All three models expect the likelihood of voting liberal to decrease as

perceived conservatism increases (Segal-Cover scores decrease and PIMP scores increase) across

all issue areas. The interactions between ideology and issue area all have the effect of

accentuating the impact of ideology relative to its effects in other, unspecified issue areas (the

base category). Liberal justices become more likely to vote liberal, conservative justices become

more likely to vote conservative, and moderate justices become slightly more likely to vote

conservative, although this result for moderates is statistically weaker for economic cases.

Importantly, we observe the same substantive relationships for all three specifications of

ideology.

The same cannot be said of the Courts of Appeals models. Recall that an assumption of

interval-level measures is that a constant distance between sets of values of the independent

variable will produce proportionally constant responses in the dependent variable. Consider, for

example, the interaction between civil rights and liberties cases and PIMP scores (the second

substantive column of Table 4). The coefficients suggest that liberal judges become more liberal

when deciding civil rights and liberties cases, moderate judges become slightly more liberal, and

conservative judges become more conservative. Thus, we draw a similar conclusion to the


effects of ideology on Supreme Court behavior. Liberal judges are more likely to vote liberal

than moderate judges, and moderate judges are more likely to vote liberal than conservative

judges. The coefficients from the last column of Table 4, however, tell a different story. The

effect of Moderate is not statistically significant, suggesting that in civil rights and liberties

cases, there are no statistical differences in the voting behavior of moderate judges and liberal

judges, ceteris paribus.

[Figure 5 about here]

Figure 5 graphs the predicted probabilities of a liberal vote in each issue area when

continuous measures equal their minimum value, 25th percentile, median (50th percentile), 75th

percentile, and maximum value. Categories of the nominal-level measure are plotted alongside the

minimum, median, and maximum values for ease of presentation. In the bottom two panels,

depicting criminal cases and other issue areas, all three measures produce substantively similar

conclusions: the propensity to vote liberal decreases with conservatism. In the top two panels,

depicting civil rights and liberties cases and economic cases, the continuous measures, because

they assume (and therefore force) proportional changes in the response, tell a misleading story

about the role of ideology in decision-making. In civil rights and liberties cases (Panel A), while

moderates are more likely to vote conservative than liberals, the difference between these two

types of judges (0.042) is much smaller than the difference between moderates and conservatives

(0.107). Such differences are even sharper in economic cases where moderate judges are, on

average, more liberal in their voting behavior than liberal judges.25 Conservative judges are,

once again, far more conservative in their behavior than the other two groups. Thus, for these

two important types of cases, reliance on continuous measures of ideology would lead the


25. The coefficient of the interaction between moderate judges and economic cases is not listed as statistically significant, but it does come close to conventional levels of significance (p = 0.060).


researcher to conclude that increasing conservatism leads to decreases in the likelihood of voting

liberal. In actuality, “moderate” judges are much closer to their liberal colleagues in both civil

rights and liberties and economic cases. Coupled with the model performance statistics, the

results raise serious doubts about the conclusions drawn from analyses using continuous

measures of ideology in the current context.

Conclusion

The measurement strategies presented in Giles, Hettinger, and Peppers (2001) and

Howard and Nixon (2003) are a welcome departure from using the party of the appointing

president as a proxy for judicial preferences. Rather than paint all nominees with the same broad

brush, GHP and PIMP scores leverage the selection process, variation in presidential or

senatorial preferences, or basic characteristics of the nominees to better distinguish liberal,

moderate, and conservative judges. Keeping these measures at the interval level, however,

places a heavy assumption on the manner by which ideology affects behavior. Specifically, it is

assumed that unit increases in these variables cause proportional changes in the likelihood judges

cast liberal votes; that a change from -0.5 to -0.45 in a given ideology scale not only produces a

meaningful change in voting behavior, but also that the same change in behavior results from a

change from 0.45 to 0.5. We argue that this assumption is untenable and our argument is

supported in the context of voting behavior at the Courts of Appeals.

We recognize that quantitative scholars generally prefer continuous measures over

categorical ones. Continuous measures provide increased variation and greater precision. These

benefits are only realized, however, when the assumptions of these levels of measurement are

met. This is not to say that the field should abandon GHP or PIMP scores as measures of circuit

judge ideology. The methods of their creation place both measures on the same scale as actors in


the coordinate branches of government, making them well-suited to inter-branch ideological

comparisons.

In studies of judicial behavior that do not include such explicit ideological comparisons,

we find issues with these continuous measures that can be resolved by lowering the level of

measurement. Different research contexts, and different hypotheses, may call for different types

of measures. Ideology can be measured using any number of categories appropriate to testing

these arguments. We present a three-category measure, which, while simple, brings a number of

benefits beyond dropping the unmet assumptions of the interval level. Particularly in models of

circuit judge voting behavior, especially when ideology is interacted with contextual factors, the

categorical measure: (1) is easy to construct assuming the researcher already has GHP scores, (2)

is easy to interpret, (3) produces models that perform just as well as, if not better than, models

using the common continuous alternatives, and (4) leads to more appropriate conclusions

regarding the effects of ideology.

We present three sets of analyses for the Courts of Appeals and Supreme Court. In the

first, we demonstrated that all four continuous measures of judicial ideology evince some

degree of clustering around particular values when used in models of vote choice. Thus, model

fit improves when these continuous measures are reduced to sets of categories. Second, we

present logit models of vote choice using the continuous measures and a simple, categorical

construction, based on GHP and Segal-Cover scores, identifying liberal, moderate, and

conservative judges and justices. On five different statistics, the Courts of Appeals model

employing the categorical measure performed slightly better than the models using continuous

measures. Thus, there is no information loss in dropping from the interval to the nominal level and

drastically decreasing the number of categories into which judges are placed. Interestingly,


Segal-Cover scores, and not the categorical measure based on them, provide better explanations

of justice voting behavior. We speculate that the reason for this is that Segal-Cover scores are

based on information about the nominees’ preferences directly. GHP and PIMP scores, even

those for Supreme Court justices, make assumptions about nominee preferences based on

presidential/senatorial preferences and nominee party identification/region respectively. It

appears that the information used by Segal and Cover (1989), which unfortunately is not

available for lower court judges, lends itself to a more precise measure of ideology. Third, we

find for the Courts of Appeals that a categorical measure of ideology produces different

substantive conclusions regarding the effects of ideology when ideology is interacted with the

issue area addressed by a case. Such qualitative differences in the effect of a moderating variable

can only emerge by relaxing the measurement constraints of GHP and PIMP.

Beyond questions of measurement, our analyses also demonstrate the importance of

estimating well-specified models. While ideology is an important predictor of judicial behavior,

other factors exert strong influences over the votes of judges and justices. The substantive issue

area of the case and the presence of the federal government as a party in particular are powerful

predictors of judicial behavior regardless of how ideology is measured in the context we study

here.

One final thought we wish to reiterate is that the context in which judges are studied

is important. Our argument should not be construed as cautioning against the use of continuous

measures of ideology for circuit judges in all research, even for the two measures we examine

here. Furthermore, we do not argue that using the party of the appointing president is

necessarily poor. The results we present, however, demonstrate that what are seemingly greater

levels of precision are not automatically better measures in the context of ideology and voting


behavior for the data examined here. For other research questions, for other datasets, continuous

measures of circuit judge ideology may be perfectly appropriate. We do argue that researchers

should be particularly mindful of the measures of ideology they employ in empirical

examinations of judicial voting behavior. Moreover, future examinations should be especially

cautious when testing the conditional effects of ideology. We hope that our

discussion serves to make researchers more aware of the assumptions and consequences of those

measurement choices, and to help researchers make those decisions conscientiously.


References

Bartels, Brandon L. 2009. “The Constraining Capacity of Legal Doctrine on the US Supreme

Court.” American Political Science Review 103: 474-495.

Best, Samuel J. 1999. “The Sampling Problem in Measuring Policy Mood: An Alternative

Solution.” Journal of Politics 61: 721-740.

Calvin, Bryan, Paul M. Collins, Jr., and Matthew Eshbaugh-Soha. 2011. “On the Relationship

between Public Opinion and Decision Making in the U.S. Courts of Appeals.” Political

Research Quarterly 64: 736-748.

Davis, Jeffrey. 2006. “Justice Without Borders: Human Rights Cases in U.S. Courts.” Law and

Policy 28: 60-82.

Epstein, Lee, Andrew D. Martin, Jeffrey A. Segal, and Chad Westerland. 2005. “The Judicial

Common Space.” Journal of Law, Economics, & Organization 23: 303-325.

Feldman, Stanley, and Christopher Johnston. 2014. “Understanding the Determinants of Political

Ideology: Implications of Structural Complexity.” Political Psychology 35: 337-358.

Giles, Michael W. 2008. “Commentary on ‘Picking Federal Judges: A Note on Policy and

Partisan Selection Agendas.’” Political Research Quarterly 61: 53-55.

Giles, Michael W., Virginia A. Hettinger, and Todd C. Peppers. 2001. “Picking Federal Judges:

A Note on Policy and Partisan Selection Agendas.” Political Research Quarterly 54: 623-

641.

Hettinger, Virginia A., Stefanie A. Lindquist, and Wendy L. Martinek. 2006. Judging on a

Collegial Court: Influence on Federal Appellate Decision-Making. Charlottesville: University of Virginia Press.


Howard, Robert M. and David C. Nixon. 2003. “Local Control of the Bureaucracy: Federal

Appeals Courts, Ideology, and the Internal Revenue Service.” Journal of Law and Policy 13:

233-256.

Jacoby, William G. 1999. “Levels of Measurement and Political Research: An Optimistic View.”

American Journal of Political Science 43: 271-301.

Jacoby, William G. N.d. “opscale: A Function for Optimal Scaling.”

http://polisci.msu.edu/jacoby/icpsr/scaling/computing/alsos/Jacoby,%20opscale%20MS.pdf

(February 24, 2015).

Kaheny, Erin B., Susan B. Haire, and Sara C. Benesh. 2008. “Change over Tenure: Voting,

Variance, and Decision Making on the U.S. Courts of Appeals.” American Journal of

Political Science 52: 490-503.

Long, J. Scott. 1997. Regression Models for Limited and Categorical Dependent Variables.

Thousand Oaks, CA: Sage.

Long, J. Scott and Jeremy Freese. 2006. Regression Models for Categorical Dependent

Variables Using Stata, 2nd Edition. College Station, TX: Stata Press.

Martin, Andrew D., and Kevin M. Quinn. 2002. “Dynamic Ideal Point Estimation via Markov

Chain Monte Carlo for the U.S. Supreme Court, 1953-1999.” Political Analysis 10: 134–53.

McGuire, Kevin T. 1998. “Explaining Executive Success in the U.S. Supreme Court.” Political

Research Quarterly 51: 505-526.

Nixon, David C. 2004. “Appendix A: Ideology Scores for Judicial Appointees.” November 23.

http://www2.hawaii.edu/~dnixon/PIMP/judicial.pdf (February 16, 2015).

Poole, Keith T. 1998. “Recovering a Basic Space from a Set of Issue Scales.” American Journal

of Political Science 42: 954-993.

Segal, Jeffrey A. 1990. “Supreme Court Support for the Solicitor General: The Effect of

Presidential Appointments.” Western Political Quarterly 43: 137-152.

Segal, Jeffrey A. and Albert D. Cover. 1989. “Ideological Values and the Votes of U.S. Supreme

Court Justices.” American Political Science Review 83: 557-565.

Segal, Jeffrey A. and Harold J. Spaeth. 2002. The Supreme Court and the Attitudinal Model

Revisited. Cambridge, UK: Cambridge University Press.

Sheehan, Reginald S., William Mishler, and Donald R. Songer. 1992. “Ideology, Status, and the

Differential Success of Direct Parties Before the Supreme Court.” American Political Science

Review 86: 464-471.

Songer, Donald R. 1987. “The Impact of the Supreme Court on Trends in Economic Policy

Making in the United States Courts of Appeals.” Journal of Politics 49: 830-41.

Songer, Donald R., and Susan Haire. 1992. “Integrating Alternative Approaches to the Study of

Judicial Voting: Obscenity Cases in the U.S. Courts of Appeals.” American Journal of

Political Science 36: 963-82.

Songer, Donald R., and Reginald S. Sheehan. 1990. “Supreme Court Impact on Compliance and

Outcomes: Miranda and New York Times in the United States Courts of Appeals.” Western

Political Quarterly 43: 297-319.

Songer, Donald R., and Reginald S. Sheehan. 1992. “Who Wins on Appeal? Upperdogs and

Underdogs in the United States Court of Appeals.” American Journal of Political Science 36:

235-258.

Steigerwalt, Amy, Richard L. Vining, Jr., and Tara W. Stricko. 2013. “Minority Representation,

the Electoral Connection, and the Confirmation Vote of Sonia Sotomayor.” Justice System

Journal 34: 189-207.

Figure 1.
Optimal Scaling of Continuous Measures of Ideology: Nominal Transformations
Note: Each panel presents the scatter plot of the original values of each measure (on the horizontal axis) against
the optimally scaled values (on the vertical axis), derived from alternating least squares analyses as described in
the text. For the purpose of comparison, the “45-degree” line, on which the points would fall if the original and
optimally scaled values were equal, is presented as a dashed, gray line. Note also that Panel A, the optimal
scaling of GHP scores, does not include one extreme point. A GHP score of 0.4 has an optimally scaled value
of -3.38. Inclusion of this point would have distorted the rest of the graph.
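The alternating least squares procedure behind these optimally scaled values can be illustrated with a toy sketch. This is a simplified single-predictor version written for this note, not the opscale routine used in the analysis: it alternates an OLS fit with a nominal-level re-scoring step that assigns each category the mean of its members' least-squares targets, and the recovered scores are identified only up to a linear transformation.

```python
import numpy as np

def nominal_opscale(x_cat, y, n_iter=25):
    """Toy single-predictor alternating least squares with a nominal
    restriction: alternate (1) OLS of y on the current category scores
    and (2) re-scoring each category with the mean least-squares target
    of its members. Scores are identified only up to a linear transform."""
    cats = list(np.unique(x_cat))
    scores = {c: float(i) for i, c in enumerate(cats)}  # arbitrary start values
    for _ in range(n_iter):
        x = np.array([scores[c] for c in x_cat])
        slope, intercept = np.polyfit(x, y, 1)   # OLS step
        target = (y - intercept) / slope         # per-observation optimal x
        for c in cats:                           # nominal scaling step
            scores[c] = float(target[x_cat == c].mean())
    return scores

# Simulated data: three unordered categories with known effects.
rng = np.random.default_rng(0)
x_cat = rng.choice(np.array(list("abc")), size=300)
true = {"a": -1.0, "b": 0.5, "c": 2.0}
y = np.array([true[c] for c in x_cat]) + rng.normal(0, 0.3, size=300)
scores = nominal_opscale(x_cat, y)
print(scores)
```

The recovered scores preserve the ordering and relative spacing of the true category effects, which is all the nominal transformation demands.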

[Figure 1 appears here: two scatter plots, Panel A (Courts of Appeals: GHP Scores) and Panel B (Courts of Appeals: PIMP Scores), each plotting Original Values on the horizontal axis against Optimally Scaled Values on the vertical axis.]

Figure 2.
Optimal Scaling of Continuous
Measures of Ideology: Ordinal Transformations
Note: This figure is similar to Figure 1 except that the optimally scaled values
assume an ordinal relationship between values in the original measure. Again,
the 45-degree line is presented for comparison purposes.

[Figure 2 appears here: two scatter plots, Panel A (Courts of Appeals: GHP Scores) and Panel B (Courts of Appeals: PIMP Scores), each plotting Original Value on the horizontal axis against Optimally Scaled Value on the vertical axis.]

Figure 3.
Nominal Transformation of 10-Category GHP and PIMP Scores
Note: This figure presents the optimal scaling analyses of GHP and PIMP scores both
measured originally as ten-point scales. Each score was transformed into a ten-point scale
using equally sized categories (0.131 for GHP and 0.075 for PIMP) prior to analysis. As in
the analyses presented in Figure 1, the analyses here do not force the optimally scaled
values to maintain the original order of the categories.

[Figure 4 appears here: two line graphs, Panel A (Predicted v. Actual Liberal Voting by GHP Scores) and Panel B (Predicted v. Actual Liberal Voting by PIMP Scores), each plotting Probability / Proportion on the vertical axis against the GHP or PIMP score range on the horizontal axis, with separate curves for P(Lib. Vote) and % Lib. Votes.]

Figure 4.
Predicted and Actual Liberal Voting at the Courts of Appeals
Note: Each panel presents two curves. The first is the predicted probability of a liberal vote
evaluated at the lower bound of each range of GHP scores (Panel A) or PIMP scores (Panel B).
The second is the actual proportion of liberal votes cast by judges whose GHP or PIMP scores
fall in the given range.
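The comparison the note describes, model-based probabilities set against observed proportions within score ranges, is a binned calibration check. A toy sketch with simulated data follows; the score distribution and logit coefficients below are invented for illustration and are not the paper's estimates.

```python
import numpy as np

# Simulate votes from a known logit model, then bin judges by an
# ideology score and compare the model's mean predicted probability
# of a liberal vote with the observed proportion in each bin.
rng = np.random.default_rng(1)
score = rng.uniform(-0.7, 0.6, size=5000)        # stand-in for GHP-like scores
p_true = 1 / (1 + np.exp(0.5 + 0.47 * score))    # liberal voting falls as scores rise
vote = rng.binomial(1, p_true)

edges = np.linspace(-0.7, 0.6, 8)                # seven equal-width bins
which = np.digitize(score, edges[1:-1])          # bin index 0..6 per observation
for k in range(len(edges) - 1):
    m = which == k
    print(f"[{edges[k]:+.2f}, {edges[k+1]:+.2f}): "
          f"pred={p_true[m].mean():.3f}  actual={vote[m].mean():.3f}")
```

With a correctly specified model, the predicted and actual columns should track each other closely within sampling error, which is the pattern Figure 4 is designed to reveal (or refute).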

Figure 5.
The Effects of Ideology Across Issue Areas
Note: Predicted probabilities are generated from the coefficients in Table 4 holding the lower court decision and
the federal government as respondent at their base values, panel effects at their median, and the lagged Supreme
Court median at its mean. For ease of presentation, value labels of the categorical variable are placed at the
minimum, median, and maximum values of the continuous measures to which they are compared.
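Predicted probabilities of this kind come from the standard logistic transform of the linear predictor, p = 1 / (1 + exp(-Xb)). A sketch follows; the values are loosely patterned on the GHP column of Table 2 but omit the issue-area dummies and circuit/year fixed effects of the actual model, so the resulting probability is illustrative only.

```python
import math

def logit_prob(intercept, terms):
    """Predicted probability from a logit model: p = 1 / (1 + exp(-Xb)),
    where Xb is the intercept plus the sum of coefficient * value pairs."""
    xb = intercept + sum(coef * value for coef, value in terms)
    return 1.0 / (1.0 + math.exp(-xb))

# Illustrative covariate profile (not the paper's full specification):
p = logit_prob(
    intercept=-0.756,
    terms=[
        (-0.471, 0.608),  # GHP score at its sample maximum
        (0.995, 0),       # liberal district court decision at base value
        (-0.144, 0),      # federal government respondent at base value
        (0.123, 1),       # Democratic appointees on panel at median
        (0.617, 0.588),   # lagged Supreme Court median at its median
    ],
)
print(round(p, 3))
```

Varying one covariate (here, the ideology score) while holding the rest at fixed values traces out the curves plotted in Figure 5.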

Table 1.
Summary Statistics
Variable Obs. Mean SD Min Max Median
Courts of Appeals
Liberal Vote 32,884 0.382 0.486 0 1 0
GHP Scores 32,884 -0.020 0.347 -0.699 0.608 -0.007
PIMP Scores 32,606 0.055 0.253 -0.336 0.409 0.029
Liberal District Court 32,884 0.309 0.462 0 1 0
Federal Gov. Respondent 32,884 0.484 0.500 0 1 0
# Dem. Appointees on Panel 32,884 0.954 0.944 0 10 1
Lagged S.C. Median 32,884 0.386 0.550 -0.969 1.122 0.588
Issue Area 32,884 Freq. Percent
Civil Rights & Liberties 4,902 14.9%
Economic 13,469 41.0%
Criminal 11,537 35.1%
Other (Base Category) 2,976 9.0%

Supreme Court
Liberal Vote 74,789 0.533 0.499 0 1 1
Segal-Cover Scores 74,789 0.547 0.330 0 1 0.5
PIMP Scores 74,789 0.052 0.255 -0.312 0.366 -0.016
Liberal Lower Court 74,789 0.425 0.494 0 1 0
Federal Gov. Respondent 74,789 0.193 0.394 0 1 0
# Liberal Votes 74,789 4.098 2.802 0 8 4
Issue Area 74,789 Freq. Percent
Civil Rights & Liberties 22,622 30.3%
Economic 15,485 20.7%
Criminal 15,741 21.0%
Other (Base Category) 20,941 28.0%

Table 2.
Logit Models of Court of Appeals Voting Behavior
Variables GHP PIMP Categorical
Coef. SE Coef. SE Coef. SE
Ideology
GHP -0.471* 0.039
PIMP -0.601* 0.053
Moderate -0.118* 0.031
Conservative -0.452* 0.035

Control Variables
Issue Area
Civil Rights & Liberties -0.188* 0.051 -0.188* 0.051 -0.187* 0.051
Economic -0.119* 0.044 -0.124* 0.044 -0.118* 0.044
Criminal -0.852* 0.050 -0.850* 0.050 -0.852* 0.050
Liberal District Court 0.995* 0.034 0.990* 0.034 0.998* 0.034
Federal Gov. Respondent -0.144* 0.033 -0.150* 0.035 -0.143* 0.035
× Liberal Lower Court 0.738* 0.058 0.740* 0.058 0.739* 0.058
# Dem. Appointees on Panel 0.123* 0.014 0.121* 0.014 0.126* 0.014
Lagged USSC Median 0.617.. 0.350 0.832* 0.361 0.702* 0.351
Intercept -0.756* 0.167 -0.829* 0.172 -0.603* 0.167

Model Evaluation Statistics


BIC' -4,722.185 -4,638.933 -4,735.784 (13.599)
AIC 1.168 1.169 1.168
AUROC 0.7335 0.7327 0.7340
McFadden’s R2 0.1256 0.1247 0.1261
% Correctly Predicted 70.8% 70.8% 70.9%
Observations 32,884 32,606 32,884
* p < 0.05
Note: Model evaluation statistics in bold type indicate the best performing model by that statistic. The
number in parentheses is the absolute difference between the BIC’ of the model using the categorical
measure and the model with the next lowest BIC’ score (using GHP scores). All models were estimated
with circuit and year fixed effects. Those coefficients are available upon request.
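BIC’ as reported here follows the convention in Long (1997) and Long and Freese (2006): the negative of the likelihood-ratio statistic against the intercept-only null plus a degrees-of-freedom penalty, so more negative values indicate stronger support, and a gap of ten or more between models is conventionally read as very strong evidence. A sketch follows; the log-likelihoods and parameter counts below are invented for illustration.

```python
import math

def bic_prime(ll_model, ll_null, n_params, n_obs):
    """BIC' in the Long & Freese sense: the negative likelihood-ratio
    statistic versus the intercept-only null, plus a penalty of
    (number of parameters) * ln(number of observations)."""
    lr = 2.0 * (ll_model - ll_null)       # likelihood-ratio statistic
    return -lr + n_params * math.log(n_obs)

# Hypothetical log-likelihoods for two competing logit specifications
# fit to the same n = 32,884 votes (numbers invented for illustration):
ll_null = -21900.0
bic_a = bic_prime(ll_model=-19600.0, ll_null=ll_null, n_params=30, n_obs=32884)
bic_b = bic_prime(ll_model=-19610.0, ll_null=ll_null, n_params=29, n_obs=32884)
print(bic_a, bic_b, abs(bic_a - bic_b))
```

Because both models share the same null and sample, the difference in BIC’ equals the difference in ordinary BIC, so the model with the lower (more negative) BIC’ is preferred.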

Table 3.
Logit Models of Supreme Court Voting Behavior
Variables Segal-Cover PIMP Categorical
Coef. SE Coef. SE Coef. SE
Ideology
Segal-Cover 1.441* 0.029
PIMP -1.146* 0.033
Moderate -0.516* 0.020
Conservative -0.945* 0.023

Control Variables
Issue Area
Civil Rights & Liberties 0.459* 0.021 0.453* 0.021 0.455* 0.021
Economic 0.451* 0.023 0.446* 0.023 0.447* 0.023
Criminal 0.157* 0.023 0.155* 0.023 0.157* 0.023
Liberal Lower Court -0.951* 0.018 -0.931* 0.018 -0.938* 0.018
Federal Gov. Respondent -0.472* 0.025 -0.459* 0.025 -0.466* 0.025
× Liberal Lower Court 1.446* 0.043 1.412* 0.043 1.427* 0.043
Intercept -0.833* 0.058 0.088* 0.054 0.624* 0.055

Model Evaluation Statistics


BIC' -7,623.435 (958.717) -6,193.584 -6,664.718
AIC 1.273 1.292 1.285
AUROC 0.6880 0.6713 0.6782
McFadden’s R2 0.0805 0.0667 0.0713
% Correctly Predicted 64.2% 62.3% 63.7%
Observations 74,789 74,789 74,789
* p < 0.05
Note: Model evaluation statistics in bold type indicate the best performing model by that statistic. The number
in parentheses is the absolute difference between the BIC’ of the model using Segal-Cover scores and the
model with the next lowest BIC’ score (using the categorical measure). All models were estimated with term
fixed effects. Those coefficients are available upon request.

Table 4.
Courts of Appeals Models with Ideology-Issue Area Interactions
Variables GHP PIMP Categorical
Coef. SE Coef. SE Coef. SE
Ideology
GHP -0.500* 0.115
PIMP -0.536* 0.156
Moderate -0.171.. 0.091
Conservative -0.402* 0.105

Ideology-Issue Interactions
Civil Rights & Liberties -0.191* 0.052 -0.158* 0.052 -0.113.. 0.084
× Ideology -0.226.. 0.143 -0.465* 0.198 M: -0.002.. 0.116
C: -0.260* 0.130
Economic -0.112* 0.044 -0.133* 0.044 -0.231* 0.073
× Ideology 0.247* 0.126 0.179.. 0.171 M: 0.188.. 0.100
C: 0.143.. 0.115
Criminal -0.856* 0.050 -0.837* 0.050 -0.752* 0.078
× Ideology -0.157.. 0.133 -0.236.. 0.180 M: -0.106.. 0.105
C: -0.232.. 0.121

Control Variables
Liberal District Court 0.996* 0.034 0.990* 0.034 0.999* 0.034
Federal Gov. Respondent -0.148* 0.035 -0.154* 0.036 -0.147* 0.035
× Liberal Lower Court 0.740* 0.058 0.744* 0.058 0.743* 0.058
# Dem. Appointees on Panel 0.123* 0.014 0.121* 0.014 0.127* 0.014
Lagged USSC Median 0.594.. 0.350 0.807* 0.360 0.673.. 0.351
Intercept -0.751* 0.167 -0.822* 0.172 -0.584* 0.174

Model Evaluation Statistics


BIC' -4,723.647 -4,632.682 -4,712.344 (11.303)
AIC 1.168 1.169 1.167
AUROC 0.7342 0.7331 0.7349
McFadden’s R2 0.1263 0.1253 0.1270
% Correctly Predicted 70.7% 70.78% 70.82%
Observations 32,884 32,606 32,884
* p < 0.05
Note: Model evaluation statistics in bold type indicate the best performing model by that statistic. The
number in parentheses is the absolute difference between the BIC’ of the model using the categorical
measure and the model with the next lowest BIC’ score (using GHP scores). All models were estimated
with circuit and year fixed effects. Those coefficients are available upon request. For the interaction
coefficients of the categorical specification, “M” denotes the interactions between issue areas and the
“moderate” dummy and “C” denotes the interactions between issue areas and the “conservative” dummy.

Table 5.
Supreme Court Models with Ideology-Issue Area Interactions
Variables Segal-Cover PIMP Categorical
Coef. SE Coef. SE Coef. SE
Ideology
Segal-Cover 0.688* 0.047
PIMP -0.566* 0.058
Moderate -0.205* 0.036
Conservative -0.398* 0.040

Ideology-Issue Interactions
Civil Rights & Liberties -0.194* 0.039 0.508* 0.021 1.038* 0.043
× Ideology 1.229* 0.063 -0.924* 0.080 M: -0.581* 0.053
C: -0.955* 0.054
Economic 0.281* 0.045 0.471* 0.023 0.558* 0.044
× Ideology 0.305* 0.069 -0.560* 0.088 M: -0.094.. 0.056
C: -0.221* 0.059
Criminal -0.637* 0.045 0.202* 0.024 0.784* 0.046
× Ideology 1.463* 0.070 -0.861* 0.087 M: -0.649* 0.057
C: -1.032* 0.059

Control Variables
Liberal Lower Court -0.940* 0.018 -0.928* 0.018 -0.931* 0.018
Federal Gov. Respondent -0.503* 0.025 -0.467* 0.025 -0.483* 0.025
× Liberal Lower Court 1.467* 0.043 1.419* 0.043 1.438* 0.043
Intercept -0.371* 0.062 0.084.. 0.054 0.342* 0.058

Model Evaluation Statistics


BIC' -8,240.629 (1,154.416) -6,321.152 -7,086.213
AIC 1.264 1.290 1.279
AUROC 0.6945 0.6733 0.6845
McFadden’s R2 0.0868 0.0682 0.0761
% Correctly Predicted 64.6% 62.9% 64.3%
Observations 74,789 74,789 74,789
* p < 0.05
Note: Model evaluation statistics in bold type indicate the best performing model by that statistic. The number in
parentheses is the absolute difference between the BIC’ of the model using Segal-Cover scores and the model with
the next lowest BIC’ score (using the categorical measure). All models were estimated with term fixed effects.
Those coefficients are available upon request. For the interaction coefficients of the categorical specification, “M”
denotes the interactions between issue areas and the “moderate” dummy and “C” denotes the interactions between
issue areas and the “conservative” dummy.
