Abstract
This systematic review evaluated outcomes associated with arrest for domestic violence (DV), for both victims and perpetrators, considering both classic and modern research. A systematic search of 5 databases for quantitative reports resulted in 1,379 potentially relevant entries, of which 34 met the inclusion criteria. Studies were screened using prespecified criteria for the population (adults), outcomes (individual-level outcomes), study design (quantitative, including arrest for DV as a focal independent variable and a non-arrest comparator), publication type (peer-reviewed academic journal), study location (United States), language (English), and publication year (in or after 1984). Part I employs narrative synthesis to explore the impact of arrests by race/ethnicity, revealing gaps in reporting and a scarcity of analyses that consider race/ethnicity or its intersections with gender. Part II, a meta-analysis, indicates that arrest does not consistently reduce repeat violence and suggests variability based on the type of comparator used and the proportion of Black victims in a sample. The review finds that much of the research on the impacts of arrest is outdated and lacks diversity in data sources and outcomes, with few studies examining outcomes other than repeat violence. Future research should prioritize an intersectional approach and the perspectives and needs of survivors. Policymakers should consider the potential for disparate impacts and evaluate alternatives to mandatory arrest policies, with funding available for new data sources and related projects. Ultimately, policymakers must consider the context when evaluating the effectiveness and ethics of arrest policies.
Domestic violence (DV), sometimes referred to as intimate partner violence, is defined as sexual violence, stalking, physical violence, or psychological aggression committed by a current or former intimate partner. It affects two in five women and one in four men in the United States, with even higher rates for Black, American Indian/Alaska Native, and multiracial individuals (Leemis et al., 2022) and for Black women in particular (Richie & Eife, 2021). Although men also experience violence committed by intimate partners, women are more likely to experience serious bodily injury or violent death as a result of DV (Catalano, 2013; Leemis et al., 2022). One of the top DV-related impacts mentioned by women and men is the need for help from law enforcement (D’Inverno et al., 2019). Police who are called to the scene of a DV incident can potentially intervene to prevent future, possibly fatal, violence.
Before the 1970s and 1980s, U.S. law enforcement largely treated DV as a private matter not warranting intervention (Buzawa & Buzawa, 1990; Pleck, 1989; Siegel, 1996). By the 1990s, there had been widespread adoption of legislation that encouraged or required arrest in DV cases (Lerman & Livingston, 1983; Zorza, 1992). Drivers of this cultural shift included the battered women’s movement (Fedders, 1997; Lerman & Livingston, 1983), several prominent lawsuits against police departments for failing to protect victims of DV (e.g., Scott v. Hart, U.S. District Court for the Northern District of California, C76–2395 [1976]; Thurman v. City of Torrington, 595 F. Supp. 1521 [1984]; Buzawa & Buzawa, 1990; Zorza, 1992), a rising fear of crime and coinciding political push for greater criminalization (Coker & Macquoid, 2015; Kim, 2020), as well as the influential Minneapolis Domestic Violence Experiment (MinDVE) conducted by Sherman and Berk (1984). In the MinDVE, over a period of 18 months in two Minneapolis precincts, households to which the police were dispatched to investigate incidents of misdemeanor domestic assault were randomly assigned to arrest, physical separation, or mediation. Sherman and Berk (1984) found that arrest, relative to mediation and forced separation, reduced reoffending and victim-reported violence during a 6-month follow-up period. Though the authors recommended presumptive arrest rather than mandatory arrest and cautioned that replication was needed, this experiment likely influenced the adoption of mandatory arrest policies (Sherman & Cohn, 1989).
Subsequent replication (Spousal Assault Replication Program, SARP) studies were equivocal regarding the efficacy of arrest in the reduction of further violence (Garner et al., 1995); however, many states enacted mandatory arrest laws that are still in force (Chin & Cunningham, 2019). Arguments favoring pro-arrest legislation are largely premised on the assumption that arrest will reduce recidivism (e.g., Sherman & Berk, 1984; Williams, 2005); that is, individuals who are arrested for DV will be deterred from similar behavior in the future. Thus, recidivism, the focus of prior research syntheses (e.g., Garner et al., 1995; Sherman, 1992), remains a pivotal concern for criminologists, policymakers, and law enforcement agencies. In this study, we meta-analyze the relationship between DV arrest and repeat violence, advancing upon the groundwork laid by Hoppe et al. (2020). We extend their analysis by incorporating modern research outside of the classic MinDVE and SARP studies, conducting a risk of bias assessment, and delving into additional moderating factors. However, arrest may also be associated with other important outcomes. Thus, this review also considers other outcomes for victims (e.g., empowerment) and for perpetrators (e.g., mortality risk) associated with arrest for DV.
Another primary aim of this review is to assess the impacts of DV arrest on people of color. The rise of the feminist anti-DV movement coincided with the rise of a conservative law and order movement that harnessed racist fears of violent and dangerous criminals (Pleck, 1989) to garner support for wide-sweeping legislation that resulted in the hyper-incarceration and hyper-surveillance of Black Americans, with devastating effects on individuals, families, and communities (Coker & Macquoid, 2015; Richie, 2015). Some scholars argue that an alliance with the law-and-order movement, resulting in the adoption of pro-criminalization strategies and priorities, contributed to the anti-DV movement’s success (Coker & Macquoid, 2015; Kim, 2020). Further, they note that the anti-DV movement’s reliance on arguments centering white women victims has led to policies and practices which harmfully impact people of color (Coker, 2001; Crenshaw, 1991; Kim, 2013; Richie, 2015; Ruttenberg, 1994). Thus, this review will also examine, wherever the data permit, whether outcomes differ by race/ethnicity and intersections with gender.
This review is structured into three parts. First, we describe our search strategy, screening process, and review procedures and present overarching results. Then, we present a narrative synthesis exploring the impact of arrest across different racial and ethnic groups, focusing on the reporting and analysis of race/ethnicity, as well as its intersection with gender. Finally, we present the results of a risk of bias assessment and meta-analysis conducted on studies examining repeat violence.
Prior research examining the effects of arrest has concentrated on repeat violence, often overlooking other potential consequences of arrest. Moreover, existing meta-analyses (e.g., Hoppe et al., 2020) have focused on a narrow selection of classic studies and explored a limited number of moderators. Additionally, despite the undeniable significance of race in shaping criminal justice experiences and outcomes, meta-analyses to date have not systematically incorporated a racial lens into their analyses. This review addresses these limitations by incorporating modern research examining outcomes other than repeat violence, expanding the depth and scope of previous meta-analyses, and shedding light on the role of race in responses to DV arrest. Through this multifaceted approach, we aim to catalyze advancements in the current literature, leading to more informed and equitable policies and interventions for DV.
Method
Data Sources and Search Strategy
In accordance with the PRISMA-P 2015 statement, we developed a protocol before beginning the literature search. A systematic search was conducted by a librarian collaborating with the research team. Five databases were selected: EMBASE (Excerpta Medica Database), Criminal Justice Abstracts, PubMed, Web of Science, and ProQuest Social Sciences premium. A search was constructed and vetted for each database, involving testing and validating search terms. No database filters were applied as they interfered with the precision and accuracy of the retrieved records. The searches were completed in December 2021. The initial number of results from all databases was 1,379. They were then extracted as RIS files and imported into EndNote for review. In EndNote, 286 duplicates were removed. See Supplemental Appendix A for the final searches. After the full-text review, but prior to extraction, backward and forward citation searching methods (Page et al., 2021) were applied to eligible reports, resulting in six additional relevant reports.
Study Selection
Reports were included in the review if they described one or more quantitative studies (randomized controlled trials [RCTs], quasi-experiments, or observational designs) that (a) included arrest for DV as a focal independent variable (i.e., arrest was the focus of one or more hypotheses), whether alone or in combination with other treatments, wherein arrest was defined as the physical removal and detainment of one or more parties involved in the incident by a law enforcement officer or officers; (b) reported individual-level estimates of the relationship between arrest and one or more outcomes; (c) had a comparison group(s) not “exposed” to arrest or a comparison group(s) assigned to receive an alternative treatment; (d) involved adult samples; and (e) were conducted in the United States or a U.S. territory. Additionally, the report must have been: (f) published in English; (g) published in or after 1984 (MinDVE publication year); and (h) published in a peer-reviewed academic journal. Meta-analyses, qualitative studies, case studies, and studies conducting macro-level examinations of the effects of mandatory arrest policies were excluded from consideration. The specific questions investigated in this review, which informed our selection criteria, were defined in advance using the PICO framework (Population, Intervention, Comparison, Outcome), in alignment with systematic review principles. The review’s scope was carefully defined, considering factors such as feasibility, interpretability, equity, relevance, and impact (Thomas et al., 2024). Our focus on examinations of individual-level arrest, rather than macro-level arrest policy, arose from concerns about inappropriate “lumping” of intervention types and represented a refinement to the original protocol (Thomas et al., 2024). Our aim to identify events for which the causal influence of arrest is plausible and quantifiable drove our decision to focus on quantitative research.
Screening, Full-Text Review, and Data Extraction
Search results were imported into Covidence, an online systematic review management program (https://www.covidence.org/). Title and abstract screening, full-text review, and data extraction were completed within Covidence. Four reviewers individually screened citation titles and abstracts, and then reviewed full-text reports based on the eligibility criteria. Each record was screened by two people, each eligible full-text report was reviewed by two people, and all conflicts were resolved by discussion. Four reviewers independently extracted data using a standardized form. Before starting extraction, calibration exercises were performed to assess the appropriateness and comprehensiveness of the form, as well as reviewer consistency. At least two reviewers performed data extraction for each report. Information was captured regarding sample characteristics (e.g., sample size, location, gender and racial composition), methodological details (e.g., study design, timeframe), and all reported outcomes. See Figure 1 for the PRISMA flow diagram.

Figure 1. PRISMA flow diagram.
Results
After duplicates were removed in EndNote, the search yielded a total of 1,093 records. After a second round of de-duplication within Covidence, 1,051 remaining records were screened, with 947 excluded during the title/abstract screening, and 76 reports excluded during the full-text review. Following the full-text review, six additional reports were identified in reference lists and assessed for eligibility. Thirty-four reports met the inclusion criteria and were included in the review. Strikingly, 62% (n = 21) of the included reports were related to the MinDVE or the SARP experiments, either as an analysis of primary data (n = 5) or as re-analyses (n = 16). Of those analyzing data from the MinDVE and SARP experiments, reports most frequently used data from the Milwaukee (n = 11) and Minneapolis (n = 5) experiments. Across these 21 reports, there was significant overlap in authorship, and the majority (n = 14) were published before 2000.
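The record counts reported above and in the Data Sources and Search Strategy section can be checked with simple arithmetic. The sketch below is only a consistency check: the variable names are illustrative, the number of duplicates removed in Covidence is implied by the text rather than stated, and the final line assumes that all six reports found through citation searching were retained, which the reported total suggests.

```python
# Counts as reported in the text; variable names are illustrative only.
initial_hits = 1379            # results from all five databases
endnote_duplicates = 286       # duplicates removed in EndNote
after_endnote = initial_hits - endnote_duplicates

screened = 1051                # records screened after Covidence de-duplication
covidence_duplicates = after_endnote - screened  # implied, not stated in the text

excluded_title_abstract = 947
full_text_reviewed = screened - excluded_title_abstract
excluded_full_text = 76
from_citation_search = 6       # assumed to be fully retained

included = full_text_reviewed - excluded_full_text + from_citation_search

print(after_endnote, covidence_duplicates, full_text_reviewed, included)
```

Under these assumptions the flow is internally consistent with the 34 included reports.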
Unlike the MinDVE or SARP reports, all of which described RCTs or re-analyses of RCTs, reports unrelated to MinDVE/SARP described studies with quasi-experimental or observational designs (n = 13): four of these studies used data from the National Crime Victimization Survey (NCVS; Cho & Wilke, 2010a, 2010b; Felson et al., 2005; Xie & Lynch, 2017), one used data from the National Violence Against Women Survey (Wilson & Jasinski, 2004), and the remaining studies used regional (e.g., county or city) police or legal records. Sample sizes ranged from small (200–300 cases) to large (>2,000 cases) and depended on the outcome measure used, with smaller samples for MinDVE/SARP analyses involving victim-reported outcomes.
All studies included individuals affected by violence perpetrated by former or current romantic partners, but many studies related to MinDVE or SARP used broad definitions of DV, including violence committed within non-romantic relationships (e.g., siblings, roommates, and friends), though these cases typically comprised a small portion of each sample. In contrast, studies unrelated to MinDVE or SARP tended to restrict samples to cases involving intimate partner violence. MinDVE/SARP studies exclusively focused on misdemeanor DV, whereas studies unrelated to MinDVE/SARP generally did not restrict based on offense class. Across all reports, nine described studies which further restricted to intimate partner violence cases involving male perpetrators and female victims. Despite the actual sample composition, a few reports used misleading terms like “spouse assault” or “spouse abuse” when describing the study (e.g., Berk & Sherman, 1988; Berk et al., 1992a; Dunford, 1992). Several reports (Berk & Newton, 1985; Mears et al., 2001; Steinman, 1990) described DV in “traditional” terms (e.g., referred to perpetrators as “men” and victims as “women” or used terms like “spouse abuse” and “wife battery”) but did not explicitly state the gender composition or types of relationships included in the sample.
Repeat Violence
Reports examining whether victims experienced revictimization or whether perpetrators engaged in new acts of violence or related crimes following the arrest intervention predominated (only five reports examined other outcomes). Tables A2 and A3 in the Supplemental Appendix summarize results. It should be noted that the “Key Findings” column focuses on key takeaways and the most robust findings and, accordingly, masks the complexity of results (i.e., it is biased towards non-null findings) when multiple tests are reported (as is common). For instance, Sherman et al. (1991) performed 30 analyses, with 90 possible pairwise comparisons, of which only 4 tests produced results significant at p < .05, which is not apparent from the table.
Repeat violence was measured using two types of measures: official records and victim reports. Official records included police, jail, or court records detailing offenses, arrests, complaints, or warrants associated with the original perpetrator following the target incident and through the follow-up period. This category also included “hotline” reports (i.e., DV incidents reported by police to a women’s shelter). Some studies restricted definitions of official recidivism to records of DV (e.g., Pate & Hamilton, 1992), while others did not (e.g., Hirschel & Hutchinson, 1992 included any arrest for any subsequent offense by the same perpetrator against the same victim). All in all, 23 reports examined officially-recorded repeat violence.
Importantly, official records reflect documented police actions and not necessarily recurrent violent behavior (Dunford, 1992); they miss instances of abuse that are not reported to or recorded by the police. Thus, many studies also collected victim reports of repeat violence. Fifteen studies examined relationships between arrest and victim-reported repeat violence, eight of which also reported results for officially-recorded repeat violence. Not included in this number are two reports that reference, but report in very limited detail, analyses using victim reports (Berk & Newton, 1985; Pate & Hamilton, 1992). All but three reports (Cho & Wilke, 2010a, 2010b; Felson et al., 2005), which utilized NCVS data, defined victim-reported repeat violence as occurring between the same victim-perpetrator pair as in the original incident.
Other Outcomes
Only a handful of reports (n = 5) examined outcomes other than repeat violence, four of which examined victim outcomes. Miller (2003) tested various hypotheses, including the impact of arrest on victim perceptions of power and safety. While arrest did not enhance feelings of independence, it did increase safety perceptions. Additionally, victim empowerment by police actions varied based on the victim’s preference for arrest. Wilson and Jasinski (2004) observed greater satisfaction with the police among victims following arrest, and Kernic and Bonomi (2007) observed a significant positive link between arrest and the activation of crisis intervention services for female victims. Finally, Sherman and Harris (2013) conducted a longitudinal examination of mortality risk, finding that perpetrators who were arrested had marginally greater odds (OR > 3, p < .10) of dying from homicide 23 years later relative to those not assigned to arrest. In a similar study, Sherman and Harris (2015) conducted an investigation of victim mortality risk, the findings of which are discussed in the following section.
Part I. Impacts of Arrest by Race/Ethnicity
Based on arguments that mandatory arrest has adversely impacted people of color (e.g., Coker, 2001; Richie & Eife, 2021; Ruttenberg, 1994), a narrative synthesis was conducted to explore the impact of arrest across different racial and ethnic groups. Since few reports assessed disparate impacts, this section primarily documents how race/ethnicity and its intersection with gender were reported and analyzed.
Reporting of Race/Ethnicity
Of the 34 reports, 15 included information about the victim’s race or ethnicity. For 8 of the 19 reports that did not include this information, it could be inferred from other sources: either the primary analysis report (in the case of re-analyses) or reports to the National Institute of Justice (Hirschel et al., 1991; Pate et al., 1991). For seven reports conducting re-analyses, including all the multisite studies, victim race/ethnicity could not be determined due to differences in inclusion criteria as compared to the original study (Berk et al., 1992a, 1992b; Gartin, 1995; Johnson & Goodlin-Fahncke, 2015; Maxwell et al., 2002; Miller, 2003; Paternoster et al., 1997). For example, Gartin (1995) re-analyzed the MinDVE data, but since he focused on cases submitted by certain officers, the extent to which the racial/ethnic characteristics of this subsample match those of the original sample is unclear. Similarly, although Miller (2003) reports the racial/ethnic demographics of the context (Dade County, FL, USA) and the original sample, the composition of the subsample is not identified. Berk et al. (1992a, 1992b) report victim race/ethnicity for the Colorado Springs sample but not for the pooled samples. Four remaining reports (Broidy et al., 2016; Cho & Wilke, 2010b; Felson et al., 2005; Steinman, 1990) did not indicate the racial/ethnic composition of victims in their samples.
Of the 34 reports, 21 included information on perpetrator race or ethnicity. For 4 of the 13 reports not reporting this information, it could be inferred based on other sources. Again, the extent to which cases analyzed by Berk et al. (1992a, 1992b), Gartin (1995), and Johnson and Goodlin-Fahncke (2015) overlapped with the original sample was unclear. Five remaining reports (Cho & Wilke, 2010a, 2010b; Mears et al., 2001; Steinman, 1990; Wilson & Jasinski, 2004) did not report racial/ethnic information for perpetrators in their samples.
Altogether, after filling in information from other sources, there were a total of 22 reports with incomplete or absent information about race/ethnicity. Of those reporting race/ethnicity, several reports included only the proportion of one or two racial/ethnic groups, usually the largest, rather than detailed breakdowns (e.g., Berk & Newton, 1985; Felson et al., 2005; Sherman et al., 1991; Wilson & Jasinski, 2004).
Table A4 in the Supplemental Appendix reports the racial/ethnic demographics of victims and perpetrators in each study sample. Due to data overlap in the MinDVE/SARP reports and challenges determining sample composition for re-analyses, numbers are presented by experimental site, rather than by published report, with notes indicating information sources. Inspection of Table A4 in the Supplemental Appendix reveals the relevance of race/ethnicity in discussing DV arrest impacts: Black/African American individuals comprise a sizable proportion of both victims and perpetrators, particularly in the MinDVE/SARP experiments, and their representation also exceeds their share of the U.S. population (13.6%; U.S. Census Bureau, 2023) in most non-MinDVE/SARP studies. Conversely, some groups, such as Asian/Pacific Islanders, are rarely represented, and others, like Latinx/Hispanics, may be underrepresented based on current population estimates.
Analyses of Race/Ethnicity
Eighteen reports controlled for race or ethnicity (Victim Only: n = 4; Suspect Only: n = 10; Both: n = 4) when estimating the effects of arrest, half of which dichotomized race (most often, Black and white). In 11 of the 16 remaining reports, subjects were randomized to condition, negating the need to control for race/ethnicity (assuming baseline equivalence, though controls are arguably still important when testing interactions between arrest and non-randomized variables). Three additional reports appear to have evaluated race/ethnicity as a covariate, but it was either excluded from the final statistical model, or there was not sufficient information to determine its inclusion. Finally, two observational studies made no mention of controlling for race or ethnicity.
Although many studies controlled for race and/or ethnicity, only six reports performed subgroup analyses involving race, most of which focused on repeat violence as an outcome. Out of the six, two stated that race/ethnicity was tested as a moderator, but results were not significant, and they provided no further detail (victim race/ethnicity: Mears et al., 2001; perpetrator race: Sherman et al., 1992b). One additional study found that suspect race/ethnicity did not significantly modify the arrest-repeat violence relationship (Felson et al., 2005).
Berk et al. (1992b) examined arrest effects by marital status, employment, and race of the perpetrator (defined as Black, white and/or Latinx) using pooled data from four SARP sites. Overall, Black perpetrators had higher odds of repeat violence following arrest (OR: 1.11) than did white and Latinx/Hispanic suspects (OR: 0.91), but this varied by marital status and employment. Black perpetrators who were neither married nor employed had among the highest odds of repeat violence (OR: 1.65), but white and Latinx perpetrators who were married but not employed also had high odds of repeat violence (OR: 1.65). Married and employed Black perpetrators had similar (reduced) odds of repeat violence compared to white and Latinx/Hispanic suspects (ORs of 0.82 and 0.80, respectively). Overall, there is no clear pattern of disparate impact by race. Employment is associated with reduced future violence across groups, but differences emerge in the relationship between marriage and repeat violence for the unemployed.
Sherman et al. (1992a) characterize race as an indicator of stake in conformity (SIC). The stake-in-conformity hypothesis (Toby, 1957) predicts that formal sanctions like arrest will most effectively deter those with the strongest ties to society, because they have the most to lose from deviant behavior (Piquero et al., 2011). They find that race, like other SIC indicators (e.g., employment), modified the effects of arrest. Specifically, differential impacts of arrest were observed when examining the frequency of violence, with results in the direction of escalation for Black perpetrators but deterrence for white perpetrators. The authors note:
The difference in reaction to full arrest between Blacks and whites is startling. The fact that 10,000 arrested whites produce 2,504 fewer acts of domestic violence a year than warned whites, while 10,000 arrested Blacks produce 1,803 more acts of violence per year than warned Blacks, is a far larger magnitude than we ever expected. If three times as many blacks as whites are arrested in a city like Milwaukee, which is a fair approximation, then an across-the-board policy of mandatory arrests prevents 2,504 acts of violence against primarily white women at the price of 5,409 acts of violence against primarily Black women. (p. 160)
Although these estimates appear to be unadjusted for covariates, differences in the effects of arrest by race are observed even after accounting for SIC indicators, including employment.
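The arithmetic behind the quoted passage can be reproduced directly from its figures. The sketch below uses only the numbers in the quotation; the variable names are illustrative, and the net figure on the last line is a simple subtraction that the quoted passage does not itself report.

```python
# Figures quoted from Sherman et al. (1992a, p. 160); names are illustrative.
change_per_10k_white_arrests = -2504   # fewer acts per year vs. warned whites
change_per_10k_black_arrests = 1803    # more acts per year vs. warned Blacks
black_to_white_arrest_ratio = 3        # the authors' approximation for Milwaukee

# For every 10,000 white arrests, the ratio implies 30,000 Black arrests.
prevented = -change_per_10k_white_arrests
added = black_to_white_arrest_ratio * change_per_10k_black_arrests

# Not reported in the quotation: the implied net change under these figures.
net_change = added - prevented

print(prevented, added, net_change)
```

The second value matches the 5,409 acts cited in the quotation.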
Sherman and Harris (2015) identified victim race and employment as “powerful moderators” of the impact of arrest on victim mortality. Twenty-three years after the Milwaukee Domestic Violence Experiment, they matched death records to the original victims (who were primarily women), finding that employed Black victims were at higher risk of premature mortality following a partner’s arrest than white victims. Specifically, arrest was associated with a significantly increased risk of mortality for African American (by 98%) but not white (by 9%) victims; further, employed Black victims were at the highest risk of premature mortality following a partner’s arrest (11% died) relative to those whose partner received a warning (0%).
Intersections Between Race/Ethnicity and Gender
Studies reporting on the intersection of gender and race did so only incidentally because they excluded women perpetrators and/or men victims (n = 9; Cho & Wilke, 2010a; Kernic & Bonomi, 2007; Lyons et al., 2019; Maxwell et al., 2002; Miller, 2003; Pate & Hamilton, 1992; Paternoster et al., 1997; Syers & Edleson, 1992; Tolman & Weisz, 1995). Perpetrator race/ethnicity was more often noted than victim race/ethnicity in these cases. Studies which restrict to men perpetrators and women victims are indicated in Table A4 in the Supplemental Appendix. No study performed subgroup analyses by gender and race/ethnicity, but since women make up the majority of victims in many samples, studies like Sherman et al. (1992a) and Sherman and Harris (2015) shed light on the impacts of arrest for Black and white women.
In summary, the examination of the impacts of arrest by race/ethnicity reveals significant gaps in existing literature. This section explored the reporting of race/ethnicity when describing sample demographics, as well as analyses of race/ethnicity and its intersections with gender. The next section describes a meta-analysis of the arrest-repeat violence relationship, addressing the role of race as a potential moderator.
Part II. Meta-Analysis of Arrest and Repeat Violence
A risk of bias assessment and meta-analysis were performed, focusing on officially-recorded repeat violence in the case of RCTs that reported multiple outcome measures. This approach was chosen because many reports relied heavily on official records, considering them more reliable and less biased (e.g., Berk & Sherman, 1988). Further, this approach enabled us to derive a single effect size per study. Moreover, given the overlap in results across reports detailing the same study, these efforts focused solely on the 17 distinct studies.
While many studies compared arrest to non-arrest (either the absence of arrest or a combination of treatments not involving arrest), various interventions were tested, including informal mediation, issuance of a warning, and forced separation. In some cases, arrest was combined with other interventions (e.g., protection orders or court-ordered treatment). Consistent with prior research (e.g., Hoppe et al., 2020; Maxwell et al., 2002), arrest was operationalized as a dichotomous variable, in order to compare interventions involving arrest to those without arrest.
Across studies, definitions of repeat violence were heterogeneous, differing in scope (sometimes restricted to interpersonal violence, other times including property damage), the target of violence (same victim, any victim), the perpetrator of violence (same perpetrator, different perpetrator), time to follow-up, metric (prevalence, incidence, time to failure), incident seriousness (misdemeanor offenses, felony offenses), and police involvement (e.g., perpetrator was arrested/a complaint was filed versus police were not notified). As others have noted (e.g., Garner et al., 1995), these differences complicate across-study comparisons and likely contribute to between-study heterogeneity. In order to determine whether we could identify any moderators, we coded each study for many of these variables (described below), including violence definition, target of violence, and time to follow-up, which were then submitted to a random forest procedure to identify important moderators. Given our interest in differential impacts of DV arrest, ideally, the meta-analysis would test whether effect sizes differed across racial subgroups. However, due to limited availability of data on subgroup differences, an alternative approach was adopted. Studies reporting victim or perpetrator race were meta-analyzed to explore whether either variable acted as a moderator. This decision was guided by previous research highlighting the importance of considering sample racial composition (specifically, the proportion of Black victims and perpetrators) as a potential explanation for divergent findings in the literature (e.g., Sherman, 1992). We further focused on Black victims and perpetrators, as this category was one of the most consistently reported. This focus also reflects the longstanding history of race-based policing and systemic discrimination within the U.S. criminal justice system, which has disproportionately and profoundly affected African Americans (Coker & Macquoid, 2015; Richie & Eife, 2021).
Method
Risk of Bias Assessment
One reviewer assessed the risk of bias of included studies using the Cochrane Risk of Bias 2 tool (RoB 2 tool) for randomized controlled trials or the Risk of Bias in Non-randomized Studies of Interventions tool (ROBINS-I tool) for non-randomized studies (Higgins et al., 2024; Sterne, 2016). Assessments were checked by a second reviewer. The effect of interest was the effect of assignment to the intervention, regardless of whether the interventions were received as intended (the “intention-to-treat effect”). We assessed the risk of bias for the main outcome of interest in each study, which was repeat violence or recidivism by the perpetrator. We answered signaling questions as either yes, probably yes, probably no, no, or no information, as per the guidance. We used these to determine the overall risk of bias for each domain and the overall risk of bias for the outcome from the included studies (high, some concerns, low).
Meta-Analysis
A meta-analysis was conducted to investigate the impact of DV arrest on repeat violence. To derive log odds ratios, our chosen effect size metric, we utilized frequency tables either readily available or reconstructed from information provided in the reports. When raw frequencies were unavailable, we utilized adjusted estimates or frequencies extrapolated from plots using plot digitizer software, as described by Kadic et al. (2016). When multiple effect sizes were available within a study, preference was given to those with the longest follow-up duration and the largest sample size. A random-effects model, which does not assume a true common effect size across studies (Borenstein et al., 2010), was utilized because substantial heterogeneity was expected given the diversity of research designs, samples, and interventions. Analyses employed the inverse variance method, in which effect sizes with smaller variances are given greater weight during pooling than those with less precision. Heterogeneity was evaluated using the Q and I² statistics, while publication bias was evaluated using a funnel plot, Egger’s test (Egger et al., 1997), and the trim and fill method (Duval & Tweedie, 2000). The R packages meta (Balduzzi et al., 2019) and metafor (Viechtbauer, 2010) were used to perform analyses.
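The pooling steps described above can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the analyses were run with the R packages meta and metafor, the between-study variance estimator is not stated in this section (the DerSimonian-Laird estimator is assumed here), the 0.5 continuity correction is an assumption, and the 2x2 counts are hypothetical.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Log odds ratio and sampling variance from a 2x2 frequency table.

    a, b: arrest group with / without repeat violence
    c, d: comparison group with / without repeat violence
    """
    # Assumption: add 0.5 to every cell when any cell is zero, a common
    # continuity correction; the source does not say how zeros were handled.
    if min(a, b, c, d) == 0:
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return log_or, variance


def pool_random_effects(effects, variances):
    """Inverse-variance random-effects pooling (DerSimonian-Laird tau^2)."""
    k = len(effects)
    w = [1 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # I^2: share of total variability attributed to heterogeneity, in percent.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Random-effects weights add tau^2 to each study's sampling variance.
    w_star = [1 / (v + tau2) for v in variances]
    mu = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return {"log_or": mu, "se": se, "q": q, "df": df, "tau2": tau2, "i2": i2}


# Hypothetical studies as (a, b, c, d) counts; not data from the review.
tables = [(10, 90, 20, 80), (30, 70, 25, 75), (12, 48, 15, 45)]
effects, variances = zip(*(log_odds_ratio(*t) for t in tables))
result = pool_random_effects(effects, variances)
pooled_or = math.exp(result["log_or"])
ci = tuple(math.exp(result["log_or"] + z * result["se"]) for z in (-1.96, 1.96))
```

The I² formula used here, (Q − df)/Q, agrees with the values reported in the Results: with Q = 53.20 on 16 degrees of freedom, (53.20 − 16)/53.20 ≈ 69.9%.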
Analysis was conducted in several stages. Initially, a meta-analysis stratified by research design was conducted, in order to assess differences across RCTs and non-RCTs and present the risk of bias findings. Subsequently, the MetaForest R package (Van Lissa, 2020) was used to implement an exploratory, machine-learning-based approach to identify potential moderators. This method adapts the random forests algorithm to generate and combine tree models from weighted bootstrapped samples to perform variable selection, thus avoiding the risk of overfitting associated with traditional methods (e.g., including all moderators in a meta-regression). Following this, a meta-analytic model incorporating the moderators suggested by the MetaForest procedure was constructed.
The following variables were coded for each study to evaluate as potential moderators: (a) method (RCT or non-RCT); (b) overall risk of bias (moderate or serious); (c) number of arms/conditions (ranging from 2 to 4); (d) comparator (whether arrest was compared to the absence of arrest [e.g., “non-arrest”] or an alternative treatment 11 ); (e) time to follow up (6 to 48 months); (f) DV definition (narrow or broad, as in Table A1 in the Supplemental Appendix); (g) victim sex (female only or other); (h) outcome source (official records or victim self-report); (i) scope of violence (actual or attempted physical violence or included other forms of violence); (j) target victim (whether repeat violence targeted the same or any victim); (k) sample size; (l) years since the first report on the study included in the review was published; (m) whether or not effect sizes were calculated based on crude frequencies gleaned from plots; and (n) whether or not effect sizes were based on estimates adjusted for covariates. The proportion of Black victims and Black perpetrators within a sample were also evaluated as moderators in a subset of studies reporting this information.
Our analysis using a MetaForest procedure with 10,000 bootstrapped samples aimed to identify moderators to explain the significant heterogeneity in effect sizes observed across studies. To exclude irrelevant moderators, we implemented a recursive variable pre-selection strategy, replicated 100 times. We retained only moderators demonstrating positive variable importance in over half of the replications. This criterion was met by only one moderator: comparator, which indicates whether arrest was contrasted with the absence of arrest or an alternative intervention. Optimal values for tuning parameters were determined by performing 10-fold cross-validation to minimize prediction error. Finally, two additional models were constructed, focusing on a subset of studies reporting race/ethnicity, in order to evaluate victim and perpetrator race as potential moderators.
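The retention rule described above (keep a moderator only if its variable importance is positive in more than half of the replications) can be written as a small filter. This is a sketch of the rule as stated, not of the MetaForest package itself; the function name, the input layout, and the importance values are hypothetical.

```python
def retained_moderators(importance_by_replication, threshold=0.5):
    """Return moderators whose importance is positive in more than
    `threshold` of the replications.

    importance_by_replication maps a moderator name to a list holding one
    variable-importance value per replication (hypothetical layout).
    """
    retained = []
    for name, values in importance_by_replication.items():
        share_positive = sum(v > 0 for v in values) / len(values)
        if share_positive > threshold:
            retained.append(name)
    return retained


# Hypothetical importances over four replications (the review used 100).
example = {
    "comparator": [0.02, 0.01, -0.01, 0.03],
    "sample_size": [0.01, -0.02, -0.01, 0.0],
    "follow_up": [0.01, 0.02, -0.01, -0.02],
}
kept = retained_moderators(example)
```

In this toy input, a moderator that is positive in exactly half of the replications is dropped, because the rule requires more than half.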
Results
Risk of Bias
Of the included randomized studies, we judged four to raise some concerns overall for the outcome of repeat violence/recidivism (Dunford et al., 1990a; Hirschel et al., 1991; Pate & Hamilton, 1992; Sherman et al., 1991). Two studies were judged to be at high risk of bias (Berk et al., 1991; Sherman & Berk, 1984). All of these studies raised concerns relating to the selection of the reported result, either because the analysis plan was not pre-specified or because multiple measurements and analyses of the outcome were reported. All studies were also unable to conceal the assignment of the intervention from the participants. One study at high risk of bias failed to appropriately randomize participants and deviated from the intended interventions (Berk et al., 1991 12 ). The other study at high risk of bias did not appropriately conceal the intervention from police officers before it was assigned (Sherman & Berk, 1984). By comparison, the risk of bias would have been more severe for victim-reported repeat violence due to missing outcome data and issues with measurement of the outcome (e.g., potential for recall bias).
Of the included non-randomized studies, we judged two to raise some concerns overall relating to the risk of bias (Berk & Newton, 1985; Felson et al., 2005). The other nine non-randomized studies were at high risk of bias (Broidy et al., 2016; Cho & Wilke, 2010a, 2010b; Lyons et al., 2019; Mears et al., 2001; Steinman, 1990; Syers & Edleson, 1992; Tolman & Weisz, 1995; Xie & Lynch, 2017). Broidy et al. (2016) and Xie and Lynch (2017) were initially judged to have some concerns related to confounding, but we present them as high risk since we could not utilize adjusted estimates. Thus, in our analysis, all non-randomized studies had high risk relating to confounding the effect of the intervention, and only Berk and Newton (1985) and Felson et al. (2005) accounted for this appropriately in their analyses. The detailed risk of bias assessments are available on request.
Meta-Analysis
Figure 2 presents a forest plot depicting a random-effects model analyzing the relationship between arrest and repeat violence. Overall, arrest was found to reduce the odds of repeat violence slightly; however, this effect did not reach statistical significance (OR: 0.86, 95% CI [0.72, 1.02]). In subgroup analyses, while arrest significantly reduced repeat violence in non-randomized studies (OR: 0.79, [0.63, 0.98]), randomized studies did not show a significant reduction in repeat violence (OR: 1.02, [0.78, 1.35]). Despite substantial heterogeneity between the two subgroups of different designs (I² = 70%, p < .01), there was no statistically significant difference in their pooled effects (p = .07). There was substantial heterogeneity among all studies included in the meta-analysis (I² = 69.9%; Q(16) = 53.20, p < .001), a large proportion of which could be attributed to the studies with non-randomized designs (for which I² = 71%, p < .01, as opposed to randomized studies with I² = 35%, p = .18). Additionally, the prediction interval for the treatment effect in a new study (Nagashima et al., 2019) ranged from 0.45 to 1.63, reflecting considerable variability and encompassing both potential reductions and increases in violence. Thus, the results of this analysis do not support the conclusion that, at conventional levels of significance, arrest consistently reduces violence. Examination of a funnel plot (see Figure 3) indicated relative symmetry in the distribution of effect sizes. This symmetry was corroborated by the trim and fill method, which concluded that no studies were missing due to publication bias, and a non-significant Egger’s test (t(15) = 0.38, p = .71).
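Two of the quantities reported above, the prediction interval and Egger's test, can be illustrated with the following sketch. It makes several assumptions: the prediction interval uses the conventional t-based formula, which may differ from the method in Nagashima et al. (2019) that the authors cite; the critical value is supplied by the caller rather than computed; and all inputs shown are hypothetical rather than taken from the review.

```python
import math


def prediction_interval(log_or, se, tau2, t_crit):
    """Approximate prediction interval for the odds ratio in a new study.

    t_crit is the t quantile with k - 2 degrees of freedom, supplied by the
    caller (about 2.131 for a 95% interval when k = 17).
    """
    half_width = t_crit * math.sqrt(tau2 + se ** 2)
    return math.exp(log_or - half_width), math.exp(log_or + half_width)


def egger_test(effects, std_errors):
    """Egger's regression of standardized effect on precision.

    Returns the intercept, its t statistic, and the degrees of freedom.
    An intercept far from zero suggests funnel plot asymmetry.
    """
    k = len(effects)
    x = [1 / s for s in std_errors]                    # precision
    z = [y / s for y, s in zip(effects, std_errors)]   # standardized effect
    x_bar, z_bar = sum(x) / k, sum(z) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / sxx
    intercept = z_bar - slope * x_bar
    # Residual variance and the usual OLS standard error of the intercept.
    sse = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    se_int = math.sqrt(sse / (k - 2) * (1 / k + x_bar ** 2 / sxx))
    t_stat = intercept / se_int if se_int > 0 else math.inf
    return intercept, t_stat, k - 2


# Hypothetical inputs on the log odds ratio scale.
low, high = prediction_interval(log_or=0.0, se=0.3, tau2=0.16, t_crit=2.0)
intercept, t_stat, df = egger_test(
    effects=[-0.40, -0.10, 0.05, -0.30, 0.20],
    std_errors=[0.10, 0.20, 0.25, 0.15, 0.50],
)
```

The degrees of freedom returned by `egger_test` are k − 2, which matches the t(15) reported above for 17 effect sizes. Because the prediction interval adds the between-study variance to the uncertainty of the pooled estimate, it is wider than the confidence interval whenever tau² is greater than zero.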
Figure 2. Forest plot of meta-analytic results stratified by research design.
Figure 3. Funnel plot.
Selection of Potential Moderators
Our MetaForest analysis final model, comprising 5,000 regression trees and using fixed weights with a minimum node size of six cases, revealed comparator as an important moderator. This was evidenced by an “out-of-bag” R squared of .07, indicating the portion of the variance in effect sizes the model is expected to explain when applied to new data.
Figure 4 presents a forest plot generated from a random-effects model, stratified by comparator. The analysis shows notable subgroup differences. Specifically, while the pooled effect size for studies evaluating alternative interventions showed no meaningful effect in reducing repeat violence (OR: 1.03, 95% CI [0.83, 1.28]), the pooled estimate from studies comparing arrest to the absence of arrest suggested a significant reduction in violence (OR: 0.71, [0.58, 0.87]). Additionally, a meta-regression analysis indicated that comparator significantly accounted for heterogeneity in effect sizes (F(1, 15) = 8.60, p = .01), but the test for residual heterogeneity was also significant (QE(15) = 26.48, p = .03), suggesting other important moderators.
Figure 4. Forest plot of meta-analytic results stratified by comparator.
Assessing Race as a Moderator
Additional random-effects meta-regression models assessed whether the proportion of Black victims or perpetrators within the study samples contributed to heterogeneity in effect sizes. Separate models were necessary due to the partial overlap among studies; some reported proportions for victims, others for perpetrators, but not consistently both. Additionally, these variables were not included in the MetaForest procedure because some studies reported neither.
The proportion of Black victims in a sample significantly predicted effect size (b = 0.93, 95% CI [0.20, 1.65]) and explained a significant amount of heterogeneity in effect sizes (F(1, 10) = 8.05, p < .05). Specifically, arrest deterred violence only when Black victims made up less than a third of the sample, with larger reductions as their proportion decreased. When the proportion of Black victims exceeded a third, the deterrent effect reversed direction, indicating no decrease in violence.
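To make the logic of a threshold such as "less than a third" concrete: in a meta-regression with one continuous moderator, the predicted log odds ratio is intercept + slope × proportion, so the predicted effect changes sign where the proportion equals −intercept/slope. The sketch below fits that line by weighted least squares. It is a simplification under stated assumptions: the between-study variance is passed in rather than estimated, no standard errors or tests are computed, and the data are hypothetical (the fitted intercept is not reported in this section).

```python
def meta_regression(effects, variances, moderator, tau2=0.0):
    """Weighted least-squares line of effect size on a single moderator.

    Weights are 1 / (sampling variance + tau2). tau2 is taken as given,
    which is a simplification; mixed-effects software estimates it.
    """
    w = [1 / (v + tau2) for v in variances]
    sum_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, moderator)) / sum_w
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, moderator))
    sxy = sum(
        wi * (xi - x_bar) * (yi - y_bar)
        for wi, xi, yi in zip(w, moderator, effects)
    )
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def crossover(intercept, slope):
    """Moderator value at which the predicted log odds ratio is zero (OR = 1)."""
    return -intercept / slope


# Hypothetical studies: proportion of Black victims and log odds ratios.
proportions = [0.1, 0.3, 0.5, 0.7]
log_ors = [-0.21, -0.03, 0.15, 0.33]
sampling_vars = [0.04, 0.02, 0.05, 0.03]
b0, b1 = meta_regression(log_ors, sampling_vars, proportions, tau2=0.01)
threshold = crossover(b0, b1)
```

With these made-up numbers the fitted line is −0.3 + 0.9 × proportion, which crosses zero at one third. A positive slope with a negative intercept produces the pattern described above: predicted reductions in violence below the threshold and none above it.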
After incorporating comparator into the model, the two moderators accounted for a significant portion of the variance in effect sizes (F(2, 9) = 4.89, p < .05), and the test for residual heterogeneity was no longer significant (QE(9) = 14.80, p = .10); however, neither coefficient was statistically significant. This is likely due to considerable overlap between the two variables (see Figure 4), such that studies comparing arrest to alternative interventions had a higher proportion of Black victims (0.23 and above) than those comparing arrest to the absence of arrest (0.25 and below). Thus, it is difficult to determine the individual impact of each variable. Finally, the proportion of Black perpetrators, in contrast, did not significantly predict effect size.
Discussion
The present review reveals that remarkably, much of the quantitative research on the impacts of DV-related arrest is from prior to 2000, and much of it is related to the MinDVE or SARP studies. Even some of the most recently published reports utilized data collected during this period, resulting in considerable overlap in the data sources used to generate findings. There is also a lack of quantitative research on the relationship of arrest with outcomes other than repeat violence. Therefore, a straightforward implication of this review is that new research utilizing data from more recent periods and collecting data on a greater range of outcomes is needed.
Furthermore, despite arguments that carceral feminism and policies like mandatory arrest harm people of color, especially women of color (Coker, 2001; Crenshaw, 1991; Kim, 2013; Richie, 2015; Ruttenberg, 1994), few reports assess the extent to which effects of arrest depend on race or ethnicity, and none explicitly examine their intersection with gender.
Research published since the MinDVE/SARP has continued to document an inconstant relationship between arrest and repeat violence. This is echoed in the findings of our analysis, which does not support a consistent deterrent effect. Our findings align with a recent meta-analysis by Hoppe et al. (2020) of an overlapping body of studies, as well as an earlier study by Maxwell et al. (2002), which pooled data from five SARP sites and found that while the effects of arrest on officially-recorded violence were in the direction of deterrence, they were not statistically significant. Notably, our study underscores that there is significant heterogeneity in the arrest-repeat violence relationship that remains unexplained. This finding aligns with multiple reports in our review that found that the effects of arrest depend on various factors, as well as the conclusion that arrest likely deters some offenders some of the time 13 (Piquero et al., 2011; Sherman, 1992). Critically, although our findings align with Hoppe et al. (2020), our study builds upon their work by including a wider array of recent research, applying a race-conscious lens, conducting a risk of bias assessment, and exploring additional moderating factors.
Outcomes Other than Repeat Violence
A goal of the review was to identify other outcomes associated with arrest, but studies examining repeat violence dominate this research area. Studies addressing other outcomes did provide some support for arguments that arrest has benefits beyond the reduction of future violence (e.g., Stark, 1993). For instance, arrest was associated with victims’ enhanced perceptions of legal power (when the victim wanted the arrest, Miller, 2003), the activation of crisis intervention services (Kernic & Bonomi, 2007), greater victim satisfaction with police handling of the case (Wilson & Jasinski, 2004), and enhanced feelings of safety following intervention (Miller, 2003).
There is a continued need for a comprehensive evaluation of DV arrest (Maxwell et al., 2002), which takes into consideration not only its efficacy in reducing violence, but its impact on a range of (offender and victim) outcomes, including those psychological in nature (e.g., perceived normative acceptability of violence and perceptions of fair treatment) as well as material (e.g., socioeconomic resources, housing stability, and state surveillance), to develop a fuller picture of its risks and benefits. Therefore, this review highlights the importance of examining outcomes other than repeat violence in research on the effects of DV arrest (Stark, 1993).
Impacts of Arrest by Race
Although many studies reported the racial and/or ethnic composition of their samples, some did not. And while some studies reported information at the intersection of gender and race, this was a byproduct of how violence is commonly defined in this literature—as experienced by women at the hands of men within the context of a heterosexual relationship, which has implications for the inclusion of forms of DV that do not fall under this definition. Documenting the characteristics of a sample (including race, ethnicity, gender, and their intersections) is important because it allows for the evaluation of representativeness, ensuring that results can be generalized to the population being studied. Further, reporting on these categories is essential for identifying and tracking disparities. Accordingly, future researchers should provide detailed information about the gender, race, and ethnicity of victims and perpetrators, ideally obtained through self-report rather than based on officer perceptions (Laniyonu & Donahue, 2023).
This review also identified that some racial and ethnic groups were rarely represented (Asian/Pacific Islander) or underrepresented (e.g., Latinx) in study samples. To avoid further marginalizing these groups, researchers should make concerted efforts to enhance their representation in future studies. Researchers might also measure legal status, as individuals who are undocumented, particularly those of color, face unique risks of police intervention (Crenshaw, 1991). While race was often included in statistical models, it was frequently dichotomized, and groups outside of this dichotomy were sometimes excluded from the sample. It is recommended that future researchers avoid such practices and instead use more nuanced approaches to capture the diversity of racial and ethnic identities in their samples. Researchers also might consider incorporating constructs relating to discrimination or oppression instead of relying solely on demographic categories (Bowleg, 2008).
Finally, only a few studies performed subgroup analyses to determine whether the effects of arrest differed by race/ethnicity, and none disaggregated effects by gender. Most analyses focused on Black and white individuals, with limited inclusion of other racial or ethnic groups. Several studies provided evidence that arrest has harmful effects on Black victims, particularly Black women (e.g., Coker, 2001; Ruttenberg, 1994), finding that arrest was associated with an increased risk of violence among Black perpetrators (which, assuming intra-racial violence, increases Black women’s risk of victimization) and increased risk of premature mortality among employed Black women. However, other studies found no evidence of differences, though most provided very little detail, making them difficult to evaluate. Our moderation analysis also suggests that arrest may be harmful or at least less beneficial for Black victims, most of whom are women. Overall, these findings highlight a significant gap in the research, underscoring the need for quantitative research that moves beyond reductive approaches to race and gender to center the experiences of Black women (Bowleg, 2008; Crenshaw, 1991; Gómez, 2023).
In an era marked by heightened public awareness of racially-disparate policing, including brutal violence committed by police officers toward innocent people of color, the assumption that the experience of being arrested, or having a partner arrested, is uniform across racial groups must be problematized. To improve our understanding of how arrest affects racial/ethnic groups, future research must conduct a priori planned moderator and/or subgroup analyses using richer racial and ethnic categories. Further, building on the theoretical groundwork of feminist scholars like Crenshaw (1991) and Richie (2015), quantitative work in this area should also examine how DV arrest contributes to the intersectional subordination of women of color.
Moderators of Effect of Arrest
We explored various factors as possible moderators using a novel machine learning technique; however, only the type of comparator used to assess the effects of arrest emerged as a significant moderator. Specifically, arrest appears to deter future violence relative to the absence of arrest, but not in comparison with alternative treatments. One possible conclusion is that arrest is better than doing nothing, if not better than other interventions. However, it is unclear what police or judicial actions were administered in the absence of arrest, and there is almost certainly overlap with some of the interventions, particularly those that tested “informal” responses like mediation or separation. Given that all studies comparing arrest with non-intervention were non-RCTs and mostly deemed high risk for bias, these comparisons are almost certainly “noisier” and pose higher risks to baseline equivalence between groups. Additionally, our decision to aggregate alternative interventions may have masked distinct impacts of individual interventions.
Another significant finding from our analysis is that racial composition within the sample was associated with the effectiveness of arrest. Reductions in DV following arrest of the perpetrator were observed only in samples with proportionally few Black victims. Given that the majority of victims were women, this finding is consistent with scholars’ claims that mandatory arrest policies do not benefit Black women (Coker, 2001; Richie & Eife, 2021; Ruttenberg, 1994). Interestingly, the proportion of Black perpetrators did not significantly predict the effect size, despite the majority of included studies focusing on racial differences from the perspective of perpetrators. However, these analyses suffer from limitations, such as incomplete data, and are not ideal for assessing disproportionate impacts, as correlated demographic or contextual characteristics within the sample could be contributing to observed relationships.
Our study provides valuable insights into the moderators of the arrest-repeat violence relationship but has several limitations. The overlap between moderating variables complicates interpretation and underscores the need for future research to disentangle their impact. Thus, beyond collecting comprehensive race/ethnicity data and conducting targeted subgroup analyses, future studies should assess different intervention components. Collecting detailed data on police actions presents practical challenges, yet it is essential for non-RCTs to accurately capture and control for police and judicial actions, which would help clarify the effectiveness of arrest and identify alternatives. Advanced methodologies like network meta-analysis should be employed to isolate and critically compare different elements of interventions. Experimental researchers should design studies to separate specific intervention components and assess their individual impacts.
It is also possible that MetaForest failed to identify important moderators or that the moderator identified is not reliable. Simulation studies (Van Lissa, 2017) suggest that its performance decreases under specific conditions, such as when the number of studies or the effect size is small, there are a large number of uncorrelated moderators, and residual heterogeneity is very large. However, in our analysis, the number of studies and moderators, as well as the heterogeneity observed, was within the range of the thresholds used in simulations (Van Lissa, 2017).
In lieu of availability of raw data on the event frequencies of interest, we resorted to extrapolating the respective data from plots for two studies, which carries a risk of imprecision (Burda et al., 2017). Additionally, the sample of studies included frequency data reconstructed from adjusted and unadjusted models, thereby introducing some heterogeneity. However, the latter was not found to be a significant predictor of the effect in the meta-regression model. We also recognize that focusing on the prevalence of repeat violence, when multiple measures (e.g., frequency, time to failure) were often available, may have affected our results. Thus, future meta-analyses might wish to incorporate hierarchical techniques that can account for dependence among estimates within the same study.
Finally, it is important to acknowledge that our analysis did not exhaustively test all possible moderators, and undoubtedly, other factors influence the effectiveness of arrest. Studies included in this review have identified potential factors not included in our analysis, such as certain characteristics of officers, victims, and perpetrators; coordinated community response efforts; procedural justice; and characteristics of offenders’ community contexts. However, few of these variables have been consistently examined, with the exception of stakes in conformity (SIC). Thus, it is important to broaden the scope of future research to include other potentially influential factors.
Implications for Research, Policy, and Practice
Methodological Implications
Given that the methodological limitations of the MinDVE and SARP studies (particularly the MinDVE) have been previously discussed (e.g., Binder & Meeker, 1988; Garner et al., 1995), we will refrain from restating the same critiques. Nevertheless, it is important to note that multiple statistical tests were often conducted using liberal alpha levels (e.g., p < .10), and no effort was made to correct for the number of tests (e.g., Sherman et al., 1991), which, as noted by Garner et al. (1995), increases the risk of false positives. These practices, coupled with repeated re-examinations of the same data to answer related research questions, further elevate the potential for false positives (Thompson et al., 2020).
Some observational studies included limited sets of controls (e.g., Syers & Edleson, 1992; Tolman & Weisz, 1995), raising concerns about omitted variable bias. Observational studies examining recidivism should adjust for offenders’ prior offense records, given that a prior record consistently predicts both arrest and repeat violence, yet at least three of the reports did not. The most rigorous observational studies (e.g., Felson et al., 2005) controlled for a range of perpetrator, victim, and situational characteristics. Overlooked possible confounders include perpetrator presence at the scene at police arrival (Dunford et al., 1990b; Hirschel & Buzawa, 2013); perpetrator pre-arrest risk (which overlaps with prior record; Berk & Newton, 1985; Buzawa & Buzawa, 1990; Hilton et al., 2007); perpetrator characteristics and attitudes (Buzawa & Buzawa, 1990; Gartin, 1995); neighborhood characteristics (Hirschel & Buzawa, 2013; Marciniak, 2021; Xie & Lynch, 2017); and victim preferences in places without mandatory arrest (Buzawa & Buzawa, 1990).
Researchers will likely continue to use data from the MinDVE and SARP studies because these are the only experimental data available. While informative, continued reliance on these data is not advisable for at least two reasons besides the one already stated. First, research questions are limited in scope by the data collected; and second, police and criminal justice practices may have changed in the intervening years, which limits generalizability to present conditions. Therefore, funders should consider prioritizing new experimental studies. However, while field experiments are superior in their ability to establish causality, they are rife with complications pertaining to ethicality, funding, implementation, oversight, etc. In the event research initiatives such as MinDVE or SARP are not funded in the future, researchers must make do with non-experimental data. Moving forward, this area of research would benefit from the cultivation of new datasets that permit multifaceted examinations of arrest, as well as the greater application of techniques designed to isolate the effect of arrest in non-experimental data (Table 1).
Table 1. Critical Findings and Implications for Practice, Policy, and Research.
Theoretical Implications
Criminological theories of deterrence and deviance underlie most of the work in this area, while other mechanisms of violence reduction have not been examined. This has meant a focus on the individual perpetrator’s behavior over the victim, family, or community. Two recent meta-analyses (Pratt & Cullen, 2005; Pratt et al., 2006) argue that the empirical evidence in support of deterrence theory is weak, and the “deterrence perspective—by itself—falls well short of being a theory that should continue to enjoy the allegiance of criminologists” (Pratt et al., 2006, p. 385). Pratt et al. (2006) propose that deterrence theory be integrated with more comprehensive perspectives instead of being wholly discarded and that researchers focus on the conditions under which deterrence is likely (Piquero et al., 2011), which challenges the wisdom of continued reliance on this foundational framework.
Overall, quantitative work in this area would benefit from an infusion of new perspectives that consider the broader social and ecological context, prioritize examining power dynamics and inequalities, and center the experiences of the victim. For instance, reformulation of assumptions undergirding commonly-utilized frameworks (e.g., stakes in conformity, Toby, 1957) may be warranted, particularly value-laden assumptions around race, poverty, and conformity to social institutions designed to benefit the dominant culture. Enhanced recognition of how structural barriers limit opportunity structures and benefits for conformity, as well as how conformity to dominant institutions may beget or reinforce violence (Godenzi et al., 2001) would also benefit work in this area. Beyond individual differences, structural and cultural aspects of communities should be taken into account (Marciniak, 2021). By adopting these approaches, researchers can broaden their understanding of how arrest affects various individuals and communities.
This systematic review focused on quantitative outcomes associated with arrest. However, this focus has consequential disadvantages, particularly as it relates to the exclusion of qualitative studies. Qualitative research is uniquely capable of providing rich descriptions of the complex processes and practices at play in police responses to DV incidents, thus offering critical insights into the experiences and perspectives of those involved. Qualitative and mixed approaches are essential for a comprehensive understanding of arrest outcomes, as they incorporate critiques of police responses and survivors’ lived experiences, and shed light on how intersections of race, gender, and other identities interact to influence experiences of arrest. Recognizing this, we advocate for future quantitative research to integrate insights gleaned from qualitative approaches to inform theory development and to guide the replication and interpretation of findings.
Policy Implications
Important questions remain that need to be answered before determining whether mandatory arrest policies should be reconsidered or significantly altered. It is crucial that researchers and policymakers extend evaluation beyond the outcome of repeat violence and consider the potential for disparate impacts. Any proposed alternatives must be rigorously tested, taking into account the potential benefits of arrest policies (e.g., victim empowerment and safety, Miller, 2003), to mitigate unintended consequences. To facilitate this research, it is essential that funding support the development of new data sources and projects. And if the research bears out that arrest policies benefit privileged groups but harm marginalized ones, we must critically assess their ethical justification (e.g., Goodmark, 2018; Sherman, 1992). Critical to this examination, beyond whether arrest works, researchers and policymakers must consider “what works for whom, in what circumstances, in what respects and how” (Pawson et al., 2005). While this review has focused on impacts of arrest by race, considering impacts on other marginalized identities (e.g., LGBTQIA+; Durfee & Goodmark, 2020) is essential. Future quantitative research should therefore adopt an intersectional approach covering a range of identities to better understand the challenges and implications of arrest policies (Durfee, 2021).
Importantly, since this review focused on individual-level interventions and outcomes, findings do not directly translate to implications about mandatory arrest policies. Even if arrest does not have consistent, robust impacts on individual outcomes, mandatory arrest policies could communicate that DV is socially unacceptable, thus deterring further violence (e.g., Stark, 1993). We must also note that our conclusions are based on the studies included in this review. It is possible that this review did not capture all relevant research. We excluded studies conducted outside the U.S. due to differences in policy contexts, but additional studies conducted outside the U.S. exist that could shed further light on this topic (e.g., Hilton et al., 2007).
Conclusion
This systematic review synthesized nearly 40 years of research on the impacts of arrest for DV in U.S.-based samples. The findings suggest that arrest does not consistently reduce violence, and while it may offer certain benefits to survivors, there is also evidence of harmful impacts on Black women. Future research should prioritize an intersectional approach, examining how power dynamics and inequalities shape experiences of arrest. Additionally, researchers should employ innovative methodologies to critically compare intervention components, assess a broader range of outcomes, and evaluate the contextual efficacy of arrest. By adopting a more nuanced and comprehensive approach, we can better understand and address the complex impacts of DV arrest, ultimately informing more effective and equitable policies.
Supplemental Material
Supplemental material, sj-docx-1-tva-10.1177_15248380241284777 for Outcomes Associated with Arrest for Domestic Violence: A Systematic Review and Meta-Analysis by Rachel A. Connor, Laura Johnson, Matthew Bridgeman, Farhad Shokraneh and Bagrat Hakobyan in Trauma, Violence, & Abuse
Footnotes
Acknowledgements
The authors wish to thank Annelise Mennicke for her guidance on the systematic review process, Meredith Parker for her help with the literature search, and Sarah McMahon for her comments on an earlier version of this paper. We also thank Jamie Pytlik, Paola Ogando, and Leah Stone for serving as research assistants on this project. Finally, we wish to acknowledge Tilly Fox, a systematic reviewer at Systematic Review Consultants LTD, who was one of the two reviewers who assessed the risk of bias for the included studies. She also drafted the methods and results sections on assessing the risk of bias of the studies.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This project was supported by Award No. 2016-MU-CX-K011, awarded by the National Institute of Justice, Office of Justice Programs, U.S. Department of Justice. The opinions, findings, and conclusions or recommendations expressed in this report are those of the authors and do not necessarily reflect those of the Department of Justice.