Abstract
We review the literature on the general solvability of homicides to learn whether characteristics pertaining to the victim, the event and the investigating agency may help predict which unresolved or ‘cold’ cases are likely to be ultimately resolved. We utilize two definitions of when a case becomes ‘cold,’ the conventional or Traditional (unresolved after a year) and the Alternate (unresolved after 30 days). We use binomial logistic regression (BLR) to analyse data collected from the National Incident-Based Reporting System (NIBRS) and the Law Enforcement Management and Administrative Statistics (LEMAS) survey. Several factors (e.g., child victim, location of residence of victim, location where body was found, region of agency, type of investigating agency, agency salary level and educational requirements) are suggestive and need to be examined further.
In cases of criminal homicides, as Leo Tolstoy noted regarding ‘unhappy families’ in Anna Karenina, every event is unique as a personal tragedy that is also a victimological concern and an investigative challenge. If a homicide remains unresolved or becomes what is popularly referred to as a ‘cold’ case, this Karenina Uniqueness shifts to the extreme with both offender accountability and justice for the victim and grieving loved ones jeopardized. Sometimes, these challenges are overcome, and a long-unresolved case is closed. 1 There are factors associated with the victim, the event or the investigative agency which may serve to enhance the likelihood that a cold homicide will eventually be resolved. There has been a surge of recent interest in cold cases among homicide researchers 2 and police investigators. 3 As homicide clearance rates in the USA declined from 91% in 1965 to 62.5% in 2012 to 61.4% in 2019, answers to questions about how case characteristics affect the efficacy and efficiency of unresolved homicide investigations become particularly pertinent.
In general, a cold case is a felony (not subject to a statute of limitation) that remains unsolved because a suspected perpetrator was not identified and prosecuted for committing it. Sometimes, even if a suspect is found and prosecuted, his or her subsequent acquittal may result in the case reverting to cold status. For investigators, renewed interest in resolving cold homicide cases is due to newer forensic analytic techniques for physical evidence and large-scale databases linking offenders to homicides becoming available. As older cases are reconsidered, a discussion about improving the entire investigative process has also emerged. 4 Thus, identifying determinants of cold case homicide resolution may assist in triaging and prioritizing cases for reconsideration based on the likelihood of their clearance. Increasing the odds of clearance will also help assure co-victims (grieving family and friends of the deceased) that the police have not given up and are likely to succeed in their efforts. 5
This article will also address a gap in the existing homicide literature. It is true that the larger topic of homicide case clearance has been studied, 6 but cold homicide cases remain less explored. 7 Davis et al. note that ‘we do not know whether findings [from the general case clearance] can be applied to cold case investigations’. 8 This is relevant in that ‘[c]old cases are emphatically not a representative sample of all crimes, so it is not clear whether the same rules apply to cold cases as apply to other cases’. 9 In this article, we explore whether characteristics of unresolved homicides, such as those pertaining to the victim, location and investigating agency, may make them more or less likely to be resolved.
Analytical Context
We base this analysis on previous research which, while not specifically focused on cold case homicides, examines the impact of a variety of factors on the ‘general’ solvability of homicides, that is, whether a homicide case was eventually resolved. Among the various factors investigated in the general homicide solvability literature are the following:
Victim characteristics 10: These include characteristics such as victim’s age, sex, race, etc.
Motive and circumstances 11: These include whether the homicide took place along with the commission of another felony or may have been committed by a stranger.
Weapon use 12: This includes the type of weapon used to carry out the homicide, such as guns or knives.
Location type 13: These factors focus on where the homicide took place, such as in a residence, business or outdoors.
Temporal effects 14: When the homicide took place has sometimes been considered by focusing on the particular police work shift involved.
Initial analysis of our data indicated that the conceptual definition of ‘cold’ when employed in the context of criminal homicide was critical. This emerged from observing that there was little volatility in the periodic probability of case resolution once the case ages 30 days past the event date (or reporting date, which serves as the event date in approximately 11% of cases). This is largely consistent with the findings of Addington. 15 We recognize that how cold case homicide is defined and operationalized will affect the analysis. Therefore, we undertook a discovery process to isolate a best-fit definition or set of definitions.
Two time-based case clearance categories exist in the literature. We will refer to one as the ‘Conventional’ or ‘Traditional’ (we use these terms interchangeably) definition and the other as the ‘Alternate’ 16 definition. In the ‘Conventional’ case clearance definition, ‘cold’ is defined as a case which has aged 365 days from the event date prior to being cleared or which remains unresolved. 17 For example, in the state where we are located, Colorado Revised Statute, 24–4.1–302(1.2) defines cold case to mean ‘a felony crime reported to law enforcement that has remained unsolved for over 1 year after the crime was initially reported to law enforcement and for which the applicable statute of limitations has not expired’. Within the one-year time frame, we term solved cases as follows: The ‘Dunker’ clearance, as in a slam dunk in basketball (a case resolved in 0 to 1 days from the event date); 18 the ‘Hot’ clearance (a case resolved in 2 to 17 days from the event date); 19 and the ‘Warm’ clearance (a case resolved in 18 to 365 days from the event date).
Addington 20 based on the observations of temporal clearance that are consistent with Singer and Willet 21 employs ‘Time to Clearance’ as an outcome measure. Cases were classified into the following categories: ‘cleared on the event date’, ‘cleared between 1 and 7 days after the event date’, ‘cleared between 8 and 30 days after the event date’, ‘cleared 30+ days after the event date’ and ‘uncleared’. In this usage, cases which have aged 30 or more days from the event date, or have never been cleared, are cold by default. There is no distinction between a case cleared on Day 36 and a case cleared on Day 366.
Our goal is to use a number of variables, ranging from the characteristics of victims and offenders to features relating to homicide events along with factors pertaining to the investigating agency, to examine if they help us understand which cold cases of homicide are more likely to be solved. We do this by employing both the ‘Conventional’ and ‘Alternate’ definitions, the latter being a slight modification of Addington’s approach. Our ‘Alternate’ classification will define as cold a homicide which has aged 31 or more days from the event date; we also recognize a ‘Dunker’ clearance (a case resolved in 0 to 1 days from the event date) and a ‘Hot’ clearance (a case resolved in 2 to 30 days from the event date), though they are not specifically operationalized for purposes of this analysis. Using the two contrasting definitions allows us to examine the impact of the different approaches on the selected covariates and also allows for discursive speculation on the nature of observed differences, if any.
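The two definitional schemes can be illustrated with a minimal sketch. The function names and category labels below are our own illustrative choices, not part of NIBRS; the day thresholds are those stated above, and a case that was never cleared is represented by None.

```python
from typing import Optional


def classify_traditional(days_to_clearance: Optional[int]) -> str:
    """Label a case under the Traditional (one-year) definition.

    days_to_clearance is the number of days from the event date to
    clearance, or None if the case was never cleared (assumed encoding).
    """
    if days_to_clearance is None:
        return "cold-unresolved"
    if days_to_clearance <= 1:
        return "dunker"          # resolved in 0 to 1 days
    if days_to_clearance <= 17:
        return "hot"             # resolved in 2 to 17 days
    if days_to_clearance <= 365:
        return "warm"            # resolved in 18 to 365 days
    return "cold-resolved"       # resolved after more than 365 days


def classify_alternate(days_to_clearance: Optional[int]) -> str:
    """Label a case under the Alternate (30-day) definition."""
    if days_to_clearance is None:
        return "cold-unresolved"
    if days_to_clearance <= 1:
        return "dunker"          # resolved in 0 to 1 days
    if days_to_clearance <= 30:
        return "hot"             # resolved in 2 to 30 days
    return "cold-resolved"       # aged 31 or more days before clearance


# A case cleared on Day 36 is 'warm' under one scheme and cold under the other.
print(classify_traditional(36), classify_alternate(36))
```

Under this sketch, the only cases whose label differs in kind between the two schemes are those cleared between Day 31 and Day 365, which is the window the later discussion focuses on.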
Data and Methods
This project employs data available from the Federal Bureau of Investigation’s (FBI) National Incident-Based Reporting System (NIBRS) and the Bureau of Justice Statistics’ (BJS) Law Enforcement Management and Administrative Statistics (LEMAS) survey. These databases serve as sources for the event and agency-level data needed to analyse the impact of victim, event and agency characteristics on the solvability of cold cases of homicide. Specifically, the NIBRS data for the period January 1991 through December 2008 and the LEMAS data from the surveys implemented in 2000 and 2003, operationalized so as to describe reporting agencies for the period January 1997 through December 2003, were converted from files maintained at the National Archive of Criminal Justice Data (NACJD). The LEMAS data were matched to case-level NIBRS events for two periods (LEMAS 2000 for NIBRS January 1997 through December 2000 & LEMAS 2003 for NIBRS January 2001 through December 2003). For this analysis, we assume that law enforcement agencies vary little over short periods of time (3 to 5 years) in terms of personnel, structure, resources and jurisdictional boundaries and that such survey data is generally ‘backwards looking’. The case-to-agency matching process was made possible due to a ‘crosswalk’, maintained by the NACJD for such purposes, which allowed for survey respondent identifiers in LEMAS to be matched to Originating Agency Identifier (ORI) numbers in the NIBRS data.
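The case-to-agency matching logic described above can be sketched as follows. This is an illustration only: the dictionary layouts and field names ('ori', 'year') are hypothetical stand-ins, not the actual NACJD crosswalk or file formats.

```python
from typing import Optional


def lemas_wave_for_year(incident_year: int) -> Optional[int]:
    """Pick the LEMAS survey wave used to describe an agency in a given year."""
    if 1997 <= incident_year <= 2000:
        return 2000
    if 2001 <= incident_year <= 2003:
        return 2003
    return None  # outside the period covered by the matched LEMAS data


def attach_agency(incident: dict, crosswalk: dict, lemas: dict) -> Optional[dict]:
    """Return the LEMAS agency record for a NIBRS incident, if one can be matched.

    incident  : {'ori': <agency ORI>, 'year': <incident year>}  (hypothetical layout)
    crosswalk : maps ORI -> LEMAS respondent identifier         (hypothetical layout)
    lemas     : maps (wave, respondent identifier) -> agency record
    """
    wave = lemas_wave_for_year(incident["year"])
    respondent_id = crosswalk.get(incident["ori"])
    if wave is None or respondent_id is None:
        return None
    return lemas.get((wave, respondent_id))


# Toy example with made-up identifiers.
crosswalk = {"X1": "A1"}
lemas = {(2000, "A1"): {"staff": 40}, (2003, "A1"): {"staff": 44}}
print(attach_agency({"ori": "X1", "year": 2002}, crosswalk, lemas))
```

Incidents from agencies without a crosswalk entry, or from years outside 1997 to 2003, simply return no agency record in this sketch.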
We are aware of scholarly concerns over using these databases. With respect to NIBRS, beyond general criticisms voiced by Jarvis and Regoeczi, 22 there are suggestions that data complexity may increase analytical confusion in operationalization 23 and that there may be negative effects due to missing and non-meaningful data. 24 However, Addington 25 finds that data completeness is high for NIBRS in comparison to the Uniform Crime Reports-Supplementary Homicide Reports (UCR-SHR). Other criticisms include non-representativeness of the data. 26 On our part, we have additional concerns regarding the spatial distribution of the 5,271± reporting agencies, and also, the non-randomness of the data. 27 Pertaining to the last point about non-representativeness, we have an additional concern regarding the ‘self-selecting’ nature of agencies that participate in the NIBRS and LEMAS programmes. We also note, given the case selection process, the number of unique ORI reporting agencies varied over the course of the 18-year sampling period, from 181 case-submitting agencies in 1991 to 798 in 2008. With the LEMAS data, in addition to general criticisms of the stratified, randomly sampled collection of 2,859 law enforcement organizations 28 additional concerns about the consistency of the definitions applied by respondents exist in that some values vary widely.
From the NIBRS data, case-level homicide events were selected where there was a single, ‘Individual’ (Non-Law Enforcement Officer) homicide victim. Consistent with Jarvis and Regoeczi, 29 we excluded Negligent Manslaughter cases. Using this criterion, which was then keyed for matching case events with an Administrative NIBRS Segment record, data from the 13 separate, annualized NIBRS Segment files were extracted and transformed for use in a relational database environment. Adopting an analytical bifurcation consistent with the above observations, we developed a model on the unit of the Victim-Case 30 for application over two data sets: one containing cases that are classified as cold under the ‘Traditional’ definition and another containing cases classified as cold under the ‘Alternate’ definition. The dependent variable is a derived value, indicating whether the case had been cleared by the terminal creation date for the 2008 NIBRS data file set (~ 27 October 2009; the NIBRS allows for an ongoing data editing process, mandated for an initial time period and voluntary thereafter). This value was derived since we classified a situation involving either an arrest or an exceptional clearance as a resolved case, and it varies slightly from previous research.
The independent variables drawn from the NIBRS Segment files included UCR offense, victim’s age, victim’s sex, victim’s race, victim’s resident status, nature of crime location, primary weapon employed, nature of assault, victim–offender relationship, agency service population group, event geographic region and division, and agency type. Independent variables derived from the LEMAS data included total agency staffing, investigative staffing, agency type, operating budget, agency population, officer minimum salary, officer maximum salary, computer-aided investigations, computer-aided crime analysis and the required officer education. The majority of the variables from both sources were code-table based, categorical values; the exceptions being victim age from NIBRS and total agency staffing, population, operating budget, and officer minimum and maximum salary. For the victim age value, which exists in the NIBRS as a blend of continuous, numeric values and some descriptively coded elements for victims at the extremes of the age range, we transformed the coded material so that age becomes a categorical data element. Data describing the event location and primary weapon employed were simplified via code consolidation as detailed in Table 1 to enhance thematic analytical effect and clarify presentation. In LEMAS, the population value was excluded in favour of the more prevalent population group coding available from the NIBRS data, and the operating budget, computer-aided investigations and computer-aided crime analysis values were excluded as a result of the aforementioned uncertainty concerning how various law enforcement organizations operationalized these concepts. 
The officer minimum and maximum salary values were used to calculate a midpoint value, which was then used as a discrete indicator in place of the range data provided by LEMAS (and, for presentation and categorical analysis purposes, the officer salary range midpoint and total agency staffing were classified into interquartile ranges).
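A minimal sketch of the midpoint and quartile classing step follows. The salary figures are invented for illustration, and the treatment of values falling exactly on a cut point is our assumption; the article does not specify it.

```python
import statistics


def salary_midpoint(min_salary: float, max_salary: float) -> float:
    """Collapse the LEMAS salary range into a single midpoint value."""
    return (min_salary + max_salary) / 2


def quartile_class(value: float, cut_points: list) -> int:
    """Return 1..4 for the quartile a value falls into.

    cut_points are the three quartile boundaries; a value equal to a
    boundary is placed in the lower class (an assumption).
    """
    return 1 + sum(value > c for c in cut_points)


# Hypothetical (minimum, maximum) officer salaries for a handful of agencies.
ranges = [(28000, 41000), (30000, 52000), (33000, 47000),
          (36000, 60000), (25000, 38000), (40000, 66000),
          (31000, 45000), (35000, 58000)]
midpoints = [salary_midpoint(lo, hi) for lo, hi in ranges]

# statistics.quantiles with n=4 returns the three quartile cut points.
cuts = statistics.quantiles(midpoints, n=4)
classes = [quartile_class(m, cuts) for m in midpoints]
print(classes)
```

The same classing function could be applied to total agency staffing, which the analysis also groups into quartiles.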
Table 1. Binary Logistic Regression Products
Traditional model:
Hosmer & Lemeshow Chi-square = 5.562, significance = 0.592;
Nagelkerke R-squared = 0.032;
Omnibus Test of Model Coefficients (Model): Chi-square = 29.839, significance = 0.054;
Alternate model: (n = 9,733/solved n = 1,134);
Hosmer & Lemeshow Chi-square = 12.917, significance = 0.115;
Nagelkerke R-squared = 0.021; and
Omnibus Test of Model Coefficients (Model): Chi-square = 105.329, significance < 0.001.
Analysis
We used binary logistic regression as the analytical method, as opposed to Discriminant Function Analysis or Analysis of Variance, given the known characteristics of the data and the nature of the research question. These included the presence of a dichotomous, categorical dependent variable, categorical independent variables and an assumed non-normal distribution (owing primarily to the vagaries of the NIBRS and LEMAS informants). This selection is consistent with assumptions in the related literature, in particular, Alderden and Lavery and Addington. 31 An initial exploration of this consolidated data revealed a secondary issue, well known to those who have used NIBRS. Our data consists of rare-event cold case homicides drawn from sources known for a certain incompleteness. Even in the context of more mundane events, there is some variable coding that is non-meaningful (e.g., not simply missing, but coded ‘Unknown’ or some equivalent; we assume that, for variables allowing ‘Other’ coding, ‘Other’ implies a known but unspecified value as opposed to an alternative ‘Unknown’). For the variables included in this study, the non-meaningful data rates ranged between 0.90% (victim sex, Alternate) and 56.61% (victim–offender relationship, Traditional). A variety of estimation and imputation methods were considered to address this problem, but we determined that none were suitable. As a result, we adopted a case rule, removing from the analysis any event which did not have meaningful coding in all of the selected variables, similar to the solution employed in Addington. 32 The decision to employ this case rule for the BLR model both reduced the available variables to a more limited, victim demographics-centric data set and eliminated approximately 14% of the derived event data. This information loss should be considered separate from, but a predicate to, the necessary isolation of the LEMAS-derived variables. 
To include all data in a single-pass analysis would have had an outsized reductive effect under the case rule (on the order of an 80% case loss prior to removal of incomplete NIBRS variable events).
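The complete case rule can be sketched as a simple filter. The variable names and the set of codes treated as non-meaningful below are illustrative assumptions; consistent with the text, ‘Other’ is treated as meaningful.

```python
# Codes treated as non-meaningful in this sketch (assumed encoding).
NON_MEANINGFUL = {None, "", "Unknown"}


def is_complete(event: dict, variables: list) -> bool:
    """True if every selected variable carries a meaningful code."""
    return all(event.get(v) not in NON_MEANINGFUL for v in variables)


def apply_case_rule(events: list, variables: list) -> list:
    """Keep only events with meaningful coding in all selected variables."""
    return [e for e in events if is_complete(e, variables)]


def non_meaningful_rate(events: list, variable: str) -> float:
    """Percentage of events whose code for one variable is non-meaningful."""
    bad = sum(1 for e in events if e.get(variable) in NON_MEANINGFUL)
    return 100.0 * bad / len(events)


# Toy events with hypothetical field names.
events = [
    {"victim_sex": "F", "relationship": "Acquaintance"},
    {"victim_sex": "M", "relationship": "Unknown"},
    {"victim_sex": "M", "relationship": "Other"},
    {"victim_sex": "Unknown", "relationship": "Stranger"},
]
kept = apply_case_rule(events, ["victim_sex", "relationship"])
print(len(kept), non_meaningful_rate(events, "relationship"))
```

Because the filter requires every selected variable to be meaningful, adding a variable with a high non-meaningful rate (such as victim–offender relationship) removes a large share of cases, which is why such variables were dropped from the BLR model rather than retained.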
Results
The bifurcated BLR analysis was conducted employing NIBRS data for each of the definitions (Traditional and Alternate), the products of which are reported in Table 1. While overall goodness-of-fit for the models was acceptable, there is a concern about sample adequacy in terms of the ‘observation-to-predictor’ ratio. 33 Homicide is a relatively high ‘clearance’ crime event (64.8%). 34 However, cold homicide case clearance rates are typically lower than those observed with ‘non-cold’ homicides, and in the cold category clearance probability is sensitive to how it is defined. Using the Traditional definition of a cold case, the rate of clearance is 1.04% of cases; by the Alternate definition the clearance rate is 11.65%. Under the standard described in Peng et al. (2002), 35 only the Alternate data set would have sufficient ‘positive’ observations to ensure sample adequacy, and it only marginally exceeds that threshold. Using a sample design formula described by Peduzzi et al., 36 which accounts in sample size estimation for the effect of limited ‘positive’ case representation, both models have adequate samples.
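The arithmetic behind these adequacy checks can be sketched as follows. The clearance rate uses the Alternate model counts reported in Table 1. The sample size function implements one rule of thumb commonly attributed to Peduzzi et al. (N = 10k/p, with k predictors and p the proportion of ‘positive’ cases); the article does not spell out the exact formula it applied, so this form, and the predictor count used in the example, are assumptions.

```python
import math


def clearance_rate(solved: int, total: int) -> float:
    """Percentage of cases in a data set that were cleared."""
    return 100.0 * solved / total


def min_sample_size(n_predictors: int, positive_proportion: float,
                    events_per_predictor: int = 10) -> int:
    """Rule-of-thumb minimum N = (events per predictor * k) / p."""
    return math.ceil(events_per_predictor * n_predictors / positive_proportion)


# Alternate model: 1,134 solved cases out of 9,733 (from Table 1).
print(round(clearance_rate(1134, 9733), 2))

# Hypothetical model with 10 predictors at the Alternate clearance proportion.
print(min_sample_size(10, 0.1165))
```

The sketch makes the sensitivity visible: as the proportion of cleared cases falls toward the roughly 1% seen under the Traditional definition, the required sample grows about tenfold for the same number of predictors.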
In considering the predictive utility of the specified models, the Alternate definition possesses significant predictive effect when compared to the constant-only model; the Traditional model just misses the cut-off for significance. The majority of the defined covariates are insufficiently significant predictors under both the Traditional and Alternate definitions. Yet given this exploratory analysis, they may be of interest for theoretical approaches based on the concepts of demographically determined differential case outcomes. Given the limited practical application of the BLR results for further analysis, 37 we developed a series of enhanced contingency tables to assist in an exploratory identification of potentially important factors in the case resolution process. Tables 2–4 are organized to present differential outcome observations in the context of factors associated with the victim, event or agency.
Table 2. Data Set Composition Products: Victim
Notes: NIBRS data: Traditional (n = 9,907)/Alternate (n = 11,304);
LEMAS data: Traditional (n = 1,677)/Alternate (n = 1,933);
Line clearance = Percentage of cleared cases within value (male cleared cases vs. all male cases);
Set clearance = Percentage of cleared cases by value versus total cases (male cleared cases vs. all cases);
Cleared cases = Percentage of total cleared cases by line (male cleared cases vs. all cleared cases); and
Set cases = Percentage of total cases by line (male cases vs. all cases).
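The four percentages defined in these notes can be computed from simple counts, as in the sketch below. The counts in the example are invented. The ‘differential participation’ helper reflects how over- and under-performance is discussed in the text (cleared cases versus set cases); expressing it as a difference rather than a ratio is our assumption.

```python
def composition(cleared_in_value: int, cases_in_value: int,
                cleared_total: int, cases_total: int) -> dict:
    """Compute the four table percentages for one category value (e.g., male)."""
    return {
        # cleared cases within value vs. all cases within value
        "line_clearance": 100.0 * cleared_in_value / cases_in_value,
        # cleared cases within value vs. all cases
        "set_clearance": 100.0 * cleared_in_value / cases_total,
        # cleared cases within value vs. all cleared cases
        "cleared_cases": 100.0 * cleared_in_value / cleared_total,
        # cases within value vs. all cases
        "set_cases": 100.0 * cases_in_value / cases_total,
    }


def differential_participation(row: dict) -> float:
    """Positive = over-performance, negative = under-performance (assumed form)."""
    return row["cleared_cases"] - row["set_cases"]


# Invented counts: 50 cleared of 500 cases in the value; 100 cleared of 1,000 overall.
row = composition(50, 500, 100, 1000)
print(row, differential_participation(row))
```

A category that supplies half of all cases and half of all cleared cases, as in this toy example, performs at par; a category with a share of cleared cases well below its share of all cases is under-performing.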
Table 3. Data Set Composition Products: Event
Victim’s age: Categorically significant in the Alternate model, the analysis suggests that most age groups possess an enhanced probability of case resolution as compared with young child victims (e.g., those 0 to 7 years of age at the time of the event). This effect is not fully consistent between models in either direction or intensity of effect, and there is little significance at the category level. Young child victims are asymmetrically represented in both the cold and ‘non-cold’ case data, in comparison to their typical participation rates in general population statistics. Cases involving these victims that have become cold pose unusually difficult investigative challenges. Thus, it seems possible that some of this effect is due to the age distribution of victims. In contrasting the cold case data with the known shape of the ‘non-cold’ victim distributions, we should also mention that the cold case data may slightly over-represent 18-to-40-year-olds and under-represent older victims, particularly those over 60 years old (Figure 1). This could be due to the differential risk exposures of older victims.

Victim’s sex: This variable exhibits an inconsistent effect between the models, with no significance in either. Some of the positive effect observed in the Traditional model may be due to a female victim being atypical in an instrumental crime of violence.
Victim’s race: This is notable for its inconsistent and insignificant effect between models (Table 1). In the Traditional model, we find under-resolution of cases involving Black victims. We suspect that their differential rate of representation in drug dealing cases (65.2% of victims), in cases where a firearm was employed (60.1%) and the lower likelihood of having a familial relationship with the offender may explain some of the noted effects. The performance of other racial categories in the analysis would seem to be a product of the relatively small number (1%) of victims in the data.
Victim’s resident status: Being a resident of the investigating jurisdiction seems to have a significant effect on the outcome of the case (Table 2), though this was a variable which was excluded from the BLR analysis due to the complete case rule. This is based on the differential participation rate (cleared cases vs. set cases). This effect may be a product of the difficulty inherent in a ‘body dump’ homicide case (e.g., victim’s body is recovered in a location that is not proximate to the event scene, frequently remote, and often temporally removed from the event). To this, add the practical effects of the process of conducting an investigation at a distance (e.g., canvassing associates to develop complete victimological details in a place distant from and unfamiliar to the investigator, lack of a local base of knowledge regarding the victim, etc.).
Victim–offender relationship: Having a known relationship to the offender in the case of a homicide provides a meaningful, positive impact on the probability of the case being resolved. However, this is another variable lost to the BLR analysis due to the complete case rule. In almost all cases, being able to describe the social nexus between victim and offender results in an over-performance (Table 2) in case clearance. The inability to provide that description produces an inverse under-performance, noteworthy when using the Alternate definition. The inversion of performance in the interpersonal victim classification between Alternate and Traditional definitions suggests that cases tagged as ‘domestic’ not resolved either immediately due to circumstance of event discovery or witness evidence or within the first year due to emergent evidence (e.g., physical clues needing laboratory processing for use in the case filing decision, and thus subject to time lags) become increasingly less likely to be resolved. The under-performance of these events in the Traditional case set (10.08% of cases, 1.96% of cleared cases) may indicate events where the classic Motive–Means–Opportunity template is insufficient for circumstantial proof and the evidentiary basis is problematic.
Event location type: This variable was compressed for analytical and presentation purposes. The 25 descriptive categories available in the data were consolidated by grouping them as residential, commercial, public, open space/roadway or other spaces. With the exception of the ‘other’ classification, the direction of the effect for each classification was consistent between models, though the intensity varied (e.g., the Exp[B] is 1.384 under the Alternate schema, 5.165 under the Traditional); only ‘open space/roadway’ had significance, and then only in the Traditional model (where ‘location’ also had categorical significance). The inference here is that with the relatively minimized role of interpersonal violence in the cold case environment (and, thus, the reduced relevance of ‘residential’ crime scenes), the ‘open space/roadway’ setting becomes the more probable crime scene or discovery location.
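To make the Exp[B] values above concrete, the following sketch converts an odds ratio into the underlying logistic coefficient and shows how it would shift a clearance probability. The baseline probability in the example is the overall Traditional clearance rate (1.04%), used here only as a hypothetical stand-in for the reference category, whose actual rate is not reported.

```python
import math


def coefficient_from_odds_ratio(exp_b: float) -> float:
    """Recover the logistic regression coefficient B from Exp(B)."""
    return math.log(exp_b)


def shifted_probability(baseline_probability: float, odds_ratio: float) -> float:
    """Apply an odds ratio to a baseline probability and return the new probability."""
    odds = baseline_probability / (1 - baseline_probability)
    new_odds = odds * odds_ratio
    return new_odds / (1 + new_odds)


# Exp(B) values reported for the location variable under each schema.
for label, exp_b in [("Alternate", 1.384), ("Traditional", 5.165)]:
    b = coefficient_from_odds_ratio(exp_b)
    print(label, round(b, 3))

# Illustrative only: a 1.04% baseline multiplied by the Traditional odds ratio.
print(round(shifted_probability(0.0104, 5.165), 4))
```

Even a large odds ratio leaves the absolute probability of clearance low when the baseline is near 1%, which helps explain why a significant category effect under the Traditional definition has limited practical reach.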
Event assault type: Under the Traditional definition, every category except ‘felony-other’, ‘unknown’ and the gang-related events underperforms, as measured by differential participation rates (Table 3). Under the Alternate definition, every category performs at approximately par value or over-performs except ‘gangland’ and ‘unknown’. This is a near mirror image of effect between the two models. We speculate this is due to cases in the Alternate domain, which are ultimately resolved primarily by physical evidence, the processing of which may necessitate a resolution date falling into the 31- to 365-day window. Cases lacking an immediate suspect and promising physical evidence predominate in the Traditional population.
Table 4. Data Set Composition Products: Agency
Event weapon type: This is another variable which was consolidated by reducing 22 categorical descriptions to four classes of weapons, namely edged, firearm, exotic or other. The variable largely has mixed effect and lacks significance under both definitions. The exception is the use of either an ‘edged’ or ‘other’ weapon type in the Alternate model (and the weapon type did have categorical significance). We speculate that the positive and significant effect of the use of an edged weapon may be a proxy for the increased likelihood of other physical evidence being available to the investigation, due to the social and spatial proximity necessary for this form of deadly violence. This may result in a cold clearance due to evidence processing delays. Similarly, the uniqueness of a weapon which would qualify for classification as ‘other’ (seen only in 3.6% of cases) may contribute to enhancing solution probability.
Agency geographic division: The South Atlantic and Mountain regions consistently and noticeably over-perform in terms of case clearance. However, given the spatial distribution of NIBRS reporting agencies, drawing even exploratory points of interest from this data may not be appropriate due to the over-representation of a limited number of states.
Agency type: County agencies consistently over-perform, and municipal agencies consistently underperform across both model data sets. It should be kept in mind that municipal agencies investigate more than 70% of the cold case load, while county and state agencies investigate marginally higher rates of non-resident victims’ cases.
Agency characteristics: The quartile of the smallest agencies, as measured by reported total full-time-equivalent (FTE) staff on the LEMAS survey, underperforms across both models, and the quartile of the largest departments consistently over-performs. However, the general conclusion is that the effect of ‘agency staffing’ is mixed. This is consistent with the other agency variables considered in the analysis. Of the other two variables considered, ‘agency officer salary’ and ‘officer education’, the latter shows the clearer pattern: requiring some level of exposure to college coursework (or having the ability and desire to gain access to college-level classes, depending on the presumed direction of effect), but not requiring degree completion, has some consistent, positive effect on case clearance across both definitional regimes.
Discussion and Conclusion
We began this article by introducing the idea of Karenina Uniqueness, the assertion that each instance of what we have come to think of collectively as unresolved or cold homicide is a unique tragedy in an investigative sense. If this unhappy speculation holds true, there would be no template and no reliable determinants to employ in optimizing methods to produce homicide resolutions. It would mean that each case would be resolved along its own distinctive path. The concept of Karenina Uniqueness is our default hypothesis; a hypothesis we had hoped to reject. After an initial analysis of the total homicide data set (n = 23,109), conducted to gain a sense of the shape of the problem, we considered two definitions since there was little marginal change in the daily probability of clearance for a case once it moved beyond 30 days from the event. This led us to wonder if a case was not practically cold at this point, rather than at the one-year line of demarcation conventionally employed. And, if a case was cold at 31 days, was it similar to a case that was cold at 366 days, or 3,660 days? The answer to that question seems to be, at first blush, no, though a qualified no. Reading broadly and using the Conventional definition (cases that were resolved after 365 days), some factors do make some difference to whether a cold homicide eventually becomes a resolved case: the presence of a child victim; whether the victim was a resident of the location where the homicide took place; whether the body was found in an open space or roadway; whether the homicide took place in the South Atlantic or Mountain regions; and whether it was investigated by a county or larger police agency which paid its officers well and had some educational entrance requirements. However, given the limitations of our data and analysis, these are only suggestive possibilities and not certainties.
For now, we surmise that a plurality of the cases which clear in that Alternate cold window between 31 and 365 days after the event are resolved due to processing the evidence. This introduces a lag into the case clearance timeline. Where in that first year the curves cross, where the cases with pending evidence are likely to clear, and where the Karenina Uniqueness predominates are questions we hope to explore further via survival analysis techniques. The latter will allow us to take advantage of the broader information from the composite homicide case data.
Two implications result from this analysis. For the practitioner, the oft-repeated triumvirate of Motive–Means–Opportunity 38 continues to retain force, in both definitional contexts. In the Alternate timescale, these factors, specifically as conceived in this study, have a legitimate bearing on the probability that a homicide will be solved. In the Conventional realm, it inverts, that is, victims who are murdered for unknown reasons in unfamiliar spaces by unknown offenders are more likely to become unsolved cold cases. This conclusion further suggests that a sense of time and urgency is always of the essence in solving homicide cases, and thereby achieving justice for victims and co-victims.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
The authors received no financial support for the research, authorship and/or publication of this article.
