Abstract
Casualty counts are often controversial, and thorough research can only go so far in resolving such debates—there will almost always be missing data, and thus, a need to draw inferences about how comprehensively violence has been recorded. This article addresses that challenge by developing an estimation strategy based on the observation that violent events are generally distributed according to power laws, a pattern that structures expectations about what event data on armed conflict would look like if those data were complete. This technique is applied to estimate the number of Native American and US casualties in the American Indian Wars between 1776 and 1890, demonstrating how scholars can use power laws to estimate conflict size, even (and perhaps especially) where previous literature has been unable to do so.
Estimating the severity of armed conflict is generally difficult. During recent violence in Iraq, Afghanistan, and the Balkans, for instance, casualty estimates ranged widely, despite efforts to record this information objectively. These figures can be the subject of high-profile controversies. More broadly, scholars’ inability to generate rigorous casualty estimates prevents them from describing and analyzing the defining feature of armed conflict. 1
One of the main difficulties of estimating conflict size is that this is not just a matter of carefully gathering information. No matter how much effort analysts put into recording violence, the available record will almost always be incomplete, and thus, there will almost always be a need to estimate how comprehensively these data have been recorded. This article explains how power laws can be used to perform this kind of inference. A power law is a special probability distribution characterizing the relationship between the frequency and severity of many phenomena, including violent events. The notion that violent events often approximate power laws is a well-documented empirical regularity. This article instrumentalizes that fact to draw inferences about what the distribution of violent events might look like if the data were complete. In other words, the technique advanced here leverages the distribution of available data to evaluate the comprehensiveness of those data.
To demonstrate that technique, this article uses power laws to estimate Native American and US casualties during the American Indian Wars from 1776 to 1890. 2 There are two main reasons to focus on this experience. First, despite the central importance of these conflicts to North American history, there are no reliable estimates of how much fighting actually took place. Millions of Native Americans died as Europeans colonized North America and the tribes were subjected to coercive expansion by settlers and their governments. At the same time, most population loss resulted from disease, and some historians have recently argued that the frontier was far less violent than commonly believed. 3 The historical significance of this violence is out of proportion with scholars’ understanding of what it entailed.
Second, the American Indian Wars provide a context where estimating conflict size relies heavily on extrapolation. The US Army kept records of fighting with the tribes, and other sources help to flesh out this information. But it is highly unlikely that scholars can reliably estimate the severity of the American Indian Wars simply by accumulating data. Many violent events—especially small-scale engagements with little historical salience per se—have presumably gone unrecorded, becoming effectively invisible to event-count methodologies. Basing casualty estimates on the available record thus requires drawing inferences about how incomplete that record is.
The approach developed here is designed to do this. It indicates that roughly 50,000 Native Americans and roughly 12,000 US forces were killed, captured, or mortally wounded during the American Indian Wars. Combined, this exceeds the recorded figures for 90 percent of intrastate wars in the Correlates of War (COW) data. 4 This historical contribution reinforces the article’s methodological purpose, which is to demonstrate how power laws can be used to estimate conflict size, even (and perhaps especially) where previous literature has been unable to do so.
The article proceeds in five sections. The first section describes the American Indian Wars and introduces original data on 2,537 military engagements between Native American and US forces from 1776 to 1890. These are the most comprehensive event-level data on the American Indian Wars, but they doubtless remain incomplete. The second section explains how power laws can be used to estimate the comprehensiveness of these data. The third section implements this technique, the fourth section discusses uncertainty surrounding the resulting estimates, and the fifth section presents the author’s conclusions.
Data on the American Indian Wars
When the United States declared independence, most of its population lived near the Atlantic seaboard and the country had limited military resources. Settlers on the early frontier were vulnerable to attacks from nearby tribes, who in turn suffered frequent encroachments. These tensions regularly produced low-grade violence and sometimes spiraled into larger conflicts. In some cases, pan-tribal military alliances opposed the United States, leading to major engagements such as the 1794 Battle of Fallen Timbers and the 1811 Battle of Tippecanoe.
During the first half of the nineteenth century, the United States coercively relocated dozens of tribes to lands west of the Mississippi River. In some cases (as with the Choctaws), removal entailed relatively little violence, but in other cases (as with the Seminoles), tribes fought protracted conflicts to stay on their land. As US settlement expanded by mid-century, conflict ensued with prominent tribes such as the Cheyennes and Comanches, as well as with numerous, lesser-known tribes such as the Chetcos and Kalispels. Native Americans were increasingly confined to reservations during this period. Several tribes (such as Navajos and Nez Perces) forcibly resisted the reservation policy, and some (such as Apaches and Sioux) engaged in armed conflict when factions left reservations to live, hunt, or raid elsewhere. Some battles took place within the reservations themselves. In December 1890, the US Army fought Sioux conducting religious ceremonies near Wounded Knee Creek on a reservation in South Dakota. This is typically accepted by historians as marking the end of the American Indian Wars. 5
The American Indian Wars thus span a wide range of time, geography, and participants. Clodfelter (2008) divides this experience into forty separate conflicts; Axelrod (1993) chronicles thirty-six “Indian Wars” after 1776. But as these authors acknowledge, dividing the period into discrete episodes is misleading, since much of the armed conflict between the United States and Native Americans involved protracted, low-level violence throughout the broader course of US expansion that took more than a century to complete. 6 That movement constituted a major geopolitical shift: the United States emerged as a continental power, while previously hegemonic tribes lost vast amounts of people and land. Yet if violence played a leading role in precipitating this shift, there are no reliable estimates of how much violence actually transpired.
This article thus attempts to estimate the number of casualties—defined as the number of people killed, captured, or mortally wounded in battle—that occurred during the American Indian Wars, both on the side of the Native Americans and on the part of the United States. The analysis includes military engagements between 1776 and 1890 that occurred within the continental United States or involved pursuits into neighboring territory (such as entering Mexico to capture Geronimo). The analysis includes armed engagements of any size, involving regular or militia forces. Data include noncombatant casualties, but not losses from displacement or disease. As in many conflicts, it is sometimes ambiguous whether a particular event should be seen as belonging to “the American Indian Wars” as opposed to interpersonal violence that was essentially nonpolitical. This study approaches the issue inductively by including information from a broad range of sources, and thus letting the sources “say” which engagements belong in the data. To the extent that there is disagreement, the present study thus errs on the side of inclusion, while following the lead of the literature on which it aims to build.
Data collection drew on several anthologies that record armed conflict at the level of individual military engagements. The most comprehensive is Webb’s (1939) Chronological List of Engagements between the Regular Army of the United States and Various Tribes of Hostile Indians, which lists 1,177 engagements between 1790 and 1890. Webb’s book is itself a compilation of official records produced by the US Army Adjutant General and the US Army War College Historical Section. 7
Five additional sources flesh out the data. Michno’s (2003) Encyclopedia of Indian Wars describes 787 engagements occurring after 1850; a follow-on work, Forgotten Fights (Michno and Michno 2008), adds another 334 engagements dating to 1823. Axelrod’s (1993) Chronicle of the Indian Wars, Rajtar’s (1999) Indian War Sites, and Nunnally’s (2007) American Indian Wars all survey violence dating from 1776, covering 123, 559, and 940 engagements, respectively. These anthologies include transparent sourcing; consistent information on the date, location, casualties, and tribes involved in each engagement; and broad temporal and geographic coverage. In all, these data incorporate 3,920 event reports covering 2,537 separate engagements, recording 25,643 casualties sustained by eighty-six different Native American tribes 8 and 10,476 casualties sustained by US forces. 9 Yet there is surely a substantial amount of missing data here, and this impedes drawing inferences about the overall severity of the American Indian Wars.
How can scholars draw this kind of inference? We could examine the data and see if the results “look right,” but this requires making assumptions about the quantity we are trying to estimate. Polling subject matter experts poses a similar difficulty—since the magnitude of the American Indian Wars is the subject of scholarly debate, expert opinions are bound to differ, and the very purpose of this project is to help resolve that disagreement by producing an independent, objective estimate. For assessing recent conflicts, scholars can sometimes use sampling methods to cross-check event counts, but this approach is not available for conflicts, like the American Indian Wars, that concluded long ago. 10
In summary, it is unlikely that data collection alone can ever fully (or even remotely) tell us how many casualties occurred during the American Indian Wars. Analysts must ultimately make some assessment of how comprehensive their data are, and thus how the sample of available information compares to the overall population of interest. The next section outlines an approach to dealing with this challenge, based on the well-documented finding that the severity of violence often follows a special kind of probability distribution called a power law.
Estimation Strategy
Power laws characterize the relationship between the frequency and the severity of certain events. If discrete data follow a power law, then the probability of seeing an observation with magnitude x is given by the distribution function $p(x) = Cx^{-\alpha}$, where C is a constant ensuring the distribution sums to 1, and α is the scaling parameter. A distinctive feature of power laws is that when they are represented on a “log–log” plot (with the logarithm of the event’s severity on the x-axis and the logarithm of the probability of an event being at least that severe on the y-axis), the data will form a straight line. 11
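The straight-line property can be checked numerically. The following is a minimal sketch, not part of the article's analysis: it builds a discrete power law truncated at an arbitrary upper bound (the truncation point is an assumption made only so that the sum is finite), computes the probability of an event being at least as severe as x, and measures the slope of the log-log relationship. For a power law with scaling parameter α, that slope should be close to -(α - 1).

```python
import math

def power_law_pmf(alpha, x_max):
    """Return p(1..x_max) for p(x) = C * x**(-alpha), truncated at x_max."""
    weights = [x ** (-alpha) for x in range(1, x_max + 1)]
    c = 1.0 / sum(weights)              # C makes the probabilities sum to 1
    return [c * w for w in weights]

def ccdf(pmf):
    """P(X >= x) for x = 1..len(pmf), built by summing the pmf from the right."""
    out = [0.0] * len(pmf)
    running = 0.0
    for i in range(len(pmf) - 1, -1, -1):
        running += pmf[i]
        out[i] = running
    return out

alpha = 2.21                            # illustrative scaling parameter
pmf = power_law_pmf(alpha, x_max=200_000)   # truncation point is an assumption
tail = ccdf(pmf)

# Slope of the log-log plot between severities 10 and 100
# (tail[9] is P(X >= 10), tail[99] is P(X >= 100)).
slope = (math.log(tail[99]) - math.log(tail[9])) / (math.log(100) - math.log(10))
print(round(slope, 2))                  # should be close to -(alpha - 1)
```

Because the plotted quantity is the cumulative probability of an event being at least that severe, the slope is -(α - 1) rather than -α.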
Power laws characterize many phenomena. The sizes of cities, earthquakes, moon craters, and annual incomes have all been represented using power laws (Newman 2005). Richardson (1948) originally observed that violent events, ranging from homicides to world wars, also seem to follow this pattern. Power laws characterize the size of interstate wars (Cederman 2003) and terrorist attacks (Clauset, Young, and Gleditsch 2007). Bohorquez et al. (2009) plotted data on the severity of insurgent attacks within nine separate conflicts; in each case, the data resembled power laws. These findings have important implications. As Clauset, Young, and Gleditsch (2007, 59) explain with respect to their findings on terrorism: “The frequency-severity statistics of terrorist events [demonstrate that] there is no fundamental difference between small and large events; both are consistent with a single underlying distribution. This fact indicates that there is no reason to expect that major or more severe terrorist attacks should require qualitatively different explanations than less salient forms of terrorism.”
In general, the notion that power laws characterize violent events is a well-known empirical regularity, and this has received substantial attention in literature intended for broader audiences. 12 Figure 1 demonstrates that event-level data on the American Indian Wars also resemble a power law. Panels A and B plot the (logged) severity of armed engagements on the x-axis along with the (logged) probability that a randomly chosen engagement is at least that severe on the y-axis; Panel A represents casualties sustained by Native American forces, Panel B represents casualties sustained by US forces; both plots are approximately linear.

Figure 1. Power laws in data on the American Indian Wars.
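The quantity plotted on the y-axis of Panels A and B is an empirical exceedance probability. As a minimal sketch of how such points could be computed from a list of per-engagement casualty counts (the function name and the toy counts below are hypothetical and are not drawn from the article's data):

```python
from collections import Counter

def empirical_ccdf(severities):
    """Return sorted (x, P(X >= x)) pairs for the distinct severities observed."""
    n = len(severities)
    counts = Counter(severities)
    points = []
    at_least = n                  # every event is at least as severe as the minimum
    for x in sorted(counts):
        points.append((x, at_least / n))
        at_least -= counts[x]     # drop events of exactly this size before moving on
    return points

# Hypothetical casualty counts for eight engagements (illustration only).
toy = [1, 1, 1, 2, 2, 5, 12, 40]
for x, p in empirical_ccdf(toy):
    print(x, p)
```

Taking logarithms of both coordinates gives the points for a log-log plot of the kind described above.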
Using Power Laws to Extrapolate Missing Data
Yet even if these plots resemble linear relationships, they are not linear exactly. In particular, the left tail of the data in Panel A, representing casualties sustained by Native Americans, “sags” below what we would expect to see if the distribution followed a power law. The data contain noticeably fewer small-scale events than what a power law would predict, which is consistent with the expectation that these engagements are least likely to have been comprehensively recorded.
Panel B, representing data on US casualties, is not as noticeably concave on the left tail. This again makes sense, given that US casualties are more likely to have been documented at the time and preserved by available sources (especially since those sources include official US Army records). Together, Panels A and B reinforce the expectation that when data on violent events are better measured, they will more closely approximate power laws.
This suggests a strategy for estimating the comprehensiveness of conflict data. Given a wide range of previous scholarship (and the way that the data themselves shape up), we can assume that the distribution of violent events during the American Indian Wars can be approximated with a power law. In addition, we expect that the data are relatively comprehensive when it comes to capturing high-intensity violent events, since those are most likely to have been recorded and preserved. If these assumptions are reasonable, then we can use the data on the right side of these distributions in order to develop an estimate of what we should see on the left if the data were complete. 13
Figure 1 demonstrates this estimation strategy. Panels A and B provide scatterplots of the raw data. These data resemble power law distributions, and Panels C and D show how we can estimate their scaling parameters using a method that the next section describes in more detail. Panels E and F then show what the plots would look like if we added observations, so that the distributions maintained the same slopes throughout their left tails. Panels E and F are thus stylized projections of what these data might look like if they were complete, and we can use these projections to estimate how much information is missing from the original data.
Evaluating Key Assumptions
How plausible are the key assumptions driving this estimation strategy? The first of these assumptions, that data on the most violent events during the American Indian Wars are likely to be well-measured, seems fairly safe. Events like Andrew Jackson’s Battle of Horseshoe Bend against the Creeks (1814), or George Custer’s last stand against the Sioux (1876) were high profile, widely known and discussed at the time. For most of the period studied here, the US Army was small and soldiers did not expect to participate in much violence. 14 Large battles were salient events, and since the data used in this study draw on Army records, it is doubtful that the right tail of this distribution is missing many observations. It is hard to say just how large an event would need to be before it would reliably appear in these data, but the technique described in the next section does not require making assumptions about where observed data begin deviating from a power law—this is something we can estimate directly.
The second key assumption driving the estimation strategy, that the severity of small-scale violent events should follow the power law which characterizes the rest of the distribution, is more contentious. 15 Of course, most statistical estimation strategies rely on assumptions about probability distributions and functional forms; the operative question is not whether an empirical model is perfect, but whether its assumptions are reasonable for drawing inferences. The fact that so many different kinds of violent events resemble power laws throughout most of their distributions suggests that this is a reasonable baseline to use. And the fact that data on US casualties in the American Indian Wars conform to a power law down to events of much lower magnitude than the data on Native American casualties reinforces the idea that better measured data should fit a power law more closely. Ideally, however, we would be able to go beyond suggestive evidence and directly examine this assumption.
To do so, we need data on violent events that are known to be complete (or come as close as possible to this standard), and for which casualties can be grouped by discrete incidents. Figure 2 plots three data sets that meet these conditions, representing US fatalities in Iraq, Vietnam, and Korea. For each of these conflicts, the US government (and for Iraq, media and nonprofit organizations) documented combat-related deaths in a manner that we can expect to be essentially comprehensive. Each data set provides not just the name of each soldier who died, but also their military unit, the type of incident leading to their death, and the location and date where this incident occurred. This is not the same as event-level data, but these individual-level factors can be used to determine which soldiers were killed as a result of incidents that occurred at the same place and time. 16

Figure 2. Distribution of US casualties by incident in Iraq, Vietnam, and Korea.
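Converting individual-level fatality records into incident sizes amounts to a grouping operation. The sketch below assumes that records sharing the same date, location, and incident type belong to one incident; that keying rule, the field layout, and the records themselves are hypothetical, and the article's actual matching procedure may differ.

```python
from collections import Counter

# Hypothetical individual-level records: (name, date, location, incident type).
records = [
    ("A", "day-1", "site-1", "small arms"),
    ("B", "day-1", "site-1", "small arms"),
    ("C", "day-1", "site-2", "artillery"),
    ("D", "day-3", "site-1", "small arms"),
    ("E", "day-1", "site-1", "small arms"),
]

def incident_sizes(rows):
    """Count fatalities per incident, keying incidents on (date, location, type)."""
    per_incident = Counter((date, place, kind) for _, date, place, kind in rows)
    return sorted(per_incident.values(), reverse=True)

sizes = incident_sizes(records)
print(sizes)                      # one entry per inferred incident
```

The resulting list of incident sizes is the input that a frequency-severity plot or a power law fit would use.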
All three data sets follow power laws throughout their left tails: the method described in the next section estimates that the proper place to begin fitting the power law is at events causing one casualty or more. For that reason, the estimation technique advanced in this article projects that zero casualties are missing from these data, and the estimated numbers of US fatalities in Iraq, Vietnam, and Korea exactly match the observed body counts of these wars, which we expect to be nearly comprehensive.
The distributional assumption driving this article’s estimation strategy thus holds up well to a direct test, but of course this does not establish that the estimation strategy will work in all cases; perhaps there are some contexts where we would expect data on small-scale events to deviate substantially from the rest of the distribution, and no empirical phenomenon will conform precisely to a stylized model. Yet at the very least, we can say that the key assumptions invoked in this article are supported by clear empirical evidence. There is certainly more reason to think that we can approximate violence data with power laws than to think that error terms resemble normal distributions or that marginal effects are linearly additive in empirical studies of armed conflict. The next section describes in more detail how this assumption can be leveraged to estimate conflict size.
Implementation and Results
The implementation here is described specifically for casualties sustained by Native Americans. The same procedures were repeated for estimating the number of casualties sustained by US forces. The implementation employs statistical techniques developed by Clauset, Shalizi, and Newman (2009, henceforth CSN). 17 This section leverages those techniques to provide a new approach for estimating conflict size.
We begin by estimating the scaling parameter. In order to do this, it is important to specify the range over which this parameter should be estimated. CSN present a maximum likelihood approach to determining the cutoff point, x_min, below which the data do not conform to a power law, along with α, the scaling parameter characterizing all x ≥ x_min. 18 Using this method, the fitted x_min for data on Native American casualties is twenty and the scaling parameter characterizing the power law distribution above and including this point is α = 2.21.
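A simplified version of this fitting step can be sketched as follows. The sketch makes several assumptions that should be read as such: it uses the closed-form approximation to the discrete maximum likelihood estimate of α, it selects the cutoff by minimizing a Kolmogorov-Smirnov distance between the observed and fitted tails, and it uses a continuous approximation to the fitted tail probabilities. The function names, the `min_tail` threshold, and the synthetic data are all hypothetical; the article's own implementation may differ in each respect.

```python
import math
import random

def fit_alpha(tail, x_min):
    """Approximate discrete MLE: alpha = 1 + n / sum(ln(x / (x_min - 0.5)))."""
    return 1.0 + len(tail) / sum(math.log(x / (x_min - 0.5)) for x in tail)

def ks_distance(tail, x_min, alpha):
    """Largest gap between the empirical and fitted P(X >= x) over the tail."""
    tail = sorted(tail)
    n = len(tail)
    worst, i = 0.0, 0
    while i < n:
        x = tail[i]
        empirical = (n - i) / n                   # share of tail events >= x
        model = ((x - 0.5) / (x_min - 0.5)) ** (-(alpha - 1.0))
        worst = max(worst, abs(empirical - model))
        while i < n and tail[i] == x:             # skip over tied values
            i += 1
    return worst

def fit_power_law(data, min_tail=50):
    """Scan candidate cutoffs; keep the (x_min, alpha) with the smallest KS gap."""
    best = None
    for x_min in sorted(set(data)):
        tail = [x for x in data if x >= x_min]
        if len(tail) < min_tail:                  # too few points to fit reliably
            break
        alpha = fit_alpha(tail, x_min)
        d = ks_distance(tail, x_min, alpha)
        if best is None or d < best[0]:
            best = (d, x_min, alpha)
    if best is None:
        raise ValueError("not enough data above any candidate cutoff")
    return best[1], best[2]

def synthetic_tail(n, x_min, alpha, seed=0):
    """Draw approximate discrete power-law values >= x_min (illustration only)."""
    rng = random.Random(seed)
    return [int((x_min - 0.5) * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) + 0.5)
            for _ in range(n)]

data = synthetic_tail(5000, x_min=20, alpha=2.21)
x_min_hat, alpha_hat = fit_power_law(data, min_tail=200)
print(x_min_hat, round(alpha_hat, 2))
```

The closed-form estimate of α is generally considered accurate only when the cutoff is not very small, so an exact discrete likelihood would be preferable for cutoffs near one.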
Carefully defining x_min serves an important function beyond maximizing model fit. As described previously, the central thrust of the estimation technique developed here is to use observed data on the right side of a distribution (which generally conform to power laws) in order to characterize unobserved events on the left side of a distribution (where the data generally do not conform to power laws, but are presumably undermeasured). CSN’s method for defining x_min objectively indicates where we should divide the data for this purpose. Once we have done so, and estimated the scaling parameter characterizing points x ≥ x_min, we can estimate how many data are missing for events of size x < x_min. 19
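One way to carry out this projection can be sketched as follows. This is an illustration under stated assumptions rather than the article's exact procedure: it anchors the fitted power law on the number of observed events at or above the cutoff, extends the same curve to smaller severities, and treats the gap between expected and observed counts as missing events (a negative gap means the observed data overrepresent that severity). The function names, the truncated normalizing sum, and the toy data are hypothetical.

```python
from collections import Counter

def project_missing(data, x_min, alpha, zeta_terms=200_000):
    """Extend the fitted power law below x_min and compare it with observed counts.

    Returns {severity: (observed, expected, missing_events)} for 1 <= x < x_min.
    """
    observed = Counter(data)
    n_tail = sum(c for x, c in observed.items() if x >= x_min)
    # Truncated sum standing in for the normalizer sum_{k >= x_min} k**(-alpha).
    tail_mass = sum(k ** (-alpha) for k in range(x_min, x_min + zeta_terms))
    scale = n_tail / tail_mass        # expected count at severity x is scale * x**(-alpha)
    table = {}
    for x in range(1, x_min):
        expected = scale * x ** (-alpha)
        seen = observed.get(x, 0)
        table[x] = (seen, expected, expected - seen)
    return table

def projected_total(data, x_min, alpha):
    """Observed casualties plus casualties implied by the projected missing events."""
    table = project_missing(data, x_min, alpha)
    missing_casualties = sum(x * missing for x, (_, _, missing) in table.items())
    return sum(data) + missing_casualties

# Toy data: 100 events at or above a cutoff of 3, with small events under-recorded.
toy = [1] * 5 + [2] * 5 + [3] * 40 + [4] * 25 + [6] * 20 + [10] * 10 + [50] * 5
table = project_missing(toy, x_min=3, alpha=2.0)
total = projected_total(toy, x_min=3, alpha=2.0)
print(round(table[1][1], 1), round(table[2][1], 1), round(total, 1))
```

In the toy example, the 100 tail events imply far more one- and two-casualty events than the five of each that were recorded, so the projected total exceeds the observed sum.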
Table 1 (Panels A and B) presents the results. There are presumably a large number of missing observations for events that led to few Native American casualties—in fact, more than 90 percent of the events projected to be missing from the observed data correspond to engagements causing three Native American casualties or fewer. This is consistent with the expectation that these events would not only have been the most common, but also the most poorly recorded and preserved. Together, these projections require adding 19,786 casualties to the observed data.
Table 1. Extrapolating Data on Native American and US Casualties.
Panel A: Observed versus projected data, Native American casualties
Estimated x_min: 20
Estimated α: 2.21
Estimated total casualties sustained by Native American forces: 53,361
Note: See the fourth section for a discussion of statistical uncertainty surrounding these estimates.
The projected volume of missing data tapers quickly, however. When it comes to events causing ten Native American casualties, for example, Table 1 (Panel A) shows that we will only estimate about 26 missing engagements (implying that there are roughly 260 missing casualties associated with events of this size). Events causing 15 Native American casualties are actually overrepresented in the observed data (perhaps because this is a number to which observers or historians would have rounded uncertain estimates), and so we have to remove about 50 observed casualties here to make the data consistent with projected values.
In total, the method described here estimates that the data set is missing 16,678 engagements, which would have led to 27,718 Native American casualties. Adding these to observed figures, we can estimate that 53,361 Native Americans were killed, captured, or mortally wounded during the American Indian Wars between 1776 and 1890. 20
Table 1 (Panel B) also shows results of repeating these procedures to estimate casualties sustained by US forces. These data are still far from being complete—overall, they may be missing as many engagements as they record. Nevertheless, given that most missing engagements are presumably small in size, we can project that there are fewer than 1,500 US casualties missing from the data and that the US Army sustained 11,889 total casualties during the American Indian Wars.
Uncertainty and Sensitivity Analysis
One advantage of the estimation technique described here is that it is possible to characterize explicitly the uncertainty associated with estimated parameters along with how this affects resulting projections. This is generally not possible for casualty estimates that rely on expert opinion or secondary literature. 21
Nonparametric bootstrapping is the logical approach. There are 1,297 engagements in the data set that resulted in recorded casualties sustained by Native American forces and 1,233 engagements in the data set that resulted in recorded casualties sustained by US forces. We can generate “bootstrapped” data sets of the same size by randomly sampling with replacement from the observed data. For each bootstrap sample, we can estimate maximum likelihood values for α and x_min, resulting in the joint distributions shown in the top panels of Figure 3. Then for each pair of parameters, we can project the total number of casualties sustained by Native American and US forces, resulting in the histograms at the bottom of Figure 3. 22 As expected, Figure 3 shows that there is substantially more uncertainty associated with estimating Native American casualties.

Figure 3. Bootstrapped estimates of x_min, α, and total casualties.
As shown in the bottom of Figure 3, bootstrap samples return mean casualty estimates that are close to the projections provided earlier: the mean estimate of 50,994 Native American casualties is 4 percent less than the projection given in Table 1 (Panel A), and the mean estimate of 11,849 US casualties is within 1 percent of the baseline model. Across 10,000 bootstrap samples, the standard error for estimates of total Native American casualties is 13,330, and the standard error for estimates of total US casualties is 1,485.
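The resampling logic can be sketched in a few lines. The version below is a simplified illustration: it bootstraps a single statistic (the approximate estimate of α at a fixed cutoff) on synthetic data, whereas the procedure described above re-estimates both α and x_min on every resample and then projects total casualties. The helper names, the synthetic data, and the fixed cutoff are assumptions made for brevity; any function of a resampled data set could be passed as `statistic`.

```python
import math
import random
import statistics

def fit_alpha(tail, x_min):
    """Approximate discrete MLE of the scaling parameter for values >= x_min."""
    return 1.0 + len(tail) / sum(math.log(x / (x_min - 0.5)) for x in tail)

def bootstrap(data, statistic, n_boot=500, seed=0):
    """Resample with replacement n_boot times; return (mean, standard error)."""
    rng = random.Random(seed)
    n = len(data)
    draws = []
    for _ in range(n_boot):
        sample = [data[rng.randrange(n)] for _ in range(n)]   # same size as original
        draws.append(statistic(sample))
    return statistics.mean(draws), statistics.stdev(draws)

# Synthetic severities that roughly follow a power law above 20 (illustration only).
rng = random.Random(1)
data = [int(19.5 * (1.0 - rng.random()) ** (-1.0 / 1.21) + 0.5) for _ in range(1000)]

mean_alpha, se_alpha = bootstrap(data, lambda s: fit_alpha(s, 20))
print(round(mean_alpha, 2), round(se_alpha, 3))
```

The standard error reported here is simply the standard deviation of the statistic across resamples, which is the same summary given in the text for total casualties.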
Analyzing the Data by Subset
Given that the American Indian Wars spanned a wide range of time, space, and actors, it makes sense to examine whether results would be meaningfully different if the estimation strategy were applied to subsets of the data rather than to all of them at once. In Figure 4, for example, the data are plotted for each of three periods: 1776 to 1814, 1815 to 1864, and 1865 to 1890. These time periods correspond to logical historical breakpoints. Britain played a prominent role in backing tribes who fought the United States between the War of Independence and the War of 1812. After the Treaty of Ghent, Britain curtailed this military support, and the continent’s interior was opened to rapid US settlement (along with escalating demands for resettling tribes). Once the Civil War ended, the United States had a greatly expanded military infrastructure that it used to confine tribes to reservations. These periods thus had different political and military dynamics. Yet when the estimation strategy developed in this article is applied to each subset individually and the results are aggregated together, the projection of 55,689 total Native American casualties is just 4 percent larger than the output of the full-sample analysis. The projection of 13,669 US casualties is 15 percent larger than the original estimate.

Figure 4. Analysis of period subsets.
Similar procedures were used to analyze data divided into Eastern, Plains, and Western regional subsets. 23 These regions varied by topography in ways that influenced military behavior: in the east, for instance, tribes often fought on foot using woodlands for cover and concealment, while tribes on the open Plains were more likely to fight on horseback, and armed conflict in the West typically took place on more rugged terrain. Tribes living in different parts of the continent also had systematic social and cultural differences; for example, tribes living on the Pacific Coast tended to have relatively decentralized social structures compared to those in the East. Yet when these regional subsets are analyzed separately and the results are aggregated together, they again return estimates near to the original figures. The overall projections of 55,507 Native American casualties and 13,323 US casualties are 4 percent and 12 percent higher than the baseline estimates, respectively.
A third way to divide the data is based on the population of the tribes involved in each military engagement. The 86 tribes in the data set varied widely on this score, from tribes with fewer than 500 members (such as the Modocs and Wallawallas) to those with more than 10,000 (such as the Creeks and Cherokees). The mean tribe in these data had a population of about 5,000 (the standard deviation is roughly 4,000). 24 There are several reasons to think that tribes of different sizes might have varied in their military behavior, with the most obvious being that larger tribes would have been more able, all else being equal, to sustain and inflict larger numbers of casualties. If we divide the data into quartiles based on the populations of the tribes involved in each engagement, analyze each of these subsets individually and then aggregate the estimates together, then this projects a total of 62,122 Native American casualties during the American Indian Wars, 16 percent higher than the original estimate. The projection for US casualties based on these subsets of the data is 11,902, almost identical to the baseline projection.
Table 2 summarizes these estimates. All are within 30 percent of the baseline projection, and many are much closer. This stands in sharp contrast with extant debates about the sizes of individual armed conflicts, where estimates often range by an order of magnitude or more. 25
Table 2. Summary of Sensitivity Analyses.
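The percentage comparisons reported for the subset analyses can be reproduced directly from the figures given in the text. The short sketch below uses only the baseline and subset projections quoted in this section; the dictionary layout and function name are illustrative, and the table's full contents are not reproduced here.

```python
# Baseline projections from the full-sample analysis.
baseline = {"Native American": 53_361, "US": 11_889}

# Aggregated projections from each way of subdividing the data.
alternatives = {
    "period subsets": {"Native American": 55_689, "US": 13_669},
    "regional subsets": {"Native American": 55_507, "US": 13_323},
    "tribe-size quartiles": {"Native American": 62_122, "US": 11_902},
}

def pct_diff(value, reference):
    """Percentage by which value differs from reference."""
    return 100.0 * (value - reference) / reference

summary = {
    name: {side: round(pct_diff(est[side], baseline[side])) for side in baseline}
    for name, est in alternatives.items()
}
for name, row in summary.items():
    print(name, row)
```

Each subset projection is compared against the full-sample estimate for the same side, so a positive value means the subset analysis projects more casualties than the baseline.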
Considering Alternative Distributions
Another way to evaluate how power laws characterize the data is to examine this fit relative to other distributions. As CSN explain, several alternatives can easily be mistaken for power laws when they take on certain parameters. In examining twenty-four phenomena that have been claimed to follow power laws, CSN found that it is especially difficult to determine whether data are more representative of power law or lognormal distributions (Clauset et al. 2009, 679-89).
Figure 5 draws a similar comparison for the Native American casualties data used in this article, showing how the power law and lognormal fits are hard to distinguish. 26 If we measure model fit by mean squared error (predicted vs. actual observations at each magnitude), the power law and the lognormal perform identically to two decimal places. 27 If we compare model fit with a likelihood ratio test, then the fits are similarly indistinguishable (p = .93). 28 And even though these distributions fit the data similarly well, they have noticeably different implications: since the lognormal distribution is thinner than a power law on its left tail, it predicts that there are fewer missing observations, leading to an estimate of 36,892 Native American casualties. This is 31 percent less than the 53,361 Native American casualties projected using power laws. Substituting a lognormal fit makes less of a difference for estimating US casualties: the resulting estimate of 11,502 US casualties is 3 percent less than the estimate based on a power law. This is what we would expect given that the x_min cutoff for the US data is much lower; to the extent that the lognormal and power law distributions have differently shaped left tails, a lower cutoff gives these divergences less room to affect overall projections.

Figure 5. Comparing power law and lognormal models.
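The comparison described above can be illustrated with a minimal sketch. It is not the article's code and does not use the article's data: the sample is synthetic, the cutoff `XMIN`, the seed, and all function names are assumptions made for illustration, and a coarse grid search stands in for a proper numerical optimizer when fitting the truncated lognormal. The p-value follows the normalized likelihood ratio approach that CSN describe, in which a large p-value means that neither distribution is clearly favored.

```python
import math
import random


def sample_power_law(n, alpha, xmin, rng):
    """Draw n values from a continuous power law by inverse-transform sampling."""
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def fit_power_law(xs, xmin):
    """Closed-form maximum likelihood estimate of the exponent for x >= xmin."""
    return 1.0 + len(xs) / sum(math.log(x / xmin) for x in xs)


def power_law_loglik(xs, alpha, xmin):
    """Pointwise log-likelihoods under a continuous power law."""
    return [math.log((alpha - 1.0) / xmin) - alpha * math.log(x / xmin) for x in xs]


def lognormal_loglik(xs, mu, sigma, xmin):
    """Pointwise log-likelihoods under a lognormal truncated below at xmin."""
    tail = 0.5 * math.erfc((math.log(xmin) - mu) / (sigma * math.sqrt(2.0)))
    if tail <= 0.0:
        return None  # parameters put no numerical mass above xmin
    const = -math.log(sigma) - 0.5 * math.log(2.0 * math.pi) - math.log(tail)
    return [const - math.log(x) - (math.log(x) - mu) ** 2 / (2.0 * sigma ** 2)
            for x in xs]


def fit_lognormal_grid(xs, xmin):
    """Coarse grid search over (mu, sigma); an illustrative stand-in for an optimizer."""
    best = None
    for i in range(-20, 21):
        for j in range(1, 21):
            mu, sigma = 0.25 * i, 0.25 * j
            ll = lognormal_loglik(xs, mu, sigma, xmin)
            if ll is not None and (best is None or sum(ll) > best[0]):
                best = (sum(ll), mu, sigma, ll)
    return best[1], best[2], best[3]


def likelihood_ratio_test(ll_a, ll_b):
    """Return the log-likelihood ratio and a two-sided p-value for 'ratio = 0'."""
    n = len(ll_a)
    diffs = [a - b for a, b in zip(ll_a, ll_b)]
    ratio = sum(diffs)
    mean = ratio / n
    sd = math.sqrt(sum((d - mean) ** 2 for d in diffs) / n)
    return ratio, math.erfc(abs(ratio) / (math.sqrt(2.0 * n) * sd))


XMIN = 1.0                      # assumed cutoff for the synthetic sample
rng = random.Random(0)          # fixed seed so the sketch is repeatable
data = sample_power_law(500, 2.5, XMIN, rng)

alpha_hat = fit_power_law(data, XMIN)
pl_ll = power_law_loglik(data, alpha_hat, XMIN)
mu_hat, sigma_hat, ln_ll = fit_lognormal_grid(data, XMIN)
ratio, p_value = likelihood_ratio_test(pl_ll, ln_ll)

print(f"alpha = {alpha_hat:.2f}, lognormal mu = {mu_hat:.2f}, sigma = {sigma_hat:.2f}")
print(f"log-likelihood ratio = {ratio:.2f}, p = {p_value:.2f}")
```

A positive ratio favors the power law and a negative ratio favors the lognormal, but the sign is only informative when the p-value is small.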
What should we make of these comparisons? It will almost always be possible to make plausible estimates based on alternative distributions; ultimately, there are an infinite number of distributions to choose from, and for any given empirical data set, some are bound to fit better than a power law. With that said, the power law is appealing for three reasons: it is simple (requiring analysts to estimate only one parameter), it is generalizable (as scholars have found that violence data appear to conform to power laws widely, both within and across conflicts), and, as we have seen, the better the data are measured, the more closely they approximate power laws. At the very least, this seems to be a reasonable starting point for the analysis. And even if this section has demonstrated that varying distributional assumptions can cause estimates of Native American casualties to change by roughly 30 percent, this uncertainty still pales in comparison to the vastly different figures often given in scholarly and public debates about conflict size. The following section concludes by drawing connections to these broader issues.
Discussion
This article offers two main contributions: one is historical, speaking to knowledge about the American Indian Wars in particular, and the second is methodological, speaking to broader academic questions about estimating conflict size.
The principal historical relevance of this article is to provide the first systematic casualty estimates for the American Indian Wars. These conflicts are central to understanding how the United States became a continental power, yet historians do not know how violent this period actually was. This is largely because any historical analysis of the subject will inevitably confront missing data and necessitate drawing difficult inferences. This article offers a novel approach to drawing those inferences based on empirical regularities that characterize armed conflict generally. This technique suggests that roughly 50,000 Native Americans and roughly 12,000 US forces were killed, captured, or mortally wounded during the American Indian Wars.
To put these figures in perspective: if US forces sustained 12,000 casualties in the American Indian Wars, then this would be about as large as the totals from the War of Independence, the War of 1812, the Mexican–American War, and the Spanish–American War combined. 29 This is roughly a third of combat losses US forces sustained in Korea, and about twice the number of losses sustained during the occupations of Iraq and Afghanistan, as of this writing. And these comparisons are all in absolute terms. Per capita, the American Indian Wars are even more pronounced as one of the most violent periods in US history.
For Native Americans, these conflicts exacted a far greater toll. Fifty thousand combat casualties are roughly what US forces sustained in Vietnam, and again, the populations in question are far different. From 1776 to 1890, the population of Native Americans living in the continental United States averaged roughly 400,000 (Reddy 1993). Against this reference point, a figure of 50,000 casualties is massive, and it is unclear how many populations have sustained more battle deaths per capita. Obviously, it is difficult to draw direct comparisons in terms of aggregate counts because losses during the American Indian Wars were distributed across so much time and so many actors, and thus, it is also instructive to describe the intensity of this violence. As mentioned earlier, the average tribe enumerated in the data set had roughly 5,000 members at the time of its first engagement with the United States. Fifty thousand Native American casualties, distributed across the 86 tribes and 115 years covered in this analysis, amounts to approximately one of every thousand Native Americans killed per annum—for the United States today, this would be equivalent to about 300,000 annual fatalities.
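The per capita comparison above can be reproduced with a few lines of arithmetic. This is a minimal sketch using the rounded figures quoted in the text; the present-day US population of 320 million is an assumption introduced here, not a figure from the article. Because the unrounded rate comes out slightly above one per thousand, the equivalent figure lands somewhat above the rounded 300,000 cited in the text.

```python
# Rounded figures quoted in the text.
CASUALTIES = 50_000              # projected Native American casualties
YEARS = 115                      # 1776 to 1890
AVG_NATIVE_POPULATION = 400_000  # average population over the period

# Assumption for illustration only: approximate present-day US population.
US_POPULATION_TODAY = 320_000_000

annual_casualties = CASUALTIES / YEARS
rate_per_thousand = 1000 * annual_casualties / AVG_NATIVE_POPULATION
us_equivalent = US_POPULATION_TODAY * rate_per_thousand / 1000

print(f"casualties per year: {annual_casualties:.0f}")
print(f"annual rate per thousand: {rate_per_thousand:.2f}")
print(f"equivalent annual US fatalities: {us_equivalent:,.0f}")
```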
Of course, we do not need to invoke power laws to establish that the American Indian Wars were costly, especially for Native Americans—this is common knowledge. But from a historical perspective, it is useful to have an objective sense of relevant magnitudes. For instance, Pinker (2011, 195) cites a figure of 20 million Native American deaths during his discussion of the “annihilation of the American Indians” as being one of the most violent phenomena in world history, but the vast majority of this population loss was a result of disease and not armed conflict. It is important to understand the impact of these diseases, but they were largely independent of the fighting that took place during the American Indian Wars, and the resulting population loss was so huge that it can obscure the military aspects of the conflict, which were important and devastating in their own right. This article addresses the military costs of the American Indian Wars directly, and this is its main historical contribution.
More generally, this article has offered a new method for estimating conflict size, a subject that is often controversial within scholarship and public debates. During the occupation of Iraq, for instance, a study published in The Lancet (Burnham et al. 2006) used survey methods to estimate that roughly 600,000 civilians had died as a result of the war. This figure was an order of magnitude higher than contemporary estimates provided by the US government. The Lancet article received substantial attention, as well as rebuttals from both the US government and academics. 30
Among scholars, these kinds of disagreements often go well beyond assessing individual cases, as prominent data sets measuring conflict size are often themselves the subject of dispute. For example, two recent studies (Obermeyer, Murray, and Gakidou 2008; Gohdes and Price 2013) argued that the most widely used data set on battle deaths since World War II (Lacina and Gleditsch 2005) had systematically underestimated the costs of recent conflicts; furthermore, both studies claimed that correcting these estimates eliminates the perception that violence has been declining over time. 31 These debates are thus not just about the relative merits of different estimation techniques per se, but about the existence of an empirical phenomenon that is central to international relations scholarship. This only highlights the fact that while violence is the defining feature of armed conflict, measurement issues impede scholars from analyzing it systematically on even the broadest dimensions.
This article offers a new way to approach such debates. The method described here is based on event counts, but it explicitly seeks to go beyond them, using the data’s distribution to draw inferences about their comprehensiveness. To explain the technique and demonstrate its utility, this article examined how power laws can be used to estimate the severity of the American Indian Wars, but scholars can use the same methodology to evaluate the size of many other conflicts as well. The technique requires that observed data be relatively comprehensive in capturing large-scale events, but one of its appealing aspects is that it is specifically designed to deal with missing data for small-scale events. Political scientists can play an important role on this score by developing objective ways of going beyond historical material or government records, which are usually incomplete. It will almost always be necessary to supplement data that are available with inferences about information that has not been preserved. This article has shown how power laws can be used as a tool for that purpose.
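The general logic of this kind of inference can be sketched in code. The block below is one possible reading of the idea, not the article's actual procedure: it assumes that events at or above a cutoff are fully recorded, that event sizes follow a discrete power law with a known exponent, and that the expected number of smaller events can be extrapolated from the fitted tail. The event sizes, the exponent, the cutoff, and the function names are all hypothetical.

```python
def tail_mass(alpha, xmin, cutoff=10_000):
    """Approximate the sum of k**(-alpha) for k >= xmin.

    Uses a direct sum up to `cutoff` plus an integral approximation
    for the remainder (valid for alpha > 1).
    """
    head = sum(k ** -alpha for k in range(xmin, cutoff))
    return head + cutoff ** (1.0 - alpha) / (alpha - 1.0)


def projected_total(observed_sizes, alpha, xmin):
    """Project total casualties, assuming events of size >= xmin are complete.

    Observed events below xmin are set aside and replaced by the counts
    that the fitted power law implies for each size from 1 to xmin - 1.
    """
    tail = [x for x in observed_sizes if x >= xmin]
    # Expected number of events of size x is scale * x**(-alpha).
    scale = len(tail) / tail_mass(alpha, xmin)
    inferred_small = sum(x * scale * x ** -alpha for x in range(1, xmin))
    return sum(tail) + inferred_small


# Hypothetical event sizes (casualties per engagement), for illustration only.
sizes = [1, 1, 2, 3, 5, 5, 8, 12, 20, 40, 75, 150]
ALPHA = 2.0   # assumed exponent
XMIN = 5      # assumed completeness cutoff

recorded = sum(sizes)
projected = projected_total(sizes, ALPHA, XMIN)
print(f"recorded casualties: {recorded}")
print(f"projected casualties: {projected:.0f}")
```

In this sketch, the gap between the recorded and projected totals is driven entirely by small events, which is where the record is assumed to be least complete.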
Footnotes
Acknowledgment
Special thanks to Bear Braumoeller, Aaron Clauset, Allan Dafoe, Scott Gates, Colin Gillespie, Neil Johnson, Joshua Kertzer, Richard Nielsen, Arthur Spirling, Brandon Stewart, Benjamin Valentino, and two anonymous reviewers for providing helpful comments that greatly improved the manuscript. All errors are the author’s.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.