Abstract
The production of good quality research is now one of the expectations for those working in forensic science and forensic pathology. Although potentially daunting, this necessary and important goal can be realized with the appropriate selection and utilization of statistical tools in the analysis of research projects in the forensic sciences. As detailed textbooks on statistics are available, this article provides an overview of how to organize data commonly seen in forensic pathology and which statistical procedures should be performed.
Introduction
Evidence-based medicine is not a new concept. In clinical medicine it is used to individually tailor decisions concerning patients’ treatment, based on the best clinical evidence available at the time of the treatment (1). Although forensic pathology cannot conduct double-blinded randomized studies assessing treatments, this does not preclude the discipline from conducting research with statistical analysis. With the release of the National Institute of Justice report on forensic sciences (2) and the Goudge report examining pediatric forensic pathology in Ontario, Canada (3), there is now, more than ever, an expectation that research into forensic sciences and forensic pathology respectively will be undertaken by those in the profession.
Forensic pathology is a discipline that has frequently relied on case studies and small case series to present unusual or atypical findings. The absence of a survival period amongst our patients does not preclude us from conducting good quality research and using appropriate study design and statistics for the planning and analysis of our data.
In any article examining patients there is a need to convey to the readers the general characteristics of the study population. Although studies with small sample numbers may effectively present findings as a list of each patient, along with the details of the variable being studied, this method is impractical for larger sample sizes. In addition, a list of data does not give the reader an overall sense of the data. Thus there is a need to present a summary of the data, both for the patient characteristics and for the variable(s) being examined.
In all studies, general guidelines for the presentation of this type of descriptive information indicate that an overview of the data should be presented at the beginning of the results section of the paper. The overview should accurately reflect the data included within the study and give details regarding any data that was dropped from the study, along with an explanation as to why the data was dropped. (For example, in a study examining toxicological analysis on consecutive autopsy cases, there may be a few cases that were excluded from the study due to decomposition and the inability to acquire the needed blood samples for the toxicological analysis.)
Beyond a descriptive overview, the use of statistics in research can attempt to establish whether there is an association between a variable being studied and a particular outcome. Further analysis may then explore the relative weighting of the impact of multiple variables on the outcome being investigated. Which tests should be used, however, varies depending on the data being examined. The results section should outline what statistical analysis was conducted and should provide sufficient information on the analysis to allow any reader to understand how the results were obtained. Use of computerized statistical packages is common, and therefore details pertaining to the computer program used should also be referenced.
This review article will give a brief overview of how statistics can be used in forensic pathology research. This overview is not meant to be a complete text on study design, mathematical formulae explaining the statistical procedures or to be a complete discussion on all medical statistics as such texts are already available (4–7). Instead the goal is to present a general introduction to the approach in analysis of the data sets obtained from the most common types of studies currently performed in forensic pathology. First level or exploratory research is usually descriptive in nature and frequently is obtained from case series while more analytical studies require statistical tests designed to compare two or more groups that differ with respect to one or more independent variables. Either type of research can explore continuous or discrete data. In all studies, especially in which analysis goes beyond the merely descriptive, the selection of the appropriate analytical tool requires an understanding of, a respect for, and an ability to test for the assumptions and limitations embedded within that test selection. This brief overview will use more specific examples selected to illustrate these general concepts.
Continuous Data
Continuous data refers to data that can be measured along a spectrum and is not restricted to simple integers; examples used in forensic pathology include height, weight, organ weights and concentrations of drugs in the blood. When summarizing this type of data, the mean is commonly cited; this is the arithmetic average of the data. The mean, however, is sensitive to outlying data, and when such values are present it is recommended to present the sample median as well. The median in an odd-numbered set of ordered data is the middle value; in an even-numbered set it is the average of the central two points. Presenting the smallest and largest values of a variable being examined (the range) will also assist the reader in appreciating the distribution of the data for that variable.
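As a minimal sketch of these summary measures, the following Python example uses only the standard library. The heart weights are hypothetical values chosen to include one outlier; they are not drawn from any study cited here.

```python
import statistics

# Hypothetical heart weights in grams (illustrative values, not study data).
heart_weights_g = [310, 325, 340, 355, 360, 380, 720]

mean_weight = statistics.mean(heart_weights_g)      # sensitive to the 720 g outlier
median_weight = statistics.median(heart_weights_g)  # middle value of the ordered data
weight_range = (min(heart_weights_g), max(heart_weights_g))

print(f"mean   = {mean_weight:.1f} g")
print(f"median = {median_weight} g")
print(f"range  = {weight_range[0]} to {weight_range[1]} g")
```

With these hypothetical values the single outlier pulls the mean (about 398.6 g) above six of the seven observations, while the median (355 g) is unaffected.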
Continuous data can also be converted into discrete data by presenting frequency distributions. For example, body weight can be broken down into ten or twenty kilogram intervals. The breakdown of the number of individuals within each interval can then be presented in a table or a graph (histogram); the latter gives a visual representation of the weights of the patient population. The spread of the data can also be conveyed to the reader by providing the standard deviation.
Case series are commonly seen in forensic pathology literature. When deaths are a result of a recently described disease process or secondary to intoxication from a new drug, there is a need to describe the parameters of those deaths. Examples of these statistical concepts can be found in the article “Methadone deaths: a toxicological analysis” (8) where the authors present a series of 111 deaths involving methadone. The purpose of the article was to present the fatal concentration of methadone in deaths involving the drug. Within the article, the deaths were divided into two groups: deaths where methadone was considered the sole cause of death, and deaths due to a combination of methadone and other drug(s). The concentrations of methadone in the blood of the individuals dying from this drug were presented, including the ranges, the mean and the median. The authors also presented the distribution of blood methadone concentrations using a histogram broken down by 100 µg/L intervals. Deaths from methadone alone and deaths from methadone with other drugs were presented separately within the same histogram, giving the readers a visual representation of the blood methadone concentrations in the deaths that were being examined.
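The conversion of continuous data into a frequency distribution can be sketched as follows. This is an illustrative example only: the body weights are hypothetical, the 20 kg interval width is one of the two options mentioned above, and the text histogram simply stands in for a plotted one.

```python
import statistics
from collections import Counter

# Hypothetical body weights in kilograms (illustrative values only).
body_weights_kg = [52.0, 58.5, 61.2, 67.8, 70.1, 73.4, 79.9, 84.3, 88.0, 102.6]
interval_width_kg = 20


def interval_start(weight, width):
    """Return the lower bound of the interval that contains `weight`."""
    return int(weight // width) * width


# Count how many individuals fall into each 20 kg interval.
counts = Counter(interval_start(w, interval_width_kg) for w in body_weights_kg)

# Print a simple text histogram, one row per interval.
for start in sorted(counts):
    label = f"{start} to <{start + interval_width_kg} kg"
    print(f"{label:>16}: {'#' * counts[start]} ({counts[start]})")

# Sample standard deviation (n - 1 in the denominator) as a measure of spread.
sample_sd = statistics.stdev(body_weights_kg)
print(f"sample standard deviation = {sample_sd:.1f} kg")
```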
Following descriptive case series, the next common type of research paper in the forensic sciences goes beyond this and both presents and compares information derived from two (or more) contrasting groups. In these studies the underlying question is whether the two groups examined are distinct or not. There are several statistical tests available to answer that question. The Student's t-test is commonly used for such a comparison; it does, however, have certain assumptions that must be met. These assumptions are that the variable being examined is normally distributed, that the variance is the same in the two populations and that the samples that formed the two groups were randomly and independently chosen (5).
To illustrate, in a study reporting hearts with and without acute thrombi, Burke and colleagues examined several different coronary risk factors that could precipitate acute coronary thrombosis (9). For their univariate analyses they used Student's two-tailed t-test to examine coronary risk factors and plaque morphology. In the univariate analyses, they report that plaque rupture was associated with low serum HDL cholesterol concentrations (P=0.008), elevated serum total cholesterol concentrations (P=0.01) and an elevated ratio of total to HDL cholesterol (P=0.001). Put simply, the Student's t-test found a statistically significant difference between the two groups’ cholesterol levels.
If the data is paired, it is not independent, and the paired t-test should therefore be used. Paired data arises when subjects have been matched, or when one subject has had two separate exposures. In these situations the variability in the outcome is reduced; the calculations associated with the paired t-test take this into consideration (5).
When data is not normally distributed, nonparametric tests should be used. One of the more commonly used in the pathology literature is the Mann-Whitney U-test. Mathematically, this test uses the rank of the data to establish whether there is a difference between two groups that is greater than random chance alone. This statistical test is commonly seen when immunohistochemical staining is examined and compared in two groups; the staining scores are not normally distributed, so the Mann-Whitney test is appropriately used in these studies. For matched data, the Wilcoxon signed rank test is used instead (5).
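To make the equal-variance assumption concrete, the following sketch computes the Student's t statistic from a pooled variance, using only the Python standard library. The function name and the two groups of values are hypothetical, and in practice a statistical package would also report the P value.

```python
import math
import statistics


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for Student's two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    # Pooled variance: only meaningful if the two population variances are equal.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Hypothetical heart weights in grams for two independent groups.
group_a = [310, 320, 330, 340, 350]
group_b = [330, 340, 350, 360, 370]

t_value, df = pooled_t_statistic(group_a, group_b)
print(f"t = {t_value:.2f}, degrees of freedom = {df}")
```

For these hypothetical values t is -2.0 with 8 degrees of freedom, which falls short of the two-tailed 5% critical value of 2.306; the exact P value would be read from a t table or a statistical package.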
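A paired analysis works on the within-pair differences rather than on the two groups separately. The sketch below assumes a hypothetical example in which a measurement is taken at two sampling sites in the same four decedents; the function name and values are illustrative only.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, degrees_of_freedom) for the paired t-test."""
    if len(first) != len(second):
        raise ValueError("paired data must have the same number of observations")
    # Each subject contributes one difference; the test is on their mean.
    differences = [b - a for a, b in zip(first, second)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


# Hypothetical paired measurements (same subject, two sampling sites).
site_one = [10, 12, 9, 11]
site_two = [12, 13, 12, 13]

t_value, df = paired_t_statistic(site_one, site_two)
print(f"t = {t_value:.2f}, degrees of freedom = {df}")
```

Because each subject serves as its own control, the standard error is based on the variability of the differences alone, which is how the reduced variability noted above enters the calculation.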
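The ranking step behind the Mann-Whitney U-test can be sketched as follows. The staining scores are hypothetical, the helper names are my own, and the sketch stops at the U statistic; a statistical package would supply the P value, including the adjustment needed for tied scores.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) computed from the ranks of the pooled data."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical ordinal staining scores (0 to 3) in two groups.
scores_a = [0, 1, 1, 2]
scores_b = [2, 3, 3, 3]

u_a, u_b = mann_whitney_u(scores_a, scores_b)
print(f"U_a = {u_a}, U_b = {u_b}")
```

A U value close to zero for one group, as here, indicates that almost every score in that group ranks below the scores in the other group.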
Categorical Data
Categorical data is often encountered in forensic pathology studies. It can be dichotomous (such as gender), unordered (such as manners of death), or ordered (such as injury severity scores). For ease of analysis and to simplify data presentation, some variables can be broken down into dichotomous categories, such as a smoker vs. nonsmoker, or a hypertensive vs. a normotensive patient. Categorical data can be summarized in narrative form, but a summary table showing the number of cases per category along with their relative frequency is a simple and efficient way to present the information quickly to the reader.
The simplest summary table is a 2 x 2 table where the data is divided into four categories. Typically the variable being examined is broken down into whether it is present or not (usually listed in the table by rows), and into one of two outcomes (usually listed in the table by columns). When a study measures more than one variable, each variable can still be examined separately to establish whether its distribution differs in the two groups being studied. The chi-square test is commonly used to examine categorical data in this manner.
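As an illustration, the chi-square statistic for a 2 x 2 table can be computed by comparing each observed count with the count expected if the variable and the outcome were unrelated. The counts below are hypothetical and the function name is my own; the closed-form P value applies only to a table with one degree of freedom.

```python
import math


def chi_square_2x2(table):
    """Return (chi_square, p_value) for a 2 x 2 table given as [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows = variable present / absent,
# columns = outcome group one / outcome group two.
table = [[20, 10], [10, 20]]

chi2, p_value = chi_square_2x2(table)
print(f"chi-square = {chi2:.2f}, P = {p_value:.3f}")
```

For these hypothetical counts every expected cell is 15 and the chi-square statistic is approximately 6.67, giving a P value of approximately 0.01.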
As an example, Opeskin and Berkovic examined risk factors for sudden unexpected death in epilepsy (SUDEP), comparing patients dying of SUDEP to control patients who had a history of epilepsy but had died from other causes (10). In their study they presented extensive data on both patient groups using several tables to effectively convey the information to the reader. They used the Student's t-test to examine differences in means for the independent groups, the Mann-Whitney U-test for non-parametric testing and the chi-square test to compare categorical data. Concentrating on the results of the categorical data analyses, the authors found no differences in clinical features (presence or absence of mental retardation, psychiatric illness, dementia, or recent stressful life event) between the SUDEP group and the control group. The SUDEP group however was more likely to die in their sleep and have evidence of terminal seizure activity when compared to the control group. This article exemplifies how simple, but appropriate, statistical analyses can be used even when survival data is absent.
When a 2 x 2 table is being used but the overall sample size is small, a continuity correction (or Yates’ correction) is used within the calculation. The Fisher exact test is recommended when the expected count in one or more of the table cells is smaller than 5. Finally, similar to the paired t-test used with continuous data, when categorical data is paired, the analysis must take this into consideration. This can be done using the McNemar chi-square test (5).
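The two small-sample and paired alternatives can be sketched with the standard library alone. The function names and counts are hypothetical. The exact test shown uses one common two-sided convention (summing the probabilities of all tables no more likely than the observed one), and the McNemar statistic is shown without a continuity correction; a statistical package may make different choices.

```python
from math import comb, erfc, sqrt


def fisher_exact_two_sided(table):
    """Two-sided Fisher exact P value for a 2 x 2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def probability(top_left):
        # Hypergeometric probability of this top-left count, margins fixed.
        return comb(row1, top_left) * comb(row2, col1 - top_left) / comb(total, col1)

    observed = probability(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one
    # (small tolerance so that equal probabilities survive rounding).
    p_value = sum(
        probability(x)
        for x in range(lowest, highest + 1)
        if probability(x) <= observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def mcnemar_chi_square(only_first, only_second):
    """McNemar chi-square and P value from the two discordant pair counts."""
    chi2 = (only_first - only_second) ** 2 / (only_first + only_second)
    return chi2, erfc(sqrt(chi2 / 2))  # 1 degree of freedom


# Hypothetical small table and hypothetical discordant pair counts.
fisher_p = fisher_exact_two_sided([[3, 1], [1, 3]])
mcnemar_chi2, mcnemar_p = mcnemar_chi_square(10, 2)

print(f"Fisher exact P = {fisher_p:.4f}")
print(f"McNemar chi-square = {mcnemar_chi2:.2f}, P = {mcnemar_p:.3f}")
```

Note that the McNemar test uses only the discordant pairs; pairs in which both members fall in the same category carry no information about the difference.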
Linear and Logistic Regression
The mathematical theory behind linear and logistic regression extends beyond the scope of this paper, but an overview of how these are conducted will be discussed. An over-simplified explanation is that regression attempts to describe and quantify a relationship between a dependent variable (y) and an independent variable (x). Multiple regression allows the simultaneous examination of several independent variables, thus allowing for the control of more than one confounding variable (5). In linear regression, the relationship is linear; it is defined by an intercept and the coefficient of each independent variable. In logistic regression, ‘y’ is replaced by the natural logarithm of the odds of success for a dichotomous random variable, ln[p/(1-p)], where p is the probability of success. The relationship between ln[p/(1-p)] and the independent variables, however, is linear (11). With logistic regression, the additional benefit is that the exponentiated coefficients of the independent variables in the final model provide adjusted odds ratios. With computerized statistical packages, both multiple linear regression and multiple logistic regression can be done by non-statisticians. Like any computer software, however, its use still requires decisions to be made before a final model is produced. To demonstrate multiple logistic regression, I will use a study written by this author and go through the steps in detail. The use of this study is not to self-promote, but to explain in detail the process of multiple logistic regression in a study with which I am well acquainted.
As background, the study was a self-administered, mailed questionnaire sent to over 300 medical (non-forensic pathologist) coroners in the province of Ontario. The aim of the study was to examine the validity of the certification of manner of death for 14 fictitious scenarios where clinical and pathological information was provided. Variables purposefully altered within the scenarios and coroner demographics were examined as part of the logistic regression (12). In logistic regression, any subject with missing independent variables is excluded from the regression analysis; therefore the first step was to identify missing data. Such analysis revealed that 42 subjects were missing one or more self-reported characteristics. To proceed with the analysis with these missing points would have resulted in 18.6% of respondent data being dropped. Missing data can have a negative effect on statistical analyses because of possible bias introduced by unpredictable differences in unknown or unmeasurable variables between cases with and without complete data. Imputation, a process of entering data where information is missing, can be used and was therefore undertaken in this study following guidelines suggested by Harrell (13). After imputation, 26 of the 42 subjects who originally had missing data had complete data, bringing the working sample size up to 210 subjects, or 92.9% of respondents. The logistic regression was performed on this group of 210 subjects.
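For reference, the linear and logistic models described in this section can be written out as follows. The symbols are my own notation rather than that of the cited texts: the x terms are the k independent variables, the first beta is the intercept, the remaining betas are the coefficients, and p is the probability of success.

```latex
% Multiple linear regression
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k

% Multiple logistic regression: the log odds are linear in the x_j
\ln\!\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k
```

The right-hand side is identical in both models; only the quantity on the left-hand side changes.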
The study design had constructed 4 sets of clinico-pathological scenarios; each set was therefore analyzed separately. The correct manner of death was represented by a dummy variable, with 1 representing the correct manner of death and 0 representing the incorrect manner of death. As each coroner answered multiple scenarios (a repeated measures design), this was factored into the analysis. For the model examining coroner characteristics, these and the scenarios were entered as independent variables. Variables found to have a p-value greater than 0.15 were removed from the model. The variables with the largest p-value were removed in a stepwise fashion until the final model was achieved. The adjusted odds ratio was then estimated using the coefficients of the independent variables from the final model. By performing such analysis, the study was then able to report and quantify the ‘odds’ of having a correctly certified manner of death.
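The final step, converting a coefficient into an adjusted odds ratio, can be sketched as follows. The coefficient and standard error are hypothetical values, not results from the study described above, and the function name is my own. The sketch assumes that the standard error has been taken from the fitted model, which in a repeated measures design should already account for the repeated answers from each respondent.

```python
import math


def odds_ratio_with_ci(coefficient, standard_error, z=1.96):
    """Return (odds_ratio, lower, upper) with an approximate 95% interval."""
    return (
        math.exp(coefficient),
        math.exp(coefficient - z * standard_error),
        math.exp(coefficient + z * standard_error),
    )


# Hypothetical logistic regression coefficient and its standard error.
odds_ratio, ci_low, ci_high = odds_ratio_with_ci(0.693, 0.25)
print(f"adjusted odds ratio = {odds_ratio:.2f} "
      f"(95% CI {ci_low:.2f} to {ci_high:.2f})")
```

A coefficient of 0.693 corresponds to an odds ratio of approximately 2; that is, the odds of the outcome are roughly doubled for a one-unit increase in that variable, with the other variables in the model held constant.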
In summary, much of the work associated with multiple linear and logistic regression is performed by computerized statistical packages. Organizing, analyzing and imputing the data (if required) and arriving at the final model still require a human element to navigate through the various decisions involved. The above description can serve as a guide to the reader, either in undertaking their own analyses or when working with statisticians.
Statistical Power and Sample Size
As mentioned previously, a common research design in forensic pathology is to compare two different groups to establish whether they are distinct or not. Through the methods outlined above, this is essentially done by assessing the probability that the two groups are from distinct populations, or that they are indeed the same and have been drawn from the same population; the latter representing the null hypothesis.
If we reject the null hypothesis when it is true, we have made a type I error (α error). If, however, the null hypothesis is false but we fail to reject it, we have made a type II error (β error). Power is the probability of rejecting the null hypothesis when it is false; it is calculated by subtracting β from 1 (11). As both types of errors are problematic, these concepts must be kept in mind when designing a study, specifically when determining the sample sizes needed for that study.
When calculating sample sizes, it is customary to set alpha at 0.05 and beta at 0.20 (translating to a power of 0.80). If we want to reduce the type I error, or increase the power within a study, a larger sample size is needed. A larger sample size is also needed when the two study populations have only a small difference between them. When comparing continuous data, knowledge from previous studies concerning the estimated standard deviation and the expected difference between the two groups is also required to calculate the appropriate sample size (14–16). For studies comparing two proportions, as would be seen in studies with binary outcomes, a different formula is used. This calculation requires estimates of the two proportions being compared and knowledge regarding the magnitude of the difference between the two groups (15, 16).
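The dependence of sample size on alpha, power, variability and the expected difference can be illustrated with the widely used normal-approximation formulas for two equal-sized groups. These are not necessarily the exact formulas given in the cited texts, and the function names and input values are hypothetical.

```python
import math
from statistics import NormalDist


def z_sum_squared(alpha, power):
    """Return (z for two-sided alpha + z for power) squared."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return (z_alpha + z_beta) ** 2


def n_per_group_means(sd, difference, alpha=0.05, power=0.80):
    """Subjects needed in each group to compare two means."""
    return math.ceil(2 * (sd / difference) ** 2 * z_sum_squared(alpha, power))


def n_per_group_proportions(p1, p2, alpha=0.05, power=0.80):
    """Subjects needed in each group to compare two proportions."""
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil(variance_sum / (p1 - p2) ** 2 * z_sum_squared(alpha, power))


# Hypothetical inputs: standard deviation of 50 g, expected differences
# of 25 g and 10 g; expected proportions of 50% and 30%.
n_large_difference = n_per_group_means(sd=50, difference=25)
n_small_difference = n_per_group_means(sd=50, difference=10)
n_proportions = n_per_group_proportions(0.50, 0.30)

print(n_large_difference, n_small_difference, n_proportions)
```

With these hypothetical inputs, detecting a 25 g difference requires 63 subjects per group, whereas detecting a 10 g difference requires 393 per group, which illustrates why a small difference between populations demands a larger sample.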
Failure to perform sample size calculations prior to undertaking a study, or the omission of these calculations from research papers, is not limited to forensic pathology research. When medical studies and their statistical content have been reviewed, sample size calculations have frequently been absent (17–20). Recognizing the need for such calculations and incorporating them into a study design before the study is initiated can only improve the strength of the study design.
Conclusion
Research into forensic sciences and forensic pathology is now an expectation that has fallen onto those in the profession. Research in the field has evolved past simple case presentations and descriptive case series and now includes studies comparing groups both retrospectively and prospectively. Pathologists undertaking the challenges of research should write papers that present the reader with appropriate information on the acquired data and its spread. For statistical comparisons to be valid, the study design must respect the type of data being analyzed and how the data was acquired (i.e., unpaired vs. paired data). Decisions made when conducting more complex analysis using multiple linear or logistic regression should also be transparent to the reader. Appropriate sample sizes should be calculated prior to undertaking a study, and the details behind such calculations should be disclosed within the methods section of the paper. Such study design and analysis can only lead to an improvement in the overall research content of forensic pathology and ultimately strengthen the field. The brief outline on statistics presented above is intended to assist readers in such an endeavor.
