Abstract
The theory and practice of statistics comprises two main schools of thought: frequentist statistics and Bayesian statistics. Frequentist methods are most commonly used to analyze animal-based laboratory data, while Bayesian statistical methods have been implemented less widely and may be relatively unfamiliar to practitioners in experimental science. This paper provides a high-level overview of Bayesian statistics and how they compare with frequentist methods. Using examples in rodent toxicity research, we argue that Bayesian methods have much to offer laboratory animal researchers. We advocate for increased attention to and adoption of Bayesian methods in laboratory animal research. Bayesian statistical theory, methods, software, and education have advanced significantly in the last 30 years, making these tools more accessible than ever.
Introduction and definitions
There are two main approaches to statistical theory and practice: frequentist statistics and Bayesian statistics. Statistical inference tasks such as hypothesis testing or parameter estimation can be conducted with either approach. Many common statistical analyses such as t-tests, analysis of variance, chi-square tests, dose–response trend analyses, and many analyses based on regression models are traditionally conducted using a frequentist approach. Statistics for rodent toxicology consists primarily of frequentist statistical analyses. While frequentist approaches are suitable in many situations, there is a growing number of applications where a Bayesian approach is preferred.
Parameter estimation is one aspect of statistics where frequentist and Bayesian approaches differ. Suppose we are investigating the relationship between body weights and brain weights at necropsy for a set of control animals, using the simple linear regression model
Selecting an approach
Interpretability of the results
Results from Bayesian analyses are typically more interpretable than those from frequentist methods, particularly in terms of probabilistic statements about parameters of interest. A frequentist approach to the body and brain weight linear regression problem would produce a single
Dependence between endpoints within individual animals
In a typical rodent toxicity study, many endpoints are measured on each animal and each endpoint often undergoes statistical analysis separately. For example, a researcher looking at incidence of multiple tumor types may conduct separate trend tests for dose–response, one for each tumor type. Performing multiple independent analyses implicitly assumes, typically incorrectly, that the endpoints are uncorrelated; that is, that toxicity occurring in one organ or system of the body does not affect the chances that toxicity will occur in other organs or systems. When endpoints are correlated, multiple independent hypothesis tests will suffer from high false positive rates. Although frequentist methods can bring the inflated error rates back to expected levels by adjusting p-values using a multiple testing correction, Bayesian methods like hierarchical models allow simultaneous inference on multiple endpoints with no need for multiple testing correction. 3 We encourage researchers to collaborate with statisticians and consider what inferences are possible under a Bayesian framework.
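The inflation of false positives across separately tested endpoints can be illustrated with a short calculation. The sketch below assumes, for simplicity, that the tests are independent and that each is run at the same significance level; with correlated endpoints the exact rate differs, so the numbers are illustrative only. The function names are ours and are not taken from any package.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false positive across n_tests
    independent tests, each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Illustrative numbers of endpoints (e.g. tumor types) tested separately.
for n_tests in (1, 5, 20):
    rate = familywise_error_rate(n_tests)
    threshold = bonferroni_threshold(n_tests)
    print(f"{n_tests:2d} tests: familywise error rate = {rate:.3f}, "
          f"Bonferroni per-test threshold = {threshold:.4f}")
```

Under these assumptions, the chance of at least one false positive grows quickly with the number of endpoints, which is the problem a multiple testing correction, or a joint Bayesian model, is meant to address.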
Bayesian approaches for modeling dependence between endpoints in animal studies have been demonstrated in the literature. Dunson et al. provide a general approach for simultaneous analysis of endpoints. 4 Dunson and Herring use data from a mouse bioassay study to present a method for simultaneous analysis of outcomes such as time to first tumor, weekly increases in number of tumors, and presence of tumors at time of death. 5 Kim and Hwang employ a Bayesian approach to a developmental toxicity study of diethylhexyl phthalate, simultaneously analyzing pup malformations and fetal weight. 6 Hwang presents a Bayesian joint model for developmental toxicity studies on continuous data (e.g. fetal weight) and zero-inflated count data (e.g. birth defects or rare tumors). 7
Although certain applications of Bayesian modeling of dependence across multiple endpoints have been addressed in the literature, the methods have not been widely implemented. The detailed paper by Dunson et al. on a general approach to modeling endpoint dependence has been widely cited, but few citations relate to laboratory animal research. 4 In a 2014 review of statistics for toxicological bioassays, 8 Bayesian methods are mentioned but only two citations are included on Bayesian methods for joint modeling in toxicology. 8–10 There is therefore a substantial opportunity for more inclusion of Bayesian approaches in laboratory animal science.
Littermates (nesting among animals) and small sample sizes
Animals from the same litter represent a second type of nested data common in rodent studies: a single endpoint may be more correlated among littermates than it is between animals from different litters. Moreover, small sample sizes can occur in these contexts as some rodent toxicology experiments select only two or four animals per litter for analysis. There are other examples: historical control data are nested by laboratory, rodent strain, and sex; pup or body weight data are nested within individual animals whose endpoints are measured repeatedly over time; and histopathology endpoints can be considered nested by tissue or organ.
Whether the nesting structure is within- or between-animals (or both), the dependence that comes with it would ideally be accounted for in the statistical method. Mixed models that include parameters representing variability within and between litters can be fitted with a frequentist approach. However, these models can perform poorly when sample sizes are small. Fitting the model with a Bayesian approach can mitigate the issues related to small sample sizes (assuming sensible prior distributions are used). 11
For example, suppose we are using a Bayesian hierarchical model to estimate the variances
When sample sizes are large, Bayesian and frequentist methods often produce similar results. However, the reliability of many frequentist statistical methods depends on sufficiently large sample sizes. At the same time, the 4 Rs of animal research (reduction, replacement, refinement, and responsibility) encourage designs with smaller sample sizes, which can threaten the reliability of a frequentist statistical analysis. 12 With sensible prior distributions, Bayesian methods are better suited to small sample sizes than their frequentist counterparts.
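One way to see why a hierarchical model can stabilize estimates from small litters is the partial pooling it performs. The sketch below is a deliberately simplified normal-normal model in which the grand mean and the between-litter and within-litter variances are assumed known; a real analysis would estimate them, with prior distributions, rather than fix them. All numbers and names are hypothetical.

```python
def shrunken_litter_mean(litter_values, grand_mean, between_var, within_var):
    """Posterior mean of a litter's true mean under a normal-normal model
    with known variances (a simplifying assumption for illustration)."""
    n = len(litter_values)
    sample_mean = sum(litter_values) / n
    # Weight on the litter's own data; it grows toward 1 as n increases.
    weight = between_var / (between_var + within_var / n)
    return weight * sample_mean + (1.0 - weight) * grand_mean


# Hypothetical values: both litters have a sample mean of 7.0,
# while the grand mean across litters is 5.0.
small_litter = [6.0, 8.0]          # two animals sampled
large_litter = [6.0, 8.0] * 10     # twenty animals sampled

small_estimate = shrunken_litter_mean(small_litter, 5.0, 1.0, 4.0)
large_estimate = shrunken_litter_mean(large_litter, 5.0, 1.0, 4.0)
print(f"small litter estimate: {small_estimate:.3f}")
print(f"large litter estimate: {large_estimate:.3f}")
```

In this sketch the estimate for the two-animal litter is pulled further toward the grand mean than the estimate for the twenty-animal litter, and the influence of the pooling shrinks as the sample grows.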
Other reasons to consider Bayesian methods
Another reason to consider a Bayesian approach is that the Bayesian framework for inference obviates the need for p-value based significance thresholds or ad hoc p-value corrections when considering multiple hypotheses, since the posterior distribution gives the distribution of all model parameters of interest. With a posterior distribution estimated, the researcher can readily answer questions about multiple parameters, for example “what is the probability that
Besides being a required part of a Bayesian analysis, prior distributions are valuable tools. Westfall and Soper discuss the use of prior distributions to alleviate the multiple comparison problem in carcinogenicity tests. 13 Priors can be used to incorporate information from similar past experiments, for example using historical control data. For an excellent expository example, see Bayesian Data Analysis, 3rd ed., p. 102. 3
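As a minimal sketch of how historical control data might enter an analysis through the prior, consider a tumor incidence rate with a Beta prior and binomial data, a standard conjugate pairing. The prior parameters and the study counts below are hypothetical and chosen only for illustration; how to summarize real historical controls as a prior is a modeling decision best made with a statistician.

```python
def beta_binomial_update(prior_a, prior_b, events, n):
    """Return the parameters of the Beta posterior after observing
    `events` occurrences in `n` animals under a Beta(prior_a, prior_b) prior."""
    return prior_a + events, prior_b + (n - events)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical prior summarizing historical controls: mean incidence of 4%.
prior_a, prior_b = 2.0, 48.0
# Hypothetical concurrent control group: 3 tumors among 50 animals.
events, n = 3, 50

post_a, post_b = beta_binomial_update(prior_a, prior_b, events, n)
print(f"raw proportion:  {events / n:.3f}")
print(f"prior mean:      {beta_mean(prior_a, prior_b):.3f}")
print(f"posterior mean:  {beta_mean(post_a, post_b):.3f}")
```

Here the posterior mean falls between the historical rate and the rate observed in the current study, reflecting both sources of information.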
A word of caution
The specification of prior distributions in a Bayesian analysis should be overseen by a statistician experienced in Bayesian methods. There is no one-size-fits-all recipe for specifying prior distributions. The exact form of the prior distributions can influence the results, sometimes substantially; re-running the analysis under different priors is a necessary step to understand how the priors affect the results. Moreover, using a so-called “default prior” or one that is ostensibly “non-informative” can be a poor choice in certain cases. For further discussion on this critical aspect of Bayesian analysis, see Wheeler, 14 Depaoli et al., 15 and Seaman III et al. 16
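Re-running an analysis under different priors can be sketched with a simple conjugate model. The example below assumes a Beta prior on an incidence rate with binomial data and compares the posterior mean under three candidate priors; the priors, labels, and counts are hypothetical, and a realistic sensitivity analysis would refit the full model and compare more than the posterior mean.

```python
def posterior_mean(prior_a, prior_b, events, n):
    """Posterior mean of an incidence rate under a Beta(prior_a, prior_b)
    prior after observing `events` occurrences in `n` animals."""
    return (prior_a + events) / (prior_a + prior_b + n)


# Hypothetical candidate priors for the sensitivity analysis.
priors = {
    "flat Beta(1, 1)": (1.0, 1.0),
    "Jeffreys Beta(0.5, 0.5)": (0.5, 0.5),
    "informative Beta(2, 48)": (2.0, 48.0),
}

# Hypothetical small study: 3 events among 10 animals.
events, n = 3, 10

sensitivity = {
    label: posterior_mean(a, b, events, n) for label, (a, b) in priors.items()
}
for label, mean in sensitivity.items():
    print(f"{label}: posterior mean = {mean:.3f}")
```

With only 10 animals, the informative prior gives a noticeably different answer from the other two in this sketch, which is the kind of dependence a sensitivity analysis is meant to expose.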
Finally, we emphasize that a full Bayesian analysis often involves advanced algorithms with nuances that can be masked by the user-friendliness of the software. Readers should take care when interpreting the results of a Bayesian analysis, and involve a statistician to assist with interpreting and communicating results.
Accessibility
Identifying and implementing appropriate Bayesian methods is still a challenge for the modern data analyst. Computation for complex Bayesian models can be intensive, although software, hardware, and user interfaces are always improving. For readers interested in considering Bayesian methods in their work, we recommend a four-pronged approach consisting of (1) educational/background materials, (2) applied journal articles, (3) software or book resources for fitting Bayesian models, and (4) consulting with a statistician with experience in Bayesian methods. To assist with this four-pronged approach, we refer the reader to the included table of suggested materials (Table 1). It is more common today than in past decades for statisticians and programmers to publish papers containing full computer code available to fit the models and examples from the paper. The code is often released as part of an open-source software package with documentation, thereby expanding the set of software packages available for Bayesian model-fitting algorithms, results processing, and even visualization.
Table 1. Suggested materials for readers.
Conclusion
Despite the increased accessibility and relative user-friendliness of many Bayesian approaches, we recommend collaboration with statisticians when applying Bayesian methods to ensure that the analysis is properly executed. This includes setting prior distributions, defining and building the statistical model, writing code or editing existing code to fit the model, checking model fit and convergence, and guiding other researchers in identifying and interpreting the results of the analysis.
Footnotes
Acknowledgments
The authors would like to thank Dr. Guanhua Xie, Dr. Matt Wheeler, and Dr. Helen Cunny for their helpful review of this manuscript.
Data availability
There are no experimental data associated with this manuscript.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Intramural Research Program of the National Institutes of Health, National Institute of Environmental Health Sciences (NIEHS) and contract GS-00F-173CA/75N96022F00055 to Social and Scientific Systems, Inc., a DLH Holdings Corp Company.
Research ethics
This study did not require ethics board approval because it did not involve human or animal trials.
