Confidence in confidence intervals!

Abstract

In hypothesis testing, researchers often make statistical inferences by evaluating the p values or the confidence intervals (CIs) of their analyses. I have noticed, however, that the use of p values is the predominant practice in the majority of published nursing research reports. While the use of p values is a perfectly acceptable practice, I am of the opinion that nursing evidence would be better served if more research reporting were based on CIs whenever possible. I say so because, unlike p values, CIs tell us much more than whether or not one should reject the null hypothesis. They also provide invaluable insight into the precision of the research estimates under investigation. To demonstrate this point, let us consider a fictitious research scenario in which a researcher conducts a logistic regression analysis to explore the predictors of stroke in Canada. At the end of the study, the researcher reports that smoking (odds ratio = 2.3, p = 0.034, 95% CI 1.2 – 18.5) and hypertension (odds ratio = 2.1, p = 0.014, 95% CI 1.9 – 2.3) were both independent predictors of stroke based on a two-tailed alpha of 0.05. The following discussion of this scenario assumes that the reader has a basic conceptual understanding of p values and CIs, and it will therefore focus entirely on outlining the advantage of CIs over p values in hypothesis testing.

The reported odds ratio of 2.3 for smoking is an impressive estimate suggesting that the odds of stroke among smokers are 2.3 times those of non-smokers (i.e., smoking is associated with more than double the odds of stroke). The p value of 0.034 indicates that this is a statistically significant finding. The 95% CI further supports this conclusion because the lower (1.2) and upper (18.5) bounds of this CI are both greater than one (the null value, which indicates no difference in odds from the reference category of non-smokers). The question remains, however: how precise is the reported odds ratio as an estimate of the true parameter value? Unfortunately, p values do not provide any insight concerning this question because they only tell us how likely it is that a result at least this extreme would arise by chance if the null hypothesis were true. CIs, on the other hand, indicate the range of plausible values of the unknown parameter under investigation. They therefore provide very important additional information concerning the level of precision in the reported estimate. This is an especially important attribute for a practice discipline like nursing because the clinical significance of our findings is as important as their statistical significance. In the smoking-related findings above, the confidence interval indicates that the parameter value for the association between smoking and stroke could plausibly lie anywhere in the interval between 1.2 and 18.5. That is, the parameter value for this association could be as low as 1.2 or as high as 18.5! The wide range of possibilities in this CI reveals that our resulting odds ratio of 2.3 is not a precise estimate of the parameter, despite it being a statistically significant one. Think about it: the wide confidence interval of 1.2 to 18.5 creates uncertainty around the true magnitude of the relationship between smoking and stroke. Such wide confidence intervals are often the result of a large sampling error that may be a function of a smaller than required sample size.
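The link between sample size and interval width can be illustrated with a minimal sketch. It assumes a simple unadjusted 2x2 table and the standard Wald interval on the log odds scale, which is not the logistic regression model of the fictitious scenario. The function name and all cell counts are hypothetical and were chosen only so that the odds ratio equals 2.3 in both tables.

```python
import math

def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald CI from a 2x2 table (hypothetical helper).

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z: normal critical value (1.96 gives an approximate 95% CI)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: same odds ratio, ten times the sample size
small_sample = odds_ratio_wald_ci(10, 20, 5, 23)
large_sample = odds_ratio_wald_ci(100, 200, 50, 230)

for label, (or_, lo, hi) in [("small", small_sample), ("large", large_sample)]:
    print(f"{label}: OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

Under these assumptions the point estimate is identical in both tables, yet the interval from the larger table is much narrower, because the standard error of the log odds ratio shrinks as the cell counts grow.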

Now, let us consider the reported association between hypertension and stroke in our scenario. Again, we are dealing with an impressive odds ratio that was also statistically significant as per the p value of 0.014. The lower and upper bounds of the CI for this association were both above one, further supporting our conclusion that hypertension is statistically associated with increased odds of stroke. What is especially impressive about this finding, however, is how close the estimate is to the plausible parameter values, as portrayed by the very narrow 95% CI. Our CI suggests that the parameter value for this association could plausibly lie anywhere in the interval between 1.9 and 2.3. I think that we can all agree that the lower and upper bounds of this interval are very close to our estimate of 2.1, making it a precise estimate of the parameter value. This scenario demonstrates a statistically significant result of a sizable association (odds ratio = 2.1) with a precise estimate of the actual parameter value.
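One rough way to put the two reported intervals side by side is to compare the ratio of the upper bound to the lower bound, since odds ratios are naturally compared on a multiplicative scale. The sketch below uses only the CI bounds given in the fictitious scenario; the function names are hypothetical, and the implied standard error assumes the intervals are symmetric on the log scale, which the scenario does not state.

```python
import math

def ci_ratio(lower, upper):
    """Upper-to-lower bound ratio: larger values mean a less precise estimate."""
    return upper / lower

def implied_se_log_or(lower, upper, z=1.96):
    """Standard error of the log odds ratio implied by a CI.

    Assumes the CI is symmetric on the log scale (an assumption, not
    something reported in the scenario).
    """
    return (math.log(upper) - math.log(lower)) / (2 * z)

# 95% CI bounds as reported in the fictitious scenario
reported = {
    "smoking": (1.2, 18.5),
    "hypertension": (1.9, 2.3),
}

precision = {
    name: (ci_ratio(lo, hi), implied_se_log_or(lo, hi))
    for name, (lo, hi) in reported.items()
}

for name, (ratio, se) in precision.items():
    print(f"{name}: upper/lower = {ratio:.2f}, implied SE(log OR) = {se:.3f}")
```

On either measure the smoking interval is far wider than the hypertension interval, which is the same contrast in precision that the two CIs show on inspection.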

The above discussion highlights the importance of using CIs, not only for the purpose of hypothesis testing, but also for evaluating the precision of the estimates investigated in these hypotheses. In the above scenario, both smoking and hypertension had similar estimates that were statistically significant. However, the CIs of these findings revealed that the odds ratio for hypertension was a much more precise estimate than that of smoking. Thus, it is important that nurse researchers be more invested in reporting CIs whenever possible. Such reporting will enable readers of nursing research to make better judgements about the clinical value and precision of the reported estimates. Remember, the narrower the CI, the more precise the estimate. The wider the CI, the less precise the estimate, even if it is statistically significant.

Maher M. El-Masri, RN, PhD

Editor-in-Chief, CJNR

Professor and Research Chair, University of Windsor, Faculty of Nursing