Neglect of Alternative Causes in Predictive but Not Diagnostic Reasoning

Abstract

People are renowned for their failure to consider alternative hypotheses. We compare neglect of alternative causes when people make predictive versus diagnostic probability judgments. One study with medical professionals reasoning about psychopathology and two with undergraduates reasoning about goals and actions or about causal transmission yielded the same results: neglect of alternative causes when reasoning from cause to effect but not when reasoning from effect to cause. The findings suggest that framing a problem as a diagnostic-likelihood judgment can reduce bias.

Keywords

diagnostic reasoning, predictive reasoning, causal reasoning, neglect of alternative causes, medical reasoning, goal shielding, probability judgment

Causal inference can go in two directions: from cause to effect and from effect to cause. The likelihood of effects can be predicted from knowledge of their causes (predictive reasoning), and the likelihood of causes can be diagnosed from their effects (diagnostic reasoning). Several biases in judgment can be traced to people’s tendency to focus too narrowly on a hypothesis that is currently under consideration, neglecting relevant alternatives (e.g., Doherty, Chadwick, Caravan, Barr, & Mynatt, 1996; Klayman & Ha, 1987; Pitz, Downing, & Reinhold, 1967; Ross & Murphy, 1996). Does such neglect affect reasoning in both directions? Do people (fail to) think about alternative causes to the same extent when making predictions and diagnoses?

A common view is that predictive judgments are overestimated relative to diagnostic ones because predictive reasoning is more natural (Medin, Coley, Storms, & Hayes, 2003; Tversky & Kahneman, 1982). However, neglect of alternative causes would lead to the opposite pattern: underestimation of predictive probability judgments and overestimation of diagnostic probability judgments. Consider a doctor required to judge the probability that an older patient with congestive heart failure will be alive in 5 years. If the doctor neglects alternative possible ailments, he or she will provide a prognosis that is too sunny. Analogously, a doctor who fails to consider alternative causes in diagnosis will overestimate the probability of the ailment already in mind.

Fernbach, Darlow, and Sloman (2009) performed a rational analysis of predictive and diagnostic reasoning about causal transmissions (e.g., the likelihood a baby has a drug addiction given her or his mother does vs. the likelihood a mother has a drug addiction given her baby does). To test the analysis, we collected judgments of people’s underlying beliefs about the causal scenarios (e.g., base rates, causal strength, and strength of alternatives) and compared people’s predictive and diagnostic judgments with those implied by the analysis on the basis of the beliefs. The analysis predicted people’s diagnostic judgments but overestimated their predictive judgments. This is indirect evidence that people neglect alternatives, but only in the predictive direction.

In this article, we take a more direct approach to assessing the role of alternative causes in the two directions of inference. Our method is to compare standard predictive and diagnostic judgments, in which alternative causes are implicit (full conditionals), with judgments in which participants are told that no alternative causes are present (no-alternative conditionals). The design is depicted in Table 1. Excepting unusual circumstances (Pearl, 1988), alternative causes increase the likelihood of the effect. Therefore, in predictive reasoning, full conditionals should be judged as more likely than no-alternative conditionals. Conversely, in diagnostic reasoning, alternative causes compete to explain the effect and therefore should yield lower probability judgments (often called discounting or explaining away). Full diagnostic conditionals should therefore be judged as less likely than no-alternative conditionals. If participants neglect alternatives in prediction, but not in diagnosis, then their judgments of full and no-alternative predictive conditionals should be the same, but full diagnostic conditionals should be judged as less likely than no-alternative conditionals.

Table 1.

The Design of Experiments 1 Through 3

	Conditional
Judgment	Full	No-alternative
Predictive	P(effect|cause)	P(effect|cause, no alternative causes)
Diagnostic	P(cause|effect)	P(cause|effect, no alternative causes)
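
The normative ordering of the four cells in Table 1 can be illustrated numerically. The following is a minimal sketch, assuming a noisy-OR parameterization in which the focal cause and the pooled alternative causes act independently. That parameterization, the function names, and the values of `p_c` (base rate of the cause), `w_c` (strength of the focal cause), and `w_a` (strength of the alternatives) are illustrative assumptions, not quantities taken from the experiments.

```python
def predictive(w_c, w_a):
    """P(effect | cause) under an assumed noisy-OR combination of causes."""
    return 1 - (1 - w_c) * (1 - w_a)


def diagnostic(p_c, w_c, w_a):
    """P(cause | effect) by Bayes' rule.

    When the cause is absent, only the alternatives (strength w_a)
    can produce the effect.
    """
    joint_cause = p_c * predictive(w_c, w_a)
    joint_no_cause = (1 - p_c) * w_a
    return joint_cause / (joint_cause + joint_no_cause)


# Hypothetical parameter values chosen only for illustration
p_c, w_c, w_a = 0.3, 0.6, 0.2

full_pred = predictive(w_c, w_a)        # alternatives implicit
noalt_pred = predictive(w_c, 0.0)       # alternatives ruled out
full_diag = diagnostic(p_c, w_c, w_a)
noalt_diag = diagnostic(p_c, w_c, 0.0)

print(f"predictive: full={full_pred:.2f}, no-alternative={noalt_pred:.2f}")
print(f"diagnostic: full={full_diag:.2f}, no-alternative={noalt_diag:.2f}")
```

Under these assumptions the full predictive conditional exceeds the no-alternative one, whereas the full diagnostic conditional falls below the no-alternative one, which reaches 1 once alternatives are ruled out.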

Experiment 1 tested the neglect hypothesis in an expert population: mental health practitioners reasoning about a case study. Experiment 2 tested inferences about people’s goals and means to achieving those goals, extending existing research on goal shielding (Shah, Friedman, & Kruglanski, 2002). Experiment 3 manipulated strength of alternatives in arguments involving causal transmission. Experiments 2 and 3 allowed us to assess how judgments about full and no-alternative conditionals vary with the strength of alternatives.

Experiment 1

Medical judgment suffers from the same biases as those observed in everyday judgment (Bornstein & Emler, 2001). One purported source of error is the neglect of alternative causes (e.g., diseases or other medical conditions) when clinicians are called on to make prognoses or diagnoses. Experiment 1 tested whether mental health practitioners would neglect alternatives when making a predictive (prognostic) as opposed to a diagnostic medical judgment.

Method

Two hundred sixty-five mental health practitioners participated as part of a psychopharmacology review course offered by the Massachusetts General Hospital Psychiatry Academy (70% M.D.s, 9% nurse-practitioners, and 21% other; 51% female and 49% male). Participation was voluntary; 56% of course attendees completed the experiment. The participants were assigned alphabetically to one of two groups. The predictive group answered two predictive questions: one full conditional and one no-alternative conditional. The diagnostic group answered two diagnostic questions: one full conditional and one no-alternative conditional. The questions are shown in Table 2. Responses were made on a 10-point scale, ranging from 1, least likely, to 10, most likely. The full-conditional question was always asked first and was completed on the first day of the course. The no-alternative question was presented the next day.

Table 2.

Questions From Experiment 1

	Conditional
Judgment	Full	No-alternative
Predictive	Ms. Y is a 32-year-old female who has been diagnosed with depression. Please indicate on the scale below from 1 to 10 (1 being the least likely and 10 being the most likely) the likelihood that she presents with lethargy.	Ms. Y is a 32-year-old female who has been diagnosed with depression. A complete diagnostic workup reveals that she has not been diagnosed with any other medical or psychiatric disorder that would cause lethargy. Please indicate on the scale below from 1 to 10 (1 being the least likely and 10 being the most likely) the likelihood that she presents with lethargy.
Diagnostic	Ms. Y is a 32-year-old female who presented with lethargy. Please indicate on the scale below from 1 to 10 (1 being the least likely and 10 being the most likely) the likelihood that she has been diagnosed with depression.	Ms. Y is a 32-year-old female who presented with lethargy. Please indicate on the scale below from 1 to 10 (1 being the least likely and 10 being the most likely) the likelihood that she has been diagnosed with depression given that a complete diagnostic workup revealed that she has not been diagnosed with any other medical or psychiatric disorder that would cause lethargy.

Results and discussion

Mean judgments for the predictive and diagnostic questions are shown in Figure 1. To analyze the data, we performed a 2 (direction of inference: predictive or diagnostic) × 2 (conditional type: full or no-alternative) analysis of variance with repeated measures on the conditional type factor. The analysis revealed a significant interaction between direction of inference and conditional type, F(1, 263) = 16.5, p < .0001, η_p² = .06, as predicted. There were also main effects of direction of inference, F(1, 263) = 9.1, p < .01, η_p² = .03, and conditional type, F(1, 263) = 12.1, p < .001, η_p² = .04. Follow-up planned comparisons revealed a significant difference between full (M = 5.9) and no-alternative (M = 6.7) diagnostic conditionals, t(129) = 4.9, p < .0001, Cohen’s d = 0.4, but not between predictive full (M = 6.9) and no-alternative (M = 6.8) conditionals, t(134) = 0.5, p > .6, Cohen’s d = 0.04.

Fig. 1.

Mean likelihood rating as a function of type of judgment (predictive or diagnostic) and type of conditional (full or no-alternative) in Experiment 1. Responses were made on a 10-point scale, ranging from 1, least likely, to 10, most likely. Error bars represent standard errors.

Predictive judgments were insensitive to the absence of alternative causes. Ratings for diagnostic judgments were higher when alternatives were absent, as they should be. The results support the conclusion that the medical professionals neglected alternatives when reasoning from disease to symptom but took them into account to make a diagnosis.
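
The interaction test in a design like this one (one between-subjects factor and one two-level within-subjects factor) can be illustrated with difference scores. In such a design the interaction F statistic equals the square of an independent-samples t statistic computed on each participant's difference between the two repeated measures. The sketch below uses made-up difference scores and a hypothetical helper `interaction_t`; it is not the analysis script or the data from the experiment.

```python
from math import sqrt
from statistics import mean, variance


def interaction_t(diffs_a, diffs_b):
    """Pooled-variance t statistic comparing two groups' difference scores.

    Returns the t statistic and its degrees of freedom (n_a + n_b - 2).
    """
    n_a, n_b = len(diffs_a), len(diffs_b)
    pooled = ((n_a - 1) * variance(diffs_a)
              + (n_b - 1) * variance(diffs_b)) / (n_a + n_b - 2)
    se = sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(diffs_a) - mean(diffs_b)) / se, n_a + n_b - 2


# Hypothetical (no-alternative minus full) ratings, one value per participant
diagnostic_diffs = [1, 2, 0, 1, 1]
predictive_diffs = [0, -1, 1, 0, 0]

t, df = interaction_t(diagnostic_diffs, predictive_diffs)
f_interaction = t ** 2  # interaction F(1, df) in the 2 x 2 mixed design

print(f"t({df}) = {t:.2f}, F(1, {df}) = {f_interaction:.2f}")
```

With two groups totaling N participants, the error degrees of freedom are N - 2, which is why a sample of 265 yields F(1, 263).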

Experiment 2

One role of predictive and diagnostic reasoning is to inform choices about how to achieve goals. Evaluating a plan of action requires predicting the likelihood of success. Evaluating actions in retrospect requires diagnosing whether they were important for having achieved the goal. Shah, Friedman, and Kruglanski (2002) showed that thinking about one means to achieving a goal reduces thinking about or pursuing alternative means in a variety of tasks. We predicted that this would occur in predictive, but not diagnostic, reasoning.

Method

Seventy-five Brown University students were recruited on campus and participated voluntarily. They were randomly divided into five groups. Groups 1 and 2 gave full-conditional judgments, Groups 3 and 4 gave no-alternative judgments, and Group 5 rated the strength of alternatives. We generated questions for eight goal schemata. We instructed participants to rate how likely each event was, on a scale ranging from 0, impossible, to 100, definite. Each group answered one question about each schema, and the questions were split so that no participant saw both the predictive and diagnostic questions for a given schema. Thus, for each predictive question that Group 1 answered, Group 2 answered the corresponding diagnostic question and vice versa (and likewise for Groups 3 and 4). The presentation order of the schemata was determined randomly and was the same across all groups.

The five questions for one of the schemata are shown in Table 3. The additional schemata can be viewed in Supporting Details in the Supplemental Material available on-line. The eight questions were displayed on a single page with instructions at the top, and the questionnaire took 5 to 10 min to complete.

Table 3.

Sample Questions From Experiment 2

Question type	Wording
Full predictive	Imagine you exercise hard in April. How likely is it that you weigh less in May?
No-alternative predictive	Imagine you exercise hard in April. You don’t have the opportunity to do anything else to lose weight besides exercising hard. How likely is it that you weigh less in May?
Full diagnostic	Imagine you weigh less in May than April. How likely is it that you exercised hard in April?
No-alternative diagnostic	Imagine you weigh less in May than April. You didn’t have the opportunity to do anything else to lose weight besides exercising hard. How likely is it that you exercised hard in April?
Strength of alternatives	Imagine you don’t exercise hard in April. How likely is it that you weigh less in May?

Note: A predictive judgment is a prediction of the likelihood of an effect based on its cause; a diagnostic judgment is a diagnosis of the likelihood of a cause based on its effect. Full conditionals contain alternative causes; no-alternative conditionals do not contain alternative causes. For strength-of-alternative questions, participants rated the strength of the alternatives presented. Participants rated how likely each event was on a scale ranging from 0, impossible, to 100, definite.

Results and discussion

Mean judgments for the predictive and diagnostic questions are shown in Figure 2a. A 2 (direction of inference: predictive or diagnostic) × 2 (conditional type: full or no-alternative) analysis of variance revealed a significant interaction between direction of inference and conditional type, F(1, 58) = 22.4, p < .0001, η_p² = .3, as predicted by the neglect hypothesis. There were also main effects of direction of inference, F(1, 58) = 10.6, p < .01, η_p² = .2, and conditional type, F(1, 58) = 24.3, p < .0001, η_p² = .3. Planned comparisons revealed a significant difference between full (M = 54.7) and no-alternative (M = 83.3) diagnostic conditionals, t(58) = 7.0, p < .0001, Cohen’s d = 1.8, but no difference between full (M = 59.2) and no-alternative (M = 58.8) predictive conditionals, t(58) < 0.08, p > .9, Cohen’s d = 0.02.

Fig. 2.

Mean likelihood rating as a function of (a) type of judgment (predictive or diagnostic) and type of conditional (full or no-alternative) and (b) type of judgment, type of conditional, and type of alternative schemata (strong or weak) in Experiment 2. Participants rated likelihood on a scale ranging from 0, impossible, to 100, definite. Error bars represent standard errors.

We divided the schemata evenly into strong versus weak alternatives on the basis of the strength-of-alternatives ratings. Mean conditional probability judgments for each group are shown in Figure 2b. To assess the effect of strength of alternatives on full and no-alternative judgments, we compared responses to strong and weak items separately for each type of question. Strong alternatives yielded lower diagnostic full-conditional judgments than weak alternatives, t(56) = 4.3, p < .0001, Cohen’s d = 1.1. None of the other groups showed differences across the strong/weak factor.
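
The even division of schemata into strong and weak halves can be sketched as a rank-based split. The ratings and schema labels below are placeholders, and the tie-handling rule (ties keep their input order) is an assumption, since the text does not describe how ties would be resolved.

```python
def split_by_strength(ratings):
    """Rank items by mean strength-of-alternatives rating and halve the list.

    Returns (strong, weak), each ordered from highest to lowest rating.
    """
    ranked = sorted(ratings, key=ratings.get, reverse=True)
    half = len(ranked) // 2
    return ranked[:half], ranked[half:]


# Hypothetical mean ratings (0-100 scale) for eight placeholder schemata
ratings = {
    "s1": 62, "s2": 18, "s3": 45, "s4": 71,
    "s5": 25, "s6": 30, "s7": 55, "s8": 12,
}

strong, weak = split_by_strength(ratings)
print("strong:", strong)
print("weak:", weak)
```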

As in Experiment 1, the presence of alternatives influenced only diagnostic judgments and did so appropriately: Strong alternatives lowered full diagnostic conditional judgments to a greater degree than weak alternatives, but the strength of alternatives had no effect on no-alternative diagnostic judgments. Predictive judgments were insensitive to both the strength and even absence of alternatives. The results again suggested that people neglect alternatives in the predictive direction but treat alternatives appropriately when reasoning diagnostically.

Experiment 3

Experiment 3 was designed to replicate and extend the results of Experiment 2 to causal transmission arguments of the type tested in Fernbach et al. (2009). These arguments manipulated strength of alternatives, allowing further validation of the pattern of neglect in Experiment 2.

Method

Sixty-three Brown University students participated for class credit or were paid $8 per hour. They were divided into four groups. Groups 1 and 2 answered the full-conditional questions, and Groups 3 and 4 answered the no-alternative conditional questions. Each question referred to a causal transmission in which a predicate was transmitted from a cause category to an effect category. For each set of categories, we used two predicates: one that implied strong alternative causes and one that implied weak alternative causes. Fernbach et al. (2009) had verified that the strong predicates yielded higher alternative-strength judgments than did the weak predicates. As in Experiment 2, no participant saw the predictive and diagnostic questions for a particular predicate. We used 10 sets of categories and two predicates per set. Each participant therefore answered 20 questions. Participants rated how likely each event was on a scale ranging from 0, impossible, to 100, definite.

The four questions for a weak and strong version of a sample argument are shown in Table 4. The additional categories and predicates can be viewed in Supporting Details in the Supplemental Material available on-line. The procedure was identical to Experiment 2 except that the questionnaire was completed on a computer in a lab.

Table 4.

Sample Questions From Experiment 3

	Predicate
Question type	Strong alternative	Weak alternative
Full predictive	The coach of a high school football team is highly motivated. How likely is it that his team is highly motivated?	The coach of a high school football team knows a complicated play. How likely is it that his team knows a complicated play?
No-alternative predictive	The coach of a high school football team is highly motivated. Imagine a situation in which there are no other possible causes of the team being motivated except for the coach. How likely is it that the team is highly motivated?	The coach of a high school football team knows a complicated play. Imagine a situation in which there are no other possible causes of the team knowing a complicated play, except for the coach teaching it to them. How likely is it that the team knows a complicated play?
Full diagnostic	A high school football team is highly motivated. How likely is it that their coach is highly motivated?	A high school football team knows a complicated play. How likely is it that their coach knows a complicated play?
No-alternative diagnostic	A high school football team is highly motivated. Imagine a situation in which there are no other possible causes of the team being motivated except for the coach. How likely is it that the coach is highly motivated?	A high school football team knows a complicated play. Imagine a situation in which there are no other possible causes of the team knowing a complicated play, except for the coach teaching it to them. How likely is it that the coach knows a complicated play?

Note: A predictive judgment is a prediction of the likelihood of an effect based on its cause; a diagnostic judgment is a diagnosis of the likelihood of a cause based on its effect. Full conditionals contain alternative causes; no-alternative conditionals do not contain alternative causes. Participants rated how likely each event was on a scale ranging from 0, impossible, to 100, definite.

Results and discussion

Mean judgments for the predictive and diagnostic questions are shown in Figure 3a. A 2 (direction of inference: predictive or diagnostic) × 2 (conditional type: full or no-alternative) analysis of variance revealed a significant interaction between direction of inference and conditional type, F(1, 61) = 62.3, p < .0001, η_p² = .5, and main effects of direction of inference, F(1, 61) = 18.0, p < .0001, η_p² = .2, and of conditional type, F(1, 61) = 24.9, p < .0001, η_p² = .3. Planned comparisons revealed a significant difference between full (M = 69.3) and no-alternative (M = 93.9) diagnostic conditionals, t(61) = 8.4, p < .0001, Cohen’s d = 2.2, but no difference between full (M = 74.7) and no-alternative (M = 75.8) predictive conditionals, t(58) = 0.4, p > .7, Cohen’s d = 0.09. Mean responses for the strong and weak alternatives are shown in Figure 3b. As in Experiment 2, alternative strength had a significant effect only on full diagnostic judgments, t(64) = 7.8, p < .0001, Cohen’s d = 2.0.

Fig. 3.

Mean likelihood rating as a function of (a) type of judgment (predictive or diagnostic) and type of conditional (full or no-alternative) and (b) type of judgment, type of conditional, and strength of alternatives (strong or weak) in Experiment 3. Participants rated likelihood on a scale ranging from 0, impossible, to 100, definite. Error bars represent standard errors.

The pattern of results in Experiment 3 was identical to Experiment 2: The explicit absence of an alternative cause affected diagnostic but not predictive judgments, and strength of alternatives affected only full diagnostic judgments.
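
For readers who want to relate the reported t statistics to effect sizes, one common approximation for two independent groups of equal size is d = 2t / √df. Whether the effect sizes in Experiments 2 and 3 were computed this way is an assumption; the sketch below only applies the conversion to three (t, df) pairs reported above, and `d_from_t` is a hypothetical helper name.

```python
from math import sqrt


def d_from_t(t, df):
    """Approximate Cohen's d for two equal-sized independent groups."""
    return 2 * t / sqrt(df)


# (t, df) pairs as reported for the diagnostic comparisons in Experiments 2 and 3
reported = [(7.0, 58), (8.4, 61), (7.8, 64)]
estimates = [d_from_t(t, df) for t, df in reported]

for (t, df), d in zip(reported, estimates):
    print(f"t({df}) = {t}: d is approximately {d:.2f}")
```

The resulting values fall within rounding distance of the effect sizes reported for those comparisons.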

General Discussion

Whether they were experts reasoning about psychopathology or undergraduates reasoning about their goals and actions or about causal transmissions, people neglected alternative causes when making predictive-likelihood judgments but were sensitive to them when reasoning diagnostically.

Alternative explanations

One might argue that the pattern of results reflects the special status of no-alternative diagnostic judgments. These probabilities should be rated very highly, equal to 1 or close to it, making the difference between full and no-alternative diagnostic judgments obvious. Conversely, the difference between no-alternative and full predictive judgments is subtler because neither takes a value at the end of the probability scale. This interpretation predicts the high likelihood ratings in the no-alternative diagnostic condition relative to other judgments, but not the effects of alternative strength in Experiments 2 and 3. In diagnostic reasoning, participants were sensitive not just to the presence or absence of alternatives, but to the degree of alternative strength. Predictive judgments did not vary with alternative strength.

Another potential explanation for the results is that people neglected alternatives in the predictive direction because of pragmatic considerations. Do people interpret full conditionals as containing an implicature to ignore unmentioned causes in the predictive direction only? We tried to avoid such implicatures by choosing wordings that lent themselves more naturally to the intended interpretation: the full conditional. For example, in Experiment 1, participants were told that Ms. Y was diagnosed with depression and then were asked to judge the likelihood of her presenting with lethargy. Admittedly, it remains a logical possibility that people interpreted this as a request to judge the probability that the patient presents with lethargy that is due to depression and not any other cause, but we find this interpretation unlikely. Nonetheless, there is a fine line between a cognitive process that habitually neglects relevant information and a pragmatic one that infers intent to exclude the information. The role of pragmatics in these kinds of probability judgments is worthy of further exploration.

Potential mechanisms

A more complete explanation for the divergence between predictive and diagnostic reasoning emerges from a consideration of the demands imposed by the two directions of reasoning. People make predictions by simulating the mechanisms that produce predicted states from specific causes (Hagmayer & Waldmann, 2000; Kahneman & Tversky, 1982), and people tend to simulate one or a small number of mechanisms for a particular outcome (Dougherty, Gettys, & Thomas, 1997). It is reasonable to start with the cause that is picked out by the current argument or situation. Generating novel explanations is difficult because of the vast number of potentially relevant factors (Josephson & Josephson, 1994; Peirce, 1931/1965). Diagnostic-likelihood judgments, however, demand comparing the cause at hand with alternative possible causes; engaging in explanation is unavoidable. The presence of an explanatory process may also be why making diagnostic judgments seems harder than making predictive ones (Tversky & Kahneman, 1982).

Implications

Our findings are at odds with the claim that predictions are positively biased relative to diagnoses (Medin et al., 2003; Tversky & Kahneman, 1982). Instead we found that predictions are underestimated due to the neglect of alternatives. The effect reported by Tversky and Kahneman is based on only a single question, and those reported by Medin et al. may be driven by strong alternative causes lowering diagnostic judgments and not by bias (for more details, see Fernbach et al., 2009).

Neglect of alternatives in predictive-likelihood judgments implies an undue optimism in the case of medical prognoses (or pessimism regarding the success of treatments) and undue pessimism in the case of planning and goal pursuit. For example, a graduate student thinking about future job prospects in the context of his or her current research neglects the effects of future research. In the light of research showing neglect of alternatives in some diagnostic situations (Doherty et al., 1996; Fischhoff, Slovic, & Lichtenstein, 1978), the consideration of alternatives in diagnostic-likelihood judgments is at least as surprising as the neglect in predictive ones. It is notoriously difficult to get people to consider alternative hypotheses. One debiasing strategy is to get them to consider the opposite (Lord, Lepper, & Preston, 1984). Our work suggests that getting people to consider alternatives may be facilitated by having them explicitly judge how likely their hypothesis is, given the evidence, especially when the hypothesis can be construed as a potential cause of the evidence. People apparently are already equipped to consider alternative causes under these conditions.

Footnotes

Acknowledgements

We thank Jonathan Bogard for collecting data and Leonel Garcia-Marques, Ju-Hwa Park, and John Santini for discussions of the work. We also thank Marc Buehner and two anonymous reviewers for comments on an earlier draft.

The data from Experiment 1 appeared in a poster presented at the Simches Symposium, Boston, MA, November 2008 (Romeo et al., 2008).

The authors declared that they had no conflicts of interest with respect to their authorship or the publication of this article.

This work was supported by National Science Foundation Award 0518147.

Additional supporting information may be found at

References

Bornstein, B.H., & Emler, A.C. (2001). Rationality in medical decision making: A review of the literature on doctors’ decision-making biases. Journal of Evaluation in Clinical Practice, 7, 97–107.

Doherty, M.E., Chadwick, Caravan, Barr, & Mynatt, C.R. (1996). On people’s understanding of the diagnostic implications of probabilistic data. Memory & Cognition, 24, 644–654.

Dougherty, M.R.P., Gettys, C.F., & Thomas, R.P. (1997). The role of mental simulation in judgments of likelihood. Organizational Behavior and Human Decision Processes, 70, 135–148.

Fernbach, P.M., Darlow, & Sloman, S.A. (2009). Asymmetries in predictive and diagnostic reasoning. Manuscript submitted for publication.

Fischhoff, Slovic, & Lichtenstein (1978). Fault trees: Sensitivity of estimated failure probabilities to problem representation. Journal of Experimental Psychology: Human Perception and Performance, 4, 330–344.

Hagmayer, & Waldmann, M.R. (2000). Simulating causal models: The way to structural sensitivity. In Gleitman, L.R., & Joshi, A.K. (Eds.), Proceedings of the Twenty-Second Annual Conference of the Cognitive Science Society (pp. 214–219). Mahwah, NJ: Erlbaum.

Josephson, J.R., & Josephson, S.G. (1994). Abductive inference: Computation, philosophy, technology. New York: Cambridge University Press.

Kahneman, & Tversky (1982). The simulation heuristic. In Kahneman, Slovic, & Tversky (Eds.), Judgment under uncertainty: Heuristics and biases (pp. 201–208). New York: Cambridge University Press.

Klayman, & Ha, Y.W. (1987). Confirmation, disconfirmation, and information in hypothesis testing. Psychological Review, 94, 211–228.

Lord, C.G., Lepper, M.R., & Preston (1984). Considering the opposite: A corrective strategy for social judgment. Journal of Personality and Social Psychology, 47, 1231–1243.

Medin, D.L., Coley, J.D., Storms, & Hayes, B.K. (2003). A relevance theory of induction. Psychonomic Bulletin & Review, 10, 517–532.

Pearl (1988). Probabilistic reasoning in intelligent systems: Networks of plausible inference. San Mateo, CA: Morgan Kaufmann.

Peirce, C.S. (1965). Lectures on pragmatism. In Hartshorne, & Weiss (Eds.), Collected papers of Charles Sanders Peirce (Vol. 5, pp. 14–212). Cambridge, MA: Harvard University Press. (Original work published 1931)

Pitz, G.F., Downing, & Reinhold (1967). Sequential effects in the revision of subjective probabilities. Canadian Journal of Psychology, 21, 381–393.

Romeo, Sutton-Skinner, Petersen, Baer, Huffman, Birnbaum, & Sloman, S.A. (2008, November). Clinical decision making biases in a group of mental health providers. Poster presented at the Simches Symposium, Boston, MA.

Ross, B.H., & Murphy, G.L. (1996). Category-based predictions: Influence of uncertainty and feature associations. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 736–753.

Shah, J.Y., Friedman, & Kruglanski, A.W. (2002). Forgetting all else: On the antecedents and consequences of goal shielding. Journal of Personality and Social Psychology, 83, 1261–1280.

Tversky, & Kahneman (1982). Causal schemata in judgements under uncertainty. In Kahneman, Slovic, & Tversky (Eds.), Judgement under uncertainty: Heuristics and biases (pp. 117–128). Cambridge, England: Cambridge University Press.
