Abstract
Human gambling generally involves suboptimal choice because the expected return is usually less than the investment. We have found that animals, too, choose suboptimally under similar choice conditions. Pigeons, like human gamblers, show an impaired ability to objectively assess overall probabilities and amounts of reinforcement when a rare, high-value outcome (analogous to a jackpot in human gambling) is presented in the context of more frequently occurring losses. More specifically, pigeons prefer a low-probability, high-reward outcome over a guaranteed low-reward outcome with a higher overall value. Furthermore, manipulations assumed to increase impulsivity (maintaining pigeons at higher levels of motivation for food and housing them in individual cages) result in increased suboptimal choice. Presumably, these manipulations increase the pigeons’ attraction to the signal for the low-probability, high-reward outcome at the expense of sensitivity to the more global probability of reinforcement associated with each alternative.
When humans engage in commercial gambling, they are choosing suboptimally, because they are choosing a low-probability but high-payoff alternative over a high-probability, low-payoff alternative (i.e., not gambling), such that the net expected return is less than what was wagered (e.g., slot machines and state lotteries). This is an impulsive choice in the sense that the gambler’s behavior suggests a failure to consider the long-term consequences of the decision. In fact, research has shown that patterns of decision-making in pathological gamblers are marked by a preference for immediate gratification or relief from states of deprivation relevant to their addiction despite negative long-term consequences (Yechiam, Busemeyer, Stout, & Bechara, 2005).
Recent research suggests that decision making depends on two different sources of input: primary processes governed by relatively simple associative learning that typically occurs impulsively, often without awareness, and secondary processes comprising what we normally classify as thought processes, the conscious effort to consider possibilities, and an attempt to resolve dilemmas (Dijksterhuis, 2004; Evans, 2003; Klaczynski, 2005). Nonhuman animals are generally thought to rely on primary decision processes associated with more primitive areas of the brain. It is noteworthy that pathological gamblers are also thought to arrive at decisions through the use of more primitive areas of the brain (Potenza, 2008).
A Procedure Analogous to Human Gambling
Consistent with the hypothesis that primary processes are involved in the suboptimal choices involved in gambling, we have found that pigeons, too, reliably prefer an alternative that signals a low-probability, high-payoff outcome even if it results in a substantial loss of reinforcement (Gipson, Alessandri, Miller, & Zentall, 2009; see also Fantino, Dunn, & Meck, 1979; Mazur, 1996; Spetch, Belke, Barnet, Dunn, & Pierce, 1990). This finding is contrary to optimal foraging theory (Stephens & Krebs, 1986), because animals should be sensitive to alternatives that provide greater probabilities or greater quantities of food. Gipson et al. (2009) gave pigeons a choice between two white lights. A single peck to one of the lights always resulted in the presentation of one of two colored lights. In the case of the first white light, a red light was presented (always followed 10 s later by noncontingent reinforcement) or a green light was presented (never followed by reinforcement). In the case of the second white light, a yellow or blue light was presented (each 50% of the time), and 75% of the time, the yellow or blue light was followed 10 s later by noncontingent reinforcement (Fig. 1a). Some of the trials were forced to each alternative, and other trials were choice trials. After training, the pigeons showed a 69% preference for the suboptimal reinforcement alternative.

Fig. 1. Experimental paradigm. In each session, pigeons were given a choice between two white lights. A single peck to one of the lights always resulted in the presentation of one of two colored lights. In the case of the first white light (left column), a red light was presented (always followed 10 s later by noncontingent reinforcement) or a green light was presented (never followed by reinforcement). The percentage of time that each color was shown is as follows: 50% red, 50% green (a); 20% red, 80% green (b–d). The probability of reinforcement, or p(rf), was always 100% for red and 0% for green. In (c), 10 pellets were given after the red light, and no pellets were given after the green light. In the case of the second white light (right column), a yellow or blue light was presented, followed 10 s later by noncontingent reinforcement. Yellow and blue each appeared 50% of the time, and each was reinforced 75% of the time (a). Yellow and blue appeared 20% and 80% of the time, respectively, and each was reinforced 50% of the time (b). Yellow and blue appeared 20% and 80% of the time, respectively, and each light was followed by a noncontingent reinforcement of 3 pellets (c). Yellow and blue each appeared 50% of the time; yellow was always followed by a noncontingent reinforcement, and blue was never followed by such a reinforcement (d). In all experiments, the colors and sides were balanced across pigeons.
Stagner and Zentall (2010) lowered the overall probabilities of reinforcement such that the signaled reinforcement occurred on only 20% of the trials when that alternative was chosen, whereas unsignaled reinforcement occurred on 50% of the trials when the other alternative was selected (Fig. 1b). In this experiment, the pigeons showed a 97% preference for the signaled reinforcement alternative.
A Procedure More Analogous to Human Gambling
When humans are involved in commercial gambling, the alternatives generally involve different magnitudes of reinforcement (typically money) rather than different probabilities of reinforcement. Zentall and Stagner (2011) examined the effect of magnitude of reinforcement in pigeons. If pigeons chose one alternative, they had a 20% chance of receiving a signal for 10 pellets of food (an average of 2 pellets), whereas if they chose the other alternative they received a signal for a guaranteed 3 pellets of food (Fig. 1c). Just as with probability of reinforcement, pigeons showed a strong (86%) preference for the infrequent 10 pellets over the certain 3 pellets.
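The sense in which these choices are suboptimal can be made concrete with a little arithmetic. The following Python sketch is only an illustration, not part of any of the cited procedures: it computes the expected return per choice for each alternative in the three designs described so far. The names `designs` and `expected_return` are hypothetical, and for the two probability-based designs (Fig. 1a and 1b), in which no pellet amount is stated, one unit of food per reinforcement is assumed.

```python
# Each alternative is a list of signals. Each signal is a tuple:
# (probability that the signal appears after the choice,
#  probability of reinforcement given that signal,
#  amount of food delivered when reinforcement occurs).
# Probabilities and pellet counts follow the procedures described in the text.
# ASSUMPTION: for Fig. 1a and 1b the amount is set to 1 arbitrary unit,
# because the text specifies only probabilities for those designs.
designs = {
    "Fig. 1a": {
        "suboptimal": [(0.5, 1.0, 1), (0.5, 0.0, 1)],
        "optimal": [(0.5, 0.75, 1), (0.5, 0.75, 1)],
    },
    "Fig. 1b": {
        "suboptimal": [(0.2, 1.0, 1), (0.8, 0.0, 1)],
        "optimal": [(0.2, 0.5, 1), (0.8, 0.5, 1)],
    },
    "Fig. 1c": {
        "suboptimal": [(0.2, 1.0, 10), (0.8, 0.0, 0)],
        "optimal": [(0.2, 1.0, 3), (0.8, 1.0, 3)],
    },
}


def expected_return(alternative):
    """Expected amount of food per choice of this alternative."""
    return sum(p_signal * p_rf * amount for p_signal, p_rf, amount in alternative)


for name, alternatives in designs.items():
    ev_suboptimal = expected_return(alternatives["suboptimal"])
    ev_optimal = expected_return(alternatives["optimal"])
    print(f"{name}: suboptimal = {ev_suboptimal:.2f}, optimal = {ev_optimal:.2f}")
```

In every design the alternative the pigeons preferred has the lower expected return: 0.50 versus 0.75 and 0.20 versus 0.50 reinforcements per choice in the two probability designs, and an average of 2 versus 3 pellets per choice in the magnitude design.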
What Is the Mechanism Responsible for Suboptimal Choice by Pigeons?
Dinsmoor (1983) argued that any stimulus that predicts reinforcement with a high probability (S+) will become a conditioned reinforcer and will elicit observing behavior. In the present procedure, the conditioned reinforcer is effective because, unlike the primary reinforcer, it occurs immediately after choice. Thus, in these gambling-like procedures, one can think of the choice as being between the stronger conditioned reinforcer associated with the gambling-like alternative and the weaker conditioned reinforcer associated with the other alternative. The attractiveness of the stronger conditioned reinforcer results in impulsive choice of the suboptimal alternative.
But why does nonreinforcement that occurs after the 0% reinforcement, suboptimal-alternative stimulus (the S–, which occurs 80% of the time) not result in conditioned inhibition? Given that in some experiments it occurred four times as often as the stimulus that was always followed by reinforcement (Stagner & Zentall, 2010; Zentall & Stagner, 2011), it should have decreased the attractiveness of the conditioned reinforcer. Perhaps the S– failed to become a conditioned inhibitor because it maintained little observing behavior (i.e., the pigeons may have looked away from the S– as soon as it appeared; see Dinsmoor, 1983). To test this hypothesis, we replaced the localized S– with a diffuse light in the operant chamber that should have been very difficult to avoid (Stagner, Laude, & Zentall, 2011). We found that pigeons exposed to a diffuse S– continued to prefer the signaled reinforcement associated with an overall lower probability of reinforcement as much as control groups. Thus, it is not simply that the pigeons have little experience with the S–; rather, they do not seem to attribute much negative value to the S–. It is noteworthy that a theory based on the absence of conditioned inhibition to losses also has been proposed to account for human gambling behavior (Blanco, Ibáñez, Sáiz-Ruiz, Blanco-Jerez, & Nunes, 2000; Breen & Zuckerman, 1999).
Greater attention to the conditions of signaled reinforcement than to the conditions of signaled nonreinforcement (Hayden, Heilbronner, Nair, & Platt, 2008) may explain some of the differences in suboptimal choice among experiments. For example, when the alternatives were associated with 50% and 75% reinforcement (Gipson et al., 2009), the pigeons’ effective choice may have been between the signal for 100% reinforcement and the signals for 75% reinforcement, and when the alternatives were 20% and 50% reinforcement (Stagner & Zentall, 2010), the pigeons’ effective choice may have been between the signal for 100% reinforcement and the signals for 50% reinforcement. Furthermore, when a short gap is presented between the choice response and one of the signals for reinforcement or its absence, preference for the suboptimal alternative decreases, but only when the gap occurs before the S+, not when it occurs before the S– (McDevitt, Spetch, & Dunn, 1997). Thus, the S– seems to play a minimal role in the suboptimal choice preference.
If this analysis is correct, it suggests that the variable of primary importance is the outcome-signaling value of the conditioned reinforcer, and it may be independent of the frequency with which it occurs. We tested this hypothesis by pitting two conditioned reinforcers against each other. Each signaled 100% reinforcement, but one occurred on 20% of the trials on which its alternative was chosen, and the other occurred on 50% of the trials on which its alternative was chosen (Stagner, Laude, & Zentall, 2012; see Fig. 1d).
Consistent with the hypothesis that choice is governed by the outcome-signaling value of the conditioned reinforcers rather than by how often they occur, the pigeons were relatively insensitive to the overall frequency of reinforcement. That is, the pigeons had no preference between the alternatives. It is noteworthy that in humans, gambling memories generally involve enhanced memory for the salient events of winning but not losing, an effect sometimes referred to as the availability heuristic (Tversky & Kahneman, 1974). This overemphasis of wins by humans, as by pigeons, probably contributes to maintenance of gambling behavior (Blanco et al., 2000).
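The contrast between the two accounts of choice can be sketched in a few lines of Python. This is a deliberately simplified illustration, not a model proposed in the cited studies: one rule compares the overall probability of reinforcement for each alternative, and the other compares only the value of the best signal each alternative can produce, ignoring both how often that signal occurs and the S–. The names `procedures`, `overall_probability`, `best_signal_value`, and `predicted_choice` are hypothetical, as is the all-or-none decision rule.

```python
# Each signal is (probability the signal appears, probability of reinforcement
# given that signal). "first" is the alternative with the signal for 100%
# reinforcement that appears on 50% (Fig. 1a) or 20% (Fig. 1b, 1d) of trials.
procedures = {
    "Fig. 1a": {"first": [(0.5, 1.0), (0.5, 0.0)], "second": [(0.5, 0.75), (0.5, 0.75)]},
    "Fig. 1b": {"first": [(0.2, 1.0), (0.8, 0.0)], "second": [(0.2, 0.5), (0.8, 0.5)]},
    "Fig. 1d": {"first": [(0.2, 1.0), (0.8, 0.0)], "second": [(0.5, 1.0), (0.5, 0.0)]},
}


def overall_probability(alternative):
    """Overall probability of reinforcement after choosing this alternative."""
    return sum(p_signal * p_rf for p_signal, p_rf in alternative)


def best_signal_value(alternative):
    """Value of the strongest conditioned reinforcer (ASSUMED rule):
    ignores signal frequency and ignores the S-."""
    return max(p_rf for _, p_rf in alternative)


def predicted_choice(alternatives, value):
    """Return which alternative a chooser maximizing `value` would pick."""
    first = value(alternatives["first"])
    second = value(alternatives["second"])
    if abs(first - second) < 1e-9:
        return "indifferent"
    return "first" if first > second else "second"


for name, alternatives in procedures.items():
    by_overall = predicted_choice(alternatives, overall_probability)
    by_signal = predicted_choice(alternatives, best_signal_value)
    print(f"{name}: overall-probability rule -> {by_overall}; "
          f"best-signal rule -> {by_signal}")
```

Under the overall-probability rule, the second alternative should be chosen in all three procedures. Under the best-signal rule, the first alternative is favored in the Fig. 1a and 1b procedures and the two alternatives are equivalent in the Fig. 1d procedure, which is the qualitative pattern reported above.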
Extending the Animal Model: Motivational and Environmental Sources of Control of Suboptimal Choice by Pigeons
Motivation
Delay discounting is the reduced value attributed to reinforcers that are delayed compared with those that are immediate. There is evidence that greater levels of food restriction are associated with greater rates of delay discounting by animals (Eisenberger, Masterson, & Lowman, 1982), such that hungry animals tend to be more impulsive, showing a greater preference for immediate rewards (Bradshaw & Szabadi, 1992; Snyderman, 1983). We have found that pigeons are less attracted to the gambling-like alternative when they are less motivated by food and presumably less impulsive (Laude, Pattison, & Zentall, 2012; see Fig. 2). We attributed the reduction in suboptimal choice by pigeons maintained on a less restricted diet to less attraction to the stronger conditioned reinforcer when they chose the suboptimal reinforcement alternative. Analogous findings indicate that people with higher needs (i.e., lower socioeconomic status) tend to gamble proportionally more than those with higher socioeconomic status (Lyk-Jensen, 2010).
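To illustrate what a greater rate of delay discounting implies, the sketch below uses the hyperbolic form V = A / (1 + kD) that is commonly used in the delay-discounting literature, where A is the amount, D is the delay, and k is the discounting rate. The function name `discounted_value`, the two values of k, and the amounts and delay are hypothetical and chosen only for illustration; they are not estimates from any of the studies cited here.

```python
def discounted_value(amount, delay_s, k):
    """Hyperbolic discounting: present value of `amount` delivered after
    `delay_s` seconds, for a chooser with discounting rate `k` (per second)."""
    return amount / (1.0 + k * delay_s)


# HYPOTHETICAL choice: 1 pellet immediately versus 3 pellets after 10 s.
immediate_amount = 1
delayed_amount = 3
delay_s = 10

# HYPOTHETICAL discounting rates: a shallow discounter and a steep discounter
# (the latter standing in for a more food-restricted, more impulsive animal).
for label, k in [("shallow discounting", 0.1), ("steep discounting", 0.5)]:
    value_now = discounted_value(immediate_amount, 0, k)
    value_later = discounted_value(delayed_amount, delay_s, k)
    preferred = "larger, delayed" if value_later > value_now else "smaller, immediate"
    print(f"{label} (k = {k}): immediate = {value_now:.2f}, "
          f"delayed = {value_later:.2f}, prefers {preferred} reward")
```

With these illustrative numbers, the chooser with the lower discounting rate values the delayed 3 pellets above the immediate pellet, whereas the chooser with the higher rate shows the reverse preference.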

Fig. 2. Percentage of choices of the suboptimal alternative (50% chance of obtaining food) over a better alternative (75% chance of obtaining food) as a function of session and food restriction. The dashed line indicates chance performance.
Environmental enrichment
In humans, it has been found that lower levels of impulsivity (Perry & Carroll, 2008) and a reduced effectiveness of conditioned reinforcers (Jones, Marsden, & Robbins, 1990) are associated with reduced drug self-administration. Furthermore, there is evidence that reduced impulsivity can have a similar effect on compulsive gambling and drug addiction (Potenza, 2008).
Environmental enrichment also has been found to decrease impulsivity in rats as measured by decreased delay discounting (Perry, Stairs, & Bardo, 2008). Likewise, environmental conditions under which rats live can affect the likelihood of drug self-administration (Stairs & Bardo, 2009).
Environmental enrichment also has been shown to affect the degree of suboptimal choice in pigeons. For example, pigeons given access to a large cage with conspecifics, compared with the more typical, smaller individual housing that allows for limited social interaction, showed reduced choice of the suboptimal alternative (Pattison, Laude, & Zentall, 2013; see Fig. 3). In pigeons, social enrichment seems to reduce the attractiveness of the strong conditioned reinforcer, thereby reducing the attraction of the suboptimal alternative. The results of the environmental manipulation with pigeons suggest the possibility that social isolation may encourage gambling behavior in humans.

Fig. 3. Percentage of choices of the suboptimal alternative as a function of session and group.
It is noteworthy that although socially enriched pigeons initially prefer the optimal alternative, with extended training, they come to choose the suboptimal alternative almost exclusively (Pattison et al., 2013). Thus, despite the fact that social enrichment may decrease impulsivity, extended exposure to the choice task may cause those pigeons to grow increasingly attracted to the high-valued conditioned reinforcer. Similar predictions are made by certain models of human gambling (Blaszczynski & Nower, 2002; Sharpe, 2002), suggesting that problem gambling can emerge by way of experience (conditioning).
Test of the Suboptimal Choice Procedure on Self-Reported Human Gamblers
If the procedures used with pigeons provide an appropriate model of human gambling behavior, one should see a difference in suboptimal choice with this task between humans who report considerable gambling behavior and those who do not. To test this prediction, a sample of undergraduates was identified who indicated that they engaged in gambling-related activities on a regular basis. They were matched to subjects who reported that they never engaged in gambling-related activities. The design was similar to that used in Figure 1c, but the colored response keys were replaced with a 10-s video game involving different colored planets, and the participants were given points (10 or 0 vs. 3), presumably for shooting down invading space ships. It was found that self-reported gamblers chose the low-probability, high-payoff alternative significantly more often (56.5%) than control subjects (23.0%) (Molet et al., 2012). These results suggest that the suboptimal-choice task designed for use with pigeons may also be appropriate for studying the mechanisms that contribute to human gambling choices.
Although one tends to think in terms of the outcomes (winning and losing), the role of signaled reinforcement (conditioned reinforcers and conditioned inhibitors) in human gambling has received less attention than it should and is worthy of further study. One can better appreciate this role by asking if humans would be as likely to gamble if, for example, when operating a slot machine, the symbols that typically appear in the window were covered (i.e., if money merely appeared or failed to appear after depositing a coin). Results we have obtained using an animal model of gambling behavior suggest that the propensity to make suboptimal choices in a gambling environment depends on the signals for reinforcement and the absence of reinforcement. In fact, these signals probably contribute to the cognitive biases that promote the acquisition and maintenance of gambling in humans as well.
Conclusions
The finding that pigeons choose suboptimally because of an impaired ability to objectively assess the overall probability of reinforcement is reminiscent of the decision-making process of pathological gamblers. As with pigeons, gamblers’ apparent impairment in calculating odds may be due to a bias to attend to signals for reinforcement. In fact, it has been suggested that attentional bias to gambling-related targets generates positive outcome expectancies, consequently motivating instrumental gambling behavior (Field & Cox, 2008).
The increased attraction to the stronger signal for reinforcement associated with the suboptimal alternative that we find under certain conditions is thought to be produced by an impulsive state (Laude, Beckmann, Daniels, & Zentall, in press); this suggests that impulsivity may function as a proximal mechanism that increases suboptimal choice in pathological gamblers. In fact, over a wide range of studies, there seems to be a correlation between impulsivity and gambling in humans (Blaszczynski, Steel, & McConaghy, 1997). However, pigeons that are presumably less impulsive (because of mild food restriction and social enrichment), as evidenced by the fact that they start out by showing a preference for the optimal alternative, seem to lose that preference with continued exposure to the task. This finding suggests that it may be difficult for humans, even those who are not particularly impulsive and are not initially attracted to suboptimal choices, to resist gambling if they are exposed to gambling environments for extended durations. To the extent that pigeons show suboptimal choice under conditions that mimic human gambling behavior, an animal model may be useful in studying variables that contribute to (or discourage) habitual gambling behavior by humans.
Recommended Reading
Dinsmoor, J. A. (1983). (See References). Reviews the role of conditioned reinforcement in learning and preference.
Roper, K. L., & Zentall, T. R. (1999). Observing behavior in pigeons: The effect of reinforcement probability and response cost using a symmetrical choice procedure. Learning and Motivation, 30, 201–220. Demonstrates that animals will strongly prefer discriminative stimuli even when that preference has no effect on the probability of reinforcement.
Zentall, T. R. (2011). Maladaptive gambling by pigeons. Behavioural Processes, 87, 50–56. Provides a review of research on suboptimal choice by pigeons.
Footnotes
Declaration of Conflicting Interests
The authors declared that they had no conflicts of interest with respect to their authorship or the publication of this article.
Funding
The research described in this article was supported by National Institute of Mental Health Grant 63726 and by National Institute of Child Health and Human Development Grant 60996.
