Abstract
Although play occurs in a wide variety of animals, models of the origins of play behavior are lacking. We propose a novel computational model exploring the evolution of non-social frivolous play. Asexually reproducing semelparous animals can either rest or forage. Foraging occurs when an organism is below an energy threshold. Foraging success is determined by the combination of skill and availability of resources, which declines over time but replenishes for each generation. Play was introduced as a mutant strategy: a frivolous activity that uses energy and increases the probability of dying relative to resting, with no direct fitness benefit. Simulations show that play behavior becomes fixed in the population and the time spent playing is maintained at a low rate in spite of its costly nature. When play behavior is functional by increasing foraging ability, it evolves quickly and the time individuals spend playing increases, but eventually the population of players collapses and play disappears. We suggest a mechanism underlying the origins of adaptive play from non-adaptive behavior when resources expand. Initially play acts as a spiteful behavior in that playing individuals suffer a direct cost to their fitness, but may also impose even greater costs on other individuals in the population.
Introduction
There have been only a few attempts to model play behavior in the literature (citations in Burghardt, 2005; Pellis, Burghardt, Palagi, & Mangel, 2015) and no models we are aware of deal with the origins of play and how it could become established in the first place. Progress was hindered by an unclear conception of what play was, how to identify it, and confusion among different types and complexities of play (Burghardt, 2005). We now have a set of five criteria for identifying play in diverse species and contexts, allowing play to be identified in animals ranging from invertebrates (e.g. octopus, spiders, insects) to members of all vertebrate classes (Burghardt, 2014; Graham & Burghardt, 2010). A recent brief statement (Burghardt, 2014, p. 91) is: “Play is repeated, seemingly non-functional behavior differing from more adaptive versions structurally, contextually, or developmentally, and initiated when the animal is in a relaxed, unstimulating, or low stress setting.” Thus, whereas play is certainly most common and complex in many mammals (e.g., primates, rodents, carnivores, cetaceans) and birds (e.g., corvids, parrots, raptors), the fact that it occurs in diverse and distantly related taxa raises the issue of the origins of such behavior multiple times over the course of animal evolution.
Another advance was to recognize that play may be classified into three processes: primary, secondary, and tertiary (Burghardt, 2005). Only the first is relevant to the origins issue, as such play is a byproduct of various factors such as high metabolic energy, boredom, motivational conflicts, incipient (intention) movements, and other ethological and psychological factors (Pellis et al., 2015). Here we use a very simple system to model animals that either do or do not engage in seemingly nonfunctional, but somewhat costly, behavior, and to explore if and how play becomes established in the population. Play in animals is typically divided into locomotor, object, and social play, and can be quite complex, especially social play. Here we focus on nonsocial play in a simple system, with the hope that more realistic models and research testing hypotheses in more naturalistic contexts will be fostered. We are attempting to construct models in which play can not only originate, but become established and be maintained in a population, even if the time spent playing is at a low rate.
We are using as our conceptual source Surplus Resource Theory (SRT), as developed in several places over the last decades (e.g., Burghardt, 1988, 2005, 2014; Pellis et al., 2015). SRT postulates that abundance and variety of resources (e.g., energetic, behavioral) are important for play behavior; empirical work has shown that play can be very costly in terms of energy use (e.g., Burghardt, 2005; Martin, 1984; Miller & Byers, 1991). Although much modeling attention has been given to resource abundance and foraging, growth, reproductive rates, and territorial behavior (Dugatkin, 2013; Dugatkin & Reeve, 2013; Fretwell, 1972; Smith, 1982), little modeling has been done for play behavior (Dugatkin & Bekoff, 2003; Grunloh & Mangel, 2015).
A general finding motivating our approach is widespread evidence that limited food resources reduce or even eliminate play in many animals (reviewed in Burghardt, 2005); in fact, play is usually the first behavior dropped from a species’ repertoire when survival conditions are unfavorable. In the following model, then, the “animal” population is in an environment with food resources of a fixed amount, that are only replenished “seasonally” as often occurs in natural settings.
General overview of the model
Individual-based models (IBMs) allow for inclusion of a vast array of detail concerning individual behavior and characteristics in determining population response (DeAngelis & Mooij, 2005; Grimm & Railsback, 2005; Judson, 1994). They have been particularly important in situations in which interactions between individuals affect population response, notably when these interactions are affected by spatial aspects of the environment or the interactions depend upon the characteristics of the individuals involved, such as their size, social status, or genetically influenced behavioral traits such as shyness, boldness, and aggressiveness (Grimm & Railsback, 2005). IBMs are formulated based upon sets of rules for changes in state of individuals, with the rule set chosen to represent general responses (e.g., to address fundamental questions such as the nature of spatial patterns that arise from different individual movement rules) or responses of particular species of interest. In the latter case, rules are chosen from observation and the model may be evaluated through comparison with field or laboratory data. Dependence of rule sets upon such data provides one of the challenges to developing IBMs in that available data may be too sparse to account for the range of possible individual responses. In this case, we develop our IBM from a theoretical perspective to begin teasing apart primary drivers of the evolution of play behavior.
Our initial model is a non-spatial agent-based model where individuals indirectly compete for a fixed amount of resources in their environment over a given time period. Individuals can rest, forage, or play (given that they have the genetic disposition to do so). Note that “play” is initially equated with non-functional but energy consuming behavior. The other behaviors of resting and foraging are both energy consuming (and have associated mortalities), but foraging leads to energy acquisition and not just expenditure. Play behavior is introduced as a mutant and can come in two flavors—frivolous (without any direct benefit) or non-frivolous (having a foraging benefit). This is realistic to the extent that object or locomotor play may improve hunting or foraging behavior, directly or indirectly (Burghardt, 2005).
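The behavioral rules described above can be sketched in a few lines of Python. This is only an illustrative sketch and not the authors' implementation: the names (`Agent`, `choose_action`, `step`), every numeric value in `PARAMS`, the assumed form of the foraging success probability (skill times the fraction of resources remaining), and the rule that an agent dies when its energy reaches zero are all assumptions made here for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    energy: float
    is_player: bool = False  # first locus: can the agent play at all?
    play_prob: float = 0.0   # second locus: chance of playing when satiated
    alive: bool = True


# Illustrative placeholder values only; they are NOT the paper's parameters.
PARAMS = {
    "hunger_threshold": 5.0,
    "skill": 0.8,
    "food_gain": 3.0,
    "cost": {"rest": 0.1, "forage": 0.5, "play": 0.3},
    "death": {"rest": 0.001, "forage": 0.01, "play": 0.005},
}


def choose_action(agent, rng, p=PARAMS):
    """Hungry agents forage; satiated players may play; otherwise rest."""
    if agent.energy < p["hunger_threshold"]:
        return "forage"
    if agent.is_player and rng.random() < agent.play_prob:
        return "play"
    return "rest"


def step(agent, resources, initial_resources, rng, p=PARAMS):
    """Advance one agent by one time step and return the new resource level."""
    if not agent.alive:
        return resources
    action = choose_action(agent, rng, p)
    agent.energy -= p["cost"][action]  # every behavior consumes energy
    if action == "forage":
        # Assumed form: success scales with skill and the fraction of food left.
        p_success = p["skill"] * resources / initial_resources
        if resources >= 1 and rng.random() < p_success:
            agent.energy += p["food_gain"]
            resources -= 1
    # Each behavior carries its own mortality risk; starvation is an assumption.
    if rng.random() < p["death"][action] or agent.energy <= 0:
        agent.alive = False
    return resources
```

Calling `step` for every agent at every time step, and resetting `resources` at the start of each generation, would give the indirect competition described above; reproduction and mutation are deliberately left out of this sketch.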
Methods
The intent of the model was to explore under what conditions play behavior emerges in a population of non-social organisms that spend their time resting, foraging, or playing (if they have the genetic propensity to do so). The model begins with a fixed population size of non-player agents (NPAs) whose behavior is limited to foraging for a limited resource or resting at each time step (
The probability that a foraging attempt is successful at time t is determined by:
where (
whereas unsuccessful foraging results in energy cost
Foraging and resting also have an associated probability of dying,

Strategies and their payoffs for the non-player agents.
Play behavior is introduced as a heritable behavior and is modeled with two loci. The first locus is binary in character and determines whether play occurs, whereas the other is a quantitative trait that determines how often the individual plays. The life history of the player agents (PAs) is similar to the NPAs (Figure 2), except that when the PAs are satiated they play with probability

Strategies and their payoffs for the player agents.
The probability of foraging success is now a function of play frequency:
where
Differences in individual fitness are determined by individual energy levels at the end of the reproductive period (
where
where
To determine the hunger threshold, simulations were conducted for a given set of parameter values seeded with different hunger thresholds. The maximum of these values that maintained the population size (i.e., the population survived
Parameter descriptions for the model.
Values in bold were used for the simulations used in the results unless otherwise noted. All combinations of parameter values were explored. (
Results
The main result of the model simulations is that frivolous play evolves in a population when there are ample resources, and that even when there are ample resources, non-frivolous play can only be maintained (i.e. fixation occurs) in the population for a small set of extreme parameter values (e.g., a large energy cost to play). Other results from these computational simulations show that play can emerge and be maintained in a population even when play does not have a direct fitness benefit and play has a greater energy cost than resting (Figure 3A). When play does have a benefit (i.e., the more the organism spends time playing, the better they are at foraging), play also evolves, yet cannot be maintained in the population, even when play gives a small benefit to foraging (Figure 3B). Play behavior becomes more frequent and more variable before it reaches some threshold in the population, then it disappears and the cycle starts anew (Figure 3B).

Dynamics of the play gene frequency and the play time. (A) Maintenance of a small amount of average time spent playing when not hungry (~10%) in the population when there is no benefit to foraging success from play behavior (
Different parameter values were explored for the costs of playing (

The dependence of play behavior on resource abundance and other parameter values. The vertical axis gives the average maximum play time in the population (A) or average play gene frequency in the population (B) over the last 10,000 generations and for 100 simulations. Outer axes correspond to different values in the initial amount of resources for each generation (

The dynamics of resource abundance in the environment due to the amount of play behavior in the population. For each simulation a proportion of the population were player agents with a fixed play time. (A) Results for the model without an increased foraging benefit from play behavior (
There is a strong relationship between play behavior and resource dynamics (Figure 5): organisms that play use more energy and therefore get hungrier more quickly than non-players and reduce the available resources more quickly than a population of non-players. There is a non-linear effect on this depletion of resources, with the amount of time spent playing having a stronger effect than the proportion of the population who play. Unlike with frivolous play (Fig. 5A), when
Discussion
The results show play evolving and becoming fixed in the population where there is no direct fitness benefit to engaging in play, and even though play incurs direct costs of increased mortality and energy use. This finding implies that there must be an indirect benefit of play for it to arise and be maintained in a population. The bar plots of Figure 4 show that resource abundance in the environment is the most important parameter of the model to explain the emergence of play behavior, followed by the energetic costs of playing. Ironically, it appears that non-functional play acts as a spiteful behavior in that playing individuals suffer a direct cost to their fitness, through an increased probability of dying and energy use, but may also impose even greater costs on other individuals in the population (Hamilton, 1964a, 1964b, 1970). But our counterintuitive results may also tap into new findings on plasticity suggesting that “non-adaptive plasticity potentiates evolution by increasing the strength of directional selection” (Ghalambor et al., 2015).
In this model, the relation of play behavior to the amount of resources in the environment is clear (Figure 5). The mechanism at work seems to be that those individuals who play get hungrier (i.e. need resources) sooner than those who rest, resulting in their foraging more often and more quickly. This reduces the amount of resources available in the environment, leaving the non-players less likely to effectively forage and resulting in their having fewer offspring. There is also an upper limit to how much individuals should play, as they can deplete the environment too quickly and die. This is similar to pathogen virulence; if a pathogen becomes overly virulent, it can kill the host (Goodnight et al., 2008). This scenario occurs when there is a direct fitness benefit to play, namely that the more an individual plays the better at foraging they become. Play evolves more quickly in the population, but then disappears after an arms race to play more and more often results in a depletion of resources that cannot sustain the population. Thus, it is not only the amount of resources in the environment, but how quickly the population depletes them that facilitates population collapse. If we were to allow for unlimited resources or resources that did not decline as they were consumed, then we would not expect the above scenario to occur. For example, it has recently been shown that vigorous play in male macaque monkeys (Macaca assamensis) leads to more rapid acquisition of motor skills but at the cost of slower growth even when food was somewhat restricted (Berghänel, Schülke, & Ostner, 2015).
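The depletion mechanism described above can be illustrated with a back-of-envelope calculation. The sketch below is an assumption-laden simplification rather than part of the model: it ignores foraging costs, failed foraging attempts, and mortality, and the function names and all numeric values are hypothetical. It only shows that an agent that spends more of its satiated time playing burns its energy reserve faster and therefore needs to feed more often within a season.

```python
def expected_idle_cost(play_prob, cost_rest, cost_play):
    """Mean energy spent per satiated time step, mixing resting and playing."""
    return (1 - play_prob) * cost_rest + play_prob * cost_play


def meals_per_season(season_length, food_gain, play_prob, cost_rest, cost_play):
    """Approximate number of successful feeding bouts needed in one season.

    Assumes each cycle is one feeding step followed by as many satiated
    steps as it takes to burn the energy gained from that meal.
    """
    idle_steps = food_gain / expected_idle_cost(play_prob, cost_rest, cost_play)
    return season_length / (1 + idle_steps)


# Hypothetical values: playing costs three times as much energy as resting.
non_player = meals_per_season(1000, 3.0, play_prob=0.0, cost_rest=0.1, cost_play=0.3)
light_player = meals_per_season(1000, 3.0, play_prob=0.1, cost_rest=0.1, cost_play=0.3)
heavy_player = meals_per_season(1000, 3.0, play_prob=0.5, cost_rest=0.1, cost_play=0.3)
print(non_player, light_player, heavy_player)
```

Under these assumptions the number of meals per season rises with play frequency, so a population containing players draws down a fixed resource pool sooner, which is the indirect cost to non-players discussed above.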
The results might explain why play rarely seems to have a runaway effect. In fact, many human moral lessons point out that if everyone in a society engaged in nonproductive activities (sports, arts, theatre, religion, video games, etc.) in terms of resource acquisition, the society could not be maintained, as no members were producing the resources necessary for such otherwise socially and personally enriching activities. Play is, however, both a measure of freedom and a source of creative achievement (Burghardt, 2013, 2015). The irony is that our initially non-intuitive results from the novel model suggest that play could have originated as a selfish, even spiteful behavior, en route to becoming a socially and reproductively valuable activity.
This model provides a possible key to the puzzle of the evolution of apparently non-functional, nonadaptive behavior. Is this also supported biologically? If the playing animals are operating as if they have a higher metabolic rate, they are overall more active and thus exploiting environmental resources at a higher rate than less active animals, and a relationship between metabolic rate and play has often been proposed and supported (Burghardt, 1984, 1988, 2005). Thus, under certain conditions, animals with higher metabolic rates, or more complex and energetically costly behavior and physiology, may be more evolutionarily successful than species with fewer requirements for survival. This may relate to the observation that terrestrial endothermic vertebrates generally out-compete ectothermic vertebrates, such as reptiles, competing for similar resources. But in deserts, for example, where food is often limited both in amount and seasonally, reptiles are often more successful, numerous, and speciose.
Future directions for the model include, but are not limited to, variability in initial resource availability within and across generations (to simulate seasonal, environmental variability), overlapping generations, reproduction not dependent on time but after reaching an energy threshold, a spatial component, social play, direct competition for resources (e.g. hawk–dove interactions), and predator-prey interactions. A construction of the model accounting for kin-selection and non-random mating is another possible avenue for future research. The formulations of this theoretical model offer experimenters and empiricists the opportunity to test numerous hypotheses about the relationship between resource abundance and play behavior. The parameters of this model are intuitive and observable for different species, allowing this model’s predictions to be verified. Sexually reproducing animals should be introduced as well.
Although our model is highly simplified, we have shown how a costly behavior without any obvious value can evolve. We recognize that this could apply to nonfunctional behaviors of any type, and our labeling this behavior as play may seem a sleight of hand. But play has been used as the prime example of a mysterious, non-serious, energetic behavior perplexing scientists as to how it could originate and become fixed in a population unless it had, at the same time, considerable value balancing its costs. Indeed, the ratio of costs to benefits has been an enduring and crucial parameter in behavioral ecology and evolution. We have shown that behavior can initially evolve in the absence of adaptive value. But once present, play, if it can result in expanding the resources available (e.g., invading new habitats, creating novel ways of exploiting resources), may become the avenue for further evolution, an idea supported elsewhere (Burghardt, 2015). But also, once present, resource enhancement occurring for other reasons (e.g., invasive food resource) may facilitate the extent and amount of play whereas contraction of resources (e.g., due to climate change) may reduce it. Future model developments identifying the conditions in which the evolutionary processes underlying play operate are now needed.
The great technological and creative advances in human civilization, for good or ill, coincided with the advent of agriculture, providing for denser populations, as well as more stratified ones, which led to a more specialized class with opportunities and surplus time, energy, and physical and intellectual “tools” to exploit other aspects of nature (Burghardt, 2005) in what seems today to be a positive feedback ratcheting effect. If our resources to maintain billions of people do not continue to expand and our behavior does not reflect such limitations, the model proposed here may be rather prescient, if sadly so, and the costs of non-frivolous “playing” may take a toll, as in our model, though in far fewer generations.
Acknowledgements
We thank K. Rooker for comments and advice. This work was conducted at the National Institute for Mathematical and Biological Synthesis, an Institute sponsored by the National Science Foundation through NSF Award #DBI-1300426, with additional support from The University of Tennessee, Knoxville. We also thank the participants in the working group on Play, Evolution, and Sociality for discussion and comments.
