Formal Modeling as Theoretical Glue between Laboratory and Naturalistic Studies of Memory

Abstract

Memory research has evolved along two distinct traditions: well-controlled laboratory experiments emphasizing precision and tractability and naturalistic-memory experiments emphasizing generalization to real-world contexts. Although both have yielded important insights, we do not yet have a generalized theory of memory consistently interpreted across laboratory and naturalistic paradigms. By analyzing the strengths and limitations of the two traditions, I propose that formal modeling is the key to creating this theoretical link. A formal theory, instantiated in precise computational models that are developed over decades of laboratory-based experiments, needs naturalistic experiments to test its generalizability and reveal its limitations. Naturalistic experiments, in turn, better connect with existing laboratory paradigms when their results are explained by the same theoretical model. To achieve this, I propose a step-by-step procedure in which naturalistic settings are considered as all possible scenarios that could be realized in the real world, with laboratory settings forming a smaller subset that we have understood well. Our goal as memory researchers is to incrementally expand the scope of existing laboratory studies, theories, and models to account for increasingly naturalistic scenarios, ultimately achieving a generalized theory of memory. Together, the proposed framework no longer views laboratory versus naturalistic approaches as a trade-off to navigate, given their different priorities and methodologies, but considers them both essential in working toward the same goal.

Keywords

memory, naturalistic approach, laboratory approach, generalizability, formal modeling

Introduction

Two distinct traditions have shaped the study of memory: One focuses on memory in well-controlled laboratory experiments (Ebbinghaus, 1885) while the other focuses on memory in naturalistic contexts (Neisser, 1982). The former tradition dates back to Ebbinghaus (1885), who pioneered the method to study memory under highly controlled conditions using lists of nonsense syllables. The 20 years following Ebbinghaus’s monograph saw a proliferation of laboratories that began to characterize how people remember under various laboratory conditions, developing fundamental memory paradigms that remain standard in the laboratory today (Kahana, 2012). The rigor of the laboratory approach has generated rich empirical findings on memory and laid a solid foundation for later theoretical developments of memory (Anderson, 1996; Atkinson & Shiffrin, 1968; Tulving & Thomson, 1973).

Despite the success of early laboratory approaches in producing a rich body of theory, it remained unclear whether these theories provide meaningful insight into natural-memory behavior (Neisser, 1982). A new wave of memory research emerged that started to emphasize practical aspects of memory and examine memory not just for lists of simple items but also for facts, stories, and personally experienced events (Gruneberg et al., 1988; Neisser & Winograd, 1988). However, as Neisser (1988) cautioned, these early efforts in naturalistic-memory research were “primarily interested in the phenomena themselves” and “came to it without strong theoretical commitments” (Neisser, 1988, p. 2). Neisser argued that it is not enough to simply denounce the old laboratory methods and call for more ecologically oriented alternatives; we also need to answer the question of whether the findings discovered in naturalistic settings are already adequately explained by theories developed from laboratory experiments or whether they represent entirely new concepts. A conversation about integrating naturalistic and laboratory approaches began at the second Emory Cognition Project Conference in 1985, leaving participants with the optimism that “a new psychology of memory is beginning to take shape – one that will eventually yield theoretically consistent interpretations of both laboratory paradigms and naturalistic phenomena. We do not have that psychology yet, but we are moving” (Neisser, 1988, p. 3).

Forty years after this optimism, have we achieved the new psychology with consistent interpretations across both laboratory paradigms and naturalistic phenomena? There is no doubt that both traditions have continued to flourish and produce important results. Well-controlled laboratory experiments remain the primary approach for refining theoretical models of memory (Kahana, 2012). Yet little effort has gone into testing these theories against more naturalistic and realistic scenarios, so opportunities to identify gaps in these theories are missed. Meanwhile, we can now capture components of naturalistic memory with unprecedented richness: Participants can go about their daily lives wearing a camera (Cabeza et al., 2004; Jeunehomme & D’Argembeau, 2020; St. Jacques et al., 2008), perceive and recall events from movies or stories (Baldassano et al., 2018; Zacks et al., 2001), and navigate virtual-reality spaces before recalling elements of their experiences (Herweg et al., 2020; J. F. Miller et al., 2013). Although these more recent studies of naturalistic memory provide important theoretical insights, they do not explain and predict behavior with the same precision as theories developed in well-controlled laboratory environments. Furthermore, it remains underexplored how existing theories based on controlled experiments should be revised to accommodate these naturalistic empirical findings. The current perspective aims to initiate a dialogue and propose steps toward a theoretical unification of the two traditions, particularly emphasizing the role of formal modeling in this integrative process.

A Perceived Tension Between the Two Traditions

Why is there a divide between laboratory and naturalistic approaches? There is a perceived tension between the two regarding their priorities. The laboratory approach offers experimental control: only a small set of variables is involved, and experimenters can manipulate specific independent variables while holding others fixed. The laboratory approach strongly emphasizes the internal validity of obtained knowledge (Fig. 1a), which refers to “the approximate validity with which we infer that a relationship between two variables is causal” (Cook et al., 1979, p. 37). Yet the very condition that enables internal validity through strict experimental control is often thought to be at odds with external validity (Dhami et al., 2004), which refers to how much the discovered relationship generalizes across “different types of persons, settings, and times” (Cook et al., 1979, p. 37). The artificiality and simplicity of the traditional experimental design required to secure internal validity make it difficult to draw inferences from the experiment to the real world, which is our ultimate interest (Dhami et al., 2004).

Fig. 1.

Laboratory and naturalistic approaches as complementary routes to generalizability. Traditionally viewed as a trade-off (a), laboratory approaches emphasize internal validity (the degree to which causal relationships can be accurately identified) and naturalistic approaches emphasize external validity (the ability to generalize findings across different contexts and situations). Here (b), I argue that both traditions are essential for generalizability. Naturalistic approaches study memory closer to the environments to which we want to generalize our theories (green arrow). Experimental control in laboratory approaches facilitates the discovery of fundamental rules that apply across contexts (yellow arrow) and supports the development of cumulative, theoretically grounded knowledge via formal modeling (red arrow).

Naturalistic studies, although lacking the same level of control and precise measurement that laboratory studies provide, offer greater external validity by studying psychological processes in the environments to which we want to generalize our theory (Fig. 1a; Brunswik, 1947, 1952, 1955b). For example, in studies of autobiographical memory, people’s natural-memory behavior is probed as they remember the events of their lives (McDermott et al., 2009). While the complexity of autobiographical-memory studies hinders our ability to isolate and characterize the full range of variables involved, such studies better capture how memory functions in natural contexts. Relatedly, Yarkoni (2022) discussed “the generalization crisis” in traditional laboratory experiments and explored the consequences of failing to consider stimuli and situation variability during statistical inference. Yarkoni (2022, p. 12) emphasized that “If authors intend for their conclusions to hold independently of variation in uninteresting factors, and to generalize to broad classes of situations, there is no good substitute for studies whose designs make a serious effort to respect and capture the complexity of real-world phenomena.” Therefore, depending on the research question, it seems that one must often navigate a trade-off or find a middle ground between experimental control and generalizability (Hasson & Honey, 2012; Yarkoni, 2022), where strengthening one would mean compromising the other. This tension may explain the emergence of two distinct research traditions in human-memory studies, each prioritizing one approach over the other, given their emphasis on internal versus external validity. The next section of this article focuses on addressing this conceptual tension between experimental control and generalizability, arguing that researchers need not trade off one for the other, but that the two can work synergistically toward the same goal.

A Shared Goal of Generalizability

To move toward integrating laboratory and naturalistic approaches, we could view experimental control and naturalistic settings not as mutually exclusive choices but as complementary for achieving a shared scientific goal. Despite their distinct methodologies, both laboratory and naturalistic approaches share the same ultimate goal: developing a generalizable theoretical understanding of memory. Although the artificiality of traditional laboratory experiments is often associated with a lack of external validity or generalizability, their fundamental goal is also to uncover general rules that apply beyond specific experimental settings. Consider Ebbinghaus’s pioneering memory experiments: His use of random syllables in controlled settings was not meant to understand memory for nonsense syllables per se, but to uncover universal principles of memory and forgetting that would apply across contexts, including real-world situations. If the goal is to understand human memory in its most natural and generalized environments, why did cognitive psychologists start with these lab-based settings to uncover generalized theories? The answer lies in recognizing that while lab-based settings may seem distant from real-world environments, their precision and tractability enable generalizability in other important ways. These additional factors of generalizability, which I will discuss in the following section, suggest that the two approaches are complementary rather than opposing strategies to advance our understanding of memory.

Naturalistic approaches study memory closer to the environments to which we want to generalize our theory

Brunswik (1955b) has famously argued that controlling selected variables in traditional laboratory experiments (referred to as systematic design) destroys the naturally existing texture of the environment to which an organism has adapted, limiting the generalizability of findings. Those highly constrained situations are convenient for the experimenter but atypical for the individual. As an alternative, Brunswik (1947, 1952, 1955b) proposed the representative design, in which stimuli in an experiment are sampled in a way that is representative of the organism’s natural environment in terms of their number, values, distributions, and intercorrelations. For example, one way of achieving representative design is through random sampling. In Brunswik (1944), a participant went about their daily routine over a 4-week period in various outdoor and indoor situations. At random intervals, the participant was asked to give perceptual estimates of whichever object they happened to be looking at in that moment. This process was repeated multiple times to collect samples of objects representative of real-world situations. Compared with well-controlled laboratory experiments, the representative design better reflects the actual environment to which researchers aim to extend their findings.

Limitations

Although naturalistic settings are commonly associated with external validity, relying solely on these settings does not guarantee generalizability. Although the representative design aims to sample the natural complexity of the real-world environment adequately, doing so is a “formidable task in practice” (Brunswik, 1955a, p. 239). As a result, most naturalistic studies instead focus on identifying formal properties of a naturalistic environment and reconstructing them in laboratory settings, an approach known as formal situational sampling (Hammond, 1966). For example, recent naturalistic studies of memory employ movie stimuli, narratives, and social interaction to recreate elements of naturalistic settings that are absent in traditional laboratory experiments (Baldassano et al., 2018; Zacks et al., 2001; Zadbood et al., 2017). While these studies each reveal important aspects of naturalistic environments, they cannot, by themselves, adequately sample and represent the full variability inherent in real-world contexts. Therefore, just as findings from laboratory settings with limited variability in populations, stimuli, or contexts are questioned regarding their broader generalizability, we must also consider the same limitation in any single naturalistic environment because of the large variability from one naturalistic environment to another. Ultimately, a strong theoretical framework is needed to integrate results across diverse naturalistic experiments. A science of naturalistic memory can succeed only if it has “precise techniques for translating observations into a formal language such that the operations of invariant mechanisms can be shown obviously” (Banaji & Crowder, 1989, p. 1188).

Experimental control facilitates the discovery of fundamental rules that apply across contexts

It has been argued that issues of internal validity are chronologically and epistemically antecedent to issues of external validity: asking whether a result applies beyond the experimental circumstances is only meaningful once we have established its validity within those circumstances (Guala, 2003). Experimental control can contribute to a better chance of generalization by discovering a real (causal) relationship between different variables from the original environment. Causality speaks more directly to the fundamental mechanisms researchers seek to uncover and is therefore more likely to remain valid in new contexts; in other words, establishing internal validity is crucial for establishing external validity (Fig. 1b, yellow arrow). In contrast, correlation often reflects spurious patterns that occur only under one specific environment. The latter can be an issue for naturalistic environments, given the large number of variables involved, many of which may not be precisely measured or controlled. This is not to say that internal validity is impossible to establish in naturalistic studies, as there are machine-learning approaches dedicated to learning low-dimensional causally related variables from high-dimensional data (Schölkopf et al., 2021). In fact, the link between causality and generalization is so strong—causal mechanisms tend to persist across different contexts—that methods have been developed using invariance as a property to identify causality (Arjovsky et al., 2019; Heinze-Deml et al., 2018; Peters et al., 2016). For example, if various data sets and environments are characterized by the same relationships, they are likely to be causal relationships (Arjovsky et al., 2019). However, these data-driven approaches require access to large amounts of data collected from multiple environments, which may be beyond the feasibility of current naturalistic studies of memory in psychology.
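The intuition that causal relationships stay invariant across environments can be illustrated with a toy simulation. The sketch below is not an implementation of any of the cited methods; the two simulated environments, the helper names (`slope`, `simulate_environment`), and all parameter values are hypothetical choices made only for illustration. The same mechanism (outcome depends on cause) operates in both environments, while the spread of the cause and the noise on a downstream variable differ.

```python
import numpy as np

def slope(x, y):
    """Ordinary least-squares slope of y regressed on x (with intercept)."""
    x_centered = x - x.mean()
    return float(x_centered @ (y - y.mean()) / (x_centered @ x_centered))

def simulate_environment(rng, n, cause_scale, effect_noise):
    """Hypothetical environment with the chain cause -> outcome -> effect.

    The mechanism outcome = 2 * cause + noise is identical everywhere;
    only the spread of the cause and the noise added to the downstream
    effect change from one environment to the next.
    """
    cause = rng.normal(0.0, cause_scale, n)
    outcome = 2.0 * cause + rng.normal(0.0, 1.0, n)
    effect = outcome + rng.normal(0.0, effect_noise, n)
    return cause, outcome, effect

rng = np.random.default_rng(0)
environments = [
    simulate_environment(rng, 20000, cause_scale=1.0, effect_noise=0.5),
    simulate_environment(rng, 20000, cause_scale=3.0, effect_noise=3.0),
]

# Regressing the outcome on its cause recovers a similar slope everywhere.
causal_slopes = [slope(cause, outcome) for cause, outcome, _ in environments]
# Regressing the outcome on its downstream effect does not.
noncausal_slopes = [slope(effect, outcome) for _, outcome, effect in environments]

print("outcome ~ cause :", np.round(causal_slopes, 2))
print("outcome ~ effect:", np.round(noncausal_slopes, 2))
```

Under these assumptions, the slope of the outcome on its cause is nearly identical in the two environments, whereas the slope of the outcome on its downstream effect shifts with the environment. A stable relationship across environments is therefore a hint, though not proof, that it reflects the underlying mechanism.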

Limitations

Just discovering empirical relationships with internal validity is not enough; a mature science requires an understanding of what explains the discovered relationships. Newell (1973) cautioned that experimental psychology as a science had primarily dealt with phenomena: Upon discovering a new phenomenon, we explored all possible variations of variables affecting this phenomenon, but what was missing was theoretical unification of these results; that is, explanations of how different variables affect given phenomena and how various phenomena relate to each other (Cummins, 2000; Fried, 2020; Newell, 1973; van Rooij, 2019). Without these explanations, simply examining the generalization of individual empirical findings across different situations and contexts does not add up to a generalizable theoretical understanding of memory.

Formal modeling contributes to a generalized theory of memory by unifying findings across experiments

Formal modeling addresses limitations in both traditional laboratory-based and naturalistic empirical studies. As Newell (1973) argued, to unify disparate empirical findings, we need to construct complete and precise models capable of simulating a range of experiments. Without such models, research tends to become discovery-oriented, with hypotheses that are loosely linked with existing theories and that may or may not be supported by empirical data (Oberauer & Lewandowsky, 2019). We cannot have strong confidence in these hypotheses until they have been replicated with large sample sizes and across similar and different contexts. In contrast, theory-testing research generates experimental hypotheses that are strongly motivated by theoretical models; these hypotheses must hold if the theory is correct, and if they are not supported by empirical evidence, researchers must call for modification of the theory (Oberauer & Lewandowsky, 2019). Regarding generalization, discovery-oriented research alone provides little confidence that conclusions will extend to new contexts unless explicitly tested in those situations. However, when combined with theory-testing research that precisely formulates core assumptions into formal cognitive models, empirical effects are allowed to accumulate (Jamieson & Pexman, 2020). Each new piece of evidence either supports the current model or contributes to its revision. When a theory has accumulated sufficient empirical evidence, its deductive power becomes so robust that we can not only generate new hypotheses for further testing but also confidently predict its generalization to new contexts without explicitly testing every situation. Such a theory-testing approach should be our method of choice if the goal is to develop a generalizable theory of memory that applies across laboratory and naturalistic settings. Focusing on validating formal theoretical models rather than isolated phenomena allows us to build a theoretical framework with strong generalizability beyond the specific conditions under which it was initially tested.

Past work in formal modeling has contributed to building a generalizable theory of memory. While early verbal theories introduced important theoretical concepts in memory (Carr, 1931; Ebbinghaus, 1885; Galton, 1883), formal models implemented as computational simulations and mathematical equations have proved useful for formulating precise, testable predictions (Estes, 1955; Howard & Kahana, 2002b; Oberauer & Kliegl, 2006; Raaijmakers & Shiffrin, 1980). To build models of memory, researchers examine a set of existing empirical effects for a memory task and develop a minimal set of model assumptions that can explain these effects, usually by specifying the memory representations and processes underlying the stages of memory encoding, storage, and retrieval. A theoretical model then needs to go through stages of refinement by testing its predictions in new situations. Although memory models are typically developed to account for a range of empirical patterns for a single memory task, recent modeling efforts have increasingly focused on uncovering generalizable rules that apply across different tasks or contexts. For example, while the context maintenance and retrieval (CMR) model was initially developed to understand free recall of lists (Howard & Kahana, 2002a; Polyn et al., 2009b), it has been generalized to account for behavioral patterns in serial recall tasks (Logan & Cox, 2021; Lohnas, 2025), free-association tasks (Richie et al., 2023), and collaborative free-recall tasks (Angne et al., 2024), as well as a broader range of memory behavior related to memory consolidation (Z. Zhou et al., 2024), reward (Rouhani et al., 2020), and decision-making (C. Y. Zhou et al., 2025). Exemplar-based models (Medin & Schaffer, 1978; Nosofsky, 1986), originally developed for categorization tasks, can account for old/new item-recognition tasks and explain relations between classification and recognition (Nosofsky et al., 2011). A resource-limited theory of memory encoding, as implemented in a mathematical model, can account for word-frequency effects across item recognition, associative recognition, cued recall, and free recall (Popov & Reder, 2020). While separate Bayesian models of memory have explained category effects during memory reconstruction (e.g., single category by Huttenlocher et al., 2000, or hierarchical categories by Hemmer & Steyvers, 2009), Xu et al. (2025) unified them into a single framework.
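To make concrete what it means to specify a memory representation and a process formally, the following minimal sketch implements one simplified assumption of the kind used in retrieved-context models: a context vector that drifts gradually as each item is studied. It is not the full CMR model. The function name `drift_context`, the parameter names `beta` and `rho`, and the use of mutually orthogonal one-hot item inputs are simplifying assumptions made here for illustration.

```python
import numpy as np

def drift_context(n_items, beta):
    """Return the context state present after each studied item.

    Assumes every item contributes an input that is orthogonal to the
    starting context and to all other items, so that
    rho = sqrt(1 - beta**2) keeps each context state at unit length.
    """
    rho = np.sqrt(1.0 - beta ** 2)
    dim = n_items + 1
    context = np.zeros(dim)
    context[0] = 1.0  # pre-list context
    states = []
    for i in range(1, n_items + 1):
        item_input = np.zeros(dim)
        item_input[i] = 1.0  # hypothetical one-hot input for item i
        context = rho * context + beta * item_input
        states.append(context.copy())
    return np.array(states)

states = drift_context(n_items=5, beta=0.6)
similarity = states @ states.T  # dot products between study contexts
print(np.round(similarity[0], 3))  # similarity of item 1's context to later ones
```

Under these assumptions, the similarity between the contexts of two studied items falls off as `rho` raised to the number of intervening steps. Once an assumption is written this precisely, its consequences can be derived and tested in a new task rather than restated verbally.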

Limitations

Historically, formal models of memory have primarily relied on well-controlled laboratory experiments, such as list-learning paradigms, to study the encoding and recall of information (Kahana, 2012). The close control over both the selection and timing of stimuli and procedures, along with the precision of their measurements, is critical for making computational modeling tractable. Additionally, the task variables are relatively few and well-specified in well-controlled experiments, allowing for direct comparison of experimental results obtained in different laboratories, a prerequisite for integrating empirical results in a single theoretical model. For instance, the serial position effect in free-recall paradigms (how the recall probability of an item differs as a function of its study position in a list) has been observed across numerous laboratories, enabling researchers to develop and refine explanations in a theoretical model about what affects the shape of the serial position curve (Howard & Kahana, 1999; Ma et al., 2024; Tan & Ward, 2000; Watkins et al., 1989; Zhang et al., 2023). The development of computational models for naturalistic memory is still at an early stage (see examples in Franklin et al., 2020; Lu et al., 2024; Michelmann et al., 2023) because it is challenging to precisely measure the full range of variables in complex naturalistic tasks and to navigate the large number of possible model alternatives. Although simplified laboratory experiments have provided convenience and tractability in building theoretical models of memory, without extending the models to a wider range of scenarios identified in naturalistic studies of memory, we miss opportunities to identify further gaps in the theory.
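The serial position curve mentioned above is simple to compute, which is part of why it can be compared directly across laboratories. The following is a minimal sketch, assuming recall data are coded as the 1-based study positions of the recalled items on each trial. The function name `serial_position_curve` and the data are made up for illustration and do not come from any cited study.

```python
import numpy as np

def serial_position_curve(recalls, list_length):
    """Probability of recall at each study position.

    recalls: one sequence per trial holding the 1-based study positions
    of the items recalled on that trial (output order is ignored).
    """
    recalled = np.zeros((len(recalls), list_length), dtype=bool)
    for trial, positions in enumerate(recalls):
        for pos in set(positions):  # count repeated recalls only once
            if 1 <= pos <= list_length:  # skip codes outside the list
                recalled[trial, pos - 1] = True
    return recalled.mean(axis=0)

# Made-up data: four trials of a six-item list.
recalls = [
    [6, 5, 1, 2],
    [5, 6, 1],
    [6, 1, 3, 6],
    [4, 6, 5],
]
curve = serial_position_curve(recalls, list_length=6)
print(curve)
```

In this made-up example, recall is higher for the first and the last few positions than for the middle of the list, the qualitative shape that models of free recall aim to explain.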

To summarize, both naturalistic and laboratory approaches aim to build a theory of memory that generalizes across different contexts and environments. Naturalistic approaches study memory directly in the environments to which we want to generalize our theories (Fig. 1b, green arrow), though reliably extracting rules, modeling, and integrating results from these complex settings remains challenging. Although the laboratory approach is traditionally associated with internal validity, it contributes to external validity by uncovering causal relationships that are likely to hold across contexts (Fig. 1b, yellow arrow). It also supports a strong tradition of formal modeling to integrate empirical findings (Fig. 1b, red arrow). However, without directly testing these theoretical models across more naturalistic scenarios, we cannot determine with certainty whether they can successfully account for empirical results in both controlled and naturalistic contexts. The rest of the article outlines the framework and concrete steps for building a unified theory by combining the strengths from both traditions, using formal modeling as the bridge.

An Integrative View Necessary for Theoretical Unification

To facilitate a path toward theoretical integration through formal modeling, it is useful to reconceptualize the relationship between naturalistic and laboratory-based approaches. Existing research may emphasize theoretical developments of either laboratory or naturalistic studies of memory. At one extreme is the optimistic view (Fig. 2a), according to which laboratory settings are considered an ideal abstraction of naturalistic settings. As Tulving (1983) put it, “Words to the memory researcher are what fruit flies are to the geneticist: a convenient medium through which the phenomena and processes of interest can be explored and elucidated . . . Words are of no more intrinsic interest to the student of memory than Drosophila are to a scientist probing the mechanisms of heredity . . .” (p. 146). Although recalling random word lists may seem removed from naturalistic memory scenarios, words have served as ideal abstractions of meaningful information units in memory. According to this perspective, theories or models developed under laboratory settings should theoretically apply well to other contexts or situations, directly assuming external validity from internal validity alone. However, the limitation of the optimistic view is that we disregard the unique contributions of naturalistic settings, potentially missing opportunities to identify further gaps in the theory. At the opposite extreme is the pessimistic view (Fig. 2b), according to which laboratory settings are considered unrepresentative of what takes place in real life: “Conclusions drawn from controlled experimental designs with a limited number of variables may not be valid in real-life behavior” (Shamay-Tsoory & Mendelsohn, 2019, p. 844). Thus, one should focus efforts on investigating memory within naturalistic settings. The consequence of the pessimistic view is that we disregard the role of laboratory settings in contributing to external validity and miss opportunities to connect new findings with existing theories. In addition to the optimistic and pessimistic views, an intermediate view (Fig. 2c) offers a middle ground, acknowledging that laboratory settings have aspects that share features with naturalistic settings, as well as aspects that are artificial and stand orthogonal to naturalistic settings.

Fig. 2.

Toward a theoretical unification of laboratory and naturalistic approaches. The optimistic view (a) focuses efforts on investigating memory within laboratory settings, seeing them as an ideal abstraction of naturalistic environments. The pessimistic view (b) focuses efforts on investigating memory within naturalistic settings and sees laboratory settings as unrepresentative of what takes place in real life. The intermediate view (c) acknowledges that laboratory settings have aspects that are in common with naturalistic settings as well as aspects that are artificial and stand orthogonal to naturalistic settings. I propose the integrative view (d), which considers naturalistic settings as containing all possible scenarios that could be realized in the real world, with laboratory settings forming a smaller subset that we have understood well. Our goal as memory researchers is to gradually expand the scope of laboratory studies, theories, and models to account for the full range of naturalistic-memory behavior, ultimately achieving a generalized theory of memory. Under the integrative view, we can adopt established formal modeling refinement approaches (in iteratively testing predictions in new situations), using components identified in naturalistic settings to incrementally guide the direction of this refinement process (e).

While the intermediate view helps reconcile the extremes of the optimistic and pessimistic views, I propose an integrative view that can more directly facilitate a research program that unifies laboratory and naturalistic studies of memory (Fig. 2d). Under an integrative view, naturalistic settings refer to all possible scenarios that could be realized in the real world, whereas laboratory settings represent a smaller subset that we have studied and understood well, presumably because of their relative simplicity. Despite their artificial appearance, laboratory conditions remain valid slices of our reality. Our goal as memory researchers, regardless of whether we are from the laboratory or naturalistic traditions, is to gradually expand the scope of laboratory studies and theories to eventually account for the full range of naturalistic memory behavior, thus achieving a generalized theoretical understanding of memory. This view ensures that researchers from the laboratory tradition can see naturalistic-memory studies as opportunities to test the generalizability of their theories and that researchers from the naturalistic tradition attempt to tie their findings closely to existing theories. The integrative view has several important features that I will highlight and discuss below.

It is challenging to draw a rigid line between naturalistic and artificial

What, precisely, is a “naturalistic setting”? The integrative view considers naturalistic settings to be all possible scenarios that could be realized in the real world, including aspects of the laboratory settings that may appear artificial. Unlike the pessimistic and intermediate views, the integrative view avoids rigidly defining what is and is not naturalistic, a line that is difficult to draw (Winograd, 1988). Some definitions are too abstract to provide concrete criteria for judgment: For example, artificial situations are “those that are specifically designed for research” and naturalistic situations are “the target situations to be understood by research” (Hoc, 2001, p. 282). Other approaches have considered a framework in which stimuli, tasks, and behavior can be evaluated on a continuum of simplicity and complexity, where laboratory experiments have a “reductionistic” tendency to simplify the complexity of the real world (Kingstone et al., 2008; Shamay-Tsoory & Mendelsohn, 2019; Sonkusare et al., 2019). However, complexity is also subjective and context-dependent. As Holleman et al. (2020) noted, complexity has often been expressed in strict mathematical terms in physical sciences (Gell-Mann, 1995), but psychologists have used the term loosely, either by describing something’s size, dimension, or variety or by referring to things that are not yet understood well, as in “the brain is too complex for us to understand” (Edmonds, 1995, p. 4). Furthermore, the definition of what is naturalistic versus artificial can be ever-changing, depending on our knowledge of the world. While virtual-reality paradigms are now accepted as realistically simulating naturalistic and real-world experiences, a few decades ago, before the technology became familiar, the first lab participants using virtual-reality headsets would not have found the experience to be immersive or to resemble their own everyday experiences.

What is considered artificial also reveals important aspects of naturalistic behavior

To gain a full understanding of human memory, we must not dismiss certain experimental paradigms simply because they appear artificial. A complete theory of memory should be able to account for human behavior across all environments, including artificial ones involving tasks like passively viewing information, memorizing lists of random words, or performing simple key presses. In fact, by intentionally removing naturalistic elements from an experiment, researchers can often better characterize the fundamental cognitive constraints that are important in real-world behavior. For example, Hick’s law, a principle with wide real-world applications in interface design (Proctor & Schneider, 2018), was first derived under extremely artificial conditions in which 1 participant made over 8,000 button presses (Hick, 1952). Similarly, our understanding of working memory capacity has been built from studies using discrete items like digits, letters, and words (Cowan, 2001; G. A. Miller, 1956).

A dynamic boundary encourages collaboration between the two research traditions

Most importantly, setting a fixed boundary between what is naturalistic and what is artificial often promotes separation rather than integration between research traditions. For example, many contemporary articles on naturalistic memory start by highlighting elements missing from traditional laboratory-based experiments. These elements should be seen as opportunities to extend theory previously developed in laboratory-based experiments (under the integrative view), rather than as justification to move away from these laboratory-based paradigms (under an intermediate view or a pessimistic view). The integrative perspective encourages collaboration between the two research traditions by recognizing that the boundary between laboratory and naturalistic settings is not fixed but dynamic. What is considered new, naturalistic, and understudied by laboratory studies today could become a staple in laboratory settings tomorrow. For this transition to happen, collaborative efforts from both research traditions are essential: identifying important components from naturalistic settings and systematically expanding existing memory theories (particularly those formulated in formal models) to incorporate these new phenomena. For example, while early memory work focused on the memory of random materials, Bartlett (1932) introduced more naturalistic approaches for studying memory by examining the role of prior knowledge, or schema, in story recall. Though revolutionary at that time in introducing aspects of naturalistic memory overlooked by traditional laboratory methods, the concept of schema has been effectively assimilated into traditional laboratory approaches since then, with a large body of research these days investigating the role of schematic knowledge on episodic memory both empirically (Popov et al., 2019; Tompary & Thompson-Schill, 2021; Tse et al., 2007) and computationally (Hemmer & Steyvers, 2009; Huttenlocher et al., 2000; Zhang, 2022). 
Thus, the integrative view reduces the artificiality of laboratory experiments over time, not by removing nonnaturalistic elements, but by progressively expanding its coverage to incorporate more elements important in real-world settings.

Concrete Steps Toward Theoretical Integration Through Formal Modeling

Under the integrative view, I propose a step-by-step procedure to achieve theoretically consistent interpretations across both naturalistic and laboratory memory studies through formal modeling (illustrated in Fig. 2e). This procedure adopts the established approach to refining formal models (i.e., iteratively testing their predictions in new situations), using components identified in naturalistic settings to guide the direction of this refinement process.

First, we begin by identifying key components within naturalistic settings that current memory theories or models may not adequately address. Second, we incrementally increase the complexity of controlled laboratory experiments, maintaining experimental control while including these newly identified components of naturalistic settings, making it tractable for formal modeling to be applied. Finally, we extend existing theoretical models previously developed under well-controlled laboratory experiments, with minimal adjustments, to account for the data emerging from these more complex experiments. Successful generalization of the existing theory in this extension provides strong support for the theory, whereas any failure to generalize pinpoints specific areas where theoretical revision is needed. This step-by-step procedure provides a tight link between existing theory and newly added naturalistic components, making it clear what model mechanisms generalize and what additional mechanisms are needed. When this procedure is applied iteratively, one can gradually expand the scope of the existing laboratory studies, theories, and models to account for increasingly naturalistic scenarios (see the integrative view in Fig. 2d), ultimately achieving a generalized theory of memory. Several key characteristics of the proposed framework will be further discussed below.

Identifying components from naturalistic studies that may challenge existing theory

Researchers from laboratory-based traditions typically test their computational models in contexts similar to those that initially informed their theories. It would be a more productive practice, however, to intentionally seek scenarios that might challenge or “break” existing theories. This is where naturalistic-memory research proves valuable; researchers using naturalistic approaches actively identify aspects of memory that are underexplored in traditional laboratory paradigms, which have the potential to reveal limitations of existing memory theories. Many aspects of naturalistic memory remain understudied in controlled laboratory settings, which poses challenges for theoretical unification between laboratory and naturalistic approaches. First, everyday memory experiences often involve continuous interactions with environmental cues and other individuals. Yet the majority of controlled laboratory experiments have participants complete memory tasks in isolation. Recent cognitive research has begun to identify how an individual’s memory is altered by external aids (Cornell et al., 2024; Martin et al., 2022; Niforatos et al., 2017; Sparrow et al., 2011) and collaborative settings (Rajaram & Pereira-Pasarin, 2010; Weldon et al., 2000). For example, using a smartphone to replay rich cues from daily life can enhance recall of past events (Martin et al., 2022). When people can look up information online, they better remember where to access the information instead of the information itself (the “Google effect”; Sparrow et al., 2011). When individuals collaborate during recall, it can lead to forgetting and increased memory errors (Rajaram & Pereira-Pasarin, 2010). Second, controlled laboratory experiments primarily rely on simplified stimuli, such as word lists, whereas real-world information is rich and continuous, as seen in movies and narratives (Lee et al., 2020). 
Studies using these naturalistic stimuli, such as having participants recall an episode of BBC’s “Sherlock”, reveal how people segment continuous experiences into discrete events and how these events exhibit a nested hierarchical structure in the brain (Baldassano et al., 2017; Zacks et al., 2001). It remains a challenge to build a theoretical model that can capture memory recall across both simplified discrete stimuli and complex continuous narratives. Third, traditional laboratory memory experiments explicitly probe participants’ memories, providing clear instructions on when to encode and retrieve episodic memories. In real-world contexts, individuals have agency over what and when they learn (Shamay-Tsoory & Mendelsohn, 2019) and can choose whether to retrieve something based on its necessity (Lu et al., 2022). Recent research has examined scenarios in which participants took self-guided museum tours while wearing a camera, learning information at their own pace and by their own choices (St. Jacques & Schacter, 2013). Fourth, real-world memory can operate on much longer time scales than those typically studied in laboratory settings. For instance, over the course of 50 years, Bahrick (1984) traced the long-term forgetting function of Spanish words people studied at various times of their lives and found that the rate of forgetting dropped to zero about 5 years after learning. Finally, we should investigate not just how people recall the past, but also what they use the past for (Neisser, 1982). A growing number of studies show that people draw on their past experiences to guide decision-making (Hornsby & Love, 2022; Zhao et al., 2022; C. Y. Zhou et al., 2025), construct future plans (Mattar & Daw, 2018; Ólafsdóttir et al., 2018), and summarize information (Angne et al., 2025).

An incremental approach to generalizing existing theories to increasingly naturalistic settings

The components identified from naturalistic memory settings present both a challenge and an opportunity to validate and revise existing theories developed from traditional laboratory approaches. Formal models of memory have proven useful in providing theoretical unification of various empirical effects observed in traditional laboratory experiments. A critical step toward bridging laboratory and naturalistic studies of memory is to extend these existing formal models to capture the additional components identified through naturalistic approaches. One may wonder why we do not start directly with a model of naturalistic settings. While there may be instances in which building entirely new models becomes necessary to capture complex, naturalistic memory tasks, this should occur only after thoroughly exploring and exhausting possibilities with existing models to ensure that the phenomena are not already adequately explained by current theories. We should prioritize an approach that incrementally extends our current theoretical frameworks, not only because it is simpler than building entirely new models or concepts but also because these theories, which have successfully unified results from various laboratory experiments, are expected to generalize to, and withstand testing in, more complex environments. Any failure to generalize in these richer environments would not only highlight the significance of newly identified naturalistic components but also provide crucial constraints for refining the existing theory of memory.

I will highlight several examples from the recent memory literature to demonstrate the proposed step-by-step approach under the integrative view. These studies bring aspects of naturalistic memory incrementally into laboratory settings and subject them to formal modeling to provide unification with existing theories (as illustrated in Figs. 2d and 2e).

Real-world environmental statistics

Information in real-world environments, such as news articles, email, and tweets, tends to reappear with statistical regularity, often following a power-law function in which recent items are more likely to reappear (Anderson & Schooler, 1991; Anderson et al., 2023). Traditional memory experiments, however, typically deviate from these natural statistics, presenting stimuli randomly or at equal spacing. An influential theory, the rational analysis of memory, proposes that the statistical patterns of the natural environment have shaped human memory to produce the classic forgetting curve observed in these controlled lab experiments (Anderson, 1990). It remains unknown whether this rational principle holds when memory experiments themselves use stimuli that follow naturalistic statistical patterns. A study by Anderson et al. (2026) directly tested this by creating a more naturalistic continuous-recognition experiment. The presentation order of word stimuli in the recognition experiment was matched to the order of tweets from a real-world Twitter data set, thus embedding natural environmental statistics into a controlled laboratory task. This naturalistic condition was compared with two other conditions using typical laboratory-based environmental statistics: randomly sampled stimuli and equally spaced stimuli. Crucially, Anderson et al. (2026) showed that a single computational model, using only the history of each stimulus’s appearance, could accurately predict memory performance across all three conditions. These findings support the theoretical claim that human memory rationally adapts to the statistical structure of its environment, regardless of whether that structure is naturalistic or artificially controlled.
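The idea that memory performance can be predicted from the history of each stimulus's appearance can be made concrete with a small sketch. The code below is not the model reported by Anderson et al. (2026), whose details are not given here. It assumes an ACT-R-style base-level activation in which each past presentation leaves a trace that decays as a power function of its age. The function name, the decay value of 0.5, and the time units are illustrative assumptions.

```python
import math


def base_level_activation(presentation_times, now, decay=0.5):
    """Log of summed power-law traces, one trace per past presentation.

    Assumed form (ACT-R-style, not taken from the article):
        B = ln( sum_j (now - t_j) ** -decay )
    """
    lags = [now - t for t in presentation_times if t < now]
    if not lags:
        # An item that has never been presented has no trace at all.
        return float("-inf")
    return math.log(sum(lag ** -decay for lag in lags))


# Hypothetical presentation histories (arbitrary time units).
recent = base_level_activation([9.0], now=10.0)         # seen 1 unit ago
remote = base_level_activation([1.0], now=10.0)         # seen 9 units ago
repeated = base_level_activation([1.0, 9.0], now=10.0)  # seen twice
```

Under these assumptions, a recently presented item is more active than a remotely presented one, and repetition adds activation. The function takes only a presentation history as input, so it can be applied unchanged to random, equally spaced, or naturally distributed schedules.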

Rich and complex stimuli

Formal models of memory have primarily been developed using highly simplified perceptual stimuli, such as words or abstract shapes. Although these models are useful in unifying a range of empirical results and revealing mechanisms underlying encoding and recall, it remains a question whether they generalize to rich and high-dimensional real-world material. To address this, Meagher and Nosofsky (2023) applied a model of recognition and categorization, which has been successfully used to predict old/new recognition for simplified perceptual stimuli, to an experiment in which the stimuli consisted of a set of high-dimensional rock images. Their key hypothesis was that the fundamental cognitive mechanisms for recognition judgments are the same for both simplified and naturalistic stimuli: An item is judged as old or new based on its summed similarity to all individual items stored in memory. The only difference lies in the complexity of the stimulus representations on which these similarities are calculated. Confirming this, Meagher and Nosofsky (2023) embedded rock images in a high-dimensional feature space using multidimensional scaling and demonstrated that a hybrid-similarity exemplar model accounted well for recognition behavior in their experiment, similar to the way the same model explained behavior for simplified perceptual stimuli, like color patches (Nosofsky & Zaki, 2003). Compared with machine-learning approaches that can directly map complex images to recognition behavior (Bylinskii et al., 2022), extending an existing cognitive model of recognition incrementally to incorporate new stimuli contributes to a unified theory that supports recognition performance for both simplified and real-world, high-dimensional stimuli.
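The summed-similarity decision rule described above can be sketched in a few lines. This is a minimal illustration, not the hybrid-similarity exemplar model fitted by Meagher and Nosofsky (2023): it assumes an exponential-decay similarity function over Euclidean distance, and the sensitivity parameter c, the criterion, and the toy two-dimensional coordinates are all hypothetical. In practice the coordinates would come from a multidimensional-scaling solution for the stimuli.

```python
import math


def similarity(x, y, c=1.0):
    """Similarity decays exponentially with distance in the feature space."""
    return math.exp(-c * math.dist(x, y))


def familiarity(probe, exemplars, c=1.0):
    """Summed similarity of the probe to every stored exemplar."""
    return sum(similarity(probe, e, c) for e in exemplars)


def judge_old(probe, exemplars, criterion, c=1.0):
    """Respond 'old' when summed similarity exceeds the criterion."""
    return familiarity(probe, exemplars, c) > criterion


# Hypothetical study list embedded in a two-dimensional space.
studied = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
target = (0.0, 0.0)  # identical to a studied item
foil = (4.0, 4.0)    # far from every studied item
```

With a criterion of 1.0, `judge_old(target, studied, 1.0)` returns `True` and `judge_old(foil, studied, 1.0)` returns `False`. Moving from color patches to rock images changes only the dimensionality of the coordinates, not the decision rule.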

Interaction with environmental cues

In daily life, our memory constantly interacts with information in the environment, such as notes or Google Calendar. Yet our current understanding of human memory is predominantly based on controlled laboratory experiments in which participants engage in memory tasks without any external reference or help. To better reflect these naturalistic scenarios in laboratory-based settings, Cornell et al. (2024) conducted a modified free-recall experiment in which participants tried to recall as many items as possible from a studied list. In a typical free-recall experiment, the recall period ends after a fixed amount of time; when participants in this study could not remember any more items, however, they pressed a button to receive an external cue as a reminder. To account for participants’ memory behavior after receiving the reminder, Cornell et al. (2024) extended a computational model of memory search that had previously explained free-recall behavior without interaction with external cues (Howard & Kahana, 2002a; Polyn et al., 2009b). The key additional model assumption that links the memory-search process with or without reminders is whether the next recall is driven by the context of the reminder or by the context of the preceding recall, with all other mechanisms and parameters of the model shared between the two scenarios. Using parameters fitted from a standard free-recall experiment without reminders, the model accurately predicted memory behavior in the reminder condition for a new group of participants. The model could also distinguish in real time, based on preregistered model parameters, which reminders are the most effective to deliver in order to improve memory. These findings provide a unified understanding of the cognitive mechanisms that drive memory search with or without interaction with external cues. 
Extending an existing model to capture the increasingly naturalistic scenario with reminders also increases the real-world applicability of the theory.
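The single added assumption, that the cue for the next recall comes either from the preceding recall or from the reminder, can be illustrated with a toy sketch. This is not the model of Cornell et al. (2024): the item names, the two-dimensional context vectors, and the ratio rule that converts cue-to-item overlap into recall probabilities are hypothetical simplifications.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def recall_distribution(cue_context, item_contexts, unavailable):
    """Probability of recalling each available item given a cue context."""
    strengths = {
        name: max(dot(cue_context, ctx), 0.0)
        for name, ctx in item_contexts.items()
        if name not in unavailable
    }
    total = sum(strengths.values())
    if total == 0.0:
        # No overlap with the cue: fall back to a uniform guess.
        return {name: 1.0 / len(strengths) for name in strengths}
    return {name: s / total for name, s in strengths.items()}


def next_recall(item_contexts, recalled, last_recall, reminder=None):
    """Shared retrieval rule; only the source of the cue differs."""
    cue_item = reminder if reminder is not None else last_recall
    unavailable = set(recalled) | {cue_item}
    return recall_distribution(item_contexts[cue_item], item_contexts, unavailable)


# Hypothetical studied items with toy context vectors.
item_contexts = {
    "apple": (1.0, 0.0),
    "banana": (0.9, 0.1),
    "hammer": (0.0, 1.0),
    "wrench": (0.1, 0.9),
}
recalled = {"apple"}
without_reminder = next_recall(item_contexts, recalled, last_recall="apple")
with_reminder = next_recall(item_contexts, recalled, last_recall="apple", reminder="hammer")
```

In this toy example, cueing with the preceding recall ("apple") favors "banana", whereas cueing with the reminder ("hammer") favors "wrench". The retrieval rule itself is identical in both cases, which mirrors the claim that all other mechanisms are shared.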

Sequential decision-making

Research on how people search their memories has largely been examined under controlled laboratory conditions, where memories are directly queried by experimenters. In real-world contexts, however, memory does not function in isolation but supports goal-directed behavior in other cognitive tasks. To better understand how memory guides sequential decision-making, Hornsby and Love (2022) analyzed a large data set of online grocery purchases and examined how people decide what item to buy next based on options generated from their long-term memory. This consumer-choice data set offers increased naturalism by examining memory’s role in real-world decision-making while maintaining tractability for computational modeling, as the human responses are lists of discrete grocery items, similar to responses in typical list-learning paradigms. Past modeling work has characterized how people search for information in their episodic memory (e.g., recalling items from a previously studied grocery list; Raaijmakers & Shiffrin, 1980) versus their semantic memory (e.g., recalling all grocery items that fall into the “vegetable” category; Abbott et al., 2015). Building on this, Hornsby and Love (2022) developed a computational model that predicts these sequential choices by proposing a two-stage process. First, people search their memories for available options using a combination of cues from episodic and semantic memory. Second, they examine the relevance of each retrieved option against their internal goals. Extending previous models of memory search to capture sequential choices provides a common framework for understanding memory search in the lab and for seeing how these basic mechanisms shape decision-making in real-world, goal-directed tasks.

Formal modeling provides a strong link between naturalistic and laboratory studies

Both laboratory and naturalistic approaches aim to develop a generalized theory of memory. Laboratory approaches emphasize internal validity, often assuming their results generalize across more complex scenarios without explicitly testing this assumption. Naturalistic approaches address this gap by examining memory processes in naturalistic settings, often drawing conclusions about whether what we know from traditional laboratory paradigms generalizes to naturalistic paradigms (Griffiths et al., 2016), or whether the two kinds of paradigms reflect fundamentally different processes (Roediger & McDermott, 2013). Given the complexity of studying behavior in naturalistic environments, conclusions like these are challenging to reach without explicitly formulating them into precise computational models. While similar results across settings reasonably suggest shared underlying mechanisms (Griffiths et al., 2016), different behavioral outcomes do not necessarily imply fundamentally distinct cognitive processes. For example, Roediger and McDermott (2013) suggested that memory for laboratory events may be fundamentally different from memory for events of one’s life, and that the two may even be considered “two types of memory.” Their arguments rest on two observations. First, behavioral evidence demonstrates a dissociation: Individuals with highly superior autobiographical memory (HSAM) are superior in autobiographical memory, but average in remembering laboratory events (Patihis et al., 2013); mnemonists or memory athletes demonstrate excellent performance in laboratory-like memory tasks, such as encoding and retrieving a long list of items, but do not show the autobiographical-memory abilities of individuals with HSAM (Maguire et al., 2003). 
Second, neuroimaging evidence from meta-analyses reveals that different brain networks are involved during a laboratory memory task than when people are asked to remember their life events (McDermott et al., 2009). While compelling, this reasoning potentially conflates behavioral and neural differences with differences in underlying cognitive mechanisms. Without knowing what underlying cognitive mechanisms correspond to these behavioral or neural differences, we cannot conclusively determine whether laboratory and naturalistic paradigms engage truly distinct memory processes or simply reflect different manifestations of shared underlying systems.

Extending an existing model of memory from laboratory settings to incrementally more naturalistic settings (as illustrated in Figs. 2d and 2e) can aid in this reasoning, because we can clearly distinguish the mechanisms that are carried over and generalized from the lab-based settings and the mechanisms that are extensions to account for the new results in naturalistic settings, after which we can evaluate whether the necessary adjustments to the model reflect fundamental differences between the laboratory and the naturalistic paradigms. An example of this approach comes from recent theoretical modeling work on collaborative memory (Angne et al., 2024). In real-world social contexts, people frequently recall information in groups rather than in isolation, which is the typical setup in traditional laboratory studies. We might want to know whether our memory-search process functions fundamentally differently during social interactions compared with recalling information alone. Empirical evidence suggesting potential differences includes the counterintuitive collaborative-inhibition effect, in which groups recall less information collectively than the same number of individuals recall separately (Kelley et al., 2012; Rajaram & Pereira-Pasarin, 2010; Weldon et al., 2000). Although several verbal theories have been proposed to explain collaborative inhibition (Basden et al., 1997; Hyman et al., 2013), these explanations have not been formally connected to existing theories of memory developed under laboratory conditions where there is no collaboration. This theoretical gap leaves open the question of whether collaborative and individual recall engage fundamentally different cognitive mechanisms or reflect variations of the same underlying processes.

To address this question, Angne et al. (2024) connect both literatures under the same theoretical framework by extending a model of individual recall, the CMR model, to capture collaborative-recall processes. The CMR model has successfully explained various individual-recall patterns (Howard & Kahana, 2002a; Polyn et al., 2009a) by theorizing how items become associated with different states in a context space and are subsequently retrieved from this space. Critically, with minimal model adjustments, it was shown that the same fundamental processes in the model govern how people search their memory individually and collaboratively. In both cases, each new recall is influenced by the context of the previous recalls (e.g., after recalling “apple,” one is more likely to recall contextually similar items like “banana”). The key difference is that in individual recall, one’s retrieval process is driven solely by the context of their own previous recalls, whereas in collaborative recall, retrieval is additionally influenced by the context of others’ recalls through the same context-updating process. By applying model parameters that were fitted to the individual-recall condition, the extended model successfully predicted collaborative-recall behavior (Angne et al., 2024): As recall unfolds, minds (contexts) within a collaborative group become more aligned or synchronized with each other, and thus individuals miss opportunities to recall unique information that others may not have considered, giving rise to the collaborative-inhibition effect. This modeling approach not only provides an intuitive explanation of the empirical results observed in collaborative recall but also unifies these findings under the same theoretical framework that explains individual-recall behavior. 
This unification supports the important role of context as a shared mechanism across individuals and group settings and offers precision in linking cognitive processes underlying laboratory versus naturalistic paradigms.
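The alignment account can be illustrated with a toy simulation. The sketch below is not the extended CMR model of Angne et al. (2024). It assumes a simple drift rule in which every group member's context vector moves toward the features of each recalled item, regardless of who recalled it, and is then renormalized. The drift rate beta, the three-dimensional vectors, and the function names are hypothetical.

```python
import math


def normalize(v):
    length = math.sqrt(sum(x * x for x in v))
    return tuple(x / length for x in v)


def update_context(context, item_features, beta=0.5):
    """Drift the context toward the recalled item, then renormalize."""
    mixed = tuple((1 - beta) * c + beta * f for c, f in zip(context, item_features))
    return normalize(mixed)


def cosine(u, v):
    # Both vectors are unit length, so the dot product equals the cosine.
    return sum(a * b for a, b in zip(u, v))


# Two group members start in unrelated (orthogonal) contexts.
member_a = (1.0, 0.0, 0.0)
member_b = (0.0, 1.0, 0.0)
spoken_recalls = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]  # hypothetical item features

alignment = [cosine(member_a, member_b)]
for item in spoken_recalls:
    # Everyone hears the recall, so every context is updated by the same item.
    member_a = update_context(member_a, item)
    member_b = update_context(member_b, item)
    alignment.append(cosine(member_a, member_b))
```

In this toy run, the cosine between the two contexts rises with each shared recall. As contexts converge, members cue overlapping sets of items, which is the intuition behind collaborative inhibition.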

Connection to Related Approaches

Verbal theories versus formal modeling

Advocating for formal modeling in this article is not intended to dismiss or replace the development of verbal theories. In fact, verbal theories often precede and guide the formulation of formal models. The primary advantage of modeling lies in its ability to add precision to an initially verbally formulated theory and better connect it with hypotheses and data. The idea that formal models can serve such a bridging function is not new. Computational modeling forces researchers to explicitly document their assumptions and remove ambiguity, a process that helps “safely remove a theory from the brain of its author” (Guest & Martin, 2021, p. 2). Computational modeling also makes clear the kinds of experimental data that would validate or invalidate a given theory, however intuitively compelling it is (Hintzman, 1991). Despite the many advantages of modeling, it is also worth acknowledging that not all areas of memory research have carried a long tradition of formal modeling, and it is possible to formulate verbal theories that serve similar purposes. Researchers can always strive to articulate their theories more precisely. A formal theory does not have to be expressed in fully computational and mathematical terms; it can exist in various levels of abstraction (Oberauer & Lewandowsky, 2019). For example, adding precision to a theory can mean implementing the theory as a computer program that simulates detailed learning behavior (Pavlik & Anderson, 2008); it can also mean creating a diagram between several variables to make their causal relationships more explicit (Glymour, 2003).

Opportunity to integrate with neuroscience approaches

In recent years, neuroscience has played an increasingly important role in studying naturalistic memory. More naturalistic paradigms can push the brain through a wider range of states, allowing researchers to identify brain function and organization in ways that were not possible before (Lee et al., 2020). For example, neuroimaging studies of naturalistic memory have uncovered how the brain segments continuous experiences into events (Baldassano et al., 2017; Ben-Yakov & Henson, 2018), represents narrative information (Lerner et al., 2011; Nguyen et al., 2019), and supports communication from one person to another (Nozawa et al., 2016; Zadbood et al., 2017). While this growing body of research has reflected our excitement about how naturalistic-memory paradigms can yield new insights into the brain (Jääskeläinen et al., 2021), less emphasis has been placed on how these neural findings facilitate an integrated understanding of memory in both laboratory-based and naturalistic settings. A fruitful future direction would be for neuroimaging studies to simultaneously characterize both shared and distinct brain mechanisms across laboratory-based and naturalistic paradigms and, when brain activations differ, to systematically interpret whether these differences reflect methodological choices such as test format, richness of sensory inputs, or fundamentally distinct memory processes. Furthermore, neuroscience and formal modeling are not competing approaches; they can be integrated to serve the same goal. In the emerging field of model-based cognitive neuroscience (Turner et al., 2017), neural data can help validate a model’s mechanisms in ways that behavioral data alone cannot, while a model can guide the interpretation of neural differences by pinpointing underlying cognitive processes. Although the examples of formal modeling in this article are primarily based on behavioral data, one could use neuroimaging and formal modeling concurrently. 
For instance, in the example of collaborative recall (Angne et al., 2024), the model predicts that mental contexts within a group become more aligned. Integrating this model with neuroimaging in the future might reveal that the widely observed brain synchronization during social communication (Nozawa et al., 2016; Zadbood et al., 2017) could correspond to these synchronized mental contexts.

Alternative approaches in building generalized models

This article shares a similar goal with the concurrent work by Carvalho and Lampinen (2025): to build a generalized theory of cognition by modeling both simple lab experiments and complex naturalistic behaviors. However, we take fundamentally different, and complementary, paths to get there. Inspired by approaches and practices in machine learning, Carvalho and Lampinen (2025) proposed building complex task-performing neural-network models that work simultaneously in many tasks, trained over as wide a variety of naturalistic settings as possible. This is in contrast to the present work, which starts with existing theories developed over simple, well-controlled laboratory experiments and carefully extends them, adding one naturalistic element at a time. Although the incremental approach may appear less ambitious, it provides a tight interpretable link to existing theory at every step of increased naturalism, making it clear what model mechanisms generalize and what additional adjustments are needed. The complex models built by Carvalho and Lampinen (2025) also capture both simple and naturalistic behaviors, but the kind of theories that could be derived and reduced from such complex models would look very different from ones that are built in a bottom-up, incremental manner. Ultimately, there is no single right answer for what makes a good theory. The solution cognitive science seeks could be similar to those in machine learning, driven by explanations reduced from complex, task-performing models that predict a range of behavior. Alternatively, it could resemble solutions in physics, where a small set of core principles, developed in well-controlled laboratory settings, are incrementally refined and tested over more naturalistic, real-world settings. This article argues for the latter path, believing that our goal is to “understand how many apparently diverse empirical phenomena can arise from a small set of basic principles” (Hintzman, 1991, p. 52).

Conclusion

Our goal as memory researchers, regardless of tradition, is to develop a theory of memory that generalizes across different contexts and situations. Although naturalistic environments are rich in the representativeness of stimuli, experimental variables, and situations, an exhaustive test of a phenomenon across more and more complex and diverse settings does not automatically lead to a generalized theory. Similarly, although well-controlled laboratory experiments allow us to establish the causality of empirical relationships that are likely to be fundamental, these relationships are not commensurate unless we commit to developing and testing formal theories that can accumulate knowledge across these experiments. In many ways, the challenges identified in the current work reflect the broader “theory crisis” in cognitive psychology (Borsboom et al., 2021; Jamieson & Pexman, 2020; Oberauer & Lewandowsky, 2019), where coherent psychological principles are needed to tie empirical records together.

Beyond emphasizing the role of formal theory construction, the current work proposes concrete steps to link laboratory and naturalistic traditions via formal modeling. Naturalistic studies of memory are valuable in identifying components that challenge and guide further development of existing memory theories, thus evaluating their generalizability; formal models are used to incrementally extend existing theories to account for novel findings in naturalistic studies, thereby establishing explicit connections between these findings and existing theoretical frameworks. This incremental approach makes it clear which mechanisms generalize and what adjustments are needed, answering the question of whether the findings discovered in naturalistic settings are already adequately explained by existing theory or whether they represent entirely new concepts (Neisser, 1988). Critical to the collaboration of the two traditions is to reconceptualize well-controlled laboratory experiments not as artificial or opposed to the goal of naturalistic experiments, but as intermediate stages in expanding our theoretical understanding to incorporate increasingly naturalistic scenarios.

In conclusion, this article has proposed a framework to unify laboratory and naturalistic approaches in memory research, where naturalistic data serve to constrain and build a more generalized theory of memory, and formal modeling better connects naturalistic findings with existing theories. Beyond its theoretical contributions, unifying laboratory and naturalistic approaches will also contribute to our ability to develop a theory-driven way to improve memory in the future. The current disconnect between theoretical work and practical applications has arisen, in part, from the tradition of developing memory theories and models in highly controlled laboratory environments. Bridging our knowledge from well-controlled laboratory environments to naturalistic ones simultaneously bridges the gap between our theories and the situations in which they most need to be applied.

Footnotes

Acknowledgements

I would like to thank Vencislav Popov, Christopher Baldassano, Gregory Cox, Hongmi Lee, Richard Shiffrin, Jacob Feldman, Pernille Hemmer, and Karin Stromswold for helpful discussions, as well as anonymous reviewers for their valuable feedback.

Transparency

Action Editor: Zhicheng Lin

Editor: Arturo E. Hernandez

ORCID iD

Qiong Zhang

References

Abbott, J. T., Austerweil, J. L., & Griffiths, T. L. (2015). Random walks on semantic networks can resemble optimal foraging. Psychological Review, 122(3), 558–569.

Anderson, J. R. (1990). The adaptive character of thought. Psychology Press.

Anderson, J. R. (1996). ACT: A simple theory of complex cognition. American Psychologist, 51(4), 355–365.

Anderson, J. R., Betts, Byrne, M. D., Schooler, L. J., & Stanley (2023). The environmental basis of memory. Psychological Review, 130(5), 1137–1159.

Anderson, J. R., Betts, & Fincham, J. M. (2026). Using the environment to predict memory performance. Journal of Experimental Psychology: Learning, Memory, and Cognition. Advance online publication. https://doi.org/10.1037/xlm0001577

Anderson, J. R., & Schooler, L. J. (1991). Reflections of the environment in memory. Psychological Science, 2(6), 396–408.

Angne, Cornell, C. A., & Zhang (2024). A context-based model of collaborative inhibition during memory search. Scientific Reports, 14(1), Article 27645. https://doi.org/10.1038/s41598-024-78517-w

Angne, Mishra, S. S., Lin, & Zhang (2025). Bridging real-world summarization and laboratory-based memory recall. OSF. https://doi.org/10.31234/osf.io/aqu29_v1

Arjovsky, Bottou, Gulrajani, & Lopez-Paz (2019). Invariant risk minimization [Preprint]. arXiv. arXiv:1907.02893

Atkinson, R. C., & Shiffrin, R. M. (1968). Human memory: A proposed system and its control processes. In K. W. Spence & J. T. Spence (Eds.), The psychology of learning and motivation (Vol. 2, pp. 89–195). Elsevier.

Bahrick, H. P. (1984). Semantic memory content in permastore: Fifty years of memory for Spanish learned in school. Journal of Experimental Psychology: General, 113(1), 1–29.

Baldassano, Chen, Zadbood, Pillow, J. W., Hasson, & Norman, K. A. (2017). Discovering event structure in continuous narrative perception and memory. Neuron, 95(3), 709–721.

Baldassano, Hasson, & Norman, K. A. (2018). Representation of real-world event schemas during narrative perception. Journal of Neuroscience, 38(45), 9689–9699.

Banaji, M. R., & Crowder, R. G. (1989). The bankruptcy of everyday memory. American Psychologist, 44(9), 1185–1193.

Bartlett, F. C. (1932). Remembering: A study in experimental and social psychology. Cambridge University Press.

Basden, B. H., Basden, D. R., Bryner, & Thomas, R. L., III. (1997). A comparison of group and individual remembering: Does collaboration disrupt retrieval strategies? Journal of Experimental Psychology: Learning, Memory, and Cognition, 23(5), 1176–1189.

Ben-Yakov, & Henson, R. N. (2018). The hippocampal film editor: Sensitivity and specificity to event boundaries in continuous experience. Journal of Neuroscience, 38(47), 10057–10068.

Borsboom, Van Der Maas, H. L., Dalege, Kievit, R. A., & Haig, B. D. (2021). Theory construction methodology: A practical framework for building theories in psychology. Perspectives on Psychological Science, 16(4), 756–766. https://doi.org/10.1177/1745691620969647

Brunswik (1944). Distal focussing of perception: Size-constancy in a representative sample of situations. Psychological Monographs, 56(1), 1–19.

Brunswik (1947). Systematic and representative design of psychological experiments [Symposium]. Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability (pp. 143–202), Berkeley, CA, United States. University of California Press.

Brunswik (1952). The conceptual framework of psychology. International Encyclopedia of Unified Science, 1(10), 656–760.

Brunswik (1955a). In defense of probabilistic functionalism: A reply. Psychological Review, 62(3), 236–242.

Brunswik (1955b). Representative design and probabilistic theory in a functional psychology. Psychological Review, 62(3), 193–217.

Bylinskii, Goetschalckx, Newman, & Oliva (2022). Memorability: An image-computable measure of information utility. In C. C. Carbon (Ed.), Human perception of visual information: Psychological and computational perspectives (pp. 207–239).

Cabeza, Prince, S. E., Daselaar, S. M., Greenberg, D. L., Budde, Dolcos, LaBar, K. S., & Rubin, D. C. (2004). Brain activity during episodic retrieval of autobiographical and laboratory events: An fMRI study using a novel photo paradigm. Journal of Cognitive Neuroscience, 16(9), 1583–1594.

Carr, H. A. (1931). The laws of association. Psychological Review, 38(3), 212–228.

Carvalho, & Lampinen (2025). Naturalistic computational cognitive science: Towards generalizable models and theories that capture the full range of natural [Preprint]. arXiv. arXiv:2502.20349

Cook, T. D., Campbell, D. T., & Day (1979). Quasi-experimentation: Design & analysis issues for field settings. Houghton Mifflin.

Cornell, C. A., Norman, K. A., Griffiths, T. L., & Zhang (2024). Improving memory search through model-based cue selection. Psychological Science, 35(1), 55–71. https://doi.org/10.1177/09567976231215298

Cowan (2001). The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behavioral and Brain Sciences, 24(1), 87–114.

Cummins (2000). “How does it work?” vs. “what are the laws?” Two conceptions of psychological explanation. In Keil & Wilson (Eds.), Explanation and cognition (pp. 117–145). MIT Press.

Dhami, M. K., Hertwig, & Hoffrage (2004). The role of representative design in an ecological approach to cognition. Psychological Bulletin, 130(6), 959–988.

Ebbinghaus (1885). Memory: A contribution to experimental psychology (H. A. Ruger & C. E. Bussenius, Trans., 1913). Teachers College, Columbia University.

Edmonds (1995). What is complexity? - The philosophy of complexity per se with application to some examples in evolution. In Heylighen & Aerts (Eds.), The evolution of complexity (pp. 1–18). Kluwer.

Estes, W. K. (1955). Statistical theory of spontaneous recovery and regression. Psychological Review, 62(3), 145–154.

Franklin, N. T., Norman, K. A., Ranganath, Zacks, J. M., & Gershman, S. J. (2020). Structured event memory: A neuro-symbolic model of event cognition. Psychological Review, 127(3), 327–361.

Fried, E. I. (2020). Lack of theory building and testing impedes progress in the factor and network literature. Psychological Inquiry, 31(4), 271–288. https://doi.org/10.1080/1047840X.2020.1853461

Galton (1883). Inquiries into human faculty and its development. Macmillan.

Gell-Mann (1995). What is complexity? Remarks on simplicity and complexity by the Nobel Prize-winning author of The Quark and the Jaguar. Complexity, 1, 16–19. https://doi.org/10.1002/cplx.6130010105

Glymour (2003). Learning, prediction and causal Bayes nets. Trends in Cognitive Sciences, 7(1), 43–48.

Griffiths, Mazaheri, Debener, & Hanslmayr (2016). Brain oscillations track the formation of episodic memories in the real world. NeuroImage, 143, 256–266.

Gruneberg, M. M., Morris, P. E., & Sykes, R. N. (1988). Practical aspects of memory: Current research and issues, vol. 1: Memory in everyday life [Conference session]. International Conference on Practical Aspects of Memory, August 2, 1987, Swansea, Wales.

Guala (2003). Experimental localism and external validity. Philosophy of Science, 70(5), 1195–1205.

Guest, & Martin, A. E. (2021). How computational modeling can force theory building in psychological science. Perspectives on Psychological Science, 16(4), 789–802.

Hammond, K. R. (1966). Probabilistic functionalism: Egon Brunswik’s integration of the history, theory, and method of psychology. In K. R. Hammond (Ed.), The psychology of Egon Brunswik (pp. 15–80). Holt, Rinehart & Winston.

Hasson, & Honey, C. J. (2012). Future trends in neuroimaging: Neural processes as expressed within real-life contexts. NeuroImage, 62(2), 1272–1278.

Heinze-Deml, Peters, & Meinshausen (2018). Invariant causal prediction for nonlinear models. Journal of Causal Inference, 6(2), Article 20170016.

Hemmer, & Steyvers (2009). A Bayesian account of reconstructive memory. Topics in Cognitive Science, 1(1), 189–202.

Herweg, N. A., Sharan, A. D., Sperling, M. R., Brandt, Schulze-Bonhage, & Kahana, M. J. (2020). Reactivated spatial context guides episodic recall. Journal of Neuroscience, 40(10), 2119–2128.

Hick, W. E. (1952). On the rate of gain of information. Quarterly Journal of Experimental Psychology, 4(1), 11–26.

Hintzman, D. L. (1991). Why are formal models useful in psychology? In W. E. Hockley & Lewandowsky (Eds.), Relating theory and data (pp. 39–56). Psychology Press.

Hoc, J.-M. (2001). Towards ecological validity of research in cognitive ergonomics. Theoretical Issues in Ergonomics Science, 2(3), 278–288.

Holleman, G. A., Hooge, I. T., Kemner, & Hessels, R. S. (2020). The ‘real-world approach’ and its problems: A critique of the term ecological validity. Frontiers in Psychology, 11, Article 721.

Hornsby, A. N., & Love, B. C. (2022). Sequential consumer choice as multi-cued retrieval. Science Advances, 8(8), Article eabl9754. https://doi.org/10.1126/sciadv.abl9754

Howard, M. W., & Kahana, M. J. (1999). Contextual variability and serial position effects in free recall. Journal of Experimental Psychology: Learning, Memory, and Cognition, 25(4), 923–941.

Howard, M. W., & Kahana, M. J. (2002a). A distributed representation of temporal context. Journal of Mathematical Psychology, 46(3), 269–299. https://doi.org/10.1006/jmps.2001.1388

Howard, M. W., & Kahana, M. J. (2002b). When does semantic similarity help episodic retrieval? Journal of Memory and Language, 46(1), 85–98.

Huttenlocher, Hedges, L. V., & Vevea, J. L. (2000). Why do categories affect stimulus judgment? Journal of Experimental Psychology: General, 129(2), 220–241.

Hyman, I. E., Jr., Cardwell, B. A., & Roy, R. A. (2013). Multiple causes of collaborative inhibition in memory for categorised word lists. Memory, 21(7), 875–890.

Jääskeläinen, I. P., Sams, Glerean, & Ahveninen (2021). Movies and narratives as naturalistic stimuli in neuroimaging. NeuroImage, 224, Article 117445.

Jamieson, R. K., & Pexman, P. M. (2020). Moving beyond 20 questions: We (still) need stronger psychological theory. Canadian Psychology/Psychologie canadienne, 61(4), 273–283.

Jeunehomme, & D’Argembeau (2020). Event segmentation and the temporal compression of experience in episodic memory. Psychological Research, 84, 481–490.

Kahana, M. J. (2012). Foundations of human memory. Oxford University Press.

Kelley, M. R., Reysen, M. B., Ahlstrand, K. M., & Pentz, C. J. (2012). Collaborative inhibition persists following social processing. Journal of Cognitive Psychology, 24(6), 727–734.

Kingstone, Smilek, & Eastwood, J. D. (2008). Cognitive ethology: A new approach for studying human cognition. British Journal of Psychology, 99(3), 317–340.

Lee, Bellana, & Chen (2020). What can narratives tell us about the neural bases of human memory? Current Opinion in Behavioral Sciences, 32, 111–119.

Lerner, Honey, C. J., Silbert, L. J., & Hasson (2011). Topographic mapping of a hierarchy of temporal receptive windows using a narrated story. Journal of Neuroscience, 31(8), 2906–2915.

Logan, G. D., & Cox, G. E. (2021). Serial memory: Putting chains and position codes in context. Psychological Review, 128(6), 1197–1205.

Lohnas, L. J. (2025). A retrieved context model of serial recall and free recall. Computational Brain & Behavior, 8(1), 1–35. https://doi.org/10.1007/s42113-024-00221-9

Hasson, & Norman, K. A. (2022). A neural network model of when to retrieve and encode episodic memories. eLife, 11, Article e74445. https://doi.org/10.7554/eLife.74445

Nguyen, T. T., Zhang, Hasson, Griffiths, T. L., Zacks, J. M., Gershman, S. J., & Norman, K. A. (2024). Reconciling shared versus context-specific information in a neural network model of latent causes. Scientific Reports, 14(1), Article 16782. https://doi.org/10.1038/s41598-024-64272-5

Popov, & Zhang (2024). A neural index reflecting the amount of cognitive resources available during memory encoding: A model-based approach. Journal of Experimental Psychology: Learning, Memory, and Cognition, 51(3), 350–370. https://doi.org/10.1037/xlm0001364

Maguire, E. A., Valentine, E. R., Wilding, J. M., & Kapur (2003). Routes to remembering: The brains behind superior memory. Nature Neuroscience, 6(1), 90–95.

Martin, C. B., Hong, Newsome, R. N., Savel, Meade, M. E., Xia, Honey, C. J., & Barense, M. D. (2022). A smartphone intervention that enhances real-world memory and promotes differentiation of hippocampal activity in older adults. Proceedings of the National Academy of Sciences, 119(51), Article e2214285119. https://doi.org/10.1073/pnas.2214285119

Mattar, M. G., & Daw, N. D. (2018). Prioritized memory access explains planning and hippocampal replay. Nature Neuroscience, 21(11), 1609–1617.

McDermott, K. B., Szpunar, K. K., & Christ, S. E. (2009). Laboratory-based and autobiographical retrieval tasks differ substantially in their neural substrates. Neuropsychologia, 47(11), 2290–2298.

Meagher, B. J., & Nosofsky, R. M. (2023). Testing formal cognitive models of classification and old-new recognition in a real-world high-dimensional category domain. Cognitive Psychology, 145, Article 101596. https://doi.org/10.1016/j.cogpsych.2023.101596

Medin, D. L., & Schaffer, M. M. (1978). Context theory of classification learning. Psychological Review, 85(3), 207–238.

Michelmann, Hasson, & Norman, K. A. (2023). Evidence that event boundaries are access points for memory retrieval. Psychological Science, 34(3), 326–344. https://doi.org/10.1177/09567976221128206

Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63(2), 81–97.

Miller, J. F., Neufang, Solway, Brandt, Trippel, Mader, Hefft, Merkow, Polyn, S. M., Jacobs, Kahana, M. J., & Schulze-Bonhage (2013). Neural activity in human hippocampal formation reveals the spatial context of retrieved memories. Science, 342(6162), 1111–1114.

Neisser (1982). Memory: What are the important questions. In Neisser (Ed.), Memory observed: Remembering in natural contexts (pp. 3–19).

Neisser (1988). New vistas in the study of memory. In Neisser & Winograd (Eds.), Remembering reconsidered: Ecological and traditional approaches to the study of memory (pp. 1–10). Cambridge University Press.

Neisser, & Winograd (1988). Remembering reconsidered: Ecological and traditional approaches to the study of memory [Conference session]. Emory Cognition Project Conference, October 2, 1985, Emory University, Atlanta, GA.

Newell (1973). You can’t play 20 questions with nature and win: Projective comments on the papers of this symposium. In W. G. Chase (Ed.), Visual information processing.

Nguyen, Vanderwal, & Hasson (2019). Shared understanding of narratives is correlated with shared neural responses. NeuroImage, 184, 161–170.

Niforatos, Cinel, Mack, C. C., Langheinrich, & Ward (2017). Can less be more? Contrasting limited, unlimited, and automatic picture capture for augmenting memory recall. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 1(2), 1–22.

Nosofsky, R. M. (1986). Attention, similarity, and the identification–categorization relationship. Journal of Experimental Psychology: General, 115(1), 39–57.

Nosofsky, R. M., Little, D. R., Donkin, & Fific (2011). Short-term memory scanning viewed as exemplar-based categorization. Psychological Review, 118(2), 280–315.

Nosofsky, R. M., & Zaki, S. R. (2003). A hybrid-similarity exemplar model for predicting distinctiveness effects in perceptual old-new recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29(6), 1194–1209.

Nozawa, Sasaki, Sakaki, Yokoyama, & Kawashima (2016). Interpersonal frontopolar neural synchronization in group communication: An exploration toward fNIRS hyperscanning of natural interactions. NeuroImage, 133, 484–497.

Oberauer, & Kliegl (2006). A formal model of capacity limits in working memory. Journal of Memory and Language, 55(4), 601–626.

Oberauer, & Lewandowsky (2019). Addressing the theory crisis in psychology. Psychonomic Bulletin & Review, 26, 1596–1618.

Ólafsdóttir, H. F., Bush, & Barry (2018). The role of hippocampal replay in memory and planning. Current Biology, 28(1), R37–R50.

Patihis, Frenda, S. J., LePort, A. K., Petersen, Nichols, R. M., Stark, C. E., McGaugh, J. L., & Loftus, E. F. (2013). False memories in highly superior autobiographical memory individuals. Proceedings of the National Academy of Sciences, 110(52), 20947–20952.

Pavlik, P. I., & Anderson, J. R. (2008). Using a model to compute the optimal schedule of practice. Journal of Experimental Psychology: Applied, 14(2), 101–117.

Peters, Bühlmann, & Meinshausen (2016). Causal inference by using invariant prediction: Identification and confidence intervals. Journal of the Royal Statistical Society Series B: Statistical Methodology, 78(5), 947–1012.

Polyn, S. M., Norman, K. A., & Kahana, M. J. (2009a). A context maintenance and retrieval model of organizational processes in free recall. Psychological Review, 116(1), 129–156. https://doi.org/10.1037/a0014420

Polyn, S. M., Norman, K. A., & Kahana, M. J. (2009b). Task context and organization in free recall. Neuropsychologia, 47(11), 2158–2163. https://doi.org/10.1016/j.neuropsychologia.2009.02.013

Popov, & Reder, L. M. (2020). Frequency effects on memory: A resource-limited theory. Psychological Review, 127(1), 1–43.

Popov, Zhang, Koch, G. E., Calloway, R. C., & Coutanche, M. N. (2019). Semantic knowledge influences whether novel episodic associations are represented symmetrically or asymmetrically. Memory & Cognition, 47, 1567–1581.

Proctor, R. W., & Schneider, D. W. (2018). Hick’s law for choice reaction time: A review. Quarterly Journal of Experimental Psychology, 71(6), 1281–1299.

Raaijmakers, J. G., & Shiffrin, R. M. (1980). SAM: A theory of probabilistic search of associative memory. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 14, pp. 207–262). Elsevier.

Rajaram, & Pereira-Pasarin, L. P. (2010). Collaborative memory: Cognitive research and theory. Perspectives on Psychological Science, 5(6), 649–663.

Richie, Aka, & Bhatia (2023). Free association in a neural network. Psychological Review, 130(5), 1360–1385.

Roediger, H. L., III, & McDermott, K. B. (2013). Two types of event memory. Proceedings of the National Academy of Sciences, 110(52), 20856–20857.

Rouhani, Norman, K. A., Niv, & Bornstein, A. M. (2020). Reward prediction errors create event boundaries in memory. Cognition, 203, Article 104269. https://doi.org/10.1016/j.cognition.2020.104269

Schölkopf, Locatello, Bauer, N. R., Kalchbrenner, Goyal, & Bengio (2021). Toward causal representation learning. Proceedings of the IEEE, 109(5), 612–634. https://doi.org/10.1109/JPROC.2021.3058954

Shamay-Tsoory, S. G., & Mendelsohn (2019). Real-life neuroscience: An ecological approach to brain and behavior research. Perspectives on Psychological Science, 14(5), 841–859.

Sonkusare, Breakspear, & Guo (2019). Naturalistic stimuli in neuroscience: Critically acclaimed. Trends in Cognitive Sciences, 23(8), 699–714.

Sparrow, Liu, & Wegner, D. M. (2011). Google effects on memory: Cognitive consequences of having information at our fingertips. Science, 333(6043), 776–778.

St. Jacques, Rubin, D. C., LaBar, K. S., & Cabeza (2008). The short and long of it: Neural correlates of temporal-order memory for autobiographical events. Journal of Cognitive Neuroscience, 20(7), 1327–1341.

St. Jacques, & Schacter, D. L. (2013). Modifying memory: Selectively enhancing and updating personal memories for a museum tour by reactivating them. Psychological Science, 24(4), 537–543.

Tan, & Ward (2000). A recency-based account of the primacy effect in free recall. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26(6), 1589–1625. https://doi.org/10.1037/0278-7393.26.6.1589

Tompary, & Thompson-Schill, S. L. (2021). Semantic influences on episodic memory distortions. Journal of Experimental Psychology: General, 150(12), 2411–2429.

Tse, Langston, R. F., Kakeyama, Bethus, Spooner, P. A., Wood, E. R., Witter, M. P., & Morris, R. G. (2007). Schemas and memory consolidation. Science, 316(5821), 76–82.

Tulving (1983). Elements of episodic memory. Oxford University Press.

Tulving, & Thomson, D. M. (1973). Encoding specificity and retrieval processes in episodic memory. Psychological Review, 80(5), 352–373.

Turner, B. M., Forstmann, B. U., Love, B. C., Palmeri, T. J., & Van Maanen (2017). Approaches to analysis in model-based cognitive neuroscience. Journal of Mathematical Psychology, 76, 65–79.

van Rooij (2019). Psychological science needs theory development before preregistration. Psychonomic Society Featured Content. https://featuredcontent.psychonomic.org/psychological-science-needs-theory-development-before-preregistration/

Watkins, M. J., Neath, & Sechler, E. S. (1989). Recency effect in recall of a word list when an immediate memory task is performed after each word presentation. The American Journal of Psychology, 102(2), 265–270.

Weldon, M. S., Blair, & Huebsch, P. D. (2000). Group remembering: Does social loafing underlie collaborative inhibition? Journal of Experimental Psychology: Learning, Memory, and Cognition, 26(6), 1568–1577.

Winograd (1988). Continuities between ecological and laboratory approaches to memory. In Neisser & Winograd (Eds.), Remembering reconsidered: Ecological and traditional approaches to the study of memory. Cambridge University Press.

Hemmer, & Zhang (2025). Towards a generalized Bayesian model of reconstructive memory. Computational Brain & Behavior, 8, 134–146. https://doi.org/10.1007/s42113-024-00222-8

Yarkoni (2022). The generalizability crisis. Behavioral and Brain Sciences, 45, Article e1.

Zacks, J. M., Tversky, & Iyer (2001). Perceiving, remembering, and communicating structure in events. Journal of Experimental Psychology: General, 130(1), 29–58.

Zadbood, Chen, Leong, Norman, & Hasson (2017). How we transmit memories to other brains: Constructing shared neural representations via communication. Cerebral Cortex, 27(10), 4988–5000. https://doi.org/10.1093/cercor/bhx202

Zhang (2022). How and why does schematic knowledge affect memory? In Musolino, Sommer, & Hemmer (Eds.), The cognitive science of belief: A multidisciplinary approach (pp. 113–134). Cambridge University Press.

Zhang, Griffiths, T. L., & Norman, K. A. (2023). Optimal policies for free recall. Psychological Review, 130(4), 1104–1136. https://doi.org/10.1037/rev0000375

Zhao, W. J., Richie, & Bhatia (2022). Process and content in decisions from memory. Psychological Review, 129(1), 73–102.

Zhou, C. Y., Talmi, Daw, N. D., & Mattar, M. G. (2025). Episodic retrieval for model-based evaluation in sequential decision tasks. Psychological Review, 132(1), 18–49. https://doi.org/10.1037/rev0000505

Zhou, Kahana, M. J., & Schapiro, A. C. (2024). A unifying account of replay as context-driven memory reactivation. eLife, 13.