Abstract
The age-related declines observed in scores on paired-associate-learning (PAL) tests are widely taken as support for the idea that human cognitive capacities decline across the life span. In a computational simulation, we showed that the patterns of change in PAL scores are actually predicted by the models that formalize the associative learning process in other areas of behavioral and neuroscientific research. These models also predict that manipulating language exposure can reproduce the experience-related performance differences erroneously attributed to age-related decline in age-matched adults. Consistent with this, results showed that older bilinguals outperformed native speakers in a German PAL test, an advantage that increased with age. These analyses and results show that age-related PAL performance changes reflect the predictable effects of learning on the associability of test items, and indicate that failing to control for these effects is distorting the understanding of cognitive and brain development in adulthood.
The ability to learn and recall arbitrary word pairs (e.g., jury-eagle) during paired-associate-learning (PAL) tests declines systematically as adult age increases. Along with similar patterns of change on other neuropsychological tests, this is thought to show that cognitive capacities decline across adulthood, functionally characterizing the structural changes that occur as healthy brains age (Deary et al., 2009; Lindenberger, 2014; Salthouse, 2012; Singh-Manoux et al., 2012).
PAL tests are particularly sensitive to the effects of age on cognition (Rabbitt & Lowe, 2000), which are evident surprisingly early in adulthood. For example, a reanalysis by Ramscar and Port (2016) of data obtained by desRosiers and Ivison (1986) showed that in 20- to 29-year-olds, average performance on the PAL subtest of the Wechsler Memory Scale (WMS) was 78%, but fell to 70% in 30- to 39-year-olds, the largest by-decade decline on this test across the life span (see Fig. 1); Wilcoxon signed-rank test on average item scores: range: 1.13–2.97, z = 3.71, p < .001, 95% confidence interval for the mean performance difference between the two age groups = [0.14, 0.33] (Ramscar & Port, 2016).

Fig. 1. Average by-item performance for 400 adults between the ages of 20 and 29 and 30 and 39 years (50% females per group) on Forms 1 and 2 of the paired-associate-learning (PAL) subtest of the Wechsler Memory Scale (data obtained from desRosiers & Ivison, 1986). The order of items on the y-axis is based on the mean item score across both age groups, with item difficulty increasing from hardest to easiest starting from the bottom.
However, considered alone, these changes cannot be construed as evidence of significant cognitive decline between the ages of 20 and 40 years. This is because raw PAL scores cannot be used to compare performance between groups whose experience varies unless one also assumes that PAL performance is unaffected by differences in people’s prior experience of PAL items, an assumption that research into associative learning has repeatedly shown to be false. Learning to associate a cue (e.g., jury) with an outcome (eagle) cannot be predicted from association rates alone (for a review, see Ramscar, Dye, & McCauley, 2013), and two other factors have been shown to be critical to associative learning: cue background rates (Ramscar, Dye, & Klein, 2013; Rescorla, 1968; in PAL tests, the frequencies at which cue words are encountered in the absence of response words), and blocking (Arnon & Ramscar, 2012; Kamin, 1969; the prior predictability of a response in context).
While association rates tend to promote learning, blocking and background rates inhibit it, and critically, the way these factors interact to influence the learning of a specific association is entirely a function of a learner’s experience (Ramscar, Dye, & McCauley, 2013). All three factors are also critical to explaining the pattern of PAL performance across adulthood: As Figure 1 shows, age makes “hard” PAL items proportionally harder to learn than “easy” PAL items, a nonlinear pattern that is not predicted by theoretical accounts of cognitive decline; however, as the following simulation shows, this pattern is predicted by standard models of the associative learning process.
Simulation Experiment
Method
The development of word associations in a small lexicon comprising four “easy” (meaningful) PAL pairs (baby-cries, baby-eagle, jury-duty, and jury-summons) and two “hard” (meaningless) pairs (baby-summons and jury-eagle; see Fig. 2) was simulated using the learning rule devised by Rescorla and Wagner (1972; computationally, the rule describes a discriminative learning process; Ramscar, Yarlett, Dye, Denny, & Thorpe, 2010). To reflect the distributional differences of words in natural languages and to show how variations in word co-occurrences in a typical English-speaker’s experience affect learning over time, we pretrained the meaningless items with low association rates (10 times each).
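For reference, the Rescorla-Wagner (1972) rule can be written as follows. The notation is a common textbook form and is not taken from this article: V_{ij} is the associative strength from cue i to outcome j, the sum runs over all cues present on trial t, λ_j is the maximum strength the outcome supports when it occurs (and 0 when it does not), and α_i and β_j are cue- and outcome-specific learning-rate parameters.

```latex
\[
\Delta V_{ij}^{t} \;=\; \alpha_i \, \beta_j \left( \lambda_j \;-\; \sum_{k \,\in\, \mathrm{present}(t)} V_{kj}^{t} \right)
\]
```

Because the error term uses the summed prediction of all cues that are present, cues compete for associative value, and a cue can acquire a negative weight when an outcome that other present cues predict fails to occur.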

Fig. 2. Results from the simulation experiment. The Rescorla-Wagner (1972) association strength for each of the six trained word pairs is shown as a function of the number of times the word pair had been presented to the model (learning events).
The effect that experience of the more frequent meaningful pairs has on PAL learnability was then simulated by training the model on jury-duty 40 times, baby-cries and baby-eagle 60 times, and jury-summons 80 times. The order in which individual item exemplars were presented was determined randomly, subject to the probability of their occurrence in training. The simulations were run using the ndl package for the statistical software R (Arppe et al., 2015), and the code for the simulation is in the Supplemental Material available online.
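The training regime described above can be sketched in a few lines of Python. This is not the authors' ndl code, which is in the Supplemental Material; it is a minimal re-implementation under several assumptions that the article does not state: a single learning rate of 0.1 for all cues and outcomes, a constant "context" cue present on every trial, "60 times" read as 60 presentations for each of baby-cries and baby-eagle, and pretraining completed before the main training begins. The names RATE, CONTEXT, expand, simulate, and mean_weight are hypothetical.

```python
import random
from collections import defaultdict

# Presentation counts taken from the Method section.
PRETRAIN = {("jury", "eagle"): 10, ("baby", "summons"): 10}
MAIN = {
    ("jury", "duty"): 40,
    ("baby", "cries"): 60,   # assumption: 60 presentations each
    ("baby", "eagle"): 60,
    ("jury", "summons"): 80,
}
OUTCOMES = ("cries", "duty", "eagle", "summons")

# Assumptions not stated in the article.
RATE = 0.1           # stands in for the product alpha * beta
CONTEXT = "context"  # constant cue present on every trial


def expand(freqs, rng):
    """Turn a {pair: count} table into a randomly ordered list of trials."""
    events = [pair for pair, n in freqs.items() for _ in range(n)]
    rng.shuffle(events)
    return events


def simulate(seed):
    """Run pretraining, then main training; return weights[(cue, outcome)]."""
    rng = random.Random(seed)
    weights = defaultdict(float)
    for cue, outcome in expand(PRETRAIN, rng) + expand(MAIN, rng):
        present = (CONTEXT, cue)
        for target in OUTCOMES:
            lam = 1.0 if target == outcome else 0.0
            # Summed prediction of all present cues for this outcome.
            total = sum(weights[(c, target)] for c in present)
            delta = RATE * (lam - total)
            for c in present:
                weights[(c, target)] += delta
    return weights


def mean_weight(pair, n_runs=200):
    """Average the final weight for one (cue, outcome) pair over random orders."""
    return sum(simulate(seed)[pair] for seed in range(n_runs)) / n_runs


if __name__ == "__main__":
    for pair in [("jury", "duty"), ("jury", "eagle")]:
        print(pair, round(mean_weight(pair), 3))
```

Averaging over many random presentation orders smooths out trial-order noise. Under these particular assumptions the averaged jury-to-eagle weight ends below zero while the jury-to-duty weight ends above zero; other cue codings or learning rates may give different magnitudes.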
Results
The development of the word associations in the model is plotted in Figure 2. As can be seen, increased experience of a world containing jury-duty and baby-eagle serves to discriminate against the learnability of jury-eagle. Increased experience with the meaningful word pairs increases the background rate of jury in relation to eagle, while simultaneously forcing jury into competition for associative value with the more frequent cue baby. This ultimately results in the model learning a negative association between jury and eagle, and this negative association will have to be unlearned in order for the model to positively associate jury with eagle.
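One way to see where a negative jury-eagle weight can come from is to solve for the equilibrium of the Rescorla-Wagner updates. The following is a sketch under assumptions that are not stated in the article: every trial contains a constant context cue c alongside the first word of the pair, all cues share one learning rate, all weights start at zero, and trial types occur in the main-training proportions given in the Method (baby-eagle 60, baby-cries 60, jury-duty plus jury-summons 120). V_b, V_j, and V_c denote the weights from baby, jury, and the context cue to the outcome eagle.

```latex
\begin{align*}
\text{baby trials:}\quad
  & 60\,\bigl(1 - (V_c + V_b)\bigr) + 60\,\bigl(0 - (V_c + V_b)\bigr) = 0
  \;\Rightarrow\; V_c + V_b = \tfrac{1}{2} \\
\text{jury trials:}\quad
  & 120\,\bigl(0 - (V_c + V_j)\bigr) = 0
  \;\Rightarrow\; V_c + V_j = 0 \\
\text{shared updates:}\quad
  & V_c = V_b + V_j \\
\Rightarrow\quad
  & V_c = \tfrac{1}{6}, \qquad V_b = \tfrac{1}{3}, \qquad V_j = -\tfrac{1}{6}
\end{align*}
```

The third line holds because the context cue appears on every trial together with exactly one of baby or jury, so each update changes V_c by the same amount as V_b or V_j. In this sketch the context cue absorbs part of the credit for eagle; on jury trials, where eagle never occurs, the summed prediction can only be driven to zero if jury takes a negative weight.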
What Do Declining PAL Scores Say About Cognition?
The simulation results are further supported by analyses of the WMS-PAL normative data (desRosiers & Ivison, 1986) by Ramscar and Port (2016), who found that using large text corpora to empirically derive parameters for the background rates (Word 1 frequencies), blocking (Word 2 frequency/Word 1 frequency), and association rates (Word 1–Word 2 co-occurrence rates) for the PAL pairs plotted in Figure 1 accounted for over 85% of the by-item variance in the observed performance of the 20- to 29-year-old and the 30- to 39-year-old age groups. Consistent with the simulation results, Ramscar and Port’s analysis showed that background rates and blocking were associated with lower scores, while association rates were associated with higher scores, with sensitivity to all of the predictors being greater in the 30- to 39-year-old group compared with the 20- to 29-year-old group (see Ramscar, 2014, for a replication using different corpora). Further analyses of the full normative data set revealed that this pattern is consistent across the life span (Ramscar, Hendrix, Love, & Baayen, 2013), such that the oldest adults’ (ages 60–69 years) performance showed greatest sensitivity to the factors that caused negative associations to develop in the simulation, whereas these factors did not significantly influence the youngest participants’ performance at all. In terms of learning the more complex set of word associations in the English lexicon, 20- to 29-year-olds’ performance is akin to learning at around Epoch 60 in the simulation, and 60- to 69-year-olds’ performance is more akin to learning around Epoch 250.
In other words, long-established principles of learning explain why some PAL pairs are harder or easier to learn in the first place. Further, they predict that PAL performance can be expected to decline as adults age, simply because the discriminative processes that produce “associative” learning teach English speakers not only which words go together, but also which words do not go together. This process both increasingly differentiates meaningful and meaningless word pairs (Fig. 1) and makes meaningless pairs harder to learn (Fig. 2).
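The three corpus-derived predictors described above (background rate, blocking, and association rate) are simple functions of word and word-pair counts. The sketch below shows the arithmetic only. The counts are invented for illustration and do not come from any corpus or from Ramscar and Port's (2016) analysis, and the names WORD_FREQ, PAIR_FREQ, and pal_predictors are hypothetical. Any transformation applied before modeling (e.g., taking logarithms) is left out.

```python
# Hypothetical counts, invented purely for illustration.
WORD_FREQ = {"jury": 2000, "baby": 20000, "duty": 5000, "eagle": 1000}
PAIR_FREQ = {("jury", "duty"): 300, ("jury", "eagle"): 1}


def pal_predictors(word1, word2):
    """Return the three predictors for a PAL pair (Word 1 = cue, Word 2 = response)."""
    f1 = WORD_FREQ[word1]
    f2 = WORD_FREQ[word2]
    return {
        # Background rate: how often the cue word occurs at all.
        "background_rate": f1,
        # Blocking: Word 2 frequency relative to Word 1 frequency.
        "blocking": f2 / f1,
        # Association rate: how often the two words co-occur.
        "association_rate": PAIR_FREQ.get((word1, word2), 0),
    }


if __name__ == "__main__":
    for pair in [("jury", "duty"), ("jury", "eagle")]:
        print(pair, pal_predictors(*pair))
```

In a by-item analysis of the kind described, these values would be computed for every test pair and entered as predictors of item scores, separately for each age group.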
Behavioral Experiment
Because of the way people are exposed to language throughout their lives, native speakers of similar ages and educational backgrounds have levels of first-language experience that significantly exceed those of age-matched adult second-language speakers. Our simulation and analyses make two clear (though somewhat counterintuitive) predictions about how these differences will affect PAL performance:
1. Older native speakers ought to perform worse on lexical PAL tests than age-matched nonnative speakers of a language.
2. The differences in native and bilingual PAL performance can be expected to increase with growing experience (see Fig. 2).
By contrast, if PAL tests do in fact measure cognition simply as a function of frequency of presentation, then older second-language speakers should not outperform older native speakers. Indeed, given that frequent PAL pairs are easier than infrequent ones, a naive account should predict that older native speakers’ greater experience should lead them to outperform older second-language speakers.
Method
To examine these hypotheses, we tested 20 young (18–28 years old) and 20 older (38–53 years old) monolingual speakers of German (a nontonal language deriving most of its lexicon from the Germanic branch of the Indo-European language family) and two age-matched groups of 20 native speakers of Mandarin (a tonal member of the Sino-Tibetan language family), for whom German is a second language. Given that PAL is a reliable measure that is particularly sensitive to the effects of aging (Rabbitt & Lowe, 2000) and that sample sizes greater than 20 are typical in neuropsychological studies that employ PAL tests, this sample was judged to be sufficient to test these hypotheses. The monolinguals completed a PAL test in German only, whereas the bilinguals completed Chinese and German PAL tests (see the Supplemental Material for details).
Table 1 shows the mean age for the four groups of participants, as well as the mean scores in vocabulary tests in German—and, where applicable, Chinese—which confirm native-speaker superiority when it comes to vocabulary skills.
Table 1. Descriptive Statistics for Age and Vocabulary Scores for Each Group of Participants
Results
An analysis of the performance of our participants using generalized additive models (Wood, 2006) revealed a significant interaction between co-occurrence frequency and age, χ²(6.106) = 38.687, p < .001. The interaction differed depending on whether the test was administered in participants’ first or second language, χ²(3.000) = 9.122, p < .028. For young adults, first- and second-language performance was similar, and the interaction between age and frequency, χ²(4.368) = 19.658, p = .001, did not differ between first and second language, χ²(4.744) = 1.357, p > .250. This is consistent with previous analyses that showed that young adults’ PAL performance is largely insensitive to background rates and blocking (Ramscar et al., 2013). By contrast, the performance of older adults was better in their second language than in their native language, as revealed by an Age × Co-occurrence Frequency interaction, χ²(4.009) = 36.335, p < .001, that differed significantly between first and second language, χ²(3.000) = 14.959, p = .002, and a main effect of language that indicated that older adults’ performance was better in their second language than in their native language, z = 2.113, p < .035.
Figure 3 shows performance of young monolinguals and bilinguals in German. Consistent with the generalized-additive-model analysis, Figure 3 reveals that the performance of these two groups is comparable. Bilingual young adults outperformed monolingual young adults on 14 items, whereas monolingual young adults outperformed bilingual young adults on 13 items (with performance being identical for the remaining three items).

Fig. 3. Results from the behavioral experiment: average by-item performance on the paired-associate-learning (PAL) test, separately for young native German monolinguals and young Chinese-German bilinguals tested in German. The order of items on the y-axis is based on the mean item score across both age groups, with item difficulty increasing from hardest to easiest starting from the bottom.
The three-way interaction among age, association rate, and language (first vs. second) is reflected in Figure 4, which shows performance of old monolinguals and bilinguals in German. Older participants performed better in their second language than in their native language for the majority of the co-occurrence frequency range (19/25 of the easiest PAL pairs). For the very “hardest” PAL items (i.e., those with the lowest association rates), this pattern reversed, such that the older adults performed worse in their second language. In addition to the improved performance of older participants in the second language, we observed an attenuation of the age effect in the second language.

Fig. 4. Results from the behavioral experiment: average by-item performance on the paired-associate-learning (PAL) test, separately for old native-German speakers and old Chinese-German bilinguals tested in German. The order of items on the y-axis is based on the mean item score across both age groups, with item difficulty increasing from hardest to easiest starting from the bottom.
Figure 5 shows the predicted performance (proportion correct) in the first language and in the second language as a function of age and co-occurrence frequency. As can be seen, there is a clear age effect in the first language, which is qualitatively similar to the age effects reported in monolingual PAL studies. Throughout the left panel of Figure 5, the performance of the old participants is worse than that of the young participants. The difference is small for items with high association rates but increases as co-occurrence frequency decreases. At the midpoint of the co-occurrence frequency range (as indicated by the dashed lines), for instance, the estimated performance of the oldest participants is .59 correct, whereas the performance of the youngest participants is .76 correct (difference: .17). In the second language, this age effect is substantially reduced. A clear age effect in the second language is present for the hardest pairs only. At the midpoint of the co-occurrence frequency range, the estimated performance is between .73 and .76 across the age range. For the easiest pairs, old participants even perform somewhat better than young participants, although the performance of both groups is close to ceiling (see the Supplemental Material for details and further analyses).

Fig. 5. Effect of (log-transformed) co-occurrence frequency and age on paired-associate-learning (PAL) performance in the first and second language. The z-axis represents the proportion of correct responses in the PAL test, with colder colors indicating poorer performance, and warmer colors representing better performance. Contour lines connect points with the same accuracy scores. Dotted lines indicate the midpoint of the (log) co-occurrence frequency range.
Finally, the somewhat counterintuitive prediction tested here—that increased language experience impairs overall PAL performance—is further supported by another result of the study: Older adults with doctoral degrees, the attainment of which is likely to involve a larger than usual amount of reading, performed significantly worse than did older adults without them (z = −2.073, p = .035).
Discussion
Previously, we showed how age-related changes on other measures of cognitive performance are likely to reflect the effects of learning and increased knowledge rather than cognitive decline (Blanco et al., 2016; Ramscar et al., 2013; Ramscar, Hendrix, Shaoul, Milin, & Baayen, 2014). The present study extends this finding by showing how standard learning models actually predict that PAL performance will decline even when learning capacities remain constant, simply because cumulative linguistic experience will make meaningless word pairings ever harder to learn. This prediction is supported by the results of our study, which show that when age is controlled for, less linguistic experience predicts higher PAL scores; this indicates that PAL scores measure the “costs” that accompany the acquisition of an increasingly well-discriminated lexical knowledge base.
A steadily accumulating body of evidence in the domain of linguistic cognition indicates that the skills of speakers increase with age and experience: Speakers’ vowel spaces expand with age, which suggests that the articulation of specific words becomes more distinct with age (Baayen, Tomaschek, Gahl, & Ramscar, in press); and although older speakers respond more slowly in the lexical decision task, their accuracy on less frequent words is substantially higher than that of young speakers (Ramscar et al., 2013), which is indicative of both more extensive lexical knowledge (Keuleers, Stevens, Mandera, & Brysbaert, 2015) and a speed/accuracy trade-off favoring precision. Similarly, although speakers increasingly use more pronouns as they age (which has been taken to reflect declining processing capacity; Hendriks, Englert, Wubs, & Hoeks, 2008), further examination reveals that this characteristic applies not only to individual speakers, but also to languages as they change across generations; and since languages do not age, it seems that in both cases, this trend is likely to reflect adaptation to the demands of processing an ever expanding vocabulary (Baayen et al., in press).
The present results and analyses show that when age is controlled for, higher PAL scores are a “benefit” of the less well-discriminated knowledge associated with less experience. Studies of aging and associative learning outside the linguistic domain support this conclusion: Naveh-Benjamin (2000) shows that older adults are worse at learning associations if units of information are unrelated rather than meaningfully related; Castel (2005) shows that older adults are better at associating realistic than unrealistic prices with grocery items; and Old and Naveh-Benjamin (2008) show that adults encode less information about background context in memory tests as they age. These findings have previously been taken to reveal age-related associative deficits that are somehow lessened when associative information is consistent with the environment. However, it is notable that the same pattern of learning the informative and neglecting the uninformative is also seen when infants lose their sensitivity to nonnative phonetic distinctions in the course of learning a language (Werker & Tees, 1984), when it is not typically seen as cognitive decline. Similarly, the finding that despite their highly developed spatial skills and distinctive hippocampi, London taxi drivers are worse than control subjects at learning novel object-location associations (but not associations in other domains) is taken to reveal a puzzle, not pathology (Woollett & Maguire, 2012). What the analyses presented here show is that all these phenomena are consistent with what is known about the actual processes of associative learning (Ramscar et al., 2010), and that far from declining, they suggest that when properly analyzed, the learning processes in healthy adults appear to function in a remarkably consistent way across the life span.
Current ideas about age-related cognitive decline are irrevocably linked to the tests used to study the effects of age on cognition, so cognitive decline is defined in terms of declining test performance. These tests fall into two broad groups that were originally developed across the course of the 20th century for other purposes. The first category consists of a variety of relatively simple clinical tests for assessing neuropsychological injuries and pathologies, designed to allow the loss of or damage to a localizable function to be identified by comparing patient performance with normal performance in a population for such factors as age and gender (Strauss, Sherman, & Spreen, 2006). The second category consists of a variety of relatively simple psychometric tests developed to assess intelligence in populations such as school children and military recruits (Gregory, 2004). All of these tests make use of learned information; however, in repurposing them for assessing aging, researchers have (albeit sometimes implicitly) assumed that performance on them is independent of experience (Deary et al., 2009; Lindenberger, 2014; Salthouse, 2012; Singh-Manoux et al., 2012). Our analyses and results emphasize just how incompatible this assumption is with what is known about how learning from experience actually works (e.g., see Daw, Courville, & Dayan, 2008; McDannald, Jones, Takahashi, & Schoenbaum, 2014; Rescorla & Wagner, 1972; Schultz, 2006; Sutton & Barto, 1981).
It is clear that a proper assessment of cognitive performance in healthy aging cannot be made unless the knowledge and skills that are inevitably accumulated as experience grows are controlled for. For tests such as PAL, which make use of linguistic stimuli, big linguistic data now make this straightforward, as we have shown. Deconfounding the effects of experience on other tests of life-span cognitive performance is likely to be less straightforward, but it is nevertheless equally critical. In the meantime, in the absence of assessments of the effects of learning on other tests and in light of the fact that changes in performance on other life-span measures of cognitive performance correlate well with corresponding PAL scores (Rabbitt & Lowe, 2000), scientific prudence indicates that the way the results of these tests are interpreted should be tempered. Indeed, given that our results indicate that basic learning processes do not decline across the life span, and given that the model we used to predict these results has proven remarkably adept at accurately predicting learning across a range of behavioral domains (as well as responses in the structures that implement learning processes in the brain; Schultz, 2006), scientific prudence further indicates that current beliefs about the supposed deterioration of cognitive faculties across the life span ought to be seriously questioned: We have shown that when the interaction between learning and processing is controlled for, the development of linguistic cognition across the life span is hopelessly mischaracterized by simple ideas about “decline”; if, as we expect, the effects we have described generalize to other cognitive domains, the associated societal and economic consequences will be profound.
Footnotes
Action Editor
Philippe G. Schyns served as action editor for this article.
Declaration of Conflicting Interests
The authors declared that they had no conflicts of interest with respect to their authorship or the publication of this article.
Open Practices
All data and materials have been made publicly available via the Open Science Framework and can be accessed at https://osf.io/g6tnd/. The complete Open Practices Disclosure for this article can be found at http://journals.sagepub.com/doi/suppl/10.1177/0956797617706393. This article has received the badges for Open Data and Open Materials. More information about the Open Practices badges can be found at
.
