Working Memory Affects Older Adults’ Use of Context in Spoken-Word Recognition

Abstract

Many older listeners report difficulties in understanding speech in noisy situations. Working memory and other cognitive skills may modulate older listeners’ ability to use context information to alleviate the effects of noise on spoken-word recognition. In the present study, we investigated whether verbal working memory predicts older adults’ ability to immediately use context information in the recognition of words embedded in sentences, presented in different listening conditions. In a phoneme-monitoring task, older adults were asked to detect as fast and as accurately as possible target phonemes in sentences spoken by a target speaker. Target speech was presented without noise, with fluctuating speech-shaped noise, or with competing speech from a single distractor speaker. The gradient measure of contextual probability (derived from a separate offline rating study) affected the speed of recognition. Contextual facilitation was modulated by older listeners’ verbal working memory (measured with a backward digit span task) and age across listening conditions. Working memory and age, as well as hearing loss, were also the most consistent predictors of overall listening performance. Older listeners’ immediate benefit from context in spoken-word recognition thus relates to their ability to keep and update a semantic representation of the sentence content in working memory.

Keywords

Speech perception; Working memory; Sentence context; Individual differences; Ageing

Many older listeners report difficulties in understanding speech, particularly in noisy situations. Noise impacts speech perception as it blends acoustically with speech, thereby making the speech signal less intelligible (Bregman, 1990; Brungart, 2001). In addition to this energetic masking, noise can interfere with speech perception at higher processing levels if the noise itself is linguistically processed. Informational masking in speech perception refers to the additional challenge that is elicited by the linguistic processing of a speech noise masker, on top of its energetic masking of the target speech (Watson, 2005). Listeners can compensate for the effects of masking by using, for example, the semantic context in the target speech stream. Words are recognized more easily when their surrounding words make them more predictable (Kalikow, Stevens, & Elliott, 1977), or if the general topic of the sentence is provided beforehand (Helfer & Freyman, 2008). In the present study, we investigated individual differences among older adults with respect to their immediate use of context information to facilitate spoken-word recognition across different listening conditions.

When listening to speech in noise, separating target speech from noise is complicated by the spectral and temporal overlap between the two signals, thus creating energetic masking (for a review of perceptual segregation, see Bregman, 1990). Whenever the target speech momentarily exceeds the noise in energy, listeners can exploit these “glimpses” to try and understand the speech message (Cooke, 2006). How often target speech glimpses over the noise signal is highly correlated with intelligibility (Cooke, 2006). If the masker itself is speech, then the processing of the target speech is further impacted by the linguistic processing of the competing noise masker (e.g., Brouwer, Van Engen, Calandruccio, & Bradlow, 2012; Garcia Lecumberri & Cooke, 2006; Tun, O'Kane, & Wingfield, 2002; Van Engen & Bradlow, 2007). Brungart and colleagues (Brungart, 2001; Brungart & Simpson, 2002, 2004, 2007) have shown that nonspeech noise with a similar temporal structure and long-term average spectrum as a speech masker does not produce informational masking. Informational masking in speech perception is therefore not due to the acoustic overlap of speech and noise but rather due to the linguistic processing of the speech noise (cf. also Kidd, Mason, Richards, Gallun, & Durlach, 2008; Pollack, 1975). Furthermore, the degree of informational masking is modified by the linguistic content of the masker: Competing multitalker babble containing high-frequency words impairs word recognition more than babble containing low-frequency words (Boulenger, Hoen, Ferragne, Pellegrino, & Meunier, 2010). This suggests that listeners process the words presented in the competing speech stream to a certain extent. Energetic masking thus mainly complicates peripheral processing, and informational masking additionally affects target speech processing at a higher cognitive level (Ezzatian, Li, Pichora-Fuller, & Schneider, 2011; Freyman, Balakrishnan, & Helfer, 2004).

Older adults’ speech understanding suffers more in noisy listening situations than that of younger adults. In addition to age-related hearing loss resulting in elevated audiometric thresholds, age-related declines in temporal and spectral processing (Gordon-Salant, Frisina, Popper, & Fay, 2010) may lead to poorer segregation of the target stream and the noise stream in older than in younger listeners (Huang et al., 2008), making older listeners more vulnerable to the effects of noise. Older adults also seem to suffer more from informational masking than younger adults due to age-related cognitive decline, for example, in attentional control (Tun et al., 2002, but see Schneider, Daneman, Murphy, & Kwong See, 2000). In this study, we examined the factors underlying individual differences in how much an older listener is impacted by energetic and informational masking.

The present study focused on older adults’ ability to use the contextual probability of a word within a sentence for its recognition in different listening conditions. Older and younger listeners have been shown to use contextual information to alleviate the masking effects of noise on the recognition of words (Benichov, Cox, Tun, & Wingfield, 2012). Words in sentences can be recognized more easily when they are more predictable from the preceding or following context (Marslen-Wilson & Tyler, 1980). This effect can be partially explained through associative and semantic priming (e.g., Meyer & Schvaneveldt, 1971) where words recognized earlier in a sentence activate related words, or at least constrain the lexical search space, and hence facilitate the processing of related later occurring words (Altmann & Kamide, 1999; Ferretti, McRae, & Hatherell, 2001; Knoeferle & Crocker, 2007). Context information can reduce the effects of noise masking in speech perception. Listeners tolerate more nonspeech or babble noise when identifying the final word of semantically biasing sentences than when identifying that of neutral sentences (Benichov et al., 2012; Bilger, Nuetzel, Rabinowitz, & Rzeczkowski, 1984; Gordon-Salant & Fitzgibbons, 1997; Kalikow et al., 1977).

Older adults as a group seem to rely more on contextual information in word processing than younger adults (e.g., Lash, Rogers, Zoller, & Wingfield, 2013; Nittrouer & Boothroyd, 1990; Pichora-Fuller, Schneider, & Daneman, 1995; Sheldon, Pichora-Fuller, & Schneider, 2008). Contextual information can mitigate the negative effects of age and hearing loss on spoken-word recognition (Lash et al., 2013) in tasks where processing time is not limited. Informational masking was reduced in both younger and older adults, for example, when participants were informed about the topic category of the sentence before stimulus presentation (Helfer & Freyman, 2008). Older adults benefited, however, more from meaningful sentence context than younger listeners in identifying speech in multitalker babble noise (Frisina & Frisina, 1997; Pichora-Fuller et al., 1995; Sheldon et al., 2008). Both age groups derived equal context benefit when the masker was broadband noise, which was spectrally shaped to compensate for the individual's hearing loss (Dubno, Ahlstrom, & Horwitz, 2000). This result suggests that age differences in the use of context may differ across listening conditions.

One reason why older adults may rely more on context than younger adults (Lash et al., 2013; Nittrouer & Boothroyd, 1990; Pichora-Fuller et al., 1995; Sommers & Danielson, 1999) could be their greater linguistic expertise, built up over many years of listening to their native language. Verbal ability seems to be preserved with ageing. Older adults can even outperform younger adults on vocabulary tasks (Park et al., 2002; Verhaeghen, 2003). Furthermore, greater verbal expertise in older adults has been evidenced as increased effects of lexical frequency on spoken word recognition, compared to those found for younger adults (Revill & Spieler, 2012).

The apparent greater reliance of older than of younger adults on semantic context could, however, also be a result of the use of nonspeeded listening tasks. The speech-reception threshold tasks (Dubno et al., 2000; Frisina & Frisina, 1997; Pichora-Fuller et al., 1995; Sheldon et al., 2008) used in these studies allowed participants unlimited time to prepare their response. Consequently, older listeners may have had ample time for the semantic processing of the sentence in order to facilitate word recognition. Only the final result of the speech perception process can be evaluated in such nonspeeded tasks, and not the time or effort it took to obtain the result. Recent evidence indeed suggests that older adults’ greater reliance on context may reflect a (postperceptual) bias to respond consistently with the context, rather than their greater skill in using context during word recognition (Rogers, Jacoby, & Sommers, 2012). Preliminary data presented by Aydelott, Leech, and Crinion (2010) suggest that older adults are able to use context information to benefit processing already during spoken-word recognition in clear conditions, but that challenging listening conditions may disproportionately affect older listeners’ ability to make use of semantic context during word recognition, relative to younger listeners. Peelle, Troiani, Wingfield, and Grossman (2010) investigated the connectivity between neural processing regions involved in auditory speech comprehension in younger and older adults. Their age group comparison results suggested that older adults’ limited ability to coordinate activity between processing regions may play a role in older adults’ difficulty with sentence comprehension under difficult listening conditions. This result ties in with the suggestion that age groups differ in how quickly they can benefit from context. 
The critical question we asked here is whether older adults can process contextual semantic content rapidly enough to facilitate processing as the spoken target sentence is unfolding.

In addition, we asked which factors determine older listeners’ individual differences in the use of context. The use of context by older adults may differ not only across listening conditions, but also across individuals. In addition to hearing ability, cognitive abilities could explain individual differences, especially in more difficult listening conditions. The less intelligible the speech becomes in adverse listening conditions, the greater the need for cognitive resources to filter out irrelevant information and to fill in any missing information by using context information to comprehend the speech content (Francis & Nusbaum, 2009). This is in line with the ease of language understanding (ELU) model by Rönnberg and colleagues (Rönnberg, 2003; Rönnberg, Rudner, Foo, & Lunner, 2008) that assumes that cognitive processes become more important when listening conditions are more challenging, due to hearing loss, background noise, or both. Cognitive processes thus become more relevant if implicit matching of linguistic input to stored representations in long-term memory is not sufficient for lexical access. Listening in difficult conditions has been assumed to contribute to mental effort (Koelewijn, Zekveld, Festen, & Kramer, 2012; McCoy et al., 2005; Piquado, Isaacowitz, & Wingfield, 2010; Rabbitt, 1968; Tun, McCoy, & Wingfield, 2009; Zekveld, Kramer, & Festen, 2010, 2011a). Verbal working memory in particular could play a role in how efficiently contextual probability can be used across listening conditions (but see Otten & van Berkum, 2009). Differences in working memory may also explain why older listeners benefited more than younger adults from context preceding a target word but less so from following context (Wingfield, Alexander, & Cavigelli, 1994). Older adults may have greater difficulty than younger adults in maintaining an unidentified word in working memory for it to be recognized later. Zekveld et al. (2011b) found that adults with better verbal working memory (as indexed by reading span performance) perceived spoken sentences in noise better in a condition with mismatching text cues than did adults with poorer working memory. This correlation between performance and working memory was not found in a condition with matching cues. This suggests that older adults with better verbal working memory were better at inhibiting distraction from irrelevant (written) context. Similarly, in a follow-up study on sentence comprehension cued by matching or mismatching text cues, Zekveld, Rudner, Johnsrude, Heslenfeld, and Rönnberg (2012) showed that better verbal working memory (indexed by reading span) was associated with greater intelligibility benefit from the related text cues, and with less speech-related activation in the left superior temporal gyrus and left inferior frontal gyrus. Zekveld et al. (2012) argue that these results agree with the hypothesis of “neural efficiency” (Neubauer & Fink, 2009) in that those listeners with better working memory are able to use context more efficiently for speech comprehension. We investigated here whether individual differences in working memory affect older listeners’ ability to use context information during the recognition of spoken words. Critically, we also investigated whether age differences among these older adults predict the ability to use context efficiently, beyond working memory differences. If so, this would mean that working memory differences only partially explain age differences in the ability to benefit from context immediately.

In particular, we focused on individual differences among older adults in the ability to use contextual probability for spoken-word recognition in different listening conditions. Spoken word processing as a function of contextual probability was investigated in a condition without noise (no-noise condition), in a condition with fluctuating speech-shaped noise (nonspeech-noise condition), and in a condition with competing speech from a single distractor speaker (speech-noise condition). In contrast to prior listening studies, we investigated older adults’ ability to process semantic meaning rapidly enough to benefit the recognition of words during sentence processing. So far, studies that have investigated individual differences, rather than age group differences, in the rapid use of contextual probability were only on reading (Federmeier, Kutas, & Schul, 2010; Lee & Federmeier, 2011) and thus did not involve any adverse processing conditions.

In the present study, we asked participants to monitor meaningful spoken sentences of a target speaker for the occurrence of a specific target phoneme. This phoneme-monitoring task is a speeded-response task as processing time is limited (Connine & Titone, 1996). Despite its name, this task reflects lexical processing when meaningful sentences are monitored (Cutler, Mehler, Norris, & Segui, 1987; Cutler & Norris, 1979; Mirman, McClelland, Holt, & Magnuson, 2008). Phoneme monitoring can thus be used to measure the effect of context on the speed of lexical access. The task itself does not interfere with processing the sentences for meaning (Brunner & Pisoni, 1982; Ford et al., 1996; Foss & Blank, 1980). Phonemes are detected earlier in high-frequency words than in low-frequency words (Dupoux & Mehler, 1990; Eimas & Nygaard, 1992) and earlier in semantically biasing sentences than in neutral sentences (Eimas & Nygaard, 1992; Foss & Jenkins, 1973). Target-bearing words varied in our study in their predictability from their preceding sentence context. Contextual probability was manipulated as a continuous variable (see e.g., DeLong, Urbach, & Kutas, 2005) rather than a dichotomous one. Target words in our study had a contextual predictability within the intermediate range.

We examined which characteristics of older listeners relate to the use of context during sentence processing. The main hypothesis investigated was that verbal working memory relates to efficient use of contextual probability by older listeners—that is, those older adults who are better able to keep and update a semantic representation of the sentence content will show more context facilitation of spoken word recognition in a speeded listening task. Possibly, context effects are stronger for those older listeners with more linguistic expertise, as indexed by vocabulary knowledge. We also tested whether there were additional age effects on the ability to immediately use context to facilitate spoken word recognition. Due to age-related decline in other auditory and cognitive abilities, age may affect the degree to which listeners are affected by energetic and informational masking. We also tested specifically whether individual selective attention ability was a potential predictor of the degree to which listeners were affected by noise masking, informational masking in particular. Attentional abilities seem to relate to listening performance in situations with distraction from meaningful speech (Janse, 2012; Tun et al., 2002). Attention has been proposed to involve three attentional networks (Fan, Raz, & Posner, 2003; Posner & Petersen, 1990), carrying out the functions of alerting (defined as achieving and maintaining an alert state), orienting (defined as selection of information from sensory input), and executive control (defined as resolving conflict among responses). The ability to successfully ignore competing speech would be considered as being governed by the listener's executive control ability. We used a flanker task to assess listeners’ executive control ability as the ability to ignore incongruent and irrelevant flanking signals. 
Results from the flanker task thus allow us to test whether susceptibility to distraction from competing speech relates to general executive control abilities (measured in the visual modality, and not involving language). Hearing sensitivity was also assessed as a standard procedure as hearing sensitivity is known to affect older listeners’ performance in a speech perception study. Hearing sensitivity was therefore viewed as a control variable that was partialled out of age differences within our older adult sample.

Experimental Study

Method

Participants

Sixty-three community-dwelling older adults (22 men) were paid for their participation in the experiment (€8 per hour). All participants were tested at the Max Planck Institute for Psycholinguistics in two 1-hour sessions. Participants completed the phoneme-monitoring study at the beginning of the second session. The background tests were done afterwards and after another main study conducted in the first session (cf. Janse & Adank, 2012). Participants’ mean age was 73.3 years (SD = 5.3; range: 64 to 89 years). This relatively wide age range was needed to investigate the influence of age among older adults. All older adults were native monolingual speakers of Dutch, with no self-reported history of oral or written language impairment or of neurological or psychiatric problems. None of the participants wore hearing aids in daily life. No other criteria were set with respect to participants’ hearing ability. Participants wore, when applicable, their appropriate glasses.

Materials

The sentence materials we used were the audio tracks of a subset of sentences recorded as videos for Jesse and Janse (2012). Sixty sentences contained the target phoneme /p/, and 60 sentences contained the target phoneme /k/. Targets were placed word-initially in these carrier sentences of varying length, such that a target phoneme only occurred once in a sentence. Half of the target-bearing words in each target set were monosyllabic; the other half were bisyllabic with lexical stress on the initial syllable. The four word sets (monosyllabic and bisyllabic /p/ and /k/ words) were matched for lexical frequency (cf. Jesse & Janse, 2012). Sixty foil sentences for each target phoneme set were selected that were similar in complexity and structure to the target sentences, but did not contain the respective target phoneme. Three additional target and foil sentences were created for each target phoneme as practice materials.

The position of the target-bearing words within the sentences varied freely. Target words were not highly predictable on the basis of their preceding (or following) context [e.g., “De circusartieste had al jaren een pil die haar zenuwen onder controle hield” (“The circus artist had a pill for years that kept her nerves under control”); “Dit is niet bepaald een klus om blij mee te zijn” (“This is not exactly a chore to be happy with”); the target-bearing words are pil and klus]. Cloze probability was assessed for all target words in a separate, web-based study. Twenty-three students of the Radboud University Nijmegen were paid for their participation in this self-paced web experiment. We only collected cloze probability ratings from younger adults, as past research has shown no age differences in cloze probability ratings (Federmeier, McLennan, De Ochoa, & Kutas, 2002; Lahar, Tun, & Wingfield, 2004; Lovelace & Coon, 1991). Participants first read the target sentences up to the target word and were asked to continue the sentence fragment with one word. Out of the 128 target sentences of the complete Jesse and Janse (2012) set, only six /p/- and six /k/-words were correctly anticipated (M = 12% for these sentences, range = 4% to 26%). Next, participants read the sentence fragments again, but this time the sentence fragments included the target word. Target words were underlined. Participants were asked to rate on a scale from 1 (does not fit at all) to 7 (perfect fit) how well a target word fitted the preceding sentence fragment. Out of the original 128 items, three sets of 40 items (each set consisting of half /p/ items and half /k/ items) were selected that were matched in their ratings (M = 4.69, SD = 1.03; set means were 4.61, 4.73, and 4.72), F(2, 78) < 1, ns. These three matched sets were rotated in the phoneme-monitoring study through the three experimental noise conditions. 
Averaged over the three item sets, ratings for the /p/ items were slightly higher (M = 4.84, SD = 1.04) than for the /k/ items (M = 4.53, SD = 1.01), but this difference was not significant (t = 1.62, p > .1). Furthermore, a target's contextual probability rating was not correlated with its position in the sentence (measured in ms from sentence onset; Pearson's r = −.11, p > .1). Both context rating and position in the sentence were therefore entered as continuous item variables in the analyses of the phoneme-monitoring study.
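As an illustration of the cloze measure described above, the following minimal sketch computes the proportion of participants whose one-word continuation matches the target word. The function name and the example responses are hypothetical and are not taken from the study materials.

```python
def cloze_probability(completions, target):
    """Proportion of one-word completions that match the target word."""
    if not completions:
        raise ValueError("need at least one completion")
    # Compare case-insensitively and ignore surrounding whitespace.
    normalized = [c.strip().lower() for c in completions]
    return normalized.count(target.strip().lower()) / len(normalized)


# Hypothetical example: 23 raters, 3 of whom continued the fragment with "pil".
responses = ["pil"] * 3 + ["truc"] * 20
print(round(cloze_probability(responses, "pil"), 2))  # 0.13
```

With 23 raters, a single matching completion corresponds to a cloze probability of about 4%, which is in line with the lower end of the range reported above.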

All sentences were presented in the experiment without a masker (no-noise condition), with a speech masker (speech-noise condition), and with a speech-shaped masker (nonspeech-noise condition). To create the materials for the speech-noise condition, each sentence was assigned to a longer, meaningful masking sentence spoken by a second female native speaker of Dutch. Masking sentences did not contain the respective target phoneme. Each masking sentence was cut off at the end to match the duration of its assigned target speaker sentence. Mean F0 of the target speaker was 224 Hz (SD = 42 Hz), and mean F0 of the distractor speaker was 202 Hz (SD = 31 Hz), thus creating spectral overlap between target and masker speech (mean pitch difference being 1.8 semitones; cf. Assmann & Summerfield, 1990; Brokx & Nooteboom, 1982; Jackson & Moore, 2013). To create energy-matched nonspeech masker versions of each masking sentence for the nonspeech-noise condition, speech-shaped noise was created on the basis of the competing speaker's long-term spectrum. The long-term average spectrum was calculated over a sound file consisting of a concatenation of 10 randomly selected foil sentences of the masking speaker (5 foil sentences meant to go with /p/ target sentences and 5 meant to go with /k/ target sentences), with silences longer than 100 ms removed. The long-term average spectrum was used to spectrally shape a broadband noise file. For each target and foil sentence, the amplitude contour of its masking sentence was computed. Each sentence's amplitude contour was applied to a fragment of the same length taken from the speech-shaped noise file. The resulting energetic maskers hence varied in intensity over time, just as the speech-masker foil sentences they were respectively based on. The target-to-noise ratio was set to +2 dB (average intensity over the sentence) in both masking conditions, similar to the signal-to-noise ratio for the older adults in Jesse and Janse (2012). 
Target speech and noise were mixed and were presented binaurally.
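The reported pitch difference and the target-to-noise ratio can be illustrated with a short sketch. The semitone formula is the standard 12 times the base-2 logarithm of the frequency ratio. The gain function assumes that the ratio is defined over root-mean-square (RMS) amplitudes averaged over the sentence; this is an assumption made for illustration, as the exact scaling procedure is not specified here, and both function names are hypothetical.

```python
import math


def semitone_difference(f_high_hz, f_low_hz):
    """Interval between two frequencies in semitones (12 per octave)."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)


def noise_gain_for_snr(speech_rms, noise_rms, snr_db):
    """Linear gain for the noise so that the speech-to-noise RMS ratio
    equals snr_db (assumption: the ratio is defined over RMS amplitude)."""
    target_noise_rms = speech_rms / (10.0 ** (snr_db / 20.0))
    return target_noise_rms / noise_rms


# Mean F0 values of the target and distractor speakers as reported above.
print(round(semitone_difference(224.0, 202.0), 1))  # 1.8

# Gain needed to place equal-RMS speech and noise at +2 dB.
gain = noise_gain_for_snr(speech_rms=1.0, noise_rms=1.0, snr_db=2.0)
```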

Three different experimental lists were created to rotate the three target sets through the three noise conditions. Additionally, two versions of each list were created to balance the order in which participants were presented with the /p/ target phoneme block and the /k/ block (/p/ first or /k/ first), yielding six different experimental lists in total.
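The counterbalancing scheme can be sketched as a rotation of item sets through conditions, crossed with the two block orders. The set labels and the dictionary layout below are hypothetical; only the 3 x 2 design follows from the description above.

```python
from itertools import product

CONDITIONS = ["no-noise", "nonspeech-noise", "speech-noise"]
ITEM_SETS = ["set1", "set2", "set3"]  # hypothetical labels for the matched sets
BLOCK_ORDERS = [("p", "k"), ("k", "p")]  # which target phoneme block comes first


def build_lists():
    """Rotate item sets through conditions and cross with block order."""
    lists = []
    for shift, order in product(range(len(CONDITIONS)), BLOCK_ORDERS):
        assignment = {
            ITEM_SETS[i]: CONDITIONS[(i + shift) % len(CONDITIONS)]
            for i in range(len(ITEM_SETS))
        }
        lists.append({"assignment": assignment, "block_order": order})
    return lists


lists = build_lists()
print(len(lists))  # 6
```

Across the six lists, each item set appears in each listening condition equally often, so item-specific effects are not confounded with condition.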

Procedure

For the phoneme-monitoring task, participants were tested individually in a sound-attenuated booth. Audio was presented binaurally over headphones at a fixed listening level (80 dB SPL). Participants were first familiarized with the voice of the target speaker with a 40-second audio fragment in which the speaker introduced herself as the target speaker and provided some instructions. Participants then received written instructions explaining that on each trial they would hear the target speaker, without noise, embedded in noise, or while hearing a competing speaker. Participants were asked to monitor the speech of the target speaker on every trial for a word beginning with the respective target phoneme [e.g., “p” in plant (“plant”) or “k” in kabel (“cable”)]. The detection of a target phoneme was to be indicated by pressing a button on a button box. Accuracy and speed were to be maximized. If there was no target phoneme within a presented sentence, no response was to be given. Participants first obtained a practice block, presenting one target and one foil practice trial in each of the three listening conditions. Participants then proceeded with two test blocks (one for each target phoneme), each consisting of 120 trials. Presentation of listening condition was mixed within each block. Participants were able to take a short break in between these blocks.

Practice and experimental trials were structurally the same: Each trial began by showing the target phoneme in a white font (Arial, font size 60) for 1000 ms centred on the black computer screen before a sentence was played. Stimulus materials were always played completely, regardless of whether or not a response was given. Responses were collected up to 1500 ms after sentence offset. After another 500 ms, participants proceeded with the next trial. The experiment was controlled by Presentation experimental software (Version 14, Neurobehavioral Systems, www.neurobs.com).

Auditory, cognitive, and linguistic background tests

Hearing acuity

Air-conduction hearing thresholds were assessed with a portable Maico ST 25 audiometer in a sound-attenuating booth. Mean thresholds are given in Figure 1 for octave frequencies from 0.5 to 8 kHz. Given the high-frequency hearing loss associated with ageing, the mean pure-tone average thresholds (PTA) were computed over 1, 2 and 4 kHz (instead of over 0.5, 1 and 2 kHz) for each participant's better ear. Higher PTAs indicate poorer hearing acuity. The overall mean PTA threshold was 25.61 dB HL (SD = 11.11, range = 3.33–48.33 dB HL). Even though none of the participants wore hearing aids, some had clinically significant hearing loss.
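The better-ear pure-tone average can be sketched as follows. The sketch assumes that the better ear is the one with the lower average over the three frequencies; the function name and the example thresholds are hypothetical.

```python
def better_ear_pta(left, right, freqs_khz=(1, 2, 4)):
    """Pure-tone average (dB HL) over freqs_khz for the better ear.

    left / right map frequency in kHz to the threshold in dB HL.
    Assumption: the better ear is the one with the lower average.
    """
    def pta(ear):
        return sum(ear[f] for f in freqs_khz) / len(freqs_khz)

    return min(pta(left), pta(right))


# Hypothetical thresholds for one participant.
left_ear = {1: 20, 2: 25, 4: 40}
right_ear = {1: 15, 2: 30, 4: 45}
print(round(better_ear_pta(left_ear, right_ear), 2))  # 28.33
```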

Figure 1.

Mean thresholds at several octave frequencies (in the better ear). Error bars represent one standard deviation.

Working memory

A computerized version of the backward digit span task (a subpart of the Wechsler Adult Intelligence Scale; Wechsler, 2004) was used to measure individual verbal working memory capacity. The backward version of this task was used to assess the ability to both store and manipulate information in verbal working memory, as the forward version would only reflect passive storage. In this task, a series of digits appeared sequentially in the centre of the computer screen for one second. The interval between consecutive digits was one second. Digits were presented in a large white font (Arial, font size 100) against a black background. After the presentation of a digit sequence (e.g., 3 6 2), participants were prompted to recall the digits in reversed order (e.g., 2 6 3). Participants first practised with 2 three-digit trials, before being tested twice on sequences including two to eight digits (i.e., 14 trials in total). Trials were the same for all participants regardless of their performance. Individual performance on this task was determined as the proportion of correctly recalled digit sequences out of 14 test trials, regardless of digit sequence length. Larger proportions thus indicated better working memory. Mean proportion correct in this task was .36 (SD = .13, range = .14–.86).

Selective attention

In a computerized version of the classic flanker task (Eriksen & Eriksen, 1974) participants indicated the direction of a visually presented arrow by pressing the “z” (left) or “/” (right) key on a keyboard. Speed and accuracy had to be maximized. The target (“>” or “<”) was presented in the middle of four other arrows, which either pointed in the same (or congruent) direction as the target (e.g., for target “>”, “> > > > >”), or in a different (i.e., incongruent) direction (“< < > < <”). In a neutral condition, the target was flanked by dashes (“– > –”). All symbols were shown in white in Arial (font size 80) against a black background. Each trial started with a 400-Hz pure-tone beep presented at 50 dB SPL before a fixation cross was shown for 250 ms. A symbol string was then presented for 1500 ms, while responses were collected. Intertrial time was 1000 ms. The two targets (“<”, “>”) were presented 12 times in each of the three flanker conditions. The order of these 72 trials was randomized for each participant. Testing began with one repetition of each of the six stimuli for practice.

Three participants were 33% or less correct in the incongruent condition. Consequently, their mean reaction times (RTs) for this condition were considered to be invalid. The mean accuracy without these three participants was 94% correct (SD = 11). As expected, accuracy was lowest and most variable in the incongruent condition (88%, SD = 20). Accuracy in the congruent and neutral conditions was 97% (SD = 8) and 97% (SD = 8), respectively. Mean response times for correct responses, measured from visual presentation onset, were 768 ms in the incongruent condition (SD = 297), 620 ms in the congruent condition (SD = 201), and 607 ms in the neutral condition (SD = 198). RTs were log-transformed in order to reduce the skew and nonnormality of their distribution (see e.g., Quené & van den Bergh, 2008), as statistical tests assume data distributions to be normal.

A repeated measures analysis of variance (ANOVA) across participants showed a significant effect of condition on log-transformed RTs, after Greenhouse–Geisser correction for violation of the sphericity assumption, F(1.353, 83.897) = 141.957, p < .001, partial η² = .696. Bonferroni-corrected pairwise comparisons (adopting a criterion level of .05/3 = .017) showed that responses in the incongruent condition were significantly slower than those in the neutral (mean difference = −0.228, SE = 0.018, p < .001) and congruent conditions (mean difference = −0.204, SE = 0.016, p < .001), and that the difference in response time between the neutral and congruent conditions was not significant (mean difference = 0.023, SE = 0.009, p = .026). Individual performance on this task was determined by log-transforming a participant's mean RTs in the neutral and incongruent conditions, and then computing the ratio of each participant's log RT in the incongruent condition, divided by the log RT in the neutral condition. Larger ratios indicated poorer selective attention skills. The mean flanker ratio was 1.04 (SD = 0.02, range = 0.99–1.10). This mean ratio was used as individual score for the three participants for whom the mean RT in the incongruent condition could not be computed.
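The flanker ratio described above can be sketched in a few lines. The function name is hypothetical. The example feeds in the group mean RTs reported earlier purely as illustrative input; in the study, the ratio was computed per participant. Note that a ratio of two logarithms does not depend on the base of the logarithm.

```python
import math


def flanker_ratio(mean_rt_incongruent_ms, mean_rt_neutral_ms):
    """Log mean RT in the incongruent condition divided by log mean RT
    in the neutral condition; larger values indicate more interference."""
    return math.log(mean_rt_incongruent_ms) / math.log(mean_rt_neutral_ms)


# Illustrative input: the group mean RTs for the two conditions.
print(round(flanker_ratio(768, 607), 2))  # 1.04
```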

Vocabulary test

A 60-item receptive multiple-choice test was used to assess vocabulary size (Andringa, Olsthoorn, van Beuningen, Schoonen, & Hulstijn, 2012). The test by Andringa et al. (2012) consists of a selection of items from Hazenberg and Hulstijn's (1996) test for second-language speakers of Dutch and new items that make the test suitable for native speakers. Target words were presented on a computer screen (Courier, font size 15) in neutral carrier sentences [e.g., the target word mentaliteit (“mentality”) was presented in the carrier phrase “Wat een vreemde mentaliteit!” (“What a strange mentality!”)]. Participants were asked to choose the best description of the word's meaning out of five alternatives—for example, for “mentality”: (a) table, (b) person, (c) way of thinking, (d) atmosphere, and (e) I really don't know. The last option was always “I really don't know”. Individual scores were defined as the proportion of correct items (out of 60). Higher scores hence indicated better vocabulary knowledge. The mean score was .87 (SD = .09, range = .57–.98).

Predictor intercorrelations

Table 1 gives an overview of the intercorrelations between all background measures. The Bonferroni-corrected alpha level for multiple testing was .005. As expected, hearing loss and age were correlated in this sample of older adults: Hearing loss increased significantly with ageing (r = .40, p = .001). In addition, those with more hearing loss generally had poorer working memory (r = −.44, p = .0003). Participants with better working memory also had better vocabulary scores (r = .37, p = .003).

Table 1.

Pearson correlation coefficients indicating intercorrelations between participant characteristics

Characteristic	Age	Hearing loss	Working memory	Selective attention
Age
Hearing loss	.40*
Working memory	−.10	−.44**
Selective attention	−.02	.06	.01
Vocabulary	.08	−.19	.37*	−.18

* p < .01.

** p < .001.

To avoid collinearity in our predictor measures, we partialled out hearing loss from working memory and from age. The highest remaining correlation, between working memory (controlled for hearing) and vocabulary, no longer reached the Bonferroni-corrected threshold (r = .32, p = .011; all other correlations had r values smaller than .20).
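One common way to partial out a variable is to regress the predictor on it and keep the residuals, which is consistent with the later reference to "residuals of age and of working memory". The sketch below assumes a simple least-squares regression; the helper names and the six data points are hypothetical and are not data from the study.

```python
from statistics import mean

def residualize(y, x):
    """Return the residuals of y after a simple least-squares regression on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    num = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    den = (sum((ai - ma) ** 2 for ai in a) * sum((bi - mb) ** 2 for bi in b)) ** 0.5
    return num / den

# Hypothetical values for six listeners (illustration only).
hearing_loss = [12.0, 18.0, 25.0, 31.0, 38.0, 44.0]  # pure-tone average, dB HL
working_memory = [9.0, 8.0, 8.0, 6.0, 7.0, 5.0]      # backward digit span score

wm_resid = residualize(working_memory, hearing_loss)
print(pearson_r(working_memory, hearing_loss) < 0)    # raw scores correlate negatively
print(abs(pearson_r(wm_resid, hearing_loss)) < 1e-9)  # residuals are uncorrelated with hearing loss
```

The residualized score keeps the variation in working memory that hearing loss cannot account for linearly, so both can enter the same model without sharing variance.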

Results

Correct responses to targets were defined as all responses that occurred within 2.5 standard deviations of the mean response time after target onset (i.e., within 4843 ms; M = 1400 ms, SD = 1377 ms), calculated across all conditions. Response times for these hits were measured from the end of the plosive's release. False alarms were defined as all responses that were given during the presentation of a filler sentence (starting from filler sentence onset up to 1500 ms after sentence offset). Detection sensitivity (d′) was calculated as the difference in z-transformed hit and false-alarm rates (Macmillan & Creelman, 1991). Hit rates, false-alarm rates, and d′s in the three listening conditions are presented in Table 2.
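The d′ computation can be sketched as follows, using the standard normal quantile function from the Python standard library. The function name is hypothetical. The input rates are the no-noise group means from Table 2, used only as an example; since d′ was presumably computed per participant and then averaged, the result need not equal the mean d′ in the table.

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate):
    """Detection sensitivity: z(hit rate) minus z(false-alarm rate)."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)

# Illustrative input: 94% hits and 3.5% false alarms.
print(round(d_prime(0.94, 0.035), 2))
```

Rates of exactly 0 or 1 have no finite z-score and would raise an error here; how such cases were handled is not stated in the text, so no correction is applied in this sketch.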

Table 2.

Phoneme-monitoring performance in the three listening conditions

Condition	Hit (%) M	Hit (%) SD	False alarm (%) M	False alarm (%) SD	d′ M	d′ SD	Hit RT (ms) M	Hit RT (ms) SD
No noise	94	7.2	3.5	3.0	3.62	0.66	725	443
Nonspeech noise	76	15.4	5.9	5.3	2.45	0.80	962	612
Speech noise	71	19.4	4.7	4.7	2.43	0.86	921	558

Note: RT = reaction time.

All results were analysed with linear mixed-effect models, including both items and participants as random effects. These offer several advantages over repeated measures ANOVA, as laid out in introductory papers (Baayen, Davidson, & Bates, 2008; Baayen, Tweedie, & Schreuder, 2002; Quené & van den Bergh, 2008). One relevant advantage is that both continuous and categorical predictors can be included in a model (or can be modelled to interact). To deal with the categorical nature of the hit and false-alarm measures, a binomial logit linking function between responses (0 or 1) and predictor variables was included in the hit and false-alarm models (Jaeger, 2008). The no-noise condition was mapped onto the intercept. Target phoneme was contrast-coded (−.5 for /p/ and +.5 for /k/). Regression weights for categorical factors reflect the adjustments to the intercept across conditions. Effects of trial number, contextual probability rating, and position of the target in the sentence (ms from sentence onset) were evaluated as numerical factors. Regression weights of numerical factors indicate adjustments to the slope of the regression line. The best fitting model for each data set was established through systematic stepwise model comparisons using likelihood ratio tests.
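A likelihood ratio test between two nested models that differ by one parameter can be sketched as below. The function names and the two log-likelihood values are hypothetical. They are chosen so that the test statistic equals 3.485, the value reported for adding probability rating to the hits model.

```python
import math

def lrt_statistic(loglik_simple, loglik_complex):
    """Likelihood ratio statistic: twice the gain in log-likelihood of the larger model."""
    return 2.0 * (loglik_complex - loglik_simple)

def lrt_p_value_df1(chi_square):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))

# Hypothetical log-likelihoods of a model without and with one extra predictor.
stat = lrt_statistic(-1500.0, -1498.2575)
print(round(stat, 3), round(lrt_p_value_df1(stat), 3))
```

The closed form with `math.erfc` holds only for one degree of freedom. Comparisons that add several parameters at once need the general chi-square distribution, which is not in the standard library.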

First, we established the best fitting models for accuracy measures (hits, false alarms, d′s), and hit RTs by evaluating the design variables (listening condition, contextual probability rating, target position) and control variables (target phoneme, experimental trial number). Control variables were only included here as they may account for some of the variability in the data and reduce residual noise. The final best fitting models served as starting points for the individual differences analyses. For the individual differences analyses, interactions between the five individual participant characteristics on the one hand (hearing loss, vocabulary accuracy, and selective attention scores, and residuals of age and of working memory) and listening condition, contextual probability rating, and target position on the other hand were evaluated. Age and working memory effects are only reported with hearing loss partialled out. Nonsignificant predictors were taken out of the models in a stepwise fashion, starting from the highest order interaction, until no further predictor could be taken out without a significant loss of fit. Main effects of factors that were part of a significant interaction remained in the models. The results of all best fitting models are provided in Table 3. Only significant results and results relating to our design variables are reported below.

Table 3.

Results of best fitting models for the different dependent variables

Model	Hits	False alarms	d′	Hit RT
No-noise condition on intercept
Effect
Nonspeech noise	−2.231***	0.522***	−0.779***	0.296***
Speech noise	−2.887***	0.277^#	−1.069***	0.281***
Target phoneme	−0.346			−0.033
Trial number	−0.005^#	−0.007***		0.001***
Target position	0.248*			−0.043**
Hearing loss	−0.075***	0.025***	−0.037***	0.010***
Working memory (residuals)	0.024*		0.015**	−0.0001
Age (residuals)	−0.052**			0.0001
Vocabulary		−1.549*
Selective attention				2.048
Nonspeech noise × Target phoneme	0.685**			−0.085***
Speech noise × Target phoneme	0.388^#			−0.011
Nonspeech noise × Trial number	0.001
Speech noise × Trial number	0.007*
Nonspeech noise × Target position				−0.047***
Speech noise × Target position				−0.033**
Nonspeech noise × Hearing loss	0.015		−0.007
Speech noise × Hearing loss	−0.004		−0.020**
Nonspeech noise × Working memory	−0.026*		−0.016**
Speech noise × Working memory	−0.012		−0.005
Nonspeech noise × Selective attention				−1.726**
Speech noise × Selective attention				−0.291
Context rating				−0.039*
Context rating × Target phoneme				0.072*
Context rating × Age				0.002*
Context rating × Working memory				−0.001*
Target position × Hearing loss				−0.002***
Nonspeech noise condition on intercept
Speech noise	−0.657***	−0.245^#	−0.290^#	−0.015

Note: Estimates and a significance indication. RT = reaction time.

^# p < .1.

* p < .05.

** p < .01.

*** p < .001.

Accuracy results

Hits

The best fitting model for hits showed that fewer targets were detected in the nonspeech-noise condition (β = −2.231, SE = 0.237, p < .001) and in the speech-noise condition (β = −2.887, SE = 0.234, p < .001) than in the no-noise condition. Importantly, when the nonspeech-noise condition was mapped onto the intercept, hit rates in the speech-noise condition were lower than those in the nonspeech-noise condition (β = −0.657, SE = 0.151, p < .001). Accuracy was hence affected by energetic and by informational masking. Participants detected targets more reliably the later they occurred in the sentence (β = 0.248, SE = 0.101, p < .05). The effect of probability rating was not included in the best fitting model as its inclusion only marginally improved the model's fit, χ²(1) = 3.485, p = .062.

Participants with poorer hearing detected fewer targets (β = −0.074, SE = 0.012, p < .001), and the older the participant was, the fewer targets were detected (β = −0.052, SE = 0.018, p < .01). Participants with better working memory had higher hit rates (β = 0.024, SE = 0.011, p < .05), but less so in the nonspeech-noise condition (β = −0.026, SE = 0.010, p < .05). There were no other significant interactions between individual participant characteristics and condition or context rating. In summary, hearing loss, age, and working memory predicted hit accuracy. The association between working memory and accuracy was particularly found in the no-noise and the speech-noise conditions.
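Because the hit model uses a logit link, its coefficients are on the log-odds scale, and exponentiating them gives odds ratios. The sketch below applies this to the condition effects reported above. The dictionary names are hypothetical. As the model also contains interactions, each odds ratio applies with the other predictors at their reference or zero values.

```python
import math

# Condition coefficients (log-odds) from the best fitting hit model.
hit_model_log_odds = {
    "nonspeech noise vs. no noise": -2.231,
    "speech noise vs. no noise": -2.887,
    "speech noise vs. nonspeech noise": -0.657,
}

# An odds ratio below 1 means lower odds of detecting the target.
odds_ratios = {contrast: math.exp(beta) for contrast, beta in hit_model_log_odds.items()}

for contrast, ratio in odds_ratios.items():
    print(f"{contrast}: odds ratio = {ratio:.3f}")
```

An odds ratio of roughly 0.1 for nonspeech noise, for example, means that the odds of a hit were about one tenth of those in the no-noise condition.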

False alarms

The only random factor included in the false-alarm data analysis was participant, since several foil sentences did not elicit any false alarms. Relative to the no-noise condition, there were more false alarm responses in the nonspeech-noise condition (β = 0.522, SE = 0.136, p < .001). There was no statistical evidence for informational masking. False alarms were hence only affected by energetic and not by informational masking. Participants with more hearing loss gave more false-alarm responses (β = 0.024, SE = 0.007, p < .001). Further, participants with better vocabulary knowledge gave fewer false-alarm responses across all conditions (β = −1.549, SE = 0.780, p < .05). In summary, better hearing and better vocabulary knowledge were associated with lower false-alarm rates.

d′

We also analysed d′s as a measure of detection sensitivity. Higher d′ values indicate greater detection sensitivity, whilst accounting for the fact that (certain) participants may have a bias to respond that the target is present. Given the nature of the d′ statistic, only participant and not item was included as a random variable. Relative to the no-noise condition, d′ was lower in the nonspeech-noise (β = −0.779, SE = 0.156, p < .001) and in the speech-noise condition (β = −1.069, SE = 0.156, p < .001). When the nonspeech-noise condition was mapped onto the intercept, d′s in the two noise conditions did not differ from one another (β = −0.025, SE = 0.079, p > .1). Detection sensitivity was thus affected by energetic masking, but not further affected by informational masking.

Participants with poorer hearing had lower d′s (β = −0.037, SE = 0.005, p < .001). This was true in the no-noise condition (mapped on the intercept) and in the nonspeech-noise condition (as the hearing effect was not stronger in the nonspeech than in the no-noise condition: β = −0.007, SE = 0.007, ns), but was even more the case in the speech-noise condition, as indicated by an interaction between hearing loss and the speech-noise condition (β = −0.020, SE = 0.007, p < .01). Further, participants with better working memory had higher d′s, both in the no-noise condition (β = 0.015, SE = 0.005, p < .01) and in the speech-noise condition (as the interaction was not significant: β = −0.005, SE = 0.005, ns). The strength of this relationship between working memory and d′ was decreased, however, in the nonspeech-noise condition (β = −0.016, SE = 0.005, p < .01). In summary, better hearing and better working memory were associated with better detection sensitivity, but the strength of these associations differed across listening conditions.

Hit response time results

Relative to the no-noise condition, response times were slower in the nonspeech-noise condition (β = 0.296, SE = 0.012, p < .001) and in the speech-noise condition (β = 0.281, SE = 0.013, p < .001). When the nonspeech-noise condition was mapped onto the intercept, there was no difference in performance in the two noise conditions (β = −0.015, SE = 0.013, p > .1), indicating a lack of informational masking. Faster responses were given the later the target occurred in the sentence (β = −0.043, SE = 0.016, p < .01). This effect was modified by listening condition, such that it was larger in the nonspeech-noise (β = −0.047, SE = 0.012, p < .001) and speech-noise conditions (β = −0.033, SE = 0.013, p < .01) than in the no-noise condition. Higher contextual probability ratings facilitated response times (β = −0.039, SE = 0.016, p < .05).

Across listening conditions, participants with poorer hearing had slower response times (β = 0.010, SE = 0.002, p < .001). Further, the poorer the participants’ hearing, the more their responses were facilitated by the target occurring later in the sentence (β = −0.002, SE = 0.0005, p < .001), such that participants with poorer hearing showed larger effects of preceding phonetic context. Participants with poorer selective attention were less affected by nonspeech noise (β = −1.726, SE = 0.620, p < .01), which at first appears to be an unexpected result. When the nonspeech-noise condition was mapped onto the intercept, the results showed that although those participants with poorer selective attention benefited less from the absence of noise (β = 1.726, SE = 0.620, p < .01), they were also impacted more by the presence of competing speech (β = 1.435, SE = 0.683, p < .05). This suggests that adults with poorer selective attention are less affected by energetic masking but more affected by informational masking.

Two individual characteristics related to the size of the contextual probability facilitation on response times: Participants with better working memory benefited more from higher contextual probability (β = −0.001, SE = 0.0005, p < .05). This result is shown in Figure 2, where participants have been grouped on the basis of a median split for working memory ability. Figure 2 illustrates that those participants with better working memory show more contextual facilitation of their responses than those with poorer working memory. In addition to working memory, age interacted with contextual probability, such that the older the participant, the smaller the facilitation by context (β = 0.002, SE = 0.001, p < .05).
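The two interactions can be read as adjustments to the slope of the context effect. The sketch below combines the reported estimates into a conditional slope. It assumes the target phoneme is at the midpoint of its contrast coding, so that the Context rating × Target phoneme term drops out. The units of the residualized scores are not given in the text, so the example values are hypothetical.

```python
CONTEXT = -0.039        # Context rating
CONTEXT_X_WM = -0.001   # Context rating x Working memory (residuals)
CONTEXT_X_AGE = 0.002   # Context rating x Age (residuals)

def context_slope(wm_resid, age_resid):
    """Change in the response-time measure per rating point for a given listener."""
    return CONTEXT + CONTEXT_X_WM * wm_resid + CONTEXT_X_AGE * age_resid

# Hypothetical listeners: average, better working memory, and older.
for label, wm, age in [("average", 0, 0), ("better WM", 10, 0), ("older", 0, 5)]:
    print(label, round(context_slope(wm, age), 3))
```

A more negative slope means more facilitation from context, so better working memory steepens the effect whereas higher age flattens it.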

Figure 2.

Hit response times averaged over listening conditions as a function of contextual probability rating (i.e., how well the target word fit the preceding sentence fragment on a scale from 1 “word does not fit at all” to 7 “perfect fit”) and digit span performance. For illustration purposes, participants have been grouped into High Span and Low Span subgroups, based on a median split on digit span performance.

We also evaluated whether these effects were solely driven by those participants with clinically significant hearing loss. To do so, we reevaluated the final model on a subset of 47 participants (out of a total of 63 participants) whose pure-tone average threshold in the better ear was below 35 dB HL, a threshold applied by Dutch health insurance companies as to whether or not to provide a partial reimbursement of a hearing aid. One additional participant was excluded because of a low hit rate (below 50%). The lowest hit rate in the remaining 46 participants was 66% (M = 85%, SD = 8.2, range = 66–98). The results of the subset analysis largely mirrored those of the full participant sample. Importantly, the interactions between contextual probability and working memory (β = −0.001, SE = 0.0005, p < .05) and between contextual probability and age (β = 0.004, SE = 0.001, p < .01) were also found in this subset of participants with relatively normal hearing. Hearing loss thus does not drive our finding that the immediate use of contextual probability is modulated by both working memory and age of the listener.

In summary, hearing loss affected response times across conditions, and those with poorer hearing showed greater facilitatory effects of phonetic context. Those older adults with poorer selective attention were more impacted by speech noise and benefited less from the absence of noise. Both age and working memory related to context facilitation, with more facilitation with younger age and better working memory.

Discussion

In this study, we investigated individual differences in older adults’ ability to efficiently use contextual probability for the recognition of spoken words in sentences presented in different listening conditions. We used phoneme monitoring with meaningful sentences, as performance in this task can then be interpreted as reflecting lexical processing. The time to process the speech materials was limited by requiring participants to maximize response speed and accuracy. Limiting processing time was indeed important as the gradient measure of contextual probability (derived from a separate offline rating study) mainly affected the speed of word recognition in our materials and had only a marginal effect on accuracy. The experimental results showed three main points. First, semantic context facilitation of spoken-word recognition was modulated by verbal working memory and age. Second, hearing loss, working memory, vocabulary knowledge, and age were the most consistent predictors of overall listening performance. Hearing loss also predicted the use of preceding phonetic context in phoneme monitoring. And third, adults with poorer selective attention were less affected by energetic masking, but more affected by informational masking.

The main question was whether verbal working memory and age would independently be associated with the size of the context facilitation effect on spoken-word recognition. There are different components to working memory that can be tested in many ways. Here, we focused on the backward digit span task as it captures the crucial ability to simultaneously store, process, and manipulate verbal information in memory. As predicted, participants who are better at manipulating and updating information in working memory (as indexed by better performance on a backward digit span task) showed larger facilitation of their responses to words with greater contextual probability. This suggests that working memory skills may aid the listener in keeping and updating a semantic representation of the sentence content. Importantly, our results show that verbal working memory is associated with the immediate use of contextual probability for spoken word recognition, as the sentence unfolds. That is, the relationship between working memory and context use for speech comprehension is not limited to situations in which listeners have unlimited processing time (e.g., in Zekveld et al., 2011b, 2012). Our results therefore suggest that contextual probability helps constrain the lexical search space immediately, particularly for those listeners who are better at storing new and updating old information in memory. As a target's contextual probability and position in the sentence were not confounded in our sentence materials, we were also able to show that working memory modulates the immediate use of semantic contextual probability in spoken-word recognition, whereas hearing sensitivity modulates facilitation of word recognition from preceding phonetic context. 
Even though the monitoring of phonemes in meaningful sentences is interpreted as reflecting lexical processing, participants’ way of listening to speech may have been different from that in normal listening situations, as their attention here was explicitly drawn to the occurrence of a specific target phoneme. Future research will have to determine whether our results hold for other listening situations. Importantly, however, there are some preliminary indications that our findings on working memory and immediate use of contextual probability may generalize to an experimental method (i.e., eye tracking using a visual world paradigm) with fewer metalinguistic demands, as the latter only requires participants to listen to speech while looking at a computer screen display (Huettig & Janse, 2012).

The ability to benefit from semantic context was modulated not only by working memory, but also by age differences among the older adults, above and beyond working memory effects. Increased age within our sample of older adults was associated with smaller context facilitation effects across all listening conditions. However, age-related declines, such as cognitive slowing (Salthouse, 1996) or reduced auditory temporal processing (e.g., Pichora-Fuller, Schneider, MacDonald, Pass, & Brown, 2007; Smith, Pichora-Fuller, Wilson, & MacDonald, 2012), cannot fully account for the present age effects on efficiency of semantic integration, as there was no general age effect on response latencies. Age only modulated the effect of context on response latencies. This result thus rather supports the idea that ageing impairs semantic integration in speech processing, as found for age group comparison studies on N400 effects in reading (Federmeier & Kutas, 2005; Federmeier et al., 2010). This poorer semantic integration may be related to the reduced coherence between activated brain regions in spoken sentence comprehension (Peelle et al., 2010). Some recent evidence suggests that older listeners may also not be able to overcome an initial difficulty in semantic integration with additional (postperceptual) processing (Benichov et al., 2012). Benichov et al. (2012) showed that in a task with unlimited processing time, the age of listeners (ranging from 19 to 89 years) was negatively correlated with spoken word recognition, after hearing sensitivity had been accounted for. This relationship was especially strong for the recognition of words in highly constraining contexts. More research is, however, needed to investigate explicitly the time course of the effects of context on sentence comprehension in younger and older adults. In summary, in the present study, we found associations between immediate context facilitation and working memory and age. These associations held across listening conditions and thus across different degrees of speech processing difficulty.

We also examined the relationship of several other listener characteristics to the performance in this speeded word recognition task. Linguistic expertise, as indexed by vocabulary knowledge, did not modify the efficient use of context among older adults. This suggests that listeners’ knowledge about the words within their native language does not predict their ability to anticipate upcoming words. Possibly, other measures of linguistic expertise may be more relevant. Even though vocabulary knowledge as an index of linguistic expertise did not predict the use of context information in this study, it was an important predictor of overall task performance. Vocabulary knowledge predicted the false detection of phoneme targets. Those listeners with better vocabulary produced fewer false-alarm responses. In addition to vocabulary knowledge, we found that hearing loss and working memory were predictive of hit and false-alarm rates across listening conditions. Hearing loss also predicted listeners’ overall detection sensitivity to the target-bearing words and the impact of competing speech on detection sensitivity. Working memory plays a general role in lexical processing, as it modulated performance in the no-noise condition. The finding that working memory relates to listening performance in the speech-noise condition is in line with working memory being the most consistent cognitive predictor across several studies on speech recognition in noise (Akeroyd, 2008). We currently have no explanation as to why working memory was not associated with accuracy in the nonspeech-noise condition.

The only participant characteristic that predicted the degree to which older listeners were affected by masking was selective attention. More specifically, adults with poorer selective attention were affected less by energetic masking, but more by informational masking. The latter result is in line with prior work suggesting that attentional control relates to interference from meaningful competing speech (Janse, 2012; Tun et al., 2002). There has been considerable debate as to whether executive function should be seen as a unitary construct, or whether it is a term covering several distinct cognitive processes, working memory being one of them (cf. e.g., McCabe, Roediger, McDaniel, Balota, & Hambrick, 2010, and references therein). Based on structural equation modelling, McCabe and colleagues (2010) argue that tests of working memory capacity and of executive function share a common underlying executive attention component. Others have also suggested considerable overlap between working memory and attentional abilities (e.g., Baddeley, 1996; Cowan, 1995). Yet other studies have shown that intercorrelations among these cognitive functions may be low (see Miyake, Friedman, Emerson, Witzki, & Howerter, 2000, and references therein), which was also the case for the indices of selective attention and working memory in our participant sample. Selective attention in our study thus predicts, independently of working memory, a listener's susceptibility to masking and is not an index of general task performance.

There are several possible explanations for why we mainly found predictors of general listening performance and only one predictor of masking effects (selective attention). One reason could be that we did not block by listening condition. This might have made the experimental task relatively demanding, as participants may have had to switch back and forth between processing strategies. Another possible account for the fact that we mainly found predictors for general performance could be that our method was sensitive enough to investigate individual differences in the no-noise condition. Most studies on individual differences in speech processing lack a no-noise condition since accuracy in a no-noise condition would be too close to a ceiling level of performance to be able to analyse individual variation. The use of a speeded measure allowed us to investigate individual differences among older adults in this listening condition. Our results illustrate that individual differences are magnified in more difficult conditions, but can already be found in easier listening conditions, if the testing method is sensitive enough.

In conclusion, our results show that older listeners can immediately use contextual probability to facilitate the recognition of spoken words in sentences. The size of this contextual benefit is modulated by the listener's verbal working memory and age. This was found for listening to speech without noise and with noise. Older adults with better working memory also showed generally better speech processing performance, and older adults with better selective attention were less impacted by distraction from competing speech. These results therefore provide evidence that memory underlies efficient speech processing and semantic integration of the spoken message. Importantly, our results also suggest that verbal working memory is not the sole factor, but that other age-related changes matter as well.

Acknowledgements

We thank Antje Meyer for useful feedback on an earlier version of this paper. This work was supported in part by the Netherlands Organization for Scientific Research (NWO) under a Vidi Grant awarded to the first author [grant number 276-75-009].

References

Akeroyd, M. A. (2008). Are individual differences in speech reception related to individual differences in cognitive ability? A survey of twenty experimental studies with normal and hearing-impaired adults. International Journal of Audiology, 47(Suppl. 2), S53–S71. doi:10.1080/14992020802301142

Altmann, G. T. M., & Kamide (1999). Incremental interpretation at verbs: Restricting the domain of subsequent reference. Cognition, 73, 247–264. doi:10.1016/S0010-0277(99)00059-1

Andringa, Olsthoorn, van Beuningen, Schoonen, & Hulstijn (2012). Determinants of success in native and non-native listening comprehension: An individual differences approach. Language Learning, 62(Suppl. 2), 49–78. doi:10.1111/j.1467-9922.2012.00706.x

Assmann, P. F., & Summerfield (1990). Modeling the perception of concurrent vowels: Vowels with different fundamental frequencies. Journal of the Acoustical Society of America, 88, 680–697.

Aydelott, Leech, & Crinion (2010). Normal adult aging and the contextual influences affecting speech and meaningful sound perception. Trends in Amplification, 14, 218–232. doi:10.1177/1084713810393751

Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59, 390–412. doi:10.1016/j.jml.2007.12.005

Baayen, R. H., Tweedie, F. J., & Schreuder, R. (2002). The subjects as a simple random effect fallacy: Subject variability and morphological family effects in the mental lexicon. Brain and Language, 81, 55–65. doi:10.1006/brln.2001.2506

Baddeley

(1996). Exploring the central executive. Quarterly Journal of Experimental Psychology A: Human Experimental Psychology, 49A, 5–28.

Benichov

, Cox

L. C.

, Tun

P. A.

, & Wingfield

(2012). Word recognition within a linguistic context: Effects of age, hearing Acuity, verbal ability, and cognitive function. Ear & Hearing, 33, 250–256. doi:10.1097/AUD.0b013e31822f680f

10.

Bilger

R. C.

, Nuetzel

J. M.

, Rabinowitz

W. M.

, & Rzeczkowski

(1984). Standardization of a test of speech perception in noise. Journal of Speech and Hearing Research, 27, 32–48. doi:10.1121%2F1.2017541

11.

Boulenger

, Hoen

, Ferragne

, Pellegrino

, & Meunier

(2010). Real-time lexical competitions during speech-in-speech comprehension. Speech Communication, 52, 246–253. doi:10.1016%2Fj.specom.2009.11.002

12.

Bregman

A. S.

(1990). Auditory scene analysis: The perceptual organization of sounds. London, England: The MIT Press.

13.

Brokx

J. P. L.

, & Nooteboom

S. G.

(1982). Intonation and the perceptual separation of simultaneous voices. Journal of Phonetics, 10, 23–36.

14.

Brouwer

, Van Engen

K. J.

, Calandruccio

, & Bradlow

A. R.

(2012). Linguistic contributions to speech-on-speech masking for native and non-native listeners: Language familiarity and semantic content. Journal of the Acoustical Society of America, 131, 1449–1464. doi:10.1121%2F1.3675943

15.

Brungart

D. S.

(2001). Informational and energetic masking effects in the perception of two simultaneous talkers. Journal of the Acoustical Society of America, 109, 1101–1109. doi:10.1121%2F1.1345696

16.

Brungart

D. S.

, & Simpson

B. D.

(2002). The effects of spatial separation in distance on the informational and energetic masking of a nearby speech signal. Journal of the Acoustical Society of America, 112, 664–676. doi:10.1121%2F1.1490592

17.

Brungart

D. S.

, & Simpson

B. D.

(2004). Within-ear and across-ear interference in a dichotic cocktail party listening task: Effects of masker uncertainty. Journal of the Acoustical Society of America, 115, 301–310. doi:10.1121%2F1.1628683

18.

Brungart

D. S.

, & Simpson

B. D.

(2007). Effect of target-masker similarity on across-ear interference in a dichotic cocktail-party listening task. Journal of the Acoustical Society of America, 122, 1724–1734. doi:10.1121%2F1.2756797

19.

Brunner

, & Pisoni

D. B.

(1982). Some effects of perceptual load on spoken comprehension. Journal of Verbal Learning and Verbal Behavior, 21, 186–195. doi:10.1016%2FS0022-5371%2882%2990551-5

20.

Connine

C. M.

, & Titone

(1996). Phoneme monitoring. Language and Cognitive Processes, 11, 635–645.

21.

Cooke

(2006). A glimpsing model of speech perception in noise. Journal of the Acoustical Society of America, 119, 1562–1573. doi:10.1121%2F1.2166600

22.

Cowan

(1995). Attention and memory: An integrated framework. Oxford Psychology Series, No. 26. New York: Oxford University Press.

23.

Cutler, Mehler, Norris, & Segui (1987). Phoneme identification and the lexicon. Cognitive Psychology, 19, 141–177.

24.

Cutler, & Norris (1979). Monitoring sentence comprehension. In Cooper, W. E., & Walker, E. T. C. (Eds.), Sentence processing: Psycholinguistic studies presented to Merrill Garrett (pp. 113–134). Hillsdale, NJ: Lawrence Erlbaum Associates.

25.

DeLong, Urbach, & Kutas (2005). Probabilistic word pre-activation during language comprehension inferred from electrical brain activity. Nature Neuroscience, 8, 1117–1121. doi:10.1038/nn1504

26.

Dubno, J. R., Ahlstrom, J. B., & Horwitz, A. R. (2000). Use of context by young and aged adults with normal hearing. Journal of the Acoustical Society of America, 107, 538–546. doi:10.1121/1.428322

27.

Dupoux, & Mehler (1990). Monitoring the lexicon with normal and compressed speech: Frequency effects and the prelexical code. Journal of Memory and Language, 29, 316–335. doi:10.1016/0749-596X(90)90003-I

28.

Eimas, P. D., & Nygaard, L. C. (1992). Contextual coherence and attention in phoneme monitoring. Journal of Memory and Language, 31, 375–395. doi:10.1016/0749-596X(92)90019-T

29.

Eriksen, B. A., & Eriksen, C. W. (1974). Effects of noise letters upon the identification of a target letter in a nonsearch task. Perception and Psychophysics, 16, 143–149. doi:10.3758/BF03203267

30.

Ezzatian, Li, Pichora-Fuller, & Schneider (2011). The effect of priming on release from informational masking is equivalent for younger and older adults. Ear and Hearing, 32, 84–96. doi:10.1097/AUD.0b013e3181ee6b8a

31.

Fan, Raz, & Posner, M. I. (2003). Attentional mechanisms. In Aminoff, M. J., & Daroff, R. B. (Eds.), Encyclopedia of neurological sciences (Vol. 1, pp. 292–299). San Diego, CA: Academic Press.

32.

Federmeier, K. D., & Kutas (2005). Aging in context: Age-related changes in context use during language comprehension. Psychophysiology, 42, 133–141. doi:10.1111/j.1469-8986.2005.00274.x

33.

Federmeier, K. D., Kutas, & Schul (2010). Age-related and individual differences in the use of prediction during language comprehension. Brain and Language, 115, 149–161. doi:10.1016/j.bandl.2010.07.006

34.

Federmeier, K. D., McLennan, D. B., De Ochoa, & Kutas (2002). The impact of semantic memory organization and sentence context information on spoken language processing by younger and older adults: An ERP study. Psychophysiology, 39, 133–146. doi:10.1017.S004857720139203X

35.

Ferretti, McRae, & Hatherell (2001). Integrating verbs, situation schemas, and thematic role concepts. Journal of Memory and Language, 44, 516–547. doi:10.1006/jmla.2000.2728

36.

Ford, J. M., Woodward, S. H., Sullivan, E. V., Isaacks, B. G., Tinklenberg, J. R., Yesavage, J. A., & Roth, W. T. (1996). N400 evidence of abnormal responses to speech in Alzheimer's disease. Electroencephalography and Clinical Neurophysiology, 99, 235–246. doi:10.1016/0013-4694(96)95049-X

37.

Foss, D. J., & Blank, M. A. (1980). Identifying the speech codes. Cognitive Psychology, 12, 1–31. doi:10.1016/0010-0285(80)90002-X

38.

Foss, D. J., & Jenkins, C. M. (1973). Some effects of context on the comprehension of ambiguous sentences. Journal of Verbal Learning and Verbal Behavior, 12, 577–589. doi:10.1016/S0022-5371(73)80037-4

39.

Francis, A. L., & Nusbaum, H. C. (2009). Effects of intelligibility on working memory demand for speech perception. Attention, Perception, and Psychophysics, 71, 1360–1374. doi:10.3758/APP.71.6.1360

40.

Freyman, R. L., Balakrishnan, & Helfer, K. S. (2004). Effect of number of masking talkers and auditory priming on informational masking in speech recognition. Journal of the Acoustical Society of America, 115, 2246–2256. doi:10.1121/1.1689343

41.

Frisina, D. R., & Frisina, R. D. (1997). Speech recognition in noise and presbycusis: Relations to possible neural mechanisms. Hearing Research, 106, 95–104. doi:10.1016/S0378-5955(97)00006-3

42.

Garcia Lecumberri, M. L., & Cooke, M. P. (2006). Effect of masker type on native and non-native consonant perception in noise. Journal of the Acoustical Society of America, 119, 2445–2454. doi:10.1121/1.2180210

43.

Gordon-Salant, & Fitzgibbons, P. J. (1997). Selected cognitive factors and speech recognition performance among young and elderly listeners. Journal of Speech, Language, and Hearing Research, 40, 423–431.

44.

Gordon-Salant, Frisina, R. D., Popper, & Fay (Eds.). (2010). The aging auditory system: Perceptual characterization and neural bases of presbycusis. Berlin, Germany: Springer.

45.

Hazenberg, & Hulstijn, J. H. (1996). Defining a minimal receptive second-language vocabulary for non-native university students: An empirical investigation. Applied Linguistics, 17, 145–163. doi:10.1093/applin/17.2.145

46.

Helfer, K. S., & Freyman, R. L. (2008). Aging and speech-on-speech masking. Ear and Hearing, 29, 87–98. doi:10.1097/AUD.0b013e31815d638b

47.

Huang, Huang, Chen, Qu, T. S., Wu, X. H., & Li (2008). Perceptual integration between target speech and target-speech reflection reduces masking for target-speech recognition in younger adults and older adults. Hearing Research, 244, 51–65. doi:10.1016/j.heares.2008.07.006

48.

Huettig, & Janse (2012). Anticipatory eye movements are modulated by working memory capacity: Evidence from older adults. Poster presented at the 18th Annual Conference on Architectures and Mechanisms for Language Processing (AMLaP 2012), Riva del Garda, Italy.

49.

Jackson, H. M., & Moore, B. C. J. (2013). Contribution of temporal fine structure information and fundamental frequency separation to intelligibility in a competing-speaker paradigm. Journal of the Acoustical Society of America, 133, 2421–2430. doi:10.1121/1.4792153

50.

Jaeger, T. F. (2008). Categorical data analysis: Away from ANOVAs (transformation or not) and towards logit mixed models. Journal of Memory and Language, 59, 434–446. doi:10.1016/j.jml.2007.11.007

51.

Janse (2012). A non-auditory measure of interference predicts distraction by competing speech in older adults. Aging, Neuropsychology, and Cognition, 19, 741–758. doi:10.1080/13825585.2011.652590

52.

Janse, & Adank (2012). Predicting foreign-accent adaptation in older adults. Quarterly Journal of Experimental Psychology, 65, 1563–1585.

53.

Jesse, & Janse (2012). Audiovisual benefit for recognition of speech presented with single-talker noise in older listeners. Language and Cognitive Processes, 27, 1167–1191. doi:10.1080/01690965.2011.620335

54.

Kalikow, D. N., Stevens, K. N., & Elliott, L. L. (1977). Development of a test of speech intelligibility in noise using sentence materials with controlled word predictability. Journal of the Acoustical Society of America, 61, 1337–1351. doi:10.1121/1.381436

55.

Kidd, Jr., Mason, C. R., Richards, V. M., Gallun, F. J., & Durlach, N. I. (2008). Informational masking. In Yost, W. A., Popper, A. N., & Fay, R. R. (Eds.), Auditory perception of sound sources (pp. 143–190). New York: Springer Science + Business Media.

56.

Knoeferle, & Crocker, M. W. (2007). The influence of recent scene events on spoken comprehension: Evidence from eye movements. Journal of Memory and Language, 57, 519–543. doi:10.1016/j.jml.2007.01.003

57.

Koelewijn, Zekveld, A. A., Festen, J. M., & Kramer, S. E. (2012). Pupil dilation uncovers extra listening effort in the presence of a single-talker masker. Ear and Hearing, 33, 291–300. doi:10.1097/AUD.0b013e3182310019

58.

Lahar, C. J., Tun, P. A., & Wingfield (2004). Sentence-final word completion norms for young, middle-aged, and older adults. Journal of Gerontology: Psychological Sciences, 59B, 7–10.

59.

Lash, A., Rogers, C. S., Zoller, A., & Wingfield, A. (2013). Expectation and entropy in spoken word recognition: Effects of age and hearing acuity. Experimental Aging Research, 39, 235–253. doi:10.1080/0361073X.2013.779175

60.

Lee, C. L., & Federmeier, K. D. (2011). Differential age effects on lexical ambiguity resolution mechanisms. Psychophysiology, 48, 960–972. doi:10.1111/j.1469-8986.2010.01158.x

61.

Lovelace, E. A., & Coon, V. E. (1991). Aging and word finding: Reverse vocabulary and cloze tests. Bulletin of the Psychonomic Society, 29, 33–35.

62.

Macmillan, N. A., & Creelman, C. D. (1991). Detection theory: A user's guide. New York: Cambridge University Press.

63.

Marslen-Wilson, W. D., & Tyler, L. K. (1980). The temporal structure of spoken language understanding. Cognition, 8, 1–71. doi:10.1016/0010-0277(80)90015-3

64.

McCabe, D. P., Roediger, H. L., McDaniel, M. A., Balota, D. A., & Hambrick, D. Z. (2010). The relationship between working memory capacity and executive functioning: Evidence for a common executive attention construct. Neuropsychology, 24, 222–243.

65.

McCoy, S. L., Tun, P. A., Cox, C. L., Colangelo, Stewart, R. A., & Wingfield (2005). Hearing loss and perceptual effort: Downstream effects on older adults’ memory for speech. Quarterly Journal of Experimental Psychology (A), 58, 22–33. doi:10.1080/02724980443000151

66.

Meyer, D. E., & Schvaneveldt, R. W. (1971). Facilitation in recognizing pairs of words: Evidence of a dependence between retrieval operations. Journal of Experimental Psychology, 90, 227–234. doi:10.1037/h0031564

67.

Mirman, McClelland, J. L., Holt, L. L., & Magnuson, J. S. (2008). Effects of attention on the strength of lexical influences on speech perception: Behavioral experiments and computational mechanisms. Cognitive Science, 32, 398–417. doi:10.1080/03640210701864063

68.

Miyake, Friedman, N. P., Emerson, M. J., Witzki, A. H., & Howerter (2000). The unity and diversity of executive functions and their contributions to complex “frontal lobe” tasks: A latent variable analysis. Cognitive Psychology, 41, 49–100.

69.

Neubauer, A. C., & Fink (2009). Intelligence and neural efficiency. Neuroscience & Biobehavioral Reviews, 33, 1004–1023. doi:10.1016/j.neubiorev.2009.04.001

70.

Nittrouer, & Boothroyd (1990). Context effects in phoneme and word recognition by young children and older adults. Journal of the Acoustical Society of America, 87, 2705–2715. doi:10.1121/1.399061

71.

Otten, & van Berkum, J. J. A. (2009). Does working memory capacity affect the ability to predict upcoming words in discourse? Brain Research, 1291, 92–101.

72.

Park, D. C., Lautenschlager, Hedden, Davidson, N. S., Smith, A. D., & Smith, P. K. (2002). Models of visuospatial and verbal memory across the adult life span. Psychology and Aging, 17, 299–320. doi:10.1037//0882-7974.17.2.299

73.

Peelle, J. E., Troiani, Wingfield, & Grossman (2010). Neural processing during older adults’ comprehension of spoken sentences: Age differences in resource allocation and connectivity. Cerebral Cortex, 20, 773–782. doi:10.1093/cercor/bhp142

74.

Pichora-Fuller, M. K., Schneider, B. A., & Daneman (1995). How young and old adults listen to and remember speech in noise. Journal of the Acoustical Society of America, 97, 593–608. doi:10.1121/1.412282

75.

Pichora-Fuller, M. K., Schneider, B. A., MacDonald, Pass, H. E., & Brown (2007). Temporal jitter disrupts speech intelligibility: A simulation of auditory aging. Hearing Research, 223, 114–121. doi:10.1016/j.heares.2006.10.009

76.

Piquado, Isaacowitz, & Wingfield (2010). Pupillometry as a measure of cognitive effort in younger and older adults. Psychophysiology, 47, 560–569. doi:10.1111/j.1469-8986.2009.00947.x

77.

Pollack (1975). Auditory informational masking. Journal of the Acoustical Society of America, 57(Suppl. 1), 5. doi:10.1121/1.1995329

78.

Posner, M. I., & Petersen, S. E. (1990). The attention systems of the human brain. Annual Review of Neuroscience, 13, 25–42.

79.

Rabbitt, P. M. (1968). Channel-capacity, intelligibility and immediate memory. Quarterly Journal of Experimental Psychology, 20, 241–248. doi:10.1080/14640746808400158

80.

Revill, K. P., & Spieler, D. H. (2012). The effect of lexical frequency on spoken word recognition in young and older listeners. Psychology and Aging, 27, 80–87. doi:10.1037/a0024113

81.

Rogers, C. S., Jacoby, L. L., & Sommers, M. S. (2012). Frequent false hearing by older adults: The role of age differences in metacognition. Psychology and Aging, 27, 33–45. doi:10.1037/a0026231

82.

Rönnberg (2003). Cognition in the hearing impaired and deaf as a bridge between signal and dialogue: A framework and a model. International Journal of Audiology, 42, 68–76. doi:10.3109/14992020309074626

83.

Rönnberg, Rudner, Foo, & Lunner (2008). Cognition counts: A working memory system for ease of language understanding (ELU). International Journal of Audiology, 47(Suppl. 2), S99–S105. doi:10.1080/14992020802301167

84.

Quené, H., & van den Bergh, H. (2008). Examples of mixed-effects modeling with crossed random effects and with binomial data. Journal of Memory and Language, 59, 413–425. doi:10.1016/j.jml.2008.02.002

85.

Salthouse, T. A. (1996). The processing-speed theory of adult age differences in cognition. Psychological Review, 103, 403–428. doi:10.1037//0033-295X.103.3.403

86.

Schneider, B. A., Daneman, Murphy, D. R., & Kwong See (2000). Listening to discourse in distracting settings: The effects of aging. Psychology and Aging, 15, 110–125. doi:10.1037//0882-7974.15.1.110

87.

Sheldon, Pichora-Fuller, M. K., & Schneider, B. A. (2008). Priming and sentence context support listening to noise-vocoded speech by younger and older adults. Journal of the Acoustical Society of America, 123, 489–499. doi:10.1121/1.2783762

88.

Smith, S. L., Pichora-Fuller, M. K., Wilson, R. H., & MacDonald, E. N. (2012). Word recognition for temporally and spectrally distorted materials: The effects of age and hearing loss. Ear and Hearing, 33, 349–366. doi:10.1097/AUD.0b013e318242571c

89.

Sommers, M. S., & Danielson, S. M. (1999). Inhibitory processes and spoken word recognition in young and older adults: The interaction of lexical competition and semantic context. Psychology and Aging, 14, 458–472. doi:10.1037//0882-7974.14.3.458

90.

Tun, P. A., McCoy, & Wingfield (2009). Aging, hearing acuity, and the attentional costs of effortful listening. Psychology and Aging, 24, 761–766. doi:10.1037/a0014802

91.

Tun, P. A., O'Kane, & Wingfield (2002). Distraction by competing speech in young and older adult listeners. Psychology and Aging, 17, 453–467. doi:10.1037//0882-7974.17.3.453

92.

Van Engen, K. J., & Bradlow, A. R. (2007). Sentence recognition in native- and foreign-language multi-talker background noise. Journal of the Acoustical Society of America, 121, 519–526. doi:10.1121/1.2400666

93.

Verhaeghen (2003). Aging and vocabulary score: A meta-analysis. Psychology and Aging, 18, 332–339. doi:10.1037/0882-7974.18.2.332

94.

Watson, C. S. (2005). Some comments on informational masking. Acta Acustica united with Acustica, 91, 502–512.

95.

Wechsler (2004). Wechsler adult intelligence test (Dutch version, 3rd ed.). Amsterdam: Harcourt Test Publishers.

96.

Wingfield, Alexander, A. H., & Cavigelli (1994). Does memory constrain utilization of top-down information in spoken word recognition? Evidence from normal aging. Language and Speech, 37, 221–235.

97.

Zekveld, A. A., Kramer, S. E., & Festen, J. M. (2010). Pupil response as an indication of effortful listening: The influence of sentence intelligibility. Ear and Hearing, 31, 480–490. doi:10.1097/AUD.0b013e3181d4f251

98.

Zekveld, A. A., Kramer, S. E., & Festen, J. M. (2011a). Cognitive load during speech perception in noise: The influence of age, hearing loss, and cognition on the pupil response. Ear and Hearing, 32, 498–510. doi:10.1097/AUD.0b013e31820512bb

99.

Zekveld, A. A., Rudner, Johnsrude, I. S., Festen, J. M., van Beek, J. H. M., & Rönnberg (2011b). The influence of semantically related and unrelated text cues on the intelligibility of sentences in noise. Ear and Hearing, 32, 16–25. doi:10.1097/AUD.0b013e318228036a

100.

Zekveld, A. A., Rudner, Johnsrude, I. S., Heslenfeld, D. J., & Rönnberg (2012). Behavioral and fMRI evidence that cognitive ability modulates the effect of semantic context on speech intelligibility. Brain and Language, 122, 103–113. doi:10.1016/j.bandl.2012.05.006