L2 proficiency influence in the perception of lexical stress in Spanish: Ab initio vs. experienced learners

Abstract

While learners from first languages (L1s) without word stress (e.g. French) often struggle to perceive lexical stress in a second language, some specific L1 prosodic cues (e.g. use of F0 in Korean) may facilitate second language (L2) stress perception. This study investigated how L1 background (French vs. Korean) and L2 proficiency (ab initio vs. experienced) influence Spanish lexical stress perception. Seventy-eight French and Korean learners of Spanish (in two proficiency groups) completed an oddity discrimination task under varying phonetic variability conditions (different talkers and intonation patterns). Results showed that Korean learners significantly outperformed French learners, irrespective of proficiency, suggesting cross-linguistic transfer of F0-based prosodic sensitivity for L2 stress perception. Experienced learners performed better than ab initio learners, indicating that L2 experience facilitates stress perception. However, increased phonetic variability reduced stress discrimination accuracy for all learners, with no interaction between proficiency and phonetic variability. These findings indicate that while L1-specific prosodic background and L2 experience each independently benefit L2 lexical stress perception, advanced learners also struggle under high-variability conditions. The results underscore the importance of accounting for L1 prosodic transfer in second language acquisition and support the use of high-variability perceptual training in L2 instruction to improve learners’ stress perception.

Keywords

cue-weighting, French, Korean, L1 transfer, L2 experience, L2 prosody acquisition, Spanish, word stress

1. Introduction

1.1. The influence of L1 in the perception of lexical stress

First language (L1) stress properties strongly influence second language (L2) stress perception (e.g. Lin et al., 2014; Ortega-Llebaria et al., 2013; Ortín and Simonet, 2022, 2023; Qin et al., 2017; Schwab and Dellwo, 2017). Because of limited exposure to variable stress patterns in their L1, speakers of languages with fixed stress – whether lexical (e.g. Hungarian) or phrasal (e.g. French) – often struggle to perceive stress in free-stress languages (e.g. English, Spanish) (Dupoux et al., 1997, 2001, 2008). By contrast, native speakers of free-stress languages (e.g. English, Spanish, German, Dutch), who are accustomed to stress pattern variation, tend to perceive L2 stress more accurately, although not reaching a native-like performance. This may stem from cross-linguistic transfer, where reliance on L1-specific stress cues, such as vowel reduction, interferes with processing stress in the L2 (Ortega-Llebaria et al., 2013; Schwab et al., 2024).

Expanding on the idea that specific prosodic features of the L1 (such as tone, stress, or utterance-level prominence) can shape learners’ processing of a broader range of prosodic patterns in the L2 (Braun et al., 2014; Choi et al., 2019; Tremblay et al., 2018), Kim and Tremblay (2022) have argued that L1 cues signaling specific linguistic contrasts can also transfer to different types of contrasts in the L2. In their study comparing L2 English word stress perception by French and Korean learners, Korean learners outperformed French learners, suggesting that the use of F0 cues in the L1 to signal segmental contrasts (as in Korean) influences L2 lexical stress processing. While these findings concern English, where stressed-unstressed contrasts are signaled both suprasegmentally (F0, amplitude, duration) and segmentally (vowel reduction; Kim and Tremblay, 2022), it remains unclear whether similar patterns occur in other word-level stress languages such as Spanish, where stress is primarily suprasegmental and less dependent on vowel reduction (Delattre, 1969; Quilis, 1981).

The present study investigates Spanish as an L2, with French and Korean as the learners’ respective L1s. Spanish is classified as a free-stress language, where stress can fall on different syllables of a word (e.g. Spanish: rosa, correr; English: rose, run). Thus, stress serves a distinctive function, as it can differentiate words (e.g. Spanish: número vs. numero; English: number vs. I number) (Quilis, 1981). In Spanish, lexical stress is primarily perceived through changes in fundamental frequency (F0), with duration and intensity serving as secondary cues (Llisterri et al., 2002a, 2002b).¹

French has been described either as a ‘langue sans accent’ (language without stress; Rossi, 1979) or as a fixed-stress language (Garde, 1968): stress does not apply to individual words, but to the accentual phrase (AP), in which its position is fixed, i.e. stress always falls on the final syllable (Jun and Fougeron, 2000). Consequently, stress is not lexically distinctive. Since it occurs at phrase boundaries, stress is closely intertwined with the intonation (hence, with F0) of the phrase (Rossi, 1979). F0 is used for phrase boundary marking, whereas duration plays a crucial role in signaling the stressed syllable in the accentual phrase and intensity plays a comparatively minor role (Léon, 2007).

Like French, Seoul-Gyeonggi Korean (hereafter referred to simply as ‘Korean’) also lacks word stress. Instead, Korean prosody is organized around tonal patterns that are applied at the level of the accentual phrase (AP). Although French and Korean share the AP as the smallest prosodic unit (e.g. Jun, 1998), French is described as a head/edge prominence language (i.e. phrase-final prominence is marked both by a pitch accent associated with the prosodic head and by a boundary tone; Jun, 2014). In contrast, Korean is described as an edge prominence language (i.e. relying solely on a boundary tone for phrase-final syllables; Jun, 2005, 2014). Another distinction concerns the interaction between tone and segmental features. In French, AP tone contours are largely independent of the segmental content. In Korean, however, the tonal pattern at the beginning of the AP is conditioned by the phonetic properties of the initial segment: a low tone is assigned when the AP-initial segment is lenis, while a high tone is assigned when the segment is aspirated or fortis (Jun, 2005).

In summary, although both French and Korean lack word stress, the main difference between the two languages is that, unlike French, Korean uses F0 for segmental contrasts at the beginning of accentual phrases.

1.2. The role of L2 proficiency in the perception of lexical stress

Empirical evidence shows that higher L2 proficiency – defined here as learners’ overall L2 knowledge, including their experience with the language (Ortín and Simonet, 2022) – improves speech perception (Flege et al., 1997; Gorba, 2019), but results for suprasegmental features like lexical stress are mixed, likely due to methodological factors. First, the impact of L2 proficiency on stress perception seems to depend on the typological properties of the learners’ L1 – specifically, whether the L1 has lexical stress. For instance, higher proficiency does not seem to help French listeners (L1 without distinctive word stress) to process L2 stress (Dupoux et al., 2008; Tremblay, 2009), whereas higher proficiency does help English listeners (L1 with distinctive word stress) in processing Spanish stress (Ortín and Simonet, 2022). Second, the effect of proficiency may vary depending on the range of proficiency levels examined. While no difference in Spanish stress perception was found between ab initio (naive) learners (no knowledge of Spanish) and beginner learners (less than one year of instruction) (Saalfeld, 2012), L2 stress perception was shown to improve from beginner (1 year of instruction) to intermediate level (3 years of instruction) (Ortín and Simonet, 2022). Tremblay (2009) did not observe an effect of proficiency in French learners of English, possibly because her sample included only intermediate and advanced learners. Third, the cognitive demand of the stress perception task seems to modulate the effect of L2 proficiency. While a relationship between L2 proficiency and L2 stress perception was found with a highly demanding task (i.e. sequence recall; Ortín and Simonet, 2022), the relationship disappeared with a less demanding task (i.e. ABX; Ortín and Simonet, 2023). Additional evidence for the link between proficiency and L2 stress processing comes from studies on higher-level processing domains such as morphological processing and lexical access (Sagarra and Casillas, 2018; Sagarra et al., 2024).

1.3. The role of cognitive demands in the perception of lexical stress

Beyond modulating the effects of L2 proficiency on L2 stress perception, the cognitive demands involved in the stress perception task have been shown to significantly impact listeners’ stress discrimination abilities. The level of cognitive demand can be modified by manipulating specific task parameters. In particular, cognitive demand can vary as a function of the amount of memory load of the task. Stress perception declines as task memory load increases, with performance highest in AX discrimination, lower in ABX discrimination, and lowest in sequence recall tasks (Dupoux et al., 1997; Ortín and Simonet, 2022, 2023).

Another dimension of cognitive load involves the degree of acoustic-phonetic variability in the speech input. Phonetic variability is most often manipulated in the context of High Variability Phonetic Training (HVPT), a perceptual training method that has been shown to improve L2 speech perception and production (Uchihara et al., 2024, 2025). In this context, phonetic variability has mainly been induced by talker variability (i.e. single versus multi-talker training) (Nagle et al., 2025), although a few studies have also manipulated phonetic context variability (i.e. target sound in different phonetic contexts or words) (Shejaeya et al., 2024). While not questioning the value of HVPT, the present study investigates the effect of phonetic variability on the perception of L2 stress contrasts, rather than on training these contrasts. High phonetic variability requires listeners to engage additional cognitive resources to normalize the speech signal. Research has demonstrated that the presence – and degree – of phonetic variability reduces listeners’ ability to discriminate stress contrasts (Dupoux et al., 1997; Tremblay, 2009; Schwab and Dellwo, 2017). In these studies, phonetic variability was mainly generated by talker variability, but one study – on which the present study is based – also introduced variability through manipulation of the word’s intonation contour, i.e. words were produced with falling and rising intonation (Schwab and Dellwo, 2017). Note that a word produced with rising intonation requires listeners to reweight the cues associated with lexical stress, since F0 is also used to mark interrogation, thereby making L2 stress perception more challenging. That study found that ab initio French learners’ performance significantly decreased with talker variability and dropped to chance level with intonation variability, suggesting a stronger detrimental effect of intonation than talker variability. With respect to L2 proficiency, while the effect of phonetic variability was found to be smaller for native listeners than for learners, no evidence was found for a differential effect of phonetic variability across L2 proficiency levels ranging from intermediate to high-advanced (Tremblay, 2009).

2. The present study

Most research on L2 lexical stress perception has compared learners whose L1s do or do not have word-level stress, leaving other pairings relatively underrepresented (Dupoux et al., 1997; Schwab and Dellwo, 2017). The present study addresses that gap by comparing L2 stress perception in learners of two L1s without word stress: French and Korean. While both groups are expected to experience difficulty in perceiving L2 stress, Korean learners may outperform French learners due to the transfer of F0 cues used for segmental contrasts in Korean, though French learners might also benefit from F0 sensitivity related to phrase boundaries in French (Christophe et al., 2004; Kim and Tremblay, 2022). Specifically, we hypothesize that the Korean learners’ use of a cue that correlates with consonantal contrasts in their L1 would give them an advantage in perceiving stress contrasts in L2 Spanish, because of cross-domain, cross-language transfer, whereby sensitivity to F0 as a cue in one linguistic domain and language facilitates its use in another domain and language (Kim and Tremblay, 2022).

Previous findings on the role of L2 proficiency in stress perception have been inconsistent, likely due to limitations in the range of proficiency levels tested – either too low or too high – and the use of tasks that were not sufficiently cognitively demanding. In the present study, the L2 proficiency range is broad – ab initio learners with no prior knowledge and experienced learners with intermediate-advanced level – leading us to predict an effect of L2 proficiency, especially since the task used (i.e. Odd-One-Out; Schwab and Dellwo, 2017) is cognitively demanding, although less so than the sequence recall task.

Higher cognitive demands (i.e. increased memory load or phonetic variability) reduce stress detection performance (Dupoux et al., 1997; Tremblay, 2009). In the present study, cognitive demands are manipulated through talker and intonation variability. Based on previous research, we expect both sources of variability to affect L2 stress detection to different degrees, with intonation variability showing a stronger negative effect than talker variability (Schwab and Dellwo, 2017). It remains unclear, however, how L2 proficiency moderates this effect. Following Ortín and Simonet (2022, 2023), we expect an interaction between L2 proficiency and the degree of phonetic variability: high L2 proficiency will help the learners to overcome their stress detection difficulties, particularly when the degree of phonetic variability in the input is high (i.e. especially with intonation variability).

3. Methodology

3.1. Participants

Seventy-eight participants, divided into four groups according to their L1 (French, Korean) and L2 proficiency (ab initio and experienced learners, henceforth ‘AbInitio’ and ‘Exp’), took part in this experiment. Ab initio and intermediate-advanced learners were selected to reflect different stages along the L2 proficiency continuum.

The ab initio groups had no prior knowledge of Spanish or other free-stress Romance languages. They included 20 Swiss French speakers (‘FR-AbInitio’, mean age 21.3 years, 12 females), recruited at the University of Neuchâtel (Switzerland), and 18 Korean speakers (‘KR-AbInitio’, mean age 22.2 years, 11 females), recruited at the University of Utah Asia Campus (South Korea). Although the Korean ab initio participants reported approximately 12 hours of prior instruction in Spanish, this exposure was considered too limited to have a meaningful impact on their proficiency, and the group was therefore treated as comparable to the French group. All ab initio participants knew English (and German for the Swiss French group), but none was an early bilingual (i.e. no language other than the L1 was learned before age 7 years).²

The experienced groups comprised 20 Swiss French intermediate-to-advanced learners of Spanish (‘FR-Exp’, mean age 22.8 years, 17 females), recruited at the University of Fribourg (Switzerland), and 20 Korean intermediate-to-advanced learners (‘KR-Exp’, mean age 23.9 years, 13 females), recruited at the University of Utah Asia Campus and at Hankuk University of Foreign Studies (South Korea). Most of the French-speaking learners began receiving Spanish instruction in high school (n = 18), with two beginning in secondary school and two continuing their Spanish studies at university. In contrast, the majority of the Korean experienced learners began Spanish instruction at university (n = 17), while only three started in high school. As shown in Table 1, Korean learners therefore began studying Spanish significantly later than French learners (U = 107.5, p = .012)³; however, despite this difference in age of acquisition, the two groups did not differ in Spanish proficiency (U = 197, p = .94), years of instruction (U = 151.5, p = .26), or months spent in a Spanish-speaking country (U = 128.5, p = .11). None of the experienced learners was early bilingual, though all had knowledge of English. Spanish proficiency was tested with the Dialang listening proficiency test (Alderson and Huhta, 2005).

Table 1.

Second language (L2) Spanish background of the French and Korean experienced learners.

	Swiss French (FR) (n = 20)			Korean (KR) (n = 20)
	Mean	Median	Range	Mean	Median	Range
Spanish proficiency (Dialang, 1–6)	3.3	3	1–5	3.5	3	2–6
Age of acquisition of Spanish (years)	16.6	17	13–20	18.2	19	11–21
Years of Spanish instruction	3.7	3	1–8	5.1	4	2–12
Number of months spent in Spanish-speaking country	2.9	0	0–24	7.1	1.5	0–36
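The group comparisons reported with Table 1 are Mann-Whitney U tests. As a reading aid, the following minimal Python sketch shows how a U statistic can be computed with midranks for ties. The function name and the two short samples are hypothetical (the individual participant data are not given here), and the sketch does not claim to reproduce the authors' software or the convention they used for reporting U. With two groups of 20, U ranges from 0 to 400, and values near 200 indicate largely overlapping distributions.

```python
from itertools import chain


def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples, using midranks for ties."""
    pooled = sorted(chain(sample_a, sample_b))
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied values share the mean of the 1-based positions i+1 .. j.
        ranks[pooled[i]] = (i + 1 + j) / 2
        i = j
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(ranks[x] for x in sample_a)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical ages of acquisition, for illustration only.
fr_ages = [16, 17, 17, 18]
kr_ages = [18, 19, 19, 20]
print(mann_whitney_u(fr_ages, kr_ages))
```

The two U values always sum to the product of the group sizes, which is why a value close to half that product signals little separation between the groups.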

None of the participants reported having any language or hearing disorders or dyslexia. The local ethics committees approved the study, and written informed consent was obtained from all participants. Participants received monetary compensation for their participation.

3.2. Material and procedure

Participants performed an Odd-One-Out task where they had to indicate the deviant (‘odd’) word among three Spanish words. The material and procedure followed the same design as described in Schwab and Dellwo (2017). As the full details, as well as the acoustic description and perceptual evaluation of the material, are provided in their original publication, only the key aspects are summarized here.

3.2.1. Material and design

The experiment employed six trisyllabic Spanish words that varied exclusively in their lexical stress pattern: words with stress on the first syllable (‘1st’), second syllable (‘2nd’), and third syllable (‘3rd’). The word sets included número (number), numero (I number), numeró (he numbered); válido (valid), valido (I validate), validó (he validated). Two native female speakers of Peninsular Spanish produced the target words in a declarative sentence with a falling intonation (i.e. Spanish: Le dijo a Pat ‘número’; English: ‘He/she said to Pat “number” ’) and in an interrogative sentence with a rising intonation (i.e. Spanish: ¿Le dijo a Pat ‘número’?; English: ‘Did he/she say to Pat “number” ’?). The final trisyllabic word to serve as stimulus in the experiment was extracted from its carrier sentence. Acoustic analyses showed that, with falling intonation, stress in first- and second-syllable stressed words was signaled by increased duration and F0, whereas in third-syllable stressed words it was marked by increased F0 and intensity. With rising intonation, duration and intensity patterned similarly, but F0 varied with the stress pattern, presenting differences in rising slope rather than a peak on the stressed syllable (for further details, see Schwab and Dellwo, 2017).

Trials were created combining three segmentally identical words (i.e. either numero or valido) separated by 500 ms. In each trial, two of the words shared the same stress pattern (e.g. stress on the second syllable), while the third word (the odd one) had a different stress pattern (e.g. stress on the third syllable). The two words with the same stress pattern were different recordings of the same word. The 144 test trials were constructed to meet the following criteria:

Half of the trials contained numero, and the other half contained valido.

All stress contrasts were tested (e.g. ‘1st–1st–2nd’, ‘1st–1st–3rd’, ‘2nd–2nd–1st’, etc.).

The position of the odd word within the trial (Position 1, Position 2, Position 3) was counterbalanced.

Trials were presented in four experimental conditions differing in the degree of phonetic variability:

○ no phonetic variability, with all three words produced by a single talker using a falling intonation pattern;

○ talker variability, with the three words produced by two different talkers using a falling intonation pattern;

○ intonation variability, with all words produced by a single talker but with both falling and rising intonation patterns; and

○ combined talker and intonation variability, with the three words produced by two talkers and with both falling and rising intonation patterns.

(For further details, see Schwab and Dellwo, 2017).
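The criteria above are consistent with a fully crossed design: 2 lexical items × 6 ordered stress contrasts × 3 odd positions × 4 variability conditions = 144 trials. The Python sketch below enumerates such a design. It is an illustration under that assumption, not the authors' stimulus script; the function and field names are hypothetical, and the condition labels are those used in the data analysis section.

```python
from itertools import product

WORDS = ["numero", "valido"]
STRESS_PATTERNS = ["1st", "2nd", "3rd"]
ODD_POSITIONS = [1, 2, 3]
CONDITIONS = ["1into-1talker", "1into-2talkers", "2into-1talker", "2into-2talkers"]


def build_trials():
    """Enumerate every word x stress contrast x odd position x condition cell."""
    # Ordered contrasts: (pattern shared by two words, pattern of the odd word).
    contrasts = [(s, o) for s in STRESS_PATTERNS for o in STRESS_PATTERNS if s != o]
    trials = []
    for word, (standard, odd), position, condition in product(
        WORDS, contrasts, ODD_POSITIONS, CONDITIONS
    ):
        sequence = [standard] * 3
        sequence[position - 1] = odd  # place the deviant stress pattern
        trials.append(
            {
                "word": word,
                "sequence": sequence,
                "odd": odd,
                "odd_position": position,
                "condition": condition,
            }
        )
    return trials


trials = build_trials()
print(len(trials))       # number of test trials
print(len(trials) * 78)  # observations across 78 participants
```

Under this crossing, each variability condition contains 36 trials, and 144 trials for 78 participants yield the 11,232 observations analyzed in the study.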

3.2.2. Procedure

The experiment was administered using Praat (Boersma and Weenink, 2021). In each trial, participants listened to a sequence of three words and were instructed to indicate the word with a different stress pattern by clicking on the corresponding word (1, 2, or 3) displayed on the screen. Trial order was randomized individually for each participant to control for order effects. The experiment lasted approximately 25 minutes, and participants’ correct and incorrect responses were recorded for each trial.

3.3. Data analysis

Statistical analyses were conducted in R (v4.4.1; R Development Core Team, 2024) using the lme4 package (Bates et al., 2015). A mixed-effects logistic regression was run on correct/incorrect responses (Baayen et al., 2008), with significance assessed via likelihood ratio tests comparing models with and without each effect. Estimates (β) are expressed in logit, using ‘incorrect response’ as the reference. Post-hoc Tukey-corrected comparisons were performed. Residual diagnostics using DHARMa (Hartig, 2024) indicated no major deviations, and no overdispersion was detected, confirming an adequate model fit. Figures show percentage correct, although analyses were performed on binary data (correct/incorrect responses; n = 11,232).
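Because the estimates are expressed in logits, it may help to recall how the logit scale relates to proportions correct. The short Python sketch below defines the logit and its inverse; the function names are hypothetical. Note that logits computed from raw percentages are descriptive only and need not match model estimates, which are conditional on the random effects and control variables.

```python
import math


def logit(p):
    """Log-odds of a proportion p, with 0 < p < 1."""
    return math.log(p / (1 - p))


def inv_logit(x):
    """Proportion corresponding to a log-odds value x."""
    return 1 / (1 + math.exp(-x))


# Chance performance in a three-alternative task (1/3) is a negative logit,
# whereas 50% correct corresponds to a logit of zero.
print(logit(1 / 3), logit(0.5))
```

A positive estimate therefore indicates higher odds of a correct response relative to the reference level, and a negative estimate indicates lower odds.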

The fixed part of the model included ‘L1’ (French, Korean), ‘L2 proficiency’ (AbInitio, Exp), ‘phonetic variability’ (1into-1talker, 1into-2talkers, 2into-1talker, 2into-2talkers), and the three- and two-way interactions between the three variables. ‘L1’ and ‘L2 proficiency’ were coded using scaled sum contrasts, while ‘phonetic variability’ was coded with treatment contrasts.

The following control variables were included in the model: ‘trial number’, ‘lexical item’ (numero, valido), ‘odd position’ (1, 2, 3), and ‘stress contrast’ (1st–2nd, 1st–3rd, 2nd–1st, 2nd–3rd, 3rd–1st, 3rd–2nd). The nominal control variables were coded with treatment contrasts. ‘Trial number’ was centered on the mean. The random effects included random intercepts for participants and items, with random slopes for ‘phonetic variability’ across participants and for ‘L1’ and ‘L2 proficiency’ across items.
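The predictor coding described above can be illustrated with a small Python sketch. It assumes the common convention that a two-level scaled sum contrast assigns -0.5 and +0.5 to the two levels, and that treatment contrasts compare each level with a reference level. Which level received which sign, and which variability condition served as the reference, are not stated here, so the choices below (French and AbInitio at -0.5; 1into-1talker as reference) are assumptions, and all function names are hypothetical.

```python
VARIABILITY = ["1into-1talker", "1into-2talkers", "2into-1talker", "2into-2talkers"]


def scaled_sum(level, first, second):
    """Two-level scaled sum contrast: -0.5 for `first`, +0.5 for `second`."""
    if level == first:
        return -0.5
    if level == second:
        return 0.5
    raise ValueError(f"unknown level: {level!r}")


def treatment(level, levels):
    """Dummy columns comparing `level` with the reference level levels[0]."""
    if level not in levels:
        raise ValueError(f"unknown level: {level!r}")
    return [1 if level == other else 0 for other in levels[1:]]


def center(values):
    """Subtract the mean so that zero corresponds to the average value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


print(scaled_sum("Korean", "French", "Korean"))  # L1 predictor
print(scaled_sum("Exp", "AbInitio", "Exp"))      # L2 proficiency predictor
print(treatment("2into-1talker", VARIABILITY))   # three dummy columns
print(center(list(range(1, 145)))[:3])           # centered trial numbers
```

With this coding, the coefficients for L1 and L2 proficiency estimate the difference between the two levels averaged over the other scaled-sum-coded factor, while the variability coefficients compare each condition with the reference condition.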

4. Results

Figure 1 presents the percent correct as a function of L1 (French, Korean), L2 proficiency (Ab Initio, Experienced), and phonetic variability. As shown in Table S1 in supplemental material, none of the three- or two-way interactions was significant. Results showed a main effect of L2 proficiency (χ²(1) = 14.64, p < .001): ab initio learners performed worse (44.94%) than experienced learners (62.34%), regardless of L1 and/or phonetic variability. Likewise, we found a main effect of L1 (χ²(1) = 10.52, p = .001): Korean participants averaged 59.08% correct vs. 48.9% for French participants, irrespective of L2 proficiency and/or phonetic variability. Finally, phonetic variability also had an effect (χ²(3) = 107.89, p < .001), and this effect did not vary as a function of L1 or L2 proficiency. As can be seen in Figure 1 and in Table S2 in supplemental material, the introduction of talker variability in trials without intonation variability significantly hampered the participants’ detection of the stress deviant (1into-1talker = 68.48%, 1into-2talkers = 61.00%). However, the effect of talker variability was not present in trials that included intonation variability (2into-1talker = 42.84%, 2into-2talkers = 43.13%). The presence of intonation variability reduced performance, whether the trials were produced by one talker (1into-1talker = 68.48%, 2into-1talker = 42.84%) or by two (1into-2talkers = 61.00%, 2into-2talkers = 43.13%). We also observe that introducing intonation variability alone (i.e. 2into-1talker) led to poorer performance than introducing talker variability alone (i.e. 1into-2talkers).
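The p-values attached to the likelihood ratio statistics can be recovered from the chi-square distribution. The Python sketch below uses the closed-form upper-tail probabilities for 1 and 3 degrees of freedom, which are the only cases needed for the statistics reported in this section. It is a reading aid rather than the authors' analysis code, the function name is hypothetical, and only these two degrees of freedom are implemented.

```python
import math


def chi2_upper_tail(x, df):
    """Upper-tail probability of a chi-square variable (df = 1 or 3 only)."""
    if x < 0:
        raise ValueError("the statistic must be non-negative")
    tail_df1 = math.erfc(math.sqrt(x / 2))
    if df == 1:
        return tail_df1
    if df == 3:
        # Integration by parts adds one closed-form term to the df = 1 tail.
        return tail_df1 + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    raise ValueError("only df = 1 and df = 3 are implemented in this sketch")


# Likelihood ratio statistics as reported in the text.
for label, stat, df in [
    ("L2 proficiency", 14.64, 1),
    ("L1", 10.52, 1),
    ("phonetic variability", 107.89, 3),
]:
    print(label, chi2_upper_tail(stat, df))
```

In a likelihood ratio test, the statistic is twice the difference in log-likelihood between the models with and without the effect, and the degrees of freedom equal the number of parameters removed (one for each two-level factor, three for the four-level variability factor).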

Figure 1.

Percent correct as a function of first language (L1), second language (L2) proficiency and phonetic variability.

The French and Korean experienced learners performed above chance level in all four phonetic variability conditions (exact binomial p [one-tailed] < .05). The ab initio French and Korean learners also performed above chance level when there was no intonation variability (1into-1talker and 1into-2talkers; exact binomial p [one-tailed] < .05). When the stimuli included intonation variability, however, ab initio performance fell to chance level, especially for the French learners (FR: 2into-1talker: exact binomial p [one-tailed] = .15, 2into-2talkers: exact binomial p [one-tailed] = 1; KR: 2into-1talker: exact binomial p [one-tailed] = .15, 2into-2talkers: exact binomial p [one-tailed] = .01).
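With three response options per trial, chance performance in the Odd-One-Out task is 1/3. The Python sketch below shows one way to compute an exact one-tailed binomial probability against that chance level. The function name is hypothetical and the counts are illustrative only: they assume 20 listeners and 36 trials per condition pooled into 720 observations, which may differ from how the tests reported here were actually set up.

```python
from fractions import Fraction
from math import comb

CHANCE = Fraction(1, 3)  # one correct answer among three words


def binom_upper_tail(successes, trials, p=CHANCE):
    """Exact one-tailed P(X >= successes) for X ~ Binomial(trials, p)."""
    q = 1 - p
    return sum(
        comb(trials, k) * p**k * q ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical example: 240 correct responses out of 720 pooled trials,
# i.e. an observed proportion exactly at the chance level.
p_value = binom_upper_tail(240, 720)
print(float(p_value))
```

Using exact fractions avoids floating-point underflow for terms such as (1/3)^720, at the cost of slower arithmetic, which is negligible at this sample size.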

5. Discussion

Given that neither French nor Korean possesses lexically distinctive word stress, one might expect speakers of these languages to show comparable difficulties in discriminating stress contrasts in L2 Spanish. However, the results revealed a significant effect of L1, with Korean learners outperforming their French counterparts, irrespective of L2 proficiency. As proposed by Kim and Tremblay (2022), this advantage may be attributed to the functional role of F0 to distinguish segmental contrasts in Korean. Their sensitivity to F0 may indeed have transferred to the perception of Spanish word stress, which is partially realized through an increase in F0. These findings provide further empirical support for the view that prosodic cues used in the processing of segmental contrasts in the L1 can be transferred to the perception of suprasegmental features, such as word stress, in the L2 (Kim and Tremblay, 2022). Interestingly, although French also relies on F0 to mark prosodic boundaries, this use does not appear to confer the same perceptual advantage. This suggests that the functional domain in which F0 operates in the L1 – segmental contrasts versus prosodic boundary marking – may critically influence the extent to which such cues transfer to L2 stress perception.

L2 proficiency had a clear beneficial effect on stress perception: the experienced learners outperformed ab initio learners. This finding suggests that L2 experience markedly improves the ability to perceive L2 stress. In other words, as proficiency increases, the challenges associated with perceiving L2 stress appear to diminish. This outcome seems to contradict studies that failed to find proficiency effects when only higher-level learners were compared (e.g. Tremblay, 2009). The discrepancy with Tremblay’s (2009) results is likely due to differences in proficiency range: our study contrasted ab initio with much more experienced learners, while Tremblay examined only intermediate vs. advanced students. Thus, proficiency effects become evident when a broad enough spectrum of L2 experience is considered, especially including true beginners. However, the two learner groups may have processed stress information differently. Because the ab initio learners had no prior knowledge of Spanish, they were unlikely to process the stimuli as lexical items, but rather as sequences differing only in acoustic stress patterns. By contrast, experienced learners were more likely to access lexical representations associated with the words. This difference in the level of processing (acoustic vs. lexical) may contribute to the observed proficiency effect.

Previous research has shown that both musical abilities and experience with another stress language, such as English, can influence sensitivity to prosodic cues and lexical stress perception (e.g. Kolinsky et al., 2009; Martínez García and Schwab, 2023; Schwab et al., 2024). Musical aptitude, particularly pitch-related perceptual skills, has been associated with enhanced stress discrimination across languages, and knowledge of English may facilitate access to prominence patterns through transfer. Additional analyses in the present study indicated that neither musical aptitude nor English proficiency accounted for the observed differences in stress perception. This suggests that the effects reported here are more strongly driven by learners’ L1 prosodic background and their experience with Spanish than by musical skills or experience with another stress language. Future research may determine whether musical aptitude and additional stress-language experience play a stronger role in other experimental tasks or at different proficiency levels.

As expected, phonetic variability had a significant impact on stress perception. The introduction of phonetic variability, whether in the form of talker or intonation differences, hampered participants’ ability to accurately identify the word with a deviant stress pattern. Notably, intonation variability had a more detrimental effect than talker variability. These results are in agreement with previous research (Schwab and Dellwo, 2017), suggesting that difficulties in using F0 in rising intonation interfere with the detection of deviant stress patterns more strongly than differences between speakers. The effect of phonetic variability did not interact with participants’ L1 background, nor did it differ between ab initio and experienced learners. Although experienced learners showed better performance than ab initio learners globally, they still had more difficulties identifying the deviant word when talker and/or intonation variability was introduced.

This result is somewhat surprising and contrasts with the suggestion by Ortín and Simonet (2023) that proficiency effects should be most pronounced under high cognitive load. Instead, our data indicate that intermediate-advanced L2 proficiency is not sufficient for learners to generalize stress perception skills to highly variable inputs involving voice or intonation variability. An open question remains as to whether learners at higher proficiency levels (e.g. near-native speakers or those with extensive immersion experience) eventually develop greater resilience to such variability. Future work could explore this by testing more advanced learners or by tracking learners longitudinally as they progress, to determine if and when the proficiency × variability interaction emerges.

Our findings raise questions about how learners acquire and integrate different types of phonetic variability in L2 speech perception and whether certain types of variability (e.g. talker versus intonation differences) are more readily internalized than others. From a pedagogical perspective, these results underscore the need for instructional approaches that explicitly train learners to process speech in variable conditions. High Variability Phonetic Training (HVPT) has been shown to be effective (for example, Uchihara et al., 2024, 2025), and it would be valuable to investigate whether learners who undergo different HVPT paradigms focusing on specific types of variability (e.g. Shejaeya et al., 2024), including talker and intonation variability, perform differently from those who receive training with limited phonetic variability. Such work may inform the design of instructional interventions aimed at enhancing learners’ ability to cope with the challenges posed by phonetic variability in naturalistic L2 listening contexts. Importantly, our results should not be interpreted as evidence against the effectiveness of HVPT, as the present study did not include a training phase. Rather, they reflect the immediate perceptual cost of processing variable input. High variability increases normalization demands across talkers and intonation patterns, which can temporarily reduce discrimination accuracy. Although learning may occur during testing, this study was not designed to measure such learning effects and instead captures listeners’ perceptual performance under varying degrees of variability.

While the present study yields novel insights, several limitations must be acknowledged. First, our participant groups were limited to two L1 backgrounds (French and Korean). These were strategically chosen to test a specific theoretical contrast, but future research should examine a broader range of L1s – for instance, tone languages (e.g. Mandarin), pitch-accent languages (e.g. Japanese), or other fixed-stress languages (e.g. Hungarian) – to determine whether the patterns observed here hold more generally. Such studies would reveal whether any use of F0 cues in the L1 (tonal or accentual) confers an advantage in L2 stress perception, and whether our French–Korean differences replicate across other language pairings. Second, our experienced learners, though more advanced than the ab initio group, were intermediate-advanced learners. It would be informative to include highly advanced or near-native L2 learners in future studies to determine whether native-like stress perception can be reached, or whether some difficulties (especially under high variability) persist even at very high proficiency levels. Third, considering individual differences beyond L2 proficiency could deepen our understanding of variability in stress perception. For instance, working memory capacity or attentional control might modulate how well learners cope with multiple sources of variability. Incorporating cognitive assessments in future experiments could help explain why some learners adapt to variability more easily than others. Finally, an important practical next step suggested by our findings is to experiment with targeted training interventions. As noted above, implementing HVPT paradigms that focus specifically on lexical stress (and that manipulate talkers or intonation patterns) would allow researchers to test causally whether such training can improve learners’ adaptability to variability. If successful, this would offer a clear avenue for practitioners to enhance L2 learners’ prosodic perception skills. In sum, addressing these future directions will further clarify how L1 background and L2 experience interweave to shape the acquisition of lexical stress – an area of enduring significance for second language research and pedagogy.

Supplemental Material

Supplemental material, sj-docx-1-slr-10.1177_02676583261457752 for L2 proficiency influence in the perception of lexical stress in Spanish: Ab initio vs. experienced learners by Sandra Schwab and María Teresa Martínez García in Second Language Research

Footnotes

Acknowledgements

We would like to thank Julie Kamber and Hugo Lee for their support with experiment design, participant recruitment and testing.

ORCID iDs

Sandra Schwab

María Teresa Martínez García

Ethical considerations

Data collection was conducted across different projects, at various points in time, and at different universities. Data collection for ab initio French-speaking participants was carried out within the framework of an SNF Ambizione project conducted at the University of Zurich (Ambizione PZ00P1_148036/1, 2014-2017). The Ethics Committee from Canton Zurich waived the need for ethics approval for behavioral studies (11 February 2013). Data collection for experienced French-speaking learners was carried out at the University of Bern. The Ethics Committee of the Faculty of Humanities of the University of Bern, Switzerland (no. REC2022/06) approved the study on 16 December 2022. Data collection for Korean speakers (ab initio and experienced learners) was carried out at the University of Utah Asia Campus and Hankuk University of Foreign Studies. The Ethics Committee of the University of Utah approved the study (IRB_00147958 and 00151004) on 28 March 2022 and 18 October 2023, respectively.

Consent to participate

All participants provided written informed consent prior to participating. Written informed consent was obtained for anonymized participant information to be analyzed and published.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Parts of this research were funded by the Swiss National Science Foundation (grant Ambizione PZ00P1_148036/1, 2014-2017), by the Faculty of Humanities of the University of Bern (2022, 2023), by Zurich Empiris Foundation (2023–24) and by the University of Utah (Fall 2022 Faculty Small Grant Program award).

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability statement

Datasets have been deposited in the repository:

Supplemental material

Supplemental material for this article is available online.

Notes

References

Alderson, Huhta (2005) The development of a suite of computer-based diagnostic tests based on the Common European Framework. Language Testing 22(3): 301–320. https://doi.org/10.1191/0265532205lt310oa

Baayen, Davidson, Bates (2008) Mixed effects modeling with crossed random effects for subjects and items. Journal of Memory and Language 59: 390–412. https://doi.org/10.1016/j.jml.2007.12.005

Bates, Mächler, Bolker, et al. (2015) Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67(1): 1–48. https://doi.org/10.18637/jss.v067.i01

Boersma, Weenink (2021) Praat: Doing phonetics by computer [Computer Software]. Available at: http://www.praat.org (accessed May 2026).

Braun, Galts, Kabak (2014) Lexical encoding of L2 tones: The role of L1 stress, pitch accent and intonation. Second Language Research 30: 323–350. https://doi.org/10.1177/0267658313510926

Choi, Tong, Samuel (2019) Better than native: Tone language experience enhances English lexical stress discrimination in Cantonese–English bilingual listeners. Cognition 189: 188–192. https://doi.org/10.1016/j.cognition.2019.04.004

Christophe, Peperkamp, Pallier, et al. (2004) Phonological phrase boundaries constrain lexical access: I. Adult data. Journal of Memory and Language 51: 523–547. https://doi.org/10.1016/j.jml.2004.07.001

Delattre (1969) An acoustic and articulatory study of vowel reduction in four languages. International Review of Applied Linguistics and Language Teaching 7: 294–325. https://doi.org/10.1515/iral.1969.7.4.295

Dupoux, Pallier, Sebastián-Gallés, et al. (1997) A destressing ‘deafness’ in French? Journal of Memory and Language 36(3): 406–421. https://doi.org/10.1006/jmla.1996.2500

Dupoux, Peperkamp, Sebastián-Gallés (2001) A robust method to study stress ‘deafness’. Journal of the Acoustical Society of America 110(3): 1606–1618. https://doi.org/10.1121/1.1380437

Dupoux, Sebastián-Gallés, Navarrete, et al. (2008) Persistent stress ‘deafness’: The case of French learners of Spanish. Cognition 106(2): 682–706. https://doi.org/10.1016/j.cognition.2007.04.001

Flege, Bohn, Jang (1997) Effects of experience on non-native speakers’ production and perception of English vowels. Journal of Phonetics 25(4): 437–470. https://doi.org/10.1006/jpho.1997.0052

Garde (1968) L’accent. Presses Universitaires de France.

Gorba (2019) Bidirectional influence on L1 Spanish and L2 English stop perception: The role of L2 experience. The Journal of the Acoustical Society of America 145(6): EL587–EL592. https://doi.org/10.1121/1.5113808

Hartig (2024) DHARMa: Residual Diagnostics for Hierarchical (Multi-Level / Mixed) Regression Models [R package version 0.4.7]. Available at: http://florianhartig.github.io/DHARMa (accessed June 2026).

Jun (1998) The accentual phrase in the Korean prosodic hierarchy. Phonology 15(2): 189–226. https://doi.org/10.1017/S0952675798003571

Jun (2005) Korean intonational phonology and prosodic transcription. In: Jun (ed.) Prosodic Typology: The Phonology of Intonation and Phrasing. Oxford University Press, pp. 201–229.

Jun (2014) Prosodic typology: By prominence type, word prosody, and macro-rhythm. In: Jun (ed.) Prosodic Typology II: The Phonology of Intonation and Phrasing. Oxford Academic, pp. 520–539. https://doi.org/10.1093/acprof:oso/9780199567300.003.0017

Jun, Fougeron (2000) A phonological model of French intonation. In: Botinis (ed.) Intonation: Analysis, Modelling and Technology. Kluwer Academic, pp. 209–242.

Kim, Tremblay (2022) Intonational cues to segmental contrasts in the native language facilitate the processing of intonational cues to lexical stress in the second language. Frontiers in Communication 7: Article 845430. https://doi.org/10.3389/fcomm.2022.845430

Kolinsky, Cuvelier, Goetry, et al. (2009) Music training facilitates lexical stress processing. Music Perception 26(3): 235–246. https://doi.org/10.1525/mp.2009.26.3.235

Léon (2007) Phonétisme et prononciations du français [Phonetics and pronunciations of French]. Armand Colin.

Lin, Wang, Idsardi, et al. (2014) Stress processing in Mandarin and Korean second language learners of English. Bilingualism: Language and Cognition 17: 316–346. https://doi.org/10.1017/S1366728913000333

Llisterri, Machuca, de la Mota, Riera, Ríos (2002a) Algunas cuestiones en torno al desplazamiento acentual en español [Some issues surrounding accent shift in Spanish]. In: Herrera Zendejas, Martín Butragueño (eds) La tonía: Dimensiones fonéticas y fonológicas (Estudios de Lingüística IV), pp. 163–185.

Llisterri, Machuca, de la Mota, et al. (2002b) The role of F0 peaks in the identification of lexical stress in Spanish. In: Braun, Masthoff (eds) Phonetics and Its Applications. Festschrift for Jens-Peter Köster on the Occasion of His 60th Birthday. Franz Steiner Verlag, pp. 350–361.

Martínez García, Schwab (2023) Relation between musical aptitude and L2 stress perception in French- and Korean-speaking listeners. In: Proceedings of the 20th International Congress of the Phonetic Sciences (ICPhS 2023), Prague, Czech Republic.

Nagle, Bruun, Zárate-Sández (2025) Comparing lower and higher variability multi-talker perceptual training. Applied Psycholinguistics 46: Article e14. https://doi.org/10.1017/S0142716425000141

Ortega-Llebaria, Fan (2013) English speakers’ perception of Spanish lexical stress: Context-driven L2 stress perception. Journal of Phonetics 41: 186–197. https://doi.org/10.1016/j.wocn.2013.01.006

Ortega Llebaria, Prieto (2011) Acoustic correlates of stress in Central Catalan and Castilian Spanish. Language and Speech 54: 73–97. https://doi.org/10.1177/0023830910388014

Ortín, Simonet (2022) Phonological processing of stress by native English speakers learning Spanish as a second language. Studies in Second Language Acquisition 44(2): 460–482. https://doi.org/10.1017/S0272263121000309

Ortín, Simonet (2023) Perceptual sensitivity to stress in native English speakers learning Spanish as a second language. Laboratory Phonology 14(1): 1–41. https://doi.org/10.16995/labphon.7978

Quilis (1981) Fonética Acústica de la Lengua Española [Acoustic Phonetics of the Spanish Language]. Gredos.

Qin, Chien, Tremblay (2017) Processing of word-level stress by Mandarin-speaking second language learners of English. Applied Psycholinguistics 38: 541–570. https://doi.org/10.1017/S0142716416000321

R Development Core Team (2024) R: A language and environment for statistical computing. R Foundation for Statistical Computing.

Rossi (1979) Le français, langue sans accent [French, a language without an accent]. Studia Phonetica 15: 13–52.

Saafeld (2012) Teaching L2 Spanish stress. Foreign Language Annals 45(2): 283–303. https://doi.org/10.1111/j.1944-9720.2012.01191.x

Sagarra, Casillas (2018) Suprasegmental information cues morphological anticipation during L1/L2 lexical stress. Journal of Second Language Studies 1: 31–59. https://doi.org/10.1075/jsls.17026.sag

Sagarra, Fernández-Arroyo, Lozano-Argüelles, et al. (2024) Unraveling the complexities of second language lexical stress processing: The impact of first language transfer, second language proficiency, and exposure. Language Learning 74: 574–605. https://doi.org/10.1111/lang.12627

Schwab, Dellwo (2017) Intonation and talker variability in the discrimination of Spanish lexical stress contrasts by Spanish, German and French listeners. The Journal of the Acoustical Society of America 142: 2419–2429. https://doi.org/10.1121/1.5008849

Schwab, Etter, Kamber, et al. (2024) The influence of linguistic and cognitive background on word stress processing in an unknown language. Journal of Second Language Pronunciation 10(2): 204–231. https://doi.org/10.1075/jslp.23028.sch

Shejaeya, Roon, Whalen (2024) Talker variability versus variability of vowel context in training naïve learners on an unfamiliar class of foreign language contrasts. Journal of Phonetics 107: Article 101369. https://doi.org/10.1016/j.wocn.2024.101369

Tremblay (2009) Phonetic variability and the variable perception of L2 word stress by French Canadian listeners. International Journal of Bilingualism 13(1): 35–62. https://doi.org/10.1177/1367006909103528

Tremblay, Broersma, Coughlin (2018) The functional weight of a prosodic cue in the native language predicts the learning of speech segmentation in a second language. Bilingualism, Language and Cognition 21: 640–652. https://doi.org/10.1017/S136672891700030X

Uchihara, Karas, Thomson (2024) Does perceptual high variability phonetic training improve L2 speech production? A meta-analysis of perception–production connection. Applied Psycholinguistics 45(4): 591–623. https://doi.org/10.1017/S0142716424000195

Uchihara, Karas, Thomson (2025) High variability phonetic training (HVPT): A meta-analysis of L2 perceptual training studies. Studies in Second Language Acquisition 47(3): 794–827. https://doi.org/10.1017/S0272263125100879

Wei (2020) Dimensions of bilingualism. In: Wei (ed.) The Bilingualism Reader. Routledge, pp. 3–22.
