Second language experience influences salience of phonological units in spoken word production in the first language

Abstract

Research question:

Previous research suggests that the grain size of primary phonological units (PUs) in spoken word production is language-specific (e.g., phonemic segments in Germanic languages, and atonal syllables in Chinese). When the two languages of bilingual speakers differ in the primary PUs used by their native speakers, will first language (L1) phonological processing be influenced by second language (L2) experience?

Methodology:

In a picture–word interference task, native Chinese speakers who spoke English as L2 were required to say aloud the predesignated L1 name of a picture while ignoring a written L1 character superimposed on the picture. The picture name shared a certain phonological component (i.e., rhyme or atonal syllable) with the distractor in the related condition but not in the unrelated condition.

Data and analysis:

Data of 186 participants from eight originally independent experiments were pooled. Multiple regression analyses were conducted on subject means to investigate whether the effects of rhyme relatedness and syllable relatedness on L1 naming latency were influenced by L2 self-rated proficiency, age of acquisition (AoA), and/or years of use. Trial-by-trial data were then analyzed with linear mixed-effects modeling.

Findings:

Both the rhyme effect and the syllable effect increased with years of L2 use, indicating that the salience of PUs in L1 spoken word production can be influenced by L2 experience.

Originality:

The current study adopted a chronometric approach to investigate the influence of L2 experience on phonological processing during L1 spoken word production. Importantly, multiple aspects of L2 experience (i.e., self-rated proficiency, AoA, and years of use) were examined at the same time in a relatively large sample.

Implications:

The current findings provide evidence for backward transfer of primary PUs in spoken word production, which demonstrates the plasticity of the phonological encoding process in bilingual speakers. The findings are also compared with cross-language transfer of phonological awareness.

Keywords

Language production, phonological unit, backward transfer, Chinese, English, bilingualism

Introduction

Language production in bilingual speakers is an important issue that raises questions of both theoretical and practical interest. Spoken word production involves processing at multiple levels: conceptual; lexical; phonological; phonetic; and articulatory (Levelt, 1999). How bilinguals differ from monolinguals at each level of these word production processes has been widely investigated by researchers from various fields. For example, substantial evidence in psycholinguistic research suggests that bilingualism slows down speakers’ lexical processing in each of the two languages (e.g., Gollan et al., 2005; Ivanova & Costa, 2008; Poarch & van Hell, 2012; Sullivan et al., 2018). On the other hand, linguists have conducted numerous studies to investigate phonetic change in bilinguals’ first language (L1) and second language (L2) production (e.g., Cook, 2003; Flege, 1995; Kartushina et al., 2016). However, fewer studies have focused on the intermediate process—the phonological level—in bilingual word production (e.g., Roelofs, 2003). The current study aims to add evidence to this relatively small but growing body of research.

In one influential model of speech production, the WEAVER++ model (Levelt et al., 1999; Roelofs, 2015), the phonological word form is constructed on the fly in a stage called phonological encoding. While a general process entailing content-to-frame association is proposed for this stage (see Verdonschot et al., 2020 for a possible exception in Korean), what content and frame are involved in this process may differ across languages. In Germanic languages such as Dutch and English, the content is phonemic segments, and the metrical frame mainly indicates stress pattern of the syllables (e.g., stress is on the second syllable in the word tattoo). During phonological encoding, the retrieved segments fill in the metrical frame incrementally in a rightward direction, called syllabification. Evidence suggests that online syllabification may not necessarily happen in production of all languages (for relevant evidence in Chinese, see e.g., Chen et al., 2016; O’Séaghdha et al., 2010; You et al., 2012). In the Chinese version of WEAVER++ (Roelofs, 2015), atonal syllables (i.e., syllables irrespective of tone) are the content and associated with the tonal frame directly, and segments within a syllable are selected simultaneously. One possible reason for this language-specific property is that the position of a segment within Chinese syllables is context-independent, unlike in Germanic languages (e.g., the segment /k/ is a syllable coda in the single word take, but it becomes an onset of the second syllable in the phrase take it). Without the need for online syllabification, segments can be already syllabified before the online phonological encoding stage and thus are likely to be specified straightforwardly following syllable-to-tone association. 
Such a difference in the grain size of primary phonological units (PUs) (i.e., phonemic segments in Germanic languages and atonal syllables in Chinese) leads to the question how bilinguals construct phonological words in speaking each language. Especially when the two languages differ in the primary PUs adopted by their native speakers, will bilinguals’ phonological encoding in a target language be influenced by their experience of the non-target language? The answer to this question may demonstrate how much plasticity and flexibility bilingual minds have in representing and accessing information of each language.

Adopting the form-preparation paradigm (Meyer, 1990), Li et al. (2017) asked native Chinese speakers and native English speakers to name pictures in two different contexts: a homogeneous context where all the picture names started with the same onset segment; and a heterogeneous context (control) where the picture names were phonologically unrelated. Native English speakers showed a typical onset-preparation effect in English (i.e., shorter naming latencies in the homogeneous context than in the control), suggesting their use of segments as phonological planning units in spoken word production. On the other hand, native Chinese speakers showed a null effect in Chinese. Most importantly, when native Chinese speakers who spoke English as a second language (Chinese ESLs) did this task in English, they also failed to show the onset-preparation effect until the last of the three blocks. This finding suggests that the Chinese ESLs’ L2 phonological encoding was influenced by L1 experience, at least before enough practice was done in a certain task. However, it is unclear whether Chinese ESLs will use smaller PUs in L2 word production as their L2 experience increases.

Nakayama et al. (2016) further investigated a similar question among Japanese ESLs. Unlike speakers of Germanic languages or Chinese, native Japanese speakers use morae as primary PUs in spoken word production (Kureta et al., 2006; Verdonschot et al., 2011). Nakayama et al. (2016) compared two groups of Japanese ESLs with different L2 proficiency in a masked priming task. The participants read aloud English words which were preceded by masked primes. The target word and the prime shared just the onset or the onset plus vowel (i.e., consonant–vowel (CV), like a mora). Relative to the unrelated control, both groups responded faster in the CV-related condition (i.e., CV priming effect). In contrast, only the high-proficient group showed an onset priming effect; the low-proficient group did not. In a further analysis, they found that the onset priming effect was modulated by the time spent in an English-speaking country, but not by L2 proficiency or age of acquisition (AoA). Consistent with this finding, Verdonschot and Masuda (2020) found that low-proficient Japanese ESLs were more likely to insert epenthetic vowels after consonants (e.g., smack may be pronounced as “sumacku”) than high-proficient ones were, suggesting that low-proficient bilinguals might use mora-sized units (i.e., L1 primary PUs) in L2 word production.

The L1 influence on L2 production in the above-mentioned studies is a phenomenon of forward transfer (FT) (Kartushina et al., 2016). Since L1 is typically acquired earlier than L2, it is not surprising that existing L1 representations could affect the formation of L2 representations. In contrast, backward transfer (BT) (i.e., L2 influence on L1 production), if any at the phonological level, would arguably require more plasticity within bilingual speakers, which is thus in greater need of empirical examination. In Verdonschot et al. (2013), high-proficient Chinese ESLs showed a masked syllable priming effect in Chinese and a masked onset priming effect in English. When the syllable structure of the target word and the prime overlapped, they even showed an onset priming effect in L1 Chinese, which suggests the influence of L2 experience on L1 word production. On the other hand, Ida et al. (2015) found that high-proficient Japanese ESLs who were able to use phonemic segments as primary PUs in English word production did not show a masked onset priming effect in Japanese, which is somewhat inconsistent with the finding of Verdonschot et al. (2013).

So far, available evidence suggests FT of L1 primary PUs to L2 word production among low-proficient ESLs but not high-proficient ESLs, whereas the existence of BT of primary PUs remains unclear and requires further investigation. Accordingly, one can imagine that when people start to learn an L2, they may use L1 primary PUs to encode L2 words at first (i.e., FT). As L2 proficiency increases, they may gradually become more like native speakers of L2, using their primary PUs in planning L2 word production (i.e., a decrease in FT). However, this is probably not sufficient for BT, if any at the phonological level, to occur. That means even if BT of primary PUs exists, there could be a stage, where its influence is not strong enough to be manifested, at which bilinguals appear to use language-dependent PUs (e.g., Ida et al., 2015). In that case, the effect of BT might be stronger among the participants of Verdonschot et al. (2013) and thus able to be observed to some extent (yet not robust). Alternatively, it is also possible that BT of primary PUs does not exist so that high-proficient bilinguals will keep using language-dependent PUs in planning spoken word production. To resolve this question, the current study attempts to find evidence for BT of primary PUs in Chinese ESLs. Given the difficulty in observing an L1 onset effect even in many high-proficient ESLs (Ida et al., 2015; Verdonschot et al., 2013), we chose to investigate the potential influence of L2 experience on Chinese ESLs’ sub-syllabic processing in L1 spoken word production.

While atonal syllables are assumed to be the primary PUs in Chinese word production, previous research has found evidence for sub-syllabic processing in Chinese word production (e.g., Verdonschot et al., 2015; Wong & Chen, 2008). Adopting the picture–word interference (PWI) paradigm, Wong and Chen (2008, 2009) required native Chinese speakers to name aloud pictures while ignoring word distractors that were phonologically related or unrelated to the picture name. A facilitation effect was observed when the picture name and the distractor shared the same rhyme. Such a rhyme effect has been observed in English as well and localized at an early stage before motor output (Lupker, 1982). Moreover, in another Chinese word production study, Feng et al. (2019) combined the PWI paradigm with the technique of event-related brain potential and found that the rhyme effect occurred more than 60 milliseconds (ms) later than the syllable effect did, suggesting that syllabic and sub-syllabic effects originate from different processes during production (i.e., the syllable-to-tone association process and the subsequent segmental specification process). This result is consistent with the findings of previous PWI studies which compared the time course of processing syllables and syllable bodies in Chinese word production (Wang et al., 2018; Wong et al., 2019).

If backward transfer of primary PUs exists among Chinese ESLs, the role of PUs that are smaller than atonal syllables should be strengthened. Therefore, one would expect to see not only a larger onset effect but also a larger rhyme effect when a certain level of L2 proficiency is achieved. Hence, the current study examined whether English-speaking experience increases the L1 rhyme effect in Chinese ESLs. A larger sub-syllabic effect in high-proficient Chinese ESLs than that in low-proficient ones would suggest that speaking ESL increases the salience of sub-syllabic units in Chinese word production, which would support the existence of BT of primary PUs among Chinese ESLs.

In the current study, instead of directly comparing high-proficient and low-proficient Chinese ESLs, we pooled data from eight PWI experiments (186 Chinese ESLs in total) and used multiple regression analysis and linear mixed-effects modeling (LMEM; Baayen et al., 2008) to investigate how different aspects of L2 experience (L2 proficiency, AoA, and years of use) influence Chinese ESLs’ phonological encoding in L1 word production. L2 proficiency and AoA are two of the most frequently measured variables in the bilingual literature (Li et al., 2006). Another frequently used variable is years of residence in the place where L2 is spoken, which was not included in the current study. This is because our participants were university students from Mainland China and they typically learned to speak ESL at school (not as a result of residing in an English-speaking country). English is one of the subjects in the College Entrance Examination in the People’s Republic of China, and university students continue studying and using English. Hence, we chose to measure years of use instead as the third L2-related variable (e.g., if a 19-year-old participant started to learn English at 12 years old, years of L2 use would be seven). As mentioned above, the L1 rhyme effect in the PWI paradigm was analyzed to examine the influence of L2-related variables on Chinese ESLs’ L1 sub-syllabic processing. Moreover, the L1 syllable effect was also analyzed for comparison. If BT in bilingual word production exists, the expected pattern of results is that all or some of the L2-related variables would contribute significantly to the L1 rhyme effect but not to the L1 syllable effect.
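As a minimal sketch of how the third L2-related variable could be derived, the following Python snippet computes years of L2 use as current age minus AoA, following the example just given. The function name and the input check are illustrative assumptions and are not taken from the original analysis scripts.

```python
def years_of_l2_use(age: int, aoa: int) -> int:
    """Years of L2 use, assumed here to be current age minus age of acquisition."""
    if aoa > age:
        raise ValueError("age of acquisition cannot exceed current age")
    return age - aoa


# Example from the text: a 19-year-old who started to learn English at 12.
print(years_of_l2_use(19, 12))  # 7
```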

Method

Participants

A total of 224 students from the Chinese University of Hong Kong (where English was used as the medium of instruction) participated in eight experiments with monetary rewards (around 6 USD per participant). They were native Mandarin Chinese speakers from Mainland China who spoke ESL, neurologically healthy, and had normal or corrected-to-normal vision. Each of them participated in only one of the eight experiments, and informed consent was obtained at the beginning of the experiment.

The participants provided their language background information in a questionnaire (questions adapted from Li et al., 2014), including rating their English proficiency levels in listening, speaking, reading, and writing (1 means none, 4 means adequate, and 7 means excellent). Among the 224 participants, 31 spoke a third language other than Chinese and English (e.g., Japanese, Korean, or Spanish) and were excluded from the current study; also excluded were seven participants who had stayed in a foreign country for more than 12 months. Among the 186 remaining participants (34 males; mean age = 22.2 years, standard deviation (SD) = 2.7 years), some acquired other Chinese dialects, such as Cantonese and Southern Min, before or at the same time as acquiring Mandarin. We assume that the influence of speaking other Chinese dialects on the current study is minimal, since similar patterns of syllabic and sub-syllabic effects have been obtained across different Chinese dialects in the PWI studies (e.g., Wang et al., 2018; Wong & Chen, 2008, 2009). Table 1 summarizes the language background information (see Table S1 in the Online Supplementary Material for descriptive statistics per experiment).

Table 1.

Statistical summary of the participants’ second language (L2) background.

	n	Mean	Standard deviation	Median	Minimum	Maximum	Skewness	Kurtosis
	Experiments 1–6
L2 speaking proficiency	140	4.4	0.9	4	2	7	0.4	0.2
Age of acquisition of L2	140	7.7	2.6	7	2	13	0.2	−0.6
Years of L2 use	140	13.8	2.8	14	7	22	0.5	0.3
	Experiments 1–8
L2 speaking proficiency	186	4.4	0.9	4	2	7	0.4	0.1
Age of acquisition of L2	186	7.8	2.6	7	2	13	0.2	−0.6
Years of L2 use	186	13.8	2.8	14	7	22	0.5	0.2

Experiments

The eight experiments were independent of each other, but all adopted the PWI paradigm. They were conducted originally to replicate previous findings in the Cantonese PWI studies (e.g., Wong & Chen, 2008, 2009) and to experiment with a few parameters, including duration of distractor presentation and stimulus onset asynchrony (SOA). Since most results are consistent with previous findings in Cantonese, only Experiment 3 has been reported due to its originality in showing the time course of syllabic and sub-syllabic processing in Chinese word production (Wang et al., 2018). Here we briefly describe the stimuli and procedure (see Wang et al., 2018 for details), which were largely similar across experiments.

In each experimental trial, the participants were required to say aloud the predesignated name of a picture presented on a computer screen (i.e., the target) in L1 Mandarin Chinese while ignoring a written Simplified Chinese character (i.e., the distractor) superimposed on the picture. Different sets of Chinese characters were paired with the monosyllabic or disyllabic picture names to generate different types of target–distractor relatedness. For each related condition, the picture names and the distractors were also recombined to generate the corresponding unrelated condition (as control). For example, six pairing conditions were included in Experiment 1: syllable-related (i.e., the target and the distractor shared the same atonal syllable); syllable-unrelated; rhyme-related; rhyme-unrelated; onset-related; and onset-unrelated. The distractor was presented before, at, or after picture onset (i.e., different SOAs), and it either disappeared after a short fixed duration or stayed on the screen until a naming response was detected. Different conditions were intermixed in a number of blocks and randomly presented.

Table 2 displays the major parameters in which the eight experiments differed. All included the syllable-related and syllable-unrelated conditions, and six of them (Experiments 1–6) included the rhyme-related and rhyme-unrelated conditions (140 participants, 29 males; mean age = 21.6 years, SD = 2.7 years). Participants’ mean naming latencies in these four conditions can be found in the Online Supplementary Material Table S2.

Table 2.

Parameters of the eight picture–word interference experiments.

Experiment	Number of participants	Number of pictures	Target	Duration of distractor presentation	Type of target–distractor relatedness	Stimulus onset asynchrony
1	26	24	Monosyllabic	Stay on screen	Syllable, rhyme (and onset)	0 (and −75, 75) milliseconds (ms)
2	20	12	Monosyllabic	Stay on screen	Syllable, rhyme (and onset)	0 (and −150, 150) ms
3	28	24	Monosyllabic	200 ms	Syllable, rhyme (and body)	0 (and −100, 100) ms
4	22	24	Disyllabic	Stay on screen	1st syllable, 1st rhyme (and 2nd syllable, 2nd rhyme)	0 ms
5	25	24	Disyllabic	200 ms	1st syllable, 1st rhyme (and 1st body)	0 (and −100, 100) ms
6	19	24	Disyllabic	200 ms (and 300 ms)	1st syllable, 1st rhyme (and 2nd syllable, 2nd rhyme)	0 ms
7	23	24	Disyllabic	200 ms	1st syllable (and 1st body, 2nd syllable, 2nd body)	0 (and 100) ms
8	23	24	Disyllabic	200 ms	1st syllable (and 2nd syllable, both syllables)	0 (and −100, 100) ms

Note: for the type of target–distractor relatedness, “syllable” means that the target and the distractor shared the same atonal syllable but differed in the tone. Conditions in parentheses were included in the experiments but excluded in the current analyses.

Data analysis

The effect of a certain type of target–distractor relatedness on Chinese ESLs’ spoken word production was calculated by subtracting each individual participant’s mean naming latency (RT) in the related condition from that in the corresponding unrelated condition (e.g., dif_rhy = RT_{rhyme-unrelated} - RT_{rhyme-related}). The current study focused on the rhyme effect and the syllable effect. For experiments with disyllabic picture names, we only considered conditions where the distractor was paired with the first syllable of the picture name. Besides, we only examined the rhyme effect and the syllable effect at the SOA shared in all the experiments (i.e., 0-ms SOA). Other data (parenthesized in Table 2) were not analyzed in the current study.
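The subtraction described above can be sketched in Python as follows. The trial records, participant labels, and latencies are hypothetical values invented for illustration; only the definition of the effect (mean unrelated RT minus mean related RT, per participant) comes from the text.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trial records: (participant, condition, naming latency in ms).
trials = [
    ("p01", "rhyme-related", 600), ("p01", "rhyme-related", 620),
    ("p01", "rhyme-unrelated", 630), ("p01", "rhyme-unrelated", 640),
    ("p02", "rhyme-related", 700), ("p02", "rhyme-unrelated", 705),
]


def relatedness_effects(records, kind):
    """Return {participant: mean RT (unrelated) - mean RT (related)} for one relatedness type."""
    rts = defaultdict(lambda: defaultdict(list))
    for participant, condition, rt in records:
        rts[participant][condition].append(rt)
    return {
        participant: mean(by_cond[f"{kind}-unrelated"]) - mean(by_cond[f"{kind}-related"])
        for participant, by_cond in rts.items()
    }


dif_rhy = relatedness_effects(trials, "rhyme")
print(dif_rhy)  # {'p01': 25, 'p02': 5}
```

A positive value indicates facilitation, that is, faster naming when the distractor is related to the picture name.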

In order to find out the variables that significantly influenced the L1 rhyme effect (dif_rhy) in Experiments 1–6, L2 speaking proficiency (pro), AoA of L2 (AoA), and years of L2 use (use) were mean-centered and entered into a stepwise multiple regression analysis in R Version 3.5.3 (R Development Core Team, 2019). In addition, two categorical parameters of the experiments, target (tar: monosyllabic, disyllabic picture names) and duration of distractor presentation (dur: 200 ms, stay on screen), as well as their interactions with each above-mentioned variable were also entered into the analysis. The formula of the full model is [dif_rhy ~ AoA*tar*dur + use*tar*dur + pro*tar*dur]. Similarly, another stepwise multiple regression analysis was conducted to find out the variables that significantly influenced the L1 syllable effect (dif_syl) in Experiments 1–8: [dif_syl ~ AoA*tar*dur + use*tar*dur + pro*tar*dur].
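To make the structure of the full model explicit, the sketch below mean-centers a predictor and expands the crossing operator `*` of the R-style formula into its main effects and interactions. It is an illustration in Python rather than the R code used for the analysis, and the helper names (`mean_center`, `expand_crossing`, `full_model_terms`) are hypothetical.

```python
from itertools import combinations


def mean_center(values):
    """Subtract the sample mean from every value."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def expand_crossing(factors):
    """Expand an R-style crossing a*b*c into all main effects and interactions."""
    return [
        ":".join(combo)
        for size in range(1, len(factors) + 1)
        for combo in combinations(factors, size)
    ]


def full_model_terms(predictors, design_factors):
    """Unique terms of (predictor * design factors), collected over all predictors."""
    terms = []
    for predictor in predictors:
        for term in expand_crossing([predictor, *design_factors]):
            if term not in terms:
                terms.append(term)
    return terms


terms = full_model_terms(["AoA", "use", "pro"], ["tar", "dur"])
print(len(terms))  # 15
print(terms[:7])   # ['AoA', 'tar', 'dur', 'AoA:tar', 'AoA:dur', 'tar:dur', 'AoA:tar:dur']
```

Under this reading, the full model contains 15 candidate terms (plus the intercept), from which the stepwise procedure selects.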

Trial-by-trial RT data were then analyzed with LMEMs. Target–distractor relatedness (rel: related and unrelated), type of relatedness (typ: syllable and rhyme), number of presentations of the picture (rep; e.g., 3 if the picture appeared for the third time in an experiment, with non-analyzed conditions also counted), the variables retained in the above multiple regression analyses, and their interactions were fixed effects, while participants (subj) and items (pic) were random effects. By-participant and by-item random intercepts and random slopes for target–distractor relatedness were included. The lmerTest package (Kuznetsova et al., 2017) was used to calculate p values with Satterthwaite approximation.

Results

In the two stepwise multiple regression analyses, the best models were [dif_rhy ~ use + dur] and [dif_syl ~ tar + use], respectively. The results did not change when we replaced L2 speaking proficiency with L2 average proficiency (in listening, speaking, reading, and writing). Hence, years of L2 use, duration of distractor presentation, and target were entered into the LMEMs. The formula of the first LMEM (m1) was [RT ~ rel * use_centered * typ + rel * tar * typ + rel * dur * typ + rel * rep * typ + (1 + rel | subj) + (1 + rel | pic)]. A likelihood ratio test showed that removing the variable times of repeated presentation and its interactions with other variables (m2) did not influence model fit significantly (χ²₍₄₎ = 1.55, p = 0.818), indicating that the effect of target–distractor relatedness and its interaction with type of relatedness did not vary with item repetition.
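The p value of the likelihood ratio test can be checked from the reported statistic alone. The sketch below uses the closed-form upper-tail probability of the chi-square distribution for even degrees of freedom, so no statistics library is needed. The function name is hypothetical, and the four degrees of freedom are assumed to correspond to the four terms involving rep that are dropped from m1.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) of a chi-square variate with even df."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    # P(X > x) = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


p = chi2_sf_even_df(1.55, 4)
print(round(p, 3))  # 0.818
```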

Table 3 shows the regression coefficients (β), standard errors (SE), t values, and p values of the fixed effects and their interactions in m2. Since the reference level of the variable type of relatedness was syllable, the coefficient of the term relUnrel:use_centered estimated that the facilitation effect of syllable-relatedness on naming latency increased significantly as years of L2 use increased (1.9 ms per year). The coefficient of the term relUnrel:use_centered:typRhy1 indicated that the increase of the rhyme facilitation effect with more years of L2 use (1.9 + 1.5 = 3.4 ms per year) was numerically, but not significantly, larger than that of the syllable facilitation effect. A further simplified model without the three-way interaction of years of L2 use, target–distractor relatedness, and type of relatedness (m3) can be found in the Open Science Framework repository.

Table 3.

Parameter estimates, standard errors, and statistical significance of the fixed effects in the linear mixed-effects modeling (LMEM) analysis of naming latencies.

	β	Standard error	t	p
Intercept	617.5	8.5	72.81	<0.00001***
relUnrel	25.0	3.7	6.75	<0.00001***
use_centered	0.5	1.9	0.28	0.778
typRhy1	23.2	3.9	5.96	<0.00001***
tarMono	−45.9	13.7	−3.35	0.0009***
durLong	75.2	12.9	5.84	<0.00001***
relUnrel:use_centered	1.9	0.9	2.07	0.040*
relUnrel:typRhy1	−20.5	5.3	−3.89	0.0001***
use_centered:typRhy1	−0.7	0.9	−0.75	0.452
relUnrel:tarMono	21.0	6.1	3.44	0.0007***
typRhy1:tarMono	1.4	5.3	0.27	0.786
relUnrel:durLong	−3.5	6.0	−0.59	0.558
typRhy1:durLong	−1.3	5.3	−0.25	0.805
relUnrel:use_centered:typRhy1	1.5	1.2	1.25	0.210
relUnrel:typRhy1:tarMono	−16.6	7.4	−2.26	0.024*
relUnrel:typRhy1:durLong	16.7	7.4	2.26	0.024*

Notes: LMEM formula: RT ~ rel * use_centered * typ + rel * tar * typ + rel * dur * typ + (1 + rel | subj) + (1 + rel | pic). use_centered, mean-centered years of second language use; subj, subject; pic, picture. Dummy coding was adopted for the following factors: rel, target–distractor relatedness (reference: related; relUnrel, unrelated); typ, type of relatedness (reference: syllable; typRhy1, rhyme); tar, target (reference: disyllabic; tarMono, monosyllabic); dur, duration of distractor presentation (reference: 200 milliseconds; durLong, stay on screen). *p < 0.05, **p < 0.01, ***p < 0.001.
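Because the factors are dummy coded, the model-implied facilitation effect (unrelated minus related RT) in any cell is the sum of the Table 3 coefficients that involve relUnrel; all other terms cancel in the subtraction. The Python sketch below illustrates this reading of the fixed effects. The dictionary keys and function names are hypothetical shorthand, and the computation ignores random effects and estimation uncertainty.

```python
# Fixed-effect estimates that involve relUnrel, copied from Table 3 (in ms).
COEF = {
    "rel": 25.0,
    "rel:use": 1.9,
    "rel:typ": -20.5,
    "rel:tar": 21.0,
    "rel:dur": -3.5,
    "rel:use:typ": 1.5,
    "rel:typ:tar": -16.6,
    "rel:typ:dur": 16.7,
}


def facilitation(use_centered, rhyme=False, monosyllabic=False, stay_on_screen=False):
    """Model-implied RT difference (unrelated minus related) in ms for one design cell."""
    typ, tar, dur = int(rhyme), int(monosyllabic), int(stay_on_screen)
    return (
        COEF["rel"]
        + COEF["rel:use"] * use_centered
        + COEF["rel:typ"] * typ
        + COEF["rel:tar"] * tar
        + COEF["rel:dur"] * dur
        + COEF["rel:use:typ"] * use_centered * typ
        + COEF["rel:typ:tar"] * typ * tar
        + COEF["rel:typ:dur"] * typ * dur
    )


def slope_per_year(rhyme=False):
    """Change in the facilitation effect for one additional year of L2 use."""
    return facilitation(1.0, rhyme) - facilitation(0.0, rhyme)


print(round(slope_per_year(rhyme=False), 1))  # 1.9
print(round(slope_per_year(rhyme=True), 1))   # 3.4
```

At the reference levels (disyllabic targets, 200-ms distractors) and at the mean years of L2 use, this gives a syllable effect of 25.0 ms and a rhyme effect of 25.0 − 20.5 = 4.5 ms.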

Discussion

The current study aimed to reveal whether experience with a later acquired L2 influences L1 phonological encoding when bilinguals produce spoken words. It investigated how different aspects of English-speaking experience might influence Chinese ESLs’ L1 word production, including L2 self-rated proficiency, AoA, and years of use. Data of 186 participants from eight originally independent experiments were pooled, all of which adopted an L1 PWI task. The combined analysis revealed that years of L2 use significantly increased both syllabic and sub-syllabic effects in this task, while the other L2-related variables had no significant influence.

Interpretation of the L2 influence on the L1 syllabic and sub-syllabic effects

Syllables are very salient PUs for native Chinese speakers (Chen, 2000; Chen et al., 2007), probably because Chinese is a syllable-timed language in which syllables even correspond to discrete written characters. The syllable effect is very robust in behavioral experiments (Chen et al., 2016; O’Séaghdha et al., 2010; You et al., 2012). It is thus unexpected that L2 use would further increase the L1 syllable effect in the PWI paradigm. Two tentative explanations are proposed here. First, our syllable-related distractors did not share the entire syllable with the target, but only the atonal syllable (i.e., the target and the distractor differed in tone). The lack of lexical tone in English might increase the salience of atonal syllables in Chinese ESLs. Second, L2 use might influence general cognitive processes (e.g., Kroll & Bialystok, 2013). For example, if longer L2 use somehow makes Chinese ESLs activate distractor representations to a larger extent, the target–distractor relatedness will generate a larger effect. The current results cannot distinguish between these explanations, but the L2 influence on the L1 syllable effect is not the main focus of the current study.

Notably, the second account mentioned above also applies to the L2 influence on the L1 rhyme effect. If it is true, does that mean the increase in the rhyme effect was determined by the participants’ general cognitive abilities but not the salience of sub-syllabic units at all? Given the modeling results, this should not be the case. Note that the average rhyme effect was much smaller than the average syllable effect in Experiments 1–6 (14.1 vs. 34.2 ms: t₍₁₃₉₎ = −4.86, p < 0.001). If the increases in the rhyme effect and the syllable effect shared the same underlying cause(s), the increasing rate of the rhyme effect would be proportionally smaller than that of the syllable effect. But the current results showed that the increasing rate of the rhyme effect was larger than that of the syllable effect (3.4 vs. 1.9 ms per year), though not significantly. Hence, at least part of the increase in the rhyme effect could be attributed to the increased salience of sub-syllabic units in the Chinese ESLs as a result of a longer period of L2 use.
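The proportionality argument can be made concrete with a short calculation. Assuming, for illustration only, that a shared cause would scale both effects by the same factor, the rhyme slope expected from the syllable slope is the syllable slope multiplied by the ratio of the two mean effects. The function name is hypothetical; the inputs are the values reported above.

```python
def proportional_slope(reference_slope, reference_effect, target_effect):
    """Slope expected for the target effect if it scaled in proportion to the reference effect."""
    return reference_slope * target_effect / reference_effect


# Reported values: mean effects in Experiments 1-6 (ms) and slopes (ms per year of L2 use).
syllable_effect, rhyme_effect = 34.2, 14.1
syllable_slope, observed_rhyme_slope = 1.9, 3.4

expected_rhyme_slope = proportional_slope(syllable_slope, syllable_effect, rhyme_effect)
print(round(expected_rhyme_slope, 2))  # 0.78
```

Under this assumption the rhyme effect would grow by roughly 0.8 ms per year, well below the observed estimate of 3.4 ms per year, although the difference between the two observed slopes was itself not significant.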

Comparison with the study of phonological awareness

The grain size of phonological representations (Ziegler & Goswami, 2005) has been widely investigated in the reading literature as well, mainly through phonological awareness tests (McBride, 2016). Such tests require participants to manipulate PUs at different levels (e.g., to delete 苹 /ping2/ from 苹果 /ping2 guo3/, to transpose the onset segments of big dot). A large number of studies have shown that the granularity of the writing system influences the development of phonological awareness (e.g., Holm & Dodd, 1996; Huang & Hanley, 1995). For example, English readers typically perform better than Chinese readers (especially those who do not know any alphabetic phonetic scripts such as Pinyin and Zhuyin) in phonemic awareness, since grapheme-to-phoneme correspondence exists in English but not in Chinese. Researchers are interested in cross-language transfer of phonological awareness, that is, how phonological awareness developed in one language of bilinguals is related to that in the other language while general verbal and nonverbal abilities are controlled for (e.g., Luo et al., 2014; Wang et al., 2005).

It is worth noting, however, that phonological awareness at a certain level does not equal salience of certain PUs during spoken word production. Native Chinese speakers who had learned Pinyin or Zhuyin still showed no onset effect in previous studies of spoken word production (Chen et al., 2016; O’Séaghdha et al., 2010; Verdonschot et al., 2015). These participants were most likely to have good onset awareness, since onset segments are explicitly represented in Pinyin and Zhuyin (Holm & Dodd, 1996; Huang & Hanley, 1995; McBride-Chang et al., 2004). The distinction between phonological awareness and salience of PUs is also supported by the evidence that the presence of a Chinese onset effect was modulated by orthographic form-cuing (Li et al., 2015) and participants’ orthographic experience (Li & Wang, 2017). Adopting the form-preparation paradigm, Li et al. (2015) found that native Chinese speakers showed a significant onset-preparation effect when the stimuli were Pinyin syllables but not when they were Chinese characters. It can be assumed that the phonological awareness of their participants did not change within the short experiment. Nevertheless, the salience of onset segments was flexibly promoted by the Pinyin forms (see Verdonschot et al., 2011; Yoshihara et al., 2017, 2020, for a similar influence of script type on Japanese spoken word production). Moreover, Li and Wang (2017) compared native Chinese speakers across multiple age groups (kindergartners, 1st graders, 2nd graders, 4th graders, and adults) and found that only 1st graders showed a significant onset-preparation effect. They proposed that this was because 1st graders started to learn Pinyin and received extensive Pinyin practice on a daily basis. 2nd graders and up were exposed to Pinyin to a smaller extent, which decreased the salience of onset segments, but their phonological awareness was very likely to be more developed than that of 1st graders.

The Chinese ESLs in the current study were able to use Pinyin. Thus, the following inference was made, based on the research findings mentioned above. Although the participants’ English-speaking experience might improve their phonological awareness and increase the salience of smaller PUs in Chinese at the same time, the increase in the rhyme effect was very likely not caused by a potential change in their rhyme awareness. A more plausible explanation of the underlying mechanism is provided below.

BT of primary PUs

The current findings provide evidence for the influence of L2 experience on bilinguals’ phonological encoding process during L1 word production. The significant influence of L2 use, in contrast with the null effects of L2 self-rated proficiency and AoA, on the salience of L1 sub-syllabic units is consistent with the finding of Nakayama et al. (2016) that among Japanese ESLs, the L2-masked onset priming effect was modulated by the time spent in an English-speaking country, but not by L2 proficiency or AoA. This pattern suggests the importance of exposure to and use of L2 in modulating the salience of certain PUs in both L1 and L2. Hence, what matters may be the experience of speaking an L2 with certain inherent characteristics. In the case of Chinese ESLs and Japanese ESLs, special characteristics of L2 English include the frequent occurrence of consonant clusters (e.g., spring and dust) and the need for resyllabification (e.g., take it becomes ta-kit), both of which emphasize phonological processing at the segmental level and thus may increase the salience of smaller PUs.

The FTs and BTs of primary PUs in bilingual spoken word production are compatible with the finding of Roelofs (2003) that phonological representations were shared between the two languages of bilingual speakers (see also Costa et al., 2006). For low-proficiency Chinese ESLs, the salience of shared syllabic units may be much higher than that of shared segmental units, so atonal syllables are the primary PUs in both L1 and L2 word production (i.e., FT). As L2 use (and probably L2 proficiency) increases, the salience of segmental units becomes higher and allows bilingual speakers to use segmental units as primary PUs under certain circumstances (e.g., when speaking English). Before BT shows a strong effect, there might be a stage where the salience of both syllabic units and segmental units is high and bilingual speakers can flexibly use language-dependent units as primary PUs (i.e., atonal syllables when speaking Chinese and segments when speaking English). The mechanism underlying such flexibility might be the same one that modulated the presence of a Chinese onset effect in the aforementioned study of Li et al. (2015), in which the Chinese-speaking participants showed a significant onset effect temporarily when the stimuli were Pinyin syllables but not when they were Chinese characters. If the salience of segmental units continues to be boosted and becomes much higher than that of syllabic units, one may expect to see an onset effect in both L1 and L2 word production (i.e., BT). However, this last stage may be relatively hard to achieve.

Although FTs and BTs of primary PUs can be nicely accounted for by shared phonological representations between the two languages, this account was not the focus of the present study, and thus the current findings do not provide direct evidence for shared phonological representations. Further studies are needed to find out whether shared phonological representations are the origin of FTs and BTs of primary PUs in bilinguals’ word production.

Contributions and limitations

Although a substantial number of studies have investigated the grain size of phonological representations in bilinguals, the majority are from the reading literature and focus on phonological awareness. The current study is one of a few that adopted behavioral methodologies to investigate the influence of L2 experience on phonological processing during L1 spoken word production. Importantly, multiple aspects of L2 experience (i.e., self-rated proficiency, AoA, and years of use) were examined at the same time in a relatively large sample. The significant influence of L2 use on the salience of L1 sub-syllabic units in Chinese ESLs provides strong evidence for BT of primary PUs in spoken word production.

On the other hand, the null effect of L2 self-rated proficiency in the current study should be interpreted with caution. Previous studies have shown that Chinese ESLs and Japanese ESLs do not use phonemic segments as primary PUs in English word production when their L2 proficiency level is not high enough (Li et al., 2017; Nakayama et al., 2016; Verdonschot & Masuda, 2020). In other words, FT may play a dominant role at the low-proficiency stage in L2. That means the effect of BT might be observed among highly proficient ESLs only, after the effect of FT decreases. Therefore, L2 proficiency should be an important factor in BT. The null effect of L2 proficiency in the current study might be due to the use of self-rated proficiency measures. Further studies are needed to investigate the role of L2 proficiency in BT with more objective proficiency measures. That being said, the observed significance of L2 use indicates that the current sample, though pooled from independent experiments, is representative of bilingual speakers with varying L2 experience, which allows the gradual process of BT of primary PUs to be captured.

A second limitation is that we investigated the L1 rhyme effect but not the L1 onset effect, in order to increase the possibility of observing an L2 influence. This was also due to the limited sample size in our data, which was insufficient for an attempt to analyze the onset effect (Experiments 1 and 2, n = 46). Future studies focusing on the L1 onset effect in Chinese ESLs are needed to further investigate the extent to which BT could influence the phonological encoding process in bilingual word production.

Conclusion

A combined analysis of data pooled across experiments revealed that years of L2 use may increase the salience of sub-syllabic units in L1 word production in Chinese ESLs, supporting BT of primary PUs in bilingual spoken word production. This demonstrates the plasticity of the phonological encoding process in bilingual speakers, which depends essentially on years of L2 use rather than L2 AoA. The current findings, together with previous studies that reported aspects of bilingual processing independent of L2 AoA (e.g., De Carli et al., 2015; Rossi & Prystauka, 2020), imply that it is never too late to learn an L2.

Supplemental Material

Supplemental material, sj-pdf-1-ijb-10.1177_13670069211031001 for Second language experience influences salience of phonological units in spoken word production in the first language by Jie Wang, Andus Wing-Kuen Wong and Hsuan-Chih Chen in International Journal of Bilingualism

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by a Direct Grant for Research from the Chinese University of Hong Kong (Grant Number: 4052098).

ORCID iD

Jie Wang

Data availability

The data that support the findings of the current study are available in the Open Science Framework repository ().

Supplemental material

Supplemental material for this article is available online.

Author biographies

Jie Wang is an assistant professor. She is broadly interested in human cognition, especially human language. Her research work mainly focuses on cognitive mechanisms underlying speech production.

Andus Wing-Kuen Wong is an associate professor. His research interests include Language processing, Speech production, Speech motor learning and control, Stress and speech performance, Visual word recognition, and Communication disorders. He uses both behavioral and cognitive neuroscience approaches in his study.

Hsuan-Chih Chen is a professor emeritus. His research investigates language processing, with a specific focus on the processing of spoken and written Chinese.

References

Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59(4), 390–412. https://doi.org/10.1016/j.jml.2007.12.005

Chen, J.-Y. (2000). Syllable errors from naturalistic slips of the tongue in Mandarin Chinese. Psychologia, 43(1), 15–26.

Chen, J.-Y., O’Séaghdha, P. G., & Chen, T.-M. (2016). The primacy of abstract syllables in Chinese word production. Journal of Experimental Psychology: Learning, Memory, and Cognition, 42(5), 825–836. https://doi.org/10.1037/a0039911

Chen, T.-M., Dell, G. S., & Chen, J.-Y. (2007). A cross-linguistic study of phonological units: Syllables emerge from the statistics of Mandarin Chinese, but not from the statistics of English. Chinese Journal of Psychology, 49(2), 137–144. https://doi.org/10.6129/CJP.2007.4902.02

Cook (Ed.). (2003). Effects of the second language on the first. Multilingual Matters.

Costa, Roelstraete, & Hartsuiker, R. J. (2006). The lexical bias effect in bilingual speech production: Evidence for feedback between lexical and sublexical levels across languages. Psychonomic Bulletin & Review, 13(6), 972–977. https://doi.org/10.3758/BF03213911

De Carli, Dessi, Mariani, Girtler, Greco, Rodriguez, Salmon, & Morelli (2015). Language use affects proficiency in Italian–Spanish bilinguals irrespective of age of second language acquisition. Bilingualism: Language and Cognition, 18(2), 324–339. https://doi.org/10.1017/S1366728914000054

Feng, Yue, & Zhang (2019). Syllables are retrieved before segments in the spoken production of Mandarin Chinese: An ERP study. Scientific Reports, 9(1), 1–9. https://doi.org/10.1038/s41598-019-48033-3

Flege (1995). Second language speech learning: Theory, findings, and problems. In Strange (Ed.), Speech perception and linguistic experience: Issues in cross-language research (pp. 233–277). York Press.

Gollan, T. H., Montoya, R. I., Fennema-Notestine, & Morris, S. K. (2005). Bilingualism affects picture naming but not picture classification. Memory & Cognition, 33(7), 1220–1234. https://doi.org/10.3758/BF03193224

Holm & Dodd (1996). The effect of first written language on the acquisition of English literacy. Cognition, 59(2), 119–147. https://doi.org/10.1016/0010-0277(95)00691-5

Huang, H. S., & Hanley, J. R. (1995). Phonological awareness and visual skills in learning to read Chinese and English. Cognition, 54(1), 73–98. https://doi.org/10.1016/0010-0277(94)00641-W

Ida, Nakayama, & Lupker, S. J. (2015). The functional phonological unit of Japanese–English bilinguals is language dependent: Evidence from masked onset and mora priming effects. Japanese Psychological Research, 57(1), 38–49. https://doi.org/10.1111/jpr.12066

Ivanova & Costa (2008). Does bilingualism hamper lexical access in speech production? Acta Psychologica, 127(2), 277–288. https://doi.org/10.1016/j.actpsy.2007.06.003

Kartushina, Frauenfelder, U. H., & Golestani (2016). How and when does the second language influence the production of native speech sounds: A literature review. Language Learning, 66(S2), 155–186. https://doi.org/10.1111/lang.12187

Kroll, J. F., & Bialystok (2013). Understanding the consequences of bilingualism for language processing and cognition. Journal of Cognitive Psychology (Hove, England), 25(5), 497–514. https://doi.org/10.1080/20445911.2013.799170

Kureta, Fushimi, & Tatsumi, I. F. (2006). The functional unit in phonological encoding: Evidence for moraic representation in native Japanese speakers. Journal of Experimental Psychology: Learning, Memory, and Cognition, 32(5), 1102–1119. https://doi.org/10.1037/0278-7393.32.5.1102

Kuznetsova, Brockhoff, P. B., & Christensen, R. H. B. (2017). lmerTest package: Tests in linear mixed effects models. Journal of Statistical Software, 82(13), 27pp. https://doi.org/10.18637/jss.v082.i13

Levelt, W. J. M. (1999). Models of word production. Trends in Cognitive Sciences, 3(6), 223–232. https://doi.org/10.1016/S1364-6613(99)01319-4

Levelt, W. J. M., Roelofs, & Meyer, A. S. (1999). A theory of lexical access in speech production. Behavioral and Brain Sciences, 22(1), 1–38. https://doi.org/10.1017/S0140525X99001776

Li & Wang (2017). The influence of orthographic experience on the development of phonological preparation in spoken word production. Memory & Cognition, 45(6), 956–973. https://doi.org/10.3758/s13421-017-0712-5

Li, Wang, & Davis, J. A. (2017). The phonological preparation unit in spoken word production in a second language. Bilingualism: Language and Cognition, 20(2), 351–366. https://doi.org/10.1017/S1366728915000711

Li, Wang, & Idsardi (2015). The effect of orthographic form-cuing on the phonological preparation unit in spoken word production. Memory & Cognition, 43(4), 563–578. https://doi.org/10.3758/s13421-014-0484-0

Li, Sepanski, & Zhao (2006). Language history questionnaire: A web-based interface for bilingual research. Behavior Research Methods, 38(2), 202–210. https://doi.org/10.3758/BF03192770

Li, Zhang, Tsai, & Puls (2014). Language history questionnaire (LHQ 2.0): A new dynamic web-based research tool. Bilingualism: Language and Cognition, 17(3), 673–680. https://doi.org/10.1017/S1366728913000606

Luo, Y. C., Chen, & Geva (2014). Concurrent and longitudinal cross-linguistic transfer of phonological awareness and morphological awareness in Chinese–English bilingual children. Written Language & Literacy, 17(1), 89–115. https://doi.org/10.1075/wll.17.1.05luo

Lupker, S. J. (1982). The role of phonetic and orthographic similarity in picture–word interference. Canadian Journal of Psychology, 36(3), 349–367. https://doi.org/10.1037/h0080652

McBride (2016). Children’s literacy development: A cross-cultural perspective on learning to read and write (2nd ed.). Routledge.

McBride-Chang, Bialystok, Chong, K. K. Y., & Li (2004). Levels of phonological awareness in three cultures. Journal of Experimental Child Psychology, 89(2), 93–111. https://doi.org/10.1016/j.jecp.2004.05.001

Meyer, A. S. (1990). The time course of phonological encoding in language production: The encoding of successive syllables of a word. Journal of Memory and Language, 29(5), 524–545. https://doi.org/10.1016/0749-596X(90)90050-A

Nakayama, Kinoshita, & Verdonschot, R. G. (2016). The emergence of a phoneme-sized unit in L2 speech production: Evidence from Japanese–English bilinguals. Frontiers in Psychology, 7. https://doi.org/10.3389/fpsyg.2016.00175

O’Séaghdha, P. G., Chen, J.-Y., & Chen, T.-M. (2010). Proximate units in word production: Phonological encoding begins with syllables in Mandarin Chinese but with segments in English. Cognition, 115(2), 282–302. https://doi.org/10.1016/j.cognition.2010.01.001

Poarch, G. J., & van Hell, J. G. (2012). Cross-language activation in children’s speech production: Evidence from second language learners, bilinguals, and trilinguals. Journal of Experimental Child Psychology, 111(3), 419–438. https://doi.org/10.1016/j.jecp.2011.09.008

R Development Core Team. (2019). R: A language and environment for statistical computing. R Foundation for Statistical Computing. http://www.R-project.org.

Roelofs (2003). Shared phonological encoding processes and representations of languages in bilingual speakers. Language and Cognitive Processes, 18(2), 175–204. https://doi.org/10.1080/01690960143000515

Roelofs (2015). Modeling of phonological encoding in spoken word production: From Germanic languages to Mandarin Chinese and Japanese. Japanese Psychological Research, 57(1), 22–37. https://doi.org/10.1111/jpr.12050

Rossi & Prystauka (2020). Oscillatory brain dynamics of pronoun processing in native Spanish speakers and in late second language learners of Spanish. Bilingualism: Language and Cognition, 23(5), 964–977. https://doi.org/10.1017/S1366728919000798

Sullivan, M. D., Poarch, G. J., & Bialystok (2018). Why is lexical retrieval slower for bilinguals? Evidence from picture naming. Bilingualism: Language and Cognition, 21(3), 479–488. https://doi.org/10.1017/S1366728917000694

Verdonschot, R. G., & Masuda (2020). Sumacku or Smack? The value of analyzing acoustic signals when investigating the fundamental phonological unit of language production. Psychological Research, 84(3), 547–557. http://doi.org/10.1007/s00426-018-1073-9

Verdonschot, R. G., Han, J. I., & Kinoshita (2020). The proximate unit in Korean speech production: Phoneme or syllable? Quarterly Journal of Experimental Psychology, 74(1), 187–198. https://doi.org/10.1177/1747021820950239

Verdonschot, R. G., Kiyama, Tamaoka, Kinoshita, La Heij, & Schiller, N. O. (2011). The functional unit of Japanese word naming: Evidence from masked priming. Journal of Experimental Psychology: Learning, Memory, and Cognition, 37(6), 1458–1473. https://doi.org/10.1037/a0024491

Verdonschot, R. G., Lai, Chen, Tamaoka, & Schiller, N. O. (2015). Constructing initial phonology in Mandarin Chinese: Syllabic or subsyllabic? A masked priming investigation. Japanese Psychological Research, 57(1), 61–68. https://doi.org/10.1111/jpr.12064

Verdonschot, R. G., Nakayama, Zhang, Tamaoka, & Schiller, N. O. (2013). The proximate phonological unit of Chinese–English bilinguals: Proficiency matters. PLoS ONE, 8(4). https://doi.org/10.1371/journal.pone.0061454

Wang, Wong, A. W.-K., & Chen, H.-C. (2018). Time course of syllabic and sub-syllabic processing in Mandarin word production: Evidence from the picture-word interference paradigm. Psychonomic Bulletin & Review, 25(3), 1147–1152. https://doi.org/10.3758/s13423-017-1325-5

Wang, Perfetti, C. A., & Liu (2005). Chinese–English biliteracy acquisition: Cross-language and writing system transfer. Cognition, 97(1), 67–88. https://doi.org/10.1016/j.cognition.2004.10.001

Wong, A. W.-K., & Chen, H.-C. (2008). Processing segmental and prosodic information in Cantonese word production. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34(5), 1172–1190. https://doi.org/10.1037/a0013000

Wong, A. W.-K., & Chen, H.-C. (2009). What are effective phonological units in Cantonese spoken word planning? Psychonomic Bulletin & Review, 16(5), 888–892. https://doi.org/10.3758/PBR.16.5.888

Wong, A. W.-K., Chiu, H.-C., Wang, Wong, S.-S., & Chen, H.-C. (2019). Electrophysiological evidence for the time course of syllabic and sub-syllabic encoding in Cantonese spoken word production. Language, Cognition and Neuroscience, 34(6), 677–688. https://doi.org/10.1080/23273798.2018.1562559

Yoshihara, Nakayama, Verdonschot, R. G., & Hino (2017). The phonological unit of Japanese Kanji compounds: A masked priming investigation. Journal of Experimental Psychology: Human Perception and Performance, 43(7), 1303–1328. http://dx.doi.org/10.1037/xhp0000374

Yoshihara, Nakayama, Verdonschot, R. G., & Hino (2020). The influence of orthography on speech production: Evidence from masked priming in word-naming and picture-naming tasks. Journal of Experimental Psychology: Learning, Memory, and Cognition, 46(8), 1570–1589. http://doi.org/10.1037/xlm0000829

You, Zhang, & Verdonschot, R. G. (2012). Masked syllable priming effects in word and picture naming in Chinese. PLoS ONE, 7(10). https://doi.org/10.1371/journal.pone.0046595

Ziegler, J. C., & Goswami (2005). Reading acquisition, developmental dyslexia, and skilled reading across languages: A psycholinguistic grain size theory. Psychological Bulletin, 131(1), 3–29. https://doi.org/10.1037/0033-2909.131.1.3
