Arabic speech intelligibility: Perception of spoken Arabic by native and non-native speakers

Abstract

Previous studies have shown that the native language backgrounds of both talkers and listeners affect speech intelligibility. This study investigated the interlanguage speech intelligibility benefit (ISIB), that is, the advantage in understanding second language (L2) speech that non-native listeners have over native listeners when both groups listen to talkers who share the non-native listeners’ first language (L1). More specifically, it examined the ISIB in relation to Arabic spoken by both native Arabic (NA) and native English (NE) speakers. To this end, 15 NA and 15 NE listeners heard Arabic produced by two groups of talkers (10 NA talkers and 10 NE talkers) and were asked to identify the words they heard. Results showed evidence for the ISIB for listeners (i.e. NE listeners were more accurate than NA listeners at identifying English-accented Arabic speech). However, no evidence for the ISIB for talkers was found. That is, NE listeners did not find English-accented speech more intelligible than NA speech. By examining L2 learners’ recognition of L2 consonant contrasts, the study contributes to the body of knowledge on L2 sound acquisition as well as the ISIB literature. It also provides some insight into adult L2 learners’ ability to learn novel L2 consonants.

Keywords

Arabic language, foreign accent, interlanguage speech intelligibility benefit, native language background, perception

I Introduction

Non-native speakers (NNSs) typically speak a second language with a foreign accent, defined as ‘a pronunciation deviating from what a native speaker (NS) expects another NS to sound like’ (Major, 2018, p. 1). Such deviation results from speech features that differ from those used by native speakers of the language, including differences in intonation, stress patterns, and rhythm.

To better understand speech intelligibility and the factors that affect it, researchers have begun to examine the intelligibility of native and non-native speech for both native and non-native listeners (NNLs). In this regard, prior studies have distinguished two types of factors. The first type comprises talker-related factors that may contribute to speech intelligibility, such as speech rate (Derwing & Munro, 2001), stress, pausing and intonation (e.g. Tajima et al., 1997; Trofimovich & Baker, 2006), neighborhood density (Imai et al., 2005), and word frequency (Bradlow & Pisoni, 1999). The second type comprises listener-related factors, including language proficiency (Stibbard & Lee, 2006), context (Field, 2004), and speech familiarity (Hsieh & Tsao, 2022; Saito et al., 2019).

Researchers have also discussed the presence of an interlanguage linguistic system containing speech features that differ from those of both the learners’ first language and their second language. The intelligibility of second language (L2) learners’ speech is accordingly related to the interlanguage speech intelligibility benefit (ISIB), which concerns the extent to which listeners can understand speech produced by non-native speakers (Bent & Bradlow, 2003; Gosselin et al., 2022; Hayes-Harb et al., 2008). This is a crucial aspect of second language acquisition (SLA) because it can greatly affect the success of communication in multilingual settings, and its study has grown in importance as the world becomes more globalized and multilingual. Several studies have therefore reported on the ISIB in relation to languages such as English (e.g. Bent & Bradlow, 2003; Imai et al., 2005; Major et al., 2002; Munro et al., 2006; Stibbard & Lee, 2006; Xie & Fowler, 2013) and Dutch (e.g. van Wijngaarden, 2001; van Wijngaarden et al., 2002). However, to our knowledge, no studies have examined the ISIB in Semitic languages, which remain understudied in SLA research. Hence, this article aims to build on recent work in this area and explore the ISIB in the context of the Arabic language. More particularly, it investigates how speech produced by L2 learners of Arabic is perceived by both native Arabic (NA) listeners and NNLs. By doing so, it contributes to an understanding of L2 speech intelligibility and adds to the literature on L2 sound acquisition.

II Literature review

1 Foreign accent and accented speech

When learning an L2, one of the biggest challenges that learners encounter is acquiring the new language’s phonemic system, that is, the set of sounds used to differentiate between words in that language (Albrechtsen et al., 1980; Selinker, 1972). Each language has its own set of phonemes, which can vary widely from one language to another. As a result, L2 learners often transfer the phonemic system of their native language to their L2, leading to the production of non-native sounds, stress patterns, and intonation patterns. For example, the English phoneme /ð/ (as in ‘the’ or ‘this’) does not exist in French, so French learners of English may substitute it with the closest French phoneme, which is /z/. This substitution can result in non-native-like pronunciation, as the two sounds are not acoustically identical, and it can make the speech difficult for native English (NE) listeners to understand, especially if they are not familiar with the phonemic inventory of French (Laeufer, 1996).

Similarly, Japanese speakers experience difficulty in distinguishing the English /l/ and /r/ sounds, which are two separate phonemes in English but not in Japanese, and this may lead to errors in comprehension and miscommunication (Guion et al., 2000; Hirata, 2004). Likewise, Jenkins (2000) found that Japanese, Taiwanese, and Korean speakers frequently produced phonetic substitutions of English sounds that led to unintelligible speech for listeners who did not share the speaker’s native language. Moreover, voiceless stop consonants in Spanish have a much shorter voice onset time (VOT) than in English; that is, the delay between the release of the stop and the onset of voicing is much shorter. As a result, native Spanish speakers tend to produce English voiceless stop consonants with shorter VOT values than NE speakers do (Sole, 2018). Taken together, these findings indicate that sounds absent from one’s first language are difficult to produce and perceive, which may pose a challenge in cross-linguistic communication.

It is also well documented that foreign-accented speech is less intelligible for native listeners who are not familiar with the accent, because its pronunciation, stress patterns, and intonation differ from what listeners are accustomed to hearing (Bradlow & Bent, 2002; Munro, 1998; Munro & Derwing, 1995; Sereno et al., 2016; Thomson, 2018). Due to these variations, it becomes difficult for listeners to recognize individual words and phrases, or to comprehend the overall meaning of a sentence. Foreign accents also cause confusion around sounds that are similar in the listener’s first language but distinct in the speaker’s language (Major et al., 2002). Furthermore, researchers have become interested in understanding how listeners’ language backgrounds can affect speech intelligibility. In this respect, Smith (1992) has pointed out that ‘native speakers were not found to be the most easily understood, nor were they, as subjects, found to be the best able to understand the different varieties of English’ (p. 88).

Prior research has indicated that speech intelligibility is affected not only by the talker’s pronunciation, but also by listener variables such as vocabulary knowledge (Politzer, 1976), familiarity with the topic being discussed (Gass & Varonis, 1984), and the listener’s ability to process speech sounds that are not part of their native language (Jenkins, 2000, 2002). In other words, a listener who has a limited vocabulary in the language being spoken may struggle to understand a non-native speaker who uses complex or unfamiliar words. Similarly, a listener who is not familiar with the topic being discussed may find it harder to follow the speaker’s train of thought. As a result, researchers have increasingly focused on understanding how listener variables interact with talker variables to affect speech intelligibility. This includes studying factors such as the listener’s cognitive abilities, their prior exposure to the non-native language, and their level of experience in listening to non-native speech. By gaining a better understanding of the role of listener variables, researchers aim to develop more effective techniques for enhancing the intelligibility of non-native speech (Lee et al., 2022; van Wijngaarden et al., 2002).

2 ISIB in SLA research

Research in SLA has shown that non-native listeners may have an advantage over native listeners in understanding non-native speech produced by talkers who share their native language (Dokovova et al., 2022; Fishero et al., 2023; Hayes-Harb et al., 2008; Korpal & Sobkowiak, 2020; Munro et al., 2006). This advantage may arise because non-native listeners are attuned to the phonetic and prosodic features of their native language, including its patterns of stress, intonation, and rhythm, and are therefore better able to recognize these features when they carry over into non-native speech. Bent and Bradlow (2003) termed this phenomenon the ISIB, which they defined as ‘the benefit afforded by a shared interlanguage between a non-native talker and listener’ (p. 1602). Their study explored how the native language backgrounds of non-native listeners affected their ability to comprehend non-native speech. The authors recruited sixty-three participants whose native languages were English, Chinese, and Korean, in addition to a mixed group of learners with other native language backgrounds. All participants listened to English speech produced by NE, Chinese, and Korean talkers and were asked to transcribe the sentences they heard. The results showed that native Mandarin and Korean listeners found English speech produced by high-proficiency talkers who shared their native language as intelligible as speech produced by NE talkers. The researchers identified this phenomenon as ‘matched ISIB’. The study also found evidence for mismatched ISIB, as non-native listeners from the mixed native language group found the speech produced by the high-proficiency Korean talker, with whom they did not share a native language, equally or more intelligible than the speech produced by the NE talker.

A handful of other studies have also explored the ISIB. For example, van Wijngaarden et al. (2002) investigated the role of language proficiency in ISIB effects: native Dutch listeners heard Dutch, English, and German sentences produced by speakers of these languages. Findings revealed that an intelligibility benefit emerged for German rather than for English speech, which was explained in light of the listeners’ low proficiency in German and high proficiency in English. This implies that the ISIB can be detected when listeners have limited proficiency in the target language. In addition, Imai et al. (2005) used eighty familiar English words produced by one NE talker and one native Spanish talker in a word recognition task completed by NE and native Spanish listeners. Listeners in the two groups were asked to write down each word they heard. The results indicated that native Spanish listeners were more accurate than NE listeners at identifying words produced by native Spanish talkers.

Similarly, Munro et al. (2006) asked participants from four different first language (L1) backgrounds (Cantonese, Japanese, Spanish, and English) to transcribe and rate English sentences produced by L2 learners of English from four different native language backgrounds (Cantonese, Japanese, Polish, and Spanish). The study reported no advantage for native Cantonese or Spanish listeners in understanding Cantonese-accented English and Spanish-accented English, respectively, compared to NE listeners. However, native Japanese listeners found the speech of native Japanese talkers more intelligible than did NE listeners. There was also a moderate to high correlation between participants’ intelligibility scores and their ratings of both comprehensibility and accentedness, suggesting that native and non-native listeners responded to the speech in similar ways regardless of their linguistic backgrounds. The results therefore suggested that listeners’ ability to understand L2 speech is affected by the resemblance between the listener’s native language and the accent of the speaker, and that it is related to their perception of comprehensibility and accentedness. Thus, it is important to consider both the listener’s native language and the speaker’s accent when studying second language speech perception.

Furthermore, Hayes-Harb et al. (2008) identified two types of ISIB in previous research: ISIB for listeners (ISIB-L) and ISIB for talkers (ISIB-T). The former refers to the phenomenon where non-native listeners find non-native speech more intelligible than native listeners do, whereas the latter refers to the phenomenon where non-native listeners find the speech of non-native talkers more intelligible than the speech of native talkers. These two types of ISIB are distinct and independent phenomena, and they have been observed in various studies using different languages and populations (e.g. Imai et al., 2005; Munro et al., 2006; Stibbard & Lee, 2006). To explore these two types of ISIB, Hayes-Harb and colleagues asked NE listeners and native Mandarin listeners to identify individual English words, produced by six native speakers of English and six native speakers of Mandarin, in a forced-choice word identification task. The results provided evidence for the ISIB-L, as native Mandarin listeners understood Mandarin-accented English speech better than NE listeners did. However, no evidence for the ISIB-T was found, as the speech of native Mandarin talkers was not more intelligible to native Mandarin listeners than the speech of NE talkers. The findings revealed that the intelligibility of L2 speech can be influenced by the match between the listener’s mother tongue and the speaker’s accent. Specifically, non-native listeners may have an advantage over native listeners in understanding non-native speech, but they do not necessarily find non-native speech more intelligible than native speech.

In the same vein, Fishero et al. (2023) examined how native and non-native listeners at various proficiency levels perceived L1 and L2 speech. To investigate the effects of the ISIB on both accuracy and response time, two groups of listeners (36 NE listeners and 36 native Mandarin listeners) took part in a lexical judgment task in which they listened to NE and Mandarin-accented English. Findings revealed evidence for both ISIB-L and ISIB-T effects for native Mandarin listeners. The authors concluded that the advantage non-native listeners had in comprehending non-native speech depends on multiple factors, including listener proficiency, speaker proficiency, phoneme attributes, and the acoustic properties of individual speech tokens.

In contrast, other studies reported no evidence for the two types of ISIB. For instance, Major et al. (2002) investigated the effect of a shared native language between speaker and listener on listening comprehension performance in English. The study used a special version of the listening section of the TOEFL called the listening comprehension trial test. The participants were four groups of 100 listeners from four different L1 backgrounds: Chinese, English, Japanese, and Spanish. The participants listened to lectures delivered in English by speakers from the same native language as the listeners and then answered questions based on the lectures. The results showed that both native and non-native listeners had lower scores on the listening test when they listened to non-native speakers of English. However, native Spanish listeners had a small advantage in intelligibility when hearing Spanish-accented English speech compared to other non-native English listeners. Hence, the study revealed that having the same native language as the speaker can provide a slight advantage in understanding English speech. This advantage did not, however, significantly change the total listening comprehension scores. The results also revealed no evidence for the ISIB-L, as NE listeners were more accurate at recognizing words produced by native Chinese and Japanese talkers than both native Chinese and Japanese listeners.

In addition, Stibbard and Lee (2006) used a word recognition test in which fifty participants in four groups (10 native Korean, 10 NA, 10 NE, and 20 with mixed L1 backgrounds) listened to sentences produced by five talkers from different L1 backgrounds (2 native Korean, 2 NA and 1 NE). The results showed no evidence for the ISIB-T for low-proficiency non-native speakers and no strong evidence for the ISIB-T for high-proficiency non-native speakers. Rather, NE talkers were more intelligible than non-native talkers to listeners in all four groups.

Likewise, Rasmussen (2007) explored the ISIB in relation to the production of the English /p/ and /b/ by both NA talkers (7 males) and NE talkers (5 males, 6 females). The study also examined how their productions of the two target phonemes were perceived by both NE and NA listeners. Stimuli were minimal pairs, such as bat and pat, presented in a forced-choice word identification task. The results demonstrated no evidence for the ISIB-L, as NA listeners were less accurate than NE listeners at identifying Arabic-accented English. In contrast, NA listeners found Arabic-accented English words as intelligible as NE words, which provides support for the ISIB-T. Moreover, the study’s acoustic measurements, including stop closure duration and VOT, further support this conclusion, as they indicate that NA talkers had difficulty consistently differentiating between the English /p/ and /b/ sounds.

Additionally, Xie and Fowler (2013) reported an ISIB-L for two groups of Mandarin learners of English, who were able to identify Mandarin-accented English better than NE listeners. However, their findings revealed no significant difference in the performance of Mandarin listeners in the US, regardless of ‘whether the speaker was Mandarin-accented or not’ (p. 11). In other words, the study provided no supporting evidence for the ISIB-T. More recently, Kang et al. (2018) found no evidence for either the ISIB-L or the ISIB-T, as Chinese and Mexican listeners scored lower when listening to speech produced by talkers with whom they shared the same native language.

In a more recent study, Lee et al. (2022) investigated the effects of talker type (native vs. non-native) and experience with the target dialect, North American English (NAE) vs. Standard Southern British English (SSBE), on L2 listeners’ perception. To this end, Korean listeners were tested on how well they could distinguish 12 English vowels spoken by native and non-native (L1-Korean) talkers of NAE and SSBE. The tests were given to two groups of native Korean listeners: L1-Korean ESL listeners in the USA and L1-Korean ESL listeners in the UK. The findings showed a significant influence of L2 listeners’ experience with the target dialect on the accuracy of target vowel recognition. No ISIB-T effects were observed for the L1-Korean listener groups despite their differing degrees of exposure to the two English dialects.

To sum up, previous research has reported inconsistent results with respect to the ISIB. Whereas some studies provided evidence in support of the ISIB-L, the ISIB-T, or both, others provided evidence against either benefit. Except for a few studies that focused on Dutch speech (e.g. van Wijngaarden, 2001; van Wijngaarden et al., 2002), previous research has primarily examined the ISIB in relation to English speech produced by learners with diverse native languages, including Arabic, Chinese, Japanese, Korean and Spanish. Given the varying results reported in previous studies and the limited range of languages investigated, more research is needed to understand the ISIB phenomenon more fully. The present study was designed to address this issue.

3 Arabic perception research

Arabic is a Semitic language that is widely spoken by 422 million people in the Middle East (Ryding, 2013). Due to its significance, it is one of the six official languages of the United Nations (Younus, 1977). It is characterized by one main variety, Modern Standard Arabic (MSA), and a number of spoken dialects that vary from one Arab state to another (Ferguson, 1959). Whereas MSA is acquired through formal education and is only used in formal contexts including formal speeches and news bulletins, the spoken colloquial varieties are the medium of communication in daily life conversations in the Arab world. Although the last two decades have witnessed a rapid increase in the number of language programs teaching Arabic at different universities in North America (Abdalla, 2006), Arabic is still classified as a less studied language (Shehata, 2015, 2023; Taha, 2007) compared to other languages such as English (Gordon & Darcy, 2016), German (O’Brien, 2014), and modern languages including Spanish and French (Imai et al., 2005; Mroz, 2018). As a result, very little research has been published on the acquisition of Arabic by non-native speakers. While some studies have examined grammar (see Ryding, 2013), receptive skills (e.g. Zouhir, 2013), and productive skills (e.g. Shehata, 2021), fewer have addressed Arabic perception (e.g. Al Mahmoud, 2013; Hayes-Harb & Durham, 2016; Lababidi & Park, 2017; Shehata, 2018). For instance, Al Mahmoud (2013) explored the perception of Arabic consonant contrasts including /x–ɣ/, /t–d/, and /x–ħ/ by English speakers. The results revealed that learners were better able to distinguish contrasts that have English equivalents, such as /t–d/ and /θ–ð/, than the emphatic-plain contrasts that do not have English equivalents, such as /x–ɣ/, /ħ–h/, and /x–ħ/.

In addition, Hayes-Harb and Durham (2016) explored NE speakers’ perception of Arabic emphatic-plain contrasts using vowel identification and discrimination tasks. The findings indicated that when distinguishing between Arabic emphatic and plain consonants, English speakers tended to depend more on the following vowels than on the consonants themselves. Their discrimination accuracy was highest when the following vowel was /æ/, followed by /u/ and /i/. Lababidi and Park (2017) examined how factors such as prosodic location, vowel duration and vowel quality affect the perception of Arabic consonants by NE speakers without prior exposure to Arabic. To this end, 14 Arabic consonants /t, d, ð, s, tˤ, dˤ, ðˤ, sˤ, q, x, ɣ, ħ, ʕ, ʔ/, presented in CV and VCV syllable structures, were used in two main tasks: an identification task and a goodness-of-fit rating task. It was found that both vowel duration and quality significantly influenced category selection for some of the target consonants, but there was no discernible impact of prosodic location on the CV and VCV structures with two syllables.

Similar findings were reported by Shehata (2018), who showed that some Arabic consonant contrasts, such as /t–tˤ/, /h–ħ/, and /s–sˤ/, were more difficult for NE learners of Arabic to perceive than others (i.e. /ħ–ʕ/). More recently, Aldamen and Al-Deaibes (2023) investigated L2 learners’ perception of Arabic emphatic consonants using mono- and disyllabic words that contrasted the target consonants in initial position. The study consisted of two main experiments: one examined perception, and the other analysed acoustic aspects of learners’ production. The results indicated that learners performed well on the perception tasks in both the pre- and post-tests, suggesting that they had a good grasp of the phonetic differences between plain and emphatic consonants, as they were able to accurately perceive both types.

To address the gap in previous research and arrive at a clearer understanding of the phenomenon in question, the current study investigated the ISIB in relation to Arabic spoken by both native and non-native Arabic speakers.

III Current study

The main purpose of the present study is to investigate the ISIB phenomenon in relation to the Arabic language. To this end, the study addresses the following research question.

• Is there evidence for the ISIB-L and/or the ISIB-T for native Arabic and native English participants listening to English-accented and native Arabic speech?

To address this question, the intelligibility of three Arabic consonant contrasts (i.e. /h–ħ/, /s–sˤ/ and /t–tˤ/) in English-accented Arabic speech is examined for both NA listeners and L2 Arabic listeners. This study contributes to the existing body of knowledge on L2 sound acquisition as well as the ISIB literature by exploring the ISIB phenomenon in relation to the target Arabic consonant contrasts, which have not previously been examined in this context.

IV Methods

1 Participants

a Talkers

The spoken materials used in this study were elicited from two groups of talkers. The first group included 10 NE participants (6 females, 4 males) who were enrolled in intermediate and advanced Arabic classes at a Midwestern American university. They had studied Arabic for more than one year, and their proficiency level ranged from intermediate-low to advanced. They ranged in age from 20 to 27 years (M = 22.9 years). The second group of talkers included 10 NA participants (5 females, 5 males) who were graduate students at the same university campus and ranged in age from 26 to 36 years (M = 30.7 years). Whereas NE participants were given Arabic language course credit for their participation in the study, NA participants took part voluntarily.

b Listeners

Two groups of listeners were recruited from the Arabic courses at the same campus. The first group included 15 NE participants (10 females, 5 males) who reported studying Arabic for more than one year and were given Arabic language course credit for their participation in this study. Listeners ranged in age from 19 to 29 years (M = 24.6) and their language proficiency level ranged from intermediate-low to advanced. The second group included 15 NA listeners (9 females and 6 males) who ranged in age from 18 to 29 years (M = 26.22) and participated voluntarily in the present study. No individual served both as a talker and a listener.

Learners of Arabic in this study were recruited from two Arabic language classes at a Midwestern American university, which used two different parts of the Al Kitaab fii ta’allum al-’arabiyya textbook (Brustad et al., 2013a, 2013b) as the primary teaching material. These classes met four times a week for 50 minutes each and focused on teaching all language skills, including listening, speaking, writing, and reading.

2 Stimuli

Six non-word minimal pairs of the form C1VC2 were used, contrasting /h–ħ/, /s–sˤ/ and /t–tˤ/ in C1 (word-initial) position (i.e. حاث-هاث /haːθ/–/ħaːθ/; حوج-هوج /huːʒ/–/ħuːʒ/; صاث-ساث /saːθ/–/sˤaːθ/; صوج-سوج /suːʒ/–/sˤuːʒ/; طاث-تاث /taːθ/–/tˤaːθ/; طوج-توج /tuːʒ/–/tˤuːʒ/). Each talker produced each of the target words in the context of the sentence, ‘I want to write the word haːθ and after that I will write ħaːθ.’ Each word was read two times in each sentence position (medial and final), resulting in a total of forty-eight individual non-word productions per talker. Two tokens of each of these words were randomly extracted from the productions of each talker using Praat (Boersma, 2001). This resulted in a total of 480 tokens (12 words × 2 tokens × 20 talkers) that were used in the forced-choice word identification task described below.
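The counts in this section can be checked with a short sketch. It is only an illustration of the arithmetic, not the procedure used to prepare the stimuli; the talker numbers and token indices are hypothetical labels, and the words are given in the IPA transcriptions listed above.

```python
from itertools import product

# The six non-word minimal pairs, in IPA (plain member first).
MINIMAL_PAIRS = [
    ("haːθ", "ħaːθ"), ("huːʒ", "ħuːʒ"),
    ("saːθ", "sˤaːθ"), ("suːʒ", "sˤuːʒ"),
    ("taːθ", "tˤaːθ"), ("tuːʒ", "tˤuːʒ"),
]
WORDS = [word for pair in MINIMAL_PAIRS for word in pair]  # 12 target non-words

N_TALKERS = 20                    # 10 NE talkers + 10 NA talkers
POSITIONS = ("medial", "final")   # sentence positions in the carrier sentence
READINGS_PER_POSITION = 2         # each word read twice in each position
TOKENS_KEPT = 2                   # tokens extracted per word per talker

# Productions recorded per talker: 12 words x 2 positions x 2 readings.
productions_per_talker = len(WORDS) * len(POSITIONS) * READINGS_PER_POSITION

# Tokens used in the identification task: one entry per
# (talker number, word, token index); the labels are hypothetical.
tokens = list(product(range(1, N_TALKERS + 1), WORDS, range(1, TOKENS_KEPT + 1)))

print(productions_per_talker, len(tokens))
```

Under these assumptions, the two printed values correspond to the per-talker production count and the total token count reported in the text.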

3 Word identification task

The listeners performed a forced-choice word identification task. In each trial, the listeners heard an auditory word and simultaneously saw two words written in Arabic on a computer screen, as presented in Figure 1. They were asked to identify which of the two words matched the auditory word they heard by pressing the right or left shift key on a computer keyboard. Crucially, each item involved an auditory word (e.g. huːʒ) and the written presentation of that word along with its minimal pair correspondent (e.g. huːʒ and ħuːʒ). In this way, it is possible to investigate the confusability of pairs of words differing only in the segments of interest (i.e. /h–ħ/, /s–sˤ/, and /t–tˤ/).

Figure 1. An example of what the screen looked like for the listeners for /huːʒ/–/ħuːʒ/.

DMDX experiment presentation software (Forster & Forster, 2003) was used to present the visual and auditory stimuli and to collect responses. Each of the 480 word tokens was presented in random order once in each of two blocks of trials, for a total of 960 trials in the task (480 tokens/block × 2 blocks). Listeners had no time limit to respond to the auditory stimuli in the listening task and were given participant-controlled breaks between blocks. The task took approximately 45 minutes to complete. After completing the word identification task, listeners were asked to fill out a written questionnaire that included questions about their age, gender, and language experience.
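The block structure described above can be sketched as follows. This is a minimal illustration in Python, not the DMDX script used in the study; the function name and the seed are hypothetical, and tokens are represented simply by their index.

```python
import random

N_TOKENS = 480   # word tokens described in the Stimuli section
N_BLOCKS = 2     # each token is presented once in each block

def build_trial_order(n_tokens, n_blocks, seed):
    """Return one independently shuffled list of token indices per block."""
    rng = random.Random(seed)  # the seed is a hypothetical per-listener value
    blocks = []
    for _ in range(n_blocks):
        order = list(range(n_tokens))
        rng.shuffle(order)     # random presentation order within the block
        blocks.append(order)
    return blocks

blocks = build_trial_order(N_TOKENS, N_BLOCKS, seed=1)
total_trials = sum(len(block) for block in blocks)
print(total_trials)
```

Shuffling each block separately keeps the property stated in the text: every token occurs exactly once per block, so the number of trials is the number of tokens multiplied by the number of blocks.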

V Data coding

Listeners’ responses on the word identification task were coded as correct only when they matched the words intended by the talkers. For example, if participants listened to /haːθ/, saw /haːθ/–/ħaːθ/ on the screen and then pressed the left shift key, their answer was counted as correct. Conversely, if they pressed the right shift key, their answer was considered wrong, as it did not match the talker’s intended word. The word tokens in each of the two blocks of trials were balanced, and the presentation positions of the two response options on the screen were balanced across trials (e.g. ‘haːθ’ appeared on the right side of the screen in one trial and ‘ħaːθ’ on the left side, and the reverse in another trial). Only participants’ correct responses were coded and analysed in the present study.
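The coding rule can be expressed as a small sketch. The function names, the trial tuples, and the "left"/"right" key labels are hypothetical and serve only to illustrate how a response is scored against the talker's intended word when the on-screen positions are counterbalanced.

```python
def score_trial(intended, left_option, right_option, key_pressed):
    """Return True when the chosen option matches the talker's intended word."""
    if key_pressed not in ("left", "right"):
        raise ValueError("key_pressed must be 'left' or 'right'")
    chosen = left_option if key_pressed == "left" else right_option
    return chosen == intended

def proportion_correct(trials):
    """Proportion of trials scored as correct; trials are 4-tuples."""
    scores = [score_trial(*trial) for trial in trials]
    return sum(scores) / len(scores)

# Hypothetical responses: the same auditory word with the two written
# options shown on opposite sides of the screen in different trials.
example_trials = [
    ("haːθ", "haːθ", "ħaːθ", "left"),    # correct: left option is the intended word
    ("haːθ", "haːθ", "ħaːθ", "right"),   # wrong: right option is the minimal-pair foil
    ("haːθ", "ħaːθ", "haːθ", "right"),   # correct: sides swapped, right is intended
    ("ħaːθ", "ħaːθ", "haːθ", "left"),    # correct
]
print(proportion_correct(example_trials))
```

Because scoring compares the chosen word with the intended word rather than with a fixed key, counterbalancing the screen positions does not affect how a response is coded.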

VI Results

Listeners’ responses were coded as correct if they matched the words intended by the talker, and proportion correct scores were calculated for each combination of listener and talker group. These data were analysed using a mixed-design analysis of variance (ANOVA) with listener group (two levels: NA and NE listeners) as a between-participants factor and talker group (two levels: NA and NE talkers) as a within-participants factor. This analysis showed a significant main effect of listener group (F(1, 28) = 88.866, p < .001, partial η² = .76), with performance by NE listeners (.92) more accurate than that of NA listeners (.67). There was also a significant main effect of talker group (F(1, 28) = 79.646, p < .001, partial η² = .74), with accuracy for NA talkers (.91) higher than that for NE talkers (.67). These main effects were qualified by a significant interaction between talker and listener group (F(1, 28) = 39.685, p < .001, partial η² = .58).
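For readers who wish to relate the F ratios to the effect sizes, partial η² can be recovered from an F value and its degrees of freedom with the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the three omnibus effects reported above; the function name is hypothetical, and the snippet does not reproduce the analysis software used in the study.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# F ratios of the omnibus mixed-design ANOVA, each with df = (1, 28).
omnibus_effects = {
    "listener group": 88.866,
    "talker group": 79.646,
    "listener x talker interaction": 39.685,
}

for name, f_value in omnibus_effects.items():
    effect_size = partial_eta_squared(f_value, df_effect=1, df_error=28)
    print(f"{name}: partial eta squared = {effect_size:.3f}")
```

The identity follows from partial η² = SS_effect / (SS_effect + SS_error) together with F = (SS_effect / df_effect) / (SS_error / df_error).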

Follow-up pairwise analyses were conducted to explore the interaction of talker and listener group. For English-accented speech, NE listeners were significantly more accurate (.79) than NA listeners (.63) at identifying words spoken by NE talkers (F(1, 28) = 161.398, p < .001, partial η² = .53). This finding provides support for the ISIB-L. It was also found that NA listeners’ performance (.96) was more accurate than NE listeners’ performance (.86) for NA speech (F(1, 28) = 46.800, p < .001, partial η² = .51) (see Figure 2).

Figure 2. Word identification accuracy, organized by listener group and talker group.

On the other hand, NA talkers were found to be more intelligible than NE talkers by all listeners. Specifically, NA listeners were more accurate at identifying words produced by NA talkers (.96) than by NE talkers (.63; F(1, 14) = 1056.321, p < .001, partial η² = .98). Additionally, NE listeners were significantly more accurate at identifying words produced by NA talkers (.86) than by NE talkers (.79; F(1, 14) = 51.432, p < .001, partial η² = .79). These findings did not provide support for the ISIB-T.
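The logic of the two comparisons can be summarized in a short Python sketch that uses the four cell means reported above. The function names are illustrative, and the checks are purely descriptive: they show which direction each pattern predicts and do not replace the significance tests reported in this section.

```python
# Reported proportion-correct scores, keyed by (listener group, talker group).
cell_means = {
    ("NA", "NA"): 0.96,
    ("NA", "NE"): 0.63,
    ("NE", "NA"): 0.86,
    ("NE", "NE"): 0.79,
}


def shows_isib_l(means, nonnative="NE", native="NA"):
    """ISIB-L pattern: non-native listeners outperform native listeners on non-native talkers."""
    return means[(nonnative, nonnative)] > means[(native, nonnative)]


def shows_isib_t(means, nonnative="NE", native="NA"):
    """ISIB-T pattern: non-native listeners find non-native talkers more intelligible than native talkers."""
    return means[(nonnative, nonnative)] > means[(nonnative, native)]


print("ISIB-L pattern present:", shows_isib_l(cell_means))
print("ISIB-T pattern present:", shows_isib_t(cell_means))
```

With these means, the ISIB-L comparison (.79 versus .63) goes in the predicted direction, whereas the ISIB-T comparison (.79 versus .86) does not.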

VII Discussion

The current study aimed to explore both the ISIB-L and the ISIB-T as related to the perception of some Arabic consonant contrasts (i.e. /h–ħ/, /s–sˤ/ and /t–tˤ/). To this end, two groups of listeners (native English and native Arabic) listened to individual Arabic non-words produced by both NE and NA talkers. The major finding of the present study was that NE listeners were significantly more accurate than NA listeners at identifying words produced by other NE talkers, providing evidence for an interlanguage speech intelligibility benefit for listeners (the ISIB-L). One possible explanation for this finding relates to the shared phonetic and phonological knowledge between non-native talkers and non-native listeners who share the same native language background. This suggests that listeners who share the same native language background as the talker may be better equipped to understand interlanguage speech, and that this shared knowledge can help to overcome the challenges posed by accent and pronunciation differences (Fishero et al., 2023; Gosselin et al., 2022; Hayes-Harb et al., 2008; Lee et al., 2022). In contrast, NA listeners did not share the same phonetic and phonological knowledge with NE talkers; therefore, they found NE talkers less intelligible than NA talkers. This finding is consistent with that of Hayes-Harb et al. (2008), who demonstrated that native Mandarin listeners were significantly more accurate than NE listeners at identifying English words produced by native Mandarin talkers. It is interesting that both studies found evidence for the ISIB-L despite using different target languages (English vs. Arabic) and stimuli (real English minimal pairs vs. non-word minimal pairs). This suggests that the ISIB-L may be a robust phenomenon that is not limited to specific languages or types of stimuli. Taken together, the two studies provide converging evidence for the ISIB phenomenon and reinforce the claim that factors related to both talkers and listeners should be taken into consideration with respect to the phenomenon of non-native speech intelligibility.

Another interesting finding of the present study is that NE listeners were significantly more accurate at identifying words spoken by NA talkers than those produced by NE talkers. This finding does not support the interlanguage speech intelligibility benefit for talkers (the ISIB-T), and it is in line with previous research. Hayes-Harb et al. (2008), for instance, found that native Mandarin listeners were more accurate at identifying English words produced by NE talkers than by native Mandarin talkers. Similarly, Imai et al. (2005) found that high-proficiency native Spanish listeners were better at identifying words produced by native talkers than those produced by Spanish-accented talkers. Furthermore, Munro et al. (2006) found that Cantonese listeners did not find speech produced by Cantonese talkers to be more intelligible than that of any of the other talker groups, but they found the Japanese talkers to be more intelligible than other talkers, which provided evidence counter to the ISIB-T. Likewise, Stibbard and Lee (2006) found no strong evidence for the ISIB-T for high-proficiency non-native speakers. However, their results provided some support for the matched ISIB, with Korean listeners giving the Saudi high-proficiency talker a lower rating than the NE, Korean high-proficiency, and Korean low-proficiency talkers. These findings are intriguing and highlight the complexity of non-native speech intelligibility. It is worth noting that the lack of support for the ISIB-T in this study and other previous studies does not necessarily imply that non-native speakers cannot benefit from their non-native language background. It may be that the benefit is less consistent or less pronounced than the ISIB-L. Another possibility is that the listeners may have been more familiar with the accent of the NA talkers than that of the NE talkers. It could also be the case that the NA talkers adapted their speech to the expectations of the NE listeners, making their speech more intelligible.

On the other hand, the lack of evidence for the ISIB-T is inconsistent with the findings of other previous studies, such as Bent and Bradlow (2003), van Wijngaarden (2001) and van Wijngaarden et al. (2002). One possible explanation for this difference lies in the different tasks used in these studies. While the present study used a forced-choice word identification task in which participants listened to isolated words, Bent and Bradlow (2003), for instance, used a listening task where the target words were presented in sentences that may have provided additional information (i.e. morphological, semantic, syntactic and prosodic information) that helped participants better identify the target words. In contrast, presenting words in isolation in the present study did not require lexical access, which might otherwise have contributed to participants’ perception of accented speech.

These findings suggest that the relationship between the native language of the listener and the accent of the speaker can influence speech intelligibility. Specifically, non-native listeners may have an advantage in understanding non-native speech because they may be more familiar with the phonetic and prosodic patterns of the speaker’s native language, which can help them compensate for the differences between the speaker’s accent and the target language. However, this advantage may not extend to the speech of native speakers, possibly because native speakers are more variable in their pronunciation and less predictable in their use of prosodic patterns. It is very likely that studying the ISIB provides a better understanding of a situation in some L2 classrooms where the students easily communicate using a shared foreign language that may not be easily understood by the teacher.

The current study is limited in that it only included a small number of participants with mixed proficiency levels and one specific type of accent (i.e. English-accented Arabic). Thus, caution should be taken when generalizing the results to other contexts or accents. Moreover, the use of non-words may not fully capture the complexities of natural language production; therefore, the results may not generalize to other types of speech, such as connected or spontaneous speech. The ISIB-L and the ISIB-T, moreover, appear to be fleeting phenomena that require further investigation before firm conclusions can be drawn about them, about the factors that contribute to them, and about how these factors can be used to improve language learning and teaching. Future research could explore the ISIB for listeners in more complex linguistic contexts using different perception and production tasks. Another possible direction is to explore the effects of task type and of listeners’ and talkers’ phonological proficiency on the listening accuracy of native and non-native speakers of Arabic at different proficiency levels, as measured by a recognized language proficiency test such as the ACTFL Oral Proficiency Interview (OPI). Doing so would offer further insight into how interlanguage phonetic systems develop, particularly those featuring novel phonemic contrasts, and into the role of listener and talker proficiency, which have been described as ‘critical variables in understanding this shared language effect’ (Fishero et al., 2023, p. 15). Future research could also investigate the ISIB among learners of Arabic from different L1 backgrounds.

VIII Conclusions

This study provides valuable insights into the ISIB and presents important issues related to the intelligibility of non-native speech. First, the NE listeners who participated in this study showed an advantage over NA listeners at identifying words in English-accented Arabic speech. This finding may be attributed to the shared phonetic and phonological knowledge between non-native talkers and non-native listeners. In addition, NE listeners did not find the speech produced by other NE speakers more intelligible than the speech of NA talkers. In fact, NA speech proved to be more intelligible than non-native speech to both native and non-native listeners. Overall, a better understanding of the ISIB-L and the ISIB-T can help to improve language learning as well as teaching strategies and facilitate communication and understanding between speakers of different languages.

Footnotes

Funding

The author received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Asmaa Shehata

References

Abdalla (2006). Arabic immersion and summer programs in the United States. In Wahba, K.M., Taha, Z.A., & England (Eds.), Handbook for Arabic language teaching professionals in the 21st century (pp. 317–330). Routledge.

Albrechtsen, Henriksen, & Færch (1980). Native speaker reactions to learners’ spoken interlanguage. Language Learning, 30, 365–396.

Aldamen & Al-Deaibes (2023). Perception and production of L2 Arabic emphatic consonants: The role of communicative and traditional form-based approaches. Ampersand, 10, 100–105.

Al Mahmoud, M.S. (2013). Discrimination of Arabic contrasts by American learners. Studies in Second Language Learning and Teaching, 3, 261–292.

Bent & Bradlow, A.R. (2003). The interlanguage speech intelligibility benefit. Journal of the Acoustical Society of America, 114, 1600–1610.

Boersma (2001). Praat, a system for doing phonetics by computer. GLOT International, 59, 341–345.

Bradlow, A.R., & Bent (2002). The clear speech effect for non-native listeners. Journal of the Acoustical Society of America, 112(1), 272–284.

Bradlow, A.R., & Pisoni, D.B. (1999). Recognition of spoken words by native and non-native listeners: Talker-, listener-, and item-related factors. Journal of the Acoustical Society of America, 106, 2074–2085.

Brustad, K.E., Al-Batal, & Tūnisī (2013a). Al-Kitaab fii ta’allum al-’Arabiyya: A textbook for intermediate Arabic. Georgetown University Press.

Brustad, K.E., Al-Batal, & Tūnisī (2013b). Al-Kitaab fii ta’allum al-’Arabiyya: A textbook for advanced Arabic. Georgetown University Press.

Derwing, T.M., & Munro, M.J. (2001). What speaking rates do non-native listeners prefer? Applied Linguistics, 22, 324–337.

Dokovova, Scobbie, J.M., & Lickley (2022). Matched-accent processing: Bulgarian–English bilinguals do not have a processing advantage with Bulgarian-accented English over native English speech. Laboratory Phonology, 13, 1–40.

Ferguson, C.A. (1959). Diglossia. Word-Journal of the International Linguistic Association, 15, 325–340.

Field (2004). An insight into listeners’ problems: Too much bottom-up processing or too much top-down? System, 32, 363–377.

Fishero, Sereno, J.A., & Jongman (2023). Perception and production of Mandarin-accented English: The effect of degree of accentedness on the interlanguage speech intelligibility benefit for listeners (ISIB-L) and talkers (ISIB-T). Journal of Phonetics, 99, 1–18.

Forster, K.I., & Forster, J.C. (2003). DMDX: A windows display program with millisecond accuracy. Behavior Research Methods, Instruments, and Computers, 35, 116–124.

Gass & Varonis, E.M. (1984). The effect of familiarity on the comprehensibility of nonnative speech. Language Learning, 34, 65–89.

Gordon & Darcy (2016). The development of comprehensible speech in L2 learners: A classroom study on the effects of short-term pronunciation instruction. Journal of Second Language Pronunciation, 2, 56–92.

Gosselin, Martin, C.D., González Martín, & Caffarra (2022). When a nonnative accent lets you spot all the errors: Examining the syntactic interlanguage benefit. Journal of Cognitive Neuroscience, 34, 1650–1669.

Guion, S.G., Flege, J.E., Akahane-Yamada, & Pruitt, J.C. (2000). An investigation of current models of second language speech perception: The case of Japanese adults’ perception of English consonants. Journal of the Acoustical Society of America, 107, 2711–2724.

Hayes-Harb & Durham (2016). Native English speakers’ perception of Arabic emphatic consonants and the influence of vowel context. Foreign Language Annals, 49, 557–572.

Hayes-Harb, Smith, Bent, & Bradlow (2008). The interlanguage speech intelligibility benefit for native speakers of Mandarin: Production and perception of English word-final voicing contrasts. Journal of Phonetics, 36, 664–679.

Hirata (2004). Training native English speakers to perceive Japanese length contrast in word versus sentence contexts. Journal of the Acoustical Society of America, 116, 2384–2394.

Hsieh, Y.L., & Tsao, F.M. (2022). The effect of speech familiarity and phonetic similarity on the acquisition of Mandarin Chinese vowels by English-speaking learners. Second Language Research, 38, 111–135.

Imai, Walley, A.C., & Flege, J.E. (2005). Lexical frequency and neighbourhood density effects on the recognition of native and Spanish-accented words by native English and Spanish listeners. Journal of the Acoustical Society of America, 117, 896–907.

Jenkins (2000). The phonology of English as an international language. Oxford University Press.

Jenkins (2002). A sociolinguistically based, empirically researched pronunciation syllabus for English as an International Language. Applied Linguistics, 23, 83–103.

Kang, Thomson, & Moran (2018). The effects of international accents and shared first language on listening comprehension tests. TESOL Quarterly, 53, 56–81.

Korpal & Sobkowiak (2020). The perception of native vs. nonnative Danish speech: Bent and Bradlow’s matched interlanguage speech intelligibility benefit revisited. Scandinavian Philology, 18, 284–296.

Lababidi & Park (2017). Perceptual mapping between Arabic and English consonants. In Ouali (Ed.), Perspectives on Arabic linguistics XXIX (pp. 89–126). John Benjamins.

Laeufer (1996). The acquisition of a complex phonological contrast: Voice timing patterns of English initial stops by native French speakers. Phonetica, 53, 86–110.

Lee, Kang, & Nam (2022). Identification of English vowels by non-native listeners: Effects of listeners’ experience of the target dialect and talkers’ language background. Second Language Research, 38, 449–475.

Major, R.C. (2018). Foreign accent. In Chapelle, C.A. (Ed.), The encyclopedia of applied linguistics (pp. 1–8). https://doi.org/10.1002/9781405198431.wbeal0420.pub2

Major, Fitzmaurice, Bunta, & Balasubramanian (2002). The effects of nonnative accents on listening comprehension: Implications for ESL assessment. TESOL Quarterly, 36, 173–190.

Mroz (2018). Seeing how people hear you: French learners experiencing intelligibility through automatic speech recognition. Foreign Language Annals, 51, 617–637.

Munro (1998). The effects of speaking rate on listener evaluations of native and foreign-accented speech. Language Learning, 48, 159–182.

Munro & Derwing (1995). Foreign accent, comprehensibility and intelligibility in the speech of second language learners. Language Learning, 45, 73–97.

Munro, Derwing, & Morton (2006). The mutual intelligibility of L2 speech. Studies in Second Language Acquisition, 28, 11–31.

O’Brien, M.G. (2014). L2 learners’ assessments of accentedness, fluency, and comprehensibility of native and nonnative German speech. Language Learning, 64, 715–748.

Politzer, R.L. (1976). Linguistic accuracy and intelligibility. In Nickel (Ed.), Proceedings of the Fourth International Congress of Applied Linguistics (pp. 505–513). University of California.

Rasmussen, Z.B. (2007). The interlanguage speech intelligibility benefit: Arabic-accented English. Unpublished senior honors thesis, University of Utah, UT, USA.

Ryding, C.K. (2013). Teaching and learning Arabic as a foreign language: A guide for teachers. Georgetown University Press.

Saito, Tran, Suzukida, et al. (2019). How do L2 listeners perceive the comprehensibility of foreign-accented speech? Roles of L1 profiles, L2 proficiency, age, experience, familiarity and metacognition. Studies in Second Language Acquisition, 41, 1133–1149.

Selinker (1972). Interlanguage. IRAL, 10, 209–231.

Sereno, Lammers, & Jongman (2016). The relative contribution of segments and intonation to the perception of foreign-accented speech. Applied Psycholinguistics, 37, 303–322.

Shehata (2015). Problematic Arabic consonants for native English speakers: Learners’ perspectives. The International Journal of Educational Investigations, 2, 24–47.

Shehata (2018). Native English speakers’ perception and production of Arabic consonants. In Alhawary, M.T. (Ed.), Handbook of Arabic second language acquisition (pp. 56–69). Routledge.

Shehata (2021). Short vowels and context effects: The case of English speakers reading Arabic. International Education Studies, 14, 93–103.

Shehata (2023). Corrective feedback on Arabic pronunciation: Teacher beliefs and practices. In Thomson, R.I., Derwing, T.M., Levis, J.M., & Hiebert (Eds.), Proceedings of the 13th pronunciation in second language learning and teaching conference. Available at https://doi.org/10.31274/psllt.15714 (accessed February 2024).

Smith, L.E. (1992). Spread of English and issues of intelligibility. In Kachru, B.B. (Ed.), The other tongue: English across cultures (pp. 75–90). University of Illinois Press.

Sole, M.J. (2018). Articulatory adjustments in initial voiced stops in Spanish, French and English. Journal of Phonetics, 66, 217–241.

Stibbard, R.M., & Lee, J.I. (2006). Evidence against the mismatched interlanguage intelligibility benefit hypothesis. Journal of the Acoustical Society of America, 120, 433–442.

Taha, T.A. (2007). Arabic as ‘a critical-need’ foreign language in the post-9/11 era: A study of students’ attitudes and motivation. Journal of Instructional Psychology, 34, 150–160.

Tajima, Port, & Dalby (1997). Effects of temporal correction on intelligibility of foreign-accented English. Journal of Phonetics, 25, 1–24.

Thomson, R.I. (2018). Measurement of accentedness, intelligibility and comprehensibility. In Kang & Ginther (Eds.), Assessment in second language pronunciation (pp. 11–29). Routledge.

Trofimovich & Baker (2006). Learning second-language suprasegmentals: Effect of L2 experience on prosody and fluency characteristics of L2 speech. Studies in Second Language Acquisition, 28, 1–30.

van Wijngaarden, S.J. (2001). Intelligibility of native and non-native Dutch speech. Speech Communication, 35, 103–113.

van Wijngaarden, S.J., Steeneken, H.J.M., & Houtgast (2002). Quantifying the intelligibility of speech in noise for non-native listeners. Journal of the Acoustical Society of America, 111, 1906–1916.

Xie & Fowler, C.A. (2013). Listening with a foreign-accent: The interlanguage speech intelligibility benefit in Mandarin speakers of English. Journal of Phonetics, 41, 369–378.

Younus, F.A. (1977). Curriculum for teaching of Arabic as a second language at the beginning level. Dar Al Thaqafa Printing and Publishing.

Zouhir (2013). Unpacking the teaching and learning practices of Arabic at a major U.S. university. Journal of Second and Multiple Language Acquisition, 1, 1–20.