Abstract
Purpose:
Bilingual language experience is thought to confer non-linguistic benefits in general cognition including improved cognitive control. These bilingual effects are most often observed in samples of bilinguals who are highly proficient in both languages. However, across the majority of previous studies, assessments of language proficiency are exclusively subjective. While evidence supports that subjective and objective measures of language proficiency are correlated, no studies have explored whether the use of either measure impacts on model results when investigating bilingual effects on cognitive control.
Methodology:
Mandarin-English bilingual young adults completed both subjective and objective assessments of language proficiency and a Simon task to measure differences in cognitive control.
Data and Analysis:
Data were analyzed using linear mixed-effects models to best account for differences in linguistic and non-linguistic variables as well as the repeated-measures nature of the Simon task.
Findings and Conclusions:
We report no evidence in support of improved cognitive control associated with higher levels of language proficiency. Crucially, results did not differ when either subjective or objective measures were included in our models. Results support that both subjective and objective assessments of language proficiency may be equivalent when modeling bilingual effects on cognitive control.
Originality:
This study is the first direct investigation of the influence of the proficiency assessment method on model results.
Implications:
The findings of this study have implications for the assessment of language proficiency in future investigations of bilingual effects.
Introduction
Bilingualism, the use of more than one language, is thought to confer domain-general benefits in cognitive control (Antoniou, 2019), although this claim is controversial (e.g., Paap, 2019). Cognitive control, sometimes called executive function, is a set of attentionally controlled mental processes that regulate thinking and behavior (Diamond, 2013). Theoretical models of cognitive control generally describe multiple, partially overlapping domains including inhibitory control, shifting, and updating (Friedman & Miyake, 2017; Miyake et al., 2000). Differences in each of these separate domains are assessed by behavioral tasks such as the Simon Task (Simon & Wolf, 1963). The extant literature supports that bilinguals tend to outperform monolinguals on tasks that measure differences in cognitive control (Grundy, 2020; Tao et al., 2021), although reported differences tend to be small (e.g., Lehtonen et al., 2018). In addition, the exact conditions under which these “bilingual advantages,” more broadly referred to as “bilingual effects” (Privitera et al., 2023a), emerge are unclear (e.g., Paap et al., 2015; Ware et al., 2020).
Evidence in support of bilingual effects on cognitive control has been identified across a broad developmental range from samples of children (Bialystok & Martin, 2004; Iluz-Cohen & Armon-Lotem, 2013), young adults (Antón et al., 2019; Costa et al., 2008; Privitera et al., 2022, 2023a), and elders (Bialystok et al., 2004; Dash et al., 2019). These bilingual effects are thought to result from the constant need to inhibit an active but unneeded language to successfully communicate (Green & Abutalebi, 2008). While early theoretical positions proposed that these effects would manifest in the domain of inhibition (Bialystok et al., 2009), most recent evidence suggests these effects are more general, impacting on attentional processes (Bialystok & Craik, 2022). Despite growing support for these bilingual effects, improved performance is thought to alternatively result from the influence of highly correlated non-linguistic variables (Morton & Harper, 2007; Paap, 2019) or reflect the influence of publication bias (De Bruin et al., 2015).
While initially considered a unidimensional categorical variable (Luk & Bialystok, 2013), bilingualism has more recently been viewed as a series of multidimensional continuums on which two people from the same environment can differ considerably (Dash et al., 2022; Gullifer et al., 2021). The most common metric for assigning bilingual language status is that of language proficiency (Surrain & Luk, 2019); that is, people are not generally considered to be bilingual unless they claim or demonstrate sufficient proficiency in two languages (Grosjean, 2020). Interestingly, bilingual effects do not seem to emerge in all bilinguals but are more readily observed in samples with higher levels of language proficiency (Mishra, 2015). One explanation for this trend is the lack of interference between two languages when proficiency in one is low. It follows that bilingual effects would be less likely to be observed if sufficient effort were not engaged in the inhibition of an active but unneeded language (Green, 1998). For this reason, assessment of language proficiency is essential when investigating the influence of language experience on cognitive control.
Language proficiency can be assessed using both subjective and objective tools. Across previous studies, proficiency has most often been assessed subjectively through self-report (Surrain & Luk, 2019). However, inconsistencies in the collection of subjective proficiency data have led to high variability across studies, preventing direct comparisons from being made (Grosjean, 1998). In response to this issue, standardized instruments including the Language History Questionnaire (LHQ; Li et al., 2020), Language Experience and Proficiency Questionnaire (LEAP-Q; Marian et al., 2007), and the Language and Social Background Questionnaire (LSBQ; J. A. E. Anderson et al., 2018) have been developed in the interest of increasing cross-study consistency. These tools are widely used with established reliability and validity (Bidelman et al., 2011; Grant & Li, 2019; Mann & De Bruin, 2022).
While convenient to collect, subjective assessments of proficiency have been criticized based on the expectation that participants might overestimate or underestimate their own language abilities (MacIntyre et al., 1997). These concerns can be addressed through the use of objective assessments of second language proficiency. As identified in a recent systematic review (Surrain & Luk, 2019), the majority of previous studies comparing monolingual and bilingual samples have assessed objective proficiency using the Peabody Picture Vocabulary Test (PPVT; Dunn & Dunn, 2007). Picture naming tasks are a widely used way to objectively assess proficiency (Kaplan et al., 2001) and require that participants retrieve specific lexical knowledge in a given language to successfully name the presented stimulus (Gollan et al., 2005). These tests have demonstrated high reliability across picture stimuli and presentations (Herbert et al., 2008), although some evidence suggests that responses to different classes of pictures may vary based on a participant’s linguistic background (e.g., Momenian et al., 2021, 2024).
To date, the majority of studies investigating bilingual effects on cognitive control in young people have utilized a between-groups design, comparing task performance between monolingual and bilingual samples (Privitera & Weekes, 2023). Surprisingly, nearly a quarter of these between-group investigations did not measure proficiency in their bilingual samples (Surrain & Luk, 2019). In studies that do assess second language proficiency, bilingual participants are separated into groups based on scores, and differences in task performance are tested between groups. Of these limited studies, evidence in support of improved monitoring in more proficient bilinguals relative to those with lower proficiency has been identified (e.g., Mishra, 2015; Xie, 2018). Of note is the observation that the number of levels assigned to proficiency as a factor may impact on the emergence of significant results. While Mishra and colleagues (2012) identified a significant effect of proficiency when participants were separated into two groups (i.e., low and high proficiency), significant findings were limited to only comparisons between the lowest and highest proficiency groups in another study that included a third “middle proficiency” condition (Xie, 2018). Together, these findings support that bilingual effects do not emerge until sufficient levels of proficiency are reached, highlighting the caution that must be exercised when operationalizing proficiency.
While the assignment of bilinguals to groups based on second language proficiency avoids issues associated with the ecologically flawed comparison of monolinguals and bilinguals (Rothman et al., 2023), it overlooks the fact that categorical proficiency labels mask nontrivial differences between bilinguals. Whether proficiency is considered a categorical or continuous variable during the analysis of behavioral data is an important consideration as different approaches can lead to different conclusions (Champoux-Larsson & Dylman, 2021). Considering language proficiency as a continuous variable when modeling behavioral data can allow for the identification of graded effects that would otherwise be masked in between-group studies (Grundy et al., 2020; Privitera et al., 2022, 2023a). Previous within-group studies have reported improved monitoring and inhibition associated with higher levels of second language proficiency (e.g., Privitera et al., 2022, 2023a; Xie & Zhou, 2020). This continuous approach to assessing the influence of language proficiency on cognitive control aligns with the most recent calls to view the separable dimensions of bilingual language experience as continuums instead of restrictive “boxes” that participants must be placed into (Dash et al., 2022; Gullifer et al., 2021).
Interestingly, only a small number of studies have included both subjective and objective assessments of language proficiency (Surrain & Luk, 2019), implying that many consider these assessments to be equivalent measures. However, whether people are accurate when assessing their own abilities is contested (Zell & Krizan, 2014). While some evidence supports that subjective and objective measures of language proficiency are correlated (Gollan et al., 2011; Grant & Li, 2019; Li et al., 2020; Marian et al., 2007), whether the inclusion of either measure during analysis impacts on results is an underexplored area of inquiry. To date, no studies have directly explored whether the inclusion of either subjective or objective assessments of language proficiency leads to different results when modeling bilingual effects on cognitive control. Despite this lack of comparable work from which to inform expected outcomes, previous studies investigating the influence of bilingualism on cognitive reserve and dementia conversion can provide some insight. In one study, Zahodne and colleagues (2014) reported no independent association between bilingualism and cognitive decline or dementia conversion when including either subjective or objective proficiency scores during analyses. Conversely, Gollan and colleagues (2011) reported that only scores from objective proficiency assessments were associated with the age of onset for Alzheimer’s disease. These conflicting findings highlight a need for further work to systematically explore whether the use of different methods of proficiency assessment impacts on results, with significant implications for the commonplace assessment of language proficiency through self-report.
To explore this topic in more detail, this study aimed to investigate the influence of subjective and objective assessments of language proficiency on the emergence of bilingual effects on cognitive control. Specifically, using behavioral data from a Simon task, we tested the hypothesis that the inclusion of either measure of language proficiency, while controlling for differences in other domains of bilingual experience and background variables, would result in comparable findings.
Methods
Participants
Seventy-four Mandarin-English speaking bilingual college students (50 females; Mage = 20.01 years, SDage = 1.24 years) were recruited from a Sino-Foreign university in Mainland China. All participants were native Mandarin speakers with an average of 12.44 years of experience using English (±2.90 years). All participants were enrolled full-time in an American undergraduate curriculum program where English was the primary language of instruction and assessment. Written informed consent was collected from all participants. Approval for this study was granted by the Human Research Ethics Committee of the University of Hong Kong (#EA100010). The number of participants recruited was based on guidance for sufficiently powered linear mixed-effects modeling from Brysbaert and Stevens (2018), recommending at least 40 participants with 40 trials each.
Language experience and background measures
A combination of both subjective and objective assessments was used to best capture the heterogeneity of language experience in our sample. Participants first completed the LHQ-3. A full description of this instrument can be found in the original article (Li et al., 2020). To briefly summarize, the LHQ-3 contains a series of self-report questions that assess three separable dimensions of language experience in all languages a participant reports using: proficiency, immersion, and dominance. Proficiency is assessed by asking participants to rate how well they listen, speak, read, and write in a given language using a 7-point Likert-type scale from “1 = very poor” to “7 = excellent.” Immersion and dominance are assessed using a series of questions about the number of years of language experience and hours spent engaged in specific activities in a given language, with the dominance score further weighted by reported proficiency. For each dimension, an aggregate score ranging from 0 to 1 is generated. For example, a participant reporting a 5/7 for listening, speaking, reading, and writing for a given language would have a proficiency score based on the following calculation:
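As a hedged sketch of this example, assume that each of the four skill ratings r_j is divided by the scale maximum of 7 and that the skills are weighted equally. The equal weights w_j = 0.25 are an assumption made here for illustration; the exact weighting is given in Li et al. (2020).

```latex
\[
\text{Proficiency}
  = \sum_{j \in \{\text{listening},\,\text{speaking},\,\text{reading},\,\text{writing}\}} w_j \,\frac{r_j}{7}
  = 4 \times 0.25 \times \frac{5}{7}
  \approx 0.71
\]
```

Under these assumptions the aggregate falls within the 0 to 1 range described above, with uniform ratings of 7/7 giving a score of 1.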
Participants were also asked to complete two separate picture naming tasks, one in Mandarin and one in English, to assess their objective proficiency. A set of 60 color object pictures was taken from the set developed by Rossion and Pourtois (2001) with 30 pictures included for each language. Task language order was randomized across participants and picture order was randomized within each task. Participants were instructed to type the correct name of each object in a blank box in either English or Chinese characters depending on the language of the task. Based on expected variability in responses, each picture had both a standard answer and an alternative answer. Answers were initially checked using a matching function in Microsoft Excel and were then manually checked by both authors. During manual checking, the correctness of an alternative response was determined using the following principles: (1) answers given in the wrong language, even if correct, were coded as incorrect; (2) plural or singular versions of the same correct answer were considered correct (e.g., putting “shoes” when the correct answer was “shoe” is correct); (3) in Mandarin, correct answers provided in pinyin instead of Chinese characters were counted; (4) spelling errors on correct answers thought to result from miskeying a response were counted (e.g., “sheo” was considered the correct answer to “shoe”); and (5) spelling errors not thought to result from miskeying that phonetically matched the correct answer were counted (e.g., “neckliss” was considered the correct answer to “necklace”). The decision to include phonetic matches that were misspelled was made to more closely mirror the scoring criteria used when picture naming is done verbally. Objective proficiency was calculated separately for each language based on the total number of correct responses given, with scores ranging from 0 to 1. 
For example, if a participant got 27 items correct out of 30 total, they would have an objective proficiency score of 0.90.
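The automated part of this scoring can be sketched as follows. This is a minimal illustration and not the Excel matching function the authors used. The function names `is_correct` and `objective_proficiency` are hypothetical. Only exact matches and simple English singular/plural variants are handled, so the pinyin, miskeying, and phonetic-match rules would still require manual checking.

```python
def is_correct(response, accepted):
    """Return True if a typed response matches an accepted answer.

    `accepted` is the set of standard and alternative answers for one picture.
    Only exact matches and simple singular/plural variants are recognized.
    """
    typed = response.strip().lower()
    variants = set()
    for answer in accepted:
        answer = answer.lower()
        variants.add(answer)
        variants.add(answer + "s")        # "shoe" -> "shoes"
        if answer.endswith("s"):
            variants.add(answer[:-1])     # "shoes" -> "shoe"
    return typed in variants


def objective_proficiency(responses, answer_key):
    """Proportion of pictures named correctly, ranging from 0 to 1."""
    correct = sum(is_correct(r, a) for r, a in zip(responses, answer_key))
    return correct / len(answer_key)


# Hypothetical participant: 27 of 30 pictures named correctly.
key = [{"shoe"}] * 30
answers = ["shoe"] * 27 + ["boot"] * 3
print(round(objective_proficiency(answers, key), 2))  # 0.9
```

A response such as "neckliss" would be scored as incorrect by this sketch, which is why the second, manual pass described above remains necessary.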
Participants also reported on basic demographic details, weekly use of video games and musical instruments, language switching frequency, perceived stress (PSS-10; Cohen, 1988), and family education level as a proxy for socioeconomic status (SES; Wermelinger et al., 2017).
Measure of cognitive control
A two-color Simon task (Privitera et al., 2022) was administered online using the Gorilla online experiment builder (Anwyl-Irvine et al., 2020). Prior to the start of the task, participants were instructed to place their left index finger on the “Q” key and their right index finger on the “P” key on their computer’s keyboard. At the beginning of each trial, a fixation cross (black; 2.54 cm line; 2.54 cm thick) was presented on a white background for 300 ms before disappearing. Depending on the trial condition, the target stimulus, either a blue or brown square (2.54 × 2.54 cm), appeared in one of three locations: left, center, or right, relative to the fixation cross that was previously on the screen. In response to the presentation of each stimulus, participants were asked to press one of two different keys on a standard keyboard based only on the stimulus color. Button and color mapping were counterbalanced across participants with half instructed to press the “Q” button for a blue square and the “P” button for a brown square, and the other half receiving the reversed directions. Stimuli remained on the screen until a response was given, followed by a blank screen for 500 ms. Given the color of the stimulus and the mapping of color to the response key, three trial conditions were generated: congruent (match between stimulus and response key location), incongruent (mismatch between stimulus and response key location), and neutral (no conflict; target stimulus in the center). In total, 6 practice trials with feedback and 84 experimental trials without feedback were presented. The trial presentation was randomized and included equal proportions of each of the possible conditions.
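The way the three trial conditions follow from stimulus position and the correct response key can be sketched as below. This is an illustrative sketch and not the Gorilla task code. The function name `trial_condition` and the position labels are hypothetical, and the "Q" and "P" keys are assumed to sit on the left and right sides of the keyboard.

```python
def trial_condition(stimulus_position, correct_key):
    """Classify a Simon trial from stimulus position and the correct key.

    stimulus_position: "left", "center", or "right"
    correct_key: "Q" (left index finger) or "P" (right index finger)
    """
    key_side = {"Q": "left", "P": "right"}[correct_key]
    if stimulus_position == "center":
        return "neutral"                  # no spatial conflict
    return "congruent" if stimulus_position == key_side else "incongruent"


# One of the two counterbalanced mappings: blue -> "Q", brown -> "P".
color_to_key = {"blue": "Q", "brown": "P"}
print(trial_condition("left", color_to_key["blue"]))    # congruent
print(trial_condition("left", color_to_key["brown"]))   # incongruent
print(trial_condition("center", color_to_key["blue"]))  # neutral
```

With 84 experimental trials split equally, each condition would contribute 28 trials.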
General administration procedures
All data were collected online due to strict pandemic restrictions in Mainland China. Participants were sent a link to the experiment through email and were asked to find a quiet area where they could focus and complete the tasks. They were further instructed to maximize the size of their browser screen prior to starting the experiment and to avoid using their phone or engaging in other distracting activities. Informed consent was collected from all participants prior to the start of the experiment, followed by demographic details and the LHQ-3. Next, either the Simon task or Attention Network Test was completed (only Simon data presented here), followed by the PSS-10 and both picture naming tasks. The completion of all tasks took around 30 minutes for each participant with breaks available after each phase of the experiment.
Statistical analysis
A within-subjects design was used to investigate whether differences in bilingual language experience impacted on cognitive control and whether results differed between models using exclusively subjective or objective assessments of language proficiency. Reaction time (RT) data were analyzed with linear mixed-effects models using the lmer function from the lme4 package (Version 1.1–26; Bates et al., 2015) in R (Version 4.0.5; R Core Team, 2021). While still uncommon, the application of linear mixed-effects models in the investigation of bilingual effects on cognitive control allows for consideration of individual differences in linguistic and non-linguistic background while also accounting for the multi-trial (i.e., repeated measures) nature of most widely used behavioral tasks (Privitera et al., 2023b; Privitera & Weekes, 2023). Full analysis details can be found in our previous work (Privitera et al., 2022). Here, we briefly describe the procedure. RT data from all correct trials longer than 150 ms and shorter than 2,000 ms were included in our analysis. These cutoffs were selected to maximize the likelihood that an authentic bilingual effect could be identified (Zhou & Krott, 2016). Prior to model fitting, RT data were log-transformed, addressing issues with non-normality.
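The trial-level preparation described above can be sketched as follows. The analysis itself was run in R, so this Python version is only illustrative. The function name `prepare_rts` and the field names `rt_ms` and `correct` are hypothetical. The natural logarithm is assumed because the base of the log transform is not stated.

```python
import math


def prepare_rts(trials, lower_ms=150, upper_ms=2000):
    """Keep correct trials with lower_ms < RT < upper_ms and add a log RT."""
    kept = []
    for trial in trials:
        if trial["correct"] and lower_ms < trial["rt_ms"] < upper_ms:
            # Natural log is an assumption; the base is not given in the text.
            kept.append({**trial, "log_rt": math.log(trial["rt_ms"])})
    return kept


# Hypothetical trials: only the second one survives both filters.
example = [
    {"rt_ms": 120, "correct": True},    # too fast
    {"rt_ms": 450, "correct": True},    # kept
    {"rt_ms": 500, "correct": False},   # incorrect response
    {"rt_ms": 2500, "correct": True},   # too slow
]
print(len(prepare_rts(example)))  # 1
```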
Multicollinearity between predictor variables was assessed using variance inflation factor (VIF). Two separate models were built in this study with one containing objective measures of Mandarin (L1) and English (L2) proficiency, and the other containing subjective measures. Both models initially contained main effects for gender, task order, age, reported stress, video game experience, musical instrument experience, SES, number of languages used, language switching, L1 proficiency, L1 dominance, L2 proficiency, L2 immersion, L2 dominance, and L2/L1 dominance ratio. In addition, interactions with congruency were included for language switching, L2 proficiency, L2 immersion, L2 dominance, and L2/L1 dominance ratio based on our a priori expectation that differences in these variables would impact on inhibitory control. Random effects structure fitting began with a maximal model which included random participant intercepts and random by-participant slopes for congruency (Barr et al., 2013). Finally, absolute standardized residuals exceeding 2.5 standard deviations were removed, which resulted in residuals that were normally distributed (Baayen & Milin, 2010).
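The final residual-trimming step can be sketched as below. This is a minimal illustration that assumes the residuals have already been extracted from a fitted model. The function name `trim_by_residual` is hypothetical, and the residuals are standardized here by their sample mean and standard deviation, which may differ in detail from the procedure applied in R.

```python
from statistics import mean, stdev


def trim_by_residual(rows, residuals, cutoff=2.5):
    """Drop rows whose absolute standardized residual exceeds the cutoff."""
    centre = mean(residuals)
    spread = stdev(residuals)
    return [
        row
        for row, res in zip(rows, residuals)
        if abs((res - centre) / spread) <= cutoff
    ]


# Hypothetical residuals: 20 small values and one extreme value.
residuals = [0.1, -0.1] * 10 + [3.0]
rows = list(range(len(residuals)))
print(len(trim_by_residual(rows, residuals)))  # 20
```

In a workflow of this kind the model is typically refit on the trimmed data, after which the distribution of the residuals is checked again.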
Results
Beyond the data trimming described above, we removed all data from 1 participant who had completed the experiment twice, 5 participants with an accuracy below 70%, 5 participants with objective L1 proficiency scores of 0%, and 3 participants with objective L2 proficiency scores of 3% or less. This resulted in the inclusion of 4,680 trials from 60 participants (41 females; Mage = 20.10 years, SDage = 1.32 years) with an average of 12.67 years of experience using English (±2.79 years).
Paired sample t-tests between subjective and objective measures of proficiency for each language were performed. As expected, both subjective ratings, t(59) = 13.532, p < .001, and objective scores, t(59) = 18.844, p < .001, of L1 proficiency were significantly higher than those for L2. Participants also rated their own language proficiency as lower than their objective proficiency in both L1, t(59) = –4.861, p < .001, and L2, t(59) = –3.628, p < .001. Finally, subjective and objective measures of language proficiency were not correlated in either L1, r(59) = –.053, p = .689, or L2, r(59) = .043, p = .747. Complete background details of our sample are summarized in Table 1.
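The two kinds of comparison reported here, a paired-samples t statistic and a Pearson correlation, can be sketched as follows. The function names `paired_t` and `pearson_r` and the example scores are hypothetical. The sketch returns test statistics only and does not compute p values.

```python
import math
from statistics import mean, stdev


def paired_t(x, y):
    """Paired-samples t statistic and degrees of freedom for x - y."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


def pearson_r(x, y):
    """Pearson product-moment correlation between x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical 0-1 proficiency scores for four participants.
subjective = [0.5, 0.6, 0.7, 0.8]
objective = [0.7, 0.9, 0.8, 1.0]
t, df = paired_t(subjective, objective)
print(t < 0, df)  # True 3
```

With subjective scores entered first, a negative t indicates that self-ratings fall below objective scores, which is the direction reported above.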
Table 1. Demographic and language history data.
Note. PSS: perceived stress; SUB: subjective; OBJ: objective.
Modeling results
Due to high VIF (>5; Craney & Surles, 2002), L1 and L2 dominance as well as the interaction between L2 dominance and congruency were removed from both models prior to fitting. In addition, maximal models which included random participant intercepts and by-participant random slopes for congruency did not converge. Therefore, final models contained only random participant intercepts. Finally, trimming of extreme residuals resulted in the removal of 143 trials from the objective proficiency model and 142 trials from the subjective proficiency model. After trimming, residuals for both models were approximately normally distributed.
Congruency was initially sum coded (–1, 0, 1) during model fitting to assess main effects. A bilingual effect on monitoring would present as a significant main effect of L2 proficiency, L2 immersion, or L2/L1 dominance ratio. Congruency was then dummy-coded, with the congruent condition set as the reference level, to assess simple effects. Under this coding, a significant effect of congruency for the incongruent condition with a positive coefficient would indicate the presence of the classic Simon effect. Likewise, a bilingual effect on inhibitory control would present as a significant interaction between L2 proficiency, L2 immersion, or L2/L1 dominance ratio, and the incongruent condition. For both main effects and interactions, negative coefficients would represent improved task performance associated with higher bilingual experience.
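How dummy coding with a congruent reference level turns a positive incongruent coefficient into a Simon effect can be sketched as follows. This is an illustration and not the fitted model. The function names and the coefficient values (6.0 and 0.05) are hypothetical, and a natural-log RT scale is assumed.

```python
import math

LEVELS = ("congruent", "incongruent", "neutral")


def dummy_code(condition, reference="congruent"):
    """Treatment-code a congruency level against the reference level."""
    if condition not in LEVELS:
        raise ValueError(f"unknown condition: {condition}")
    return {level: int(condition == level) for level in LEVELS if level != reference}


def predicted_rt_ms(intercept, b_incongruent, condition):
    """Back-transform a predicted log RT (natural log assumed) to milliseconds."""
    codes = dummy_code(condition)
    return math.exp(intercept + b_incongruent * codes["incongruent"])


# Hypothetical coefficients chosen only for illustration.
congruent = predicted_rt_ms(6.0, 0.05, "congruent")
incongruent = predicted_rt_ms(6.0, 0.05, "incongruent")
print(incongruent > congruent)  # True
```

Because the congruent condition is coded as zero on both indicators, the intercept is the predicted log RT for congruent trials and each coefficient is a difference from that baseline.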
Model results are summarized in Table 2 for the subjective proficiency model and Table 3 for the objective proficiency model. The pattern of results for both models was identical and, for this reason, the results of both models will be discussed together. The presence of a significant effect of incongruent trial condition with a positive coefficient confirmed the presence of a Simon effect. A significant interaction between incongruent trial condition and L2/L1 dominance ratio was observed, with higher levels associated with reduced inhibitory control. In addition, a marginally significant interaction between neutral trial condition and L2/L1 dominance ratio was observed, with higher levels associated with slower performance on neutral trials relative to congruent trials. The simple effects of L2/L1 dominance ratio on task performance for both models are shown in Figure 1. Finally, a significant interaction between neutral trial condition and language switching was observed, with higher reported language switching associated with faster performance on neutral trials relative to congruent trials. No significant main effects were observed in either model. Full results for each model can be accessed on Open Science Framework (https://doi.org/10.17605/OSF.IO/6WF74).
Table 2. Subjective proficiency model results.
Note. CI: confidence interval; SES: socioeconomic status; SUB: subjective; OBJ: objective.
Congruent condition set as a reference level for interactions.
Table 3. Objective proficiency model results.
Note. CI: confidence interval; SES: socioeconomic status; SUB: subjective; OBJ: objective.
Congruent condition set as a reference level for interactions.

Figure 1. Influence of L2/L1 dominance ratio on Simon task performance. Data presented are from (A) a model which included objective language proficiency measures and (B) a model which included subjective measures. The pattern of results was nearly identical for both models. *95% confidence interval. RT is plotted on its original scale for display purposes.
Discussion
This study investigated whether the inclusion of either subjective or objective language proficiency measures generated different results when modeling bilingual effects on cognitive control. Regardless of whether subjective or objective measures were included, model results did not differ. Neither model provided support for an influence of second language proficiency on Simon task performance. However, we did identify simple effects on task performance associated with other dimensions of language experience. Together, our results suggest that both subjective and objective measures of language proficiency may be equivalent when included in models testing the influence of bilingualism on cognitive control.
Language proficiency is a critical variable of interest when investigating bilingual effects. A reliable and valid assessment of proficiency is essential if accurate conclusions are to be drawn regarding the influence of this dimension of language experience on cognitive control. While the commonplace assessment of proficiency using exclusively subjective measures is convenient, it has questionable utility given our limited understanding of our own abilities (Zell & Krizan, 2014). While this methodological trend may appear to be a cause for concern, whether results from models of behavioral data differ when subjective measures of language proficiency are substituted with objective measures is an open question. The primary contribution of this study is that the observed pattern of results did not differ between subjective and objective proficiency models of Simon task data.
Our findings align with at least one previous study supporting that the use of either subjective or objective measures of language proficiency leads to comparable results. In a prospective longitudinal study, Zahodne and colleagues (2014) reported no influence of bilingualism on cognitive decline or dementia conversion in a sample of Spanish-English bilingual Hispanic immigrants. Crucially, results were comparable when second language proficiency was assessed either subjectively through self-report or objectively through performance on a reading test. Conflicting findings have also been reported. Gollan and colleagues (2011) observed that only objective measures of language proficiency predicted the age of Alzheimer’s diagnosis in a sample of Hispanic elders. These contrasting results could be attributed to differences in the scope of proficiency assessment. While the assessment was limited to only English (L2) language proficiency in the Zahodne study, a composite “bilingualism” score based on both Spanish (L1) and English (L2) proficiency was used in the Gollan study. The alignment of the present findings with those of Zahodne and colleagues (2014) suggests that subjective measures may only retain predictive power when they are considered individually and not as part of a composite score.
Results of this study do not support the claim that improved cognitive control is associated with higher levels of second language proficiency. This null result is consistent with previous research supporting the absence of bilingual effects in samples of younger adults (Ware et al., 2020). By comparison, bilingual effects are more often reported in studies focused on samples of older adults. One explanation for the reported discrepancy between age groups is the expectation that the benefits associated with bilingualism may be more readily observed in samples exhibiting reduced cognitive control as a consequence of aging or dementia (e.g., Van den Noort et al., 2019). This explanation is further supported by the observation that young adults are likely experiencing a developmental peak in cognitive control (P. Anderson, 2002). Consequently, as summarized in the peak performance hypothesis, improved cognitive control associated with bilingual language experience in young adults may be difficult or impossible to observe due to a developmental ceiling effect on task performance (Bialystok, 2016).
Most significant was the observation that nearly identical results emerged when either subjective or objective proficiency measures were included in our model, supporting that this finding was not due to the use of one measure of proficiency over another. While past work is limited, different methods for the measurement of separate dimensions of language experience have been shown to influence the emergence of bilingual effects. In one study, Anthony and Blumenfeld (2019) identified that the emergence of a bilingual effect on inhibitory control depended on the way in which language dominance was operationalized. Specifically, only a hybrid index of language dominance calculated as the average of subjective and objective language proficiency and exposure scores was associated with performance on a Stroop task. This finding suggests that, in the case of language dominance, not all measures are equivalent. Taken together with the comparable results observed in this study, we conclude that proficiency measures, whether subjective or objective, may be tapping into the same construct in a way that does not impact on model results. However, this is not the case across all dimensions of language experience, highlighting that the use of both subjective and objective assessments across the many dimensions of language experience is best practice.
Unexpectedly, we found that participants rated their proficiency as significantly lower than their objective score for both Mandarin and English. That is, in both their native and second language, participants were actually more proficient than they expected. This finding is consistent with the observation that Asian learners tend to underestimate their language learning ability (Lien, 2016). Previous studies support that the accuracy of subjective ratings of proficiency may differ between native and second languages and that linguistic context may also modulate these ratings. In a similarly aged sample of Mandarin-English bilinguals, Tomoschuk and colleagues (2019) observed that participants tended to overestimate their own Mandarin proficiency while underestimating their English proficiency relative to objective scores. Our finding only partially aligns with this previous report and may have resulted from differences in the sociolinguistic context of the samples. This study included Mandarin-English bilinguals living in the Mandarin-dominant environment of Mainland China, while Tomoschuk and colleagues recruited their sample from the English-dominant United States of America. Participants in our study likely had fewer opportunities to get language feedback from native English speakers, possibly biasing them toward underestimating their own proficiency in a language they have less experience with. By the same logic, the higher percentage of native English speakers and smaller percentage of native Mandarin speakers in the United States may have given participants in the Tomoschuk study a less optimistic view of their English proficiency, while simultaneously inflating ratings of Mandarin proficiency given their lifetime of experience with the language.
Alternatively, our observed discrepancy between subjective and objective assessments of language proficiency may relate to the unique characteristics of Chinese students. In traditional Chinese education, teachers believe in “setback education” (Cuozhe jiaoyu), which means to “hammer” students’ willpower to prevent self-inflation while also crushing students’ self-confidence (Wang & Byram, 2011). Students who are considered to be “good” in China often demonstrate humbleness (Wang & Byram, 2011), even when they achieve highly (Salili & Hau, 1994). This humbleness or humility is thought to motivate students to continue in their pursuit of self-improvement and success (Martin et al., 2014). Lower subjective ratings for both Mandarin and English proficiency may have resulted from our participants’ desire to demonstrate humility when assessing their own abilities (Whitcomb et al., 2017).
While we observed lower subjective ratings for both Mandarin and English proficiency, English proficiency may be more likely to be underestimated. In the Chinese educational context, English is employed more as an exam skill than a means of communication for students (Davey et al., 2007; Pan & Block, 2011; Privitera, 2023). The association of English with high-stakes testing can generate considerable academic pressure, which may lower students’ self-confidence and promote language anxiety (Sun et al., 2013). This anxiety, which likely shows considerable heterogeneity (Liu, 2006), may have biased participants in our sample toward rating their English proficiency lower. Considering the cultural background of our participants and the linguistic context in which they live, a combination of multiple factors likely underlies our observed lower subjective ratings of proficiency. This cultural interpretation is further supported by the observation that Spanish-English/English-Spanish bilinguals in America tend to overestimate the proficiency of their dominant language (Cieślicka & Guerrero, 2023). Collectively, these findings suggest that differences in sociolinguistic context may impact on subjective ratings of proficiency. This suspected modulation provides one possible explanation for the observed lack of correlation between our subjective and objective proficiency assessments. However, as only a small number of studies conducted on bilinguals use both subjective and objective assessments of proficiency (Surrain & Luk, 2019), it is unknown whether the absence of a correlation between these measures is common.
Future work is needed to more clearly elucidate the influence of these factors, especially in light of recent evidence supporting a modulatory influence of sociolinguistic context on the manifestation of bilingual effects on cognitive control (Freeman et al., 2022), and calls for the investigation of these effects across more diverse linguistic settings (e.g., Privitera & Weekes, 2023).
Implications
This study is of interest to researchers who study the relationship between language experience and a diverse range of outcomes. While we explored this topic in the context of non-linguistic benefits associated with bilingualism, the use of subjective assessments of ability is relevant to a number of other areas of research. One recommendation for future studies is to include both subjective and objective assessments in order to best capture the construct of interest. This is especially crucial when investigating bilingual effects as the vast majority of studies rely on either subjective or objective assessments exclusively (Surrain & Luk, 2019). Theoretically, our findings also contribute to the ongoing debate about whether bilingual language experience confers non-linguistic benefits. Results from both subjective and objective proficiency models did not provide evidence in support of improved task performance associated with higher levels of bilingual language experience (Paap, 2018). Finally, this study has implications for the field of education. The observed discrepancy between subjective and objective measures may reflect underlying self-confidence issues, which may impact on other domains of a student’s life. This highlights a potential need for student support interventions aimed at improving self-confidence in the interest of student well-being.
Limitations
Our reported findings should be considered in light of a few limitations. Our observed results could be culturally specific given evidence that subjective ratings of language proficiency differ across samples from different countries and linguistic contexts (e.g., Tomoschuk et al., 2019). Considering the characteristics of our sample, bilinguals from the Chinese campus of an American university, English proficiency is likely more relevant than for Chinese public university students. For this reason, our findings likely do not reflect what would be observed in a typical Chinese university student. In addition, our investigation was limited to comparing the influence of subjective and objective measures of a single dimension of bilingual language experience (i.e., proficiency). Although we controlled for differences across a range of additional linguistic and non-linguistic variables, these were exclusively subjective in nature. While our findings do not support that results differ between models using subjective and objective measures of language proficiency, this may not be true for other dimensions of language experience such as dominance (e.g., Sheng et al., 2014). Finally, objective proficiency was assessed through picture naming. As Gollan and colleagues (2012) point out, picture naming tasks such as the Boston Picture Naming Task may not be sufficient to assess bilingual language proficiency because they were originally designed to test English proficiency in monolinguals. Sheppard and colleagues (2016) also suggest caution when interpreting bilingual speakers’ performance based on picture naming tasks.
Conclusion and future research
Our study aimed to investigate whether the use of subjective or objective measures of language proficiency influenced the emergence of bilingual effects on cognitive control. Results from our study support that both subjective and objective assessments of language proficiency may be equivalent when used during the modeling of behavioral data. Future work is needed to determine whether our findings can be replicated across other sociolinguistic contexts, at different points in development, and with alternative subjective and objective assessments of language experience. In addition, underlying explanations for why Chinese students tend to underestimate their language proficiency are in need of further exploration, especially with regard to how different sociolinguistic contexts impact on this tendency.
Footnotes
Authors’ note
An abbreviated version of the work presented in this paper was published in the proceedings of the 4th International Conference of Chinese Applied Psychology held in Wuhan, China in 2022.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This project was supported by a Gorilla grant that provided free online task hosting.
