Abstract
Research on young learners’ motivation in second language (L2) acquisition has expanded considerably, yet pronunciation-specific motivation remains largely overlooked, particularly in Content and Language Integrated Learning (CLIL) contexts. At the same time, gender has frequently been claimed as a source of variation in L2 motivation and pronunciation attainment, although recent evidence from primary-school CLIL settings suggests that such differences may be far less pronounced than traditionally assumed. This study compares English pronunciation motivation across CLIL and non-CLIL programs in Spanish primary education learners and between boys and girls within each type of instruction. A total of 337 pupils in Grades 3 and 4 from five semiurban schools in a monolingual region of northern Spain (3 CLIL, 2 non-CLIL) completed a 38-item questionnaire designed to capture a broad range of motivational dispositions relevant to pronunciation learning. An exploratory factor analysis identified a reliable 36-item, 7-factor structure consisting of intrinsic motivation, extrinsic motivation, ideal L2 self, ought-to L2 self, learning experience, informal engagement, and self-efficacy. Nonparametric tests were used for between-sample comparisons and for inspecting intragroup motivational hierarchies. Results showed high pronunciation motivation across the sample and no statistically robust differences between CLIL and non-CLIL learners or between boys and girls in any motivational factor. Across all subsamples, extrinsic motivation and self-related future aspirations were the highest scored components, whereas self-efficacy and informal engagement were consistently the weakest. These findings suggest that, in this context, pronunciation motivation is predominantly externally oriented and does not appear to differ substantially by moderate-intensity CLIL exposure or gender. 
The study argues that the low curricular visibility of pronunciation may limit opportunities for instructional models or individual differences to exert an effect, highlighting the need for more explicit and engaging pronunciation pedagogy in both CLIL and non-CLIL classrooms.
1. Introduction
Motivation is understood in second language acquisition (SLA) studies as the phenomenon that “concerns the choice and direction of a particular action, the effort expended on it and the persistence with it” (Dörnyei, 2019a, p. 61). Although some motivation theories are rooted more broadly in general and educational psychology, motivation in SLA has developed along its own trajectory (Ryan, 2019). One reason is that learning a foreign language in an educational setting differs from learning other academic subjects. Second language (L2) learners not only acquire knowledge about the language, but may also integrate it into their behavioral repertoire and associate it with a different cultural identity (Gardner, 1985). Furthermore, the growing importance of English in a globalized world has increased scholarly interest in motivational frameworks that can account for learners’ aspirations to learn and use the language.
Research on motivation in formal L2 learning settings remains necessary because learners’ engagement with the L2 is now shaped not only by classroom instruction but also by growing opportunities for exposure outside school. Such informal exposure may strengthen learners’ interest in using and learning the language outside school, even when motivation in formal L2 lessons is weaker (Ushioda, 2013). This is especially relevant for English, which learners may frequently encounter in informal settings (Henry, 2014) and perceive as useful due to its international status (Lasagabaster, 2011). However, traditional L2 learning environments may not provide optimal conditions to foster motivation (Mercer & Dörnyei, 2020), as informal exposure is seen as more authentic (Henry, 2013). In some contexts, extracurricular instruction through private lessons accelerates learners’ progress to the point that the school syllabus becomes repetitive or insufficiently challenging, which can undermine engagement in formal lessons (Banegas, 2013). Against this backdrop, Content and Language Integrated Learning (CLIL), an approach in which learners learn the L2 and use it to learn academic content (Dalton-Puffer, 2011), may provide conditions that support stronger motivation toward the L2. Indeed, CLIL has been associated with the potential to enhance language learning motivation (Coyle, 2008). Nevertheless, motivation in CLIL settings still requires further research, as most work on CLIL has focused on linguistic and content-related outcomes (Lasagabaster, 2019).
Although L2 motivation research has expanded considerably in recent decades (Lamb et al., 2019), most studies have focused on older learners, while primary-school populations remain comparatively underexamined (Boo et al., 2015). This oversight is particularly important because motivation is a multifaceted phenomenon that cannot be reduced to a single theoretical perspective (Ushioda, 2019). Therefore, empirical studies focusing on younger learners, understood as learners up to 12 years old (Pinter, 2011), are needed to better understand the motivational dynamics operating at earlier developmental stages. To address this gap, the present study investigates young learners’ pronunciation motivation by operationalizing constructs that have been central in broader quantitative motivation research. Pronunciation is a particularly relevant domain because it carries a substantial emotional and attitudinal load for learners (Baran-Łucarz, 2017) and its relationship with motivation appears to be complex and sometimes ambivalent (Nagle, 2018; Smit, 2002), as learners may perceive pronunciation goals differently. Although recent studies have begun to examine the potential of technology-based pronunciation learning to enhance motivation (Inceoglu et al., 2024), motivation remains underexplored in relation to pronunciation. This limited attention to pronunciation motivation may partly reflect the broader marginality of L2 pronunciation research in comparison with other linguistic domains (Derwing & Munro, 2022), particularly in CLIL contexts.
2. Theoretical Background
2.1. Motivation in SLA
Motivation has been regarded as a prerequisite for foreign language achievement (Ushioda, 2010) and a key predictor of success in foreign language acquisition since the inception of language acquisition theories that incorporated affective factors (Krashen, 1982). Academic interest has produced various motivation theories in the last decades, receiving special attention in SLA studies as “a foreign language is more than a mere communication code that can be learned similarly to other academic subjects, and it has therefore typically adopted paradigms that link the L2 to the individual’s personal core” (Dörnyei, 2009, p. 9).
Various theories that integrate psychology and SLA studies have investigated the construct of motivation. Gardner’s (1985) socio-educational model distinguished between integrativeness, which implies the desire to assimilate into the foreign language community, and instrumental orientation, which refers to external or pragmatic factors that influence the motivation to learn the foreign language, e.g., career advancement or improved school grades. Other frameworks, such as Self-Determination Theory (Deci & Ryan, 1985), added further nuance by distinguishing extrinsic motivation, referring to motivational drives oriented toward specific outcomes or rewards, from intrinsic motivation, referring to interest in language learning itself and the personal fulfilment it may bring, without necessarily implying identification with the target-language community (Wu, 2003) or orientation toward a specific external goal, as in extrinsic motivation.
Another construct that has played a foundational role in psychological approaches to motivation is self-efficacy. Within Bandura’s social cognitive theory, perceived self-efficacy is defined as “beliefs in one’s capabilities to organize and execute the courses of action required to produce given attainments” (Bandura, 1997, p. 3). Self-efficacy is central to motivational functioning because it shapes “the choices individuals make, the effort they invest, and the persistence and resilience they display when facing difficulties” (Bandura, 1997, p. 3). Efficacy beliefs are conceived as domain- and task-specific and therefore measured with reference to future tasks and levels of perceived easiness or difficulty (Mills, 2014). Recent research, reviewed by Graham (2022) and extended by H. Kim (2024), shows that self-efficacy is closely linked to self-regulation and engagement and often predicts language achievement and perceived proficiency more strongly than broader belief constructs such as language mindsets. In this sense, learners’ perceptions of how easy or difficult it will be to carry out specific tasks can be interpreted as an expression of their language-related self-efficacy, with direct consequences for the effort they are willing to invest. In this context, CLIL is hypothesized to negatively affect learners’ self-efficacy given the additional difficulty of learning content in a foreign language (Doiz et al., 2014). However, this does not seem to always be the case (Gallardo-del-Puerto & Blanco-Suárez, 2021).
More recently, Dörnyei (2006, 2009) proposed a model that associates motivation with a more complex set of factors, including self-concept, as well as the language learning context. The L2 Motivational Self System (L2MSS) attempts to capture a more multidimensional picture of motivation by exploring three motivational constructs: (1) the ideal L2 self, or the image that a foreign language learner projects of who they want to become as an L2 user; (2) the ought-to L2 self, or the image that a foreign language learner projects of what type of L2 user they should become according to others; and (3) the L2 learning experience, or the exploration of those factors related to classroom variables and various characteristics of the learner group which may significantly affect L2 motivation (Dörnyei, 2019b).
Another component of L2 motivation is engagement, understood as learners’ level of interest, participation, and willing involvement (Reschly & Christenson, 2012). Engagement in formal contexts is close to the idea of learning experience (Dörnyei, 2019b), as it concerns learners’ interaction with tasks, materials, teachers, or peers. The distinction between the two concepts has sometimes been the subject of debate (Appleton et al., 2008), as their conceptualizations seem to overlap. Nonetheless, they offer different points of view that help us understand “learners’ potential and learners’ realized potential” (Dörnyei, 2019b, p. 60). The very existence of this debate indicates that the two concepts are closely related. Engagement can be explored both inside and outside the classroom (Martin et al., 2017). Formal engagement can be measured according to school context, teaching materials, learning task or interaction with peers and teacher (Dörnyei, 2019b), while informal engagement (also referred to as extramural exposure in the literature) can be measured through learners’ interaction with activities that involve the L2 (Sundqvist, 2011) such as watching films, playing videogames, or listening to music. Informal engagement has been mainly explored in terms of linguistic achievement, focusing on overall communicative skills (Wilde et al., 2021) or speaking proficiency (Sundqvist & Uztosun, 2023). A study focusing on informal exposure in CLIL and non-CLIL samples shows that, although CLIL students report substantially higher levels of extramural English than non-CLIL peers, this additional exposure does not lead to stronger development in academic vocabulary (Olsson & Sylvén, 2015).
Recent work in Spain further indicates that the type of informal engagement available to young learners tends to be limited to low-interaction activities such as listening to music or watching TV, and that its linguistic effect is therefore modest and skill-dependent (Lázaro-Ibarrola, 2024).
Different motivation conceptualizations have mainly been researched in adult L2 learner populations. Early studies on motivation that have been conducted with younger learners have shown that they tend to be more motivated intrinsically or by classroom-related variables such as their teacher, or materials, rather than expressing instrumental or integrative motives (Nikolov, 1999). This suggests that young learners enjoyed learning the language but did not necessarily express a desire to identify with the L2 community. More recent studies in various countries have also found that young learners display integrative and instrumental motives at the onset of their learning, possibly due to their increased exposure to the target culture and the social discourse on the importance of learning English (T.-Y. Kim, 2011). The ideal and ought-to L2 selves have not been thoroughly explored in young learners (Mihaljević Djigunović & Nikolov, 2019) as it has traditionally been claimed that these originate later (Zentner & Renaud, 2007). However, it has been noticed that learners with more favorable socioeconomic background develop their ideal L2 self earlier, while those from less favorable backgrounds only develop their ought-to L2 self (Lamb, 2007). L2 learning experience has also been found to play a greater role in younger learners than in older ones (Kormos & Csizér, 2008), mainly because the language learning and use processes are more related to classroom variables (Csizér & Kormos, 2009) such as the teacher, materials, activities, or experience of success (You et al., 2016). Learning experience could play an important role in CLIL, as content subject lessons taught in English are presumably different from traditional English as a foreign language (EFL) lessons. This was the case in the study that Zhu et al. (2024) carried out in China comparing CLIL and non-CLIL samples, where the authors found differences in terms of learning experience motivation, in favor of CLIL learners, who reported that they enjoyed learning English and that they looked forward to their English lessons.
2.2. Profiling Pronunciation Motivation
Research on L2 motivation has largely concentrated on broader orientations toward language learning rather than on domain-specific motivations such as pronunciation motivation. As a result, studies on pronunciation have typically imported general motivational constructs to explain variation in pronunciation outcomes. This move has empirical grounding in early work that showed that learners who were more concerned about their pronunciation achieved higher levels of phonological accuracy (Elliott, 1995), and later research that demonstrated that attitudes toward accent and identification with target-language speakers influenced degree of foreign accent (Flege et al., 1995). Likewise, research on exceptional late L2 learners who attain near-native pronunciation consistently finds unusually high levels of motivation and sustained, deliberate effort, suggesting that motivation can mitigate age-related disadvantages in phonological attainment (Moyer, 2014). More recent longitudinal work further confirms that motivational effort predicts accentedness trajectories (Nagle, 2018). Therefore, these findings indicate that pronunciation is a domain in which motivation may exert a direct influence on performance.
However, pronunciation motivation has not been defined or conceptualized as an independent affective construct in the way that, for example, pronunciation anxiety has been (Baran-Łucarz, 2014). As Smit (2002) notes, this may stem from the inherent ambivalence and complexity of how pronunciation itself is conceptualized, which compounds the already multifaceted nature of motivation. Learners may be motivated to align with target-language norms, yet others may resist nativelike pronunciation because it threatens aspects of their first language (L1) identity (Müller, 2013). In other cases, pronunciation is framed in terms of communicative usefulness (Gómez-Lacabex & Roothooft, 2023) or long-term instrumental goals such as employability (Moyer, 1999). These divergent orientations make it difficult to formulate a single, unified construct of pronunciation motivation. The development of instruments such as the Learner Attitudes and Motivations for Pronunciation (LAMP) inventory (Sardegna et al., 2014) represents progress, but this instrument still operates without an overarching conceptual model. Consequently, pronunciation motivation remains an empirically attested but theoretically unsettled construct that depends on broader L2 motivation models to be operationalized.
2.3. Motivation and CLIL in Spain
Given the absence of research on pronunciation motivation in CLIL settings, and the considerable variability in CLIL implementation, the present study reviews evidence on CLIL and broader L2 English motivation among young learners in Spain in order to establish a baseline against which pronunciation-related patterns can be interpreted. Research conducted in Spanish secondary education generally reports higher motivational levels among CLIL learners, an advantage often attributed to increased exposure rather than selection effects (Doiz et al., 2014; Lasagabaster, 2011). However, findings vary depending on program intensity, with high-exposure CLIL showing clearer gains than low-exposure models (Azpilicueta-Martínez & Lázaro-Ibarrola, 2023; Heras & Lasagabaster, 2015), which underscores the need to examine whether additional exposure exerts similar effects in younger primary-school learners.
Studies examining the relationship between CLIL instruction and young learners’ motivation in Spain have produced heterogeneous findings. Table 1 summarizes the main studies on L2 English motivation conducted in primary education, including overall direction of CLIL effects, motivational constructs analyzed, instruments used, gender patterns, CLIL subjects, and additional CLIL exposure.
Table 1. Summary of Previous Studies on L2 English Motivation in Primary CLIL Contexts.
Note. AMTB = Attitude/Motivation Test Battery. L2MSS = L2 Motivational Self-System. MA = Motivation and Anxiety.
Figure determined using the data supplied within the article.
Some studies report lower motivation among CLIL learners. This is the case in the work of Fernández-Fontecha and Canga-Alonso (2014), who examined intrinsic, extrinsic, and general motivation using a quantitative AMTB-adapted questionnaire with fourth graders from a monolingual region. Non-CLIL learners obtained higher scores on all motivational dimensions despite the CLIL group having accumulated 281 hours of additional exposure through science. No statistically significant gender differences were observed.
Other studies have found no motivational differences between CLIL and non-CLIL learners. Gallardo-del-Puerto and Blanco-Suárez (2021), using an AMTB-inspired quantitative instrument with pupils in Grades 4 and 6, compared intrinsic, integrative and instrumental orientations together with parental support, effort, self-efficacy and anxiety. Both groups showed similar motivational levels, even though CLIL learners had received between 361 and 462 hours of additional exposure. Gender patterns differed across instructional contexts: boys and girls did not differ in the CLIL group, whereas in the non-CLIL group girls scored higher in intrinsic motivation, effort and self-efficacy. Comparable findings regarding the absence of broad motivational differences were reported by Pladevall-Ballester (2019), who used an L2MSS-based quantitative questionnaire with learners in the last cycle of primary education of a bilingual region and carefully matched the CLIL and non-CLIL groups so that both received the same amount of exposure to English. Under these conditions of controlled exposure, CLIL and non-CLIL learners showed similar levels of ideal L2 self, intrinsic motivation, and instrumental motivation. Differences emerged only within the CLIL curriculum, where learning experience varied across subjects, with science lessons eliciting more positive perceptions than arts and crafts. This pattern suggests that any motivational impact of CLIL may depend on the specific subject through which English is used rather than on CLIL participation itself.
A third group of studies reports higher motivation among CLIL learners in bilingual regions. Lasagabaster and López Beloqui (2015), using an AMTB-inspired quantitative questionnaire, found higher intrinsic and integrative motivation in CLIL learners. Their CLIL curriculum included CLIL instruction in arts, physical education, and computers, with maths and science taught partly in English. Their overall estimated exposure was considerably higher than in the other studies, approximately 1348 hours, placing the program in a high-intensity CLIL category that could be associated with the motivational advantage found. Navarro-Pablo and García-Jiménez (2018) also observed slightly higher motivation in CLIL sixth graders and stronger associations between motivation and proficiency among CLIL learners, although exposure data were not reported.
More recent evidence reinforces the importance of intensity even at young ages. Lázaro-Ibarrola and Azpilicueta-Martínez (2024) found that CLIL learners in Grades 5 and 6 receiving CLIL arts and crafts alongside EFL outperformed their non-CLIL peers in ideal L2 self and learning experience after approximately 407 hours of accumulated CLIL exposure. Expanding this line of research with a large sample (n = 895), Azpilicueta-Martínez and Lázaro-Ibarrola (2023) demonstrated that high-intensity CLIL produced significantly higher motivation than both low-CLIL and non-CLIL programs across nearly all dimensions measured, including the learning experience, integrativeness, instrumentality-promotion, degree of difficulty, and L2 self-appraisal.
2.4. Gender in SLA, Motivation, and CLIL
SLA research has traditionally examined gender as a means to explain variation in final achievement, as it has been suggested that female learners tend to outperform males in terms of general L2 achievement (Główka, 2014) and English pronunciation (Moyer, 2016). This difference is also noticeable in school marks obtained for linguistic subjects (Voyer & Voyer, 2014) and in L2 motivation levels (López Rúa, 2006; Pavlenko & Piller, 2008; van der Slik et al., 2015), where female participants obtain higher scores than their male counterparts. As for engagement, boys tend to be less engaged in school settings (Fernández-Zabala et al., 2016) and their levels of engagement seem to decrease over time (Davies & Brember, 1994). However, some studies that have explored the gender variable in CLIL settings have found that attainment levels between boys and girls in CLIL settings are more similar than in non-CLIL contexts (Nieto Moreno de Diezmas & Hill, 2019). The same result is found in terms of motivation (Amengual-Pizarro & Prieto-Arranz, 2015; Gallardo-del-Puerto & Blanco-Suárez, 2021; Heras & Lasagabaster, 2015). However, more studies in this area are needed, as the limited research on CLIL and gender also suggests that similar results between boys and girls may be found in both CLIL and non-CLIL contexts (Martínez Agudo, 2022).
3. The Present Study
As the literature review shows, evidence on the motivational role of CLIL in primary education is inconsistent, and existing studies have examined only broad L2 motivation rather than pronunciation-specific motivation. Consequently, it remains unclear whether the additional exposure and instructional conditions characteristic of CLIL influence young learners’ motivation toward pronunciation, a domain with distinct cognitive, affective, and identity-related demands. Gender also requires systematic attention. Although gender differences are well documented in general L2 motivation and in pronunciation attainment, studies conducted in CLIL contexts suggest that such differences may narrow or disappear. Whether this convergence extends to pronunciation motivation is not known, as this area has not been investigated. Examining CLIL and gender therefore addresses a clear empirical gap and allows a more precise understanding of how contextual and individual factors relate to young learners’ pronunciation motivation. The objectives of this study are therefore:
- to compare pronunciation motivation between CLIL and non-CLIL samples;
- to compare pronunciation motivation between boys and girls within each type of instruction; and
- to analyze the distribution of motivational factors across samples.
3.1. Participants
A total of 337 primary education learners from the same monolingual region participated in this study. They were drawn from five schools in semiurban areas with similar socioeconomic backgrounds. Three of those schools followed a CLIL approach (n = 182) in which English was taught in the EFL subject and used as a medium of instruction in content subjects such as music, social sciences, or arts and crafts. In these schools, EFL classes were taught by English language specialists, while CLIL subjects were taught by content specialists with at least a B2 level in the Common European Framework of Reference for Languages (CEFR). The remaining two schools (n = 155) did not follow a CLIL approach and English was only used in the EFL subject. However, in neither CLIL nor non-CLIL schools was pronunciation targeted explicitly by EFL or CLIL teachers, as it is not part of the official curriculum. All participants were either in Grade 3 (aged 8–9) or Grade 4 (aged 9–10). In the CLIL sample, 91 identified as boys and 91 identified as girls, while in the non-CLIL sample 82 identified as boys and 73 identified as girls.
The amount of English exposure varied based on instructional approach and grade, in line with regional educational regulations. By the time of data collection, Grade 3 CLIL learners had received a minimum of 434 hours of English instruction, compared with 638 hours among Grade 4 learners. In contrast, non-CLIL Grade 3 and Grade 4 learners had received at least 249 and 379 hours, respectively.
Both EFL and CLIL teachers reported using English as the main classroom language, though Spanish was regularly used by teachers and pupils to ensure content clarity and maintain smooth communication. EFL teachers indicated that they provided corrective feedback on pronunciation but did not devote instructional time to explicit pronunciation teaching. In contrast, CLIL teachers corrected pronunciation only when it interfered with intelligibility.
English did not function as a lingua franca in the learners’ sociolinguistic environment because they had no communicative need or opportunity to use the language outside school. Informal exposure to English was generally low, although many learners reported attending private English lessons. The total hours of private instruction were calculated to compare CLIL learners (M = 117, SD = 121) and non-CLIL learners (M = 123, SD = 132). A Mann–Whitney U test showed no significant differences between the two samples (U = 15,018, p = .901).
3.2. Instruments
The study employed a 38-item questionnaire administered in Spanish and rated on a 5-point scale (see the Appendix). Given the lack of pronunciation-specific motivation instruments, the pronunciation motivation scale used in this study was adapted from existing research on broader language learning motivation and reflected the multidimensional nature of the construct, following the example of Sardegna et al. (2014). The adapted items were reviewed in their English form by a panel of experts and then translated into Spanish. The scale incorporated elements from established frameworks such as Self-Determination Theory (Deci & Ryan, 1985), the L2 Motivational Self System (Dörnyei, 2006, 2009) and engagement in formal and informal contexts (Dörnyei, 2019b), all of which were adapted to the pronunciation domain.
Some items were reverse-coded to avoid acquiescence bias. The scores of these items were reversed prior to data analysis. The items were presented in a mixed order rather than grouped by construct in order to reduce response patterning.
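The reversal step can be illustrated with a minimal sketch. It assumes responses coded 1 to 5, so that a reversed score is obtained by mirroring around the scale midpoint. The item labels and the `recode` helper are hypothetical and serve only as an illustration; the reverse-coded items themselves are those identified in the Appendix.

```python
SCALE_MIN, SCALE_MAX = 1, 5  # assumed coding of the 5-point response scale


def reverse_score(score: int) -> int:
    """Mirror a response around the scale midpoint (1<->5, 2<->4, 3 stays 3)."""
    if not SCALE_MIN <= score <= SCALE_MAX:
        raise ValueError(f"score {score} is outside {SCALE_MIN}-{SCALE_MAX}")
    return SCALE_MIN + SCALE_MAX - score


def recode(responses: dict, reversed_items: set) -> dict:
    """Return a copy of one pupil's responses with the flagged items reversed."""
    return {
        item: reverse_score(value) if item in reversed_items else value
        for item, value in responses.items()
    }


# Hypothetical item labels and responses, for illustration only.
raw = {"item_01": 5, "item_02": 2, "item_03": 3}
recoded = recode(raw, reversed_items={"item_02"})
print(recoded)
```

Reversing before any aggregation ensures that a higher value always indicates stronger motivation once composite scores are computed.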
The questionnaire was piloted several months before data collection with a group of 16 pupils from Grades 3 and 4, ensuring balanced gender representation. The aim of the pilot was to verify the clarity of the wording and the response scale, as well as to confirm that the length of the instrument was appropriate for completing it in a single sitting.
3.3. Data Analysis
An exploratory factor analysis (EFA) was conducted on the initial 38 items to examine the latent structure of the questionnaire, given the absence of prior evidence regarding its internal organization. The analysis used the minimum residual extraction method with oblimin rotation, as the motivational dimensions were assumed to be theoretically related rather than orthogonal. A loading threshold of 0.35 was adopted to support factor interpretation. The EFA yielded a 7-factor solution based on parallel analysis. However, two items did not load onto any factor and showed high uniqueness values (> 0.80): “At school we are told that we must have good pronunciation” and “If I have good pronunciation, I get better marks.” These two items were discarded and the EFA was repeated without them, again yielding a 7-factor solution. Sampling adequacy was confirmed by a Kaiser–Meyer–Olkin (KMO) value of 0.871, which indicates a satisfactory degree of shared variance among items, and Bartlett’s test of sphericity was significant, χ²(561) = 3987, p < .001. The model showed acceptable fit indices (RMSEA = 0.042, 90% CI [0.035, 0.049]; Tucker–Lewis index (TLI) = 0.906) and accounted for approximately 46% of the total variance. The factors were named based on prior research on language motivation. Because the items within each factor showed heterogeneous loadings, reliability was estimated using McDonald’s ω, which, unlike Cronbach’s α, does not assume equal item contributions. The resulting coefficients were as follows: intrinsic motivation (F1), ω = .80; extrinsic motivation (F2), ω = .86; ideal L2 self (F3), ω = .71; ought-to L2 self (F4), ω = .62; learning experience (F5), ω = .78; informal engagement (F6), ω = .71; and self-efficacy (F7), ω = .71. The comparatively lower coefficient for the ought-to L2 self factor is likely related to the fact that it comprises only four items, which restricts the amount of shared variance that can be captured and typically depresses reliability estimates.
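The relationship between the number of items and ω can be illustrated with a minimal sketch. It assumes the common single-factor formulation of ω for standardized loadings with uncorrelated residuals, ω = (Σλ)² / ((Σλ)² + Σ(1 − λ²)). This is an assumption made for illustration and not necessarily the exact estimator applied in the study, and the loadings below are hypothetical rather than taken from the EFA.

```python
def mcdonald_omega(loadings):
    """Omega for one factor, assuming standardized loadings and
    uncorrelated residuals (each residual variance is 1 - loading**2)."""
    common = sum(loadings) ** 2                  # variance due to the factor
    unique = sum(1 - l ** 2 for l in loadings)   # summed residual variances
    return common / (common + unique)


# Hypothetical loadings, for illustration only.
four_items = [0.7, 0.5, 0.4, 0.4]   # heterogeneous loadings, short scale
eight_items = four_items * 2        # same loading pattern, twice as many items

omega_short = mcdonald_omega(four_items)
omega_long = mcdonald_omega(eight_items)
print(round(omega_short, 3), round(omega_long, 3))
```

With the loading pattern held constant, the longer scale yields the higher coefficient, which is consistent with the interpretation offered for the four-item ought-to L2 self factor.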
Composite scores were computed for each factor, along with an additional overall motivation score computed as the mean of all 36 items. Mann–Whitney U tests were used for between-sample comparisons given violations of normality, as indicated by the Kolmogorov–Smirnov test (p < .05) and histograms. All analyses employed a significance threshold of α = .05. For between-group results that reached unadjusted significance, a post hoc Holm–Bonferroni adjustment was applied. Effect sizes for Mann–Whitney tests were reported as r and interpreted according to standard conventions, with values around .10 considered small, around .30 moderate, and around .50 large. Additional Wilcoxon tests were used for intragroup comparison of motivation factors, following Gallardo-del-Puerto and Blanco-Suárez (2021). Comparing each motivational factor with the overall motivation score makes it possible to identify the relative position of each factor within each sample. In Table 3, factors that differed significantly from the overall motivation score are displayed with a gray background, while factors that did not are displayed with a white background.
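The between-sample comparison and its effect size can be illustrated with a minimal sketch. It assumes the large-sample normal approximation to the Mann–Whitney U statistic with a tie correction and no continuity correction, and the convention r = |Z| / √N. The statistical software used in the study may differ in these details, and the scores below are hypothetical.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney(x, y):
    """Return (U for x, Z, two-sided p, effect size r) via normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    tie_term = sum(t ** 3 - t for t in Counter(combined).values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        raise ValueError("all observations are identical; Z is undefined")
    z = (u1 - n1 * n2 / 2) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p from the normal curve
    r = abs(z) / math.sqrt(n)
    return u1, z, p, r


# Hypothetical composite scores for two small groups, for illustration only.
group_a = [3.2, 3.8, 4.0, 4.4, 4.6]
group_b = [3.0, 3.4, 3.8, 4.2, 4.8]
u, z, p, r = mann_whitney(group_a, group_b)
print(u, round(z, 3), round(p, 3), round(r, 3))
```

Because r divides |Z| by √N, a given Z corresponds to a smaller effect size in a larger sample, which is why a difference can reach unadjusted significance while remaining small in magnitude.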
4. Results
The comparison of pronunciation motivation between CLIL and non-CLIL samples (Table 2) revealed comparable outcomes across types of instruction, and only intrinsic motivation and self-efficacy differed significantly, with non-CLIL learners obtaining higher scores in both cases. These differences were small in magnitude (r = .162 for intrinsic motivation and r = .127 for self-efficacy) and did not remain significant once Holm–Bonferroni correction was applied (adjusted p = .088 and p = .371). No significant differences emerged for extrinsic motivation, ideal L2 self, ought-to L2 self, learning experience, informal engagement or overall motivation.
Table 2. CLIL and Non-CLIL Pronunciation Motivation Descriptives and Comparisons.
Participants in both CLIL and non-CLIL samples reported high levels of pronunciation motivation, with mean values above the midpoint of the 1–5 scale across all dimensions. Standard deviations ranged from 0.55 to 1.01, indicating moderate dispersion around the group means.
Intragroup analyses examining the distribution of pronunciation motivation factors within each sample (Table 3) revealed highly similar patterns across the CLIL and non-CLIL samples, with the factors ranked in the same order in both. Extrinsic motivation, ought-to L2 self, and ideal L2 self emerged as the most strongly endorsed pronunciation motivation factors in both samples, with means significantly higher than overall motivation. Intrinsic motivation did not differ significantly from overall motivation in the non-CLIL sample, whereas in the CLIL sample it was significantly lower than overall motivation. In both samples, self-efficacy and informal engagement were the lowest-ranked pronunciation motivation factors, with means significantly lower than overall motivation. Learning experience was the only factor that did not differ significantly from overall motivation in both samples. Overall, the distribution of learners’ pronunciation motivation factors was highly similar across types of instruction, despite minor differences in mean scores.
Table 3. Pronunciation Motivation Factor Distribution in CLIL and Non-CLIL Samples.
Note. p values represent Wilcoxon signed-rank comparisons between each factor and the overall pronunciation motivation score within the corresponding group.
Analyses comparing CLIL boys and girls (Table 4) revealed no significant gender differences in any pronunciation motivation factor. Although girls showed slightly higher mean scores than boys in intrinsic motivation, self-efficacy, learning experience, and overall motivation, none of these contrasts reached significance. The same applied to extrinsic motivation, ideal L2 self, ought-to L2 self, and informal engagement, all of which showed similar levels across genders and negligible effect sizes.
Table 4. CLIL Boys and Girls Pronunciation Motivation Descriptives and Comparisons.
Intragroup rankings also displayed parallel patterns for CLIL boys and girls (Table 5). Extrinsic motivation, ought-to L2 self, and ideal L2 self were the highest-scoring pronunciation motivation factors for both samples, with means significantly higher than overall motivation. Self-efficacy and informal engagement consistently appeared as the lowest-ranked factors, with means significantly below overall motivation, while learning experience did not differ significantly from overall motivation in either sample.
Table 5. Pronunciation Motivation Factor Distribution in CLIL Boys and Girls.
Note. p values represent Wilcoxon signed-rank comparisons between each factor and the overall pronunciation motivation score within the corresponding group.
The analysis comparing non-CLIL boys and girls (Table 6) showed no significant gender differences in any of the pronunciation motivation factors. Although girls obtained slightly higher mean scores in intrinsic motivation, ideal L2 self, learning experience and overall motivation, none of these contrasts reached significance. Boys scored marginally higher in informal engagement and ought-to L2 self, but these differences were also nonsignificant with small effect sizes.
Table 6. Non-CLIL Boys and Girls Pronunciation Motivation Descriptives and Comparisons.
Intragroup rankings for non-CLIL boys and girls (Table 7) revealed similar patterns between the two samples. Extrinsic motivation, ought-to L2 self and ideal L2 self were the highest scored pronunciation motivation factors for both samples, with means significantly higher than overall motivation. Although intrinsic motivation and learning experience showed slight descriptive differences between non-CLIL boys and girls, neither factor differed significantly from overall motivation in either sample. Self-efficacy and informal engagement consistently appeared as the lowest-ranked factors, with means significantly below overall motivation.
Table 7. Pronunciation Motivation Factor Distribution in Non-CLIL Boys and Girls.
Note. p values represent Wilcoxon signed-rank comparisons between each factor and the overall pronunciation motivation score within the corresponding group.
The analyses across types of instruction and gender revealed a similar pattern in pronunciation motivation among young learners. Pronunciation motivation was moderate to high throughout the whole sample and the hierarchy of motivation factors was consistent across all samples analyzed. Extrinsic motivation and the ought-to L2 self consistently received the highest ratings, whereas self-efficacy and informal engagement received the lowest. Intrinsic motivation tended to fall in an intermediate position, depending on the sample examined. The overall configuration of pronunciation motivation factors showed that external or expectation-based constructs systematically obtained higher scores than more internal or experience-based factors.
5. Discussion
This study set out to examine pronunciation motivation in young learners by comparing CLIL and non-CLIL instructional contexts, by analyzing potential gender differences within each, and by inspecting how motivational components are hierarchically configured across groups.
The first objective was to compare pronunciation motivation across instructional models. The results provide no evidence that CLIL learners showed stronger pronunciation motivation than their non-CLIL peers. In fact, the only descriptive contrasts favored the non-CLIL group, which showed slightly higher intrinsic motivation and self-efficacy. These differences were small in magnitude and did not survive correction for multiple comparisons. This means that the additional exposure accumulated in CLIL was not associated with a distinct or enhanced motivational profile for pronunciation. This pattern is consistent with primary-school CLIL research in Spain that has repeatedly reported negligible motivational differences when such programs operate with moderate exposure (Gallardo-del-Puerto & Blanco-Suárez, 2021; Pladevall-Ballester, 2019). Studies that have found a CLIL advantage typically involve substantially higher accumulated hours (e.g., Azpilicueta-Martínez & Lázaro-Ibarrola, 2023; Lasagabaster & López Beloqui, 2015), which suggests that the exposure available in standard CLIL implementations is insufficient to produce detectable differences in motivational dispositions. This explanation is reinforced by the curricular reality that pronunciation was not explicitly targeted in either CLIL or non-CLIL lessons. If pronunciation remains pedagogically invisible, the additional exposure provided by CLIL may be less likely to generate qualitatively different motivational experiences, an interpretation compatible with findings showing that young learners’ motivation is closely related to salient classroom experiences (Csizér & Kormos, 2009; You et al., 2016). A further factor contributing to the lack of differences is the quality of CLIL input. CLIL subjects were taught by content specialists with B2-level English, not by pronunciation-trained language experts. 
In this context, a combination of limited CLIL intensity, lack of explicit pronunciation instruction, and less specialized phonological exposure may help explain the absence of motivational differences between CLIL and non-CLIL learners.
The second objective was to determine whether boys and girls differed in pronunciation motivation within each instructional context. No significant gender differences emerged in any factor and effect sizes were negligible throughout. The motivational hierarchy was identical across all four subsamples. Traditional SLA literature commonly reports female advantages in motivation and achievement (López Rúa, 2006; Pavlenko & Piller, 2008), and pronunciation research has sometimes identified gender-linked variation in outcomes (Moyer, 2016). The present findings do not replicate those patterns and instead align with CLIL research reporting similar motivational profiles for boys and girls (Amengual-Pizarro & Prieto-Arranz, 2015; Gallardo-del-Puerto & Blanco-Suárez, 2021) and, importantly, with recent studies indicating that comparable patterns may also be found in non-CLIL contexts (Martínez Agudo, 2022).
One possible interpretation is that the lack of divergence reflects the sociocultural framing of English in Spain, where its international status may be strongly valued and associated with school success and future opportunities (Lasagabaster, 2011), and where these expectations may apply similarly to boys and girls. The lack of gender differences may also be related to the low visibility of pronunciation in the curriculum. As pronunciation was not salient in either CLIL or non-CLIL instruction, boys and girls may have limited opportunities to develop gender-specific attitudes toward it. Another possibility is that any emerging gender differences in motivation may appear later in schooling, when learners’ trajectories, identities and social influences begin to differentiate more markedly. However, the cross-sectional design prevents firm conclusions at the moment.
The third objective was to examine how motivational components were distributed within each sample. The results revealed a stable hierarchy across groups: extrinsic motivation, ought-to L2 self, and ideal L2 self formed the upper tier, whereas self-efficacy and informal engagement were consistently the weakest components. This profile indicates that young learners’ pronunciation motivation is strongly shaped by external expectations and instrumental considerations. Learners report that accurate and effective pronunciation in English is important for successful communication, for future studies and employment, and for meeting what parents and teachers expect from them.
At the same time, the high endorsement of ideal L2 self in relation to pronunciation suggests that many learners have already developed future-oriented self-images that include speaking English with a certain level of phonological adequacy. This challenges earlier claims that ideal L2 selves emerge mainly in adolescence (Zentner & Renaud, 2007) and aligns with evidence that self-related motivational constructs can be observed in primary school learners, particularly in contexts where English is strongly valorized (T.-Y. Kim, 2011; Lamb, 2007). The present study extends this insight to the domain of pronunciation. These responses suggest that young learners do not simply view English pronunciation as useful for external reasons, but also relate pronunciation outcomes to future self-images as speakers of English.
Although self-efficacy scores were above the scale midpoint, they consistently occupied a lower position than the other motivational factors across all samples. This comparatively lower ranking may reflect learners’ early awareness of the demanding nature of pronunciation, a skill that requires both perceptual and productive control. This pattern contrasts with the findings of Gallardo-del-Puerto and Blanco-Suárez (2021), whose participants reported high self-efficacy for broad L2 English use. Given that self-efficacy is inherently task- and domain-specific (Mills, 2014), such differences could be expected. Young learners may feel confident about general English learning while evaluating their pronunciation abilities more cautiously because they perceive this domain as more demanding than other language skills.
Informal engagement also occupied the lowest end of the motivational hierarchy, which is unsurprising in a context where English is not used as a lingua franca and where opportunities for authentic oral interaction are scarce. Prior work on informal engagement has shown that its motivational benefits are strongest when learners participate in activities requiring regular spoken output or real-time communication (Sundqvist & Uztosun, 2023), conditions that seem to be largely absent in the learners’ sociolinguistic environment (Lázaro-Ibarrola, 2024).
The pronunciation motivation profile observed in this study suggests several pedagogical implications. First, the dominance of externally oriented motives indicates that learners already recognize the importance and usefulness of pronunciation. This could create favorable conditions for teachers to strengthen more internalized forms of motivation by providing pronunciation activities that are meaningful, success-oriented, and connected to communicative goals. Second, the consistently weaker self-efficacy scores may reflect the few opportunities that learners might encounter to feel competent in pronunciation. Teachers could address this by designing richer pronunciation learning experiences, including explicit work on segmental and suprasegmental features, scaffolded perception practice, and activities that provide clear evidence of improvement.
In addition, the findings point to implications for teacher education. Since pronunciation is often marginal in primary curricula and many CLIL teachers are content specialists rather than language experts, greater attention to pronunciation pedagogy in teacher training is essential. Professional development should equip teachers with strategies for integrating pronunciation into both EFL and CLIL lessons in ways that are developmentally appropriate and communicatively meaningful. Enhancing teachers’ confidence and competence in this area is likely to support not only learners’ phonological development but also their motivation, engagement, and self-efficacy.
6. Conclusions
This study has examined pronunciation motivation among young learners with a focus on type of instruction and gender. Participants completed a 38-item questionnaire, of which 36 items were retained and organized into 7 factors through EFA. The results indicate moderate to high pronunciation motivation across the whole sample, with higher levels of extrinsic motivation, ought-to L2 self, and ideal L2 self, and systematically lower levels of self-efficacy and informal engagement. The comparison between CLIL and non-CLIL learners revealed similar motivational patterns, with the non-CLIL group showing slightly higher intrinsic motivation and self-efficacy, although these differences were small and did not remain significant after Holm–Bonferroni correction. Gender analyses showed comparable distributions of motivational factors and no meaningful differences between boys and girls. The findings point to a similar motivational profile among young learners in this context and suggest that pronunciation motivation is more strongly associated with external motives than with instructional model or gender. Given the prominence of externally oriented motives in this sample, teachers may be able to emphasize classroom experiences that support more internalized motivational dimensions.
The study is not without limitations. The reliance on a questionnaire as a single instrument restricts the range of motivational evidence that can be captured, although the incorporation of qualitative methods was not feasible due to parental consent restrictions and broader project constraints. The specific CLIL model and the relative homogeneity of the sample further limit generalizability and call for larger studies that consider variability in CLIL intensity, subject integration, and teacher-related factors.
Future work could examine how pronunciation motivation changes across school years and whether instructional conditions and gender become more salient as learners progress in their English learning. The inclusion of qualitative approaches and the integration of affective variables and pronunciation outcomes would allow for a more comprehensive understanding of pronunciation motivation among young learners and its potential role in shaping learning processes.
Appendix: Factor Loadings for the Pronunciation Motivation Questionnaire
| Item | Factor 1 | Factor 2 | Factor 3 | Factor 4 | Factor 5 | Factor 6 | Factor 7 |
|---|---|---|---|---|---|---|---|
| 4. I get bored when we learn English pronunciation in class (R) | 0.70 | | | | | | |
| 24. English pronunciation lessons are fun | 0.69 | | | | | | |
| 16. I think English pronunciation is interesting | 0.68 | | | | | | |
| 18. Learning English pronunciation is boring (R) | 0.60 | | | | | | |
| 31. I would like to continue learning English pronunciation | 0.52 | | | | | | |
| 1. I like how English sounds | 0.49 | | | | | | |
| 33. I like practicing English pronunciation in my free time | 0.38 | | | | | | |
| 11. Learning English pronunciation will be useful for me in several aspects | | 0.77 | | | | | |
| 7. Learning English pronunciation is not important (R) | | 0.70 | | | | | |
| 37. I learn English pronunciation because I think I will need it when I am older | | 0.63 | | | | | |
| 34. Learning to pronounce well in English is unnecessary (R) | | 0.56 | | | | | |
| 17. A good English pronunciation is necessary to find a better job | | 0.38 | | | | | |
| 5. I imagine that, if I use English in my job, I will speak it very well | | | 0.59 | | | | |
| 38. I imagine I will have good English pronunciation when I finish learning it | | | 0.49 | | | | |
| 15. I do not picture myself having friends with whom I speak English very well (R) | | | 0.41 | | | | |
| 12. I imagine I will speak English with good pronunciation when I am older | | | 0.39 | | | | |
| 26. I imagine I will travel around the world with a very good English pronunciation | | | 0.37 | | | | |
| 22. My parents think that having a good English pronunciation is important | | | | 0.71 | | | |
| 20. My parents will be disappointed if I don’t have a good English pronunciation | | | | 0.50 | | | |
| 8. My teacher appreciates that I have a good English pronunciation | | | | 0.48 | | | |
| 9. My parents want me to learn good English pronunciation | | | | 0.39 | | | |
| 30. Our teachers teach English pronunciation very well. | | | | | 0.75 | | |
| 13. I learn little English pronunciation at this school (R) | | | | | 0.72 | | |
| 36. My English pronunciation is getting better at school | | | | | 0.70 | | |
| 32. Class activities (exercises, etc.) help me with my English pronunciation | | | | | 0.49 | | |
| 6. Class materials (books, worksheets, computers, audios, videos, etc.) help me with my English pronunciation | | | | | 0.44 | | |
| 35. My peers help me with my English pronunciation | | | | | 0.36 | | |
| 21. My English pronunciation improves when I watch films and/or series in English | | | | | | 0.67 | |
| 2. My English pronunciation improves when I watch videos and/or cartoons in English | | | | | | 0.54 | |
| 10. My English pronunciation improves when I speak English outside school | | | | | | 0.49 | |
| 23. My English pronunciation improves when I listen to music in English | | | | | | 0.46 | |
| 29. My English pronunciation improves when I play games on-line in English | | | | | | 0.37 | |
| 25. Learning English pronunciation is difficult for me (R) | | | | | | | 0.80 |
| 28. I am good at English pronunciation | | | | | | | 0.57 |
| 27. It is difficult for me to understand English pronunciation in listening activities (songs, audios, videos, etc.) (R) | | | | | | | 0.52 |
| 14. It is easy for me to understand my teachers’ pronunciation in English | | | | | | | 0.44 |
| % of variance | 10.04 | 8.47 | 3.95 | 4.62 | 7.39 | 5.24 | 6.23 |
Notes. Only loadings of 0.35 or above are shown. (R) = reverse-coded item. Factor 1 = intrinsic motivation. Factor 2 = extrinsic motivation. Factor 3 = ideal L2 self. Factor 4 = ought-to L2 self. Factor 5 = learning experience. Factor 6 = informal engagement. Factor 7 = self-efficacy.
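For readers who wish to score the questionnaire, the sketch below shows one way to compute the composite scores from the item groupings in the table. It assumes the 1–5 response scale, reverse-codes (R) items as 6 minus the response, and averages items within each factor. The reverse-coding rule, the names `FACTOR_ITEMS`, `REVERSED`, and `score`, and the example respondent are assumptions introduced for illustration, not the scoring script used in the study.

```python
from statistics import mean

# Item numbers per factor, read from the factor loadings table.
FACTOR_ITEMS = {
    "intrinsic motivation": [4, 24, 16, 18, 31, 1, 33],
    "extrinsic motivation": [11, 7, 37, 34, 17],
    "ideal L2 self": [5, 38, 15, 12, 26],
    "ought-to L2 self": [22, 20, 8, 9],
    "learning experience": [30, 13, 36, 32, 6, 35],
    "informal engagement": [21, 2, 10, 23, 29],
    "self-efficacy": [25, 28, 27, 14],
}

# Items marked (R) in the table.
REVERSED = {4, 7, 13, 15, 18, 25, 27, 34}

SCALE_MAX = 5  # responses range from 1 to 5

ALL_ITEMS = [i for items in FACTOR_ITEMS.values() for i in items]


def score(responses):
    """Return factor composites plus the overall mean of all 36 items."""
    # Assumed reverse-coding rule: 1 <-> 5, 2 <-> 4, 3 stays 3.
    recoded = {
        item: (SCALE_MAX + 1 - value) if item in REVERSED else value
        for item, value in responses.items()
    }
    scores = {
        factor: mean(recoded[i] for i in items)
        for factor, items in FACTOR_ITEMS.items()
    }
    scores["overall"] = mean(recoded[i] for i in ALL_ITEMS)
    return scores


# Hypothetical respondent: 4 on regular items, 2 on reverse-coded items.
respondent = {i: (2 if i in REVERSED else 4) for i in ALL_ITEMS}
print(score(respondent))
```

Keeping the item-to-factor mapping in a single dictionary makes it straightforward to check that the retained items total 36 before any scores are computed.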
Acknowledgements
We would like to express our gratitude to the schools, pupils and teachers who participated in this study.
Ethical Considerations
The study was conducted in accordance with the Declaration of Helsinki, and approved by the Institutional Review Board (or Ethics Committee) of the University of Cantabria (protocol code 000029; 17 March 2022).
Consent to Participate
Written informed consent was obtained from the participants’ parents or legal guardians.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded by the predoctoral contract PRE2021-100711 and the project PID2020-115327RB-I00 (Ministry of Science and Innovation, Government of Spain, MICIU/AEI/10.13039/501100011033).
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.
