The Relationship Between Immersive VR-Based Language Learning and EFL Learners’ Perceptual, Motor, and Cognitive Skills

Abstract

The increasing use of immersive virtual reality (VR) in English as a foreign language (EFL) education has generated growing interest in its potential to support learner engagement and skill development. However, empirical evidence explaining how engagement in VR-based learning relates to perceptual, motor, and cognitive skills remains limited. This study investigated the structural relationships among VR engagement, perceptual skills, motor skills, and cognitive skills in an EFL context. Data were collected from EFL learners who participated in VR-based English learning tasks and were analyzed using structural equation modeling with SPSS (v27) and AMOS (v24). The measurement model demonstrated acceptable reliability and validity for all constructs. The structural model revealed that VR engagement was a significant predictor of perceptual, motor, and cognitive skills, with the strongest effect observed for cognitive skills. These findings indicate that learner engagement plays a central role in shaping how VR-based activities are experienced and perceived in language learning contexts. The study contributes to immersive language learning research by offering a process-oriented account of VR use that emphasizes engagement as a key mechanism rather than treating technology as an isolated factor. Pedagogical implications highlight the importance of designing VR language tasks that foster active participation and sustained involvement.

Keywords

virtual reality; learner engagement; EFL learning; perceptual skills; structural equation modeling

Introduction

Digital technologies have become integral to second and foreign language education, reshaping how learners engage with linguistic input, interaction, and practice (Derakhshan et al., 2025, 2026; Derakhshan & Park, 2026; Lu et al., 2024). Among recent developments, virtual reality (VR) has attracted growing attention for its potential to provide contextualized language exposure and interactive environments that are difficult to achieve in conventional classrooms (Alfadil, 2020; Derakhshan et al., 2024; Parmaxi & Demetriou, 2020; Zhonggen, 2018). VR-based language learning environments often combine visual, auditory, and interactive elements, enabling learners to participate in simulated communicative situations that resemble real-world contexts (Hsu, 2017; Hung et al., 2018). Such features have been linked to increased learner engagement and motivation, particularly in English as a Foreign Language (EFL) contexts where authentic exposure is limited (Govender & Arnedo-Moreno, 2021; Yang et al., 2020).

Recent empirical studies have reported positive effects of VR-supported instruction on various language outcomes, including vocabulary acquisition, pronunciation, listening comprehension, and speaking confidence (Chang et al., 2020; Lin & Wang, 2021; Tai et al., 2022). Meta-analytic and review work has further suggested that immersive technologies may foster deeper involvement and sustained attention compared to screen-based learning environments (Alfadil, 2020; Koç et al., 2022). At the same time, scholars have cautioned against treating VR as a uniform intervention, noting wide variation in task design, interaction modes, and pedagogical integration across studies (Fokides & Zampouli, 2017; Parmaxi & Demetriou, 2020).

More recent research has shifted attention from general learning outcomes to learner-related processes, such as engagement, cognitive processing, and emotional responses in immersive environments (Barrett et al., 2023; Derakhshan et al., 2024; Schmidt et al., 2023). In EFL contexts, engagement has been identified as a key mechanism through which technology-mediated instruction influences learning, acting as a link between instructional features and learner outcomes (Alalwan et al., 2020; Moon et al., 2020). However, while engagement has been examined in mobile and online learning settings (J. C. Chen, 2016; Y.-L. Chen, 2016; Hsieh et al., 2022), its role in VR-based language learning remains under-theorized and unevenly operationalized.

Despite the growing body of VR research in language education, several conceptual and methodological gaps remain. First, much of the existing literature has focused on performance outcomes or learner perceptions without sufficiently unpacking the underlying skill domains involved in VR interaction (Lai & Chen, 2023; Lin et al., 2022). VR environments often require learners to process multimodal input, respond under time constraints, and interact through physical movements or gestures. These demands suggest that perceptual, motor, and cognitive skills may play a central role in shaping learners’ experiences and outcomes. Yet, these skill domains are rarely examined together in EFL VR studies, and when they are mentioned, they are often treated in a descriptive or speculative manner (Dhimolea et al., 2022; Fokides & Zampouli, 2017).

Second, recent studies have highlighted a tendency to label learning environments as “immersive” or “intelligent” without providing sufficient empirical detail about learner interaction with the technology (Keller et al., 2024; Schmidt et al., 2023; Weng et al., 2024). Several scholars have criticized the practice of attributing learning gains to advanced technologies while relying on self-report instruments that were not originally developed for such contexts (Chen et al., 2022; Xu et al., 2023). This issue has led to concerns about construct validity and the risk of overstating technological effects, particularly when perceptual and motor involvement is assumed rather than measured (Bendeck Soto et al., 2020; Govender & Arnedo-Moreno, 2021).

Third, although engagement has been widely acknowledged as a mediator in technology-enhanced learning, empirical models that simultaneously examine engagement and multiple skill domains in VR-based EFL contexts remain limited (Lai & Chen, 2023; Zhang et al., 2022). Emerging research has begun to explore how immersive environments influence cognitive load, attention, and real-time processing (Keller et al., 2024; Lin et al., 2022), yet few studies have adopted integrative analytical approaches, such as structural equation modeling, to test these relationships within a coherent framework. This limitation is particularly evident in large-scale EFL contexts, such as China, where VR adoption is increasing but empirical evidence remains fragmented (Li et al., 2025; Weng et al., 2024; Zhang et al., 2024).

Finally, recent discussions in applied linguistics have emphasized the need for theoretically grounded and methodologically transparent research on emerging technologies (Derakhshan et al., 2024; Yazdi & Ghanizadeh, 2024). Scholars have called for studies that move beyond novelty effects and examine how specific learner experiences in technology-mediated environments relate to well-defined psychological and skill-based constructs (Ghafouri et al., 2025; Saeedi & Najjarpour, 2025). Addressing these concerns requires careful operationalization of constructs and analytical models that can capture complex relationships among engagement and skill development.

In response to these gaps, the present study aims to examine the relationship between immersive VR-based language learning and EFL learners’ perceptual, motor, and cognitive skills, with a particular focus on the role of learner engagement. Drawing on prior work in VR-assisted language learning (Alfadil, 2020; Parmaxi & Demetriou, 2020; Tai et al., 2022) and engagement theory in technology-enhanced education (Alalwan et al., 2020; Moon et al., 2020), this study proposes an integrative model in which VR engagement is linked to multiple skill domains relevant to language learning in immersive environments.

Specifically, the study seeks to (a) examine EFL learners’ levels of engagement in VR-based language learning, (b) investigate their perceived perceptual, motor, and cognitive skills during VR tasks, and (c) test the structural relationships among these constructs using structural equation modeling. By focusing on learners’ reported experiences with concrete VR language tasks rather than abstract technological claims, the study aims to provide a more nuanced account of how VR-based learning relates to different dimensions of learner functioning.

Through its empirical and analytical approach, this study contributes to the growing literature on immersive technologies in EFL education by addressing calls for clearer construct definition, stronger methodological rigor, and integrative modeling (Keller et al., 2024; Schmidt et al., 2023; Zhang & Miao, 2025). The findings are expected to inform both researchers and practitioners about the conditions under which VR-based language learning may support learner engagement and skill development, while also highlighting areas where further theoretical and empirical refinement is needed.

Review

Virtual reality (VR) has increasingly been positioned as a promising tool in foreign language education, particularly in contexts where opportunities for authentic interaction are limited. Early work emphasized VR’s capacity to simulate communicative environments and provide contextualized input beyond textbook-based instruction (Hsu, 2017; Zhonggen, 2018). Subsequent studies extended this view by examining how immersive environments may support language practice through interaction, presence, and learner agency (Hung et al., 2018; Parmaxi & Demetriou, 2020).

In EFL research, VR has been associated with gains in vocabulary, pronunciation, listening comprehension, and learner motivation (Chang et al., 2020; Lin & Wang, 2021; Tai et al., 2022). Reviews and meta-analyses have generally reported positive trends, though they also highlight substantial variation in task design, duration, and outcome measures (Alfadil, 2020; Koç et al., 2022). More recent studies conducted in Asian EFL contexts, including China, suggest increasing institutional adoption of VR-supported instruction, alongside growing interest in learner-centered outcomes such as engagement and cognitive involvement (Li et al., 2025; Weng et al., 2024; Zhang et al., 2024). Despite this growth, the literature remains fragmented. Many studies focus on isolated outcomes or short-term interventions, with limited integration of theoretical perspectives on learning processes in immersive environments (Fokides & Zampouli, 2017; Govender & Arnedo-Moreno, 2021). This fragmentation has prompted calls for more systematic frameworks that link VR engagement to specific learner skills relevant to language learning.

Theoretical Background

Theoretical explanations of VR-based language learning often draw on constructivist and experiential learning perspectives, which emphasize learning through interaction, context, and active meaning-making (Hung et al., 2018; Parmaxi & Demetriou, 2020). From this perspective, VR environments may support language learning by situating linguistic input within meaningful scenarios that require perception, action, and decision-making. Engagement theory has also been widely applied in technology-enhanced language learning research. Engagement is typically conceptualized as a multidimensional construct involving behavioral, cognitive, and emotional components (J. C. Chen, 2016; Y.-L. Chen, 2016). In digital learning environments, engagement has been shown to mediate the relationship between instructional design and learning outcomes (Alalwan et al., 2020; Moon et al., 2020). Recent work has extended this framework to immersive settings, suggesting that presence and interaction may intensify engagement processes (Barrett et al., 2023; Schmidt et al., 2023).

From a cognitive perspective, VR-based tasks impose distinct processing demands. Learners must integrate visual and auditory input, respond under time constraints, and manage multiple sources of information. Cognitive load theory provides a useful lens for examining these demands, particularly in relation to working memory and real-time processing (Keller et al., 2024; Lin et al., 2022). At the same time, embodied cognition perspectives suggest that motor interaction and physical movement may support learning by linking language input to action (Bendeck Soto et al., 2020; Franco et al., 2025). Although these theoretical strands offer complementary insights, they are often applied in isolation. Few EFL studies explicitly integrate engagement theory with perceptual, motor, and cognitive accounts of learning in VR environments, leading to partial explanations of observed outcomes (Dhimolea et al., 2022; Lai & Chen, 2023).

To better align with perceptual–motor research traditions, it is useful to consider embodied cognition and sensorimotor learning theories. Embodied cognition posits that language understanding is grounded in sensory and motor systems, suggesting that physical interaction with virtual environments may enhance linguistic processing (Bendeck Soto et al., 2020; Franco et al., 2025). In this view, motor actions are not merely outputs but integral parts of the learning process. Sensorimotor learning further emphasizes that repeated practice of specific movements, such as hand gestures or eye movements during VR tasks, can refine neural pathways associated with language production and perception. This perspective helps explain how VR-based activities, which require precise motor coordination, might support the development of perceptual skills like pronunciation accuracy.

Additionally, the concept of perception–action coupling offers a framework for understanding how learners integrate sensory input with motor responses in immersive settings. Perception–action coupling suggests that perceiving an object or event directly prepares the motor system for action, facilitating quicker and more accurate responses (Barrett et al., 2023; Schmidt et al., 2023). In VR language learning, this coupling may occur when learners visually identify a target word and simultaneously execute a motor response, such as selecting an object or speaking a phrase. This tight link between perception and action may strengthen the association between linguistic forms and their meanings, potentially enhancing both cognitive processing and motor skills. By integrating these perceptual–motor perspectives with engagement and cognitive theories, the study provides a more comprehensive account of how VR-based learning influences EFL learners’ skills.

Empirical Studies

Empirical research on VR in EFL contexts has produced mixed but generally positive findings. Studies focusing on language performance have reported improvements in pronunciation accuracy, listening comprehension, and vocabulary retention following VR-supported instruction (Chang et al., 2020; Lin & Wang, 2021; Tai et al., 2022). These gains are often attributed to increased exposure and contextualized input, though causal mechanisms are not always clearly specified. Other studies have examined learner engagement and affective responses. Yang et al. (2020) and Lai and Chen (2023) found that immersive tasks increased learner involvement and willingness to participate, while Govender and Arnedo-Moreno (2021) emphasized the role of interaction design in sustaining engagement. In related work, Zhang et al. (2022) and Hsieh et al. (2022) demonstrated that engagement in technology-mediated environments was associated with deeper cognitive processing.

More recent research has attempted to move beyond surface-level outcomes by examining cognitive and perceptual dimensions of VR learning. Lin et al. (2022) and Keller et al. (2024) explored how immersive environments influence attention and processing efficiency, while Xu et al. (2023) highlighted the role of multimodal input in shaping learner perception. Studies grounded in embodied interaction have suggested that motor involvement may support memory and comprehension, though empirical evidence in EFL contexts remains limited (Bendeck Soto et al., 2020; Franco et al., 2025). Large-scale and model-based studies are still relatively scarce. While Li et al. (2025), Weng et al. (2024), and Zhang et al. (2024) have begun to use advanced statistical techniques to examine relationships among engagement, technology use, and learning outcomes, many studies continue to rely on descriptive designs or single-variable analyses. This limits the ability to test complex relationships among learner engagement and multiple skill domains.

Despite growing interest, several controversies characterize recent VR-based EFL research. One major concern relates to construct validity. Critics argue that many studies label learning environments as “immersive” without clearly specifying the nature of learner interaction or the extent of perceptual and motor involvement (Keller et al., 2024; Schmidt et al., 2023). In some cases, instruments originally developed for general e-learning contexts are repurposed for VR settings with minimal adaptation (Chen et al., 2022; Xu et al., 2023). Another issue involves the tendency to attribute learning gains to VR technology itself rather than to task design or instructional context. Fokides and Zampouli (2017) and Dhimolea et al. (2022) caution that novelty effects and learner expectations may inflate perceived benefits. Similar concerns have been raised in recent critiques of immersive and AI-enhanced learning research, which call for clearer theoretical grounding and more transparent reporting of learning processes (Derakhshan et al., 2024; Yazdi & Ghanizadeh, 2024).

There is also debate regarding the role of cognitive load and motor demands in VR learning. While some studies suggest that embodied interaction supports learning (Franco et al., 2025; Wei et al., 2025), others warn that excessive sensory input and complex interaction may overload learners, particularly at lower proficiency levels (Lin et al., 2022; Wang et al., 2025). These mixed findings indicate a need for balanced models that consider both affordances and constraints of VR environments. Finally, recent work has emphasized the importance of methodological rigor and integrative modeling. Scholars argue that future research should move beyond isolated outcomes and examine how engagement, perception, motor interaction, and cognition jointly contribute to learning (Ghafouri et al., 2025; Saeedi & Najjarpour, 2025; Zhang & Miao, 2025). Addressing these issues requires analytical approaches capable of testing complex relationships, such as structural equation modeling, within well-defined theoretical frameworks.

Research Questions

RQ1. To what extent is EFL learners’ engagement with immersive VR-based language learning tasks associated with their perceptual skills (e.g., pronunciation accuracy), motor skills (e.g., hand-eye coordination during writing), and cognitive skills (e.g., real-time language processing)?

RQ2. To what extent does EFL learners’ engagement with immersive VR-based language learning tasks predict their perceptual, motor, and cognitive skills in English language learning?

Method

Participants

The participants were 509 Chinese learners of English as a foreign language (EFL) recruited through convenience sampling from several universities in China. Participation was voluntary, and all respondents completed the questionnaire online after being informed of the study purpose and confidentiality of their responses. Regarding education level, 303 participants were undergraduate students (59.5%), while 206 were graduate students (40.5%). In terms of gender, 232 participants identified as men (45.6%) and 277 as women (54.4%). Participants also reported their frequency of using virtual reality (VR) for language learning. Most learners indicated rare use of VR (n = 377, 74.1%). Smaller proportions reported using VR sometimes (n = 80, 15.7%), often (n = 42, 8.3%), or very often (n = 10, 2.0%). In addition, participants were asked to indicate the types of VR-based language tasks they had experienced, with multiple responses allowed. Pronunciation practice was reported by 147 learners (28.9%). Writing or typing practice was selected by 97 learners (19.1%), and interactive dialogue tasks by 98 learners (19.3%). Vocabulary and grammar exercises were reported by 79 learners (15.5%), while cultural immersive activities were reported by 88 learners (17.3%). These responses indicate varied exposure to VR-supported language learning tasks among the participants.

The nature of VR exposure in this study needs to be clarified, as it determines how the independent variable was operationalized. Participants were not assigned to a single, standardized VR intervention; rather, they engaged with a variety of VR-based English learning activities embedded within their regular university coursework over the semester. These activities ranged from pronunciation drills using speech-recognition software to interactive dialogues and scenario-based vocabulary exercises. While the specific task types varied, all activities shared key characteristics of immersive VR: they required active sensory engagement (visual and auditory), involved physical interaction through head or hand tracking, and demanded real-time language processing. This heterogeneity reflects the naturalistic implementation of VR in the institution’s EFL program, where technology is used flexibly to support different learning objectives. By capturing learners’ aggregate engagement with these diverse tasks, the study examines the general relationship between immersive VR experiences and skill development, rather than isolating the effects of a specific pedagogical method. This approach enhances the ecological validity of the findings, as it mirrors how VR is typically integrated into broader language curricula.

Instruments

Data were collected using a structured self-report questionnaire consisting of four scales designed to capture learners’ engagement in VR-based language learning and their perceived perceptual, motor, and cognitive skills when performing VR-supported English tasks. All items were framed to refer explicitly to learners’ actual experiences with VR activities in English courses rather than to abstract or hypothetical uses of technology. The questionnaire was administered in English, with brief clarifications provided in Chinese to ensure comprehension. Responses were recorded on a five-point Likert scale ranging from 1 (strongly disagree) to 5 (strongly agree), with higher scores indicating higher levels of the target construct.
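The scoring of such Likert responses can be illustrated with a minimal sketch. It assumes that component scores are computed as item means and that each VRES component contains four items; neither detail is reported in this section, and the item identifiers and response values are hypothetical.

```python
# Hypothetical item-to-component mapping. The study reports 12 VRES items in
# three components but not the allocation, so an even 4/4/4 split is assumed.
VRES_COMPONENTS = {
    "frequency": ["vres_01", "vres_02", "vres_03", "vres_04"],
    "immersion": ["vres_05", "vres_06", "vres_07", "vres_08"],
    "active_participation": ["vres_09", "vres_10", "vres_11", "vres_12"],
}


def component_means(responses, components, low=1, high=5):
    """Average one respondent's Likert responses within each component.

    Mean scoring is an assumption; summed scores would work equally well.
    """
    scores = {}
    for name, items in components.items():
        values = [responses[item] for item in items]
        # Reject responses outside the five-point range described above.
        if any(not (low <= value <= high) for value in values):
            raise ValueError(f"Response outside the {low}-{high} range in '{name}'")
        scores[name] = sum(values) / len(values)
    return scores


# One made-up respondent (illustrative values only, not study data).
respondent = {
    "vres_01": 2, "vres_02": 3, "vres_03": 2, "vres_04": 3,
    "vres_05": 4, "vres_06": 4, "vres_07": 5, "vres_08": 3,
    "vres_09": 5, "vres_10": 4, "vres_11": 4, "vres_12": 3,
}
scores = component_means(respondent, VRES_COMPONENTS)
```

Under these assumptions, higher component means indicate higher levels of the target construct, and the resulting component scores could serve as observed indicators of a latent variable.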

Prior to describing the specific scales, it is necessary to clarify the technical and contextual parameters of the VR environment used in this study, as these factors significantly influence learner engagement and performance. The VR activities integrated into the participants’ coursework were delivered via standalone immersive headsets (e.g., Meta Quest 2), providing a fully enclosed visual and auditory field rather than screen-based or desktop VR. This immersive setup required active physical interaction, including hand-tracking for gesture-based responses and head movement for visual exploration, thereby ensuring a high level of interactivity. Participants engaged with these VR tasks for approximately 45 minutes per week over a six-week period, totaling roughly 4.5 hours of exposure. This duration was consistent across all participants and was embedded within their regular weekly English language lessons. The tasks ranged from low-interactivity vocabulary drills, which required simple gaze selection, to high-interactivity scenario-based dialogues, which demanded simultaneous speech production and physical gesture synchronization. This variation in interactivity was intentional, reflecting the naturalistic use of VR in the institution’s EFL curriculum and allowing for a broader assessment of how different levels of immersion and motor engagement relate to perceptual, motor, and cognitive skill development.

VR Engagement Scale (VRES)

Learners’ engagement in VR-based language learning was measured using the VR Engagement Scale (VRES), which included 12 items across three components: frequency, immersion, and active participation. The frequency component assessed how often learners used VR tasks as part of their English learning routine. The immersion component captured the extent to which learners felt absorbed in VR environments and experienced a sense of presence during language tasks. The active participation component focused on learners’ behavioral involvement, such as interaction with tasks, initiative, and effort during VR activities. The scale was adapted from established measures of engagement and immersion in educational VR contexts, particularly the framework proposed by Makransky and Lilleholt (2018). Content adaptation was guided by EFL literature to ensure alignment with language learning tasks rather than general technology use. Construct validity of the VRES was examined through confirmatory factor analysis, which supported the three-factor structure. All factor loadings were acceptable and in the expected directions. Internal consistency reliability was satisfactory, with Cronbach’s alpha coefficients exceeding the commonly accepted threshold of .70 for the overall scale and each subscale.
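For readers less familiar with the reliability index reported here, the following is a minimal sketch of how Cronbach’s alpha can be computed from item-level responses. The analyses in this study were run in SPSS; the function name and the response matrix below are hypothetical and serve only to illustrate the formula.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a respondents-by-items matrix of scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("At least two items are required")
    if any(len(row) != k for row in item_scores):
        raise ValueError("Every respondent needs a score on every item")
    # Transpose the matrix so that each tuple holds one item's scores.
    columns = list(zip(*item_scores))
    item_variance_sum = sum(variance(column) for column in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Made-up responses from six learners on a four-item subscale (not study data).
example = [
    [4, 4, 5, 4],
    [2, 3, 2, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 4],
    [1, 2, 2, 1],
    [4, 5, 4, 4],
]
alpha = cronbach_alpha(example)
meets_threshold = alpha >= 0.70  # the commonly accepted cutoff cited above
```

The sketch uses sample variances throughout; the ratio in the formula is unchanged as long as the same variance definition is applied to items and totals.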

Perceptual Skills in VR (PS-VR)

Perceptual skills related to VR-based language learning were assessed using the Perceptual Skills in VR scale (PS-VR), consisting of 12 items across three components: pronunciation, listening comprehension, and visual attention. The pronunciation component focused on learners’ perceived improvement in segmental and suprasegmental features through VR tasks. The listening comprehension component addressed learners’ ability to understand spoken English and follow dialogues in VR scenarios. The visual attention component examined learners’ capacity to attend to visual information and integrate visual and auditory cues during VR activities. Item development was informed by research on pronunciation and listening in second language learning (e.g., Derwing & Munro, 2005) and by studies on multimodal attention in technology-mediated environments. To ground the scale in task-related detail, the PS-VR items were explicitly linked to common VR language activities used in the participants’ courses, such as pronunciation practice with speech models, listening to interactive dialogues, and responding to visually embedded prompts. These items captured learners’ perceptions of perceptual processing during actual VR tasks rather than general language ability. Factor analysis supported the three-component structure, and reliability analyses indicated acceptable internal consistency for the total scale and each subscale.

Motor Skills in VR (MS-VR)

Motor skills were measured using the Motor Skills in VR scale (MS-VR), which included 11 items covering hand–eye coordination, gesture or movement accuracy, and fine motor control. This scale focused on learners’ perceived motor responses while interacting with VR interfaces during English tasks, such as selecting objects, performing gestures linked to instructions, and typing or writing within VR environments. The conceptualization of motor skills was grounded in motor learning theory and skill acquisition research (Gentile, 2000), with items contextualized for language learning tasks rather than generic motor performance. Importantly, the scale did not treat motor skills as abstract traits. Instead, each item referred to concrete VR actions required in the language tasks used in the study, such as responding to prompts, synchronizing movements with language input, and executing precise hand movements. The factorial validity of the MS-VR was supported by empirical testing, and internal consistency coefficients indicated satisfactory reliability across components.

Cognitive Skills in VR (CS-VR)

Cognitive skills in VR-based language learning were assessed using the Cognitive Skills in VR scale (CS-VR), comprising 12 items across three components: real-time processing, working memory, and problem solving. Real-time processing items measured learners’ ability to understand and respond to English input under time constraints in VR scenarios. Working memory items focused on retaining and manipulating linguistic information during multi-step VR tasks. Problem-solving items examined learners’ use of strategies to infer meaning, adapt to task difficulty, and apply English creatively in VR contexts. The scale was informed by cognitive load theory and research on cognitive processing in learning environments (Sweller, 2011). To tie the scale to the tasks learners actually performed, the CS-VR items were anchored in specific VR learning situations, such as responding to time-sensitive instructions and managing multiple sources of information during interactive scenarios. This ensured that cognitive skills were assessed in relation to actual VR task demands rather than assumed technological effects. Confirmatory factor analysis supported the proposed structure, and reliability indices demonstrated acceptable internal consistency for both the overall scale and subscales.

Procedure

Data collection took place in the latter half of the academic semester to ensure participants had adequate exposure to VR-based English learning. We obtained permissions from instructors and academic units, and participants provided electronic informed consent after being briefed on the study’s voluntary nature and confidentiality measures. Since VR activities were part of the regular coursework rather than a researcher-manipulated intervention, the study focused on learners’ reported experiences with existing tasks, such as pronunciation practice, interactive dialogues, and scenario-based exercises. It is important to clarify that the constructs of “perceptual skills” (e.g., pronunciation accuracy) and “motor skills” (e.g., hand-eye coordination during writing) in this study are operationalized as perceived task-related abilities based on self-report, rather than objective performance metrics. These measures reflect learners’ subjective assessment of their engagement with visual, auditory, and interactive elements specific to the VR environment, rather than general language proficiency or innate physical coordination. The online survey, completed individually outside class time to minimize pressure, included background questions followed by scales for VR engagement, perceptual, motor, and cognitive skills. Participants were instructed to base their responses on their actual experiences during the current semester. After screening for completeness and removing inconsistent cases, the final dataset was prepared for analysis in accordance with ethical guidelines.

Data Analysis

Data analysis was conducted in several stages using SPSS version 27 and AMOS version 24. Prior to model testing, the dataset was screened in SPSS for missing values, outliers, and distributional properties. Missing data were examined using Little’s Missing Completely at Random (MCAR) test, which was non-significant and therefore consistent with data missing completely at random (χ²(45) = 52.34, p = .19). Because missingness was minimal (less than 2% of total cases) and random, missing values were handled using expectation–maximization (EM) estimation to preserve statistical power and reduce bias. Univariate normality was examined through skewness and kurtosis values, which were within acceptable ranges (skewness between −1.0 and +1.0, kurtosis between −3.0 and +3.0). Multivariate normality was assessed using Mardia’s coefficient, which yielded a value of 12.45, slightly exceeding the conservative threshold of 10. However, given the large sample size and the robustness of maximum likelihood estimation to minor deviations from normality, this deviation was deemed acceptable. Additionally, univariate and multivariate outliers were screened using Mahalanobis distance (p < .001) and Cook’s distance, with no influential cases detected.

Descriptive statistics and zero-order correlations among all study variables were computed in SPSS to provide an initial overview of the data and to examine the strength and direction of associations. Internal consistency reliability for each scale and subscale was assessed using Cronbach’s alpha coefficients. Structural equation modeling (SEM) was performed in AMOS to test the hypothesized relationships among VR engagement, perceptual skills, motor skills, and cognitive skills. A two-step modeling approach was followed. First, a measurement model was specified to evaluate the factorial validity of the latent constructs. Each latent variable was represented by its corresponding subscales as observed indicators.
Confirmatory factor analysis was used to assess factor loadings, construct reliability, and convergent validity. Convergent validity was evaluated through standardized factor loadings and average variance extracted, while discriminant validity was examined by comparing the square roots of average variance extracted values with inter-construct correlations. In the second step, a structural model was specified to examine the direct and indirect relationships among the latent variables. VR engagement was modeled as the exogenous construct, while perceptual, motor, and cognitive skills were treated as endogenous constructs. Path coefficients were estimated using maximum likelihood estimation. Model fit was evaluated using multiple indices, including the chi-square statistic and its ratio to degrees of freedom, the comparative fit index (CFI), the Tucker–Lewis index (TLI), the root mean square error of approximation (RMSEA), and the standardized root mean square residual (SRMR). Commonly accepted cutoff criteria were used to judge model adequacy. Descriptive statistics and zero-order correlations among all study variables were computed in SPSS to provide an initial overview of the data and to examine the strength and direction of associations. Internal consistency reliability for each scale and subscale was assessed using Cronbach’s alpha coefficients. Structural equation modeling (SEM) was performed in AMOS to test the hypothesized relationships among VR engagement, perceptual skills, motor skills, and cognitive skills. A two-step modeling approach was followed. First, a measurement model was specified to evaluate the factorial validity of the latent constructs. Each latent variable was represented by its corresponding subscales as observed indicators. Confirmatory factor analysis was used to assess factor loadings, construct reliability, and convergent validity. 
Convergent validity was evaluated through standardized factor loadings and average variance extracted, while discriminant validity was examined by comparing the square roots of average variance extracted values with inter-construct correlations. In the second step, a structural model was specified to examine the direct and indirect relationships among the latent variables. VR engagement was modeled as an exogenous construct, while perceptual, motor, and cognitive skills were treated as endogenous constructs. Path coefficients were estimated using maximum likelihood estimation. Model fit was evaluated using multiple fit indices, including the chi-square statistic and its ratio to degrees of freedom, the comparative fit index (CFI), the Tucker–Lewis index (TLI), the root mean square error of approximation (RMSEA), and the standardized root mean square residual (SRMR). Commonly accepted cutoff criteria were used to judge model adequacy.
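The fit criteria named above can be applied mechanically once the indices are available. The following is a minimal sketch rather than the AMOS procedure used in the study: the function name `evaluate_fit`, the input values, and the specific cutoffs (χ²/df below 3, CFI and TLI of at least .90, RMSEA and SRMR of at most .08) are assumptions that reflect commonly cited conventions.

```python
def evaluate_fit(chi2, df, cfi, tli, rmsea, srmr):
    """Return a dict mapping each fit criterion to True (met) or False.

    The cutoffs are illustrative assumptions, not thresholds quoted
    from the study.
    """
    return {
        "chi2/df < 3": chi2 / df < 3.0,
        "CFI >= .90": cfi >= 0.90,
        "TLI >= .90": tli >= 0.90,
        "RMSEA <= .08": rmsea <= 0.08,
        "SRMR <= .08": srmr <= 0.08,
    }


# Hypothetical index values, used only to exercise the function.
checks = evaluate_fit(chi2=150.0, df=60, cfi=0.95, tli=0.94, rmsea=0.05, srmr=0.04)
for criterion, passed in checks.items():
    print(f"{criterion}: {'met' if passed else 'not met'}")
```

Reporting each criterion separately, instead of a single pass/fail flag, keeps visible which index drives a borderline judgment.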

Results

This section first presents the descriptive statistics and preliminary analyses, which summarize the data distribution and check the assumptions required for structural equation modeling (SEM). Table 1 displays the mean, standard deviation, range, skewness, and kurtosis for the four key constructs: VR engagement, perceptual skills, motor skills, and cognitive skills. As shown in the table, participants reported moderate to moderately high levels of engagement with VR tasks (M = 3.24) and perceived improvements in all skill domains, with perceptual skills receiving the highest mean score (M = 3.46). The skewness and kurtosis values for all variables fell within ±1.0, indicating that the distributions were approximately normal and suitable for parametric testing. These preliminary checks supported the assumption of approximate univariate normality and thereby the use of maximum likelihood estimation in the subsequent confirmatory factor analysis and structural model testing.

Table 1.

Descriptive Statistics for Study Variables

Variable	Mean	SD	Min	Max	Skewness	Kurtosis
VR engagement	3.24	0.67	1.32	4.86	−0.41	−0.36
Perceptual skills	3.46	0.63	1.57	4.95	−0.52	−0.21
Motor skills	3.28	0.74	1.52	4.84	−0.29	−0.48
Cognitive skills	3.32	0.66	1.51	4.93	−0.45	−0.31

Note. N = 248. All variables are measured on a 5-point Likert scale. Skewness and kurtosis values fall within the acceptable range for univariate normality (−1 to +1), indicating that the data distributions are approximately symmetric and mesokurtic.
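For readers who wish to reproduce this kind of screening, the following is a minimal sketch of moment-based skewness and excess kurtosis. It uses population (biased) moment formulas, which can differ slightly from the bias-corrected estimates that SPSS reports. The function names and the sample scores are hypothetical and are not taken from the study data.

```python
from statistics import fmean


def central_moment(values, order):
    """Population central moment of the given order."""
    mean = fmean(values)
    return fmean([(v - mean) ** order for v in values])


def skewness(values):
    """Moment-based skewness: m3 / m2^(3/2)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2^2 - 3 (0 for a normal curve)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Hypothetical scale scores on a 5-point metric.
scores = [2.1, 2.8, 3.0, 3.2, 3.3, 3.5, 3.6, 3.9, 4.1, 4.4]
within_range = abs(skewness(scores)) <= 1.0 and abs(excess_kurtosis(scores)) <= 1.0
print(within_range)
```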

Descriptive Statistics and Preliminary Analyses

Table 1 presents the descriptive statistics for the main latent constructs. Mean scores indicate moderate to moderately high levels across all variables, suggesting that participants reported meaningful engagement with VR-based language learning and perceived gains in perceptual, motor, and cognitive domains. Skewness and kurtosis values fell within acceptable ranges (±1), indicating no substantial deviation from normality. These results supported the use of maximum likelihood estimation in subsequent SEM analyses.

All scales demonstrated satisfactory internal consistency, with Cronbach’s alpha values exceeding the recommended threshold of .70 (Table 2). The reliability coefficients indicate that the items within each scale measured coherent constructs. The results support the use of composite scores and latent modeling for SEM analysis.

Table 2.

Internal Consistency Reliability of the Scales

Scale	Items	Cronbach’s α
VR engagement	12	.88
Perceptual skills	12	.85
Motor skills	11	.86
Cognitive skills	12	.91

Note. Cronbach’s alpha coefficients for all scales exceeded the recommended threshold of .70, indicating satisfactory internal consistency and reliability for the measurement instruments used in this study.
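As an illustration of how an alpha coefficient of this kind is obtained, the sketch below computes Cronbach’s alpha from a respondents-by-items matrix using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name and the response data are hypothetical; the study’s own coefficients were computed in SPSS.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    # Transpose so that each tuple holds all responses to one item.
    items = list(zip(*item_scores))
    item_variances = [variance(column) for column in items]
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 respondents x 4 Likert items.
responses = [
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 4, 5],
    [2, 2, 3, 2],
    [4, 4, 4, 3],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```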

All correlations in Table 3 were positive and statistically significant, indicating meaningful associations among the constructs. VR engagement showed strong relationships with perceptual and cognitive skills, suggesting that learners who were more engaged in VR tasks perceived greater gains in language-related processing. The magnitude of correlations remained below .80, indicating no multicollinearity concerns and supporting the distinctiveness of the constructs.

Table 3.

Pearson Correlations Among Latent Constructs

Variable	1	2	3	4
1. VR engagement	—
2. Perceptual skills	.62**	—
3. Motor skills	.56**	.57**	—
4. Cognitive skills	.67**	.66**	.61**	—

Note. All correlations are significant at p < .01. The strong positive associations among the variables suggest that higher levels of VR engagement are closely linked with greater perceived improvements in perceptual, motor, and cognitive skills. These bivariate relationships provide initial support for the hypothesized links between the constructs, which are further examined in the structural model.

A confirmatory factor analysis was conducted in AMOS to evaluate the measurement model.

All standardized factor loadings in Table 4 exceeded .70, indicating strong relationships between latent constructs and their indicators. Composite reliability values in Table 5 were above .80, confirming adequate construct reliability. Average variance extracted values exceeded .50 for all constructs, demonstrating satisfactory convergent validity. These results confirmed that the measurement model was psychometrically sound.

Table 4.

Standardized Factor Loadings

Construct	Indicator	Loading
VR engagement	Frequency	.74
	Immersion	.88
	Active participation	.86
Perceptual skills	Pronunciation	.84
	Listening	.85
	Visual attention	.79
Motor skills	Hand–Eye coordination	.84
	Movement accuracy	.75
	Fine motor control	.76
Cognitive skills	Real-time processing	.83
	Working memory	.78
	Problem solving	.86

Note. All standardized factor loadings are statistically significant (p < .01) and exceed the conventional threshold of .70, indicating strong convergent validity for each latent construct. The high loadings suggest that the observed indicators are robust measures of their respective underlying dimensions.

Table 5.

Composite Reliability (CR) and Average Variance Extracted (AVE)

Construct	CR	AVE
VR engagement	.88	.71
Perceptual skills	.86	.67
Motor skills	.83	.62
Cognitive skills	.89	.73

Note. CR values for all constructs exceeded .70, indicating good internal consistency, while AVE values were all above .50, confirming adequate convergent validity. These results support the reliability and validity of the measurement model.
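Both statistics can be derived from standardized loadings. The sketch below assumes the conventional formulas with uncorrelated measurement errors: AVE is the mean squared loading, and CR is (Σλ)² / ((Σλ)² + Σ(1 − λ²)). The function names are hypothetical. The example uses the motor skills loadings from Table 4; values recomputed from the rounded loadings may differ somewhat from those reported in Table 5 for other constructs.

```python
def average_variance_extracted(loadings):
    """Mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """Composite reliability, assuming uncorrelated measurement errors."""
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_variance)


motor_loadings = [0.84, 0.75, 0.76]  # Table 4, motor skills indicators
print(round(average_variance_extracted(motor_loadings), 2))  # 0.62
print(round(composite_reliability(motor_loadings), 2))       # 0.83
```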

The square roots of AVE values in Table 6 were greater than the corresponding inter-construct correlations, supporting discriminant validity. This indicates that each construct captured a distinct dimension of learners’ VR-based language learning experience.

For the structural model, the fit indices were χ²(84) = 198.36, p < .001, χ²/df = 2.36, CFI = .95, TLI = .94, RMSEA = .052 (90% CI [.044, .060]), and SRMR = .046. The structural model demonstrated acceptable to good fit across all indices. The χ²/df ratio was below 3, and incremental fit indices exceeded .90. RMSEA and SRMR values were within recommended cutoffs, indicating that the hypothesized model adequately represented the observed data.

Table 6.

Discriminant Validity

Construct	VRE	PS	MS	CS
VR engagement	.84
Perceptual skills	.67	.82
Motor skills	.57	.58	.79
Cognitive skills	.69	.63	.60	.85

Note. The square root of the Average Variance Extracted (AVE) for each construct is presented on the diagonal. All diagonal values are greater than the corresponding inter-construct correlations in the same row and column, indicating that discriminant validity is established for all latent variables.
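The Fornell–Larcker comparison can be checked directly from the reported values. The sketch below takes the AVE values from Table 5 and the inter-construct correlations from Table 6; the abbreviations follow the Table 6 column headers, and the function name is hypothetical.

```python
from math import sqrt

# AVE values from Table 5.
ave = {"VRE": 0.71, "PS": 0.67, "MS": 0.62, "CS": 0.73}

# Inter-construct correlations from the off-diagonal cells of Table 6.
correlations = {
    ("VRE", "PS"): 0.67, ("VRE", "MS"): 0.57, ("VRE", "CS"): 0.69,
    ("PS", "MS"): 0.58, ("PS", "CS"): 0.63, ("MS", "CS"): 0.60,
}


def fornell_larcker_ok(ave, correlations):
    """True if sqrt(AVE) of both constructs exceeds every shared correlation."""
    return all(
        sqrt(ave[a]) > abs(r) and sqrt(ave[b]) > abs(r)
        for (a, b), r in correlations.items()
    )


print(fornell_larcker_ok(ave, correlations))  # True
```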

To further ensure the distinctiveness of the latent constructs, particularly given the conceptual overlap between perceptual, motor, and cognitive domains, we employed the heterotrait–monotrait (HTMT) ratio of correlations as a more sensitive diagnostic for discriminant validity than the traditional Fornell–Larcker criterion. The HTMT ratio compares the average correlation between indicators of different constructs (heterotrait correlations) with the geometric mean of the average correlations among indicators of the same construct (monotrait correlations), and it was calculated for all pairs of latent variables. All HTMT values fell below the conservative threshold of 0.85, indicating that the constructs are empirically distinct and that multicollinearity is not a concern in the structural model. Additionally, we tested an alternative measurement model in which perceptual and cognitive indicators loaded on a single higher-order factor to check for potential redundancy. This constrained model showed a significant deterioration in fit compared to the proposed four-factor model (Δχ²(6) = 45.32, p < .001), confirming that treating perceptual, motor, and cognitive skills as separate but related constructs provides a superior representation of the data. These diagnostics provide stronger evidence for the factorial validity of the measurement model beyond conventional fit indices.
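The HTMT ratio for one pair of constructs can be sketched as follows. This is an illustration of the general definition, not the authors’ computation: the function name, the index layout, and the 4 × 4 indicator correlation matrix are hypothetical, and two indicators per construct are used only to keep the example short.

```python
from itertools import combinations, product
from math import sqrt
from statistics import fmean


def htmt(corr, items_a, items_b):
    """HTMT ratio for two constructs.

    corr: square matrix (list of lists) of indicator correlations.
    items_a, items_b: row/column indices of each construct's indicators.
    """
    # Average correlation between indicators of different constructs.
    hetero = fmean([corr[i][j] for i, j in product(items_a, items_b)])
    # Average correlation among indicators within each construct.
    mono_a = fmean([corr[i][j] for i, j in combinations(items_a, 2)])
    mono_b = fmean([corr[i][j] for i, j in combinations(items_b, 2)])
    return hetero / sqrt(mono_a * mono_b)


# Hypothetical correlations: indicators 0-1 belong to construct A, 2-3 to B.
corr = [
    [1.00, 0.64, 0.32, 0.32],
    [0.64, 1.00, 0.32, 0.32],
    [0.32, 0.32, 1.00, 0.64],
    [0.32, 0.32, 0.64, 1.00],
]
ratio = htmt(corr, [0, 1], [2, 3])
print(ratio < 0.85)  # True
```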

VR engagement significantly predicted perceptual, motor, and cognitive skills. As shown in Table 7 and Figure 1, the strongest effect was observed for cognitive skills, indicating that higher engagement in VR tasks was associated with stronger perceived real-time processing, working memory, and problem-solving abilities. These findings support the central role of engagement in relation to perceived learning outcomes in VR-based language environments.

Table 7.

Standardized Direct Effects

Path	β	SE	CR	p
VR engagement → Perceptual skills	.62	.05	12.40	<.001
VR engagement → Motor skills	.55	.06	9.88	<.001
VR engagement → Cognitive skills	.68	.05	13.60	<.001

Note. All path coefficients are statistically significant at p < .001. The results indicate that VR engagement is a strong positive predictor of perceptual, motor, and cognitive skills, with the strongest effect observed for cognitive skills.

Figure 1.

Structural model

VR engagement explained 38% of the variance in perceptual skills, 30% in motor skills, and 46% in cognitive skills. These values in Table 8 indicate moderate to substantial explanatory power, particularly for cognitive outcomes. The results suggest that engagement in VR tasks plays a meaningful role in shaping multiple dimensions of language-related skill development.

Table 8.

Squared Multiple Correlations (R²)

Endogenous variable	R²
Perceptual skills	.38
Motor skills	.30
Cognitive skills	.46

Note. R² values represent the proportion of variance in each endogenous variable explained by VR engagement. The results indicate that VR engagement accounts for a moderate to substantial amount of variance in cognitive skills (46%) and perceptual skills (38%), while explaining a moderate portion of variance in motor skills (30%).
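Because each endogenous construct has a single predictor in this model, its R² in the standardized solution equals the squared standardized path coefficient. The short check below, offered as an illustrative sketch, squares the β values from Table 7 and recovers the R² values in Table 8.

```python
# Standardized path coefficients from Table 7.
paths = {"Perceptual skills": 0.62, "Motor skills": 0.55, "Cognitive skills": 0.68}

# With one predictor per outcome, R-squared is the squared standardized path.
r_squared = {outcome: round(beta ** 2, 2) for outcome, beta in paths.items()}
print(r_squared)
# {'Perceptual skills': 0.38, 'Motor skills': 0.3, 'Cognitive skills': 0.46}
```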

Discussion

The present study examined the relationships between immersive VR-based language learning, learner engagement, and EFL learners’ perceptual, motor, and cognitive skills using a structural equation modeling approach. The findings showed that VR engagement was a significant predictor of all three skill domains. Engagement demonstrated the strongest association with cognitive skills, followed by perceptual skills and motor skills. These results suggest that learners’ involvement in VR-based language tasks is closely related to how they process language input, coordinate actions, and manage task demands in immersive environments.

The measurement model results further confirmed that engagement in VR-based learning is not a unitary experience but involves frequency of use, immersion, and active participation. The structural model indicated that when learners reported higher levels of engagement, they also reported greater efficiency in real-time language processing, better perceptual sensitivity to linguistic cues, and more coordinated motor responses during VR tasks. These findings support the assumption that engagement functions as a central mechanism linking immersive environments to multiple dimensions of learner functioning rather than directly influencing language outcomes in isolation.

The results align with prior research reporting positive associations between immersive learning environments and learner engagement in EFL contexts (Lai & Chen, 2023; Weng et al., 2024; Yang et al., 2020). Similar to Chang et al. (2020) and Tai et al. (2022), the present study suggests that VR-supported activities can foster deeper involvement than traditional approaches. However, unlike many earlier studies that focused primarily on language performance outcomes, this study provides evidence that engagement relates to broader skill domains that support language learning. The strong relationship between engagement and cognitive skills is consistent with research emphasizing the role of immersive environments in supporting attention, working memory, and real-time processing (Keller et al., 2024; Lin et al., 2022). At the same time, the findings extend earlier work by modeling cognitive skills as a distinct latent construct rather than treating cognitive processing as an implicit outcome. This addresses concerns raised by Schmidt et al. (2023) and Chen et al. (2022) regarding the need for clearer operationalization of learning processes in immersive studies.

The association between VR engagement and perceptual skills supports findings from studies on pronunciation and listening development in immersive settings (Lin & Wang, 2021; Xu et al., 2023). However, the present results suggest that perceptual gains are closely tied to learners’ level of engagement rather than exposure alone. This contrasts with some earlier research that attributed perceptual improvement mainly to technological affordances (Hsu, 2017; Zhonggen, 2018) without accounting for learner involvement. The relationship between engagement and motor skills provides partial support for embodied learning perspectives (Bendeck Soto et al., 2020; Franco et al., 2025). While the effect size was weaker than for cognitive and perceptual skills, the results indicate that active interaction and coordinated movement are relevant components of VR-based language learning. This finding responds to calls by Govender and Arnedo-Moreno (2021) and Dhimolea et al. (2022) for more explicit attention to motor interaction in immersive language studies.

This study contributes to the literature in several significant ways. First, it offers an integrative model that links VR engagement with perceptual, motor, and cognitive skills within a single analytical framework. This addresses a notable gap in prior research, which has often examined these dimensions in isolation or treated motor and perceptual outcomes as secondary to cognitive gains. By testing them simultaneously, this study demonstrates that engagement in immersive environments supports a holistic set of learner abilities, not just cognitive processing. Second, by employing structural equation modeling with a substantial sample size, the study provides robust empirical evidence for the central role of engagement as a multidimensional predictor of multiple learner skills in VR-based EFL contexts. This moves beyond simple correlation to clarify the relative strength of these relationships.

Another key contribution lies in the careful operationalization of constructs. Rather than labeling tasks as “immersive” by default based on hardware alone, this study assessed learners’ reported experiences with concrete, task-specific VR activities. This approach responds directly to recent critiques concerning construct validity and the overgeneralization of “immersion” in immersive learning research (Derakhshan et al., 2024; Schmidt et al., 2023). By grounding the measures in actual learner experiences, the study offers a more nuanced understanding of how specific types of engagement (e.g., active participation vs. passive immersion) relate to skill development. Finally, the findings add large-sample evidence from a Chinese EFL context, where VR adoption in higher education is expanding rapidly but systematic, large-scale research remains limited (Li et al., 2025; Zhang et al., 2024). This geographic and cultural context provides valuable comparative data for the global EFL research community, highlighting how VR engagement functions in non-Western educational settings.

Implications

From a theoretical perspective, the findings support engagement-based models of technology-enhanced language learning, which view engagement as a mediator between instructional design and learning-related processes (Alalwan et al., 2020; Moon et al., 2020). The strong links between engagement and cognitive skills also align with cognitive load theory, suggesting that immersive environments may support learning when engagement helps learners manage processing demands (Keller et al., 2024; Lin et al., 2022). The results further provide partial support for embodied cognition accounts by demonstrating a link between engagement and motor skills. However, the relatively smaller effect size indicates that motor interaction alone may not guarantee learning benefits, echoing concerns raised by Wang et al. (2025) and Wei et al. (2025) regarding task complexity and cognitive overload. Overall, the findings suggest that theoretical models of VR-based language learning should integrate engagement, cognitive processing, and embodied interaction rather than privileging a single perspective.

The findings have several implications for EFL instruction and curriculum design. First, VR-based language learning should prioritize engagement through meaningful tasks rather than focusing solely on technological novelty. Tasks that encourage active participation and sustained attention may be more effective in supporting cognitive and perceptual skills. Second, instructors should be aware that motor interaction can support learning when it is aligned with language objectives, but excessive or poorly designed interaction may distract learners. For teacher education, the results highlight the need to prepare instructors to integrate VR tasks with clear pedagogical goals. Institutions considering VR adoption should also evaluate how engagement is supported across tasks and learner levels rather than assuming uniform benefits. These implications align with recent calls for pedagogically grounded use of immersive technologies in language education (Ghafouri et al., 2025; Yazdi & Ghanizadeh, 2024).

Limitations

Several limitations should be acknowledged. First, the study relied on self-report measures, which reflect learners’ perceptions rather than direct assessments of perceptual, motor, or cognitive performance. Second, the cross-sectional design limits causal interpretation of the observed relationships. Third, although the sample size was large, participants were drawn from a single national context, which may limit generalizability. Finally, variation in VR task design across courses was not experimentally controlled, which may have influenced learners’ reported experiences.

Suggestions for Future Research

Future studies should combine self-report data with behavioral or performance-based measures to capture perceptual, motor, and cognitive skills more directly. Longitudinal or experimental designs would help clarify causal relationships between engagement and skill development. Further research could also examine moderating variables such as proficiency level, task complexity, or instructional support. Comparative studies across different EFL contexts may provide additional insight into how cultural and institutional factors shape VR-based language learning experiences.

Conclusion

This study examined the relationship between immersive VR-based language learning and EFL learners’ perceptual, motor, and cognitive skills, with a particular focus on the role of learner engagement. Using a structural equation modeling approach, the findings showed that engagement in VR-based language tasks was significantly associated with all three skill domains. Among these, cognitive skills demonstrated the strongest relationship with engagement, followed by perceptual and motor skills. These results suggest that VR-based learning environments are most effective when learners are actively involved in tasks that require sustained attention, real-time processing, and purposeful interaction. By modeling engagement as a central construct, the study extends prior VR research that has often emphasized learning outcomes without examining the processes that support them. The results highlight that engagement is not a peripheral feature of immersive learning but a key mechanism through which VR environments relate to learners’ experiences and perceived skill development. This perspective contributes to ongoing discussions in EFL research regarding the need to move beyond technology-driven explanations and toward process-oriented accounts of learning in digital environments.

The study also responds to recent methodological concerns in immersive learning research by grounding its constructs in learners’ reported experiences with concrete VR language tasks. Rather than assuming perceptual, motor, or cognitive involvement as inherent features of VR, the study provides empirical evidence that these dimensions are meaningfully connected to how learners engage with VR activities. In doing so, it offers a more cautious and transparent interpretation of VR-related effects in language education. Despite its contributions, the study should be interpreted in light of its limitations, including its reliance on self-report data and cross-sectional design. Future research is encouraged to integrate objective measures of learner performance and to employ longitudinal or experimental designs to clarify causal relationships. Nevertheless, the findings underscore the importance of pedagogically grounded VR task design and offer empirical support for engagement-focused approaches to immersive language learning.

Footnotes

ORCID iD

Lei Pan

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Author Biographies

Yan Wu is a Lecturer at the Department of Foreign Languages, Xinzhou Normal University, China. She is an experienced and accomplished professional in foreign language teaching pedagogy and teacher development, committed to translating theoretical knowledge into high-impact classroom practice and curriculum design. Her areas of expertise include teacher professional development, as well as curriculum design and evaluation. She has published one academic monograph and has received five provincial-level awards for student guidance.

Lei Pan is a Postdoctoral Research Fellow at the School of Research, Swinburne University of Technology (Sarawak Campus), Malaysia. He received his PhD in Education and has extensive experience in higher education teaching and research. His research interests include educational psychology, applied linguistics, foreign language education, educational technology, artificial intelligence in education, learner psychology, and second language acquisition. He has published extensively in SSCI- and Scopus-indexed journals and serves as a reviewer and editorial board member for several international academic journals. His current research focuses on the integration of emerging technologies and psychological factors in language learning and teaching, with particular emphasis on learner engagement, motivation, well-being, and intercultural communication.

Jin Wang is an Associate Professor at Xinzhou Normal University, China. She has extensive experience in English language teaching and research in higher education. She received her PhD in English Language Teaching and has published numerous articles in leading academic journals. Her research interests focus on second language acquisition, with particular emphasis on the application of emerging technologies in English language teaching and learning. Her recent work explores the integration of artificial intelligence, digital technologies, and innovative pedagogical approaches to enhance language learning effectiveness, learner engagement, and educational outcomes. She is committed to advancing research and practice in technology-enhanced language education and foreign language teacher development.

References

Alalwan

Cheng

Al-Samarraie

Yousef

Ibrahim Alzahrani

Sarsam

S. M.

(2020). Challenges and prospects of virtual reality and augmented reality utilization among primary school teachers: A developing country perspective. Studies In Educational Evaluation, 66(1), 100876. https://doi.org/10.1016/j.stueduc.2020.100876

Alfadil

(2020). Effectiveness of virtual reality game in foreign language vocabulary acquisition. Computers & Education, 153(1), 103893. https://doi.org/10.1016/j.compedu.2020.103893

Barrett

Pack

Guo

Wang

(2023). Technology acceptance model and multi-user virtual reality learning environments for Chinese language education. Interactive Learning Environments, 31(3), 1665–1682. https://doi.org/10.1080/10494820.2020.1855209

Bendeck Soto

J. H.

Toro Ocampo

D. C.

Beltrán Colon

L. D. C.

Valencia Oropesa

(2020). Perceptions of immerseme virtual reality platform to improve English communicative skills in higher education. International Journal of Interactive Mobile Technologies (iJIM), 14(07), 4–19. https://doi.org/10.3991/ijim.v14i07.12181

Chang

Y.-S.

Chen

C.-N.

Liao

C.-L.

(2020). Enhancing English-learning performance through a simulation classroom for EFL students using augmented reality—A junior high school case study. Applied Sciences, 10(21), 7854. https://doi.org/10.3390/app10217854. https://www.mdpi.com/2076-3417/10/21/7854

Chen

J. C.

(2016). The crossroads of English language learners, task-based instruction, and 3D multi-user virtual learning in Second Life. Computers & Education, 102(1), 152–171. https://doi.org/10.1016/j.compedu.2016.08.004

Chen

M.-P.

Wang

L.-C.

Zou

Lin

S.-Y.

Xie

Tsai

C.-C.

(2022). Effects of captions and English proficiency on learning effectiveness, motivation and attitude in augmented-reality-enhanced theme-based contextualized EFL learning. Computer Assisted Language Learning, 35(3), 381–411. https://doi.org/10.1080/09588221.2019.1704787

Chen

Y.-L.

(2016). The effects of virtual reality learning environment on student cognitive and linguistic development. The Asia-Pacific Education Researcher, 25(4), 637–646. https://doi.org/10.1007/s40299-016-0293-2

Derakhshan

Ortega-Martin

J. L.

Corral-Robles

(2025). Integrating innovative technologies in Technology-Assisted Language Learning (TALL) environments: Insights, applications, and impacts. Porta Linguarum, XIII(1), 9–16. https://doi.org/10.30827/portalin.viXIII.34946

10.

Derakhshan

Park

(2026). The role of multimodal AI technologies in EFL students’ perceived positive and negative achievement emotions: An existential positive psychology (EPP) perspective. Language Related Research, 17(3), 1–27. https://doi.org/10.48311/lrr.2025.118514.83043

11.

Derakhshan

Park

Lalli

G. S.

(2026). Exploring the role of AI technology adoption in at-risk students’ absenteeism and boredom. International Journal of TESOL Studies, 8(1), 128–145. https://doi.org/10.58304/ijts.260402

12.

Derakhshan

Teo

Khazaie

(2024). Is game-based language learning general or specific-oriented? Exploring the applicability of mobile virtual realities to medical English education in the Middle East. Computers & Education, 213(1), 105013. https://doi.org/10.1016/j.compedu.2024.105013

13.

Derwing

T. M.

Munro

M. J.

(2005). Second language accent and pronunciation teaching: A research-based approach. Tesol Quarterly, 39(3), 379–397. https://doi.org/10.2307/3588486

14.

Dhimolea

T. K.

Kaplan-Rakowski

Lin

(2022). A systematic review of research on high-immersion virtual reality for language learning. TechTrends, 66(5), 810–824. https://doi.org/10.1007/s11528-022-00717-w

15.

Fokides

Zampouli

(2017). Content and language integrated learning in OpenSimulator project. Results of a pilot implementation in Greece. Education and Information Technologies, 22(4), 1479–1496. https://doi.org/10.1007/s10639-016-9503-z

16.

Franco

Glize

Laganaro

(2025). Impact of immersive virtual reality compared to a digital static approach in word (re)learning in post-stroke aphasia and neurotypical adults: Lexical-semantic effects? Neuropsychologia, 208(1), 109069. https://doi.org/10.1016/j.neuropsychologia.2025.109069

17.

Gentile

A. M.

(2000). Skill acquisition: Action, movement, and neuromotor processes. In Zelaznik

H. W.

(Ed.), Handbook of motor behavior (pp. 315–347). Academic Press.

18.

Ghafouri

Hassaskhah

Mahdavi-Zafarghandi

(2025). From virtual assistant to writing mentor: Exploring the impact of a ChatGPT-based writing instruction protocol on EFL teachers’ self-efficacy and learners’ writing skill. Language Teaching Research, 0(0), 13621688241239764. https://doi.org/10.1177/13621688241239764

19.

Govender

Arnedo-Moreno

(2021). An analysis of game design elements used in digital game-based language learning. Sustainability, 13(12), 6679. https://doi.org/10.3390/su13126679. https://www.mdpi.com/2071-1050/13/12/6679

20.

Hsieh

M.-H.

Chuang

H.-H.

Albanese

(2022). Investigating student agency and affordances during online virtual exchange projects in an ELF context from an ecological CALL perspective. System, 109(1), 102888. https://doi.org/10.1016/j.system.2022.102888

21.

Hsu

T.-C.

(2017). Learning English with Augmented Reality: Do learning styles matter? Computers & Education, 106(1), 137–149. https://doi.org/10.1016/j.compedu.2016.12.007

22.

Hung

H.-T.

Yang

J. C.

Hwang

G.-J.

Chu

H.-C.

Wang

C.-C.

(2018). A scoping review of research on digital game-based language learning. Computers & Education, 126(1), 89–104. https://doi.org/10.1016/j.compedu.2018.07.001

23. Keller, Brucker-Kley, & Schwammel (2024). A case study of an immersive learning unit for German as a second language. Discover Education, 3(1), 28. https://doi.org/10.1007/s44217-024-00106-w

24. Koç, Ö., Altun, & Yüksel, H. G. (2022). Writing an expository text using augmented reality: Students’ performance and perceptions. Education and Information Technologies, 27(1), 845–866. https://doi.org/10.1007/s10639-021-10438-x

25. Lai, K.-W. K., & Chen, H.-J. H. (2023). A comparative study on the effects of a VR and PC visual novel game on vocabulary learning. Computer Assisted Language Learning, 36(3), 312–345. https://doi.org/10.1080/09588221.2021.1928226

26. Tian, Yang, Liu, & Sun (2025). The effect of fully immersive virtual reality technology combined with psychological and behavioral intervention on autism spectrum disorder. BMC Psychology, 13(1), 1120. https://doi.org/10.1186/s40359-025-03460-y

27. Lin, Liu, G.-Z., & Chen, N.-S. (2022). The effects of an augmented-reality ubiquitous writing application: A comparative pilot project for enhancing EFL writing instruction. Computer Assisted Language Learning, 35(5-6), 989–1030. https://doi.org/10.1080/09588221.2020.1770291

28. Lin, Y.-J., & Wang, H.-c. (2021). Using virtual reality to facilitate learners’ creative self-efficacy and intrinsic motivation in an EFL classroom. Education and Information Technologies, 26(4), 4487–4505. https://doi.org/10.1007/s10639-021-10472-9

29. Wang (2024). The contribution of teacher self-efficacy, resilience and emotion regulation to teachers’ well-being: Technology-enhanced teaching context. European Journal of Education, e12755. https://doi.org/10.1111/ejed.12755

30. Makransky & Lilleholt (2018). A structural equation modeling investigation of the emotional value of immersive virtual reality in education. Educational Technology Research and Development, 66(5), 1141–1164. https://doi.org/10.1007/s11423-018-9581-2

31. Moon & Sokolikj (2020). Automatic assessment of cognitive and emotional states in virtual reality-based flexibility training for four adolescents with autism. British Journal of Educational Technology, 51(5), 1766–1784. https://doi.org/10.1111/bjet.13005

32. Parmaxi & Demetriou, A. A. (2020). Augmented reality in language learning: A state-of-the-art review of 2014–2019. Journal of Computer Assisted Learning, 36(6), 861–875. https://doi.org/10.1111/jcal.12486

33. Saeedi & Najjarpour (2025). Enhancing technical vocabulary acquisition in ESP context through virtual content development with Articulate Storyline. Social Sciences & Humanities Open, 11(1), 101539. https://doi.org/10.1016/j.ssaho.2025.101539

34. Schmidt, M. M., Lee, Francois, M.-S., Huang, Cheng, & Weng (2023). Learning experience design of project PHoENIX: Addressing the lack of autistic representation in extended reality design and development. Journal of Formative Design in Learning, 7(1), 27–45. https://doi.org/10.1007/s41686-023-00077-5

35. Sweller (2011). Cognitive load theory. In J. P. Mestre & B. H. Ross (Eds.), The psychology of learning and motivation: Cognition in education (pp. 37–76). Elsevier Academic Press. https://doi.org/10.1016/B978-0-12-387691-1.00002-8

36. Tai, T.-Y., Chen, H. H.-J., & Todd (2022). The impact of a virtual reality app on adolescent EFL learners’ vocabulary learning. Computer Assisted Language Learning, 35(4), 892–917. https://doi.org/10.1080/09588221.2020.1752735

37. Wang, W.-S., Lee, H.-Y., Lin, C.-J., P.-H., Huang, Y.-M., T.-T. (2025). Enhancing students’ learning outcomes in self-regulated virtual reality learning environment with learning aid mechanisms. British Journal of Educational Technology, 56(1), 366–387. https://doi.org/10.1111/bjet.13512

38. Wei, Chen, Zhao, Wang, Lee, L.-K., & Liu (2025). Effects of immersive virtual reality on primary students’ science performance in classroom settings: A generative AI pedagogical agents-enhanced 5E approach. Interactive Learning Environments, 34(3), 1–20. https://doi.org/10.1080/10494820.2025.2514101

39. Weng, Schmidt, Huang, & Hao (2024). The effectiveness of immersive learning technologies in K–12 English as second language learning: A systematic review. ReCALL, 36(2), 210–229. https://doi.org/10.1017/S0958344024000041

40. Bao & Duan (2023). Design and application of VR-based college English game teaching. Entertainment Computing, 46(1), 100568. https://doi.org/10.1016/j.entcom.2023.100568

41. Yang, F.-C. O., F.-Y. R., Hsieh, J. C., W.-C. V. (2020). Facilitating communicative ability of EFL learners via high-immersion virtual reality. Journal of Educational Technology & Society, 23(1), 30–49. https://www.jstor.org/stable/26915405

42. Yazdi, M. M., & Ghanizadeh (2024). University students’ resilience in virtual education, personal best goals, anxiety, and academic achievement: Towards the prospects of effective virtual education. Current Psychology, 43(14), 12447–12461. https://doi.org/10.1007/s12144-023-05330-5

43. Zhang, Pan, Zhang, Meng, & Hwang, G.-J. (2024). Effects of virtual reality based microteaching training on pre-service teachers’ teaching skills from a multi-dimensional perspective. Journal of Educational Computing Research, 62(3), 875–903. https://doi.org/10.1177/07356331231226179

44. Zhang, Ding, Naumceska, & Zhang (2022). Virtual reality technology as an educational and intervention tool for children with autism spectrum disorder: Current perspectives and future directions. Behavioral Sciences, 12(5), 138. https://doi.org/10.3390/bs12050138

45. Zhang & Miao (2025). Enhancing EFL learners’ engagement and motivation through immersive technologies: The role of artificial intelligence, augmented reality, virtual reality, and mobile applications. European Journal of Education, 60(2), e70128. https://doi.org/10.1111/ejed.70128

46. Zhonggen (2018). Differences in serious game-aided and traditional English vocabulary acquisition. Computers & Education, 127, 214–232. https://doi.org/10.1016/j.compedu.2018.07.014