Abstract
This study explored the impact of using a musical mnemonic device on enhancing memory retention in second-grade students (N = 132) while learning a Chinese poem. The quasi-experimental design involved two groups of students: one receiving music-based instruction (n = 65) and the other receiving traditional verbal instruction (n = 67). Students’ recitation accuracy was assessed immediately after learning and 1 week later with repeated procedures. While initial findings revealed no significant difference in immediate performance between the groups (p > .05), notable improvements were observed in the experimental, music-based group after a one-week interval with a repeated procedure (p < .05), particularly for male students (p < .005). In addition, students with instrumental training outperformed their peers without such training, supporting the notion that music training could enhance working memory (p < .01). We also revealed that students’ retention of Chinese poems, especially for those in the lowest 10th percentile of accuracy, was significantly boosted by incorporating music after 1 week’s interval (p < .001). These results suggest that while music as a mnemonic device may not yield immediate benefits, it can potentially enhance memory retention over time.
In recent years, meta-analytic studies (Gordon et al., 2015; Neves et al., 2022) have systematically synthesized evidence on the relationship between music education and academic performance. While these analyses do not assert direct causality, researchers increasingly report consistent correlations between musical training and improvements in working memory and literacy outcomes (Fennell et al., 2021; Talamini et al., 2023). These findings point toward the broader benefits of musical education. Some researchers believe working memory mediates the benefits of musicianship on reading ability (George & Coch, 2011; Slevc & Okada, 2015; Suárez et al., 2016), especially since working memory is linked to improvements in language outcomes (Caplan, 2016; Hussey et al., 2017; Payne & Stine-Morrow, 2017). Hetland’s (2000a, 2000b) studies, for instance, suggest that music listening may enhance the brain’s ability to handle tasks involving mental visualization of scenes or objects without a physical reference point. In addition, Butzlaff (2000) identified a reliable link between music education and enhanced outcomes on standardized tests for reading and language skills.
Pino et al. (2023) found that musical elements like rhythm and melody significantly support early language development, influencing phonological awareness, syntax, and semantics. They argue music is a core component of language acquisition. Certain studies have explored how music and language share common cognitive processing areas in the brain, supporting the use of music in language instruction (Koelsch, 2012; Patel, 2003). Some researchers argue that the brain processes music and language in similar ways, and incorporating music into language teaching could exploit this overlap to enhance learning (Koelsch, 2012; McMullen & Saffran, 2004). These shared mechanisms advocate for an integrative approach to language pedagogy that harnesses musical training to potentially augment linguistic proficiency.
Although researchers have extensively documented the beneficial effects of music on memory (see Butzlaff, 2000; Koelsch, 2012; McMullen & Saffran, 2004), some detrimental effects have also been reported (El Haj et al., 2014; Fassbender et al., 2012; Jäncke et al., 2014). Fassbender et al. (2012) found that music negatively affected memory during a study or learning phase but improved mood and sports performance. Fennell et al. (2021) suggest that, if working memory is the link between musical training and language skills, then tasks that require working memory to process music should interfere with tasks that require working memory for language. Researchers have commonly used dual-task paradigms, such as combining working memory and sentence-reading tasks, to demonstrate this competition effect (Fedorenko et al., 2006). When readers had to maintain verbal items in memory while reading sentences, comprehension and working memory performance both suffered, especially when both domains consumed more working memory resources (i.e., the sentences were syntactically complex, and the working memory stimuli were similar to words in the sentence). In their study, Fennell et al. (2021) found that both music and spoken language similarly interfere with working memory, hinting that they use related verbal abilities, which are different from those used for visuospatial tasks. Musicians showed better working memory performance, especially in tasks involving verbal and musical cues, suggesting that musical training may be linked to improved verbal working memory. These results point to the potential of using music training to help improve reading fluency.
Music as mnemonic devices
According to Wallace (1994), using songs as an auditory framework can greatly enhance learning and memory retrieval by promoting “deep encoding.” The rhythmic and melodic patterns in music provide sequential and temporal cues, aiding in the organization and recall of information. Melodies offer pitch patterns that serve as markers for different pieces of information. In addition, the structure of a song naturally divides information into manageable “chunks,” making it easier to remember, especially when the information pieces, like words in a list, do not relate to one another (Deutsch, 1999; Snyder, 2000). This process of chunking is a well-known memory device, as it lessens the cognitive load (Gobet, 2005). Furthermore, musical mnemonic devices utilize a limited set of tones, which can be strategically paired with information from a larger set, making the recall process more efficient (Dowling, 1973). Knott and Thaut (2018) investigated whether musical mnemonics improve verbal memory in children more than spoken words. Using the Rey Auditory Verbal Learning Test (RAVLT), they found that children aged 9–11 who learned with songs remembered 20% more words than those who learned with spoken words. This advantage was still present, though reduced to 17%, after a 15-min delay. The song group also showed better recall of the correct word order, pointing to the potential of musical structure as a mnemonic device. These statistically significant findings indicate that songs could be a more effective verbal memory device for this age group.
Nevertheless, the potential of sung stimuli to enhance word and text learning is a topic of considerable debate, giving rise to various theoretical frameworks and research methodologies. For example, in Moore et al.’s (2008) study, participants with multiple sclerosis were tested on verbal learning and memory, comparing the use of music versus speech. Initial tests assessed executive function, memory, attention, and disability level. No significant differences were found between the two groups in these areas or in memory-recognition tasks. In another example, Peterson and Thaut (2007) examined how songs can affect language learning and memory. While the authors did not find a behavioral advantage for sung over spoken word lists in explicit learning tasks, electrophysiological responses indicated that music could provide a structured framework that may facilitate learning. Higher coherence in alpha and gamma bands was observed for sung stimuli, which was not reflected in behavioral data (Thaut et al., 2005). Schon and others (2008) posited that the integration of linguistic and musical information in songs might facilitate learning more effectively than speech alone. Their results supported this hypothesis, demonstrating that songs significantly enhance the learning process compared to speech.
Although the following studies do not focus on using music explicitly as a mnemonic device, recent empirical research (e.g., Bidelman et al., 2013; Nie et al., 2022) suggests that both tonal language experience and musical training are associated with enhanced auditory perception and cognitive functioning, pointing toward shared neural mechanisms underlying music and language processing. Nie et al. (2022) demonstrated that Mandarin-speaking children, who are routinely exposed to stable tonal patterns in spoken language, exhibited superior auditory memory performance compared to non-tonal language speakers, even in the absence of formal music training. This finding highlights the potential for tonal language environments to cultivate melodic sensitivity and memory skills in early development. Similarly, Bidelman et al. (2013) found that both musicians and native Cantonese speakers outperformed English-speaking non-musicians on tasks assessing pitch discrimination, music perception, and working memory capacity. These results provide converging evidence for the mutually reinforcing effects of musical training and tone language experience on auditory cognition and underscore the role of linguistic prosody in shaping domain-general cognitive abilities.
Crucially, these findings suggest that participants’ language backgrounds, alongside the motivational and structural properties of music in songs, can be particularly advantageous during the initial stages of learning a new language when learners are segmenting and internalizing new words. These findings suggest that music may serve as an effective mnemonic, particularly when used during learning recall.
Method
Participants
This quasi-experimental research took place in a public elementary school in Macau, China. We conducted an a priori power analysis assuming a medium effect size (Cohen’s d = .5) to determine whether our sample size was adequate. For between-group comparisons using the Mann–Whitney U test, a sample size of 64 per group was required to achieve 80% power at α = .05, as determined using the parametric equivalent (independent-samples t-test) in G*Power 3.1 (Faul et al., 2009). Our final sample of 65 and 67 students per group met this criterion. For within-group comparisons (pre- and post-test), the Wilcoxon signed-rank test was used, with power calculations indicating that a sample of 34 participants would be sufficient to detect a medium effect. Four intact classes of second-grade students participated in this study (N = 135), with 34, 34, 33, and 34 students, respectively. Two of the four classes were randomly assigned to the experimental group (n = 67), and the other two classes were assigned to the control group (n = 68). Three students were absent due to illness in the first round of data collection and were excluded from the study. No absences were reported in the second round. A total of n = 65 experimental group and n = 67 control group participants were included in the final analysis (N = 132), comprising 66 females (50.0%) and 66 males (50.0%). The researchers collaborated with both the music teacher and the Chinese language teacher to execute the quasi-experiment.
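The required sample sizes quoted above can be roughly cross-checked by hand. The sketch below is not the G*Power procedure itself (which uses the noncentral t distribution); it assumes the common normal approximation with a small-sample correction term, and the function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def _z(p):
    """Standard normal quantile for probability p."""
    return NormalDist().inv_cdf(p)


def n_per_group_independent(d, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided independent-samples t-test.

    Normal approximation plus the z^2/4 small-sample correction (assumption).
    """
    z_alpha, z_power = _z(1 - alpha / 2), _z(power)
    return ceil(2 * ((z_alpha + z_power) / d) ** 2 + z_alpha ** 2 / 4)


def n_pairs_dependent(d, alpha=0.05, power=0.80):
    """Approximate number of pairs for a two-sided paired t-test.

    Normal approximation plus the z^2/2 small-sample correction (assumption).
    """
    z_alpha, z_power = _z(1 - alpha / 2), _z(power)
    return ceil(((z_alpha + z_power) / d) ** 2 + z_alpha ** 2 / 2)


print(n_per_group_independent(0.5))  # 64
print(n_pairs_dependent(0.5))        # 34
```

Under these assumptions the approximation reproduces the 64-per-group and 34-pair figures for d = .5, α = .05, and 80% power.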
We obtained ethical approval for this study from the Research Ethics Review Committee of the University of Macau (approval no. HE-0683-2025) prior to the start of the experiment. All participants were volunteers, and students were notified about the experimental procedure before taking part in our study. We obtained both parental consent and student assent through their teachers. No incentives were offered to students. All students were native speakers of Cantonese, one of the most commonly spoken languages in Macau. The first author visited each class, explained the experimental procedure, and informed students of their right to withdraw from the study at any point without negative consequences. Our coordinating teacher also assisted us in collecting information regarding students’ participation in instrumental lessons (outside-school instrumental training) and/or choral training (i.e., afterschool choir).
Materials and design
Our study aimed to examine the effects of using music as an aid for working memory within educational settings. To this end, we focused on the memorization of a classical Chinese poem. These poems, with their aesthetic and rhythmic qualities, are a central part of the Chinese literature curriculum and reflect the country’s learning methodologies, where memorization is traditional and vital (Erbaugh, 1990; Gao & Guo, 2018; Shu, 2018). Chinese students are often taught to recite and memorize poetry, a practice that extends beyond mere cognitive processes, becoming an embodied routine deeply rooted in their cultural heritage (Xu, 2022). This counters the misconception of Chinese learners merely employing a “surface approach” through rote learning, by highlighting the culturally and historically rich tradition of memorization as a skillful, holistic practice (Clark & Gieve, 2006).
The poem chosen for this study was based on the recommendations of the Chinese teachers for second-grade students and considered to be appropriate for their level, as most of the characters are familiar to the students at their current level of Chinese proficiency. Furthermore, the poem lacks typical rhyming elements, thus reducing the influence of other mnemonic enhancers on their memorization capability. We also confirmed that none of the students had learned this material or the song before this experiment.
“The Temple of Bequeathed Love” 遺愛寺 is a five-character quatrain written by the Tang dynasty poet Bai Juyi (772–846). It is a short lyric poem that expresses feelings through its scenery. The poem vibrates with a dynamic stillness and changes its vista with every step. Through actions such as playing with stones by the creek, wandering around the temple in search of flowers, and listening to the birdsong and the sound of running water, the poem vividly portrays the vibrant vitality of the Temple of Bequeathed Love and sketches its movingly beautiful landscape. With descriptions of actions like “playing,” “searching,” and “walking,” the poet expresses his profound love for nature (see Table 1).
Poem Chosen for the Recall Task, “The Temple of Bequeathed Love,” With Chinese and Cantonese Pronunciations and Its English Translated Meaning.
The music chosen to accompany the recitation and memorization of the classical Chinese poem is “穿梭詩人天地” (Shuttling through the Poet’s Universe), performed by the Chinese children’s music artist 李紫昕, known as “Purple.” This song was selected for its thematic relevance in introducing Chinese poetry to children and for its potential application as an educational resource. “Shuttling through the Poet’s Universe” is a children’s song designed to teach poems through melody, echoing the enduring words of ancient poets. It includes sections where one can recite a five-character quatrain poem to a 4×4 musical phrase. In the song, a few well-known Chinese poems are recited over this 4×4 musical phrase. Since those examples were quite familiar to our students, we intentionally omitted the vocal part of one of these original sections to allow the students to recite our selected poem, “The Temple of Bequeathed Love,” instead. We used an excerpt of this song consisting of a brief introduction featuring the lead singer (approx. 21 s), followed by the music over which students recited the poem rhythmically (approx. 15 s).
Research questions
This study examined the use of musical mnemonics to support memory and retention among second-grade speakers of tonal languages in real classroom settings. Unlike previous research conducted in artificial environments, this study addressed the practical application of music-based strategies in education by assessing their immediate- and short-term (1 week) effects on memorizing and retaining textual information.
Furthermore, we hypothesize that:
Procedure
In the first part of the quasi-experiment, for both the control and experimental groups, the procedure was led by the same instructor to ensure uniformity in delivery, while the first author of the study observed the process. To guarantee a consistent experience across both groups, we collaborated closely with the music teacher, providing a detailed lesson plan which she adhered to rigorously. Instruction was delivered in the students’ native language, Cantonese. Prior to the main experiment, a pilot study was conducted with an additional second-grade class to fine-tune our approach; data from this class were not included in our analysis. The pilot study revealed that the original beats per minute (BPM) of the music used for poem recitation was too fast for the students. Consequently, we adjusted the tempo to 75% of the original speed, which we considered better suited to the children’s recitation pace. Although pilot data were excluded from the final analysis, the insights gained mirrored the main study’s outcomes and supported the effectiveness of our methodological adjustments.
In the quasi-experiment process, the teacher initiated the session by presenting the Chinese poem “The Temple of Bequeathed Love” to the students, accompanied by a brief explanation of its background. The poem was displayed via PowerPoint, and the teacher ensured that students could recognize and know how to pronounce all the words correctly. During the explanation and reading, the teacher engaged the students with further questions to confirm their comprehension of the poem’s meaning. They recited the text together five times as part of the learning process.
For the experimental group, the next step involved a short introduction to the music “Shuttling through the Poet’s Universe” (穿梭詩人天地). Students listened to the music before the teacher led them in reciting along with the music. Throughout this exercise, the teacher reminded students to try recalling the text from memory. As the music played and the group recited along, the teacher incrementally obscured portions of the text to encourage memorization. This method of memorization through musical recitation was repeated five times. This process was timed for 30 min. For the control group, the steps were identical except no music was introduced. The teacher reminded students to try recalling the text from memory. The group recited along with free flow, and the teacher incrementally obscured portions of the text to encourage memorization. This method of memorization with free flow was repeated five times. This process was also timed for 30 min.
At this stage of the class, two authors and one researcher collaborated to assess students’ recall of the taught poem. The students’ recall scores were based on the number of words correctly recited. The scoring system was highly objective and based on a simple, transparent rubric. Each student could earn up to 20 points, with one point awarded for each correctly recited word of the poem. Errors such as omissions, insertions, or mispronunciations received no credit. This design reduced the risk of rater bias.
In the control group, students recited freely without music, while the experimental group recited along with the music. To keep the students engaged during their wait, they were encouraged to draw something related to the poem. To reduce the waiting for the memorization task among students, the two authors divided the memorization tasks so they could be conducted simultaneously. In addition, data collection proceeded according to a strict schedule; on average, it took about 42 s for a student in the experimental group and 28 s for a student in the control group to finish their memorization tasks. The memory task for each class was completed in under 15 min and was recorded for later analysis.
For the second part of the study, the entire procedure was repeated for both the control and experimental groups 1 week later, led by the same instructor. Students were given no instructions about reviewing the learned material during the intervening week. Participants who missed the first part were removed from the pool and excluded from data collection. A second round of data was collected following this repeated procedure. A flow diagram of our quasi-experimental study is shown in Figure 1.

Flow Diagram of the Quasi-Experimental Study.
Data analysis
To carry out data analysis, two researchers collaborated to import the Chinese poem recitation and retention data, conduct data visualization, and perform statistical analysis using Python. Data from the three students who were absent during the study were excluded. Descriptive statistics were calculated for poem recitation during the learning session for the experimental group (M = 10.73, SD = 6.69) and the control group (M = 11.92, SD = 6.56). After 1 week, retention scores increased for both groups: experimental (M = 18.23, SD = 3.12) and control (M = 16.36, SD = 5.36).
We plotted histograms to present the distributions of students’ performance for immediate recall and at a 1-week interval (see Figure 2), and we used a kernel density plot (see Figure 3) to show the distributions of score improvement for the experimental group and the control group. Notably, both histograms showed that recitation performance among all students was heavily skewed and did not follow a normal distribution. We then further assessed the validity and statistical assumptions for the analyses. We conducted a Shapiro–Wilk test for immediate recall and retention recall (at a 1-week interval), and the results indicated that both the immediate recall and retention recall violated the assumption of normality (p < .001). Accordingly, the Wilcoxon signed-rank test was mainly used to address the first research question, and the Mann–Whitney U-test and quantile regression models at different quantiles were used to address the remaining questions. To test the validity of our hypotheses, we assessed multicollinearity using the variance inflation factor (VIF). We found that students’ previous Chinese scores and their retention results exhibited very high VIF values of 18.07 and 17.29, respectively, both exceeding the threshold of 10.0. This suggests that students’ recent Chinese scores were highly correlated with their retention scores.
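To make the VIF threshold concrete, the following minimal sketch computes the VIF for the special case of two predictors, where VIF = 1 / (1 − r²) and r is their Pearson correlation. It is an illustration under that simplifying assumption, not the authors’ analysis script; the function names and the toy data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors (assumption)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


def r_squared_from_vif(vif):
    """Invert VIF = 1 / (1 - R^2) to recover the implied R^2."""
    return 1.0 - 1.0 / vif


# Hypothetical toy data with r = 0.8
print(vif_two_predictors([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))
# Implied R^2 for the threshold and for a reported value
print(r_squared_from_vif(10.0))
print(r_squared_from_vif(18.07))
```

Read this way, the conventional threshold of 10 corresponds to an R² of .90 between a predictor and the remaining predictors, and a VIF of 18.07 corresponds to an R² of roughly .94.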

Histograms of Poem Recitation Scores (Immediate Test) and Retention Scores (After a 1-Week Duration).

Kernel Density Plot of Score Improvement of the Experimental and Control Groups.
Results
RQ1: Does the use of musical mnemonic device facilitate better immediate recall of a poem among second-grade students, as opposed to traditional learning techniques?
To investigate the differences in the immediate recall outcome between the experimental group and the control group, we conducted a Mann–Whitney U-test for the immediate poem recitation scores. The results (Table 2) did not show a statistically significant difference in the recitation performance during the class between the experimental group and the control group (U = 2025.0, p > .05). Therefore, the data indicated that there was no significant relationship between the use of musical mnemonic device during working memory tasks and immediate memorization accuracy among second-grade students, and H1 cannot be supported. We also considered the variable “gender” and divided the sample into female and male students. The result of Mann–Whitney U-test (U = 2631.0, p < .05) and the corresponding boxplot (Figure 4; left) indicated that the rank sums of the immediate recall between female and male students differ significantly, and female students scored significantly higher in the immediate recall. Within both the female and male populations, we found that the rank sums of the immediate scores between the experimental and control groups did not differ significantly (U = 471.0, p > .05 for females; U = 547.0, p > .05 for males).
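For readers unfamiliar with how a Mann–Whitney U statistic is obtained from rank sums, here is a minimal standard-library sketch. It assumes midranks for ties and a normal approximation with tie correction and no continuity correction; the paper does not state which routine or which of the two U values was reported, and the function names and scores below are hypothetical.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist


def midranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U for x, U for y, two-sided p via normal approximation)."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2  # rank sum of x minus its minimum
    u2 = n1 * n2 - u1
    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    sigma = sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    if sigma == 0:  # every score identical: no evidence of a difference
        return u1, u2, 1.0
    z = (u1 - n1 * n2 / 2) / sigma
    return u1, u2, 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical scores out of 20 for two small groups
u_x, u_y, p = mann_whitney_u([20, 20, 18], [20, 15, 10])
print(u_x, u_y, round(p, 3))
```

Because the two U values always sum to n1 × n2, either one can be reported; which group’s U appears depends on the software convention.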
Mann–Whitney U-Test Results.
Note. EG: experimental group; CG: control group; Gender distribution: 66 female (50.0%), 66 male (50.0%). Students learning a musical instrument: n = 11 (8.1%); not learning: n = 124 (91.9%).
*p < .05, **p < .01.

(Left) The Boxplot for the Immediate Test Scores of Female and Male Students. (Right) The Boxplot for the Immediate Recall Scores of Students Who Had Instrumental Training and Without Instrumental Training.
While investigating the effect of learning musical instruments on immediate recall, we found that the rank sums of the immediate scores differed significantly between students who had previously learned instruments (n = 11; 8.1%) and those who had not (U = 1025.0, p < .01). The boxplot (Figure 4, right) indicated that students who had previously learned instruments scored significantly higher in the immediate recall than those who had not, whereas no such effect was found for choir participants. Therefore, H3 can be supported for instrumental learning.
RQ2: Does the use of musical mnemonic device facilitate memory retention after a 1-week interval, as opposed to traditional learning techniques?
At a 1-week interval, we repeated the experimental procedure with both groups. A substantial number of students achieved the maximum score on the Chinese poem recitation task, with over half obtaining full marks. This distribution produced a ceiling effect that limited score variability at the upper end. Given this ceiling effect, we applied non-parametric tests to assess differences in retention between the experimental and control groups more accurately. We employed the Wilcoxon signed-rank test to examine differences in retention scores between the experimental group and the control group. The results indicated that the test scores of the experimental group after 1 week of teaching with music were higher than those of the control group (p < .01). We then conducted a Mann–Whitney U-test (Table 2) to further address this research question. Because the p-value was less than .05 (U = 2587.5), we have sufficient evidence to claim that the rank sums of the retention test scores between the experimental and control groups differ significantly. Therefore, we concluded that using music as a teaching medium significantly improved students’ retention of a Chinese poem after a 1-week interval, and H2 can be supported.
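The Wilcoxon signed-rank test operates on paired observations, such as each student’s immediate and 1-week scores. The sketch below is a minimal standard-library illustration, assuming zero differences are dropped, tied absolute differences receive midranks, and the p-value comes from a normal approximation without continuity correction; the function name and scores are hypothetical.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist


def wilcoxon_signed_rank(before, after):
    """Return (W+, W-, two-sided p) for paired scores."""
    diffs = [a - b for b, a in zip(before, after) if a != b]  # drop zeros
    n = len(diffs)
    if n == 0:
        return 0.0, 0.0, 1.0
    abs_sorted = sorted(abs(d) for d in diffs)
    # Map each absolute difference to the mean rank of its tie group
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs_sorted[j + 1] == abs_sorted[i]:
            j += 1
        rank_of[abs_sorted[i]] = (i + j) / 2 + 1
        i = j + 1
    w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    w_minus = sum(rank_of[abs(d)] for d in diffs if d < 0)
    tie_term = sum(t ** 3 - t for t in Counter(abs_sorted).values())
    sigma = sqrt(n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48)
    z = (w_plus - n * (n + 1) / 4) / sigma
    return w_plus, w_minus, 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical immediate and 1-week scores for five students
immediate = [10, 12, 8, 15, 11]
week_later = [18, 20, 9, 14, 19]
print(wilcoxon_signed_rank(immediate, week_later))
```

A large W+ relative to W− indicates that most students improved between sessions. With samples this small an exact test would be preferable to the normal approximation used here.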
To further explore the impact of using a musical mnemonic device on the memory performance of the experimental group and the control group, we created a “difference” variable for each group to represent the change in memorization scores across the 1-week interval. Results of the Mann–Whitney U-test indicated that the rank sums of this difference variable differed significantly between the experimental group and the control group (U = 2656.5, p < .05). The boxplot (Figure 5; left) of the memorization differences for each group showed that the improvement of students who had music as a teaching medium (experimental group) was significantly greater than that of the control group.

(Left) The Boxplot for Retention (Week 2) Scores Between Experimental Group and Control Group. (Right) The Boxplot for Retention Scores (Week 2) of Male Students Between the Experimental Group and the Control Group.
We also considered the variable “gender” and conducted a Mann–Whitney U-test for the female and male population. The results indicated that within the female population, the rank sums of the retention scores between the experimental and control groups did not differ significantly (U = 532.5, p > .05), while the rank sums of the retention scores between the experimental and control groups differed significantly within the male population (U = 767.5, p < .001). From the boxplot (Figure 5, right) for retention test scores of the experimental group and the control group with the male population, we concluded that within the male population, the students who had music as a teaching medium demonstrated significantly higher accuracy than those in the control group. Overall, the findings support H2 that the use of musical mnemonic devices during working memory tasks can increase student accuracy retention, especially for male students, compared to those completing tasks without music after a 1-week interval.
RQ3: To what extent do background variables, such as prior musical training and recent Chinese language achievement, predict students’ performance in poem recall and retention?
To explore how the effects of teaching method and prior academic performance vary across levels of student performance, we employed quantile regression. This approach does not assume a normal distribution of scores and is well-suited for detecting differences in effects at specific points of the outcome distribution (Taddy & Kottas, 2010). We modeled poem retention scores at the 10th and 90th percentiles using the same set of predictors. When estimating the 90th percentile, the model revealed no significant effects from any predictor variables (p > .05). This finding coincides with the observation that more than half of the students scored full marks, indicating a ceiling effect. Such high performance at the top end constrained the variability in scores, making it difficult to detect any influence of teaching method or prior achievement.
In contrast, the 10th percentile model showed statistically significant effects for both the teaching method and recent Chinese language scores. Music-based instruction was associated with a 5.11-point increase in retention scores (p < .001). Recent Chinese language achievement also positively predicted poem retention (b = .31, p < .001). These results suggest that for students with lower baseline performance, music-based teaching and strong prior academic performance can meaningfully improve their ability to retain and recite classical Chinese poetry. This supports H4 and demonstrates that targeted instructional methods may especially benefit students at risk of lower performance (see Table 3).
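The contrast between the 10th and 90th percentile models can be illustrated with the check (pinball) loss that quantile regression minimizes. The sketch below is a covariate-free simplification, not the authors’ model: with only a group indicator as predictor, the fitted effect at quantile τ reduces to the gap between the two groups’ τ-quantiles. All scores and function names are hypothetical.

```python
def pinball_loss(q, ys, tau):
    """Check loss of candidate quantile q for observations ys at level tau."""
    return sum(tau * (y - q) if y >= q else (1 - tau) * (q - y) for y in ys)


def fitted_quantile(ys, tau):
    """Observed value minimizing the check loss (an optimum is always at a data point)."""
    return min(sorted(set(ys)), key=lambda q: pinball_loss(q, ys, tau))


def group_effect(treated, control, tau):
    """Quantile-regression coefficient of a lone group indicator at level tau."""
    return fitted_quantile(treated, tau) - fitted_quantile(control, tau)


# Hypothetical retention scores (maximum 20) with many full marks
music = [8, 11, 13, 15, 17, 18, 19, 20, 20, 20, 20]
control = [3, 5, 9, 12, 14, 16, 18, 19, 20, 20, 20]

print(group_effect(music, control, 0.10))  # 6
print(group_effect(music, control, 0.90))  # 0
```

In this toy example the group gap is visible at the 10th percentile but vanishes at the 90th, where both groups sit at the maximum score, which mirrors how a ceiling effect can mask predictor effects in the upper tail.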
Quantile Regression of the Chinese Poem Immediate Recall Results and Retention Recall Results.
Note. Standard errors are reported in parentheses, with robust standard errors included in all models.
**p < .01, ***p < .001.
Discussion
This study explored the effects of using music as a mnemonic device for Chinese second-grade students. The study involved two groups: one receiving music-based instruction and the other receiving traditional verbal instruction on a Chinese poem. The students were then assessed on their recitation accuracy. While existing studies have explored the role of music in enhancing memory (Butzlaff, 2000; Koelsch, 2012; McMullen & Saffran, 2004), many have only employed music passively as background music (e.g., Ferreri et al., 2015) or instrumental training (e.g., Roden et al., 2014), rarely integrating music actively in the classroom setting as a tool for all students, regardless of their music background. Our research contributes to the limited body of literature of using music as a mnemonic device within a classroom setting to determine its effectiveness in enhancing memory retention. Knott and Thaut (2018) investigated whether musical mnemonics improved verbal memory in children more effectively than spoken words. Their study found that children who learned through songs for 2 consecutive weeks exhibited better recall of the poem than those who learned through spoken words. This suggests the potential of musical structure as a mnemonic device. However, our initial findings revealed no significant difference in immediate recall performance between the two groups.
From our observations, the use of music posed some challenges for students in the experimental group. For instance, despite the tempo reduction made after the pilot study, some students still found the tempo too fast to comfortably recite the poem from memory with the new melody. These students struggled to keep up and eventually fell out of rhythm, which made it harder for them to recite the poem accurately. In addition, the cognitive load required to process both the song and the memorized material might have been too high for some students, particularly those who were less familiar with the content or the musical structure. This increased cognitive load could have interfered with their ability to encode and recall the information effectively in the short term (e.g., de Groot & Smedinga, 2014; Jäncke & Sandmann, 2010; Moreno & Mayer, 2000).
For the second part of our study, a repeated quasi-experiment was conducted 1 week later with the same groups of students. Our results aligned with those of Schön et al. (2008), suggesting that the integration of linguistic and musical information in songs may facilitate more effective learning than speech alone. This indicates that while the use of songs as a mnemonic device may not yield immediate results, it could enhance retention over time more than traditional speech-based methods. As students became more familiar with the melody, the cognitive load may have decreased, allowing for better integration of the musical and verbal information and leading to improved recall, as observed in the 1-week follow-up.
The results also showed that students with musical training, particularly instrumental training, outperformed their peers without such training in the first part of the experiment. This aligns with studies suggesting that music training enhances working memory (Moreno et al., 2011; Roden et al., 2014). Students with musical training may have developed better auditory and cognitive skills, which could help them process and remember the poem more effectively. However, we did not find evidence that choir participation fully supports this hypothesis. Because many students had only recently joined the school choir in second grade and had attended a limited number of rehearsals, we speculate that their training was too brief to show an effect; the effects of choir training will need further investigation.
Students with higher Chinese scores performed notably better both in reciting Chinese poems during the class and in retaining the poems after a week. This is consistent with the findings of Aronen et al. (2005), who reported that children with poor working memory performance, particularly in audiospatial memory, were more likely to experience academic and attentional or behavioral difficulties at school than children with strong working memory abilities. As such, higher-performing students likely possess stronger language skills and memory capabilities, which contribute to their better performance in memorization tasks (Caplan, 2016; Hussey et al., 2017; Payne & Stine-Morrow, 2017). Therefore, we employed separate quantile regression models to focus on a high-performing group of students who fully retained the Chinese poems and a group of students with low retention scores. We chose to regress on the 10th percentile and the 90th percentile in order to see the differential effect of using music as a teaching method. Notably, at the 10th percentile, students’ retention of Chinese poems was significantly boosted by incorporating music after 1 week. Therefore, when students struggle to remember class material, integrating music as a teaching medium can potentially serve as an alternative learning experience that is conducive to language learning and may improve their working memory over time.
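The logic of comparing the 10th and 90th percentiles can be illustrated with a minimal sketch. It is not the model reported in this study: the scores are invented for illustration, the names (`check_loss`, `fitted_quantile`, `group_effect`, `verbal`, `music`) are hypothetical, and the only predictor is group membership, whereas the reported models also included recent Chinese language scores. Under that simplifying assumption, the quantile regression coefficient at level tau reduces to the difference between the two groups’ tau-th quantiles, each of which minimizes the check (pinball) loss.

```python
def check_loss(residual, tau):
    """Check (pinball) loss for one residual at quantile level tau."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def fitted_quantile(scores, tau):
    """Constant that minimizes the summed check loss over the scores.

    The summed loss is piecewise linear in the constant, so a minimizer
    always lies at an observed value; searching those values is enough.
    """
    def total_loss(candidate):
        return sum(check_loss(y - candidate, tau) for y in scores)
    return min(sorted(set(scores)), key=total_loss)


def group_effect(control, treated, tau):
    """Coefficient on a 0/1 group indicator in a quantile regression at tau,
    assuming the indicator is the only predictor."""
    return fitted_quantile(treated, tau) - fitted_quantile(control, tau)


# Invented retention scores (NOT the study's data), 12 students per group.
verbal = [10, 22, 30, 41, 47, 52, 56, 58, 59, 60, 60, 60]
music = [18, 31, 38, 45, 50, 54, 57, 58, 59, 60, 60, 60]

effect_low = group_effect(verbal, music, tau=0.10)   # lower tail
effect_high = group_effect(verbal, music, tau=0.90)  # upper tail
print(effect_low, effect_high)
```

In this toy data the group difference appears only in the lower tail (9 points at the 10th percentile, 0 at the 90th). A ceiling on scores can produce this kind of pattern, which a regression on the mean would average away.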
Implications
Our research highlighted that the enhancement was significant among students in the lower percentile of achievement. We suggest that incorporating music as a teaching tool or mnemonic device could be especially beneficial for students who typically struggle at first with memorization tasks or facts. This approach may not only help students retain information better but also improve learning engagement and motivation (e.g., An et al., 2014; Israel, 2013; Nadera, 2015). Furthermore, integrating music into the curriculum may provide a more inclusive and supportive learning environment for all students, fostering a greater interest in and appreciation for the subject matter. Extending beyond educational settings, these findings also carry potential implications for clinical practice. In clinical contexts, musical mnemonic strategies may serve as an effective and cost-efficient approach to supporting children’s working memory and verbal retention, particularly in language-related tasks.
Limitations
In the quasi-experimental study, we evaluated students’ performance immediately after their learning and 1 week later. Longitudinal research is needed to better assess the long-term benefits of incorporating music into children’s language development. Future studies should investigate whether music as a mnemonic device helps students retain material in the long term.
Furthermore, our study used a tonal language (Cantonese) to assess students’ language development. Although we identified factors that contribute to better retention of Chinese poems, these effects may not be generalizable to all languages (e.g., English and Japanese). For instance, Zhang et al. (2020) used the Musical Ear Test to compare melody and rhythm perception among native Chinese and Japanese speakers. Although overall accuracy was comparable, Chinese participants showed superior performance in melody discrimination, while Japanese participants excelled in rhythm tasks. These group-level and within-group differences suggest that native language experience shapes specific aspects of musical perception. Similar studies could be conducted across languages and cultures to investigate the use of music as a mnemonic device.
Moreover, it is possible, though very unlikely, that some students reviewed the immediate recall materials between sessions. We did not instruct students to review any materials, no handouts of the poem were provided, and students were not informed about the repeated procedure in the second week. Nevertheless, some students may have shared what they learned with their parents after the experiment. Motivated students who revisited the material on their own may therefore have influenced the outcomes, potentially affecting the validity of the results, as this behavior was beyond the researchers’ control.
Last, although we used music as a mnemonic device to help children recite the poem, we do not fully understand how music enhances working memory over a 1-week period. For instance, it is possible that music specifically acts as a facilitating encoding context for verbal episodic memory (Ferreri et al., 2015), or that rhythms induce beat-related entrainment, thereby improving short-term memory of middle-list items in a serial recall task (Jirakittayakorn & Wongsawat, 2017). If beat-related entrainment is the mechanism, further research should investigate the optimal tempo for students.
Suggestions for future research
Future studies should isolate specific factors to determine exactly how music aids working memory in children. We also recommend that more studies be conducted in classroom settings to increase their impact in real-world scenarios. Most importantly, while the findings suggest a potential relationship, we refrain from asserting direct causality, as additional factors and variables may contribute to the observed outcomes. We further recommend that future studies compare musical mnemonics with other memory-enhancing strategies, such as rhythm-based chanting, storytelling, visual imagery, or movement-based encoding, to determine their relative effectiveness and isolate music-specific effects. Finally, cross-linguistic investigations are warranted to explore whether mnemonic benefits differ between speakers of tonal and non-tonal languages, offering insights into the interaction between linguistic background and music-based learning.
Conclusion
With this research, we aimed to help close the gap in understanding the roles of working memory and music education in the classroom setting. While initial findings indicated no significant difference in immediate performance between the groups, notable improvements were observed in the music-based group after a 1-week interval with repeated procedures, particularly among male students. In addition, students with instrumental training significantly outperformed their peers without such training, further supporting the idea that music training could enhance working memory. Not surprisingly, higher academic performance in Chinese also predicted better recitation accuracy. Moreover, the retention of Chinese poems, especially for students in the lowest 10th percentile of accuracy, was significantly enhanced by incorporating music after a 1-week interval. This could imply that there is an optimal time for using music as a mnemonic device in the classroom. In conclusion, the findings indicate that music-based instruction can function as an effective mnemonic support, enhancing delayed retention and benefiting students with lower initial achievement, including those without musical backgrounds, when implemented through repeated instruction. We hope that our study can contribute to a growing body of evidence supporting the intersection of music and memory, highlighting music’s potential not only as a pedagogical tool but also as a broader cognitive scaffold for learning and development.
Footnotes
Acknowledgements
The corresponding author presented partial findings from this study at the 2025 Suncoast Music Education Research Symposium at the University of South Florida. The authors are grateful for the valuable feedback provided by the symposium committee and the reviewers, which has helped to further strengthen this manuscript.
Ethical considerations
This study was approved by the University of Macau Ethics Committee (No. HE-0683-2025).
Consent to participate
Informed consent was obtained from all participants and, in the case of minors, from their parents or legal guardians. Participation was voluntary, and participants were informed of their right to withdraw at any time without consequence.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
The data that support the findings of this study are not openly available for reasons of participant confidentiality but are available from the corresponding author upon reasonable request. Data are stored in secure data storage at the University of Macau.
