Abstract
Emotion-enhanced memory occurs when an arousal response to an emotion stimulus strengthens memory consolidation. We tested whether listening to emotionally arousing music enhanced memory in this way. In a within-subjects design, 37 participants (18 to 50 years, 22 female) listened to two of their own highly enjoyed music tracks, two self-rated neutral tracks from other participants’ selections, and a five-minute radio interview. After each listening episode, participants memorised a unique array of 24 images. Subjective and physiological emotional arousal was monitored throughout the experiment and free recall of all images within the five image arrays was tested at the end. As predicted, compared to the music and non-music controls, self-selected enjoyed music elicited greater subjective and physiological changes consistent with emotion, and more details from images presented after enjoyed music were recalled than after listening to the radio interview. A multiple regression analysis revealed that physiological changes consistent with an emotional arousal response to enjoyed music reliably predicted memory. Further research with larger samples is needed to replicate these exploratory findings.
One of the most pleasurable features of music, and a primary reason for listening to it, is its ability to elicit emotion (Gabrielsson, 2001; Juslin, Liljestrom, Vastfjall, Barradas, & Silva, 2008; North, Hargreaves, & Hargreaves, 2004). Perhaps some of the pleasure from listening to emotionally arousing music comes from the physical responses it elicits, such as changes in facial expression, body movement, and the experience of chills/thrills (Grewe, Kopiez, & Altenmuller, 2009; Hodges, 2010; Rickard, 2004; Salimpoor, Benovoy, Longo, Cooperstock, & Zatorre, 2009). We may continue listening to music because it activates reward centres in the brain (Blood & Zatorre, 2001; Koelsch, 2014; Salimpoor, Benovoy, Larcher, Dagher, & Zatorre, 2011). As emotion is associated with enhanced autobiographical memory (Cahill et al., 1996; Cahill, Prins, Weber, & McGaugh, 1994; McGaugh, 2003), responding emotionally to music in everyday settings may also have the effect of modulating autobiographical memory related to the music listening experience.
The effect of emotion on memory is well documented in animal and human studies (Hermans et al., 2014; LaBar, 2007; McGaugh, 2003; Rickard, Toukhsati, & Field, 2005; Roozendaal, McEwen, & Chattarji, 2009; Sandi & Pinelo-Nava, 2007). The central tenet of the emotion-enhanced memory model is that arousal hormones released during an emotional event modulate the consolidation of memory for that event. The amygdala, a brain structure responsible for emotion processing, is critical to this memory modulating process. When we are faced with an emotion stimulus, a sub-set of nuclei in the central amygdala innervates the hypothalamus and autonomic nervous system (ANS), leading to adrenal gland secretion of arousal hormones (epinephrine, norepinephrine, and cortisol) into the bloodstream and increased sympathetic nervous system (SNS) activity to support physical responding (e.g. fight or flight). The adrenal secretion of epinephrine activates β-adrenergic receptors on the ascending vagus nerve, which projects to the nucleus of the solitary tract (NTS) located in the brain stem. The NTS in turn has norepinephrine projections to the locus coeruleus, which is a major site of norepinephrine projections throughout the brain, including the baso-lateral amygdala (BLA). Activation of β-adrenergic receptors within the BLA reinforces noradrenergic projections throughout the cortex, including the hippocampus, a structure critical to the consolidation of long-term memory. These “waves” of autonomic activity (Cahill & McGaugh, 1998) provide sufficient activation for gene transcription and protein synthesis required for cell growth in the hippocampus and entorhinal cortex that consolidate the memory trace.
The memory modulating effect of adrenal hormones in humans has been demonstrated in a seminal paper by Cahill et al. (1994). Cahill and colleagues revealed that long-term autobiographical memory for an emotional story could be attenuated when an adrenal stress response was blocked with a β-adrenergic receptor blocker (beta-blocker). In their double-blind placebo-controlled experiment, four groups of participants viewed either a neutral story or a closely matched emotional story after being administered either a placebo or a beta-blocker. Subjective reports confirmed that the emotional story elicited an emotional response; however, after a one-week delay, a surprise long-term recognition memory test revealed that participants who viewed the emotional story after being administered the beta-blocker recalled less of the emotional story elements than those administered the placebo. Blocking the effect of adrenal hormones therefore attenuated emotion-enhanced memory. Emotional arousal effects on autobiographical memory have since been reported by several researchers. For instance, studies measuring autonomic responses to emotional stories, still images, or films reveal that increases in heart rate and/or skin conductivity and decreases in heart rate variability are associated with enhanced recognition and free recall of the emotion stimuli (Abercrombie, Chambers, Greischar, & Monticelli, 2008; Buchanan, Etzel, Adolphs, & Tranel, 2006; Cahill & Alkire, 2003; Laney, Campbell, Heuer, & Reisberg, 2004; Palomba, Angrilli, & Mini, 1997; Vecchiato et al., 2010).
Music can also elicit strong emotional responses and may offer utility in testing the neurobiological model of emotion-enhanced memory. Listening to music elicits subjective responses along the important valence and arousal dimensions of emotion (Russell, 1980; Schubert, 1999), and elicits changes in emotional behaviour and expression, activates systems in the central and peripheral nervous systems consistent with emotion, and activates brain regions involved in emotion-enhanced memory (Blood & Zatorre, 2001; Hodges, 2010; Koelsch, 2014; Salimpoor et al., 2009; Salimpoor et al., 2011). Several researchers have revealed that emotion-inducing music increases SNS activity, indexed by increased perspiration (skin conductivity) and heart rate (e.g. Baumgartner, Esslen, & Jancke, 2006; Ellis & Simons, 2005; Iwanaga & Moroki, 1999; Khalfa, Peretz, Blondin, & Manon, 2002; Krumhansl, 1997; Lundqvist, Carlsson, Hilmersson, & Juslin, 2009; Mockel et al., 1994; Rickard, 2004; Taylor, 1973; Vanderark & Ely, 1993), and decreases parasympathetic nervous system (PNS) activity, indexed by decreased heart rate variability (HRV; Riganello, Candelieri, Quintieri, & Dolce, 2010). Further, the experience of chills (also known as goose bumps, thrills, or frisson) while listening to music is frequently reported, and the experience of these chills is correlated with autonomic changes (Grewe et al., 2009; Grewe, Nagel, Kopiez, & Altenmuller, 2005; Nagel, Kopiez, Grewe, & Altenmuller, 2008) and reward related brain activity (Huron & Margulis, 2010; Salimpoor et al., 2009). The chills response has therefore been proposed as a valid and simple measure of music-induced autonomic activation (Grewe et al., 2009). The combination of increased SNS and decreased PNS activity, and the experience of chills when listening to emotionally powerful music could thus signal an emotion response. 
The emotion response could in turn enhance memory consolidation via the neurobiological mechanism of emotion-enhanced memory. However, to the best of our knowledge, this possibility has not been tested.
Although there is potential for music-induced emotion to be experimentally manipulated to enhance memory, there are also significant obstacles to obtaining a valid emotion response. For instance, many music-emotion researchers have used tightly controlled music excerpts to improve interpretation of findings in terms of the emotion properties of the music rather than other factors, such as general arousal caused by dynamic changes in music tempo and volume (Schubert, 2004; Thompson, Schellenberg, & Husain, 2001). However, due to wide variation in individual music preferences, this method may be limited in its emotion-eliciting efficacy. To overcome this methodological problem, a procedure was developed by Salimpoor and colleagues (Salimpoor et al., 2011; Salimpoor et al., 2009) that controlled for the potentially confounding structural characteristics of different self-selected music excerpts. In the procedure, participants were asked to select their own music as the emotion stimulus, and that same music acted as a non-emotion control for another participant (a pre-rating procedure was completed by each participant first to determine which music tracks in the pool were considered neutral). The music selections within the experimental pool therefore acted as both an emotion stimulus for one participant, and a non-emotion control for another. Differences between conditions in autonomic changes and brain activation consistent with emotion were thus more confidently attributed to emotion elicited by the music.
In the current study, the aim was to explore whether listening to highly enjoyed self-selected music before a memory task enhanced the early stages of long-term memory consolidation. We also sought to determine whether memory differences were explained by autonomic changes consistent with an adrenal response, and by extension, neurobiologically modulated emotion-enhanced memory. It was hypothesised that relative to a non-music active-listening control and to other participants’ music selections, there would be greater recall of images presented after listening to self-selected highly enjoyed music. Further, it was hypothesised that image recall would be predicted by increased SNS activity (greater chills frequency and increased skin conductance levels and heart rate) and decreased PNS activity (decreased HRV).
Method
Participants
Fifteen males and 22 females aged 18–50 years (M = 35.03, SD = 9.76) with normal or corrected to normal vision and hearing were recruited through university newsletters and posters or by word of mouth. Participation was voluntary with no reward or incentive provided. Sixty-two percent had played a musical instrument (48% had played within the last year) and 92% had completed tertiary education. Ethical approval for the study was granted by the Monash University Standing Committee on Ethics in Research Involving Humans (reference number CF09/1737 – 2009000961).
Materials and apparatus
Two participant-selected enjoyed music tracks (participants’ music; PM) acted as both the emotion stimuli for that participant and the neutral stimuli for another participant (others’ music; OM). A 5-minute excerpt from a national radio science show acted as a non-music active listening control. All stimuli were presented through closed headphones and volume was adjusted by the participant to a range within approximately 50–70 decibels.
Target information for the memory test consisted of five 4 by 6 arrays of International Affective Picture System (IAPS) digital images presented on a 19” computer monitor. The IAPS catalogue of over 1000 images was created by Lang, Bradley, and Cuthbert (1995) to provide researchers with a set of normative stimuli rated on the dimensions of pleasure, arousal, and dominance. In the current study, only images rated as moderate in arousal (within the range of z ± 0.5) were used. An equal number of positive, negative, and neutral images drawn from the semantic categories of animals, buildings, everyday objects, people, and nature scenes were resized to 3.13 cm high by 4.17 cm wide. The location of the 24 images within each array was randomised.
An emotional response to the music and non-music stimuli was determined by changes in subjective feelings of affective valence and arousal, the number of chills experienced, and autonomic changes. Subjective feelings of valence and arousal were reported immediately after each stimulus presentation on 11-point scales presented in a grid format. Valence was represented on a horizontal axis anchored by “-5” (very negative) to “5” (very positive), with zero indicating neither negative nor positive. Arousal was represented on a vertical axis anchored by “-5” (very deactivated) to “5” (very activated), with zero indicating neither activated nor deactivated. The number of chills experienced (chills frequency) was also reported immediately after each stimulus presentation. (Ideally, chills should be recorded continuously throughout the listening period to allow analysis of correlations with peaks in physiological responses. However, due to equipment limitations we were unable to measure chills at this level of precision.)
Autonomic changes were captured with psychophysiological recording equipment (Bioview V2.11 Series IV, Zencor, 1998). Skin conductance was recorded from sensors attached to the distal phalanx of the second and fourth fingers of the non-dominant hand, and heart rate changes were detected by a light-dependent resistor attached to the left ear lobe. Movement artefacts were reduced by anchoring the heart rate lead to participants’ clothing. All recording channels were digitised at an output rate of 2 Hz. Data submitted for offline analysis included the period from 5 seconds after stimulus onset to 5 seconds before stimulus offset. The removal of the initial and final 5 seconds of each stimulus minimised the influence of orienting responses in the data. Increased SNS activity was indexed by increased mean skin conductance levels (SCL) and skin conductance responses (SCR), and decreased intervals between adjacent heart beats (inter-beat interval [IBI], analogous to increased heart rate). SCR was calculated by summing the number of skin conductance peaks greater than .05 micro Siemens (μS) per 10 second epoch. This procedure controls for inherent downward drift in skin conductance levels over time (Duncko, Cornwell, Cui, Merikangas, & Grillon, 2007). Decreased PNS activity was indexed by decreased HRV, determined by calculating the root mean square of successive heart beat interval differences (RMSSD; Task Force of the European Society of Cardiology and the North American Society of Pacing and Electrophysiology, 1996).
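The indices described above can be illustrated with a minimal sketch. It is not the software used in the study: the function names are hypothetical, and the rule for identifying a skin conductance peak (a local maximum rising more than the threshold above the preceding trough within the same epoch) is an assumption, since the text specifies only the .05 μS threshold and the 10-second epoch. RMSSD follows its standard definition.

```python
import math


def trim_window(samples, rate_hz=2, trim_s=5):
    """Drop the first and last `trim_s` seconds of a recording."""
    n = int(rate_hz * trim_s)
    return samples[n:len(samples) - n]


def mean_scl(scl_us):
    """Mean skin conductance level (microsiemens) over the analysis window."""
    return sum(scl_us) / len(scl_us)


def rmssd(ibi_ms):
    """Root mean square of successive inter-beat interval differences."""
    diffs = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    if not diffs:
        raise ValueError("need at least two inter-beat intervals")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def count_scrs(scl_us, rate_hz=2, epoch_s=10, threshold_us=0.05):
    """Count skin conductance responses in each epoch.

    Assumed rule (not taken from the source): a response is a local
    maximum that exceeds the lowest preceding value in the epoch by
    more than `threshold_us`.
    """
    per_epoch = int(rate_hz * epoch_s)
    counts = []
    for start in range(0, len(scl_us), per_epoch):
        epoch = scl_us[start:start + per_epoch]
        count = 0
        trough = epoch[0]
        for i in range(1, len(epoch) - 1):
            trough = min(trough, epoch[i - 1])
            is_peak = epoch[i] > epoch[i - 1] and epoch[i] >= epoch[i + 1]
            if is_peak and epoch[i] - trough > threshold_us:
                count += 1
                trough = epoch[i]  # start measuring the next rise from here
        counts.append(count)
    return counts
```

Counting responses within short epochs, rather than comparing absolute levels, is what makes the SCR index insensitive to slow downward drift in the signal.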
As the current experiment employed a within-subjects design, a difficult semantic fluency task was used to minimise rehearsal and maintenance of the images between conditions. The task involved participants naming as many items as possible in 60 seconds from a variety of semantic categories (e.g. clothing, transport, food). Responses were voice recorded and scored as a percentage of the maximum number of items described from each category.
To test the influence of participating in the experiment on subjective feelings of positive and negative affect, the short form of the International Positive and Negative Affect Schedule (I-PANAS-SF) developed by Thompson (2007) was administered before commencement of the experiment and again on completion. The I-PANAS-SF comprises five positive and five negative affect adjectives (e.g. inspired, determined, ashamed, and nervous). Participants were instructed to rate their level of agreement with the adjectives at the current time on a scale ranging from 1 (never) to 5 (always). Summation of the five positive adjective scores yields a positive affect index ranging from 5 to 25, and summation of the five negative adjective scores yields a negative affect index ranging from 5 to 25.
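The scoring rule described above amounts to a simple sum with a range check. The sketch below uses a hypothetical helper name and takes the five ratings for one subscale as a plain list. It does not reproduce the questionnaire items themselves.

```python
def affect_index(ratings):
    """Sum five 1-5 ratings into an affect index ranging from 5 to 25."""
    if len(ratings) != 5:
        raise ValueError("each I-PANAS-SF subscale has five items")
    if any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    return sum(ratings)


# Illustrative ratings only, not data from the study.
positive_affect = affect_index([3, 4, 3, 2, 3])  # five positive adjectives
negative_affect = affect_index([1, 1, 2, 1, 1])  # five negative adjectives
```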
Memory for the images was tested with a free recall test. Participants were instructed to write a detailed description of as many images as they could recall from the five image arrays. No time limit was placed on the memory test, although a guideline of around 20 minutes was given. Each image detail correctly recalled received a score of 1. Recall responses were scored independently by two judges with high inter-rater reliability (r = .96).
Procedure
The study included a pre-experimental music selection stage which was conducted at a time and place convenient to participants. The experimental session was conducted in the research laboratory to allow recording of physiological responses.
Pre-experimental music selection
Prior to the experimental session, participants were first required to select and return to the researchers two highly enjoyed music tracks. A pack containing instructions for music selection, music rating scales, and a demographic questionnaire was posted to 47 participants. Participants were instructed to select two music tracks that they intensely and consistently enjoyed, that were preferably “unusual” or “eclectic” (to reduce other participants’ liking due to familiarity), were between 2 and 5 minutes long, did not contain lyrics unless in an unfamiliar language (to avoid linguistic interference or mnemonic cues), preferably induced chills, were not associated with significant past events, and were of similar energy/activation levels. Participants rated their enjoyment (1 = not at all, 4 = neither, 7 = very much), chills frequency, and felt arousal and valence (grid format, −5 to 5), and also noted the timing of the “best one minute” of each track. Participants then copied their music selections onto a compact disc (CD), which they returned with the music ratings and demographic questionnaire to the researcher by pre-paid post. Eight participants who took longer than four weeks to respond were excluded from further correspondence.
On return of music tracks from the first 10 participants, a compilation of the “best one minute” excerpts from each member of the group less the participants’ own selections was copied onto a CD. Each compilation CD therefore contained a unique combination of 18 one-minute excerpts. The compilation CDs were posted to the 10 participants within the group along with a new set of music rating scales (as used in rating their own pieces) and a music familiarity rating scale (1 = not at all, 4 = somewhat, 7 = very). The procedure was repeated each time a new group of 10 participants returned their enjoyed music. Thirty-seven completed rating scales were returned to the experimenter by pre-paid post. The last group was constrained to seven participants due to a time limit for data collection. Two participants who took longer than four weeks to respond were excluded from further correspondence. From the returned music ratings, two “non-emotion” tracks were selected for each participant based primarily on the criteria that enjoyment ratings were around “4” (neither enjoyed nor not enjoyed) and the music was unfamiliar. Care was taken to match tracks as much as possible on ratings of valence and arousal.
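The pooling step can be sketched as follows, assuming each participant contributes two tracks and receives every track in the group except their own. The function names and the familiarity cut-off are hypothetical: the text states only that the chosen tracks were rated around “4” for enjoyment and were unfamiliar. The matching on valence and arousal described above is not modelled here.

```python
def build_compilations(selections):
    """Map each participant to all tracks in the group except their own.

    `selections` maps a participant id to that participant's track list.
    """
    return {
        pid: [t for other, tracks in selections.items() if other != pid
              for t in tracks]
        for pid in selections
    }


def pick_neutral(ratings, n=2, max_familiarity=2):
    """Choose the n unfamiliar tracks rated closest to 4 for enjoyment.

    `ratings` holds (track, enjoyment 1-7, familiarity 1-7) tuples.
    `max_familiarity` is an assumed cut-off, not taken from the source.
    """
    unfamiliar = [r for r in ratings if r[2] <= max_familiarity]
    ranked = sorted(unfamiliar, key=lambda r: abs(r[1] - 4))
    return [r[0] for r in ranked[:n]]


# A group of 10 participants with two tracks each yields 18 excerpts per CD.
group = {f"P{i}": [f"P{i}-track1", f"P{i}-track2"] for i in range(1, 11)}
compilations = build_compilations(group)
```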
The research laboratory experimental session
The laboratory session involved the experimental manipulation of emotion with music, the measurement of emotional responses, and the memory test. All testing was conducted in a climate controlled windowless room with constant humidity and temperature (21°C). On arrival, participants were seated in a firm armchair 100 cm from the computer monitor (adjusted to eye level), given an overview of the experimental tasks, connected to the physiological recording equipment, and given an opportunity to ask questions. To ensure participants fully attended to the picture encoding task, they were explicitly informed that their memory for details within each of the five image arrays would be tested at the end of the session. Instructions and stimuli were presented via Microsoft Office PowerPoint software.
The experimental session proceeded as follows: (1) presentation of a listening stimulus (one of four music stimuli or the non-music control; 2–5 min); (2) rating felt valence and arousal, chills, and enjoyment (1 min); (3) image encoding (2 min); (4) semantic fluency task (3 min); and (5) rating felt valence and arousal (30 s). All five steps were repeated four more times to yield a repeated measures design with five levels. The second set of valence and arousal ratings was recorded to confirm that levels had returned to baseline between conditions. After the last condition, participants were given the option of a 20-minute break before completing the memory test. Most participants completed the memory test within 20 minutes.
Due to variability in the way participants respond to music in different contexts, the two PM tracks and two OM tracks were presented consecutively and responses later averaged. The presentation order of the PM and OM conditions and the non-music control was quasi-randomised so that each was presented equally in the first and second position. The time to complete the full procedure was between 90 and 120 minutes.
Results
Data screening and manipulation checks
Data were analysed with the Statistical Package for the Social Sciences (SPSS; IBM) v22. Raw scores for the OM and PM conditions were averaged and then data were screened for missing values and within-subject outliers. One participant reported feeling extreme negative valence after listening to a PM track, while across participants there was negative skew in reports of subjective enjoyment, valence, arousal and chills for PM tracks. No transformations to the data were made as these were expected responses to highly enjoyed music and maintaining raw scores was deemed to enhance the integrity of interpretation in this case. An extreme chills response to others’ music for one participant was Winsorised and missing control physiological data for another were replaced with that participant’s mean across the remaining conditions. No other outliers were identified. All physiological data were lost for three participants due to equipment malfunction and heart rate data were lost for five additional participants due to high movement artefacts. The final sample was n = 35 for the skin conductance measures, and n = 29 for the heart rate measures. Variability across conditions was greater than usual for all measures due to the unique combination of music tracks presented to each participant. However, as ANOVA is a robust test when the assumption of normality is violated and sample sizes are equal (Howell, 2002), ANOVA was used to test differences between conditions for all variables.
Analysis of the I-PANAS-SF data with a time (before vs. after) by affect (positive vs. negative) by condition order (4) mixed ANOVA revealed that participants generally felt more positive (M = 14.40, SD = 3.89) than negative (M = 6.45, SD = 1.29) at the commencement of the experiment, F(1,31) = 166.30, p < .001, ηp2 = .84. No other main effects or interactions were significant. These results confirmed that the more positive affective states at the commencement of the experimental procedure remained constant throughout the experiment and thus were not influenced by the experimental tasks. Checks of the experimental manipulation revealed that PM was significantly more enjoyed (M = 6.46, SD = 0.77) than OM (M = 4.72, SD = 0.95) and the non-music control (M = 4.35, SD = 1.23), F(2,72) = 45.15, p < .001, ηp2= 0.56. There were no significant differences between conditions in semantic fluency, F(2,72) = 0.35, p = .71, post-task valence, F(2,72) = 1.28, p = .28, or post-task arousal, F(2,72) = 2.98, p = .06. A “condition order” (e.g. PM in first, second or third position) by “condition type” (control, OM or PM) mixed ANOVA failed to detect significant differences in memory performance based on the order in which the experimental stimuli were presented, F(6,66) = 0.80, p = .58. Primacy or recency effects can therefore be excluded as a cause of memory differences between conditions.
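The effect sizes reported above can be cross-checked from the F ratios and their degrees of freedom, using the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The helper below is a hypothetical convenience function, not part of the original SPSS analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Affect main effect, F(1,31) = 166.30
affect_effect = round(partial_eta_squared(166.30, 1, 31), 2)

# Enjoyment manipulation check, F(2,72) = 45.15
enjoyment_effect = round(partial_eta_squared(45.15, 2, 72), 2)
```

Both values agree with the reported effect sizes of .84 and .56.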
Music effect on multiple components of emotion
Table 1 shows the means and standard deviations for each of the emotion variables measured. Consistent with an arousal response, there were increases in subjective arousal, chills, skin conductivity and heart rate (decreased IBI), and decreased HRV from the non-music control to OM and then again to PM. Skin conductance response followed a different pattern, with more responsiveness to the control than PM, and the least responsiveness to OM. Consistent with a positive emotion response, valence ratings were most positive for PM, exceeding both the control and OM ratings. A series of one-way within-subjects ANOVAs was conducted to test whether these mean differences were significant (α = .05). Due to the repeated measures design, Huynh-Feldt adjustments to the F value were used when the assumption of sphericity was violated (for the chills and SCL measures). Significant ANOVAs were decomposed with planned comparisons comparing the non-music control to OM, and to PM. The ANOVAs revealed that the differences between group means were significant for valence, F(2,72) = 13.47, p < .001, ηp2 = 0.27, arousal, F(2,72) = 12.35, p < .001, ηp2 = 0.26, chills, F(1.28, 46.12) = 49.20, p < .001, ηp2 = 0.58, SCL, F(1.53, 52.03) = 11.81, p < .001, ηp2 = 0.26 and SCR, F(2,68) = 7.20, p < .01, ηp2 = 0.18, but not IBI, F(2, 58) = 0.10, p = .91, or HRV, F(2, 58) = 1.35, p = .27. The planned comparisons (see Table 1) revealed that mean OM valence did not differ significantly from the non-music control, whereas PM valence was rated as significantly more positive than the non-music control. Subjective arousal, chills, SCL and SCR for both the OM and PM conditions were significantly greater than the control.
Table 1. Mean and SD for subjective, physiological and memory measures in each condition.
Note. SCL = skin conductance level; SCR = skin conductance response; HRV = heart rate variability (RMSSD).
significantly different from the non-music control, p < .05.
Music effect on memory
Mean recall scores (presented in Table 1) increased from the non-music control to OM and then again to PM. The highest recall scores were for images presented after the PM condition. There also appeared to be higher recall scores for images presented after the OM condition than after the control condition. A one-way within-subjects ANOVA confirmed that the observed memory differences between conditions were statistically significant, F(2, 72) = 3.28, p < .05, ηp2 = 0.08. The OM recall scores did not differ significantly from control (p = .08). There was, however, a significant difference between PM recall scores and the non-music control (p = .03).
Pearson’s correlation coefficients were utilised to explore associations between the emotion and memory variables within each condition. SCL was correlated with recall in all three conditions (control r(35) = .37, p < .05; OM r(35) = .42, p < .05; PM r(35) = .43, p < .05), and HRV was correlated with recall in the PM condition (r(30) = −.44, p < .05). Significant correlations are modelled in Figure 1. There were also significant correlations between variables within the OM and PM conditions. OM valence and OM SCL were negatively correlated (r(35) = −.44, p < .01), indicating that as affect became more positive, SCL decreased. OM SCR was positively correlated with OM SCL (r(35) = .39, p < .05) and negatively correlated with OM HRV (r(35) = −.37, p < .05). OM HRV was negatively correlated with OM subjective arousal (r(30) = −.37, p < .05). The only significant correlation between PM variables was subjective arousal and chills (r(37) = .40, p < .05). The correlations between IBI and all other variables measured were not significant.

Figure 1. Model of memory predictors. Arrows represent significant correlations between variables (p < .05) for the conditions 1 (non-music control), 2 (others’ music) and 3 (participants’ music). SCL = skin conductance level; SCR = skin conductance response; HRV = heart rate variability (RMSSD).
A multiple regression analysis (MRA, enter method) was utilised to determine how much variance in memory scores within the PM condition could be explained by SCL and HRV. The chills frequency measure was not included in the analysis as it was not correlated with memory scores. Bias-corrected bootstrapping (1000 iterations) of the MRA confidence intervals was used due to the low sample size. All other assumptions for MRA were met. The MRA revealed that SCL and HRV accounted for 35.5% of the variance in recall scores (adjusted R2 = .31), and the model was a significant predictor of memory recall scores, F(2,27) = 7.44, p < .01. Both variables were also significant unique predictors of recall scores (see Table 2).
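The model summary above can be cross-checked with the standard formulas for adjusted R2 and the overall F test. The sketch assumes n = 30 cases, a value inferred from the reported error degrees of freedom (n − k − 1 = 27 with k = 2 predictors) rather than stated in the text. The function names are hypothetical.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R-squared for a model with k predictors fitted to n cases."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def overall_f(r2, n, k):
    """F statistic testing whether the k predictors jointly explain variance."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


# R2 = .355 with two predictors (SCL and HRV); n = 30 is inferred, not reported.
adj_r2 = adjusted_r_squared(0.355, n=30, k=2)
f_stat = overall_f(0.355, n=30, k=2)
```

Under these assumptions the adjusted R2 rounds to .31 and the F value falls within rounding error of the reported 7.44.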
Table 2. Coefficients for the participant-selected music (PM) MRA model predicting recall scores from SCL and HRV. 95% confidence intervals (in parentheses) were bootstrapped.
Note. SCL = skin conductance level; HRV = heart rate variability (RMSSD); sr2 = squared semi-partial correlations.
p < .05.
Discussion
People report that listening to music elicits strong emotional reactions. We hypothesised that a neurobiological mechanism underlying these reactions may also be modulating autobiographical memory for surrounding information. The attempt to experimentally manipulate emotional memory in this way was partially successful. Although the sample in the current study was small, self-selected emotionally arousing music was experienced by participants as more positive and physiologically activating than an active listening non-music control. However, others’ music previously rated with relative indifference was also physiologically activating, although the music failed to elicit change in subjective emotional valence, which is an important dimension of emotion. Exploration of predictors of memory performance revealed that increased arousal was positively correlated with memory performance in all three conditions. This result is consistent with an arousal dose-response effect on performance of a wide variety of mental and physical tasks, including memory (Baldi & Bucherelli, 2005). Importantly, we revealed that listening to self-selected highly enjoyed music was the only condition to elicit an association between reduced HRV and greater image recall. Reduced HRV (in the high frequency domain) signals increased SNS control of heart rate via the peripheral release of epinephrine and norepinephrine (Berntson, Quigley, & Lozano, 2007). It therefore appears that participant-selected enjoyed music was emotionally arousing and activated the adrenergic system, and that this activation facilitated memory for subsequently presented images.
The results of this study may contribute to explaining the phenomenon of music-cued autobiographical memory recall – the feeling of strong memories for specific places and events when certain music excerpts are heard (Janata, Tomic, & Rakowski, 2007). For instance, if a piece of music elicits an emotional response, then amygdala activation would increase SNS activity, including adrenal modulation of heart rate, and activation of adrenergic projections from the peripheral nervous system back to brain structures involved in memory consolidation. Information surrounding the music listening experience would then be strengthened. Hearing the same music in the future could reactivate the memory trace, cueing associated contextual details that were consolidated into long-term memory via the neurobiological adrenergic mechanism, and that under normal (non-emotional) circumstances could have decayed.
Limitations
The failure to find a correlation between chills and any of the physiological measures was unexpected in light of previous research (Huron & Margulis, 2010). For instance, time-series analysis conducted by Grewe et al. (2009) of chills responses and peaks in skin conductance while participants listened to emotionally arousing music revealed a reliable relationship between the two. It is possible that the method used here, which included responses only if they exceeded .05 micro Siemens within each 10 second period, was not sensitive enough to detect music-induced chills. Alternatively, retrospectively reporting the number of chills experienced may have been difficult for participants, especially if their attention was directed towards fully engaging with the music. The chills response is therefore ideally recorded throughout the listening period by a button press or some other non-intrusive method (e.g. Grewe et al., 2009; Salimpoor et al., 2009). Further, the failure to find an association between average heart rate, the other emotion variables, and memory is also likely due to the limited sensitivity of the measure. The function of the autonomic system is to maintain homeostasis; therefore an increase in heart rate in response to an emotion peak in the music is generally quickly stabilised by the PNS. Averaging heart rate over a 3- to 5-minute period smooths out these fluctuations and results in loss of sensitivity.
Although we found significant memory differences between emotionally arousing music and the non-music control, there was high variability in memory scores in our sample of 37 participants, thus reducing the effect size. A power analysis (power .8, alpha .05) for this mixed design indicated that a sample of 120 would be required to replicate the memory results. There was also higher than expected variability in subjective and autonomic responding, probably due to the large and varied pool of music used in this study (i.e. each participant was exposed to a unique combination of music tracks). Finally, we were able to establish that emotionally arousing music facilitated memory for images early in the consolidation process; however, we were unable to test whether the memory effect was sustained over a longer period. It is recommended that future research incorporate a long-term (e.g. one week) surprise memory test to determine whether emotionally arousing music – relative to non-music and non-emotion-music controls – elicits lasting change in autobiographical memory.
Conclusion
The results of this study demonstrate that self-selected highly enjoyed music elicited subjective and physiological responses consistent with emotional arousal. The participants’ own music selections elicited changes in both valence and arousal (two important dimensions of emotion), whereas others’ music elicited changes in arousal only. Furthermore, after listening to their own highly enjoyed music, participants recalled more details from an array of images presented soon afterwards. Physiological changes while listening to the music were consistent with an adrenergic response that may have signalled neurobiological modulation of memory. Although the design of the study was complex and required high participant involvement over a protracted period of time, the use of participant-selected music acting as both the music-emotion stimulus and the music-control meant changes in general arousal caused by listening to music could be controlled as an explanation for enhanced memory. Further, the inclusion of an active listening non-music control meant reductions in arousal caused by sitting in silence could be excluded as an explanation for memory differences between the music and non-music conditions (Schellenberg, 2012). The inclusion of multiple measures of emotion, particularly of the autonomic nervous system, enabled us to more confidently attribute the observed memory effect to the neurobiological model underlying emotion-enhanced memory. Although the sample size in the regression model was small, all assumptions were met and bootstrapping of the confidence intervals was applied to improve the reliability of the result. Further, a sizeable proportion of the variance in memory scores (30%) was accounted for by autonomic changes. However, as the power analysis indicates, replication of these findings with a larger sample is essential to ensure these conclusions are robust.
In sum, this exploratory study indicates that it is plausible that listening to emotionally arousing music modulates memory for contextual information, and may help to explain the phenomenon of music-cued autobiographical memory. In the era of ubiquitous music listening, this potential memory modulating effect of music warrants further investigation.
Footnotes
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
