Abstract
This study investigated the relationships between bodily expression, bodily coordination, mental effort, and self-reported experiences of musical togetherness. Singer-pianist duos performed pieces of classical repertoire before and after rehearsing together. Video, head motion, respiration, and pupillometry data were collected. Afterwards, the performers watched a video of their post-rehearsal performance and made continuous ratings of how together they felt. They also provided written descriptions of the cues that informed their togetherness judgements. Our analyses showed stronger head coordination between partners and larger pupil diameter during high-togetherness phrases than during low-togetherness phrases for one of the pieces. Inhalation synchrony and quantity of head motion did not differ between high- and low-togetherness phrases. Pupil size was greater during pre-rehearsal performances than during post-rehearsal performances, suggesting that demands on attention might have reduced after players had jointly constructed a shared interpretation and grown more accustomed to performing together. A thematic analysis of written responses showed that performers’ concept of togetherness related to coherence in musical parameters, moving and breathing together, and sensations of feeling together, shared musical emotion, and effortlessness. We discuss these findings in relation to the musical togetherness model.
Ensemble musicians create shared rhythms through their sound and body motion. It has been theorised that shared body rhythms support a sense of musical togetherness (Bishop, 2024). Musical togetherness refers to a sense of musical connection that arises during real-time interaction and results in feelings of social connection and aesthetic pleasure. It makes group music-making intrinsically motivating and is part of what performing ensembles aim to express through their playing (D’Amario & Bishop, 2025).
A large literature has explored the ways in which ensembles express togetherness. For classical ensembles, which are the focus of the current study, synchronisation of note onsets (D’Amario et al., 2018; Goebl & Palmer, 2009; Palmer et al., 2019), blending of sound (D’Amario et al., 2023), coordination of periodic body sway (Chang et al., 2019; Goebl & Palmer, 2009; Keller & Appel, 2010) and bowing motion (Hilt et al., 2019; Laroche et al., 2022), and eye contact between players (Bishop et al., 2019a; Davidson, 2012; Gaggioli et al., 2017; Vandemoortele et al., 2018) have been highlighted as some potential behavioural markers of togetherness. Some of these, such as coordination in body sway, have been shown to inform audiences’ judgements of ensemble togetherness (D’Amario et al., 2023; Jakubowski et al., 2020). Few studies, however, have tested whether any of these measures of coordination can actually index the strength of togetherness that ensemble musicians feel. This is a gap that the current study aimed to fill.
The relevance of this aim is twofold: First, testing the relationship between musical togetherness and measurable alignment in behaviour addresses a critical theoretical question, which is how strongly togetherness depends on convergent internal states expressed via overt cues. Second, confirming a positive relationship would open the possibility of using bodily measures as an index of musicians’ experiences in situations where self-reporting is impractical.
We also investigated the amount of mental processing evoked in classical duo playing through measurement of musicians’ pupil size. The link between pupil responses and mental processing has been well established (e.g., Kahneman, 1973). Psychosensory pupil responses occur when thoughts, emotions, or sensory stimuli activate the mind. Mathôt (2018) distinguishes between rapid orienting responses, which are evoked by sudden external events, and slower effort/arousal-related responses. We were interested in the latter, which might evolve over time during ensemble performance as challenging and rewarding moments unfold, activating mental processing to varying degrees. In the literature, this effort/arousal-related activity is commonly referred to as mental effort (see Bruya & Tang, 2018, for an in-depth discussion of this term). In the domain of music, some studies have investigated pupil responses in listeners (Laeng et al., 2016) and some have explored the effects of musical difficulty on performers’ pupil size (Endestad et al., 2020; O’Shea & Moran, 2018). However, the relationship between activating experiences of musical togetherness and pupil dynamics has not yet been investigated. The effects of rehearsal on pupil size likewise remain unexplored.
The current study addressed the question of how pupil dynamics, bodily expression, and bodily coordination relate to musicians’ self-reported experiences of togetherness. We evaluated pupil size, bodily expression (operationalised as quantity of head motion; see below), and coordination in the head motion and respiration of duo singers and pianists as they performed two pieces of classical Lied repertoire. Of additional interest was how pupil size, bodily expression, and bodily coordination differed before and after a period of duo rehearsal.
The link between musical togetherness and bodily coordination
Bishop (2024) presented a model that posits how the sense of musical togetherness arises during ensemble playing and the factors that act to strengthen it. Perceiving one’s co-performers as intentional (i.e., having a flexible understanding of the rules that govern the interaction), live (i.e., interacting in real-time), and responsive makes a musical interaction feel social. Perceived alignment in musical understanding (i.e., how the music should sound and what it should express) and joint agency (i.e., the sense of acting together towards a shared goal) make social-musical interactions more rewarding. Musical togetherness fluctuates throughout an interaction and affects musicians’ decision-making in real-time, prompting them to take actions that maintain or strengthen it (Gesbert et al., 2022; Noy et al., 2015; Stephens, 2021).
Particularly relevant to the current study is the idea that perceived alignment in musical understanding affects how strongly together musicians feel. Musical understanding extends beyond knowing which notes to play and at what tempo; for classical musicians, it is about having an interpretation of the music that maintains basic structural elements but also includes aspects of individual expressivity (e.g., Juslin, 2003; Repp, 1997). Musical understanding relates to the high-level aesthetic effects that musicians want to realise, for example, the expression of certain emotions or musical character. Thus, Bishop (2024) is not proposing that strong musical togetherness experiences come from coordinating basic structural features. Rather, the argument is that musicians can sense (dis)similarity in their interpretations and feel more strongly together when they perceive that they are striving for compatible high-level aesthetic outcomes.
Musical understanding is grounded in body activity—this is the central idea of embodied music cognition (Leman, 2008). Some of this activity yields observable signals in the form of body motion (Dahl et al., 2010; Demos et al., 2018), breathing (Cara & Mitrovic, 2024; Sakaguchi & Aiba, 2016), facial expressions (Davidson, 2012), and gaze patterns (Bishop et al., 2019a). These signals can communicate aspects of musical understanding to co-performers and audience members who are physically co-present. Given the opportunity to communicate multi-modally, ensemble musicians do so by making use of these different body signals. Musicians move their heads in a more coordinated way (Bishop et al., 2019b), move their heads at a higher rate of displacement (Bishop, González Sánchez, et al., 2021), use less self-regulating head and bowing motion (Laroche et al., 2022), and are more willing to take creative risks when they can see each other than when they cannot (Golvet et al., 2021; Iorwerth & Knox, 2019). These findings suggest that they use each other’s body signals as attention cues towards certain aspects of the music and as a way of confirming that coordination is stable.
While ensemble musicians’ body motion coordination has been studied extensively, less is understood about the benefits and mechanisms of their respiratory coordination. Respiratory and cardiac rhythms have been found to synchronise between singers in a choir and synchronise more strongly when they are singing in unison than when they are singing a canon (Müller & Lindenberger, 2011). For singers performing polyphonic music, these rhythms synchronise more strongly when singers have physical contact than when they have no physical contact (Lange et al., 2022). Although some studies have examined respiratory patterns during piano performance (Nakahara et al., 2010; Sakaguchi & Aiba, 2016), the role of respiratory synchrony for this instrument remains unclear. Respiratory synchronisation in singer–pianist ensembles might be found at phrase onsets, since prior case studies suggest that pianists do sometimes breathe in anticipation of phrases (King, 2006), and singers should do so reliably.
In sum, the literature suggests that musicians’ body signals reflect their musical understanding and that ensemble musicians are receptive to what these signals communicate. Therefore, we may hypothesise a positive link between expressive, coordinated motion and how strongly together musicians feel. The current study tested the prediction that musicians feel more strongly together when they are moving their heads at a higher rate and there is stronger coordination between them in expressive head motion and respiration. We focused on motion of the head because it was easily visible to both duo partners during their performances. Head motion is anatomically linked to motion of the torso, with additional degrees of freedom afforded by the neck, and is a key source of expressive motion for both singers and pianists (Livingstone & Palmer, 2016; Thompson & Luck, 2012). It has been found to be a primary means of communication (Bishop & Goebl, 2018). The head also houses the vestibular system, which plays an important role in sensing both locomotion periodicities and musical tempo (Todd & Lee, 2015).
Mental effort and arousal in ensemble playing
According to Bishop’s (2024) model, a strong sense of musical togetherness is associated with low subjective effort and positive emotional responses. Low subjective effort occurs because musicians are closely aligned in their expressive aims, so coordination does not require much attention or control. Positive emotions arise as musicians perceive their collaboration as successful. These emotions can vary from low arousal (e.g., relaxation) to high arousal (e.g., joy).
A strong sense of musical togetherness co-occurs with group flow, a shared state of heightened absorption that is associated with feelings of effortlessness and loss of self-consciousness (Sawyer, 2006). Recent research carried out in a non-musical context suggests that individual flow and mental effort, indexed through pupil size, share a positive linear relationship (Lu et al., 2023). Thus, the brain seems to be highly engaged when a person experiences a flow state. Some other studies have shown heightened physiological arousal during periods of high togetherness (Noy et al., 2015) and heightened coordination of cardiac activity between musicians in more interactive playing conditions, which is thought to signify shared absorption (Høffding et al., 2023). Synchronisation in physiological arousal might provide a mechanism for the enhanced social cohesion that arises from shared experience (Konvalinka et al., 2011). The current study built on these findings by testing for a potential positive relationship between pupil size and musical togetherness. We predicted that more mental processing takes place when musicians feel strongly together due to a combination of increased absorption and emotional arousal.
Many other aspects of ensemble playing evoke increased mental processing, including musical and coordination demands. For classical musicians, a core challenge in ensemble playing is negotiating a shared interpretation of the music (Ginsborg & King, 2012). During the early stages of joint rehearsal, ensemble members may have different ideas of how the music should sound, especially if they have been practising it individually. Co-performers’ expectations can diverge, resulting in relatively poor synchronisation during their first performances together compared to subsequent performances (Ragert et al., 2013). They may sacrifice individual expressivity to prioritise synchronisation (Bishop & Goebl, 2020; MacRitchie et al., 2018). Thus, the first time an ensemble plays a piece together, substantial mental effort might be evoked. Drawing on these findings, we predicted that mental processing would be greater for our participants prior to a period of joint rehearsal than afterwards, resulting in larger pupil size pre-rehearsal.
A few recent studies have brought pupillometry, a technique traditionally reserved for more controlled laboratory settings, into music performance settings. Studies with performing musicians have manipulated musical difficulty and/or whether performance was overt or imagined, and shown increased pupil dilation for more difficult music and overt performance (Bishop et al., 2021; Endestad et al., 2020; O’Shea & Moran, 2018).
The current study
This study used a mixed-methods approach to investigate the relationships between mental processing, bodily expression, bodily coordination, and musicians’ self-reported experiences of togetherness. Our central hypothesis was that bodily expression, bodily coordination, and heightened social-musical engagement (which evokes increased mental effort and arousal) support feelings of musical togetherness. We predicted that measures of bodily expression and coordination would be stronger and pupil size would be larger during periods of high self-reported togetherness than during periods of low togetherness. We additionally predicted that more mental processing overall would be evoked pre-rehearsal than post-rehearsal, resulting in a larger average pupil size pre-rehearsal. On the surface, these hypotheses might seem contradictory: We posit that both social-musical engagement (linked to musical togetherness) and the challenges of playing for the first time with a new partner evoke increased mental processing. However, the literature shows that multiple factors can have an activating effect on attention. We conceive the relationships as unfolding across different timescales. We expect that musical togetherness and pupil size co-vary dynamically during performances, while there is an enduring effect of rehearsal.
We recorded performances by singer–pianist duos who performed two pieces from the classical Lied repertoire. One performance was recorded at the beginning of the session, prior to a period of joint rehearsal, and three additional performances were recorded after the rehearsal. We collected motion capture, respiration, and eye-tracking data. At the end of the session, musicians individually watched video recordings of their jointly selected best post-rehearsal performances and made continuous ratings of musical togetherness using a mouse-controlled slider. We also collected some free-written responses from the musicians about what they felt informed their togetherness ratings.
In our primary analysis, our independent variables comprised piece (two; see below) and togetherness level (high/low), which was calculated from musicians’ ratings. We had four dependent variables. From the musicians’ motion capture data, we calculated quantity of head motion as an index of bodily expression and the power of shared periodicities in head velocity, using cross-wavelet transform analysis (Issartel et al., 2015), as an index of bodily coordination. From their respiration data, we calculated inhalation synchrony at phrase beginnings. From their eye-tracking data, we extracted pupil diameter as an index of mental effort. We carried out additional analyses to test for differences in our dependent variables pre- and post-rehearsal, to explore the relationships between dependent variables, and to assess the similarity of musician partners’ ratings over time. Finally, we carried out a thematic analysis on musicians’ written responses, which we considered in relation to our quantitative findings.
Method
Participants
Twenty-four semi-professional musicians participated in the study (12 pianists and 12 singers; age
Equipment
Recordings took place in a motion capture laboratory at the University of Music and Performing Arts Vienna. Pianists played on a Yamaha Clavinova. Singers wore a head-mounted microphone, and two stereo microphones collected audio from the room. Audio and MIDI from the Clavinova and audio from the microphones were recorded on a PC running Ableton Live.
Musicians wore SMI (SensoMotoric Instruments) ETG 2 mobile eye-tracking glasses connected to separate PCs, which collected pupillometry data at 120 Hz. To maintain controlled conditions for eye-tracking, room lighting was kept constant using overhead lights and covered windows. An OptiTrack (NaturalPoint) motion capture system with 12 Prime 13 cameras recorded body motion at 240 Hz. Reflective markers were placed on the head, back, chest, shoulders, arms, wrists, and hands (with additional markers on singers’ hips). The current study used only head marker data. Respiration was captured using SOMNOtouch RESP devices (SOMNOmedics), recording at 32 Hz from belts placed around the chest and abdomen. Electrocardiogram data were also collected with the SOMNOtouch devices, but are not analysed here. Performances were recorded with a DSLR camera placed in front of the singer at a distance of about 3 metres. Due to COVID-19 regulations at the time of data collection, plexiglass barriers were placed between the pianist and singer and in front of the singer (between the singer and video camera). Pianists were required to wear a mask while playing.
A trigger system was set up to synchronise pupillometry, body motion, and breathing data. TTL triggers were sent by the OptiTrack software at the starts of recordings through an OptiTrack eSync device and received by the computer running SMI software and the SOMNOtouch devices, the latter of which were connected to the eSync with an optocoupler cable. The SMI recording software and SOMNOtouch device recorded the timestamps of received triggers, which we later exported and used to retrospectively trim recordings to a common starting point. A clapboard with a reflective marker was then struck, giving an audiovisual signal that was picked up by the microphones and motion capture system.
For a few duos, an alternate synchronisation solution was required for the respiration data due to a missing connection with the eSync device. For these duos we pumped air into the flow sensor of the SOMNOtouch. A reflective marker was placed on the pump for capture of the pumping motion, and the sound of the pump was also captured on audio recordings. A few duos also required an alternate synchronisation solution for the eye-tracking data due to failed eSync triggers. The SMI glasses record audio, so we manually identified the moment of the clapboard strike in these cases.
Musical material
Participants performed two pieces of classical Lied repertoire: Automne Op. 18, No. 3 by Gabriel Fauré and Die Kartenlegerin Op. 31, No. 2 by Robert Schumann. The pieces differed in musical character and, between them, presented duos with a range of expressive and coordination challenges, including a slow and flowing tempo (Automne), numerous tempo changes (Die Kartenlegerin), and solos and re-entries (both).
Procedure
A week before their first recording session, participants received sheet music for the two pieces with instructions to practise their parts individually. At the start of the recording sessions, participants were outfitted with physiological sensors, reflective markers for motion capture, and eye-tracking glasses. They each completed a 5-minute baseline recording during which they sat (pianists) or stood (singers) alone in the lab and were instructed to remain quiet and still. They then performed the pieces once for recording with their duo partner. The order in which they performed the pieces was randomised across duos. Following this initial “pre-rehearsal” performance, duos rehearsed the pieces for about 10–15 minutes each, then performed each piece for recording three times more (“post-rehearsal” performances). Post-rehearsal performances were blocked by piece, and at the end of each block, duos jointly decided which of the three performances was their best.
The musicians then took a short break while the video recordings of their selected “best” performances were uploaded to the Gorilla web platform (www.gorilla.sc). Their final task was to watch their videos while making continuous ratings of how together the performances were. They used a mouse to adjust the position of a horizontally-oriented slider, which was visible on a computer screen under the video display. The slider ranged from 0 (low togetherness) on the far left to 100 (high togetherness) on the far right, and slider movements were sparsely sampled at 100 Hz. Musicians were also asked to comment in writing about what informed their togetherness judgements.
Analysis
Duos performed each piece a total of four times (1 pre-rehearsal + 3 post-rehearsal), but our analysis uses only the pre-rehearsal performance and the post-rehearsal performance that they selected as their best. As explained above, participants only performed the togetherness rating task on their “best” post-rehearsal performance of each piece. Our analysis pipeline is detailed below.
Performance-score alignment and musical analysis
The first step in our analysis pipeline involved aligning MIDI data from the Clavinova with a MIDI representation of the score. This alignment allowed us to assign musical time labels (beat numbers) to all of our other data. A performance-score matching algorithm was used for the alignments (Nakamura et al., 2017). Manual corrections to the alignments were made using the Parangonada interface (Peter et al., 2023). The alignment output consists of correctly performed notes (i.e., performed notes that could be matched to score notes) with corresponding MIDI and musical beat information. Insertions are left out of the alignment since they cannot be matched to any score notes. Omissions are likewise left out since they do not correspond to any performed notes. From these alignments, a series of onset times per beat was extracted. In R, linear interpolations were performed to obtain estimated onset times of missing beats.
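The interpolation of missing beat onsets can be illustrated with a minimal sketch. It is written in Python rather than R and is not the analysis code used in the study; the function name, the dictionary input, and the decision to leave beats outside the matched range unfilled are all assumptions made for illustration.

```python
def interpolate_beat_onsets(known_onsets, beats):
    """Linearly interpolate onset times (s) for beats with no matched note.

    known_onsets: dict mapping beat number -> onset time of a matched beat.
    beats: iterable of all beat numbers that should receive an onset time.
    Beats before the first or after the last matched beat are set to None
    (assumption: no extrapolation is attempted).
    """
    known = sorted(known_onsets.items())
    filled = {}
    for b in beats:
        if b in known_onsets:
            filled[b] = known_onsets[b]
            continue
        earlier = [(kb, t) for kb, t in known if kb < b]
        later = [(kb, t) for kb, t in known if kb > b]
        if not earlier or not later:
            filled[b] = None
            continue
        (b0, t0), (b1, t1) = earlier[-1], later[0]
        # Straight line between the nearest matched beats on either side.
        filled[b] = t0 + (t1 - t0) * (b - b0) / (b1 - b0)
    return filled


# Hypothetical example: beats 3 and 4 have no matched notes.
onsets = interpolate_beat_onsets({1: 0.0, 2: 0.5, 5: 2.0}, range(1, 6))
```

In this hypothetical example, beats 3 and 4 are placed at equal steps between the onsets of beats 2 and 5.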
We computed our measures of high versus low togetherness, pupil size, quantity of head motion, head coordination, and inhalation asynchrony (all described below) per phrase. Phrase beginnings and endpoints were determined based on a musical analysis of the singer’s part in the score. Information on the location of phrase boundaries is available in the Supplementary Materials.
Pupil size
Pupil diameters for musicians’ left and right eyes were averaged per fixation in SMI’s BeGaze processing software. Average pupil diameters per fixation in pixels were then exported. In R, these data were aligned with musical performance data based on their timestamps and averaged per phrase for each participant and performance.
We ran a test to check for pupil foreshortening error, in which the pupil is measured as smaller than it actually is because the eye has rotated away from the eye-tracking camera. Z-scores were calculated for the x and y positions of each fixation, within participants and performances. These z-scores were used to categorise fixations as occurring in the centre of the visual field, in the extreme right or left (x-position z-scores more than 3 standard deviations from the mean), or in the extreme top or bottom (y-position z-scores more than 3 standard deviations from the mean). We then calculated average pupil diameters per participant, performance, and gaze location and ran paired, one-sided t-tests to determine whether pupil diameters were smaller in the extreme left/right or top/bottom regions of the visual field than in the centre. These t-tests showed that, indeed, pupil diameters were smaller in the extreme left,
Quantity of head motion
For quantity of head motion and head coordination (below), we used data from a reflective marker that was placed just above the forehead. A Savitzky-Golay filter (prospectr package in R; Stevens & Ramirez-Lopez, 2025) was applied to smooth head position data and derive velocities (polynomial order = 3, window size = 11). We computed the velocity norm, summed these values per second, then averaged per phrase to obtain quantity of head motion (head QoM).
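The chain from position data to head QoM can be sketched as follows. This Python illustration is not the study's R code, and it substitutes simple finite differences for the Savitzky-Golay differentiation described above; the function names, the integer sampling rate, and the alignment of one-second bins to the start of the recording are assumptions.

```python
import math


def velocity_norm(positions, fs):
    """Speed (position units per second) from 3D positions sampled at fs Hz.

    Uses central differences, with one-sided differences at the two ends.
    This stands in for the Savitzky-Golay derivative used in the study.
    Requires at least two samples.
    """
    n = len(positions)
    speeds = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        dt = (hi - lo) / fs
        v = [(positions[hi][k] - positions[lo][k]) / dt for k in range(3)]
        speeds.append(math.sqrt(sum(c * c for c in v)))
    return speeds


def quantity_of_motion(speeds, fs, phrase_start_s, phrase_end_s):
    """Sum speeds within each whole second, then average over the phrase."""
    per_second = []
    for sec in range(int(phrase_start_s), int(phrase_end_s)):
        chunk = speeds[sec * fs:(sec + 1) * fs]
        if chunk:
            per_second.append(sum(chunk))
    return sum(per_second) / len(per_second) if per_second else float("nan")


# Hypothetical marker moving at 1 unit/s along x for 2 s, sampled at 4 Hz.
positions = [(i / 4, 0.0, 0.0) for i in range(8)]
speeds = velocity_norm(positions, fs=4)
qom = quantity_of_motion(speeds, fs=4, phrase_start_s=0, phrase_end_s=2)
```

Because the values are summed per second, the result scales with the sampling rate, so QoM values are only comparable across recordings made at the same rate.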
Head coordination
Cross-wavelet transform (CWT) analysis (WaveletComp package in R; Roesch & Schmidbauer, 2018) was carried out on head velocity norms to measure the strength of coordination between duo partners. Wavelet analysis describes how the frequency content of a signal changes over time; the cross-wavelet transform applies this to a pair of signals. We used it to measure the strength (or power) of shared frequencies in duo partners’ head velocity signals.
We initially ran the analysis on a broad range of frequencies, but predicted that players would synchronise most strongly at frequencies related to the temporal structure of the music, for example, at the bar or phrase level. D’Amario et al. (2023) tested this prediction using the same motion dataset as we use here and found that, indeed, synchronisation was strongest at the bar, half-bar, or double-bar level (depending on the piece and performance; see Table 1). For the current analysis, we extracted only power values for this reduced range of periods where synchrony was strongest. We averaged CWT power across this reduced range of periods per timestamp for each duo and performance. These data were then aligned with musical performance data based on their timestamps and averaged per phrase for each duo and performance.
Range of Periods Used in Head Coordination Analysis.
Note. CWT power was extracted for a reduced range of periods centred around different bar levels for different pieces and performances. The width of these narrow bands was scaled according to the bar level.
Inhalation asynchrony
Respiration data comprised series of values representing change in circumference of chest and abdomen belts. Chest and abdomen data were added together per timestamp and these summed values were normalised (within performer) to a range of 0 to 1. We then took the first derivative of the normalised values to get a measure of the acceleration of circumference changes, and a moving average (running mean) filter was applied (window size = 21; prospectr package in R; Stevens & Ramirez-Lopez, 2025).
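The preprocessing steps in this paragraph can be sketched in a few lines. The Python code below is an illustration under stated assumptions, not the study's R pipeline: the function name is hypothetical, the derivative is a simple forward difference, and the running mean shrinks its window at the edges of the signal, which may differ from how the prospectr filter treats edges.

```python
def preprocess_respiration(chest, abdomen, fs=32, window=21):
    """Sum the belts, normalise to 0-1, differentiate, and smooth.

    chest, abdomen: equal-length sequences of belt readings for one performer.
    fs: sampling rate in Hz (32 Hz in the study).
    window: running-mean width in samples (21 in the study).
    """
    summed = [c + a for c, a in zip(chest, abdomen)]
    lo, hi = min(summed), max(summed)
    if hi == lo:
        raise ValueError("signal is constant; cannot normalise")
    norm = [(s - lo) / (hi - lo) for s in summed]
    # First derivative of the normalised signal, per second (forward difference).
    deriv = [(norm[i + 1] - norm[i]) * fs for i in range(len(norm) - 1)]
    half = window // 2
    smoothed = []
    for i in range(len(deriv)):
        # Assumption: the window is truncated at the signal edges.
        segment = deriv[max(0, i - half):i + half + 1]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


# Hypothetical input: both belts expand linearly over 2 s.
ramp = [float(i) for i in range(64)]
resp = preprocess_respiration(ramp, ramp)
```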
Next, we developed a pipeline to identify peaks in inhalation acceleration associated with phrase beginnings. We defined a window around the start of each musical phrase that differed in width depending on the piece, which metrical position within a bar the phrase started on, and whether the phrase start was “free” for the singer (preceded by multiple beats of rest) or “restricted” (preceded by one beat of rest; see Table 2). Phrases always began on beats 10 or 12 for Fauré and on beats 1 or 2 for Schumann (see Table 2, column 3). We counted beats according to the metre noted in the scores; for both pieces, this meant that an 8th note = 1 beat.
Details on Windows Around Phrase Beginnings Where the Peak-Finding Algorithm Was Run.
Note. “Beat position in bar” refers to where in a bar phrases began (measured in beats). For each type of phrase, we defined windows that started a certain number of beats prior to the phrase beginning (“Beats before start”) and ended a certain number of beats after the phrase beginning (“Beats after start”).
A peak-finding algorithm was run on the acceleration data. If one acceleration peak fell in the window for a phrase, the timestamp of that peak was used as the inhalation timestamp for that phrase. If more than one peak fell within the window, we took whichever was largest in magnitude or, if peaks shared the same magnitude, whichever was nearest to the phrase beginning. The absolute time difference (in seconds) between the pianist’s inhalation peak and the singer’s inhalation peak comprised our measure of inhalation asynchrony.
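The peak-selection rule and the asynchrony measure can be sketched as follows. This is a minimal Python illustration, not the study's code: the local-maximum peak definition is an assumption, since the text does not specify the peak-finding algorithm, the windows are given in seconds rather than beats, and all names are hypothetical.

```python
def local_peaks(signal):
    """Indices of simple local maxima (assumed definition of a peak)."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i - 1] < signal[i] >= signal[i + 1]]


def inhalation_time(signal, times, phrase_start, win_start, win_end):
    """Timestamp of the selected peak in the window, or None if there is none.

    The largest peak wins; ties go to the peak nearest the phrase beginning.
    """
    candidates = [i for i in local_peaks(signal)
                  if win_start <= times[i] <= win_end]
    if not candidates:
        return None
    best = max(candidates,
               key=lambda i: (signal[i], -abs(times[i] - phrase_start)))
    return times[best]


def inhalation_asynchrony(t_pianist, t_singer):
    """Absolute time difference (s) between the partners' inhalation peaks."""
    if t_pianist is None or t_singer is None:
        return None
    return abs(t_pianist - t_singer)


# Hypothetical signals sampled every 0.5 s; the phrase begins at 1.4 s.
times = [i * 0.5 for i in range(7)]
pianist = [0, 1, 0, 2, 0, 2, 0]   # two equal peaks, at 1.5 s and 2.5 s
singer = [0, 0, 3, 0, 0, 0, 0]    # one peak, at 1.0 s
t_p = inhalation_time(pianist, times, 1.4, 0.0, 3.0)
t_s = inhalation_time(singer, times, 1.4, 0.0, 3.0)
asynchrony = inhalation_asynchrony(t_p, t_s)
```

In this hypothetical example, the pianist's two equal peaks are resolved in favour of the one at 1.5 s, which lies nearer the phrase beginning, giving an asynchrony of 0.5 s.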
Togetherness ratings
Togetherness ratings were de-sparsed using a constant interpolation, then aligned with musical performance data and averaged per phrase. Ratings were classified as indicating high or low togetherness using a method inspired by Noy et al. (2015), which also compared motion with togetherness ratings. Noy et al. (2015) set a threshold for each participant with the top 30% of togetherness ratings considered “high”. Our method also established a threshold per performer and performance, but we found it was necessary to set the threshold using a different cut-off point (10% rather than 30%) to ensure that some (but not too much) of each performance was classified as high-togetherness. The threshold was calculated as follows:
Ratings above this threshold were labelled as “high togetherness” and ratings below were labelled as “low togetherness”. This resulted in 29% of all phrases, across performers and performances, being classified as “high togetherness”. Table 3 lists the range and average percentage (and number) of phrases classified as “high togetherness” for each piece.
Percentages of Phrases Classified as “High Togetherness”.
Note. The number of phrases corresponding to each percentage is included in parentheses.
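The first two steps of the ratings pipeline, constant interpolation of the sparsely sampled slider data and averaging per phrase, can be sketched as follows. This Python illustration is not the study's code; the function names, the event-list input, and the 10 Hz grid in the example are assumptions, and the classification threshold is deliberately not implemented.

```python
import bisect


def hold_last_value(event_times, event_values, grid_times):
    """Constant interpolation: each grid point takes the most recent rating.

    event_times must be sorted. Grid points before the first slider event
    get None (assumption: no rating exists until the slider is first moved).
    """
    filled = []
    for t in grid_times:
        idx = bisect.bisect_right(event_times, t) - 1
        filled.append(event_values[idx] if idx >= 0 else None)
    return filled


def mean_per_phrase(grid_times, values, phrases):
    """Average rating within each (start_s, end_s) phrase window."""
    means = []
    for start, end in phrases:
        inside = [v for t, v in zip(grid_times, values)
                  if start <= t < end and v is not None]
        means.append(sum(inside) / len(inside) if inside else None)
    return means


# Hypothetical slider events: rating 50 at 0 s, moved to 80 at 1 s.
grid = [i / 10 for i in range(20)]          # 10 Hz grid for brevity
ratings = hold_last_value([0.0, 1.0], [50, 80], grid)
phrase_means = mean_per_phrase(grid, ratings, [(0.0, 1.0), (1.0, 2.0), (0.5, 1.5)])
```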
Linear mixed effects modelling and hypothesis testing
Linear mixed effects modelling was carried out in R using the lmer and anova functions from the package lmerTest (Kuznetsova et al., 2017). The package emmeans (Lenth, 2024) was used for pairwise comparisons. We checked for normality of residuals using a Shapiro-Wilk test, and square root transformations were applied to head coordination, inhalation asynchrony, and pupil size values to achieve more normally-distributed residuals.
Togetherness written responses
A thematic analysis was carried out to identify the main themes in participants’ written responses to the question of what informed their togetherness ratings. Two of the authors independently carried out the initial steps of familiarising themselves with the data and generating codes, then jointly agreed on a fixed list of codes. With input from a third author, these codes were then grouped into overarching themes. Using this structure of codes and themes, a scoring system was designed to quantify the similarity of duo partners’ written responses. Scoring took into account the overlap in thematic content (i.e., how many themes partners had in common) and similarity in the complexity of responses (i.e., the ratio of the number of different themes commented on by one partner to the other). For example, if partner A in one duo commented on three themes, while partner B commented on the same three themes plus a fourth that partner A did not comment on, this duo would receive a similarity score of 3 (overlap in themes) + .75 (ratio of partner A’s three themes to partner B’s four themes) for a total of 3.75.
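The scoring rule can be expressed as a short function. This Python sketch is an illustration only: the function name and theme labels are placeholders, taking the ratio as the smaller theme count over the larger is an assumption consistent with the worked example, and returning 0 when a partner mentions no themes is also an assumption.

```python
def similarity_score(themes_a, themes_b):
    """Overlap in themes plus the ratio of the partners' theme counts."""
    a, b = set(themes_a), set(themes_b)
    if not a or not b:
        return 0.0  # assumption: no score when a partner mentions no themes
    overlap = len(a & b)
    # Assumption: smaller count over larger count, so the ratio is at most 1.
    ratio = min(len(a), len(b)) / max(len(a), len(b))
    return overlap + ratio


# Worked example from the text, with placeholder theme labels.
partner_a = {"moving together", "breathing together", "shared emotion"}
partner_b = partner_a | {"effortlessness"}
score = similarity_score(partner_a, partner_b)
```

With three shared themes and theme counts of three and four, the function reproduces the total of 3.75 given in the worked example.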
Missing data
Respiration data could only be included for 14 of the 24 duos because of delays in obtaining equipment. Analyses of respiration data use this reduced dataset while the rest of the analyses use the full dataset. Additionally, pupil data were missing for three of the duos who were also missing respiration data.
Results
In the subsections below, we present the results of analyses addressing our hypotheses. Preliminary analyses showed that musicians responded differently to the Fauré and Schumann pieces. For this reason, we addressed the pieces separately in all of our analyses. We start our presentation of results with a comparison of the temporal structure of the pieces.
How did the pieces differ in temporal variability?
Figure 1 shows the tempo curves for the two pieces, which we include as a demonstration of their differences in tempo and temporal regularity. To generate these plots, we extracted the interbeat intervals (IBIs, in seconds) for each performance using pianists’ MIDI data. For the counting of beats we used the metrical information given in the scores: Fauré’s Automne is in 12/8 and Schumann’s Die Kartenlegerin is in 2/8, so for both pieces, 1 beat = 1 eighth note. Where pianists had beats of rest, interbeat intervals were interpolated (resulting in curve segments that appear flat). We ran a t-test to examine how variability in IBIs (coefficient of variation for IBIs per performance) differed between pieces. This showed greater temporal variability in the Schumann piece than in the Fauré piece,

Tempo Curves for Fauré’s Automne (Top) and Schumann’s Die Kartenlegerin (Bottom).
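As an illustration of the variability measure, the following minimal sketch computes interbeat intervals from a list of beat onset times and then their coefficient of variation. The onset times are invented for the example, and the use of the sample standard deviation is an assumption, since the variant used in the analysis is not specified here.

```python
from statistics import mean, stdev


def interbeat_intervals(beat_times):
    """Differences (in seconds) between successive beat onset times."""
    return [t1 - t0 for t0, t1 in zip(beat_times, beat_times[1:])]


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed variant)."""
    return stdev(values) / mean(values)


# Hypothetical beat onsets (seconds): one metronomic, one with flexible timing.
steady = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
flexible = [0.0, 0.4, 1.0, 1.5, 2.3, 2.7]

cv_steady = coefficient_of_variation(interbeat_intervals(steady))
cv_flexible = coefficient_of_variation(interbeat_intervals(flexible))
print(cv_steady, cv_flexible)
```

Because the coefficient of variation is normalised by the mean interval, it allows pieces with different tempi to be compared on the same scale.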
Do pupil diameter, coordination in head motion or respiration, or quantity of head motion differ between high- and low-togetherness periods?
We predicted that larger pupil diameter, stronger head coordination, smaller inhalation asynchrony, and increased head QoM would occur during periods of high togetherness. This prediction was tested with a series of models (four in total) that tested the effect of the interaction between togetherness level (high/low) and piece (Fauré/Schumann) on each of these dependent variables separately. Performer ID was included as a random intercept. Once models were constructed, an ANOVA was run on each to obtain statistics for main and interaction effects. Post-hoc tests were conducted with Bonferroni correction. This analysis only used data from the post-rehearsal performance each duo selected as their best.
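The logic of the Bonferroni correction can be sketched as follows. This is a generic illustration of the adjustment (each p-value multiplied by the number of comparisons in the family and capped at 1), not a reproduction of the emmeans output; the p-values and the alpha level are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted p, significant?) for each p-value in one family of tests."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    return [(p_adj, p_adj < alpha) for p_adj in adjusted]


# Hypothetical p-values for four pairwise comparisons (not results of this study).
raw_p = [0.004, 0.03, 0.2, 0.6]
for p, (p_adj, significant) in zip(raw_p, bonferroni(raw_p)):
    print(f"raw p = {p:.3f}, adjusted p = {p_adj:.3f}, significant = {significant}")
```

The correction is conservative: a comparison that passes an uncorrected threshold, such as the second hypothetical value above, may no longer do so once the size of the family is taken into account.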
For pupil diameter, there was a significant interaction,

Interaction Plots for the Effects of Piece and Togetherness Level on, Clockwise From Top-Left, Pupil Diameter, Head Coordination, Head QoM, and Inhalation Asynchrony.
For head coordination, there were significant main effects of togetherness,
For head QoM, interaction and main effect of togetherness were non-significant (
We ran two further analyses. First, we carried out a paired t-test to test the similarity in average togetherness ratings between pieces. The test showed stronger ratings overall for the Schumann piece (
Spearman’s
Do pupil diameter, coordination in head motion or respiration, or quantity of head motion differ between pre- and post-rehearsal performances?
We carried out an additional analysis to test whether pupil size, head coordination, inhalation asynchrony, or head QoM differed before and after the rehearsal period. For each of these dependent variables, we tested a model that included the interaction between rehearsal (pre/post) and piece (Fauré/Schumann) as a fixed effect and a random intercept for performer ID (for pupil size and head QoM) or duo (for head coordination and inhalation asynchrony). This analysis used data from each duo’s pre-rehearsal performance and the post-rehearsal performance they selected as their best.
For pupil size, there was no significant interaction or effect of piece, but the effect of rehearsal was significant, with larger pupil diameters pre-rehearsal than post-rehearsal,

Interaction Plots for the Effects of Piece and Rehearsal on, Clockwise From Top-Left, Pupil Diameter, Head Coordination, Head QoM, and Inhalation Asynchrony.
What informed musicians’ togetherness judgements?
The six themes that arose in musicians’ written descriptions of what informed their togetherness judgements are listed in Table 5 along with the numbers of singers and pianists who mentioned them. Below, we describe the themes in more detail.
Themes Describing Musicians’ Written Responses to the Question of What Informed Their Togetherness Ratings.
Note. The numbers of singers and pianists commenting on each theme are listed.
Body motion included references to body movement, including breathing. The most commonly assigned code in this category, and the most common code overall, was “breathing together”, which was mentioned by 17 musicians (11 pianists, 6 singers).
Effortlessness included references to knowing or feeling what a partner would do, thinking together, or flow states. Common codes included “effortlessness”, “flow”, and “inspiration” (all mentioned by 2–4 musicians).
Emotion included references to expressing emotion, sharing emotion, or jointly feeling the music. References to feeling the partner or feeling together were classified under Feeling together. Common codes included “multisensory emotion” and “feeling about the music” (both mentioned by 4–5 musicians).
Feeling together included references to connection, being united, or joint thinking. Example codes included “feeling the music together”, “together in imagination”, and “connection” (all mentioned by 1–2 musicians).
Musical coherence included references to similarity or coherence in specific musical parameters or sharing an interpretation or intentions. References to timing, tempo, rhythm, or synchronisation were classified under Temporal coherence. Common codes included “phrasing” (13 musicians), “dynamics” (11 musicians), “colour” (5 musicians), and “sharing an interpretation” (4 musicians).
Temporal coherence included references to alignment of tempo or tempo changes, similarity of timing or rhythm, or explicit references to synchronisation. Common codes included “same tempo” (10 musicians), “timing” (8 musicians), and “synchronization” (5 musicians).
How similarly did duo partners rate togetherness?
We ran Spearman correlations to test the similarity of duo partners’ togetherness ratings (averaged per phrase). The correlation was slight

Togetherness Ratings for Two Duos for Schumann (Left) and Fauré (Right), Sampled Per Beat.
An additional analysis was run to test whether the similarity of duo partners’ ratings related to the similarity of their written responses. A Spearman correlation was calculated for each duo and piece to assess the similarity of their togetherness ratings. We then calculated the Pearson correlation between these Spearman correlation coefficients for ratings and duos’ written response similarity scores. This yielded a positive correlation for the Schumann piece (
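The two-stage procedure described in this subsection (a Spearman correlation within each duo, followed by a Pearson correlation across duos) can be sketched as below. All ratings and similarity scores are invented for illustration, the duo labels are hypothetical, and the functions are plain reimplementations rather than the routines used in the analysis.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5  # undefined if either input is constant


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-phrase togetherness ratings (singer, pianist) for three duos.
duo_ratings = {
    "duo_1": ([60, 70, 80, 90], [55, 65, 85, 95]),
    "duo_2": ([60, 70, 80, 90], [70, 60, 90, 80]),
    "duo_3": ([60, 70, 80, 90], [90, 80, 70, 60]),
}
# Hypothetical written-response similarity scores for the same duos.
written_similarity = {"duo_1": 3.75, "duo_2": 2.5, "duo_3": 1.0}

# Stage 1: one Spearman coefficient per duo; stage 2: Pearson across duos.
rating_rho = {duo: spearman(s, p) for duo, (s, p) in duo_ratings.items()}
duos = sorted(duo_ratings)
r_across_duos = pearson([rating_rho[d] for d in duos],
                        [written_similarity[d] for d in duos])
print(rating_rho, r_across_duos)
```

Using rank correlations within duos makes the first stage insensitive to partners using different portions of the slider range, while the second stage treats each duo as a single observation.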
Discussion
This study investigated the relationships between bodily expressivity (quantity of head motion), bodily coordination (in head motion and inhalation timing), amount of mental processing (pupil diameter), and self-reported feelings of musical togetherness among classical singer-pianist duos. Our main results show that head coordination between musicians was stronger and pupil size was larger during periods of high togetherness than during periods of low togetherness, but only for the piece by Fauré. Pupil diameter was greater during the Fauré piece than during the Schumann piece during high-togetherness periods, and musicians moved more during the Schumann piece than during the Fauré piece. We also observed smaller pupil diameter (suggesting overall reduced mental processing) and increased head motion post-rehearsal.
The finding that head coordination was stronger during high-togetherness phrases for the Fauré piece supports a link between musicians’ sense of togetherness and their coordination of expressive head motion. This effect occurred for duo musicians who were playing different instruments (voice versus piano) with different motor demands, and playing classical Lied repertoire that encourages a fairly consistent distinction in musical roles. The effect is in line with the idea that strong togetherness experiences require alignment at the level of musical understanding, not only accurate coordination of basic structural features (i.e., playing the right notes at the right time). It is also in line with the idea that visual communication and body expressivity contribute to musicians’ abilities to monitor each other’s musical understanding (Bishop, 2024). The finding is in contrast to some prior research where musicians’ body coordination did not differ between playing conditions that were expected to yield a stronger versus weaker sense of togetherness (Bishop, 2023). In the study by Bishop (2023), head coordination was evaluated per performance instead of per phrase, which is an important difference from the current study. The current findings show that togetherness fluctuates within pieces, which should be taken into account in the future as strategies for manipulating and measuring togetherness are designed.
The effect of head coordination arose for the Fauré piece, but not for the Schumann piece. Head coordination was also stronger for high-togetherness phrases in Fauré than for high-togetherness phrases in Schumann. This combination of effects suggests that musicians were particularly coordinated during high-togetherness phrases in their Fauré performances. The structure and expressive affordances of a piece might affect how much scope there is for coordinating ancillary motion. Our pieces were selected to differ in their basic structure and expressive character. The Fauré piece is more regular in its timing than the Schumann piece. It is also slower and more flowing, while the Schumann piece is faster, more playful, and more dynamic. These differences might have afforded a different relationship between head coordination and togetherness for the two pieces. For the Schumann piece, togetherness might be more meaningfully tied to other aspects of bodily coordination.
It is also relevant to consider how we measured head coordination (i.e., with CWT analysis). Some previous studies have used CWT analysis to evaluate coordination in ensemble musicians’ body motion (Clayton et al., 2019; Eerola et al., 2018), but measures like cross-correlations and Granger causality have also been commonly used (e.g., Chang et al., 2019; D’Ausilio et al., 2012; Goebl & Palmer, 2009; Keller & Appel, 2010). These measures capture different aspects of the time-varying relationships between co-performers’ movements and/or how information is transferred between people. They are affected differently by characteristics of performers’ movements like how much motion there is and how regular it is. In the future, methods-focused studies should be carried out to compare how different measures of coordination relate to musicians’ feelings of togetherness.
Differences between pieces were reflected in other measures of body activity, with musicians showing a larger pupil diameter during Fauré performances than during Schumann performances during high-togetherness periods and moving more during Schumann performances than during Fauré performances. The overall greater amount of motion shown during the Schumann piece is likely tied to its faster tempo and more dynamic character. The larger pupil diameter that occurred during the Fauré piece in high-togetherness periods might signify greater overall emotional involvement or greater difficulty in processing technical or musical demands. We might speculate that the former is more likely, as the Schumann piece was the more harmonically and rhythmically complex of the two. Prior research has shown a relationship between pupil diameter and changes in technical, harmonic, and expressive difficulty (Bishop, Jensenius, & Laeng, 2021), but more systematic study involving manipulation of specific musical demands is needed for a thorough understanding of how mental effort unfolds during performance.
We did not find any effect of togetherness level on inhalation synchrony. On one hand, the null effect on inhalation synchrony might seem unsurprising since singers and pianists have different breathing requirements and pianists tend not to be very consistent in how they breathe in relation to the music (Sakaguchi & Aiba, 2016). On the other hand, the musicians in our study reported that breathing was important for their sense of togetherness. Indeed, “breathing together” was the most common cue that they reported in their written responses. These comments about breathing might reflect musicians’ experience-based beliefs about how to foster feelings of togetherness rather than cues that they actually use while playing. Alternatively, breathing might indeed be important, but only at very few specific moments in a piece rather than at phrase beginnings in general.
We observed larger pupil diameter in high-togetherness periods than in low-togetherness periods, but only for Fauré, suggesting greater cognitive arousal linked to togetherness in this piece. The fact that this finding did not hold for the Schumann piece might relate to differences in expressive character of the pieces. As we suggested above, musicians might have experienced greater emotional involvement in the Fauré piece, especially during high-togetherness passages. The relationship between mental processing and musicians’ sense of togetherness has not previously been tested, although pupil dilation is known to relate to emotional arousal (Laeng et al., 2016), processing of information about musical expression (Skaansar et al., 2019), and performance-related effort (Endestad et al., 2020). It is challenging to measure the effects of specific variables on mental effort since so many factors affect it simultaneously, and their effects change from moment to moment (Mathôt, 2018). Few studies have endeavoured to measure pupil diameter during live music performance given the variability that arises. In this study, we were able to ensure fairly consistent visual input since both players were reading their scores. More controlled playing conditions, for example, with simpler music that necessitated less body motion, might yield a clearer pupil signal, but such conditions might inhibit strong togetherness experiences, which seem to benefit when performers have the opportunity to overcome musical challenges together (Bishop, 2024). In future research, capturing any relationship between mental effort and togetherness might be more effectively approached with a task that presents moderate physical demands and tight control of visual conditions while manipulating the presence or absence of barriers to togetherness (that is, making it likely that togetherness is sometimes very weak, to contrast with periods where participants are free to connect more strongly).
We found that pupil diameter was smaller, indicating reduced mental effort and arousal, and quantity of head motion was larger post-rehearsal than pre-rehearsal. The larger pupil diameter pre-rehearsal probably reflects not only the heightened attention that is needed to coordinate for the first time with a new partner, but also the cognitive arousal that is associated with the start of a recording session and the novelty of the lab setting and equipment. Future studies might be designed specifically to distinguish these effects, for example, by having musicians acclimatise to the experimental setup by rehearsing individually first. The increased quantity of head motion that we observed post-rehearsal echoes previous findings (Bishop et al., 2019b; Wood et al., 2022) and suggests that musicians are more overtly expressive once they have established a shared interpretation of the music.
We observed correlations between body activity measures. Head coordination was positively correlated with all other measures: inhalation synchrony (for both pieces; reported in the results as a negative correlation with inhalation asynchrony), pupil size (only for Schumann), and quantity of head motion (for both pieces). We also observed a positive correlation between quantity of head motion and pupil diameter for both pieces, although it was stronger for the Schumann piece. This finding is in contrast to results reported by Bishop, Jensenius, and Laeng (2021), where no such relationship was observed for a string quartet (although pupil diameter was positively linked to quantity of arm motion). That study used a different analysis approach that involved combining head QoM with musical difficulty and complexity variables in the same model to assess their effects on pupil diameter. Future studies should continue to use multi-method approaches to examine how different body systems combine to support togetherness experiences and in what ways different variables overlap.
It is important to clarify that we cannot assume any causality in these correlations, nor in the other relationships that we observed in this study, such as that between head coordination and high togetherness. We posit that these features co-occur as part of a highly engaged, attentive, expressive, and coordinated mode of playing that characterises and strengthens performers’ sense of togetherness. We furthermore posit that the sense of togetherness prompts players to be more cooperative and engaged. That is, we expect that these measurable features of body activity promote togetherness and are themselves strengthened by the emotional rewards that they facilitate (i.e., a cycle of positive rewards).
We found a positive relationship between duos’ similarity in togetherness ratings and the similarity scores for their written responses. This relationship suggests that musicians who shared a concept of what togetherness means responded similarly to both tasks. With the participants’ written responses, we captured a concise description of how classical musicians understand musical togetherness. Musical coherence (alignment in musical features and interpretation) seems to be fundamental, and temporal coherence was so prominent among their comments that it merited its own theme. These themes describe aesthetic outcomes, making the point that togetherness is partly about aesthetics. For classical musicians, aesthetic outcomes primarily involve performers converging onto a shared interpretation. We would expect that musicians in other genres and traditions would describe their aesthetic aims differently.
Based on their written responses, participants also seem to have an awareness of moving and breathing together. While we did not see a relationship between inhalation synchrony and togetherness, inhalation synchrony did correlate with head coordination for both pieces. Moving together might be perceived as breathing together from the perspective of those inside the interaction. It is also notable that so many of the pianists commented on breathing together with the singers. Further study is needed to clarify what musicians mean by breathing together (how precise must it be?) and in what contexts it is relevant for their feelings of togetherness. Future studies should also test for possible alignment in musical partners’ exhalations, which can also have musical significance (Sakaguchi & Aiba, 2016).
The remaining three themes refer to sensations rather than measurable body activity: emotion, feeling together, and effortlessness. Feeling together and effortlessness seem to relate to a state of perceived shared intentions where players feel that they do not need to expend effort on predicting what their partner will do or correcting for coordination errors (Sawyer, 2006). These themes also signify reduced uncertainty about whether coordination will be successful. We defined separate themes for comments about feeling together and comments about feeling shared emotion about the music. This distinction is in line with the idea that the musical material and the uncertainty of interaction are nominally separable sources of shared emotion for ensemble musicians. They can share musical emotions and additionally share emotions relating to the success of their interaction (Salmela & Nagatsu, 2017).
The results of this study provide some evidence for aspects of Bishop’s (2024) model. First, our results show that musical togetherness is dynamic, as it fluctuated within performances. A qualitative inspection of the togetherness ratings that participants mapped out suggests that people might have differed in the temporal scale at which they experienced (or reflected on) togetherness. Some made near-constant changes with the slider while others showed prolonged periods of stability. Many participants stayed almost entirely in the upper half of the slider range, suggesting that they experienced a moderate-to-strong sense of togetherness overall. Often, slider changes seemed to happen decisively at particular moments. We see this in Figure 4 where the curves often have slopes approaching ±1. These seemingly instantaneous responses suggest that as musicians reflect on their sense of togetherness, their judgements can be affected by discrete moments or events. Gesbert et al. (2022), similarly, found that artistic swimmers’ ratings of togetherness related to specific events such as coordination errors, which subsequently triggered corrective actions.
The current study also supports Bishop’s (2024) argument that togetherness is an individual rather than group-level phenomenon. We found only a slight correlation between partners in their ratings, which suggests that their experiences are related but sometimes diverge. Ensemble members might approach an interaction with different expectations about what should happen, and they might be satisfied by different aesthetic outcomes. They might also differ in their sensitivity to each other’s body cues and emotional states. A point for further study would be to identify situations that enable particular agreement or disagreement in how together musicians feel.
This study is subject to some limitations. First, our conclusions about the relationship between musical togetherness and measures of body activity are based on a single performance per piece by each duo. Multiple performances of the same pieces and a larger number of pieces would have been preferred for a stronger design, but would not have been practical in the context of an already long and demanding experiment session. A second limitation is that participants each did the experiment twice with different partners. This might have affected how they engaged with the pieces the second time around (either increasing engagement because of familiarity or reducing engagement because of boredom), although we do not expect that such effects were systematic.
A third limitation relates to the range of togetherness experiences that we were able to capture. A broader range of experiences (stronger and weaker) might have yielded stronger effects. However, it is challenging in the context of a controlled experiment to create conditions that enable either a very strong or a very weak sense of togetherness. Our recommendations for future studies of togetherness are to select repertoire that is challenging but enjoyable to play, as this incentivises participants to engage in the task of performing it, and to design the procedure to mimic the flow of situations that participants are familiar with, like a rehearsal or studio recording.
Finally, we should acknowledge that togetherness ratings that are collected during a stimulated recall task can only approximate the actual sense of togetherness that musicians experience during a performance. Our participants completed the stimulated recall task for their self-selected best performance, and it was sometimes the case that there were one to two intervening performances (if they preferred their first or second of the three post-rehearsal performances). For these participants, there might have been memory interference from the intervening performances. In a stimulated recall task, participants might also struggle to focus their judgements on their memory of what they experienced rather than what they perceive in their recorded performance. They might be unsure of how they felt, especially if their sense of togetherness fluctuated around a moderate level without any memorable extremes. The task might also prompt them to re-evaluate their experiences in light of their beliefs about what togetherness looks and sounds like and what they imagine that the experimenters are looking for. Nevertheless, stimulated recall is a useful strategy for obtaining information about musicians’ experiences without disrupting their real-time performance, and is commonly used in ethnographic research (Dempsey, 2010).
In conclusion, the novelty of this study is that it tested how patterns of bodily activity that are theoretically linked to musical interactivity and engagement relate to musicians’ self-reported experiences of musical togetherness. Our findings show that classical musicians’ sense of togetherness is linked to cognitive arousal and how strongly they coordinate their expressive body motion, although these relationships seem to depend on the demands of specific pieces. Our findings support a more developed understanding of how musical togetherness manifests in classical ensembles and how ensemble musicians understand and reflect on it.
Supplemental Material
sj-pdf-1-pom-10.1177_03057356261436796 – Supplemental material for Moving together, feeling together: Body coordination, pupil size, and musical togetherness in classical duo performance
Supplemental material, sj-pdf-1-pom-10.1177_03057356261436796 for Moving together, feeling together: Body coordination, pupil size, and musical togetherness in classical duo performance by Laura Bishop, Anna Niemand, Sara D’Amario and Werner Goebl in Psychology of Music
Supplemental Material
sj-tex-2-pom-10.1177_03057356261436796 – Supplemental material for Moving together, feeling together: Body coordination, pupil size, and musical togetherness in classical duo performance
Supplemental material, sj-tex-2-pom-10.1177_03057356261436796 for Moving together, feeling together: Body coordination, pupil size, and musical togetherness in classical duo performance by Laura Bishop, Anna Niemand, Sara D’Amario and Werner Goebl in Psychology of Music
Footnotes
Ethical considerations
The study received ethical approval from the Ethics Committee of the University of Music and Performing Arts Vienna.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the University of Oslo and the Research Council of Norway through its Centres of Excellence scheme, project number 262762, and the Austrian Science Fund (FWF), project P32642.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Supplemental material
Supplemental material for this article is available online.
