Abstract
Teleoperation is the act of controlling an object that exists in a space, real or virtual, physically disconnected from the user. During such situations, it is not uncommon to observe those controlling the remote object exhibiting movement consistent with the behaviour of the remote object. Though this behaviour has no obvious impact on one's control of the remote object, it appears tied to one's intentions and may thus be an embodied representation of ongoing cognitive processes. In the present investigation, we applied a natural behaviour approach to test this notion, (a) first by identifying the representational basis for the behaviour and (b) by identifying factors that influence the occurrence of the behaviour. Each study involved observing participant behaviour while they played a racing video game. Results revealed that the spontaneous behaviour demonstrated in a teleoperation setting is tied to one's remote actions, rather than local actions or some combination of remote and local actions (Experiment 1). In addition, increasing task demand led to an increase in the occurrence of the spontaneous behaviour (Experiment 2). A third experiment was conducted to rule out the possible confound of greater immersion that tends to accompany greater demand (Experiment 3). These results suggest that spontaneous behaviour observed during teleoperation reflects a form of visible embodiment that is sensitive to task demand, and they further emphasize the utility of natural behaviour approaches for understanding the relationship between the body and cognitive processes.
Research in cognitive science has begun to place a greater emphasis on understanding the embodied and embedded aspects of cognition (Barsalou, 2010; Clark, 2010; Glenberg, 2010; Hollan, Hutchins, & Kirsh, 2000; Hutchins, 1995; Killeen & Glenberg, 2010; Kirsh, 1996; 2010; Pfeifer & Bongard, 2006; Wilson, 2002). One useful tool in this pursuit involves the systematic investigation of spontaneous natural behaviours that emerge in the context of a cognitive task (i.e., a natural behaviour approach). This approach is modelled after more ethnographic (e.g., Hollan et al., 2000; Hutchins, 1995) and ethological traditions (Kingstone, Smilek, & Eastwood, 2008; Tinbergen, 1963) that emphasize the systematic observation and description of natural behaviour, and as such can provide a ready window into the relation between brain, body, and world. This approach has been used successfully in a number of different domains (e.g., Chisholm, Risko, & Kingstone, 2013; Goldin-Meadow, 2005; Kirsh, 1995; Risko & Kingstone, 2011; Risko, Medimorec, Chisholm, & Kingstone, in press; Schwartz & Black, 1996; Vallee-Tourangeau & Wrightman, 2010). In the present investigation, we apply this approach in the context of teleoperation in order to provide new insights into the representation of remote actions.
One area that has relied heavily on investigating natural behaviour is gesture (e.g., Alibali, Spencer, Knox, & Kita, 2011; Chu & Kita, 2008, 2011; Goldin-Meadow, 1999, 2005; Goldin-Meadow, Nusbaum, Kelly, & Wagner, 2001). In this case, a controlled setting is created that engages the cognitive act of interest (i.e., communicating through speech), and the behaviours that accompany speech (i.e., the gestures) are systematically observed and subjected to various manipulations (including restriction of the behaviour) that elucidate their cognitive function. The gesture-as-simulated action (GSA) theory (Hostetter & Alibali, 2008) is one proposal that has emerged from this body of work to account for the natural production of gestures. Influenced by common coding theory, which argues that the simulation, perception, and execution of actions are commonly coded in the brain (e.g., if one imagines or perceives an action, similar activation is seen in neural regions responsible for the actual execution of that action; Hommel, Musseler, Aschersleben, & Prinz, 2001; Jeannerod, 1994; Prinz, 1992, 1997), the GSA theory (Hostetter & Alibali, 2008) argues that gestures reflect the embodied nature of language and mental imagery. Specifically, language and mental imagery involve perceptual and motor simulations that are grounded in the same neural mechanisms as those responsible for online perception and action. In many cases during offline cognition the simulations are covert, but in some cases activation spreads in such a way that the simulation turns into an overt action, and, in the words of Hostetter and Alibali (2008), “a gesture is born” (p. 503). For example, when conveying spatial information, a simulation of directional information is likely to be generated in premotor regions, which has the potential to activate motor regions, leading to an overt representation of that spatial information (e.g., directional hand gestures). 
A similar account of how covert simulations can emerge as overt spontaneous behaviours has also been offered by Chandrasekharan, Athreya, and Srinivasan (2010). In the present context, the critical notion to take from the GSA and related frameworks is that overt natural behaviours that emerge in the context of a cognitive act can, in and of themselves, provide a window into the very nature of the mechanisms underlying that cognitive activity. In other words, these spontaneous natural behaviours can make the embodied nature of cognition visible (Chandrasekharan et al., 2010; Hostetter & Alibali, 2008). Critically, we argue that this is unlikely to be limited to gesture (Chisholm et al., 2013; Risko et al., in press). Here we expand this notion of natural behaviour providing a window into the embodied nature of cognition to a novel context—teleoperation.
Teleoperation presents an increasingly common situation where individuals use a local control device to execute actions in a remote space via some intermediate (e.g., machine) or virtual actor. Thus, an intended action of a user is transmitted via this mediating technology. Specifically, once an action plan in the distal (remote) space is established, individuals must generate proximal actions that lead to the realization of those distal goals (i.e., the intention to move an object left requires the formation of an action on the control device in a manner that would lead to that outcome). Riva and Mantovani (2012) have recently provided a framework for conceptualizing these distinct action spaces. Namely, using the body to act on a local control device has been referred to as first-order mediated actions (similar to tool use situations) whereas actions that occur in a remote or virtual space, as a result of first-order mediated actions, are referred to as second-order mediated actions. Examples of situations that involve second-order mediated actions consist of remote surgery, operating a crane, and playing video games. According to the Riva and Mantovani (2012) framework, the result of effectively executing first-order actions (incorporation) is similar to the effects demonstrated in the tool-use literature—namely, individuals experience an extension of their peripersonal space (e.g., Farnè, Bonifazi, & Làdavas, 2005; Farnè, Serino, & Làdavas, 2007; Maravita & Iriki, 2004). However, in the context of teleoperation, this local body-tool incorporation is combined with remote actions that are spatially disconnected from the local space. Thus, in addition to an extension of peripersonal space, second-order mediated actions lead to an extension of one's extrapersonal space (incarnation). According to Riva and Mantovani (2012), within this context, both first- and second-order actions generate their own body-based action representations. 
Thus, like the GSA theory, this theory postulates that both one's plan regarding how to act on a local object and one's plan regarding how an action should be implemented in remote space are represented in the same perceptual/motor systems as those that would actually realize those goals. This suggests that a systematic investigation of the overt behaviours that emerge during such situations could provide important insights into the nature of the action representations formulated in teleoperation contexts.
Within the context of teleoperation, it is not uncommon to observe spontaneous body movements that appear consistent with the actions of the remotely controlled object (e.g., rightward sway when trying to move an object to the right). Interestingly, these body movements do not seem to have any apparent direct causal link to the control of the remote object (e.g., rightward sway of the body does not itself move a remotely controlled object to the right). This behaviour, whether derived by one's local goal (i.e., controller input) or remote goal, like gestures, may represent the visible embodiment of the cognitive mechanisms underlying action representations in this context. In particular, these movements might represent the visible embodiment of the remote or locally derived goal (i.e., first- and/or second-order mediated action). This possibility falls out of the Riva and Mantovani (2012) account, suggesting that both types of actions are grounded in the neural regions responsible for realizing that action in one's personal space (i.e., covert simulation). In other words, when I want to move either the local (first-order) or remote tool (second-order) to the right, I might actually end up moving my body to the right (i.e., covert simulation becomes overt). This explanation of the behaviour immediately raises questions regarding (a) whether the spontaneous body movements observed during teleoperation reflect first- or second-order mediated actions (i.e., local goal-derived movement or remote goal-derived movement, respectively) or some combination of both and (b) whether the occurrence of such behaviour is sensitive to manipulations that have been postulated to influence the likelihood of a covert simulation becoming overt (Chandrasekharan et al., 2010; Hostetter & Alibali, 2008). We address both of these questions in the present investigation.
Experiment 1
Determining whether movement observed during teleoperation reflects the visible embodiment of first- or second-order mediated actions is not straightforward given the typical correspondence between first- and second-order actions (e.g., moving a remote object to the left is typically achieved with a leftward directional input). Thus, it is unclear whether the physical movement observed in the context of controlling a remote object reflects an overt realization of the individual's first- or second-order actions.
In order to distinguish between these two alternatives, we first needed to create an experimental setting to systematically observe the behaviour of interest. Using natural behaviour as a means to understand cognition requires that individuals engage in the behaviour in a relatively unconstrained laboratory setting. Although teleoperation is perhaps more commonly associated with military, surgical, or robotic settings, we employed a more cost effective and familiar, yet conceptually similar, teleoperation setting. We recorded participant behaviour while they played a video game, a situation where participants remotely control a virtual representation through a local input device. We have previously used video game playing as a context to investigate natural behaviour (Chisholm et al., 2013) as it presents participants with a more cognitively complex and arguably more natural task to perform. Critically, the behaviour of interest is common in this type of task. For example, when controlling a virtual character to turn around a corner, players can be observed to lean in the same direction as their character. As noted previously, this movement occurs despite the behaviour not having any apparent functional connection with the remote object.
To assess whether this behaviour represents the individual's first- or second-order mediated actions, we manipulated the correspondence between the first- and second-order mediated actions. Specifically, one group of participants played a racing video game with the control mapping reversed (i.e., pushing left would move the vehicle to the right and vice versa). In this condition the first- and second-order mediated actions are placed in opposition, thus providing an opportunity for us to disambiguate their origin. Specifically, if these behaviours reflect the first-order mediated actions (i.e., are local goal-derived movements), then the overt behaviour should be consistent with the actions executed on the local control device (e.g., leaning right when pushing right). Alternatively, if these behaviours reflect second-order mediated actions (i.e., are remote goal-derived movements), then the overt behaviour should be consistent with the movement of the remote vehicle (e.g., in the reversed mapping condition leaning right when pushing left to move the vehicle to the right).
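The logic of the mapping manipulation can be sketched in a few lines of code. This is only an illustration of the predictions described above, not the coding software used in the study; the names `Direction`, `vehicle_direction`, and `classify_lean` are hypothetical.

```python
from enum import Enum


class Direction(Enum):
    LEFT = -1
    RIGHT = 1


def vehicle_direction(stick: Direction, reversed_mapping: bool) -> Direction:
    """Direction the remote vehicle turns for a given stick input."""
    return Direction(-stick.value) if reversed_mapping else stick


def classify_lean(lean: Direction, stick: Direction, reversed_mapping: bool) -> str:
    """Label a body lean relative to the local input and the remote movement."""
    remote = vehicle_direction(stick, reversed_mapping)
    matches_local = lean == stick    # consistent with the first-order (local) action
    matches_remote = lean == remote  # consistent with the second-order (remote) action
    if matches_local and matches_remote:
        # Normal mapping: local and remote directions coincide, so the
        # source of the movement cannot be disambiguated.
        return "ambiguous"
    if matches_remote:
        return "second-order"
    if matches_local:
        return "first-order"
    return "neither"


if __name__ == "__main__":
    # Reversed mapping: pushing left turns the vehicle right.
    print(classify_lean(Direction.RIGHT, Direction.LEFT, reversed_mapping=True))
    print(classify_lean(Direction.RIGHT, Direction.RIGHT, reversed_mapping=True))
    print(classify_lean(Direction.LEFT, Direction.LEFT, reversed_mapping=False))
```

Only the reversed mapping condition separates the two labels; under normal mapping every consistent lean is ambiguous between them, which is why the manipulation is needed.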
The reversed mapping condition provides an opportunity to identify the source of the overt behaviour. In addition, comparing the overall frequency of the behaviour in that condition (i.e., collapsing across first- and second-order consistent movements) with its frequency in a control condition can assess the extent to which the two action plans interfere with one another. Therefore, we also included a control condition in which participants played the video game with normal control settings. Specifically, if the overall frequency of the behaviour decreases in the reversed mapping condition relative to the normal mapping condition, this would provide evidence that the two action plans compete at some level within the action planning sequence. This would be compatible with much work in the stimulus–response compatibility literature (Kornblum, Hasbroucq, & Osman, 1990) and, critically, would be inconsistent with the Riva and Mantovani (2012) suggestion that the action plans underlying first- and second-order mediated actions are independent.
Method
Participants
Eighteen participants (13 females) were recruited from the University of British Columbia and received course credit or monetary compensation for their participation. All participants provided written informed consent.
Apparatus
Participants sat approximately 150 cm away from a Samsung 40-inch LCD high-definition television. The display was connected to a Sony PlayStation 3 (PS3) video game console and Motorstorm (Sony Computer Entertainment), an off-road racing video game, was chosen for participants to play. A racing game was chosen for the purpose of simplifying the behavioural data analysis as it involves primarily only left/right directional input. Thus, the occurrence of overt movements would also be largely limited to these directions rather than all possible directions, which could occur with more open-control-based games (e.g., first-person shooters). Game sessions were played in a moderately lit sound-attenuated chamber with a PS3 wireless DualShock 3 controller. Participant behaviour during gameplay was recorded with three webcams. One camera recorded the game progress shown on the LCD HDTV, another was set to record the lateral view of the player's body, and the third recorded the player's whole front body (see Figure 1). Video data were coded with custom video annotation software. Finally, the mental, physical, and temporal demand as well as effort items from the NASA (National Aeronautics and Space Administration) Task Load Index (TLX; Hart & Staveland, 1988) were used to ensure equal task demands across groups.

Figure 1. Example of experimental set-up. Participants sat in a sound-attenuated chamber where two cameras recorded participant behaviour (front and lateral views), and another recorded the video game display.
Procedure
Participants were randomly assigned to either the normal or the reversed mapping condition. For those in the normal mapping condition, the analogue control stick mapped normal left and right vehicle movements. For those in the reversed mapping condition, the controller was inverted so that the directional mapping would be reversed. That is, pressing left on the analogue stick turned the vehicle to the right, and vice versa. Prior to any gameplay, participants were provided with a brief overview of the video game's controls. All races were played in a first-person perspective (see Footnote 1).
Footnote 1. A first-person perspective was chosen to provide participants with a more compelling experience. However, given recent evidence from the field of perspective taking (e.g., Vogeley & Fink, 2003), whether perspective is an important factor in the emergence of spontaneous behaviours is an interesting question for future investigation.
Following an overview of game controls, participants played two 10-min sessions alone in the sound-attenuated room. Two different racing tracks were used to provide some variety in gameplay and to avoid game fatigue or boredom. Participants played in a time trial mode, which required them to race alone and attempt to achieve their best lap time. Average lap time and number of vehicle crashes were recorded as measures of performance. Participants also completed the NASA TLX questionnaire after each race. Due to the nature of our task, participants were also asked to record their prior experience with video games.
Results
Following data collection, all recorded videos were coded for overt movement during gameplay. Movement was recorded as first- or second-order consistent if the behaviour was linked to, and occurred in time with, local controller input (e.g., body leaning or hand tilting tied to the direction of controller input) or in-game events (e.g., body leaning or hand tilting tied to the direction or changes in vehicle movement), respectively. Given the correspondence between first- and second-order actions in the normally mapped condition, this distinction of movement categories was unnecessary. However, such division was critical for the reversed mapping condition. A second individual coded a pseudorandom selection of 25% of the videos, and an assessment of interrater reliability for coding of participant movement revealed highly consistent ratings for the frequency of the overt behaviour (r = .95, p < .001). Coders were not blind to conditions as knowledge of normal or reversed controls was required to efficiently code whether an observed movement was consistent with the local input or the remote actions. Coders were, however, naive to the theories and predictions of the study itself. For all analyses, we did not observe any effects associated with tracks or race session (all ps > .05); therefore, all measures were averaged across these order variables.
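As an illustration of how interrater reliability of this kind can be quantified, the sketch below computes a Pearson correlation between two coders' movement counts. The counts are hypothetical placeholders, not data from the study, and the function name `pearson_r` is our own.

```python
import math


def pearson_r(x: list, y: list) -> float:
    """Pearson correlation between two equal-length sequences of counts."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 items)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical movement counts per video from two independent coders.
CODER_A = [12, 30, 45, 22, 51]
CODER_B = [14, 28, 47, 20, 49]

if __name__ == "__main__":
    print(f"interrater r = {pearson_r(CODER_A, CODER_B):.2f}")
```

A correlation of this kind captures agreement in the relative frequency of coded movements across videos; it does not by itself show that the coders agreed on each individual movement.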
We compared TLX ratings, where higher values on perceived workload (e.g., mental, physical, or temporal) represent greater demand, and performance measures across normal and reversed mapping conditions. For the TLX, results from independent-samples t tests, with Cohen's d reported as a measure of effect size, revealed no differences in ratings for mental demand (normal = 10.1, reversed = 13.8), t(16) = 1.74, p = .10, d = 0.82, physical demand (normal = 7.5, reversed = 7.1), t(16) = 0.16, p = .88, d = 0.07, temporal demand (normal = 13.5, reversed = 13.4), t(16) = 0.01, p = .99, d = 0.005, and effort on task (normal = 13.1, reversed = 12.3), t(16) = 0.39, p = .55, d = 0.18. In order to provide a summary measure of demand, we collapsed ratings across mental, physical, and temporal demand and effort. No difference in overall demand was observed between normal and reversed mapping conditions (normal = 44.1, reversed = 46.6), t(16) = 0.33, p = .75, d = 0.15, indicating that the different mappings exerted equivalent demands on the participants (see Footnote 2).
Footnote 2. Analysing these data with nonparametric statistical methods also yielded the same outcomes. This was also the case for the analyses in Experiments 2 and 3.
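The relation between the reported t statistics and effect sizes can be checked with a short calculation. The sketch below assumes the standard pooled-SD conversion d = t * sqrt(1/n1 + 1/n2) and an equal 9/9 split of the 18 participants; the split is an assumption, since the degrees of freedom fix only the total sample size. It also checks whether the summary demand score is consistent with a simple sum of the four subscale means.

```python
import math


def cohens_d_from_t(t: float, n1: int, n2: int) -> float:
    """Cohen's d implied by an independent-samples t statistic (pooled SD)."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


# Assumption: 18 participants split 9/9 (df = 16 fixes the total, not the split).
N_NORMAL, N_REVERSED = 9, 9

# (t, reported d) pairs copied from the Experiment 1 TLX results.
REPORTED = {
    "mental": (1.74, 0.82),
    "physical": (0.16, 0.07),
    "temporal": (0.01, 0.005),
    "effort": (0.39, 0.18),
    "overall": (0.33, 0.15),
}

# Subscale means (normal, reversed) copied from the text.
SUBSCALE_MEANS = {
    "mental": (10.1, 13.8),
    "physical": (7.5, 7.1),
    "temporal": (13.5, 13.4),
    "effort": (13.1, 12.3),
}


def implied_effect_sizes() -> dict:
    """Effect size implied by each reported t under the 9/9 assumption."""
    return {
        name: cohens_d_from_t(t, N_NORMAL, N_REVERSED)
        for name, (t, _) in REPORTED.items()
    }


def summed_means() -> tuple:
    """Sum of the four subscale means for the (normal, reversed) conditions."""
    normal = sum(pair[0] for pair in SUBSCALE_MEANS.values())
    reversed_ = sum(pair[1] for pair in SUBSCALE_MEANS.values())
    return normal, reversed_


if __name__ == "__main__":
    for name, d in implied_effect_sizes().items():
        print(f"{name}: implied d = {d:.3f}, reported d = {REPORTED[name][1]}")
    print("summed subscale means (normal, reversed):", summed_means())
```

Under these assumptions the implied values agree with the reported effect sizes to within rounding, and the summed subscale means land within 0.1 of the reported overall scores, which is consistent with "collapsed" meaning a simple sum.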
The gender distribution in normal and reversed conditions was uneven—more females were randomly assigned to the reversed mapping condition (7 vs. 4). Although gender played a role in task performance (i.e., trend for males to outperform females, partially accounting for the lap time result), the frequency of the spontaneous behaviour was not influenced by gender (p > .05).
Finally, to assess whether a reduction in overt movement in the reversed condition was observed as a result of having to generate two, potentially competing, action plans, an independent-samples t test was used to compare the total frequency of overt movements (i.e., sum of first- and second-order consistent movements) across normal and reverse mapping conditions. Participants in the normal mapping condition produced, on average, 45.5 movements compared to the 41.3 movements observed in the reverse mapping condition. Analysis revealed that both normal and reversed mapping conditions produced the same overall amount of overt movement, t(16) = 0.36, p = .72, d = 0.17. Further, comparing the frequency of second-order consistent movement in the reverse mapping condition to the movement observed in the normal mapping condition revealed no difference (normal = 45.5, second order = 39.2), t(16) = 0.56, p = .58, d = 0.27.
Discussion
The results of Experiment 1 are straightforward and revealing. First, the controlled setting we created to observe the behaviour was sufficient to elicit and record the behaviour in a natural and unprompted fashion. That is, participants demonstrated bodily movement that had no apparent direct impact on the task at hand but was clearly linked to the intended actions. In addition, coders were able to reliably detect this behaviour from video records. As a result we were able to determine the basis for such movement in a teleoperation setting. Almost all movement was consistent with the actions of the remotely controlled vehicle, even when this movement was in opposition to the physical actions necessary to execute such control. Specifically, in the reverse mapped condition, individuals actually moved in the direction opposite to the directional input that they exerted on the controller. Thus, the behaviour reflects a remote goal-derived movement (rGDM) in that the movement observed reflects what the participant intended to accomplish in the remote space (second-order mediated actions) and not the actual action the participant had to execute in order to accomplish that goal (first-order mediated actions).
The observation that the overall frequency of rGDMs was equal across normal and reverse mapping conditions suggests that the generation of first- and second-order mediated actions did not interfere with each other and that the behaviour is independent of the nature of the local action. This finding provides support for the notion that both action plans are generated separately (Riva & Mantovani, 2012) and is consistent with the idea that the movement reflects an overt manifestation of second-order mediated action representations rather than first-order mediated action representations.
Experiment 2
Results from Experiment 1 demonstrate that the overt behaviour observed during a teleoperation task is tied to the remote- or second-order mediated actions and is unaffected by change in the mapping of the local action. The view that rGDM represents an overt manifestation of an embodied simulation necessarily raises questions about the eliciting conditions. Specifically, if the default state of the system is to keep simulations for spontaneous actions covert (e.g., by inhibiting them; Chandrasekharan et al., 2010; Hommel et al., 2001; Hostetter & Alibali, 2008), then what determines whether a given simulation becomes overt? In both the GSA model (Hostetter & Alibali, 2008), and Chandrasekharan et al.'s (2010) cognitive-demand-modulated model, whether a simulated action becomes overt is believed to be dependent on the strength of the activation associated with the simulated action and whether this activity surpasses a given threshold. Critically, keeping this activity below threshold is thought to require the expenditure of cognitive resources. For example, such resources would be devoted to inhibiting the action simulation. Thus, if there are fewer resources available in the system, then simulations should be more likely to become overt. Much evidence has been provided from the gesture literature to support the influence of task demand on the expression of gestures. For example, when verbally conveying more challenging or complex material to others, the occurrence of gesturing increases (e.g., Hostetter, Alibali, & Kita, 2007; Melinger & Kita, 2007). Chandrasekharan et al. (2010) also demonstrated that gesturing during a spatial visualization task was more frequent in a high-demand condition than in a low-demand condition. In Experiment 2, we explore the predicted relation between demand and overt manifestations of covert simulations in the context of teleoperation.
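The threshold idea shared by these accounts can be illustrated with a toy simulation. This is a deliberately simplified sketch and not a formalization taken from either model: the uniform activation distribution, the linear link between available resources and threshold, and all numeric constants are assumptions chosen only to show the predicted direction of the effect.

```python
import random


def simulate_overt_count(n_events: int, resources: float, seed: int = 0) -> int:
    """Count simulated actions whose activation exceeds the inhibition threshold.

    resources: proportion of cognitive resources free for inhibition (0 to 1).
    Assumption: more free resources give a higher threshold (stronger inhibition).
    """
    if not 0.0 <= resources <= 1.0:
        raise ValueError("resources must lie between 0 and 1")
    rng = random.Random(seed)
    threshold = 0.5 + 0.4 * resources  # hypothetical linear link
    overt = 0
    for _ in range(n_events):
        activation = rng.random()  # hypothetical activation strength in [0, 1)
        if activation > threshold:
            overt += 1  # the covert simulation becomes an overt movement
    return overt


if __name__ == "__main__":
    # High task demand leaves fewer resources free for inhibition.
    print("high demand:", simulate_overt_count(1000, resources=0.2))
    print("low demand:", simulate_overt_count(1000, resources=0.8))
```

With the same sequence of simulated activations, the condition with fewer free resources yields more overt movements. That is the qualitative prediction tested in Experiment 2.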
In order to investigate the association between demand and rGDM, we manipulated task difficulty by instructing participants either to play the same racing video game very slowly (low demand) or to race as fast as they could (high demand). Given the results from Experiment 1 and the correspondence of local and remote actions with normally mapped controls, overt movements were coded for whether they were consistent with in-game actions. Based on the purported link between demand and overt manifestations of covert simulations, there should be fewer rGDMs in the low-demand condition than in the high-demand condition.
Method
Participants
Sixteen participants (8 females, ages 18–23 years) were recruited from the University of British Columbia and received course credit or monetary compensation for their participation. All participants provided written informed consent.
Apparatus and procedure
The set-up and procedure were identical to those of Experiment 1 except for the following changes. Participants played both game sessions with normally mapped controls; however, during one session participants were instructed to race as fast as they could, utilizing speed boosts as much as possible, whereas during the other session participants were instructed to refrain from using any speed boosts and were told to take their time, casually driving through the track. The order of driving instruction was counterbalanced. At the end of each race session, in addition to completing the NASA TLX questionnaire (Hart & Staveland, 1988), participants also completed a state immersion questionnaire (Jennett et al., 2008).
Results
To assess whether participants properly followed our instructions, the average race completion time (i.e., two laps) was compared across conditions. Analysis revealed faster completion times in the drive fast condition (M = 334 s) than in the drive slow condition (M = 386 s), t(15) = 2.56, p = .02, d = 0.99. An assessment of the average number of crashes during each race session also revealed more crashes in the drive fast condition (M = 7.31) than in the drive slow condition (M = 3.69), t(15) = 3.54, p = .003, d = 1.14, providing performance-based confirmation that our manipulation influenced task demand.
In order to assess whether the game speed manipulation affected perceived task demands, we compared responses on the NASA TLX scale across the drive slow (low demand) and drive fast (high demand) conditions. Results from paired-sample t tests revealed that participants experienced greater mental (13.6 vs. 10.0), t(15) = 3.06, p = .008, d = 0.80 (see Footnote 4),
Footnote 4. It is worth noting the similarity in averages for mental demand in Experiments 1 and 2 despite the effect being significant in Experiment 2 and not Experiment 1. A Bayesian analysis (Rouder, Speckman, Sun, Morey, & Iverson, 2009) revealed Bayesian (JZS) factors of 1.00 (Experiment 1) and 0.16 (Experiment 2), which provide no evidence and substantial evidence for a difference in mental demand across conditions, respectively (Wagenmakers, Wetzels, Borsboom, & van der Maas, 2011). Importantly, the comparison of the overall demand measure in both experiments confirms this pattern.
A single coder, blind to experimental conditions, coded videos for rGDMs. In the present case, as all participants used normally mapped controls, this was also considered as game-consistent movement. In order to standardize the exposure of in-game events across participants (i.e., all participants experienced equal number of turns and obstacles), videos were coded only to the maximum number of laps completed by all participants. Thus, the results below are based on coded behaviour from only the first two laps of each session. A second individual coded a pseudorandom selection of 25% of the videos, and analysis of interrater reliability again revealed high agreement amongst coders (r = .74, p < .05). The critical analysis then compared the frequency of rGDM across both demand conditions. The result of this analysis revealed that significantly more rGDMs were produced in the high-demand (M = 21.6) condition than in the low-demand condition (M = 11.9), t(15) = 3.10, p = .007, d = 0.68.
Discussion
In Experiment 2, the task instruction—namely, to drive fast or slow—influenced the cognitive demand of the task. Critically, this manipulation yielded a significant modulation of rGDM. Participants produced more spontaneous actions during a situation that they reported required more effort and felt was more mentally and temporally demanding. This result is consistent with recent theories that argue that task demand is a critical factor in determining whether a simulated action is expressed overtly (Chandrasekharan et al., 2010; Hostetter & Alibali, 2008) and extends such claims to the context of teleoperation.
One potential alternative to the demand-based account of Experiment 2 can be derived from the observation that in Experiment 2 participants experienced a greater sense of immersion during the more demanding task. A sense of immersion can occur when individuals are presented with an experience that encourages them to feel more enveloped or engaged in the virtual or remote space (Brown & Cairns, 2004; Jennett et al., 2008). Past work has also demonstrated that a greater sense of immersion can lead to more pronounced spontaneous behaviour in the context of more passive video watching. Specifically, greater lateral leaning has been observed when participants viewed videos in a more immersive context (i.e., 3D head-mounted display) than in a less immersive context (i.e., 3D consumer television; Hoshino, Takahashi, Oyamada, Ohmi, & Yoshizawa, 1997). A similar finding was also reported by Freeman, Avons, Meddis, Pearson, and Ijsselsteijn (2000), who demonstrated a trend for greater lateral spontaneous movement when presented with stereoscopic than with monoscopic stimuli. Therefore, in the present context, it is possible that the increase in spontaneous movements (i.e., an increase in covert simulations becoming overt) may have been a result of participants feeling more immersed in the game experience, rather than the proposed demand-based mechanism. We assess this possibility in Experiment 3.
Experiment 3
In order to determine whether the results of Experiment 2 were due to our driving manipulation or due to a greater sense of immersion experienced by participants in the high-driving-demand condition, we kept driving demand constant while manipulating participants' sense of immersion. In doing so, we could assess the effect immersion may have on the occurrence of rGDM. If immersion is sufficient to influence the occurrence of rGDM, then the prediction is that a greater sense of immersion will lead to a significant increase in rGDMs. Disconfirming this prediction would suggest that (a) rGDM is not sensitive to between-condition differences in immersion, thus dissociating it from the leaning behaviour seen in previous work (Freeman et al., 2000; Hoshino et al., 1997), and (b) the results in Experiment 2 are best understood to reflect changes in driving demand rather than immersion.
In the present study, participants played the same racing game with either the presence or absence of sound. We predicted that playing the game with sound would provide a more immersive experience than playing in the absence of sound. Data collected from a pilot study revealed that this manipulation was effective in influencing participants' sense of immersion.
Method
Participants
Twenty-four participants (10 females, ages 18–27 years) were recruited from the University of British Columbia and received course credit or monetary compensation for their participation. All participants provided written informed consent.
Apparatus and procedure
The set-up and procedure were identical to those of Experiment 2 except that participants completed both racing sessions at their own pace and that during one session the sound was turned off. The order of sound condition was counterbalanced. At the end of each race session, participants again completed the NASA TLX (Hart & Staveland, 1988) and state immersion questionnaires (Jennett et al., 2008).
Results
We first assessed whether the sound present versus sound absent manipulation was effective in influencing participants' subjective experience of immersion. Results revealed that participants experienced significantly greater immersion when the sound was present (M = 102.3) than when the sound was absent (M = 96.3), t(23) = 3.32, p = .003, d = 0.39. Analysis of participants' NASA TLX responses revealed no differences between sound present and absent conditions in mental (11.3 vs. 11.6), t(23) = 0.53, p = .60, d = 0.07, or temporal (13.5 vs. 12.6), t(23) = 1.20, p = .24, d = 0.19, demand as well as the reported effort needed to perform the task (12.2 vs. 11.6), t(23) = 0.61, p = .55, d = 0.12. Participants did report higher physical demand in the sound present condition (4.8 vs. 4.3), t(23) = 2.20, p = .04, d = 0.11. Because there was no actual manipulation of physical demand (i.e., no physical difference in the task or environment when sound was present versus absent), we attribute this effect to the immersion manipulation rather than to any demand effect. Analysis of the overall demand measure also revealed no differences between conditions (sound present = 41.8, sound absent = 40.1), t(23) = 1.14, p = .27, d = 0.13. Based on the results of these measures, any difference in the expression of rGDM would reflect an immersion effect rather than a task demand effect.
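For readers who wish to run the same kind of within-participant comparison on their own data, the following is a minimal sketch of a paired-samples t test with two common effect-size conventions. The function name `paired_t` and the example scores are hypothetical and are not the data reported here. This section does not state which standardizer was used for d, so both a difference-score version (d_z) and an average-SD version (d_av) are shown as assumptions.

```python
import math
import statistics


def paired_t(x, y):
    """Paired-samples t test on two equal-length lists of scores.

    Returns the t statistic, its degrees of freedom, and two
    effect sizes: d_z (mean difference / SD of the differences)
    and d_av (mean difference / average of the two condition SDs).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    t = mean_d / (sd_d / math.sqrt(n))
    d_z = mean_d / sd_d
    d_av = mean_d / ((statistics.stdev(x) + statistics.stdev(y)) / 2)
    return {"t": t, "df": n - 1, "d_z": d_z, "d_av": d_av}


# Hypothetical immersion scores for four participants (illustration only).
sound_present = [5, 7, 9, 6]
sound_absent = [4, 5, 8, 6]
result = paired_t(sound_present, sound_absent)
print(result)
```

Because d_z equals t divided by the square root of n, the two conventions can differ noticeably when scores are highly correlated across conditions, so it is worth reporting which one is used.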
A single individual, blind to experimental conditions, coded the videos for rGDM. We again standardized coding to the first two laps of each session. Interrater reliability between the primary coder and a second coder, who rated a pseudorandom selection of 25% of the videos, was again high (r = .71, p < .01). Critically, analysis of game-consistent rGDM revealed no significant difference in the frequency of movements across sound present (M = 33.3) and sound absent (M = 30.8) conditions, t(23) = 1.03, p = .31, d = 0.15. Despite the null effect, the difference between means is in the direction predicted by an immersion effect; therefore, as any conclusions drawn from this analysis are based on a null result, we used Bayesian statistical methods (Rouder et al., 2009) to assess whether we could provide additional evidence to be more confident in the null result. Results revealed a Bayesian (JZS) factor of 3.86, which can be interpreted as “substantial evidence” in favour of the null hypothesis (Wagenmakers et al., 2011).
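The JZS Bayes factor can be computed directly from the t statistic and the sample size. The sketch below follows the one-sample (paired) formula given by Rouder et al. (2009), assuming the default Cauchy prior scale of 1; the function name `jzs_bf01`, the integration limits, and the step count are our own choices for illustration rather than the authors' implementation.

```python
import math


def jzs_bf01(t, n, steps=20000):
    """JZS Bayes factor in favour of the null for a one-sample or
    paired t test (Rouder et al., 2009), assuming prior scale r = 1.

    t: observed t statistic; n: number of participants (or pairs).
    """
    nu = n - 1
    # Marginal likelihood under the null (up to a shared constant).
    null_like = (1 + t ** 2 / nu) ** (-(nu + 1) / 2)

    def integrand(g):
        a = 1 + n * g
        return (a ** -0.5
                * (1 + t ** 2 / (a * nu)) ** (-(nu + 1) / 2)
                * (2 * math.pi) ** -0.5
                * g ** -1.5
                * math.exp(-1 / (2 * g)))

    # Integrate over g in (0, inf) with the trapezoid rule on u = ln(g),
    # so dg = g du. The limits are assumed wide enough that both tails
    # contribute a negligible amount.
    lo, hi = -12.0, 40.0
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        g = math.exp(lo + i * h)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * integrand(g) * g
    alt_like = total * h
    return null_like / alt_like


# Values taken from the rGDM comparison above: t(23) = 1.03, N = 24.
print(round(jzs_bf01(1.03, 24), 2))
```

Under these assumptions the result should land close to the value of 3.86 reported above; a different prior scale would give a different Bayes factor.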
Discussion
Results from Experiment 3 are straightforward. Manipulating whether participants played the racing game with sound present or absent influenced their experience of immersion; however, despite a significant increase in immersion in the sound present condition, there was no reliable change in the frequency of rGDM. This finding disconfirms the hypothesis that the reliable change in rGDM in Experiment 2 reflected a significant change in immersion. Rather, our results support a cognitive-demand-based mechanism (Chandrasekharan et al., 2010; Hostetter & Alibali, 2008).
It is worth noting that instead of the observed rGDM representing the overt manifestation of remote actions, it may instead, given the driving context, reflect the body's tendency to engage counterforces to those typically exerted on the body when driving (i.e., we lean to the left to counteract the centrifugal force that would push the body right when turning to the left). This account would probably predict increased spontaneous movement with increased immersion, as a greater sense of feeling “in” the driving context could lead to stronger representations associated with the need to engage the counterforces typically experienced in more natural driving settings. The fact that immersion did not influence the frequency of rGDM helps to argue against this counterforce account. To further test this alternative account, we also conducted a small follow-up investigation with a nondriving context. If such a counterforce account were to be true, then it would be expected to be specific to a driving context, thus no rGDM should be observed while playing nondriving video games. Therefore, we recorded participant behaviour while they played both first- and third-person shooter video games. Contrary to this alternative account, we observed behaviour in the nondriving context identical to that observed in the driving context. Specifically, individuals would tilt their head and body in the direction they wanted their avatar to look or move within the virtual environment. Thus, we feel confident that these behaviours are not simply a learned response to driving situations.
General Discussion
In the present investigation, we created a controlled environment in which to observe and systematically investigate spontaneous overt behaviours that emerge naturally in the context of teleoperation. In such a context, where individuals control a remote object via some spatially disconnected mediated control device, it is not uncommon to observe spontaneous overt behaviour that appears to reflect the remote goals of the individual in control. Through a series of experiments, we have demonstrated that this behaviour is linked to remote or second-order mediated action representations (Experiment 1) and that this remote goal-derived movement is sensitive to concurrent task demands (Experiment 2) while ruling out the impact of user immersion (Experiment 3).
Local and remote action representations
By manipulating the correspondence between the local input and the remote actions, the present study demonstrates unequivocally that the spontaneous behaviour is linked solely to one's remote/second-order mediated actions. This is consistent with work on action control suggesting that action representations are generated based on their perceived effect on the external or distal world (e.g., James, 1890; Prinz, 1992). For example, the literature has provided behavioural evidence that action representations are hierarchically organized with a dissociation between goal-related representations and kinematic-based representations (e.g., Grafton & Hamilton, 2007; van Elk, van Schie, & Bekkering, 2008). Neurophysiological evidence has also supported this notion, demonstrating dissociable neural activity for actions associated with the end goal versus the immediate actions working toward that goal (Majdandžić et al., 2007; van Schie & Bekkering, 2007). This finding has also extended to the context of tool use where dissociable neural activity was observed between the local actions necessary to manipulate a tool and the distal effects of the tool, again with an emphasis placed on the actions associated with the distal goal (Umiltà et al., 2008). The present data appear to map on quite well to the notion that goal and local-based action representations are subserved by different underlying processes, but are generated effectively in parallel to produce coordinated actions. Specifically, participants appeared to generate remote (i.e., goal) and local representations in parallel, with an emphasis placed on the end goal rather than the local/kinematic representations (Experiment 1). This finding is also consistent with a recently proposed framework, which suggested that local and remote action representations, also referred to as first- and second-order mediated actions, respectively, in a teleoperation context are generated in parallel (Riva & Mantovani, 2012). 
Thus, not only are our findings consistent with previous work on action control in body and tool-based contexts, we also provide an important contribution to this body of work by extending notions of action control to the realm of teleoperation. Specifically, our findings demonstrate that our understanding of action control also applies to a context where there exists a physical separation between the location where necessary proximal actions are executed and the remote space where distal goals/actions are realized.
The fact that the frequency of rGDM in the reversed condition did not differ from the frequency of movements in the normal mapping condition also suggests a common basis for the behaviour in the normally mapped condition. Specifically, the frequency of the behaviour was equivalent across the compatible and incompatible conditions, suggesting that there is little interference between the two action representations. If the two representations overlapped at some level prior to the generation of the behaviour, then we would have expected it to be less frequent in the incompatible condition when the direction of movement is opposite. This result provides evidence consistent with the Riva and Mantovani (2012) claim that the action representations underlying first-order and second-order mediated actions are independent.
Cognitive resources and spontaneous behaviour
In isolating the basis for rGDM, we sought to gain insight into the causes of the behaviour. Experiment 2 demonstrates that the frequency of rGDM is influenced by the demand level of the driving, with greater demands leading to more frequent rGDMs. This finding is consistent with recent theories that have proposed that the occurrence of spontaneous behaviours is influenced by the availability of cognitive resources (Chandrasekharan et al., 2010; Hostetter & Alibali, 2008). Typically, most simulated actions are not expressed in overt behaviour, as cognitive resources are allocated to engage inhibitory processes that keep the activity covert. However, by increasing the difficulty of a given task, individuals are required to dedicate resources to the task, leaving fewer resources available to inhibit the simulated actions. This notion has primarily been supported by work in the field of gestures (e.g., Chandrasekharan et al., 2010; Hostetter et al., 2007; Melinger & Kita, 2007). Therefore, the results point to rGDMs and gestures as a similar kind of spontaneous behaviour, emerging from a common mechanism (e.g., a covert simulation becoming overt). However, we must acknowledge the possibility that the specific mechanism underlying both behaviours may not, in fact, be exactly the same (e.g., it is possible that rGDMs reflect a covert simulation becoming overt but gestures do not). It is also worth noting that there is some debate about whether these putative covert simulations are automatically generated as implied by the theories on which we based our account (e.g., Heyes, 2001; Newman-Norlund, van Schie, van Zuijlen, & Bekkering, 2007). The present data cannot speak to this debate except to note that the proposed GSA (Hostetter & Alibali, 2008) and cognitive-demand-based accounts (Chandrasekharan et al., 2010) provide a cogent explanation for the results of the three experiments reported here.
Furthermore, as the critical point addressed here is the conditions under which a covert simulation becomes overt, it is not clear to what extent the covert simulation would need to be automatic for such an account to explain the behaviour in question.
Importantly, our results now extend this cognitive demand modulation account to spontaneous behaviour that emerges in the context of teleoperation. These findings highlight the bidirectional relationship between cognition and the body. Although much work in the field of embodied cognition has demonstrated the influence the body can have on cognitive processes (e.g., Eerland, Guadalupe, & Zwaan, 2011; Lopez, Bachofner, Mercier, & Blanke, 2009; Niedenthal, 2007; Proffitt, 2006; Proffitt, Stefanucci, Banton, & Epstein, 2003; Williams & Bargh, 2009), the present findings indicate that the body is influenced profoundly by relatively subtle modulations in cognitive demand (see Chisholm et al., 2013; Hostetter et al., 2007; Melinger & Kita, 2007; Risko et al., in press; Wilson, 2002). As noted in the introduction, the emergence of these cognitively modulated behaviours can provide a tool for gaining insight into concurrent cognitive activity. Additionally, as these types of behaviours are commonly observed in natural settings, the implication is that they can be used effectively to investigate cognition as it naturally occurs “in the wild”. For instance, in cases where spontaneous behaviour is modulated by cognitive demand, such behaviour could be used as an indicator of the cognitive demand that individuals experience when presented with a given task (i.e., greater frequency of spontaneous behaviour would predict that the individual is experiencing greater demand).
Finally, manipulating immersion did not influence the frequency of rGDM (Experiment 3). This suggests that a greater sense of immersion may have simply been a by-product of the increase in task demand. Specifically, a sense of immersion may have resulted from being required to engage greater attentional focus or exert more effort to maintain sufficient performance on a task (Brown & Cairns, 2004). As noted above, previous reports have demonstrated a change in the magnitude of spontaneous leaning behaviour across immersion manipulations (Freeman et al., 2000; Hoshino et al., 1997). To reconcile these findings with our own data, we suggest that our rGDM may be qualitatively different from the leaning behaviour observed in past work. For example, whereas spontaneous leaning was observed when a stimulus was passively viewed, the spontaneous rGDM observed in the present investigation emerges naturally during an interactive task. Given previous findings and its intuitive appeal, we nonetheless acknowledge the possibility that, under different circumstances, immersion could influence the occurrence of spontaneous behaviour. Although more research is needed, our data suggest that, in the current context, such an immersion effect may only emerge via a demand-based mechanism. For example, a greater sense of immersion probably coexists with an increase in the allocation of cognitive resources to the task.
Functional or epiphenomenon?
One question for future investigation is to assess the possible functional role of these spontaneous behaviours, such as the leaning behaviours reported here. Other work has demonstrated that spontaneous overt behaviour can influence task performance and learning. For example, spontaneous head tilting while reading rotated passages of text improves reading time relative to conditions where head tilting was prevented (Risko et al., in press), and gesturing can facilitate spatial problem solving (Chu & Kita, 2011) as well as children's learning of mathematical concepts (Goldin-Meadow, Cook, & Mitchell, 2009). Thus, in some cases these behaviours are thought to represent a form of cognitive offloading (Ballard, Hayhoe, Pook, & Rao, 1997; Clark, 2010; Kirsh, 2010; Wilson, 2002), where external processing (e.g., body movement) takes on some of the load to reduce the processing requirements placed on internal cognitive processes. As one learns to deal with task demand, the cognitive load is lessened. A basic example of such behaviour is using one's fingers when counting. This behaviour is a useful strategy when first learning basic mathematical computations; however, over time, as an individual's maths skill improves, overt finger counting becomes less prevalent. It is unclear whether the rGDM observed in the present investigation plays a similar functional role or whether it is simply epiphenomenal. Examining the possible functional role of this behaviour (for example, by restricting it), as well as how it may change as one becomes more familiar with a task, remains an interesting direction for future work.
Although a future investigation aimed specifically at examining whether rGDMs are influenced by practice is needed, we performed an analysis to assess whether practice-related effects were present in our data. Comparing the frequency of rGDMs across game sessions for each of the three experiments revealed no reliable practice effects (all ps > .05).
Conclusion
In the present investigation, we employed a natural behaviour approach to investigate a relatively common behaviour that occurs in the context of teleoperation—what we have referred to as remote goal-derived movement (rGDM). Critically, we provide evidence that establishes a connection between rGDM and one's intended remote actions during teleoperation, and have provided insight into the mechanism that gives rise to the behaviour. We argue that the behaviour reflects an example of visible embodiment, providing a window into ongoing cognitive activity. However, whether the behaviour plays a functional role in task performance remains unclear. In the context of teleoperation, understanding this behaviour helps provide insight into the factors involved in properly managing the complex interplay between coactive action representations. Finally, one advantage of this investigation is that the potential window that these behaviours provide into action representations during teleoperation (like gesture) is presented in the natural context of these actions rather than in a context divorced from it. Thus, our investigation further highlights the utility of employing a natural behaviour approach for enhancing our understanding of the complex relationship between body and cognition.
Acknowledgements
This work was supported by a Natural Sciences and Engineering Research Council of Canada (NSERC) Graduate Fellowship to J.D.C., an NSERC Postdoctoral Fellowship and a Killam Postdoctoral Fellowship to E.F.R., and NSERC operating grants to A.K. We would like to thank Tom Foulsham for the creation and use of VideoCoder, a video annotation application.
