Abstract
In a recent study, Boulenger et al. (2006) found that processing action verbs assisted a reaching movement when the word was processed prior to movement onset and interfered with the movement when the word was processed at movement onset. The present study aimed to further corroborate the existence of such cross-talk between language processes and overt motor behaviour by demonstrating that the reaching movement can be disturbed by action words even when word presentation is delayed with respect to movement onset (by 50 ms or 200 ms). The results are compared to studies that show language–motor interaction in conditions where the word is presented prior to movement onset and are discussed within the context of embodied theories of language comprehension.
Recent brain imaging studies have revealed that within 150–200 ms following the onset of a spoken word that denotes human body actions, a short-lived cortical activity is observed outside the classical language areas in regions that have generally been associated with motor control (Pulvermüller, Shtyrov, & Ilmoniemi, 2005b; Hauk & Pulvermüller, 2004). This language-induced cortical motor activity follows the somatotopy of motor actions in that words referring to actions involving the face, for instance, activate inferior frontocentral regions, while words referring to actions involving the leg activate superior central sites (Pulvermüller, Hauk, Nikulin, & Ilmoniemi, 2005a; Hauk, Johnsrude, & Pulvermüller, 2004; Tettamanti et al., 2005). Studies using transcranial magnetic stimulation (TMS) corroborate these findings by indicating that processing action-related words or sentences alters the excitability of the left but not the right motor cortex and modulates reaction times when the motor response and the words call on the same effector (Buccino et al., 2005; Meister et al., 2003; Oliveri et al., 2004; Pulvermüller et al., 2005b). This language-induced activity in cortical motor regions is believed to reflect early automatic processes involved in word encoding and is therefore taken to suggest that neural systems for action are also involved in the perception of language (Pulvermüller, 2005). Their exact functional role for language, however, remains largely underspecified (see Mahon & Caramazza, 2005; Zwaan & Taylor, 2006, for discussion).
To show that language-induced cortical motor activity reflects behaviourally relevant processes, Boulenger and colleagues (Boulenger et al., 2006) recently developed a paradigm that made it possible to measure the effect of action word processing on the execution of a reaching movement. The underlying assumption of their study was that if processing of action words recruits cortical regions that are also involved in the programming and execution of actions, processing these words should interfere with overt motor behaviour when the two tasks are performed concurrently. Fine-grained analyses of the kinematics of reaching movements revealed that this assumption was correct. When visual words were processed prior to the reaching movement (that is, when the word display triggered the movement), action words assisted the ensuing movement in that the wrist acceleration peak occurred earlier when the word was an action verb than when it was a concrete noun. However, when movement onset triggered the word display, action verbs interfered with the concurrent reaching movement: the wrist acceleration peak occurred later and was smaller in amplitude when the word was a verb than when it was a noun. These interference effects were observed within less than 180 ms following word onset, which is the limit within which lexico-semantic processes are typically noticed (e.g., word frequency effects or effects of word category; Preissl, Pulvermüller, Lutzenberger, & Birbaumer, 1995; Pulvermüller, 2001; Pulvermüller, Lutzenberger, & Preissl, 1999; Sauseng, Bergmann, & Wimmer, 2004; Sereno & Rayner, 2003; Sereno, Rayner, & Posner, 1998).
The short delay within which language-induced motor effects were observed and the fact that cross-talk between language and motor tasks switched from facilitation to interference when action words were processed concurrently with the movement suggest that cortical motor regions are not merely activated as a consequence of word recognition but are indeed recruited during word encoding (Boulenger et al., 2006).
Evidence for cross-talk between language processes and motor actions is actually abundant in the literature (e.g., Gentilucci, 2003; Gentilucci, Benuzzi, Bertolani, Daprati, & Gangitano, 2000; Gentilucci & Gangitano, 1998; Glenberg & Kaschak, 2002; Glover, Rosenbaum, Graham, & Dixon, 2004; Tucker & Ellis, 2004; Zwaan & Taylor, 2006). Among the first to report such effects were Gentilucci and colleagues, who showed that words printed on an object modulate the movement directed to that object (Gentilucci & Gangitano, 1998). When participants were required to reach and grasp a wooden block on which the word “long” or “short” was printed, peak acceleration, peak velocity, and peak deceleration of the arm were higher when the printed word was “long”. Glover et al. (2004) showed similar effects when using printed words that described large (e.g., apple) or small (e.g., grape) graspable objects. Here, maximum grip aperture for the same object was larger when the word “apple” was printed on the object than when the word “grape” was printed. Note though that these studies measured the effect of language processing on motor action in conditions where words were presented prior to movement onset, which is thus comparable to the condition in Boulenger et al. (2006) where action words were shown to facilitate the ensuing reaching movement. Yet, presenting words prior to movement onset does not rule out the possibility that the observed cross-talk between language processes and motor actions occurs subsequent to word encoding. What we are aiming at, by contrast, is establishing that cortical motor regions are recruited during (and not after) language perception.
For this, it is essential to show that processing action words interferes with a movement almost instantly following word onset because this limits the possibility that language-induced motor activity stems from processes that arise after the word has been identified (see Pulvermüller, 2005, for related arguments). We come back to this point later.
The aim of the present study is to test whether language-induced motor interferences such as reported in Boulenger et al. (2006) can also be observed when words are displayed after the movement has started. Although it is known that large parts of the motor programme that underlie a movement are computed prior to movement onset, cortical motor activity is observed throughout the motor action (see Riehle, 2005, for a review of single-cell studies in monkeys). Single-cell studies with monkeys have, in fact, roughly classified three types of neurons that are encountered during movement preparation and execution: (a) neurons that are involved in movement preparation only; these neurons are mainly found in primary motor and premotor cortex and show maximum activity during movement preparation prior to movement onset; (b) neurons that are involved in movement execution only; these neurons, which are found in sensory, parietal, primary motor, and premotor cortex, are nearly silent during movement preparation and show maximum activity during movement execution; (c) neurons that are involved in movement preparation as well as in movement execution; these neurons, which represent a large part of movement related neurons in sensory, parietal, primary motor, and premotor cortex, sustain their activity from movement preparation to the end of the movement. Assuming that similar relations hold for human primates (see Georgopoulos, 2000), processing action-related language at any time between movement preparation and the end of a movement should thus interfere with overt motor behaviour.
We test this hypothesis by determining the impact of action word processing on a concurrent reaching movement when the word display is delayed by either 50 ms or 200 ms with respect to movement onset. Besides substantiating our previous finding (Boulenger et al., 2006), this study also aimed to determine whether fine-grained analysis of movement kinematics can be used as an online measure of language-induced motor effects, in the same way that event-related potentials (ERPs) are used for exploring cognitive tasks. If so, this paradigm could serve more sophisticated linguistic experiments using, for instance, sentences (that could be displayed along the movement) in order to determine whether language-induced motor effects are modulated by the sentence context or are simply limited to the verb itself.
Method
Participants
A total of 9 French native volunteers participated in each of the two delay conditions. All were right-handed as measured by the Edinburgh Inventory (Oldfield, 1971) and had normal or corrected-to-normal vision. None of the volunteers participated in both experiments.
Stimuli
The stimuli used in this experiment were exactly the same stimuli as those used in the study by Boulenger et al. (2006). A total of 84 words (42 verbs and 42 nouns) were selected from the French lexical database “Lexique” (New, Pallier, Ferrand, & Matos, 2001). Verbs were all in the infinitive form and denoted actions performed with the hand/arm, leg, or mouth/face (e.g., paint, jump, cry). Nouns were all in singular form and referred to imageable, concrete entities that cannot be manipulated (e.g., star, cliff, meadow). Words that could be used as both nouns and verbs were excluded from the selection. Stimuli were matched for relevant lexical variables including word frequency, length in letters, number of syllables, bi- and trigram frequency, and number and cumulative frequency of orthographic neighbours (for details, see Boulenger et al., 2006). Word age of acquisition was also controlled using empirical ratings performed by 20 volunteers on a 7-point scale (1 = 0–2 years, and 7 = older than 13 years; Gilhooly & Logie, 1980). Word imageability was estimated following the same procedure by another 18 volunteers (with 0 = impossible, and 6 = very easy, to generate a mental image of the word). To prevent participants from focusing on word-class discrimination, they were asked to perform a lexical decision task (deciding whether a letter string is a word or not). A total of 84 pseudowords (constructed by changing one letter from real nouns or real verbs) were added as fillers to perform this task. Pseudowords were either “pseudonouns” (42 items) or “pseudoverbs” (42 items) and were all pronounceable. Pseudowords were matched to words for relevant lexical variables such as number of letters and syllables, bi- and trigram frequency, and number and frequency of neighbours. Verbs and pseudoverbs were also carefully matched for endings, such that as many verbs as pseudoverbs (32 out of 42) ended with “er”, which is a frequent ending for verbs in French. 
All items were presented in lower case.
To guarantee that potential differences in movement kinematics during verb and noun displays were not due to surface features inherent to our specific word stimuli, 9 French right-handed volunteers performed a classical visual lexical decision task with the stimuli, indicating by keystroke (using the left and right index fingers) whether the stimulus was a word or not. Participants were significantly slower to respond to pseudowords (617 ms) than to words (560 ms), F(1, 8) = 12.9878; p = .0069. However, no significant difference was observed between nouns and verbs (564 ms and 556 ms, respectively), F(1, 8) = 0.7815; p = .4024. Differences in movement kinematics during noun and verb displays can therefore not be attributed to differences in the word lists per se.
Procedure
Participants were asked to touch a home-pad (10 cm from their chest) with their right thumb and index finger held in a pinch grip position, while fixating on a monitor (95 cm from their chest). When a white cross appeared at the centre of the monitor (500 ms; go-signal), participants were required to leave the home-pad to reach and grasp a cylindrical object (height, 30 mm; diameter, 15 mm) placed vertically in front of them (40 cm from the home-pad). In the 50-ms delay condition, a letter string replaced the fixation cross 50 ms after the onset of the movement (i.e., leaving the home-pad). In the 200-ms delay condition, the letter string was delayed by 200 ms. If the string was a word, participants were required to carry on the movement. If the string was a pseudoword, they had to interrupt the movement and return to the home-pad. The stimulus remained on the screen until participants grasped the object (in the word condition) or returned to the home-pad (in the pseudoword condition). The experimenter triggered the next trial once participants were in the starting position. Video recording ensured that participants maintained their gaze on the cylindrical object during final movement execution (word condition only). Each stimulus was presented once and in random order. A total of 20 practice trials (different from the experimental stimuli) were given to familiarize participants with the task.
Movement recordings
An Optotrak 3020 (Northern Digital) was used to record the spatial positions of four markers (infrared light-emitting diodes), at a frequency of 200 Hz and with a spatial resolution of 0.1 mm. One marker was taped to the wrist. The three remaining markers were fixed on the experimental set-up to define a space in which all recorded movements were systematically placed from participant to participant.
Data analysis
A second-order Butterworth dual-pass filter (low-pass cut-off frequency, 10 Hz) was used for raw data processing. Movements were then visualized and analysed using Optodisp software (Optodisp copyright UCBL-CNRS; Thévenet, Paulignan, & Prablanc, 2001). Kinematic parameters for the word condition were assessed for each individual movement. Movement onset was determined as the first value of a sequence of at least 11 increasing points on the basis of wrist velocity. The end of each movement was determined similarly, going backwards from the end. For each participant, tangential velocity was calculated for individual trials. The initial part of each trial (50 or 200 ms following movement onset, depending on delay condition) was then removed, and the remaining part was normalized in time to 100 frames. Note that by normalizing the data, information about real time is lost. However, without such normalization it is difficult to compare the data point by point along the movement. Individual trials were then averaged as a function of word category, and acceleration/deceleration profiles were computed. Paired (one-tailed) t tests (per time unit) were used to identify periods in the acceleration/deceleration profiles where the two word conditions started to differ significantly (p ≤ .05; note, the t test was one-tailed because from previous studies we know that verbs should interfere more with the movement than nouns). Within such periods, movement parameters were then defined for further statistical analyses. Two movement parameters served such analyses: the deceleration peak and the velocity peak (the velocity peak corresponds to the point in time where the acceleration curve crosses 0 mm/s²), which were obtained from the individual data of each participant.
Trials in which participants made errors or anticipated or delayed movement execution were excluded from the analysis.
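The preprocessing steps described above can be sketched in a few lines of code. The following is a minimal illustration only, not the Optodisp implementation: all function names are hypothetical, the filter is a textbook second-order Butterworth biquad run forwards and then backwards, and linear interpolation is assumed for the normalization to 100 frames (the interpolation scheme actually used is not stated). The sampling rate (200 Hz), cut-off (10 Hz), run length (11 points), and delays (50 or 200 ms) are taken from the text.

```python
import math

def butter2_lowpass_coeffs(cutoff_hz, fs_hz):
    """Second-order Butterworth low-pass coefficients (bilinear transform)."""
    k = math.tan(math.pi * cutoff_hz / fs_hz)
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b0 = k * k * norm
    b = (b0, 2.0 * b0, b0)
    a = (2.0 * (k * k - 1.0) * norm,
         (1.0 - math.sqrt(2.0) * k + k * k) * norm)
    return b, a

def _single_pass(x, b, a):
    # State is initialised as if the signal had been constant at x[0],
    # which avoids a start-up transient (an assumption of this sketch).
    x1 = x2 = y1 = y2 = x[0]
    out = []
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        out.append(yn)
    return out

def dual_pass_filter(x, cutoff_hz=10.0, fs_hz=200.0):
    """Filter forwards, then backwards, so that no phase lag remains."""
    b, a = butter2_lowpass_coeffs(cutoff_hz, fs_hz)
    forward = _single_pass(x, b, a)
    return _single_pass(forward[::-1], b, a)[::-1]

def differentiate(x, fs_hz=200.0):
    """Forward difference; turns velocity into acceleration (one sample shorter)."""
    return [(x[i + 1] - x[i]) * fs_hz for i in range(len(x) - 1)]

def movement_onset(velocity, run_length=11):
    """Index of the first value of a run of at least run_length increasing points."""
    run_start = 0
    for i in range(1, len(velocity)):
        if velocity[i] <= velocity[i - 1]:
            run_start = i          # run broken, restart here
        elif i - run_start + 1 >= run_length:
            return run_start
    return None                    # no sufficiently long run found

def trim_after_delay(samples, delay_ms, fs_hz=200.0):
    """Drop the part of the movement that precedes word onset."""
    return samples[round(delay_ms * fs_hz / 1000.0):]

def normalize_to_frames(samples, n_frames=100):
    """Resample to n_frames by linear interpolation (assumed scheme)."""
    last = len(samples) - 1
    out = []
    for f in range(n_frames):
        pos = f * last / (n_frames - 1)
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out
```

At 200 Hz, the 50-ms and 200-ms delays correspond to 10 and 40 samples, respectively, so trials in the two conditions are shortened by different amounts before being stretched onto the common 100-frame axis.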
Results and Discussion
One participant in the 200-ms delay condition was excluded from the analyses because of unusually strong variations in the data. For the remaining participants in this condition, a total of 12% of trials were excluded from the analyses (13% for nouns and 11% for verbs). In the 50-ms delay condition, 17% of trials were excluded (19% for nouns and 14% for verbs).
Total movement time in the 50-ms delay condition was 1,316 ms and 1,304 ms for noun and verb displays, respectively. In the 200-ms delay condition, it was 1,384 ms and 1,379 ms, respectively. Movement time did not distinguish between word displays.
Figure 1 plots normalized wrist acceleration/deceleration profiles in the two delay conditions averaged over participants. Recall that the initial 50 ms and 200 ms of the movement were removed. Zero on the time axis thus corresponds to the onset of the word and represents different points of the reaching movement in the two delay conditions.

Figure 1. Averaged wrist acceleration/deceleration profiles of all participants (normalized between 0% and 100% of movement time after word onset) during processing of nouns (dotted lines) and verbs (unbroken lines). Note that by normalizing the data, real time information is lost. The grey bar indicates the time window within which paired t tests (per time unit) revealed a significant difference between the two conditions. The top panel gives data for the 50-ms delay condition, the bottom panel for the 200-ms delay condition. (At the end of the movement, the curves do not converge at zero because the wrist is not entirely immobile even when the fingers are in contact with the cylindrical object.)
In the 50-ms delay condition, the noun and verb displays differed significantly starting from 32% to 40% (i.e., 9 time frames, indicated by the transparent grey bar) of movement time after word onset. In the 200-ms delay condition, significant differences were observed from 23% to 27% (i.e., 5 time frames) and from 36% to 43% (i.e., 8 time frames) of movement time after word onset. Note that in both delay conditions differences between noun and verb displays were observed around the deceleration and the velocity peaks (i.e., the point in time where the acceleration curve crosses 0 mm/s²). Deceleration was generally stronger for noun than for verb displays, which indicates that verb displays interfere more with the execution of the movement than noun displays. Table 1 gives the real latencies of the two movement parameters. Except for the amplitude of the deceleration peak in the 50-ms and the 200-ms delay conditions, none of the parameters captured significant differences in the data.
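The frame-by-frame comparison that yields such windows can be sketched as follows. This is a hedged illustration rather than the analysis code used in the study: the function names are hypothetical, the critical values are the standard one-tailed t values at α = .05 for 7 and 8 degrees of freedom (8 or 9 participants), and the direction of the test (noun minus verb greater than zero) is an assumption that would have to be matched to the sign convention of the profiles.

```python
import math

# One-tailed critical t values at alpha = .05 from standard tables,
# listed only for the degrees of freedom relevant to 8 or 9 participants.
T_CRIT_ONE_TAILED_05 = {7: 1.895, 8: 1.860}

def paired_t(a, b):
    """Paired t statistic for the differences a - b across participants."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean = sum(d) / n
    var = sum((v - mean) ** 2 for v in d) / (n - 1)
    if var == 0.0:
        # Degenerate case: identical differences for every participant.
        return 0.0 if mean == 0 else math.copysign(math.inf, mean)
    return mean / math.sqrt(var / n)

def significant_frames(nouns, verbs):
    """nouns, verbs: one list of per-frame values per participant.

    Returns the frames where noun - verb is reliably positive
    (one-tailed, p <= .05); the direction is an assumption of this sketch.
    """
    crit = T_CRIT_ONE_TAILED_05[len(nouns) - 1]
    frames = []
    for f in range(len(nouns[0])):
        t = paired_t([p[f] for p in nouns], [p[f] for p in verbs])
        if t >= crit:
            frames.append(f)
    return frames

def contiguous_windows(frames):
    """Group sorted frame indices into (first, last, n_frames) runs."""
    runs = []
    for f in frames:
        if runs and f == runs[-1][1] + 1:
            runs[-1][1] = f        # extend the current run
        else:
            runs.append([f, f])    # start a new run
    return [(first, last, last - first + 1) for first, last in runs]
```

Counting frames inclusively, as `contiguous_windows` does, reproduces the window sizes reported above: frames 32 to 40 form one run of 9 frames, and frames 23 to 27 and 36 to 43 form runs of 5 and 8 frames. Because each frame is tested separately, such windows are uncorrected for multiple comparisons.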
Table 1. Latencies (in ms) and amplitude (in mm/s²) of movement parameters in the noun and verb displays
Note: Latencies are from word onset. Data are given separately for the two delay conditions.
Note that on average, velocity peak was attained 413 ms after word onset in the 50-ms delay condition and 226–229 ms after word onset in the 200-ms delay condition. The significant differences between noun and verb displays observed in the normalized data near the velocity peak thus occurred at variable delays with respect to word onset in the 50-ms and 200-ms delay conditions (e.g., it occurred earlier in the 200-ms delay condition). Similarly, the time interval around the deceleration peak within which significant differences between noun and verb displays were observed occurred earlier with respect to word onset in the 200-ms delay condition. Hence, although significant differences between noun and verb displays could be observed, these differences did not occur locked on word onset. Rather, in both delay conditions, differences between noun and verb displays were evident at the same moment between the velocity peak and the deceleration peak of the reaching movement.
While the present study thus replicated and substantiated our previous findings (Boulenger et al., 2006) that processing action words interferes with the execution of a concurrent reaching movement, it also showed that cross-talk between language processing and movement execution surfaces at particular moments during the reaching movement and not after a constant interval following word onset. In the study by Boulenger et al. (2006), where the delay between movement and word onset was 0 ms, this cross-talk could be captured at and around the peak of wrist acceleration—which occurred within 160–177 ms following word onset. With the present delays of 50 ms and 200 ms between movement and word onset, however, wrist acceleration peak occurred too early with respect to word onset, and the “next possible moment” where this effect could surface seemed to be around the velocity peak. Since characteristics of the movement itself appear to partially mask the immediate impact of the linguistic stimulus, information about when exactly word processing starts to affect motor behaviour is therefore lost. Unlike ERPs for cognitive tasks, fine-grained analyses of movement kinematics thus cannot capture effects of action word processing on cortical motor structures in an online manner.
General Discussion
Consistent with Boulenger et al. (2006), and together with the accumulating TMS and brain imaging studies (Buccino et al., 2005; Hauk et al., 2004; Oliveri et al., 2004; Pulvermüller et al., 2005a, 2005b; Tettamanti et al., 2005), the present results further add to speculations that action words are—at least partly—represented in cortical motor regions (Pulvermüller, 2005; Zwaan & Taylor, 2006). However, since lesions over left motor cortex do not predictably lead to impairment in processing action words (De Renzi & di Pellegrino, 1995; Mahon & Caramazza, 2005; Saygin, Wilson, Dronkers, & Bates, 2004), motor processes alone do not represent all that we know about these words. The functional role of cortical motor regions for language understanding therefore needs to be specified.
As we pointed out earlier, cross-talk between language processes and motor behaviour differs qualitatively depending on whether the word is presented prior to the movement (facilitation) or concurrently with it (interference), and these contrasting patterns are likely to reflect different aspects of word processing. Language-induced cortical motor activity that occurs early after action word onset (as evidenced in the brain imaging study by Pulvermüller et al., 2005b, and indicated by the motor perturbations seen in the present and in our previous study; Boulenger et al., 2006) may indeed participate in action word encoding. Language-induced motor effects that occur when the word is processed prior to movement onset, however, probably do not (e.g., the second experiment in Boulenger et al., 2006; Gentilucci, 2003; Gentilucci et al., 2000; Gentilucci & Gangitano, 1998; Glenberg & Kaschak, 2002; Glover et al., 2004; Tucker & Ellis, 2004; Zwaan & Taylor, 2006). Note that as demonstrated by the so-called “action–sentence compatibility effect” (ACE; Glenberg & Kaschak, 2002), these latter effects can bridge entire sentences. To obtain an ACE, participants are asked to judge the sensibility of sentences describing the transfer of objects towards or away from themselves, such as “you delivered the pizza to Leo” or “Leo delivered the pizza to you”, by moving their hand towards or away from their body. Judgement time (i.e., the time elapsed between sentence onset and the beginning of the movement) is generally shorter when the transfer direction implied by the sentence is consistent with the direction of the required response movement than when it is inconsistent. This judgement time, however, can exceed action word display by some 1,000 ms, which makes it unlikely that the ACE arises from action word encoding.
Given the systematic nature of these language–motor interactions, however, it is reasonable to assume that they might reflect functionally relevant aspects of language processing. Both phenomena therefore need to be addressed.
Lexical access versus access to meaning: A speculation
While some of the studies that investigated language–motor relations have tested the impact of action words embedded within a sentence (Buccino et al., 2005; Glenberg & Kaschak, 2002; Tettamanti et al., 2005; Zwaan & Taylor, 2006), others have tested the impact of single action words (Boulenger et al., 2006; Gentilucci & Gangitano, 1998; Glover et al., 2004; Hauk et al., 2004; Hauk & Pulvermüller, 2004; Oliveri et al., 2004; Pulvermüller et al., 2005a, 2005b). The meaning of a word without context, however, is generally indeterminate (e.g., Borer, 2005a, 2005b; Frege, 1892). The “action” word “take”, for instance, can have a number of different meanings depending on whether it is part of a sentence like “take a break”, “take the example of”, “take a book”, or “take the train”, etc. In trying to understand the potential role of cortical motor regions in language processing, it is therefore useful to distinguish between lexical access and access to word meaning (as determined by the context). Language-induced motor activity/effects that are observed early after word onset (150–200 ms) could reflect processes that are involved in lexical access. Motor effects that occur subsequent to word display, by contrast, may arise as a consequence of access to meaning. Hebbian association learning during language acquisition (e.g., hearing the command “kick the ball” while playing soccer, which links the word “kick” with the action of kicking) could explain why lexical access for action words (but not for nouns) involves these cortical motor regions (Pulvermüller, 1999, 2005). Motor effects that occur subsequent to word displays, by contrast, may involve more complex mechanisms than Hebbian association learning. We want to emphasize though that we are not suggesting that cortical motor regions are the bases of lexical access or access to action word meaning but simply that they are implicated in these processes.
Though admittedly speculative, this hypothesis allows a series of interesting predictions. First of all, brain imaging studies such as the magnetoencephalography (MEG) study by Pulvermüller et al. (2005b), which demonstrated short-lived language-induced cortical motor activity around 150 ms, should observe that neural activity in these regions reappears at a later moment following action word onset if context information is provided. Why language-induced motor effects switch from interference during the hypothesized lexical access to facilitation during the hypothesized access to meaning needs to be specified though. Second, early language-induced motor activity/effects should occur only for action words but not for nouns such as “apple” and “grape”, which Glover et al. (2004) have shown to affect reach-to-grasp kinematics when processed prior to movement onset. Third, action words that are used as metaphors such as “the cash machine swallowed his credit card” should engage cortical motor regions during lexical access for the word “swallow” but probably not during subsequent access to the meaning as implied by the sentence. Fourth, since access to meaning depends on sentence context, language-induced cortical motor activity that reflects processes involved in meaning access should vary depending on how sentence context modifies the action (see Glenberg et al., 2008, this issue, for first evidence). But what is the function of language-induced cortical motor activity?
The potential role of cortical motor regions for language
Embodied theories of language have proposed that understanding verbal descriptions of actions requires, as one essential component, the involvement of the motor system (Zwaan & Taylor, 2006; see also Gallese & Lakoff, 2005). The cell assembly approach by Pulvermüller (2003) and complementary theoretical views for the perception of objects by Barsalou, Simmons, Barbey, and Wilson (2003) or Rogers et al. (2004) similarly imply that conceptual content is grounded in modalities and that semantic knowledge emerges from the interactions between sensory-motor information and the words that are used to describe them. Yet, damage to cortical motor structures, though it affects motor behaviour, seems not to systematically affect the perception and production of action-related language. Equally, while language-induced cortical motor activity in healthy participants has been shown to affect motor behaviour (e.g., the present study; Boulenger et al., 2006; Gentilucci & Gangitano, 1998; Glenberg & Kaschak, 2002; Glover et al., 2004; Zwaan & Taylor, 2006), it has not yet been shown that it also serves language understanding.
To better grasp the functional role of the observed language-induced motor effects/activity for language, one should thus focus on a language task instead of a motor task. So far, Myung, Blumstein, and Sedivy (2006) are among the few to have done so. Myung et al. showed that lexical decision to auditorily presented words and pseudowords was faster when the target word (e.g., typewriter) was preceded by a prime word that shared manipulation features (typing with the fingers) with the target (e.g., piano) than when it was preceded by a prime that did not (e.g., blanket). Note that in their experiment no overt typing was required, which suggests that word-associated knowledge about how to manipulate the objects mediated priming. More evidence for such effects of the “motor system in language” (instead of “language in the motor system”) is urgently required. However, one very likely role of the motor system for the comprehension of action-related language could reside in supplying this motor knowledge. How essential this contribution is for language understanding remains to be established, though, as those who do not know how to “ride a bike” or to “knit a sock” can still talk about it.
