Abstract
Compared to concrete concepts, like “book,” abstract concepts expressed by words like “justice” are more detached from sensorial experiences, even though they are also grounded in sensorial modalities. Abstract concepts lack a single object as referent and are characterised by higher variability both within and across participants. According to the Words As social Tools (WAT) proposal, owing to their complexity, abstract concepts need to be processed with the help of inner language. Specifically, inner language can help participants to re-explain to themselves the meaning of the word, to keep information active in working memory, and to prepare themselves to ask more competent people for information. While previous studies have demonstrated that the mouth is involved during abstract concepts’ processing, both the functional role and the mechanisms underlying this involvement still need to be clarified. We report an experiment in which participants were required to evaluate whether 78 words were abstract or concrete by pressing two different pedals. During the judgement task, they completed, in different blocks, a baseline, an articulatory suppression, and a manipulation condition. In the last two conditions, they had, respectively, to repeat a syllable continually and to manipulate a softball with their dominant hand. Results showed that articulatory suppression slowed down the processing of abstract more than that of concrete words. Overall, the results confirm the WAT proposal’s hypothesis that abstract concept processing involves the mouth motor system and specifically inner speech. We discuss the implications for current theories of conceptual representation.
Introduction
Abstract and concrete concepts
The capability to use words, particularly words that convey increasingly abstract concepts (ACs), is a unique characteristic of humans. Many of the words used by adults are abstract or contain some abstract elements (Lupyan & Winter, 2018). Concepts are the building blocks of our knowledge; they are the “glue” that connects our past, present, and future experience (Murphy, 2004). They tell us what objects are; they allow us to make inductions and generalisations. Concepts are not the same as word meanings: animals and preverbal children possess them. However, most studies on adults, and all the studies we will illustrate here, investigate concepts by studying the words that convey them. Therefore, it is often difficult to distinguish between concepts and word meanings; we will try to distinguish them when possible.
The distinction between ACs and concrete concepts (CCs) (e.g., “justice” vs. “book”) is not dichotomous (Barsalou et al., 2018). However, compared to CCs, ACs lack a single object as referent, are more detached from sensorial experiences (Barsalou, 2003; Brysbaert et al., 2014), and the features they evoke are more variable both within and across participants. Furthermore, they are, on average, more arbitrary and influenced by cultures/languages than concrete ones (Borghi & Binkofski, 2014; Zannino et al., 2015).
The issue of how ACs are represented is becoming increasingly debated (reviews: Borghi et al., 2017; Dove, 2016; research issues: Bolognesi & Steen, 2018; Borghi et al., 2018b; Tomasino & Rumiati, 2013). To account for them, the recent multiple representation views have bridged the most insightful principles of embodied/grounded (Barsalou, 2016; Glenberg & Gallese, 2012; Pulvermüller & Fadiga, 2010) and distributional views of meaning (Landauer & Dumais, 1997), highlighting the role of both sensorimotor and linguistic experience (Andrews et al., 2014). Multiple representation views propose that sensorimotor experience is crucial for all concepts, especially for CCs. ACs would instead evoke linguistic, social, and inner experiences (interoception, emotions) to a greater extent than CCs (Borghi et al., 2018a; Dove, 2016; Newcombe et al., 2012; Vigliocco et al., 2013).
Some multiple representation views have mostly emphasised the importance of language for ACs, intending language as an instrument that helps thought processes (Dove, 2014, 2018; Dove et al., 2020). A major role of language(s) for abstract concepts is also in keeping with the view according to which ACs boundaries are arbitrary and more influenced by spoken languages and cultures than those of concepts learned through sensorimotor experience in more stable environments (Zannino et al., 2015). This proposal highlights the importance of natural languages in conveying the cultural stipulations relevant to arbitrary concept boundaries, which might be more typical of ACs than CCs.
Within multiple representation views, the Words As social Tools (WAT) proposal highlights the role linguistic and social experience play for ACs acquisition and representation (Borghi et al., 2019).
Abstract concepts and mouth motor activation
The hypothesis that linguistic experience is crucial for ACs is supported by evidence showing that they activate the mouth motor system more than CCs. Specifically, rating and fMRI studies have demonstrated that ACs, and especially mental state ones (e.g., “thought,” “logic,” “remembering the past”), evoke actions performed with the mouth (Ghio et al., 2013) and engage the mouth motor system (Dreyer & Pulvermüller, 2018). Furthermore, facilitation for ACs compared with CCs was found across many studies and tasks when participants used the mouth instead of the hand as response effector (reviews: Borghi et al., 2019; Dove et al., 2020).
In two studies mimicking conceptual acquisition, in which adults were presented with novel categories and novel names, we found that concrete words were processed faster than abstract ones, consistent with the concreteness effect. More crucially, in a property verification task, the former yielded faster responses with the hand, the latter with the mouth (Borghi et al., 2011; Granito et al., 2015). In a further task with Italian words, we found that when participants had to decide whether a concrete or abstract word matched a definition, the advantage of hand over mouth responses was reduced with abstract words (Borghi & Zarcone, 2016). The advantage of the mouth over the hand effector was not present in a lexical decision task, but it reappeared in a word recognition task (Mazzuca et al., 2018).
These facilitation effects obtained when the mouth was the response effector are flanked by interference effects, found when the mouth was actively occupied while performing a task. For example, the perceived difficulty of concrete but not abstract words was reduced when participants had to chew gum while evaluating words’ difficulty (Villani et al., 2020). A device actively involving mouth movements, such as the pacifier, has also been shown to affect children’s conceptual acquisition and may make the learning of abstract concepts more difficult. Two studies, one on first-graders and another on third-graders, suggested that those who had used the pacifier for more than three years defined abstract words differently and were slower in processing abstract than concrete words (Barca et al., 2017, 2020).
Overall, these studies reveal facilitation for ACs compared with CCs when participants used the mouth instead of the hand as response effector, and interference when the mouth was actively involved in another motor task (e.g., chewing) during conceptual processing. As argued elsewhere, the involvement of the mouth motor system, even though clearly linked to language, can be due to different mechanisms. We might re-enact the linguistic acquisition of ACs during their processing, or we might need to search internally for their meaning. Finally, we might need to ask other people for information on their meaning. These processes are not mutually exclusive and might involve inner speech.
Abstract concepts and inner speech
This study aims to shed light on the function of mouth activation for ACs. Specifically, it tests the hypothesis that such activation is due to the use of inner speech (IS) during abstract words processing and that the articulatory suppression interferes with such processing.
The notion of inner speech has a long history and is hotly debated in recent literature (review: Langland-Hassan & Vicente, 2018). It gained popularity in Vygotsky’s (1932/1986) theory of cognitive development. While Piaget (1945/1976) intended language as driven by thought, Vygotsky proposed that language, and particularly inner speech, was an instrument for thought. According to Vygotsky, inner speech is the outcome of a developmental process and consists of an initially outer speech that becomes internalised (Trimbur, 1987); it is a sort of self-regulative inner conversation. Later, within models of working memory, inner speech was intended as an active rehearsal mechanism using offline speech planning processes (Baddeley, 1992). Recent studies focus on the functions of inner speech (e.g., behaviour regulation, Clark & Toribio, 2012) and on its components. According to Nalborczyk et al. (2018), inner speech can be conceived “as a physical process that unfolds over time, leading to an enactive re-creation of auditory percepts, via the simulation of articulatory actions” (for reviews, see Alderson-Day & Fernyhough, 2015; Lœvenbruck et al., 2018; Perrone-Bertolotti et al., 2014).
Inner speech includes an (auditory-phonological) sensory, a semantic, and an articulatory component that might also involve motor imagery of the mouth movements. Hence, inner speech and overt speech share many linguistic and structural features (Alderson-Day & Fernyhough, 2015). Evidence supporting this perspective shows that while silently reading words (Topolinski et al., 2014; Topolinski & Strack, 2009), we have a motor intention (i.e., we covertly articulate the speech gesture which has the goal of a particular sound).
While these studies focus more on the inner articulation of word pronunciation, we hypothesise that during the processing of ACs the articulatory component of inner speech is tightly linked to semantic access. Hence, we would use inner speech to search for word meaning and to re-explain to ourselves the meaning of ACs. One could argue that, if no meaning representation is already present, there would be nothing to re-explain to ourselves through inner speech. However, suppose that we have some partial information, derived from different sources, on what “atom” means. Inner speech would help us to put this sparse information together in an ordered way, aiming to determine what the word “atom” really means and whether it is concrete or abstract (for more details on a possible role of inner speech with abstract concepts, see Borghi, 2020; Borghi et al., 2020). We are not arguing that participants would use this mechanism to decide whether a word is abstract or concrete, but to access the word meaning. Consistently, recent evidence suggests that during abstract thought, neural areas linked to inner speech, intended as a form of covert linguistic production, are activated (Berkovich-Ohana et al., 2020). As anticipated above, in our view, abstract and concrete words do not represent a dichotomy, but more abstract words are more challenging and generate more uncertainty as to their meaning. Because of the higher uncertainty of word meaning, we hypothesise that inner speech is more likely to be utilised in the semantic search of abstract than of concrete words, and that the recruitment of inner speech involves articulation, as shown by several pieces of evidence (review in Loevenbruck et al., 2018). Using inner speech does not exclude that the semantic control areas are activated (see Coutanche & Thompson-Schill, 2015; Ralph et al., 2010). Inner speech and the articulatory system form the phonological loop (Chella & Pipitone, 2020).
We propose that the processing of visually presented words might entail the phonological loop (which includes the articulatory system activation) and that this phenomenon might be more engaged in the case of abstract as opposed to concrete concepts. Indeed, the phonological loop has a fundamental role in increasing the short-term memory capacity and in generating complex forms of language (Aboitiz et al., 2010). This is compatible with activation of the left inferior frontal gyrus (LIFG) during abstract word processing. Such activation has been interpreted as reflecting the longer time required by the items to be processed in the short-term phonological memory (Binder et al., 2005; Borghi et al., 2019). The role of the phonological loop is, in our view, more critical for abstract concepts, which are more complex and have a stronger socio-linguistic component than concrete ones. In our experiment, we tested whether the articulatory suppression interferes with the phonological loop during the concepts’ silent reading.
Once we have realised that such a meaning representation is not available, we would use the mouth motor system to prepare ourselves to ask others for information through outer speech, because we are aware of the inadequacy of our concepts (“social metacognition” mechanism) (Borghi et al., 2018a, 2020; Fini & Borghi, 2019).
Hypothesis of the current study
To test the involvement of inner speech during word processing, we used articulatory suppression. Articulatory suppression (i.e., number, word, syllable repetition) has been widely used to interfere with inner speech on cognitive tasks (Baldo et al., 2005; Lidstone et al., 2010).
We hypothesise that articulatory suppression, disrupting inner speech, impairs the processing of ACs but not of CCs, since linguistic experience is more crucial for the representation of ACs than of CCs. To test this hypothesis, participants were asked to continuously repeat a syllable during the processing of ACs and CCs. Ideally, articulatory suppression is employed along with an additional condition including a nonverbal task: this allows investigators to control for effects of dual-tasking and to identify effects specific to inner speech (Alderson-Day & Fernyhough, 2015). Thus, we contrasted the articulatory suppression with a manipulation condition, in which participants had to manipulate a softball. If ACs activate inner speech, hence the mouth, and CCs the hand motor system, manipulation and articulatory suppression should exert opposite effects on ACs and CCs. Notably, we selected concrete words whose referents are typically acted upon with a hand action, excluding words referring to objects involved in oral actions (see Materials for more details).
While conversation would be the ideal situation in which to detect which mechanisms are active during abstract concept use, interrupting the flow of a conversation to submit participants to an articulatory suppression task would render the conversation quite unnatural. At the same time, we needed a task that involves in-depth processing of the word meaning. Indeed, in a recent study we did not find facilitation of the mouth responses with abstract compared to concrete words when using a lexical decision task, probably because it was too shallow (Mazzuca et al., 2018). In lexical decision, it is possible to discriminate words from nonwords without necessarily accessing word meaning (Barsalou et al., 2008).
We therefore opted for a categorization task, in which we asked participants to decide whether words were concrete or abstract. On the one hand, a categorization task has the advantage of requiring access to the word meaning; on the other hand, it is performed in a relatively short time, which might be incompatible with the timing of inner speech. Nevertheless, we opted for this task since it involves the comprehension of the word meaning. Moreover, in the literature, from Egger (1881) on, much evidence has shown that inner speech (or at least one kind of inner speech) is condensed (for an overview, see Loevenbruck et al., 2018). Egger argued that, for physiological reasons, inner articulation is much faster than outer articulation, where we need to take a breath between speech fragments. Others (e.g., Vygotsky, 1932/1986) have claimed that during inner speech we might drop words and use only initials. Korba (1990) asked participants to mentally solve verbal problems and to report the inner speech they used. Then they had to report the adopted strategies through overt speech: inner speech exceeded the speaking rate of overt speech by 4000 words per minute (see Loevenbruck et al., 2018). These data suggest that inner speech might be much faster than overt speech; hence, it might influence a categorization task.
Articulating impoverished words or initials might not seem useful. However, the literature on inner speech suggests that such impoverished words have specific and clear meanings for the person who utters them. In our case, we propose that they can contribute to a better understanding of the word meaning, searching for it and re-explaining it to ourselves, as clarified above. Whether inner speech activated during abstract concepts processing is condensed or extended, we believe it makes use of phonetic features, the use of which might be interfered with through articulatory suppression.
To test these hypotheses, we first performed a pilot study (see Open Practices Statement) in which the conditions of articulatory suppression and manipulation were manipulated between participants (20 for each condition). We found an interaction between word category (abstract vs. concrete) and interference task (articulatory suppression vs. manipulation), supporting our prediction that manipulation and articulatory suppression exert opposite effects on CCs and ACs. However, while our focus was on abstract words, it was unclear whether such an interaction was driven entirely by manual manipulation. To address this concern, in the present preregistered study, we added a baseline condition, and we manipulated the three conditions within participants.
Our preregistered, confirmatory hypotheses were the following:
Hypothesis 1a (directional): We predicted that articulatory suppression would affect inner speech, which in turn would interfere more with the processing of abstract than of concrete concepts, slowing down RTs and decreasing the accuracy of abstract words.
Hypothesis 1b (directional): We predicted that the softball manipulation would interfere more with the processing of concrete than of abstract concepts, slowing down RTs and decreasing the accuracy of concrete words.
Hypothesis 2 (directional): We predicted RTs to be slower in the articulatory suppression and manipulation conditions as compared to the baseline condition.
Hypothesis 3 (bidirectional): We planned to investigate the possible interaction between our experimental conditions (i.e., manipulation vs. articulatory suppression) and the factor Morphological Complexity (monomorphemic vs. suffixed words). We were interested in verifying whether suffixed words are processed faster, as in a previous pilot experiment.
Hypothesis 4 (directional): We predicted that there would be a negative correlation between RTs and abstractness for abstract words and a similar relationship between concreteness and RTs for concrete words.
Method
Participants
Forty-eight healthy students of Sapienza University of Rome participated in this experiment (21 females, mean age 23.5 ± 3.04, range 19–31). All participants had normal or corrected-to-normal vision. The experiment was conducted in accordance with the Declaration of Helsinki and was approved by the ethical committee of Sapienza.
To establish the sample size for the models, we used the method of Westfall et al. (2014), in which we included two fixed conditions and two random intercepts (subjects and words). In accordance with the results obtained in a pilot study, we hypothesised a medium effect size for the main fixed effects. We obtained a power = .83 with a sample size of n = 40 participants, mean diff = .50, residual var = .30; participant intercept var = .20; target intercept var = .20; participant slope var = .10; target slope var = .10 for 78 items.
Procedure
The experimental task was administered on a PC controlled by E-Prime software (Version 3). The participants sat 60 cm from a 15-inch computer monitor in a dimly lit room. They were asked to maintain a comfortable position and to keep their feet on two pedals connected to the computer through a multifunctional response box (Chronos PST100430 model).
The experiment included three consecutive sessions, one for each of the three experimental conditions (Baseline, Manipulation, and Suppression, see below). The same 78 words (half abstract and half concrete) were used in each session/condition in a fully randomised order. The order of conditions was randomised across participants. Between consecutive sessions, 5 min were allotted for rest. Each trial started with the presentation of a fixation cross lasting 500 ms at the centre of the screen, followed by a written word. Participants were asked to judge whether each word was concrete or abstract by pressing the two pedals with the right/left foot. The mapping between concrete/abstract words and right/left foot was counterbalanced across participants. Each word remained on the screen either until the participant’s response or until 2000 ms had elapsed without a response. A 1500 ms blank screen concluded each trial. If participants pressed the wrong pedal, an error sound (i.e., white noise) was delivered. Response accuracy (1 = right, 0 = wrong) and reaction times (in ms) were recorded. In the Baseline condition, participants were only engaged in the judgement task; by contrast, in the other experimental conditions, they were required to perform a concurrent activity during the main task. While categorising words, in the Articulatory suppression condition, participants had to constantly repeat the syllable “ta” at a fast pace, while in the Manipulation condition, they were required to rhythmically squeeze and release a softball with their dominant hand. Specifically, the instructions were to adopt a fast pace (demonstrated beforehand by the experimenter) that participants could keep for some minutes without becoming breathless or having pain in the forearm. In other words, they could self-regulate the pace under the supervision of the experimenter, who continuously checked that the initial pace was maintained.
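As an illustration only, the session and trial structure described above can be sketched in a few lines of Python. This is not the E-Prime script used in the experiment: the function names, the seed handling, and the parity-based counterbalancing rule are hypothetical choices made for the sketch; only the timings, the three conditions, and the 78-word list length come from the Procedure.

```python
import random

FIXATION_MS = 500   # fixation cross duration
WORD_MAX_MS = 2000  # word stays on screen until response or this timeout
BLANK_MS = 1500     # blank screen that concludes each trial
CONDITIONS = ["Baseline", "Manipulation", "Suppression"]


def build_sessions(participant_id, words, seed=0):
    """Return (pedal mapping, [(condition, shuffled word list), ...])."""
    rng = random.Random(seed * 10_000 + participant_id)
    # Hypothetical counterbalancing rule: alternate the mapping by parity.
    if participant_id % 2 == 0:
        mapping = {"abstract": "left", "concrete": "right"}
    else:
        mapping = {"abstract": "right", "concrete": "left"}
    # Randomise the order of the three conditions for this participant.
    order = CONDITIONS[:]
    rng.shuffle(order)
    sessions = []
    for condition in order:
        # The same words are used in every session, each time reshuffled.
        trial_words = words[:]
        rng.shuffle(trial_words)
        sessions.append((condition, trial_words))
    return mapping, sessions


def max_trial_duration_ms():
    """Longest possible trial: fixation + word timeout + blank screen."""
    return FIXATION_MS + WORD_MAX_MS + BLANK_MS
```

Under these timings a trial lasts at most 4000 ms, so a 78-word session takes at most a little over 5 min of stimulus presentation.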
Materials
In all, 39 concrete and 39 abstract nouns were selected for constructing the categorization task. CCs were selected among those with the highest concreteness values according to the Italian norms by Barca et al. (2002); ACs were selected among the most abstract items in the corpus by Della Rosa et al. (2010). Among concrete terms, we selected words whose referents are typically acted upon with a hand action (e.g., “sock,” not “giraffe”), while words referring to objects involved in oral actions (e.g., “fork,” food names) were excluded. We also excluded body parts (e.g., “leg”), words with emotional valence (e.g., “gun,” “love”), social terms (e.g., “teacher”), and superordinates (e.g., “animal”).
Due to the limited size of the corpora and the adopted selection constraints, we were not able to control for all confounding variables. Normative data for Age of Acquisition (AoA), Familiarity (Fam), and word Frequency (Freq) were available in Italian norms for the selected items. Since both Barca et al. (2002) and Della Rosa et al. (2010) collected norms on the same 7-point scale for Age of Acquisition and Familiarity, these values were taken from the corresponding corpus. Word frequency was taken from the Italian norms by Bertinetto et al. (2005) and log-transformed. Finally, length in syllables (Syl) was measured by ourselves.
Only Age of Acquisition turned out to be comparable across CCs and ACs, F(1, 77) = 0.846, p = .361. Conversely, CCs were reliably less frequent, F(1, 77) = 88.6, p < .001, shorter, F(1, 77) = 24.5, p < .001, and more familiar, F(1, 77) = 4.4, p = .039, than ACs (see Table 1).
Table 1. Means and standard deviations of available confounding variables across abstract and concrete words.
Frequency: natural logarithm of the number of occurrences in the 3-million-word corpus by Bertinetto et al. (2005).
The above-mentioned confounding variables were taken into account in the analysis of the outcome of the abstract/concrete judgement task.
Finally, after the item selection was completed, we noticed that while 22 out of 40 ACs were suffixed, this was never the case for CCs. A supplementary analysis was performed to rule out any possible confounding effect of unbalanced morphological complexity across abstract and concrete words (see Results section).
Results
We computed Spearman correlations between mean RTs on ACs (n = 39) and abstractness, and between mean RTs on CCs (n = 39) and concreteness. Only the correlation between mean RTs on ACs and abstractness in the manipulation condition was significant (see Figure 1), indicating that the higher the abstractness of an abstract word, the less it was slowed down by the manipulation condition.

Figure 1. Spearman correlation between mean RTs on ACs (n = 39) and abstractness, and between mean RTs on CCs (n = 39) and concreteness. Although the analysis is nonparametric, the ggplot2 R package generates regression lines and confidence intervals by default.
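For readers unfamiliar with the rank-based correlation used here, the following minimal Python sketch shows how a Spearman coefficient can be computed as a Pearson correlation on average ranks. It is an illustration, not the R code used for the analysis; the function names are hypothetical, and any input values are expected to be per-item means and ratings supplied by the reader.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

A negative coefficient between abstractness ratings and mean RTs would correspond to the pattern predicted in Hypothesis 4: the more abstract the word, the faster the response.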
We then fitted mixed-effects models (R packages lme4 and lmerTest) to investigate the effect of the experimental condition (Baseline, Manipulation, and Suppression) and word category (Abstract vs. Concrete) on reaction times during our abstract/concrete judgement task. For contrasts, we used dummy coding, in which each level of the categorical variable (Category and Experimental Condition) is compared to a fixed reference level. Abstract and Baseline served as reference levels for Category and Experimental Condition, respectively. Raw reaction times were log-transformed to reduce skewness in the distribution of our outcome variable.
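To make the coding scheme concrete, the sketch below shows how a single trial could be turned into dummy-coded predictors with Abstract and Baseline as reference levels. It is a hypothetical illustration in Python rather than the R code of the analysis; the column names are invented, and the use of the natural logarithm is an assumption (suggested by the variable name Rt_ln_2sd that appears in the model formula reported later).

```python
import math

CATEGORY_REF = "Abstract"   # reference level for Category
CONDITION_REF = "Baseline"  # reference level for Experimental Condition


def encode_trial(category, condition, rt_ms):
    """Dummy-code one trial; reference levels are coded as all zeros."""
    row = {
        "concrete": int(category == "Concrete"),
        "manipulation": int(condition == "Manipulation"),
        "suppression": int(condition == "Suppression"),
        # Assumption: natural log of the raw RT in milliseconds.
        "log_rt": math.log(rt_ms),
    }
    # Interaction terms are products of the corresponding dummies.
    row["concrete_x_manipulation"] = row["concrete"] * row["manipulation"]
    row["concrete_x_suppression"] = row["concrete"] * row["suppression"]
    return row
```

With this coding, the coefficient of "suppression" estimates the Suppression versus Baseline difference for abstract words only, and "concrete_x_suppression" estimates how much that difference changes for concrete words.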
Although, as stated in the preregistration of this study, we intended to analyse accuracy in addition to RTs, this was not possible because the number of wrong responses was too small.
Analyses were performed on a total of 10,282 reaction times. From the overall 11,232 responses (48 participants × 78 items × 3 experimental conditions), we discarded 950 trials: 380 (40%) were wrong responses, while the remaining 570 (60%) were more than 2 standard deviations above or below the mean.
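The trial counts above can be checked with simple arithmetic, and the outlier rule can be sketched as follows. The trimming function is only an illustration: the text does not specify whether the mean and standard deviation were computed over all trials or within participants or conditions, nor whether raw or log-transformed RTs were used, so the function simply trims whatever list it is given. The function name is hypothetical.

```python
from statistics import mean, stdev


def trim_2sd(rts):
    """Keep only RTs within 2 standard deviations of the mean of `rts`."""
    m, s = mean(rts), stdev(rts)
    return [rt for rt in rts if abs(rt - m) <= 2 * s]


# Trial bookkeeping reported in the text.
total = 48 * 78 * 3        # participants x items x conditions
wrong, outliers = 380, 570
discarded = wrong + outliers
analysed = total - discarded
share_wrong = round(100 * wrong / discarded)
share_outliers = round(100 * outliers / discarded)
```

Here total is 11,232 and analysed is 10,282, matching the reported figures; the discarded trials amount to roughly 8.5% of all responses.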
We modelled RTs as a function of the following fixed predictors: Condition, Category, and the Condition by Category interaction (i.e., the variables that were critical for testing our hypothesis), together with Frequency, Familiarity, Syllables, Age of Acquisition (AoA), and Experimental Session. Random intercepts for Items and Participants and a by-Participant random slope for Condition were entered to account for the lack of independence between observations. Then, we added a by-Item random slope for Condition, which yielded an unreliable singular model and was thus removed. Following Barr et al. (2013), we chose the reliable model with the maximal random effects structure, which was model 2 (see Table 2); this was also the model with the best fit [R2m = .04, R2c = .34] (Nakagawa & Schielzeth, 2013), χ2(5) = 253.33, p < .001.
Table 2. Information criteria of the models.
AIC: Akaike information criterion; BIC: Bayesian information criterion.
The summary of the model with the best fit is reported in Table 3. As can be seen, manipulating a softball while performing the judgement task on abstract words did not significantly slow RTs (p = .131). By contrast, Articulatory Suppression reliably slowed RTs on abstract words as compared to Baseline (p = .020). More interestingly with respect to our hypothesis, a Condition by Category interaction was found. As can be seen in Table 3, Concrete and Abstract words did not behave differently at a statistically significant level when passing from the Baseline to the Manipulation condition (p = .198). By contrast, with the Baseline as reference level, the effect of articulatory suppression was significantly more pronounced for Abstract than for Concrete words (p = .046).
Table 3. Summary of fixed effects (RTs as a function of category and condition).
SE: standard error.
Figure 2 reports the distribution of log-transformed reaction times according to word category and experimental condition. Figure 3 reports the mean log-transformed reaction times, with standard errors, for abstract and concrete concepts in each condition.

Figure 2. Log-transformed reaction times distribution according to word category and experimental conditions.

Figure 3. Predicted log-transformed reaction times according to word category and experimental condition; bars = 95% confidence intervals.
In addition, a drift diffusion model was fitted using the DstarM package in R (van den Bergh et al., 2020) to describe in more depth the response style profiles across conditions and concept types. The model was developed on a total of 10,852 observations, using as response variable the type of concept (abstract vs. concrete, with abstract as the upper boundary and concrete as the lower), as a function of the reaction time and the condition (Baseline, Manipulation, Suppression). This model makes it possible to disentangle whether the response time difference between abstract and concrete responses is due to different profiles of evidence accumulation according to the type of concept (i.e., whether the presented word is concrete or abstract) or to differences in response strategies.
The chi-square goodness-of-fit test showed reliable values for all the estimated parameters. According to the drift diffusion model, when the absolute value of the drift rate (v) is high, decisions are fast and accurate; when the drift rate is low, the processing is driven to a large extent by noisy fluctuations and, as a result, decisions are slow and inaccurate, indicating a higher difficulty in performing the task (Wagenmakers, 2009). As can be seen in Table 4, the result on the drift rate (v) indicates that the articulatory suppression condition interferes more with abstract concepts than the manipulation condition.
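The relation between drift rate, speed, and accuracy described above can be illustrated with a small simulation. The sketch below is not the DstarM estimation procedure and uses none of its functions; it is a plain random-walk approximation of a diffusion process in Python, with hypothetical parameter values, a unit noise scale, and an unbiased starting point (z = a/2). For simplicity, the upper boundary stands for the response favoured by a positive drift, whereas in the analysis reported here the boundaries code abstract versus concrete responses.

```python
import random


def simulate_trial(v, a, z, rng, dt=0.005, max_t=10.0):
    """Simulate one diffusion trial; return (hit_upper, decision_time_s)."""
    x, t = z, 0.0
    noise_sd = dt ** 0.5  # unit diffusion coefficient (assumption)
    # Accumulate noisy evidence until a boundary (0 or a) is crossed.
    # Trials that reach max_t without crossing count as lower-boundary.
    while 0.0 < x < a and t < max_t:
        x += v * dt + rng.gauss(0.0, noise_sd)
        t += dt
    return x >= a, t


def summarise(v, a=2.0, n_trials=400, seed=1):
    """Proportion of upper-boundary hits and mean decision time for drift v."""
    rng = random.Random(seed)
    trials = [simulate_trial(v, a, a / 2, rng) for _ in range(n_trials)]
    accuracy = sum(hit for hit, _ in trials) / n_trials
    mean_time = sum(t for _, t in trials) / n_trials
    return accuracy, mean_time
```

Comparing, for example, summarise(2.0) with summarise(0.5) should show that the higher drift rate yields both a larger proportion of upper-boundary responses and a shorter mean decision time, which is the qualitative pattern invoked when a lower drift rate is read as greater task difficulty.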
Table 4. Parameters (and their estimate standard errors) of the drift diffusion models applied to the type of concept (abstract vs. concrete) as a function of RTs and condition (baseline, manipulation, and suppression).
a: boundary separation; v: drift rate; z: starting point.
Importantly, concrete vs. abstract concepts differ not only in the manipulation, but also in the articulatory suppression condition and this difference in the articulatory suppression condition is marked when comparing it with the baseline. Hence, it can be argued that the apparent similarity between the manipulation and the articulatory suppression conditions does not reflect the decisional processes but less relevant parts of the response process.
In conclusion, the articulatory suppression condition differs more from the baseline than the manipulation one; the less pronounced difference between the manipulation and the articulatory suppression conditions might be explained by the slow RTs caused by performing a dual task.
RTs on ACs as a function of morphological complexity
Due to the presence of several constraints in the item selection (see Materials section), morphological complexity turned out to be unbalanced across ACs and CCs. While 22 out of 38 of the ACs were suffixed, this was never the case for CCs. Recent research suggests that the number of morphemes influences written word processing: Muncer et al. (2014) found that words with more morphemes produce faster RTs both in naming and lexical decision. To rule out possible confounds linked to the unbalanced morphological variable, we tested whether our experimental conditions (i.e., Baseline, Manipulation, and Suppression) interacted with the factor Morphological Complexity (monomorphemic vs. suffixed words).
To this aim, a subset of 5.059 cases pertaining to abstract words was selected and analysed with a similar model to that used for the entire dataset, save that the factor Morphology (Monomorphemic vs. Suffixed) was entered instead of the factor Category. In particular, we entered as fixed factors Condition, Morphology and Condition by Morphology interaction. As before we also modelled as fixed factors Freq, Fam and Syll, Age of Acquisition and Experimental Session. Finally, random intercepts for Subjects and Items were also entered in the model and a by Subject random slope for Condition. A complete summary of the model m1 <-lmer(Rt_ln_2sd ~ Familiarity + Frequency + Syllables + Session + Age of Acquisition + suffix * Condition + (1 + Condition|Participant) + (1| item), [R2 m = .05, R2c = .37] (Nakagawa & Schielzeth, 2013) is displayed on Table 5. As can be seen, no reliable Morphology effect was found. More interestingly, the effect of experimental Condition was comparable across monomorphemic and suffixed words. In particular, passing from Baseline to Manipulation and from Baseline to Suppression increased RTs in a comparable way across suffixed and non-suffixed words.
Summary of fixed effects (RTs on abstract words as a function of morphology and condition).
SE: standard error.
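The marginal and conditional R2 values reported for model m1 follow the variance-partitioning logic of Nakagawa and Schielzeth (2013): the marginal R2 is the share of variance attributable to the fixed effects, and the conditional R2 is the share attributable to fixed and random effects together. The following is a minimal sketch of that computation for a model with random intercepts only. The by-Subject random slope in m1 would require an additional step that is omitted here, and the function name and variance components are hypothetical, not estimates from the study.

```python
def r2_glmm(var_fixed, var_random, var_residual):
    """Marginal and conditional R2 from variance components.

    var_fixed: variance of the fixed-effect predictions.
    var_random: iterable of random-intercept variances
                (e.g., one for subjects, one for items).
    var_residual: residual variance.
    Random-intercept case only; random slopes are not handled.
    """
    random_total = sum(var_random)
    total = var_fixed + random_total + var_residual
    if total <= 0:
        raise ValueError("total variance must be positive")
    marginal = var_fixed / total
    conditional = (var_fixed + random_total) / total
    return marginal, conditional


if __name__ == "__main__":
    # Illustrative, made-up variance components (not taken from the study).
    r2m, r2c = r2_glmm(var_fixed=0.05, var_random=[0.20, 0.12], var_residual=0.63)
    print(f"R2m = {r2m:.2f}, R2c = {r2c:.2f}")
```

With these made-up components, the fixed effects account for about 5% of the variance and fixed plus random effects for about 37%. This shows how a small marginal R2 can coexist with a much larger conditional R2 when subjects and items vary substantially.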
In light of this result, we can be confident that the Condition by Category interaction found in the previous analysis is genuine (i.e., it is not driven by a subset of either monomorphemic or suffixed abstract words).
Discussion
Our main result consists of a significant interaction between word category (abstract vs. concrete) and condition. We did not find the expected interference of the manipulation condition with concrete words that was present in the pilot study. This is striking, also in light of the various evidence on the interference exerted by manipulation information during the processing of concrete objects (Davis et al., 2020). One possibility is that the absence of an effect is due to the greater difficulty of the articulatory suppression condition compared to the manipulation one when the two conditions are presented within participants. An alternative explanation is that manual manipulation might also interfere with speech motor planning and inner speech production, even if to a lesser extent than articulatory suppression (Nalborczyk et al., 2018). More crucially, we confirmed our prediction that the effect of articulatory suppression was more pronounced for abstract than for concrete words. This is the first demonstration that articulatory suppression slows the processing of abstract words more than that of concrete ones. Such an interference suggests that the involvement of the mouth motor system plays a functional role, as it influences speed in accessing word meanings. Hence, the activation of the mouth is not simply a by-product of ACs processing, occurring a posteriori, once meaning has been grasped, as data on facilitation of mouth responses could imply.
The results support the predictions of the WAT proposal (e.g., Borghi et al., 2019), indicating that linguistic experience plays a more prominent role in the representation of ACs than of CCs (Zannino et al., 2015). Before concluding, we need to point to some limitations of our study, which might lead to future research lines. First, our results do not allow us to completely rule out an alternative explanation. Articulatory suppression could influence abstract words more than concrete ones because it has a stronger disruptive effect than manipulation. Hence, it would affect the processing of the most difficult words, which are the abstract ones. We are inclined to believe this is not the case, for a couple of reasons: (a) in the pilot study, in which the task and the words were the same but the manipulation and articulatory suppression conditions were manipulated between participants, the results were different; and (b) concrete and abstract words were balanced for Age of Acquisition, which is typically correlated with word difficulty.
The result obtained with the drift diffusion model supports the hypothesis that the articulatory suppression condition interferes more with abstract words than the manipulation condition does. It is unclear why the manipulation is easier than the baseline, though.
These results are relevant because they allow us to argue that the manipulation and the articulatory suppression conditions act differently on concrete and abstract words. One possible objection to our results is the following: one could argue that only the articulatory suppression condition had an effect, and that it affected abstract words more because of the higher complexity of the concepts they express. The results of the drift model suggest that this is not the case: if we look at the concrete vs. abstract dissimilarities, we can see that concrete and abstract words differ not only in the manipulation condition but also in the articulatory suppression condition. This difference in the articulatory suppression condition is marked if we compare it with the baseline. Hence, the apparent similarity we had found was not due to the decisional processes, which matter most for our purposes, but to other, less relevant parts of the response process.
If we look at the comparison between conditions, we confirm what was found in the linear mixed models, namely that the articulatory suppression condition differs more from the baseline than the manipulation one does. Why the difference between the two dual-task conditions is less pronounced is less clear; it might be that both conditions slow down the reaction times.
Second, we do not know whether the effect of articulatory suppression we found can be generalised to other tasks beyond categorization. Further studies are needed to explore this issue, possibly studies designed in more ecological situations (e.g., capturing conceptual processing during a conversation).
Finally, a further limitation consists in the timing of our task. We have detected an effect of inner speech with abstract concepts in a categorization task. However, the effect of inner speech might be stronger with tasks in which more time is allowed to recruit and use it. Further research is needed to determine this.
In sum, our results suggest that to comprehend ACs we use inner speech, and that disrupting this use impacts response times. Further research should determine whether inner speech is used to monitor our knowledge internally and search for meaning, to re-explain the word meaning to ourselves, to prepare ourselves to ask other people for information, or for all these reasons. Further studies could also help us to determine whether the influence of inner speech on ACs processing can also be captured with tasks that do not involve articulation, in line with theories according to which inner speech is specified at the phonological but not necessarily at the articulatory level (e.g., Oppenheim & Dell, 2010). Here we show that inner speech represents a powerful instrument to improve and guide our thoughts. This occurs especially with ACs, which are the hallmark of the complexity and sophistication of human knowledge.
Footnotes
Author contributions
G.Z., A.M.B., and C.F. developed the study concept and the study design. Testing and data collection were performed by C.F. and G.Z. M.B. and C.F. performed the data analysis and interpretation under the supervision of A.M.B. A.M.B., C.F., M.B., and G.Z. drafted the manuscript, and M.B., C.F., A.M.B., and G.Z. provided critical revisions. All authors approved the final version of the manuscript for submission.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: A.M.B. was supported by Sapienza Progetti di Ricerca H2020 (2018, 2019) and H2020-TRAINCREASE-From social interaction to ACs and words: towards human centred technology development; CSA, Proposal no. 952324 P.I. A.M.B.
