Abstract
Within single-mechanism connectionist models of inflectional morphology, generating the past-tense form of a verb depends upon the interaction of semantic and phonological representations, with semantic information being particularly important for irregular or exception verbs. We assessed this hypothesis in two experiments requiring normal speakers to produce the past tense from a verb stem that takes a regular or exceptional past tense. Experiment 1 revealed significant latency advantages for high- over low-imageability words for both regular verbs (e.g., “lunged” faster than “loved”) and exception items (e.g., “drank” faster than “dealt”); but critically, this effect was significantly larger for exceptions than for regulars. Experiment 2 employed a semantic priming paradigm where participants inflected verb stems (e.g., sit) preceded by related (e.g., chair) or unrelated primes (e.g., jug) and revealed a priming effect in accuracy that was confined to the exception items. Our results are consistent with predictions from single-mechanism connectionist models of inflectional morphology and converge with findings from neurological patients and studies of reading aloud.
English inflectional morphology is an example of a “quasiregular” domain in that the relationship between inputs and outputs is largely systematic (e.g., walk ↠ walked and jump ↠ jumped) but allows for exceptions (e.g., run ↠ ran; Plaut, McClelland, Seidenberg, & Patterson, 1996; Seidenberg, 1992). The vast majority (≈86%) of English verbs are transformed from their present- to past-tense forms by addition of the –ed suffix (Plunkett & Nakisa, 1997), but around 180 monomorphemic verbs (Joanisse & Seidenberg, 1999), including some of the most frequently used verbs in the English language, undergo a more idiosyncratic transformation. These include vowel changes (sing ↠ sang), consonant changes (send ↠ sent), vowel and consonant changes (teach ↠ taught), complete changes (go ↠ went), and no changes (beat ↠ beat). Any adequate account of inflection must explain how both the regular and the various exceptional cases are processed, and how inflection generalizes to novel or nonce forms (e.g., wug ↠ wugged and sometimes spling ↠ splang).
The dual-mechanism “words-and-rules” (WR) model (Pinker, 1999; Pinker & Prince, 1988) proposes that regular forms are computed by rule (“add –ed”), whereas exceptional past-tense forms are stored in a lexicon. For regular (e.g., talk) and novel verbs (e.g., wug), the rule is applied by default (Pinker & Ullman, 2002), but if a stored inflected form of a verb is retrieved, it blocks application of the rule (e.g., brought blocks bringed). Ullman and colleagues (Ullman, 2001; Ullman et al., 1997) extended this model to include hypotheses regarding its neurocognitive substrates. According to their declarative/procedural (DP) hypothesis, regular and novel past-tense forms are computed online, each time they are required, by the grammatical system that is part of procedural memory located in the frontal cortex (including Broca's area) and the basal ganglia that project to it. In contrast, the lexicon in which exceptional past-tense forms are stored is part of the declarative memory system, located in medial-temporal regions.
A contrasting view is offered by single-mechanism connectionist models of inflection in which all items are inflected according to learnt probabilistic mappings, rather than requiring distinct subsystems to produce the past tense of regular and nonce versus exception verbs. Some single-mechanism models implemented the mapping between input and output phonology (e.g., Daugherty & Seidenberg, 1992; Rumelhart & McClelland, 1986), or the generation of inflections directly from meaning (e.g., Hoeffner, 1992; Ramscar & Yarlett, 2007), while others have incorporated both aspects (e.g., Joanisse & Seidenberg, 1999; Karaminis & Thomas, 2010; Woollams, Joanisse, & Patterson, 2009). Within this latter approach, present-tense stems are mapped to past-tense forms via a system of distributed phonological and semantic representations. Regular and novel verb forms may be inflected primarily on the basis of phonological information as they exemplify the typical present- to past-tense mapping, whereas verbs with exceptional past-tense forms depend more on additional semantic information as a consequence of their relatively atypical mapping. In terms of neural correlates of this view, the process of inflecting all verbs recruits both left-frontal brain regions supporting phonological processing and anterior temporal areas that contribute to semantic processing; as indicated in the previous sentence, however, there should be somewhat differential reliance on the two regions for the two verb types.
Behavioural research using past-tense elicitation paradigms has consistently shown that skilled adult speakers inflect regular verbs faster and more accurately than exception verbs, and that this is particularly the case for low-frequency items (e.g., Prasada, Pinker, & Snyder, 1990; Seidenberg & Bruck, 1990). Both dual- and single-mechanism approaches are able to account for this basic finding. According to dual-mechanism accounts, the exceptional past-tense form needs to be retrieved from the lexicon in order to block application of the “add –ed” rule. This blocking process is time consuming and can occasionally fail altogether, resulting in a “regularization” error (e.g., swear > “sweared”). The efficiency of the retrieval and blocking process is modulated by verb frequency, meaning that the latency cost or incidence of errors is greater the lower the frequency of the verb (Pinker, 1999; Prasada et al., 1990).
In contrast, single-mechanism models attribute the frequency by regularity interaction to the fact that regular inflection is the most common in the domain, and therefore the learnt mappings encoding this transformation will be the strongest. Thus, regular past-tense transformations can be performed quickly and efficiently, regardless of verb frequency. This well-learned regular mapping causes interference when exception items are to be inflected. Such interference can be overcome to some extent for high-frequency exceptions due to greater exposure to these items during training. This form of support is less available in the case of low-frequency exceptions, meaning that the inflection process will be influenced more by competition from regular items, resulting in longer reaction times (RTs) and a higher incidence of regularization errors (Daugherty & Seidenberg, 1992; Seidenberg, 1992, 1993; Seidenberg & Hoeffner, 1998).
The fact that both dual- and single-mechanism accounts of inflectional morphology can account for basic empirical findings from normal participants has led researchers attempting to adjudicate between the two approaches to consider the consequences of different forms of brain damage upon past-tense production. Ullman et al. (1997) reported a neuropsychological double dissociation between success in regular and nonce versus exceptional verb inflection. Patients with Parkinson's disease or agrammatic aphasia associated with anterior lesions showed a relative advantage for producing exceptional past-tense forms over regular and nonce forms, which Ullman et al. attributed to damage to the grammatical process that implements the “add –ed” rule. Conversely, patients with Alzheimer's disease or with fluent aphasia resulting from posterior temporal lesions produced more errors when inflecting exception verbs than when inflecting regular or nonce verbs, which was explained in terms of damage to the lexicon.
Whilst Ullman et al. (1997) interpreted their results as strong evidence for the functional independence of the components of the dual-mechanism model, the precise nature of the apparent dissociation and the underlying causes of selective impairments have been the focus of much subsequent debate. According to dual-mechanism accounts, any deficit in inflection of nonce verbs should also hold for regular verbs, as such deficits reflect disruption of the grammar that applies the “add –ed” rule. In contrast, connectionist single-mechanism accounts favour an explanation in terms of a generalized phonological impairment that will particularly affect nonce verb inflection. According to Bird, Lambon Ralph, Seidenberg, McClelland, and Patterson (2003), phonological damage disproportionately affects regular past-tense forms because they are, on average, more phonologically complex than exceptional past-tense forms (Burzio, 2002; Hoeffner & McClelland, 1993). In 10 nonfluent aphasic patients with a central phonological deficit, Bird et al. demonstrated that the advantage for exception over regular verbs on the stimulus materials from Ullman et al. was eliminated with a new set of items in which the past-tense forms were matched for phonological complexity across regularity. Moreover, the deficit for nonce as opposed to real verbs amongst these patients was maintained (Braber, Patterson, Ellis, & Lambon Ralph, 2005). These data conform to the predictions of the Joanisse and Seidenberg (1999) single-mechanism model: When its phonological representations were “lesioned”, deficits were significantly larger for nonce verbs than for any other stimulus type, as such items place increased demands upon phonology as a consequence of their novelty. 
These findings suggest that the “exception > regular pattern” is attributable to differences in phonological complexity rather than morphological category (Lambon Ralph, Braber, McClelland, & Patterson, 2005), although this conclusion is by no means universally accepted (Ullman et al., 2005).
The underlying cause of a selective deficit for exception verbs also remains contentious. Although the original report attributed this deficit to damaged lexical (nonsemantic) representations (Ullman et al., 1997), the single-mechanism account instead proposes that the more idiosyncratic mappings for exception items increase reliance on semantic information. In the Joanisse and Seidenberg (1999) model, damage to the whole-word representations used to approximate semantics resulted in a disproportionate deficit in the generation of exceptional past-tense forms and a preponderance of regularization errors. These predictions were confirmed by Patterson, Lambon Ralph, Hodges, and McClelland (2001): Patients with semantic dementia involving a relatively selective deficit of semantic memory performed well at inflecting regular verbs but poorly for exceptions. Moreover, this relationship held at the item level, with scores on a synonym judgement task correlating with extent of impairment in inflection of the same exception verbs, whilst no such relationship held for regular verbs. The advantage for regular and novel past-tense forms over exceptional forms has been replicated in other patients with semantic dementia (e.g., Bird et al., 2003; Patterson et al., 2006) and has also been found in patients with lesions of the left temporal lobe resulting from herpes simplex virus encephalitis (Tyler et al., 2002). Nevertheless, there have been occasional reports of semantic dementia patients with intact exception inflection (Tyler et al., 2004). This has led some researchers (e.g., Bright, Moss, Stamatakis, & Tyler, 2008) to attribute the deficit, when it occurs, to damage to lexical processing areas adjacent to the anterior temporal regions responsible for semantic processing (Hodges, Patterson, Oxbury, & Funnell, 1992; Mummery et al., 1999), consistent with a dual-mechanism account.
The goal of this study was to provide convergent evidence concerning the role of semantic information in the inflection of exception verbs, through assessment of the influence of semantic variables upon past-tense production by normal participants. The WR and DP models seem to assume that successful performance of the inflection-from-stem task requires words and rules but not semantic information (Woollams et al., 2009). These models, therefore, are silent in terms of predictions regarding the influence of semantics on past-tense generation. In the Joanisse and Seidenberg (1999; Woollams et al., 2009) model, throughout the course of learning, exception items come to depend upon a relatively greater contribution of semantic information. This then predicts that there should be an interaction between semantic variables and regularity, such that semantic effects will be more apparent for exception than for regular items.
A similar line of logic has been pursued within the quasi-regular domain of reading aloud English words, where the interaction between imageability, a semantic variable reflecting the ease with which one can form a mental image of a word's referent, and regularity has been taken as evidence for a single- over a dual-mechanism approach (e.g., Cortese, Simpson, & Woolsey, 1997; Shibahara, Zorzi, Hill, Wydell, & Butterworth, 2003; Strain, Patterson, & Seidenberg, 1995, 2002; Woollams, 2005). The finding that the effect of imageability on reading aloud by skilled adults is limited to low-frequency exception words has also been simulated in a single-mechanism connectionist computational model that incorporated a full system of featural semantic representations (Harm & Seidenberg, 2004). Also consistent with the single-mechanism view of reading aloud (Plaut et al., 1996), significantly larger semantic priming effects have been observed for exception words than for regular words amongst skilled adults (Cortese et al., 1997). In the present study on verb inflection from stem, we sought to demonstrate similar interactions between regularity and imageability (Experiment 1) and regularity and semantic priming (Experiment 2), as would be explicitly predicted by the single-mechanism view.
Experiment 1
In Experiment 1, we aimed to assess the impact of semantic information on past-tense inflection in skilled adult speakers by manipulating the semantic variable imageability, which has previously been employed in studies of reading aloud (e.g., Cortese et al., 1997; Strain et al., 1995). The experimental task entailed speeded inflection, which required participants to generate the past-tense form from a visually presented verb stem as quickly as possible, whilst avoiding errors. For example, when presented with “walk” on the computer screen, participants were required to say “walked” as quickly as possible. The expectation was that this process would be more efficient for high- than for low-imageability items, and that this imageability effect would be significantly larger for exception than for regular items, in line with the predictions of the single-mechanism model.
Method
Participants
Forty-eight normal adult speakers (42 female, 6 male) were recruited from the University of Manchester School of Psychological Sciences (SPS) participant pool, with course credit offered for participation. The only exclusion criterion was not having English as a first language. Ethical approval for Experiments 1 and 2 was obtained from the SPS Ethics Committee. Of the 48 participants, 16 took part in a pretest rating study and 32 in the main experiment.
Stimuli
In order to distil optimum stimuli for Experiment 1, in terms of matching across regularity on level of imageability, we conducted a ratings study prior to launching the main experiment. The ratings study also served to check participants' knowledge of the past tense, ensuring that only verbs for which the majority of participants knew correct past-tense forms were included in the final experimental stimuli. Any imageability effects found in error rates should therefore reflect experimental manipulations rather than vocabulary limitations of participants.
Sixteen participants (14 female, 2 male) took part in the ratings task. They were tested in groups of up to six in a multitesting lab. The study was conducted using Dell Optiplex 755 computers with a 60-Hz refresh rate, at 1,280 × 1,024-pixel screen resolution. Participants were given a number of practice trials before each task. The first task required participants to rate the imageability of 200 verb stems on a scale of 1–7 (1 = abstract, 7 = concrete) using the computer keyboard. Some words have more than one meaning and can function as verbs (e.g., to run) or nouns (e.g., to go on a run). Verb stems were therefore preceded by the word “to”, and participants were instructed to rate the imageability of each stem specifically in its context as a verb, following instructions used by Bird, Franklin, and Howard (2001) in their large-scale imageability rating study. The instructions presented to participants at the beginning of the imageability ratings task are provided online as Supplementary Material. The 200 verb stems to be rated consisted of 100 regular and 100 exception items, presented in a different randomized order to each participant. Each regular verb was matched to an exception verb on lemma frequency, letter length, and perceived imageability (as judged by the experimenter). Only verbs that were monosyllabic in their present- and past-tense forms were included. In a subsequent task, the same participants were also asked to type in the past-tense form of the same 200 verb stems (presented in a different randomized order) in their own time, with an emphasis on avoiding errors. For example, in response to “Today I run, yesterday I …”, participants were required to type “ran”. Verbatim instructions given to participants for this task are provided online as Supplementary Material. The inflectional accuracy and imageability ratings for each of the 200 items are also available online as Supplementary Material.
The final set of stimuli, selected on the basis of the imageability ratings and accuracy in the unspeeded past-tense generation task, consisted of 120 verb stems. These comprised 30 items for each condition (regular high imageability, regular low imageability, exception high imageability, exception low imageability). The stimuli retained were only those for which more than 50% of participants generated the correct past-tense form, with an overall mean of 93% (see Table 1). Verb stems with mean imageability ratings below 4.1 were classified as low imageability, and those with mean ratings above 4.7 were classified as high imageability. Lexical properties of the final stimulus sets used in Experiment 1 were derived from the CELEX database (Baayen, Piepenbrock, & van Rijn, 1993) and are presented in Table 1.
Table 1. Average values of stimuli used in Experiment 1 on a variety of lexical variables as a function of regularity and imageability
Note: Standard deviations in parentheses.
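The selection criteria described above (mean imageability rating below 4.1 for low, above 4.7 for high, and more than 50% of raters producing the correct past tense) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names, the record layout, and all ratings and accuracies are invented, and the sketch does not reproduce the matching on frequency and length or the trimming to 30 items per condition.

```python
from statistics import mean

LOW_MAX = 4.1       # mean ratings below this count as low imageability
HIGH_MIN = 4.7      # mean ratings above this count as high imageability
MIN_ACCURACY = 0.5  # more than 50% of raters must give the correct past tense


def imageability_band(mean_rating):
    """Return 'low', 'high', or None for ratings in the excluded middle band."""
    if mean_rating < LOW_MAX:
        return "low"
    if mean_rating > HIGH_MIN:
        return "high"
    return None


def select_candidates(pretest):
    """Map each eligible stem to its (regularity, imageability band) condition."""
    selected = {}
    for stem, info in pretest.items():
        band = imageability_band(mean(info["ratings"]))
        if band is not None and info["accuracy"] > MIN_ACCURACY:
            selected[stem] = (info["regularity"], band)
    return selected


# Hypothetical pretest records: every rating and accuracy value is invented.
pretest = {
    "drink": {"regularity": "exception", "ratings": [6, 7, 6, 5], "accuracy": 1.00},
    "deal": {"regularity": "exception", "ratings": [3, 2, 4, 3], "accuracy": 0.94},
    "lunge": {"regularity": "regular", "ratings": [6, 5, 6, 6], "accuracy": 1.00},
    # Mean rating of 4.5 falls between the two cut-offs, so the stem is dropped.
    "seem": {"regularity": "regular", "ratings": [4, 5, 4, 5], "accuracy": 1.00},
    # Imageable enough, but too few raters produced the correct past tense.
    "smite": {"regularity": "exception", "ratings": [5, 6, 5, 6], "accuracy": 0.25},
}

selected = select_candidates(pretest)
```

Leaving a gap between the two cut-offs keeps the low and high bands clearly separated, at the cost of discarding stems with intermediate ratings.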
We conducted a series of between-items 2 (regularity: regular/exception) × 2 (imageability: low/high) analyses of variance (ANOVAs) to ensure that there were no significant differences between stimuli in each condition, barring the intended differences in regularity and imageability. Whilst there was a significant effect of regularity on inflection accuracy, F(1, 116) = 63.13, p < .0005, with participants being better at inflecting regular than exception verbs across the board, there was no significant effect of imageability, F(1, 116) = 1.10, p = .297, nor any interaction between regularity and imageability, F(1, 116) = 1.10, p = .297. As expected, there was a significant effect of imageability on imageability rating, F(1, 116) = 325.90, p < .0005, but there was no effect of regularity, F(1, 116) = 0.14, p = .705, nor any interaction between the two variables, F(1, 116) = 2.37, p = .126. Analyses revealed no significant effects of regularity or imageability, nor any interaction between the two, on lemma frequency, stem frequency, past-tense frequency, or stem letter length (F value range = 0.001 to 1.07, p value range = .977 to .303). There was a highly significant effect of regularity on past-tense letter length, F(1, 116) = 152.39, p < .0005, but no main effect of imageability, F(1, 116) = 0.10, p = .752, nor any interaction, F(1, 116) = 0.01, p = .916. Although number of meanings¹ did not show a main effect of regularity, F(1, 116) = 1.20, p = .277, or any interaction, F(1, 116) = 0.87, p = .353, it did show a marginally reliable effect of imageability, F(1, 116) = 2.84, p = .095, consistent with previous studies showing a correlation between the two measures (Baayen, Feldman, & Schreuder, 2006). For this reason, we included number of meanings as a covariate in our item analyses.
¹ We are very grateful to Professor Harald Baayen for providing us with this measure, computed from the WordNet database.
Procedure
Thirty-two participants who had not taken part in the pretests completed the main experiment. Instructions and stimuli were presented and RTs recorded using DMDX experimental software (Forster & Forster, 2003) using a microphone headset connected to a Dell Optiplex GX620 with a 60-Hz refresh rate at 1,280 × 1,024-pixel screen resolution. The experimenter manually recorded any erroneous responses and measurement errors due to inaccurate activation of the microphone.
A total of 120 verb stems, varying in regularity and imageability, were presented in a different random order for each participant. These were preceded by a series of 12 practice trials not seen in the subsequent task (three low-imageability exception, three low-imageability regular, three high-imageability exception, three high-imageability regular, randomly intermixed). All trials began with a white cross on a black screen for 500 ms, followed by central presentation of the target stimulus in lower case. The stimulus remained on the screen until the participant produced a spoken response that was detected by the microphone or for a maximum of 4,000 ms. Participants' responses were recorded for 2,000 ms from onset. The next trial began after an interval of 2,250 ms. General feedback on task performance was provided during the practice trials by the experimenter, but no feedback was provided during the subsequent experimental trials.
Results
One female participant was found to be a non-native English speaker; all data for this participant were excluded. Analyses were, therefore, based on results from 31 participants. We conducted ANOVAs based on both participant (F1) and item (F2) means. Mean RTs and accuracy/error rates for all of the items used in Experiment 1 are presented in Appendix A.
Reaction times
Any trial for which the microphone was inappropriately activated (2.77% of trials), or on which a participant gave an erroneous response, was excluded from RT analyses. The by-participants analysis involved a 2 (regularity: regular/exception) × 2 (imageability: low/high) repeated measures ANOVA. The analysis of RTs revealed main effects of regularity, F1(1, 30) = 68.77, p < .0005, and imageability, F1(1, 30) = 39.35, p < .0005, which were qualified by a significant interaction between the two, F1(1, 30) = 7.01, p = .013. The interaction between the effects of regularity and imageability on mean RTs is illustrated in Figure 1. Preplanned comparisons revealed that the 193-ms regularity effect for low-imageability items was highly significant, t(30) = 7.76, p < .0005, as was the 141-ms regularity effect for high-imageability items, t(30) = 7.16, p < .0005. There was a highly significant imageability effect of 99 ms for exception items, t(30) = 5.55, p < .0005, and a smaller 47-ms effect for regular items that was still significant, t(30) = 3.87, p = .001.

Figure 1. Mean reaction times (RTs) as a function of regularity and imageability for Experiment 1. Error bars represent ± standard error.
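The arithmetic behind the by-participants interaction can be made explicit. When both factors of a 2 × 2 design are manipulated within participants, the interaction F1 equals the square of a one-sample t statistic computed on each participant's difference of differences, with (1, n − 1) degrees of freedom. The reported effects show the same symmetry at the level of condition means: 193 − 141 = 99 − 47 = 52 ms. The Python sketch below illustrates the computation with invented RTs for four hypothetical participants; the function names and data are illustrative and are not taken from the study's analysis.

```python
from math import sqrt
from statistics import mean, stdev


def interaction_contrast(rt):
    """Imageability effect for exceptions minus that for regulars (ms)."""
    exception_effect = rt["exc_low"] - rt["exc_high"]
    regular_effect = rt["reg_low"] - rt["reg_high"]
    return exception_effect - regular_effect


def one_sample_t(values):
    """t statistic testing whether the mean of values differs from zero."""
    return mean(values) / (stdev(values) / sqrt(len(values)))


# Invented condition means (ms) for four hypothetical participants.
participants = [
    {"reg_high": 700, "reg_low": 740, "exc_high": 850, "exc_low": 950},
    {"reg_high": 650, "reg_low": 700, "exc_high": 780, "exc_low": 870},
    {"reg_high": 720, "reg_low": 770, "exc_high": 860, "exc_low": 980},
    {"reg_high": 690, "reg_low": 745, "exc_high": 830, "exc_low": 925},
]

contrasts = [interaction_contrast(p) for p in participants]
t = one_sample_t(contrasts)
f_interaction = t ** 2  # interaction F on (1, n - 1) degrees of freedom
```

A positive mean contrast corresponds to the predicted pattern, namely a larger imageability effect for exception than for regular verbs.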
For the items analysis, imageability was treated as a continuous variable in order to capture more accurately the graded nature of this variable² and, in light of its marginal confound with number of meanings, to allow a comparable treatment of the two variables. Imageability ratings for each item acquired in the pretest were entered with number of meanings as linear predictors in an analysis of covariance (ANCOVA), which also included regularity (regular/exception) as a dichotomous factor. The analysis revealed significant effects of regularity, F2(1, 116) = 11.00, p = .001, and imageability rating, F2(1, 116) = 17.96, p < .0005, and an interaction between the two, F2(1, 116) = 6.72, p = .011. Imageability rating predicted RTs for both exception and regular items. For exception items, imageability rating accounted for approximately 19% of variance in RTs over and above number of meanings, which was highly significant, β = –65.25, t(1) = –3.66, p < .0005. Imageability rating accounted for less variance in RTs to regular items after controlling for number of meanings, approximately 11%, which was still significant, β = –25.13, t(1) = –2.69, p = .009.
² Although an alternative approach to achieving a continuous treatment of the imageability dimension would have been to conduct a simultaneous participant and items analysis using a linear mixed-effects model, we adopted separate participants and items analyses to allow more direct comparison to parallel manipulations in the reading-aloud literature.
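The phrase “variance in RTs over and above number of meanings” in the items analysis corresponds to an incremental R², which equals the squared semipartial correlation between RT and the part of imageability that number of meanings cannot predict. The Python sketch below shows one way to compute this quantity within a single regularity group. It is an illustration under stated assumptions rather than the analysis actually run: the helper names are hypothetical and the six items are invented.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def residuals(y, x):
    """Residuals of y after a simple linear regression of y on x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]


def incremental_r2(y, predictor, covariate):
    """Share of variance in y explained by predictor beyond the covariate."""
    return pearson_r(y, residuals(predictor, covariate)) ** 2


# Invented item-level data for six hypothetical exception verbs.
rts = [980, 940, 905, 870, 860, 815]         # mean RT per item (ms)
imageability = [2.1, 3.0, 3.8, 4.9, 5.5, 6.2]  # mean rating per item
n_meanings = [3, 5, 4, 8, 6, 9]                # number of meanings per item

extra_variance = incremental_r2(rts, imageability, n_meanings)
```

Residualizing imageability on number of meanings before correlating it with RT is what makes the estimate specific to imageability, which matters here because the two measures are correlated.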
Accuracy and error rates
All errors not associated with inappropriate activation of the microphone were coded as one of the following categories: (a) regularization errors, in which exception verb stems were inflected as regular verbs (e.g., shed > “shedded”, stick > “sticked”); (b) legitimate alternative rendering of components (“LARC”) errors (Patterson et al., 2006), in which a verb stem (regular or exception) was incorrectly inflected by analogy to another verb (e.g., glow > “glew”, hide > “hode”); (c) production of the past participle of the verb, rather than its simple past (e.g., beat > “beaten”, steal > “stolen”); (d) “no change errors”, in which the participant merely reproduced the verb stem (e.g., chew > “chew”, lend > “lend”); (e) no response after 4,000 ms; (f) dysfluencies; (g) “other name errors”, in which the participant inflected a word that was visually similar to the target, rather than the target itself (e.g., lend > “led”, sting > “strung”); and (h) “other” errors, which included a variety of errors that did not fall into the categories described (e.g., bid > “bode”, join > “joint”, strike > “strucked”, talk > “spoke”). The frequencies of different error types produced in Experiment 1 are given in Appendix B.
Overall accuracy for each condition was calculated based on number of correct trials out of the number of valid trials (i.e., excluding voicekey errors). Mean accuracy rates for each condition of Experiment 1 are provided in Table 2. A repeated measures 2 (regularity) × 2 (imageability) ANOVA by participants revealed a highly significant effect of regularity, F1(1, 30) = 85.52, p < .0005, but no significant effect of imageability, F1(1, 30) = 0.03, p = .863, nor any interaction between the two, F1(1, 30) = 0.70, p = .409. The effect of regularity on accuracy was significant for both high-, t(30) = –7.88, p < .0005, and low-imageability, t(30) = –8.99, p < .0005, items. As in the analysis of RTs by items, imageability rating and number of meanings were entered as continuous predictors in an ANCOVA on accuracy that included regularity as a between-items factor. There were no significant effects in this items analysis: regularity, F2(1, 116) = 0.85, p = .359; imageability rating, F2(1, 116) = 0.24, p = .628; regularity by imageability rating, F2(1, 116) = 0.03, p = .869.
Table 2. Mean percentage accuracy for each condition of Experiment 1 as a function of regularity and imageability
Note: Standard deviations in parentheses.
In addition to calculating overall accuracy rates, we conducted error analyses based on the combined rate of regularization and LARC errors. Regularizations and LARCs were of particular interest because they are errors that are clearly inflectional in nature. Participants' combined mean regularization and LARC error rates for the different conditions of Experiment 1 are presented in Figure 2. When analysed with a repeated measures 2 (regularity) × 2 (imageability) ANOVA by participants, a highly significant effect of regularity was found, F1(1, 30) = 65.03, p < .0005, but there was no significant effect of imageability, F1(1, 30) = 0.87, p = .359, and no interaction between the two, F1(1, 30) = 0.34, p = .567. The analysis by items also showed a significant regularity effect, F2(1, 116) = 5.832, p = .017, a marginally significant effect of imageability rating, F2(1, 116) = 3.42, p = .067, and no interaction, F2(1, 116) = 1.97, p = .163.

Figure 2. Percentage regularization and legitimate alternative rendering of components (LARC) error rate as a function of regularity and imageability for Experiment 1. Error bars represent ± standard error.
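The two dependent measures analysed in this section can be sketched in Python as follows. Overall accuracy is computed over valid trials, that is, after voicekey errors are removed. The sketch assumes that the combined regularization/LARC rate uses the same denominator, which the text does not state explicitly. The outcome labels, the function name, and the trial counts are all hypothetical.

```python
from collections import Counter

# Error categories treated as clearly inflectional in nature.
INFLECTIONAL = ("regularization", "larc")


def score_condition(outcomes):
    """Percentage accuracy and combined regularization/LARC rate for one cell."""
    counts = Counter(outcomes)
    valid = len(outcomes) - counts["voicekey"]  # voicekey errors are not valid trials
    if valid == 0:
        return None
    inflectional = sum(counts[label] for label in INFLECTIONAL)
    return {
        "accuracy": 100 * counts["correct"] / valid,
        # Assumption: same valid-trial denominator as overall accuracy.
        "inflectional_error_rate": 100 * inflectional / valid,
    }


# Invented outcomes for one participant in one 30-item condition.
outcomes = (
    ["correct"] * 20
    + ["regularization"] * 3
    + ["larc"]
    + ["no_change"]
    + ["voicekey"] * 5
)

scores = score_condition(outcomes)
```

With these invented counts, 25 of the 30 trials are valid, so 20 correct responses correspond to 80% accuracy and four inflectional errors to a 16% combined rate.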
Discussion
The results of the present experiment revealed the standard disadvantage for exception relative to regular verbs in RTs and combined regularization/LARC error rates. In addition, imageability had a highly significant impact on RTs in speeded inflection for both regular and exception items. Most importantly, the predicted interaction between regularity and imageability in RTs emerged: The advantage for high- over low-imageability items was more pronounced for the exception verbs (99 ms) than for the regular verbs (47 ms). Similarly, the regularity effect observed for high-imageability items (141 ms) was smaller than that for low-imageability items (193 ms). These results support the prediction derived from connectionist single-mechanism models that semantic information has a greater impact on the transformation from present to past tense for exception than for regular verbs. Interestingly, the presence of a smaller but still significant imageability effect for regular verbs is also more in line with single- than dual-mechanism accounts. Dual-mechanism theories hold that semantic knowledge is largely irrelevant to transforming exception verbs to their past-tense forms, and they would deny even more firmly any role for meaning in the rule-governed inflection of regular verbs, as for nonce verbs (Ramscar, 2002). The single-mechanism account argues that both phonological and semantic knowledge are recruited, just to different degrees, for both types of verb. This is precisely what the term “single mechanism” means: The same set of procedures is applied to exception, regular, and even nonce verbs.
The present results also closely resemble the interaction between regularity and imageability observed in the domain of reading aloud (Strain et al., 1995), as would be expected according to the connectionist single-mechanism account. Although significant imageability effects have not been found for items with regular spelling–sound correspondences in reading aloud, the reliable effects observed for regular items here probably result from the fact that past-tense generation is a fundamentally more semantic task, requiring activation of the meaning of “past tense” for its completion (Woollams et al., 2009). In addition, although high imageability has been found to entirely eliminate the costs of exceptional spelling–sound correspondence in reading aloud, the significance of the regularity effect observed here is no doubt a consequence of the overwhelming regularity of present- to past-tense transformations, meaning that exception verbs require responses that are relatively more atypical within their domain. The results of Experiment 1 indicate that the semantic variable of imageability influences the efficiency of past-tense inflection from stem, particularly for exception verbs. Experiment 2 explored whether similar effects could be obtained using a different semantic manipulation—namely, semantic priming.
Experiment 2
The aim of Experiment 2 was to investigate whether semantic priming effects comparable to those seen in reading aloud (Cortese et al., 1997) would characterize past-tense generation. The rationale behind semantic priming is that prior presentation of a semantically related stimulus should facilitate performance to a target stimulus more than prior presentation of an unrelated stimulus (Neely, 1991). Cortese et al. (1997) suggested that semantic priming may be a more powerful tool for testing the relationship between semantics and phonology than imageability, as priming involves presentation of semantic information before the target is given. Whilst we controlled for a variety of parameters when selecting stimuli for Experiment 1, such as present- and past-tense frequency, it has been noted that imageability correlates highly with variables such as age of acquisition (Monaghan & Ellis, 2002). In order to obtain convergent evidence for the influence of semantic variables on past-tense processing, we employed a semantic priming paradigm in Experiment 2, which allows the same items to be presented in both the related and unrelated conditions across participants. As per Experiment 1, the prediction derived from the connectionist single-mechanism account is that past-tense generation should be more efficient following semantically related than following unrelated primes, with this priming effect being significantly stronger for exception than for regular verbs. Confirmation of this prediction would corroborate our interpretation of the imageability effect observed in Experiment 1 and, more generally, would support connectionist single-mechanism conceptions of inflectional morphology.
Method
Participants
A further 48 undergraduate psychology students, all native speakers of English, participated in this study (41 female, 7 male), in return for course credit.
Stimuli
As in Experiment 1, we conducted a ratings task in order to obtain optimum stimuli for Experiment 2. This time the purpose was to match across regularity on degree of prime–target semantic relatedness, as well as to provide a means of ensuring that only verb stems for which most participants actually knew correct past-tense forms were included in Experiment 2.
A pool of 200 verb stems, 100 regular and 100 exception, was presented to participants. Each regular verb was matched to an exception verb on frequency, length, and perceived imageability. Only verbs that were monosyllabic in their present- and past-tense forms were included. A further pool of 200 prime words related to target verb stems in meaning (according to the judgement of the experimenter) and matched across regularity on frequency and length was also included. For the prime relatedness ratings task, there were two versions, in each of which half of the items were presented with a related prime, and half were presented with an unrelated prime. Unrelated word–verb pairs were generated by re-pairing targets and primes. Items presented with a related prime in Version A were presented with an unrelated prime in Version B, and vice versa. The administration of Versions A and B was alternated across participants. Thus all participants saw the same 200 verb stems, but half saw a given verb after a related prime, while the other half saw the same verb after an unrelated prime.
The semantic relatedness ratings task was run with the same 16 participants and during the same sessions as the imageability ratings task for Experiment 1. Before each task, participants were given a number of practice trials. Participants were required to rate the semantic relatedness of 200 word–verb pairs on a scale of 1–7 (1 = unrelated, 7 = related) using the keyboard. The specific instructions presented to participants are provided online as Supplementary Materials. The past-tense generation pretest described in Experiment 1 provided information on accuracy of inflection, which was used to select stimuli for both Experiments 1 and 2. The relatedness ratings for each related prime–target pair are presented online as Supplementary Materials.
We derived 120 word–verb pairs from the pretests. As per Experiment 1, we retained only verbs for which more than 50% of participants provided the correct past-tense form, with an overall mean of 92% (see Table 3). We also excluded any word–verb pairs presented in the related condition that received a mean relatedness rating of below 4. We selected sets of 60 exception items and 60 regular items using the “Match” programme (van Casteren & Davis, 2007) in order to be best matched pairwise on prime relatedness, frequency, and letter length of the stem. The majority of primes were noun/verb homographs (25 for regular and 37 for exception) or verbs (17 for regular and 13 for exception), with an additional adjective/verb homograph in the regular condition. The remainder were nouns (16 for regular and 10 for exception), with an additional adjective in the regular condition. Of the primes that could act as a verb, the vast majority took regular past-tense forms (93% for regular and 96% for exception). The regular and exception sets were each divided into two subsets of 30 pairs, Set 1 and Set 2. For the main experiment, there were two versions. In Version A, Set 1 verbs (regular and exception) were presented with their related primes, and Set 2 verbs (regular and exception) were presented with unrelated primes, obtained by recombination of primes and targets within regularity. In Version B, this assignment of sets to prime relatedness condition was reversed. Table 3 documents the lexical properties of the sets of stimuli used in Experiment 2.
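The counterbalancing of sets across versions can be sketched in code. This is a minimal illustration under stated assumptions, not the procedure actually used: the function names are hypothetical, the toy pairs are invented (apart from chair–sit, which is the example given in the abstract), and the rotate-by-one rule for forming unrelated pairs is an assumption, since the text states only that primes and targets were recombined within regularity.

```python
def repair_unrelated(pairs):
    """Re-pair primes and targets so that no target keeps its own prime.

    Assumption: a rotate-by-one shift within the list. It requires at least
    two pairs with distinct primes.
    """
    primes = [prime for prime, _ in pairs]
    rotated = primes[1:] + primes[:1]
    return [(prime, target) for prime, (_, target) in zip(rotated, pairs)]


def build_version(set1, set2, related_set):
    """Return trials as (prime, target, condition) tuples for one version.

    related_set is 1 for Version A (Set 1 related) and 2 for Version B.
    """
    related, unrelated = (set1, set2) if related_set == 1 else (set2, set1)
    trials = [(p, t, "related") for p, t in related]
    trials += [(p, t, "unrelated") for p, t in repair_unrelated(unrelated)]
    return trials


# Toy exception-verb stimuli (hypothetical, except chair-sit).
exception_set1 = [("chair", "sit"), ("cup", "drink")]
exception_set2 = [("bed", "sleep"), ("song", "sing")]

version_a = build_version(exception_set1, exception_set2, related_set=1)
version_b = build_version(exception_set1, exception_set2, related_set=2)
```

Calling `build_version` separately for the regular and the exception sets keeps the recombination within regularity, and every target appears once in each priming condition across the two versions.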
Table 3. Average values of stimuli used in Experiment 2 on a variety of lexical variables as a function of regularity and set
Note: Standard deviations in parentheses.
We carried out a series of 2 (regularity: regular/exception) × 2 (set: 1/2) between-items ANOVAs to ensure that stimuli did not differ significantly across conditions on potentially influential variables. There were no significant effects of either factor, nor any interaction between them, on prime length, prime frequency, imageability rating of target, lemma frequency of target, stem frequency of target, past-tense frequency of target, stem letter length of target, relatedness rating of target to prime, or number of meanings (F value range = 2.043 to 0.002, p value range = 1.000 to .202). There was a significant effect of regularity on past-tense length of target, F(1, 116) = 133.70, p < .0005, and on inflection accuracy, F(1, 116) = 59.54, p < .0005, but there were no main effects of set or interactions between regularity and set for either variable (F value range = 1.146 to 0.024, p value range = 1.000 to .287).
Procedure
Equipment used to run the experiment and error recording procedures were identical to those used in Experiment 1. Experiment 2 was a speeded inflection task, requiring participants to generate the past-tense form of visually presented stems as quickly as possible, whilst avoiding errors. Thirty-two people took part, none overlapping with participants in the pretests or Experiment 1. Verb stems were preceded by prime words that were either related or unrelated to the verb in meaning. All participants saw the same 120 verb stems in a randomized order, but related and unrelated word–verb pairs were counterbalanced across participants (according to whether they were given Version A or B of the experiment). Therefore, 16 participants saw a given target in a related pair, and the other 16 participants saw the same target in an unrelated pair.
The experiment began with 12 practice trials not seen in the subsequent experimental trials (3 exception with unrelated prime, 3 regular with unrelated prime, 3 exception with related prime, 3 regular with related prime, randomly intermixed). All trials began with a white cross on a black screen for 500 ms. This was followed by central presentation of the prime word, which was given in upper case (e.g., DOG). This remained on the screen for 500 ms. Immediately afterwards, the target stimulus was presented in lower case (e.g., breed). The target stimulus remained on the screen until the participant produced a response detected by the microphone or for a maximum of 4,000 ms. Responses were recorded for 2,000 ms after onset. The next trial began after an interval of 2,250 ms. General feedback on task performance was provided during the practice trials by the experimenter, but no feedback was provided during the subsequent experimental trials.
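The trial timeline just described can be written down as a small set of constants. The sketch below only restates the durations given in the procedure. The constant and function names are hypothetical, and treating the 2,250-ms interval as beginning at the end of the target window is an assumption made so that a maximum trial duration can be computed.

```python
# Durations in milliseconds, taken from the procedure described in the text.
FIXATION_MS = 500       # white cross on a black screen
PRIME_MS = 500          # upper-case prime word
TARGET_MAX_MS = 4000    # lower-case target, shown until response or timeout
INTERTRIAL_MS = 2250    # interval before the next trial

# The target follows the prime immediately, so the stimulus onset
# asynchrony (SOA) equals the prime duration.
soa_ms = PRIME_MS

# Target onset relative to the start of the trial.
target_onset_ms = FIXATION_MS + PRIME_MS


def max_trial_duration_ms():
    """Longest possible trial, assuming the interval follows a timeout."""
    return FIXATION_MS + PRIME_MS + TARGET_MAX_MS + INTERTRIAL_MS
```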
Results
After testing, one female participant in Experiment 2 was found to be a non-native English speaker; hence all data for this participant were discarded. Analyses were, therefore, based on results from 31 participants. We conducted ANOVAs based on both participant (F1) and item (F2) means. Mean RTs and accuracy/error rates for all of the items used in Experiment 2 are presented in Appendix C.
Reaction times
Trials in which the microphone was inappropriately activated (2.12% of trials) or a participant gave an erroneous response were excluded from the RT analysis. Mean RTs for each condition are presented in Figure 3. Participant means were analysed using a 2 (regularity: regular/exception) × 2 (priming: related/unrelated) repeated measures ANOVA. A significant main effect of regularity was found, F1(1, 30) = 58.40, p < .0005. However, there was no effect of priming on RTs, F1(1, 30) = 0.79, p = .381, nor any interaction between regularity and priming, F1(1, 30) = 0.03, p = .863. The ANOVA by items included regularity (regular/exception) as a between-items factor and priming (related/unrelated) as a within-items factor. As in the analysis by participants, a highly significant effect of regularity was found, F2(1, 118) = 32.85, p < .0005, but no significant effect of priming, F2(1, 118) = 0.23, p = .633, nor any interaction between regularity and priming, F2(1, 118) = 0.11, p = .744.
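For readers who wish to see how the by-participants F ratios arise, the following sketch computes the three effects of a 2 × 2 repeated measures design from per-participant condition means. It relies on the fact that each 1-df within-participant effect yields an F equal to the squared one-sample t statistic of the corresponding contrast scores. The function names and the data are hypothetical: the reaction times below are made up for illustration and are not the data of Experiment 2.

```python
from math import sqrt
from statistics import mean, stdev


def one_df_within_f(contrast_scores):
    """Return (F, (df1, df2)) for a 1-df within-participant effect.

    Each score is one participant's contrast value; F is the squared
    one-sample t statistic of those scores tested against zero.
    """
    n = len(contrast_scores)
    t = mean(contrast_scores) / (stdev(contrast_scores) / sqrt(n))
    return t * t, (1, n - 1)


def rm_anova_2x2(cells):
    """cells: one dict per participant holding the four condition means,
    keyed 'reg_rel', 'reg_unrel', 'exc_rel', and 'exc_unrel'."""
    regularity = [(c["exc_rel"] + c["exc_unrel"]) / 2
                  - (c["reg_rel"] + c["reg_unrel"]) / 2 for c in cells]
    priming = [(c["reg_unrel"] + c["exc_unrel"]) / 2
               - (c["reg_rel"] + c["exc_rel"]) / 2 for c in cells]
    interaction = [(c["exc_unrel"] - c["exc_rel"])
                   - (c["reg_unrel"] - c["reg_rel"]) for c in cells]
    return {
        "regularity": one_df_within_f(regularity),
        "priming": one_df_within_f(priming),
        "interaction": one_df_within_f(interaction),
    }


# Made-up RTs (ms) for four hypothetical participants.
toy = [
    {"reg_rel": 850, "reg_unrel": 860, "exc_rel": 1000, "exc_unrel": 1030},
    {"reg_rel": 900, "reg_unrel": 895, "exc_rel": 1100, "exc_unrel": 1120},
    {"reg_rel": 800, "reg_unrel": 815, "exc_rel": 950, "exc_unrel": 990},
    {"reg_rel": 870, "reg_unrel": 870, "exc_rel": 1040, "exc_unrel": 1050},
]
result = rm_anova_2x2(toy)
```

With 31 participants, the error degrees of freedom would be n − 1 = 30, matching the F1(1, 30) values reported above.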

Figure 3. Mean reaction times (RTs) as a function of regularity and semantic priming for Experiment 2. Error bars represent ± standard error.
Accuracy and error rates
Categorization of errors was as per Experiment 1, with the addition of a further error category, “prime errors”. Only one such error occurred across all of the trials in Experiment 2, and it entailed the participant inflecting the prime word instead of the target word. The frequencies of different error types produced in Experiment 2 are provided in Appendix D.
Accuracy rates were calculated in the same manner as that described for Experiment 1, and mean accuracy rates for each condition of Experiment 2 are presented in Table 4. A repeated measures 2 (regularity) × 2 (priming) ANOVA by participants revealed a highly significant effect of regularity, F1(1, 30) = 199.94, p < .0005, and a marginal effect of priming, F1(1, 30) = 3.22, p = .083, which was qualified by an interaction between the two variables, F1(1, 30) = 6.56, p = .016. Planned comparisons showed that the regularity effect was highly significant for related, t(30) = –7.83, p < .0005, and unrelated prime trials, t(30) = –11.89, p < .0005. However, the priming effect was only significant for exception items, t(30) = 2.30, p = .028, not regular items, t(30) = –1.44, p = .161. The analysis of item means revealed the same results, with a highly significant effect of regularity, F2(1, 118) = 69.93, p < .0005, a marginally significant effect of priming, F2(1, 118) = 3.67, p = .058, and a significant interaction between the two, F2(1, 118) = 8.39, p = .005. Paired-samples t tests again showed significant regularity effects for related, t(59) = –7.67, p < .0005, and unrelated prime trials, t(59) = –8.75, p < .0005, and an effect of priming that was significant for exception items, t(59) = 2.59, p = .012, but not regular items, t(59) = –1.31, p = .195.
Table 4. Mean percentage accuracy for each condition of Experiment 2 as a function of regularity and priming
Note: Standard deviations in parentheses.
Again, we conducted error analyses on the combined rate of regularization and LARC errors, which are displayed in Figure 4. The analysis by participants revealed sizeable main effects of regularity, F1(1, 30) = 70.48, p < .0005, and priming, F1(1, 30) = 8.70, p = .006, qualified by a significant interaction between the two variables, F1(1, 30) = 8.18, p = .008. A series of paired-samples t tests showed that, as would be expected, there was a highly significant effect of regularity on regularization and LARC errors for both related, t(30) = 7.11, p < .0005, and unrelated prime trials, t(30) = 8.57, p < .0005, with effects of 12.47% and 16.13%, respectively. Most importantly, a significant priming effect (3.77%) for exception verbs, t(30) = –2.92, p = .007, contrasted with a nonsignificant priming effect (0.11%) for regular verbs, t(30) = –1.00, p = .325. The by-items analysis of regularization and LARC errors confirmed significant main effects of regularity, F2(1, 118) = 59.09, p < .0005, and priming, F2(1, 118) = 9.80, p = .002, and an interaction between them, F2(1, 118) = 8.81, p = .004. Significantly fewer errors were made to regular than to exception verbs, whether they were preceded by related, t(118) = 7.54, p < .0005, or unrelated primes, t(118) = 7.16, p < .0005. Exception verbs showed a significant priming effect, t(59) = –3.06, p = .003, that was not apparent for regular verbs, t(59) = –1.00, p = .321.

Figure 4. Percentage regularization and legitimate alternative rendering of components (LARC) error rate as a function of regularity and priming for Experiment 2. Error bars represent ± standard error.
Discussion
The results of the semantic priming task employed in Experiment 2 converge with those obtained in Experiment 1. RTs, accuracy, and combined regularization/LARC error rates revealed the standard disadvantage for exception relative to regular items. In addition, accuracy and regularization/LARC error rates both revealed a significant interaction between regularity and semantic priming. Analyses of combined regularization/LARC error rates revealed that the regularity effect of 12.47% for the related prime trials was reduced relative to the 16.13% effect for the unrelated prime trials. Even more striking was the finding that the 3.77% semantic priming effect for the exception items was significant, whereas the 0.11% effect for the regular items was not. As the same items were presented with related and unrelated primes alternately between participants, it can be confidently inferred that it is semantic priming effects that are producing the differences in error rates, rather than participants merely being more familiar with the correct past-tense forms of the items in one condition than those in another. Whilst it might be argued that accuracy for regular verbs was already so high in the unrelated condition that it left little room for a priming effect, there was no “need” for the exception items to benefit from priming either and no prediction from a dual-mechanism perspective that they should do.
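The two descriptions of the interaction given above are arithmetically equivalent, which a short check makes explicit. In a 2 × 2 design, the difference between the two regularity effects must equal the difference between the two priming effects. The values are the by-participants percentages reported in the text; the variable names are hypothetical.

```python
# Combined regularization/LARC effects (percentage points) from the text.
regularity_effect_related = 12.47
regularity_effect_unrelated = 16.13
priming_effect_exception = 3.77
priming_effect_regular = 0.11

# Interaction expressed as the change in the regularity effect across primes.
interaction_from_regularity = round(
    regularity_effect_unrelated - regularity_effect_related, 2)

# Interaction expressed as the change in the priming effect across verb types.
interaction_from_priming = round(
    priming_effect_exception - priming_effect_regular, 2)

print(interaction_from_regularity, interaction_from_priming)
```

Both expressions give 3.66 percentage points, so the reduced regularity effect after related primes and the larger priming effect for exception verbs describe the same interaction.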
One unexpected result from Experiment 2 was the absence of any appreciable semantic priming effect in RTs. This contrasts with previous work in reading aloud (Cortese et al., 1997) where effects of semantic priming upon RTs to regular and exception words were obtained with stimulus onset asynchronies (SOAs) of 233 ms and 533 ms. One possible explanation for the absence of a significant semantic priming effect in RTs here is that the 500-ms SOA we used was sufficiently long that, when combined with the relatively slow RTs observed with inflection from stem (979 ms) relative to reading aloud (519 ms), the benefit of having a related prime in terms of speed of processing had dissipated by the time of response. Further research using the inflection from stem task with shorter SOAs would seem to be required to understand the lack of priming effect that we observed in RTs.
Nevertheless, in terms of accuracy, the findings of Experiment 2 mirror those found in reading aloud using the semantic priming paradigm, where Cortese et al. (1997) found a significant interaction between the effects of priming and regularity on error rates. Exception words preceded by related primes were read more accurately than those preceded by unrelated primes, whereas accuracy of regular word reading was equivalent whether preceded by related or unrelated primes (Cortese et al., 1997). This is identical to the pattern of performance that the present study has revealed in past-tense generation. To summarize, in this speeded inflection task, regular items were inflected more quickly than exception items, and presence of a related prime was associated with more accurate inflection for exception verbs but not for regular verbs. This latter result supports the notion that inflection of exception verbs involves more semantic support than regular verbs and is precisely what is predicted by connectionist single-mechanism models of inflectional morphology.
General Discussion
The goal of the present study was to assess the connectionist single-mechanism hypothesis of a greater role for semantic information in the inflection of exception than in the inflection of regular verbs. Experiment 1 revealed a significant interaction between imageability and regularity in inflection RTs, such that the benefit for high- over low-imageability items was significantly larger for exception than for regular verbs. This finding resembles the interaction between imageability and regularity observed in a number of studies of reading aloud (Cortese et al., 1997; Shibahara et al., 2003; Strain et al., 1995, 2002; Woollams, 2005). Experiment 2 demonstrated a significant interaction between semantic priming and regularity in inflection accuracy and regularization/LARC error rates, with the benefit for related over unrelated primes being significantly larger for exception than for regular verbs, mirroring the interaction between semantic priming and regularity seen in reading aloud (Cortese et al., 1997).
It is worth noting that Prado and Ullman (2009) recently showed a similar interaction between regularity and imageability specifically for their male participants. Although Prado and Ullman (2009) argue that imageability is a marker of lexical retrieval, and hence effects for the exceptions are consistent with a dual-mechanism account, we would note that such a treatment of imageability effects is contrary to that usually taken in the reading literature, where they have been used as a marker for semantic rather than lexical involvement (e.g., Cortese et al., 1997; Strain et al., 1995). Our study demonstrates the relationship between semantic activation and exception verb inflection not only through showing an interaction between regularity and imageability in inflection across a predominantly female sample but also by revealing, for the first time, a parallel interaction between regularity and semantic priming.
To the extent that imageability and semantic priming effects in inflection are appropriately considered as markers of semantic processing, then the interactions between semantic effects and regularity that we have demonstrated favour recent connectionist single-mechanism accounts of inflectional morphology (Joanisse & Seidenberg, 1999; Karaminis & Thomas, 2010; Woollams et al., 2009). Within Joanisse and Seidenberg's (1999) model, when the whole-word representations used to approximate semantics were damaged, a relatively selective deficit for exception relative to regular and nonce items emerged. This prediction was confirmed by Patterson et al. (2001; see also Cortese, Balota, Sergent-Marshall, Buckner, & Gold, 2006) who demonstrated a selective deficit in exception inflection amongst semantic dementia patients. The severity of the inflectional deficit for exception verbs is associated with degree of semantic deficit in this group (Patterson et al., 2006) and even holds at an item level for specific verbs (Patterson et al., 2001). It is worth noting that both of these findings also apply to exception items in the domain of reading aloud (Graham, Hodges, & Patterson, 1994; Woollams, Lambon Ralph, Plaut, & Patterson, 2007). The findings of the current study complement the neuropsychological and modelling results in that they demonstrate for the first time a clear link between semantic activation within an intact human system and performance on exception verbs in past-tense production tasks. Of course, the present results still need to be simulated, and we suggest that a recent version of the connectionist single-mechanism model incorporating distributed semantic representations (Woollams et al., 2009) represents a good starting point for such an endeavour, although some means of implementing the dimension of imageability would obviously be required.
Despite the mounting evidence from both neurological patients and now normal participants for a specific relationship between semantic activation and exception inflection, proponents of dual-mechanism approaches would challenge a causal interpretation of this association on the basis of relatively rare cases of neuropsychological dissociations between these two abilities. One of these is patient A.W. reported by Miozzo (2003), claimed to demonstrate impaired exception inflection despite having intact semantic representations. Although it is true that A.W. performed normally on receptive tasks that required access to specific semantic information, she was clearly anomic across a variety of picture-naming tasks. Hence A.W. was no longer able to use her intact semantic representations to adequately activate phonological representations, suggesting a disconnection between the two systems. For the purposes of correct inflection of exception items, such a disconnection would be sufficient to undermine effective use of the necessary semantic information, producing the observed inflectional deficit. As already noted in the introduction, the opposite pattern of dissociation has been provided by Tyler et al. (2004): Only one of the four semantic dementia patients they tested had significant deficits in inflection of exception verbs. It has, however, been noted that the stimulus items in that study included very few low-frequency exception items, perhaps rendering their experimental task insufficiently sensitive to the predicted nature of the deficit (Patterson et al., 2006).
These reported dissociations not only have alternative possible interpretations, as noted above, but contrast with overall and item-level associations between semantic processing and exception inflection in a fairly substantial number of patients with semantic dementia. Furthermore, the demonstrated association applies not only to verb inflection but to a number of other tasks—that is, patients with semantic dementia are usually impaired on lower frequency exception items in reading aloud, spelling, lexical decision, object decision, and delayed copy drawing as well as verb inflection (Patterson et al., 2006). Such cross-task associations provide strong evidence for a role of semantics in accurate processing of domain-atypical items, as proposed by a number of connectionist single-mechanism models (Harm & Seidenberg, 2004; Joanisse & Seidenberg, 1999; Plaut et al., 1996; Rogers et al., 2004; Woollams et al., 2009). The results of the present study provide the first convergent evidence from skilled adults of increased reliance on semantic information for inflection of exception verbs.
Supplementary material
Supplementary material is available via the “Supplementary” tab on the article's online page (http://dx.doi.org/10.1080/17470218.2012.661441).
