Semantic Categorization in L1 and L2: Calculating Distances

Abstract

Aims and Objectives:

The objective of this paper is to explore how learners organize and access their mental lexicon in the L1 and L2 and to look for similarities and differences in these processes. We wanted to explore the impact of L1 linguistic and cultural background on semantic categorization.

Methodology:

The method used includes data gathering, graph analyses and computing. Specifically, graphs were calculated from experimental data and, then, distances among the graphs were computed.

Data and Analyses:

Data were obtained from 430 learners of English as a foreign language (EFL) from a Spanish background, aged 15–16. Analyses compared English and Spanish semantic categorization data. Several analyses using different methodologies were conducted (based on vectorization of networks, computing distances by means of metrics and spectral vectorizations), and all reached similar conclusions.

Findings:

Our results suggest that the thematic axis outweighs the language axis for Spanish learners of English in semantic categorization. Furthermore, the direction of links and their weights seem irrelevant: this might point to a high homogeneity in semantic categorization.

Originality:

To our knowledge, this is the first time these methods have been used on a large sample of experimental linguistic data.

Significance:

Our results provide further evidence of a semantic categorization process which is shared by speakers and learners of different languages, showing that semantic similarity is an overriding factor over language in categorization, either because of a shared conceptual space or as a result of a translation process from L1 into L2/EFL.

Keywords

Semantic categorization, taxonomic and experiential categories, Spanish L1, EFL, graph analysis

Introduction

Semantic categorization is a mental process pertaining to how humans organize and make sense of the world around them (cf. Coni et al., 2019; Sass et al., 2009). Semantic categories are stored in semantic memory, but exactly how these categories are organized remains an open question. In general, it is believed that we organize and group information based on similarity, mainly semantic, but also formal (orthographic or phonological), syntactic (e.g., collocations), functional or pragmatic, and experiential (Aitchinson, 1994); semantic categories are formed on this basis. Examples of semantic categories are Animals, Colours, Occupations, Food and drink and so on. This organization of similar or related concepts facilitates cognition and word retrieval for communication (e.g., Bower et al., 1969). Generally, some exemplars in a category are believed to be its best examples, its most representative or typical members, also known as prototypes (cf. Rosch, 1978). These ideal members are the most easily accessed. Membership in the category is decided on the basis of the number of properties or features that a candidate member shares with the others.

The way in which categories are formed changes based on several variables. For instance, the type of category is paramount in determining the nature of the relations among its members. Hence, we distinguish among taxonomic, ad hoc and experiential or thematic categories (see, e.g., Benn et al., 2023), also known as schema categories (e.g., Mandler, 1984). Taxonomic categories are similarity-based, generally found in nature, and can be named by superordinate terms, for example, Animals or Food and drink (e.g., Hernández-Muñoz et al., 2025; Kleiber, 1995), whereas thematic or experiential categories are co-occurrence-based and refer to events or scenarios, like things you find in the Countryside or things that you Love (Benn et al., 2023). Ad hoc categories tend to be more novel or spontaneous groupings, such as Things that are yellow (Barsalou, 1983; Barsalou, 2010; Hernández-Muñoz et al., 2025), whose membership rests on a shared objective or idea. Some previous studies revealed that language is more relevant in ad hoc categorization or when the members of the category share fewer properties (e.g., Lupyan & Mirman, 2013). Accordingly, the smaller the linguistic resources (for example, the vocabulary in the L2), the more difficult it will be for learners to respond to these categories. Also, category types such as experiential, like Sports and Hobbies, or schema-based, like Countryside, have an internal structure that is less robust and stable and is highly influenced by cultural, emotional and linguistic experiences (cf. Hernández-Muñoz et al., 2025). Accordingly, taxonomic categories such as Animals might be less influenced by the linguistic knowledge of the learners, and therefore, L1 and L2 categorization might be closer. However, more experiential categories based on ad hoc decisions or dependent on cultural scenarios might appear more distant in L1 and L2 renderings.

Apart from the internal characteristics of the category, other factors, such as age, or language and culture of origin might impact the structure of the category. Thus, children tend to organize their semantic information according to their experience of the world in a slot-filler way. As age increases, and vocabulary knowledge does as well, conceptual similarity takes over and organization becomes more taxonomic, that is, according to the natural organization of the world (cf. Shivabasappa et al., 2017).

Although categorization is a typically human activity that spreads across languages, there are some slight variations as to how speakers of disparate languages and cultures organize the members in the category. The internal structure of the categories is not only cognitive, but also culture-dependent (cf. Kövecses, 2006; Lakoff, 1986). Cultural availability or familiarity with the exemplars of the categories is a crucial element that determines the structure of a semantic category (Lin et al., 1990; Shivabasappa et al., 2017). For instance, pie might be a good example of the category Food and drink for the British speaker, but not necessarily for the French, German or Spanish native, who might rather refer to quiche, sausage or omelette, respectively, instead, as prototypical or first responses. In addition, different languages map concepts onto lexical words differently. Several studies (e.g., Ameel et al., 2005; Malt et al., 1999; Wierzbicka, 1992) have noticed that speakers of different languages map conceptual meanings into words in disparate ways so that word-to-word mappings among the different languages are not always identical. These differences may affect the structure of the semantic categories in the different languages. Mirroring these previous studies, in the present research, we distinguish between lexical items, which are the observable responses produced by participants, and conceptual organization, which refers to the semantic relations underlying those responses.

L2 learners develop semantic categorizations that are influenced by both their L1 and L2, indicating the importance of language in shaping these categories (e.g., Ameel & Storms, 2006). The impact of language on categorization varies across different semantic categories. Taxonomic categories, whose members are close in conceptual space, might be less prone to be influenced by language or cultural experience. On the contrary, those involving conceptually more distant items or exemplars might be more affected (e.g., Imai & Gentner, 1997). In addition, larger semantic categories facilitate learning of prototypes in L2, suggesting an inherent influence of the semantic category itself (e.g., Malt et al., 1999).

Aitchinson (1994), who investigated the categorization of items and the prototype effects in vocabulary learning, reached the conclusion that the differences in the responses of adult learners of English as a foreign language (EFL) with different native languages who participated in her study were due to their native language and culture. Cross-linguistic differences reveal that bilingual speakers show different categorization patterns depending on the language they are using, suggesting a strong influence of language-specific features (e.g., Pavlenko, 2009; Viñas-Guasch et al., 2017).

In conclusion, while both language and semantic category are important factors in semantic categorization, their relative importance appears to vary depending on the specific context, type of categorization task and the nature of the semantic categories involved. The interaction between these factors suggests that neither can be considered universally more important than the other across all situations.

Semantic Categorization in L1 versus L2

Previous studies have also dealt with semantic processing in the L1 and the L2 via semantic processing tasks. Please note that we refer to semantic processing for discussions of access and activation in the mental lexicon, whereas semantic categorization refers mainly to the task and the resulting organization of responses. There seems to be a general consensus that there is a partial overlap of core conceptual systems, with disparate boundaries, radial or external members and even different categorization strategies, depending on linguistic structure, cultural, emotional and experiential context (cf., e.g., Francis, 1999; Teixeira-Moláns, 2024). Specifically, L2 learners develop semantic categorization in their second language through a complex process that involves both L1 influence (e.g., Wolter & Yamashita, 2018) and the formation of new L2-specific categories (Saji et al., 2024). The evidence suggests a combination of both strategies, rather than a simple reliance on L1 categorization or complete native-like L2 categorization. Generally, L1 and L2 categorization processes converge with increased bilingual experience (cf. Ameel & Storms, 2006).

Recently, the application of complex network analysis has allowed for some insights into how semantic categories are organized and structured. Hernández-Muñoz et al. (2025) use complex network analysis to examine empirical language data, specifically, how actual speakers respond to active categorization tasks. They conclude that network metrics such as betweenness centrality or extint, the strength of the association within and outside the communities, serve to describe the categories and discriminate among the members. Furthermore, they believe that categories are small-world structures, characterized by short path lengths and high clustering, which facilitates navigation and efficient word retrieval and that modularity of the network, that is, the way in which word clusters are established and the strength of their links, helps distinguish among categories. Hence, taxonomic categories display fewer communities with more numerous clusters of members, whereas experiential or schema categories exhibit a more dispersed structure and ad hoc categories lie in between. The creation of lexical-semantic networks as complex graphs with experimental linguistic data is a metaphorical approximation to the mental lexicon and the relations among the words (e.g., Collins & Loftus, 1975; Dubossarsky et al., 2017; Hernández-Muñoz et al., 2025; Steyvers & Tenenbaum, 2005). Both L1 and L2 lexical-semantic networks display small-world and scale-free network properties, featuring short paths connecting words and strong clustering (Feng & Liu, 2023), which in linguistic terms means that speakers have readily available vocabulary and that communication does indeed proceed successfully (Steyvers & Tenenbaum, 2005). 
However, L2 lexical-semantic networks are less densely connected and less well-organized compared to L1 networks (e.g., Borodkin et al., 2016); they evolve with time and L2 proficiency, gradually departing from L1 patterns and becoming more similar to those of native speakers of the L2 (Feng & Liu, 2023; Quintanilla & Kloss, 2024). Besides, as measured by processing time studies, L2 response times to semantic processing tasks and semantic fluency tasks are generally slower (cf. Fitzpatrick & Izura, 2011) and show stronger frequency effects than L1 processing, suggesting more lexical involvement in L2 processing (Plat et al., 2018). This slowness is generally interpreted as a reflection of the L2 lexical-semantic system being less automatized and less densely connected (as discussed with the network analysis). Accessing or searching through the L2 mental lexicon requires more cognitive effort and time because the links between concepts and words are not as strong, or well-practised as they are in the L1. In the L1, semantic access is thought to be highly efficient, possibly allowing for a more direct conceptual route where word meaning is rapidly retrieved, regardless of frequency. In the L2, the learner may rely more heavily on the lexical (word form) route and the strength of the lexical-conceptual link. High-frequency words have stronger connections in the L2 network, making them easier and faster to activate. For low-frequency words, the weaker connections require a more effortful search or activation process, causing the stronger frequency effect to surface. This indicates that the L2 system is less robustly connected and still relies more on the strength of repeated exposure (frequency) to facilitate semantic access. Furthermore, Kroll and Tokowicz (2001) suggest that during L2 processing, learners automatically activate L1 translations, which can influence semantic judgements and word retrieval. 
In this sense, typological closeness, that is, formal and semantic similarity between the L1 and the L2, might play a role in L2 lexical-semantic organization, with L1 structures influencing and facilitating L2 word retrieval, even at advanced levels of proficiency (e.g., Dijkstra & van Heuven, 2002; Wolter & Yamashita, 2018).

In the case of bilingual and multilingual lexical representations, it is crucial to determine whether L1 and L2 forms share a conceptual node. Different theories have accounted for shared and separate conceptual representations; see the Revised Hierarchical Model (RHM) in Kroll and Stewart (1994), the Bilingual Interactive Activation (BIA) model in Kroll and de Groot (2005), and Kroll and Tokowicz (2005) for a review. The former proposes that the L1 lexicon is strongly connected to the conceptual store, while the L2 lexicon has a weaker and indirect connection to concepts, relying primarily on a strong link to the L1 lexicon. The L1 word serves as the main access route to meaning, especially for novice learners. The latter model posits that the lexical items of both the L1 and L2 are stored in a single, integrated system and are simultaneously activated whenever a bilingual encounters language input, regardless of the intended language. This model is characterized by non-selective access and interactive activation at the word form level, where activation flows between orthographic (spelling/form), phonological (sound) and semantic levels across both languages. However, empirical evidence pointing to one or the other is scarce and contentious. In this sense, most empirical evidence gathered via word association tests, semantic fluency tasks and translation priming highlights a word-type effect: cognates and words that are concrete or have universal referents tend to share a conceptual representation, whereas more abstract, emotional or culturally loaded words might be stored separately, with separate or partially separate conceptual nodes (cf. Chaouch-Orozco et al., 2024; Kolers, 1963).

Borodkin et al. (2016) examined L2 lexical-semantic networks with participants who already had a highly structured L1 lexicon in place. They found that L2 networks, although thematically organized, were more dispersed, with fuzzier thematic clusters, than the L1 networks. For their part, Feng and Liu (2024) state that “L2 learners are assumed to apply their L1 lexical-semantic knowledge to build their L2 lexical semantic networks.” Accordingly, learners’ L2 networks might resemble or imitate their L1 networks, both in terms of word responses and in terms of their connections and the nature of these connections.

Here, we intend to explore this issue by applying network theory analysis to experimental linguistic data. Hence, we propose a methodology that analyses empirical linguistic data derived from fluency tests and that allows for between-groups comparisons. Building on previous findings with L1 and L2 lexico-semantic networks, the present study asks whether the organization of lexical responses in L2 semantic categorization resembles that observed in L1, and whether this similarity varies according to category type (taxonomic vs. experiential/schema-based). Data will be approached in an aggregated fashion and distances will be computed using different methods, as explained below in detail.

Methodology

We calculated distances between graph metrics in L1 and L2 for both semantic categories to find out how close or far apart they were. Specifically, we used two methods of distance calculation to check whether the results generalize across methods.

Participants

Our study was carried out with a group of 430 Spanish learners of EFL to compare their semantic categorization in Spanish L1 and EFL. All of them completed a semantic fluency task, first in English, the FL, and then in Spanish, their L1. This order of administration was followed to avoid any priming effect deriving from the L1. They were in grade 10, the final year of their secondary education, and were aged 15–16. All students were at the low-to-high intermediate level of proficiency in English (A2–B2) as per the Oxford Placement Test (UCLES, 2001), which they completed prior to the semantic fluency task.

Instruments

A semantic fluency task called the Lexical Availability task (LAT), which is a multi-response fluency task type, was used to elicit production of vocabulary data from informants. The LAT presents informants with a stimulus or cue word and asks them to generate responses related to the stimulus category. In particular, learners had 2 minutes to write as many words as came to mind. Specifically, we had participants respond to the prompts Animals and Love, the former a taxonomic category and the latter a more experiential one (e.g., Hernández-Muñoz, 2014; Jiménez-Catalán, 2014). These two prompts were selected on three grounds: they feature (a) different productivity, (b) different response diversity or response spread and (c) different cohesion index. Animals is an inclusive or closed category which gives rise to many but very homogeneous responses. Love is a less productive prompt that yields a broader range of types; it is more open and gives rise to more heterogeneous responses (e.g., Hernández-Muñoz, 2014; Tomé-Cornejo, 2015). Participants were instructed in Spanish L1, and each prompt and the corresponding responses occupied an independent sheet. Participants had to respond in Spanish L1 and EFL. The LAT collects multiple responses from learners (cf. Jiménez-Catalán, 2014; Schmitt, 1998) and thus gives a more complete picture of learners’ lexicons (Precosky, 2011; Sheng et al., 2006). Multiple-response association tests tend to prompt chain responses that are associated with one another rather than with the stimulus word (cf. De-Deyne & Storms, 2008; Precosky, 2011); that is, each word produced will facilitate or prime recall at two levels simultaneously, that of other related concepts and that of other word forms; this is called a priming effect.

Procedures and Analyses

Data were collected via an online application specifically designed for the purposes of a larger project, within which this study is framed. Participants completed the LAT in class in the presence of the teacher and the researchers conducting the study. Responses were obtained in computer-readable form for each of the prompts. The data were carefully edited, adopting the following criteria:

No repetitions per informant were allowed.

Spelling errors were corrected.

Multiple-word responses were hyphenated in order for them to be counted as a single word (e.g., fresh-air).
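The two mechanical criteria above (no repetitions per informant, hyphenation of multiple-word responses) can be illustrated with a minimal sketch. The function name `clean_responses` is hypothetical, lower-casing is an added assumption, and spelling correction is left out because it is presumably a manual step; this is not the editing procedure actually used in the study.

```python
def clean_responses(responses):
    """Apply two of the editing criteria to one informant's responses."""
    seen = set()
    cleaned = []
    for raw in responses:
        # Assumption: case is ignored; inner whitespace becomes a hyphen,
        # so that "fresh air" is counted as the single word "fresh-air".
        word = "-".join(raw.strip().lower().split())
        if not word or word in seen:
            continue  # drop blanks and repetitions by the same informant
        seen.add(word)
        cleaned.append(word)
    return cleaned


example = ["Dog", "cat", "fresh air", "dog", "  polar   bear "]
print(clean_responses(example))
# prints ['dog', 'cat', 'fresh-air', 'polar-bear']
```

The order of first production is preserved, which matters if links between consecutive responses are later used to build a graph.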

Once the editing process was complete, the data were typed into text files. Data were processed by means of the Gephi software package (Cherven, 2015) (see also Borodkin et al., 2016; Christensen & Kenett, 2021; Zemla & Austerweil, 2022 for alternative ways to construct graphs). Gephi allows one to construct graphs from association data and obtain different key statistical measures, such as, for instance, average degree, clustering coefficient, diameter, eigenvector centrality or closeness, to mention but a few.

Comparing Linguistic Graphs

While there is a panoply of methods for comparing networks (see, for instance, Tantardini et al., 2019), only some of them are useful when comparing linguistic graphs such as those obtained through Lexical Availability tasks. The methods for network comparison can be divided into two different types. The first type focuses on the study of the relative complexity of a pair of networks with the same nodes or, equivalently, of two networks with a natural bijection between their sets of nodes. This is not applicable in the present case, where the number of nodes depends on the learners’ responses, which can be, and actually are, different for each stimulus. Furthermore, even if we were to impose a fixed number of nodes at our discretion, as will be done for methodological reasons in the next section, establishing a natural bijection between the sets of nodes remains impossible. For instance, the 50 most frequent words observed for Animals in the L2 network do not correspond to the translations of the 50 most frequent words for Animales in L1.

Given these constraints, the analyses of our data necessitated the selection of methods of the second type, specifically those designed to compare networks with different numbers of nodes (Tantardini et al., 2019). From this range of options, we chose approaches rooted in graph vectorization. In this technique, a network is represented by means of a numerical vector, and then differences between two graphs are measured through the distances between the corresponding vectors. In this paper, we focus on the Euclidean distance:

$$d(v^{1}, v^{2}) = \sqrt{\sum_{i = 1}^{n} (v_{i}^{1} - v_{i}^{2})^{2}}$$

where $n$ is the dimension of both vectors $v^{1}$ and $v^{2}$.

Suppose that we wish to compare three graphs denoted by $G_{1}$, $G_{2}$ and $G_{3}$, and that a vectorization algorithm produces, respectively, three vectors denoted by $v_{G_{1}}$, $v_{G_{2}}$ and $v_{G_{3}}$. To study the similarity among the graphs, we compute the numbers $d(v_{G_{1}}, v_{G_{2}})$, $d(v_{G_{1}}, v_{G_{3}})$ and $d(v_{G_{2}}, v_{G_{3}})$. For the sake of clarity, let us suppose that, for example, $d(v_{G_{1}}, v_{G_{2}}) = 10$, $d(v_{G_{1}}, v_{G_{3}}) = 120$ and $d(v_{G_{2}}, v_{G_{3}}) = 130$ (the absolute values are irrelevant; what matters is the relative change of scale). With these data, we could infer that $G_{1}$ and $G_{2}$ are close to each other (with respect to the vectorization method used), whereas $G_{3}$ is less similar to both $G_{1}$ and $G_{2}$.
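The comparison just described can be sketched in a few lines of Python. The vectors below are hypothetical toy values chosen only so that the distances are easy to check by hand; the helper names `euclidean` and `pairwise_distances` are illustrative and do not come from the study.

```python
import math
from itertools import combinations


def euclidean(v1, v2):
    """Euclidean distance between two vectors of the same dimension."""
    if len(v1) != len(v2):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))


def pairwise_distances(vectors):
    """Return (name_a, name_b, distance) triples, closest pair first."""
    triples = [
        (a, b, euclidean(vectors[a], vectors[b]))
        for a, b in combinations(vectors, 2)
    ]
    return sorted(triples, key=lambda t: t[2])


# Hypothetical vectorizations of three graphs.
vectors = {"G1": [0.0, 0.0], "G2": [3.0, 4.0], "G3": [30.0, 40.0]}
ranking = pairwise_distances(vectors)
for a, b, dist in ranking:
    print(a, b, dist)
```

With these toy values the pair G1 and G2 comes first, at distance 5, while both distances involving G3 are roughly ten times larger, which is the kind of change of scale the argument relies on.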

In our first approach, a vectorization by global statistics (Tantardini et al., 2019) or features was selected. We compute, for each graph, the values of 11 features calculated by default in Gephi:

Number of nodes.

Number of edges.

Average degree. The degree of a node is the number of nodes adjacent to it (that is, linked to it by at least one edge). The average degree is the arithmetic mean of the degrees of all nodes.

Average weighted degree. As the preceding one, but counting the number of edges linking each pair of nodes.

Diameter. A path between two nodes $v_{1}$ and $v_{2}$ is a sequence of edges $(e_{1}, \dots, e_{m})$ such that $e_{1}$ contains $v_{1}$, $e_{m}$ contains $v_{2}$ and each pair of consecutive edges $e_{i}$, $e_{i + 1}$ has a common node, for $i = 1, \dots, m - 1$; then, $m$ is called the length of the path. The diameter of a graph is the length of the longest shortest path between any two nodes in the graph.

Density. Graph density is the ratio of the actual number of edges to the maximum possible number of edges in a graph (which depends on the number of nodes).

Number of connected components.

Modularity. Modularity is a measure of the strength of a graph’s division into communities (see details in Blondel et al., 2008).

Statistical inference number. This number measures assortative communities in networks based on a nonparametric Bayesian formulation (see the formal definitions in Zhang & Peixoto, 2020).

Clustering coefficient. The clustering coefficient is a measure of the degree to which nodes in a graph tend to cluster together (see Barrat et al., 2004 for details).

Average path length. The average path length of a graph is the average of the shortest path lengths between all possible pairs of nodes in the graph.
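To make some of these definitions concrete, the following sketch computes seven of the listed features for a small graph using only the Python standard library. It assumes a simple undirected graph given as a list of node pairs, so the values need not coincide with those Gephi reports for weighted, directed graphs. Modularity, the statistical inference number and the clustering coefficient are omitted, and the function names are illustrative.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def global_features(edges):
    """Seven global statistics of a simple undirected graph."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # self-loops are ignored in a simple graph
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    n = len(adj)
    if n < 2:
        raise ValueError("need at least one edge between distinct nodes")
    m = sum(len(neighbours) for neighbours in adj.values()) // 2
    lengths, visited, components = [], set(), 0
    for node in adj:
        dist = bfs_distances(adj, node)
        if node not in visited:  # first node seen in a new component
            components += 1
            visited.update(dist)
        lengths.extend(d for target, d in dist.items() if target != node)
    return {
        "nodes": n,
        "edges": m,
        "average_degree": 2 * m / n,
        "density": 2 * m / (n * (n - 1)),
        "diameter": max(lengths),  # longest shortest path among reachable pairs
        "average_path_length": sum(lengths) / len(lengths),
        "connected_components": components,
    }


# A path graph a-b-c-d as a toy example.
path_edges = [("a", "b"), ("b", "c"), ("c", "d")]
features = global_features(path_edges)
print(features)
```

For the four-node path, the density is 0.5 (3 of the 6 possible edges) and the diameter is 3. Note that density is fully determined by the numbers of nodes and edges, so these features are not independent of one another.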

Being aware that feature vectorization can have certain drawbacks in some circumstances (see Tantardini et al., 2019), we used a second vectorization method, spectral vectorization, to confirm our insights. This method is not free of deficiencies either (see again Tantardini et al., 2019), but we expect that combining two very different approaches to computing distances between graphs can offset the potential drawbacks of each method in isolation. The spectral method consists of representing each network by means of the eigenvalues of its adjacency matrix, sorted in decreasing order. To be able to sort the eigenvalues, they must be real numbers (in general, they may be complex), which is guaranteed when the adjacency matrix is symmetric. Our initial graphs are weighted and directed, so their adjacency matrices do not satisfy this property. It is therefore necessary to transform our graphs into their simple and undirected counterparts. As a byproduct, as we will explain in the next section, this will allow us to stress that some of our outcomes do not depend on the orientation and weight of edges.
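As a small worked illustration of the spectral idea (a toy example, not taken from the study's data), consider two simple undirected graphs on three nodes: the path $P_{3}$ and the triangle $K_{3}$. Their adjacency matrices are symmetric, so their eigenvalues are real and can be sorted in decreasing order to form the vectors $v_{P_{3}}$ and $v_{K_{3}}$, whose Euclidean distance is then computed as usual.

```latex
\[
A_{P_3} = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix},
\qquad
A_{K_3} = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}
\]
\[
v_{P_3} = \left(\sqrt{2},\ 0,\ -\sqrt{2}\right),
\qquad
v_{K_3} = \left(2,\ -1,\ -1\right)
\]
\[
d(v_{P_3}, v_{K_3})
  = \sqrt{\left(\sqrt{2} - 2\right)^{2} + (0 + 1)^{2} + \left(-\sqrt{2} + 1\right)^{2}}
  = \sqrt{10 - 6\sqrt{2}} \approx 1.23
\]
```

Both vectors have the same dimension here because both graphs have three nodes, just as the pruned networks compared in this paper all have the same number of nodes.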

Results

Once the graphs had been built, we obtained four networks, each denoted by the name of the stimulus that produced it: Animales, Animals, Love and Amor. To establish a first visual comparison, the Fruchterman-Reingold visualization method (Fruchterman & Reingold, 1991) is used in the figures displayed in this paper. This method is based on the idea that nodes with a greater degree (connected with a greater number of other nodes) are drawn with more “force” towards the centre of the figure, while nodes with a lower degree are “expelled” towards the exterior of the figure. When displaying these graphs in the Gephi application (Bastian et al., 2009), it can be observed visually (see Figures 1 to 4) that Animales and Animals are very similar and that, in turn, Love and Amor are similar to each other, too. More specifically, Animales in Figure 1 has the densest core (visually marked by a dark kernel), while Love and Amor (Figures 3 and 4, respectively) have brighter cores; Figure 2, corresponding to Animals, is more similar to Animales than to Love or Amor. This effect is stressed in Figure 5, where the four graphs are displayed together, highlighting four cores with very different densities.

Figure 1.

Gephi capture of the Animales graph, displayed by means of the Fruchterman-Reingold layout.

Figure 2.

Gephi capture of the Animals graph.

Figure 3.

Gephi capture of the Love graph.

Figure 4.

Gephi capture of the Amor graph.

Figure 5.

Gephi capture of the four stimuli put together.

Nevertheless, when trying to capture this property quantitatively, we found that the noisy nature of this kind of lexical availability graph (with many vertices of very small degree) prevented a clean comparison. Hence, we pruned the graphs, keeping only the 50 most central (highest-degree) or generation-frequent words for each stimulus (as was done in Agustín-Llach & Rubio, 2024), obtaining four new graphs, called Animales50, Animals50, Love50 and Amor50, shown, respectively, in Figures 6 to 9. In these pruned graphs, the differences and similarities are even more striking (Animales50 and Animals50 exhibit, in Figures 6 and 7, a quite interwoven net, while Love50 and Amor50 manifest, in Figures 8 and 9, a clearer interlacing), and it is on these pruned networks that we undertake our quantitative comparison.
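A pruning step of this kind might look as follows. This is a sketch under stated assumptions, not the procedure used in the study: it ranks words by the number of informants who produced them (one possible reading of "generation-frequent"), breaks ties alphabetically, and takes the weighted directed edges as a ready-made dictionary, since the way edges are derived from the responses is not described here. The name `prune_to_top_k` is hypothetical.

```python
from collections import Counter


def prune_to_top_k(response_lists, edges, k=50):
    """Keep the k most frequently produced words and the edges among them."""
    # Count each word once per informant.
    freq = Counter(word for responses in response_lists for word in set(responses))
    # Decreasing frequency; ties broken alphabetically (an assumption).
    ranked = sorted(freq, key=lambda w: (-freq[w], w))
    keep = set(ranked[:k])
    pruned = {
        (u, v): weight
        for (u, v), weight in edges.items()
        if u in keep and v in keep
    }
    return keep, pruned


# Hypothetical toy data: three informants and a few weighted directed edges.
response_lists = [["dog", "cat", "horse"], ["dog", "cat"], ["dog", "lion"]]
edges = {("dog", "cat"): 2, ("cat", "horse"): 1, ("dog", "lion"): 1}
keep, pruned = prune_to_top_k(response_lists, edges, k=2)
print(sorted(keep), pruned)
# prints ['cat', 'dog'] {('dog', 'cat'): 2}
```

Ranking by degree instead of frequency would only require replacing the `freq` counter with a degree count over `edges`.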

Figure 6.

Gephi capture of the Animales graph pruned up to the 50 most frequent nodes.

Figure 7.

Gephi capture of the Animals graph pruned up to the 50 most frequent nodes.

Figure 8.

Gephi capture of the Love graph pruned up to the 50 most frequent nodes.

Figure 9.

Gephi capture of the Amor graph pruned up to the 50 most frequent nodes.

As explained in the previous section, we proceed to vectorize our graphs by means of the 11 statistics offered by default in Gephi. Before comparing these vectors, however, it is necessary to normalize their values, because each statistic is on a very different scale. For instance, modularity is a number between 0 and 1, while the statistical inference number runs into the tens of thousands. Without normalization, distances between vectors would be distorted, and the statistical inference number would hide the differences related to modularity and other metrics. Therefore, we normalize each component of the vector in such a way that all of them are numbers between 0 and 1. Even after this normalization process, when replacing each of the graphs Animales, Animals, Love and Amor by the corresponding 11-dimensional normalized vector and computing the corresponding distances, we found a messy pattern that was difficult to interpret. The reason is that the very infrequent words introduce noise into the graph statistics. By contrast, when considering the pruned networks Animals50, Animales50, Love50 and Amor50, we get the distances (sorted in increasing order) described in Table 1.
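The effect of normalization can be seen in a short sketch. The exact normalization scheme is not specified in the text; min-max scaling of each feature across the compared graphs is assumed here as one common way to map every component into the interval from 0 to 1. The raw values are hypothetical.

```python
import math


def euclidean(v1, v2):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))


def min_max_normalize(vectors):
    """Rescale each feature to [0, 1] across all graphs (assumed scheme)."""
    names = list(vectors)
    dim = len(vectors[names[0]])
    normalized = {name: [] for name in names}
    for i in range(dim):
        column = [vectors[name][i] for name in names]
        lo, hi = min(column), max(column)
        for name in names:
            # A feature that is constant across graphs cannot discriminate;
            # it is mapped to 0 to avoid dividing by zero.
            value = 0.0 if hi == lo else (vectors[name][i] - lo) / (hi - lo)
            normalized[name].append(value)
    return normalized


# Hypothetical raw features: (modularity, statistical inference number).
raw = {"A": [0.20, 10000.0], "B": [0.60, 10500.0], "C": [0.40, 30000.0]}
norm = min_max_normalize(raw)

print(euclidean(raw["A"], raw["B"]))    # dominated by the large-scale feature
print(euclidean(norm["A"], norm["B"]))  # both features now contribute
```

Before normalization, the distance between A and B is driven almost entirely by the second feature; afterwards, their large difference in modularity becomes visible. One observation is compatible with the min-max reading: the largest distance in Table 2, 1.732, matches $\sqrt{3}$ to three decimals, which is the maximum attainable with three features scaled in this way. This is suggestive only and does not establish which scheme was used.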

Table 1.

Distances for Graphs Pruned to 50 Nodes, Vectorized With All 11 Gephi Features.

Pair of vectors	Distance
d(vNLove50, vNAmor50)	0.652
d(vNAnimals50, vNAnimales50)	1.032
d(vNLove50, vNAnimals50)	2.581
d(vNAmor50, vNAnimales50)	2.585
d(vNLove50, vNAnimales50)	2.627
d(vNAmor50, vNAnimals50)	2.661

In Table 1 we read, for instance, that d(vNLove50, vNAmor50) = 0.652 and d(vNAmor50, vNAnimals50) = 2.661, showing that Amor50 is closer to Love50 than to Animals50. In general, this table illustrates the fact (already noted visually in Figures 1–9) that thematic proximity implies distance proximity, with the two pairs Love-Amor and Animals-Animales clearly separated from the other comparison pairs. The other axis (language) also seems to entail some nearness with respect to distance (with the pairs Love-Animals and Amor-Animales above the other pairs). There are, in fact, words shared by both categories, such as dog or pet. Nevertheless, given how small the differences among these remaining distances are, the crisp separation of the first two rows in the table appears to be more important.

But why consider all the features that Gephi offers by default to encode our graphs as vectors? There is no special reason, and the previous distances could therefore be produced by some incidental characteristics. Let us note, in particular, that the number of nodes cannot have any discriminating capacity, since the four graphs have been forced to have 50 nodes. In addition, we know that several of these statistics are strongly correlated (for instance, the density of a graph is determined by the number of nodes and the number of edges). To clear up this doubt, we undertook a systematic study considering all the possible combinations ($2^{11} = 2{,}048$ in number) of these 11 features. The pairings Love-Amor and Animals-Animales show a very persistent similarity: in only 35 of the 2,048 possible cases do they fail to occupy the top two rows in the corresponding tables. Let us call these 35 combinations failure cases.
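The systematic study over feature subsets can be sketched as follows. The three-feature vectors are hypothetical and deliberately constructed so that the first two features group the graphs by theme and the third groups them by language; they are not the study's data. The sketch skips the empty subset, for which every distance is zero, so for 11 features it would visit $2^{11} - 1$ subsets. The function names are illustrative.

```python
import math
from itertools import combinations


def distance_on(features, a, b, idx):
    """Euclidean distance restricted to the feature positions in idx."""
    return math.sqrt(sum((features[a][i] - features[b][i]) ** 2 for i in idx))


def failure_cases(features, thematic_pairs):
    """Feature subsets for which the thematic pairs are not the closest ones."""
    names = sorted(features)
    dim = len(features[names[0]])
    all_pairs = list(combinations(names, 2))
    wanted = {frozenset(pair) for pair in thematic_pairs}
    total, failures = 0, []
    for size in range(1, dim + 1):
        for idx in combinations(range(dim), size):
            total += 1
            ranked = sorted(
                all_pairs, key=lambda p: distance_on(features, p[0], p[1], idx)
            )
            top = {frozenset(pair) for pair in ranked[: len(wanted)]}
            if top != wanted:
                failures.append(idx)
    return total, failures


# Hypothetical normalized vectors (features 0 and 1: theme-like; 2: language-like).
features = {
    "Animals50": [1.0, 0.9, 0.1],
    "Animales50": [0.9, 1.0, 0.9],
    "Love50": [0.1, 0.0, 0.0],
    "Amor50": [0.0, 0.1, 1.0],
}
thematic = [("Animals50", "Animales50"), ("Love50", "Amor50")]
total, failures = failure_cases(features, thematic)
print(total, failures)
# prints 7 [(2,), (0, 2), (1, 2)]
```

In this toy setting every failure case contains feature 2, the language-like one, while the full three-feature vector still ranks the thematic pairs first. A single feature aligned with a different axis can thus account for all the exceptions without overturning the overall pattern.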

Interestingly enough, if we exclude the combinations with one or two features (which, evidently, have less discriminating power), all the failure cases except one include modularity among the features involved. In fact, the closest pair with respect to modularity is Animales-Amor, which puts more emphasis on the language axis than on the thematic axis. This characteristic is probably related to the fact, experimentally observed in Agustín-Llach and Rubio (2024), that the communities based on modularity capture conceptual clusters, which could be produced in similar ways for different stimuli in the same language.

Coming back to the non-failure cases, which are by far the most frequent, a careful examination of the different combinations of features leads us to conclude that three statistics are enough to explain the differences among the four graphs Animals50, Animales50, Love50 and Amor50: average degree, density and average path length. With these three metrics, we get the distances described in Table 2. Again, there is a crisp difference between the distances in the first two rows (pairs Animals-Animales and Love-Amor) and the rest. The reason is that, independently of the language of expression, taxonomic stimuli (like Animals/Animales) have the greatest average degree and density and the minimal average path length. This ties in with the idea, previously explained, that taxonomic categories whose members are very close in the conceptual space are less permeable to linguistic differences; that is, their members are more stable regardless of the language in which the task is being performed.

Table 2.

Distances for Graphs Pruned to 50 Nodes, Vectorized With the Features Average Degree, Density and Average Path Length.

d(vNAnimals50, vNAnimales50)	0.121
d(vNLove50, vNAmor50)	0.170
d(vNAmor50, vNAnimals50)	1.467
d(vNAmor50, vNAnimales50)	1.587
d(vNLove50, vNAnimals50)	1.612
d(vNLove50, vNAnimales50)	1.732
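To make the three statistics behind Table 2 concrete, the following sketch computes them for a small directed graph using only the standard library. The function name `three_features` and the conventions used are assumptions: average degree is taken as edges per node, density as the share of possible directed edges present, and average path length as the mean shortest-path length over reachable ordered pairs. Gephi's own conventions may differ in detail.

```python
from collections import deque


def three_features(nodes, edges):
    """Return (average degree, density, average path length) of a directed graph.

    Assumes a simple directed graph (no duplicate edges) with at least two nodes.
    """
    n, m = len(nodes), len(edges)
    successors = {node: [] for node in nodes}
    for source, target in edges:
        successors[source].append(target)

    average_degree = m / n        # out-degree averaged over nodes
    density = m / (n * (n - 1))   # share of possible directed edges present

    total, pairs = 0, 0
    for start in nodes:           # breadth-first search from every node
        dist = {start: 0}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for nxt in successors[current]:
                if nxt not in dist:
                    dist[nxt] = dist[current] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1    # reachable nodes other than the start
    average_path_length = total / pairs if pairs else 0.0
    return average_degree, density, average_path_length


if __name__ == "__main__":
    nodes = ["a", "b", "c"]
    cycle = [("a", "b"), ("b", "c"), ("c", "a")]
    complete = [(u, v) for u in nodes for v in nodes if u != v]
    print(three_features(nodes, cycle))     # sparser toy graph
    print(three_features(nodes, complete))  # denser toy graph
```

On these two toy graphs the denser one has the higher average degree and density and the shorter average path length, which is the same direction of association that the text reports for the taxonomic stimuli.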

As announced in the previous section, to ensure that the differences found by using the feature vectors do not depend on this specific method of vectorization, the so-called spectral vectorization was also employed. To this end, we transformed our graphs to obtain their simple (S) and undirected (U) counterparts. We called these new graphs AnimalsSU50, AnimalesSU50, LoveSU50 and AmorSU50. The spectral distances between these networks are displayed in Table 3 (the eigenvalues were computed with the SageMath symbolic computation system; SageMath, 2026).

Table 3.

Spectral Distances for Graphs Pruned to 50 Nodes and Made Simple and Undirected.

d(svAnimalsSU50, svAnimalesSU50)	2.016
d(svLoveSU50, svAmorSU50)	2.471
d(svAmorSU50, svAnimalsSU50)	10.688
d(svAmorSU50, svAnimalesSU50)	11.472
d(svLoveSU50, svAnimalsSU50)	12.440
d(svLoveSU50, svAnimalesSU50)	13.267
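The spectral comparison can be sketched as follows, under explicit assumptions. The sketch takes the spectral vector to be the sorted spectrum of the 0/1 adjacency matrix of the simple undirected graph and the distance to be Euclidean; the exact definition is the one given in the methods section and may differ (for example, a Laplacian spectrum). The study computed eigenvalues with SageMath; here a small Jacobi rotation routine is swapped in so that the example needs only the standard library, and all function names are hypothetical.

```python
import math


def simple_undirected_adjacency(nodes, weighted_arcs):
    """Drop weights, directions, self-loops and duplicates; return a 0/1 matrix."""
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0.0] * len(nodes) for _ in nodes]
    for source, target, _weight in weighted_arcs:
        if source != target:
            i, j = index[source], index[target]
            matrix[i][j] = matrix[j][i] = 1.0
    return matrix


def symmetric_eigenvalues(matrix, max_sweeps=100, tol=1e-18):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                diff = a[q][q] - a[p][p]
                # Angle that zeroes a[p][q]; kept within [-pi/4, pi/4].
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):   # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):   # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def spectral_distance(spectrum_a, spectrum_b):
    """Euclidean distance between two sorted spectra of equal length."""
    if len(spectrum_a) != len(spectrum_b):
        raise ValueError("graphs must have the same number of nodes")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(spectrum_a, spectrum_b)))


if __name__ == "__main__":
    nodes = ["a", "b", "c"]
    # Toy weighted, directed inputs (hypothetical, not the paper's data).
    path_arcs = [("a", "b", 3), ("b", "a", 1), ("b", "c", 2)]
    triangle_arcs = [("a", "b", 1), ("b", "c", 1), ("c", "a", 5), ("a", "a", 2)]
    sv_path = symmetric_eigenvalues(simple_undirected_adjacency(nodes, path_arcs))
    sv_tri = symmetric_eigenvalues(simple_undirected_adjacency(nodes, triangle_arcs))
    print(spectral_distance(sv_path, sv_tri))
```

Because all four graphs in the study were pruned to 50 nodes, their spectra have equal length and can be compared position by position, which is why the sketch rejects inputs of different sizes rather than padding them.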

Once more, Table 3 shows a crisp separation between the pairs Animals-Animales and Love-Amor and the rest of the comparisons. Nevertheless, the graphs used for the feature vectorization are not the same as those used for the spectral vectorization, which were made simple and undirected, so the agreement between the two methods might not be enough to justify the conclusions. To rule this out, we re-applied the feature vectorization to the simple and undirected graphs; after normalization, we obtained the outcomes collected in Table 4.

Table 4.

Distances for Graphs Pruned to 50 Nodes, and Made Simple and Undirected, With All the 11 Gephi Features.

d(vNAnimalsSU50, vNAnimalesSU50)	0.382
d(vNLoveSU50, vNAmorSU50)	1.044
d(vNAmorSU50, vNAnimalsSU50)	2.503
d(vNLoveSU50, vNAnimalsSU50)	2.535
d(vNAmorSU50, vNAnimalesSU50)	2.700
d(vNLoveSU50, vNAnimalesSU50)	2.750

Table 4 repeats the same pattern. Moreover, when the same systematic study is performed as before, in only 38 of the $2^{11} = 2{,}048$ combinations of features do the pairs Animals-Animales and Love-Amor fail to occupy the first two rows of the table. As a byproduct, these calculations show that our conclusions depend neither on the number of times each pair of words is produced nor on the direction of the edges: this is a purely combinatorial result, related to the way ideas are linked together. Taken together, we claim that there is enough quantitative evidence to support the idea that the thematic axis rules over the language axis for Spanish learners of English. This might mean that categorization depends on non-linguistic features and that it might reflect similar codifications of reality in the different language backgrounds.

Comparing Words and Links

To illustrate the above result, we examined the 50 most central (highest-degree) words in each category and looked for overlaps. Centrality measured as degree is chosen as a proxy for lexicon organization because central nodes are believed to serve as anchors for community navigation, since they are more readily accessible (Agustín-Llach & Rubio, 2024), and also as anchors for new words, so that lexicon growth is based on these central words (cf. Feng & Liu, 2024; Steyvers & Tenenbaum, 2005).

Accordingly, the overlap between Animals and Animales amounts to 86% (43 out of 50); that is, the responses to the category Animals in EFL largely reproduce the answers given to the category Animales in L1, and this is true both for cognate words and for non-cognate translation equivalents. For Love and Amor a large, but smaller, overlap of 78% (39 out of 50) is found. The difference can be attributed to the different nature of the categories: taxonomic and concrete versus more experiential and abstract.
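The overlap figures above (43/50 = 86% and 39/50 = 78%) can be reproduced in spirit with a short sketch. It assumes that each L2 word is mapped to its L1 translation equivalent through a lookup table; the function names, the tie-breaking rule, the toy degrees and the toy translation table are hypothetical, and the way translation equivalents were actually matched in the study is not specified here.

```python
def top_central(degrees, k):
    """Return the k highest-degree words (ties broken alphabetically)."""
    ranked = sorted(degrees, key=lambda word: (-degrees[word], word))
    return ranked[:k]


def overlap_share(l1_degrees, l2_degrees, l2_to_l1, k):
    """Share of the top-k L1 words that also appear, translated, in the top-k L2 words."""
    top_l1 = set(top_central(l1_degrees, k))
    top_l2 = top_central(l2_degrees, k)
    # Translate each central L2 word; words without a known equivalent are skipped.
    translated = {l2_to_l1[word] for word in top_l2 if word in l2_to_l1}
    return len(translated & top_l1) / k


if __name__ == "__main__":
    # Toy degrees and translations (hypothetical, not the paper's data).
    animales = {"perro": 9, "gato": 8, "león": 5, "vaca": 2}
    animals = {"dog": 10, "cat": 7, "fish": 6, "lion": 1}
    equivalents = {"dog": "perro", "cat": "gato", "lion": "león", "fish": "pez"}
    print(overlap_share(animales, animals, equivalents, k=3))
```

In the toy example two of the three most central words coincide after translation, so the share is 2/3; with k = 50 and 43 shared words the same computation gives 0.86.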

Discussion

The results of our analyses lead us to believe that L2 semantic categorization is inspired by L1 categorization and that category type is an overriding factor in categorization over task language. Accordingly, we assume that not only are L1 responses translated or transposed into the L2, but so are the associations established among them. This holds not only when cognates are at play, but also with other translation equivalents.

Our results support the idea that the notion of semantic category is crucial in the structure of the mental lexicon and that the nature of the categories at stake (taxonomic, ad hoc, schema or experiential, for instance) plays an outstanding role in shaping the lexical-semantic network and determining network metrics (cf. Hernández-Muñoz et al., 2025; Sánchez-Saus & Alvarez-Torres, 2024).

Differences in L1-L2 category overlap between the categories Animals and Love can be explained by word concreteness. Concrete words align closely with their translation equivalents (think of perro-dog or león-lion), whereas more abstract words might share fewer semantic features with theirs, as in novia-girlfriend or confianza, which corresponds to trust but also to faith, familiarity and confidence (see Chaouch-Orozco et al., 2024). The conceptual scenario thus changes slightly between Spanish and English. Chaouch-Orozco et al. (2024) believe that whereas concrete words have a common or holistic conceptual node, abstract words might have conceptual nodes which are similar or have some shared aspects, but are not strictly identical in L1 and L2. In this sense, they concur with van Hell and de Groot (1998) in that abstract words are more context- and language-specific, instantiating similar, but not necessarily identical, conceptual scenarios. The responses prompted by the stimulus amor/love are more abstract in nature than those given in response to animales/animals, which are not only concrete but also have universal referents in extra-linguistic reality: cat, dog, elephant, mammal and so on. Also, taxonomic categories include members which are conceptually universal or very close in different languages and cultures. Conceptual similarity overriding linguistic differences might therefore also account for our results. Cultural background may shape the internal organization of schema-based categories; however, because the present sample is limited to Spanish L1 learners, this study cannot directly test cross-L1 cultural variation.

This result could be interpreted in two ways. First, the L1 categorization process (lexical-semantic and conceptual organization of words in the mental lexicon) is reproduced in the L2. The fact that not only words but also associations are reproduced in the L2 responses serves as evidence for this interpretation: learners use their L1 networks to build their L2 networks. In addition, and in line with Pavlenko’s (2017) observation, one could think that, because learners’ L2 experience is mainly confined to formal learning in the classroom and deprived of real-life experiences, learners resort to their L1 categorization frames as their main, and probably only, reference for categorizing the world. The low proficiency of the learners, together with their limited exposure to natural language use, might account for the similarity in their L1 and L2 categorization systems. Our results tie in with the idea of conceptual transfer, whereby learners rely on their well-established and reliable L1 conceptual scenarios to categorize and conceptualize reality (cf. Cadierno, 2017; Jarvis & Pavlenko, 2008), also in the L2, since their L2 patterns are probably less developed and automatized. The limited and often decontextualized nature of classroom input means that L2 lexical-semantic terms lack the strong, multimodal and sensorimotor connections that are vital for robust acquisition and conceptual grounding, as per an embodied cognition perspective (Lu & Yang, 2025). Consequently, the L1 categorization system remains the default and most accessible conceptual framework, as evidenced by the observed structural similarity between L1 and L2 categorization systems in our data.
Further evidence in favour of the translation/transposing scenario is the fact that, as ascertained by Feng and Liu (2024), most central words in L1 are also most frequent in L1 corpus terms, but not in L2, which might again point to learners’ using the translation equivalents of their L1 responses. Second, our results could indicate that while responding in the L2, learners also activate their L1, supporting the parallel activation paradigm of bilingual processing (see Chaouch-Orozco et al., 2024; Dijkstra & van Heuven, 2002; Kroll & Stewart, 1994; also see Collins & Loftus, 1975 for their spreading activation model). Zhao and Li (2022) believe that learners develop parasitic representations of L2 words based on L1 information. In this sense, L2 representations are fuzzier and less strong. Still, another explanation might refer to the shared conceptual system in English and Spanish. Both languages are semantically and lexically very close. Further research could include comparisons with typologically more distant languages such as Arabic, Russian or Chinese.

Accordingly, our results support previous evidence that L1 lexical-semantic networks show thematic structure (De-Deyne & Storms, 2008), that is, that words are grouped according to semantic and/or thematic similarity, and go beyond it by showing that this thematic structure is replicated in L2 networks. The semantic category students are responding to is more important than language in terms of network organization.

In this sense, our results agree with Feng and Liu (2024) in identifying a large proportion of shared words between L1 and L2 lexical-semantic systems, especially among the most central words. When there is a lack of overlap, lexical gaps in L2 can be identified, and this can serve to inform vocabulary instruction in the classroom, for example.

Conclusion

The present study set out to examine semantic categorization in Spanish as an L1 and in EFL. For that purpose, we designed and proposed a methodology of analysis that we believe can yield interesting results and be useful for the study of semantic categorization distances. Two different categories were selected: a taxonomic one, Animals, and an experiential or schema-organized one, Love. Taxonomic categories reflect nature since they contain expressions that show an existential belonging to the category; in this sense, their members are close in conceptual space and are little influenced by contextual factors such as linguistic or cultural background. Our results support this idea, showing the very close proximity of Animals and Animales. Our results also show high similarity in categorization and organization for Love and Amor, revealing that learners also use their L1 conceptual and lexical-semantic information as a scaffold to delineate semantic categories in the L2. This reliance on L1 conceptual knowledge is particularly pronounced when dealing with schema-organized categories. In addition, we could also show that semantic categorization within a language across categories, be it Spanish L1 or EFL, also yields short distances, but longer than within categories across languages, pointing to the outstanding role of semantic category over language in the categorization process. Two main interpretations can be put forward to account for the similarity found in both categories. First, we might think that because of cultural and linguistic proximity, most conceptual space is shared between Animales and Animals and between Amor and Love. Second, learners might be translating L1 lexical-semantic items into the L2, so that L2 semantic categorization mirrors L1. This L1-mediated categorization has frequent support in the literature (see, e.g., Chaouch-Orozco et al., 2024). Although previous studies (cf. Matsuki et al., 2021) could point to cultural immersion as an overriding predictor of semantic categorization over L2 proficiency, the participants in the present study lack culturally immersive experience and have limited L2 exposure in natural contexts as well as limited L2 proficiency, both of which presumably affect their categorization process. Further research should focus on examining categorization in more distant languages, as well as the categorization process of monolinguals and bilinguals in their shared language. The cultural dimension deserves particular attention, mainly in schema-organized categories. Unlike taxonomic categories, which are often more stable and closer to universal conceptual structure, experiential categories are more dependent on personal experience and sociocultural knowledge. Because the present study focuses on Spanish L1 learners of English, it has not directly tested cross-cultural or cross-linguistic variation. Future research should therefore compare participants with different L1s and cultural backgrounds to determine whether the patterns observed here reflect general tendencies in L2 semantic categorization.

Footnotes

ORCID iD

MariaPilar Agustín-Llach

Ethical Considerations

The study was approved by the Ethics Committee of the University of La Rioja and the Ethics Committee of the University of Extremadura and complies with all the ethical requirements concerning participant information and consent.

Consent to Participate

Participant consent was given in written form via the online application.

Author Contributions

MariaPilar Agustín-Llach: conceptualization, literature review, data gathering and study design, and paper writing and conclusions.

Julio Rubio: conceptualization, mathematical calculations and analyses and study design, paper writing and conclusions.

Funding

The authors disclosed receipt of the following financial support for the research, authorship and/or publication of this article: This work was partially supported by Grants PID2022-137337NB-C21, PID2024-155834NB-I00 and PID2024-157733NB-I00, by MICIU/AEI/10.13039/501100011033 and by ERDF/EU and AFIANZA 2024/03 (funded by La Rioja Government).

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.

Data Availability Statement

A replication package for the replicability of this study with all the necessary data and analyses conducted can be found here:

Author Biographies

MariaPilar Agustín-Llach is a Full Professor in the Department of Modern Languages at the University of La Rioja. Her main research interests focus on FL vocabulary acquisition, as well as the factors that shape this process, including proficiency level, age, L1/Lx, and bilingualism and multilingualism.

Julio Rubio is a Full Professor in the Department of Mathematics and Computing Science at the University of La Rioja. His main research interests focus on Natural Language Processing and the application of topological methods in Data Science.

References

Agustín-Llach, M. P., & Rubio, J. (2024). Navigating the mental lexicon: Network structures, lexical search and lexical retrieval. Journal of Psycholinguistic Research, 53, 21.

Aitchinson (1994). Words in the mind: An introduction to the mental lexicon (2nd ed.). Blackwell.

Ameel & Storms (2006). From prototypes to caricatures: Geometrical models for concept typicality. Journal of Memory and Language, 55, 402–421.

Ameel, Storms, Malt, B. C., & Sloman, S. A. (2005). How bilinguals solve the naming problem. Journal of Memory and Language, 53, 60–80.

Barrat, Barthelemy, Pastor-Satorras, & Vespignani (2004). The architecture of complex weighted networks. Proceedings of the National Academy of Sciences, 101(11), 3747–3752.

Barsalou, L. W. (1983). Ad hoc categories. Memory & Cognition, 11(3), 211–227.

Barsalou, L. W. (2010). Grounded cognition: Past, present, and future. Topics in Cognitive Science, 2, 716–724.

Bastian, Heymann, & Jacomy (2009). Gephi: An open source software for exploring and manipulating networks. International AAAI Conference on Weblogs and Social Media, 3, 361–362.

Benn, Ivanova, A. A., Clark, Mineroff, Seikus, Silva, J. S., Varley, & Fedorenko (2023). The language network is not engaged in object categorization. Cerebral Cortex, 33(19), 10380–10400.

Blondel, V. D., Guillaume, J. L., Lambiotte, & Lefebvre (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10), 10008.

Borodkin, Kenett, Y. N., Faust, & Mashal (2016). When pumpkin is closer to onion than to squash: The structure of the second language lexicon. Cognition, 156, 60–70.

Bower, G. H., Clark, M. C., Lesgold, A. M., & Winzenz (1969). Hierarchical retrieval schemes in recall of categorized word lists. Journal of Verbal Learning and Verbal Behavior, 8(3), 323–343.

Cadierno (2017). Thinking for speaking about motion in a second language: Looking back and forward (pp. 279–300). John Benjamins.

Chaouch-Orozco, González-Alonso, Duñabeitia, J. A., & Rothman (2024). Are translation equivalents really equivalent? Evidence from concreteness effects in translation priming. International Journal of Bilingualism, 28, 149–162.

Cherven (2015). Mastering Gephi network visualization. Packt Publishing.

Christensen, A. P., & Kenett, Y. N. (2021). Semantic network analysis (semna): A tutorial on preprocessing, estimating, and analyzing semantic networks. Psychological Methods, 28(4), 860–879.

Collins, A. M., & Loftus, E. F. (1975). A spreading-activation theory of semantic processing. Psychological Review, 82(6), 407–428.

Coni, A. G., Ison, & Vivas (2019). Conceptual flexibility in school children: Switching between taxonomic and thematic relations. Cognitive Development, 52, 100827.

De-Deyne & Storms (2008). Word associations: Network and semantic properties. Behaviour Research Methods, 40(1), 213–231.

Dijkstra & van Heuven, W. J. B. (2002). The architecture of the bilingual word recognition system: From identification to decision. Bilingualism: Language and Cognition, 5(3), 175–197.

Dubossarsky, De-Deyne, & Hills, T. T. (2017). Quantifying the structure of free association networks across the life span. Developmental Psychology, 53(8), 1560–1570.

Feng & Liu (2023). The developmental trajectories of L2-semantic networks. Humanities and Social Sciences Communications, 10, 128.

Feng & Liu (2024). The structure of lexical-semantic networks at global and local levels: A comparison between L1 and L2. Complexity, 2024, 1–3.

Fitzpatrick & Izura (2011). Word association in L1 and L2: An exploratory study of response types, response times, and interlingual mediation. Studies in Second Language Acquisition, 33(3), 373–398.

Francis, W. S. (1999). Cognitive integration of language and memory in bilinguals: Semantic representation. Psychological Bulletin, 125(2), 193–222.

Fruchterman, T. M. J., & Reingold, E. M. (1991). Graph drawing by force-directed placement. Software: Practice and Experience, 21(1), 1129–1164.

Hernández-Muñoz (2014). Categorías en el léxico bilingüe: Perspectivas desde el priming semántico interlenguas y la disponibilidad léxica [Conceptual Categories in the Bilingual Lexicon: Perspectives from Cross-Linguistic Semantic Priming and Lexical Availability]. RAEL: Revista Electrónica De Lingüística Aplicada, 13(1), 19–38.

Hernández-Muñoz, Tomé-Cornejo, & López (2025). Redefining linguistic categories through network theory. Language and Cognition, 17, Article e40.

Imai & Gentner (1997). A cross-linguistic study of early word meaning: Universal ontology and linguistic influence. Cognition, 62(2), 169–200.

Jarvis & Pavlenko (2008). Crosslinguistic influence in language and cognition. Routledge.

Jiménez-Catalán, R. M. (2014). Lexical availability in English and Spanish as a second language. Springer-Verlag.

Kleiber (1995). La Semántica de los Prototipos. Categoría y Sentido Léxico [Prototype Semantics: Category and Lexical Meaning]. Visor.

Kolers, P. A. (1963). Interlingual word associations. Journal of Verbal Learning and Verbal Behavior, 2(4), 291–300.

Kövecses (2006). Language, mind and culture: A practical introduction. Oxford University Press.

Kroll, J. F., & de Groot, A. M. B. (2005). Handbook of bilingualism: Psycholinguistic approaches. Oxford University Press.

Kroll, J. F., & Stewart (1994). Category interference in translation and picture naming: Evidence for asymmetric connections between bilingual memory representations. Journal of Memory and Language, 33, 149–174.

Kroll, J. F., & Tokowicz (2001). The development of conceptual representation for words in a second language (pp. 49–71). Wiley-Blackwell.

Kroll, J. F., & Tokowicz (2005). Models of bilingual representation and processing: Looking back and to the future (pp. 531–553). Oxford University Press.

Lakoff (1986). A figure of thought. Metaphor and Symbolic Activity, 1(3), 215–225.

Lin, Schwanenflugel, & Wisenbaker (1990). Category typicality, cultural familiarity, and the development of category knowledge. Developmental Psychology, 26, 805–813.

Lu & Yang (2025). Second language embodiment of action verbs: The impact of bilingual experience as a multidimensional spectrum. Bilingualism: Language and Cognition, 28(4), 1117–1133.

Lupyan & Mirman (2013). Linking language and categorization: Evidence from aphasia. Cortex, 49(5), 1187–1194.

Malt, B. C., Sloman, S. A., Gennari, Shi, & Wang (1999). Knowing versus naming: Similarity and the linguistic categorization of artifacts. Journal of Memory and Language, 40, 230–262.

Mandler, J. M. (1984). Stories, scripts, and scenes: Aspects of schema theory. Lawrence Erlbaum Associates.

Matsuki, Hino, & Jared (2021). Understanding semantic accents in Japanese–English bilinguals: A feature-based approach. Bilingualism: Language and Cognition, 24(1), 137–153.

Pavlenko (2009). Conceptual representation in the bilingual lexicon and second language vocabulary learning (pp. 125–160). Multilingual Matters.

Pavlenko (2017). Do you wish to waive your rights? Affect and decision-making in multilingual speakers. Current Opinion in Psychology, 17, 74–78.

Plat, Lowie, & de Bot (2018). Word naming in the L1 and L2: A dynamic perspective on automatization and the degree of semantic involvement in naming. Frontiers in Psychology, 8, Article 2256.

Precosky (2011). Exploring the mental lexicon using word association tests: How do native and non-native speakers of English arrange words in the mind? [Doctoral thesis]. University of Birmingham, Birmingham.

Quintanilla & Kloss (2024). Academic lexicon development in an EMI context: A study of pre-service English teachers’ lexical availability. Journal of TESOL Studies, 6(2), 61–84.

Rosch (1978). Principles of categorization (pp. 27–48). Erlbaum.

SageMath. (2026). SageMath [Technical report]. https://www.sagemath.org

Saji, Hong, & Wang (2024). Learning semantic categories of L2 verbs: The case of cutting and breaking verbs. PLOS ONE, 19(1), Article e0296628.

Sánchez-Saus & Alvarez-Torres (2024). Influencia de los contextos de aprendizaje en el lexicón mental: Productividad léxica y redes semánticas en estudiantes de ELE [Influence of Learning Contexts on the Mental Lexicon: Lexical Productivity and Semantic Networks in Learners of Spanish as a Foreign Language (SFL)]. Revista De Lingüística Y Lenguas Aplicadas, 19, 204–217.

Sass, Sachs, Krach, & Kircher (2009). Taxonomic and thematic categories: Neural correlates of categorization in an auditory-to-visual priming task using fMRI. Brain Research, 120, 78–87.

Schmitt (1998). Quantifying word association responses. What is nativelike? System, 26(3), 389–401.

Sheng, McGregor, K. K., & Marian (2006). Lexical–semantic organization in bilingual children: Evidence from a repeated word association task. Journal of Speech, Language, and Hearing Research, 49(3), 572–587.

Shivabasappa, Peña, E. D., & Bedore, L. M. (2017). Typicality effect and category structure in Spanish–English bilingual children and adults. Journal of Speech, Language and Hearing Research, 60(6), 1577–1589.

Steyvers & Tenenbaum, J. B. (2005). The large-scale structure of semantic networks: Statistical analyses and a model of semantic growth. Cognitive Science, 29(1), 41–78.

Tantardini, Ieva, Tajoli, & Piccardi (2019). Comparing methods for comparing networks. Scientific Reports, 9, 17557.

Teixeira-Moláns (2024). Galician colour semantics: An investigation of basic colour terms [Doctoral thesis]. University of Glasgow, Glasgow.

Tomé-Cornejo (2015). Léxico disponible. Procesamiento y aplicación a la enseñanza de ELE [Doctoral thesis]. Universidad de Salamanca, Salamanca.

UCLES. (2001). Quick placement test (Revised ed.). Oxford University Press.

van Hell, J. G., & de Groot, A. M. B. (1998). Conceptual representation in bilingual memory: Effects of concreteness and cognate status in word association. Language and Cognition, 1(3), 193–211.

Viñas-Guasch, Gathercole, & Stadthagen-Gonzalez (2017). Bilingualism and the semantic-conceptual interface: The influence of language on categorization. Bilingualism, Language and Cognition, 20, 5.

Wierzbicka (1992). Defining emotion concepts. Cognitive Science, 16, 539–581.

Wolter & Yamashita (2018). The influence of L1–L2 congruency on L2 collocational processing. Studies in Second Language Acquisition, 40(3), 689–710.

Zemla, J. C., & Austerweil, J. L. (2022). Estimating semantic networks of groups and individuals from fluency data. Computational Brain and Behavior, 1(1), 36–58.

Zhang & Peixoto, T. P. (2020). Statistical inference of assortative community structures. Physical Review Research, 2(4), 043271.

Zhao & Li (2022). Fuzzy or clear? A computational approach towards dynamic L2 lexical-semantic representation. Frontiers in Communication, 6, 726443.