Abstract
The goal of this article is to analyse the role of urban place names (‘urbanonyms’) with dialectal origins in the domain of Italian toponomastics. The article offers a gazetteer-based study from which a list of dialectal generic terms is extracted and compiled. From this list, a detailed geographic distributional analysis of these terms is offered, and the dialectal origins of terms are assessed. A questionnaire study is subsequently presented that analyses to what degree Italian speakers may interpret these terms as being dialectal in origin and, to an extent, as unique to the places they classify. The article concludes by discussing how these findings can inform our understanding of place names, their linguistic properties and their possible dialectal origins.
Introduction: Previous literature and research goals
The study of place names, or ‘toponyms’, has received renewed attention across different linguistic disciplines. Several works suggest that toponyms and other proper name types have distinctive properties necessitating the formulation of ‘onymic grammars’ (Nübling et al., 2015). Toponomastic works suggest that toponyms display a combination of two core terms. Generic terms classify the place types they name; specific terms act as referring labels for these places (Blair and Tent, 2015). For instance, Federation Square names a square in Melbourne, Australia, and includes the generic term ‘square’ and the specific term ‘Federation’. One can also find toponyms which include generic terms and complex specific terms, which include further generic terms (e.g. respectively ‘New’, ‘South’ in ‘New South Wales’), or ‘bare’ specific terms (e.g. ‘Australia’). Either linear order can be attested within a language, e.g. generic before specific term in ‘Mount Rushmore’ vs. specific before generic term in ‘Pitt Street’. Toponyms thus display distinctive grammatical properties (Blair and Tent, 2021; Perono Cacciafoco and Cavallaro, 2023).
The cross-linguistic and dialectal roots of toponyms are also well documented. A cross-linguistic case is ‘Lake Rotorua’, a toponym for a lake in New Zealand. This toponym combines an English generic term, ‘lake’, with a Māori specific term, ‘roto-rua’, literally ‘lake second’ (i.e. ‘(the) second lake’; Tent and Blair, 2019). Mountain toponyms, in turn, may include terms originating in local dialects (e.g. Swiss German ‘horn’ in ‘Matter-horn’; Derungs and Purves, 2016). Several languages (e.g. Swedish, Lithuanian) also include suffixes that act as generic terms and have dialectal origins (e.g. ‘-köping’ (‘market’) in Swedish ‘Jön-köping’; Stemshaug, 2002). Toponomastics and dialectology have thus amply explored geo-historical connections between languages and dialects through the lens of toponyms and their properties (Scott, 2016; Wiereck, 2006).
This intertwining of national and local languages is also a well-documented research path in Italian geolinguistics (Granucci, 2004). Italian toponyms may have millennia-old origins, which hint at the presence of civilizations predating Roman rule (Gasca Queirazza et al., 1990). They may also be rooted in the local dialects (e.g. the generic term ‘rua’ in the city of Ascoli Piceno; De Stefani, 2004). Italian and other Romance varieties spoken in the Italian peninsula (or ‘dialects’) have co-evolved from Latin for several centuries, after all (Berruto, 2012; Frasson, 2022; Maiden and Parry, 1997). Though diachronic trajectories may obfuscate these dialectal origins via phonological changes (Stolz et al., 2017), a rich toponomastic tradition has carefully documented these roots (e.g. Calafiore, 1975). However, most studies focus on the distribution and analysis of these toponyms within local (i.e. provincial) territories. They forego a multi-scalar (i.e. local, regional, national) analysis of the patterns leading to this interaction. Therefore, they also do not analyse in which places one can find dialectal toponyms.
The first goal of this article is to offer a distributional analysis of dialectal terms in Italian toponyms. We thus assess where these toponyms can be found in gazetteers, and the dialects and territories they represent. The second goal is to analyse what kinds of toponyms have dialectal origins, and how these terms may follow the properties of Italian ‘toponymic grammar’. We restrict our attention to ‘urbanonyms’, i.e. toponyms for urban conglomerates (e.g. cities, villages) and their constituting parts (e.g. streets, squares and their names: ‘Via Roma’ ‘Rome Street’, ‘Piazza Navona’ ‘Navona Square’, among others; David, 2011). We make this choice for two reasons. First, this sub-type almost always features the presence of generic terms (Ursini, 2016). Second, their documentation in gazetteers is highly accurate and thorough (Samo and Ursini, 2022). We thus suggest that answering these questions can lead us to a firmer grasp of the geolinguistic interplay between languages and dialects.
The article is organised as follows. In the remainder of the introduction, we offer an overview of previous studies on Italian urbanonyms, and motivate the need for an analysis of their geographical distribution. We then present the methodology for our study in the second section, which consists of three parts. The first is a corpus-based, quantitative analysis of generic terms and their geographic distribution. The second is a lexicographic, qualitative analysis of these terms and their dialectal roots. The third is a questionnaire in which participants offered their intuitions on the dialectal/local roots of selected terms. We present the results in the third section, and conclude in the fourth section by discussing these findings and whether they allow us to achieve our goals. First, however, we define ‘place’ as any location which humans can develop attachment to and use for social interactions (Cresswell, 2014: Ch. 1). ‘Place names’ can refer to these concepts via the lens of locally defined socio-cultural and linguistic uses (Coates, 2013). Our study therefore aims to study naming patterns for places in the context of the Italian landscape.
We begin by offering an overview of the institutions and practices that govern the collection of official toponyms in Italy (see also Ursini, 2020). The institutional entity that presides over these tasks is the Istituto Geografico Militare Italiano (IGMI; Cantile, 2004, 2013). The IGMI operates via three bottom-up procedures. First, researchers collect grassroots data via local speakers, ideally NORMs (non-mobile, older, rural and male; Chambers and Trudgill, 1998: 28–30). Second, researchers consult documents from local government institutions (e.g. land registries, city administrations). Third, researchers also consult the findings from ‘deputazioni di storia patria’, ‘national history delegations’, which are institutes that collect information about local history, culture and traditions across the country (Panella, 2020 [1931]). The IGMI also operates in a top-down manner by classifying toponyms via synchronic (e.g. grammatical, phonological) and diachronic (e.g. etymological) principles of linguistic analysis.
IGMI's data collection practices thus follow a form of triangulation, where different methods, converging to equivalent results, contribute to validate data (Damico and Tetnowski, 2014; Rothbauer, 2008). When conflicts in validation arise, researchers give priority to the historically older and less standardized sources. They thus assume that toponyms whose phonological form is closer to standard Italian are probably the result of external pressures. In fact, local speakers may show resistance to offering dialect-based toponyms because they attach to them a stigma of ‘imperfect’ variants of standard Italian ones (Cassi, 2004: 726). Historical maps from local ‘deputazioni’ and land registers may also preserve toponyms in their dialectal forms. When this is the case, the dialectal form is recorded because it existed before the Italian language became the official language of the new unitary state.
From the inception of Italy as a modern country in 1861 to the present, Italian linguists have also cultivated toponomastics and geolinguistics as important avenues of linguistic research (e.g. Ascoli, 2008 [1895]; Marcato, 2009; Nocentini, 2004). Toponomastic research simultaneously operates across different layers of linguistic analysis. However, it is convenient for our goals to discuss the key results of this research tradition in two steps. We focus first on the possible etymological roots of Italian toponyms, then on their grammatical and lexical properties.
The contribution of other languages to Italian toponyms is an amply documented fact. For instance, Romance languages spoken in the Peninsula contributed thousands of toponyms. Examples include Sardinian (see ‘Nùoro’; Pittau, 2018), Valdotain (e.g. ‘Courmayeur’; Nocentini, 2004) and Friulan (e.g. ‘Triest’; Frau, 1968). Past colonisers and indigenous populations have also left their clear and indelible trace through their toponyms. Examples include Etruscan in Tuscany (e.g. ‘Vincena’; Massarelli, 2012) and Celtic across Northern regions (e.g. ‘Bologna’ from ‘Bononia’; Pellegrini, 1986). One can also find toponyms with Arabic (e.g. ‘Misilmeri’ from ‘manzil-el-emir’, ‘emir's home’; Pellegrini, 1961) and Greek origins (e.g. ‘Napoli’ from ‘nea-polis’, ‘new city’, Chiappinelli, 2013). Langobardic toponyms are found in Northern Italy but also in the central regions. For instance, the generic term ‘fara’ in the toponym ‘Fara Sabina’ originates from the Germanic ‘farh’, ‘expeditionary troop’. This generic term was introduced during a period in which the Langobard duchies governed vast portions of central and Southern Italy (e.g. the duchies of Spoleto, Benevento; Melelli and Sacchi De Angelis, 1982).
Toponyms can also display dialectal origins: generic terms often offer key evidence for these origins (Calafiore, 1975). For instance, one can have ‘pesco’, ‘cliff’, for cliffs and villages located near salient villages, found in the Southern Apennines (e.g. ‘Pescocostanzo’; Chiappinelli, 2013). One can also have Neapolitan ‘pallonetto’ for the unique network of alleyways only found in Naples (e.g. ‘Pallonetto Santa Lucia’; Doria, 1982). Thus, Italian toponyms may offer ‘memory traces’ of a country's past inhabitants and their place-naming practices (Gelling, 1988; Hanks, 2011: 306–307; Marcato, 2009; Whittlesey, 1929). Specific and generic terms alike, however, have often undergone processes of phonological assimilation. Hence, they have become part of the Italian ‘toponomasticon’, or ‘lexicon of toponyms’ (Rezsegi, 2012, 2020a, 2020b). The empirical question thus becomes how one can trace these dialectal roots, as these roots may have faded in contemporary Italian.
Before we can answer this question, we must introduce the key grammatical and lexical properties of this category. Italian toponyms are considered a sub-type of proper names (De Felice, 1987). Grammar-wise, they can display three construction types (Marcato, 2009, 2010–2011). The first includes synchronically opaque forms of suffixation that may hint at their pre-Italian origins (see ‘Coppito’ from Latin ‘Poppl-etum’, ‘Poplar's place’). Often, these forms only include a specific term that is also usually opaque in its sense (e.g. ‘Roma’). The second includes nominal compounds, with generic terms as the (left) head of a compound (e.g. ‘piazza’ in ‘Piazza Navona’, ‘Navona Square’). The third includes the partitive/relational preposition ‘di’ combining with a generic and a specific term (e.g. ‘Ponte dei Sospiri’, ‘Bridge of Sighs’). Thus, Italian toponyms can realize the ‘generic term and specific term’ template also found in English via the second construction. However, they also include a construction featuring only ‘bare’ specific terms (first type), and relational/partitive constructions (third type).
These key properties hold for urbanonyms, even if at first glance they seem a heterogeneous sub-type. Urbanonyms appear heterogeneous because they can name different types of places (e.g. streets, squares, parks; Basik, 2021; Marcato, 2011; Seidl, 2019). However, cross-linguistically they tend to include generic terms classifying these place types (Köhnlein, 2015; Koptjevkaja-Tamm, 2013; Vannieuwenhuyze, 2007). For Italian, Ursini and Long (2020) show that 99.1% of urbanonyms from four cities (Rome, Milan, Naples and Venice) include generic terms, exceptions being urbanonyms for widely established landmarks (e.g. ‘Il Colosseo’ in Rome). However, the aforementioned study does not explore the lexical content and origins of urbanonyms’ generic terms. Etymological (e.g. Cassi, 2015; Granucci, 1988, 2004) and lexicographic studies (Ursini and Samo, 2022b) have hinted at the dialectal origins of some generic terms in this class of toponyms. However, these studies do not assess how the distribution of dialectal generic terms is linked to the Italian territory, its distinctive parts (e.g. cities, regions) and its dialects. Therefore, our ‘where’ question, and with it our ‘what’ and ‘how’ questions, are still outstanding. The remainder of this article answers these questions.
Materials and methods
Our study included three parts, which we discuss in order of execution: data extraction from gazetteers, lexicographic analysis of potential dialect-based terms and a questionnaire about local speakers’ intuitions. Our purpose was to reach a form of triangulation by testing if the diachronic, dialectal origins of generic terms were synchronically accessible to speakers of the relevant dialects. Though toponyms may represent memory traces of the interplay between languages and dialects, assimilation into Italian may render these traces opaque to speakers (Battisti, 1928, 1959; Marcato, 2010–2011; Nocentini, 2004). For instance, Genoese speakers may not know that ‘crosa’, ‘a small, crimson-paved alley’, was once ‘crêusa’, unless they are well acquainted with the term's etymology and its origin in the Genoese dialect. For other Italian speakers who have seldom used this and other terms, ‘crosa’ could be an uncommon generic term that may be geographically restricted to a city or region.
For these reasons, we conjecture that when the phonological form of a term has undergone a process of assimilation, dialectal origins become opaque. However, speakers could potentially infer their dialectal origins via an indirect process of ‘geo-localization’. That is, they can associate rare or lesser known generic terms as descriptors for places only found in certain cities or regions (see Cassi, 2015; Rezsegi, 2020b; Tent and Blair, 2019; Ursini and Samo, 2022b). Our goal with the questionnaire was thus to test whether speakers of any regional origin could at least have intuitions about the dialectal roots of these terms, and whether these roots would affect their comprehension and answering of the questionnaire.
Let us describe our approach. In the first part, we used the online dynamic gazetteer OpenStreetMap (henceforth, OSM; www.openstreetmaps.org) to extract urbanonyms. OSM operates according to a philosophy known as ‘Volunteered Geographic Information’ (henceforth VGI; e.g. Sui and Goodchild, 2011). Users of the platform can spontaneously insert place names for places on maps, and can resolve any disagreements via the platform's discussion forums. When a critical number of users converge on the toponym for a place, the toponym is officially recorded. Thus, OSM represents grassroots, citizens’ bottom-up knowledge about places and their names, potentially growing on a daily basis. It is for these reasons that we opted to use this gazetteer: ultimately, OSM could offer us easily accessible data ‘closer’ to local knowledge. It also offered us possibly the highest number of tokens and types on Italian urbanonyms (see Ursini and Samo, 2022b for a comparison).
We thus operated as follows. We queried the dedicated platform Overpass-Turbo (https://overpass-turbo.eu/), extracting the textual place names in dedicated annotated areas. These areas represent the 20 Italian administrative regions; the code is available in supplementary file A. The output .txt file was then transformed into a .csv file, adopting spaces as separators so that every first word of the urbanonym would represent the target generic term. Given the syntactic nature of Italian, the generic term represents the first element of the syntactic constituent. From this list, we created an Excel file in which the generic terms were organized according to their regional distribution (see supplementary file B for the full list).
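The first-word extraction step described above can be illustrated with a minimal Python sketch. It is not the code from supplementary file A or B: the function names, the lowercasing step and the toy data (including the region assigned to each name) are assumptions made only for illustration.

```python
from collections import Counter, defaultdict


def generic_term(urbanonym: str) -> str:
    """Return the first space-separated word of an urbanonym, lowercased.

    Assumption: the generic term is the first element of the name,
    as stated in the text for Italian urbanonyms.
    """
    return urbanonym.strip().split(" ", 1)[0].lower()


def count_generic_terms(names_by_region):
    """Count generic-term tokens per region (hypothetical helper)."""
    counts = defaultdict(Counter)
    for region, names in names_by_region.items():
        for name in names:
            if name.strip():  # skip empty lines in the extracted text
                counts[region][generic_term(name)] += 1
    return counts


# Toy input standing in for the per-region gazetteer output.
sample = {
    "Lazio": ["Piazza Navona", "Via Roma", "Via Garibaldi"],
    "Veneto": ["Ponte dei Sospiri"],
    "Friuli-Venezia Giulia": ["Calle Leonardo da Vinci"],
}

counts = count_generic_terms(sample)
for region, counter in counts.items():
    print(region, dict(counter))
```

Splitting only on the first space keeps multi-word specific terms (e.g. ‘dei Sospiri’) out of the generic-term column, which mirrors the space-separated .csv layout described in the text.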
In the second part, we performed a lexicographic analysis. We consulted definitions of the extracted generic terms in Italian (e.g. De Mauro, 2020; Gabrielli, 2020) and toponym dictionaries (e.g. Cassi and Marcaccini, 1998; Gasca Queirazza et al., 1990). We aimed to verify the possibility that some terms may be attested only in the second group. We also aimed to verify the possibility that definitions would be more detailed in the second group. Our reasoning was that toponym dictionaries could offer etymological information about the local/dialectal origins of terms. This information may instead be missing from dictionaries that focus on the synchronic status of terms. A key result was the discovery of synonym sets: words that may carry the same senses but display different, possibly related forms (e.g. ‘vico’ vs. ‘vicolo’ vs. ‘rua’; Murphy, 2010).
In the third part, we prepared a written questionnaire in which participants had to evaluate whether a sub-set of the extracted terms was of dialectal origin. We chose a written questionnaire for two reasons. First, questionnaires are a well-established research methodology in dialectology (Chambers, 2017; Llamas, 2017, and references therein) and can be designed to allow participants an active contribution to the study. Researchers can invite participants to evaluate items according to pre-determined criteria, but also invite them to elaborate on the items (Döllinger, 2015: Chs. 2–3). Second, questionnaires can be completed online (e.g. on smartphones), and thus represent a logistically convenient tool in these challenging times of global pandemics (see Singh and Sagar, 2021).
The questionnaire focused on the manifold generic terms forming the set of synonyms for ‘vicolo’, ‘alley’. We opted to use a web-based questionnaire in which participants were asked to offer their opinion as speakers of standard Italian regarding the ‘locality’ of the terms of their and others’ regions. For each entry, participants could choose values within a five-point Likert scale, with ‘1’ confirming that a term was perceived as dialectal, and ‘5’ as Italian (Kho, 2018). We implemented our questionnaire on the web-based platform ‘psytoolkit’ (Stoet, 2010, 2017). Participants could also add comments in a feedback textbox if they wished to clarify their choice.
Participants were presented with the picture of a real-world road sign containing the targeted generic term, extracted from the web-mapping repositories of Google Street View (Google, 2022; all the experimental items, including pictures, are available in supplementary file C). The generic term was highlighted, and brought to the attention of the participant, by being circled in red. The participants were instructed, in the welcoming screen and during the task, to evaluate the generic term (‘la parola cerchiata in rosso’, ‘the word circled in red’, in the instructions) according to the given scale. Figure 1 exemplifies the procedure.

Figure 1. An example of the design of the questionnaire.
The regional synonyms of ‘vicolo’ (see the third section and the Appendix) were randomly interspersed with fillers exploring the Likert scale. Examples include purely regional terms like Venetian ‘salizada’ and extremely frequent standard Italian terms (e.g. ‘piazza’, ‘square’; Ursini and Samo 2022b). The participants were presented with a welcome screen, followed by a page collecting their age and origins. While age information involved selecting relevant demographic intervals (e.g. 18–30, 30–40, 40–50, 50–60, 60–70, older than 70), participants offered their origins via the question ‘Da dove vieni (città o provincia)?’, ‘Where are you from (city or province)?’. A practice session helped the participants familiarize themselves with the task. After evaluating all the experimental stimuli and fillers, participants were presented with a screen optionally collecting information on sex (‘male’, ‘female’, ‘non-binary’, ‘I prefer not to answer’). We acknowledge that these personal data might not be fine-grained, but such a structure helped us in keeping the questionnaire easily accessible and shareable. Furthermore, as discussed in the third section, data analysis did not detect any statistically relevant regional/age correlation with respect to our sample.
Results
The results were as follows. In the first part, we extracted a total of 452,538 tokens distributed unevenly across regions. We thus obtained a far more voluminous sample than the one in Ursini and Samo (2022b), which extracted 213,218 tokens. In so doing, we increased the chances of discovering local/dialectal terms. The more densely populated regions with the highest number of urban centres (e.g. Lombardy, Lazio, Sicily, Veneto) have the highest numbers of tokens. For Lombardy, for instance, Milan but also nearby urban centres (e.g. Bergamo) contribute thousands of urbanonyms and respective places. For Lazio, the capital city Rome and other centres play a similarly relevant role. Conversely, regions such as the mountainous Aosta Valley in the North-West and Molise in the centre contribute fewer tokens, as they also have small populations and correspondingly small urban centres. Nevertheless, all Italian regions contribute thousands of tokens and potentially local/dialectal generic terms to the study. We give the distribution in Figure 2. We assume that the possibility of OSM including incorrect toponyms is statistically non-significant; see however Ursini and Samo (2023) for discussion.

Figure 2. Distribution of the tokens in OpenStreetMap (map created with datawrapper v.1.25.0; Lorenz et al., 2012).
In the second part, we first isolated types attested by at least two tokens in every region to extract types of generic terms, and performed a lexicographic analysis. We then aggregated the different types’ counts to explore the local dimension. We performed a manual analysis to localize terms that could potentially be attested in only one region. In Italy, geo-linguistic regions representing the distribution of dialects and administrative regions only loosely coincide (Frasson, 2022: Ch. 1; Lameli et al., 2010: Ch. 3; Rabanus, 2017). For instance, the Neapolitan dialects have Naples as a centre of propagation (Sornicola, 1996). However, these dialects are also spoken in the Campania region in which Naples is situated, and in other contiguous regions such as Abruzzo, Molise and Basilicata (Ledgeway, 2009: Chs. 1–3). Thus, traces of the Neapolitan dialects in urbanonyms can be found in cities and regions in which this dialect is spoken and has played a historical role as a substratum (Berruto, 2012).
We then verified whether terms attested in fewer than five administrative regions would cover continuous geographical regions associated with a single dialect. We found that only a subset of generic terms is attested in all 20 Italian regions. These were ‘corso’, ‘avenue’; ‘galleria’, ‘gallery’; ‘percorso’, ‘trail’; ‘piazza’, ‘square’; ‘piazzetta’, ‘little square’; ‘ponte’, ‘bridge’; ‘sentiero’, ‘country road’; ‘strada’, ‘road’; ‘via’, ‘street’; ‘viale’, ‘avenue’; and ‘vicolo’, ‘small alley’. These terms refer to the types of places most commonly found in each Italian city. In order to extract local generic terms, we carried out a manual analysis, since some terms could be found in multiple but disjointed regions. For instance, maritime terms such as ‘banchina’, ‘pier’, or ‘erta’, ‘steep slope’, could be found in Sardinia, Tuscany and Friuli-Venezia-Giulia. After the manual analysis, we isolated more than 150 types of ‘local’ generic terms that can be found in contiguous regions. We summarize our results in Table 1.
Table 1. Number of contiguous regions (#CR) and generic terms.
Note here that some terms (e.g. German ‘weg’, Valdotain ‘allée’) do not have origins in the Romance dialects we discuss in this article. We however include them to show that our extraction procedure was as exhaustive as possible. As Table 1 shows, certain terms have restricted distributions that correspond to specific geo-linguistic regions or even specific cities. These generic terms were organized according to their regional distribution, shown in Figure 3. For instance, ‘salizada’, ‘cleaning place’, can only be found in the Veneto region, and ‘venula’, literally ‘little vein’ (i.e. ‘small alley’), can only be found in Erice (Trapani, Sicily). These types of terms are shown in the right panel of Figure 3.

Figure 3. Distribution of all the local generic terms (left panel) and distribution of terms found in only one region (right panel).
Geographical distributions are often very limited: for example, ‘calle’, ‘alley’, is mainly related to the city of Venice (1209 occurrences in the city gazetteer of Venice; see also Ursini & Long, 2020). It can be found in other contiguous regions with smaller frequencies. One case is Friuli-Venezia Giulia, with 133 occurrences (e.g. ‘Calle Leonardo da Vinci’, ‘Leonardo da Vinci's Alley’, in Lignano Sabbiadoro, at the border with Veneto). A second is Emilia Romagna: three occurrences, e.g. ‘Calle dei Campionesi’, ‘Campionesi's Alley’, located in the city centre, near the cathedral, in Modena. Overall, the geographic distribution of these terms and their restriction to contiguous regions strongly hint at their geo-linguistic origins in a dialect and the corresponding city acting as its place of origin.
A converse case involves terms lacking dialectal origins. For instance, we found tokens of the term ‘bastioni’, ‘bastions’, in the regions of Sicily, Veneto, Puglia and Calabria. Sicily and Calabria are contiguous, being the southernmost regions in Italy; however, Veneto is in the northeast and Puglia in the southeast of the country. A quick glance at the tokens’ distributions reveals that these terms are used for names of naval bastions in maritime cities (e.g. Syracuse, in Sicily). Though their distribution may be rare over the national territory, this fact only reveals that certain types of places are less common in Italian cities. Similarly, the rare ‘chiasso’, ‘nexus of alleys’, is found in two non-contiguous Italian regions, Abruzzo and Tuscany, whose cities still feature alley networks characterizing their medieval city centres. Such generic terms are almost obsolete in modern Italian. However, they are still part of Italian's toponomasticon as a ‘trace’ of these places’ historical roots. Thus, the generic terms making up the urbanonyms for these places may be rare Italian terms, but not necessarily dialectal in origin.
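The contiguity criterion applied to ‘calle’, ‘bastioni’ and ‘chiasso’ can be expressed as a connectivity check over a graph of bordering regions. The following Python sketch is only an illustration under stated assumptions: the adjacency list is partial and hand-written (it is not derived from the gazetteer data), the function names are hypothetical, and Sicily and Calabria are treated as contiguous, following the text. The region sets for each term are those reported above.

```python
from collections import deque

# Partial, illustrative list of bordering regions (assumption: only the
# borders needed for the three examples are listed).
BORDERS = {
    ("Veneto", "Friuli-Venezia Giulia"),
    ("Veneto", "Emilia Romagna"),
    ("Tuscany", "Emilia Romagna"),
    ("Sicily", "Calabria"),  # treated as contiguous, as in the text
    ("Calabria", "Basilicata"),
    ("Puglia", "Basilicata"),
    ("Abruzzo", "Molise"),
}


def neighbours(region):
    """Regions sharing a listed border with `region`."""
    return {b if a == region else a for a, b in BORDERS if region in (a, b)}


def is_contiguous(regions):
    """True if every region can be reached from any other without
    leaving the set (breadth-first search restricted to the set)."""
    regions = set(regions)
    if not regions:
        return False
    start = next(iter(regions))
    seen, queue = {start}, deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbours(current) & regions:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen == regions


# Regions of attestation as reported in the text.
attested = {
    "calle": {"Veneto", "Friuli-Venezia Giulia", "Emilia Romagna"},
    "bastioni": {"Sicily", "Veneto", "Puglia", "Calabria"},
    "chiasso": {"Abruzzo", "Tuscany"},
}

# Candidate local terms: fewer than five regions, all contiguous.
local_candidates = sorted(
    term for term, regs in attested.items()
    if len(regs) < 5 and is_contiguous(regs)
)
print(local_candidates)  # → ['calle']
```

Under these assumptions only ‘calle’ passes both conditions, which matches the manual classification described above; a rare term with a scattered distribution, such as ‘bastioni’, is filtered out.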
Two other key results emerging from our analysis involve the terms ‘sottoportico’, ‘underporch’, and ‘vicolo’, ‘alley’. For the first term, we found the variants ‘supportico’ in Naples and surrounding cities (Campania) and regions (Molise and Basilicata), and ‘sotoportego’ in Venice and surrounding cities (Veneto). Both versions can be traced back to their respective dialects (Neapolitan, Venetian) because of their spelling forms. Interestingly, Italian ‘sottoportico’ is rarely found outside Northern regions (Trentino, Lombardia), though porches covering aisles are common in most Italian cities, as places allowing sheltered evening strolls. For ‘vicolo’, we have a nuanced picture, as Figure 4 shows.

Figure 4. Data visualization of the local synonyms of ‘alley’ (map created with datawrapper v.1.25.0; Lorenz et al., 2012).
First, the alternation of ‘vicolo’ with the more archaic ‘vico’ may be traced to their origin from Latin ‘vicus’ (Gasca Queirazza et al., 1990). Second, there exist several regional or even city-bound varieties of this term that also involve nuanced semantic differences with respect to the basic term. For instance, Genoese ‘crosa’ and ‘crosino’ are respectively ‘alleys’ and ‘small alleys’, usually with crimson floor tiles, leading to the main square of a quarter. Again, both terms are modern assimilated forms of Genoese ‘crêusa’. Venetian ‘calle’ refers to the narrow alleys of Venice and neighbouring cities, which nonetheless act as main streets in Venice. ‘Rua’ can be found in Ascoli Piceno (Marche) and in the Molise and Campania regions. It describes alleys criss-crossing the city centre; the roots of this term lie in the Latin ‘ruga’, ‘urban street’. Florentine ‘viuzzo’ is a diminutive form of ‘via’, ‘street’ (see also ‘viucola’, ‘viuccio’, ‘viucciolo’), but refers to the ascending alleys of Florence's centre. Thus, these synonyms for ‘vicolo’ reflect the importance of alleys in Italian cities and their respective dialects.
The city of Bozen/Bolzano and its province represent a German- and Ladin-speaking enclave in the Trentino-Alto Adige region (Cordin, 1997). The urbanonyms and other toponyms in this province are all bilingual, as one can see from the Bozen/Bolzano case. They thus present another distinctive case. One can find the following German words as generic terms only within the urbanonyms of this city and its province (Blanco, 2006): ‘Platz’, ‘square’; ‘Straße’, ‘street’; ‘Allee’, ‘avenue’; ‘Gasse’, ‘alley’; ‘Weg’, ‘path’. These terms co-exist with their Italian counterparts. Thus, one can find ‘Garibaldistraße’ and ‘Via Garibaldi’, ‘Garibaldi Street’, only in Bozen. Bozen's German generic terms qualify as ‘exonym’ (i.e. foreign) generic terms of Italian, but as ‘endonym’ (i.e. internal) terms of German (Stolz and Warnke, 2018).
Note that generic terms of Ladin origins seem rarely attested in Trentino-Alto Adige urbanonyms, apart from ‘streda’, ‘street’, and ‘puent’, ‘bridge’ (to be compared with ‘puint’, ‘bridge’, as in ‘puint dal diàul’, ‘devil's bridge’, in Cividale del Friuli, Udine). Ladin is generally used in Trentino-Alto Adige alpine valleys, and is harder to find in urban centres (Salvi, 2016). We thus conjecture that we found few urbanonyms with Ladin origins due to this specific geographical distribution of the language. Overall, we can conclude that local dialects (e.g. Venetian, Neapolitan) have offered several generic terms to the toponomasticon of Italian. We can also observe that other languages (i.e. German) are used in some regions of Italy, and thus that their generic terms can also be found and used on Italian territory. For our purposes, however, only the dialectal terms play a role in our further analysis.
In the third part, we focused on the synonym set of ‘vicolo’ to investigate how speakers may (or may not) access dialectal origins in generic terms. The retrieved terms were presented in the experimental setting described in the second section. The experimental stimuli (see also Figure 4) were (in alphabetical order) ‘arruga’, ‘calle’, ‘contrà’, ‘crosa’, ‘piaggia’, ‘rua’, ‘ronco’, ‘tresanda’, ‘venula’, ‘vico’, ‘vicolo’, ‘viuzza’ and ‘viuzzo’. Fillers, which also act as a control group, included local generic terms (‘borc’, ‘salizada’, ‘sotoportego’, ‘fondamenta’, as a term used for storage rooms rather than buildings’ foundations; see Table 1) and five of the most frequent terms in Italian (‘corso’, ‘largo’, ‘piazza’, ‘via’, ‘viale’; Ursini & Samo, 2023). Participants (N = 80) accessed the link via social media. We present the results for all items, experimental stimuli and fillers, in terms of mean and standard error in Figure 5.

Error bar chart of every tested lexical item (experimental stimuli and fillers); alley-sin = synonyms of alley (experimental stimuli), dialect and Italian represent the control group items (fillers).
Overall, ‘crosa’ was rated as the most dialectal item in our sample (M = 1.13, SD = 0.50), followed by ‘rua’ (M = 1.21, SD = 0.61) and ‘tresanda’ (M = 1.23, SD = 0.71). Venetian ‘calle’ (M = 1.85, SD = 1.32) shows the highest standard deviation among the items rated as clearly dialectal: although it was generally evaluated as dialectal, participants varied considerably in their ratings. A set of lexical items derived from the same stem as ‘via’ (‘viella’, ‘viuzzo’, ‘viuzza’) and the aforementioned dichotomy of ‘vico/vicolo’ triggered less definite but still clear responses. ‘Viella’, ‘viuzzo’ and ‘viuzza’ can be considered morphological compounds in the standard language, since they are created by merging the stem of ‘via’ with dedicated morphemes (the diminutive suffixes ‘-ello/a’ and ‘-uzzo/a’, encoding ‘little’). Our participants evaluated ‘viella’ as the most dialectal of this sub-set (M = 2.13, SD = 1.33). Asymmetries arise with respect to ‘viuzzo’ (M = 2.26, SD = 1.28) and ‘viuzza’ (M = 3.13, SD = 1.50). We believe that ‘viuzza’ is rated as ‘more Italian’ than ‘viuzzo’ since it is the standard diminutive form of ‘via’: one can find the former but not the latter form in a dictionary (e.g. De Mauro, 2020).
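As a rough illustration of how error bars relate to the figures reported above, the standard error of each mean can be recovered from its standard deviation as SE = SD/√N. The Python sketch below assumes that all 80 participants rated every item, which the text does not state explicitly; the names `REPORTED`, `N_PARTICIPANTS` and `standard_error` are ours and purely illustrative.

```python
import math

# (mean, standard deviation) per item, as reported in the text above.
REPORTED = {
    "crosa": (1.13, 0.50),
    "rua": (1.21, 0.61),
    "tresanda": (1.23, 0.71),
    "calle": (1.85, 1.32),
    "viella": (2.13, 1.33),
    "viuzzo": (2.26, 1.28),
    "viuzza": (3.13, 1.50),
}

# Assumption: every one of the 80 participants rated every item.
N_PARTICIPANTS = 80


def standard_error(sd, n):
    """Standard error of the mean, given a sample SD and sample size."""
    return sd / math.sqrt(n)


standard_errors = {
    item: standard_error(sd, N_PARTICIPANTS) for item, (_, sd) in REPORTED.items()
}

for item, (mean, sd) in REPORTED.items():
    print(f"{item:<9} M = {mean:.2f}  SD = {sd:.2f}  SE = {standard_errors[item]:.3f}")
```

Under this assumption the standard errors stay small relative to the spacing between items, which may explain why items with fairly large standard deviations can still be ranked with some confidence.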
Finally, we detected asymmetries with respect to ‘vico’ (M = 3.05, SD = 1.57) and ‘vicolo’ (M = 4.78, SD = 0.63), with ‘vicolo’ evaluated as more ‘standard Italian’ than ‘vico’. As in the case of the ‘viuzzo/viuzza’ pair, ‘vicolo’ is attested as the standard form for this word in dictionaries (e.g. De Mauro, 2020). The score thus reflects this interpretation of the term. Overall, what emerges is that the two groups are statistically different (t test), with ‘vicolo’ considered more standard than ‘vico’ (t(158) = 9.2047, p < 0.0001). Furthermore, no apparent correlation between the ratings and the age or sex of the participants has been detected (see Table A in the Appendix; models created with the Python module ‘statsmodels’; Seabold and Perktold, 2010). Frequency of use could be a factor, and could perhaps reveal a general trend. We explored this possibility by querying Google N-grams (see Michel et al., 2011) in the R environment (R Core Team, 2021) with the package ‘ngramr’ (Carmody, 2021) for the Italian corpus ‘ita-2019’. We considered 1900 as the starting date for the investigation. All code is available in supplementary file D. Figure 6 summarizes the results. ‘Vicolo’ is more frequent than ‘vico’, with a growing distance in recent decades. As we have exhausted the analysis of the relevant data, we turn to discussion and conclusions.
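The ‘vico’/‘vicolo’ comparison reported above can be approximately re-derived from the summary statistics alone. The Python sketch below assumes an independent two-sample Student's t test with equal variances and 80 ratings per term, which is consistent with the reported 158 degrees of freedom but is not spelled out in the text; the function name `pooled_t_from_summary` is ours. Because it starts from rounded means and standard deviations, it can only approximate the reported statistic.

```python
import math


def pooled_t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Student's two-sample t statistic (equal variances assumed),
    computed from summary statistics. Returns (t, degrees of freedom)."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se_diff, df


# Reported values: 'vicolo' (M = 4.78, SD = 0.63), 'vico' (M = 3.05, SD = 1.57).
# Assumption: 80 ratings per term, one from each participant.
t_stat, df = pooled_t_from_summary(4.78, 0.63, 80, 3.05, 1.57, 80)
print(f"t({df}) = {t_stat:.2f}")
```

With these rounded inputs the sketch yields a t value of roughly 9.15 on 158 degrees of freedom, close to the reported 9.2047; the small gap is plausibly due to rounding in the published means and standard deviations.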

Figure 6. Google N-grams frequencies of ‘vico’ and ‘vicolo’ in the Italian corpus ‘ita-2019’.
Discussion and conclusions
We believe that our results invite discussion of four key points. First, our choice of OSM for a dialectological and geolinguistic study seems justified on two grounds: we have been able to extract a larger sample than previous studies (e.g. Ursini & Samo, 2022b); and the VGI philosophy underpinning this platform guarantees the local, grassroots origins of the data (see Sui and Goodchild, 2011). Our findings are thus methodologically consistent with previous work in Italian toponomastics (e.g. Cantile, 2013; Granucci, 2004). However, they are based on a novel, online methodology potentially operating at very fine-grained resolution levels (e.g. tiny urban places). We can thus reach our first goal, i.e. an answer to the where question. Cities, villages and their alleys are the urban places where one can find toponyms with generic terms of dialectal origin. This result confirms previous findings in Italian toponomastics (from e.g. Calafiore, 1975 to Cassi, 2015). However, we innovate by mapping distributional patterns at a multi-scalar level, i.e. on the local, regional and national maps of Italy.
Second, our lexicographic analysis shows what kinds of dialectal terms can be found in each region and city potentially expressing a dialect. Arguably, former capitals of pre-unification kingdoms such as Florence, Naples and Venice featured dialectal varieties that played an important role in their respective regions of influence (see Berruto, 2012; Chiappinelli, 2013). Such influence is reflected in the dozens of generic term types and thousands of tokens they contributed to our database (see also Massarelli, 2012). Equally clear, however, is the contribution of regions and zones in which non-Italian languages and dialects are spoken (e.g. German and Ladin in Trentino; Blanco, 2006; Cordin, 1997). The several synonyms for ‘vicolo’ offer clear-cut evidence: Naples and its Neapolitan dialect contribute ‘viella’ but also ‘pallonetto’, a term for the networks of alleys found only in this city (see Doria, 1982). Genoa, Genoese and ‘crosa’ are another example: crimson alleys in the city's quarters seem unique to this city. Locality and dialectal roots in Italian urbanonyms are phenomena distributed at a national level; each Italian urban centre and dialect can potentially contribute a local name for a local place.
Third, the questionnaire reveals that speakers tend to be aware of the local origins of these terms. A general trend across age groups and sexes is that participants can glean the roots of generic terms that originate in local dialects. However, participants did not offer any further comments indicating whether they knew more about these origins. For instance, for most participants ‘calle’ was clearly a dialectal term, and Venetian participants would associate it with their dialect. However, no participant offered an explanation of what type of place ‘calle’ refers to. These data thus suggest that participants can have intuitions about the diachronic, dialectal origins of these terms even though they may not assign these terms detailed definitions.
We thus have evidence that dialects have lent generic terms in urbanonyms to Italian. This is not a rare case, since local dialects often act as lexical sub-strata to national languages (see Scott, 2016; Wiereck, 2006). We also have evidence that participants could understand that these terms are part of the Italian toponymic grammar and lexicon (see Nübling et al., 2015; Rezsegi, 2020a, 2020b). Participants had no trouble answering questions about the terms' dialectal origins, or completing questionnaires in which we posed these questions (see also Ursini & Samo, 2020a). We can thus conclude that our lexicographic study and questionnaire reach our second and third goals, respectively. The kinds of toponyms that have dialectal origins are those including generic terms describing geo-located places, as the lexicographic study shows (second goal, the what question). Although participants can glean their dialectal, geo-located origins, they use these terms as standard words of Italian ‘toponymic grammar’ (third goal, the how question).
Before we move to the conclusions, we offer a compact theoretical observation on generic terms that our data seem to support. Our experimental results suggest that speakers may lack detailed intuitions about generic terms’ senses and etymology. In this regard, participants differ from geographers and toponomasticians in the use of these terms (see Blair and Tent, 2021; Cassi and Marcaccini, 1998; Derungs and Purves, 2016), at least with respect to our questionnaire. Some basic notions from lexicography and terminology can help us resolve this apparent contrast. Within terminology, a distinction is made between ‘general vocabulary’, ‘specialized vocabulary’ and ‘technical terms’ (Pearson, 1998). General vocabulary includes words used without resorting to domain-specific knowledge (e.g. ‘dog’ in a daily conversation). Specialized vocabulary usually involves this knowledge (e.g. ‘dog’ during a veterinary visit). Technical terms are uses of words that involve formal definitions in normative contexts (e.g. scientific texts defining ‘dog’ as a species; ten Hacken, 2008, 2015). Thus, words may have different definitions and uses in increasingly technical contexts.
Our questionnaire indirectly suggests that speakers may use generic ‘terms’ as either general or specialized vocabulary words. Again, speakers could guess the dialectal, geo-located origins of the tested terms. Participants could thus access at least one aspect of these words that qualifies as a specialized use. No participant, however, provided definitions of the senses of these terms, even when they were well-acquainted with them (see the ‘calle’ case). Thus, participants’ knowledge of these terms may not be as accurate as the definitions found in works such as Gasca Queirazza et al. (1990). Such definitions are mostly the province of geographers and dialectologists (see Derungs and Purves, 2016; Panocová and ten Hacken, 2017). Even though we have used ‘generic term’ as a label for this distinctive constituent of urbanonyms, our participants’ answers suggest that, for them, these are ‘geographical words’ with uses oscillating between general and specialized ones.
In conclusion, this article has offered an analysis of the dialectal generic terms/words of Italian urbanonyms. It has pursued three goals via a novel methodology based on the online, VGI-based gazetteer OSM. The first goal was to answer the question of where terms with dialectal roots can be found. The second goal was to answer the question of what kinds of toponyms have such roots. The third goal was to answer the question of how these words/terms are used within Italian grammar. The article has shown that terms/words with dialectal origins can be attested in any region of Italy, although they have a highly restricted distribution (answer to the where question). Such terms identify highly specific kinds of places, often found in specific cities or regions of Italy (the what question). The use of these terms follows the standard morpho-syntactic rules of Italian: participants in our study could easily answer questions about the terms (the how question). Notably, speakers may nonetheless be aware that these terms originate in different dialects and varieties, as our questionnaire has shown. The article thus offers novel evidence of the geographic interplay between languages and dialects. For other topics, however, we must await future research.
Supplemental Material
Supplemental material, sj-txt-1-foi-10.1177_00145858231190030 for Geographical maps meet place names where languages meet dialects: The case of Italian by Giuseppe Samo and Francesco-Alessio Ursini in Forum Italicum
Supplemental material, sj-xlsx-2-foi-10.1177_00145858231190030 for Geographical maps meet place names where languages meet dialects: The case of Italian by Giuseppe Samo and Francesco-Alessio Ursini in Forum Italicum
Supplemental material, sj-rar-3-foi-10.1177_00145858231190030 for Geographical maps meet place names where languages meet dialects: The case of Italian by Giuseppe Samo and Francesco-Alessio Ursini in Forum Italicum
Supplemental material, sj-txt-4-foi-10.1177_00145858231190030 for Geographical maps meet place names where languages meet dialects: The case of Italian by Giuseppe Samo and Francesco-Alessio Ursini in Forum Italicum
Supplemental material for this article is available online.
Appendix
Table A. Mean and SD for every group of the experiment. M = male, F = female, NB = non-binary.
| Age | Sex | N | M_vico | SD_vico | M_vicolo | SD_vicolo | SE |
|---|---|---|---|---|---|---|---|
| 18–30 | M | 22 | 3.636364 | 1.255292 | 4.772727 | 0.528413 | 0.380287 |
| 18–30 | F | 6 | 2 | 0.632456 | 4.833333 | 0.408248 | 0.424866 |
| 18–30 | NB | 1 | 2 | 0 | 5 | 0 | 0 |
| 30–40 | M | 10 | 3.1 | 1.595131 | 4.9 | 0.316228 | 0.604425 |
| 30–40 | F | 4 | 2 | 1.414214 | 5 | 0.316228 | 0.865221 |
| 30–40 | NB | 1 | 5 | 0 | 5 | 0 | 0 |
| 40–50 | M | 7 | 3.857143 | 1.069045 | 5 | 0 | 0.404061 |
| 40–50 | F | 7 | 3.285714 | 2.13809 | 4.571429 | 0.316228 | 0.927645 |
| 50–60 | M | 2 | 4 | 1.414214 | 5 | 0 | 1 |
| 50–60 | F | 8 | 2.125 | 1.642081 | 4.5 | 0.92582 | 0.90789 |
| 60–70 | M | 5 | 2.4 | 1.341641 | 5 | 0 | 0.6 |
| 60–70 | F | 5 | 1.8 | 1.788854 | 4.4 | 1.341641 | 1.4 |
| 70older | F | 2 | 5 | 0 | 5 | 0 | 0 |
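As a simple consistency check, the overall means reported in the main text can be approximately recovered from the table above by weighting each group mean by its group size. The Python sketch below transcribes the N, M_vico and M_vicolo columns as printed; the helper `pooled_mean` is ours and purely illustrative.

```python
# (N, M_vico, M_vicolo) for each row of the table above, in the same order.
ROWS = [
    (22, 3.636364, 4.772727),
    (6, 2.0, 4.833333),
    (1, 2.0, 5.0),
    (10, 3.1, 4.9),
    (4, 2.0, 5.0),
    (1, 5.0, 5.0),
    (7, 3.857143, 5.0),
    (7, 3.285714, 4.571429),
    (2, 4.0, 5.0),
    (8, 2.125, 4.5),
    (5, 2.4, 5.0),
    (5, 1.8, 4.4),
    (2, 5.0, 5.0),
]


def pooled_mean(rows, column):
    """Mean over all participants, weighting each group mean by its size."""
    total = sum(row[0] for row in rows)
    return sum(row[0] * row[column] for row in rows) / total


total_n = sum(row[0] for row in ROWS)
mean_vico = pooled_mean(ROWS, 1)
mean_vicolo = pooled_mean(ROWS, 2)
print(f"N = {total_n}, M_vico = {mean_vico:.4f}, M_vicolo = {mean_vicolo:.4f}")
```

The group sizes sum to 80 and the pooled ‘vico’ mean comes out at about 3.05, in line with the main text. The pooled ‘vicolo’ mean comes out at about 4.79, marginally above the reported 4.78, a difference that may reflect rounding in the table.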
References
