Abstract
This study examines the linguistic and textual features of Patient Information Leaflets (PILs) in Italian, focusing on their readability, comprehensibility, and effectiveness. PILs, essential documents accompanying medications, are regulated by stringent legal standards to ensure safety and usability. Despite their critical role, PILs often fail to meet the needs of diverse patient demographics, particularly in terms of accessibility and understanding. The analysis identifies structural, lexical, and syntactical challenges inherent in PILs, highlighting the tension between their dual function as informational tools for patients and precautionary legal instruments for pharmaceutical companies. Drawing on a corpus-based methodology, the research analysed 120 PILs across various medical domains (e.g., cardiovascular, dermatological, gastrointestinal), assessing parameters such as word count, syntactic complexity, and the use of specialized terminology. Key findings reveal a lack of uniformity in language use, with frequent reliance on medical jargon, collateral technicalisms, and nominalizations that hinder comprehensibility. Additionally, readability indices such as GULPEASE and READ-IT indicate significant challenges for audiences with limited education, as many PILs scored below thresholds for adequate comprehension. The study concludes by addressing the sociolinguistic implications of health communication, recognizing PILs as pivotal to bridging gaps in health literacy and fostering equitable access to medical information. This research contributes to the field of medical linguistics by providing insights into the challenges of specialized communication and proposing strategies for enhancing the usability of health-related texts.
Keywords
The linguistic-textual features of the package insert
The patient information leaflet (PIL) is an official document contained in the packaging of a drug; it follows a specific editorial structure regulated by law and contains information about the product. In terms of the textual dimension, the PIL presents some sections that are typically regulatory in nature (precautions, contraindications, method of administration, storage of the product, etc.) and others that are informative (composition, undesirable effects, etc.), because it addresses two different audiences: experts and non-experts (Felici, 2020: 384). According to the Italian Medicines Agency, the PIL constitutes a dynamic text that is regularly updated: since every medicine is subject to changes over the course of its life, albeit with varying frequency, the RCP [Riassunto delle Caratteristiche del Prodotto, Summary of Product Characteristics] and the FI [foglietto illustrativo, the PIL] constitute a ‘dynamic’ document that is constantly updated. An update may be requested directly by the company holding the AIC [Autorizzazione all’Immissione in Commercio, marketing authorisation] (for example, following a new safety report) or by the competent European and/or national authorities, since both parties are required to monitor continuously the safety and efficacy of medicines on the market. (Agenzia Italiana del Farmaco, n.d.)
As mentioned above, the PIL follows strict drafting criteria imposed by law but often remains incomprehensible to recipients (Giumelli, 2013: 160).
In fact, as far as the comprehension dimension is concerned, most research has focused on the structure of PILs at the level of marketing and product communication (Vicentini, 2012), without necessarily assessing the functionality and impact of the content. Moreover, even in Italy systematic surveys are in this regard scarce and fragmentary (Gualdo and Telve, 2021: 296).
Although the regulations refer repeatedly to the comprehensibility of PILs, the aspect is often disregarded, or no shared definition of comprehensibility is insisted upon: the PIL ‘must be noticed, read, understood, believed and remembered’ (Ley, 1992: 23). In particular, the investigations of Garner, Ning and Francis (2012) highlight how even research in the language sciences often underestimates content comprehension and standardizes it in a biased manner. More precisely, if we measure only reading performance (readability) for a PIL, even with a very large sample of informants, the statistical value will be highly flawed and lack predictive relevance, since ‘readability scores provide information about the surface of the text, but [do] not directly provide information about the comprehensibility of the text. [They] do not acknowledge the specific needs of the target readers’ (Pothier et al., 2008: 43). Sample comprehensibility surveys and readability indicators are therefore only partial means of verifying the comprehensibility of PILs, but they allow us to trace a trend. In this regard, the Italian Medicines Agency reports that: the package leaflet is intended for the patient and describes, in clear and easily understandable language, the information contained in the Summary of Product Characteristics (RCP) for the safe and correct use of the medicine. Since the leaflet addresses an audience that is heterogeneous in terms of education and social background, it must be suitably tested before approval to verify its readability. (Agenzia Italiana del Farmaco, n.d.)
In addition to the above, it should be recalled that the characteristics of individuals who use PILs are widely taken up and addressed within the regulations, with particular reference to the elderly, adolescents and children, and it is possible to note that the potential vastness of the audience towards which a drug is indicated should de facto increase the comprehensibility of the respective PIL (Cortelazzo, 2008). In this regard, Mayberry and Mayberry (1996) recommend specializing the concept of comprehensibility with respect to:
(1) the readability of language, understood as clarity of exposition, syntax and terminological and syntagmatic choices;
(2) readership comprehension, by means of large-scale sample testing;
(3) the long-term response, including memory, of patients regarding pharmacological and pharmacodynamic indications.
Regarding the first point, medical specialist language presents a highly stratified lexicon with several cases of allomorphy and suppletion: ‘the primary cause of this phenomenon is the competition between Greek and Latin in the formation of medical terms’ (Giumelli, 2013: 161). To this must be added the mediation of French for xenogrecisms and xenolatisms (e.g. cirrosi, flebite; Gualdo and Telve, 2021: 288) and the learned and popular traditions with respect to the passage of forms from Latin to Italian (the origin of several cases of allotropy), e.g. fegato/epatico, cervello/cerebrale, etc.
Another special case concerns the semantic oscillation of collateral technicalisms (Serianni, 2012), i.e. those expressions, characteristic of special languages, related to textual, diastratic and diaphasic needs and not to connotative needs, often semantically and grammatically restated or specialized (e.g. interessare, episodio, trattare, importante, riguardare, responsabile, a carico di, accusare, evento, rischio, in fase di, in sede di). Some collateral technicalisms are also employed to avoid tabooed forms related to the scatological sphere or death (e.g. esito infausto, enuresi notturna). Collateral technicalisms represent one of the greatest challenges for understanding medical texts. Specific technicalisms are necessary elements and are often semantically opaque (e.g. periduttale), so readers recognize them as technical; collateral technicalisms, by contrast, appear more intuitive because they are already present in common language (e.g. apprezzare), yet they carry a specialized meaning that a lay reader may miss.
As is the case in many specialized languages (among others, Gotti, 1991), the language of medicine has several borrowings from English (e.g. mottling, stroke; Onorato and Russo, 2017) and words formed by prefixes or suffixes (e.g. geriatria) (Serianni, 2012). It should be noted that the loans do not affect comprehensibility for a wide audience, if regularly acclimated in the language (e.g. day hospital), due to their attestation in hospital facilities and mass media communication (Gualdo and Telve, 2021: 296). The interest in specialized medical language, on the part of the public, has certainly contributed to the acclimatization and diffusion of loanwords into the common language.
The use of specialized suffixes and prefixes makes it possible to ‘express in a single term more features of the concept it designates […], thanks to the correspondence created between conceptual categories and lexical forms’ (Magris, 1992: 30). Further complicating medical terminology are the many points of contact with specialized disciplines or neighbouring fields, as is the case with psychology (e.g. resilienza, bucare un trattamento), pathology (e.g. endometriosi, nicturia, osteosarcoma), physiology (e.g. omeostasi, cuneiforme), pharmacology (e.g. eccipiente), biochemistry (e.g. ossalato) and diagnostics and research methodology (e.g. glucometria, specillo). Two other processes characteristic of medical language concern eponymy (e.g. sindrome di Guillain-Barré), beginning in the 19th century (Morgana, 1984), and the use of acronyms (e.g. AAG – anemia aplastica grave), also borrowed from foreign languages (e.g. TIA – transient ischaemic attack). In this regard, Serianni (2005: 213–214) notes that acronyms are used in medical texts aimed at subject specialists (primary texts) rather than in those aimed at a general audience, since abbreviations would make the content significantly more opaque to the lay reader. Abbreviations, on the other hand, respond to precise needs for synthesis, typical of a great many specialized languages (Ballarin and Nitti, 2020).
With regard to morphosyntax (among others, Gualdo and Telve, 2021; Serianni, 2005), the language of medicine exhibits extensive use of nominalizations and nominal phrases (e.g. patogenesi, riferimenti all’eziologia in […]), reduction of verbal uses to the indicative, use of passive and passivating forms, singularis pro plurali (e.g. informare il medico o il farmacista) and the extensive use of relational adjectives and strong collocations (e.g. ascesso perianale, rischio emorragico).
As the language of medicine is a specialized language, the high monoreferentiality and specialization of terms (Hall, 2013) has contributed, in relation to the high rate of popular interest (Nitti, 2022), to the formation of an internal diglossia characterized by both formal and colloquial forms (e.g. fuoco di Sant’Antonio/herpes Zoster). On the other hand, ‘medicine […] more than other disciplines is of interest to laypeople and hence its strong variability in a diaphasic sense’ (Giumelli, 2013: 161).
As seen above, the second and third points discussed by Mayberry and Mayberry (1996) refer to sample testing and the retention of information in memory.
Scholarly considerations are significant in terms of health literacy (among others, Berkman et al., 2011; Zotti, 2016). The concept refers to issues that are not strictly linguistic but social: ‘health literacy, as a term first proposed in the 1970s, generally concerns whether an individual is competent with the complex demands of promoting and maintaining health in the modern society’ (Liu et al., 2020: 1). The definition proposed by Gualdo and Telve, on the other hand, is based on the obvious effects of the lack of such competence: in particular, health literacy indicates the ‘disparity in citizens’ knowledge and attitudes toward medicine’ (Gualdo and Telve, 2021: 311).
According to Piemontese, the contents of a communicative message are more easily retained in memory if the individual can nimbly ‘revise, integrate, correct and reinforce the ideas possessed with those he or she derives from other sources’ (Piemontese, 1996: 139). The scholar dwells on the fact that readability often does not coincide with comprehensibility, as the latter does not refer purely to the formal aspects of a text but is pragmatic in nature (Piemontese, 2003). In this regard, it is worth recalling that De Beaugrande and Dressler's (1994) indicators implicitly include comprehensibility within the intentionality and acceptability of a text, elements favoured by coherence and cohesion but strictly pragmatic-textual in nature.
Regarding the textual dimension of PILs, Cortelazzo (1994) proposes that they should be included within secondary texts, as they are produced (also) for a non-expert audience, as opposed to primary texts, which are designed exclusively for medical professionals. On a regulatory level, in fact, the PIL is an official document, subject to a high rate of standardization, used to accompany drugs, regulated by European Directive 92/27/EEC, implemented in Italy through Decreto Legislativo n. 540 of 30 December 1992 and updated through Decreto Legislativo n. 219 of 24 April 2006. Both the European Medicines Agency (EMA) and the Agenzia Italiana del Farmaco (AIFA) are concerned with assessing the readability of PILs, but: leaflets approved by the EMEA or, at the national level, by the Ministry of Health, have two distinctly different communication styles. In the former, the contents are articulated according to a question/answer scheme orientated on possible user queries and are presented in a simple and understandable dictation aimed at the addressee. Secondary implications such as side effects are treated to an essential extent, inviting an interview with the doctor. In the latter, on the other hand, information is both more complex and abundant (and also responds in part to the communication needs of pharmaceutical companies). (Gualdo and Telve, 2021: 313)
Therefore, despite the standard text template, pharmaceutical companies can include additional elements, especially where these are useful for patients (Gualdo and Telve, 2021).
Giumelli notes how the drafting of PILs cannot ‘disregard the dual nature of these texts, of a bureaucratic document on the one hand and an informational tool on the other’ (Giumelli, 2013: 163). A further consideration should be added to this statement: the informational text concerns both the public and the professional community, and as for the last case, ‘the PIL performs both an informational function towards the patient and a precautionary function for the pharmaceutical company’ (Giumelli, 2013: 163). In addition to the above, Puato's (2011, 2012) research allows us to distinguish two editorial styles for PILs: the notarial style and the communicative style. The difference between the two styles relates precisely to the dual informational function of the PIL and its intended audience. Generally, the prevailing dimension is cautionary, privileging medical and legal communication, to the detriment of product communication with the general public (Iovino and Orletti, 2018; Puato, 2012). The issue of transparency and notoriety in the lexicon of medicine is far from simple, as dividing the public into experts and non-experts seems reductive and trivializing (Nitti, 2022). In this regard, Gualdo and Telve (2021: 294) proposed considering a continuum of expressions with a range of specialism from a minority of medical specialists to all speakers (Figures 1 and 2).
Research methodology
Based on the premises described in the previous section, a study was conducted with the aim of corroborating the considerations of previous investigations (Giumelli, 2013; Gualdo and Telve, 2021; Iovino and Orletti, 2018; Iovino et al., 2016; Puato, 2012) by means of a corpus-based analysis of a significant sample of PILs, conducted with software, in particular WordSmith. The research involved the collection and analysis of 120 PILs of various kinds, divided by sample area of interest: gut, heart, vagina, head and skin (Table 1).
It was decided, for the sake of synthesis, to focus the research on a few relevant aspects, querying the corpus for structure, number of words, presence of passive diathesis and passivating forms, terms with scientific denotation, internal repetitions of terms, presence of modal verbs, readability, presence of suffixes and prefixes typical of the specialized language of medicine (Gualdo and Telve, 2021) and presence of eponyms and borrowings. We chose to neglect other stylistic devices related to the specialized language of medicine, such as slang formations (e.g. bypassare, gestire la complessità, redigere un referto, biopsiare, etc.), which are not very representative of PILs but are present in other medical texts, especially primary ones (Gualdo and Telve, 2021: 290).

Notoriety and transparency of medical vocabulary (Serianni, 2007: 14, in Gualdo and Telve, 2021: 294).

Examples of the use of in caso di and a carico di.
PIL structure and features
As seen in the previous paragraphs, the structure of the PILs has little room for variation, as they are regulated by law. The various information blocks of the PILs present the basic information related to the drug and, in particular, the name of the drug (top and bold), therapeutic indications, drug composition, contraindications (pregnancy, lactation, alcohol intake, etc.), precautions/warnings, interactions, dosage and overdose, side effects, warnings and, finally, expiration and storage.
It is noticeable that the PILs of over-the-counter drugs tend to reformulate learned and complex syntagmas in plainer terms (e.g. forma farmaceutica > come si presenta; controindicazioni > quando non deve essere usato, etc.). Carducci's (2008) research shows that an audience with an average high school diploma is substantially unable to comprehend the texts of PILs, with greater difficulty in therapeutic indications, overdose and undesirable effects. Compared to Carducci's survey, this study shows different results, as will be seen later. The sample surveyed shows little variation in structure (4%); the variations mainly serve product promotion and the inclusion of additional information (e.g. subjects who can sell the drug).
The number of words
A second section of the survey was devoted to counting the words in the PILs. The objective was to ascertain, first, how many words on average were contained within the PILs and, second, whether there were variations depending on the scope of the drugs.
As can be seen in Chart 1, most of the PILs contain between 2001 and 3000 words. Quite high is also the number of PILs containing between 1001 and 2000 and between 3001 and 4000 words. Lower, however, is the number of PILs containing between 0 and 1000 words, while only 13 contain more than 4000 words. The only PIL with more than 6000 words is that of the Nuvaring contraceptive.

Word count.
Regarding the breakdown by sphere of interest, significant fluctuations are observed in the number of words. As can be seen in Table 2, the PILs with the least number of words are those in the groups of PILs used for skin and intestinal problems. In fact, most of the PILs in both groups do not exceed 2000 words, and only one PIL per group has more than 3000 words. In the case of the group of PILs related to skin drugs, it is the PIL of the drug Cibinqo, for the treatment of atopic dermatitis, while in the case of the gut group, it is the PIL of the drug Eparema, for the treatment of constipation.
Drugs analysed according to type.
Word count in relation to group.
The other three groups, however, have on average the same number of words, ranging from 2001 to 4000. The variation in word count is determined by both the severity and condition of the patient and the degree of responsibility attributed to him or her. In fact, as we have seen, the purpose of a PIL is not only to inform patients about the correct use of the drug but also to protect the manufacturing companies from the potential risks involved in taking it: the text of the package insert responds to different communication needs that do not always manage to coexist harmoniously alongside one another: the precautionary need of the manufacturing company must in fact combine with the informative-descriptive (and partly promotional) need characteristic of the package insert, addressed to the specialist on the one hand and to the ordinary citizen on the other. (Gualdo and Telve, 2021: 312)
In terms of research, it is observed that the PILs that contain the most words are those related to drugs whose possible side effects may be more serious than others (all prescription drugs); these PILs clearly exhibit a more cautionary style than others.
Passive diathesis
The third part of the analysis concerned the passive diathesis and passivating expressions. Data analysis (Table 3) shows that PILs of drugs related to the head tend to abound in passive diathesis, while passivating forms are found predominantly in PILs of drugs for the gut. In addition, PILs related to dermatological problems use the passive diathesis considerably less, and the form si osserva is preferred wherever a passivating form occurs, probably because dermatological problems are generally visible. In these cases, however, the verb osservare is not semantically restated or bent to the uses of languages for special purposes (LSP).
Occurrences of passive diathesis and passive forms.
Regarding the use of the modal verb potere (e.g. può essere preso), it can be noted that in many cases the responsibility for a possible consequence with respect to taking the drug is attributed to the doctor, the pharmacist or the patient themselves. The style, therefore, is cautionary, and the purpose is once again to protect the manufacturers from the potential risks arising from the use of the drug.
Terms with scientific denotation and repetition
The fourth section of the analysis involved searching for terms with scientific denotation in the 120 PILs. These terms were identified from the root morphemes, sometimes also aggregated or with parasynthetic outcomes: {tratt-}, {as-}{sum-}, {com-}{par-}, {manifest-}, {in-}{duc-} and {us-}.
Data analysis shows that paradigmatic and etymologically related forms of {us-} are the most common (46%), followed by {tratt-} (23%) and {as-}{sum-} (20%). In contrast, the paradigmatic and etymologically related forms of {manifest-} (8%), {com-}{par-} (2%) and {in-}{duc-} (1%) are less recurrent (Chart 2).
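The kind of query behind these percentages can be sketched as follows. This is a minimal illustration, not the WordSmith procedure used in the study: the stem list, the word-initial matching rule and the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stem list mirroring the root morphemes named in the text;
# prefixed roots (as-sum-, com-par-, in-duc-) are written as full stems.
STEMS = ["us", "tratt", "assum", "manifest", "compar", "induc"]

def stem_shares(text, stems=STEMS):
    """Return {stem: percentage of all stem hits}, matching word-initially."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())  # runs of letters only
    counts = Counter()
    for token in tokens:
        for stem in stems:
            if token.startswith(stem):
                counts[stem] += 1
                break  # count each token under one stem only
    total = sum(counts.values())
    if total == 0:
        return {stem: 0.0 for stem in stems}
    return {stem: round(100 * counts[stem] / total, 1) for stem in stems}

# Invented sample sentence, not taken from the corpus.
sample = ("Non usi il medicinale dopo il trattamento. "
          "Usare solo se il medico dice di assumere la dose.")
shares = stem_shares(sample)
print(shares)
```

Word-initial matching is deliberately naive: it would also count unrelated words such as uscire under {us-} and would miss allomorphs such as assunto, which is why a manually checked word list is needed in practice.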

Weighted average drug name repetitions (standard deviation equal to 22.55).
Drug names have a total of 5136 occurrences among the 120 drugs examined, for a weighted average of 41.13 repetitions per PIL. In addition, the graph shows that drugs concerning the heart have the highest number of occurrences, both in total and on average.
The most repeated words in the PILs are medicinal* (3714), medico (2638), sangue (1289), dos* (1274) and effett* (1200). Each of these words records an average of at least 10 occurrences per PIL, with medicinal* reaching 30.95 per PIL.
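The per-PIL averages follow directly from the totals, as this short worked example shows. It assumes a simple (unweighted) mean over the 120 PILs; the variable names are illustrative.

```python
CORPUS_SIZE = 120  # number of PILs reported in the text

# Total occurrences as reported in the text (asterisks mark truncated forms).
totals = {
    "medicinal*": 3714,
    "medico": 2638,
    "sangue": 1289,
    "dos*": 1274,
    "effett*": 1200,
}

# Simple mean per PIL, rounded to two decimals.
averages = {word: round(count / CORPUS_SIZE, 2) for word, count in totals.items()}

for word, avg in averages.items():
    print(f"{word:<12}{avg:>6.2f}")
```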
In principle, the repetitions do not correlate with the body part to which the drugs refer, with the exception of the word coagul* which records 319 occurrences and is used predominantly in PILs related to the vagina.
Modal verbs
Regarding modal verbs, the forms of dovere account for 1319 occurrences; the most frequent form is the third-person singular of the present indicative, deve, with 1041 occurrences. In addition to the present indicative, several present conditional and future indicative forms are also detected, especially in the phrases dovrebbe notare altri disturbi and dovrà rivolgersi al medico.
The verb potere, on the other hand, appears an average of 26 times per PIL, with a total of 3102 occurrences. The prevalent use of the various conjugated forms indicates the ways in which the drug is taken and the undesirable effects that might occur as a result of taking it.
Analysis of the data shows that the most frequently occurring syntagmas containing the modal potere are: potrebbe avere bisogno, potrebbe essere pericoloso, potrebbe essere necessario, potrebbe assumere, potrebbe andare incontro a gravi problemi, possono dare disturbi, possono formare, possono portare, possono presentare and possono interessare. The most commonly used form turns out to be the present indicative può, found mainly in PILs for migraine medication.
In addition, it is observed that the collateral expressions a carico di and in caso di are mostly used within contraindications.
While, as noted in Table 4, the technicalism a carico di is scarce or even absent in most PILs, the syntagma in caso di is attested in substantial numbers in all PILs, with an average of three repetitions per PIL.
Occurrences of a carico di and in caso di.
Analysis of the data shows that the collateral technicalism a carico di is in the minority in PILs, although it is probably well attested in other primary and secondary medical texts; the prevalence of in caso di, moreover, is fully justified by the regulatory text type and by the cautionary notarial style adopted by pharmaceutical companies.
Morphological analysis of suffixes and prefixes
Another section of the research involved analysing the use of the suffixes {-ite}, {-ism}, {-osi}, {-asi}, {-osio} and {-oma}. The suffix {-ite} denotes ‘an ongoing inflammatory process in a region of the body indicated in the base’ (Gualdo and Telve, 2021: 289) and is often preceded by a base of Greek origin (e.g. dermatite, bronchite). To indicate a pathology, the specialized language of medicine often employs the suffix {-ism}, paired with {-ista} to refer to the patient (e.g. alcolismo, alcolista).
The suffix {-osi}, on the other hand, indicates ‘a pathological condition of a regressive-degenerative type’ (Gualdo and Telve, 2021), e.g. necrosi. The suffixes {-asi}, {-osio} and {-oma} are attached to a lexical base and denote, respectively, enzymes, with the base indicating the substance on which the enzyme acts (e.g. lipasi), types of carbohydrate (e.g. ribosio) and tumour processes, with the base indicating the part affected (e.g. melanoma) (Patota, 1985).
Regarding the corpus-labelling step, words that could have biased the data (e.g. qualsiasi) were expunged. The most recurrent suffix across the PILs as a whole is {-asi}, with 1598 occurrences (Chart 3; Figure 3). The suffixes {-ism} and {-osio}, on the contrary, turn out to be the least frequent in the corpus.
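The labelling step described here, counting suffixed words while expunging false hits, might look like the sketch below. It is an illustration under stated assumptions: only qualsiasi is named in the text as an expunged word, so the other entries of the exclusion list, the minimum base length and the sample sentence are hypothetical.

```python
import re
from collections import Counter

# The text writes {-ism}; the word-final Italian form -ismo is assumed here.
SUFFIXES = ["ite", "ismo", "osi", "asi", "osio", "oma"]

# Exclusion list: 'qualsiasi' comes from the text, the rest are hypothetical
# examples of common words that end in a target suffix by accident.
EXCLUDED = {"qualsiasi", "quasi", "dosi", "diploma"}

def count_suffixes(text, suffixes=SUFFIXES, excluded=EXCLUDED):
    """Count tokens ending in each suffix, skipping excluded word forms."""
    counts = Counter()
    for token in re.findall(r"[^\W\d_]+", text.lower()):
        if token in excluded:
            continue
        for suffix in suffixes:
            # Assumed rule: require at least three letters of base before the suffix.
            if token.endswith(suffix) and len(token) > len(suffix) + 2:
                counts[suffix] += 1
                break
    return counts

# Invented sample sentence, not taken from the corpus.
sample = ("In caso di dermatite o necrosi, qualsiasi aumento "
          "della lipasi va riferito al medico.")
suffix_counts = count_suffixes(sample)
print(suffix_counts)
```

An explicit exclusion list keeps the query reproducible, but it has to be built by reading the concordance lines: no purely formal rule separates lipasi from qualsiasi.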

Presence of suffixes characteristic of medical language.

GULPEASE PIL vagina index.

GULPEASE PIL skin index.

GULPEASE PIL head index.

GULPEASE PIL heart index.
As for the analysis of prefixes, we focused on {para-}, {anti-} and {ana-}. As Chart 4 shows, words with the prefix {para-} are much more numerous than words with the prefixes {anti-} and {ana-}. As for labelling, again, false prefixes and words that did not pertain to the specialized language of medicine (e.g. paragrafo, paragonabile, anale, etc.) were expunged.
Chart 5 shows that some words prefixed with {anti-} are much more frequent within the PILs than others: antinfiammator*, antispasmin* and anticoagulant* are repeated far more often than terms such as antipsicotic* or antiepilettic*. Chart 6 likewise shows that the word analgesico is repeated more frequently than other terms prefixed with {ana-}, once false prefixes (e.g. anale) are expunged from the query.

Presence of prefixes within PILs.

Presence of words prefixed with {anti-}.

Presence of words prefixed with {ana-}.
Analysis of eponyms and presence of borrowings
The use of eponyms, as seen in the preceding paragraphs, is characteristic of specialized medical language. It involves an individual lending his or her name to indicate a syndrome, condition or disease. For the purposes of this research, therefore, the eponyms contained in the syntagmas sindrome di * and malattia di * were analysed. In both cases, the constructions (Masini, 2017) are to all intents and purposes considered polyrhematic, as the meaning of the expression differs from what would be obtained by summing the meanings of the individual words that compose it, and no material can be inserted within it (e.g. riflesso di Landau / riflesso * grave di Landau).
In terms of data analysis, the eponym sindrome di * (89%) appears to be more widely used in comparison with malattia di * (11%). The explanation for these percentages is related to the different function of syndrome and disease: while the eponym sindrome di * refers to a general clinical picture, the eponym malattia di * indicates a specific condition and for this reason appears to be less present. For precautionary reasons and based on the large audience of potential patients to whom the drug is administered, it is more reasonable to conceptualize the diseases by macro-sets.
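A wildcard query of the type sindrome di * can be approximated with a regular expression, as in the sketch below. The pattern, the requirement that the name start with a capital letter and the sample sentence are assumptions made for illustration, not the query actually run in the study.

```python
import re
from collections import Counter

# 'sindrome di X' or 'malattia di X', where X is a capitalized name,
# optionally hyphenated (e.g. Guillain-Barré).
EPONYM = re.compile(r"\b([Ss]indrome|[Mm]alattia)\s+di\s+([A-Z][\w-]+)")

def eponym_shares(text):
    """Return the percentage of eponyms headed by 'sindrome' vs 'malattia'."""
    heads = Counter(m.group(1).lower() for m in EPONYM.finditer(text))
    total = sum(heads.values())
    if total == 0:
        return {}
    return {head: round(100 * n / total) for head, n in heads.items()}

# Invented sample sentence, not taken from the corpus.
sample = ("Sindrome di Stevens-Johnson, sindrome di Raynaud e sindrome di Cushing "
          "sono citate più spesso della malattia di Crohn.")
print(eponym_shares(sample))
```

Requiring a capitalized name filters out non-eponymic uses such as sindrome di astinenza, at the cost of missing eponyms written in lower case.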
Leaving aside the invisible calques and translations from English (Grasso, 2007: 51), characteristic of many specialized languages, we focused on the presence of borrowings, analysing them according to the language of origin (Chart 7).

Origin of borrowings in the PILs.
In particular, Figure 4 shows that most of the borrowings are of Latin origin (e.g. tinea), which is followed by English (e.g. blister) and Greek (e.g. pityriasis).

GULPEASE PIL intestine index.

PILs readability index through READ-IT.
PIL readability
Readability indices are valid and effective tools for analysing the readability of the texts examined, although they need to be juxtaposed with other types of investigation to check the comprehensibility of a text. As seen above, readability allows a trend to be identified, but it does not represent an absolute value, since individual specificities and experience concur in determining comprehension. However, tracking readability is significant for delineating trends, with obvious repercussions for effective textual comprehension by a wide audience (Dyda, 2020). To analyse the readability of the PILs, the GULPEASE index (Lucisano and Piemontese, 1988) was used. This tool, despite being dated and subject to criticism (Venturis, 2022), allows us to explore the readability of an Italian-language text by considering two variables: word length, measured in letters, and sentence length, measured in words (Benjamin, 2012). Specifically, the algorithm (Piemontese, 1996: 100), based on Flesch's formula, is described with the formula: GULPEASE = 89 + (300 × number of sentences - 10 × number of letters) / number of words.
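As a minimal sketch, the index can be computed as below, assuming the commonly cited form of GULPEASE, 89 + (300 × sentences - 10 × letters) / words. The tokenization and sentence-splitting rules are simplifying assumptions and will not match dedicated tools exactly.

```python
import re

def gulpease(text):
    """GULPEASE = 89 + (300 * sentences - 10 * letters) / words."""
    # Assumption: a word is any run of letters (so l'uso counts as two words).
    words = re.findall(r"[^\W\d_]+", text)
    if not words:
        raise ValueError("text contains no words")
    letters = sum(len(word) for word in words)
    # Assumption: a sentence ends at '.', '!' or '?'; at least one sentence.
    sentences = len(re.findall(r"[.!?]+", text)) or 1
    return 89 + (300 * sentences - 10 * letters) / len(words)

# Invented sample: 11 words, 51 letters, 2 sentences.
sample = "Prenda il medicinale dopo i pasti. Non superi la dose indicata."
score = gulpease(sample)
print(round(score, 2))
```

Short sentences and short words push the score up, which is why an invented instruction of this kind scores far higher than the PIL sections analysed in the study.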
For the sake of synthesis and in continuity with the research of Carducci (2008) and Puato (2012), it was decided to evaluate the paragraphs referring to medication use patterns, contraindications and side effects.
For the PILs of drugs referring to the vagina, the GULPEASE readability index was 37%. This means that the texts are difficult to read for those with a higher level of education, very difficult for those with an average level and almost incomprehensible for those with an elementary level.
Regarding the group of PILs on skin problems, the GULPEASE readability index was 66%. It can therefore be inferred that the texts are easy to read for those with a higher or average level of education, while they are very difficult for those with an elementary level (Figure 5).
Regarding the analysis of the PILs on symptoms related to the head, the GULPEASE readability index was 40%. This indicates that the texts are difficult for those with a higher level of education, very difficult for those with an average level and almost incomprehensible for those with an elementary level (Figure 6).
For PILs on symptoms related to the heart, the GULPEASE readability index was 50%. The texts are therefore easy to read for those with a higher level of education, very difficult for those with an average level and almost incomprehensible for those with an elementary level (Figure 7 and Chart 1).
Finally, for PILs on the gut, the GULPEASE readability index was 41%. This means that the texts are easy to read for those with a higher level of education, very difficult for those with an average level and almost incomprehensible for those with an elementary level.
In addition, it should be noted that readability indicators are not always reliable (Lines, 2022), and the difficulty bands can be misleading for values that fall close to a neighbouring band, e.g. 41%, which lies just above 40% (the threshold of difficulty for those with a higher education).
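The sensitivity of the bands to values near a boundary can be illustrated with a small sketch. The cut-offs of 80 and 60 are the commonly cited simplified GULPEASE thresholds and are an assumption here, as is the convention that a score equal to the cut-off counts as difficult; only the value 40 is taken from the text. The names `CUTOFFS` and `difficult_for` are hypothetical.

```python
# Assumed cut-offs per education level; only 40 is stated in the text,
# 80 and 60 follow the commonly cited simplified GULPEASE scale.
CUTOFFS = {"elementary": 80, "average": 60, "higher": 40}


def difficult_for(score: float) -> list[str]:
    """Education levels for which a score counts as difficult.

    Assumption: a score equal to the cut-off is still difficult.
    """
    return [level for level, cutoff in CUTOFFS.items() if score <= cutoff]


if __name__ == "__main__":
    # The five group scores reported above.
    for score in (37, 66, 40, 50, 41):
        print(score, difficult_for(score))
```

Under these assumptions a one-point difference (40 versus 41) is enough to move a text in or out of the difficult band for readers with a higher education, which is the borderline effect described above.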
The results obtained with the GULPEASE index were then compared with those of the READ-IT index (Dell’Orletta et al., 2011), and it was found that:
paragraphs on the use of medications are readable for 51% of people; paragraphs on contraindications are understandable for 54%; and paragraphs on side effects are understandable for 50%.
It thus emerges that, according to the READ-IT index, the selected paragraphs are challenging for about 15% of readers (Chart 8).

The readability of PIL paragraphs through the READ-IT index.
For the corpus as a whole, the different specialized models of the READ-IT index were considered, with the following results:
with the basic READ-IT index, the readability difficulty was 30.2%. Like the GULPEASE index, which was specifically designed for the Italian language, this model uses standard measures of text readability (Figure 8): sentence length, calculated as the average number of words per sentence, and word length, calculated as the average number of characters per word;
with the lexical READ-IT index, the readability difficulty was 56.3%. This model focuses on the lexical features of the text, namely the composition of its vocabulary and its lexical richness;
with the syntactic READ-IT index, the readability difficulty was 94%. This model is based on grammatical information, i.e. the combination of morpho-syntactic and syntactic features inferred from the corresponding levels of linguistic analysis;
with the global READ-IT index, the readability difficulty was 99.3%. This model combines features of various kinds, ranging from the general text features of the basic READ-IT model to the lexical and syntactic features of the other two models.
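The surface measures mentioned for the basic model, together with a simple indicator of lexical richness, can be sketched as follows. This is not the READ-IT classifier, which relies on trained models and full linguistic annotation; it only computes raw features under naive tokenisation. The function name `surface_features` and the use of the type/token ratio as a proxy for lexical richness are assumptions made for illustration.

```python
import re


def surface_features(text: str) -> dict[str, float]:
    """Raw surface features of a text (illustrative, not READ-IT itself)."""
    # Naive sentence and word segmentation (assumptions).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = [w.lower() for w in re.findall(r"[^\W\d_]+", text)]
    if not sentences or not words:
        raise ValueError("text contains no sentences or words")
    return {
        # Average number of words per sentence.
        "words_per_sentence": len(words) / len(sentences),
        # Average number of characters per word.
        "chars_per_word": sum(len(w) for w in words) / len(words),
        # Distinct word forms over total words: a common proxy
        # for lexical richness (assumed here for illustration).
        "type_token_ratio": len(set(words)) / len(words),
    }


if __name__ == "__main__":
    print(surface_features("Il farmaco cura. Il farmaco aiuta."))
```

The type/token ratio depends on text length, so values are only comparable across passages of similar size.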
Conclusions
The analysis of the data confirmed the results of previous research, even though the current investigation involved a larger sample. Some relevant differences remain. In particular, with regard to readability, it was noted that the most significant difficulties emerge among those with an elementary school education. The use of specific prefixes and suffixes depends on the type of text and on the division between primary and secondary texts. Indeed, the prefixes and suffixes used in the language of medicine do not appear to be uniform across medical texts. In conclusion, the PIL is complex and can only minimally be traced back to a specific text type intended for the general public (Ferrari, 2019), despite the indications given in the legislation. This complexity, moreover, harks back to the popular tradition, which conceives the PIL as a bugiardino (literally ‘little liar’). The Accademia Della Crusca (n.d.), one of the highest authorities on Italian linguistics, has commented on the issue, noting how in the boom years of pharmacology PILs tended to gloss over the flaws and undesirable effects of the drug in order to extol its merits and efficacy. Today, the trend is probably the opposite: in many cases pharmaceutical companies adopt a cautionary style for precautionary purposes, at the expense of immediacy of understanding. Thus the lack of comprehensibility continues, with good reason, to justify the name bugiardino (Accademia Della Crusca, n.d.).
