Abstract
According to a language-integrated view of spelling development, learning to spell involves the same language-learning skills across alphabetic systems. A prediction based on this view is that the same spelling training should be equally effective for learning to spell in a shallow (Italian, native language) or an opaque (English, additional language) orthography. We tested this prediction by teaching 6- to 9-year-old Italian children to use multiletter spelling units to spell words in Italian and English. The children were trained on the spelling of Italian words containing orthographic difficulties that required switching from phoneme–grapheme spelling correspondences to larger grain size (multiletter) spelling units. In a stepped-wedge cluster randomized trial, 108 Italian children (ages 6–9 years) were assigned to the experimental spelling training or a waiting list condition. Their ability to spell the trained (Italian and English) word lists and to generalize the acquired knowledge to new (untrained) words was assessed. Similar learning effects were found in the two languages for the trained word lists. However, generalization of the acquired spelling knowledge to new words occurred only in English. The influence of language-specific factors on learning to spell could account for these findings.
There is debate in the literature as to whether acquiring word reading and spelling skills in a native language (L1) and in an additional orthography (L2 or a foreign language) requires the same or partially different linguistic abilities (Chung et al., 2019; Dixon et al., 2010; Sun-Alperin & Wang, 2011; van Daal & Wass, 2017). Several hypotheses have been formulated around this topic (see Geva & Siegel, 2000; Kahn-Horwitz et al., 2011; Lado, 1957; Sparks, 1995) that can be subsumed under two main categories (Arfé & Danzak, 2020): language-integrated hypotheses, which assume that common cognitive and linguistic abilities underlie spelling acquisition in the L1 and in the additional language (AL), and language-specific hypotheses, which see spelling acquisition as constrained by the specific characteristics of the orthographic system to learn and thus as a language-specific learning process (Chung et al., 2019; Kahn-Horwitz et al., 2011, 2012; Lado, 1957; Sun-Alperin & Wang, 2011).
Language-integrated hypotheses, such as the central processing hypothesis (Genesee et al., 2006; Geva & Ryan, 1993; Geva & Siegel, 2000; Gholamain & Geva, 1999) and the linguistic coding differences hypothesis (Sparks, 1995; Sparks et al., 2009; Sparks, Patton, et al., 2008), posit that common cognitive abilities (e.g., short-term memory and phonological processing skills) and general language-learning skills are applied to learn to read and spell across alphabetic systems (i.e., in the L1 and in an AL). Support for these hypotheses comes from studies showing that children’s spelling performances in the L1 and an AL are associated concurrently (Russak & Kahn-Horwitz, 2015) and longitudinally (Sparks et al., 2009; Sparks, Patton, et al., 2008), and that problems in learning to read and spell in the L1 and in the AL are also strongly associated (Bonifacci et al., 2017; Palladino et al., 2016; Sparks, Humbach, & Javorsky, 2008). Because similar language-learning skills are assumed to underlie spelling acquisition across orthographies, a prediction based on a language-integrated view of spelling development is that the same instructional strategies would be effective for learning to spell in the L1 and an AL, regardless of the nature, or language-specific features, of the orthographic system (e.g., the degree of its shallowness).
In contrast to the language-integrated hypotheses of spelling acquisition are language-specific hypotheses (e.g., the Contrastive Typological Framework, Lado, 1957; and the Orthographic Proximity Hypothesis, Kahn-Horwitz et al., 2011), according to which children need to develop language-specific learning skills to acquire spelling knowledge across two (or more) orthographies, including their L1 and an AL. As the specific features of a language/spelling system determine which learning mechanisms will be most effective in learning to read and spell in that language (Ziegler & Goswami, 2006), the proximity (or distance) between two spelling systems (i.e., whether they share similar characteristics or not) will account for the facility with which writers can apply the same set of abilities to spell (and learn to spell) in their L1 and an AL (Kahn-Horwitz et al., 2011). Thus, for example, similar spelling strategies (e.g., one-to-one phoneme–grapheme spelling correspondences) will be applied between two shallow orthographies (Italian, L1 and Spanish, AL) but not between a shallow orthography (Italian) and a deep one (English). In line with these hypotheses are findings showing that (a) L1 Italian writers need to inhibit their (L1) one-to-one phoneme–grapheme spelling procedures to spell in English (AL), a deeper orthography with many sound–multiletter spelling correspondences (Arfé & Danzak, 2020), and (b) phonological, but not orthographic, skills can be transferred from Spanish (another shallow orthography) to English spelling (Sun-Alperin & Wang, 2011). Based on the assumption that the specific features of a language/spelling system determine which learning mechanisms are most effective in learning to read and spell in that language (Ziegler & Goswami, 2006), language-specific hypotheses predict that spelling instruction can be effective only when it is language-specific or targeted to the specific characteristics of the L1 or AL orthography being learned.
In synthesis, language-integrated and language-specific hypotheses translate into different approaches to bilingual (L1 and AL) spelling instruction: one based on the idea that the same set of instructional strategies can be effective to teach spelling across alphabetic (L1 and AL) scripts and the other grounded in the idea that spelling instruction should be differentiated between orthographies according to the language-specific characteristics of the orthography (L1 or AL) being learned.
Although several correlational studies have explored the validity of the language-integrated and language-specific hypotheses (Kahn-Horwitz et al., 2012; Russak & Kahn-Horwitz, 2015; Sparks et al., 2009; Sparks, Patton, et al., 2008), randomized controlled intervention trials that test the efficacy of these two approaches to spelling instruction are still lacking. In this study, we tested whether a language-integrated approach to bilingual spelling instruction—that is, teaching the same spelling strategies in the L1 and in an AL—would lead to similar or different gains in second- to fourth-graders’ spelling in two orthographies at the opposite ends of the shallow–deep continuum (Italian, L1, and English, AL). In this trial, we also tested the language-integrated and language-specific hypotheses presented earlier. If children are taught the same spelling procedures in their L1 and an AL but fail to apply these language-learning (e.g., orthographic mapping) strategies across the two orthographies, differences in learning to spell in the L1 and the AL can be attributed to language-specific factors (supporting the language-specific hypotheses). In contrast, if children show similar learning across the two orthographies, it means that the same general language-learning (e.g., orthographic mapping) strategies have been effective across the two orthographic systems (supporting the language-integrated hypotheses).
Exploring the efficacy of language-integrated and language-specific approaches to spelling instruction is important both theoretically and practically. Theoretically, it contributes to increasing the knowledge of the cognitive and linguistic processes underlying spelling development. Practically, it can inform instructional decisions, as in modern instructional contexts children are increasingly required to learn to spell simultaneously in different spelling systems beginning in their first years of schooling (Arab-Moghaddam & Senechal, 2001; Dixon et al., 2010; Sun-Alperin & Wang, 2011), and similar problems in learning to spell in L1 and in the AL may occur (Palladino et al., 2016).
Learning to Spell in Italian and English
Alphabetic orthographies can be ordered on a continuum from very shallow to very deep (Caravolas, 2004; Seymour et al., 2003). Italian, a spelling system with very consistent phoneme–grapheme correspondences, is at the shallow end of the continuum (Angelelli, Notarnicola, et al., 2010; Ziegler & Goswami, 2006). The procedures that Italian children use to spell new words are mainly phonological, that is, based on a phonological analysis of the word and one-to-one (phoneme–grapheme) mapping procedures. Once they acquire these procedures in Grade 1, Italian children become accurate in spelling and, by the end of Grade 2, they are already efficient spellers (Arfé et al., 2016). Apart from explicitly teaching the phoneme–grapheme correspondences that are characteristic of Italian, Italian teachers normally train children’s spelling skills through spelling from dictation. These simple instructional methods are generally sufficient to support the development of the spelling skills of young Italian writers. However, even Italian spelling is not completely shallow. For example, the spelling of words containing context-sensitive graphemes, such as the letters c and g, deviates from the rule of perfectly consistent one-to-one correspondences. The same grapheme c corresponds to different phonemes when followed by -e, as in cena/dinner (/tʃ/), and when followed by -a, as in casa/house (/k/). Likewise, the consonant sounds /k/ and /g/ are transcribed differently depending on the accompanying vowel (e.g., /k/ is transcribed by the grapheme c- in casa, but by the two-letter grapheme ch- in chiavi/keys). Other spelling challenges are represented by the phonemic group /kw/, which can be transcribed by -cu- as in cuore/heart, -qu- as in quota/rate, or -cq- as in acquolina/drool (Angelelli, Marinelli, & Zoccolotti, 2010), and by geminates in which a phoneme, for example, /t/, corresponds to two letters -tt-, as in mattino/morning.
To spell these words accurately, Italian children should switch from one-to-one to multiletter/syllabic transcription units (i.e., the retrieval and transcription of orthographic groups such as -ca-/ka/, -chi-/ki/, or -cqu-/kw/; Burani et al., 2006), a challenge both for Italian children with spelling disabilities and for typically developing children (Arfé, Cona, et al., 2018).
At the opposite end of the continuum, English is an exceptionally deep orthography (Share, 2008): it is extremely inconsistent with respect to single phoneme–grapheme correspondences but more consistent when larger reading/spelling units, such as rhymes or syllables, are considered (Berninger, Vaughan, et al., 1998; Ziegler & Goswami, 2006). Although it has been shown that English-speaking children can benefit from being exposed early to spelling units of varying size (from single letters to rhyme–onset multiletter units; Berninger, Vaughan, et al., 1998), a focus on larger size correspondences, that is, multiletter spelling units (such as rhymes), is critical to developing spelling skills in English (Berninger, Abbott, et al., 1998; Berninger, Vaughan, et al., 1998; Grainger & Ziegler, 2011). For Italian students who learn to spell in English, a typical difficulty is thus the need to switch from phonological one-to-one spelling procedures, which are typically very effective in their L1, to these more complex mappings and multiletter spelling procedures (Arfé & Danzak, 2020). However, as noted earlier, learning to map multiletter units to spoken word sounds can be effective in Italian too, for spelling words with context-dependent graphemes or when correspondences are between letter clusters and phonemes. In these cases, multiletter spelling strategies may be beneficial for spelling in both English and Italian (Arfé, Cona, et al., 2018).
The Mind’s Ear and Eye Training
The mind’s ear and eye training (Berninger, Abbott, et al., 1998; Berninger, Vaughan, et al., 1998), an instructional intervention proven effective for teaching spelling to English-speaking poor spellers (Berninger, Vaughan, et al., 1998) and children with learning disabilities (Berninger, Abbott, et al., 1998), was adapted and used in this study to teach Italian second to fourth graders to spell in Italian and English.
The original spelling intervention was set up by Berninger, Abbott, et al. (1998) to train English-speaking children with learning disabilities. The children were trained to spell 48 words through a sequence of eight steps. Step 1 consisted of naming the word while pointing to it (from left to right). Step 2 involved direct instruction of the alphabetic principle: The children were taught how to parse the written word and map phonemes onto its spelling units from left to right. In this step, for example, the experimenter helped the children parse the word “boat” by pronouncing /b/ when pointing to b, /o/ when pointing to oa, and /t/ when pointing to t. Step 3 consisted of repeating the word and asking the children to pronounce it again aloud. Thereafter, in Step 4, the children were asked to name the letters in the word one by one from left to right while the experimenter pointed to them. In Step 5, after having analyzed the phonological and orthographic word structure, the children were asked to close their eyes and make a mental picture of the written word. This step was meant to encourage them to build a mental representation of the word to be spelled. In Step 6, the children were asked to keep their eyes closed, read the word from their mind, and spell it again from left to right. The children then were asked to write down the word (Step 7). In Step 8, the children were asked to compare the written word produced with the target word and to rewrite it by repeating the same procedure when the two words did not match (i.e., the written word was incorrect). Through this teaching method, the authors aimed to support the children in making connections between their phonological and orthographic word representations at both the subword and word levels. Sound–grapheme correspondences of different sizes were trained, from one-to-one phoneme–letter correspondences to multiletter sound-spelling units.
However, the spelling achievement of the students receiving the training was predicted mainly by their ability to learn associations involving multiletter spelling units. Based on these findings, the instructional recommendation was thus to provide direct instruction in the correspondence between sound and multiletter spelling units. Recent studies have confirmed the effectiveness of these instructional strategies for English-speaking spellers (Berninger et al., 2015) and young writers with dyslexia learning to spell in shallower orthographies (e.g., Italian; Arfé, Cona, et al., 2018).
Although the initial focus of the mind’s ear and eye intervention was on remediating spelling difficulties in children with learning disabilities (Berninger, Abbott, et al., 1998), its efficacy was proved also for school-based prevention programs aimed at beginning spellers (Berninger, Vaughan, et al., 1998). This is unsurprising, considering that the intervention builds around a foundational process of typical and atypical spelling development: the integration of aural (phonological), visual (the written word), and motor (handwriting) word representations (Berninger, 2000; Wolf et al., 2017). An adaptation of the mind’s ear and eye training was used in this study to train 6- to 9-year-old Italian children’s multiletter/syllabic spelling skills in Italian and in English. The goal of the study was to test whether or not language-specific characteristics mediated the effectiveness of this spelling intervention, that is, children’s improvement in spelling the words trained in the two languages and their ability to generalize the acquired spelling skills to new (untrained) words. 
Based on the language-integrated and language-specific hypotheses, opposite predictions could be made: (a) if, as suggested by the language-integrated hypotheses, learning to spell in different orthographies involves the same cognitive processes and basic language-learning skills, then similar improvement in spelling should be observed in the L1 (Italian, a shallow orthography) and in the AL (English, a deep orthography) when the same spelling procedures are used to train spelling in the two languages; however, (b) if, as predicted by the language-specific hypotheses, the specific features of the spelling system determine which learning mechanisms will be most effective in learning to spell in that language, then the spelling intervention could lead to different outcomes in English and in Italian, and, in particular, the multiletter sound-spelling training procedures could be more effective in English, the orthography it originally targeted.
Thus, we formed two hypotheses:
Method
In a stepped-wedge cluster randomized trial (Campbell et al., 2019), 120 Italian native speakers (ages 6–9 years) were assigned to the experimental spelling intervention in which 48 Italian words and 42 English words were trained, or to a waiting list condition. The experimental group received the experimental training in Italian and English between Time 1 (T1) and Time 2 (T2) in eight training sessions over 1 month, and the waiting list control group received the same intervention between T2 and Time 3 (T3). The children’s ability to spell the trained (Italian and English) word lists and their ability to generalize the acquired knowledge to new (untrained) words were assessed.
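The crossover logic of the design described above can be written out as a small lookup table. The following is only an illustrative sketch: the group and period labels are hypothetical names chosen here, and the entry for the experimental group between T2 and T3 assumes no further training (the text describes T3 as a follow-up for that group).

```python
# Illustrative sketch of the two-group, three-assessment schedule.
# All labels below are hypothetical names, not taken from the study materials.
ASSESSMENTS = ["T1", "T2", "T3"]
PERIODS = ["T1-T2", "T2-T3"]

# Condition of each group during each period between assessments.
SCHEDULE = {
    "experimental": {"T1-T2": "training", "T2-T3": "no further training (follow-up)"},
    "waiting_list": {"T1-T2": "business as usual", "T2-T3": "training"},
}


def trained_before(group: str, assessment: str) -> bool:
    """Return True if the group has completed the training by this assessment."""
    idx = ASSESSMENTS.index(assessment)
    completed_periods = PERIODS[:idx]  # periods that ended at or before the assessment
    return any(SCHEDULE[group][p] == "training" for p in completed_periods)


for group in SCHEDULE:
    print(group, [trained_before(group, t) for t in ASSESSMENTS])
```

Read this way, the comparison at T2 contrasts a trained group with an untrained one, whereas at T3 both groups have received the training.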
Participants and Instructional Context
Initially, 120 six- to nine-year-old children in Grades 2 to 4 were enrolled in the study. The participants were recruited from two public schools in northern Italy. A sociodemographic questionnaire was used to gather data on the socioeconomic status (SES) of their families and their home language environment. The data of 12 participants who took part in the study and received the intervention were subsequently excluded from the analyses because their primary (home) language was not Italian (n = 7), English was one of their home languages (n = 4), or they did not take part in all of the assessment sessions (n = 1). The final sample consisted of 108 six- to nine-year-old participants (50 girls, 46%) with Italian as their L1. Their SES, calculated as the sum of the highest educational and highest occupational levels obtained by the children’s parents (Arfé, Montanaro, et al., 2018), was medium to high (M = 6.29, on a range from 1 to 8). Sixteen participants of this sample also were exposed to other languages at home (Romanian, Russian, Polish, Spanish, or Sinhalese) although Italian remained their L1 and primary home language. All children had nonverbal IQs within the normal range (M = 107). See Table 1 for a synthesis of the participants’ characteristics. The experimental and waiting list groups did not differ significantly in age or any other control or dependent measure assessed in the study (i.e., SES, nonverbal IQ, alphabet task, and Italian and English spelling skills). The gender distribution was equivalent between the groups (girls comprised 48% of the waiting list group and 44% of the experimental group), χ2 = .17, p = .68.
Table 1. Participants’ Characteristics.
Note. SES = socioeconomic status; DDE-2 words = Italian standardized test of word spelling; DDE-2 pseudowords = Italian standardized test of pseudoword spelling; PAL-II word choice = English standardized test of orthographic knowledge.
Fourteen third- and fourth-grade children performed at or below the fifth percentile in a standardized Italian word spelling test (DDE-2; Sartori et al., 2007); three other children performed between the fifth and 10th percentiles on the same spelling test. They did not differ from the control children in gender, χ2 < 1, SES, age, graphomotor skills, or nonverbal intelligence (mean IQ = 109), Fs < 1.
From Grade 1, Italian primary school children are normally introduced to explicit instruction of phoneme–grapheme correspondences and the plain practice of spelling from dictation. Exception words and spelling rules for context-sensitive graphemes (c and g) are also taught explicitly, typically from Grade 2. Unlike Italian spelling, English spelling is normally introduced in Grade 2. Italian children learn to spell in English through exposure to printed words in their textbooks, orthographic games (e.g., reconstructing scrambled words), and teachers’ feedback when practicing spelling in writing tasks such as sentence or text production. The participants in this study received this standard spelling instruction in Italian and English.
Procedure
Children’s spelling skills were assessed at school during school hours in classroom testing sessions. Three assessments were organized over the school year: at the beginning of the year (October–November), before the experimental group received the training (T1/pretest); after the experimental group had received the training and the waiting list group had received business-as-usual instruction (December, T2/Posttest 1); and 6 weeks after Posttest 1, after the waiting list group had also received the experimental intervention (February, T3/Posttest 2; for the experimental group, this second posttest served as a follow-up assessment). During the first assessment session, the children’s nonverbal IQs and graphomotor skills were also assessed to control for possible individual differences between the groups (experimental and waiting list) on these measures.
Pretest–posttests assessment
All participants performed experimental and standardized spelling tasks in Italian and English at three time points: the pretest, Posttest 1, and Posttest 2 (a detailed description of the tasks is provided below). All tasks were completed in two classroom sessions of about 1 hr each. In addition to these tasks, the children’s nonverbal and handwriting skills were also assessed in the pretest session.
Nonverbal skills
The Primary Mental Ability spatial relations subscale (Thurstone & Thurstone, 1963) was used to estimate the children’s nonverbal IQs. The task requires the child to choose from among four forms the one that best completes a geometric figure. The split-half reliability of the subscale is .94.
Graphomotor skills
The alphabet task was used to assess children’s graphomotor abilities. The task, widely used in handwriting research (Arfé et al., 2020; Berninger et al., 1997; Graham et al., 1997; Kim et al., 2011; Limpo & Alves, 2013), assesses automatic access and reproduction of alphabet letters from memory. The child is asked to write the alphabet in order in lower case manuscript letters as quickly and accurately as possible. The number of legible alphabet letters produced in the right order in 15 s is scored. The task’s interrater reliability is .97 (Berninger et al., 1997).
Italian spelling skills
Italian spelling skills were assessed by standardized spelling tests (word and pseudoword dictation) and an experimental task.
Standardized Italian word spelling test
The word spelling subtest of the Battery for the Assessment of Dyslexia and Dysorthographia (DDE-2; Sartori et al., 2007) was used to assess the children’s lexical spelling strategies. The subtest is standardized for Italian. It consists of writing from dictation 48 words (nouns) varying in length (from two to four syllables), frequency, and orthographic structure. The number of misspelled words (errors) is scored. The concurrent validity of this subtest, as reported by prior studies (Arfé et al., 2016), is .82. The test–retest reliability, computed in this study by correlating the scores obtained on this test by the waiting list group at T1 and T2 (1 month), was r = .75.
Standardized Italian pseudoword spelling test
The pseudoword spelling subtest of the DDE-2 was used to assess the children’s sublexical spelling strategies. The subtest, standardized for Italian, consists of writing from dictation 24 pseudowords varying in length (from two to four syllables) and orthographic structure (e.g., CVCV, CVCCV). The number of misspelled pseudowords (errors) is scored. The concurrent validity of the subtest, computed in this study by correlating the children’s pseudoword spelling scores at T1 with those on the word spelling subtest of the DDE-2 (T1), was r = .76. The test–retest reliability, computed by correlating the scores of the waiting list group on the pseudoword subtest at T1 and T2, 1 month later, was r = .62, a coefficient lower than the .75 reliability coefficient of the word spelling subtest. We attributed this difference to the lower stability (i.e., greater fluctuation) of the children’s performance on the pseudoword spelling test. Spelling pseudowords involves greater cognitive control and focused attention than spelling familiar words, which is a more automatic process. Natural fluctuation in children’s attention and fatigue may thus more heavily affect children’s performance in pseudoword spelling than in word spelling, with influences on its stability (i.e., reliability).
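The test–retest coefficients reported for the DDE-2 subtests are correlations between the waiting list group’s scores at T1 and T2. As a minimal sketch, assuming a Pearson product-moment correlation (the text reports r but does not name the estimator) and using invented error counts rather than study data, the computation could look as follows; `pearson_r` is a hypothetical helper name.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2 scores)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list has no variance")
    return sxy / sqrt(sxx * syy)


# Invented error counts for six children at T1 and T2 (NOT data from the study).
t1_errors = [3, 7, 2, 9, 5, 4]
t2_errors = [4, 6, 2, 8, 6, 3]
print(round(pearson_r(t1_errors, t2_errors), 2))
```

A coefficient closer to 1 indicates that children kept a similar rank order across the two sessions; greater session-to-session fluctuation lowers it, which is how the lower pseudoword coefficient is interpreted in the text.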
Experimental Italian word spelling task
Children were asked to spell from dictation a list of 48 three- and four-syllable words selected from the Italian database CoLFIS (Bertinetto et al., 2005). The list included words containing context-sensitive graphemes (g, c; that is, letter strings such as chi, gli, gni, ghi, or sci, as in sciatore/skier or ringhiera/railing), geminates (-tt- or -ss- as in passerotto/sparrow), or the /kw/ group (as in acquitrino/marsh or obliquo/oblique) whose transcription requires the use of syllabic or multiletter units, deviating from the most typical one-to-one correspondence rules (Arfé, Cona, et al., 2018). Twenty-five words were trained during the intervention (trained list). Another 23 words, matched to the trained list on word length, word frequency, and type of orthographic difficulties (e.g., cqu-, -tt-), were not trained (untrained list) but were retested on Posttest 1 (T2) and Posttest 2/follow-up (T3). The untrained word list served to assess the generalization of the acquired spelling skills to new words. On the pretest, the children’s performances on the trained and untrained lists were strongly and significantly correlated, r = .89. Their performances on both lists were scored for accuracy (i.e., total number of correctly spelled words).
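Accuracy scoring of a dictation list amounts to counting the words whose written form matches the target. The sketch below illustrates this under assumptions that are not stated in the text: responses are compared with targets after trimming whitespace and ignoring letter case, and any other deviation counts as an error. The function name and the misspellings are invented for illustration; the target words are examples cited above.

```python
def score_accuracy(targets, responses):
    """Count the dictated words that were spelled correctly.

    `targets` and `responses` are parallel lists: responses[i] is the child's
    written attempt at targets[i]. Matching is exact after trimming whitespace
    and lowercasing (an assumption made for this sketch).
    """
    if len(targets) != len(responses):
        raise ValueError("each target word needs exactly one response")
    return sum(
        1
        for target, response in zip(targets, responses)
        if response.strip().lower() == target.strip().lower()
    )


# Example with words cited in the text; the misspellings are invented.
targets = ["sciatore", "ringhiera", "passerotto"]
responses = ["sciatore", "ringiera", "passeroto"]
print(score_accuracy(targets, responses))  # 1 word spelled correctly
```

Applying the same function separately to the trained and untrained lists would yield the two accuracy scores per child described above.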
English spelling skills
English spelling skills were assessed by a standardized orthographic task and an experimental dictation task.
Standardized English orthographic task
The word choice subtest from the PAL-II (Berninger, 2007) was used to assess the children’s orthographic knowledge of English words. The subtest consists of 30 items. The child is asked to identify the correctly spelled word presented with two distracters that have the same pronunciation (e.g., was, wuz, and whas). The test is standardized for English. Second and third graders were administered the first 15 items of the subtest, whereas fourth graders performed all 30 items. The test–retest reliability, computed in this study by correlating the scores of the waiting list group on the subtest at T1 and T2, 1 month later, was r = .90. The number of correct word choices was scored.
Experimental English word spelling task
The children were asked to spell from dictation 52 one- to three-syllable words selected from English textbooks and from the Children’s Printed Word database (Stuart et al., 1993–1996). A list of about 300 words, selected from the children’s textbooks and the Children’s Printed Word database, was preliminarily presented to the children’s English teachers, who rated each word for familiarity (i.e., how much they believed the word was part of the children’s spoken or written vocabulary). Words that were rated as very familiar were excluded. The final list comprised words that were used in the children’s textbooks but were not yet considered part of the children’s written vocabulary.
Two lists of 52 words were used, one for second and third graders and one for fourth graders. Each list included 25 words that were trained during the intervention (trained word list) and 25 words that were not trained (untrained word list), which were used to assess generalization of the intervention effects to new words. Like the Italian trained and untrained word lists, the English trained and untrained word lists were matched on word frequency, word length, and orthographic difficulties (e.g., -ch- as in peach/each, -sh- as in dish/push, -ght- as in night/fight, -h- as in hand/hole, -w- as in window/snowman, -ee- as in deep/feed, -ll- as in call/tall, or -oa- as in boat/road). English words that were considered orthographically difficult were those containing spelling units not existing in Italian (such as -th- as in think/thing, -sh- as in dish/push, or -ght- as in night/fight), a different phoneme–grapheme correspondence than Italian (e.g., -ch- as in peach/each, where -ch- represents the phoneme /tʃ/ in English but the phoneme /k/ in Italian), or spelling units in different positions than in Italian (such as the geminate -ll- in call/tall, which is never at the end of a word in Italian). On the pretest, the children’s performances on the trained and untrained word lists were strongly and significantly correlated, r = .88. Remarkably, the children’s accuracy scores on the Italian and English experimental word lists also were significantly correlated: r = .62 for the trained lists and r = .56 for the untrained ones. Their performances on both English word lists (trained and untrained) were scored for accuracy (i.e., total number of words spelled accurately).
The second author scored all of the children’s spellings of the experimental word lists (Italian and English) for accuracy. Interrater agreement was computed by asking an independent rater, blind to the hypotheses of the study, to score 20% of the pretest/T1, Posttest 1/T2, and Posttest 2/T3 spelling protocols, randomly selected from the test materials. Interrater agreement, calculated as the overall number of agreements over the total number of words scored by the second author and the second, independent, rater, was 99% for both (trained and untrained) word lists.
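The agreement index described above is the number of word-level scoring decisions on which the two raters agree, divided by the total number of words both scored. A minimal sketch follows, with invented scores (1 = scored as correct, 0 = scored as incorrect) and a hypothetical function name.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of items on which two raters gave the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of words")
    agreements = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * agreements / len(rater_a)


# Invented scores for eight words (NOT data from the study).
first_rater = [1, 1, 0, 1, 0, 1, 1, 0]
second_rater = [1, 1, 0, 1, 0, 1, 0, 0]
print(percent_agreement(first_rater, second_rater))  # 87.5
```

Simple percent agreement does not correct for chance agreement; it is shown here only because it is the index the text describes.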
The training
The mind’s ear and eye training was adapted for this study based on Arfé, Cona, et al. (2018). Children were trained to spell 25 Italian words from the experimental Italian spelling word list and 25 English words from the experimental English word list. The training was administered to the whole class by the second author of the study. The classroom teacher assisted the experimenter during the training by organizing or motivating the class but did not contribute to delivering the intervention. Each week, the children received two training sessions on Italian spelling and two on English spelling, held separately. The overall training lasted about 4 weeks (eight twice-weekly training sessions per language). Each training session lasted approximately 20 min and consisted of spelling and transcribing six Italian or six English words from the experimental word lists, following the seven-step procedure described later in this section. Of the six words trained in each session, three were selected from the experimental (Italian and English) trained word lists and three were additional words of similar orthographic complexity and frequency. A total of 48 words were trained in Italian and 42 words were trained in English.
The training procedure was similar to that designed by Berninger, Abbott, et al. (1998) except that the children were taught to spell multiletter/syllabic units only. As noted earlier, multiletter clusters and syllables have been shown to be natural transcription units of expert and developing writers in orthographies such as English and French (Berninger, Vaughan, et al., 1998; Kandel & Valdois, 2006; Lambert et al., 2008). However, in more shallow orthographies, such as Italian, syllabic transcription also may be an efficient spelling strategy for spelling orthographically complex words (Arfé, Cona, et al., 2018; Burani et al., 2006).
The experimenter introduced the Italian or English target word, writing it on the blackboard in lowercase manuscript letters and reminding the children of the word’s meaning (Step 1). Thereafter, the children were invited to read the word aloud with the experimenter, while she pointed to the letters from left to right (Step 2). All of the letters were then sounded out by the experimenter (Step 3). In this step, letter sounds (phonemes), not letter names, were used. Thereafter, the word was read again syllable by syllable, while the experimenter pointed to the corresponding multiletter units from left to right, and the children were asked to imitate the experimenter (Step 4). In Step 5, the children were invited to make a mental picture of the word and close their eyes to read the word from their mind. The word was then erased from the blackboard and the children were asked to read it from memory and write it down (Step 6). Finally, the word was shown again and the children were invited to compare the word they had produced with the target word (Step 7). When the two words did not match and the spelling was incorrect, they could repeat Steps 1 to 7. The children were provided with exercise books for practicing the other 12 words at home during the week. Home practice was intended to help the children consolidate the spelling strategies learned during the intervention. All children were instructed to practice alone, following the same procedure learned at school, and to bring the exercise book back to the experimenter at the end of each week. All children returned their homework.
Treatment fidelity
The intervention was entirely delivered by the second author of the study after 2 weeks of training on the procedure. A manual containing the training word lists and detailed instructions for delivering the training was provided to, and discussed with, the second author before she started the intervention. Fidelity to the training procedure was monitored during the intervention in weekly meetings between the second and first authors. Moreover, classroom teachers were asked to observe the experimenter once a week during the intervention and complete a checklist assessing the frequency with which all relevant training elements and steps were implemented. Fidelity to the planned training procedure, as evaluated by the English and Italian classroom teachers, was 100%.
Data Analyses
The children’s improvement following the spelling intervention was assessed by computing gain scores between the Pretest (T1) and Posttest 1 (T2) and between Posttest 1 (T2) and Posttest 2/follow-up (T3), using the following formulas: T1–T2 gain score = T2 accuracy-T1 accuracy, and T2–T3 gain score = T3 accuracy-T2 accuracy. For the experimental group, who received the intervention immediately after the pretest (T1), the first gain score (T1–T2 gain) represented a measure of the children’s learning following the intervention; the second (T2–T3 gain) estimated maintenance of the intervention’s effects. For the waiting list group, who received the intervention between T2 and T3, the first gain score (T1–T2 gain) assessed improvements due to practice (i.e., test–retest) effects, whereas the second (T2–T3 gain) measured the efficacy of the intervention (i.e., learning due to intervention).
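The gain-score formulas and their group-specific interpretation can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the dictionary, and the accuracy scores are hypothetical, while the interpretation labels paraphrase the paragraph above.

```python
def gain_scores(t1, t2, t3):
    """Return (T1-T2 gain, T2-T3 gain) for one child's accuracy scores."""
    return t2 - t1, t3 - t2


# What each gain score measures, depending on when the group was trained.
GAIN_MEANING = {
    "experimental": ("learning following the intervention",
                     "maintenance of the intervention effects"),
    "waiting_list": ("practice (test-retest) effects",
                     "learning due to the intervention"),
}

# Hypothetical accuracy scores at T1, T2, and T3 for one child.
first_gain, second_gain = gain_scores(10, 15, 16)
first_label, second_label = GAIN_MEANING["experimental"]
```

Because the two groups receive the intervention in different intervals, the same gain score carries a different meaning in each group, which is why the analyses cross Group with Time.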
As we were interested in the interaction between training and language effects, 2 (Group: experimental/waiting list) × 2 (Language: Italian/English) × 2 (Time: T1–T2 gain/T2–T3 gain) mixed ANOVAs were run, with Group as the between-subjects factor and Language and Time as within-subjects factors. Age was entered as a covariate to control for age-related differences and years of spelling instruction. Moreover, as 14 children performed very poorly (≤ fifth percentile) on the Italian standardized word spelling task (DDE-2 words), pretest scores (errors) on this test were also covaried. Dependent measures were the children’s gain scores on the following: the Italian trained and untrained word spelling lists, the English trained and untrained word spelling lists, the Italian standardized word and pseudoword spelling subtests (errors), and the English word choice task. The effects of the spelling intervention on the trained (Italian and English) word lists were tested first. Thereafter, generalization of the trained spelling skills to the untrained (Italian and English) word lists and to the standardized Italian and English spelling tests was explored. Effect sizes (partial eta squared) were computed using the SPSS statistical package (Version 26).
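For readers who want to check an effect size against a reported F ratio, partial eta squared can be recovered through the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The sketch below assumes only that identity; the function name is hypothetical, and the F value in the example is invented for illustration and is not a result from this study.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    explained = f_value * df_effect
    return explained / (explained + df_error)


# Hypothetical example: F(1, 100) = 4.00 (not a value reported in the study).
example_effect_size = partial_eta_squared(4.0, 1, 100)
```

The identity follows from the definition SS_effect / (SS_effect + SS_error) once F is written as the ratio of the two mean squares.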
Results
Skewness and kurtosis in the data distribution were preliminarily inspected. Two measures showed a moderate but acceptable skewness (West et al., 1995): the English trained list = 1.4, and the English untrained list = 1.4. Absolute kurtosis was <7 (SPSS kurtosis <4) for all variables.
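The two kurtosis thresholds above refer to the same quantity on different scales: SPSS reports excess kurtosis (a normal distribution scores 0), so absolute kurtosis equals the SPSS value plus 3. The sketch below illustrates the relationship using simple moment-based estimators. That choice is an assumption made for brevity: SPSS applies small-sample adjustments, so its output differs slightly from these values. The function name and the data are hypothetical.

```python
def skewness_and_kurtosis(values):
    """Return (skewness, excess kurtosis, absolute kurtosis) from central moments."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = sum(values) / n
    # Second, third, and fourth central moments (population form).
    m2 = sum((x - mean) ** 2 for x in values) / n
    if m2 == 0:
        raise ValueError("skewness and kurtosis are undefined for constant data")
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5
    absolute_kurtosis = m4 / m2 ** 2
    excess_kurtosis = absolute_kurtosis - 3.0
    return skewness, excess_kurtosis, absolute_kurtosis


# Hypothetical, perfectly symmetric scores.
skew, excess, absolute = skewness_and_kurtosis([1, 2, 3, 4, 5])
```

For symmetric data such as the example, skewness is zero; positive values like those reported for the English lists indicate a longer right tail.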
Effects of the Spelling Intervention in Italian and English: Trained Word Lists
There was a significant main effect of Language, F(1, 104) = 5.71, p < .05,
Group effects per time span and language
In the first time interval (from T1 to T2) the experimental group improved significantly more than the waiting list group in Italian (I–J mean difference = 1.94, p < .001), Cohen’s d = 0.69, and in English (I–J mean difference = 4.05, p < .001), Cohen’s d = 1.36. The effect size was moderate for the Italian word list and large for the English spelling list. In the second time interval (from T2 to T3), the waiting list group, who started the intervention at T2, showed greater gains in spelling than the experimental group, in both Italian (I–J mean difference = −2.91, p < .001), Cohen’s d = 1.08, and in English (I–J mean difference = −5.02, p < .001), Cohen’s d = 1.97 (see also Table 2). The dimension of the effect was large for both languages.
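The between-group effect sizes can be illustrated with the conventional Cohen's d for two independent groups, in which the mean difference is divided by the pooled standard deviation. This is a sketch under that assumption: the text does not state exactly how d was derived from the covariate-adjusted comparisons, and the function name and all input values below are hypothetical.

```python
import math


def cohens_d_independent(mean_1, mean_2, sd_1, sd_2, n_1, n_2):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    if n_1 < 2 or n_2 < 2:
        raise ValueError("each group needs at least two observations")
    pooled_variance = (
        (n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2
    ) / (n_1 + n_2 - 2)
    if pooled_variance == 0:
        raise ValueError("d is undefined when both groups have zero variance")
    return (mean_1 - mean_2) / math.sqrt(pooled_variance)


# Hypothetical gain-score summaries for two groups of ten children each.
example_d = cohens_d_independent(5.0, 3.0, 2.0, 2.0, 10, 10)
```

The sign of d follows the order of the two means, so a negative mean difference, as in the second time interval, yields a negative d unless the absolute value is reported.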
Differences in Spelling Gain Scores Between the Experimental and Waiting List Groups.
Note. T1–T2 gain = difference in spelling accuracy scores between Times 2 and 1; T2–T3 gain = difference in spelling accuracy scores between Times 3 and 2. DDE-2 words = Italian standardized test of word spelling; DDE-2 pseudowords = Italian standardized test of pseudoword spelling; PAL-II word choice = English standardized test of orthographic knowledge.
Time effects per group and language
T1–T2 and T2–T3 gain scores differed significantly for the experimental group in both languages (Italian and English). In both cases, the experimental group showed greater gains (improvement) in the first time interval (between T1 and T2), when it received the spelling intervention: The I–J mean difference was 4.78 for the Italian word list (p < .001) and 4.29 for the English word list (p < .001). The dimension of the effects was large in both cases: Cohen’s dRepeated Measures, pooled = 1.09 and Cohen’s dRepeated Measures, pooled = 0.84, respectively. For the waiting list group, the difference between the T1–T2 and T2–T3 gain scores was significant only for the English word list: I–J mean difference = −4.77, p < .001. The negative difference indicated that the T2–T3 gain scores were greater than the T1–T2 gain scores. Inspection of the means in Table 2 shows that the improvement in English spelling skills was indeed greater in the second time interval (i.e., between T2 and T3), when the waiting list group received the intervention. The effect size was large: Cohen’s dRepeated Measures, pooled = 1.39.
Language effects per group and time interval
For the experimental group, the effects of the intervention (received between T1 and T2) were similar in the two languages. As shown in Table 2, the experimental group children improved in spelling both the Italian and English trained word lists between T1 and T2 and did not show a change in spelling performance between T2 and T3, indicating a general maintenance of the training effects. For the waiting list group, significant language effects were found in both time intervals. In the first time interval (T1–T2), the children showed greater learning (gain) in Italian: I–J mean difference = 2.25, p < .001, Cohen’s dRepeated Measures, pooled = 0.51; in the second time interval (T2–T3), corresponding to the spelling intervention, they showed a greater gain in English: I–J mean difference = −2.46, p < .001, Cohen’s dRepeated Measures, pooled = 0.63. The negative difference indicated that children’s spelling skills improved less in Italian than in English.
We also explored the three-way Time × Language × DDE-2 words (pretest errors) interaction. To explore the interaction, separate ANCOVAs were run. The Language × DDE-2 word pretest errors interaction was tested separately for each time span (T1–T2 and T2–T3), and the Time × DDE-2 word pretest errors interaction was tested separately for the two languages (Italian and English).
Language × DDE-2 words (pretest errors): first time span (T1–T2)
The interaction between language and DDE-2 pretest errors was significant, F(1, 105) = 12.83, p = .001,
Language × DDE-2 words (pretest errors): second time span (T2–T3)
No significant interaction between Language and DDE-2 word pretest errors was found.
Time × DDE-2 words (pretest errors): Italian and English spelling
No significant interactions were found between Time and DDE-2 pretest errors.
In sum, both groups’ spelling skills improved more when the children received the intervention. The experimental group improved significantly more than the waiting list group in Italian and English spelling between the Pretest and Posttest 1 (T1–T2), whereas their spelling skills did not change significantly between Posttest 1 and Posttest 2/Follow-up (T2–T3), indicating a general maintenance of the intervention effects. The waiting list group, who received the intervention between T2 and T3, improved more in this time interval. Clear language-specific effects were observed for this group only. The children in this group improved more in Italian than in English between T1 and T2 (before receiving the intervention) but improved more in English than in Italian between T2 and T3 (after the intervention). Hence, the intervention was more effective on English spelling for this group. We will comment on this finding in detail in the Discussion section. The younger participants and the children who made more errors on the DDE-2 word spelling task (on the pretest) benefited more from the spelling training in Italian.
Generalization of the Trained Spelling Skills to the Untrained Italian and English Word Lists
Language was significant, F(1, 104) = 12.73, p = .001,
Group effects per time span and language
Between T1 and T2, the two groups (experimental and waiting list) showed similar improvement in Italian (I–J mean difference = −.094, p = .86). In contrast, the experimental group improved significantly more than the waiting list group in English spelling (I–J mean difference = 1.17, p = .001), Cohen’s d = 0.54, a medium effect size. Moreover, between T2 and T3, the improvement of the two groups was similar in Italian (I–J mean difference = −.24, p = .61), whereas, in English, the waiting list group showed greater gains: I–J mean difference = −2.21, p < .001. The dimension of the effect was large: Cohen’s d = 1.03.
Time effects per group and language
In Italian, the experimental group showed similar improvement over the two time intervals (T1–T2, T2–T3), I–J mean difference = .39, p = .52, whereas in English, the improvement was greater between T1 and T2, the period in which the experimental group received the intervention: I–J mean difference = 1.60, p < .001. The dimension of the effects was moderate: Cohen’s dRepeated Measures, pooled = 0.46. For the waiting list group, generalization also was significant only in English: The T2–T3 gain scores (time interval corresponding to the intervention) were greater than the T1–T2 gain scores: I–J mean difference = −1.77, p < .001. The dimension of the effects was moderate: Cohen’s dRepeated Measures, pooled = 0.58. In Italian, the improvement was similar between the two time intervals: I–J mean difference = .25, p = .67.
Language effects per group and time interval
The effects of the intervention were similar in the two languages for the experimental group (I–J mean difference = −.83, p = .08 for T1–T2 gain scores and I–J mean difference = .38, p = .35 for T2–T3 gain scores). For the waiting list group, language effects were significant. Between T2 and T3, once they had received the intervention, the children in this group showed greater learning in English than in Italian (I–J mean difference = −1.59, p < .001). The effect size was moderate: Cohen’s dRepeated Measures, pooled = 0.49. Between T1 and T2, the change was limited and similar in both languages (Italian and English): I–J mean difference = .43, p = .34.
The three-way Time × Language × Age interaction was explored by ANCOVAs testing the Language × Age interaction separately for each time span (T1–T2 and T2–T3).
Language × Age: first time span (T1–T2)
No significant interaction emerged between Language and the covariate Age.
Language × Age: second time span (T2–T3)
A significant interaction emerged between Language and Age: F(1, 105) = 22.81, p < .001,
In sum, the analyses revealed a generalization of the training effects only to the English untrained word list. Language-specific effects were clearer for the waiting list group: In the experimental group, a similar improvement in English and Italian was observed for the untrained word lists following the intervention (T1–T2 gain scores), whereas for the waiting list group, the improvement following the intervention was significantly greater in English (T2–T3 gain scores). It must be noted, however, that with the intervention, the experimental group showed greater gains than the waiting list group in English only. The between-group pairwise comparisons indeed showed that in Italian the performance of the experimental group was equivalent to that of the waiting list controls (Table 2). This confirmed a generally greater benefit of the training on English spelling. The greater generalization of the learned skills to English than to Italian cannot be attributed to ceiling effects in performance on the Italian untrained word list, as the children’s mean accuracy was 15/23 (for both the waiting list and experimental groups) at T1, 16.8/23 (for both groups) at T2, and 18/23 at T3.
Generalization to Italian Standardized Word and Pseudoword Spelling Tests
A 2 (Group) × 2 (Time) ANOVA, with Age and DDE-2 word spelling (pretest errors) covariates, was performed. For the standardized word spelling task, the ANOVA revealed a main effect of Time, F(1, 104) = 4.50, p < .05,
For the pseudoword spelling task, the covariate DDE-2 word pretest errors was the only significant factor, F(1, 104) = 18.41, p < .001,
Generalization to the Word Choice Test in English
Time, Age, and Group were nonsignificant, Fs < 1. The effect of the covariate DDE-2 word pretest errors approached statistical significance, F(1, 104) = 3.37, p = .07,
In sum, generalization to the standardized tasks was evident in English only. As noted earlier, 14 of the participants in this study performed below the fifth percentile on the DDE-2, the standardized Italian word spelling test. We controlled for this variable in all analyses. However, to ensure that an uneven distribution of poor spellers between the experimental and waiting list groups did not affect the results of the study, the same analyses were performed excluding the 14 participants with DDE-2 spelling scores below the fifth percentile. These analyses confirmed the initial findings. The only exceptions were the effects of Time and Time × Age on the untrained word spelling lists, which were nonsignificant in these new analyses. Notably, all significant Time × Group and Time × Group × Language interactions were replicated.
Discussion
A language-integrated approach to bilingual (L1/AL) spelling instruction is based on the hypothesis that learning to spell involves the same basic language-learning (i.e., orthographic mapping) abilities across alphabetic orthographies. As a consequence, the same instructional strategies should be effective for developing spelling skills in L1 and in an AL, independently of the differences between the two alphabetic scripts (i.e., their language-specific characteristics). We tested this hypothesis by exploring whether delivering the same spelling intervention—an adaptation of the mind’s ear and eye training (by Berninger, Abbott, et al., 1998)—in Italian (L1, a shallow orthography) and English (AL, a deep orthography) could support the young Italian writers’ development of spelling skills in both languages. The training modeled the use of syllabic/multiletter units in spelling.
Being a very shallow orthography, Italian is characterized by consistent small grain size (phoneme–grapheme) correspondences; thus, Italian children typically do not need to rely on larger spelling units, such as rhymes or syllables, to learn to spell words (Ziegler & Goswami, 2006). However, regarding orthographically complex words, such as those trained in this study, syllabic transcription may also become a useful spelling strategy in Italian (Arfé, Cona, et al., 2018; Burani et al., 2006). Thus, we assumed that the training would be effective in both languages (Italian and English) but also formed the hypothesis that language-specific factors could amplify or hinder (depending on the orthography) the effect of the intervention, affecting children’s ability to transfer syllabic transcription skills to new words in Italian and English.
Effectiveness of the Spelling Intervention in Italian and English
In keeping with prior studies showing a significant correlation between spelling in the L1 and an AL (Russak & Kahn-Horwitz, 2015; Sparks et al., 2009; Sparks, Patton, et al., 2008), for the children in this study, Italian and English word spelling skills also were significantly correlated on the pretest. Correlations between spelling accuracy in the two languages were r = .62 for the trained lists and r = .56 for the untrained ones. These correlations are similar to those observed in English-speaking children learning an AL at school (Sparks et al., 2009) and indicate that spelling skills are associated across languages.
Language-general effects
The language-integrated spelling intervention proved effective in both languages: With the intervention, children improved significantly in spelling both the Italian and the English trained word lists. The experimental group, who received the intervention first, made significantly greater progress than the waiting list group in spelling the Italian and English word lists between T1 and T2. The waiting list group, who received the intervention between T2 and T3, improved more in this time interval. For the experimental group, the gain scores associated with the intervention were 4.86 (Italian trained word list) and 4.38 (English trained word list). For the waiting list group, they were 2.73 (Italian trained word list) and 5.30 (English trained word list). Although these findings support a language-integrated approach to spelling instruction, other findings of this study suggest that language-specific factors should also be considered.
Language-specific effects
The within-group analyses helped clarify the results of this study. With the intervention (i.e., between T2 and T3), the waiting list group showed greater gains in spelling the English words than the Italian trained ones, whereas in the first time interval, between T1 and T2, it showed greater improvement on the Italian word spelling list, a finding that will be discussed in greater detail later. Moreover, generalization of the acquired spelling knowledge to the untrained word lists was significant for both groups only in English. This finding suggests, on one hand, that the children were able to transfer and generalize the spelling knowledge acquired during the training to new words and, on the other hand, that the likelihood they did so depended on language-specific factors.
The improvement of the waiting list group in spelling the Italian word list between T1 and T2, that is, before the intervention, suggests that in this time period, the children in this group benefited from plain spelling practice (e.g., spelling from dictation). The standard practice of spelling was probably less effective in English and thus gains in English in the first time span were only minor. By being exposed to English spelling regularities through print or writing—the two methods most commonly used to teach English spelling in Italian primary schools—children may learn spelling patterns implicitly, simply recoding (new) written words (Shahar-Yames & Share, 2008). These implicit learning processes were, however, largely inefficient in English, where only more direct multiletter-sound mapping instruction produced significant improvement.
The greater generalization of the training effects in English suggests that the participants were better at extracting spelling patterns and rules when trained in their AL, rather than in their L1. A possible explanation for this finding—in line with the language-specific hypotheses—is that regularities are typically abstracted in Italian and English from spelling units of different size: smaller one-to-one phoneme–grapheme spelling units in Italian (L1) and larger, multiletter spelling units in English (Berninger, Vaughan, et al., 1998; Ziegler & Goswami, 2006). Direct instruction of multiletter spelling procedures could have thus been more effective in English, making our participants more able to generalize the acquired knowledge to new words.
Generalization of the multiletter spelling procedures to new (untrained) words was more difficult in Italian, probably because the training conflicted with what children had learned at school or because of the language-specific characteristics of Italian orthography. These findings are consistent with the second hypothesis of this study (i.e., the mediating role of language-specific factors in response to spelling instruction) and with the orthographic proximity hypothesis (Kahn-Horwitz et al., 2011; Schwartz et al., 2014), which predicts that the likelihood of the same learning processes and spelling abilities being successfully used and transferred between the L1 and an AL depends on the similarity between the two orthographies. This study cannot answer questions on crosslinguistic transfer effects, such as “What happens to spelling skills when we introduce the same training at different time points in the L1 and an AL? Or when we train spelling in one language (AL) only and test the transfer effects to the other (L1)?” However, such questions could be the target of future studies.
Overall, the finding of language-specific generalization effects complements the previously reported finding of a positive, language-general effect of the training, suggesting that the language-specific characteristics of the orthography to be learned can moderate and limit the effectiveness of a language-integrated spelling intervention. However, it is interesting to note that the younger children in this study generalized the acquired spelling knowledge to the new (untrained) Italian words to a greater extent. This indicates that language-specific effects can be greater for older writers than for younger ones.
In general, our findings showed that the younger children and poorer spellers (i.e., the children who made more errors on the DDE-2) benefited more from the training, that is, the children seemed to show a greater response to the spelling intervention when their one-to-one spelling strategies were less efficient, either because they were not yet fully consolidated or because of spelling problems. The effectiveness of the mind’s ear and eye training for improving the spelling skills of children with spelling problems has also been demonstrated by other studies (Arfé, Cona, et al., 2018; Berninger, Abbott, et al., 1998). However, researchers should further examine how language-specific factors interact with spelling abilities and the type of training (language-integrated or language-specific) in children’s response to spelling interventions.
Limitations of the Study
In this study, the intervention effects were tested in Italian with a limited set of orthographic difficulties (context-dependent graphemes, geminates). This choice was made because words containing these difficulties typically represent the only spelling challenges for young Italian spellers. However, in future studies, the possibility of extending the use of the training to Italian words with more consistent one-to-one spelling with varying orthographic structures should also be explored.
Another limitation of this study is methodological. In the study, standardized spelling tests were used to assess spelling skills in both Italian and English. However, an Italian standardization of these tests was available only for the Italian (DDE-2) instrument. The lack of an Italian standardization of the English spelling test limits the possibility of assessing how these young writers truly performed on average in English spelling for their age. In addition, the study lacks parallel spelling tests (e.g., for pseudoword spelling) in the two languages (Italian and English). This limits—at least for the standardized measures of this study—the possibility of a direct (crosslinguistic) comparison between languages.
The short duration of the intervention represents a further possible limitation of this study. The spelling intervention proposed in this study (eight 20-min twice-weekly sessions per language over 1 month, with six words trained per session) was shorter than in Berninger, Vaughan, et al. (1998), where the intervention consisted of 24 20-min training sessions conducted over approximately 4 months. The duration (or intensity) of the training could have been insufficient to allow children to consolidate the spelling knowledge acquired in Italian, where a shift in spelling strategies (from phoneme–grapheme to syllabic strategies) was necessary. Longer or more intensive training could have yielded different results.
Conclusion and Implications
This study shows that a language-integrated approach to bilingual (L1/AL) spelling instruction can have positive effects on children’s ability to spell in their L1 and in the AL, especially for young and poor spellers. In this study, the training was focused on multiletter spelling units and thus showed greater effects in English. However, as pointed out by other authors (Berninger, Vaughan, et al., 1998), teaching multiple levels (small and large size) of sound-spelling connections can be a more efficient method for teaching spelling than focusing on spelling units of small size only, as in Italian. In keeping with other findings (Arfé, Cona, et al., 2018), the results of this study also show how in a very shallow orthography with highly consistent phoneme–grapheme correspondences, multiletter-sound mappings indeed can be useful to teach children more complex (and less shallow) spelling regularities. Language-specific characteristics may, however, hinder the generalization of these spelling procedures across languages. An open research question is whether this happens because the characteristics of the orthographic system constrain the child’s learning process in that language or because the spelling procedures taught to children shape and structure their learning processes. In the first case, children’s implicit learning of language-specific spelling patterns would shape the spelling processes that will be subsequently used to abstract spelling knowledge (Ise et al., 2012). In the second case, language-specific spelling processes would be an outcome of explicit learning processes and children’s awareness that some (e.g., one-to-one) spelling strategies may be more efficient than others. Only studies focused on emergent and early spelling skills can answer this research question, exploring, for example, the effects of a language-integrated approach to spelling instruction from the earlier phases of spelling development. 
As discussed in the introduction to this article, similar studies are essential for making instructional decisions in an increasingly multilingual society.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
