Abstract
Teaching linguistic aspects relevant to text construction is an essential component of any thorough writing instruction program, despite the conflicting evidence regarding its effectiveness. In this study, 889 second- and fourth-grade students were assigned to one of three conditions: Self-Regulated Strategy Development (SRSD), SRSD-connectors (SRSD-C), and business-as-usual (BAU). The experimental conditions addressed planning and self-regulation strategies to write opinion essays, but only the SRSD condition included explicit teaching of connectors (e.g., because) and discourse markers (e.g., In conclusion). Children in both experimental conditions outscored children in the BAU condition across grades and outcome variables. In addition, the SRSD condition showed larger effect sizes on Grade 2 children’s gains in text quality, number of genre-appropriate elements, and number of connectors than the SRSD-C condition. The study provides evidence of the effectiveness of explicitly teaching functionally motivated linguistic representations within an SRSD program. Theoretical and educational implications are discussed.
Keywords
Learning to write at school is a complex endeavor that spans the entire period of obligatory education. Because writing is a top priority in elementary schools, teachers can find it overwhelming to decide what to teach about writing, when to teach it, and how to teach it. In this article, we investigated the effectiveness of teaching children writing, paired with self-regulation strategies, with or without explicit teaching of relevant linguistic representations.
What Works in Writing Instruction: Self-Regulated Strategy Development (SRSD)
Writing instruction has undergone considerable changes in light of accumulating evidence about the processlike nature of writing development and of effective writing practices. Accordingly, one of the most important differences has been the shift in focus toward teaching children to write by providing strategies to cope with the writing process, rather than expecting students to improve their writing based chiefly on the traits of model texts, that is, a product approach (Berninger et al., 1996; Graham & Perin, 2007; Graham et al., 2012). More recently, a more integrative view of writing instruction has been favored, such as the one proposed by the Writer(s)-Within-Community model (Graham, 2018). This model views writing as a series of domain-specific, recursive processes, namely, planning, translating, and revising (e.g., Alamargot & Chanquoy, 2001; Hayes & Flower, 1980), that are carried out thanks to the contribution of a host of domain-general, cognitive skills (e.g., working memory, short-term memory; Kellogg, 1996; McCutchen, 2000; Salas & Silvente, 2020), while situated within a specific communicative, social context.
SRSD is a program of writing instruction that is well-rooted in current models of writing development. As such, it capitalizes on the abundant evidence suggesting that writing is best conceptualized as a socially situated, complex cognitive task. Accordingly, it involves teaching strategies to execute writing processes, and how to generate text attending to key product features, in tandem with self-regulation strategies (e.g., goal-setting, self-instructions; Graham & Harris, 2018). SRSD writing interventions are developed for a specific communicative situation (e.g., opinion essay, narrative, summary). They are typically discourse-rich and involve a progressive expectation of autonomy on the part of the learner to use the strategies; this means that teachers first execute and model the strategies themselves and then gradually transfer their use to students until the students are fully autonomous. SRSD writing interventions include six elements that may be combined in different ways to design the sessions that will form the intervention. These elements or “stages” involve (1) the activation of previous knowledge (Develop and activate background knowledge stage); (2) the discussion of learning goals (Discuss it stage); (3) the modeling of the strategies on the part of the teacher (Model it stage); (4) the use of mnemonics to help remember the strategies (Memorize it stage); (5) the support that the teacher provides to students, both in terms of constructive, targeted feedback, as well as in the development of supporting material to assist the use of the strategies (Support it stage); and (6) the sensible, progressive fading out of support, to encourage children’s autonomous use of the strategy (Independent performance stage; Graham & Harris, 2009, 2018).
Component Analyses of SRSD Writing Interventions
SRSD has been found to be effective across a number of population types (typically developing, as well as learning- and language-disabled children, Gillespie & Graham, 2014; Graham & Harris, 2003), languages (e.g., English: see Harris & Graham, 2017 for a review; Portuguese: Limpo & Alves, 2013; Spanish: García-Sánchez & Fidalgo-Redondo, 2006; German: Glaser & Brunstein, 2007; or Catalan: Salas et al., 2021), and types of administration (i.e., individual, small group, whole class; Graham et al., 2012). Despite its well-attested effectiveness, precise knowledge of the specific impact of each of SRSD’s design features is still scarce. In this sense, component analyses are instrumental to determine which traits of SRSD writing instruction contribute to its effectiveness. These analyses should be useful to improve our understanding of the causal mechanisms underlying writing development and instruction (Geres-Smith et al., 2019). Additionally, they should facilitate the implementation of evidence-based practices for writing instruction by allowing educators to more flexibly apply effective teaching practices when it is not feasible to execute a writing program in its entirety (Fidalgo & Torrance, 2018; López Gutiérrez, 2019).
A number of studies have conducted component analyses of SRSD writing instruction. A few studies have inquired whether direct instruction is vital for an SRSD intervention to be successful. For the most part, including direct instruction has not been shown to strengthen the effectiveness of SRSD programs, provided the rest of its features are followed (e.g., Fidalgo et al., 2011; Graham & Harris, 1989; Sawyer et al., 1992). One study compared whether teaching students planning and revising strategies was superior to teaching them planning strategies alone. Results showed that students benefited from the inclusion of revising strategies (López Gutiérrez, 2019). Fidalgo et al. (2015) showed that sixth-grade children benefited from writing instruction including modeling and a teacher-led reflection on such modeling to the same extent as from watching the teacher model writing strategies only (i.e., without a posteriori reflection). Other studies have looked at the relative importance of self-regulation strategies, often reporting seemingly conflicting findings. Brunstein and Glaser (2011) and Glaser and Brunstein (2007) found that including instruction on self-regulation strategies (e.g., goal-setting, self-monitoring, self-instructions) in addition to providing writing strategies was more effective than providing writing strategies alone. However, Geres-Smith et al. (2019) found that the inclusion of self-statements in an SRSD writing intervention was as effective as not including this strategy.
To recap, few component-analysis studies on SRSD writing instruction have been carried out to date. A number have been concerned with pedagogical aspects of instruction (e.g., including direct instruction or not and modeling plus reflection or alone) or with the role of self-regulation strategies, while some researchers have been concerned with the content to be taught (e.g., planning-only or planning + revision). To our knowledge, no previous study has attempted to understand the specific impact that providing explicit instruction on linguistic aspects of text construction has on SRSD’s overall effectiveness. Therefore, in this study we implemented two versions of an SRSD planning intervention, one version with explicit strategies on using connectivity devices and a second version without explicit strategies on using connectivity devices.
Teaching Relevant Linguistic Representations
Teaching students about characteristics of language has been one of the core goals in the history of compulsory education worldwide (D. A. Myhill, 2016). Grammar instruction is perhaps the linguistic component most often addressed in research, in terms of its usefulness for writing development. However, it has, for the most part, shown a negligible effect, at best, on the writing competence of primary and secondary students (Andrews et al., 2006; Graham & Perin, 2007; D. Myhill & Watson, 2014; Wyse et al., 2022), although it is an underresearched area (D. Myhill & Watson, 2014). Notably, most of the empirical research showing the absence of positive effects of linguistic or, more specifically, grammatical instruction on writing outcomes has been classified as “traditional” or form-focused, involving merely labeling parts of speech and syntactic analysis, while adopting a prescriptive attitude, with the primary purpose being to identify features of “good writing.”
Functional views of grammar (e.g., Halliday & Matthiessen, 2013) have typically been regarded as an alternative to traditional grammar, as they involve consideration of the way in which languages use forms to achieve socially meaningful purposes (Derewianka & Jones, 2010). In this sense, some evidence has accrued of the benefit of adopting a functional approach to linguistic instruction on writing. Such an approach entails careful consideration of the linguistic support that students might require to compose specific texts. Language teaching in this context is embedded within a writing instruction program and aims to assist the student writer in the composition process (e.g., Fearn & Farnan, 2007; Jones et al., 2013; D. A. Myhill et al., 2012, 2016). For example, Jones et al. (2013) conducted a randomized controlled trial with over 800 12- to 13-year-old students. The authors’ goal was to provide relevant linguistic instruction to support the writing of narrative, argumentative, and poetry texts. After controlling for several factors at the school, teacher, and student levels, they found that the experimental group showed greater writing competence than a comparison group of children. Nevertheless, they noted that the intervention was effective only for average and more able writers. In a subsequent study, D. A. Myhill et al. (2018) addressed this shortcoming by targeting students whose narrative texts were characterized by including a very limited use of punctuation, low sentence variety, little character or setting development, low noun-phrase expansion, and language patterns more typical of speech. Their results showed a small effect of the intervention that benefited the experimental group, with significant, positive effects only for their assessment of “sentence structure and punctuation” but not for “text structure and organization” or “composition and effect.”
Other approaches to determining the role of specific linguistic aspects in writing development and instruction involve an interest in the role of explicit vocabulary teaching, given that there is accumulating evidence that vocabulary knowledge is an essential aspect of learning to write (Olinghouse & Leaird, 2009). Studies that conducted interventions that targeted explicit vocabulary training, whether within a wider writing instruction program or in isolation, showed generally positive results of experimental groups in comparison to a control group (e.g., Duin & Graves, 1987; McCutchen et al., 2022; Moseley, 2003; Papadopoulou, 2007; Yonek, 2008). It should be noted, however, that studies varied considerably on many aspects, such as the educational level of the participants, the type of comparison group, or the vocabulary-teaching technique.
Finally, SRSD writing studies typically involve some sort of explicit teaching of linguistic aspects relevant for text construction. Specifically, the POW+TREE strategy for planning opinion essays typically includes providing students with a “transition word” chart (Graham & Harris, 2018). Similarly, the WRITE strategy embeds the importance of transition words in the mnemonic itself, since it stands for “Work from your plan to develop a thesis statement, Remember your goals, Include transition words for each paragraph, Try to use different kinds of sentences, Exciting words” (Mason et al., 2011, p. 21). Although these SRSD strategies have been shown to significantly improve the writing outcomes of learning-disabled and typically developing students, the use of such transition words has not been systematically evaluated (e.g., Benedek-Wood et al., 2014; Cuenca-Carlino & Mustian, 2013; Gillespie & Kiuhara, 2017; Hoover et al., 2012). Most importantly, no study to date has accounted for the specific function of explicit linguistic information within SRSD writing instruction.
To sum up, the value of language teaching for supporting writing competence is uncertain. Although it seems fairly clear that traditional approaches to language instruction are ineffective in improving students’ writing, more functionally oriented approaches, especially those that embed language instruction within writing programs, seem promising. Moreover, studies that target specific vocabulary items appear to be generally effective, while SRSD writing programs usually include explicit reference to specific words (e.g., transition words) and are generally successful. In this study, we compared an SRSD writing intervention focusing on planning of opinion essays that did not include any explicit reference to transition words to an identical SRSD intervention that included typically embedded language instruction; specifically, students were taught about the role of a specific subset of genre-appropriate lexical items and grammatical constructions: connectivity devices (e.g., discourse markers). As mentioned above, and to the best of our knowledge, no experimental or quasi-experimental studies have been conducted to single out the efficacy of teaching connectivity devices to improve writing competence. Our design allows us to determine whether language instruction embedded within an SRSD writing intervention results in an additional benefit to students’ writing competence, over and above the benefit observed in an SRSD condition that does not include specific linguistic strategies. The interventions were administered to teach beginner- (Grade 2) and intermediate-level (Grade 4) writers to compose opinion essays. By “opinion essay” we refer to a text whose main aim is to convince the reader that the writer’s point of view is the right one (Coirier & Golder, 1993).
This discourse genre was chosen for two main reasons: First, because the opinion essay has been considerably less explored than other genres (especially, narratives), particularly in early- and mid-elementary school levels. Second, because the opinion essay is a more challenging genre for children, relative to narrative texts, with which children are usually familiar (Berman & Nir-Sagiv, 2007). We reasoned that the opinion essay was a timely genre to investigate, and that studying it could build on and expand findings from other, more explored genres.
This Study
In the current study, we examined an opinion essay planning intervention, paired with self-regulation strategies, that included explicit teaching of connectivity devices. Because overt reference to transition words or other relevant linguistic features of texts is customary in SRSD interventions (Harris & Graham, 2017), we refer to this condition as the “SRSD condition.” We compared the SRSD condition to a virtually identical one, which also taught planning of opinion essays paired with self-regulation strategies but which did not include explicit reference to the linguistic makeup of texts; that is, it did not involve direct instruction of connectivity devices (henceforth, SRSD-C condition). In a previous study, we found that the SRSD-C condition was more effective than a business-as-usual (BAU) control condition for second- (7-8 years old) and fourth- (9-10 years old) grade students to improve planning skills, increase text length and the number of genre-appropriate structural elements, as well as overall text quality (Salas et al., 2021). We therefore implemented two writing interventions to teach second- and fourth-grade children strategies for composing opinion essays. The design of the SRSD condition reflected the explicit reference to linguistic elements typical of SRSD writing interventions (Harris & Graham, 2017). Moreover, it was informed by recent findings of the effectiveness of functional approaches to language instruction, which indicate that the teaching of language works to improve writing when it is embedded within a writing intervention program (i.e., the overall focus is on writing), and when it targets aspects that are highly relevant for the communicative function of the text (D. A. Myhill et al., 2018). Thus, the SRSD intervention entailed teaching the same writing and self-regulation strategies as in the SRSD-C condition, but it also embedded the teaching of transition words or connectivity devices.
Connectivity devices included interclausal connectors (e.g., because) and discourse markers (e.g., First, In my opinion, In conclusion). Connectors were introduced with explicit reference as to their function in the type of texts that were the target of instruction (e.g., Use “First” to introduce a reason for your thesis statement; or “In conclusion” lets the reader know your text is about to end).
This study is an improvement over previous research in that (1) it addresses a component of writing instruction, language teaching, that is relatively unexplored, especially in large-scale studies controlling for several individual and contextual factors (e.g., SES, gender, and nesting of children in classrooms; Andrews et al., 2006); (2) it includes children in earlier developmental stages than most previous studies that, as reviewed above, have typically included participants in late primary or secondary education; and (3) it provides a focused assessment of the value of adding language instruction to an otherwise successful intervention, and not just to a control condition. This design means that, should participants in the SRSD condition outscore their peers in the SRSD-C condition, the added benefit could be safely attributed to the explicit linguistic support provided to those students.
The study had a pretest-posttest design, and the schools children attended were randomly assigned to one of the experimental groups or to the control, business-as-usual (BAU) group. At each testing time, we obtained measures of children’s planning skills and knowledge of connectivity devices. We also obtained a number of text-embedded measures: text quality, defined as the extent to which a text was considered by raters to be grade-level appropriate; text generation (number of words), since it is a more objective measure that is established from a very early age across languages and is highly sensitive to writing development (e.g., Berman & Verhoeven, 2002; Salas & Caravolas, 2019); number of genre-appropriate structural elements (e.g., reasons, conclusion), as these would provide insight into the extent to which the intervention was successful in developing students’ discursive representations of opinion-essay writing; and number of connectors used in the text. These measures were analyzed to answer the following research questions (RQs):
RQ1: Do children in the SRSD condition show better writing performance than children in the BAU control group?
RQ2: Do children in the SRSD condition show better writing performance than children in the SRSD-C group?
We reasoned that the answer to RQ1 would be affirmative, given the ample evidence showing the effectiveness of various forms of SRSD interventions to improve several aspects of writing performance (e.g., Graham et al., 2012). With regard to RQ2, we hypothesized that the functional approach to teaching connectivity devices would help children gain insight into these linguistic elements that are essential for opinion essay composition. Therefore, they would outscore their SRSD-C peers in most writing performance tasks.
Method
Participants
We approached 1,021 second- and fourth-grade students attending 12 public primary schools in Barcelona, Spain. Parental consent was obtained for 1,011 children. One school withdrew from the study right before data collection began, leaving 909 participants in the final sample. Schools were randomly assigned to one of three conditions: a BAU control group, and two experimental groups, SRSD and SRSD-C (Table 1). Assignment to a given condition applied to whole schools, rather than to children or classrooms, to avoid contamination between classrooms in different conditions, and to enable researchers to offer teacher training to the entire school staff. The exception was one classroom assigned to an SRSD condition, whose teacher went on sick leave just before the start of the intervention. The class was then reassigned to the BAU group. There were four schools assigned to the BAU condition (12 classrooms), five schools assigned to the SRSD-C condition (15 classrooms) and three schools assigned to the SRSD condition (11 classrooms).
Table 1. Participants’ Demographic Characteristics.
Note. ISEI = International Socioeconomic Index (Ganzeboom et al., 1992), range: 8.3-86; Exposure to LoI = Exposure to Language of Instruction (Catalan), range: 0 to 2.5.
Participants completed a sociolinguistic questionnaire with the help of the administrator and in consultation with the classroom teacher. The questionnaire inquired about children’s linguistic background; specifically, the extent to which they used the language of instruction (Catalan) outside school (i.e., with their parents, siblings, friends). Catalan is a Romance language, spoken in specific regions in Spain, particularly Catalonia, where the research was conducted. It has official-language status in that region, and it coexists with the national language, Spanish. Virtually all speakers of Catalan also speak Spanish, but not the reverse. Although all public schools impart classes in Catalan (all subjects, except for Spanish language and literature and English), some children do not use the language outside of the classroom and have little to no exposure to it. Therefore, in this study we measured the extent to which children had exposure to Catalan outside school, according to whether they used Catalan to speak with parents, siblings, and friends or, rather, whether contact with Catalan was nonexistent out of the classroom. We obtained data about language exposure from 93.39% of the sample. The resulting measure ranged from 0 (no exposure to Catalan outside school) to 2.5 (Catalan was the main language spoken to parents, siblings, and/or friends). We ran a 2 (Grade) × 3 (Group) analysis of variance to test differences between groups and grade levels in overall exposure to Catalan outside school. This test yielded a small, but significant, main effect of Group, F(2, 842) = 13.26, p < .001, ηp² = .030, but no significant Grade effect or interaction (ps > .26). Post hoc Bonferroni tests indicated that the children in the control group were significantly less exposed to the language of instruction outside school than both experimental groups (ps < .001), which did not differ from each other (p = .178).
Therefore, the degree of exposure to Catalan outside school was used as a control variable in all subsequent analyses.
The sociolinguistic questionnaire was also used to ascertain the socioeconomic status (SES) of children’s families. We inquired about parents’ occupation, as this objective piece of information could be easily verified by teachers and has been used in previous studies to quantify socioeconomic status through indices such as the International Socioeconomic Index (ISEI; Ganzeboom et al., 1992). The average of the ISEI scores of a child’s mother and father was used as an indicator of SES. We obtained data from 84.93% of the sample. We ran a 2 (Grade) × 3 (Group) analysis of variance to test for differences between groups and grade levels in average ISEI scores. There was a significant, small main effect of Group, F(2, 766) = 4.15, p = .016, ηp² = .011, but no significant Grade effect or interaction (ps > .80). Post hoc Bonferroni tests indicated that the families of children in the SRSD group had significantly higher average ISEI scores than the control group (p = .024). There were no differences between the SRSD and the SRSD-C group (p = .083), or between the SRSD-C and the control group (p = 1.00). The average ISEI score of each child was therefore used as a control variable in all subsequent analyses.
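As a reading aid, the effect sizes reported for the two Group main effects can be approximately recovered from the F ratios and their degrees of freedom. The sketch below uses the standard identity for a fixed-effects analysis of variance, partial eta squared = (F × df effect) / (F × df effect + df error). It is a sanity check under that assumption, not the authors' computation, and the function name is hypothetical. With the reported values it lands within about .001 of the published effect sizes.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F ratio and its degrees of freedom.

    Assumes a fixed-effects ANOVA, where F = (SS_effect / df_effect) / (SS_error / df_error).
    """
    scaled_effect = f_value * df_effect  # equals SS_effect / MS_error
    return scaled_effect / (scaled_effect + df_error)


# Group main effects as reported in the text
exposure_effect = partial_eta_squared(13.26, 2, 842)  # exposure to Catalan
isei_effect = partial_eta_squared(4.15, 2, 766)       # average ISEI score
print(f"exposure: {exposure_effect:.4f}, ISEI: {isei_effect:.4f}")
```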
Tasks and Measures
Connectivity test
Participants were administered a sentence-completion, forced-choice test that required them to choose one suitable interclausal connector or discourse marker. Tests were designed by the researchers to estimate students’ ability to choose an adequate connector to fulfill different grammatical and discursive functions, typical of opinion-essay writing. Children were presented with two clauses or short texts (of up to two and five sentences, for second and fourth grade, respectively) in which connectors and discourse markers were missing. Participants were instructed to choose from three alternatives the most suitable word for the blank. Each blank counted as an item, and there were 12 and 21 items in the second- and fourth-grade instruments, respectively, at both pretest and posttest. Note that more than one option was usually grammatically possible, but only one option was both grammatically and discursively correct. For example, in the sentence We did not go to the beach _______ it was raining, the options and and because are both grammatically plausible, but the latter is a much better fit. Items became progressively more complex, until the children were required to fill in the blanks of short texts. Each correct item received a score of 1, and the final score resulted from the sum of all correct items. Because the number of items differed across instruments, percentage-correct scores were used. Reliability of the instruments was an issue, despite extensive piloting before the data collection. The second- and fourth-grade pretest instruments were only moderately reliable, Cronbach’s α = .52 and .49, respectively, and only after removing five items from the second-grade instrument and six items from the fourth-grade instrument.
At posttest, both the second- and the fourth-grade instruments showed adequate reliability, Cronbach’s α = .73 and .71, respectively, although only after removing two items from the second-grade instrument and five items from the fourth-grade instrument. Therefore, readers should interpret results involving the measures obtained from these instruments with caution.
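The scoring and reliability steps described above can be sketched as follows. This is a minimal illustration, assuming each item is coded 1 (correct) or 0 (incorrect) and that Cronbach's α is computed with the usual formula, α = k / (k − 1) × (1 − Σ item variances / variance of total scores). The function names and the tiny response matrix are hypothetical and are not taken from the study's data.

```python
from statistics import variance


def percent_correct(item_scores: list[int]) -> float:
    """Percentage of items answered correctly (each item scored 0 or 1)."""
    return 100.0 * sum(item_scores) / len(item_scores)


def cronbach_alpha(responses: list[list[int]]) -> float:
    """Cronbach's alpha for a children-by-items matrix of item scores."""
    n_items = len(responses[0])
    item_columns = list(zip(*responses))  # one tuple of scores per item
    item_variances = [variance(column) for column in item_columns]
    total_variance = variance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: four children, three items (illustration only)
demo_responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(percent_correct(demo_responses[1]), cronbach_alpha(demo_responses))
```

Removing an item, as was done for the instruments here, amounts to dropping its column from the matrix before recomputing α.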
Opinion essay samples
Children were required to compose opinion-essay texts at pretest and at posttest. They were given up to 10 minutes to make drafts and then up to 20 minutes to write the final version of their texts. There were two prompts at both pretest and posttest, which were randomly assigned to each classroom. The prompts were, Do you think that all children your age should go to school? and Do you think that recess time is necessary at school? Children who were administered the first prompt at pretest were administered the second at posttest, and vice versa. Children were encouraged to write as well as they could and did not receive help with spelling or ideas. This writing task was used to obtain a series of measures.
Planning quality
Children’s planning skills were evaluated using an adaptation of the 1-5 scoring procedure proposed by Whitaker et al. (1994). Plans that consisted of a blank page or a text that was later copied verbatim in the final version of the text received the lowest score, 1. Plans that were too short (a single sentence, just the prompt, one word or phrase) also received the lowest score. Plans that showed minimal planning, meaning only small modifications between draft and text (e.g., addition of new information, brief expansion of an idea, small change in the order of presentation of ideas), received a score of 2. Plans that consisted of lists of keywords received a score of 3. Plans that involved emerging elaboration of ideas, particularly if they made reference to structural plans (e.g., referring to the part of the texts where the idea would go, such as Introduction, body, or for/against lists) received a score of 4. Finally, plans where ideas were hierarchically organized and elaborated received a score of 5. A trained research assistant scored all plans and a second rater scored a random 20%. Interrater reliability, assessed with weighted kappa, was .871.
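For readers unfamiliar with weighted kappa, the sketch below shows how agreement between two raters on the ordinal 1-5 planning scale could be computed. The text does not state which weighting scheme was used, so the default of linear weights is an assumption, with quadratic weights offered as an alternative. The function name and the example ratings are hypothetical.

```python
def weighted_kappa(rater_a, rater_b, categories, weighting="linear"):
    """Weighted kappa for two raters scoring the same plans on an ordinal scale.

    weighting: "linear" (assumed default) or "quadratic" disagreement weights.
    """
    k = len(categories)
    index = {category: position for position, category in enumerate(categories)}
    n = len(rater_a)

    # Observed joint proportions of (rater A score, rater B score)
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1 / n

    row_marginals = [sum(observed[i]) for i in range(k)]
    col_marginals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    power = 1 if weighting == "linear" else 2
    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = (abs(i - j) / (k - 1)) ** power  # 0 on the diagonal
            observed_disagreement += weight * observed[i][j]
            expected_disagreement += weight * row_marginals[i] * col_marginals[j]
    return 1 - observed_disagreement / expected_disagreement


# Hypothetical ratings for five plans (illustration only)
scale = [1, 2, 3, 4, 5]
first_rater = [1, 2, 3, 4, 5]
second_rater = [1, 2, 3, 4, 4]
print(weighted_kappa(first_rater, second_rater, scale))
```

Weighting matters on a scale like this one because a disagreement between scores of 4 and 5 is penalized less than a disagreement between 1 and 5.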
Text generation (word level)
Children’s text generation skills at the word level were assessed by automatically counting the number of words in their pretest and posttest texts. All texts were transcribed in a word processor, without format, correcting children’s spelling mistakes. Texts were transformed to .chat format using the TEXT2CHAT command in CLAN (MacWhinney, 2000). We used the STATFREQ command to obtain automatic counts of word tokens in each text.
Text generation (sentence level)
Children’s text-generation skills at the sentence level were assessed by automatically counting the number of clauses in their pretest and posttest texts. Two research assistants with extensive linguistic training transcribed all texts and divided them into clauses, following the criteria by Berman and Slobin (1994). Transcription reliability was ascertained by calculating a Pearson correlation between the clause counts in 150 texts, selected randomly, r = .959.
Structural elements
We evaluated the extent to which children included genre-relevant structural elements in their essays. All transcribed texts, with corrected spelling and punctuation errors, were uploaded to a bespoke web app, TextHandler. Raters were advanced students of speech therapy or primary-school teacher trainees. They logged in and were randomly shown an opinion essay that they had to evaluate in terms of whether it presented the following elements: introduction, thesis statement, one or more reasons, one or more explanations, and conclusion. One point was awarded for each structural element included, and the final score resulted from adding up all points, yielding a continuous variable. Raters did not have access to any information about the text, except for the prompt given to the child. They were trained by reading a detailed manual that included precise details on how to assess each structural element. They were told to assess whether the text included an overt thesis statement (and to disregard sentences simply stating, “I (don’t) think so”), to identify reasons that supported the thesis statement, explanations that elaborated reasons further (e.g., by providing an example), and a synthetic or expansive conclusion. Importantly, raters were instructed not to be biased by the quality or persuasiveness of the reasons/explanations, and to give credit for any coherent argument written by the child. They were told, however, to disregard repeated reasons/explanations. Raters evaluated 40 random texts that had been already assessed by the researchers. Once reliability (intraclass correlation coefficient [ICC]) between the rater’s and the researchers’ evaluation was .80 or higher, the rater proceeded to evaluate texts using TextHandler (Figure 1). Raters were instructed to skip a text if they were unsure as to how to evaluate it. Two raters scored all the texts, and the third author rescored a random 30% of texts. The reliability, ICC, was .908.
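The scoring rule for structural elements can be illustrated with a short sketch. It assumes a presence-based reading of the rubric, in which each of the five element types earns at most one point, so scores range from 0 to 5. Whether additional reasons or explanations earned further points is not stated in the text, so this reading is an assumption. The element labels and function name are hypothetical and do not describe how TextHandler stores ratings.

```python
# Hypothetical labels for the five element types named in the rubric
ELEMENTS = (
    "introduction",
    "thesis_statement",
    "reasons",        # one or more reasons
    "explanations",   # one or more explanations
    "conclusion",
)


def structural_score(rating: dict[str, bool]) -> int:
    """Sum one point per element type marked as present (assumed 0-5 range)."""
    return sum(1 for element in ELEMENTS if rating.get(element, False))


# Example: a text with a thesis statement and at least one reason, nothing else
example_rating = {"thesis_statement": True, "reasons": True}
print(structural_score(example_rating))
```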

The TextHandler app for evaluating structural elements and use of connectors.
Number of connectors
Using the same app, TextHandler, raters were instructed to specify if the text included suitable discourse markers and connectors to introduce the thesis statement (e.g., In my opinion), reasons (e.g., First and On the one hand), explanations (e.g., because, for example, and in other words), or a conclusion (e.g., In conclusion and To sum up). Raters received extensive training on identifying connectors. Their reliability with a random selection of 40 training texts, previously evaluated by the researchers, was assessed before proceeding to evaluate texts with TextHandler. Reliability between raters on a random 30% of texts indicated that the ICC was .907.
Text quality
TextHandler was also used to evaluate texts for holistic text quality. Raters were trained on a set of 40 sample texts from a specific grade that were already scored by the researchers; that is, they were trained either to assess second- or fourth-grade opinion essays. Raters logged into TextHandler and chose the grade they wanted to evaluate (second or fourth). The app would then show a random text on the left-hand side of the screen (Figure 1). On the right-hand side, raters had to choose a score from 1 (considerably under grade-level expectations) to 6 (considerably above grade-level expectations). All texts were scored by two different raters. Texts that received scores differing by more than 1 point were rescored by a third rater. The interrater reliability (ICC) was .857 for Grade 2 and .725 for Grade 4. The final score was the average across all raters.
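The scoring rule just described can be sketched as follows. This is a minimal illustration, not the TextHandler implementation: the function names are hypothetical, and the sketch assumes that, when a third rating was collected, the final score averaged all three ratings.

```python
def needs_third_rater(score_1, score_2):
    """True when two holistic scores differ by more than 1 point."""
    return abs(score_1 - score_2) > 1


def final_quality_score(score_1, score_2, score_3=None):
    """Average the available ratings on the 1-6 holistic scale.

    Assumption: the third rating is only used when the first two
    differ by more than 1 point, and it is averaged with both of them.
    """
    given = [score_1, score_2] + ([score_3] if score_3 is not None else [])
    for score in given:
        if score not in range(1, 7):
            raise ValueError("scores must be integers from 1 to 6")
    if needs_third_rater(score_1, score_2):
        if score_3 is None:
            raise ValueError("discrepant scores require a third rating")
        ratings = [score_1, score_2, score_3]
    else:
        ratings = [score_1, score_2]
    return sum(ratings) / len(ratings)


print(final_quality_score(3, 4))     # two raters within 1 point
print(final_quality_score(2, 5, 4))  # discrepancy resolved with a third rating
```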
Procedure
All tests were administered by trained research assistants in children’s regular classrooms. Pretests took place 1-2 weeks before the interventions began, while posttest tasks were administered exactly 1 week after the last session of intervention. The text connectivity tests were administered in the same session as the opinion essay composing task, but the order of presentation of tasks was counterbalanced.
The Interventions
Intervention groups (SRSD; SRSD-C)
Children in the intervention groups received 11 one-hour sessions during the time dedicated to language instruction. The design of the intervention was adapted from Limpo and Alves (2013). Sessions occurred once or twice weekly and were delivered by the children’s regular teachers. Children were told that they would participate in a project that aimed to ask children their opinion about things that matter to them, and that they would share their opinions via a blog where they would post their essays. In both groups, they worked with a planning strategy: children in Grade 2 learned the “PER” strategy, while children in Grade 4 learned the “CREC” strategy (Figure 2). Examples of the planning strategies and graphic organizers can be found in Figure 3. Students in the SRSD group received almost identical sessions to the SRSD-C group, but some activities were replaced with others focusing on connectivity devices for opinion essay writing. Session details and differences between groups can be found in Figure 4.

Planning strategies visualizations through mnemonic techniques: PER (second-grade) and CREC (fourth-grade).

Text organizers and self-evaluation sheets for the SRSD-C and the SRSD interventions.

Overview of the 11 session design per groups (SRSD-C; SRSD) and grade (when differentiated).
Throughout the sessions, teachers were expected to (1) encourage participation, (2) keep children’s focus on the strategies being taught and tell them to deal with spelling or other concerns at a different time, (3) provide feedback specifically on the application of the strategy (and not on spelling, punctuation, etc.) in order to be consistent as to the teacher’s expectations, and (4) encourage autonomy in the use of the strategies. Note that, contrary to several SRSD writing interventions, ours was time-based, rather than criterion-based (Harris & Graham, 2009). However, teachers were also expected to help students adapt goals to their level of abilities. For example, some struggling students could start by setting more moderate goals, such as writing only the thesis statement and one reason, rather than asking them to produce full plans. As they improved, goals could increase in complexity. Similarly, although children were expected to stop using graphic organizers from session 8 onwards, some children could benefit from using them longer.
BAU group
Control group teachers were asked to keep diaries of any literacy-related activities carried out between pretest and posttest. The first author reviewed all diaries and ensured that no activities were similar to the ones carried out in the experimental groups. The majority of literacy-related activities dealt with basic literacy skills (e.g., spelling, punctuation, decoding), reading of short texts, and sporadic free writing activities. To control for practice effects, teachers in the control groups asked children to produce the same number of texts, using the same prompts as children in the experimental groups. Teachers in the control groups received the same training the academic year after the study had concluded.
Treatment Fidelity
Treatment fidelity was ensured by a series of actions. First, teachers across experimental groups received an 8-hour training at their school, delivered by the first author or an experienced continued-education teacher trainer. The authors prepared manuals for teachers with the detailed 11-session plan, including step-by-step lesson specification, scripted activities (e.g., for modeling the strategy), and answer keys to all activities. The manuals were reviewed intensively during the training sessions. Second, each lesson included a checklist of key actions to fulfill, which teachers were expected to complete after each session to determine the degree of coverage of the lesson plans. Third, all teachers were given audio recorders and were asked to self-record all sessions and, in particular, the sessions where their involvement was greatest (i.e., Sessions 1-6). A random 30% of all recorded sessions were listened to by an external research assistant, who completed the same checklists herself. The analysis of this assessment yielded a mean coverage of 95.14% (SD = 6.66). Finally, the first author and the experienced teacher trainer who delivered the training sessions made contact with each of the experimental-group teachers, before and after each session, to discuss issues with the previous lesson and review the main goals for the next one.
Results
We conducted multilevel regression analyses on the outcome variables, calculated as the difference between posttest and pretest scores in the connectivity test, planning quality, text generation (at the word and sentence levels), number of connectors, structural elements, and text quality. Descriptive statistics are displayed in Table 2. The percentage of variance attributed to children being nested into classrooms was determined by the ICC, where values greater than .05 are large enough to warrant a two-level analysis (LeBreton & Senter, 2008; Table 3).
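To make the two quantities in this paragraph concrete, the sketch below computes gain scores (posttest minus pretest) and a one-way random-effects ICC for scores nested in classrooms. The article does not state how its ICCs were estimated, so the ANOVA-based formula, the function names, and the classroom data are assumptions made for illustration only.

```python
from statistics import mean


def gain_scores(pretest, posttest):
    """Gain for each child: posttest score minus pretest score."""
    return [post - pre for pre, post in zip(pretest, posttest)]


def icc1(groups):
    """ANOVA-based ICC(1) for scores nested in groups (e.g., classrooms)."""
    n_groups = len(groups)
    n_total = sum(len(g) for g in groups)
    if n_groups < 2 or n_total <= n_groups:
        raise ValueError("need at least two groups and within-group replication")
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)
    # Average group size, adjusted for unequal classroom sizes.
    k0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (n_groups - 1)
    denominator = ms_between + (k0 - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_between - ms_within) / denominator


def warrants_two_level(icc, threshold=0.05):
    """Apply the .05 rule of thumb cited in the text."""
    return icc > threshold


# Hypothetical gain scores for three small classrooms (illustration only).
classrooms = [[2, 3, 4], [5, 6, 7], [1, 2, 2]]
icc = icc1(classrooms)
print(round(icc, 3), warrants_two_level(icc))
```

In this toy example most of the variance lies between classrooms, so the ICC is far above the .05 threshold and a two-level model would be indicated.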
Descriptive Statistics of Pretest and Posttest Scores, and of the Posttest-Pretest Gains by Condition and Grade.
Note. ConnT = Test of connectivity (% accuracy); PlanQ = planning quality (range: 1-5); TGword = text generation at the word level (no. of words); TGsent = text generation at the sentence level (no. of sentences), StrEl = structural elements (sum of thesis statement, reasons, explanations, conclusion); NoConn = number of text-based connectors; TQ = text quality (range 1-6).
Intraclass Correlation Coefficients (ICCs) for All Outcome Variables.
Note. TConn = Test of connectivity; PlanQ = planning quality; TGword = text generation at the word level (no. of words); TGsent = text generation at the sentence level (no. of sentences), StrEl = structural elements; NoConn = number of text-based connectors; TQ = text quality; SES = individual ISEI score.
At Level 1, the level of individual students, outcome variables were regressed on their own pretest scores, on students’ ISEI scores (as a proxy for SES), gender, and on the extent of exposure to the language of instruction outside school. At Level 2, the level of the classrooms, outcome variables were regressed on the condition to which students had been randomly assigned. Because there were three different conditions, we included two dummy variables, Cond1, which compared children in the control group to children in the SRSD-C group; and Cond2, which compared children in the control condition to children in the SRSD condition. Outcome variables at Level 2 were also regressed on Grade and on the aggregated ISEI score for each classroom (ISEI-C). We also included two interactions, between Cond1 and Grade, and between Cond2 and Grade. Finally, we included additional model constraints to determine the difference between the SRSD-C and the SRSD groups and, in the case of a significant interaction, whether there were differences between all possible combinations of each grade and condition. There were 883 valid observations and missing values were estimated using maximum likelihood estimation in MPlus 8.1.6 (Muthén & Muthén, 2017). Tables 4 and 5 show the unstandardized coefficients for the Level 1 and Level 2 models, respectively. Table 6 shows the effect sizes of both treatment conditions against the control group, while Table 7 shows the effect sizes of the comparison between the SRSD-C and the SRSD conditions.
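The Level 2 coding described above can be sketched as follows. The 0/1 coding of Grade (Grade 2 as reference), the label "BAU" for the control classrooms, and the function name are assumptions; the article does not report the exact numeric coding used in Mplus.

```python
def level2_predictors(condition, grade):
    """Dummy-code condition and grade for one classroom.

    Cond1 contrasts SRSD-C with the control group and Cond2 contrasts
    SRSD with the control group, so the control group is the reference.
    Grade is coded 0 for Grade 2 and 1 for Grade 4 (an assumption).
    """
    if condition not in {"BAU", "SRSD-C", "SRSD"}:
        raise ValueError("unknown condition: " + str(condition))
    if grade not in {2, 4}:
        raise ValueError("grade must be 2 or 4")
    cond1 = 1 if condition == "SRSD-C" else 0
    cond2 = 1 if condition == "SRSD" else 0
    grade_code = 0 if grade == 2 else 1
    return {
        "Cond1": cond1,
        "Cond2": cond2,
        "Grade": grade_code,
        # Interaction terms are products of the dummy codes.
        "Cond1xGrade": cond1 * grade_code,
        "Cond2xGrade": cond2 * grade_code,
    }


print(level2_predictors("SRSD", 4))
```

With this coding, the control group in Grade 2 has zeros on every predictor, so each coefficient is read as a departure from that reference cell.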
Level 1 Multilevel Regression Results for All Outcome Variables (Unstandardized Coefficients).
Note. TConn = Test of connectivity; PlanQ = planning quality; TGword = text generation at the word level (no. of words); TGsent = text generation at the sentence level (no. of sentences), StrEl = structural elements; NoConn = number of text-based connectors; TQ = text quality; SES = individual ISEI score.
*p < .05. **p < .01. ***p < .001.
Level 2 Multilevel Regression Results for All Outcome Variables (Unstandardized Coefficients).
Note. TConn = Test of connectivity; PlanQ = planning quality; TGword = text generation at the word level (no. of words); TGsent = text generation at the sentence level (no. of sentences), StrEl = structural elements; NoConn = number of text-based connectors; TQ = text quality; SES = individual ISEI score.
**p < .01. ***p < .001.
Effect Sizes and 95% CIs of the SRSD and SRSD-C Conditions for All Outcome Measures by Grade.
Note. TConn = Test of connectivity; PlanQ = planning quality; TGword = text generation at the word level (no. of words); TGsent = text generation at the sentence level (no. of sentences), StrEl = structural elements; NoConn = text-based connectors; TQ = text quality; SES = individual ISEI score. All effect sizes were calculated using the online calculator available at https://www.campbellcollaboration.org/escalc/html/EffectSizeCalculator-Home.php (based on Lipsey & Wilson, 2001).
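For readers who want to reproduce this kind of estimate offline, the sketch below computes a standardized mean difference (Cohen's d) with an approximate 95% confidence interval from group means, standard deviations, and sample sizes, using the standard large-sample formulas. Whether the authors applied a small-sample correction is not stated, so none is applied here; the function name and input values are hypothetical.

```python
from math import sqrt


def cohens_d_with_ci(m1, sd1, n1, m2, sd2, n2, z=1.96):
    """Cohen's d for two independent groups, with an approximate 95% CI."""
    pooled_sd = sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    d = (m1 - m2) / pooled_sd
    # Large-sample standard error of d.
    se = sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    return d, d - z * se, d + z * se


# Hypothetical gain-score summaries for a treatment and a control group.
d, lower, upper = cohens_d_with_ci(10.0, 2.0, 50, 8.0, 2.0, 50)
print(round(d, 2), round(lower, 2), round(upper, 2))
```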
Effect Sizes and 95% CIs of the SRSD Condition Against the SRSD-C Condition for All Outcome Measures by Grade.
Note. TConn = Test of connectivity; PlanQ = planning quality; TGword = text generation at the word level (no. of words); TGsent = text generation at the sentence level (no. of sentences), StrEl = structural elements; NoConn = text-based connectors; TQ = text quality; SES = individual ISEI score. Positive values should be interpreted as SRSD > SRSD-C.
Connectivity Test
Students’ gains at posttest in their knowledge of discourse connectors were affected significantly by their pretest scores, so that the higher their pretest score on the connectivity test, the smaller their gains at posttest (Table 4). At Level 2, students’ average difference between posttest and pretest scores was not significantly impacted by the condition to which they had been assigned: children in both the SRSD-C and SRSD conditions performed similarly to each other and to children in the control group. Gains in the connectivity test were affected, however, by a significant effect of Grade, such that children in Grade 2 made larger average improvements at posttest than Grade 4 students. The aggregated SES of the classroom also had a significant impact on the posttest-pretest difference in connectivity scores, where classrooms with an overall higher SES made larger gains than classrooms with generally lower SES. No significant interactions were observed.
Planning Quality
The difference between posttest and pretest scores for children’s planning of opinion essays was affected, at Level 1, by children’s gender and by their pretest scores. As in other outcome variables where gender was a relevant factor, girls outperformed boys. The significant effect of pretest scores indicated a regression to the mean effect, whereby children who showed more advanced planning at pretest made smaller gains at posttest, on average. At Level 2, children in both treatment conditions made significantly larger gains in planning skills than children in the control group across grades. In addition, children in the SRSD-C condition outscored children in the SRSD condition. Effect sizes were very large for both treatment groups. No other significant effects or interactions were attested.
Text Generation (Word Level)
At Level 1, the difference in text generation at the word level between posttest and pretest was explained by pretest scores: the more words children wrote at pretest, the less they gained at posttest. There was also a significant gender effect, where girls outscored boys. Neither children’s SES nor exposure to Catalan outside of school had a significant impact on their individual gains at posttest (ps > .05). At Level 2, children in the control group were significantly outscored by children in both the SRSD-C condition and in the SRSD condition, while there were no significant differences between the SRSD-C and SRSD groups. Effect sizes were large and similar across treatment groups and grades. There was also a significant impact of Grade, by which children in fourth grade showed larger gains than children in second grade. Finally, a significant effect of ISEI-C meant that classrooms with a higher aggregated ISEI score were associated with slightly larger gains at posttest than classrooms with lower ISEI scores. The Condition × Grade interactions were not significant.
Text Generation (Sentence Level)
At Level 1, children’s gains in the average number of clauses included in their essays depended significantly on their gender (girls > boys) and on their pretest scores; children who had written more clauses in their pretest texts tended to increase the number of clauses in their posttest texts to a lesser extent than children who had written shorter texts. At Level 2, the average difference in the number of clauses in posttest and pretest texts was significantly larger in both treatment groups (i.e., SRSD and SRSD-C conditions) than in the control group. These effects were medium-large across treatment conditions and grades. In addition, children in the SRSD-C condition made a slightly, but significantly, larger gain than children in the SRSD condition. No significant interactions were observed.
Structural Elements
Students’ gains at posttest in the number of structural elements included in their essays were impacted significantly by their gender (girls > boys), their pretest scores (the larger the pretest score, the smaller the gain), and by their overall exposure to Catalan outside school, such that children who had more exposure experienced significantly larger gains at posttest. At Level 2, children in both treatment groups included more structural elements in their posttest texts relative to their pretest texts than children in the control group. These effects were in the large range for both treatment groups and across grades. Also, children in fourth grade outscored children in second grade. These effects were moderated by a significant Grade × Condition interaction. A follow-up of said interaction revealed that the Grade effect was significant across all groups, b = −0.76 (SE = 0.31), p = .014, for the control group; b = −2.35 (SE = 0.25), p < .001, for the SRSD-C condition; and b = −1.20 (SE = 0.31), p < .001, for the SRSD condition. However, the Condition effect was not the same across all grades. In Grade 2, the SRSD-C condition outscored the control condition, b = −1.66 (SE = 0.27), p < .001; and so did the SRSD condition, b = −2.27 (SE = 0.31), p < .001; while the SRSD condition outscored the SRSD-C condition, b = −0.60 (SE = 0.29), p = .038. Specifically, the SRSD condition in Grade 2 had an added effect size of 0.47 over the effect of the SRSD-C condition. In Grade 4, the SRSD-C condition outscored the control condition, b = −3.25 (SE = 0.28), p < .001; the SRSD condition outscored the control condition, b = −2.71 (SE = 0.31), p < .001; but it was the SRSD-C condition that made significantly larger gains than the SRSD condition, b = 0.54 (SE = 0.27), p = .046. This means that the Condition effect had the opposite sign in fourth grade compared with second grade, although the size of this effect in fourth grade was much smaller (Table 7).
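The follow-up coefficients reported above are linked by simple arithmetic if, as we assume here, they all derive from one dummy-coded linear model: within a grade, the SRSD versus SRSD-C contrast should equal the difference between the two treatment-versus-control contrasts. The short check below uses the coefficients reported in this section; the function name is hypothetical, and small mismatches are expected from rounding.

```python
def srsd_vs_srsdc(b_srsd_vs_control, b_srsdc_vs_control):
    """Implied SRSD vs. SRSD-C contrast from two control-referenced contrasts."""
    return b_srsd_vs_control - b_srsdc_vs_control


# Structural elements, coefficients as reported in the text.
grade2 = srsd_vs_srsdc(-2.27, -1.66)  # reported contrast: -0.60
grade4 = srsd_vs_srsdc(-2.71, -3.25)  # reported contrast: 0.54
print(round(grade2, 2), round(grade4, 2))
```

The implied values (about −0.61 and 0.54) agree with the reported contrasts to within rounding, which is consistent with the opposite signs of the Condition effect in the two grades.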
Number of Connectors
The number of connectors that children included in the posttest texts in relation to the number they included in the pretest text was affected, at Level 1, by children’s gender (girls > boys), their pretest scores (larger pretest scores were associated with smaller gains at posttest), and their overall exposure to Catalan outside school (the more exposure, the larger the gains at posttest). Children in both treatment groups made significantly larger gains in the average number of connectors in their essays relative to their peers in the control group, but there were no significant differences as a function of treatment group. Effect sizes across treatment groups and grades were in the large range (Table 6). Gains at posttest in the number of connectors were also significantly affected by grade. However, these effects were moderated by a significant Grade × Condition interaction. The interaction revealed that the Grade effect was significant for the treatment groups, b = −2.67 (SE = 0.30), p < .001, for the SRSD-C condition, and b = −0.76 (SE = 0.37), p = .041, for the SRSD condition; but not for the control group, b = −0.23 (SE = 0.37), p = .541. As for the Condition effect, in Grade 2, children in the SRSD-C condition outscored children in the control condition, b = −1.85 (SE = 0.33), p < .001, children in the SRSD condition also outscored children in the control condition, b = −3.47 (SE = 0.38), p < .001, while children in the SRSD condition outscored children in the SRSD-C condition, b = −1.62 (SE = 0.35), p < .001. Table 7 shows that, for Grade 2 children, the SRSD condition resulted in an additional d = 0.99 over the SRSD-C condition. In Grade 4, children in the control condition also obtained lower gains than both the SRSD-C, b = −4.29 (SE = 0.34), p < .001, and the SRSD conditions, b = −4.00 (SE = 0.37), p < .001, but children in either treatment group obtained similar gains, b = −0.29 (SE = 0.33), p = .379. 
For Grade 4, our analysis of effect sizes in the SRSD-C versus the SRSD conditions provided further support that neither treatment conferred an added benefit over the other (Table 7).
Text Quality
At Level 1, children’s gains in text quality at posttest were significantly affected by their gender (girls > boys), exposure to Catalan (more exposure was associated with larger gains), and by their pretest scores (larger pretest scores were associated with smaller gains). At Level 2, children in either the SRSD or the SRSD-C conditions made significantly larger gains than children in the BAU group. Furthermore, there were no significant differences between treatment conditions, nor was there a significant main effect of Grade. Effect sizes were all in the large range (Table 6). We observed, however, a significant Grade × Condition interaction. A follow-up of this interaction indicated that children in the control group showed similar gains in text quality across grades, b = −0.05 (SE = 0.21), p = .803; and so did children in the SRSD-C condition, b = −0.32 (SE = 0.17), p = .059. In contrast, second-graders made larger gains than fourth-graders in the SRSD condition, b = 0.83 (SE = 0.21), p < .001. The Condition effect also had a different impact across grade levels. In Grade 2, control children showed significantly smaller gains in text quality at posttest than both the SRSD-C, b = −1.33 (SE = 0.19), p < .001, and the SRSD condition, b = −2.28 (SE = 0.21), p < .001. In addition, second-grade children in the SRSD condition significantly outscored children in the SRSD-C condition, b = −0.95 (SE = 0.20), p < .001. Indeed, receiving the SRSD condition in second grade was associated with a large effect over and above the effect of the SRSD-C condition (Table 7). In fourth grade, children in the control group were outscored by children in the SRSD-C group, b = −1.60 (SE = 0.20), p < .001, and by children in the SRSD group, b = −1.40 (SE = 0.21), p < .001. Nevertheless, fourth-grade children in both treatment groups showed similar gains at posttest in text quality, b = 0.21 (SE = 0.19), p = .269. 
Accordingly, we observed no consistent advantage of one type of treatment over the other across grades (Table 7).
Discussion
This study set out to determine the impact of including explicit linguistic instruction in an SRSD writing intervention in early and middle elementary school. The effectiveness of SRSD interventions, which typically include overt linguistic instruction, to improve writing outcomes is well documented (e.g., Graham & Harris, 2017; Graham & Perin, 2007; Graham et al., 2012). We aimed to isolate the specific benefit of providing explicit instruction on discourse connectors for opinion essay writing, by comparing a typical SRSD condition to both a BAU control group and an SRSD condition without such explicit instruction.
Our first research question asked whether the SRSD condition was superior to a BAU control group for improving writing outcomes. The context for this research question was the uncertainty in the field about the appropriateness of providing explicit language instruction at all, given the evidence reported in a number of meta-analytic reviews that such instruction may be ineffective or even detrimental to writing outcomes (Andrews et al., 2006; Graham et al., 2012). Our findings suggest that this was not the case. The writing outcomes of children in the SRSD group were significantly better than those of the control group across grades. Effect sizes were, in general, similar to those obtained in the SRSD-C condition. At the very least, this finding means that providing explicit instruction on linguistic aspects of text construction is not detrimental and does not diminish the overall effectiveness of SRSD interventions that do not include explicit reference to linguistic facets of text construction (e.g., Limpo & Alves, 2013; Salas et al., 2021).
How, then, do we explain differences with previous meta-analyses? First, the meta-analyses in question included only a handful of studies, of which only one included more than 60 participants (Graham et al., 2012). Another shortcoming of previous meta-analyses was the poor quality of most research on the topic (Andrews et al., 2006). These facts weaken the conclusions of these meta-analyses to an extent. Second, although the connectivity instruction delivered in the current study included reference to grammatical aspects (e.g., how each connector combined only with a certain type of structure and not others), our focus was on introducing the function of connectors for achieving discourse purposes (e.g., signaling to the reader that what follows “In sum,” is the end and recapitulation of the essay), rather than, say, the grammatical accuracy of the constructions involved. Finally, the studies included in the meta-analyses targeted various grammatical constructions and different grades and discourse genres, making comparability with the current study problematic. We would argue that the present findings, far from closing the debate as to whether explicit language instruction has a place or not in teaching writing, call for more research to understand the conditions for its potential advantages.
Our second research question involved ascertaining whether there was a specific benefit to applying the SRSD intervention, against the known benefit of the SRSD-C intervention (as per Salas et al., 2021). Our findings suggest that benefits were quite clear for our Grade 2 participants, but not for our Grade 4 participants. Second-grade children in the SRSD group outscored both their BAU and SRSD-C peers in the gains they experienced on the number of connectors and genre-appropriate elements included in their essays, as well as in overall text quality. Effect sizes for the SRSD condition sometimes doubled those of the SRSD-C group. However, fourth-graders saw no added benefit to receiving the SRSD rather than the SRSD-C condition. We speculate that the differential effect of the SRSD condition as a function of grade may be related to the linguistic items that were targeted by that condition. We chose to teach very basic discourse markers and connectors, whose form was likely already known to most students, but whose function in opinion-essay writing may not be as familiar to them. Also, we selected constructions whose discourse function, even if unknown, was transparent and easy to understand by even our youngest participants. Therefore, the mere exposure to these connectivity devices provided already by the SRSD-C condition may have been enough for the fourth-graders to understand their role in opinion-essay writing and incorporate them into their own essays. In contrast, second-graders might not have noticed these elements without teachers explicitly drawing their attention to them, and providing students with support as to how to use them during text composition. Put differently, older children may be more proficient speakers of the language of instruction. The strategies taught to use connectivity devices during opinion-essay composition required them, at most, to assign new functions to lexical tokens with which they were likely already familiar. 
In contrast, for the younger children, the strategies might have involved new expressions and new functions at the same time. Therefore, what was required of them was not merely a redescription of the context of use of a set of expressions (Karmiloff-Smith, 1994), but the addition of brand new constructions. More research, particularly component analysis designs, is needed to elucidate which linguistic aspects deserve explicit instruction within a comprehensive writing program, and at which developmental stages or in which populations the benefit is highest.
Theoretical and Educational Implications
The Writer(s)-Within-Community (WWC) model (Graham, 2018) emphasizes the importance of the context in which texts are produced, above and beyond the subject-level factors at play. It highlights the need for researchers and educators to consider both the cognitive and the social aspects that underlie and shape text construction. While the model does make explicit reference to the importance of linguistic knowledge and representations (Graham, 2018, p. 265), when it comes to deciding whether or not to include linguistic content as an integral part of an SRSD intervention, the research is merely in its infancy (e.g., Mason et al., 2013). However, we would like to suggest that the findings reported in this study are well aligned with the WWC model, because our approach of embedding language instruction into an SRSD writing intervention served to mobilize linguistic competence, while simultaneously and overtly drawing the links with reader-focused, socially rooted practices, expectations, and purposes. In other words, engaging students in functionally motivated language instruction may tackle specific linguistic aspects from different perspectives at once, thus reinforcing students’ representations of the target constructions and their role in genre-specific written composition.
From an educational viewpoint, the present study indicated that teaching linguistic aspects of text construction may be beneficial in some cases. Although considerable research is needed prior to drawing any conclusions, the evidence seems to suggest that functional approaches to linguistic instruction should be favored, rather than more “traditional” ones. Moreover, linguistic instruction in the context of SRSD writing interventions seems to be effective when clear connections are drawn between the target linguistic constructions and the writing strategies developed during the intervention. Finally, educators should be aware that the choice of linguistic constructions should be tailored to students’ needs and developmental stages.
Limitations
We are aware of a number of limitations. First, our focus on discourse connectors means that findings from this study are not necessarily generalizable to other linguistic aspects of text construction. Rather, we hope that this research inspires future studies that aim to determine which linguistic representations should be mobilized explicitly or implicitly to support competent written expression. Second, the tests used to assess knowledge of connectivity at pretest and posttest were instruments with low reliability and, even after eliminating problematic items, reliability was “acceptable” at best. In addition, we observed no significant effects of Condition in the gains on such assessment. Altogether, the causal interpretation of improvements in the SRSD condition as a function of the explicit instruction on connectivity devices is compromised. Unfortunately, there are no alternative assessments of connectivity devices use or knowledge, or of any other language skill for the language of the study (Catalan), so it is hard to say how future research could improve on the present one. We would like to suggest that future studies conduct extensive piloting of the assessment of this linguistic feature, including both productive and receptive measurements. These instruments should be critical in ascertaining the mechanisms (e.g., metalinguistic awareness) through which linguistic instruction aids written composition. Finally, we compared two versions of a very effective instructional program (SRSD) to a BAU control condition. We were aware that, especially in second grade, children in the control classrooms would have received little to no instruction on opinion essay writing at all. Future studies may consider an alternative or additional comparison group, which receives some other form of opinion-essay writing instruction to determine the impact of SRSD writing instruction in this genre.
Concluding Remarks
This article challenges contentions that explicit linguistic instruction may be detrimental to writing development. Using an experimental design that isolated the specific effect of such instruction, we have offered evidence that at least some types of linguistic training, embedded within a comprehensive writing instruction program, may be beneficial to some students. Our findings are in line with previous research suggesting that functional approaches to linguistic training may improve writing outcomes for some children. More research is needed to determine the precise contexts in which meta-linguistic components enhance the effectiveness of process-oriented writing programs.
Footnotes
Acknowledgements
The authors are indebted to the schools that participated in the projects. We are also thankful to Mar Formiga for her help in evaluating some of the texts included in this study and to Dr. Gabriel Liberman for his support in the statistical analyses.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by grants 2015ACUP 00175 and PID2019-108791GA-I00, awarded to N. Salas.
