Abstract
Exploratory talk is increasingly recognized in formal education for its role in enhancing students’ critical thinking and literacy skills, which are crucial for quality education both within and beyond school contexts. However, research shows that students often lack opportunities for inquiry-based learning and rarely receive explicit guidance on using language for reasoning, particularly in second language (L2) learning environments. Understanding how students engage in this complex function and effectively promoting it in L2 subject contexts remains a challenge. This study introduces an operational framework for the function of ‘explore’, based on L2 learning and socio-cultural theories and Dalton-Puffer’s construct of cognitive discourse functions (CDFs). It provides both quantitative and qualitative insights into how secondary-level content and language integrated learning (CLIL) students (N = 113) from three different types of schools in Spain performed the ‘explore’ function orally, and it examines the role of epistemic modality in this meaning-making process by analysing the following features: (1) modal verbs, (2) modal adverbs and adjectives, (3) epistemic lexical verbs (ELVs) and stance-taking forms, and (4) discourse markers and the conditional ‘if’. A learner corpus was created for this analysis using Sketch Engine. The findings suggest that the CDF of ‘explore’ involves a combination of epistemic modality markers that serve as reasoning and exploratory discourse indicators. There is, however, a pressing need to raise teachers’ awareness of how language (through CDFs) supports students’ exploratory and deeper learning in L2 content-learning contexts. To this end, the discussion presents pedagogical implications for future research and practice in fostering exploratory reasoning, and where possible, embedding these skills in exploratory talk within CLIL classrooms.
I Introduction
Exploratory talk – a cognitively and linguistically challenging operation that includes activities such as predicting, hypothesizing, speculating, and establishing cause–effect relationships – has an important role in formal education. In many countries, by the time students reach secondary school, they are expected to be able to effectively process complex subject matters and engage in productive discussions and, where content and language integrated learning (CLIL) is implemented, to do this in more than one language (Dalton-Puffer, 2016; Llinares & Morton, 2024).
In science, for instance, students are often encouraged to generate hypotheses, predict outcomes and provide potential explanations for various phenomena or objects based on their knowledge and experience (Lemke, 1990; Mortimer & Scott, 2003). Similarly, in social sciences like history, students are prompted to offer alternative perspectives, speculate about different outcomes and interpretations of past and future events, and explore their own and others’ viewpoints (Lorenzo, 2017), which requires hypothetical and counterfactual thinking (Bauer-Marschallinger, 2022). Learning to predict the likelihood of events and engaging with doubts and uncertainties have become a central and defining aspect of the scientific method and academic enquiry across various research fields, both within and beyond the school context (e.g. in economics, politics, physics and health sciences) (Boyd & Kong, 2015). Moreover, learning to engage in exploration is crucial as it enhances higher-order thinking and advanced literacy skills, which can lead to deeper processing (Mercer & Littleton, 2007; Mercer et al., 1999) and which aligns with the 2030 Sustainable Development Goal (SDG) of Quality Education (Llinares & Morton, 2024).
However, to master this function in CLIL contexts, students need to develop the necessary cognitive and linguistic skills in both their first language (L1) and second language (L2) to participate effectively in a knowledge-based, plurilingual and cultural society (Moate, 2011). As Boyd (2023, p. 1) states, ‘schooling involves more than teacher delivery of materials; rather, it is about building a classroom community as we think, talk, relate, explore and learn together’. One key element in such exploratory talk is the ability to speculate and hypothesize, an aspect that is captured in both linguistic and cognitive terms by the cognitive discourse function ‘explore’.
This article aims to shed light on the function of ‘explore’ in secondary-school CLIL programmes. Specifically, it operationalizes one important aspect of exploratory talk and reasoning, namely the students’ use of epistemic modality features. Previous research has primarily focused on the use of modals in English as a foreign language (EFL), English medium instruction (EMI) and general academic English writing contexts (Carrió-Pastor, 2014; Dafouz et al., 2007; Pemberton, 2020), noting that this aspect is particularly challenging for L2 learners (Hohaus & Schulze, 2020). However, little is known about how CLIL students perform the function of ‘explore’ or the role of epistemic modality in this meaning-making process (see Maine & Čermáková, 2023).
To achieve this aim, the authors first present a theoretical overview of exploratory talk, drawing on L2 learning and socio-cultural theories (Mercer & Littleton, 2007) and on Dalton-Puffer’s (2013) construct of cognitive discourse functions (CDFs). By integrating contributions from both fields, they develop an operational framework for the function of ‘explore’ in CLIL, focusing on key linguistic features related to possibility and reasoning. Second, they provide qualitative and quantitative insights from a corpus-based study, where secondary-level CLIL students (N = 113) were asked to explore cross-curricular topics. Lastly, the authors present a series of pedagogical guidelines for CLIL teachers aimed at fostering exploratory reasoning and, where possible, embedding these skills into exploratory talk.
1 Exploratory talk and the construct of cognitive discourse functions (CDFs)
In what follows, we first explain the key concepts of exploratory talk and cognitive discourse functions (CDFs), and then suggest how the CDF ‘explore’ can be operationalized in CLIL to contribute to the development of exploratory talk.
a Exploratory talk
There has been growing interest in both education and applied linguistics regarding the role of classroom talk in shaping students’ understanding of subjects and knowledge (Dalton-Puffer, 2013; Moate, 2011). L2 learning theories (Swain, 2000) and socio-cultural discourse analytical theories (Mercer & Hodgkinson, 2008; Swain & Lapkin, 2013; Vygotsky, 1987) consider classroom talk as a valuable resource and catalyst for learning, with three integrated main functions: (1) cognitive (as a tool to process knowledge, that is, for reasoning and meaning-making), (2) socio-cultural (for sharing and negotiating knowledge collaboratively; referred to as ‘inter-thinking’ (Mercer, 2000)), and (3) pedagogical (to assist students in developing proficiency in specific language uses, such as everyday, academic, and subject-specific language in L1 and L2) (Mercer et al., 1999; Mercer, 2000; Moate, 2011). Talk is referred to here as a means to enhance students’ content understanding, reasoning skills and communicative competences (Mercer & Littleton, 2007).
Certain forms of classroom talk are clearly more beneficial for developing these competences. Among Mercer’s (2000) triad – cumulative, disputational and exploratory talk – the last is considered the most valuable educationally. Cumulative talk involves accepting and agreeing with others’ statements without critically examining them, and disputational (or constraining) talk is marked by disagreements and a refusal to accept others’ opinions, while strongly reaffirming one’s own. Neither of these types of talk actively fosters joint, constructive reasoning. However, exploratory talk views learning as an interactive, cognitive and communicative process (Mercer & Hodgkinson, 2008). In this approach, students co-construct meaning, enhancing intellectual development by deepening comprehension, critical thinking, creativity and language proficiency. Students are, for instance, encouraged to explore new ideas, uncover alternatives and make connections to their experiences (Mercer & Littleton, 2007; Mercer et al., 1999).
Experimental interventions confirm the productive and positive potential of exploratory talk in supporting the development of meaning-making and reasoning articulation in both first-language (L1) and second-language (L2) primary-, middle- and high-school classrooms, leading to higher academic achievements (Boyd et al., 2019; Maine & Čermáková, 2023; Mercer et al., 1999). These studies also report improvements in science, mathematics and language education, indicating gains in students’ reasoning abilities, including knowledge retention and transfer, higher problem-solving abilities, better conceptual understanding and longer utterances (see Boyd & Kong, 2015; Liang & Fung, 2021; Mortimer & Scott, 2003; Rojas-Drummond & Zapata, 2004).
However, students often lack opportunities for scaffolded exploratory and inquiry-based learning, including explicit guidance on how to use language for reasoning during classroom interactions, whether in L1 or L2 contexts (Boyd & Kong, 2015; Mercer et al., 1999). This challenge is particularly critical in L2 learning environments, such as CLIL, where both content knowledge and communicative skills are cultivated through a second or additional language use (Boyd et al., 2019; Maine & Čermáková, 2023).
b Cognitive discourse functions (CDFs)
Dalton-Puffer’s (2013) construct of cognitive discourse functions (CDFs) represents a ‘zone of convergence’ (p. 216) between the notion of thinking skills studied in cognitive theories of education (Anderson & Krathwohl, 2001; Biggs & Tang, 2011; Bloom, 1956) and academic language or discourse functions studied in functional linguistics, particularly in the English for Specific Purposes (ESP) (Trimble, 1985) and Systemic Functional Linguistics (SFL) traditions (Halliday, 1994). Her contribution lies in introducing order among the variety of existing functions (Morton, 2020) and providing both subject and language experts with an operational framework for analysing and better understanding the roles of language and cognition in students’ L2 knowledge construction. This framework includes seven cognitive macro-functions: categorize, define, describe, evaluate, explain, explore and report. Each of them is linked to specific linguistic structures, which can make learners’ reasoning and meaning-making explicit and help teachers identify and support potential learning difficulties.
Within the CDF framework, ‘explore’ is situated alongside a series of analogous verbs (see Table 1). These include near-synonymous forms (like ‘assume’, ‘suppose’, ‘presume’, and ‘conjecture’), as well as other semantically more distant operations (such as ‘guess’, ‘speculate’, ‘hypothesize’ or ‘predict’) (Bauer-Marschallinger, 2016). These verbs share the same communicative intention, which consists of expressing something potential (i.e. non-factual). Dalton-Puffer (2016) describes this hypothetical potentiality as ‘talk about that which is not in the here and now, that which might have been in the past, and that which could be in the future’ (p. 14). In other words, ‘exploring’ is not about discussing current facts and certainties but rather speculating on possibilities, predictions or assumptions that could have occurred or may happen in the future. Such statements can stem from various sources, including intuition, past experiences, facts or imagination.
Table 1. List of cognitive discourse function (CDF) types, members and communicative intentions.
Source. Dalton-Puffer & Bauer-Marschallinger, 2019, p. 35.
Previous studies examining the use of ‘explore’ in CLIL education are mainly observational classroom studies, which focus on the occurrence of exploration in natural CLIL classroom interactions, examining participants’ lexico-grammatical expressions and the presence of metatalk. In primary education, these include the studies by Llinares and Nikula (2024) and Llinares and Morton (2024), while in secondary education there is Dalton-Puffer’s (2007) study, a series of MA theses (Dalton-Puffer et al., 2018), Lorenzo (2017), Bauer-Marschallinger (2022), Salvador-García and Chiva-Bartoll (2022) and Llinares and Nashaat-Sobhy (2023). In tertiary education, Martín del Pozo (2015) and Doiz and Lasagabaster (2021) have made contributions to understanding the use of ‘explore’ in EMI settings.
Similar to the research findings on exploratory talk, these suggest limited use of ‘explore’ during classroom interactions (Dalton-Puffer, 2007; Martín del Pozo, 2015), often alongside other functions (such as ‘describing’, ‘explaining’ and ‘classifying’), leading to complex CDF patterns (Breeze & Dafouz, 2017). Most utterances were prompted by teachers, but such strategies were only minimally integrated into their teaching practice. Students were rarely encouraged to delve deeper into subject matters through the CDF of ‘explore’ and received limited guidance on how to do so effectively within their disciplines. As a result, their responses were often brief, relying on basic forms (e.g. ‘maybe’ or the modal verb ‘would’) to convey hypotheticality (Bauer-Marschallinger, 2022; Dalton-Puffer et al., 2018). Challenges were noted in formulating grammatically correct hypothetical conditions (Llinares & Morton, 2024; Llinares & Nikula, 2024). Dalton-Puffer (2007) observed students’ avoidance of using modal auxiliaries and highlighted their difficulty in applying knowledge from language (EFL) classes to content-based subjects. Some attributed these findings to the linguistic complexity of formally expressing exploration and students’ insufficient L2 competence and cognitive development (Dalton-Puffer & Bauer-Marschallinger, 2019; Llinares & Nikula, 2024). Others argued that CLIL classes tend to prioritize discussions on factual knowledge over speculative and hypothetical inquiries, which are relegated to a secondary role (Dalton-Puffer, 2007; Martín del Pozo, 2015).
2 Operationalizing the CDF ‘explore’ for CLIL
In this study, we aimed to operationalize the CDF ‘explore’ for the CLIL context by drawing on previous research and refining its key linguistic features. Clarifying the discursive tools necessary for developing exploration skills can help teachers and coursebook writers better support their students and foster more effective content learning. The theoretical framework developed for analysing students’ performance of the CDF ‘explore’ in CLIL contexts can be seen as an amalgam of the two previously introduced frameworks – Mercer’s (2000) concept of exploratory talk and Dalton-Puffer’s (2013) construct of CDFs with its ‘explore’ function – as it integrates elements from both.
The function of ‘explore’ involves both cognitive and linguistic skills, allowing individuals to delve deeply into a topic, think critically, and discuss various aspects of a subject. This process can prompt individuals or groups to thoroughly investigate a subject and articulate their findings, either orally or in written form. In contrast, exploratory talk provides a communicative, collaborative framework – a space where learners can develop this cognitive function in a shared, dialogic manner. Both concepts emphasize open-ended, critical engagement and the verbalization of ideas. They are related in that the ‘explore’ function serves as an important step toward exploratory talk, which, in turn, incorporates additional elements (such as interaction, collaboration, and negotiation).
Moreover, in the field of education, various taxonomies (Anderson & Krathwohl, 2001; Biggs & Tang, 2011; Bloom, 1956) incorporate ‘explore’ as a key higher-order thinking skill (HOTS). Bloom (1956) categorizes it as ‘hypothesizing’, ‘predicting’, and ‘speculating’ within the evaluation and application of knowledge, whereas Anderson and Krathwohl (2001) position it within the domain of creating and understanding.
In terms of its linguistic forms, the operation of ‘explore’ is particularly complex for L2 learners as it requires elaborate lexico-grammatical structures operating at various levels (lexical, phrasal and morphological) (Llinares & Morton, 2024; Mifka-Profozic, 2017; Schulze & Hohaus, 2020). Dalton-Puffer (2007, 2016) noted that, unlike other functions (such as ‘explain’, ‘define’ or ‘compare’), ‘explore’ is less clearly delineated. In general, it is studied within the field of epistemic modality, which is concerned with the speaker’s perspective on the likelihood ‘that the situation or event expressed by a clause has taken, is taking or will take place, or the obligation involved in it’ (Halliday, 1994, p. 75), that is, with knowledge and beliefs as opposed to facts (Biber et al., 2021; Hyland, 1998; Palmer, 2005).
Previous classroom studies (Boyd & Kong, 2015; Boyd et al., 2019; Mercer, 2000; Mercer et al., 1999; Soter et al., 2008) have shown that exploratory talk tends to occur and can be facilitated through a specific set of speculation and reasoning (S&R) words. Boyd and Kong (2015) categorized them into the following three groups: (1) language of possibility, including modals (e.g. ‘may/might’, ‘could’, ‘would’ and ‘think’), (2) reasoning links in the form of conjunctions (e.g. ‘so’, ‘because’, ‘but’ and ‘if’), which introduce or connect reasoning, and (3) questioning forms (e.g. ‘how’ and ‘why’), which prompt reasoning and are also referred to as ‘pressed for reasoning’ forms. While group (2) is about linking reasoning through logical connectors, group (3) is about encouraging further exploration through questioning.
In this study, we employed the concept of S&R words to analyse students’ oral performance of ‘explore’, except for ‘pressed for reasoning’ forms, which were not used by the students. Expanding on Boyd and Kong (2015), we argue that oral exploration tends to be realized through a blend of epistemic and deontic modality (modal verbs, adjectives and adverbs, epistemic lexical verbs), conditional ‘if’, and discourse markers. Therefore, we analysed students’ answers by focusing on the following four aspects:
a) modal verbs
b) modal adverbs
c) epistemic lexical verbs
d) discourse markers.
a Modal verbs
For modal verbs (e.g. could, might, should, would, or must), auxiliary modal verbs (i.e. need to, ought to or used to), and semi-modal verbs (such as have to, had better, have got to or be going to), a widely accepted categorization is Palmer’s (2005) and Biber et al.’s (2021) three-fold distinction (see Table 2), which assigns each modal verb one or more functions. Deontic modality refers to ‘actions or events that humans directly control’ (Biber et al., 2021, p. 483), activating concepts such as permission, obligation and volition. Dynamic modality concerns speakers’ ability and willingness, while epistemic modality involves ‘human judgment of what is or is not likely to happen’ (Quirk et al., 1985, p. 137). It refers to the speaker’s perception of the likelihood or truth value of a proposition in terms of possibility, necessity or prediction.
Table 2. Semantic classification of modal and semi-modal verbs.
b Modal adverbs
Examples of this category are modal adverbs (e.g. maybe, perhaps) and adjectives (e.g. possible, probable), which can occur in parallel forms (such as certainly and certain).
c Epistemic lexical verbs
Epistemic lexical verbs (ELVs) indicate the writer’s position towards the claim (see Hyland, 1998). Among ELVs, Palmer (2005) distinguishes between judgment and evidential verbs (see Table 3). Judgment verbs refer to how speakers assess the truth-value or factual status of a proposition through speculation, deduction and/or assumption (using forms such as ‘I suggest’ or ‘I think’ to indicate their modes of knowing). Evidential verbs, on the other hand, pertain to the type of evidence (reported, sensorial or narratorial) that the speakers provide to support the expressed claim. In a similar vein, Dalton-Puffer (2007) presented a list of typical verbs and phrases that often trigger exploratory episodes, allowing for reasoning.
Table 3. Epistemic lexical verbs (ELVs): Judgment and evidential verbs.
d Discourse markers
Discourse markers (e.g. so, because, or but) are used to make argument structures more explicit and easier to follow, as they signal the relationships between different parts of a conversation. For instance, ‘because’ can indicate a cause-and-effect relationship, while ‘on the other hand’ can highlight an alternative perspective. Additionally, we will include the conditional ‘if’ in this group.
II Corpus-based approach and learner corpus
1 Objectives and research questions
The study investigates two main aspects. First, it examines how Spanish secondary-level students (N = 113) perform the CDF ‘explore’ in a CLIL social science setting. This operation is often assumed but not explicitly addressed in L2/CLIL learning environments, despite its importance for promoting integrated and deeper learning (Coyle & Meyer, 2021). Second, the study analyses the role of epistemic modality in students’ meaning-making process. The function ‘explore’ presents a compelling ‘testing ground’ for this investigation, as it requires complex language use and higher-order thinking skills, as highlighted by Dalton-Puffer (2007).
As such, the study poses the following two research questions:
• Research question 1: How do secondary-level students from three different types of schools in Spain perform the CDF ‘explore’ in CLIL social science?
• Research question 2: What is the role of the following features in the students’ performance of the CDF ‘explore’: modal verbs; modal adverbs and adjectives; epistemic lexical verbs (ELVs) and stance-taking forms; and discourse markers (DM) and the conditional ‘if’?
2 Context and participants
The study was part of a broader national assessment project, Evaluación de la Enseñanza Bilingüe en España (ENEBE; Evaluation of Bilingual Education in Spain) (Vinuesa et al., 2024), which used Dalton-Puffer’s CDF construct to assess tenth-grade language skills across various CLIL subjects (geography, technology and history). This article presents part of these data, specifically, the findings related to the function of ‘explore’ in the province of Navarra, northern Spain.
The study took place in three different types of middle-class urban, secondary-level schools in Spain: a public school following a regular programme, a public school with a British Council programme, and a charter school, which receives public funding but operates with a privately-run programme. These schools were selected based on their comparable CLIL traditions (as detailed in Table 4) and the likelihood that students had CLIL in primary education. A total of 113 tenth-grade students (aged 15–16 years) participated, alongside five researchers. The students studied English as a foreign language, demonstrating an overall proficiency level of B2. Spanish was their primary language (L1), with some students being bilingual in Basque and Spanish.
Table 4. Study sample.
3 Method and data analysis
To prompt students’ oral performance of ‘explore’, individual 8-minute interviews were conducted and audio-recorded with each student, in accordance with the project’s ethical guidelines. This student–interviewer format was chosen to ensure that all students – both the more active and the less engaged ones – had the opportunity to participate, allowing for the assessment of each student’s exploratory skills at an individual level.
The interviewers posed questions about two transdisciplinary topics (climate change and technology), which are part of the secondary school curriculum and familiar to the students. The questions followed a similar pattern, allowing the interviewer to select different questions or make minor sub-topic shifts to encourage participation, especially if students felt uncertain. However, this approach did not lead the interviewer away from the main topics of climate change and technology. As indicated in Table 5, the students were prompted to make predictions, suggest solutions and alternatives, and draw connections between global issues and their local and personal circumstances, thereby bridging scientific knowledge with everyday understanding (Boyd et al., 2019).
Table 5. Interview prompts.
This design was previously validated through a pilot study conducted in 2021. All students were randomly assigned a code to ensure anonymity, no sensitive information was collected, and the participating researchers signed a confidentiality agreement.
The students’ recorded responses were transcribed completely, ensuring no parts were excluded. To do so, an automated transcription service was used, followed by a manual review of the transcripts by the authors. The transcribed data were then organized into a comprehensive learner corpus which includes all the students’ responses. In addition, three sub-corpora were created – one for each type of school – using the corpus manager and text analysis software Sketch Engine. The entire dataset comprises a total of 29,061 words (see Table 6).
Table 6. Dataset.
The data were first analysed quantitatively using Sketch Engine, which allows for word and concordance searches. We examined the frequency with which the students used epistemic modality and reasoning markers and, in particular, the four categories previously outlined, which include:
• modals and semi-modals (e.g. ‘would’, ‘need to’)
• modal adverbs and adjectives (e.g. ‘maybe’, ‘probably’)
• epistemic lexical verbs (ELVs) and stance-taking forms (e.g. ‘I suggest’, ‘I think’)
• discourse markers (DM) and the conditional ‘if’.
To do so, we first filtered the forms used by the students, compiling them into tables (see Section III), eliminating forms that were not used, and then calculated their respective frequencies using the concordance option in Sketch Engine. This allowed us to see which forms the students tend to opt for when engaged in exploratory tasks. The corpora included all valid forms, such as contractions, negations (e.g. ‘shouldn’t’ or ‘I don’t think’) and potential variations (e.g. ‘in my experience’ for ‘in my opinion’).
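As a rough illustration of this counting step, the frequency search can be approximated outside Sketch Engine with regular expressions. The Python sketch below is not the procedure used in the study: the category inventories are abbreviated, and the pattern definitions, function names and sample utterance are all hypothetical. It counts whole-word matches only, including a few contractions and negations.

```python
import re
from collections import Counter

# Hypothetical, abbreviated inventories (regex fragments), one list per category.
# The study's full inventories are reported in its tables, not reproduced here.
CATEGORIES = {
    "modals and semi-modals": [
        r"can(?:'t|not)?", r"could(?:n't)?", r"will|won't", r"would(?:n't)?",
        r"should(?:n't)?", r"might", r"(?:have|has|had) to", r"needs? to",
    ],
    "modal adverbs and adjectives": [r"maybe", r"perhaps", r"probably", r"possible"],
    "ELVs and stance-taking forms": [
        r"i (?:don't )?think", r"i (?:don't )?believe", r"in my opinion",
    ],
    "discourse markers and 'if'": [
        r"because", r"so", r"but", r"also", r"for example", r"if",
    ],
}


def normalise(text):
    """Lower-case the text and unify curly apostrophes so contractions match."""
    return text.lower().replace("\u2019", "'")


def count_categories(text):
    """Return the number of whole-word hits per category."""
    text = normalise(text)
    counts = Counter()
    for category, patterns in CATEGORIES.items():
        for pattern in patterns:
            counts[category] += len(re.findall(r"\b(?:" + pattern + r")\b", text))
    return counts


def percent_of_tokens(hits, text):
    """Express a hit count as a percentage of all word tokens in the text."""
    tokens = re.findall(r"[a-z']+", normalise(text))
    return 100 * hits / len(tokens)


# Invented sample utterance, for illustration only.
SAMPLE = (
    "I think we can use solar energy because it is cleaner. "
    "Maybe we shouldn't use cars so much, but if we don't change, "
    "it will get worse. I don't think it would be easy."
)

counts = count_categories(SAMPLE)
for category, hits in counts.items():
    print(f"{category}: {hits} hits, {percent_of_tokens(hits, SAMPLE):.2f}% of tokens")
```

Form-based matching of this kind cannot separate, for instance, epistemic from dynamic uses of ‘can’, so the hits would still need to be checked manually.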
Afterwards, a qualitative analysis was conducted, where we examined the accuracy with which the students used each of the four categories and the difficulties they may encounter; that is, whether the students knew how to use these forms appropriately. This was accomplished by manually reviewing the different forms provided by the students, which were listed using the concordance option in Sketch Engine, filtering examples of correct usage and possible erroneous uses, and identifying patterns of use among the students.
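The manual review of concordance lines can be pictured with a minimal keyword-in-context (KWIC) listing. The following Python sketch is only a stand-in for the Sketch Engine concordance view, not a reproduction of it. The function name `kwic`, its `width` parameter and the sample sentence are hypothetical.

```python
import re


def kwic(text, pattern, width=30):
    """Return (left context, hit, right context) for each whole-word match."""
    rows = []
    for match in re.finditer(r"\b(?:" + pattern + r")\b", text, flags=re.IGNORECASE):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        rows.append((left, match.group(0), right))
    return rows


# Invented sample sentence, for illustration only.
SAMPLE = (
    "We can use renewable energy. Governments could ban plastic, "
    "and people can recycle more."
)

rows = kwic(SAMPLE, r"can|could")
for left, hit, right in rows:
    # Right-align the left context so the hits line up in one column.
    print(f"{left:>30} [{hit}] {right}")
```

Reading the hits in their immediate context is what allows a reviewer to judge, case by case, whether a form such as ‘can’ is used appropriately or would be better replaced by ‘could’ or ‘might’.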
III Results
This section presents the results in two parts: first, the quantitative data, followed by the qualitative findings, which offer representative examples of how students used the four analysed categories.
1 Quantitative analysis
Figure 1 and Table 7 show the distribution of the four analysed categories in the entire dataset. When asked to explore possibilities in a CLIL social science context, the students primarily used discourse markers (DM) and conditional ‘if’ clauses (50% of the identified forms), followed by modals and semi-modals (30%) and epistemic lexical verbs (ELVs) and stance-taking forms (17%). In contrast, modal adverbs and adjectives were barely used (3.6%).

Figure 1. Distribution of the four analysed categories.
Table 7. Frequency of the four analysed categories in the entire dataset.
Next, a more specific analysis of these four categories will be presented, highlighting the specific forms students use in each of the three types of schools.
a Frequency of modals and semi-modals
As depicted in Figure 2 and detailed in Table 8, the students from the three schools used a considerable number of modal verbs (1,045 occurrences, constituting 3.07% of the entire dataset). The most frequently used forms include ‘can’, ‘will’, ‘would’ and ‘could’. Following these were the semi-modals ‘be going to’ and ‘have to’, alongside a range of other forms (‘should’, ‘need to’, ‘might’, ‘used to’, ‘may’, and ‘must’), albeit with less frequency. Furthermore, upon examining possible combinations of modals and semi-modals within the corpus, instances of ‘will’/‘would’ + ‘have to’/‘need to’ were identified, comprising 0.04% of the entire dataset.

Figure 2. Frequency of modals and semi-modals in the entire dataset.
Table 8. Frequency of modal and semi-modal verbs.
Notes. Hits = number of hits; Percentage = Percentage of dataset.
A close examination of the semantic domains of the modals used shows that most forms were associated with meanings related to prediction and volition, as shown in Table 9. This tendency is likely attributable to the nature of the prompts provided. Additionally, the students used modals conveying meanings of ability and possibility (both dynamic and epistemic modality), with ‘can’ and ‘could’ being the most frequently used forms. In third place were modals that express obligation or necessity. Biber et al. (2021) attribute this lower frequency to a general inclination to avoid potentially face-threatening expressions.
Table 9. Semantic meaning of modals.
b Frequency of modal adverbs and adjectives
As shown in Figure 3 and Table 10, the students demonstrated limited variety in their use of different forms. The most prevalent was the adverb ‘maybe’ (106 occurrences, 0.31% of the dataset). There were also occasional occurrences of ‘possible’, ‘eventually’, and ‘likely’, though these were negligible.

Figure 3. Frequency of modal adjectives and adverbs in the entire dataset.
Table 10. Frequency of modal adjectives and adverbs.
Notes. Hits = number of hits; Percentage = Percentage of dataset.
c Frequency of epistemic lexical verbs (ELVs) and stance-taking forms
Palmer’s (2005) distinction between judgment and evidential verbs enabled us to analyse the types of judgment verbs used by the students during exploration and whether they supported their claims with evidence. Figure 4 and Table 11 illustrate that the students across all three types of schools predominantly used verbs from the ‘judgment verbs’ category over ‘evidential verbs’. Specifically, cognitive forms like ‘I think’ were most common (297 occurrences, 0.87%), followed by verbs such as ‘say’, ‘know’, ‘believe’, and ‘mean’ (16–54 occurrences), with minimal use of more speculative verbs like ‘suggest’, ‘suppose’, ‘consider’, ‘propose’, ‘reckon’ or ‘imagine’.

Figure 4. Frequency of epistemic lexical verbs (ELVs) and stance-taking forms.
Table 11. Epistemic lexical verbs (ELVs) and stance-taking forms in the entire dataset.
Notes. Hits = number of hits; Percentage = Percentage of dataset.
In the same category, the use of inferential and assumptive forms (e.g. ‘infer’, ‘estimate’, or ‘it is known that’) was infrequent. Some students drew inferences from personal experiences, primarily using sensory evidential verbs (e.g. ‘see’, ‘notice’, ‘feel’, ‘remember’, ‘imagine’, or ‘perceive’) (1–66 occurrences), and a few used assumptive forms (i.e. ‘we’ve all heard’, ‘we/you know’ or ‘the general opinion is’) to refer to common knowledge. Students also employed qualification devices such as comparatives, hedges, evaluative and hyperbolic adjectives (e.g. ‘better’, ‘catastrophic’, ‘drastically’).
Regarding evidential verbs, reported forms (e.g. ‘stated’ or ‘claimed by’) indicating third-party affirmations or statements were rarely used. Although the students occasionally referenced sources such as readings, news, social media or classroom discussions, they predominantly justified their statements using sensory verbs from their own perspective (e.g. ‘see’, ‘feel’ or ‘perceive’). This tendency was likely influenced by the question prompt asking the students to relate the topics of climate change (CC) and technology (Tech) to their local context and express their opinions. Additionally, the students sporadically used stance-taking expressions (e.g. ‘For me’, ‘In my opinion’, ‘From my point of view’, ‘personally’), but these were uncommon.
d Frequency of discourse markers (DM) and conditional ‘if’
As illustrated in Figure 5 and detailed in Table 12, the students employed a considerable number of discourse markers to structure and support their reasoning with additional information. A total of 1,781 instances were identified, comprising 5.26% of the entire dataset. The distribution of these markers across schools was similar, with no notable differences observed.

Figure 5. Frequency of discourse markers (DM) and conditional ‘if’ in the entire dataset.
Table 12. Frequency of discourse markers (DM) and conditional ‘if’.
Notes. Hits = number of hits; Percentage = percentage of dataset.
To facilitate analysis, the identified conjunctions were categorized into five groups: additive, contrastive, causal/exemplifying, sequential and temporal/conditional. The most frequently used forms were the more basic ones, such as ‘because’, ‘for example’/‘for instance’, ‘also’, ‘so’, ‘but’, and the conditional ‘if’.
Less common forms included ‘as well’/‘too’, ‘instead’, ‘since’, ‘such as’, various sequential markers (‘first’, ‘another’, ‘then’), and ‘when’. The rarely used forms were ‘moreover’, ‘not only . . . but also’, ‘however’, ‘on the one hand . . . on the other’, ‘otherwise’, ‘due to’, ‘finally’, and ‘therefore’, which are more typical of written prose (Biber et al., 2021).
The exemplifying form of ‘like’ was not included in the graph or table because it was often difficult to distinguish between its use as an exemplifier and its use as a filler. In total, 496 instances of ‘like’ were identified, making up 1.46% of our learner corpus.
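The counting procedure described above can be approximated with a short script. The following is a minimal sketch, not the Sketch Engine workflow used in the study: the marker forms are taken from the text, but the assignment of each form to one of the five groups, the tokenizer, and the reading of 'percentage of dataset' as hits divided by total tokens are all assumptions made for illustration. The sample is abridged from example 7.

```python
import re
from collections import Counter

# Hypothetical grouping: the forms come from the article, but which group
# each form belongs to is an assumption made for this sketch.
MARKER_GROUPS = {
    "additive": ["also", "as well", "too", "moreover"],
    "contrastive": ["but", "instead", "however", "otherwise"],
    "causal/exemplifying": ["because", "so", "since", "for example",
                            "for instance", "such as", "due to", "therefore"],
    "sequential": ["first", "another", "then", "finally"],
    "temporal/conditional": ["when", "if"],
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    normalized = text.lower().replace("\u2019", "'")
    return re.findall(r"[a-z]+(?:'[a-z]+)?", normalized)


def count_markers(text):
    """Return (hits per group, total token count) for a transcript."""
    tokens = tokenize(text)
    joined = " ".join(tokens)
    counts = Counter({group: 0 for group in MARKER_GROUPS})
    for group, forms in MARKER_GROUPS.items():
        for form in forms:
            # Whole-word match, so 'so' is not counted inside 'also'.
            pattern = r"\b" + re.escape(form) + r"\b"
            counts[group] += len(re.findall(pattern, joined))
    return counts, len(tokens)


def percentage_of_dataset(hits, n_tokens):
    """Hits as a percentage of all tokens (assumed definition)."""
    return 100.0 * hits / n_tokens if n_tokens else 0.0


# Abridged from example 7 in the article.
SAMPLE = (
    "I think that climate change is affecting us more and more every year, "
    "because first of all, some coastal towns are disappearing, because of "
    "the sea levels that are increasing, but on the other hand, with the "
    "lack of water, we can't grow crops. So, the price of them increased a lot."
)

if __name__ == "__main__":
    counts, n_tokens = count_markers(SAMPLE)
    for group, hits in counts.items():
        print(group, hits, round(percentage_of_dataset(hits, n_tokens), 2))
```

A form-based count like this cannot tell an exemplifying ‘like’ from a filler ‘like’, which is the same ambiguity that led to its exclusion from the figures reported above; it also counts every ‘so’ or ‘then’ regardless of function, so manual checking of the hits would still be needed.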
2 Qualitative analysis
Next, we will present qualitative insights into the students’ use of the four analysed categories by providing representative examples.
a Modals and semi-modals
The students generally tended to express hypothetical situations and possibilities by using the modal verb ‘can’ rather than ‘could’. In some instances, this choice is justified, as illustrated in example 1, where ‘can’ is used to indicate a strong possibility in the present or future, suggesting that the proposed solution (replacing fossil fuels with renewable energies) is feasible and practical. Using ‘could’ in this context would imply a weaker possibility. However, in other cases, ‘can’ could be replaced by ‘could’, as shown in example 2, where the student discusses policies that might be implemented (hypothetical situations). In example 3, ‘can’ might also be substituted with non-assertive modals (such as ‘might’).
(1) Some possible solutions to this problem can be changing fossil fuels, for example, for renewable resources such as solar and wind energy . . . (B-A8)
(2) There can (could) be restrictions to accept an amount of gasoline or what you can buy to try not to pollute . . . (B-A15)
(3) I think it should be paid by taxes more than by people because maybe someone can’t (might not be able to) afford public transport . . . (B-A14)
Overall, the students demonstrated proficient use of various modal verbs to express possibilities, hypotheses and predictions (see example 4).
(4) For example, if we couldn’t have computers (hypothetical situation/ condition), we probably wouldn’t know the vital signs of people (hypothetical consequence) and they could die (potential outcome). (B-A12)
They also effectively employed repairs (such as “would . . . ehh could”) during their speech.
b Modal adverbs and adjectives
As the quantitative results already indicated, the students tended to repeat a few modal adverbs and adjectives, with ‘maybe’ being the most frequently used. Example 5 illustrates this.
(5) . . . maybe use more public transport . . . Maybe I think, it should be paid by the taxes more than by people, because maybe someone can’t afford public transport, but that way they can . . . (B-A14)
c Epistemic lexical verbs (ELVs)
Regarding the students’ use of epistemic lexical verbs (ELVs), they particularly employed cognitive judgment verbs (such as ‘I think’) and justified their claims by drawing on personal experiences they had witnessed, utilizing evidential sensory verbs (like ‘perceive’). As illustrated in example 6, the student attempts to explain something he or she has observed (the decline of certain animal species and the acceleration of the maturation process of certain plants) by using the sensory verb ‘see’ in relation to the potential consequences of climate change. This tendency was likely influenced by the question prompt, which encouraged students to relate the topics to their local context and express their opinions.
(6) I used to hunt in my free time a lot of animals and I have seen that these animals are decreasing and I think it could be because of climate change. Also, I’m from a little village in the north of Spain and I can see that in the mountains, all the vegetation, vegetables and food need less time to grow and mature . . . (E-B5)
d Discourse Markers (DM) and conditional ‘if’
The students tended to use basic discourse markers (such as ‘because’ and ‘but’) and to repeat them. Example 7 shows how one of the students managed to correctly incorporate different markers to explain the consequences of climate change resulting from flooding and water scarcity.
(7) I think that climate change is affecting us more and more every year, because first of all, some coastal towns are disappearing, because of the sea levels that are increasing . . . but on the other hand, . . . with the lack of water, we can’t grow crops and without growing crops, we can’t get those plants and animals, our aliment. So, the price of them increased a lot, and as we’ve seen now there’s inflation. (E-B3)
Regarding the conditional ‘if’, the students used three types of conditionals (zero, first and second), as shown by examples 8, 9 and 10.
(8) If we are exposed to high temperatures because the ozone layer is becoming weaker and weaker, because of the greenhouse effect, we can have things like skin cancer . . . (C-B13) (zero conditional)
(9) I think if we don’t stop climate change, things will continue getting worse and eventually, the poles will melt, . . . (E-A3) (first conditional)
(10) If we didn’t have computers, we would have to store information in paper . . . (C-B13) (second conditional)
The first conditional was the most commonly used, allowing students to respond effectively to questions about possible scenarios related to current challenges such as climate change. However, they encountered difficulties with the second conditional, which was prompted by thought experiments involving phrases like ‘What would happen if’ or ‘What problems would you have if’. Several students used a mixed form that combined elements of the first and second conditionals, as illustrated in example 11. They employed the present tense in the if-clause, which is typical of the first conditional, along with a ‘would’ + infinitive form in the main clause, which is characteristic of the second conditional. The target structure uses the past simple in the if-clause, such as ‘If computers disappeared’ or ‘If there weren’t any computers’, followed by ‘I think I would have difficulties doing my school homework.’
(11) If computers disappear, I think for me it would be difficult to do my schoolwork . . . (B-B9)
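The distinction between the three conditional types and the mixed form can be made concrete with a small rule-based check. This is a toy sketch under strong assumptions, not the coding procedure used in the study: the function name, the short list of past-tense cues and the rule of splitting the if-clause at the first comma are all hypothetical simplifications, and the sketch only handles sentences in which the if-clause precedes the main clause.

```python
import re

# Hypothetical, non-exhaustive cues that the if-clause is in the past simple.
# A real analysis would need part-of-speech tagging or manual coding.
PAST_CUES = re.compile(
    r"\b(didn't|weren't|wasn't|couldn't|had|were|was|did|could)\b|\b\w+ed\b"
)


def classify_conditional(sentence):
    """Label a sentence as zero, first, second or mixed conditional.

    Assumes the pattern 'if <if-clause>, <main clause>'.
    """
    text = sentence.lower().replace("\u2019", "'")
    match = re.search(r"\bif\b([^,]*),(.*)", text)
    if match is None:
        return "no conditional found"
    if_clause, main_clause = match.group(1), match.group(2)

    past_in_if_clause = bool(PAST_CUES.search(if_clause))
    if re.search(r"\bwould(n't)?\b", main_clause):
        # 'would' + past if-clause = second conditional;
        # 'would' + present if-clause = the mixed form described above.
        return "second" if past_in_if_clause else "mixed"
    if re.search(r"\b(will|won't)\b", main_clause):
        return "first"
    return "zero"


# Examples 8 to 11 from the article, without the trailing ellipses.
EXAMPLES = [
    "If we are exposed to high temperatures because the ozone layer is "
    "becoming weaker and weaker, because of the greenhouse effect, we can "
    "have things like skin cancer",
    "I think if we don't stop climate change, things will continue getting "
    "worse and eventually, the poles will melt",
    "If we didn't have computers, we would have to store information in paper",
    "If computers disappear, I think for me it would be difficult to do my "
    "schoolwork",
]

if __name__ == "__main__":
    for sentence in EXAMPLES:
        print(classify_conditional(sentence), "|", sentence[:50])
```

Because the past-tense cue list treats any word ending in ‘-ed’ as past, a present passive if-clause combined with ‘would’ would be mislabelled as a second conditional; the sketch is meant only to show the present-plus-‘would’ pattern that defines the mixed form, not to replace manual annotation.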
3 Role of epistemic modality in reasoning
To investigate the role of language and reasoning in the students’ responses, we analysed their use of different epistemic modal forms. The analysis revealed that these features can occur independently; that is, exploratory reasoning can occur with minimal or no use of its corresponding linguistic forms. However, when these features are used together, they significantly enhance students’ responses, highlighting the fundamental role language plays in cognitive knowledge development.
Based on the qualitative findings, we tried to establish a ranking and to inductively define four levels of exploratory language use, ranging from minimal use of reasoning and exploratory language to greater use of both (see Table 13). At the lowest level (level 1), students responded by merely expressing their ideas sequentially without engaging in deeper analysis or elaboration. Furthermore, their use of epistemic modal forms was minimal. Example 12 illustrates such a response, in which a student offers various options for addressing climate change. At the second level, students made efforts to incorporate aspects of exploratory language (such as modal verbs, adverbs, basic ELVs, expressions for shared references, exemplifiers and DMs), as illustrated in example 13.
Table 13. Four levels of exploratory language use.
At the third and fourth levels, the students demonstrated episodes of exploratory reasoning by developing specific points through exemplifications, causal forms or other types of clarifications. In example 14, for instance, the student suggested moderate use of natural resources and explained this through the use of mobile devices, while in example 15 the student proposed using information campaigns, as well as personal and governmental actions (considering both micro and macro levels). The difference lies in the linguistic usage: example 14 is vaguer and less precise (using expressions such as ‘stuff’, ‘a little bit’, and ‘just’), presenting reasoning with minimal exploratory language. In contrast, example 15 shows a better command of both reasoning and exploratory language.
IV Discussion and pedagogical implications
This study analysed the use of the CDF ‘explore’ among secondary-level CLIL students by examining their use of epistemic modality markers and discourse markers, and investigating their role in reasoning. This is essential as a step on the way to performing exploratory talk, which is recommended for deep processing of contents at secondary school level.
First, regarding how secondary-level students from three different types of schools in Spain performed the CDF ‘explore’ in CLIL social science, this study has shed light on the structures used. The quantitative results show that, similar to previous studies (Boyd & Kong, 2015; Mercer, 2000; Soter et al., 2008), students performed the CDF ‘explore’ through a combination of different epistemic modality forms when performing exploratory tasks. They employed Boyd and Kong’s (2015) S&R words, except for the questioning forms (‘how’ and ‘why’), likely because the exploratory tasks were not conducted in an interactive, dialogical manner (i.e. between peers or between teacher and students). Additionally, as noted in these studies, we confirm that these epistemic and reasoning markers tend to co-occur and work together, serving as indicators of exploratory reasoning. However, as pointed out by Herrlitz-Biro et al. (2013) and Maine and Čermáková (2023), students can exhibit exploratory reasoning with minimal or no use of explicit linguistic features, as demonstrated in example 14. Nonetheless, their use makes reasoning more explicit and accessible.
Regarding the actual features used, we have seen that students tended to use basic modal verbs (such as ‘can’, ‘will’, ‘would’, and ‘could’), which aligns with the research findings of Dalton-Puffer (2007) and Bauer-Marschallinger (2022). These modal verbs were complemented by semi-modal forms (like ‘be going to’ and ‘have to’), with predominantly epistemic and dynamic meanings. Other forms (such as ‘should’, ‘need to’, ‘might’, ‘may’, and ‘must’) were rare, and there were no instances of ‘had better’, ‘have got to’, or ‘shall’. These findings are consistent with Biber et al. (2021), who found similar usage patterns in oral conversation, but in a different order (‘will’, ‘would’, ‘can’, and ‘could’) and with a higher presence of ‘may’ and ‘might’ than in our study. The high presence of ‘can’, which could sometimes be substituted by ‘could’ or ‘might’, was highlighted in previous studies on academic writing. Carrió-Pastor (2014) noted that Spanish L2 English speakers tend to rely more on ‘can’ than English L1 speakers, perhaps because of mother tongue influences. This trend was further confirmed by Dafouz et al. (2007) and Crawford Camiciottoli (2004), who observed a high frequency of ‘can’ among non-native university students and lecturers in EMI business classes. On the other hand, ‘may’ and ‘might’, which are commonly used by native speakers (Carrió-Pastor, 2014), were underused in our corpus.
Regarding the other forms studied, epistemic lexical verbs (ELVs) had a prominent role, particularly cognitive judgment verbs (such as ‘I think’), which is similar to the results obtained by Boyd and Kong (2015). Additionally, students used some speculative forms (like ‘suggest’, ‘suppose’, and ‘consider’), a few assumptive forms, and within evidential verbs, sensory verbs and a few reported forms. Furthermore, their ideas were linked through five different types of discourse markers: additive, contrastive, causal/exemplifying, sequential, and temporal/conditional forms. The most basic DMs (‘but’, ‘for example’, ‘so’, ‘because’, ‘if’ and ‘also’) were the most frequent. Students used three types of conditionals (zero, first and second), although the second conditional caused difficulties, with students often using a mixed form between the first and second conditionals.
We also identified repairs, various attempts to start a sentence, repetitions, incomplete sentences, as well as vagueness words (e.g. ‘something/stuff like this’, ‘all that’) and fillers used as placeholders in both English and Spanish (e.g. ‘like’, ‘well’, ‘yeah’, ‘okay’, ‘vale’, ‘o sea’, ‘bueno’). These characteristics are typical of exploratory language, which has been described as tentative, hesitant and incomplete (Maine & Čermáková, 2023; Mercer & Hodgkinson, 2008).
These results show that students have a basic awareness of the role of epistemic modality markers in developing and displaying exploratory reasoning explicitly, likely acquired from their EFL classes. However, they need more explicit guidance, as they repeatedly used basic forms, and they mixed the first and second conditionals (Llinares & Morton, 2024). Additionally, some students used the CDF ‘explore’ in a cumulative manner, presenting one idea after another without engaging with or expanding on what was said (e.g. offering context, explanations, or illustrative examples). If we want students to successfully process subject contents through higher-order thinking and communicative skills (such as CDFs), they need to transition from this unsubstantiated style of argumentation to a more exploratory approach by elaborating on and deepening their understanding. To operationalize this process, students need to use a greater variety of epistemic forms and develop a better awareness of the role of modals when expressing possibilities, likelihood or making personal judgments. DMs have an important place in this, as they show how ideas can be expanded and linked together.
To achieve this, it is fundamental to raise teachers’ awareness of the role that language (CDFs and their collaborative use) plays in developing students’ thinking and subject learning, and to use the CDF ‘explore’ in sequences where the students engage in speculative reasoning. Studies (Boyd et al., 2019; Wilkinson et al., 2010) show that when teachers use S&R words and encourage students to use these forms explicitly, learners produce longer utterances, explore more, and improve their critical thinking and linguistic proficiency. Critical thinking and linguistic proficiency are two essential components of CLIL learning, which are involved in meaning-making and deeper learning (Coyle & Meyer, 2021).
On the basis of the above, we propose the following classroom strategies to promote exploratory reasoning, and where possible, instantiate this in exploratory talk:
• Encourage open-ended questions (‘What will happen if’, ‘How do you think could’) to invite further exploration, gradually fostering students’ responsibility in constructing understanding.
• Prompt for detailed elaboration and evidence (e.g. arguments, counterarguments, examples, explanations) to establish connections across ideas, link them to prior knowledge and consider diverse perspectives.
• Incorporate meaningful, real-life examples (‘connected episodes’, see Boyd et al., 2019), where academic concepts relate to students’ personal experiences and contexts.
• Explicitly teach and practise speculative constructions and reasoning words, such as those identified in this article, along with other linguistic supports (e.g. nominalization and technical terms), to structure students’ thinking (see Bauer-Marschallinger, 2022; Breeze & Gerns, 2019; Gerns, 2023; Tedick & Young, 2018). Avoid focusing too heavily on grammar or on eliminating vague terms (e.g. ‘like’), as this can inhibit students’ participation. Instead, familiarize students with discipline-specific ways of explorative and speculative talk in an age-appropriate way.
• Foster a collaborative and supportive classroom environment where students feel safe taking intellectual risks, sharing personal and socio-cultural values, making mistakes and speculating on complex subject-specific matters in both their L1 and L2 (Mortimore, 2024). This environment should embrace diversity, encourage active listening and ensure every voice is heard (Maine & Čermáková, 2023; Mercer & Hodgkinson, 2008).
• Acknowledge and value students’ contributions, fostering what Boyd et al. (2019) term ‘epistemological commitment’.
Raising teachers’ awareness of these issues and of the proposed strategies is a good starting point, but for teachers to adopt these considerations more systematically and sustainably, teacher training and materials that incorporate these findings might be necessary.
V Conclusions
This study shows how secondary CLIL students orally performed the CDF ‘explore’ in an L2 content learning environment, and investigates the role of epistemic modality in this meaning-making process. It thus provides significant insights into the development of exploratory talk in the CLIL context. Our findings concerning the cognitive and linguistic skills of CLIL students are a useful resource for researchers, teachers, and textbook publishers, highlighting both their strengths and areas for improvement. While these CLIL students demonstrate that they have acquired basic exploratory competences (such as using a combination of basic epistemic forms), they also appear to need additional support to overcome specific difficulties (e.g. mixing the first and second conditionals or arguing in a cumulative manner) in order to meet the growing demands of the school curriculum.
Educators need to raise awareness of the roles and uses of the CDF ‘explore’, as well as epistemic modality and linguistic markers of reasoning to better support students’ exploratory and deeper learning in L2 contexts (Coyle & Meyer, 2021). This study offers pedagogical guidance for enhancing these skills and, where possible, implementing them into exploratory talk. However, for teachers to adopt these considerations more systematically and sustainably, targeted teacher training and the establishment of supportive communities of practice are essential.
Future research could investigate CLIL students’ performance of ‘explore’ in classroom interactions (e.g. in group work, pair work, or whole-class discussions) and incorporate more challenging prompts. Additionally, further research is needed to explicitly focus on the distinct features of speculation and exploration across different disciplines and languages. Hypothetical reasoning in natural sciences differs from that in social sciences (e.g. history), and such activities in Spanish differ from those in English. Clarifying these differences is essential, as language can serve not only as a quality indicator of academic performance but also as a pedagogical tool to simultaneously enhance students’ higher-order thinking and language skills.
Acknowledgements
The authors are grateful to the schools and researchers who participated in the study, as well as to the Department of Education from the Government of Navarra, the Asociación Enseñanza Bilingüe (EB Spain) and the ENEBE (Evaluación de la Enseñanza Bilingüe en España) project. They also wish to thank the Universidad Internacional de La Rioja for their support.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The first author was funded by a postdoctoral fellowship of the UNIR.
Ethical approval
We declare that this study, conducted as part of the national assessment project ENEBE, adheres to proper ethical principles and has received consent from the Asociación Enseñanza Bilingüe (EB Spain) and the Department of Education from the Government of Navarra. This study does not involve any components requiring a committee approval number. No animal subjects were involved, and all participants consented to be interviewed and permitted their anonymous statements to be quoted.
