Abstract
This study examined trends in educational research topics and methodologies by analysing conference abstracts from major international conferences between 2010 and 2024. We inductively analysed 28,000 abstracts from yearly American and European conferences (2010–2024) using structural topic modelling and deductively coded 3342 abstracts from five conferences across Asia, Europe, and America for the years 2013, 2018, and 2023. Worldwide, attention increased for sustainability, emotions, wellbeing, and tertiary education. Tertiary education is now studied twice as often as primary education. Methodologically, the proportion of intervention and methodological studies decreased from 17% to 9% and from 7% to 4% respectively, while the proportion of qualitative and review studies increased. We use Principal-Agent theory to critically question these trends.
Introduction
Educational research is a contested field with ongoing discussions about its relevance (e.g. Akkerman et al., 2021), impact (e.g. Perry and Morris, 2023), quality (Inglis et al., 2024), diversity (e.g. Morrison, 2020), and direction (e.g. Pivovarova et al., 2020). Debates about the role and direction of educational research risk polarization and ‘Balkanization’ when researchers in different streams of scholarship avoid engaging with each other’s strongest arguments or lack a shared factual picture of the field (Alexander, 2009; Dekker and Meeter, 2022). Knowledge about trends in topics and methods can help to ground debates in a common factual understanding.
This knowledge, however, is only partially available through two types of meta-studies. One type of meta-study charted developments in educational research using an inductive approach (Deleye et al., 2025), while the other used a deductive approach (Brady et al., 2023). The inductive approach used the educational research meta-data as the starting point without pre-specified categories, while the deductive approach analysed the research data with pre-defined categories. Below we summarize the main findings from both types of studies and make a case for combining their approaches with datasets that provide a more representative overview of the field.
One of the inductive studies, conducted by Huang et al. (2020), mapped the evolution of topics within educational research based on keywords of articles in prestigious educational research journals published between 2000 and 2018. With keyword co-occurrence networks, they identified five core overarching topics and described their development: Human Capital was a core topic but receded into the background after 2012. Race/social justice, higher education, and teacher education became core topics and increased in popularity throughout the studied time span. Interactive learning environments/Intelligent tutoring systems were a core topic from the start and never declined in popularity. A second inductive study by Munoz-Najar Galvez et al. (2020) charted the field based on abstracts of US dissertations published between 1980 and 2010. They found that topics were clustered along two paradigms: the interpretative and the outcome-oriented. During the 1980s the outcome-oriented paradigm was dominant, in the 90s the range of topics was diverse, and from the 00s onwards the interpretative paradigm became dominant. Over time, the most popular topics became ‘interviews’, ‘race’, ‘conceptual framework’, and ‘teacher practice’, while ‘experiments’, ‘mean differences’, and ‘cognition’ showed sharp decreases in popularity.
More recently, Deleye et al. (2025) mapped European educational research by using topic modelling on a sample of ECER conference abstracts from 1998 to 2024. Based on the topics and trends that emerged, they detected four central thematic evolutions: (1) an increase in attention for wider sociopolitical challenges (sustainability, inequalities, technological changes); (2) a decline in attention for formal education, reflected in core educational topics such as learning, teaching, and curriculum; (3) an increase in popularity of interviews and literature reviews as research methods; (4) a decline in the prevalence of theory. Inglis et al. (2024) looked at the state of educational research in the UK by analysing articles that educational researchers submitted for a national quality assessment in 2014 and 2021. They found that in 2021 ‘interviews and focus groups’ were the most popular topic (9%), followed by ‘methodological depth’ (8%), ‘teacher education and professional development’ (6%), ‘critical and social studies’ (6%), and ‘educational philosophy’ (6%). These topics dominated both the 2014 and 2021 submissions; the other 30 topics covered a distinctly smaller proportion of the papers.
Taken together, these inductive studies provide a patchwork of regional trends that are hard to compare. Across these studies, two report increased interest in race, and three report an increase in the use of interviews. Yet, we still do not know whether these patterns are global or how the prevalence of interviews compares to other research methods across regions.
Deleye et al. (2025), Munoz-Najar Galvez et al. (2020), and Inglis et al. (2024) all used structural topic modelling to inductively cluster the field into topics. Huang et al. (2020) used a similar inductive approach (co-occurrence networks), although their method shows direct word relationships rather than latent topics. The advantages of the inductive approach are that it allows for analyses of large representative datasets and provides a comprehensive overview without imposing any structure or theory beforehand. However, when the aim is to accurately chart specific mutually exclusive categories, a deductive method is more suitable.
Several educational psychologists used a deductive approach to study changes in the prevalence of empirical research designs within their field. Hsieh et al. (2005), Robinson et al. (2007), Reinhart et al. (2013), and Brady et al. (2023) coded research methods used in articles published in prominent educational psychology journals. They found that the percentage of empirical articles that studied interventions decreased from 47% in 1994, to 25% in 2020. Randomized trials decreased from 33% of articles in 2000 to 20% in 2020. Correlational studies varied from 46% in 2000 to 66% in 2010, to again 46% in 2020 (Brady et al., 2023). Meanwhile, the percentage of observational articles that offered recommendations for practice increased from 41% to 66%. Based on these findings, Brady et al. (2023) concluded that researchers are increasingly ‘squeezing’ causal conclusions out of correlational studies. These findings align with the findings from Munoz-Najar Galvez et al. (2020). However, because the sample is limited to educational psychology and prestigious journals, it is not generalizable to the broader field of educational research.
Based on the previous overview of what is known, we detected two gaps in the current empirical literature on the evolution of the educational research field. The first is related to the representativeness of the samples used. Existing studies each covered a subset of educational research: focussing only on the US, the UK, or the EU, only on educational psychology, or only on keywords in prestigious journals. The second gap is that none of the existing studies combined an inductive and a deductive approach to complement the strengths of each method.
This study addressed these gaps by applying both the inductive and the deductive approach to samples that are representative of educational research across regions. Providing a more representative overview of trends in educational research will help any assessment of and discussion about the direction of the broad inter- and multidisciplinary field by providing insights on where we stand and come from (Nisbet, 2005).
Theory
Conceptualizing changes in research fields
Only two of the previous studies used theory to help interpret their analyses of changes in the field. Inglis et al. (2024) used Lakatos’s theory of research programmes. Munoz-Najar Galvez et al. (2020) used Bourdieu’s theory of social capital. We describe why both can be helpful when interpreting the evolution of the field, but also point out their limitations and add principal-agent theory as a more practical complement for understanding changes in the behaviour of researchers instigated by research policy.
Philosophers of science have conceptualized how research fields change over time in several ways. These range from notions of steady progression through falsification (Popper, 1959) to revolutions of paradigms that replace each other without certain rational progression (Kuhn, 1962). Lakatos (1978) combined Popper’s notion of rational processes with the historic fidelity of Kuhn’s theory in his theory of scientific research programmes. According to this theory, research fields are collections of constructively competing research programmes that struggle for researchers’ attention. Researchers can choose between or combine several different types of research subjects as well as different types of research methods (heuristics). Research programmes develop when they deliver new results and directions, or dwindle when they fail to do so and have to address an increasing number of empirical ‘anomalies’ that cannot be explained by the hard core of the theory or its auxiliary hypotheses. ‘Learning styles’, for example, was a concept that was initially embraced by many researchers and practitioners, but lost its popularity when studies increasingly reported that adjusting teaching strategies to learning styles did not enhance student learning (Kirschner and van Merriënboer, 2013).
According to Lakatos, researchers join or disengage with research programmes for rational reasons. Feyerabend (1981) questioned whether there is evidence to support this claim. According to his anarchist theory of knowledge ‘we can only say that one programme was accepted while the other receded into the background; we cannot add that the acceptance was rational or that a rational development took place’ (Feyerabend, 1981: 220). With the more recent development of metascience as a field, this philosophical debate has increasingly become the subject of empirical research. Munoz-Najar Galvez et al. (2020), for example, showed that doctoral students who chose a research programme with momentum (hype) increased their chances of obtaining a job in academia. This could support the Lakatosian theory if momentum is based on new results and directions, but there are several alternative explanations. Munoz-Najar Galvez et al. (2020) used Bourdieu’s social capital theory to explain how scientists’ choices can be driven by a battle for scientific capital between newcomers and incumbent powers. From that perspective, outcomes may be explained by social power structures rather than rational qualities of the programmes.
We argue that both theories can offer part of the answer but overlook the influence of research policy. Within the setting of educational research, Lakatos’s theory would predict that researchers select topics and methods that contribute to progress in scientific understanding of education. Bourdieu’s theory can help to explain how different programmes might resist consensus or rational progress. Neither of these theories, however, takes important developments in research policies during the past decades into account. Zapp et al. (2017, 2018) analysed trends in research policies in western countries and characterized the most salient development as the ‘programmification’ of research. They defined programmification as the shift from autonomous lump-sum support for research to financing based on programmes that award grant money to researchers who write proposals that fit the programme and promise societal value. This shift in policy creates a dynamic between researchers and funding agencies that can be theoretically understood with ‘principal-agent theory’ (Van der Meulen, 1998). Principal-agent theory helps to understand relationships in which a principal wants certain work to be done but depends on an agent to perform it. The interests of the principal and the agent do not necessarily align, and the principal is dependent on the agent to provide research proposals, conduct the research, and provide information about progress. The principal and agent engage in a contract, and the principal incentivizes the agent to pursue the principal’s interests and monitors whether the agent is fulfilling its tasks. In the case of research, governments act as principal when they steer research funding with research grants that are awarded to researchers who promise to pursue a certain research programme. Principal-agent theory assumes that researchers act rationally and let incentives such as research grants influence their choice of research topics.
The calls and research grants can be relatively open with more autonomy for researchers, or they can be more focussed on specific policy interests. Over the past decades, governments increasingly steered research towards policy goals, such as strategically valuable technology or societal issues (Zapp et al., 2017, 2018). Academic publications function as information for the principal in two ways: proof of the capability of researchers to be able to fulfil tasks and proof of progress.
The system in which researchers operate incentivizes them to study the topics for which funding is available, and incentivizes them to publish as many peer-reviewed papers (‘marginally publishable units’) as possible to signal capability and progress. This research policy can influence the choice of topics through the themes of the calls, and influence the research methods used based on the degree to which these can help to maximize publications. Consider the following examples. Cross-sectional methods are more attractive than longitudinal studies because they are a quicker route to a publication. Tertiary education is preferable over primary or secondary education because of easier access to participants and no obligation of consent from parents. It may be more efficient to conduct observational rather than intervention studies because they do not require an intervention, informed consent, or repeated observations.
The principal-agent theory also helps to explain why scientists do not yet fully apply open science practices that are in the interest of the society, such as data sharing. Data sharing allows for more error detection, more efficient use of research means and subsequent discovery (Logan et al., 2021). However, for the individual scientist, not sharing data allows them to monopolize access (Stephan, 1996) and reduces the risk of error-detection which might lead to retractions and reputation damage (Else, 2024).
The combination of the abovementioned theories means that the following factors could contribute to the development of research programmes: their power of delivering relevant new insights for educational practice, their momentum and alignment with the dominant paradigm, and their suitability for allowing researchers to obtain grants for these topics and maximize publications within limited time. In our discussion we will use these lenses to critically interpret our findings.
The present study
This study aimed to provide a factually representative picture of the educational research field and its recent trends. It had two main contributions. First, it offered a more comprehensive overview than previous studies by including more than one continent and the breadth of educational research. Second, it combined the inductive and deductive methods that were thus far used separately in order to make the findings more robust. We formulated the following research questions for this purpose.
Together, the answers to these research questions will allow us to provide a more reliable factual overview of the state and direction of the field and reflect on this through different theoretical lenses.
Methodology
Sample
In order to inductively measure what researchers present we used a sample of abstracts from two of the largest educational research conferences with publicly available abstracts for individual years between 2010 and 2024: the annual American Educational Research Association (AERA) conference, and the European Conference for Educational Research (ECER), organized by the European Educational Research Association. We chose to sample these two conferences because they are the only two broad educational research conferences that have yearly data available over this time period, which allows for fine-grained analysis of developments in topics with structural topic modelling. For both conferences, abstracts were publicly available on their website in a well-organized archive.
We wrote Scrapy-based Python scripts to scrape (obtain) the abstracts from the conference websites (www.scrapy.org). The ECER dataset missed 2020 because that conference was cancelled due to Covid-19, and the AERA dataset missed data from 2024. We first removed abstracts with fewer than 100 words (often not paper proposals) and made a random selection of 1000 abstracts out of the remaining abstracts for each yearly conference, leading to a total of 28,000 abstracts. The AERA abstracts averaged 220 words, while the ECER abstracts averaged 1220 words. We randomly selected the first or second 220 words from the ECER abstracts to balance the contribution from the two conferences, given that stm is sensitive to document length. In order to test if the subsections of the ECER abstracts were representative of the longer versions, we also conducted all analyses using the complete versions of the ECER abstracts.
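The filtering and sampling steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' script: the record schema (`conference`, `year`, `text`), the function name, and the fallback to the first block for abstracts shorter than two blocks are all hypothetical.

```python
import random

def sample_abstracts(abstracts, per_edition=1000, min_words=100, window=220, seed=0):
    """Filter, sample, and length-balance abstracts.

    abstracts: list of dicts with keys 'conference', 'year', 'text'
    (a hypothetical schema chosen for this sketch).
    """
    rng = random.Random(seed)
    by_edition = {}
    for a in abstracts:
        # Drop short entries, which are often not paper proposals.
        if len(a["text"].split()) >= min_words:
            by_edition.setdefault((a["conference"], a["year"]), []).append(a)

    sample = []
    for key in sorted(by_edition):
        pool = by_edition[key]
        # Random selection of at most `per_edition` abstracts per yearly conference.
        for a in rng.sample(pool, min(per_edition, len(pool))):
            words = a["text"].split()
            if a["conference"] == "ECER":
                # Keep either the first or the second block of `window` words.
                start = rng.choice([0, window])
                if start + window > len(words):
                    start = 0  # assumption: fall back to the first block
                words = words[start:start + window]
            sample.append({**a, "text": " ".join(words)})
    return sample
```

With 15 possible years for each of the two conferences and two missing editions (ECER 2020 and AERA 2024), 28 editions of 1000 abstracts each give the reported total of 28,000.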
In order to answer research question 2, we selected a broader selection of prominent and long-running conferences from three continents, namely the AERA, ACE (Asian Conference on Education), and EARLI (European Association for Research on Learning and Instruction) conferences. Of those, we selected the 2023, 2018, and 2013 editions. For EARLI, the 2019 edition was used instead of 2018 since its conference is held only in odd years. We copied abstracts from published abstract books of these conferences, available on conference websites. For the deductive approach we prioritized representativeness of the sample across regions over more fine-grained yearly developments. Because the European and Asian continents are divided into different comprehensive associations and conferences, we added the HKERA (Hong Kong Educational Research Association) and ECER conferences of 2023 to increase coverage for these regions. For each edition, we manually coded 384 randomly selected abstracts if available, or as many as were available if the number of accepted abstracts was less than 384 (this was the case for each ACE edition and HKERA). A sample of 384 allowed us to detect medium-effect differences between conferences. We coded 3342 abstracts in total.
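As a side note on the number 384: it coincides with the conventional sample size for estimating a proportion within 5 percentage points at 95% confidence. Whether the sample size was derived this way is an assumption; the text only states that it sufficed to detect medium-effect differences. A minimal sketch of that conventional calculation, with a hypothetical function name:

```python
def sample_size_for_proportion(margin=0.05, z=1.96, p=0.5):
    """Sample size n = z^2 * p * (1 - p) / margin^2 for estimating a proportion.

    p = 0.5 is the most conservative choice (largest variance).
    """
    return z ** 2 * p * (1 - p) / margin ** 2

n = sample_size_for_proportion()
print(round(n, 2))
```

The result is slightly above 384, which is why the figure is usually quoted as 384 or 385 depending on the rounding convention.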
Analytical strategy
Inductive
We conducted STM with the stm (Roberts et al., 2019) and quanteda (Benoit et al., 2018) packages in R (R Core Team, 2021). During the first phase we prepared the text dataset by adding conference, year, and unique identifiers as metadata. We removed adverbs (this, therefore, thus, etc.), numbers, and words with fewer than 4 letters (et, al, der), and stemmed the remaining terms. We continued by making bigrams (e.g. social_capital) and trigrams (e.g. higher_education_institut), and removed all words that occurred fewer than 11 times. We transformed the words (‘tokens’) into document-feature matrices (dfm) and used these to make an stm file.
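The preprocessing steps above can be illustrated with a short standard-library sketch. It is not the quanteda pipeline used in the study: stemming is omitted, the stopword set contains only the three examples named in the text, n-grams are added alongside the single words (an assumption), and all function names are hypothetical.

```python
import re
from collections import Counter

# Illustrative stopword set, limited to the examples given in the text.
STOPWORDS = {"this", "therefore", "thus"}

def tokenize(text, min_len=4):
    # Matching letters only drops numbers and punctuation.
    tokens = re.findall(r"[a-z]+", text.lower())
    # Remove stopwords and words with fewer than `min_len` letters.
    return [t for t in tokens if len(t) >= min_len and t not in STOPWORDS]

def add_ngrams(tokens, n_values=(2, 3)):
    # Append bigrams and trigrams joined with underscores, e.g. social_capital.
    out = list(tokens)
    for n in n_values:
        out += ["_".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return out

def document_feature_matrix(texts, min_count=11):
    docs = [Counter(add_ngrams(tokenize(t))) for t in texts]
    totals = Counter()
    for d in docs:
        totals.update(d)
    # Keep only features that occur at least `min_count` times in the corpus.
    vocab = sorted(f for f, c in totals.items() if c >= min_count)
    keep = set(vocab)
    rows = [{f: c for f, c in d.items() if f in keep} for d in docs]
    return vocab, rows
```

Each row is a sparse mapping from feature to count for one abstract, which is the information a document-feature matrix carries.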
In the second phase we used the stm file to explore how many topics were optimal by first examining fit statistics of models with 30, 40, 50, 60, 70, and 80 topics. We plotted the exclusivity (degree to which words are specific to a single topic), variational lower bound (determines convergence), residual (estimation of dispersion of residuals for the model), and semantic coherence (how commonly most probable words co-occur; Weston et al., 2023). Models with 50, 60, and 70 topics performed best on these indices and were examined in more detail. For these three models we compared overlap in topics between models (congruence). The 60-topic model showed the best overall fit statistics and provided topics that were interpretable and specific. Outcomes of these analyses with graphs are available together with the code and data used in the online materials (Dekker et al., 2025).
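To make the semantic coherence index more concrete, the sketch below scores one topic by how often its most probable words appear together in the same documents. It uses a common document co-occurrence definition; whether it matches the exact computation in the stm package is an assumption, and the function name and inputs are hypothetical.

```python
import math
from itertools import combinations

def semantic_coherence(top_words, documents):
    """Sum of log((D(v_i, v_j) + 1) / D(v_j)) over pairs of top words.

    top_words: a topic's most probable words, ordered from most to least probable.
    documents: iterable of collections of terms, one per document.
    Assumes every top word occurs in at least one document.
    """
    docs = [set(d) for d in documents]

    def doc_freq(*words):
        # Number of documents that contain all the given words.
        return sum(all(w in d for w in words) for d in docs)

    score = 0.0
    for j, i in combinations(range(len(top_words)), 2):  # j < i
        v_j, v_i = top_words[j], top_words[i]
        score += math.log((doc_freq(v_i, v_j) + 1) / doc_freq(v_j))
    return score
```

Higher (less negative) values indicate that a topic's top words tend to co-occur, which is one reason coherence is usually read together with exclusivity rather than on its own.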
During the third phase we coded and named the different topics. For each topic we looked at the list of most frequent words, words that are exclusive to that topic, and abstracts that identified most with the topic. Topic 1, for example, had as most probable words: roma, educ, school, spain, republ, group, Czech. The most typical words for this topic were: matura_exam, roma_minor, roma_children, roma_communiti, non-roma, gitano, roma_pupil. Authors 1 and 2 both named this topic ‘Roma’ (the largest ethnic minority in Europe). Authors 1 and 2 named all topics independently. Afterwards we checked whether we had given similar names. In 40 instances the names were identical, and 15 topics showed minor wording differences but conceptual agreement, resulting in substantial agreement on 55 topics (92%). The Cohen’s Kappa for agreement between raters was κ = 0.83, indicating good intercoder reliability. Author 3 acted as referee and chose the most fitting solution based on example abstracts from the topic for the 20 topics that did not receive identical names. For the five topics with substantial disagreement, all three coders agreed with the solution proposed by author 3 after reading the example abstracts. In the fourth phase we calculated the correlations between topics to show how the different topics are related to each other. Finally, in order to answer RQ1 we calculated and plotted the proportion that each topic covered and how this changed over time in the AERA and ECER sample.
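For readers unfamiliar with the agreement statistic, the following sketch computes Cohen's kappa for two raters who each assign one categorical label per item. How the free-text topic names were mapped onto comparable categories is not described in the text, so the inputs here are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long lists of categorical labels."""
    n = len(rater_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement by chance, from each rater's label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

# Share of topics with identical or conceptually equivalent names.
agreement_share = (40 + 15) / 60
```

Kappa corrects the raw agreement share for agreement expected by chance, so it is generally lower than the 92% raw figure.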
Deductive
We coded the research methods with the codebook of Brady et al. (2023), refined after piloting on abstracts outside the final sample (Table 1). After testing the codebook with a sample of abstracts, we added the categories ‘observational mixed methods’ and ‘design research’. We defined intervention studies as any study – besides design research – that manipulated an independent variable, be it a randomized controlled trial, a mixed-method action-research or any other type of evaluation. We treated design research as a separate category due to its iterative nature and unique combination of research methods. Additionally, we coded nonempirical papers as one of the following: ‘conceptual’ (essays, commentaries), ‘methodological’, ‘narrative review’, ‘systematic review’, or ‘meta-analysis’. We also coded whether the sample was taken from primary, secondary or tertiary education students, teachers, school leaders, parents, or others. A sample of 50 abstracts was coded by three raters to assess interrater reliability, which was sufficient (ICC of 0.704 and higher).
Adapted codebook from Brady et al. (2023) used for coding research methods.
Note. Every abstract received only one research method code (1.1–3.5) and one education level indicator.
Results
Research topics in educational research based on AERA and ECER 2010–2024
The 60 different topics displayed in Table 2 categorize the AERA and ECER abstracts from 2010 to 2024. We organized them in 11 broader themes such as Administration, Teachers, and Psychology. For each topic, the FREX terms indicate which words were exclusive to this topic and Proportion indicates the proportion of the topic found in the average abstract.
Topics organized in themes with labels and average proportion.
Each abstract belonged to one or more of the topics. The topics, in turn, relate to each other in differing degrees, forming broader clusters. This reflects which topics are likely to co-occur in an abstract. These broader clusters can also be seen as domains within the field. Figure 1 displays a topic heatmap with positive (green) and negative (red) correlations between the different topics. The squares that envelop combinations of topics visualize clusters of topics that generally relate more to each other. The order of the topics is based on hierarchical cluster modelling, in order to make sure that related topics are positioned closer to each other. For example, Educational policy, Educational reform, and School quality often co-occur in abstracts, and are grouped together in one cluster (top right square). Sexual education is quite singular: except for a few slight negative correlations it does not correlate with any other topic.
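The correlations behind such a heatmap can be sketched as Pearson correlations between topics across documents. The text does not state how the correlations were computed, so this is one plausible reading; the input `theta` (one list of topic proportions per abstract) and the function names are hypothetical, and the hierarchical ordering of topics is not reproduced here.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long numeric lists (non-constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def topic_correlations(theta):
    """theta: list of per-document topic proportion lists (rows sum to 1)."""
    k = len(theta[0])
    # One column of proportions per topic.
    cols = [[row[t] for row in theta] for t in range(k)]
    return [[pearson(cols[i], cols[j]) for j in range(k)] for i in range(k)]
```

A positive entry means two topics tend to receive high proportions in the same abstracts; a negative entry means one tends to be high where the other is low.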

Hierarchically ordered topic correlation heatmap.
Based on this figure, it is already visible that some topics match better with certain methods: experiments co-occur more often with topics that are assessed at the student level, and therefore fall within a square near the bottom of the figure (third from the bottom) together with reading and mathematics; they also correlate positively with motivation. Experiments correlate negatively with the policy-oriented topics in the top right square, possibly because it is quite expensive and complicated to randomize at the policy level compared to the student level (see, for example, Cowen, 2019). Action research co-occurs more with identity-related themes. It could be that these methods are more suited to the topics for substantive or practical reasons, or that they are the status quo in their respective domains.
Trends in research topics based on AERA and ECER 2010–2024
The most salient findings based on our mapping of developments in topics at AERA and ECER can be organized into three categories: (1) similar developments across continents, (2) differences in continental trends, and (3) structural differences between the two conferences. The Supplemental Materials provide a complete overview of visualized trends in all topics.
Similar developments across Europe and North America
The COVID-19 and climate crises were inductively combined in one topic: global disasters. A sharp increase in prevalence of this topic after 2020 in both conferences reflects the global nature of these crises (p < 0.001, R2 = 0.81). Emotions and wellbeing (p ⩽ 0.001, R2 = 0.83) and Mental health (p = 0.01, R2 = 0.40) also increased in popularity at both conferences. This might have been exacerbated by the Covid crisis, which increased concerns about mental health and wellbeing due to lockdowns.
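The R2 values reported for trends can be read as the share of year-to-year variance in a topic's prevalence that a trend line explains. The text does not specify the model behind the reported p and R2 values; the sketch below assumes a simple least-squares line of yearly topic proportion on year, and the function name is hypothetical. The p-value is left out because it requires a t distribution that the Python standard library does not provide.

```python
def linear_trend(years, proportions):
    """Least-squares slope, intercept, and R^2 of proportion regressed on year."""
    n = len(years)
    mx = sum(years) / n
    my = sum(proportions) / n
    sxx = sum((x - mx) ** 2 for x in years)
    sxy = sum((x - mx) * (y - my) for x, y in zip(years, proportions))
    slope = sxy / sxx
    intercept = my - slope * mx
    # R^2 = 1 - residual sum of squares / total sum of squares.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, proportions))
    ss_tot = sum((y - my) ** 2 for y in proportions)
    return slope, intercept, 1 - ss_res / ss_tot
```

A topic that jumps sharply after a single year, as described for global disasters, can still yield a high R2 under a linear fit, so the trend plots in the Supplemental Materials remain the better guide to the shape of each development.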
Differences in continental trends
Broadly, three differences stood out when comparing trends between AERA and ECER.
The first difference is the attention for race and black experience at AERA. Munoz-Najar Galvez et al. (2020) described how race increased in popularity from less than 1% to 4.1% between 1980 and 2010 in US dissertation abstracts. This trend continued in our dataset. Race and black experience combined for around 4% in 2010 and increased to a combined 9.6% in North America. The increases in the frequency of both race (p < 0.001, R2 = 0.92) and black experience (p < 0.001, R2 = 0.83) are large and significant. These increases could reflect societal interest in the United States for these topics during the time of the Black Lives Matter movement. Neither race nor black experience is prevalent in Europe, which seems to have its own discourse around cultural diversity, youth, Roma, and refugees. Within Europe, refugees as a topic increased in prevalence (p < 0.001, R2 = 0.64) after the Syrian refugee crisis that impacted Europe during this timeframe (Figure 2).

Salient differences in trends in topics at AERA and ECER 2010–2024.
The second major difference is that the frequency of digital literacy saw a large increase at ECER (p < 0.001, R2 = 0.81), while there was no significant increase in contributions about this topic at AERA (p = 0.15).
The third difference is that several topics that used to be prevalent at AERA decreased in popularity. The frequency of student performance (p < 0.001, R2 = 0.76), statistical modelling (p = 0.002, R2 = 0.58), and experiments (p = 0.003, R2 = 0.53), declined significantly at AERA, while other methods remained relatively stable.
Structural differences between AERA and ECER
Several of the differences between AERA and ECER were relatively stable. Educational philosophy is significantly more popular at ECER than at AERA (p < 0.001), while statistical modelling, experiments, and design research occur in more of the AERA abstracts and are relatively marginal at ECER. The topics of motivation, urban/rural communities, preservice teachers, and high school to college transition are more prevalent at AERA, while special educational needs, learning environments, international comparisons, vocational education, different teaching approaches, teacher PCK, university faculty development, and school quality are more prevalent at ECER.
The data do not offer explanations for these differences, but some are likely attributable to language and substantive differences. Community colleges, for example, are a North American phenomenon, while Europe houses vocational education schools. International comparisons may be more popular in the European context because they provide comparisons between similar European countries while they do not offer comparisons between the different states of the United States. Differences could also be due to the specific research community that each conference attracts. ECER is not the only large European educational research conference and might not be representative of the European educational research community. The deductive approach, which included a broader array of conferences, could provide more information about this.
While structural topic modelling detects differences in trends over time in a large sample, it is important to note that methods are not detected through STM when they are not explicitly mentioned in a uniform way in the abstracts. This is why a reliable estimate of proportions of methods should rely on a deductive method such as used in Brady et al. (2023).
Trends in research methods in America, Asia, and Europe 2013–2023
Next, we turn to our deductive coding of abstracts presented at five conferences across three continents. Across all conferences, the proportion of conference abstracts describing any type of intervention study decreased from 17% to 9%, while qualitative observational studies increased from 30% to 45% (Figure 3). Observational mixed-methods studies increased from 8% to 10%. Within intervention research, the proportion of studies with an experimental design remained stable at about 40%. These findings are in line with the trend that Brady et al. (2023) found in educational psychology journals.

Trends in empirical contributions to ACE, AERA, and EARLI.
Among the non-empirical contributions, review papers (meta, systematic, and narrative) are becoming more common, while conceptual papers and methodological contributions seem to decline (Figure 4).

Trends in nonempirical contributions to ACE, AERA, and EARLI.
Although trends were similar, the conferences exhibited large, stable differences in the types of research that were presented. Both AERA and the Asian conferences (ACE and HKERA) were increasingly dominated by qualitative research (Figure 5). Within Europe, quantitative and qualitative streams seem to be largely divided between the two large conferences. Worldwide, EARLI is the educational research conference with the highest percentage of quantitative contributions, while ECER is the conference with the highest percentage of qualitative and mixed contributions.

Trends in empirical contributions at AERA, EARLI, ECER, ACE, and HKERA.
Within EARLI, there were striking differences between national research traditions. The two countries with most EARLI abstracts, Germany (26%) and the Netherlands (13%), had a strong quantitative slant and contributed the majority of experimental studies (61% of the 121 experiments presented). Scandinavian researchers, on the other hand (representing 12% of abstracts), contributed only 5% of experimental studies, and instead presented mostly qualitative research.
Research into tertiary education increased with 8 percentage points, making it increasingly the most studied educational level, while research into primary education decreased and research into secondary education remained relatively stable (Figure 6).

Trends in studied educational levels.
Discussion
This study offered an overview of trends in research topics and research methods based on inductive and deductive analyses of abstracts from large educational research conferences. Our inductive approach showed common trends (e.g. a rise in attention for sustainability, emotions, and wellbeing), notable differences in trends (e.g. increase in attention for race in NA and digital literacy in EU), and regional terminology (e.g. community colleges in NA vs vocational education in EU). In line with earlier studies, both our inductive and deductive approaches indicate a decrease in all types of intervention studies (randomized trials as well as other types of evaluation). At ACE, AERA, HKERA, and ECER, qualitative research is most prevalent, while EARLI increasingly hosts observational quantitative studies. Review studies are gaining in popularity, while the frequency of conceptual and methodological studies is in decline. Tertiary education is increasingly the most studied education level, while primary education is the least studied level and is receiving progressively less attention. Apart from differences between Europe and North America, we also found large differences between the two big conferences within Europe and between countries within Europe. In the following, we will use the theoretical lenses of research programmes, principal-agent theory, and social capital to reflect on these findings.
Research programmes and principal-agent theory
According to the Lakatosian theory of research programmes, the decline of a research programme is induced by a lack of breakthroughs or new findings. This theoretical lens can be applied to our findings in several ways. Could the decline in intervention studies be attributed to a lack of findings or breakthroughs? Lortie-Forgues and Inglis (2019) analysed large-scale experimental studies in educational research, and noted that they are often ‘uninformative’, with average effect sizes of 0.06 standard deviations and confidence intervals with a mean width of 0.30 SD. For illustration, an interval of that width centred on the average effect would run from about -0.09 to 0.21 SD, and so could not distinguish a null effect from a modest positive one. This could make these methods less successful and popular.
A similar case could be made for the relative absence of replication studies in educational science (Makel and Plucker, 2014; Perry et al., 2022). Lortie-Forgues and Inglis argued that the lack of informative results could be due to a flawed literature on which the interventions are based, problems in executing interventions at scale with sufficient fidelity, or a need for even larger sample sizes in RCTs. Critics of RCTs might add that the problem is more fundamental and that the whole underlying outcome-oriented paradigm is flawed and inappropriate for educational contexts (discussed in Dekker and Meeter, 2022). Perry and Morris (2023), however, point out that educational research from the interpretative domain does not seem to offer insights that lead to sustainable educational improvement either. Because educational research lacks cumulative development and clear agreement on its major breakthroughs, Lakatos’s theory of research programmes provides a poor fit to our field’s trends. At this point, the more specific principal-agent theory might be a helpful addition to the Lakatosian lens to help interpret researcher preferences.
Principal-agent theory suggests different mechanisms for these patterns. Based on principal-agent theory, one could predict that researchers facing publication pressures and resource constraints might prefer observational studies over intervention studies, and tertiary education over other levels because participants are easier to access. Observing these trends alone, however, does not yet confirm principal-agent mechanisms at work. Researchers’ choices may equally reflect epistemological commitments to qualitative paradigms (Munoz-Najar Galvez et al., 2020), loss of methodological capacity (Smith et al., 2025), or genuine substantive reasons for preferring certain approaches.
In line with Deleye et al. (2025), we also found an increase in attention for societal issues such as climate change, digital literacy, mental health, race, refugees, and wellbeing at the expense of topics such as teaching, assessment, and student performance. Deleye et al. (2025) argued that this could be due to the focus of funding agencies on these societal issues, suggesting ‘programmification’ may steer substantive topics. While this interpretation is consistent with principal-agent theory, alternative explanations deserve consideration: these topics may have gained prominence due to genuine societal urgency (e.g. the COVID-19 pandemic, Black Lives Matter movement, climate crisis), increased researcher awareness of social justice issues, or paradigm shifts towards critical and interpretive approaches. Without direct evidence of researchers’ motivations, we cannot determine the relative contribution of funding incentives versus these other factors.
Interestingly, methodological trends appear less responsive to funder incentives than topic selection. Smith et al. (2025) found that British educational researchers in 2024 were far less likely to conduct intervention studies and less likely to possess the required methodological skills. Moreover, academic researchers’ share of EEF-commissioned evaluations dropped from 80% in 2011 to just 20% in 2024. This pattern suggests that factors beyond principal-agent incentives, such as methodological training, paradigm preferences, or self-selection into qualitative traditions, may be more powerful drivers of method choice. This reveals important limitations to principal-agent theory’s explanatory scope: while it may partially account for topic selection (where researchers retain agency), it appears less predictive when explaining entrenched methodological preferences. Theoretically, this represents a principal-agent misalignment in which researchers’ methodological preferences diverge from funders’ calls for intervention research. It is important to note that funding agencies might not be the only ‘principal’ for researchers as agents. Universities and schools can also be funders, with more or less restrictive goals and demands.
Paradigms
Generally, there seems to be a decline in the occurrence of outcome-oriented and intervention studies and more interest in socio-cultural, interpretative, and qualitative approaches. This finding aligns with the trend that was found by Brady et al. (2023), Deleye et al. (2025), Munoz-Najar Galvez et al. (2020), Nisbet (2005), Inglis and Foster (2018), and Smith et al. (2025).
Qualitative research methods are the most prevalent at most conferences, with the exception of EARLI. While most conferences are becoming increasingly qualitative, EARLI is becoming more quantitative. This could indicate a form of polarization in which research methods and topics are increasingly clustered and isolated, a development that Alexander (2009) identified and characterized as ‘Balkanization’. Stronger correlations between research topics and methods can be present for substantive reasons, but would also be more likely to occur in isolated domains with certain researcher preferences, self-selection into domains, and the comfort zone of a status quo about how a topic is studied within this domain. Are STEM topics more strongly correlated with quantitative research methods (Figure 1) because of substantive reasons or researcher preferences or skills? Smith et al. (2025) and Morris et al. (2023) found that British educational researchers in 2024 were far less likely to use a range of methodological approaches compared to 2002. In their analyses of the educational effectiveness research landscape, Perry and Morris (2023) describe the segregation of the different subfields and associated research methods. Because of their segregated organization, these subfields rarely build on each other’s findings, which makes the findings less cogent and coherent and makes it harder for practitioners to make sense of the fragmented outcomes.
Additionally, within EARLI conferences we found that preferences for certain methods differed from country to country. It is not clear whether these differences are based on social processes within research communities or systemic and policy differences such as research grant agendas and institutional training. Munoz-Najar Galvez et al. (2020) found that researchers who graduated in a trendy subject within the leading paradigm were more likely to obtain a job as PhD advisor. Our results were also consistent with an increasing dominance, worldwide, of the interpretative paradigm in educational research. This might have been one of the forces behind the further increases in popularity that we registered for the same methods and topics that Munoz-Najar Galvez et al. (2020) observed, such as race at AERA. These findings highlight that principal-agent theory, while valuable for understanding how external incentives may influence research patterns, cannot fully account for the complex interplay of epistemological commitments, disciplinary socialization, institutional cultures, and historical trajectories that shape research communities. The ‘Balkanization’ we observed may reflect emergent properties of these multiple forces rather than simple responses to incentive structures.
Limitations
There are at least four serious limitations to this study that should be taken into account in the interpretation and weighing of the findings. First, the large broad conferences do not cover all conferences that educational researchers frequent. The included conferences represent a large variety of special interest groups, but each subfield typically also has its own specialized conferences outside the large, broad, international ones. The degree to which the variance found in the large conferences matches the variance in the total population is unknown. It may well be that some subfields in educational science (AI in education, for example) are less well represented in the broader conferences. Similarly, our sample only includes conferences with paper presentations in the English language. Some countries and continents are therefore less well represented because researchers more often publish and present in a non-English language. Thus, while North America and Europe were well represented in our sample, we had difficulty identifying large international conferences on other continents that were organized over longer time frames. The two Asian conferences included in this study were clearly less extensive than AERA or the European conferences.
Although conference papers are a less selective category than dissertations or published articles, they still do not cover all educational research, such as studies that are only published on a website or used within an organization. Acceptance rates for the selected conferences were not publicly available, except for the EARLI conference of 2021, which had an acceptance rate of 77%. We could not find whether the acceptance rates for the other conferences were similar, but conferences are selective to some degree and, thereby, also discursively shape the field.
Secondly, we used ECER in our inductive sample because it is a yearly organized conference, like AERA, but our deductive sample shows that ECER is not representative of the European educational research field. The two large European educational research conferences, ECER and EARLI, have clearly different profiles and distributions of prevalent research methods. Such cleavages may also exist on other continents but be less visible there.
Thirdly, abstracts are only brief summaries of a paper. It can be questioned whether they sufficiently capture the breadth of what is studied and written in a study. We explored whether this was the case by also running STM analyses with extended abstracts that were available from ECER, containing over 1000 words instead of 220. These analyses resulted in similar topics, but because STM is sensitive to document length, these topics were more Europe-centric. Adding weights to correct for this, such as adding log(word count) as a covariate, resulted in models that would not fit. We read a random sample of 384 ECER abstracts and found that the first 440 words mostly did cover the substantive topic and applied methods. To ensure that the AERA and ECER contributions were of similar length, we therefore randomly selected either the first 220 or the second 220 words from each ECER abstract. Sensitivity analyses showed that both windows produced comparable topic structures. The main findings held whether we used complete ECER abstracts, the first 220 words, the second 220 words, or a mix of the first and second 220 words.
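The window selection described above can be sketched as follows. This is a minimal illustration in Python, not the code used in the study: the function names, whitespace tokenization, the fixed seed, and the fallback for texts shorter than two full windows are all assumptions made for the example.

```python
import random

WINDOW = 220  # words per window, matching the abstract length reported above


def sample_window(abstract: str, rng: random.Random, window: int = WINDOW) -> str:
    """Return either the first or the second `window` words of an abstract."""
    words = abstract.split()  # simple whitespace tokenization (assumption)
    # Assumption: texts too short to hold two full windows keep their first window only.
    if len(words) < 2 * window:
        return " ".join(words[:window])
    start = rng.choice([0, window])  # pick the first or the second window at random
    return " ".join(words[start:start + window])


def truncate_corpus(abstracts, seed=0):
    """Apply the random window selection to every abstract, reproducibly."""
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    return [sample_window(text, rng) for text in abstracts]
```

Fixing the seed would allow the "mix of first and second windows" variant to be regenerated exactly when rerunning sensitivity analyses.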
A fourth limitation concerns our theoretical interpretation. While we use theory to interpret trends, our observational data cannot confirm the causal mechanisms that these theories propose. We observe alignment between incentive structures and research patterns, but cannot rule out that these patterns reflect other dynamics such as how researchers were trained (Smith et al., 2025). Future research using surveys or interviews with researchers could directly examine whether and how funding structures influence substantive and methodological decisions.
Conclusions
By charting trends in topics and methods that educational researchers presented, this paper has made the following three contributions to the literature. One, it provides a broad overview of trends and of the comparative frequency of the research methods used. This can be informative for debates and analyses about the status and direction of the field. Two, it has detected several regional differences in trends and frequencies between continents and within Europe. These comparisons can be useful benchmarks for transnational and national policy and prompt reflection and follow-up questions: should there, for example, be more attention for digital literacy at AERA or in the North American educational research landscape? Are comparative differences within Europe attributable to research policies or differences in educational contexts? Should research domains and methods interact more? Three, it offers critical analyses of some specific trends based on three different theoretical lenses. This third contribution raises several uncomfortable questions for us as a field.
Are we studying what is relevant to educational practice or are we selecting topics and methods because they are easier to study or within our comfort zone? The degree to which relevance and access might be increasingly at odds with each other is illustrated, for example, by the fact that more than twice as many people finish primary education as enrol in tertiary education (UNESCO Institute for Statistics, 2024), while tertiary education is studied twice as often (36%) as primary education (18%) by educational researchers. Similarly, we could ask whether we are moving away from intervention studies because they are uninformative or because they require more resources than observational studies. These uncomfortable questions point towards follow-up studies that can disentangle the drivers behind the changes we have documented.
Acknowledgements
The authors would like to thank Bert Bredeweg, Marco Kragten, Tom Perry, and Marij Veldman for feedback on drafts of the manuscript. We thank AERA and ECER for making their conference abstracts publicly available and Alexander Christ for sharing a subsection of the dataset used by Deleye et al. (2025) for cross-validation. Finally, we thank the three anonymous reviewers for their careful and helpful review commentaries.
Author contributions
Izaak Dekker: Conceptualization, Methodology, Formal Analysis, Investigation, Visualization, Supervision, Writing - Original draft, Writing - Review & Editing. Efthimia Smixioti: Investigation. Martijn Meeter: Conceptualization, Resources, Methodology - Analytical Strategy, Formal Analysis, Investigation, Writing - Review & Editing.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
