Examining Data Richness in Undergraduate Students’ Reflections: A Linguistic Analysis of Three Data Collection Methods

Abstract

Researchers select data collection methods for a variety of reasons. In our research on experiential education programs, we perceived that when we collected students’ reflections using different methods, the data we received was different. However, we found few existing studies to help us characterize what was different across our data sets to inform our decisions about which methods to use. Thus, the purpose of our study was to compare data richness across three data sets from previous studies of engineering students studying abroad, each collected using a different method (written diaries, interviews, video diaries). We used linguistic inquiry as a lens for exploring data richness and considered two dimensions of richness: depth of content and level of reflection. In comparing our data sets, we identified differences in both dimensions. We found that each data set demonstrated depth of content in different ways, but the data from interviews and video diaries revealed greater levels of reflection than the data from written diaries. Our study contributes a more nuanced understanding of the data richness that can be achieved using different qualitative data collection methods. We identify unique patterns in the depth of content and level of reflection dimensions across data collection methods, which may inform a researcher’s choice of which method to use in a study.

Keywords

data richness, reflection, diaries, interviews, data collection, linguistic inquiry

Introduction

A central decision that researchers make for each new study is the selection of a data collection method. The advent of new technologies and communication methods has inspired qualitative researchers to explore an expanding array of new approaches to collecting data (Creswell & Poth, 2023). However, the range of emerging qualitative methods can make it challenging for researchers to select the method or methods most appropriate for their study. Several criteria have been suggested to help researchers in choosing between methods, including considerations of time required for researchers and participants, access for populations who may be marginalized by certain approaches, ethical concerns related to privacy and confidentiality, and trust in the authenticity of the data collected through digital methods (Creswell & Poth, 2023; Thomas Dotta et al., 2024). Last, but not least, researchers must consider the research question of interest and identify a form of data that will allow them to answer the question (Creswell & Poth, 2023; Maxwell, 2013).

In the context of higher education, researchers face the additional consideration that data collection can also be an educational intervention (Cao & Henderson, 2021). Whether intentional or incidental, asking students to respond to questions about their educational experiences can be an opportunity for reflection and learning. Similarly, assignments designed for educational purposes can also be used as data collection by researchers seeking to understand student learning. Though some researchers may question the blurring of lines between education and research (Cao & Henderson, 2021), we believe that it is impossible to avoid impacting students through such a study, and therefore a responsible research design will consider what is best for student learning. We faced this question as we sought to study engineering students’ experiences studying abroad. Experiential education programs (e.g., study abroad) in higher education settings frequently integrate reflection activities as a research-based approach to support student learning (Moon, 2004). Because of the central role of reflection in experiential learning, researchers often analyze reflection data to understand students’ learning processes (Davis & Knight, 2023; Savicki & Price, 2015). We identified diary studies as a unique connecting point between formal research design and the reflection assignments common in experiential education programs. However, diary studies can take many forms, and we wondered which approach would result in the richest reflection data, both for research insights and student learning. Though several researchers have defined data richness (Braun & Clarke, 2013; Charmaz, 2003; Maxwell, 2013), we found few examples where reflection approaches or qualitative data collection methods have been compared on this characteristic.

The purpose of our study was to compare data richness across three existing data sets from studies of engineering students studying abroad, each collected using a different method (written diaries, video diaries, interviews). We include written diaries as a common form of reflection in experiential education programs, interviews as a common qualitative study design in higher education research, and video diaries as a newer method in both education and research. We review definitions of data richness and introduce linguistic inquiry as a lens for exploring data richness. In our analysis, we identify unique patterns in both the depth of content and level of reflection demonstrated across data collection methods, which may inform a researcher or educator’s choice of which method to use in a particular context. Importantly for student learning, interviews and video diaries in our study demonstrated greater levels of reflection across most dimensions compared to written diaries. Our study contributes a new framing through which to consider data richness and proposes level of reflection as a useful dimension to consider, particularly in the context of educational research.

Literature Review

In this section, we discuss diaries as a data collection method, along with their strengths and weaknesses. We then propose a natural alignment between diaries and the reflection activities commonly used in experiential education programs in the higher education context. Finally, we review previous comparisons of qualitative research methods, define data richness in the context of our study, and present our research questions.

Diary Data Collection in Qualitative Research

A diary is a regular record kept by an individual over a period of time. Diaries are characterized by being contemporaneous (i.e., recorded close to actual events) and personal (i.e., controlled by the individual; Alaszewski, 2006). Many social science fields collect diaries to understand the everyday lives of individuals. Diaries can be solicited (i.e., collected for research purposes) or collected for another purpose (e.g., a classroom reflection) and then analyzed for a study (Bartlett & Milligan, 2015; Cao & Henderson, 2021). Diaries have been collected as handwritten journals, digital journals, and more recently, audio and video diaries. The content of diaries can be highly structured (e.g., filling out a questionnaire), totally unstructured narrative, or semi-structured to balance flexibility with research relevance. In this way, diary studies can be carried out with a range of epistemological and theoretical groundings and result in quantitative or qualitative data (Cao & Henderson, 2021; Hyers, 2018). In this paper, we focus on diaries as they are used in qualitative research, where they have long been recognized as a way to capture personal life narratives (Alaszewski, 2006; Hyers, 2018).

Diaries offer several affordances that can result in meaningful insights for studies in a range of contexts. In particular, diaries can give researchers access to phenomena that are hard to observe through other means. For example, diaries can be used to study populations that are remote or marginalized or to study sensitive topics, which participants could be uncomfortable discussing with an interviewer (Alaszewski, 2006; Bartlett & Milligan, 2015). The longitudinal nature of diaries allows narratives to be recorded closer to actual events, reducing challenges with remembering what occurred and the participant’s response (Alaszewski, 2006; Hyers, 2018). Diaries are often used in combination with other methods, such as interviews or document analysis. Using multiple methods can provide several angles to understand complex topics (Baker, 2023; Gibson et al., 2013) and address weaknesses of individual methods, such as the challenge of maintaining participant engagement in diary studies (Baker, 2023; Rudrum et al., 2022). Diary studies can be labor intensive for researchers because they generate significant amounts of data (Rudrum et al., 2022) and for participants, which can bias recruitment towards individuals willing to keep a diary (Alaszewski, 2006). The design of a diary study must therefore consider both the strengths and weaknesses of this method within a particular context.

Connecting Diary Studies and Reflection in Experiential Education Programs

Building on this understanding of diaries, we saw a connection between diary studies and the reflection activities common in experiential education programs. Experience-based learning occurs in many forms ranging from unmediated learning outside formal education to experiences designed and mediated by an instructor within a formal education setting (Andresen et al., 2020; Moon, 2004). In our study, we use the term experiential education programs to refer to the latter type of learning experience specifically within the context of higher education. Common forms of experiential education programs in higher education include study abroad, community-based learning, internships, or research experiences for students. Reflection is a term often used in higher education, but with a variety of definitions (Harvey et al., 2016; Hatton & Smith, 1995; Rogers, 2001). Rogers (2001) synthesizes across educational theorists, describing reflection as: a cognitive activity (with some affective and metacognitive components), involving active engagement, responding to an unusual situation, critically reviewing one’s beliefs, and integrating a new understanding based on one’s experience. Reflection is a component in various learning theories, including Experiential Learning Theory (Kolb, 2015) and Transformational Learning Theory (Mezirow, 1991), in which reflection allows an individual to learn from experience and put that learning into action. Based on these learning theories, many experiential education programs incorporate reflection activities (Ash & Clayton, 2009; Moon, 2004). These activities have been identified as a contributor to student learning in study abroad programs (e.g., Chwialkowska, 2020; Vande Berg & Paige, 2012; Whatley et al., 2021), service learning (e.g., Kiely, 2004), and internships (Bhuwandeep, 2022; Levine et al., 2008). Ash and Clayton (2009) summarize the literature on reflection for learning by saying that, when intentionally designed, reflection can generate, deepen, and document learning.

Given the frequent use of reflection in experiential education programs (Ash & Clayton, 2009; Moon, 2004), we propose diary studies as a promising research approach to enhance our understanding of student learning through experiences. Although Hyers (2018) suggests that diaries fit well in educational research studies, diary studies are underutilized in higher education research (Cao & Henderson, 2021). In the context of experiential education programs, the longitudinal nature of diaries could be particularly helpful in identifying points of struggle and moments of breakthrough (e.g., Turzańska, 2014). Keeping a diary can help students process specific events to develop new understanding (Hyers, 2018). This kind of processing occurs immediately following an event and is thus best captured in the moment. However, many studies in higher education rely on interviews, focus groups, or surveys that occur long after critical events (Cao & Henderson, 2021; Huisman, 2024; McAllister-Grande & Whatley, 2020; Tight, 2013). Diary studies offer an in-the-moment perspective on student learning that much of the current research is lacking. Similarly, the longitudinal perspective can offer insights into student development over an entire program, where early weeks may offer different experiences compared to later weeks (Ward, 2001; Wrobetz et al., 2024). In sum, we saw several connections between diary studies and the reflective activities typically used in experiential education programs, but wondered what diary format would fit best in this context—both to support student learning and to provide insights about student learning for research.

Comparing Qualitative Data Collection Methods

In leading and researching study abroad programs, we have tried several formats for reflection activities (e.g., interviews, written diaries, photographs, video diaries) and noticed differences in the depth of the resulting reflection data. We wanted to characterize those differences to inform our choice of reflection methods for future programs and research studies. We found few comparisons of reflection activities in the higher education research literature; however, prior studies have compared diary methods to other forms of data collection (Baker, 2023; Danielsson & Berge, 2020; Fitt, 2018; Litovuo et al., 2019; McDonnell et al., 2017). Danielsson and Berge (2020) identify affordances of video diaries compared to interviews or observations, including: lack of interference from researchers, generation of a “thick” data set, and supporting deep conversations in follow-up interviews. Similarly, Fitt (2018) found that audio diaries supported participant expression and reduced participants’ tendency to self-edit compared to written diaries. McDonnell et al. (2017) use the metaphor of camera lenses to describe the differences in data from interviews (wide-angle lens) compared to diaries (telephoto lens). Monrouxe (2009) describes how participants include mundane events in audio diaries which are often left out in other formats. Diaries can also better capture sense-making processes compared to interviews, where memory decay may result in a focus on the final outcome rather than the process (Monrouxe, 2009; Olorunfemi, 2024). Though these studies touch on some aspects of interest to our study, they do not fully describe the differences we observed in our data sets, which we eventually conceptualized as differences in data richness.

The quality of qualitative data is often described as data richness (Charmaz, 2003; Maxwell, 2013; Ogden & Cornwell, 2010). Data richness has been variously described as when participants: disclose feelings and views that are inaccessible in ordinary discourse (Charmaz, 2003), go beyond the “surface” of a topic to offer a thorough perspective (Braun & Clarke, 2013, p. 34), or respond with sufficient detail to provide a complete picture of a situation (Maxwell, 2013). These definitions have been operationalized in various ways in prior studies comparing data richness across qualitative methods (e.g., Gothberg et al., 2013; Ogden & Cornwell, 2010; Roberts et al., 2025). For example, Gothberg et al. (2013) in studying focus group data compared participant interactions, breadth of topics, disclosure of sensitive information, adherence to the topic, and depth of conversation. Ogden and Cornwell (2010) in analyzing interview responses considered length, degree of description (e.g., emotions, quantifiers, adverbs), level of personal content, level of analysis, and descriptions of behavior. Much of the prior work exploring the question of data richness has compared in-person and remote data collection approaches. Roberts et al. (2025) summarize these comparisons in their scoping review, finding that there is some evidence of greater depth for in-person data collection, but that depth is not always clearly defined or operationalized in such studies. Taken together, these studies highlight a lack of clarity in defining and measuring data richness in qualitative research.

Building on these examples, we suggest that data richness may look different in different study contexts. For example, we were interested in understanding student learning during study abroad programs. In this context, “adherence to the topic” and “breadth of topics” are less important because we are open to the different learning experiences and topics students may bring up based on their experiences. Similarly, because we focus on individual reflections, we do not consider “participant interactions” as one might in a focus group. On the other hand, we are interested in the level of reflection present in the data, because of the essential role that reflection plays in the experiential learning process. For this study, we operationalize data richness by considering two aspects: depth of content and level of reflection. By synthesizing the definitions and literature discussed in this section, we define depth of content as detail provided about an event or experience, the individual, their emotions, their analysis, and their behaviors. We will define level of reflection as we describe the conceptual framework in the next section.

Research Questions

Based on our literature review, we identified the following research questions to guide our comparison of our three qualitative data sets:

• RQ1: How do the three qualitative data sets compare in their depth of content?

• RQ2: How do the three qualitative data sets compare to other common language types in their depth of content?

• RQ3: How do the three qualitative data sets compare in their level of reflection?

Conceptual Framework

In this section, we propose linguistic inquiry as a method that can offer a new perspective on the differences in data collected across qualitative methods. We then present our approach for measuring the two aspects of data richness in our data sets.

Overview of Linguistic Inquiry

People choose different words depending on their environment, audience, and mode of communication. These words can be analyzed to compare differences between individuals or explore one person’s state of mind (Koutsoumpis et al., 2022). Earlier qualitative studies utilized content analysis of written documents (Allport, 1947) and verbal language (Gottschalk & Gleser, 1969; Weintraub, 1989) to draw conclusions about psychological states. The Linguistic Inquiry and Word Count (LIWC) tool was developed to bring computing power to linguistic analysis (Boyd et al., 2022). This application categorizes each word in a text according to its function and meaning and calculates scores based on the relative frequency of particular words. The LIWC has been updated four times since it was first released in 1992, with expanded dictionaries that reflect changing use of language. LIWC-22 is the latest version, containing updated dictionaries based on a corpus of text ranging from speeches to e-mails to Reddit posts (Boyd et al., 2022). Prior studies using the LIWC have analyzed the impact of written disclosures on individuals’ health (Francis & Pennebaker, 1992), the correlation between college admission essays and academic performance (Pennebaker et al., 2014), and markers of intercultural competence in blogs written during study abroad (Hanegreefs et al., 2023). We chose to use the LIWC because the tool is widely used, consistently updated, and the dictionaries have been validated by a team of researchers across several decades (Boyd et al., 2022). The LIWC also publishes statistics on the test corpus data used to train the model (Boyd et al., 2022), which allowed us to compare our data to common forms of language. This level of transparency is not offered by alternative tools (e.g., MAXQDA or ATLAS.ti).

The developers of the LIWC explain that context influences the words that people use (Boyd et al., 2022), and therefore it is important to select particular variables for analysis based on the research questions and context of a study rather than reviewing all the categories offered through the tool. This approach aligned with our goal of evaluating specific aspects of data richness, which we sought to operationalize through the categories offered in the LIWC-22 dictionary. The dictionary offers three types of categories. First, there are linguistic categories, which represent linguistic constructs such as pronouns, prepositions, or conjunctions. Second, psychological process categories (e.g., cognition, affect, and social processes) have been developed and revised over different versions of the LIWC through psychometric evaluation to determine which words in the text corpus correlate positively with these categories. A group of judges has reviewed candidate words to add or remove from these categories in each revision of the LIWC (Boyd et al., 2022). Both the linguistic and the psychological process categories are calculated as the percentage of words in a text that are included in the corresponding LIWC dictionary category. Third, the LIWC includes four summary variables that have been developed through previous research to capture more complex constructs: analytical thinking, clout, authenticity, and emotional tone (Cohn et al., 2004; Kacewicz et al., 2014; Newman et al., 2003; Pennebaker et al., 2014). The calculations for these variables are not published, but they have been updated in LIWC-22 to reflect language changes, and the metrics are reported as percentiles based on standardized scores from comparison corpora (Boyd et al., 2022). In the next section, we present our conceptual framing and how that informed the LIWC categories we included in our analysis.
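The percentage-based scoring described above for the linguistic and psychological process categories can be illustrated with a minimal sketch. It assumes a simple regular-expression tokenizer and two toy word lists built from the example words in Table 1; the names `TOY_DICTIONARY`, `tokenize`, and `category_percentages` are hypothetical, and the sketch is not the LIWC-22 software or its dictionary. The summary variables are not covered because their calculations are not published.

```python
import re

# Hypothetical toy word lists for illustration only; NOT the LIWC-22 dictionary.
TOY_DICTIONARY = {
    "affect": {"happy", "hate", "love"},
    "first_person": {"i", "we"},
}


def tokenize(text):
    """Lowercase the text and split it into word tokens (an assumed, simplified tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def category_percentages(text, dictionary=TOY_DICTIONARY):
    """Return, for each category, the percentage of words in the text found in that category."""
    words = tokenize(text)
    if not words:
        # Avoid division by zero for empty texts.
        return {name: 0.0 for name in dictionary}
    return {
        name: 100.0 * sum(word in vocab for word in words) / len(words)
        for name, vocab in dictionary.items()
    }


sample = "I love the food here and we hate leaving"
scores = category_percentages(sample)
print(scores)
```

Because each score is a share of the total word count, texts of different lengths can be placed on the same scale.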

Operationalizing Data Richness Using the LIWC

The first aspect of data richness that we explored is depth of content. We define this aspect as detail provided about an event or experience, the individual, their emotions, their analysis, and their behaviors. We developed this definition by drawing on previous studies of data richness, several of which also used LIWC. We identified five conceptual dimensions that we used to operationalize the depth of content (see Table 1). Ogden and Cornwell (2010) used the LIWC to study five dimensions of richness in interview data (length, descriptive, personal, analytical, and action) and researchers have adopted these dimensions in other studies (e.g., Abrams et al., 2015; Flynn et al., 2018). Others have chosen different LIWC variables to analyze data richness. For example, Gothberg et al. (2013) analyzed depth of conversation and defined related language categories as prepositional phrases, conjunctions, exclusions, and cognitive mechanical words. Because these prior studies used older versions of LIWC (with fewer language categories), we chose to adopt similar conceptual dimensions but update the LIWC categories. We based our dimensions on the Ogden and Cornwell (2010) study, with two changes. First, we split the descriptive dimension into descriptive and emotional depth because we hypothesized that the levels of description and emotion might vary differently across our data sources. Second, we did not include length as a dimension in our analysis because the data collection methods we compared do not result in similar word counts and the studies we are drawing from were not designed to compare on this metric. Our final conceptual dimensions, their definitions, and the aligned LIWC categories are shown in Table 1.

Table 1.

Operationalizing Depth of Content Aspect of Data Richness

Conceptual dimensions	Description	LIWC categories	Example words
Descriptive Depth	Detail in descriptions of experiences	Quantities, adverbs, adjectives, perception*	some, more, hard, feel
Emotional Depth	Frequency of emotion words	Affect*	happy, hate, love
Personal Depth	Frequency of references to self and authentic language	First person pronouns, Authenticity⁺	I, we
Analytical Depth	Measure of logical, formal, or hierarchical language	Analytic⁺	Not Available
Behavioral Depth	Frequency of action words and discussion of interactions among people	Verbs, adverbs, social behavior*	is, just, care

* indicates psychological process categories; ⁺ indicates summary variables.
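The mapping in Table 1 can also be written as a small lookup structure, which may help when organizing category scores by conceptual dimension. The sketch below is illustrative only: the category labels are plain descriptive strings rather than the column names of an actual LIWC-22 export, the function name `group_scores_by_dimension` is hypothetical, and the example scores are made up. Scores are regrouped but not combined, since no aggregation rule is stated in the text.

```python
# Conceptual dimensions mapped to the LIWC categories listed in Table 1.
# Labels are descriptive strings chosen for this sketch, NOT LIWC-22 export column names.
DEPTH_OF_CONTENT = {
    "Descriptive Depth": ["quantities", "adverbs", "adjectives", "perception"],
    "Emotional Depth": ["affect"],
    "Personal Depth": ["first person pronouns", "authenticity"],
    "Analytical Depth": ["analytic"],
    "Behavioral Depth": ["verbs", "adverbs", "social behavior"],
}


def group_scores_by_dimension(category_scores, mapping=DEPTH_OF_CONTENT):
    """Arrange per-category scores under each conceptual dimension.

    Scores are only regrouped, never combined; categories absent from the input are skipped.
    """
    return {
        dimension: {c: category_scores[c] for c in categories if c in category_scores}
        for dimension, categories in mapping.items()
    }


# Made-up scores for a single text, for illustration only.
example_scores = {"affect": 4.2, "adverbs": 6.1, "verbs": 17.5, "analytic": 35.0}
grouped = group_scores_by_dimension(example_scores)
print(grouped["Behavioral Depth"])
```

Note that adverbs appear under both Descriptive Depth and Behavioral Depth in Table 1, so the same score is listed under both dimensions.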

The second aspect of data richness in our study we called level of reflection. Using the LIWC, Savicki and Price (2015, 2017, 2021, 2022) have proposed and tested a framework linking language types to components of reflection (shown in Table 2). Their framework builds on prior LIWC research connecting language usage with psychological processes (Tausczik & Pennebaker, 2010), in particular, the idea that language can represent internal thought processes. Savicki and Price (2015, 2017) argue that operationalizing cognitive processes by considering patterns of language use can offer a more concrete way to measure level of reflection than typical approaches (i.e., rubrics) which require training, calibration, and significant time investment. In line with this goal, they identified characteristics of reflection from the literature, stating that reflection: 1) requires a shifted perspective, not just rumination; 2) describes disaggregated events and avoids overgeneralization; 3) is contextual rather than assuming events are universal and unchangeable; 4) is integrative, linking cognitive, affective, and behavioral aspects (Savicki & Price, 2015).

Table 2.

Operationalizing Level of Reflection Aspect of Data Richness

Components of reflection	LIWC cognitive complexity factors	LIWC definition	Example words
Contextual	Interaction	Use of past tense and description of social interactions.	Group, person, past tense –ed verbs
Integrative	Immediacy	Writing or speaking simply, in the present tense, and about oneself.	I, me, present tense –ing verbs
Disaggregated/Contextual	Making Distinctions	Observing differences or exclusions from groups and being open to new information.	Either, except, rather, cannot
Shifted Perspective	Making Sense	Making causal statements and sharing insight about events.	Accept, insight, realize, imply

In subsequent studies, Savicki and Price (2017, 2021, 2022) demonstrated how these characteristics of reflection align with four variables from the LIWC literature which they name cognitive complexity factors (shown in Table 2). These variables were identified by Pennebaker and King (1999) through factor analysis to group dimensions of language developed in earlier studies. These factors integrate cognitive and affective language and, given this higher order language usage, are indicative of cognitive complexity. Savicki and Price (2022) suggest the following connections between the cognitive complexity factors and level of reflection. Immediacy can indicate reflection that includes an individual’s beliefs, assumptions, and expectations. Making Distinctions may demonstrate processing differences in culture or worldview. Interaction indicates less introspection and more collaborative processing, but describing one’s response to social interactions is often central to making meaning. Making Sense may indicate shifted perspectives and accommodation of new worldviews (Savicki & Price, 2022). Savicki and Price have provided evidence of construct validity (Savicki & Price, 2021) and criterion validity (Savicki & Price, 2022) for using these cognitive complexity factors to measure reflection.

Methods

We analyzed data from three studies of student learning in study abroad programs that used different methods of data collection: written diaries (Study 1), interviews (Study 2), and video diaries (Study 3). These studies were approved by the Purdue University Institutional Review Board, and all participants signed consent forms before participating in the studies. We will first describe our positionality and discuss why we believe it is reasonable to compare the data from these studies. We will then explain the data collection process for each study and the data analysis approaches we used to compare the data across studies.

Positionality

The authors of this paper are both white women educated in engineering as undergraduates and educational research in our advanced degrees. Author 1 designed all three research studies included in this analysis, personally collected and analyzed the data for Studies 1 and 2, and collaborated on the data collection and analysis for Study 3 with Author 2. Author 2 supported data collection and analysis for Study 3 and previously analyzed data from Study 2. We developed these studies from an interpretivist perspective, in which we sought to understand student experiences and the meaning they drew from those experiences. In the current study, we are conducting a secondary data analysis of these data sets in which we take a post-positivist perspective through our use of content analysis with pre-defined categories (Neuendorf, 2017), in alignment with our research questions. Both authors were involved in the data analysis for this study (discussed further below).

Comparability of These Data Sets

A potential challenge in addressing our research questions is that our data sets were not collected from the same participants. Nevertheless, we believe that a comparison is worthwhile because 1) an ideal study using three data collection methods with the same participants is impracticable, and 2) the three studies were similar in most other aspects. First, as mentioned earlier, all studies were designed and carried out by the same researcher (Author 1) based on the same research objective and epistemological framing. Second, each study used the Critical Incident Technique (CIT) to collect stories from students about their experiences abroad. CIT asks participants to tell a story about a specific event, including what happened, how they responded to the situation, and what they learned from this experience (Douglas et al., 2009). Third, the participants for all studies were engineering students at universities in the United States, and most participants were in the 18-24 age range (some postgraduate students in Study 2 were slightly older). In the following sections, we will provide details about the unique aspects of each study, which can inform our interpretation of the findings. However, based on these commonalities, we believe that a comparison of the data sets can provide useful insights for researchers and educators considering these different methods. We acknowledge that the comparison of only three studies is a limitation; however, by focusing on fewer studies we were able to limit the differences across studies, which is essential to addressing our research questions.

Study 1 – Written Diaries

The participants for Study 1 were undergraduate engineering students in their second semester at one university who completed a two-week study abroad program in various locations. Additional participant and program characteristics are included in Appendix A. As a requirement of this program, the students kept written diaries in which they reflected on their experience at multiple points during the program in response to reflection prompts. The reflection prompts asked students to discuss key events from their program and provided several reflection questions to help the students process these events. Because the reflections were required for the program, no incentives were provided. Author 1 selected 29 written diaries from across the different program locations based on students’ responses on a pre-post administration of the Global Perspectives Inventory (GPI; Research Institute for Studies in Education, 2017). She aimed for variation in the sample by selecting written diaries from some students whose GPI scores increased, some whose scores decreased, and some whose scores stayed the same between the pre- and post-surveys. The average length of the written diaries was 6040 words, with a total of 175,157 words across all the written diaries included in our analysis. Previous analysis of the written diaries for student learning outcomes can be found in Davis and Knight (2021). We could not include additional written diaries in this study because of data availability (we had access to these diaries because of their use in the prior study).

Study 2 – Interviews

The participants for Study 2 were current or recently graduated undergraduate (72%) and postgraduate (28%) engineering students who had completed different types of study abroad programs. Author 1 conducted interviews with 24 participants from short-term study tours, 9 from short-term classes abroad, 35 from research/internships abroad, and 10 from semester abroad programs (79 total participants). Additional participant and program characteristics are included in Appendix B. All interviews took place after the students returned home from their programs and asked students to describe their program, motivations for participating, take-aways for their future, and connections between engineering and culture. A large portion of each interview focused on one CIT question, where Author 1 asked students to provide two critical incidents from their time abroad. The specific prompt was: “Talk about two specific experiences that were significant to you during your time in [country name]. For these examples, I’d like you to think of a time where you felt that you learned something important (and this could be any kind of learning, about research, culture, travel, yourself, etc.).” Author 1 used follow-up questions to help students expand on their stories, based on previous CIT studies (Bott & Tourish, 2016; Hess et al., 2017; Walther et al., 2011). The participants in this study did not receive incentives. Through analysis of the interview transcripts, Author 1 identified 173 critical incidents, which she analyzed for student learning processes (Davis & Knight, 2023, 2025). For our current analysis, we removed all words spoken by the interviewer, after which the average length of the interviews was 4299 words, with a total of 335,317 words across all the interviews.

Study 3 – Video Diaries

The participants for Study 3 were ten undergraduate engineering students completing semesters abroad in different locations. Additional participant and program characteristics are included in Appendix C. These students recorded video diaries from their phones over a 15-week period. We used a phone application to prompt students each week to submit a reflection. We included a prompt similar to the one for Study 2 and asked students to make a video about 5-8 minutes in length. Most of the video diaries were 8 minutes or more and sometimes participants recorded a second video in the same week. Some students recorded videos every week, while others primarily recorded videos in the earlier weeks of their travel. The students in this study received $5 for each video and an additional $25 if they completed 80% of the requested reflections. The reflections were transcribed into a single document for each student, which was provided to them as an additional incentive for participation. The average length of the video diaries for one participant across the 15-week period was 9084 words, with a total of 90,840 words across all the video diaries we included in our analysis. Our previous analysis of student learning based on this data set is reported in Wrobetz et al. (2024).

Data Analysis

The authors developed and implemented the analysis plan together through regular meetings and discussions of the prior literature. Author 2 then imported data from each of the studies above into LIWC-22 and ran a text analysis based on the most current dictionaries. She analyzed each student’s transcript or written diary individually and averaged scores for each data collection method. To answer RQ1, we compared the depth of content dimensions shown in Table 1 by adding the relevant LIWC category scores. Author 2 calculated a z score for each dimension, normalizing to a mean of 0 and a standard deviation of 1. Any z scores above 0 represent a positive deviation from the mean score, and z scores below 0 represent lower-than-average results. Author 2 then used a one-way Welch’s ANOVA to test for statistically significant variations between the data collection methods. She ran a pairwise Games-Howell post-hoc comparison to determine which of the data sets differed and calculated the effect size using Hedges g. Each of these methods was chosen to accommodate our data sets, which have widely different sample sizes and variances (Field et al., 2012). Finally, the authors worked together to identify quotes from each data set that demonstrate the depth of content differences identified through the LIWC analysis.
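As a rough illustration of the scoring and standardization steps described above, the following Python sketch adds category scores into one dimension score per participant, converts those scores to z scores, and averages them by data collection method. The category names (cat_a, cat_b), the participant values, and the choice to standardize across all participants pooled together are assumptions made for illustration only; they are not taken from our data or from Table 1.

```python
from statistics import mean, pstdev

# Hypothetical per-participant LIWC output (percent of words per category).
# "cat_a" and "cat_b" are placeholders, not real LIWC-22 category names.
participants = [
    {"method": "interview", "cat_a": 10.0, "cat_b": 5.0},
    {"method": "interview", "cat_a": 12.0, "cat_b": 6.0},
    {"method": "video", "cat_a": 14.0, "cat_b": 7.0},
    {"method": "written", "cat_a": 8.0, "cat_b": 3.0},
    {"method": "written", "cat_a": 9.0, "cat_b": 4.0},
]


def dimension_score(row, categories):
    """Add the relevant category scores to get one dimension score."""
    return sum(row[c] for c in categories)


def z_scores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


raw = [dimension_score(p, ["cat_a", "cat_b"]) for p in participants]
z = z_scores(raw)

# Average the z scores within each data collection method.
by_method = {}
for p, score in zip(participants, z):
    by_method.setdefault(p["method"], []).append(score)
method_means = {m: mean(v) for m, v in by_method.items()}

for method, value in method_means.items():
    print(f"{method}: {value:+.2f}")
```

A method mean above 0 indicates that the method scored higher than the pooled average on that dimension, which is how a figure of z scores by method would be read.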

To answer RQ2, both authors compared the depth of content scores for each data collection method with scores from related communication methods in the LIWC-22 Test Kitchen Corpus. In a team discussion, we chose representative communication modes from the Test Kitchen Corpus (see Table 3) and noted variations of at least one standard deviation between our data and the Test Kitchen Corpus data. Our goal in this analysis was to provide additional evidence of the depth of content variations across our data sets. We did not apply statistical analysis in response to RQ2 because we did not have sufficient information about the data in the Test Kitchen Corpus to run the relevant tests (e.g., Hedges g).

Table 3.

Comparison Communication Methods From the LIWC Test Kitchen Corpus

Data collection method	Corpus comparison 1	Corpus comparison 2
Written Diaries	Blogs	Stream-of-consciousness essays
Interviews	Conversations	Speeches
Video Diaries	Conversations	Speeches

To answer RQ3, Author 2 ran our data through the 2007 LIWC dictionaries, which contain the categories used by Savicki and Price (2017) to analyze reflective processes. She then calculated the cognitive complexity factors defined in Pennebaker and King (1999) by multiplying each category score by its rotated factor loading and summing the resulting values. Author 2 converted the factor scores to z scores, performed a Welch’s ANOVA and Games-Howell test, and calculated the Hedges g to determine effect size, following the same approach as we used to answer RQ1. Finally, both authors compared the z scores, ANOVA results, and effect sizes to demonstrate differences in level of reflection across data collection methods.
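The factor score calculation described above amounts to a weighted sum, which can be sketched as follows. The category names and loading values here are hypothetical placeholders chosen only to show the arithmetic; the actual categories and rotated loadings are those reported by Pennebaker and King (1999) and are not reproduced here.

```python
# Hypothetical loadings for a single factor. Both the category names and the
# numbers are illustrative assumptions, not values from Pennebaker and King.
loadings = {"first_person": 0.8, "long_words": -0.7, "present_tense": 0.6}


def factor_score(liwc_row, factor_loadings):
    """Multiply each category score by its loading and sum the products."""
    return sum(liwc_row[cat] * weight for cat, weight in factor_loadings.items())


# Two made-up participants: one spoken-style text, one written-style text.
spoken = {"first_person": 6.0, "long_words": 15.0, "present_tense": 8.0}
written = {"first_person": 3.0, "long_words": 22.0, "present_tense": 5.0}

score_spoken = factor_score(spoken, loadings)
score_written = factor_score(written, loadings)
print(round(score_spoken, 2), round(score_written, 2))
```

Because some loadings are negative, a factor score can fall below zero even when every category score is positive, which is why negative means can appear for a factor.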

Results

RQ1: How do the Three Qualitative Data sets Compare to Each Other in Their Depth of Content?

To address RQ1, we compared interviews, written diaries, and video diaries on five dimensions of depth of content described earlier (Table 1). We report the mean and standard deviation for each method in Table 4. Because the scores for each dimension are calculated by adding together multiple LIWC variables shown in Table 1, we cannot interpret the values as percentages or compare these values across dimensions. However, we can compare the raw scores across the different data collection methods, and we show z scores in Figure 1 to allow for comparison across dimensions.

Table 4.

Depth of Content Scores for Each Data Collection Method

	Interviews	Video diaries	Written diaries
Descriptive Depth	27.04 (2.22)	29.11 (2.23)	28.58 (1.73)
Emotional Depth	2.91 (0.55)	3.03 (0.98)	4.07 (0.46)
Personal Depth	88.26 (10.57)	92.98 (8.89)	79.15 (10.57)
Analytical Depth	15.86 (8.27)	17.48 (6.04)	56.40 (13.52)
Behavioral Depth	30.62 (2.22)	32.29 (1.02)	26.42 (2.46)

Note. Mean and Standard Deviation (in parentheses) reported.

Figure 1.

Comparing depth of content dimensions across data collection methods (z scores)

Based on the z scores, we can see that the interviews and video diaries in our data sets are more similar in their depth of content compared to written diaries. The written diaries tend to have more analytical and emotional depth, meaning that the students used more emotion words, and were more likely to use formal or hierarchical language. In contrast, in the interviews and video diaries, students demonstrated more personal depth, meaning they referred to themselves more often and demonstrated authenticity in their responses. Similarly, these methods elicited more behavioral depth, including more action words and more descriptions of interactions among people.

To explore the statistical significance of these differences, we conducted Welch’s ANOVA and found omega squared (effect size) for each dimension of depth of content. The results are shown in Table 5. Effect sizes are interpreted as: small > 0.01, medium > 0.06, and large > 0.14 (Field et al., 2012).

Table 5.

Results of Welch’s ANOVA Comparing Depth of Content Across Methods

	F	df	p-value	Effect size
Descriptive Depth	8.66	23.64	.002**	.230 (large)
Emotional Depth	51.94	21.65	<.001***	.683 (large)
Personal Depth	10.13	24.48	<.001***	.256 (large)
Analytical Depth	94.87	25.18	<.001***	.775 (large)
Behavioral Depth	51.04	33.37	<.001***	.586 (large)

Note. * p< .05, ** p< .01, *** p< .001.
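For readers who want to see how a Welch’s ANOVA statistic is assembled, the sketch below computes F and its degrees of freedom from group sizes, means, and standard deviations using the standard Welch formulas. It is an illustration rather than the procedure used in our analysis, which worked from participant-level scores. The inputs are the rounded Descriptive Depth values from Table 4 with group sizes taken from the study descriptions (79, 10, and 29), so the output should be expected to land near, not exactly on, the corresponding row of Table 5. No p-value is computed because that would require an F-distribution routine from outside the Python standard library.

```python
def welch_anova(ns, means, sds):
    """Welch's one-way ANOVA from summary statistics.

    Returns (F, df1, df2) for groups with unequal sizes and variances.
    """
    k = len(ns)
    # Each group is weighted by n / variance.
    weights = [n / sd ** 2 for n, sd in zip(ns, sds)]
    total_w = sum(weights)
    grand = sum(w * m for w, m in zip(weights, means)) / total_w
    between = sum(w * (m - grand) ** 2 for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / total_w) ** 2 / (n - 1) for w, n in zip(weights, ns))
    f_stat = between / (1 + 2 * (k - 2) * lam / (k ** 2 - 1))
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, k - 1, df2


# Descriptive Depth: interviews, video diaries, written diaries (Table 4).
f_stat, df1, df2 = welch_anova(
    ns=[79, 10, 29],
    means=[27.04, 29.11, 28.58],
    sds=[2.22, 2.23, 1.73],
)
print(f"F = {f_stat:.2f}, df1 = {df1}, df2 = {df2:.2f}")
```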

Depth of content differs significantly among the data collection methods with a large effect size for each dimension. To determine which specific data collection methods differed, we present the pairwise comparison of data collection methods using the Games-Howell test and Hedges g (effect size) in Table 6. Effect sizes are interpreted as: small > 0.2, medium > 0.5, and large > 0.8 (Field et al., 2012). Negative effect sizes indicate that the data collection method listed first has a lower mean score compared to the method listed second.

Table 6.

Pairwise Comparison of Depth of Content Across Data Collection Methods

	Data collection methods	p-value	Effect size
Descriptive Depth	Interviews-Video Diaries	.044*	-0.92 (large)
	Interviews-Written Diaries	.001**	-0.71 (medium)
	Video Diaries-Written Diaries	.750	0.30 (small)
Emotional Depth	Interviews-Video Diaries	.929	-0.17 (small)
	Interviews-Written Diaries	<.001***	-2.09 (large)
	Video Diaries-Written Diaries	.024*	-1.55 (large)
Personal Depth	Interviews-Video Diaries	.304	-0.45 (small)
	Interviews-Written Diaries	<.001***	0.84 (large)
	Video Diaries-Written Diaries	.005**	1.32 (large)
Analytical Depth	Interviews-Video Diaries	.733	-0.20 (small)
	Interviews-Written Diaries	<.001***	-3.81 (large)
	Video Diaries-Written Diaries	<.001***	-2.87 (large)
Behavioral Depth	Interviews-Video Diaries	.001**	-0.78 (medium)
	Interviews-Written Diaries	<.001***	1.77 (large)
	Video Diaries-Written Diaries	<.001***	2.53 (large)

Note. * p< .05, ** p< .01, *** p< .001; Mean and standard deviation are shown in Table 4.
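To make the effect size calculation concrete, the following sketch computes Hedges g from summary statistics and applies the interpretation thresholds stated above. It assumes the common pooled-standard-deviation form of Hedges g with the approximate small-sample correction 1 - 3/(4N - 9); we do not claim this is the exact variant produced by the software used in our analysis. The example uses the rounded Descriptive Depth values for interviews and video diaries from Table 4, with group sizes of 79 and 10 taken from the study descriptions.

```python
import math


def hedges_g(n1, mean1, sd1, n2, mean2, sd2):
    """Hedges g from summary statistics (pooled SD, small-sample correction)."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    d = (mean1 - mean2) / math.sqrt(pooled_var)
    correction = 1 - 3 / (4 * (n1 + n2) - 9)  # common approximation
    return d * correction


def label(g):
    """Apply the thresholds small > 0.2, medium > 0.5, large > 0.8."""
    size = abs(g)
    if size > 0.8:
        return "large"
    if size > 0.5:
        return "medium"
    if size > 0.2:
        return "small"
    return "below small"


# Descriptive Depth, interviews (first) versus video diaries (second).
g = hedges_g(79, 27.04, 2.22, 10, 29.11, 2.23)
print(f"g = {g:.2f} ({label(g)})")
```

The negative sign follows the convention in Table 6: the method listed first (interviews) has the lower mean. Values recomputed this way from rounded means and standard deviations may differ slightly from the tabled values for other pairs.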

Similar to the z scores, these findings indicate that the interviews and video diaries in our data sets are similar, lacking significant differences in the dimensions of emotional, personal, and analytical depth. The written and video diaries are similar in descriptive depth and differ significantly from the interviews in this dimension. The most notable differences are between the written diaries and the other two methods, where written diaries score significantly higher (with large effect sizes) for emotional and analytical depth and significantly lower (with large effect sizes) for personal and behavioral depth. These results confirm the findings from the z scores.

At a high level, these scores capture the “reporting” tone that is often found in written diaries, with lots of description and analysis, but with the author more removed from the content. For example, one student wrote about their time in Chile:

“The next day began with a tour of the Universidad Catolica, which I believe was said to be the best engineering school in Chile. It was gorgeous…The entire university had earthquake-resistant supports underneath each building. They were called isolators and made it so when the ground moved, the buildings did not.”

This quote focuses on describing facts about the campus but provides little personal content. In contrast, the students in our study spoke more authentically in video diaries and interviews and told stories with themselves as the main characters interacting with others. In a video diary, one student described a trip to the airport this way:

“I was a little confused because I never really had an issue going through security with food. So, I looked at him and he was like, ‘No,’ he’s like, ‘only liquid, only liquid.’ So, I understand that he’s asking me to throw away my food because I thought he meant I couldn’t have food unless it was liquid…Then I get my food, and I throw it away in front of him. And he didn’t say anything either. He watched me throw away my food and he didn’t say anything.”

Similarly, in our interviews, students spent more time describing their own interpretations of events and interactions among people. One student realized their own lack of connection to their environment:

“I think I just got a better appreciation for how people can interact with the wider region around them. We typically just drive places; we don’t really walk everywhere. I think you miss a lot when you just drive. Or if you’re on a highway, and there’s the noise barriers, and you can’t even see what’s next to the highway. We don’t have a really good sense for what’s around us.”

These quotes demonstrate the more formal and descriptive language that is typical in our written diaries as compared to the more personal and active narratives that we saw in our interviews and video diaries. The differences between the interviews and video diaries are more subtle, and perhaps most noticeable in the greater descriptive depth in the videos, where the students were closer to their experiences rather than reflecting back over a longer period.

RQ2: How do the Three Qualitative Data Sets Compare to Other Common Language Types in Their Depth of Content?

We compared each data set to similar language types that are available within the LIWC Test Kitchen Corpus to gain a different perspective on the depth of content in our data sets. We show these results in Table 7 (written diaries), Table 8 (interviews), and Table 9 (video diaries). We use asterisks and bold text to indicate cases where the mean for the comparison language type is within a single standard deviation of the mean for our data set, to indicate the dimensions in which the comparison language types were most similar to our data set.

Table 7.

Comparing Our Written Diaries to Similar Language Types

	Written diaries	Blogs	Stream-of-consciousness essays
	Mean (SD)	Mean	Mean
Descriptive Depth	28.63 (1.73)	24.71	32.21
Emotional Depth	4.09 (0.46)	5.54	5.44
Personal Depth	79.50 (10.55)	75.27*	98.94
Analytical Depth	56.69 (13.37)	38.70	14.05
Behavioral Depth	26.41 (2.42)	27.59*	32.90

Asterisks and bold text indicate cases where the mean for the comparison language type is within a single standard deviation of the mean for our data set

Table 8.

Comparing Our Interviews to Similar Language Types

	Interviews	Conversations	Speeches
	Mean (SD)	Mean	Mean
Descriptive Depth	27.04 (2.22)	26.96*	21.73
Emotional Depth	2.91 (0.56)	5.49	4.27
Personal Depth	88.26 (10.57)	77.80*	34.16
Analytical Depth	15.86 (8.27)	9.46*	75.93
Behavioral Depth	30.62 (2.22)	32.38*	21.36

Asterisks and bold text indicate cases where the mean for the comparison language type is within a single standard deviation of the mean for our data set

Table 9.

Comparing Our Video Diaries to Similar Language Types

	Video diaries	Conversations	Speeches
	Mean (SD)	Mean	Mean
Descriptive Depth	29.11 (2.12)	26.96	21.73
Emotional Depth	3.03 (0.93)	5.49	4.27
Personal Depth	92.98 (8.43)	77.80	34.16
Analytical Depth	17.48 (5.73)	9.46	75.93
Behavioral Depth	32.29 (0.97)	32.38*	21.36

Asterisks and bold text indicate cases where the mean for the comparison language type is within a single standard deviation of the mean for our data set
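The rule behind the asterisks in Tables 7 to 9 can be sketched in a few lines: a comparison mean is flagged when it lies within one standard deviation of the mean for our data set. The example below applies that rule to the interview and Conversations values from Table 8; the function and variable names are our own illustrative choices.

```python
def within_one_sd(our_mean, our_sd, corpus_mean):
    """True when the corpus mean is within one SD of our data set's mean."""
    return abs(corpus_mean - our_mean) <= our_sd


# (interview mean, interview SD, Conversations mean) from Table 8.
interviews_vs_conversations = {
    "Descriptive Depth": (27.04, 2.22, 26.96),
    "Emotional Depth": (2.91, 0.56, 5.49),
    "Personal Depth": (88.26, 10.57, 77.80),
    "Analytical Depth": (15.86, 8.27, 9.46),
    "Behavioral Depth": (30.62, 2.22, 32.38),
}

flagged = [
    dimension
    for dimension, (m, sd, corpus) in interviews_vs_conversations.items()
    if within_one_sd(m, sd, corpus)
]
print(flagged)
```

Applied to these values, the rule flags every dimension except Emotional Depth, matching the asterisks in the Conversations column of Table 8.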

The written diaries in our study are most similar to blogs, falling within one standard deviation of the mean in both the personal and behavioral depth categories. These written diaries differ from stream-of-consciousness essays by at least one standard deviation in all depth of content categories. In particular, our written diaries show less personal depth and higher analytical depth than stream-of-consciousness essays, indicating more self-editing and greater focus on interpreting situations rather than just describing them.

The interviews in our study are more similar to conversations than speeches, which makes sense as they are not scripted and rehearsed. In terms of depth of content, conversations fell within one standard deviation of the mean for our interview data set on four out of the five dimensions. However, it is interesting to note that our interviews have lower emotional depth than both the conversations and the speeches in the LIWC Test Kitchen Corpus. Compared to speeches, our interviews also have much higher personal depth and much lower analytical depth. This pattern suggests that our interviews are less formal and more personal than speeches, which may indicate that a positive rapport was established between the interviewer and the participants. However, the lack of analytical depth in our interviews could suggest a weakness for studies where understanding a participant’s interpretation of a situation is important. It is interesting to note that the depth of content in the speeches is closer to our written diaries than to the interviews.

The video diaries in our study are also more similar to conversations than speeches, but they differ from conversations more than the interview data. This finding suggests that, although the video diary method could offer students a chance to prepare their answers (compared to interviews), the students in our study did not seem to take this approach, as the resulting data set does not look like speeches. At the same time, our video diaries are noticeably different from the conversations as well, particularly in terms of personal and analytical depth. Overall, our video diary data seems to capture a somewhat more analytical response than the interviews while also being more personally and behaviorally deep than the written diaries.

RQ3: How do the Three Qualitative Data Sets Compare to Each Other in Their Level of Reflection?

To address RQ3, we compared the data from the interviews, written diaries, and video diaries on the four dimensions of level of reflection described earlier (Table 2). Similar to our approach for RQ1, we calculated the scores for each interview, video diary, and written diary, then we found the mean and standard deviation for each of the data collection methods. We report these statistics in Table 10. We then calculated z scores so that we could compare across dimensions to see which ones demonstrate greater differences (shown in Figure 2).

Table 10.

Level of Reflection Scores for Each Data Collection Method

	Interviews	Video diaries	Written diaries
Immediacy	-2.38 (2.79)	-2.13 (1.99)	-11.5 (2.15)
Making Distinctions	5.16 (1.41)	4.92 (1.37)	1.60 (1.14)
Interaction	1.82 (1.74)	-0.84 (1.40)	3.90 (2.01)
Making Sense	2.93 (0.55)	2.96 (0.43)	2.52 (0.58)

Note. Mean and Standard Deviation (in parentheses) reported.

Figure 2.

Comparing level of reflection dimensions across data collection methods (z scores)

To explore the statistical significance of these differences, we ran Welch’s ANOVA for each dimension (with omega squared for effect size), which is shown in Table 11. Effect sizes are interpreted as: small > 0.01, medium > 0.06, and large > 0.14 (Field et al., 2012).

Table 11.

Results of Welch’s ANOVA Comparing Level of Reflection Across Methods

	F	df	p-value	Effect size
Immediacy	173.00	26.12	<.001***	.859 (large)
Making Distinctions	90.15	23.75	<.001***	.776 (large)
Interaction	32.87	24.66	<.001***	.544 (large)
Making Sense	5.65	25.06	.009**	.147 (large)

Note. * p< .05, ** p< .01, *** p< .001.

All the dimensions for level of reflection had significant differences across methods with a large effect size. We then ran a post-hoc Games-Howell test (with Hedges g for effect size) to identify specific differences between the data collection methods (shown in Table 12). Effect sizes are interpreted as: small > 0.2, medium > 0.5, and large > 0.8 (Field et al., 2012). Negative effect sizes indicate that the data collection method listed first has a lower mean score compared to the method listed second.

Table 12.

Pairwise Comparison of Level of Reflection Across Data Collection Methods

	Data collection methods	p-value	Effect size
Immediacy	Interviews-Video Diaries	.935	-0.09 (small)
	Interviews-Written Diaries	<.001***	3.45 (large)
	Video Diaries-Written Diaries	<.001***	4.36 (large)
Making Distinctions	Interviews-Video Diaries	.865	0.17 (small)
	Interviews-Written Diaries	<.001***	2.64 (large)
	Video Diaries-Written Diaries	<.001***	2.71 (large)
Interaction	Interviews-Video Diaries	<.001***	1.54 (large)
	Interviews-Written Diaries	<.001***	-1.14 (large)
	Video Diaries-Written Diaries	<.001***	-2.48 (large)
Making Sense	Interviews-Video Diaries	.981	-0.05 (small)
	Interviews-Written Diaries	.005**	0.73 (medium)
	Video Diaries-Written Diaries	.049*	0.78 (medium)

Note. * p< .05, ** p< .01, *** p< .001; Mean and standard deviation are shown in Table 10.

As we saw in our findings for RQ1, our interview and video diary data sets are similar to each other (only one significant difference) while our written diaries are significantly different from both other methods across all dimensions. However, in contrast to RQ1, where we saw high depth of content in the written diaries for some dimensions, here we see (for the most part) lower levels of reflection compared to the interviews and video diaries. Most notably, the students in our study demonstrated significantly lower scores in the Immediacy, Making Distinctions, and Making Sense categories in the written diaries. Lower Immediacy scores are likely related to the differences between writing and speaking (i.e., we use longer words and more formal language in writing) but may also indicate less introspection as students refer to themselves less often and may not describe their personal values or expectations. The difference in scores for Making Distinctions suggests that students were more likely to observe contrasts and differences in culture or worldview in the interviews and video diaries compared to the written diaries. Similarly, differences in the Making Sense dimension suggest that students were less likely to describe shifted perspectives in their written diaries compared to the other data collection methods.

The one area where the written diaries scored significantly higher than both other methods is the Interaction dimension, which focuses on a combination of social interactions and past tense verbs. It is also notable that this dimension was especially low for the video diaries—the only significant difference between the videos and the interviews. This pattern is likely a function of three things: speaking in past tense, types of study abroad programs, and the social nature of interviews. First, the students in written diaries were more likely to write in past tense compared to the video diaries, where students spoke mostly in present tense. Second, the students in the written diaries were part of a group study abroad program, whereas the interview students came from many different programs (some individual, some groups) and the video diary students were all traveling individually. Finally, the interviews were conversations between participants and the interviewer, giving them more social interaction language compared to the video diaries. Overall, however, these results suggest a higher level of reflection in both the interviews and the video diaries compared to the written diaries in our study.

Discussion & Implications

Our study explored differences in data richness across three qualitative data collection methods: written diaries, interviews, and video diaries. We used linguistic inquiry to compare these data sets for depth of content (i.e., detail provided about an experience) and level of reflection (i.e., cognitively complex thought processes). In response to RQ1, we found different strengths for depth of content among the three data collection methods. Our written diary data had greater analytical and emotional depth compared to the other two methods, but was lower in personal and behavioral depth. The interview and video diary data sets were similar, but interviews had somewhat less descriptive and behavioral depth than the videos. In response to RQ2, we compared our data to other common language types in the LIWC Test Kitchen Corpus. We found that the written diaries were similar in their depth of content to blogs (though notably more analytical), and the interviews were similar to conversations. The video diaries were also similar to conversations but demonstrated more personal and analytical depth. In response to RQ3, we found that the interview and video diary data sets demonstrated higher levels of reflection across most dimensions compared to written diaries. The written diaries were higher in the Interaction dimension, which may be related to the specific types of programs represented in that study. Overall, we found that data richness differed both in depth of content and level of reflection across our three data sets.

Our study contributes a better understanding of the data richness that can be achieved using different qualitative data collection methods. Although several previous studies have compared data collected through different methods (e.g., Baker, 2023; Danielsson & Berge, 2020; Fitt, 2018; Litovuo et al., 2019; McDonnell et al., 2017), we offer a new perspective to the conversation by using linguistic inquiry to consider differences in both depth of content and level of reflection. First, we found unique patterns in the depth of content dimensions, which may inform a researcher’s choice of which method to use in a study. For example, prior studies found written diaries are more self-edited than audio diaries (Fitt, 2018) and less personal and more “matter of fact” than interviews (Baker, 2023, p. 698). We noted a similar trend in the formal language of the written diaries in our study, but the written diaries also contained more descriptive and analytical depth, which may be desirable in certain studies or educational contexts. Similarly, the video diaries in our study demonstrated more descriptive and emotional depth than interviews, which supports claims that participants may feel more comfortable without an interviewer present and express themselves freely (Danielsson & Berge, 2020). At the same time, the differences between video diaries and interviews were smaller when compared to the written diaries, which may reflect previous findings that individual participants can prefer one method over the other (McDonnell et al., 2017). Overall, our study offers a more nuanced framing of depth of content and demonstrates how linguistic inquiry can reveal distinctive patterns of strengths and weaknesses in the richness of qualitative data.

Second, we found that the interviews and video diaries in our study demonstrated greater levels of reflection across most dimensions compared to written diaries. The differences in the depth of content (where each method demonstrated unique strengths) and level of reflection (where written diaries were clearly weaker) suggest these are different aspects of data richness that should be considered separately. It is important to note the differences in the LIWC variables that we used to operationalize depth of content versus level of reflection. The depth of content variables characterize which types of content were discussed and what kind of language was used, whereas the level of reflection factors integrate cognitive and affective language to demonstrate cognitive complexity. In written diaries, the students in our study would describe a situation in detail but fail to introspect about its personal impact. Similarly, they might use logical language to analyze an event but stop short of reflecting on contrasting perspectives or inferring causation. It is possible that the greater effort involved in writing results in the loss of reflective content, or it could be related to discomfort that engineering students may feel with writing due to its lack of emphasis in their typical curricula (e.g., as compared to math or design-based activities; Conrad, 2017). Monrouxe (2009) suggested that audio diaries can lead to sense-making during the think-aloud process and shared examples from a participant narrative. Perhaps this connection is reflected in the differences we found between written diaries and both interviews and video diaries. We did not find other studies comparing the level of reflection elicited by different data collection methods but hope that our analysis inspires future researchers to expand on this line of inquiry. As educators, we believe that better understanding this aspect of data richness can have implications for both educational research and student learning.

Our findings have several implications for research and educational practice. First, our comparison of methods can inform the selection of methods for research studies, depending on the dimension(s) of content that are most important for a study. For example, a study seeking rich descriptions might prefer written diaries whereas a study seeking personal information may prefer interviews. Second, our study supports previous arguments that multiple methods can strengthen a study (Baker, 2023; Filep et al., 2018; Gibson et al., 2013) by specifying the unique contributions each method could make in terms of data richness. Third, we built on previous work using the LIWC to characterize data richness (Abrams et al., 2015; Flynn et al., 2018; Ogden & Cornwell, 2010) and proposed updated dimensions of depth of content based on the latest LIWC library. We additionally introduced level of reflection as another aspect of data richness, building on work by Savicki and Price (2017, 2021, 2022). These contributions to the study of data richness in qualitative research can inform future research to understand the data generated using different methods. Additionally, our study highlights the natural connection between diary methods and reflection activities used in experiential education programs. Our findings suggest that educators should consider using video diaries in these programs to enhance students’ level of reflection and encourage in-the-moment reflection, both of which have the potential to support student learning.

Limitations

Our study only considered data from one context: university students studying abroad. Other populations may respond differently to these data collection methods, so our findings may not generalize to other types of studies. For example, most participants in our studies were in the 18–24 age range, and we found anecdotally that many students in this age group prefer recording video diaries to writing diaries. However, different age groups may feel more comfortable with different methods. Our studies also only include students who chose to study abroad. These students are different from the general population of students as they are likely to be interested in global affairs, have a higher socioeconomic status, and may be more mature (McAllister-Grande & Whatley, 2020). These characteristics could make our data sets unique, even within the context of higher education. Additionally, our study is limited by the fact that we only compared data from three studies, which allowed us to reduce differences across studies and focus on comparing the data collection methods. Future research using a similar study design could explore whether our findings transfer to a broader range of research topics and contexts.

A second limitation is that there are some differences between the three studies that we are comparing. First, the reflection prompts for Study 1 were not identical to those used in Studies 2 and 3. We have previously analyzed the Study 1 data for critical incidents and identified similar categories to those found in Studies 2 and 3 (Davis, 2020), and therefore believe that the data is comparable in terms of content despite the differences in the original prompts. Another difference is that the interview protocol included additional questions beyond the CIT question and allowed the interviewer to ask follow-up questions. We considered only including responses to the CIT question in our analysis, but we felt that this would not fully capture the differences between the interview data and the other data sets. One fundamental advantage of interviews is that they provide the opportunity to ask more questions, so we decided to include all the data for an accurate comparison. A limitation with Study 3 is that it only included ten participants, though we have longitudinal data for each of them. This limited sample could be unrepresentative of study abroad students, though these students traveled to different locations and demonstrated different levels of reflection on average. Finally, although we have built on prior research and theory to select the LIWC categories used in our analysis, there are other categories that we could have chosen. We will continue to explore the use of linguistic inquiry to measure data richness and build on Savicki and Price’s (2021, 2022) suggestions for collecting evidence of validity for this method.

Footnotes

Acknowledgements

This material is based in part upon work supported by the United States National Science Foundation under Grant Number OISE-1658604. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the United States National Science Foundation.

ORCID iD

Kirsten A. Davis

Ethical Considerations

Our analysis for this paper uses data from several studies, each of which was approved by the Purdue University Institutional Review Board (IRB).

• The Purdue IRB approved our interviews, which were conducted through two projects: IRB-2020-1394 (approved 12/10/2020) and IRB-2020-1557 (approved 11/19/2020). The participants provided written consent before participating in interviews.
• The Purdue IRB approved the written journals in IRB-2021-81 (approved 03/16/2021). The participants provided written consent for us to use their journals for research purposes (the journals were originally collected as a class assignment).
• The Purdue IRB approved the video reflections in IRB-2021-1554 (approved 11/19/2021). The participants provided written consent before completing any video reflections.

Consent to Participate

All participants in every study whose data we analyzed provided written consent to participate in the research.

Consent for Publication

The consent form for each study stated that the data collected through the study would be used for publication. All participants signed a consent form before participating in each of the studies.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Some of the interviews were collected as part of a study funded through the National Science Foundation (OISE-1658604). The authors received no financial support for the studies through which the remaining interviews, the written journals, or the video reflections were collected.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement

Our deidentified data could be shared with other researchers at their request and with approval from the Purdue University IRB.

Appendix

References

Abrams, K. M., Wang, Song, Y. J., & Galindo-Gonzalez. (2015). Data richness trade-offs between face-to-face, online audiovisual, and online text-only focus groups. Social Science Computer Review, 33(1), 80–96. https://doi.org/10.1177/0894439313519733

Alaszewski. (2006). Using diaries for social research. Sage Publications.

Allport, G. W. (1947). The use of personal documents in psychological science: Prepared for the Committee on Appraisal of Research. Social Science Research Council. https://doi.org/10.1037/11389-000

Andresen, Boud, & Cohen. (2020). Experience-based learning. In Foley (Ed.), Understanding adult education and training (2nd ed., pp. 225–239). Routledge. https://doi.org/10.4324/9781003118299

Ash & Clayton. (2009). Generating, deepening, and documenting learning: The power of critical reflection in applied learning. Journal of Applied Learning in Higher Education, 01(Fall), 25–48. https://doi.org/10.57186/jalhe_2009_v1a2p25-48

Baker. (2023). Young people engaging in event-based diaries: A reflection on the value of diary methods in higher education decision-making research. Qualitative Research, 23(3), 686–705. https://doi.org/10.1177/14687941211048403

Bartlett & Milligan. (2015). Diary method: Research methods (1st ed.). Bloomsbury Academic.

Bhuwandeep. (2022). The impact of reflective practices on student learning in remote internships during COVID 19 pandemic: A qualitative study. Reflective Practice, 23(4), 509–523. https://doi.org/10.1080/14623943.2022.2064446

Bott & Tourish. (2016). The critical incident technique reappraised: Using critical incidents to illuminate organizational practices and build theory. Qualitative Research in Organizations and Management, 11(4), 276–300. https://doi.org/10.1108/QROM-01-2016-1351

Boyd, Ashokkumar, Seraj, & Pennebaker. (2022). The development and psychometric properties of LIWC-22. https://doi.org/10.13140/RG.2.2.23890.43205

Braun & Clarke. (2013). Successful qualitative research: A practical guide for beginners (First published). Sage Publications.

Cao & Henderson, E. F. (Eds.). (2021). Exploring diary methods in higher education research: Opportunities, choices and challenges. Routledge.

Charmaz. (2003). Grounded theory: Objectivist and constructivist methods. In N. K. Denzin & Y. S. Lincoln (Eds.), Strategies of qualitative inquiry (pp. 249–291). Sage Publications.

Chwialkowska. (2020). Maximizing cross-cultural learning from exchange study abroad programs: Transformative learning theory. Journal of Studies in International Education, 24(5), 535–554. https://doi.org/10.1177/1028315320906163

Cohn, M. A., Mehl, M. R., & Pennebaker, J. W. (2004). Linguistic markers of psychological change surrounding September 11, 2001. Psychological Science, 15(10), 687–693. https://doi.org/10.1111/j.0956-7976.2004.00741.x

Conrad. (2017). A comparison of practitioner and student writing in civil engineering. Journal of Engineering Education, 106(2), 191–217. https://doi.org/10.1002/jee.20161

Creswell, J. W., & Poth, C. N. (2023). Qualitative inquiry and research design: Choosing among five approaches (5th ed.). Sage Publications.

Danielsson, A. T., & Berge. (2020). Using video-diaries in educational research exploring identity: Affordances and constraints. International Journal of Qualitative Methods, 19, 1–9. https://doi.org/10.1177/1609406920973541

Davis, K. A. (2020). Pursuing intentional design of global engineering programs: Understanding student experiences and learning outcomes [Dissertation, Virginia Tech]. https://vtechworks.lib.vt.edu/handle/10919/97979

Davis, K. A., & Knight, D. B. (2021). Comparing students’ study abroad experiences and outcomes across global contexts. International Journal of Intercultural Relations, 83, 114–127. https://doi.org/10.1016/j.ijintrel.2021.05.003

Davis, K. A., & Knight, D. B. (2023). Assessing learning processes rather than outcomes: Using critical incidents to explore student learning abroad. Higher Education, 85(2), 341–357. https://doi.org/10.1007/s10734-022-00836-6

Davis, K. A., & Knight, D. B. (2025). Exploring how personal and program characteristics inform the experiences of engineering students abroad. Journal of Engineering Education, 114(1), Article e20625. https://doi.org/10.1002/jee.20625

Douglas, J. A., McClelland, Davies, & Sudbury. (2009). Using critical incident technique (CIT) to capture the voice of the student. The TQM Journal, 21(4), 305–318. https://doi.org/10.1108/17542730910965038

Field, Miles, & Field. (2012). Discovering statistics using R. Sage Publications.

Filep, C. V., Turner, Eidse, Thompson-Fawcett, & Fitzsimons. (2018). Advancing rigour in solicited diary research. Qualitative Research, 18(4), 451–470. https://doi.org/10.1177/1468794117728411

Fitt. (2018). Researching mobile practices: Participant reflection and audio-recording in repeat question diaries. Qualitative Research, 18(6), 654–670. https://doi.org/10.1177/1468794117743462

Flynn, Albrecht, & Scott, S. D. (2018). Two approaches to focus group data collection for qualitative health research: Maximizing resources and data quality. International Journal of Qualitative Methods, 17(1), 1–9. https://doi.org/10.1177/1609406917750781

Francis, M. E., & Pennebaker, J. W. (1992). Putting stress into words: The impact of writing on physiological, absentee, and self-reported emotional well-being measures. American Journal of Health Promotion, 6(4), 280–287. https://doi.org/10.4278/0890-1171-6.4.280

Gibson, B. E., Mistry, Smith, Yoshida, K. K., Abbott, Lindsay, & Hamdani. (2013). The integrated use of audio diaries, photography, and interviews in research with disabled young men. International Journal of Qualitative Methods, 12(1), 382–402. https://doi.org/10.1177/160940691301200118

Gothberg, Applegate, Reeves, Kohler, Thurston, & Peterson. (2013). Is the medium really the message? A comparison of face-to-face, telephone, and internet focus group venues. Journal of Ethnographic & Qualitative Research, 7(3), 108–127.

Gottschalk, L. A., & Gleser, G. C. (1969). The measurement of psychological states through the content analysis of verbal behavior. University of California Press.

Hanegreefs, Pluymaekers, & Hoefnagels. (2023). Linguistic markers of intercultural competence in student blogs. PrePrints.Org. https://doi.org/10.20944/preprints202204.0303.v2

Harvey, Coulson, & McMaugh. (2016). Towards a theory of the ecology of reflection: Reflective practice for experiential learning in higher education. Journal of University Teaching and Learning Practice, 13(2), Article 2. https://doi.org/10.53761/1.13.2.2

Hatton & Smith. (1995). Reflection in teacher education: Towards definition and implementation. Teaching and Teacher Education, 11(1), 33–49. https://doi.org/10.1016/0742-051X(94)00012-U

Hess, J. L., Strobel, & Brightman, A. O. (2017). The development of empathetic perspective-taking in an engineering ethics course. Journal of Engineering Education, 106(4), 534–563. https://doi.org/10.1002/jee.20175

Huisman. (2024). The use of methods: Are higher education scholars lazy or insufficiently skilled? Higher Education Research & Development, 43(1), 260–266. https://doi.org/10.1080/07294360.2024.2305961

Hyers, L. L. (2018). Diary methods: Understanding qualitative research. Oxford University Press.

Kacewicz, Pennebaker, J. W., Davis, Jeon, & Graesser, A. C. (2014). Pronoun use reflects standings in social hierarchies. Journal of Language and Social Psychology, 33(2), 125–143. https://doi.org/10.1177/0261927X13502654

Kiely. (2004). A chameleon with a complex: Searching for transformation in international service-learning. Michigan Journal of Community Service Learning, 10(2), 5–20.

Kolb, D. A. (2015). Experiential learning: Experience as the source of learning and development (2nd ed.). Pearson Education, Inc.

Koutsoumpis, Oostrom, J. K., Holtrop, van Breda, Ghassemi, & de Vries, R. E. (2022). The kernel of truth in text-based personality assessment: A meta-analysis of the relations between the big five and the linguistic inquiry and word count (LIWC). Psychological Bulletin, 148(11–12), 843–868. https://doi.org/10.1037/bul0000381

Levine, R. B., Kern, D. E., & Wright, S. M. (2008). The impact of prompted narrative writing during internship on reflective practice: A qualitative study. Advances in Health Sciences Education, 13(5), 723–733. https://doi.org/10.1007/s10459-007-9079-x

Litovuo, Karisalmi, Aarikka-Stenroos, & Kaipio. (2019). Comparing three methods to capture multidimensional service experience in children’s health care: Video diaries, narratives, and semistructured interviews. International Journal of Qualitative Methods, 18, 1–12. https://doi.org/10.1177/1609406919835112

Maxwell, J. A. (2013). Qualitative research design: An interactive approach (3rd ed.). Sage.

McAllister-Grande & Whatley. (2020). International higher education research: The state of the field. NAFSA: Association of International Educators. https://www.nafsa.org/bookstore/international-higher-education-research-state-field

McDonnell, Scott, & Dawson. (2017). A multidimensional view? Evaluating the different and combined contributions of diaries and interviews in an exploration of asexual identities and intimacies. Qualitative Research, 17(5), 520–536. https://doi.org/10.1177/1468794116676516

Mezirow. (1991). Transformative dimensions of adult learning. Jossey-Bass Inc.

Monrouxe, L. V. (2009). Solicited audio diaries in longitudinal narrative research: A view from inside. Qualitative Research, 9(1), 81–103. https://doi.org/10.1177/1468794108098032

Moon, J. A. (2004). A handbook of reflective and experiential learning: Theory and practice. Routledge.

Neuendorf, K. A. (2017). The content analysis guidebook. Sage Publications. https://doi.org/10.4135/9781071802878

Newman, M. L., Pennebaker, J. W., Berry, D. S., & Richards, J. M. (2003). Lying words: Predicting deception from linguistic styles. Personality and Social Psychology Bulletin, 29(5), 665–675. https://doi.org/10.1177/0146167203029005010

Ogden & Cornwell. (2010). The role of topic, interviewee and question in predicting rich interview data in the field of health research. Sociology of Health & Illness, 32(7), 1059–1071. https://doi.org/10.1111/j.1467-9566.2010.01272.x

Olorunfemi. (2024). Diary studies in research: More than a research method. International Journal of Market Research, 66(4), 410–427. https://doi.org/10.1177/14707853231222139

Pennebaker, J. W., & King, L. A. (1999). Linguistic styles: Language use as an individual difference. Journal of Personality and Social Psychology, 77(6), 1296–1312. https://doi.org/10.1037/0022-3514.77.6.1296

Pennebaker, J. W., Chung, C. K., Frazee, Lavergne, G. M., & Beaver, D. I. (2014). When small words foretell academic success: The case of college admissions essays. PLOS ONE, 9(12), Article e115844. https://doi.org/10.1371/journal.pone.0115844

Research Institute for Studies in Education. (2017). Global Perspective Inventory: Theoretical foundations and scale descriptions. Iowa State University. https://www.gpi.hs.iastate.edu/documents/GPI_Theory_and_Scales.pdf

Roberts, Onuegbu, Harris, Clark, Griffiths, Seers, Aktas, Staniszewska, & Boardman. (2025). Comparing in-person and remote qualitative data collection methods for data quality and inclusion: A scoping review. International Journal of Qualitative Methods, 24, 1–13. https://doi.org/10.1177/16094069251316745

Rogers, R. R. (2001). Reflection in higher education: A concept analysis. Innovative Higher Education, 26(1), 37–57. https://doi.org/10.1023/A:1010986404527

Rudrum, Casey, Frank, Brickner, R. K., MacKenzie, Carlson, & Rondinelli. (2022). Qualitative research studies online: Using prompted weekly journal entries during the COVID-19 pandemic. International Journal of Qualitative Methods, 21, 1–12. https://doi.org/10.1177/16094069221093138

Savicki & Price, M. V. (2015). Student reflective writing: Cognition and affect before, during, and after study abroad. Journal of College Student Development, 56(6), 587–601. https://doi.org/10.1353/csd.2015.0063

Savicki & Price, M. V. (2017). Components of reflection: A longitudinal analysis of study abroad student blog posts. Frontiers: The Interdisciplinary Journal of Study Abroad, 29(2), 51–62. https://doi.org/10.36366/frontiers.v29i2.392

Savicki & Price, M. V. (2021). Reflection in transformative learning: The challenge of measurement. Journal of Transformative Education, 19(4), 366–382. https://doi.org/10.1177/15413446211045161

Savicki & Price, M. V. (2022). Reflective process and intercultural effectiveness: A case study. Frontiers: The Interdisciplinary Journal of Study Abroad, 34(4), 6–25. https://doi.org/10.36366/frontiers.v34i4.505

Tausczik, Y. R., & Pennebaker, J. W. (2010). The psychological meaning of words: LIWC and computerized text analysis methods. Journal of Language and Social Psychology, 29(1), 24–54. https://doi.org/10.1177/0261927x09351676

Thomas Dotta, Freitas, & Sousa, R. T. D. (2024). Methodological issues in technology-mediated qualitative data collection: A mapping of research undertaken in schools during the Covid-19 pandemic. London Review of Education, 22(1), Article 34. https://doi.org/10.14324/LRE.22.1.34

Tight. (2013). Discipline and methodology in higher education research. Higher Education Research & Development, 32(1), 136–151. https://doi.org/10.1080/07294360.2012.750275

Turzańska. (2014). Junior high school learners’ ability to reflect in the process of keeping a diary in a foreign language. In Gabryś-Barker & Wojtaszek (Eds.), Studying second language acquisition from a qualitative perspective (pp. 71–89). Springer International Publishing. https://doi.org/10.1007/978-3-319-08353-7_6

Vande Berg, & Paige, R. M. (2012). Why students are and are not learning abroad. In M. Vande Berg, R. M. Paige, & K. H. Lou (Eds.), Student learning abroad: What our students are learning, what they’re not, and what we can do about it (pp. 29–58). Stylus.

Walther, Kellam, N. N., Sochacka, N. W., & Radcliffe. (2011). Engineering competence? An interpretive investigation of engineering students’ professional formation. Journal of Engineering Education, 100(4), 703–740. https://doi.org/10.1002/j.2168-9830.2011.tb00033.x

Ward. (2001). The A,B,Cs of acculturation. In Matsumoto (Ed.), The handbook of culture and psychology (pp. 411–445). Oxford University Press.

Weintraub. (1989). Verbal behavior in everyday life. Springer Publishing Co.

Whatley, Landon, A. C., Tarrant, M. A., & Rubin. (2021). Program design and the development of students’ global perspectives in faculty-led short-term study abroad. Journal of Studies in International Education, 25(3), 301–318. https://doi.org/10.1177/1028315320906156

Wrobetz, Davis, K. A., Artiles, M. S., & Murzi. (2024). Engineering students learning abroad: Experiences captured via longitudinal video reflections. IEEE Transactions on Education, 67(3), 423–433. https://doi.org/10.1109/TE.2023.3337783