Abstract
Practitioners often struggle to assess reflective learning in the workplace because of difficulties conceptualizing reflection and its effects in the workplace. This article addresses this problem by offering a pragmatic approach to assessment that asks practitioners to specify why they are using reflection, what they are hoping to gain from it, and how it manifests in practice. This article then discusses several ways that practitioners can assess the impact of reflective learning at work while accounting for the contextual nature of adult learning and practice. Methods are described that aim to help practitioners identify with whom reflection works, where reflection works, and when reflection works best. The article closes by discussing implications for adult learning practice and theory.
When I first began studying workplace reflection as a young scholar, I asked what I thought was a simple question: “But does it work?” (Roessger, 2015, p. 83). With years comes humility, and since then I have learned there’s no easy answer. What we talk about when we talk about reflection depends on whom we ask. Different views impede efforts to systematically assess the skills, competence, or performance thought to emerge from reflective learning; and efforts to establish what it is are more likely to validate the seeker’s views than to settle the issue. Accordingly, I offer no definition of reflection here, as the assessment strategies I propose do not require it. My earlier question, then, was likely the wrong place to start. A better place may be, “But does yours work for you?”
Asking this in a discussion of assessment places one in the realm of pragmatism, where little value is given to what an idea really means. Instead, what is valued is how well an idea works and how well an idea helps one realize goals. Inquiry, then, does not seek whether one’s view of reflection mirrors what it really is, but whether one’s view of reflection emerges from practice and is further shaped by its implications to practice. This requires professionals to consider whether learning thought to emerge from reflection has actually been caused by the practice of reflection and not something else. Both learning and the cause of that learning are critical concerns because a measure of only the former allows no way to distinguish whether it was indeed reflective learning or learning of some other kind.
From this perspective, assessment of reflective learning is a practical activity. And it is here I discuss different assessment strategies for reflective learning in the workplace. But first, some preliminary work. How do I define assessment? General definitions assert that assessment is “the measurement of what an individual knows and can do” (Banta & Palomba, 2015, p. 1). Less common variations target learner needs or the group as the unit of analysis. Assessment may be formative (conducted during learning) or summative (conducted after learning) or use questions with one correct answer (objective assessment) or multiple correct answers (subjective assessment). Here, I adopt the customary unit (individuals) and target (knowing and doing) and extend an approach that accommodates these variants. What follows is a series of strategies or plans of action. As numerous tools work with these strategies, I spend little time on any one.
In explaining this approach, I first discuss the need to explain why one uses reflection and what one hopes to gain from it, and then I address the need to describe how reflection occurs in practice. Next, I detail considerations for establishing causal relations between one’s conceptualization of reflection and its outcomes. I conclude with suggestions for future applications and implications for adult learning practice and theory.
Establishing the Why and the What
When assessing reflective learning, one must first ask, is all knowing and doing within its purview? The answer is likely no. Reviews have found that using reflection for instrumental skill and knowledge acquisition is ineffective (Roessger, 2014; Mann et al., 2009; Ruth-Sahd, 2003), and notable adult learning theorists have agreed, rejecting its use in procedural learning (Boud, 2010) or settings adopting traditional, competitive examinations of skills and knowledge (Boud & Walker, 1998). Accordingly, one should avoid assessing knowledge and performances meant to remain fixed across contexts, such as how well an employee recites a return policy or performs a safety check.
Although scholars generally agree on what reflection should not target, they disagree over what it should. In their review of reflection in social work, psychology, and teacher education, Van Beveren et al. (2018) found reflection’s expressed purposes varied. They found three levels on which researchers justify its use: (a) the personal, (b) the interpersonal, and (c) the socio-structural level. On each, reflection purportedly changes what a person knows or does relevant to the aims of that level. Given such diversity, people should explain why they are using (or promoting) reflection. This is central to pragmatic assessment, where “what works is assessed by first declaring one’s goals and then empirically weighing how well those goals are realized” (Roessger, 2017, p. 210).
By clarifying why, learning professionals are then free to assess whatever relevant skills and knowledge align with that intention. This freedom emerges from a space where reflection mediates relationships between employees and their knowing or doing. Reflection is itself a skill, but from an organizational perspective, it serves to further new knowledge or action, not itself. Capable reflectors who fail to transform organizational practices and products often find themselves in untenable positions within their organizations. As such, assessments must not target reflection, but rather what is sought from reflection’s use.
The Domains of Reflective Learning
The range of skills and knowledge one targets when assessing reflective learning has limitations. Consider Moore et al.’s (2009) framework for clinical skill assessment of health professionals (see Figure 1). They argue that a comprehensive assessment of professional services should consider declarative knowledge (knows), procedural knowledge (knows how), contrived performance (shows how), and authentic performance (does). Each level contains myriad targets for assessment. But the lowest two almost exclusively involve instrumental learning (see Figure 2), a process whereby emerging professionals construct understandings aligned with those of their profession (Roessger, 2015). Minimal support exists for reflection in this domain.

Figure 1. Moore et al.’s (2009) framework for clinical assessment (adapted from Miller, 1990).

Figure 2. Roessger’s (2015) conceptualization of instrumental learning for professional learning contexts.
Within the top two levels, however, meanings are fluid; and in professions with considerable uncertainty, those meanings must, and should, change when warranted. At the third level, learners illustrate competence through contrived performances (shows how) and at the fourth, authentic performances (does). Here they refine and transform existing meanings rather than construct new meanings of profession-specific information and actions.
Consider the call center worker who after familiarizing herself with such things as the company’s product catalog (knows), prices (knows), and return policies (knows) is now able to use this information to learn the process of facilitating a customer service call on a decision-making flowchart (knows how). So far, her learning involves knowing and knowing how, what may be called meaning making (Roessger, 2019). But it is only when these meanings are transformed that reflection plays a role. When her supervisor then asks her to role-play a call (shows how), she must navigate potential ambiguities and adapt precisely defined skills and knowledge for customers with little interest in following her script.
Adaptations become even more critical during authentic performance (does), when even a field’s most experienced professionals encounter novel problems with ambiguous solutions. Consider an experienced salesperson who was recently asked to implement a proven strategy from another field. After learning about the strategy (knows) and its use (knows how), she resists adopting it because her beliefs and repertoires are incongruent with it. She finds herself uncertain of its effect in practice’s unpredictable world. Here, reflection may help her analyze the new strategy, allowing her to construct and evaluate hypothetical outcomes against her experience.
Such scenarios involve meaning transformation, the process whereby learners relate what they have learned to what they know to find new interpretations for solving novel problems (see Figure 3). Within this space, reflective learning assessments target novel applications, interpretations, and innovations; and changes in initial meanings often characterized as adaptability, creativity, and flexibility. Here learning professionals might ask, “Has the call center worker identified changes to established practices that make the company more responsive to customer concerns?” This informs a related objective: learners will be able to identify changes to established practices that improve company responsiveness to customer concerns. Both the question and objective are demonstrable and measurable. Both fall within reflection’s purview and target meaning transformation. What remains is to assess whether employees have met the objective. This could occur during weekly debriefings where supervisors ask employees to describe their experiences with returns and note practices that alleviate or exacerbate customer concerns. During these meetings, follow-up questions could gauge what changes, if any, employees seek. One may assess responses using checklists, rating scales, or rubrics with clearly defined criteria indicating what a change to established practice looks like and to what degree it has the potential to improve company’s responsiveness.

Figure 3. Roessger’s (2019) functional taxonomy of meaning construction. Learner direction and content familiarity increase with each process.
Reliability, Validity, and Pragmatic Quality
By ensuring that the skills and knowledge thought to emerge from reflective learning are appropriate and measurable, learning professionals can then determine the reliability of the assessment measurement itself. Reliability refers to a measure’s scoring consistency, that is, how well it (e.g., a checklist, rating scale, rubric, or inventory) produces similar results when used by different people. Customary ways of determining reliability are interobserver agreement (Richards et al., 2014) and interrater reliability (Cohen, 1960), both of which calculate quotients expressing degrees of agreement among raters or observers. The former assesses how well observers agree when counting behaviors, the latter when assigning categorical scores for demonstrations of skills and knowledge. Using the previous example, our learning professional could record weekly debriefings and, after scoring conversations against her learning objective, ask a colleague to do the same. She could then subject results to interrater reliability tests to determine how well the two agree.
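As one illustration of such a test, the following minimal Python sketch computes Cohen's (1960) kappa for two raters. The three-point rubric and the ten scores are hypothetical and stand in for the learning professional's and her colleague's ratings of the recorded debriefings; other agreement statistics may suit other scoring schemes better.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical scores to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of items.")
    n = len(rater_a)
    # Observed agreement: share of items given the same category by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over categories.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in categories) / n ** 2
    if p_chance == 1.0:
        raise ValueError("Kappa is undefined when both raters use one identical category.")
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical rubric scores for ten recorded debriefings
# (0 = no change identified, 1 = change identified, 2 = change with clear potential).
professional = [2, 1, 0, 1, 2, 2, 0, 1, 1, 0]
colleague = [2, 1, 0, 0, 2, 1, 0, 1, 1, 0]

kappa = cohens_kappa(professional, colleague)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects raw agreement for the agreement two raters would reach by chance, which is why it is usually lower than the simple percentage of matching scores.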
Another quality historically considered is the measure’s validity. Here, validity refers to whether assessments are measuring their intended underlying construct, that is, whether they are actually gauging reflective learning. But although this question is critical when considering whether one’s view corresponds to another’s, it is pragmatically inconsequential. Validity assumes the underlying thing being measured exists apart from the knower, and that its conceptualization is unvarying. Adopting these assumptions restricts the freedom one gains from a pragmatic approach and leaves unaddressed its inherent values (i.e., Does it work?). A measure may be psychometrically valid, for instance, but if it is time-consuming or expensive to administer, it rarely gains traction in practice (Powell et al., 2017). More important is its pragmatic quality, namely its: (a) acceptability to the community of practice, (b) compatibility with the community’s practices, (c) ease of implementation, and (d) usefulness for refining practice and solving problems (Powell et al., 2017). Using these criteria, our learning professional could discuss her assessment with stakeholders to refine its usefulness.
Establishing the How
So far, I have discussed the need to articulate both why one uses reflection and what one hopes to gain from it. I have stressed that one should define outcomes that are measurable and then subject those measures to tests of reliability and pragmatic quality. Now some may be wondering if focusing on effects before causes is, in a sense, putting the cart before the horse. But my advocating for just such an approach is intentional, a way of stressing a pragmatic approach to assessment whereby a practice’s effect in the world determines its value and meaning.
What I am now recommending is that learning professionals describe specifically how reflection occurs and name the observable actions indicative of the unobservable thing (that thing we call reflection). This ensures that learners are participating and that learning professionals are able to tie reflection to its outcomes. One may ask, “What does reflection look like, and how do I know people are doing it?” The literature is replete with practices thought to occasion reflection, or to function as proxies for it, such as journal writing (Bell et al., 2011) and dialogue (Nyaumwe & Mtetwa, 2011). Each is adopted as a strategy in short courses or workshops, and each describes how reflection occurs within these settings. If using such a reflective activity, learning professionals may presume that how the activity is carried out indicates how reflection occurs.
But in the actual workplace, assessing reflection is trickier. Reflection remains a critical part of learning on the job, often involving learners who are unaware of their learning processes (Glahn et al., 2008c; Marsick & Watkins, 2001). Describing how learners reflect here is difficult, as the learning professional is often removed from the experience. Attempts to retrospectively illuminate the process are flawed. For example, interviewing or ex post facto journaling requires learners to recall their processes rather than report them as they occur (Roessger et al., 2017). This often produces unreliable and inaccurate reporting because it emerges from chronologically divergent perspectives: the experiencing self and the remembering self (Kahneman, 2011). The experiencing self serves as a perspective from which a person contacts present thoughts, feelings, and behaviors; whereas the remembering self serves as a perspective from which learners retrospectively qualify their meaning. Reports of an event or process often differ considerably depending upon the perspective adopted (see Kahneman, 2011). Attempts to establish how reflection occurs without considering this may produce different descriptions of the same phenomenon.
Fortunately, methods exist for accessing a person’s experiencing self that can help determine how reflection occurs in the moment. One strand of research has focused on visual interaction footprints of users of online systems, an approach allowing stakeholders to see select areas of online activity in real time. Researchers have gauged learners’ reflection using indicators of their activity in an online community (Glahn et al., 2007), their implicit and explicit use of tagged resources (Glahn et al., 2008a), and their use of social bookmarking (Glahn et al., 2008b). Others have used data collection apps to gather data in real time through multiple choice, sliding scale, or open-ended questions on learners’ phones (see Roessger et al., 2017). Gathering behavioral observation data with these tools is also possible through photograph, voice memo, or video responses. In each case, researchers can see how reflection occurs over time while avoiding retrospective reporting.
Say our learning professional is using workplace coaching to help a manager use reflective strategies within the call center. Her aim is to help him adapt to changing work conditions and overcome challenges using novel solutions. To see how he reflects, she designs a data collection app that randomly sends a daily push notification. When the manager receives it, he has 5 minutes to respond. Designed to be unobtrusive, the notification generally takes less than 30 seconds to complete. First, he receives a categorical response question: What are you doing right now? Contingent on a response, the second question appears: Is it going according to plan? If he selects no, he receives another question: Are you able to adapt your skills and knowledge to create a successful outcome? If he then selects yes, he is presented with a request for a voice memo: Tell me briefly how you are doing that? Using these data, the learning professional can see how the manager is using previously discussed reflective strategies. She is then able to engage him in additional reflection on this experience during their next session.
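The branching in this example can be sketched in a few lines of Python. This is only an illustration of the question logic, not a description of any particular data collection app. The dictionary keys and function names are hypothetical, and I assume the notification simply ends when the manager answers yes to the second question or no to the third, since the example does not say what happens in those cases.

```python
from datetime import datetime, timedelta
from typing import Optional

RESPONSE_WINDOW = timedelta(minutes=5)  # the manager has 5 minutes to respond

def within_window(sent_at: datetime, responded_at: datetime) -> bool:
    """True when the response arrives no later than 5 minutes after the notification."""
    return timedelta(0) <= responded_at - sent_at <= RESPONSE_WINDOW

def next_prompt(answers: dict) -> Optional[str]:
    """Return the next question to show, or None when the notification is complete.

    `answers` maps hypothetical keys to the responses collected so far.
    """
    if "activity" not in answers:
        return "What are you doing right now?"
    if "on_plan" not in answers:
        return "Is it going according to plan?"
    if answers["on_plan"] == "yes":
        return None  # assumption: nothing further is asked when things go to plan
    if "can_adapt" not in answers:
        return "Are you able to adapt your skills and knowledge to create a successful outcome?"
    if answers["can_adapt"] == "no":
        return None  # assumption: the voice memo is requested only after a yes
    if "voice_memo" not in answers:
        return "Tell me briefly how you are doing that?"
    return None

# Walk through one hypothetical response path.
answers = {}
for key, value in [("activity", "Handling an escalated call"),
                   ("on_plan", "no"),
                   ("can_adapt", "yes")]:
    print(next_prompt(answers), "->", value)
    answers[key] = value
print(next_prompt(answers))
```

Keeping the logic in one small function makes it easy to review the question flow with the manager before the notifications begin.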
But Does Yours Work?
Demonstrating whether something works is deceptively difficult, requiring causal inferences between a presumed cause and its effect. One might ask, does reflection influence the outcomes I have chosen? Myriad factors contribute to this presumed relationship (e.g., psychological flexibility, incentive structures), many of which directly or indirectly influence desired outcomes. Making causal inferences for reflection and its outcomes, then, involves accounting for these and understanding that reflection is likely not the only cause—or even the principal cause.
My aim is to briefly review causal inference here and highlight its most critical considerations for the workplace. For those interested in more comprehensive treatments, I suggest further investigating my sources. For those planning to use research consultants for this part of assessment, what follows may still aid those collaborations.
Chambliss and Schutt (2019) identified five criteria for establishing causal relationships:
1. Association is the need for correlation between reflection and the skills and knowledge specified in associated outcomes. If a causal relationship exists, we would expect to see that the more a person engages in reflection, the more that person demonstrates targeted skills and knowledge, or the higher the marks are for those demonstrations.
2. Temporal order is the need for reflection to precede its outcomes in time. If a causal relationship exists, we would expect to see a person engage in reflection before demonstrating outcomes presumed to result from it.
3. Nonspuriousness is the need to eliminate confounding variables that influence both a person’s engagement in reflection and demonstration of targeted skills and knowledge. Confounding variables are sometimes called lurking variables because they are hidden but directly responsible for the association and temporal order presumed between a cause and effect. For instance, mastery of a profession’s skills and knowledge (confounding variable) may cause adaptability (presumed effect) and reflection (presumed cause). If learning professionals overlook this, they might falsely conclude that reflection is the cause and, in turn, mistakenly emphasize reflection in their learning plans rather than strategies for mastery.
4. Mechanism is the need to identify the underlying process through which a causal relationship occurs. Although specifying such a process is critical for placing reflection within a broader theoretical framework, it is unnecessary for a pragmatic endeavor such as workplace assessment. For our purposes, we may ignore it.
5. Context is the need for learning professionals to consider variables related to time, place, and people that influence reflection’s effect. These are contextual variables that illustrate reflection’s conditional effects. I will discuss these more fully later.
Although each criterion plays a critical role in causal inference, perhaps the two most neglected in workplace assessment are temporal order and nonspuriousness. Informally, the learning professional often observes an association between reflection and targeted outcomes (Criterion 1) and concludes the former causes the latter. She may think, “This person demonstrates both how I see reflection and what I see as its result.” While an important observation on its own, association is not causation. To infer causality, she must also establish that reflection precedes these outcomes in time. One time-inclusive assessment design is the single case research design (SCRD), which measures a consecutive series of observations in one person or group and then interrupts that series with an intervention (Shadish et al., 2002). Numerous variations exist, but common among them is that each measures targeted skills and knowledge on multiple occasions until a stable baseline is reached (Phase A). Reflective learning is then introduced and measurements continued (Phase B). If reflection influences targeted outcomes, one expects to see an immediate or developmental change in Phase B.
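A rough sketch of how Phase A and Phase B might be compared follows. The scores are hypothetical, and the stability rule (at least 80% of baseline points within 20% of the baseline median) is an assumption chosen for illustration, not a criterion taken from the sources cited here. The percentage of non-overlapping data is a descriptive summary sometimes used with single case designs. Visual inspection of the plotted series would normally accompany numbers like these.

```python
from statistics import mean, median

def is_stable(baseline, tolerance=0.20, share=0.80):
    """Assumed rule: at least `share` of points lie within `tolerance` of the median."""
    mid = median(baseline)
    band = abs(mid) * tolerance
    inside = sum(abs(x - mid) <= band for x in baseline)
    return inside / len(baseline) >= share

def level_change(phase_a, phase_b):
    """Difference between the Phase B mean and the Phase A mean."""
    return mean(phase_b) - mean(phase_a)

def percent_nonoverlapping(phase_a, phase_b):
    """Share of Phase B points above the highest Phase A point (higher scores are better)."""
    ceiling = max(phase_a)
    return 100 * sum(x > ceiling for x in phase_b) / len(phase_b)

# Hypothetical weekly rubric totals for one employee.
phase_a = [10, 11, 10, 9, 10]   # baseline, before reflective learning is introduced
phase_b = [12, 13, 14, 14, 15]  # after reflective learning is introduced

if is_stable(phase_a):
    print("Level change:", round(level_change(phase_a, phase_b), 2))
    print("Non-overlapping data (%):", percent_nonoverlapping(phase_a, phase_b))
else:
    print("Baseline not yet stable; keep measuring before introducing reflection.")
```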
After establishing temporal order between reflection and targeted outcomes, the learning professional must then rule out alternative explanations, that is, threats to internal validity. These are disparate phenomena that produce similar observed outcomes and promote false conclusions. While the literature discusses numerous threats to internal validity (see Shadish et al., 2002), two are particularly salient. First, history effects refer to the impact of another event coinciding with the introduction of reflective learning. For instance, after obtaining stable baselines for targeted skills or knowledge, an organization may implement an incentive program along with the introduction of reflective learning activities. The latter’s effects, then, are said to be confounded with the incentive program. To control for this, the learning professional must record multiple baselines for different people at different times and stagger the introduction of reflective activities. Consistent changes that follow reflective learning across times and people make history effects an unlikely explanation.
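The staggered logic can be illustrated with a short Python sketch. All names and scores are hypothetical. For each employee it reports the change after that employee's own start, and, for employees who start later, any drift in their scores during the weeks when someone else had already begun reflective activities. Under the assumptions of this sketch, a shared outside event such as an incentive program would tend to show up as drift in those still-baseline weeks.

```python
from statistics import mean

# Hypothetical weekly scores; `start` is the week index at which reflective
# activities began for that employee (deliberately staggered).
records = {
    "employee_1": {"start": 4, "scores": [10, 11, 10, 10, 14, 15, 14, 15, 15, 16]},
    "employee_2": {"start": 6, "scores": [9, 10, 9, 10, 10, 9, 13, 14, 14, 15]},
    "employee_3": {"start": 8, "scores": [11, 10, 11, 11, 10, 11, 11, 10, 15, 16]},
}

def staggered_summary(records):
    """Per-person change after own start, plus drift while still in baseline."""
    first_start = min(r["start"] for r in records.values())
    summary = {}
    for name, r in records.items():
        scores, start = r["scores"], r["start"]
        own_change = mean(scores[start:]) - mean(scores[:start])
        if start > first_start:
            # Weeks after the first employee started but before this one did.
            drift = mean(scores[first_start:start]) - mean(scores[:first_start])
        else:
            drift = 0.0
        summary[name] = {"change_after_own_start": own_change, "baseline_drift": drift}
    return summary

summary = staggered_summary(records)
for name, result in summary.items():
    print(name, {k: round(v, 2) for k, v in result.items()})
```

In these made-up data each employee improves only after his or her own start and shows little drift beforehand, which is the pattern that argues against a history effect.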
The second threat to internal validity is the regression effect, which refers to the tendency of extreme measures to attenuate over time (Shadish et al., 2002). For instance, our learning professional may initially observe demonstrations of targeted skills and knowledge that fail to meet the organization’s quantity and quality benchmarks. After introducing reflective learning activities, she notices improvements and concludes they result from reflective learning. What our professional may be witnessing are extreme scores returning to their average. In other words, learners’ subpar demonstrations of skills and knowledge are improving to their normal levels, something that would happen without her intervention. To eliminate this possibility, our professional should again establish stable baselines before introducing reflective activities or, when simple pre-post test designs are used, record two pre-test measures to identify initial trends unrelated to reflective learning.
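A small simulation can make the regression effect tangible. In the sketch below, every number is an assumption chosen for illustration: each simulated learner has a fixed underlying skill level, and two measurements are taken with random error but with no intervention between them. Learners selected because their first score was extremely low still score closer to average the second time.

```python
import random
from statistics import mean

def simulate_regression_effect(n_learners=2000, seed=0):
    """Return (first, second) mean scores for learners who scored lowest at first.

    No intervention occurs between the two measurements.
    """
    rng = random.Random(seed)
    true_skill = [rng.gauss(50, 5) for _ in range(n_learners)]
    first = [t + rng.gauss(0, 10) for t in true_skill]   # first measurement, with error
    second = [t + rng.gauss(0, 10) for t in true_skill]  # second measurement, new error
    cutoff = sorted(first)[n_learners // 10]              # roughly the lowest tenth
    low = [i for i, score in enumerate(first) if score <= cutoff]
    return mean(first[i] for i in low), mean(second[i] for i in low)

first_mean, second_mean = simulate_regression_effect()
print(f"Lowest scorers, first measurement:  {first_mean:.1f}")
print(f"Same learners, second measurement: {second_mean:.1f}")
```

This is why a second pre-test measure is informative: if subpar scores are already rising before reflective activities begin, that rise cannot be credited to reflection.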
Where, When, and With Whom?
Now, I would like to discuss context. The idea of context and its interplay with practice is foundational to adult learning (Roessger, 2018), perhaps nowhere more so than for reflection. Despite this and context’s role in causal inference, assessments and evaluations of reflection often ignore it (Van Beveren et al., 2018).
Any assessment of reflection should question where it works, when it works, and with whom it works best. Single case designs account for when something works, so I will focus here on the questions of where and with whom. To establish with whom reflection works, learning professionals must first consider third variables related to learners themselves. For large-scale assessments, this is accomplished by incorporating additional variables in general linear models (e.g., multiple linear regression). Say our learning professional is considering how reflective activities for occasioning workplace reflection (e.g., coaching), or reflective learning indicators (e.g., use of tagged resources), affect adaptations of organizational practices to novel problems. She has operationalized both variables and developed reliable and pragmatic scoring criteria for the latter. She then collects data on 100 call center employees who participated in coaching and 100 who did not, and she uses simple regression to model the relationship between reflection and her targeted outcome. To consider third variables, she then incorporates into her model learner personality measures, such as Big Five personality and growth mindset scores. Doing so allows her not only to isolate reflection’s unique contribution to variability in her target outcome, but also to explore how personality variables interact with reflection to produce conditional effects. She may find that reflection improves targeted outcomes in employees with growth mindsets but does little for those without them. She may also find that reflection helps those who score high on the openness to experience personality dimension but not others. In other words, she finds reflection’s effects are conditional. See Field (2017) for a helpful overview of using third variables in the general linear model.
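To show what a conditional effect looks like in a regression model, the following Python sketch fits an ordinary least squares model with an interaction term using only the standard library. The data are fabricated for illustration (100 coached and 100 uncoached employees, with growth mindset coded as a yes/no variable for simplicity), and the sketch reports coefficients only. A real analysis would use statistical software, continuous personality scores where available, and standard errors.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("Predictors are collinear; the model cannot be estimated.")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Fabricated data: outcome scores for 200 employees, 50 in each combination of
# coaching (0/1) and growth mindset (0/1), with small alternating noise.
rows, scores = [], []
for coached in (0, 1):
    for mindset in (0, 1):
        for i in range(50):
            noise = 0.5 if i % 2 == 0 else -0.5
            scores.append(10 + 1 * coached + 2 * mindset + 4 * coached * mindset + noise)
            # Columns: intercept, coaching, mindset, coaching x mindset interaction.
            rows.append([1, coached, mindset, coached * mindset])

b0, b_coach, b_mindset, b_interaction = ols(rows, scores)
print(f"Coaching effect without growth mindset: {b_coach:.2f}")
print(f"Coaching effect with growth mindset:    {b_coach + b_interaction:.2f}")
```

The interaction coefficient is what carries the conditional effect: the estimated benefit of coaching differs depending on the value of the third variable.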
To account for the role of place, learning professionals can adopt an extension of the general linear model called multilevel modeling. Multilevel modeling allows one to examine nested data structures, which occur when learners’ data are not independent and instead rely on the contextual unit in which they are situated (Roessger, 2018). Employees may be nested within workgroups, departments, or branch locations; and those belonging to a particular unit will tend to perform similarly to others within that unit. The work unit itself has qualities that influence employees’ thinking and doing. Standard general linear models ignore such nesting and assume it does not occur. But multilevel modeling determines if reflection’s effects vary across contextual units. Using our previous example, one could examine how effects vary across branches and then incorporate measures from each branch (e.g., size, leadership style) to explain this variability. Findings may suggest that not only does reflection’s effect depend on learner characteristics, but also on branch characteristics. Multilevel modeling has numerous applications for workplace assessment. For an accessible overview, see Heck et al. (2014).
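Before fitting a multilevel model, one common first check is whether scores cluster by unit at all. The sketch below computes the intraclass correlation, ICC(1), from a one-way analysis of variance. It is a preliminary diagnostic under simplifying assumptions (equal group sizes, hypothetical branch names and scores), not a multilevel model itself.

```python
from statistics import mean

def icc1(groups):
    """ICC(1) from a one-way ANOVA; assumes every group has the same size."""
    k = len(groups[0])
    if k < 2 or len(groups) < 2 or any(len(g) != k for g in groups):
        raise ValueError("This sketch needs two or more equal-sized groups of two or more scores.")
    n_groups = len(groups)
    grand_mean = mean(x for g in groups for x in g)
    # Variability of branch means around the grand mean.
    ms_between = k * sum((mean(g) - grand_mean) ** 2 for g in groups) / (n_groups - 1)
    # Variability of employees around their own branch mean.
    ms_within = sum((x - mean(g)) ** 2 for g in groups for x in g) / (n_groups * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical outcome scores for four employees in each of three branches.
branches = {
    "north": [12, 13, 12, 14],
    "south": [8, 9, 8, 7],
    "east": [10, 11, 10, 9],
}

icc = icc1(list(branches.values()))
print(f"ICC(1) = {icc:.2f}")
```

A value near zero suggests that branch membership explains little of the variability in scores, whereas a sizable value indicates the kind of nesting that multilevel modeling is designed to handle.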
I have deliberately stopped short here of extending this framework to return on investment (ROI) analyses. These require learning professionals to identify additional links in the causal chain beyond those discussed here. For instance, related to targeted skills and knowledge are an organization’s short-term, medium-term, and long-term goals. Ideally, causal threads connect these elements and, once established, learning professionals may determine the profitability of activities thought to occasion targeted skills and knowledge. A sound assessment strategy will aid in subsequent ROI analysis.
Conclusion
My aim was to provide a succinct overview of a number of assessment strategies within a pragmatic framework for workplace learning. The approach is summarized in Figure 4. It begins with defining one’s values (Why?), aims (What?), and vision of reflection for achieving those aims (How?). One then iteratively examines the relationship between reflection and its targeted outcomes by accounting for context and threats to internal validity. This relationship and its components are continuously refined using empirical evidence. The process concludes when what works best for the user is realized. A limitation is its reliance on research methods for causal inference, which may be unfamiliar to some practitioners.

Figure 4. A pragmatic model for developing assessment strategies for reflective learning in the workplace.
Implications for Practice
From this work, several implications for practice emerge. For practitioners, I suggest incorporating this framework into implementation plans and collaborating with research consultants before instituting workplace reflection strategies. For researchers, I suggest accounting for context within assessment plans and involving practitioners to better understand their outcomes of interest. For educators, I suggest asking students to design and assess reflective learning activities within authentic settings and allowing components to emerge from their implications to practice. For theorists, I suggest considering effects as well as first causes, and articulating how assessment is realized within chosen views of reflection. This highlights a final implication for theory: scholars advocating reflective learning must acknowledge the pragmatic concerns of practice and clarify why their recommendations are important, what they produce, and how they manifest in practice.
Footnotes
Conflict of Interest
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Author Biography
Kevin M. Roessger, PhD, is an associate professor of adult and lifelong learning at the University of Arkansas. He has published numerous articles and book chapters in the field’s most respected outlets and is currently overseeing a grant from the Department of Corrections that examines the effect of correctional education programs on recidivism and post-release employment. His research interests include reflective learning strategies and developing reflective skills in adult learners.
