Abstract
Background
This study explores teacher-practitioners’ roles, strategies, and challenges in developing a Generative AI (GenAI)-enhanced educational simulation. Using collective autoethnography, this study focuses on iterative refinement processes that remain underexamined in research on educational uses of GenAI.
Methods
Grounded in Cultural-Historical Activity Theory, we analyzed the experiences of three South Korean teacher-practitioners who co-developed a simulation integrating ChatGPT in Roblox. Data included reflective journals, group discussions, development artifacts, communication records, and contribution tracking derived from these materials, which were analyzed through a triangulation matrix aligning eight development phases with five practitioner roles.
Results
Five practitioner roles were identified: Designer, Instructor, Programmer, Builder, and Manager. Across eight development phases, these roles shifted in response to technical, pedagogical, and ethical demands. This pattern reflected contextual expertise activation, in which practitioners redistributed existing expertise across roles during the design process.
Conclusion
The findings show how teacher-practitioners coordinated instructional and design decisions while developing a GenAI-enhanced educational simulation. They also suggest implications for professional development, especially in technical upskilling, reflective design, and collaborative role flexibility.
Keywords
One of the primary challenges in educational game design is ensuring that digital environments are not only technologically advanced but also effectively meet the nuanced needs of both educators and learners (Ceresia, 2016). When these environments fail to address real-world instructional requirements, their educational impact is significantly diminished. Therefore, designers must craft immersive experiences that map onto clear instructional objectives and fit naturally into varied modes of classroom use such as short in-class simulations, homework extensions, formative-assessment tasks, or self-paced practice modules (Burns, 2012).
Teachers, who are most familiar with classroom constraints, are well positioned to make these alignments. They routinely set learning objectives and plan, manage, and evaluate learning processes based on the needs of learners and the features of the learning environment. In immersive learning experiences, teachers also identify the instructional goals that educational games can fulfill and determine how such games should be used and deployed in accordance with learners’ cognitive and developmental levels. The involvement of teachers as practitioners becomes especially important when the content of the educational game represents classroom realities. Teachers’ direct experience, understanding of learner diversity, and awareness of contextual factors can therefore guide educational game design, ensuring that its elements correspond to the realities of instructional practice (Schneider & Huanca, 2021).
The emergence of Generative Artificial Intelligence (GenAI) increases both the potential and the complexity of this design work. GenAI can produce adaptive dialogue, branching scenarios, and personalized feedback that expand access to rich practice opportunities at scale (Kadaruddin, 2023). However, the integration of GenAI also introduces ethical and operational challenges. Beyond well-known concerns about content bias and hallucination, developers must also address data privacy, algorithmic transparency, sustainability, and continuous monitoring to ensure that models remain aligned with curricular developments (French et al., 2023). In educational contexts, controlling bias and ensuring transparency are essential to maintain fairness. Practitioner involvement helps ensure pedagogical fit and supports the development of procedures that keep GenAI aligned with ethical and regulatory expectations.
Although GenAI technology is rapidly advancing, existing research on its educational applications has rarely examined teacher-led iterative refinement processes. While previous studies have explored how GenAI can support instruction and feedback, they have mainly focused on system performance or learner outcomes, often highlighting risks related to bias and explainability (Pan et al., 2025; Xing et al., 2025). However, there is still little empirical understanding of how practitioners refine GenAI-based simulations. In particular, little is known about how teachers adjust prompts, constraints, materials, and performance protocols in order to produce scenarios that ensure scalability as well as cultural and pedagogical diversity across different classroom contexts (Caccavale et al., 2025; Pozdniakov et al., 2024; Xiao et al., 2025). Understanding this process matters because these adjustments shape whether GenAI outputs remain usable and appropriate for local classroom contexts. When this process is not well understood, biased or culturally mismatched responses may persist, and simulations that appear scalable may still fail to support meaningful classroom use (Kim, 2026; Nyaaba & Zhai, 2025). Clearer accounts of teacher-led refinement can therefore strengthen both research on GenAI design and practical guidance for educational implementation (Cheah et al., 2025; Choi et al., 2026).
Against this background, we examine how practitioner-led iterative refinement processes shaped the design of a GenAI-enhanced simulation under pedagogical and ethical constraints. Specifically, the study asks two research questions: (1) How do practitioners involved in game design describe the iterative process and roles that emerge when working within a GenAI-enhanced digital platform? (2) What challenges arise in their respective roles, and what strategies do they develop to overcome these obstacles? By connecting these questions with the challenges of bias mitigation and scalable design, we treat practitioner roles as one site where GenAI’s generative capabilities meet classroom requirements.
To frame this inquiry, the study employs Cultural-Historical Activity Theory (CHAT), which provides analytical tools for understanding evolving roles, interactions, and collaborative processes in GenAI-enhanced simulations. CHAT is particularly suitable because its focus on mediation, role evolution, and systemic contradictions offers a means to explain how practitioners navigate the complex social and technical dimensions of GenAI implementation.
Literature Review
Educational Game Design and Immersive Learning
Educational games and simulations foster engagement, motivation, and skill acquisition in immersive learning environments (Dede, 2012; Rapaka et al., 2025). Early approaches using traditional simulations and static 3D worlds (Li & Ip, 2022; Pagano, 2013) offered limited adaptability despite representing complex scenarios. For decades, automated content creation in games relied on Procedural Content Generation (PCG; Shaker et al., 2016), which uses algorithms to generate elements like maps, levels, and items based on predefined rules. While effective for structural content, traditional PCG offered limited capacity for dynamic narrative adaptation (Smith et al., 2011).
GenAI supports more adaptive and student-centered learning environments by generating contextually relevant content and enabling personalized learning trajectories (Baidoo-Anu & Ansah, 2023; Chien et al., 2024). Compared with more standardized designs, these systems may allow greater variation in learner experience. Recent research suggests that GenAI supports rapid prototyping, co-creation of narratives, and adaptive storylines in game design, making it easier to develop more flexible and context-sensitive designs (Alharthi, 2025; Moon et al., 2025). For educational simulations, such attributes translate into opportunities for teacher-practitioners to experiment with branching scenarios, culturally responsive dialogues, and generative assessment prompts that would be prohibitively time-consuming to author manually.
However, integrating GenAI presents challenges: mitigating biases, ensuring scalability while maintaining fidelity, and aligning outputs with educational goals (French et al., 2023; Wang et al., 2024). Recent studies have identified additional complexities in GenAI integration, including variable reliability of generated content, divergent stakeholder expectations about AI capabilities (Fui-Hoon Nah et al., 2023), and the need for continuous alignment between rapidly evolving AI models and relatively stable educational standards (Zhang et al., 2024).
In this context, teachers increasingly serve as both instructional experts and design participants. Advanced learning technologies transform student experiences by enhancing engagement and problem-solving (Chuang, 2014; Khukalenko et al., 2022), with teachers ensuring these tools align with curricular goals and teaching strategies (Du et al., 2022; Rachmadtullah et al., 2023). However, many educators struggle to adopt emerging technologies due to limited training and difficulties integrating new tools into established frameworks (Alalwan et al., 2020). As AI tools become more common in schools, teachers are also asked to consider issues of data safety, equity, and ethics (Ng et al., 2023; Wambsganss et al., 2021; Wang et al., 2023).
Effective collaboration among designers, developers, and educators is increasingly recognized as essential for educational game production (Dimitriadou et al., 2021; Linderoth & Sjöblom, 2019). Teachers’ insights into curriculum standards, learner variability, and authentic tasks help shape scenarios that address real educational challenges (Schachter & Rich, 2011), ensuring pedagogical principles remain central.
Teacher-practitioners play a vital role in design processes, yet research rarely focuses on their contributions. Most studies emphasize technical features or broad outcomes rather than practitioner-led decisions. Recent findings highlight that teachers need both technical proficiency and critical “AI skepticism” (Walter, 2024), alongside hybrid expertise combining domain knowledge, algorithmic awareness, and educational design principles (Felix, 2020; Kim, 2024). Here we focus on how practitioners described their work while creating a GenAI-enhanced simulation.
CHAT Theory and Immersive Learning Environment Design
CHAT provides a framework for understanding GenAI’s role in educational game design. Building on Vygotsky’s ideas and later elaborations (Cole, 1998; Kaptelinin & Nardi, 2006; Yamagata-Lynch, 2010), CHAT examines learning as mediated by tools, social interactions, and cultural contexts (Batiibwe, 2019; Qureshi, 2021). In our study, teacher-practitioners are subjects, the GenAI-enhanced game is both object and tool, and the design team functions as the community with an emergent division of labor.
Activity Theory/CHAT has been used extensively in neighboring domains, including HCI, CSCW, and organizational studies, to analyze how mediating artifacts, rules, and division of labor shape collaboration and innovation (Alharthi et al., 2021; Clemmensen et al., 2016). These cross-domain analyses foreground practical patterns we also observe here: tooling shifts that reconfigure roles, rule-object tensions that surface when institutional requirements meet platform defaults, and historically situated contradictions that drive redesign. We use CHAT in a way that is consistent with prior activity-theory research while adapting it to GenAI-based game design.
CHAT directly informed our methodological choices and analytical framework. Specifically, CHAT’s conceptualization of mediated activity and systemic contradictions guided our data collection instruments and analytical process: reflective journals captured tool mediation and rule negotiations; group discussions documented community dynamics; and development artifacts revealed the object’s evolution through iterations.
We first traced GenAI’s mediating role across design phases, from a simple coding assistant to a generator of learner-specific dialogue. We then examined contradictions within and between activity-system elements, as CHAT views these tensions as innovation drivers (Engeström, 2001). Role-tool contradictions emerged when programmers’ expertise marginalized designers’ pedagogical voice; rule-object contradictions arose when institutional demands for classroom realism clashed with default platform assets; and community-division-of-labor contradictions appeared when builders and programmers overlapped in responsibility.
CHAT’s historicity principle informed our longitudinal approach to tracking practitioner role transformation. The theory’s focus on contradictions helped identify critical incidents where role boundaries blurred and new collaborative patterns emerged. In applying CHAT to our research questions, we specifically focused on three core principles: (1) mediation, which examines how GenAI tools transformed practitioners’ activities; (2) role evolution, which describes how the division of labor shifted in response to challenges; and (3) contradiction resolution, which explores how tensions drove innovation and role refinement. By mapping these changes, we show how CHAT constructs help make sense of teacher roles in GenAI-centered design (Lazarou, 2011), and how practitioners’ evolving responsibilities shape the design process and related learning considerations (Barab et al., 2004; Fire & Casstevens, 2013), while also addressing ethical and practical issues.
Methods and Context
This study employed collective autoethnography to examine the experiences of three teacher-practitioners in South Korean K-12 education who participated in the co-development of a GenAI-enhanced educational game for pre-service teachers. The game trains classroom dialogue skills by simulating authentic instructional challenges with elementary pupils. ChatGPT-4 was integrated into Roblox for its contextually appropriate responses and robust API (Lim et al., 2025). Implementation focused on dynamic NPC interactions, adaptive scenario generation across five difficulty levels, and system architecture managing technical constraints through response caching and fallback dialogues. The South Korean educational context, characterized by advanced technological infrastructure alongside traditional pedagogical approaches that emphasize teacher authority, created specific tensions in the design process, particularly with respect to student autonomy in GenAI-enhanced simulations.
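The response caching and fallback dialogues mentioned above can be illustrated with a minimal sketch. It is not the project's implementation, which ran inside Roblox and is not reproduced in this article; the class name, the fallback lines, and the stand-in generator functions are all hypothetical, and Python is used only for readability.

```python
from typing import Callable, Dict, Tuple

# Hypothetical last-resort line, used when a persona has no scripted fallback.
DEFAULT_FALLBACK = "Sorry, could you say that again?"


class NPCDialogue:
    """Cache generated NPC replies and fall back to scripted lines on failure."""

    def __init__(self, generate: Callable[[str, str], str],
                 fallbacks: Dict[str, str]) -> None:
        self._generate = generate      # stand-in for the real model call
        self._fallbacks = fallbacks    # persona -> scripted fallback line
        self._cache: Dict[Tuple[str, str], str] = {}
        self.model_calls = 0           # counts attempted model calls

    def reply(self, persona: str, utterance: str) -> str:
        # Normalize case and whitespace so near-identical inputs share a cache entry.
        key = (persona, " ".join(utterance.lower().split()))
        if key in self._cache:
            return self._cache[key]
        self.model_calls += 1
        try:
            text = self._generate(persona, utterance)
        except Exception:
            # Any failure (timeout, rate limit, filtered output) yields a scripted line.
            return self._fallbacks.get(persona, DEFAULT_FALLBACK)
        self._cache[key] = text
        return text


def demo_generator(persona: str, utterance: str) -> str:
    """Hypothetical generator that always succeeds."""
    return f"[{persona}] I heard: {utterance.strip()}"


def failing_generator(persona: str, utterance: str) -> str:
    """Hypothetical generator that simulates an API timeout."""
    raise TimeoutError("simulated API timeout")
```

In this sketch, failed calls are not cached, so a later attempt can still reach the model. Caching also trades some variety in NPC responses for lower latency and cost, which is one reason a real design might cache only selected scenario openings.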
We selected collective autoethnography with a CHAT lens over alternative approaches for three main reasons. First, although case study methods can be applied to process-oriented phenomena, they often rely on external observation and therefore lack the reflexive depth needed to capture how teachers themselves make sense of and reshape GenAI-based design work in real time. Collective autoethnography, by contrast, situates practitioners as both participants and interpreters, allowing access to tacit reasoning and value conflicts that remain invisible to outside researchers. Second, while survey or quasi-experimental designs can measure trends or outcomes across larger samples, they are less suited to tracing the iterative, contradiction-driven cycles that CHAT explicitly conceptualizes. Third, combining autoethnography with CHAT made it possible to examine mediation tools, shifting roles, and systemic tensions as they emerged throughout the development process, providing process-level insights unattainable through interview-based or broad quantitative approaches. Similar integrations of autoethnography and CHAT have been used to analyze teacher learning and human-AI collaboration in educational contexts (Batiibwe, 2019; Kim & Reichmuth, 2020; Mao et al., 2023; Panke, 2025).
Practitioners’ Profiles
Data Sources
Reflective journals were guided by structured prompts mapped to the five CHAT dimensions (e.g., how mediation tools shaped decisions, rules constrained actions, or role divisions shifted during collaboration). Semi-structured discussions deliberately explored emerging contradictions, with researchers prompting elaboration on instances where role boundaries blurred. The triangulation matrix organized evidence in an 8×5 framework (phases × preliminary role functions), with each cell containing relevant journal entries, discussion transcripts, code commits, and communication logs. This structure revealed patterns like how Designers initially struggled with GenAI prompt engineering in Phase 4 but developed systematic templates by Phase 5, and how Builders evolved from implementers to co-designers as they gained technical confidence.
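As a rough illustration of how such an 8×5 matrix could be represented, the sketch below keys each cell by phase and role and lists the cells supported by more than one type of source. The phases are referred to by number only, the sample evidence notes are invented placeholders, and the two-source threshold is an assumption made for this example rather than a rule used in the study.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

ROLES = ("Designer", "Instructor", "Programmer", "Builder", "Manager")
PHASES = tuple(range(1, 9))  # eight development phases, numbered 1 to 8
SOURCES = ("journal", "discussion", "commit", "communication")


@dataclass(frozen=True)
class Evidence:
    phase: int
    role: str
    source: str
    note: str  # short pointer to the underlying excerpt


Cell = Tuple[int, str]


def build_matrix(items: Iterable[Evidence]) -> Dict[Cell, List[Evidence]]:
    """Place each piece of evidence in its (phase, role) cell."""
    matrix: Dict[Cell, List[Evidence]] = {(p, r): [] for p in PHASES for r in ROLES}
    for ev in items:
        if (ev.phase, ev.role) not in matrix:
            raise ValueError(f"unknown cell: phase {ev.phase}, role {ev.role}")
        if ev.source not in SOURCES:
            raise ValueError(f"unknown source type: {ev.source}")
        matrix[(ev.phase, ev.role)].append(ev)
    return matrix


def triangulated_cells(matrix: Dict[Cell, List[Evidence]],
                       min_sources: int = 2) -> List[Cell]:
    """Return cells backed by at least `min_sources` distinct source types."""
    return sorted(cell for cell, evs in matrix.items()
                  if len({ev.source for ev in evs}) >= min_sources)


# Invented placeholder entries, for illustration only.
sample = [
    Evidence(4, "Designer", "journal", "difficulty adjusting persona prompts"),
    Evidence(4, "Designer", "commit", "persona template revised"),
    Evidence(5, "Builder", "discussion", "proposed a layout change"),
]
```

Calling `triangulated_cells(build_matrix(sample))` on these placeholder entries returns only the Phase 4 Designer cell, because it is the one cell with evidence from two different source types.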
Data Analysis
All collected data, including journals, transcripts, chat logs, development artifacts, and contribution tracking derived from these materials, were imported into Atlas.ti for qualitative coding and cross-source comparison. Our analysis followed an iterative process informed by Cultural-Historical Activity Theory. We began with an initial coding frame based on the five CHAT dimensions and used it to identify actions, decisions, tensions, and role negotiations across the data. As analysis progressed, additional data-driven codes were added, revised, and grouped through repeated comparison across journals, group discussions, communication records, and artifacts.
The Findings section was organized through three analytic categories. Phases were identified from recurring clusters of tasks, feedback, and revision activity across the project timeline. Roles were traced from preliminary role functions identified during team organization and then refined through repeated patterns of task ownership, self-positioning, and role overlap across sources. Engagement dynamics were informed by practitioners’ self-reported contribution levels across phases and roles, which were interpreted alongside journals, discussions, and artifacts to examine shifts in participation, collaboration, and role overlap. Throughout this process, CHAT guided our interpretation of how tools, rules, community relations, division of labor, and the object of activity shaped these patterns.
For example, data from Phase 4 showed how one analytic cluster developed across sources. Journal entries described difficulties in developing NPC personas and adjusting prompts. Meeting discussions focused on who would revise personas, prepare examples, and test output quality. Development artifacts documented repeated changes to persona templates and NPC scripts. Practitioners’ contribution records also showed increased involvement during this stage. Read together, these materials were coded as persona revision, prompt adjustment, technical troubleshooting, and role negotiation. This cluster contributed to our identification of Phase 4 as a distinct period of program development, showed overlap between design and programming work, and helped explain the stronger engagement observed during iterative revision.
Credibility and Trustworthiness
Because three of the four research team members also served as participants, particular attention was paid to reflexivity and analytic transparency. Three coders independently reviewed and coded the data, then compared interpretations and resolved differences through iterative discussion to reach coding consensus. Reflexive journals included meta-reflections on potential bias and decision rationales, which were reviewed across the team to identify possible self-confirmation or overinterpretation. To further address the dual-role issue, an external auditor, a senior scholar affiliated with the project but not involved in data collection, reviewed the coding framework, representative data-to-category mappings, and analytic memos. Triangulation across journals, group discussions, communication records, development artifacts, and contribution tracking derived from these materials further supported the credibility of the interpretations. Together, these procedures strengthened reflexive awareness, transparency, and rigor across the analytic process.
Findings
RQ1. How do Practitioners Involved in Game Design Describe the Iterative Process and Roles That Emerge When Working Within a GenAI-Enhanced Digital Platform?
Phases of Development
The Phases of the GenAI-Based Simulation Development Research Project
Educational Game Roles and Objectives
The GenAI-enhanced educational simulation game was conceptualized by practitioners to serve multiple pedagogical functions within pre-service teacher training. Throughout the design process, practitioners articulated and refined its educational roles and objectives, positioning the simulation as an integrated learning tool rather than merely supplementary material. Practitioners defined the primary educational role of the simulation as a structured practice environment in which pre-service teachers could develop classroom dialogue skills in a low-risk setting. The game was also designed to function as a formative assessment tool providing immediate feedback on communication strategies and classroom management techniques.
In terms of implementation context, practitioners conceptualized the game to function in three distinct educational settings: (1) as an in-class guided activity facilitated by instructors to demonstrate specific techniques and prompt group discussion; (2) as a self-directed learning resource that students could access independently to practice scenarios at their own pace; and (3) as a reflective assessment tool where instructors could assign specific scenarios as formal assignments to evaluate communication competencies. Specific learning objectives were established for each implementation context. For in-class guided activities, objectives focused on collaborative problem-solving and real-time adaptation to student responses. For self-directed learning, objectives emphasized repeated practice and self-evaluation of communication strategies. For assessment purposes, instructors evaluated pre-service teachers on their ability to apply theoretical knowledge in authentic classroom situations, their responsiveness to diverse student needs, and their capacity to adjust teaching approaches based on student feedback. The educational objectives extended beyond technical skill development to include fostering adaptive thinking, cultural sensitivity, and reflective practice. These broader objectives shaped design decisions, particularly in creating NPC personas that represented diverse student characteristics and learning needs relevant to the practitioners’ Korean elementary school contexts. Implementation required ethical considerations in GenAI integration, including addressing potential biases, protecting data privacy, and documenting AI decision-making processes for transparency about the adaptive elements.
Practitioner Roles
Roles of the Practitioners
Engagement Dynamics
Figure 1 depicts practitioner engagement across the eight phases based on practitioners’ self-reported contribution levels across roles and phases. During initial phases, engagement levels were moderate, with practitioners contributing to administrative and foundational tasks. In defining individual roles, practitioners were recognized for their competence in educational settings, enabling them to take on design-focused positions.
Figure 1. Engagement trends across phases, illustrating the intensity of practitioner involvement at different stages of the design process.
Engagement peaked during the middle phases, particularly in planning, development, and testing. These phases involved hands-on activities such as designing game components, scripting features, and iterating based on feedback. Collaboration was especially evident here, with practitioners working closely to align pedagogical goals with technological innovations. In later phases, such as usage evaluation and interview analysis, engagement declined as practitioners focused on procedural tasks like conducting assessments and analyzing feedback. Although crucial for ensuring quality, these tasks demanded independence and precision rather than collaborative creativity. The final modification phase emphasized error correction and optimization, offering limited opportunities for dynamic collaboration. The engagement trends underscore variability in roles across phases. Designers and Programmers showed the highest involvement during the middle phases, Instructors maintained steady engagement throughout, Builders focused on testing phases, and Managers maintained consistent engagement across all phases.
Role Fluidity
Roles evolved as practitioners encountered new challenges. Role ambiguity, initially viewed as problematic in early phases, became recognized as valuable by Phase 5. The team developed contextual expertise activation, fluidly shifting between roles based on specific design challenges.
The dual analytical approach revealed how roles adapted to project requirements. Boundaries frequently blurred, particularly between Designers and Programmers. Designers engaged in debugging scripts alongside Programmers to ensure seamless functionality, while Instructors and Designers integrated pedagogical expertise into game mechanics to ensure the simulation aligned with instructional objectives while remaining immersive.
Roles evolved dynamically to meet shifting project needs, and this flexibility was essential for overcoming challenges such as technical barriers and GenAI integration. GenAI-driven dialogue, for example, required collaboration to achieve both technical accuracy and pedagogical relevance. Despite these collaborative achievements, role ambiguity and technical barriers necessitated enhanced communication strategies, including weekly meetings and clear task assignments.
RQ2. What Challenges Arise in Their Respective Roles, and What Strategies do They Develop to Overcome These Obstacles?
The process of designing a GenAI-enhanced educational game revealed a series of intricate challenges faced by practitioners across their respective roles. These challenges, identified through collective autoethnography, reflect the dynamic and evolving nature of interdisciplinary collaboration in educational game development. The following section describes the challenges practitioners reported in navigating the dual domains of pedagogy and technology and the strategies they used to address these hurdles.
Designer
The practitioner-designers faced two notable challenges during the development of the GenAI-enhanced educational game: (1) limited programming knowledge, which contributed to their reduced influence or “weak voice” within the design process, and (2) achieving the game’s pedagogical goals considering the learning environment.
Limited Programming Knowledge and the “Weak Voice”
One critical challenge encountered by practitioner-designers was their limited technical expertise in programming. This lack of programming knowledge often weakened their ability to articulate and assert their ideas in technical discussions. The issue was exacerbated by the presence of programmers in the community with backgrounds in educational technology, who were often perceived as more competent in bridging technical and educational domains. Consequently, the practitioners’ voices were underrepresented, and direct communication between project stakeholders and programmers became the norm, bypassing the practitioners. This misalignment in the division of labor undermined the practitioner-designers’ ability to influence key design decisions effectively.
To address this imbalance, the practitioners recognized the need to expand their skill sets to include basic programming knowledge. Doing so would allow them to engage more fully in technical discussions and assert their perspectives in the design process, and it further contributed to their taking on a new role as Programmers. As one practitioner reflected: “After that, the development team was just involved in troubleshooting and maintenance by communicating directly with the client, so you could say that the design team’s position was quite awkward at this point.” (Translated); “Rather, if they were developers, they would be able to take on a role in responding to issues as they arise, but in this process …” (Translated)
Achieving the Game’s Pedagogical Goals Considering the Learning Environment
The other challenge was designing the game environment so that it effectively reflected key instructional elements. Because the game is built around a specific instructional goal, namely enhancing pre-service teachers’ skills in communicating with students when dealing with classroom challenges, the game mechanics and scenarios needed to be carefully developed within the instructional design for pre-service teachers. However, practitioner-designers found it challenging to align game elements with pedagogical intentions.
To address this issue, practitioner-designers drew on the academic literature about educational games, game-based learning, and GenAI. They consulted journals, articles, and news focusing on game elements and conversation strategies in order to maintain pedagogical authenticity and engagement in the learning experience. As one practitioner reflected: “I appreciate how the inclusion of a unique game element called ‘objects’ helps guide users toward achieving both the game’s goals and its educational objectives, thereby enhancing pedagogical authenticity. This decision was influenced by research by Amory (2007), suggesting that game objects can be designed to support authentic learning activities when educational games are structured as relevant, exploratory, engaging, and socially interactive environments.” (Translated)
Instructor
The practitioners, in their roles as instructors, faced two significant challenges during the development of the GenAI-enhanced educational game: (1) limited recognition of the importance of realism in map design and NPCs, and (2) navigating the diverse expertise within the community of educational professionals.
Limited Recognition of Realism in Map and NPC Design
One of the key challenges was the lack of attention and interest from community members regarding the realism of the map and NPCs used in the simulation. While practitioner-instructors recognized the importance of aligning the simulation’s aesthetic and cultural context with real-world classroom settings, the default components in Roblox were not suitable for creating maps or NPCs reflective of Korean classrooms.
The practitioner-instructors, drawing upon their teaching expertise, addressed this challenge by taking on additional roles in programming and map-building tasks. By leveraging their empirical knowledge of Korean classroom environments, they reconstructed the map to accurately reflect realistic classroom layouts and developed contextually grounded NPC personas based on classroom interaction patterns and learner characteristics drawn from the practitioners’ experience in Korean school settings. These modifications enhanced the authenticity of the simulation, aligning it with the project’s pedagogical goals. As one practitioner explained: “… I mean, my experience as a teacher was subconsciously used to imagine learning problems that are highly relevant to the environmental background of the students, and to connect those problems and backgrounds of the NPC personas to the flipped learning classroom situation, which was the basic context of the simulation. …” (Translated); “Teacher researchers have the advantage of conducting research based on empirical data based on actual practice in the field, which can reflect the process and problems of actual practice that are difficult to understand if the research is conducted without that empirical knowledge.” (Translated)
Diverse Expertise Within the Community
The second challenge stemmed from the community’s diverse professional backgrounds, which included student researchers, professors, school teachers, and programmers. While this diversity brought a wide range of perspectives, it also introduced complexities in aligning the members’ understanding of the limitations of the current educational system and the potential of immersive learning environments, GenAI, and simulations. To navigate this challenge, the community actively sought to identify commonalities among its members, fostering horizontal and collaborative interactions. This shared context allowed community members to engage in meaningful exchanges of feedback and innovative ideas, which advanced the project cohesively. As one participant reflected: “Since we all had a common context of educational experience in the school setting and a common experience …, I can say that we can overcome the problem in communicating with each other and giving feedback.” (Translated)
Programmer
The practitioner-programmers in this study faced a significant challenge: the prolonged time required to complete game development tasks. As non-professional programmers with only limited technical knowledge, they struggled with coding and scripting, which led to inefficiencies in the overall project workflow. Despite these limitations, the practitioner-programmers exhibited resourcefulness in overcoming this obstacle, employing various tools and strategies to enhance their efficiency and confidence in game development.
Limited Technical Knowledge and Inefficiency
The practitioners’ lack of advanced programming skills hindered their ability to complete development tasks swiftly. Tasks like coding objects, creating non-playable characters (NPCs), and integrating GenAI functionalities into the game demanded substantial time and effort. These challenges were compounded by their unfamiliarity with programming concepts, making the process not only time-consuming but also prone to errors and setbacks. One participant described their experience: “The book object was intimidating because I had to code it by typing prompts into the system, …” (Translated); “I’ve been tasked with developing personas. I wasn’t sure what to consider to develop personas.” (Translated) These reflections underscore the practitioners’ initial struggles with the technical aspects of game development, highlighting the steep learning curve they faced.
To address these challenges, the practitioner-programmers employed three key tools and strategies. First, and among the most effective, was collaborating with the more experienced programmer within the research group. This collaboration leveraged the interdisciplinary expertise of the community, fostering a supportive environment where members could share knowledge and troubleshoot problems together. Second, the practitioner-programmers turned to online resources, including instructional videos, discussion forums, and programming guides available on platforms like YouTube and the Roblox homepage. These resources, products of the broader programming community’s sharing culture, provided accessible tutorials and step-by-step instructions that demystified complex coding tasks. One participant noted: “… but I was able to implement the object through mediums such as YouTube and the Roblox homepage. This experience made the Roblox code and scripts more accessible to me so that I could feel more confident in my game development.” (Translated)
Third, repeated experimentation allowed practitioners to refine their skills and achieve their objectives through trial and error. The ease of using GenAI tools and the iterative nature of coding facilitated a learning-by-doing approach, enabling practitioners to gradually improve their technical proficiency. This hands-on experimentation not only enhanced their programming abilities but also fostered confidence in their capacity to tackle similar tasks in the future.
By leveraging these tools, the practitioner-programmers overcame their initial challenges and contributed effectively to the game development process. Their efforts not only enhanced the technical quality of the game but also fostered a deeper understanding of programming principles, enabling them to approach future projects with greater confidence and competence.
Builder
Practitioners in the role of builders faced significant challenges, including prolonged development timelines and the inability to optimize or fully implement certain design aspects. The practitioner-builders, lacking advanced expertise in map construction and object selection, encountered difficulties in translating conceptual designs into functional components within the GenAI-enhanced simulation. To overcome these obstacles, the practitioner-builders employed a primary strategy: engaging with students and colleagues for support. As educators, practitioner-builders leveraged their professional networks, consulting students and colleague researchers who possessed greater familiarity with the Roblox platform. The elementary students, in particular, offered valuable insights not only into the technical aspects of Roblox but also into its popular games and design preferences. One practitioner reflected on their experience: “I asked elementary school students and teachers who knew and designed Roblox. Especially for elementary students, it was useful to get not only Roblox technology, but also famous Roblox games, desired Roblox games, and ideas for educational games. I also got advice on Roblox technology from Ms. 000, the teacher I’m working with.” (Translated)
These findings align with broader trends in contemporary learning practices, where individuals actively seek knowledge from diverse, decentralized sources to overcome technical challenges. By engaging with both their immediate professional community and their students, practitioner-builders demonstrated adaptability and resourcefulness in addressing skill gaps.
Manager
The role of the practitioner-manager in this study posed two primary challenges: (1) difficulty in assessing and utilizing the skills of team members and (2) managing feedback and issues from usability tests. These challenges stemmed from the multifaceted nature of the team’s composition and the dynamic requirements of game development.
Difficulty in Assessing and Utilizing Team Members’ Skills
The practitioner-manager struggled to identify and evaluate the diverse competencies of the team members. The multifaceted roles required of the practitioners, which spanned design, instruction, programming, and building, further complicated the process. The lack of objective criteria for assessing skills introduced ambiguity, making it difficult to allocate tasks effectively and ensure optimal utilization of individual expertise. One participant noted: “One of the difficulties for practitioners is that the process of verifying what competencies and abilities other practitioners have is very vague and complex. Practitioners can only prove themselves by their work experience, which is very qualitative and cannot be evaluated by any metric other than the practitioner’s explanation of what it means.” (Translated)
This lack of standardized evaluation measures led to inefficiencies in role assignment and task distribution, with some team members being underutilized while others were overburdened.
Leveraging Team Strengths and Transparent Role Assignment
To address this challenge, the practitioner-manager implemented several strategies to enhance role clarity and optimize the division of labor. Allocating time for team members to introduce themselves and articulate their strengths provided a qualitative foundation for role assignments. This approach gave the manager insight into individual competencies and a better understanding of how each member could contribute to the project. This transparency promoted efficiency and helped the team adapt to the dynamic demands of the project. As one participant reflected: “When assigning roles within the team, we shared characteristics among team members so that we could take into account individual characteristics and strengths. I think this helps to ensure that roles are assigned to a relatively small number of people so that they can be handled efficiently by considering their actual work capabilities.” (Translated)
Managing Feedback and Issues from Usability Tests
The usability tests introduced a second layer of complexity, requiring the practitioner-manager to efficiently collect, organize, and address user feedback. This task involved identifying errors, documenting them, and ensuring they were communicated to the relevant team members responsible for resolving the issues.
The practitioner-manager employed a combination of tools and strategies to streamline feedback management. A social media group chat was established to facilitate quick and informal feedback exchanges. This platform allowed team members to promptly share insights and address immediate concerns, fostering a dynamic feedback loop. For more structured error tracking and task distribution, the manager transitioned to a collaborative document platform. This tool provided a centralized repository for organizing feedback, assigning responsibilities, and monitoring progress. A participant emphasized the importance of this adaptability: “We chose a social media group chat for quicker feedback gathering. When the need arose for precise error tracking and task distribution, we transitioned to a collaborative document platform.” (Translated)
The findings highlight the critical role of the practitioner-manager in balancing team dynamics and feedback mechanisms in the development of GenAI-enhanced educational games. By implementing strategies to assess team members’ strengths and employing flexible communication tools, the manager effectively overcame challenges related to role assignment and feedback management. These solutions underscore the importance of transparent communication in fostering successful collaborative environments for complex, interdisciplinary projects.
Discussion
This study addresses a central challenge in integrating Generative AI (GenAI) into educational game design: how to scale these systems across classroom contexts while maintaining pedagogical alignment and ethical integrity. While previous research has discussed these concerns in principle (French et al., 2023; Wang et al., 2024), how teacher-practitioners actually confront and navigate them in design practice has remained underexplored. Using Cultural-Historical Activity Theory, we connect our findings to these gaps and consider their implications for professional learning and design practices in educational technology.
Contributions to Understanding Practitioner Roles (RQ1)
Teacher-practitioners moved fluidly among five roles: Designer, Instructor, Programmer, Builder, and Manager, illustrating CHAT’s claim that the division of labor shifts with changing tools and objectives (Engeström, 2001). This role fluidity helped address scalability concerns: as teachers gained cross-role competencies (e.g., Designers learning basic programming), the team reduced design bottlenecks and revised features more efficiently. Similarly, when Instructors co-created culturally responsive NPC scripts, they mitigated content bias by embedding authentic classroom narratives into GenAI outputs.
From a pedagogical standpoint, the findings extend previous understandings of teacher agency in technology-integrated design (Garreta-Domingo et al., 2017; Ivanova, 2021; Paniagua & Istance, 2018), which suggest that teachers need to continually refine their roles to adapt to evolving technological environments. Rather than simply confirming that teachers require multifaceted competencies, the study shows how these competencies are activated and negotiated through iterative collaboration with GenAI. This process can be understood as contextual expertise activation, in which teachers repurpose pedagogical expertise as a design and technical resource to address cultural sensitivity, ethical concerns, and scalability. This interpretation can be situated in relation to discussions of adaptive expertise and boundary-spanning work. In our case, the findings make visible a more specific process in which existing expertise was redistributed across roles and applied directly to prompts, personas, and review processes within the design workflow. Taken together, the findings suggest that professional growth in this setting involved both learning new skills and reworking existing expertise, and that flexible role allocation, rather than fixed task boundaries, supports ethically and pedagogically sound GenAI integration at scale.
Contributions to Ethical and Sustainable Design
Our findings reveal that ethical considerations were central to the practitioners’ design process. Data privacy was an important consideration during design, prompting discussion about how user interaction data should be handled while still enabling meaningful feedback. Transparency was treated as an important concern in documenting and reviewing AI-supported design decisions. AI bias proved important in simulating classroom interactions: practitioners reviewed scenarios, personas, and generated responses to reduce problematic outputs. These review practices and the use of diverse personas translated abstract principles of fairness into concrete design practice. Beyond reaffirming the importance of these issues (Holmes et al., 2023), our analysis offers empirical evidence of how teachers applied AI ethics in design decisions.
Taken together, these cases suggest that ethical concerns shaped several design decisions. The bias-auditing routines exemplified CHAT’s notion of systemic contradiction as a source of improvement: rather than treating bias as a barrier, practitioners leveraged the tension between fairness and scalability to refine both algorithms and pedagogy. This perspective repositions teachers from ethical gatekeepers to co-regulators who embed transparency and accountability directly within design workflows by shaping prompts, personas, and review routines alongside technical development. Long-term sustainability further guided design choices. Practitioners modularized system components so that updates could be implemented independently as AI models evolved. The team relied on repeated review and feedback to check whether generated elements remained pedagogically appropriate. These practices show how ethical tensions, such as bias risks or unclear decision processes, can be turned into design principles, reinforcing CHAT’s view of contradictions as productive forces for systemic learning.
Contributions to Addressing Role-specific and Cross-Cutting Challenges (RQ2)
The study identifies specific challenges encountered by teacher-practitioners during the development of GenAI-enhanced educational games. These challenges underscore the intricate interplay among technical, pedagogical, and managerial responsibilities required to create effective immersive learning environments. By examining journal reflections, meeting transcripts, and prototype artifacts together, we show how teams developed solutions that balanced automation with pedagogical control and translated ethical and scalability concerns into concrete design practices.
Practitioner Roles and Nuanced Challenges
Designers often lacked in-depth programming knowledge, which limited their voice when technical choices were made. This gap blurred their responsibilities and made it hard to convey pedagogical intent. By completing short, targeted programming courses and working inside a cross-disciplinary design process, they gradually built enough fluency to translate learning goals into concrete feature requests and to challenge purely technical decisions when they risked undermining educational value (Dimitriadou et al., 2021).
Instructors needed each map, quest, and NPC to feel like a believable extension of real classrooms, yet translating culture and context into game assets proved demanding. They met the challenge by sharing classroom anecdotes during asset reviews and by co-authoring scenario scripts with designers, ensuring that every dialog line or environmental cue pointed back to authentic teaching practice (Schachter & Rich, 2011).
Practitioners who served as programmers lacked formal training, posing significant technical challenges. They overcame these barriers by seeking help from experienced colleagues, using online resources, and engaging in iterative experimentation. This proactive approach improved their technical proficiency and facilitated the integration of GenAI features, ensuring that technological advancements supported pedagogical aims (Suenson, 2015).
Builders struggled to optimize interactive elements because they were unfamiliar with advanced authoring tools. Relying heavily on user feedback, they refined interfaces and functionality. By incorporating insights from pre-service teachers and user testing, builders enhanced the usability and engagement of the educational games (Yilmaz & Cagiltay, 2016).
Managers faced the dual challenge of coordinating diverse team members and managing feedback from usability tests. They implemented flexible communication strategies, such as shifting between social-media platforms and collaborative documents, to streamline feedback collection and task allocation. This adaptability fostered effective coordination and timely issue resolution, sustaining project momentum and quality (Ropponen & Lyytinen, 2000).
To strengthen practitioner readiness for GenAI-enhanced educational game design, we recommend a dual focus on technical upskilling and reflective pedagogy. Opportunities for basic programming practice, prompt refinement, and collaborative design reflection, supported by design journals, may help practitioners connect technical decisions with pedagogical consequences. As teams become more comfortable working across these domains, they may be better positioned to design more context-sensitive and pedagogically grounded learning activities.
Addressing GenAI-Specific Challenges in Learning Experience Design
Beyond role-specific challenges, our findings highlight issues in GenAI implementation, emphasizing pedagogical accountability within algorithmic systems. The “black box” nature of many GenAI systems created tensions between technological sophistication and pedagogical transparency. Practitioners developed visualizations of prompt-response pathways and implemented stepped difficulty progressions to make learning scaffolds more visible. Maintaining educational integrity required continuous vigilance. Practitioners repeatedly reviewed AI-generated dialogue to examine its educational appropriateness and cultural sensitivity. This review process was intended to keep generated dialogue aligned with pedagogical expectations. Finally, practitioners faced ethical questions about how to set appropriate boundaries for simulated experiences, developing strategies such as graduated exposure and reflective debriefing to situate these experiences within broader educational goals.
Cross-Role Solutions and Collaborative Team Dynamics
The most effective solutions emerged through collaboration. When GenAI produced pedagogically inappropriate responses, Designers and Programmers created hybrid approaches that combined fixed templates with generative elements. Instructors contributed authentic classroom contexts, which Builders then translated into suitable game components. From a CHAT perspective, these exchanges show how interconnected roles shaped the design process. Within this collaborative environment, contributions were valued for their relevance rather than hierarchy. Practitioners from different backgrounds worked across pedagogical and technical boundaries. However, their contribution went beyond mediation alone, as they also participated directly in shaping technical decisions and review processes within the design workflow. Through these exchanges, the team developed a shared language that connected pedagogy, ethics, and technical concerns, showing that interdisciplinary co-design is not only efficient but also a source of new professional knowledge.
Limitations and Future Research
This collective autoethnography explores teacher-practitioners navigating GenAI-enhanced game design. Because autoethnography produces context-bound rather than generalizable findings, our results offer transferable, process-oriented practitioner insights that should be seen as starting points for further investigation rather than universal models.
While we used triangulation across data sources, independent coding followed by consensus discussion, reflexive memoing, and an external audit to strengthen interpretive rigor, several limitations remain. First, the small sample size (three teachers) within a single national context restricts the generalizability of findings beyond the Korean setting. Second, because participants had substantial expertise in technology-enhanced learning and GenAI, the challenges identified, such as role fluidity across design, programming, and instructional domains, may differ for teachers with less technological proficiency or those working in lower-infrastructure environments. Future research should (a) replicate the design protocol in diverse cultural and technological contexts, (b) trial the simulation in teacher-education programs, and (c) conduct longitudinal studies to examine sustainability and cultural influences on practitioner roles.
Our findings offer practical implications for teacher preparation programs: the five roles provide a scaffold for GenAI integration modules, and effective implementation requires collaborative environments that cultivate hybrid expertise with an emphasis on contextual expertise activation.
Footnotes
Informed Consent
All participants were informed about the study and provided written informed consent to participate voluntarily.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Generative AI Use Disclosure
During the preparation of this manuscript, the authors used ChatGPT (OpenAI) to assist with translation, paraphrasing, and language editing. All AI-assisted output was reviewed, revised, and verified by the authors, who take full responsibility for the final content of the manuscript.
Author Biographies
