Abstract
Background:
Serious illness conversations (SICs) are an important part of medical care, yet trainees continue to express discomfort with them. Educational opportunities are time- and resource-intensive. This pilot study's aim was to explore the use of artificial intelligence (AI) technology deployed for SIC practice and feedback.
Methods:
2-Sigma, an AI-based simulation platform, was deployed as a tool for medical students to practice SICs utilizing skills taught through a dedicated Advance Care Planning (ACP) curriculum. Eleven medical students completed 27 cases and full transcripts were reviewed to describe (1) student interactions with AI, (2) AI responses, and (3) AI-provided feedback.
Results:
Students interacted with AI utilizing skills taught in the ACP curriculum. AI responses displayed emotion but quickly resolved negative feelings. Feedback addressed important components of SICs but varied in its specificity.
Conclusion:
AI technology represents a potential tool for practicing skills necessary for SICs.
Background
Effective serious illness conversations (SICs) are critical to ensure patients receive goal-concordant care as their disease progresses.1 Structured curricula designed to teach SIC skills frequently utilize standardized patients (SPs), role-play, observation, and conversation guides.2–5 Additionally, trainees report that SIC educational experiences primarily occur through observation and didactics.5,6 Despite these educational experiences, trainees and physicians continue to express discomfort having SICs.1,2,7,8
Patients and families prefer providers who are empathic, trusted, and skilled in SICs.9–11 Patients also want to receive serious information while having their emotional needs addressed.9 While best practice in empathic communication is to address emotion, physicians commonly respond with medical data or do not acknowledge the emotion.12–14 Our current medical student curriculum utilizes didactics, role-play, and SP encounters to build these skills, yet resource constraints limit expansion.4
Recent advances in artificial intelligence (AI) offer potentially scalable methods to allow trainees to practice SICs in a safe environment.15 Building on our established framework for responsible AI integration in medical education,16 this pilot study explores the feasibility of using AI as a modality to practice SICs while also identifying opportunities to refine AI’s responses for a more realistic experience.
Methods
Study population and setting
Study participants were fourth-year medical students participating in a previously established 2-week elective including a dedicated Advance Care Planning (ACP) curriculum.3 Students practice SICs with an SP at both the beginning and end of the elective.
Pilot intervention
2-Sigma, an AI-based simulation platform that incorporates adaptive clinical scenarios, was developed at UCCOM by LT and SO.17 The 2-Sigma platform utilizes large language model architectures with chain-of-thought and few-shot prompting techniques to simulate realistic virtual patient encounters and deliver immediate AI-generated feedback. 2-Sigma was instructed to create a goals-of-care (GOC) case, provide only necessary background information, share information only when asked, and require students to elicit details on goals, priorities, and concerns before the AI was willing to make medical decisions. It was instructed to provide feedback on skills such as assessing patient understanding, using validating statements, eliciting patient goals and concerns, and indicating future steps.
Students were prompted to identify a specialty or scenario to generate the case. AI simulated a patient while the student acted as their physician. Students read AI responses and typed or used voice-to-text to respond. At the end of the conversation, AI provided text feedback.
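The instructions described above can be illustrated with a minimal sketch of how such a prompt might be assembled. This is not the 2-Sigma implementation, which is not described at this level of detail here; the function name `build_system_prompt`, the two lists, and the prompt wording are hypothetical and serve only to show the instructions listed in the text as a single configuration.

```python
# Hypothetical sketch only: names and wording are assumptions, not the 2-Sigma prompt.

# Behavioral rules for the simulated patient, taken from the description in the text.
PATIENT_RULES = [
    "Create a goals-of-care case for the specialty or scenario the student requests.",
    "Provide only necessary background information.",
    "Share information only when asked.",
    "Do not make medical decisions until the student has elicited goals, priorities, and concerns.",
]

# Skills the end-of-conversation feedback should address, as listed in the text.
FEEDBACK_SKILLS = [
    "assessing patient understanding",
    "using validating statements",
    "eliciting patient goals and concerns",
    "indicating future steps",
]


def build_system_prompt(scenario: str) -> str:
    """Combine the student's requested scenario with the rules and feedback skills."""
    rules = "\n".join(f"- {rule}" for rule in PATIENT_RULES)
    skills = "\n".join(f"- {skill}" for skill in FEEDBACK_SKILLS)
    return (
        "You are simulating a patient; the student acts as the physician.\n"
        f"Scenario requested by the student: {scenario}\n"
        f"Rules for the simulated patient:\n{rules}\n"
        f"At the end of the conversation, give written feedback on:\n{skills}\n"
    )


if __name__ == "__main__":
    print(build_system_prompt("oncology"))
```

Keeping the patient rules and the feedback skills in separate lists would make it straightforward to vary one without the other, for example when broadening the range of patient reactions during later prompt refinement.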
Data collection and review
Two investigators (D.K., D.E.W.) utilized deductive thematic analysis to independently review transcripts and explore how students utilized SIC frameworks when interacting with the AI. The same investigators utilized inductive thematic analysis of transcripts and AI feedback to develop codes and subsequently met to reconcile differences and develop themes. A third investigator (L.A.H.) then reviewed all transcripts to ensure no other codes or themes arose, validating results through investigator triangulation. Microsoft Excel was used to organize data.
The analysis aimed to (1) understand how students incorporated curricular content into interactions with the AI, (2) describe strengths and limitations of the AI-generated content to inform iterative improvements to the prompt engineering, and (3) evaluate the AI’s ability to provide effective feedback.
Results
Participation
Eleven students participated in a total of 27 SIC cases with AI feedback being generated for 24 of the cases. All authors compared the student prompt with AI-generated clinical scenario and case goal to ensure face validity (Table 1).
Table 1. Generated Case and Goals Based on the Prompt Submitted by the Learner. Completed Cases Included AI-Generated Feedback
yo, year old.
Student interaction
Analysis of the SIC case transcripts demonstrated students interacted with AI using skills taught in the course including sharing difficult news, eliciting understanding and goals, and practicing preferred empathic phrases such as “I wish.” Analysis also identified areas for improvement such as quickly pivoting to treatment options or using less preferred empathic statements such as “I understand” (Table 2).
Table 2. Student Interaction with AI
Columns: Skill Practiced; Representative Statement.
** Generated by the AI program to represent a nonverbal response.
AI responses
Analysis of AI responses demonstrated an overall consistent pattern of behavior from case to case. AI would have emotional responses to difficult news, which quickly resolved or shifted to acceptance of a poor prognosis. AI also demonstrated high health care literacy along with trust in the medical system (Table 3).
Table 3. AI Responses Allowed Trainees to Practice Addressing Emotion When Discussing Serious News; However, AI Quickly Accepted Medical Information, and Values Were Frequently Comfort Focused
AI feedback
AI provided both positive and constructive feedback on key components of SICs. The feedback varied in how actionable or specific it was (Table 4).
Table 4. AI Feedback Provided a Range of Clear, Actionable Feedback and Feedback That Was Less Specific
Discussion
AI is a growing field through which trainees can practice medical skills in a safe environment. Unlike SPs, role-play, and targeted observation, this modality is less resource-intensive, allows asynchronous learning, and provides opportunities for trainees to revisit scenarios to hone skills or for “just-in-time” practice. In our pilot study, we explored trainee interactions with AI, the content of AI responses, and AI’s ability to provide effective feedback.
Providing serious information clearly and with empathy is a skill highly prioritized by patients,9 yet data demonstrate that physicians are often vague when discussing prognosis.18 The 2-Sigma platform allowed students to practice SICs utilizing skills taught during the elective (Table 2). One strength is that this format gave students space to pause and consider which words and phrases taught during the course would be best. Additionally, AI responses to serious news often included strong emotion, such as “There’s a moment of stunned silence. Sarah’s face loses color, and David stiffens, his brows furrowing in disbelief.” This highlighted the need for the students to address the emotion before moving on to answering medical questions. Students were also able to repeat the session with different prompts to try different cases without the resources necessary to build multiple SP or role-play scenarios.
This pilot identified areas where AI would require additional prompt engineering to become more realistic. We learned that the AI patient would quickly resolve negative emotions or easily accept a poor prognosis, limiting the student’s ability to practice navigating aspects of SICs such as denial or persistent negative emotions. AI responses suggested a high level of health care literacy and trust in the medical system, which is not always present. This highlights the need for increased diversity in the persona and personal values of the AI patient to ensure a breadth of reactions and overall goals of care.
AI feedback adds another opportunity for students to improve communication skills. In our pilot, AI provided feedback on many components of SICs including assessing understanding, providing medical information, and responding to emotion (Table 4). While much of the feedback was vague, such as “You demonstrated strong empathetic communication throughout the encounter, which is essential in goals of care discussions,” AI also provided specific, actionable feedback linked to key parts of the conversation (Table 4). With refinement, AI feedback could provide both the generalized feedback on the overall tone of the conversation previously provided by the SP in this curriculum and the more specific, actionable examples often provided by physician preceptors.19
Limitations
This pilot study was limited by several methodological constraints. The small sample size of 11 students (27 encounters) precluded statistical analysis and limits generalizability. Fifty-five percent of cases were cancer diagnoses, with the remainder split among other illnesses, limiting our understanding of the platform’s ability to vary noncancer cases. Additionally, trainees self-select for this elective and may therefore be more engaged in developing SIC skills, which may have influenced our results.
AI responses were another limitation of this study. Text responses limited the trainee’s ability to appreciate tone or voice modulation as conversational cues, and trainees were unable to receive feedback on nonverbal communication, pacing, or tone. Additionally, given the limited range of responses from AI, this pilot was unable to explore how students would respond to situations where AI did not quickly resolve negative emotions or accept the medical diagnosis. Finally, this study was also not scoped to review the quality of feedback.
Future directions
Future directions include additional prompt engineering to allow a broader range of AI responses to serious news and improved AI-generated feedback that links comments to specific moments within the conversation. The 2-Sigma platform will also be tested with more diverse cases and larger learner populations to explore how AI can be utilized for complex communication practice more broadly. Prior work has demonstrated that multimodal feedback from faculty and SPs can expand learning from SP encounters.19 Thus, we plan to evaluate whether faculty review of AI transcripts can be a more efficient way to provide faculty feedback than watching recorded SP encounters. Finally, it will be important to explore the learner perspective on AI technology as an aid in the development of SIC skills and how simulated conversations translate to interactions with live patients.
Conclusion
AI technology represents a promising tool to create a safe learning environment for medical trainees to practice SICs. This pilot demonstrates that students were able to practice SIC skills including delivering serious news and responding to emotion. Future work will include additional prompt engineering to expand the range of AI responses and allow for more specific feedback. This pilot demonstrates feasibility; however, the technology requires refinement to effectively supplement traditional SIC training.
Author Disclosure Statement
All authors have no conflicts of interest to disclose.
Funding Information
This project was partially supported by the American Medical Association, Medical Education Research and Program Evaluation area within the Medical Education Unit. L. Turner and S. Overla have a provisional patent (#63/524,759).
