Abstracts from the Society for Clinical Trials 46th Annual Meeting (2025)

Abstract

INVITED SESSION PROPOSALS

SP1 – RECENT ADVANCES IN ADDRESSING DESIGN AND ANALYSIS CHALLENGES OF CLUSTER RANDOMIZED TRIALS

Organizer: Edward Mascha (Cleveland Clinic)

Chair: Edward Mascha (Cleveland Clinic)

Speaker(s):

1) Edward Mascha (Cleveland Clinic)

2) Hrishikesh Chakraborty (Duke University)

3) Emine Bayman (University of Iowa)

4) Fan Li (Yale University)

Cluster randomized trials (CRTs), including parallel, crossover, and stepped wedge designs, are becoming more popular and prominent in medical and healthcare delivery research for several reasons, including administrative and logistical considerations, ethics, and ease of application at the cluster level. For example, when the research question involves a change in clinical practice or a systemic change in a hospital, it might not be feasible to study the intervention in a design using individual patient randomization. It is often more appropriate in such settings to randomize clusters of patients or providers and design a study that compares interventions while accounting for the within-cluster correlations, both in design and analysis. Our session addresses the theme of “Shaping the Future: The Right Questions, Robust Answers”, as cluster randomization is increasingly used to address important scientific questions in healthcare delivery research, and all four presentations center on the common theme of providing robust statistical solutions to the right question. Specifically, Speaker #1 will focus on methodological advances for cluster-randomized crossover (CRXO) trials, a design that combines the benefits of within- and between-cluster comparisons to make inference on the treatment effect. Speaker #2 will discuss simulation-based power comparisons for hierarchical composite outcomes in CRTs, and identify robust and powerful methods when complex endpoints meet clustering. Speaker #3 will discuss methods to re-estimate the intracluster correlation coefficient (ICC) during an internal pilot study for CRTs to ensure additional robustness in the final sample size and power. Speaker #4 will focus on estimating transparent estimands in CRTs to ensure model-robust inference even under working model misspecification. Collectively, the talks provide a suite of robust solutions to combat challenges in the design and analysis of CRTs.

Talk 1: Salient Design and Sample Size Features of Cluster-Randomized Crossover (CRXO) Trials. Edward Mascha, Cleveland Clinic

In a cluster-randomized crossover (CRXO) trial, clusters such as hospitals, operating rooms, providers, or care teams are randomized in each period to receive one of the study interventions. A CRXO trial may include many periods, with measurements in different periods made either on the same patients or on new patients. In this talk, we describe how a CRXO design borrows features from an individual patient crossover trial, a cluster randomized trial, and a stepped wedge cluster randomized trial. We then highlight situations in which a CRXO design is most appropriate, attractive, and powerful. We further illustrate how CRXO sample size calculation depends on key design parameters, including the within-cluster within-period and within-cluster between-period correlations, the latter of which may decay as periods grow farther apart, as well as on the number of clusters and periods. Methods will be illustrated using multi-period CRXO trials completed in the perioperative setting at Cleveland Clinic.
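As a rough illustration of how the two correlations enter a sample size calculation, the Python sketch below assumes the simplest case only: a two-period, two-treatment CRXO with a continuous outcome, new patients in each period, and equal cluster-period sizes. Under those assumptions the variance of the treatment effect estimate is inflated, relative to an individually randomized parallel trial, by 1 + (m - 1)*rho - m*eta, where m is the number of patients per cluster per period, rho is the within-cluster within-period correlation, and eta is the within-cluster between-period correlation. The function names and example inputs are hypothetical and are not taken from the trials discussed in the talk; multi-period designs with decaying correlations need more general formulas.

```python
import math
from statistics import NormalDist


def crxo_design_effect(m: int, rho: float, eta: float) -> float:
    """Variance inflation for a two-period, cross-sectional CRXO (assumed model).

    m   : patients per cluster per period
    rho : within-cluster within-period correlation
    eta : within-cluster between-period correlation
    """
    return 1 + (m - 1) * rho - m * eta


def n_per_arm_individual(effect_size: float, alpha: float = 0.05,
                         power: float = 0.80) -> float:
    """Per-arm size of an individually randomized parallel trial
    (two-sided test, normal approximation, standardized effect size)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 2 * (z_alpha + z_beta) ** 2 / effect_size ** 2


def crxo_clusters_needed(effect_size: float, m: int, rho: float, eta: float,
                         alpha: float = 0.05, power: float = 0.80) -> int:
    """Clusters required when every cluster contributes m patients to each arm."""
    n_arm = n_per_arm_individual(effect_size, alpha, power)
    n_arm_inflated = n_arm * crxo_design_effect(m, rho, eta)
    return math.ceil(n_arm_inflated / m)
```

For a fixed rho, a larger eta lowers the design effect, which is why the crossover component can recover power lost to clustering. With few clusters, a small-sample correction to the normal approximation would normally be added.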

Talk 2: Power Comparison for Different Hierarchical Composite Outcomes in Cluster Randomized Trials. Hrishikesh Chakraborty, Duke University

Hierarchical composite outcomes (HCOs), which combine multiple endpoints into a single measure, are frequently used in clinical trials. Different methods have been proposed to create variations of HCOs. However, HCO implementation in CRTs is limited, and the power implications of different HCO methods are unclear. We integrated various HCO estimation methods into the CRT setting and conducted a simulation-based power comparison of these methodologies. Methods tested included Finkelstein-Schoenfeld, unmatched win-ratio, unmatched win-difference, matched win-ratio, and worst-rank score under varying ICCs, numbers of clusters, and degrees of cluster size variation. Our simulation study was based on a real-world HCO used in a CRT design, specifically the HiLo trial, where we used time-to-death and hospitalization to generate different HCOs. We concluded that unmatched win-ratio and unmatched win-difference had the highest power across scenarios, followed by the Finkelstein-Schoenfeld method and the worst-rank score.

Talk 3: Reassessing the ICC during an internal pilot study for a cluster-randomized trial. Emine Bayman, University of Iowa

A defining characteristic of cluster randomized trials is the randomization of clusters of individuals to study arms and the resulting potential for correlation of outcomes within clusters. This correlation, assessed by the intracluster correlation coefficient (ICC), must be considered in the design and primary analysis. Accordingly, in addition to estimating the effect size, the researchers must estimate the ICC for a valid calculation of the target sample size in a CRT. In many situations, a reliable estimate of the ICC may not be available in the design phase. Thus, researchers may wish to use interim outcome data collected during the trial to estimate the ICC and reassess the sample size. We will discuss the Fibromyalgia Transcutaneous Electrical Nerve Stimulation (TENS) in Physical Therapy Study (FM-TIPS) to demonstrate interim reassessment of sample size in a CRT. FM-TIPS is a cluster randomized pragmatic trial examining whether the addition of TENS to routine physical therapy improves movement-evoked pain between baseline and 60 days compared with physical therapy alone among patients with fibromyalgia.
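To make the link between the ICC and the target sample size concrete, the Python sketch below uses two textbook ingredients: the one-way ANOVA estimator of the ICC and the standard design effect 1 + (m - 1)*ICC for a parallel CRT. It is an illustration under simplifying assumptions (a continuous outcome, data from a single arm or pooled without regard to arm, negative estimates truncated at zero) and is not the re-estimation procedure used in FM-TIPS. The function names are hypothetical.

```python
import math
from typing import Sequence


def anova_icc(clusters: Sequence[Sequence[float]]) -> float:
    """One-way ANOVA estimate of the ICC from per-cluster outcome lists."""
    k = len(clusters)
    sizes = [len(c) for c in clusters]
    n_total = sum(sizes)
    grand_mean = sum(sum(c) for c in clusters) / n_total
    means = [sum(c) / len(c) for c in clusters]

    ss_between = sum(n * (mu - grand_mean) ** 2 for n, mu in zip(sizes, means))
    ss_within = sum((y - mu) ** 2 for c, mu in zip(clusters, means) for y in c)
    msb = ss_between / (k - 1)        # between-cluster mean square
    msw = ss_within / (n_total - k)   # within-cluster mean square

    # Adjusted mean cluster size, which allows for unequal cluster sizes.
    m0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)

    denom = msb + (m0 - 1) * msw
    if denom == 0:
        return 0.0
    return max(0.0, (msb - msw) / denom)  # truncate negative estimates at 0


def inflated_sample_size(n_individual: float, mean_cluster_size: float,
                         icc: float) -> int:
    """Total sample size after applying the design effect 1 + (m - 1) * ICC."""
    return math.ceil(n_individual * (1 + (mean_cluster_size - 1) * icc))
```

At an interim look, the estimate from `anova_icc` could be passed to `inflated_sample_size` in place of the planning value. Interim ICC estimates based on few clusters are imprecise, so the uncertainty in the estimate matters as much as its value.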

Talk 4: Model-robust standardization in cluster-randomized trials. Fan Li, Yale University

Although generalized linear mixed models and generalized estimating equations have conventionally been the default analytic methods for estimating the average treatment effect in practice, recent studies have demonstrated that the treatment effect coefficient may correspond to an ambiguous estimand when the regression model under consideration does not perfectly align with the data generating process and when cluster size is informative. In this talk, we will present simple and accessible methods to standardize output from any regression model to ensure robust estimand-aligned inference in cluster-randomized trials. In particular, the talk will introduce estimators for both the cluster-average and the individual-average treatment effects that are consistent regardless of whether the specified working multilevel regression models align with the unknown data generating process. Simulation experiments and analysis of a real CRT are used to demonstrate the utility of these simple estimators over existing model-based estimators.
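The distinction between the two estimands can be seen with simple unadjusted estimators. The Python sketch below is not the model-robust standardization approach described in the talk; it only contrasts a difference in means of cluster means (targeting the cluster-average effect) with a difference in pooled individual means (targeting the individual-average effect). The function names and the data layout, one list of outcomes per cluster, are hypothetical.

```python
from typing import Sequence

Arm = Sequence[Sequence[float]]  # one list of individual outcomes per cluster


def cluster_average_effect(treated: Arm, control: Arm) -> float:
    """Difference in the mean of cluster means: each cluster has equal weight."""
    def mean_of_cluster_means(arm: Arm) -> float:
        return sum(sum(c) / len(c) for c in arm) / len(arm)
    return mean_of_cluster_means(treated) - mean_of_cluster_means(control)


def individual_average_effect(treated: Arm, control: Arm) -> float:
    """Difference in pooled individual means: each individual has equal weight."""
    def pooled_mean(arm: Arm) -> float:
        return sum(sum(c) for c in arm) / sum(len(c) for c in arm)
    return pooled_mean(treated) - pooled_mean(control)
```

When larger clusters respond differently from smaller ones, that is, when cluster size is informative, the two functions return different values, so the analysis must state which estimand is the target.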

SP2 – THE HIGHLIGHTS AND HICCUPS OF HIGH-DENSITY DATA

Organizer: Jonathan Beall (Medical University of South Carolina)

Chair: Jonathan Beall (Medical University of South Carolina)

Speaker(s):

1) Sharon Yeatts (Medical University of South Carolina)

2) Chris Arnaud (Medical University of South Carolina)

3) Zeke Lowell (Medical University of South Carolina)

4) Jonathan Beall (Medical University of South Carolina)

5) Lisa H Merck (Virginia Commonwealth University)

High-density data play an ever-growing role in clinical trials. Continuous monitoring data, one example, allow highly detailed, informed interventions for subjects. The BOOST-3 clinical trial, for example, is designed to test a prescribed treatment protocol based on the continuous monitoring of intracranial pressure (ICP) and brain tissue oxygen content (PbtO2). While such data are commonly used for the treatment of patients, in the context of a clinical trial, the capture and analysis of this type of data present unique challenges. In this session, we will use the ongoing BOOST-3 clinical trial as a case study to discuss the complexity of such data as they progress through a clinical trial: from collection at the clinical site and transfer to the data coordinating center, through summarization and creation of data quality metrics and reconciliation of the different data streams relevant for assessing treatment fidelity, to final analysis of the collected data.

Talk 1. Sharon Yeatts, Medical University of South Carolina

As the PI of the Data Coordinating Center for BOOST-3, Dr. Yeatts will provide an introduction to the Brain Oxygen Optimization in Severe TBI Phase-3 (BOOST-3) clinical trial and the data collected therein. BOOST-3 is a Phase III, randomized clinical trial designed to compare the effectiveness of two prescribed treatment protocols for patients with traumatic brain injury: a protocol based on both intracranial pressure (ICP) and brain tissue oxygen content (PbtO2) versus a protocol based on ICP monitoring alone. Both strategies are currently used in care and allow physicians to initiate/adjust treatment given the current levels of ICP and/or PbtO2. The interventions given to a particular subject are documented in the corresponding electronic CRF. The continuous ICP and PbtO2 data are uploaded to the trial’s cloud storage location for analysis of the trial’s secondary aims.

Talk 2. Chris Arnaud, Medical University of South Carolina

Mr. Arnaud will describe the infrastructure developed to download the continuous data from the cloud storage location to the Data Coordinating Center. The cloud storage location processes, stores, and makes subject data available in a data folder following specified naming conventions. For valid folder locations, the transfer program downloads the CSV files and prepares them for import into WebDCU, the DCU’s Clinical Trial Management System (CTMS). A summary of the process is available within the CTMS for oversight.

Talk 3. Zeke Lowell, Medical University of South Carolina

High-quality data are imperative for the success of a clinical trial. For BOOST-3, the continuously monitored ICP/PbtO2 data are of fundamental importance because they allow for an assessment of treatment fidelity; however, the high density of the data presents unique challenges from a data management perspective. Metrics for the quality of these data had to be developed relative to treatment practices and protocol definitions. Approaches were developed for quantifying treatment performance by comparing the continuously monitored data against multiple data streams to assess fidelity to the prescribed treatment protocols. This talk will present the challenges and subsequent approaches for management of these complex data.

Talk 4. Jonathan Beall, Medical University of South Carolina

The detailed physiologic data are used in real time to administer a prescribed treatment protocol developed to bring these values into pre-specified ranges. A measure of treatment fidelity could be the effect of treatment on time spent outside of the target range for PbtO2; however, clinical or physiologic events, such as patient transport, can result in missing data values for these continuously monitored data. When this occurs, the true proportion of time outside of the specified range becomes obscured, which presents a significant analytical challenge. We propose a Bayesian model allowing for the construction of subject level informative prior distributions for these summary measures. For those subjects with intermittent missing data, this approach will provide informed ranges for imputation of the subject level missing data. We will assess our proposed model under a variety of simulation conditions, including varied rates of missing data and mechanisms for missing data. We will compare our proposed model to alternative approaches for handling missing data.
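The way missing values obscure the proportion of time outside the target range can be shown with a small calculation. The Python sketch below is not the Bayesian model proposed in the talk; it only computes the observed proportion of readings below a threshold together with the lowest and highest values the true proportion could take if every missing reading were in range or out of range. It assumes equally spaced readings, with missing values coded as None. The function name and the threshold passed by the caller are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def time_below_target_bounds(readings: Sequence[Optional[float]],
                             threshold: float) -> Tuple[float, float, float]:
    """Return (observed proportion, lower bound, upper bound) of time below threshold.

    readings  : equally spaced values; None marks a missing reading
    threshold : lower limit of the target range (caller-supplied assumption)
    """
    if not readings:
        raise ValueError("readings must not be empty")

    observed = [r for r in readings if r is not None]
    n_total = len(readings)
    n_missing = n_total - len(observed)
    n_below = sum(1 for r in observed if r < threshold)

    # Proportion among observed readings only (undefined if all are missing).
    observed_prop = n_below / len(observed) if observed else float("nan")
    # Best case: every missing reading was in range.
    lower = n_below / n_total
    # Worst case: every missing reading was below the threshold.
    upper = (n_below + n_missing) / n_total
    return observed_prop, lower, upper
```

The gap between the two bounds widens with the amount of missing data, which is the range an informative prior or an imputation model would aim to narrow.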

Discussant: Lisa Merck, Virginia Commonwealth University

Dr. Lisa H. Merck will discuss management of clinical confounders in polytrauma shock physiology. She will briefly outline lessons learned from the ProTECT III clinical trial, and the unique challenges and opportunities of working with continuous multiparametric data streams.

SP4 – THE MODERN CLINICAL TRIAL: MANAGING THREATS TO ROBUST EVIDENCE

Organizer: Lori E Dodd (National Institute of Allergy and Infectious Diseases)

Chair: Sally Hunsberger (National Institute of Allergy and Infectious Diseases)

Speaker(s):

1) Stuart Pocock (London School of Hygiene and Tropical Medicine)

2) Michael Proschan (National Institute of Allergy and Infectious Diseases)

3) Lori E Dodd (National Institute of Allergy and Infectious Diseases)

4) Colin Begg (Memorial Sloan Kettering Cancer Center)

5) Marc Buyse (International Drug Development Institute)

6) Patrick Phillips (University of California San Francisco)

Randomized controlled trials are intended to resolve, not stir, controversy. Many proposals to improve the speed and efficiency of clinical trials present threats to robust evidence. Methods that lack transparency; introduce modeling assumptions, subjectivity, or complex randomization schemes; or borrow data from non-randomized subjects weaken the foundation for error control and have the potential to invalidate trial conclusions. An additional potential threat is a push, by some, to rely more heavily on “real-world evidence.” While real-world data come from multiple sources, including clinical trial data, the term “real-world evidence” often refers to non-randomized or observational studies that utilize data collected in the context of routine medical practice, as the “real experience.” While such data can provide important information about medical practice, real-world evidence is no substitute for a well-conducted randomized controlled trial. In this session, we will address many questions, including: What are the limits of real-world evidence? When is this approach an appropriate substitute for clinical trial evidence? What can be learned from trials that have piloted adaptive and complex modeling techniques in recent years? Are there scenarios where these methods are appropriate? Do we need to further evaluate the relative merits of these proposals? This session will address the tradeoff between robustness, rigor, and practicality; the incorporation of data from non-concurrent controls; and response-adaptive randomization.

Talk 1: Real World Evidence for evaluating Treatment Effectiveness and Safety: when is it needed and how can it be made trustworthy? Stuart Pocock, London School of Hygiene and Tropical Medicine.

Talk 2: Response Adaptive Randomization: Rigor Now or Rigor Mortis Later? Michael Proschan, Biostatistics Research Branch, National Institute of Allergy and Infectious Diseases

Talk 3: Use of non-concurrent controls in the RCT paradigm. Lori E Dodd, Clinical Trials Research Branch, National Institute of Allergy and Infectious Diseases

Discussants: Colin Begg, Memorial Sloan Kettering Cancer Center; Marc Buyse, International Drug Development Institute; Patrick Phillips, University of California San Francisco

Late Breaking Session – IMPACT OF RECENT US GOVERNMENT ACTIONS: A LATE-BREAKING PANEL WITH AUDIENCE DISCUSSION

Organizer: Scott Evans (George Washington University)

Chair: Scott Evans (George Washington University)

Speaker(s):

1) Jody Ciolino (Northwestern University)

2) Janet Wittes (Wittes LLC)

3) Scott Evans (George Washington University)

Research funded, overseen, and evaluated by the United States National Institutes of Health (NIH), Food and Drug Administration (FDA), Centers for Disease Control and Prevention (CDC), and Department of Veterans Affairs (VA) is critical for the world’s health. NIH-funded trials operate with unparalleled objectivity and often prioritize pursuit of answers to important questions for informing medical practice, thus informing medical care in ways that development-oriented, industry-funded clinical trials generally do not. This includes evaluating the comparative effectiveness of competing interventions and driving the study of interventions for rare diseases where profit motives are lacking. These institutions are the primary supporters of academic clinical research training. The FDA protects public health by providing the most extensive and critical reviews of interventions to treat and prevent diseases to determine if they are safe and effective for clinical use. The CDC protects public health by preventing and controlling disease, injury, and disability. Recent executive orders have reduced support for these foundational institutions and the research they support. This panel will discuss the impact on government agencies, academia, and industry; clinical research careers; the clinical research agenda and portfolio; the quality of clinical trials and other research studies; and the evidentiary standard for evaluating interventions. Substantial time will be allocated for audience participation.

Dr. Ciolino will speak from the perspective of an academic data coordinating center.

Dr. Wittes will speak from the perspective of a former NIH employee, NIH grantee, and SCT President.

Dr. Evans will speak from the perspective of an academic researcher, NIH grantee, and SCT President.

SP6 – HIERARCHICAL COMPOSITE ENDPOINTS AND THE WIN RATIO: AN APPROACH TO OBTAINING ROBUST ANSWERS

Organizer: Lai Wei (Ohio State University)

Chair: Valerie Durkalski-Mauldin (Medical University of South Carolina)

Speaker(s):

1) Björn Redfors (Sahlgrenska University Hospital)

2) Lai Wei (The Ohio State University)

3) James Troendle (National Heart, Lung, and Blood Institute)

4) Jarrod Mosier (University of Arizona)

5) Madison Hyer (Ohio State University)

The Win Ratio (WR) method, introduced by Pocock et al. in 2011, offers a novel statistical approach to enhance the analysis of composite outcomes with varying severities by accounting for the relative priority of each component. Conventional methods for creating and analyzing composite outcomes have limitations that may require altering and/or narrowing study questions, potentially creating downstream limitations in interpreting results. In contrast with conventional methods, the WR accommodates mixed outcome types (e.g., time-to-event, categorical, and continuous) without relying on distributional assumptions, thereby empowering stronger study questions and resulting in more robust answers. By comparing pairs of patients and assigning a “win” to the patient with the better outcome in each pair, this method prioritizes the most important endpoints set by patient and/or provider priorities, leading to more clinically relevant trial results. Additionally, the WR method allows for the prioritization of fatal outcomes and hierarchical testing of broader composite endpoints, including patient-reported outcomes. Its hierarchical structure, statistical power, and flexibility make the WR an attractive alternative for comparing the efficacy of randomized treatments. With the increasing implementation of the WR in various medical fields, this invited session will provide a comprehensive introduction to the WR method and its extensions, explore its applications across multiple fields, and address challenges in implementing and securing funding for WR trial designs. In addition, the new Win Time Ratio method will be illustrated. The Win Time Ratio is a new variant of the WR that accounts for the time spent in each clinical state during the combined common follow-up period. Finally, we will discuss the physician’s perspective on the critical need for trial designs that better reflect the complexity of real-world patient outcomes.

This invited session aims to provide a comprehensive overview of the Win Ratio (WR) method and its extensions and variants, illustrating how these innovative statistical approaches can improve the analysis of composite outcomes in clinical trials. By exploring both the methodological advancements and practical applications, we aim to showcase the potential of WR-based designs to yield more clinically relevant and robust trial results that better reflect real-world patient outcomes. Attendees will gain a deeper understanding of the WR method and its new variant, the Win Time Ratio, learning how these approaches address key limitations in conventional composite outcome analysis. Participants will also acquire insights into implementing WR-based trial designs, overcoming associated challenges, and the critical role of these methods in generating more meaningful and robust findings in clinical research.

Talk 1: Enhancing Clinical Trial Outcomes: An Introduction to the Win Ratio Method. Björn Redfors, Sahlgrenska University Hospital

This talk will introduce the Win Ratio (WR) method as a novel statistical approach for analyzing composite outcomes with varying severities. The talk will address how WR accommodates mixed outcome types (time-to-event, categorical, and continuous) without relying on distributional assumptions, empowering stronger study designs and more robust conclusions.
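The pairwise logic behind an unmatched win ratio can be sketched in a few lines of Python. The example below is a simplified illustration, not the analysis presented in the session: it assumes every outcome is fully observed (no censoring, so no restriction to common follow-up is needed), that each patient is represented by a tuple of outcomes ordered from highest to lowest priority, and that larger values are better on every component. The function name and data layout are hypothetical.

```python
from typing import Dict, Sequence, Tuple

Patient = Tuple[float, ...]  # outcomes in priority order; larger is better


def win_ratio(treatment: Sequence[Patient],
              control: Sequence[Patient]) -> Dict[str, float]:
    """Compare every treatment patient with every control patient.

    Each pair is decided by the first outcome in the hierarchy on which
    the two patients differ; if none differ, the pair is a tie.
    """
    wins = losses = ties = 0
    for t in treatment:
        for c in control:
            for t_val, c_val in zip(t, c):
                if t_val > c_val:
                    wins += 1
                    break
                if t_val < c_val:
                    losses += 1
                    break
            else:
                ties += 1  # no component separated the pair

    n_pairs = len(treatment) * len(control)
    return {
        "wins": wins,
        "losses": losses,
        "ties": ties,
        "win_ratio": wins / losses if losses else float("inf"),
        "win_difference": (wins - losses) / n_pairs,
    }
```

Ties do not enter the win ratio, whereas the win difference divides by all pairs. A real analysis would also need a variance estimate and rules for pairs whose ordering is undetermined because of censoring.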

Talk 2: Win Ratio implementation in WINDSURFER trial and its extensions. Lai Wei, Ohio State University

Dr. Wei will introduce the implementation of the WIN ratio analysis to Determine a strategy of non-invasive SUpport for Respiratory Failure in the EmeRgency Department (WINDSURFER) trial. The considerations and challenges encountered during the study design of this trial will be shared. Additionally, extensions of the Win Ratio method, incorporating weighted and matched techniques, will be introduced.

Talk 3: Win Time Methods for Clinical Trials. James Troendle, National Heart, Lung, and Blood Institute

In this talk, Dr. Troendle will introduce and illustrate the Win Time methods. These new methods will be compared to the Win Ratio, with an important distinction being how likely a trial is to conclude benefit without there being an overall benefit.

Discussant: Jarrod Mosier, University of Arizona

Dr. Mosier will reflect on the 3 presentations and share his experience as a clinical PI using the WR approach for the primary outcome in acute trials and how it has been perceived by peer review.

SP7 – HARNESSING THE POTENTIAL OF PLATFORM TRIALS IN CHRONIC DISEASES

Organizer: Thomas Jensen (Berry Consultants)

Chair: Thomas Jensen (Berry Consultants)

Speaker(s):

1) Iain Stewart (Imperial College London)

2) Megan McCabe (University of Alabama at Birmingham)

3) Barbara Wendelberger (Berry Consultants)

4) John VanBuren (University of Utah)

5) JonDavid Sparks (Eli Lilly)

Platform trial designs offer an innovative framework for assessing treatment efficacy in progressive and degenerative chronic diseases while alleviating challenges of operations and resource management. Clinical trials in chronic diseases, such as interstitial lung disease, can be hindered by slow recruitment, inconsistent standards of care, and large sample size requirements with extended periods of follow-up. Potential trial participants, especially those with a rare chronic disease, often face the predicament of having to choose only one among multiple competing clinical trials. In contrast, platform trial designs efficiently address these issues without compromising the integrity and robustness of trial results. The shared trial infrastructure of platform trials facilitates pooling resources across participating institutions, aiding enrollment, while also offering trial participants a greater chance of receiving an effective therapy. Platform trial features that are beneficial in chronic disease populations include 1) the ability to randomize to combination therapies across multiple disease subpopulations, 2) follow-up embedded within usual care schedules, and 3) adaptive decision rules for early stopping that allow chronically ill patients to receive evidence-based care. This session will discuss unique aspects of platform trial design and implementation in chronic diseases and highlight how to leverage the strengths of platform designs in clinical development.

Five speakers will address different aspects of platform trials with examples in pulmonary and neurological diseases. The first talk will discuss the value of using real world evidence to define well-powered endpoints for varied disease subpopulations and stages of disease progression. Strategies for data collection informed by real world evidence will also be addressed, including frequency of follow-up and plans for handling prevalent background therapies. The second talk will address considerations for supporting shared control groups across distinct disease subpopulations and treatment domains, including modes of administration. The third talk will explore adaptive trial features that improve statistical and logistical efficiency and expedite the timeline to trial conclusions. Statistical approaches for investigating potential interactions of combination therapies within a multi-factorial framework will also be discussed. The fourth talk will present operational aspects of coordinating site activation and data collection, monitoring and implementing the randomization scheme, and navigating IRBs in complex trials. The final talk will discuss an industry sponsor’s perspective on participating in platform trials. Talks will each be roughly 15 minutes and will be followed by a final 15-minute Q&A period.

Talk 1: Utilizing real world evidence to characterize target populations and appropriate clinical outcomes, and to strategize data collection in trial planning. Iain Stewart, Imperial College London

Talk 2: Logistical and statistical considerations of a shared control in platform trials of multiple disease cohorts and treatment domains. Megan McCabe, University of Alabama at Birmingham

Talk 3: Implementing adaptive design features to deliver well-powered, expedited, and patient-centric results. Barbara Wendelberger, Berry Consultants

Talk 4: Coordinating the set up and operations of platform trial implementation in chronic disease settings. John VanBuren, University of Utah

Talk 5: An industry sponsor’s perspective on the benefits of therapeutic development within platform trials. JonDavid Sparks, Eli Lilly

SP8 – DESIGN CONSIDERATIONS FOR MULTI-CANCER DETECTION ASSAY CLINICAL TRIALS: THE NCI CANCER SCREENING RESEARCH NETWORK

Organizer: Katherine A Guthrie (Fred Hutch Cancer Center)

Chair: Katherine A Guthrie (Fred Hutch Cancer Center)

Speaker(s):

1) Katherine A Guthrie (Fred Hutch Cancer Center)

2) Ruth B Etzioni (Fred Hutch Cancer Center)

3) Ziding Feng (Fred Hutch Cancer Center)

4) Charles L Kooperberg (Fred Hutch Cancer Center)

A new generation of Multi-Cancer Detection (MCD) assays that evaluate cell-free DNA or other biological components is rapidly emerging. If the benefit of MCD assays could be established, this would present several advantages for cancer screening. MCDs are simple to implement for both health care providers and their patients (generally, a blood test), so could be widely accessible, even in under-resourced settings. MCD tests also have the potential to greatly expand early detection opportunities for cancers with no established screening technologies. In addition, a single blood test for multiple cancers could improve the reach and efficiency of screening, even for those cancers with existing screening technologies.

Despite their promise, the evidence supporting MCD tests for early detection benefits is quite limited. Only two prospective, uncontrolled studies have reported outcomes for MCD tests, and these results are restricted to test performance, tumor stage at diagnosis, and adverse events related to working up an abnormal MCD test result. No study has documented the impact on cancer-specific or overall mortality, or harms from testing (e.g., over-diagnosis), which is critical for understanding their true value for public health and their implications for health care providers and systems.

An additional complexity of MCD tests is the identification of the tumor site or tissue of origin (TOO). Depending on the assay, an abnormal test result may give an indication of the location of the tumor, which may or may not be accurate, or it may signal only the presence of cancer without specifying a TOO. Because the value of screening depends on timely diagnosis and effective treatment, the process for reaching an accurate diagnostic resolution and accessing treatment is fundamental to the value of MCD-based screening. No standards yet exist to assist a primary care provider in determining how to follow up an abnormal MCD test result.

The new, NCI-funded Cancer Screening Research Network (CSRN) will evaluate emerging technologies for cancer screening. The CSRN will conduct rigorous, multi-center cancer screening trials with large and diverse populations in a variety of health care settings with the ultimate goal of reducing cancer-related illnesses and deaths. CSRN launches its first randomized clinical trial, named the Vanguard Study, in early 2025. Trial participants without cancer will be randomized to receive one of two MCD tests or to a control arm (no test), with the goal of assessing the feasibility of implementing a large platform RCT to measure the clinical effectiveness of MCD tests. Leaders of the two coordinating centers for the Vanguard Study, the Communications and Coordinating Center and the Statistics and Data Management Center, will present the Vanguard design and discuss unique aspects of MCD screening trial design.

Talk 1: Design of the CSRN Vanguard feasibility study. Katherine A Guthrie, Fred Hutch Cancer Center

The specific aims of the Vanguard Study are to assess the feasibility of conducting a randomized controlled trial to evaluate MCD tests, and to develop and evaluate our ability to engage underserved and under-resourced populations in this effort. This feasibility study will accrue up to 24,000 participants from across the US through 9 Accrual, Enrollment, and Screening Site (ACCESS) Hubs, including academic, community, Federally Qualified, Department of Defense, and Veterans Affairs health care centers. The design incorporates blood draws from all participants at baseline and year 1, single-blinded and unblinded Hubs, collection of standard of care cancer screening episodes and incidental cancer cases, participant-reported mental health outcomes, and diagnostic workups for the expected 3-5% of intervention-arm participants who receive an abnormal MCD test result.

Talk 2: Can we shortcut cancer screening trials? Ruth Etzioni, Fred Hutch Cancer Center

This talk will describe novel designs and alternative endpoints that are being considered to make screening trials more efficient and more able to provide timely results regarding screening efficacy. Dr. Etzioni will discuss the implications of these potential changes for the evaluation of multi-cancer detection tests.

Talk 3: Are MCD tests ready for primetime? Ziding Feng, Fred Hutch Cancer Center

This talk will contrast evidence supporting the effectiveness of MCD tests versus single cancer tests. Dr. Feng will sound a cautionary note that while there is great excitement about MCD tests, we should not necessarily relax established criteria for trial readiness.

A panel discussion with the speakers and other CSRN coordinating center leadership will follow the three talks. We also hope to add a speaker representing a clinical site to discuss challenges inherent to the diagnostic workup following an abnormal MCD test result.

SP10 – PEDIATRIC DRUG DEVELOPMENT: EMERGING INNOVATIONS AND FUTURE DIRECTIONS

Organizer: Jingjing Ye (Global Statistics and Data Sciences)

Chair: Yuanye (Vickie) Zhang (Servier)

Speaker(s):

1) Jingjing Ye (Global Statistics and Data Sciences)

2) Ming-Hui Chen (University of Connecticut)

3) Margaret Gamalo (Pfizer)

4) Robert “Skip” Nelson (Johnson & Johnson)

The development of safe, effective, and targeted medications for pediatric populations has long been a significant challenge. Key obstacles include the small number of pediatric patients, the limited availability of detailed physiological data, and the ethical complexities of conducting research with children. These factors collectively slow the progress of pediatric drug development, often resulting in significant delays compared to the approval timelines of drugs for adults. As regulatory requirements for pediatric studies have evolved, innovative research methods, advanced technologies, and collaborative frameworks have emerged to drive progress in this critical field of medicine. Regional guidelines discussing pediatric extrapolation have previously been issued by various regulatory agencies, including both FDA and EMA. The recently released ICH E11A guideline provides recommendations for, and promotes international harmonization of, the use of pediatric extrapolation to support the development and authorization of pediatric medicines. A recent review of pediatric labeling changes in the US also showed that the use of extrapolation increased the approval rates of new and expanded pediatric indications (Ye et al., 2023). To reduce the burden of conducting pediatric studies, ICH E11A encourages the use of pediatric extrapolation based on an evaluation of existing evidence on the similarity between the adult and pediatric populations in 1) disease, 2) drug pharmacology, and 3) response to treatment. The level of evidence would depend on the strength of the existing evidence, and thus the approaches for extrapolation may differ. Extrapolation is typically conducted between adult and pediatric populations for the same drug. Recently, mechanism of action (MOA) based extrapolation has been proposed. This MOA-based strategy broadens the scope of data sources beyond just the same drug to include other drugs with the same or similar MOA.
As a result, it allows for the integration of diverse data types that are highly relevant to both the pediatric population and the drug under development. Within this context, Bayesian methodologies are emerging as powerful tools, offering innovative ways to maximize the use of existing data, optimize trial designs, and ensure robust statistical analysis. Represented by the American Statistical Association (ASA) Biopharmaceutical Section Statistics in Pediatric Drug Development Scientific Workgroup (SPDRx), this session aims to explore cutting-edge approaches to pediatric drug development, focusing on how these methods can bridge the gap between adult and pediatric populations and streamline trial designs. It will feature three expert-led presentations, followed by a discussion led by the PhRMA Topic Lead for the ICH E11A pediatric extrapolation expert working group, who will provide insights into the latest developments in international harmonization efforts and the role of extrapolation in pediatric medicine.

SP11 – INCOMPLETE VARIANTS OF STEPPED WEDGE CLUSTER RANDOMIZED DESIGNS: RECENT DESIGN INNOVATIONS AND CONSIDERATIONS FOR IMPLEMENTATION

Organizer: Jessica Kasza (Monash University)

Chair: Jessica Kasza (Monash University)

Speaker(s):

1) Kelsey Grantham (Monash University)

2) John Preisser (University of North Carolina)

3) Fan Li (Yale University)

4) Monica Taljaard (Ottawa Hospital Research Institute)

Cluster randomized trials are essential designs for evaluating effects of interventions that are applied to groups of patients (i.e., to entire clusters). A common challenge in practice is balancing the need for a robust design that has a sufficiently large number of clusters with practical limitations such as limited availability of clusters, budgetary constraints, and securing the support of cluster gatekeepers. Stepped wedge cluster randomized trials are an important variant that has gained popularity over the past three decades: in these designs, all clusters start in their usual-care steady state, but eventually switch to the intervention during the trial, with the timing of the switch randomized. More than 530 stepped wedge trials are currently registered on ClinicalTrials.gov, and the opportunity for all clusters to implement the intervention is a key reason for their practical appeal. However, standard stepped wedge designs can be costly and burdensome to both clusters and individual participants, and can increase the risk of cluster attrition, as all clusters are required to recruit and measure individuals for the entire study duration.

Incomplete variants of stepped wedge designs, in which clusters participate in a trial for limited durations of time, offer appealing alternatives, alleviating burdens and reducing costs. In fact, statistical work has shown that not all measurements in a stepped wedge design contribute the same amount of information about the effect of an intervention: for example, those measurements taken near the time that a cluster switches from the control to the intervention tend to provide the most information about the treatment effect. This work points the way towards potentially powerful incomplete alternatives to the stepped wedge; but myriad incomplete variants of any complete stepped wedge design exist. Much recent work has focused on the identification of “optimal” incomplete variants of the stepped wedge design: incomplete designs that still provide sufficiently high statistical power to detect effects of interest while reducing the burden of participating in a trial, and reducing trial costs. Major questions remain: which variants of incomplete stepped wedge designs are particularly beneficial; what is the optimal incomplete design for any given scenario; how do conclusions about the efficiency of incomplete designs change for different modelling approaches; and how acceptable are these optimal incomplete designs to trialists?

This session gathers researchers from around the world who are focused on identifying and understanding optimal incomplete variants of stepped wedge designs. In this session, they will discuss why incomplete variants of the stepped wedge design are worth considering; describe some useful variants of incomplete designs; present methods for finding incomplete stepped wedge designs with high levels of power; and consider when these incomplete stepped wedge designs are useful and acceptable alternatives to the complete stepped wedge. This session will be of interest to all researchers working in the design and conduct of cluster randomized trials and has implications for enhancing the robustness of innovative trial designs to answer important clinical research questions.

Talk 1: Outline of session and introduction of speakers. Jessica Kasza, Monash University

Talk 2: From the stepped wedge to the staircase. Kelsey Grantham, Monash University

In this talk, we show that measurements taken in certain regions of stepped wedge designs contribute nothing or very little to estimation of the treatment effect, pointing the way toward new designs that concentrate measurements in only the most impactful regions. We then describe how incomplete variants that are less burdensome and more cost-efficient, such as “staircase” designs, can be derived from a stepped wedge design by removing cells from the design in a principled manner. Finally, we discuss staircase designs in more detail, including when staircase designs can be as powerful as, or more powerful than, stepped wedge designs.
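
As a rough illustration of removing cells from a stepped wedge design, the following Python sketch builds the treatment schedule of a standard stepped wedge and keeps only the cells adjacent to each switch. The function names and the choice of one control and one intervention period per sequence are assumptions made for illustration, not the speaker's method; which cells are worth keeping in practice depends on the outcome model and correlation structure.

```python
def stepped_wedge(n_sequences):
    """Treatment indicators (0 = control, 1 = intervention) for a standard
    stepped wedge with n_sequences rows and n_sequences + 1 periods.
    Sequence s (0-based) switches to the intervention at period s + 1."""
    n_periods = n_sequences + 1
    return [[1 if period > seq else 0 for period in range(n_periods)]
            for seq in range(n_sequences)]


def staircase(design, pre=1, post=1):
    """Hypothetical helper: keep `pre` control periods before and `post`
    intervention periods after each switch. Dropped cells become None,
    meaning no measurements are taken in that cluster-period."""
    reduced = []
    for row in design:
        switch = row.index(1)  # first intervention period of this sequence
        reduced.append([
            cell if switch - pre <= period < switch + post else None
            for period, cell in enumerate(row)
        ])
    return reduced


complete = stepped_wedge(3)
incomplete = staircase(complete)

# Count how many cluster-period cells still require data collection.
measured = sum(cell is not None for row in incomplete for cell in row)
total = sum(len(row) for row in complete)
print(measured, "of", total, "cells retained")
for row in incomplete:
    print(row)
```

Each retained row holds only the periods immediately around the switch, which is what gives the design its staircase shape.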

Talk 3: Marginal models, binary outcomes, and incomplete designs. John Preisser, University of North Carolina

The use of marginal models for binary outcomes is commonplace in the analysis of data from stepped wedge designs, be they complete or incomplete. Hence, sample size formulas and procedures based on the marginal modelling approach that can accommodate incomplete stepped wedge designs are required. Here we discuss how such a procedure can be applied to explore a range of incomplete stepped wedge trials. We also discuss the interplay between statistical and practical considerations in the design of a particular incomplete stepped wedge trial.

Talk 4: Demystifying incomplete stepped wedge designs under the working independence assumption. Fan Li, Yale University

The design and analysis of stepped wedge designs is complicated by the need to specify the “correct” working correlation structures, and a convenient working independence assumption is sometimes attractive due to its simplicity and accessibility, and its robustness to working correlation misspecification. In this talk, we will discuss the information content of full stepped wedge designs analyzed by independence estimating equations, and identify information-rich cells that contribute the most to treatment effect estimation, thereby motivating the form of an incomplete design variant that balances power with data collection burden. We will discuss a surprising result that an incomplete design can beat a complete design in efficiency under working independence (i.e., “less is more”) and provide a new justification for the incomplete stepped wedge variant. Practical considerations and numerical examples of designing incomplete stepped wedge trials under working independence are also discussed.

Talk 5: Incomplete designs in practice. Monica Taljaard, Ottawa Hospital Research Institute

In this talk, we consider practical issues around the design, analysis and reporting of incomplete stepped wedge variants. We review the trials literature to consider how commonly incomplete designs are being used and how and why they are implemented. We discuss potential pitfalls raised by this design, and review practical aspects that trialists and statisticians should consider when planning and implementing an incomplete stepped wedge trial. We conclude by identifying gaps and unanswered questions that need to be addressed.

Panel discussion: The future of incomplete designs. Moderated by Jessica Kasza.

SP13 – BUILDING AN EVIDENCE-BASE FOR EXERCISE MEDICINE IN TARGETED POPULATIONS THROUGH RIGOROUS CLINICAL TRIALS

Organizer: Charity G Patterson (University of Pittsburgh)

Chair: Charity G Patterson (University of Pittsburgh)

Speaker(s):

1) Kathryn Schmitz (University of Pittsburgh)

2) Daniel M Corcos (Northwestern University)

3) David X Marquez (University of Illinois Chicago)

4) Eduardo E Bustamante (University of Illinois Chicago)

Physical activity is necessary for optimal health, but can exercise be used safely for the treatment and prevention of diseases in vulnerable and diverse populations? Health care providers need high-quality evidence to prescribe effective and safe interventions for their patients. The evidence base for exercise interventions largely consists of single-center, underpowered efficacy trials. However, there are ongoing efforts to contribute stronger evidence through rigorously conducted clinical trials across complex diseases and in diverse populations. Challenges encountered include study design, implementation, and recruitment: (1) Exercise trials are complex to design due to the multiple modalities, doses, and durations; (2) Data and safety monitoring must be conducted no less rigorously than in pharmacological trials; (3) Successful recruitment for exercise trials depends on the motivation of the population being studied and competing pharmacological options.

Talk 1. Kathryn Schmitz, University of Pittsburgh

Dr. Schmitz will present on exercise oncology trials, with a focus on THRIVE 65, which is part of the NCI-funded ENICTO Consortium. This multisite, 2-arm RCT is assessing the effect of twice-weekly progressive resistance training and protein supplementation on chemotherapy treatment tolerance (relative dose intensity). The intervention is largely delivered through telehealth to the participants, who are all 65 and older. Cognitive issues, feeling overwhelmed, and lack of familiarity with technology are among the ongoing challenges for carrying out this trial. Monitoring of exercise and nutrition intervention dose and of adverse effects is also a major focus of the study team as this work progresses.

Talk 2. Daniel M Corcos, Northwestern University

Dr. Corcos will present on exercise trials for Parkinson’s disease. He will discuss the challenges associated with studying disease modification in Parkinson’s disease as opposed to studying the modification of the signs and symptoms of the disease. He will discuss the role of both fluid and digital biomarkers in the study of exercise. He will also discuss different approaches to solving the randomization problem in exercise studies, including masked dose escalation and cluster-randomized experimental designs. He will conclude by discussing SMART experimental designs to change behavior. The science of exercise has made it clear that exercise is a potent medicine to change health span. The next great challenge is behavior change, especially across diverse populations of individuals and multiple health domains.

Talk 3. David X Marquez, University of Illinois Chicago

Dr. Marquez will present on aspects of conducting physical activity trials in diverse populations. Older Latinos are the fastest growing cohort among older adults in the USA, and their lives are often fraught with comorbidities. Evidence has demonstrated health benefits of regular physical activity for older adults. However, older Latinos participate in low levels of physical activity. Interventions designed to increase the physical activity of older Latinos are lacking, and many older Latinos face impediments to participating in physical activity interventions that researchers are unaware of. We have identified barriers and strategies to overcome these barriers that researchers are likely to face in conducting in-person and remote physical activity interventions for older Latinos.

Talk 4. Eduardo E Bustamante, University of Illinois Chicago

Dr. Bustamante will present on aspects of conducting physical activity trials with a focus on mental health. Clinical trials on physical activity and mental health conceptualize physical activity as a form of medicine and seek to discover the optimal dose (i.e., frequency, intensity, time, type) for various outcomes and conditions. However, physical activity is fundamentally different from medicine in that the contextual features of physical activity programs have substantial effects on mental health, independent of physical activity dose. When we exercise, where we exercise, and who we exercise with are inescapable features of physical activity that both confound trials and present new opportunities for mental health promotion. This presentation will review the role of context in physical activity-mental health trials and provide examples of ongoing research harnessing physical activity contexts to optimize mental health in youth.

Talk 5. Charity Patterson, University of Pittsburgh

Dr. Patterson will briefly summarize the opportunities and challenges of conducting physical activity and exercise trials and facilitate a discussion with questions from the audience.

SP14 – STOPPING PROGRESS: FINDING EFFECTIVE TREATMENTS USING DISEASE PROGRESSION MODELING

Organizer: Barbara Wendelberger (Berry Consultants)

Chair: Barbara Wendelberger (Berry Consultants)

Speaker(s):

1) Adam Staffaroni (University of California, San Francisco)

2) Chris Coffey (University of Iowa)

3) Tom Jensen (Berry Consultants)

4) Guoqiao Wang (Washington University in St. Louis)

Progressive diseases are characterized by a systematic pathological advance that can include abnormal biomarker activity, decreased function, and clinical symptoms. For example, in neurodegenerative diseases such as frontotemporal dementia, there is a cascade of pathological processes with early changes in neurofilament light chain and magnetic resonance imaging measures and, later, progression to cognitive symptoms and clinical disease. Quantifying both the trajectory and heterogeneity of these changes, and their relationship to disease state, is key to addressing clinically relevant questions and designing well-powered clinical trials. In this session, we introduce and explore different aspects of disease progression modeling and discuss how to leverage disease progression models in innovative clinical trial design. In the first talk, we explore the idea that a person’s disease state can be modeled using the concept of disease age and illustrate how disease age can be leveraged to improve clinical trial design and yield robust analyses. The second talk focuses on endpoint selection in progressive diseases and considerations for clinical trial design. Progressive diseases may be modeled using various types of endpoints, including clinical, cognitive, and biomarker outcomes. Selecting trial endpoints with well-characterized statistical behavior, as well as clinical relevance, is crucial to finding novel and effective treatments. The third talk addresses the inevitability of missing data due to progression. This talk will provide strategies for addressing mortality in progressive disease trials and describe their impact on the interpretation of results. The final talk focuses on treatment effects in progressive disease, addressing similarities and differences related to deltas, slowing, reduction, and variance. Assumptions about treatment effects, as well as choices in how to model them, have a substantial impact on clinical trial design and subsequent trial results.
Talks will be followed by a roughly 15-minute Q&A period. Disease progression modeling provides a framework that enables researchers to ask the right questions about how progressive diseases advance and provides a flexible tool that will shape future clinical trial design as we search for treatments that can slow or halt disease progression.

Talk 1: Understanding disease age and leveraging this concept in clinical trial design. Adam Staffaroni, University of California, San Francisco

Talk 2: Endpoint selection in progressive diseases and considerations for clinical trials. Chris Coffey, University of Iowa

Talk 3: Strategies for addressing mortality in progressive disease trials and impact on the interpretation of trial results. Tom Jensen, Berry Consultants

Talk 4: Defining treatment effects in progressive disease, addressing similarities and differences related to deltas, slowing, reduction, and variance. Guoqiao Wang, Washington University in St. Louis

SP17 – ESTABLISHING THE UNIVERSITY DATA COORDINATING CENTER (UNICORN) NETWORK TO ADVANCE DATA COORDINATION IN CLINICAL RESEARCH

Organizer: Chris Lindsell (Duke University)

Chair: Jody Ciolino (Northwestern University)

Speaker(s):

1) Chris Lindsell (Duke University)

2) Cristina Murray-Krezan (University of Pittsburgh)

3) Catherine Dillon (Medical University of South Carolina)

4) Nicholas Pajewski (Wake Forest University)

Data Coordinating Centers (DCCs) play a critical role in the design, implementation, analysis and dissemination of multicenter clinical trials. Through thoughtful collaborations, DCCs contribute to defining the right questions that will lead to robust answers. They ensure efficient data procurement and promote data accuracy. This session aims to discuss the formation, objectives, and impact of the UNICORN Network, a coalition of academic data coordinating centers aimed at advancing design, data coordination and statistical methodology in clinical research. The session will provide an up-to-date overview of the UNICORN Network’s mission to share best practices, advocate for data coordination, and address gaps in the landscape of data coordinating centers. By featuring speakers from diverse backgrounds, the session will highlight how the network serves as a platform for discussing innovative solutions and fostering partnerships to advance the field of clinical research data coordination.

Session Objectives: (1) Introduce the UNICORN Network: Outline the network’s formation, principles, and structure, including its mission to improve clinical research informativeness through collaboration; (2) Discuss the Landscape of Academic DCCs: Present findings from the 2024 DCC Summit and follow up survey, addressing current challenges, opportunities, and the critical role of data coordination in clinical research; (3) Highlight Best Practices and Advocacy: Share the network’s efforts in developing and disseminating best practices for data coordination, professional development, and advocacy for the value of academic DCCs; (4) Encourage Collaborative Efforts: Facilitate a discussion on how the network promotes resource sharing, collaboration, professional development and communication among academic institutions.

Introduction to the UNICORN Network: discuss the origins of the UNICORN Network, emphasizing its role in bringing together academic DCCs to share best practices and advocate for high-quality data coordination. Highlight the network’s foundational principles of transparency, member-driven leadership, and the importance of academic centers in advancing clinical research.

Landscape and Challenges of Academic DCCs: present a landscape assessment based on a 2024 DCC Summit and subsequent survey, providing insights into the state of academic DCCs, including staffing, infrastructure, mission priorities, and existing gaps. This segment will cover the unique challenges faced by academic DCCs, such as increasing regulatory requirements, funding constraints, and the need for methodological innovation.

Best Practices and Advocacy for Data Coordination: outline the UNICORN Network’s efforts in developing and advocating for best practices in data coordination including the establishment and operation of DCCs, professional development programs, and strategies for enhancing data management and sharing, regulatory compliance, and cybersecurity in clinical research. Discuss the network’s role in promoting collaboration and resource sharing among institutions.

Panel Discussion and Audience Engagement: The discussant will provide a synthesis of the presented topics, offering insights into the future of data coordination in clinical research. The discussion will focus on how the UNICORN Network can drive innovation and address challenges within the field. Audience members will be invited to participate, posing questions and sharing their perspectives on advancing data coordination practices.

Importance of the Session: Data Coordinating Centers are critical to multi-site clinical research, yet there is no guidebook to follow when building one, and no standard by which they are assessed. This session will highlight the UNICORN Network’s contributions to improving clinical research and advocate for the recognition of data coordination as a critical aspect of trial success. By bringing together experts from various academic institutions, it aims to foster collaboration and disseminate knowledge to enhance the effectiveness and efficiency of DCCs. The session will provide valuable insights into the evolving landscape of data coordination in clinical trials and opportunities for members of the community to get involved in advancing the science and practice of data coordination.

Diversity and Inclusion: The session features speakers and panelists from diverse academic backgrounds and institutions, offering multiple perspectives on data coordination. This diversity will underscore the collaborative nature of the UNICORN Network and its commitment to inclusive practices that advance clinical research.

Expected Outcomes: (1) Enhanced understanding of the role and challenges of academic DCCs in clinical research; (2) Dissemination of best practices for establishing and running data coordinating centers; (3) Engagement with the clinical trials community to identify strategies for addressing current and future challenges in data coordination.

SP18 – IMPROVING THE CLINICAL TRIAL ECOSYSTEM TO EFFICIENTLY GENERATE ROBUST RESEARCH AND IMPROVE CARE

Organizer: Stuart G Nicholls (Ottawa Hospital Research Institute)

Chair: PJ Devereaux (McMaster University)

Speaker(s):

1) PJ Devereaux (McMaster University)

2) Dean Fergusson (Ottawa Hospital Research Institute)

3) Stuart G Nicholls (Ottawa Hospital Research Institute)

4) Sameer Parpia (McMaster University)

Randomized controlled trials (RCTs) are the gold standard for generating high-quality evidence to optimize human health. However, trial teams and patients face substantial challenges to undertaking and participating in clinical trials, including delays to study initiation and a lack of access to clinical trials for patients. Indeed, a recent editorial bemoaned that “RCTs are often challenging and resource-intensive to implement” due to logistical barriers as well as a need to develop the infrastructure required for study procedures.

As part of a $250 million investment to deliver on the Canadian Biomanufacturing and Life Sciences Strategy, the Canadian Accelerating Clinical Trials (ACT) consortium has been funded to address these challenges and improve the clinical trials ecosystem. Alongside ACT, several Clinical Trial Training Platforms (CTTPs) have also been funded to recruit, train and mentor highly qualified trainees, researchers, healthcare professionals, and clinical research professionals and better position the next generation of clinical trial researchers within the clinical research ecosystem.

In this invited session we identify key challenges and how we are addressing these through ACT and the CTTPs. The session will be opened by PJ Devereaux, who will provide an overview of the challenges as well as a high-level synopsis of the work being undertaken by ACT. This will be followed by three focused presentations on: work that is decreasing time and increasing efficiency of study initiation; improvements in awareness of, engagement with, and access to clinical trials; and training that will build a capable and nimble workforce to deliver robust RCTs.

Talk 1: Improving the process and increasing the pie: Time to ACT. PJ Devereaux, McMaster University

In this presentation we will describe the current challenges facing those wanting to conduct clinical trials in Canada, including the need to increase the funding available for clinical trials. From here we will outline initiatives undertaken by ACT to address these challenges, including the development of master contract agreements, building research infrastructure within community hospitals, streamlining the ethics review process, improving awareness of and access to clinical trials, and recognizing the important link between a strong economy and health and the need to engage with industry to grow the funding pie.

Talk 2: Decreasing time and increasing efficiency of study initiation. Dean Fergusson, Ottawa Hospital Research Institute

Operational bottlenecks in trial initiation include research ethics board (REB) approval, the negotiation and execution of clinical trial contracts, regulatory processes, and the recruitment and training of study personnel. After outlining these challenges, we will describe two major initiatives of the ACT consortium: the development of a single national distributive REB model with strict timelines, and two pan-Canadian master agreements and accompanying templates relating to the sharing of data and study start-up. The first, the master data sharing agreement, had 46 signatories as of October 1, 2024. The second, the master CIHR-funded participating site agreement (and its accompanying template for non-CIHR-funded studies), had 35+ institutions across Canada negotiating its language for finalization as of October 1, 2024.

Talk 3: Awareness, Engagement, and Access: getting the right trials to the right people. Stuart Nicholls, Ottawa Hospital Research Institute

In a survey of the public conducted by Clinical Trials Ontario, only 11% of respondents had been approached to be part of a clinical trial, yet over 65% of respondents indicated they would be willing to participate in a trial. This reflects a major gap between interest and opportunity for patients to benefit from clinical trials. The third presentation will focus on the need to improve access to clinical trials. Specifically, the presentation will focus on the work ACT has undertaken to (1) increase the availability of trials in Canada through direct funding and network support, (2) improve awareness regarding the importance and availability of trials (BeTheCure), (3) build infrastructure to support increased access to trials in areas currently underserved (portfolio hospitals), and (4) improve the design and conduct of trials to make them more inclusive (patient engagement & IDEA).

Talk 4: Building capacity and the future workforce. Sameer Parpia, McMaster University

Clinical research depends on a skilled workforce equipped with the knowledge and mentorship necessary to drive advancements in this field. In this final presentation we discuss the need to develop capacity and skills within the workforce. The presentation will begin by outlining the rapid development of trial methods and needs before outlining the Clinical Trial Training Platforms (CTTPs) funded by the Canadian Institutes of Health Research Clinical Trial Fund. Following a general overview of the CTTPs the presentation will focus on the work and impact of the Canadian Network for Statistical Training in Trials (CANSTAT), a training platform to train and mentor biostatisticians in clinical trials.

Panel discussion: Following the presentations we will facilitate what we hope is a spirited discussion regarding the steps ACT and CANSTAT are taking to improve the trials ecosystem and the opportunity to create further collaborations. The chair will invite questions and comments from the audience to help distil directions for future work to improve the clinical trial ecosystem in Canada, North America, and beyond.

SP20 – IDENTIFYING AND MITIGATING THE IMPACT OF CLINICAL RESEARCH PROFESSIONAL ATTRITION IN CLINICAL TRIALS

Organizer: Valerie Stevenson (University of Michigan)

Chair: Valerie Stevenson (University of Michigan)

Speaker(s):

1) Valerie Durkalski (Medical University of South Carolina)

2) Ian Rines (Medical University of South Carolina)

3) Sara Roy (University of Chicago)

4) Abbey Staugaitis (University of Minnesota)

5) Sharon Yeatts (Medical University of South Carolina)

6) Valerie Stevenson (University of Michigan)

The world of clinical research is not immune from the impact of study team attrition. This presentation explores the impact related to research study coordinator turnover, methods to quantify the rate of attrition, and opportunities to mitigate the impact of research professional turnover on active clinical trials. SIREN is an NIH-funded research collaborative with a number of research professionals at more than 90 sites across the US, Canada and abroad. Staff changes create challenges, and clinical trials are no exception. Changes in research professionals can have a wide range of impacts on an organization, including lower morale, cultural shifts, reduced productivity in the form of lower enrollment and reduced data quality, and increased training and hiring costs. The reasons for turnover can vary based on several factors including the industry, geographic location, economic conditions, and specific skill sets in demand. The SIREN Network initiated a process to better understand the incidence, reasons, and impact of turnover, specifically focusing on study coordinators, and to explore ways to mitigate negative impact on trial operations.

Talk 1: Challenges In Determining Prevalence of Attrition.

The Strategies to Innovate EmeRgENcy Care Clinical Trials Network (SIREN) utilizes a web-based clinical trial management system to capture study team members and roles within active clinical trials via an electronic delegation of authority (DoA) log. Sharon Yeatts will provide an overview of the structure of the network and the electronic DoA, and Ian Rines will describe the methods used to capture and present information related to clinical research coordinator turnover. These methods focus on data-driven questions that lead to evidence-based answers.

Talk 2: Qualitative Approach to Understanding Impact.

By asking the right questions, we can unlock innovative solutions and provide robust answers to staffing challenges. With this in mind, Valerie Stevenson will discuss the tools and techniques used in soliciting feedback regarding the impact of research coordinator attrition. This section provides an overview of LEAN practices, survey development, analysis of results, and development of mitigation strategies.

Talk 3: Implementation of Mitigation Strategies and the Impact of Attrition.

Abbey Staugaitis and Sara Roy will share the study manager perspective. They will provide an overview of their hiring models, their experience working through issues related to attrition at the site level, collaboration with network stakeholders, implementation of mitigation strategies, and feedback on the process. Staffing models discussed will include pooled and semi-pooled, remote/asynchronous, and primary or co-lead. Definitions and origins of each model, perceived benefits, encountered challenges, and ways to adapt models for different groups will be presented and evaluated.

A collaborative discussion session will follow, allowing participants to engage with the presented findings and contribute to the dialogue on improving retention, training and growth of key members of a clinical trial team.

SP21 – ISSUES IN THE DESIGN OF STUDIES WITH HIERARCHICAL ENDPOINTS

Organizer: Yuliya Lokhnygina (Duke Clinical Research Institute)

Chair: Yuliya Lokhnygina (Duke Clinical Research Institute)

Speaker(s):

1) Huiman Barnhart (Duke University)

2) Marc Buyse (International Drug Development Institute)

3) Toshimitsu Hamasaki (The George Washington University Biostatistics Center)

4) Frank Rockhold (Duke University)

Recently, hierarchical endpoints have become increasingly popular in clinical trials research. They have several advantages over conventional composite endpoints: they (1) use information from multiple outcomes, rather than only the first event, (2) prioritize more important outcomes, and (3) combine information from different types of outcomes (time-to-event, binary, continuous, counts). Multiple approaches have been proposed for analysis of hierarchical endpoints, including win statistics, net treatment benefit and desirability of outcome ranking (DOOR). In this session, leading researchers in this field will discuss relative merits of various approaches and practical issues in designing clinical trials with hierarchical endpoints.

Talk 1: Trial Design with Win Ratio or Win Odds Based on Hierarchical Endpoints. Huiman Barnhart, Duke University

Win statistics, such as the win ratio and win odds, have become a popular approach to the analysis of hierarchical endpoints in clinical studies. While several sample size formulas are available for the design of randomized trials using win statistics, these formulas require investigators to specify a clinically significant and meaningful magnitude of the win statistic and the expected probability of ties. In practice, these quantities are difficult to identify from the published literature. We show that the win ratio for hierarchical endpoints is a weighted average of marginal win ratios (with a similar expression for win odds), under the assumption of independence of the individual endpoints. We also provide the expression for the probability of ties. These formulas provide a simple way to specify a clinically significant and meaningful win ratio (or win odds) magnitude and probability of ties. As a result, formula-based power and sample size calculations can be easily obtained for trial design without the need to conduct complex simulation studies. Our extensive simulation studies show that statistical power calculated with the formulas under the independence assumption is similar to the simulation-based power for any type of positively correlated hierarchical endpoints. Our approach gives researchers an easy tool for trial design and gives insight into the relative contribution of each marginal win ratio (win odds) to the overall win ratio (win odds) and the impact of adding endpoints to the hierarchy. Cardiovascular trials are used to illustrate our approach.
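For readers less familiar with win statistics, the sketch below shows how a win ratio, win odds, and tie probability are typically counted from all treated-versus-control pairs on a prioritized list of endpoints. It is a generic illustration, not the weighted-average formula described in the talk; the function names and toy data are hypothetical, and it assumes fully observed outcomes with no censoring.

```python
from itertools import product


def compare_pair(treated, control, higher_is_better):
    """Return +1 if the treated patient wins, -1 if control wins, 0 for a tie.

    Endpoints are compared in priority order; the first endpoint on which
    the two patients differ decides the pair.
    """
    for t, c, hib in zip(treated, control, higher_is_better):
        if t == c:
            continue  # tied on this endpoint, move down the hierarchy
        treated_better = (t > c) if hib else (t < c)
        return 1 if treated_better else -1
    return 0


def win_statistics(treated_group, control_group, higher_is_better):
    """Count wins, losses, and ties over all treated-control pairs."""
    wins = losses = ties = 0
    for t, c in product(treated_group, control_group):
        result = compare_pair(t, c, higher_is_better)
        if result > 0:
            wins += 1
        elif result < 0:
            losses += 1
        else:
            ties += 1
    total = wins + losses + ties
    odds_denominator = losses + 0.5 * ties
    return {
        "wins": wins,
        "losses": losses,
        "ties": ties,
        "win_ratio": wins / losses if losses else float("inf"),
        "win_odds": (wins + 0.5 * ties) / odds_denominator
        if odds_denominator else float("inf"),
        "tie_probability": ties / total,
    }


# Hypothetical toy data: (death indicator, number of hospitalizations).
# Lower is better for both endpoints; death has the higher priority.
treated_group = [(0, 1), (0, 0), (1, 2)]
control_group = [(0, 1), (1, 0), (0, 2)]
stats = win_statistics(treated_group, control_group,
                       higher_is_better=(False, False))
print(stats)
```

In this toy example the nine pairs split into 5 wins, 3 losses, and 1 tie, so the win ratio is 5/3 and the win odds are 5.5/3.5. The tie probability (1/9 here) is one of the design inputs the abstract notes is hard to specify in advance.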

Talk 2: Involving Patients in the Design of Trials Using Hierarchical Outcomes. Marc Buyse, IDDI & I-BioStat

Generalized pairwise comparisons is a method for analyzing multiple prioritized outcomes that provides a patient-relevant estimate of the overall effect of the treatment on all outcomes. Such an overall treatment effect can be expressed as a win ratio, a win odds, a probabilistic index, or a Net Treatment Benefit (NTB). We will argue that the NTB has advantages over other measures of treatment effect: it can be decomposed into additive contributions of prioritized outcomes and can be interpreted as the net probability that a random patient receiving treatment has a better outcome than a random patient receiving control. As such, it ranges over the interval [-1,+1], with 0 indicating that treatment does not differ from control. Establishing a hierarchy of outcomes is crucial to ensuring that the NTB is patient relevant. This process entails selecting the outcomes, prioritizing them according to patient preferences, and choosing thresholds of clinical similarity for some or all outcomes if appropriate. These choices can be based on expert or patient opinion. We have developed software called “Voice” to elicit patient preferences for a well-defined set of selected outcomes, using the pairwise comparison paradigm. Voice displays the outcomes of pairs of patients, and the user is asked to choose the patient who, in their opinion, has the better outcome. An AI-driven algorithm generates outcomes for successive pairs of patients until the algorithm converges to a prioritized list of outcomes and thresholds that reflect the user’s preferences. Voice keeps track of user responses to justify the choice of a list of prioritized outcomes and thresholds in a prospective trial design, and to document heterogeneity in patient preferences. Voice could potentially inform individualized analyses according to the preferences of each user (or classes of users).
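The additive decomposition of the NTB mentioned above can be illustrated with a short sketch. It assumes fully observed outcomes where higher values are better, and it treats a pair as decided at a priority level only when the difference strictly exceeds that level's threshold of clinical similarity. The function name, the data, and the thresholds are hypothetical and are not taken from the Voice software.

```python
def net_treatment_benefit(treated_group, control_group, thresholds):
    """Return (NTB, per-level contributions) from all treated-control pairs.

    thresholds[k] is the smallest difference on outcome k that counts as
    clinically relevant; smaller differences pass the pair on to the next
    outcome in the hierarchy.
    """
    n_pairs = len(treated_group) * len(control_group)
    net_wins = [0] * len(thresholds)  # wins minus losses decided at each level
    for t in treated_group:
        for c in control_group:
            for level, tau in enumerate(thresholds):
                diff = t[level] - c[level]
                if diff > tau:
                    net_wins[level] += 1
                    break
                if diff < -tau:
                    net_wins[level] -= 1
                    break
            # pairs undecided at every level are neutral and add nothing
    contributions = [n / n_pairs for n in net_wins]
    return sum(net_wins) / n_pairs, contributions


# Hypothetical toy data: (survival in months, quality-of-life score).
treated_group = [(12, 70), (8, 60)]
control_group = [(9, 62), (8, 50)]
ntb, contributions = net_treatment_benefit(
    treated_group, control_group, thresholds=(2, 5)
)
print(ntb, contributions)
```

Here two of the four pairs are won on survival, one is won on quality of life, and one is neutral, so the NTB of 0.75 is the sum of a 0.5 contribution from the first outcome and 0.25 from the second.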

Talk 3: The Desirability of Outcome Ranking: The DOOR to Patient-Centric Benefit-Risk Evaluation. Toshimitsu Hamasaki, The George Washington University Biostatistics Center

Typical clinical trial analyses focus on comparing interventions for each efficacy and safety outcome. While these analyses estimate outcome-specific effects and combine marginal effects for benefit-risk assessments, they often overlook associations between outcomes, face challenges from competing risks, and fail to account for the cumulative impact of multiple outcomes on individual patients. Additionally, differing analysis populations for efficacy and safety complicate the applicability of benefit-risk analyses.

We can address these limitations through patient-centricity by correcting our arithmetic and “using outcomes to analyze patients rather than patients to analyze outcomes.” However, to obtain the most informative answers for clinical practice, we prioritize: robustness through the avoidance of reliance upon modeling assumptions for validity; objectivity by avoiding subjective beliefs; the theory for error control consistent with the evidentiary standard for confirmatory evidence; clearly defined estimands and populations from which to estimate parameters; best practices for composite endpoints including integrated analyses of components; best practices for benefit:risk / multi-endpoint analyses to aid comprehensive assessment including analyses based on the absolute (vs. relative) risk scale consistent with providing a common scale for interpretation of multiple outcomes simultaneously; recognition of dimensions of treatment contrast including rank-based and grade-based analyses; best practices for ordinal patient-centric outcomes including cumulative analyses; intuitive interpretation; and sound technical fundamentals.

The Desirability Of Outcome Ranking (DOOR) methodology has been developed to enhance patient-centric benefit-risk evaluation in clinical trials. It allows for a more informative comparison of treatment risks and benefits. Given its complexity, thorough and careful analyses are vital. This talk presents a comprehensive statistical analysis plan for implementing DOOR in research studies, illustrating its components with examples, and addressing design issues in clinical trials utilizing this methodology.

SP22 – PLANNING FOR HETEROGENEOUS TREATMENT EFFECTS: ENRICHMENT FOR TREATMENT-SENSITIVE PATIENT POPULATIONS

Organizer: Amy Crawford (Berry Consultants)

Chair: Amy Crawford (Berry Consultants)

Speaker(s):

1) Giorgio Paulon (Berry Consultants)

2) Patrick Lawler (McGill University)

3) Jordan Elm (Medical University of South Carolina)

4) Ben Saville (Adaptix Trials)

In the era of precision medicine, understanding the wide-ranging responses of patient subgroups to interventions is critical for improving outcomes and enhancing clinical care. This session will address heterogeneous treatment effects (HTE), that is, variations in treatment efficacy across a patient population, and discuss adaptive clinical trials designed to respond in real time to emerging evidence of HTE. Enrichment designs have emerged as a key strategy for reacting to HTE, allowing for refined inclusion criteria and identifying patients most likely to benefit from specific therapies. By focusing on treatment-responsive subgroups, these designs enhance trial efficiency and increase the likelihood of successful results, paving the way for more targeted and effective interventions in clinical practice.

In the first talk, Dr. Paulon will explore the application of enrichment strategies by examining two clinical trials of early minimally invasive surgical removal of intracerebral hemorrhage (ICH). The first trial discussed, ENRICH, is a recently completed study that showed significant benefit in a pre-specified subgroup of the patient population based on the location of the hemorrhage. The second trial, REACH, seeks to further investigate if hemorrhage size is a meaningful variable in the subgroup where ENRICH did not demonstrate functional benefit.

In the second talk, Dr. Lawler will discuss the clinical motivation and trial design strategy for handling HTE in ATTACC-CAP, a platform trial designed to investigate the effect of antithrombotic therapy for patients with community-acquired pneumonia. The trial can adaptively modify entry criteria and reach conclusions in pre-defined patient risk groups at interim analyses. Risk groups are defined by the combination of several variables, including severity of illness, patient characteristics, and other biomarkers.

In the third talk, Dr. Elm will introduce the StrokeNet Thrombectomy Endovascular Platform (STEP) trial and discuss the anticipation of HTE in the endovascular therapy (EVT) indication expansion domain. Past trials have established EVT as a highly effective treatment for acute ischemic stroke patients in a relatively narrow range of baseline characteristics. It is probable that additional stroke patients benefit from EVT. This trial aims to expand the boundaries of indication for EVT by learning the differences between patients who are responsive and non-responsive to treatment through a changepoint model.

The discussant, Dr. Saville, will provide a broad discussion of clinical trials that plan for HTE. He will discuss similarities and differences in the trials presented during the session, including benefits and challenges associated with such designs. He will discuss the process and key decisions required in the design of an enrichment trial, and highlight the importance of clinical-statistical collaboration.

SP24 – SHAPING EQUITABLE ACCESS TO CANCER CLINICAL TRIALS THROUGH AI AND NAVIGATION

Organizer: Therica Miller (Icahn School of Medicine at Mount Sinai)

Chair: Therica Miller (Icahn School of Medicine at Mount Sinai)

Speaker(s):

1) Therica Miller (Icahn School of Medicine at Mount Sinai)

2) Deirdre Cohen (Mount Sinai)

3) Deborah Doroshow (Icahn School of Medicine at Mount Sinai)

4) Anai N Kothari (Medical College of Wisconsin)

5) Erin Lynch (Medical College of Wisconsin)

6) Regina Schwind (Triomics)

Clinical trial enrollment remains a critical challenge in advancing novel therapeutics. Despite efforts to improve it, participation remains critically low, with fewer than 7% of adult cancer patients participating in cancer treatment trials, and underrepresented groups such as racial and ethnic minorities continue to face barriers to access. The increasing complexity of trial protocols and the narrowing of eligibility criteria, designed to safeguard patient safety and improve treatment specificity, further restrict the pool of qualified participants. This trend exacerbates low enrollment rates, which are a significant cause of trial failure: nearly 20% of trials fail to meet accrual targets, leading to premature termination. Recruitment costs now account for 25-30% of total trial budgets, and slow accrual not only delays therapeutic innovation but also magnifies these costs. Further, disparities exist in the participation of underrepresented groups in clinical trials, with Black, Hispanic, and rural populations often being under-enrolled relative to their disease burden. Some data suggest that providers are less likely to offer clinical trials to underrepresented groups, a factor that may contribute to persistently low participation rates among these groups. Research has shown that implicit biases, as well as assumptions about patient interest, understanding, and logistical challenges (e.g., transportation or financial concerns), often lead to fewer trial opportunities being extended to these populations. This reduced trial access further widens health disparities and restricts these patients from benefiting from novel, potentially life-saving therapies. Addressing these biases and barriers through proactive recruitment strategies is essential for achieving more inclusive, equitable, and scientifically valid trial results.

Recent work highlights the urgent need for systematic approaches to trial screening, such as pre-screening patient records to identify eligible candidates early in the recruitment process. Recent studies have shown that novel recruitment strategies, such as dedicated patient pre-screening programs (Clinical Trial Navigators), lay navigation, and technology-driven solutions like artificial intelligence (AI) and machine learning, have the potential to significantly improve recruitment efficiency. For instance, proactive screening of patient records, which involves systematically reviewing medical records and genetic data to identify trial candidates, has been proposed as a means of expanding access to trials, particularly for underrepresented populations. Further, contemporary advancements in large language models (LLMs) enable cost-effective automation of patient-trial matching in the real-world setting. The implementation of pre-screening programs is aligned with recent calls to address the “eligibility bottleneck” in clinical trials, which limits patient access and reduces the generalizability of trial findings. AI-driven solutions such as LLMs have demonstrated efficacy in automating trial matching, thereby reducing the manual workload on clinical staff and improving patient diversity in trials. Advanced AI models and LLMs offer a promising solution by automating parts of the screening process, allowing for the rapid identification of potential candidates based on complex eligibility criteria. Studies have shown that integrating pre-screening and AI-driven tools can significantly reduce manual workload for clinical staff, improve patient recruitment, and mitigate institutional barriers and implicit biases that hinder diverse participation.

These approaches have the potential not only to increase enrollment rates but also to reduce the time and financial costs associated with recruitment, making trials more accessible and sustainable in the long term. The panelists will discuss real-world examples of three novel approaches to trial recruitment: 1) pre-screening clinical trial navigators, 2) lay navigation, and 3) AI/LLM. The discussion will include how dedicated screening resources, lay navigators, and AI-driven tools have been implemented to improve patient recruitment in oncology trials, reducing manual workload for clinical staff, improving patient trial access, and reducing institutional barriers and implicit bias. The discussion will address how each recruitment strategy can be deployed to aid investigators in identifying the most appropriate patients for clinical trials. The panelists will discuss the utility of each method (pre-screening, navigation, AI/LLM), the challenge of underrepresentation of diverse populations in clinical trials, and how dedicated screening and navigation resources and AI can help identify and screen a more inclusive patient cohort. These solutions represent a shift in trial recruitment practices, addressing the urgent need to improve participation rates, and provide a promising direction for expanding access, reducing barriers to participation, and paving the way to more inclusive and equitable clinical trials.

SP25 – OVERCOMING CHALLENGES IN RARE CANCERS: LEVERAGING REGISTRY DATA AND INNOVATIVE TRIAL DESIGNS

Organizer: Kenichi Nakamura (National Cancer Center)

Chair: Kenichi Nakamura (National Cancer Center)

Speaker(s):

1) Hitomi Okuma (National Cancer Center)

2) Chiharu Mizoguchi (National Cancer Center)

3) Junki Mizusawa (National Cancer Center)

This session will explore groundbreaking platform trials such as the MASTER KEY Project, which focuses on rare cancers and rare molecular fractions, areas where conducting randomized controlled trials is often unfeasible due to limited patient populations. By showcasing real-world examples of regulatory approval based on single-arm trials, utilizing registry data to expand treatment indications, the session will highlight operational strategies and innovative statistical designs that enhance the credibility and impact of these studies.

Talk 1: MASTER KEY Registry.

The MASTER KEY Project is a platform trial comprising a registry and multiple sub-studies, involving participation from eight Asian countries. The registry, with over 5,000 patients diagnosed with rare cancers, plays a critical role in providing historical control data for regulatory application of sub-studies. While most registries typically consist of only clinical data, the strength of the MASTER KEY Registry lies in its inclusion of both clinical and biomarker data. This comprehensive approach allows for more precise extraction of control data, even in clinical trials where specific biomarkers define patient populations. To further enhance the biomarker data, a system was established to centrally collect samples from across Asia, conduct next-generation sequencing analysis, and return the results to participating institutions. The registry encourages collaboration between pharmaceutical companies and academia, with 12 companies actively involved, along with strong support from patient advocacy groups.

Talk 2: MASTER KEY Sub-studies.

To date, 31 sub-studies have been conducted under the MASTER KEY Registry, focusing on rare cancers and molecular fractions. Many of these are single-arm trials with response rates as the primary endpoint, using registry data to extract control group information, thereby improving the likelihood of regulatory approval. Furthermore, to efficiently enroll patients from rare populations, two recent sub-studies within the MASTER KEY Project have implemented a fully decentralized clinical trial system. This allows patients to participate and be enrolled remotely, without the need to visit the clinical trial site. This session will delve into case studies of these sub-studies, demonstrating how quality assurance is maintained to ensure successful regulatory submissions.

Talk 3: Statistical Innovations for Rare Populations.

As highlighted by the FDA’s Complex Innovative Trial Design framework, there is a growing momentum for the use of complex adaptive, Bayesian, and other novel clinical trial designs. In the MASTER KEY Project, several phase 2 and basket trials utilizing Bayesian methods have been conducted, gradually accumulating practical know-how. This session will introduce new clinical trial designs planned in investigator-initiated studies.

After the presentation, we will open the floor for questions and discussions from the audience and encourage participants to share their thoughts on the session topic. We will conclude by summarizing key takeaways from the session and highlighting the broader implications of registry data utilization and innovative trial designs for treatment development in rare diseases.

SP26 – ADVANCED NOVEL RANDOMIZATION IMPLEMENTATION IN REDCAP FOR CLINICAL TRIALS

Organizer: Marilly Palettas (Ohio State University)

Chair: Lai Wei (Ohio State University)

Speaker(s):

1) Marilly Palettas (Ohio State University)

2) John VanBuren (University of Utah)

3) Jody Ciolino (Northwestern University)

4) Matthew Shotwell (Vanderbilt University Medical Center)

5) Madison Hyer (Ohio State University)

6) Valerie Durkalski-Mauldin (Medical University of South Carolina)

In clinical trials, randomization is crucial for ensuring unbiased treatment allocation and the integrity of results. With the increasing complexity of clinical designs, there is a pressing need for robust and adaptable randomization methods. This session aims to explore advanced implementations of randomization techniques within the Research Electronic Data Capture (REDCap) system. We will present innovative strategies to enhance randomization and stratification processes, focusing on minimal sufficient balance, adaptive randomized designs, and upcoming new randomization features in REDCap. Traditional approaches to randomization often fall short in accommodating the intricacies of advanced clinical trials. REDCap, a secure web application for data collection, provides a built-in framework for directly implementing less intricate randomization techniques such as stratified, block, and stratified-block randomization. However, these randomization strategies do not reflect state-of-the-art randomization methodologies that have been implemented in clinical trials more often in the past decade such as covariate-adaptive and response-adaptive randomization. Though REDCap cannot yet directly implement advanced randomization techniques, the REDCap framework is flexible enough to allow these methodologies to be implemented indirectly. As such, this session will empower researchers with access to REDCap to freely implement state-of-the-art randomization techniques. This session aligns with the conference theme, “Shaping the Future: The Right Questions, Robust Answers,” by addressing the critical questions surrounding randomization methodologies and their practical application in clinical research. In this session, we will illustrate three non-traditional randomization techniques and how they can be used in REDCap using novel techniques developed through active clinical trials.

Talk 1: Overview of REDCap Randomization and Introduction of Covariate-Adaptive Randomized Techniques.

This talk will introduce basic randomization capabilities in REDCap and provide a high-level overview of the menu of randomization methods from which study teams may choose for implementation in their trials, focusing on pros and cons of each and general logistical considerations when deciding on a randomization method. We will begin with a demonstration of the basic randomization process currently available in REDCap. We will next shift our focus toward advanced randomization techniques such as covariate-adaptive randomization. Adaptive randomization algorithms, many falling under the heading of “minimization,” can improve baseline variable balance across study arms and increase statistical power in randomized controlled trials. However, study-specific characteristics—study design, intervention type, clinical context, blinding, and data collection procedures—all play a key role in the ease (or lack thereof) in implementing these complex algorithms. As such, we will: (1) illustrate situations that would or would not merit use of a covariate-adaptive randomization algorithm, and (2) provide insight for incorporating adaptive randomization algorithms into complex study workflows within the REDCap framework.
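As background for the term “minimization,” the following sketch shows one common form of the idea: a two-arm, Pocock-Simon-style rule with a biased coin. It is a generic illustration and does not represent REDCap functionality or the speakers' implementation; the arm labels, the `p_best` value, and the data layout are assumptions chosen for the example.

```python
import random

ARMS = ("A", "B")


def imbalance_if_assigned(counts, participant, arm):
    """Total marginal imbalance across factors if `participant` joined `arm`."""
    total = 0
    for factor, level in participant.items():
        per_arm = [
            counts[a].get((factor, level), 0) + (1 if a == arm else 0)
            for a in ARMS
        ]
        total += max(per_arm) - min(per_arm)
    return total


def minimize_assign(counts, participant, rng, p_best=0.8):
    """Assign an arm, favoring the one that minimizes imbalance.

    counts[arm][(factor, level)] holds the number of participants already
    in `arm` with that covariate level; it is updated in place.
    """
    scores = {arm: imbalance_if_assigned(counts, participant, arm) for arm in ARMS}
    if scores["A"] == scores["B"]:
        arm = rng.choice(ARMS)  # no preferred arm: simple randomization
    else:
        preferred = min(scores, key=scores.get)
        other = "B" if preferred == "A" else "A"
        # biased coin: keep some randomness so allocation is not predictable
        arm = preferred if rng.random() < p_best else other
    for factor, level in participant.items():
        key = (factor, level)
        counts[arm][key] = counts[arm].get(key, 0) + 1
    return arm


# Hypothetical example: two women are already in arm A, none in arm B.
counts = {"A": {("sex", "F"): 2}, "B": {}}
rng = random.Random(2025)
# p_best=1.0 makes the rule deterministic for this demonstration only.
assigned = minimize_assign(counts, {"sex": "F"}, rng, p_best=1.0)
print(assigned, counts)
```

With `p_best=1.0` the new participant is always sent to arm B, the arm that reduces the imbalance. In practice a value below 1 is usually chosen to preserve unpredictability, which is one of the trade-offs a study team weighs when selecting a randomization method.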

Talk 2: Minimal Sufficient Balance.

This talk will focus on the methodology of minimal sufficient balance (MSB), a covariate-adaptive randomization technique designed to ensure that treatment groups are comparable across key baseline characteristics. We will discuss the principles of minimal sufficient balance and its importance in achieving unbiased results in clinical trials. We will also provide a step-by-step demonstration of how to adapt and implement this technique for use in REDCap, including practical examples and coding strategies. MSB was intended to be used with proprietary software for covariate-adaptive randomized trials, but with some modifications we were able to adapt MSB for use in REDCap while maintaining the integrity of the methodology and implementing it with a high degree of fidelity. The session will highlight firsthand experience where MSB has been successfully adapted and implemented in an active clinical trial, showcasing the innovative technique and how it can improve study design and outcomes.

Talk 3: Response-Adaptive Randomization.

This talk introduces the methodology of response-adaptive randomization (RAR), which adjusts future randomization probabilities based on previously collected outcome data. We will cover the principles of allocation targets and explore various methods for modifying allocations using both observed and unobserved outcome data. Since interim RAR analyses are often conducted outside of the electronic data capture system, we will focus the talk on how to set up a study in REDCap before enrollment begins to accommodate potential randomization allocation changes, using an ongoing double-blind Bayesian adaptive clinical trial as an example.
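To make the idea of updating allocation probabilities concrete, here is a minimal sketch of one widely used Bayesian RAR scheme for a binary outcome: estimate each arm's posterior probability of being best, then temper those probabilities before using them as allocation weights. This is not the design of the trial discussed in the talk; the uniform Beta(1, 1) priors, the tempering power of 0.5, and the interim data are all assumptions made for illustration.

```python
import random


def prob_each_arm_best(successes, failures, rng, n_draws=10_000):
    """Monte Carlo estimate of P(arm has the highest response rate).

    Each arm's response rate gets an independent Beta(1 + successes,
    1 + failures) posterior, i.e. a uniform prior (an assumption).
    """
    arms = list(successes)
    best_counts = {arm: 0 for arm in arms}
    for _ in range(n_draws):
        draws = {
            arm: rng.betavariate(1 + successes[arm], 1 + failures[arm])
            for arm in arms
        }
        best_counts[max(draws, key=draws.get)] += 1
    return {arm: best_counts[arm] / n_draws for arm in arms}


def allocation_probabilities(p_best, power=0.5):
    """Temper P(best) so that allocation does not swing too aggressively."""
    weights = {arm: p ** power for arm, p in p_best.items()}
    total = sum(weights.values())
    return {arm: w / total for arm, w in weights.items()}


# Hypothetical interim data: responders and non-responders per arm.
successes = {"A": 15, "B": 8}
failures = {"A": 5, "B": 12}
rng = random.Random(46)
p_best = prob_each_arm_best(successes, failures, rng)
allocation = allocation_probabilities(p_best)
print(p_best, allocation)
```

The resulting allocation table is the kind of output that would be computed outside the data capture system at an interim analysis and then loaded back in, which is why the abstract stresses planning for allocation changes before enrollment begins.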

Discussant: Dr. Matt Shotwell will reflect on the three presentations and share his experience with implementing an adaptive clinical trial platform in REDCap. In addition, he will discuss forthcoming new randomization features in REDCap.

This session will provide attendees with ways to use REDCap to implement advanced randomization techniques that can enhance the integrity and efficacy of their clinical trials. By leveraging innovative strategies to enhance the capabilities of REDCap, researchers can address pressing questions in clinical design, data collection, and analysis. The integration of minimal sufficient balance and adaptive randomized designs aligns with the overarching theme of shaping the future of clinical trials through innovative practices.

SP27 – ENHANCING KEY ENDPOINT EVALUATION AND MONITORING WITH AI/ML AND RISK-BASED STRATEGIES

Organizer: Yuxi Zhao (Pfizer)

Chair: Maria Kudela (Pfizer)

Speaker(s):

1) Jingjing Ye (BeOne Medicines)

2) Li Wang (AbbVie)

3) Xinlei (Ivan) Mi

4) Elena Rantou (Food and Drug Administration)

In clinical trials, ensuring the quality and validity of data for downstream analysis and results is paramount, necessitating thorough data evaluation and monitoring, especially for key efficacy endpoints. This typically involves collaboration among multiple personnel, employing edit checks and manual queries during data collection. Edit checks consist of straightforward schemes programmed into relational databases, though they lack the capacity to assess data intelligently. In contrast, manual queries are initiated by data managers who manually scrutinize the collected data, identifying discrepancies needing clarification or correction. Manual queries pose significant challenges, particularly when dealing with large-scale data in late-phase clinical trials. Moreover, they are reactive rather than predictive, meaning they address issues after they arise rather than preemptively preventing errors. Targeted monitoring, which aims for real-time remediation of potential errors based on critical risk assessments, is appealing because it uses key risk indicators and statistical monitoring to identify potential issues or anomalies in the data. However, the available tools for risk-based monitoring (e.g., Target e*CRF) primarily concentrate on overseeing and managing data entry errors and alterations and are descriptive in nature. Similarly, the available tools for statistical monitoring are mostly descriptive as well, which may not serve the purpose well. Advances in AI/ML provide powerful techniques for feature/subgroup characterization and pattern recognition, which can potentially be used to identify anomalous patterns and monitor clinical trial data for single endpoints, multiple endpoints or multi-modal data collectively, or temporal data. This session aims to bridge the gap between clinical trials and advances in AI/ML and to advocate the adaptation of AI/ML methods for evaluation and monitoring strategies.

Talk 1: Leveraging AI-assisted Central Statistical Monitoring to Elevate Clinical Trial Oversight and Data Quality. Jingjing Ye, BeOne Medicines

Talk 2: A one-shot deep learning framework for psoriasis area and severity prediction. Li Wang, AbbVie

Talk 3: Open-Source Risk-based Quality Management (RBQM) Software for Good Statistical Monitoring of Critical Clinical Trial Data. Xinlei (Ivan) Mi

TARGETED SESSION

TS2 – FELLOWS SESSION: THE PROMISE AND POTENTIAL PITFALLS OF ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING (AIML) IN CLINICAL TRIALS. REFLECTIONS FROM SCT FELLOWS

Organizer: Anne Lindblad (The EMMES Corporation)

Chair: Li Chen (Amgen)

Speaker(s):

1) Susan Halabi (Duke University)

2) Haoda Fu (Amgen)

3) Alexia Iasonos (Memorial Sloan Kettering Cancer Center)

4) Rick Chappell (University of Wisconsin-Madison)

5) Li Chen (Amgen)

It seems impossible to escape the barrage of claims for how artificial intelligence and machine learning (AIML) can change our lives or businesses for the better. Some of the claims for AIML in clinical trials include: (1) Design and analysis: AIML can assist in optimizing trial designs, analyzing complex datasets, and providing objective imaging diagnostics; (2) Efficiency: AIML can facilitate the identification of suitable patient populations and clinical sites through advanced data analytics and predictive modeling, thereby improving recruitment strategies; (3) Safety: AIML can streamline the monitoring of trial progress and outcomes, enabling real-time data analysis and early detection of adverse effects or patient nonadherence; (4) Patient engagement: AIML can enhance participant experience through personalized communication and support. But the use of AIML in clinical trials presents potential pitfalls including: (1) Data quality and bias; (2) Generalizability; (3) Regulatory challenges; (4) Overreliance on technology; (5) Integration with existing systems; (6) Patient privacy; (7) Inequalities in trial access and outcomes. Addressing these pitfalls will be essential for the successful integration of AIML in clinical trials to ensure that its benefits are realized without compromising patient safety or scientific integrity. Fellows of SCT and invited speakers/panelists in this session will share their experiences and reflections and will invite the audience to share their own experiences.

CONTRIBUTED PRESENTATIONS

SESSION 1

Statistical Analysis of Cluster Randomized Trials

CP1-1 – POWER CALCULATION FOR GROUP SEQUENTIAL CLUSTER RANDOMIZED TRIALS WITH CONTINUOUS OR BINARY OUTCOMES

Primary Author:

1) Lee Ding (Harvard University)

Co-Author(s):

2) Rui Wang (Harvard University)

Well-planned interim analyses provide researchers with early data, allowing for timely decisions about a trial’s continuation, modification, or termination to enhance patient safety, reduce costs, and expedite access to effective treatments. Group sequential methods enable investigators to stop a clinical trial early when there is compelling evidence for efficacy or futility while preserving the trial’s statistical integrity. Although group sequential methods are well developed for individually randomized trials, established methods for cluster randomized trials (CRTs) are limited. In CRTs, groups or clusters (such as communities, schools, or clinics) rather than individual participants are randomly assigned to treatment or control conditions, making this design especially useful in settings where individual randomization is impractical or when the intervention is delivered at a group level. Because clusters are defined based on some shared characteristics or circumstances, outcomes for individuals within the same cluster tend to be more similar than those in different clusters. Group sequential trials require an inflated maximum sample size compared to equivalent fixed-sample designs to account for the possibility of early stopping. However, limited guidance exists on designing and powering a group sequential CRT, which requires accounting for correlated outcomes and repeated interim analyses of data accumulating at both the cluster and individual levels.

To this end, we develop sample size calculation methods for group sequential CRTs with continuous or binary endpoints. Under designs that recruit by cluster or individuals within clusters, we first show that differences between the corresponding sequentially calculated test statistics are asymptotically independent. We then employ an error spending approach to determine the maximum number of clusters or cluster size of a trial. Our method encompasses early stopping for combinations of efficacy and binding or non-binding futility. In simulation studies, we find that group sequential CRTs powered using our sample size calculations achieve the specified power across a range of trial design specifications; these results hold even when both clusters and individual participants enter the trial at varying levels over time. We also provide guidance on how and when to schedule interim analyses to maximize the efficiency of a group sequential CRT. We then apply our approach to planning interim analyses in the MEDUSA study, a CRT evaluating the effect of a multifaceted antimicrobial therapy initiation program on sepsis survival.
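The ingredients named above (within-cluster correlation and error spending) can be illustrated with a minimal sketch. This is not the authors' method: it assumes equal cluster sizes, uses the standard design effect 1 + (m − 1)ρ, and uses the Lan-DeMets O'Brien-Fleming-type spending function as one common choice of error spending. The function names and the `inflation` argument (a placeholder for the group sequential inflation factor that a full method would derive from the stopping boundaries) are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def design_effect(cluster_size: int, icc: float) -> float:
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1.0 + (cluster_size - 1) * icc


def obf_type_alpha_spent(info_fraction: float, alpha: float = 0.05) -> float:
    """Two-sided type I error spent by information fraction t in (0, 1],
    using the Lan-DeMets O'Brien-Fleming-type spending function."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 * (1 - NormalDist().cdf(z / sqrt(info_fraction)))


def clusters_per_arm(n_individual_per_arm: float, cluster_size: int,
                     icc: float, inflation: float = 1.0) -> int:
    """Clusters per arm after inflating an individually randomized sample
    size by the design effect and a (hypothetical) sequential inflation factor."""
    n_inflated = n_individual_per_arm * design_effect(cluster_size, icc) * inflation
    return ceil(n_inflated / cluster_size)


if __name__ == "__main__":
    # Cumulative alpha spent at three equally spaced looks
    for t in (1 / 3, 2 / 3, 1.0):
        print(f"t = {t:.2f}: alpha spent = {obf_type_alpha_spent(t):.5f}")
    print(clusters_per_arm(100, cluster_size=20, icc=0.05))
```

Under this spending function very little type I error is spent at early looks, which is why the maximum sample size typically needs only modest inflation over a fixed design.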

CP1-2 – EVALUATING INFORMATIVE CLUSTER SIZE IN CLUSTER-RANDOMIZED TRIALS

Primary Author:

1) Bryan Blette (Vanderbilt University Medical Center)

Co-Author(s):

2) Brennan Kahan (University College London)

3) Andrew Forbes (Monash University)

4) Michael Harhay (University of Pennsylvania)

5) Fan Li (Yale University)

In cluster-randomized trials, two popular estimands of interest are the average treatment effect among participants (p-ATE) and the cluster average treatment effect (c-ATE). The p-ATE is defined as an average of the treatment effects across all individual participants, while the c-ATE first averages the treatment effects within each cluster before averaging across clusters. Both quantities are often of interest for a cluster-level intervention. The p-ATE may be different from the c-ATE when informative cluster size is present, i.e., when treatment effects or participant outcomes depend on cluster size. For example, large hospitals may exhibit better or worse average outcomes than small hospitals in different settings. In such scenarios, mixed-effects models and generalized estimating equations (GEEs) with exchangeable correlation structure (which constituted a majority of cluster-randomized trial analyses in a recent systematic review) are biased for both the p-ATE and c-ATE estimands, and GEEs with an independence correlation structure or analyses of cluster-level summaries are recommended instead in practice. However, when cluster size is non-informative, mixed-effects models and GEEs with exchangeable correlation structure can provide unbiased estimation and notable efficiency gains over other methods. Thus, hypothesis tests for informative cluster size would be useful to formally assess the validity of this key assumption. In this work, we develop model-based, model-assisted, and randomization-based tests for informative cluster size in cluster-randomized trials. We construct simulation studies to examine the operating characteristics of these tests, show they have appropriate Type I error control and meaningful power, and contrast them to existing tests used in the observational study setting. The proposed model-based test has high power but is sensitive to model misspecification. 
The proposed model-assisted and randomization-based tests are less powerful in general, but they do not require correctly specifying the mechanism of informative cluster size in a model to have valid Type I error. We further show how covariate adjustment can improve the statistical power of these approaches. The proposed tests are applied to data from a recent cluster-randomized trial, and practical recommendations for using these tests are discussed.
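The distinction between the two estimands can be made concrete with a small numerical sketch. The cluster sizes and effects below are invented for illustration and are not from the trial analyzed in this abstract; the function names are likewise hypothetical. The treatment effect is made to depend on cluster size, so the two averages diverge.

```python
from statistics import fmean

# Hypothetical clusters: (cluster size, average treatment effect in that cluster).
# The large cluster has a smaller effect, i.e., cluster size is informative.
clusters = [(10, 2.0), (10, 2.0), (100, 0.5)]


def cluster_average_effect(clusters):
    """c-ATE: each cluster counts equally, regardless of its size."""
    return fmean(effect for _, effect in clusters)


def participant_average_effect(clusters):
    """p-ATE: each participant counts equally, so clusters are size-weighted."""
    total_n = sum(n for n, _ in clusters)
    return sum(n * effect for n, effect in clusters) / total_n


if __name__ == "__main__":
    print(cluster_average_effect(clusters))      # 1.5
    print(participant_average_effect(clusters))  # 0.75
```

If every cluster had the same effect, or if all clusters were the same size, the two functions would return the same value, which is the non-informative case described above.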

CP1-4 – CURRENT PRACTICE AROUND THE USE OF ESTIMANDS IN CLUSTER RANDOMISED TRIALS, AND THE IMPACT OF INFORMATIVE CLUSTER SIZE ON INFERENCES

Primary Author:

1) Dongquan Bi (University College London)

Co-Author(s):

2) Brennan Kahan (University College London)

3) Andrew Copas (University College London)

Introduction: The use of estimands can help to clarify the treatment effect a study aims to estimate. Cluster randomized trials (CRTs) can address either participant-average (PA) or cluster-average (CA) effects. When outcomes and/or treatment effects vary across clusters depending on cluster size (referred to as “informative cluster size”), these two effects can differ, and estimators that target one can be biased for the other. Furthermore, it has recently been shown that common estimators used in CRTs such as generalized estimating equations with an exchangeable correlation structure (GEE) and mixed-effects models can produce biased estimates for both PA and CA effects under informative cluster size. However, current practice around choice of estimands in CRTs, as well as likely impacts of informative cluster size, are unknown. We therefore aimed to (i) establish the current practice around the use of estimands and estimators in CRTs through a review, and (ii) explore the potential impacts of informative cluster size in a re-analysis of a published CRT.

Methods: We conducted a review of recently published CRTs to explore which estimands are most often targeted, which estimators are being used, and how often potential impacts of informative cluster size are considered. We then reanalysed the RESTORE trial, which randomized 31 US pediatric intensive care units to compare protocolised sedation with usual care for critically ill children on mechanical ventilation. For each outcome, we first compared estimates of the PA and CA effect from independence estimating equations (IEE, a method robust to informative cluster size) to evaluate the likelihood of informative cluster size. Next, we compared estimates from GEE (binary outcomes) or mixed-effect models (continuous) against estimates from IEE to evaluate to what extent results from GEE and mixed-effects models may have been affected by informative cluster size. We used bootstrapping to evaluate to what extent differences could be explained by chance.

Results: Among 73 articles reviewed, no trial attempted to report the estimand, and the research question was unclear for most trials (N=58). For most trials (N=46), it was not inferable whether they were targeting a PA or CA estimand. Trials often used GEEs or mixed models as the primary estimator (N=37). The potential impacts of informative cluster size were rarely considered, with only 3 trials reporting to what extent cluster sizes varied. The re-analysis found that for certain outcomes (18/22), results differed between the PA and CA estimates, suggesting the possible presence of informative cluster size. For instance, for the clinically significant iatrogenic withdrawal outcome, the PA OR estimate was 1.35 (95% CI, 0.72 to 2.51), and the CA OR estimate was 0.81 (95% CI, 0.39 to 1.69), with a -40 (95% bootstrapped CI, -61 to -21) percentage difference.
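The reported percentage difference can be reproduced from the two point estimates under one assumption the abstract does not state explicitly: that the difference is expressed relative to the PA estimate. The function name below is hypothetical.

```python
def percent_difference(reference: float, comparison: float) -> float:
    """Relative difference of `comparison` from `reference`, in percent."""
    return 100.0 * (comparison - reference) / reference


pa_or = 1.35  # participant-average odds ratio from the abstract
ca_or = 0.81  # cluster-average odds ratio from the abstract

if __name__ == "__main__":
    # Assumes the PA estimate is the reference; rounds to -40
    print(round(percent_difference(pa_or, ca_or)))
```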

Potential impact and relevance: Despite growing recognition around the importance of estimands, they are not widely used in reports of CRTs. We found that choice of estimand and estimator can have large impacts on interpretation of results, suggesting that guidance to increase uptake of estimands in CRTs is urgently needed.

SESSION 2

Complex Interventions and Adverse Events

CP2-1 – COMPLEXITIES IN TREATMENT-EMERGENT ADVERSE EVENT ANALYSES DURING INTERIM MONITORING

Primary Author:

1) Nick Zaborek (University of Wisconsin - Madison)

Co-Author(s):

2) Tom Cook (University of Wisconsin - Madison)

3) Bret Hanlon (University of Wisconsin - Madison)

4) Kevin Buhr (University of Wisconsin - Madison)

In our experience producing interim clinical trial reports for data monitoring committee (DMC) review, each trial uses a tailored definition for treatment-emergent adverse events (TEAEs). These definitions seem to be driven by clinical considerations rather than epistemological reasoning. In all cases it remains critical that DMCs are able to effectively and efficiently review accruing data to ensure the safety of all patients in the trials. The objectives of this discussion are to explore the reasons why TEAE analyses increase the complexity of assessing the safety profile of a treatment during interim monitoring by a DMC compared to analyses using all post-randomization adverse events (AEs), and to consider the common practice of excluding AEs that occur between randomization and first treatment administration. We will review TEAE definitions that have appeared in over 40 industry sponsored clinical trials supported by the University of Wisconsin’s Madison Statistical Data Analysis Center, an independent analysis group. These trials span many disease areas from 2008 to the present. This review will provide a detailed summary of the trends and variability of TEAE definitions. Then we will discuss the challenges these various definitions have posed for interim monitoring by DMCs, focusing on data integrity and interpretability of treatment safety profiles. We will present thoughts on the contrasting analyses between all post-randomization adverse events and TEAEs. Lastly, we will facilitate a discussion on handling AEs that occur in the period between randomization and the initiation of treatment.

CP2-2 – USING REDCAP TO FACILITATE ADJUDICATION OF ADVERSE EVENTS

Primary Author:

1) Jennifer Talton (Wake Forest School of Medicine)

Co-Author(s):

2) Kenneth Wilson (Wake Forest School of Medicine)

3) Michael B Nelson (Wake Forest School of Medicine)

4) Haiying Chen (Wake Forest School of Medicine)

5) Laura Lovato (Wake Forest School of Medicine)

6) Patricia Davis (Wake Forest School of Medicine)

7) Karina Kapitanovsky (Wake Forest School of Medicine)

8) Lenore Crago (Wake Forest School of Medicine)

9) Alain Bertoni (Wake Forest School of Medicine)

10) Pamela Sardos (Jefferson Clinical Research Institute, Thomas Jefferson University)

Organization and tracking of adverse events (AEs) that need adjudicating can be a challenge for clinical trials. We aimed to streamline the packaging, reviewing, finalization, and tracking of this process in the REHAB-HFpEF study. REHAB-HFpEF is a phase 3 randomized trial testing a novel, multi-domain physical rehabilitation intervention in patients aged >60 years with heart failure with preserved ejection fraction (HFpEF) and hospitalized for acute decompensated heart failure. The study’s goal is to enroll 880 patients at 20 clinical centers to test the hypothesis that the intervention will reduce the rate of combined all-cause rehospitalizations and mortality at 6 months follow-up versus usual care attention control. REHAB-HFpEF uses REDCap as the main data collection tool, including repeating instruments for AE case report forms. When a clinical site coordinator (SC) enters an AE form into REDCap, a project manager at the Data Coordinating Center (DCC) is alerted and reviews the form in detail, making sure all needed components are completed and accurate, including uploading into the form any required deidentified medical records and documentation necessary for adjudication (e.g., discharge summary, lab results). Once the DCC project manager has finalized the form, an application programming interface (API) is used to create a “package” of the AE event information. The API extracts the finalized AE form and all supplementary documentation and automatically removes fields that should be blinded to the adjudicators. This package is then uploaded into a secure file sharing account (SFSA) which is accessible to the adjudication team. The adjudication team receives a weekly email indicating the unique AE packages that have been uploaded to the SFSA and are ready for review. To help facilitate adjudication, a secondary REDCap project which employs a double data entry feature is used by the adjudication team. 
In this process, two adjudication reviewers independently answer a series of questions about each AE package. A third reviewer compares any incongruent responses between reviewers 1 and 2 and then determines the final responses. Once the third reviewer has completed this task, a “completed adjudication” indicator variable is uploaded back into the main REDCap project via an API. Several indicator variables are used within the AE form to track the status of the adjudication process. These include: 1) SC finalization, 2) DCC finalization, 3) upload to the SFSA, and 4) adjudication complete. Dynamic reports are available in REDCap and bi-weekly reports are e-mailed to the adjudication study team to monitor progress, including completion status for each of the three adjudicators. Currently we are about halfway through study recruitment and have sent 500 AEs for adjudication, with 412 completed. Custom adjudication systems can be developed which provide extensive features; however, this can be time- and cost-prohibitive for some studies. The automated process used for REHAB-HFpEF was built in REDCap and has allowed us to easily compile and share event information, track the status of each AE throughout the adjudication process, and streamline communication among the SC, DCC, and adjudication team.
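The blinding and double-review logic described above can be sketched in plain Python. This is an illustration only, not the study's code: it makes no REDCap API calls, and the field names, question keys, and function names are all hypothetical.

```python
# Hypothetical names of fields that adjudicators must not see
BLINDED_FIELDS = {"treatment_arm", "site_id"}


def build_package(ae_form: dict) -> dict:
    """Copy a finalized AE form, dropping the blinded fields."""
    return {k: v for k, v in ae_form.items() if k not in BLINDED_FIELDS}


def find_discrepancies(review_1: dict, review_2: dict) -> list:
    """Questions on which the two independent reviewers disagree."""
    questions = review_1.keys() | review_2.keys()
    return sorted(q for q in questions if review_1.get(q) != review_2.get(q))


def finalize(review_1: dict, review_2: dict, third_reviewer: dict) -> dict:
    """Keep agreed answers; take the third reviewer's answer where they differ."""
    final = dict(review_1)
    for q in find_discrepancies(review_1, review_2):
        final[q] = third_reviewer[q]
    return final


if __name__ == "__main__":
    form = {"ae_id": 7, "treatment_arm": "intervention", "description": "fall"}
    print(build_package(form))
    r1 = {"hf_related": "yes", "hospitalized": "no"}
    r2 = {"hf_related": "yes", "hospitalized": "yes"}
    print(finalize(r1, r2, {"hospitalized": "yes"}))
```

Restricting the third reviewer to the discrepant questions mirrors the workflow described above, in which only incongruent responses are re-examined.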

CP2-3 – ADVANCED GENE THERAPY RESEARCH - MORE SMALL-MOLECULE OR MORE-TRANSPLANT?

Primary Author:

1) Rafael Escandon (DGBI Clinical Research Ethics Consulting, LLC)

This presentation explores questions of whether the existing structure of early-stage (Phase 1-2) clinical research with gene therapies (GT) warrants more alignment with research in solid organ transplants versus remaining in the traditional small molecule drug development rubric. GT trials present unique challenges that differentiate them from conventional drug trials, and research participants (and parents) may benefit from examining the prospects of benefits and burdens of their participation in research differently. Among the most challenging issues with early GT trials are the choice(s) of the starting dose, identifying relevant and rapidly assessable safety criteria for expanding enrollment by dose, and determination of dose-escalation increments since participants cannot be re-dosed. This inability to fully discontinue participation once dosed, the need for immunosuppression, unknown long-term hazards and benefits, and the inability to modify the activity of the replaced or edited gene all engender the need to ensure that participants are fully informed of these additional burdens, of greater trial complexity and of significantly longer trial duration. Compared to small molecule trials, GT trials and organ transplants often also have far more strict and genetically determined eligibility criteria, greater uncertainties about the durability of the intervention’s benefit, and possible transition of participants to a state where direct benefit wanes but continued observation may benefit future patients. Similarly, prior participation in a GT trial or receipt of an approved GT is more likely to limit options for participation in future investigational trials of all kinds. 
Finally, the technical complexities of GTs can be difficult to explain via the traditional small molecule-based informed consent process. This raises the question, particularly where pediatric assent is an ethical requirement, of whether the age at which pediatric assent is required should be adjusted upwards (e.g., to 14-16 years) for pediatric GT trials. Thus, engagement processes used in solid organ transplants for transplant-listed potential recipients may be more relevant than the typical small molecule development model in providing the necessary information on trial design, burden, and future implications of participating, including research and therapeutic options.

CP2-4 – DOSE OUTCOME USING STRATIFIED ESTIMATION WITH RANDOM FOREST METHOD (DOSE-RF): A NOVEL APPROACH TO NON-LINEAR DOSE-RESPONSE MODELLING IN COMPLEX INTERVENTIONS

Primary Author:

1) Mollie Payne (King’s College London)

Co-Author(s):

2) Amy Hardy (King’s College London)

3) Ben Carter (King’s College London)

4) Richard Emsley (King’s College London)

Background: In randomized controlled psychotherapeutic trials, we often ask: What is the effect of being offered therapy? However, without perfect compliance, this differs from assessing the effect of receiving therapy. Traditional approaches that only analyze dose-response within the treatment arm can lead to biased, non-causal estimates. Newer methods address these biases but rely on prior assumptions about the dose-response relationship, typically assuming linearity. We propose the DOSE-RF method, a novel, causally valid approach that overcomes these limitations by estimating a dose-response function without predefined assumptions.

Methods: The DOSE-RF method combines machine learning and principal stratification in a two-stage procedure to estimate the dose-response effect. In stage one, random forest classification predicts counterfactual dose for the control arm. In stage two, regression models calculate the causal effect within each dose level. We tested this method across 36 simulation scenarios, varying dose predictors, confounding, dose distribution, and the dose-response function. The method was applied to two illustrative examples: the SoCRATES trial, a trial aimed at reducing paranoia in patients with schizophrenia, and the COMPASS trial, which focused on alleviating psychological distress in patients with long-term health conditions.

Results: In simulations, the DOSE-RF method reliably detected the true dose-response function across diverse scenarios, achieving accurate results without requiring prior assumptions. For scenarios with normally distributed doses, the method showed no significant bias. In cases where the dose followed a beta distribution, some bias emerged due to strong unmeasured confounding, particularly in small sample sizes. Application to the SoCRATES trial confirmed these patterns, though estimation at lower dose levels was challenging due to limited data. Despite this, DOSE-RF successfully identified stratified treatment effects across most dose levels. In the COMPASS trial, DOSE-RF revealed heterogeneous treatment effects, which would enable researchers to pinpoint both the minimum effective dose and the optimal next-best dose for patients who may require more than the minimum. These findings highlight DOSE-RF’s potential to support more personalized and effective treatment recommendations in clinical practice.

Conclusions: The DOSE-RF method offers a robust and flexible approach for analyzing dose-response relationships in clinical trials, addressing biases from non-compliance and assumptions about dose-response. Application to real-world examples highlights the possibilities of this method and allows us a deeper insight into the effect of therapy. It is particularly suited for trials with larger sample sizes and a moderate number of dose levels. Trialists should consider factors influencing compliance and include these in their data collection to optimize the method’s performance.

SESSION 3

Recruitment Challenges and Opportunities

CP3-1 – UNDERSTANDING RECRUITMENT SUCCESS IN CANADIAN TRIALS: AN ANALYSIS OF TRIAL REGISTRY DATA

Primary Author:

1) Alexander Tam (Providence Health Care)

Co-Author(s):

2) Jack Smith (Providence Research)

3) Mark J. Harrison (University of British Columbia)

4) Srinivas Murthy (University of British Columbia)

5) Nick Bansback (University of British Columbia)

Background: Failing to recruit in a clinical trial reduces the trial’s statistical power and exposes participants unnecessarily to potential risks of scientific research, ultimately failing to answer the intended research question and wasting the resources invested. International evidence suggests various trial-level aspects affect recruitment, including non-design (e.g., sponsor/funder) and design (e.g., eligibility criteria) factors. We examined the associations between trial-level aspects and recruitment success among trials in Canada. Specifically, we focused on aspects that trialists can modify or plan for during the design stage.

Methods: We used the clinical trials register ClinicalTrials.gov. We identified interventional clinical trials with at least one study site in Canada with either a “completed”, “suspended”, or “terminated” trial status. We restricted the sample to trials beginning between January 2017 and December 2023, and to phase 3 trials and trials without an applicable FDA-defined phase. We further excluded trials that were missing an estimated recruitment count or were terminated due to safety or efficacy reasons. We retrieved the full version history for each included trial. The estimated enrolment count was defined as the value reported in the last available version prior to the stated start of recruitment. The final enrolment count was the value from the latest version. For missing data, we identified relevant values from published literature. We also performed spot checks and prioritized published data when there were discrepancies with registry data. The ratio of final to estimated enrolment was calculated, and recruitment success was defined as a ratio ≥0.85. Trial design variables included: planned trial size, number of study sites, number of study arms, comparator type (placebo/active, standard of care, not placebo/active), follow-up duration, and eligibility criteria quartiles. Associations between trial design and recruitment success were explored using logistic regressions.
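The success definition can be written as a short sketch. It assumes the ratio is final enrolment divided by estimated enrolment, which matches the 85%-of-target framing in the conclusion; the function names are hypothetical and the example counts are invented.

```python
def recruitment_ratio(estimated: int, final: int) -> float:
    """Final enrolment as a fraction of the pre-recruitment estimate."""
    if estimated <= 0:
        raise ValueError("estimated enrolment must be positive")
    return final / estimated


def recruited_successfully(estimated: int, final: int,
                           threshold: float = 0.85) -> bool:
    """Success as defined in the abstract: ratio at or above 0.85."""
    return recruitment_ratio(estimated, final) >= threshold


if __name__ == "__main__":
    # Invented example: a trial planning 200 participants
    print(recruited_successfully(200, 170))  # True (ratio 0.85)
    print(recruited_successfully(200, 169))  # False (ratio 0.845)
```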

Results: Of 2,444 trials, 751 (31%) did not successfully recruit. The proportion of trials that successfully recruited differed across funding sources: private (81%), federal (61%), other public (66%), and unknown funding (59%). In covariate-adjusted models, trials with three comparator arms or more were more likely to successfully recruit compared with one-arm trials (adjusted odds ratio (aOR) = 1.7 [95% confidence interval (CI): 1.2, 2.5]) and trials with active and/or placebo comparators were more likely to successfully recruit compared to trials with no active/placebo comparators (aOR = 1.8 [95% CI: 1.3, 2.7]). Factors associated with lower likelihood of success included: ≥1 year follow-up duration (aOR = 0.6 [95% CI: 0.4, 0.9], reference: <3 months) and planned trial size of ≥200 (aOR = 0.4 [95% CI: 0.3, 0.6], reference: <50). Among non-design covariates, federal, other public, and unknown funding were associated with lower odds (reference: private). Funding-stratified models suggested aORs for trial size ≥200 may be smaller among federal/public trials compared to private trials.

Conclusion: We found that three out of ten clinical trials conducted in Canada failed to meet 85% of their recruitment target, with federally/publicly funded trials faring worse than privately funded trials. Trials with large enrolment sizes and long follow-up durations may be more likely to face challenges.

CP3-2 – RECRUITMENT STRATEGIES AND IMPACT IN A CANCER CLINICAL TRIAL

Primary Author:

1) Brianna Conley (American Society of Clinical Oncology)

Co-Author(s):

2) Crystal Tsai (American Society of Clinical Oncology)

3) Cynthia Kelley (American Society of Clinical Oncology)

4) Jacqueline Perez (American Society of Clinical Oncology)

5) Cindy MacInnis (American Society of Clinical Oncology)

6) Pam Mangat (American Society of Clinical Oncology)

Low enrollment in clinical trials can lead to premature study closures, increased operational costs, and exhaustion of resources, and it negatively impacts the generalizability of trial results, which limits scientific progress. Approximately 80% of trials fail to meet enrollment targets, which causes a loss of up to $8 million per day for drug development companies. The Targeted Agent and Profiling Utilization Registry (TAPUR) Study is a phase II, precision oncology, multi-basket clinical trial that evaluates the antitumor activity of FDA-approved drugs outside of their approved indication(s) in patients with advanced cancers with specific genomic alterations at over 265 clinical sites in the United States. As of November 13, 2024, there were a total of 91 open cohorts and 164 completed cohorts. During and after the COVID-19 pandemic, study enrollment decreased, declining by 4% from 2021 to 2022. Therefore, increased awareness of the TAPUR Study was needed to encourage both physician and oncology community engagement and participant recruitment. This included a multi-stakeholder approach to the multifaceted issue of low enrollment. In collaboration with patient advocacy organizations, targeted recruitment efforts were completed for seven cohorts that had not had any enrollments in the previous three months or were considered treatments for rare targets. This approach focused on reaching potential participants and/or their caregivers. Each patient advocacy organization was consulted on the best methods and platforms to reach their constituents and templates were created to aid the drafting process. However, the study team also recognizes that clinicians and investigators are a crucial audience for improving enrollment onto the TAPUR Study. Therefore, to further promote awareness, the first annual TAPUR Grand Rounds was held in 2023. 
Community oncologists, Principal Investigators, TAPUR staff, and patient advocates were invited as panelists to present study-related topics from their unique perspective to the target audience (i.e., oncology community, oncology clinicians, etc.). This collaboration resulted in multiple marketing efforts including educational and direct recruitment material such as educational blog posts, updates to internal clinical research search engines, mass email blasts, a social media campaign (Facebook, LinkedIn, X, Instagram) and website advertisements. In addition, the study team held bi-monthly TAPUR Study coordinator webinars which provided a regular opportunity to highlight these cohorts and the Grand Rounds to clinical sites. A total of 43% of registrants attended TAPUR Grand Rounds, with the largest share of attendees (47%) self-identifying as patient advocates within the oncology community. As a result of these engagement strategies, the study had 14 additional enrollments across the seven prioritized cohorts, two of which were for rare cohorts, and overall enrollment increased by 4.5% within a period of seven months. This presentation will describe our experience in working with patient advocacy organizations, the effectiveness of an annual grand rounds presentation, and social media campaigns as a multi-pronged approach to increase participant enrollment to the TAPUR Study.

CP3-3 – A SYSTEMATIC ATTEMPT TO OVERCOME BARRIERS TO ACCRUAL ON RANDOMIZED TRIALS: LESSONS LEARNED FROM 15 YEARS OF CLINICALLY-INTEGRATED TRIALS AT A MAJOR CANCER CENTER

Primary Author:

1) Andrew J Vickers (Memorial Sloan Kettering Cancer Center)

Background: Contemporary cancer research is characterized by randomized trials that accrue poorly and often fail entirely. Even where cancer trials are successfully completed, extremely low accrual rates - typically 2-3 patients per center per year - raise questions about representativeness. Perhaps an even greater problem is the trials that are never started because the investigators deem them infeasible, leaving us without good data to guide clinical practice. The reasons why randomized cancer trials have been failing are unsurprising and have been discussed at length in the literature: complexity, provider and patient burden, and patients’ desire for treatment decisions to be made by their doctor rather than at random.

Methods: We systematically evaluated these barriers to trial participation and designed methodologic approaches to address those barriers. Our ideas were synthesized with the concept of the “clinically integrated randomized trial”, sometimes described as a “point-of-care trial”. This involves minimizing eligibility criteria, using routinely collected data as trial endpoints, and developing novel approaches to consent. In particular, we address patients’ concerns about burden by emphasizing that participation entails no extra tests, procedures, questionnaires or visits; we address concerns about treatment being decided at random by including a “clinical judgement prerogative”: doctors are entirely free to ignore randomized allocation if they feel it is in the patient’s best interest. In practice, this does not occur commonly, but we have found that highlighting clinical judgment in the consent discussion is deeply reassuring to patients. We have also developed “two-stage consent” which splits information about research from information about allocated treatment.

Results: We have completed seven large, single-center randomized trials at low cost, with total accrual over 8000 patients and other trials ongoing. Only one trial had external funding. One of our trials (n=1423) is the first single-center study in the modern era to show an effect of a treatment on cancer metastasis. In a multicenter trial, sites using two-stage consent had four-fold the accrual of sites using traditional one-stage consent. A trial at our center that was failing, and switched to our methodology, experienced a ∼3.5-fold increase in accrual rates and is shortly to complete accrual. Empirical study of two-stage consent has found evidence of both better understanding of research methods and lower levels of patient anxiety when compared to one-stage.

Conclusion: Clinically-integrated trials require good research infrastructure, such as that found at academic medical centers. A separate challenge is that trialists and regulators struggle with both methodologic novelty and with the pragmatic approach required by clinical integration. Given our success, and the generalizability to other centers, further evaluation and expansion of this methodology should be considered an urgent research priority, particularly with respect to trials of chemotherapy agents and radiotherapy.

CP3-4 – SUPPORTING CENTRES TO BEST INTEGRATE AN RCT AND TO GAIN CONFIDENCE IN IT CAN OPTIMISE INFORMED DECISION-MAKING AND RECRUITMENT TO SURGICAL TRIALS: AN EXAMPLE FROM AN ORTHOPAEDIC SURGICAL TRIAL

Primary Author:

1) Nicola Mills (University of Bristol)

Co-Author(s):

2) Samantha Harrison (University of Nottingham)

3) Neha Rasheed (University of Nottingham)

4) Jane Blazeby (University of Bristol)

5) Hugh Jarrett (University of Nottingham)

6) Aisha Shafayat (University of Nottingham)

7) Lixiao Huang (University of Nottingham)

8) Alan Montgomery (University of Nottingham)

9) Alexia Karantana (University of Nottingham and Queens Medical Centre)

10) Tim Davis (University of Nottingham and Queens Medical Centre)

Introduction: Randomized controlled trials (RCTs) offer the most reliable evidence on treatment effectiveness but recruitment challenges are common. Surgical trials are particularly prone to difficulties. Dupuytren’s contracture, causing fingers to bend permanently towards the palm, is an example of a common and debilitating condition where established surgical treatment options are guided by clinician and patient choice in the absence of robust comparative evidence. The Hand-2 RCT compared two procedures for the condition - limited fasciectomy (surgical removal of contracture) and needle fasciotomy (division of contracture by needle). Anticipating difficulties, an established intervention - the QuinteT Recruitment Intervention (QRI) - was integrated to identify and address recruitment challenges supporting the trial to reach target two months ahead of a COVID-19 pandemic imposed revised schedule, at a time when other surgical hand trials were struggling.

Methods: A preventative QRI phase (phase 0) implemented learnings from the QRI in Hand-1 feasibility study into the set-up of Hand-2 and pre-recruitment support for surgeons and research teams around identifying patients and discussing the study. A multi-methods approach was used to understand the recruitment process and challenges in real time involving interviews with center staff, scrutiny of screening logs/recruitment pathways and audio-recording study recruitment discussions (phase 1). Data were subject to simple counts, thematic and content analysis. In phase 2, QRI-informed actions grounded in findings from phase 1 were developed and implemented in collaboration with the trial management team. Phases 1 and 2 continued cyclically until the recruitment target was met.

Results: Data were analyzed from screening logs from all 21 centers, interviews (11 surgeons and 4 research staff from 9 study centers of 25 invited), and audio-recordings of study recruitment consultations (n=157 with 18 surgeons from 7 centers). Findings from screening logs showed wide variation in the number of eligible patients identified and recruited across centers. A core reason for study ineligibility was the surgeon deeming the finger unsuitable for needle fasciotomy. Half of eligible patients declined study participation because they preferred the needle. Analysis of interview and consultation data offered insight into these findings: (1) recruitment was greater in centers that adapted their usual practice and reconfigured their clinic set-up to better fit the trial; (2) individual surgeon beliefs affected patient selection and disrupted the equipoise in study recruitment discussions, contributing to patients’ treatment preferences. QRI-informed actions focused on sharing successful recruitment processes and pathways, raising awareness of the impact of biased views on patient selection and trial discussion, and providing support and examples of best practice to overcome them.

Conclusion: Supporting center teams to adapt their usual practice to best fit the trial and to gain confidence in it in terms of patient suitability and unbiased discussion of treatment can optimize patient informed decision-making and recruitment to surgical trials.

SESSION 4

Sample Size Determination and Estimation

CP4-1 – THE IMPACT OF TREATMENT NON-ADHERENCE ON POWER AND SAMPLE SIZE IN CLINICAL TRIALS

Primary Author:

1) Sherry Livingston (Medical University of South Carolina)

Co-Author(s):

2) Henry Merryday (Medical University of South Carolina)

3) Valerie Durkalski-Mauldin (Medical University of South Carolina)

In any clinical trial it is expected that some number of participants will not adhere to the treatment protocol. Despite this fact, the primary analysis is usually the intention-to-treat (ITT) analysis, where participants are analyzed according to the assigned treatment arm, regardless of treatment received. This results in a conservative estimate of the treatment effect and provides valuable insight into the effect of recommending a new treatment. Trialists are also often interested in estimating the true treatment effect for those participants who receive the treatment, and so a secondary as-treated (AT), per protocol (PP), or other adherence-adjusted analysis is often included in the analysis plan. Selection bias is a concern in such analyses, as unmeasured confounders could potentially affect both adherence and outcomes, thereby negating the protective effects of randomization in an AT or PP analysis. Newer adherence-adjusted methods have been developed and have been shown to produce treatment effect estimates that are less biased. These methods include inverse probability weighted per protocol analysis (IPW-PP) and instrumental variable (IV) approaches. While these methods reduce bias, they do so at the cost of higher variability and reduced power. Sample sizes are typically inflated to account for non-adherence, but the required inflation for the newer adherence-adjusted methods has not been carefully studied. The generally recommended inflation factor for an ITT analysis with a non-adherence rate of R is 1/(1-R)^2. This value can be extremely large for high non-adherence rates, and our goal in this work is to examine how much sample size inflation is actually needed for the various adherence-adjusted approaches. This presentation shows simulation results that examine the power, bias, and type 1 error rate when comparing ITT, AT, PP, IPW-PP, and IV analyses in a variety of conditions and under varying degrees of sample size inflation. 
We show that when the rate of non-adherence is low, PP and IPW-PP analyses are minimally biased and retain nominal type 1 error rates while still maintaining good power, even when the sample size is not inflated. IV methods have less bias, but do not achieve the required power without an appropriate sample size inflation. When non-adherence increases, however, bias and type 1 error rates are much higher for AT and PP methods, especially when unmeasured confounders have a strong effect on adherence and outcomes. In this scenario, IV methods maintain type 1 error rate regardless of sample size but still require the full sample size inflation to achieve the required power. In all scenarios, ITT analyses maintained nominal type 1 error rates but lacked power to show a treatment effect. While IV methods have very low bias, they produce estimates that have high variability and wide confidence intervals, so appropriate sample size adjustments must be made to find statistically significant results. When treatment non-adherence can be measured, we recommend a comparison of ITT, PP, and IV results to examine the impact on the estimation of treatment effect.
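As a rough illustration of the ITT inflation factor 1/(1-R)^2 quoted above, the following minimal sketch shows how quickly the required sample size grows with the non-adherence rate R. The function names and the base sample size of 100 are hypothetical and are not taken from the abstract.

```python
import math

def inflation_factor(non_adherence_rate: float) -> float:
    """ITT sample size inflation factor 1/(1-R)^2 for non-adherence rate R."""
    if not 0.0 <= non_adherence_rate < 1.0:
        raise ValueError("non-adherence rate must be in [0, 1)")
    return 1.0 / (1.0 - non_adherence_rate) ** 2

def inflated_sample_size(base_n: int, non_adherence_rate: float) -> int:
    """Round the inflated sample size up to the next whole participant."""
    return math.ceil(base_n * inflation_factor(non_adherence_rate))

# Hypothetical base sample size of 100 under full adherence.
for rate in (0.1, 0.2, 0.5):
    print(rate, inflated_sample_size(100, rate))
```

With R = 0.5 the factor is 4, which is the sense in which the inflation "can be extremely large" at high non-adherence rates.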

SESSION 5

Informed Consent

CP5-1 – ADAPTING THE QUINTET RECRUITMENT INTERVENTION TO OPTIMISE INFORMED CONSENT IN CLINICAL TRIALS IN INDIA: LESSONS FROM THE ORION-I FEASIBILITY STUDY

Primary Author:

1) Sangeetha Paramasivan (University of Bristol)

Co-Author(s):

2) Jeffrey Pradeep Raj (King Edward Memorial Hospital)

3) Urmila M Thatte (King Edward Memorial Hospital)

4) Saee Hinglaspurkar (King Edward Memorial Hospital)

5) Prachi Bhoir (King Edward Memorial Hospital)

6) Nikita Sawant (King Edward Memorial Hospital)

7) Jenny Donovan (University of Bristol)

8) Nithya J Gogtay (King Edward Memorial Hospital)

Background: The past decade has seen an increasing focus on streamlining and strengthening the clinical trials ecosystem in India through regulatory reforms, with a greater emphasis on the ethical/methodological rigor of trials. A systematic scoping review of empirical research on clinical research ethics in India identified several gaps, including the need to better understand the recruitment-informed consent processes in clinical trials. The UK QuinteT Recruitment Intervention (QRI) was developed to understand and optimize recruitment and informed consent in clinical trials and has been used in over 70 trials. It comprises two phases: a) understanding recruitment challenges using multiple data sources (interviews with patients/clinicians; trial consultation audio-recordings) and b) addressing identified challenges through a plan of action co-produced with trial teams (confidential group/individual recruiter feedback). The OrION-I (Optimising Informed CONsent in clinical trials in India) study aimed to investigate the feasibility of audio-recording trial consultations and the acceptability of using them to provide recruiter feedback in India, to facilitate a future large-scale study aimed at optimizing the recruitment-informed consent process in clinical trials in India.

Methods: After ethical approval and written, informed consent, the study team audio-recorded informed consent discussions (n=15) between healthcare professionals (HCPs) and participants in two clinical studies being conducted at a tertiary care hospital in Mumbai. We then conducted interviews with HCPs and patients (n=5 per group) to explore perceptions regarding the audio-recording process (feasibility and acceptability), views on acceptability of HCP feedback amongst patients and HCPs, and suggestions for QRI adaptation to India. The audio recorded data was transcribed and translated into English from Hindi and Marathi where necessary. Thematic analysis was employed, drawing from techniques of constant comparison. Data triangulation comprised comparing audio-recordings of consent discussions with interview data.

Results: There was a high rate of acceptance of audio-recording of consent discussions among study participants (15/17). Patients and HCPs viewed audio-recordings as acceptable for research purposes, appeared comfortable with the process and mentioned a range of potential benefits (e.g., improved information provision in the future; documentation/evidence purposes; better demonstration of voluntariness of participation). Their concerns centered on confidentiality and indicated the need for additional reassurances from researchers conducting such studies. Some HCPs indicated that they anticipated being anxious about receiving feedback but were also keen to improve how they communicated with patients through feedback. Participants highlighted the need for a nuanced feedback process that is tailored to address patients’ concerns and suits each HCP’s style of communication. Suggestions for HCP feedback included the use of peer feedback, self-appraisals by HCPs, ensuring rapid feedback for real-time improvements in how HCPs communicate research, and the use of a structured format for feedback where feasible. The study also provided valuable information on the content and style of communication in consent appointments.

Conclusion: The audio-recordings of consent discussions were operationalized without major challenges, with HCP and patient interviews indicating that both audio-recordings and feedback to HCPs were feasible and acceptable in India. This will enable future large-scale adaptations of the QRI in India.

CP5-2 – REMOTE, CENTRALIZED MONITORING OF THE INFORMED CONSENT PROCESS IN MULTICENTER TRIALS

Primary Author:

1) Alexandra Gil (University of Pittsburgh)

Co-Author(s):

2) Brianna J Higginbottom (University of Pittsburgh)

3) Charity G Patterson (University of Pittsburgh)

Among the many responsibilities of a Principal Investigator, ensuring the protection of the rights and welfare of human research participants rises to the top of the list. One of the first major steps to achieving this goal is the informed consent process. SPARX3 is a large randomized clinical trial comprising 24 sites across the United States and Canada, with a target enrollment of 370 drug-naïve individuals recently diagnosed with Parkinson’s Disease. During the development of the electronic data capture system (EDC), our goal was to develop a fully remote monitoring mechanism that would enable early identification, intervention and prevention of problems during the informed consent process, which typically may only be identified during periodic in-person site monitoring visits. For this reason, instead of simply confirming that IC had been obtained for a participant, our EDC requires sites to complete an Informed Consent Process (ICP) form and upload a scanned copy of the Informed Consent (IC) Form. The ICP form was created using multiple choice questions targeting key elements required in the process of obtaining informed consent, such as: 1) who was present during the informed consent discussion; 2) the fact that risks were presented; 3) confirmation that significant issues of concern to the participant were addressed; 4) date that the participant signed the IC form; 5) start and end time of the process to obtain informed consent; 6) signature by an individual responsible for the documentation, as indicated in the Delegation of Authorization (DoA) Form. Documentation of this process enhances the certainty that sites are following strict FDA guidelines which are recommended by the IRB of record and enables monitoring of 100% of ICs for the SPARX3 study. Once a month a monitor uses the ICP data elements to draw conclusions about the Informed Consent Process. 
For example, the start and end time for the ICP indicate whether an appropriate amount of time was provided to review and answer questions regarding the study. In addition, date and time of consent are compared to the first data collection procedure to ensure that research procedures were not conducted prior to obtaining IC. The IC form is checked for version control and expiration date. A few sites’ local regulations require a HIPAA agreement to be signed at the time of consent, and therefore it is expected to be uploaded together with the copy of the IC form. The site staff signature is compared to the DoA to ensure that only those with adequate training and authorization to obtain IC are doing so. In this presentation, we will highlight the process of identifying the elements to monitor the informed consent process, the monthly monitoring, issues identified during the monitoring, communication with sites and the impact of this process on the protection of rights and welfare of human subjects in the SPARX3 Trial. We believe similar practices should be implemented in multicenter clinical trials of any size, but in particular in large-scale trials.
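The monthly checks described above could be automated along the following lines. This is only a sketch under stated assumptions: the record fields, the finding labels and the 15-minute minimum discussion time are hypothetical illustrations, not the SPARX3 EDC schema or its monitoring thresholds.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class ConsentRecord:
    # Hypothetical field names, not the actual ICP form fields.
    consent_start: datetime
    consent_end: datetime
    first_procedure: datetime
    form_version: str
    signer: str

def review_consent(record, approved_version, delegated_staff,
                   min_duration=timedelta(minutes=15)):
    """Return a list of findings for one informed consent record."""
    findings = []
    # Was enough time allowed for review and questions? (threshold is assumed)
    if record.consent_end - record.consent_start < min_duration:
        findings.append("short_discussion")
    # Research procedures must not start before consent is obtained.
    if record.first_procedure < record.consent_end:
        findings.append("procedure_before_consent")
    # The signed form must be the currently approved version.
    if record.form_version != approved_version:
        findings.append("wrong_form_version")
    # The staff member obtaining consent must appear on the delegation log.
    if record.signer not in delegated_staff:
        findings.append("signer_not_delegated")
    return findings

example = ConsentRecord(
    consent_start=datetime(2024, 3, 1, 9, 0),
    consent_end=datetime(2024, 3, 1, 9, 5),
    first_procedure=datetime(2024, 3, 1, 9, 2),
    form_version="v2",
    signer="coordinator_b",
)
print(review_consent(example, approved_version="v3",
                     delegated_staff={"coordinator_a"}))
```

Returning a list of findings rather than a single pass/fail flag lets a monitor see every issue on a record at once when communicating with a site.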

CP5-4 – STAGED INFORMED CONSENT FOR TRIALS WITH USUAL CARE GROUPS: DEVELOPING GUIDANCE

Primary Author:

1) Clare Relton (Queen Mary University of London)

Introduction: Most RCTs use the standard approach to informed consent, where all potential participants are informed about all possibilities and random allocation to groups in one long patient information sheet and consent form prior to the start of the trial. For trials comparing interventions to usual care, this standard one-stage “everything up front to everyone” approach to informed consent results in the potential control (usual care) participants being informed of interventions which they are not then offered. This one-stage approach burdens both patients and clinicians with information and questions which are not in harmony with usual healthcare conversations/processes. It is also ethically dubious (creating expectation and then disappointment when interventions are not offered) and creates statistical difficulties through disappointment bias in responses, retention and crossovers. There is growing evidence that trials which stage (and tailor) the information given and consents sought generate more representative and larger trial populations. However, there is uncertainty among all trial design stakeholders about the feasibility and acceptability of adopting staged approaches to informed consent within systems and traditions designed to operate a single-stage “everything up front” approach to IC. A growing number of trialists in different contexts are using a staged approach to informed consent. We explored what is known about staged approaches to informed consent with a view to developing guidance for triallists and ethics committees considering pragmatic trial designs with usual care comparators.

Methods: We gathered knowledge of staged approaches to informed consent in a wide range of health conditions and settings. This included (1) a scoping review of the published literature, (2) case studies exploring patient and clinician experiences of a staged approach to informed consent, (3) semi-structured qualitative interviews with professionals implementing or considering using a staged approach to IC within the Trials within Cohorts design and (4) an international symposium on “staged-and-tailored” informed consent which gathered views and perspectives from multiple triallists and other stakeholders in pragmatic trial design.

Results: Our scoping review identified sixty-eight studies using or describing a staged approach, with wide-ranging sizes and topics of study and increased sophistication of use. Sixteen professionals were interviewed. Six themes emerged from the analysis: (1) previous problems with traditional approaches to IC, (2) perceptions of the ethics in practice, (3) the impact of a staged-and-tailored approach, (4) involvement of key stakeholders, (5) the implications for pragmatic research and Real-World Evidence and (6) the challenges of conducting innovative methodological research and facilitating change. Data from the exploration of case studies, interviews and the symposium suggest this approach can lead to an improved patient experience, reduced disappointment bias and crossovers, and more efficient recruitment, including more representative populations. This approach was considered acceptable to a range of key stakeholders and efficient by trialists. The term “staged-and-tailored” more accurately describes this approach. Learnings from in-depth evaluations are used to underpin recommendations for future conduct and reporting.

Conclusion: RCTs with usual care comparators can generate valuable knowledge to inform routine clinical decision making by patients, clinicians and commissioners, yet the process of recruitment and consent is practically and ethically complex and “informed consent processes are trial-centric rather than patient/health-centric.”

SESSION 6

Data Collection and Quality

CP6-1 – ROLE OF WEARABLE TECHNOLOGY IN CLINICAL TRIALS: A SINGLE INSTITUTION EXPERIENCE

Primary Author:

1) Gillian Gresham (Cedars-Sinai Medical Center)

Co-Author(s):

2) Allistair Clark (Cedars-Sinai Medical Center)

3) Michael Sobolev (Cedars-Sinai Medical Center)

4) Brian Minton (Cedars-Sinai Medical Center)

5) Celina H Shirazipour (Cedars-Sinai Medical Center)

6) Christie Jeon (Cedars-Sinai Medical Center)

7) Arash Asher (Cedars-Sinai Medical Center)

8) Jethro Hu (Cedars-Sinai Medical Center)

9) Philip Chang (Cedars-Sinai Medical Center)

10) Andrew E Hendifar (Cedars-Sinai Medical Center)

Introduction: Wearable devices are increasingly used to collect longitudinal, objective activity and biometric data, deliver tailored interventions, and monitor adherence to lifestyle and behavioral interventions in clinical trials. They provide a window into a patient’s daily activity and overall well-being in their free-living environments. Herein, we review the role of wearable technology in clinical trials and opportunities and challenges associated with their use based on our experience at a single large academic institution.

Methods: We reviewed clinical trials at our institution that used wearable devices (e.g., activity monitors, blood pressure monitors, and other health monitoring devices). Characteristics of the trials were assessed including population, device type, duration of wear time, adherence, and summary of activity data. We identified challenges associated with data collection and management, device standardization, and technological difficulties.

Results: A multidisciplinary team of researchers, physicians, statisticians and data scientists within our institution has been incorporating wearables into research since 2015. Wearables were included as part of the main study or as an optional sub-study in 12 completed investigator-initiated trials and an additional 4 ongoing large multicenter randomized controlled trials, with a total of 324 patients enrolled to date. Most trials were conducted among individuals diagnosed with cancer (prostate, pancreas, breast, brain, or colorectal), although devices were also used in other conditions including cardiovascular disease, pancreatitis, and gastrointestinal disease. Types of wearable devices included wearable activity monitors (n=15 studies), blood pressure monitors (n=2 studies), wireless smart scales (n=3 studies), and transdermal alcohol monitors (n=1 study), with some studies using more than one device. Among completed studies (median sample size n=25, range 6-80), participants were 52% male and 48% female, with a median age of 69 years (range 18-92 years). Duration of wear-time ranged from 2 weeks to 1 year, with adherence to wearing the devices ranging between 70% and 98%. Devices were most commonly used as data collection tools to obtain objective measures of activity (physical activity, sleep, heart rate), assess patient function, or monitor adherence to physical activity and lifestyle interventions. Their use as trial outcomes to evaluate the effect of interventions on a patient’s daily activity is also being explored. The most common challenges encountered included initial set-up and synching of devices and managing updates to the technology and software.

Conclusion: The integration of wearable technology into clinical trials has the potential to advance clinical research and improve our ability to monitor and assess patients enrolled in clinical trials. Overall, adherence was high, and devices were demonstrated to be feasible for use across variable cancer populations and those at-risk for cancer. When designing studies involving wearables, the purpose for use, device type, duration of monitoring, and data management plan should be considered and established a priori. Future directions for the use of wearables include their role in determining trial eligibility, the application of artificial intelligence and machine learning methods, exploring the combination of wearable devices with virtual reality, delivery of interventions via wearables, the application of innovative trial designs to maximize their utility, and using wearables as study outcomes.

CP6-2 – STUDY WITHIN A TRIAL OF ELECTRONIC VERSUS PAPER-BASED PATIENT REPORTED OUTCOMES COLLECTION (SPRUCE) - PRIMARY OUTCOME AND PATIENT DEMOGRAPHICS

Primary Author:

1) Lara Philipps (The Institute of Cancer Research)

Co-Author(s):

2) Joanne Haviland (Queen Mary University of London)

3) Morgaine Stiles (The Institute of Cancer Research)

4) Jessica Maudsley (The Institute of Cancer Research)

5) Isabel Syndikus (The Clatterbridge Cancer Centre NHS Foundation Trust)

6) Alison Tree (Royal Marsden NHS Foundation Trust / The Institute of Cancer Research)

7) Keith Harland (The James Cook University Hospital)

8) Paul Ridley (Ipswich Hospital, East Suffolk and North Essex NHS Foundation Trust)

9) Jacqui Gath (Patient and Public Representative)

10) Robert Huddart (The Institute of Cancer Research / Royal Marsden NHS Foundation Trust)

Background: Patient perspective and survivorship effects are important considerations within oncology trials when evaluating new treatments. Patient-reported outcomes (PRO) are collected using validated questionnaires that measure impact of treatments and health conditions on quality of life. Although electronic PRO (ePRO) collection has been studied in general clinical settings, there is little published literature demonstrating it is as effective as paper PRO collection within clinical trials. Our study-within-a-trial (SWAT) was designed with patient and public advisors to investigate whether use of ePRO in oncology clinical trials was appropriate and acceptable to participants.

Methods: SPRUCE is a partially randomized patient preference study which recruited participants from host trials in The Institute of Cancer Research Clinical Trials and Statistics Unit’s (ICR-CTSU) portfolio. SPRUCE participants were either randomized between ePRO and paper PRO or chose their format preference if they were not willing to be randomized. PRO data was collected to the 12-month timepoint of host trial follow up. Administration processes were consistent for ePRO and paper PRO (e.g. patient information content and reminder frequency). Paper PRO compliance at the first post-intervention timepoint is 90% across ICR-CTSU trials. 244 randomized participants were required to exclude ePRO compliance rates <80% (10% non-inferiority margin), with 80% power and 1-sided alpha=0.05. It was assumed that two thirds of participants would agree to be randomized. The primary endpoint was compliance with questionnaire completion at the first quality of life assessment timepoint after the host trial’s study intervention in the randomized cohort, defined as the percentage of participants returning a questionnaire out of those expected. Patient-completed demographics were requested from all participants to investigate differences between groups (completion optional). Differences between paper PRO and ePRO groups in the preference cohort were assessed by Chi Squared and Fisher’s Exact tests.

Results: 414 participants were recruited from three prostate radiotherapy host trials. Five participants have not yet reached the primary endpoint. Of those reaching the primary endpoint, 244 were in the randomized cohort and 165 in the patient preference cohort. In the randomized cohort, 109/121 (90.1%) of ePRO participants completed the primary endpoint questionnaire compared to 115/123 (93.5%) in the paper PRO group. The difference between electronic and paper questionnaire completion rates was -3.4% (90% CI -9.2% to 2.4%). Non-inferiority could therefore be concluded for ePRO. In the preference cohort, 72/78 (92.3%) of ePRO participants completed the first post-intervention questionnaire compared to 78/87 (89.7%) of paper PRO participants. Patients selecting paper questionnaires were more likely to have a lower household income and education level (p<0.01). Demographics for the randomized and patient preference cohorts are shown.
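The non-inferiority comparison in the randomized cohort can be sketched as follows. The abstract does not state which interval method was used; this sketch assumes a simple Wald (normal-approximation) interval for the difference in proportions, which reproduces the reported figures to one decimal place. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def wald_ci_diff(x1, n1, x2, n2, level=0.90):
    """Difference in proportions (group 1 minus group 2) with a Wald CI."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    return diff, diff - z * se, diff + z * se

# ePRO: 109/121 completed; paper PRO: 115/123 completed (from the abstract).
diff, lower, upper = wald_ci_diff(109, 121, 115, 123)

# Non-inferiority margin of 10 percentage points, as stated in the Methods.
margin = -0.10
non_inferior = lower > margin

print(f"difference = {diff:.1%}, 90% CI {lower:.1%} to {upper:.1%}")
print("non-inferior:", non_inferior)
```

A two-sided 90% interval corresponds to the 1-sided alpha of 0.05 in the design, so non-inferiority is concluded when the lower bound lies above the -10% margin.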

Conclusion: Compliance with ePRO was non-inferior to that of paper PROs, suggesting ePRO is appropriate for oncology clinical trial use. Future analyses will investigate longer term equivalence of formats and patient acceptability, after all participants have reached 12 months follow up. There remains a group of trial participants who require paper questionnaires, and these should continue to be offered to avoid exclusion of people with lower digital literacy levels.

CP6-3 – USING A CENTRAL OUTCOMES CENTER TO REDUCE ATTRITION IN A LONGITUDINAL ED-BASED PEDIATRIC STUDY

Primary Author:

1) Elizabeth Rosenthal (Kennedy Krieger Institute)

Co-Author(s):

2) Daniel Nishijima (University of California, Davis School of Medicine)

3) Tara Gammi (Kennedy Krieger Institute)

4) Nathan Kuppermann (University of California, Davis School of Medicine)

5) Stacy Suskauer (Kennedy Krieger Institute)

6) Kristy Arbogast (The Children’s Hospital of Philadelphia)

7) Mohamed Badawy (University of Texas Southwestern Medical Center)

8) Daniel Corwin (The Children’s Hospital of Philadelphia)

9) Andrea Cruz (Baylor College of Medicine)

10) Danny Thomas (Medical College of Wisconsin)

Background: Longitudinal follow-up is crucial in many observational studies and clinical trials to characterize patient outcomes after recruitment from the emergency department (ED). However, attrition threatens statistical power and risks creating biased samples. Completing post-ED longitudinal follow-up may be difficult for participants enrolled in EDs due to persistent medical symptoms interfering with participation and the limited opportunity to build rapport and engagement with the study team. Here, we describe the utility of a Central Outcomes Center to support completion of primary outcome collection among participants enrolled in EDs after mild traumatic brain injuries (mTBI).

Method: Participants aged 11-17 years presented to 4 pediatric EDs across the country <72 hours after mTBI. Upon enrollment, adolescents and their caregivers completed pre-injury baseline surveys and indicated preference for electronic or telephone follow-up. After discharge, both completed <15-minute follow-up surveys (1-week, 1-month, 3-months post-enrollment). The primary outcome was a minimal clinically important difference in adolescent self-reported anxiety or depression symptoms from pre-injury baseline to 1- or 3-months post-mTBI using the Generalized Anxiety Disorder-7 (GAD-7) or Patient Health Questionnaire-8 (PHQ-8). For those who elected electronic surveys, 3 automatic invitations (the initial link and 2 reminders) were sent to caregivers via text or email in the week after each survey opened. If caregivers and/or adolescents did not complete their surveys within that first week, or if they preferred telephone follow-up, a research assistant from the Central Outcomes Center contacted the caregiver and/or adolescent via text, email, and telephone call. Contact methods/times were flexible to accommodate the needs of adolescents and caregivers. Participants (n=98) enrolled between 5/22/24 and 9/11/24 were included in 1-month survey analyses. Participants (n=38) enrolled by 7/13/24 were included in 3-month analyses. Independent survey completion and contact success were recorded at each time point. Those who did not complete surveys independently and were intentionally not contacted (e.g., withdrawn) during follow-up were excluded from analyses (1-month: 4 caregivers/5 adolescents; 3-month: 3 caregivers/4 adolescents).

Results: In total, 83/97 (85.6%) adolescents and 82/98 (83.7%) caregivers completed 1-month surveys, and 33/37 (89.2%) adolescents and 34/38 (89.5%) caregivers completed 3-month surveys. Completion rates did not differ between adolescents and caregivers. Of the adolescents who completed 1- or 3-month surveys, 39/83 (47.0%) and 18/33 (54.5%) respectively received assistance from the Central Outcomes Center. Of caregivers who completed surveys, 32/82 (39.0%) received assistance at 1-month, as did 14/34 (41.2%) at 3-months. The primary outcome (pre-injury baseline and 1- or 3-month adolescent GAD-7/PHQ-8) was collected for 85/99 (85.9%) participants. 45/85 (52.9%) needed assistance to complete the primary outcome surveys.

Conclusion: Personalized contact enhanced 1- and 3-month follow-up rates, including completion of primary outcomes, for participants enrolled into a research study of mTBI in the ED. Future work should investigate other variables that may impact follow-up success rates, including automatic invitations sent directly to adolescents, rather than only to caregivers. Researchers should consider personalized contact options including use of a Central Outcomes Center to increase follow-up and promote study success.

CP6-4 – ASSESSING FREE-TEXT FIELDS THROUGH NATURAL LANGUAGE PROCESSING TO ENHANCE CRF DEVELOPMENT AND DATA QUALITY

Primary Author:

1) Christine Kohnen (University of Iowa)

Co-Author(s):

2) Corey Moon (University of Iowa)

The use of free-text fields in electronic case report forms (eCRF) in clinical trials gives investigators flexibility in the collection of participant data but requires natural language processing (NLP) to analyze the resulting unstructured data. Without the use of machine learning techniques, the process of parsing meaningful insights is manual, laborious, and prone to confirmation bias. Many eCRFs use “Other” fields when collecting qualitative data, which have an associated free-text field for further explanations. In the context of a longitudinal observational study, we use NLP to assess the free-text field associated with a qualitative “Other” field to determine whether the existing category options properly capture participant responses and whether additional classifications are needed. We will illustrate three natural language processing techniques utilized to assess the unstructured data: 1) removal of stop words, i.e., words that are important to the grammar of a sentence but do not add meaning, 2) lemmatization, or reducing words to their primary form, and 3) tokenization of preprocessed text to break up phrases into smaller groups. We then created bigrams and trigrams of these word groups, utilizing a term frequency-inverse document frequency (TF-IDF) algorithm to identify distinct and meaningful tokens. Visualizations of both bigrams and trigrams by date were investigated for pattern recognition of the word groups. We will illustrate the process required for the analysis of unstructured free-text fields. The output of the analysis suggested that the “Other” option was often chosen so that additional information could be added to the associated free-text field. To properly capture participant responses, we updated the eCRF to include additional qualitative options identified through text mining, along with allowing for multiple selection rather than single selection of qualitative options. 
NLP techniques such as the TF-IDF algorithm provide opportunities to explore underutilized qualitative responses in clinical trial data. Analyzing free-text fields allows investigators to extract quantitative metrics from unstructured data, which could be used to review and validate workflows and to ensure high-quality eCRF data capture. NLP techniques can also be used to audit eCRF data for misclassifications and other data entry errors that can impact study analyses. Additionally, text mining can be used to report on participant feedback to further enhance clinical research procedures.
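
The preprocessing and TF-IDF steps described above can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the stop-word list, the lemma map, and the function names are hypothetical stand-ins for what a full NLP library would provide, and the text is split into words before filtering because the code needs word boundaries first.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list and lemma map, for illustration only;
# a real analysis would use a full NLP library's resources.
STOP_WORDS = {"the", "a", "an", "to", "of", "and", "was", "is", "in", "for", "with", "at"}
LEMMAS = {"visits": "visit", "visited": "visit", "missed": "miss", "doses": "dose"}


def preprocess(text):
    """Split text into words, drop stop words, and map each word to a base form."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [LEMMAS.get(t, t) for t in tokens if t not in STOP_WORDS]


def ngrams(tokens, n):
    """Return consecutive n-token groups joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(documents, n=2):
    """Score each n-gram in each document by term frequency times
    inverse document frequency."""
    grams_per_doc = [ngrams(preprocess(doc), n) for doc in documents]
    doc_freq = Counter(g for grams in grams_per_doc for g in set(grams))
    scores = []
    for grams in grams_per_doc:
        counts = Counter(grams)
        total = len(grams) or 1
        scores.append({
            g: (c / total) * math.log(len(documents) / doc_freq[g])
            for g, c in counts.items()
        })
    return scores
```

Calling `tfidf(responses, n=2)` or `tfidf(responses, n=3)` on a list of free-text responses returns one dictionary of bigram or trigram scores per response. An n-gram that appears in every response scores zero, which is why TF-IDF tends to surface the distinctive phrases.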

SESSION 7

Bayesian Clinical Trials

CP7-1 – REMOTE, BIVARIATE EXPERT ELICITATION TO DETERMINE THE PRIOR PROBABILITY DISTRIBUTION FOR A BAYESIAN NON-INFERIORITY MULTICENTER RANDOMIZED CONTROLLED TRIAL

Primary Author:

1) Arlene Jiang (The Hospital for Sick Children)

Co-Author(s):

2) Alex Aregbesola (Children’s Hospital Research Institute of Manitoba; University of Manitoba)

3) Apoorva Gangwani (Children’s Hospital Research Institute of Manitoba)

4) Terry Klassen (University of Saskatchewan; Jim Pattison Children’s Hospital)

5) Amy C Plint (University of Ottawa)

6) Elisabete Doyle (University of Manitoba)

7) William Craig (University of Alberta)

8) Mohamed Eltorki (University of Calgary)

9) Banke Oketola (Children’s Hospital Research Institute of Manitoba; University of Manitoba)

10) Hoda Badran (Children’s Hospital Research Institute of Manitoba; University of Manitoba)

Bayesian statistics are increasingly used in the design and analysis of clinical trials. A key element of a Bayesian clinical trial is the prior for the treatment effect, which encapsulates existing knowledge and uncertainty regarding treatment efficacy. Typically, a prior for a trial is drawn from previous studies or meta-analyses. However, in emerging research areas, such sources may not be available. Here, expert opinion can be employed to establish a prior, but it must be gathered systematically through elicitation. Traditional elicitation methods often require face-to-face interactions and extensive pre-elicitation training, making them potentially impractical and costly. In this study, we developed a remote, international, structured elicitation method to construct a joint (bivariate) prior distribution for a treatment effect (i.e., the difference between treatment and control groups). This method has been successfully applied to a pediatric croup non-inferiority trial that compares the efficacy of two doses of dexamethasone: 0.60 mg/kg as the active control and 0.15 mg/kg as the experimental treatment. The goal of this elicitation application is to develop a joint distribution representing the difference in the number of return visits to the emergency department (ED) for both doses of dexamethasone. We denote the distribution of the probability of a return visit to the ED in the 0.15 mg/kg dose group as f(P1) and the probability of a return visit to the ED in the 0.60 mg/kg dose group as f(P2). A total of twelve emergency medicine physicians from Canada and the USA participated in our remote elicitation exercise. We developed an R Shiny application to assist with the elicitation and distribution fitting. The process was conducted in two stages. In the first stage, the experts were presented with two hypothetical clinical scenarios under two doses and were asked to provide their individual judgments to elicit Beta distributions for f(P1) and f(P2).
After this initial assessment, the group had the opportunity to discuss their responses. In the second stage, experts were permitted to adjust their judgments based on insights gained from the group discussion, leading to revised marginal distributions of the probability of returning to the ED. Recognizing that individual judgments regarding return visits for high-dose and low-dose groups may be correlated, we aggregated the individual distributions using expert-specific joint (bivariate) distributions f(P1,P2) with latent effects. These bivariate distributions introduced expert-specific correlations between the responses for each dosage. Finally, the distribution of f(P2-P1) was derived from the joint distribution and was subsequently used to determine the sample size for the trial. Figure 1 displays individual expert opinions on the efficacy of each dose at survey rounds 1 and 2. The elicitation generated a final prior distribution centered at 6% (standard deviation: 6%) for the active control dose and 8% (standard deviation: 7%) for the experimental treatment dose (Figure 2). The aggregated prior distribution produced a sample size of 1700. This study demonstrates the feasibility of remotely eliciting bivariate distributions to design clinical trials. Reporting our elicitation process will support the use of elicitation in future clinical trials.
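
As a rough illustration of how a distribution for P2 - P1 follows from two marginals, the sketch below matches a Beta distribution to each reported mean and standard deviation and simulates the difference. It rests on two assumptions that are not the authors' method: the final marginals are treated as Beta distributions fitted by moments, and P1 and P2 are drawn independently, whereas the abstract models expert-specific correlation through latent effects. The function names are hypothetical.

```python
import random
import statistics


def beta_from_moments(mean, sd):
    """Return Beta(a, b) parameters matching a given mean and standard deviation."""
    var = sd ** 2
    if not 0 < mean < 1 or var >= mean * (1 - mean):
        raise ValueError("no Beta distribution has these moments")
    common = mean * (1 - mean) / var - 1
    return mean * common, (1 - mean) * common


def sample_difference(p1_params, p2_params, n_draws=20000, seed=2025):
    """Draw P2 - P1, assuming P1 and P2 are independent (a simplification)."""
    rng = random.Random(seed)
    return [rng.betavariate(*p2_params) - rng.betavariate(*p1_params)
            for _ in range(n_draws)]


# Reported prior summaries: 8% (SD 7%) for 0.15 mg/kg, 6% (SD 6%) for 0.60 mg/kg.
p1 = beta_from_moments(0.08, 0.07)   # experimental dose
p2 = beta_from_moments(0.06, 0.06)   # active control dose
diff = sample_difference(p1, p2)
print(round(statistics.mean(diff), 3), round(statistics.stdev(diff), 3))
```

Introducing positive correlation between P1 and P2 would narrow the distribution of the difference relative to this independent version, which is one reason the joint (bivariate) formulation matters for the sample size calculation.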

CP7-2 – SUBJECT RANDOMIZATION FOR BAYESIAN ADAPTIVE TRIALS WITH MULTI-ARM UNEQUAL ALLOCATIONS

Primary Author:

1) Wenle Zhao (Medical University of South Carolina)

Research on randomization algorithms for two-arm equal allocation trials has achieved remarkable results, but research on multi-arm unequal allocation trials is still insufficient. In Bayesian adaptive trials with response adaptive randomization (RAR), desired allocations are not only unequal but may also contain decimal or irrational elements, such as 1:1.234:1.789. Periodic updates of the target allocation in Bayesian adaptive trials result in a short sequence length for each allocation. The currently available randomization algorithms are complete randomization and permuted block randomization. Complete randomization can accurately target the desired allocation without approximation but has low allocation precision: it may result in a treatment distribution far from the target, especially when the allocation sequence is short and the trial is run only once. Permuted block randomization requires a block size within the allocation sequence length and therefore may not be able to accurately target the desired allocation. For example, for the desired allocation of 1:1.234:1.789, investigators may have to use 3:4:5, 4:5:7, or 6:7:11 as approximations. Both low allocation accuracy and low allocation precision may reduce the benefit of the adaptive design. Furthermore, neither complete randomization nor permuted block randomization controls imbalances in potential confounding factors. This talk presents the minimal sufficient balance (MSB) method for trials with two or multiple arms and equal or unequal allocations. In this method, treatment imbalance is defined as the Euclidean distance between the treatment distribution under the desired allocation and the treatment distribution observed in the trial at the given stage. By default, a complete random assignment is applied by using the desired allocation probability as the conditional allocation probability. 
When treatment imbalance reaches the pre-specified threshold, the conditional allocation probabilities are modified to reduce the treatment imbalance. Otherwise, the randomization algorithm checks the distribution of baseline covariates. The p-value of a test is one optional measure of baseline covariate imbalance. If serious imbalances (such as a p-value less than 0.3) are found in one or more baseline covariates, the conditional allocation probability is modified to reduce these imbalances. Computer simulations show that fewer than 20% of treatment assignments are needed to contain the treatment imbalance within a threshold equal to the number of arms. No more than 5% of treatment assignments are required to prevent serious imbalance (i.e., p-value < 0.3) in a baseline covariate. Most importantly, the MSB method ensures that the desired allocation obtained from the Bayesian adaptation algorithm is accurately targeted without approximation and the allocation precision is controlled. This new randomization method has been implemented in several large multicenter Bayesian adaptive trials in the Stroke Trials Network and SIREN Network, both founded by the NIH.
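
The treatment-imbalance part of this procedure can be sketched as follows. This is a simplified illustration under stated assumptions, not the published MSB algorithm: imbalance is read as the Euclidean distance between observed arm counts and the counts expected under the desired allocation, the correction rule (a biased coin toward the arm that leaves the smallest imbalance, with a hypothetical `bias` of 0.8) is invented for the example, and the baseline covariate check is omitted.

```python
import math
import random


def treatment_imbalance(counts, allocation):
    """Euclidean distance between observed arm counts and the counts
    expected under the desired allocation ratio."""
    n, total = sum(counts), sum(allocation)
    return math.sqrt(sum((c - n * w / total) ** 2
                         for c, w in zip(counts, allocation)))


def assign_next(counts, allocation, threshold, rng, bias=0.8):
    """Pick the arm for the next participant.

    Below the threshold this is complete randomization at the desired
    allocation. At or above it, a hypothetical biased-coin rule sends the
    participant, with probability `bias`, to the arm that leaves the
    smallest imbalance.
    """
    arms = range(len(counts))
    if treatment_imbalance(counts, allocation) >= threshold and rng.random() < bias:
        def imbalance_if(arm):
            trial = list(counts)
            trial[arm] += 1
            return treatment_imbalance(trial, allocation)
        return min(arms, key=imbalance_if)
    return rng.choices(list(arms), weights=allocation)[0]


def simulate(n, allocation, threshold=None, seed=1):
    """Randomize n participants and return the final arm counts."""
    rng = random.Random(seed)
    # Default threshold: the number of arms, as in the simulations described above.
    threshold = len(allocation) if threshold is None else threshold
    counts = [0] * len(allocation)
    for _ in range(n):
        counts[assign_next(counts, allocation, threshold, rng)] += 1
    return counts


counts = simulate(300, [1, 1.234, 1.789])
print(counts, round(treatment_imbalance(counts, [1, 1.234, 1.789]), 2))
```

Because the desired allocation enters only as sampling weights, a ratio such as 1:1.234:1.789 is used directly, with no rounding to an integer block.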

CP7-3 – PRIOR DISTRIBUTIONS FROM ENVISIONED POSTERIOR JUDGMENTS: A NOVEL ELICITATION APPROACH WITH APPLICATION TO BAYESIAN CLINICAL TRIALS

Primary Author:

1) Yongdong Ouyang (The Hospital for Sick Children)

Co-Author(s):

2) Janice J Eng (University of British Columbia)

3) Denghuang Zhan (University of British Columbia)

4) Hubert Wong (University of British Columbia)

Bayesian methods for clinical trials require the specification of a prior distribution for the model parameters. The key benefit derived from the prior distribution is the ability to incorporate prior knowledge, which can increase trial efficiency. Prior elicitation is a scientific process that transforms domain knowledge, previous data, or expert judgments into well-defined prior distributions. It offers a solution to the prior specification problem, especially when limited data is available. Applied to clinical trials, elicitation involves engaging medical experts to assist them with summarizing their judgments about how well treatments work and communicating those judgments in a way that the results can be combined with trial data. The uptake of formalized prior elicitation from experts in Bayesian clinical trials has been limited, largely due to the challenges associated with complex statistical modeling, the lack of practical tools, and the cognitive burden on experts, requiring them to undergo supplementary training to ensure they are adept at quantifying uncertainty using probabilistic statements and to mitigate potential cognitive biases. In addition, existing methods have not addressed the issue of prior-posterior coherence, i.e., does the posterior distribution, obtained mathematically from combining the estimated prior with the trial data, reflect the expert’s actual posterior beliefs? In this work, we propose a new elicitation approach that seeks to ensure prior-posterior coherence and to reduce the expert’s cognitive burden. This is achieved by eliciting responses about the expert’s envisioned posterior judgments (point estimates for parameter values) under various hypothetical outcome data spanning a wide range of outcome values, as well as sample sizes. The presented data are intended to challenge the expert’s beliefs, forcing them to make decisions about the relative weights they assign to their (latent) prior judgments versus the data. 
A “best fit” prior distribution is then inferred from these elicited posterior judgments based on a specified statistical optimality criterion that minimizes the discrepancies between the elicited responses and the expected responses obtained from the implied posterior distribution. We present the statistical framework and the results from applying this approach in a pilot case study with a group of 10 clinician experts to obtain their prior distributions for the time effect in an ongoing stepped-wedge cluster randomized trial.
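
To make the idea of inferring a prior from envisioned posterior judgments concrete, here is a minimal sketch under assumptions that go beyond the abstract: a conjugate normal model in which the prior is summarized by a mean `mu0` and an effective sample size `n0`, squared error as the optimality criterion, and a grid search over `n0`. The scenarios and the expert's answers are hypothetical and are generated from a known prior so that the fit can be checked.

```python
def implied_posterior_mean(mu0, n0, ybar, n):
    """Posterior mean under a normal prior worth n0 observations centered at mu0."""
    return (n0 * mu0 + n * ybar) / (n0 + n)


def fit_prior(scenarios, elicited, n0_grid):
    """Least-squares fit of (mu0, n0) to elicited posterior point estimates.

    scenarios: list of (ybar, n) hypothetical data summaries shown to the expert
    elicited:  the expert's stated posterior estimate for each scenario
    """
    best = None
    for n0 in n0_grid:
        # For a fixed n0 the implied posterior mean is linear in mu0,
        # so the least-squares mu0 has a closed form.
        c = [n0 / (n0 + n) for _, n in scenarios]
        r = [e - n * ybar / (n0 + n) for (ybar, n), e in zip(scenarios, elicited)]
        mu0 = sum(ci * ri for ci, ri in zip(c, r)) / sum(ci * ci for ci in c)
        sse = sum((e - implied_posterior_mean(mu0, n0, ybar, n)) ** 2
                  for (ybar, n), e in zip(scenarios, elicited))
        if best is None or sse < best[2]:
            best = (mu0, n0, sse)
    return best


# Hypothetical scenarios (observed mean effect, sample size) and expert answers
# generated from a known prior (mu0 = 0.5, n0 = 20) for illustration.
scenarios = [(0.0, 10), (1.0, 10), (0.0, 100), (1.0, 100), (2.0, 50)]
elicited = [implied_posterior_mean(0.5, 20, y, n) for y, n in scenarios]
mu0, n0, sse = fit_prior(scenarios, elicited, [k / 2 for k in range(1, 201)])
```

The fitted `n0` expresses how much weight the expert gives the latent prior relative to the data: a large `n0` means the envisioned posterior estimates move little even when the hypothetical data are large or surprising.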

CP7-4 – BAYESIAN IN-SILICO CLINICAL TRIALS APPLIED TO OBESITY-RELATED CANCER PREVENTION: THE IMPORTANCE OF EXPERT ELICITATION FOR KEY PARAMETERS IN THE ABSENCE OF EXISTING DATA USING THE SHELF METHOD

Primary Author:

1) Matthew Harris (Manchester Cancer Research Centre)

Co-Author(s):

2) Duncan Wilson (Leeds Clinical Trials Research Unit)

3) Jeremy Oakley (University of Sheffield)

4) Kate Ren (University of Sheffield)

5) Andrew G Renehan (Manchester Cancer Research Centre)

Introduction: Clinical trial feasibility is a critical consideration in the design and implementation of interventions, particularly when addressing complex health outcomes such as cancer prevention. “In-silico” trials make it possible to model a clinical trial without the risks associated with conducting it in the real world. A Bayesian framework allows uncertainty to be factored into these models, providing an understanding of the associated risks and of the impact of key aspects of a potential clinical trial. As part of a multi-modal preliminary analysis of the feasibility of a large-scale trial of a weight loss intervention to prevent cancer, Bayesian In-silico trials have been designed to understand feasibility and optimal designs. In-silico models are only as valid as the data input. In the absence of strong literature, expert opinion can be used to define these parameters. This study defines the potential and importance of expert elicitation of key prior specifications using the SHELF method (shelf.sites.sheffield.ac.uk).

Methods: A two-step “In-silico” clinical trial has been designed, simulating cancer incidence in a scenario where a weight loss intervention is used to prevent cancer. It does this by first simulating individual weight losses from priors taken from the literature, then simulating a probability of cancer based on each virtual patient’s weight loss. The prior distribution representing the relationship between individual weight loss and cancer risk is very poorly defined in the literature, with a high degree of uncertainty. We compare the effect on trial assurance of the selection of 3 plausible prior distributions for this effect, shown in Figure 1. We then undertake expert elicitation of this distribution using the SHELF framework, comparing the impact on assurance as a single value and a distribution of probability.

Results: A two-arm in-silico trial is simulated, comparing a weight loss drug (mean 15% body weight loss) to a control group (mean 2% body weight loss), using a total sample size of 1500 randomized equally between arms with a binary primary endpoint indicating cancer incidence at 10 years. High and low plausible extremes gave an unconditional probability of observing a statistically significant result of 99.9% and 37.44%, respectively. This highlights the impact of variation in this value, motivating the SHELF elicitation exercise planned for February 12th, 2025, and involving 8 experts. The elicited parameter will be used in this simulation and compared with the “plausible” extremes and presented.

Conclusion: This model will show the impact of variability of key In-silico trial model parameters and where expert elicitation can allow for improved validity of outputs. It highlights the importance of generating prior specifications in the most systematic way possible, in the absence of available data using synthesis of opinion.
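
The two-step simulation and the assurance calculation can be sketched as below. Only the arm means (15% and 2% weight loss) and the total sample size of 1500 come from the abstract. Everything else is a hypothetical placeholder chosen for illustration: the normal prior on the log relative risk per percent of weight lost, the weight-loss standard deviation, the 10% baseline 10-year risk, and the pooled two-proportion z-test.

```python
import math
import random


def two_sided_p(events_a, n_a, events_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (events_a / n_a - events_b / n_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_arm(n, mean_loss, log_rr_per_pct, rng, loss_sd=5.0, base_risk=0.10):
    """Count cancers in one arm: weight loss first, then cancer given weight loss.

    loss_sd and base_risk are hypothetical values, not taken from the study.
    """
    events = 0
    for _ in range(n):
        loss = rng.gauss(mean_loss, loss_sd)              # % body weight lost
        risk = min(1.0, base_risk * math.exp(log_rr_per_pct * loss))
        events += rng.random() < risk
    return events


def assurance(prior_mean, prior_sd, n_total=1500, n_sims=200, alpha=0.05, seed=7):
    """Probability of a significant result, averaged over the effect prior."""
    rng = random.Random(seed)
    n_arm = n_total // 2
    wins = 0
    for _ in range(n_sims):
        effect = rng.gauss(prior_mean, prior_sd)          # draw from the prior
        drug = simulate_arm(n_arm, 15.0, effect, rng)
        control = simulate_arm(n_arm, 2.0, effect, rng)
        wins += two_sided_p(drug, n_arm, control, n_arm) < alpha
    return wins / n_sims
```

Calling `assurance(prior_mean, prior_sd)` with different priors on the weight-loss effect shows how strongly the result depends on that single poorly known parameter, which is the motivation for eliciting it formally.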

SESSION 8

Trial Design and Analysis

CP8-1 – AN EXPERIMENTAL DESIGN FOR CLINICAL TRIALS TESTING THE INDIVIDUALIZATION POTENTIAL OF AN INDIVIDUALIZED TREATMENT RULE

Primary Author:

1) Francisco Diaz (The University of Kansas Medical Center)

Considerable money and effort have been invested to develop medical treatment individualization approaches based on biomarkers and diagnostic tests, and, more generally, patient-level variables. However, the pace of evaluation of the clinical utility of these decision methods has been slow, and many are implemented in clinical practice without confirmatory empirical evidence that they do what they are supposed to do (treatment individualization). So, in recent years there have been calls in the medical community for the conduct and regulation of prospective randomized clinical trials that evaluate these individualized treatment rules (ITRs). ITRs based on classic statistical methods or machine learning have also proliferated in the statistical literature, but these ITRs are also rarely tested with clinical trials; their clinical utility is often inferred from statistical theory and clinical knowledge or examined with simulations. Thus, there is a need for efficient and scientifically sound clinical trials that examine the clinical utility of ITRs. These trials should not be confused with those widely used for testing the efficacy of individual medical treatments (for example, the well-known parallel-group or crossover trials). Their primary goal is to test the efficacy of the individualization process relative to not following the process, not the efficacy of a specific treatment involved in the process relative to placebo or other treatments. Thus, the development of experimental designs for evaluating ITRs with clinical trials assessing their potential utility in clinical practice is an important and flourishing field of methodological research. Here, we introduce a new confirmatory experimental design for testing ITRs, focusing on the individualization of two treatments. The design is built on the novel published idea of individualization potential, which is a measure of the extent of the superiority of an ITR over treatment without individualization (TWI). 
This idea implies a novel way of constructing the control group of the clinical trial and of testing the ITR’s utility. Our experimental design compares the application of the ITR against TWI as the control arm. We show that our design is superior to the most common designs used in personalized medicine research. Our design usually requires smaller sample sizes, especially of the patients who are less frequent and therefore more difficult to recruit, and implements a more appropriate control arm. We explain how to test the significance of the individualization potential with our design and how to calculate optimal sample sizes. We illustrate by calculating sample sizes for a hypothetical clinical trial of a published ITR for the individualization of ophthalmological treatments.

CP8-2 – EVALUATING THE USE OF CO-PRIMARY ENDPOINTS IN TRIALS CONDUCTED AMONG CRITICALLY ILL PATIENTS: APPLICATION FOR DELIRIUM TRIALS

Primary Author:

1) Quentin Le Coent (Johns Hopkins University)

Co-Author(s):

2) Elizabeth Colantuoni (Johns Hopkins University)

3) Virginie Rondeau (INSERM)

Delirium is an acute state characterized by rapid onset of confusion, inattention or agitation. Delirium is common among patients with critical illness receiving care in an intensive care unit (ICU), where incidence can be as high as 80%. In critically ill patients, delirium is associated with increased mortality and the duration of delirium is associated with cognitive impairment and onset of dementia among patients surviving the critical illness. Therefore, an increasing number of trials are evaluating pharmacologic and nonpharmacologic interventions to reduce delirium duration. Delirium trials conducted among critically ill patients typically measure delirium at least daily from the day of randomization to the first of pre-specified follow-up duration, e.g., 14 or 28 days, or death. Two competing risks complicate operationalizing duration of delirium as an endpoint in these trials. First, critically ill patients may experience periods of deep sedation or coma during which delirium is unable to be assessed, and second, patient death precludes the measurement of delirium. To account for these competing risks, several “failure-free” days composite endpoints have been used, e.g., days alive free of delirium and coma within 14 days. As with any composite endpoint, it is challenging to disentangle the effect of an intervention on a single component, e.g. days of delirium. Further, composite endpoints are not recommended in settings where interventions may have a positive effect on one component and a negative effect on another. For delirium trials, there may be a pharmacologic agent hypothesized to reduce delirium duration while increasing the duration of deep sedation, e.g. benzodiazepines which have sedative properties. 
As an alternative to a composite endpoint, we investigate the use of multiple co-primary endpoints composed of days of delirium, days of coma and death within the pre-specified follow-up duration, where treatment is deemed efficacious if there is a significant reduction of delirium duration with no clinically important increase in the duration of coma or in death. Therefore, the global hypothesis test is composed of one superiority test for the duration of delirium and two non-inferiority tests for the duration of coma and death. Co-primary endpoints fall under the intersection-union principle for hypothesis testing, where the overall null hypothesis is rejected only if the null hypotheses for all endpoints are rejected. We illustrate using a simulation study how this setting does not inflate the type-I error, but it can inflate the type-II error and reduce the power of the study. A study testing co-primary endpoints will therefore need a larger sample size than one testing a single endpoint. Using data from a completed delirium trial, we compare power when using multiple co-primary endpoints composed of duration of delirium, duration of coma and time to death. Further, we propose strategies to mitigate the power loss when using co-primary endpoints by adapting the type-I error for each outcome. This work highlights the potential for co-primary endpoints to improve clinical trial designs and interpretation for delirium trials conducted among critically ill patients.
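
The intersection-union rule and its cost in power can be illustrated with a small simulation. This sketch is not the authors' simulation study: it assumes independent one-sided z-tests, a per-test level of 0.025, and hypothetical standardized effects for the three endpoints.

```python
import random
from statistics import NormalDist


def iut_reject(p_values, alpha=0.025):
    """Intersection-union test: reject the global null only if every
    component null is rejected at level alpha."""
    return all(p < alpha for p in p_values)


def simulate_power(effects, alpha=0.025, n_sims=20000, seed=11):
    """Estimate per-endpoint and joint power for independent one-sided z-tests.

    effects: hypothetical standardized effects (the mean of each z statistic).
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    single = [0] * len(effects)
    joint = 0
    for _ in range(n_sims):
        p_values = [1 - std_normal.cdf(rng.gauss(mu, 1)) for mu in effects]
        for i, p in enumerate(p_values):
            single[i] += p < alpha
        joint += iut_reject(p_values, alpha)
    return [s / n_sims for s in single], joint / n_sims


# Hypothetical effects: delirium superiority, coma and death non-inferiority.
per_endpoint, joint = simulate_power([3.0, 3.2, 3.2])
print(per_endpoint, joint)
```

The joint power can never exceed the smallest per-endpoint power, because the global null is rejected only when every component test rejects. Under independence it is roughly the product of the individual powers, which is why a larger sample size is needed.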

CP8-3 – EVALUATING ESTIMAND IMPLEMENTATION IN CLINICAL TRIALS IN THE UK

Primary Author:

1) Morgaine Stiles (The Institute of Cancer Research)

Co-Author(s):

2) Fay Cafferty (The Institute of Cancer Research)

3) Beatriz Goulao (University of Aberdeen)

4) Victoria Shepherd (Cardiff University)

5) Christina Yap (The Institute of Cancer Research)

Background: Estimands provide clarity on the precise research question being answered within a clinical trial, ensuring that factors impacting the interpretation of the treatment effect are considered and documented. This helps to facilitate early discussions within trial teams to ensure that the correct data are collected to answer the research question and makes it easier to interpret trial results. The ICH E9 (R1) addendum introduced a framework for including estimands in clinical trials in 2020; however, a systematic review in 2021 suggested that the framework had not been widely implemented at that point and indicated that uptake of the framework within the non-commercial setting may be particularly low. In order to improve uptake of the estimand framework, it is important to understand its current use and the barriers to its wider implementation.

Methods: A survey of academic UK clinical trials units will be conducted to assess the current processes around implementation of the estimand framework and barriers towards its use. The survey will be distributed through the UKCRC (UK Clinical Research Collaboration) Clinical Trials Unit statistician and director networks, with one representative from each of the 50 trials units in the network being asked to complete it. In parallel, a systematic scoping review of clinical trials conducted in the UK will be undertaken to examine the extent of estimand use in UK clinical trials registered between 2021 and 2024. This will provide an update on estimand uptake since the 2021 review, enabling a greater understanding of the change in adoption of the framework following the ICH E9 (R1) addendum. This review will assess how widely estimands are used and identify characteristics of trials that are underutilizing the framework (e.g., commercial vs non-profit, exploratory vs definitive trial phases). The search will encompass publication and registry databases, including PubMed, ClinicalTrials.gov, and ISRCTN, and will be analyzed narratively.

Results: Results of the survey (which will be performed in Q4 2024 – Q2 2025) and systematic scoping review will be presented at the meeting. The combined results from these two elements will provide a deeper insight into the landscape of estimand use in UK clinical trials.

Conclusion: The estimand framework is an important tool to ensure that clinical trials are asking the correct questions. However, the extent of uptake of the framework, and the barriers to its implementation, remain unknown. This work will provide a comprehensive understanding of current estimand practices in UK clinical trials. The findings will offer insights into the adoption, challenges, and potential areas for improvement in estimand implementation, and inform further work to develop resources and guidance (including specific materials for patient advocates) to support multi-disciplinary teams in estimand development and use. This will ultimately guide more consistent and effective use of estimands across clinical trials in the UK and beyond.

CP8-4 – PRELIMINARY RESULTS FROM A SYSTEMATIC REVIEW OF BAYESIAN METHODOLOGICAL APPROACHES USED IN THE DESIGN AND ANALYSIS OF CONTEMPORARY RANDOMIZED CLINICAL TRIALS

Primary Author:

1) Rishi Bansal (University of Oxford)

Co-Author(s):

2) Tiago V Pereira (University of Oxford)

3) Maria Nowicka (University of Oxford)

4) Bruno R da Costa (University of Oxford)

5) Peter Jüni (University of Oxford)

Background: The use of Bayesian statistics for the design and analysis of randomized clinical trials (RCTs) has grown considerably over the past decade, largely driven by advances in computing. This has enabled trialists to incorporate historical data when designing trials, make predictions that appropriately integrate multiple sources of uncertainty during a trial, or implement highly flexible adaptive trial designs. These features make Bayesian approaches potentially advantageous over frequentist methods in certain contexts. There is, however, significant variation in the methodological quality and reporting standards of Bayesian trials. Liberal criteria for stopping a randomized comparison and declaring success, as well as poor reporting, documentation, and communication of contemporary Bayesian approaches may undermine their replicability, interpretability, and credibility. The objective of this systematic review is, therefore, to describe the characteristics of RCTs using Bayesian approaches for their primary design and analysis.

Methods: This review was prospectively registered in PROSPERO (CRD42024513251). We searched MEDLINE and Embase between 2019 and 2023 for RCTs that used Bayesian methodology. The Cochrane Central Register of Controlled Trials (CENTRAL) was used to identify trial protocols in the same period. Abstract and full-text screening was completed in duplicate. All disagreements were resolved by consensus with an expert methodologist. We included RCTs and protocols with individual randomization in phase 3 and 4 that used Bayesian approaches for their primary analysis. We excluded cluster and n-of-1 trials, post hoc analyses, retrospective analyses, long-term follow-up studies, and studies not published in English. Data extraction will include relevant methodological items such as basic trial information, sample size considerations, adaptive design elements, stopping rules, data analysis parameters, Bayesian parameters (e.g. priors), and results reporting. Results will be stratified using the 2023 clinical core journals filter, clinical area, use of a master protocol, and other key methodological features.

Results: We identified 1971 citations, of which 478 were duplicates and 256 were marked ineligible by the Covidence automated RCT screening tool. Of 1221 studies identified for screening, a total of 128 trials were included, of which 55 were from clinical core journals. Preliminary findings indicate substantial variability in the methodological quality and reporting standards of included trials. COVID-19 is strongly overrepresented (43.8%) due to the many Bayesian platform trials conducted during the pandemic. The next most common clinical area was cardiology (7.8%). Most trials were related to drugs (63.3%), followed by surgical/procedural (13.3%) and complex interventions (15.6%). Twenty-four (18.8%) of the included studies were protocols and/or SAPs. The platform trials often have the most robust reporting, but many elements are obfuscated by hundreds of pages of supplemental material. Many trials include statements that indicate a misunderstanding of core concepts, including suggestions that error rates are not relevant to Bayesian trials.

Discussion: The preliminary findings indicate a clear need for guidelines to indicate best practices for the reporting and use of Bayesian methods in RCTs. We anticipate that forthcoming results will provide more granular insights into the contemporary landscape of Bayesian trials.

SESSION 9

Platform, Adaptive, and Basket Trials

CP9-1 – ASSESSING ADHERENCE TO CONSORT REPORTING GUIDELINES USING AI

Primary Author:

1) Auden Krauska (University of Wisconsin-Madison)

Co-Author(s):

2) Emma Weishaar (University of Wisconsin-Madison)

3) Jack Grosskreuz (University of Wisconsin-Madison)

4) Jonathan Morris (University of Wisconsin-Madison)

5) Karl Rohe (University of Wisconsin-Madison)

6) Gary Collins (University of Oxford)

Background: The CONSORT statement provides recommendations in the form of a checklist for comprehensive and transparent reporting of randomized controlled trials (RCTs). However, verifying adherence to the checklist is time-consuming. Large Language Models (LLMs) could automate this process.

Methods: We analyzed 10 RCTs published in the BMJ, comparing page numbers recorded by authors in completed CONSORT checklists with those identified by the Rohe Nordberg CONSORT Report, a multi-stage LLM system. Agreement was assessed when page numbers overlapped and analyzed separately for fully, partially, and incompletely reported items. A random sample of disagreements underwent human review.

Results: Agreement varied by reporting category: 72.7% for items marked not applicable, 56.1% for fully reported items, 45.6% for partially reported items, and 44.7% for incompletely reported items. Paper-level agreement ranged from 2.7% to 95%. Human review of disagreements found that in 29% of cases, both sources were valid but different; in 29%, the Rohe Nordberg CONSORT Report identifications pertained more to the given item, while the author citations pertained somewhat; and in 43%, author citations were not consistent with the checklist item while the Rohe Nordberg CONSORT Report’s paragraph citations were.

Conclusions: LLMs show promise in automating adherence checking of CONSORT, with high agreement for basic trial information and an ability to identify non-applicable items. Lower agreement for complex methodological items and wide variation across papers suggest areas for author education. The LLM system was more precise than authors in locating the relevant page numbers, indicating potential value for improving reporting quality.
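
The overlap-based agreement measure described in the Methods can be sketched as follows. The data structure and function names are assumptions made for illustration, and the checklist entries are hypothetical, not data from the study.

```python
def pages_overlap(author_pages, llm_pages):
    """True when the two sources cite at least one page in common."""
    return bool(set(author_pages) & set(llm_pages))


def agreement_by_category(items):
    """Percent of checklist items with overlapping pages, per reporting category.

    items: iterable of (category, author_pages, llm_pages) tuples.
    """
    totals, agreed = {}, {}
    for category, author_pages, llm_pages in items:
        totals[category] = totals.get(category, 0) + 1
        agreed[category] = agreed.get(category, 0) + pages_overlap(author_pages, llm_pages)
    return {c: 100 * agreed[c] / totals[c] for c in totals}


# Hypothetical checklist entries, not data from the study.
items = [
    ("fully", [3, 4], [4]),
    ("fully", [5], [7]),
    ("partially", [6], [6, 7]),
]
print(agreement_by_category(items))
```

Grouping the same tuples by paper instead of by reporting category would give paper-level agreement in the same way.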

CP9-2 – INTEGRATING INTERIM ANALYSES AND DSMB OVERSIGHT IN ADAPTIVE PLATFORM TRIALS: OPERATIONAL INFRASTRUCTURE OF THE HEALEY ALS PLATFORM TRIAL

Primary Author:

1) Brittney Harkey (Massachusetts General Hospital)

Co-Author(s):

2) Marianne Chase (Massachusetts General Hospital)

3) Lori Chibnik (Massachusetts General Hospital)

4) Michelle Detry (Berry Consultants)

5) Cornelia Kamp (CLINTREX Research LLC)

6) Sabrina Paganoni (Massachusetts General Hospital)

7) Merit Cudkowicz (Massachusetts General Hospital)

Introduction: Adaptive platform trials provide an efficient framework for evaluating multiple therapies by continuously adding or removing investigational regimens consisting of active drug and matching placebo. Interim analyses allow for early termination of regimens based on efficacy or futility criteria, while frequent safety monitoring safeguards participants. This abstract outlines the operational infrastructure supporting ongoing interim analyses and safety oversight of the HEALEY ALS Platform Trial.

Methods: The HEALEY ALS Platform Trial is overseen by two independent groups: the Data Safety Monitoring Board (DSMB) and the Independent Statistical Analysis Committee (ISAC). The DSMB provides advisory support to ensure participant safety and proper conduct of the trial and interpretation of the interim and final data. The ISAC is tasked with performing interim analyses according to the regimen-specific protocol and statistical analysis plan and delivering interim analysis reports to the DSMB. Both groups operate under a study-specific charter and work in collaboration to determine whether modifications to a regimen are warranted. The DSMB and ISAC are supported by blinded and unblinded teams of data managers and biostatisticians. Blinded and unblinded teams adhere to the trial’s blinding and unblinding plan, which specifies roles, responsibilities, and secure data handling to prevent accidental unblinding. Restricted access storage locations are designated for unblinded data, accessible only by the unblinded team members. Non-binding interim analyses occur every 12 weeks for regimens with sufficient data to assess futility. DSMB meetings are scheduled to review safety data and coincide with interim analyses. Each DSMB meeting includes open and closed sessions to review all active regimens, which include participants active in the placebo-controlled phase. Blinded data reports contain aggregated safety data and are shared to support the open session, which are attended by both blinded and unblinded study team members. The closed session, limited to unblinded members, includes review of unblinded safety and interim analysis reports prepared by an unblinded biostatistician and the ISAC, respectively. 
Before each DSMB meeting, data from the electronic data capture system are compiled into comprehensive blinded and unblinded safety and efficacy reports, detailing enrollment, participant status, demographics, adverse events, serious adverse events, premature withdrawals, and outcomes. The ISAC uses unblinded efficacy data to conduct regimen-specific interim analyses, generating a report for the DSMB’s review. The unblinded members of the ISAC attend the closed session to discuss the interim analysis reports. Following their review, the DSMB provides recommendations to the study Sponsor, advising on whether each regimen should continue, be modified, or be stopped.

Results: Since the HEALEY ALS Platform Trial’s launch in 2020, seven regimens have been added with DSMB meetings and interim analyses occurring on schedule every twelve weeks. Reports are adjusted continuously to accommodate the number of active regimens and specific criteria in each regimen’s statistical analysis plan.

Discussion: The HEALEY ALS Platform Trial illustrates the effectiveness of adaptive platform trial infrastructure for managing multiple regimens through ongoing interim analyses and safety oversight. This approach offers a promising, flexible framework for other platform trials, enhancing the capacity to evaluate multiple therapies efficiently and safely.

CP9-3 – CHALLENGES AND SOLUTIONS IN CTMS IMPLEMENTATION FOR A REGISTRY-BASED PLATFORM TRIAL

Primary Author:

1) Catherine Dillon (Medical University of South Carolina)

Real-world evidence, including registry data, is becoming increasingly important in answering clinical research questions by harnessing patient data from the real world outside controlled clinical trials, allowing treatments to be studied across diverse populations of people who might not otherwise be enrolled in clinical trials, providing information about long-term efficacy, safety, and cost-effectiveness, and revealing how treatments are used in clinical practice. It has been argued that registry-based randomized clinical trials identify and recruit patients more efficiently, reduce duplicative data collection and site workload, reduce loss to follow-up, decrease time to database lock, enhance study generalizability, accelerate time to regulatory decision-making, and reduce clinical trial costs compared to traditional randomized clinical trials. STEP is a randomized, multifactorial, adaptive platform trial that seeks to optimize the care of patients with acute ischemic stroke. It is a “hybrid” registry-based trial that leverages data from the American Heart Association’s “Get with the Guidelines” and the “Neurovascular Quality Initiative-Quality Outcomes Database” registries. With permission from participating institutions and the IRB, data elements collected in the registries for patients randomized into STEP are transferred to the clinical trial management system’s (CTMS) case report forms using a customized data transfer program based on the unique participant record link. Once the data are transferred, the site is responsible for reviewing the fidelity of the transferred data, cleaning the data when needed, and entering any fields not transferred from the registries. Additional registry data from patients not randomized in STEP are used for trial screening, planning, and generalizability assessment purposes.
Challenges during CTMS implementation include: (1) Unknown treatments and study visits that will be added to or removed from the platform during the life of the study; (2) Multiple mechanisms of data collection, including (a) transfer from the registry, (b) data entry by the site, and (c) transfer from the registry followed by editing by the site; (3) Not all sites are participating in all registries, and some sites only enter a random sample of their patients into the registries; (4) Data definitions in the registry may be similar to, but not the same as, data items defined in the study protocol; (5) Code reconciliation for multiple selection data fields; (6) Time delays due to standard of care data entry in registry systems; (7) Cost justification for registry data acquisition; and (8) Cost justification for information system development for data transfer and reconciliation. This presentation will discuss solutions to these barriers, including building flexibility into the study design, data collection schedule, CRF design, and data validation methodology, and developing tools for data transfer, conversion, mapping, and reconciliation. The primary aim of this presentation is to evaluate the costs and benefits of combining real-world standard of care (SOC) registry data sources with trial-specific data captured on CRFs, based on the experiences of the STEP platform trial.
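As an illustration of challenges (2) and (5), the following is a minimal sketch of code reconciliation for a multiple-selection field and of tracking how each value reached the CRF. It is not the STEP data transfer program: the code list `REGISTRY_TO_CRF`, the function names, and the example codes are all hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical mapping from registry codes to CRF codes for one
# multiple-selection field (illustrative values only).
REGISTRY_TO_CRF: Dict[str, str] = {"HTN": "1", "DM": "2", "AFIB": "3"}


def map_multiselect(
    registry_codes: Iterable[str], mapping: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Translate registry codes to CRF codes.

    Returns (mapped CRF codes, sorted and de-duplicated;
             registry codes with no CRF equivalent, for site review).
    """
    mapped: List[str] = []
    unmapped: List[str] = []
    for code in registry_codes:
        if code in mapping:
            if mapping[code] not in mapped:
                mapped.append(mapping[code])
        else:
            unmapped.append(code)
    return sorted(mapped), unmapped


def provenance(transferred: Optional[str], current: Optional[str]) -> str:
    """Classify a CRF value by how it was collected."""
    if transferred is None:
        return "site_entered"
    if transferred == current:
        return "transferred"
    return "transferred_then_edited"


mapped_codes, needs_review = map_multiselect(["DM", "HTN", "XYZ"], REGISTRY_TO_CRF)
```

Codes returned in the second list have no CRF equivalent and would be routed to the site for manual reconciliation rather than silently dropped.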

SESSION 10

Graphics and Visualization

CP10-1 – LEVERAGING INTERACTIVE DATA VISUALIZATION SOFTWARE TO ENHANCE DATA REPORTING AND QUALITY: EXAMPLES FROM A MULTI-CENTER CLINICAL TRIAL

Primary Author:

1) Seung Ho (Charlie) Choi (University of Iowa)

Co-Author(s):

2) Helena Blumenau (University of Iowa)

Clinical trialists are increasingly utilizing interactive data visualization software, such as Power BI, Tableau, and Looker Studio, with the goal of garnering insights from large trial data sets, often in real-time (or with short lag time). Interactive data visualizations in clinical trials provide a powerful tool, especially in a risk-based monitoring context, where cleaned, high-quality data is critical to the successful execution of a trial and analysis and interpretation of its results. We present two use cases of Power BI in the context of multi-center clinical trials: (1) to generate near real-time reports on ongoing study status, and (2) to create visualizations of endpoint data for the purposes of data cleaning. During the course of a trial, data coordinating centers are tasked with generating reports to share with various trial stakeholders. Often, these reports include information on enrollment and visit compliance, among other metrics. For example, in our traditional weekly reports, we typically include three separate pages summarizing (1) the number of participants consented and randomized by site, along with those eligible/ineligible and the rates of consent/randomization; (2) a figure of consents over time; and (3) a figure of randomizations over time. In Power BI, researchers can summarize this information into a single report page, with interactive visuals that can be filtered to highlight certain fields. For example, filtering by site enables report users to easily see consents/randomizations for a specific site over time, which offers interactive insights not available in traditional reports. In addition, creating a direct link between the Power BI report and a “snapshot” of the production database allows for the report to be updated on a more frequent basis. Power BI also provides creator-controlled filters and subscriptions within the report distribution infrastructure. 
Automating the generation and distribution of reports saves time and reduces email overload caused by multiple attachments. Power BI can also enhance data cleaning processes. In contrast to static “spaghetti” plots, Power BI provides an interactive environment that links statistician-derived endpoints (using statistical software like SAS) with the raw data, allowing the user to refer back to the case report form (CRF)-level data and examine whether any specific data entry errors or procedural issues are affecting the derived endpoint. In our described process, statisticians write analysis dataset programs that are automatically configured to run on the snapshot database at a regularly scheduled interval, outputting CSV files that are loaded to an “analytics” database using SQL Server Integration Services (SSIS) and linked to the raw data via an identifier (e.g., CRF #). Once set up, this process enables the full use of Power BI’s functionalities. For instance, incorporating the site filters as described above, one can more quickly decipher trends at particular sites or in specific participants, whereas doing the same with static plots would require re-runs of the code by those proficient in the statistical software. These use cases demonstrate that data visualization software like Power BI can help enhance clinical trial data reporting and quality, which are essential to robust scientific inquiry.

CP10-2 – COMMUNICATING TRIAL RESULTS BY GRAPHICAL ABSTRACTS —EXPERIENCES FROM THE GRADE (GLYCEMIA REDUCTION APPROACHES IN DIABETES: A COMPARATIVE EFFECTIVENESS) CLINICAL TRIAL

Primary Author:

1) Heidi Krause-Steinrauf (George Washington University)

Co-Author(s):

2) Michaela Gramzinski (George Washington University)

3) Colleen Suratt (George Washington University)

4) The GRADE Research Group

Background: The use of graphical or visual abstracts is becoming a standard journal requirement for submission and presentation of clinical study findings. Graphical abstracts give readers a visual overview and succinct description that lets them quickly review results and decide whether to read the full article, and they convey key information on the design, intervention, study population, and findings. While similar in content to their textual counterparts, graphical abstracts are intended to provide information in a more easily digestible visual medium and are better suited than traditional abstracts to piquing reader interest across newer media platforms. This requirement was first initiated by top tier medical journals to provide readers an overview of results of clinical trials and systematic reviews. Notable challenges for developing a graphical abstract are determining how to visually summarize the study findings and the primary message in a manner that attracts the interest of the intended audience, without replicating any tables or figures in the article and without oversimplifying the results to the point where they are misleading.

Methods: We will use recently published examples from the GRADE (Glycemia Reduction Approaches in Diabetes) Comparative Effectiveness Trial [www.gradestudy.org], funded by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), NIH, to describe our experience developing high-quality graphical abstracts to communicate study results. Our presentation will focus on our collaborative and iterative approach and the importance of input from clinicians, scientists, and statisticians. We discuss the challenges in creating an informative graphical message for complex results, and describe strategies for determining abstract content, obtaining appropriate graphics, and constructing the abstracts using available secure software (e.g., Microsoft PowerPoint).

Results: To date, we have developed 20 graphical abstracts for the GRADE Study, 14 of which have been published, including 10 in a special issue reporting GRADE findings. These abstracts spanned study design, outcome analyses, subgroup evaluations, and comparative treatment assessments. Abstracts are assembled with input from the study investigators and writing group members. Reviews by the study Publications Committee proved essential in improving the clarity and impact of the key message.

Conclusion: Graphical abstracts are becoming a standard requirement for many journal submissions and are emerging as an important source for clinicians and researchers to communicate and highlight key study findings. Graphical abstracts require considerable effort in order to present key messages and context for intended readers. Well-developed visual abstracts are important to the dissemination of study findings.

CP10-3 – GRAPHICAL REPRESENTATION OF ADVERSE EVENTS IN CLINICAL TRIALS

Primary Author:

1) Katrina Dobinda (Northwestern University, Feinberg School of Medicine)

Co-Author(s):

2) Masha Kocherginsky (Northwestern University, Feinberg School of Medicine)

Adverse event (AE) reporting in clinical trials is essential for evaluation of treatment benefits and harms. AEs are usually collected using standardized classification schemes. For example, the Medical Dictionary for Regulatory Activities (MedDRA) is a clinically validated international medical terminology system used for AE reporting, which includes five levels of hierarchy, with 26 System Organ Classes (SOCs) at the highest level and more than 80,000 specific lowest level terms which reflect how AEs may be recorded in practice. Common Terminology Criteria for Adverse Events (CTCAE) incorporates certain elements of the MedDRA terminology, and is the standard for classifying, attributing and grading the severity of AEs associated with cancer treatments. A recent update to CTCAE for patient-reported outcomes (PRO), the PRO-CTCAE, has also been proposed. AE reporting is included in the Consolidated Standards of Reporting Trials (CONSORT) statement, and a recent “2022 CONSORT Harms” extension includes three new items and updates related to benefits and harms reporting for thirteen other CONSORT items. AE data are usually analyzed using descriptive statistics, and the results are reported as tables of patient counts for each AE type, often by treatment group or severity level. Multiple instances of an AE for a participant over time are usually summarized as the maximum grade experienced. Formal comparisons between groups are not common due to low frequencies for many AE types and the need for multiple comparison adjustments. For high-toxicity treatments, descriptive tables are often long and difficult to interpret, which makes patterns hard to identify. Lee et al recently proposed using circular and butterfly plots to represent proportions of maximal-grade AEs by SOC and severity level, but in general graphical tools have not been widely adopted. We propose a streamlined graphical approach that enables rapid visual comparison of AE patterns between groups, improves understanding of the affected organ classes, and enhances the overall understanding of treatment harms. First, we propose the use of vertical line charts, similar to love plots used in covariate imbalance assessments for propensity score analyses, to describe AE rates by group. In these plots, AE terms are ordered by overall frequency on the y-axis, and the x-axis displays the proportion of patients experiencing each AE type in each group, with connections drawn across AE terms within each group. Second, we propose using a radar plot to summarize organ class involvement in each group, with vertices denoting the proportion of patients experiencing any AE within a SOC. Each plot can be easily customized to include additional groupings, e.g. low vs. high grade AEs, to further highlight subgroup differences or similarities. These plots can also be directly applied to AE summary tables in published studies, or to posted results from completed trials in ClinicalTrials.gov.
To illustrate our approach, we use AE summary data from a published trial comparing Pemetrexed + Chemotherapy with or without Pembrolizumab in non-squamous non-small cell lung cancer (KEYNOTE-789; NCT03515837). We recommend inclusion of these graphical summaries in analysis reports and publications, alongside the usual table summaries.
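The data preparation behind the vertical line chart can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `ae_rates`, the group labels, and all counts are made up for the example and are not trial data. Plotting is left out; the output is simply the ordered rows a plotting library would consume.

```python
from typing import Dict, List, Tuple


def ae_rates(
    counts: Dict[str, Dict[str, int]], n_per_group: Dict[str, int]
) -> List[Tuple[str, Dict[str, float]]]:
    """Order AE terms by overall frequency (most frequent first) and
    compute the proportion of patients with each AE in every group.

    counts[term][group] = number of patients in `group` with that AE.
    n_per_group[group]  = number of patients in `group`.
    """
    total_n = sum(n_per_group.values())
    rows = []
    for term, by_group in counts.items():
        overall = sum(by_group.get(g, 0) for g in n_per_group) / total_n
        props = {g: by_group.get(g, 0) / n_per_group[g] for g in n_per_group}
        rows.append((overall, term, props))
    # Sort by descending overall frequency; break ties alphabetically.
    rows.sort(key=lambda r: (-r[0], r[1]))
    return [(term, props) for _, term, props in rows]


# Made-up illustrative counts for two hypothetical arms of 100 patients each.
n = {"A": 100, "B": 100}
counts = {
    "Nausea": {"A": 40, "B": 30},
    "Anemia": {"A": 50, "B": 45},
    "Rash": {"A": 5, "B": 10},
}
ordered = ae_rates(counts, n)
```

Each row gives one y-axis position (the AE term) and one x-value per group; connecting the x-values down the rows within a group yields the line chart described above. Aggregating the same counts to "any AE within a SOC" would give the vertices of the radar plot.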

CP10-4 – HOW DOES THE USE OF A VIDEO TO INTRODUCE A COMPLEX INTERVENTION AFFECT SERVICE USER UPTAKE AND ENGAGEMENT? RESULTS FROM A MIXED-METHODS SWAT

Primary Author:

1) Sadia Ahmed (University of Leeds)

Co-Author(s):

2) Ellen Thompson (University of Leeds)

3) Bethan Copsey (University of Leeds)

4) Suzanne Richards (University of Leeds)

5) Amanda Farrin (University of Leeds)

6) Andrew Clegg (Bradford Institute for Health Research)

Introduction: This mixed-methods study within a trial (SWAT) was designed to evaluate trial processes relating to intervention implementation. It was embedded within the PROSPER trial which evaluated effects of a personalized care planning intervention for older adults with frailty. The intervention, delivered to service users in their own homes by Personalized Independence Coordinators (PICs), involves goal setting and action planning to improve quality of life. The PROSPER feasibility study found poor intervention uptake and unclear explanations of the intervention from PICs. In response to this, a video was co-developed with patient and public involvement partners, researchers and professional animators. The video provided a case study of a service user using the PROSPER intervention, to be played to the service users on the first visit from the PIC. This SWAT aimed to investigate the effects of the video on service user uptake of and engagement with the intervention.

Methods: A mixed-methods SWAT consisting of a nested randomized controlled trial and a qualitative interview study was embedded within the intervention arm of the PROSPER trial. Intervention deliverers were randomly assigned to either the video or control. The control was a verbal explanation accompanied by the information sheet, without a video. Quantitative data about participant uptake of and engagement with the intervention were collected through trial case report forms and analyzed using regression analysis. Qualitative interviews were conducted with participants and PICs to explore their views and perspectives on the video. Interview data were analyzed using thematic analysis.

Results: In contrast to the feasibility study, service user uptake of, and engagement with, the PROSPER intervention was high across both the SWAT intervention and control arms. A cluster-level analysis found no significant difference between the groups for uptake and engagement. The PIC who delivered the intervention was the only significant predictor in the logistic regression model. Qualitative interviews were conducted with 16 service users and 4 PICs. Thematic analysis of qualitative data suggested the video was acceptable to service users, although not particularly memorable. Their motivation to engage with the PROSPER intervention was driven by their relationship with PICs. This is consistent with findings from the quantitative data. PICs found the video to be a useful tool to aid delivery of the intervention.

Discussion: This mixed-methods SWAT demonstrates a novel use of the methodology to investigate intervention implementation processes. Using SWAT methodology in this way provides a systematic approach to testing minor refinements which may enhance intervention implementation. SWATs of this kind raise methodological, statistical and host trial considerations. They should supplement, rather than replace, evidence from process evaluations embedded in trials. The SWAT found service user uptake of and engagement with the intervention to be high across both arms, which made it difficult for the video to have any effect. However, this could be attributed to other changes made to the trial design from the feasibility to the main trial, for example changing from cluster to individual randomization. This SWAT has been undertaken as part of a PhD funded by MRC-NIHR-TMRP.

SESSION 11

Review and Guidance

CP11-1 – IDENTIFYING GAPS IN ETHICS GUIDELINES FOR CLUSTER RANDOMIZED TRIALS: A CITATION ANALYSIS

Primary Author:

1) Cory E Goldstein (Ottawa Hospital Research Institute)

Co-Author(s):

2) Jessica du Toit (The University Health Network)

3) Nicholas B Murphy (Western University)

4) Monica Taljaard (Ottawa Hospital Research Institute)

5) Charles Weijer (Western University)

Background: The cluster randomized trial (CRT) is an important design to generate robust evidence about clinical, health policy, health systems, and public health interventions. Unlike trials that randomly assign individual participants to different interventions, CRTs randomize intact groups such as hospitals, schools, and communities while measuring outcomes on individuals within groups. But CRTs raise complex ethical issues. The Ottawa Statement on the Ethical Design and Conduct of CRTs, published in 2012, remains the only international guidance document focused specifically on CRTs. Its 15 recommendations spanning 7 domains have been broadly influential, helping many researchers plan their CRTs according to high ethical standards. However, periodic updates of ethics guidelines are required to address new and emerging issues in the design, conduct, and review of research. To inform the forthcoming update of the Ottawa Statement, we aimed to identify any gaps in the Ottawa Statement discussed within the literature.

Methods: We analyzed publications that cited and engaged with the Ottawa Statement, the Ottawa Statement précis, or one of four background papers. We searched Google Scholar, Scopus, and Web of Science using the “cited by” function on 11 November 2022. We included all types of publications, including articles, book chapters, commentaries, editorials, ethics guidelines, theses and trial-related publications (i.e., primary reports, protocols, and secondary analyses). Data were extracted by four reviewers working in rotating pairs. Reviewers captured relevant text verbatim and recorded whether it reflected a gap relating to one or more of the Ottawa Statement domains. Using a thematic analysis approach, semantic coding was used to summarize the content of the extracted text into distinct gaps within the Ottawa Statement domains.

Results: Our search strategy identified 1,326 records. After duplicates were removed, 383 records underwent full text screening for eligibility. Among the 53 articles retained for analysis, the following Ottawa Statement domains were discussed: obtaining informed consent (37, 70%); identifying research participants (22, 42%); justifying the cluster randomized design (20, 38%); assessing benefits and harms (21, 40%); gatekeepers (14, 26%); research ethics committee review (13, 25%); and protecting vulnerable participants (7, 13%). Two (4%) articles discussed issues that did not fall within an existing Ottawa Statement domain. A qualitative analysis of the text from the 53 articles resulted in the identification of 24 distinct gaps in the Ottawa Statement. In this presentation, select gaps will be presented to demonstrate the breadth of issues discussed in the literature.

Discussion: We found that issues relating to informed consent in CRTs are widely discussed in the literature, whereas issues relating to gatekeepers, research ethics committee review, and protecting vulnerable participants are discussed much less often. However, the number of times an issue is discussed does not necessarily reflect its importance; each identified gap should be considered during the Ottawa Statement update process. These identified gaps will therefore form the basis for the Ottawa Statement update agenda.

CP11-2 – THE PROMISE STUDY: AN INTERNATIONAL CONSENSUS PROCESS TO DEVELOP GUIDANCE FOR ASSESSING “PROMISE OF THE INTERVENTION” AHEAD OF A RANDOMIZED CONTROLLED TRIAL

Primary Author:

1) Selman Mirza (University of Manchester)

Co-Author(s):

2) Antonia Marsden (University of Manchester)

3) Sarah Cotterill (University of Manchester)

4) Jack Wilkinson (University of Manchester)

5) Chris Sutton (University of Manchester)

6) Andy Vail (University of Manchester)

7) Jamie Kirkham (University of Manchester)

Introduction: Increasingly, funders ask researchers to provide evidence of ‘promise’ (or ‘evidence the intervention can work’) when submitting bids for randomized controlled trials (RCTs) or feasibility studies. To avoid research waste, it is important to establish that an intervention has some “promise” before proceeding with expensive clinical research that may be unwarranted if an intervention has little or no effect on patient outcomes. However, there is no shared understanding among stakeholders about what constitutes “promise”, and what study designs/research methods are appropriate. The aim of the Promise study is to develop guidance for funders and researchers around suitable study designs for assessing “promise” before conducting an RCT to evaluate an intervention.

Methods: The Promise project has four stages: 1. A review of funder guidance to see what they required in terms of “promise” prior to funding. This review covers funding streams within the two main UK funders of RCTs, pilot and feasibility studies: National Institute for Health Research (NIHR) and Medical Research Council (MRC), and two exemplars from the charity sector: Cancer Research UK (CRUK) and British Heart Foundation (BHF). The National Institutes of Health (NIH) are included as they are the world’s largest funder of healthcare research. 2. A review of the most recent protocols from successful funding applications to see what evidence of “promise” they provided. Full protocols of recent clinical trials were purposively sampled from trial registries to achieve variety of funding stream, type of intervention and research stage, to capture as many different aspects as possible of research currently being undertaken. 3. An online Delphi to consider the evidence of “promise” extracted during the two reviews and gather opinions on suitable “promise” methods required prior to funding. 4. Consensus meetings to review the evidence gathered, and to agree on “promise” methods to be included in the guidance.

Results: Stage 1 included 151 funding streams, of which 55 indicated they would fund an RCT or pilot/feasibility study. Of these 55 funding streams, 23 (42%) explicitly stated that evidence of “promise” was required in the funding application. Only 6 (11%) provided guidance related to “promise”, which we defined as some specific information about what “promise” is or how to report it. None of the guidance reviewed stated specific methods suitable for demonstrating/assessing “promise”. Stage 2 is ongoing: 96% of protocols offered some evidence of “promise”. Common study designs used as supporting evidence included non-randomized intervention studies, systematic reviews and RCTs. Reasons given for an intervention being promising included promising results in previous studies, inconclusive or poorly designed previous studies, evidence of safety, acceptability to patients/clinicians, and reduced healthcare costs/burden. The second and third stages of the Promise project will be complete at the time of the conference, and proposed methods for assessing “promise” will be presented.

Potential Relevance and Impact: Guidance will be produced to help funders and researchers identify suitable research methods for assessing “promise” of the intervention. Implementation of guidelines will be facilitated through liaison and registration, support from the EQUATOR network, and via discussion with the major funders of clinical trials.

CP11-3 – PATIENT AND PUBLIC INVOLVEMENT AND ENGAGEMENT TO METHODOLOGICAL RESEARCH: INSIGHTS FROM A PANEL

Primary Author:

1) Nikki Totton (University of Sheffield)

Co-Author(s):

2) Steven Julious (University of Sheffield)

3) Ellen Lee (University of Sheffield)

Background: Patient and Public Involvement and Engagement (PPIE) means actively working in partnership with patients and members of the public to plan, manage, design and carry out research. PPIE is often included within primary research (i.e. clinical trials) where the importance of including a patient perspective is well acknowledged. However, methodological research is also completed with the aim to improve the conduct of primary research. As with PPIE in primary research, where the aim is to ensure that research questions being addressed are important to the wider public, PPIE could have useful insights into methodological research. However, as methodological research has a less direct pathway to patient benefit, there are additional complexities. These complexities can arise due to issues with prior knowledge, confidence in commenting on new concepts and researcher experience to engage PPIE members. Our aim was to create a PPIE panel specifically for methodological research to support this work.

Methods: In May 2023, a PPIE methodology panel was created specifically with the aim of actively working in partnership with patients and members of the public to plan, manage, design and carry out research into medical research methodology (e.g. trials, statistical or health economic methodology). Additionally, the panel has the remit to aid with the dissemination of research findings from methodological work. Insights from the organising group of research methodologists have been compiled to help other researchers set up their own panel or include PPIE members in methodological research.

Results: The convened panel consists of 22 members of the public. The panel has met six times since May 2023, all as online meetings conducted through Google Meet. Average attendance at the panel is approximately 80%. Eleven different topics have been discussed to date, ranging from clinical trial design to the use of registry data in research. Recommendations for a PPIE panel: The panel has three facilitators (researchers) to run each session. It is recommended that sessions are run by a minimum of two people to ensure all information is captured and to provide structure and support for the PPIE members, who will be less familiar with research meetings. Online meetings have been deemed acceptable for PPIE members and are set at a maximum of two hours with a comfort break included. Recommendations for conducting PPIE for methodological work: PPIE members at a minimum can meaningfully input to a plain English summary of the methodological project. Jargon should be avoided where possible but used when the definition is important, e.g. “adaptive designs”. In these cases, clear definitions are important at the outset so an agreed understanding can be used when discussing the project. PPIE members can help to refine these definitions. Specific questions are useful to get clear responses, but leaving space for general comments will help to highlight anything which may not have been considered.

Conclusions: PPIE can meaningfully input to methodological research; however, doing so has unique challenges. These insights from a specifically created panel provide recommendations for successfully including PPIE in methodological research.

CP11-4 – USING RE-IDENTIFICATION RISK SCORES ON PUBLICLY AVAILABLE ANONYMISED CLINICAL TRIAL DATASETS

Primary Author:

1) Aryelly Rodriguez (The University of Edinburgh)

Co-Author(s):

2) Linda J Williams (The University of Edinburgh)

3) Steff C Lewis (The University of Edinburgh)

4) Pamela Sinclair (The University of Edinburgh)

5) Sandra Eldridge (The Queen Mary University of London)

6) Tracy Jackson (The University of Edinburgh)

7) Christopher J Weir (The University of Edinburgh)

Background: There are increasing motivations to share anonymized datasets from clinical trials within the scientific community. Many anonymized datasets are now publicly available for secondary research. However, it is uncertain whether they pose a privacy risk to the involved participants.

Methods: We collected a broad sample of publicly available, de-identified/anonymized clinical trials datasets to estimate their re-identification risk scores, employing three equations that can be used to calculate re-identification risk scores for an entire anonymized dataset, using information within the dataset. These equations are typically used for routinely collected health records and only generate potential probabilities; they do not aim to re-identify individuals in the datasets. Firstly, we contacted data holders and requested access to their anonymized datasets following the data owners’ local procedures. Secondly, re-identification risk scores were calculated for each dataset we were granted access to, using the three equations. Finally, we explored which characteristics of the datasets were associated with increased or decreased risk scores and compared the risk scores and their usability. To the best of our knowledge, this is the first study to use these risk of re-identification scores across a range of clinical trials datasets.
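The abstract does not state the three equations used. As a hedged illustration only, the sketch below computes three commonly used dataset-level measures based on equivalence classes (groups of records that share the same combination of indirect identifiers): the maximum risk, the average risk, and the proportion of unique records. These may differ from the authors' equations; the function name and the four example records are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence


def reidentification_risk(
    records: Sequence[Dict[str, str]], quasi_identifiers: List[str]
) -> Dict[str, float]:
    """Dataset-level re-identification risk from equivalence-class sizes.

    A record's risk is taken as 1 / (size of its equivalence class), so:
      max_risk    = 1 / smallest class size
      avg_risk    = mean record risk = number of classes / number of records
      prop_unique = share of records that are alone in their class
    """
    if not records:
        raise ValueError("records must not be empty")
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    class_sizes = Counter(keys)
    n = len(keys)
    return {
        "max_risk": 1 / min(class_sizes.values()),
        "avg_risk": len(class_sizes) / n,
        "prop_unique": sum(1 for s in class_sizes.values() if s == 1) / n,
    }


# Hypothetical toy dataset with two indirect identifiers.
toy = [
    {"sex": "F", "age_band": "60-69"},
    {"sex": "F", "age_band": "60-69"},
    {"sex": "M", "age_band": "60-69"},
    {"sex": "M", "age_band": "70-79"},
]
risk = reidentification_risk(toy, ["sex", "age_band"])
```

In this toy example two of the four records are unique on sex and age band, so the maximum risk is 1. Like the scores described above, these values are potential probabilities computed from the dataset alone and do not attempt to re-identify anyone.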

Results: From 18 repositories, we identified 86 potentially eligible datasets, from which we secured 76. Seventy datasets from 14 repositories met the inclusion criteria and were analyzed. Thirty-one datasets were shared with minimal restrictions (open access), while 39 were shared with varying levels of restrictions before access was granted (controlled access). Open access datasets had, on average, three identifiers, while controlled access datasets had an average of five identifiers. The most common pieces of information that, when combined, may indirectly identify a participant were sex (80%) and age (72.9%).

Conclusions: This study confirms that clinical trial datasets are very rich in personal details and using re-identification risk scores as a measure of this richness is feasible. These scores have the potential to guide the anonymization process and aid in the decision-making regarding the release of clinical trial datasets for secondary research purposes.

SESSION 12

Translating Evidence to Practice

CP12-1 – PIONEERING PHYSICIAN-LED TRIALS: TRANSFORMING TREATMENT FOR ADRENAL INSUFFICIENCY AND BEYOND

Primary Author:

1) Ryan Berry (Authority Health, a Federally Qualified Health Center)

Co-Author(s):

2) Apoorv Tiwari (Garden City Hospital)

3) Ali Mokbel (Wayne State University)

4) Ammar Abotouk (Metro Detroit Endocrinology Center)

5) Abdulrahman Alrifai (Avalon University School of Medicine)

6) Jumana Waleed (University of Jordan - School of Medicine)

7) Opada Alzohaili (Wayne State University)

Physicians frequently recognize potential treatments for conditions that lack sufficient market size to attract industry-sponsored trials. These opportunities remain unvalidated due to significant barriers, including complex regulatory pathways, funding limitations, and the absence of commercial incentives. This study presents an innovative framework for practitioner-led clinical trials that leverage existing FDA-approved devices and real-world data, enabling impactful research independent of industry support. Using the case of continuous subcutaneous hydrocortisone infusion for adrenal insufficiency, the trial exemplifies how practitioners can navigate these challenges. Subcutaneous infusion devices, approved for insulin delivery, are ideal for repurposing due to their established safety profiles and versatility. In adrenal insufficiency, standard oral hydrocortisone therapy often results in peaks and troughs, requiring supra-physiologic doses during troughs to stabilize patients. These fluctuations compromise quality of life and elevate long-term health risks. Continuous subcutaneous infusion offers a more physiologic delivery, stabilizing cortisol levels, reducing hospitalizations, and improving patient-reported outcomes. Our trial design incorporates retrospective electronic medical record data collected and analyzed by our team to inform eligibility criteria and outcome measures. The primary outcomes include a reduction in hospitalization rates for adrenal crises and improvements in quality of life, assessed using validated questionnaires. Secondary outcomes evaluate fatigue levels, adjustments in daily hydrocortisone dose, serum adrenocorticotropic hormone levels, adverse events, and device-related complications. By leveraging adaptive design elements and real-world data, this framework provides a scalable, resource-efficient approach to trial execution. 
The trial’s design highlights how funding mechanisms such as PAS-23-086 Small R01 can support small-scale studies, demonstrating that properly constructed trials can yield data robust enough for treatment approval and insurance coverage. Beyond adrenal insufficiency, this model serves as a template for other conditions where potential treatments exist but lack industry interest due to limited market size. The framework prepares for future needs as data analytics continue to identify novel treatment applications that may not attract commercial investment. This practitioner-led approach not only bridges the gap between clinical observation and formal evidence generation but also ensures that overlooked therapies can be validated and brought to patients. By addressing current and future challenges, this model shapes a path for impactful, sustainable research that adapts to the evolving landscape of clinical science, ultimately expanding treatment options and improving patient care.

CP12-2 – INTENTION-TO-TREAT ANALYSIS: A SYSTEMATIC REVIEW ON RECOMMENDATIONS AND HOW TO USE IT APPROPRIATELY

Primary Author:

1) Yan Liu (Carleton University)

Co-Author(s):

2) Dawn Kennedy (University of British Columbia)

3) Guanyu Chen (University of British Columbia)

4) Vishal Gupta (University of British Columbia)

The randomized controlled trial (RCT) design has been considered the gold standard for testing the effects of an intervention. However, missing data (e.g., drop-out) and non-compliance (e.g., participants not following the original treatment assignment) bring complexities to the data analysis for RCTs. Intention-to-Treat (ITT) has long been recommended for handling these issues in data analysis to preserve the benefits of randomization in RCTs. As various strategies and diverse versions of ITT emerge in the literature, researchers face the imperative of gaining a comprehensive understanding and employing suitable data analysis techniques. This study aims to review how ITT is defined and what ITT methods and practices are recommended in methodology articles from 2010 to 2022. A total of 1,281 articles were identified initially, and only 53 articles met the inclusion criteria and were included in the final review. Our results suggest that a variety of definitions have appeared in the literature in addition to the widely cited definition, “once randomized, always analyzed.” Modified ITT (mITT) has become popular and attracted considerable attention in the last decade. However, there is no agreement on how to define mITT, and about one-third of these methodological articles did not present a clear definition. Additionally, our study identified a variety of statistical methods and techniques recommended for handling missing outcome data and non-compliance, but one-third of the articles did not provide a proper description of the applied statistical methods. Notably, our study revealed that only six articles (11%) originated from the fields of psychology and social sciences, while the majority were published in medical-related fields. In conclusion, our findings underscore the necessity for researchers across disciplines to enhance their comprehension and adeptly apply ITT or alternative strategies when dealing with missing data and non-compliance in RCTs.

CP12-3 – CHARACTERISTICS OF INTERVENTIONAL CLINICAL TRIALS STARTED IN 2023 REGISTERED IN CLINICALTRIALS.GOV

Primary Author:

1) Rebecca Sullenger (Duke University School of Medicine)

Co-Author(s):

2) Robert M Clare (Duke University School of Medicine)

3) Ali B Abbasi (University of California, San Francisco)

4) Karen E Chiswell (Duke University School of Medicine)

5) Lesley H Curtis (Duke University School of Medicine)

6) Brad G Hammill (Duke University School of Medicine)

7) Martin J Landray (University of Oxford)

8) Chris J Lindsell (Duke University School of Medicine)

9) Scott M Palmer (Duke University School of Medicine)

10) Sara Bristol Calvert (Clinical Trials Transformation Initiative)

Background: A robust national evidence generation system is integral to translating novel medical interventions into clinical practice and informing policies to improve public health. A prior analysis of interventional clinical trials registered in ClinicalTrials.gov between 2007 and 2010 showed a landscape of small, heterogeneous trials whose features and focus varied by trial sponsor. We provide an update on the United States (U.S.) interventional clinical trial landscape and report differences in trial characteristics by funding source.

Methods: We extracted all clinical studies registered in ClinicalTrials.gov as of 05/01/2024, using the database for Aggregate Analysis of ClinicalTrials.gov (AACT). Interventional trials started in 2023 with at least one U.S. site were included. Studies were characterized by primary funding source [National Institutes of Health (NIH), other U.S. federal agencies, industry, other (individuals, universities, non-federal organizations)]; actual or anticipated enrollment; target condition (cardiovascular, cancer, mental health); intervention type (drug, device, biological/vaccine, procedural/surgery, behavioral, dietary supplement, genetic, radiation, combination product, diagnostic test, other); study phase (for drug studies only); interventional model; use of randomization; masking; and data monitoring committee (DMC) appointment. We identified trials related to cardiovascular, oncology, and mental health specialties because together these groups account for the largest number of disability-adjusted life years lost in the U.S. We manually reviewed user-submitted disease condition terms and Medical Subject Heading (MeSH) terms generated by a National Library of Medicine (NLM) algorithm to assign studies to these three domains as appropriate.

Results: Of the 7,673 included trials, 2,688 (35.0%) were funded by industry, 1,339 (17.5%) by the NIH, 319 (4.2%) by another U.S. federal agency, and 3,327 (43.4%) by other sources. Median (Q1, Q3) trial enrollment was the greatest for NIH-funded trials [90 (40, 240)], followed by other U.S. federal agencies [80 (36, 225)], industry [77 (32, 200)], and other [54 (30, 1250)]. Drug interventions were the most common in industry-funded trials [1,818 (67.7%)] and were less common in trials funded by the NIH [341 (25.5%)] and other sources [784 (23.6%)]. Behavioral interventions were most frequent in trials funded by the NIH [648 (48.4%)] and other sources [1,144 (34.4%)]. The majority of drug trials (72.6%) were phase 1 or 2. Of the three clinical specialties, cancer-related trials were the most common [1,768 (23.0% of total)], followed by mental health [1,488 (19.4%)], and cardiovascular-related trials [790 (10.3%)]. Industry-funded trials were most likely to relate to cancer, whereas trials funded by the NIH, other U.S. federal agencies, and other sources were most commonly related to mental health.
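The funding-source shares can be recomputed from the counts reported above. The short sketch below uses only those four counts; the variable and function names are illustrative.

```python
# Counts by primary funding source, as reported in the Results above.
funding_counts = {
    "industry": 2688,
    "NIH": 1339,
    "other U.S. federal": 319,
    "other": 3327,
}

def funding_shares(counts):
    """Return each source's share of all trials as a percentage (one decimal)."""
    total = sum(counts.values())
    return {source: round(100 * n / total, 1) for source, n in counts.items()}

total_trials = sum(funding_counts.values())
shares = funding_shares(funding_counts)
```

The four counts sum to the stated total of 7,673 trials, and the rounded shares match the percentages given in the text.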

Conclusion: Differences in intervention type and target condition by funding source highlight the complementary role of government and other funders in supporting the evidence-generation system. Oncology-related trials are more common than mental health and cardiovascular-related trials. Most trials remain small regardless of sponsor, and whether such trials are informative relative to the opportunity costs should be considered. Future analysis should consider trial completion status, time to results publication, and impact on clinical practice and health policy.

CP12-4 – THE TRANSLATION OF INTERVENTIONAL STUDY FINDINGS FOR ADULTS REQUIRING MAINTENANCE DIALYSIS INTO CLINICAL GUIDELINES

Primary Author:

1) Katherine Hull (University of Leicester)

Co-Author(s):

2) Sherna Adenwalla (University of Leicester)

3) Victoria Cluley (University of Nottingham)

4) Laura Gray (University of Leicester)

5) Matthew PM Graham-Brown (University of Leicester)

6) Daniel S March (University of Leicester)

7) Rahma Said (University of Leicester)

8) James O Burton (University of Leicester)

Introduction: There is a substantial gap between the expanding evidence-base from randomized controlled trials (RCTs) and the implementation of findings in routine clinical care; nephrology is no exception. There is an absence of knowledge regarding the adoption and integration of interventions for the maintenance dialysis population. The use of research findings within clinical guidelines reflects a clear component in the pathway to research integration into clinical care that can be objectively assessed. The purpose of this systematic review is to understand the factors that influence the utilization of RCT data in nephrology clinical practice guidelines for the end-stage kidney disease (ESKD) population on maintenance dialysis.

Methods: The systematic review was registered prospectively on the National Institute for Health Research’s International Prospective Register of Systematic Reviews (PROSPERO, CRD42021249460). The outcomes of interest were exploratory: identification of the factors influencing the uptake of RCTs into clinical guidelines. Searches were completed in MEDLINE, Embase, CINAHL, and CENTRAL. There were no limits on language or location. The search strategy was limited to publication dates from 01/01/2015 to 31/12/2018. Descriptive statistics are reported as frequencies with percentages, and statistical testing included binomial logistic regression. The data were not appropriate for pooling or meta-analysis.

Results: Database searches identified 7763 records and grey literature searches identified 81 records; from these, 268 RCTs with 305 associated reports were eligible for inclusion in the systematic review. None of the eligible RCTs reported a Patient and Public Involvement and Engagement (PPIE) statement. Twenty-two (8.2%) of the RCTs were utilized in the nephrology clinical guidelines through 24 (7.9%) associated reports. None of the RCTs included in the guidelines were focused just on the peritoneal dialysis population. Binomial logistic regression modelling suggests that RCTs included in the clinical guidelines are: 6.7 times more likely to have clinical effectiveness as their primary purpose; 3.7 times more likely to report a clinical trials registration; and five times more likely to originate from North America. Furthermore, for each additional year of follow-up, the odds of a report being utilized in the clinical guidelines were 1.5 times higher; and reports utilized in the clinical guidelines were 2.2 times more likely to publish adverse event data, although this did not reach significance (95% confidence interval 0.9 to 5.6).
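The adverse-event result is described as not reaching significance because its 95% confidence interval (0.9 to 5.6) contains 1, the value indicating no association. As a rough sketch, assuming the interval is a Wald interval that is symmetric on the log-odds scale (the abstract does not say how it was computed), an approximate standard error and two-sided p-value can be back-calculated from the reported figures. The function name is illustrative, and the output is an approximation limited by rounding in the reported numbers.

```python
import math

def wald_summary(or_point, ci_low, ci_high, z_crit=1.959964):
    """Approximate SE, z, and two-sided p-value from an odds ratio and 95% CI.

    Assumes the CI was formed as exp(log(OR) +/- z_crit * SE).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(or_point) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return {
        "includes_null": ci_low <= 1.0 <= ci_high,
        "se_log_or": se,
        "z": z,
        "p_value": p_value,
    }

# Reported: odds ratio 2.2, 95% CI 0.9 to 5.6 (adverse event reporting).
adverse_events = wald_summary(2.2, 0.9, 5.6)
```

Under this assumption, an interval that includes 1 corresponds to a two-sided p-value above 0.05, which is consistent with the abstract's statement.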

Conclusion: Many RCTs involving the ESKD population on maintenance dialysis are not cited in nephrology clinical guidelines. Research conducted in North America is more likely to be utilized in nephrology clinical guidelines. There is an absence of documented PPIE input. The peritoneal dialysis population is poorly represented in the clinical guidelines. Poor study design, inconsistent reporting, and a lack of external validity appear to impact the utility of RCT data in guidelines.

SESSION 13

Operational Difficulties in Research

CP13-1 – STRATEGIES, OBSTACLES, AND FACILITATORS FOR REMOTE TRIALS ADMINISTERING PHYSICAL ACTIVITY-BASED INTERVENTIONS AND PERFORMING PHYSICAL FUNCTION ASSESSMENTS: THE LESSENING INCONTINENCE WITH LOW-IMPACT ACTIVITY TRIAL

Primary Author:

1) Alison Huang (University of California San Francisco)

Co-Author(s):

2) Michael Schembri (University of California San Francisco)

3) Ann Chang (University of California San Francisco)

4) Sarah Pawlowsky (University of California San Francisco)

5) Margaret Chesney (University of California San Francisco)

6) Leslee Subak (Stanford University)

Digital and telehealth methods are increasingly being used in remote trials to administer diverse interventions and conduct participant assessments. Physical movement-based interventions, as well as assessments of participants’ physical performance or function, present unique feasibility, quality, and safety challenges in a remote trial context. When interventionists or assessors cannot lay hands on participants, are unable to view participants in three dimensions, and must rely on speakers, cameras, and microphones to communicate, the risk of error and injury can increase. We discuss strategies, barriers, and facilitators for delivering physical movement interventions and performing physical function assessments over remote video platforms in an NIH-funded, multisite randomized trial of two types of group-based physical movement interventions in an aging population. Participants were diverse older community-dwelling women with urinary incontinence (N=240, age range 45 to 90 years) who were randomly assigned to either a therapeutic pelvic yoga program involving twice weekly group intervention sessions supplemented by once weekly self-directed practice, or a skeletal muscle conditioning program involving time-equivalent group sessions and self-directed practice of muscle stretching and strengthening techniques. Originally launched in 2019, the trial was rapidly converted to a videoconference-based platform during the COVID-19 pandemic, with all study interventions and assessments performed remotely thereafter. The trial team developed new protocols and tools for delivering remote videoconference yoga and physical conditioning intervention instruction, assessing the quality of instruction through videoconference observations by expert consultants, and evaluating participants’ ability to perform intervention techniques over video. 
New strategies were developed to evaluate changes in participants’ physical function using video-based tests of balance (one-legged stand), lower extremity strength (chair-stand testing), and aerobic endurance (step-testing). Compared to 10.5% drop-out in the early study cohorts involving all in-person intervention instruction, 15.9% of participants dropped out of interventions in cohorts relying on all-video instruction. Among retained participants, however, adherence to intervention sessions differed only modestly, with 77.9% of early in-person cohorts versus 71.7% of later video-based cohorts completing more than 90% of sessions. Using standardized procedures for questioning participants about adverse events, the overall proportion of participants reporting a musculoskeletal adverse event, including joint pain/strain, was 9.9%. However, only 2.6% of participants in early cohorts involving in-person instruction reported an event that was categorized as “probably” or “definitely” related to interventions, compared to 5.6% in remotely delivered cohorts. Two notable adverse events demonstrate unique safety concerns that can arise in remote studies: one involving a shoulder tear sustained during a remote intervention class, and another involving a fall from tripping over intervention props. Despite the increased convenience and accessibility, participants and instructors described challenges in using videoconference-based platforms during intervention instruction and barriers to establishing appropriate, distraction-free environments during intervention sessions. Opportunities for participants and instructors to develop interpersonal rapport and provide mutual support were also decreased. 
Findings highlight lessons learned about remote video platforms for instruction in physical movement-based interventions and performance of physical function assessments in a research context, as well as strategies for promoting the safety of both types of activities among older and diverse populations.

CP13-2 – EXTERNAL CHALLENGES IN STEPPED WEDGE CLUSTER RANDOMIZED TRIALS: A CASE STUDY OF THE PREHOSPITAL CANADIAN C-SPINE RULE TRIAL

Primary Author:

1) Elham Sabri (Methodological & Implementation Research)

Co-Author(s):

2) Ranjeeta Mallick (Ottawa Hospital Research Institute)

3) Christian Vaillancourt (Ottawa Hospital Research Institute)

4) Yongdong Ouyang (The Hospital for Sick Children)

5) Manya Charette (Ottawa Hospital Research Institute)

6) Monica Taljaard (Ottawa Hospital Research Institute)

Introduction: Stepped wedge cluster randomized trials (SW-CRTs) are a novel type of trial design in which all clusters receive the intervention by the end of the trial; clusters start with control conditions and then are randomized to sequences to receive the intervention sequentially. The SW-CRT is often chosen for logistical reasons, as it allows the intervention to be implemented on a staggered schedule. Conversely, it can be logistically more complicated than parallel arm designs, as it requires adherence to the planned schedule of implementation in calendar time and can be sensitive to time-varying effects and external confounders, such as changes in the policy environment that affect clusters differentially. In this abstract, we present a case study of the prehospital Canadian C-spine trial to demonstrate how logistical issues encountered in the implementation of a policy intervention can affect the conduct and interpretation of a SW-CRT.

Case study: Each year, Emergency Medical Services transport around half a million trauma patients with a potential neck injury to the local emergency department in Ontario, Canada. Paramedics used to transport all such patients fully immobilized using backboards, cervical collars, and head immobilizers. Unnecessary immobilization of low-risk patients was costly, inefficient, and wasted resources. Vaillancourt et al. designed a 12-month pragmatic SW-CRT aimed at improving patient care and health system efficiency by allowing paramedics to assess patients using the Canadian C-Spine Rule (CCR) to determine risk of injury and selectively transport low-risk patients without immobilization. The primary and co-primary outcomes were the proportion of patients immobilized or with discomfort and pain.

External confounder: Starting in the fourth period of the trial, a new spinal motion restriction (SMR) protocol was introduced under which most patients could be transported without a backboard, with or without a cervical collar as per the CCR. According to SMR, use of a neck collar alone could be considered immobilization.

Impact and strategies: This new protocol created an unanticipated change to the implementation schedule at month nine, which led the investigators to censor the last three months of data from the main analysis. The change in practice confounded all outcomes regardless of the intervention effect, as immobilized patients could now be transported with less discomfort and pain using only a collar. It also impacted secondary outcomes such as time spent in the field by paramedics before hospital arrival. The effect of time before versus after the policy change could not be controlled for in the linear mixed-effects model because all centers had been switched to the intervention at that point. Under the new protocol's definition, immobilization no longer captured the side effects of interest, so it was redefined for analysis purposes as use of at least the backboard.
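The reason the policy change could not be adjusted for can be seen by tabulating the design cells. The sketch below assumes a hypothetical layout of four periods and three sequences crossing over at periods 2, 3, and 4, with the policy taking effect in period 4. The sequence names and crossover schedule are illustrative assumptions, since the abstract does not give the actual number of sequences or steps.

```python
from collections import Counter

PERIODS = [1, 2, 3, 4]
# Hypothetical crossover schedule: period in which each sequence starts the intervention.
CROSSOVER = {"seq_A": 2, "seq_B": 3, "seq_C": 4}
POLICY_START = 4  # the external policy takes effect in the fourth period

def design_cells(periods, crossover, policy_start):
    """Count cluster-periods in each (on_intervention, post_policy) cell."""
    cells = Counter()
    for start in crossover.values():
        for t in periods:
            on_intervention = int(t >= start)
            post_policy = int(t >= policy_start)
            cells[(on_intervention, post_policy)] += 1
    return cells

cells = design_cells(PERIODS, CROSSOVER, POLICY_START)
# cells[(0, 1)] counts control cluster-periods observed after the policy change.
```

In this layout no cluster-period combines control conditions with the post-policy environment, so the data carry no information to separate the policy effect from the intervention effect in the final period. This is why excluding the post-policy data, as the investigators did, is one available option.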

Conclusion: SW-CRTs are vulnerable to biases due to external events such as policy changes. Implementation schedules and analysis plans may need to be modified to accommodate external events. Advanced statistical modeling techniques are available but may not be feasible to implement. Therefore, it is important to assess the stability of the policy environment before adopting a SW-CRT.

CP13-3 – WHY IS DE-IMPLEMENTATION OF INEFFECTIVE INTERVENTIONS SO DIFFICULT? INSIGHTS FROM CLINICAL PRACTITIONERS’ CRITIQUES OF EVIDENCE AND IMPLICATIONS FOR TRIAL DESIGN AND PLANNING

Primary Author:

1) Leila Rooshenas (University of Bristol)

Co-Author(s):

2) Carmel Conefrey (University of Bristol)

3) Nicola Farrar (University of Bristol)

4) Josie Morley (University of Bristol)

5) Joel Glynn (University of Bristol)

6) Timothy Jones (University of Bristol)

7) William Hollingworth (University of Bristol)

Background: Clinical trial evidence has potential to drive evidence-based improvements in healthcare, but implementation can be slow and inconsistent. While trial evidence can promote adoption of effective interventions, it can also trigger “de-implementation” (cessation/restriction) of ineffective care. De-implementation is crucial for improving care and optimizing resource-use but remains notoriously challenging to achieve. Reviews show limited impact of past de-implementation efforts, even following publication of high-profile RCTs. It is widely cited that clinical practitioner resistance is a barrier to de-implementation, but there is limited empirical evidence around the reasons why. The UK-based OLIVIA study aimed to deepen understanding of de-implementation by evaluating the Evidence-based Interventions (EBI) program: a national initiative that sought to de-implement >40 surgical interventions identified as ineffective or cost-ineffective following review of the latest evidence. This presentation will share healthcare professionals’ responses to national proposals for de-implementing procedures and highlight implications for designing future impactful RCTs.

Methods: We conducted 43 semi-structured interviews with UK-based healthcare professionals (HCPs) from five surgical specialties, each linked to procedures/tests identified for de-implementation by the national EBI program. Data were analyzed thematically using constant comparison methods derived from Grounded Theory methodology.

Results: Our findings revealed a disconnect between healthcare professionals’ appreciation for the rigor of trial evidence, and their intentions for operationalizing trial outcomes that pointed towards a need to de-implement interventions within their own field. Criticisms of de-implementation recommendations largely focused on: 1) perceived limitations of RCT evidence on which these were built, and 2) skepticism around the processes by which trial findings had been translated into de-implementation recommendations. Criticisms of RCT evidence included concern about trial design and the relevance of the underpinning questions these answered. Key critiques included choice of outcomes that were not deemed relevant to clinical care, and overly restrictive eligibility criteria that did not reflect the breadth of real-world practice. Concerns relating to translating evidence into de-implementation recommendations centered on the extent to which clinical specialists had shaped this process. Although trial evidence was perceived as important for informing de-implementation proposals, viewing this evidence through the lens of clinical expertise was deemed critical, given that there would always be “gaps” that evidence cannot reach. This was deemed particularly important for de-implementation proposals, due to concerns about denial of potentially beneficial care. Transparent involvement of clinical specialists was also thought to be critical for ensuring de-implementation is driven by motives to improve care rather than save money: a suspicion that commonly arose when de-implementation recommendations were perceived to have been formulated by healthcare purchasers/managers alone.

Conclusion: This study reveals a range of factors that shape healthcare professionals’ engagement with de-implementation proposals. Future trialists should consider these factors during the trial design/planning phase, particularly regarding trial eligibility criteria and whether the study is asking “the right question” in relation to outcomes. Plans for engaging clinical specialists as part of the process of translating trial findings into practice-relevant recommendations should also be considered at the outset of clinical trials—particularly for those that have potential to culminate in de-implementation of entrenched practices.

CP13-4 – NIH STROKENET INTERNATIONAL TRIALS IMPLEMENTATION

Primary Author:

1) Iris Davis (University of Cincinnati)

Co-Author(s):

2) Catherine Dillon (Medical University of South Carolina)

3) Jocelyn Craven (Medical University of South Carolina)

4) Emily Stinson (University of Cincinnati)

5) Kimberlee Bernstein (University of Cincinnati)

6) Noor Sabagha (University of Cincinnati)

7) Vivek Khandwala (University of Cincinnati)

8) Janice Carrozzella (University of Cincinnati)

9) Jama Olsen (Medical University of South Carolina)

10) Jessica Griffin (Medical University of South Carolina)

Background: The National Institutes of Health (NIH) StrokeNet is a United States (US) federally funded research network supporting the development and implementation of phase II and III stroke trials. NIH StrokeNet was established in 2013 to serve as an infrastructure for high-quality, multi-site clinical trial execution. The NIH StrokeNet facilitates the conduct of multi-site trials by maintaining a US-based network of Regional Coordinating Centers and affiliated Performance Sites. Non-network sites may also participate in NIH StrokeNet trials within and outside of the US. Trials recruiting a large sample size, studying a specific stroke subtype, utilizing a complex trial design that requires additional resources, or partnering with industry may benefit from including international recruitment sites.

Methods: NIH StrokeNet trial Principal Investigators may plan for international recruitment during the grant development phase, or they may determine that the trial needs to expand internationally after the grant has been awarded. While planning for international trial operations, careful budgetary considerations should be made to support additional country-specific activities such as clinical coordination, data management, and investigational product management and shipping. A country-specific or regional Contract Research Organization is typically used to facilitate in-country operations and understand local context in terms of contracts, currencies, language, ethics boards and competent authorities. The NIH StrokeNet formalized an International Working Group to develop a Standard Operating Procedure (SOP) and a start-up checklist for investigators pursuing international sites.

Results: The NIH StrokeNet has five completed and 13 ongoing trials, nine of which had or have international components. Three trials planned for international recruitment sites at the outset of trial initiation, and five trials expanded internationally during their course due to slow recruitment rates. One trial is currently building international partnerships in preparation for expansion, and two additional trials are in the development phase with international sites planned at the outset. As of November 22, 2024, 515 participants have been recruited from 42 outside-the-US (OUS) sites in five countries. The NIH StrokeNet International Working Group issued a publicly posted SOP titled “Management of StrokeNet Trials with International Sites” on September 24, 2024, to outline international component determination, budgeting, trial operational structure, start-up, site selection, regulatory considerations, and protocol considerations for current and future investigators. A start-up checklist was also employed detailing the administrative and regulatory processes required to start up a trial internationally. These tools are available to guide investigators and to ensure a consistent and thorough approach to managing international trials.

Conclusion: Global trial implementation may be utilized to enhance enrollment in NIH StrokeNet trials. When planned carefully and thoroughly, international recruitment may maximize the output of stroke trials.

SESSION 14

Pragmatic Clinical Trials

CP14-2 – TOO MANY COOKS IN THE KITCHEN? LESSONS LEARNED FROM THE DEVELOPMENT OF A PRAGMATIC CLINICAL TRIAL PROTOCOL

Primary Author:

1) Jacqueline Perez (American Society of Clinical Oncology)

Co-Author(s):

2) Amber Boose (American Society of Clinical Oncology)

3) Pam Mangat (American Society of Clinical Oncology)

4) Elizabeth Garrett-Mayer (American Society of Clinical Oncology)

The development of clinical trial protocol is a multidisciplinary effort, requiring diverse perspectives, and varying preferred methods of communication and execution, to unify as a single voice to answer the clinically relevant question at hand. ASCO’s CDK4/6 Inhibitor Dosing Knowledge (CDK) Study exemplifies the challenges and benefits of this approach. The study brought together academic and community breast medical oncologists, patient advocates, biostatisticians, and other clinical research experts for the development of a protocol for a Phase III randomized trial focused on dose optimization for HR+/HER2- patients with metastatic breast cancer in patients aged 65 and older. The collaborative effort underscored the value of multistakeholder input, which enriched the protocol by ensuring patient-centered approaches such as integrating real-world considerations into the protocol (e.g., feasibility and length of study assessments for patients), and expanding the applicability of the findings across diverse clinical context (e.g., broad eligibility criteria). However, the process also revealed notable challenges. Differing professional backgrounds and expectations led to mismatching in decision-making and work styles (e.g., rapid and flexible documentation vs. vetting and rigorous documentation), and communication methods which occasionally slowed progress and created complex coordination and bottlenecks. Differences in technological preferences for protocol development resulted in version control and decision-tracking issues. Some group members favored novel approaches such as online protocol building platforms that allowed for reviews and decision tracking by multiple group members simultaneously, others preferred more traditional methods of document sharing by email. While both methods are acceptable each with advantages and disadvantages, it highlighted the need to discuss and obtain consensus on group operability at the outset. 
Despite these hurdles, the team identified strategies to overcome them: establishing structured frameworks for decision-tracking, early agreement on technology preferences, and clearly defined roles to streamline coordination. Fostering open communication and mutual respect was equally essential in aligning strategic visions across the multidisciplinary team. The lessons learned from developing the CDK Study protocol emphasize that, while managing diverse input can introduce complexity, the collective insights significantly enhance the relevance and impact of clinical research. By balancing inclusivity with structured decision-making, there may be potential to navigate the intricacies of large-scale collaborative efforts, resulting in impactful clinical trials.

CP14-3 – ANALYSIS USING INTENT-TO-TREAT VERSUS PER PROTOCOL REVIEWS IN PRAGMATIC CLINICAL TRIALS

Primary Author:

1) Michael Rothe (American Society of Clinical Oncology)

Co-Author(s):

2) Mallory Connors (American Society of Clinical Oncology)

3) Amber Boose (American Society of Clinical Oncology)

4) Pam Mangat (American Society of Clinical Oncology)

5) Elizabeth Garrett-Mayer (American Society of Clinical Oncology)

Clinical trials typically adhere to strict per-protocol or intent-to-treat (ITT) analyses to maintain consistency and validity. The Targeted Agent and Profiling Utilization Registry (TAPUR) Study’s unique pragmatic design highlights that traditional approaches are not always appropriate, necessitating the development of a novel method that uses well-defined criteria to determine appropriate data inclusion for primary analyses. The TAPUR Study is a pragmatic precision oncology phase II basket trial that evaluates the antitumor activity of commercially available targeted agents outside of their approved indication(s) in patients with advanced disease. Each TAPUR cohort, defined by tumor type, genomic target, and drug, follows a Simon two-stage design. Ten participants enroll in stage I, and if the cohort is not closed for futility, eighteen additional participants enroll in stage II. The primary endpoint of the TAPUR Study is disease control (DC), measured using RECIST criteria. Because participants have advanced disease and end-of-study tumor scans are not required, some participants end the study with only a baseline tumor scan. Because the primary endpoint relies on tumor measurements, these participants may not have primary outcome data. Given the small sample size of each cohort, a few participants without meaningful outcome data can substantially affect the study’s statistical power and the ability to make inferences from a cohort. To address the challenges posed by participants leaving the study prior to evaluation of the primary endpoint, the TAPUR Study employs a novel approach to determine whether participant data can be included in the primary analysis per protocol or whether another participant should be recruited to “replace” the participant in the cohort. This differs from ITT analysis, which would consider participants without outcome data non-responders and include them in the primary analysis.
Non-responders on TAPUR are specifically participants with evidence of progression and a defined outcome. The developed methodology involves well-defined criteria, organized into a decision-making flowchart. This flowchart was drafted prior to beginning the review process for any cohort and is applied equally to all cohorts. All reviews occur before the analysis of a cohort so that determinations of which participants can be replaced occur before determining the number of participants with DC. Participants with evidence of tumor progression or clinical progression are included in the primary analysis. Conversely, participants who end the study with insufficient disease information and no link between study departure and disease status are excluded from the primary analysis and replaced to maintain statistical integrity for the primary objective. However, all participants who received at least one dose of study drug are included in safety analyses (i.e., summary of adverse events). The TAPUR Study’s innovative approach to reviewing missing primary outcome data and determining whether participants should be replaced, rather than using a strict ITT analysis, demonstrates how tailored methodologies can address the challenges of unique trial designs while preserving statistical integrity. By implementing a flowchart with well-defined criteria to guide decision-making, the study ensures that primary analyses remain robust and reliable, even in small cohorts.
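The sensitivity of a small Simon two-stage cohort to participants without outcome data can be illustrated with a short calculation. The sketch below takes the stage sizes (10, then 18 more) from the abstract; the futility and final thresholds and the null and alternative DC rates are hypothetical values chosen for illustration and are not stated in the abstract. It compares the probability of declaring activity when all participants are evaluable with the probability when a fraction of participants leave without outcome data and are counted as non-responders.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def simon_two_stage(p: float, n1: int, r1: int, n: int, r: int) -> tuple[float, float]:
    """Return (P(stop after stage I), P(declare activity)) at true DC rate p.

    Stage I stops for futility if at most r1 of n1 participants have DC;
    activity is declared if more than r of all n participants have DC.
    """
    n2 = n - n1
    p_stop = sum(binom_pmf(x, n1, p) for x in range(r1 + 1))
    p_reject = 0.0
    for x1 in range(r1 + 1, n1 + 1):
        need = max(r + 1 - x1, 0)  # stage II successes still required
        tail = sum(binom_pmf(x2, n2, p) for x2 in range(need, n2 + 1))
        p_reject += binom_pmf(x1, n1, p) * tail
    return p_stop, p_reject


# Stage sizes come from the abstract; thresholds and rates are ASSUMED.
N1, N = 10, 28
R1, R = 1, 6          # hypothetical futility / final thresholds
P0, P1 = 0.15, 0.35   # hypothetical null / alternative DC rates

_, type1 = simon_two_stage(P0, N1, R1, N, R)
_, power = simon_two_stage(P1, N1, R1, N, R)

# If a fraction `missing` of participants leave without outcome data and are
# counted as non-responders, the observable DC rate shrinks to P1 * (1 - missing).
missing = 0.15
_, diluted_power = simon_two_stage(P1 * (1 - missing), N1, R1, N, R)

print(f"type I error ~ {type1:.3f}, power ~ {power:.3f}, "
      f"power with {missing:.0%} counted as non-responders ~ {diluted_power:.3f}")
```

Under these assumed inputs, counting unevaluable participants as non-responders lowers the chance of detecting an active agent, which is the concern that motivates replacing such participants rather than applying a strict ITT rule.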

SESSION 15

Statistical Analysis of Longitudinal Designs

CP15-1 – REPEATED INCLUSION OF CLUSTERS IN LONGITUDINAL CLUSTER RANDOMISED TRIALS

Primary Author:

1) Jessica Kasza (Monash University)

Co-Author(s):

2) Rhys Bowden (Monash University)

3) Andrew B Forbes (Monash University)

4) Kelsey L Grantham (Monash University)

Randomized trials that allow participants to be included multiple times (randomized independently each time), often referred to as re-randomization designs, have been shown to increase trial recruitment rates. To avoid confusion with other uses of the term “re-randomization,” we refer to these designs as “repeated inclusion” designs. Provided certain assumptions are valid, treatment effect estimators from repeated inclusion designs will be unbiased with increased precision. Until now, the theory of repeated inclusion designs has been restricted to the setting of individually randomized designs. Repeated inclusion of clusters, in which clusters participate multiple times in the same trial, may be useful when the number of available clusters is limited or cluster recruitment is difficult. In this talk we extend the theory of repeated inclusion designs to cluster randomized trials, including longitudinal variants such as cluster randomized crossover designs. Provided the treatment effect is constant across repeated inclusions, for designs where equal numbers of clusters and participants are included in each treatment group in each study period, and for the same total number of measurements, study power will never decrease when clusters are randomized multiple times. Study power will either be maintained or increased, and which of these occurs depends on the combination of the study design and the within-cluster correlation structure. A corollary of our main result indicates that designs conducted over a larger number of periods can be more powerful than designs with a larger number of clusters. These results have implications for the design of longitudinal cluster randomized trials, in particular cluster randomized crossover trials and standard cluster randomized trials.
When cluster recruitment is difficult, repeated inclusion designs, where the same clusters are included multiple times in the same study, may thus be feasible alternatives to standard cluster randomized trial designs.

CP15-2 – EXPLORING INCOMPLETE STEPPED WEDGE DESIGNS: BALANCED VERSUS IMBALANCED STAIRCASE DESIGNS

Primary Author:

1) Kelsey Grantham (Monash University)

Co-Author(s):

2) Andrew B Forbes (Monash University)

3) Jessica Kasza (Monash University)

Stepped wedge cluster randomized trial designs can carry burdensome data collection requirements as all clusters must collect and provide data in all periods of the trial. Staircase designs are incomplete variants of the stepped wedge design that can be considerably less burdensome. Visually, the trial design resembles a staircase: clusters are randomly assigned to sequences made up of a limited number of measurement periods (control periods followed by intervention), where sequences start measurement at different times. Recent work has found that, under a linear mixed model, staircase designs with just two periods of measurement in each sequence are particularly lean designs with power that can rival that of the stepped wedge in certain situations. In this talk we will aim to identify efficient staircase designs among those with more than two measurement periods in each sequence. In particular, we will examine whether there is a benefit to using an imbalanced staircase design, with different numbers of control and intervention periods in a sequence, over a balanced staircase design. We will compare the efficiency of different staircase designs via the precision of the treatment effect estimator under a variety of trial settings. Surprisingly, our results show that balanced designs are not always optimal, for certain common trial configurations and modeling assumptions. In particular, imbalanced designs tend to be more efficient than balanced designs for designs with fewer sequences and in settings with larger cluster-period sizes and higher intracluster correlation parameters (i.e., greater similarity between participants’ outcomes in a cluster and less waning in similarity over time). This work adds to the growing bank of knowledge about the types of staircase designs that are most efficient under different trial settings, thereby helping trialists choose trial designs that will make best use of trial data to answer their research questions.
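The comparison of designs via the precision of the treatment effect estimator can be sketched numerically. The code below is a minimal illustration, not the authors' implementation: it assumes a linear mixed model for cluster-period means with fixed period effects, total variance 1, a within-period intracluster correlation `icc`, and a between-period correlation that decays as `icc * cac**lag`. All function names and parameter values are hypothetical.

```python
import numpy as np


def staircase_design(n_seq: int, r0: int, r1: int) -> tuple[np.ndarray, np.ndarray]:
    """Observed-cell mask and treatment indicator for a staircase design.

    Sequence s is measured in periods s, ..., s + r0 + r1 - 1:
    first r0 control periods, then r1 intervention periods.
    """
    n_per = n_seq + r0 + r1 - 1
    observed = np.zeros((n_seq, n_per), dtype=bool)
    treated = np.zeros((n_seq, n_per), dtype=bool)
    for s in range(n_seq):
        observed[s, s : s + r0 + r1] = True
        treated[s, s + r0 : s + r0 + r1] = True
    return observed, treated


def stepped_wedge_design(n_seq: int) -> tuple[np.ndarray, np.ndarray]:
    """Complete stepped wedge: all cells observed, sequence s treated from period s + 1."""
    n_per = n_seq + 1
    observed = np.ones((n_seq, n_per), dtype=bool)
    treated = np.arange(n_per)[None, :] > np.arange(n_seq)[:, None]
    return observed, treated


def treatment_effect_variance(observed, treated, clusters_per_seq, m, icc, cac):
    """Variance of the GLS treatment effect estimator from cluster-period means."""
    n_seq, n_per = observed.shape
    info = np.zeros((n_per + 1, n_per + 1))
    for s in range(n_seq):
        periods = np.flatnonzero(observed[s])
        # Design rows for this sequence: period dummies plus treatment column.
        X = np.zeros((periods.size, n_per + 1))
        X[np.arange(periods.size), periods] = 1.0
        X[:, -1] = treated[s, periods]
        lag = np.abs(periods[:, None] - periods[None, :])
        # Covariance of cluster-period means with m participants per cell.
        V = icc * cac**lag + np.eye(periods.size) * (1 - icc) / m
        info += clusters_per_seq * X.T @ np.linalg.solve(V, X)
    return np.linalg.inv(info)[-1, -1]


# Hypothetical setting: 4 sequences, 2 clusters each, 50 participants per cell.
settings = dict(clusters_per_seq=2, m=50, icc=0.1, cac=0.9)
var_sw = treatment_effect_variance(*stepped_wedge_design(4), **settings)
var_11 = treatment_effect_variance(*staircase_design(4, 1, 1), **settings)
var_22 = treatment_effect_variance(*staircase_design(4, 2, 2), **settings)
var_13 = treatment_effect_variance(*staircase_design(4, 1, 3), **settings)
var_31 = treatment_effect_variance(*staircase_design(4, 3, 1), **settings)
print(f"stepped wedge {var_sw:.4f}, staircase(1,1) {var_11:.4f}, "
      f"balanced(2,2) {var_22:.4f}, imbalanced(1,3) {var_13:.4f}")
```

Varying `icc`, `cac`, `m`, and the number of sequences in this sketch gives a rough sense of how the ranking of balanced and imbalanced staircases can shift across settings, under the stated modeling assumptions only.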

CP15-3 – A REVIEW OF CURRENT PRACTICE IN THE DESIGN AND ANALYSIS OF EXTREMELY SMALL STEPPED-WEDGE CLUSTER RANDOMIZED TRIALS

Primary Author:

1) Guangyu Tong (Yale University)

Co-Author(s):

2) Pascale Nevins (Harvard University)

3) Mary Ryan (University of Wisconsin Madison)

4) Kendra Davis-Plourde (Yale University)

5) Yongdong Ouyang (The Hospital for Sick Children Toronto)

6) Jules Antoine Pereira Macedo (Université de Tours)

7) Can Meng (Yale University)

8) Xueqi Wang (Yale University)

9) Agnès Caille (Université de Tours)

10) Fan Li (Yale University)

Background: Stepped-wedge cluster randomized trials tend to require fewer clusters than standard parallel-arm designs due to the switches between control and intervention conditions, but there are no recommendations for the minimum number of clusters. Trials randomizing an extremely small number of clusters are not uncommon, but the justification for small numbers of clusters is often unclear and appropriate analysis is often lacking. In addition, stepped-wedge cluster randomized trials are methodologically more complex due to their longitudinal correlation structure, and ignoring the distinct within- and between-period intracluster correlations can underestimate the sample size in small stepped-wedge cluster randomized trials. We conducted a review of published small stepped-wedge cluster randomized trials to understand how and why they are used, and to characterize approaches used in their design and analysis.

Methods: Electronic searches were used to identify primary reports of full-scale stepped-wedge cluster randomized trials published during the period 2016-2022; the subset that randomized two to six clusters was identified. Two reviewers independently extracted information from each report and any available protocol. Disagreements were resolved through discussion.

Results: We identified 61 stepped-wedge cluster randomized trials that randomized two to six clusters: median sample size (Q1-Q3) 1426 (420-7553) participants. Twelve (19.7%) gave some indication that the evaluation was considered a “preliminary” evaluation and 16 (26.2%) recognized the small number of clusters as a limitation. Sixteen (26.2%) provided an explanation for the limited number of clusters: the need to minimize contamination (e.g., by merging adjacent units), limited availability of clusters, and logistical considerations were common explanations. The majority (51, 83.6%) presented sample size or power calculations, but only one assumed distinct within- and between-period intracluster correlations. Few (10, 16.4%) utilized restricted randomization methods; more than half (34, 55.7%) identified baseline imbalances. The most common statistical method for analysis was the generalized linear mixed model (44, 72.1%). Only four trials (6.6%) reported statistical analyses that accounted for the small number of clusters: one used generalized estimating equations with a small-sample correction, two used generalized linear mixed models with a small-sample correction, and one used Bayesian analysis. Another eight (13.1%) used fixed-effects regression, the performance of which requires further evaluation in stepped-wedge cluster randomized trials with small numbers of clusters. None used permutation tests or cluster-period level analysis.

Conclusion: Methods appropriate for the design and analysis of small stepped-wedge cluster randomized trials have not been widely adopted in practice. Greater awareness is required that the use of standard sample size calculation methods can provide spuriously low numbers of required clusters. Methods such as generalized estimating equations or generalized linear mixed models with small-sample corrections, Bayesian approaches, and permutation tests may be more appropriate for the analysis of small stepped-wedge cluster randomized trials. Future research is needed to establish best practices for stepped-wedge cluster randomized trials with a small number of clusters.

CP15-4 – ENHANCING COVARIATE-ADAPTIVE RANDOMIZATION IN A CLUSTERED 2X2 FACTORIAL DESIGN TRIAL

Primary Author:

1) Shivam Joshi (Ohio State University)

Co-Author(s):

2) Lai Wei (Ohio State University)

3) Lisa Juckett (Ohio State University)

4) J Madison Hyer (Ohio State University)

5) Marilly Palettas (Ohio State University)

Randomization is a cornerstone of conducting a randomized clinical trial. Covariate balance is crucial to randomization because it ensures that key characteristics are distributed evenly across treatment groups, which minimizes confounding and helps isolate the effect of the intervention. However, in smaller trials, covariate balance is not always achieved and, depending on the severity of imbalance, the robustness of the trial’s findings may be jeopardized. Traditional randomization techniques (e.g., stratified, block-stratified) limit the number of categorical variables in schemes, while methods like minimization and biased coin designs can compromise treatment allocation randomness. One method for ensuring covariate balance while maintaining randomness is called Minimal Sufficient Balance (MSB). Briefly, MSB utilizes test statistics and p-values to assess balance for both continuous and categorical variables, all while ensuring randomness in treatment assignment. However, to date, MSB has not been implemented for cluster randomized trials, nor has it been used for a 2x2 factorial design trial. Moreover, no covariate adaptive randomization process, nor any other common randomization scheme, attempts to prospectively ensure balance in attrition rates. This talk will focus on the implementation of MSB in a way that addresses all three novel attributes: 1) 2x2 factorial design, 2) cluster randomized design, 3) prospectively maintaining balance in attrition rates across trial arms. This novel MSB design will be implemented for an ongoing randomized clinical trial called SixtyPlus, which focuses on enhancing the outcomes of low-income older adults by proposing that home-delivered meal services be paired with clinical services to improve health and safety. More specifically, SixtyPlus aims to evaluate the effects of registered dietitian (intervention A) and occupational therapy (intervention B) services on the risk of falling among home-delivered meal clients.
To ascertain the effect of these interventions in isolation and in combination, a 2x2 factorial design was employed. Additionally, since these interventions will be administered in participants’ homes, there is a potential for overlapping effects if multiple individuals from the same household receive different interventions. To mitigate this, household clusters will be randomized to specific interventions. Moreover, varying baseline characteristics can be associated with differing attrition rates between randomized groups, which, in effect, may introduce bias. Following the burn-in randomization period, the innovative adaptation of MSB in this proposal will include prospectively calculating the probability of attrition for each participant nested within the household for the purpose of achieving balance on attrition. Current popular randomization techniques can lead to persistent imbalances in multiple baseline covariates, which can jeopardize the methodological integrity of a randomized clinical trial. MSB has proven effective in ensuring balance of covariates in two-armed randomized clinical trials with randomization at the patient level. SixtyPlus represents innovation in multiple facets by adapting MSB for a four-armed trial. The robustness and practicality of this method have the potential to establish it as a frequently employed randomization technique in future trials. As a result, it could encourage major data collection and analysis programs to integrate this method as a standard practice in clinical trials to promote more robust answers to complex research questions.
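The general idea of an MSB-style allocation, in which imbalance tests on each covariate cast votes that tilt a biased coin, can be sketched for the simplest case. The code below is a simplified illustration only: it covers two arms, participant-level randomization, and continuous covariates, and does not implement the cluster, factorial, or attrition extensions described in the abstract. The p-value threshold (0.3), biased-coin probability (0.7), burn-in size, voting rule, and the normal approximation to the p-value are all assumptions made for this sketch.

```python
import math
import random
from statistics import mean, variance


def imbalance_p_value(a: list[float], b: list[float]) -> float:
    """Two-sided p-value for a difference in means (Welch statistic,
    normal approximation in place of the t distribution)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return 1.0
    z = (mean(a) - mean(b)) / se
    return math.erfc(abs(z) / math.sqrt(2))


def msb_assign(new_x, arm_a, arm_b, rng, p_threshold=0.3, biased_coin=0.7, burn_in=4):
    """Assign one participant to arm 'A' or 'B'.

    new_x: covariate values of the new participant.
    arm_a, arm_b: covariate vectors of participants already in each arm.
    """
    if min(len(arm_a), len(arm_b)) < burn_in:
        return rng.choice("AB")  # burn-in: simple randomization
    votes = {"A": 0, "B": 0}
    for j, x in enumerate(new_x):
        col_a = [row[j] for row in arm_a]
        col_b = [row[j] for row in arm_b]
        if imbalance_p_value(col_a, col_b) >= p_threshold:
            continue  # covariate j is balanced enough; no vote
        pooled = mean(col_a + col_b)
        low_arm = "A" if mean(col_a) < mean(col_b) else "B"
        high_arm = "B" if low_arm == "A" else "A"
        # A high value goes to the arm with the lower mean, and vice versa.
        votes[low_arm if x > pooled else high_arm] += 1
    if votes["A"] == votes["B"]:
        return rng.choice("AB")  # no majority: simple randomization
    favoured = "A" if votes["A"] > votes["B"] else "B"
    other = "B" if favoured == "A" else "A"
    return favoured if rng.random() < biased_coin else other


# Hypothetical usage: 200 participants with two continuous covariates.
rng = random.Random(7)
arm_a, arm_b = [], []
for _ in range(200):
    participant = [rng.gauss(75, 6), rng.gauss(27, 4)]  # illustrative covariates
    arm = msb_assign(participant, arm_a, arm_b, rng)
    (arm_a if arm == "A" else arm_b).append(participant)
print(len(arm_a), len(arm_b))
```

Because the biased coin is used only when a covariate fails the balance test, most assignments remain simple randomization, which is how this family of methods aims to preserve allocation randomness.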

SESSION 16

Participant Engagement and Inclusion

CP16-3 – A SCOPING REVIEW OF PATIENT AND PUBLIC INVOLVEMENT IN THE DESIGN AND CONDUCT OF CLUSTER RANDOMIZED TRIALS CONDUCTED EXCLUSIVELY IN LOW- AND MIDDLE-INCOME COUNTRIES

Primary Author:

1) Stuart Nicholls (Ottawa Hospital Research Institute)

Co-Author(s):

2) Tamara L Morgan (Ottawa Hospital Research Institute)

3) Yacine Marouf (University of Toronto)

4) Eric Tran (Western University)

5) Grace Fox (Ottawa Hospital Research Institute)

6) Cory E Goldstein (Ottawa Hospital Research Institute)

7) Lori Harris (University of Ottawa)

8) Monica Taljaard (Ottawa Hospital Research Institute)

Background: Patient and public involvement (PPI) in clinical trials is the active collaboration between researchers and patients or the public in the design and conduct of trials, and the dissemination of results. PPI helps to identify the needs of the study population as well as improve study design and conduct by, for example, identifying outcomes relevant to the study population or ways to reduce the burden of participation. Describing the prevalence of PPI, and the ways in which patients and the public are involved, facilitates understanding of whether research studies are meeting the needs of potential and enrolled participants and their communities. Reviews of PPI have largely drawn evidence from high-income countries and there has been no research examining whether trials conducted exclusively in low- and middle-income countries (LMICs) incorporate PPI. We address the gap in evidence about PPI in trials conducted in LMICs by leveraging an existing scoping review of cluster randomized trials (CRTs) conducted exclusively in LMICs. CRTs are trials in which intact groups, such as communities, hospitals, or schools, are randomized rather than individuals. This work was undertaken as part of the forthcoming update of the Ottawa Statement on the Ethical Design and Conduct of Cluster Randomized Trials; understanding current PPI practices in trials conducted in LMICs can help generate best-practice recommendations for PPI in that update.

Objectives: To describe PPI in cluster trials conducted in LMICs.

Methods: We conducted a sub-study of 206 trials randomly selected from 800 CRTs conducted exclusively in LMICs between 2017 and 2022. Extracted data included: whether PPI was reported; if so, whether this was published in the study protocol, primary trial report, or both; the aspects of the trial in which patients or the public were explicitly reported to have been involved; and whether there was any indication that patients or the public were listed as co-authors on the main report or trial protocol.

Results: Of the 206 randomly selected trials, 90 (44%) reported PPI. Of these 90 trials, 10 (11%) reported details of involvement only in the study protocol. Among trials reporting PPI, the most common aspects of involvement were the design, development, or delivery of the intervention (70, 78%); development or implementation of recruitment or retention strategies (26, 29%); and collection of research data (26, 29%). Other examples included community sensitization activities and public randomization ceremonies. We found no examples of PPI in setting the study question. Only one study included a patient or public member as a co-author.

Conclusion: Our review findings fill a major gap in the current evidence regarding PPI in clinical trials and indicate that a substantial proportion of CRTs conducted in LMICs had PPI. The lack of reported PPI in setting study questions is a concern and may point to a need for further work to ensure that trials conducted in LMICs are meeting the needs of the communities participating in the trial.

CP16-4 – STRATEGIES TO IMPROVE RECRUITMENT TO RANDOMISED TRIALS: COCHRANE SYSTEMATIC REVIEW

Primary Author:

1) Adwoa Parker (University of York)

Co-Author(s):

2) Gloria Mongelli (University of York)

3) Camila Piccolo-Lawrance (University of York)

4) Elizabeth Coleman (University of York)

5) Athanasios Gkekas (University of York)

6) Han-I Wang (University of York)

7) Shoba Dawson (University of Sheffield)

8) Heidi Green (COUCH Health)

9) Rosalind Way (Imperial College London)

10) Arti Rai (Northern Care Alliance NHS Foundation Trust)

Background: Participant recruitment to trials can be very difficult. Identifying strategies that improve recruitment would benefit trialists, health research and patients. Our primary objective was to quantify the effects of strategies to improve the recruitment of participants to trials by updating the Cochrane 2018 systematic review of recruitment strategies. Secondary objectives were to: assess the cost-effectiveness of recruitment strategies; assess the impact of the recruitment strategy on retention rates; and determine the effectiveness of recruitment strategies across different patient populations.

Methods: To identify randomized trials of methods to increase recruitment to randomized trials, we searched the following databases up until 15 February 2023: MEDLINE/MEDLINE In Process; EMBASE; Science Citation Index; Social Science Citation Index; Online Resource for Research in Clinical Trials; CINAHL; PsycInfo; ASSIA; ERIC. We applied no language restriction and screened reference lists of included studies. We excluded quasi-randomized and hypothetical trials. Two authors independently screened studies and extracted data. We assessed risk of bias using the Cochrane risk of bias tool.

Results: We included 83 eligible papers (40 new to this update), involving 196,929 individual participants and 1403 clusters. We identified 54 comparisons (29 new to this update). Only four comparisons were supported by high-certainty evidence according to GRADE. 1. Open trials rather than blinded, placebo trials (3 studies; 1 new). This showed an absolute improvement of 10% (95% CI 8% to 12%). This result applies mostly to women in the UK and Estonia. No cost data reported. 2. Telephone reminders to people who do not respond to a postal invitation (2 studies; 0 new). The absolute improvement was 6% (95% CI 3% to 9%). This result applies to middle-aged people living in Canada and Norway. No cost data reported. 3. Using a particular, bespoke, user-testing approach to develop participant information leaflets (6 studies; 4 new). This made no difference to recruitment: absolute improvement was 0% (95% CI -1% to 1%). This result applies to people living in the UK. No cost data reported. 4. Sending a recruitment primer letter/leaflet (2 studies; 1 new). This made no difference to recruitment: absolute improvement was 0% (95% CI -1% to 2%). This result applies to older white people living in the UK and Ireland. The recruitment primer leaflet/letter was also more costly than not sending a primer (incremental cost, GBP £2.08). There was moderate certainty evidence for eight comparisons; low certainty evidence for 11; and very low certainty evidence for seven. Our confidence in the evidence was reduced largely because 38 (70.4%) of comparisons came from single studies. Costs were reported in only 8 (14.8%) comparisons. The impact of the recruitment strategies on participant retention was reported by 10/40 (25%) of new studies.

Conclusion: The literature on strategies to improve recruitment to trials continues to have plenty of variety but little depth; we need less innovation and more replications of existing strategies. We also need better reporting of the following: cost-effectiveness; the impact of recruitment strategies on participant retention; and the impact of recruitment strategies on different patient populations.

SESSION 17

Advanced Statistical Approaches

CP17-1 – ADVANCED STATISTICAL METHODOLOGIES IN CENTRALIZED MONITORING: ENHANCING EFFICIENCY AND DATA INTEGRITY IN CLINICAL TRIALS

Primary Author:

1) Ian Rines (Medical University of South Carolina)

Co-Author(s):

2) Akash Roy (Medical University of South Carolina)

Background: The complexity and scale of clinical trials have led to higher costs, with traditional on-site monitoring often accounting for up to 14% of a trial’s budget. To address this, centralized monitoring has become a more effective option, especially due to changing FDA guidelines and challenges from the COVID-19 pandemic. Centralized monitoring utilizes electronic documents and case report forms to provide real-time oversight, reducing the need for in-person visits. It is part of a risk-based monitoring approach that focuses resources on areas most critical to patient safety and data quality. This proactive strategy enhances trial efficiency while maintaining safety and integrity, supported by advanced statistical methodologies for better data issue detection and management. Using simulated data, we sought to determine the strengths and weaknesses of various central monitoring techniques currently used within the data coordinating center for two NIH-funded clinical trial networks.

Methods: To enhance this monitoring process, we utilize a variety of advanced statistical and machine learning techniques that work together to identify anomalies in the data. Methods such as Mahalanobis distance and inlier score tests are used to detect outliers and excessively consistent data points, flagging unusual patterns or potential data fabrication. Funnel plots help uncover site-level anomalies or manipulation by examining the distributions of metrics. Additionally, heatmaps support monitoring by highlighting clusters of missing or inconsistent data, directing attention to sites that require corrective action. Furthermore, machine learning approaches like Support Vector Machines, Bayesian hierarchical models, and dynamic longitudinal models are employed to predict high-risk sites and subjects. These methods capture the hierarchical and time-based nature of clinical trial data, enabling a comprehensive understanding of variability within and between sites and subjects. This understanding helps pinpoint sources of variation and potential risks.

Results: We simulated a continuous measurement collected at 30 and 60 days from subjects across multiple sites, intentionally including outliers to represent potential data collection issues. Various outlier detection methods were applied to this simulated data, including Mahalanobis distance, inlier scores, funnel plots, and heatmaps. We also employed a Support Vector Machine, Bayesian hierarchical model, and dynamic longitudinal model to identify outliers based on these mRS scores. Performance metrics (accuracy, precision, recall, F1-score) were calculated for each method against the ground-truth outliers. These methods were compared based on their ability to detect outliers in the simulated mRS data.

Discussion: Centralized monitoring, along with advanced statistical methods, facilitates real-time detection of data anomalies in large clinical trials. These methods enhance traditional monitoring by identifying trends across subjects and sites, which allows for proactive recognition of safety and quality issues. Additionally, they improve our understanding of data variability, making these approaches essential for maintaining data quality and ensuring patient safety as trials become more complex.
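As one concrete instance of the outlier detection described in the Methods, the sketch below applies Mahalanobis distance to a simulated measurement at 30 and 60 days and scores the flags against planted outliers. It is a hypothetical illustration rather than the authors' pipeline: the distributions, the number and form of planted outliers, and the chi-square-based cutoff are all assumptions.

```python
import numpy as np


def mahalanobis_flags(X: np.ndarray, cutoff: float) -> tuple[np.ndarray, np.ndarray]:
    """Squared Mahalanobis distance of each row from the sample mean,
    plus a boolean flag for rows whose distance exceeds `cutoff`."""
    centered = X - X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    d2 = np.einsum("ij,ij->i", centered @ np.linalg.inv(cov), centered)
    return d2, d2 > cutoff


# Simulated data (all values hypothetical).
rng = np.random.default_rng(2025)
n = 300
day30 = rng.normal(50.0, 5.0, size=n)
day60 = day30 + rng.normal(0.0, 2.0, size=n)  # correlated follow-up value
X = np.column_stack([day30, day60])

# Plant five implausible jumps between day 30 and day 60.
truth = np.zeros(n, dtype=bool)
truth[:5] = True
X[:5] = [[50.0, 80.0]] * 5

CUTOFF = 13.82  # approx. 99.9th percentile of chi-square with 2 df
d2, flagged = mahalanobis_flags(X, CUTOFF)

# Score the flags against the planted (ground-truth) outliers.
true_pos = int(np.sum(flagged & truth))
precision = true_pos / max(int(flagged.sum()), 1)
recall = true_pos / int(truth.sum())
print(f"flagged {int(flagged.sum())} rows, precision {precision:.2f}, recall {recall:.2f}")
```

The same scaffold could be reused to score other detectors on identical simulated data, which is the style of comparison the abstract describes.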

CP17-2 – ASSESSING TREATMENT BENEFIT IN SURVIVAL ANALYSIS UNDER NONPROPORTIONAL HAZARDS

Primary Author:

1) Theodore Karrison (University of Chicago)

Co-Author(s):

2) William Malbecq (Université libre de Bruxelles)

3) Steven Snapinn (Seattle-Quilcene Biostatistics)

In survival comparisons, the Cox hazard ratio provides an interpretable estimate of the treatment effect under the assumption that the ratio of hazards is constant over time. However, when the proportional hazards assumption is violated, the hazard ratio has no clear interpretation. Snapinn et al. argue that in survival analysis a treatment’s benefit has two distinct “dimensions,” namely, the difference in restricted mean survival times and the difference in survival rates at the end of follow-up. They proposed calculation of a generalized hazard difference (GHD) as a means to capture both dimensions in a single estimand and showed that the reciprocal of GHD equals the number of patient-years of follow-up that results in one fewer event (NYNT), a measure analogous to the number needed to treat for binary outcomes. Almost simultaneously, Uno and Horiguchi proposed the same measure, which they termed the “difference in average hazard with survival weight,” and performed a simulation study to assess its power compared to Cox’s hazard ratio. One problem with GHD, however, is that it lacks power under early difference alternatives, i.e., when the survival curves separate and then converge. Here, rather than combining the two measures into a single index, we propose analyzing them separately, together with a maximum test (maximum Z-statistic). We find that the maximum test maintains the type I error rate while providing reasonably good power under a variety of alternatives (early difference, late difference, and proportional hazards). We illustrate the procedure using data from a randomized phase III clinical trial in prostate cancer.
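The two dimensions of benefit and their combination into a hazard difference can be made concrete with known survival curves. The sketch below assumes that the average hazard over [0, τ] is the cumulative incidence at τ divided by the restricted mean survival time, which is our reading of the “average hazard with survival weight” and is not spelled out in the abstract. The survival curves, τ, and function names are hypothetical, and the maximum test itself is not implemented here.

```python
from math import exp
from typing import Callable


def trapezoid(f: Callable[[float], float], a: float, b: float, steps: int = 10_000) -> float:
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, steps))
    return total * h


def summary_measures(surv: Callable[[float], float], tau: float) -> tuple[float, float, float]:
    """Return (RMST, S(tau), average hazard) on [0, tau] for survival function surv."""
    rmst = trapezoid(surv, 0.0, tau)
    s_tau = surv(tau)
    avg_hazard = (1.0 - s_tau) / rmst  # assumed definition, see lead-in
    return rmst, s_tau, avg_hazard


def control(t: float) -> float:
    """Hypothetical control arm: constant hazard 0.20 per year."""
    return exp(-0.20 * t)


def treated(t: float) -> float:
    """Hypothetical early-benefit arm: hazard 0.10 in year 1, then 0.22."""
    return exp(-0.10 * t) if t <= 1.0 else exp(-0.10 - 0.22 * (t - 1.0))


TAU = 5.0  # years of follow-up (hypothetical)
rmst0, s0, ah0 = summary_measures(control, TAU)
rmst1, s1, ah1 = summary_measures(treated, TAU)

rmst_diff = rmst1 - rmst0   # first dimension: gain in restricted mean survival
surv_diff = s1 - s0         # second dimension: gain in survival at tau
ghd = ah0 - ah1             # hazard difference (control minus treated)
nynt = 1.0 / ghd            # patient-years of follow-up per event avoided

print(f"RMST diff {rmst_diff:.3f} y, survival diff {surv_diff:.3f}, "
      f"GHD {ghd:.4f} per y, NYNT {nynt:.1f} patient-years")
```

For an exponential curve this average hazard reduces to the constant hazard rate, which provides a simple sanity check of the calculation.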

CP17-3 – SCORE TESTS FOR NON-PROPORTIONAL HAZARDS IN SINGLE-ARM CLINICAL TRIALS WITH TIME-TO-EVENT ENDPOINTS: A SIMULATION STUDY

Primary Author:

1) Chloé Szurewsky (INSERM)

Co-Author(s):

2) Guosheng Yin (University of Hong Kong)

3) Gwénaël Le Teuff (INSERM)

In oncology, well-powered randomized clinical trials with a time-to-event endpoint may be difficult to conduct (e.g., in pediatric studies or biomarker-defined subsets) or may be considered unethical, which explains why many one- or two-stage single-arm designs, analogous to Simon’s two-stage design for a binary endpoint, have been developed in recent years. These designs rely on the one-sample log-rank test (OSLRT) and its modified version (mOSLRT) for comparing the survival curve of an experimental arm to that of an external reference group. These tests were developed under the proportional hazards (PH) assumption, which may be violated, in particular when evaluating immunotherapies. We propose to adapt the OSLRT and evaluate alternatives for settings where PH does not hold. We extended Finkelstein’s score test (OSLRT), developed under the PH assumption, by using a piecewise exponential model with change-points (CPs) for early, middle and delayed treatment effects. For crossing hazards, we used an accelerated hazards model. As CPs are not known a priori, we developed a two-step approach with a landmark analysis using the mOSLRT to determine the time-dependent relative treatment effect and CPs, and then to select the appropriate score test. The restricted mean survival time (RMST)-based test is extended to the case of single-arm trials. We also developed a test defined as the maximum of the mOSLRT and score tests for early and/or delayed effect (maxCombo test). The performance (type I error and power) of the different approaches is evaluated through a simulation study of a phase II single-arm trial with accrual and follow-up periods of 3 and 4 years, respectively. The reference group survival curve is generated from an exponential distribution assumed to have no sampling variability, and that of the experimental group from a piecewise exponential model.
The simulation parameters are sample size of the experimental group (from 20 to 200 patients), exponential censoring rate (from 0 to 35%) and relative treatment effect (hazard ratio from 0.5 to 1). A single-arm trial evaluating an inhibitor (n=91 and reference group n=136) for neuroblastoma patients is used for illustration. The simulation study shows that the developed score tests are more conservative than the mOSLRT but as conservative as the OSLRT. As expected, the score test has the highest power when the data generation matches with the model even when the CPs are misspecified. The landmark analysis works well only for large sample size (n>100). The RMST-based test is as conservative as the mOSLRT and more powerful than the mOSLRT only for an early effect with censoring rate less than 15%. The maxCombo test is conservative and more powerful than the mOSLRT when n>50 but less than the right score test under non PH. To conclude, the developed score tests are efficient under non PH when the approximate values of CPs are known. The maxCombo test is an interesting alternative when the relative treatment effect over time and the CPs are unknown. Further research is needed to study the impact of the historical control survival distribution and its sampling variability.
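
For readers unfamiliar with the one-sample log-rank test, the following is a minimal sketch and not the authors’ implementation. It assumes an exponential reference group with a known constant hazard, so the expected number of events is the sum of the reference cumulative hazards at each patient’s follow-up time. The variance (O + E) / 2 used for the modified statistic is an assumption based on the commonly cited form of the mOSLRT, and the function and variable names are illustrative only.

```python
import math


def one_sample_logrank(times, events, ref_hazard):
    """Return (z_oslrt, z_moslrt) against an exponential reference curve.

    times      -- follow-up time of each patient (event or censoring time)
    events     -- 1 if the event was observed, 0 if censored
    ref_hazard -- constant hazard of the reference group (assumed known)
    """
    observed = sum(events)
    # Under an exponential reference, the cumulative hazard at t is ref_hazard * t.
    expected = sum(ref_hazard * t for t in times)
    z = (observed - expected) / math.sqrt(expected)
    # Modified version: variance taken as the average of O and E (assumed form).
    z_mod = (observed - expected) / math.sqrt((observed + expected) / 2)
    return z, z_mod


# Toy data, purely illustrative: 4 patients, 2 observed events.
z, z_mod = one_sample_logrank([1.0, 2.0, 3.0, 4.0], [1, 0, 1, 0], ref_hazard=0.5)
print(round(z, 4), round(z_mod, 4))
```

A negative statistic indicates fewer events than expected under the reference curve, that is, survival in the experimental arm that looks better than the reference.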

CP17-4 – ACCOUNTING FOR CENTER-LEVEL EFFECTS IN MULTICENTER RANDOMIZED CONTROLLED TRIALS

Primary Author:

1) Shofiqul Islam (McMaster University)

Co-Author(s):

2) Shrikant I Bangdiwala (McMaster University)

Investigators often conduct randomized controlled trials (RCTs) at multiple centers/sites when determining the effect of a treatment or an intervention. Diversifying recruitment across multiple institutions allows investigators to complete recruitment within a shorter timeframe and to generalize the study results across diverse populations. Despite having a common study protocol across multiple centers, the eligible participants may be heterogeneous, site policies and practices may vary, and the investigators’ experience, training, and expertise may also vary across sites. As a result, we usually observe some degree of heterogeneity in effect estimates across centers, despite all centers following the same study protocol. During the analysis of such a trial, investigators typically ignore center effects, but some have suggested considering centers as fixed or random effects in the model. It is not clear how considering the effects of centers, either as fixed or random effects, impacts the test of the primary hypothesis. In this article, we first review the practice of accounting for center effects in the analyses of published RCTs and illustrate the extent of heterogeneity observed in a few preexisting multicenter RCTs. To determine the impact of heterogeneity on the test of a primary hypothesis of an RCT, we considered continuous and binary outcomes and the corresponding appropriate model, namely, a simple linear regression model for a continuous outcome and a logistic regression model for the binary outcome. For each model type, we considered three methods: (a) ignore the center effect, (b) account for centers as fixed effects, or (c) account for centers as random effects.
Based on simulation studies of these models, we then examine whether considering the center as a fixed or random effect in the model helps to preserve or reduce the Type I and Type II error rates during the analysis phase of an RCT. Finally, we outline the threshold below which center-level effects are negligible and can thus be ignored, and provide recommendations on when it may be necessary to account for center effects during the analyses of multicenter randomized controlled trials.
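
To make methods (a) and (b) concrete for a continuous outcome, the sketch below simulates one multicenter trial with a random center intercept and compares the unadjusted difference in means with the estimate from a model with center fixed effects. It is a minimal illustration under assumed parameter values, not the authors’ simulation design. The fixed-effects estimate is obtained by centering outcome and treatment within each center, which is equivalent to including one indicator per center in a linear regression. Method (c), the random-effects model, is omitted because it needs a mixed-model fitting routine.

```python
import random
from statistics import mean


def simulate_trial(n_centers=20, n_per_center=50, beta=0.5, sd_center=1.0, seed=1):
    """Generate (center, treatment, outcome) rows; all values are assumptions."""
    rng = random.Random(seed)
    rows = []
    for center in range(n_centers):
        u = rng.gauss(0.0, sd_center)            # center-level intercept
        for _ in range(n_per_center):
            trt = rng.randint(0, 1)              # simple 1:1 randomization
            y = beta * trt + u + rng.gauss(0.0, 1.0)
            rows.append((center, trt, y))
    return rows


def unadjusted_estimate(rows):
    """Method (a): ignore centers, difference in arm means."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)


def fixed_effects_estimate(rows):
    """Method (b): center fixed effects via within-center centering."""
    by_center = {}
    for center, t, y in rows:
        by_center.setdefault(center, []).append((t, y))
    num = den = 0.0
    for obs in by_center.values():
        t_bar = mean(t for t, _ in obs)
        y_bar = mean(y for _, y in obs)
        for t, y in obs:
            num += (t - t_bar) * (y - y_bar)
            den += (t - t_bar) ** 2
    return num / den


rows = simulate_trial()
print(unadjusted_estimate(rows), fixed_effects_estimate(rows))
```

Repeating this over many seeds with beta set to 0 would give the empirical Type I error of each method once a test statistic is attached to each estimate.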

SESSION 18

Bayesian Clinical Trials

CP18-1 – THEORY TO PRACTICE: BAYESIAN RESPONSE ADAPTIVE RANDOMISATION IN A RARE DISEASE SETTING

Primary Author:

1) Rajenki Das (University of Cambridge)

Co-Author(s):

2) Sofía S Villar (University of Cambridge)

3) Nina Deliu (University of Cambridge)

4) Mark Toshner (University of Cambridge)

Response-adaptive randomization (RAR) designs are valuable as they increase the likelihood of allocation to the most promising arm while maintaining randomization as the allocation method. However, their implementation remains a challenge when the trial size is very small. Motivated by the ongoing StratosPHere 2, a phase 2 trial in a rare disease setting, we discuss the trial design, the associated challenges, and their proposed solutions. The trial design incorporates an additional step of Mapping that converts the continuous randomization probabilities produced at the interim stage to a target vector of discrete randomization ratios, using a decision rule. This approach helps to avoid undesirable treatment allocations per randomization stage while staying true to the essence of RAR. Under the implementation of Mapping, we analyze the impact of missing data and discuss an additional concern of reporting safety results accounting for the differential nature of exposure to the treatments. The ultimate goal of this work is to foster greater synergy between practical and methodological research, crucially needed for translating the benefits of using RAR into clinical practice.
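
The Mapping step can be pictured with a small sketch. The abstract does not state the trial’s decision rule or its permitted ratios, so both are hypothetical here: the rule below picks, from an assumed list of allowed two-arm ratios, the one whose implied allocation probability is closest to the continuous RAR probability.

```python
# Hypothetical set of permitted experimental:control ratios (an assumption,
# not taken from the StratosPHere 2 design).
ALLOWED_RATIOS = ((1, 1), (2, 1), (1, 2), (3, 1), (1, 3))


def map_to_ratio(prob_experimental, allowed_ratios=ALLOWED_RATIOS):
    """Return the allowed ratio closest to a continuous allocation probability."""

    def implied_probability(ratio):
        experimental, control = ratio
        return experimental / (experimental + control)

    return min(allowed_ratios,
               key=lambda ratio: abs(implied_probability(ratio) - prob_experimental))


for p in (0.5, 0.7, 0.9, 0.2):
    print(p, map_to_ratio(p))
```

Restricting allocation to a short list of ratios keeps each randomization stage from producing extreme splits, at the cost of following the continuous RAR probabilities only approximately.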

CP18-2 – CALIBRATION OF DOSE-AGNOSTIC PRIORS FOR BAYESIAN DOSE-FINDING TRIAL DESIGNS WITH JOINT OUTCOMES

Primary Author:

1) Emily Alger (Institute of Cancer Research)

Co-Author(s):

2) Shing M Lee (Columbia University)

3) Ying Kuen Cheung (Columbia University)

4) Christina Yap (Institute of Cancer Research)

Dose-finding oncology trials are a crucial step in early clinical development. The goal of these trials is to assess the safety of novel anti-cancer treatments across multiple doses and to recommend dose(s) for subsequent trials. Based on previous patients’ observed responses to treatment, trialists dynamically recommend new doses for further investigation during the trial. In traditional dose-finding designs, doses are escalated towards a target dose-limiting toxicity (DLT) rate, with the final recommendation identified as the maximum tolerated dose (MTD). Such adaptive decision making lends itself well to Bayesian learning, with Bayesian frameworks increasingly adopted to guide dose recommendations in model-based dose-finding designs. However, such approaches come at a cost of increased complexity, including the challenge of selecting appropriate priors. Such complexity is further amplified for model-based designs, where the incorporation of additional outcomes is leading to increasingly intricate designs. For instances where trialists lack prior knowledge of the MTD, methodology for calibrating dose-agnostic priors has been developed for the continual reassessment method (CRM) design with a single outcome. However, applying such methodology to existing joint-outcome CRM-based trial designs to generate a priori dose-agnostic priors is flawed: it inflates lower dose recommendations and results in more patients being treated at sub-optimal doses. We address this issue by extending the calibration techniques for single-outcome trial designs, creating a new analytical approach to calibrate dose-agnostic priors for joint-outcome trial designs that jointly evaluate DLTs and efficacy responses, or DLTs and patient-reported outcomes (PROs). This analytical and computationally efficient technique maintains an a priori dose-agnostic prior with a reduced standard deviation of the proportion of correct selection across simulation scenarios.
This approach also improves the probability of correct selection of the optimal dose in a majority of scenarios. As Bayesian dose-finding trial designs continue to advance, research and guidance on the effective calibration of design parameters are essential to support their uptake and ensure optimal performance in practice. This method provides an analytical and intuitive approach to prior calibration, highlighting the importance of rigorous prior calibration in improving model accuracy and dose selection for safer, more effective oncology treatments.

CP18-3 – CALIBRATION-FREE ODDS CFO SUITE FOR DESIGNING VARIOUS PHASE I CLINICAL TRIALS

Primary Author:

1) Guosheng Yin (University of Hong Kong)

Co-Author(s):

2) Jialu Fang (University of Hong Kong)

In the development of new cancer treatments, an essential step is to determine the maximum tolerated dose in a phase I clinical trial. To use the data more efficiently yet without any model assumption, we propose a novel calibration-free odds (CFO) approach to phase I trial design. Not only is the CFO design free of any dose-toxicity curve assumption, but it can also aggregate all the available information accrued in the trial for dose assignment. Seamless phase I/II trials, which aim to identify the optimal biological dose (OBD), have gained enormous popularity. To enhance the accuracy and robustness of OBD identification, the CFO design monitors both toxicity and efficacy. For toxicity monitoring, the CFO design casts the current dose in competition with its two neighboring doses to obtain an admissible set. For efficacy monitoring, CFO selects the dose that has the largest posterior probability of achieving the highest efficacy under the Bayesian paradigm. In contrast to most of the existing designs, the prominent merit of CFO is that its main dose-finding component is model-free and calibration-free, which can greatly ease the burden of artificial input of design parameters and thus enhance the robustness and objectivity of the design. We will also illustrate the implementation of CFO using its Shiny App, which is user-friendly and publicly accessible at https://clinicaltrialdesign.shinyapps.io/cfoapp/.

CP18-4 – DEVELOPING A SIMPLE PLAN FOR IMPLEMENTATION AND MONITORING OF COMPLEX RANDOMIZATION ALGORITHMS

Primary Author:

1) Kevin Venner (Almac Clinical Technologies)

Co-Author(s):

2) Noelle Sassany (Almac Clinical Technologies)

3) Jennifer Ross (Almac Clinical Technologies)

Demand for sophisticated randomization algorithms such as Covariate Adaptive Randomization or Minimal Sufficient Balance is growing. These designs appeal to statisticians for their robust methods of minimizing predictability in treatment assignment while maintaining treatment balance across baseline covariates. Translating designs from the theoretical to the practical is a challenging yet crucial step in the successful implementation and monitoring of these randomization algorithms in an Interactive Response Technology (IRT) system. Unlike frequentist designs, where randomization assignments are performed via a pre-generated randomization schedule, treatment assignments performed by randomization algorithms require a higher level of knowledge than just identifying the next record in the associated schedule. The theoretical design of these algorithms may appear daunting to non-statisticians but can be redefined in digestible step-by-step terms to ensure non-statistical team members have a firm understanding for IRT implementation. Current regulatory guidance requires sponsors to assess monitoring plans for the critical data in their clinical trials. As randomization is one of these critical steps in trials, an appropriate Randomization Monitoring Plan (RMP) should be developed based on the algorithm’s implementation. An RMP provides patients and sponsors the assurance that a patient’s participation in a clinical trial is accurate and backed up by validated evidence. RMPs require clear documentation of (1) the required checks, (2) the frequency of reviewing the data, and (3) the communication plan for each review of randomization assignments. Collaboration between the sponsor and the independent personnel performing the monitoring is key in crafting an effective RMP. Sponsor stakeholders may have a non-statistical background and will rely on clear, concise, and accurate descriptions of IRT algorithm implementation for monitoring success.
Reporting of the randomization data in IRT also needs to be carefully considered to ensure effective randomization monitoring can be performed. Common checks for reviewing treatment assignments of complex randomization algorithms include verifying that the individual step-by-step calculations are correct and result in the actual treatment assignment for each patient. Additional checks may be required based on a clinical trial’s protocol design and IRT implementation. For example, if there are multiple responses utilized to identify a single covariate level in IRT, then an additional check to confirm the mapping to the correct covariate level may be required in the monitoring plan. IRT providers are uniquely positioned to help define the design of a complex randomization algorithm, given that they have intimate knowledge of the data structure, step-by-step calculations, and processes for randomization assignment to patients. This presentation intends to establish the importance of accurately defining the steps of complex randomization algorithms for IRT implementation and the monitoring of randomization data.
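
A step-by-step check of the kind described above might look as follows. This is a minimal sketch that uses Pocock-Simon style minimization with a range-based imbalance measure as one example of a covariate-adaptive algorithm; it is not the authors’ algorithm, nor any particular IRT system’s implementation, and the data layout and function names are hypothetical. For a new patient, it recomputes the total marginal imbalance that would result from each candidate arm, so a reviewer can compare the favored arm with the assignment recorded in IRT.

```python
def imbalance_scores(history, new_levels, arms=("A", "B")):
    """Total marginal imbalance if the new patient were assigned to each arm.

    history    -- list of (covariate_levels_dict, assigned_arm) for prior patients
    new_levels -- covariate levels of the patient being randomized
    """
    scores = {}
    for candidate in arms:
        total = 0
        for factor, level in new_levels.items():
            # Count prior patients sharing this covariate level, by arm.
            counts = {arm: 0 for arm in arms}
            for levels, arm in history:
                if levels[factor] == level:
                    counts[arm] += 1
            counts[candidate] += 1               # hypothetical assignment
            total += max(counts.values()) - min(counts.values())
        scores[candidate] = total
    return scores


def preferred_arms(scores):
    """Arms with the smallest imbalance (more than one arm means a tie)."""
    best = min(scores.values())
    return sorted(arm for arm, score in scores.items() if score == best)


history = [
    ({"sex": "F", "site": "1"}, "A"),
    ({"sex": "F", "site": "2"}, "A"),
    ({"sex": "M", "site": "1"}, "B"),
]
scores = imbalance_scores(history, {"sex": "F", "site": "1"})
print(scores, preferred_arms(scores))
```

In a real algorithm the favored arm is usually assigned with a probability below 1, so the check would confirm the recomputed scores and the probability applied, not a deterministic match.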

SESSION 19

Cancer Clinical Trials

CP19-1 – WAS IT WORTH IT: A POOLED ANALYSIS OF PARTICIPANT EXPERIENCE IN CANCER CHEMOPREVENTION TRIALS

Primary Author:

1) David Zahrieh (Mayo Clinic)

Co-Author(s):

2) Carrie Strand (Mayo Clinic)

3) Aminah Jatoi (Mayo Clinic)

4) Sumithra J Mandrekar (Mayo Clinic)

Introduction: Participating in a chemoprevention trial is a personal choice for otherwise healthy and cancer-free participants. There are no comprehensive reports on participant satisfaction in cancer chemoprevention trials. The aim of the current study was to assess whether participation in early-phase cancer chemoprevention trials was satisfactory and to identify a set of features associated with participant satisfaction.

Methods: Individual participant-level data from 13 early-phase chemoprevention trials targeting four disease sites (colorectum; esophagus; liver; lung) and conducted from 2006-2018 by the Cancer Prevention Network were included in the current study. The 5-item “Was It Worth It?” (WIWI) questionnaire was administered at the end of each trial’s intervention period or at the time of early termination for participants who ended the intervention early. The binary outcome, satisfied overall, was defined as a participant who answered yes to the first three questions on the WIWI questionnaire: Q1: Was it worthwhile for you to participate in this research study; Q2: If you had to do it over, would you participate in this research study again; and Q3: Would you recommend participating in this research study to others. Seventeen factors covering trial-design, baseline, and on-study features were identified based on subject matter knowledge and were interrogated with the random forests algorithm. A hierarchy of features based on quantification of the importance of their effects on being satisfied overall was constructed. A supervised approach was utilized to build a multiple logistic regression model to understand the impact of these features on participant satisfaction.

Results: A total of 691 (of 706) participants, with representation in large and small communities located throughout the US, Canada, Puerto Rico, and Honduras, started the trial-specific intervention, with 652 (94.4%) completing the WIWI questionnaire. Of these 652, 493 (75.6%) were White, non-Hispanic or Latino; 39 (6.0%) Black, non-Hispanic or Latino; 98 (15.0%) Hispanic or Latino; and 8 (1.2%) of another race/ethnicity. There were 193 females (29.6%), 121 (17.5%) were 65+ years, and 517 (79.3%) participated in a placebo-controlled trial. One-third of these participants were enrolled outside the US. Overall, 85% indicated that they were satisfied. After controlling for age, sex, and intervention duration, the odds of not being satisfied overall were 7.4 times higher for participants who terminated the intervention early (P<0.001); 1.8 times higher when the percentage of the intervention duration spent experiencing adverse events (AEs) was >5% (P=0.024); and 1.9 times higher when the cumulative number of preintervention AEs experienced was >0 (P=0.012). Compared with White, non-Hispanic or Latino participants, the odds of not being satisfied overall were 2.96 times higher for Black/Asian/>1 race, non-Hispanic or Latino participants (P<0.001) and 0.40 times lower for Hispanic or Latino participants (P=0.004).

Conclusion: Cancer prevention is an important health-centric goal of the National Cancer Plan. This work is based on prospective and uniformly collected participant experience data using the WIWI questionnaire from 13 early-phase cancer prevention trials. Knowing the set of features associated with satisfaction in our large series of geographically and demographically diverse participants can inform the design of subsequent trials and help develop strategies to improve accrual, retention, adherence, and diversity.

CP19-2 – A COLLABORATIVE APPROACH FOR DEVELOPING SAFETY STOPPING RULES IN PHASE 2 ONCOLOGY CLINICAL TRIALS

Primary Author:

1) Subodh Selukar (St. Jude Children’s Research Hospital)

Co-Author(s):

2) Motomi Mori (St. Jude Children’s Research Hospital)

3) Paul Frankel (City of Hope)

Phase 1 oncology trials provide the initial safety data to support continued investigation of novel regimens, but their limited sample sizes can motivate subsequent phase 2 trials to continue close safety monitoring. Several statistical rules for monitoring safety have been proposed to support clinical evaluation of safety in phase 2 trials. These rules can be prespecified during study design to help protect against enrolling additional patients once it is decided that the regimen is not sufficiently well-tolerated, contrary to the assessment from the phase 1 study. These rules assess accumulating safety data to evaluate whether there is evidence that the study regimen’s toxicity exceeds a prespecified threshold. Some approaches control the probability of detecting excess toxicity when the toxicity is truly acceptable (type-1 error) and aim to have a high probability of detecting excess toxicity, and thus holding accrual, at higher toxicity rates. In practice, however, we observe that recommended rules based on traditional type-1 error rates may have undesirable properties: they may allow unacceptably large numbers of early toxicities and may have limited probability to detect excess toxicity under truly unacceptable toxicity rates. Therefore, any initial statistical rule based on traditional type-1 error likely needs to be modified based on clinical expertise to adequately protect the safety of patients under study. We outline a framework of close collaboration between statistical and clinical experts to develop rules that balance statistical operating characteristics with clinical acceptability. We illustrate this framework by reviewing the potential design of a clinical trial evaluating a novel hematopoietic cell transplantation regimen for pediatric patients with high-risk hematological malignancies.
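
The operating characteristics discussed above can be computed exactly for any proposed boundary. The sketch below is a minimal illustration, not the authors’ method: it takes a hypothetical stopping boundary (the minimum number of toxicities among the first n patients that would hold accrual) and returns the probability of stopping at some point for a given true toxicity rate. Evaluated at an acceptable rate this is the type-1 error; evaluated at an unacceptable rate it is the probability of detecting excess toxicity.

```python
def prob_stop(boundary, p_tox):
    """Probability that a toxicity stopping rule is ever triggered.

    boundary -- boundary[n-1] is the smallest toxicity count among the first n
                patients that triggers stopping; None means no check at n
    p_tox    -- true probability of toxicity for each patient
    """
    continuing = {0: 1.0}      # toxicity count -> probability, trial still open
    stopped = 0.0
    for bound in boundary:
        # Enroll one more patient and update the toxicity count distribution.
        updated = {}
        for count, prob in continuing.items():
            updated[count + 1] = updated.get(count + 1, 0.0) + prob * p_tox
            updated[count] = updated.get(count, 0.0) + prob * (1 - p_tox)
        # Move paths that reach the boundary into the stopped total.
        continuing = {}
        for count, prob in updated.items():
            if bound is not None and count >= bound:
                stopped += prob
            else:
                continuing[count] = prob
    return stopped


# Hypothetical boundary for 12 patients, for illustration only.
boundary = [None, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6]
for rate in (0.1, 0.2, 0.4):
    print(rate, round(prob_stop(boundary, rate), 3))
```

Tabulating these probabilities alongside the number of early toxicities each boundary tolerates gives statistical and clinical collaborators a shared basis for modifying an initial rule.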

CP19-3 – USING BENEFIT-RISK METHODS TO ADJUST THE NON-INFERIORITY MARGIN BASED ON TREATMENT BENEFITS

Primary Author:

1) Nikki Totton (University of Sheffield)

Co-Author(s):

2) Steven Julious (University of Sheffield)

3) Stephen Walters (University of Sheffield)

Non-inferiority trials are used to test whether a new treatment is no worse than the comparator. A major design component is the non-inferiority margin (NIM), which represents the largest difference between the two treatments at which the new treatment is still considered non-inferior. The chosen NIM has a major impact on the design and subsequent outcome of a trial; however, choosing the NIM can be difficult. This is especially the case when an increase in the NIM, i.e. a higher level of acceptable inferiority, might be considered due to the trade-off with a benefit of the new treatment. Benefit-risk (B-R) methods aim to create transparency and consistency when trading off between multiple outcomes. These methods are commonly used to gain regulatory approval at the end of a clinical trial but could be implemented at the design stage when setting the NIM. The aim of this work was to test the potential of using B-R methods to aid in adjusting a NIM based on additional benefits. Following a systematic review of potential B-R methods and the application of formal, researcher-led selection criteria, four different B-R methods were applied to two real-life NI trials. The four methods tested were the Benefit-Risk Action Team (BRAT) framework, the Unified Methodologies for Benefit-Risk Assessment framework, Multi-Criteria Decision Analysis (MCDA), and the Food and Drug Administration’s (FDA) Benefit-Risk Framework (BRF). The applied methods were presented to stakeholders (n=6) during semi-structured interviews. Stakeholders were asked about the perceived usefulness of the B-R methods and asked to highlight any barriers to use in practice. Implementation of the B-R methods was found to be most straightforward with the FDA BRF. Stakeholders felt this was useful for all trial teams as a starting point to be more explicit about the considerations, but that it lacked some complexity.
MCDA was the most complex and quantitative method used and was the only one to output a suggested value that can be used for the NIM. This approach raises concerns about obscuring the information used in decision-making. However, stakeholders considered it beneficial, as they found selecting a value for the NIM to be difficult and arbitrary. The BRAT framework presented the most information and was considered the most useful, with the important caveat that quality data need to be available. Stakeholders felt that all methods would assist with the justification produced for the NIM. In conclusion, using B-R methods provided improved structure and transparency around the decision making for the value of the NIM by formally considering an adjustment based on the benefit of the treatment (if appropriate). The use of formal methods provides a level of robustness and consistency that will help to improve the design of NI trials. Additionally, using the B-R methods was found to improve the justification provided for the NIM and, importantly, to help readers distinguish between evidence and subjective decision-making within the justification. Improving the decision-making and justification of the NIM will hopefully improve confidence in NI trials and their results for all stakeholders.

CP19-4 – ASSURANCE METHODS FOR DESIGNING A SURVIVAL TRIAL WITH DELAYED TREATMENT EFFECTS

Primary Author:

1) James Salsbury (University of Sheffield)

Co-Author(s):

2) Jeremy Oakley (University of Sheffield)

3) Steven Julious (University of Sheffield)

4) Lisa Hampson (Novartis)

An assurance (probability of success) calculation is a Bayesian alternative to a power calculation. These calculations are becoming more regularly performed in industry, especially in the design of Phase III confirmatory trials. Immuno-oncology (IO) is a rapidly evolving area in the development of anticancer drugs. A common phenomenon that arises from IO trials is one of delayed treatment effects, that is, a delay in the separation of the Kaplan-Meier survival curves. To calculate assurance for a trial in which a delayed treatment effect is likely to be present, uncertainty about key parameters needs to be considered. If uncertainty is not considered, then the number of patients recruited may not be enough to ensure we have adequate statistical power to detect a clinically relevant treatment effect. We present an elicitation technique for when a delayed treatment effect is likely to be present and show how to compute assurance using these elicited prior distributions. We provide an example to illustrate how this could be used in practice.
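
Assurance is the power averaged over a prior distribution on the unknown parameters, and it can be approximated by simulation. The sketch below is a minimal illustration, not the authors’ elicitation method. It assumes an exponential control arm, an experimental arm whose hazard equals the control hazard until a delay and is multiplied by a hazard ratio afterwards, simultaneous enrollment with fixed follow-up, and a log-rank test with success declared when z < -1.96. The priors on the delay and the post-delay hazard ratio are hypothetical stand-ins for elicited distributions.

```python
import math
import random


def draw_time(rng, hazard, delay, hazard_ratio):
    """Event time with hazard `hazard` before `delay`, hazard_ratio * hazard after."""
    e = rng.expovariate(1.0)                   # target cumulative hazard
    if e <= hazard * delay:
        return e / hazard
    return delay + (e - hazard * delay) / (hazard_ratio * hazard)


def logrank_z(data):
    """Log-rank z for (time, event, arm) tuples; negative favors arm 1."""
    data = sorted(data)
    n1 = sum(arm for _, _, arm in data)
    n0 = len(data) - n1
    o_minus_e = var = 0.0
    for _, event, arm in data:
        if event:
            p1 = n1 / (n1 + n0)                # share of risk set in arm 1
            o_minus_e += arm - p1
            var += p1 * (1 - p1)
        if arm:
            n1 -= 1
        else:
            n0 -= 1
    return o_minus_e / math.sqrt(var) if var > 0 else 0.0


def trial_succeeds(rng, n_per_arm, hazard, delay, hazard_ratio, follow_up):
    data = []
    for arm in (0, 1):
        for _ in range(n_per_arm):
            t = draw_time(rng, hazard, delay if arm else 0.0,
                          hazard_ratio if arm else 1.0)
            data.append((min(t, follow_up), t < follow_up, arm))
    return logrank_z(data) < -1.96


def assurance(n_sims=200, n_per_arm=150, hazard=0.1, follow_up=36.0, seed=2025):
    rng = random.Random(seed)
    successes = 0
    for _ in range(n_sims):
        delay = rng.uniform(0.0, 6.0)                          # hypothetical prior
        hazard_ratio = rng.lognormvariate(math.log(0.7), 0.2)  # hypothetical prior
        successes += trial_succeeds(rng, n_per_arm, hazard, delay,
                                    hazard_ratio, follow_up)
    return successes / n_sims


print(assurance())
```

Fixing the delay and hazard ratio at single values instead of drawing them from priors turns the same loop into an ordinary simulated power calculation, which makes the contribution of parameter uncertainty easy to see.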

SESSION 20

Data Management and Sharing

CP20-1 – OPEN SCIENCE: SHARING DATA AND RESOURCES IN ALZHEIMER’S DISEASE CLINICAL TRIALS

Primary Author:

1) Gustavo Jimenez-Maggiora (University of Southern California)

Co-Author(s):

2) Michael C Donohue (University of Southern California)

3) Robert A Rissman (University of Southern California)

4) Paul S Aisen (University of Southern California)

Background: The NIA-funded Alzheimer’s Clinical Trials Consortium (ACTC) is a multi-principal investigator program led by Paul Aisen, MD (University of Southern California), Reisa Sperling, MD (Harvard University), and Ronald Petersen, MD, PhD (Mayo Clinic). Its mission is to provide an optimal trial infrastructure, leveraging centralized resources and shared expertise, to accelerate the development and rigorous testing of interventions for Alzheimer’s disease and related dementias (ADRD). ACTC comprises 38 academic member sites across the U.S. with extensive expertise in ADRD clinical trials. ACTC completed the Anti-Amyloid Treatment in Asymptomatic Alzheimer’s Disease (A4) and Longitudinal Evaluation of Amyloid Risk and Neurodegeneration (LEARN) studies, which were conducted between 2014 and 2023, with enrollment completed in 2017 and final study results reported in 2023. The A4 and LEARN studies were conducted at 67 clinical trial sites in the United States, Canada, Japan, and Australia. The A4 study screened (n=6763), enrolled, and randomized (n=1169) participants between the ages of 65 and 85, with a blinded follow-up period of 240 weeks followed by an open-label period of variable length. The LEARN study screened and enrolled individuals (n=538) who were ineligible for the A4 study based on nonelevated measures of amyloid accumulation using positron emission tomography imaging (amyloid PET). Consistent with the NIH data sharing policy and the principles of Open Science, the A4/LEARN investigators aimed to share data as broadly and early as possible while still protecting participant privacy and confidentiality and the scientific integrity of the studies.

Methods: We describe the approach, methods, and platforms used to share the A4 and LEARN pre-randomization study data for secondary research use. Preliminary results measuring the impact of these efforts are also summarized. We conclude with a discussion of lessons learned and next steps.

Results: The A4 and LEARN pre-randomization study data were released in December 2018. The materials shared included de-identified quantitative and image data, analysis software, instruments, and documentation. As of November 1, 2024, 1749 requests have been submitted by investigators and citizen scientists from more than 50 countries. We identified 78 peer-reviewed publications that acknowledge the A4/LEARN study.

Conclusions: Our initial results provide evidence supporting the feasibility and scientific utility of broad and timely sharing of Alzheimer’s disease trial data.

CP20-2 – DEVELOPING A STANDARDIZED DATA MANAGEMENT PLAN TEMPLATE FOR RANDOMIZED CONTROLLED TRIALS

Primary Author:

1) Anna Catharina Vieira Armond (University of Ottawa Heart Institute)

Co-Author(s):

2) David Moher (Ottawa Hospital Research Institute)

3) Florian Naudet (University of Rennes)

4) Dean Fergusson (Ottawa Hospital Research Institute)

5) Dong Vo (Ottawa Hospital Research Institute)

6) Kelly Cobey (University of Ottawa Heart Institute)

Background: Effective research data management is crucial for ensuring rigor, reproducibility, and transparency. Data Management Plans (DMPs) are increasingly required by funders and organizations to ensure researchers provide detailed plans for managing, sharing, and preserving data throughout the research lifecycle. Researchers, however, face barriers to creating comprehensive DMPs, especially when resources for dedicated data management are scarce. To address this gap, we developed and refined a DMP template specifically tailored to meet the practical needs of trialists.

Methods: A draft RCT DMP template was designed using the structure of the Canadian DMP Portage general template and was refined by considering the essential elements to include in a DMP found in a recently completed scoping review of funder DMP requirements. To ensure its applicability for RCTs, two rounds of user testing of the draft DMP were conducted online with clinical trial community members with diverse backgrounds and data management experience levels. The first round involved data managers, research coordinators, privacy and ethics staff, and quality assurance specialists. The second round included trialists and RCT methodologists. A content analysis was performed on feedback from each round. Participants were recruited through a variety of approaches, including through the networks of Clinical Trials Ontario and Clinical Trials British Columbia. Feedback from these sessions was collected to assess clarity, comprehensiveness, and usability, allowing for iterative improvements to the template.

Results: Thirty-eight participants provided feedback on practical challenges, specific examples, and preferred terminology about DMP items. Feedback highlighted several areas for improvement. Participants emphasized the importance of clear guidance on data types and formats, structured examples relevant to RCT contexts (e.g., de-identification protocols, data version control), and practical guidance on compliance with evolving funder mandates. Suggestions included providing context-specific guidance, expanding sections on data completeness and missing data handling, and distinguishing between short-term data storage and long-term archiving practices. Enhancements also focused on reducing the complexity of technical sections, making the DMP practical for use by trialists without specialized data management expertise. Overall, participants expressed satisfaction with the DMP’s ease of use and adaptability to a wide range of study designs and resource settings. The template provides structured guidance on key areas including (1) data description and collection; (2) documentation and metadata; (3) storage and backup; (4) preservation; (5) sharing and reuse; (6) responsibilities and resources; and (7) ethics and legal compliance.

Conclusion: The developed DMP template offers an accessible and comprehensive tool for clinical trialists to adhere to novel DMP mandates and best practices. It addresses essential data management needs while remaining adaptable to varied RCT study requirements. This presentation will outline the template’s key DMP components, demonstrating the potential impact of this resource on promoting reproducibility and transparency in clinical research.

CP20-3 – ENHANCING THE QUERY PROCESS IN CLINICAL TRIALS THROUGH THE USE OF A QUERY PORTAL SYSTEM

Primary Author:

1) Michelle Lancet (University of Pittsburgh)

Co-Author(s):

2) Joseph Weiss (University of Pittsburgh)

3) Madison Morgan (University of Pittsburgh)

4) Cristina Murray-Krezan (University of Pittsburgh)

5) Kaleab Abebe (University of Pittsburgh)

The University of Pittsburgh’s Center for Biostatistics and Qualitative Methodology Data Coordinating Center (CBQM-DCC) utilizes a homegrown electronic system for data management (eSYSDM) that integrates eligibility confirmation, randomization, participant management, safety reporting, laboratory sample management, outcome adjudication, and data and safety monitoring. The DCC monitors data quality at regular intervals and promotes direct data entry, which allows for monitoring data in real time. While the use of the eSYSDM has provided increased efficiency in data and safety monitoring, the data querying process to participating sites remains cumbersome and highly reliant on dissemination of queries via email and time-consuming manual verification of query resolution. In this talk, we describe the conception, development, and implementation of a web-based query portal integrated within the eSYSDM. Through use of this portal, the DCC’s clinical trial managers can create data queries for each participating site and can track the query through resolution. They can identify missing or discrepant values from either data quality reports or by nature of reviewing data entered into the case report form, create a data query directly in the form along with a deadline, and provide instructions or comments regarding the action item. Once submitted, the data query created will be visible immediately upon a login by clinical site users. The eSYSDM will present a reminder to the site user that there are outstanding data queries; however, the user can proceed with participant management or other activities before resolving the query. The query can be viewed at any time and data changed directly in the form. Queries can be submitted back to the clinical trial manager with comments or confirmation that the variable in question does not need to be changed. 
Lastly, the clinical trial manager can manage the cumulative list of queries issued to each site and easily view and verify query resolution within the eSYSDM. The ability to create, track, and manage queries directly in the system reduces the amount of email communication to the clinical sites and eliminates the loss of data query communication when site staff turn over. The query portal ensures that all trained users of the eSYSDM at the respective site can resolve queries. Finally, we will describe future enhancements to the query portal, such as integrating site-identified data change requests and approval of those requests from the DCC, as well as system-generated queries automatically created based on set criteria defined for our monthly reports. These could include missing or incomplete forms, problematic data values, and unresolved adverse events (AEs).
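The query life cycle described in this abstract (creation by the trial manager, return by the site with a comment, verification by the trial manager) can be illustrated with a minimal sketch. This is not the eSYSDM implementation; every class, field, and status name below is a hypothetical choice made for illustration only.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class QueryStatus(Enum):
    """Hypothetical statuses; the abstract does not name them."""
    OPEN = "open"            # created by the trial manager, awaiting the site
    SUBMITTED = "submitted"  # returned by the site with a comment
    RESOLVED = "resolved"    # resolution verified by the trial manager


@dataclass
class DataQuery:
    site: str
    form: str
    variable: str
    instructions: str
    deadline: date
    status: QueryStatus = QueryStatus.OPEN
    comments: list = field(default_factory=list)

    def submit(self, comment: str) -> None:
        """Site user returns the query, with or without a data change."""
        if self.status is not QueryStatus.OPEN:
            raise ValueError("only open queries can be submitted")
        self.comments.append(comment)
        self.status = QueryStatus.SUBMITTED

    def verify(self) -> None:
        """Trial manager confirms that the query is resolved."""
        if self.status is not QueryStatus.SUBMITTED:
            raise ValueError("only submitted queries can be verified")
        self.status = QueryStatus.RESOLVED


def outstanding(queries, site):
    """Queries a site user would be reminded about at login."""
    return [q for q in queries if q.site == site and q.status is QueryStatus.OPEN]


# Usage: one query issued to a hypothetical site "Site 01".
query = DataQuery("Site 01", "Vitals", "systolic_bp",
                  "Value out of range; please confirm.", date(2025, 1, 31))
reminders_before = len(outstanding([query], "Site 01"))
query.submit("Confirmed against source; no change needed.")
reminders_after = len(outstanding([query], "Site 01"))
query.verify()
```

Keeping the status transitions inside the query object is one way to guarantee that a query cannot be marked resolved before the site has responded.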

CP20-4 – HARNESSING ARTIFICIAL INTELLIGENCE IN DATA MANAGEMENT

Primary Author:

1) Chris Cook (SWOG Statistics and Data Management Center)

Co-Author(s):

2) Antje Hoering (SWOG Statistics and Data Management Center)

3) Michael LeBlanc (SWOG Statistics and Data Management Center)

Artificial intelligence (AI) tools can be profoundly helpful in clinical trial operations such as software development, information technology, and data management. These tools allow us to streamline previously time-consuming tasks. SWOG Cancer Research Network has been increasingly leveraging AI to enhance operational efficiency. One of the most significant AI advancements lies in the automation of software development and other IT tasks. AI systems excel at addressing well-defined software problems for which extensive training data exists. By automating routine coding tasks through code generation and effective AI prompting, these systems enable our developers not only to increase productivity but also to redirect their time to more novel and trial-specific challenges. For example, developers can generate code for more common tasks like input processing, setting up unit tests, and logging errors. The generation of tailored scripts for IT deployments and configurations has streamlined daily IT tasks to support more efficient and scalable infrastructure management. SWOG has also made progress in setting up systems to facilitate easier interrogation of our documents. With AI, we can simply ask questions of our document library like “What is SWOG’s AI Policy?” or “How are adverse events collected in this trial?” instead of having to manually search through folders or long documents. This improves both information accessibility and decision-making agility. Important to SWOG’s work has been developing a framework for these tools to be used successfully and ethically. SWOG’s framework allows for these tools to be used but also implements restrictions on them to maintain privacy of sensitive data and maintain humans as the final decision makers when making critical decisions. This mitigates the risks of AI tools straying from anticipated use or providing inaccurate information.
SWOG has utilized a variety of third-party AI tools such as ChatGPT and Copilot but is also training local AI models to help with tasks that require additional privacy and customization. While the commercial frontier AI models offer impressive state-of-the-art capabilities, they cannot offer the same level of privacy and configuration options that locally developed models can. The AI revolution offers unprecedented opportunities to make clinical trial operations more efficient, and we anticipate these opportunities to expand further over time.

SESSION 21

Operations and Execution

CP21-1 – DUAL COORDINATION CENTER ORGANIZATION AND STUDY START-UP OPERATION EFFICIENCIES FOR THE ALL ALS CONSORTIUM

Primary Author:

1) Courtney Igne (Massachusetts General Hospital)

Co-Author(s):

2) Praveena Mohan (Barrow Neurological Institute)

3) James Berry (Massachusetts General Hospital)

4) Robert Bowser (Barrow Neurological Institute)

5) Lindsay Pothier (Massachusetts General Hospital)

6) Ashley Rivera (Massachusetts General Hospital)

7) Brandi Negron (Massachusetts General Hospital)

8) Meghan Hall (Barrow Neurological Institute)

9) Jenny Hamilton (Barrow Neurological Institute)

10) Lisa Butler (Barrow Neurological Institute)

Background: The Access for All in ALS (ALL ALS) Consortium was funded by the National Institutes of Health (NIH) in September 2023 as an Other Transaction Agreement between two Clinical Coordination Centers (CCCs), the Neurological Clinical Research Institute at Massachusetts General Hospital (MGH) and the Barrow Neurological Institute (BNI). The CCCs received separate funding awards for the ALL ALS Consortium and were tasked with coordinating and collaborating to achieve the primary project goal: to establish a consortium, lead natural history studies, and establish a repository of clinical data and corresponding bio-samples to support open ALS research. The CCCs collectively oversee the research studies, sites, budgets, contracting, vendors, data, and outcomes and monitoring, achieving annual NIH milestones while operating within the standards of their institutions. This abstract describes the organization of the dual CCC model and methods implemented by MGH as the East CCC (ECCC) and BNI as the West CCC (WCCC) to optimize start-up processes.

Hypothesis: The dual CCC model harnesses unique perspectives, experiences, and resources, allowing for collaboration to accelerate study start-up, site contracting, site activation, and enrollment for protocols under the ALL ALS Consortium. Flexible expense-allocation workflows result in budget efficiencies not typically available in the single CCC model.

Methods: Sites were assigned to either the ECCC or WCCC for primary support and oversight, lessening the burden of site management at each CCC and allowing for quicker review of site IRB submission materials and requirements for activation. The ECCC serves as the Single IRB (sIRB) and IRB of record for all sites, allowing for site reliance on one IRB for submission and approval management. The WCCC’s Outcomes and Monitoring team, the ECCC’s Data and Systems Management team, and existing relationships and legal agreements between the two CCCs were leveraged to provide support and governance for the project. Contracting with high-cost vendors was managed by the WCCC because its indirect cost rate is more favorable than that of the ECCC. Existing infrastructure at both CCCs was maximized to accelerate start-up activities, including utilizing previously contracted vendors for regulatory document management and translation services.

Results: Within the first year, the CCCs developed and launched two separate study protocols (ASSESS and PREVENT), the affiliated training materials, databases, and supporting study documents. Additionally, the CCCs executed 97% of site master agreements, received IRB approval for 10 clinical sites, activated 9 of those sites, enrolled the first participants for both protocols, and developed the necessary operational structure to guide decision-making and processes for the consortium through SOPs, Manuals, and Plans. By leveraging existing contracting services, agreement execution time for site and vendor contracts was minimized, resulting in quicker implementation. Table 1 reflects the average time sites have taken to reach critical milestones in study start-up.

Conclusion: Overall, the dual-CCC model has enabled the teams to explore innovative ways to utilize shared resources to strategize clinical operations for successful study start-up for the ALL ALS Consortium.

CP21-2 – OPTIMIZING TRIAL SUCCESS: KEY STAFFING CONSIDERATIONS FOR PLATFORM TRIALS

Primary Author:

1) Lia Tamburello (Massachusetts General Hospital)

Co-Author(s):

2) Lindsay Heyd (Massachusetts General Hospital)

3) Brittney Harkey (Massachusetts General Hospital)

4) Marianne Chase (Massachusetts General Hospital)

5) Sabrina Paganoni (Massachusetts General Hospital)

6) Merit Cudkowicz (Massachusetts General Hospital)

Background: Platform trials offer an innovative approach to clinical research design by enabling multiple drugs to be evaluated for a specific disease area within the same trial framework. This design uses a Master Protocol, shared infrastructure, and a shared placebo group, making it more efficient and cost-effective compared to traditional trials. Given the perpetual nature of a platform trial, adaptive staffing is essential to ongoing success. The HEALEY ALS Platform Trial is designed to evaluate promising drugs in participants with Amyotrophic Lateral Sclerosis. It is conducted under a Master Protocol with each drug added as an appendix (regimen). This trial provides valuable insights into the staffing, resources, and organizational structure necessary to support innovative trial designs.

Hypothesis: Staffing in a platform trial requires thoughtful allocation of responsibilities and resources as well as the flexibility to adapt to evolving operational needs.

Methods: A review of HEALEY ALS Platform Trial clinical trial project management staff roles and responsibilities was conducted for the first five regimens. To support initial launch of the trial in 2020, 54 sites were selected to participate, and contracts were executed with 16 vendors. Clinical trial management activities focused on three concurrent regimens in startup. Key activities included protocol development, document creation, vendor management, trial master file setup, regulatory submissions, site activation, and training. In 2022, the scope expanded with the addition of 21 sites. At this time, clinical trial management was needed for three regimens in closeout, one actively enrolling, and one in design. In addition to the activities outlined above, responsibilities included database lock, results reporting, site closeout, participant recruitment, ongoing site support and training, and new regimen design. Efforts were also made to assess procedures for initial regimens and implement process improvements.

Results: In 2020, six project managers (PMs) were assigned at the time of trial launch. All clinical trial management responsibilities were distributed amongst the six PMs based on availability and bandwidth. In 2022, 20 PMs were working on the trial. Three operational groups were established to distribute the wide range and volume of responsibilities amongst the PMs: Site Management, Vendor Management, and Regulatory/Trial Master File. A detailed breakdown of each group’s responsibilities is provided. The three groups work in tandem to coordinate and communicate project timelines and goals with key stakeholders.

Conclusion: Effective management of the HEALEY ALS Platform Trial requires flexibility to meet the evolving demands of a perpetual platform trial in various stages of the clinical trial life cycle simultaneously. The learnings from this trial can be applied to staffing cross-functional teams, vendors, and sites on innovative clinical trial designs.

CP21-3 – PAIRED MATCHED RANDOMIZATION: THE EFFECT OF INCORPORATING REAL WORLD DATA AND MACHINE LEARNING METHODS IN CLINICAL TRIALS

Primary Author:

1) Ajsi Kanapari (University of Padova)

Co-Author(s):

2) Giulia Lorenzoni (University of Padova)

3) Dario Gregori (University of Padova)

Introduction: Randomized controlled trials (RCTs) are the gold standard for evaluating treatment efficacy and safety. The abundance of real-world data has a role in hypothesis generation and can complement RCTs. In fact, the European Medicines Agency has declared acceptable the use of patient-level historical data in the analysis stage of the primary outcome by adjusting for a prognostic score. We propose a similar approach that operates at the design stage of the trial, specifically during the randomization process, and aims to minimize imbalances and reduce variability. This is a special case of stratified randomization, which avoids issues related to data sparsity and direct use of continuous covariates. We propose Paired Matched Randomization, in which patients with similar profiles are paired together into a stratum and randomized with equal probability, to enhance balance of multiple covariates and reduce within-pair variability.

Aim: A simulation study explores a novel method for integrating historical patient-level data with machine learning (ML) techniques to optimize RCT design. Specifically, we propose using prognostic scores, derived from historical registries, to inform paired matching in the randomization process. This approach aims to reduce variability and enhance efficiency in estimating the average treatment effect while maintaining robust type I error control. A case study on two twin trials is used to demonstrate the gains from such methods.

Methods: Baseline covariates and outcomes from historical data are used to train ML models (e.g., Lasso, Random Forest, XGBoost, Naïve Bayes) to estimate prognostic or predicted response scores for new patients enrolled in an RCT. Patients with similar scores are then paired based on a specified distance, and treatments are randomized within pairs. Analyses of treatment effects account for the paired design using paired t-tests or generalized mixed models. Alternatively, the prognostic score is incorporated as an adjusting covariate in the analysis to assess its impact on statistical power and type I error control. Power and coverage are assessed using equal randomization as the benchmark.
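The pairing and within-pair randomization step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that prognostic scores are already available for all patients at once, pairs neighbors in score order when they lie within a caliper, and randomizes unpaired patients by a simple coin flip. The function name, the `caliper` parameter, and the example scores are all hypothetical.

```python
import random


def paired_matched_randomization(scores, caliper=float("inf"), seed=0):
    """Pair patients with adjacent prognostic scores and randomize within pairs.

    scores:  dict mapping patient id -> prognostic score (hypothetical input).
    caliper: maximum score difference allowed within a pair (assumption).
    Returns (assignments, pairs). Patients without a close enough neighbor
    stay unpaired and are assigned by a coin flip.
    """
    rng = random.Random(seed)
    ordered = sorted(scores, key=scores.get)  # patient ids by ascending score
    assignments, pairs = {}, []
    i = 0
    while i < len(ordered):
        a = ordered[i]
        has_next = i + 1 < len(ordered)
        if has_next and scores[ordered[i + 1]] - scores[a] <= caliper:
            b = ordered[i + 1]
            arms = ["treatment", "control"]
            rng.shuffle(arms)  # each ordering has probability 1/2
            assignments[a], assignments[b] = arms
            pairs.append((a, b))
            i += 2
        else:
            assignments[a] = rng.choice(("treatment", "control"))
            i += 1
    return assignments, pairs


# Usage with made-up scores: p5 has no neighbor within the caliper.
example_scores = {"p1": 0.10, "p2": 0.12, "p3": 0.50, "p4": 0.52, "p5": 0.90}
assignments, pairs = paired_matched_randomization(example_scores, caliper=0.05, seed=1)
```

In a real trial patients enroll sequentially, so pairing would need to be done as patients accrue; the batch version above only shows the matching logic.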

Results: For both continuous and binary outcomes, paired matched randomization increased power and reduced the standard error compared with simple randomization, with GLM performing best for the continuous outcome and Naïve Bayes for the binary outcome. In the latter case, highly prognostic covariates caused power under equal randomization to fall well below the planned level, whereas with paired matched randomization power improved to values close to the nominal one. The case study shows that power is maintained with a smaller set of subjects matched in pairs.

Potential relevance and impact: This work explores the possible gains of integrating ML techniques into RCTs, allowing the use of historical information with the aim of both maintaining power and improving balance, without inflating the type I error.

CP21-4 – A GLOBAL REVIEW OF DECENTRALISED CLINICAL TRIALS: KEY ELEMENTS AND TRENDS

Primary Author:

1) Annie Wright (Imperial College London)

Co-Author(s):

2) Victoria Cornelius (Imperial College London)

3) Otavio Berwanger (The George Institute)

4) Shwetha Rajaram (The George Institute)

Background: The role of clinical trials in advancing medical knowledge is critical, yet traditional trial models face persistent challenges, including slow recruitment, poor retention, higher-than-expected costs, and limited diversity. Decentralized clinical trials (DCTs) address these issues by allowing participants to take part from their homes, leveraging digital technologies to streamline data collection and improve participation. DCTs promise more efficient and patient-centric trial designs, but their adoption varies globally. To better understand which DCTs are being performed and in what areas, we conducted a review to map the current landscape of DCTs, identify leading regions, and explore key elements of their design and implementation.

Methods: We searched Medline, Embase, and Central for published protocols and main papers, with no restriction on publication year and with language restricted to English. Late-phase clinical trials that self-identified as “decentralized” were included, regardless of therapeutic area. Where a study was included, additional related references, such as the protocol, were used for extraction. Regional comparisons assessed global trends and determined which regions are leading the adoption of DCTs. We examined specific design elements of DCTs and how outcomes were collected and analyzed.

Results: Our search identified 4,357 papers, of which 33 articles, reporting 28 unique trials, met the inclusion criteria for identifying as a DCT. Our review revealed that 57.2% of the identified DCTs were conducted in North America, indicating that the region is a global leader in DCT implementation. We identified mental health as the most common disease classification, accounting for 21.4% of the trials. Pharmacological interventions were the most frequently investigated (42.9%), suggesting the feasibility of decentralization even for treatments that may be considered more difficult to implement in the home. Figure 1 shows the use of decentralized elements across included trials. E-consent was implemented in most trials (19/26), reflecting the shift towards digital trial processes. In trials that required physiological or biological sample collection, this was usually performed in patients’ homes (11/12 and 7/11, respectively), further reinforcing the feasibility of the decentralized approach. PROMs emerged as the most common method for collecting outcome data (85.7%), indicating a growing focus on patient-centered data collection and the importance of capturing patient perspectives in trial design.

Conclusions: Our review demonstrates that North America leads DCT innovation. This may be driven by clear regulatory support, such as the FDA’s guidance on DCTs. In contrast, low- and middle-income countries were unrepresented, highlighting the global inequities in access to innovative trial conduct. We believe other regions will learn from trials that have successfully implemented decentralization and investment in the infrastructure. Terms like “patient-centered” or “digital” are used to describe DCTs but do not fully capture their scope. While many trials incorporate some decentralized elements, such as remote data collection, this review focused solely on trials explicitly identifying as decentralized, ensuring intentional use of these elements in their design and conduct. By understanding these global trends, we can help shape the future of clinical trials, making them more accessible, efficient, diverse, and patient-centric worldwide.

SESSION 22

Equity, Diversity, and Inclusion

CP22-1 – DIVERSITY IN CLINICAL TRIALS: HOW ACTIONABLE IS REGULATORY GUIDANCE?

Primary Author:

1) Rebecca Metcalfe (University of British Columbia)

Co-Author(s):

2) Quang Vuong (Core Clinical Sciences)

3) Jay JH Park (McMaster University)

Background: Around the world, regulatory agencies have released guidance to improve diversity in clinical trials with the goal of reducing established disparities in health research and the resulting downstream inequities in health outcomes. For example, in the United States, Congress recently passed legislation mandating the FDA to require diversity action plans for drug submissions. However, the success of these regulatory changes depends on the extent to which they can be actioned. The objective of this work was to map the landscape of regulatory agency guidelines relating to diversity in clinical trials and to evaluate their implementability.

Methods: An online search was conducted to identify English-language guidance on diversity in clinical trials from regulatory agencies. The review scope was further refined by focusing on regulators serving the largest pharmaceutical markets worldwide in order to identify regulations likely to have the highest impact. Key search terms included: “diversity OR representative”, “trial”, and “[regulator]”. Information was extracted on guideline stage of development, the extent to which they could be enforced, and on which underrepresented populations, if any, were prioritized. The Guideline Implementability Appraisal tool (GLIA 2.0), a validated tool for appraising the implementability of clinical guidelines, was adapted to this context and used to assess regulatory guidance. The GLIA 2.0 assesses eight domains that impact implementability: executability; decidability; validity; flexibility; effect; measurability; novelty/innovation; and computability.

Results: We identified four agencies with guidance statements: the European Medicines Agency (EMA); the Food and Drug Administration (FDA); Health Canada; and the United Kingdom’s Medicines and Healthcare products Regulatory Agency (MHRA). Guidance varied in stage of development and enforceability: the EMA and Health Canada required justification if guidance was not followed. Guidance from both the MHRA and the FDA was in draft form only. However, while the FDA guidance would soon be binding, no timeline for the MHRA guidance was available. Prioritized populations differed between the four agencies. Despite being a draft, the FDA guidance was assessed as the most implementable, scoring nine out of nine on GLIA 2.0 global domains.

Conclusions: Recent regulatory initiatives to improve representativeness of clinical trials varied in their stage of development, degree of enforceability, extent of implementability, and the populations they prioritized. Clear implementation pathways are needed to maximize the impact of diversity guidance. Efforts to improve diversity in clinical trials must be sustainable in order to effect lasting change.

CP22-2 – RESULTS MAY VARY: USING SIMULATION METHODS TO IMPROVE DIVERSITY IN CLINICAL TRIALS

Primary Author:

1) Megan O’Sullivan (Queen’s University Belfast)

Co-Author(s):

2) Adele H Marshall (Queen’s University Belfast)

3) Judy Bradley (Queen’s University Belfast)

4) Aiden Flynn (Exploristics)

Background: Regulatory bodies such as the FDA require a series of successful clinical trials for market approval, which makes trials vital for drug development. Several chronic respiratory diseases such as bronchiectasis currently have no licensed treatments, due to a lack of successful clinical trials. Failing to recruit the target number of patients within agreed timelines results in insufficient statistical power, which leads to studies being unable to demonstrate the efficacy of a new treatment. It is possible that new medications that have a role to play in the treatment of these conditions are being disregarded, for reasons that could have been prevented at the design stage of the trial. Clinical trials are designed to have the greatest chance of demonstrating the benefits of a new treatment. This can involve tight inclusion criteria, resulting in a narrow, homogeneous population of patients that reduces variability in response and allows for the detection of a treatment effect. Failure to recruit is in part due to these highly restrictive eligibility criteria, which have a significant impact on the number of patients who can be enrolled in a trial. Trials often exclude patients with more complex needs, such as the elderly and those with multiple comorbidities. In addition, eligibility criteria are often selected based on precedent, as the risks of trial failure are already high and sponsors do not want to introduce additional, unquantifiable risks. This current practice leads to clinical trials that include participants who represent a fraction of the broader population. For example, fewer than 10% of bronchiectasis patients were eligible to participate in some trials.

Aims: This study aims to evaluate the impact of greater diversity in clinical trials using simulation methods, with a view to improving the success rates and generalizability of clinical trials.

Methods: We used the BRONCHUK registry, a database containing information on the broader UK bronchiectasis population. With these data, we extracted information on eligibility criteria and clinical endpoints and used them to iteratively simulate realistic clinical trials covering a range of scenarios.

Results: Using endpoint data from already completed trials, we predicted what the outcomes would have been had the trial included a more diverse population. The simulated trial selected eligible patients, randomized them in a 1:1 ratio to placebo or treatment, and then predicted their outcome. At each iteration, an increasing proportion of patients from a previously excluded group was added to understand how the addition of these groups affects the performance of the study. Performance was assessed in terms of the probability of detecting a treatment effect, the ability to recruit patients, and the study size and duration.
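The iterative simulation described above can be illustrated with a small Monte Carlo sketch. It is not the authors' model and does not use registry data: outcomes are drawn from normal distributions, the effect sizes and standard deviations for the original and previously excluded groups are made-up placeholders, and detection is judged by a two-sided z-test at the 5% level. All names and default values are assumptions.

```python
import random
import statistics
from math import sqrt


def simulate_power(prop_excluded, n_per_arm=100, n_sims=500, seed=0,
                   effect_core=0.5, effect_excluded=0.3,
                   sd_core=1.0, sd_excluded=1.5):
    """Estimate the probability of detecting a treatment effect.

    prop_excluded is the expected share of enrolled patients drawn from a
    previously excluded group, which is assumed here (hypothetically) to
    have a smaller treatment effect and a more variable response.
    """
    rng = random.Random(seed)
    detected = 0
    for _ in range(n_sims):
        arms = [0] * n_per_arm + [1] * n_per_arm
        rng.shuffle(arms)  # 1:1 randomization to placebo (0) or treatment (1)
        outcomes = {0: [], 1: []}
        for arm in arms:
            from_excluded = rng.random() < prop_excluded
            effect = effect_excluded if from_excluded else effect_core
            sd = sd_excluded if from_excluded else sd_core
            outcomes[arm].append(rng.gauss(arm * effect, sd))
        diff = statistics.mean(outcomes[1]) - statistics.mean(outcomes[0])
        se = sqrt(statistics.variance(outcomes[1]) / n_per_arm
                  + statistics.variance(outcomes[0]) / n_per_arm)
        if abs(diff / se) > 1.96:  # two-sided test at the 5% level
            detected += 1
    return detected / n_sims


# Usage: power as the share of previously excluded patients increases.
power_curve = {p: simulate_power(p, n_sims=200) for p in (0.0, 0.5, 1.0)}
```

A fuller version would also model recruitment rate and study duration, which the abstract lists among the performance measures.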

Conclusion: This study shows how simulation can be used as a cost-effective method moving forward to inform the designs of future clinical trials. This will hopefully lead to higher success rates in clinical trials, especially in terms of efficacy, recruitment and diversity.

CP22-3 – DEVELOPING METHODS WE TRUST: A WORKSHOP AND VIDEO PROJECT WITH LICTR AND THE CULTURALLY DIVERSE HUB

Primary Author:

1) Liam Bishop (University of Leeds)

Co-Author(s):

2) Delia Muir (University of Leeds)

3) Abigail Olaleye (Culturally Diverse Hub, Voluntary Action Leeds)

4) Pei Loo Ow (University of Leeds)

5) Asif Farooqui (University of Leeds)

6) Holly Schofield (University of Leeds)

7) Laurie Cave (University of Leeds)

8) Aminah Malik (University of Leeds)

Background: Ethnically diverse communities are under-served in clinical trials research. This means these communities are potentially not benefiting from the findings of clinical trials, or that treatment harms of interventions tested in clinical trials are under-reported. Working with communities in a co-productive manner on video communications is a way to develop trust. The Culturally Diverse Hub, a community hub based in Leeds, provides opportunities for ethnically diverse communities to meet service providers and shape research projects. We wanted to explore their communities’ perceptions of clinical trials and understand how to build a more inclusive research culture through a workshop and the development of videos. We then wanted to use the videos in staff workshops.

Methods: To gain trust initially, we engaged with the Hub for over two years, listening to members of the organization. Based on discussions with the Hub, we decided to fund a workshop and video project. The facilitation methods used were an adapted version of the world café style, an inclusive method that encourages conversation to create constructive dialogue around critical questions. In the workshop, we discussed participation barriers, people’s personal experiences and perceptions of clinical trials, as well as film style and content. We held the workshop at a trusted community venue in the evening to widen inclusion. We then held workshops with staff within our institute, using a method similar to that of the community workshop.

Results: Twenty people, with a range of ages and ethnicities, attended the workshops. Five videos, on the community’s knowledge and perceptions of clinical trials and how to build trust with the community, were developed as a result. Notes from the facilitation exercises were written into a report and shared with attendees, who were invited to comment on the report. Findings from the workshop included: how historical examples of research malpractice impact trust; the need to understand community health priorities; and the impact of health and structural inequalities within healthcare. Three workshops were held with staff with a range of different job roles. The findings from the staff workshops included drivers (e.g. costing inclusion plans at grant application) and barriers (fears and anxieties over conduct) that staff think will help or hinder the delivery of more diverse research.

Discussion: The workshops and the development of the videos have helped develop trust with this community. Trust was not empirically measured; this was an engagement project. We have not only understood potential barriers to more ethnic diversity in clinical trials, but also identified methods of engagement with an ethnically diverse community for other researchers to consider. We have also done this with our staff, based on the outputs developed with the community. Overcoming some of these barriers and enacting the drivers is the next challenge, and we will share ways in which we are trying to do that. We will also report on the effectiveness of the engagement method with both staff and the community.

CP23-4 – CONSENT TO ACCESS AND USE HEALTH SYSTEMS DATA FOR TRIALS – WORKING TOGETHER WITH PATIENTS AND THE PUBLIC TO FIND APPROPRIATE LANGUAGE FOR PARTICIPANT-FACING MATERIALS (CROSSWORD)

Primary Author:

1) Kate Roberts (University College London)

Co-Author(s):

2) Sharon B Love (University College London)

3) Emma L Turner (University of Bristol)

4) Matthew R Sydes (NHS England)

5) Macey L Murray (University College London)

Introduction: In the UK, the wealth of data generated by interactions with the National Health Service (NHS) offers a great opportunity to transform the way researchers conduct randomized clinical trials (RCTs). Using this routinely collected health systems data (HSD) in trials could increase the diversity of participation in research. However, gaining the necessary consent to access and link to HSD for trials has caused challenges for many researchers. Interpretation of appropriate participant consent has changed over time due to shifting legal frameworks resulting in strengthened guidelines for disclosure of patient health information, with complex and differing requirements for consent. To maintain trust, we must ensure our communication is appropriate for patients and the public. The simplification of consent language for trials could increase diverse enrolment, as well as ensuring people are truly ‘informed’. We have conducted a review of the language used in participant-facing consent materials for trials which have successfully linked to health data in the UK, to explore the language demands of consent statements related to data linkage. We will also conduct focus groups with patient advocates and the public, working together to improve how we communicate more complex aspects of trials to participants.

Methods: We identified RCTs by searching public records of popular national administrative and health data resources. Eligible RCTs accessed HSD between 2022 and 2024 to inform the study for any purpose. Participant-facing materials (information sheets, consent forms and privacy notices) for identified studies were collected from publicly available sources and by contacting trial teams. Consent statements referring to linkage to HSD were extracted and reviewed for information provided to trial participants about the intended use of their data. Textual analysis and corpus linguistics were used to explore the readability and complexity of the consent statements. We have recruited a diverse and representative group of patients and members of the public to explore their views on the suitability of the wording in focus groups taking place in December 2024.
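As an illustration of what a readability check on a consent statement can look like, the sketch below computes the Flesch Reading Ease score. The abstract does not state which readability measures were used, so the choice of this formula is an assumption, and the syllable counter is a rough vowel-group heuristic rather than a dictionary-based count. Both example sentences are invented.

```python
import re


def count_syllables(word):
    """Rough heuristic: count groups of consecutive vowels (at least one)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text):
    """Flesch Reading Ease: higher scores indicate easier text.

    score = 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)
    """
    words = re.findall(r"[A-Za-z']+", text)
    if not words:
        raise ValueError("text contains no words")
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / sentences) - 84.6 * (syllables / len(words))


# Usage with two invented consent statements.
plain = "We will look at your health records."
dense = ("Authorisation is requested for longitudinal linkage to "
         "administrative healthcare datasets.")
plain_score = flesch_reading_ease(plain)
dense_score = flesch_reading_ease(dense)
```

Scores from a heuristic syllable counter are only indicative; they are best used to compare statements against each other rather than against published thresholds.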

Results structure and timelines: Review results show that current consent statements related to data access do not comply with the readability level recommended for patient communication materials. We will present results from a thematic analysis of public focus group views on the language currently used, and propose co-developed examples of consent wording for accessing confidential patient records for RCTs. Analysis is to conclude by March 2025.

Relevance and impact: Given the potential efficiency gain of using data held in health systems in trials, it is vital that we increase transparency in the requirements for continuous participant consent. This work will contribute to greater clarity in the requirements for consent language to successfully access HSD from UK registries and have wider applications internationally for how we communicate complex information to trial participants. Appropriate consent wording should also meet the needs of participants, for truly informed consent. This work contributes to the wider CrossWord project, continuing to work with patients and the public to co-develop guidance for appropriate consent language to enable data access for trials.

POSTER PRESENTATIONS

P-1 – SHARING CLINICAL TRIAL DATA IN LIMITED-ACCESS REPOSITORIES: AN EXAMPLE FROM NEURONEXT

Primary Author:

1) Anna Gudjonsdottir (University of Iowa)

Sharing clinical trial data in public repositories can be beneficial for advancing clinical research in a cost-effective way by leveraging existing data for exploratory analyses and meta-analyses that can generate new lines of research. Regulatory bodies increasingly mandate incorporating data sharing into early trial planning, and initiatives from scientific publishers, like the recommendations from the International Committee of Medical Journal Editors, require a data sharing statement in order to publish trial results in their journals. While the importance of data sharing is well understood within the scientific community to promote transparency, the practicalities of navigating the regulatory and ethical guidelines involved with data sharing can be complex. This presentation outlines the process of sharing clinical trial data in the limited-access National Institute of Neurological Disorders and Stroke (NINDS) and National Institute of Mental Health Data Archive (NDA) repositories using examples from NeuroNEXT studies to illustrate the key steps of data sharing. Clinical trials from NeuroNEXT will be used to demonstrate the necessary actions required to prepare and submit data. There are some common pitfalls that can be avoided through early planning before the trial starts, including reviewing current regulatory standards, ensuring participants are properly consented on what will be shared, and selecting an appropriate public repository. Relevant regulations will be detailed, like the latest NIH Policy for Data Management and Sharing that became effective on January 25, 2023. After the trial has concluded, data must be properly deidentified to protect participant privacy, quality control must be ensured on shared datasets, detailed metadata must be prepared to accompany the shared data, and data must be submitted in accordance with the repository’s guidelines. 
Different data repositories can have their own set of requirements and processes, but regardless of which repository is selected, providing a set of standardized documentation on things like the trial design, outcome measures, and the data dictionary can ensure that data follow the FAIR Principles. Similarities and differences in requirements between the NINDS, NDA, and other HEAL-compliant repositories will be highlighted. Finally, the details of who can access the data in the repositories, and under what circumstances access will be granted, will also be outlined. This real-world example provides practical guidance on the best practices for researchers and clinicians to use when navigating the clinical trial data sharing process.

P-2 – AUTOMATED MONITORING AND ALERTS FOR CLINICAL TRIAL OPERATIONS

Primary Author:

1) Donglin Yan (University of Kentucky)

Co-Author(s):

2) Heidi Weiss (University of Kentucky)

3) Brent Shelton (University of Kentucky)

Background: Clinical trials are complex interdisciplinary projects that require close collaborations across different functional teams. Some trials, particularly those involving adaptive actions triggered by specific events or enrollment progress, demand swift responses to ensure patient safety and protocol compliance. These actions, as outlined in the study protocol, must be executed promptly by the relevant teams. We present a suite of programs to monitor and alert study teams about critical events in a tobacco cessation trial (TTOP study). These programs are designed to continuously monitor the electronic data capture system and issue alerts when predefined events occur.

Methods: We enhanced an existing system that uses SAS programs and Oracle views to automate notifications and track trial milestones. We tested and implemented this system on the TTOP study. We customized this system to retrieve and analyze data from standard and tailored eCRFs via Oracle views. The program is composed of four key components: data retrieval, analysis, notification, and automation. Specific events, including serious adverse events (SAEs), alcohol use disorders, and suicidal ideation, trigger automated notifications to the study team. The system runs daily on a UNIX server using CRON to manage scheduled tasks.
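The four components named above (data retrieval, analysis, notification, automation) can be illustrated with a minimal sketch. The authors' system uses SAS programs and Oracle views; the Python below is only an illustration of the analysis and notification logic. The field names (`sae`, `aud_flag`, `suicidal_ideation`), the `Alert` type, and the rule table are all hypothetical, and the records are assumed to have been retrieved already from the data capture system.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Alert:
    participant_id: str
    event: str
    detected_on: date


# Hypothetical rules: each maps an event name to a check on one eCRF record.
RULES = {
    "serious adverse event": lambda rec: rec.get("sae") == 1,
    "alcohol use disorder": lambda rec: rec.get("aud_flag") == 1,
    "suicidal ideation": lambda rec: rec.get("suicidal_ideation") == 1,
}


def detect_events(records, already_notified, today):
    """Return new alerts and remember them so a daily rerun does not repeat them."""
    alerts = []
    for rec in records:
        for event, rule in RULES.items():
            key = (rec["participant_id"], event)
            if rule(rec) and key not in already_notified:
                alerts.append(Alert(rec["participant_id"], event, today))
                already_notified.add(key)
    return alerts


# Example run on made-up records; a scheduler such as cron would call this daily.
records = [
    {"participant_id": "P001", "sae": 1, "aud_flag": 0, "suicidal_ideation": 0},
    {"participant_id": "P002", "sae": 0, "aud_flag": 0, "suicidal_ideation": 0},
]
notified = set()
first_run = detect_events(records, notified, date(2024, 1, 1))
second_run = detect_events(records, notified, date(2024, 1, 2))
```

Tracking which participant and event pairs have already been notified is one possible design choice; it keeps a daily scheduled run from sending the same alert repeatedly.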

Result: Our system has successfully detected important events during this study, and the time to action has been shortened. This system can be easily adapted for other electronic data capture systems.

P-5 – USE OF A CLOUD-BASED SYSTEM FOR CENTRAL REVIEW OF DE-IDENTIFIED ECHOCARDIOGRAMS IN CLINICAL TRIALS

Primary Author:

1) Arlene Ruiz (The George Washington University)

Co-Author(s):

2) Grecio Sandoval (The George Washington University)

3) Madeline Rice (The George Washington University)

4) Greg Strylewicz (The George Washington University)

5) Scott Evans (The George Washington University)

In the past, when eligibility or outcomes for a multicenter trial were determined by the review of patient imaging, it may have been difficult to have images centrally reviewed, leading to greater variability of local diagnoses within the included population. With the increased use of data sharing platforms and digital methods to de-identify patient data, cloud-based platforms can be used to facilitate an efficient and precise image review and adjudication process. CORD-CHD (CORD Clamping Among Neonates with Congenital Heart Disease) is a multi-center randomized clinical trial that aims to determine the optimal timing of umbilical cord clamping in neonates born with congenital heart disease (CHD) to improve short-term postnatal and longer-term neurodevelopmental outcomes. Eligible participants have a cardiac lesion that is expected to require neonatal cardiac intervention. Critical to this trial is the eligibility determination of a Fetal Cardiovascular Disease Severity Score based on ECHO imaging review. The final adjudication is performed by a central reviewer, the CORD-CHD Fetal ECHO Core (FEC), to reduce misclassification bias and to allow for a more precise and standardized method of determining eligibility for the clinical trial. Herein, we present an approach and framework for centrally reviewing ECHO images to aid screening eligibility for a clinical trial, and report on barriers encountered and their resolutions. During screening, CORD-CHD sites upload a locally-captured ECHO to the cloud-based system, where images are managed by the data coordinating center (DCC) and sent to the FEC. During upload, the PHI embedded in the ECHO file is removed systematically, and users are able to black out PHI stamped on the image itself with the use of a pixel de-identification tool (PDIT). Use of this cloud-based platform allows sites without local de-identification methods to securely send their images to the central reviewer.
Barrier: Originally, the FEC were to review the images on the cloud-based system. Due to the size of the images—in some cases almost 3GB—the FEC spent one hour loading and reviewing each image. Resolution: The DCC and the FEC worked with the cloud-based system host to create a protected location for the files to be reviewed by the FEC locally. The FEC now usually spends about 10-15 minutes completing reviews. Barrier: Upload times for users at the clinical centers were exceptionally long; e.g., two hours. Resolution: The DCC educated users and strongly recommended sites only upload locally, rather than over a VPN; after changing their process to avoid uploading over VPN, the majority of sites now need approximately 25 minutes to complete their upload. Use of the PDIT increased upload times substantially, so the DCC recommends local de-identification of images, if possible. Barrier: The PDIT may be used incorrectly, where PHI stamped on the image is still visible after upload. Resolution: Use of the PDIT required detailed one-on-one training and practice. Conclusion: Despite early-stage challenges regarding successful de-identification of ECHO images, the cloud-based system is a valuable platform to use when sharing large and sensitive files with a central reviewer.

P-7 – IMPROVING DATA COLLECTION EFFICIENCY: INTEGRATION OF INTERNAL HOSPITAL DATABASES WITH REDCAP

Primary Author:

1) Marissa Dulas (Boston Children’s Hospital)

Co-Author(s):

2) Mark Bailey (Boston Children’s Hospital)

3) Nicola Maschietto (Boston Children’s Hospital)

Background: Data extraction between the electronic health record and electronic data capture systems for clinical research is often a manual task, which can introduce human error into data collection. These errors may have far-reaching consequences, including altering statistical analysis and study conclusions. In addition to potential human error, manual data collection can also present a significant time commitment for research staff. We sought to facilitate more efficient and accurate data transfer processes into a database characterizing a large study population of patients receiving transcatheter pulmonary valve replacements (TPVRs) by mapping data from an internal hospital data warehouse to the REDCap database.

Methods: After receiving IRB approval, electronic data capture forms were developed in an internal REDCap project with over 100 relevant study variables for TPVR outcomes. Research Assistants (RAs) identified a list of corresponding variables from HeartCenter360 (HC360), a departmental data warehouse containing clinical information for patient procedures, imaging exams (Echocardiogram, Computed Tomography, Magnetic Resonance Imaging) and Exercise Stress Tests. Using clinical expertise from the study’s Principal Investigator (NM), each REDCap variable was manually reviewed and matched to its corresponding HC360 variable code, if available. Extracts of a subset of patient data from HC360 were validated by crosschecking clinical records. Subsequently, a Hospital Data Specialist (MB) automated the workflow to extract, map, and format patient data from the data warehouse for REDCap import. In this workflow, a Python script connects to the HC360 data warehouse (Snowflake instance) and maps each variable to its corresponding REDCap field via the application programming interface. By updating the column labels for the HC360 elements to match the REDCap project variable names, the Python script creates formatted Excel CSV files that RAs can review and import into REDCap using the data import tool. Although data import could become fully automated, using the Excel file and manual import process allows for comprehensive review as an additional layer of data validation.
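The relabeling step described above, in which warehouse column labels are renamed to REDCap variable names and written to a CSV file for manual review and import, can be sketched as follows. This is not the authors' script: the mapping entries and variable names are hypothetical, and the warehouse extract is assumed to be already available as a list of dictionaries (the Snowflake connection is not shown).

```python
import csv
import io

# Hypothetical mapping from warehouse column labels to REDCap variable names.
HC360_TO_REDCAP = {
    "PATIENT_ID": "record_id",
    "PROC_DATE": "cath_date",
    "VALVE_TYPE": "valve_type",
}


def to_redcap_csv(rows, mapping):
    """Rename mapped columns, drop unmapped ones, and return CSV text for import."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(mapping.values()), lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Keep only columns that have a REDCap counterpart.
        writer.writerow({mapping[col]: val for col, val in row.items() if col in mapping})
    return buffer.getvalue()


# Made-up extract with one column that has no REDCap counterpart.
extract = [
    {"PATIENT_ID": "1001", "PROC_DATE": "2023-05-04", "VALVE_TYPE": "A", "INTERNAL_NOTE": "x"},
]
csv_text = to_redcap_csv(extract, HC360_TO_REDCAP)
```

Writing a file for review rather than pushing values directly mirrors the manual import step the abstract describes as an additional layer of data validation.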

Results: Our mapping process can reduce the burden of manual data entry on research employees and facilitate efficient data collection. The new mapping infrastructure allows data to flow from the departmental data warehouse to REDCap for an initial data import with periodic updates as necessary. To date, 678 patient records have been uploaded to the REDCap with pre-catheterization, implant, and follow-up data. In the future, over 1,000 patient records will be included in the REDCap project to analyze and compare the safety and efficacy of various transcatheter pulmonary valves. After implementing this process in a large REDCap project, RAs stated that they felt more confident in the accuracy of numerical data and reported that mapping saved time and energy. On average, manual data entry for one patient would take 1.5-2 hours without the new mapping infrastructure. With mapping, the average time for data entry was reduced by half. For the 58 patients with completed data, approximately 43.5 hours were saved on manual data entry. For a 1,000-patient project, this mapping strategy has the potential to reduce manual data entry time by 750 hours.
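The time-saving figures above can be reproduced with simple arithmetic. The reported totals (43.5 and 750 hours) correspond to the lower bound of 1.5 hours of manual entry per patient with a 50% reduction; the sketch below makes that assumption explicit, and the function name is illustrative only.

```python
def hours_saved(n_patients, manual_hours_per_patient=1.5, reduction=0.5):
    """Hours of manual entry avoided, assuming a fixed per-patient time and reduction."""
    return n_patients * manual_hours_per_patient * reduction


saved_58 = hours_saved(58)       # 58 patients at 1.5 h each, halved
saved_1000 = hours_saved(1000)   # projected 1,000-patient project
saved_1000_upper = hours_saved(1000, manual_hours_per_patient=2.0)  # upper bound of 2 h
```

Using the upper bound of 2 hours per patient instead would give proportionally larger savings, so the reported totals can be read as conservative under these assumptions.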

Conclusions: Data entry for single-center clinical research studies and data registries can be an arduous task for clinical research staff. As such, mapping internal hospital databases to data collection tools, such as REDCap, can effectively reduce the burden of manual data entry and provide more accurate data.

P-9 – SAMPLE SIZE IN CROSSED-DESIGN SURGICAL TRIALS: ARE WE IGNORING THE NON-IGNORABLE? EXPLORING THE EFFECTS OF TREATMENT HETEROGENEITY, RECRUITMENT RATES, AND DIFFERENT OPERATING-CHARACTERISTICS ON SAMPLE SIZE

Primary Author:

1) Callum Vyner (University of Leeds)

Co-Author(s):

2) Neil Corrigan (University of Leeds)

3) Gemma Ainsworth (University of Leeds)

4) Professor Deborah Stocken (University of Leeds)

Sample size calculations are a crucial design and cost feature of clinical trials. Compared to drug trial designs, surgical trials have additional design factors to consider that could affect sample size. Surgeon experience and skill, correlated observations from patients receiving treatment from the same surgeon, the interaction between surgeon and treatment, and the learning of a new surgical intervention are often overlooked in trial design. Current guidance for the design of surgical trials does not consider all these parameters. A crossed design is where each surgeon performs all levels of the intervention, and patients are randomized to an intervention. According to current guidance, when we consider the main effect of surgeon in a crossed design, the correlation between observations can only be beneficial to power. However, this previous work was without consideration of the aforementioned relevant factors. When performing analyses that use a model to account for them, there is the potential for a reduction in power, and also the potential that these models may not converge for a specified sample size. We have explored the effects of using more exhaustive models on sample size requirements to inform and extend current guidance on when these quantities should or should not be ignored. We focus on exploring the effect on power of the following: including a separate random effect for each treatment level, varying the correlations between the random effects, and varying recruitment rates across the trial duration between surgeons for a binary endpoint, such as surgical complication. Simulation of each of the above scenarios is used to estimate the required sample size for a desired power and/or estimate power at a specified sample size. The ROLARR trial of robotic surgery was used to inform the creation of data-generating models that are representative of the real world.
We report the power and type-I error when fitting a model ignoring surgeon-treatment interaction when it exists; the effect of uneven recruitment on power; the reduction in power when you do account for surgeon-treatment interaction with varying values of correlation between surgeon-treatment interaction and main surgeon effect; and finally, the convergence rates of these models. We highlight proposed amendments to current guidance.
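The kind of simulation described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' data-generating model: each surgeon has a normally distributed main effect and an independent surgeon-by-treatment interaction on the log-odds scale, each surgeon treats the same number of patients in each arm, and the analysis is a naive pooled two-proportion z-test that ignores surgeon entirely. All parameter values and function names are hypothetical.

```python
import math
import random


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def simulate_trial(rng, n_surgeons, n_per_arm, beta0, beta1, sd_surgeon, sd_interaction):
    """Simulate one crossed-design trial; return event counts per arm and arm size."""
    events = [0, 0]
    for _ in range(n_surgeons):
        u = rng.gauss(0.0, sd_surgeon)       # surgeon main effect (log-odds)
        v = rng.gauss(0.0, sd_interaction)   # surgeon-by-treatment interaction (log-odds)
        for arm in (0, 1):
            p = expit(beta0 + u + arm * (beta1 + v))
            events[arm] += sum(rng.random() < p for _ in range(n_per_arm))
    return events, n_surgeons * n_per_arm


def naive_z_test_rejects(events, n, z_crit=1.959964):
    """Pooled two-proportion z-test at the two-sided 5% level, ignoring surgeon."""
    pooled = (events[0] + events[1]) / (2 * n)
    se = math.sqrt(2 * pooled * (1 - pooled) / n)
    return se > 0 and abs(events[1] / n - events[0] / n) / se > z_crit


def rejection_rate(n_sims, seed, **design):
    """Proportion of simulated trials in which the naive test rejects."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        events, n = simulate_trial(rng, **design)
        hits += naive_z_test_rejects(events, n)
    return hits / n_sims


# Average log-odds treatment effect set to zero; only the interaction SD differs.
design = dict(n_surgeons=10, n_per_arm=20, beta0=math.log(0.2 / 0.8), beta1=0.0, sd_surgeon=0.5)
rate_no_interaction = rejection_rate(400, seed=1, sd_interaction=0.0, **design)
rate_with_interaction = rejection_rate(400, seed=1, sd_interaction=1.0, **design)
```

Comparing the two rejection rates shows how an ignored surgeon-by-treatment interaction can inflate the rejection rate of an analysis that does not model it. A fuller study along the lines of the abstract would fit mixed-effects models, allow correlated random effects, and vary recruitment across surgeons.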

P-10 – A STRATEGIC PLANNING CHECKLIST FOR EXECUTION OF FOLLOW-UP IN CLINICAL TRIALS FOR CRITICALLY ILL CHILDREN

Primary Author:

1) Caitlin Long (Boston Children’s Hospital)

Co-Author(s):

2) Krislyn M Boggs (Boston Children’s Hospital)

3) Anjali Sadhwani (Boston Children’s Hospital)

4) Peta MA Alexander (Boston Children’s Hospital)

5) David C Bellinger (Boston Children’s Hospital)

6) Daniel Gagner (Boston Children’s Hospital)

7) Daniel P Kelly (Boston Children’s Hospital)

8) Megha Shrivastava (Boston Children’s Hospital)

9) Lamia Sun (Boston Children’s Hospital)

10) Ravi R Thiagarajan (Boston Children’s Hospital)

Background: Most clinical trials include a follow-up component to assess the impact of a trial intervention. Coordinating in-person follow-up in trials of critically ill pediatric participants presents multiple challenges. For example, families of these participants often face logistical challenges and additional stress compared with families of those without critical illness. As part of the multicenter randomized Trial of Indication-based Transfusion of Red Blood Cells in ECMO (TITRE), a pediatric critical care trial, families return to Boston Children’s Hospital (BCH) 12 months after enrollment for a 2-hour, in-person neurodevelopmental (ND) assessment. This evaluation informs one of the primary TITRE endpoints, so it is critical for every participant (age 1 to 7 years) to return for follow-up. We developed a checklist tool to guide TITRE Study Coordinators on successful coordination of this 12-month visit.

Methods: The checklist was developed in June 2024. It was refined based on feedback from other research team members, including a member of the TITRE ND Core who endorsed it as a tool for best practice. The checklist includes sections on pre-visit communications with the family (e.g., sending an IRB-approved email reminder), local site preparations (e.g., preparing a folder with physical copies of each ND instrument), check boxes to ensure all tests and questionnaires are properly administered, and instructions for post-visit tasks. Many families travel long distances to return for the visit. Thus, organization is key to not missing any visit components.

Results: We have implemented this checklist for the TITRE Trial at BCH. As of October 31, 2024, 3 out of 3 BCH participants have returned for their 12-month ND assessment. These assessments were completed in full, through the post-visit components (i.e., remuneration, scoring ND instruments, and delivery of findings). As intended, the checklist ensured that 1) the coordinator communicated effectively with each family in advance of the visit, so that the family understood what to expect from the visit, thus increasing the likelihood that they would return; 2) each visit ran efficiently, collecting all of the required endpoint data while also avoiding prolongation of the visit for these young children and their families; and 3) post-visit components proceeded smoothly. By following the preparation steps of the checklist, the three visits each lasted under 1.5 hours and all required components were fully completed. The checklist has also helped by advising on scenarios that may affect data integrity. For example, if the participant was recently admitted to a hospital, the checklist advises that the coordinator contacts the ND assessor and TITRE ND Core to adjudicate whether the visit should be rescheduled.

Conclusion: The stepwise checklist assisted in 100% completion (3 out of 3) of the 12-month follow-up visits among surviving TITRE participants at BCH to date. The robust checklist supports efficiency and allows for flexibly addressing all the unique situations each BCH TITRE family faces. We expect that this checklist can be adapted for other participating TITRE sites, as well as modified for use across other critical care trials.

P-11 – A PRACTICAL SOLUTION TO MINIMISE SELECTION AND CHRONOLOGICAL BIAS IN RCTS

Primary Author:

1) Chukwuemeka Emele (University of Aberdeen)

Co-Author(s):

2) Mark Forrest (University of Aberdeen)

3) Graeme MacLennan (University of Aberdeen)

Background: In clinical trials involving treatment comparison, reliable randomization is crucial to reduce selection bias, enhance statistical validity, and promote similarity of treatment groups regarding known and unknown confounders. While these controlled trials strive to achieve balance across groups and reduce selection bias, these two objectives often conflict. Minimization is commonly used to achieve balance in patient groups and prognostic factors. However, there are concerns regarding the predictability of treatment allocations and potential selection bias. Just like permuted blocks, minimization alone increases the chance that allocation may be predictable, which could lead to biased estimates of treatment effect and misleading conclusions. The maximal procedure is a restricted randomization method that maximizes the number of feasible allocation sequences under the constraints of the maximum tolerated imbalance (MTI) and the allocation sequence length. By their nature, the maximal procedure and its variants do not strictly address the selection bias question, as they use a fixed MTI throughout (which makes them deterministic), so the next treatment allocation can be easily predicted when the limit of the MTI has been reached. Objective: To develop a sliding window maximal procedure randomization which achieves balanced treatment arms and prognostic factor distribution and offers better capability to reduce determinism in treatment allocations. Our sliding window maximal procedure addresses treatment determinism and thereby helps to minimize selection and chronological bias in terms of selective enrolment of patients into a trial.
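The determinism that arises under a fixed MTI can be illustrated with a small sketch of a maximal-procedure-style draw for two arms. It samples one sequence uniformly from all sequences of even length `n` whose running imbalance never exceeds `mti` and which end balanced, and flags the allocations that were forced. This is the fixed-MTI baseline only, written under those assumptions; it is not the authors' sliding window variant, whose details are not given in the abstract, and the function names are illustrative.

```python
import random
from functools import lru_cache


def maximal_procedure(n, mti, rng):
    """Draw one two-arm sequence uniformly from all feasible sequences.

    Feasible means: length n (even), |#A - #B| <= mti throughout, balanced at the end.
    Returns the sequence and, per allocation, whether it was forced (predictable).
    """
    @lru_cache(maxsize=None)
    def count(step, diff):
        # Number of feasible completions after `step` allocations with imbalance `diff`.
        if abs(diff) > mti:
            return 0
        if step == n:
            return 1 if diff == 0 else 0
        return count(step + 1, diff + 1) + count(step + 1, diff - 1)

    sequence, forced, diff = [], [], 0
    for step in range(n):
        ways_a = count(step + 1, diff + 1)
        ways_b = count(step + 1, diff - 1)
        forced.append(ways_a == 0 or ways_b == 0)
        # Choosing each arm in proportion to its completions gives a uniform draw.
        if rng.random() < ways_a / (ways_a + ways_b):
            sequence.append("A")
            diff += 1
        else:
            sequence.append("B")
            diff -= 1
    return sequence, forced


def forced_fraction(n, mti, n_draws, seed):
    """Average share of allocations that were fully predictable given the history."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_draws):
        _, forced = maximal_procedure(n, mti, rng)
        total += sum(forced)
    return total / (n_draws * n)


seq, forced = maximal_procedure(12, 2, random.Random(1))
share_tight = forced_fraction(12, 1, 200, seed=2)
share_loose = forced_fraction(12, 3, 200, seed=2)
```

In this sketch a tighter MTI yields a larger share of forced allocations, which is the predictability concern the abstract raises for fixed-MTI procedures.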

Methods: The sliding window maximal procedure randomization was implemented and deployed in the recruitment phase of a pragmatic multicenter randomized controlled trial. Treatments needed to be balanced across prognostic factors and treatment arms and within centers in the multicenter setup. Outcome Measures: Primary outcome measures included treatment balance across arms, prognostic factors, and reduced determinism in unblinded treatment allocations.

Results: Initial trials, including a mental health trial, where the sliding window maximal procedure was integrated into the recruitment process, showed decreased determinism in the assignment of treatment to eligible and consented patients. This was attributed to the MTI limit at each iteration being dynamically and randomly generated, leading to a reduction in predictability. Consequently, the sliding window maximal procedure effectively manages chronological bias and selection bias and ensures the validity of statistical tests, all while promoting a favorable balance within patient groups and prognostic factors. The sliding window maximal procedure proved particularly effective in ensuring balance (or near-balance) across patient groups and prognostic factors, while making it more difficult to correctly predict treatment allocations. This decreased determinism in treatment allocation underlines the sliding window’s effectiveness in promoting and enhancing research integrity in clinical trials.

Conclusion: The sliding window maximal procedure introduces a promising method for clinical trial randomization by increasing allocation unpredictability while maintaining treatment balance. Future trials will benchmark this approach against established randomization algorithms to assess its broader applicability and impact on trial integrity and efficiency.

P-12 – HOW TO ACCOUNT FOR SUBSEQUENT ANTI-CANCER THERAPY WHEN ANALYSING OVERALL SURVIVAL: A SIMULATION STUDY AND PRACTICAL EXAMPLE

Primary Author:

1) Kara-Louise Royle (University of Leeds)

Co-Author(s):

2) David Meads (University of Leeds)

3) Jennifer Visser-Rogers (Coronado Research)

4) David A Cairns (University of Leeds)

5) Ian R White (MRC Clinical Trials Unit at UCL)

The cancer clinical trial community, including the FDA, acknowledges that the uptake of subsequent anti-cancer therapies can influence the interpretation of overall survival in phase III confirmatory trials. Nevertheless, overall survival remains the gold standard definitive endpoint. It is used alongside other evidence in health technology appraisals to evaluate the performance of experimental interventions, where a limitation is often stated to be the unquantifiable impact of subsequent anti-cancer therapies. In a bid to disentangle the uncertainty concerning subsequent anti-cancer therapies, the objective of this work was to investigate the novel hypothetical question: “What is the experimental trial intervention effect on overall survival compared to the control intervention, if all participants who discontinued prior to death went on to receive the same subsequent anti-cancer therapy?” We will discuss how existing statistical techniques, including the simple (intention-to-treat and per-protocol) and the complex (two-stage and inverse probability of censoring weighting), performed when applied to the hypothetical question in an extensive simulation study. We considered a variety of scenarios, including variations of the true experimental intervention effect and the timing of intervention discontinuation. We evaluated the different methods in terms of bias, coverage, and power. We will demonstrate the practicability of the methods through a case study. The case study will consider the analysis of overall survival in a non-inferiority trial in kidney cancer. In the trial, some participants stopped their trial intervention and began immunotherapy during follow-up. This case study was chosen as second-line immunotherapy was not standard of care for this population at trial outset. As more and more effective therapies are made available to patients, this is likely to be a scenario many in the cancer clinical trial community will experience.
We will discuss the generalizability of the novel hypothetical question, how it improves the interpretation of clinical trials results, and the necessary considerations when analyzing overall survival in such situations.

P-14 – REFINING DATA PRESENTATION IN DATA AND SAFETY MONITORING BOARDS REPORTS: AN EXAMPLE FROM A CLINICAL TRIAL EVALUATING ANTIFUNGAL THERAPY IN PEDIATRIC UNCOMPLICATED CANDIDEMIA

Primary Author:

1) Qihang Wu (The George Washington University Biostatistics Center)

Co-Author(s):

2) Yixuan Li (The George Washington University Biostatistics Center)

3) Lijuan Zeng (The George Washington University Biostatistics Center)

4) Brian T Fisher (Children’s Hospital of Philadelphia)

5) William J Steinbach (Arkansas Children’s Hospital)

6) Scott R Evans (The George Washington University Biostatistics Center)

7) Toshimitsu Hamasaki (The George Washington University Biostatistics Center)

Data and Safety Monitoring Boards (DSMBs) play a critical role in clinical trials, using emerging data to monitor the welfare of trial participants through assessment of the benefit-risk balance of interventions, and determining whether randomization and follow-up should continue. High-quality DSMB reports are essential for informed decision-making. Unfortunately, in practice, DSMB reports often lack sufficient structure and are excessively lengthy and dense, typically filled with numerous tables and listings. The report volume, coupled with a lack of clear organization, prioritization of presentations, and contextual explanation, can hinder the identification of important treatment effects, and result in sub-optimal DSMB recommendations. DSMB reports are distinct from other types of clinical trial documentation such as clinical study reports or journal articles. DSMB reports are aimed at their target audience, the DSMB. Reports optimally present data in a way that tells a complete and coherent story, prioritizing clarity and actionable insights. Comparison of Uncomplicated Candidemia Therapy Duration in Children and Adolescents (COUNT; NCT05763251) is a multi-center, randomized controlled study comparing two antifungal therapy durations in pediatric patients with uncomplicated candidemia. COUNT is employing the Desirability of Outcome Ranking (DOOR) paradigm, a patient-centric paradigm for designing, analyzing, interpreting, and reporting clinical trial results, focusing on benefit-risk evaluation. COUNT was designed with three planned interim analyses and biannual safety reviews. A refined DSMB report template was developed for COUNT in support of effective monitoring. The report emphasizes the use of visual aids, including bar charts, stacked bar charts, forest plots, and predictive interval plots, complemented by concise tables and narrative summaries.
The report is designed to distill complex data into clear, digestible presentations that facilitate quick and accurate benefit-risk assessments by DSMB members. Visual patient stories complement summary tables and figures to provide insights into the comprehensive experience of each participant. In this presentation, we will introduce the DSMB reporting approach, outlining principles for preparing reports and providing practical guidance for ensuring clarity and utility. We illustrate the approach using COUNT as a prototypical example.

P-15 – SAMPLE SIZE RE-ESTIMATION IN CLUSTER RANDOMIZED CLINICAL TRIALS

Primary Author:

1) Sasha Amdur Kravets (Eli Lilly and Company)

Co-Author(s):

2) Heather J Gunn (Mayo Clinic)

3) Sumithra J Mandrekar (Mayo Clinic)

Pragmatic clinical trials such as the parallel arm cluster-randomized trial (CRT) are increasingly utilized to understand the delivery of care in real-world settings; however, these designs are complex. Cluster-randomized trials possess a hierarchical data structure (e.g., patients clustered within clinics), and the intraclass correlation coefficient (ICC), the proportion of total outcome variance attributable to variability between clusters, is needed to conduct the power analysis. Under- or overestimation of the ICC at the design stage may lead to significantly under- or over-powered clinical trials. The designs proposed in the literature for sample size re-estimation to adjust for a potentially mis-specified ICC at the design stage do not control the type I error rate. Further, designs to adjust sample size in a CRT based on both a mis-specified ICC and a mis-specified treatment effect have not been adopted widely. We propose an adaptation of two traditional adaptive sample size re-estimation methods to CRTs that account for both a mis-specified ICC and a mis-specified treatment effect. We propose an extension of the internal pilot study approach using a weighted test statistic as a method for type I error control. We also extend the promising zone design to CRTs using conditional power for sample size re-estimation under the linear mixed-effects model. We illustrate the impact of interim accrual on these methods in the CRT setting using simulation studies. Our proposed methods adequately control type I error and recover lost study power.
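The sensitivity of power to the assumed ICC can be illustrated with the standard design effect for equal cluster sizes, 1 + (m - 1) x ICC, applied to a two-arm comparison of means under a normal approximation. The sketch below is a textbook calculation, not the authors' re-estimation method; the effect size, standard deviation, cluster size, and ICC values are hypothetical.

```python
import math
from statistics import NormalDist


def design_effect(cluster_size, icc):
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1.0 + (cluster_size - 1) * icc


def clusters_per_arm(delta, sd, cluster_size, icc, alpha=0.05, power=0.80):
    """Clusters per arm for a two-sided test of a mean difference (normal approximation)."""
    z = NormalDist()
    n_individual = 2 * (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2 * sd ** 2 / delta ** 2
    return math.ceil(n_individual * design_effect(cluster_size, icc) / cluster_size)


def achieved_power(delta, sd, cluster_size, icc, k, alpha=0.05):
    """Approximate power with k clusters per arm when the true ICC is `icc`."""
    z = NormalDist()
    se = sd * math.sqrt(2 * design_effect(cluster_size, icc) / (k * cluster_size))
    return z.cdf(delta / se - z.inv_cdf(1 - alpha / 2))


# Plan with an assumed ICC of 0.02, then evaluate power if the true ICC is 0.05.
planned_k = clusters_per_arm(delta=0.25, sd=1.0, cluster_size=20, icc=0.02)
power_if_correct = achieved_power(0.25, 1.0, 20, icc=0.02, k=planned_k)
power_if_icc_higher = achieved_power(0.25, 1.0, 20, icc=0.05, k=planned_k)
needed_k = clusters_per_arm(delta=0.25, sd=1.0, cluster_size=20, icc=0.05)
```

Under these assumptions, a trial planned with too small an ICC falls short of its target power, which is the situation that interim re-estimation of the ICC is meant to address.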

P-18 – CLINICAL TRIAL TRANSPARENCY: PATIENT PREFERENCES ON CLINICAL TRIAL RESULTS DISSEMINATION FOLLOWING THE PREPARE TRIAL

Primary Author:

1) Jodi Gallant (McMaster University)

Co-Author(s):

2) Tristan Paranavitana (McMaster University)

3) Sofia Bzovsky (McMaster University)

4) Kaitlyn Pusztai (McMaster University)

5) Paula McKay (McMaster University)

6) Debra Marvel (University of Maryland Baltimore)

7) Jeffrey L Wells (University of Maryland Baltimore)

8) Julie Menard (Université de Sherbrooke)

9) Jamal Al-Asiri (McMaster University)

10) Joseph T Patterson (University of Southern California)

Purpose: Clinical trial participants have a right to know the results of the trials in which they participate. Trial results are often not shared directly with participants and regulatory approvals typically prevent contacting participants after trial completion. The primary objective of this cross-sectional study was to determine the proportion of orthopedic fracture trial participants who wish to know the results of a clinical trial in which they participated.

Methods: We surveyed participants from the completed PREPARE trial at our institution to determine if they would like to know the trial results. We asked participants about their preferences for receiving trial results, their experiences upon learning them, and if they wished to learn which treatment they received.

Results: Twenty-eight percent (181/641) of PREPARE trial participants agreed to participate in this study. We found that 95.5% of participants wished to know the trial results and the preferred method was through an online link (78.2%). Most were satisfied with knowing the results (67.8%) and 82.2% wanted to know which treatment they received. Fifty-one percent reported that learning the results increased their likelihood of participating in a future trial.

Conclusions: Although we had a low response rate, our study findings suggest that there is considerable interest among participants in knowing the clinical trial results and that learning the findings may have a positive impact on individual participants and the research community. Given the limited understanding of results, researchers should have processes in place to actively facilitate the accessible sharing of this information, ideally before participants complete the trial.

P-19 – ATTITUDES OF PATIENTS AND CAREGIVERS TO THE AVAILABILITY OF A DEDICATED SOCIAL WORKER IN THE OUTPATIENT FRACTURE CLINIC

Primary Author:

1) Jodi Gallant (McMaster University)

Co-Author(s):

2) Sheila Sprague (McMaster University)

3) Natalie Fleming (McMaster University)

4) Marc Gonsalves (McMaster University)

5) Jamal Al-Asiri (McMaster University)

6) Herman Johal (McMaster University)

7) Dale Williams (McMaster University)

8) Faisal Al-Zahrani (McMaster University)

9) Sara Renaud (McMaster University)

10) Kaitlyn Pusztai (McMaster University)

Purpose: This study examined the benefits of a dedicated social worker in the fracture clinic setting as perceived by patients and caregivers.

Methods: This cross-sectional survey engaged patients aged 18 and older presenting to the fracture clinic associated with a level one trauma center. Surveys were available in English and Arabic and included 11 questions designed to measure level of agreement with various aspects of social worker presence in the outpatient fracture clinic.

Results: A total of 199 patients and caregivers participated in this study. The majority of participants were female (61.8%) and over the age of 50 (57.1%). While there was nuance in the circumstances and issues participants felt were appropriate for social worker assistance, the majority (95.4%) found the general premise of having a dedicated social worker available in the fracture clinic to be acceptable and believed it would be beneficial to fracture clinic patients. Participants indicated that social workers could support trauma patients with multiple issues (60% level of agreement across all items).

Conclusion: Orthopedic trauma patients and their caregivers overwhelmingly support the addition of a dedicated social worker to the fracture clinic, with a high level of agreement across demographic groups and proposed social worker interventions. The patient and caregiver population finds the fracture clinic to be an appropriate and viable setting for the mitigation of non-physical impacts of orthopedic trauma. These findings can inform the development of potential future additions of social services to the fracture clinic.

P-20 – EXPLORING BAYESIAN GROUP-SEQUENTIAL DESIGNS IN MULTIPLE-PERIOD PARALLEL-ARM CLUSTER RANDOMIZED TRIALS

Primary Author:

1) Yongdong Ouyang (The Hospital for Sick Children)

Co-Author(s):

2) Hubert Wong (University of British Columbia)

Group-sequential designs (GSDs), which have been extensively explored in individually randomized trials, demonstrate substantial reductions in expected sample sizes by enabling early stopping at pre-specified interim analyses. While the GSD can be developed within either a Frequentist or Bayesian framework, Bayesian GSDs offer additional flexibility by providing a formal mechanism for incorporating external information, such as historical data or expert knowledge, which enhances the efficiency and interpretability of trial results. The adoption of GSDs in cluster randomized trials (CRTs) could effectively address the statistical challenges arising from having a limited number of clusters in CRTs, but their use has been limited. In this study, we specifically focus on evaluating the Bayesian GSD in cross-sectional multiple-period CRTs with a baseline period, where outcomes are measured at equally spaced time intervals both before and after the intervention. However, the ideas apply to any type of CRT. Our primary objective is to introduce and evaluate the statistical efficiency of Bayesian GSDs in multiple-period CRTs under three situations. First, we compare the performance (sample size and power under various standardized effect sizes) of the Bayesian GSD with a non-informative normal prior centered at 0 with a precision of 0.001 (which heuristically should yield similar performance to an equivalent Frequentist GSD) to a Frequentist fixed sample-size CRT design while controlling Type I error rates. Next, we examine the performance when informative priors (normal priors with mean equal to the effect size and precision 0.5) are used, both with and without controlling for Type I error rates. Our results showed that the Bayesian GS multiple-period CRT with a non-informative prior achieved similar power to the Frequentist fixed design while controlling Type I error rates. However, this design allowed trials to stop earlier, which reduced the expected cluster sizes by up to 40%.
If we switch to the informative prior while still maintaining the Type I error rates, the simulation returns a similar performance to that of the designs with the non-informative prior. When Type I error rates were not controlled, the designs with the informative prior showed up to 19% higher statistical power, with a Type I error rate of 0.09. The magnitudes of these results depend on the degree of informativeness of the priors, effect sizes, and trial sizes. In summary, we introduced the Bayesian GSD to multiple-period CRTs. Our results indicate that if we require control of the Type I error rate, reductions in expected sample size arising from using a Bayesian GSD are attributable to the group-sequential aspect, not to the use of informative priors. This result matches the findings reported previously in the literature for individually randomized trials. Informative priors further reduce sample size only if the Type I error rate is allowed to increase. However, regardless of whether Type I error control is required, Bayesian methods enable bringing existing evidence into the current study and increase the interpretability of the parameters (e.g., probability-based inference instead of P-values), which has been widely discussed in the literature.
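To illustrate how the two priors described in this abstract behave, the following minimal Python sketch applies a conjugate normal update to a single interim estimate. The abstract does not specify the analysis model, so the normal approximation with a known standard error, the function names, and the interim numbers are all assumptions made for illustration only.

```python
from statistics import NormalDist


def normal_posterior(prior_mean, prior_precision, estimate, std_error):
    """Conjugate normal update; returns (posterior mean, posterior precision)."""
    data_precision = 1.0 / std_error ** 2
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean + data_precision * estimate) / post_precision
    return post_mean, post_precision


def prob_effect_positive(post_mean, post_precision):
    """Posterior probability that the treatment effect exceeds zero."""
    post_sd = post_precision ** -0.5
    return 1.0 - NormalDist(post_mean, post_sd).cdf(0.0)


# Hypothetical interim estimate and standard error (illustration only).
estimate, std_error = 0.30, 0.10

# Non-informative prior from the abstract: mean 0, precision 0.001.
vague = normal_posterior(0.0, 0.001, estimate, std_error)
# Informative prior from the abstract: mean = assumed effect size, precision 0.5.
informative = normal_posterior(0.30, 0.5, estimate, std_error)

print("vague prior:", vague, prob_effect_positive(*vague))
print("informative prior:", informative, prob_effect_positive(*informative))
```

With a prior precision of 0.001 the posterior mean is almost identical to the interim estimate, which is consistent with the heuristic that the non-informative design should behave much like a Frequentist GSD.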

P-21 – THE EFFECT OF PREDICTING SERIOUS ADVERSE EVENT FOR GUIDING THE ENROLLMENT PROCEDURE IN CLINICAL TRIALS

Primary Author:

1) Ajsi Kanapari (University of Padova)

Co-Author(s):

2) Dario Gregori (University of Padova)

Background: Serious adverse events (SAEs) are undesired events that derive from a drug reaction or a medication error and result in a life-threatening event, prolonged hospitalization, or death. They are a type of healthcare-related harm with direct consequences for patients’ lives, and they compromise study validity and safety. However, they are often explored through post-hoc analysis and do not directly inform the design of clinical trials, owing to their complex application in a dynamic context. There is room for improvement here, which can be guided by the use of machine learning (ML) to identify subgroups of patients with meaningful combinations of clinical features linked to SAEs, with the aim of limiting their frequency. A recent review identified an increase in the use of ML models to predict specific SAEs using electronic health record (EHR) data, mainly related to chemotherapy, tumor-targeting drugs, and antibacterial agents. On the other hand, overly stringent exclusion criteria that limit the generalizability of trial results could be refined and relaxed with the use of probabilistic methods that rely on clinical features rather than on specific dichotomized variables.

Objective: The aim of this work is the development of a framework in accordance with FDA guidance on enrichment strategies for reducing trial variability, that implements ML models to allow early detection. The framework employs ML models to identify patients at high risk of SAEs, enhancing early detection and informing inclusion/exclusion decisions. Historical data are used to train predictive models that estimate SAE probabilities for new participants, with inclusion decisions guided by a predefined decision rule.

Results: Simulations are performed to assess the operating characteristics of the proposed framework, with the aim of balancing the reduction in SAE incidence, algorithm accuracy, and the generalizability of the study. Owing to the reduced variability that follows patient exclusion, power is not only maintained but also increased if the model performs well. However, issues arise particularly when specificity is low, which causes the unnecessary exclusion of subjects at low risk of SAEs. On the positive side, the algorithm provides reduced standard errors and more precise estimates of the treatment effect.

P-22 – DISTANCE-BASED NONPARAMETRIC ESTIMATORS FOR COMPARING GROUPS IN BEHRENS-FISHER SETTINGS WITH CLUSTERED CLINICAL DATA

Primary Author:

1) Akash Roy (Medical University of South Carolina)

Co-Author(s):

2) Dipnil Chakraborty (Bristol Myers Squibb)

Background: Statistical comparisons between two independent groups are among the most common inference problems in scientific research, particularly in fields such as biomedical and social sciences. However, in clinical trials with clustered data, traditional statistical methods may not be ideal when measurements are obtained from dependent replicates. This creates a need for statistical procedures specifically designed to account for the dependency among replicates. Parametric assumptions about underlying distributions can be questionable, especially in complex data structures. To address these challenges, we propose to estimate the Wilcoxon-Mann-Whitney (WMW) effect using alternative distance functions that do not rely on parametric assumptions. This non-parametric approach provides a more robust solution for comparing two groups with unequal variances, improving the reliability of treatment effect estimation and mitigating the risks of model misspecification.

Methods: We consider two independent samples with replicated observations modeled by independent random vectors $X_{ik} = (X_{ik1}, \ldots, X_{ikm_{ik}})^T$, where $i = 1, 2$ and $k = 1, \ldots, n_i$. Here, $m_{ik}$ represents the number of replicates of subject $k$ under treatment $i$; the replicates may be correlated, and their number may differ across subjects. The observations from each treatment group follow distributions $F_1$ and $F_2$, and the treatment effect $p$ is defined as $p = \int F_1 \, dF_2$, where $p$ measures the tendency of one group to have smaller values than the other. In this work, we estimate the WMW effect $p$ by incorporating two distance functions: the Sign Function-based Distance and the Mahalanobis Distance. For clustered data, we adjust for within-group correlation by resampling clusters or calculating distances within each cluster. The Sign Function-based Distance assigns values of 1, 0.5, or 0 based on comparisons between paired observations, while the Mahalanobis Distance adjusts for variances and correlations between groups. The overall treatment effect is estimated by aggregating distances across cluster pairs, weighted by cluster size, and using bootstrap resampling to estimate confidence intervals. Extensive simulations with small and moderate sample sizes, both balanced and unbalanced, were conducted under different replication settings. The performance of the methods was explored using three types of distributions: multivariate normal, multivariate lognormal, and ordinal data, with varying levels of correlation (low, medium, and high). Additionally, the quality of the proposed tests was evaluated in terms of controlling the nominal type-I error rate ($\alpha$) and their power to detect alternative hypotheses.
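The sign-function-based estimate of the WMW effect described above can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: weighting each cluster pair by the product of the two cluster sizes, the percentile form of the cluster bootstrap, and all function names are assumptions, and the Mahalanobis variant is not shown.

```python
import random


def sign_kernel(x, y):
    """Return 1 if x < y, 0.5 if the values are tied, and 0 otherwise."""
    if x < y:
        return 1.0
    if x == y:
        return 0.5
    return 0.0


def wmw_effect(group1, group2):
    """Estimate p = P(X1 < X2) + 0.5 * P(X1 = X2) from clustered data.

    Each group is a list of non-empty clusters (lists of replicates).
    Cluster pairs are weighted by the product of their sizes (an assumption).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for c1 in group1:
        for c2 in group2:
            weight = len(c1) * len(c2)
            pair_mean = sum(sign_kernel(x, y) for x in c1 for y in c2) / weight
            weighted_sum += weight * pair_mean
            total_weight += weight
    return weighted_sum / total_weight


def cluster_bootstrap_ci(group1, group2, n_boot=500, level=0.95, seed=1):
    """Percentile interval obtained by resampling whole clusters with replacement."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        boot1 = rng.choices(group1, k=len(group1))
        boot2 = rng.choices(group2, k=len(group2))
        estimates.append(wmw_effect(boot1, boot2))
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * n_boot)]
    upper = estimates[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Toy data (hypothetical): subjects with differing numbers of replicates.
treatment_1 = [[1.2, 1.5], [2.0], [0.9, 1.1, 1.4]]
treatment_2 = [[1.8, 2.2], [1.0, 1.3], [2.5]]
print(wmw_effect(treatment_1, treatment_2))
print(cluster_bootstrap_ci(treatment_1, treatment_2))
```

Resampling whole clusters rather than individual replicates keeps the within-subject dependence intact in each bootstrap sample.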

Discussion: Dependent replications are common in many experiments, highlighting the need for appropriate statistical procedures to model them. Reducing replications to single observations using central tendency measures results in loss of valuable information and reduced power. Moreover, data often exhibit skewed distributions or are observed on ordinal scales, necessitating flexible nonparametric methods for analyzing such data in a unified manner. These methods offer flexibility in handling distributional assumptions and managing complex dependencies, making them a valuable tool in clinical trial design and analysis, particularly in cases where traditional parametric tests may be inadequate. The proposed distance-based nonparametric estimators can provide a practical solution in such situations by reducing computation complexity and increasing flexibility.

P-23 – A NON-PARAMETRIC APPROACH TO PREDICT RECRUITMENT FOR RANDOMIZED CLINICAL TRIALS-IMPLEMENTATION IN R

Primary Author:

1) Alejandro Villasante-Tezanos (University of Texas Medical Branch)

Co-Author(s):

2) Xiaoying Yu (University of Texas Medical Branch)

3) Christopher Kurinec (University of Texas Medical Branch)

4) Ioannis Malagaris (University of Texas Medical Branch)

5) Yong-Fang Kuo (University of Texas Medical Branch)

Accurate prediction of subject recruitment is crucial for the success of clinical trials but remains a persistent challenge. Existing prediction models often rely on parametric assumptions, which may not hold, or Bayesian methods, which require prior knowledge that can be difficult for investigators to provide. We introduce RCTRecruit, an R package that employs a novel, flexible, non-parametric, weighted resampling approach for clinical trials. Specifically, we implemented the methodology in inpatient settings such as acute care for the elderly (ACE) units at University of Texas Medical Branch. When simulating future enrollment, our method assigns higher weights to empirical data from dates similar to those in the target period, effectively accommodating seasonal patterns in recruitment. Simulated distributions and resampling techniques are then used to calculate confidence intervals for recruitment numbers at the end of a recruitment period or upon reaching a target number of participants. This method handles diverse enrollment patterns and anticipated changes in recruitment, using only a recruitment log as input. Using RCTRecruit, we applied this method to recruitment data from the GRIPS and PACE studies, comparing its performance to both bootstrapping and Bayesian methods. In these real-world applications, RCTRecruit outperformed both alternatives. Our package demonstrates the feasibility and ease of implementing a flexible, non-parametric weighted resampling approach that requires minimal input from the investigator.
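The weighted resampling idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions and not the RCTRecruit API: the weekly granularity, the exponential weighting by circular week distance, the `bandwidth` parameter, and all function names are hypothetical.

```python
import math
import random


def circular_week_distance(a, b, period=52):
    """Distance between two weeks of the year, wrapping around the year end."""
    d = abs(a - b) % period
    return min(d, period - d)


def simulate_totals(history, target_weeks, n_sims=1000, bandwidth=4.0, seed=1):
    """Simulate total enrollment over target_weeks.

    history: list of (week_of_year, enrolled_count) pairs from a recruitment log.
    Past weeks closer in the calendar to the target week get higher weight.
    """
    rng = random.Random(seed)
    counts = [count for _, count in history]
    weights_by_week = {
        week: [math.exp(-circular_week_distance(week, past_week) / bandwidth)
               for past_week, _ in history]
        for week in set(target_weeks)
    }
    totals = []
    for _ in range(n_sims):
        total = 0
        for week in target_weeks:
            total += rng.choices(counts, weights=weights_by_week[week], k=1)[0]
        totals.append(total)
    totals.sort()
    return totals


def percentile_interval(sorted_values, level=0.95):
    """Percentile interval from an ascending list of simulated totals."""
    n = len(sorted_values)
    lower = sorted_values[int((1 - level) / 2 * n)]
    upper = sorted_values[min(n - 1, int((1 + level) / 2 * n))]
    return lower, upper


# Hypothetical log: slower recruitment in summer weeks 23 to 35.
log = [(week, 1 if 23 <= week <= 35 else 3) for week in range(1, 53)]
totals = simulate_totals(log, target_weeks=list(range(20, 30)))
print(percentile_interval(totals))
```

Because the weights depend only on calendar distance, a seasonal dip in the log carries over to simulated weeks at the same time of year.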

P-24 – REVISITING FUTILITY MONITORING IN CLINICAL TRIALS

Primary Author:

1) Corinne Gant (Medical University of South Carolina)

Modern clinical trials often include interim analyses of accruing data that seek to declare futility when the demonstration of a meaningful treatment effect is unlikely, even if they continue to their planned conclusion. An appropriate futility determination and decision to halt the study early can avoid unnecessary expenditure of resources and exposures to ineffective therapies. However, inappropriate futility determinations will stall progress towards making effective interventions available to patients. Despite the importance of making correct futility decisions, there is little guidance on optimal futility monitoring methods in the literature. A trial design should describe the schedule for interim looks and any decisions that are expected to be made on the accruing data, including the guidance quantities to be used in making decisions to stop a trial. Popular guidance quantities for interim futility monitoring include both error-spending functions and stochastic curtailment methods, such as conditional power, predictive power, and predictive probability of success. To highlight the role of futility analyses in clinical trials, this presentation provides a brief overview of commonly used futility stopping methods, insights into the differences in their performance estimated via simulation, and several real clinical trial examples demonstrating their application.
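As an illustration of one of the stochastic curtailment quantities mentioned above, the following Python sketch computes one-sided conditional power from an interim Z-statistic using the standard B-value formulation. The one-sided alpha of 0.025, the 0.20 futility threshold, and the function names are assumptions chosen for the example and are not taken from the presentation.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def conditional_power(z_interim, info_fraction, alpha=0.025, drift=None):
    """One-sided conditional power given an interim Z-statistic.

    info_fraction is the fraction of total information observed so far.
    drift is the assumed expected final Z-statistic; None uses the current trend.
    """
    if not 0.0 < info_fraction < 1.0:
        raise ValueError("info_fraction must be strictly between 0 and 1")
    b_value = z_interim * info_fraction ** 0.5
    if drift is None:
        drift = b_value / info_fraction  # current-trend estimate of the drift
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha)
    expected_final = b_value + drift * (1.0 - info_fraction)
    remaining_sd = (1.0 - info_fraction) ** 0.5
    return 1.0 - _STD_NORMAL.cdf((z_crit - expected_final) / remaining_sd)


def is_futile(z_interim, info_fraction, threshold=0.20):
    """Flag futility when current-trend conditional power falls below threshold."""
    return conditional_power(z_interim, info_fraction) < threshold


# Hypothetical interim look at half of the planned information.
print(conditional_power(z_interim=0.5, info_fraction=0.5))
print(is_futile(z_interim=0.5, info_fraction=0.5))
```

Passing a design-stage value for `drift` instead of the current trend gives conditional power under the originally hypothesized effect, which is one reason different futility rules can disagree at the same interim look.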

P-25 – PARTICIPANT DIVERSITY AND INCLUSIVE TRIAL DESIGN: A META-EPIDEMIOLOGIC STUDY OF CANADIAN RANDOMIZED CLINICAL TRIALS

Primary Author:

1) Debby Oladimeji (University of Alberta)

Co-Author(s):

2) Shannon M Ruzycki (University of Calgary)

3) David Collister (University of Alberta)

4) Kirstie C Lithgow (University of Calgary)

5) Claire Song (University of Calgary)

6) Sarah Taylor (University of Calgary)

7) Abinaya Subramanian

8) FengYun (Miriam) Li

9) Stephanie Happ (University of Calgary)

10) Mark Shea

Background: Addressing the lack of diversity in medical research is a global priority. Participants in randomized clinical trials (RCTs) should reflect real-world, non-trial populations to ensure representativeness and generalizability because differences in trial populations may influence baseline risk or disease responsiveness, adherence and acceptability of interventions, and the efficacy or safety of interventions. Despite its importance, trial populations are often not representative of their respective real-world populations with regards to age, sex, gender, race, ethnicity, and disability in a variety of disease and therapeutic areas including coronary artery disease, heart failure, stroke, oncology, and neurology. Improving participation in RCTs for underrepresented groups depends on understanding barriers, facilitators and contexts. The objectives of this meta-epidemiologic study were to describe the diversity of contemporary adult Canadian clinical trial participants and summarize the reporting of social determinants of health of participants in adult Canadian clinical trials.

Methods: Eligible trials were identified through a search of registered trials on ClinicalTrials.gov. We included randomized, phase 2 or 3 clinical trials that were registered between January 1, 2010, and December 31, 2019, and had results that were published in a peer-reviewed journal. To be included, trials had to recruit adult participants only in Canada, as participants recruited from other countries may not be comparable to Canadian participants with respect to sociodemographic factors such as socioeconomic status and race and/or ethnicity. Full-text manuscripts for each included trial were screened independently for inclusion in duplicate, with disagreements resolved by a third study team member. To describe participant diversity, we extracted whether investigators reported participant demographics according to the PROGRESS-PLUS elements, which are: (1) Place of residence; (2) Race, ethnicity, culture, language; (3) Occupation; (4) Gender and sex; (5) Religion; (6) Education; (7) Socioeconomic status; (8) Social capital; and (9) Personal characteristics associated with discrimination, such as age and disability.

Results: The initial search on ClinicalTrials.gov identified 2,749 trial records. After several screenings, 118 trial records were eligible for full text retrieval; these studies recruited a combined total of 17,387 individual participants. We identified a glaring underreporting and underrepresentation of sex and gender minorities in Canadian RCTs. No identified study reported both the sex and gender of participants, as recommended by the Institute of Medicine since 2011. Similar to other reviews of trial diversity, we report that many aspects of participant identity were not reported in primary manuscripts. Based on available information, we found that Black and Indigenous participants were underrepresented relative to their proportion in the general Canadian population; 4.3% of the Canadian population self-identifies as Black compared to 2.7% of trial participants and 6.1% identify as Indigenous compared to 0.2% of trial participants. In summary, there are gaps in participant diversity and inclusion among contemporary Canadian RCTs.

Discussion: Tools and guidelines aimed at improving trial participation for underrepresented groups in Canada should address underreporting and possible exclusion of sex and gender minorities, standardize reporting of race and/or ethnicity data, and describe how to select PROGRESS-PLUS identities relevant to each project. Future studies should examine study design choices that act as facilitators or barriers of recruiting participants who reflect the diversity of Canadians.

P-26 – TREATMENT EFFECT ESTIMATION IN THE PRESENCE OF CLUSTER SIZE DEPENDENT TREATMENT HETEROGENEITY IN STEPPED WEDGE DESIGNS

Primary Author:

1) Fandi Chang (The Ohio State University)

Co-Author(s):

2) Abigail Shoben (The Ohio State University)

Introduction: In stepped-wedge cluster randomized trials (SW-CRTs), all clusters receive the intervention but the timing of when they start varies in a staggered, sequential manner across predefined periods. Previous research has shown that the sequences closer to the edges of the design contribute the most information for estimating the treatment effect using a linear mixed-effects model. These findings further implied that researchers could consider randomizing clusters to specific treatment sequences to improve the efficiency of the treatment effect estimator. However, it remains unknown how the estimation of the treatment effect from SW-CRTs changes when treatment heterogeneity depends on cluster size.

Methods: We conducted a simulation study to evaluate the behavior of the treatment effect estimator under treatment heterogeneity dependent on cluster size. Two sampling schemes were considered to better reflect the practical sampling approaches: 1) sampling clusters of equal size regardless of population cluster size; 2) sampling clusters in proportion to the population cluster size. Treatment heterogeneity was introduced by increasing the treatment effect size with population size. Three randomization schemes were applied to assign clusters to treatment sequences: 1) unconstrained randomization; 2) randomizing clusters with larger treatment effects to the edges of the design; 3) randomizing clusters with larger treatment effects to the middle of the design. Treatment effects were estimated using a linear mixed effects model incorporating treatment, discrete period effects, and cluster-specific random intercepts.

Results: In the unconstrained randomization, sampling clusters of equal size resulted in an estimated treatment effect that weighted each cluster equally, while sampling clusters in proportion to population cluster size led to an estimated treatment effect that weighted each individual equally. However, these estimates changed dramatically when randomization was constrained based on cluster size. Observed estimated treatment effects did not vary across a variety of intraclass correlation coefficients.
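The distinction between weighting each cluster equally and weighting each individual equally can be made concrete with a small worked example in Python. The cluster sizes and cluster-specific effects below are hypothetical numbers chosen so that the effect grows with cluster size; they are not taken from the simulation study.

```python
def cluster_average_effect(effects):
    """Average treatment effect giving every cluster equal weight."""
    return sum(effects) / len(effects)


def individual_average_effect(sizes, effects):
    """Average treatment effect giving every individual equal weight."""
    return sum(n * e for n, e in zip(sizes, effects)) / sum(sizes)


# Hypothetical population cluster sizes and cluster-specific treatment effects.
sizes = [20, 50, 200]
effects = [0.2, 0.5, 2.0]

print(cluster_average_effect(effects))            # equal weight per cluster
print(individual_average_effect(sizes, effects))  # equal weight per individual
```

When larger clusters have larger effects, the individual-average effect exceeds the cluster-average effect, so the two sampling schemes target different quantities.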

Conclusion: In the presence of treatment heterogeneity that depends on the cluster size, researchers need to carefully consider the desired treatment effect, and constrained randomization is not recommended.

P-27 – MUST-HAVES OR NICE-TO-HAVES: COMMUNITY OUTREACH TO INFORM RESEARCH INCLUSION IN FUNDING APPLICATIONS: CASE STUDIES FROM THE LEEDS INSTITUTE OF CLINICAL TRIALS RESEARCH - THREE CASE STUDIES FROM LICTR

Primary Author:

1) Finbar Slevin (University of Leeds)

Co-Author(s):

2) Liam Bishop (University of Leeds)

3) Nancy Fernandes da Silva (University of Leeds)

4) Christopher Williams (University of Leeds)

5) Natasha Greatorex (University of Leeds)

6) Delia Muir (University of Leeds)

Introduction: The National Institute for Health Research (NIHR) and other major funders for healthcare research increasingly require costed research inclusion (i.e. planning for equity, diversity and inclusion) plans as a condition of funding. While this is undoubtedly a positive move to improve inclusivity and diversity of research activity, it is likely to bring challenges and adjustments for researchers both in the development and implementation of these plans. In particular, there may be challenges to involving people from diverse communities in the development of those plans. There is a need for real-world, practical examples to guide development of research inclusion plans.

Methods: Three recent funding applications were made through the Leeds Cancer Research UK Clinical Trial Unit at the University of Leeds. These were: i) Optimizing advanced radiotherapy treatments for prostate cancer; ii) Liver Cancer: a platform study in Hepatocellular Carcinoma combining TARgeted treatments; and iii) FOxTROT Platform – personalizing neo-adjuvant treatment in locally advanced but operable colon cancer. Each of these applications contained research inclusion plans. The applications were reviewed to understand: i) the activities which informed development of these plans, ii) any challenges which were encountered and iii) how the research inclusion activity will benefit the proposed research. All studies costed research inclusion plans through different methods and with different outcomes.

Results: Different approaches were used to develop research inclusion plans in the three applications. For the prostate cancer application, the lead applicant met with people with lived experience of prostate cancer and radiotherapy and their caregivers. This included discussing research inclusion with people from under-served groups at particular risk of prostate cancer during meetings in their communities. For the liver cancer grant application, the lead applicant and members from Leeds CTRU met with people with lived experience of liver cancer to inform the PPI components of the grant application, and also sought advice from a Diversity and Inclusion Officer to help shape the research inclusion plans. The FOxTROT Platform trial lead applicants worked together with patient representatives and the Leeds CTRU PPIE lead to minimize the opportunity for unintentional exclusion of any particular patient groups through the trial design. All applicants had knowledge of the patient population their research would need to serve and were trying to base their research around that knowledge. As a result, research inclusion plans were included in applications.

Conclusion: We will present feedback and reflections on these processes for developing research inclusion plans. Research inclusion was considered valuable, despite there being challenges to undertaking it. There were also practical challenges to undertaking research inclusion planning, in particular community outreach at the application stage. How community outreach will be conducted needs consideration, including the time and resources required. Increasingly, community outreach and other elements of research inclusion will be considered “must-haves” rather than “nice-to-haves”. Innovative approaches, as we have demonstrated, will be required at the grant application stage in addition to during the study.

P-28 – ENHANCING EFFICIENCY AND USER EXPERIENCE IN THE PREVENTABLE STUDY DRUG SYSTEM: UPGRADED FEATURES AND IMPROVED OPERATION

Primary Author:

1) Julissa Almonte (Wake Forest University School of Medicine)

Co-Author(s):

2) Mark King (Wake Forest University School of Medicine)

3) Letitia Perdue (Wake Forest University School of Medicine)

4) Laura Lovato (Wake Forest University School of Medicine)

5) Kenneth Wilson (Wake Forest University School of Medicine)

6) Wesley Roberson (Wake Forest University School of Medicine)

7) Amanda Montgomery (Wake Forest University School of Medicine)

The PRagmatic EValuation of evENTs And Benefits of Lipid lowering in oldEr adults (PREVENTABLE) trial is a double-blind, multicenter, randomized study involving 20,000 community-dwelling adults aged 75 and older, aimed at evaluating atorvastatin’s effectiveness in preventing dementia or persistent disability. Participants will be followed for 5 years, with random assignment to daily atorvastatin 40 mg or placebo. The PREVENTABLE Study Drug System is a critical tool for coordinating direct-to-participant drug distribution and tracking across VA and non-VA sites. Recent updates have enhanced its user-friendliness, efficiency, and adaptability. These updates have streamlined communication between sites, coordinating centers, and the central pharmacy. The web-based Study Drug System, managed by the Data Coordinating Center at Wake Forest University School of Medicine, was originally built to manage secure drug distribution. It now includes new alerts, advanced filters, improved tracking features, and a simplified interface to better support both VA and non-VA sites. User feedback was vital in developing these updates and optimizing the interface to meet the needs of both VA and non-VA sites. User focus groups highlighted workflow challenges, which led to enhancements in alert functionality, efficiency, and troubleshooting. The upgraded system has improved operations and data reliability for PREVENTABLE with a redesigned dashboard that consolidates site-specific orders and displays vital participant information. Site staff can quickly access participant data, view current drug order statuses, and resolve issues with real-time shipment details. Advanced filtering options further streamline data retrieval, allowing staff to easily find specific study drug information. The alerts, designed to notify users of outstanding tasks and shipment-related issues, have been tailored for both VA and non-VA sites.
This system includes both manual and automated actions to streamline site-level drug management. Additionally, API integration with USPS tracking has been modified to allow for real-time monitoring of drug shipments, which has improved delivery reliability. Ongoing quality assurance measures, including dynamic reporting features, have also helped track study drug shipments, monitor past shipments, and ensure that system alerts are functioning correctly. These reports play an important role in ensuring timely communication, providing visibility into successful shipments, and identifying participants who should be expecting orders. Proactive monitoring helps prevent logistical delays and supports the accurate tracking of drug orders, which is crucial for maintaining study integrity and participant adherence. By improving the tracking and communication flow between the coordinating centers, clinical sites, and central pharmacy, the upgraded system fosters a cohesive trial environment by streamlining information exchange. This presentation will highlight the technical and procedural updates to the PREVENTABLE Study Drug System, focusing on improvements in site efficiency, user satisfaction, and data accuracy. Key features such as the redesigned dashboard, advanced filters, and enhanced USPS tracking integration will be discussed, along with the role of user feedback in shaping these changes. The challenges of balancing the needs of both VA and non-VA sites will be addressed, and insights will be shared on applying these enhancements to optimize large-scale clinical trial management.

P-29 – PS-INTEGRATED BAYESIAN PROACTIVE DYNAMIC BORROWING STRATEGY BASED ON THE QUANTITATIVE EVALUATION OF EXCHANGEABILITY OF THE HYBRID CONTROL ARM

Primary Author:

1) Kai Wang (Peking University)

Co-Author(s):

2) Han Cao (Medical Data Science Center, Beijing Tsinghua Changgung Hospital)

3) Chen Yao (Clinical Research Institute, Institute of Advanced Clinical Medicine, Peking University)

Borrowing external controls in clinical trials to augment the concurrent control arm is attractive due to its ability to reduce sample size and improve efficiency. However, it inevitably faces challenges from various biases due to the incomparability between concurrent and external controls. Several Bayesian methods that rely on the exchangeability of concurrent and external controls have been proposed to dynamically discount external controls based on the heterogeneity of observed outcomes, referred to as prior-data conflict. However, prior-data conflict is viewed as a reactive measure of exchangeability. Some suggest using the propensity score (PS) overlap as a fixed discounting parameter to proactively borrow based on the severity of selection bias that undermines exchangeability. However, this approach overlooks prior-data conflict and other types of biases that can also affect exchangeability. Here, we propose a PS-integrated Bayesian proactive dynamic borrowing strategy based on the quantitative evaluation of exchangeability. Our approach first balances the covariates using PS, then fits the outcome models separately for the concurrent and external control arms. Exchangeability is then quantitatively evaluated based on differences in the model coefficients and the means of covariates, which reflect the conditional exchangeability given covariates and the covariate similarity, respectively. An elastic function is adopted to convert these differences to the exchangeability index within the range of [0,1], where the hyperparameters of the elastic function are determined by the pre-specified maximum clinically tolerant difference of marginal effects between concurrent and external controls under the assumption of exchangeability. If any differences in covariates or model coefficients result in a change in the predicted average control effect that exceeds the pre-specified tolerant difference, the exchangeability index will become a small value close to zero. 
Finally, a weakly informative initial prior with the exchangeability index as its mean will be used for the random discounting parameter, such as the power parameter of the power prior. Therefore, our approach can proactively control the amount of discounting of external controls based on the degree of exchangeability, while still allowing for dynamic borrowing based on prior-data conflict. In the simulation study, we examine the statistical operating characteristics of our approach in scenarios with various biases of differing severity, including selection bias, unmeasured confounders, measurement bias in covariates and outcomes, and effect shift. Under mild selection bias, our approach performs similarly to PS-integrated dynamic borrowing based on prior-data conflict in terms of power gain and the width of the 95% credible interval. However, under severe selection bias, our approach’s power gain and the improvement in the 95% credible interval become smaller than those of PS-integrated dynamic borrowing. More importantly, in the presence of other biases, our approach demonstrates better control of bias and the Type I error rate than PS-integrated dynamic borrowing, particularly when these biases are severe. Furthermore, the pre-specification of a tolerant difference is crucial, as a less stringent value tends to increase bias and inflate the Type I error rate. In conclusion, our PS-integrated Bayesian proactive dynamic borrowing strategy can discount external controls based on both the biases that undermine exchangeability and prior-data conflict.
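
As a minimal illustration of how a discounting parameter controls borrowing, the sketch below applies a power prior with a fixed power parameter to a binary control-arm outcome under a conjugate Beta model. The binary endpoint, the Beta(1, 1) initial prior, the counts, and the function names are assumptions made for illustration only. The approach described above instead treats the power parameter as random, with the exchangeability index as its prior mean.

```python
def power_prior_posterior(y_cur, n_cur, y_ext, n_ext, a0, a=1.0, b=1.0):
    """Beta posterior for a control response rate.

    External data (y_ext responders out of n_ext) are downweighted by the
    power parameter a0 in [0, 1]; a and b define the initial Beta prior.
    All inputs here are hypothetical.
    """
    if not 0.0 <= a0 <= 1.0:
        raise ValueError("a0 must lie in [0, 1]")
    alpha = a + a0 * y_ext + y_cur
    beta = b + a0 * (n_ext - y_ext) + (n_cur - y_cur)
    return alpha, beta


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical data: 12/40 concurrent controls, 45/100 external controls.
for a0 in (0.0, 0.3, 1.0):
    alpha, beta = power_prior_posterior(12, 40, 45, 100, a0)
    print(f"a0={a0:.1f}: posterior mean {beta_mean(alpha, beta):.3f}")
```

With a0 = 0 the external controls are ignored, and with a0 = 1 they are pooled at full weight; intermediate values give partial borrowing.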

P-30 – BAYESIAN IN-SILICO CLINICAL TRIALS APPLIED TO OBESITY-RELATED CANCER PREVENTION: THE IMPORTANCE OF EXPERT ELICITATION FOR KEY PARAMETERS IN THE ABSENCE OF EXISTING DATA USING THE SHELF METHOD

Primary Author:

1) Matthew Harris (Manchester Cancer Research Centre)

Co-Author(s):

2) Duncan Wilson (Leeds Clinical Trials Research Unit)

3) Jeremy Oakley (University of Sheffield)

4) Kate Ren (University of Sheffield)

5) Andrew G Renehan (Manchester Cancer Research Centre)

Introduction: Clinical trial feasibility is a critical consideration in the design and implementation of interventions, particularly when addressing complex health outcomes such as cancer prevention. In-silico trials make it possible to model a clinical trial without the risks of conducting it in the real world. A Bayesian framework allows uncertainty to be incorporated into these models, giving an understanding of the associated risks and of the impact of key aspects of a potential clinical trial. As part of a multi-modal preliminary analysis of the feasibility of a large-scale trial of a weight loss intervention to prevent cancer, Bayesian in-silico trials have been designed to understand feasibility and optimal designs. In-silico models are only as valid as their input data. In the absence of strong literature, expert opinion can be used to define these parameters. This study describes the potential and importance of expert elicitation of key prior specifications using the SHELF method (shelf.sites.sheffield.ac.uk).

Methods: A two-step in-silico clinical trial has been designed to simulate cancer incidence in a trial of a weight loss intervention to prevent cancer. It first simulates individual weight losses from priors taken from the literature, then simulates a probability of cancer based on each virtual patient’s weight loss. The prior distribution representing the relationship between individual weight loss and cancer risk is very poorly defined in the literature, with a high degree of uncertainty. We compare the effect on trial assurance of selecting each of 3 plausible prior distributions for this effect. We then undertake expert elicitation of this distribution using the SHELF framework, comparing the impact on assurance as a single value and a distribution of probability.
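
The two-step simulation and its link to assurance can be sketched as follows. This is a rough illustration, not the authors' model: the Normal prior on a log relative risk per kilogram lost, the weight-loss distributions, the baseline risk, the sample size, and the one-sided z-test success rule are all hypothetical choices. Assurance is estimated as the proportion of simulated trials that succeed when the effect is drawn from its prior.

```python
import math
import random


def simulate_trial(rng, log_rr_per_kg, n_per_arm=500, baseline_risk=0.10):
    """Simulate one virtual trial; return True if it 'succeeds'."""
    events = []
    # Step 1: draw each virtual patient's weight loss (kg), control then intervention.
    for mean_loss, sd_loss in ((0.0, 3.0), (8.0, 5.0)):
        count = 0
        for _ in range(n_per_arm):
            loss = rng.gauss(mean_loss, sd_loss)
            # Step 2: convert weight loss into an individual cancer probability.
            risk = min(1.0, baseline_risk * math.exp(log_rr_per_kg * loss))
            count += rng.random() < risk
        events.append(count)
    p_control = events[0] / n_per_arm
    p_interv = events[1] / n_per_arm
    pooled = (events[0] + events[1]) / (2 * n_per_arm)
    se = math.sqrt(2 * pooled * (1 - pooled) / n_per_arm)
    if se == 0:
        return False
    # Success: intervention risk significantly lower (one-sided 2.5% level).
    return (p_control - p_interv) / se > 1.96


def assurance(prior_mean, prior_sd, n_sims=200, seed=1):
    """Proportion of successful trials, averaging over the effect prior."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        effect = rng.gauss(prior_mean, prior_sd)  # draw from the (hypothetical) prior
        wins += simulate_trial(rng, effect)
    return wins / n_sims


# Compare two hypothetical priors for the weight-loss/cancer-risk relationship.
print("sceptical prior :", assurance(-0.02, 0.03))
print("optimistic prior:", assurance(-0.10, 0.03))
```

Swapping in an elicited distribution for the Normal prior would change only the line that draws `effect`.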

P-33 – A SIMULATION STUDY ON THE IMPLICATIONS OF ESTIMANDS IN TREATMENT SWITCHING IN META-ANALYSES

Primary Author:

1) Quang Vuong (Core Clinical Sciences)

Co-Author(s):

2) Rebecca K Metcalfe (University of British Columbia)

3) Antonio Remiro-Azócar (Novo Nordisk)

4) Anders Gorst-Rasmussen (Novo Nordisk)

5) Oliver Keene (KeeneONStatistics)

6) Jay JH Park (McMaster University)

The ICH E9(R1) addendum promotes the estimands framework to harmonize the reporting of strategies to account for intercurrent events in clinical trials. However, the implications of the estimands framework in meta-analysis have not been well studied. In the context of treatment switching as an intercurrent event, via simulation, we examined the bias caused by pooling together estimates targeting different estimands in a meta-analysis of randomized clinical trials (RCTs) that allowed for treatment switching. We simulated overall survival data of a collection of RCTs that allowed patients in the control group to switch to the intervention treatment after disease progression under fixed-effects and random-effects models. For each RCT, we estimated a treatment policy estimand that ignored treatment switching, as well as a hypothetical estimand that accounted for treatment switching by censoring switchers at the time of switching. Then, we pooled together RCT effect estimates under fixed-effects and random-effects meta-analytical models while varying the proportions of treatment policy and hypothetical effect estimates. We contrasted effect estimates from meta-analyses that pooled different types of effect estimates with those that pooled only treatment policy or hypothetical estimates. We found that pooling estimates targeting different estimands results in pooled estimators that reflect neither the treatment policy estimand nor the hypothetical estimand. This finding suggests that pooling estimates of varying target estimands even under a random-effects model can produce misleading results. Adopting the estimands framework for meta-analysis may improve alignment between meta-analytic results and the clinical research question of interest.
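
The effect of mixing estimands can be illustrated with standard inverse-variance fixed-effect pooling. In this minimal sketch the two log hazard ratios, the common standard error, and the number of trials are hypothetical values chosen for illustration; they are not results from the simulation study.

```python
import math


def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance fixed-effect pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)


# Hypothetical trial-level log hazard ratios for the two estimands.
treatment_policy = math.log(0.85)  # ignores switching
hypothetical = math.log(0.70)      # censors switchers at the time of switching
se = 0.10
n_trials = 4

# Vary how many of the pooled trials report the hypothetical estimand.
for n_hyp in range(n_trials + 1):
    estimates = [hypothetical] * n_hyp + [treatment_policy] * (n_trials - n_hyp)
    pooled, pooled_se = pool_fixed_effect(estimates, [se] * n_trials)
    print(f"{n_hyp}/{n_trials} hypothetical: pooled HR = {math.exp(pooled):.3f}")
```

Whenever the two types are mixed, the pooled hazard ratio falls between the two estimand-specific values and matches neither.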

P-34 – SCORING RADIOGRAPHY COLLIMATION: QUANTIFYING BACKGROUND IMPACT USING NEURAL NETWORKS

Primary Author:

1) Viktor Osadsky (Exeter High School)

X-ray collimation significantly impacts both radiologist performance and model predictions. The excessive space around an image or exclusion of part of an image caused by technologist error is still prevalent in today’s practice where radiographs are made hastily or in high-volume settings. This study aims to develop a scoring mechanism capable of identifying and quantifying these features and perform image manipulation in case of excessive background. The pipeline would allow for greater quality of images provided to radiologists and feedback to technologists on how to improve collimation for future studies. In an environment for conducting clinical trials, the overall effect of reducing radiation exposure of patients when creating a dataset is substantial. A function for superficially annotating images was developed, generating masks to be passed through a model for highlighting background sections. This study used a combination of 1448 randomly selected images from the FracAtlas dataset and 4821 randomly selected images from Stanford’s MURA dataset, split into 70% for training, 22.5% for testing, and 7.5% for validation. A U-Net model was used with an EfficientNetB4 base model, which provided predicted masks for a background collimation-scoring equation. To identify the condition of an overcrop, the segmentation output was used to correct excessive background on the RSNA Bone Age dataset, with the same split as before. Of these images, 50% were randomly cropped to simulate the over-cropping caused by poor collimation, and the images were fed into a ResNet50 model. The output of this pipeline provides technologists with future suggestions through the collimation-score and identification of over-crop. Quantitative analysis of the U-Net model on the test set revealed a Dice coefficient of 0.94, IoU of 0.7963, precision of 0.7485, and recall of 0.9308. 
However, these results may be misleading, as the segmentation model displayed greater coverage of background areas (for example, in rotation) compared to the manual function. The ResNet over-crop model achieved an accuracy of 0.8465, AUC of 0.9094, and loss of 0.3819 on the testing set. The quantitative analysis highlighted the model’s strengths and weaknesses, while also establishing it as a respectable system for collimation identification. These findings suggest that determining the collimation-score of an X-ray image can be done more effectively through a machine learning approach than manual functions. The importance of this pipeline is highlighted when used with clinical trials, as its potential to reduce unnecessary radiation exposure provides a valuable improvement of participant care in radiography experimentation. Some limitations of this pipeline include the exclusion of human-made annotations, relying solely on image patterns.
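
For reference, the segmentation metrics named above can be computed from a predicted and a reference binary mask as in the sketch below. The tiny masks and the function name are hypothetical, and the convention of returning 1.0 when a denominator is zero is an assumption of this sketch.

```python
def segmentation_scores(pred_mask, true_mask):
    """Dice, IoU, precision and recall for two binary masks (nested lists of 0/1)."""
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1      # background predicted and present
            elif p:
                fp += 1      # predicted but absent
            elif t:
                fn += 1      # present but missed

    def ratio(num, den):
        # Assumed convention: a metric with an empty denominator scores 1.0.
        return num / den if den else 1.0

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }


# Hypothetical 2x3 masks: 1 marks a background pixel.
predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]
print(segmentation_scores(predicted, reference))
```

For a single pair of masks, Dice and IoU are linked by Dice = 2·IoU / (1 + IoU).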

P-36 – SITE-SPECIFIC COVARIATE AND GROUP SIZE IMBALANCE IN MULTICENTER ACUTE STROKE TRIALS: A COMPARISON OF COVARIATE-ADAPTIVE RANDOMIZATION AND BLOCK RANDOMIZATION STRATEGIES

Primary Author:

1) Timofei Biziaev (University of Calgary)

Co-Author(s):

2) Michael D Hill (University of Calgary)

3) Tolulope Sajobi (University of Calgary)

4) Leonid Churilov (University of Melbourne)

5) Bijoy K Menon (University of Calgary)

Background: Preservation of treatment allocation randomness, achievement of treatment group size balance, and balance on prognostic baseline covariates are desirable properties of optimal randomization schemes. Previous studies have demonstrated the accuracy of covariate adaptive randomizations, such as minimal sufficient balance (MSB) randomization, for achieving group size and covariate balance in acute stroke trials at the end of the trial. However, there is limited investigation of the performance of these randomization designs to achieve within-site balance and preserve treatment allocation randomness across covariates, especially in trials with low-enrolling sites. This study evaluates the performance of block randomization and covariate adaptive randomization techniques in minimizing site-specific imbalance in multicenter acute stroke trials.

Methods: Monte Carlo simulations were used to evaluate the performance of MSB, stratified MSB, common scale MSB (cs-MSB), stratified cs-MSB, common scale group-size MSB (csSize-MSB), stratified csSize-MSB, and permuted block designs in achieving balance on baseline covariates across sites. Simulation conditions investigated include number of sites (3 or 20 sites), enrollment per site (equal or unequal enrollment across sites), treatment-control group allocation ratio (ratio of 1), number and distribution of baseline covariates (sex, age, NIH Stroke Scale, large vessel occlusion status), and sample size (N=600). The probability of observing statistically significant imbalance on any baseline covariate, proportion of biased allocations, and overall and site-specific group allocation ratio at interims and end of enrollment were used to evaluate the performance of the randomization schemes.
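
As a simplified illustration of one comparator, the sketch below generates site-stratified permuted block allocations and reports the treatment-control imbalance at each site. The block size of 4, the site enrollment numbers, and the function names are hypothetical; the MSB variants are not implemented here.

```python
import random


def permuted_block_sequence(n, block_size, rng):
    """Allocation sequence of length n built from shuffled 1:1 blocks."""
    sequence = []
    while len(sequence) < n:
        block = ["T"] * (block_size // 2) + ["C"] * (block_size // 2)
        rng.shuffle(block)
        sequence.extend(block)
    return sequence[:n]  # the last block may be incomplete


def site_imbalances(site_sizes, block_size=4, seed=0):
    """Treatment minus control count per site under site-stratified blocks."""
    rng = random.Random(seed)
    result = {}
    for site, n in site_sizes.items():
        sequence = permuted_block_sequence(n, block_size, rng)
        result[site] = sequence.count("T") - sequence.count("C")
    return result


# Hypothetical unequal enrollment: one high-volume and two low-enrolling sites.
print(site_imbalances({"site_A": 200, "site_B": 7, "site_C": 3}))
```

Within a site the absolute imbalance cannot exceed half the block size, but at a low-enrolling site an incomplete block is a large share of that site's total enrollment.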

Results: The average probabilities of observing imbalance on any of the baseline covariates for the stratified permuted block, cs-MSB, csSize-MSB, and MSB designs were 20%, 11%, 11%, and 6%, respectively. Although site-specific treatment allocation imbalance at low-enrolling sites persisted, regardless of the randomization scheme, treatment allocation randomness and treatment-control group balance were preserved in high-volume sites.

Conclusions: Although site-specific imbalance on covariates persisted in low-enrolling sites, regardless of the randomization technique adopted, the randomness of treatment allocation and covariate balance were preserved. Logistical considerations to minimize low enrollment across sites are recommended prior to onboarding sites in multicenter acute stroke trials.

P-37 – MEASURING THE SUCCESS OF TREATMENT BLINDING IN SHAM-CONTROLLED TRIALS

Primary Author:

1) Anh Phan (Medical University of South Carolina)

Co-Author(s):

2) Valerie Durkalski-Mauldin (Medical University of South Carolina)

Measuring the short- and long-term success of treatment blinding in clinical trials is critical for understanding its impact on study outcomes and the validity of the results. Effective blinding is crucial, as it mitigates the risk of bias arising from participant and investigator expectations, which can significantly inflate treatment effects. Although blinding is a common design feature, several trials, particularly device and surgical trials, are challenged to develop adequate controls for blinding purposes. When feasible, these trials attempt to preserve the blind by developing a “sham” control that mimics the experimental treatment. In these cases, it is particularly important to assess the quality of blinding and the impact on the treatment estimates. We aim to assess the short- and long-term effectiveness of blinding in a sham-controlled clinical trial setting using the Bang Blinding Index. SHARP (NCT03609944) is a multi-center, sham-controlled, single-blinded randomized clinical trial with blinded outcome assessment of endoscopic retrograde cholangiopancreatography with minor papilla endoscopic sphincterotomy for the treatment of recurrent acute pancreatitis with pancreas divisum. Patients were blinded by not receiving clinical reports or bills for procedures. Research staff and blinded physicians involved with data collection were unaware of the treatment allocation. Subjects, research coordinators, and evaluating physicians were asked to guess the treatment allocation on several occasions during the follow-up period of 48 months. This presentation will review the process of blinding and the success of blinding based on the questionnaires used in the trial, examining the likelihood of a correct “guess” of assigned treatment and the likelihood of guessing that the participant received the real treatment (regardless of being correct). The goal will be to highlight the importance of measuring the success of blinding in a clinical trial and methods of measurement.
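
For a single treatment arm, the Bang Blinding Index reduces to the proportion of correct guesses minus the proportion of incorrect guesses, with "don't know" responses counted in the denominator. The sketch below computes it for hypothetical counts; these are not SHARP data, and the three-category response format is an assumption.

```python
def bang_blinding_index(n_correct, n_incorrect, n_dont_know):
    """Bang Blinding Index for one arm, ranging from -1 to 1.

    0 is consistent with random guessing, 1 with complete unblinding, and
    -1 with every respondent guessing the opposite arm.
    """
    total = n_correct + n_incorrect + n_dont_know
    if total == 0:
        raise ValueError("at least one response is required")
    return (n_correct - n_incorrect) / total


# Hypothetical guesses collected at one follow-up visit.
print("sham arm     :", bang_blinding_index(n_correct=20, n_incorrect=25, n_dont_know=15))
print("procedure arm:", bang_blinding_index(n_correct=30, n_incorrect=20, n_dont_know=10))
```

Computing the index separately by arm and by visit would allow short- and long-term blinding to be compared.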

P-39 – AN R SHINY WEB APPLICATION FOR DATA MONITORING AND QUALITY ASSURANCE IN THE ACUTE TO CHRONIC PAIN SIGNATURES STUDY (A2CPS)

Primary Author:

1) Briha Ansari (Johns Hopkins Bloomberg School of Public Health)

Co-Author(s):

2) Patrick Sadil (Johns Hopkins Bloomberg School of Public Health)

3) James Ford (Dartmouth College)

4) Margaret Taub (Johns Hopkins Bloomberg School of Public Health)

5) Ari Kahn (University of Texas at Austin)

6) Joshua Urrutia (University of Texas at Austin)

7) Andre Hackman (Johns Hopkins Bloomberg School of Public Health)

8) Adi Gherman (Johns Hopkins Bloomberg School of Public Health)

9) Martin A Lindquist (Johns Hopkins Bloomberg School of Public Health)

Background: Clinical trials and observational studies support the synthesis of evidence and the development of clinical guidelines, highlighting the necessity for strong data quality assurance measures. The Acute to Chronic Pain Signatures (A2CPS) study is a large-scale, multisite observational study aimed at understanding chronic post-surgical pain and opioid dependence. The main goal of A2CPS is to identify biomarkers predictive of progression from acute to chronic pain in patients following total knee arthroplasty or thoracic surgery. The A2CPS sites collect data across various domains, including brain magnetic resonance imaging, electronic health records, psychosocial, multi-omics, quantitative sensory testing, and functional testing. A2CPS is an observational study, but its aims, design, and methodology closely align with clinical trial practices. Like multi-center clinical trials, it is an interdisciplinary initiative among experts from various fields, follows standardized protocols, and has defined eligibility criteria for study participants. In multifaceted studies like A2CPS, high-quality data are paramount to ensure the accuracy of predictive biomarkers. To improve quality assurance for the A2CPS study data, we developed the A2CPS Data Monitoring Web Application (Web App), an interactive R Shiny web app with real-time data monitoring capabilities. In this abstract, we describe the functionality and the utility of the A2CPS Data Monitoring Web App in streamlining quality assurance for the A2CPS study.

Methods: The A2CPS Data Monitoring Web App retrieves data from REDCap and feeds preprocessed data into the R Shiny framework. The user interface has a navigation bar and six subpanels. Each subpanel has a specific use case and the functionality to generate downloadable error reports for individual sites, making it easy to share quality documents and communicate with the data collection sites. The A2CPS Data Monitoring Web App is a desktop application for authorized Data Integration and Resource Center (DIRC) members. DIRC uses it to identify errors and coordinate with the sites to facilitate training for research personnel and error resolution.

Results: The regular use of the A2CPS Data Monitoring Web App and interaction with the training team resulted in an average reduction of 66% in data quality errors over the last 11 months for the neuroimaging case report form data. The decline in errors was consistent across all sites despite steady enrollment rates. The results show that real-time data monitoring helps provide focused feedback, mitigate future errors, and streamline data quality assurance.

Conclusion: The A2CPS Data Monitoring Web App plays a key role in A2CPS data quality assurance. It is a robust solution for reducing data entry errors, combined with focused feedback and training. Our results support the potential for using open-source computational frameworks that can be adapted by clinical trials and observational studies for data monitoring and quality assurance purposes.

P-41 – REPRESENT: EXPLORING BARRIERS TO RECRUITMENT OF UNDERREPRESENTED GROUPS IN BLADDER AND HEAD & NECK ONCOLOGY TRIALS

Primary Author:

1) Georgiana Synesi (The Institute of Cancer Research)

Co-Author(s):

2) Judith Bliss (The Institute of Cancer Research)

3) Patrick Kierkegaard (Imperial College London)

4) Lucy Kilburn (The Institute of Cancer Research)

5) Rebecca Lewis (The Institute of Cancer Research)

6) Reshma Punjabi (South East London Consumer Research Panel for Cancer)

7) Emma Hall (The Institute of Cancer Research)

Introduction: Underrepresented groups participate in trials less frequently than expected based on population estimates. Addressing this issue requires confirming which groups are included less than expected, and understanding the reasons behind this. US data suggest that ethnic minority groups, older adults, and less affluent individuals are often underrepresented in oncology trials. Although limited quantitative data exist on UK trial participants, a study of 2,700 participants in trials managed by the Clinical Trials and Statistics Unit at the Institute of Cancer Research showed that adults aged 80 and older, women, people living in the most deprived areas, and people from minority ethnic backgrounds were underrepresented compared with the population who were treated for their cancer outside of a clinical trial. REPRESENT employs a mixed-methods approach to identify underserved groups based on a broad range of demographic data, understand reasons for exclusion, and develop targeted recruitment interventions accordingly.

Methods: Participants include patients with bladder or head & neck cancer, and hospital staff involved in their care and/or trial recruitment, at a cancer research-focused NHS hospital in the UK. A bespoke demographic data questionnaire was developed for REPRESENT to identify underrepresented groups by examining patterns in who is invited to participate in trials, and who agrees. Questionnaire items include age, sex and gender, ethnic background, religion, language preferences, health status, and measures of socioeconomic status such as education and employment. A similar questionnaire administered to hospital staff is being used to identify whether their identities have any impact on the identities of the patients they recruit. Ethnographic observations are being conducted during multi-disciplinary team meetings and treatment consultations. Observations of interactions between patients and hospital staff enable the identification of trends and nuances of how decisions around clinical trial participation are made by hospital staff and patients. Semi-structured interviews are being conducted with patients and hospital staff to explore attitudes towards inclusive trial enrolment, informing the design of our interventions. Focus groups comprising patients, hospital staff, and protocol development personnel will utilize insights from earlier observations and interviews to co-develop a recruitment intervention. This collaborative approach ensures that the intervention reflects the complex interaction of experiences and perspectives gathered, particularly during multidisciplinary discussions and treatment consultations. Patient and public contributors were consulted during study design and development of patient-facing materials, and will continue to advise.

Timelines: REPRESENT commenced in October 2024, and research activities are expected to last until Spring 2025. Examples of underrepresented groups, and barriers and facilitators to inclusive trial recruitment will be shared at SCT 2025.

Evaluation and impact: To our knowledge, REPRESENT is the first study employing this methodology to address underrepresentation in UK oncology trials. Due to the complexity of the protocol, several iterations were required to co-design appropriate patient-facing material. Although patient and public contributors from a range of backgrounds provided input, they were mainly experienced in clinical research. Ongoing efforts will focus on engaging a broader spectrum of newly diagnosed patients to ensure the patient-facing material and subsequent interventions developed are widely applicable and effective.

P-42 – LEVERAGING SEASONAL VARIATION FOR PREDICTING ACCRUAL IN CLINICAL TRIALS USING BAYESIAN POSTERIOR PREDICTIVE DISTRIBUTIONS

Primary Author:

1) Jonathan Beall (Medical University of South Carolina)

Co-Author(s):

2) Byron Gajewski (University of Kansas Medical Center)

3) Valerie Stevenson (University of Michigan)

4) Fred Korley (University of Michigan)

5) Bill Barsan (University of Michigan)

6) Renee Martin (Medical University of South Carolina)

7) Galan Rockswold (Hennepin Healthcare Research Institute)

Effective statistical tools are essential for initial planning and ongoing monitoring of clinical trials. A key factor that investigators must carefully assess is the accrual rate—the speed at which patients are enrolled. Slow accrual can limit the likelihood that the trial will provide results with sufficient power to make meaningful scientific inferences. We propose a method for predicting accrual rates while accounting for seasonal variations that often affect enrollment patterns in emergency medicine. Using a Bayesian framework, we combine prior knowledge with data up to a given monitoring point to generate predictions, incorporating seasonal trends into the model. We present posterior predictive distributions for accrual, addressing parameter uncertainty and sampling variability. To illustrate the method, we apply it to ongoing clinical trials, including the HOBIT trial, and compare our seasonal model to the approach we currently use in practice. We discuss practical considerations related to the accrual process, including the impact of seasonal fluctuations on recruitment, and highlight the advantages of our proposed method over traditional approaches.
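
One simple way to realize this idea is a conjugate Gamma-Poisson model in which each month's expected enrollment is a common rate multiplied by a known seasonal multiplier. The sketch below is an illustration under that assumption and not necessarily the model used by the authors; the prior, the counts, and the multipliers are hypothetical.

```python
import math
import random


def posterior_rate(prior_shape, prior_rate, counts, multipliers):
    """Gamma posterior (shape, rate) when count_m ~ Poisson(rate * multiplier_m)."""
    return prior_shape + sum(counts), prior_rate + sum(multipliers)


def draw_poisson(rng, mean):
    """Poisson draw via Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def predict_accrual(shape, rate, future_multipliers, n_draws=2000, seed=7):
    """Sorted posterior predictive draws of total future enrollment."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_draws):
        lam = rng.gammavariate(shape, 1.0 / rate)  # parameter uncertainty
        # Sampling variability, month by month, with seasonal scaling.
        totals.append(sum(draw_poisson(rng, lam * s) for s in future_multipliers))
    return sorted(totals)


# Hypothetical monitoring point: six months observed, six months to predict.
observed = [8, 9, 12, 14, 13, 10]
observed_season = [0.8, 0.9, 1.1, 1.3, 1.2, 1.0]
future_season = [0.9, 0.8, 0.8, 0.9, 1.0, 1.1]

shape, rate = posterior_rate(10.0, 1.0, observed, observed_season)
draws = predict_accrual(shape, rate, future_season)
lo, mid, hi = draws[50], draws[1000], draws[1949]
print(f"predicted enrollment: median {mid}, central 95% interval {lo} to {hi}")
```

The spread of the draws reflects both uncertainty in the underlying rate and month-to-month sampling variability.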

P-43 – MAPPING THE KDQOL-36 ONTO THE EQ-5D-5L UTILITY INDEX IN PATIENTS UNDERGOING HAEMODIALYSIS

Primary Author:

1) Hannah Worboys (University of Leicester)

Co-Author(s):

2) Nicola Cooper (University of Leicester)

3) James Burton (University of Leicester)

4) Laura Gray (University of Leicester)

Purpose: Health-related quality of life (HRQoL) is frequently used as a primary outcome in clinical trials involving patients with end-stage kidney disease, measured through self-reported questionnaires that are rigorously validated to ensure they capture outcomes important to patients. Although widely used as a tool to measure quality of life, the Kidney Disease Quality of Life Questionnaire (KDQoL-36) does not include a measure of health utility that would enable economic analyses to be performed.

Aim: This study aimed to map the KDQoL-36 onto the EuroQol 5 Dimension (EQ-5D-5L) utility index for patients with end-stage kidney disease undergoing hemodialysis.

Methods: For the development of the mapping function, data from a randomized controlled trial was used and consisted of 6,603 observations. Two modelling techniques were applied: i) linear regression with fixed effects and ii) adjusted limited dependent variable mixture model (ALDVMM). Several model specifications were tested, and the preferred model was chosen based on a catalogue of performance indicators. The validation phase involved applying a selection of the top performing models to an independent dataset consisting of 117 observations. This study follows the mapping onto preference-based measures reporting standards statement.

Results: The ALDVMM with three components, using five domains (physical component score, mental component score, burden, symptoms, and effects) as well as age and sex as explanatory variables, was the preferred model during the estimation phase. The validation phase supported this result, as the three-component ALDVMM was the highest performing model. This model dominated in all aspects of predictive performance. The mapping function has been formatted as an Excel workbook (as well as a ster file) to ensure ease of accessibility and use.

Conclusion: This novel mapping function translates the KDQoL-36 to EQ-5D-5L values in patients with end-stage kidney disease undergoing hemodialysis. This study provides researchers with a way to calculate QALYs in the absence of directly collected utility. This is the first study to develop a mapping algorithm between the KDQoL-36 and the EQ-5D-5L using the US value set. The model results demonstrate satisfactory fit and precision, providing valuable tools for clinicians and researchers, particularly in situations where generic preference-based health-related quality of life instruments are inaccessible for utility derivation in cost-effectiveness studies.

P-45 – METHODOLOGICAL FEATURES OF EARLY PHASE NON-RANDOMISED ADAPTIVE PLATFORM TRIALS: A COMPREHENSIVE REVIEW OF DESIGN, IMPLEMENTATION AND REPORTING

Primary Author:

1) Sabine Dreibe (The Institute of Cancer Research)

Co-Author(s):

2) Xiaoran Lai (The Institute of Cancer Research)

3) Christina Yap (The Institute of Cancer Research)

Background: Early-phase clinical trials often face significant challenges, including small patient populations, high costs and failure rates. Adaptive platform trials (APTs) have shown promise in addressing some of these issues by allowing multiple treatments or interventions to be tested simultaneously, with the flexibility to add or drop arms, adjust sample sizes, and make changes in real time. These features, which offer increased efficiency and reduced resource use, have made APTs increasingly popular in recent years, with a particularly notable surge in use during the COVID-19 pandemic. While randomized APTs are more common, particularly in later-phase trials, non-randomized APTs have a valuable role in early-phase trials where the primary focus is on safety and preliminary efficacy rather than comparative efficacy. However, the implementation of non-randomized APTs poses unique methodological challenges, including selection bias, integration of adaptive features, and the need for robust statistical frameworks. Bayesian and frequentist approaches, both widely used in early clinical trials, offer distinct advantages and limitations that can significantly influence the design and interpretation of APTs. Comprehensive guidance on their design, implementation and reporting remains limited, leading to variability in practice and reporting standards. This review aims to address these gaps by systematically examining the methodological features of early-phase APTs, offering insights into past and current trends, identifying best practices, and highlighting areas for further methodological development.

Methods: A comprehensive review of non-randomized APTs will be conducted to assess several research questions: (1) Key design elements (e.g., trial design, sample size determination, trial duration, endpoints, statistical framework); (2) Implementation strategies (e.g., data monitoring, decision rules); (3) Reporting practices (e.g., protocol transparency, publication outputs); (4) Past and current trends in statistical methods; and (5) Temporal trends across selected subgroups: oncology vs non-oncology and industry vs non-industry. A comprehensive search will be conducted in Q4 2024 and Q1 2025 across major databases including PubMed, ClinicalTrials.gov and Google Scholar to identify relevant published papers, protocols, clinical trial entries, citation tracked sources and grey literature from 2018-2024. Trials eligible for inclusion must be multi-arm, adaptive platform trials, umbrella or basket trials, and non-randomized early-phase (Phase I or II). Statistical analysis methods will include descriptive statistics and data stratification by methods or subgroup type to assess baseline characteristics. Other methods will include logistic regression or generalized linear models to assess temporal changes.

Results: Results of this review will be presented at the meeting in spring 2025. The findings will add to the existing literature on randomized APTs and provide a detailed understanding of key characteristics, outcomes, and the evolution of methodological trends in non-randomized early-phase APTs.

Conclusion: This review aims to bridge the currently existing literature gap by offering valuable insights into the emerging trends of methods employed in non-randomized early phase APTs. The findings will guide clinical research by encouraging the adoption of innovative statistical methods where appropriate, improving trial efficiency, trial setup, treatment development, and, most importantly, enhancing patient outcomes.

P-46 – STREAMLINING PROMIS SCORING BY USING AN APPLICATION PROGRAMMING INTERFACE (API) FOR AUTOMATION IN THE LOOK AHEAD AGING STUDY

Primary Author:

1) Tara Beckner (Wake Forest University School of Medicine)

Co-Author(s):

2) Anthony Alvarado (Wake Forest University School of Medicine)

3) Jerry Barnes (Wake Forest University School of Medicine)

4) Haiying Chen (Wake Forest University School of Medicine)

5) Mark Espeland (Wake Forest University School of Medicine)

6) Darrin Harris (Wake Forest University School of Medicine)

7) Denise Houston (Wake Forest University School of Medicine)

8) Rebecca Neiberg (Wake Forest University School of Medicine)

9) Lynne Wagenknecht (Wake Forest University School of Medicine)

This abstract compares two methods for scoring PROMIS instruments in the Look AHEAD Aging study. Traditionally, the scoring has been manual, which involves exporting data to a CSV format, uploading it for scoring, and receiving results via email, a process prone to delays and errors. Recently, Wake Forest University School of Medicine transitioned to using the Assessment Center API, which automates scoring and improves efficiency. The abstract highlights the advantages of the API in streamlining data handling and improving scoring accuracy, while also addressing potential consequences for data integrity in clinical trials.

P-47 – OPTIMIZING DATA ENTRY AND ENHANCING PROTOCOL ADHERENCE THROUGH DYNAMIC ELECTRONIC FORM VISIBILITY IN A MULTI-SITE CLINICAL STUDY

Primary Author:

1) Vilma Okey-Ewurum (Massachusetts General Hospital)

Co-Author(s):

2) Maddy Roberts (Massachusetts General Hospital)

Objective: The DISCOVERY study is a multi-site, longitudinal nested cohort study conducted across 30 Clinical Performance Sites in the United States. Study visits may be conducted in person, by phone, or as a hybrid of both. Given that some assessments are specific to either phone or in-person visits, it is essential to explicitly capture how each visit is conducted to accurately monitor protocol adherence. At study initiation, all forms were universally visible in the visits, leading to data entry errors where staff completed forms that were not relevant to the visit type or inadvertently administered assessments outside of the protocol simply because those forms were visible. To mitigate these issues and improve monitoring, Visit Face Sheets were implemented to control form visibility.

Methods: Visit Face Sheets are forms designed to manage the visibility of all other forms within a visit. Upon entering a visit, the Face Sheet is the only visible form. Additional forms are dynamically displayed based on selections made on the Face Sheet, such as the mode of the visit (in-person or phone) and whether cognitive testing was conducted, either in-person or by phone. This ensures that only relevant forms are visible for the selected visit type, reducing the risk of errors. Because the Visit Face Sheets were introduced after a sizeable number of study visits had already occurred, an algorithm was used to complete the Face Sheets for previous visits to avoid requiring sites to complete the data entry manually. The algorithm completed the face sheet fields based on over 20 pre-specified scenarios. Each scenario was given a predetermined rank based on relevance. The algorithm script evaluated the visit data for each visit against the scenarios and assigned the values of the applicable condition with the highest rank.
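The ranked-scenario step described above can be sketched as follows. This is a minimal illustration, not the study's actual script: the field names (`vitals_form_complete`, `phone_interview_complete`, `visit_mode`), the two example scenarios, and the convention that a lower rank number means higher priority are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Scenario:
    rank: int                          # assumed convention: 1 = highest priority
    applies: Callable[[Dict], bool]    # condition evaluated against the visit data
    face_sheet_values: Dict[str, str]  # values to write to the Face Sheet if chosen


def fill_face_sheet(visit: Dict, scenarios: List[Scenario]) -> Optional[Dict[str, str]]:
    """Return the Face Sheet values of the highest-ranked applicable scenario.

    Returns None when no scenario applies, so the visit stays uncategorized.
    """
    matches = [s for s in scenarios if s.applies(visit)]
    if not matches:
        return None
    best = min(matches, key=lambda s: s.rank)
    return dict(best.face_sheet_values)


# Hypothetical scenarios: evidence of an in-person form outranks a phone form.
SCENARIOS = [
    Scenario(1, lambda v: v.get("vitals_form_complete", False), {"visit_mode": "in-person"}),
    Scenario(2, lambda v: v.get("phone_interview_complete", False), {"visit_mode": "phone"}),
]

visit = {"vitals_form_complete": True, "phone_interview_complete": True}
print(fill_face_sheet(visit, SCENARIOS))
```

Returning `None` for visits that match no scenario keeps incomplete records visibly uncategorized rather than forcing a guess.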

Results: The algorithm successfully populated 7,735 Face Sheet fields for previous visits (96.3%), while 294 fields (3.7%) remained uncategorized due to incomplete data. In addition, visits conducted after the launch of the Face Sheets had a lower incidence of protocol deviations related to the administration of inappropriate assessments.

Conclusion: Visit Face Sheets are an effective tool for managing data entry and ensuring protocol adherence. While ideally deployed at the start of a study, these forms can be successfully introduced mid-study with minimal burden to site staff by using automated algorithms to populate fields. Post-launch, it is essential to monitor data for discrepancies between Face Sheet entries and the database to ensure data quality.

Challenges/limitations: Using an algorithm to complete the Face Sheets is vulnerable to errors in data entry as any mistakes would propagate and lead to misclassification. To ensure flexibility, Face Sheets can be edited after initial selection. However, if visit types are changed after initial data entry, previously entered forms would remain in the database but no longer be visible in the user interface, potentially causing discrepancies downstream during analysis. Future work will focus on enhancing data validation processes to prevent such issues and developing automated checks for inconsistencies between Face Sheet entries and existing visit data.

P-48 – APPLICATION OF THE FLEXIBLE PARAMETRIC CURE MODEL TO CLINICAL TRIAL DATA – THE MEAN SURVIVAL TIME AS A SUMMARY MEASURE OF THE UNCURED POPULATION

Primary Author:

1) Yuka Sano (National Cerebral and Cardiovascular Center)

Co-Author(s):

2) Shiro Tanaka (Kyoto University)

3) Koko Asakura (National Cerebral and Cardiovascular Center)

4) Tosiya Sato (Shiga University)

Background: In the context of survival analysis in recent cancer clinical trials, there has been an increasing interest in the proportion of patients who are not susceptible to the event, known as the cure proportion. For cure proportion estimation in clinical trials, the flexible parametric cure model was previously shown to be an attractive alternative to the Kaplan-Meier method if the last knot position is chosen carefully, and we proposed a procedure to determine it. A flexible parametric cure model with the last knot position at x months from the last observed event time is referred to as “the flexible parametric cure model with a delay of x months.” Following the proposed procedure, the flexible parametric cure model with a delay of 15 months was chosen to apply to the nivolumab group of the CheckMate 141 trial, which provided 0.296 (95% confidence interval: 0.219-0.378) as the cure proportion. However, our focus was only on estimating the cure proportion and not the summary measures of the uncured population survival. In this study, we demonstrate the application of our procedure and estimate the cure proportion along with the mean and median survival times of the uncured population.

Methods: We applied flexible parametric cure models to data from several clinical trials and used our previously proposed procedure to determine the last knot position of the model. Then, according to the chosen position of the last knot, the cure proportion and the mean and median survival times of the uncured population were estimated. The predicted survival estimates of the uncured population were also depicted.

Results: The flexible parametric cure model with a delay of 15 months was chosen to apply to the standard therapy group of the CheckMate 141 trial, which provided 0.173 (95% confidence interval: 0.094-0.273) as the cure proportion. The mean survival times of the uncured population of nivolumab and standard therapy were estimated as 5.43 (95% confidence interval: 4.70-6.15) and 5.06 (95% confidence interval: 4.34-5.79) months, respectively. The median survival times of the uncured population of nivolumab and standard therapy were estimated as 4.33 (95% confidence interval: 2.98-5.68) and 4.46 (95% confidence interval: 3.71-5.22) months.

Conclusions: The flexible parametric cure model can provide the mean survival time as a summary measure of the uncured population survival. Because the last knot position influences the estimate of cure proportion and mean and median survival times, the process of determining it should be clearly described in the analysis report.

P-49 – IMPACT OF USING ROUTINE DATA ON THE EFFICIENCY OF IMPLEMENTATION TRIALS: A QUALITATIVE COMPARATIVE CASE STUDY

Primary Author:

1) Charis Xie (Queen Mary University of London)

Co-Author(s):

2) Alice-Maria Toader (University of Liverpool)

3) Anna De Simoni (Queen Mary University of London)

4) Sandra Eldridge (Queen Mary University of London)

5) Joseph E Glass (Kaiser Permanente Washington Health Research Institute)

6) Hilary Pinnock (The University of Edinburgh)

7) Clare Relton (Queen Mary University of London)

Background: Randomized implementation trials evaluate the effects of implementation strategies on implementation outcomes and may also monitor clinical effectiveness. Routine healthcare data are commonly used in implementation trials for participant identification, intervention delivery, and/or outcome ascertainment. Trial efficiency is multifaceted, encompassing four theoretical constructs (scientific, operational, statistical, and economic), which are operationalized within five building blocks: trial design, trial process, infrastructure, superstructure, and stakeholders (the Trial Efficiency Pentagon). Despite frequent usage, the contribution of routine data to implementation trial efficiency remains underexplored. We aimed to investigate how the use of routine healthcare data affects trial efficiency in two implementation trials.

Methods: A comparative case study approach was employed to explore two trials: one UK-based and one US-based. Data were collected through interviews, documents, and feedback workshops. The Trial Efficiency Pentagon was applied to organize and analyze the findings, and data flow diagrams were created to visualize the routine data pathways in the trials.

Results: The two trials (DIGITS and IMP2ART) used routine data to characterize the practice population of eligible patients, support clinical and economic outcome evaluation, facilitate audit and feedback, and assist in intervention delivery. Common facilitators that supported the use of routine data included sufficient IT and hardware capacity, relatively low cost, centralized regulatory approval for multi-site studies, and strong collaboration and partnerships. Common barriers included administrative complexity, redundant bureaucratic processes, and challenges with data sharing requirements. Data quality serves as both a facilitator and a barrier. Key differences included the DIGITS trial’s in-house data warehouses within an integrated healthcare system, which ensured high data quality and enabled preliminary analyses. In contrast, the IMP2ART trial, managing a larger national sample, employed an external research database to integrate data from various EHR systems but faced challenges such as legacy systems, changes in coding standards and site-specific approvals.

Conclusions: Implementation trials are inherently population-based and population-serving. Embedding these trials within public healthcare services maximizes population reach, but to improve trial efficiency, systems must address technological and regulatory barriers. In contrast, in private healthcare systems, successful use of routine data in trials hinges on investing in robust IT infrastructure and ensuring comprehensive organizational commitment.

P-52 – DO SMALL CHANGES LEAD TO BIG IMPROVEMENTS? EMBEDDING EDI IN A VALUE DRIVEN WAY IN CLINICAL TRIALS RESEARCH INSTITUTE

Primary Author:

1) Liam Bishop (University of Leeds)

Co-Author(s):

2) Myka Ransom (University of Leeds)

3) Pei Loo Ow (University of Leeds)

4) Delia Muir (University of Leeds)

5) Holly Schofield (University of Leeds)

6) Brenda Phillips (University of Leeds)

7) Asif Farooqui (University of Leeds)

8) Laurie Cave (University of Leeds)

9) Abigail Olaleye (University of Leeds)

10) Dax Everritt (University of Leeds)

Introduction: Clinical trials have the potential to improve health outcomes for people worldwide. We know that many clinical trial populations are unrepresentative of the populations they should aim to serve. With many communities underserved due to poorly designed research and a lack of trust, clinical trials researchers need to find ways to be more inclusive. At the Leeds Institute for Clinical Trials Research, until the appointment of a Diversity and Inclusion (D&I) Officer, specifically looking at equity, diversity and inclusion in our research, there was no formal structure for tackling the problems outlined above. This is not a role traditionally seen in clinical trials research. With the advent of this post, we are enacting a local strategy specifically focused on EDI, and principles and processes are being implemented across the lifecycle of our research.

Methods: Once the D&I Officer post was formed, a group of individuals with a diverse range of ethnicities, sexes, genders, ages, and job roles (both research and professional services, as well as the chair of the University’s network for Minoritised Ethnic Staff) formed a working group. A prioritization exercise then identified key areas and work packages to inform an EDI research strategy. This was supplemented by a culture and values exercise. Once these were completed, two subgroups were then formed. These groups have led on work packages to report back into the main working group. Other strategic aims are supported by the working group (e.g. community engagement and methodology).

Results: We now have a formal structure within our institute for embedding EDI in our research. One example is how our Inclusive Data subgroup has introduced standard code lists to match the ethnicity data collected by the Office for National Statistics, making it easier to compare the diversity of clinical trial populations to that of the general and disease-specific populations and other studies. Our Inclusive Design subgroup is developing a platform of guidance and training to support staff in being more inclusive in their research design. Our community outreach has seen collaborations with the Culturally Diverse Hub form, leading to tangible outputs and building valuable relationships with individuals and organizations who support underserved groups.

Discussion: Having a formal D&I Officer post and working group is novel for a clinical trials unit. Having this post and structure has allowed us to tackle EDI in new and diverse ways while being informed by a values-based approach. Even with this organizational structure, it is challenging to meet the diverse array of challenges this work can bring. We need to think about ways we can resource staff time and increase skills capacity within different research portfolios to enact disease-specific research inclusion plans. Collaboration will also arguably be vital to ensure that we are not duplicating work, that resources are secured, and that we create flexible systems and structures in which we can implement and test potential solutions in our own institutions (e.g. SWATs).

P-53 – DOSE-FINDING DESIGNS TO ACCOUNT FOR PATIENT HETEROGENEITY IN CANCER CLINICAL TRIALS EVALUATING CELL THERAPIES

Primary Author:

1) Evan Bagley (Medical University of South Carolina)

Co-Author(s):

2) Nolan Wages (Virginia Commonwealth University)

This presentation describes a novel phase I trial design developed to enhance the safety and efficiency of cell therapies in oncology by specifically addressing the patient heterogeneity and dose-feasibility issues encountered in such therapies. Traditional dose-finding methods do not accommodate specific challenges encountered in cell-therapy trials, such as patients being unable to receive their intended dose due to manufacturing limitations or specific groups of patients being more prone to toxicity than others. To address these issues, we incorporate statistical models that allow for the adaptive updating of dose levels based on real-time patient data concerning both toxicity and dose-feasibility. Our design aims to estimate group-specific feasible maximum tolerated doses by sharing toxicity data between groups and utilizing data observed at unplanned dose levels. We present the design, give some trial illustrations, and provide simulation results.

P-54 – SIMULATION-BASED EXTERNAL VALIDATION OF SURVIVAL MODELS FOR RISK STRATIFICATION IN ONCOLOGY

Primary Author:

1) Gloria Brigiari (University of Padova)

Co-Author(s):

2) Ester Rosa (University of Padova)

3) Dario Gregori (University of Padova)

4) Giulia Lorenzoni (University of Padova)

Introduction: Randomized controlled trials are the foundation of evidence-based medicine, but individual prognostic factors enhance therapeutic strategies by enabling personalized care and efficient trial design. Validated prognostic models expand eligibility criteria, optimize patient stratification, and refine endpoints, bridging observational research and clinical trials. External validation is essential for assessing the robustness of clinical prediction models and their generalizability. Building on the methodology proposed by Riley et al., this study explores a simulation-based approach for evaluating the accuracy of external validation of survival models. The results are grounded in the analysis of standard error (SE) values to assess calibration precision. The focus is on the utility of this approach for risk stratification across tumor stages, a critical aspect of oncological patient management. This study provides preliminary results based on the staging of solid tumors in stages 1 and 2, with the intent to extend the methodology to all tumor stages.

Methods: The proposed approach uses Linear Predictors (LPs) derived from Kaplan-Meier survival curves, enabling the development of an underlying Cox proportional hazards model and the corresponding distribution of LPs. Pearson distribution models were used to simulate survival times, replicating realistic clinical scenarios. Survival data were parameterized using survival rates (lambda_surv) within a realistic interval (0.85 to 0.95), aligned with expected outcomes for the two groups. Censoring rates (lambda_cens) were selected to reflect practical clinical scenarios and introduce variability into the analysis. Simulations generated tables linking lambda_surv and lambda_cens combinations with the mean SE of the calibration slope and the sample size required to achieve predefined accuracy thresholds. An iterative algorithm estimated the concordance index (C-index) and evaluated calibration through Cox models, generating calibration curves to compare predicted and observed risks.

Results: The simulations demonstrated that sample size requirements for achieving acceptable SE values varied significantly with lambda_surv and lambda_cens. For an expected 3-year survival of S(3) = 0.65 (stage 1), the model demonstrated precise calibration with moderate sample sizes (N∼250–300). Conversely, for S(3) = 0.39 (stage 2), larger sample sizes (N∼400) were required to reduce the mean standard error below the threshold (SE_beta < 0.05). Key findings include: lower censoring rates (lambda_cens) significantly improved calibration precision; lower survival rates (lambda_surv) increased variability in the estimates, demanding larger sample sizes to maintain accuracy.

Discussion: This study confirms the practicality of the proposed simulation-based methodology for external validation and demonstrates its potential to support robust risk stratification in clinical settings. The application to solid tumors highlights the value of simulation techniques in designing efficient clinical studies, optimizing sample sizes, and evaluating calibration under realistic conditions. The methodology’s ability to link survival rates and censoring with calibration errors makes it a valuable tool for refining risk stratification in tumor staging. While these results are preliminary and focused on stages 1 and 2, the approach is intended to be extended across all tumor stages. By integrating specific clinical data with advanced simulation techniques, this approach enhances the robustness and reliability of survival models, ensuring their utility in diverse patient populations and settings.

P-55 – MAPPING METHODOLOGICAL GUIDANCE FOR INCLUDING LIVED EXPERIENCE IN EARLY CORE OUTCOME SET DEVELOPMENT: A SCOPING REVIEW

Primary Author:

1) Marci Livingston (University of Limerick)

Co-Author(s):

2) Katie Robinson (University of Limerick)

3) Elaine Toomey (University of Galway)

Background: Core Outcome Sets (COS) are critical for standardizing outcome reporting in clinical trials, tackling the problem of outcome heterogeneity in systematic reviews and meta-analyses. The development of a COS involves several stages, typically starting with the creation of a long-list of potential outcomes, followed by a consensus process to generate a minimum set of outcomes that should be measured across all clinical trials within a specific health area or domain. The Core Outcome Measures in Effectiveness Trials (COMET) Initiative emphasizes the need for COS to reflect the priorities of all relevant stakeholders, particularly those with lived experience. Despite this, current methodological guidance on incorporating lived experience perspectives during the long-list generation phase of COS development is limited.

Objectives: This scoping review aims to systematically map the existing methodological literature on how to integrate the perspectives of those with lived experience into the early long-list generation stage of COS development.

Methods: Adhering to the Joanna Briggs Institute framework and PRISMA-ScR guidelines, this review involves searches across Embase, Scopus, MEDLINE, CINAHL, the COMET Database, and ProQuest Theses and Dissertations. Citation checking and expert consultation complement these database searches. Two independent reviewers are responsible for screening and data extraction. Screening, data extraction, and analysis are ongoing, with full results expected for presentation at the upcoming conference.

Preliminary findings: Early screening shows wide variability in approaches to stakeholder involvement in COS development, particularly in defining “key stakeholders” and integrating their input. Methods identified include using qualitative research, patient-reported outcome measures, the involvement of patient and public research partners, and co-design workshops, but few explicitly target stakeholder inclusion at the early long-list generation stage.

Implications: This review is intended to offer valuable insights into effective strategies for early-stage stakeholder inclusion, contributing to the development of more inclusive and robust outcome sets. These outcomes will help ensure that clinical trials address the right questions and generate meaningful, patient-centered results.

This research forms part of a doctoral research project supported by the Health Research Board Trial Methodology Research Network PhD scholarship awarded to ML. The funder had no role in the design, data collection, and analysis or preparation of the review.

P-58 – THE USE OF HEALTHCARE SYSTEMS DATA FOR RCTS

Primary Author:

1) Alice-Maria Toader (University of Liverpool)

Co-Author(s):

2) Carrol Gamble (University of Liverpool)

3) Paula Williamson (University of Liverpool)

4) Susanna Dodd (University of Liverpool)

Introduction: Healthcare systems data (HSD) has the potential to optimize the efficiency of randomized controlled trials (RCTs), by decreasing trial-specific data demands. In 2019, it was estimated that 47% of NIHR-funded trials were planning to use HSD, which is expected to further increase in the future. We aim to understand the extent and nature of its current use and its evolution over time, alongside the information provided to the patients about linkage with HSD and the privacy of their electronic records.

Methods: We identified a cohort of primarily UK-based RCTs within the NIHR Journals Library that commenced after 2019. Details on the source and use of HSD were extracted from eligible RCTs. The use of HSD was categorized according to whether it was used as the sole data source for outcomes and whether the outcomes were primary or secondary. Associated patient information sheets were identified through online searches and emailed requests. A set of questions was designed to understand the clarity of information provided regarding the use of HSD, in accordance with the GDPR Data Protection laws. Within the finished RCTs, the use of HSD is to be compared against the proposed use presented in their published protocols.

Partial results: Of the 84 eligible studies, 52 (62%) planned to use HSD, with 28 (54%) planning to use it for at least one outcome. The most commonly used data sources were National Health Service Digital (n = 37, 79%), patient registries (n = 7, 29%), primary care (n = 5, 21%), and the Office for National Statistics (n = 3, 13%). Patient Information Sheets were identified for 43 of the RCTs, 21 (49%) of which were planning to use HSD for outcomes.

Structure and timelines: Within the following months, we aim to have a better understanding of the level of detail that can be found in the Information sheets provided to patients. From the publications of the finished RCTs, we will collect information about the actual use of HSD compared to planned use within their protocols.

Potential relevance: The comparison of current results with an earlier cohort demonstrates an increase in the number of RCTs planning to use HSD. It is important to understand the plans for HSD use and how they are communicated to patients.

P-59 – BENCHMARKING SUBGROUP HEALTH TECHNOLOGY ASSESSMENT ANALYSIS

Primary Author:

1) Fei Yuan (McMaster University)

Background: Government revenues are used to reimburse health programs or interventions in many countries. For example, given a population of 15 million Ontario residents, the Ontario senior drug co-pay program covers about 5,000 medications and other products. When allocating a limited budget to multiple interventions, it is useful to know their cost-effectiveness in specific sub-groups of patients. An intervention may not be cost-effective for an overall study population but may be cost-effective for certain sub-groups. For pre-specified sub-groups, Health Technology Assessment (HTA) guidelines generally recommend conducting an economic evaluation for each sub-group using only data from the sub-group of interest. With incremental net monetary benefit (INMB) being used as a main metric for measuring cost-effectiveness, researchers have proposed estimating the INMB using statistical modeling techniques. The net monetary benefit (NMB) summarizes the monetary value of an intervention given a willingness-to-pay (WTP) threshold for a unit of benefit. It is calculated by subtracting the monetary cost of an intervention from the monetary gain from clinical benefit (clinical benefit multiplied by the WTP threshold). The INMB, which describes the cost-effectiveness of one intervention against another, is the absolute difference between the NMBs associated with each intervention. As the calculation of INMB involves both the cost and clinical benefit, different approaches have been proposed in the literature for estimating INMB for subgroups. However, it is unknown how these approaches perform against each other.
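The NMB and INMB definitions above amount to a short calculation, sketched here for illustration. The function names and all input values are hypothetical and are not taken from COMPASS. The difference is kept signed, since an INMB can be negative.

```python
def net_monetary_benefit(benefit: float, cost: float, wtp: float) -> float:
    """NMB = clinical benefit (e.g. QALYs) x WTP threshold - cost."""
    return benefit * wtp - cost


def incremental_nmb(benefit_new: float, cost_new: float,
                    benefit_ref: float, cost_ref: float, wtp: float) -> float:
    """INMB = NMB of the new intervention minus NMB of the comparator."""
    return (net_monetary_benefit(benefit_new, cost_new, wtp)
            - net_monetary_benefit(benefit_ref, cost_ref, wtp))


# Hypothetical inputs: 2.0 vs 1.5 QALYs, costs of $20,000 vs $10,000.
for wtp in (0, 30_000):
    inmb = incremental_nmb(2.0, 20_000, 1.5, 10_000, wtp)
    print(f"WTP ${wtp:,}: INMB = ${inmb:,.0f}")
```

At a WTP of $0 the INMB reduces to the negative of the incremental cost, so estimates at that threshold reflect cost differences only.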

Methods: This project benchmarks a few analytical approaches using data from the Cardiovascular Outcomes for People Using Anticoagulation Strategies Study (COMPASS). It compares mean cost difference, mean quality-adjusted-life-year (QALY) difference and incremental net monetary benefit (INMB) using descriptive data, linear regression, linear mixed-effect regression models and Bayesian bivariate models.

Results: The point estimates of mean cost difference across all countries were very close to each other across the four analytic approaches, while the estimated mean QALY difference varied by analytic approach. Consequently, the estimates of incremental net monetary benefit (INMB) also differed. With a $0 willingness-to-pay threshold, the estimates from the Bayesian bivariate model aligned with those from other modelling techniques, with a tight 95% credible interval. With the willingness-to-pay threshold increasing from $30K to $200K, the estimates of INMB provided by the Bayesian bivariate model changed from -$6927.0 (95% credible interval: -$9266.0, -$4540.0) to -$48130.0 (95% credible interval: -$63720.0, -$32220.0).

Conclusion: Taxpayers are obligated to cover an increasingly expanded list of health interventions, so it is critical to study the cost-effectiveness of interventions in subgroups of patients. The lack of a comprehensive comparison of analytic strategies makes it challenging to meet this need. This investigation showed that the Bayesian bivariate model was optimal for estimating the cost-effectiveness measured by INMB.

P-61 – WHEN TO SCHEDULE THE INTERIM ANALYSIS IN THE PRESENCE OF MISSING DATA?

Primary Author:

1) Neža Dvoršak (University of Bath)

Co-Author(s):

2) Jianmei Wang (Roche)

3) Thomas Burnett (University of Bath)

4) Christopher Jennison (University of Bath)

5) Robin Mitra (University College London)

Suppose an adaptive Phase III trial has an interim analysis scheduled at a given information fraction, e.g., 50%. The key question is: When will we reach 50% information? In a non-longitudinal setting, the information level for a continuous endpoint can be approximated by the fraction of patients with endpoint data at the interim analysis relative to the final analysis. However, longitudinal trials with repeated measures and missing data require more nuanced methods to estimate the information level accurately. The question then becomes: When will there be 50% information in the presence of missing data? Is it when half of the patients reach the final visit, or could it be earlier? We propose an approach for projecting the information fraction in continuous longitudinal trials analysed using MMRM. We establish a relationship between information time and calendar time, providing practical guidance. At the design stage, prediction for the timing of interim analysis is based on assumptions about enrolment rate, total sample size, dropout rate, visit timing, and the correlation matrix. Once some data is available, this prediction is refined using the observed enrolment rates, dropout patterns, and updated correlation estimates, yielding a more accurate estimate of the current information level and an updated timeline for the interim analysis. Through a practical example, we demonstrate how to project information timelines at the design stage and refine them as data accrues. We discuss how to navigate different missing data patterns, assess the current information level, and set a reliable timeline for the interim analysis.
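For the simpler non-longitudinal case mentioned above, the link between information time and calendar time can be sketched as follows. This is an illustration under strong assumptions and not the proposed MMRM-based method: it assumes a constant enrolment rate, a single endpoint visit at a fixed follow-up time, and no missing data. The function names and trial parameters are hypothetical.

```python
def information_fraction(t: float, n_total: int, enrol_rate: float, followup: float) -> float:
    """Approximate information fraction at calendar time t (months).

    Patients enrolled by time t - followup have endpoint data, so the fraction
    is (patients with endpoint data) / (final sample size).
    """
    with_endpoint = min(n_total, max(0.0, (t - followup) * enrol_rate))
    return with_endpoint / n_total


def calendar_time_for_fraction(target: float, n_total: int,
                               enrol_rate: float, followup: float) -> float:
    """Calendar time (months) at which the target information fraction is reached."""
    return followup + target * n_total / enrol_rate


# Hypothetical design: 400 patients, 20 enrolled per month, endpoint at 12 months.
t_half = calendar_time_for_fraction(0.5, n_total=400, enrol_rate=20, followup=12)
print(t_half, information_fraction(t_half, 400, 20, 12))
```

With repeated measures and dropout, partially observed patients also contribute information, which is why the longitudinal setting needs the more nuanced projection the abstract describes.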

P-62 – RACIAL MINORITY PARTICIPATION IN DECENTRALIZED CLINICAL TRIALS FOR RARE GENETIC DISEASES

Primary Author:

1) Lauren Edgar (National Institutes of Health: National Human Genome Research Institute)

Co-Author(s):

2) Laura Koehly (National Institutes of Health: National Human Genome Research Institute)

3) Angel Murray (National Institutes of Health: National Human Genome Research Institute)

The design of clinical trials is transforming to include decentralized clinical trials (DCTs), which may provide substantial advantages for minority populations impacted by rare genetic diseases. Despite potential benefits, comprehensive analysis of the incorporation of DCT elements in this specialized context remains limited, mainly due to publication delays, insufficient research, and ongoing underrepresentation of diverse populations in clinical studies. This ‘study of studies’ seeks to systematically map and analyze the implementation of DCT designs while evaluating the racial composition of participants in trials focused on rare genetic diseases. The methodology consists of a systematic search of interventional clinical trials conducted in the U.S., utilizing data from ClinicalTrials.gov and information from peer-reviewed publications and grey literature. The inclusion criteria emphasize trials concerning rare genetic diseases that offer insights into the racial demographics of participants and the implementation of DCT methodologies. This dual focus enables a detailed examination of the utilization of decentralized methods in trials with racially diverse populations and the barriers and facilitators affecting their wider adoption. Initial searches conducted on ClinicalTrials.gov utilizing specific keywords associated with rare diseases and DCT designs have produced significant preliminary data. The findings establish a foundation for future inquiries in specialized databases and grey literature to map existing evidence and identify gaps in the current research landscape. This study systematically collates and analyzes data on DCTs in rare genetic diseases to illuminate the inclusivity of racial minorities.

P-63 – REASONS FOR DECLINING TO PARTICIPATE IN A TRIAL OF ONLINE COGNITIVE BEHAVIOURAL THERAPY FOLLOWING ORTHOPAEDIC TRAUMA: A MIXED METHODS STUDY

Primary Author:

1) Jodi Gallant (McMaster University)

Co-Author(s):

2) Sheila Sprague (McMaster University)

3) Natalie Fleming (McMaster University)

4) Sofia Bzovsky (McMaster University)

5) Sarah MacRae (McMaster University)

6) Mavis Lyons (McMaster University)

7) Jose Manuel De Maria Prieto (McMaster University)

8) Herman Johal (McMaster University)

9) Paula McKay (McMaster University)

10) Jason W Busse (McMaster University)

Purpose: The timely enrollment of study participants is critical to the success of clinical trials. Understanding factors that contribute to patients’ decision to participate in trials involving online cognitive behavioral therapy for pain management should prove helpful to optimize the design of study protocols.

Methods: Fracture patients from an orthopedic clinic who declined to participate in the Cognitive behavioral therapy to Optimize Post-operative rEcovery (COPE) trial were asked to complete a Research Participation Questionnaire that asked them about their previous experiences with clinical research and mental health therapy and their reasons for declining to participate in the COPE trial. At the end of the questionnaire, a subset of participants was offered the opportunity to participate in a telephone interview to further discuss why they declined to participate in the COPE trial.

Results: Sixty-four patients who declined to participate in the COPE trial completed the questionnaire and twenty of these participants agreed to take part in a telephone interview (31%). Twenty-two participants (34%) had previous experience with clinical research and six participants (9%) had received cognitive behavioral therapy (CBT) in the past. Excessive time commitment (41%) was the most commonly selected reason for not participating in the COPE trial, followed by a disinclination to participate in clinical research (19%). Four themes emerged from the interviews with participants: 1) belief that they could overcome mental health challenges after their fracture without external help; 2) belief that CBT might be helpful for some fracture patients, but not for themselves; 3) preference for online or in-person CBT; and 4) concerns regarding time commitment.

Conclusion: To maximize enrollment, trials exploring the role of psychotherapy in recovery from orthopedic trauma should optimize time commitment of psychotherapy. Providing information in the patient consent process regarding evidence for psychotherapy and recovery from orthopedic trauma may also prove helpful in promoting patient enrollment.

P-64 – PREVALENCE AND ACCEPTABILITY OF DEDICATED SOCIAL WORK SUPPORT IN THE FRACTURE CLINIC: A SURVEY OF ORTHOPAEDIC TRAUMA SURGEONS

Primary Author:

1) Jodi Gallant (McMaster University)

Co-Author(s):

2) Sheila Sprague (McMaster University)

3) Aleesha Sheikh (McMaster University)

4) Natalie Fleming (McMaster University)

5) William Pereira (McMaster University)

6) Sofia Bzovsky (McMaster University)

7) Jamal Al-Asiri (McMaster University)

8) Dale Williams (McMaster University)

9) Faisal Al-Zahrani (McMaster University)

10) Brad Petrisor (McMaster University)

Purpose: The primary objective of this survey was to determine the prevalence, in outpatient fracture clinics, of dedicated social workers or other health care professionals dedicated to assisting patients with non-physical aspects of their recovery. Secondary objectives included: determining what services social workers provide where present; understanding the perspectives of orthopedic surgeons regarding the need for the addition of a social worker to the fracture clinic; determining the type of resources and services that orthopedic surgeons believe social workers could provide; and identifying surgeons’ perceived barriers and facilitators to this addition.

Methods: We conducted a cross-sectional survey of orthopedic trauma surgeons. The survey, a self-administered electronic questionnaire, consisted of a brief screening questionnaire followed by four main sections: Demographics and Practice Setting; Social Worker Support in Outpatient Fracture Clinic; Social Worker Assistance in Clinic; and Perceptions of Social Worker in Clinic. To ensure responses from surgeons knowledgeable about the needs of patients in the fracture clinic setting, we established the following eligibility criteria: 1) currently practicing orthopedic surgeons, 2) treating patients with fractures as part of their practice, 3) able to complete the survey in English, and 4) willing to participate in the study by completing an electronic survey.

Results: This cross-sectional study surveyed 88 orthopedic trauma surgeons on the potential benefits and barriers to integrating social work support into outpatient fracture clinics. While only 16% of respondents currently had dedicated social work services available at their clinic, respondents acknowledged the critical need for their services, particularly in addressing employment, psychological support, and housing crises. Surgeons expressed strong agreement that social workers could effectively address various patient needs, including assistance with intimate partner violence (88.6%), transportation (86.4%), home care (85.2%), and addictions support (81.8%).

Conclusion: These findings underscore the potential for enhanced patient care through interdisciplinary collaboration and social work support in outpatient orthopedic trauma settings.

P-65 – JOB SATISFACTION OF RESEARCH PERSONNEL IN ORTHOPAEDIC TRAUMA SURGERY

Primary Author:

1) Jodi Gallant (McMaster University)

Co-Author(s):

2) Assweni Gowrishanker (McMaster University)

3) Amanda Hadwen (McMaster University)

4) Sofia Bzovsky (McMaster University)

5) Paula McKay (McMaster University)

6) Sheila Sprague (McMaster University)

Purpose: The role of research personnel (e.g., research assistants, research coordinators, research managers, etc.) is critical to the success of clinical trials in orthopedic surgery. Retention of research personnel is a challenge faced by many academic surgical researchers. Limited investigation has been conducted to identify and quantify factors that lead to low retention rates, especially in the field of orthopedic surgery clinical research.

Methods: We developed an electronic survey, containing the validated Job Satisfaction Survey, to evaluate job satisfaction, career path, and educational pathways in the context of long-term retention of research personnel working in orthopedic surgery. We distributed the anonymized survey to research personnel who had previously participated in an orthopedic trauma clinical trial coordinated at our university.

Results: Seventy-two research personnel working on clinical orthopedic studies completed the survey (43%). Using the Job Satisfaction Survey, overall respondent scores (mean 143.8, standard deviation [SD] 26.6) fell on the border of the ambivalent and satisfaction categories. The lowest mean scores, representing dissatisfaction (scores <12), were seen within the promotion (11.4, SD 4.9) and pay (11.9, SD 5.1) subscales. Higher pay was identified as the most common factor that would increase respondents’ satisfaction in their current position (72%). Almost half (46%) expressed that funding was a barrier to accessing continuing education and 54% were either unsure or considered clinical research in orthopedic trauma a temporary position. Four main themes arose from the qualitative portion of the survey: 1) appreciation and involvement, 2) institutional barriers, 3) training, and 4) support from the principal investigator.

Conclusions: Research personnel in clinical orthopedic trauma surgery are a highly motivated group whose job satisfaction sits on the border between ambivalence and satisfaction. Despite their desire to grow in their positions, low pay and inadequate funding for continuing education opportunities are barriers. The qualitative findings provide additional insights into how job satisfaction and retention can be improved amongst clinical research personnel in orthopedic trauma surgery.

P-66 – BENEFITS OF CROSS-INDUSTRY TRAINING IN CLINICAL TRIALS

Primary Author:

1) Megan McCabe (University of Alabama at Birmingham)

Co-Author(s):

2) Emine O Bayman (University of Iowa)

3) Christopher S Coffey (University of Iowa)

There are various training opportunities available to students interested in careers in clinical trials. Students in doctoral programs for statistics or biostatistics often have research or teaching assistantships at their institution which support their funding. In addition to these, students can apply for fellowships and internships with companies or federal agencies to gain valuable real-world experience. This presentation will highlight the speaker’s firsthand experience with clinical trials training opportunities in academia, government, and industry, including a description of four distinct experiences (1 academic, 2 government, and 1 pharmaceutical industry) and the considerations required to pursue them during her doctoral studies. The speaker will detail lessons learned and how these have influenced her career, as well as present recommendations for trainees and those involved in developing the clinical trials workforce. A combination of experiences within and outside of a student’s academic institution is unmatched in terms of career preparation. These experiences further develop the student’s technical skills and knowledge by extending it outside of the classroom into a real-world setting. They also provide a forum to develop practical skills for the workforce, such as project management, communication, and multi-disciplinary collaboration. Additionally, a combination of these experiences allows the student to make a well-informed decision regarding their post-graduation position. This has obvious benefits for future career satisfaction for the student, and it benefits employers in that they can recruit individuals who are passionate about pursuing a career at their organization. For students interested in clinical trials research, cross-industry training is particularly beneficial. 
It provides exposure to the complexities of trial design, management, and analysis from different perspectives, which is useful since clinical trials often involve communication and collaboration across industries. Despite these benefits, there can also be challenges associated with a student participating in opportunities outside of their institution. Firstly, there could be a lack of awareness about these opportunities. Beyond that, it can be challenging for the student to manage their other obligations, such as their assistantship and dissertation research. There are also often practical hurdles regarding the student’s funding, as well as the potential need for temporary relocation. The speaker was a graduate research assistant at the Clinical Trials and Statistical Data Management Center at the University of Iowa for four years, during which she was the inaugural Network for Excellence in Neuroscience Clinical Trials (NeuroNEXT) Biostatistics Fellow. The NeuroNEXT Biostatistics Fellowship was designed to expose a trainee to various facets of clinical trials, including the pre-award process, ongoing trial management, and statistical analysis of a completed trial. In addition to these invaluable academic experiences, she completed an Oak Ridge Institute for Science and Education Fellowship hosted by the Center for Drug Evaluation and Research at FDA and participated in the inaugural cohort of the Oncology Educational Fellowship, a program established by the FDA Oncology Center for Excellence, American Statistical Association (ASA), and ASA’s Biopharmaceutical Statistics Section. The speaker also gained experience in the pharmaceutical industry through a 4-month co-op program in Biostatistics and Data Science at Boehringer Ingelheim.

P-67 – SIMULATION-GUIDED TRIAL PLANNING FOR DIFFERENT ESTIMANDS FOR TREATMENT SWITCHING IN ONCOLOGY

Primary Author:

1) Quang Vuong (Core Clinical Sciences)

Co-Author(s):

2) Rebecca K Metcalfe (University of British Columbia)

3) Jay JH Park (McMaster University)

Post-randomization (i.e., intercurrent) events, such as treatment switching, can affect the interpretation of clinical outcomes. Consequently, the recent ICH E9(R1) addendum on estimands highlights the importance of specifying relevant intercurrent events and how they will be handled analytically. In oncology trials, treatment switching can distort estimated treatment effects because participants are exposed to multiple treatments during follow-up. While the estimands framework offers a way to transparently integrate intercurrent events into the trial design and analyses, planning trials for different estimands is not straightforward. Two commonly used estimands promoted by the ICH addendum are the “Treatment Policy” (TP) and “Hypothetical” estimands. In the context of treatment switching, the TP estimand is analogous to intention-to-treat analysis, where data are analyzed based on randomization status, irrespective of switching. In contrast, the hypothetical estimand reflects a treatment effect under a hypothetical scenario where patients would not have switched their treatment. Because both estimands may be of clinical interest, characterization of the trade-offs associated with each via simulations may aid trial planning. The main objective of this study is to quantitatively examine the trade-offs associated with TP and hypothetical estimands, as measured by error rates, sample size, and treatment effects, in the context of treatment switching in a randomized clinical trial (RCT) powered on overall survival (OS). For our simulation study, we used an illness-death model to generate progression and OS times to mimic a trial that allows control patients to switch onto the experimental therapy after disease progression. We considered censoring as the analytical strategy for treatment switching for the hypothetical estimand. 
We estimated the TP and hypothetical effects in terms of proportional hazard ratio (95% CI) using transition hazards that were derived from a published RCT that allowed control group participants to switch treatments. We compared the empirical means of the estimators, the type I error rate, and the power of the two estimands. To demonstrate the implications of the simulation results, we will walk through an example of a simulation-guided design case study involving planning a trial in oncology with treatment switching as the main intercurrent event. We found that for our data generating mechanism, the estimator for the hypothetical estimand typically showed larger effect sizes than the estimator for the TP estimand but with less precision due to a higher proportion of censored observations. While the type I error rate could be controlled at 0.05 for both estimators, the estimated power was higher for the analysis targeting the hypothetical estimand. Our findings do not necessarily imply that the hypothetical estimand should be targeted over the TP estimand, but they do highlight important trade-offs at the design stage that can be characterized using simulations. As these two treatment effects reflect distinct research questions, our work underscores the need for transparency regarding intercurrent events during clinical trial planning.
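
As a rough illustration of the data-generating idea in the P-67 abstract above, the following minimal sketch simulates an illness-death model with exponential transition times and builds the two analysis datasets: OS as observed (treatment policy) and OS censored at the switch (the censoring strategy for the hypothetical estimand). It is not the authors' simulation code. All function names, hazards, the hazard ratio, the switching probability, and the administrative censoring time are hypothetical values chosen only for illustration, and no treatment effect estimator is included.

```python
import random


def simulate_patient(rng, arm, h_prog, h_death_pre, h_death_post, hr,
                     p_switch, admin_censor):
    """Simulate one patient from a three-state illness-death model.

    All transition times are exponential. Every rate and probability is a
    hypothetical illustration value, not one derived from a published RCT.
    """
    # Assumption: the experimental arm has every hazard multiplied by `hr`.
    mult = hr if arm == "experimental" else 1.0
    t_prog = rng.expovariate(h_prog * mult)
    t_death_pre = rng.expovariate(h_death_pre * mult)

    switch_time = None
    if t_death_pre <= t_prog:
        t_death = t_death_pre  # death without prior progression
    else:
        post_mult = mult
        # Control patients may switch to the experimental therapy at progression.
        if arm == "control" and rng.random() < p_switch:
            switch_time = t_prog
            post_mult = hr
        t_death = t_prog + rng.expovariate(h_death_post * post_mult)

    # Treatment policy: overall survival as observed, irrespective of switching.
    tp_time = min(t_death, admin_censor)
    tp_event = t_death <= admin_censor

    # Hypothetical strategy via censoring: follow-up stops at the switch.
    if switch_time is not None and switch_time < tp_time:
        hyp_time, hyp_event = switch_time, False
    else:
        hyp_time, hyp_event = tp_time, tp_event

    return {"arm": arm, "tp_time": tp_time, "tp_event": tp_event,
            "hyp_time": hyp_time, "hyp_event": hyp_event}


def simulate_trial(n_per_arm, seed=2025, h_prog=0.08, h_death_pre=0.01,
                   h_death_post=0.10, hr=0.7, p_switch=0.6, admin_censor=36.0):
    """Simulate a two-arm trial; returns one record per patient."""
    rng = random.Random(seed)
    return [
        simulate_patient(rng, arm, h_prog, h_death_pre, h_death_post, hr,
                         p_switch, admin_censor)
        for arm in ("control", "experimental")
        for _ in range(n_per_arm)
    ]


def censoring_proportion(patients, event_key):
    """Proportion of patients without an observed death under one analysis."""
    return sum(not p[event_key] for p in patients) / len(patients)


trial = simulate_trial(n_per_arm=500)
print("censored (treatment policy):", censoring_proportion(trial, "tp_event"))
print("censored (hypothetical):    ", censoring_proportion(trial, "hyp_event"))
```

By construction, censoring at the switch can only shorten follow-up and remove events in the control arm, which is consistent with the abstract's remark that the hypothetical analysis has a higher proportion of censored observations. A full replication would add a hazard ratio estimator and repeat the simulation many times to obtain type I error and power.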

P-68 – ANALYSING HEALTH RELATED QUALITY OF LIFE IN NEPHROLOGY TRIALS: A COMPARISON OF THE LINEAR MIXED EFFECTS, STANDARD JOINT, AND COMPETING RISKS JOINT MODELS IN THE PRESENCE OF INFORMATIVE DROPOUTS

Primary Author:

1) Hannah Worboys (University of Leicester)

Co-Author(s):

2) Nicola Cooper (University of Leicester)

3) James Burton (University of Leicester)

4) Laura Gray (University of Leicester)

Purpose: Health-related quality of life (HRQoL) is a key endpoint in nephrology research. Due to the longitudinal nature of clinical trials, there is often missing data due to dropout. Participants can drop out for various reasons, including disease progression, adverse events, death, and, specifically in nephrology, kidney transplantation. In the case of informative dropouts, where the reason for dropout is related to the HRQoL outcome, the linear mixed effects model will produce biased estimates. Our previously published systematic review showed that the majority of reported hemodialysis trials use methods which are open to bias. Joint models are gaining attention for their ability to simultaneously model longitudinal outcomes and time-to-dropout. However, dropout can be informative or non-informative depending on the cause, and therefore a joint model with competing risks may be more suitable than a standard joint model.

Aim: The aim of this study is to demonstrate the difference between using a linear mixed effects model, a standard joint model, and a competing risks joint model in the longitudinal analysis of HRQoL in the presence of informative dropouts.

Methods: Methods were applied and compared using data from two previously published trials. The first dataset consists of 130 patients with end stage kidney disease who were randomly allocated to a 6-month intradialytic cycling intervention or control. The second dataset consists of 2,141 patients who were allocated to either low-dose or high-dose intravenous iron and then followed up for a maximum of 4.5 years. We compared: (a) a linear mixed effects model (LMM) not accounting for the reason for dropout; (b) a standard joint model (SJM) that models HRQoL jointly with time to all-cause dropout; and (c) a competing risks joint model (CRJM) that models HRQoL with time to dropout, where two competing types of dropouts are considered (informative and non-informative). We assess the consequences of using the LMM rather than the corresponding SJM and CRJM.

Results: We have shown previously in a simulation study that the LMM suffers from substantial bias in a typical situation where poor HRQoL is associated with an increased risk of dropout. In this situation the LMM will overestimate the HRQoL in both arms, but not equally, misestimating the difference between the HRQoL trajectories of the two arms to the disadvantage of the experimental arm. Here, we will present data comparing the LMM, SJM and CRJM using data from two completed trials. We will present the difference between the LMM and SJM estimates compared with a CRJM in the presence of both informative and non-informative dropouts, and the subsequent impact on the HRQoL parameters.

Conclusion: We propose to show the benefits of using a CRJM over an LMM and SJM to ensure bias is minimized in the estimation of HRQoL parameters. This relies on the systematic collection of the reasons for dropout in clinical trials, which facilitates the use of CRJMs; these could be a satisfactory approach to analyzing longitudinal HRQoL data in the presence of both informative and non-informative dropouts.
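
The mechanism behind the bias described in the P-68 abstract above can be sketched with a toy simulation. The example below is not an LMM or a joint model; it only compares the mean HRQoL among patients who remain in the study with the mean among all patients when the chance of dropout rises as HRQoL falls. The score distribution, the logistic dropout model, and the parameter names are assumptions made for illustration.

```python
import math
import random
import statistics


def simulate_final_visit(n, seed=7, mean_hrqol=50.0, sd=10.0, dropout_slope=0.08):
    """Return (all_scores, observed_scores) at a single final visit.

    Assumption: dropout depends on the (possibly unobserved) final score via a
    logistic model, so lower HRQoL means a higher chance of being missing.
    """
    rng = random.Random(seed)
    all_scores, observed_scores = [], []
    for _ in range(n):
        score = rng.gauss(mean_hrqol, sd)
        # Dropout probability is 0.5 at the mean and increases as the score falls.
        p_dropout = 1.0 / (1.0 + math.exp(dropout_slope * (score - mean_hrqol)))
        all_scores.append(score)
        if rng.random() >= p_dropout:
            observed_scores.append(score)
    return all_scores, observed_scores


all_scores, observed_scores = simulate_final_visit(n=5000)
print("mean HRQoL, all patients:     ", round(statistics.mean(all_scores), 2))
print("mean HRQoL, observed patients:", round(statistics.mean(observed_scores), 2))
```

In this toy setting the observed-data mean is higher than the full-population mean, because patients with poor HRQoL are preferentially removed. Joint models aim to correct for this by modeling the dropout process together with the longitudinal outcome.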

P-69 – A MULTI-ARM MULTI-STAGE DESIGN FOR TRIALS WITH NO CONTROL ARM AND ALL PAIRWISE TESTING

Primary Author:

1) Peter Greenstreet (Ottawa Hospital Research Institute)

Co-Author(s):

2) Thomas Jaki (University of Regensburg)

3) Alun Bedding (Alun Bedding Coaching & Consulting Ltd)

4) Pavel Mozgunov (University of Cambridge)

Multi-arm multi-stage (MAMS) trials have gained popularity as a means to enhance the efficiency of clinical trials, potentially reducing both duration and costs. This paper focuses on designing MAMS trials where no control treatment exists. This may be because multiple treatments are already established as standard treatment options, or because no treatment currently exists for a severe disease, so it would be unethical to withhold a potentially helpful treatment. In the proposed design, interim analyses allow for early treatment termination during the trial when a treatment performs notably worse than its competitors, and for the entire trial to stop early if all remaining treatments are showing similar performance. All pairwise comparisons between treatment arms are conducted, allowing for the identification of statistically significant differences between treatments and facilitating the early termination of less effective ones. The proposed design controls the familywise error rate (FWER) for all pairwise comparisons, and the conditions under which control in the strong sense is guaranteed are provided. The FWER and power are used to calculate both the stopping boundaries and the sample size required. Analytic solutions to compute the expected sample size are also derived. A trial motivated by a study conducted into sepsis, where there was no control treatment, is shown. The multi-arm multi-stage all pairwise design proposed here is compared to multiple different approaches. It is shown, for the trial studied, that the proposed method yields the lowest required maximum and expected sample sizes when controlling the FWER and power at the desired levels.
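
To illustrate why all pairwise testing needs adjusted boundaries, the following sketch estimates by Monte Carlo the FWER of unadjusted two-sided z-tests across all pairs of arms under the global null. It is a simplified single-stage example with known unit variance, not the multi-stage design of the P-69 abstract above, and the Bonferroni critical value is used only as a familiar stand-in for the design's own stopping boundaries. Function names and parameter values are hypothetical.

```python
import itertools
import math
import random
from statistics import NormalDist


def any_pairwise_rejection(rng, n_arms, n_per_arm, critical_value):
    """Simulate arm means under the global null and test every pair of arms."""
    # With unit variance, each arm mean is N(0, 1 / n_per_arm) under the null.
    means = [rng.gauss(0.0, 1.0 / math.sqrt(n_per_arm)) for _ in range(n_arms)]
    standard_error = math.sqrt(2.0 / n_per_arm)
    return any(
        abs(a - b) / standard_error > critical_value
        for a, b in itertools.combinations(means, 2)
    )


def estimate_fwer(n_arms, n_per_arm, critical_value, n_sims=20000, seed=1):
    """Monte Carlo estimate of the probability of at least one false rejection."""
    rng = random.Random(seed)
    rejections = sum(
        any_pairwise_rejection(rng, n_arms, n_per_arm, critical_value)
        for _ in range(n_sims)
    )
    return rejections / n_sims


n_arms = 4
n_pairs = n_arms * (n_arms - 1) // 2  # 6 pairwise comparisons for 4 arms
unadjusted = NormalDist().inv_cdf(1 - 0.05 / 2)
bonferroni = NormalDist().inv_cdf(1 - 0.05 / (2 * n_pairs))

print("FWER, unadjusted:", estimate_fwer(n_arms, 50, unadjusted))
print("FWER, Bonferroni:", estimate_fwer(n_arms, 50, bonferroni))
```

With four arms there are six comparisons, so testing each at the nominal 5% level inflates the FWER well above 5%. Bonferroni restores control but is conservative, which is one reason designs of this kind derive boundaries from the joint distribution of the test statistics instead.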

P-71 – BAYESIAN ESTIMATION OF DYNAMIC TREATMENT REGIMES FROM A PARTIALLY RANDOMIZED, PATIENT PREFERENCE, SEQUENTIAL, MULTIPLE ASSIGNMENT, RANDOMIZED TRIAL

Primary Author:

1) Marianthie Wank (University of Michigan)

Co-Author(s):

2) Roy N Tamura (University of South Florida)

3) Thomas M Braun (University of Michigan)

4) Kelley M Kidwell (University of Michigan)

As healthcare shifts towards patient-centered care, incorporating patient treatment preferences in clinical trials has become increasingly relevant. The Partially Randomized, Patient Preference, Sequential Multiple Assignment Randomized Trial (PRPP-SMART) combines a Partially Randomized Patient Preference (PRPP) trial with a Sequential, Multiple Assignment, Randomized Trial (SMART), allowing participants to either receive their preferred treatment or be randomized when no treatment preference exists, at multiple points in the trial. In this paper, we introduce a novel Bayesian method to estimate dynamic treatment regimes (DTRs), or tailored treatment guidelines over the course of care, embedded in PRPP-SMARTs. Our Bayesian Joint Stage Model (BJSM) leverages information sharing between preference and randomized participants and across stages of the trial to estimate DTR effects. We compare our BJSM method to weighted and replicated regression models, the current standard for analyzing PRPP-SMART data, and show that our method provides more efficient DTR effect estimates with negligible bias. Our results indicate that BJSM is a promising alternative for analyzing PRPP-SMART data.

P-74 – DESIGN AND ANALYSIS OF SMARTS WITH TREATMENT PREFERENCE, WITH APPLICATION TO THE STAR*D TRIAL

Primary Author:

1) Sarah Medley (University of Michigan)

Co-Author(s):

2) Marianthie Wank (University of Michigan)

3) Roy N Tamura (University of South Florida)

4) Thomas M Braun (University of Michigan)

5) Kelley M Kidwell (University of Michigan)

Background: Effective care for chronic conditions with high rates of non-response or relapse requires personalized and adaptive treatment guidelines known as dynamic treatment regimens (DTRs). Sequential, multiple assignment, randomized trials (SMARTs) are the gold standard for estimating DTR effects, but SMARTs, like any trial, may struggle with recruitment and retention due to patient treatment preferences. A partially randomized, patient preference SMART (PRPP-SMART) design overcomes these issues by assigning participants with a treatment preference to their preferred treatment and randomizing participants who are indifferent at each stage of the SMART.

Methods: We have previously shown that weighted and replicated regression models (WRRMs) combining data from all participants, whether randomized or assigned to a preferred treatment, estimate DTRs with binary outcomes with minimal bias in a PRPP-SMART. Here, we evaluate WRRMs to estimate PRPP-SMART DTRs with continuous outcomes under a broad range of scenarios and illustrate our method using data from the STAR*D trial (NCT00021528).

Results: The performance of our WRRM method is robust to different preference rates. However, DTR estimates may incur non-negligible bias when preference has a strong influence on the outcome. DTR estimates in the STAR*D example from our methodology agree with previous results and suggest a small benefit of preference.

Conclusion: We expect our method to estimate DTRs with minimal bias in most realistic scenarios. The PRPP-SMART design and methods would have overcome many shortcomings of STAR*D.

P-75 – EXPLORING THE ADOPTION OF THRICE-WEEKLY EXTENDED-HOURS IN-CENTRE NOCTURNAL HAEMODIALYSIS IN ROUTINE CLINICAL PRACTICE THROUGH THE NIGHTLIFE STUDY: A QUALITATIVE CONTENT ANALYSIS

Primary Author:

1) Katherine Hull (University of Leicester)

Co-Author(s):

2) Victoria Cluley (University of Nottingham)

3) Matthew PM Graham-Brown (University of Leicester)

4) James O Burton (University of Leicester)

Introduction: Thrice-weekly, extended-hours (six to eight hours) in-center nocturnal hemodialysis (INHD) has been associated with improvements in important patient-centered outcomes but is offered in fewer than 5% of dialysis units in the UK. The NightLife study (ISRCTN87042063) is a randomized controlled trial comparing the clinical and cost-effectiveness of INHD to usual care. The trial offers participating sites the opportunity to set up an INHD service. In-center nocturnal hemodialysis meets the Medical Research Council definition of a complex intervention. Effective evaluation of a complex intervention requires a detailed understanding of the context within which it is being established. Little is known about the contextual factors influencing INHD implementation in the UK. This study aims to understand the facilitators and barriers to INHD adoption within the NightLife study, with a focus on the infrastructure, research environment and healthcare professional perspective.

Methods: This study was completed as a qualitative content analysis with an inductive approach. Content was derived from NightLife study site set-up electronic mail (e-mail) communications, meetings, reports, field notes and semi-structured interviews. The analysis was completed using Braun and Clarke’s reflexive thematic analysis.

Results: Content was derived over a three-year period from three business cases for INHD development, 80 e-mail discussions (each discussion containing one to 10 e-mails), one internal pilot report, 60 meeting minutes, and seven semi-structured interviews with members of the multidisciplinary team. There were four key themes identified: inequity; role of knowledge and evidence; staff perception and experience; resources, support and complexity. These four themes contributed to both the adoption and non-adoption of INHD as part of the NightLife study.

Discussion: Site set-up and INHD service delivery have been the greatest challenges to NightLife study progression. Although each site appeared to have unique challenges, this qualitative content analysis demonstrated commonality in the facilitators and barriers to dialysis service innovation. Utilizing these findings will support site set-up of INHD within the NightLife study and will inform the development and evaluation of future complex interventions for the dialysis community. This work demonstrates the wealth of information regarding the implementation of complex interventions that is contained within routine study communications, meetings and reports. Incorporating qualitative content analysis within a trial provides an additional low-cost source of data for embedded work, such as process evaluations and studies within trials, and contributes to the understanding of contextual factors that affect intervention adoption within routine clinical practice. This technique is transferable to any randomized controlled trial in any area of clinical medicine.

P-79 – CALIBRATION-FREE ODDS CFO SUITE FOR DESIGNING VARIOUS PHASE I CLINICAL TRIALS

Primary Author:

1) Guosheng Yin (University of Hong Kong)

Co-Author(s):

2) Jialu Fang (University of Hong Kong)

In the development of new cancer treatments, an essential step is to determine the maximum tolerated dose in a phase I clinical trial. To use the data more efficiently yet without any model assumption, we propose a novel calibration-free odds (CFO) approach to phase I trial design. Not only is the CFO design free of any dose-toxicity curve assumption, but it can also aggregate all the available information accrued in the trial for dose assignment. Seamless phase I/II trials, which aim to identify the optimal biological dose (OBD) and to enhance the accuracy and robustness of OBD identification, have gained enormous popularity. For toxicity monitoring, the CFO design casts the current dose in competition with its two neighboring doses to obtain an admissible set. For efficacy monitoring, CFO selects the dose that has the largest posterior probability of achieving the highest efficacy under the Bayesian paradigm. In contrast to most of the existing designs, the prominent merit of CFO is that its main dose-finding component is model-free and calibration-free, which can greatly ease the burden of manually specifying design parameters and thus enhance the robustness and objectivity of the design. We will also illustrate the implementation of CFO using its Shiny app, which is user-friendly and publicly accessible at https://clinicaltrialdesign.shinyapps.io/cfoapp/.

P-80 – PRO-ADD: PATIENT-EMPOWERED DOSE-FINDING TRIALS BY INTEGRATING SAFETY, EFFICACY AND PATIENT-REPORTED OUTCOMES FOR OPTIMAL DOSE SELECTION

Primary Author:

1) Emily Alger (Institute of Cancer Research)

Co-Author(s):

2) Sumithra J Mandrekar (Mayo Clinic)

3) Jun Yin (Moffitt Cancer Center)

4) Christina Yap (Institute of Cancer Research)

Advances in oncology drug development are driving the emergence of novel therapies that challenge traditional dose-efficacy assumptions in dose-finding oncology trials. Most trial designs aim to identify a maximum tolerated dose (MTD) by assessing patients’ dose-limiting toxicities (DLTs) - implicitly adopting the traditional dose-efficacy paradigm that efficacy increases with treatment dose. Whilst established cytotoxic agents generally conform to this assumption, new investigational therapies such as molecularly targeted agents and immunotherapies do not necessarily exhibit such a relationship. What’s more, these emerging treatments are often administered over extended durations, extending beyond the traditionally short DLT assessment window. In settings where therapies are administered until treatment resistance or disease progression occurs, it is vital to evaluate treatment tolerability beyond the traditional DLT assessment windows. With these new investigational therapies in mind, emphasis should shift toward methodological advancements in trial designs aimed at identifying optimal doses, rather than solely determining MTDs. Incorporating Patient-reported Outcomes (PROs) within dose-finding oncology trials is increasingly recommended to better understand a treatment’s tolerability profile, especially given the extended tolerability assessment windows which may be needed for novel immunotherapies and targeted therapies. This paper introduces PRO-ADD (Patient-reported Outcomes Aided Dose-optimization Design), a modular trial design framework for dose optimization. This framework allows trialists flexibility to define approaches for dose-escalation, the adaptive randomization of patients across admissible doses, and the final dosing decision criterion. In this paper, we leverage the framework to optimize dosage with respect to three key outcomes - clinician-assessed DLTs, PROs and efficacy. 
We introduce a novel continuous PRO endpoint, the normalized PRO-Adverse Event (PRO-nAE) burden score, to evaluate the trade-off between a treatment’s tolerability and its efficacy. A generalized linear mixed model is used to incorporate the accumulating longitudinal PRO data collected during the trial within the final dose recommendation. Simulation results are presented to evaluate trial design performance under different strategies for the handling of intercurrent events. PRO-ADD leverages the fundamental paradigmatic efficacy and tolerability profiles of new treatments to recommend optimal doses. It performs well at identifying the optimal dose (a dose which is both efficacious and tolerable) when efficacy and PRO data are collected beyond an observed DLT. Particularly in scenarios where efficacy plateaus beyond a specific dose size, PRO-ADD confidently identifies the most tolerable effective dose, avoiding escalation to higher, safe doses that offer no additional efficacy benefit. Futility and safety stopping rules perform well at adaptively assigning patients to admissible doses. When a patient discontinues treatment after a DLT, and subsequent PRO and efficacy data are unavailable, the proposed design still recommends the true optimal dose most of the time; however, the results are biased toward lower dose levels, which have fewer DLTs. As the field evolves, patient-centric dose-finding approaches incorporating PROs are crucial in advancing our understanding of treatment tolerability, and in turn, will shape the future landscape of dose-finding oncology trials.

P-81 – RANDOMIZATION-BASED INFERENCE FOR MCP-MOD

Primary Author:

1) Lukas Pin (University of Cambridge)

Co-Author(s):

2) Oleksandr Sverdlov (Novartis)

3) Frank Bretz (Novartis)

4) Björn Bornkamp (Novartis)

Dose selection is critical in pharmaceutical drug development, as it directly impacts the therapeutic efficacy and safety of a drug. The Generalized Multiple Comparison Procedures and Modeling (MCP-Mod) approach is commonly used in Phase II trials for testing and estimation of dose-response relationships. However, its effectiveness in small sample sizes, particularly with binary endpoints, is hindered by issues like complete separation in logistic regression, leading to non-existence of estimates. Motivated by an actual clinical trial using the MCP-Mod approach, this work introduces penalized maximum likelihood estimation (MLE) and randomization-based inference techniques to address these challenges. Randomization-based inference allows for exact finite sample inference, while population-based inference for MCP-Mod typically relies on asymptotic approximations. Simulation studies demonstrate that randomization-based tests can enhance statistical power in small to medium-sized samples while maintaining control over type-I error rates, even in the presence of time trends. Our results show that residual-based randomization tests using penalized MLEs not only improve computational efficiency but also outperform standard randomization-based methods, making them an adequate choice for dose-finding analyses within the MCP-Mod framework. Additionally, we apply these methods to pharmacometric settings, demonstrating their effectiveness in such scenarios. The results underscore the potential of randomization-based inference for the analysis of dose-finding trials, particularly in small sample contexts.
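
As an illustration of the general idea of randomization-based inference described above, the following minimal Python sketch computes a Monte Carlo randomization p-value for an increasing dose-response trend with a binary endpoint. It is not the authors' residual-based procedure and does not implement MCP-Mod or penalized MLE. The data, the linear contrast statistic, and the function names are hypothetical, and the sketch assumes a simple complete randomization in which any permutation of the dose labels was equally likely; a real analysis would re-run the trial's actual randomization procedure.

```python
import random


def linear_contrast(doses, responses):
    """Trend statistic: sum of (dose - mean dose) * response."""
    mean_dose = sum(doses) / len(doses)
    return sum((d - mean_dose) * y for d, y in zip(doses, responses))


def randomization_p_value(doses, responses, n_resamples=5000, seed=0):
    """One-sided Monte Carlo randomization p-value for an increasing trend."""
    rng = random.Random(seed)
    observed = linear_contrast(doses, responses)
    shuffled = list(doses)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        # Re-randomize the dose labels while holding the responses fixed.
        rng.shuffle(shuffled)
        if linear_contrast(shuffled, responses) >= observed:
            at_least_as_extreme += 1
    # The add-one correction counts the observed allocation itself.
    return (at_least_as_extreme + 1) / (n_resamples + 1)


# Hypothetical data: four dose levels (coded 0-3), five patients per level.
doses = [0] * 5 + [1] * 5 + [2] * 5 + [3] * 5
responses = [0, 0, 0, 0, 1,  0, 0, 1, 0, 1,  0, 1, 1, 0, 1,  1, 1, 1, 0, 1]
p_trend = randomization_p_value(doses, responses)
```

Because the reference distribution is generated by the randomization itself, the test does not depend on asymptotic approximations, which is the property the abstract highlights for small samples.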

P-83 – THE ROLE OF INTERSECTIONALITY IN SHAPING PARTICIPANT ENGAGEMENT WITH DIGITAL HEALTH METHODS: FINDINGS FROM A QUALITATIVE STUDY

Primary Author:

1) Cherish Boxall (University of Southampton)

Co-Author(s):

2) Katherine Bradbury (University of Southampton)

3) Felicity Bishop (University of Southampton)

4) Gareth Griffiths (Southampton Clinical Trials Unit)

5) Nisreen Alwan (School of Primary Care)

6) Shaun Treweek (Health Services Research Unit)

7) John McGavin (University of Southampton)

8) Nnenna Ekeke (National Health Service East of England)

9) Rachel Stone (University of Southampton)

Background: Digital research methods were adopted at a rapid pace during the Covid pandemic in 2020. In line with global and national health initiatives, current policy-level strategies aim to make digitally-enabled research the norm; however, the impact of this on equitable participation and ongoing engagement in health research remains largely unknown. Efforts have been made to enhance the inclusivity of digital health interventions (e.g., formatting interfaces for those with lower digital literacy), but the digital methods used to support and conduct the testing of digital and non-digital health interventions (e.g., informed consent, data collection, research communications) are often overlooked, despite their potential to influence an individual’s engagement in a study.

Objective: This study aims to understand the factors influencing the initial uptake and ongoing engagement with digitally enabled research across diverse populations. Its purpose is to capture experiences and perspectives that can inform inclusive and efficient health research conduct.

Methods: Semi-structured interviews were conducted with 50 people who had participation experience in health research in the past 12 months. Reflective thematic analysis was used to understand factors that influence study engagement from participant perspectives whilst acknowledging the role of the researcher in the interpretations of the data.

Results: This qualitative study identified three interconnected themes that illustrate how intersecting identity factors and social contexts shape engagement with digital methods in health research. ‘Digital Identity and Intersectional Role Performance’ revealed how aspects of identity such as age, gender, and socioeconomic status intertwined to create pathways towards or away from engagement with digital methods. The theme of ‘Power Redistribution and Resource Navigation’ explored how digital platforms can shift power dynamics in research relationships, with participants utilizing technology to maintain agency and control over their information. Finally, ‘Multi-Modal Trust Building Pathways’ uncovered the complex ways participants established trust in digital research, often relying on a combination of institutional credentials, community validation, and personal connections. These findings suggest the need for an in-depth and intersectional approach to optimize representation in digitally enabled research.

Conclusions: The findings highlight how intersecting factors like age, gender, and socioeconomic status influence digital methods engagement in health research, informing a way of thinking that could advance inclusive research designs. Future directions, including longitudinal studies and mixed-methods approaches, can develop and assess the impact of strategies to promote engagement with digital methods. Identifying and centering research designs around participants’ intersecting identities is a crucial step towards developing health research that is truly representative and beneficial for all communities.

P-85 – UTILISING ROUTINELY COLLECTED HEALTH DATA (RCHD) TO ENHANCE LONG-TERM MONITORING AND EFFICIENCY IN CLINICAL TRIALS: INSIGHTS FROM 2 ACADEMIC PRAGMATIC TRIALS

Primary Author:

1) Katie Mariamne Loveday Hullock (University College London)

Co-Author(s):

2) Ruth Langley (University College London)

3) Angela Meade (University College London)

4) Matthew Nankivell (University College London)

The increasing availability and quality of Routinely Collected Health Data (RCHD), including electronic health records (EHRs) and healthcare system data (HSD), have introduced new possibilities in clinical trial design, particularly for long-term follow-up and safety monitoring. This study explores the integration of RCHD into two UK-based clinical trials: the first investigates aspirin’s role in preventing cancer recurrence, and the second evaluates a preventative polypill for age-related conditions. By utilizing both EHRs and HSD to track serious adverse events and long-term outcomes, these trials demonstrate a combined approach to RCHD that is, at present, one of the only ways to effectively tackle the healthcare system fragmentation between England, Scotland, Wales, and Northern Ireland in a cost-effective and timely way. This integration model mirrors challenges seen in the U.S., where EHR adoption remains uneven across states and systems, and in Canada, where provincial systems manage health data separately. Using RCHD in these trials supports cost-efficient, scalable methodologies that could benefit North American healthcare systems facing similar challenges in trial efficiency, participant retention, and resource limitations. Although this study is UK-specific, its findings on RCHD integration offer a framework that may enhance trial operations across varied healthcare landscapes.

P-87 – INTEGRATING SYNTHETIC DATA AND AI IN PEDIATRIC INTENSIVE CARE CLINICAL TRIALS: A BAYESIAN FRAMEWORK FOR ETHICAL AND SCIENTIFIC ADVANCEMENT

Primary Author:

1) Danila Azzolina (University of Ferrara)

Co-Author(s):

2) Dario Gregori (University of Padua)

3) Paola Berchialla (University of Turin)

Introduction: Pediatric clinical trials face peculiar challenges, due to limited sample sizes, accrual problems, ethical concerns, and heterogeneous populations. Synthetic data, generated through advanced statistical modeling and AI (Artificial Intelligence) approaches, offers a promising solution to mitigate these challenges while maintaining trial integrity. Synthetic data can supplement real-world data for scenario testing, simulate rare disease cohorts, and evaluate trial designs under various conditions without exposing children to unnecessary risks. Moreover, the use of synthetic data can improve statistical power and decision-making efficiency by enabling sensitivity analyses and optimizing sample size requirements. Furthermore, the method supports the development of machine learning-based predictive models to identify high-risk pediatric subgroups and evaluate intervention outcomes.

Method: A randomized trial comparing High-Flow Nasal Cannula (HFNC) therapy and non-invasive ventilation (NIV) in children with bronchiolitis serves as a motivating example for addressing pediatric research’s ethical and methodological complexities using synthetic data. The primary objective of this study is to determine whether HFNC therapy is non-inferior to NIV in preventing intubation among infants admitted with bronchiolitis and experiencing mild to moderate respiratory distress. The non-inferiority margin, established a priori, is defined as an absolute risk reduction of 0.15.

We propose a trial design and analysis pipeline integrating AI-generated synthetic data and Bayesian statistical methods to improve scientific validity and uphold ethical standards. AI-based algorithms were considered for creating synthetic data using generative models, checking for fidelity to the original dataset through validation, including distributional equivalence. These synthetic datasets will serve as the backbone of a Bayesian trial design. In the first stage, synthetic data will inform prior distributions for the Bayesian analysis of the primary endpoint, intubation rates. In the second stage, we will combine real and synthetic data to evaluate treatment efficacy across subgroups, including underrepresented trial populations, i.e., children potentially more at risk of intubation for previous wheezing or hospitalizations.

Results: Preliminary findings on the generated synthetic data indicate that in the original dataset, the posterior mean odds ratio (OR), in favor of HFNC, was 0.72 (95% HDI: [0.37, 1.18]), with a 95.8% probability of noninferiority. The synthetic dataset yielded a similar result with an OR of 0.66 (95% HDI: [0.29, 1.11]), and a 96.8% probability of noninferiority. Furthermore, the mean predictive mean squared error between original and synthetic data was 0.003, indicating high fidelity of the synthetic data in capturing the original data’s statistical properties.
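
To make the notion of a posterior "probability of noninferiority" concrete, here is a minimal Python sketch, assuming a deliberately simple model that is not the authors' analysis: independent Beta(1, 1) priors on the intubation risk in each arm, with non-inferiority declared when the risk difference (HFNC minus NIV) falls below the 0.15 margin. The event counts and the function name are hypothetical and are not taken from the trial.

```python
import random


def prob_noninferior(events_new, n_new, events_ref, n_ref,
                     margin=0.15, draws=20000, seed=1):
    """Monte Carlo estimate of P(risk_new - risk_ref < margin | data)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        # A Beta(1, 1) prior updated with binomial data gives a Beta posterior.
        p_new = rng.betavariate(1 + events_new, 1 + n_new - events_new)
        p_ref = rng.betavariate(1 + events_ref, 1 + n_ref - events_ref)
        if p_new - p_ref < margin:
            hits += 1
    return hits / draws


# Hypothetical counts: 12/100 intubations on HFNC versus 15/100 on NIV.
prob = prob_noninferior(events_new=12, n_new=100, events_ref=15, n_ref=100)
```

In the design described above, the flat priors in this sketch would be replaced by priors informed by the synthetic data.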

Conclusion: From an ethical perspective, especially in the pediatric setting, the integration of synthetic data minimized patient risk by simulating high-risk scenarios in silico. The use of synthetic data in the design also promoted diversity and equity, facilitating estimation for poorly represented patient subgroups. Additionally, the de-identification of synthetic data safeguarded patient confidentiality, aligning with privacy regulations and enabling broader data-sharing practices.

P-88 – OPTIMIZING PEDIATRIC OUTCOMES: ADVANCED BAYESIAN MODELING OF DAYS WITHOUT MECHANICAL VENTILATION IN INTENSIVE CARE TRIALS

Primary Author:

1) Danila Azzolina (University of Ferrara)

In pediatric respiratory trials, accurately modeling days without mechanical ventilation (DWMV) is crucial due to the typical zero-inflation and skewness in the data. Standard statistical methods often fail to address these complexities effectively, which can obscure significant clinical insights. This study harnessed Bayesian statistical methods to evaluate the efficacy of high-flow nasal cannula (HFNC) and noninvasive ventilation (NIV) in a sample of 252 pediatric cases, leveraging these methods’ ability to integrate prior clinical knowledge and manage complex data structures. We deployed four distinct Bayesian models to capture the nuanced distribution of DWMV: Gaussian, Hurdle Negative Binomial, Zero-One Inflated Beta, and Cumulative Logistic Regression. Each model’s effectiveness was assessed using the Leave-One-Out Cross-Validation (LOO) Information Criterion, providing a robust measure of predictive accuracy. Among these, the Zero-One Inflated Beta Model stood out, achieving the lowest LOOIC score (294.8). This model was particularly adept at handling zero-inflation and offered detailed insights into how each treatment influenced the distribution of DWMV days, thereby illuminating the differential impacts of HFNC and NIV. Conversely, the Gaussian model, while straightforward, proved less effective (LOOIC = 1157.9) due to its inadequate handling of zero-inflation. Although it provided a basic understanding of treatment effects, its lack of sophistication in managing the data’s specific challenges limited its utility. The Hurdle Negative Binomial and Cumulative Logistic Regression models also showed good performance (LOOIC = 568.3 and 573.8, respectively), particularly in delineating effects across different patient segments, but they did not reach the predictive accuracy of the Zero-One Inflated Beta Model. 
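
The model comparison above can be tabulated with a few lines of Python. The sketch below simply ranks the four reported LOOIC values and computes each model's gap to the best one; lower LOOIC indicates better expected out-of-sample predictive accuracy. The function name is hypothetical, and since the abstract reports no standard errors for the differences, none are shown.

```python
# LOOIC values as reported in the abstract (lower is better).
looic = {
    "Zero-One Inflated Beta": 294.8,
    "Hurdle Negative Binomial": 568.3,
    "Cumulative Logistic Regression": 573.8,
    "Gaussian": 1157.9,
}


def rank_by_looic(scores):
    """Return (model, LOOIC, difference from best) sorted from best to worst."""
    best = min(scores.values())
    ranked = sorted(scores.items(), key=lambda item: item[1])
    return [(name, value, round(value - best, 1)) for name, value in ranked]


ranking = rank_by_looic(looic)
for name, value, delta in ranking:
    print(f"{name:32s} LOOIC = {value:7.1f}  delta = {delta:6.1f}")
```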
This research highlights the critical importance of selecting appropriate models based on specific data characteristics to achieve precise clinical results. By utilizing advanced Bayesian techniques and tailored models, we can ensure more accurate and clinically relevant estimations of treatment effects. Such precision is important for informing decisions in pediatric respiratory care strategies, where understanding the nuances of treatment effectiveness can significantly influence patient outcomes. Our findings strongly advocate for the broader adoption of these sophisticated Bayesian methods in clinical research. These methods not only improve the accuracy of treatment effect estimations but also the overall quality of outcome assessments in pediatric respiratory care.

P-89 – EFFECTIVE CENTRALIZED TRAVEL MANAGEMENT IN A MULTI-CENTER PARKINSON’S DISEASE CLINICAL TRIAL

Primary Author:

1) Audra Bright (Indiana University School of Medicine)

Co-Author(s):

2) Laura Heathers (Indiana University School of Medicine)

Objective: Widespread marketing campaigns and use of online enrollment platforms have accelerated the identification of participants for clinical trials. However, travel remains an ongoing barrier to converting these efforts into successful recruitment and retention in longitudinal studies. To overcome this challenge, a centralized model was introduced to offer personalized travel support for participants and reduce burden on clinical study teams.

Background: The Parkinson’s Progression Markers Initiative (PPMI) is a longitudinal, observational study supported by The Michael J. Fox Foundation for Parkinson’s Research. PPMI partners with a network of clinical sites (31 US, 20 non-US) to support the standardized collection of data and biospecimens from a broad cohort of volunteers. To meet the demands of PPMI’s continued expansion, the Indiana University (IU) Travel Core launched in 2020 to support increasing travel needs for North American sites and participants. Prior to this launch, the Travel Core worked diligently to identify a travel vendor partner which could streamline scheduling, help reduce booking costs and meet the high-volume demands.

Methods: Ongoing recruitment campaigns drove individuals to an online platform used for remote screening. Participants identified as eligible for further in-person assessments were contacted for a more thorough phone screening and asked a series of travel-related questions. Agreeable participants were securely transferred to a local PPMI site team through a shared collaboration platform, where the local coordinator would finalize the study visit. Visit details were entered into an online form, which routed to IU Travel team to verify accuracy. To ensure participants were well-informed prior to study visits, the IU travel team provided detailed guidance on the PPMI expense and reimbursement policies. Any travel-related questions were funneled to the IU team to address. Customized travel itineraries were generated by the travel vendor partner, Corporate Traveler (CT), and finalized itineraries were distributed to participants and sites for shared visibility. In instances of disrupted travel, such as flight cancellations, the CT support phone line was also available to assist PPMI participants.

Results: Between November 2020 and November 2024, the IU Travel Core successfully facilitated travel for over 3,000 research participants to over 5,000 PPMI site visits. Since 2020, the PPMI project has seen a 253% increase in participant enrollment, with a significant portion of this growth attributed to the robust recruitment efforts across North America and the centralized Travel Core’s role coordinating participant travel. On average, participant reimbursements are processed within 2 weeks of submission. Despite the high volume of participants, the Travel Core team has prioritized delivering a personalized “white glove” experience, which has led to consistently positive feedback from volunteers.

Conclusion: A dedicated centralized team to manage travel needs for a large trial like PPMI can alleviate the financial and logistical burdens that often discourage participation. This strategy helps minimize barriers to recruitment and retention, while also easing the burden on local teams (allowing them to remain focused on the scientific objectives). By providing participants tailored support, this approach enhances satisfaction and increases retention rates. These insights can be applied to other large-scale trials with similar recruitment models to optimize participant engagement.

P-90 – EVALUATING THE EFFICACY OF OUTBOUND IVR IN ENHANCING FOLLOW-UP IN A MULTICENTRE CLINICAL TRIAL

Primary Author:

1) Mark Forrest (University of Aberdeen)

Co-Author(s):

2) David Emele (University of Aberdeen)

Background: Interactive Voice Response (IVR) systems allow participants to interact with automated phone technology through pre-recorded messages and keypad inputs, eliminating the need for direct researcher contact. Despite the rise of online data collection tools, IVR technology retains potential, particularly with its outbound calling feature, which can be employed for participant reminders, appointment scheduling, and data collection. Its flexibility makes it a viable option in various clinical trial settings.

Objective: This study aimed to assess the effectiveness of IVR as a central component of long-term follow-up in a multicenter clinical trial, in light of current technological alternatives.

Methods: Outbound IVR was deployed in the long-term follow-up phase of a pragmatic multicenter randomized controlled trial (SIMS). Participants who did not respond to initial communications were contacted via automated outbound IVR calls in place of a second postal reminder to complete their annual follow-up questionnaire.

Outcome measures: The primary outcome was the completion rate of the SIMS long-term follow-up questionnaire among participants who had not responded to the original communication at the annual follow-up timepoint.

Intervention: Eligible participants were scheduled to receive five automated calls over seven days, at varying times of the day, including at least one weekend call. Participants received a prior text message with a link to complete the questionnaire online. If they did not respond via the link, IVR calls were made to collect their responses.

Results: A total of 276 participants were eligible for follow-up, with different strategies employed to encourage questionnaire completion: 2,160 IVR calls were placed over seven days, resulting in 205 total responses (78 via SMS link completion and 127 via IVR, achieving a 74.3% response rate), while data coordinators contacted 242 participants, yielding 84 responses (34.7%).
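
The two response rates reported above can be checked with a short calculation. This Python sketch uses only the counts stated in the results; the helper name is hypothetical.

```python
def response_rate(responses, contacted):
    """Percentage of contacted participants who responded, to one decimal."""
    return round(100 * responses / contacted, 1)


# SMS link completions plus IVR completions, out of 276 eligible participants.
automated_rate = response_rate(78 + 127, 276)
# Responses obtained by data coordinators, out of 242 participants contacted.
coordinator_rate = response_rate(84, 242)

print(automated_rate, coordinator_rate)
```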

Conclusion: IVR was effective in collecting responses from participants who had not initially responded, demonstrating its potential as a complementary and scalable solution for follow-up in multicenter clinical trials.

P-91 – DESIGN OF AN ELECTRONIC DELEGATION OF AUTHORITY LOG WITHIN A CLINICAL TRIAL MANAGEMENT SYSTEM

Primary Author:

1) Keith Pauls (Medical University of South Carolina)

Co-Author(s):

2) Wenle Zhao (Medical University of South Carolina)

3) Robert Silbergleit (University of Michigan)

4) Deneil Kolk (University of Michigan)

The Delegation of Authority (DOA) Log is required for clinical research studies to record all study team members’ significant study-related duties as well as document and ensure that study team members are aware of their duties, are appropriately trained, and authorized to perform the tasks. Based on team members’ DOA assignments, they may be required to provide documentation in the form of regulatory documents to confirm they are properly trained to perform their designated tasks. It is a critical tool in ensuring oversight and accountability for a clinical research study. However, there are many challenges when it comes to managing and documenting the DOA. These include inaccurate records, inconsistent data capture formats, and lack of standards. These issues can cause many problems, including confusion over which team member is performing which tasks, non-compliance violations, inadequate training, and improper documentation of team member updates. To avoid these pitfalls, an electronic DOA log was implemented into a web-based Clinical Trial Management System (CTMS) for the Strategies to Innovate EmeRgENcy (SIREN) Care Clinical Trials Network, which is funded by the National Institute of Neurological Disorders and Stroke and the National Heart, Lung, and Blood Institute. There are currently 5 projects and 5 ancillary projects for the SIREN network housed within our CTMS. There are over 200 unique sites participating in these projects. Across the 4 active projects, there are over 2,000 clinical site team members listed on the electronic DOAs. The electronic DOA tracks the complete history of DOA changes, including start and end dates when a team member’s assignments change. Using our electronic DOA ensures that all sites collect the data in a standard way and format.
Another benefit of having the DOA data accessible online is that trial operations team members have better oversight over the submitted data and can ensure the DOA is being collected in a standardized and proper way across all sites. Having the DOA data included in our CTMS allows this data to be used for other trial operation purposes. For example, the system automatically posts and tracks the appropriate regulatory documents required for a team member based on the DOA assignments. This is done within our regulatory document module that is included in our CTMS. Another benefit of having the DOA data within our CTMS is that it can be used to automatically validate that certain CRF assessments are performed by a team member with the proper DOA assignment and qualifications. By collaborating with stakeholders across all areas of clinical trials, we developed an integrated electronic DOA module within our CTMS, streamlining a complex process and enhancing efficiency in other trial management areas.
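
The following Python sketch illustrates, under stated assumptions, how a delegation log with start and end dates can keep a full history and support the kind of automated check described above (whether a team member held the right assignment on the date of a CRF assessment). It is not the SIREN CTMS implementation; the class names, field names, and example task are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Delegation:
    member: str
    task: str
    start: date
    end: Optional[date] = None  # None means the assignment is still active

    def active_on(self, day: date) -> bool:
        return self.start <= day and (self.end is None or day <= self.end)


class DoaLog:
    """Append-only history of delegations; entries are closed, never deleted."""

    def __init__(self) -> None:
        self.history: List[Delegation] = []

    def assign(self, member: str, task: str, start: date) -> None:
        self.history.append(Delegation(member, task, start))

    def end_assignment(self, member: str, task: str, end: date) -> None:
        for entry in self.history:
            if (entry.member, entry.task) == (member, task) and entry.end is None:
                entry.end = end

    def is_authorized(self, member: str, task: str, day: date) -> bool:
        return any(
            entry.member == member and entry.task == task and entry.active_on(day)
            for entry in self.history
        )


# Hypothetical usage: validate who performed an assessment on a given date.
log = DoaLog()
log.assign("coordinator_a", "neurological assessment", date(2024, 1, 10))
log.end_assignment("coordinator_a", "neurological assessment", date(2024, 6, 30))
ok_in_window = log.is_authorized("coordinator_a", "neurological assessment", date(2024, 3, 1))
ok_after_end = log.is_authorized("coordinator_a", "neurological assessment", date(2024, 7, 1))
```

Keeping closed entries in the history, rather than overwriting them, is what allows a past assessment to be validated against the delegation that applied at the time.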

P-92 – INTEGRATION OF AI IN EPIDEMIOLOGICAL STUDY DESIGN FOR ENHANCED RESEARCH OUTCOMES

Primary Author:

1) Roberta Bruhn (Yale University)

Co-Author(s):

2) Christine Chaisson (Yale University)

3) Brian Sevier (Yale University)

4) David Coleman (Yale University)

Background: As the landscape of clinical research evolves, ensuring the methodological soundness and relevance of study designs remains paramount. The Protocol Development and Data Feasibility team at the Yale Center for Clinical Investigation (YCCI) has explored the integration of Artificial Intelligence (AI) in the initial stages of study design to address this challenge.

Objective: This initiative aims to leverage AI to screen Principal Investigator (PI) studies and assign the most suitable epidemiological study design types, thereby improving the overall quality and efficiency of clinical research.

Methods: AI algorithms are employed to systematically review study proposals submitted by PIs. The AI analyzes the objectives, hypotheses, and relevant data points from these proposals. It then recommends the most appropriate epidemiological study design, such as case-control, cohort, cross-sectional, or randomized controlled trial. The AI’s algorithms are trained to recognize key indicators and patterns that suggest the optimal design for each unique study.

Results: Preliminary results indicate that the use of AI for assigning study designs provides several benefits. AI-driven recommendations are shown to reduce the time required for human evaluation, enhance the precision of study design selection, and ensure consistency across studies. This process mitigates the risk of inappropriate design choices, thereby increasing the validity and reliability of research findings. Furthermore, this system supports the pursuit of rigorous and robust answers to research questions by tailoring study designs to align with the specific objectives and constraints of each study.

Conclusion: The integration of AI in study design represents a significant innovation in clinical and translational research methodology. By guiding the selection of epidemiological study designs through AI analysis, we can shape the future of research with the right questions and robust answers. This approach promises to enhance the efficiency, accuracy, and overall impact of clinical research conducted at YCCI.

P-93 – BEST PRACTICES FOR THE DESIGN AND CONDUCT OF COMPLEX CLINICAL TRIALS

Primary Author:

1) Tony Succar (University of Southern California)

Co-Author(s):

2) Eunjoo Pacifici (University of Southern California)

Background: The rapid advancement of emerging therapeutic areas and modalities has led to the development of complex clinical trial (CCT) designs, evolving beyond standard clinical trial approaches. CCTs differ from standard clinical trials in their design, methodology, and scope, allowing for the investigation of more intricate clinical research questions, hypotheses and simultaneous evaluation of multiple interventions and outcomes. These trials are characterized by innovative designs such as 1. adaptive trials, which allow prespecified modifications based on interim data analysis; 2. basket trials, which test a single therapeutic for multiple diseases; 3. umbrella trials, which evaluate multiple therapeutics for a single disease; and 4. platform trials, which study treatments in a perpetual manner, with treatments added or removed from the platform during the trial. These innovative designs offer greater flexibility, efficiency, and the ability to address sophisticated research questions. As such, CCTs have gained prominence across various therapeutic areas, extending beyond standard trials with increasing innovations and breakthrough therapies. This review examines the current state and best practices for the design, conduct, and management of modern CCTs.

Methods: A comprehensive review of the scientific, medical, and regulatory literature from 2014 to 2024 was conducted using a combination of keywords related to CCTs such as adaptive trial designs, basket trials, umbrella trials, or platform trials. Publication titles and abstracts were screened for relevance, followed by a full-text review of key studies using the PubMed database. Data from the ClinicalTrials.gov registry was analyzed to identify the prevalence and current trends of different CCT designs. Additionally, official regulatory agency websites were searched for relevant publications and guidance documents regarding CCTs. Findings were analyzed and synthesized to provide best practice recommendations for the design and conduct of CCTs.

Results: The analysis revealed a consistent trend of increasing trial complexity over the past decade, with notable increases in endpoints, inclusion-exclusion criteria, and data points collected. Other key findings included potential benefits of CCTs, showing increased efficiency, improved quality and safety, reduced resource requirements, and the ability to answer multiple research questions simultaneously. Challenges included increased operational complexity, data management hurdles, and regulatory considerations. Best practice strategies such as application of risk-based approaches, use of advanced analytics, and emphasis on stakeholder collaboration were recommended. The Food and Drug Administration (FDA) has been actively encouraging the use of CCTs and established a Complex Innovative Trial Designs Pilot Meeting Program to facilitate and advance their use. The European Medicines Agency (EMA) has initiated an adaptive pathways initiative to encourage innovative trial designs. The review also identified gaps in current research, particularly regarding the long-term impact of CCTs on medical product timelines and success rates.

Conclusion: CCTs offer promising opportunities to accelerate medical product development and regulatory approvals. In recent years, regulatory agencies, particularly the FDA and EMA, have shown increasing support for CCT designs. However, implementing them requires advanced planning, expertise, and resources. Future research should focus on optimizing CCT designs, addressing operational challenges, and evaluating their global and long-term impact on medical product development and healthcare outcomes.

P-94 – PATIENT AND PUBLIC INVOLVEMENT AND ENGAGEMENT TO METHODOLOGICAL RESEARCH: INSIGHTS FROM A PANEL

Primary Author:

1) Nikki Totton (University of Sheffield)

Co-Author(s):

2) Steven Julious (University of Sheffield)

3) Ellen Lee (University of Sheffield)

Results: The convened panel consists of 22 members of the public. The panel have met six times since May 2023, all as online meetings conducted through Google Meet. Average attendance to the panel is approximately 80%. Eleven different topics have been discussed to date ranging from clinical trial design to the use of registry data in research. Recommendations for a PPIE panel – The panel has three facilitators (researchers) to run each session. It is recommended this is done with a minimum of two people to ensure all information is captured as well as provide structure and support for the PPIE members who will be less familiar with research meetings. Online meetings have been deemed acceptable for PPIE members and are set at a maximum of two hours with a comfort break included. Recommendations for conducting PPIE for methodological work – PPIE members at a minimum can meaningfully input to a plain English summary of the methodological project. Jargon should be avoided where possible but used when the definition is important e.g. “adaptive designs”. In these cases, clear definitions are important at the outset so an agreed understanding can be used when discussing the project. PPIE members can help to refine these definitions. Specific questions are useful to get clear responses but leaving space for general comments will help to highlight anything which may not have been considered.

P-95 – TWILIO ALERTS WITHIN THE PREVENTABLE ALERT SYSTEM

Primary Author:

1) Amanda Montgomery (Wake Forest University School of Medicine)

Co-Author(s):

2) Letitia Perdue (Wake Forest University School of Medicine)

3) Mark King (Wake Forest University School of Medicine)

4) Wesley Roberson (Wake Forest University School of Medicine)

5) Julissa Almonte Santana (Wake Forest University School of Medicine)

The PRagmatic EValuation of evENTs And Benefits of Lipid lowering in oldEr adults (PREVENTABLE) Trial is a double-blind, randomized, multi-site pragmatic clinical trial assessing whether the cholesterol-lowering drug atorvastatin can help adults 75 and older prevent dementia, physical disability, and death. A pragmatic trial is a type of research study designed to evaluate the effectiveness of interventions in real-world, routine clinical settings. By focusing on those aged 75 and older who do not have a history of cardiovascular disease, the PREVENTABLE study seeks to explore whether this treatment can improve health outcomes and overall quality of life as people age. With over 100 clinics located throughout the United States and Puerto Rico, and a planned enrollment of 20,000 participants, the PREVENTABLE study is a large research initiative. Due to its vast size and expansive coverage area, building an efficient communications system was critical. One key element within the study’s communication framework is a highly specialized, custom-built alert system that notifies clinic staff of actions requiring follow-up. This tailored system has been designed specifically to meet the unique needs of the study, ensuring precise, real-time notifications and effective communication across all relevant parties. By implementing a custom solution, the system can better address specific criteria and response protocols, enhancing the accuracy, efficiency, and reliability of alerts. Automating the processes of detection and notification can significantly reduce the potential for human error, thereby enabling responses that are both faster and more precise. An additional piece, critical to the success of a trial, is collection of follow-up data. To help ensure participants stay informed and engaged, PREVENTABLE implemented an automated process to send them appointment reminders.
The study utilized the features and resources provided by Twilio, a cloud communications platform that enables developers to build, scale, and operate communications solutions. Twilio APIs allow businesses to integrate communication features, such as sending SMS messages, into their applications. Twilio’s services are scalable and reliable, with features designed to handle high volumes of communication while ensuring data security and privacy. Communicating appointment reminders via SMS messages introduced the ability for participants to respond to messages with study-related information that needed to be forwarded to specific study staff. PREVENTABLE interwove this communication with the PREVENTABLE alerts system to specifically handle incoming participant communication. This presentation/poster will describe how the two systems work together and provide specific code snippets that demonstrate the mechanics of the integration.
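
As an illustration only, and not the PREVENTABLE code, the routing step might look like the Python sketch below. It assumes the inbound message arrives as webhook form fields named `From` and `Body`, as Twilio sends for incoming SMS. The keyword table, team names, and the `Alert` type are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical keyword-to-team routing table; not taken from the study.
ROUTES = {
    "reschedule": "site_coordinator",
    "side effect": "safety_team",
    "stop": "retention_team",
}


@dataclass
class Alert:
    recipient: str  # study staff group to notify
    sender: str     # participant phone number from the webhook
    text: str       # original message body


def route_incoming_sms(form: dict) -> Alert:
    """Turn inbound-SMS webhook fields into an alert for study staff."""
    body = form.get("Body", "").strip()
    sender = form.get("From", "")
    lowered = body.lower()
    for keyword, team in ROUTES.items():
        if keyword in lowered:
            return Alert(team, sender, body)
    # Anything unrecognized goes to a default queue for manual review.
    return Alert("manual_review", sender, body)
```

In a real deployment, a function like this would sit behind the web endpoint that receives the provider's callback, and the resulting alert would be passed to the existing alert system.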

P-96 – NAVIGATING SAFETY REPORTING IN A RARE DISEASE SETTING

Primary Author:

1) Evan Tomaschek (Medical University of South Carolina)

To responsibly protect human subjects in a vulnerable population, every effort should be made to ensure diligent safety monitoring and adverse event reporting. The TReatment for ImmUne Mediated PathopHysiology (TRIUMPH) trial is an NIH-funded Phase 2b clinical trial designed to investigate immunosuppressive therapy to treat children with acute liver failure of unknown etiology. The trial is being conducted under an FDA Investigational New Drug application (NCT# 04862221). During study development, participant risk was extensively researched for all treatment arms and in the context of children in critical care for acute liver disease. Exclusion and treatment discontinuation criteria were carefully mapped in the protocol under the guidance of the FDA, DSMB, and Central IRB. A Safety Monitoring Plan (SMP) was developed by the Operations Team to establish study team responsibilities and a clear definition of adverse events, including all anticipated events based on the known complications of pediatric acute liver failure and associated interventions, such as transplantation, and expected events related to study treatment. The plan also includes steps and responsibilities for expedited reporting to the FDA according to the FDA Guidance for Industry on safety reporting requirements for IND studies. A well-rounded SMP implemented prior to enrollment has proven to be a valuable tool, as have site-facing resources for collecting and entering data. This presentation will provide an overview of the steps the TRIUMPH investigators took to ensure a reliable, streamlined process for safety reporting, including the components of the SMP, the design of the trial’s clinical trials management system safety module, and the study team members involved in the process, who include site investigators, data and project managers, medical safety monitors, regulatory specialists, and the unblinded statistical team.

P-98 – ENHANCING DATA QUALITY IN THE HEALEY ALS PLATFORM TRIAL THROUGH SYSTEMATIC OUTCOME REVIEW

Primary Author:

1) Mirna Thomas (Massachusetts General Hospital/Neurological Clinical Research Institute)

Co-Author(s):

2) Hong Yu (Massachusetts General Hospital/Neurological Clinical Research Institute)

3) Michaela Estes (Massachusetts General Hospital/Neurological Clinical Research Institute)

Introduction: In ALS clinical trials, the accuracy of primary outcome measures such as the Amyotrophic Lateral Sclerosis Functional Rating Scale - Revised (ALSFRS-R) and Slow Vital Capacity (SVC) is essential for assessing disease progression. A systematic outcome data review process has been implemented across the first seven regimens within the HEALEY ALS Platform Trial to detect inconsistencies and improve data quality, ensuring more reliable trial results. This poster outlines the workflow and the data quality improvements achieved through this process.

Workflow overview: (1) Identification of Potential Issues: Subject matter experts defined specific criteria for identifying discrepancies in ALSFRS-R and SVC scores between visits. Using these predefined criteria, both automated logic checks and manual review are conducted to identify discrepancies in scores. These include score fluctuations and scores that meet the threshold for review. (2) Communication with Sites: Upon identifying discrepancies, the Data Management Team (DM) sends emails to the sites with a list of identified scores, requesting score verification, corrections in the electronic data capture if errors are identified, and explanations for fluctuations. (3) Feedback Loop for Training Improvement: Responses from sites are reviewed by DM and categorized according to the reason for fluctuations, such as disease progression, different evaluators, or data entry errors. This feedback is provided to the Outcome Measurement Training team to support continuous training improvement. (4) Data Quality Improvement: Over time, the review process has led to marked improvements in data quality, especially in later trials (Regimens F & G). Improvements include reduced data entry errors, fewer unexplained score fluctuations, and increased consistency in having the same evaluator conduct assessments across visits, significantly reducing variability in scores.
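
The automated logic check in step (1) can be pictured with a minimal Python sketch. The abstract does not state the actual review criteria, so the thresholds `max_drop` and `max_gain` and the input layout are assumptions made purely for illustration.

```python
def flag_score_changes(visits, max_drop=6, max_gain=2):
    """Return (subject, earlier_visit, later_visit, change) for suspect changes.

    visits: list of (subject_id, visit_number, alsfrs_r_total) tuples.
    max_drop / max_gain: hypothetical review thresholds in ALSFRS-R points.
    """
    by_subject = {}
    for subject, visit, score in visits:
        by_subject.setdefault(subject, []).append((visit, score))

    flagged = []
    for subject, rows in by_subject.items():
        rows.sort()  # order each subject's records by visit number
        for (v1, s1), (v2, s2) in zip(rows, rows[1:]):
            change = s2 - s1
            # Flag large declines and any gain beyond the tolerance.
            if change < -max_drop or change > max_gain:
                flagged.append((subject, v1, v2, change))
    return flagged
```

Each flagged tuple corresponds to one entry in the list sent to a site for verification in step (2).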

Results: The implementation of the outcome review process has led to significant improvements in data quality. Key improvements include: (1) a 44% reduction in data entry errors between the pre- and post-implementation phases; (2) a 31% increase in the frequency of the same evaluator conducting assessments across visits; and (3) a 25% improvement in the clarity of documented score fluctuations, leading to improved trial reliability.

Conclusion: The systematic review process has proved effective in identifying potential data issues and improving overall data quality in the HEALEY ALS Platform Trial. By continuing to refine this workflow and provide ongoing training based on learnings and site feedback, we highlight the utility of platform trials to allow for continued operational process improvements to increase data accuracy, ultimately enhancing the validity of trial results.

P-100 – MULTIPLICITY IN NON-LICENSING RANDOMIZED CONTROLLED TRIALS: SOFTWARE TOOL TO CALCULATE SAMPLE SIZES

Primary Author:

1) Katie Pike (University of Bristol)

Co-Author(s):

2) Barnaby C Reeves (University of Bristol)

3) Chris A Rogers (University of Bristol)

Background: Multiplicity in randomized controlled trials (RCTs) is a problem because it can increase the chance of false positive findings. Most proposed solutions consist of making a statistical adjustment. Recommendations have been produced for the design and analysis of non-licensing RCTs with multiple primary outcomes or multiple treatment comparisons covering: when multiplicity adjustment is required and which methods to use in which circumstances. A related issue is how the trial sample size should be calculated to appropriately account for necessary changes due to the multiplicity approach.

Methods: A user-friendly tool has been developed to calculate sample sizes, comprising a suite of R programs. Currently, the tool informs sample size adjustments due to multiple primary outcomes; it is being updated to cover multiple treatment comparisons. The user inputs trial design details, including: the number of outcomes; anticipated treatment differences and standard deviations for each outcome; and correlations between outcomes. The tool then calculates study-wise (i.e. combined across all outcomes) Type I error, power and resultant sample size. Unadjusted and multiplicity-adjusted values are calculated (Bonferroni or Hochberg methods). The output comprises the calculated study-wise Type I error, power and sample size values, and optimal values for the specific design; issues with non-recommended, but commonly used sample sizes are listed.

Results: Example trial design scenarios: For a trial where hypotheses for all primary outcomes must be rejected for a treatment to be declared effective, no adjustment for multiplicity is required. However, the trial sample size calculation should be based on conjunctive power (the probability of correctly rejecting null hypotheses for all outcomes), rather than the individual power for each outcome (as is often done). This means an increased sample size is required. Accounting for correlations between outcomes ensures the increase is minimized. The output provides the required sample size based on the input parameters given, and highlights that a calculation based on the individual power for each outcome is likely to mean the trial is underpowered. For a trial where hypotheses for at least one outcome (but not all) must be rejected for a treatment to be declared effective, multiplicity adjustment using the Bonferroni or Hochberg method is recommended. The output provides sample sizes based on both adjustment methods. It also presents the sample size and power if the calculation were to be performed without allowing for the multiplicity adjustment (as is frequently done) and highlights that this approach would mean the trial would be underpowered.
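
To make the two scenarios concrete, here is a rough Python sketch. It is not the authors' R tool. It assumes a two-sided, two-sample z-test on a standardized difference, the same effect size for every outcome, and independent outcomes. For positively correlated outcomes, the independence assumption typically overstates the sample size needed for conjunctive power, which is why the tool described above takes correlations as input. All function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist

_z = NormalDist().inv_cdf  # standard normal quantile function


def n_per_arm(effect_size, alpha, power):
    """Per-arm n for a two-sided, two-sample z-test on a standardized difference."""
    return ceil(2 * (_z(1 - alpha / 2) + _z(power)) ** 2 / effect_size ** 2)


def n_all_outcomes(effect_size, k, alpha=0.05, power=0.90):
    """All k outcomes must be significant: no alpha adjustment, but each
    outcome needs power**(1/k) so that the product of the individual powers
    (the conjunctive power under independence) reaches the target."""
    return n_per_arm(effect_size, alpha, power ** (1 / k))


def n_at_least_one_bonferroni(effect_size, k, alpha=0.05, power=0.90):
    """At least one outcome must be significant: Bonferroni tests each outcome
    at alpha / k. Power is kept per outcome here, which is a simplification."""
    return n_per_arm(effect_size, alpha / k, power)
```

Comparing `n_per_arm(0.5, 0.05, 0.90)` with `n_all_outcomes(0.5, 2)` and `n_at_least_one_bonferroni(0.5, 2)` shows the direction of the effect described above: both multiplicity-aware calculations ask for more participants than the single-outcome calculation.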

Discussion: The tool provides an accessible method to calculate the sample size for trials with multiple primary outcomes. The correlation between outcomes is accounted for, ensuring that a sample size calculation is not unnecessarily inflated, as may be the case if the correlation were ignored. Furthermore, it gives the user recommendations on which study-wise Type I errors, powers and resultant sample sizes should be used, and provides feedback on problems with certain commonly used designs. Extensions to the tool are planned to increase the applicability to a wider range of designs.

P-103 – USE OF VARYING-ACCESS DATABASE TABLES TO MANAGE CLINICAL SITE AND STUDY PERSONNEL DATA FOR MULTI-CENTER CLINICAL TRIALS

Primary Author:

1) Jennifer Gassman (Cleveland Clinic)

Co-Author(s):

2) Milena Radeva (Cleveland Clinic Quantitative Health Sciences)

3) Brett Larive (Cleveland Clinic Quantitative Health Sciences)

4) Cynthia Kendrick (Cleveland Clinic Quantitative Health Sciences)

5) Kimberly Wiggins (Cleveland Clinic Quantitative Health Sciences)

6) Suzy Comhair (Cleveland Clinic)

7) Anna Hemnes (Vanderbilt University Medical Center)

8) Stephen Mathai (Johns Hopkins University School of Medicine)

9) Gustavo Heresi (Cleveland Clinic Respiratory Institute)

Management of a multi-center clinical trial benefits from tracking site and personnel data in database tables. In multi-center studies run by the Data Coordinating Center team in the Cleveland Clinic’s Quantitative Health Sciences (QHS) Department, we use a Clinical Site table and a Study Personnel table to allow for dynamic tracking of the status of participating sites and site personnel. These tables have evolved over the last 30 years, each customized to the needs of the study and implementing lessons learned in previous trials. Access is password protected; staff members are only able to see their own site’s tables. We use online site tables for all of our NIH-funded studies. For example, in the NIDDK COMBINE multi-center trial, we used the Site table both to track the date of local IRB approval at each site and to document the manufacturer, field strength, and software version to be used for COMBINE MRIs. During the NIDDK AASK multi-center trial, the calcium channel blocker treatment arm was discontinued, and we used the site table to capture the date each site’s IRB approved the revised consent (after which the site could begin re-enrollment under the new protocol). We have recently begun work on the NHLBI Empagliflozin to Improve Right Ventricular Function in Pulmonary ArTerial Hypertension (EmPATH) multi-center clinical trial, enrolling patients at the Cleveland Clinic, Johns Hopkins, and Vanderbilt. Funded via Clinical Coordinating Center UG3/UH3 to the Cleveland Clinic Department of Pulmonology and Data Coordinating Center (DCC) U24 to Cleveland Clinic QHS, the triple-masked parallel arm trial will compare the effects of Empagliflozin vs. Placebo in pulmonary arterial hypertension patients. The primary outcome is change in Right Ventricular Ejection Fraction by Cardiac MRI. The Biorepository Core will document the date that each site has met EmPATH training criteria for sample processing, storage, and shipping.
The Imaging Core will document the date of approval for each clinical site’s Cardiac MRI system and Echocardiogram/Sonography system. This database table is programmed such that the dates of IRB approval are documented locally, and the dates of Core approvals are documented centrally. These data facilitate automatic checks of a site’s “Ready to Enroll” status. Best Practices for trial documentation require Attribution, including identification of those seeing patients, performing procedures, or collecting data. We use personnel tables to store trial IDs for each staff member. Other key components of the Personnel form include 1) a staff member’s role (e.g., Site PI, Physician, Study Coordinator), 2) “currently active,” a field that facilitates the DCC ending database access when a person leaves their position, and 3) individual training, certification, and (when needed) annual recertification. The database table is programmed such that the staff member’s role and “currently active” status are documented locally, and training and certification dates are documented centrally. We will present examples of site and personnel data collection systems from multiple trials for which we served as DCC, describing past utility of and upcoming plans for collecting and using these data.
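
A “Ready to Enroll” check of the kind described above could be sketched as follows. The field names and the rule that every date must be present and not in the future are assumptions for illustration, not the DCC's actual schema or logic.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class SiteRecord:
    site: str
    irb_approval: Optional[date] = None           # documented locally by the site
    biorepository_trained: Optional[date] = None  # documented centrally by the core
    mri_approved: Optional[date] = None           # documented centrally by the core
    echo_approved: Optional[date] = None          # documented centrally by the core


def ready_to_enroll(rec: SiteRecord, today: date) -> bool:
    """A site is ready once every required date exists and is not in the future."""
    required = [rec.irb_approval, rec.biorepository_trained,
                rec.mri_approved, rec.echo_approved]
    return all(d is not None and d <= today for d in required)
```

Keeping the locally and centrally documented dates in a single record lets one function evaluate readiness without caring who entered each field.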

P-104 – SUMMARIZING PRO-CTCAE: A NEW INDEX FROM AVERAGED COMPOSITE SCORES AT A CROSS-SECTIONAL TIMEPOINT

Primary Author:

1) Minji Lee (Mayo Clinic)

Co-Author(s):

2) Ethan Basch (University of North Carolina)

3) Sandra Mitchell (National Institutes of Health / National Cancer Institute)

4) Allison Deal (University of North Carolina)

5) Blake Langlais (Mayo Clinic)

6) Gita Thanarajasingam (Mayo Clinic)

7) Brenda Ginos (Mayo Clinic)

8) Lauren Rogak (Mayo Clinic)

9) Tito Mendoza (National Institutes of Health)

10) Antonia Bennett (University of North Carolina)

Background: In quality-of-life and multi-symptom assessment measures, unweighted sum scores or their linear transformations are commonly used to summarize complex constructs into a single score. This standard practice can be applied to developing continuous summary scores based on multiple PRO-CTCAE (Patient-Reported Outcomes version of the Common Terminology Criteria for Adverse Events) scores. The purpose of this study was to investigate the psychometric properties and interpretability of a summary score that uses the average of the PRO-CTCAE composite scores among the adverse event (AE) terms deemed important for several cancers.

Methods: We analyzed the original PRO-CTCAE validation dataset, which included 940 adults undergoing chemotherapy or radiation therapy at nine U.S. cancer centers or community oncology practices (clinicaltrials.gov NCT02158637). The analyses focused on AE terms recommended for breast (16 terms), lung (8 terms), and head and neck (17 terms) cancers selected from prior mixed-methods studies. Each PRO-CTCAE term was measured by frequency, severity, interference or a combination of these attributes, and we used composite scores to provide a single representative value per symptom term. The overall burden of symptomatic AEs at a given timepoint was summarized by averaging the composite scores for each participant. We examined Spearman’s correlations, coefficient alpha, and eigenvalues from the correlation matrices, and evaluated one-factor confirmatory factor analysis (CFA) models using the diagonally weighted least squares estimator.
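
The two simplest quantities in this analysis, the per-participant composite average and coefficient alpha, can be sketched in plain Python. Skipping missing terms in the average and requiring complete data for alpha are assumptions of this sketch and may not match how the validation data were handled.

```python
from statistics import mean, pvariance


def composite_average(scores):
    """Mean of one participant's composite scores (each 0-3), skipping missing terms."""
    observed = [s for s in scores if s is not None]
    return mean(observed) if observed else None


def coefficient_alpha(data):
    """Cronbach's alpha for complete data: rows are participants, columns are AE terms."""
    k = len(data[0])
    item_vars = [pvariance([row[j] for row in data]) for j in range(k)]
    total_var = pvariance([sum(row) for row in data])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Because it is an average rather than a sum, the summary score stays on the 0 to 3 scale of the composite scores regardless of how many AE terms a cancer-specific set contains.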

Results: The mean correlations among the PRO-CTCAE composite average scores were 0.34 in the lung cohort (range: 0.17-0.56), 0.29 in the breast cohort (range: -0.00-0.70), and 0.35 in the head/neck cohort (range: 0.03-0.74). Coefficient alphas were 0.81, 0.87, and 0.91, respectively. The first eigenvalue explained 43%, 36%, and 39% of the variance, with eigenvalues flattening significantly after the first. The CFA model fit was strong for the lung cohort: Comparative Fit Index (CFI)=0.99, Tucker-Lewis Index (TLI)=0.98, Root Mean Square Error of Approximation (RMSEA)=0.067 (90% CI: 0.029-0.101), and Standardized Root Mean Square Residual (SRMR)=0.072. For the breast cohort, the model fit was excellent when allowing residual correlations: CFI=0.98, TLI=0.98, RMSEA=0.068 (90% CI: 0.056-0.080), and SRMR=0.076. For the head/neck cohort, allowing residual correlations yielded good fit: CFI=0.99, TLI=0.99, RMSEA=0.057 (90% CI: 0.037-0.075), and SRMR=0.079. Average unstandardized factor loadings from the CFA models were 0.74 (0.54-1.00) for the lung, 0.74 (0.37-1.00) for the breast, and 0.77 (0.57-1.00) for the head/neck cohorts. Finally, composite average scores were highly correlated with factor scores from the CFA models (0.975, 0.969, and 0.977) and with the first principal component from the principal component analysis (0.998, 0.996, and 0.999).

Conclusion: The composite average represents a reliable, valid and easily calculated approximation of the latent variable across several distinct adverse event profiles. With well-fitting linear factor models, the composite average maintains the essential property of being monotonically related to the latent variable. The composite average has a theoretical range between 0 and 3, regardless of the number of AE terms. If replicated across study contexts, the composite average may offer an intuitive and user-friendly score for use in clinical trials.

P-106 – ESTIMATING AND INTERPRETING INTERVENTION EFFECTS IN RANDOMISED MULTI-SESSION THERAPY TRIALS WITH PARTIAL INTERVENTION ADHERENCE: IS A BINARY DEFINITION OF COMPLIANCE APPROPRIATE?

Primary Author:

1) Rumana Omar (University College London)

Co-Author(s):

2) Teresa Lee (University College London)

3) Baptiste Leurent (University College London)

4) Julie Barber (University College London)

In complex intervention trials, where the intervention consists of therapy given over multiple sessions, adherence of trial participants to the intervention is often partial, with many attending only a proportion of the prescribed sessions. This partial adherence presents challenges for estimating and interpreting intervention effects on outcomes. This research had two primary aims. The first was to investigate how non-adherence is reported in published trials and the analytical methods used to address it. A systematic review of individually randomized parallel-group trials involving multi-session therapy, published between 2019 and 2023 in leading medical journals, revealed that in most trials, data were analyzed using an intention-to-treat (ITT) approach. However, ITT does not account for the fact that a significant number of participants did not fully adhere to the intervention. In addition to the ITT approach, some studies applied the Complier Average Causal Effect (CACE) analysis, a method that uses randomization as an instrument to estimate the causal effect among participants who adhered to the intervention. The CACE method typically uses a binary definition of compliance. Consequently, to use CACE in multi-session therapy trials, participants who had attended at least a certain number of therapy sessions were classified as compliers while others were treated as non-compliers. This approach fails to account for varying adherence levels. If the true compliance-outcome relationship does not follow a strict “jump” function, where benefits only begin at a specific adherence cutoff, the binary CACE estimates may be biased. Additionally, the reliance on the assumption that randomized allocation has no effect on non-compliers, known as the exclusion restriction, potentially poses further challenges.
The second aim of this research was to address the limitations of the binary CACE approach by exploring an alternative continuous CACE approach and considering the implications on the interpretation of the estimate of the intervention effect. In continuous CACE the number of sessions attended by each participant is treated as continuous and the intervention effect is estimated as the average causal effect by session or by a proportion of the sessions attended. Both ITT and CACE approaches were applied to real trial data. The comparison of the ITT and CACE estimates highlighted how the respective estimands address distinct research questions and emphasized the need to clearly define the estimands to quantify the intervention effect accurately under different assumptions of adherence. A simulation study was conducted to compare the binary and continuous CACE methods by varying adherence levels and dose-response relationships including both linear and non-linear associations between compliance and outcomes. The continuous CACE method performed generally well under different dose-response assumptions. The binary CACE approach provided an unbiased estimate only when the true dose-response association was a jump function with no intervention effect until attendance reached a specific session threshold and the compliance cutoff used in the analysis closely matched this threshold. However, in multi-session therapy trials the true dose-response association is rarely known and is unlikely to follow a jump function and the continuous CACE may offer greater flexibility.
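
The contrast between the two estimators can be illustrated with the simplest instrumental-variable (ratio) form. This is a sketch under strong assumptions, not the authors' analysis: the control arm is assumed to attend no sessions, randomization is assumed to affect outcomes only through attendance, and the continuous version assumes a linear dose-response. Function names are hypothetical.

```python
from statistics import mean


def itt_effect(y_treat, y_ctrl):
    """Difference in mean outcome between the randomized arms."""
    return mean(y_treat) - mean(y_ctrl)


def binary_cace(y_treat, y_ctrl, sessions_treat, cutoff):
    """ITT effect divided by the proportion of compliers (>= cutoff sessions)."""
    p_comply = mean([1 if s >= cutoff else 0 for s in sessions_treat])
    return itt_effect(y_treat, y_ctrl) / p_comply


def continuous_cace(y_treat, y_ctrl, sessions_treat):
    """ITT effect divided by the mean number of sessions attended:
    an average causal effect per session under a linear dose-response."""
    return itt_effect(y_treat, y_ctrl) / mean(sessions_treat)
```

With the same ITT effect in the numerator, the binary estimate changes with the chosen cutoff while the per-session estimate does not, which is one way to see why the choice of cutoff matters for the binary approach.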

P-108 – PRAGMATIC MONITORING OF EMERGING EFFICACY DATA IN RANDOMIZED CONTROLLED TRIALS

Primary Author:

1) Shrikant Bangdiwala (McMaster University)

Co-Author(s):

2) Salim Yusuf (McMaster University)

Monitoring the conduct of Phase III randomized controlled trials is driven by ethical reasons to protect the study integrity and the safety of trial participants. We propose a group sequential, pragmatic approach for monitoring the accumulating efficacy information in randomized controlled trials. The “PHRI boundary” is simple to implement and sensible, as it considers the reduction in uncertainty with increasing information as the study progresses. It is also pragmatic, since it takes into consideration the typical monitoring behavior of monitoring committees of large multicenter trials and is relatively easily implemented. It not only controls the overall Lan-DeMets Type I error probability (alpha) spent, but performs better than other group sequential boundaries for the total nominal study alpha. We illustrate the use of our monitoring approach in the early termination of the Heart Outcomes Prevention Evaluation (HOPE) trial and the Cardiovascular OutcoMes for People using Anticoagulation StrategieS (COMPASS) trial.

P-109 – RATES AND PREDICTORS OF MISSINGNESS IN CLINICAL TRIALS FOR SUBSTANCE USE: A SECONDARY ANALYSIS OF EIGHT NIDA CLINICAL TRIALS NETWORK STUDIES

Primary Author:

1) Michael Otterstatter (The Emmes Company)

Co-Author(s):

2) Amy Hahn (The Emmes Company)

3) Abigail Matthews (The Emmes Company)

4) Ashley Vena (The Emmes Company)

5) Kathryn Hefner (The Emmes Company)

Background: In clinical trials, despite careful study design and data collection, missing data are inevitable. Missing data represent a loss of information that reduces statistical power and may bias analyses leading to invalid conclusions. This issue is particularly acute in studies of substance use disorders (SUDs), owing to challenges such as relapse, unstable housing and inadequate transportation. Participants may intermittently miss study visits, or drop out from the study entirely, leading to complex and/or heterogeneous patterns of missingness in study outcomes. Interpreting these patterns to inform appropriate handling and analysis of missing values is challenging, given that underlying reasons for missingness may be infeasible to collect. Here, we describe a secondary analysis of missing data in clinical trials on SUD supported by the National Drug Abuse Treatment Clinical Trials Network (CTN). Our aim was to evaluate patterns of missingness and identify predictors to inform trial design and development of realistic simulations for the assessment of statistical methods for handling missing data.

Methods: Data were obtained from the National Institute on Drug Abuse Data Share website on eight studies with longitudinal urine drug screen (UDS) outcome data and an adult study population with sample size of at least 100. Key participant characteristics and substance use measurements were harmonized across studies and study protocols were reviewed to identify design characteristics. Missingness was classified as intermittent (participants with missing UDS followed by non-missing UDS) versus study dropout (participant withdraws from study early, after which all UDS are missing). Rates and patterns of missingness in UDS were determined for each study and compared using chi-square tests. Predictors of missingness were assessed with mixed-effect logistic regression models including participant-level random effects.
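
The intermittent-versus-dropout distinction can be expressed as a small Python function. Treating any run of trailing missing values as dropout is an assumption of this sketch. The abstract defines dropout by early withdrawal from the study, which would normally be confirmed against study disposition data.

```python
def classify_missingness(uds):
    """Classify one participant's ordered UDS results (None = missing).

    Returns the number of intermittent missing visits (a missing value that
    is followed later by an observed one) and whether the series ends in
    missing values, which this sketch treats as study dropout.
    """
    last_observed = max(
        (i for i, v in enumerate(uds) if v is not None), default=-1
    )
    intermittent = sum(1 for v in uds[: last_observed + 1] if v is None)
    dropout = last_observed < len(uds) - 1
    return {"intermittent": intermittent, "dropout": dropout}
```

Applied to every participant, the two fields give the per-study counts of intermittent missingness and dropout that the chi-square comparisons are based on.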

Results: Across the eight studies, missingness in the repeated measures UDS outcome was 33% overall and ranged from 15%-52% per study. Overall, 28% of participants had no missing UDS and, within studies, 20-50% of participants had less than 5% missing. However, in six of the studies, the distribution of missingness was highly skewed, with some participants having missingness rates up to 90%. All eight studies showed a mix of both intermittent missingness and study dropout, but a large majority (83%) of participants with missing UDS had only intermittent missingness, while 17% were study dropouts (p < 0.001). Participant sex at birth and age were significant predictors of missingness, with missing UDS more common among females (p = 0.042) and among younger participants (p < 0.001).

Conclusion: Within and across SUD trials we find substantial heterogeneity in rates of missing outcome data. Missingness likely arises through a mix of random and non-random mechanisms, making interpretation difficult. However, we find consistent predictors of missingness that may be useful for informing analyses. The next phase of this study will expand the scope of trials assessed for missing data, as well as the predictors of missingness. Observed patterns of missingness will be used to generate realistic simulations and evaluate key statistical methods for handling missing values. The implications of real-world patterns of missingness for improved design of clinical trials will be assessed.

P-110 – NON-INFERIORITY AND EQUIVALENCY TESTING IN THE FOUR ARM RANDOMIZED HYBRID TYPE I EFFECTIVENESS-IMPLEMENTATION STUDY OF AN EHEALTH DELIVERY ALTERNATIVE FOR CANCER GENETIC TESTING FOR HEREDITARY CANCER (EREACH2)

Primary Author:

1) Brian Egleston (Fox Chase Cancer Center)

Co-Author(s):

2) Dominique Fetzer (University of Pennsylvania)

3) Linda Fleisher (Fox Chase Cancer Center)

4) Jill Hasler (Fox Chase Cancer Center)

5) Angela Bradbury (University of Pennsylvania)

Investigating non-inferiority or equivalency in a trial with three or more arms is less common than in trials with two arms. In this work, we discuss non-inferiority and equivalency designs we developed for a four arm trial. The motivating study is the Randomized Hybrid Type I Effectiveness-Implementation Study of an eHealth Delivery Alternative for Cancer Genetic Testing for Hereditary Cancer (eREACH2). Germline cancer genetic testing has become a standard evidence-based practice, with established risk reduction and cancer screening guidelines for genetic carriers. The eREACH2 study investigates in-person visits versus a web-based eHealth intervention for pre-genetic test counseling and post-test disclosure. The trial will inform whether an eHealth intervention can provide non-inferior behavioral outcomes when compared to traditional in-person counseling. Both pre-test and post-test sessions will be randomized. This results in four treatment arms (both sessions in-person, both sessions eHealth, and two arms that are mixtures of in-person and eHealth). Our three primary endpoints will be 1) uptake of services and change in 2) knowledge and 3) anxiety. We will test whether eHealth delivery alternatives are non-inferior (knowledge and anxiety) or equivalent (uptake) to traditional counseling. In non-inferiority and equivalency tests, null and alternative hypotheses are the reverse of usual; for non-inferiority, the null hypothesis is that the eHealth delivery alternative is worse than the traditional delivery model. Our non-inferiority test for the two continuous variables will be based on an ANOVA F-statistic test jointly comparing the four randomization arms.
We will fail to reject the null hypothesis of inferiority if the joint p-value is less than 0.2 and all of the three experimental web arms have a standardized effect of 0.138 standard deviation units or worse (with the direction standardized such that higher values imply beneficial change) when compared to the control in-person only arm. This design will approximate a one-sided test, so that we will declare non-inferiority if the intervention arms have better outcomes. We chose a 0.138 cut-point because it is a small standardized effect. For the binary uptake endpoints, we will conduct a Chi-squared test of the 4x2 table of uptake among the four arms and require a p-value of 0.1 or greater to declare equivalence. Our design gives us >87% power and <1.67% Type I error rates with 175 to 215 participants per arm under a range of null (i.e., encouraging) and alternative (i.e., discouraging) hypotheses. In this presentation, we will describe our assumptions and simulations that justify our approach. Our design could be useful for others designing non-inferiority rules for trials with multiple arms.

P-111 – GREENER ACADEMIC CLINICAL TRIAL MONITORING

Primary Author:

1) Sharon Love (MRC Clinical Trials Unit at UCL)

Co-Author(s):

2) Jo Grumett (Warwick Clinical Trials Unit)

3) Lisa Fox (Institute of Cancer Research Clinical Trials and Statistics Unit)

4) Lizzie Swaby (Sheffield Clinical Trials Research Unit)

5) Patricia Rafferty (Northern Ireland Clinical Trials Unit)

Introduction: Healthcare contributes an estimated 4-5% of global greenhouse gas emissions, and clinical trials contribute to this overall footprint. As part of the effort to reduce greenhouse gas emissions, we must de-carbonize clinical trials. Clinical trial monitoring can contribute 10-15% of a trial’s overall carbon footprint.

Methods: We applied National Institute for Health and Care Research guidelines https://www.nihr.ac.uk/about-us/what-we-do/key-initiatives/climate-health-sustainability/carbon-reduction-guidelines and the Low carbon Clinical Trials Group strategy to identify ways in which academic trial monitoring processes can be designed and implemented to reduce the carbon footprint of clinical trial monitoring.

Results: Some suggestions at each level are as follows. Institutional level: Maximize use of institutional level hybrid-working policies. Clinical Trials Unit level: Produce monitoring reports electronically and share via a cloud-based system. Work across trial portfolios to identify monitoring activities which can be carried out for more than one trial at a given site. Trial level: When designing new trials avoid the collection of unnecessary data. Consider where electronic documents can be provided to sites instead of printed versions. Ensure the trial monitoring plan is developed with a proportionate approach to monitoring. Investigate use of e-consent strategies where possible and appropriate, to reduce patient travel specifically for informed consent. Resourcing & travel for on-site visits: Consider more sustainable modes of transport, for example replacing driving and short-haul flights with public transport. Ensure the number of data items subject to source data verification is proportionate to the risks. Combine the provision/collection of site materials with a site visit to avoid the need for separate shipments. CTU staff training: New monitors should shadow existing staff during remote visits prior to an on-site visit to reduce the number of training visits needed. Individual level: Organize workspace to maximize natural light and moderate temperature in order to use fewer resources for heating and lighting. Travel on foot, by bike, or by public transport rather than by car. Take a refillable water bottle and re-usable hot drink cup for the day.

Conclusion: It is unlikely that all trials at all academic clinical trials units can adopt all these practices, but small, practical steps have the potential to have a meaningful impact when implemented on a large scale.

P-112 – RECOGNITION, REMUNERATION AND REIMBURSEMENT OF PATIENT AND PUBLIC RESEARCH PARTNERS IN PRAGMATIC RANDOMISED CONTROLLED TRIALS. A SURVEY OF AUTHOR PRACTICES

Primary Author:

1) Stuart Nicholls (Ottawa Hospital Research Institute)

Co-Author(s):

2) Pascale Nevins (Ottawa Hospital Research Institute)

3) Grace Fox (Ottawa Hospital Research Institute)

4) Shelley Vanderhout (University of Toronto)

5) Jamie Brehaut (Ottawa Hospital Research Institute)

6) Kelly Carroll (Ottawa Hospital Research Institute)

7) Dean Fergusson (Ottawa Hospital Research Institute)

8) Beatriz Goulão (University of Aberdeen)

9) Alicia Hilderley (The Ottawa Hospital)

10) Colin MacArthur (Hospital for Sick Children Research Institute)

Background: Patient and public involvement (PPI) in the design and conduct of clinical trials is increasingly encouraged by funders, proposed as an ethical requirement, and identified as an ingredient for the conduct of high quality research. Recognition of PPI partners through acknowledgement or authorship, and financial supports including remuneration and reimbursement, may facilitate involvement. However, few empirical data exist regarding current practices of recognizing, remunerating, and reimbursing PPI partners for their contributions to research.

Objective: To describe the extent to which patient and public research partners were recognized, remunerated, and reimbursed for their involvement in published reports of randomized controlled trials (RCTs) identified through MEDLINE.

Methods: Cross sectional survey of corresponding authors of pragmatic RCTs published between January 1, 2014, and April 3, 2019.

Results: From 2585 delivered survey invitations, 710 authors responded (28%), with 334 (47%) indicating that they had involved PPI partners within the trial. The most common countries of respondents were the USA (103, 31%) and the UK (81, 24%), over half (173, 52%) indicated they were female, and 203 (61%) indicated that they were over 15 years since their first academic appointment. The majority (256, 77%) of respondents indicated they were white. Among responders, 20% reported PPI partners were included as named authors, with 6% of PPI partners included as part of a group authorship. In total, 44% indicated that PPI partners were provided some form of remuneration, and 61%, that PPI partners were reimbursed for expenses incurred. Of 274 respondents who completed all three questions regarding recognition (authorship or acknowledgment), remuneration, and reimbursement, 82 (30%; 95% CI 25% to 35%) indicated that all three were provided to PPI partners, while 41 (15%; 95% CI 11% to 19%) indicated that they provided none of the options.
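The confidence intervals quoted above can be checked with simple arithmetic. The abstract does not say which interval method was used; the sketch below assumes a normal-approximation (Wald) interval with z = 1.96, which gives values close to those reported. The function name `wald_ci` is illustrative only.

```python
import math

def wald_ci(successes, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    Assumption: the abstract does not name its interval method; Wald is
    used here only as a plausible, easily checked choice.
    """
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)

# Counts taken from the Results: 82 and 41 of the 274 respondents who
# answered all three questions.
for label, k in [("all three provided", 82), ("none provided", 41)]:
    p, lo, hi = wald_ci(k, 274)
    print(f"{label}: {p:.1%} (95% CI {lo:.1%} to {hi:.1%})")
```

Other interval methods (for example Wilson) would shift the limits slightly, so small rounding differences from the published figures are to be expected.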

Conclusion: Corresponding authors of the identified RCTs reported that patient partners were listed as co-authors in only a minority of cases, that fewer than half were provided remuneration, and that over a third were not reimbursed. There is a need to better understand the nature of barriers that research teams and PPI partners face regarding recognition, remuneration, and reimbursement, and to develop targeted interventions that will address these barriers.

P-113 – INTEGRATING IRT INTO EDC FOR ADVANCED RANDOMIZATION DESIGNS

Primary Author:

1) Wenle Zhao (Medical University of South Carolina)

In the last few decades, many advanced randomization designs with better statistical properties have been published, but they are rarely used in clinical trial practice. For example, the maximum tolerated imbalance procedure provides a better trade-off between treatment imbalance and allocation predictability than the commonly used permuted block randomization. The minimal sufficient balance method changes the concept of the minimization method and provides higher allocation randomness while controlling imbalances in multiple baseline covariates (continuous as well as categorical). The mass-weighted urn design can accurately target multi-arm unequal allocations, especially in trials with response adaptive randomization (RAR), avoiding the dilemma of having to choose between the permuted block design with low allocation accuracy and complete randomization with low allocation precision. In practice, researchers who want to use these advanced randomization designs often find that their Electronic Data Capture (EDC) systems or Interactive Response Technology (IRT) providers do not offer them. Two major obstacles contribute to this unfortunate situation. First, most EDC and IRT systems are provided by different vendors and cannot communicate with each other. Second, it is expensive to develop, test, validate, and document new randomization designs in software systems. To overcome these difficulties, the Data Coordination Unit at the Medical University of South Carolina has developed an innovative strategy: an integrated subject randomization module within the EDC system, together with a generic database object that provides treatment assignments based on the randomization algorithms specified by the investigator for the trial. Most EDC and IRT systems run on the internet. Integrating the IRT functionality into the EDC system allows information captured in the EDC system to be used for subject randomization without redundant data entry.
A subject randomization request is processed as the submission of a special case report form. Information on previous treatment assignments and their distributions, baseline covariate data for the current and previously randomized subjects, as well as site investigational product (IP) inventory can be retrieved from the EDC system and used by the randomization algorithm. All randomization algorithms can be implemented based on the conditional allocation probabilities and the value of a random number. The random number can be retrieved from a pre-generated random number list or generated by the computer in real time when the subject randomization is requested. Based on this strategy, the pre-generated randomization list is replaced by a mathematical formula that produces treatment assignments by adapting to the randomization history. This approach eliminates the risk of treatment allocation concealment failures and removes the damage that trial operation glitches can do to the integrity of the randomization procedure. In this presentation, two examples of implementing randomization designs directly in the EDC system will be discussed, one with a 5-arm RAR and covariate balance requirement and the other with 3-arm equal allocation, long-term follow-up IP resupply, and tight site IP inventory control.
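As an illustration of assigning treatment from a conditional allocation probability and a single random number, the sketch below implements the two-arm big stick design, one well-known maximum tolerated imbalance procedure. It is not the module described in the abstract: the function names, the history representation, and the default imbalance limit of 3 are assumptions made for the example.

```python
import random

def prob_arm_a(n_a, n_b, mti):
    """Conditional probability of assigning arm A under the big stick design.

    A fair coin is used until the imbalance n_a - n_b reaches the maximum
    tolerated imbalance (mti); at the limit the lagging arm is forced.
    """
    imbalance = n_a - n_b
    if imbalance >= mti:
        return 0.0  # A is ahead by the limit, so the next subject gets B
    if imbalance <= -mti:
        return 1.0  # B is ahead by the limit, so the next subject gets A
    return 0.5

def assign(history, u, mti=3):
    """Map a random number u in [0, 1) to an arm given prior assignments."""
    n_a = history.count("A")
    n_b = history.count("B")
    return "A" if u < prob_arm_a(n_a, n_b, mti) else "B"

def randomize(n_subjects, mti=3, seed=2025):
    """Randomize subjects one at a time, as each request arrives."""
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    history = []
    for _ in range(n_subjects):
        history.append(assign(history, rng.random(), mti))
    return history

assignments = randomize(20)
print("".join(assignments))
```

Because each assignment is computed on request from the current history, no pre-generated list exists; the random number `u` could equally be read from a stored list of random numbers, as the abstract notes.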

P-114 – CONSENT DEVIATIONS IN AN ACUTE ISCHEMIC STROKE CLINICAL TRIAL UTILIZING PAPER AND ELECTRONIC CONSENT (ECONSENT)

Primary Author:

1) Abbey Staugaitis (University of Minnesota)

Co-Author(s):

2) Karen Stalin (University of Minnesota)

3) Anthony Rogers (University of Cincinnati Neurology)

4) Ian Rines (Medical University of South Carolina)

5) Akash Roy (Medical University of South Carolina)

6) Chris Streib (University of Minnesota)

7) S Iris Davis (University of Cincinnati Neurology)

Objective: To compare informed consent protocol deviations (PDs) using conventional paper informed consent documents (ICDs) versus electronic informed consent (eConsent) in the Multi-arm Optimization of Stroke Thrombolysis Trial (MOST, NCT03735979).

Background: MOST was an NIH-funded, multicenter, randomized controlled trial of intravenous thrombolysis plus integrilin/argatroban/placebo for acute ischemic stroke. Research teams documented informed consent using paper ICDs or a central eConsent platform that could be used in person or remotely.

Methods: MOST was conducted between 2019 and 2023. From October 10, 2019, to July 5, 2023, MOST enrolled 514 participants at 57 sites before being stopped for futility. The REDCap eConsent platform launched on June 19, 2020 and the first participant was enrolled using eConsent on July 2, 2020. We reviewed reportable consent-related PDs in relation to how informed consent was obtained (paper in-person, paper remote, eConsent in-person, eConsent remote). PDs were categorized by themes. We utilized goodness-of-fit chi-square tests followed by pairwise testing to detect differences between modalities.

Results: By consent modality, 337 (65.56%) participants were enrolled using paper ICDs in person, 4 (0.78%) using paper ICDs remotely, 93 (18.09%) using eConsent in person, and 80 (15.56%) using eConsent remotely. In total, 173 (33.66%) randomized participants were enrolled using eConsent. The rate of reportable consent-related PDs per 100 enrollments was: paper in-person: 25, eConsent in-person: 6, and eConsent remote: 13 (p=0.0004). eConsent in-person had fewer deviations than paper in-person (p=0.0015) but did not differ significantly from eConsent remote (p=0.19); there was also no significant difference between eConsent remote and paper in-person (p=0.06). PDs were classified into the following themes: missing/incorrect HIPAA documentation, incorrect version of the consent form, incorrect signature, and miscellaneous. Missing/incorrect HIPAA forms occurred in 11% (14% paper in-person, 10% eConsent-remote, 2% eConsent in-person), incorrect version of the consent form in 4% (6% paper in-person, 1% eConsent-remote, 0% eConsent in-person), incorrect signature in 3% (4% paper in-person, 0% eConsent-remote, 1% eConsent in-person), and miscellaneous in 2% (1% paper in-person, 1% eConsent-remote, 3% eConsent in-person).

Conclusion: eConsent had a lower rate of consent-related PDs than paper forms. Missing/incorrect HIPAA forms were the most common PD across all consent modalities. Including an eConsent option offers additional ways to deliver consent to participants or legally authorized representatives and may improve the accuracy and completeness of consent documentation.

P-116 – HANDLING OF INCOMPLETE BASELINE COVARIATES IN CLUSTER-RANDOMISED TRIALS: A SIMULATION STUDY

Primary Author:

1) Baptiste Leurent (University College London)

Co-Author(s):

2) Elizabeth Allen (London School of Hygiene and Tropical Medicine)

3) Richard Hooper (Queen Mary, University of London)

4) Clemence Leyrat (London School of Hygiene and Tropical Medicine)

5) Jennifer Thompson (London School of Hygiene and Tropical Medicine)

6) Helen Weiss (London School of Hygiene and Tropical Medicine)

Introduction: Adjustment for baseline variables is recommended in the analysis of cluster-randomized trials (CRTs) to increase power and correct for chance imbalance. However, baseline data may be missing, particularly in “open cohort” designs, where new participants may join the trial after baseline (e.g. in schools or nursing homes). There is currently a lack of guidance on how best to adjust for baseline variables when they are incomplete. The aim of this work was to compare, using simulations, the performance of different approaches to handling missing baseline data in CRTs.

Methods: We simulated CRTs with normally distributed variables measured at baseline (covariate) and at follow-up (outcome), varying the number of clusters, cluster size, intra-cluster correlation coefficient (ICC), cluster autocorrelation (CAC), subject autocorrelation (SAC), and proportion of missing data at baseline, representing a total of 216 scenarios. We estimated the mean difference in outcome between arms using a random-effects model. We compared the following methods to handle the baseline variable: unadjusted analysis, complete case analysis, adjusting for the baseline cluster-level mean, mean imputation (replacing the missing baseline values by the cluster mean, or by the overall trial mean), and a repeated-measures model (RMM). All methods were applied with or without adjustment for the baseline cluster-level mean. The bias, model-based standard error, confidence-interval coverage, and empirical efficiency were compared between methods. We illustrated the methods in a CRT of an educational intervention in schools.
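A minimal sketch of one simulated dataset and of cluster-mean imputation is given below. The abstract does not specify its data-generating model, so everything here is an assumption: total variance fixed at 1, cluster and subject effects each correlated across the two time points through a simple linear construction, baseline values missing completely at random, and an arbitrary treatment effect of 0.2. Defaults mirror the reference scenario quoted in the Results (30 clusters of 40, ICC=0.05, CAC=SAC=0.70, 50% missing). The analysis model itself (a random-effects model in the abstract) is not reproduced.

```python
import math
import random

def simulate_crt(n_clusters=30, cluster_size=40, icc=0.05, cac=0.7, sac=0.7,
                 p_missing=0.5, effect=0.2, seed=1):
    """Simulate baseline covariate x and outcome y for one two-arm CRT."""
    rng = random.Random(seed)
    sd_c, sd_e = math.sqrt(icc), math.sqrt(1 - icc)  # total variance = 1
    arms = [i % 2 for i in range(n_clusters)]
    rng.shuffle(arms)  # randomize clusters to arms
    rows = []
    for c in range(n_clusters):
        # Cluster effects at baseline and follow-up, with correlation cac.
        c0 = rng.gauss(0, 1)
        c1 = cac * c0 + math.sqrt(1 - cac ** 2) * rng.gauss(0, 1)
        for _ in range(cluster_size):
            # Subject effects at baseline and follow-up, with correlation sac.
            e0 = rng.gauss(0, 1)
            e1 = sac * e0 + math.sqrt(1 - sac ** 2) * rng.gauss(0, 1)
            x = sd_c * c0 + sd_e * e0
            y = sd_c * c1 + sd_e * e1 + effect * arms[c]
            if rng.random() < p_missing:
                x = None  # baseline value missing completely at random
            rows.append({"cluster": c, "arm": arms[c], "x": x, "y": y})
    return rows

def impute_cluster_mean(rows):
    """Replace missing x by the observed mean of x in the same cluster."""
    observed = {}
    for r in rows:
        if r["x"] is not None:
            observed.setdefault(r["cluster"], []).append(r["x"])
    all_obs = [v for vals in observed.values() for v in vals]
    overall = sum(all_obs) / len(all_obs) if all_obs else 0.0
    filled = []
    for r in rows:
        x = r["x"]
        if x is None:
            vals = observed.get(r["cluster"])
            # Fall back to the overall mean if a cluster has no observed x.
            x = sum(vals) / len(vals) if vals else overall
        filled.append({**r, "x": x})
    return filled

data = impute_cluster_mean(simulate_crt())
print(len(data), "rows,", sum(r["arm"] for r in data), "in the intervention arm")
```

In a full simulation study this step would be repeated many times per scenario, with each imputed dataset passed to the chosen analysis model.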

Results: All methods were unbiased. The methods differed in efficiency, with the relative performance depending on the trial parameters. The unadjusted and complete-case analyses were generally the least efficient methods. The RMM was particularly efficient, performing well across all scenarios. Adjusting for the cluster-level mean or doing mean imputation performed reasonably in most scenarios but could be less efficient than the RMM with high levels of missing data and small clusters. It was generally beneficial to adjust for the baseline cluster-level mean in addition to or instead of the subject-level values. In the reference scenario (30 clusters of 40 observations, ICC=0.05, CAC=SAC=0.70, 50% missing data) the variance was reduced by 34% to 38% compared to the unadjusted analysis by either adjusting for the cluster-level mean, doing mean imputation, or using an RMM.

Discussion: The repeated-measures model was found to be the most efficient approach to handling missing baseline covariates in CRTs; however, its implementation requires more care than univariate methods. Simpler methods such as adjusting for the cluster-level mean or mean imputation performed generally well and may be attractive in practice when there are few missing baseline data. Our findings also highlight the importance of adjusting for the baseline cluster-level mean, irrespective of the method used.

P-117 – THE RESEARCH EWORKFLOW TOOL: A MULTI-USER INTERACTIVE SCREENING PROCESS FOR CLINICAL RESEARCH WORKFLOW

Primary Author:

1) Jessica Staloch (University of Minnesota)

Co-Author(s):

2) Megan Tessmer (University of Minnesota)

3) Nitin Ramanujam Chakravarthula (University of Minnesota)

4) Abbey Staugaitis (University of Minnesota)

Background: Finding a universal tool that accommodates various complex clinical trial workflows can be challenging. The uniqueness of each clinical trial and of each clinical research site’s structure and workflow means that a generic template cannot be utilized effectively. Clinical trial sponsors provide various tools, often in paper form, to assist with screening and enrollment tasks of a particular study. These tools are not tailored to individual clinical research sites and lack an interactive component that multiple users can use in real time. Implementing a well-designed tool in the screening and enrollment workflow could enhance clinical trial success.

Purpose: An inefficient workflow can lead to duplication or omission of tasks, missed enrollments, or protocol deviations. To address inefficiencies, it is essential to establish a detailed and collaborative screening and enrollment process to ensure successful enrollments with minimal errors. Whether the workflow consists of one researcher or multiple researchers simultaneously and collaboratively screening a potential subject, a multi-user interactive screening tool would enable collaboration and increase efficiency. To address these issues and maximize efficiency, our team developed the Research eWorkflow Tool. The Research eWorkflow Tool is a real-time, multi-user electronic tool that facilitates simultaneous collaboration while ensuring the efficient execution of clinical trial tasks through a step-by-step approach, enabling consistent and high-quality enrollments.

Methods: A HIPAA-compliant online project management software was utilized to create the Research eWorkflow Tool. Similar to a spreadsheet, this step-by-step tool consisted of three collapsible sections: Section 1: General (e.g. resources, principal investigator hotline phone number, protocol information, a delegation of authority log), Section 2: Screening & Enrollment (e.g. inclusion and exclusion criteria checklist, workflow for consent, randomization, labs, study treatment tasks), Section 3: Post-Enrollment (e.g. study visit and follow-up activities). Any vital study document was attached to a particular row for easy reference. The interactive component of this tool was facilitated through checkboxes. Checkboxes were selected when a task was completed and were updated simultaneously for all users. Filters could be applied for task categorization to increase efficiency when navigating the electronic medical record. These filters proved beneficial in the workflow by efficiently managing the eligibility criteria during screening. They also improved task delegation and communication, especially when multiple users collaborated on the Research eWorkflow Tool.

Conclusion: Our team found that implementing the Research eWorkflow Tool increased efficiency, encouraged collaboration, and minimized duplication and errors during the clinical trial screening and enrollment process. Future iterations of the multi-user interactive screening tool will focus on enhancing Section 3: Post-Enrollment, extending its functionality through the end of the study period. Additionally, the tool will be piloted across various clinical trials and with diverse research teams to broaden its applicability and impact.

P-118 – QUALITY ASSESSMENT OF NON-INFERIORITY TRIALS IN ONCOLOGY BASED ON CONSORT GUIDELINE

Primary Author:

1) JiBin Li (Sun Yat-sen University Cancer Center)

Co-Author(s):

2) Ying-Ying Zhu (Sun Yat-sen Memorial Hospital, Sun Yat-sen University)

3) Ying Liu (Sun Yat-sen University)

Background: Non-inferiority trials (NITs) have been increasingly applied in oncology, aiming to verify that novel therapies can offer additional benefits (e.g., safety, quality of life, or convenience) without a clinically significant sacrifice of efficacy. Despite the strict methodological principles governing NITs, evidence about the quality of NITs in oncology is limited.

Purpose: To assess the quality of NITs in oncology based on six key criteria and to evaluate the concordance of authors’ conclusions with the CONSORT guideline.

Methods: All randomized NITs in oncology indexed in PubMed, EMBASE, MEDLINE, and the Cochrane Library from inception to December 31, 2023, with clinical efficacy as the primary endpoint, were eligible.

Results: A total of 367 NITs were included. More than 80% were conducted in Europe (n=157, 42.8%) or Asia (n=154, 42.0%). The three cancer types most often studied were breast tumors (n=71, 19.3%), lymphatic and hematological tumors (n=68, 18.5%), and colorectal tumors (n=55, 15%). Notably, 72.8% (267/367) of trials used surrogate endpoints as primary endpoints (i.e., progression-free survival, disease-free survival, objective response rate, or pathological complete response rate). The three most common rationales for conducting NITs were less harm (n=190, 51.8%), providing another treatment choice (n=142, 38.7%), and more convenience compared to standard care (n=50, 13.6%). In the quality assessment, 58 trials (15.8%) were scored as poor quality (1/6 criteria: n=12; 2/6 criteria: n=46), 233 (63.5%) as fair (3/6 criteria: n=110; 4/6 criteria: n=123), and only 21 met all 6 quality criteria. Around 2.2% of NITs were not pre-planned, and 69.2% of trials did not justify the selection of the non-inferiority margin. About 43.3% (n=159) of trials reported results in both the intention-to-treat and per-protocol sets. In addition, 40.6% (n=149) of trials reported a confidence interval whose width was inconsistent with the defined type I error. The prespecified levels of efficacy in the control arms used for calculating sample size were accurately estimated in only 214 trials (58.3%) (the ratio to the actual levels ranged from 0.81 to 1.20). Nearly 28% (n=103) of trials incorrectly used the experimental arm as the reference to set the margin or report the results. Notably, the authors’ conclusions were concordant with the CONSORT guideline in only 100 trials (27.2%) for noninferiority, 17 (4.6%) for superiority, and 38 (10.4%) for failure of non-inferiority.
In our assessment, 87 of the 194 trials in which authors claimed noninferiority were not supported by the CONSORT guideline (13 trials with failure of noninferiority, 4 with inferiority, and 70 inconclusive). Moreover, 42 of the 59 trials in which authors concluded superiority were deemed noninferior (n=27), failed noninferiority (n=8), even inferior (n=1), or inconclusive (n=6).
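The kind of check described above can be sketched as a comparison of the confidence interval for the treatment difference against zero and the non-inferiority margin. The rules below are a simplified reading of the usual confidence-interval diagram for noninferiority trials, not the authors’ six criteria or their exact definitions of "failure of noninferiority" versus "inconclusive". The sketch assumes the difference is experimental minus control, that larger values are better, and that the margin is given as a positive number; both function names are hypothetical.

```python
def ci_level_for_one_sided_alpha(alpha):
    """Two-sided confidence level whose lower limit matches a one-sided test.

    For example, a one-sided type I error of 0.025 pairs with a 95% CI and
    a one-sided 0.05 pairs with a 90% CI.
    """
    if not 0 < alpha < 0.5:
        raise ValueError("alpha must be between 0 and 0.5")
    return 1 - 2 * alpha

def classify(ci_lower, ci_upper, margin):
    """Classify a result from the CI of (experimental - control).

    Simplified, assumed rules (higher is better, margin > 0):
      lower limit above 0             -> superior
      lower limit above -margin       -> noninferior
      upper limit below -margin       -> inferior
      interval containing -margin     -> inconclusive
    """
    if margin <= 0:
        raise ValueError("margin must be positive")
    if ci_lower > ci_upper:
        raise ValueError("ci_lower must not exceed ci_upper")
    if ci_lower > 0:
        return "superior"
    if ci_lower > -margin:
        return "noninferior"
    if ci_upper < -margin:
        return "inferior"
    return "inconclusive"

# Hypothetical intervals, with a margin of 5 percentage points.
for lower, upper in [(0.01, 0.08), (-0.03, 0.04), (-0.08, 0.02), (-0.12, -0.06)]:
    print((lower, upper), classify(lower, upper, margin=0.05))
```

The first helper also illustrates the interval-width issue raised in the Results: a trial that sets a one-sided type I error of 0.025 but reports a 90% interval is using a narrower interval than its own design implies.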

Conclusions: The quality of NITs in oncology has important deficiencies, such as the absence of justification for the non-inferiority margin, narrower reported confidence intervals, inaccurate estimates of efficacy in the control arms, and inversion of the reference group. These shortcomings compromise the robustness of noninferiority conclusions and deserve the attention of clinical investigators, statisticians, and regulators.

P-120 – THE PREVENTABLE STUDY CALL TRACKING AND SCHEDULE MANAGEMENT SYSTEM

Primary Author:

1) Lea Harvin (Wake Forest University School of Medicine)

Co-Author(s):

2) Emily Rives (Wake Forest University School of Medicine)

3) Patty Davis (Wake Forest University School of Medicine)

4) Randall Nulph (Wake Forest University School of Medicine)

5) Scott Lang (Wake Forest University School of Medicine)

This presentation will focus on the call tracking and schedule management system designed for telephone-based cognitive and physical function assessments of participants, age 75 and older, enrolled in the US PRagmatic Evaluation of events and Benefits of Lipid-lowering in older adults (PREVENTABLE) trial. We will discuss the specifications of the web-based call tracking and schedule management system design, focusing on how the application’s flexible and dynamic design allows for real-time data entry and communication with clinical sites. This robust design supports high retention for the annual calls that collect phone-only data on participants for 5 years. The PREVENTABLE Call Tracking and Schedule Management System is a complex system currently built on SQL tables, app-specific development files, and stored procedures that create and maintain staff hours of availability and track a participant’s study history through enrollment, randomization, and multiple years of follow-up contacts. The interface allows staff administrators to add callers to the team and manage available call hours for each individual call staff member. On separate screens, staff can visually ascertain a call’s current status, initiate the data entry process, update the call status at any point throughout the entire lifecycle for each active call type, and review the history of each interaction and all comments documented by clinic or call center staff. Behind the scenes, various triggers, alerts, and procedures create, update, and/or remove calls from the main interface as needed. We will also touch on maintenance processes and alerts that are run daily to monitor participant status throughout the call lifecycle. Since staff scheduling is the foundation of the system, caller assignments are randomly selected from the available time slot list; this is deliberate, because SQL would otherwise pull time slots in alphabetical order.
Participants whose language preference is Spanish are paired only with Spanish-speaking staff. Confirmation screens reiterate appointments with the participant’s time zone and date. Some processes run hourly and others fire instantly as calls end or retention status changes. We will review specifics important to the system’s functionality, including the exclusive use of military (24-hour) time, on-screen feedback, and guided data entry, as well as key features of dashboard screens and the rationale behind the layout for efficiency. This well-choreographed dance between human actions and web technology changes with each annual cycle that passes. We’ve learned a great deal over the years by working hand in hand with team members, and we look forward to the challenges to come.
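The random slot selection and language pairing described above can be illustrated with a short sketch. The actual system is built on SQL tables and stored procedures; the Python below only mirrors the logic, and the slot fields (`caller`, `start`, `languages`, `open`) and the function name are hypothetical.

```python
import random

def pick_slot(slots, participant_language, rng=random):
    """Randomly choose an open time slot with a suitable caller.

    Spanish-preferring participants are offered only slots whose caller
    speaks Spanish; returns None when no slot qualifies.
    """
    eligible = [
        s for s in slots
        if s["open"]
        and (participant_language != "Spanish" or "Spanish" in s["languages"])
    ]
    if not eligible:
        return None
    # A random draw avoids always taking the first slot in sorted order.
    return rng.choice(eligible)

# Hypothetical availability list, with start times in 24-hour format.
slots = [
    {"caller": "Alvarez", "start": "09:00", "languages": {"English", "Spanish"}, "open": True},
    {"caller": "Baker", "start": "09:00", "languages": {"English"}, "open": True},
    {"caller": "Chen", "start": "13:30", "languages": {"English"}, "open": False},
]

chosen = pick_slot(slots, "Spanish", random.Random(7))
print(chosen["caller"], chosen["start"])
```

Passing a seeded `random.Random` instance makes a draw repeatable for testing, while production use would rely on an unseeded generator.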