Abstract

INVITED SESSION PROPOSALS
SP1 – RECENT ADVANCES IN ADDRESSING DESIGN AND ANALYSIS CHALLENGES OF CLUSTER RANDOMIZED TRIALS
Cluster randomized trials (CRTs), including parallel, crossover, and stepped wedge designs, are becoming more popular and prominent in medical and healthcare delivery research for several reasons, including administrative and logistical considerations, ethics, and ease of application at the cluster level. For example, when the research question involves a change in clinical practice or a systemic change in a hospital, it might not be feasible to study the intervention in a design using individual patient randomization. It is often more appropriate in such settings to randomize clusters of patients or providers and design a study which compares interventions while accounting for the within-cluster correlations, both in design and analysis. Our session clearly addresses the theme of “Shaping the Future: The Right Questions, Robust Answers”, as cluster randomization is becoming more frequently used to address important scientific questions in healthcare delivery research, and all four presentations center on the common theme of providing robust statistical solutions to the right question. Specifically, Speaker #1 will focus on methods advancements for cluster-randomized crossover trials (CRXO), a design that robustly combines the benefits of within- and between-cluster comparisons to make inference on the treatment effect. Speaker #2 will discuss simulation-based power comparisons for hierarchical composite outcomes in CRTs, and identify robust and powerful methods when complex endpoints meet clustering. Speaker #3 will discuss methods to re-estimate the intracluster correlation coefficient (ICC) during an internal pilot study for CRTs to ensure additional robustness in the final sample size and power. Speaker #4 will focus on estimating transparent estimands in CRTs to ensure model-robust inference even under working model misspecification. Collectively, the talks provide a suite of robust solutions to combat challenges in the design and analysis of CRTs.
Talk 1: Salient Design and Sample Size Features of Cluster-Randomized Crossover (CRXO) Trials. Edward Mascha, Cleveland Clinic
In a cluster-randomized crossover (CRXO) trial, different clusters, such as hospitals, operating rooms, providers, or care teams, are randomized in each period to receive one of the study interventions. A CRXO trial may include many periods, with measurements between periods made either on the same patients or on different/new patients. In this talk, we describe how a CRXO design borrows features from an individual patient crossover trial, a cluster randomized trial, and a stepped wedge cluster randomized trial. We then highlight situations in which a CRXO design is most appropriate, beneficial/attractive, and powerful. We further illustrate how CRXO sample size calculation depends on key design parameters, including the within-cluster within-period and within-cluster between-period correlations, the latter of which may decay as periods grow farther apart, as well as on the number of clusters and periods. Methods will be illustrated using multi-period CRXO trials completed in the perioperative setting at Cleveland Clinic.
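As a rough illustration of how the two correlations enter a sample size calculation, the sketch below uses a commonly cited design effect for a two-period, cross-sectional CRXO trial with equal cluster-period sizes: 1 + (m - 1)ρ - mη, where m is the number of patients per cluster per period, ρ is the within-cluster within-period correlation, and η is the within-cluster between-period correlation. The formula choice, the function names, and all numeric inputs are assumptions for illustration, not the speaker's method; multi-period designs with decaying correlations require more general formulas.

```python
import math


def crxo_design_effect(m, wpc, bpc):
    """Design effect for a two-period, cross-sectional CRXO trial.

    m   : patients per cluster per period (assumed equal across clusters)
    wpc : within-cluster within-period correlation (rho)
    bpc : within-cluster between-period correlation (eta)
    """
    if not (0.0 <= wpc <= 1.0 and 0.0 <= bpc <= 1.0):
        raise ValueError("correlations must lie in [0, 1]")
    return 1 + (m - 1) * wpc - m * bpc


def crxo_total_n(n_individual, m, wpc, bpc):
    """Inflate the total N of an individually randomized trial by the design effect."""
    inflated = n_individual * crxo_design_effect(m, wpc, bpc)
    # Round first so floating-point noise does not push the ceiling up by one.
    return math.ceil(round(inflated, 6))


# Hypothetical inputs: 400 patients needed under individual randomization,
# 50 patients per cluster-period, rho = 0.05.
for eta in (0.03, 0.01):
    total = crxo_total_n(400, 50, 0.05, eta)
    clusters = math.ceil(total / (2 * 50))  # each cluster contributes two periods
    print(f"eta={eta}: total N={total}, clusters={clusters}")
```

In this sketch a smaller between-period correlation yields a larger design effect, which is one way to see why decay of that correlation across distant periods matters for planning.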
Talk 2: Power Comparison for Different Hierarchical Composite Outcomes in Cluster Randomized Trials. Hrishikesh Chakraborty, Duke University
Hierarchical composite outcomes (HCOs) combine multiple endpoints into a single measure and are frequently used in clinical trials. Different methods have been proposed to create variations of HCOs. However, HCO implementation in CRTs is limited, and the power implications of different HCO methods are unclear. We integrated various estimation methods for different HCOs with CRT designs to conduct a simulation-based power comparison of these methodologies in a CRT setting. Methods tested included Finkelstein-Schoenfeld, unmatched win-ratio, unmatched win-difference, matched win-ratio, and worst-rank score under varying ICCs, cluster numbers, and cluster size variations. Our simulation study was based on a real-world HCO used in a CRT design, specifically the HiLo trial, where we utilized time-to-death and hospitalization to generate different HCOs. We concluded that unmatched win-ratio and unmatched win-difference had the highest power across scenarios, followed by the Finkelstein-Schoenfeld method and worst-rank score.
Talk 3: Reassessing the ICC during an internal pilot study for a cluster-randomized trial. Emine Bayman, University of Iowa
A defining characteristic of cluster randomized trials is the randomization of clusters of individuals to study arms and the resulting potential for correlation of outcomes within clusters. This correlation, assessed by the intraclass correlation coefficient (ICC), must be considered in the design and primary analysis. Accordingly, in addition to estimating the effect size, the researchers must estimate the ICC for a valid calculation of the target sample size in a CRT. In many situations, a reliable estimate of the ICC may not be available in the design phase. Thus, researchers may wish to use interim outcome data collected during the trial to estimate the ICC and reassess the sample size. We will discuss the Fibromyalgia Transcutaneous Electrical Nerve Stimulation (TENS) in Physical Therapy Study (FM-TIPS) to demonstrate interim reassessment of sample size in a CRT. FM-TIPS is a cluster randomized pragmatic trial examining whether the addition of TENS to routine physical therapy improves movement-evoked pain between baseline and 60 days compared with physical therapy alone among patients with fibromyalgia.
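To make the idea of interim ICC reassessment concrete, the following minimal sketch estimates the ICC from interim data with the standard one-way ANOVA estimator and then recomputes the total sample size using the familiar design effect 1 + (m - 1) × ICC. It is not the FM-TIPS procedure: the function names, the truncation of negative estimates at zero, and the numeric inputs are assumptions, and whether to estimate within arm or on pooled (blinded) data is a design decision not addressed here.

```python
import math


def anova_icc(clusters):
    """One-way ANOVA estimate of the ICC.

    clusters: list of lists, each inner list holding one cluster's outcomes.
    Negative estimates are truncated at zero (an assumption of this sketch).
    """
    k = len(clusters)
    sizes = [len(c) for c in clusters]
    n = sum(sizes)
    grand_mean = sum(sum(c) for c in clusters) / n
    means = [sum(c) / len(c) for c in clusters]
    ssb = sum(sz * (mu - grand_mean) ** 2 for sz, mu in zip(sizes, means))
    ssw = sum((y - mu) ** 2 for c, mu in zip(clusters, means) for y in c)
    msb = ssb / (k - 1)          # between-cluster mean square
    msw = ssw / (n - k)          # within-cluster mean square
    # Adjusted average cluster size for unequal clusters.
    m0 = (n - sum(sz ** 2 for sz in sizes) / n) / (k - 1)
    denom = msb + (m0 - 1) * msw
    if denom == 0:
        return 0.0
    return max(0.0, (msb - msw) / denom)


def crt_total_n(n_individual, mean_cluster_size, icc):
    """Total N for a parallel CRT: individual-level N times 1 + (m - 1) * ICC."""
    inflated = n_individual * (1 + (mean_cluster_size - 1) * icc)
    return math.ceil(round(inflated, 6))  # round to suppress floating-point noise


# Hypothetical planning values: 300 patients under individual randomization,
# 20 patients per cluster, design-stage ICC of 0.02.
planned = crt_total_n(300, 20, 0.02)
# Suppose the internal pilot suggests an ICC closer to 0.05 (hypothetical).
revised = crt_total_n(300, 20, 0.05)
print(planned, revised)
```

Even a modest upward revision of the ICC inflates the required sample size noticeably in this sketch, which is the motivation for checking the assumption partway through the trial.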
Talk 4: Model-robust standardization in cluster-randomized trials. Fan Li, Yale University
Although generalized linear mixed models and generalized estimating equations have conventionally been the default analytic methods for estimating the average treatment effect in practice, recent studies have demonstrated that the treatment effect coefficient may correspond to an ambiguous estimand when the regression model under consideration does not perfectly align with the data generating process and when there exists informative cluster size. In this talk, we will present simple and accessible methods to standardize output from any regression model to ensure robust estimand-aligned inference in cluster-randomized trials. In particular, the talk will introduce estimators for both the cluster-average and the individual-average treatment effects that are always consistent regardless of whether the specified working multilevel regression models align with the unknown data generating process. Simulation experiments and analysis of a real CRT are used to demonstrate the utility of these simple estimators over existing model-based estimators.
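The distinction between the two estimands can be seen with a small sketch. It computes simple unadjusted contrasts, not the model-robust standardization estimators presented in the talk: the cluster-average effect weights every cluster equally, while the individual-average effect weights every participant equally. The function names and the toy data are hypothetical, and the data are constructed so that cluster size is informative.

```python
def _mean(values):
    return sum(values) / len(values)


def cluster_average_effect(clusters):
    """Difference between arms in the mean of cluster means.

    clusters: list of (arm, outcomes) pairs, arm being 1 (treated) or 0 (control).
    """
    by_arm = {0: [], 1: []}
    for arm, outcomes in clusters:
        by_arm[arm].append(_mean(outcomes))
    return _mean(by_arm[1]) - _mean(by_arm[0])


def individual_average_effect(clusters):
    """Difference between arms in the pooled mean over all individuals."""
    by_arm = {0: [], 1: []}
    for arm, outcomes in clusters:
        by_arm[arm].extend(outcomes)
    return _mean(by_arm[1]) - _mean(by_arm[0])


# Toy data (hypothetical): small clusters respond more strongly than large
# ones, so cluster size carries information about the treatment effect.
data = [
    (1, [4, 4]),
    (1, [1] * 8),
    (0, [2, 2]),
    (0, [1] * 8),
]
print(cluster_average_effect(data))
print(individual_average_effect(data))
```

Because the two contrasts weight the same data differently, they target different quantities here, which is why it matters to state the estimand before choosing an analysis model.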
SP2 – THE HIGHLIGHTS AND HICCUPS OF HIGH-DENSITY DATA
High-density data are increasingly common in clinical trials. Continuous monitoring data, an example of such data, allow for highly detailed, informed interventions for subjects. The BOOST-3 clinical trial, for example, is designed to test a prescribed treatment protocol based on the continuous monitoring of intracranial pressure (ICP) and brain tissue oxygen content (PbtO2). While such data are commonly used for the treatment of patients, in the context of a clinical trial, the capture and analysis of this type of data present unique challenges. In this session, we will use the ongoing BOOST-3 clinical trial as a case study to discuss the complexity of such data as they progress through the stages of a clinical trial: from collection at the clinical site and transfer to the data coordination center, through summarization and creation of data quality metrics and reconciliation of different data streams relevant for assessing treatment fidelity, to final analysis of the collected data.
Talk 1. Sharon Yeatts, Medical University of South Carolina
As the PI of the Data Coordinating Center for BOOST-3, Dr. Yeatts will provide an introduction to the Brain Oxygen Optimization in Severe TBI Phase-3 (BOOST-3) clinical trial and the data collected therein. BOOST-3 is a Phase III, randomized clinical trial designed to compare the effectiveness of two prescribed treatment protocols for patients with traumatic brain injury: a protocol based on both intracranial pressure (ICP) and brain tissue oxygen content (PbtO2) versus a protocol based on ICP monitoring alone. Both strategies are currently used in care and allow physicians to initiate/adjust treatment given the current levels of ICP and/or PbtO2. The interventions given to a particular subject are documented in the corresponding electronic CRF. The continuous ICP and PbtO2 data are uploaded to the trial’s cloud storage location for analysis of the trial’s secondary aims.
Talk 2. Chris Arnaud, Medical University of South Carolina
Mr. Arnaud will describe the infrastructure developed to download the continuous data from the cloud storage location to the Data Coordinating Center. The cloud storage location processes, stores, and makes subject data available in a data folder following specified naming conventions. For valid folder locations, the transfer program downloads the CSV files and prepares them for import into WebDCU, the DCU’s Clinical Trial Management System (CTMS). A summary of the process is available within the CTMS for oversight.
Talk 3. Zeke Lowell, Medical University of South Carolina
High-quality data are imperative for the success of a clinical trial. For BOOST-3, the continuously monitored ICP/PbtO2 data are of fundamental importance, as they allow for an assessment of treatment fidelity; however, the high density of the data presents unique challenges from a data management perspective. Metrics for the quality of these data had to be developed relative to treatment practices and protocol definitions. Approaches were developed for quantifying treatment performance by comparing the continuously monitored data against multiple data streams to assess fidelity to the prescribed treatment protocols. This talk will present the challenges and subsequent approaches for management of these complex data.
Talk 4. Jonathan Beall, Medical University of South Carolina
The detailed physiologic data are used in real time to administer a prescribed treatment protocol developed to bring these values into pre-specified ranges. A measure of treatment fidelity could be the effect of treatment on time spent outside of the target range for PbtO2; however, clinical or physiologic events, such as patient transport, can result in missing data values for these continuously monitored data. When this occurs, the true proportion of time outside of the specified range becomes obscured, which presents a significant analytical challenge. We propose a Bayesian model allowing for the construction of subject level informative prior distributions for these summary measures. For those subjects with intermittent missing data, this approach will provide informed ranges for imputation of the subject level missing data. We will assess our proposed model under a variety of simulation conditions, including varied rates of missing data and mechanisms for missing data. We will compare our proposed model to alternative approaches for handling missing data.
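The way missing readings obscure the summary measure can be illustrated with a short sketch. It is not the proposed Bayesian model; it only computes the complete-case proportion of time outside a target range together with the two extreme bounds obtained by assuming every missing reading was in range or out of range. Equally spaced readings, the function name, the target range, and the data values are all assumptions made for illustration.

```python
def time_outside_range(readings, low, high):
    """Summarize the proportion of readings outside [low, high].

    readings: equally spaced measurements; None marks a missing value.
    Returns the complete-case proportion plus lower/upper bounds that treat
    all missing readings as in range or out of range, respectively.
    """
    if not readings:
        raise ValueError("readings must not be empty")
    observed = [r for r in readings if r is not None]
    n_total = len(readings)
    n_missing = n_total - len(observed)
    n_out = sum(1 for r in observed if r < low or r > high)
    return {
        "complete_case": n_out / len(observed) if observed else None,
        "lower_bound": n_out / n_total,
        "upper_bound": (n_out + n_missing) / n_total,
        "missing_fraction": n_missing / n_total,
    }


# Hypothetical PbtO2-like series with gaps (e.g., during patient transport)
# and a hypothetical target range of 20 to 40.
series = [25, 18, None, None, 22, 15, 30, None, 19, 24]
summary = time_outside_range(series, low=20.0, high=40.0)
print(summary)
```

The gap between the two bounds widens with the amount of missing data, which indicates how much an imputation model, such as one built on informative subject-level priors, has to resolve.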
Discussant: Lisa Merck, Virginia Commonwealth University
Dr. Lisa H. Merck will discuss management of clinical confounders in polytrauma shock physiology. She will briefly outline lessons learned from the ProTECT III clinical trial and the unique challenges and opportunities of working with continuous multiparametric data streams.
SP4 – THE MODERN CLINICAL TRIAL: MANAGING THREATS TO ROBUST EVIDENCE
Randomized controlled trials are intended to resolve, not stir, controversy. Many proposals to improve the speed and efficiency of clinical trials present threats to robust evidence. Methods that lack transparency; introduce modeling assumptions, subjectivity, or complex randomization schemes; or borrow data from non-randomized subjects weaken the foundation for error control and have the potential to invalidate trial conclusions. An additional potential threat is a push, by some, to rely more heavily on “real-world evidence.” While real-world data come from multiple sources, including clinical trial data, the term “real-world evidence” often refers to non-randomized or observational studies that utilize data collected in the context of routine medical practice, as the “real experience.” While such data can provide important information about medical practice, real-world evidence is no substitute for a well-conducted randomized controlled trial. In this session, we will address many questions, including: What are the limits of real-world evidence? When is this approach an appropriate substitute for clinical trial evidence? What can be learned from trials that have piloted adaptive and complex modeling techniques in recent years? Are there scenarios where these methods are appropriate? Do we need to further evaluate the relative merits of these proposals? This session will address the tradeoff between robustness, rigor, and practicality; the incorporation of data from non-concurrent controls; and response-adaptive randomization.
Talk 1: Real World Evidence for evaluating Treatment Effectiveness and Safety: when is it needed and how can it be made trustworthy? Stuart Pocock, London School of Hygiene and Tropical Medicine.
Talk 2: Response Adaptive Randomization: Rigor Now or Rigor Mortis Later? Michael Proschan, Biostatistics Research Branch, National Institute of Allergy and Infectious Diseases
Talk 3: Use of non-concurrent controls in the RCT paradigm. Lori E Dodd, Clinical Trials Research Branch, National Institute of Allergy and Infectious Diseases
Discussants: Colin Begg, Memorial Sloan Kettering Cancer Center; Marc Buyse, International Drug Development Institute; Patrick Phillips, University of California San Francisco
Late Breaking Session – IMPACT OF RECENT US GOVERNMENT ACTIONS: A LATE-BREAKING PANEL WITH AUDIENCE DISCUSSION
Research funded, overseen, and evaluated by the United States National Institutes of Health (NIH), Food and Drug Administration (FDA), Centers for Disease Control and Prevention (CDC), and Department of Veterans Affairs (VA) is critical for the world’s health. NIH-funded trials operate with unparalleled objectivity and often prioritize answering questions that are important for medical practice, thus informing medical care in ways that development-oriented, industry-funded clinical trials generally do not. This includes evaluating the comparative effectiveness of competing interventions and driving the study of interventions for rare diseases where profit motives are lacking. These institutions are the primary supporters of academic clinical research training. The FDA protects public health by providing the most extensive and critical reviews of interventions to treat and prevent diseases to determine if they are safe and effective for clinical use. The CDC protects public health by preventing and controlling disease, injury, and disability. Recent executive orders have reduced support for these foundational institutions and the research they support. This panel will discuss the impact on government agencies, academia, and industry; clinical research careers; the clinical research agenda and portfolio; the quality of clinical trials and other research studies; and the evidentiary standard for evaluating interventions. Substantial time will be allocated for audience participation.
Dr. Ciolino will speak from the perspective of an academic data coordinating center.
Dr. Wittes will speak from the perspective of a former NIH employee, NIH grantee, and SCT President.
Dr. Evans will speak from the perspective of an academic researcher, NIH grantee, and SCT President.
SP6 – HIERARCHICAL COMPOSITE ENDPOINTS AND THE WIN RATIO: AN APPROACH TO OBTAINING ROBUST ANSWERS
The Win Ratio (WR) method, introduced by Pocock et al. in 2011, offers a novel statistical approach to enhance the analysis of composite outcomes with varying severities by accounting for the relative priority of each component. Conventional methods for creating and analyzing composite outcomes have limitations that may require altering and/or narrowing study questions, potentially creating downstream limitations in interpreting results. In contrast with conventional methods, the WR accommodates mixed outcome types (e.g., time-to-event, categorical, and continuous) without relying on distributional assumptions, thereby empowering stronger study questions and resulting in more robust answers. By comparing pairs of patients and assigning a “win” to the patient with the better outcome for each pair, this method prioritizes the most important endpoints set by patient and/or provider priorities, leading to more clinically relevant trial results. Additionally, the WR method allows for the prioritization of fatal outcomes and hierarchical testing of broader composite endpoints, including patient-reported outcomes. Its hierarchical structure, statistical power, and flexibility make the WR an attractive alternative for comparing the efficacy of randomized treatments. With the increasing implementation of the WR in various medical fields, this invited session will provide a comprehensive introduction to the WR method and its extensions, explore its applications across multiple fields, and address challenges in implementing and securing funding for WR trial designs. In addition, the new Win Time Ratio method will be illustrated. The Win Time Ratio is a new variant of the WR that accounts for the time spent in each clinical state during the combined common follow-up period.
Finally, we will discuss the physician’s perspective on the critical need for trial designs that better reflect the complexity of real-world patient outcomes.
This invited session aims to provide a comprehensive overview of the Win Ratio (WR) method and its extensions and variants, illustrating how these innovative statistical approaches can improve the analysis of composite outcomes in clinical trials. By exploring both the methodological advancements and practical applications, we aim to showcase the potential of WR-based designs to yield more clinically relevant and robust trial results that better reflect real-world patient outcomes. Attendees will gain a deeper understanding of the WR method and its new variant, the Win Time Ratio, learning how these approaches address key limitations in conventional composite outcome analysis. Participants will also acquire insights into implementing WR-based trial designs, overcoming associated challenges, and the critical role of these methods in generating more meaningful and robust findings in clinical research.
Talk 1: Enhancing Clinical Trial Outcomes: An Introduction to the Win Ratio Method. Björn Redfors, Sahlgrenska University Hospital
This talk will introduce the Win Ratio (WR) method as a novel statistical approach for analyzing composite outcomes with varying severities. The talk will address how WR accommodates mixed outcome types (time-to-event, categorical, and continuous) without relying on distributional assumptions, empowering stronger study designs and more robust conclusions.
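The pairwise comparison at the heart of the unmatched WR can be sketched in a few lines. In this simplified illustration every treated patient is compared with every control patient on a prioritized list of outcomes; the first outcome that differs decides the pair, and the WR is the number of wins divided by the number of losses. The sketch assumes complete follow-up with no censoring and that higher values are better for every component; the function names and toy data are hypothetical.

```python
def compare(patient_a, patient_b):
    """Return 1 if patient_a wins, -1 if patient_b wins, 0 if tied.

    Each patient is a tuple of outcomes listed from highest to lowest
    priority, with higher values assumed to be better.
    """
    for a, b in zip(patient_a, patient_b):
        if a > b:
            return 1
        if a < b:
            return -1
    return 0


def win_ratio(treatment, control):
    """Unmatched win ratio: compare every treated patient with every control."""
    wins = losses = ties = 0
    for t in treatment:
        for c in control:
            result = compare(t, c)
            if result > 0:
                wins += 1
            elif result < 0:
                losses += 1
            else:
                ties += 1
    ratio = wins / losses if losses else float("inf")
    return wins, losses, ties, ratio


# Toy data (hypothetical): (alive at end of follow-up, quality-of-life score).
treated = [(1, 5), (1, 3), (0, 4)]
controls = [(1, 3), (0, 6), (0, 2)]
print(win_ratio(treated, controls))
```

Survival is compared first, and the quality-of-life score is consulted only when survival status is tied, which is how the hierarchy gives priority to the most severe outcome.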
Talk 2: Win Ratio implementation in WINDSURFER trial and its extensions. Lai Wei, Ohio State University
Dr. Wei will introduce the implementation of the WIN ratio analysis to Determine a strategy of non-invasive SUpport for Respiratory Failure in the EmeRgency Department (WINDSURFER) trial. The considerations and challenges encountered during the study design of this trial will be shared. Additionally, extensions of the Win Ratio method, incorporating weighted and matched techniques, will be introduced.
Talk 3: Win Time Methods for Clinical Trials. James Troendle, National Heart, Lung, and Blood Institute
In this talk, Dr. Troendle will introduce and illustrate the Win Time methods. These new methods will be compared to the Win Ratio, with an important distinction being how likely a trial is to conclude benefit without there being an overall benefit.
Discussant: Jarrod Mosier, University of Arizona
Dr. Mosier will reflect on the 3 presentations and share his experience as a clinical PI using the WR approach for the primary outcome in acute trials and how it has been perceived by peer review.
SP7 – HARNESSING THE POTENTIAL OF PLATFORM TRIALS IN CHRONIC DISEASES
Platform trial designs offer an innovative framework for assessing treatment efficacy in progressive and degenerative chronic diseases while alleviating challenges of operations and resource management. Clinical trials in chronic diseases, such as interstitial lung disease, can be hindered by slow recruitment, inconsistent standards of care, and large sample size requirements with extended periods of follow-up. Potential trial participants, especially those with a rare chronic disease, often face the predicament of having to choose only one among multiple competing clinical trials. In contrast, platform trial designs efficiently address these issues without compromising the integrity and robustness of trial results. The shared trial infrastructure of platform trials facilitates pooling resources across participating institutions, aiding enrollment, while also offering trial participants a greater chance of receiving an effective therapy. Platform trial features that are beneficial in chronic disease populations include 1) the ability to randomize to combination therapies across multiple disease subpopulations, 2) follow-up embedded within usual care schedules, and 3) adaptive decision rules for early stopping that allow chronically ill patients to receive evidence-based care. This session will discuss unique aspects of platform trial design and implementation in chronic diseases and highlight how to leverage the strengths of platform designs in clinical development.
Five speakers will address different aspects of platform trials with examples in pulmonary and neurological diseases. The first talk will discuss the value of using real-world evidence to define well-powered endpoints for varied disease subpopulations and stages of disease progression. Strategies for data collection informed by real-world evidence will also be addressed, including frequency of follow-up and plans for handling prevalent background therapies. The second talk will address considerations for supporting shared control groups across distinct disease subpopulations and treatment domains, including modes of administration. The third talk will explore adaptive trial features that improve statistical and logistical efficiency and expedite the timeline to trial conclusions. Statistical approaches for investigating potential interactions of combination therapies within a multi-factorial framework will also be discussed. The fourth talk will present operational aspects of coordinating site activation and data collection, monitoring and implementing the randomization scheme, and navigating IRBs in complex trials. The final talk will discuss an industry sponsor’s perspective on participating in platform trials. Talks will each be roughly 15 minutes and will be followed by a final 15-minute Q&A period.
Talk 1: Utilizing real world evidence to characterize target populations and appropriate clinical outcomes, and to strategize data collection in trial planning. Iain Stewart, Imperial College London
Talk 2: Logistical and statistical considerations of a shared control in platform trials of multiple disease cohorts and treatment domains. Megan McCabe, University of Alabama at Birmingham
Talk 3: Implementing adaptive design features to deliver well-powered, expedited, and patient-centric results. Barbara Wendelberger, Berry Consultants
Talk 4: Coordinating the set up and operations of platform trial implementation in chronic disease settings. John VanBuren, University of Utah
Talk 5: An industry sponsor’s perspective on the benefits of therapeutic development within platform trials. JonDavid Sparks, Eli Lilly
SP8 – DESIGN CONSIDERATIONS FOR MULTI-CANCER DETECTION ASSAY CLINICAL TRIALS: THE NCI CANCER SCREENING RESEARCH NETWORK
A new generation of Multi-Cancer Detection (MCD) assays that evaluate cell-free DNA or other biological components is rapidly emerging. If the benefit of MCD assays could be established, this would present several advantages for cancer screening. MCDs are simple to implement for both health care providers and their patients (generally, a blood test), so could be widely accessible, even in under-resourced settings. MCD tests also have the potential to greatly expand early detection opportunities for cancers with no established screening technologies. In addition, a single blood test for multiple cancers could improve the reach and efficiency of screening, even for those cancers with existing screening technologies.
Despite their promise, the evidence supporting MCD tests for early detection benefits is quite limited. Only two prospective, uncontrolled studies have reported outcomes for MCD tests, and these results are restricted to test performance, tumor stage at diagnosis, and adverse events related to working up an abnormal MCD test result. No study has documented the impact on cancer-specific or overall mortality, or harms from testing (e.g., over-diagnosis), which is critical for understanding their true value for public health and their implications for health care providers and systems.
An additional complexity of MCD tests is the identification of the tumor site or tissue of origin (TOO). Depending on the assay, an abnormal test result may give an indication of the location of the tumor, which may or may not be accurate, or it may signal only the presence of cancer without specifying a TOO. Because the value of screening depends on timely diagnosis and effective treatment, the process for reaching an accurate diagnostic resolution and accessing treatment is fundamental to the value of MCD-based screening. No standards yet exist to assist a primary care provider in determining how to follow up an abnormal MCD test result.
The new, NCI-funded Cancer Screening Research Network (CSRN) will evaluate emerging technologies for cancer screening. The CSRN will conduct rigorous, multi-center cancer screening trials with large and diverse populations in a variety of health care settings with the ultimate goal of reducing cancer-related illnesses and deaths. CSRN launches its first randomized clinical trial, named the Vanguard Study, in early 2025. Trial participants without cancer will be randomized to receive one of two MCD tests or to a control arm (no test), with the goal of assessing the feasibility of implementing a large platform RCT to measure the clinical effectiveness of MCD tests. Leaders of the two coordinating centers for the Vanguard Study, the Communications and Coordinating Center and the Statistics and Data Management Center, will present the Vanguard design and discuss unique aspects of MCD screening trial design.
Talk 1: Design of the CSRN Vanguard feasibility study. Katherine A Guthrie, Fred Hutch Cancer Center
The specific aims of the Vanguard Study are to assess the feasibility of conducting a randomized controlled trial to evaluate MCD tests, and to develop and evaluate our ability to engage underserved and under-resourced populations in this effort. This feasibility study will accrue up to 24,000 participants from across the US through 9 Accrual, Enrollment, and Screening Site (ACCESS) Hubs, including academic, community, Federally Qualified, Department of Defense, and Veterans Affairs health care centers. The design incorporates blood draws from all participants at baseline and year 1, single-blinded and unblinded Hubs, collection of standard of care cancer screening episodes and incidental cancer cases, participant-reported mental health outcomes, and diagnostic workups for the expected 3-5% of intervention-arm participants who receive an abnormal MCD test result.
Talk 2: Can we shortcut cancer screening trials? Ruth Etzioni, Fred Hutch Cancer Center
This talk will describe novel designs and alternative endpoints that are being considered to make screening trials more efficient and more able to provide timely results regarding screening efficacy. Dr. Etzioni will discuss the implications of these potential changes for the evaluation of multi-cancer detection tests.
Talk 3: Are MCD tests ready for primetime? Ziding Feng, Fred Hutch Cancer Center
This talk will contrast evidence supporting the effectiveness of MCD tests versus single cancer tests. Dr. Feng will sound a cautionary note that while there is great excitement about MCD tests, we should not necessarily relax established criteria for trial readiness.
A panel discussion with the speakers and other CSRN coordinating center leadership will follow the three talks. We also hope to add a speaker representing a clinical site to discuss challenges inherent to the diagnostic workup following an abnormal MCD test result.
SP10 – PEDIATRIC DRUG DEVELOPMENT: EMERGING INNOVATIONS AND FUTURE DIRECTIONS
The development of safe, effective, and targeted medications for pediatric populations has long been a significant challenge. Key obstacles include the small number of pediatric patients, the limited availability of detailed physiological data, and the ethical complexities of conducting research with children. These factors collectively slow the progress of pediatric drug development, often resulting in significant delays compared to the approval timelines of drugs for adults. As regulatory requirements for pediatric studies have evolved, innovative research methods, advanced technologies, and collaborative frameworks have emerged to drive progress in this critical field of medicine. Regional guidelines discussing pediatric extrapolation have previously been issued by various regulatory agencies, including both the FDA and the EMA. The recently released ICH E11A guideline provides recommendations for, and promotes international harmonization of, the use of pediatric extrapolation to support the development and authorization of pediatric medicines. Recently, a review of pediatric labeling changes in the US also showed that the use of extrapolation increased the approval rates of new and expanded pediatric indications (Ye et al., 2023). ICH E11A encourages the use of pediatric extrapolation based on evaluation of existing evidence between adult and pediatric populations: 1) similarity of disease, 2) similarity of drug pharmacology, and 3) similarity of response to treatment, to reduce the burden of conducting pediatric studies. The level of evidence required would depend on the strength of the existing evidence, and thus the approaches for extrapolation may differ. Extrapolations are typically conducted between adult and pediatric populations for the same drug. Recently, mechanism of action (MOA) based extrapolation has been proposed. This MOA-based strategy broadens the scope of data sources beyond just the same drug to include other drugs with the same or similar MOA.
As a result, it allows for the integration of diverse data types that are highly relevant to both the pediatric population and the drug under development. Within this context, Bayesian methodologies are emerging as a powerful tool, offering innovative ways to maximize the use of existing data, optimize trial designs, and ensure robust statistical analysis. Represented by the American Statistical Association (ASA) Biopharmaceutical Section Statistics in Pediatric Drug Development Scientific Workgroup (SPDRx), this session aims to explore cutting-edge approaches to pediatric drug development, focusing on how these methods can bridge the gap between adult and pediatric populations and streamline trial designs. It will feature three expert-led presentations, followed by a discussion led by the PhRMA Topic Lead for the ICH E11A pediatric extrapolation expert working group, who will provide insights into the latest developments in international harmonization efforts and the role of extrapolation in pediatric medicine.
SP11 – INCOMPLETE VARIANTS OF STEPPED WEDGE CLUSTER RANDOMIZED DESIGNS: RECENT DESIGN INNOVATIONS AND CONSIDERATIONS FOR IMPLEMENTATION
Cluster randomized trials are essential designs for evaluating effects of interventions that are applied to groups of patients (i.e., to entire clusters). A common challenge in practice is balancing the need for a robust design that has a sufficiently large number of clusters with practical limitations such as limited availability of clusters, budgetary constraints, and securing the support of cluster gatekeepers. Stepped wedge cluster randomized trials are an important variant that has gained popularity over the past three decades: in these designs, all clusters start in their usual-care steady-state, but eventually switch to the intervention during the trial, with the timing of the switch randomized. More than 530 stepped wedge trials are currently registered on ClinicalTrials.gov, and the opportunity for all clusters to implement the intervention is a key reason for their practical appeal. However, standard stepped wedge designs can be costly and burdensome to both clusters and individual participants, and can increase the risk of cluster attrition, as all clusters are required to recruit and measure individuals for the entire study duration.
Incomplete variants of stepped wedge designs, in which clusters participate in a trial for limited durations of time, offer appealing alternatives, alleviating burdens and reducing costs. In fact, statistical work has shown that not all measurements in a stepped wedge design contribute the same amount of information about the effect of an intervention: for example, those measurements taken near the time that a cluster switches from the control to the intervention tend to provide the most information about the treatment effect. This work points the way towards potentially powerful incomplete alternatives to the stepped wedge; but myriad incomplete variants of any complete stepped wedge design exist. Much recent work has focused on the identification of “optimal” incomplete variants of the stepped wedge design: incomplete designs that still provide sufficiently high statistical power to detect effects of interest while reducing the burden of participating in a trial, and reducing trial costs. Major questions remain: which variants of incomplete stepped wedge designs are particularly beneficial; what is the optimal incomplete design for any given scenario; how do conclusions about the efficiency of incomplete designs change for different modelling approaches; and how acceptable are these optimal incomplete designs to trialists?
This session gathers researchers from around the world who are focused on identifying and understanding optimal incomplete variants of stepped wedge designs. In this session, they will discuss why incomplete variants of the stepped wedge design are worth considering; describe some useful variants of incomplete designs; present methods for finding incomplete stepped wedge designs with high levels of power; and consider when these incomplete stepped wedge designs are useful and acceptable alternatives to the complete stepped wedge. This session will be of interest to all researchers working in the design and conduct of cluster randomized trials and has implications for enhancing the robustness of innovative trial designs to answer important clinical research questions.
Talk 1: Outline of session and introduction of speakers. Jessica Kasza, Monash University
Talk 2: From the stepped wedge to the staircase. Kelsey Grantham, Monash University
In this talk, we show that measurements taken in certain regions of stepped wedge designs contribute nothing or very little to estimation of the treatment effect, pointing the way toward new designs that concentrate measurements in only the most impactful regions. We then describe how incomplete variants that are less burdensome and more cost-efficient, such as “staircase” designs, can be derived from a stepped wedge design by removing cells from the design in a principled manner. Finally, we discuss staircase designs in more detail, including when staircase designs can be equally as or more powerful than stepped wedge designs.
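The idea of deriving a staircase design by removing cells from a stepped wedge can be sketched in a few lines. The following is only an illustration under stated assumptions, not the speakers' method: the function names are hypothetical, each treatment sequence is assumed to switch one period later than the previous one, and the staircase is assumed to keep a fixed number of control periods before and intervention periods after each switch.

```python
def stepped_wedge(n_seq):
    """Complete stepped wedge with n_seq sequences and n_seq + 1 periods.

    Each cell is 0 (control) or 1 (intervention). Sequence s (0-based)
    is assumed to switch to the intervention at period s + 1.
    """
    n_periods = n_seq + 1
    return [[1 if period > seq else 0 for period in range(n_periods)]
            for seq in range(n_seq)]


def staircase(n_seq, pre=1, post=1):
    """Illustrative staircase variant of the stepped wedge above.

    Keeps only `pre` control periods immediately before and `post`
    intervention periods immediately after each sequence's switch.
    None marks a cell in which no measurements are taken.
    """
    design = []
    for seq, row in enumerate(stepped_wedge(n_seq)):
        switch = seq + 1  # first intervention period for this sequence
        design.append([cell if switch - pre <= period < switch + post else None
                       for period, cell in enumerate(row)])
    return design


def measured_cells(design):
    """Count the cluster-period cells in which data are collected."""
    return sum(cell is not None for row in design for cell in row)


if __name__ == "__main__":
    for row in stepped_wedge(3):
        print(row)
    for row in staircase(3):
        print(row)
    print(measured_cells(stepped_wedge(3)), measured_cells(staircase(3)))
```

With three sequences, this staircase measures half as many cluster-period cells as the complete design. Whether it retains adequate power depends on the correlation structure and the analysis model, which this sketch does not evaluate.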
Talk 3: Marginal models, binary outcomes, and incomplete designs. John Preisser, University of North Carolina
The use of marginal models for binary outcomes is commonplace in the analysis of data from stepped wedge designs, be they complete or incomplete. Hence, sample size formulas and procedures based on the marginal modelling approach that can accommodate incomplete stepped wedge designs are required. Here we discuss how such a procedure can be applied to explore a range of incomplete stepped wedge trials. We also discuss the interplay between statistical and practical considerations in the design of a particular incomplete stepped wedge trial.
Talk 4: Demystifying incomplete stepped wedge designs under the working independence assumption. Fan Li, Yale University
The design and analysis of stepped wedge designs is complicated by the need to specify the “correct” working correlation structures, and a convenient working independence assumption is sometimes attractive due to its simplicity and accessibility, and its robustness to working correlation misspecification. In this talk, we will discuss the information content of full stepped wedge designs analyzed by independence estimating equations, and identify information-rich cells that contribute the most to treatment effect estimation, thereby motivating the form of an incomplete design variant that balances power with data collection burden. We will discuss a surprising result that an incomplete design can beat a complete design in efficiency under working independence (i.e., “less is more”) and provide a new justification for the incomplete stepped wedge variant. Practical considerations and numerical examples of designing incomplete stepped wedge trials under working independence are also discussed.
Talk 5: Incomplete designs in practice. Monica Taljaard, Ottawa Hospital Research Institute
In this talk, we consider practical issues around the design, analysis and reporting of incomplete stepped wedge variants. We review the trials literature to consider how commonly incomplete designs are being used and how and why they are implemented. We discuss potential pitfalls raised by this design, and review practical aspects that trialists and statisticians should consider when planning and implementing an incomplete stepped wedge trial. We conclude by identifying gaps and unanswered questions that need to be addressed.
Panel discussion: The future of incomplete designs. Moderated by Jessica Kasza.
SP13 – BUILDING AN EVIDENCE-BASE FOR EXERCISE MEDICINE IN TARGETED POPULATIONS THROUGH RIGOROUS CLINICAL TRIALS
Physical activity is necessary for optimal health, but can exercise be used safely for the treatment and prevention of diseases in vulnerable and diverse populations? Health care providers need high-quality evidence to prescribe effective and safe interventions for their patients. The evidence base for exercise interventions largely consists of single-center, underpowered efficacy trials. However, there are ongoing efforts to contribute stronger evidence through rigorously conducted clinical trials across complex diseases and in diverse populations. Challenges encountered include study design, implementation, and recruitment: (1) Exercise trials are complex to design due to the multiple modalities, doses, and durations; (2) Data and safety monitoring must be followed no less rigorously than in pharmacological trials; (3) Successful recruitment for exercise trials depends on the motivation of the population being studied and competing pharmacological options.
Talk 1. Kathryn Schmitz, University of Pittsburgh
Dr. Schmitz will present on exercise oncology trials, with a focus on THRIVE 65, which is part of the NCI-funded ENICTO Consortium. This multisite two-arm RCT is assessing the effect of twice-weekly progressive resistance training and protein supplementation on chemotherapy treatment tolerance (relative dose intensity). The intervention is largely delivered through telehealth to the participants, who are all 65 and older. Cognitive issues, feeling overwhelmed, and lack of familiarity with technology are among the ongoing challenges for carrying out this trial. Monitoring exercise and nutrition intervention dose and adverse effects is also a major focus of the study team as this work progresses.
Talk 2. Daniel M Corcos, Northwestern University
Dr. Corcos will present on exercise trials for Parkinson’s disease. He will discuss the challenges associated with studying disease modification in Parkinson’s disease as opposed to studying the modification of the signs and symptoms of the disease. He will discuss the role of both fluid and digital biomarkers in the study of exercise. He will also discuss different approaches to solving the randomization problem in exercise studies, including masked dose escalation and cluster-randomized experimental designs. He will conclude by discussing SMART experimental designs to change behavior. The science of exercise has made it clear that exercise is a potent medicine to change health span. The next great challenge is behavior change, especially across diverse populations of individuals and multiple health domains.
Talk 3. David X Marquez, University of Illinois Chicago
Dr. Marquez will present on aspects of conducting physical activity trials in diverse populations. Older Latinos are the fastest growing cohort among older adults in the USA, and their lives are often fraught with comorbidities. Evidence has demonstrated health benefits of regular physical activity for older adults. However, older Latinos participate in low levels of physical activity. Interventions designed to increase the physical activity of older Latinos are lacking, and many older Latinos face impediments to participating in physical activity interventions that researchers are unaware of. We have identified barriers and strategies to overcome these barriers that researchers are likely to face in conducting in-person and remote physical activity interventions for older Latinos.
Talk 4. Eduardo E Bustamante, University of Illinois Chicago
Dr. Bustamante will present on aspects of conducting physical activity trials with a focus on mental health. Clinical trials on physical activity and mental health conceptualize physical activity as a form of medicine and seek to discover the optimal dose (i.e., frequency, intensity, time, type) for various outcomes and conditions. However, physical activity is fundamentally different from medicine in that the contextual features of physical activity programs have substantial effects on mental health, independent of physical activity dose. When we exercise, where we exercise, and who we exercise with are inescapable features of physical activity that both confound trials and present new opportunities for mental health promotion. This presentation will review the role of context in physical activity-mental health trials and provide examples of ongoing research harnessing physical activity contexts to optimize mental health in youth.
Talk 5. Charity Patterson, University of Pittsburgh
Dr. Patterson will briefly summarize the opportunities and challenges of conducting physical activity and exercise trials and facilitate a discussion with questions from the audience.
SP14 – STOPPING PROGRESS: FINDING EFFECTIVE TREATMENTS USING DISEASE PROGRESSION MODELING
Progressive diseases are characterized by a systematic pathological advance that can include abnormal biomarker activity, decreased function, and clinical symptoms. For example, in neurodegenerative diseases such as frontotemporal dementia, there is a cascade of pathological processes with early changes in neurofilament light chain and magnetic resonance imaging measures and, later, progression to cognitive symptoms and clinical disease. Quantifying both the trajectory and heterogeneity of these changes, and their relationship to disease state, is key to addressing clinically relevant questions and designing well-powered clinical trials. In this session, we introduce and explore different aspects of disease progression modeling and discuss how to leverage disease progression models in innovative clinical trial design. In the first talk, we explore the idea that a person’s disease state can be modeled using the concept of disease age and illustrate how disease age can be leveraged to improve clinical trial design and yield robust analyses. The second talk focuses on endpoint selection in progressive diseases and considerations for clinical trial design. Progressive diseases may be modeled using various types of endpoints, including clinical, cognitive, and biomarker outcomes. Selecting trial endpoints with well-characterized statistical behavior, as well as clinical relevance, is crucial to finding novel and effective treatments. Third, we address the inevitability of missing data due to progression. This talk will provide strategies for addressing mortality in progressive disease trials and describe their impact on the interpretation of results. The final talk focuses on treatment effects in progressive disease, addressing similarities and differences related to deltas, slowing, reduction, and variance. Assumptions about treatment effects, as well as choices in how to model them, have a substantial impact on clinical trial design and subsequent trial results.
Talks will be followed by a roughly 15-minute Q&A period. Disease progression modeling provides a framework that enables researchers to ask the right questions about how progressive diseases advance and provides a flexible tool that will shape future clinical trial design as we search for treatments that can slow or halt disease progression.
Talk 1: Understanding disease age and leveraging this concept in clinical trial design. Adam Staffaroni, University of California, San Francisco
Talk 2: Endpoint selection in progressive diseases and considerations for clinical trials. Chris Coffey, University of Iowa
Talk 3: Strategies for addressing mortality in progressive disease trials and impact on the interpretation of trial results. Tom Jensen, Berry Consultants
Talk 4: Defining treatment effects in progressive disease, addressing similarities and differences related to deltas, slowing, reduction, and variance. Guoqiao Wang, Washington University in St. Louis
SP17 – ESTABLISHING THE UNIVERSITY DATA COORDINATING CENTER (UNICORN) NETWORK TO ADVANCE DATA COORDINATION IN CLINICAL RESEARCH
Data Coordinating Centers (DCCs) play a critical role in the design, implementation, analysis and dissemination of multicenter clinical trials. Through thoughtful collaborations, DCCs contribute to defining the right questions that will lead to robust answers. They ensure efficient data procurement and promote data accuracy. This session aims to discuss the formation, objectives, and impact of the UNICORN Network, a coalition of academic data coordinating centers aimed at advancing design, data coordination and statistical methodology in clinical research. The session will provide an up-to-date overview of the UNICORN Network’s mission to share best practices, advocate for data coordination, and address gaps in the landscape of data coordinating centers. By featuring speakers from diverse backgrounds, the session will highlight how the network serves as a platform for discussing innovative solutions and fostering partnerships to advance the field of clinical research data coordination.
Session Objectives: (1) Introduce the UNICORN Network: Outline the network’s formation, principles, and structure, including its mission to improve clinical research informativeness through collaboration; (2) Discuss the Landscape of Academic DCCs: Present findings from the 2024 DCC Summit and follow-up survey, addressing current challenges, opportunities, and the critical role of data coordination in clinical research; (3) Highlight Best Practices and Advocacy: Share the network’s efforts in developing and disseminating best practices for data coordination, professional development, and advocacy for the value of academic DCCs; (4) Encourage Collaborative Efforts: Facilitate a discussion on how the network promotes resource sharing, collaboration, professional development and communication among academic institutions.
Introduction to the UNICORN Network: discuss the origins of the UNICORN Network, emphasizing its role in bringing together academic DCCs to share best practices and advocate for high-quality data coordination. Highlight the network’s foundational principles of transparency, member-driven leadership, and the importance of academic centers in advancing clinical research.
Landscape and Challenges of Academic DCCs: present a landscape assessment based on a 2024 DCC Summit and subsequent survey, providing insights into the state of academic DCCs, including staffing, infrastructure, mission priorities, and existing gaps. This segment will cover the unique challenges faced by academic DCCs, such as increasing regulatory requirements, funding constraints, and the need for methodological innovation.
Best Practices and Advocacy for Data Coordination: outline the UNICORN Network’s efforts in developing and advocating for best practices in data coordination including the establishment and operation of DCCs, professional development programs, and strategies for enhancing data management and sharing, regulatory compliance, and cybersecurity in clinical research. Discuss the network’s role in promoting collaboration and resource sharing among institutions.
Panel Discussion and Audience Engagement: Discussant will provide a synthesis of the presented topics, offering insights into the future of data coordination in clinical research. The discussion will focus on how the UNICORN Network can drive innovation and address challenges within the field. Audience members will be invited to participate, posing questions and sharing their perspectives on advancing data coordination practices.
Importance of the Session: Data Coordinating Centers are critical to multi-site clinical research, yet there is no guidebook to follow when building one, and no standard by which they are assessed. This session will highlight the UNICORN Network’s contributions to improving clinical research and advocate for the recognition of data coordination as a critical aspect of trial success. By bringing together experts from various academic institutions, it aims to foster collaboration and disseminate knowledge to enhance the effectiveness and efficiency of DCCs. The session will provide valuable insights into the evolving landscape of data coordination in clinical trials and opportunities for members of the community to get involved in advancing the science and practice of data coordination.
Diversity and Inclusion: The session features speakers and panelists from diverse academic backgrounds and institutions, offering multiple perspectives on data coordination. This diversity will underscore the collaborative nature of the UNICORN Network and its commitment to inclusive practices that advance clinical research.
Expected Outcomes: (1) Enhanced understanding of the role and challenges of academic DCCs in clinical research; (2) Dissemination of best practices for establishing and running data coordinating centers; (3) Engagement with the clinical trials community to identify strategies for addressing current and future challenges in data coordination.
SP18 – IMPROVING THE CLINICAL TRIAL ECOSYSTEM TO EFFICIENTLY GENERATE ROBUST RESEARCH AND IMPROVE CARE
Randomized controlled trials (RCTs) are the gold standard for generating high-quality evidence to optimize human health. However, trial teams and patients face substantial challenges to undertaking and participating in clinical trials, including delays to study initiation and a lack of access to clinical trials for patients. Indeed, a recent editorial bemoaned that “RCTs are often challenging and resource-intensive to implement” due to logistical barriers as well as a need to develop the infrastructure required for study procedures.
As part of a $250 million investment to deliver on the Canadian Biomanufacturing and Life Sciences Strategy, the Canadian Accelerating Clinical Trials (ACT) consortium has been funded to address these challenges and improve the clinical trials ecosystem. Alongside ACT, several Clinical Trial Training Platforms (CTTPs) have also been funded to recruit, train and mentor highly qualified trainees, researchers, healthcare professionals, and clinical research professionals and better position the next generation of clinical trial researchers within the clinical research ecosystem.
In this invited session we identify key challenges and how we are addressing these through ACT and the CTTPs. The session will be opened by PJ Devereaux, who will provide an overview of the challenges as well as a high-level synopsis of the work being undertaken by ACT. This will be followed by three focused presentations on: work that is decreasing time and increasing efficiency of study initiation; improvements in awareness of, engagement with, and access to clinical trials; and training that will build a capable and nimble workforce to deliver robust RCTs.
Talk 1: Improving the process and increasing the pie: Time to ACT. PJ Devereaux, McMaster University
In this presentation we will describe the current challenges facing those wanting to conduct clinical trials in Canada, including the need to increase the funding available for clinical trials. From here we will outline initiatives undertaken by ACT to address these challenges, including the development of master contract agreements, building research infrastructure within community hospitals, streamlining the ethics review process, improving awareness of and access to clinical trials, and recognizing the important link between a strong economy and health and the need to engage with industry to grow the funding pie.
Talk 2: Decreasing time and increasing efficiency of study initiation. Dean Fergusson, Ottawa Hospital Research Institute
Operational bottlenecks in trial initiation include research ethics board (REB) approval, the negotiation and execution of clinical trial contracts, regulatory processes, and the recruitment and training of study personnel. After outlining these challenges, we will describe two major initiatives of the ACT consortium: the development of a single national distributive REB model with strict timelines, and two pan-Canadian master agreements and accompanying templates relating to the sharing of data and study start-up. The first, the master data sharing agreement, has, as of October 1, 2024, 46 signatories. The second, the master CIHR-funded participating site agreement (and its accompanying template for non-CIHR-funded studies), has, as of October 1, 2024, 35+ institutions across Canada currently negotiating its language for finalization.
Talk 3: Awareness, Engagement, and Access: getting the right trials to the right people. Stuart Nicholls, Ottawa Hospital Research Institute
In a survey of the public conducted by Clinical Trials Ontario, only 11% of respondents had been approached to be part of a clinical trial, yet over 65% of respondents indicated they would be willing to participate in a trial. This reflects a major gap between interest and opportunity for patients to benefit from clinical trials. The third presentation will focus on the need to improve access to clinical trials. Specifically, the presentation will focus on the work ACT has undertaken to (1) increase the availability of trials in Canada through direct funding and network support, (2) improve awareness regarding the importance and availability of trials (BeTheCure), (3) build infrastructure to support increased access to trials in areas currently underserved (portfolio hospitals), and (4) improve the design and conduct of trials to make them more inclusive (patient engagement & IDEA).
Talk 4: Building capacity and the future workforce. Sameer Parpia, McMaster University
Clinical research depends on a skilled workforce equipped with the knowledge and mentorship necessary to drive advancements in this field. In this final presentation we discuss the need to develop capacity and skills within the workforce. The presentation will begin by outlining the rapid development of trial methods and needs before outlining the Clinical Trial Training Platforms (CTTPs) funded by the Canadian Institutes of Health Research Clinical Trial Fund. Following a general overview of the CTTPs, the presentation will focus on the work and impact of the Canadian Network for Statistical Training in Trials (CANSTAT), a platform to train and mentor biostatisticians in clinical trials.
Panel discussion: Following the presentations we will facilitate what we hope is a spirited discussion regarding the steps ACT and CANSTAT are taking to improve the trials ecosystem and the opportunity to create further collaborations. The chair will invite questions and comments from the audience to help distil directions for future work to improve the clinical trial ecosystem in Canada, North America, and beyond.
SP20 – IDENTIFYING AND MITIGATING THE IMPACT OF CLINICAL RESEARCH PROFESSIONAL ATTRITION IN CLINICAL TRIALS
The world of clinical research is not immune from the impact of study team attrition. This presentation explores the impact related to research study coordinator turnover, methods to quantify the rate of attrition, and opportunities to mitigate the impact of research professional turnover on active clinical trials. SIREN is an NIH-funded research collaborative with a number of research professionals at more than 90 sites across the US, Canada and abroad. Staff changes create challenges, and clinical trials are no exception. Changes in research professionals can have a wide range of impacts on an organization, including lower morale, cultural shifts, reduced productivity in the form of lower enrollment and reduced data quality, and increased training and hiring costs. The reasons for turnover can vary based on several factors including the industry, geographic location, economic conditions, and specific skill sets in demand. The SIREN Network initiated a process to better understand the incidence, reasons, and impact of turnover, specifically focusing on study coordinators, and to explore ways to mitigate negative impact on trial operations.
Talk 1: Challenges In Determining Prevalence of Attrition.
The Strategies to Innovate EmeRgENcy Care Clinical Trials Network (SIREN) utilizes a web-based clinical trial management system to capture study team members and roles within active clinical trials via an electronic delegation of authority (DoA) log. Sharon Yeatts will provide an overview of the structure of the network and the electronic DoA, and Ian Rines will describe the methods used to capture and present information related to clinical research coordinator turnover. These methods focus on data-driven questions that lead to evidence-based answers.
Talk 2: Qualitative Approach to Understanding Impact.
By asking the right questions, we can unlock innovative solutions and provide robust answers to staffing challenges. With this in mind, Valerie Stevenson will discuss the tools and techniques used in soliciting feedback regarding the impact of research coordinator attrition. This section provides an overview of LEAN practices, survey development, analysis of results, and development of mitigation strategies.
Talk 3: Implementation of Mitigation Strategies and the Impact of Attrition.
Abbey Staugaitis and Sara Roy will share the study manager perspective. They will provide an overview of their hiring models, experience working through issues related to attrition at the site level, collaborating with network stakeholders, implementation of mitigation strategies and feedback on the process. Models discussed will include pooled, semi-pooled, remote/asynchronous, and primary or co-lead staffing models. Definitions and origins of each model, perceived benefits, encountered challenges and ways to adapt models for different groups will be presented and evaluated.
A collaborative discussion session will follow, allowing participants to engage with the presented findings and contribute to the dialogue on improving retention, training and growth of key members of a clinical trial team.
SP21 – ISSUES IN THE DESIGN OF STUDIES WITH HIERARCHICAL ENDPOINTS
Recently, hierarchical endpoints have become increasingly popular in clinical trials research. They have several advantages over conventional composite endpoints: they (1) use information from multiple outcomes, rather than only the first event, (2) prioritize more important outcomes, and (3) combine information from different types of outcomes (time-to-event, binary, continuous, counts). Multiple approaches have been proposed for analysis of hierarchical endpoints, including win statistics, net treatment benefit and desirability of outcome ranking (DOOR). In this session, leading researchers in this field will discuss relative merits of various approaches and practical issues in designing clinical trials with hierarchical endpoints.
Talk 1: Trial Design with Win Ratio or Win Odds Based on Hierarchical Endpoints. Huiman Barnhart, Duke University
Win statistics, such as the win ratio and win odds, have become a popular approach to the analysis of hierarchical endpoints in clinical studies. While several sample size formulas are available for the design of randomized trials using win statistics, these formulas require investigators to specify a clinically significant and meaningful magnitude of the win statistic and the expected probability of ties. In practice, these quantities are difficult to identify based on prior published literature. We show that the win ratio for the hierarchical endpoints is a weighted average of marginal win ratios (with a similar expression for win odds), under the assumption of independence of the individual endpoints. We also provide the expression for the probability of ties. These formulas provide a simple way to specify a clinically significant and meaningful win ratio (or win odds) magnitude and probability of ties. As a result, formula-based power and sample size calculations can be easily obtained for trial design without the need to conduct complex simulation studies. Our extensive simulation studies show that statistical power calculated with the formulas under the independence assumption is similar to the simulation-based power for any type of positively correlated hierarchical endpoints. Our approach gives researchers an easy tool for trial design and gives insights into the relative contribution of each marginal win ratio (win odds) to the overall win ratio (win odds) and the impact of adding endpoints to the hierarchy. Cardiovascular trials are used to illustrate our approach.
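To make the quantities named above concrete, the sketch below computes the win ratio, win odds, and proportion of ties by comparing every treated patient with every control patient on a prioritized list of endpoints. It is a minimal illustration of the standard definitions, not the weighted-average formula from the talk. The function names and toy data are hypothetical, higher values are assumed to be better on every endpoint, and censoring of time-to-event endpoints is ignored.

```python
def compare(treated_patient, control_patient):
    """Compare two patients on endpoints listed in priority order.

    Returns 1 if the treated patient wins, -1 if they lose, and 0 if the
    pair is tied on every endpoint. Higher values are assumed better.
    """
    for t_value, c_value in zip(treated_patient, control_patient):
        if t_value > c_value:
            return 1
        if t_value < c_value:
            return -1
    return 0


def win_statistics(treated, control):
    """Win ratio, win odds, and tie proportion over all treated-control pairs."""
    wins = losses = ties = 0
    for t in treated:
        for c in control:
            result = compare(t, c)
            if result > 0:
                wins += 1
            elif result < 0:
                losses += 1
            else:
                ties += 1
    n_pairs = wins + losses + ties
    return {
        # The win ratio ignores ties and is undefined when there are no losses.
        "win_ratio": wins / losses if losses else float("inf"),
        # The win odds splits ties equally between the two arms.
        "win_odds": (wins + 0.5 * ties) / (losses + 0.5 * ties),
        "p_tie": ties / n_pairs,
    }


if __name__ == "__main__":
    # Toy data: (survival indicator, functional score), survival prioritized.
    treated = [(1, 5), (1, 3), (0, 4)]
    control = [(1, 3), (0, 6), (0, 4)]
    print(win_statistics(treated, control))
```

In this toy example there are 5 wins, 2 losses, and 2 ties among the 9 pairs, giving a win ratio of 2.5 and win odds of 2.0.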
Talk 2: Involving Patients in the Design of Trials Using Hierarchical Outcomes. Marc Buyse, IDDI & I-BioStat
Generalized pairwise comparisons is a method of analysis of multiple prioritized outcomes that provides a patient-relevant estimate of the overall effect of the treatment on all outcomes. Such an overall treatment effect can be expressed as a win ratio, a win odds, a probabilistic index, or a Net Treatment Benefit (NTB). We will argue that the NTB has advantages over other measures of treatment effect: it can be decomposed into additive contributions of prioritized outcomes and can be interpreted as the net probability that a random patient receiving treatment has a better outcome than a random patient receiving control. As such, it ranges over the interval [-1,+1] with 0 indicating that treatment does not differ from control. Establishing a hierarchy of outcomes is crucial to ensuring the NTB is patient relevant. This process entails selecting the outcomes, prioritizing them according to patient preferences, and choosing thresholds of clinical similarity for some or all outcomes if appropriate. These choices can be based on expert or patient opinion. We have developed software called “Voice” to elicit patient preferences for a well-defined set of selected outcomes, using the pairwise comparison paradigm. Voice displays the outcomes of pairs of patients, and the user is asked to choose the patient who, in their opinion, has the better outcome. An AI-driven algorithm generates outcomes for successive pairs of patients until the algorithm converges to a prioritized list of outcomes and thresholds that reflect the user’s preferences. Voice keeps track of user responses to justify the choice of a list of prioritized outcomes and thresholds in a prospective trial design, and to document heterogeneity in patient preferences. Voice could potentially inform individualized analyses according to the preferences of each user (or classes of users).
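For reference, the measures mentioned above can be written in terms of pair counts. The notation is introduced here for illustration and is not taken from the talk: among all N treatment-control pairs, W, L, and T denote the numbers of wins, losses, and ties for treatment, and W_k and L_k denote the wins and losses decided at the k-th of K prioritized outcomes.

```latex
\[
  N = W + L + T, \qquad
  \mathrm{NTB} = \frac{W - L}{N}, \qquad
  \mathrm{WR} = \frac{W}{L}, \qquad
  \mathrm{WO} = \frac{W + T/2}{L + T/2} = \frac{1 + \mathrm{NTB}}{1 - \mathrm{NTB}}
\]
\[
  \mathrm{NTB} = \sum_{k=1}^{K} \frac{W_k - L_k}{N}
\]
```

The second line is the additive decomposition referred to above: each prioritized outcome contributes its own net proportion of decided pairs, and these contributions sum to the overall NTB.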
Talk 3: The Desirability of Outcome Ranking: The DOOR to Patient-Centric Benefit-Risk Evaluation. Toshimitsu Hamasaki, The George Washington University Biostatistics Center
Typical clinical trial analyses focus on comparing interventions for each efficacy and safety outcome. While these analyses estimate outcome-specific effects and combine marginal effects for benefit-risk assessments, they often overlook associations between outcomes, face challenges from competing risks, and fail to account for the cumulative impact of multiple outcomes on individual patients. Additionally, differing analysis populations for efficacy and safety complicate the applicability of benefit-risk analyses.
We can address these limitations through patient-centricity by correcting our arithmetic and “using outcomes to analyze patients rather than patients to analyze outcomes.” However, to obtain the most informative answers for clinical practice, we prioritize: robustness through the avoidance of reliance upon modeling assumptions for validity; objectivity by avoiding subjective beliefs; theory for error control consistent with the evidentiary standard for confirmatory evidence; clearly defined estimands and populations from which to estimate parameters; best practices for composite endpoints including integrated analyses of components; best practices for benefit:risk / multi-endpoint analyses to aid comprehensive assessment including analyses based on the absolute (vs. relative) risk scale consistent with providing a common scale for interpretation of multiple outcomes simultaneously; recognition of dimensions of treatment contrast including rank-based and grade-based analyses; best practices for ordinal patient-centric outcomes including cumulative analyses; intuitive interpretation; and sound technical fundamentals.
The Desirability Of Outcome Ranking (DOOR) methodology has been developed to enhance patient-centric benefit-risk evaluation in clinical trials. It allows for a more informative comparison of treatment risks and benefits. Given its complexity, thorough and careful analyses are vital. This talk presents a comprehensive statistical analysis plan for implementing DOOR in research studies, illustrating its components with examples, and addressing design issues in clinical trials utilizing this methodology.
SP22 – PLANNING FOR HETEROGENEOUS TREATMENT EFFECTS: ENRICHMENT FOR TREATMENT-SENSITIVE PATIENT POPULATIONS
In the era of precision medicine, understanding the wide-ranging responses of patient subgroups to interventions is critical for improving outcomes and enhancing clinical care. This session will address heterogeneous treatment effects (HTE), that is, variations in treatment efficacy across a patient population, and discuss adaptive clinical trials designed to respond in real time to emerging evidence of HTE. Enrichment designs have emerged as a key strategy for reacting to HTE, allowing for refined inclusion criteria and identifying patients most likely to benefit from specific therapies. By focusing on treatment-responsive subgroups, these designs enhance trial efficiency and increase the likelihood of successful results, paving the way for more targeted and effective interventions in clinical practice.
In the first talk, Dr. Paulon will explore the application of enrichment strategies by examining two clinical trials of early minimally invasive surgical removal of intracerebral hemorrhage (ICH). The first trial discussed, ENRICH, is a recently completed study that showed significant benefit in a pre-specified subgroup of the patient population based on the location of the hemorrhage. The second trial, REACH, seeks to further investigate if hemorrhage size is a meaningful variable in the subgroup where ENRICH did not demonstrate functional benefit.
In the second talk, Dr. Lawler will discuss the clinical motivation and trial design strategy for handling HTE in ATTACC-CAP, a platform trial designed to investigate the effect of antithrombotic therapy for patients with community-acquired pneumonia. The trial can adaptively modify entry criteria and reach conclusions in pre-defined patient risk groups at interim analyses. Risk groups are defined by the combination of several variables, including severity of illness, patient characteristics, and other biomarkers.
In the third talk, Dr. Elm will introduce the StrokeNet Thrombectomy Endovascular Platform (STEP) trial and discuss the anticipation of HTE in the endovascular therapy (EVT) indication expansion domain. Past trials have established EVT as a highly effective treatment for acute ischemic stroke patients in a relatively narrow range of baseline characteristics. It is probable that additional stroke patients benefit from EVT. This trial aims to expand the boundaries of indication for EVT by learning the differences between patients who are responsive and non-responsive to treatment through a changepoint model.
The discussant, Dr. Saville, will provide a broad discussion of clinical trials that plan for HTE. He will discuss similarities and differences in the trials presented during the session, including benefits and challenges associated with such designs. He will discuss the process and key decisions required in the design of an enrichment trial, and highlight the importance of clinical-statistical collaboration.
SP24 – SHAPING EQUITABLE ACCESS TO CANCER CLINICAL TRIALS THROUGH AI AND NAVIGATION
Clinical trial enrollment remains a critical challenge in advancing novel therapeutics. Despite efforts to improve participation, fewer than 7% of adult cancer patients participate in cancer treatment trials, and underrepresented groups, such as racial and ethnic minorities, continue to face barriers to access. The increasing complexity of trial protocols and the narrowing of eligibility criteria, designed to safeguard patient safety and improve treatment specificity, further restrict the pool of qualified participants. This trend exacerbates low enrollment rates, which are a significant cause of trial failure: nearly 20% of trials fail to meet accrual targets, leading to premature termination. Recruitment costs now account for 25-30% of total trial budgets, and slow accrual not only delays therapeutic innovation but also magnifies these costs. Further, disparities exist in the participation of underrepresented groups in clinical trials, with Black, Hispanic, and rural populations often being under-enrolled relative to their disease burden. Some data suggest that providers are less likely to offer clinical trials to underrepresented groups, a factor that may contribute to persistently low participation rates among these groups. Research has shown that implicit biases, as well as assumptions about patient interest, understanding, and logistical challenges (e.g., transportation or financial concerns), often lead to fewer trial opportunities being extended to these populations. This reduced trial access further widens health disparities and restricts these patients from benefiting from novel, potentially life-saving therapies. Addressing these biases and barriers through proactive recruitment strategies is essential for achieving more inclusive, equitable, and scientifically valid trial results.
Recent work highlights the urgent need for systematic approaches to trial screening, such as pre-screening patient records to identify eligible candidates early in the recruitment process. Recent studies have shown that novel recruitment strategies, such as dedicated patient pre-screening programs (Clinical Trial Navigators), lay navigation, and the use of technology-driven solutions like Artificial Intelligence (AI) and machine learning, have the potential to significantly improve recruitment efficiency. For instance, proactive screening of patient records, which involves systematically reviewing medical records and genetic data to identify trial candidates, has been proposed as a means of expanding access to trials, particularly for underrepresented populations. Further, recent advances in large language models (LLMs) enable cost-effective automation of patient-trial matching in the real-world setting. The implementation of pre-screening programs is aligned with recent calls to address the “eligibility bottleneck” in clinical trials, which limits patient access and reduces the generalizability of trial findings. AI-driven solutions, such as LLMs, have demonstrated efficacy in automating trial matching, thereby reducing the manual workload on clinical staff and improving patient diversity in trials. Advanced AI models and LLMs offer a promising solution by automating parts of the screening process, allowing for the rapid identification of potential candidates based on complex eligibility criteria. Studies have shown that integrating pre-screening and AI-driven tools can significantly reduce manual workload for clinical staff, improve patient recruitment, and mitigate institutional barriers and implicit biases that hinder diverse participation.
These approaches have the potential not only to increase enrollment rates but also to reduce the time and financial costs associated with recruitment, making trials more accessible and sustainable in the long term. The panelists will discuss real-world examples of three novel approaches to trial recruitment: 1) pre-screening clinical trial navigators, 2) lay navigation, and 3) AI/LLM. The discussion will include how dedicated screening resources, lay navigators, and AI-driven tools have been implemented to improve patient recruitment in oncology trials, reduce manual workload for clinical staff, improve patient trial access, and reduce institutional barriers and implicit bias. The discussion will address how each recruitment strategy can be deployed to aid investigators in identifying the most appropriate patients for clinical trials. The panelists will discuss the utility of each method (pre-screening, navigation, AI/LLM), the challenge of underrepresentation of diverse populations in clinical trials, and how dedicated screening and navigation resources and AI can help identify and screen a more inclusive patient cohort. These solutions represent a shift in trial recruitment practices, addressing the urgent need to improve participation rates and providing a promising direction for expanding access, reducing barriers to participation, and paving the way to more inclusive and equitable clinical trials.
SP25 – OVERCOMING CHALLENGES IN RARE CANCERS: LEVERAGING REGISTRY DATA AND INNOVATIVE TRIAL DESIGNS
This session will explore a groundbreaking platform trial, the MASTER KEY Project, which focuses on rare cancers and rare molecular fractions, areas where conducting randomized controlled trials is often unfeasible due to limited patient populations. By showcasing real-world examples of regulatory approval based on single-arm trials that use registry data to expand treatment indications, the session will highlight operational strategies and innovative statistical designs that enhance the credibility and impact of these studies.
Talk 1: MASTER KEY Registry.
The MASTER KEY Project is a platform trial comprising a registry and multiple sub-studies, involving participation from eight Asian countries. The registry, with over 5,000 patients diagnosed with rare cancers, plays a critical role in providing historical control data for regulatory application of sub-studies. While most registries typically consist of only clinical data, the strength of the MASTER KEY Registry lies in its inclusion of both clinical and biomarker data. This comprehensive approach allows for more precise extraction of control data, even in clinical trials where specific biomarkers define patient populations. To further enhance the biomarker data, a system was established to centrally collect samples from across Asia, conduct next-generation sequencing analysis, and return the results to participating institutions. The registry encourages collaboration between pharmaceutical companies and academia, with 12 companies actively involved, along with strong support from patient advocacy groups.
Talk 2: MASTER KEY Sub-studies.
To date, 31 sub-studies have been conducted under the MASTER KEY Registry, focusing on rare cancers and molecular fractions. Many of these are single-arm trials with response rates as the primary endpoint, using registry data to extract control group information, thereby improving the likelihood of regulatory approval. Furthermore, to efficiently enroll patients from rare populations, two recent sub-studies within the MASTER KEY Project have implemented a fully decentralized clinical trial system. This allows patients to participate and be enrolled remotely, without the need to visit the clinical trial site. This session will delve into case studies of these sub-studies, demonstrating how quality assurance is maintained to ensure successful regulatory submissions.
Talk 3: Statistical Innovations for Rare Populations.
As highlighted by the FDA’s Complex Innovative Trial Design framework, there is a growing momentum for the use of complex adaptive, Bayesian, and other novel clinical trial designs. In the MASTER KEY Project, several phase 2 and basket trials utilizing Bayesian methods have been conducted, gradually accumulating practical know-how. This session will introduce new clinical trial designs planned in investigator-initiated studies.
After the presentation, we will open the floor for questions and discussions from the audience and encourage participants to share their thoughts on the session topic. We will conclude by summarizing key takeaways from the session and highlighting the broader implications of registry data utilization and innovative trial designs for treatment development in rare diseases.
SP26 – ADVANCED NOVEL RANDOMIZATION IMPLEMENTATION IN REDCAP FOR CLINICAL TRIALS
In clinical trials, randomization is crucial for ensuring unbiased treatment allocation and the integrity of results. With the increasing complexity of clinical designs, there is a pressing need for robust and adaptable randomization methods. This session aims to explore advanced implementations of randomization techniques within the Research Electronic Data Capture (REDCap) system. We will present innovative strategies to enhance randomization and stratification processes, focusing on minimal sufficient balance, adaptive randomized designs, and upcoming new randomization features in REDCap. Traditional approaches to randomization often fall short in accommodating the intricacies of advanced clinical trials. REDCap, a secure web application for data collection, provides a built-in framework for directly implementing less intricate randomization techniques such as stratified, block, and stratified-block randomization. However, these randomization strategies do not reflect state-of-the-art randomization methodologies that have been implemented in clinical trials more often in the past decade such as covariate-adaptive and response-adaptive randomization. Though REDCap cannot yet directly implement advanced randomization techniques, the REDCap framework is flexible enough to allow these methodologies to be implemented indirectly. As such, this session will empower researchers with access to REDCap to freely implement state-of-the-art randomization techniques. This session aligns with the conference theme, “Shaping the Future: The Right Questions, Robust Answers,” by addressing the critical questions surrounding randomization methodologies and their practical application in clinical research. In this session, we will illustrate three non-traditional randomization techniques and how they can be used in REDCap using novel techniques developed through active clinical trials.
Talk 1: Overview of REDCap Randomization and Introduction of Covariate-Adaptive Randomized Techniques.
This talk will introduce basic randomization capabilities in REDCap and provide a high-level overview of the menu of randomization methods from which study teams may choose for implementation in their trials, focusing on pros and cons of each and general logistical considerations when deciding on a randomization method. We will begin with a demonstration of the basic randomization process currently available in REDCap. We will next shift our focus toward advanced randomization techniques such as covariate-adaptive randomization. Adaptive randomization algorithms, many falling under the heading of “minimization,” can improve baseline variable balance across study arms and increase statistical power in randomized controlled trials. However, study-specific characteristics—study design, intervention type, clinical context, blinding, and data collection procedures—all play a key role in the ease (or lack thereof) in implementing these complex algorithms. As such, we will: (1) illustrate situations that would or would not merit use of a covariate-adaptive randomization algorithm, and (2) provide insight for incorporating adaptive randomization algorithms into complex study workflows within the REDCap framework.
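To make the idea of minimization concrete, the following is a minimal sketch of one classical variant (Pocock-Simon-style marginal balancing with a biased coin). It is not the REDCap implementation discussed in the talk; the function name, the 0.8 biased-coin probability, and the use of the range of marginal counts as the imbalance measure are illustrative assumptions.

```python
import random


def minimization_assign(new_patient, enrolled, arms=("A", "B"), p_best=0.8, rng=random):
    """Assign an arm to new_patient by marginal minimization.

    new_patient: dict mapping factor name to level.
    enrolled: list of (covariate dict, assigned arm) for prior participants.
    Returns (assigned arm, imbalance score per candidate arm).
    """
    imbalance = {}
    for candidate in arms:
        total = 0
        for factor, level in new_patient.items():
            # Count prior participants sharing this factor level, by arm.
            counts = {arm: 0 for arm in arms}
            for covariates, arm in enrolled:
                if covariates.get(factor) == level:
                    counts[arm] += 1
            counts[candidate] += 1  # hypothetically place the new patient here
            total += max(counts.values()) - min(counts.values())
        imbalance[candidate] = total

    best = min(imbalance.values())
    preferred = [arm for arm in arms if imbalance[arm] == best]
    if len(preferred) == len(arms):
        # All arms are equally balanced: plain random assignment.
        return rng.choice(list(arms)), imbalance
    if rng.random() < p_best:
        return rng.choice(preferred), imbalance
    others = [arm for arm in arms if arm not in preferred]
    return rng.choice(others), imbalance


enrolled = [
    ({"sex": "F", "site": "1"}, "A"),
    ({"sex": "F", "site": "2"}, "A"),
    ({"sex": "M", "site": "1"}, "B"),
]
arm, scores = minimization_assign({"sex": "F", "site": "1"}, enrolled, p_best=1.0)
```

Setting `p_best=1.0` makes the assignment deterministic whenever one arm is preferred, which is convenient for checking the logic; in practice a value below 1 keeps the allocation unpredictable.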
Talk 2: Minimal Sufficient Balance.
This talk will focus on the methodology of minimal sufficient balance (MSB), a covariate-adaptive randomization technique designed to ensure that treatment groups are comparable across key baseline characteristics. We will discuss the principles of minimal sufficient balance and its importance in achieving unbiased results in clinical trials. We will also provide a step-by-step demonstration of how to adapt and implement this technique for use in REDCap, including practical examples and coding strategies. MSB was intended to be used with proprietary software for covariate-adaptive randomized trials, but with some modifications we were able to adapt MSB for use in REDCap while maintaining the integrity of the methodology and implementing it with a high degree of fidelity. The session will highlight firsthand experience where MSB has been successfully adapted and implemented in an active clinical trial, showcasing the innovative technique and how it can improve study design and outcomes.
Talk 3: Response-Adaptive Randomization.
This talk introduces the methodology of response-adaptive randomization (RAR), which adjusts future randomization probabilities based on previously collected outcome data. We will cover the principles of allocation targets and explore various methods for modifying allocations using both observed and unobserved outcome data. Since interim RAR analyses are often conducted outside of the electronic data capture system, we will focus the talk on how to set up a study in REDCap before enrollment begins to accommodate potential randomization allocation changes, using an ongoing double-blind Bayesian adaptive clinical trial as an example.
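One common way to turn accumulated outcome data into updated allocation probabilities is sketched below. It is a generic Bayesian illustration, not the design of the trial discussed in the talk: it assumes binary outcomes, independent Beta(1, 1) priors, a Monte Carlo estimate of the probability that each arm is best, a tempering exponent, and a minimum allocation floor. All names and default values are hypothetical.

```python
import random


def prob_each_arm_best(successes, failures, draws=10000, seed=0):
    """Monte Carlo estimate of P(arm has the highest response rate)."""
    rng = random.Random(seed)
    best_counts = [0] * len(successes)
    for _ in range(draws):
        # One posterior draw per arm from Beta(1 + successes, 1 + failures).
        samples = [rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)]
        best_counts[samples.index(max(samples))] += 1
    return [count / draws for count in best_counts]


def rar_allocation(successes, failures, power=0.5, floor=0.1, draws=10000, seed=0):
    """Allocation probabilities proportional to P(best)**power, with a floor."""
    p_best = prob_each_arm_best(successes, failures, draws, seed)
    weights = [p ** power for p in p_best]
    total = sum(weights)
    k = len(weights)
    # Mix with a uniform floor so no arm's allocation drops below `floor`.
    return [floor + (1 - k * floor) * w / total for w in weights]


# Toy interim data: arm 0 has 15/20 responders, arm 1 has 5/20.
allocation = rar_allocation(successes=[15, 5], failures=[5, 15])
```

The resulting probabilities would then be entered as the new allocation ratio; how that update is carried into the data capture system is study-specific.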
Discussant: Dr. Matt Shotwell will reflect on the three presentations and share his experience with implementing an adaptive clinical trial platform in REDCap. In addition, he will discuss forthcoming new randomization features in REDCap.
This session will provide attendees with ways to use REDCap to implement advanced randomization techniques that can enhance the integrity and efficacy of their clinical trials. By leveraging innovative strategies to enhance the capabilities of REDCap, researchers can address pressing questions in clinical design, data collection, and analysis. The integration of minimal sufficient balance and adaptive randomized designs aligns with the overarching theme of shaping the future of clinical trials through innovative practices.
SP27 – ENHANCING KEY ENDPOINT EVALUATION AND MONITORING WITH AI/ML AND RISK-BASED STRATEGIES
In clinical trials, ensuring the quality and validity of data for downstream analysis and results is paramount, which necessitates thorough data evaluation and monitoring, especially for key efficacy endpoints. This typically involves collaboration among multiple personnel and the use of edit checks and manual queries during data collection. Edit checks consist of straightforward schemes programmed into relational databases, though they lack the capacity to assess data intelligently. In contrast, manual queries are initiated by data managers who manually scrutinize the collected data, identifying discrepancies needing clarification or correction. Manual queries pose significant challenges, particularly when dealing with large-scale data in late-phase clinical trials. Moreover, they are reactive rather than predictive, meaning they address issues after they arise rather than preemptively preventing errors. Targeted monitoring, which aims for real-time remediation of potential errors based on critical risk assessments, is appealing because it uses key risk indicators and statistical monitoring to identify potential issues or anomalies in the data. However, the available tools for risk-based monitoring (e.g., Target e*CRF) primarily concentrate on overseeing and managing data entry errors and alterations and are descriptive in nature. Similarly, the available tools for statistical monitoring are mostly descriptive, which may not serve the purpose well. Advances in AI/ML provide powerful techniques for feature/subgroup characterization and pattern recognition, which can potentially be used to identify anomalous patterns and monitor clinical trial data for single endpoints, multiple endpoints/multi-modal data collectively, or temporal data. This session aims to bridge the gap between clinical trials and advances in AI/ML and to advocate the adaptation of AI/ML methods to evaluation and monitoring strategies.
Talk 1: Leveraging AI-assisted Central Statistical Monitoring to Elevate Clinical Trial Oversight and Data Quality. Jingjing Ye, BeOne Medicines
Talk 2: A one-shot deep learning framework for psoriasis area and severity prediction. Li Wang, AbbVie
Talk 3: Open-Source Risk-based Quality Management (RBQM) Software for Good Statistical Monitoring of Critical Clinical Trial Data. Xinlei (Ivan) Mi
TARGETED SESSION
TS2 – FELLOWS SESSION: THE PROMISE AND POTENTIAL PITFALLS OF ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING (AIML) IN CLINICAL TRIALS. REFLECTIONS FROM SCT FELLOWS
It seems impossible to escape the barrage of claims for how artificial intelligence and machine learning (AIML) can change our lives or businesses for the better. Some of the claims for AIML in clinical trials include: (1) Design and analysis: AIML can assist in optimizing trial designs, analyzing complex datasets, and providing objective imaging diagnostics; (2) Efficiency: AIML can facilitate the identification of suitable patient populations and clinical sites through advanced data analytics and predictive modeling, thereby improving recruitment strategies; (3) Safety: AIML can streamline the monitoring of trial progress and outcomes, enabling real-time data analysis and early detection of adverse effects or patient non-adherence; (4) Patient engagement: AIML can enhance participant experience through personalized communication and support. But the use of AIML in clinical trials presents potential pitfalls including: (1) Data quality and bias; (2) Generalizability; (3) Regulatory challenges; (4) Overreliance on technology; (5) Integration with existing systems; (6) Patient privacy; (7) Inequalities in trial access and outcomes. Addressing these pitfalls will be essential for the successful integration of AIML in clinical trials to ensure that its benefits are realized without compromising patient safety or scientific integrity. Fellows of SCT and invited speakers/panelists in this session will share their experiences and reflections and will invite the audience to share their own experiences.
CONTRIBUTED PRESENTATIONS
SESSION 1
CP1-1 – POWER CALCULATION FOR GROUP SEQUENTIAL CLUSTER RANDOMIZED TRIALS WITH CONTINUOUS OR BINARY OUTCOMES
Well-planned interim analyses provide researchers with early data, allowing for timely decisions about a trial’s continuation, modification, or termination to enhance patient safety, reduce costs, and expedite access to effective treatments. Group sequential methods enable investigators to stop a clinical trial early when there is compelling evidence for efficacy or futility while preserving the trial’s statistical integrity. Although group sequential methods are well developed for individually randomized trials, established methods for cluster randomized trials (CRTs) are limited. In CRTs, groups or clusters (such as communities, schools, or clinics) rather than individual participants are randomly assigned to treatment or control conditions, making this design especially useful in settings where individual randomization is impractical or when the intervention is delivered at a group level. Because clusters are defined based on some shared characteristics or circumstances, outcomes for individuals within the same cluster tend to be more similar than those in different clusters. Group sequential trials require an inflated maximum sample size compared to equivalent fixed-sample designs to account for the possibility of early stopping. However, limited guidance exists on designing and powering a group sequential CRT, which requires accounting for correlated outcomes and repeated interim analyses of data accumulating at both the cluster and individual levels.
To this end, we develop sample size calculation methods for group sequential CRTs with continuous or binary endpoints. Under designs that recruit by cluster or individuals within clusters, we first show that differences between the corresponding sequentially calculated test statistics are asymptotically independent. We then employ an error spending approach to determine the maximum number of clusters or cluster size of a trial. Our method encompasses early stopping for combinations of efficacy and binding or non-binding futility. In simulation studies, we find that group sequential CRTs powered using our sample size calculations achieve the specified power across a range of trial design specifications; these results hold even when both clusters and individual participants enter the trial at varying levels over time. We also provide guidance on how and when to schedule interim analyses to maximize the efficiency of a group sequential CRT. We then apply our approach to planning interim analyses in the MEDUSA study, a CRT evaluating the effect of a multifaceted antimicrobial therapy initiation program on sepsis survival.
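Two standard ingredients referred to above can be sketched briefly: the design effect that inflates an individually randomized sample size for clustering, and an error spending function that sets how much type I error may be used by each interim analysis. This is not the authors' method; it assumes equal cluster sizes, a two-arm comparison of means with a known standard deviation, and a Lan-DeMets O'Brien-Fleming-type spending function, and it stops short of the additional inflation needed for the group sequential maximum sample size. Function names and inputs are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

_norm = NormalDist()


def design_effect(cluster_size, icc):
    """Variance inflation for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def clusters_per_arm_fixed(delta, sd, cluster_size, icc, alpha=0.05, power=0.9):
    """Clusters per arm for a fixed (non-sequential) two-arm CRT of means."""
    z_alpha = _norm.inv_cdf(1 - alpha / 2)
    z_beta = _norm.inv_cdf(power)
    # Individually randomized sample size per arm, before clustering.
    n_individual = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2
    return ceil(n_individual * design_effect(cluster_size, icc) / cluster_size)


def obf_type_spending(info_fraction, alpha=0.05):
    """Cumulative two-sided alpha spent at a given information fraction in (0, 1]."""
    z = _norm.inv_cdf(1 - alpha / 2)
    return 2 * (1 - _norm.cdf(z / sqrt(info_fraction)))


n_clusters = clusters_per_arm_fixed(delta=0.3, sd=1.0, cluster_size=20, icc=0.05)
spent = [obf_type_spending(t) for t in (0.5, 0.75, 1.0)]
```

The spending function uses very little alpha at early looks and all of it by the final analysis, which is why early stopping boundaries of this type are conservative.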
CP1-2 – EVALUATING INFORMATIVE CLUSTER SIZE IN CLUSTER-RANDOMIZED TRIALS
In cluster-randomized trials, two popular estimands of interest are the average treatment effect among participants (p-ATE) and the cluster average treatment effect (c-ATE). The p-ATE is defined as an average of the treatment effects across all individual participants, while the c-ATE first averages the treatment effects within each cluster before averaging across clusters. Both quantities are often of interest for a cluster-level intervention. The p-ATE may be different from the c-ATE when informative cluster size is present, i.e., when treatment effects or participant outcomes depend on cluster size. For example, large hospitals may exhibit better or worse average outcomes than small hospitals in different settings. In such scenarios, mixed-effects models and generalized estimating equations (GEEs) with exchangeable correlation structure (which constituted a majority of cluster-randomized trial analyses in a recent systematic review) are biased for both the p-ATE and c-ATE estimands, and GEEs with an independence correlation structure or analyses of cluster-level summaries are recommended instead in practice. However, when cluster size is non-informative, mixed-effects models and GEEs with exchangeable correlation structure can provide unbiased estimation and notable efficiency gains over other methods. Thus, hypothesis tests for informative cluster size would be useful to formally assess the validity of this key assumption. In this work, we develop model-based, model-assisted, and randomization-based tests for informative cluster size in cluster-randomized trials. We construct simulation studies to examine the operating characteristics of these tests, show they have appropriate Type I error control and meaningful power, and contrast them to existing tests used in the observational study setting. The proposed model-based test has high power but is sensitive to model misspecification. 
The proposed model-assisted and randomization-based tests are less powerful in general, but they do not require correctly specifying the mechanism of informative cluster size in a model to have valid Type I error. We further show how covariate adjustment can improve the statistical power of these approaches. The proposed tests are applied to data from a recent cluster-randomized trial, and practical recommendations for using these tests are discussed.
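The difference between the two estimands can be seen in a toy calculation. The sketch below uses made-up individual-level treatment effects, which are never observed in practice, purely to show how the p-ATE and c-ATE diverge when effects depend on cluster size; the function names are hypothetical.

```python
def p_ate(clusters):
    """Participant-average effect: average over all individuals."""
    effects = [effect for cluster in clusters for effect in cluster]
    return sum(effects) / len(effects)


def c_ate(clusters):
    """Cluster-average effect: average within clusters, then across clusters."""
    cluster_means = [sum(cluster) / len(cluster) for cluster in clusters]
    return sum(cluster_means) / len(cluster_means)


# Toy example: a large cluster with a small effect, a small cluster with a large effect.
clusters = [[1.0] * 8, [3.0] * 2]
participant_average = p_ate(clusters)
cluster_average = c_ate(clusters)
```

Here the p-ATE is 1.4 and the c-ATE is 2.0; if both clusters had the same effect, the two estimands would coincide regardless of cluster size.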
CP1-4 – CURRENT PRACTICE AROUND THE USE OF ESTIMANDS IN CLUSTER RANDOMISED TRIALS, AND THE IMPACT OF INFORMATIVE CLUSTER SIZE ON INFERENCES
Potential impact and relevance: Despite growing recognition around the importance of estimands, they are not widely used in reports of CRTs. We found that choice of estimand and estimator can have large impacts on interpretation of results, suggesting that guidance to increase uptake of estimands in CRTs is urgently needed.
SESSION 2
CP2-1 – COMPLEXITIES IN TREATMENT-EMERGENT ADVERSE EVENT ANALYSES DURING INTERIM MONITORING
In our experience producing interim clinical trial reports for data monitoring committee (DMC) review, each trial uses a tailored definition for treatment-emergent adverse events (TEAEs). These definitions seem to be driven by clinical considerations rather than epistemological reasoning. In all cases it remains critical that DMCs are able to effectively and efficiently review accruing data to ensure the safety of all patients in the trials. The objectives of this discussion are to explore the reasons why TEAE analyses increase the complexity of assessing the safety profile of a treatment during interim monitoring by a DMC compared to analyses using all post-randomization adverse events (AEs), and to consider the common practice of excluding AEs that occur between randomization and first treatment administration. We will review TEAE definitions that have appeared in over 40 industry-sponsored clinical trials supported by the University of Wisconsin-Madison’s Statistical Data Analysis Center, an independent analysis group. These trials span many disease areas from 2008 to the present. This review will provide a detailed summary of the trends and variability of TEAE definitions. Then we will discuss the challenges these various definitions have posed for interim monitoring by DMCs, focusing on data integrity and interpretability of treatment safety profiles. We will present thoughts on the contrasting analyses between all post-randomization adverse events and TEAEs. Lastly, we will facilitate a discussion on handling AEs that occur in the period between randomization and the initiation of treatment.
CP2-2 – USING REDCAP TO FACILITATE ADJUDICATION OF ADVERSE EVENTS
Organizing and tracking adverse events (AEs) that need adjudication can be a challenge for clinical trials. We aimed to streamline the packaging, review, finalization, and tracking of this process in the REHAB-HFpEF study. REHAB-HFpEF is a phase 3 randomized trial testing a novel, multi-domain physical rehabilitation intervention in patients aged >60 years with heart failure with preserved ejection fraction (HFpEF) who are hospitalized for acute decompensated heart failure. The study’s goal is to enroll 880 patients at 20 clinical centers to test the hypothesis that the intervention will reduce the rate of combined all-cause rehospitalizations and mortality at 6 months of follow-up versus usual care attention control. REHAB-HFpEF uses REDCap as the main data collection tool, including repeating instruments for AE case report forms. When a clinical site coordinator (SC) enters an AE form into REDCap, a project manager at the Data Coordinating Center (DCC) is alerted and reviews the form in detail, making sure all needed components are completed and accurate, including uploading into the form any required deidentified medical records and documentation necessary for adjudication (e.g., discharge summary, lab results). Once the DCC project manager has finalized the form, an application programming interface (API) is used to create a “package” of the AE information. The API extracts the finalized AE form and all supplementary documentation and automatically removes fields that should be blinded to the adjudicators. This package is then uploaded into a secure file sharing account (SFSA), which is accessible to the adjudication team. The adjudication team receives a weekly email indicating the unique AE packages that have been uploaded to the SFSA and are ready for review. To help facilitate adjudication, a secondary REDCap project that employs a double data entry feature is used by the adjudication team.
In this process, two adjudication reviewers independently answer a series of questions about each AE package. A third reviewer compares any incongruent responses between reviewers 1 and 2 and then determines the final responses. Once the third reviewer has completed this task, a “completed adjudication” indicator variable is uploaded back into the main REDCap project via an API. Several indicator variables are used within the AE form to track the status of the adjudication process. This includes 1) SC finalization, 2) DCC finalization, 3) upload to the SFSA, 4) adjudication complete. Dynamic reports are available in REDCap and bi-weekly reports are e-mailed to the adjudication study team to monitor progress, including completion status for each of the three adjudicators. Currently we are about halfway through study recruitment and have sent 500 AEs for adjudication, with 412 completed. Custom adjudication systems can be developed which provide extensive features; however, this can be time and cost prohibitive for some studies. The automated process used for REHAB-HFpEF was built in REDCap and has allowed us to easily compile and share event information, track the status of each AE throughout the adjudication process, and streamline communication among the SC, DCC, and adjudication team.
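As a rough illustration of the packaging and tracking steps described in this abstract, the following Python sketch removes blinded fields from an AE record and reports how far the record has progressed through the four status indicators. The field names (`treatment_arm`, `site_id`, `sc_finalized`, and so on) are hypothetical; the abstract does not list the actual REDCap variables, and no REDCap API calls are shown.

```python
from typing import Dict, List

# Hypothetical names of fields that adjudicators must not see.
BLINDED_FIELDS = {"treatment_arm", "site_id"}

# Hypothetical indicator variables, in the order given in the abstract:
# 1) SC finalization, 2) DCC finalization, 3) upload to the SFSA,
# 4) adjudication complete.
STATUS_STEPS: List[str] = [
    "sc_finalized",
    "dcc_finalized",
    "uploaded_to_sfsa",
    "adjudication_complete",
]


def build_package(ae_record: Dict[str, object]) -> Dict[str, object]:
    """Return a copy of the AE record without the blinded fields."""
    return {k: v for k, v in ae_record.items() if k not in BLINDED_FIELDS}


def current_status(ae_record: Dict[str, object]) -> str:
    """Return the last step completed in sequence, or 'not_started'."""
    status = "not_started"
    for step in STATUS_STEPS:
        if ae_record.get(step) == 1:
            status = step
        else:
            break
    return status


record = {
    "ae_id": 7,
    "ae_term": "fall",
    "treatment_arm": "A",
    "site_id": 3,
    "sc_finalized": 1,
    "dcc_finalized": 1,
    "uploaded_to_sfsa": 0,
}
package = build_package(record)
status = current_status(record)
```

In a real project the blinded field list and the status variables would come from the study's data dictionary rather than being hard-coded.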
CP2-3 – ADVANCED GENE THERAPY RESEARCH - MORE SMALL-MOLECULE OR MORE-TRANSPLANT?
This presentation explores questions of whether the existing structure of early-stage (Phase 1-2) clinical research with gene therapies (GT) warrants more alignment with research in solid organ transplants versus remaining in the traditional small molecule drug development rubric. GT trials present unique challenges that differentiate them from conventional drug trials, and research participants (and parents) may benefit from examining the prospects of benefits and burdens of their participation in research differently. Among the most challenging issues with early GT trials are the choice(s) of the starting dose, identifying relevant and rapidly assessable safety criteria for expanding enrollment by dose, and determination of dose-escalation increments since participants cannot be re-dosed. This inability to fully discontinue participation once dosed, the need for immunosuppression, unknown long-term hazards and benefits, and the inability to modify the activity of the replaced or edited gene all engender the need to ensure that participants are fully informed of these additional burdens, of greater trial complexity and of significantly longer trial duration. Compared to small molecule trials, GT trials and organ transplants often also have far more strict and genetically determined eligibility criteria, greater uncertainties about the durability of the intervention’s benefit, and possible transition of participants to a state where direct benefit wanes but continued observation may benefit future patients. Similarly, prior participation in a GT trial or receipt of an approved GT is more likely to limit options for participation in future investigational trials of all kinds. 
Finally, the technical complexities of GTs can be difficult to explain via the traditional small molecule-based informed consent process. This raises the question, particularly where pediatric assent is an ethical requirement, of whether the age at which pediatric assent is required should be adjusted upwards (e.g., to 14-16 years) for pediatric GT trials. Thus, engagement processes used in solid organ transplants for transplant-listed potential recipients may be more relevant than the typical small molecule development model in providing the necessary information on trial design, burden, and future implications of participating, including research and therapeutic options.
CP2-4 – DOSE OUTCOME USING STRATIFIED ESTIMATION WITH RANDOM FOREST METHOD (DOSE-RF): A NOVEL APPROACH TO NON-LINEAR DOSE-RESPONSE MODELLING IN COMPLEX INTERVENTIONS
SESSION 3
CP3-1 – UNDERSTANDING RECRUITMENT SUCCESS IN CANADIAN TRIALS: AN ANALYSIS OF TRIAL REGISTRY DATA
CP3-2 – RECRUITMENT STRATEGIES AND IMPACT IN A CANCER CLINICAL TRIAL
Low enrollment in clinical trials can lead to premature study closures, increased operational costs, and exhaustion of resources, and it reduces the generalizability of trial results, which limits scientific progress. Approximately 80% of trials fail to meet enrollment targets, which causes a loss of up to $8 million per day for drug development companies. The Targeted Agent and Profiling Utilization Registry (TAPUR) Study is a phase II, precision oncology, multi-basket clinical trial that evaluates the antitumor activity of FDA-approved drugs outside of their approved indication(s) in patients with advanced cancers with specific genomic alterations at over 265 clinical sites in the United States. As of November 13, 2024, there were a total of 91 open cohorts and 164 completed cohorts. During and after the COVID-19 pandemic, study enrollment decreased, declining by 4% from 2021 to 2022. Therefore, increased awareness of the TAPUR Study was needed to encourage both physician and oncology community engagement and participant recruitment. This effort included a multi-stakeholder approach to the multifaceted issue of low enrollment. In collaboration with patient advocacy organizations, targeted recruitment efforts were completed for seven cohorts that had not had any enrollments in the previous three months or were considered treatments for rare targets. This approach focused on reaching potential participants and/or their caregivers. Each patient advocacy organization was consulted on the best methods and platforms to reach their constituents, and templates were created to aid the drafting process. However, the study team also recognizes that clinicians and investigators are a crucial audience for improving enrollment in the TAPUR Study. Therefore, to further promote awareness, the first annual TAPUR Grand Rounds was held in 2023.
Community oncologists, Principal Investigators, TAPUR staff, and patient advocates were invited as panelists to present study-related topics from their unique perspectives to the target audience (e.g., the oncology community and oncology clinicians). This collaboration resulted in multiple marketing efforts, including educational and direct recruitment material such as educational blog posts, updates to internal clinical research search engines, mass email blasts, a social media campaign (Facebook, LinkedIn, X, Instagram), and website advertisements. In addition, the study team held bi-monthly TAPUR Study coordinator webinars, which provided a regular opportunity to highlight these cohorts and the Grand Rounds to clinical sites. A total of 43% of registrants attended TAPUR Grand Rounds, with the largest share of attendees (47%) self-identifying as patient advocates within the oncology community. As a result of these engagement strategies, the study had 14 additional enrollments across the seven prioritized cohorts, two of which were for rare cohorts, and overall enrollment increased by 4.5% within a period of seven months. This presentation will describe our experience in working with patient advocacy organizations and the effectiveness of an annual grand rounds presentation and social media campaigns as a multi-pronged approach to increasing participant enrollment in the TAPUR Study.
CP3-3 – A SYSTEMATIC ATTEMPT TO OVERCOME BARRIERS TO ACCRUAL ON RANDOMIZED TRIALS: LESSONS LEARNED FROM 15 YEARS OF CLINICALLY-INTEGRATED TRIALS AT A MAJOR CANCER CENTER
CP3-4 – SUPPORTING CENTRES TO BEST INTEGRATE AN RCT AND TO GAIN CONFIDENCE IN IT CAN OPTIMISE INFORMED DECISION-MAKING AND RECRUITMENT TO SURGICAL TRIALS: AN EXAMPLE FROM AN ORTHOPAEDIC SURGICAL TRIAL
SESSION 4
CP4-1 – THE IMPACT OF TREATMENT NON-ADHERENCE ON POWER AND SAMPLE SIZE IN CLINICAL TRIALS
In any clinical trial it is expected that some participants will not adhere to the treatment protocol. Despite this fact, the primary analysis is usually the intention-to-treat (ITT) analysis, where participants are analyzed according to the assigned treatment arm, regardless of treatment received. This results in a conservative estimate of the treatment effect and provides valuable insight into the effect of recommending a new treatment. Trialists are also often interested in estimating the true treatment effect for those participants who receive the treatment, and so a secondary as-treated (AT), per protocol (PP), or other adherence-adjusted analysis is often included in the analysis plan. Selection bias is a concern in such analyses, as unmeasured confounders could potentially affect both adherence and outcomes, thereby negating the protective effects of randomization in an AT or PP analysis. Newer adherence-adjusted methods have been developed and have been shown to produce treatment effect estimates that are less biased. These methods include inverse probability weighted per protocol analysis (IPW-PP) and instrumental variable (IV) approaches. While these methods reduce bias, they do so at the cost of higher variability and reduced power. Sample sizes are typically inflated to account for non-adherence, but the required inflation for the newer adherence-adjusted methods has not been carefully studied. The generally recommended inflation factor for an ITT analysis with a non-adherence rate of R is 1/(1-R)^2. This value can be extremely large for high non-adherence rates, and our goal in this work is to examine how much sample size inflation is actually needed for the various adherence-adjusted approaches. This presentation shows simulation results that examine the power, bias, and type 1 error rate when comparing ITT, AT, PP, IPW-PP, and IV analyses in a variety of conditions and under varying degrees of sample size inflation.
We show that when the rate of non-adherence is low, PP and IPW-PP analyses are minimally biased and retain nominal type 1 error rates while still maintaining good power, even when the sample size is not inflated. IV methods have less bias, but do not achieve the required power without an appropriate sample size inflation. When non-adherence increases, however, bias and type 1 error rates are much higher for AT and PP methods, especially when unmeasured confounders have a strong effect on adherence and outcomes. In this scenario, IV methods maintain type 1 error rate regardless of sample size but still require the full sample size inflation to achieve the required power. In all scenarios, ITT analyses maintained nominal type 1 error rates but lacked power to show a treatment effect. While IV methods have very low bias, they produce estimates that have high variability and wide confidence intervals, so appropriate sample size adjustments must be made to find statistically significant results. When treatment non-adherence can be measured, we recommend a comparison of ITT, PP, and IV results to examine the impact on the estimation of treatment effect.
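To make the ITT inflation factor 1/(1-R)^2 cited in this abstract concrete, the short Python sketch below computes the factor and the corresponding inflated sample size. The function names and the example inputs are illustrative assumptions; they are not taken from the authors' simulations.

```python
import math


def itt_inflation_factor(non_adherence_rate: float) -> float:
    """Inflation factor 1/(1-R)^2 for a non-adherence rate R in [0, 1)."""
    if not 0 <= non_adherence_rate < 1:
        raise ValueError("non_adherence_rate must be in [0, 1)")
    return 1.0 / (1.0 - non_adherence_rate) ** 2


def inflated_sample_size(base_n: int, non_adherence_rate: float) -> int:
    """Round the inflated sample size up to a whole participant."""
    return math.ceil(base_n * itt_inflation_factor(non_adherence_rate))


# With half of participants non-adherent, the factor is 1/0.5^2 = 4,
# so a trial planned for 100 participants would need 400.
factor_half = itt_inflation_factor(0.5)
n_half = inflated_sample_size(100, 0.5)
```

The quadratic growth is why the factor becomes very large at high non-adherence rates: 20% non-adherence gives 1/0.8^2 = 1.5625, whereas 50% already quadruples the sample size.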
SESSION 5
CP5-1 – ADAPTING THE QUINTET RECRUITMENT INTERVENTION TO OPTIMISE INFORMED CONSENT IN CLINICAL TRIALS IN INDIA: LESSONS FROM THE ORION-I FEASIBILITY STUDY
CP5-2 – REMOTE, CENTRALIZED MONITORING OF THE INFORMED CONSENT PROCESS IN MULTICENTER TRIALS
Among the many responsibilities of a Principal Investigator, protecting the rights and welfare of human research participants rises to the top of the list. One of the first major steps toward this goal is the informed consent process. SPARX3 is a large randomized clinical trial comprising 24 sites across the United States and Canada with a target enrollment of 370 drug-naïve individuals recently diagnosed with Parkinson’s Disease. During the development of the electronic data capture system (EDC), our goal was to develop a fully remote monitoring mechanism that would enable early identification, intervention, and prevention of problems during the informed consent process, which typically are identified only during periodic in-person site monitoring visits. For this reason, instead of simply confirming that informed consent (IC) had been obtained for a participant, our EDC requires sites to complete an Informed Consent Process (ICP) form and upload a scanned copy of the IC form. The ICP form was created using multiple choice questions targeting key elements required in the process of obtaining informed consent, such as: 1) who was present during the informed consent discussion; 2) confirmation that risks were presented; 3) confirmation that significant issues of concern to the participant were addressed; 4) the date that the participant signed the IC form; 5) the start and end time of the process to obtain informed consent; and 6) the signature of an individual responsible for the documentation, as indicated in the Delegation of Authorization (DoA) Form. Documentation of this process enhances the certainty that sites are following strict FDA guidelines, which are recommended by the IRB of record, and enables monitoring of 100% of ICs for the SPARX3 study. Once a month a monitor uses the ICP data elements to draw conclusions about the Informed Consent Process.
For example, the start and end time for the ICP indicates whether an appropriate amount of time was provided to review and answer questions regarding the study. In addition, the date and time of consent are compared to the first data collection procedure to ensure that research procedures were not conducted prior to obtaining IC. The IC form is checked for version control and expiration date. A few sites’ local regulations require a HIPAA agreement to be signed at the time of consent, and therefore it is expected to be uploaded together with the copy of the IC form. The site staff signature is compared to the DoA to ensure that only staff who are adequately trained and authorized obtain IC. In this presentation, we will highlight the process of identifying the elements to monitor the informed consent process, the monthly monitoring, issues identified during the monitoring, communication with sites, and the impact of this process on the protection of the rights and welfare of human subjects in the SPARX3 Trial. We believe similar practices should be implemented in multicenter clinical trials of any size, and in large-scale trials in particular.
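The monthly checks described in this abstract could be automated along the following lines. This Python sketch is only an illustration: the field names, the 15-minute minimum discussion time, and the issue labels are assumptions, not details of the SPARX3 EDC.

```python
from datetime import datetime, timedelta
from typing import Dict, List

# Hypothetical threshold for an "appropriate amount of time".
MIN_DISCUSSION = timedelta(minutes=15)


def check_consent(icp: Dict[str, object]) -> List[str]:
    """Return a list of issue labels found in one ICP form record."""
    issues = []
    # Was enough time provided to review the study and answer questions?
    if icp["process_end"] - icp["process_start"] < MIN_DISCUSSION:
        issues.append("short_discussion")
    # Were any research procedures done before consent was signed?
    if icp["first_procedure"] < icp["consent_signed"]:
        issues.append("procedure_before_consent")
    # Was the currently approved version of the IC form used?
    if icp["form_version"] != icp["approved_version"]:
        issues.append("wrong_form_version")
    # Is the person who obtained consent listed on the DoA?
    if icp["obtained_by"] not in icp["delegated_staff"]:
        issues.append("signer_not_delegated")
    return issues


example = {
    "process_start": datetime(2024, 3, 1, 9, 0),
    "process_end": datetime(2024, 3, 1, 9, 5),
    "consent_signed": datetime(2024, 3, 1, 9, 5),
    "first_procedure": datetime(2024, 3, 1, 8, 30),
    "form_version": "v2",
    "approved_version": "v3",
    "obtained_by": "coordinator_b",
    "delegated_staff": {"coordinator_a"},
}
found = check_consent(example)
```

A flagged record would still need human review; the point of the sketch is only that each monitoring rule maps to a simple comparison on the ICP data elements.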
CP5-4 – STAGED INFORMED CONSENT FOR TRIALS WITH USUAL CARE GROUPS: DEVELOPING GUIDANCE
SESSION 6
CP6-1 – ROLE OF WEARABLE TECHNOLOGY IN CLINICAL TRIALS: A SINGLE INSTITUTION EXPERIENCE
CP6-2 – STUDY WITHIN A TRIAL OF ELECTRONIC VERSUS PAPER-BASED PATIENT REPORTED OUTCOMES COLLECTION (SPRUCE) - PRIMARY OUTCOME AND PATIENT DEMOGRAPHICS
CP6-3 – USING A CENTRAL OUTCOMES CENTER TO REDUCE ATTRITION IN A LONGITUDINAL ED-BASED PEDIATRIC STUDY
CP6-4 – ASSESSING FREE-TEXT FIELDS THROUGH NATURAL LANGUAGE PROCESSING TO ENHANCE CRF DEVELOPMENT AND DATA QUALITY
The use of free-text fields in electronic case report forms (eCRFs) in clinical trials gives investigators flexibility in the collection of participant data but requires natural language processing (NLP) to analyze the resulting unstructured data. Without the use of machine learning techniques, the process of extracting meaningful insights is manual, laborious, and prone to confirmation bias. Many eCRFs use “Other” fields when collecting qualitative data, and these have an associated free-text field for further explanation. In the context of a longitudinal observational study, we use NLP to assess the free-text field associated with a qualitative “Other” field to determine whether the existing category options properly capture participant responses and whether additional classifications are needed. We will illustrate three natural language processing techniques used to assess the unstructured data: 1) removal of stop words, i.e., words that are important to the grammar of a sentence but do not add meaning; 2) lemmatization, i.e., reducing words to their primary form; and 3) tokenization of preprocessed text to break up phrases into smaller groups. We then created bigrams and trigrams of these word groups and used a term frequency-inverse document frequency (TF-IDF) algorithm to identify distinct and meaningful tokens. Visualizations of both bigrams and trigrams by date were investigated for pattern recognition of the word groups. We will illustrate the process required for the analysis of unstructured free-text fields. The output of the analysis suggested that the “Other” option was often chosen so that additional information could be added to the associated free-text field. To properly capture participant responses, we updated the eCRF to include additional qualitative options identified through text mining and to allow multiple selection rather than single selection of qualitative options.
NLP techniques such as the TF-IDF algorithm provide opportunities to explore underutilized qualitative responses in the scope of clinical trial data. The analysis of free-text fields allows investigators to extract quantitative metrics from unstructured data originating from free-text fields that could be used to review and validate workflows to ensure high quality eCRF data capture. NLP techniques can also be used to audit eCRF data for misclassifications and other data entry errors that can impact study analyses. Additionally, text mining can be used to report on participant feedback to further enhance clinical research procedures.
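The preprocessing and TF-IDF steps described in this abstract can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions: the stop-word list is a small hypothetical one, lemmatization is omitted because it normally relies on an external library, and the TF-IDF weighting shown (relative term frequency times the log of inverse document frequency) is one common variant rather than necessarily the one the authors used.

```python
import math
import re
from collections import Counter
from typing import Dict, List

# Small hypothetical stop-word list for illustration only.
STOP_WORDS = {"the", "a", "an", "to", "of", "and", "was", "is", "in", "for", "at"}


def tokenize(text: str) -> List[str]:
    """Lowercase, split into alphabetic tokens, and drop stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def bigrams(tokens: List[str]) -> List[str]:
    """Join each pair of adjacent tokens into one bigram string."""
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]


def tfidf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Score each term per document: (count / doc length) * log(N / df)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1
        scores.append(
            {t: (c / total) * math.log(n_docs / doc_freq[t]) for t, c in counts.items()}
        )
    return scores


# Hypothetical free-text responses from an "Other" field.
responses = ["Lost to follow up at the clinic", "Moved to a new clinic", "Lost to follow up"]
docs = [bigrams(tokenize(r)) for r in responses]
scores = tfidf(docs)
```

Bigrams that appear in every response receive a score of zero, so the highest-scoring bigrams are the ones that distinguish a response from the rest, which is what makes them candidates for new category options.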
SESSION 7
CP7-1 – REMOTE, BIVARIATE EXPERT ELICITATION TO DETERMINE THE PRIOR PROBABILITY DISTRIBUTION FOR A BAYESIAN NON-INFERIORITY MULTICENTER RANDOMIZED CONTROLLED TRIAL
Bayesian statistics are increasingly used in the design and analysis of clinical trials. A key element of a Bayesian clinical trial is the prior for the treatment effect, which encapsulates existing knowledge and uncertainty regarding treatment efficacy. Typically, a prior for a trial is drawn from previous studies or meta-analyses. However, in emerging research areas, such sources may not be available. Here, expert opinion can be employed to establish a prior, but it must be gathered systematically through elicitation. Traditional elicitation methods often require face-to-face interactions and extensive pre-elicitation training, making them potentially impractical and costly. In this study, we developed a remote, international, structured elicitation method to construct a joint (bivariate) prior distribution for a treatment effect (i.e., the difference between treatment and control groups). This method has been successfully applied to a pediatric croup non-inferiority trial that compares the efficacy of two doses of dexamethasone: 0.60 mg/kg as the active control and 0.15 mg/kg as the experimental treatment. The goal of this elicitation application is to develop a joint distribution representing the difference in the number of return visits to the emergency department (ED) for both doses of dexamethasone. We denote the distribution of the probability of a return visit to ED in 0.15 mg/kg dose group as f(P1) and the probability of a return visit to ED in 0.60mg/kg dose group as f(P2). A total of twelve emergency medicine physicians from Canada and the USA participated in our remote elicitation exercise. We developed an R Shiny application to assist with the elicitation and distribution fitting. The process was conducted in two stages. In the first stage, the experts were presented with two hypothetical clinical scenarios under two doses and were asked to provide their individual judgments to elicit Beta distributions for f(P1) and f(P2). 
After this initial assessment, the group had the opportunity to discuss their responses. In the second stage, experts were permitted to adjust their judgments based on insights gained from the group discussion, leading to revised marginal distributions of the probability of returning to the ED. Recognizing that individual judgments regarding return visits for high-dose and low-dose groups may be correlated, we aggregated the individual distributions using expert-specific joint (bivariate) distributions f(P1,P2) with latent effects. These bivariate distributions introduced expert-specific correlations between the responses for each dosage. Finally, the distribution of f(P2-P1) was derived from the joint distribution and was subsequently used to determine the sample size for the trial. Figure 1 displays individual expert opinions on the efficacy of each dose at survey rounds 1 and 2. The elicitation generated a final prior distribution centered at 6% (standard deviation: 6%) for the active control dose and 8% (standard deviation: 7%) for the experimental treatment dose (Figure 2). The aggregated prior distribution produced a sample size of 1700. This study demonstrates the feasibility of remotely eliciting bivariate distributions to design clinical trials. Reporting our elicitation process will support the use of elicitation in future clinical trials.
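As a small worked example of how a Beta distribution relates to a stated mean and standard deviation, the Python sketch below converts the reported prior summaries (6% with SD 6% and 8% with SD 7%) into Beta shape parameters by the method of moments. This is an assumption made for illustration: the abstract does not say that the final priors are Beta distributions or that moment matching was used, and the sketch ignores the expert-specific correlation captured by the bivariate model.

```python
from typing import Tuple


def beta_from_mean_sd(mean: float, sd: float) -> Tuple[float, float]:
    """Method-of-moments Beta(alpha, beta) for a given mean and SD."""
    var = sd ** 2
    if not 0 < mean < 1:
        raise ValueError("mean must be strictly between 0 and 1")
    if var >= mean * (1 - mean):
        raise ValueError("sd is too large for a Beta distribution with this mean")
    k = mean * (1 - mean) / var - 1  # equals alpha + beta
    return mean * k, (1 - mean) * k


# Reported summaries: active control (0.60 mg/kg) and experimental (0.15 mg/kg).
alpha_control, beta_control = beta_from_mean_sd(0.06, 0.06)
alpha_experimental, beta_experimental = beta_from_mean_sd(0.08, 0.07)
```

For the active control summary this gives roughly Beta(0.88, 13.79); a first shape parameter below 1 means the density is concentrated near zero, which is consistent with a low expected return-visit probability.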
CP7-2 – SUBJECT RANDOMIZATION FOR BAYESIAN ADAPTIVE TRIALS WITH MULTI-ARM UNEQUAL ALLOCATIONS
Research on randomization algorithms for two-arm equal allocation trials has achieved remarkable results, but research on multi-arm unequal allocation trials is still insufficient. In Bayesian adaptive trials with response adaptive randomization (RAR), desired allocations are not only unequal but may also contain decimal or irrational elements, such as 1:1.234:1.789. Periodic updates of the target allocation in Bayesian adaptive trials result in a short sequence length for each allocation. Currently available randomization algorithms are complete randomization and permuted block randomization. Complete randomization can accurately target the desired allocation without approximation but has low allocation precision and may result in a treatment distribution far from the target, especially when the allocation sequence is short and the trial is run only once. Permuted block randomization requires a block size within the allocation sequence length and therefore may not be able to accurately target the desired allocation. For example, for the desired allocation of 1:1.234:1.789, investigators may have to use 3:4:5, 4:5:7, or 6:7:11 as approximations. Both low allocation accuracy and low allocation precision may reduce the benefit of the adaptive design. Furthermore, neither complete randomization nor permuted block randomization controls imbalances in potential confounding factors. This talk presents the minimal sufficient balance (MSB) method for trials with two or multiple arms and equal or unequal allocations. In this method, treatment imbalance is defined as the Euclidean distance between the treatment distribution under the desired allocation and the treatment distribution observed in the trial at the given stage. By default, a complete random assignment is applied by using the desired allocation probability as the conditional allocation probability.
When treatment imbalance reaches the pre-specified threshold, the conditional allocation probabilities are modified to reduce the treatment imbalance. Otherwise, the randomization algorithm checks the distribution of baseline covariates. The p-value of a test is one optional measure of baseline covariate imbalance. If serious imbalances (such as a p-value less than 0.3) are found in one or more baseline covariates, the conditional allocation probability is modified to reduce these imbalances. Computer simulations show that fewer than 20% of treatment assignments need modified allocation probabilities to contain the treatment imbalance within a threshold equal to the number of arms. No more than 5% of treatment assignments need modified allocation probabilities to prevent serious imbalance (i.e., p-value < 0.3) in a baseline covariate. Most importantly, the MSB method ensures that the desired allocation obtained from the Bayesian adaptation algorithm is accurately targeted without approximation and that the allocation precision is controlled. This new randomization method has been implemented in several large multicenter Bayesian adaptive trials in the Stroke Trials Network and SIREN Network, both funded by NIH.
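The treatment imbalance check in this abstract can be sketched as follows. Two points are assumptions rather than details from the abstract: the imbalance is computed on counts (expected versus observed numbers per arm), and the rule for modifying the conditional probabilities (giving a fixed probability `boost` to the most under-allocated arm) is a hypothetical biased-coin choice, since the abstract does not state how the probabilities are changed. The baseline covariate step is not shown.

```python
import math
from typing import List


def treatment_imbalance(allocation: List[float], counts: List[int]) -> float:
    """Euclidean distance between expected and observed counts per arm."""
    n = sum(counts)
    total = sum(allocation)
    return math.sqrt(sum((c - n * w / total) ** 2 for w, c in zip(allocation, counts)))


def conditional_probs(
    allocation: List[float], counts: List[int], threshold: float, boost: float = 0.8
) -> List[float]:
    """Return the allocation probabilities for the next participant."""
    total = sum(allocation)
    target = [w / total for w in allocation]
    if treatment_imbalance(allocation, counts) < threshold:
        # Default: complete randomization at the desired allocation.
        return target
    # Hypothetical rule: favor the arm furthest below its expected count.
    n = sum(counts)
    deficits = [n * p - c for p, c in zip(target, counts)]
    favored = deficits.index(max(deficits))
    rest = 1.0 - target[favored]
    return [
        boost if i == favored else (1.0 - boost) * p / rest
        for i, p in enumerate(target)
    ]


# Desired allocation from the abstract; no approximation to integers is needed.
desired = [1.0, 1.234, 1.789]
probs_default = conditional_probs(desired, [3, 3, 4], threshold=3.0)
probs_modified = conditional_probs([1.0, 1.0], [2, 0], threshold=1.0)
```

Because the target probabilities are taken directly from the desired ratio, an allocation such as 1:1.234:1.789 is used as is, which is the property the abstract contrasts with permuted block randomization.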
CP7-3 – PRIOR DISTRIBUTIONS FROM ENVISIONED POSTERIOR JUDGMENTS: A NOVEL ELICITATION APPROACH WITH APPLICATION TO BAYESIAN CLINICAL TRIALS
Bayesian methods for clinical trials require the specification of a prior distribution for the model parameters. The key benefit derived from the prior distribution is the ability to incorporate prior knowledge, which can increase trial efficiency. Prior elicitation is a scientific process that transforms domain knowledge, previous data, or expert judgments into well-defined prior distributions. It offers a solution to the prior specification problem, especially when limited data is available. Applied to clinical trials, elicitation involves engaging medical experts to assist them with summarizing their judgments about how well treatments work and communicating those judgments in a way that the results can be combined with trial data. The uptake of formalized prior elicitation from experts in Bayesian clinical trials has been limited, largely due to the challenges associated with complex statistical modeling, the lack of practical tools, and the cognitive burden on experts, requiring them to undergo supplementary training to ensure they are adept at quantifying uncertainty using probabilistic statements and to mitigate potential cognitive biases. In addition, existing methods have not addressed the issue of prior-posterior coherence, i.e., does the posterior distribution, obtained mathematically from combining the estimated prior with the trial data, reflect the expert’s actual posterior beliefs? In this work, we propose a new elicitation approach that seeks to ensure prior-posterior coherence and to reduce the expert’s cognitive burden. This is achieved by eliciting responses about the expert’s envisioned posterior judgments (point estimates for parameter values) under various hypothetical outcome data spanning a wide range of outcome values, as well as sample sizes. The presented data are intended to challenge the expert’s beliefs, forcing them to make decisions about the relative weights they assign to their (latent) prior judgments versus the data. 
A “best fit” prior distribution is then inferred from these elicited posterior judgments based on a specified statistical optimality criterion that minimizes the discrepancies between the elicited responses and the expected responses obtained from the implied posterior distribution. We present the statistical framework and the results from applying this approach in a pilot case study with a group of 10 clinician experts to obtain their prior distributions for the time effect in an ongoing stepped-wedge cluster randomized trial.
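The idea of inferring a prior from envisioned posterior judgments can be illustrated with a deliberately simple sketch. The assumptions here go beyond the abstract: a conjugate normal model with known sampling variance, in which the prior is summarized by a mean and an effective sample size, a least-squares criterion, and a grid search. The abstract does not specify the authors' model or optimality criterion, so this is not their method.

```python
from typing import List, Sequence, Tuple


def posterior_mean(prior_mean: float, prior_n: float, data_mean: float, n: int) -> float:
    """Posterior mean as a weighted average of prior mean and data mean."""
    return (prior_n * prior_mean + n * data_mean) / (prior_n + n)


def fit_prior(
    elicited: List[Tuple[float, int, float]],
    mean_grid: Sequence[float],
    n_grid: Sequence[float],
) -> Tuple[float, float]:
    """Grid search for the prior (mean, effective n) with the smallest
    squared discrepancy between implied and elicited point estimates.

    Each elicited item is (hypothetical data mean, sample size,
    expert's envisioned posterior point estimate).
    """
    best = None
    for m0 in mean_grid:
        for n0 in n_grid:
            sse = sum(
                (posterior_mean(m0, n0, y, n) - est) ** 2 for y, n, est in elicited
            )
            if best is None or sse < best[0]:
                best = (sse, m0, n0)
    return best[1], best[2]


# Simulate an expert whose latent prior has mean 0.5 and effective n of 20.
scenarios = [(0.0, 10), (1.0, 10), (0.2, 100), (0.9, 50)]
elicited = [(y, n, posterior_mean(0.5, 20, y, n)) for y, n in scenarios]
fitted = fit_prior(elicited, [i / 10 for i in range(11)], [5, 10, 20, 40])
```

Varying both the hypothetical outcome values and the sample sizes is what makes the prior mean and its weight separately identifiable: small samples reveal how strongly the expert holds the prior, while large samples reveal how quickly they defer to the data.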
CP7-4 – BAYESIAN IN-SILICO CLINICAL TRIALS APPLIED TO OBESITY-RELATED CANCER PREVENTION: THE IMPORTANCE OF EXPERT ELICITATION FOR KEY PARAMETERS IN THE ABSENCE OF EXISTING DATA USING THE SHELF METHOD
SESSION 8
CP8-1 – AN EXPERIMENTAL DESIGN FOR CLINICAL TRIALS TESTING THE INDIVIDUALIZATION POTENTIAL OF AN INDIVIDUALIZED TREATMENT RULE
Considerable money and effort have been invested to develop medical treatment individualization approaches based on biomarkers and diagnostic tests, and, more generally, patient-level variables. However, the pace of evaluation of the clinical utility of these decision methods has been slow, and many are implemented in clinical practice without confirmatory empirical evidence they do what they are supposed to do (treatment individualization). So, there have been in recent years calls in the medical community for the conduct and regulation of prospective randomized clinical trials that evaluate these individualized treatment rules (ITRs). ITRs based on classic statistical methods or machine learning have also proliferated in the statistical literature, but these ITRs are also rarely tested with clinical trials; their clinical utility is often inferred from statistical theory and clinical knowledge or examined with simulations. Thus, there is a need of efficient and scientifically sound clinical trials that examine the clinical utility of ITRs. These trials should not be confused with those widely used for testing the efficacy of individual medical treatments (for example, the well-known parallel-group or crossover trials). Their primary goal is to test the efficacy of the individualization process relative to not following the process, not the efficacy of a specific treatment involved in the process relative to placebo or other treatments. Thus, the development of experimental designs for evaluating ITRs with clinical trials assessing their potential utility in clinical practice is an important and flourishing field of methodological research. Here, we introduce a new confirmatory experimental design for testing ITRs, focusing on the individualization of two treatments. The design is built on the novel published idea of individualization potential, which is a measure of the extent of the superiority of an ITR over treatment without individualization (TWI). 
This idea implies a novel way of constructing the control group of the clinical trial and of testing the ITR’s utility. Our experimental design compares the application of the ITR against TWI as the control arm. We show that our design is superior to the most common designs used in personalized medicine research. Our design usually requires smaller sample sizes, especially of the patients who are less frequent and therefore more difficult to recruit and implements a more appropriate control arm. We explain how to test the significance of the individualization potential with our design and how to calculate optimal sample sizes. We illustrate by calculating sample sizes for a hypothetical clinical trial of a published ITR for the individualization of ophthalmological treatments.
CP8-2 – EVALUATING THE USE OF CO-PRIMARY ENDPOINTS IN TRIALS CONDUCTED AMONG CRITICALLY ILL PATIENTS: APPLICATION FOR DELIRIUM TRIALS
Delirium is an acute state characterized by rapid onset of confusion, inattention or agitation. Delirium is common among patients with critical illness receiving care in an intensive care unit (ICU), where incidence can be as high as 80%. In critically ill patients, delirium is associated with increased mortality and the duration of delirium is associated with cognitive impairment and onset of dementia among patients surviving the critical illness. Therefore, an increasing number of trials are evaluating pharmacologic and nonpharmacologic interventions to reduce delirium duration. Delirium trials conducted among critically ill patients typically measure delirium at least daily from the day of randomization to the first of pre-specified follow-up duration, e.g., 14 or 28 days, or death. Two competing risks complicate operationalizing duration of delirium as an endpoint in these trials. First, critically ill patients may experience periods of deep sedation or coma during which delirium is unable to be assessed, and second, patient death precludes the measurement of delirium. To account for these competing risks, several “failure-free” days composite endpoints have been used, e.g., days alive free of delirium and coma within 14 days. As with any composite endpoint, it is challenging to disentangle the effect of an intervention on a single component, e.g. days of delirium. Further, composite endpoints are not recommended in settings where interventions may have a positive effect on one component and a negative effect on another. For delirium trials, there may be a pharmacologic agent hypothesized to reduce delirium duration while increasing the duration of deep sedation, e.g. benzodiazepines which have sedative properties. 
As an alternative to a composite endpoint, we investigate the use of multiple co-primary endpoints composed of days of delirium, days of coma, and death within the pre-specified follow-up duration, where treatment is deemed efficacious if there is a significant reduction in delirium duration with no clinically important increase in the duration of coma or in death. Therefore, the global hypothesis test is composed of one superiority test for the duration of delirium and two non-inferiority tests for the duration of coma and death. Co-primary endpoints fall under the intersection-union principle for hypothesis testing, where the overall null hypothesis is rejected only if the null hypotheses for all endpoints are rejected. We illustrate using a simulation study how this setting does not inflate the type-I error but can inflate the type-II error and reduce the power of the study. A study testing co-primary endpoints will therefore need a larger sample size than one testing a single endpoint. Using data from a completed delirium trial, we compare power when using multiple co-primary endpoints composed of duration of delirium, duration of coma, and time to death. Further, we propose strategies to mitigate the power loss when using co-primary endpoints by adapting the type-I error for each outcome. This work highlights the potential for co-primary endpoints to improve clinical trial design and interpretation for delirium trials conducted among critically ill patients.
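The intersection-union decision rule and its effect on power can be sketched in a few lines of Python. The p-values and marginal powers below are hypothetical, and the joint power calculation assumes the three tests are independent, which is a simplifying assumption that will not hold exactly for correlated endpoints such as delirium, coma, and death.

```python
from typing import Iterable


def iut_reject(p_values: Iterable[float], alpha: float = 0.05) -> bool:
    """Reject the overall null only if every endpoint's null is rejected."""
    return all(p < alpha for p in p_values)


def joint_power_independent(powers: Iterable[float]) -> float:
    """Power to reject all nulls when the tests are independent."""
    result = 1.0
    for power in powers:
        result *= power
    return result


# Hypothetical p-values: delirium superiority, coma and death non-inferiority.
overall_success = iut_reject([0.01, 0.03, 0.04])
overall_failure = iut_reject([0.01, 0.03, 0.20])

# Three endpoints each tested with 90% power.
joint_power = joint_power_independent([0.9, 0.9, 0.9])
```

Because every test is performed at the full alpha level and all must succeed, the overall type-I error is not inflated, but the joint power falls below each marginal power (about 0.73 in this independent example), which is why a larger sample size is needed.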
CP8-3 – EVALUATING ESTIMAND IMPLEMENTATION IN CLINICAL TRIALS IN THE UK
CP8-4 – PRELIMINARY RESULTS FROM A SYSTEMATIC REVIEW OF BAYESIAN METHODOLOGICAL APPROACHES USED IN THE DESIGN AND ANALYSIS OF CONTEMPORARY RANDOMIZED CLINICAL TRIALS
Discussion: The preliminary findings indicate a clear need for guidelines to indicate best practices for the reporting and use of Bayesian methods in RCTs. We anticipate that forthcoming results will provide more granular insights into the contemporary landscape of Bayesian trials.
SESSION 9
CP9-1 – ASSESSING ADHERENCE TO CONSORT REPORTING GUIDELINES USING AI
CP9-2 – INTEGRATING INTERIM ANALYSES AND DSMB OVERSIGHT IN ADAPTIVE PLATFORM TRIALS: OPERATIONAL INFRASTRUCTURE OF THE HEALEY ALS PLATFORM TRIAL
CP9-3 – CHALLENGES AND SOLUTIONS IN CTMS IMPLEMENTATION FOR A REGISTRY-BASED PLATFORM TRIAL
Real-world evidence, including registry data, is becoming increasingly important in answering clinical research questions: it harnesses patient data from outside controlled clinical trials, captures treatment across diverse populations of people who might not otherwise be enrolled in clinical trials, provides information about long-term efficacy, safety, and cost-effectiveness, and reveals how treatments are used in clinical practice. It has been argued that, compared to traditional randomized clinical trials, registry-based randomized clinical trials identify and recruit patients more efficiently, reduce duplicative data collection and site workload, reduce loss to follow-up, decrease time to database lock, enhance study generalizability, accelerate time to regulatory decision-making, and reduce clinical trial costs. STEP is a randomized, multifactorial, adaptive platform trial that seeks to optimize the care of patients with acute ischemic stroke. It is a “hybrid” registry-based trial that leverages data from the American Heart Association’s “Get with the Guidelines” and the “Neurovascular Quality Initiative-Quality Outcomes Database” registries. With permission from participating institutions and the IRB, data elements collected in the registries for patients randomized into STEP are transferred to the clinical trial management system’s (CTMS) case report forms using a customized data transfer program based on the unique participant record link. Once the data are transferred, the site is responsible for reviewing the fidelity of the transferred data, cleaning the data when needed, and entering any fields not transferred from the registries. Additional registry data from patients not randomized in STEP are used for trial screening, planning, and generalizability assessment purposes.
Challenges during CTMS implementation include: (1) Unknown treatments and study visits that will be added to or removed from the platform during the life of the study; (2) Multiple mechanisms of data collection, including (a) transfer from the registry, (b) data entry by the site, and (c) transfer followed by editing by the site; (3) Not all sites participate in all registries, and some sites enter only a random sample of their patients into the registry(s); (4) Data definitions in the registry may be similar to, but not the same as, data items defined in the study protocol; (5) Code reconciliation for multiple-selection data fields; (6) Time delays due to standard-of-care (SOC) data entry in registry systems; (7) Cost justification for registry data acquisition; and (8) Cost justification for information system development for data transfer and reconciliation. This presentation will discuss solutions to these barriers, including building flexibility into the study design, data collection schedule, CRF design, and data validation methodology, and developing tools for data transfer, conversion, mapping, and reconciliation. The primary aim of this presentation is to evaluate the costs and benefits of combining real-world SOC registry data sources with trial-specific data captured on CRFs, based on the experiences of the STEP platform trial.
SESSION 10
CP10-1 – LEVERAGING INTERACTIVE DATA VISUALIZATION SOFTWARE TO ENHANCE DATA REPORTING AND QUALITY: EXAMPLES FROM A MULTI-CENTER CLINICAL TRIAL
Clinical trialists are increasingly utilizing interactive data visualization software, such as Power BI, Tableau, and Looker Studio, with the goal of garnering insights from large trial data sets, often in real-time (or with short lag time). Interactive data visualizations in clinical trials provide a powerful tool, especially in a risk-based monitoring context, where cleaned, high-quality data is critical to the successful execution of a trial and analysis and interpretation of its results. We present two use cases of Power BI in the context of multi-center clinical trials: (1) to generate near real-time reports on ongoing study status, and (2) to create visualizations of endpoint data for the purposes of data cleaning. During the course of a trial, data coordinating centers are tasked with generating reports to share with various trial stakeholders. Often, these reports include information on enrollment and visit compliance, among other metrics. For example, in our traditional weekly reports, we typically include three separate pages summarizing (1) the number of participants consented and randomized by site, along with those eligible/ineligible and the rates of consent/randomization; (2) a figure of consents over time; and (3) a figure of randomizations over time. In Power BI, researchers can summarize this information into a single report page, with interactive visuals that can be filtered to highlight certain fields. For example, filtering by site enables report users to easily see consents/randomizations for a specific site over time, which offers interactive insights not available in traditional reports. In addition, creating a direct link between the Power BI report and a “snapshot” of the production database allows for the report to be updated on a more frequent basis. Power BI also provides creator-controlled filters and subscriptions within the report distribution infrastructure. 
Automating the generation and distribution of reports saves time and reduces email overload caused by multiple attachments. Power BI can also enhance data cleaning processes. In contrast to static “spaghetti” plots, Power BI provides an interactive environment that links statistician-derived endpoints (using statistical software like SAS) with the raw data, allowing the user to refer back to the case report form (CRF)-level data and examine whether any specific data entry errors or procedural issues are affecting the derived endpoint. In our described process, statisticians write analysis dataset programs that are automatically configured to run on the snapshot database at a regularly scheduled interval, outputting CSV files that are loaded to an “analytics” database using SQL Server Integration Services (SSIS) and linked to the raw data via an identifier (e.g., CRF #). Once set up, this process enables the full use of Power BI’s functionalities. For instance, incorporating the site filters as described above, one can more quickly decipher trends at particular sites or in specific participants, whereas doing the same with static plots would require re-runs of the code by those proficient in the statistical software. These use cases demonstrate that data visualization software like Power BI can help enhance clinical trial data reporting and quality, which are essential to robust scientific inquiry.
CP10-2 – COMMUNICATING TRIAL RESULTS BY GRAPHICAL ABSTRACTS —EXPERIENCES FROM THE GRADE (GLYCEMIA REDUCTION APPROACHES IN DIABETES: A COMPARATIVE EFFECTIVENESS) CLINICAL TRIAL
CP10-3 – GRAPHICAL REPRESENTATION OF ADVERSE EVENTS IN CLINICAL TRIALS
Adverse event (AE) reporting in clinical trials is essential for evaluation of treatment benefits and harms. AEs are usually collected using standardized classification schemes. For example, Medical Dictionary for Regulatory Activities (MedDRA) is a clinically validated international medical terminology system used for AE reporting, which includes five levels of hierarchy, with 26 System Organ Classes (SOCs) at the highest level and more than 80,000 specific lowest level terms which reflect how AEs may be recorded in practice. Common Terminology Criteria for Adverse Events (CTCAE) incorporates certain elements of the MedDRA terminology, and is the standard for classifying, attributing and grading the severity of AEs associated with cancer treatments. A recent update to CTCAE for patient-reported outcomes (PRO), the PRO-CTCAE, has also been proposed. AE reporting is included in the Consolidated Standards of Reporting Trials (CONSORT) statement, and a recent
“2022 CONSORT Harms” extension includes three new items and updates related to benefits and harms reporting for thirteen other CONSORT items. AE data are usually analyzed using descriptive statistics, and the results are reported as tables of patient counts for each AE type, often by treatment group or severity level. Multiple instances of an AE for a participant over time are usually summarized as the maximum grade experienced. Formal comparisons between groups are not common because of low frequencies for many AE types and the need for multiple comparison adjustments. For high-toxicity treatments, descriptive tables are often long, making it difficult to interpret them or identify patterns. Lee et al. recently proposed using circular and butterfly plots to represent proportions of maximal-grade AEs by SOC and severity level, but in general graphical tools have not been widely adopted. We propose a streamlined graphical approach that enables rapid visual comparison of AE patterns between groups, improves understanding of the affected organ classes, and enhances the overall understanding of treatment harms. First, we propose the use of vertical line charts, similar to the love plots used to assess covariate imbalance in propensity score analyses, to describe AE rates by group. In these plots, AE terms are ordered by overall frequency on the y-axis, and the x-axis displays the proportion of patients experiencing each AE type in each group, with connections drawn across AE terms within each group. Second, we propose using a radar plot to summarize organ class involvement in each group, with vertices denoting the proportion of patients experiencing any AE within a SOC. Each plot can be easily customized to include additional groupings, e.g., low- vs. high-grade AEs, to further highlight subgroup differences or similarities. These plots can also be applied directly to AE summary tables in published studies, or to posted results from completed trials in clinicaltrials.gov.
To illustrate our approach, we use AE summary data from a published trial comparing Pemetrexed + Chemotherapy with or without Pembrolizumab in non-squamous non-small cell lung cancer (KEYNOTE-789; NCT03515837). We recommend inclusion of these graphical summaries in analysis reports and publications, alongside the usual table summaries.
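The data preparation behind the love-plot-style chart described above can be sketched as follows. This is a minimal sketch, not the authors' implementation: the function name `love_plot_rows`, the input layout, and the AE counts are all hypothetical (the counts are invented for illustration and are not KEYNOTE-789 data). It only orders AE terms by overall frequency and computes per-group proportions; drawing the chart is left to whatever plotting library is in use.

```python
def love_plot_rows(counts, n_per_group):
    """Prepare rows for a love-plot-style AE chart.

    counts:      {ae_term: {group: number of patients with that AE}}
    n_per_group: {group: number of patients in the group}
    Returns a list of (ae_term, overall_proportion, {group: proportion}),
    ordered from the most to the least frequent AE overall.
    """
    total_n = sum(n_per_group.values())
    rows = []
    for term, by_group in counts.items():
        overall = sum(by_group.get(g, 0) for g in n_per_group) / total_n
        props = {g: by_group.get(g, 0) / n for g, n in n_per_group.items()}
        rows.append((term, overall, props))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


# Hypothetical counts for two arms of 200 patients each (illustration only).
n_per_group = {"experimental": 200, "control": 200}
counts = {
    "Nausea": {"experimental": 80, "control": 70},
    "Anemia": {"experimental": 90, "control": 100},
    "Rash": {"experimental": 30, "control": 10},
}
rows = love_plot_rows(counts, n_per_group)
ordered_terms = [term for term, _, _ in rows]
```

Each row supplies one y-axis position (the AE term) and one x-value per group, so connecting the x-values of a group down the ordered rows yields the vertical line for that group.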
CP10-4 – HOW DOES THE USE OF A VIDEO TO INTRODUCE A COMPLEX INTERVENTION AFFECT SERVICE USER UPTAKE AND ENGAGEMENT? RESULTS FROM A MIXED-METHODS SWAT
SESSION 11
CP11-1 – IDENTIFYING GAPS IN ETHICS GUIDELINES FOR CLUSTER RANDOMIZED TRIALS: A CITATION ANALYSIS
CP11-2 – THE PROMISE STUDY: AN INTERNATIONAL CONSENSUS PROCESS TO DEVELOP GUIDANCE FOR ASSESSING “PROMISE OF THE INTERVENTION” AHEAD OF A RANDOMIZED CONTROLLED TRIAL
Potential Relevance and Impact: Guidance will be produced to help funders and researchers identify suitable research methods for assessing “promise” of the intervention. Implementation of guidelines will be facilitated through liaison and registration, support from the EQUATOR network, and via discussion with the major funders of clinical trials.
CP11-3 – PATIENT AND PUBLIC INVOLVEMENT AND ENGAGEMENT TO METHODOLOGICAL RESEARCH: INSIGHTS FROM A PANEL
CP11-4 – USING RE-IDENTIFICATION RISK SCORES ON PUBLICLY AVAILABLE ANONYMISED CLINICAL TRIAL DATASETS
SESSION 12
CP12-1 – PIONEERING PHYSICIAN-LED TRIALS: TRANSFORMING TREATMENT FOR ADRENAL INSUFFICIENCY AND BEYOND
Physicians frequently recognize potential treatments for conditions that lack sufficient market size to attract industry-sponsored trials. These opportunities remain unvalidated due to significant barriers, including complex regulatory pathways, funding limitations, and the absence of commercial incentives. This study presents an innovative framework for practitioner-led clinical trials that leverage existing FDA-approved devices and real-world data, enabling impactful research independent of industry support. Using the case of continuous subcutaneous hydrocortisone infusion for adrenal insufficiency, the trial exemplifies how practitioners can navigate these challenges. Subcutaneous infusion devices, approved for insulin delivery, are ideal for repurposing due to their established safety profiles and versatility. In adrenal insufficiency, standard oral hydrocortisone therapy often results in peaks and troughs, requiring supra-physiologic doses during troughs to stabilize patients. These fluctuations compromise quality of life and elevate long-term health risks. Continuous subcutaneous infusion offers a more physiologic delivery, stabilizing cortisol levels, reducing hospitalizations, and improving patient-reported outcomes. Our trial design incorporates retrospective electronic medical record data collected and analyzed by our team to inform eligibility criteria and outcome measures. The primary outcomes include a reduction in hospitalization rates for adrenal crises and improvements in quality of life, assessed using validated questionnaires. Secondary outcomes evaluate fatigue levels, adjustments in daily hydrocortisone dose, serum adrenocorticotropic hormone levels, adverse events, and device-related complications. By leveraging adaptive design elements and real-world data, this framework provides a scalable, resource-efficient approach to trial execution. 
The trial’s design highlights how funding mechanisms such as PAS-23-086 Small R01 can support small-scale studies, demonstrating that properly constructed trials can yield data robust enough for treatment approval and insurance coverage. Beyond adrenal insufficiency, this model serves as a template for other conditions where potential treatments exist but lack industry interest due to limited market size. The framework prepares for future needs as data analytics continue to identify novel treatment applications that may not attract commercial investment. This practitioner-led approach not only bridges the gap between clinical observation and formal evidence generation but also ensures that overlooked therapies can be validated and brought to patients. By addressing current and future challenges, this model shapes a path for impactful, sustainable research that adapts to the evolving landscape of clinical science, ultimately expanding treatment options and improving patient care.
CP12-2 – INTENTION-TO-TREAT ANALYSIS: A SYSTEMATIC REVIEW ON RECOMMENDATIONS AND HOW TO USE IT APPROPRIATELY
The randomized controlled trial (RCT) design has long been considered the gold standard for testing the effects of an intervention. However, missing data (e.g., drop-out) and non-compliance (e.g., participants not following the original treatment assignment) complicate the data analysis of RCTs. Intention-to-Treat (ITT) analysis has long been recommended for handling these issues in order to preserve the benefits of randomization. As various strategies and diverse versions of ITT emerge in the literature, researchers need a comprehensive understanding of them in order to employ suitable data analysis techniques. This study aims to review how ITT is defined and what ITT methods and practices are recommended in methodology articles published from 2010 to 2022. A total of 1,281 articles were identified initially, of which 53 met the inclusion criteria and were included in the final review. Our results suggest that a variety of definitions have appeared in the literature in addition to the widely cited definition, “once randomized, always analyzed.” Modified ITT (mITT) has become popular and has attracted considerable attention in the last decade. However, there is no agreement on how to define mITT, and about one-third of these methodological articles did not present a clear definition. Additionally, our study identified a variety of statistical methods and techniques recommended for handling missing outcome data and non-compliance, but one-third of the articles did not provide a proper description of the applied statistical methods. Notably, only six articles (11%) originated from the fields of psychology and social sciences, while the majority were published in medical-related fields. In conclusion, our findings underscore the need for researchers across disciplines to improve their understanding of ITT and alternative strategies and to apply them appropriately when dealing with missing data and non-compliance in RCTs.
CP12-3 – CHARACTERISTICS OF INTERVENTIONAL CLINICAL TRIALS STARTED IN 2023 REGISTERED IN CLINICALTRIALS.GOV
CP12-4 – THE TRANSLATION OF INTERVENTIONAL STUDY FINDINGS FOR ADULTS REQUIRING MAINTENANCE DIALYSIS INTO CLINICAL GUIDELINES
SESSION 13
CP13-1 – STRATEGIES, OBSTACLES, AND FACILITATORS FOR REMOTE TRIALS ADMINISTERING PHYSICAL ACTIVITY-BASED INTERVENTIONS AND PERFORMING PHYSICAL FUNCTION ASSESSMENTS: THE LESSENING INCONTINENCE WITH LOW-IMPACT ACTIVITY TRIAL
Digital and telehealth methods are increasingly being used in remote trials to administer diverse interventions and conduct participant assessments. Physical movement-based interventions, as well as assessments of participants’ physical performance or function, present unique feasibility, quality, and safety challenges in a remote trial context. When interventionists or assessors cannot lay hands on participants, are unable to view participants in three dimensions, and must rely on speakers, cameras, and microphones to communicate, the risk of error and injury can increase. We discuss strategies, barriers, and facilitators for delivering physical movement interventions and performing physical function assessments over remote video platforms in an NIH-funded, multisite randomized trial of two types of group-based physical movement-based interventions in an aging population. Participants were diverse older community-dwelling women with urinary incontinence (N=240, age range 45 to 90 years) who were randomly assigned to either a therapeutic pelvic yoga program involving twice weekly group intervention sessions supplemented by once weekly self-directed practice, or a skeletal muscle conditioning program involving time-equivalent group sessions and self-directed practice of muscle stretching and strengthening techniques. Originally launched in 2019, the trial was rapidly converted to a videoconference-based platform during the COVID-19 pandemic, with all study interventions and assessments performed remotely thereafter. The trial team developed new protocols and tools for delivering remote videoconference yoga and physical conditioning intervention instruction, assessing the quality of instruction through videoconference observations by expert consultants, and evaluating participants’ ability to perform intervention techniques over video. 
New strategies were developed to evaluate changes in participants’ physical function using video-based tests of balance (one-legged stand), lower extremity strength (chair-stand testing), and aerobic endurance (step-testing). Compared to 10.5% drop-out in the early study cohorts involving all in-person intervention instruction, 15.9% of participants dropped out of interventions in cohorts relying on all-video instruction. Among retained participants, however, adherence to intervention sessions differed only modestly, with 77.9% of early in-person cohorts versus 71.7% of later video-based cohorts completing more than 90% of sessions. Using standardized procedures for questioning participants about adverse events, the overall proportion of participants reporting a musculoskeletal adverse event, including joint pain/strain, was 9.9%. However, only 2.6% of participants in early cohorts involving in-person instruction reported an event that was categorized as “probably” or “definitely” related to interventions, compared to 5.6% in remotely delivered cohorts. Two notable adverse events demonstrate unique safety concerns that can arise in remote studies: one involved a shoulder tear sustained during a remote intervention class, and the other a fall from tripping over intervention props. Despite the increased convenience and accessibility, participants and instructors raised challenges about using videoconference-based platforms during intervention instruction and described barriers to establishing appropriate, distraction-free environments during intervention sessions. Opportunities for participants and instructors to develop interpersonal rapport and provide mutual support were also decreased.
Findings highlight lessons learned about remote video platforms for instruction in physical movement-based interventions and performance of physical function assessments in a research context, as well as strategies for promoting the safety of both types of activities among older and diverse populations.
CP13-2 – EXTERNAL CHALLENGES IN STEPPED WEDGE CLUSTER RANDOMIZED TRIALS: A CASE STUDY OF THE PREHOSPITAL CANADIAN C-SPINE RULE TRIAL
Case study: Each year, Emergency Medical Services transport around half a million trauma patients with a potential neck injury to the local emergency department in Ontario, Canada. Paramedics used to transport all such patients fully immobilized using backboards, cervical collars, and head immobilizers. Unnecessary immobilization of low-risk patients was costly, inefficient, and wasted resources. Vaillancourt et al. designed a 12-month pragmatic SW-CRT aimed at improving patient care and health system efficiency by allowing paramedics to assess patients using the Canadian C-Spine Rule (CCR) to determine risk of injury and selectively transport low-risk patients without immobilization. The primary and co-primary outcomes were the proportion of patients immobilized or with discomfort and pain.
External confounder: Starting in the 4th period of the trial, a new spinal motion restriction (SMR) protocol was introduced under which most patients could be transported without a backboard, with or without a cervical collar as per the CCR. Under the SMR protocol, use of a neck collar alone could be considered immobilization.
Impact and strategies: This new protocol created an unanticipated change to the implementation schedule at month nine, which led the investigators to censor the last three months of data from the main analysis. The change in practice confounded all outcomes regardless of the intervention effect, because immobilized patients could now be transported with less discomfort and pain using only a collar. It also affected secondary outcomes such as the time paramedics spent in the field before hospital arrival. A time effect for before versus after the policy change could not be adjusted for in the linear mixed-effects model because all centers had been switched to the intervention by that point. Under the new protocol's definition, immobilization no longer captured the side effects of immobilization, so for analysis purposes it was redefined as use of at least a backboard.
CP13-3 – WHY IS DE-IMPLEMENTATION OF INEFFECTIVE INTERVENTIONS SO DIFFICULT? INSIGHTS FROM CLINICAL PRACTITIONERS’ CRITIQUES OF EVIDENCE AND IMPLICATIONS FOR TRIAL DESIGN AND PLANNING
CP13-4 – NIH STROKENET INTERNATIONAL TRIALS IMPLEMENTATION
SESSION 14
CP14-2 – TOO MANY COOKS IN THE KITCHEN? LESSONS LEARNED FROM THE DEVELOPMENT OF A PRAGMATIC CLINICAL TRIAL PROTOCOL
The development of a clinical trial protocol is a multidisciplinary effort that requires diverse perspectives, with varying preferred methods of communication and execution, to unify as a single voice in answering the clinically relevant question at hand. ASCO’s CDK4/6 Inhibitor Dosing Knowledge (CDK) Study exemplifies the challenges and benefits of this approach. The study brought together academic and community breast medical oncologists, patient advocates, biostatisticians, and other clinical research experts to develop a protocol for a Phase III randomized trial focused on dose optimization for patients aged 65 and older with HR+/HER2- metastatic breast cancer. The collaborative effort underscored the value of multistakeholder input, which enriched the protocol by ensuring patient-centered approaches, such as integrating real-world considerations (e.g., feasibility and length of study assessments for patients), and by expanding the applicability of the findings across diverse clinical contexts (e.g., broad eligibility criteria). However, the process also revealed notable challenges. Differing professional backgrounds and expectations led to mismatches in decision-making and work styles (e.g., rapid and flexible documentation vs. vetted and rigorous documentation) and in communication methods, which occasionally slowed progress, complicated coordination, and created bottlenecks. Differences in technological preferences for protocol development resulted in version control and decision-tracking issues. Some group members favored novel approaches, such as online protocol-building platforms that allow multiple group members to review and track decisions simultaneously, while others preferred more traditional methods of document sharing by email. While both methods are acceptable, each with advantages and disadvantages, the experience highlighted the need to discuss and reach consensus on how the group will operate at the outset.
Despite these hurdles, the team identified strategies to overcome them: establishing structured frameworks for decision-tracking, agreeing early on technology preferences, and clearly defining roles to streamline coordination. Fostering open communication and mutual respect was equally essential in aligning strategic visions across the multidisciplinary team. The lessons learned from developing the CDK Study protocol emphasize that, while managing diverse input can introduce complexity, the collective insights significantly enhance the relevance and impact of clinical research. Balancing inclusivity with structured decision-making may make it possible to navigate the intricacies of large-scale collaborative efforts and produce impactful clinical trials.
CP14-3 – ANALYSIS USING INTENT-TO-TREAT VERSUS PER PROTOCOL REVIEWS IN PRAGMATIC CLINICAL TRIALS
Clinical trials typically adhere to strict per-protocol or intent-to-treat (ITT) analyses to maintain consistency and validity. The Targeted Agent and Profiling Utilization Registry (TAPUR) Study’s unique pragmatic design highlights that traditional approaches are not always appropriate, necessitating the development of a novel method utilizing well-defined criteria to determine appropriate data inclusion for primary analyses. The TAPUR Study is a pragmatic precision oncology phase II basket trial that evaluates the antitumor activity of commercially available targeted agents outside of their approved indication(s) in patients with advanced disease. Each TAPUR cohort, defined by tumor type, genomic target, and drug, follows a Simon two-stage design. Ten participants enroll in stage I, and if the cohort is not closed for futility, eighteen additional participants enroll in stage II. The primary endpoint of the TAPUR Study is disease control (DC), measured using RECIST criteria. Due to enrollment of participants with advanced disease and no requirement for end of study tumor scans, some participants end study with only a baseline tumor scan. Because the primary endpoint relies on tumor measurements, participants that end study with only a baseline scan may not have primary outcome data. Given the small sample size of each cohort, a few participants without meaningful outcome data can significantly affect the study’s statistical power and ability to make inferences from a cohort. To address the challenges posed by participants leaving the study prior to evaluation of the primary endpoint, the TAPUR Study employs a novel approach to determine whether participant data can be included in the primary analysis per protocol or if another participant should be recruited to “replace” the participant in the cohort. This differs from ITT analysis which would consider participants without outcome data non-responders and include them in the primary analysis. 
Non-responders on TAPUR are specifically participants with evidence of progression and a defined outcome. The developed methodology involves well-defined criteria, organized into a decision-making flowchart. This flowchart was drafted prior to beginning the review process for any cohort and is applied equally to all cohorts. All reviews occur before the analysis of a cohort so that determinations of which participants can be replaced occur before determining the number of participants with DC. Participants with evidence of tumor progression or clinical progression are included in the primary analysis. Conversely, participants who end the study with insufficient disease information and no link between study departure and disease status are excluded from the primary analysis and replaced to maintain statistical integrity for the primary objective. However, all participants who received at least one dose of study drug are included in safety analyses (i.e., summary of adverse events). The TAPUR Study’s innovative approach to reviewing missing primary outcome data and determining whether participants should be replaced, rather than using a strict ITT analysis, demonstrates how tailored methodologies can address the challenges of unique trial designs while preserving statistical integrity. By implementing a flowchart with well-defined criteria to guide decision-making, the study ensures that primary analyses remain robust and reliable, even in small cohorts.
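To make the Simon two-stage structure described above concrete, the operating characteristics of such a design can be computed from binomial probabilities. The sketch below is illustrative only: the stage sizes (10 and 18) come from the abstract, but the stopping boundary `r1`, the final boundary `r`, and the null and alternative disease control rates are hypothetical values chosen for the example, since the abstract does not state them. The function names are likewise invented here.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(max(k, 0), n + 1))


def simon_two_stage(p, n1, r1, n2, r):
    """Operating characteristics of a two-stage design at true response rate p.

    Stage I enrolls n1 participants; the cohort stops for futility if the
    number of responses is <= r1. Otherwise n2 more participants enroll and
    the null hypothesis is rejected if total responses exceed r.
    Returns (probability of early termination, probability of rejecting H0).
    """
    prob_early_stop = sum(binom_pmf(x, n1, p) for x in range(r1 + 1))
    prob_reject = 0.0
    for x1 in range(r1 + 1, n1 + 1):
        still_needed = r + 1 - x1  # responses required in stage II
        prob_reject += binom_pmf(x1, n1, p) * binom_sf(still_needed, n2, p)
    return prob_early_stop, prob_reject


# n1 = 10 and n2 = 18 are from the abstract; r1, r, p0, p1 are assumptions.
N1, N2, R1, R = 10, 18, 1, 6
pet_null, type_one_error = simon_two_stage(0.15, N1, R1, N2, R)
pet_alt, power = simon_two_stage(0.35, N1, R1, N2, R)
```

With cohorts this small, each participant who leaves without evaluable outcome data shifts these probabilities noticeably, which is the motivation the abstract gives for pre-specified replacement criteria.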
SESSION 15
CP15-1 – REPEATED INCLUSION OF CLUSTERS IN LONGITUDINAL CLUSTER RANDOMISED TRIALS
Often referred to as re-randomization designs, randomized trials that allow participants to be included in a randomized clinical trial multiple times (randomized independently each time), have been shown to increase trial recruitment rates. To avoid confusion with other uses of the term “re-randomization,” we refer to these designs as “repeated inclusion” designs. Provided certain assumptions are valid, treatment effect estimators from repeated inclusion designs will be unbiased with increased precision. Until now, the theory of repeated inclusion designs has been restricted to the setting of individually randomized designs; here we extend that theory to cluster randomized trials. Repeated inclusion of clusters may be useful when the number of available clusters is limited, or cluster recruitment is difficult: allowing clusters to participate multiple times in the same trial. In this talk we extend the theory of repeated inclusion designs to cluster randomized trials, including longitudinal variants such as cluster randomized crossover designs. Given the validity of assumptions regarding the constancy of treatment effect across repeated inclusions, for designs where equal numbers of clusters and participants are included in each treatment group in each study period, for the same total number of measurements, study power will never reduce when clusters are randomized multiple times. Study power will either be maintained or increased, and whether power is maintained or increased depends on the combination of the study design and the within-cluster correlation structure. A corollary of our main result indicates that designs conducted over a larger number of periods can be more powerful than designs with a larger number of clusters. These results have implications for the design of longitudinal cluster randomized trials, in particular cluster randomized crossover trials and standard cluster randomized trials. 
When cluster recruitment is difficult, repeated inclusion designs, where the same clusters are included multiple times in the same study, may thus be feasible alternatives to standard cluster randomized trial designs.
CP15-2 – EXPLORING INCOMPLETE STEPPED WEDGE DESIGNS: BALANCED VERSUS IMBALANCED STAIRCASE DESIGNS
Stepped wedge cluster randomized trial designs can carry burdensome data collection requirements as all clusters must collect and provide data in all periods of the trial. Staircase designs are incomplete variants of the stepped wedge design that can be considerably less burdensome. Visually, the trial design resembles a staircase: clusters are randomly assigned to sequences made up of a limited number of measurement periods (control periods followed by intervention), where sequences start measurement at different times. Recent work has found that, under a linear mixed model, staircase designs with just two periods of measurement in each sequence are particularly lean designs with power that can rival that of the stepped wedge in certain situations. In this talk we will aim to identify efficient staircase designs among those with more than two measurement periods in each sequence. In particular, we will examine whether there is a benefit to using an imbalanced staircase design, with different numbers of control and intervention periods in a sequence, over a balanced staircase design. We will compare the efficiency of different staircase designs via the precision of the treatment effect estimator under a variety of trial settings. Surprisingly, our results show that balanced designs are not always optimal, for certain common trial configurations and modeling assumptions. In particular, imbalanced designs tend to be more efficient than balanced designs for designs with fewer sequences and in settings with larger cluster-period sizes and higher intracluster correlation parameters (i.e., greater similarity between participants’ outcomes in a cluster and less waning in similarity over time). This work adds to the growing bank of knowledge about the types of staircase designs that are most efficient under different trial settings, thereby helping trialists choose trial designs that will make best use of trial data to answer their research questions.
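The staircase layout described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function name `staircase_design`, the convention that each successive sequence starts measurement one period later than the previous one, and the encoding (None for no measurement, 0 for control, 1 for intervention) are assumptions made here for illustration.

```python
def staircase_design(n_sequences, n_control, n_intervention):
    """Build a staircase design as a list of rows (one per sequence).

    Assumption for this sketch: sequence s (0-indexed) starts measuring in
    period s, collects n_control control periods, then n_intervention
    intervention periods. Unmeasured cluster-periods are None.
    """
    n_periods = n_sequences + n_control + n_intervention - 1
    design = []
    for s in range(n_sequences):
        row = [None] * n_periods
        for k in range(n_control):
            row[s + k] = 0  # control period
        for k in range(n_intervention):
            row[s + n_control + k] = 1  # intervention period
        design.append(row)
    return design


def render(design):
    """Render the design as text, using '.' for unmeasured cluster-periods."""
    return "\n".join(
        " ".join("." if cell is None else str(cell) for cell in row)
        for row in design
    )


if __name__ == "__main__":
    # Balanced design: one control and one intervention period per sequence.
    print(render(staircase_design(3, 1, 1)))
    print()
    # Imbalanced design: one control and two intervention periods per sequence.
    print(render(staircase_design(3, 1, 2)))
```

Comparing the efficiency of such designs would additionally require a model for the outcome (for example a linear mixed model with a chosen within-cluster correlation structure), which this sketch does not attempt.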
CP15-3 – A REVIEW OF CURRENT PRACTICE IN THE DESIGN AND ANALYSIS OF EXTREMELY SMALL STEPPED-WEDGE CLUSTER RANDOMIZED TRIALS
CP15-4 – ENHANCING COVARIATE-ADAPTIVE RANDOMIZATION IN A CLUSTERED 2X2 FACTORIAL DESIGN TRIAL
Randomization is a cornerstone of any randomized clinical trial. Covariate balance is crucial to randomization because it ensures that key characteristics are distributed evenly across treatment groups, which minimizes confounding and helps isolate the effect of the intervention. However, in smaller trials, covariate balance is not always achieved and, depending on the severity of imbalance, the robustness of the trial’s findings may be jeopardized. Traditional randomization techniques (e.g., stratified, block-stratified) limit the number of categorical variables in schemes, while methods like minimization and biased coin designs can compromise treatment allocation randomness. One method for ensuring covariate balance while maintaining randomness is called Minimal Sufficient Balance (MSB). Briefly, MSB utilizes test statistics and p-values to assess balance for both continuous and categorical variables, all while ensuring randomness in treatment assignment. However, to date, MSB has not been implemented for cluster randomized trials, nor has it been used for a 2x2 factorial design trial. Moreover, no covariate-adaptive randomization process, nor any other common randomization scheme, attempts to prospectively ensure balance in attrition rates. This talk will focus on the implementation of MSB in a way that addresses all three novel attributes: 1) 2x2 factorial design, 2) cluster randomized design, 3) prospectively maintaining balance in attrition rates across trial arms. This novel MSB design will be implemented for an ongoing randomized clinical trial called SixtyPlus, which focuses on enhancing the outcomes of low-income older adults by proposing that home-delivered meal services be paired with clinical services to improve health and safety. More specifically, SixtyPlus aims to evaluate the effects of registered dietitian (intervention A) and occupational therapy (intervention B) services on the risk of falling among home-delivered meal clients.
To ascertain the effect of these interventions in isolation and in combination, a 2x2 factorial design was employed. Additionally, since these interventions will be administered in participants’ homes, there is a potential for overlapping effects if multiple individuals from the same household receive different interventions. To mitigate this, household clusters will be randomized to specific interventions. Moreover, varying baseline characteristics can be associated with differing attrition rates between randomized groups, which in effect, may introduce bias. Following the burn-in randomization period, the innovative adaptation of MSB in this proposal will include prospectively calculating probability of attrition for each participant nested within the household for the purpose of achieving balance on attrition. Current popular randomization techniques can lead to persistent imbalances in multiple baseline covariates which can jeopardize the methodological integrity of a randomized clinical trial. MSB has proven effective in ensuring balance of covariates in two-armed randomized clinical trials with randomization at patient level. SixtyPlus represents innovation in multiple facets by adapting MSB for a four-armed trial. The robustness and practicality of this method have the potential to establish it as a frequently employed randomization technique in future trials. As a result, it could encourage major data collection and analysis programs to integrate this method as a standard practice in clinical trials to promote more robust answers to complex research questions.
SESSION 16
CP16-3 – A SCOPING REVIEW OF PATIENT AND PUBLIC INVOLVEMENT IN THE DESIGN AND CONDUCT OF CLUSTER RANDOMIZED TRIALS CONDUCTED EXCLUSIVELY IN LOW- AND MIDDLE-INCOME COUNTRIES
Objectives: To describe PPI in cluster trials conducted in LMICs.
CP16-4 – STRATEGIES TO IMPROVE RECRUITMENT TO RANDOMISED TRIALS: COCHRANE SYSTEMATIC REVIEW
SESSION 17
CP17-1 – ADVANCED STATISTICAL METHODOLOGIES IN CENTRALIZED MONITORING: ENHANCING EFFICIENCY AND DATA INTEGRITY IN CLINICAL TRIALS
CP17-2 – ASSESSING TREATMENT BENEFIT IN SURVIVAL ANALYSIS UNDER NONPROPORTIONAL HAZARDS
In survival comparisons, the Cox hazard ratio provides an interpretable estimate of the treatment effect under the assumption that the ratio of hazards is constant over time. However, when the proportional hazards assumption is violated, the hazard ratio has no clear interpretation. Snapinn et al argue that in survival analysis a treatment’s benefit has two distinct “dimensions,” namely, the difference in restricted mean survival times and the difference in survival rates at the end of follow-up. They proposed calculation of a generalized hazard difference (GHD) as a means to capture both dimensions in a single estimand and showed that the reciprocal of GHD equals the number of patient-years of follow-up that results in one fewer event (NYNT), a measure analogous to the number needed to treat for binary outcomes. Almost simultaneously, Uno and Horiguchi proposed the same measure, which they termed the “difference in average hazard with survival weight” and performed a simulation study to assess its power compared to Cox’s hazard ratio. One problem with GHD, however, is that it lacks power under early difference alternatives, i.e., when the survival curves separate and then converge. Here, rather than combining the two measures into a single index, we propose analyzing them separately, together with a maximum test (maximum Z-statistic). We find that the maximum test maintains the type I error rate while providing reasonably good power under a variety of alternatives (early difference, late difference, and proportional hazards). We illustrate the procedure using data from a randomized phase III clinical trial in prostate cancer.
CP17-3 – SCORE TESTS FOR NON-PROPORTIONAL HAZARDS IN SINGLE-ARM CLINICAL TRIALS WITH TIME-TO-EVENT ENDPOINTS: A SIMULATION STUDY
In oncology, well-powered randomized clinical trials with a time-to-event endpoint may be difficult to conduct in pediatric studies or biomarker-defined subsets, or may be considered unethical, which explains why many one- or two-stage single-arm designs, analogous to Simon’s two-stage design for a binary endpoint, have been developed in recent years. These designs rely on the one-sample log-rank test (OSLRT) and its modified version (mOSLRT) for comparing the survival curve of an experimental arm to that of an external reference group. These tests are developed under the proportional hazards (PH) assumption, which may be violated, in particular when evaluating immunotherapies. We proposed to adapt the OSLRT and evaluate alternatives for settings where PH does not hold. We extended Finkelstein’s score test (OSLRT), developed under the PH assumption, by using a piecewise exponential model with change-points (CPs) for early, middle and delayed treatment effects. For crossing hazards, we used an accelerated hazards model. As CPs are not known a priori, we developed a two-step approach with a landmark analysis using the mOSLRT to determine the time-dependent relative treatment effect and CPs, and then to select the appropriate score test. The restricted mean survival time (RMST)-based test is extended to the case of single-arm trials. We also developed a test defined as the maximum of the mOSLRT and the score tests for early and/or delayed effect (maxCombo test). The performance (type I error and power) of the different approaches is evaluated through a simulation study of a phase II single-arm trial with accrual and follow-up periods of 3 and 4 years, respectively. The reference group survival curve is generated from an exponential distribution with no sampling variability, and that of the experimental group from a piecewise exponential model.
The simulation parameters are sample size of the experimental group (from 20 to 200 patients), exponential censoring rate (from 0 to 35%) and relative treatment effect (hazard ratio from 0.5 to 1). A single-arm trial evaluating an inhibitor (n=91 and reference group n=136) for neuroblastoma patients is used for illustration. The simulation study shows that the developed score tests are more conservative than the mOSLRT but as conservative as the OSLRT. As expected, the score test has the highest power when the data generation matches with the model even when the CPs are misspecified. The landmark analysis works well only for large sample size (n>100). The RMST-based test is as conservative as the mOSLRT and more powerful than the mOSLRT only for an early effect with censoring rate less than 15%. The maxCombo test is conservative and more powerful than the mOSLRT when n>50 but less than the right score test under non PH. To conclude, the developed score tests are efficient under non PH when the approximate values of CPs are known. The maxCombo test is an interesting alternative when the relative treatment effect over time and the CPs are unknown. Further research is needed to study the impact of the historical control survival distribution and its sampling variability.
CP17-4 – ACCOUNTING FOR CENTER-LEVEL EFFECTS IN MULTICENTER RANDOMIZED CONTROLLED TRIALS
Investigators often conduct randomized controlled trials (RCTs) at multiple centers/sites when determining the effect of a treatment or an intervention. Recruiting across multiple institutions allows investigators to complete recruitment within a shorter timeframe and to generalize the study results across diverse populations. Despite having a common study protocol across multiple centers, the eligible participants may be heterogeneous, site policies and practices may vary, and the investigators’ experience, training, and expertise may also vary across sites. As a result, we usually observe some degree of heterogeneity in effect estimates across centers, despite all centers following the same study protocol. During the analysis of such a trial, investigators typically ignore center effects, but some have suggested considering centers as fixed or random effects in the model. It is not clear how considering the effects of centers, either as fixed or random effects, impacts the test of the primary hypothesis. In this article, we first review the practice of accounting for center effects in the analyses of published RCTs and illustrate the extent of heterogeneity observed in a few preexisting multicenter RCTs. To determine the impact of heterogeneity on the test of a primary hypothesis of an RCT, we considered continuous and binary outcomes and the corresponding appropriate model, namely, a simple linear regression model for a continuous outcome and a logistic regression model for a binary outcome. For each model type, we considered three methods: (a) ignore the center effect, (b) account for centers as fixed effects, or (c) account for centers as random effects.
Based on simulation studies of these models, we then examine whether considering the center as a fixed or random effect in the model helps to preserve or reduce the Type I and Type II error rates during the analysis phase of an RCT. Finally, we outline the threshold at which center-level effects are negligible and can thus be ignored, and provide recommendations on when it may be necessary to account for center effects during the analyses of multicenter randomized controlled trials.
SESSION 18
CP18-1 – THEORY TO PRACTICE: BAYESIAN RESPONSE ADAPTIVE RANDOMISATION IN A RARE DISEASE SETTING
Response-adaptive randomization (RAR) designs are valuable as they increase the likelihood of allocation to the most promising arm while maintaining randomization as the allocation method. However, their implementation remains a challenge when the trial size is very small. Motivated by the ongoing StratosPHere 2, a phase 2 trial in a rare disease setting, we discuss the trial design, the associated challenges and their proposed solutions. The trial design incorporates an additional step, Mapping, that converts the continuous randomization probabilities produced at the interim stage to a target vector of discrete randomization ratios, using a decision rule. This approach helps to avoid undesirable treatment allocations at each randomization stage while staying true to the essence of RAR. Under the implementation of Mapping, we analyze the impact of missing data and discuss the additional concern of reporting safety results while accounting for differential exposure to the treatments. The ultimate goal of this work is to foster greater synergy between practical and methodological research, which is crucially needed for translating the benefits of RAR into clinical practice.
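To make the idea of a Mapping step concrete, the following minimal Python sketch maps a vector of continuous randomization probabilities to the closest member of a pre-specified set of discrete allocation ratios. The decision rule shown (nearest candidate in squared Euclidean distance), the function name `map_to_ratio`, and the candidate ratios are all hypothetical choices for illustration; the abstract does not state the rule actually used in StratosPHere 2.

```python
def map_to_ratio(probabilities, candidate_ratios):
    """Return the candidate allocation ratio closest to the given probabilities.

    probabilities    : continuous randomization probabilities, one per arm
    candidate_ratios : allowed discrete ratios, e.g. (2, 1) means 2:1

    Hypothetical decision rule: choose the candidate whose implied
    allocation proportions minimise the squared Euclidean distance.
    """
    def proportions(ratio):
        total = sum(ratio)
        return [r / total for r in ratio]

    def distance(ratio):
        return sum(
            (p - q) ** 2 for p, q in zip(probabilities, proportions(ratio))
        )

    return min(candidate_ratios, key=distance)


if __name__ == "__main__":
    # Illustrative candidate ratios for a two-arm trial (an assumption).
    candidates = [(1, 1), (2, 1), (1, 2), (3, 1), (1, 3)]
    print(map_to_ratio([0.62, 0.38], candidates))
```

Restricting allocations to a small set of ratios in this way keeps each randomization stage implementable with short blocks, at the cost of departing slightly from the continuous RAR probabilities.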
CP18-2 – CALIBRATION OF DOSE-AGNOSTIC PRIORS FOR BAYESIAN DOSE-FINDING TRIAL DESIGNS WITH JOINT OUTCOMES
Dose-finding oncology trials are a crucial step in early clinical development. The goal of these trials is to assess the safety of novel anti-cancer treatments across multiple doses and recommend dose(s) for subsequent trials. Based on previous patients’ observed responses to treatment, trialists dynamically recommend new doses for further investigation during the trial. In traditional dose-finding designs, doses are escalated towards a target dose-limiting toxicity (DLT) rate, with the final recommendation identified as the maximum tolerated dose (MTD). Such adaptive decision making lends itself well to Bayesian learning, with Bayesian frameworks increasingly adopted to guide dose recommendations in model-based dose-finding designs. However, such approaches come at a cost of increased complexity, including the challenge of selecting appropriate priors. Such complexity is further amplified for model-based designs, where the incorporation of additional outcomes is leading to increasingly intricate designs. For instances where trialists lack prior knowledge of the MTD, methodology for calibrating dose-agnostic priors has been developed for the continual reassessment method (CRM) design with single outcomes. However, the application of such methodology to existing joint-outcome CRM-based trial designs to generate a priori dose-agnostic priors is flawed: it inflates lower dose recommendations and results in more patients being treated at sub-optimal doses. We address this issue by extending the calibration techniques for single-outcome trial designs and creating a new analytical approach to calibrate dose-agnostic priors for joint-outcome trial designs that jointly evaluate DLTs and efficacy responses, or DLTs and patient-reported outcomes (PROs). This analytical and computationally efficient technique maintains an a priori dose-agnostic prior with a reduced standard deviation of the proportion of correct selection across simulation scenarios.
This approach also improves the probability of correct selection of the optimal dose in a majority of scenarios. As Bayesian dose-finding trial designs continue to advance, research and guidance on the effective calibration of design parameters are essential to support their uptake and ensure optimal performance in practice. This method provides an analytical and intuitive approach to prior calibration, highlighting the importance of rigorous prior calibration in improving model accuracy and dose selection for safer, more effective oncology treatments.
CP18-3 – CALIBRATION-FREE ODDS CFO SUITE FOR DESIGNING VARIOUS PHASE I CLINICAL TRIALS
In the development of new cancer treatments, an essential step is to determine the maximum tolerated dose in a phase I clinical trial. To use the data more efficiently yet without any model assumption, we propose a novel calibration-free odds (CFO) approach to phase I trial design. Not only is the CFO design free of any dose-toxicity curve assumption, but it can also aggregate all the available information accrued in the trial for dose assignment. Seamless phase I/II trials, which aim to identify the optimal biological dose (OBD), have gained enormous popularity. To enhance the accuracy and robustness of OBD identification, the CFO design monitors both toxicity and efficacy. For toxicity monitoring, the CFO design casts the current dose in competition with its two neighboring doses to obtain an admissible set. For efficacy monitoring, CFO selects the dose that has the largest posterior probability of achieving the highest efficacy under the Bayesian paradigm. In contrast to most existing designs, the prominent merit of CFO is that its main dose-finding component is model-free and calibration-free, which can greatly ease the burden of manually specifying design parameters and thus enhance the robustness and objectivity of the design. We will also illustrate the implementation of CFO using its Shiny App, which is user-friendly and publicly accessible at https://clinicaltrialdesign.shinyapps.io/cfoapp/.
CP18-4 – DEVELOPING A SIMPLE PLAN FOR IMPLEMENTATION AND MONITORING OF COMPLEX RANDOMIZATION ALGORITHMS
Demand for sophisticated randomization algorithms such as Covariate Adaptive Randomization or Minimal Sufficient Balance is growing. These designs appeal to statisticians for their robust methods of minimizing predictability in treatment assignment, while maintaining treatment balance across baseline covariates. Translating designs from theory to practice is a challenging yet crucial step in the successful implementation and monitoring of these randomization algorithms in an Interactive Response Technology (IRT) system. Unlike frequentist designs, where randomization assignments are performed via a pre-generated randomization schedule, treatment assignments performed by randomization algorithms require a higher level of knowledge than just identifying the next record in the associated schedule. The theoretical design of these algorithms may appear daunting to non-statisticians but can be redefined in digestible step-by-step terms to ensure non-statistical team members have a firm understanding for IRT implementation. Current regulatory guidance requires sponsors to have monitoring plans for critical trial data. As randomization is one of these critical steps in trials, an appropriate Randomization Monitoring Plan (RMP) should be developed based on the algorithm’s implementation. An RMP provides patients and sponsors the assurance that a patient’s participation in a clinical trial is accurate and backed up by validated evidence. RMPs require clear documentation of (1) the required checks, (2) the frequency of reviewing the data, and (3) the communication plan for each review of randomization assignments. Collaboration between the sponsor and the independent personnel performing the monitoring is key in crafting an effective RMP. Sponsor stakeholders may have a non-statistical background and will rely on clear, concise, and accurate descriptions of IRT algorithm implementation for monitoring success.
Reporting of the randomization data in IRT also needs to be carefully considered to ensure effective randomization monitoring can be performed. Common checks for reviewing treatment assignments of complex randomization algorithms include verifying that the individual step-by-step calculations are correct and result in the actual treatment assignment for each patient. Additional checks may be required based on a clinical trial’s protocol design and IRT implementation. For example, if multiple responses are used to identify a single covariate level in IRT, then an additional check to confirm the mapping to the correct covariate level may be required in the monitoring plan. IRT is uniquely positioned to help define the complex randomization algorithm, given its intimate knowledge of the data structure, step-by-step calculations, and processes for randomization assignment to patients. This presentation intends to establish the importance of accurately defining the steps of complex randomization algorithms for IRT implementation and the monitoring of randomization data.
SESSION 19
CP19-1 – WAS IT WORTH IT: A POOLED ANALYSIS OF PARTICIPANT EXPERIENCE IN CANCER CHEMOPREVENTION TRIALS
CP19-2 – A COLLABORATIVE APPROACH FOR DEVELOPING SAFETY STOPPING RULES IN PHASE 2 ONCOLOGY CLINICAL TRIALS
Phase 1 oncology trials provide the initial safety data to support continued investigation of novel regimens, but their limited sample sizes can motivate subsequent phase 2 trials to continue close safety monitoring. Several statistical rules for monitoring safety have been proposed to support clinical evaluation of safety in phase 2 trials. These rules can be prespecified during study design to help protect against enrolling additional patients once it is decided that the regimen is not sufficiently well-tolerated, contrary to the assessment from the phase 1 study. These rules assess accumulating safety data to evaluate whether there is evidence that the study regimen’s toxicity exceeds a prespecified threshold. Some approaches control the probability of detecting excess toxicity when the toxicity is truly acceptable (type-1 error) and aim to have a high probability of detecting excess toxicity and holding accrual at higher toxicity rates. In practice, however, we observe that recommended rules calibrated to traditional type-1 error rates may have undesirable properties: they may allow unacceptably large numbers of early toxicities and may have limited probability of detecting excess toxicity under truly unacceptable toxicity rates. Therefore, any initial statistical rule based on traditional type-1 error likely needs to be modified based on clinical expertise to adequately protect the safety of patients under study. We outline a framework of close collaboration between statistical and clinical experts to develop rules that balance statistical operating characteristics with clinical acceptability. We illustrate this framework by reviewing the potential design of a clinical trial evaluating a novel hematopoietic cell transplantation regimen for pediatric patients with high-risk hematological malignancies.
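The operating characteristics discussed above can be examined with a short exact calculation. The Python sketch below computes the probability that a count-based stopping boundary is crossed for a given true toxicity rate; evaluated at an acceptable rate this is the type-1 error, and at an unacceptable rate it is the probability of detecting excess toxicity. The function name `stopping_probability`, the form of the rule (stop when the toxicity count reaches a limit at a scheduled look), and the example boundary are assumptions made for illustration, not the rules proposed in the abstract.

```python
def stopping_probability(boundaries, true_rate):
    """Exact probability of crossing a safety stopping boundary.

    boundaries : dict mapping number of evaluated patients -> minimum number
                 of toxicities that triggers stopping at that look
    true_rate  : true per-patient probability of a toxicity

    Patients are assumed independent with a common toxicity probability.
    """
    n_max = max(boundaries)
    # state[k] = probability of having k toxicities and not having stopped
    state = {0: 1.0}
    stopped = 0.0
    for n in range(1, n_max + 1):
        new_state = {}
        for k, prob in state.items():
            new_state[k] = new_state.get(k, 0.0) + prob * (1 - true_rate)
            new_state[k + 1] = new_state.get(k + 1, 0.0) + prob * true_rate
        limit = boundaries.get(n)
        if limit is not None:
            # Move probability mass at or above the limit to "stopped".
            for k in list(new_state):
                if k >= limit:
                    stopped += new_state.pop(k)
        state = new_state
    return stopped


if __name__ == "__main__":
    # Hypothetical boundary: looks after 5, 10, 15 and 20 patients.
    rule = {5: 3, 10: 5, 15: 6, 20: 8}
    for rate in (0.20, 0.30, 0.40, 0.50):
        print(rate, round(stopping_probability(rule, rate), 3))
```

Tabulating this probability across a range of true rates, alongside the number of toxicities the rule tolerates at each look, is one way to put the statistical and clinical perspectives side by side when revising a candidate rule.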
CP19-3 – USING BENEFIT-RISK METHODS TO ADJUST THE NON-INFERIORITY MARGIN BASED ON TREATMENT BENEFITS
Non-inferiority trials are used to test whether a new treatment is no worse than the comparator. A major design component is the non-inferiority margin (NIM), which represents the largest difference between the two treatments at which the new treatment is still considered non-inferior. The chosen NIM has a major impact on the design and subsequent outcome of a trial; however, choosing the NIM can sometimes be difficult. This is especially the case when an increase in the NIM, i.e. a higher level of acceptable inferiority, might be considered due to the trade-off with a benefit of the new treatment. Benefit-risk (B-R) methods aim to create transparency and consistency when trading off between multiple outcomes. These methods are commonly used to gain regulatory approval at the end of a clinical trial but could be implemented at the design stage when setting the NIM. The aim of this work was to test the potential of using B-R methods to aid in adjusting a NIM based on additional benefits. Following a systematic review of potential B-R methods and the application of formal, researcher-led selection criteria, four different benefit-risk methods were applied to two real-life NI trials. The four methods tested were the Benefit-Risk Action Team (BRAT) framework, the Unified Methodologies for Benefit-Risk Assessment framework, Multi-Criteria Decision Analysis (MCDA), and the Food and Drug Administration’s (FDA) Benefit-Risk Framework (BRF). The applied methods were presented to stakeholders (n=6) during semi-structured interviews. Stakeholders were asked about the perceived usefulness of the B-R methods and asked to highlight any barriers to use in practice. Implementation of the B-R methods was found to be most straightforward with the FDA BRF. Stakeholders felt this was useful for all trial teams as a starting point to be more explicit about the considerations but lacked some complexity.
MCDA was the most complex and quantitative method used and is the only one to output a suggested value that can be used for the NIM. This approach raises concerns about obscuring the information used in decision-making. However, stakeholders considered it beneficial, as they found selecting a value for the NIM to be difficult and arbitrary. The BRAT framework presented the most information and was considered the most useful, with the important caveat of needing quality data to be available. Stakeholders felt that all methods would assist with the justification produced for the NIM. In conclusion, using B-R methods provided improved structure and transparency around the decision making for the value of the NIM by formally considering an adjustment based on the benefit of the treatment (if appropriate). The use of formal methods provides a level of robustness and consistency that will help to improve the design of NI trials. Additionally, using the B-R methods was found to improve the justification provided for the NIM and, importantly, to help readers distinguish between evidence and subjective decision-making within the justification. Improving the decision-making and justification of the NIM will hopefully improve confidence in NI trials and their results for all stakeholders.
CP19-4 – ASSURANCE METHODS FOR DESIGNING A SURVIVAL TRIAL WITH DELAYED TREATMENT EFFECTS
An assurance (probability of success) calculation is a Bayesian alternative to a power calculation. These calculations are becoming more regularly performed in industry, especially in the design of Phase III confirmatory trials. Immuno-oncology (IO) is a rapidly evolving area in the development of anticancer drugs. A common phenomenon that arises from IO trials is one of delayed treatment effects, that is, a delay in the separation of the Kaplan-Meier survival curves. To calculate assurance for a trial in which a delayed treatment effect is likely to be present, uncertainty about key parameters needs to be considered. If uncertainty is not considered, then the number of patients recruited may not be enough to ensure we have adequate statistical power to detect a clinically relevant treatment effect. We present an elicitation technique for when a delayed treatment effect is likely to be present and show how to compute assurance using these elicited prior distributions. We provide an example to illustrate how this could be used in practice.
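As a rough illustration of the idea, assurance can be approximated by Monte Carlo: draw the uncertain parameters from their prior, simulate a trial under each draw, and report the proportion of simulated trials that succeed. The Python sketch below does this for a delayed treatment effect modeled as a piecewise exponential hazard. It is a sketch under stated assumptions, not the authors' method: the function names, the two-parameter prior (delay length and post-delay hazard ratio), administrative censoring only, and the one-sided log-rank success criterion at a critical value of 1.96 are all choices made here for illustration.

```python
import math
import random


def delayed_effect_time(rng, control_hazard, delay, hazard_ratio):
    """Draw a survival time whose hazard equals the control hazard until
    `delay`, then control_hazard * hazard_ratio afterwards."""
    cumulative = rng.expovariate(1.0)  # unit-exponential cumulative hazard
    if cumulative <= control_hazard * delay:
        return cumulative / control_hazard
    excess = cumulative - control_hazard * delay
    return delay + excess / (control_hazard * hazard_ratio)


def logrank_z(times, events, groups):
    """Log-rank Z statistic (negative favours group 1); assumes no tied events."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    n_at_risk, n1_at_risk = len(times), sum(groups)
    obs_minus_exp, variance = 0.0, 0.0
    for i in order:
        if events[i]:
            p = n1_at_risk / n_at_risk
            obs_minus_exp += groups[i] - p
            variance += p * (1 - p)
        n_at_risk -= 1
        n1_at_risk -= groups[i]
    return obs_minus_exp / math.sqrt(variance) if variance > 0 else 0.0


def assurance(n_per_arm, control_hazard, follow_up, draw_prior, n_sims, rng):
    """Proportion of simulated trials with a significant log-rank test,
    averaging over prior draws of (delay, hazard_ratio)."""
    successes = 0
    for _ in range(n_sims):
        delay, hazard_ratio = draw_prior(rng)
        times, events, groups = [], [], []
        for group in (0, 1):
            for _ in range(n_per_arm):
                if group == 0:
                    t = rng.expovariate(control_hazard)
                else:
                    t = delayed_effect_time(rng, control_hazard, delay, hazard_ratio)
                events.append(t < follow_up)  # administrative censoring only
                times.append(min(t, follow_up))
                groups.append(group)
        if logrank_z(times, events, groups) < -1.96:
            successes += 1
    return successes / n_sims


if __name__ == "__main__":
    # Hypothetical prior: delay of 0 to 0.5 years, hazard ratio 0.5 to 0.9.
    prior = lambda r: (r.uniform(0.0, 0.5), r.uniform(0.5, 0.9))
    print(assurance(150, 0.5, 3.0, prior, 500, random.Random(2024)))
```

In practice the prior would come from an elicitation exercise rather than the uniform distributions used here, and the simulation would also need to reflect the planned recruitment pattern and dropout.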
SESSION 20
CP20-1 – OPEN SCIENCE: SHARING DATA AND RESOURCES IN ALZHEIMER’S DISEASE CLINICAL TRIALS
CP20-2 – DEVELOPING A STANDARDIZED DATA MANAGEMENT PLAN TEMPLATE FOR RANDOMIZED CONTROLLED TRIALS
CP20-3 – ENHANCING THE QUERY PROCESS IN CLINICAL TRIALS THROUGH THE USE OF A QUERY PORTAL SYSTEM
The University of Pittsburgh’s Center for Biostatistics and Qualitative Methodology Data Coordinating Center (CBQM-DCC) utilizes a homegrown electronic system for data management (eSYSDM) that integrates eligibility confirmation, randomization, participant management, safety reporting, laboratory sample management, outcome adjudication, and data and safety monitoring. The DCC monitors data quality at regular intervals and promotes direct data entry, which allows for monitoring data in real time. While the use of the eSYSDM has provided increased efficiency in data and safety monitoring, the data querying process to participating sites remains cumbersome and highly reliant on dissemination of queries via email and time-consuming manual verification of query resolution. In this talk, we describe the conception, development, and implementation of a web-based query portal integrated within the eSYSDM. Through use of this portal, the DCC’s clinical trial managers can create data queries for each participating site and can track the query through resolution. They can identify missing or discrepant values from either data quality reports or by nature of reviewing data entered into the case report form, create a data query directly in the form along with a deadline, and provide instructions or comments regarding the action item. Once submitted, the data query created will be visible immediately upon a login by clinical site users. The eSYSDM will present a reminder to the site user that there are outstanding data queries; however, the user can proceed with participant management or other activities before resolving the query. The query can be viewed at any time and data changed directly in the form. Queries can be submitted back to the clinical trial manager with comments or confirmation that the variable in question does not need to be changed. 
Lastly, the clinical trial manager can manage the cumulative list of queries issued to each site and easily view and verify query resolution within the eSYSDM. The ability to create, track, and manage queries directly in the system reduces the amount of email communication to the clinical sites and eliminates the loss of data query communication when site staff turn over. The query portal ensures that all trained users of the eSYSDM at the respective site can resolve queries. Finally, we will describe future enhancements to the query portal, such as integrating site-identified data change requests and approval of those requests by the DCC, as well as system-generated queries created automatically based on criteria defined for our monthly reports. These could include missing or incomplete forms, problematic data values, and unresolved AEs.
CP20-4 – HARNESSING ARTIFICIAL INTELLIGENCE IN DATA MANAGEMENT
Artificial intelligence (AI) tools can be profoundly helpful in clinical trial operations like software development, information technology, and data management. These tools allow us to streamline previously time-consuming tasks. SWOG Cancer Research Network has been increasingly leveraging AI to enhance operational efficiency. One of the most significant AI advancements lies in the automation of software development and other IT tasks. AI systems excel at addressing well-defined software problems for which extensive training data exists. By automating routine coding tasks through code generation and effective AI prompting, these systems enable our developers to increase productivity and to redirect their time to more novel and trial-specific challenges. For example, developers can generate code for more common tasks like input processing, setting up unit tests, and logging errors. The generation of tailored scripts for IT deployments and configurations has streamlined daily IT tasks to support more efficient and scalable infrastructure management. SWOG has also made progress in setting up systems to facilitate easier interrogation of our documents. With AI, we can simply ask questions of our document library like “What is SWOG’s AI Policy?” or “How are adverse events collected in this trial?” instead of having to manually search through folders or long documents. This improves both information accessibility and decision-making agility. Important to SWOG’s work has been developing a framework for these tools to be used successfully and ethically. SWOG’s framework allows for these tools to be used but also implements restrictions on them to maintain privacy of sensitive data and maintain humans as the final decision makers when making critical decisions. This mitigates the risks of AI tools straying from anticipated use or providing inaccurate information.
SWOG has utilized a variety of third-party AI tools such as ChatGPT and Copilot but is also training local AI models to help with tasks that require additional privacy and customization. While the commercial frontier AI models offer impressive state-of-the-art capabilities, they cannot offer the same level of privacy and configuration options that locally developed models can. The AI revolution offers unprecedented opportunities to make clinical trial operations more efficient, and we anticipate these opportunities to expand further over time.
SESSION 21
CP21-1 – DUAL COORDINATION CENTER ORGANIZATION AND STUDY START-UP OPERATION EFFICIENCIES FOR THE ALL ALS CONSORTIUM
Hypothesis: The dual CCC model harnesses unique perspectives, experiences, and resources allowing for collaboration to accelerate study start-up, site contracting, site activation, and enrollment for protocols under the ALL ALS Consortium. Flexible, expense-allocation workflows result in budget efficiencies not typically available in the single CCC model.
CP21-2 – OPTIMIZING TRIAL SUCCESS: KEY STAFFING CONSIDERATIONS FOR PLATFORM TRIALS
Hypothesis: Staffing in a platform trial requires thoughtful allocation of responsibilities and resources as well as the flexibility to adapt to evolving operational needs.
CP21-3 – PAIRED MATCHED RANDOMIZATION: THE EFFECT OF INCORPORATING REAL WORLD DATA AND MACHINE LEARNING METHODS IN CLINICAL TRIALS
Potential relevance and impact: This work explores the possible gains from integrating ML techniques into RCTs so that historical information can be used to maintain power and improve balance without inflating the type I error.
CP21-4 – A GLOBAL REVIEW OF DECENTRALISED CLINICAL TRIALS: KEY ELEMENTS AND TRENDS
SESSION 22
CP22-1 – DIVERSITY IN CLINICAL TRIALS: HOW ACTIONABLE IS REGULATORY GUIDANCE?
CP22-2 – RESULTS MAY VARY: USING SIMULATION METHODS TO IMPROVE DIVERSITY IN CLINICAL TRIALS
CP22-3 – DEVELOPING METHODS WE TRUST: A WORKSHOP AND VIDEO PROJECT WITH LICTR AND THE CULTURALLY DIVERSE HUB
CP23-4 – CONSENT TO ACCESS AND USE HEALTH SYSTEMS DATA FOR TRIALS – WORKING TOGETHER WITH PATIENTS AND THE PUBLIC TO FIND APPROPRIATE LANGUAGE FOR PARTICIPANT-FACING MATERIALS (CROSSWORD)
Results structure and timelines: Review results show that current consent statements related to data access do not comply with the readability level recommended for patient communication materials. We will present results from a thematic analysis of public focus group views on the language currently used, and propose co-developed examples of consent wording to use when accessing confidential patient records for RCTs. Analysis is to conclude by March 2025.
Relevance and impact: Given the potential efficiency gain of using data held in health systems in trials, it is vital that we increase transparency in the requirements for continuous participant consent. This work will contribute to greater clarity in the requirements for consent language to successfully access HSD from UK registries and have wider applications internationally for how we communicate complex information to trial participants. Appropriate consent wording should also meet the needs of participants, for truly informed consent. This work contributes to the wider CrossWord project, continuing to work with patients and the public to co-develop guidance for appropriate consent language to enable data access for trials.
POSTER PRESENTATIONS
P-1 – SHARING CLINICAL TRIAL DATA IN LIMITED-ACCESS REPOSITORIES: AN EXAMPLE FROM NEURONEXT
Sharing clinical trial data in public repositories can be beneficial for advancing clinical research in a cost-effective way by leveraging existing data for exploratory analyses and meta-analyses that can generate new lines of research. Regulatory bodies increasingly mandate incorporating data sharing into early trial planning, and initiatives from scientific publishers, like the recommendations from the International Committee of Medical Journal Editors, require a data sharing statement in order to publish trial results in their journals. While the importance of data sharing is well understood within the scientific community to promote transparency, the practicalities of navigating the regulatory and ethical guidelines involved with data sharing can be complex. This presentation outlines the process of sharing clinical trial data in the limited-access National Institute of Neurological Disorders and Stroke (NINDS) and National Institute of Mental Health Data Archive (NDA) repositories using examples from NeuroNEXT studies to illustrate the key steps of data sharing. Clinical trials from NeuroNEXT will be used to demonstrate the necessary actions required to prepare and submit data. There are some common pitfalls that can be avoided through early planning before the trial starts, including reviewing current regulatory standards, ensuring participants are properly consented on what will be shared, and selecting an appropriate public repository. Relevant regulations will be detailed, like the latest NIH Policy for Data Management and Sharing that became effective on January 25, 2023. After the trial has concluded, data must be properly deidentified to protect participant privacy, quality control must be ensured on shared datasets, detailed metadata must be prepared to accompany the shared data, and data must be submitted in accordance with the repository’s guidelines. 
Different data repositories can have their own set of requirements and processes, but regardless of which repository is selected, providing a set of standardized documentation on things like the trial design, outcome measures, and the data dictionary can ensure that data follow the FAIR Principles. Similarities and differences in requirements between the NINDS, NDA, and other HEAL-compliant repositories will be highlighted. Finally, the details of who can access the data in the repositories, and under what circumstances access will be granted, will also be outlined. This real-world example provides practical guidance on the best practices for researchers and clinicians to use when navigating the clinical trial data sharing process.
P-2 – AUTOMATED MONITORING AND ALERTS FOR CLINICAL TRIAL OPERATIONS
P-5 – USE OF A CLOUD-BASED SYSTEM FOR CENTRAL REVIEW OF DE-IDENTIFIED ECHOCARDIOGRAMS IN CLINICAL TRIALS
In the past, when eligibility or outcomes for a multicenter trial were determined by the review of patient imaging, it may have been difficult to have images centrally reviewed, leading to greater variability of local diagnoses within the included population. With the increased use of data sharing platforms and digital methods to de-identify patient data, cloud-based platforms can be used to facilitate an efficient and precise image review and adjudication process. CORD-CHD (CORD Clamping Among Neonates with Congenital Heart Disease) is a multi-center randomized clinical trial that aims to determine the optimal timing of umbilical cord clamping in neonates born with congenital heart disease (CHD) to improve short-term postnatal and longer-term neurodevelopmental outcomes. Eligible participants have a cardiac lesion that is expected to require neonatal cardiac intervention. Critical to this trial is the eligibility determination of a Fetal Cardiovascular Disease Severity Score based on ECHO imaging review. The final adjudication is performed by a central reviewer, the CORD-CHD Fetal ECHO Core (FEC), to reduce misclassification bias and to allow for a more precise and standardized method of determining eligibility for the clinical trial. Herein, we present an approach and framework for centrally reviewing ECHO images to aid eligibility screening for a clinical trial, and report on barriers encountered and their resolutions. During screening, CORD-CHD sites upload a locally-captured ECHO to the cloud-based system, where images are managed by the data coordinating center (DCC) and sent to the FEC. During upload, the PHI embedded in the ECHO file is removed systematically, and users are able to black out PHI stamped on the image itself with the use of a pixel de-identification tool (PDIT). Use of this cloud-based platform allows sites without local de-identification methods to securely send their images to the central reviewer.
Barrier: Originally, the FEC were to review the images on the cloud-based system. Due to the size of the images—in some cases almost 3GB—the FEC spent one hour loading and reviewing each image. Resolution: The DCC and the FEC worked with the cloud-based system host to create a protected location for the files to be reviewed by the FEC locally. The FEC now usually spends about 10-15 minutes completing reviews. Barrier: Upload times for users at the clinical centers were exceptionally long; e.g., two hours. Resolution: The DCC educated users and strongly recommended sites only upload locally, rather than over a VPN; after changing their process to avoid uploading over VPN, the majority of sites now need approximately 25 minutes to complete their upload. Use of the PDIT increased upload times substantially, so the DCC recommends local de-identification of images, if possible. Barrier: The PDIT may be used incorrectly, where PHI stamped on the image is still visible after upload. Resolution: Use of the PDIT required detailed one-on-one training and practice.
P-7 – IMPROVING DATA COLLECTION EFFICIENCY: INTEGRATION OF INTERNAL HOSPITAL DATABASES WITH REDCAP
P-9 – SAMPLE SIZE IN CROSSED-DESIGN SURGICAL TRIALS: ARE WE IGNORING THE NON-IGNORABLE? EXPLORING THE EFFECTS OF TREATMENT HETEROGENEITY, RECRUITMENT RATES, AND DIFFERENT OPERATING-CHARACTERISTICS ON SAMPLE SIZE
Sample size calculations are a crucial design and cost feature of clinical trials. Compared to drug trial designs, surgical trials have additional design factors to consider that could affect sample size. Surgeon experience and skill, correlated observations from patients receiving treatment from the same surgeon, the interaction between surgeon and treatment, and the learning of a new surgical intervention are often overlooked in trial design. Current guidance for the design of surgical trials does not consider all these parameters. A crossed design is where each surgeon performs all levels of the intervention, and patients are randomized to an intervention. According to current guidance, when we consider the main effect of surgeon in a crossed design, the correlation between observations can only be beneficial to power. However, this previous work was without consideration of the aforementioned relevant factors. When performing analyses that use a model to account for them, there is the potential for a reduction in power, and these models may also fail to converge at a specified sample size. We have explored the effects of using more exhaustive models on sample size requirements to inform and extend current guidance on when these quantities should or should not be ignored. We focus on exploring the effect on power of the following: including a separate random effect for each treatment level, varying the correlations between the random effects, and varying recruitment rates across the trial duration between surgeons for a binary endpoint, such as surgical complication. Simulation of each of the above scenarios is used to estimate the required sample size for a desired power and/or estimate power at a specified sample size. The ROLARR trial of robotic surgery was used to inform the creation of data-generating models that are representative of the real world.
We report the power and type-I error when fitting a model ignoring surgeon-treatment interaction when it exists; the effect of uneven recruitment on power; the reduction in power when you do account for surgeon-treatment interaction with varying values of correlation between surgeon-treatment interaction and main surgeon effect; and finally, the convergence rates of these models. We highlight proposed amendments to current guidance.
P-10 – A STRATEGIC PLANNING CHECKLIST FOR EXECUTION OF FOLLOW-UP IN CLINICAL TRIALS FOR CRITICALLY ILL CHILDREN
P-11 – A PRACTICAL SOLUTION TO MINIMISE SELECTION AND CHRONOLOGICAL BIAS IN RCTS
P-12 – HOW TO ACCOUNT FOR SUBSEQUENT ANTI-CANCER THERAPY WHEN ANALYSING OVERALL SURVIVAL: A SIMULATION STUDY AND PRACTICAL EXAMPLE
The cancer clinical trial community, including the FDA, acknowledges that the uptake of subsequent anti-cancer therapies can influence the interpretation of overall survival in phase III confirmatory trials. Nevertheless, overall survival remains the gold standard definitive endpoint. It is used alongside other evidence in health technology appraisals to evaluate the performance of experimental interventions, where a limitation is often stated to be the unquantifiable impact of subsequent anti-cancer therapies. In a bid to disentangle the uncertainty concerning subsequent anti-cancer therapies, the objective of this work was to investigate the novel hypothetical question: “What is the experimental trial intervention effect on overall survival compared to the control intervention, if all participants who discontinued prior to death went on to receive the same subsequent anti-cancer therapy?” We will discuss how existing statistical techniques, including the simple (intention-to-treat and per protocol) and the complex (two-stage and inverse probability of censoring weighting), performed when applied to the hypothetical question in an extensive simulation study. We considered a variety of scenarios, including variations of the true experimental intervention effect and the timing of intervention discontinuation. We evaluated the different methods in terms of bias, coverage, and power. We will demonstrate the practicability of the methods through a case study. The case study will consider the analysis of overall survival in a non-inferiority trial in kidney cancer. In the trial, some participants stopped their trial intervention and began immunotherapy during follow-up. This case study was chosen as second-line immunotherapy was not standard of care for this population at trial outset. As more and more effective therapies are made available to patients, this is likely to be a scenario many in the cancer clinical trial community will experience.
We will discuss the generalizability of the novel hypothetical question, how it improves the interpretation of clinical trials results, and the necessary considerations when analyzing overall survival in such situations.
P-14 – REFINING DATA PRESENTATION IN DATA AND SAFETY MONITORING BOARDS REPORTS: AN EXAMPLE FROM A CLINICAL TRIAL EVALUATING ANTIFUNGAL THERAPY IN PEDIATRIC UNCOMPLICATED CANDIDEMIA
Data and Safety Monitoring Boards (DSMBs) play a critical role in clinical trials, using emerging data to monitor the welfare of trial participants through assessment of the benefit-risk balance of interventions, and determining whether randomization and follow-up should continue. High-quality DSMB reports are essential for informed decision-making. Unfortunately, in practice, DSMB reports often lack sufficient structure and are excessively lengthy and dense, typically filled with numerous tables and listings. The report volume, coupled with a lack of clear organization, prioritization of presentations, and contextual explanation, can hinder the identification of important treatment effects, and result in sub-optimal DSMB recommendations. DSMB reports are distinct from other types of clinical trial documentation such as clinical study reports or journal articles. DSMB reports are aimed at their target audience, the DSMB. Optimal reports present data in a way that tells a complete and coherent story, prioritizing clarity and actionable insights. Comparison of Uncomplicated Candidemia Therapy Duration in Children and Adolescents (COUNT; NCT05763251) is a multi-center, randomized controlled study comparing two antifungal therapy durations in pediatric patients with uncomplicated candidemia. COUNT is employing the Desirability of Outcome Ranking (DOOR) paradigm, a patient-centric paradigm for designing, analyzing, interpreting, and reporting clinical trial results, focusing on patient-centric benefit-risk evaluation. COUNT was designed with three planned interim analyses and biannual safety reviews. A refined DSMB report template was developed for COUNT in support of effective monitoring. The report emphasizes the use of visual aids, including bar charts, stacked bar charts, forest plots, and predictive interval plots, complemented by concise tables and narrative summaries.
The report is designed to distill complex data into clear, digestible presentations that facilitate quick and accurate benefit-risk assessments by DSMB members. Visual patient stories complement summary tables and figures to provide insights into the comprehensive experience of each participant. In this presentation, we will introduce the DSMB reporting approach, outlining principles for preparing reports and providing practical guidance for ensuring clarity and utility. We illustrate the approach using COUNT as a prototypical example.
P-15 – SAMPLE SIZE RE-ESTIMATION IN CLUSTER RANDOMIZED CLINICAL TRIALS
Pragmatic clinical trials such as the parallel arm cluster-randomized trial (CRT) are increasingly utilized to understand the delivery of care in real-world settings; however, these designs are complex. Cluster-randomized trials possess a hierarchical data structure (e.g., patients clustered within clinics) and the intraclass correlation coefficient (ICC) (degree of variability between clusters) is needed to conduct the power analysis. Under- or overestimation of the ICC at the design stage may lead to significantly under- or over-powered clinical trials. The designs proposed in the literature for sample size re-estimation methods to adjust for the potentially mis-specified ICC at the design stage do not control type I error rate. Further, designs to adjust sample size in a CRT based on both mis-specified ICC and a mis-specified treatment effect have not been adopted widely. We propose an adaptation of two traditional adaptive sample size re-estimation methods to CRTs that account for both mis-specified ICC and a mis-specified treatment effect. We propose an extension of the internal pilot study approach using a weighted test statistic as a method for type I error control. We also extend the promising zone design to CRTs using conditional power for sample size re-estimation under the linear mixed-effects model. We illustrate the impact of interim accrual on these methods in the CRT setting using simulation studies. Our proposed methods adequately control type I error and recover lost study power.
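To see why a mis-specified ICC matters, the sketch below uses the standard design effect for equal cluster sizes, 1 + (m - 1) × ICC, to inflate an individually randomized sample size. It assumes a two-arm comparison of means with a normal approximation. It is a naive recalculation only, not the weighted test statistic or promising zone methods proposed in the abstract, and it does not by itself control the type I error rate. The function names and example values are illustrative assumptions.

```python
import math
from statistics import NormalDist


def n_per_arm_individual(delta, sd, alpha=0.05, power=0.9):
    """Per-arm sample size for a two-sided z-test comparing two means."""
    z = NormalDist().inv_cdf
    return 2 * ((z(1 - alpha / 2) + z(power)) * sd / delta) ** 2


def clusters_per_arm(delta, sd, icc, cluster_size, alpha=0.05, power=0.9):
    """Clusters per arm after inflating by the design effect 1 + (m - 1) * ICC."""
    design_effect = 1 + (cluster_size - 1) * icc
    n = n_per_arm_individual(delta, sd, alpha, power) * design_effect
    return math.ceil(n / cluster_size)


# Hypothetical example: the ICC assumed at design versus a larger interim estimate.
planned = clusters_per_arm(delta=0.3, sd=1.0, icc=0.02, cluster_size=30)
revised = clusters_per_arm(delta=0.3, sd=1.0, icc=0.05, cluster_size=30)
print(planned, revised)
```

Because the design effect grows linearly in the ICC, even a small upward revision at the interim can call for noticeably more clusters.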
P-18 – CLINICAL TRIAL TRANSPARENCY: PATIENT PREFERENCES ON CLINICAL TRIAL RESULTS DISSEMINATION FOLLOWING THE PREPARE TRIAL
P-19 – ATTITUDES OF PATIENTS AND CAREGIVERS TO THE AVAILABILITY OF A DEDICATED SOCIAL WORKER IN THE OUTPATIENT FRACTURE CLINIC
P-20 – EXPLORING BAYESIAN GROUP-SEQUENTIAL DESIGNS IN MULTIPLE-PERIOD PARALLEL-ARM CLUSTER RANDOMIZED TRIALS
Group-sequential designs (GSDs), which have been extensively explored in individually randomized trials, demonstrate substantial reductions in expected sample sizes by enabling early stopping at pre-specified interim analyses. While the GSD can be developed within either a Frequentist or Bayesian framework, Bayesian GSDs offer additional flexibility by providing a formal mechanism for incorporating external information, such as historical data or expert knowledge, which enhances the efficiency and interpretability of trial results. The adoption of GSDs in cluster randomized trials (CRTs) could effectively address the statistical challenges arising from having a limited number of clusters in CRTs, but their use has been limited. In this study, we specifically focus on evaluating Bayesian GSD on cross-sectional multiple-period CRTs with a baseline period, where outcomes are measured at equally spaced time intervals both before and after the intervention. However, the ideas apply to any type of CRT. Our primary objective is to introduce and evaluate the statistical efficiency of Bayesian GSDs in multiple-period CRTs under three situations. First, we compare the performance (sample size and power under various standardized effect sizes) of the Bayesian GSD with a non-informative normal prior centered around 0 and a precision of 0.001 (which heuristically should yield similar performance to an equivalent Frequentist GSD) to a Frequentist fixed sample-size CRT design while managing Type I error rates. Next, we examine the performance when informative priors (normal with mean = effect size and precision 0.5) are used, both with and without controlling for Type I error rates. Our results showed that the Bayesian GS multiple-period CRT with a non-informative prior resulted in similar power while controlling for Type I error rates compared to the Frequentist fixed design. However, this design allowed trials to stop earlier, which reduced the expected cluster sizes by up to 40%.
If we switch to the informative prior while still maintaining the Type I error rates, the simulation returns a similar performance to that of the designs with the non-informative prior. Once we do not control for Type I error rates, the designs showed up to 19% higher statistical power with the informative prior and with a Type I error of 0.09. The magnitudes of these results depend on the degree of informativeness of the priors, effect sizes, and trial sizes. In summary, we introduced the Bayesian GSD to multiple-period CRTs. Our results indicate that if we require control of the Type I error rate, reductions in expected sample size arising from using a Bayesian GSD are attributable to the group sequential aspect, not from the use of informative priors. This result matches the findings reported previously in the literature for individually randomized trials. Informative priors further reduce sample size only if the Type I error rate is allowed to increase. However, regardless of whether Type I error control is required, Bayesian methods enable bringing existing evidence into the current study and increase the interpretability of the parameters (e.g., probability-based inference instead of P-values), which has been widely discussed in the literature.
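The role of the prior precision can be illustrated with the textbook normal-normal conjugate update. The sketch below assumes that the interim treatment effect estimate is approximately normal with a known standard error, which is a simplification of the multiple-period CRT model in the abstract. The function names and the example estimate are illustrative assumptions, not the authors' implementation.

```python
from statistics import NormalDist


def posterior(prior_mean, prior_precision, estimate, std_error):
    """Posterior mean and SD for a normal prior and a normal likelihood."""
    data_precision = 1.0 / std_error ** 2
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean + data_precision * estimate) / post_precision
    return post_mean, post_precision ** -0.5


def prob_benefit(prior_mean, prior_precision, estimate, std_error):
    """Posterior probability that the treatment effect exceeds zero."""
    mean, sd = posterior(prior_mean, prior_precision, estimate, std_error)
    return 1.0 - NormalDist(mean, sd).cdf(0.0)


# Hypothetical interim estimate of 0.2 with standard error 0.1.
vague = prob_benefit(0.0, 0.001, 0.2, 0.1)       # prior N(0, precision 0.001)
informative = prob_benefit(0.3, 0.5, 0.2, 0.1)   # prior centred on an assumed effect of 0.3
print(vague, informative)
```

With precision 0.001 the prior contributes almost nothing, so the posterior tracks the data. An informative prior centred on a positive effect shifts the posterior towards benefit, which is also why it can inflate the Type I error rate when the true effect is zero.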
P-21 – THE EFFECT OF PREDICTING SERIOUS ADVERSE EVENT FOR GUIDING THE ENROLLMENT PROCEDURE IN CLINICAL TRIALS
P-22 – DISTANCE-BASED NONPARAMETRIC ESTIMATORS FOR COMPARING GROUPS IN BEHRENS-FISHER SETTINGS WITH CLUSTERED CLINICAL DATA
P-23 – A NON-PARAMETRIC APPROACH TO PREDICT RECRUITMENT FOR RANDOMIZED CLINICAL TRIALS-IMPLEMENTATION IN R
Accurate prediction of subject recruitment is crucial for the success of clinical trials but remains a persistent challenge. Existing prediction models often rely on parametric assumptions, which may not hold, or Bayesian methods, which require prior knowledge that can be difficult for investigators to provide. We introduce RCTRecruit, an R package that employs a novel, flexible, non-parametric, weighted resampling approach for clinical trials. Specifically, we implemented the methodology in inpatient settings such as acute care for the elderly (ACE) units at the University of Texas Medical Branch. When simulating future enrollment, our method assigns higher weights to empirical data from dates similar to those in the target period, effectively accommodating seasonal patterns in recruitment. Simulated distributions and resampling techniques are then used to calculate confidence intervals for recruitment numbers at the end of a recruitment period or upon reaching a target number of participants. This method handles diverse enrollment patterns and anticipated changes in recruitment, using only a recruitment log as input. Using RCTRecruit, we applied this method to recruitment data from the GRIPS and PACE studies, comparing its performance to both bootstrapping and Bayesian methods. In these real-world applications, RCTRecruit outperformed both alternatives. Our package demonstrates the feasibility and ease of implementing a flexible, non-parametric weighted resampling approach that requires minimal input from the investigator.
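The idea of weighting historical data by calendar similarity can be sketched in a few lines. The example below is written in Python for illustration and is not the RCTRecruit API. The geometric weighting scheme, the `decay` parameter, and the function names are assumptions made for this sketch, and the interval returned is a simple percentile interval over the simulated totals.

```python
import random


def circular_distance(a, b, period=52):
    """Distance in weeks between two calendar weeks, wrapping around the year."""
    d = abs(a - b) % period
    return min(d, period - d)


def simulate_total(weekly_counts, start_week, n_weeks, rng, decay=0.5):
    """Simulate total enrollment over n_weeks by weighted resampling of past weeks."""
    period = len(weekly_counts)
    total = 0
    for offset in range(n_weeks):
        target = (start_week + offset) % period
        # Historical weeks closer in the calendar to the target week get larger weights.
        weights = [decay ** circular_distance(w, target, period) for w in range(period)]
        total += rng.choices(weekly_counts, weights=weights, k=1)[0]
    return total


def recruitment_interval(weekly_counts, start_week, n_weeks, n_sim=2000, level=0.95, seed=1):
    """Percentile interval and median for the simulated enrollment total."""
    rng = random.Random(seed)
    totals = sorted(
        simulate_total(weekly_counts, start_week, n_weeks, rng) for _ in range(n_sim)
    )
    lower = totals[int((1 - level) / 2 * n_sim)]
    upper = totals[int((1 + level) / 2 * n_sim) - 1]
    return lower, totals[n_sim // 2], upper
```

The only input is a list of weekly enrollment counts, one per calendar week, in the spirit of the recruitment log mentioned in the abstract.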
P-24 – REVISITING FUTILITY MONITORING IN CLINICAL TRIALS
Modern clinical trials often include interim analyses of accruing data that seek to declare futility when the demonstration of a meaningful treatment effect is unlikely, even if they continue to their planned conclusion. An appropriate futility determination and decision to halt the study early can avoid unnecessary expenditure of resources and exposures to ineffective therapies. However, inappropriate futility determinations will stall progress towards making effective interventions available to patients. Despite the importance of making correct futility decisions, there is little guidance on optimal futility monitoring methods in the literature. A trial design should describe the schedule for interim looks and any decisions that are expected to be made on the accruing data, including the guidance quantities to be used in making decisions to stop a trial. Popular guidance quantities for interim futility monitoring include both error-spending functions and stochastic curtailment methods, such as conditional power, predictive power, and predictive probability of success. To highlight the role of futility analyses in clinical trials, this presentation provides a brief overview of commonly used futility stopping methods, insights into the differences in their performance estimated via simulation, and several real clinical trial examples demonstrating their application.
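Conditional power, one of the stochastic curtailment quantities named above, has a closed form under the usual Brownian motion approximation. The sketch below assumes a one-sided test with a fixed final critical value and no adjustment for earlier looks. The function name, the default alpha, and any futility threshold applied to its output are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def conditional_power(z_interim, info_fraction, drift=None, alpha=0.025):
    """Probability of crossing the final critical value given the interim z-statistic.

    drift is the assumed expected value of the final z-statistic. When it is None,
    the current trend estimate B(t) / t is used.
    """
    if not 0.0 < info_fraction < 1.0:
        raise ValueError("info_fraction must lie strictly between 0 and 1")
    b_value = z_interim * sqrt(info_fraction)  # B(t) = Z(t) * sqrt(t)
    if drift is None:
        drift = b_value / info_fraction        # current trend assumption
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha)
    remaining = 1.0 - info_fraction
    # B(1) - B(t) is normal with mean drift * (1 - t) and variance (1 - t).
    return 1.0 - _STD_NORMAL.cdf((z_crit - b_value - drift * remaining) / sqrt(remaining))


# Hypothetical interim look at half the information with no observed effect.
design_drift = _STD_NORMAL.inv_cdf(0.975) + _STD_NORMAL.inv_cdf(0.90)  # 90% power design
print(conditional_power(0.0, 0.5))                      # under the current trend
print(conditional_power(0.0, 0.5, drift=design_drift))  # under the design alternative
```

The two calls show how strongly the result depends on the assumed effect for the remainder of the trial, which is one reason different futility rules can behave so differently.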
P-25 – PARTICIPANT DIVERSITY AND INCLUSIVE TRIAL DESIGN: A META-EPIDEMIOLOGIC STUDY OF CANADIAN RANDOMIZED CLINICAL TRIALS
P-26 – TREATMENT EFFECT ESTIMATION IN THE PRESENCE OF CLUSTER SIZE DEPENDENT TREATMENT HETEROGENEITY IN STEPPED WEDGE DESIGNS
P-27 – MUST-HAVES OR NICE-TO-HAVES: COMMUNITY OUTREACH TO INFORM RESEARCH INCLUSION IN FUNDING APPLICATIONS: CASE STUDIES FROM THE LEEDS INSTITUTE OF CLINICAL TRIALS RESEARCH - THREE CASE STUDIES FROM LICTR
P-28 – ENHANCING EFFICIENCY AND USER EXPERIENCE IN THE PREVENTABLE STUDY DRUG SYSTEM: UPGRADED FEATURES AND IMPROVED OPERATION
The PRagmatic EValuation of evENTs And Benefits of Lipid lowering in oldEr adults (PREVENTABLE) trial is a double-blind, multicenter, randomized study involving 20,000 community-dwelling adults aged 75 and older, aimed at evaluating atorvastatin’s effectiveness in preventing dementia or persistent disability. Participants will be followed for 5 years, with random assignment to daily atorvastatin 40 mg or placebo. The PREVENTABLE Study Drug System is a critical tool for coordinating direct-to-participant drug distribution and tracking across VA and non-VA sites. Recent updates have enhanced its user-friendliness, efficiency, and adaptability. These updates have streamlined communication between sites, coordinating centers, and the central pharmacy. The web-based Study Drug System, managed by the Data Coordinating Center at Wake Forest University School of Medicine, was originally built to manage secure drug distribution. It now includes new alerts, advanced filters, improved tracking features, and a simplified interface to better support both VA and non-VA sites. User feedback was vital to develop these updates and optimize the interface to meet the needs of both VA and non-VA sites. User focus groups highlighted workflow challenges, which led to enhancements in alert functionality, efficiency, and troubleshooting. The upgraded system has improved operations and data reliability for PREVENTABLE with a redesigned dashboard that consolidates site-specific orders and displays vital participant information. Site staff can quickly access participant data, view current drug order statuses, and resolve issues with real-time shipment details. Advanced filtering options further streamline data retrieval, allowing staff to easily find specific study drug information. The alerts, designed to notify users of outstanding tasks and shipment-related issues, have been tailored for both VA and non-VA sites.
This system includes both manual and automated actions to streamline site-level drug management. Additionally, API integration with USPS tracking has been modified to allow real-time monitoring of drug shipments, which has improved delivery reliability. Ongoing quality assurance measures, including dynamic reporting features, have also helped track study drug shipments, monitor past shipments, and ensure that system alerts are functioning correctly. These reports play an important role in ensuring timely communication, providing visibility into successful shipments, and identifying participants who should be expecting orders. Proactive monitoring helps prevent logistical delays and supports the accurate tracking of drug orders, crucial for maintaining study integrity and participant adherence. By improving the tracking and communication flow between the coordinating centers, clinical sites, and central pharmacy, the upgraded system fosters a cohesive trial environment by streamlining information exchange. This presentation will highlight the technical and procedural updates to the PREVENTABLE Study Drug System, focusing on improvements in site efficiency, user satisfaction, and data accuracy. Key features such as the redesigned dashboard, advanced filters, and enhanced USPS tracking integration will be discussed, along with the role of user feedback in shaping these changes. The challenges of balancing the needs of both VA and non-VA sites will be addressed, and insights will be shared on applying these enhancements to optimize large-scale clinical trial management.
P-29 – PS-INTEGRATED BAYESIAN PROACTIVE DYNAMIC BORROWING STRATEGY BASED ON THE QUANTITATIVE EVALUATION OF EXCHANGEABILITY OF THE HYBRID CONTROL ARM
Borrowing external controls in clinical trials to augment the concurrent control arm is attractive due to its ability to reduce sample size and improve efficiency. However, it inevitably faces challenges from various biases due to the incomparability between concurrent and external controls. Several Bayesian methods that rely on the exchangeability of concurrent and external controls have been proposed to dynamically discount external controls based on the heterogeneity of observed outcomes, referred to as prior-data conflict. However, prior-data conflict is viewed as a reactive measure of exchangeability. Some suggest using the propensity score (PS) overlap as a fixed discounting parameter to proactively borrow based on the severity of selection bias that undermines exchangeability. However, this approach overlooks prior-data conflict and other types of biases that can also affect exchangeability. Here, we propose a PS-integrated Bayesian proactive dynamic borrowing strategy based on the quantitative evaluation of exchangeability. Our approach first balances the covariates using PS, then fits the outcome models separately for the concurrent and external control arms. Exchangeability is then quantitatively evaluated based on differences in the model coefficients and the means of covariates, which reflect the conditional exchangeability given covariates and the covariate similarity, respectively. An elastic function is adopted to convert these differences to the exchangeability index within the range of [0,1], where the hyperparameters of the elastic function are determined by the pre-specified maximum clinically tolerant difference of marginal effects between concurrent and external controls under the assumption of exchangeability. If any differences in covariates or model coefficients result in a change in the predicted average control effect that exceeds the pre-specified tolerant difference, the exchangeability index will become a small value close to zero. 
Finally, a weakly informative initial prior with the exchangeability index as its mean will be used for the random discounting parameter, such as the power parameter of Power Prior. Therefore, our approach can proactively control the amount of discounting of external controls based on the degree of exchangeability, meanwhile allowing for dynamic borrowing based on prior-data conflict. In the simulation study, we examine the statistical operating characteristics of our approach in scenarios with various biases of differing severity, including selection bias, unmeasured confounders, measurement bias in covariates and outcomes, and effect shift. Under mild selection bias, our approach performs similarly to PS-integrated dynamic borrowing based on prior-data conflict in terms of power gain and the width of the 95% credible interval. However, under severe selection bias, our approach’s power gain and the improvement in 95% credible interval become smaller than PS-integrated dynamic borrowing. More importantly, in the presence of other biases, our approach demonstrates better control of bias and the Type I error rate than PS-integrated dynamic borrowing, particularly when these biases are severe. Furthermore, the pre-specification of a tolerant difference is crucial, as a less stringent value tends to increase bias and inflate the Type I error rate. In conclusion, our PS-integrated Bayesian proactive dynamic borrowing strategy can discount external controls based on both the biases that undermine exchangeability and prior-data conflict.
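The mapping from an observed difference to an exchangeability index, and from that index to a prior on the power parameter, can be sketched as follows. This is only an illustration under stated assumptions: the logistic form of the elastic function, the `steepness` and `midpoint` values, and the Beta prior with a fixed `concentration` are hypothetical choices made here, not the calibration described in the abstract.

```python
import math


def elastic_index(diff, tolerance, steepness=10.0, midpoint=0.5):
    """Map a non-negative difference to an exchangeability index in (0, 1).

    Assumed logistic form (hypothetical): the index is close to 1 when the
    difference is near zero and close to 0 once it reaches the pre-specified
    tolerance. steepness and midpoint are illustrative tuning values.
    """
    if tolerance <= 0:
        raise ValueError("tolerance must be positive")
    return 1.0 / (1.0 + math.exp(steepness * (diff / tolerance - midpoint)))


def beta_prior_with_mean(mean, concentration=2.0):
    """Return (a, b) of a Beta prior whose mean equals the given index.

    A small concentration keeps the prior weakly informative, so the
    discounting parameter can still move in response to prior-data conflict.
    """
    mean = min(max(mean, 1e-6), 1.0 - 1e-6)  # keep both shape parameters positive
    return mean * concentration, (1.0 - mean) * concentration


# Hypothetical example: predicted control effects differ by 0.02 against a tolerance of 0.10.
index = elastic_index(diff=0.02, tolerance=0.10)
a, b = beta_prior_with_mean(index)
```

The Beta prior is used here because the power parameter of a power prior lies in [0, 1]; any other prior on that interval with the same mean would serve the same illustrative purpose.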
P-30 – BAYESIAN IN-SILICO CLINICAL TRIALS APPLIED TO OBESITY-RELATED CANCER PREVENTION: THE IMPORTANCE OF EXPERT ELICITATION FOR KEY PARAMETERS IN THE ABSENCE OF EXISTING DATA USING THE SHELF METHOD
P-33 – A SIMULATION STUDY ON THE IMPLICATIONS OF ESTIMANDS IN TREATMENT SWITCHING IN META-ANALYSES
The ICH E9(R1) addendum promotes the estimands framework to harmonize the reporting of strategies to account for intercurrent events in clinical trials. However, the implications of the estimands framework in meta-analysis have not been well studied. In the context of treatment switching as an intercurrent event, via simulation, we examined the bias caused by pooling together estimates targeting different estimands in a meta-analysis of randomized clinical trials (RCTs) that allowed for treatment switching. We simulated overall survival data of a collection of RCTs that allowed patients in the control group to switch to the intervention treatment after disease progression under fixed-effects and random-effects models. For each RCT, we estimated a treatment policy estimand that ignored treatment switching, as well as a hypothetical estimand that accounted for treatment switching by censoring switchers at the time of switching. Then, we pooled together RCT effect estimates under fixed-effects and random-effects meta-analytical models while varying the proportions of treatment policy and hypothetical effect estimates. We contrasted effect estimates from meta-analyses that pooled different types of effect estimates with those that pooled only treatment policy or hypothetical estimates. We found that pooling estimates targeting different estimands results in pooled estimators that reflect neither the treatment policy estimand nor the hypothetical estimand. This finding suggests that pooling estimates of varying target estimands even under a random-effects model can produce misleading results. Adopting the estimands framework for meta-analysis may improve alignment between meta-analytic results and the clinical research question of interest.
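A small numerical sketch shows why a mixed pool targets neither estimand. It uses standard inverse-variance fixed-effect pooling of log hazard ratios; the numbers are hypothetical and are not taken from the simulation study.

```python
import math


def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance fixed-effect pooling of trial-level estimates.

    Returns the pooled estimate and its standard error.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)


# Hypothetical log hazard ratios: two trials reporting treatment policy
# estimates and two reporting hypothetical (censor-at-switch) estimates.
treatment_policy = [-0.20, -0.20]
hypothetical = [-0.40, -0.40]
std_errors = [0.10, 0.10, 0.10, 0.10]

mixed, mixed_se = pool_fixed_effect(treatment_policy + hypothetical, std_errors)
# The mixed pool sits between the two targets, matching neither of them.
```

With equal weights the mixed result lands midway between the two sets of estimates; with unequal precision it drifts toward whichever estimand happens to be reported by the more precise trials.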
P-34 – SCORING RADIOGRAPHY COLLIMATION: QUANTIFYING BACKGROUND IMPACT USING NEURAL NETWORKS
X-ray collimation significantly impacts both radiologist performance and model predictions. The excessive space around an image or exclusion of part of an image caused by technologist error is still prevalent in today’s practice where radiographs are made hastily or in high-volume settings. This study aims to develop a scoring mechanism capable of identifying and quantifying these features and to perform image manipulation in cases of excessive background. The pipeline would allow for greater quality of images provided to radiologists and feedback to technologists on how to improve collimation for future studies. In an environment for conducting clinical trials, the overall effect of reducing radiation exposure of patients when creating a dataset is substantial. A function for superficially annotating images was developed, generating masks to be passed through a model for highlighting background sections. This study used a combination of 1448 randomly selected images from the FracAtlas dataset and 4821 randomly selected images from Stanford’s MURA dataset, split into 70% for training, 22.5% for testing, and 7.5% for validation. A U-Net model was used with an EfficientNetB4 base model, which provided predicted masks for a background collimation-scoring equation. To identify the condition of an over-crop, the segmentation output was used to correct excessive background on the RSNA Bone Age dataset, using the same split as before. Half of these images were randomly cropped to simulate over-cropping caused by poor collimation, and the images were then fed into a ResNet50 model. The output of this pipeline provides technologists with future suggestions through the collimation-score and identification of over-crop. Quantitative analysis of the U-Net model on the test set revealed a Dice coefficient of 0.94, IoU of 0.7963, precision of 0.7485, and recall of 0.9308.
However, these results may be misleading, as the segmentation model displayed greater coverage of background areas (for example, in rotation) compared to the manual function. The ResNet over-crop model achieved an accuracy of 0.8465, AUC of 0.9094, and loss of 0.3819 on the testing set. The quantitative analysis highlighted the model’s strengths and weaknesses, while also establishing it as a respectable system for collimation identification. These findings suggest that determining the collimation-score of an x-ray image can be done more effectively through a machine learning approach than manual functions. The importance of this pipeline is highlighted when used with clinical trials, as its potential to reduce unnecessary radiation exposure provides a valuable improvement of participant care in radiography experimentation. Some limitations of this pipeline include the exclusion of human-made annotations, relying solely on image patterns.
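For readers less familiar with the segmentation metrics reported above, the following sketch shows how the Dice coefficient, IoU, precision, and recall are computed from a predicted and a reference binary mask. The masks are flattened to lists of 0/1 values, and the example data are hypothetical; this is not the study's evaluation code.

```python
def mask_metrics(pred, truth):
    """Compute Dice, IoU, precision, and recall for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for p, t in zip(pred, truth) if p and t)          # background in both
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)      # predicted only
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)      # reference only

    def ratio(num, den):
        # Return 0.0 when a metric is undefined (empty denominator).
        return num / den if den else 0.0

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }


# Hypothetical six-pixel example: two pixels agree, one is over-predicted, one is missed.
metrics = mask_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0])
```

Whether metrics are pooled over all pixels or averaged per image changes the reported values, so the aggregation method is worth stating alongside the numbers.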
P-36 – SITE-SPECIFIC COVARIATE AND GROUP SIZE IMBALANCE IN MULTICENTER ACUTE STROKE TRIALS: A COMPARISON OF COVARIATE-ADAPTIVE RANDOMIZATION AND BLOCK RANDOMIZATION STRATEGIES
P-37 – MEASURING THE SUCCESS OF TREATMENT BLINDING IN SHAM-CONTROLLED TRIALS
Measuring the short- and long-term success of treatment blinding in clinical trials is critical for understanding its impact on study outcomes and the validity of the results. Effective blinding is crucial, as it mitigates the risk of bias arising from participant and investigator expectations, which can significantly inflate treatment effects. Although blinding is a common design feature, several trials, particularly device and surgical trials, are challenged to develop adequate controls for blinding purposes. When feasible, these trials attempt to preserve the blind by developing a “sham” control that mimics the experimental treatment. In these cases, it is particularly important to assess the quality of blinding and the impact on the treatment estimates. We aim to assess the short- and long-term effectiveness of blinding in a sham-controlled clinical trial setting using the Bang Blinding Index. SHARP (NCT03609944) is a multi-center, sham-controlled, single-blinded randomized clinical trial with blinded outcome assessment of endoscopic retrograde cholangiopancreatography with minor papilla endoscopic sphincterotomy for the treatment of recurrent acute pancreatitis with pancreas divisum. Patients were blinded by not receiving clinical reports or bills for procedures. Research staff and blinded physicians involved with data collection were unaware of the treatment allocation. Subjects, research coordinators, and evaluating physicians were asked to guess the treatment allocation on several occasions during the follow-up period of 48 months. This presentation will review the process of blinding and the success of blinding based on the questionnaires used in the trial, examining the likelihood of a correct “guess” of assigned treatment and the likelihood of guessing that the participant received the real treatment (regardless of being correct). The goal will be to highlight the importance of measuring the success of blinding in a clinical trial and methods of measurement.
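As a brief illustration of the measure named above, the Bang Blinding Index for one arm can be computed from three counts: correct guesses, incorrect guesses, and "don't know" responses. The sketch below uses the basic three-category form of the index; the counts are hypothetical and are not SHARP data.

```python
def bang_blinding_index(n_correct, n_incorrect, n_dont_know):
    """Bang Blinding Index for a single treatment arm.

    Equals (2 * r - 1) * (share of respondents who ventured a guess), where r
    is the proportion of correct guesses among those who guessed; this
    simplifies to (correct - incorrect) / total. Values near 0 are consistent
    with random guessing, values near 1 with unblinding, and values near -1
    with systematic guessing of the opposite arm.
    """
    total = n_correct + n_incorrect + n_dont_know
    if total == 0:
        raise ValueError("at least one response is required")
    return (n_correct - n_incorrect) / total


# Hypothetical counts for one arm at one follow-up visit.
bi_arm = bang_blinding_index(n_correct=40, n_incorrect=20, n_dont_know=40)
```

Computing the index separately by arm and by assessment time point is what allows short- and long-term blinding to be compared.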
P-39 – AN R SHINY WEB APPLICATION FOR DATA MONITORING AND QUALITY ASSURANCE IN THE ACUTE TO CHRONIC PAIN SIGNATURES STUDY (A2CPS)
P-41 – REPRESENT: EXPLORING BARRIERS TO RECRUITMENT OF UNDERREPRESENTED GROUPS IN BLADDER AND HEAD & NECK ONCOLOGY TRIALS
P-42 – LEVERAGING SEASONAL VARIATION FOR PREDICTING ACCRUAL IN CLINICAL TRIALS USING BAYESIAN POSTERIOR PREDICTIVE DISTRIBUTIONS
Effective statistical tools are essential for initial planning and ongoing monitoring of clinical trials. A key factor that investigators must carefully assess is the accrual rate—the speed at which patients are enrolled. Slow accrual can limit the likelihood that the trial will provide results with sufficient power to make meaningful scientific inferences. We propose a method for predicting accrual rates while accounting for seasonal variations that often affect enrollment patterns in emergency medicine. Using a Bayesian framework, we combine prior knowledge with data up to a given monitoring point to generate predictions, incorporating seasonal trends into the model. We present posterior predictive distributions for accrual, addressing parameter uncertainty and sampling variability. To illustrate the method, we apply it to ongoing clinical trials, including the HOBIT trial, and compare our seasonal model to the approach we currently use in practice. We discuss practical considerations related to the accrual process, including the impact of seasonal fluctuations on recruitment, and highlight the advantages of our proposed method over traditional approaches.
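One simple way to combine prior knowledge, observed accrual, and seasonality is a conjugate gamma-Poisson model with known seasonal multipliers. The sketch below is an assumption made for illustration, not the model used in the abstract: monthly counts are taken to be Poisson with mean equal to a baseline rate times a seasonal multiplier, the baseline rate has a Gamma(shape, rate) prior, and all numbers are hypothetical.

```python
def update_accrual_posterior(prior_shape, prior_rate, counts, multipliers):
    """Posterior Gamma(shape, rate) for the baseline monthly accrual rate.

    Assumes counts[m] ~ Poisson(baseline * multipliers[m]) with known multipliers.
    """
    if len(counts) != len(multipliers):
        raise ValueError("one multiplier is needed per observed month")
    return prior_shape + sum(counts), prior_rate + sum(multipliers)


def predictive_accrual(post_shape, post_rate, future_multipliers):
    """Mean and variance of total future accrual under the posterior predictive.

    The predictive distribution is negative binomial; its variance adds
    Poisson sampling variability to uncertainty about the baseline rate.
    """
    exposure = sum(future_multipliers)
    mean = exposure * post_shape / post_rate
    variance = mean + exposure ** 2 * post_shape / post_rate ** 2
    return mean, variance


# Hypothetical data: three observed months, then a two-month forecast.
shape, rate = update_accrual_posterior(2.0, 1.0, [10, 14, 8], [1.0, 1.2, 0.8])
pred_mean, pred_var = predictive_accrual(shape, rate, [1.0, 1.0])
```

Setting every multiplier to 1 recovers a non-seasonal version of the same model, which gives a direct baseline for comparison.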
P-43 – MAPPING THE KDQOL-36 ONTO THE EQ-5D-5L UTILITY INDEX IN PATIENTS UNDERGOING HAEMODIALYSIS
P-45 – METHODOLOGICAL FEATURES OF EARLY PHASE NON-RANDOMISED ADAPTIVE PLATFORM TRIALS: A COMPREHENSIVE REVIEW OF DESIGN, IMPLEMENTATION AND REPORTING
P-46 – STREAMLINING PROMIS SCORING BY USING AN APPLICATION PROGRAMMING INTERFACE (API) FOR AUTOMATION IN THE LOOK AHEAD AGING STUDY
The abstract compares two methods for scoring PROMIS instruments in the Look AHEAD Aging study. Traditionally, the scoring has been manual, which involves exporting data to a CSV format, then uploading it for scoring, and receiving results via email, a process prone to delays and errors. Recently, Wake Forest University School of Medicine transitioned to using the Assessment Center API, which automates scoring and improves efficiency. The abstract highlights the advantages of the API in streamlining data handling and improving scoring accuracy, while also addressing potential consequences for data integrity in clinical trials.
P-47 – OPTIMIZING DATA ENTRY AND ENHANCING PROTOCOL ADHERENCE THROUGH DYNAMIC ELECTRONIC FORM VISIBILITY IN A MULTI-SITE CLINICAL STUDY
P-48 – APPLICATION OF THE FLEXIBLE PARAMETRIC CURE MODEL TO CLINICAL TRIAL DATA – THE MEAN SURVIVAL TIME AS A SUMMARY MEASURE OF THE UNCURED POPULATION
P-49 – IMPACT OF USING ROUTINE DATA ON THE EFFICIENCY OF IMPLEMENTATION TRIALS: A QUALITATIVE COMPARATIVE CASE STUDY
P-52 – DO SMALL CHANGES LEAD TO BIG IMPROVEMENTS? EMBEDDING EDI IN A VALUE DRIVEN WAY IN CLINICAL TRIALS RESEARCH INSTITUTE
P-53 – DOSE-FINDING DESIGNS TO ACCOUNT FOR PATIENT HETEROGENEITY IN CANCER CLINICAL TRIALS EVALUATING CELL THERAPIES
This presentation describes a novel phase I trial design developed to enhance the safety and efficiency of cell therapies in oncology by specifically addressing patient heterogeneity and dose-feasibility encountered in such therapies. Traditional dose-finding methods do not accommodate specific challenges encountered in cell-therapy trials, like patients not being able to receive their intended dose due to manufacturing limitations or specific groups of patients being more prone to toxicity than others. To address these issues, we incorporate statistical models that allow for the adaptive updating of dose levels based on real-time patient data concerning both toxicity and dose-feasibility. Our design aims to calculate group-specific feasible maximum tolerated doses by sharing toxicity data between groups and utilizing data observed at unplanned dose levels. We present the design, give some trial illustrations, and provide simulation results.
P-54 – SIMULATION-BASED EXTERNAL VALIDATION OF SURVIVAL MODELS FOR RISK STRATIFICATION IN ONCOLOGY
P-55 – MAPPING METHODOLOGICAL GUIDANCE FOR INCLUDING LIVED EXPERIENCE IN EARLY CORE OUTCOME SET DEVELOPMENT: A SCOPING REVIEW
This research forms part of a doctoral research project supported by the Health Research Board Trial Methodology Research Network PhD scholarship awarded to ML. The funder had no role in the design, data collection, and analysis or preparation of the review.
P-58 – THE USE OF HEALTHCARE SYSTEMS DATA FOR RCTS
P-59 – BENCHMARKING SUBGROUP HEALTH TECHNOLOGY ASSESSMENT ANALYSIS
P-61 – WHEN TO SCHEDULE THE INTERIM ANALYSIS IN THE PRESENCE OF MISSING DATA?
Suppose an adaptive Phase III trial has an interim analysis scheduled at a given information fraction, e.g., 50%. The key question is: When will we reach 50% information? In a non-longitudinal setting, the information level for a continuous endpoint can be approximated by the fraction of patients with endpoint data at the interim analysis relative to the final analysis. However, longitudinal trials with repeated measures and missing data require more nuanced methods to estimate the information level accurately. The question then becomes: When will there be 50% information in the presence of missing data? Is it when half of the patients reach the final visit, or could it be earlier? We propose an approach for projecting the information fraction in continuous longitudinal trials analysed using MMRM. We establish a relationship between information time and calendar time, providing practical guidance. At the design stage, prediction for the timing of interim analysis is based on assumptions about enrolment rate, total sample size, dropout rate, visit timing, and the correlation matrix. Once some data is available, this prediction is refined using the observed enrolment rates, dropout patterns, and updated correlation estimates, yielding a more accurate estimate of the current information level and an updated timeline for the interim analysis. Through a practical example, we demonstrate how to project information timelines at the design stage and refine them as data accrues. We discuss how to navigate different missing data patterns, assess the current information level, and set a reliable timeline for the interim analysis.
P-62 – RACIAL MINORITY PARTICIPATION IN DECENTRALIZED CLINICAL TRIALS FOR RARE GENETIC DISEASES
The design of clinical trials is transforming to include decentralized clinical trials (DCTs), which may provide substantial advantages for minority populations impacted by rare genetic diseases. Despite potential benefits, comprehensive analysis of the incorporation of DCT elements in this specialized context remains limited, mainly due to publication delays, insufficient research, and ongoing underrepresentation of diverse populations in clinical studies. This ‘study of studies’ seeks to systematically map and analyze the implementation of DCT designs while evaluating the racial composition of participants in trials focused on rare genetic diseases. The methodology consists of a systematic search of interventional clinical trials conducted in the U.S., utilizing data from ClinicalTrials.gov and information from peer-reviewed publications and grey literature. The inclusion criteria emphasize trials concerning rare genetic diseases that offer insights into the racial demographics of participants and the implementation of decentralized clinical trial methodologies. This dual focus enables a detailed examination of the use of decentralized methods in trials with racially diverse populations and the barriers and facilitators affecting their wider adoption. Initial searches conducted on ClinicalTrials.gov utilizing specific keywords associated with rare diseases and decentralized clinical trial designs have produced significant preliminary data. The findings establish a foundation for future inquiries in specialized databases and grey literature to map existing evidence and identify gaps in the current research landscape. This study systematically collates and analyzes data on DCTs in rare genetic diseases to illuminate the inclusivity of racial minorities.
P-63 – REASONS FOR DECLINING TO PARTICIPATE IN A TRIAL OF ONLINE COGNITIVE BEHAVIOURAL THERAPY FOLLOWING ORTHOPAEDIC TRAUMA: A MIXED METHODS STUDY
P-64 – PREVALENCE AND ACCEPTABILITY OF DEDICATED SOCIAL WORK SUPPORT IN THE FRACTURE CLINIC: A SURVEY OF ORTHOPAEDIC TRAUMA SURGEONS
P-65 – JOB SATISFACTION OF RESEARCH PERSONNEL IN ORTHOPAEDIC TRAUMA SURGERY
P-66 – BENEFITS OF CROSS-INDUSTRY TRAINING IN CLINICAL TRIALS
There are various training opportunities available to students interested in careers in clinical trials. Students in doctoral programs for statistics or biostatistics often have research or teaching assistantships at their institution which support their funding. In addition to these, students can apply for fellowships and internships with companies or federal agencies to gain valuable real-world experience. This presentation will highlight the speaker’s firsthand experience with clinical trials training opportunities in academia, government, and industry, including a description of four distinct experiences (1 academic, 2 government, and 1 pharmaceutical industry) and the considerations required to pursue them during her doctoral studies. The speaker will detail lessons learned and how these have influenced her career, as well as present recommendations for trainees and those involved in developing the clinical trials workforce. A combination of experiences within and outside of a student’s academic institution is unmatched in terms of career preparation. These experiences further develop the student’s technical skills and knowledge by extending it outside of the classroom into a real-world setting. They also provide a forum to develop practical skills for the workforce, such as project management, communication, and multi-disciplinary collaboration. Additionally, a combination of these experiences allows the student to make a well-informed decision regarding their post-graduation position. This has obvious benefits for future career satisfaction for the student, and it benefits employers in that they can recruit individuals who are passionate about pursuing a career at their organization. For students interested in clinical trials research, cross-industry training is particularly beneficial. 
It provides exposure to the complexities of trial design, management, and analysis from different perspectives, which is useful since clinical trials often involve communication and collaboration across industries. Despite these benefits, there can also be challenges associated with a student participating in opportunities outside of their institution. Firstly, there could be a lack of awareness about these opportunities. Beyond that, it can be challenging for the student to manage their other obligations, such as their assistantship and dissertation research. There are also often practical hurdles regarding the student’s funding, as well as the potential need for temporary relocation. The speaker was a graduate research assistant at the Clinical Trials and Statistical Data Management Center at the University of Iowa for four years, during which she was the inaugural Network for Excellence in Neuroscience Clinical Trials (NeuroNEXT) Biostatistics Fellow. The NeuroNEXT Biostatistics Fellowship was designed to expose a trainee to various facets of clinical trials, including the pre-award process, ongoing trial management, and statistical analysis of a completed trial. In addition to these invaluable academic experiences, she completed an Oak Ridge Institute for Science and Education Fellowship hosted by the Center for Drug Evaluation and Research at FDA and participated in the inaugural cohort of the Oncology Educational Fellowship, a program established by the FDA Oncology Center for Excellence, American Statistical Association (ASA), and ASA’s Biopharmaceutical Statistics Section. The speaker also gained experience in the pharmaceutical industry through a 4-month co-op program in Biostatistics and Data Science at Boehringer Ingelheim.
P-67 – SIMULATION-GUIDED TRIAL PLANNING FOR DIFFERENT ESTIMANDS FOR TREATMENT SWITCHING IN ONCOLOGY
Post-randomization (i.e., intercurrent) events, such as treatment switching, can affect the interpretation of clinical outcomes. Consequently, the recent ICH E9(R1) addendum on estimands highlights the importance of specifying relevant intercurrent events and how they will be handled analytically. In oncology trials, treatment switching can distort estimated treatment effects because participants are exposed to multiple treatments during follow-up. While the estimands framework offers a way to transparently integrate intercurrent events into the trial design and analyses, planning trials for different estimands is not straightforward. Two commonly used estimands promoted by the ICH addendum are the “Treatment Policy” (TP) and “Hypothetical” estimands. In the context of treatment switching, the TP estimand is analogous to intention-to-treat analysis where data are analyzed based on randomization status, irrespective of switching. In contrast, the hypothetical estimand reflects a treatment effect under a hypothetical scenario where patients would not have switched their treatment. Because both estimands may be of clinical interest, characterization of the trade-offs associated with each via simulations may aid trial planning. The main objective of this study is to quantitatively examine the trade-offs associated with TP and hypothetical estimands, as measured by error rates, sample size, and treatment effects, in the context of treatment switching in a randomized clinical trial (RCT) powered on overall survival (OS). For our simulation study, we used an illness-death model to generate progression and OS times to mimic a trial that allows control patients to switch onto the experimental therapy after disease progression. We considered censoring as the analytical strategy for treatment switching for the hypothetical estimand.
We estimated the TP and hypothetical effects in terms of proportional hazard ratio (95% CI) using transition hazards that were derived from a published RCT that allowed control group participants to switch treatments. We compared the empirical means of the estimators, the type I error rate, and the power of the two estimands. To demonstrate the implications of the simulation results, we will walk through an example of a simulation-guided design case study involving planning a trial in oncology with treatment switching as the main intercurrent event. We found that for our data generating mechanism, the estimator for the hypothetical estimand typically showed larger effect sizes than the estimator for the TP estimand but with less precision due to a higher proportion of censored observations. While the type I error rate could be controlled at 0.05 for both estimators, the estimated power was higher for the analysis targeting the hypothetical estimand. Our findings do not necessarily imply that the hypothetical estimand should be targeted over the TP estimand, but they do highlight important trade-offs at the design stage that can be characterized using simulations. As these two treatment effects reflect distinct research questions, our work underscores the need for transparency regarding intercurrent events during clinical trial planning.
P-68 – ANALYSING HEALTH RELATED QUALITY OF LIFE IN NEPHROLOGY TRIALS: A COMPARISON OF THE LINEAR MIXED EFFECTS, STANDARD JOINT, AND COMPETING RISKS JOINT MODELS IN THE PRESENCE OF INFORMATIVE DROPOUTS
P-69 – A MULTI-ARM MULTI-STAGE DESIGN FOR TRIALS WITH NO CONTROL ARM AND ALL PAIRWISE TESTING
Multi-arm multi-stage (MAMS) trials have gained popularity as a means to enhance the efficiency of clinical trials, potentially reducing both duration and costs. This paper focuses on designing MAMS trials where no control treatment exists. This may be because multiple treatments are already established as the standard treatment option, or because no treatment currently exists for a severe disease, so it would be unethical to withhold a potentially helpful treatment. In the proposed design, interim analyses allow for early treatment termination during the trial when a treatment performs notably worse than its competitors, and for the entire trial to stop early if all remaining treatments are showing similar performance. All pairwise comparisons between treatment arms are conducted, allowing for the identification of statistically significant differences between treatments and facilitating the early termination of less effective ones. The proposed design controls the familywise error rate (FWER) for all pairwise comparisons, and the necessary conditions under which control in the strong sense is guaranteed are provided. The FWER and power are used to calculate both the stopping boundaries and the sample size required. Analytic solutions to compute the expected sample size are also derived. A trial motivated by a study conducted into sepsis, where there was no control treatment, is shown. The multi-arm multi-stage all pairwise design proposed here is compared to multiple different approaches. It is shown, for the trial studied, that the proposed method yields the lowest required maximum and expected sample size when controlling the FWER and power at the desired levels.
P-71 – BAYESIAN ESTIMATION OF DYNAMIC TREATMENT REGIMES FROM A PARTIALLY RANDOMIZED, PATIENT PREFERENCE, SEQUENTIAL, MULTIPLE ASSIGNMENT, RANDOMIZED TRIAL
As healthcare shifts towards patient-centered care, incorporating patient treatment preferences in clinical trials has become increasingly relevant. The Partially Randomized, Patient Preference, Sequential Multiple Assignment Randomized Trial (PRPP-SMART) combines a Partially Randomized Patient Preference (PRPP) trial with a Sequential, Multiple Assignment, Randomized Trial (SMART), allowing participants to either receive their preferred treatment or be randomized when no treatment preference exists, at multiple points in the trial. In this paper, we introduce a novel Bayesian method to estimate dynamic treatment regimes (DTRs), or tailored treatment guidelines over the course of care, embedded in PRPP-SMARTs. Our Bayesian Joint Stage Model (BJSM) leverages information sharing between preference and randomized participants and across stages of the trial to estimate DTR effects. We compare our BJSM method to weighted and replicated regression models, the current standard for analyzing PRPP-SMART data, and show that our method provides more efficient DTR effect estimates with negligible bias. Our results indicate that BJSM is a promising alternative for analyzing PRPP-SMART data.
P-74 – DESIGN AND ANALYSIS OF SMARTS WITH TREATMENT PREFERENCE, WITH APPLICATION TO THE STAR*D TRIAL
P-75 – EXPLORING THE ADOPTION OF THRICE-WEEKLY EXTENDED-HOURS IN-CENTRE NOCTURNAL HAEMODIALYSIS IN ROUTINE CLINICAL PRACTICE THROUGH THE NIGHTLIFE STUDY: A QUALITATIVE CONTENT ANALYSIS
P-79 – CALIBRATION-FREE ODDS CFO SUITE FOR DESIGNING VARIOUS PHASE I CLINICAL TRIALS
In the development of new cancer treatments, an essential step is to determine the maximum tolerated dose in a phase I clinical trial. To use the data more efficiently yet without any model assumption, we propose a novel calibration-free odds (CFO) approach to phase I trial design. Not only is the CFO design free of any dose-toxicity curve assumption, but it can also aggregate all the available information accrued in the trial for dose assignment. Seamless phase I/II trials, which aim to identify the optimal biological dose (OBD) and to enhance the accuracy and robustness of OBD identification, have gained enormous popularity. For toxicity monitoring, the CFO design casts the current dose in competition with its two neighboring doses to obtain an admissible set. For efficacy monitoring, CFO selects the dose that has the largest posterior probability to achieve the highest efficacy under the Bayesian paradigm. In contrast to most of the existing designs, the prominent merit of CFO is that its main dose-finding component is model-free and calibration-free, which can greatly ease the burden on artificial input of design parameters and thus enhance the robustness and objectivity of the design. We will also illustrate the implementation of CFO using its Shiny App which is user-friendly and publicly accessible at https://clinicaltrialdesign.shinyapps.io/cfoapp/.
P-80 – PRO-ADD: PATIENT-EMPOWERED DOSE-FINDING TRIALS BY INTEGRATING SAFETY, EFFICACY AND PATIENT-REPORTED OUTCOMES FOR OPTIMAL DOSE SELECTION
Advances in oncology drug development are driving the emergence of novel therapies that challenge traditional dose-efficacy assumptions in dose-finding oncology trials. Most trial designs aim to identify a maximum tolerated dose (MTD) by assessing patients’ dose-limiting toxicities (DLTs) - implicitly adopting the traditional dose-efficacy paradigm that efficacy increases with treatment dose. Whilst established cytotoxic agents generally conform to this assumption, new investigational therapies such as molecularly targeted agents and immunotherapies do not necessarily exhibit such a relationship. What’s more, these emerging treatments are often administered over extended durations, extending beyond the traditionally short DLT assessment window. In settings where therapies are administered until treatment resistance or disease progression occurs, it is vital to evaluate treatment tolerability beyond the traditional DLT assessment windows. With these new investigational therapies in mind, emphasis should shift toward methodological advancements in trial designs aimed at identifying optimal doses, rather than solely determining MTDs. Incorporating Patient-reported Outcomes (PROs) within dose-finding oncology trials is increasingly recommended to better understand a treatment’s tolerability profile, especially given the extended tolerability assessment windows which may be needed for novel immunotherapies and targeted therapies. This paper introduces PRO-ADD (Patient-reported Outcomes Aided Dose-optimization Design), a modular trial design framework for dose optimization. This framework allows trialists flexibility to define approaches for dose-escalation, the adaptive randomization of patients across admissible doses, and the final dosing decision criterion. In this paper, we leverage the framework to optimize dosage with respect to three key outcomes - clinician-assessed DLTs, PROs and efficacy. 
We introduce a novel continuous PRO endpoint, the normalized PRO-Adverse Event (PRO-nAE) burden score, to evaluate the trade-off between a treatment’s tolerability and its efficacy. A generalized linear mixed model is used to incorporate the accumulating longitudinal PRO data collected during the trial within the final dose recommendation. Simulation results are presented to evaluate trial design performance under different strategies for the handling of intercurrent events. PRO-ADD leverages the fundamental paradigmatic efficacy and tolerability profiles of new treatments to recommend optimal doses. It performs well at identifying the optimal dose (a dose which is both efficacious and tolerable) when efficacy and PRO data are collected beyond an observed DLT. Particularly in scenarios where efficacy plateaus beyond a specific dose size, PRO-ADD confidently identifies the most tolerable effective dose, avoiding escalation to higher, safe doses that offer no additional efficacy benefit. Futility and safety stopping rules perform well at adaptively assigning patients to admissible doses. When a patient discontinues treatment after a DLT, and subsequent PRO and efficacy data are unavailable, the proposed design still recommends the true optimal dose most of the time; however, results are biased toward lower dose levels, which have fewer DLTs. As the field evolves, patient-centric dose-finding approaches incorporating PROs are crucial in advancing our understanding of treatment tolerability, and in turn, will shape the future landscape of dose-finding oncology trials.
P-81 – RANDOMIZATION-BASED INFERENCE FOR MCP-MOD
Dose selection is critical in pharmaceutical drug development, as it directly impacts a drug’s therapeutic efficacy and patient safety. The Generalized Multiple Comparison Procedures and Modeling (MCP-Mod) approach is commonly used in Phase II trials for testing and estimation of dose-response relationships. However, its effectiveness in small sample sizes, particularly with binary endpoints, is hindered by issues like complete separation in logistic regression, leading to non-existence of estimates. Motivated by an actual clinical trial using the MCP-Mod approach, this work introduces penalized maximum likelihood estimation (MLE) and randomization-based inference techniques to address these challenges. Randomization-based inference allows for exact finite sample inference, while population-based inference for MCP-Mod typically relies on asymptotic approximations. Simulation studies demonstrate that randomization-based tests can enhance statistical power in small to medium-sized samples while maintaining control over type-I error rates, even in the presence of time trends. Our results show that residual-based randomization tests using penalized MLEs not only improve computational efficiency but also outperform standard randomization-based methods, making them an adequate choice for dose-finding analyses within the MCP-Mod framework. Additionally, we apply these methods to pharmacometric settings, demonstrating their effectiveness in such scenarios. The results underscore the potential of randomization-based inference for the analysis of dose-finding trials, particularly in small sample contexts.
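As a rough illustration of what randomization-based inference looks like for a binary endpoint, the sketch below re-randomizes the dose labels and recomputes a simple linear-trend contrast. It is not the residual-based, penalized-MLE procedure described in the abstract: the contrast, the re-randomization scheme (label permutation with fixed arm sizes), the data, and all function names are illustrative assumptions.

```python
import random

def trend_statistic(doses, responses):
    """Linear-trend contrast: sum of (dose - mean dose) * response."""
    mean_dose = sum(doses) / len(doses)
    return sum((d - mean_dose) * y for d, y in zip(doses, responses))

def randomization_p_value(doses, responses, n_resamples=5000, seed=1):
    """One-sided Monte Carlo randomization p-value for an increasing trend."""
    rng = random.Random(seed)
    observed = trend_statistic(doses, responses)
    shuffled = list(doses)
    hits = 0
    for _ in range(n_resamples):
        # Re-randomize: permute dose labels while keeping the responses fixed.
        rng.shuffle(shuffled)
        if trend_statistic(shuffled, responses) >= observed:
            hits += 1
    # The add-one correction keeps the Monte Carlo p-value strictly positive.
    return (hits + 1) / (n_resamples + 1)

# Hypothetical data: 4 dose levels, 6 subjects each, binary response.
doses = [0] * 6 + [1] * 6 + [2] * 6 + [4] * 6
responses = [0, 0, 0, 0, 0, 1,
             0, 0, 1, 0, 1, 0,
             0, 1, 1, 0, 1, 1,
             1, 1, 1, 1, 0, 1]
p_value = randomization_p_value(doses, responses)
```

Because the reference distribution is built from the randomization itself, no asymptotic approximation is needed, and the statistic can be computed even when a logistic fit would suffer from complete separation.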
P-83 – THE ROLE OF INTERSECTIONALITY IN SHAPING PARTICIPANT ENGAGEMENT WITH DIGITAL HEALTH METHODS: FINDINGS FROM A QUALITATIVE STUDY
P-85 – UTILISING ROUTINELY COLLECTED HEALTH DATA (RCHD) TO ENHANCE LONG-TERM MONITORING AND EFFICIENCY IN CLINICAL TRIALS: INSIGHTS FROM 2 ACADEMIC PRAGMATIC TRIALS
The increasing availability and quality of Routinely Collected Health Data (RCHD), including electronic health records (EHRs) and healthcare system data (HSD), have introduced new possibilities in clinical trial design, particularly for long-term follow-up and safety monitoring. This study explores the integration of RCHD into two UK-based clinical trials: the first investigates aspirin’s role in preventing cancer recurrence, and the second evaluates a preventative polypill for age-related conditions. By utilizing both EHRs and HSD to track serious adverse events and long-term outcomes, these trials demonstrate a combined approach to RCHD that is, at present, one of the only ways to effectively tackle the healthcare system fragmentation between England, Scotland, Wales, and Northern Ireland in a cost-effective and timely way. This integration model mirrors challenges seen in the U.S., where EHR adoption remains uneven across states and systems, and in Canada, where provincial systems manage health data separately. Using RCHD in these trials supports cost-efficient, scalable methodologies that could benefit North American healthcare systems facing similar challenges in trial efficiency, participant retention, and resource limitations. Although this study is UK-specific, its findings on RCHD integration offer a framework that may enhance trial operations across varied healthcare landscapes.
P-87 – INTEGRATING SYNTHETIC DATA AND AI IN PEDIATRIC INTENSIVE CARE CLINICAL TRIALS: A BAYESIAN FRAMEWORK FOR ETHICAL AND SCIENTIFIC ADVANCEMENT
We propose a trial design and analysis pipeline integrating AI-generated synthetic data and Bayesian statistical methods to improve scientific validity and uphold ethical standards. AI-based algorithms were considered for creating synthetic data using generative models, with fidelity to the original dataset checked through validation, including distributional equivalence. These synthetic datasets will serve as the backbone of a Bayesian trial design. In the first stage, synthetic data will inform prior distributions for the Bayesian analysis of the primary endpoint, intubation rates. In the second stage, we will combine real and synthetic data to evaluate treatment efficacy across subgroups, including underrepresented trial populations, i.e., children potentially more at risk of intubation because of previous wheezing or hospitalizations.
P-88 – OPTIMIZING PEDIATRIC OUTCOMES: ADVANCED BAYESIAN MODELING OF DAYS WITHOUT MECHANICAL VENTILATION IN INTENSIVE CARE TRIALS
In pediatric respiratory trials, accurately modeling days without mechanical ventilation (DWMV) is crucial due to the typical zero-inflation and skewness in the data. Standard statistical methods often fail to address these complexities effectively, which can obscure significant clinical insights. This study harnessed Bayesian statistical methods to evaluate the efficacy of high-flow nasal cannula (HFNC) and noninvasive ventilation (NIV) in a sample of 252 pediatric cases, leveraging these methods’ ability to integrate prior clinical knowledge and manage complex data structures. We deployed four distinct Bayesian models to capture the nuanced distribution of DWMV: Gaussian, Hurdle Negative Binomial, Zero-One Inflated Beta, and Cumulative Logistic Regression. Each model’s effectiveness was assessed using the Leave-One-Out Cross-Validation (LOO) Information Criterion, providing a robust measure of predictive accuracy. Among these, the Zero-One Inflated Beta Model stood out, achieving the lowest LOOIC score (294.8). This model was particularly adept at handling zero-inflation and offered detailed insights into how each treatment influenced the distribution of DWMV days, thereby illuminating the differential impacts of HFNC and NIV. Conversely, the Gaussian model, while straightforward, proved less effective (LOOIC = 1157.9) due to its inadequate handling of zero-inflation. Although it provided a basic understanding of treatment effects, its lack of sophistication in managing the data’s specific challenges limited its utility. The Hurdle Negative Binomial and Cumulative Logistic Regression models also showed good performance (LOOIC = 568.3 and 573.8, respectively), particularly in delineating effects across different patient segments, but they did not reach the predictive accuracy of the Zero-One Inflated Beta Model. 
This research highlights the critical importance of selecting appropriate models based on specific data characteristics to achieve precise clinical results. By utilizing advanced Bayesian techniques and tailored models, we can ensure more accurate and clinically relevant estimations of treatment effects. Such precision is important for informing decisions in pediatric respiratory care strategies, where understanding the nuances of treatment effectiveness can significantly influence patient outcomes. Our findings strongly advocate for the broader adoption of these sophisticated Bayesian methods in clinical research. These methods not only improve the accuracy of treatment effect estimations but also the overall quality of outcome assessments in pediatric respiratory care.
P-89 – EFFECTIVE CENTRALIZED TRAVEL MANAGEMENT IN A MULTI-CENTER PARKINSON’S DISEASE CLINICAL TRIAL
P-90 – EVALUATING THE EFFICACY OF OUTBOUND IVR IN ENHANCING FOLLOW-UP IN A MULTICENTRE CLINICAL TRIAL
P-91 – DESIGN OF AN ELECTRONIC DELEGATION OF AUTHORITY LOG WITHIN A CLINICAL TRIAL MANAGEMENT SYSTEM
The Delegation of Authority (DOA) Log is required for clinical research studies to record all study team members’ significant study-related duties and to document and ensure that study team members are aware of their duties, appropriately trained, and authorized to perform the tasks. Based on their DOA assignments, team members may be required to provide documentation in the form of regulatory documents to confirm they are properly trained to perform their designated tasks. The DOA Log is a critical tool in ensuring oversight and accountability for a clinical research study. However, there are many challenges when it comes to managing and documenting the DOA. These include inaccurate records, inconsistent data capture formats, and lack of standards. These issues can cause many problems, including confusion over which team member is performing which tasks, non-compliance violations, inadequate training, and improper documentation of team member updates. To avoid these pitfalls, an electronic DOA log was implemented in a web-based Clinical Trial Management System (CTMS) for the Strategies to Innovate EmeRgENcy (SIREN) Care Clinical Trials Network, which is funded by the National Institute of Neurological Disorders and Stroke and the National Heart, Lung, and Blood Institute. There are currently 5 projects and 5 ancillary projects for the SIREN network housed within our CTMS. There are over 200 unique sites participating in these projects. Across the 4 active projects, over 2,000 clinical site team members are listed on the electronic DOAs. The electronic DOA tracks the complete history of DOA changes, including start and end dates when a team member’s assignments change. Using our electronic DOA ensures that all sites collect the data in a standard way and format.
Another benefit of having the DOA data accessible online is that trial operations team members have better oversight over the submitted data and can ensure the DOA is being collected in a standardized and proper way across all sites. Having the DOA data included in our CTMS allows these data to be used for other trial operation purposes. For example, the system automatically posts and tracks the appropriate regulatory documents required for a team member based on the DOA assignments. This is done within the regulatory document module that is included in our CTMS. Another benefit of having the DOA data within our CTMS is that they can be used to automatically validate that certain CRF assessments are performed by a team member with the proper DOA assignment and qualifications. By collaborating with stakeholders across all areas of clinical trials, we developed an integrated electronic DOA module within our CTMS, streamlining a complex process and enhancing efficiency in other trial management areas.
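The automatic validation of CRF assessments against DOA assignments could be sketched as follows. This is a minimal illustration under assumed data structures, not the SIREN CTMS implementation; the `DoaAssignment` record, the task names, and the member IDs are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class DoaAssignment:
    member_id: str
    task: str
    start: date
    end: Optional[date] = None  # None means the assignment is still active

def is_authorized(assignments, member_id, task, on_date):
    """True if the member held the delegated task on the assessment date."""
    for a in assignments:
        if a.member_id != member_id or a.task != task:
            continue
        if a.start <= on_date and (a.end is None or on_date <= a.end):
            return True
    return False

# Hypothetical DOA history for one team member.
log = [
    DoaAssignment("M001", "neurological_exam", date(2024, 1, 10), date(2024, 6, 30)),
    DoaAssignment("M001", "informed_consent", date(2024, 1, 10)),
]

# A CRF assessment dated after the delegation ended would be flagged.
flagged = not is_authorized(log, "M001", "neurological_exam", date(2024, 7, 1))
```

Storing start and end dates per assignment, as the abstract describes, is what makes a date-aware check like this possible.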
P-92 – INTEGRATION OF AI IN EPIDEMIOLOGICAL STUDY DESIGN FOR ENHANCED RESEARCH OUTCOMES
P-93 – BEST PRACTICES FOR THE DESIGN AND CONDUCT OF COMPLEX CLINICAL TRIALS
P-94 – PATIENT AND PUBLIC INVOLVEMENT AND ENGAGEMENT TO METHODOLOGICAL RESEARCH: INSIGHTS FROM A PANEL
P-95 – TWILIO ALERTS WITHIN THE PREVENTABLE ALERT SYSTEM
The PRagmatic EValuation of evENTs And Benefits of Lipid lowering in oldEr adults (PREVENTABLE) Trial is a double-blind, randomized, multi-site pragmatic clinical trial assessing whether the cholesterol lowering drug, Atorvastatin, can help adults 75 and older prevent dementia, physical disability, and death. A pragmatic trial is a type of research study designed to evaluate the effectiveness of interventions in real-world, routine clinical settings. By focusing on those aged 75 and older who do not have a history of cardiovascular disease, the PREVENTABLE study seeks to explore whether this treatment can improve health outcomes and overall quality of life as people age. With over 100 clinics located throughout the United States and Puerto Rico, and a planned enrollment of 20,000 participants, the PREVENTABLE study is a large research initiative. Due to its vast size and expansive coverage area, building an efficient communications system was critical. One key element within the study’s communication framework is a highly specialized, custom-built alert system that notifies clinic staff of actions requiring follow-up. This tailored system has been designed specifically to meet the unique needs of the study, ensuring precise, real-time notifications and effective communication across all relevant parties. By implementing a custom solution, the system can better address specific criteria and response protocols, enhancing the accuracy, efficiency, and reliability of alerts. Automating the processes of detection and notification can significantly reduce the potential for human error thereby enabling responses that are both faster and more precise. An additional piece, critical to the success of a trial, is collection of follow-up data. To help ensure clients stay informed and engaged, PREVENTABLE implemented an automated process to send appointment reminders to participants. 
The study utilized the robust features and resources provided by Twilio, a cloud communications platform that enables developers to build, scale, and operate communications solutions to optimize efficiency. Twilio APIs allow businesses to integrate communication attributes into their applications, such as sending SMS messages. Twilio’s services are scalable and reliable, with features designed to handle high volumes of communication while ensuring data security and privacy. Communicating appointment reminders via SMS messages introduced the ability for participants to respond to messages with study related information that needed to be forwarded to specific study staff. PREVENTABLE interwove this communication with the PREVENTABLE alerts system to specifically handle incoming participant communication. This presentation/poster will describe and highlight the interworking of such collaboration and provide specific code-snippets to demonstrate the mechanics that bring this successful convergence to fruition.
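The abstract does not reproduce the code it refers to. As a hypothetical sketch of how an inbound participant reply might be routed into an alert queue, the example below takes the form parameters of an inbound-message webhook request and returns an alert record. `From` and `Body` are the sender and message-text fields Twilio includes in such requests; the keyword list, queue names, and phone-to-participant lookup are assumptions made for illustration, and the sketch makes no call to the Twilio service itself.

```python
def route_reply(webhook_params, phone_to_participant):
    """Turn an inbound SMS webhook payload into an alert record (hypothetical)."""
    sender = webhook_params.get("From", "")
    body = webhook_params.get("Body", "").strip()
    participant_id = phone_to_participant.get(sender)

    if participant_id is None:
        # Unknown number: send to a generic queue rather than guessing.
        queue = "unmatched_number"
    elif body.upper() in {"C", "CONFIRM", "YES"}:
        queue = "appointment_confirmed"
    elif body.upper() in {"R", "RESCHEDULE"}:
        queue = "reschedule_request"
    else:
        # Free-text replies may contain study-related information,
        # so they are forwarded to staff for manual review.
        queue = "staff_review"

    return {"queue": queue, "participant_id": participant_id, "text": body}

# Hypothetical lookup table and inbound message.
directory = {"+15555550100": "P-0001"}
alert = route_reply({"From": "+15555550100", "Body": " reschedule "}, directory)
```

Keeping the routing logic separate from the messaging provider makes it testable without sending real messages.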
P-96 – NAVIGATING SAFETY REPORTING IN A RARE DISEASE SETTING
To responsibly protect human subjects in a vulnerable population, every effort should be made to ensure diligent safety monitoring and adverse event reporting. The TReatment for ImmUne Mediated PathopHysiology (TRIUMPH) is an NIH-funded Phase 2b clinical trial designed to investigate immunosuppressive therapy to treat children with acute liver failure of unknown etiology. The trial is being conducted under an FDA Investigational New Drug application (NCT# 04862221). During study development, participant risk was extensively researched for all treatment arms and in the context of children in critical care for acute liver disease. Exclusion and treatment discontinuation criteria were carefully mapped in the protocol under the guidance of the FDA, DSMB, and Central IRB. A Safety Monitoring Plan (SMP) was developed by the Operations Team to establish study team responsibilities and a clear definition of adverse events, including all anticipated events based on the known complications of pediatric acute liver failure and associated interventions, such as transplantation, and expected events related to study treatment. The plan also includes steps and responsibilities for expedited reporting to the FDA according to the FDA Guidance for Industry on safety reporting requirements for IND studies. A well-rounded SMP implemented prior to enrollment is shown to be a valuable tool as well as site-facing resources for collecting and entering data. This presentation will provide an overview of the steps the TRIUMPH investigators took to ensure a reliable, streamlined process for safety reporting including the components of the SMP, the design of the trial’s clinical trials management system safety module and the study team members involved in the process which includes site investigators, data and project managers, medical safety monitors, regulatory specialists and the unblinded statistical team.
P-98 – ENHANCING DATA QUALITY IN THE HEALEY ALS PLATFORM TRIAL THROUGH SYSTEMATIC OUTCOME REVIEW
Workflow overview: (1) Identification of Potential Issues: Subject matter experts defined specific criteria for identifying discrepancies in ALSFRS-R and SVC scores between visits. Using predefined criteria, both automated logic checks and manual review are conducted to identify discrepancies in scores. These include score fluctuations and scores that meet threshold for review. (2) Communication with Sites: Upon identifying discrepancies, the Data Management Team (DM) sends emails to the sites with a list of identified scores, requesting score verification, corrections in the electronic data capture if errors are identified, and explanations for fluctuations. (3) Feedback Loop for Training Improvement: Responses from sites are reviewed by DM and categorized according to the reason for fluctuations, such as: disease progression, different evaluators, or data entry errors. This feedback is provided to the Outcome Measurement Training team to support continuous training improvement. (4) Data Quality Improvement: Over time, the review process has led to marked improvements in data quality, especially in later trials (Regimens F & G). Improvements include reduced data entry errors, fewer unexplained score fluctuations, and increased consistency in having the same evaluator conduct assessments across visits, significantly reducing variability in scores.
P-100 – MULTIPLICITY IN NON-LICENSING RANDOMIZED CONTROLLED TRIALS: SOFTWARE TOOL TO CALCULATE SAMPLE SIZES
P-103 – USE OF VARYING-ACCESS DATABASE TABLES TO MANAGE CLINICAL SITE AND STUDY PERSONNEL DATA FOR MULTI-CENTER CLINICAL TRIALS
Management of a multi-center clinical trial benefits from tracking site and personnel data in database tables. In multi-center studies run by the Data Coordinating Center team in the Cleveland Clinic’s Quantitative Health Sciences (QHS) Department, we use a Clinical Site table and a Study Personnel table to allow dynamic tracking of the status of participating sites and site personnel. These tables have evolved over the last 30 years, each customized to the needs of the study and implementing lessons learned in previous trials. Access is password protected; staff members are only able to see their own site’s tables. We use online site tables for all of our NIH-funded studies. For example, in the NIDDK COMBINE multi-center trial, we used the Site table to track the date of local IRB approval at each site and to document the manufacturer, field strength, and software version to be used for COMBINE MRIs. During the course of the NIDDK AASK multi-center trial, the calcium channel blocker treatment arm was discontinued, and we used the site table to capture the date each site’s IRB approved the revised consent (after which the site could begin re-enrollment under the new protocol). We have recently begun work on the NHLBI Empagliflozin to Improve Right Ventricular Function in Pulmonary ArTerial Hypertension (EmPATH) multi-center clinical trial, enrolling patients at the Cleveland Clinic, Johns Hopkins, and Vanderbilt. Funded via Clinical Coordinating Center UG3/UH3 to the Cleveland Clinic Department of Pulmonology and Data Coordinating Center (DCC) U24 to Cleveland Clinic QHS, the triple-masked parallel-arm trial will compare the effects of Empagliflozin vs. Placebo in pulmonary arterial hypertension patients. The primary outcome is change in Right Ventricular Ejection Fraction by Cardiac MRI. The Biorepository Core will document the date that each site has met EmPATH training criteria for sample processing, storage, and shipping.
The Imaging Core will document the date of approval for each clinical site’s Cardiac MRI system and Echocardiogram/Sonography system. This database table is programmed such that the dates of IRB approval are documented locally, and the dates of Core approvals are documented centrally. These data facilitate automatic checks of a site’s “Ready to Enroll” status. Best practices for trial documentation require attribution, including identification of those seeing patients, performing procedures, or collecting data. We use personnel tables to store trial IDs for each staff member. Other key components of the Personnel form include 1) a staff member’s role (e.g., Site PI, Physician, Study Coordinator), 2) “currently active,” a field that facilitates the DCC ending database access when a person leaves their position, and 3) individual training, certification, and (when needed) annual recertification. The database table is programmed such that the staff member’s role and “currently active” status are documented locally, and training and certification dates are documented centrally. We will present examples of site and personnel data collection systems from multiple trials for which we served as DCC, describing past utility of and upcoming plans for collecting and using these data.
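An automatic “Ready to Enroll” check of the kind described above might look like the sketch below. It assumes a site record that stores one date per required approval; the field names and the rule that every approval must be dated on or before the check date are illustrative assumptions, not the actual EmPATH table design.

```python
from datetime import date

# Hypothetical approval fields: the first is entered locally by the site,
# the others centrally by the Biorepository and Imaging Cores.
REQUIRED_APPROVALS = (
    "irb_approval",
    "biorepository_training",
    "cardiac_mri_approval",
    "echo_approval",
)

def ready_to_enroll(site_record, as_of):
    """Return (ready, missing) where missing lists approvals not yet in place."""
    missing = [
        field
        for field in REQUIRED_APPROVALS
        if site_record.get(field) is None or site_record[field] > as_of
    ]
    return (len(missing) == 0, missing)

# Hypothetical site with one Core approval still outstanding.
site = {
    "irb_approval": date(2025, 2, 3),
    "biorepository_training": date(2025, 2, 20),
    "cardiac_mri_approval": date(2025, 3, 1),
    "echo_approval": None,
}
status = ready_to_enroll(site, date(2025, 3, 15))
```

Returning the list of missing items, rather than only a yes/no flag, lets the coordinating center see at a glance what is holding a site back.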
P-104 – SUMMARIZING PRO-CTCAE: A NEW INDEX FROM AVERAGED COMPOSITE SCORES AT A CROSS-SECTIONAL TIMEPOINT
P-106 – ESTIMATING AND INTERPRETING INTERVENTION EFFECTS IN RANDOMISED MULTI-SESSION THERAPY TRIALS WITH PARTIAL INTERVENTION ADHERENCE: IS A BINARY DEFINITION OF COMPLIANCE APPROPRIATE?
In complex intervention trials, where the intervention consists of therapy given over multiple sessions, adherence of trial participants to the intervention is often partial, with many attending only a proportion of the prescribed sessions. This partial adherence presents challenges for estimating and interpreting intervention effects on outcomes. This research had two primary aims. The first was to investigate how non-adherence is reported in published trials and the analytical methods used to address it. A systematic review of individually randomized parallel-group trials involving multi-session therapy, published between 2019 and 2023 in leading medical journals, revealed that in most trials, data were analyzed using an intention-to-treat (ITT) approach. However, ITT does not account for the fact that a significant number of participants did not fully adhere to the intervention. In addition to the ITT approach, some studies applied Complier Average Causal Effect (CACE) analysis, a method that uses randomization as an instrument to estimate the causal effect among participants who adhered to the intervention. The CACE method typically uses a binary definition of compliance. Consequently, to use CACE in multi-session therapy trials, participants who had attended at least a certain number of therapy sessions were classified as compliers, while others were treated as non-compliers. This approach fails to account for varying adherence levels, and if the true compliance-outcome relationship does not follow a strict “jump” function where benefits only begin at a specific adherence cutoff, the binary CACE estimates may be biased. Additionally, the reliance on the assumption that randomized allocation has no effect on non-compliers, known as the exclusion restriction, potentially poses further challenges.
The second aim of this research was to address the limitations of the binary CACE approach by exploring an alternative continuous CACE approach and considering the implications on the interpretation of the estimate of the intervention effect. In continuous CACE the number of sessions attended by each participant is treated as continuous and the intervention effect is estimated as the average causal effect by session or by a proportion of the sessions attended. Both ITT and CACE approaches were applied to real trial data. The comparison of the ITT and CACE estimates highlighted how the respective estimands address distinct research questions and emphasized the need to clearly define the estimands to quantify the intervention effect accurately under different assumptions of adherence. A simulation study was conducted to compare the binary and continuous CACE methods by varying adherence levels and dose-response relationships including both linear and non-linear associations between compliance and outcomes. The continuous CACE method performed generally well under different dose-response assumptions. The binary CACE approach provided an unbiased estimate only when the true dose-response association was a jump function with no intervention effect until attendance reached a specific session threshold and the compliance cutoff used in the analysis closely matched this threshold. However, in multi-session therapy trials the true dose-response association is rarely known and is unlikely to follow a jump function and the continuous CACE may offer greater flexibility.
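The contrast between the binary and continuous CACE approaches can be illustrated with a small simulation. The sketch below is not the authors' simulation study: it assumes a linear dose-response, no access to therapy in the control arm, an unobserved prognostic factor that drives both attendance and outcome, and simple ratio (instrumental variable) estimators. All parameter values and function names are illustrative.

```python
import random

def simulate_trial(n=20000, effect_per_session=0.5, max_sessions=10, seed=7):
    """Simulate (arm, sessions attended, outcome) under a linear dose-response."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.randint(0, 1)            # randomized arm (1 = therapy offered)
        u = rng.gauss(0.0, 1.0)          # unobserved prognostic factor
        # Participants with higher u tend to attend more sessions.
        propensity = min(max(0.5 + 0.2 * u, 0.05), 0.95)
        sessions = sum(rng.random() < propensity for _ in range(max_sessions)) if z else 0
        y = effect_per_session * sessions + u + rng.gauss(0.0, 1.0)
        rows.append((z, sessions, y))
    return rows

def mean(values):
    return sum(values) / len(values)

def itt(rows):
    """Intention-to-treat effect: difference in mean outcome by arm."""
    return mean([y for z, _, y in rows if z == 1]) - mean([y for z, _, y in rows if z == 0])

def continuous_cace(rows):
    """ITT effect divided by the between-arm difference in mean sessions attended."""
    diff = mean([d for z, d, _ in rows if z == 1]) - mean([d for z, d, _ in rows if z == 0])
    return itt(rows) / diff

def binary_cace(rows, cutoff):
    """ITT effect divided by the between-arm difference in the proportion at or above the cutoff."""
    p1 = mean([1.0 if d >= cutoff else 0.0 for z, d, _ in rows if z == 1])
    p0 = mean([1.0 if d >= cutoff else 0.0 for z, d, _ in rows if z == 0])
    return itt(rows) / (p1 - p0)

rows = simulate_trial()
per_session = continuous_cace(rows)       # close to the simulated 0.5 per session
thresholded = binary_cace(rows, cutoff=6)
```

In this setup the binary estimator assigns the whole ITT effect to participants at or above the cutoff, even though those below the cutoff also benefit in the simulation, so its value depends heavily on the chosen threshold. The continuous estimator recovers the simulated per-session effect, but only because the simulated dose-response is linear.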
P-108 – PRAGMATIC MONITORING OF EMERGING EFFICACY DATA IN RANDOMIZED CONTROLLED TRIALS
Monitoring the conduct of Phase III randomized controlled trials is driven by ethical reasons to protect the study integrity and the safety of trial participants. We propose a group sequential, pragmatic approach for monitoring the accumulating efficacy information in randomized controlled trials. The “PHRI boundary” is simple to implement and sensible, as it considers the reduction in uncertainty with increasing information as the study progresses. It is also pragmatic, since it takes into consideration the typical monitoring behavior of monitoring committees of large multicenter trials and is relatively easily implemented. It not only controls the overall Lan-DeMets Type I error probability (alpha) spent, but performs better than other group sequential boundaries for the total nominal study alpha. We illustrate the use of our monitoring approach in the early termination of the Heart Outcomes Prevention Evaluation (HOPE) trial and the Cardiovascular OutcoMes for People using Anticoagulation StrategieS (COMPASS) trial.
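The abstract does not give the form of the “PHRI boundary”, so it is not reproduced here. As general background on alpha spending, the sketch below evaluates the O'Brien-Fleming-type Lan-DeMets spending function, alpha(t) = 2 - 2*Phi(z_{alpha/2} / sqrt(t)), at a set of assumed information fractions; the four equally spaced looks and the two-sided alpha of 0.05 are illustrative choices.

```python
from statistics import NormalDist

def obf_type_alpha_spent(info_fraction, alpha=0.05):
    """Cumulative two-sided alpha spent at a given information fraction."""
    if not 0.0 < info_fraction <= 1.0:
        raise ValueError("information fraction must be in (0, 1]")
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1.0 - alpha / 2.0)
    return 2.0 - 2.0 * std_normal.cdf(z / info_fraction ** 0.5)

# Hypothetical schedule of four equally spaced interim looks.
looks = [0.25, 0.5, 0.75, 1.0]
cumulative = [obf_type_alpha_spent(t) for t in looks]
# Alpha newly spent at each look is the increment in the cumulative function.
increments = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]
```

Very little alpha is spent at early looks, which reflects the idea that stronger evidence should be required while information is still sparse.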
P-109 – RATES AND PREDICTORS OF MISSINGNESS IN CLINICAL TRIALS FOR SUBSTANCE USE: A SECONDARY ANALYSIS OF EIGHT NIDA CLINICAL TRIALS NETWORK STUDIES
P-110 – NON-INFERIORITY AND EQUIVALENCY TESTING IN THE FOUR ARM RANDOMIZED HYBRID TYPE I EFFECTIVENESS-IMPLEMENTATION STUDY OF AN EHEALTH DELIVERY ALTERNATIVE FOR CANCER GENETIC TESTING FOR HEREDITARY CANCER (EREACH2)
Investigating non-inferiority or equivalency in a trial with three or more arms is less common than in trials with two arms. In this work, we discuss non-inferiority and equivalency designs we developed for a four-arm trial. The motivating study is the Randomized Hybrid Type I Effectiveness-Implementation Study of an eHealth Delivery Alternative for Cancer Genetic Testing for Hereditary Cancer (eREACH2). Germline cancer genetic testing has become a standard evidence-based practice, with established risk reduction and cancer screening guidelines for genetic carriers. The eREACH2 study investigates in-person visits versus a web-based eHealth intervention for pre-genetic test counseling and post-test disclosure. The trial will inform whether an eHealth intervention can provide non-inferior behavioral outcomes when compared to traditional in-person counseling. Both pre-test and post-test sessions will be randomized. This results in four treatment arms (both sessions in-person, both sessions eHealth, and two arms that are mixtures of in-person and eHealth). Our three primary endpoints will be 1) uptake of services and change in 2) knowledge and 3) anxiety. We will test whether eHealth delivery alternatives are non-inferior (knowledge and anxiety) or equivalent (uptake) to traditional counseling. In non-inferiority and equivalency tests, null and alternative hypotheses are the reverse of usual; for non-inferiority, the null hypothesis is that the eHealth delivery alternative is worse than the traditional delivery model. Our non-inferiority test for the two continuous variables will be based on an ANOVA F-statistic test jointly comparing the four randomization arms.
We will fail to reject the null hypothesis of inferiority if the joint p-value is less than 0.2 and all of the three experimental web arms have a standardized effect of 0.138 standard deviation units or worse (with the direction standardized such that higher values imply beneficial change) when compared to the control in-person only arm. This design will approximate a one-sided test, so that we will declare non-inferiority if the intervention arms have better outcomes. We chose a 0.138 cut-point because it is a small standardized effect. For the binary uptake endpoints, we will conduct a Chi-squared test of the 4x2 table of uptake among the four arms and require a p-value of 0.1 or greater to declare equivalence. Our design gives us >87% power and <1.67% Type I error rates with 175 to 215 participants per arm under a range of null (i.e., encouraging) and alternative (i.e., discouraging) hypotheses. In this presentation, we will describe our assumptions and simulations that justify our approach. Our design could be useful for others designing non-inferiority rules for trials with multiple arms.
P-111 – GREENER ACADEMIC CLINICAL TRIAL MONITORING
P-112 – RECOGNITION, REMUNERATION AND REIMBURSEMENT OF PATIENT AND PUBLIC RESEARCH PARTNERS IN PRAGMATIC RANDOMISED CONTROLLED TRIALS. A SURVEY OF AUTHOR PRACTICES
P-113 – INTEGRATING IRT INTO EDC FOR ADVANCED RANDOMIZATION DESIGNS
In the last few decades, many advanced randomization designs with better statistical properties have been published, but they are rarely used in clinical trial practice. For example, the maximum tolerated imbalance procedure provides a better trade-off between treatment imbalance and allocation predictability than the commonly used permuted block randomization. The minimal sufficient balance method changes the concept of minimization method and provides higher allocation randomness while controlling imbalances of multiple baseline covariates (not only categorical types but also continuous types). The mass-weighted urn design can accurately target multi-arm unequal allocations, especially in trials with response adaptive randomization (RAR), avoiding the dilemma of having to choose between the permuted block design with low allocation accuracy and the complete randomization with low allocation precision. In practice, researchers who want to use these advanced randomization designs often find that their Electronic Data Capture (EDC) systems or Interactive Response Technology (IRT) providers do not offer them. Two major obstacles contribute to this unfortunate situation. First, most EDC and IRT systems are provided by different vendors and cannot communicate with each other. Second, it is expensive to develop, test, validate, and document new randomization designs in software systems. To overcome these difficulties, an innovative strategy is developed by the Data Coordination Unit at the Medical University of South Carolina, with an integrated subject randomization module within the EDC system and a generic database object providing treatment assignment based on the randomization algorithms specified by the investigator for the trial. Most EDC and IRT systems are running on the internet. The integration of the IRT functionality in the EDC system allows information captured in the EDC system to be used for subject randomization without redundant data entry. 
A subject randomization request is processed as the submission of a special case report form. Information on previous treatment assignments and their distributions, baseline covariate data for the current and previously randomized subjects, as well as site investigational product (IP) inventory can be retrieved from the EDC system and used by the randomization algorithm. All randomization algorithms can be implemented based on the conditional allocation probabilities and the value of a random number. The random number can be retrieved from a pre-generated random number list or generated by the computer in real time when the subject randomization is requested. Under this strategy, the pre-generated randomization list is replaced by a mathematical formula that produces each treatment assignment by adapting to the randomization history. This approach eliminates the risk of treatment allocation concealment failures and removes the damage that trial operation glitches can do to the integrity of the randomization procedure. In this presentation, two examples of implementing randomization designs directly in the EDC system will be discussed, one with a 5-arm RAR and a covariate balance requirement and the other with 3-arm equal allocation, long-term follow-up IP resupply, and tightened site IP inventory control.
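The idea that any randomization algorithm reduces to conditional allocation probabilities plus one random number can be illustrated with a minimal sketch. This is not the Data Coordination Unit's implementation: the function names (`assign_arm`, `big_stick_probs`, `randomize`) are hypothetical, and the two-arm big stick design is used here only as one simple example of a maximum tolerated imbalance (MTI) procedure, assuming equal allocation.

```python
import random
from typing import List, Sequence


def assign_arm(probs: Sequence[float], u: float) -> int:
    """Map a uniform random number u in [0, 1) to an arm index.

    probs holds the conditional allocation probabilities for the current
    subject; the arm whose cumulative-probability interval contains u wins.
    """
    cumulative = 0.0
    for arm, p in enumerate(probs):
        cumulative += p
        if u < cumulative:
            return arm
    return len(probs) - 1  # guard against floating-point round-off


def big_stick_probs(n_a: int, n_b: int, mti: int) -> List[float]:
    """Conditional probabilities [P(A), P(B)] under a two-arm big stick design.

    Assumption for illustration: allocation is a fair coin flip unless the
    imbalance n_a - n_b has reached the MTI, in which case the next subject
    is forced to the under-represented arm.
    """
    imbalance = n_a - n_b
    if imbalance >= mti:
        return [0.0, 1.0]
    if imbalance <= -mti:
        return [1.0, 0.0]
    return [0.5, 0.5]


def randomize(n_subjects: int, mti: int, rng: random.Random) -> List[int]:
    """Assign subjects one at a time, adapting to the randomization history."""
    counts = [0, 0]
    sequence = []
    for _ in range(n_subjects):
        probs = big_stick_probs(counts[0], counts[1], mti)
        arm = assign_arm(probs, rng.random())  # random number drawn in real time
        counts[arm] += 1
        sequence.append(arm)
    return sequence
```

In this sketch no randomization list is stored; each assignment is computed from the current arm counts at the moment of the request. Replacing `rng.random()` with a lookup into a pre-generated random number list would give the other option described above.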
P-114 – CONSENT DEVIATIONS IN AN ACUTE ISCHEMIC STROKE CLINICAL TRIAL UTILIZING PAPER AND ELECTRONIC CONSENT (ECONSENT)
P-116 – HANDLING OF INCOMPLETE BASELINE COVARIATES IN CLUSTER-RANDOMISED TRIALS: A SIMULATION STUDY
P-117 – THE RESEARCH EWORKFLOW TOOL: A MULTI-USER INTERACTIVE SCREENING PROCESS FOR CLINICAL RESEARCH WORKFLOW
P-118 – QUALITY ASSESSMENT OF NON-INFERIORITY TRIALS IN ONCOLOGY BASED ON CONSORT GUIDELINE
P-120 – THE PREVENTABLE STUDY CALL TRACKING AND SCHEDULE MANAGEMENT SYSTEM
This presentation will focus on the call tracking and schedule management system designed for telephone-based cognitive and physical function assessments of participants, age 75 and older, enrolled in the US PRagmatic Evaluation of events and Benefits of Lipid-lowering in older adults (PREVENTABLE) trial. We will discuss the specifications of the web-based call tracking and schedule management system design, focusing on how the application’s flexible and dynamic design allows for real-time data entry and communication with clinical sites. This robust design ensures high retention for the annual calls used to obtain phone-only data on participants for 5 years. The PREVENTABLE Call Tracking and Schedule Management System is a complex system currently built on SQL tables, app-specific development files, and stored procedures that create and maintain staff hours of availability and track a participant’s study history through enrollment, randomization, and multiple years of follow-up contacts. The interface allows staff administrators to add callers to the team and manage available call hours for each individual call staff member. On separate screens, staff can visually ascertain a call’s current status, initiate the data entry process, update the call status at any point throughout the entire lifecycle for each active call type, and review the history of each interaction and all comments documented by clinic or call center staff. Various triggers, alerts, and procedures create, update, and/or remove calls from the main interface behind the scenes as needed. We will also touch on maintenance processes and alerts that are run daily to monitor participant status throughout the call lifecycle. Since staff scheduling is the foundation of the system, caller assignments are randomly selected from the available time slot list. This is done on purpose; otherwise, SQL would pull time slots in alphabetical order.
Participants whose language preference is Spanish are paired only with Spanish-speaking staff. Confirmation screens reiterate appointments with the participant’s time zone and date. Some processes run hourly and others fire instantly as calls end or retention status changes. We will review specifics important to the system’s functionality, including the exclusive use of 24-hour (military) time, on-screen feedback, and guided data entry, as well as key features of the dashboard screens and the rationale behind their layout for efficiency. This well-choreographed dance between human actions and web technology is ever changing with each annual cycle that passes. We’ve learned a great deal over the years by working hand in hand with team members, and we look forward to the challenges to come.
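The random caller assignment and the Spanish-language pairing rule described above can be sketched as follows. This is an illustrative assumption in Python, not the PREVENTABLE system's SQL: the `Slot` record, its fields, and `pick_slot` are hypothetical names.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Slot:
    caller: str           # hypothetical staff identifier
    start: str            # start time in 24-hour "HH:MM" format
    speaks_spanish: bool


def pick_slot(slots: List[Slot], needs_spanish: bool,
              rng: random.Random) -> Optional[Slot]:
    """Pick one available slot at random, honoring the language rule.

    Choosing at random avoids always favoring whichever caller happens to
    sort first (for example, alphabetically) in the available-slot list.
    """
    eligible = [s for s in slots if s.speaks_spanish or not needs_spanish]
    if not eligible:
        return None  # no matching staff availability
    return rng.choice(eligible)


slots = [
    Slot("Alvarez", "09:00", True),
    Slot("Baker", "09:00", False),
    Slot("Chen", "13:30", False),
    Slot("Diaz", "15:00", True),
]
chosen = pick_slot(slots, needs_spanish=True, rng=random.Random())
```

Here `chosen` is always one of the Spanish-speaking callers, while a participant with no language restriction can be paired with any of the four.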
