Abstract
Background
Classic and second-generation antipsychotic mood stabilizers are recommended for treatment of bipolar disorder, yet there are no randomized comparative effectiveness studies that have examined the ‘real-world’ advantages and disadvantages of these medications.
Purpose
We describe the strategic decisions in the design of the Clinical and Health Outcomes Initiative in Comparative Effectiveness for Bipolar Disorder (Bipolar CHOICE). This article outlines the key issues and solutions the investigators faced in designing a clinical trial that would maximize generalizability and inform real-world clinical treatment of bipolar disorder.
Methods
Bipolar CHOICE was a 6-month, multi-site, prospective, randomized clinical trial of outpatients with bipolar disorder. This study compares the effectiveness of quetiapine versus lithium, each with adjunctive personalized treatments (APTs). The co-primary outcomes selected are the overall benefits and harms of the study medications (as measured by the Clinical Global Impression-Efficacy Index) and the Necessary Clinical Adjustments (a measure of the number of medication changes). Secondary outcomes are continuous measures of mood, the Framingham General Cardiovascular Risk Score, and the Longitudinal Interval Follow up Evaluation Range of Impaired Functioning Tool (LIFE-RIFT).
Results
The final study design consisted of a single-blind, randomized comparative effectiveness trial of quetiapine versus lithium, plus APT, across 10 sites. Other important study considerations included limited exclusion criteria to maximize generalizability, flexible dosing of APT medications to mimic real-world treatment, and an intent-to-treat analysis plan. In all, 482 participants were randomized to the study, and 364 completed the study.
Limitations
The potential limitations of the study include the heterogeneity of APT, selection of study medications, lack of a placebo-control group, and participants’ ability to pay for study medications.
Conclusion
We expect that this study will inform our understanding of the benefits and harms of lithium, a classic mood stabilizer, compared to quetiapine, a second-generation antipsychotic with broad-spectrum activity in bipolar disorder, and will provide an example of a well-designed and well-conducted randomized comparative effectiveness clinical trial.
Background
Bipolar disorder is a lifelong, chronic, and highly recurrent mood disorder characterized by episodes of mania (bipolar I subtype) or hypomania (bipolar II subtype) that alternate with episodes of major depression [1]. Compared to other psychiatric disorders, bipolar disorder is likely to be accompanied by lifetime co-occurring anxiety, substance use, and comorbid medical conditions [2,3]. Standardized mortality ratios range between 1.6 and 2.1 [4], with most mortality due to suicide and cardiovascular disease [5]. Major depressive episodes in bipolar disorder are associated with 25%−56% of lifetime suicide attempts and 10%−19% of deaths by suicide [6], with an earlier age of first suicide attempts associated with more comorbid conditions [7]. Bipolar disorder is among the top 10 causes of disability worldwide [8] with direct and indirect costs estimated at US$151 billion [9].
Mood stabilizers, medications that acutely treat and prevent future mood episodes, are foundational treatments for bipolar disorder. Treatment guidelines recommend that pharmacotherapy should include mood stabilizers for long-term maintenance treatment, but researchers have not conducted randomized comparative effectiveness studies that examine the ‘real-world’ advantages and disadvantages of the second-generation antipsychotics (SGAs) with mood stabilizing effects compared to the classic mood stabilizer, lithium. Large observational studies have, however, implicated SGAs, as well as older antipsychotics, in causing a dose-related increase in the risk of sudden cardiac death [10,11], weight gain, and dyslipidemias [10,12].
Purpose
A current dilemma for clinical trials is maximizing generalizability and translation of results while maintaining their scientific rigor and integrity. For example, current efficacy designs for SGAs in bipolar disorder required by the Food and Drug Administration (FDA) for new indications limit external validity (generalizability) to optimize internal validity and assay sensitivity. These efficacy studies may inflate therapeutic effect sizes by using enriched relapse prevention designs, excluding recently depressed bipolar patients (the phase where most patients attempt suicide), excluding patients with bipolar II disorder (equally prevalent to bipolar I and more likely to be accompanied by depression, suicide, and lability of mood episodes), and most importantly, excluding difficult-to-treat patients (e.g., those with rapid cycling, co-occurring substance use, medical conditions, or anxiety disorders). While improving internal validity and assay sensitivity, the use of such exclusion criteria has markedly limited generalizability. Designs of recent bipolar studies have excluded 16%−62% of potential patients [13]. The dearth of studies on effectiveness, sustainability of symptom benefit, longer-term harms, and functional outcomes has resulted in large gaps in practical clinical knowledge about the actual utility of SGAs in bipolar disorder [14].
The Bipolar Trials Network, a collaboration of 10 clinical research sites (i.e., Massachusetts General Hospital, Stanford University, the University of Pittsburgh, Case Western Reserve University, the University of Texas Health Science Center at San Antonio, University of Pennsylvania, University of Michigan, University of Alabama at Birmingham, Lindner Center for Hope of the University of Cincinnati, and Weill Cornell Medical College), designed the Clinical and Health Outcomes Initiative in Comparative Effectiveness for Bipolar Disorder (Bipolar CHOICE) study to address the clinical gap left by previous research studies. Given that comparative effectiveness trials are still uncommon, and often controversial, the authors (members of the Bipolar Trials Network) discuss their process in developing this study. The final study design assesses the benefits and harms of a prototypical and widely used SGA with mood stabilizing properties, quetiapine (QTP), compared to the classic mood stabilizer lithium (Li), as the foundational treatment along with other necessary adjunctive personalized treatments (APT; i.e., guideline-informed, evidence-based, and personalized therapy based on current symptoms, prior treatment history, and course of disorder). This article describes the rationale for key design features, such as selection of study medications, outcomes, considerations for participant eligibility, and the use of a network to conduct clinical trials. We will also present the final methods and design of the Bipolar CHOICE study.
Methods: decision points of a comparative effectiveness trial for bipolar disorder
Selection of study medications
Bipolar disorder is a complex and chronic disorder, often treated with multiple medications which need to be adjusted over time and tailored for the waxing and waning of multiple symptoms. For bipolar disorder, multiple medications have been approved by the FDA, but of these, two classes are considered first-line treatments: mood stabilizers and SGAs, each with its own benefits and risks. These classes of medications have not been compared in a real-world pragmatic trial. This is a significant gap in our understanding of bipolar treatment. Since several mood stabilizers and SGAs are available, the Bipolar Trials Network had to decide which mood stabilizer and SGA to compare.
Rationale for selection of SGA: quetiapine
Quetiapine was chosen as the SGA comparator to the prototypic classic mood stabilizer in this study because it is the only SGA that has obtained both monotherapy and adjunctive use FDA-approval for the short- and long-term treatment of major depressive episodes in bipolar disorder, for the short- and long-term treatment of manic episodes, and for both types of bipolar disorder (type I and type II). No other SGA has achieved this distinction. It is also the most widely prescribed antipsychotic agent in bipolar disorder in this country. However, the broad spectrum of efficacy of quetiapine and its therapeutic effect sizes (which are unusually large, Cohen’s d = 0.7–1.1 for acute studies and hazard ratios of 0.70 for long-term studies) have been attributed in part to functional nonblinding from high rates of sedation, somnolence, and lethargy. These side effects frequently count as therapeutic effects (e.g., sedation from quetiapine will alleviate insomnia), which can inflate the therapeutic effect size. The nonoverlapping rates of these three crucial side effects (i.e., sedation, somnolence, lethargy) have ranged between 40% and 60% within five acute bipolar I or II depression studies [15–19]. More importantly, there exists a specific concern regarding the metabolic profile of quetiapine, and even the safety and tolerability of SGAs as a class [20]. In sum, quetiapine has received broad regulatory approval worldwide, but there is a compelling public health need to elucidate the metabolic burden and cardiovascular risk associated with the use of this compound and other atypical antipsychotics compared to a classic mood stabilizer.
Rationale for selection of classic mood stabilizer: lithium
Lithium was selected as the prototypic mood stabilizer comparator as it is an inexpensive off-patent ‘orphan compound’ widely considered to be the classic treatment for bipolar disorder by all treatment guidelines worldwide [21–23]. A series of authoritative reviews have unequivocally reaffirmed the role of lithium as the classic prototypic mood stabilizer for treatment of bipolar disorder [24]. These reviews have assimilated a vast literature that demonstrates lithium’s short- and long-term anti-manic efficacy [25], as well as its ability to prevent death due to suicide [26–28], cancer, heart disease, and other causes of premature death in individuals with bipolar disorder [27,29]. Indeed, the role of lithium as the prototypic mood stabilizer for bipolar disorder is generally accepted [25,30].
Nevertheless, a recent review of the short- and long-term treatment of major depressive episodes associated with bipolar I or II disorder raised important questions regarding the evidence base supporting the usefulness of lithium [31]. Lithium’s limited efficacy in the depressed phase of bipolar disorder, the phase where patients spend the majority of their symptomatic lives [32,33], compromises its overall usefulness. Moreover, lithium was not statistically significantly different than placebo over 8 weeks in the industry-sponsored multicenter, randomized, double-blind Efficacy of Monotherapy Seroquel in BipOLar DepressioN (EMBOLDEN) II trial in which quetiapine monotherapy was superior to placebo [16]. In marked contrast, quetiapine has been shown to possess comparable short- and long-term efficacy in both manic and depressive phases of bipolar disorder.
Considerations for other potential medications
Numerous authoritative reviews describe the role of the anticonvulsant mood stabilizers (valproate/divalproex, lamotrigine, carbamazepine) in the treatment of bipolar disorder, although the body of work supporting their use is less extensive. In a recent review [34], valproate, principally as divalproex, was described as having strong evidence for effectiveness in mania and moderately strong evidence for benefits in prophylaxis of recovered mood states. However, the efficacy of valproate in the short-term treatment of bipolar I or II depression is less clear, with only four relatively small randomized controlled trials (including a total of 142 subjects) supporting its antidepressant efficacy [35]. The most recent of these trials was the largest and only included mood stabilizer-naïve subjects, 28 randomized to placebo and 26 to divalproex [36], and in this positive proof of concept, evidence of efficacy was only present in subjects with bipolar I (but not bipolar II) depression. Lamotrigine has a strong evidence base supporting its effectiveness in the maintenance treatment of bipolar I disorder, but principally for the prevention of major depressive episodes. In addition, lamotrigine has been established as ineffective in mania. Randomized, double-blind, placebo-controlled evidence supporting lamotrigine monotherapy use in acute bipolar depression is most often lacking [37,38]. A notable exception is a study done by Van der Loos et al. [39] that demonstrated the adjunctive usefulness of lamotrigine when used with lithium in acute bipolar depression. Carbamazepine has rigorous evidence for efficacy in acute mania, but lacks substantial evidence for efficacy for other aspects of bipolar disorder treatment.
Although these three anticonvulsants, and valproate in particular, were described as having an adequate evidence-base supporting their use in bipolar disorder, there is almost no pragmatic effectiveness data supporting the use of the anticonvulsants, with the exception of Bipolar Affective disorder: Lithium/ANticonvulsant Evaluation (BALANCE), an enriched comparative effectiveness study that compared long-term treatment with lithium, valproate, or the combination for those who responded to the combination acutely [40]. In addition, the extent to which the anticonvulsants decrease risk for death due to suicide and other illnesses is unclear. For these reasons, we selected lithium as the classic mood stabilizer over these other options.
Finalizing the comparison groups
After selecting quetiapine and lithium as the specific medications of interest, our next challenge was how to conduct a ‘real-world’ comparison of these medications given that they are often prescribed with other adjunct medications. Bipolar patients take multiple medications to achieve remission and/or to manage comorbid psychiatric disorders. Over three quarters of individuals with bipolar disorder have at least one comorbid psychiatric diagnosis, and over 40% have three or more, highlighting the complexity of treatment [41]. About a third of bipolar patients take more than one psychotropic medication with a mean of about three medications per patient [42,43]. Given these patterns of treatment, a comparative effectiveness trial of quetiapine or lithium monotherapy would not reflect the clinical needs of bipolar patients in the community. Therefore, we rejected the option of a monotherapy comparison and, instead, decided to include additional adjunctive psychiatric medications, with clear rules of engagement that would still allow us to compare the overall effects of the study medications. Clinicians were instructed to prescribe additional medications tailored for each participant informed by systematic diagnostic assessments, tracking of symptoms and side effects, and therapeutic blood levels of medications (when appropriate). We chose to call these additional medications APT.
The Texas Implementation of Medication Algorithm (revised guidelines) became the foundation of APT [44]. Each randomized group receives APT to manage symptoms or episodes informed by published guidelines as adapted for the purposes of this study. Consistent with comparative effectiveness principles, APT is allowed to change in both groups over the course of the 6-month study, based on clinical needs of the participants and consistent with the goal of helping participants achieve sustained remission. The changes in APT and the reasons for those changes are carefully recorded, and the number of medication changes per month (Necessary Clinical Adjustments (NCAs)) will be a co-primary outcome. Equivalence of implementation of evidence-based APT for each randomized group will be monitored by an independent expert. The rationale for using APT as a control group is described elsewhere [45]; thus, we describe the definition of APT for this study and the rationale for this definition.
It was critical for the integrity of the science of this comparative effectiveness study to determine what to include or exclude from APT. For example, should APT include or exclude all other medications used as mood stabilizers? If all mood stabilizers are included, the two arms could be so similar as to negate any difference and end up comparing the combination of an SGA and mood stabilizer with the combination of a mood stabilizer and an SGA. On the other hand, since most patients with bipolar disorder take a combination of medications, to overly restrict treatments would violate the ecological validity of the study. Another consideration is that to successfully implement the protocol of a large effectiveness trial, the ‘rules of engagement’ of treatment for the clinicians to follow must be transparent and clear.
The Bipolar Trials Network had extensive consensus meetings about the advantages and disadvantages of including key medications used in bipolar disorder. We had unanimous agreement that no additional antipsychotic could be used in either arm, unless used as a rescue treatment (which would be considered a study procedure deviation and tracked using a deviation form). If the other widely used mood stabilizers, such as divalproex, were included, we could, as a result, compare quetiapine plus divalproex to lithium plus divalproex. Such comparisons would inform clinicians about the relative strengths and limitations of combination therapies in a real-world context and allow the clinicians to use all of the tools available to treat their patients. The disadvantage of including other mood stabilizers is that if one believes that lithium and other mood stabilizers are equal, then the distinction between the quetiapine and lithium groups would be blurred. The argument against this is that lithium and other mood stabilizers are distinctly different medications with distinctive mechanisms of action and efficacy spectra. After spirited debate, the Bipolar Trials Network decided to include other mood stabilizers in both groups to (a) maximize ecological validity and (b) enhance the feasibility of the study by allowing clinicians to have a full range of medications commonly used in combination that would maintain the integrity of the study. We defined the rules for APT as follows: The QTP+APT group can receive all available treatments except for Li and other antipsychotics, and the Li+APT group can receive all available treatments except antipsychotics. If during the trial, however, participants require the addition of any restricted medication (e.g., if the Li+APT group develops psychotic symptoms), then they may discontinue being on protocol, but remain in the study consistent with true intent-to-treat principles and comparative effectiveness designs.
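The APT rules stated above can be written as a small eligibility check. The following is a minimal sketch only: the arm labels, the drug-class labels, and the function name `is_permitted` are hypothetical choices made for illustration, and rescue use of an antipsychotic (a tracked protocol deviation) is deliberately not modeled.

```python
# Illustrative encoding of the APT rules described in the text.
# Arm labels and class labels are hypothetical names chosen for this sketch.

ANTIPSYCHOTIC = "antipsychotic"
ARMS = ("QTP+APT", "Li+APT")


def is_permitted(arm: str, drug: str, drug_class: str) -> bool:
    """Return True if `drug` may be added on protocol as APT in `arm`.

    arm        -- "QTP+APT" or "Li+APT"
    drug       -- lowercase generic name, e.g. "divalproex"
    drug_class -- coarse class label supplied by the caller
    """
    if arm not in ARMS:
        raise ValueError(f"unknown arm: {arm}")
    # No additional antipsychotic is allowed as APT in either arm.
    # (Randomized quetiapine in QTP+APT is study medication, not APT.)
    if drug_class == ANTIPSYCHOTIC:
        return False
    # Lithium is additionally restricted in the quetiapine arm.
    if arm == "QTP+APT" and drug == "lithium":
        return False
    # Everything else, including other mood stabilizers, is allowed.
    return True
```

Under this sketch, divalproex is permitted in both arms, which is the design choice the network made to preserve ecological validity, while lithium in the QTP+APT arm and any antipsychotic added as APT are rejected.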
Selecting the study outcomes
Our third decision, after selecting the specific medications and the allowable adjunctive medications, was to determine the study outcomes. One of the main goals of the study was to minimize subject burden while maximizing useful data. Given that bipolar disorder is a complex condition with varying degrees of severity and course, as well as a range of impairment, it was challenging to settle on a limited number of outcomes that would accurately assess the multi-faceted nature of the disease. Furthermore, patients often experience a wide range of side effects which need to be assessed in addition to illness severity. This poses another dilemma: How do you choose which side effects to measure if multiple medications are allowed? Which side effects could potentially be included as a study outcome? Clinicians treating bipolar disorder have become increasingly concerned about the disproportionate medical burden and risk factors for cardiovascular disease associated with bipolar disorder [46]. Thus, we agreed to prioritize assessing side effects as an outcome and to measure the total side effect burden by incorporating a cost:benefit ratio analysis of improvement in symptoms with potential worsening of side effects. Based on these defining principles, we selected the following outcome measures.
Clinical Global Impression-Efficacy Index
The Clinical Global Impression-Efficacy Index (CGI-EI) is a scale that combines efficacy and burden of adverse events, allowing it to assess the comparative benefit of the treatments in relation to the harms or side effects [47]. This measure has particular importance within comparative effectiveness research because it provides a benefit:risk ratio that can be compared across treatment groups. Thus, the treatment that improves symptoms with the fewest side effects would have a more favorable benefit:risk ratio.
NCAs
The Medication Recommendation Tracking Form (MRTF) was developed and successfully implemented to capture recommended medication changes at each study visit [45]. Clinicians record dosage changes, missed doses, and new medications added or discontinued, and specify the reason for each change. NCAs represent changes made because of lack of effectiveness or poor tolerability, as opposed to planned dose titrations; they allow us to determine whether one treatment required significantly more medication alterations than the other to achieve symptomatic improvement or remission.
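As a rough illustration of how an NCA rate could be derived from recorded changes, the sketch below counts only changes attributed to lack of effectiveness or intolerance and divides by months in the study. The record layout and reason codes are hypothetical and do not reproduce the MRTF; the conversion of 4 weeks per month is an assumption consistent with the protocol treating week 24 as 6 months.

```python
from dataclasses import dataclass
from typing import Iterable

# Hypothetical reason codes; the actual MRTF fields are not reproduced here.
NCA_REASONS = {"lack_of_effectiveness", "intolerance"}


@dataclass
class MedicationChange:
    week: int    # study week of the visit at which the change was recorded
    reason: str  # e.g. "lack_of_effectiveness", "intolerance", "planned_titration"


def ncas_per_month(changes: Iterable[MedicationChange], weeks_in_study: int) -> float:
    """Count changes that qualify as NCAs and express them per month."""
    if weeks_in_study <= 0:
        raise ValueError("weeks_in_study must be positive")
    # Planned titrations and other non-qualifying reasons are excluded.
    n_ncas = sum(1 for c in changes if c.reason in NCA_REASONS)
    months = weeks_in_study / 4.0  # assumption: 4 weeks per month
    return n_ncas / months
```

For example, a participant with three qualifying changes and one planned titration over 24 weeks would have 0.5 NCAs per month under these assumptions.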
Framingham Cardiovascular Risk Score
The Framingham Cardiovascular Risk Score captures the risk factors for cardiovascular disease (i.e., age, sex, systolic blood pressure, total and high density lipoprotein cholesterol, diabetes mellitus, smoking) [12,48,49]. The Framingham Cardiovascular Risk Score was developed as a simple predictive tool for determining 10-year risk for developing cardiovascular disease [50].
Longitudinal Interval Follow up Evaluation Range of Impaired Functioning Tool
Longitudinal Interval Follow up Evaluation Range of Impaired Functioning Tool (LIFE-RIFT) assesses the extent to which psychopathology has impacted current functioning in work; household chores; interpersonal relationships with partner, family, and friends; recreational activities; life satisfaction; leisure activities; and social relationships [51]. This tool was chosen to evaluate whether the treatments improve everyday functioning and whether these improvements are related to a reduction in the severity of symptoms.
Bipolar Inventory of Symptoms Scale
The Bipolar Inventory of Symptoms Scale (BISS) [52,53] is a comprehensive assessment of mood, which yields an overall severity of bipolar illness score, aggregating all elements of illness and five domain scores for manic, depressive, anxious, irritable, and psychotic elements, the latter established in exploratory factor analyses [54]. The BISS includes several items either absent or indirectly addressed in current mania or depression rating scales (e.g., anxiety and affective instability). The level of detail in the BISS is especially important in comparative effectiveness research which seeks to find both broad and fine differences in the effects of treatments. Since it has been tested in diverse clinical bipolar populations, the BISS has the potential to facilitate the development of new treatments and compare existing treatments since it allows for the assessment of treatment effects on specific behavioral components of bipolar disorder.
Intent to attend
This single-item measure assesses participants’ intent to attend the next study visit, or to complete the entire study [55]. It consists of a Likert scale from 0 (not likely to attend at all) to 9 (extremely likely to attend). If the patient gave a rating of 4 or below, the study coordinator was prompted to discuss with the patient what issues they were having, such as dissatisfaction with medication, burden of study procedures, or compensation.
Considerations for participant eligibility
Our fourth key study decision in designing a comparative effectiveness study for a complex disease, such as bipolar disorder, was how to maximize generalizability. As mentioned above, the complex course and wide spectrum of symptoms of bipolar disorder, along with multiple comorbid psychiatric disorders, make designing a generalizable study challenging.
One way to confront this challenge is to recruit a homogeneous population with similar illness characteristics and thereby restrict our subject pool with narrow inclusion and exclusion criteria; however, this would limit generalizability. Thus, we decided to get as close as we could to routine clinical treatment of bipolar disorder when we compared quetiapine and lithium, while including APT [14]. In the spirit of conducting a ‘real-world’ trial, we included participants with comorbid substance abuse and anxiety disorders and individuals with a history of medication nonadherence. However, participants who were in crisis (e.g., needing urgent inpatient hospitalization or alcohol/substance detoxification) would not be eligible to enroll in the study. These participants were excluded because they needed acute crisis intervention or detoxification and would not be appropriate to participate in a randomized controlled longer-term trial.
We considered excluding participants with a prior history of treatment failure; however, such information is subject to informational errors, adequacy of the trial in time, and dosage and context of use (e.g., concurrent medications, type of episode). Participants were excluded only if they had difficulty tolerating the study medications or had not responded after having an adequate trial. We did not exclude participants who had responded to lithium or quetiapine in the past, as long as they were willing to be randomized to either study medication. We did not exclude participants currently taking one of the study medications at the time of the screening visit if the current trial failed to meet criteria for an adequate dose and duration.
We also decided to include participants taking SGAs other than quetiapine at the time of randomization. Taking another SGA and having enough symptoms to qualify for the study does not preclude response to either study medication. However, since SGAs (other than quetiapine, if in the QTP+APT group) are not permitted as part of APT, participants who had been taking SGAs at baseline would need to stop taking them. If participants suddenly stopped the SGAs, they could worsen, but they also could not continue taking them for the duration of the study. To address this dilemma, we decided to allow randomized participants a 4-week wash-out phase to taper off SGAs while the study medication is titrated up to a therapeutic level. Any deviations from these procedures were tracked with a study procedure deviation form. In summary, we included participants with a wide spectrum of bipolar disorder, including those with comorbid conditions, those who had previously taken the study medications (but did not experience severe intolerance), those who had a history of medication nonadherence, and those who were taking almost any combination of psychotropic medications at study entry, in order to recruit a sample consisting of patients as close as possible to those treated in community settings.
Rationale for a single-blind trial
We considered several design options, but chose a single-blinded trial to mimic as much as possible the care of bipolar disorder in community settings, where the patient and clinician know which medications are being administered. This allows clinicians to tailor adjustments both to the randomized study medication and APT medications. The single-blind design also allows participants to be more representative of those who present to clinicians in the community and increases the generalizability of our results. Importantly, all medication adjustments were recorded as NCAs, and accounted for in our analysis plan.
Other important design features: lessons learned from Lithium Treatment–Moderate Dose Use Study
Our final decisions on the Bipolar CHOICE study design are based on the Bipolar Trials Network’s first collaborative project, the Lithium Treatment–Moderate Dose Use Study (LiTMUS) [45]. This study randomized individuals to low to moderate doses of lithium (e.g., 600 mg) plus optimized treatment (similar to APT, or guideline-informed, evidence-based, and personalized therapy based on current symptoms, prior treatment history, and course of disorder) versus optimized treatment without lithium. Given that LiTMUS and Bipolar CHOICE are similar in their aim to conduct highly generalizable, comparative effectiveness trials to inform health-care decisions by providing evidence on the effectiveness, benefits, and harms of different treatment options, we utilized some overlapping design features. These features include a randomized, single-blind rater design, intent-to-treat principles, NCAs, and strategies to minimize attrition, including permitting additional (nonrandomized) medications. The rationale for these design features is discussed in depth elsewhere [45], but they are also important features of the Bipolar CHOICE study. Important differences in Bipolar CHOICE compared to LiTMUS are more aggressive dosing of lithium (attempting to attain a dose of 900 mg per day), and the general exclusion of SGAs as additional (nonrandomized) medications.
Final study design
Study aims
Based on the gaps in the literature and key methodological decisions discussed above, the Bipolar Trials Network selected two co-primary aims for Bipolar CHOICE. The first aim is to assess the comparative benefits and harms of QTP+APT versus Li+APT. We hypothesize that participants randomized to the Li+APT group will have, on average, a more favorable overall benefit relative to harm as assessed by Clinical Global Impression-Efficacy Index scores over 6 months compared to those randomized to QTP+APT. The second aim is to assess the frequency of NCAs (medication adjustments implemented by the clinician to reduce symptoms, optimize clinical response and functioning, or to address intolerable side effects). We expect that participants randomized to the QTP+APT group will have more NCAs in medications per month over the course of 6 months compared to those randomized to Li+APT.
Additionally, we will compare the effect of QTP+APT and Li+APT on the future risk of cardiovascular disease and functional status. We hypothesize that participants randomized to the QTP+APT group will have greater increases in the Framingham General Cardiovascular Risk Score and less improvement in functional status as measured by the LIFE-RIFT over the course of 6 months compared to those randomized to Li+APT.
Final study procedures
Bipolar CHOICE is a 10-site, randomized, parallel-group, rater-blinded, open-label trial of adjunctive quetiapine or lithium in outpatients with bipolar I or II disorder (N = 482) with at least mild symptoms at study entry. Participants will receive 6 months of treatment. At the time of enrollment, participants must be experiencing mood symptoms of sufficient intensity such that a change in treatment is clinically warranted, for which, in the investigator’s judgment, lithium and quetiapine are reasonable therapeutic options. Research clinicians will diagnose participants with the electronic version of the Extended Mini-International Neuropsychiatric Interview, an extended version of a validated structured diagnostic interview to determine current and lifetime Diagnostic and Statistical Manual of Mental Disorders (4th ed.) diagnoses [56]. The full inclusion and exclusion criteria are listed in Table 1. At the screening visit, medical, psychiatric, and medication history as well as laboratory tests are obtained. Eligible participants will then be scheduled for a randomization visit within 10 days to complete the time-sensitive assessments (symptom and side effect scales) and receive their randomized study medications. We used an electronic randomization tool (RS2) to assign treatments. After receiving clearance of eligibility of the subject from the study physician, the research coordinator randomized the subject and gave the printout with the treatment assignment to the study physician. After the study visit, the printout was stored in a sealed envelope with the other source documents for that subject to ensure concealment of allocation. Furthermore, blinded raters were asked to specify on each rating form whether they had been unblinded to treatment group. In the rare case that the rater was unblinded, we had a procedure in place so that another blinded rater completed the rating at the current visit and at all visits going forward for that subject.
Clinicians follow the participants over eight visits: the first five visits occur every 2 weeks, and the remaining 3 monthly, with the last occurring at week 24, or 6 months after randomization.
Inclusion/exclusion criteria for study entry
DSM-IV: Diagnostic and Statistical Manual of Mental Disorders (4th ed.); CGI-BP-S: Clinical Global Impression–Bipolar–Overall Severity; SGA: second-generation antipsychotic; QTP: quetiapine; APT: adjunctive personalized treatment.
Titration schedule of study medications
Bipolar CHOICE clinicians will start lithium carbonate at 150 mg and titrate it up to 900 mg, or the maximally tolerated dose, by the week 2 visit. This titration schedule should be used as a guideline with flexibility to treat patients’ mood states as appropriate. A lithium blood level will be obtained at the week 2 visit. Lithium will be titrated thereafter to obtain a minimal blood level of 0.6 mEq/L, but not more than 1.2 mEq/L. If intolerable adverse effects occur, the dose will be decreased until tolerable. The minimum acceptable daily dose for participants randomized into Li+APT is 600 mg. Quetiapine will be started at 50 mg at night and titrated up to 300–600 mg for acute depression or 800 mg for acute mania, or the maximally tolerated dose, by the week 2 visit. This titration schedule should be used as a guideline with flexibility to treat patients’ mood states as clinically appropriate. The minimum acceptable daily dose for participants randomized into QTP+APT will be 100 mg. Once lithium or quetiapine therapy has been optimized, the treating psychiatrist maintains this dose and employs APT medication regimens assuming that the specific drug and dosage will have equivalent benefits with or without lithium dosages. Clinical exacerbations will be managed by other APT medications.
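As an illustration only, the dosing targets stated above can be written as simple checks. The sketch below is not part of the study protocol or its software: the function names and status labels are hypothetical, and only the numeric thresholds (a lithium level of 0.6 to 1.2 mEq/L, and minimum daily doses of 600 mg for Li+APT and 100 mg for QTP+APT) come from the text.

```python
# Thresholds taken from the titration text; names and labels are illustrative.
LITHIUM_LEVEL_MIN = 0.6  # mEq/L, minimal target blood level
LITHIUM_LEVEL_MAX = 1.2  # mEq/L, level not to be exceeded
MIN_DAILY_DOSE_MG = {"Li+APT": 600, "QTP+APT": 100}


def lithium_level_status(level_meq_per_l: float) -> str:
    """Classify a lithium blood level against the 0.6-1.2 mEq/L target range."""
    if level_meq_per_l < LITHIUM_LEVEL_MIN:
        return "below target"
    if level_meq_per_l > LITHIUM_LEVEL_MAX:
        return "above ceiling"
    return "in range"


def meets_minimum_dose(arm: str, daily_dose_mg: float) -> bool:
    """Check a daily dose against the minimum acceptable dose for the arm."""
    return daily_dose_mg >= MIN_DAILY_DOSE_MG[arm]
```

For example, `lithium_level_status(0.5)` returns `"below target"`, and `meets_minimum_dose("QTP+APT", 50)` returns `False`. Because the titration schedule is a guideline to be applied with clinical flexibility, checks of this kind describe the stated targets rather than a decision rule.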
Statistical analyses
All statistical tests will use a two-tailed alpha level of 0.05, with one exception: the analyses of treatment effects for the two primary hypotheses (described below) will each use a two-tailed alpha level of 0.025, because they are co-primary aims and a Bonferroni correction is applied (0.05/2 = 0.025). This multiplicity adjustment adheres to the recommendations of the International Conference on Harmonisation (1998) [57]. In contrast, corrections for multiple comparisons will not be used for other (more exploratory) outcomes. The sample size for the study was determined from power analyses for the primary hypotheses, which examined the sample size required to detect clinically meaningful group differences on each primary outcome. The alpha level of 0.025 is used both in hypothesis testing and in sample size determination [58]. The original sample size for the primary outcomes (the CGI-EI and the NCAs) was determined to detect an effect size of 0.30. We did not have preliminary data on the CGI-EI; based on the current data from the study, we would have 80% power to detect a between-treatment difference of 0.353 in the change in CGI-EI. For the NCAs, an effect size of 0.30 would correspond to about 0.4 fewer medication adjustments per month for Li+APT subjects than for QTP+APT subjects.
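To make the arithmetic behind these choices concrete, the following sketch computes the Bonferroni-adjusted alpha and a textbook normal-approximation sample size for a simple two-group comparison of means. It is not the power analysis the investigators performed: it ignores repeated measures, the mixed-effects model, and attrition, and the function names are hypothetical. The 80% power value is an assumption borrowed from the power figure quoted above, and the implied standard deviation is only an inference from the approximate "0.4 adjustments per month" figure.

```python
from math import ceil
from statistics import NormalDist


def bonferroni_alpha(family_alpha: float, n_tests: int) -> float:
    """Per-test alpha after a Bonferroni correction."""
    return family_alpha / n_tests


def n_per_group(effect_size: float, alpha: float, power: float) -> int:
    """Participants per group for a two-sided, two-sample comparison of means
    (normal approximation), given a standardized effect size."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


alpha_each = bonferroni_alpha(0.05, 2)            # 0.025 per co-primary aim
n_adjusted = n_per_group(0.30, alpha_each, 0.80)  # 212 per group
n_unadjusted = n_per_group(0.30, 0.05, 0.80)      # 175 per group

# If an effect size of 0.30 equals about 0.4 adjustments/month, the implied
# standard deviation is roughly 0.4 / 0.30 (an inference, not a reported value).
implied_sd_nca = 0.4 / 0.30
```

Under these simplifying assumptions, testing at 0.025 rather than 0.05 raises the requirement from 175 to 212 participants per group, which shows the cost of designating two co-primary outcomes.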
Primary hypotheses
For the first co-primary aim, mixed-effects linear regression analyses will compare the two intervention groups on the repeated assessments of the CGI-EI over the 6-month trial. The mixed-effects model will include a random intercept and a random slope over time, and fixed effects for treatment, time, and site. A treatment-by-time interaction will be used to determine whether the rate of change in benefit relative to harm (CGI-EI) differs between the two intervention groups over the course of acute treatment. The decision rule calls for rejection of the null hypothesis if the treatment-by-time interaction is statistically significant. The treatment-by-site interaction will be evaluated with a likelihood ratio test and included in the model if significant.
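One way to write the model described in this paragraph is given below. The notation is an assumption introduced here for illustration and does not appear in the study documents: Y_ij is the CGI-EI score for participant i at assessment j, Trt_i is the treatment indicator, t_ij is time since randomization, s_i is a vector of site indicators, and b_0i and b_1i are the participant-level random intercept and slope. The normality assumptions shown are the conventional ones for this class of model rather than something the text states.

```latex
Y_{ij} = \beta_0 + \beta_1\,\mathrm{Trt}_i + \beta_2\,t_{ij}
       + \beta_3\,(\mathrm{Trt}_i \times t_{ij})
       + \boldsymbol{\gamma}^{\top}\mathbf{s}_i
       + b_{0i} + b_{1i}\,t_{ij} + \varepsilon_{ij},
\qquad
(b_{0i},\, b_{1i})^{\top} \sim N(\mathbf{0}, \mathbf{G}),
\quad
\varepsilon_{ij} \sim N(0, \sigma^{2})

H_0 : \beta_3 = 0 \quad \text{tested at two-tailed } \alpha = 0.025
```

In this notation, the decision rule corresponds to the test of the treatment-by-time coefficient: a statistically significant β_3 indicates that the rate of change in CGI-EI differs between the two groups.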
For the second co-primary aim, mixed-effects linear regression analyses will examine NCAs as the dependent variable. NCAs over each 4-week period will be examined in the model. The mixed models will be analyzed using the strategies described above for the CGI-EI data. The decision rule calls for rejection of the null hypothesis if the treatment-by-time interaction is statistically significant. For the secondary outcomes, mixed-effects linear regression analyses will compare the two intervention groups on the repeated assessments of the Framingham Risk Score and the LIFE-RIFT over the 6-month trial.
Results: recruitment challenges
Bipolar CHOICE actively recruited participants from 1 December 2010 to 15 September 2012, with follow-up ending on 15 April 2013. Bipolar CHOICE reached and exceeded its target enrollment, recruiting 482 participants. We did, however, experience some recruitment challenges that warrant discussion in order to assess the feasibility of the methodological decisions described above.
Recruitment phase
An important recruitment challenge for Bipolar CHOICE was the short (2-year) recruitment period, which required each site to randomize approximately 2.5 participants per month. This challenge was, however, mitigated by the limited exclusion criteria, which allowed us to randomize 482 participants, about 101% of our total recruitment goal.
Prevalence of study medications
The primary recruitment challenge was that many potential participants were already taking lithium or quetiapine at the screen visit, or had taken one of them in the past with suboptimal outcomes. These individuals either were not interested in being randomized to either treatment or were not eligible because they had experienced intolerable side effects or a lack of response at therapeutic levels. However, when carefully assessed, patients’ reports of previous medication trials suggested that the medications had not been given at adequate doses for adequate durations [59]. It is possible that the study medications were under-dosed, which could have resulted in a lack of efficacy, or that higher doses caused side effects that were not successfully managed. Thus, we carefully assessed past exposure to the study medications at the screen visit and, after randomization, monitored side effects and allowed flexible dosing to personalize treatment to each patient’s tolerability.
Cost of treatment
Another recruitment challenge was that participants without insurance had difficulty affording their study medications, particularly quetiapine prior to availability of a generic formulation. Given that this is a comparative effectiveness trial of already established treatments for bipolar disorder, medications are not paid for by the study. Thus, similar to treatment in community clinics, the study sites are utilizing resources in their communities to assist in paying for the medications, if participants do not have medical insurance or other means to pay for these medications.
Conclusion
Despite the challenges described above, the Bipolar Trials Network has consistently recruited more than 90% of the target number of participants at each study milestone and has recruited more than the planned total sample size. It is possible that our success in recruiting participants for a trial of available treatments is due to their willingness to retry medications that may have yielded suboptimal efficacy or tolerability in the past but that may still prove useful when administered carefully and combined with APT [55]. We also included patients with bipolar disorder who were not currently receiving pharmacological treatment or who had never received lithium or quetiapine. The availability of these patients has been confirmed in national samples [60], as well as through the clinical and research experience of the Bipolar Trials Network investigators.
In summary, Bipolar CHOICE is a randomized comparative effectiveness study of QTP+APT versus Li+APT treatment for 6 months. This study design should provide clinicians and patients with better data to weigh the benefits and harms of these treatments and to determine moderators of treatment response. Because of extensive inclusion and exclusion criteria necessary for efficacy trials of treatments for bipolar disorder, effect sizes from efficacy studies could substantially differ from outcomes of treatments used in more heterogeneous community samples. Thus, there continues to be an urgent need to conduct practical treatment trials in real-world settings that compare medication regimens commonly prescribed by practitioners. Bipolar CHOICE seeks to meet this public health need and further the ultimate goal to develop a comprehensive model for the personalized treatment of bipolar disorder. Finally, the use of a nationwide network (i.e., Bipolar Trials Network) contributes to maximizing efficiency, ecological validity, and scientific clarity in clinical trials.
Footnotes
Appendix A
Acknowledgements
We would like to acknowledge the members of the Bipolar CHOICE Study Group who contributed significantly to this research study (in alphabetical order by last name): Claudia Baldassano, MD; Emily Bernstein, BS; Benjamin Brody, MD; Leah Casuto, MD; Timothy Denko, MD; Astrid Desrosiers, MD; Jamie Dupuy, MD; Rachel Fargason, MD; Keming Gao, MD; John Hawkins, MD; Chang-Gyu Hahn, MD, PhD; Masoud Kamali, MD; David Kemp, MD; Gustavo Kinrys, MD; Adrian Lagakos; Li Li, MD; Falk Lohoff, MD; Jules Martowski, BA; Cheryl McCullumsmith, MD; Stephanie McMurrich, PhD; Nicole Mori, RN; Leah Pickett, NP; Brian Pollock, RN; Marlon Quinones, MD; Karl Rickels, MD; Martha Schinagle, MD; Amy Shui, MA; Duane Spiker, MD; Peter Thompson, MD; Po Wang, MD; Curtis Wittmann, MD; Sudeep Xavier, MD; and Rusheng Zhang, MD.
Funding
This research was funded by the Agency for Healthcare Research and Quality (AHRQ): 1R01HS019371-01.
Conflict of interest
