Abstract
Assessing the utility of structured approaches to benefit-risk assessment of medicinal products is challenging, in part due to the lack of a gold standard for results and the uncertainty inherent in the data. In place of conducting formal testing, obtaining feedback from users of structured approaches provides insight into their value and limitations. The authors conducted a simulated single-session benefit-risk decision in which 3 groups applied the PhRMA BRAT (Pharmaceutical Research and Manufacturers of America Benefit-Risk Action Team) framework or the multicriteria decision analysis approach. The groups were provided with background and data for a hypothetical triptan for acute migraine in a population with cardiovascular risk factors and were asked to determine and defend an approval decision. Three insights emerged consistently from the groups: (1) the value of a structured approach to benefit-risk assessment, (2) the clarity provided by real-time visualization tools, and, most critically, (3) the importance of bringing the patient into the discussion early.
Introduction
Several structured or “framework” approaches to benefit-risk assessment in the medicines and devices industry are currently being developed and tested by academic, regulatory, and industry organizations. 1 –7 While these frameworks differ, they share a core set of principles to assist decision makers in clearly defining the decision, agreeing on the requisite properties of the treatments being considered, assessing trade-offs among these properties, and making defensible and transparent decisions. A difficulty with evaluating such frameworks is that there is generally no gold standard against which to compare results. Such evaluations in pharmaceutical benefit-risk are especially difficult, given the potentially complex background information required, the large number of outcomes that are typically available, the uncertainty in the data for these outcomes, and the varying perspectives of different stakeholders involved in the decision. Recent work by the Innovative Medicines Initiative has made great strides in comparing the use of several frameworks in extended detailed analyses (see Nixon et al, 8,9 Juhaeri et al, 10 and other reports at the Innovative Medicines Initiative PROTECT Project 7 ). Another approach for evaluating such frameworks is to assess how well the decision process worked for the decision makers in a brief decision-making scenario: Did participants feel that the relevant viewpoints were included? Did the decision makers feel comfortable with the decision? Do decision makers believe that the decision could be communicated and defended? Hence, an additional step toward evaluating the utility of these frameworks is to utilize realistic scenarios in discussion groups with different stakeholders, then obtain feedback on the process. 
This article describes the results of such a simulated decision process using a benefit-risk case study for triptans in migraine performed at the 2011 Centre for Innovation in Regulatory Science Workshop on the benefit-risk assessment of medicines.
Methods
This case study was based on one originally developed by the Pharmaceutical Research and Manufacturers of America Benefit-Risk Action Team (BRAT i ) in collaboration with RTI Health Solutions. 4
Migraine
Migraine is a recurrent headache disorder, occurring as rarely as once a year to as frequently as several times a week (2 per month is common). 11,12 Attacks last 4 to 72 hours and are believed to result from a neurogenic process associated with changes in cerebral perfusion. The typical characteristics of a migraine attack include unilateral headache with pulsating or throbbing pain of moderate or severe intensity that is aggravated by routine physical activity. Migraines can also be associated with nausea and light and sound sensitivity. Fewer than 20% of migraine sufferers experience aura (visual displays, smells, or other cues) before or during attacks.
Migraines affect approximately 12% (29.5 million) of Americans and are most prevalent between the ages of 15 and 55 years, occurring 3 times more commonly in women. 13 The disease can be disabling. The World Health Organization ranked migraine 19th among all disabling diseases worldwide 14 ; 90% of all attacks involve moderate to severe pain, and nearly 30% of patients need bed rest during attacks.
There is currently no cure for migraine, and treatment focuses on attack prevention and the cessation or mitigation of symptoms. First-line therapy is aspirin, nonsteroidal anti-inflammatory drugs (NSAIDs), acetaminophen/paracetamol, and/or caffeine; triptans are second-line therapy for mild to moderate migraine and first-line therapy for moderate to severe migraine.
Triptans
Triptans have demonstrated efficacy clearly superior to placebo in numerous randomized clinical trials. They act on serotonin receptors in nerve endings and blood vessels, resulting in the constriction of cranial blood vessels and the inhibition of proinflammatory neuropeptide release. There are currently 7 triptans approved for the acute treatment of migraine in adults.
Because of their vasoconstricting effects, triptans are contraindicated in patients with ischemic heart disease, coronary artery vasospasm, and other cardiovascular (CV) conditions. However, the actual triptan-related CV risk is unclear, and results from observational studies suggest that the baseline CV risk in migraineurs is greater than that of the general population. 15,16 Furthermore, no clear relationship between triptans and CV events has been observed, in part because few patients with CV risks have been studied in triptan randomized clinical trials. It remains an open question whether patients with CV risks can be safely prescribed triptans.
Scenario
A hypothetical triptan was previously approved for the acute treatment of migraine attacks in adults. As for all triptans, there was a recommendation against its use in patients with CV risk factors. However, in this scenario, a new randomized active-control study compared 2 doses of the triptan to a single dose of an NSAID for the acute treatment of migraine in adults with 1 or 2 CV risk factors. In this 1-year hypothetical trial, 1000 patients aged 18 to 65 years were enrolled in each of 3 treatment arms: low-dose triptan, high-dose triptan, and NSAID. Inclusion criteria were as follows: (1) 1 to 6 migraine headaches per month with or without aura, (2) migraines for at least 1 year, (3) migraine onset before age 50, and (4) 1 or 2 CV risk factors, including hypertension, hypercholesterolemia, smoking, obesity, diabetes, family history of coronary artery disease, being a female with surgical or physiologic menopause, or being a male older than 40 years. Patients with significant underlying CV diseases were excluded.
Data for this exercise were based on the BRAT example, 4 supplemented with data from marketed triptans, NSAIDs, and meta-analyses. An initial set of clinical endpoints for the scenario was provided (value tree in Figure 1), although part of the exercise involved potentially modifying the value tree. Endpoint definitions were similar to those in most triptan studies (Table 1).

Figure 1. Initial value tree for triptans case study. CNS, central nervous system.
Table 1. Outcome definitions for case study.
Clinical trial data provided to the syndicate discussion groups included the proportion for each outcome for each treatment arm and the risk difference (difference in proportion) for each outcome for each pair of treatments. All supplied data included 95% confidence intervals and were available in tabular (Table 2) and forest plot format (Figure 2).
Table 2. Effects table for case study (in percentages).
Values are expressed as mean (95% confidence interval). CNS, central nervous system; NSAID, nonsteroidal anti-inflammatory drug.
aNo. per 1000 patient-years.

Figure 2. Rate difference forest plot of high-dose triptan versus nonsteroidal anti-inflammatory drugs (NSAIDs). Diamonds show point estimates. Bars show 95% confidence intervals, orange for efficacy endpoints and blue for safety endpoints. Labels for the safety endpoints are reversed so that negative rate differences all favor the NSAID and positive values all favor the high-dose triptan. Differences are shown per 1000 patients to better enable comparing number of events caused and prevented. AE, adverse event; CNS, central nervous system.
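The risk differences and 95% confidence intervals supplied to the syndicates can be illustrated with a minimal sketch. The interval method used for the case study is not stated, so the Wald (normal approximation) interval below is an assumption, and the event counts are hypothetical rather than taken from the effects table; only the arm size of 1000 comes from the scenario.

```python
import math


def risk_difference(events_a, n_a, events_b, n_b, z=1.96):
    """Return (difference, lower, upper) for p_a - p_b.

    Uses a Wald interval, which is an assumption; the case study
    does not say how its confidence intervals were computed.
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    diff = p_a - p_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, diff - z * se, diff + z * se


# Hypothetical counts for one endpoint in two arms of 1000 patients each.
diff, low, high = risk_difference(600, 1000, 500, 1000)

# Express per 1000 patients, as in the forest plot.
print(f"{diff * 1000:.0f} per 1000 "
      f"(95% CI {low * 1000:.0f} to {high * 1000:.0f})")
```

Reporting the difference per 1000 patients, as the forest plot does, puts events caused and events prevented on a scale that can be read as counts.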
Exercise
The exercise was designed to enable personnel from the pharmaceutical industry, regulatory authorities, and academia to engage in different structured approaches for benefit-risk assessment and provide an approval decision. Three syndicate groups (about 18 members each) independently assessed the case study. Each syndicate had a chair to manage the meeting, a facilitator to lead the assessment, and a rapporteur to report on the process and results. Syndicates A and B used the BRAT framework approach, 3,4 followed by an optional informal point allocation weighting exercise. Syndicate A included mostly members of the industry, while syndicate B included mostly members from regulatory authorities. Syndicates A and B were provided with the BRAT software for value tree and risk difference forest plot generation, adapted to perform point allocation weighting. Syndicate C used multicriteria decision analysis (MCDA), implemented with Hiview software. 17 The participants in this syndicate were a mix of members from industry and health authorities. All 3 syndicates had 4 hours for deliberation and were asked to provide and defend a decision on approving the low-dose triptan only, the high-dose only, both, or neither.
Results
Syndicate Session A: BRAT Framework Approach
Syndicate A made its decision by (1) reassessing the endpoints in the value tree according to patient input on the relevance of each endpoint for the decision, (2) assessing the data for these endpoints, (3) performing a weighting exercise for a rough estimate of the relative clinical importance of each endpoint, and (4) reviewing the data in the context of these rough weights.
Selecting Endpoints
The discussions in this syndicate demonstrated considerable benefit from having “patients” (attendees with migraine) participate. The initial benefits and risks provided were modified according to real-world migraine experience. “Pain-free response” was removed because migraineurs noted that pain-free status is unrealistic, given the background state of pain in migraine. “Reduction in sensitivity to light and sound” was regarded as intrinsically part of “headache relief” and so removed as a separate endpoint. “Sustained response” was retained because of its importance in predicting pending disability and in lessening emergency room visits. “Transient triptan sensation” was removed because of its relatively minor nature and its similarity to “central nervous system adverse events.” The team reported that real-time editing of the value tree was very helpful in noting the need for and tracking these changes. The team also found it important to document the rationale behind these decisions, and it presented these rationales as part of the defense of its decision.
Weighting (Relative Importance)
Relative weights were assigned via a point allocation exercise. One myocardial infarction (MI) per 1000 patient-years was designated as the most serious outcome and so assigned a weight of 100. An absolute change of 1% in the rate of other outcomes (eg, from 30% to 29%) was assigned a relative weight between 0 (irrelevant for the decision) and 100 (of clinical importance equivalent to 1 additional MI per 1000 patient-years). Software normalized and displayed the weights so that they summed to 100% (Figure 3). This exercise proceeded quickly, and the group was comfortable with the methodology. The group noted the following: (1) first conducting the weighting exercise without MI and then reassessing the allocations with MI enabled better initial consideration of the distinctions between the endpoints with far less impact than MI; (2) real-time update and display of the normalized weights were very useful for checking whether the point allocations made sense when considered collectively.

Figure 3. Relative weights for benefits and harms as assessed by syndicate A, using a point allocation technique. Weights reflect the relative clinical impact for 1 myocardial infarction per 1000 patient-years and of a 1% absolute change in all other outcomes. AE, adverse event; CNS, central nervous system.
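The normalization step in the point allocation exercise can be sketched as follows. This is an illustration only, not the BRAT software: the function name and the point values for the non-MI endpoints are hypothetical, and only the anchor of 100 points for MI comes from the exercise described above.

```python
def normalize_points(points):
    """Rescale raw point allocations so the weights sum to 100%."""
    total = sum(points.values())
    if total <= 0:
        raise ValueError("at least one endpoint needs a positive allocation")
    return {name: 100.0 * p / total for name, p in points.items()}


# Hypothetical allocations. MI is the anchor at 100 points; each other
# value is the judged importance of a 1% absolute change in that
# endpoint relative to 1 MI per 1000 patient-years.
points = {
    "myocardial infarction": 100,
    "functional disability": 20,
    "headache relief": 15,
    "nausea": 5,
}

weights = normalize_points(points)
for name, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{name:25s} {w:5.1f}%")
```

Recomputing the normalized weights after each change to a point allocation is what allows a group to check in real time whether the allocations still make sense collectively.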
Integrating Weights and Rates
The software was used to create a forest plot depicting rate differences for 1 triptan dose versus NSAID in order of decreasing weight. One challenge was that the MI outcome was measured in patient-years, while other outcomes were measured as proportions. For rate differences to be on the same scale, the typical rate of 2 migraines per month was used to convert proportions into number of events per year. The group found this display valuable for integrating its assessments of clinical importance with the clinical data.
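One plausible reading of this conversion is sketched below. The exact arithmetic used by the syndicate is not given, so the sketch assumes that each proportion applies per migraine attack and that attacks are treated independently at a constant 24 per year; the function name and the example value are hypothetical.

```python
ATTACKS_PER_YEAR = 2 * 12  # typical rate of 2 migraines per month


def per_attack_to_per_1000_patient_years(proportion_difference,
                                         attacks_per_year=ATTACKS_PER_YEAR):
    """Convert a per-attack proportion difference to events per 1000 patient-years.

    Assumes every attack is an independent opportunity for the outcome,
    which is a simplification made for illustration.
    """
    return proportion_difference * attacks_per_year * 1000


# Hypothetical example: a 1% absolute difference per attack.
events = per_attack_to_per_1000_patient_years(0.01)
print(f"{events:.0f} events per 1000 patient-years")
```

Under these assumptions, even a small per-attack difference in a frequent outcome corresponds to many events per 1000 patient-years, which is why a common scale matters when such outcomes are displayed next to a rare event such as MI.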
Conclusions
The syndicate recommended that both triptan doses be approved for patients who do not respond to NSAIDs, due to high unmet medical need. Although MI is a serious outcome, the increase in MI for the low-dose triptan compared to NSAIDs was considered to be very low in comparison to the large gains in headache relief and function. The other harms were considered nuisances rather than serious safety concerns; however, a robust risk management plan that included distinguishing real CV events from chest-related adverse events was recommended. The high-dose triptan was also approved, as the incremental benefits and harms from low- to high-dose triptan were viewed as essentially the same as those from the NSAID to the low-dose triptan.
Syndicate A developed a list of recommendations for the conduct of similar assessments: Patient experience should be included early in development. Patient advocate representation in benefit-risk discussion can be very valuable, particularly on therapies for rare diseases. Strong facilitation and the elicitation of differing views are critical to obtaining consensus. Assumptions are made at multiple points in a benefit-risk assessment; documenting these assumptions and their impact on decision making is important. Real-time tabular and graphical display of the data is extremely valuable. Weighting endpoints, even with informal techniques, can greatly influence value judgments and should be considered in benefit-risk assessments.
Syndicate Session B: BRAT Framework Approach
Definition and Focus
Syndicate B focused on the assessment’s decision context and found it essential to more clearly articulate the target population’s needs. Patient input was also more vital than expected by most group members, and participants repeatedly turned to migraineurs in the group to inform their thinking. This information altered the decision frame, as patients were interested in those benefits not usually used as primary outcomes in migraine studies, and they were consistently willing to take risks considered too high by the sponsors or regulators. Migraineur input was especially used for endpoints with a strong subjective component, such as pain, nausea, and paraesthesia.
One example of the importance of patient input came from several participants noting that they expected the entire discussion to take under a half hour, due to the seriousness of having more MIs under the triptan. But as the structured discussion developed, the group came to understand how seriously most patients viewed migraine symptoms and their consequences, and the discussion took the full 4 hours allotted.
Visualization Tools
The group noted that tools such as value trees, 18,19 tables of key benefits and harms (key benefit-risk summary tables or effects tables 2 –4 ), and forest plots 20 helped structure the discussion by providing a focus on critical issues and identifying knowledge gaps. The group noted that these tools, which have been used at FDA Advisory Committees, 21 –23 can also help to identify overlapping benefits and harms, provide a succinct summary of key information, and facilitate sensitivity analyses. The value tree, in particular, facilitated comprehension and communication, as it worked within the 5- to 9-item cognitive limit of most reviewers. 24 Syndicate members cautioned, however, that visualization tools require user training for optimal use and interpretation and would benefit from ongoing refinement.
Weighting
Syndicate B noted that weighting of the attributes is an inescapable aspect of benefit-risk decision making and occurs implicitly, even when no formal weighting process is used. Although the group was comfortable using the simple and accessible point allocation approach, it stressed that more development work is required to construct a weighting approach widely acceptable to health authorities and industry. The group’s weights are unavailable, as they were not recorded during the exercise.
Conclusions
Syndicate B recommended approval of the low-dose triptan only, primarily due to high medical need in the migraine population with 1 or 2 CV risk factors. The primary driver for the decision was how seriously patients in the group regarded migraine symptoms, despite the initial views of several syndicate members that migraine was not that serious a condition.
Syndicate B’s recommendations were as follows: Redouble efforts to get patient input early in the framing process, especially when benefits and harms are subjective or have multiple alternative definitions. Clearly outline the research question and explicitly define the treatment population. Regulators should use a value tree to obtain early agreement with sponsors regarding key benefits and risks. Support the use of an effects table of key benefits/risks and forest plots or other displays to summarize the benefit-risk assessment in regulatory submissions. Continue to refine visualization tools and develop harmonized guidance for their use.
Syndicate Session C: MCDA Approach
Syndicate C used MCDA, a popular approach for general decision making that has been advocated for use in pharmaceutical benefit-risk assessment. 2,18,19,25,26 The session facilitator had considerable experience with MCDA.
Value Tree
Syndicate C considered reviewing the value tree to be the most important step. Several favorable effects ii were considered to be interrelated; for example, “pain-free response” was considered a subset of “headache relief.” The group noted that although some effects might be statistically correlated, they were generally preferentially independent, meaning that the preference values of one effect are unaffected by changes in preference values of the other effects. This preferential independence led to the group’s retaining all the benefits in the value tree despite their statistical correlation.
Of the 4 unfavorable effects, 3 were tolerability issues, while only 1 effect, MI, was considered to be a serious adverse event. To reflect this fact, MI was split out as a separate undesirable event in syndicate C’s value tree (Figure 4).

Figure 4. Value tree developed by syndicate C with assessed swing weights. The hierarchical weighting process began by comparing swing weights of the criteria emanating from the “pain” node, with the largest swings (judged to be identical for “rapid onset” and “sustained response”) set at 100 and the other two judged relative to 100. A similar process was applied to the “other” and “AEs” nodes. Swings on the criteria associated with the boxes within a second box were compared to yield weights on the next level. Finally, the largest swing weight under “favorable effects,” “functional disability,” was compared to the largest swing under “unfavorable effects,” “myocardial infarction,” to obtain the highest level weights. AE, adverse event.
Weighting and Scoring
This group used a quantitative hierarchical approach to weighting. For each effect, the group identified a range that encompassed the observed data and the potential changes in the data. For example, the range for “rapid onset” was 0% to 70%, and the range for “MI” was 8 to 16 events per 1000 person-years. iii These ranges enabled the data to be converted to preference values, with the worst and best measurements assigned preference values of 0 and 100, respectively. Since the differences between best and worst were not all of equal clinical relevance, the scales were weighted to equate the units of preference value. The key question asked by the facilitator was “How big is the measured difference between best and worst, and how clinically relevant is that difference?” from a patient’s perspective, with several migraineurs in the group providing input as patients. Figure 4 shows the assessed weights and outlines the weighting process.
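The conversion from a measured value to a preference value can be sketched as a linear rescaling between the worst and best ends of each range. The ranges below are those quoted above; the linear form and the function name are assumptions made for illustration, and the measured values passed in are hypothetical.

```python
def preference_value(x, worst, best):
    """Map a measurement to a 0-100 preference value.

    The worst end of the range maps to 0 and the best end to 100.
    A linear value function is assumed here.
    """
    if worst == best:
        raise ValueError("worst and best must differ")
    return 100.0 * (x - worst) / (best - worst)


# Rapid onset: higher is better, range 0% to 70%.
print(preference_value(35.0, worst=0.0, best=70.0))

# MI: lower is better, range 8 to 16 events per 1000 person-years,
# so the worst end is 16 and the best end is 8.
print(preference_value(12.0, worst=16.0, best=8.0))
```

Passing the range as an explicit worst and best pair lets the same function handle favorable effects, where higher is better, and unfavorable effects, where lower is better.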

Figure 5. Cumulative weights for favorable and unfavorable effects calculated from the hierarchical weights assessed by syndicate C. The cumulative weights are the product of the individual weights along the branches in Figure 4, each divided by the best-to-worst measurement range used in the multicriteria decision analysis valuing process for that endpoint. The weights are then rescaled to sum to 100%. Weights reflect the relative clinical impact for 1 myocardial infarction per 1000 patient-years and of a 1% absolute change in all other outcomes. AE, adverse event; CNS, central nervous system.
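The calculation described in this caption can be sketched as follows. The branch weights are hypothetical and the tree is reduced to two endpoints for brevity; only the ranges (70 percentage points for rapid onset, 8 events per 1000 person-years for MI) come from the ranges reported for syndicate C.

```python
import math


def cumulative_weights(branch_weights, ranges):
    """Per-unit cumulative weights, rescaled to sum to 100%.

    branch_weights: endpoint -> list of weights from the top of the
                    value tree down to the endpoint.
    ranges:         endpoint -> size of the best-to-worst range, in the
                    endpoint's own units.
    """
    raw = {
        name: math.prod(path) / ranges[name]
        for name, path in branch_weights.items()
    }
    total = sum(raw.values())
    return {name: 100.0 * value / total for name, value in raw.items()}


# Hypothetical branch weights for a two-endpoint tree.
branch_weights = {
    "rapid onset": [0.6, 1.0],
    "myocardial infarction": [0.4, 1.0],
}
ranges = {"rapid onset": 70.0, "myocardial infarction": 8.0}

for name, w in cumulative_weights(branch_weights, ranges).items():
    print(f"{name:25s} {w:5.1f}%")
```

Dividing by the range turns a weight on the whole best-to-worst swing into a weight per unit of change, which is what makes the MCDA weights comparable with point allocations expressed per 1% absolute change or per 1 MI per 1000 patient-years.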
The Hiview software used the weights and clinical results to compare the treatments, assuming a linear increase in preference value as a function of effect size for all effects. Both triptan doses were superior to NSAIDs and placebo (overall weighted preference values of 60 for the 30-mg dose, 51 for the 15-mg dose, 46 for NSAID, and 33 for placebo). Sensitivity analyses on the stability of the decisions to changes in the weights and selected rates showed that small changes in the weights did not substantially change the outcome. The session’s rapporteur noted, “We found out that the MI rate for the high-dose group had to increase from 16 … up to 45 per thousand person-years before the high-dose triptan was viewed as less beneficial than the NSAID. So this was an ‘a-ha’ moment … as it allowed us to pressure test the ideas, and I think any discomfort we might have around the weighting exercise … was somewhat reassured based on the sensitivity analysis.”
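The weighted comparison and the break-even style of sensitivity analysis can be sketched in a few lines. This is not the Hiview implementation: the two-criterion model, the weights, and the preference values are hypothetical, and the numbers are chosen only to show the mechanics, not to reproduce the syndicate's results.

```python
def linear_value(x, worst, best):
    """Linear preference value: worst -> 0, best -> 100 (extrapolates beyond)."""
    return 100.0 * (x - worst) / (best - worst)


def overall_score(weights, values):
    """Weighted average of preference values across criteria."""
    total = sum(weights.values())
    return sum(weights[k] * values[k] for k in weights) / total


def break_even_mi_rate(weights, values, reference_score,
                       mi_worst, mi_best, start, stop, step=1):
    """First scanned MI rate at which the score falls below reference_score."""
    rate = start
    while rate <= stop:
        trial = dict(values, mi=linear_value(rate, mi_worst, mi_best))
        if overall_score(weights, trial) < reference_score:
            return rate
        rate += step
    return None


# Hypothetical two-criterion model: one pooled benefit and MI.
weights = {"benefit": 60, "mi": 40}
nsaid = {"benefit": 40.0, "mi": linear_value(8, 16, 8)}
triptan = {"benefit": 80.0, "mi": linear_value(8, 16, 8)}

nsaid_score = overall_score(weights, nsaid)
threshold = break_even_mi_rate(weights, triptan, nsaid_score,
                               mi_worst=16, mi_best=8, start=8, stop=60)
print(nsaid_score, threshold)
```

Scanning a single input until the ranking flips is a simple way to see how far the data would need to move before the decision changes, which is the kind of reassurance the rapporteur describes.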
Conclusions
Based on weighted preference values, both triptan doses were approved. This group regarded the MCDA exercise as allowing a complex problem to be reduced to a series of smaller, more manageable issues in a logical progression. The sensitivity analysis provided reassurance regarding the results.
The group also felt that there were several areas of the exercise that could be improved. The limited time for the exercise (4 hours) was constraining, particularly for weighting. In addition, the syndicate was faced with several uncertainties—for example, whether to assume the perspective of regulators, patients, or physicians in their decision making; how to incorporate uncertainties in the data; and how to define unmet need for a new therapy. Finally, there was considerable discussion regarding whether preference values should change linearly with the size of a treatment effect, and this could have been explored in a lengthier discussion.
Syndicate C’s recommendations were as follows: MCDA, with the inclusion of weighting based on patients’ feedback, could be included in the European public assessment reports for selected assessments, although prescribing physicians will require education for interpretation. MCDA results could be simplified for patients by providing visual displays of the trade-off showing the number of migraines improved and the number of MIs that would occur with triptan compared with NSAIDs.
While not the goal of this syndicate, to enable comparing the weights assessed in syndicates A and C, the MCDA swing weights were combined to obtain their cumulative impact through all levels of the hierarchy and then rescaled to reflect the relative contributions of 1 MI per 1000 patient-years, or an absolute rate change of 1% of the other outcomes. In syndicates A and C, 1 MI per 1000 patient-years clearly stood out as the most important effect, with 1% absolute changes in functional disability and rapid onset being 15% to 20% as impactful (Figures 3 and 5). Although the linear correlation coefficient across the 8 effects common to these 2 sets of weights is 0.96, it falls to 0.53 when MI is excluded, suggesting the need for further research to establish reliable methods of weighting.
Discussion
We assessed the feedback from 3 groups applying 2 different approaches for a single-session benefit-risk assessment case study. Three key observations were common to the groups: the value of a structured approach, the value of visualization tools, and the importance of incorporating patient viewpoints. A growing body of literature reviews the critical role of structured approaches and visualization tools in benefit-risk assessment, and these topics will not be further discussed. 3,4,7 –10,18,26 –30 By far the most frequent and important observation was the significant role that patient input played in the decision-making process.
Regulatory benefit-risk assessment has chiefly relied on the viewpoints of physicians and regulators. However, a number of regulatory initiatives are starting to change this approach and are beginning to include patient viewpoints in characterization of unmet medical need, selection of relevant endpoints, design of patient-reported outcome instruments, and qualitative assessments of benefit-risk tradeoffs. 6,31 –33 This exercise demonstrated the value that the patient perspective can provide in these areas.
Patient viewpoints on the impact of migraine symptoms on daily life were crucial in determining which endpoints to include and their relative importance. For example, several syndicate members entered their sessions with the view that the increase in MIs would outweigh any migraine benefit and that the decision would be made against the triptan quickly. However, patient descriptions of the impact of migraine pain and functional impairment changed these viewpoints.
Patient input was chiefly responsible for the changes to the value tree. A particularly telling quote came from a regulator in syndicate B who commented on syndicate A’s removing the pain-free response endpoint: “I was really struck that you threw out the parameter that we focused the most on. We thought that if you were going to have the risk of a heart attack, you should really get rid of your migraine, period.” As noted above, syndicate A’s patients regarded pain-free status as unrealistic for migraineurs and discounted the endpoint. The initial value tree was based on the endpoints in a typical migraine randomized clinical trial. These patient-recommended changes are examples of patient-based insights of potential importance for future trials, a point stressed in the FDA’s “Patient-Focused Drug Development” meetings and a recent European Medicines Agency workshop. 6,31 –33
These observations also reflect a key observation from the decision analysis literature: the importance of using structured group discussions in identifying all the key attributes relevant to a decision. 18,26,28,29 Not incorporating patients or their proxies in designing an assessment runs the risk of missing important endpoints or using ones that do not relate to the issues of importance to patients. The extensive sections on disease symptoms and daily impacts that matter most to patients in the reports from the FDA’s “Patient-Focused Drug Development” meetings also reflect this point. 31,32
All syndicates recommended that patient input be brought early into framing which benefits and harms should be assessed and how they should be measured. In clinical development, this would typically be in the design of phase 2 studies, enabling an opportunity to test the use of the endpoint in preparation for the pivotal phase 3 trials.
Several members noted that patient input was even more imperative when the outcomes are subjective (eg, pain) or for rare diseases where there is little experience for the endpoint among the decision makers. Of note, patients were not intentionally included in this exercise. Migraine prevalence is high enough that each group happened to have 3 or 4 members with migraine experience, a fortunate circumstance that led to these insights. A general policy to include patients in discussions between sponsors and regulatory agencies and, potentially, the use of formal surveys of patients could lessen the chance of missing important patient input.
This exercise is in no way a substitute for formal testing. More formal testing, as conducted in the Innovative Medicines Initiative PROTECT project, 7 –10 is a critical adjunct to the growing body of work in structured benefit-risk assessment. However, the near-uniform agreement on the importance of bringing the patient into the assessment early, the value of a structured approach, and the benefits of real-time visualization tools were clear messages that add to the growing insight into conducting benefit-risk assessments.
Conclusions
Structured approaches to benefit-risk assessment have grown increasingly important in regulatory and postapproval decisions. This work was designed to obtain feedback from 3 groups using 2 structured approaches in single-session decision-making scenarios on a benefit-risk case study of triptans for migraine. The groups provided evidence of the value of a structured approach to benefit-risk, the clarity of using real-time visualization tools, and, most critically, the importance of bringing patients into the discussion early.
Footnotes
Acknowledgments
The authors thank Pat Connelly for providing a preliminary version of the case study report.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
