From Moving Beyond Compliance to Quality to Moving Beyond Quality to Effectiveness: Realities and Challenges

Abstract

In 2007, Taylor proposed to move beyond compliance to develop measures for assessing the ethical quality of institutional review board (IRB) reviews. To date, no such tool has been developed. In 2018, Lynch et al. proposed to move beyond quality to advance effective research ethics oversight. Instead of providing a set of measures, they proposed to define and specify ways to measure the effectiveness of IRBs in protecting human subjects. They further claimed that any attempts to measure the quality and performance of IRBs without using such measures, to be developed by them in the unforeseeable future, were not helpful. The realities are that nearly 50 years after its establishment, there has been no systematic assessment of the quality and performance of IRBs, and that nearly two decades after the deaths of Jesse Gelsinger and Ellen Roche and the implementation of reform processes to improve our system of protecting human subjects, there is still plenty of room for improvement. The challenges today are for Taylor to come up with a tool to measure the ethical quality of IRB reviews, for Lynch et al. to develop measures for assessing the effectiveness of IRBs in protecting human subjects, and for the IRB community to decide whether to continue to wait for Taylor and Lynch et al. to develop their promised tools or to start taking advantage of performance measurements to improve the quality and performance of IRBs by using existing IRB performance metrics or to improve those existing performance metrics.

Keywords

IRB quality, IRB performance, performance measurement, human subject protections, quality of IRB reviews

In this issue of the Journal, three articles commented on my proposed systematic assessments of the quality and performance of institutional review boards (IRBs; research ethics committees [RECs] and ethics review boards [ERBs] are hereafter also referred to as IRBs) based on how well IRBs have done with respect to what they are supposed to do (Tsan, this issue). Cleaton-Jones (this issue) shared his rich personal experience with RECs in South Africa, for which I am truly grateful. Grady (this issue, pp. 197-199) suggested that

one way that ethics review contributes to protecting the rights and welfare of research participants is by providing a check on investigator enthusiasm and possible conflicts through careful and independent review of the promise and significance of the proposed research and research progress with particular attention to whether and how it justifies possible harm and burden to participants and engages participants in the process.

I unreservedly agree with this comment and believe that this independent ethics review of research is a major part of what IRBs are supposed to do. Indeed, the Institute of Medicine (2002) Committee on Assessing the System for Protecting Human Research Participants suggested almost two decades ago that as the principal representative of the interests of potential research participants, IRBs should focus their full committee deliberations and oversight primarily on the ethical aspects of protection issues, and that IRBs should be renamed research ERBs.

However, despite my argument to the contrary, Lynch, Nicholls, Meyer, Taylor, and Consortium to Advance Effective Research Ethics Oversight (2019) continue to insist that the effectiveness of IRBs in protecting human subjects is an essential part of the quality and performance of IRBs. They further claimed that any attempt to measure IRB quality and performance without measuring how effectively IRBs protect human subjects was not helpful. This represents a clear misunderstanding of the role of IRBs in protecting human subjects participating in research and it requires further deliberation.

Compliance versus Quality

In a 2007 paper titled “Moving beyond compliance: Measuring ethical quality to enhance the oversight of human subjects research,” Taylor proposed to develop a valid, reliable, and robust measure to assess the quality, not just the compliance, of IRB reviews (Taylor, 2007). Who would question the importance and value of measuring the ethical quality of IRB reviews? Unfortunately, the ethical quality of IRB reviews is a complex, abstract idea that is neither quantifiable nor measurable (Tsan, 2019). Thus, more than a decade later, no such tool has been developed. At the same time, the IRB community has been waiting for such a tool to start measuring the quality and performance of IRBs (Gearhart, 2018).

Despite Taylor’s characterization of compliance as something less important, compliance matters. Consider the cases of Jesse Gelsinger and Ellen Roche. These two cases redefined the recent history of human subject protections in the United States and provided much of the impetus for the reform processes that have been initiated since the early 2000s (Institute of Medicine, 2002; Steinbrook, 2002a, 2002b, 2008).

Case 1: Jesse Gelsinger was an 18-year-old man with a mild form of inborn error of urea synthesis due to partial ornithine transcarbamylase (OTC) deficiency. He volunteered to participate in a Phase 1 clinical trial titled “Recombinant Adenovirus Gene Transfer in Adults with Partial Ornithine Transcarbamylase Deficiency” at the University of Pennsylvania School of Medicine that was funded by the National Institutes of Health (NIH) and the Genovo Company.

On September 13, 1999, Jesse received the highest study dose of the adenovirus-derived vector containing a functional OTC gene (6 × 10¹¹ particles/kg) through direct infusion into his right hepatic artery. Approximately 18 hours later, Jesse developed jaundice and altered mental status. He died 4 days after the infusion, on September 17, 1999, due to a fulminant immune reaction to the adenoviral vector, including the systemic inflammatory response syndrome, disseminated intravascular coagulation, multiple organ system failure, and acute respiratory distress syndrome.

Extensive reviews (Borror, 2001; Steinbrook, 2008) of the study revealed significant financial conflicts of interest by the investigator and the University of Pennsylvania as well as egregious noncompliance by the investigator, which included, but were not limited to

Failure to follow the protocol,

Failure to notify the Food and Drug Administration (FDA) immediately of a Grade III liver toxicity that would necessitate suspension of the protocol,

Failure to promptly notify FDA of results of animal studies that suggested a significant risk of the adenoviral vector for human subjects,

Making multiple changes to the protocol without notifying FDA and failure to make agreed upon changes to the protocol,

Failure to adequately inform potential subjects of all relevant safety information during the informed consent processes.

It is clear that Jesse Gelsinger should not have died, because the results of his liver function test did not meet the minimal level required for inclusion in the study on the day he received the infusion. In addition, the study should have been suspended as specified in the protocol, because one of the study volunteers who received a smaller dose of the adenoviral vector earlier developed Grade III liver toxicity (Steinbrook, 2008).

Case 2: Ellen Roche was a 24-year-old technician at the Johns Hopkins Asthma and Allergy Center. She participated as a healthy volunteer in an NIH-funded study titled “Mechanisms of Deep Inspiration–Induced Airway Relaxation,” conducted at the Asthma and Allergy Center. Under the protocol, healthy subjects were to inhale hexamethonium, a ganglionic blocker. Hexamethonium was once used to treat hypertension but was removed from the U.S. market in 1972 after the FDA found it to be ineffective. It had never been administered to the lung of a human subject through inhalation before.

On May 4, 2001, Ellen received approximately 1 g of hexamethonium by inhalation. On the next day, she developed a dry cough. She was hospitalized on May 9 with fever, hypoxemia, and abnormal chest X-ray and died on June 2 due to progressive hypotension and multiple organ system failure. Autopsy results revealed diffuse alveolar damage.

Extensive reviews (McNeilly & Carome, 2001; Steinbrook, 2002b) of the study revealed egregious noncompliance by the investigator and the IRB, which included, but were not limited to

Failure to recognize the pulmonary toxicity of hexamethonium, which had been previously reported in the literature,

Failure to submit an investigational new drug (IND) application to FDA prior to the study,

Failure of the IRB to require the investigator to obtain FDA approval of an IND,

Failure to report to the IRB promptly the pulmonary adverse event that developed in the first volunteer after receiving hexamethonium,

Failure to adequately inform potential subjects of all relevant safety information during the informed consent processes,

Failure to inform potential subjects that administration of hexamethonium by inhalation was an experimental use of the drug,

Failure to inform subjects that the hexamethonium bromide used in the study was obtained from a chemical company and was labeled, “For laboratory use only, not for drug, household, or other uses.”

These two cases illustrate not only the importance of compliance but also the limitations of IRB oversight alone in protecting human subjects. In addition to IRBs, institutions, sponsors of research, research participants, the government, and particularly the investigators share responsibilities in protecting human subjects (Anderson, Sawatzky-Girling, McDonald, & Willison, 2011; Institute of Medicine, 2001, 2002). This may explain why there were so few studies in the empirical research literature demonstrating the effectiveness of IRBs in protecting human subjects (Abbott & Grady, 2011) and why using the effectiveness of human subject protections as a measure to evaluate the quality and performance of IRBs not only is inappropriate, but also does an injustice to the IRBs (Tsan, 2019).

Quality versus Effectiveness

Although Taylor’s proposed development of measures to assess the quality of IRB ethics reviews has not been fulfilled, Lynch et al. (2019), a group that included Taylor, proposed to move beyond the quality of IRB ethics reviews to advance effective research ethics oversight. They proposed to

define and specify ways to measure relevant outcomes for research ethics oversight, empirically evaluate whether those outcomes are achieved, test new approaches to achieving them, and ultimately, develop and implement empirically-based policy and practice to advance IRB and HRPP effectiveness. (p. 1)

In other words, Lynch et al. argued that the first step was to define and develop how to measure the effectiveness of IRBs in protecting human subjects and claimed that without measuring human subject protections, it is not helpful to attempt to measure the quality and performance of IRBs, including the steps that I have proposed to measure how well IRBs have done with respect to what IRBs are supposed to do (Tsan, 2019).

Defining and developing how to measure human subject protections and empirically evaluating whether human subject protections have been achieved are commendable goals. However, demanding the measurement of human subject protection as a requirement for assessing the quality and performance of IRBs suggests a lack of understanding of the limitations of IRBs in protecting human subjects. As Grady pointed out, IRBs contribute to protecting human subjects through independent ethics reviews to “provide a check to balance the enthusiasm and potential conflicts of investigators, who in their pursuit of scientific knowledge and recognition could overestimate the promise of their research and misjudge potential harms to participants” (Grady, 2019, p. 198).

In addition, since the deaths of Jesse Gelsinger and Ellen Roche, and the implementation of reform initiatives to improve our system of protecting human subjects in the early 2000s (Institute of Medicine, 2002; Steinbrook, 2002a), human subject protection communities have been trying to develop ways to measure human subject protections without success (Tsan, 2017). This is at least in part because human subject protection, like the quality of IRB ethics reviews, is a complex, abstract idea that is difficult, if not impossible, to quantify and measure (Tsan, 2017, 2019). Lynch et al. promised to define and specify ways to measure human subject protections even before they had a chance to demonstrate feasibility, much like Taylor’s proposed development of measures to assess the quality of IRB reviews more than a decade ago. It is likely that decades later, the proposed development of metrics to measure human subject protections by Lynch et al. will remain unfulfilled.

Realities and Challenges

The substantial investments made since the deaths of Jesse Gelsinger and Ellen Roche to improve the system for protecting human subjects have undoubtedly resulted in some progress. For example, increasing numbers of institutional human research protection programs worldwide have been accredited by the Association for the Accreditation of Human Research Protection Programs, Inc. (2018). Unfortunately, there are scant data, if any, showing that these reforms have made human research any safer. This is in large part because we do not know how to measure human subject protection directly. Indeed, there has been no systematic monitoring of the performance of our current system of protecting human research subjects (Tsan, 2017), and as a consequence, the following stark concerns remain:

Healthy volunteers continued to die or suffer from life-threatening illnesses requiring intensive care in the hospitals from participation in Phase 1 clinical trials (Kerbrat et al., 2016; Suntharalingam et al., 2006);

Potentially risky human research continued to be carried out without IRB approval (Lo, 2018);

IRBs continued to approve human research protocols, often without fully satisfying the Common Rule criteria, that is, after inadequate ethics reviews of the proposed research. For example, Lidz et al. (2012) reported that in their review of protocols, IRBs mostly discussed the informed consent documents (98% of the time), but failed to address risk minimization (21% of the time), the risk-to-benefit ratio (57%), equitable subject selection (60%), data monitoring (54%), privacy and confidentiality (25%), and protection of vulnerable populations (13%). Similar observations have also been recently reported by Silaigwana and Wassenaar (2019).

In the 1990s, one of the most common deficiencies identified by the FDA was an inadequate or late review of active research protocols (Nightingale, 1995). Today, lapse rates in IRB continuing reviews remain among the highest noncompliance rates (Tsan, 2017; Tsan & Nguyen, 2015). In addition, 13% of Canadian research ethics boards did not conduct annual continuing ethics reviews of ongoing research at all (Norton & Wilson, 2008).

There appears to be much room for improvement.

Performance measurement has been well established as an important tool for improving the quality of health care. Health care providers and payers devote substantial resources to collecting, analyzing, and reporting data on providers’ performance. The challenge for the health care community today is not whether they should continue to measure performance, but how to improve current performance measurements to maximize benefits (Cassel et al., 2014). In contrast, despite the importance of IRBs, there has been no systematic assessment of the quality and performance of IRBs nearly 50 years after their establishment (Tsan, 2019).

Over the past 10 years, we have conducted annual performance measurements of human research protection programs at the 107 Department of Veterans Affairs’ (VA) research facilities and have demonstrated the effectiveness of performance measurements in improving the quality and performance of VA human research protection programs (Tsan, 2017; Tsan & Nguyen, 2017). Based on our experience and in view of the need to improve the quality and performance of IRBs, I have developed a set of five quality indicators for assessing the quality and efficiency of IRBs; each quality indicator contains a number of performance metrics (Tsan, 2017). These quality indicators were developed based on the theory that the quality of IRBs is best defined by how well IRBs have done with respect to what they are supposed to do (Tsan, 2017, 2019).

These IRB quality indicators cover the following areas: (a) whether IRBs correctly designate protocols as exempt protocols not requiring IRB review, expedited review protocols that could be reviewed and approved by the IRB Chair (or his or her designated IRB members), or protocols requiring full IRB reviews at convened meetings; (b) whether IRBs consider and document all Common Rule criteria and conflict of interest prior to approving or disapproving protocols; (c) whether IRBs correctly approve waiver or alteration of informed consent or documentation of informed consent consistent with the Common Rule requirements; (d) whether IRBs conduct timely continuing reviews as required; and (e) how long it takes IRBs to approve protocols (Tsan, 2017). In addition, there are several performance metrics measuring outcomes related to IRBs in the VA human research protection program quality indicators/performance metrics that can also be used to assess the quality and performance of IRBs (Tsan, 2017).

With the implementation of the revised Common Rule requiring single-IRB review for multi-institutional studies conducted in the United States in January 2019 (Menikoff, Kaneshiro, & Pritchard, 2017; U.S. Department of Health and Human Services, 2018), it is likely that increasing numbers of institutions will rely on free-standing IRBs (independent IRBs that are not associated with any research institution but provide IRB review services to research institutions) for such studies. It is important that the quality and performance of these free-standing IRBs be monitored closely.

The key challenges that remain are as follows:

For Taylor to develop and publish a measure for assessing the quality of IRB ethics reviews and for Lynch et al. to develop and publish a set of measures for assessing the effectiveness of IRBs in protecting human subjects,

For the IRB community to decide whether to continue to wait for Taylor and Lynch et al. to develop their promised tools, or to start taking advantage of performance measurements to improve the quality and performance of IRBs by using existing IRB performance metrics and/or improving those existing metrics.

I believe that the choice is clear. One hopes that we do not have to wait for another 50 years to systematically assess the quality and performance of IRBs.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Author Biography

Min-Fu Tsan is the senior research scientist at the McGuire Research Institute, Richmond, Virginia, USA. His research interests include, but are not limited to, protection of human subjects participating in research.

References

Abbott, & Grady, C. A. (2011). Systematic review of the empirical literature evaluating IRBs: What we know and what we still need to learn. Journal of Empirical Research on Human Research Ethics, 6(1), 3-19.

Anderson, J. A., Sawatzky-Girling, McDonald, & Willison, D. J. (2011). Research ethics broadly writ: Beyond REB review. Health Law Review, 19(3), 12-24.

Association for the Accreditation of Human Research Protection Programs, Inc. (2018). Available from http://www.aahrpp.org/

Borror, K. C. (2001, May 7). Human research subject protections under multiple project assurance (MPA) M-1025 [Letter to University of Pennsylvania]. Office for Human Research Protections, Department of Health and Human Services. Retrieved from http://www.hhs.gov/ohrp/compliance-and-reporting/index.html/

Cassel, C. K., Conway, P. H., Delbanco, S. F., Jha, A. K., Saunders, R. S., & Lee, T. H. (2014). Getting more performance from performance measurement. New England Journal of Medicine, 371, 2145-2147.

Cleaton-Jones (2019). Commentary on “measuring the quality and performance of institutional review boards.” Journal of Empirical Research on Human Research Ethics, 14(3), 200-203. doi:10.1177/1556264618804686

Gearhart (2018). Seeking the Holy Grail of IRB quality. Retrieved from https://www.quorumreview.com/seeking-the-holy-grail-choosing-the-right-irb/

Grady (2019). The contribution of ethics review to protection of human participants: Comment on “measuring the quality and performance of institutional review boards.” Journal of Empirical Research on Human Research Ethics, 14(3), 197-199. doi:10.1177/1556264619837774

Institute of Medicine. (2001). Preserving public trust: Accreditation and human research participant protection programs. Washington, DC: National Academies Press.

Institute of Medicine. (2002). Responsible research: A systems approach to protecting research participants. Washington, DC: National Academies Press.

Kerbrat, Ferré, Fillatre, Ronzière, Vannier, Carsin-Nicol, . . . Edan (2016). Acute neurologic disorder from an inhibitor of fatty acid amide hydrolase. New England Journal of Medicine, 375, 1717-1725.

Lidz, C. W., Appelbaum, P. S., Arnold, Candilis, Gardner, Myers, & Simon (2012). How closely do institutional review boards follow the common rule? Academic Medicine, 87, 969-974.

Lo (2018). A parallel universe of clinical trials. New England Journal of Medicine, 379, 101-103.

Lynch, H. F., Nicholls, Meyer, M. N., Taylor, H. A., & Consortium to Advance Effective Research Ethics Oversight. (2019). Of parachutes and participant protection: Moving beyond quality to advance effective research ethics oversight. Journal of Empirical Research on Human Research Ethics, 14(3), 190-196. doi:10.1177/1556264618812625

McNeilly, P. J., & Carome (2001, July 19). Human subjects protections under multiple project assurance (MPA) M-1011 [Letter to Johns Hopkins]. Office for Human Research Protections, Department of Health and Human Services. Retrieved from http://www.hhs.gov/ohrp/compliance-and-reporting/index.html

Menikoff, Kaneshiro, & Pritchard (2017). The common rule, updated. New England Journal of Medicine, 376, 613-615.

Nightingale, S. L. (1995, October 20). An update from FDA. Plenary address at PRIM&R IRB conference, Boston, MA.

Norton, & Wilson, D. M. (2008). Continuing ethics review practices by Canadian research ethics boards. IRB: Ethics & Human Research, 30(3), 10-14.

Silaigwana, & Wassenaar (2019). Research ethics committees’ oversight of biomedical research in South Africa: A thematic analysis of ethical issues raised during ethics review of non-expedited protocols. Journal of Empirical Research on Human Research Ethics, 14, 107-116. doi:10.1177/1556264618824921

Steinbrook (2002a). Improving protection for research subjects. New England Journal of Medicine, 346, 1425-1430.

Steinbrook (2002b). Protecting research subjects—The crisis at Johns Hopkins. New England Journal of Medicine, 346, 716-720.

Steinbrook (2008). The Gelsinger case. In Emanuel, E. J. (Ed.), The Oxford textbook of clinical research ethics (pp. 110-120). Bethesda, MD: Oxford University Press.

Suntharalingam, Perry, M. R., Ward, Brett, Castello-Cortes, Brunner, M. D., & Panoskaltsis (2006). Cytokine storm in a phase 1 trial of the anti-CD28 monoclonal antibody TGN1412. New England Journal of Medicine, 355, 1018-1028.

Taylor, H. A. (2007). Moving beyond compliance: Measuring ethical quality to enhance the oversight of human subjects research. IRB: Ethics & Human Research, 29(5), 9-14.

Tsan, M. F. (2017). Assessing the quality of human research protection programs (Chapters 1 & 11). Beau Bassin, Mauritius: Scholars’ Press.

Tsan, M. F. (2019). Measuring the quality and performance of institutional review boards. Journal of Empirical Research on Human Research Ethics, 14(3), 187-189. doi:10.1177/155626418804686

Tsan, M. F., & Nguyen (2015). Lapse in institutional review board continuing review approval. IRB: Ethics & Human Research, 37(2), 14-19.

Tsan, M. F., & Nguyen (2017). Effectiveness of human research protection program performance measurements. Journal of Empirical Research on Human Research Ethics, 12, 217-228.

U.S. Department of Health and Human Services. (2018). Federal policy for the protection of human subjects (45 Code of Federal Regulations Part 46). Retrieved from https://www.hhs.gov/ohrp/regulations-and-policy/regulations/common-rule/index.html