Abstract

Introduction
Proteins That Bind and Rearrange DNAs
My road to gene therapy started with studies of DNA-binding proteins. I had the good fortune to carry out my doctoral work with Mark Ptashne, a colorful figure if ever there was one. Mark isolated phage lambda repressor in the 1960s (Ptashne, 1967a,b), leading to the first biochemical demonstration of gene control by a sequence-specific DNA-binding protein. His race with Walter Gilbert to isolate a repressor protein (Mark working on lambda and Wally on lac repressor) was the subject of a television documentary narrated by George C. Scott. When I joined the lab in the 1980s, Mark's group was driving the field of transcriptional regulation. Work from this period is summarized in Mark's book A Genetic Switch, which describes with exceptional clarity the operation of the bistable switch regulating induction of the lambda prophage (Ptashne, 1986). My graduate studies of gene control were published in fancy journals (Bushman et al., 1985, 1989; Bushman and Ptashne, 1986, 1988), but merit only a few paragraphs in the book, a testament to the amount of research summarized.
In Mark's lab in the late 1980s, one of the leading questions concerned how proteins bound at one DNA site could influence events far away along the DNA chain. Transcriptional enhancers provided one example. Some of the best work on this question was from the field of DNA recombination, notably work from Kiyoshi Mizuuchi and Robert Craigie. Mizuuchi had established an in vitro system that carried out transposition of phage Mu DNA efficiently (Mizuuchi, 1983), a reaction related to retroviral integration, which provided the basis for a striking series of mechanistic experiments, many together with Craigie. They showed that the Mu recombination proteins bound to distant DNA recognition sites and folded up into an ordered complex—only after correct assembly were the DNA breaking and joining steps initiated (Craigie et al., 1984, 1985; Craigie and Mizuuchi, 1985, 1987). Eventually, many integrating elements, including retroviral preintegration complexes, would be shown to assemble into related structures.
Around this time, the newly discovered HIV virus was much in the news, and I grew interested in the mechanism of viral DNA integration. Retroviral particles contain RNA genomes, which after infection are copied to DNA by reverse transcription. The DNA copy is then integrated into a host-cell chromosome. This reaction involved DNA-binding proteins (i.e., HIV integrase) and also captured problems in DNA enzymology. Today retroviral vectors of course are widely used for gene transfer in human gene therapy.
I joined Craigie and Mizuuchi at the National Institutes of Health as a postdoc in 1989 to study retroviral integration, and in short order we had purified integrase proteins working in vitro (Bushman et al., 1990; Craigie et al., 1990). HIV integrase alone, it turned out, could form a covalent bond between model viral DNA and model target DNA under the right conditions. This provided the tool for a series of mechanistic studies, a line of work we continue to this day (Bushman and Wang, 1994; Pruss et al., 1994b; Farnet et al., 1996; Carteau et al., 1999; Gao et al., 2004; Diamond and Bushman, 2005, 2006; Gupta et al., 2010).
Later, the in vitro HIV integration system was picked up by the pharmaceutical industry. After an immense effort at Merck, the first HIV integrase inhibitors were approved by the Food and Drug Administration in 2007 (Summa et al., 2008). Several of my trainees were recruited by Merck and helped with this project.
Both Kiyoshi and Mark are prizewinning researchers, but their styles could not have been more different. Kiyoshi has the reserve of a samurai warrior and usually wears a black headband to hold back his long hair. Mark is mercurial and drives a Porsche. Mark was interested in sweeping simplifying ideas and tended to ignore data that failed to match his latest theory. Kiyoshi was intensely focused on technology and seemed to love the engineering for its own sake. At one point, projects in Kiyoshi's lab required oligonucleotide substrates with phosphorothioates of defined stereochemistries; Kiyoshi disappeared for several weeks and then returned with the needed DNAs, having synthesized them himself (Engelman et al., 1991; Mizuuchi and Adzuma, 1991). Seeing both Kiyoshi and Mark up close was remarkably instructive: it requires allegiance to your idea to drive it ahead through inevitable setbacks, but you sure better pay close attention to methods and data quality as well.
HIV Integration in Chromosomes
In 1993 I started my own laboratory at the Salk Institute, focusing on molecular mechanisms of HIV replication. I trained on phage, and HIV was the successor, providing a rich window into the molecular biology of vertebrate cells. Our early studies explored the viral and cellular components of viral integration complexes (Miller et al., 1995) and interactions with cellular proteins such as nucleosomes (Pruss et al., 1994a; Bor et al., 1995).
From the start, we were interested in gene therapy, influenced by neighbors such as Ted Friedmann at the University of California–San Diego (UCSD) and Inder Verma at Salk. In one early experiment, I fused the DNA-binding domain of lambda repressor onto HIV integrase and showed that integration in vitro was favored near repressor-binding sites in target DNA (Bushman, 1994). In 1994 this made a splash, though the technology has so far not been translated effectively into vector systems useful in patients. Nevertheless, the repressor–integrase fusions did provide a conceptual prototype for the fusions of DNA-binding domains and nucleases that are popular today.
In the late 1990s, I was really excited by the first articles on complete genome sequences. The data established among other things the tremendous force that mobile DNA represented in biological evolution. Thanks to training with Kiyoshi and Mark, I had a good overview of mobile DNA mechanisms, and retrovirology provided another perspective. In 2001 I wrote about it in a book on how DNA moves between cells and what happens as a result (Lateral DNA Transfer—Mechanisms and Consequences) (Bushman, 2001). Happily the book was favorably reviewed in several major journals. One of the more memorable callouts was from the science fiction writer Greg Bear, who called the book “one of my favorite science books of the decade” and wrote that it helped him in writing his novel Darwin's Radio, in which mobile DNA runs amok in the human genome!
The draft genome sequence of humans, newly available in 2001, suggested that about 8% of the human genome is composed of retroviral sequences. Most of these sequences were found outside genes, probably selected to minimize disruption of the human transcription units (Brady et al., 2009). So where in the human genome did HIV target integration?
In 2002, we set out to sequence a large number of HIV integration sites, read out the flanking human DNA bases, align the reads on the newly available draft human genome sequence, and assess associations with chromosomal features mapped by others (Schroder et al., 2002). Happily, Joe Ecker at Salk had a high-throughput sequencing line set up, which he had used to help sequence the Arabidopsis genome. We sequenced about 500 sites of HIV integration, aligned them to the human genome, and began the job of looking for patterns.
For this we worked closely with statistician Charles Berry at UCSD, a fruitful collaboration that continues to this day (Berry et al., 2006, 2012, 2014). Chuck introduced me to what was possible when sequence information is analyzed using advanced methods and to the value of the statistical mindset in general. My early mentors from the phage school almost never used statistics—experiments involved forming a hypothesis; carrying out a simple experiment, often involving a single pair of experimental and control samples; and assessing whether or not the idea was supported. The general feeling was that if statistics were needed, then the battle was already over and you had lost. However, given many thousands of bases of integration site sequence, and billions of bases of human DNA that you want to compare against, which is itself annotated in dozens of ways, you have no hope of eyeballing the answer. Chuck calmly taught us to arrange the data in a manageable form and articulate answerable questions. He then devised clever computational routes to answers.
By comparing integration site positions to their genomic annotations, we found that HIV favors integration quite strongly in cellular transcription units (Schroder et al., 2002). We then analyzed transcription using Affymetrix microarrays and found that active transcription units were particularly favored. We also found strong clustering of integration sites. For this analysis, Chuck made randomly distributed control sequences, and then compiled a list of distances between integration sites as you progressed along the chromosomes. He made a similar list for the experimental HIV integration sites and then compared the two. After normalizing the numbers of sequences, there were far more short distances between authentic HIV sites, allowing a statistically rigorous demonstration of clustering. The findings made biological sense—HIV infection kills off cells quickly, but integration in transcription units can help assure efficient viral gene expression during the short life of infected cells.
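The clustering comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis Chuck actually ran: the function names, the uniform placement of control sites, the distance cutoff, and the permutation-style p-value are all hypothetical choices made for the example. Drawing the same number of control sites as observed sites stands in for the normalization of sequence numbers.

```python
import random
from collections import defaultdict

def adjacent_distances(sites):
    """Distances between neighboring sites on the same chromosome.

    `sites` is an iterable of (chromosome, position) pairs.
    """
    by_chrom = defaultdict(list)
    for chrom, pos in sites:
        by_chrom[chrom].append(pos)
    distances = []
    for positions in by_chrom.values():
        positions.sort()
        distances.extend(b - a for a, b in zip(positions, positions[1:]))
    return distances

def random_controls(chrom_lengths, n, rng):
    """Draw n control sites uniformly, weighting chromosomes by length (assumption)."""
    chroms = list(chrom_lengths)
    weights = [chrom_lengths[c] for c in chroms]
    picks = rng.choices(chroms, weights=weights, k=n)
    return [(c, rng.randrange(chrom_lengths[c])) for c in picks]

def fraction_short(distances, cutoff):
    """Share of neighbor distances below `cutoff` (0.0 if there are none)."""
    if not distances:
        return 0.0
    return sum(d < cutoff for d in distances) / len(distances)

def clustering_p_value(observed, chrom_lengths, cutoff, n_perm=1000, seed=0):
    """Empirical p-value: how often do random sites look as clustered as observed?"""
    rng = random.Random(seed)
    obs = fraction_short(adjacent_distances(observed), cutoff)
    hits = 0
    for _ in range(n_perm):
        # Same number of control sites as observed sites in every draw.
        ctrl = random_controls(chrom_lengths, len(observed), rng)
        if fraction_short(adjacent_distances(ctrl), cutoff) >= obs:
            hits += 1
    return obs, (hits + 1) / (n_perm + 1)
```

A real analysis would also have to draw controls that respect the biases of the recovery method (for example, distance to restriction sites), which this sketch ignores.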
Later we collaborated with several groups to show that binding of HIV integrase to the cellular transcription protein LEDGF/p75 mediates targeting to transcription units via a simple tethering mechanism (Cherepanov et al., 2003; Ciuffi et al., 2005; Ciuffi and Bushman, 2006; Marshall et al., 2007; Shun et al., 2007). One experiment that made this point was the demonstration that hybrid tethers, containing the LEDGF integrase-binding domain fused to a novel chromatin-binding domain, could retarget HIV integration to new locations in chromosomes (Gijsbers et al., 2010).
Gammaretroviruses favor integration in another location—transcription start sites (Wu et al., 2003; Mitchell et al., 2004)—and just this year a tethering factor was found for this reaction as well (cellular bromodomain and extra-terminal [BET] proteins) (Ciuffi and Bushman, 2006; De Rijck et al., 2013; Gupta et al., 2013; Sharma et al., 2013). These findings may allow more control over integration site selection in gene therapy applications—small molecules are currently available that interfere with binding of integrases to both LEDGF/p75 and BET proteins, resulting in more random integration targeting in treated cells.
DNA Integration in Human Gene Therapy
The methods developed in mechanistic studies of retroviral integration were perfect for monitoring outcomes in human gene therapy, and happily we were invited to help analyze several influential clinical trials. Around this time I was fortunate to receive an offer from the University of Pennsylvania, and so moved there, lured in part by the chance to work with the distinguished community of gene therapy researchers, which includes Jean Bennett, Kathy High, Carl June, James Riley, Pablo Tebas, and Jim Wilson.
Shortly after arriving at Penn, we undertook the exciting job of applying the new deep sequencing methods to analysis of integration site distributions. The newly available 454/Roche pyrosequencing method was able to produce close to a million sequence reads in a single instrument run—technology so amazing one wondered if space aliens were secretly behind it. After several false starts, in 2006 we managed to sequence about 40,000 sites of HIV integration in a cell line, by far the largest data set at the time. This allowed much more detailed analysis of integration site positions than was previously possible and for the first time allowed association of integration site data with information on epigenetic modification of chromatin in target cells (Wang et al., 2007).
Our first work with samples from gene-corrected subjects, started at the invitation of Marina Cavazzana-Calvo, Alain Fischer, and Salima Hacein-Bey-Abina at the Hôpital Necker-Enfants Malades in Paris, involved analyzing outcome in the first trial to treat severe combined immunodeficiency (SCID)-X1 (Cavazzana-Calvo et al., 2000; Hacein-Bey-Abina et al., 2003b, 2008; Thrasher et al., 2006; Bushman, 2007; Howe et al., 2008). Bone marrow cells were removed from the affected children, transduced using a gammaretroviral vector harboring an intact copy of the IL2RG gene, and then reinfused. Of 20 subjects treated in London and Paris, most achieved dramatic clinical benefit, but 5 of the children suffered adverse events in the form of leukemia associated with integration of the gammaretroviral vector near a cellular cancer gene. Unfortunately, it seemed quite likely that the transcriptional enhancer sequences in the therapeutic vector caused the activation of dangerous cellular genes such as LMO2 (Hacein-Bey-Abina et al., 2003a, 2008). One child died; the other four responded to chemotherapy and have continued to benefit from gene therapy. Despite the setbacks, the results were remarkable: for the first time in history, humans had stably engineered their genetic makeup to cure a disease.
We sequenced integration sites from longitudinal blood samples from the treated subjects and began a long-term analysis of the population biology of the gene-corrected cells (Wang et al., 2008, 2010; Hacein-Bey-Abina et al., 2010). The clinical setting is unique—the children initially had no T cells at all, but after gene correction, T cells grew out robustly and every cell was marked with a unique integration site. Initially, we helped characterize some of the adverse events, identifying CCND2 and BMI1 as genes likely involved in insertional activation in addition to LMO2. We tracked cell clones longitudinally through gene correction, revealing that in some cases cells with integration sites near cancer-related genes were particularly abundant, potentially because of boosted transcription from the vector enhancer increasing expression of cellular genes favoring proliferation or survival. Another application was monitoring the outcome of chemotherapy for leukemia—we could determine which integration site near a cancer gene shot up in abundance associated with adverse events and then confirm that the clone was erased from the population by successful chemotherapy.
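The clone tracking described above amounts to following the relative abundance of each integration site across time points. The Python sketch below illustrates that bookkeeping only; the site labels, the use of raw recovery counts as a proxy for clone size, and the expansion threshold are assumptions made for the example, not the procedure used in the trial analyses.

```python
def relative_abundance(site_counts):
    """Convert {site: count} for one time point into {site: fraction}."""
    total = sum(site_counts.values())
    if total == 0:
        return {}
    return {site: n / total for site, n in site_counts.items()}

def clone_trajectories(timepoints):
    """Map each site to its fraction at every time point (0.0 when absent).

    `timepoints` is an ordered list of (label, {site: count}) pairs.
    """
    fractions = [relative_abundance(counts) for _, counts in timepoints]
    sites = set().union(*fractions)  # every site seen at any time point
    return {s: [f.get(s, 0.0) for f in fractions] for s in sorted(sites)}

def expanded_clones(trajectories, threshold):
    """Sites whose fraction reaches `threshold` at any time point."""
    return [s for s, traj in trajectories.items() if max(traj) >= threshold]

# Hypothetical data: site_A expands, then disappears after treatment.
samples = [
    ("month_6", {"site_A": 10, "site_B": 45, "site_C": 45}),
    ("month_30", {"site_A": 80, "site_B": 10, "site_C": 10}),
    ("post_chemo", {"site_B": 50, "site_C": 50}),
]
trajectories = clone_trajectories(samples)
flagged = expanded_clones(trajectories, threshold=0.6)
```

Here `flagged` contains only `site_A`, and its trajectory ends at 0.0, the pattern one would hope to see when an expanded clone is cleared by chemotherapy.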
After the first SCID-X1 trial, many groups turned to newer types of retroviral vectors that lacked the strong enhancers characteristic of gammaretroviruses, and we used integration site information to track cell clones in several of them. One type of vector maintained most of the gammaretroviral backbone that had worked in the SCID-X1 trial but simply deleted the enhancers. The SCID2 trial with the revised vector is now under way and publications should be appearing shortly.
Another direction was to adapt HIV as a vector for gene delivery. Despite the millions of people infected with HIV worldwide, each harboring enormous numbers of infected cells, there are no reported cases of cancer caused by insertional activation of proto-oncogenes. Proponents of HIV-based vectors (rebranded “lentiviral vectors” to make them sound less scary) suggest that this indicates that vectors also should be safer. However, HIV encodes a protein that arrests the cell cycle (Vpr), and another that is lethal to cells (Env), both of which are removed in vectors. Thus, the potential genotoxicity may be less clear than suggested by the HIV patient data.
The first introduction of lentiviral vectors into humans was carried out by a large team led by Carl June at the University of Pennsylvania (Levine et al., 2006; Wang et al., 2009). T cells were harvested from HIV-positive patients, an HIV inhibitory gene was introduced on a lentiviral vector, and then the cells were reinfused. The procedure was found to be safe, the goal of the trial, and we showed that the distribution of lentiviral vector integration sites in the T cells was indistinguishable from the distribution in test infections of tissue culture cells in the lab; that is, there was no indication that vector integration was driving particular cells to proliferate differentially, good news for safety (Levine et al., 2006; Wang et al., 2009).
This first HIV trial showed some possible clinical benefit, though no strong effects on the amount of circulating HIV. However, two other groups shortly thereafter reported dramatic clinical successes with lentiviral vectors. Nathalie Cartier, Patrick Aubourg, and their coworkers reported successful use of lentiviral gene delivery to treat adrenoleukodystrophy (Cartier et al., 2009), helping patients who had few other therapeutic options. In a second study, Philippe Leboulch, Marina Cavazzana-Calvo, and a large team succeeded in using lentiviral vector–based gene delivery to treat one patient with beta-thalassemia, rendering the subject independent of the blood transfusions required before gene correction (Cavazzana-Calvo et al., 2010).
We carried out a longitudinal integration site analysis in the beta-thalassemia subject, with surprising results (Cavazzana-Calvo et al., 2010). A large fraction of all transduced cells contained an integration site in HMGA2, a known cancer gene. It turned out that integration within the gene resulted in formation of a truncated mRNA, which removed a negatively acting binding site for the let7 microRNA, thereby upregulating expression of HMGA2. The transcriptional apparatus of the vector also appeared to be acting on the HMGA2 promoter, further boosting HMGA2 expression and apparently promoting cell proliferation. This provided the first indication of possible “vector driving” by lentiviral vectors in humans—today, the picture seems to be that lentiviral vectors are less likely to affect cellular proliferation than early version gammaretroviral vectors, but there is still some potential for doing so. Happily, at the time of this writing, the treated subject is doing well.
While these studies have yielded exciting new data, they have also left us quite familiar with the limitations of the analytical methods. Recovery methods are limited in efficiency and reliability, and communicating effectively with clinical colleagues is an ongoing challenge. Going forward, we and others in the field need to optimize methods (Berry et al., 2006, 2012; Hoffmann et al., 2007; Bushman et al., 2008; Wang et al., 2008; Ciuffi et al., 2009; Hacein-Bey-Abina et al., 2010; Brady et al., 2011) and then make them as high throughput and simple as possible for expanded implementation as human gene therapy comes into widespread use.
Looking Forward
Today, numerous successful gene therapy trials are underway. Lentiviral delivery is being applied to a wide array of disease indications (Levine et al., 2006; Cartier et al., 2009; Wang et al., 2009; Cavazzana-Calvo et al., 2010; Grupp et al., 2013), and AAV-based vectors are also showing dramatic success (Maguire et al., 2008; Li et al., 2011a,b; Nathwani et al., 2011; Kaeppel et al., 2013). Cells modified by designer nucleases are now in humans—again Carl June and coworkers have pioneered a new technology in the search for cures for HIV infection (Tebas et al., 2014). Genome editing is taking off as a correction method for many indications, though my guess is we still have a lot to learn about the unwanted side effects of engineered nucleases on genomes.
Whole new worlds are still opening up thanks to the new deep-sequencing methods. For example, gene correction in immunodeficient states can be explored by sequencing products of VDJ recombination (Robins et al., 2009, 2010), providing an exceptionally rich picture of newly formed populations of antigen-recognizing cells. Deep sequencing can also report the response of the human microbiome as immunodeficiencies are corrected (McKenna et al., 2008; Hildebrandt et al., 2009; Hoffmann et al., 2009; Wu et al., 2011; Charlson et al., 2012; Minot et al., 2012, 2013; Peterfreund et al., 2012; Sonnenberg et al., 2012; Alenghat et al., 2013), providing a unique opportunity to investigate how the human immune system shapes our microbial populations.
Without much fanfare, human gene therapy has gone from controversial to almost standard for some indications. It has been a privilege to be able to help along the way. Way down the road, I'm betting that this transition will be seen as a turning point in history.
Footnotes
Acknowledgments
I particularly thank the people who worked in my laboratory and the collaborators who added so much to our projects. Thanks very much to the National Institutes of Health, which has funded most of my research.
Author Disclosure Statement
The author has no conflict of interest.
