Resolving the Unresolvable: Nanopore Sequencing as a Comprehensive Quality Control Platform for Gene Therapy Vectors

Abstract

Cell and gene therapy (CGT) manufacturing has outpaced traditional chemistry, manufacturing, and controls frameworks, leaving a “black box” in vector quality control (QC). Legacy assays such as Sanger and short-read next-generation sequencing often fail to resolve complex structures including adeno-associated virus (AAV) inverted terminal repeats, lentiviral recombination, and mRNA poly(A) tails. Oxford Nanopore Technologies enables long-read, native single-molecule sequencing to access these attributes directly. This review summarizes nanopore sequencing across the CGT lifecycle. For plasmid DNA, it confirms full-length circular identity and reveals structural heterogeneity missed by restriction mapping. For viral vectors (AAV and lentivirus), it functions as an integrity assay to distinguish full genomes from truncations and to detect sequence-resolved impurities, including reverse-packaged plasmid backbones. For mRNA therapeutics, direct RNA sequencing profiles poly(A) tail length distributions and base modifications (e.g., m¹Ψ) in a single assay. We also discuss adaptive sampling for impurity enrichment and native epigenetic profiling of bacterial methylation. Finally, we assess limitations in accuracy and compliance and outline the regulatory path toward moving long-read sequencing from an orthogonal tool to a validated lot-release method. Overall, nanopore sequencing supports risk-based, high-resolution QC while reducing analytical turnaround time.

Keywords

quality control (CMC); nanopore sequencing; adeno-associated virus (AAV); mRNA therapeutics; poly(A) tail

INTRODUCTION

The landscape of cell and gene therapy (CGT) is undergoing rapid expansion. In recent years, over 100 gene, cell, and RNA therapies have been approved worldwide, with more than 3,700 candidates in clinical or preclinical development.^1,2 This momentum, driven by clinical successes such as adeno-associated virus (AAV)-based gene therapies and CAR-T cell immunotherapies, marks a shift toward precision medicine.^3,4 Viral vectors—specifically AAV and lentivirus—currently dominate delivery platforms, accounting for ∼75% of gene therapy programs.⁵ In parallel, the widespread deployment of mRNA vaccines has validated the scalability and efficacy of RNA-based modalities.^6–8 However, this clinical progress has outpaced the capabilities of traditional chemistry, manufacturing, and controls (CMC) frameworks.^9–12 Gene therapy vectors are complex biological assemblies, and ensuring their quality, purity, and consistency remains a significant challenge, often described as the manufacturing “black box.”^13–16

Traditional quality control (QC) methods, while established, struggle to resolve the structural complexities of these vectors. Sanger sequencing, the legacy standard for identity testing, frequently fails at highly repetitive or guanine-cytosine (GC)-rich regions.¹⁷ For instance, the inverted terminal repeats (ITRs) of AAV, ∼145 bp GC-rich palindromic hairpins essential for viral replication, are notoriously difficult to sequence due to polymerase slippage and secondary structure formation.^13,17 Similarly, next-generation sequencing (NGS) platforms, such as Illumina, offer high throughput but are limited by short read lengths. The fragmentation required for short-read libraries destroys long-range phasing information, making it difficult to resolve repetitive elements or distinguish between independent vector genomes and concatemers.^18–20 Furthermore, polymerase chain reaction (PCR) amplification in standard NGS workflows introduces bias, potentially obscuring GC-rich regions or structural variants.^18,21,22

Quantitative assays face similar limitations. While qPCR is widely used for titration, it is sensitive to primer design and matrix inhibitors, often yielding variable results.²³ Droplet digital PCR (ddPCR) improves precision by enabling absolute quantification,^24,25 yet it remains a targeted method; it confirms the presence of specific amplicons but cannot detect unexpected genomic contaminants or characterize the integrity of the full vector genome.²⁶ Consequently, conventional QC strategies often rely on a labor-intensive patchwork of indirect assays—combining Sanger sequencing, qPCR, and gel electrophoresis. This fragmented approach increases operational complexity and costs while failing to resolve critical quality attributes (CQAs), such as truncated genomes, reverse-packaged plasmid backbones,¹⁴ or epigenetic modifications.^27–29

To address these gaps, long-read sequencing has emerged as a critical tool for vector characterization. It is important to situate Oxford Nanopore Technologies (ONT) within this broader landscape, particularly alongside PacBio single molecule real-time (SMRT) sequencing. PacBio HiFi reads, generated via circular consensus sequencing, offer exceptional per-read accuracy (>Q30), making them highly advantageous for confident single-nucleotide variant (SNV) and indel calling.³⁰ HiFi sequencing has been successfully applied to profile heterogeneous rAAV populations and detect nonvector DNA impurities.³¹ However, HiFi workflows typically require SMRTbell library construction and may be less adaptable to rapid, real-time “in-house” QC iterations compared with nanopore workflows.³²

In contrast, ONT nanopore sequencing offers a distinct advantage: the ability to sequence native DNA and RNA molecules in real time without PCR amplification or extensive library manipulation. Mechanistically, nanopore sequencing operates by unwinding and translocating a single-stranded nucleic acid through a nanometer-scale protein pore embedded in an electrically resistant synthetic membrane. As a motor protein regulates the translocation speed, the passing nucleotides cause characteristic disruptions in a steady ionic current. These electrical signals are then decoded in real time into sequence and epigenetic data by deep-learning algorithms. This capability positions the technology as a rapid, highly informative orthogonal characterization tool that can streamline the QC workflow, complementing rather than entirely replacing multiple orthogonal tests. This “native-state” sensing allows for the direct detection of base modifications (e.g., methylation) and the interrogation of RNA features such as poly(A) tail length and secondary structure, capabilities that are challenging for other platforms.³² Furthermore, nanopore sequencing provides scalable read lengths, enabling the analysis of ultra-long molecules that span entire plasmids or complex vector genomes end-to-end.³³ Recent advancements in pore chemistry (e.g., R10.4.1) and AI-driven basecalling have significantly reduced error rates, with raw single-molecule read accuracies reaching Q20–Q25 (∼99.0–99.7%), thereby increasing the utility of ONT for detailed sequence verification. However, it must be emphasized that nanopore sequencing is currently in its early developmental stages for CMC applications and is not yet an established, fully validated standalone lot-release platform.
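Because read accuracy is quoted throughout this review as Phred-scaled Q scores, it may help to make the conversion explicit. The following is a minimal sketch of the standard relationship Q = −10·log10(error probability); the function names are illustrative and not taken from any basecalling software.

```python
import math

def phred_to_accuracy(q: float) -> float:
    """Convert a Phred quality score Q into per-base accuracy (fraction correct)."""
    return 1.0 - 10.0 ** (-q / 10.0)

def accuracy_to_phred(accuracy: float) -> float:
    """Convert per-base accuracy (fraction correct, < 1) into a Phred score."""
    return -10.0 * math.log10(1.0 - accuracy)

# Scores quoted in the text for raw nanopore reads (Q20-Q25) and HiFi reads (>Q30)
for q in (20, 25, 30):
    print(f"Q{q}: {phred_to_accuracy(q) * 100:.2f}% per-base accuracy")
```

On this scale, each 10-point increase in Q corresponds to a tenfold reduction in the per-base error rate.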

Therefore, the most robust CMC strategy is likely a hybrid, risk-centric framework rather than a single-platform solution. Short reads remain valuable for deep variant surveillance; HiFi reads serve as high-accuracy validators; and nanopore sequencing acts as a “molecular microscope” for resolving long-range structural integrity, repetitive elements (ITRs/long terminal repeats [LTRs]), and native epigenetic signatures³⁴ (Fig. 1).

Figure 1.

Nanopore sequencing as a “molecular microscope” for gene therapy vector quality control. Traditional QC methods provide a fragmented or incomplete view of complex vector molecules (left, “Blurry Vision”), exemplified by Sanger drop-off at AAV ITR hairpins, short-read sequencing that breaks long plasmids into ambiguous assemblies, and the inability to directly sense native base modifications. In contrast, nanopore sequencing enables end-to-end single-molecule resolution (right, “Crystal Clear”), including full-length circular plasmid reads, ITR-to-ITR AAV genome profiling with direct modification detection, and direct RNA sequencing of mRNA with single-molecule poly(A) tail length measurements. AAV, adeno-associated virus; ITR, inverted terminal repeat; QC, quality control.

In this review, we examine the specific application of ONT nanopore sequencing across the CGT manufacturing workflow. We detail how this technology is being applied to: (1) plasmid DNA, to verify full circular sequences and resolve recombination events; (2) viral vectors, to illuminate genome truncations, ITR stability, and packaged impurities; and (3) mRNA therapeutics, to measure poly(A) tails and base modifications directly (Fig. 2). We also explore advanced capabilities such as adaptive sampling for impurity enrichment and discuss the current regulatory outlook for integrating these long-read metrics into routine QC.

Figure 2.

A comprehensive workflow for gene therapy quality control enabled by nanopore sequencing. Nanopore sequencing serves as a versatile tool across three critical manufacturing stages: (A) Plasmid DNA construction: Resolves structural heterogeneity often missed by short reads, including recombination at repeat elements (e.g., ITRs/LTRs), plasmid multimerization, and bacterial methylation patterns (Dam/Dcm). (B) Viral vector production (AAV/lentivirus): Provides end-to-end coverage of packaged genomes to detect truncation events, quantify reverse-packaged plasmid backbone or host DNA contaminants, and verify ITR integrity in the final drug substance. (C) mRNA synthesis: Utilizes direct RNA sequencing to characterize poly(A) tail length distributions, detect synthesis by-products (short/long transcripts), and verify base modifications (e.g., m¹Ψ) without reverse transcription. The “magnifying glass” represents the ability of long-read sensing to reveal hidden quality attributes (risk attributes) that constitute the “black box” of traditional QC. LTR, long terminal repeat.

THE FOUNDATION: PLASMID DNA (PDNA) QC

Plasmid DNA serves as the fundamental starting material for the majority of gene therapies, functioning either as the template for viral vector production or as the active drug substance in DNA vaccines.^35–37 Ensuring the sequence fidelity and structural homogeneity of pDNA is therefore critical. Traditional verification methods, primarily restriction digest mapping and Sanger sequencing, rely on fragmenting the molecule. This approach often fails to detect subtle rearrangements or characterize complex secondary structures.^35,36 Nanopore sequencing addresses these limitations by enabling full-length plasmid sequencing—reading the linearized or circular molecule in a single continuous pass.^38–41

Sequence identity and consensus accuracy

A key advantage of the nanopore platform is the ability to confirm plasmid identity without assembly. A single read, typically 5–15 kb, can span the plasmid from origin to terminus.^39,41 This capability is particularly valuable for synthetic biology and vector manufacturing, where confirming the exact arrangement of modular components is essential.^6,38,40

However, single-molecule raw read accuracy has historically been a concern for SNV detection. To address this, workflows utilizing rolling circle amplification (RCA) have been developed (often termed NPlasmid-seq). By amplifying the plasmid into long concatemers containing multiple tandem repeats of the monomer, a single nanopore read can capture dozens of passes over the same sequence.^13,42,43 Aligning these repeats yields a high-fidelity consensus sequence in which random errors are largely averaged out, often achieving consensus accuracies >Q30 (>99.9%), comparable to or exceeding Sanger sequencing for plasmid verification.^43,44 This approach effectively combines the read-length benefits of ONT with the high accuracy required for Good Manufacturing Practice (GMP) release testing.
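The consensus principle behind RCA-based workflows can be illustrated with a toy example. The sketch below is a deliberate simplification and not the published pipeline: it assumes the concatemeric read contains only substitution errors, so that repeat units can be cut at fixed offsets and compared column by column. Real reads contain insertions and deletions and require alignment of the repeat units first. All function names and sequences are hypothetical.

```python
from collections import Counter

def split_into_units(read: str, unit_len: int) -> list:
    """Cut a concatemeric read into consecutive monomer-length units.
    Toy assumption: no indels, so units start at fixed offsets."""
    return [read[i:i + unit_len] for i in range(0, len(read) - unit_len + 1, unit_len)]

def majority_consensus(units: list) -> str:
    """Call the most frequent base at each position across equal-length units."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*units))

# Five passes over an 8-nt "plasmid"; two passes carry a single random substitution
monomer = "ACGTTGCA"
noisy_read = "ACGTTGCA" + "ACGATGCA" + "ACGTTGCA" + "TCGTTGCA" + "ACGTTGCA"

units = split_into_units(noisy_read, len(monomer))
print(majority_consensus(units))  # ACGTTGCA
```

Because random errors rarely recur at the same position in different passes, the majority call recovers the true base. Systematic errors, which recur in every pass, are not removed this way.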

Furthermore, systematic errors in nanopore basecalling often arise from bacterial methylation patterns. Plasmids propagated in standard Escherichia coli strains carry Dam (6-methyladenine) and Dcm (5-methylcytosine) modifications, which alter the ionic current signal and can confuse standard basecallers.¹⁸ Modern “methylation-aware” polishing tools (e.g., Medaka and Nanopolish) or models trained on modified bases can correct these systematic artifacts, so that the final consensus sequence remains accurate even in the presence of native bacterial methylation.^28,45

Resolving structural heterogeneity and inverted repeats

A major blind spot in short-read QC is structural heterogeneity. Plasmids often exist as a population containing monomeric, multimeric, and recombined forms; these variants may be low-abundance but biologically significant.

• Multimerization: Homologous recombination during bacterial replication can produce head-to-tail dimers or trimers.^46–49 Nanopore reads can traverse these concatemer junctions directly, providing a “species census” of the plasmid population (monomer vs. multimer) that agarose gels may not fully resolve.²⁶

• ITR Instability: For AAV vectors, the ITRs are notoriously unstable in E. coli, prone to deletions that render the resulting virus replication-incompetent.^27,50 These 145-bp GC-rich hairpins act as strong blocks to Sanger sequencing polymerases and are often bridged (missing) in short-read assemblies.⁵¹ Nanopore sequencing is currently one of the few accessible technologies capable of reading through intact ITR hairpins natively.³¹

This capability allows for the quantification of ITR integrity. Rather than a binary “present/absent” check, nanopore sequencing can quantify the ratio of plasmids with full-length ITRs versus those with truncations (e.g., “70 bp deleted mutant”).²¹ Such quantitative insights enable process engineers to optimize bacterial strains (e.g., using recombination-deficient lines) and culture conditions to minimize the propagation of defective plasmids.⁵¹
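The “species census” described above can be sketched in a few lines. This is a minimal illustration under strong assumptions: each read is taken to represent one complete linearized molecule, and copy number is inferred from read length alone. A production workflow would instead confirm junctions by alignment. The function names, the 10% tolerance, and the example lengths are all hypothetical.

```python
from collections import Counter
from typing import Optional

def copy_number(read_len: int, monomer_len: int, tolerance: float = 0.1) -> Optional[int]:
    """Return the nearest whole number of plasmid copies, or None when the read
    length deviates from that multiple by more than `tolerance` x monomer length."""
    n = round(read_len / monomer_len)
    if n < 1 or abs(read_len - n * monomer_len) > tolerance * monomer_len:
        return None
    return n

def species_census(read_lengths, monomer_len: int) -> Counter:
    """Count reads per structural species (monomer, dimer, trimer, ...)."""
    labels = {1: "monomer", 2: "dimer", 3: "trimer"}
    counts = Counter()
    for length in read_lengths:
        n = copy_number(length, monomer_len)
        counts["other" if n is None else labels.get(n, "higher multimer")] += 1
    return counts

# Toy read lengths for a 6,000-bp plasmid
census = species_census([5980, 6010, 12050, 3000, 18100, 6020], monomer_len=6000)
print(dict(census))
```

Reads that fall outside every length window (here, the 3,000-nt read) are binned as “other” and would need sequence-level inspection to tell fragments from genuine deletion variants.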

Contamination profiling: a sequence-resolved taxonomy

Beyond the target plasmid, pDNA preparations must be assessed for composition-level impurities. Nanopore sequencing enables a comprehensive impurity profile that integrates with downstream vector QC:

1. Cross-Plasmid Carryover: In facilities handling multiple vectors, cross-contamination between helper, packaging, and transfer plasmids is a risk. Nanopore sequencing can identify trace levels of contaminating plasmids by mapping reads against a facility’s plasmid database, preventing the propagation of incorrect genetic elements into the viral vector production stage.^52,53

2. Host Genomic DNA: Residual E. coli genomic DNA is a safety-relevant process impurity. While qPCR provides total mass quantitation, nanopore sequencing offers a “taxonomy of impurities,” identifying the specific genomic origins of contaminants.^54–56 This can reveal if specific genomic loci (e.g., mobile elements or transposons) are preferentially co-purifying with the product.
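Once each read has been mapped, the impurity categories listed above reduce to a tally by reference class. The sketch below assumes that an upstream aligner has already assigned one best-hit reference name to every read. The reference names, the class labels, and the `impurity_profile` helper are hypothetical placeholders, not part of any facility database or published tool.

```python
from collections import Counter

# Hypothetical mapping from reference name to impurity class
REFERENCE_CLASS = {
    "pTransfer_GOI": "target plasmid",
    "pHelper": "cross-plasmid carryover",
    "pRepCap": "cross-plasmid carryover",
    "Ecoli_chromosome": "host genomic DNA",
}

def impurity_profile(assigned_refs, reference_class=REFERENCE_CLASS) -> dict:
    """Return the fraction of reads per class, given one best-hit reference per read.
    Reads hitting a reference that is not in the mapping are reported as 'unclassified'."""
    counts = Counter(reference_class.get(ref, "unclassified") for ref in assigned_refs)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: n / total for cls, n in counts.items()}

# Toy run of 1,000 reads
reads = (["pTransfer_GOI"] * 985 + ["pHelper"] * 5
         + ["Ecoli_chromosome"] * 9 + ["unknown_contig"])
profile = impurity_profile(reads)
for cls, frac in sorted(profile.items(), key=lambda kv: -kv[1]):
    print(f"{cls}: {frac:.1%}")
```

Note that these are read-count fractions. Converting them to mass fractions would require weighting by read length, and neither figure replaces a validated residual DNA assay.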

In summary, nanopore sequencing elevates plasmid QC from a patchwork of fragmented assays to a holistic molecular analysis. It concurrently verifies sequence identity,³⁴ quantifies structural heterogeneity at repetitive elements, and screens for sequence-resolved impurities. As workflows standardize, many groups are shifting to ONT for primary plasmid validation, reserving Sanger sequencing only for targeted confirmation of specific loci.⁵⁷ However, it is important to note that while nanopore sequencing excels at sequence verification, orthogonal methods (e.g., capillary gel electrophoresis) remain essential for assessing plasmid topology (supercoiled fraction), which is typically lost during library linearization.

THE DELIVERY VEHICLES: VIRAL VECTOR CHARACTERIZATION

Recombinant AAV: Integrity, ITRs, and contaminants

Recombinant AAV (rAAV) vectors serve as the primary delivery vehicle for in vivo gene therapy due to their favorable safety profile. However, an AAV vector batch is a heterogeneous mixture: capsids may contain full genomes, truncated partial genomes, concatemers, or be entirely empty.¹ Furthermore, unintended DNA species—such as residual plasmid sequences or host–cell chromatin—can be co-packaged during production.³

Traditional analytical methods provide only partial visibility into this complexity. While techniques such as analytical ultracentrifugation (AUC) or charge detection mass spectrometry effectively separate empty from full capsids, they cannot distinguish whether a “full” capsid contains an intact genome or foreign DNA of a similar size. Similarly, PCR-based methods (qPCR/ddPCR) rely on targeted amplification. While dual-probe ddPCR strategies can estimate genome integrity by comparing 5′- and 3′-end signals,⁶ these indirect measures fail to characterize the specific nature of internal deletions or identify unexpected sequences.¹³ Long-read sequencing addresses these limitations by enabling the sequencing of whole vector genomes from ITR to ITR.

Evolution of long-read AAV profiling

The utility of long-read sequencing for AAV characterization was first established using PacBio SMRT sequencing. In a seminal 2018 study and subsequent optimizations, Tai et al. introduced “AAV Genome Population sequencing (AAV-GPseq),” revealing that vector preparations appearing homogeneous by gel electrophoresis often contained <50% true full-length genomes and demonstrating how vector design influences truncation heterogeneity.^27,58 They also identified “reverse-packaged” genomes containing plasmid backbones and host–vector chimeras.

While early SMRT protocols required high DNA input, recent advancements in ONT have improved throughput and accessibility. A 2025 study by Dunker-Seidler et al. utilizing the PromethION platform with R10.4.1 chemistry demonstrated comprehensive profiling of a clinical rAAV9 batch.²¹ The study reported that modern nanopore chemistry yielded length profiling and structural resolution comparable to PacBio HiFi. While raw single-molecule accuracy (approx. Q23) is lower than HiFi (>Q30), it proved sufficient for differentiating viral species while requiring significantly lower DNA input. The authors concluded that nanopore sequencing effectively identifies product-related impurities and provides a complete distribution of vector genome lengths.²¹

Resolving ITRs

A critical advantage of nanopore sequencing is the ability to resolve ITRs. The ∼145 bp AAV ITR forms a stable GC-rich hairpin that inhibits standard polymerases, making it historically difficult to sequence via Sanger or Illumina methods without extensive protocol modifications.

Nanopore sequencing can read through native ITR hairpins, enabling the end-to-end verification of vector genomes.⁵⁹ This capability allows for the direct detection of:

• ITR Truncations: Identifying genomes lacking one ITR, a phenomenon often associated with packaging genomes exceeding the ∼5.2 kb limit.³¹

• Replication Errors: Detecting point mutations or duplications within the ITR “T-tract” or stem regions.³²

• Sequence Integrity: Validating read-through across difficult upstream motifs, such as GC-rich insulator sequences, which often cause drop-outs in short-read data.⁵¹

Genome length distribution and truncation analysis

Nanopore sequencing provides a direct readout of the physical length of every encapsidated molecule. These data can be visualized as a length histogram (Fig. 3A), revealing subpopulations of truncated species (e.g., “half-genomes” or snapback genomes) that may be missed by bulk sizing methods. Quantifying the “full-length genome fraction” is essential for predicting potency, as partial genomes generally fail to express the transgene, and for safety, as they may integrate aberrantly. This sequence-based metric is increasingly serving as an informative orthogonal tool to cross-reference statistical models derived from ddPCR. Furthermore, the technology holds promise for orthogonal titer estimation. While currently widely used for relative abundance, the incorporation of well-characterized molecular spike-in standards allows nanopore sequencing to assist in estimating physical titer, serving as a supportive quantitative tool alongside ddPCR results. This emerging capability supports the industry’s shift toward multiattribute methods (MAMs), potentially enabling the simultaneous reporting of genome integrity, identity, and relative titer estimation from a single dataset.
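The two metrics discussed above, the full-length genome fraction and a spike-in-based titer estimate, can be written down as simple ratios. The sketch below is illustrative only: the ±5% length window, the function names, and the example numbers are assumptions, and the spike-in calculation presumes that vector genomes and the spike-in standard are recovered, ligated, and sequenced with equal efficiency, which may not hold in practice.

```python
def full_length_fraction(read_lengths, expected_len: int, tolerance: float = 0.05) -> float:
    """Fraction of reads whose length lies within +/- tolerance of the expected
    ITR-to-ITR genome length."""
    lower = expected_len * (1 - tolerance)
    upper = expected_len * (1 + tolerance)
    full = sum(lower <= n <= upper for n in read_lengths)
    return full / len(read_lengths)

def spike_in_copies(vector_reads: int, spike_reads: int, spike_copies_added: float) -> float:
    """Estimate vector genome copies from the vector-to-spike-in read ratio.
    Assumes equal capture efficiency for vector and spike-in molecules."""
    return spike_copies_added * vector_reads / spike_reads

# Toy length profile for a 4,700-nt vector genome
lengths = [4700, 4690, 2350, 4710, 1200, 4705, 2400, 4695]
print(f"Full-length fraction: {full_length_fraction(lengths, 4700):.3f}")

# 9,000 vector reads and 1,000 spike-in reads after adding 1e9 spike-in copies
print(f"Estimated vector copies: {spike_in_copies(9000, 1000, 1e9):.2e}")
```

A histogram of `lengths` gives the length distribution itself; the fraction above is only a single summary of it.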

Figure 3.

Resolving structural heterogeneity and sequence-resolved impurities at single-molecule resolution. (A) AAV genome integrity analysis. Nanopore sequencing reads individual vector genomes from end-to-end, distinguishing distinct subpopulations: full-length genomes (Type 1), truncated genomes lacking one ITR (Type 2), and self-complementary or “snapback” genomes (Type 3). In contrast, traditional short-read NGS (inset) typically shows only a pile-up coverage plot with characteristic “drop-offs” at GC-rich ITR regions, failing to resolve the connectivity or specific truncation structure of individual molecules. (B) Visualization of reverse-packaged impurities. A schematic of a chimeric read generated by a reverse packaging event. Long-read sequencing captures the physical linkage between the plasmid backbone (red), the ITR hairpin, and the transgene (blue) in a single continuous read, definitively identifying the impurity source as packaging plasmid carryover rather than random host DNA. (C) mRNA poly(A) tail profiling. Comparative histograms of poly(A) tail length distributions derived from direct RNA sequencing. A “Good Batch” (top) exhibits a sharp, narrow peak indicating precise enzymatic polyadenylation and process control. A “Bad Batch” (bottom) shows a diffuse, platykurtic distribution with a leftward skew, indicative of enzymatic inconsistency or RNA degradation. These profiles provide a high-resolution metric for mRNA stability and potency that bulk average sizing cannot reveal. NGS, next-generation sequencing.

Limitations in AAV characterization

Despite these capabilities, it is critical to state that nanopore sequencing is not a validated approach for the absolute quantification of truncated AAV genomes. Even with the inclusion of molecular spike-in DNA, quantitative biases can arise from differential extraction efficiencies between full and partial capsids, adapter ligation biases during library preparation, and the preferential pore loading of shorter DNA fragments. Consequently, while it provides unparalleled structural insight, nanopore sequencing remains an orthogonal characterization tool. Established methods such as AUC or capillary electrophoresis (CE) remain indispensable for quantitative evaluation.

Sequence-resolved impurity profiling

Perhaps the most significant safety application of nanopore sequencing is the identification of packaged contaminants. During production, helper plasmids, transfer plasmids, and host genomic DNA co-exist with the vector; fragments of these can be inadvertently packaged.¹³

• Reverse Packaging: Long-read data have confirmed the “reverse packaging” mechanism, where the viral packaging machinery extends beyond the ITR into the plasmid backbone, encapsidating antibiotic resistance genes or bacterial origins of replication²¹ (Fig. 3B). Crucially, resolving these backbone-ITR linkages typically necessitates ligation-based library preparation to preserve molecular connectivity, as transposase-based methods may fragment these diagnostic junctions.

• Host–Vector Chimeras: Studies have detected human genomic sequences fused to vector ends, constituting 1–2% of genomes in some preparations.²⁷ Unlike bulk residual DNA assays, sequencing identifies the source of the contamination.

• Random versus Hotspot Encapsidation: Recent GMP-grade analyses suggest that while some impurities are stochastic (“random origin”), others may be driven by specific sequence elements.⁶⁰

Because encapsidated DNA is protected from nuclease treatments (e.g., benzonase) used during purification, characterization and upstream process control are the primary mitigation strategies. The detection of plasmid backbone carryover has specifically driven the industry adoption of minicircle DNA and other backbone-free technologies to minimize the risk of packaging extraneous bacterial sequences.⁶¹

Lentiviral vectors: LTR stability and carryover

Lentiviral vectors (LVs), widely utilized for ex vivo therapies such as chimeric antigen receptor (CAR) T-cell manufacturing, present a distinct set of QC challenges compared with AAV. These vectors package a ∼9–10 kb RNA genome within an enveloped particle.^62,63 A critical structural feature is the LTR at both ends, which is essential for integration but also acts as a hotspot for homologous recombination.

Genome integrity and cryptic splicing

During vector production, recombination between the identical 5′ and 3′ LTRs in the transfer plasmid can lead to deletion of the internal transgene cassette. Furthermore, the lentiviral RNA genome is prone to aberrant processing events, such as cryptic splicing or premature termination, which result in truncated, nonfunctional transcripts being packaged into virions.

Nanopore sequencing offers a direct method to assess the structural integrity of the packaged RNA. A recent benchmarking study by Zeglinski et al. established an optimized long-read QC workflow, comparing nanopore direct RNA sequencing (dRNA-seq) against nanopore cDNA and PacBio cDNA methods.⁶⁴ The study found that ONT dRNA-seq provided superior coverage of the vector genome without the internal priming biases observed in cDNA approaches, which often skew read distribution toward the 3′ end.⁶⁴

Crucially, this “native RNA” approach enabled the robust identification of cryptic splice sites and cryptic poly(A) motifs that drove premature truncation of the vector genome.⁶⁴ The authors demonstrated that protocol refinements, such as artificial polyadenylation [to capture truncations lacking natural poly(A) tails], allowed for the quantification of full-length versus defective species. These insights directly informed vector design: by mutating identified cryptic motifs, the fraction of full-length, functional vector RNA was significantly increased.⁶⁴

Impurity profiling: Plasmid DNA, host DNA, and nonvector RNA

Unlike AAV, where nonvector DNA is encapsidated within the protein shell, LVs are enveloped particles that primarily package RNA. However, the packaged cargo is not restricted to the therapeutic vector genome. Recent long-read sequencing studies have revealed that substantial amounts of nonvector RNA—including host cell-derived transcripts and RNAs from packaging plasmids—can be inadvertently packaged into lentiviral particles.⁶⁵ In addition to these RNA impurities, residual plasmid DNA from the transient transfection process can co-purify with the product. Regulatory guidelines strictly limit residual host–cell and plasmid DNA levels (typically <10 ng/dose), necessitating sensitive detection methods.

Nanopore sequencing is well suited to characterizing these DNA impurities. Because viral preparations are often dominated by host nucleic acids, adaptive sampling (or “ReadUntil”) can be deployed to selectively enrich for low-abundance contaminants. A proof-of-concept study demonstrated an ∼8-fold enrichment of viral sequences from a complex human background by actively rejecting host DNA reads in real time.⁶⁶ The same strategy can instead be targeted at plasmid backbone sequences, providing a sequence-resolved profile of residual DNA that persists despite DNase treatment.
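The decision logic behind adaptive sampling can be sketched conceptually. The example below is a toy model and does not use the ONT ReadUntil API: it assumes that the first bases of each read are already basecalled, and it replaces a real aligner with exact k-mer lookup against a host index. The sequences, function names, and thresholds are hypothetical.

```python
def kmers(seq: str, k: int) -> set:
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def should_reject(read_prefix: str, host_index: set, k: int = 8,
                  min_hit_fraction: float = 0.5) -> bool:
    """Decide whether to eject a read, based on how many of its prefix k-mers
    occur in the host index (a stand-in for mapping against the host genome)."""
    prefix_kmers = kmers(read_prefix, k)
    if not prefix_kmers:
        return False  # prefix too short to judge: keep sequencing
    hits = len(prefix_kmers & host_index)
    return hits / len(prefix_kmers) >= min_hit_fraction

def fold_enrichment(target_fraction_adaptive: float, target_fraction_control: float) -> float:
    """Ratio of the target read fraction with and without adaptive sampling."""
    return target_fraction_adaptive / target_fraction_control

# Toy host "genome" and two read prefixes
host = "ACGTACGTTAGCTAGCTAGGATCCGATCGATTACGCTAGC"
host_index = kmers(host, 8)
host_read = host[5:25]                      # derived from the host: ejected
viral_read = "TTTTGGGGCCCCAAAATTTTGGGG"     # shares no 8-mer with the host: kept

print(should_reject(host_read, host_index))   # True
print(should_reject(viral_read, host_index))  # False
```

In this “depletion” mode, every read that does not look like host is kept. Enriching for plasmid backbone instead would invert the test, keeping only reads that match a backbone index.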

Limitations in lentiviral profiling

As with AAV vectors, the application of nanopore sequencing to lentiviral QC is subject to significant quantitative limitations. The accurate quantification of intact versus defective lentiviral RNA genomes is hindered by the inherent fragility of long RNA molecules (∼9–10 kb), which are prone to degradation during extraction and handling. This makes it challenging to distinguish true biological premature truncations from in vitro degradation artifacts. Furthermore, if cDNA-based workflows are utilized to mitigate input limitations, biases from internal priming and template switching during reverse transcription (RT) can distort the relative abundance of viral transcript forms and create artificial recombinants. Therefore, nanopore sequencing currently serves as an advanced characterization tool rather than a validated quantitative assay for LVs.

Safety: Replication-competent lentivirus

A primary safety concern for LVs is the potential generation of replication-competent lentivirus (RCL) via recombination between the transfer vector and packaging plasmids.⁶⁷ Traditional RCL testing relies on culture-based assays that can take weeks to complete.⁶⁸ Nanopore sequencing offers a potential molecular alternative: by generating long reads that span the entire vector construct, it can distinguish between safe, self-inactivating vectors and recombination events that restore functional gag-pol-env sequences.⁶⁹ While currently supportive, this molecular assurance provides a level of resolution—distinguishing true recombinants from random noise—that short-read methods cannot achieve.

Summary: The molecular microscope for viral vectors

The application of nanopore sequencing to viral vector characterization, whether for AAV or lentivirus, marks a shift from inferential QC to direct molecular observation. We can now interrogate the actual contents of the therapeutic particle: verifying genome length, confirming the integrity of terminal repeats (ITRs/LTRs), and classifying sequence-resolved impurities.²¹

This “molecular microscope” capability closes critical gaps left by traditional methods. It resolves the “invisible” heterogeneity of vector preparations, such as determining whether a “full” AAV capsid contains a therapeutic genome or a host–vector chimera.²³ As the technology matures, with reduced costs and validated GMP implementations already emerging, long-read sequencing is poised to evolve from an orthogonal characterization tool into a routine release assay, underpinning the next generation of higher-quality genetic medicines.

THE NEW ERA: mRNA THERAPEUTICS AND VACCINES

Messenger RNA therapeutics—exemplified by the rapid deployment of COVID-19 vaccines—have become a central pillar of the CGT field. Manufacturing typically involves in vitro transcription (IVT) to generate synthetic mRNA incorporating modified nucleotides (e.g., N1-methylpseudouridine, m¹Ψ) and a poly(A) tail, followed by formulation into lipid nanoparticles.

Ensuring the quality of these transcripts presents unique CMC challenges. CQAs include the precise sequence identity, the integrity of the 5′ cap and poly(A) tail, and the absence of abortive transcripts or double-stranded RNA (dsRNA) by-products.^70–72 Among these, dsRNA by-products from IVT constitute a major immunostimulatory impurity and are routinely monitored and removed during downstream purification.^73,74 Traditional analytics—such as CE for sizing and Illumina RNA-seq for identity—struggle to resolve these features simultaneously. CE lacks sequence resolution, while standard NGS requires RT and PCR, which erase base modification signals and bias the readout of homopolymeric poly(A) tails.

dRNA-seq on the nanopore platform offers a solution by translocating native RNA molecules through the sensor. This enables the simultaneous interrogation of sequence, tail length, and chemical modifications on the same single molecule.^75,76

Poly(A) tail quality: Length and composition

The poly(A) tail is a determinant of mRNA stability and translational efficiency. Tail heterogeneity or truncation can significantly impact therapeutic potency. Nanopore dRNA-seq enables direct poly(A) tail length profiling at single-molecule resolution. •

Length Profiling: Algorithms such as Nanopolish and Tailfindr infer tail length from the raw ionic current dwell time rather than basecalling, which is prone to error in long homopolymers. These inferred lengths correlate well with orthogonal benchmarks, allowing manufacturers to verify that the bulk product meets design specifications (e.g., “100 ± 10 nt”) and to detect subpopulations of truncated tails caused by polymerase slippage or RNase contamination (Fig. 3C).

Platform Considerations and Algorithmic Frameworks: The accuracy and reproducibility of poly(A) tail profiling via nanopore dRNA-seq are intrinsically linked to both the sequencing chemistry (e.g., the specific motor protein and buffer conditions defined in commercial kits such as SQK-RNA004) and the bioinformatic algorithms employed. Comprehensive benchmarking and application studies have validated the use of tools, such as Nanopolish polya, for robust tail length estimation across diverse biological samples.⁷⁷ For the detailed characterization of nonadenosine residues within the tail—a critical attribute for mRNA stability—specialized frameworks such as Ninetails have been developed to directly parse these heterogeneities from the raw sequencing signal.⁷⁸

Tail Composition: Recent studies have revealed that therapeutic mRNA tails may not be pure adenosine homopolymers. “Mixed” tails containing incorporated C, G, or U residues can retard deadenylation and enhance stability. Using frameworks such as Ninetails, dRNA-seq can quantify these nonadenosine residues, establishing tail composition as a measurable quality attribute.
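As a minimal sketch of how the tail-length check described above might be summarized, the snippet below takes per-read tail length estimates (for example, values exported by a signal-based estimator) and reports the fraction within a design window and the fraction that look truncated. The function name, the 60-nt truncation cutoff, and the example values are illustrative assumptions, not part of any published workflow.

```python
from statistics import median

def summarize_tails(tail_lengths, target=100, tolerance=10, truncated_below=60):
    """Summarize per-read poly(A) tail length estimates (in nt).

    target/tolerance mirror a design specification such as "100 +/- 10 nt";
    truncated_below is a hypothetical cutoff for flagging short tails.
    """
    if not tail_lengths:
        raise ValueError("no tail length estimates supplied")
    n = len(tail_lengths)
    in_spec = sum(1 for t in tail_lengths if abs(t - target) <= tolerance)
    truncated = sum(1 for t in tail_lengths if t < truncated_below)
    return {
        "n_reads": n,
        "median_nt": median(tail_lengths),
        "fraction_in_spec": in_spec / n,
        "fraction_truncated": truncated / n,
    }

# Made-up estimates: eight reads near the target and two truncated tails.
example = [98, 102, 95, 110, 101, 40, 35, 99, 104, 97]
summary = summarize_tails(example)
```

Reporting the truncated fraction separately from the median matters here because a small truncated subpopulation can leave the median almost unchanged.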

Base modifications and identity

A hallmark of effective mRNA therapies is the substitution of uridine with N1-methylpseudouridine (m¹Ψ) to suppress innate immune sensing through Toll-like receptors (TLRs) and the PKR and OAS–RNase L pathways^79–85 and to enhance translational efficiency through eIF2α-dependent and independent mechanisms.⁸⁶ Standard cDNA sequencing cannot distinguish m¹Ψ from U, because both pair with adenine during reverse transcription and are therefore read as the same base. Nanopore sensors, however, detect the native physicochemical footprint of the modification.^87,88

Detection Mechanism: Modified bases often manifest as systematic basecalling errors or “glitches” at specific positions. Tools such as ELIGOS (Epitranscriptional Landscape Inferring from Glitches of ONT Signals) leverage the increased “Error of Specific Bases” (%ESB) in native RNA relative to an unmodified reference to infer modification sites.

Vaccine QC: This capability allows for the direct verification of m¹Ψ incorporation rates. For vaccine lots, dRNA-seq can confirm that essentially all uridine positions carry the modification, differentiating the drug substance from potential contaminants or process failures that incorporate unmodified U.
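The error-based detection idea above can be illustrated with a simplified comparison of per-position error rates between a native sample and an unmodified control. This is only a sketch: the fixed percentage-point threshold is a hypothetical stand-in for the statistical testing a dedicated tool would perform, and the counts are invented.

```python
def esb_percent(errors, depth):
    """Percent error of specific bases (%ESB) at one reference position."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return 100.0 * errors / depth

def flag_candidate_sites(native, control, min_delta=20.0):
    """Return (position, delta) pairs where native %ESB exceeds control.

    native / control: dict mapping position -> (error_count, depth).
    min_delta is an assumed threshold in percentage points.
    """
    flagged = []
    for pos in sorted(native.keys() & control.keys()):
        delta = esb_percent(*native[pos]) - esb_percent(*control[pos])
        if delta >= min_delta:
            flagged.append((pos, round(delta, 1)))
    return flagged

# Invented counts: positions 10 and 12 show elevated error in native RNA.
native = {10: (45, 100), 11: (4, 100), 12: (60, 120)}
control = {10: (5, 100), 11: (3, 100), 12: (6, 120)}
candidates = flag_candidate_sites(native, control)
```

Only positions covered in both samples are compared, so a site missing from the control is never flagged by default.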

Transcript integrity and by-products

IVT reactions are prone to generating impurities such as abortive (truncated) transcripts or 3′-extended read-through products (e.g., if the DNA template is not fully linearized). Because nanopore sequencing preserves full-length connectivity:

Truncation Analysis: A length histogram of mapped reads can reveal distinct peaks corresponding to abortive species, which might act as dominant-negative inhibitors or immunogenic contaminants.

5′ Capping Efficiency: Distinguishing fully capped (Cap1) species from incompletely methylated (Cap0) or uncapped (5′-triphosphate or 5′-OH) species is a critical industrial challenge, as the native ionic current shift caused by the 5′ cap is extremely subtle. In humans, the innate immune sensor IFIT1 specifically binds to and inhibits the translation of mRNAs lacking 2′-O-methylation (Cap0), while fully methylated Cap1 structures evade this recognition.^89,90 Because standard native protocols often lack the signal-to-noise ratio to distinguish these structures, specialized library preparation strategies are required to “amplify” the signal. Approaches such as enzymatic remodeling, in which the cap is specifically cleaved to allow ligation of a distinct sequencing adapter, transform the elusive physicochemical signal into a clear, sequence-based readout. This is crucial for accurately quantifying capping efficiency, a key determinant of translational potency, without false positives arising from signal noise.^91,92 In parallel, orthogonal analytical methods such as CE and liquid chromatography-mass spectrometry continue to be refined, providing complementary approaches for the detailed characterization of capping efficiency and the identification of cap-end impurities under various stress conditions.^93,94

Fusion Transcripts: Rare events such as tandem transcription (multimers) are immediately visible as double-length reads, a structural anomaly often invisible to short-read assembly.

However, a critical limitation remains regarding dsRNA, a potent immunogenic impurity. Because nanopore dRNA-seq requires the motor protein to translocate a single linear RNA strand, dsRNA by-products are either unwound or fail to enter the pore efficiently. Consequently, dRNA-seq cannot currently serve as a quantification method for dsRNA content, necessitating continued reliance on orthogonal “gold standard” assays such as J2 antibody-based immunoblotting or enzyme-linked immunosorbent assay (ELISA).

Limitations: Throughput and input requirements

Despite its power, dRNA-seq has distinct limitations compared with cDNA-based methods.

Input Material: Direct RNA sequencing typically requires significantly higher mass input (often 500 ng–1 µg) compared with PCR-amplified cDNA protocols. This can be prohibitive for scarce samples or early-stage process development.

Throughput and Yield: The data yield per flow cell for dRNA-seq is generally lower than for DNA sequencing due to the slower translocation speed and motor protein dynamics.

Accuracy: While improving, the raw single-read accuracy of native RNA (typically ∼96–98%) remains lower than that of DNA (Q20–Q25, i.e., ∼99–99.7%). Therefore, dRNA-seq is best used as a characterization tool (confirming structure/modifications) rather than for detecting ultra-low frequency SNVs, where deep Illumina sequencing remains superior. Additionally, the platform has an “analyte blind spot”: dsRNA. While it excels at profiling intrinsic attributes such as poly(A) tails and modifications, it does not currently offer a validated workflow to quantify double-stranded impurities. For this specific CQA, traditional immunochemical methods remain indispensable. Concurrently, the standardization and sensitivity of these orthogonal methods, such as antibody-based sandwich ELISA or dot blot assays using the J2 antibody, continue to be refined to meet the stringent requirements of mRNA vaccine QC.^95,96 These efforts are supported by a deeper understanding of dsRNA formation mechanisms, including aberrant RNA-dependent RNA polymerase activity during IVT, and the development of robust purification strategies such as selective binding with chaotropic agents or affinity chromatography to minimize this critical impurity.^97,98

In summary, nanopore sequencing enables a comprehensive “characterization study” on every mRNA batch. By verifying sequence, tail length, and modification status in a single assay, it aligns closely with quality by design (QbD) principles. As throughput scales, it is foreseeable that metrics such as “poly(A) tail length distribution” derived from sequencing could evolve from informational characterization to formal release specifications.

ADVANCED CAPABILITIES: EPIGENETICS AND ADAPTIVE SAMPLING

Beyond primary sequence verification, modern gene therapy products possess critical epigenetic attributes that influence safety and potency. Nanopore sequencing’s ability to directly sense nucleotide modifications allows manufacturers to explore these advanced quality layers. Furthermore, its real-time data streaming enables adaptive sampling—the selective enrichment of rare targets in complex backgrounds. Here, we discuss the detection of bacterial DNA methylation signatures and their implications for immunogenicity.

Epigenetic signatures: Dam/Dcm patterns and CpG motifs

Plasmid DNA produced in E. coli carries distinct bacterial epigenetic marks: N6-methyladenine at GATC sites (Dam methylation) and 5-methylcytosine at CCWGG sites (Dcm methylation).^99–101 In contrast, mammalian DNA is characterized by 5-methylcytosine at CpG dinucleotides. Crucially, bacterial plasmids typically lack CpG methylation.

Immunological implications

The host immune system relies on these methylation differences to distinguish self from nonself. Unmethylated CpG motifs—abundant in bacterial backbones—act as potent pathogen-associated molecular patterns. They bind to TLR9, triggering innate immune signaling and nuclear factor kappa B activation. While this adjuvant effect is beneficial for DNA vaccines, it is often detrimental for gene therapy, where it can lead to inflammation or transgene silencing.^102–105 Consequently, manufacturers may employ strategies such as CpG-free plasmid design or in vitro CpG methylation to “mask” the vector from TLR9 surveillance.^106,107

Direct methylation sensing

Historically, verifying methylation status required laborious indirect methods such as bisulfite sequencing or methylation-sensitive restriction digestion. Nanopore sequencing transforms this by detecting modifications natively.^28,108–110 The passage of a modified base (e.g., 6mA or 5mC) through the pore induces a characteristic shift in ionic current compared with its canonical counterpart (Fig. 4B). Advanced basecalling models (e.g., in Dorado or Megalodon) can now reliably call 5mC and 6mA sites across the entire plasmid.

Figure 4.

Mechanistic basis of nanopore sequencing: motor protein control and single-molecule sensing. (A) Molecular mechanism and adaptive sampling. The sequencing complex consists of a motor protein (helicase, purple) atop a pore protein (blue) embedded in a robust synthetic polymer membrane. The motor protein unwinds the double-stranded DNA (dsDNA), feeding a single strand (ssDNA) through the nanopore at a controlled speed. In adaptive sampling’s “depletion mode,” if the software recognizes the initial sequence as a nontarget (e.g., host DNA), it reverses the voltage field. This physically ejects the ssDNA back out of the pore (upwards) before the entire strand is sequenced, freeing the pore for the next molecule. (B) Native epigenetic sensing. As the ssDNA translocates through the sensing region (reader head), it disrupts the ionic current. Native modifications, such as a methyl group on a cytosine (5mC), possess a distinct physical footprint compared with unmodified bases. This causes a measurable perturbation or “shift” in the raw current signal (squiggle trace), allowing direct detection of epigenetic marks (e.g., Dam/Dcm bacterial methylation) without bisulfite conversion.

Applications in QC

This capability supports several emerging QC applications:

Strain Validation (Dam⁻/Dcm⁻): To reduce bacterial signatures, plasmids are often propagated in methylase-deficient strains. Nanopore sequencing can quantitatively confirm the phenotype of the production host. A plasmid from a functional Dam⁻/Dcm⁻ strain will show background-level signals at GATC/CCWGG motifs, whereas contamination with a standard strain will reveal near-100% methylation at these sites.^99–101

Verification of In Vitro Methylation: For processes utilizing CpG methyltransferase (M.SssI) to dampen immunogenicity, nanopore sequencing provides a site-specific readout of enzymatic efficiency. It can determine, for example, that “95% of CpG sites are methylated,” highlighting any protected regions that remain unmethylated and potentially immunogenic.¹⁰⁶

Refining Basecalling Accuracy: Interestingly, the systematic basecalling errors caused by modifications, once considered a liability, are now leveraged for detection. Because specific modifications consistently perturb the signal, methylation-aware polishing tools can simultaneously correct the primary sequence consensus (restoring accuracy to >99.9%) and map the modification landscape.^28,45,108–112
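To make the strain-validation readout above concrete, the sketch below locates Dam (GATC) and Dcm (CCWGG) motifs on a circular plasmid sequence and averages per-site methylation frequencies. It assumes those frequencies have already been produced by a modification-aware basecalling workflow. The toy sequence, the function names, and the frequency values are hypothetical.

```python
import re

def motif_sites(sequence, motif_regex, motif_len):
    """Return 0-based start positions of a motif on a circular sequence."""
    seq = sequence.upper()
    # Append the first motif_len - 1 bases so matches can span the origin.
    wrapped = seq + seq[: motif_len - 1]
    # A lookahead reports overlapping matches as well.
    pattern = re.compile(f"(?=({motif_regex}))")
    return [m.start() for m in pattern.finditer(wrapped) if m.start() < len(seq)]

def mean_site_methylation(sites, site_frequencies):
    """Average the methylated-read fraction over the covered motif sites.

    site_frequencies: dict mapping site position -> fraction of reads
    called as modified (assumed to come from an upstream caller).
    Returns None when no site has a call.
    """
    covered = [site_frequencies[s] for s in sites if s in site_frequencies]
    if not covered:
        return None
    return sum(covered) / len(covered)

# Toy circular sequence: one GATC at position 4 and one spanning the origin.
plasmid = "TCAAGATCCCTGGAAGA"
dam_sites = motif_sites(plasmid, "GATC", 4)
dcm_sites = motif_sites(plasmid, "CC[AT]GG", 5)
dam_level = mean_site_methylation(dam_sites, {4: 0.98, 15: 0.96})
```

Under these assumptions, a mean near 1.0 would be consistent with a methylation-competent host, while a value near the caller's background would be expected for a Dam⁻/Dcm⁻ strain.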

Regulatory outlook

While epigenetic profiling is not yet a mandatory release test, it represents a deep level of product understanding consistent with QbD principles. As the industry moves toward highly engineered vectors, the ability to demonstrate that a product is not only genetically correct but also “epigenetically optimized” (e.g., devoid of bacterial methylation patterns) offers a compelling safety argument to regulators.

Adaptive sampling: Targeted sequencing in complex samples

Beyond passive data collection, nanopore sequencing introduces a capability unique to real-time electronic sensing: adaptive sampling (also known as “ReadUntil”). This feature allows the sequencing device to actively select or reject individual DNA molecules based on their sequence identity as they transit the pore, effectively performing “software-defined enrichment” without the need for primers, baits, or complex library preparation.^60,113,114

Mechanism and scalability

Adaptive sampling operates by analyzing the initial segment (approximately 400 bases) of a DNA strand. This sequence is aligned in real-time to a user-defined digital reference. If the strand matches a target of interest (enrichment mode), sequencing continues; if it does not (or matches a “blocklist” in depletion mode), the software reverses the voltage across that specific pore, ejecting the strand and freeing the channel to capture a new molecule (Fig. 4A).¹¹⁵
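The accept-or-eject logic described above can be sketched as a toy decision function. Real implementations align the read prefix to the reference with a dedicated aligner; the shared k-mer fraction used here is a deliberately simple stand-in, and the function names, k, and threshold are assumptions for illustration. Reverse-complement matching is omitted for brevity.

```python
def kmers(seq, k):
    """Return the set of k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def decide(read_prefix, reference, mode="enrich", k=8, min_shared=0.5):
    """Return 'sequence' or 'eject' for a strand, given its first bases.

    mode='enrich': keep strands matching the reference, eject the rest.
    mode='deplete': eject strands matching the reference, keep the rest.
    """
    if mode not in ("enrich", "deplete"):
        raise ValueError("mode must be 'enrich' or 'deplete'")
    prefix_kmers = kmers(read_prefix.upper(), k)
    if not prefix_kmers:
        return "sequence"  # prefix too short to classify; keep reading
    shared = len(prefix_kmers & kmers(reference.upper(), k)) / len(prefix_kmers)
    matches = shared >= min_shared
    if mode == "enrich":
        return "sequence" if matches else "eject"
    return "eject" if matches else "sequence"

# Hypothetical vector reference and two read prefixes.
reference = "ACGTTGCAAGGCTTAACCGGATCGTACGATCAGT"
on_target = reference[5:25]
off_target = "TTTTTTTTTTTTTTTTTTTT"
```

The same function covers both modes because depletion is simply the inverse decision on the same match test.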

Recent advancements have scaled this capability to high-throughput platforms. Munro et al. demonstrated barcode-aware adaptive sampling on PromethION flow cells, successfully targeting unique gene panels across three multiplexed human genomes simultaneously. They achieved 7–15× enrichment of target regions. Crucially, the study highlighted a dual utility: while targets were enriched, the “rejected” reads were not discarded but used to generate accurate copy number variation profiles, validating the method’s ability to provide both targeted depth and genome-wide structural context in a single run.¹¹⁶

Applications in gene therapy QC

This “search-and-sequence” capability addresses specific CMC challenges where the signal of interest is obscured by a high-background matrix.

Biodistribution and integration site analysis (enrichment mode)

In biodistribution studies or patient monitoring, vector genomes are rare events within a vast background of host DNA. Sequencing total DNA is inefficient and costly.

Strategy: By setting the vector genome as the target, the system selectively sequences vector-positive strands while ejecting the overwhelming majority of host DNA.

Precedent: This approach has been validated in clinical metagenomics, where pathogen DNA was enriched ∼8-fold in clinical bronchoalveolar lavage fluid samples by depleting human host reads.¹¹⁷ Applied to gene therapy, this could enable the detection of rare integration events or low-copy persistence in patient biopsies without PCR bias.

Impurity clearance and contaminant screening (depletion mode)

For purified vector preparations, the analytical goal is often the inverse: detecting rare impurities (host DNA, residual plasmids) amid a dominant population of the vector product.

Strategy: Manufacturers can define the vector genome as the “depletion target.” The sequencer actively rejects the main product (AAV or lentivirus reads), dedicating its sequencing capacity to “everything else.”

Outcome: This effectively enriches for nonvector contaminants, increasing the sensitivity for detecting residual E. coli genomic fragments or plasmid backbones that might otherwise fall below the limit of detection in a standard run. This method transforms the sequencer into a broad-spectrum impurity detector that requires no prior knowledge of the contaminant’s identity.

Flexibility and limitations

The primary advantage of adaptive sampling is its agility: targets are defined digitally in a .bed or .fasta file, allowing for immediate assay reconfiguration. If an unexpected signal (e.g., a strange plasmid fragment) is observed during a run, the targeting parameters can be updated on the fly to enrich that specific sequence for detailed characterization.

However, current limitations exist. The enrichment factor typically plateaus at 5–10-fold. While this is lower than hybridization-based capture methods (which can reach >100×), the speed and simplicity of the workflow make it well suited to rapid QC. Additionally, the rejection process introduces “pore dead time,” slightly reducing the total data yield. Nevertheless, for regulatory applications, such as screening for replication-competent viruses (RCL/RCA) or proving the absence of specific transforming sequences, a 10-fold increase in effective coverage can be the deciding factor in detecting a trace safety risk.
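A rough way to see why a 10-fold enrichment can matter is to model the number of reads from a rare impurity as a Poisson count. The sketch below treats enrichment as a simple multiplier on the expected number of impurity reads and ignores pore dead time and yield loss, which is a simplifying assumption. The read count and impurity fraction are illustrative only.

```python
import math

def detection_probability(total_reads, impurity_fraction, enrichment=1.0, min_reads=1):
    """Probability of observing at least min_reads impurity reads.

    Assumes impurity reads follow a Poisson distribution with mean
    total_reads * impurity_fraction * enrichment.
    """
    expected = total_reads * impurity_fraction * enrichment
    p_below = sum(
        math.exp(-expected) * expected ** k / math.factorial(k)
        for k in range(min_reads)
    )
    return 1.0 - p_below

# Illustrative case: 100,000 reads, impurity at 1 in 100,000 molecules.
p_standard = detection_probability(100_000, 1e-5)
p_enriched = detection_probability(100_000, 1e-5, enrichment=10.0)
```

With these assumed inputs the expected impurity read count rises from 1 to 10, which moves detection of at least one read from roughly 63% to above 99.99%.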

In summary, adaptive sampling shifts nanopore sequencing from a passive observation tool to an active interrogation system. It allows QC laboratories to “tune” the sequencer to focus on the most critical quality attributes, whether that is the product itself or the impurities hiding in its shadow.

CURRENT LIMITATIONS AND REGULATORY OUTLOOK

While nanopore sequencing has matured into a robust research tool, its integration into regulated QC environments faces distinct hurdles (summarized in Table 1). Transitioning from a “discovery” mode to a “release testing” mode requires addressing challenges in accuracy, data infrastructure, and regulatory validation.

Table 1.

High-impact CQA matrix for gene therapy vectors and mRNA therapeutics: Nanopore utility and current limitations

Modality/Stage	CQA Category	CQA	Nanopore Readout (Key Metric)	Current Limitations/Technical Challenges	Typical Orthogonal Assays
Plasmid DNA	Identity	Full sequence identity (backbone + insert + ITRs)	FL consensus; rolling-circle consensus (NPlasmid-seq)	Accuracy for low-freq SNVs requires high depth or polishing	Sanger (amplicons), Illumina
	Purity	Structural heterogeneity (recombination/multimers)	SV spectrum; “Intact molecule fraction”	Bioinformatic quantification needs validation vs. CE/gel	Restriction digest, CE/Gel
	Epigenetics	Bacterial methylation (Dam/Dcm)	MOD native methylation calling (6mA/5mC)	Currently for characterization only; regulatory impact defined case-by-case	Bisulfite seq, restriction enzyme analysis
	Impurities	Host/cross-plasmid contamination	IMP taxonomic mapping and classification	Absolute quantification requires spike-in standards	qPCR (host DNA), PicoGreen
AAV vectors	Identity	Vector genome identity (ITR-to-ITR)	FL reads spanning full genome + ITRs	Homopolymer errors in ITRs (improved in R10.4.1 but require care)	qPCR ID, short-read NGS
	Purity	Genome integrity (full vs. truncated)	FL length distribution histogram	Not a validated approach for absolute quantification of truncations due to sequencing/extraction biases	ddPCR (5′/3′), AUC, CE
	Impurities	Packaged nonvector DNA (reverse packaging/host-chimera)	IMP sequence-resolved impurity ID and fraction	Detection limit for rare impurities depends on sequencing depth/adaptive sampling	qPCR/ddPCR panels, residual DNA
Lentiviral vectors	Identity	LTR integrity/SIN configuration; nonvector RNA packaging	FL reads spanning 5′ to 3′ LTR	Coverage bias if using cDNA (RT steps); subject to RT bias and RNA degradation limiting absolute quantification	ddPCR tiling, Sanger
Lentiviral vectors	Safety	Replication-competent lentivirus	Screening for recombination events	Sensitivity for ultra-rare RCL events needs proof vs. culture assays	Cell-culture RCL assays + ELISA[137]¹³⁶
mRNA/LNP	Identity	Sequence identity and integrity	FL full-length transcript reads	Error rate in raw reads limits SNV detection without consensus	RT-PCR + Sanger, Illumina
	Purity	Poly(A) tail length and composition	TAIL single-molecule sizing (dwell time)	Exact base-level tail sequence (e.g., 3′ U-tail) is computationally heavy	LC/CE sizing, enzymatic assays
	Identity	Base modifications (e.g., m1Ψ)	MOD detection via signal shifts (ESB/ELIGOS)	Requires complex training models; mostly qualitative/ratio-based currently	LC-MS/MS (nucleoside analysis)
	Impurities	Residual DNA template	IMP detection of DNA reads in RNA run	DNA extraction efficiency from LNP can bias results	qPCR (high sensitivity)

CQA, critical quality attribute; DS, drug substance; FL, full-length single-molecule reads; IMP, sequence-resolved impurities; LC-MS, liquid chromatography-mass spectrometry; LNP, lipid nanoparticle; MOD, base modifications; RCL, replication-competent lentivirus; SIN, self-inactivating; SNV, single-nucleotide variant; SV, structural variant; TAIL, poly(A) tail length.

Accuracy and error profiles

Historically, the primary critique of nanopore data was its raw read accuracy, particularly regarding homopolymer insertions/deletions (indels). Early pore chemistries (R9 series) hovered around 90–95% accuracy, insufficient for detecting SNVs without deep coverage.^118,119

However, the introduction of R10.4.1 pores combined with transformer-based basecallers (e.g., Dorado) has significantly closed this gap. The R10.4.1 dual-reader head measures a longer nucleotide context, reducing homopolymer errors.¹¹⁸ A 2025 GMP-grade study by Dunker-Seidler et al. reported that false indel errors in raw simplex reads dropped from ∼1.4% (V9 chemistry) to ∼0.55% (V14 chemistry). This improvement is critical for AAV QC, as it allows for the differentiation between genuine mutations in the poly-T tracts of ITRs and sequencing artifacts.¹¹⁸

Despite these gains, caution is warranted for low-frequency variant calling. Distinguishing a true variant present at 1% abundance from a systematic sequencing error remains challenging without high depth or replicate sequencing.^120,121 For applications requiring near-perfect single-molecule accuracy, duplex sequencing (in which the template and complement strands of the same DNA molecule are read consecutively and combined into one consensus read) offers raw accuracies >Q30 (99.9%), though at the cost of reduced throughput compared with standard protocols.
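Because accuracy is quoted both as a percentage and as a Phred quality score in this section, a short conversion sketch may help when comparing figures. It uses the standard definition Q = −10·log₁₀(error probability); the function names are illustrative.

```python
import math

def phred_to_accuracy(q):
    """Convert a Phred quality score to per-base accuracy (0 to 1)."""
    return 1.0 - 10 ** (-q / 10)

def accuracy_to_phred(accuracy):
    """Convert per-base accuracy (0 to 1, exclusive of 1) to a Phred score."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in [0, 1)")
    return -10 * math.log10(1.0 - accuracy)

q20 = phred_to_accuracy(20)  # one error in 100 bases
q30 = phred_to_accuracy(30)  # one error in 1,000 bases
```

Each 10-point increase in Q corresponds to a 10-fold reduction in error rate, which is why the step from Q20 to Q30 is larger than the percentages alone suggest.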

Bioinformatics: The GMP compliance gap

Perhaps the most significant barrier to routine adoption is not wet-lab chemistry, but software compliance.

Locked Workflows: Academic pipelines often rely on a constellation of open-source tools (e.g., Minimap2 and Samtools) that change frequently. In a GMP environment, software must be rigorously “locked” (version-controlled) and validated to ensure process stability.¹²²

21 CFR Part 11 Compliance: Regulatory standards require electronic records to ensure data integrity through audit trails and user access controls. While ONT offers the EPI2ME platform, developing a fully compliant, end-to-end bioinformatic pipeline that integrates seamlessly with Laboratory Information Management Systems remains a burden often shouldered by the manufacturer.^123–128

AI/ML Algorithm Validation: The “Deterministic Paradox”: A more fundamental challenge lies in validating the basecalling algorithms themselves. GMP regulations rely on “deterministic” processes—where the same input invariably yields the same output. However, modern nanopore basecallers (e.g., Dorado) utilize deep neural networks that are inherently probabilistic. Although models are “locked” during deployment, their “black box” nature complicates root-cause analysis for out-of-specification investigations. To address this, manufacturers should reference the U.S. Food and Drug Administration (FDA)’s framework for Software as a Medical Device and adopt a Predetermined Change Control Plan. This shifts validation to a lifecycle model, ensuring that model updates are prespecified and verified against synthetic “ground truth” datasets without compromising the assay’s validated state.¹²⁹

Standardization: The Sequencing Quality Control Consortium (SEQC2), involving FDA scientists, is working to establish best practices and reference standards. Standardizing metrics—such as defining exactly how “genome integrity” is calculated—is a prerequisite for cross-industry comparability.¹³⁰
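One simple way to picture a “locked” workflow is a manifest of tool and model versions whose fingerprint is recorded at validation and rechecked before each run. The sketch below is an assumption-laden illustration, not a compliance solution: the manifest keys and version strings are hypothetical placeholders, and a real GMP system would add audit trails and access control.

```python
import hashlib
import json

def manifest_fingerprint(manifest):
    """Return a SHA-256 fingerprint of a pipeline manifest.

    Keys are sorted before hashing so the fingerprint does not depend
    on the order in which entries were written.
    """
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def verify_locked(manifest, approved_fingerprint):
    """Check that the current manifest matches the validated one."""
    return manifest_fingerprint(manifest) == approved_fingerprint

# Hypothetical manifest recorded at method validation.
approved = {
    "basecaller_model": "hypothetical-model-v1",
    "aligner": "hypothetical-aligner-1.0",
    "min_read_qscore": 10,
}
approved_fp = manifest_fingerprint(approved)

# A later run with an updated model would fail the check.
drifted = dict(approved, basecaller_model="hypothetical-model-v2")
```

Any change to a version or parameter changes the fingerprint, which gives a cheap pre-run gate that can sit alongside formal change control.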

Implementation logistics: cost and turnaround

Unlike traditional assays, implementing nanopore sequencing shifts the resource burden from capital equipment to data infrastructure.

Capital Expenditure (CapEx): ONT devices (e.g., GridION and PromethION) function on a low-CapEx model, costing significantly less than high-throughput short-read sequencers (e.g., Illumina NovaSeq) or long-read competitors (e.g., PacBio Revio). This lowers the barrier to entry for smaller QC labs.

Operational Costs (OpEx) and Throughput: The cost per sample for ONT is competitive for low-to-medium batch sizes but may be higher than Illumina for ultra-high-throughput applications. However, for gene therapy lots (where sample volume is low but value is high), the cost is negligible compared with the value of the data.

Turnaround Time: ONT excels in speed. A typical library-to-report workflow can be completed in <24 h. In contrast, outsourced Sanger sequencing or culture-based biosafety assays can take weeks.

The “Hidden” Cost: The primary logistical challenge is IT infrastructure. Storing and processing terabytes of raw “pod5” signal data requires high-performance computing and long-term cold storage strategies that many QC labs historically did not need for simple PCR or ELISA data.

Regulatory recognition and the path forward

As of 2025, no gene therapy product lists nanopore sequencing as a sole release test in public filings. However, regulators are increasingly encouraging its use for characterization and comparability (see Supplementary Table S1 for a detailed mapping of CQAs to regulatory guidelines).¹⁵

The FDA and European Medicines Agency have acknowledged the utility of NGS for investigating vector integrity. The likely regulatory trajectory will follow a phased adoption^15,131:

Phase 1 (Current): Orthogonal Characterization. Used to troubleshoot “out-of-spec” results from legacy assays (e.g., identifying a mystery peak in CE).

Phase 2 (Near-term): Supportive Evidence in Filings. Data are included in IND/BLA submissions to demonstrate deep process understanding (e.g., “We confirmed the absence of plasmid backbone using long-read sequencing”).

Phase 3 (Future): Validated Release Assay. Replacing specific tests (e.g., using sequencing to replace gel electrophoresis for sizing and PCR for identity) once method validation guidelines (ICH Q2) are formally adapted for NGS.^131–133

In conclusion, while limitations in single-molecule accuracy and software compliance persist, they are being rapidly eroded by technological updates. The industry is moving toward a consensus that the depth of insight provided by nanopore sequencing—resolving the “black box” of vector integrity—outweighs the logistical challenges of adoption.

CONCLUSION

Nanopore sequencing has matured from a specialized research capability into a high-resolution analytical platform capable of addressing the most persistent challenges in gene therapy QC. By enabling the end-to-end reading of single molecules, this technology resolves structural and chemical attributes—from plasmid recombination events to mRNA poly(A) tail dynamics—that were previously inferred via indirect surrogate assays.¹³⁴ The applications reviewed here demonstrate that ONT offers a unified mechanism to interrogate CQAs across the entire manufacturing lifecycle.

For plasmid DNA, long reads provide a definitive check on the structural homogeneity of the starting material. By capturing full circular sequences, nanopore sequencing detects subtle recombination events in repetitive regions that short-read methods bridge and Sanger sequencing fails to resolve.^17,51 This capability allows manufacturers to validate upstream constructs with a level of confidence that mitigates the risk of propagating defective elements into downstream viral production.¹⁵

In the context of viral vectors, specifically AAV and lentivirus, the technology serves as a primary integrity assay. It reveals the true distribution of full versus truncated genomes and identifies sequence-resolved impurities, such as reverse-packaged plasmid backbones or host–vector chimeras, which are invisible to standard titration methods. This sequence-level visibility is equally critical for mRNA therapeutics, where direct RNA sequencing simultaneously profiles poly(A) tail length, capping efficiency, and base modification status (e.g., m¹Ψ incorporation) on single molecules, streamlining a battery of physicochemical tests into a single workflow.⁶⁰

Furthermore, advanced capabilities such as epigenetic profiling and adaptive sampling are expanding the boundaries of CMC characterization. The ability to map bacterial methylation patterns allows for a deeper assessment of immunogenic risk,⁴³ while software-driven enrichment enables the detection of ultra-rare contaminants in complex biological matrices without the bias of PCR amplification.²⁸

Looking forward, we anticipate a gradual shift where long-read sequencing evolves from an orthogonal characterization tool toward a more integrated role in the CMC framework, potentially supporting lot release testing once quantification biases are addressed and fully validated. While challenges remain, specifically regarding the validation of bioinformatic pipelines and the standardization of accuracy metrics for regulatory filings,²⁸ the trajectory is clear. The transition from fragmented, attribute-specific assays to holistic, single-molecule profiling represents a fundamental advancement in how we define “purity” and “identity” in genetic medicine. This trajectory parallels the broader shift in mRNA analytics toward multi-attribute methods (MAMs), where orthogonal physicochemical assays (often including chromatographic platforms such as HPLC and, increasingly, at-/on-line monitoring) are being consolidated into standardized QC workflows.¹³⁵

Ultimately, the adoption of nanopore sequencing is driven by the imperative of patient safety. By identifying subtle vector defects and rare impurities early in the manufacturing process, developers can ensure that the increasing complexity of gene therapies is matched by an equally sophisticated QC framework.²³ As the technology overcomes current quantitative limitations, achieves greater regulatory familiarity, and demonstrates operational robustness, it is poised to become a vital orthogonal standard for characterizing the next generation of genetic medicines.

AUTHORS’ CONTRIBUTIONS

X.-B.Z. and J.-P.Z. conceived the study, obtained funding, and supervised the overall work. X.Y. drafted the article. All authors read and approved the article prior to submission.

Footnotes

AUTHOR DISCLOSURE

The authors declare no conflict of interest.

FUNDING INFORMATION

This work was supported by the National Key Research and Development Program of China (grant nos. 2019YFA0110803 and 2021YFA1100900), the National Natural Science Foundation of China (grant nos. 82570286, 92568302, 82402188, 81870149, 82070115, 81890990, and 81730006), the Chinese Academy of Medical Sciences (CAMS) Innovation Fund for Medical Sciences (CIFMS) (grant nos. 2024-I2M-3-018, 2024-I2M-ZH-015, 2023-I2M-2-007, 2022-I2M-2-003, 2022-I2M-2-001, 2021-I2M-1-041, 2021-I2M-1-040, and 2021-I2M-1-001), the Haihe Laboratory of Cell Ecosystem Innovation Fund (grant nos. 24HHXBSS00005 and HH22KYZX0022), China Foundation For Youth Entrepreneurship and Employment-Incaier Public Welfare Fund (HH25KYHX0009), Postdoctoral Fellowship Program of CPSF (grant no. GZC20240154), and Fundamental Research Funds for the Central Universities (grant no. 3332024074).

References

1. Chancellor, Barrett, Nguyen-Jatkoe, et al. The state of cell and gene therapy in 2023. Mol Ther 2023;31(12):3376–3388.

2. Bulaklak, Gersbach. The once and future gene therapy. Nat Commun 2020;11(1):5820.

3. Dunbar, High, Joung, et al. Gene therapy comes of age. Science 2018;359(6372):eaan4672.

4. June, O’Connor, Kawalekar, et al. CAR T cell immunotherapy for human cancer. Science 2018;359(6382):1361–1365.

5. American Society of Gene & Cell Therapy and Citeline. Gene, Cell, & RNA Therapy Landscape Report: Q3 2025. American Society of Gene & Cell Therapy & Citeline; 2025.

6. Pardi, Hogan, Porter, et al. mRNA vaccines - A new era in vaccinology. Nat Rev Drug Discov 2018;17(4):261–279.

7. Polack, Thomas, Kitchin, et al.; C4591001 Clinical Trial Group. Safety and efficacy of the BNT162b2 mRNA Covid-19 vaccine. N Engl J Med 2020;383(27):2603–2615.

8. Baden, El Sahly, Essink, et al.; COVE Study Group. Efficacy and safety of the mRNA-1273 SARS-CoV-2 vaccine. N Engl J Med 2021;384(5):403–416.

9. Chaudhary, Weissman, Whitehead. mRNA vaccines for infectious diseases: Principles, delivery and clinical translation. Nat Rev Drug Discov 2021;20(11):817–838.

10. Szabo, Mahiny, Vlatkovic. COVID-19 mRNA vaccines: Platforms and current developments. Mol Ther 2022;30(5):1850–1868.

11. Hou, Zaks, Langer, et al. Lipid nanoparticles for mRNA delivery. Nat Rev Mater 2021;6(12):1078–1094.

12. Kulkarni, Cullis, van der Meel. Lipid nanoparticles enabling gene therapies: From concepts to clinical utility. Nucleic Acid Ther 2018;28(3):146–157.

13. Wright. Product-related impurities in clinical-grade recombinant AAV vectors: Characterization and risk assessment. Biomedicines 2014;2(1):80–97.

14. Blay, Hardyman, Morovic. PCR-based analytics of gene therapies using adeno-associated virus vectors: Considerations for cGMP method development. Mol Ther Methods Clin Dev 2023;31:101132.

15. U.S. Food and Drug Administration. Chemistry, Manufacturing, and Control (CMC) Information for Human Gene Therapy Investigational New Drug Applications (INDs): Guidance for Industry. U.S. Food and Drug Administration: Silver Spring, MD; 2020.

16. European Medicines Agency. Guideline on the quality, non-clinical and clinical aspects of gene therapy medicinal products. European Medicines Agency: Amsterdam; 2018.

17. Treangen, Salzberg. Repetitive DNA and next-generation sequencing: Computational challenges and solutions. Nat Rev Genet 2011;13(1):36–46.

18. Aird, Ross, Chen W-S, et al. Analyzing and minimizing PCR amplification bias in Illumina sequencing libraries. Genome Biol 2011;12(2):R18.

19. Logsdon, Vollger, Eichler. Long-read human genome sequencing and its applications. Nat Rev Genet 2020;21(10):597–614.

20. Sedlazeck, Rescheneder, Smolka, et al. Accurate detection of complex structural variations using single-molecule sequencing. Nat Methods 2018;15(6):461–468.

21. Dunker-Seidler, Breunig, Haubner, et al. Recombinant AAV batch profiling by nanopore sequencing elucidates product-related DNA impurities and vector genome length distribution. Mol Ther Methods Clin Dev 2025;33(1):101417.

22. Dabney, Meyer. Length and GC-biases during sequencing library amplification: A comparison of various polymerase-buffer systems with ancient and modern DNA sequencing libraries. Biotechniques 2012;52(2):87–94.

23. Bustin, Benes, Garson, et al. The MIQE guidelines: Minimum information for publication of quantitative real-time PCR experiments. Clin Chem 2009;55(4):611–622.

24. Hindson, Ness, Masquelier, et al. High-throughput droplet digital PCR system for absolute quantitation of DNA copy number. Anal Chem 2011;83(22):8604–8610.

25. Lock, Alvira, Chen S-J, et al. Absolute determination of single-stranded and self-complementary adeno-associated viral vector genome titers by droplet digital PCR. Hum Gene Ther Methods 2014;25(2):115–125.

26. Dobnik, Kogovšek, Jakomin, et al. Accurate quantification and characterization of adeno-associated viral vectors. Front Microbiol 2019;10:1570.

27. Tai PWL, Xie, Fong, et al. Adeno-associated virus genome population sequencing achieves full vector genome resolution and reveals human-vector chimeras. Mol Ther Methods Clin Dev 2018;9:130–141.

28. Simpson, Workman, Zuzarte, et al. Detecting DNA cytosine methylation using nanopore sequencing. Nat Methods 2017;14(4):407–410.

29. Rand, Jain, Eizenga, et al. Mapping DNA methylation with high-throughput nanopore sequencing. Nat Methods 2017;14(4):411–413.

30. Wenger, Peluso, Rowell, et al. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat Biotechnol 2019;37(10):1155–1162.

31. Weirather, de Cesare, Wang, et al. Comprehensive comparison of Pacific Biosciences and Oxford Nanopore technologies and their applications to transcriptome analysis. F1000Res 2017;6:100.

32. Jain, Olsen, Paten, et al. The Oxford Nanopore MinION: Delivery of nanopore sequencing to the genomics community. Genome Biol 2016;17(1):239.

33. Bowden, Davies, Heger, et al. Sequencing of human genomes with nanopore technology. Nat Commun 2019;10(1):1869.

34. Wick, Judd, Gorrie, et al. Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol 2017;13(6):e1005595.

35. Eastman, Durland. Manufacturing and quality control of plasmid-based gene expression systems. Adv Drug Deliv Rev 1998;30(1–3):33–48.

36. Schmeer, Buchholz, Schleef. Plasmid DNA manufacturing for indirect and direct clinical applications. Hum Gene Ther 2017;28(10):856–861.

37. U.S. Food and Drug Administration. Guidance for Industry: Considerations for Plasmid DNA Vaccines for Infectious Disease Indications. U.S. Department of Health and Human Services: Rockville, MD; 2007.

38. Currin, Swainston, Dunstan, et al. Highly multiplexed, fast and accurate nanopore sequencing for verification of synthetic DNA constructs and sequence libraries. Synth Biol (Oxf) 2019;4(1):ysz025.

39. Mumm, Drexel, McDonald, et al. Multiplexed long-read plasmid validation and analysis using OnRamp. Genome Res 2023;33(5):741–749.

40. Emiliani, Hsu, McKenna. Multiplexed assembly and annotation of synthetic biology constructs using long-read nanopore sequencing. ACS Synth Biol 2022;11(7):2238–2246.

41. Brown, Dreolini, Wilson, et al. Complete sequence verification of plasmid DNA using the Oxford Nanopore technologies’ MinION device. BMC Bioinformatics 2023;24(1):116.

42. Li, Chng, Boey EJH, et al. INC-Seq: Accurate single molecule reads using nanopore sequencing. Gigascience 2016;5(1):34.

43. Volden, Palmer, Byrne, et al. Improving nanopore read accuracy with the R2C2 method enables the sequencing of highly multiplexed full-length single-cell cDNA. Proc Natl Acad Sci U S A 2018;115(39):9726–9731.

44. Vaser, Sović, Nagarajan, et al. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res 2017;27(5):737–746.

45. Chiou C-S, Chen B-H, Wang Y-W, et al. Correcting modification-mediated errors in nanopore sequencing by nucleotide demodification and reference-based correction. Commun Biol 2023;6(1):1215.

46. Summers, Sherratt. Multimerization of high copy number plasmids causes instability: ColE1 encodes a determinant essential for plasmid monomerization and stability. Cell 1984;36(4):1097–1103.

47. Summers, Beton, Withers. Multicopy plasmid instability: The dimer catastrophe hypothesis. Mol Microbiol 1993;8(6):1031–1038.

48. Field, Summers. Multicopy plasmid stability: Revisiting the dimer catastrophe. J Theor Biol 2011;291:119–127.

49. Crozat, Fournes, Cornet, et al. Resolution of multimeric forms of circular plasmids and chromosomes. Microbiol Spectr 2014;2(5).

50. Radukic, Le, Krassuski, et al. Degradation and stable maintenance of adeno-associated virus inverted terminal repeats in E. coli. Nucleic Acids Res 2025;53(2):gkae1170.

51. Radukic, Brandt, Haak, et al. Nanopore sequencing of native Adeno-Associated Virus (AAV) single-stranded DNA using a transposase-based rapid protocol. NAR Genom Bioinform 2020;2(4):lqaa074.

52. Li. Minimap2: Pairwise alignment for nucleotide sequences. Bioinformatics 2018;34(18):3094–3100.

53. Li, Handsaker, Wysoker, et al.; 1000 Genome Project Data Processing Subgroup. The sequence alignment/map format and SAMtools. Bioinformatics 2009;25(16):2078–2079.

54. Lemon, Khil, Frank, et al. Rapid nanopore sequencing of plasmids and resistance gene detection in clinical isolates. J Clin Microbiol 2017;55(12):3530–3543.

55. Wood, Lu, Langmead. Improved metagenomic analysis with Kraken 2. Genome Biol 2019;20(1):257.

56. Kim, Song, Breitwieser, et al. Centrifuge: Rapid and sensitive classification of metagenomic sequences. Genome Res 2016;26(12):1721–1729.

57. Emiliani, Hsu, McKenna. Circuit-seq: Circular reconstruction of cut in vitro transposed plasmids using Nanopore sequencing. bioRxiv 2022.

58. Tran, Heiner, Weber, et al. AAV-genome population sequencing of vectors packaging CRISPR components reveals design-influenced heterogeneity. Mol Ther Methods Clin Dev 2020;18:639–651.

59. Namkung, Tran, Manokaran, et al. Direct ITR-to-ITR nanopore sequencing of AAV vector genomes. Hum Gene Ther 2022;33(21–22):1187–1196.

60. Payne, Holmes, Clarke, et al. Readfish enables targeted nanopore sequencing of gigabase-sized genomes. Nat Biotechnol 2021;39(4):442–450.

61. Cornélie, Hoebeke, Schacht A-M, et al. Direct evidence that Toll-Like Receptor 9 (TLR9) functionally binds plasmid DNA by specific cytosine-phosphate-guanine motif recognition. J Biol Chem 2004;279(15):15124–15129.

62. Arrasate, Lopez-Robles, Zuazo, et al. Lentiviral vectors: From wild-type viruses to efficient multi-functional delivery vectors. Int J Mol Sci 2025;26(17):8497.

63. Sweeney, Vink. The impact of lentiviral vector genome size and producer cell genomic to gag-pol mRNA ratios on packaging efficiency and titre. Mol Ther Methods Clin Dev 2021;21:574–584.

64. Zeglinski, Montellese, Ritchie, et al. An optimized protocol for quality control of gene therapy vectors using nanopore direct RNA sequencing. Genome Res 2024;34(11):1966–1975.

65. Suleman, Khalifa, Fawaz, et al. Analysis of HIV-1-based lentiviral vector particle composition by PacBio long-read nucleic acid sequencing. Hum Gene Ther 2025;36(5–6):628–636.

66. Yang, Lin, Li, et al. A comprehensive benchmarking of adaptive sampling tools for nanopore sequencing. Genome Biol 2025;26(1):281.

67. Cornetta, Yao, Jasti, et al. Replication-competent lentivirus analysis of clinical grade vector products. Mol Ther 2011;19(3):557–566.

68. Corre, Dessainte, Marteau J-B, et al. “RCL-pooling assay”: A simplified method for the detection of replication-competent lentiviruses in vector batches using sequential pooling. Hum Gene Ther 2016;27(2):202–210.

69. American Society of Gene & Cell Therapy and Citeline. Gene, Cell, & RNA Therapy Landscape Report: Q3 2024 Quarterly Data Report. American Society of Gene & Cell Therapy and Citeline; 2024.

70. Hassett, Benenato, Jacquinet, et al. Optimization of lipid nanoparticles for intramuscular administration of mRNA vaccines. Mol Ther Nucleic Acids 2019;15:1–11.

71. Stepinski, Waddell, Stolarski, et al. Synthesis and properties of mRNAs containing the novel “anti-reverse” cap analogs 7-methyl(3′-O-methyl)GpppG and 7-methyl(3′-deoxy)GpppG. RNA 2001;7(10):1486–1495.

72. Jemielity, Fowler, Zuberek, et al. Novel “anti-reverse” cap analogs with superior translational properties. RNA 2003;9(9):1108–1122.

73. Karikó, Muramatsu, Ludwig, et al. Generating the optimal mRNA for therapy: HPLC purification eliminates immune activation and improves translation of nucleoside-modified, protein-encoding mRNA. Nucleic Acids Res 2011;39(21):e142.

74. Nelson, Sorensen, Mintri, et al. Impact of mRNA chemistry and manufacturing process on innate immune activation. Sci Adv 2020;6(26):eaaz6893.

75. Garalde, Snell, Jachimowicz, et al. Highly parallel direct RNA sequencing on an array of nanopores. Nat Methods 2018;15(3):201–206.

76. Ibrahim, Oppelt, Maragkakis, et al. TERA-Seq: True end-to-end sequencing of native RNA molecules for transcriptome characterization. Nucleic Acids Res 2021;49(20):e115.

77. Czarnocka-Cieciura, Brouze, Gumińska, et al. Comprehensive analysis of poly(A) tails in mouse testes and ovaries using nanopore direct RNA sequencing. Sci Data 2025;12(1):43.

78. Guminska, et al. Direct profiling of non-adenosines in poly(A) tails of endogenous and therapeutic mRNAs with ninetails. Nat Commun 2025;16(1):2664.

79. Fleming, Burrows. Nanopore sequencing for N1-methylpseudouridine in RNA reveals sequence-dependent discrimination of the modified nucleotide triphosphate during transcription. Nucleic Acids Res 2023;51(4):1914–1926.

80. Teng, Stoiber, Bar-Joseph, et al. Detecting m6A RNA modification from nanopore sequencing using a semisupervised learning framework. Genome Res 2024;34(11):1987–1999.

81. Karikó, Buckstein, Ni, et al. Suppression of RNA recognition by toll-like receptors: The impact of nucleoside modification and the evolutionary origin of RNA. Immunity 2005;23(2):165–175.

82. Karikó, Muramatsu, Welsh, et al. Incorporation of pseudouridine into mRNA yields superior nonimmunogenic vector with increased translational capacity and biological stability. Mol Ther 2008;16(11):1833–1840.

83. Anderson, Muramatsu, Jha, et al. Nucleoside modifications in RNA limit activation of 2′-5′-oligoadenylate synthetase and increase resistance to cleavage by RNase L. Nucleic Acids Res 2011;39(21):9329–9338.

84. Anderson, Muramatsu, Nallagatla, et al. Incorporation of pseudouridine into mRNA enhances translation by diminishing PKR activation. Nucleic Acids Res 2010;38(17):5884–5892.

85. Andries, Mc Cafferty, De Smedt, et al. N(1)-methylpseudouridine-incorporated mRNA outperforms pseudouridine-incorporated mRNA by providing enhanced protein expression and reduced immunogenicity in mammalian cell lines and mice. J Control Release 2015;217:337–344.

86. Svitkin, Cheng, Chakraborty, et al. N1-methyl-pseudouridine in mRNA enhances translation through eIF2alpha-dependent and independent mechanisms by increasing ribosome density. Nucleic Acids Res 2017;45(10):6023–6036.

87. Zhao, Zhang, Hang, et al. Detecting RNA modification using direct RNA sequencing: A systematic review. Comput Struct Biotechnol J 2022;20:5740–5749.

88. Wongsurawat, Jenjaroenpun, Nookaew. Direct sequencing of RNA and RNA modification identification using nanopore. Methods Mol Biol 2022;2477:71–77.

89. Abbas, Laudenbach, Martínez-Montero, et al. Structure of human IFIT1 with capped RNA reveals adaptable mRNA binding and mechanisms for sensing N1 and N2 ribose 2′-O methylations. Proc Natl Acad Sci U S A 2017;114(11):E2106–E2115.

90. Johnson, VanBlargan, Xu, et al. Human IFIT3 modulates IFIT1 RNA binding specificity and protein stability. Immunity 2018;48(3):487–499.e5.

91. Sikorski, Warminski, Kubacka, et al. The identity and methylation status of the first transcribed nucleotide in eukaryotic mRNA 5′ cap modulates protein expression in living cells. Nucleic Acids Res 2020;48(4):1607–1626.

92. Welbourne, Loveday, Nair, et al. Anion exchange HPLC monitoring of mRNA in vitro transcription reactions to support mRNA manufacturing process development. Front Mol Biosci 2024;11:1250833.

93. Hutchinson, Schweikart, Shannon, et al. Physicochemical and functional assessment of messenger RNA 5′Cap-end impurities under forced degradation conditions. Mol Ther Nucleic Acids 2025;36(2):102570.

94. Chan, Whipple, Dai, et al. RNase H-based analysis of synthetic mRNA 5′ cap incorporation. RNA 2022;28(8):1144–1155.

95. Holland, Acevedo-Skrip, Barton, et al. Development and application of automated sandwich ELISA for quantitating residual dsRNA in mRNA vaccines. Vaccines (Basel) 2024;12(8):899.

96. Liu, Zheng, Xu, et al. An improved method for the detection of double-stranded RNA suitable for quality control of mRNA vaccines. Protein Cell 2024;15(11):791–795.

97. Piao, Yadav, Wang, et al. Double-stranded RNA reduction by chaotropic agents during in vitro transcription of messenger RNA. Mol Ther Nucleic Acids 2022;29:618–624.

98. Clark, Kozarski, Asci, et al. Removal of dsRNA byproducts using affinity chromatography. Mol Ther Nucleic Acids 2025;36(2):102549.

99. Palmer, Marinus. The dam and dcm strains of Escherichia coli - A review. Gene 1994;143(1):1–12.

100. Marinus, Casadesus. Roles of DNA adenine methylation in host-pathogen interactions: Mismatch repair, transcriptional regulation, and more. FEMS Microbiol Rev 2009;33(3):488–503.

101. Low, Casadesus. Clocks and switches: Bacterial gene regulation by DNA adenine methylation. Curr Opin Microbiol 2008;11(2):106–112.

102. Krieg, Yi, Matson, et al. CpG motifs in bacterial DNA trigger direct B-cell activation. Nature 1995;374(6522):546–549.

103. Klinman, Yi, Beaucage, et al. CpG motifs present in bacteria DNA rapidly induce lymphocytes to secrete interleukin 6, interleukin 12, and interferon gamma. Proc Natl Acad Sci U S A 1996;93(7):2879–2883.

104. Hemmi, Takeuchi, Kawai, et al. A toll-like receptor recognizes bacterial DNA. Nature 2000;408(6813):740–745.

105. Bauer, Kirschning, Häcker, et al. Human TLR9 confers responsiveness to bacterial DNA via species-specific CpG motif recognition. Proc Natl Acad Sci U S A 2001;98(16):9237–9242.

106. Reyes-Sandoval, Ertl. CpG methylation of a plasmid vector results in extended transgene product expression by circumventing induction of immune responses. Mol Ther 2004;9(2):249–261.

107. Hyde, Pringle, Abdullah, et al. CpG-free plasmids confer reduced inflammation and sustained pulmonary gene expression. Nat Biotechnol 2008;26(5):549–551.

108. Ni, Huang, Zhang, et al. DeepSignal: Detecting DNA methylation state from nanopore sequencing reads using deep-learning. Bioinformatics 2019;35(22):4586–4595.

109. Liu, Rosikiewicz, Pan, et al. DNA methylation-calling tools for Oxford Nanopore sequencing: A survey and human epigenome-wide evaluation. Genome Biol 2021;22(1):295.

110. White, Hesselberth. Modification mapping by nanopore sequencing. Front Genet 2022;13:1037134.

111. Low, Weyand, Mahan. Roles of DNA adenine methylation in regulating bacterial gene expression and virulence. Infect Immun 2001;69(12):7197–7204.

112. Militello, Simon, Qureshi, et al. Conservation of Dcm-mediated cytosine DNA methylation in Escherichia coli. FEMS Microbiol Lett 2012;328(1):78–85.

113. Loose, Malla, Stout. Real-time selective sequencing using nanopore technology. Nat Methods 2016;13(9):751–754.

114. Kovaka, Fan, Ni, et al. Targeted nanopore sequencing by real-time mapping of raw electrical signal with UNCALLED. Nat Biotechnol 2021;39(4):431–441.

115. Weilguny, De Maio, Munro, et al. Dynamic, adaptive sampling during nanopore sequencing using Bayesian experimental design. Nat Biotechnol 2023;41(7):1018–1025.

116. Munro, Payne, Holmes, et al. Enhancing nanopore adaptive sampling for PromethION using readfish at scale. Genome Res 2025;35(4):877–885.

117. Cheng, Sun, Yang, et al. A rapid bacterial pathogen and antimicrobial resistance diagnosis workflow using Oxford nanopore adaptive sequencing method. Brief Bioinform 2022;23(6):bbac453.

118. Bogaerts, Van den Bossche, Verhaegen, et al. Closing the gap: Oxford nanopore technologies R10 sequencing allows comparable results to Illumina sequencing for SNP-based outbreak investigation of bacterial pathogens. J Clin Microbiol 2024;62(5):e0157623.

119. Hall, Wick, Judd, et al. Benchmarking reveals superiority of deep learning variant callers on bacterial nanopore sequence data. Elife 2024;13:RP98300.

120. Olson, Wagner, Dwarshuis, et al. Variant calling and benchmarking in an era of complete human genome sequences. Nat Rev Genet 2023;24(7):464–483.

121. Wagner, Olson, Harris, et al. Benchmarking challenging small variants with linked and long reads. Cell Genom 2022;2(5):100128.

122. Roy, Coldren, Karunamurthy, et al. Standards and guidelines for validating next-generation sequencing bioinformatics pipelines: A joint recommendation of the Association for Molecular Pathology and the College of American Pathologists. J Mol Diagn 2018;20(1):4–27.

123. U.S. Government Publishing Office. Electronic Records; Electronic Signatures. Electronic Code of Federal Regulations; 2026.

124. U.S. Food and Drug Administration. Guidance for Industry: Part 11, Electronic Records; Electronic Signatures—Scope and Application. U.S. Food and Drug Administration: Silver Spring, MD; 2003.

125. Medicines and Healthcare products Regulatory Agency. ‘GXP’ Data Integrity Guidance and Definitions. Medicines and Healthcare products Regulatory Agency: London; 2018.

126. European Commission. EudraLex Volume 4: EU Guidelines for Good Manufacturing Practice for Medicinal Products for Human and Veterinary Use. Annex 11: Computerised Systems. European Commission: Brussels; 2011.

127. U.S. Food and Drug Administration. Data Integrity and Compliance With Drug CGMP: Questions and Answers. Guidance for Industry. U.S. Food and Drug Administration: Silver Spring, MD; 2018.

128. Pharmaceutical Inspection Co-operation Scheme. Good Practices for Data Management and Integrity in Regulated GMP/GDP Environments. Pharmaceutical Inspection Co-operation Scheme: Geneva; 2021.

129. U.S. Food and Drug Administration. Marketing Submission Recommendations for a Predetermined Change Control Plan for Artificial Intelligence-Enabled Device Software Functions: Guidance for Industry and Food and Drug Administration Staff. U.S. Food and Drug Administration: Silver Spring, MD; 2025.

130. Mercer, Xu, Mason, et al.; MAQC/SEQC2 Consortium. The sequencing quality control 2 study: Establishing community standards for sequencing in precision medicine. Genome Biol 2021;22(1):306.

131. International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use. ICH Harmonised Guideline Q2(R2): Validation of Analytical Procedures. International Council for Harmonisation: Geneva; 2023.

132. International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use. ICH Harmonised Guideline Q14: Analytical Procedure Development. International Council for Harmonisation: Geneva; 2023.

133. International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use. ICH Harmonised Guideline Q5A(R2): Viral Safety Evaluation of Biotechnology Products Derived from Cell Lines of Human or Animal Origin. International Council for Harmonisation: Geneva; 2023.

134. Kim, Saville, O’Neill, et al. Nanopore direct RNA sequencing of human transcriptomes reveals the complexity of mRNA modifications and crosstalk between regulatory features. Cell Genom 2025;5(6):100872.

135. Camperi, Chatla, Freund, et al. Current analytical strategies for mRNA-based therapeutics. Molecules 2025;30(7):1629.

136. Corre, Seye, Frin, et al. Lentiviral standards to determine the sensitivity of assays that quantify lentiviral vector copy numbers and genomic insertion sites in cells. Gene Ther 2022;29(9):536–543.